About
Swish-e PHP Tools are still in development; consider them alpha quality.

Rationale
Swish-e PHP Tools are being created to fill a specific need at the Fitzwilliam Museum: we wanted to run our own 'on-server' search engine for both our intranet and internet sites. After some research we decided to use Swish-e as our indexing and search engine. We chose to stick with a single web development language, PHP, where possible. Swish-e is very versatile, but the majority of the tools and utilities surrounding it are Perl based (which is fine), hence these PHP tools.

SPT_spider.php
This command-line PHP tool crawls an entire HTTP site structure from a starting page and can be used as a document source for the swish-e.exe indexer. It provides (internal) configuration options to include or exclude URLs, to set allowed file types, to set disallowed URL types, etc.

Usage
1) It can be run on its own from the command line. It spiders the site and generates a list of valid and invalid spider URLs (example output).
2) It can be run as the source for the swish-e.exe indexer. In this case it is named through the swish-e -S prog option and a configuration file, e.g. swish-e -S prog -c example.cfg. swish-e then indexes the content that SPT_spider.php supplies as it spiders through the valid URLs (based on the SPT_spider settings).
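
To illustrate what acting as a source for swish-e -S prog involves: in that mode swish-e reads documents from the standard output of an external program, each one preceded by a small header block (Path-Name, Content-Length, optionally Last-Mtime), then a blank line, then the document body. The sketch below shows one way a PHP script might emit a single document in that form. The function name format_swish_document, the example URL and the timestamp are assumptions for illustration only; this is not taken from SPT_spider.php itself.

```php
<?php
// Hypothetical helper (not part of SPT_spider.php): build one document
// record in the header-plus-body form that swish-e reads with -S prog.
function format_swish_document(string $url, string $content, ?int $mtime = null): string
{
    // Path-Name identifies the document; Content-Length is the body size in bytes.
    $headers = "Path-Name: " . $url . "\n"
             . "Content-Length: " . strlen($content) . "\n";

    // Last-Mtime is optional: a Unix timestamp of the last modification.
    if ($mtime !== null) {
        $headers .= "Last-Mtime: " . $mtime . "\n";
    }

    // A blank line separates the headers from the body.
    return $headers . "\n" . $content;
}

// Example values only; a real spider would fetch these over HTTP.
$html = "<html><head><title>Example</title></head><body>Hello</body></html>";
echo format_swish_document("http://www.example.org/index.html", $html, 1100000000);
```

A spider would presumably call something like this once per valid URL, so that swish-e receives a continuous stream of records. Content-Length must match the body's byte count exactly, which is why strlen is used on the raw content.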