Simple Web crawler written in Java
Build with Ant:
ant
Run the crawler:
java -cp spider.jar org.spektom.spider.SpiderTool
Usage:
java Spider [options] URL
Where options are:
- -r Follow robots.txt and META robot tag rules (default: true)
- -t Number of concurrent downloads (default: 5)
- -f Follow other domains (default: false)
- -c Connect/read timeout in milliseconds (default: 5000)
- -u String that will be sent in User-Agent header (default: none)
- -p Follow only URLs that match pattern
- -v Verbose output (default: false)
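To illustrate how the -f and -p options might combine when deciding whether to follow a link, here is a minimal sketch. It is not taken from the spider's source: the class name LinkFilter and the method shouldFollow are hypothetical, "same domain" is assumed to mean an exact host match with the start URL, and the -p pattern is assumed to be a Java regular expression matched anywhere in the URL. The real tool may behave differently on all three points.

```java
import java.net.URI;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the spider's actual code. */
public class LinkFilter {
    private final String startHost;           // host of the URL given on the command line
    private final boolean followOtherDomains; // assumed to mirror the -f option
    private final Pattern urlPattern;         // assumed to mirror the -p option; null when unset

    public LinkFilter(String startUrl, boolean followOtherDomains, String regex) {
        this.startHost = URI.create(startUrl).getHost();
        this.followOtherDomains = followOtherDomains;
        this.urlPattern = (regex == null) ? null : Pattern.compile(regex);
    }

    /** Returns true if a discovered link passes both the domain and the pattern check. */
    public boolean shouldFollow(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL, skip it
        }
        if (host == null) {
            return false; // no host component (e.g. mailto: links)
        }
        // Assumption: without -f, only links on the start URL's exact host are followed.
        if (!followOtherDomains && !host.equalsIgnoreCase(startHost)) {
            return false;
        }
        // Assumption: the pattern may match anywhere in the URL (find, not matches).
        return urlPattern == null || urlPattern.matcher(url).find();
    }

    public static void main(String[] args) {
        LinkFilter filter = new LinkFilter("http://example.com/", false, "/docs/");
        System.out.println(filter.shouldFollow("http://example.com/docs/intro.html")); // true
        System.out.println(filter.shouldFollow("http://example.com/blog/"));           // false
        System.out.println(filter.shouldFollow("http://other.org/docs/"));             // false
    }
}
```

Under these assumptions, the domain check runs before the pattern check, so a link on another host is rejected even when it matches the pattern unless other domains are allowed.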