Author: spektom

Description:
Simple Web crawler written in Java
Language: Java
Repository: git://github.com/spektom/spider.git
Created: 2011-09-06T08:48:36Z
Project page: https://github.com/spektom/spider

License:


Spider

Simple Web crawler written in Java.

Building

ant

Running

java -cp spider.jar org.spektom.spider.SpiderTool

Usage

java -cp spider.jar org.spektom.spider.SpiderTool [options] URL

Where options are:

  1. -r Follow robots.txt and META robots tag rules (default: true)
  2. -t Number of concurrent downloads (default: 5)
  3. -f Follow links to other domains (default: false)
  4. -c Connect/read timeout in milliseconds (default: 5000)
  5. -u String to send in the User-Agent header (default: none)
  6. -p Follow only URLs that match the given pattern
  7. -v Verbose output (default: false)
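
Example

The following is a minimal sketch of how options such as -f, -p, -c and -u could be applied inside a crawler. It is not code from the Spider project: the class name CrawlOptionsSketch and both method names are hypothetical, and it assumes that the -p pattern is a Java regular expression searched for anywhere in the URL, which this README does not confirm.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Spider code base.
public class CrawlOptionsSketch {

    // Mirrors -f and -p: decide whether a discovered link should be followed.
    // Assumption: the pattern is a Java regex matched anywhere in the URL.
    public static boolean shouldFollow(URI start, URI candidate,
                                       boolean followOtherDomains, Pattern pattern) {
        String host = candidate.getHost();
        if (host == null) {
            return false; // relative or malformed link, nothing to compare
        }
        if (!followOtherDomains && !host.equalsIgnoreCase(start.getHost())) {
            return false; // different domain and -f is off
        }
        // No pattern means every remaining URL is accepted.
        return pattern == null || pattern.matcher(candidate.toString()).find();
    }

    // Mirrors -c and -u: configure timeouts and the User-Agent header.
    // openConnection() only prepares the request; nothing is sent here.
    public static HttpURLConnection prepare(String url, int timeoutMillis, String userAgent) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            if (userAgent != null) {
                conn.setRequestProperty("User-Agent", userAgent);
            }
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URI start = URI.create("http://example.com/");
        Pattern docsOnly = Pattern.compile("/docs/");
        System.out.println(shouldFollow(
                start, URI.create("http://example.com/docs/a.html"), false, docsOnly));
        System.out.println(shouldFollow(
                start, URI.create("http://other.example.org/docs/a.html"), false, docsOnly));
    }
}
```

With -f off, the first link in main is accepted (same host, matches the pattern) and the second is rejected because it points to another domain. The 5000 ms timeout listed above would be passed as timeoutMillis for both the connect and the read phase.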