Small web crawler developed in Java and Spring Boot
A small web crawler developed in Java and Spring Boot.
The crawler is multi-threaded and can run multiple crawls at the same time.
It is REST-based: crawling a web site is started through a REST endpoint.
Start a crawl by supplying a seed web site and the depth to crawl. This is a "POST" request:
http://localhost:8080/start?depth=5&seed=http://www.google.com
Once the crawl is requested, the crawler generates a "token". This token can be used to query the status of the crawling operation.
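As a rough illustration, the start request above could be issued from Java with the standard java.net.http client (Java 11+). This is only a sketch: the class and method names are hypothetical, the seed is URL-encoded as a precaution (the example URL above passes it unencoded), and the response body is returned as-is because the README does not say in what format the token comes back.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper; not part of the crawler itself.
class CrawlerStartExample {

    // Builds the POST request for /start with the depth and seed query parameters.
    static HttpRequest buildStartRequest(String baseUrl, String seed, int depth) {
        // Assumption: the server accepts a percent-encoded seed value.
        String query = "depth=" + depth
                + "&seed=" + URLEncoder.encode(seed, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/start?" + query))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Sends the request and returns the raw response body.
    // Assumption: the body carries the token; its exact format is not documented here.
    static String startCrawl(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        HttpRequest request =
                buildStartRequest("http://localhost:8080", "http://www.google.com", 5);
        // Only prints the request; call startCrawl(...) when the crawler is running.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

With the crawler running locally, startCrawl(HttpClient.newHttpClient(), request) would send the request and return whatever the server responds with.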
Use another endpoint to check the status of a crawl:
http://localhost:8080/status/
This endpoint returns the result of the crawl.
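A status check might be built along the same lines. Note the assumptions in this sketch: the README does not show how the token is passed to the status endpoint or which HTTP method it expects, so appending the token as the last path segment and using GET are guesses, and the class name, method name, and token value are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; not part of the crawler itself.
class CrawlerStatusExample {

    // ASSUMPTION: the token is appended to /status/ as a path segment
    // and the endpoint answers GET requests. Neither is confirmed above.
    static HttpRequest buildStatusRequest(String baseUrl, String token) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/status/" + token))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // "example-token" is a placeholder for the token returned by /start.
        HttpRequest request =
                buildStatusRequest("http://localhost:8080", "example-token");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request can be sent with HttpClient.send in the same way as any other request built with HttpRequest.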
The following endpoint will cancel the current crawling task