Open source, RESTful, distributed crawler engine
I wrote a crawler engine named ants in Python, based on Scrapy. But a dynamic language can get chaotic, so I started rewriting it in a compiled language, Go.
export GOPATH=PATH/TO/ants-go
go get github.com/PuerkitoBio/goquery
go get github.com/go-sql-driver/mysql
go install src/ants/ants/ants.go
cd bin
./ants

To test a cluster on one computer, you can run nodes on different ports in different terminals.
One node uses the default ports, TCP 8300 and HTTP 8200:
cd bin
./ants
The other node sets its own TCP port and HTTP port:
cd bin
./ants -tcp 9300 -http 9200
There are some flags you can set; check the help message:
./ants -h
./ants -help

- go to src/spiders
- write your spider, following the example deap_loop_spider.go, or see the spider page
- add your spider to spiderMap, following the example in LoadAllSpiders in load_all_spider.go
- install again (go install src/ants/ants/ants.go)
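
The registration step above can be pictured with a minimal, self-contained sketch. Only the names spiderMap and LoadAllSpiders come from this README; the Spider struct, its fields, MakeExampleSpider and SpiderNames are hypothetical stand-ins, and the real types in src/spiders may look quite different, so treat deap_loop_spider.go and load_all_spider.go as the reference.

```go
package main

import (
	"fmt"
	"sort"
)

// Spider is a hypothetical stand-in for the spider type used by ants.
// The real struct in src/spiders may have different fields.
type Spider struct {
	Name      string
	StartURLs []string
}

// MakeExampleSpider is a hypothetical constructor for one spider.
func MakeExampleSpider() *Spider {
	return &Spider{
		Name:      "example_spider",
		StartURLs: []string{"http://example.com/"},
	}
}

// LoadAllSpiders collects every known spider into a map keyed by name.
// In this sketch, a spider exists for the engine only if it is listed here.
func LoadAllSpiders() map[string]*Spider {
	spiderMap := make(map[string]*Spider)
	for _, s := range []*Spider{MakeExampleSpider()} {
		spiderMap[s.Name] = s
	}
	return spiderMap
}

// SpiderNames returns the registered names in sorted order.
func SpiderNames(m map[string]*Spider) []string {
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, name := range SpiderNames(LoadAllSpiders()) {
		fmt.Println(name)
	}
}
```

Because the map is built in compiled code, a newly added spider is only picked up after the binary is installed again, which is why the last step matters.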