This repository has been archived by the owner on Apr 20, 2019. It is now read-only.
Heroshi – open source web crawler.
temoto/heroshi
Heroshi is an open source web crawler.

Motivation 1: learn HTTP, libraries, and real-world quirks.
Motivation 2: build a collection of libraries and tools for building custom crawlers.
Motivation 3: provide access to a representative subset of the Web for educational and research purposes.

As of 2012-10-12, the last goal has not even been started, but these guys did an amazing job at it: http://commoncrawl.org/

See http://temoto.github.com/heroshi/ for more information.