shushi2016/xiaomi

A Web Crawler (Scrapy + Splash + MongoDB)

Project description:

This project crawls the names of apps on www.mi.com. The detailed description can be found here.

The final code will crawl http://app.mi.com/, grab all the app names, and store them in a MongoDB database.

I developed this code based on two useful tutorials (here and here). However, both of them contain some obsolete code, so I made some modifications.

The code on the master branch completes the first two steps of this project: crawling the main page and storing the app names in a MongoDB database.
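As a rough illustration of the storage step described above, here is a minimal sketch of a Scrapy item pipeline that writes app names to MongoDB. It is not the code in this repo: the class name `MongoAppNamePipeline`, the `name` item field, the database name `xiaomi`, the collection name `apps`, and the local MongoDB URI are all assumptions made for the example.

```python
class MongoAppNamePipeline:
    """Sketch of a Scrapy item pipeline that stores crawled app names in MongoDB.

    All defaults below (URI, database, collection) are illustrative assumptions.
    """

    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="xiaomi", collection_name="apps"):
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts: open the connection once.
        # Imported here so the class can be loaded without pymongo installed.
        import pymongo

        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes: release the connection.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # "name" is a hypothetical item field holding the app name.
        name = (item.get("name") or "").strip()
        if name:
            # Upsert so that re-running the crawl does not create duplicates.
            self.collection.update_one(
                {"name": name}, {"$set": {"name": name}}, upsert=True
            )
        return item
```

To try something like this, the pipeline would be enabled through the `ITEM_PIPELINES` setting in the Scrapy project's `settings.py`, using whatever module path the class actually lives at. The upsert is a design choice for the sketch: it keeps one document per app name even if the main page is crawled repeatedly.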

The final step is to use Splash to crawl some linked pages, not only the main page. I am actively working on this final step now.
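For readers unfamiliar with Splash, the sketch below shows one way a JavaScript-rendered page can be fetched through it. This is not the approach taken in this repo (the Splash step is still in progress, and a Scrapy project would more likely go through the scrapy-splash integration). Instead it calls Splash's `render.html` HTTP endpoint directly using only the standard library. The helper names and the Splash address `http://localhost:8050` are assumptions for the example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is running locally on its default port.
SPLASH_URL = "http://localhost:8050"


def splash_render_url(page_url, wait=0.5, splash_url=SPLASH_URL):
    """Build the Splash render.html URL that renders page_url.

    `wait` is the number of seconds Splash waits after the page loads,
    giving JavaScript time to populate the page.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_url.rstrip('/')}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, splash_url=SPLASH_URL, timeout=30):
    """Return the HTML of page_url after Splash has rendered it."""
    request_url = splash_render_url(page_url, wait, splash_url)
    with urlopen(request_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

With a Splash instance running, `fetch_rendered_html("http://app.mi.com/")` would return the rendered main page, and the same call could be repeated for each linked page discovered on it.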

Watch this repo if you are interested in the progress!

About

bittiger webcrawler project
