the-sea-ink/ds-lib-scraper

This is a subcomponent of a Master's thesis project at TU Berlin.

This project contains two scrapers that collect data about the functions documented in the pandas and scikit-learn API references.

For each function, the data is scraped in the following format:
full function title, function description, link to the doc reference, function arguments
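
As a rough illustration of the record layout described above, the sketch below models one scraped function as a dataclass and serializes it to CSV. The field names (`title`, `description`, `link`, `arguments`), the CSV output format, and the sample values are assumptions made for this example; the scrapers' actual column names and file format may differ.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class FunctionRecord:
    # Hypothetical field names; the scrapers' real column names may differ.
    title: str        # full function title
    description: str  # function description
    link: str         # link to the doc reference
    arguments: str    # function arguments, joined into one string


def records_to_csv(records):
    """Serialize records to CSV text with a header row (CSV is an assumption)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(FunctionRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Illustrative sample record, not actual scraper output.
example = FunctionRecord(
    title="pandas.read_csv",
    description="Read a comma-separated values (csv) file into DataFrame.",
    link="https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html",
    arguments="filepath_or_buffer; sep; header",
)

if __name__ == "__main__":
    print(records_to_csv([example]))
```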

To start a scraper, go into the folder with the scripts:

cd ds_scraper/ds_scraper/spiders/

And start the respective scraper:

python pandas_scraper.py

or

python sklearn_scraper.py

The postprocessing script is in the same folder. Currently, it only prepends an index column. To run it, cd into the same folder as for the scrapers:

cd ds_scraper/ds_scraper/spiders/

And call the script with the input and output files as arguments, respectively:

python postprocessing.py -i "input_file" -o "output_file"
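
For orientation, here is a minimal sketch of what such a postprocessing step could look like. It is not the repository's `postprocessing.py`. It assumes the scraped data is a CSV file with a header row, that the new column is named `index`, and that numbering starts at 0. The long option names `--input` and `--output` are also assumptions; only `-i` and `-o` appear in the command above.

```python
import argparse
import csv


def add_index_column(input_path, output_path):
    """Copy a CSV file, prepending a 0-based index column to every row.

    Returns the number of data rows written (the header is not counted).
    """
    with open(input_path, newline="", encoding="utf-8") as src, \
            open(output_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader, None)
        if header is None:
            return 0  # empty input: nothing to write
        # "index" is an assumed column name.
        writer.writerow(["index"] + header)

        rows_written = 0
        for i, row in enumerate(reader):
            writer.writerow([i] + row)
            rows_written = i + 1
        return rows_written


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Prepend an index column.")
    # Long option names are assumptions; the README only shows -i and -o.
    parser.add_argument("-i", "--input", required=True, help="input CSV file")
    parser.add_argument("-o", "--output", required=True, help="output CSV file")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    add_index_column(args.input, args.output)
```

Reading and writing row by row keeps memory use flat regardless of how many functions were scraped.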
