A simple script that takes the bioRxiv Coronavirus Collection and triages it for interaction terms.
- Pipenv (https://pypi.org/project/pipenv/)
- Python 3.6.9 (https://www.python.org/)
- You need to create a new file called config/config.yml containing the settings for your implementation. You can use the config/config.sample.yml file as a template.
- Go into the directory containing this repository
- Run:
pipenv shell
- Run:
pipenv install
- Run:
python -m spacy download en_core_web_sm
- Create a directory called <DOWNLOAD_PATH> (the value you set download_path to in the config file)
- Run:
python run.py -d -e <DATE>
where <DATE> is replaced with the date you want to go back to. For example, to fetch papers back to August 15, 2020, use: 2020-08-15
- On subsequent runs, if you don't want to re-download the JSON, you can simply execute:
python run.py
- Running this script will create a file called results.csv in the <DOWNLOAD_PATH> folder.
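
The manual setup steps above (copying the sample config, creating the download directory, and choosing a date) can also be sketched in Python. This is a minimal sketch and not part of the repository: the helper names parse_cutoff_date and prepare_run are hypothetical, and it assumes that <DATE> is expected in YYYY-MM-DD form, as the 2020-08-15 example suggests. It uses only the standard library and avoids date.fromisoformat, which is not available on Python 3.6.

```python
from datetime import date, datetime
from pathlib import Path
import shutil


def parse_cutoff_date(text: str) -> date:
    """Parse a year-month-day string such as 2020-08-15.

    Raises ValueError if the text does not match that layout.
    Assumption: run.py expects <DATE> in this form.
    """
    return datetime.strptime(text, "%Y-%m-%d").date()


def prepare_run(repo_root: Path, download_path: Path) -> None:
    """Hypothetical helper mirroring the manual setup steps.

    - Copies config/config.sample.yml to config/config.yml if the latter
      does not exist yet (the copy is only a template; edit it afterwards).
    - Creates the download directory if it is missing.
    """
    config = repo_root / "config" / "config.yml"
    sample = repo_root / "config" / "config.sample.yml"
    if not config.exists() and sample.exists():
        shutil.copyfile(str(sample), str(config))
    download_path.mkdir(parents=True, exist_ok=True)
```

For instance, calling prepare_run(Path("."), Path("data")) from the repository root (where "data" stands in for whatever download_path is set to in your config) and checking the date with parse_cutoff_date("2020-08-15") would leave things ready for python run.py -d -e 2020-08-15.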