OpenAlex data collection pipeline #57

Open
beingkk opened this issue Oct 14, 2024 · 0 comments

Contributor

beingkk commented Oct 14, 2024

Collect OpenAlex works using our keyword taxonomy. We can build on the simple function in discovery_utils.getters.openalex to check every week for new works matching the taxonomy keywords and add them to our S3.
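As a rough starting point, the weekly check could be driven by a small helper that builds the request parameters for the OpenAlex `/works` endpoint. This is a minimal sketch, not the existing code in discovery_utils.getters.openalex: the function names are hypothetical, and it assumes that "new works" can be approximated by filtering on publication date over the last seven days (works indexed late would be missed). Fetching the pages and writing them to S3 is left out.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def weekly_works_params(keywords, run_date, days=7):
    """Build query parameters for works published in the last `days` days.

    Hypothetical helper: keywords are quoted as phrases and OR-ed together.
    """
    since = run_date - timedelta(days=days)
    phrases = " OR ".join(f'"{kw}"' for kw in keywords)
    return {
        "search": phrases,
        # Assumption: publication date is a good enough proxy for "new".
        "filter": f"from_publication_date:{since.isoformat()}",
        "per-page": 200,  # largest page size OpenAlex accepts
        "cursor": "*",    # start of cursor paging
    }


def weekly_works_url(keywords, run_date):
    """Return the full request URL for the first page of results."""
    params = weekly_works_params(keywords, run_date)
    return f"{OPENALEX_WORKS_URL}?{urlencode(params)}"
```

For example, `weekly_works_url(["heat pump", "obesity"], date(2024, 10, 14))` gives a URL that can be pasted into a browser to eyeball the first page before wiring up the scheduled job.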

Also consider using OpenAlex's concepts or, if concepts are deprecated, their newer topics.
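If we go down this route, the filter could be assembled roughly as below. This is only a sketch: the helper name is hypothetical, the IDs in the usage note are placeholders rather than entries from our taxonomy, and it assumes the `topics.id` / `concepts.id` works filters with `|` as the OR separator.

```python
def taxonomy_filter(ids, entity="topics"):
    """Build an OpenAlex works filter matching any of the given IDs.

    Hypothetical helper: `entity` is "topics" or "concepts".
    """
    if entity not in {"topics", "concepts"}:
        raise ValueError(f"unsupported entity: {entity}")
    if not ids:
        raise ValueError("at least one ID is required")
    # "|" means OR within a single filter attribute.
    return f"{entity}.id:" + "|".join(ids)
```

`taxonomy_filter(["T00001", "T00002"])` would produce `topics.id:T00001|T00002`, which can be combined with a date filter using a comma (AND).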

This needs careful testing of how best to formulate the OpenAlex search, as query results can sometimes be off topic when using the simplest search option (as currently implemented in discovery_utils.getters.openalex).
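One way to structure that testing is to run the same keyword through a few formulations and compare how on-topic the results look. The sketch below is an assumption about how we might set this up, not existing code: the function names are hypothetical, it assumes phrase quoting behaves the same in the `title_and_abstract.search` and `title.search` filters as in the `search` parameter, and the on-topic score is a deliberately crude proxy (keyword present in the title).

```python
def search_variants(keyword):
    """Return candidate query parameters, from broadest to strictest."""
    phrase = f'"{keyword}"'
    return {
        "broad_search": {"search": keyword},
        "phrase_search": {"search": phrase},
        "title_abstract_phrase": {"filter": f"title_and_abstract.search:{phrase}"},
        "title_phrase": {"filter": f"title.search:{phrase}"},
    }


def share_on_topic(works, keyword):
    """Fraction of works whose title contains the keyword (case-insensitive).

    `works` is a list of OpenAlex work records (dicts with a "title" key).
    """
    if not works:
        return 0.0
    needle = keyword.lower()
    hits = sum(needle in (work.get("title") or "").lower() for work in works)
    return hits / len(works)
```

Fetching a sample page for each variant and comparing `share_on_topic` alongside the result counts should give a first indication of the precision/recall trade-off, though a manual review of a sample is probably still needed.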

Re keyword search: it could be useful to reuse the keyword search code from here, but note that it might be deprecated, as we should use the pyalex package going forward.
