Add DataCite's schema and update relationships between databases #50

Merged
merged 5 commits into from
Jun 28, 2024
158 changes: 158 additions & 0 deletions docs/datacite.rst
@@ -0,0 +1,158 @@
DataCite publication data
=========================

.. Automatically generated file. Do not modify by hand.

.. code:: py

   from alexandria3k.data_sources import datacite

.. autoclass:: data_sources.datacite.Datacite
   :members: query, populate

Generated schema
----------------

.. code:: sql

   CREATE TABLE dc_works(
     id,
     container_id,
     identifier,
     identifier_type,
     doi,
     publisher,
     publication_year,
     resource_type,
     resource_type_general,
     language,
     sizes,
     formats,
     schema_version,
     metadata_version,
     url,
     created,
     registered,
     published,
     updated
   );

   CREATE TABLE dc_work_creators(
     id,
     container_id,
     work_id,
     name,
     name_type,
     given_name,
     family_name
   );

   CREATE TABLE dc_creator_name_identifiers(
     creator_id,
     container_id,
     name_identifier,
     name_identifier_scheme,
     scheme_uri
   );

   CREATE TABLE dc_creator_affiliations(
     creator_id,
     container_id,
     name
   );

   CREATE TABLE dc_work_titles(
     work_id,
     container_id,
     title,
     title_type
   );

   CREATE TABLE dc_work_subjects(
     work_id,
     container_id,
     subject,
     subject_scheme,
     scheme_uri,
     value_uri
   );

   CREATE TABLE dc_work_contributors(
     id,
     container_id,
     work_id,
     contributor_type,
     name,
     family_name,
     given_name
   );

   CREATE TABLE dc_contributor_name_identifiers(
     contributor_id,
     container_id,
     name_identifier,
     name_identifier_scheme,
     scheme_uri
   );

   CREATE TABLE dc_contributor_affiliations(
     contributor_id,
     container_id,
     name
   );

   CREATE TABLE dc_work_dates(
     work_id,
     container_id,
     date,
     date_type
   );

   CREATE TABLE dc_work_related_identifiers(
     work_id,
     container_id,
     related_identifier,
     related_identifier_type,
     relation_type,
     related_metadata_scheme,
     scheme_uri,
     scheme_type
   );

   CREATE TABLE dc_work_rights(
     work_id,
     container_id,
     rights,
     lang,
     rights_uri,
     rights_identifier,
     rights_identifier_scheme,
     scheme_uri
   );

   CREATE TABLE dc_work_descriptions(
     work_id,
     container_id,
     description,
     description_type
   );

   CREATE TABLE dc_work_geo_locations(
     work_id,
     container_id,
     geo_location_place,
     geo_location_point,
     geo_location_box
   );

   CREATE TABLE dc_work_funding_references(
     work_id,
     container_id,
     funder_name,
     funder_identifier,
     funder_identifier_type,
     award_number,
     award_uri,
     award_title
   );
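
As a rough illustration of how the generated DataCite tables could be queried together, the sketch below builds a trimmed-down copy of three of them in an in-memory SQLite database and counts the creators of each work. It assumes that ``work_id`` in the detail tables refers to ``dc_works.id``, which the diff does not state explicitly. The rows and the helper name ``creators_per_work`` are made up for the example and are not part of alexandria3k.

```python
import sqlite3

# Trimmed-down copies of three tables from the generated schema.
# Assumption: detail tables point at dc_works.id through work_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dc_works(id, container_id, doi, publisher, publication_year);
    CREATE TABLE dc_work_titles(work_id, container_id, title, title_type);
    CREATE TABLE dc_work_creators(id, container_id, work_id, name, name_type,
                                  given_name, family_name);
    """
)

# Made-up rows, for illustration only.
conn.execute(
    "INSERT INTO dc_works VALUES (1, 0, '10.1234/example', 'Example Repository', 2023)"
)
conn.execute("INSERT INTO dc_work_titles VALUES (1, 0, 'An example dataset', NULL)")
conn.executemany(
    "INSERT INTO dc_work_creators VALUES (?, 0, 1, ?, 'Personal', ?, ?)",
    [(1, "Doe, Jane", "Jane", "Doe"), (2, "Roe, Richard", "Richard", "Roe")],
)


def creators_per_work(connection):
    """Return (doi, title, number of creators) for every work."""
    return connection.execute(
        """
        SELECT dc_works.doi, dc_work_titles.title,
               COUNT(DISTINCT dc_work_creators.id)
        FROM dc_works
        LEFT JOIN dc_work_titles ON dc_work_titles.work_id = dc_works.id
        LEFT JOIN dc_work_creators ON dc_work_creators.work_id = dc_works.id
        GROUP BY dc_works.id
        ORDER BY dc_works.doi
        """
    ).fetchall()


rows = creators_per_work(conn)
```

``COUNT(DISTINCT ...)`` is used so that a work with several titles does not inflate the creator count through the join.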

2 changes: 1 addition & 1 deletion docs/dev.rst
@@ -180,7 +180,7 @@ Adding a new data source plugin to *alexandria3k* involves the following steps.
downloading its data.
The data source's name for the CLI is all lowercase (e.g. ``datacite``),
for the class name with an initial capital (e.g. ``Datacite``), and in
- the documentation and schema's as formally spelled (e.g. ``DataCtire``).
+ the documentation and schema's as formally spelled (e.g. ``DataCite``).
All table rows have an ``id`` field, with a unique identifier for that
table across all table rows.
As detail table indices are reset for each record, the identifier
23 changes: 23 additions & 0 deletions docs/issn_subject_codes.rst
@@ -0,0 +1,23 @@
ISSN Subject Codes data processing
==================================

.. Automatically generated file. Do not modify by hand.

.. code:: py

   from alexandria3k.data_sources import issn_subject_codes

.. autoclass:: data_sources.issn_subject_codes.IssnSubjectCodes
   :members: query, populate

Generated schema
----------------

.. code:: sql

   CREATE TABLE issn_subject_codes(
     id INTEGER PRIMARY KEY,
     issn,
     subject_code INTEGER
   );
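
A minimal sketch of how this table might be used: it creates the table exactly as declared above in an in-memory SQLite database, loads a few rows, and looks up the subject codes recorded for one ISSN. The ISSN values, subject codes, and the helper name ``subject_codes_for`` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE issn_subject_codes(
      id INTEGER PRIMARY KEY,
      issn,
      subject_code INTEGER
    )
    """
)

# Made-up rows; id is omitted, so SQLite assigns it automatically
# because the column is declared INTEGER PRIMARY KEY.
conn.executemany(
    "INSERT INTO issn_subject_codes(issn, subject_code) VALUES (?, ?)",
    [("0000-0001", 1712), ("0000-0001", 1700), ("0000-0002", 2600)],
)


def subject_codes_for(connection, issn):
    """Return the sorted subject codes stored for one ISSN."""
    cursor = connection.execute(
        "SELECT subject_code FROM issn_subject_codes "
        "WHERE issn = ? ORDER BY subject_code",
        (issn,),
    )
    return [row[0] for row in cursor]


codes = subject_codes_for(conn, "0000-0001")
ids = [row[0] for row in conn.execute("SELECT id FROM issn_subject_codes ORDER BY id")]
```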

1 change: 1 addition & 0 deletions docs/schema/.gitignore
@@ -5,3 +5,4 @@ other.pdf
ror.pdf
uspto.pdf
pubmed.pdf
datacite.pdf
3,227 changes: 1,818 additions & 1,409 deletions docs/schema/all.svg
33 changes: 18 additions & 15 deletions docs/schema/datacite.dot
@@ -1,18 +1,21 @@
node [fillcolor="#00b0e1", fontcolor="#000000"];
dc_works;
dc_work_geo_locations;
dc_work_funding_references;
dc_work_titles;
dc_creator_name_identifiers;
dc_work_contributors;
dc_work_sizes;
dc_work_subjects;
dc_work_descriptions;
dc_contributor_name_identifiers;
dc_work_formats;
dc_work_related_identifiers;
dc_work_dates;
dc_work_rights;
dc_contributor_affiliations;
dc_work_creators;
dc_creator_affiliations;
dc_work_contributors;

dc_works -> dc_work_creators [headlabel="1…N", taillabel="1"];
dc_works -> dc_work_titles [headlabel="1…N", taillabel="1"];

edge [headlabel="0…N", taillabel="1"];
dc_work_creators -> dc_creator_name_identifiers;
dc_work_creators -> dc_creator_affiliations;
dc_works -> dc_work_subjects;
dc_works -> dc_work_contributors;
dc_work_contributors -> dc_contributor_name_identifiers;
dc_work_contributors -> dc_contributor_affiliations;
dc_works -> dc_work_dates;
dc_works -> dc_work_related_identifiers;
dc_works -> dc_work_rights;
dc_works -> dc_work_geo_locations;
dc_works -> dc_work_descriptions;
dc_works -> dc_work_funding_references;
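
To make the relationships declared by the edges above easier to inspect, the following sketch parses the ``a -> b`` lines with a regular expression and builds a parent-to-children map. It is only an illustration: the names ``DOT_EDGES``, ``parse_edges`` and ``descendants`` are hypothetical, and the pattern handles just the simple edge syntax used in this file, not the full DOT grammar.

```python
import re
from collections import defaultdict

# Edge statements copied from the DOT fragment above.
DOT_EDGES = """
dc_works -> dc_work_creators [headlabel="1…N", taillabel="1"];
dc_works -> dc_work_titles [headlabel="1…N", taillabel="1"];

edge [headlabel="0…N", taillabel="1"];
dc_work_creators -> dc_creator_name_identifiers;
dc_work_creators -> dc_creator_affiliations;
dc_works -> dc_work_subjects;
dc_works -> dc_work_contributors;
dc_work_contributors -> dc_contributor_name_identifiers;
dc_work_contributors -> dc_contributor_affiliations;
dc_works -> dc_work_dates;
dc_works -> dc_work_related_identifiers;
dc_works -> dc_work_rights;
dc_works -> dc_work_geo_locations;
dc_works -> dc_work_descriptions;
dc_works -> dc_work_funding_references;
"""

# Matches "parent -> child" at the start of a line; attributes are ignored.
EDGE_RE = re.compile(r"^\s*(\w+)\s*->\s*(\w+)", re.MULTILINE)


def parse_edges(dot_text):
    """Map each parent table to the list of its detail tables."""
    children = defaultdict(list)
    for parent, child in EDGE_RE.findall(dot_text):
        children[parent].append(child)
    return dict(children)


def descendants(children, table):
    """List every table reachable from the given table, depth first."""
    result = []
    for child in children.get(table, []):
        result.append(child)
        result.extend(descendants(children, child))
    return result


children = parse_edges(DOT_EDGES)
all_detail_tables = descendants(children, "dc_works")
```

Such a map could, for instance, help check that every table in the generated schema is reachable from ``dc_works``.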