Develop means of metadata schema versioning #170

Open
erinspace opened this issue Apr 22, 2015 · 1 comment

Comments

@erinspace
Member

Same as CenterForOpenScience/SHARE#156

When a schema from a provider changes, we'll need to be able to specify which version of the schema we'd like to normalize against.

For example, when PubMed Central changed their metadata, a versioned schema would have let us apply those changes retroactively and use the new schema going forward.

@fabianvf
Contributor

Proposal: Create a Schema class that is initialized with two values: a dictionary that defines the schema (what we currently have) and a function that takes a metadata record and returns a boolean. Create a field on the harvesters called schemas, which is a list of Schema objects. When normalizing a document, iterate through that list and use the first schema whose function returns True when given the metadata document. The last Schema entry in the list will be considered the default (it will have a function that always returns True). @JeffSpies, @chrisseto, @erinspace, thoughts?

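import abc


class Schema(object):
    def __init__(self, schema, fn=lambda doc: True):
        self.schema = schema  # dictionary that defines the schema
        self.matches = fn     # predicate: metadata record -> bool


class BaseHarvester(object):
    def schema(self, doc):
        # Use the first schema that accepts the document. The last entry
        # in self.schemas is the default and is expected to always match.
        for candidate in self.schemas:
            if candidate.matches(doc):
                return candidate
        raise ValueError('no schema matched the document')

    @abc.abstractproperty
    def schemas(self):
        raise NotImplementedError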
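
A rough usage sketch of the proposal, to show how the ordered list and the default entry would interact. Everything specific here is an assumption made for illustration: ExampleHarvester, pick_schema, normalize, and the 'dc:title' / 'article-title' keys are made up, and the flat field-to-key dictionaries are a simplification rather than the schema format the harvesters currently use.

```python
class Schema(object):
    """Pairs a schema dictionary with a predicate that selects it."""

    def __init__(self, schema, fn=lambda doc: True):
        self.schema = schema
        self.matches = fn


class ExampleHarvester(object):
    # Hypothetical provider: records carrying a 'dc:title' key use the old
    # layout; everything else falls through to the default schema.
    schemas = [
        Schema({'title': 'dc:title'}, lambda doc: 'dc:title' in doc),
        Schema({'title': 'article-title'}),  # default, always matches
    ]

    def pick_schema(self, doc):
        # First match wins, so order the list from most to least specific.
        for candidate in self.schemas:
            if candidate.matches(doc):
                return candidate
        raise ValueError('no schema matched')


def normalize(harvester, doc):
    # Toy normalization: copy each raw key's value into its normalized field.
    schema = harvester.pick_schema(doc).schema
    return {field: doc.get(raw_key) for field, raw_key in schema.items()}


harvester = ExampleHarvester()
old_record = {'dc:title': 'An old-format record'}
new_record = {'article-title': 'A new-format record'}
print(normalize(harvester, old_record))
print(normalize(harvester, new_record))
```

Because the default entry matches every record, it has to stay last; any schema listed after it would never be selected.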

@fabianvf changed the title from "Develop means of metadata schema vesioning" to "Develop means of metadata schema versioning" on Apr 22, 2015