chore: bump version to 0.12.0 #296
Coverage Report
Files without new missing coverage
257 files skipped due to complete coverage. Coverage success: total of 97.60% is above 97.58% 🎉
Changelog
Added
- The `eds.transformer` component now accepts `prompts` (passed to its `preprocess` method, see breaking change below) to add before each window of text to embed.
- `LazyCollection.map` / `map_batches` now support generator functions as arguments.
- [...] `eds.transformer` component by `training_stride = False`
- `eds.ner_overlap_scorer` to evaluate matches between two lists of entities, counting a match as true when the Dice overlap is above a given threshold.
- `edsnlp.load` now accepts EDS-NLP models from the Hugging Face Hub 🤗!
Changed
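The overlap criterion behind `eds.ner_overlap_scorer`, listed under Added, can be illustrated with a small standalone sketch. This is not the component's implementation: the `(start, end, label)` span tuples, the helper names `dice_overlap` and `overlap_scores`, the greedy one-to-one matching, the label check and the strict `>` comparison are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an entity: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def dice_overlap(a: Span, b: Span) -> float:
    """Dice coefficient of two spans: 2 * |A ∩ B| / (|A| + |B|)."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    total = (a[1] - a[0]) + (b[1] - b[0])
    return 2 * intersection / total if total else 0.0


def overlap_scores(
    pred: List[Span], gold: List[Span], threshold: float = 0.5
) -> Dict[str, float]:
    """Count a predicted span as a true positive when it overlaps an unmatched
    gold span of the same label with a Dice score above `threshold`."""
    matched_gold = set()
    tp = 0
    for p in pred:
        for i, g in enumerate(gold):
            # Assumption: each gold span can be matched at most once,
            # and labels must agree.
            if i in matched_gold or p[2] != g[2]:
                continue
            if dice_overlap(p, g) > threshold:
                matched_gold.add(i)
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "precision": precision, "recall": recall, "f1": f1}


gold = [(0, 10, "drug"), (20, 30, "dose")]
pred = [(2, 10, "drug"), (29, 40, "dose")]
print(overlap_scores(pred, gold, threshold=0.5))
```

With a threshold of 0.5, the first prediction matches (Dice of 16/18) while the second barely touches its gold span (Dice of 2/21) and is counted as a miss. Raising the threshold towards 1.0 moves the criterion closer to exact matching.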
- 💥 Major breaking change in trainable components, moving towards a more "task-centric" design:
  - The `eds.transformer` component is no longer responsible for deciding which spans of text ("contexts") should be embedded. These contexts are now passed via the `preprocess` method, which now accepts more arguments than just the docs to process.
  - `eds.span_pooler` is no longer responsible for deciding which spans to pool, and instead pools all spans passed to it in the `preprocess` method.

  Consequently, `eds.transformer` and `eds.span_pooler` no longer accept their `span_getter` argument, and the `eds.ner_crf`, `eds.span_classifier`, `eds.span_linker` and `eds.span_qualifier` components now accept a `context_getter` argument instead, as well as a `span_getter` argument for the latter two. This refactoring can be summarized as follows: [...]

  and as an example for the `eds.span_linker` component: [...]
- Trainable embedding components now all use `foldedtensor` to return embeddings, instead of returning a tensor of floats and a mask tensor.
- 💥 TorchComponent `__call__` no longer applies the end-to-end method, and instead calls the `forward` method directly, like all torch modules.
- The trainable `eds.span_qualifier` component has been renamed to `eds.span_classifier` to reflect its general purpose (it doesn't only predict qualifiers, but any attribute of a span, using its context or not).
- The `omop` converter now takes the `note_datetime` field into account by default when building a document.
- `span._.date.to_datetime()` and `span._.date.to_duration()` now automatically take the `note_datetime` into account.
- `nlp.vocab` is no longer serialized when saving a model, as it may contain sensitive information and can be recomputed during inference anyway.
Fixed
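As a rough illustration of what taking `note_datetime` into account means for the date entries listed under Changed, the sketch below resolves a relative date mention against the date of the note it appears in. It is a toy model, not EDS-NLP code: the `RelativeDate` and `Note` classes and the `resolve` helper are hypothetical names, and the real `span._.date` objects are richer than a day offset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RelativeDate:
    """Hypothetical stand-in for a relative mention such as "three days ago"."""

    days: int  # negative for the past, positive for the future

    def to_datetime(self, note_datetime: Optional[datetime] = None) -> Optional[datetime]:
        # A relative mention cannot be resolved without a reference date.
        if note_datetime is None:
            return None
        return note_datetime + timedelta(days=self.days)


@dataclass
class Note:
    """Hypothetical document carrying its own creation date."""

    text: str
    note_datetime: Optional[datetime] = None


def resolve(note: Note, date: RelativeDate) -> Optional[datetime]:
    # The reference date is taken from the note itself, so the caller
    # does not have to pass it explicitly.
    return date.to_datetime(note.note_datetime)


note = Note("Seen three days ago for chest pain.", datetime(2024, 3, 10))
print(resolve(note, RelativeDate(days=-3)))
```

The point of the design is that the reference date travels with the document, so downstream code calling the conversion does not need to know where the note date comes from.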
- `edsnlp.data.read_json` now correctly reads the files from the directory passed as an argument, and not from the parent directory.
Checklist