Skip to content
/ edsnlp Public

Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.

License

Notifications You must be signed in to change notification settings

aphp/edsnlp

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Folders and files

NameName
Last commit message
Last commit date

Latest commit

53842f1 · Feb 24, 2024
Feb 24, 2024
Dec 4, 2023
Feb 23, 2024
Feb 23, 2024
Sep 5, 2022
Feb 28, 2023
Feb 23, 2024
Jan 18, 2022
Mar 19, 2023
Dec 4, 2023
Dec 4, 2023
Jun 2, 2021
Sep 13, 2023
Feb 23, 2024
Feb 23, 2024
Dec 4, 2023
Feb 19, 2024
Feb 19, 2024
Dec 4, 2023

Repository files navigation

Tests Documentation PyPI Demo Codecov DOI

EDS-NLP

EDS-NLP is a collaborative NLP framework that aims primarily at extracting information from French clinical notes. At its core, it is a collection of components or pipes, either rule-based functions or deep learning modules. These components are organized into a novel efficient and modular pipeline system, built for hybrid and multitask models. We use spaCy to represent documents and their annotations, and Pytorch as a deep-learning backend for trainable components.

EDS-NLP is versatile and can be used on any textual document. The rule-based components are fully compatible with spaCy's components, and vice versa. This library is a product of collaborative effort, and we encourage further contributions to enhance its capabilities.

Check out our interactive demo !

Features

Quick start

Installation

You can install EDS-NLP via pip. We recommend pinning the library version in your projects, or use a strict package manager like Poetry.

pip install edsnlp==0.10.6

or if you want to use the trainable components (using pytorch)

pip install "edsnlp[ml]==0.10.6"

A first pipeline

Once you've installed the library, let's begin with a very simple example that extracts mentions of COVID19 in a text, and detects whether they are negated.

import edsnlp

nlp = edsnlp.blank("eds")

terms = dict(
    covid=["covid", "coronavirus"],
)

# Split the documents into sentences, this isneeded for negation detection
nlp.add_pipe("eds.sentences")
# Matcher component
nlp.add_pipe("eds.matcher", config=dict(terms=terms))
# Negation detection
nlp.add_pipe("eds.negation")

# Process your text in one call !
doc = nlp("Le patient n'est pas atteint de covid")

doc.ents
# Out: (covid,)

doc.ents[0]._.negation
# Out: True

Documentation & Tutorials

Go to the documentation for more information.

Disclaimer

The performances of an extraction pipeline may depend on the population and documents that are considered.

Contributing to EDS-NLP

We welcome contributions ! Fork the project and propose a pull request. Take a look at the dedicated page for detail.

Citation

If you use EDS-NLP, please cite us as below.

@misc{edsnlp,
  author = {Wajsburt, Perceval and Petit-Jean, Thomas and Dura, Basile and Cohen, Ariel and Jean, Charline and Bey, Romain},
  doi    = {10.5281/zenodo.6424993},
  title  = {EDS-NLP: efficient information extraction from French clinical notes},
  url    = {https://aphp.github.io/edsnlp}
}

Acknowledgement

We would like to thank Assistance Publique – Hôpitaux de Paris, AP-HP Foundation and Inria for funding this project.