From 91729699d9c5b79f75549d11b0dd688e751ddff1 Mon Sep 17 00:00:00 2001
From: Kenneth Enevoldsen
Date: Mon, 1 Apr 2024 11:10:59 +0200
Subject: [PATCH] docs: Updated based on paper feedback

---
 docs/faq.rst   | 30 ------------------------------
 paper/paper.md |  2 +-
 readme.md      |  2 +-
 3 files changed, 2 insertions(+), 32 deletions(-)

diff --git a/docs/faq.rst b/docs/faq.rst
index e9fead1..6bdecce 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -8,36 +8,6 @@ If you use this library in your research, it would be much appreciated it if you
 "cite this repository" on the `github page `__ for an up to date citation.
 
 
-How do I test the code and run the test suite?
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-This package comes with an extensive test suite. In order to run the tests,
-you'll usually want to clone the repository and build the package from the
-source. This will also install the required development dependencies
-and test utilities defined in the extras_require section of the :code:`pyproject.toml`.
-
-.. code-block:: bash
-
- pip install -e ".[tests]"
-
- python -m pytest
-
-
-which will run all the test in the `tests` folder.
-
-Specific tests can be run using:
-
-.. code-block:: bash
-
- python -m pytest tests/desired_test.py
-
-If you want to check code coverage you can run the following:
-
-.. code-block::
-
- python -m pytest --cov=.
-
-
 Does this package run on X?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/paper/paper.md b/paper/paper.md
index c4d8f08..e4a762a 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -39,7 +39,7 @@ Other tools for data augmentation focus on specific downstream application such
 # Features & Functionality
 `Augmenty` is a Python library that implements augmentations based on `spaCy`'s `Doc` object. `spaCy`'s `Doc` object is a container for a text and its annotations. This makes it easy to augment text and annotations simultaneously. The `Doc` object can easily be extended to include custom augmentation not available in `spaCy` by adding custom attributes to the `Doc` object. While `Augmenty` is built to augment `Doc`s the object is easily converted into strings, lists or other formats. The annotations within a `Doc` can be provided either by human annotations or using a trained model.
 
-Augmenty implements a series of augmenters for token-, span- and sentence-level augmentation. These augmenters range from primitive augmentations such as word replacement to language specific augmenters such as keystroke error augmentations based on a French keyboard layout. Augmenty also integrates with other libraries such as `NLTK` [@bird2009natural] to allow for augmentations based on WordNet [@miller-1994-wordnet] and allows for specification of static word vectors [pennington-etal-2014-glove] to allow for augmentations based on word similarity. Lastly, `augmenty` provides a set of utility functions for repeating augmentations, combining augmenters or adjust the percentage of documents that should be augmented. This allow for the flexible construction of augmentation pipelines specific to the task at hand.
+Augmenty implements a series of augmenters for token-, span- and sentence-level augmentation. These augmenters range from primitive augmentations such as word replacement to language specific augmenters such as keystroke error augmentations based on a French keyboard layout. Augmenty also integrates with other libraries such as `NLTK` [@bird2009natural] to allow for augmentations based on WordNet [@miller-1994-wordnet] and allows for specification of static word vectors [@pennington-etal-2014-glove] to allow for augmentations based on word similarity. Lastly, `augmenty` provides a set of utility functions for repeating augmentations, combining augmenters or adjust the percentage of documents that should be augmented. This allow for the flexible construction of augmentation pipelines specific to the task at hand.
 
 # Example Use Cases
 
diff --git a/readme.md b/readme.md
index df2647a..ae20d08 100644
--- a/readme.md
+++ b/readme.md
@@ -44,7 +44,7 @@ nlp = spacy.load("en_core_web_md")
 docs = nlp.pipe(["Augmenty is a great tool for text augmentation"])
 
 entity_augmenter = augmenty.load("ents_replace_v1",
-    ent_dict = {"ORG": [["spaCy"], ["spaCy", "Universe"]]}, level=1)
+    ent_dict = {"GPE": [["spaCy"], ["spaCy", "Universe"]]}, level=1)
 
 for doc in augmenty.docs(docs, augmenter=entity_augmenter, nlp=nlp):
     print(doc)