Commit

Merge pull request #223 from KennethEnevoldsen/paper-feedback
docs: Updated based on paper feedback
KennethEnevoldsen authored Apr 1, 2024
2 parents d85a31f + 9172969 commit 2822710
Showing 3 changed files with 2 additions and 32 deletions.
30 changes: 0 additions & 30 deletions docs/faq.rst
@@ -8,36 +8,6 @@ If you use this library in your research, it would be much appreciated it if you
"cite this repository" on the `github page <https://github.com/KennethEnevoldsen/augmenty>`__ for an up to date citation.


How do I test the code and run the test suite?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This package comes with an extensive test suite. In order to run the tests,
you'll usually want to clone the repository and build the package from the
source. This will also install the required development dependencies
and test utilities defined in the extras_require section of the :code:`pyproject.toml`.

.. code-block:: bash

   pip install -e ".[tests]"
   python -m pytest

which will run all the test in the `tests` folder.

Specific tests can be run using:

.. code-block:: bash

   python -m pytest tests/desired_test.py

If you want to check code coverage you can run the following:

.. code-block::

   python -m pytest --cov=.

Does this package run on X?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

2 changes: 1 addition & 1 deletion paper/paper.md
@@ -39,7 +39,7 @@ Other tools for data augmentation focus on specific downstream application such
# Features & Functionality
`Augmenty` is a Python library that implements augmentations based on `spaCy`'s `Doc` object. `spaCy`'s `Doc` object is a container for a text and its annotations. This makes it easy to augment text and annotations simultaneously. The `Doc` object can easily be extended to include custom augmentation not available in `spaCy` by adding custom attributes to the `Doc` object. While `Augmenty` is built to augment `Doc`s the object is easily converted into strings, lists or other formats. The annotations within a `Doc` can be provided either by human annotations or using a trained model.

-Augmenty implements a series of augmenters for token-, span- and sentence-level augmentation. These augmenters range from primitive augmentations such as word replacement to language specific augmenters such as keystroke error augmentations based on a French keyboard layout. Augmenty also integrates with other libraries such as `NLTK` [@bird2009natural] to allow for augmentations based on WordNet [@miller-1994-wordnet] and allows for specification of static word vectors [pennington-etal-2014-glove] to allow for augmentations based on word similarity. Lastly, `augmenty` provides a set of utility functions for repeating augmentations, combining augmenters or adjust the percentage of documents that should be augmented. This allow for the flexible construction of augmentation pipelines specific to the task at hand.
+Augmenty implements a series of augmenters for token-, span- and sentence-level augmentation. These augmenters range from primitive augmentations such as word replacement to language specific augmenters such as keystroke error augmentations based on a French keyboard layout. Augmenty also integrates with other libraries such as `NLTK` [@bird2009natural] to allow for augmentations based on WordNet [@miller-1994-wordnet] and allows for specification of static word vectors [@pennington-etal-2014-glove] to allow for augmentations based on word similarity. Lastly, `augmenty` provides a set of utility functions for repeating augmentations, combining augmenters or adjust the percentage of documents that should be augmented. This allow for the flexible construction of augmentation pipelines specific to the task at hand.

# Example Use Cases

2 changes: 1 addition & 1 deletion readme.md
@@ -44,7 +44,7 @@ nlp = spacy.load("en_core_web_md")
docs = nlp.pipe(["Augmenty is a great tool for text augmentation"])

entity_augmenter = augmenty.load("ents_replace_v1",
-    ent_dict = {"ORG": [["spaCy"], ["spaCy", "Universe"]]}, level=1)
+    ent_dict = {"GPE": [["spaCy"], ["spaCy", "Universe"]]}, level=1)

for doc in augmenty.docs(docs, augmenter=entity_augmenter, nlp=nlp):
    print(doc)