- Blackstone - Blackstone is a spaCy model and library for processing long-form, unstructured legal text. Blackstone is an experimental research project from the Incorporated Council of Law Reporting for England and Wales' research lab, ICLR&D.
- CTRL - A Conditional Transformer Language Model for Controllable Generation, released by Salesforce.
- Facebook's XLM - Original PyTorch implementation of Cross-lingual Language Model Pretraining, which includes BERT, XLM, NMT, XNLI, PKM, etc.
- Flair - A simple framework for state-of-the-art NLP, developed by Zalando and built directly on PyTorch.
- GitHub's Semantic - GitHub's text library for parsing, analyzing, and comparing source code across many languages.
- GluonNLP - GluonNLP is a toolkit that enables easy text preprocessing, dataset loading, and neural model building to help you speed up your Natural Language Processing (NLP) research.
- GNES - Generic Neural Elastic Search is a cloud-native semantic search system based on deep neural networks.
- Grover - Grover is a model for Neural Fake News, covering both generation and detection. It can probably also be used for other generation tasks.
- Kashgari - Kashgari is a simple and powerful NLP transfer learning framework for building state-of-the-art models in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.
- OpenAI GPT-2 - OpenAI's code from their paper "Language Models are Unsupervised Multitask Learners".
- sense2vec - A Python library for training and using sense2vec models, which take the same approach as word2vec but also use part-of-speech attributes for each token, making the vectors "meaning-aware".
- Snorkel - Snorkel is a system for quickly generating training data with weak supervision (https://snorkel.org).
- spaCy - Industrial-strength natural language processing library built with Python and Cython by the explosion.ai team.
- Stable Baselines - A fork of OpenAI Baselines that provides implementations of reinforcement learning algorithms (http://stable-baselines.readthedocs.io/).
- TensorFlow Lingvo - A framework for building neural networks in TensorFlow, particularly sequence models. See "Lingvo: A TensorFlow Framework for Sequence Modeling".
- TensorFlow Text - TensorFlow Text provides a collection of text-related classes and ops ready to use with TensorFlow 2.0.
- Wav2Letter++ - A speech-to-text system developed by Facebook's FAIR team.
- YouTokenToMe - YouTokenToMe is an unsupervised text tokenizer focused on computational efficiency. It currently implements fast Byte Pair Encoding (BPE) [Sennrich et al.].
- 🤗 Transformers - Hugging Face's library of state-of-the-art pretrained models for Natural Language Processing (NLP).