release: v1.0.0
severinsimmler authored Jan 6, 2021
2 parents 5d5ef54 + bce8633 commit 22039a7
Showing 28 changed files with 777 additions and 618 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

Linear-chain conditional random fields for natural language processing.

- Chaine is a modern Python library without third-party dependencies and a backend written in C. You can train conditional random fields for natural language processing tasks like [named entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition) or [part-of-speech tagging](https://en.wikipedia.org/wiki/Part-of-speech_tagging).
+ Chaine is a modern Python library without third-party dependencies and a backend written in C. You can train conditional random fields for natural language processing tasks like [named entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition).

- **Lightweight**: No use of bloated third-party libraries.
- **Fast**: Performance critical parts are written in C and thus [blazingly fast](http://www.chokkan.org/software/crfsuite/benchmark.html).
@@ -14,26 +14,26 @@ You can install the latest stable version from [PyPI](https://pypi.org/project/c
$ pip install chaine
```

- If you are interested in the theoretical concepts behind conditional random fields, please refer to the introducing paper by [Lafferty et al](https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers).
+ Please refer to the introducing paper by [Lafferty et al.](https://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers) for the theoretical concepts behind conditional random fields.


- ## Example
+ ## Minimal working example

```python
>>> import chaine
- >>> tokens = [["John", "Lennon", "was", "born", "in" "Liverpool"]]
+ >>> tokens = [["John", "Lennon", "was", "born", "in", "Liverpool"]]
>>> labels = [["B-PER", "I-PER", "O", "O", "O", "B-LOC"]]
>>> model = chaine.train(tokens, labels, max_iterations=5)
>>> model.predict(tokens)
[['B-PER', 'I-PER', 'O', 'O', 'O', 'B-LOC']]
```

- Check out the introducing [Jupyter notebook](https://github.com/severinsimmler/chaine/blob/master/notebooks/tutorial.ipynb).
+ Check out the [examples](https://github.com/severinsimmler/chaine/blob/master/examples) for a more real-world use case.
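
Not part of this diff, but as a minimal sketch grounded in the new `chaine/api.py` further down: `train` serializes the model to `model.crf`, so a trained model can presumably be reloaded later via `chaine.crf.Model`:

```python
>>> from chaine.crf import Model
>>> model = Model("model.crf")  # file written by chaine.train(), per chaine/api.py
>>> model.predict(tokens)       # same tokens as in the README example above
[['B-PER', 'I-PER', 'O', 'O', 'O', 'B-LOC']]
```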


## Credits

- This library makes use of and is partially based on:
+ This project makes use of and is partially based on:

- [CRFsuite](https://github.com/chokkan/crfsuite)
- [libLBFGS](https://github.com/chokkan/liblbfgs)
4 changes: 2 additions & 2 deletions chaine/__init__.py
@@ -1,2 +1,2 @@
- from chaine.training import train
- from chaine.crf import Model, Trainer
+ from chaine.api import train
+ from chaine import crf
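
For context, a minimal sketch of the two entry points this reorganization exposes, assuming `crf.Trainer` and `crf.Model` keep the interface used by `chaine/api.py` below:

```python
import chaine           # high-level API: chaine.train(...)
from chaine import crf  # low-level API: crf.Trainer and crf.Model

tokens = [["John", "Lennon", "was", "born", "in", "Liverpool"]]
labels = [["B-PER", "I-PER", "O", "O", "O", "B-LOC"]]

# the low-level route, mirroring what chaine.api.train does internally
trainer = crf.Trainer(max_iterations=5)
trainer.train(tokens, labels, "model.crf")
model = crf.Model("model.crf")
```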
150 changes: 150 additions & 0 deletions chaine/api.py
@@ -0,0 +1,150 @@
"""
chaine.api
~~~~~~~~~~
This module implements the high-level API to train a conditional random field
"""

from chaine.crf import Model, Trainer
from chaine.typing import Dataset, Labels


def train(dataset: Dataset, labels: Labels, **kwargs) -> Model:
"""Train a conditional random field
Parameters
----------
dataset : Dataset
Dataset consisting of sequences of feature sets
labels : Labels
Labels corresponding to each instance in the dataset
algorithm : str
Following algorithms are available:
* lbfgs: Limited-memory BFGS with L1/L2 regularization
* l2sgd: Stochastic gradient descent with L2 regularization
* ap: Averaged perceptron
* pa: Passive aggressive
* arow: Adaptive regularization of weights
Limited-memory BFGS Parameters
------------------------------
min_freq : float, optional (default=0)
Threshold value for minimum frequency of a feature occurring in training data
all_possible_states : bool, optional (default=False)
Generate state features that do not even occur in the training data
all_possible_transitions : bool, optional (default=False)
Generate transition features that do not even occur in the training data
max_iterations : int, optional (default=None)
Maximum number of iterations (unlimited by default)
num_memories : int, optional (default=6)
Number of limited memories for approximating the inverse hessian matrix
c1 : float, optional (default=0)
Coefficient for L1 regularization
c2 : float, optional (default=1.0)
Coefficient for L2 regularization
epsilon : float, optional (default=1e-5)
Parameter that determines the condition of convergence
period : int, optional (default=10)
Threshold value for iterations to test the stopping criterion
delta : float, optional (default=1e-5)
Top iteration when log likelihood is not greater than this
linesearch : str, optional (default="MoreThuente")
Line search algorithm used in updates:
* MoreThuente: More and Thuente's method
* Backtracking: Backtracking method with regular Wolfe condition
* StrongBacktracking: Backtracking method with strong Wolfe condition
max_linesearch : int, optional (default=20)
Maximum number of trials for the line search algorithm
SGD with L2 Parameters
----------------------
min_freq : float, optional (default=0)
Threshold value for minimum frequency of a feature occurring in training data
all_possible_states : bool, optional (default=False)
Generate state features that do not even occur in the training data
all_possible_transitions : bool, optional (default=False)
Generate transition features that do not even occur in the training data
max_iterations : int, optional (default=None)
Maximum number of iterations (1000 by default)
c2 : float, optional (default=1.0)
Coefficient for L2 regularization
period : int, optional (default=10)
Threshold value for iterations to test the stopping criterion
delta : float, optional (default=1e-5)
Top iteration when log likelihood is not greater than this
calibration_eta : float, optional (default=0.1)
Initial value of learning rate (eta) used for calibration
calibration_rate : float, optional (default=2.0)
Rate of increase/decrease of learning rate for calibration
calibration_samples : int, optional (default=1000)
Number of instances used for calibration
calibration_candidates : int, optional (default=10)
Number of candidates of learning rate
calibration_max_trials : int, optional (default=20)
Maximum number of trials of learning rates for calibration
Averaged Perceptron Parameters
------------------------------
min_freq : float, optional (default=0)
Threshold value for minimum frequency of a feature occurring in training data
all_possible_states : bool, optional (default=False)
Generate state features that do not even occur in the training data
all_possible_transitions : bool, optional (default=False)
Generate transition features that do not even occur in the training data
max_iterations : int, optional (default=None)
Maximum number of iterations (100 by default)
epsilon : float, optional (default=1e-5)
Parameter that determines the condition of convergence
Passive Aggressive Parameters
-----------------------------
min_freq : float, optional (default=0)
Threshold value for minimum frequency of a feature occurring in training data
all_possible_states : bool, optional (default=False)
Generate state features that do not even occur in the training data
all_possible_transitions : bool, optional (default=False)
Generate transition features that do not even occur in the training data
max_iterations : int, optional (default=None)
Maximum number of iterations (100 by default)
epsilon : float, optional (default=1e-5)
Parameter that determines the condition of convergence
pa_type : int, optional (default=1)
Strategy for updating feature weights:
* 0: PA without slack variables
* 1: PA type I
* 2: PA type II
c : float, optional (default=1)
Aggressiveness parameter (used only for PA-I and PA-II)
error_sensitive : bool, optional (default=True)
Include square root of predicted incorrect labels into optimization routine
averaging : bool, optional (default=True)
Compute average of feature weights at all updates
Adaptive Regularization of Weights (AROW) Parameters
----------------------------------------------------
min_freq : float, optional (default=0)
Threshold value for minimum frequency of a feature occurring in training data
all_possible_states : bool, optional (default=False)
Generate state features that do not even occur in the training data
all_possible_transitions : bool, optional (default=False)
Generate transition features that do not even occur in the training data
max_iterations : int, optional (default=None)
Maximum number of iterations (100 by default)
epsilon : float, optional (default=1e-5)
Parameter that determines the condition of convergence
variance : float, optional (default=1)
Initial variance of every feature weight
gamma : float, optional (default=1)
Trade-off between loss function and changes of feature weights
Returns
-------
Model
A conditional random field trained on the dataset
"""
# initialize trainer and start training
trainer = Trainer(**kwargs)
trainer.train(dataset, labels, "model.crf")

# load and return the trained model
return Model("model.crf")
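
To tie the new module together, a hedged usage sketch of this `train` function; the keyword names come from the docstring above (assuming `Trainer` accepts them, including `algorithm`), the values are illustrative picks, and `model.crf` is written to the working directory as a side effect:

```python
import chaine

dataset = [["John", "Lennon", "was", "born", "in", "Liverpool"]]
labels = [["B-PER", "I-PER", "O", "O", "O", "B-LOC"]]

# keyword arguments are forwarded verbatim to Trainer
model = chaine.train(
    dataset,
    labels,
    algorithm="lbfgs",  # one of: lbfgs, l2sgd, ap, pa, arow
    max_iterations=100,
    c1=0.1,             # L1 regularization coefficient
    c2=1.0,             # L2 regularization coefficient
)

print(model.predict(dataset))  # e.g. [['B-PER', 'I-PER', 'O', 'O', 'O', 'B-LOC']]
```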
