Tokenizer 1.36.0

guillaumekln released this 13 Jan 15:05 · 13 commits to master since this release

New features

  • [Python] Add argument vocabulary in the Tokenizer constructor to set the vocabulary with a list of tokens instead of using a file
  • [Python] Add function pyonmttok.is_valid_language to check if a language code is valid and can be passed to the Tokenizer constructor
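The two additions above can be combined: check the language code first, then build the tokenizer. The following is a minimal sketch, not taken from the project documentation. The helper name make_tokenizer and its onmttok parameter are hypothetical and exist only for illustration; the "conservative" mode and the lang keyword are assumed from the existing Tokenizer constructor.

```python
def make_tokenizer(lang, mode="conservative", onmttok=None, **options):
    """Build a Tokenizer only if `lang` is accepted by the library.

    `make_tokenizer` and the `onmttok` parameter are hypothetical names for
    this sketch; `onmttok` lets a caller supply the module explicitly.
    """
    if onmttok is None:
        # Imported lazily so the helper can be defined without the package.
        import pyonmttok as onmttok

    # is_valid_language reports whether the code can be passed to Tokenizer.
    if not onmttok.is_valid_language(lang):
        raise ValueError(f"unsupported language code: {lang!r}")

    # Extra keyword options (for example vocabulary=[...]) are forwarded
    # unchanged to the Tokenizer constructor.
    return onmttok.Tokenizer(mode, lang=lang, **options)
```

A call such as make_tokenizer("en", vocabulary=tokens) would forward the token list to the constructor in place of a vocabulary file. Whether the vocabulary argument needs other options, such as a subword model, is not stated in these notes, so check the Tokenizer documentation before relying on it.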