Simple text to phonemes converter for multiple languages

resemble-ai/phonemizer

 
 


Phonemizer -- foʊnmaɪzɚ

  • The phonemizer allows simple phonemization of words and texts in many languages.

  • Provides both the phonemize command-line tool and the Python function phonemizer.phonemize.

  • It uses four backends: espeak, espeak-mbrola, festival and segments.

    • espeak-ng supports a lot of languages and IPA (International Phonetic Alphabet) output.

    • espeak-ng-mbrola uses the SAMPA phonetic alphabet instead of IPA but does not preserve word boundaries.

    • festival currently supports only American English. It uses a custom phoneset, but it allows tokenization at the syllable level.

    • segments is a Unicode tokenizer that builds a phonemization from a grapheme-to-phoneme mapping provided as a file by the user.

Installation

Python 3.6 or later is required. If you really need Python 2, use an older version of the phonemizer.

Dependencies

  • You need to install festival, espeak-ng and mbrola on your system. On Debian/Ubuntu simply run:

      $ sudo apt-get install festival espeak-ng mbrola
    
  • When using the espeak-mbrola backend, additional mbrola voices must be installed (see here). On Debian/Ubuntu, list the possible voices with apt search mbrola.
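Before installing the phonemizer itself, it can help to check which of these system dependencies are already present. The sketch below only looks for executables on the PATH using the standard library. The executable names (`espeak-ng`, `festival`, `mbrola`) are assumed to match the package names above, and the helper `find_backend_executables` is hypothetical; the phonemizer may locate its backends differently.

```python
import shutil


def find_backend_executables(names=("espeak-ng", "festival", "mbrola")):
    """Map each executable name to True if it is found on the PATH.

    The default names are an assumption based on the Debian/Ubuntu
    package names; adjust them for your system.
    """
    return {name: shutil.which(name) is not None for name in names}


def missing_backends(names=("espeak-ng", "festival", "mbrola")):
    """Return the sorted list of executables that were not found."""
    found = find_backend_executables(names)
    return sorted(name for name, present in found.items() if not present)
```

Calling `missing_backends()` returns an empty list when all three executables are reachable.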

Phonemizer

  • The simplest way is using pip:

      $ pip install phonemizer
    
  • OR install it from sources with:

      $ git clone https://github.com/bootphon/phonemizer
      $ cd phonemizer
      $ [sudo] python setup.py install
    

    If you encounter an error such as ImportError: No module named setuptools during installation, refer to issue 11.

Docker image

Alternatively, you can run the phonemizer within Docker, using the provided Dockerfile. To build the Docker image, run:

$ git clone https://github.com/bootphon/phonemizer
$ cd phonemizer
$ sudo docker build -t phonemizer .

Then run an interactive session with:

$ sudo docker run -it phonemizer /bin/bash

Testing

When installed from sources or within a Docker image, you can run the test suite from the root phonemizer folder (once you have installed pytest):

$ pip install pytest
$ pytest

Python usage

In Python import the phonemize function with from phonemizer import phonemize. See here for function documentation.
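As a minimal sketch of a Python call, the wrapper below imports `phonemize` lazily and returns `None` when the package or the requested backend is unavailable. The helper name `try_phonemize` is hypothetical, and the keyword arguments `language`, `backend` and `strip` are assumed to mirror the `-l`, `-b` and `--strip` command-line options; check the function documentation for the exact signature in your version.

```python
def try_phonemize(text, language="en-us", backend="espeak", strip=True):
    """Phonemize `text`, or return None if the phonemizer cannot run here.

    Hypothetical convenience wrapper: the keyword arguments are assumed
    to correspond to the -l, -b and --strip command-line options.
    """
    try:
        from phonemizer import phonemize
    except ImportError:
        return None  # the phonemizer package is not installed
    try:
        return phonemize(text, language=language, backend=backend, strip=strip)
    except (RuntimeError, OSError):
        return None  # e.g. the requested backend is not installed
```

With espeak installed, `try_phonemize("hello world")` should return the same phonemization as `echo "hello world" | phonemize --strip`.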

Command-line examples

The examples below can also be run from Python using the phonemize function.

For a complete list of available options, run:

$ phonemize --help

See the installed backends with the --version option:

$ phonemize --version
phonemizer-3.0
available backends: espeak-ng-1.50, espeak-mbrola, festival-2.5.0, segments-2.1.3

Input/output examples

  • From stdin to stdout:

      $ echo "hello world" | phonemize
      həloʊ wɜːld
    
  • Prepend the input text to output:

      $ echo "hello world" | phonemize --prepend-text
      hello world | həloʊ wɜːld
      $ echo "hello world" | phonemize --prepend-text=';'
      hello world ; həloʊ wɜːld
    
  • From file to stdout:

      $ echo "hello world" > hello.txt
      $ phonemize hello.txt
      həloʊ wɜːld
    
  • From file to file:

      $ phonemize hello.txt -o hello.phon --strip
      $ cat hello.phon
      həloʊ wɜːld
    

Backends

  • The default backend is espeak with US English (en-us):

      $ echo "hello world" | phonemize
      həloʊ wɜːld
      $ echo "hello world" | phonemize -l en-us -b espeak
      həloʊ wɜːld
      $ echo 'hello world' | phonemize -l en-us -b espeak --tie
      həlo͡ʊ wɜːld
    
  • Use festival US English instead

      $ echo "hello world" | phonemize -l en-us -b festival
      hhaxlow werld
    
  • In French, using espeak and espeak-mbrola, with custom token separators (see below). espeak-mbrola does not support word separation.

      $ echo "bonjour le monde" | phonemize -b espeak -l fr-fr -p ' ' -w '/w '
      b ɔ̃ ʒ u ʁ /w l ə /w m ɔ̃ d /w
      $ echo "bonjour le monde" | phonemize -b espeak-mbrola -l mb-fr1 -p ' ' -w '/w '
      b o~ Z u R l @ m o~ d
    
  • In Japanese, using segments

      $ echo 'konnichiwa' | phonemize -b segments -l japanese
      konnitʃiwa
      $ echo 'konnichiwa' | phonemize -b segments -l ./phonemizer/share/segments/japanese.g2p
      konnitʃiwa
    

Supported languages

The exhaustive list of supported languages is available with the command phonemize --list-languages [--backend <backend>].

  • Languages supported by espeak are available here.

  • Languages supported by espeak-mbrola are available here. Please note that the mbrola voices are not bundled with the phonemizer and must be installed separately.

  • Languages supported by festival are:

      en-us -> english-us
    
  • Languages supported by the segments backend are:

      chintang  -> ./phonemizer/share/segments/chintang.g2p
      cree      -> ./phonemizer/share/segments/cree.g2p
      inuktitut -> ./phonemizer/share/segments/inuktitut.g2p
      japanese  -> ./phonemizer/share/segments/japanese.g2p
      sesotho   -> ./phonemizer/share/segments/sesotho.g2p
      yucatec   -> ./phonemizer/share/segments/yucatec.g2p
    

    Instead of a language you can also provide a file specifying a grapheme to phone mapping (see the files above for examples).
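To illustrate the idea of a grapheme-to-phone mapping, here is a minimal sketch of a greedy longest-match converter. It is not the segments backend's implementation. The file format (one `grapheme phone` pair per line, separated by whitespace) and the small `EXAMPLE_G2P` table are assumptions made for illustration; see the bundled .g2p files for the real format and content.

```python
def load_g2p(lines):
    """Parse 'grapheme phone' pairs, one per line (assumed format)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        grapheme, phone = line.split(maxsplit=1)
        mapping[grapheme] = phone
    return mapping


def apply_g2p(text, mapping):
    """Convert `text` by always consuming the longest known grapheme."""
    longest = max(len(grapheme) for grapheme in mapping)
    output, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                output.append(mapping[chunk])
                i += size
                break
        else:
            raise ValueError(f"no mapping for {text[i]!r} at position {i}")
    return "".join(output)


# Hypothetical excerpt of a mapping, just enough for one word.
EXAMPLE_G2P = """
a a
i i
o o
k k
n n
w w
ch tʃ
""".splitlines()
```

With this table, `apply_g2p("konnichiwa", load_g2p(EXAMPLE_G2P))` returns `konnitʃiwa`, because the two-character grapheme `ch` is matched before any single character.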

Token separators

You can specify separators for phones, syllables (festival only) and words (except with espeak-mbrola).

$ echo "hello world" | phonemize -b festival -w ' ' -p ''
hhaxlow werld

$ echo "hello world" | phonemize -b festival -p ' ' -w ''
hh ax l ow w er l d

$ echo "hello world" | phonemize -b festival -p '-' -s '|'
hh-ax-l-|ow-| w-er-l-d-|

$ echo "hello world" | phonemize -b festival -p '-' -s '|' --strip
hh-ax-l|ow w-er-l-d

$ echo "hello world" | phonemize -b festival -p ' ' -s ';esyll ' -w ';eword '
hh ax l ;esyll ow ;esyll ;eword w er l d ;esyll ;eword
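The outputs above suggest a simple rule: without --strip, every phone, syllable and word is followed by its separator; with --strip, separators appear only between tokens. The sketch below reproduces that behaviour on an already-syllabified input. It is an illustration inferred from the examples, not the phonemizer's own code, and `join_tokens` and `HELLO_WORLD` are hypothetical names.

```python
def join_tokens(words, phone="", syllable="", word=" ", strip=False):
    """Join nested tokens (words > syllables > phones) with separators.

    `words` is a list of words, each a list of syllables, each a list
    of phones. Behaviour is inferred from the command-line examples.
    """
    if strip:
        # Separators only between tokens, never trailing.
        return word.join(
            syllable.join(phone.join(syl) for syl in syllables)
            for syllables in words
        )
    # Every token is followed by the separator of its level.
    return "".join(
        "".join("".join(p + phone for p in syl) + syllable for syl in syllables)
        + word
        for syllables in words
    )


# "hello world" as syllabified by festival in the examples above.
HELLO_WORLD = [[["hh", "ax", "l"], ["ow"]], [["w", "er", "l", "d"]]]
```

For instance, `join_tokens(HELLO_WORLD, phone="-", syllable="|", strip=True)` gives `hh-ax-l|ow w-er-l-d`. Without `strip`, the result also ends with the word separator.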

You cannot specify the same separator for several tokens (for instance a space for both phones and words):

$ echo "hello world" | phonemize -b festival -p ' ' -w ' '
fatal error: illegal separator with word=" ", syllable="" and phone=" ",
must be all differents if not empty
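The rule behind this error can be sketched in a few lines: empty separators are ignored, and the remaining ones must be pairwise distinct. The function name `check_separators` and the exception type are assumptions for illustration; the phonemizer reports the problem with its own message, shown above.

```python
def check_separators(phone="", syllable="", word=" "):
    """Raise ValueError if two non-empty separators are identical."""
    used = [sep for sep in (phone, syllable, word) if sep]
    if len(used) != len(set(used)):
        raise ValueError(
            f'illegal separator with word="{word}", '
            f'syllable="{syllable}" and phone="{phone}"'
        )
    return True
```

`check_separators(phone=" ", word="")` passes because the empty word separator is not counted, whereas `check_separators(phone=" ", word=" ")` raises.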

Punctuation

By default the punctuation is removed in the phonemized output. You can preserve it using the --preserve-punctuation option (not supported by the espeak-mbrola backend):

$ echo "hello, world!" | phonemize --strip
həloʊ wɜːld

$ echo "hello, world!" | phonemize --preserve-punctuation --strip
həloʊ, wɜːld!
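One way to picture the two behaviours: either punctuation is dropped before phonemization, or the text is split at punctuation marks, each chunk is phonemized, and the marks are put back. The sketch below does this with the standard library. `DEFAULT_MARKS` is an illustrative subset, not the phonemizer's actual default, and `phonemize_chunk` stands for any callable that phonemizes one punctuation-free chunk (here a hypothetical lookup table).

```python
import re

# Assumption: a small illustrative set of punctuation marks.
DEFAULT_MARKS = ";:,.!?"


def remove_punctuation(text, marks=DEFAULT_MARKS):
    """Replace runs of punctuation by a space and normalize whitespace."""
    pattern = "[" + re.escape(marks) + "]+"
    return re.sub(r"\s+", " ", re.sub(pattern, " ", text)).strip()


def preserve_punctuation(text, phonemize_chunk, marks=DEFAULT_MARKS):
    """Phonemize the chunks between punctuation marks, keeping the marks."""
    pattern = "([" + re.escape(marks) + r"]+\s*)"
    pieces = [piece for piece in re.split(pattern, text) if piece]
    return "".join(
        piece if re.match(pattern, piece) else phonemize_chunk(piece)
        for piece in pieces
    )


# Hypothetical stand-in for a real backend, for demonstration only.
LOOKUP = {"hello": "həloʊ", "world": "wɜːld"}
```

Here `remove_punctuation("hello, world!")` gives `hello world`, and `preserve_punctuation("hello, world!", LOOKUP.__getitem__)` gives `həloʊ, wɜːld!`, mirroring the two commands above.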

Espeak specific options

  • The espeak backend can output the stresses on phones:

      $ echo "hello world" | phonemize -l en-us -b espeak --with-stress
      həlˈoʊ wˈɜːld
    
  • The espeak backend can add a tie character to multi-character phonemes:

      $ echo "hello world" | phonemize -l en-us -b espeak --tie
      həlo͡ʊ wɜːld
    
  • The espeak backend can switch languages during phonemization (below, from French to English). Use the --language-switch option to choose how such switches are handled:

      $ echo "j'aime le football" | phonemize -l fr-fr -b espeak --language-switch keep-flags
      [WARNING] fount 1 utterances containing language switches on lines 1
      [WARNING] extra phones may appear in the "fr-fr" phoneset
      [WARNING] language switch flags have been kept (applying "keep-flags" policy)
      ʒɛm lə- (en)fʊtbɔːl(fr)
    
      $ echo "j'aime le football" | phonemize -l fr-fr -b espeak --language-switch remove-flags
      [WARNING] fount 1 utterances containing language switches on lines 1
      [WARNING] extra phones may appear in the "fr-fr" phoneset
      [WARNING] language switch flags have been removed (applying "remove-flags" policy)
      ʒɛm lə- fʊtbɔːl
    
      $ echo "j'aime le football" | phonemize -l fr-fr -b espeak --language-switch remove-utterance
      [WARNING] removed 1 utterances containing language switches (applying "remove-utterance" policy)
    
  • The espeak backend sometimes merges words together in the output. Use the --words-mismatch option to deal with it:

      $ echo "that's it, words are merged" | phonemize -l en-us -b espeak
      [WARNING] words count mismatch on 100.0% of the lines (1/1)
      ðætsɪt wɜːdz ɑːɹ mɜːdʒd
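The language-switch policies and the word-count check shown above can be sketched as plain post-processing steps. This is an illustration inferred from the example outputs, not the phonemizer's implementation. The flag pattern (a parenthesized language code such as `(en)`) and the function names are assumptions.

```python
import re

# Assumption: language switch flags look like "(en)" or "(fr)".
_FLAG = re.compile(r"\([^)]+\)")


def apply_language_switch(utterances, policy="keep-flags"):
    """Apply one of the three policies to phonemized utterances."""
    if policy == "keep-flags":
        return list(utterances)
    if policy == "remove-flags":
        return [_FLAG.sub("", utt) for utt in utterances]
    if policy == "remove-utterance":
        return [utt for utt in utterances if not _FLAG.search(utt)]
    raise ValueError(f"unknown policy: {policy!r}")


def count_word_mismatches(texts, phonemized):
    """Return (count, percentage) of lines whose word counts differ."""
    pairs = list(zip(texts, phonemized))
    count = sum(len(t.split()) != len(p.split()) for t, p in pairs)
    percent = 100.0 * count / len(pairs) if pairs else 0.0
    return count, percent
```

Applied to the French example, the `remove-flags` policy turns `ʒɛm lə- (en)fʊtbɔːl(fr)` into `ʒɛm lə- fʊtbɔːl`, and the English example yields one mismatching line out of one (five input words, four output words).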
    

Licence

Copyright 2015-2021 Mathieu Bernard

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
