CMUdict Maintenance and Expansion
This project covers Rudnicky's current work on the dictionary, which includes both maintenance and extension. It contains a collection of utility scripts, schemes for converting the master dictionary into a database, and various schemes for rational extension. The extension schemes currently include workflows for filtering out-of-vocabulary words (OOVs) from the cmudict and lmtool tools, as well as a scheme that uses Google Books to identify OOVs that happen to be high-frequency words and (eventually) to digest the part-of-speech (POS) information in that resource.
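As a rough illustration of what OOV filtering involves, here is a minimal sketch that is not taken from the project's scripts. It assumes the usual CMUdict text layout (comment lines starting with `;;;`, a headword followed by its phones, and alternate pronunciations marked with a numeric suffix such as `HELLO(2)`). The function names `load_cmudict_words` and `frequent_oovs`, the `min_count` threshold, and the sample data are all hypothetical.

```python
import re
from collections import Counter


def load_cmudict_words(lines):
    """Collect lowercase headwords from CMUdict-style lines (assumed layout)."""
    words = set()
    for line in lines:
        line = line.strip()
        # Skip blank lines and ';;;' comment lines.
        if not line or line.startswith(";;;"):
            continue
        head = line.split()[0]
        # Drop the alternate-pronunciation marker, e.g. HELLO(2) -> HELLO.
        head = re.sub(r"\(\d+\)$", "", head)
        words.add(head.lower())
    return words


def frequent_oovs(tokens, vocab, min_count=2):
    """Return (word, count) pairs for tokens missing from vocab, most frequent first."""
    counts = Counter(t.lower() for t in tokens)
    return [(w, c) for w, c in counts.most_common()
            if w not in vocab and c >= min_count]


# Hypothetical sample data, for illustration only.
sample_dict = [
    ";;; comment line",
    "HELLO  HH AH0 L OW1",
    "HELLO(2)  HH EH0 L OW1",
    "WORLD  W ER1 L D",
]
vocab = load_cmudict_words(sample_dict)
tokens = "hello world blockchain blockchain selfie".split()
print(frequent_oovs(tokens, vocab))
```

Raising `min_count` keeps only the OOVs seen often enough to be worth adding to the dictionary; one-off tokens, which are frequently typos, fall below the threshold.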
Documentation
Check out the repository. Current notes are kept in README.txt files scattered around the project; they are currently authored by me (air). This is the central repository for the project. air's local repository is used to sync with the one on SourceForge (in SVN) and to support movement between various operating systems and environments. The purpose of this repository is to provide a persistent backup of the work in git-land, since commits here will be more frequent than users of the SVN repository should have to tolerate.
There are other README files, but they are part of the main (currently SVN-based) repository and are meant for actual users.