pyCelex.py: a Python interface to CELEX2.

Some code I wrote a while ago for reading stuff, mostly morpho-phonological data, from CELEX2. Feel free to see if it works for you, fork it, whatever, consistent with the LICENSE.

Currently hard-coded to read just the English lexicon, but it shouldn't be hard to parameterize that.
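One way the parameterization might look is sketched below. This is not what pyCelex.py currently does. It assumes the usual CELEX2 naming scheme, in which each lexicon file lives at <language>/<code>/<code>.cd and the code starts with the language's initial (for example english/emw/emw.cd). The function name lexicon_path and the LANGUAGE_INITIALS table are made up for illustration, so check the layout against your own copy first.

```python
import os

# Assumption: CELEX2 names each lexicon directory and file with a
# three-letter code whose first letter is the language initial,
# e.g. english/emw/emw.cd. Verify this against your own copy.
LANGUAGE_INITIALS = {"english": "e", "dutch": "d", "german": "g"}


def lexicon_path(celex_root, language, kind):
    """Return the assumed path of one lexicon file (hypothetical helper).

    `kind` is the two-letter suffix of the lexicon code, e.g. "mw"
    for the morphology wordforms lexicon.
    """
    try:
        initial = LANGUAGE_INITIALS[language]
    except KeyError:
        raise ValueError("unknown language: %r" % (language,))
    code = initial + kind
    return os.path.join(celex_root, language, code, code + ".cd")


print(lexicon_path("/path/to/CELEX2", "english", "mw"))
```

A reader function could then take language as an argument and build its file paths through a helper like this instead of using fixed English paths.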

-Max

Usage example

Assume you have CELEX2 installed at /path/to/CELEX2. This should be the top directory from the CELEX2 disc, containing the README and the subdirectories awk, c, dutch, english, german, etc.
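If you are not sure you are pointing at the right directory, a quick check along these lines may help. It is only a sketch: missing_celex_entries is a hypothetical helper, not part of pyCelex, and it assumes the README is a plain file literally named README and that the directory names are lowercase (some ways of copying the disc may change the case).

```python
import os

# Subdirectories named in the description above; the disc has others too.
EXPECTED_SUBDIRS = ("awk", "c", "dutch", "english", "german")


def missing_celex_entries(celex_root):
    """Return the expected entries that are absent under celex_root."""
    missing = []
    # Assumption: the README is a plain file literally named "README".
    if not os.path.isfile(os.path.join(celex_root, "README")):
        missing.append("README")
    for name in EXPECTED_SUBDIRS:
        if not os.path.isdir(os.path.join(celex_root, name)):
            missing.append(name)
    return missing
```

An empty list suggests the path is the top directory; anything else lists what was not found there.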

>>> import pyCelex
>>> celex = pyCelex.buildWordFormDict('/path/to/CELEX2')
>>> celex['run']
[WordForm('run', 75882, 39588, 987, 'S', '@'),
 WordForm('run', 75883, 39589, 626, 'i', '@'),
 WordForm('run', 113816, 39589, 626, 'e1S', '@'),
 WordForm('run', 130829, 39589, 626, 'e2S', '@'),
 WordForm('run', 147739, 39589, 626, 'eP', '@'),
 WordForm('run', 158066, 39589, 626, 'pa', 'IRR')]
>>> celex['run'][0].cob # corpus freq of first wordform
987
>>> dir(celex['run'][0]) # lots of other wordform properties
    ...
>>> dir(celex['run'][0].lemma) # lemma properties
    ...

maxbane/pyCelex

Folders and files

Latest commit

History

Repository files navigation

pyCelex.py: a python interface to CELEX2.

Usage example

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages