LmCSC (Language Model-based Chinese Spelling Check)

This is an implementation of a Chinese spelling check system.

Quick Links

About
Demo
Installation

About

The system mainly consists of the following three parts:

A Tri-gram Language Model
A confusion set
Other sources
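To illustrate how a tri-gram language model and a confusion set can work together, here is a minimal sketch. It is not the repository's actual algorithm: the function names `correct_sentence` and `make_trigram_scorer`, the tiny corpus, and the confusion set entries are all hypothetical, and the count-based scorer is only a toy stand-in for a real kenlm model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def make_trigram_scorer(corpus: List[str]) -> Callable[[str], float]:
    """Build a toy character tri-gram scorer (a stand-in for a real LM)."""
    counts: Counter = Counter()
    for line in corpus:
        padded = "^^" + line + "$"  # hypothetical start/end padding symbols
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1

    def score(text: str) -> float:
        padded = "^^" + text + "$"
        # Sum of log(count + 1): higher means the tri-grams were seen more often.
        return sum(
            math.log(counts[padded[i:i + 3]] + 1)
            for i in range(len(padded) - 2)
        )

    return score


def correct_sentence(
    sentence: str,
    confusion_set: Dict[str, List[str]],
    score: Callable[[str], float],
) -> str:
    """Greedy left-to-right correction.

    For each character, try every confusable alternative and keep a
    substitution only if it raises the language model score.
    """
    best = sentence
    best_score = score(best)
    for i, ch in enumerate(sentence):
        for alt in confusion_set.get(ch, []):
            candidate = best[:i] + alt + best[i + 1:]
            cand_score = score(candidate)
            if cand_score > best_score:
                best, best_score = candidate, cand_score
    return best


# Hypothetical toy data, for illustration only.
toy_corpus = ["我今天很高兴", "今天天气很好"]
toy_confusion_set = {"性": ["兴"], "兴": ["性"]}
toy_score = make_trigram_scorer(toy_corpus)

print(correct_sentence("我今天很高性", toy_confusion_set, toy_score))
```

The scorer is passed in as a plain function so that the same correction loop could, under this assumption, be driven by a real tri-gram model instead of the toy counts.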

Demo

Installation

Besides some pre-installed Python libraries, a few additional packages must be installed to run the system. The required packages are listed in requirements.txt. Run the following commands to clone the repository and install LmCSC:

git clone https://github.com/wdimmy/LmCSC.git
cd LmCSC; pip install -r requirements.txt; python setup.py develop

Note: requirements.txt covers only a subset of the packages that may be required. Depending on what you want to run, you might need to install additional packages.

You can train the language model yourself using kenlm, or download our pre-trained model by running:

chmod +x ./download.sh
./download.sh

NOTE: We provide two versions of the pre-trained model:

kenlm_3.bin (about 13 GB): https://pan.baidu.com/s/1g7LL_sLs-ra2l9VxeDp-9w Extraction code: 0u3q

kenlm_3_small.bin (about 3 GB): https://pan.baidu.com/s/1mMVVHmNtM_FXLJ5yIiRX7Q Extraction code: 91qj

The larger model gives better results.
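Once a model file is available, it can be queried from Python through the kenlm module. The sketch below is an assumption about usage rather than the repository's own code: the helper names `load_model` and `lm_score` are hypothetical, the model path is a placeholder, and splitting the sentence into space-separated characters assumes the model was trained on character-level tokens, which may not match the released models.

```python
def load_model(path: str):
    """Load a kenlm binary or ARPA model from `path` (hypothetical helper)."""
    # Imported lazily so lm_score can be used and tested without kenlm installed.
    import kenlm
    return kenlm.Model(path)


def lm_score(model, sentence: str) -> float:
    """Return the log10 probability of `sentence` under `model`.

    Assumption: the model uses single characters as tokens, so the
    sentence is split into characters joined by spaces before scoring.
    """
    tokens = " ".join(sentence)
    return model.score(tokens, bos=True, eos=True)


# Example usage (the path is a placeholder; adjust it to where the file was saved):
# model = load_model("kenlm_3_small.bin")
# print(lm_score(model, "我今天很高兴"))
```

A higher (less negative) score means the model considers the sentence more likely, which is the signal a spelling checker can use to compare candidate corrections.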
