
Charset Normalizer

@Ousret Ousret released this 17 Sep 17:17
· 484 commits to master since this release
d3996ce
Release 1.0.0 (#11)

* Adjustment in frequencies.json for Chinese

Removed the Latin-based characters from it.

* Added the possibility to list encoding aliases for a match

An encoding is often known by several names; listing the aliases helps when searching for IBM855 when it is listed as CP855.
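To illustrate the alias idea, here is a minimal sketch that relies only on Python's standard codec registry, not on this library's API. The helper name `list_aliases` is hypothetical and chosen for the example; the release notes do not show how the library exposes aliases.

```python
import codecs
from encodings.aliases import aliases
from typing import List


def list_aliases(encoding_name: str) -> List[str]:
    """Hypothetical helper: list the stdlib alias names that resolve to the same codec."""
    canonical = codecs.lookup(encoding_name).name
    found = []
    for alias, target in aliases.items():
        try:
            # Resolve the alias target through the registry so naming
            # differences (dash vs underscore) do not matter.
            if codecs.lookup(target).name == canonical:
                found.append(alias)
        except LookupError:
            # Some targets (e.g. platform-specific codecs) may be unavailable.
            continue
    return sorted(found)


# IBM855 and CP855 resolve to the same codec in the standard registry.
print(codecs.lookup("IBM855").name)
print(list_aliases("CP855"))
```

Resolving both names through `codecs.lookup` before comparing avoids depending on how a given alias happens to be spelled.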

* Added submatch in match

A list of submatches that produce the EXACT same output as a match.
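The notion of encodings that yield exactly the same output can be sketched with the standard library alone. The function `group_identical_decodings` below is a hypothetical illustration, not the library's implementation: it decodes one payload with several candidate encodings and groups the candidates by the text they produce.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_identical_decodings(payload: bytes, encodings: Iterable[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: map each decoded text to the encodings that produced it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in encodings:
        try:
            text = payload.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Skip encodings that cannot decode the payload or are unknown.
            continue
        groups[text].append(name)
    return dict(groups)


payload = "h\u00e9llo".encode("latin_1")
groups = group_identical_decodings(payload, ["latin_1", "cp1252", "utf_8", "cp855"])

for text, names in groups.items():
    # The first name can be read as the match, the others as its submatches.
    print(repr(text), names)
```

In this example `latin_1` and `cp1252` decode the payload to the same string, so one could be reported as a submatch of the other, while `utf_8` fails to decode and `cp855` yields different text.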

* Changes in docs

+ Commented unused code.

* Documented the giveup_threshold param of ProbeChaos

* Doc improvement in unicode.py

* Add static method list_by_range in unicode.py

Sorts letters by Unicode range into a dict.
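As a rough sketch of what grouping letters by Unicode range could look like: the code below is a standalone, hypothetical function that borrows only the name `list_by_range`; its signature, the `UNICODE_RANGES` table, and the `"Unknown"` bucket are assumptions for illustration and may differ from the static method in unicode.py.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a tiny hand-picked subset of Unicode blocks, for illustration only.
UNICODE_RANGES: Dict[str, Tuple[int, int]] = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Cyrillic": (0x0400, 0x04FF),
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
}


def find_range(character: str) -> Optional[str]:
    """Return the name of the range containing the character, if it is in the table."""
    code_point = ord(character)
    for name, (start, end) in UNICODE_RANGES.items():
        if start <= code_point <= end:
            return name
    return None


def list_by_range(letters: str) -> Dict[str, List[str]]:
    """Group alphabetic characters by the Unicode range they belong to."""
    by_range: Dict[str, List[str]] = {}
    for letter in letters:
        if not letter.isalpha():
            continue  # ignore spaces, digits, punctuation
        range_name = find_range(letter) or "Unknown"
        by_range.setdefault(range_name, []).append(letter)
    return by_range


print(list_by_range("a\u00e9 \u044f\u4e2d"))
```

A dict keyed by range name makes it straightforward to see which script dominates a decoded text, which is the kind of signal a coherence probe could sort on.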

* ProbeCoherence reliability improved

Can now probe and sort by the alphabet used or by Unicode range.

* Added coherence_non_latin method in NormalizerMatch

Verifies whether a non-Latin-based language was confirmed by ProbeCoherence.

* CLI is now more verbose

* More tests, yay!

* bump 1.0.0

* README update