Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update chardet to 5.1.0 #154

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pyup-bot
Copy link
Collaborator

@pyup-bot pyup-bot commented Dec 2, 2022

This PR updates chardet from 3.0.4 to 5.1.0.

Changelog

5.1.0

Features
- Add `should_rename_legacy` argument to most functions, which will rename older encodings to their more modern equivalents (e.g., `GB2312` becomes `GB18030`) (264, dan-blanchard)
- Add capital letter sharp S and ISO-8859-15 support (222, SimonWaldherr)
- Add a prober for MacRoman encoding (5 updated as c292b52a97e57c95429ef559af36845019b88b33, Rob Speer and dan-blanchard )
- Add `--minimal` flag to `chardetect` command (214, dan-blanchard)
- Add type annotations to the project and run mypy on CI (261, jdufresne)
- Add support for Python 3.11 (274, hugovk)

Fixes
- Clarify LGPL version in License trove classifier (255, musicinmybrain)
- Remove support for EOL Python 3.6 (260, jdufresne)
- Remove unnecessary guards for non-falsey values (259, jdufresne)

Misc changes
- Switch to Python 3.10 release in GitHub actions (257, jdufresne)
- Remove setup.py in favor of build package (262, jdufresne)
- Run tests on macos, Windows, and 3.11-dev (267, dan-blanchard)

5.0.0

⚠️ This release is the first release of chardet that no longer supports Python < 3.6 ⚠️

In addition to that change, it features the following user-facing changes:

- Added a prober for Johab Korean (207, grizlupo)
- Added a prober for UTF-16/32 BE/LE (109, 206, jpz) 
- Added test data for Croatian, Czech, Hungarian, Polish, Slovak, Slovene, Greek, and Turkish, which should help prevent future errors with those languages
- Improved XML tag filtering, which should improve accuracy for XML files (208)
- Tweaked `SingleByteCharSetProber` confidence to match latest uchardet (209)
- Made `detect_all` return child prober confidences (210)
- Updated examples in docs (223, domdfcoding)
- Documentation fixes (212, 224, 225, 226, 220, 221, 244 from too many to mention)
- Minor performance improvements (252, deedy5)
- Add support for Python 3.10 when testing (232, jdufresne)
- Lots of little development cycle improvements, mostly thanks to jdufresne

4.0.0

Benchmarking chardet 4.0.0 on CPython 3.7.5 (default, Sep  8 2020, 12:19:42)
[Clang 11.0.3 (clang-1103.0.32.62)]
--------------------------------------------------------------------------------
.......................................................................................................................................................................................................................................................................................................................................................................
Calls per second for each encoding:
Links

@pyup-bot pyup-bot mentioned this pull request Dec 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant