Introduce default_encoding parameter to [set|autodetect] the encoding if the charset is missing from the headers #284
fyi httpx uses utf-8 with |
Needs decision. (I prefer the httpx way, so as not to force a new dependency.)
httpx PR with useful information
I also prefer to add this as an optional feature.
@yifeikong Update the documentation where you see fit; I've added a description in the first post.
Nice, Thank you!
The problem is in the GitHub workflow check
I recommend splitting unrelated changes out into separate PR(s), as it helps with review (and lets downstream users audit the changes).
Split the PR into several, in this one I left only adding the |
Generally, I think the You can add new properties or change the underlying implementation, though. Thanks for fixing the CI! |
…f no charset is found in the headers
@yifeikong 🏁
PR summary:
default_encoding parameter

Using the default encoding
When requests are made to a site without explicit character set information from the server, but the encoding is known, it's advisable to explicitly set the default encoding.
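To make the fallback rule concrete, here is a minimal, library-independent sketch of how a default encoding could be applied when the Content-Type header carries no charset. The helper names charset_from_headers and decode_body are hypothetical and are not taken from this PR; only the default_encoding parameter name and its "string or callable" behaviour come from the description above.

```python
import re
from typing import Callable, Mapping, Optional, Union

# A default encoding is either a fixed charset name or a detector callable.
EncodingSpec = Union[str, Callable[[bytes], str]]


def charset_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the charset declared in Content-Type, or None if absent.

    Hypothetical helper: expects lower-cased header names.
    """
    content_type = headers.get("content-type", "")
    match = re.search(r"charset=[\"']?([\w.:-]+)", content_type, re.IGNORECASE)
    return match.group(1) if match else None


def decode_body(
    content: bytes,
    headers: Mapping[str, str],
    default_encoding: EncodingSpec = "utf-8",
) -> str:
    """Decode a response body, using default_encoding only as a fallback."""
    encoding = charset_from_headers(headers)
    if encoding is None:
        # No charset from the server: use the fixed name, or ask the detector.
        if callable(default_encoding):
            encoding = default_encoding(content)
        else:
            encoding = default_encoding
    return content.decode(encoding, errors="replace")


# The server sends no charset, but we know the site uses windows-1251.
body = "Привет".encode("windows-1251")
text = decode_body(body, {"content-type": "text/html"}, default_encoding="windows-1251")
```

A charset declared by the server still takes precedence in this sketch; the default only applies when the header is silent.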
Using character set auto-detection
When the server doesn't provide character set information reliably, and the encoding is unknown, enabling auto-detection allows for a best-effort guess when converting bytes to text. To activate auto-detection, set the default_encoding parameter to a function that takes input bytes and returns the appropriate character set for decoding.
Let's take a look at autodetection using charset-normalizer.
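A detector of the shape described above (bytes in, charset name out) might look like the following sketch. It relies on charset-normalizer's from_bytes(...).best() API; the function name autodetect and the UTF-8 fallback are assumptions made for illustration, not code from this PR.

```python
def autodetect(content: bytes) -> str:
    """Guess the charset of raw bytes; fall back to UTF-8 when unsure."""
    try:
        from charset_normalizer import from_bytes
    except ImportError:
        # Assumption for this sketch: degrade gracefully so it still runs
        # when charset-normalizer is not installed.
        return "utf-8"
    best = from_bytes(content).best()  # None when no plausible match is found
    return best.encoding if best is not None else "utf-8"


# Bytes whose encoding the server did not declare.
sample = ("Привет, мир! Это пример текста без указанной кодировки. " * 5).encode("utf-8")
encoding = autodetect(sample)
text = sample.decode(encoding, errors="replace")
```

The function would then be supplied as default_encoding=autodetect wherever the parameter is accepted. Detection is a best-effort guess and can be wrong for short inputs, so a fixed encoding is preferable whenever the charset is known.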