Request params encoded using system encoding #39
Comments
Did a bit of testing, looks like tmdb is expecting utf-8 encoding. Did a bit of a hack to get things working again:
If the user is going to be accessing unicode content, such as movies with the character "П" in the title, it expects the user will have configured their system to handle unicode content. Specifically, that means configuring a UTF language in their environment.

    # unconfigured default
    > locale
    LANG=
    LC_CTYPE="C"
    LC_COLLATE="C"
    LC_TIME="C"
    LC_NUMERIC="C"
    LC_MONETARY="C"
    LC_MESSAGES="C"
    LC_ALL=

    # Bourne users
    > export LANG="en_US.UTF-8"

    # C-shell users
    > setenv LANG en_US.UTF-8

    # confirmation
    > locale
    LANG=en_US.UTF-8
    LC_CTYPE="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_ALL=

The tmdb3 library will then pull that encoding from the environment using the locale library.

    > projects/pytmdb3/scripts/pytmdb3.py
    PyTMDB3 Interactive Shell. TAB completion available.
    >>> import locale
    >>> locale.getdefaultlocale()
    ('en_US', 'UTF-8')
    >>> get_locale().encoding
    'UTF-8'
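For readers on Python 3, a rough sketch of reading the environment's encoding follows. It is not the tmdb3 implementation: `environment_encoding` is a hypothetical helper, and it uses `locale.getpreferredencoding` rather than the `locale.getdefaultlocale` call shown in the session above, which is deprecated in recent Python versions.

```python
import codecs
import locale


def environment_encoding():
    """Return the normalized codec name for the current locale (hypothetical helper)."""
    # False means: do not call setlocale(), just report what the environment says.
    name = locale.getpreferredencoding(False)
    # codecs.lookup() normalizes aliases, e.g. 'UTF-8' becomes 'utf-8'.
    return codecs.lookup(name).name


print(environment_encoding())
```

The printed value depends entirely on the environment, which is the behavior under discussion in this thread.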
The problem is, we can't just pick an arbitrary encoding when sending requests to tmdb. They are expecting utf-8.
The encoding the API expects has nothing to do with the platform we are running on.
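A minimal sketch of what a locale-independent approach could look like, written for Python 3. The helper name `encode_params` and the `query` parameter are assumptions for illustration and are not taken from tmdb3's `request.py`.

```python
from urllib.parse import urlencode


def encode_params(params):
    """Percent-encode query parameters as UTF-8, ignoring the system locale.

    Hypothetical helper, not part of tmdb3.
    """
    return urlencode(params, encoding="utf-8")


# "П" (U+041F) is the two bytes D0 9F in UTF-8.
print(encode_params({"query": "П"}))  # query=%D0%9F
```

Because the encoding is a constant, the result is the same whether the environment is set to "C" or to en_US.UTF-8.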
Here is some more evidence that just picking a codec that supports all unicode codepoints still isn't correct. It has to be in the encoding tmdb is expecting in order for it to be able to decode again:
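The point can be illustrated with a small Python 3 sketch. It assumes, as stated in this thread, that the receiving side decodes bytes as UTF-8; `server_decode` is a hypothetical stand-in for that side, not real tmdb behavior.

```python
title = "П"

# Both codecs can represent every Unicode code point...
as_utf16 = title.encode("utf-16-le")
as_utf8 = title.encode("utf-8")


def server_decode(raw):
    """Stand-in for a receiver that always decodes as UTF-8 (assumption)."""
    return raw.decode("utf-8", errors="replace")


# ...but only the bytes produced with the expected encoding round-trip.
print(server_decode(as_utf8) == title)   # True
print(server_decode(as_utf16) == title)  # False
```

The UTF-16 bytes decode without an error here, which makes the mismatch easy to miss: the receiver simply ends up with different characters.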
The environment does need to be configured for unicode to receive unicode responses from TMDb, due to the behavior of Python 2 itself. However, I'll need to look at this again to figure out how to handle non-bytecode encodings.
This should be entirely independent of the environment. Unicode is unicode no matter what locale a user has set. Tmdb declares what encoding they accept and send for byte strings, and the python library should only expose and accept strings as
Perhaps I'm misunderstanding how this is supposed to work, but it looks like all request parameters are encoded using the system locale encoding (https://github.com/wagnerrp/pytmdb3/blob/master/tmdb3/request.py#L70). This causes problems when the system locale cannot encode all the characters in the parameters. I also have no idea how tmdb is expected to know which encoding was used for the parameters; I suspect the library should be using a constant encoding defined by the tmdb API.
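The failure mode described above can be reproduced in isolation with a short Python 3 sketch. `can_encode` is a hypothetical helper, and `"ascii"` stands in for the encoding typically reported under an unconfigured "C" locale; neither comes from the library.

```python
def can_encode(text, encoding):
    """Report whether text is representable in the given encoding (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode("П", "ascii"))    # False
print(can_encode("П", "latin-1"))  # False
print(can_encode("П", "utf-8"))    # True
```

Any locale-derived encoding that lacks the character raises `UnicodeEncodeError` at encode time, while a fixed UTF-8 encoding never does for valid text.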
Portion of a relevant traceback:
Downstream ticket: http://flexget.com/ticket/2392