Encrypt/Decrypt Mailbox urls #198
base: master
…-soc-email FIX - Django settings fix to unicode
…soc-email FIX - Fix default charset
Update models.py
This fix works with Python 2.7, Django 1.11 and a database with charset utf8 and collation utf8_general_ci.
…subject Update models.py
django_mailbox/models.py
Outdated
  if 'subject' in message:
      msg.subject = (
-         utils.convert_header_to_unicode(message['subject'])[0:255]
+         utils.convert_header_to_unicode(unicode(message['subject']).decode('utf-8'))[0:255]
This is a rather surprising change; could you elaborate on how this helps, exactly?
I am working on an app with Python 2.7, Django 1.11 and a production database using the 'utf8' charset, and I need to use django-mailbox to receive emails. If an email has an emoji in its subject, Django raises an OperationalError. I cannot change the character set to 'utf8mb4' in production. This fix (I don't know another way to do it; in utils.convert_header_to_unicode perhaps?) allows receiving emails with emojis under Django 1.11, Python 2.7 and the utf8 charset and collation.
Before this fix: Django raises an OperationalError
After this fix: Email subject with unicode emojis: "Resume of your a\xc3\xb1o with \xf0\x9f\x9a\x80"
Oh, I understand that you believe that this fixes the issue you're encountering, but what I meant was: how, specifically, does the above change help? Consider this:
Either message['subject'] is a unicode object or it's bytes. Given your example emoji of 🚀, that leaves two cases:
If it's bytes:
value = unicode('\xf0\x9f\x9a\x80')
# Will raise the following exception:
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)
If it's unicode:
value = unicode(u'\U0001f680')
# Now let's try running 'decode'
value.decode('utf-8')
# Will raise the following exception:
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# File "/var/www/envs/latestrevision/lib/python2.7/encodings/utf_8.py", line 16, in decode
# return codecs.utf_8_decode(input, errors, True)
# UnicodeEncodeError: 'ascii' codec can't encode character u'\U0001f680' in position 0: ordinal not in range(128)
There are a couple things to be learned from the above:
- Using unicode without supplying an encoding will attempt to interpret the provided string using your default encoding (sys.getdefaultencoding()). In most people's cases, that encoding is going to be ascii, and that is certainly not going to work for codepoints above 127.
- decode is intended to be used for converting bytes into unicode objects -- not for converting unicode objects into anything at all -- so when you run decode on a unicode object, you're actually asking Python to first encode your object into bytes using your default encoding, then to decode those bytes using the encoding you've selected. This is also not going to help you get the result you want, but it is one of the more common misunderstandings of how unicode and bytes objects work in Python.
Okay. If I explain how I got to this point we can better understand the solution to the problem.
I use "(message['subject']).decode('utf-8')" to force the utf-8 encoding, which is the encoding configured by default in my Django app and in my production database.
I thought that the 'DJANGO_MAILBOX_default_charset' variable in utils.get_settings() could help me, but because it is lowercase, Django does not pick it up as a setting. I made a fix to capitalize it and force 'default_charset' to utf-8, but it still raised the same OperationalError.
I read several articles saying that I had to change all the tables and columns of the production database to 'utf8mb4', since emojis need 4 bytes each when encoded in UTF-8.
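The byte counts behind that advice are easy to check. MySQL's utf8 charset stores at most three bytes per character, so anything that needs four bytes in UTF-8 is rejected. A quick Python 3 check (the helper name utf8_len is just for illustration):

```python
def utf8_len(ch):
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# ASCII, Latin-1 range, rest of the Basic Multilingual Plane, and beyond it.
sizes = {ch: utf8_len(ch) for ch in ("a", "ñ", "€", "\U0001f680")}

# Characters that would not fit in a 3-byte-per-character column.
too_wide = [ch for ch, n in sizes.items() if n > 3]
```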
But I cannot change that encoding in my production database, and I do not mind if the emoji is represented as bytes in the subject.
My intention is to use django-mailbox to automate actions when receiving emails, and I do not mind if the emoji is not displayed correctly. What I want is for Django not to raise an OperationalError when the encoding is not 'utf8mb4'.
I understand that this conversion from header to unicode should be done by the function utils.convert_header_to_unicode(), but I made the fix in _models.Mailbox.process_message() as a workaround.
After the decode, it returns a string "=?Utf-8?Bxxxxxxxxxxxx ..." which is a MIME-encoded header. This string can be converted to a readable string with "email.header.decode_header(msg.subject)".
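For reference, a round trip through the standard library's email.header module looks roughly like this in Python 3. The sample subject is made up for illustration, and this is not how django-mailbox itself decodes headers:

```python
from email.header import Header, decode_header, make_header

# Build a MIME encoded-word header from a subject containing non-ASCII text.
encoded = Header("año \U0001f680", "utf-8").encode()

# decode_header() returns a list of (value, charset) pairs;
# make_header() joins them and str() yields readable text.
parts = decode_header(encoded)
decoded = str(make_header(parts))
```

The encoded form is plain ASCII, which is why it stores without error, but it is only readable after decoding.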
And at this point my question is: is there any way to use django-mailbox without the 'utf8mb4' encoding in the production database if I receive an email with an emoji? Thanks for everything.
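One possible direction, if losing the emoji is acceptable, is to replace every character outside the Basic Multilingual Plane before saving, since those are the ones that need four bytes. The sketch below is Python 3 and the helper strip_non_bmp is hypothetical; it is not part of django-mailbox, and where it would be called is left open:

```python
def strip_non_bmp(text, replacement="\ufffd"):
    """Replace code points above U+FFFF, which need 4 bytes in UTF-8.

    Hypothetical helper for illustration only.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)

subject = "Resume of your año with \U0001f680"
safe = strip_non_bmp(subject)

# Every remaining character fits in at most 3 UTF-8 bytes.
widest = max(len(ch.encode("utf-8")) for ch in safe)
```

On a Python 2.7 narrow build this loop would see surrogate pairs rather than single code points, so the same idea would need a different implementation there.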
Encrypt uri
Update models.py
Remove fix UTF-8