Error when unpickling TextFile with text using encoding: "maximum recursion depth exceeded" #384

Open
ArneNx opened this issue Mar 8, 2017 · 0 comments

ArneNx commented Mar 8, 2017

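from blocks.serialization import dump, load
from fuel.datasets import TextFile

dictionary = {'<UNK>': 0, '</S>': 1, 'this': 2, 'a': 3, 'one': 4}
# Gzipped input plus an explicit encoding is what triggers the problem.
dataset = TextFile(['example_data.gz'], dictionary, None, level='word',
                   encoding='utf8', preprocess=None).open()

# Binary mode, since the dump is not text.
with open('dumpfile', 'wb') as f:
    dump(dataset, f)

with open('dumpfile', 'rb') as f:
    y = load(f)  # fails with "maximum recursion depth exceeded"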

In the example above, load(f) raises "maximum recursion depth exceeded" because it tries to unpickle a codecs.StreamReader. At least on Python 2.7, this is a known issue that leads to infinite recursion.
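To illustrate what I believe is going on, here is a minimal sketch that does not use fuel or codecs at all. DelegatingReader and probe_uninitialised are hypothetical names made up for this example. The class only mimics the pattern of forwarding unknown attributes to a wrapped stream through `__getattr__`. As far as I understand, the unpickler creates the instance without calling `__init__` and then performs an attribute lookup on it (for `__setstate__`) before any state is restored, so the wrapped stream does not exist yet and the forwarding never terminates.

```python
class DelegatingReader(object):
    """Hypothetical stand-in for a reader that forwards unknown attributes."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, name):
        # Only called when normal lookup fails. If 'stream' itself was never
        # set, looking it up lands here again, and again, without end.
        return getattr(self.stream, name)


def probe_uninitialised():
    """Return the name of the error raised on an instance without state."""
    # Create the object the way an unpickler would: without running __init__.
    obj = DelegatingReader.__new__(DelegatingReader)
    try:
        obj.read  # any attribute that normal lookup cannot find
    except RuntimeError as exc:  # RecursionError on Python 3 is a subclass
        return type(exc).__name__
    return None


if __name__ == '__main__':
    print(probe_uninitialised())
```

A properly constructed instance, e.g. DelegatingReader(some_file), forwards attributes as expected. Only the state-less object produced during unpickling hits the recursion.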

If you do the same thing without an encoding, or with an uncompressed file, it works without problems, because codecs.StreamReader is not used in those cases.

Also, this wasn't an issue in fuel version 0.1.1, since reading was done differently there.
What was the motivation for switching to codecs.StreamReader? Can this be done without it?
I would really appreciate a solution that lets me pickle the TextFile object without dropping the encoding or switching back to an older fuel version.

Thanks in advance!
