README error #34
Comments
I have a question: how long does it take for the training to conclude? I have been running `python preprocess.py --train data/chembl/all.txt --vocab vocab.txt --ncpu 16 --mode single` for a whole day and it has not completed. Is there something I am doing wrong?
That's not normal; for me it took a couple of hours. I had to reduce the number of CPUs used because it was exhausting the RAM of my workstation, and I have 256 GB of RAM.
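For reference, lowering `--ncpu` on the command quoted above limits how many worker processes are spawned, each of which holds its own share of the data in memory. The value 4 here is purely illustrative, not a recommendation from the repo:

```
python preprocess.py --train data/chembl/all.txt --vocab vocab.txt --ncpu 4 --mode single
```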
Wow, thanks. I was relying on my laptop with 16 GB of RAM to do the work; it seems that was an ambitious thought. Now I see why I wasn't getting any headway.
The ChEMBL dataset is huge, and I think the script is doing its job but keeping everything in memory, so at some point you will run out of RAM. There are libraries, like Dask, that could let you work with processes requiring a huge amount of RAM, but you would need to implement it yourself. If you read the …
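To make that suggestion concrete, here is a minimal sketch of how `dask.bag` could stream a large SMILES file in fixed-size partitions instead of holding everything in memory. The per-line function is a placeholder, not the repo's actual preprocessing, and wiring this into preprocess.py is left as an assumption:

```python
# Sketch: out-of-core processing of a large SMILES file with dask.bag.
# Assumes Dask is installed (pip install "dask[bag]"). The per-line
# function below is a placeholder, not the repo's real preprocessing.
import dask.bag as db

def process_smiles(line):
    # Placeholder per-molecule step; in practice this would build
    # whatever structures preprocess.py derives from each SMILES.
    return line.strip()

# Read the file in ~64 MiB partitions instead of loading it all at once.
bag = db.read_text("data/chembl/all.txt", blocksize="64MiB")

# Each partition is processed and written independently, so peak memory
# stays roughly bounded by partition size times the number of workers.
bag.map(process_smiles).to_textfiles("processed/part-*.txt")
```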
Thanks so much for the suggestion. I am trying to run the get_vocab.py code on a much smaller subset of the ChEMBL dataset but got this error: `multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f6c291da0a0>'. Reason: 'PicklingError("Can't pickle <class 'Boost.Python.ArgumentError'>: import of module 'Boost.Python' failed")'`. I have checked online but I haven't worked it out.
See #33 |
Forget about the message above. It is not using multiprocessing at all. |
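For context: that error typically means a worker process raised an exception, here a `Boost.Python.ArgumentError` coming from RDKit, that `multiprocessing` cannot pickle when sending the result back to the parent. A common workaround, sketched below with illustrative names (`extract_vocab` is not from this repo), is to catch the exception inside the worker and re-raise a plain, picklable one:

```python
# Sketch of a workaround: convert non-picklable worker exceptions
# (e.g. Boost.Python.ArgumentError raised by RDKit) into plain ones.
# extract_vocab and the example SMILES are illustrative, not from get_vocab.py.
from multiprocessing import Pool
from rdkit import Chem

def extract_vocab(smiles):
    try:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            raise ValueError(f"could not parse SMILES: {smiles!r}")
        return Chem.MolToSmiles(mol)  # placeholder for the real vocab step
    except Exception as exc:
        # Boost.Python exceptions cannot cross process boundaries;
        # re-raise as a plain ValueError, which pickles cleanly.
        raise ValueError(f"{smiles!r} failed: {exc}") from None

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(extract_vocab, ["CCO", "c1ccccc1"]))
```

This way the parent process still sees the failure, but as a normal traceback instead of a `MaybeEncodingError`.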
After you generate the vocabulary in the first step of the README, the next line should be the `preprocess.py` command pointing at the generated `vocab.txt`; otherwise, you get an error.
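A plausible reconstruction of that line, based on the command quoted in the comments above (the original code block is an assumption here):

```
python preprocess.py --train data/chembl/all.txt --vocab vocab.txt --ncpu 16 --mode single
```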