Getting error while generating vocabulary #14

Open
aliraza-ece opened this issue Aug 10, 2020 · 6 comments
Comments

@aliraza-ece

Hello Wengong!

Thanks for the great work!

I am trying to generate the vocabulary from your dataset (../data/polymers/all.txt), but I am getting the error below and cannot figure it out. As a last resort I wrapped the call in a try/except, but the error occurs many times throughout the run. I would appreciate it if you could assist me.

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "[...]\Anaconda3\envs\myenv\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "[...]\Anaconda3\envs\myenv\lib\multiprocessing\pool.py", line 44, in mapstar
    return list(map(*args))
  File "[...]\hgraph2graph-master\hgraph2graph-master\generation\get_vocab.py", line 12, in process
    hmol = MolGraph(s)
  File "[...]\hgraph2graph-master\hgraph2graph-master\generation\poly_hgraph\mol_graph.py", line 29, in __init__
    self.clusters, self.atom_cls = self.pool_clusters()
  File "[...]\hgraph2graph-master\hgraph2graph-master\generation\poly_hgraph\mol_graph.py", line 87, in pool_clusters
    if fsmiles not in MolGraph.FRAGMENTS: continue
TypeError: argument of type 'NoneType' is not iterable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "[...]/hgraph2graph-master/generation/get_vocab.py", line 62, in <module>
    vocab_list = pool.map(process, batches) # getting error here TypeError: argument of type 'NoneType' is not iterable
  File "[...]\Anaconda3\envs\myenv\lib\multiprocessing\pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "[...]\Anaconda3\envs\myenv\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
TypeError: argument of type 'NoneType' is not iterable
@wengong-jin
Owner

Hi,

It seems that MolGraph.FRAGMENTS is not initialized. The call

MolGraph.load_fragments(fragments)

sets MolGraph.FRAGMENTS to a list of fragments collected from your training data.

This is strange, because as long as load_fragments is called (get_vocab.py, line 50), MolGraph.FRAGMENTS cannot be None (at worst it is an empty list). I think the error happened before load_fragments was called. You can try printing the fragments variable at line 49 to see whether that line gets executed.
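
The failure mode can be reproduced in isolation. The following is only a minimal sketch: `FragmentHolder`, `is_known`, and `fails_before_loading` are hypothetical stand-ins, not the repository's code, and the real load_fragments may do more than store a list. It shows that a class attribute left at None makes any `in` test raise a TypeError, while running the loader first makes the same test work.

```python
class FragmentHolder:
    """Hypothetical stand-in for MolGraph; not the repository's class."""
    FRAGMENTS = None  # class-level default, as the traceback suggests

    @staticmethod
    def load_fragments(fragments):
        # Assumed behavior: store the fragments on the class itself.
        FragmentHolder.FRAGMENTS = list(fragments)


def is_known(fsmiles):
    # Same kind of membership test as the failing line in pool_clusters.
    return fsmiles in FragmentHolder.FRAGMENTS


def fails_before_loading(fsmiles):
    # Returns True if the membership test raises TypeError.
    try:
        is_known(fsmiles)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(fails_before_loading("c1ccccc1"))
    FragmentHolder.load_fragments(["c1ccccc1"])
    print(is_known("c1ccccc1"))
```

Printing `MolGraph.FRAGMENTS` right before the failing line, inside the worker function, would tell you which of the two states the worker is actually in.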

@nikhilmittal444

I was getting the same error in generation/preprocessing.py.
I debugged the code step by step and found that when the program calls partial(tensorize, mol_batches), MolGraph.tensorize starts with FRAGMENTS set to None, and load_fragments is never called before pool_clusters() runs, which leads to this "NoneType is not iterable" error.
Please correct me if I am wrong. Thank you in advance.

@aliraza-ece
Author

@nikhilmittal444 This is an issue with Pool on Windows: MolGraph.FRAGMENTS is not accessible in functions called through Pool. I removed the multiprocessing and can now generate the vocabulary without any issue. However, I only get 2273 lines, compared with 2288 lines in the provided vocab. @wengong-jin I am still going through the code to see whether there is any randomness, but do you think this is normal?
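
For context, on Windows multiprocessing starts workers with the spawn method, so each worker re-imports the module and a class attribute set at runtime in the parent is not carried over. One possible way to keep the Pool is to set the attribute inside each worker through the Pool's `initializer` argument. This is only a sketch under assumptions: `MolGraphStandIn`, `init_worker`, `process`, and `run` are hypothetical names, and the filtering in `process` is a placeholder for the real vocabulary extraction.

```python
from multiprocessing import Pool


class MolGraphStandIn:
    """Hypothetical stand-in for MolGraph; only FRAGMENTS matters here."""
    FRAGMENTS = None

    @staticmethod
    def load_fragments(fragments):
        MolGraphStandIn.FRAGMENTS = list(fragments)


def init_worker(fragments):
    # Runs once in every worker process, so the class attribute is set
    # there even when workers are spawned rather than forked.
    MolGraphStandIn.load_fragments(fragments)


def process(batch):
    # Placeholder work: keep only entries that pass the membership test.
    return [s for s in batch if s in MolGraphStandIn.FRAGMENTS]


def run(batches, fragments, ncpu=2):
    # initargs is pickled and handed to init_worker in each worker.
    with Pool(ncpu, initializer=init_worker, initargs=(fragments,)) as pool:
        return pool.map(process, batches)


if __name__ == "__main__":
    fragments = ["C1CCCCC1", "c1ccccc1"]
    batches = [["c1ccccc1", "CCO"], ["C1CCCCC1"]]
    print(run(batches, fragments))
```

The `if __name__ == "__main__":` guard matters on Windows, because spawned workers import the main module and would otherwise try to create pools of their own.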

@nikhilmittal444

I stored the fragments returned by load_fragments in a new variable and passed it as an input argument to the tensorize function (as self.new_variable) and to the MolGraph object in __init__(), which gave me 2288 lines, matching the provided vocab.
It also resolved the "NoneType is not iterable" error for MolGraph.FRAGMENTS.
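
The general idea of passing the fragments explicitly, rather than relying on a class attribute, might look like the sketch below. It is not the actual patch: the two-argument `tensorize` signature and the `make_worker` helper are hypothetical, and the filtering is a placeholder for the real tensorization.

```python
from functools import partial


def tensorize(mol_batch, fragments):
    # Hypothetical simplified signature: fragments arrive as an explicit
    # argument instead of being read from a class attribute.
    known = set(fragments)
    return [s for s in mol_batch if s in known]


def make_worker(fragments):
    # partial binds the keyword argument, so the result takes a single
    # batch, which is the call shape Pool.map expects.
    return partial(tensorize, fragments=fragments)


if __name__ == "__main__":
    worker = make_worker(["c1ccccc1"])
    batches = [["c1ccccc1", "CCO"], ["CCN"]]
    print(list(map(worker, batches)))
```

A partial of a module-level function can be pickled, so the same `worker` can be handed to `pool.map` and the fragments travel with it to each worker process.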

mateuszrezler added a commit to mateuszrezler/hgraph2graph that referenced this issue Aug 31, 2020
@mateuszrezler

Hi guys,

This issue can be solved by simply replacing None with an empty list (see #15).
@wengong-jin, please review whether this change is safe.
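
A small sketch of what such a change does to the membership test, using a hypothetical stand-in class rather than the real MolGraph: with an empty-list default the test no longer raises, but until fragments are actually loaded every candidate fails the check and is skipped.

```python
class FragmentHolder:
    """Hypothetical stand-in for MolGraph; not the repository's class."""
    FRAGMENTS = []  # empty-list default instead of None


def kept(candidates):
    # Mirrors the shape of the line in the traceback:
    #   if fsmiles not in MolGraph.FRAGMENTS: continue
    out = []
    for fsmiles in candidates:
        if fsmiles not in FragmentHolder.FRAGMENTS:
            continue
        out.append(fsmiles)
    return out


if __name__ == "__main__":
    print(kept(["c1ccccc1"]))  # nothing is kept while FRAGMENTS is empty
    FragmentHolder.FRAGMENTS = ["c1ccccc1"]
    print(kept(["c1ccccc1", "CCO"]))
```

So the empty default removes the crash, but it is worth checking that the workers still see the loaded fragments.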

@orubaba

orubaba commented Jul 17, 2022

Hi gurus,
I need your help, please. I am trying to run get_vocab.py on my small dataset of around 100, but I keep getting the error shown below.
Is there a way to work around this? The error points to mol_graph.py, line 82:
"assert n - m <= 1 #must be connected"
[image]
