The program's RAM usage grows quickly until memory is exhausted. I'm running this code with a multi-processing setup, and I don't know whether the problem also occurs in a single-process run.
After a good while of investigation, I noticed that the problem comes from the `__getitem__` method of the customized Dataset. I changed it to the following:
```python
def __getitem__(self, index):
    return {
        'index': index,
        'question': self.question_prefix + " " + self.data[index]['question'],
        'target': self.get_target(self.data[index]),
        'passages': [self.f.format(c['title'], c['text']) for c in self.data[index]['ctxs'][:self.n_context]],
        'scores': torch.tensor([float(c['score']) for c in self.data[index]['ctxs'][:self.n_context]]),
        'graph': self.data[index]['graph'],
        'node_indices': self.data[index]['node_indices']
    }
```
With this change, the memory issue is gone for me. It seems that whenever I define a local variable inside this function, RAM usage eventually blows up.
However, this change may disable some functionality of the original code, and certain corner cases now crash the training loop. Any other suggestions?
Thanks in advance!
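One way to check whether the growth also shows up in a single process is to iterate over the dataset directly, without worker processes, and sample the traced memory. The following is only a minimal sketch using the standard-library `tracemalloc` module. `ToyDataset` and `measure_growth` are hypothetical names introduced for illustration, and the real Dataset from the issue would be passed in instead. Note that `tracemalloc` only sees Python-level allocations in the current process, so it cannot account for memory held by worker processes or native tensor storage.

```python
import tracemalloc


class ToyDataset:
    """Hypothetical stand-in for the real Dataset, only here to make the sketch runnable."""

    def __init__(self, size):
        self.data = [{"question": f"q{i}"} for i in range(size)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        example = self.data[index]  # a local variable, as in the original code
        return {"index": index, "question": example["question"]}


def measure_growth(dataset, n_items, every=1000):
    """Read n_items examples in this process and sample traced memory (in bytes)."""
    tracemalloc.start()
    samples = []
    for i in range(n_items):
        _ = dataset[i % len(dataset)]
        if (i + 1) % every == 0:
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    tracemalloc.stop()
    return samples


if __name__ == "__main__":
    for step, used in enumerate(measure_growth(ToyDataset(100), n_items=5000), start=1):
        print(f"after {step * 1000} items: {used} bytes traced")
```

If the sampled values stay roughly flat here but RAM still grows with multiple workers, the growth is more likely tied to the multi-processing setup than to `__getitem__` itself.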
Well, I think you may have dropped some logic when rewriting the code to eliminate the local variable.
The following code works fine for me. I hope it helps.
```python
return {
    'index': index,
    'question': self.question_prefix + " " + self.data[index]['question'],
    'target': self.get_target(self.data[index]),
    'passages': [(self.title_prefix + " {} " + self.passage_prefix + " {}").format(c['title'], c['text']) for c in self.data[index]['ctxs'][:self.n_context]] if ('ctxs' in self.data[index] and self.n_context is not None) else None,
    'scores': torch.tensor([float(c['score']) for c in self.data[index]['ctxs'][:self.n_context]]) if ('ctxs' in self.data[index] and self.n_context is not None) else None,
}
```
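For reference, the guarded version above can be sketched as a self-contained class. This is only an illustration under several assumptions: `QADataset`, the `_contexts` helper, and the default prefix strings are hypothetical, `get_target` is a placeholder because the original method is not shown in this thread, and `scores` is a plain list here so the sketch runs without PyTorch (the real code wraps it in `torch.tensor` and would subclass `torch.utils.data.Dataset`).

```python
class QADataset:
    """Dependency-free sketch of the guarded __getitem__ (hypothetical class name)."""

    def __init__(self, data, n_context=None, question_prefix="question:",
                 title_prefix="title:", passage_prefix="context:"):
        # The prefix defaults are assumptions made for this example.
        self.data = data
        self.n_context = n_context
        self.question_prefix = question_prefix
        self.title_prefix = title_prefix
        self.passage_prefix = passage_prefix

    def __len__(self):
        return len(self.data)

    def get_target(self, example):
        # Placeholder: the original get_target is not shown in the thread.
        return example.get("target")

    def _contexts(self, index):
        # Return the first n_context contexts, or None when they are unavailable.
        if "ctxs" not in self.data[index] or self.n_context is None:
            return None
        return self.data[index]["ctxs"][: self.n_context]

    def __getitem__(self, index):
        return {
            "index": index,
            "question": self.question_prefix + " " + self.data[index]["question"],
            "target": self.get_target(self.data[index]),
            "passages": None if self._contexts(index) is None else [
                (self.title_prefix + " {} " + self.passage_prefix + " {}").format(
                    c["title"], c["text"])
                for c in self._contexts(index)
            ],
            # The original code wraps this list in torch.tensor(...).
            "scores": None if self._contexts(index) is None else [
                float(c["score"]) for c in self._contexts(index)
            ],
        }
```

The corner cases that crashed the first rewrite, an example without `ctxs` or an unset `n_context`, both yield `None` for `passages` and `scores` here, so downstream code such as a collate function has to accept `None` for those fields.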