Add tensor-based annotation storage to reduce DDP RAM usage with large COCO-format datasets #1885
Thanks! See comments inline.
src/super_gradients/training/datasets/detection_datasets/coco_format_detection.py
start_addr = 0 if sample_id == 0 else self._addr[sample_id - 1].item()
end_addr = self._addr[sample_id].item()
annotation = pickle.loads(self._annotations[start_addr:end_addr].numpy().data)
We saw a slowdown due to pickle load. Do we want to make the serialize-parse fix optional?
I mean, eventually a memory leak is a memory leak, but for small datasets you get overhead, whereas without the fix "you'd be fine". @BloodAxe, thoughts?
@NatanBagrov yes IMO
…o work as expected
…b.com:Deci-AI/super-gradients into feature/ALG-000_memory-efficient-coco-dataset
To me this looks quite ready.
I do have one more request: please add a unit test that utilizes this feature so that we see nothing crashes. You can use our data in tests/data/coco2017 (see how we use it, for example, in tests/unit_tests/preprocessing_unit_test.py).
It can be a simple test that just iterates through the dataset with use_tensor_backed_storage set.
@@ -52,6 +56,7 @@ def __init__(
:param with_crowd: Add the crowd groundtruths to __getitem__
:param class_ids_to_ignore: List of class ids to ignore in the dataset. By default, doesnt ignore any class.
:param tight_box_rotation: This parameter is deprecated and will be removed in a SuperGradients 3.8.
:param use_tensor_backed_storage: Whether to use tensor backed storage to mitigate python memory leak with large datasets ()
Maybe give an estimate of what's considered "large" as a recommendation, from your experience.
Source of the bug: pytorch/pytorch#13246
Solution based on: facebookresearch/detectron2@0cd0e72
Fixes: #1214
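For readers who want the idea in isolation, here is a minimal sketch of the tensor-backed storage approach discussed in this thread. It is not the code from this PR: the class name `PackedAnnotations` is hypothetical, and only the `_addr` / `_annotations` attribute names and the `use_tensor_backed_storage` flag are borrowed from the snippets above. Each annotation is pickled to bytes, all blobs are concatenated into one `uint8` tensor, and a second tensor holds the cumulative end offset of each blob, so that worker processes see two large tensors rather than many small Python objects.

```python
import pickle
from typing import Any, Sequence

import numpy as np
import torch


class PackedAnnotations:
    """Hypothetical helper (not part of SuperGradients) illustrating the idea."""

    def __init__(self, annotations: Sequence[Any], use_tensor_backed_storage: bool = True):
        self._use_tensor = use_tensor_backed_storage
        if not self._use_tensor:
            # Plain Python list: no pickle overhead, fine for small datasets.
            self._items = list(annotations)
            return

        # Serialize every annotation to bytes.
        blobs = [pickle.dumps(a, protocol=pickle.HIGHEST_PROTOCOL) for a in annotations]

        # _addr[i] is the END offset of blob i inside the flat byte buffer.
        lengths = torch.tensor([len(b) for b in blobs], dtype=torch.int64)
        self._addr = torch.cumsum(lengths, dim=0)

        # One contiguous uint8 tensor holding all blobs back to back.
        # .copy() gives a writable array, which torch.from_numpy expects.
        flat = np.frombuffer(b"".join(blobs), dtype=np.uint8).copy()
        self._annotations = torch.from_numpy(flat)

    def __len__(self) -> int:
        return len(self._addr) if self._use_tensor else len(self._items)

    def __getitem__(self, sample_id: int) -> Any:
        if not 0 <= sample_id < len(self):
            raise IndexError(sample_id)
        if not self._use_tensor:
            return self._items[sample_id]

        # Same lookup as in the reviewed snippet: slice the bytes, then unpickle.
        start_addr = 0 if sample_id == 0 else self._addr[sample_id - 1].item()
        end_addr = self._addr[sample_id].item()
        return pickle.loads(memoryview(self._annotations[start_addr:end_addr].numpy()))
```

With the flag off, the helper falls back to a plain list, which corresponds to the "make it optional" suggestion above: small datasets avoid the per-sample `pickle.loads` cost, while large datasets can opt in to the packed layout.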