Bugfix: inconsistent streaming dataloader state (specific to StreamingDataset) #318

Merged
Changes from 3 commits
24 commits
3658d64
chore: Add reset_state_dict method to StreamingDataset
bhimrazy Aug 9, 2024
8eb1c7f
chore: Update num_workers fallback value in StreamingDataset
bhimrazy Aug 9, 2024
10c10b3
fix: Reset dataset state after each epoch
bhimrazy Aug 9, 2024
391c68b
update
tchaton Aug 9, 2024
5d74ed8
Update src/litdata/streaming/dataset.py
tchaton Aug 9, 2024
7412064
feat: Add test for dataloader with loading states
bhimrazy Aug 9, 2024
0290a30
chore: Add test for dataloader with loading states with peristent wor…
bhimrazy Aug 9, 2024
00c2928
rm commment
bhimrazy Aug 9, 2024
25a87b7
🐛 fix: restore only if there are any remaining batches/samples to str…
bhimrazy Aug 11, 2024
678c3fc
added notes to checkout later
bhimrazy Aug 11, 2024
532dacd
Merge branch 'main' into bugfix/316-streaming-dataloader-state
bhimrazy Aug 11, 2024
9866992
add note
bhimrazy Aug 11, 2024
16bc40f
chore: Add test for dataloader resuming after completing last epoch
bhimrazy Aug 11, 2024
d3f9498
feat: Add test for resuming dataloader with new dataset
bhimrazy Aug 11, 2024
6769694
adds type ignore
bhimrazy Aug 11, 2024
81bc537
update timeout and num of samples
bhimrazy Aug 11, 2024
998fe5a
Add explicit test for resuming dataloader with new dataset
bhimrazy Aug 11, 2024
61120a4
chore: add validation for num_samples_yielded
bhimrazy Aug 11, 2024
faa0213
Merge branch 'main' into bugfix/316-streaming-dataloader-state
bhimrazy Aug 12, 2024
d98681c
removed unrequired test, as it was testing for wrong thing, when rese…
bhimrazy Aug 12, 2024
743f0dd
removed the unnecesssary todo
bhimrazy Aug 12, 2024
2db07e0
chore: Add restore flag to dataloader tests
bhimrazy Aug 12, 2024
fc3a960
chore: Add restore flag to dataloader for StreamingDataset
bhimrazy Aug 13, 2024
4a50cac
update
bhimrazy Aug 13, 2024
1 change: 1 addition & 0 deletions src/litdata/streaming/dataloader.py
@@ -615,6 +615,7 @@ def __iter__(self) -> Any:
  self.current_epoch += 1
  self._num_samples_yielded_combined = {}
  self._num_samples_yielded_streaming = 0
+ self.dataset.reset_state_dict()

  self.dataset.set_epoch(self.current_epoch)
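
One plausible reading of the hunk above: if a state dict loaded for a resume is never cleared, the next fresh epoch starts from the stale position again. The sketch below illustrates that with toy classes. `ToyDataset`, `ToyLoader`, the `reset_on_new_epoch` switch, and the way the `restore` flag is handled are assumptions made for illustration, not litdata's implementation; only the method names `load_state_dict` and `reset_state_dict` come from the diff.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple


class ToyDataset:
    """Hypothetical stand-in for StreamingDataset; only state handling is modelled."""

    def __init__(self, size: int) -> None:
        self.size = size
        self._state_dict: Optional[Dict[str, Any]] = None

    def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
        self._state_dict = state_dict

    def reset_state_dict(self) -> None:
        self._state_dict = None

    def __iter__(self) -> Iterator[int]:
        # Resume from the saved position if a state is present, else start at 0.
        start = self._state_dict["num_samples_yielded"] if self._state_dict else 0
        return iter(range(start, self.size))


class ToyLoader:
    """Hypothetical stand-in for StreamingDataLoader."""

    def __init__(self, dataset: ToyDataset, reset_on_new_epoch: bool) -> None:
        self.dataset = dataset
        self.restore = False  # assumed flag: True only right after a state was loaded
        self.reset_on_new_epoch = reset_on_new_epoch

    def load_state_dict(self, state: Dict[str, Any]) -> None:
        self.dataset.load_state_dict(state)
        self.restore = True

    def __iter__(self) -> Iterator[int]:
        if not self.restore and self.reset_on_new_epoch:
            # Fresh epoch: drop any state left over from an earlier restore.
            self.dataset.reset_state_dict()
        yield from self.dataset
        self.restore = False


def run_two_epochs(reset_on_new_epoch: bool) -> Tuple[List[int], List[int]]:
    loader = ToyLoader(ToyDataset(4), reset_on_new_epoch)
    loader.load_state_dict({"num_samples_yielded": 2})
    resumed_epoch = list(loader)  # picks up after the two samples already seen
    next_epoch = list(loader)     # should be a full pass over the data
    return resumed_epoch, next_epoch
```

With `reset_on_new_epoch=False`, the second pass in this toy model repeats the truncated range because the old state is still attached to the dataset. With the reset in place, the second pass covers every sample.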
5 changes: 4 additions & 1 deletion src/litdata/streaming/dataset.py
@@ -388,7 +388,7 @@ def state_dict(self, num_samples_yielded: int, num_workers: int, batch_size: int

          return {
              "num_samples_yielded": num_samples_yielded,
-             "num_workers": num_workers,
+             "num_workers": num_workers if num_workers > 0 else 1,
              "batch_size": batch_size,
              "current_epoch": self.current_epoch,
              "input_dir_path": self.input_dir.path,
@@ -407,6 +407,9 @@ def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
          # the state is restored within the workers
          self._state_dict = state_dict

+     def reset_state_dict(self) -> None:
+         self._state_dict = None
+
      def _validate_state_dict(self) -> None:
          assert self._state_dict
          assert self.worker_env
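
The `num_workers` fallback in the first hunk can be motivated with a small sketch. It assumes that restore logic somewhere divides the yielded-sample count across workers, so a stored value of 0 (data loaded in the main process) would break that arithmetic. `effective_num_workers` and `samples_per_worker` are hypothetical helpers written for this example; they are not part of litdata.

```python
from typing import List


def effective_num_workers(num_workers: int) -> int:
    # Same expression as in the diff: with no worker processes, the main
    # process is treated as a single worker.
    return num_workers if num_workers > 0 else 1


def samples_per_worker(num_samples_yielded: int, num_workers: int) -> List[int]:
    """Hypothetical even split of already-yielded samples across workers."""
    workers = effective_num_workers(num_workers)
    base, extra = divmod(num_samples_yielded, workers)
    # The first `extra` workers have each yielded one more sample than the rest.
    return [base + (1 if i < extra else 0) for i in range(workers)]
```

Without the fallback, the `divmod` call in this sketch would raise `ZeroDivisionError` for `num_workers=0`. With it, the single main-process "worker" is credited with every sample.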