Commit

refactor: Improve index file loading and adapt MDS shards to chunks format
bhimrazy committed Jul 7, 2024
1 parent 930eb1d commit 54c32d5
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions src/litdata/utilities/dataset_utilities.py
@@ -138,11 +138,11 @@ def load_index_file(input_dir: str) -> Dict[str, Any]:
         with open(index_filepath) as f:
             data = json.load(f)
 
-        if "chunks" in data:
-            return data
-        if "shards" in data:
+        if "chunks" not in data and "shards" in data:
             # load mds shard-based index file and adapt to chunks format
             return adapt_mds_shards_to_chunks(data)
-        raise ValueError(f"Invalid index file format at {index_filepath}.")
+
+        return data
     except FileNotFoundError:
         raise FileNotFoundError(f"Index file not found at {index_filepath}.")
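
The behavioral effect of this hunk can be sketched in isolation. The snippet below is a minimal illustration, not the litdata implementation: `load_before` and `load_after` mirror only the control flow shown in the diff, they take a file path directly instead of an `input_dir`, they omit the `FileNotFoundError` handling, and `adapt_stub` is a hypothetical stand-in for `adapt_mds_shards_to_chunks`, whose real conversion logic is not part of this diff. `write_index` is likewise a hypothetical helper for the example.

```python
import json
import os
import tempfile
from typing import Any, Dict


def adapt_stub(data: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for adapt_mds_shards_to_chunks; the real
    # conversion is not shown in this commit.
    return {"chunks": list(data["shards"]), "adapted": True}


def load_before(index_filepath: str) -> Dict[str, Any]:
    # Control flow before the commit: accept "chunks", adapt "shards",
    # reject everything else.
    with open(index_filepath) as f:
        data = json.load(f)
    if "chunks" in data:
        return data
    if "shards" in data:
        return adapt_stub(data)
    raise ValueError(f"Invalid index file format at {index_filepath}.")


def load_after(index_filepath: str) -> Dict[str, Any]:
    # Control flow after the commit: adapt only shard-based files,
    # return anything else unchanged.
    with open(index_filepath) as f:
        data = json.load(f)
    if "chunks" not in data and "shards" in data:
        return adapt_stub(data)
    return data


def write_index(payload: Dict[str, Any]) -> str:
    # Hypothetical helper: write a JSON payload to a temp file, return its path.
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    return path


if __name__ == "__main__":
    neither = write_index({"config": {}})
    print(load_after(neither))  # returned as-is by the new control flow
    try:
        load_before(neither)
    except ValueError as err:
        print(f"old control flow rejected it: {err}")
```

Under these assumptions, files containing "chunks" or "shards" are handled the same way by both versions. The only difference is a file with neither key: the old control flow raised `ValueError`, while the new one returns the parsed data unchanged.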

