Allow to change cache directory path? #232

Closed
cnoco1at3 opened this issue Jul 15, 2024 · 4 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@cnoco1at3

🚀 Feature

Thanks for the great work! I'm wondering whether it's possible to change the cache directory to a user-specified one, e.g. by adding a parameter to StreamingDataset.

Motivation

Right now the cache path is hardcoded and cannot be changed unless we hack the constants.py file. In our case, an extremely slow disk happens to be mounted at the home directory, which litdata currently uses for caching. This becomes the biggest bottleneck, even though our connection to S3 is fast enough.

@cnoco1at3 cnoco1at3 added the enhancement and help wanted labels Jul 15, 2024

Hi! Thanks for your contribution! Great first issue!

@cnoco1at3
Author

cnoco1at3 commented Jul 15, 2024

Actually, I found an ad hoc way to make this work, which I believe is not documented. For those curious, here's a minimal example:

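from litdata import StreamingDataset
from litdata.streaming.resolver import Dir

input_dir = "<path-to-input-dir>"
# For remote (S3) inputs, wrap the URL in a Dir so that `path` sets the local cache directory.
if input_dir.startswith("s3://"):
    input_dir = Dir(url=input_dir, path="<path-to-cache-dir>")
# Local paths are passed to StreamingDataset unchanged.
dataset = StreamingDataset(input_dir)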
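
If several datasets need to share one fast disk, the same idea can be wrapped in a small helper. This is only a sketch built on the snippet above: `resolve_dirs`, `make_dataset`, and the `cache_root` parameter are hypothetical names of mine, not part of litdata, and the per-dataset subdirectory naming is an assumption chosen to keep caches from colliding.

```python
import os
from typing import Optional, Tuple


def resolve_dirs(input_dir: str, cache_root: Optional[str] = None) -> Tuple[Optional[str], Optional[str]]:
    """Return (url, path) for a dataset location.

    Hypothetical helper: for s3:// inputs, derive a cache subdirectory
    under `cache_root`; local inputs are returned as a plain path.
    """
    if input_dir.startswith("s3://"):
        if cache_root is None:
            # No custom cache requested: let the library pick its default.
            return input_dir, None
        # Turn "s3://bucket/train" into "bucket_train" so each dataset
        # gets its own directory under the cache root (assumed layout).
        name = input_dir[len("s3://"):].strip("/").replace("/", "_")
        return input_dir, os.path.join(cache_root, name)
    return None, input_dir


def make_dataset(input_dir: str, cache_root: Optional[str] = None):
    """Build a StreamingDataset, using Dir(url=..., path=...) as in the snippet above."""
    # Imported lazily so resolve_dirs can be used without litdata installed.
    from litdata import StreamingDataset
    from litdata.streaming.resolver import Dir

    url, path = resolve_dirs(input_dir, cache_root)
    if url is not None and path is not None:
        return StreamingDataset(Dir(url=url, path=path))
    return StreamingDataset(input_dir)
```

Usage would look like `make_dataset("s3://<bucket>/<prefix>", cache_root="<path-to-fast-disk>")`; with `cache_root` left as `None`, the call falls back to plain `StreamingDataset(input_dir)`.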

@csy1204
Contributor

csy1204 commented Jul 16, 2024

@cnoco1at3

You can find documentation about this in the "Specify cache directory" section of the README.

@tchaton
Collaborator

tchaton commented Jul 16, 2024

Hey @cnoco1at3, thanks for the kind words. I have updated the README so that it is clearer.

@tchaton tchaton closed this as completed Jul 16, 2024