custom cache directory for local path #224
Comments
Hi! Thanks for your contribution, great first issue!
Hey @csy1204, you can already do it right now:

```python
from litdata import StreamingDataset
from litdata.streaming.cache import Dir

dataset = StreamingDataset(input_dir=Dir(cache_dir, data_dir), max_cache_size="10GB")
```

But I do agree this isn't very straightforward. Feel free to make a PR to expose it on the StreamingDataset. This should be fairly simple.
Thanks! I would like to use remote storage for input_dir and store the cache in a custom directory as well:

```python
from litdata import StreamingDataset
from litdata.streaming.cache import Dir

dataset = StreamingDataset(input_dir="s3://data-bucket/train", cache_dir="/fast_fs/.cache", max_cache_size="10GB")
```

(See litdata/src/litdata/utilities/dataset_utilities.py, lines 93 to 101 at df8dcd1.)
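To make the proposal concrete, here is a rough sketch of what a `cache_dir` argument could reduce to internally, assuming the existing `Dir` container keeps its `path`/`url` roles; the helper name `_with_cache_dir` is purely illustrative and not part of litdata:

```python
from typing import Optional, Union

from litdata import StreamingDataset
from litdata.streaming.cache import Dir


def _with_cache_dir(input_dir: Union[str, Dir], cache_dir: Optional[str]) -> Union[str, Dir]:
    """Wrap input_dir so downloaded chunks are cached under cache_dir (illustrative)."""
    if cache_dir is None:
        return input_dir
    if isinstance(input_dir, Dir):
        # Keep the original data source, only override the local cache location.
        return Dir(path=cache_dir, url=input_dir.url)
    # A plain string is treated as the data source (e.g. an S3 prefix).
    return Dir(path=cache_dir, url=input_dir)


# Roughly what the proposed StreamingDataset(..., cache_dir="/fast_fs/.cache") would mean:
dataset = StreamingDataset(
    input_dir=_with_cache_dir("s3://data-bucket/train", "/fast_fs/.cache"),
    max_cache_size="10GB",
)
```

The actual wiring would presumably live next to the input_dir resolution referenced above.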
Have you tried:

```python
ds = StreamingDataset(input_dir=Dir(path="/fast_fs/.cache", url="s3://data-bucket/train"), max_cache_size="10GB")
```
@deependujha cc @tchaton While working on this feature myself, I gained a more precise understanding of how it works. I realized that within Dir, path and url can be used for different purposes. Fortunately, this helped me deepen my understanding.
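For anyone else reading along, the distinction as I understand it: `url` is where the optimized chunks are read or downloaded from, and `path` is the local directory they are cached in. A minimal sketch with placeholder paths:

```python
from litdata import StreamingDataset
from litdata.streaming.cache import Dir

# url: the source the chunks are downloaded from.
# path: the local directory where downloaded chunks are cached.
input_dir = Dir(path="/fast_fs/.cache", url="s3://data-bucket/train")

dataset = StreamingDataset(input_dir=input_dir, max_cache_size="10GB")
for sample in dataset:
    pass  # chunks are fetched from the url and kept under the path
```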
🚀 Feature
There have been many situations where it would be beneficial for users to specify the cache directory. I would like to contribute by developing a feature that allows passing the desired path as a parameter.
Motivation
We use several file systems with different performance characteristics and features, so we need to choose the cache directory according to the use case.
Pitch
Alternatives
Additional context