Question: Is there a list for publicly available s3 links of datasets of litdata.StreamingDataset format? #430

Open
2catycm opened this issue Dec 2, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

2catycm commented Dec 2, 2024

Is there a list of popular datasets that have already been preprocessed with litdata and uploaded to Lightning Studio or S3? Such a list would make this project much more usable for me.

For example, is there a publicly available streaming dataset for ImageNet?

@2catycm 2catycm added the enhancement New feature or request label Dec 2, 2024
Collaborator

tchaton commented Dec 2, 2024

Hey @2catycm. Yes, there is, though I haven't processed many datasets so far.

Here are my published Studios: https://lightning.ai/thomasgridai

The dataset is available at s3://optimized-imagenet-1m/lightning_data_imagenet, if I remember correctly.
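
For anyone who wants to try that path, here is a minimal sketch of how it could be streamed with litdata. It assumes litdata is installed, that the path quoted above is correct (it was given from memory), and that your AWS credentials can read the bucket. `IMAGENET_URI`, `split_s3_uri`, and `build_loader` are hypothetical names used for illustration only, and the layout of each sample is not documented in this thread, so inspect one item before building a training loop around it.

```python
from urllib.parse import urlparse

# Path quoted from the comment above. It was recalled from memory,
# so treat it as unverified.
IMAGENET_URI = "s3://optimized-imagenet-1m/lightning_data_imagenet"


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key prefix) as a basic sanity check."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def build_loader(uri=IMAGENET_URI, batch_size=32):
    """Return a StreamingDataLoader over the optimized dataset at `uri`."""
    # Imported lazily so the helper above works without litdata installed.
    from litdata import StreamingDataLoader, StreamingDataset

    split_s3_uri(uri)  # fail early on a malformed URI
    dataset = StreamingDataset(uri, shuffle=True)
    return StreamingDataLoader(dataset, batch_size=batch_size)
```

Calling `build_loader()` and then `next(iter(loader))` should fetch a first batch if the bucket is reachable; printing the type and keys of that batch is a quick way to find out what each sample contains.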
