[Feature]: Image-Modality Throughput Benchmark #9778

Closed
1 task done
lk-chen opened this issue Oct 28, 2024 · 2 comments · Fixed by #9851

Comments

@lk-chen
Contributor

lk-chen commented Oct 28, 2024

🚀 The feature, motivation and pitch

This is a subset of #8385. This issue tracks the effort to enable the throughput benchmark for image-modality models.

This is a reasonably large feature and will span multiple PRs.

Alternatives

Ad-hoc scripts for each model.

Additional context

See #8385.

@ywang96
Member

ywang96 commented Oct 31, 2024

This is great! As part of this effort, it would be good to refactor out the dataset generation used for this benchmark, if you have the bandwidth.

For context, dataset sampling and generation are currently done within the benchmark files themselves. I think we should consolidate the datasets used in these benchmarks into one file (say, benchmark_datasets.py) and import from it instead. This would also let us plug in custom datasets more easily in the future.
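
A rough sketch of what such a consolidated module might look like, purely for illustration. Every name here (`SampleRequest`, `BenchmarkDataset`, `RandomTextDataset`, `ImageCaptionDataset`) is hypothetical and not taken from the existing benchmark code, and the whitespace word count is only a stand-in for a real tokenizer:

```python
# benchmark_datasets.py (hypothetical layout, not the project's actual module)
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleRequest:
    """One benchmark request. Field names are assumptions for this sketch."""
    prompt: str
    prompt_len: int
    expected_output_len: int
    image_path: Optional[str] = None  # set only for image-modality requests


class BenchmarkDataset(ABC):
    """Common interface so benchmark scripts can swap datasets freely."""

    def __init__(self, seed: int = 0) -> None:
        # A private RNG keeps sampling reproducible per dataset instance.
        self.rng = random.Random(seed)

    @abstractmethod
    def sample(self, num_requests: int) -> List[SampleRequest]:
        """Return `num_requests` requests drawn from this dataset."""


class RandomTextDataset(BenchmarkDataset):
    """Synthetic text-only prompts of a fixed length (in words)."""

    def __init__(self, input_len: int, output_len: int, seed: int = 0) -> None:
        super().__init__(seed)
        self.input_len = input_len
        self.output_len = output_len

    def sample(self, num_requests: int) -> List[SampleRequest]:
        requests = []
        for _ in range(num_requests):
            words = [f"tok{self.rng.randrange(1000)}" for _ in range(self.input_len)]
            requests.append(
                SampleRequest(" ".join(words), self.input_len, self.output_len)
            )
        return requests


class ImageCaptionDataset(BenchmarkDataset):
    """Pairs a fixed text prompt with image paths sampled with replacement."""

    def __init__(self, image_paths: List[str], prompt: str = "Describe this image.",
                 output_len: int = 64, seed: int = 0) -> None:
        super().__init__(seed)
        self.image_paths = list(image_paths)
        self.prompt = prompt
        self.output_len = output_len

    def sample(self, num_requests: int) -> List[SampleRequest]:
        paths = self.rng.choices(self.image_paths, k=num_requests)
        # Word count stands in for a tokenizer-based prompt length.
        prompt_len = len(self.prompt.split())
        return [
            SampleRequest(self.prompt, prompt_len, self.output_len, image_path=p)
            for p in paths
        ]
```

A benchmark script could then import the dataset class it needs and call `sample(num_requests)`, and a custom dataset would only need to subclass `BenchmarkDataset`.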

@lk-chen
Contributor Author

lk-chen commented Nov 5, 2024

Sure, I can take a look at consolidating dataset generation in parallel with #9851.
