Merge pull request #56 from tsaishien-chen/main
Update webpage and readme
AliaksandrSiarohin authored Jun 26, 2024
2 parents 433ba6d + 2ea820a commit bbae2b1
Showing 10 changed files with 41 additions and 15 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -15,11 +15,11 @@ This is the offical Github repository of Panda-70M.
[Ming-Hsuan Yang](https://faculty.ucmerced.edu/mhyang/),
[Sergey Tulyakov](http://www.stulyakov.com/)
</br>
*Computer Vision and Pattern Recognition 2024*
*Computer Vision and Pattern Recognition (CVPR) 2024*

<!-- [Arxiv Report](https://arxiv.org/abs/2307.04725) | [Project Page](https://snap-research.github.io/Panda-70M) -->
[![arXiv](https://img.shields.io/badge/arXiv-2402.19479-b31b1b.svg)](https://arxiv.org/abs/2402.19479)
[![Project Page](https://img.shields.io/badge/Project-Website-green)](https://snap-research.github.io/Panda-70M)
[![YouTube](https://badges.aleen42.com/src/youtube.svg)](https://youtu.be/m2NQ5k1oTcs)

## Introduction
Panda-70M is a large-scale dataset with 70M high-quality video-caption pairs.
@@ -86,7 +86,7 @@ More details can be found in [Dataset Dataloading](./dataset_dataloading) sectio
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

Please check [here](https://snap-research.github.io/Panda-70M/more_samples) for more samples.

4 changes: 2 additions & 2 deletions captioning/README.md
@@ -1,6 +1,6 @@
# 🐼 Panda-70M: Video Captioning

**[Note] To use our captioning code, please make sure you follow [this guideline](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md#how-to-apply-delta-weights-only-needed-for-weights-v0) and correctly prepare vicuna-7b-v0 weight. Basically, you need to first download the original weights and then apply delta weights. Improper weights preparation will lead to meaningless outputs.**
**[Note] To run the captioning code, please make sure you follow [this guideline](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md#how-to-apply-delta-weights-only-needed-for-weights-v0) and correctly prepare vicuna-7b-v0 weight. You need to first download the original weights and then apply delta weights. Improper weights preparation will lead to meaningless outputs.**

## Introduction
We propose a video captioning model to generate a caption for a short video clip.
@@ -61,7 +61,7 @@ Please look at the video and faithfully summarize it in one sentence.</sup></td>
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

- **[Note]** You might get different outputs due to the randomness of LLM's generation.

Binary file modified captioning/assets/architecture.png
100755 → 100644
2 changes: 1 addition & 1 deletion dataset_dataloading/README.md
@@ -1,7 +1,7 @@
# 🐼 Panda-70M: Dataset Dataloading
The section includes the csv files listing the data samples in Panda-70M and the code to download the videos.

**[Note] Please use the video2dataset tool from this repository to download the dataset, as the video2dataset from [the official repository](https://github.com/iejMac/video2dataset) cannot work with our csv format for now.**
**[Note] Please use the video2dataset tool from this repository to download the dataset, as the video2dataset from [the official repository](https://github.com/iejMac/video2dataset) cannot work with our csv format.**

## Data Splitting and Download Link
| Split | Download | # Source Videos | # Samples | Video Duration | Storage Space |
1 change: 1 addition & 0 deletions docs/assets/wordcloud.svg
1 change: 0 additions & 1 deletion docs/assets/worldcloud.svg

This file was deleted.

15 changes: 15 additions & 0 deletions docs/html_pages/resources/stylesheet.css
@@ -186,4 +186,19 @@ div.scroll-container {
.table-container {
width: 100%;
}
}

.youtube-container {
position: relative;
width: 100%;
padding-bottom: 56.25%; /* 16:9 aspect ratio */
height: 0;
overflow: hidden;
}
.youtube-container iframe {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
}
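
Note on the `.youtube-container` rule above: percentage padding is resolved against the container's width, so a zero-height box with `padding-bottom: 56.25%` keeps a 16:9 frame at any width. The value is simply height divided by width. A minimal sketch of that arithmetic follows; `paddingBottomPercent` is a hypothetical helper name used only for illustration and is not part of this commit.

```javascript
// Hypothetical helper (not part of this commit): compute the padding-bottom
// percentage that preserves a given aspect ratio in a zero-height container.
function paddingBottomPercent(widthUnits, heightUnits) {
  if (widthUnits <= 0 || heightUnits <= 0) {
    throw new RangeError("aspect ratio terms must be positive");
  }
  // Padding percentages are relative to the container's width,
  // so the ratio is height / width expressed as a percentage.
  return (heightUnits / widthUnits) * 100;
}

console.log(paddingBottomPercent(16, 9)); // 56.25
```

The same helper would give 75 for a 4:3 embed, should the ratio ever need to change.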
19 changes: 15 additions & 4 deletions docs/index.html
@@ -27,6 +27,7 @@
</nav>
<nav>
<a href="#download">Download</a>
<a href="#presentation">Presentation</a>
<a href="#collection">Collection</a>
<a href="#demo">Demo</a>
<a href="#statistic">Statistic</a>
@@ -167,7 +168,7 @@ <h5 class="pt-1" style="font-size: 2rem; font-weight: normal">A Large-Scale Data
</div>
</div>
<div class="container text-center footnote">
We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
</div>
</div>

@@ -208,6 +209,16 @@ <h1 class="jumbotron-heading">Download Panda-70M</h1>

<hr class="mt-5">

<section id="presentation">
<div class="container text-center" style="margin-top: 10px">
<div class="youtube-container">
<iframe src="https://www.youtube.com/embed/m2NQ5k1oTcs?si=jCc8gruNWA_oXNyP&autoplay=0&mute=0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>
</div>
</div>
</section>

<hr class="mt-5">

<section id="collection">
<div class="container text-center">
<h1 class="jumbotron-heading">Collection Pipeline of Panda-70M</h1>
@@ -268,7 +279,7 @@ <h1 class="jumbotron-heading">Demo of Long Video Annotation</h1>
</video>
</div>
<div class="container text-center footnote">
We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.
</div>
</div>
</section>
@@ -283,7 +294,7 @@ <h1 class="jumbotron-heading">Statistic</h1>
<img src="./assets/statistic.svg" style="margin-top: -20px;">
</div>
<div class="image-item">
<img src="./assets/worldcloud.svg" style="margin-top: 10px; margin-bottom: 10px; width: 90%">
<img src="./assets/wordcloud.svg" style="margin-top: 10px; margin-bottom: 10px; width: 90%">
</div>
</div>
</div>
@@ -360,7 +371,7 @@ <h1 class="jumbotron-heading text-center">Acknowledgement</h1>
imageItems.forEach(item => {
const elementPosition = item.getBoundingClientRect().top;

if (elementPosition < window.innerHeight * 0.7) {
if (elementPosition < window.innerHeight * 0.85) {
item.style.opacity = '1';
} else {
item.style.opacity = '0';
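
Note on the threshold change above: raising the factor from 0.7 to 0.85 makes each image item fade in earlier, as soon as its top edge rises above 85% of the viewport height instead of 70%. The sketch below isolates that rule so the two thresholds can be compared; `opacityFor` and its parameters are hypothetical names for illustration, not code from this commit.

```javascript
// Hypothetical helper (not part of this commit): mirrors the opacity rule in
// the hunk above, with the viewport fraction passed in as a parameter.
function opacityFor(elementTop, viewportHeight, revealFraction) {
  return elementTop < viewportHeight * revealFraction ? '1' : '0';
}

// An item whose top edge sits 800px down a 1000px-tall viewport:
const before = opacityFor(800, 1000, 0.7);  // old threshold: still hidden
const after = opacityFor(800, 1000, 0.85);  // new threshold: already visible
console.log(before, after);
```

Under these assumed numbers the item is hidden with the old factor and visible with the new one, which matches the intent of revealing content sooner while scrolling.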
6 changes: 3 additions & 3 deletions docs/more_samples.html
@@ -30,7 +30,7 @@
<a href="#Scenery">Scenery</a>
<a href="#Food">Food</a>
<a href="#Sports_Activity">Sports Activity</a>
<a href="#Vehicles">Vehicles</a>
<a href="#Vehicle">Vehicle</a>
<a href="#Tutorial_and_Narrative">Tutorial and Narrative</a>
<a href="#News_and_TV_Shows">News and TV Shows</a>
<a href="#Gaming_and_3D_Rendering">Gaming and 3D Rendering</a>
@@ -48,7 +48,7 @@ <h5 class="pt-1" style="font-size: 2rem; font-weight: normal">A Large-Scale Data
<a class="paper-btn" style="width: 130px" href="#Scenery">Scenery</a>
<a class="paper-btn" style="width: 130px" href="#Food">Food</a>
<a class="paper-btn" style="width: 130px" href="#Sports_Activity">Sports Activity</a>
<a class="paper-btn" style="width: 130px" href="#Vehicles">Vehicles</a>
<a class="paper-btn" style="width: 130px" href="#Vehicle">Vehicle</a>
</div>
<div class="paper-btn-parent">
<a class="paper-btn" style="width: 228px" href="#Tutorial_and_Narrative">Tutorial and Narrative</a>
@@ -462,7 +462,7 @@ <h2 class="pt-4"><p class="text-center" id="Sports_Activity">Sports Activity</p>

<hr class="mt-5">

<h2 class="pt-4"><p class="text-center" id="Vehicles">Vehicles</p></h2>
<h2 class="pt-4"><p class="text-center" id="Vehicle">Vehicle</p></h2>
<th style="text-align: center; vertical-align: top; padding: 10px;">
<video playsinline autoplay loop muted src="./assets/samples/489O6JiJ8Qk.14.mp4" style="width: 100%" type="video/mp4"></video>
<p class="responsive-text" style="font-family: Chalkduster; font-size: 16px; color: white">"A remote control monster truck is driving on rough terrain."</p>
2 changes: 1 addition & 1 deletion splitting/README.md
@@ -61,7 +61,7 @@ The code will split the videos listed in the `video_list.txt` and output the vid
</tr>
</table>

<sup>**We will remove the video samples from our dataset / Github / project webpage as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>
<sup>**We will remove the video samples from our dataset / Github / project webpage / technical presentation as long as you need it. Please contact tsaishienchen at gmail dot com for the request.</sup>

## Acknowledgements
The code for video splitting is built upon [PySceneDetect](https://github.com/Breakthrough/PySceneDetect) and [ImageBind](https://github.com/facebookresearch/ImageBind).
