Hi, thanks for sharing the code!
I have some questions about how you construct the clip-caption pairs from a single video.
1. What if one caption spans multiple clips?
In your paper, Figure 2 shows the clip-caption pairs. The caption "two stiches on two and we'll slip stitch" corresponds to two clips in that figure. Did you segment the video into shots and assign each clip its nearest caption? (The alternative is to segment the video by captions and, for each caption, find its nearest clip. I don't think you did it this way, since then one caption could not match two clips.)
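To make the two options concrete, here is a rough sketch of what I mean. The function names, the midpoint-based notion of "nearest", and the timestamps are all my own assumptions for illustration; none of this is taken from your code or paper.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_sec, end_sec); hypothetical representation


def midpoint(span: Span) -> float:
    """Temporal center of a span, used here as an assumed notion of 'nearest'."""
    return (span[0] + span[1]) / 2.0


def assign_caption_to_each_clip(clips: List[Span], captions: List[Span]) -> List[int]:
    """Option A: segment into clips first, then give every clip its nearest caption.

    Returns one caption index per clip, so a caption may be reused by several clips.
    """
    return [
        min(range(len(captions)),
            key=lambda j: abs(midpoint(captions[j]) - midpoint(clip)))
        for clip in clips
    ]


def assign_clip_to_each_caption(clips: List[Span], captions: List[Span]) -> List[int]:
    """Option B: segment by captions, then give every caption its nearest clip.

    Returns one clip index per caption, so a caption never maps to two clips.
    """
    return [
        min(range(len(clips)),
            key=lambda i: abs(midpoint(clips[i]) - midpoint(caption)))
        for caption in captions
    ]


# Made-up timestamps: three shots and two captions from one video.
clips = [(0.0, 4.0), (4.0, 8.0), (8.0, 12.0)]
captions = [(1.0, 7.0), (9.0, 11.0)]

per_clip = assign_caption_to_each_clip(clips, captions)
per_caption = assign_clip_to_each_caption(clips, captions)
```

With these made-up numbers, option A pairs the first caption with the first two clips, which looks like what Figure 2 shows, while option B gives each caption exactly one clip.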
2. Did you use all the clip-caption pairs within one video?
One video might contain many clip-caption pairs, say 1000. Did you use all 1000 pairs in the HowTo100M dataset, or was any selection or filtering applied to those pairs?
I would appreciate your reply. Thanks.