Changed video processing section (Unit 7 - CNN Based Video Model) #356
Hello @cjfghk5697 😄 I think you haven't modified the _toctree.yml file yet.
Thank you for the contribution, it was a pleasure to read and learn 🙂
I just left some small suggestions, mainly formatting-related.
In this paper, the authors applied the ResNet architecture to 3D CNNs. This approach introduces deeper models for 3D CNNs and achieves higher accuracy.

Experiments showed that 3D ResNets (especially deeper ones like ResNet-34) outperform models like C3D, particularly on larger datasets. Pretrained models, such as C3D pretrained on Sports-1M, can help mitigate overfitting on smaller datasets. Overall, 3D ResNets effectively leverage deeper architectures to capture complex spatiotemporal patterns in video data.
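For readers of this thread, here is a rough sketch of the idea in the quoted passage: a 3D convolution slides over time as well as height and width, and a residual (skip) connection adds the input back to the convolved output. This is not code from the PR or the paper. The function names `conv3d_same` and `residual_block_3d` are hypothetical, and the block is heavily simplified (single channel, one convolution, no batch normalization), whereas an actual ResNet block stacks several convolutions over many channels.

```python
def conv3d_same(video, kernel):
    """Single-channel 3D cross-correlation with zero padding ("same" size).

    video:  nested list indexed [t][y][x] (frames, height, width)
    kernel: nested list indexed [dt][dy][dx], odd size along each axis
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    pT, pH, pW = kT // 2, kH // 2, kW // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                acc = 0.0
                for dt in range(kT):
                    for dy in range(kH):
                        for dx in range(kW):
                            tt, yy, xx = t + dt - pT, y + dy - pH, x + dx - pW
                            # Positions outside the clip count as zero padding.
                            if 0 <= tt < T and 0 <= yy < H and 0 <= xx < W:
                                acc += video[tt][yy][xx] * kernel[dt][dy][dx]
                out[t][y][x] = acc
    return out


def residual_block_3d(video, kernel):
    """Simplified residual block: ReLU(conv3d(x) + x)."""
    conv = conv3d_same(video, kernel)
    return [[[max(0.0, c + v) for c, v in zip(conv_row, video_row)]
             for conv_row, video_row in zip(conv_frame, video_frame)]
            for conv_frame, video_frame in zip(conv, video)]


# A tiny 2-frame, 2x2 "clip" and an all-zero 3x3x3 kernel.
clip = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
zero_kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]

# With an all-zero kernel the convolution contributes nothing,
# so the block reduces to ReLU(clip): the skip path carries the input through.
passthrough = residual_block_3d(clip, zero_kernel)
```

The skip connection is what makes deeper stacks trainable in practice: each block only has to learn a correction on top of its input, which is the property the passage credits for letting 3D CNNs go deeper.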
Here you mention C3D as if it is already known. Maybe briefly introduce it or give a link to the paper?
Thank you for reviewing it! I've added a link to the C3D research paper.
Co-authored-by: Johannes Kolbe <[email protected]>
LGTM
Thanks for the fixes. LGTM 👍
Changed chapters/en/unit7/video-processing/overview-of-previous-sota-models.mdx to chapters/en/unit7/video-processing/cnn-based-video-model.mdx in the video processing section. This document provides an overview of various CNN video architectures which integrate different kinds of modalities into a unified representation space.
Part of #348
Who can review? (Initial)
@jungnerd @cjfghk5697 @mreraser and anyone who wants to review!