Wrong 2D CNN input #4
The authors of the paper didn't provide the code where they use both the 2D and 3D CNNs; the code in this repo is a stepping stone towards that. I had some doubts about the "Teacher-Student" transfer-learning process they mention in the paper (2D & 3D) and have also tried reaching out to the authors about it. If you have a solution to the issue you raised, please feel free to post it here; otherwise, please reach out to the authors again at https://github.com/MohsenFayyaz89/T3D. The code I have written is just a first iteration, not a foolproof one. Please feel free to correct it.
It would be of great help if you could modify the code for mean subtraction & augmentation. Edit: I have pushed some code to the development branch to rectify the issue mentioned. I will test it further and let you know.
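As a concrete starting point, here is a minimal sketch of clip preprocessing with a shared random crop and mean subtraction, assuming clips arrive as NumPy arrays of shape (T, H, W, 3) in RGB order; the ImageNet per-channel mean constants are an assumption, since the paper does not say which mean it subtracts:

```python
import numpy as np

# Per-channel RGB mean; ImageNet values are assumed here because the
# paper does not specify which mean it uses.
RGB_MEAN = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def random_crop(frames, size=224):
    """Apply the same random spatial crop to every frame of a clip.

    frames: np.ndarray of shape (T, H, W, 3) with H >= size, W >= size.
    """
    _, h, w, _ = frames.shape
    y = np.random.randint(0, h - size + 1)
    x = np.random.randint(0, w - size + 1)
    return frames[:, y:y + size, x:x + size, :]

def preprocess_clip(frames, train=True):
    """Randomly crop (at training time) and mean-subtract a clip."""
    frames = frames.astype(np.float32)
    if train:
        frames = random_crop(frames, 224)
    return frames - RGB_MEAN  # broadcasts over (T, 224, 224, 3)
```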
According to the paper, under the Training subsection: "To the 2D CNN, 32 RGB frames are fed as input. The input RGB images are randomly cropped to a size 224×224, and then mean-subtracted for the network training."
From the get_video_frames function in get_video.py, I can see that it only returns one frame of the specified video. Isn't that wrong? It also misses the augmentation that they do in the paper.
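For what it's worth, a hypothetical multi-frame replacement for get_video_frames, sampling 32 evenly spaced RGB frames with OpenCV, could look like the sketch below; the signature and the uniform sampling strategy are assumptions, not the repo's actual code:

```python
import cv2
import numpy as np

def get_video_frames(video_path, num_frames=32):
    """Return num_frames RGB frames sampled uniformly from a video.

    A hypothetical replacement for the single-frame version in
    get_video.py; returns an array of shape (num_frames, H, W, 3),
    or None if no frames could be read.
    """
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Pick num_frames evenly spaced frame indices across the video.
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV decodes to BGR; convert to RGB since the paper feeds
        # RGB frames to the 2D CNN.
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return np.stack(frames) if frames else None
```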