Wrong 2D CNN input #4
The authors of the paper didn't provide the code where they use both the 2D and 3D CNNs; the code in this repo is a stepping stone towards that. I had some doubts about the "Teacher-Student" transfer-learning process they mention in the paper (2D & 3D) and have also tried reaching out to the authors about it. If you have a solution to the issue you raised, please feel free to post it here; otherwise, please reach out to the authors again at https://github.com/MohsenFayyaz89/T3D. The code I have written is just a first iteration, not a foolproof one. Please feel free to correct it.
It would be of great help if you could modify the code for mean subtraction & augmentation. Edit: I have pushed some code to the development branch to rectify the issue mentioned. I will test it further and let you know.
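As a concrete starting point, here is a minimal sketch of clip preprocessing with a shared random crop and mean subtraction, assuming clips arrive as NumPy arrays of shape (T, H, W, 3) in RGB order; the ImageNet per-channel mean constants are an assumption, since the paper does not say which mean it subtracts:

```python
import numpy as np

# Per-channel RGB mean; ImageNet values are assumed here because the
# paper does not specify which mean it uses.
RGB_MEAN = np.array([123.68, 116.78, 103.94], dtype=np.float32)

def random_crop(frames, size=224):
    """Apply the same random spatial crop to every frame of a clip.

    frames: np.ndarray of shape (T, H, W, 3) with H >= size, W >= size.
    """
    _, h, w, _ = frames.shape
    y = np.random.randint(0, h - size + 1)
    x = np.random.randint(0, w - size + 1)
    return frames[:, y:y + size, x:x + size, :]

def preprocess_clip(frames, train=True):
    """Randomly crop (at training time) and mean-subtract a clip."""
    frames = frames.astype(np.float32)
    if train:
        frames = random_crop(frames, 224)
    return frames - RGB_MEAN  # broadcasts over (T, 224, 224, 3)
```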
According to the paper, under the Training subsection: "To the 2D CNN, 32 RGB frames are fed as input. The input RGB images are randomly cropped to a size 224×224, and then mean-subtracted for the network training."
From the get_video_frames function in get_video.py, I can see that it only returns one frame of the specified video. Isn't that wrong? It also misses the augmentation that they do in the paper.
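For what it's worth, a hypothetical multi-frame replacement for get_video_frames, sampling 32 evenly spaced RGB frames with OpenCV, could look like the sketch below; the signature and the uniform sampling strategy are assumptions, not the repo's actual code:

```python
import cv2
import numpy as np

def get_video_frames(video_path, num_frames=32):
    """Return num_frames RGB frames sampled uniformly from a video.

    A hypothetical replacement for the single-frame version in
    get_video.py; returns an array of shape (num_frames, H, W, 3),
    or None if no frames could be read.
    """
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Pick num_frames evenly spaced frame indices across the video.
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        # OpenCV decodes to BGR; convert to RGB since the paper feeds
        # RGB frames to the 2D CNN.
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return np.stack(frames) if frames else None
```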