Transfer learning process incorrect #5
Comments
Thanks for the correction. I have updated the development branch to do something quite similar. I just need to do a sanity check, and I'll let you know once I have tested it on proper GPU machines.
I still don't understand how the authors fed the 2D CNN 32 images while feeding a video clip to the 3D CNN at the same time. The 2D CNN's input has the shape (224,224,3), but it has to take 32 images, so the shape would have to be (32,224,224,3). The 3D CNN also takes an input of shape (32,224,224,3). Is there a way to feed the 2D CNN 32 images one by one and feed the 3D CNN only one video clip in a single batch? If that's impossible, you'd have to replace the input layer of the 2D CNN to match that of the 3D CNN.
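One common way to do this, sketched here with NumPy only, is to fold the time axis into the batch axis so the 2D network sees each frame as an ordinary (H, W, C) image, then regroup the per-frame outputs by clip. This is not necessarily what the authors did. `toy_2d_features` is a hypothetical stand-in for the 2D CNN, and the small 8x8 frames are only there to keep the example light. In Keras, the `TimeDistributed` wrapper applies a layer to every time step in a similar way.

```python
import numpy as np


def frames_as_batch(clips):
    """Fold the time axis into the batch axis: (B, T, H, W, C) -> (B*T, H, W, C)."""
    b, t, h, w, c = clips.shape
    return clips.reshape(b * t, h, w, c)


def toy_2d_features(frames):
    """Hypothetical stand-in for a 2D CNN: per-frame channel means, (N, H, W, C) -> (N, C)."""
    return frames.mean(axis=(1, 2))


def per_clip_features(clips):
    """Run the 2D stand-in on every frame, then average the frame features per clip."""
    b, t = clips.shape[:2]
    frame_feats = toy_2d_features(frames_as_batch(clips))  # (B*T, C)
    return frame_feats.reshape(b, t, -1).mean(axis=1)      # (B, C)


# The same array can go to the 3D branch unchanged, as (B, 32, H, W, 3).
clips = np.random.default_rng(0).random((2, 32, 8, 8, 3))
print(frames_as_batch(clips).shape)    # shape seen by the 2D branch
print(per_clip_features(clips).shape)  # one feature vector per clip
```

With this layout a single batch of clips serves both branches, so neither input layer has to be replaced.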
@MartynasJanonis, I agree with you. Edit: other people in the original repo share the same opinion.
As far as I can tell, during the transfer learning process, you're already trying to make the network classify the videos.
Instead, the 2D CNN should take 32 RGB frames from a video at a certain timestamp, and the 3D CNN should take a video clip from the same timestamp. The network should then tell whether the 32 frames and the video clip match. That way you can use a huge data set of unlabeled videos, because the class label doesn't matter.
It shouldn't be too difficult to fix. The video generator could pick a random video with a random timestamp and extract 32 frames from it. Then it could either feed the same data to the 3D CNN with a label of '1', as the data is from the same video, or it could pick another video and feed that data to the 3D CNN with a label of '0'.
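A minimal sketch of such a generator follows, using only the standard library. It assumes that `videos` is a hypothetical in-memory list of decoded frame sequences, that there are at least two videos, and that each has at least 32 frames. The 50/50 split between matching and non-matching pairs is also an assumption, and the repo's actual generator may look different.

```python
import random

FRAMES_PER_CLIP = 32  # clip length mentioned in this issue


def sample_pair(videos, rng, clip_len=FRAMES_PER_CLIP):
    """Return (frames_2d, clip_3d, label) for one match/no-match example."""
    # Pick a random video and a random start position for the 2D branch.
    idx = rng.randrange(len(videos))
    start = rng.randrange(len(videos[idx]) - clip_len + 1)
    frames_2d = videos[idx][start:start + clip_len]

    if rng.random() < 0.5:
        # Positive pair: the 3D branch gets the same frames, label 1.
        return frames_2d, frames_2d, 1

    # Negative pair: the 3D branch gets a clip from a different video, label 0.
    other = rng.choice([i for i in range(len(videos)) if i != idx])
    other_start = rng.randrange(len(videos[other]) - clip_len + 1)
    clip_3d = videos[other][other_start:other_start + clip_len]
    return frames_2d, clip_3d, 0


def pair_generator(videos, seed=0):
    """Yield match/no-match examples forever; no class labels are needed."""
    rng = random.Random(seed)
    while True:
        yield sample_pair(videos, rng)


# Toy data: each "frame" is a (video_id, frame_index) tuple instead of an image.
toy_videos = [[(v, t) for t in range(40)] for v in range(3)]
frames, clip, label = next(pair_generator(toy_videos))
print(len(frames), len(clip), label)
```

In a real pipeline the tuples would be replaced by decoded RGB frames, and several pairs would be stacked into a batch before being passed to the two branches.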