Cannot reproduce COIN dataset result #26
Comments
Hi @bluehawk2k , it seems that the loss is too large and the model has not converged. Could you share your training script? On different devices, training with the same learning rate may not be stable, so you can decrease the learning rate or extend the training epochs a little. When my model converges, the loss is on the 0.0x scale. @leebebeto the labels here are not the real labels; they are used as the dataset index.
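For reference, here is a minimal toy sketch of that labels-as-index pattern (a synthetic dataset, not this repo's actual loader), showing that the returned "labels" simply enumerate the samples rather than carrying ground-truth classes:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class IndexLabelDataset(Dataset):
    """Toy stand-in: the 'label' returned is the sample index
    (used to look the sample up later), not a ground-truth class id."""

    def __len__(self):
        return 16

    def __getitem__(self, idx):
        features = torch.randn(8)  # placeholder for real video features
        return {"features": features, "labels": idx}

loader = DataLoader(IndexLabelDataset(), batch_size=4, shuffle=False)
offset = 0
for batch in loader:
    # With shuffle=False, the collated labels are consecutive indices.
    expected = torch.arange(offset, offset + len(batch["labels"]))
    assert torch.equal(batch["labels"], expected)
    offset += len(batch["labels"])
print("labels are dataset indices, not class ids")
```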
Recently I have been busy with other projects, but please feel free to leave your questions here and I will address them as soon as possible.
Thank you for the reply. The labels are indeed the sample indexes rather than actual ground-truth labels. By the way, my loss also converges to the 0.x scale rather than the 0.0x scale. I used your released training script: https://github.com/showlab/videollm-online/blob/main/scripts/coin/live1%2B.sh Thank you!
@chenjoya Do you think the training script for Ego4D narration (https://github.com/showlab/videollm-online/blob/main/scripts/ego4d/narration/live1%2B.sh) should also be fixed? I'm trying to train the model on …
Hello @yankee624 , this looks good since there is no loss spike. The loss in COIN training is low because it contains only single video-text pairs, instead of the multiple video-text streams in Ego4D narration.
@chenjoya Thank you so much for confirming! The metrics seem good, so I guess it's trained well!
Hi @chenjoya , can I ask how many test samples you evaluated on? When I evaluated on 512 samples, the test step accuracy was 65%, but on 10k samples it is only 10%.
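As a toy illustration (synthetic correctness flags, not real evaluation output), a small non-shuffled subset can report a very different accuracy from the full test set, which may explain part of such a gap:

```python
import random

random.seed(0)
# Hypothetical correctness flags for 10k test samples, ordered so that
# easier samples come first (a stand-in for a non-shuffled eval list).
correct = [1] * 700 + [0] * 9300

print(f"first-512 accuracy:  {sum(correct[:512]) / 512:.2f}")         # inflated
print(f"full-set accuracy:   {sum(correct) / len(correct):.2f}")
print(f"random-512 accuracy: {sum(random.sample(correct, 512)) / 512:.2f}")  # close to full
```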
Hi @nguyentthong , I tested on the standard test set of the COIN dataset. Did you get the correct video ids?
Hi @chenjoya , I use the official annotations from this link: https://github.com/coin-dataset/annotations/blob/master/COIN.json My loss landscape is shown in the attached plot. Is this a normal loss landscape? Moreover, can you release the checkpoint trained on the COIN dataset so that it is easier to examine the problem?
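To rule out a video-id mismatch, one quick sanity check is to list the official testing ids. This sketch assumes COIN.json stores a top-level "database" dict keyed by video id, where each entry carries a "subset" field of "training" or "testing"; please verify against the downloaded file if the schema differs:

```python
import json
import urllib.request

URL = ("https://raw.githubusercontent.com/coin-dataset/"
       "annotations/master/COIN.json")

# Assumed schema: {"database": {video_id: {"subset": "training"|"testing", ...}}}
with urllib.request.urlopen(URL) as f:
    database = json.load(f)["database"]

test_ids = sorted(vid for vid, meta in database.items()
                  if meta.get("subset") == "testing")
print(f"{len(test_ids)} testing videos; first 5: {test_ids[:5]}")
```

Comparing this list against the video ids your evaluation actually loads should reveal whether the 10% accuracy comes from evaluating on wrong or missing videos.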
Additionally, I used the preprocessing scripts here: https://github.com/showlab/videollm-online/tree/main/data/preprocess I suppose these are applicable to both the COIN and Ego4D datasets. Can you tell me whether my assumption is correct? @chenjoya
Hi, here is another issue about the reproducibility of the COIN dataset result.
I've also tried to reproduce your COIN dataset result using 8 A100 GPUs.
However, the evaluation gives much lower performance than the result reported in your paper.
Do you have any idea about this issue?