Hello! I am very interested in your work, and I see that you have released the Show-o weights from before fine-tuning on the LLaVA instruction-tuning datasets.
I have the following two questions:
1. The README recommends fine-tuning from the show-o-512x512-wo-llava-tuning checkpoint. Why not fine-tune from show-o-512x512 instead? Is it because performance on certain downstream tasks degrades after fine-tuning on the LLaVA instruction-tuning datasets?
2. If I want to fine-tune on certain visual downstream tasks, which checkpoint should I use?
Hi, thanks for your interest in our work. If you'd like to reproduce our results, you can start from the pre-trained checkpoint. Also, because the final checkpoint was already fine-tuned on the LLaVA data, fine-tuning it further on that data would degrade performance (overfitting). If you have new training data, I think you can fine-tune the final checkpoint directly.
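The advice in this thread can be summarized as a small decision helper. The two checkpoint names come from the question above (the Show-o README); the helper function itself is only an illustrative sketch, not part of the Show-o codebase:

```python
def choose_checkpoint(goal: str) -> str:
    """Pick a Show-o checkpoint per the advice in this thread.

    goal: "reproduce" -> fine-tune from the pre-LLaVA-tuning checkpoint,
          "new_data"  -> fine-tune the final checkpoint directly.
    (Hypothetical helper; checkpoint names are taken from the Show-o README.)
    """
    checkpoints = {
        # Released before LLaVA instruction tuning; recommended base for
        # reproducing the paper's fine-tuning results.
        "reproduce": "show-o-512x512-wo-llava-tuning",
        # Final checkpoint (already LLaVA-tuned); per the reply, fine-tune
        # this one directly when you have new downstream training data.
        "new_data": "show-o-512x512",
    }
    if goal not in checkpoints:
        raise ValueError(f"unknown goal: {goal!r}")
    return checkpoints[goal]


print(choose_checkpoint("reproduce"))  # show-o-512x512-wo-llava-tuning
print(choose_checkpoint("new_data"))   # show-o-512x512
```

The key point encoded here is the overfitting caveat: re-running LLaVA tuning on the already-tuned final checkpoint is the one combination to avoid.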