Why do the comparative experiments for I2V not use the same initial frame? #19
Comments
Hello, thank you for your interest in our work. That's a great question.

Thank you very much for your thoughtful and patient response! I greatly appreciate it. However, I would like to point out that, according to the paper, FancyVideo uses images generated by SDXL as the first frame. This approach evidently enhances image quality and also influences Text-Video Alignment, and in my view it is a significant factor behind FancyVideo's advantage on these two metrics. It might therefore be seen as somewhat unfair to compare it quantitatively and qualitatively with other I2V methods (such as DynamiCrafter, Gen-2, and Pika) without using the same initial frame. Additionally, as far as I understand, AnimateDiff can also perform the I2V task by incorporating SparseCtrl. Thank you for considering my perspective!
Hello, I am very pleased to receive your response and look forward to discussing this matter further.

It seems that FancyVideo is an I2V model, yet in Figure 4 it does not use the same initial frame when being compared with other I2V models. Wouldn't this be somewhat unfair?