
CUDA Driver version problem. #14

Open
androidgeek077 opened this issue Jun 28, 2020 · 16 comments

Comments

@androidgeek077

When I try to train the birds model, I encounter this problem.
screenshot

CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:32

@MinfengZhu
Owner

Have you installed cudatoolkit=9.0?
cudatoolkit=9.0 may be incompatible with your environment.

conda install requests nltk pandas scikit-image pyyaml cudatoolkit=9.0
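The "driver version is insufficient" message generally means the installed NVIDIA driver is older than the CUDA runtime that PyTorch was built against, or that no usable GPU is present. A minimal sketch for checking what PyTorch sees is below; the helper name `describe_cuda_setup` is made up for illustration, and the installed driver version itself can be read from `nvidia-smi` on machines that have an NVIDIA driver.

```python
def describe_cuda_setup():
    """Collect what PyTorch reports about the local CUDA setup."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA runtime version PyTorch was built against (None for CPU-only builds).
        "cuda_runtime": torch.version.cuda,
        # False when PyTorch cannot find a usable GPU and driver.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_cuda_setup().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while `cuda_runtime` shows a version, the driver (or the absence of a GPU) is the likely cause rather than the Python packages.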

@androidgeek077
Author

androidgeek077 commented Jul 21, 2020 via email

@MinfengZhu
Owner

This code should work on CPU when you set --gpu=-1. Also, please check carefully that no variables are moved to the GPU. You can remove all .cuda() calls.
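One common way to avoid hard-coded `.cuda()` calls is to pick a device once and move tensors with `.to(device)`. The sketch below is not taken from the DM-GAN code: `pick_device` is a hypothetical helper, and `args.gpu` in the comments is an assumed name for the value of the `--gpu` flag.

```python
def pick_device(gpu_id, cuda_available):
    """Return 'cuda:<id>' only if a GPU was requested and is usable, else 'cpu'."""
    if gpu_id >= 0 and cuda_available:
        return f"cuda:{gpu_id}"
    return "cpu"


# Typical use with PyTorch (shown as comments, not executed here):
#   import torch
#   device = torch.device(pick_device(args.gpu, torch.cuda.is_available()))
#   model = model.to(device)    # instead of model.cuda()
#   batch = batch.to(device)    # instead of batch.cuda()
#   state = torch.load(path, map_location="cpu")  # load GPU-saved weights on a CPU-only machine

print(pick_device(-1, True))   # cpu
print(pick_device(0, False))   # cpu
print(pick_device(0, True))    # cuda:0
```

With this pattern, `--gpu=-1` and a machine without CUDA both fall back to the CPU without further code changes.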

@hs457681503

When I try to train the birds model, I encounter this problem.

Could you help me solve this problem?

RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

@MinfengZhu
Owner

> When I try to train the birds model, I encounter this problem.
>
> Could you help me solve this problem?
>
> RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
> size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

Could you please provide more information (e.g., config file or command)? If you train the model from scratch, there is no checkpoint to load.
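To narrow down a size mismatch like this, it can help to list every parameter whose shape differs between the checkpoint and the freshly built model. The sketch below works on plain name-to-shape dictionaries; `find_shape_mismatches` is a hypothetical helper, and the comments assume the checkpoint file holds a plain state_dict. If `encoder` is an embedding layer, the first dimension is presumably the vocabulary size, so a mismatch there would point at the captions/vocabulary data rather than the network code.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return (name, checkpoint_shape, model_shape) for parameters whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# With PyTorch, the two dictionaries could be built like this (not executed here):
#   state = torch.load(path, map_location="cpu")
#   checkpoint_shapes = {k: tuple(v.shape) for k, v in state.items()}
#   model_shapes = {k: tuple(v.shape) for k, v in model.state_dict().items()}

# Shapes as quoted in the error message above.
checkpoint_shapes = {"encoder.weight": (1, 300)}
model_shapes = {"encoder.weight": (5450, 300)}
print(find_shape_mismatches(checkpoint_shapes, model_shapes))
```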

@savitha91

Hi, I am using Colab to execute the code. I am unable to install cudatoolkit=9.0 and the PyTorch version given in the instructions. I am using the pre-trained model, so when I run the command ! python main.py --cfg eval_coco.yml --gpu 0 I get the error: No such file or directory: u'../data/coco/captions.pickle'. I have only placed examples_filenames.txt into the data/coco folder. Nothing is mentioned about captions.pickle.

@androidgeek077
Author

androidgeek077 commented Aug 5, 2020 via email

@MinfengZhu
Owner

> Hi, I am using Colab to execute the code. I am unable to install cudatoolkit=9.0 and the PyTorch version given in the instructions. I am using the pre-trained model, so when I run the command ! python main.py --cfg eval_coco.yml --gpu 0 I get the error: No such file or directory: u'../data/coco/captions.pickle'. I have only placed examples_filenames.txt into the data/coco folder. Nothing is mentioned about captions.pickle.

I hope DM-GAN can generate images with consistent quality across different PyTorch versions.

Have you downloaded the metadata for the COCO dataset?
python google_drive.py 1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9 ./data/coco.zip
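Before running evaluation, a quick check that the expected metadata files are in place can save a failed run. This is only a sketch: `missing_files` is a hypothetical helper, and it assumes the downloaded archive has been extracted so that captions.pickle sits in the data/coco folder, as the path in the error message suggests.

```python
import os


def missing_files(data_dir, required=("captions.pickle",)):
    """Return the names from `required` that are not present as files in data_dir."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    # Same relative path as in the error message above.
    missing = missing_files("../data/coco")
    if missing:
        print("Missing metadata files:", ", ".join(missing))
    else:
        print("All expected metadata files found.")
```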

@MinfengZhu
Owner

> I am using example_filename.txt and generating images with main.py, but I am only achieving an IS of 2.92. Can you please explain why I am not getting the IS score reported in your paper? For FID, I am getting 65.48. I have also mentioned this in the issues. Thanks and regards

Please generate images using the following command. You should generate images using text descriptions from the whole validation set.
python main.py --cfg cfg/eval_bird.yml --gpu 0

@androidgeek077
Author

androidgeek077 commented Aug 5, 2020 via email

@androidgeek077
Author

androidgeek077 commented Sep 19, 2020 via email

@MinfengZhu
Owner

MinfengZhu commented Sep 30, 2020

We train DM-GAN for many more epochs (please see the yml files in the cfg folder) and choose the epoch with the best performance.

In our paper, we report the performance on the CUB dataset after 600 epochs. The released CUB model was trained for 800 epochs (bird_DMGAN.yml).

@cloverzyy

> When I try to train the birds model, I encounter this problem.
>
> Could you help me solve this problem?
>
> RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
> size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

Did you solve the problem? I have the same problem. Could you tell me how to modify the code or add something? Thank you!

@MinfengZhu
Owner

> > When I try to train the birds model, I encounter this problem.
> > Could you help me solve this problem?
> > RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
> > size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.
>
> Did you solve the problem? I have the same problem. Could you tell me how to modify the code or add something? Thank you!

Have you loaded the pretrained models correctly? The text encoder and image encoder are already provided in DAMSM.

@srinivaspavan9

srinivaspavan9 commented Feb 5, 2021

> This code should work on CPU when you set --gpu=-1. Also, please check carefully that no variables are moved to the GPU. You can remove all .cuda() calls.

Can you please elaborate on how to make this work on CPUs without CUDA? I am still getting the same error.

@srinivaspavan9

> Hello MinfengZhu! Thanks for replying, sir. I figured out that I didn't have CUDA GPUs. Can you please tell me whether I can run this code without a CUDA GPU? I'm a master's student using your work as my base paper. I shall be very thankful to you. Regards
>
> > On Fri, 17 Jul 2020 at 23:54, TwilightSnow wrote:
> > Have you installed the cudatoolkit=9.0? cudatoolkit=9.0 may be incompatible in your environment.
> > conda install requests nltk pandas scikit-image pyyaml cudatoolkit=9.0

I am facing the same issue. Can you help me run this on CPU without CUDA?
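As a starting point for the "remove all .cuda() calls" advice given earlier in the thread, the following sketch lists every line in a source tree that calls `.cuda()`, so each one can be removed or replaced with a device-aware call. The helper `find_cuda_calls` is hypothetical and uses a simple text match, so it may also report occurrences in comments or strings.

```python
import os
import re

# Matches the start of a `.cuda(` call anywhere in a line.
CUDA_CALL = re.compile(r"\.cuda\(")


def find_cuda_calls(root):
    """Return (path, line_number, stripped_line) for each .cuda( occurrence in .py files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if CUDA_CALL.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Run from the repository's code directory to list the calls to review.
    for path, lineno, line in find_cuda_calls("."):
        print(f"{path}:{lineno}: {line}")
```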
