Problem when running evaluate #6

Open
Xu-feng-feng opened this issue Jul 30, 2021 · 8 comments

Comments

@Xu-feng-feng

When I run `sh evaluate.sh` to evaluate the trained model, I get the following error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

@zhangming8
Owner

Pull the latest code. The post-processing used to run on the CPU while the predictions were on the GPU; this has now been fixed. https://github.com/zhangming8/yolox-pytorch/blob/main/models/post_process.py#L22-L24
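For anyone hitting the same error elsewhere, the general idea is to create every helper tensor on the same device as the predictions. Below is a minimal sketch of that pattern; `decode_boxes` and its stride scaling are hypothetical and only illustrate the idea, they are not the actual contents of `post_process.py`.

```python
import torch


def decode_boxes(predictions: torch.Tensor, strides: list) -> torch.Tensor:
    """Hypothetical post-processing step: scale box predictions by per-anchor strides."""
    # Build the helper tensor on the same device and dtype as the predictions.
    # A bare torch.tensor(strides) would live on the CPU and trigger the
    # "Expected all tensors to be on the same device" error when the
    # predictions are on cuda:0.
    stride_tensor = torch.tensor(
        strides, device=predictions.device, dtype=predictions.dtype
    )
    return predictions * stride_tensor.view(1, -1, 1)


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
preds = torch.ones(2, 3, 4, device=device)  # (batch, anchors, box coordinates)
boxes = decode_boxes(preds, [8.0, 16.0, 32.0])
```

Deriving the device from `predictions.device` rather than hard-coding `"cuda"` keeps the same code working for both CPU and GPU evaluation.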

@Xu-feng-feng
Author

With the new code, training was configured for 300 epochs, and at epoch 284 another error appeared: **AttributeError: 'MosaicDetection' object has no attribute 'close_mosaic'**

@zhangming8
Owner

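> With the new code, training was configured for 300 epochs, and at epoch 284 another error appeared: **AttributeError: 'MosaicDetection' object has no attribute 'close_mosaic'**

Sorry, I missed this while fixing the bug. You need to remove the `.dataset` at https://github.com/zhangming8/yolox-pytorch/blob/main/train.py#L137.
If training was interrupted partway, you can continue it by following the resume option in train.sh.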
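To illustrate why one extra `.dataset` produces this error, here is a small sketch. The class layout is an assumption made for illustration (a loader object that exposes `close_mosaic()` and wraps a `MosaicDetection` dataset that does not); the `Loader` class and the `enable_mosaic` flag are hypothetical names and may not match the repository's real code.

```python
class MosaicDetection:
    """Stand-in for the dataset wrapper; it has a flag but no close_mosaic()."""

    def __init__(self):
        self.enable_mosaic = True  # hypothetical attribute name


class Loader:
    """Hypothetical loader that owns the dataset and exposes close_mosaic()."""

    def __init__(self, dataset):
        self.dataset = dataset

    def close_mosaic(self):
        # Turn mosaic augmentation off on the wrapped dataset.
        self.dataset.enable_mosaic = False


train_loader = Loader(MosaicDetection())

# One ".dataset" too many: the call lands on MosaicDetection, which has no such method.
try:
    train_loader.dataset.close_mosaic()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# With the extra ".dataset" removed, the call reaches the object that defines it.
train_loader.close_mosaic()
```

If the call only runs when mosaic augmentation is switched off near the end of training, that would be consistent with the error first showing up at epoch 284, though that is a guess rather than something confirmed in this thread.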

@Xu-feng-feng
Author

Xu-feng-feng commented Aug 6, 2021

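OK, thanks. One more question: why does the training loss keep decreasing while the validation loss starts to rise after a certain point in training?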

@zhangming8
Owner

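> OK, thanks. One more question: why does the training loss keep decreasing while the validation loss starts to rise after a certain point in training?

It means the model is gradually starting to overfit your data.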
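One common way to act on this signal is to keep the checkpoint from the epoch with the lowest validation loss and stop once it has not improved for a few epochs. The sketch below shows that bookkeeping only; the function name, the `patience` value, and the loss numbers are made up for illustration and are not part of this repository.

```python
def best_epoch_with_patience(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: this is the checkpoint worth keeping.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Illustrative numbers: validation loss bottoms out at epoch 3, then rises.
val_losses = [0.90, 0.70, 0.60, 0.55, 0.58, 0.61, 0.66, 0.70]
best_epoch, stop_epoch = best_epoch_with_patience(val_losses, patience=3)
```

With these numbers the best checkpoint is epoch 3 and training would stop at epoch 6. More data, stronger augmentation, or a smaller model are other usual remedies.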

@Xu-feng-feng
Author

When training with amp I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'

@zhangming8
Owner

> When training with amp I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'

Mixed-precision training with use_amp=True has not been pushed yet; I am still validating it.

@zhangming8
Owner

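> > When training with amp I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'

> Mixed-precision training with use_amp=True has not been pushed yet; I am still validating it.

The latest code now supports mixed-precision training; set use_amp=True.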
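For reference, the error message suggests a half-precision tensor reached a CPU sigmoid kernel. Below is a generic sketch of a mixed-precision training step using PyTorch's `torch.autocast` and `GradScaler`, where half precision is only enabled when the model and data are on the GPU. The tiny model, data, and loss are placeholders, and this is not the repository's actual training loop.

```python
import torch
from torch import nn

use_amp = True  # mirrors the use_amp flag mentioned above
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Only enable half precision on the GPU; on the CPU the step stays in float32.
amp_enabled = use_amp and device.type == "cuda"

# Placeholder model and data, just enough to run one step.
model = nn.Sequential(nn.Linear(8, 4), nn.Sigmoid()).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=amp_enabled)

inputs = torch.randn(16, 8, device=device)
targets = torch.rand(16, 4, device=device)

optimizer.zero_grad()
# Ops inside this context run in reduced precision where autocast allows it.
with torch.autocast(device_type=device.type, enabled=amp_enabled):
    outputs = model(inputs)
    loss = nn.functional.mse_loss(outputs, targets)

# Scale the loss to avoid float16 gradient underflow, then step and update.
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```

When `amp_enabled` is false, the autocast context and the scaler are both pass-throughs, so the same step also runs as plain float32 training.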
