Problem when running evaluate #6

Xu-feng-feng · 2021-07-30T11:19:25Z

When I run `sh evaluate.sh` to evaluate the model I trained, I get the following error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

zhangming8 · 2021-07-30T12:16:33Z

Pull the latest code. The post-processing used to run on the CPU while the predictions were on the GPU; this has now been fixed. https://github.com/zhangming8/yolox-pytorch/blob/main/models/post_process.py#L22-L24
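For readers who hit the same device mismatch in their own post-processing, here is a minimal sketch of the usual remedy: move every helper tensor onto the device of the predictions before combining them. The helper name `move_to_device_of` is hypothetical and is not taken from the repository; it only relies on the `.device` attribute and `.to()` method that `torch.Tensor` provides.

```python
def move_to_device_of(reference, *tensors):
    """Return `tensors` moved onto the device that `reference` lives on.

    Hypothetical helper, not the repository's code. It only uses the
    `.device` attribute and the `.to()` method of torch.Tensor, so any
    tensor created on the CPU ends up next to the GPU predictions.
    """
    return tuple(t.to(reference.device) for t in tensors)
```

With real tensors this could be used as `(grid,) = move_to_device_of(predictions, grid)`, where `predictions` and `grid` are placeholder names. When the helper tensor is created in the same function, passing `device=predictions.device` to the creation call avoids the extra copy.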

Xu-feng-feng · 2021-08-06T12:42:26Z

With the new code, I set training to 300 epochs, and at epoch 284 another error appeared: **AttributeError: 'MosaicDetection' object has no attribute 'close_mosaic'**

zhangming8 · 2021-08-06T17:03:20Z

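> With the new code, I set training to 300 epochs, and at epoch 284 another error appeared: **AttributeError: 'MosaicDetection' object has no attribute 'close_mosaic'**

Sorry, I missed this while fixing another bug. You need to remove the `.dataset` at https://github.com/zhangming8/yolox-pytorch/blob/main/train.py#L137.
If training was interrupted partway, you can continue it by following the resume option in train.sh.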

Xu-feng-feng · 2021-08-06T18:13:51Z

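OK, thanks. One more question: why does the training loss keep decreasing while the validation loss starts to rise after a certain point in training?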

zhangming8 · 2021-08-06T22:48:24Z

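> OK, thanks. One more question: why does the training loss keep decreasing while the validation loss starts to rise after a certain point in training?

It means the model is gradually starting to overfit your data.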
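One common way to act on this signal is early stopping: keep the checkpoint with the lowest validation loss and stop once the loss has not improved for a number of epochs. The sketch below is an illustration only; the function name `best_epoch_with_patience` and the `patience` parameter are assumptions and are not part of this repository.

```python
def best_epoch_with_patience(val_losses, patience=5):
    """Scan per-epoch validation losses and decide where to stop.

    Returns (best_epoch, stop_epoch). `best_epoch` is the index of the
    lowest validation loss seen before stopping; `stop_epoch` is the
    first epoch at which the loss has gone `patience` epochs without
    improving, or None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = None
    for epoch, loss in enumerate(val_losses):
        if best_epoch is None or loss < best_loss:
            # New best validation loss: this is the checkpoint to keep.
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: likely overfitting.
            return best_epoch, epoch
    return best_epoch, None
```

For example, with validation losses `[1.0, 0.8, 0.7, 0.75, 0.8, 0.9]` and `patience=3`, the best epoch is 2 and the scan stops at epoch 5. Stronger augmentation or more training data are other common responses to overfitting.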

Xu-feng-feng · 2021-08-07T11:22:54Z

When training with amp, I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'

zhangming8 · 2021-08-09T02:15:15Z

> When training with amp, I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'

Mixed-precision training (use_amp=True) has not been pushed yet; I am still validating it.

zhangming8 · 2021-08-10T14:01:45Z

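> > When training with amp, I get RuntimeError: "sigmoid_cpu" not implemented for 'Half'
>
> Mixed-precision training (use_amp=True) has not been pushed yet; I am still validating it.

The latest code now supports mixed-precision training; set use_amp=True.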

zhangming8 pushed a commit that referenced this issue Aug 7, 2021

fix mosaic bug(#6); mv argparse str labels to 'config.py'

9a743eb

zhangming8 pushed a commit that referenced this issue Aug 10, 2021

support muli-gpu train with metric=Ap(#12); support amp(#6)

5d6faac
