About SOT training speed and GPU-Util #215

cuikf · 2021-12-04T09:17:57Z

When I run python ./main/train.py --config 'experiments/siamfcpp/train/lasot/siamfcpp_alexnet-trn.yaml', GPU-Util keeps oscillating: it sits at 89%, drops to 0% a few seconds later, then goes back to 89%.

In the .yaml:
num_processes: 2
minibatch: &MINIBATCH 128
num_workers: 64

I'm very confused.

MARMOTatZJU · 2021-12-16T03:12:31Z

@cuikf This is most likely caused by a bottleneck at the data-loading stage: the GPU sits idle at 0% while it waits for the next batch. The throughput of the data provider is usually limited by the CPU and memory of the training machine (that's why training with a large batch size often requires a high-performance machine). You can try reducing the batch size to ease the issue.
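To check whether data loading really is the bottleneck, one option is to time how long the training loop waits for each batch versus how long it spends in the training step. The snippet below is only a minimal sketch: `profile_data_wait`, `slow_batches`, and `fast_step` are hypothetical names for illustration and are not part of this repo. In a real run you would pass your own data loader and step function instead of the stand-ins.

```python
import time


def profile_data_wait(batches, train_step, max_steps=50):
    """Return (seconds spent waiting for batches, seconds spent in train_step).

    `batches` is any iterable of batches; `train_step` is any callable that
    consumes one batch. Both are placeholders for your own loader and step.
    """
    wait = 0.0
    compute = 0.0
    it = iter(batches)
    for _ in range(max_steps):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # blocks until the data provider delivers a batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        # Note: CUDA kernels launch asynchronously, so on a GPU the step
        # should synchronize (e.g. torch.cuda.synchronize()) before returning,
        # otherwise compute time is under-counted.
        train_step(batch)
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute


def slow_batches(n, delay=0.01):
    """Stand-in data provider that takes `delay` seconds per batch."""
    for i in range(n):
        time.sleep(delay)
        yield [i] * 8


def fast_step(batch):
    """Stand-in training step that does almost no work."""
    return sum(batch)


if __name__ == "__main__":
    wait, compute = profile_data_wait(slow_batches(20), fast_step)
    total = wait + compute
    print(f"waiting for data: {wait / total:.0%} of loop time")
```

If most of the loop time goes to waiting for data, the GPU is being starved, and lowering the batch size or tuning the number of workers is worth trying. If the wait time is small, the cause is probably elsewhere.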

cuikf · 2021-12-26T08:31:42Z

@MARMOTatZJU Thank you! I'll try it!
