Some questions about the code implementation #4

Open
qwqwq1445 opened this issue Feb 27, 2023 · 6 comments

Comments

@qwqwq1445

1. In the Pretraining Stage
Your published paper says that only the labeled source data are used in the pretraining stage. However, when I use the same strategy, I find that the model cannot produce reliable pseudo labels. Given that you conduct adversarial training on your network, why don't you also use the unlabeled target data for adversarial training during the pretraining stage?
2. In the Joint-training Stage
I wonder how you set the argument 'pseudo_label_policy'. When I set it to 'by_consistency', the model did not seem to train well. Do you set it to 'traditional'? If so, how do you choose the valuable pseudo labels? Via the PostProcess module and a threshold?
Wishing you all the best, and looking forward to your reply.

@Lafite-Yu
Owner

For the second question, I remember that 'traditional' is the default setting in the command-line arguments. As mentioned in the paper, pseudo-label filtering is simply performed by threshold filtering. 'by_consistency' is experimental; it brought no improvement in our experiments, so we finally discarded it.
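As an illustration of what such threshold filtering can look like after the detector's post-processing step, here is a minimal sketch; the function name, threshold value, and output format (a DETR-style list of dicts with 'scores', 'labels', 'boxes') are assumptions, not the exact identifiers used in this repo:

```python
def filter_pseudo_labels(postprocessed_outputs, score_threshold=0.5):
    """Keep only detections whose confidence exceeds the threshold.

    `postprocessed_outputs` is assumed to be a list of dicts with
    'scores', 'labels', and 'boxes' tensors, one dict per image.
    The 0.5 threshold is only a placeholder; tune it per scenario.
    """
    pseudo_labels = []
    for out in postprocessed_outputs:
        keep = out['scores'] > score_threshold
        pseudo_labels.append({
            'labels': out['labels'][keep],
            'boxes': out['boxes'][keep],
        })
    return pseudo_labels
```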

For the first question, a good starting point is loading the SFA pretrained weights for cityscapes. Actually, we also did not manage to pretrain the model well on the cityscapes -> foggy cityscapes scenario, so we load the SFA pretrained weights, except for the newly added components. (This can also be done for the other two scenarios.) Interestingly, since SFA did not provide pretrained weights for the other two tasks, the model trained by ourselves can largely beat SFA on them. Thus, we think our model's performance on foggy cityscapes might be improved by better pretraining.
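Loading pretrained weights while skipping newly added components usually amounts to filtering the checkpoint's state dict before loading it. A minimal sketch, assuming `model` is the constructed detector; the checkpoint path and 'model' key are placeholders, not this repo's exact names:

```python
import torch

# Load the SFA checkpoint (path and 'model' key are placeholders).
checkpoint = torch.load('sfa_cityscapes_pretrained.pth', map_location='cpu')
pretrained_state = checkpoint['model']

model_state = model.state_dict()
# Keep only parameters that exist in the current model with matching shapes;
# the newly added components simply stay randomly initialized.
compatible = {k: v for k, v in pretrained_state.items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
model.load_state_dict(model_state)
```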

The reason for a separate pretraining stage is that, to generate pseudo labels on unlabeled data, the model should already have learned something; a randomly initialized model cannot generate meaningful pseudo labels.

@qwqwq1445
Author

Thank you for your reply.

qwqwq1445 reopened this Mar 10, 2023
@qwqwq1445
Author

A couple of other details I would like to know:

  1. How many epochs do you use for model pretraining and self-supervised training?
  2. How do you set your random seed?

@Lafite-Yu
Owner

Q1: I'm not sure about the exact number, but for the pretraining stage the number of epochs is relatively large; the model is trained until the performance converges to a satisfying level. You may need to try many things, like the learning rate, the random seed, and even warm restarts. (However, we failed to get a good cityscapes pretrained model, and thus we started from the SFA pretrained weights; for the other two scenarios, we did not try many configs to get a satisfying one.) For the self-supervised training, the number of epochs should be small (fewer than ten, or even five), otherwise the training collapses quickly. You may need a small validation set split from the training set, or try a small fixed epoch number; we think handling this properly can be future work.
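To make the "small validation set" idea concrete, here is a rough sketch of stopping the self-supervised phase once a small held-out split stops improving; `train_one_epoch`, `evaluate_map`, and the loaders are hypothetical helpers, not functions from this repo:

```python
import torch

# Sketch: stop self-supervised training early when a small held-out split
# stops improving, to avoid the collapse mentioned above.
best_map, patience, bad_epochs = 0.0, 2, 0
for epoch in range(10):  # keep the budget small (< 10 epochs)
    train_one_epoch(student, teacher, labeled_source_loader, unlabeled_target_loader)
    val_map = evaluate_map(teacher, small_val_loader)
    if val_map > best_map:
        best_map, bad_epochs = val_map, 0
        torch.save(teacher.state_dict(), 'best_teacher.pth')
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # no improvement for a while; likely collapsing
```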

Q2: For the self-supervised training phase, we did not set the random seed by any complex method; we may have tried some common numbers like 42 or 0, but the default number in the arguments is what we actually use.
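For completeness, seeding a typical PyTorch training script usually looks like the following generic sketch (not code taken from this repo):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42):
    """Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```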

@Lafite-Yu
Owner

Oh, by the way, considering the progress in UDA OD over the past year, you might be better off starting from some other open-source UDA OD projects and adding the mean-teacher workflow to them (referring to this repo, or to projects like Unbiased Teacher from the semi-supervised OD area that we referred to), as we noticed that some of them can largely surpass our performance even without the mean teacher.
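The core of the mean-teacher workflow is just an exponential moving average (EMA) update of the teacher's weights from the student's; a minimal sketch follows (the momentum value is a common default, not necessarily what this repo uses):

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, momentum: float = 0.999):
    """EMA update: teacher = momentum * teacher + (1 - momentum) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```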

@qwqwq1445
Author

Your work has achieved great performance on the Sim10k -> Cityscapes benchmark. I wonder how you set the coefficients of the three domain losses for this benchmark? For example, in SFA the coefficients of TIFA and DQFA are set to 0.01 and 0.001.
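For reference, such coefficients typically enter as simple weights when the total loss is assembled; a generic sketch (the names and values below are placeholders, not this repo's actual settings):

```python
# Sketch: weighting the three domain-adversarial losses in the total loss.
# Coefficient values are placeholders; check the command-line arguments for
# the actual settings used in training.
lambda_1, lambda_2, lambda_3 = 0.01, 0.001, 0.001
total_loss = det_loss + lambda_1 * domain_loss_1 + lambda_2 * domain_loss_2 + lambda_3 * domain_loss_3
```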
