Some questions about the code implementation #4

Open
qwqwq1445 opened this issue Feb 27, 2023 · 6 comments

Comments

@qwqwq1445

1. In the Pretraining Stage
Your published paper says that only the labeled source data are used in the pretraining stage. However, when I use the same strategy, I find that the model cannot produce reliable pseudo labels. Given that you conduct adversarial training on your network, why don't you also use the unlabeled target data for adversarial training during the pretraining stage?
2. In the Joint-training Stage
I wonder how you set the argument 'pseudo_label_policy'. When I set it to 'by_consistency', the model did not seem to train well. Do you set it to 'traditional'? If so, how do you choose the valuable pseudo labels? Via the PostProcess module and a threshold?
Wishing you all the best, and looking forward to your reply.

@Lafite-Yu
Owner

For the second question, I remember that 'traditional' is the default setting in the command-line arguments. As mentioned in the paper, pseudo-label filtering is simply performed by threshold filtering. 'by_consistency' is experimental; it brought no improvement in our experiments, so we finally discarded it.
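As an illustration of what such threshold filtering can look like after the detector's post-processing step, here is a minimal sketch; the function name, threshold value, and output format (a DETR-style list of dicts with 'scores', 'labels', 'boxes') are assumptions, not the exact identifiers used in this repo:

```python
def filter_pseudo_labels(postprocessed_outputs, score_threshold=0.5):
    """Keep only detections whose confidence exceeds the threshold.

    `postprocessed_outputs` is assumed to be a list of dicts with
    'scores', 'labels', and 'boxes' tensors, one dict per image.
    The 0.5 threshold is only a placeholder; tune it per scenario.
    """
    pseudo_labels = []
    for out in postprocessed_outputs:
        keep = out['scores'] > score_threshold
        pseudo_labels.append({
            'labels': out['labels'][keep],
            'boxes': out['boxes'][keep],
        })
    return pseudo_labels
```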

For the first question, a good starting point is loading the SFA pretrained weights for cityscapes. Actually, we also did not manage to pretrain the model well on the cityscapes -> foggy cityscapes scenario, so we load the SFA pretrained weights, except for the newly added components. (This can also be done for the other two scenarios.) Interestingly, since SFA did not provide pretrained weights for the other two tasks, the model trained by ourselves can largely beat SFA on them. Thus, we think our model's performance on foggy cityscapes might be improved by better pretraining.
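Loading pretrained weights while skipping newly added components usually amounts to filtering the checkpoint's state dict before loading it. A minimal sketch, assuming `model` is the constructed detector; the checkpoint path and 'model' key are placeholders, not this repo's exact names:

```python
import torch

# Load the SFA checkpoint (path and 'model' key are placeholders).
checkpoint = torch.load('sfa_cityscapes_pretrained.pth', map_location='cpu')
pretrained_state = checkpoint['model']

model_state = model.state_dict()
# Keep only parameters that exist in the current model with matching shapes;
# the newly added components simply stay randomly initialized.
compatible = {k: v for k, v in pretrained_state.items()
              if k in model_state and v.shape == model_state[k].shape}
model_state.update(compatible)
model.load_state_dict(model_state)
```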

The reason for a separate pretraining stage is that, to generate pseudo labels on unlabeled data, the model should already have learned something; a randomly initialized model cannot generate meaningful pseudo labels.

@qwqwq1445
Author

Thank you for your reply.

qwqwq1445 reopened this Mar 10, 2023
@qwqwq1445
Author

A couple of other details I would like to know:

  1. How many epochs do you use for model pretraining and self-supervised training?
  2. How do you set your random seed?

@Lafite-Yu
Owner

Q1: I'm not sure about the exact number, but for the pretraining stage the number of epochs is relatively large; the model is trained until the performance converges to a satisfying level. You may need to try many things, like the learning rate, the random seed, and even warm restarts. (However, we failed to get a good cityscapes pretrained model, and thus we started from the SFA pretrained weights; for the other two scenarios, we did not try many configs to get a satisfying one.) For the self-supervised training, the number of epochs should be small (fewer than ten, or even five), otherwise the training collapses quickly. You may need a small validation set split from the training set, or try a small fixed epoch number; we think handling this properly can be future work.
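To make the "small validation set" idea concrete, here is a rough sketch of stopping the self-supervised phase once a small held-out split stops improving; `train_one_epoch`, `evaluate_map`, and the loaders are hypothetical helpers, not functions from this repo:

```python
import torch

# Sketch: stop self-supervised training early when a small held-out split
# stops improving, to avoid the collapse mentioned above.
best_map, patience, bad_epochs = 0.0, 2, 0
for epoch in range(10):  # keep the budget small (< 10 epochs)
    train_one_epoch(student, teacher, labeled_source_loader, unlabeled_target_loader)
    val_map = evaluate_map(teacher, small_val_loader)
    if val_map > best_map:
        best_map, bad_epochs = val_map, 0
        torch.save(teacher.state_dict(), 'best_teacher.pth')
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # no improvement for a while; likely collapsing
```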

Q2: For the self-supervised training phase, we did not set the random seed by any complex method; we may have tried some common numbers like 42 or 0, but the default number in the arguments is what we actually use.
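For completeness, seeding a typical PyTorch training script usually looks like the following generic sketch (not code taken from this repo):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 42):
    """Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```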

@Lafite-Yu
Owner

Oh, by the way, considering the progress in UDA OD over the past year, you might be better off starting from some other open-source UDA OD projects and adding the mean-teacher workflow to them (referring to this repo, or to projects like Unbiased Teacher from the semi-supervised OD area that we referred to), as we noticed that some of them can largely surpass our performance even without the mean teacher.
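The core of the mean-teacher workflow is just an exponential moving average (EMA) update of the teacher's weights from the student's; a minimal sketch follows (the momentum value is a common default, not necessarily what this repo uses):

```python
import torch

@torch.no_grad()
def update_teacher(teacher, student, momentum: float = 0.999):
    """EMA update: teacher = momentum * teacher + (1 - momentum) * student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```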

@qwqwq1445
Author

Your work has achieved great performance on the Sim10k -> Cityscapes benchmark. I wonder how you set the coefficients of the three domain losses for this benchmark? For example, in SFA the coefficients of TIFA and DQFA are set to 0.01 and 0.001.
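For reference, such coefficients typically enter as simple weights when the total loss is assembled; a generic sketch (the names and values below are placeholders, not this repo's actual settings):

```python
# Sketch: weighting the three domain-adversarial losses in the total loss.
# Coefficient values are placeholders; check the command-line arguments for
# the actual settings used in training.
lambda_1, lambda_2, lambda_3 = 0.01, 0.001, 0.001
total_loss = det_loss + lambda_1 * domain_loss_1 + lambda_2 * domain_loss_2 + lambda_3 * domain_loss_3
```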
