
Training configuration inquiry #19

Open
Garry101CN opened this issue Sep 29, 2024 · 1 comment

Comments

@Garry101CN

Hi Zhang,
Thank you for sharing your nice work. We found that your provided docres.pkl achieves promising results on the Doc3D dataset. We then tried to train a new DocRes model on Doc3D using the same settings (but without parallel training) for 100,000 iterations. The results appear to underfit and fall far behind your trained docres.pkl.

Could you please share the training configuration of your docres.pkl model, e.g., the GPU model and count you used and the total number of iterations you trained for? Additionally, could the lack of multi-GPU parallel training be a possible reason for our model's underfitting?

Best,
Gary

@ZZZHANG-jx
Owner


As mentioned in our paper, we trained our model on 8 NVIDIA A6000 GPUs with a global batch size of 80. I haven't explored the impact of different batch sizes on this specific task. Are you training solely for the dewarping task? If the dewarping task is trained directly alongside other tasks without pre-training, the performance could degrade significantly. Additionally, are you using the document foreground mask to remove environmental boundaries? This step is also crucial for improving the model's performance on the dewarping task.
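The foreground-masking step mentioned above can be sketched as follows. This is a minimal illustration, not the DocRes implementation: `apply_foreground_mask` and the toy arrays are hypothetical, and it assumes a precomputed binary document mask (1 = document foreground, 0 = environmental background).

```python
import numpy as np

def apply_foreground_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out everything outside the document foreground.

    image: H x W x C uint8 array
    mask:  H x W binary array (1 = document, 0 = background)
    """
    if mask.ndim == 2:
        mask = mask[..., None]  # add a channel axis so it broadcasts over C
    return (image * mask).astype(image.dtype)

# Hypothetical usage: a 4x4 "image" whose border pixels are background.
img = np.full((4, 4, 3), 200, dtype=np.uint8)
m = np.zeros((4, 4), dtype=np.uint8)
m[1:3, 1:3] = 1  # the document occupies the centre region
out = apply_foreground_mask(img, m)
```

Removing the environmental boundary this way keeps the dewarping network from fitting to background clutter instead of the document's own geometry.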
