About the Cost Volume #4
The original paper concatenates the left and the right features across all disparities. In your case, it just concatenates features from D = 0 to D = 400.
Thanks. So for images whose max disparity is less than 400, I would have to construct the same 400-level volume and fill the remainder with zeros, am I right? And for an input of shape batchsize × 2F × D × H × W, doesn't training require a huge amount of GPU memory?
About your first question, I think you're right. As indicated by the paper, the dimension of the cost volume is DxHxWx2F, which means each feature pair is a DxHxW array. For the second question, yes, you'll need lots of memory to run the model.
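To make the DxHxWx2F layout concrete, here is a minimal NumPy sketch of the concatenation-based cost volume (not taken from this repository, which may use a different framework): at each disparity d, the left feature map is paired with the right feature map shifted d pixels to the right, and columns with no valid match are zero-padded.

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation cost volume, shape (max_disp, H, W, 2F).

    left_feat, right_feat: (F, H, W) unary feature maps.
    At disparity d, left pixel (y, x) is paired with right pixel (y, x - d);
    columns x < d have no valid match and stay zero.
    """
    F, H, W = left_feat.shape
    cost = np.zeros((max_disp, H, W, 2 * F), dtype=left_feat.dtype)
    for d in range(max_disp):
        # left half of the channel axis: left features, unshifted
        cost[d, :, :, :F] = np.moveaxis(left_feat, 0, -1)
        # right half: right features shifted by d (zero padding for x < d)
        if d == 0:
            cost[d, :, :, F:] = np.moveaxis(right_feat, 0, -1)
        else:
            cost[d, :, d:, F:] = np.moveaxis(right_feat[:, :, :-d], 0, -1)
    return cost
```

For an image with a smaller true disparity range, the volume still has `max_disp` levels; the extra levels simply contain feature pairs that never match well.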
Can your implementation reproduce the results of the paper?
Hi, unfortunately I haven't trained the model with Scene_Flow data. It seems that the model will run out of memory if the hyperparameters are set too high. In addition, it took me more than 15 seconds to run an iteration with a batch size of 1.
Well, I have only 16 GB of memory and a single TitanX GPU, and I wonder: if D is set as high as 400, will it run out of memory?
Yes, it will run out of memory.
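A back-of-the-envelope check makes this plausible. Assuming illustrative sizes (64 channels for 2F, 256x512 crops, float32; only D = 400 comes from this thread), a single cost volume alone already exceeds a 12 GB TitanX, before counting any 3D-convolution activations or gradients:

```python
# Hypothetical sizes for illustration: 2F = 64 channels, D = 400 disparities,
# 256 x 512 crop, 4 bytes per float32 element.
volume_bytes = 64 * 400 * 256 * 512 * 4
print(volume_bytes / 2**30, "GiB")  # prints 12.5 GiB
```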
It seems the final layer (soft argmin) limits the output value since it is an affine combination of disparity values. I am pondering if we can use linear combinations instead. In that case, we might be able to reduce disparity levels. |
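For reference, the soft argmin in question takes a softmax over the negated matching costs and uses the result as weights on the disparity values, so the output is a convex (and hence bounded) combination of disparities. A small NumPy sketch, not taken from this repository:

```python
import numpy as np

def soft_argmin(cost):
    """Soft argmin over a (D, H, W) cost volume; lower cost = better match.

    Returns a (H, W) map of expected disparities:
    d_hat = sum_d d * softmax(-cost)_d, which always lies in [0, D-1].
    """
    neg = -cost
    neg -= neg.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(neg)
    probs /= probs.sum(axis=0, keepdims=True)
    disparities = np.arange(cost.shape[0]).reshape(-1, 1, 1)
    return (probs * disparities).sum(axis=0)
```

Because the softmax weights are nonnegative and sum to one, the output can never leave the [0, D-1] range; dropping that constraint (general linear combinations) is what the comment above proposes.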
I have updated the repository to make it easier to use. Please check it. |
@LinHungShi I can't understand how they train the left and the right features without image patches. Could you help me? |
Could you explain your problem in more details? From what I understand, they do patch the images during training. |
I thought the cost volume is constructed from the left features and the corresponding right features shifted by disparity d. But for the Middlebury dataset the disparity range differs per image, and some ranges are as large as 400, so I wonder how this can be handled.