The last conv's output is 6*6*30, so how can it be reshaped to 1470? #16

Open
sysuzyq opened this issue Feb 23, 2017 · 4 comments

sysuzyq commented Feb 23, 2017

layer {
  name: "conv_reg"
  type: "Convolution"
  bottom: "add_conv2"
  top: "conv_reg"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 30  # output: 6 * 6 * 30
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0.0
    }
  }
}
layer {
  name: "reg_reshape"
  type: "Reshape"
  bottom: "conv_reg"
  top: "regression"
  reshape_param {
    axis: 1
    shape {
      dim: 1470  # but this is 7 * 7 * 30, which is (classes + num_object * 5) * side * side
    }
  }
}
Your input is 448 * 448, but the feature map of the last conv layer is 6 * 6, so the output should be 6 * 6 * 30 = 1080. Should we reshape to 1080 instead of 1470?
I look forward to your reply, thanks.

@ICTwangbiao

The total stride is 64, so the last feature map has side 448 / 64 = 7, and 7 * 7 * 30 = 1470.
side = width (or height) / stride
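
The arithmetic in this reply can be checked with a few lines of Python. This is only a sketch: the function names `grid_side` and `regression_dim` are made up for illustration, and the total stride of 64 is taken from the comment above rather than read from the network definition.

```python
def grid_side(input_size: int, total_stride: int) -> int:
    """Side of the final feature map, assuming the input divides evenly."""
    if input_size % total_stride != 0:
        raise ValueError("input size is not a multiple of the total stride")
    return input_size // total_stride


def regression_dim(side: int, channels_per_cell: int) -> int:
    """Number of values the Reshape layer must hold: side * side * channels."""
    return side * side * channels_per_cell


side = grid_side(448, 64)
print(side, regression_dim(side, 30))  # 7 1470
print(regression_dim(6, 30))           # 1080, the value a 6 x 6 map would need
```

If the last feature map really were 6 x 6, the Reshape to 1470 could not succeed, since the element counts would not match.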

sysuzyq commented Feb 23, 2017

@ICTwangbiao Thanks for your reply, but I am still confused. When I use the equation "h_o = (h_i + 2 * pad_h - kernel_h) / stride_h + 1" to calculate the output side layer by layer, I end up with 6. What am I doing wrong? Thanks.
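
One thing worth checking when the layer-by-layer calculation gives 6: as far as I know, Caffe rounds the division down for Convolution layers but rounds it up for Pooling layers, so applying the equation above with floor everywhere can undercount by one. The sketch below compares the two roundings. The helper names are hypothetical, the pooling rule reflects my understanding of Caffe's pooling layer, and the 3x3 stride-2 window on a 14 x 14 map is an illustrative example, not a layer taken from this repository's prototxt.

```python
import math


def conv_out(h_in, kernel, stride=1, pad=0):
    """Convolution output side: the division is rounded down."""
    return (h_in + 2 * pad - kernel) // stride + 1


def pool_out(h_in, kernel, stride=1, pad=0):
    """Pooling output side as assumed for Caffe: the division is rounded up."""
    out = math.ceil((h_in + 2 * pad - kernel) / stride) + 1
    # With padding, the last window is dropped if it would start
    # entirely past the input (assumed to match Caffe's clipping step).
    if pad > 0 and (out - 1) * stride >= h_in + pad:
        out -= 1
    return out


print(conv_out(7, 3, stride=1, pad=1))   # 7: a 3x3, stride 1, pad 1 conv keeps the size
print(conv_out(14, 3, stride=2, pad=0))  # 6 with floor rounding
print(pool_out(14, 3, stride=2, pad=0))  # 7 with ceil rounding
```

Whether this explains the 6 in this particular network depends on which pooling layers it contains, so it is a lead to verify rather than a confirmed answer.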

quhezheng commented Jan 2, 2018

@ICTwangbiao (classes + num_object * 5) * side * side comes from the paper.
But this code uses 7 values instead of 30 to represent one cell, as https://github.com/yeahkun/caffe-yolo/blob/master/src/caffe/layers/box_data_layer.cpp#L142 makes clear:

CHECK_EQ(count, locations * 7)

Here locations is 7 * 7 = 49, so dim: 1470 looks wrong. I don't know why the code runs at all. The correct value should be dim: 343.

ICTwangbiao commented Jan 10, 2018

@quhezheng "CHECK_EQ(count, locations * 7)" checks your ground truth. Each location's label has 7 fields {class_LABEL, difficult, isobj, x, y, w, h}, so the 343 you mention is the size of the ground truth.
But 1470 is the number of outputs of the regression layer: locations * (class_NUMBER + score_confidence + coordinate infos).
In @sysuzyq's case, I guess he/she wants to detect 25 classes with num_object=1, so outputs = locations * (25 + 5).
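
To make the distinction concrete, here is a small sketch of the two counts. The seven label fields are the ones listed above; the function names are hypothetical, and the two ways of splitting 30 channels (25 classes with one box, or 20 classes with two boxes as in the YOLO paper) are both possibilities, since the thread does not say which one the model uses.

```python
# Fields stored per grid location in the ground-truth label.
LABEL_FIELDS = ("class_label", "difficult", "isobj", "x", "y", "w", "h")


def label_count(side):
    """Size of the ground-truth blob: one fixed-length record per location."""
    return side * side * len(LABEL_FIELDS)


def output_count(side, num_class, num_object):
    """Size of the regression output: class scores plus 5 values per box."""
    return side * side * (num_class + num_object * 5)


print(label_count(7))           # 343
print(output_count(7, 25, 1))   # 1470
print(output_count(7, 20, 2))   # 1470
```

The label stores a single class index per location, while the prediction carries one score per class, which is why the two sizes differ.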
