For faster NMS #144

Open
naoe1999 opened this issue Jun 28, 2021 · 11 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@naoe1999

naoe1999 commented Jun 28, 2021

First, thank you for sharing your code; it has helped me a lot.

To make it even better, I would like to suggest the much faster NMS code below.
It replaces both the nms() and iou() functions.
I based it on the NMS algorithm by Malisiewicz et al. and adapted it to LUNA16's 3D data format.

The speedup comes from two changes:

  1. Limiting the maximum number of candidates before NMS (to 3000, for example). This helps a great deal when the detector is not yet well trained, because it can then produce millions of predictions above the threshold (such as -0.5). A properly trained detector should not do this.

  2. Removing the inner loop and using NumPy vectorization, which is much faster than iterating in a Python loop.


import numpy as np


def nms(output, nms_th=0.1):
    if len(output) == 0:
        return output

    output = output[np.argsort(-output[:, 0])]
    output = output[:3000]  # limit max number of candidates

    r = output[:, 4] / 2.
    x1 = output[:, 1] - r
    y1 = output[:, 2] - r
    z1 = output[:, 3] - r
    x2 = output[:, 1] + r
    y2 = output[:, 2] + r
    z2 = output[:, 3] + r
    volume = output[:, 4] ** 3  # (x2 - x1) * (y2 - y1) * (z2 - z1)

    pick = []
    idxs = np.arange(len(output))

    while len(idxs) > 0:
        i = idxs[0]
        pick.append(i)

        xx1 = np.maximum(x1[i], x1[idxs[1:]])
        yy1 = np.maximum(y1[i], y1[idxs[1:]])
        zz1 = np.maximum(z1[i], z1[idxs[1:]])
        xx2 = np.minimum(x2[i], x2[idxs[1:]])
        yy2 = np.minimum(y2[i], y2[idxs[1:]])
        zz2 = np.minimum(z2[i], z2[idxs[1:]])

        w = np.maximum(0.0, xx2 - xx1)
        h = np.maximum(0.0, yy2 - yy1)
        d = np.maximum(0.0, zz2 - zz1)
        intersection = w * h * d

        iou = intersection / (volume[i] + volume[idxs[1:]] - intersection)

        idxs = np.delete(idxs, np.concatenate(([0], np.where(iou >= nms_th)[0] + 1)))

    return output[pick]
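
The vectorized step inside the loop can be tried on its own. This is a minimal sketch, assuming rows are laid out as [score, c1, c2, c3, side] like the function above expects; the helper name cube_iou_one_to_many and the sample boxes are made up for illustration.

```python
import numpy as np


def cube_iou_one_to_many(box, others):
    """IoU between one cube and an array of cubes.

    Rows are assumed to be [score, c1, c2, c3, side] (hypothetical layout
    matching the nms() function in this post).
    """
    half = box[4] / 2.0
    halves = others[:, 4] / 2.0
    # Lower and upper corners of the overlap region, one row per other cube.
    lo = np.maximum(box[1:4] - half, others[:, 1:4] - halves[:, None])
    hi = np.minimum(box[1:4] + half, others[:, 1:4] + halves[:, None])
    # Negative extents mean no overlap along that axis, so clamp to zero.
    overlap = np.prod(np.maximum(0.0, hi - lo), axis=1)
    union = box[4] ** 3 + others[:, 4] ** 3 - overlap
    return overlap / union


top = np.array([0.9, 0.0, 0.0, 0.0, 2.0])
rest = np.array([
    [0.8, 1.0, 0.0, 0.0, 2.0],   # shifted by half a side along one axis
    [0.7, 5.0, 5.0, 5.0, 2.0],   # far away, no overlap
])
ious = cube_iou_one_to_many(top, rest)
print(ious)
```

With nms_th=0.1, the first of the two sample candidates would be suppressed and the second kept.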
@wentaozhu
Owner

Yes. Thank you so much for the contribution!

@wentaozhu added the "enhancement" and "help wanted" labels Jun 28, 2021
@SirMwan

SirMwan commented Jul 16, 2021

Hello naoe1999,

I have been trying to follow up on training with this repository. I tried to train res18 without 064.ckpt, and the training loss does not decrease.
I think this is because of changes between torch versions. I also changed some parts of the data.py and main.py files to match my torch version, without any success.

Could you kindly share your data.py and main.py files with me?
My email is [email protected]
Thanks in advance.

This is the part of the main.py file I have changed:

def train(data_loader, net, loss, epoch, optimizer, get_lr, save_freq, save_dir):
    start_time = time.time()

    net.train()
    lr = get_lr(epoch)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

    metrics = []

    for i, (data, target, coord) in enumerate(data_loader):
        if torch.cuda.is_available():
            data = Variable(data.cuda())
            target = Variable(target.cuda())
            coord = Variable(coord.cuda())
        data = data.float()
        target = target.float()
        coord = coord.float()

        optimizer.zero_grad()
        output = net(data, coord)
        loss_output = loss(output, target)
        loss_output[0].backward()
        optimizer.step()

        loss_output[0] = loss_output[0].item()    ####
        metrics.append(loss_output)

    if epoch % args.save_freq == 0:
        state_dict = net.state_dict()
        for key in state_dict.keys():
            state_dict[key] = state_dict[key].cpu()

        torch.save({
            'epoch': epoch,
            'save_dir': save_dir,
            'state_dict': state_dict,
            'args': args},
            os.path.join(save_dir, '%03d.ckpt' % epoch))

    end_time = time.time()
    metrics = np.asarray(metrics, np.float32)
    print('Epoch %03d (lr %.5f)' % (epoch, lr))
    print('Train:      tpr %3.2f, tnr %3.2f, total pos %d, total neg %d, time %3.2f' % (
        100.0 * np.sum(metrics[:, 6]) / np.sum(metrics[:, 7]),
        100.0 * np.sum(metrics[:, 8]) / np.sum(metrics[:, 9]),
        np.sum(metrics[:, 7]),
        np.sum(metrics[:, 9]),
        end_time - start_time))
    print('loss %2.4f, classify loss %2.4f, regress loss %2.4f, %2.4f, %2.4f, %2.4f' % (
        np.mean(metrics[:, 0]),
        np.mean(metrics[:, 1]),
        np.mean(metrics[:, 2]),
        np.mean(metrics[:, 3]),
        np.mean(metrics[:, 4]),
        np.mean(metrics[:, 5])))
    print()

Following the changes above, I also changed these parts of the layer.py file:

def hard_mining(neg_output, neg_labels, num_hard):
    _, idcs = torch.topk(neg_output, min(num_hard, len(neg_output)))
    neg_output = torch.index_select(neg_output, 0, idcs)
    neg_labels = torch.index_select(neg_labels, 0, idcs)
    return neg_output, neg_labels

class Loss(nn.Module):
    def __init__(self, num_hard = 2):
        super(Loss, self).__init__()
        self.sigmoid = nn.Sigmoid()
        self.classify_loss = nn.BCELoss()
        self.regress_loss = nn.SmoothL1Loss()
        self.num_hard = num_hard

    def forward(self, output, labels, train = True):
        batch_size = labels.size(0)
        output = output.view(-1, 5)
        labels = labels.view(-1, 5)

        pos_idcs = labels[:, 0] > 0.5
        pos_idcs = pos_idcs.unsqueeze(1).expand(pos_idcs.size(0), 5)
        pos_output = output[pos_idcs].view(-1, 5)
        pos_labels = labels[pos_idcs].view(-1, 5)

        neg_idcs = labels[:, 0] < -0.5
        neg_output = output[:, 0][neg_idcs]
        neg_labels = labels[:, 0][neg_idcs]

        if self.num_hard > 0 and train:
            neg_output, neg_labels = hard_mining(neg_output, neg_labels, self.num_hard * batch_size)
        neg_prob = self.sigmoid(neg_output)

        #classify_loss = self.classify_loss(
        #    torch.cat((pos_prob, neg_prob), 0),
        #    torch.cat((pos_labels[:, 0], neg_labels + 1), 0))
        if len(pos_output)>0:
            pos_prob = self.sigmoid(pos_output[:, 0])
            pz, ph, pw, pd = pos_output[:, 1], pos_output[:, 2], pos_output[:, 3], pos_output[:, 4]
            lz, lh, lw, ld = pos_labels[:, 1], pos_labels[:, 2], pos_labels[:, 3], pos_labels[:, 4]

            regress_losses = [
                self.regress_loss(pz, lz),
                self.regress_loss(ph, lh),
                self.regress_loss(pw, lw),
                self.regress_loss(pd, ld)]
            regress_losses_data = [loz.item() for loz in regress_losses]  ### changed
            classify_loss = 0.5 * self.classify_loss(
                pos_prob, pos_labels[:, 0]) + 0.5 * self.classify_loss(
                neg_prob, neg_labels + 1)
            pos_correct = (pos_prob.data >= 0.5).sum()
            pos_total = len(pos_prob)

        else:
            regress_losses = [0,0,0,0]
            classify_loss =  0.5 * self.classify_loss(
                neg_prob, neg_labels + 1)
            pos_correct = 0
            pos_total = 0
            regress_losses_data = [0,0,0,0]
        classify_loss_data = classify_loss.item() #####changed
        loss = classify_loss
        for regress_loss in regress_losses:
            loss += regress_loss
        neg_correct = (neg_prob.data < 0.5).sum()
        neg_total = len(neg_prob)
        return [loss, classify_loss_data] + regress_losses_data + [pos_correct, pos_total, neg_correct, neg_total]

And in the data.py file, there are a lot of int issues.

@naoe1999
Author

Dear @SirMwan,

I think the changes you made (.data to .item()) are fine.
I didn't change that part myself, though, because it worked fine with my PyTorch version.

I guess some of your other changes are having an effect.
You mentioned "a lot of int issues", and you probably changed some code to get it working.
If I remember correctly, those are Python version issues:

  • In Python 2.x, (int) / (int) -> (int), but (float) / (int), (int) / (float), and (float) / (float) -> (float).
  • In Python 3.x, all of them -> (float).

So when you make changes, you have to review the code line by line.
If the original code computes (int) / (int), you should change "/" to "//".
However, if either operand is a float, you should leave it unchanged.
That was the trickiest part when I dealt with the version issue.
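
The difference can be checked quickly in Python 3. This is a minimal sketch; the crop_size and stride values are made up for illustration and are not taken from the repository's config.

```python
# Hypothetical values, chosen only to show the behaviour.
crop_size, stride = 96, 4

true_div = crop_size / stride     # Python 3: always a float, here 24.0
floor_div = crop_size // stride   # int // int stays an int, here 24

print(type(true_div).__name__, true_div)
print(type(floor_div).__name__, floor_div)

# A float cannot be used where an integer count is required.
try:
    range(true_div)
    float_count_accepted = True
except TypeError as err:
    float_count_accepted = False
    print("float count rejected:", err)

# With a float operand, // still floors but the result remains a float,
# so code that was already float arithmetic in Python 2 should keep "/".
mixed = 7.5 // 2
print(type(mixed).__name__, mixed)
```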

After those changes, my data.py has 20 "//"s, and split_combine.py has 2 "//"s.

I hope it helps.
If you have already done this correctly and still have a problem, then the cause is something else.

@SirMwan

SirMwan commented Jul 22, 2021

Dear @naoe1999

Could you kindly help me cross-check the lines I have changed below? I marked each change with # followed by its number.

class DataBowl3Detector(Dataset):
    def __init__(self, data_dir, split_path, config, phase='train', split_comber=None):
        assert(phase == 'train' or phase == 'val' or phase == 'test')
        self.phase = phase
        self.max_stride = config['max_stride']
        self.stride = config['stride']
        sizelim = config['sizelim']/config['reso']
        sizelim2 = config['sizelim2']//config['reso'] #1
        sizelim3 = config['sizelim3']//config['reso'] #2

.....
        else:
            imgs = np.load(self.filenames[idx])
            bboxes = self.sample_bboxes[idx]
            nz, nh, nw = imgs.shape[1:]
            pz = int(np.ceil(float(nz) / self.stride)) * self.stride
            ph = int(np.ceil(float(nh) / self.stride)) * self.stride
            pw = int(np.ceil(float(nw) / self.stride)) * self.stride
            imgs = np.pad(imgs, [[0,0],[0, pz - nz], [0, ph - nh], [0, pw - nw]], 'constant',constant_values = self.pad_value)

            xx,yy,zz = np.meshgrid(np.linspace(-0.5,0.5,imgs.shape[1]//self.stride),
                                   np.linspace(-0.5,0.5,imgs.shape[2]//self.stride),
                                   np.linspace(-0.5,0.5,imgs.shape[3]//self.stride),indexing ='ij')  #3,4,5
            coord = np.concatenate([xx[np.newaxis,...], yy[np.newaxis,...],zz[np.newaxis,:]],0).astype('float32')
            imgs, nzhw = self.split_comber.split(imgs)
            coord2, nzhw2 = self.split_comber.split(coord,
                                                   side_len = self.split_comber.side_len//self.stride,
                                                   max_stride = self.split_comber.max_stride//self.stride,
                                                   margin = self.split_comber.margin//self.stride)  #6,7,8
            assert np.all(nzhw==nzhw2)
            imgs = (imgs.astype(np.float32)-128)/128
            return torch.from_numpy(imgs), bboxes, torch.from_numpy(coord2), np.array(nzhw)

    def __len__(self):
        if self.phase == 'train':
            return len(self.bboxes)//(1-self.r_rand) #9
        elif self.phase =='val':
            return len(self.bboxes)
        else:
            return len(self.sample_bboxes)

.......
        start = []
        for i in range(3):
            if not isRand:
                r = target[3] // 2 #10
                s = np.floor(target[i] - r)+ 1 - bound_size
                e = np.ceil (target[i] + r)+ 1 + bound_size - crop_size[i]
            else:
                s = np.max([imgs.shape[i+1]-crop_size[i]//2,imgs.shape[i+1]//2+bound_size])
                e = np.min([crop_size[i]//2, imgs.shape[i+1]//2-bound_size]) #11,12,13,14
                target = np.array([np.nan,np.nan,np.nan,np.nan])
            if s>e:
                start.append(np.random.randint(e,s))#!
            else:
                start.append(int(target[i])-crop_size[i]//2+np.random.randint(-bound_size//2,bound_size//2)) #15,16,17

        normstart = np.array(start).astype('float32')/np.array(imgs.shape[1:])-0.5
        normsize = np.array(crop_size).astype('float32')/np.array(imgs.shape[1:])
        xx,yy,zz = np.meshgrid(np.linspace(normstart[0],normstart[0]+normsize[0],self.crop_size[0]//self.stride),
                               np.linspace(normstart[1],normstart[1]+normsize[1],self.crop_size[1]//self.stride),
                               np.linspace(normstart[2],normstart[2]+normsize[2],self.crop_size[2]//self.stride),indexing ='ij')  #18,19,20
        coord = np.concatenate([xx[np.newaxis,...], yy[np.newaxis,...],zz[np.newaxis,:]],0).astype('float32')

.....
    def __call__(self, input_size, target, bboxes, filename):
        stride = self.stride
        num_neg = self.num_neg
        th_neg = self.th_neg
        anchors = self.anchors
        th_pos = self.th_pos

        output_size = []
        for i in range(3):
            if input_size[i] % stride != 0:
                print(filename)
            # assert(input_size[i] % stride == 0)
            output_size.append(input_size[i] // stride)  #21

@SirMwan

SirMwan commented Jul 23, 2021

Also, when I try to visualize images with bounding boxes after prepare.py (preprocessing):

  1. Some images display well, that is, the image appears with bounding boxes at the nodules.
  2. But for some images only text is displayed, for example "Groundtruth (1, 291, 180, 266) (1, 4) ", without an image.
    Is the second case a problem?

@naoe1999
Author

Dear @SirMwan ,

I checked it as follows:

#1 ~ #2:
I changed the previous line as well: sizelim = config['sizelim'] // config['reso']
I don't think it is critical whether the result is a float or an int.

#3 ~ #5: OK
#6 ~ #8: OK

#9: int(len(self.bboxes) / (1 - self.r_rand)) # this works as intended without any error or warning.

#10: I didn't change this one: r = target[3] / 2. The following lines round "s" and "e" to whole numbers anyway.

#11 ~ #14: OK
#15 ~ #17: OK
#18 ~ #20: OK
#21: OK

The problem of images not displaying can have many causes. You had better look into the values in those images.
If the values do not make sense, that would be the critical reason for the training failure.
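
To illustrate the point about #9: Python's len() requires __len__ to return an int, and "//" with a float operand still returns a float. This is a minimal sketch with hypothetical class names and made-up values for the number of boxes and r_rand; it is not the repository's dataset class.

```python
class FloorDivLen:
    """Hypothetical stand-in for the dataset: __len__ uses // with a float divisor."""

    def __init__(self, n_boxes, r_rand):
        self.n_boxes = n_boxes
        self.r_rand = r_rand

    def __len__(self):
        # int // float floors the value but returns a float (14.0 here).
        return self.n_boxes // (1 - self.r_rand)


class IntLen(FloorDivLen):
    def __len__(self):
        # True division followed by int() returns a real int.
        return int(self.n_boxes / (1 - self.r_rand))


try:
    len(FloorDivLen(10, 0.3))
    floor_div_ok = True
except TypeError as err:
    floor_div_ok = False
    print("len() rejected the float:", err)

n = len(IntLen(10, 0.3))
print(n)
```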

@SirMwan

SirMwan commented Jul 23, 2021

Dear @naoe1999
Thank you very much. Now it is going well.

Let me ask one last question. The res18 model defines two MaxUnpool layers, but they are not used anywhere in the code:

self.unmaxpool1 = nn.MaxUnpool3d(kernel_size=2,stride=2)
self.unmaxpool2 = nn.MaxUnpool3d(kernel_size=2,stride=2)

Thank you very much.

@naoe1999
Author

naoe1999 commented Jul 23, 2021

Dear @SirMwan,

Congrats!!! Pleasure to hear it's going well.

As for your question about self.unmaxpool1 ..., I couldn't find where it is used either.
Instead, nn.ConvTranspose3d() is used for upsampling (in the decoder part of the U-Net structure).

@Yuke2021

I also encountered the same problem. Could you share your code with me at [email protected]? Thank you very much.
