
Error in gt_util.sample_random_batch(batch_size=32, input_size=model.image_size) #21

Open · kamae opened this issue Jun 24, 2019 · 20 comments

@kamae commented Jun 24, 2019

ssd_detectors-master\ssd_data.py in preprocess(img, size)
628 img = img.astype(np.float32)
629 mean = np.array([104,117,123])
--> 630 img -= mean[np.newaxis, np.newaxis, :]
631 return img
632
ValueError: operands could not be broadcast together with shapes (512,512) (1,1,3) (512,512)

@mvoelk (Owner) commented Jun 24, 2019

I guess your image data is grayscale with shape (512,512) or (512,512,1). I always used RGB images (e.g. shape (512,512,3)) and hard-coded the channel means for compatibility with the Caffe models.
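
A minimal workaround sketch, assuming the image was loaded as a 2-D grayscale array and OpenCV is available: expand it to three channels before calling preprocess.

import cv2
import numpy as np

# hedged sketch: make a grayscale image 3-channel so the BGR channel means broadcast
if img.ndim == 2:                            # shape (H, W)
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
elif img.ndim == 3 and img.shape[2] == 1:    # shape (H, W, 1)
    img = np.repeat(img, 3, axis=2)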

@kamae (Author) commented Jun 24, 2019 via email

@mvoelk (Owner) commented Jun 24, 2019

Okay, what you are looking for is probably in SL_predict.ipynb under 'Real world images', but with the SSD model and PriorUtil instead.

For training with your own dataset, you should write a custom parser (GTUtility), as is done in data_voc.py.
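
Roughly along the lines of data_voc.py, such a parser could look like the following sketch (attribute names like image_names, data, classes, the base class BaseGTUtility and the final init() call follow that file; parse_annotations is a hypothetical helper for your own annotation format):

import os
import numpy as np

from ssd_data import BaseGTUtility

class GTUtility(BaseGTUtility):
    """Sketch of a custom ground truth parser, modeled on data_voc.py."""

    def __init__(self, data_path):
        self.data_path = data_path
        self.image_path = os.path.join(data_path, 'images')
        self.classes = ['Background', 'MyClass']  # index 0 is background
        self.image_names = []
        self.data = []
        for image_name, boxes, labels in self.parse_annotations():
            # boxes: (N, 4) corner coordinates normalized to [0, 1]
            # labels: (N,) integer class indices
            one_hot = np.eye(len(self.classes))[labels]
            self.data.append(np.concatenate([boxes, one_hot], axis=1))
            self.image_names.append(image_name)
        self.init()  # as in data_voc.py; sets derived fields such as num_classes

    def parse_annotations(self):
        # hypothetical helper: yield (image_name, boxes, label_indices) per image
        raise NotImplementedError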

@kamae (Author) commented Jun 24, 2019 via email

@mvoelk (Owner) commented Jun 24, 2019

import numpy as np
import matplotlib.pyplot as plt
import os
import glob
import cv2

from ssd_model import SSD300, SSD512
from ssd_utils import PriorUtil
from ssd_data import preprocess
from utils.model import load_weights

%matplotlib inline

# MS COCO
from data_coco import GTUtility
gt_util = GTUtility('./data/COCO/', validation=True)

# SSD512
model = SSD512(num_classes=gt_util.num_classes)
weights_path = './models/ssd512_coco_weights_fixed.hdf5'; confidence_threshold = 0.7

load_weights(model, weights_path)
prior_util = PriorUtil(model)

# predict 
inputs = []
images = []

img_paths = glob.glob('./data/images/*.jpg')

for img_path in img_paths:
    img = cv2.imread(img_path)
    inputs.append(preprocess(img, model.image_size))
    h, w = model.image_size
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype('float32')
    img = img[:, :, (2,1,0)] # BGR to RGB
    img /= 255
    images.append(img)
    
inputs = np.asarray(inputs)

preds = model.predict(inputs, batch_size=1, verbose=1)

for i in range(len(images)):
    print(img_paths[i])
    plt.figure(figsize=[8]*2, frameon=True)
    plt.imshow(images[i])
    res = prior_util.decode(preds[i], confidence_threshold=confidence_threshold)
    prior_util.plot_results(res, classes=gt_util.classes)
    plt.axis('off')
    plt.show()

The converted caffe models may require fine tuning and the threshold was chosen more or less ad hoc.

@kamae (Author) commented Jun 24, 2019 via email

@mvoelk (Owner) commented Jun 24, 2019

prior_util.plot_results(res, classes=gt_util.classes, show_labels=True)

@kamae (Author) commented Jun 24, 2019 via email

@kamae (Author) commented Jun 27, 2019 via email

@mvoelk (Owner) commented Jun 27, 2019

I tried MobileNet V1, but I'm not sure if it is working...

from keras.models import Model
from keras.applications import MobileNet
from keras.layers import Input
from keras.layers import Activation
from keras.layers import Conv2D
from keras.layers import SeparableConv2D
from keras.layers import BatchNormalization

from ssd_model import multibox_head  # prediction head defined alongside SSD300/SSD512 in ssd_model.py

def ssd300_mobilenet_body(x):
    
    source_layers = []
    
    mobilenet = MobileNet(input_shape=(224,224,3), include_top=False, weights='imagenet')
    x = Model(inputs=mobilenet.input, outputs=mobilenet.get_layer('conv_dw_11_relu').output)(x)

    x = Conv2D(512, (1, 1), padding='same', name='conv11')(x)
    x = BatchNormalization(name='bn11')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    x = SeparableConv2D(512, (3, 3), strides=(2, 2), padding='same', name='conv12dw')(x)
    x = BatchNormalization(name='bn12dw')(x)
    x = Activation('relu')(x)
    x = Conv2D(1024, (1, 1), padding='same', name='conv12')(x)
    x = BatchNormalization(name='bn12')(x)
    x = Activation('relu')(x)
    x = SeparableConv2D(1024, (3, 3), padding='same', name='conv13dw')(x)
    x = BatchNormalization(name='bn13dw')(x)
    x = Activation('relu')(x)
    x = Conv2D(1024, (1, 1), padding='same', name='conv13')(x)
    x = BatchNormalization(name='bn13')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    x = Conv2D(256, (1, 1), padding='same', name='conv14_1')(x)
    x = BatchNormalization(name='bn14_1')(x)
    x = Activation('relu')(x)
    x = Conv2D(512, (3, 3), strides=(2, 2), padding='same', name='conv14_2')(x)
    x = BatchNormalization(name='bn14_2')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    x = Conv2D(128, (1, 1), padding='same', name='conv15_1')(x)
    x = BatchNormalization(name='bn15_1')(x)
    x = Activation('relu')(x)
    x = Conv2D(256, (3, 3), strides=(2, 2), padding='same', name='conv15_2')(x)
    x = BatchNormalization(name='bn15_2')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    x = Conv2D(128, (1, 1), padding='same', name='conv16_1')(x)
    x = BatchNormalization(name='bn16_1')(x)
    x = Activation('relu')(x)
    x = Conv2D(256, (3, 3), strides=(2, 2), padding='same', name='conv16_2')(x)
    x = BatchNormalization(name='bn16_2')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    x = Conv2D(64, (1, 1), padding='same', name='conv17_1')(x)
    x = BatchNormalization(name='bn17_1')(x)
    x = Activation('relu')(x)
    x = Conv2D(128, (3, 3), strides=(2, 2), padding='same', name='conv17_2')(x)
    x = BatchNormalization(name='bn17_2')(x)
    x = Activation('relu')(x)
    source_layers.append(x)
    
    return source_layers


def SSD300_mobile(input_shape=(300, 300, 3), num_classes=21, softmax=True):
    """SSD300 with MobileNet architecture.
    
    Based on the Keras implementation of MobileNet.
    
    # References
        https://arxiv.org/abs/1704.04861
    """
    
    x = input_tensor = Input(shape=input_shape)
    source_layers = ssd300_mobilenet_body(x)
    
    num_priors = [4, 6, 6, 6, 4, 4]
    normalizations = [20, 20, 20, 20, 20, 20]

    output_tensor = multibox_head(source_layers, num_priors, num_classes, normalizations, softmax)
    model = Model(input_tensor, output_tensor)
    model.num_classes = num_classes

    # parameters for prior boxes
    model.image_size = input_shape[:2]
    model.source_layers = source_layers
    model.aspect_ratios = [[1,2,1/2], [1,2,1/2,3,1/3], [1,2,1/2,3,1/3], [1,2,1/2,3,1/3], [1,2,1/2], [1,2,1/2]]
    model.minmax_sizes = [(30, 60), (60, 111), (111, 162), (162, 213), (213, 264), (264, 315)]
    model.steps = [8, 16, 32, 64, 100, 300]
    model.special_ssd_boxes = True
    
    return model

If you get SSD running with MobileNet V2, I would appreciate it if you could share your findings.
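
A hedged usage sketch, wiring the MobileNet variant up like the VGG-based models (gt_util refers to the COCO example above; whether the model actually trains well is, as said, untested):

from ssd_utils import PriorUtil

model = SSD300_mobile(num_classes=gt_util.num_classes)
prior_util = PriorUtil(model)  # prior boxes are derived from model.source_layers, aspect_ratios, etc.
model.summary()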

@kamae (Author) commented Jun 28, 2019 via email

@mvoelk (Owner) commented Jun 28, 2019

conv1_1 Weights 3x3x3x64 relu 64
conv1_2 Weights 3x3x64x64 relu 64
Question 1: depth changed from 3 (RGB?) to 64, but this is not explicitly written in ssd_model.py

Yes, the weights always have shape (kernel_size, kernel_size, input_channels, output_channels); the 3 is the number of input channels (BGR) as defined in SSD512.
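
A quick way to check this layout, assuming model is the SSD512 built as above:

for name in ['conv1_1', 'conv1_2']:
    w, b = model.get_layer(name).get_weights()
    print(name, w.shape, b.shape)  # e.g. conv1_1 (3, 3, 3, 64) (64,)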

The missing Conv2_1 and Conv2_2 layers in Fig. 3.5 are my mistake...

@mvoelk (Owner) commented Jul 1, 2019

The tensors at the branching point are collected in source_layers. multibox_head adds the prediction paths.
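
For example, the branching tensors and the resulting prediction tensor can be inspected like this (assuming model was built as above):

for t in model.source_layers:
    print(t.name, t.shape)    # feature maps that feed the multibox head
print(model.output.shape)     # concatenated predictions produced by multibox_head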

@kamae (Author) commented Jul 13, 2019 via email

@mvoelk (Owner) commented Jul 13, 2019

Am I right?

Yes

SegLink is actually not intended for the detection of curved text instances. Curved text would require a custom encoding and decoding procedure, as well as a different representation in the GTUtility and rectification before the recognition stage. It should also work to just write a new decoder and use it with the SynthText models, but I do not have the time to implement this. arXiv:1807.01544 is probably the approach that comes closest to this idea.

If you just need a custom parser for a dataset with oriented bounding boxes, see #12...

@kamae (Author) commented Jul 14, 2019 via email

@mvoelk (Owner) commented Jul 15, 2019

I have no idea how to access the email attachments from the GitHub issues, but I hope you find the answer to your question in #1 or #8.

@kamae (Author) commented Jul 17, 2019 via email

@mvoelk (Owner) commented Jul 17, 2019

"input_height = 32\n",

@kamae (Author) commented Jul 17, 2019 via email
