
How do I train a very lightweight object detector? #37

Closed
offchan42 opened this issue Oct 12, 2019 · 4 comments
offchan42 commented Oct 12, 2019

1. Setting bias aside, what do you think are the major differences between your repository and the other one (https://github.com/pierluigiferrari/ssd_keras)?

For my use case, I want to train a very lightweight/fast object detection model that recognizes a single solid object, e.g. a PlayStation joystick. I tried transfer learning with the TensorFlow Object Detection API using SSDLite MobileNetV2, but it's not fast enough: the model was made big so that it can predict multiple classes. I only want to predict one class, a rigid object that won't deform or change shape at all.

That's why I'm thinking of defining a somewhat smaller MobileNetV2 and training SSD from scratch (as I don't think it's possible to reuse the weights from the bigger model), so that I can achieve faster inference on a mobile phone. Later I may convert the model to TF Lite.
For example, I want my model to run as fast as the one in this paper: https://arxiv.org/abs/1907.05047
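Since TF Lite conversion is the eventual goal, here is a minimal sketch of that step, assuming a trained `tf.keras` model; the output path and the choice of dynamic-range quantization are illustrative, not prescribed by this thread:

```python
import tensorflow as tf

def export_tflite(model, path='detector.tflite'):
    """Convert a trained tf.keras model to a TF Lite flatbuffer."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Dynamic-range quantization stores weights as int8, which shrinks the
    # file and usually speeds up CPU inference on mobile devices.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()
    with open(path, 'wb') as f:
        f.write(tflite_model)
    return path
```

On-device latency still has to be measured with the TF Lite benchmark tooling, since quantization gains vary by hardware.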

2. Which repo should I use for an easy and efficient implementation?
   @mvoelk
@offchan42 offchan42 changed the title Comparing this repo to https://github.com/pierluigiferrari/ssd_keras Which SSD repo should I use to train a very lightweight object detector? Oct 12, 2019
@offchan42 offchan42 changed the title Which SSD repo should I use to train a very lightweight object detector? How do I train a very lightweight object detector? Oct 12, 2019
mvoelk (Owner) commented Oct 12, 2019

The implementation you mentioned is probably faster, since it does the decoding with TensorFlow rather than with NumPy. It should also be possible to use the decoder layer from my implementation.
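For illustration, decoding SSD box offsets with TensorFlow ops (so it runs inside the graph, on device) can be sketched as follows. The `(cx, cy, w, h)` encoding and the variance values are the common SSD convention and an assumption here, not taken from either repo:

```python
import tensorflow as tf

def decode_boxes(pred_offsets, priors, variances=(0.1, 0.1, 0.2, 0.2)):
    """Decode predicted offsets against prior boxes, all in TF ops.

    pred_offsets, priors: (num_boxes, 4) tensors in (cx, cy, w, h) form.
    Returns boxes in corner form (xmin, ymin, xmax, ymax).
    """
    # Centers are shifted relative to the prior's size.
    cx = priors[:, 0] + pred_offsets[:, 0] * variances[0] * priors[:, 2]
    cy = priors[:, 1] + pred_offsets[:, 1] * variances[1] * priors[:, 3]
    # Width/height are predicted in log space.
    w = priors[:, 2] * tf.exp(pred_offsets[:, 2] * variances[2])
    h = priors[:, 3] * tf.exp(pred_offsets[:, 3] * variances[3])
    return tf.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
```

With zero offsets the decoded box is just the prior converted to corner form, which is a quick sanity check for any decoder variant.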

In your case, I would recommend a custom architecture. Separable convolution is a good starting point. Reducing the depth of the architecture should also make it faster, since depth cannot be parallelized. Reducing the number of features is also worth considering. Both changes may hurt detection accuracy and require training from scratch.
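A minimal sketch of such a block, assuming Keras: `SeparableConv2D` replaces a full `Conv2D` with a per-channel depthwise convolution followed by a 1x1 pointwise convolution, cutting multiply-adds and parameters. The filter counts and block structure here are illustrative:

```python
from tensorflow.keras import layers, models

def light_block(x, filters):
    # Depthwise-separable conv: spatial filtering per channel, then a
    # 1x1 pointwise conv to mix channels -- much cheaper than Conv2D.
    x = layers.SeparableConv2D(filters, 3, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    return x

inp = layers.Input(shape=(128, 128, 3))
x = light_block(inp, 32)
x = layers.MaxPooling2D(2)(x)   # halve spatial resolution
x = light_block(x, 64)
model = models.Model(inp, x)
```

Stacking a handful of such blocks, with pooling between them, gives a shallow backbone whose feature maps can feed a multibox head directly.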

Which implementation you use is up to you...

mvoelk (Owner) commented Oct 12, 2019

#21 (comment)

offchan42 (Author) commented

It seems that SSD7 from the other repo is lightweight enough that I can configure it to fit my needs. The only remaining concern is to test its speed on mobile.

mvoelk (Owner) commented Oct 19, 2019

I was curious and implemented the SSD7 model from keras_ssd7.py:

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Activation, MaxPooling2D

from ssd_model import multibox_head


def SSD7(input_shape=(128, 128, 3), num_classes=2, softmax=True):

    source_layers = []

    x = input_tensor = Input(shape=input_shape)

    # Seven Conv-BN-ELU blocks; every block's output is a source layer for
    # the multibox head. A 2x2 max pooling precedes each block after the
    # first, halving the feature map resolution each time.
    for i, (filters, kernel_size) in enumerate(
            [(32, 5), (48, 3), (64, 3), (64, 3), (48, 3), (48, 3), (32, 3)]):
        if i > 0:
            x = MaxPooling2D(pool_size=2)(x)
        x = Conv2D(filters, kernel_size, strides=1, padding='same',
                   kernel_initializer='he_normal')(x)
        x = BatchNormalization(axis=3, momentum=0.99)(x)
        x = Activation('elu')(x)
        source_layers.append(x)

    num_priors = [3] * 7
    normalizations = [-1] * 7

    output_tensor = multibox_head(source_layers, num_priors, num_classes, normalizations, softmax)
    model = Model(input_tensor, output_tensor)
    model.num_classes = num_classes

    # parameters for prior boxes
    model.image_size = input_shape[:2]
    model.source_layers = source_layers
    model.aspect_ratios = [[1, 2, 1/2]] * 7

    return model
```

That's all. ReLU and depthwise convolutions should be cheaper; regularization is done in the training notebook... If you want, add the layer names.
