Error in gt_util.sample_random_batch(batch_size=32, input_size=model.image_size) #21
Comments
I guess your image data is grayscale with shape (512,512) or (512,512,1). I always used RGB images (e.g. shape (512,512,3)) and hard coded the channel means for compatibility with caffe models. |
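A minimal sketch of one possible workaround, assuming the input really is grayscale: repeat the single channel three times before the channel means are subtracted. The helper names `ensure_three_channels` and `preprocess_sketch` are hypothetical and not part of the repository; the mean values are the ones visible in the traceback, and any resizing done by the real `preprocess` is left out.

```python
import numpy as np


def ensure_three_channels(img):
    """Return an (H, W, 3) array for grayscale or color input (hypothetical helper)."""
    img = np.asarray(img)
    if img.ndim == 2:
        # (H, W) -> (H, W, 1)
        img = img[:, :, np.newaxis]
    if img.shape[2] == 1:
        # (H, W, 1) -> (H, W, 3) by repeating the single channel
        img = np.repeat(img, 3, axis=2)
    return img


def preprocess_sketch(img):
    """Subtract per-channel means as in the traceback; resizing is omitted here."""
    img = ensure_three_channels(img).astype(np.float32)
    mean = np.array([104, 117, 123], dtype=np.float32)
    img -= mean[np.newaxis, np.newaxis, :]
    return img


gray = np.zeros((512, 512), dtype=np.uint8)
out = preprocess_sketch(gray)
print(out.shape)  # (512, 512, 3)
```

With a three-channel array the subtraction broadcasts cleanly, because the trailing dimension of the image matches the length of the mean vector.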
Markus
Thank you again for your kind assistance.
I was not loading images to predict.
I want to use the ones in ssd_detectors-master\data\images
They are boys.jpg, cafr_cat.jpg, fish-bike.jpg.
Or those in ssd_detectors-master\images
I don’t know how to load them in the
“Predict” section, before the following code:
_, inputs, images, data = gt_util.sample_random_batch(batch_size=32, input_size=model.image_size)
How can I point the notebook to those image directories?
Tune Kamae
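As a rough sketch of the first step asked about here, the image files in a folder can be collected with the standard library alone. The function name `list_image_paths` and the extension list are assumptions made for illustration; each returned path would still have to be read with an image library (for example OpenCV's `cv2.imread`) and preprocessed before being passed to the model.

```python
import os

# Assumed set of extensions; extend it if your folder holds other formats.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def list_image_paths(directory):
    """Return sorted paths of image files directly inside `directory`."""
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(os.path.join(directory, name))
    return paths


# Example location taken from the question; adjust to your checkout.
image_dir = os.path.join('data', 'images')
if os.path.isdir(image_dir):
    for path in list_image_paths(image_dir):
        print(path)
```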
|
Okay, what you are looking for is probably in For training with your own dataset, you should write a custom parser ( |
Markus
Thank you for the email.
I am running SSD_predict.ipynb to test its prediction power.
I don’t know yet how to upload images to the “Predict” section of the notebook.
Tune Kamae
|
The converted caffe models may require fine tuning and the threshold was chosen more or less ad hoc. |
Markus
Thank you for the advice.
I copied “Real World Images” and managed to run almost to the end.
What I could not figure out is where data[i] comes from in
prior_util.plot_results(res, classes=gt_util.classes, show_labels=True, gt_data=data[i])
data[i] probably holds the bounding boxes etc.
Tune Kamae
|
|
Markus
Millions of thanks!
Yes the program ran through and recognized objects!
I will try to follow other notebooks too.
Your End-to-End samples are so valuable!
Regards,
Tune Kamae
|
Markus and collaborators
Has anyone written the SSD-MobileNetV2 structure in Keras?
I am trying to train SSD-MobileNetV2 on the COCO 2017 dataset.
I would appreciate any advice or instructions,
Tune Kamae
|
I tried MobileNet V1, but I'm not sure if it is working...
If you get SSD running with MobileNet V2, I would appreciate it if you could share your findings. |
Markus,
May I bother you again?
I am trying to understand your code and am reading your thesis.
I examined your ssd512_coco_weights_fixed.hdf5 using HDFview-3.5 (for Win10)
and compared it with what is in ssd512_body(x) in ssd_model.py, as well as with Fig. 3.5 of your thesis.
What I have to understand is:
ssd_model.py
# Block 1
x = Conv2D(64, 3, strides=1, padding='same', name='conv1_1', activation='relu')(x)
x = Conv2D(64, 3, strides=1, padding='same', name='conv1_2', activation='relu')(x)
x = MaxPool2D(pool_size=2, strides=2, padding='same', name='pool1')(x)
# Block 2
x = Conv2D(128, 3, strides=1, padding='same', name='conv2_1', activation='relu')(x)
x = Conv2D(128, 3, strides=1, padding='same', name='conv2_2', activation='relu')(x)
x = MaxPool2D(pool_size=2, strides=2, padding='same', name='pool2')(x)
HDFview
conv1_1 Weights 3x3x3x64 relu 64
conv1_2 Weights 3x3x64x64 relu 64
Question 1: The depth changes from 3 (RGB?) to 64, but this is not written explicitly in ssd_model.py under # Block 1.
Max pool 2D
Conv2_1 Weights 3x3x64x128 relu 128
Conv2_2 Weights 3x3x128x128 relu 128
Question 2: Again, where in ssd_model.py is this change from 64 to 128 specified?
In Fig.3.5 these dimensions do not match. Why?
Another, bigger challenge for me is to find out where the 38x38x512 branch (multi-box?) and the branches that follow are specified in your code.
Apologies for asking too many questions.
Thanking you in advance,
Tune Kamae
|
Yes, the weights have always shape The missing |
The tensors at the branching point are collected in |
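To make the shape question concrete, here is a small sketch in plain Python. It relies on the fact that a Keras Conv2D kernel has shape (kernel_h, kernel_w, input_channels, filters), where the input channel count is inferred from the incoming tensor rather than written in the layer call. The helper names are hypothetical. The 300-pixel input and the three stride-2 'same' poolings in the second part are assumptions chosen only to show one way a 38x38 map can arise.

```python
import math


def conv_kernel_shape(kernel_size, in_channels, filters):
    """Kernel shape of a square 2D convolution: (k, k, in_channels, filters)."""
    return (kernel_size, kernel_size, in_channels, filters)


def same_output_size(size, stride):
    """Spatial size after a layer with padding='same' and the given stride."""
    return math.ceil(size / stride)


# (name, filters) for the two blocks quoted from ssd_model.py
layers = [('conv1_1', 64), ('conv1_2', 64), ('conv2_1', 128), ('conv2_2', 128)]

channels = 3  # RGB input
shapes = {}
for name, filters in layers:
    # The input depth is whatever the previous layer produced.
    shapes[name] = conv_kernel_shape(3, channels, filters)
    channels = filters
    print(name, shapes[name])

# Assumed example: 300-pixel input, three stride-2 'same' poolings.
size = 300
sizes = [size]
for _ in range(3):
    size = same_output_size(size, 2)
    sizes.append(size)
print(sizes)  # [300, 150, 75, 38]
```

The printed kernel shapes agree with the HDFView listing quoted above: the third dimension of each kernel is simply the filter count of the layer before it.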
Markus
I am now beginning to reproduce your SL_predict.ipynb, which is set to use the SynthText dataset by default. The dataset is too big for me (41 GB), and I could not download it.
I would guess I don’t need the dataset if I use your pre-trained /201809231008_sl512_synthtext/weights.002.h5.
Am I right?
Another question is how to modify it so that I can use Total-Text (Ch’ng and Chan 2017).
That dataset seems to be closer to what I need for blind people.
Thank you for assistance.
Tune Kamae
|
Yes SegLink is actually not intended for the detection of curved text instances. Curved text would require custom encoding and decoding procedure, as well as another representation in the If you just need a custom parser for a dataset with oriented bounding boxes, #12... |
Markus
I am trying to run SL_predict.ipynb.
I ran into an error early on.
Below are two snapshots of my screen. Do you see any problem with the way I am running it?
It seems that 'data/SynthText/gt.mat' is needed.
Thank you again for your kind help.
Tune Kamae
|
Markus
Thank you for your assistance.
I managed to run SL_end2end_predict.ipynb (SL512,) and find texts in photo images.
One error I got is:
words = crop_words(img, np.clip(boxes/512,0,1), input_height, width=input_width, grayscale=True)
NameError: name 'input_height' is not defined
What value do you recommend for 'input_height'?
Eventually I would like to use video or phone-camera inputs.
Tune Kamae
|
ssd_detectors/SL_end2end_predict.ipynb Line 109 in df70980
|
Markus
Thank you. Now I can detect and recognize roman letters using SL_end2end_predict.ipynb.
I am trying to figure out a way to save the small image crops that contain characters and send them to a Google Cloud service to recognize non-Roman letters.
Do you have any suggestions?
Tune Kamae
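One possible starting point for the cropping step, sketched with NumPy only. It assumes axis-aligned boxes given as normalized (xmin, ymin, xmax, ymax) rows, which is a simplification, since SegLink produces oriented boxes. The function name `crop_boxes` is hypothetical and not part of the repository. Each returned crop could then be written to disk with an image library and uploaded to whatever recognition service is used.

```python
import numpy as np


def crop_boxes(img, boxes):
    """Crop axis-aligned boxes given as normalized (xmin, ymin, xmax, ymax) rows."""
    h, w = img.shape[:2]
    crops = []
    # Clip to the image area first, so boxes reaching outside are truncated.
    for xmin, ymin, xmax, ymax in np.clip(boxes, 0.0, 1.0):
        x0, x1 = int(round(xmin * w)), int(round(xmax * w))
        y0, y1 = int(round(ymin * h)), int(round(ymax * h))
        # Skip degenerate boxes that would produce an empty crop.
        if x1 > x0 and y1 > y0:
            crops.append(img[y0:y1, x0:x1].copy())
    return crops


img = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([
    [0.1, 0.2, 0.5, 0.6],   # fully inside the image
    [0.9, 0.9, 1.2, 1.5],   # partly outside, gets clipped
    [0.5, 0.5, 0.5, 0.5],   # zero area, gets skipped
])
crops = crop_boxes(img, boxes)
print([c.shape for c in crops])
```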
|
ssd_detectors-master\ssd_data.py in preprocess(img, size)
628 img = img.astype(np.float32)
629 mean = np.array([104,117,123])
--> 630 img -= mean[np.newaxis, np.newaxis, :]
631 return img
632
ValueError: operands could not be broadcast together with shapes (512,512) (1,1,3) (512,512)