Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the data set (see below for links to the project data set)
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
1. Provide a Writeup / README that includes all the rubric points and how you addressed each one. You can submit your writeup as markdown or pdf. You can use this template as a guide for writing the report. The submission includes the project code.
You're reading it! And here is a link to my project code.
1. Provide a basic summary of the data set. In the code, the analysis should be done using python, numpy and/or pandas methods rather than hardcoding results manually.
I used the pandas library to calculate summary statistics of the traffic signs data set (a short sketch of these calls is included after the class list below); they are:
- The size of the training set is 34799 traffic sign images.
- The size of the validation set is 4410 traffic sign images.
- The size of the test set is 12630 traffic sign images.
- The shape of a traffic sign image is 32x32 pixels with 3 color channels (R, G, B).
- The number of unique classes/labels in the data set is 43, according to the following extracted list:
ClassId | SignName |
---|---|
0 | Speed limit (20km/h) |
1 | Speed limit (30km/h) |
2 | Speed limit (50km/h) |
3 | Speed limit (60km/h) |
4 | Speed limit (70km/h) |
5 | Speed limit (80km/h) |
6 | End of speed limit (80km/h) |
7 | Speed limit (100km/h) |
8 | Speed limit (120km/h) |
9 | No passing |
10 | No passing for vehicles over 3.5 metric tons |
11 | Right-of-way at the next intersection |
12 | Priority road |
13 | Yield |
14 | Stop |
15 | No vehicles |
16 | Vehicles over 3.5 metric tons prohibited |
17 | No entry |
18 | General caution |
19 | Dangerous curve to the left |
20 | Dangerous curve to the right |
21 | Double curve |
22 | Bumpy road |
23 | Slippery road |
24 | Road narrows on the right |
25 | Road work |
26 | Traffic signals |
27 | Pedestrians |
28 | Children crossing |
29 | Bicycles crossing |
30 | Beware of ice/snow |
31 | Wild animals crossing |
32 | End of all speed and passing limits |
33 | Turn right ahead |
34 | Turn left ahead |
35 | Ahead only |
36 | Go straight or right |
37 | Go straight or left |
38 | Keep right |
39 | Keep left |
40 | Roundabout mandatory |
41 | End of no passing |
42 | End of no passing by vehicles over 3.5 metric tons |
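These statistics were obtained with plain numpy/pandas calls rather than hardcoded values. Here is a minimal sketch of how they can be computed; the pickle file names, dictionary keys and variable names are assumptions based on the standard project starter code, not necessarily the exact names in my notebook:

```python
# A minimal sketch of the summary statistics; file names, dictionary keys and
# variable names are assumptions based on the standard project starter code.
import pickle
import numpy as np
import pandas as pd

with open('train.p', 'rb') as f:
    train = pickle.load(f)
with open('valid.p', 'rb') as f:
    valid = pickle.load(f)
with open('test.p', 'rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

n_train = len(X_train)                     # 34799
n_validation = len(X_valid)                # 4410
n_test = len(X_test)                       # 12630
image_shape = X_train[0].shape             # (32, 32, 3)
n_classes = len(np.unique(y_train))        # 43

sign_names = pd.read_csv('signnames.csv')  # maps ClassId to SignName
print(n_train, n_validation, n_test, image_shape, n_classes)
```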
Here is an exploratory visualization of the training data set, created with the seaborn library.
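The bar chart of images per class can be produced with a few lines of seaborn; a minimal sketch, assuming the y_train labels from the loading snippet above (the exact styling in the notebook may differ):

```python
# A minimal sketch of the class-distribution plot, assuming y_train from the
# loading snippet above.
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(15, 5))
sns.countplot(x=y_train)                   # one bar per ClassId
plt.xlabel('ClassId')
plt.ylabel('Number of training images')
plt.title('Training set class distribution')
plt.show()
```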
Looking at the graph, it is possible to observe that our data set does not have a uniform number of images per class: some classes have over 2000 images (like classes 1 and 2, the 30km/h and 50km/h speed limit signs), while others have fewer than 250 images (like class 0, the 20km/h speed limit sign).
It is not entirely clear why this imbalance exists. My guess is that the traffic signs with the most images in the data set are the ones most commonly found on German roads; that would be an interesting question to investigate in the future.
Also, it is interesting to check whether, despite the different number of images per class, the training, validation and test subsets keep the same class proportions.
Apparently, the three sets do keep the same proportions for each class. It is important to note that there is no single rule of thumb for how to divide a data set into training, validation and test sets, but the most common ratios are:
- 70% train, 15% val, 15% test
- 80% train, 10% val, 10% test
- 60% train, 20% val, 20% test
So far, then, we can say that our data set is in good shape!
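For reference, here is a minimal sketch of how the per-class proportions of the three splits can be compared, again assuming the label arrays from the loading snippet above:

```python
# A minimal sketch of the per-class proportion check, assuming the label arrays
# from the loading snippet above.
import pandas as pd

proportions = pd.DataFrame({
    'train': pd.Series(y_train).value_counts(normalize=True).sort_index(),
    'valid': pd.Series(y_valid).value_counts(normalize=True).sort_index(),
    'test':  pd.Series(y_test).value_counts(normalize=True).sort_index(),
})
print(proportions)  # one row per ClassId, one column per split
```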
1. Describe how you preprocessed the image data. What techniques were chosen and why did you choose these techniques? Consider including images showing the output of each preprocessing technique. Pre-processing refers to techniques such as converting to grayscale, normalization, etc.
As a first step, I decided to convert the images to grayscale, mainly because grayscale simplifies the model and reduces the computational requirements. According to Sermanet and LeCun's paper, for this classification task the sign's color is not particularly relevant to the model's final accuracy.
Here is an example of a traffic sign image before and after grayscaling.
According to Depeursinge's Fundamentals of Texture Processing for Biomedical Image Analysis, image normalization ensures optimal comparisons across data acquisition methods and texture instances, and normalizing pixel intensities is recommended for imaging modalities that do not correspond to absolute physical quantities. With that in mind, to complete the preprocessing I also normalized all images by dividing them by 255, so that the pixel values of every image lie between 0 and 1, which helps the model converge faster.
In addition to gray-scaling and normalizing the images, I also tried equalizing them with cv2.equalizeHist to improve their contrast, but the model did not perform any better with it, so I discarded this preprocessing step to keep the pipeline faster.
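Putting the steps above together, a minimal sketch of the preprocessing pipeline could look like this (the function name is an assumption, and the commented-out line marks where the discarded histogram equalization was tried):

```python
# A minimal sketch of the preprocessing pipeline (grayscale + [0, 1] normalization).
import cv2
import numpy as np

def preprocess(images):
    processed = []
    for img in images:                                  # img: 32x32x3 uint8 RGB
        gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)    # -> 32x32 grayscale
        # gray = cv2.equalizeHist(gray)                 # tried, but did not help
        processed.append(gray)
    processed = np.array(processed, dtype=np.float32) / 255.0
    return processed[..., np.newaxis]                   # add channel dim -> 32x32x1

X_train_gray = preprocess(X_train)
```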
Also, still according to Sermanet and LeCun, synthetically adding transformed images to the original training set yields learning that is more robust to potential deformations in the test set. So I experimented with perturbing the samples in position ([-2, 2] pixels), scale ([0.9, 1.1] ratio) and rotation ([-15, +15] degrees) using the keras.preprocessing.image library:
```python
from keras.preprocessing.image import ImageDataGenerator

# Random shifts, zoom, shear and rotation applied on the fly during training;
# the shift ranges are fractions of the image width/height.
datagen = ImageDataGenerator(width_shift_range=0.2,
                             height_shift_range=0.2,
                             zoom_range=0.1,
                             shear_range=0.1,
                             rotation_range=15)
datagen.fit(X_train)
```
After augmentation, the training set contains images like the following:
2. Describe what your final model architecture looks like (including model type, layers, layer sizes, connectivity, etc.). Consider including a diagram and/or table describing the final model.
My final model consisted of the following layers (a code sketch of the same architecture follows the table):
Layer | Description |
---|---|
Input | 32x32x1 Grayscale image |
Convolution | 1x1 stride, padding = 'VALID', outputs 28x28x6 |
Activation | relu |
Max pooling | 2x2 stride, outputs 14x14x6 |
Convolution | 1x1 stride, padding = 'VALID', outputs 10x10x16 |
Activation | relu |
Max pooling | 2x2 stride, outputs 5x5x16 |
Flatten | Outputs 400 |
Fully Connected | Outputs 120 |
Activation | relu |
Dropout Layer | Keep Prob = 0.6 |
Fully Connected | Outputs 84 |
Activation | relu |
Dropout Layer | Keep Prob = 0.6 |
Fully Connected | Outputs 43 |
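The table corresponds to the classic LeNet-5 layout adapted to 43 output classes. For reference, here is a minimal TensorFlow 1.x sketch of that architecture; the 5x5 kernel sizes are inferred from the 32x32 -> 28x28 and 14x14 -> 10x10 VALID convolutions, and the variable names are assumptions rather than the project's exact code:

```python
# A minimal TF 1.x sketch of the architecture in the table above.
import tensorflow as tf

def traffic_sign_net(x, keep_prob, mu=0.0, sigma=0.1):
    # Conv 1: 5x5, VALID padding, 32x32x1 -> 28x28x6, then 2x2 max pool -> 14x14x6
    conv1_W = tf.Variable(tf.truncated_normal((5, 5, 1, 6), mean=mu, stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(6))
    conv1 = tf.nn.relu(tf.nn.conv2d(x, conv1_W, [1, 1, 1, 1], 'VALID') + conv1_b)
    conv1 = tf.nn.max_pool(conv1, [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')

    # Conv 2: 5x5, VALID padding, 14x14x6 -> 10x10x16, then 2x2 max pool -> 5x5x16
    conv2_W = tf.Variable(tf.truncated_normal((5, 5, 6, 16), mean=mu, stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(16))
    conv2 = tf.nn.relu(tf.nn.conv2d(conv1, conv2_W, [1, 1, 1, 1], 'VALID') + conv2_b)
    conv2 = tf.nn.max_pool(conv2, [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')

    # Flatten: 5x5x16 -> 400
    flat = tf.reshape(conv2, [-1, 400])

    # Fully connected 400 -> 120 -> 84 -> 43, with dropout after the first two
    fc1_W = tf.Variable(tf.truncated_normal((400, 120), mean=mu, stddev=sigma))
    fc1_b = tf.Variable(tf.zeros(120))
    fc1 = tf.nn.dropout(tf.nn.relu(tf.matmul(flat, fc1_W) + fc1_b), keep_prob)

    fc2_W = tf.Variable(tf.truncated_normal((120, 84), mean=mu, stddev=sigma))
    fc2_b = tf.Variable(tf.zeros(84))
    fc2 = tf.nn.dropout(tf.nn.relu(tf.matmul(fc1, fc2_W) + fc2_b), keep_prob)

    fc3_W = tf.Variable(tf.truncated_normal((84, 43), mean=mu, stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(43))
    return tf.matmul(fc2, fc3_W) + fc3_b                 # logits for the 43 classes
```

During training, keep_prob is fed as 0.6 (matching the table), and as 1.0 during evaluation so that no units are dropped.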
3. Describe how you trained your model. The discussion can include the type of optimizer, the batch size, number of epochs and any hyperparameters such as learning rate.
To train the model, I used the following settings (a rough sketch of the corresponding TensorFlow setup follows the list):
- 150 Epochs;
- Batch size of 128;
- Learning rate of 0.0009;
- Adam Optimizer;
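In TensorFlow 1.x terms, these settings map roughly onto the following sketch; the placeholder and operation names are assumptions based on the course's LeNet lab, not necessarily the exact names in my notebook:

```python
# A rough TF 1.x sketch of the training setup above; names are assumptions.
import tensorflow as tf

EPOCHS = 150
BATCH_SIZE = 128
rate = 0.0009

x = tf.placeholder(tf.float32, (None, 32, 32, 1))
y = tf.placeholder(tf.int32, (None,))
keep_prob = tf.placeholder(tf.float32)
one_hot_y = tf.one_hot(y, 43)

logits = traffic_sign_net(x, keep_prob)   # from the architecture sketch above
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=one_hot_y,
                                                           logits=logits)
loss_operation = tf.reduce_mean(cross_entropy)
training_operation = tf.train.AdamOptimizer(learning_rate=rate).minimize(loss_operation)

correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```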
4. Describe the approach taken for finding a solution and getting the validation set accuracy to be at least 0.93. Include in the discussion the results on the training, validation and test sets and where in the code these were calculated. Your approach may have been an iterative process, in which case, outline the steps you took to get to the final solution and why you chose those steps. Perhaps your solution involved an already well known implementation or architecture. In this case, discuss why you think the architecture is suitable for the current problem.
If an iterative approach was chosen:
- What was the first architecture that was tried and why was it chosen? What were some problems with the initial architecture? How was the architecture adjusted and why was it adjusted? Which parameters were tuned? How were they adjusted and why?
Initially, the chosen architecture was the LeNet-5 CNN with 100 epochs, a batch size of 128, a learning rate of 0.01 and the Adam optimizer. However, the validation accuracy stayed well below our target of at least 93%. Here is the train of thought I followed to adjust my architecture:
- Lowered the learning rate. Looking at the generated loss graph, it was possible to see that at some point the loss was not converging: it kept bouncing between high values, much like the right-most graph in the illustration below (taken from this Kaggle Learning Rate article). So, to get closer to an optimal learning rate, I tried lr = 0.0009, and the loss started converging at a much better rate.
- Increased the number of epochs. Initially I used 100 epochs; however, the model's loss and accuracy were still improving when the 100th epoch was reached, so I increased the limit to 150 epochs to give the model more time to train and to reach more stable loss and accuracy values by the end of training.
- Added dropout layers to prevent overfitting. Analyzing the first model's accuracy plots, I observed that the validation accuracy was consistently lower than the training accuracy, a sign that the model was overfitting. Among the main regularization techniques used to prevent overfitting in neural networks, I went for adding two dropout layers to the architecture. According to Abhinav Sagar in "Neural Networks, Overfitting", a dropout layer randomly drops neurons from the network during training in each iteration. Dropping different sets of neurons is equivalent to training different neural networks; the different networks overfit in different ways, so the net effect of dropout is to reduce overfitting.
Finally, here are the final loss and accuracy results (recorded using sess.run(loss_operation) and sess.run(accuracy_operation)):
Test Loss = 0.2850, Test Accuracy = 94.173%
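A rough sketch of how such a batched test evaluation could look, assuming the placeholders and operations from the training sketch above:

```python
# A rough sketch of a batched evaluation over a data set; it assumes the
# placeholders and operations defined in the training sketch above.
def evaluate(X_data, y_data, sess, batch_size=128):
    total_loss, total_acc = 0.0, 0.0
    n = len(X_data)
    for offset in range(0, n, batch_size):
        batch_x = X_data[offset:offset + batch_size]
        batch_y = y_data[offset:offset + batch_size]
        loss, acc = sess.run([loss_operation, accuracy_operation],
                             feed_dict={x: batch_x, y: batch_y, keep_prob: 1.0})
        total_loss += loss * len(batch_x)
        total_acc += acc * len(batch_x)
    return total_loss / n, total_acc / n
```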
1. Choose five German traffic signs found on the web and provide them in the report. For each image, discuss what quality or qualities might be difficult to classify.
Here are five German traffic signs that I found on the web:
The first image might be difficult to classify because of its orientation, since the sign is not as flat and front-facing as the ones in the dataset. The model may also have some trouble with the second and fourth images, since they carry a watermark that could confuse the classification.
When preprocessing the images (resizing them to 32x32 pixels, converting to grayscale and normalizing them), all of them inevitably lost some resolution and detail. In addition, the third and fifth images seem to have a lot of noisy pixels, which can also confuse the model's predictions.
2. Discuss the model's predictions on these new traffic signs and compare the results to predicting on the test set. At a minimum, discuss what the predictions were, the accuracy on these new predictions, and compare the accuracy to the accuracy on the test set (OPTIONAL: Discuss the results in more detail as described in the "Stand Out Suggestions" part of the rubric).
Here are the results of the prediction:
Image | Prediction |
---|---|
Speed limit (50km/h) | Speed limit (50km/h) |
Yield | Yield |
Right-of-way at the next intersection | Right-of-way at the next intersection |
Go straight or right | Go straight or right |
Speed limit (30km/h) | Speed limit (30km/h) |
The model was able to correctly guess 5 of the 5 traffic signs, which gives an accuracy of 100%! This compares favorably to the accuracy of 94.173% on the test set.
3. Describe how certain the model is when predicting on each of the five new images by looking at the softmax probabilities for each prediction. Provide the top 5 softmax probabilities for each image along with the sign type of each probability.
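The top-5 probabilities below were extracted with tf.nn.top_k applied to the softmax of the logits. A minimal sketch, assuming x, keep_prob and logits from the earlier sketches, with X_new (the five preprocessed web images) and the checkpoint path as assumptions:

```python
# A minimal sketch of extracting the top-5 softmax probabilities for the new images.
softmax = tf.nn.softmax(logits)
top5 = tf.nn.top_k(softmax, k=5)

with tf.Session() as sess:
    tf.train.Saver().restore(sess, './lenet')   # hypothetical checkpoint path
    top5_values, top5_indices = sess.run(top5, feed_dict={x: X_new, keep_prob: 1.0})
```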
For the first image, the model is relatively sure that this is a Speed limit (50km/h) sign (probability of 0.78), and the image does indeed contain a 50km/h speed limit sign. The top five softmax probabilities were:
Probability | Prediction |
---|---|
0.7756 | Speed limit (50km/h) |
0.2107 | Double curve |
0.01378 | Speed limit (30km/h) |
2.02321e-05 | Wild animals crossing |
3.81858e-06 | Road work |
For the second image, the model was 100% sure it was a Yield sign.
Probability | Prediction |
---|---|
1.0 | Yield |
0.0 | Speed limit (20km/h) |
0.0 | Speed limit (30km/h) |
0.0 | Speed limit (50km/h) |
0.0 | Speed limit (60km/h) |
For the third image, the model was likewise practically 100% sure of the correct class.
Probability | Prediction |
---|---|
1.0 | Right-of-way at the next intersection |
8.81395e-12 | Double curve |
2.17037e-13 | Beware of ice/snow |
3.21533e-15 | Priority road |
1.18125e-15 | Pedestrians |
For the fourth image, the model was less certain, but it still classified the traffic sign correctly with a probability of over 80%.
Probability | Prediction |
---|---|
0.830474 | Go straight or right |
0.0942405 | End of all speed and passing limits |
0.064576 | Traffic signals |
0.0106805 | Speed limit (20km/h) |
5.60259e-06 | Speed limit (30km/h) |
Finally, for the last image, the model again made a correct prediction with a probability of almost 100%.
Probability | Prediction |
---|---|
0.999999 | Speed limit (30km/h) |
1.42727e-06 | Speed limit (50km/h) |
2.8886e-09 | Speed limit (80km/h) |
1.53915e-11 | Speed limit (70km/h) |
1.45847e-11 | Speed limit (20km/h) |