#!/usr/bin/env python
# coding: utf-8
# # Visual Odometry for Localization in Autonomous Driving
#
# Welcome to the assignment for Module 2: Visual Features - Detection, Description and Matching. In this assignment, you will practice using the material you have learned to estimate an autonomous vehicle's trajectory from images taken with a monocular camera mounted on the vehicle.
#
#
# **In this assignment, you will:**
# - Extract features from the photographs taken with a camera mounted on the vehicle.
# - Use the extracted features to find matches between the features in different photographs.
# - Use the found matches to estimate the camera motion between subsequent photographs.
# - Use the estimated camera motion to build the vehicle trajectory.
#
# For most exercises, you are provided with a suggested outline. You are encouraged to diverge from the outline if you think there is a better, more efficient way to solve a problem.
#
# You are only allowed to use the packages loaded below and the custom functions explained in the notebook. Run the cell below to import the required packages:
# In[1]:
import sys

import numpy as np
import cv2
from matplotlib import pyplot as plt
from m2bk import *
get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('load_ext', 'autoreload')
get_ipython().run_line_magic('autoreload', '2')
np.random.seed(1)
np.set_printoptions(threshold=sys.maxsize)  # threshold=np.nan is rejected by newer NumPy releases
# ## 0 - Loading and Visualizing the Data
# We provide you with a convenient dataset handler class to read and iterate through samples taken from the CARLA simulator. Run the following code to create a dataset handler object.
# In[2]:
dataset_handler = DatasetHandler()
# The dataset handler contains 52 data frames. Each frame contains an RGB image and a depth map taken with the setup on the vehicle, as well as a grayscale version of the RGB image which will be used for computation. The camera calibration matrix K is also provided in the dataset handler.
#
# Upon creation of the dataset handler object, all the frames are automatically read and loaded. The frame content can be accessed through the `images`, `images_rgb`, and `depth_maps` attributes of the dataset handler object along with the index of the requested frame. The example below shows how to access the images (grayscale), RGB images (3-channel color), depth maps, and the camera calibration matrix.
#
# **Note (Depth Maps)**: The maximum depth value is 1000. This value indicates that the selected pixel is at least 1000 m (1 km) away from the camera, but its exact distance is unknown. Keeping such points in the later trajectory estimation may reduce trajectory precision.
# In[3]:
image = dataset_handler.images[0]
plt.figure(figsize=(8, 6), dpi=100)
plt.imshow(image, cmap='gray')
# In[4]:
image_rgb = dataset_handler.images_rgb[0]
plt.figure(figsize=(8, 6), dpi=100)
plt.imshow(image_rgb)
# In[5]:
i = 0
depth = dataset_handler.depth_maps[i]
plt.figure(figsize=(8, 6), dpi=100)
plt.imshow(depth, cmap='jet')
# In[6]:
print("Depth map shape: {0}".format(depth.shape))
v, u = depth.shape
depth_val = depth[v-1, u-1]
print("Depth value of the very bottom-right pixel of depth map {0} is {1:0.3f}".format(i, depth_val))
# In[7]:
dataset_handler.k
# In order to access an arbitrary frame, use its image index as shown in the examples below. Make sure the index is within the number of frames in the dataset; the number of frames can be read from the `num_frames` attribute.
# In[8]:
# Number of frames in the dataset
print(dataset_handler.num_frames)
# In[9]:
i = 30
image = dataset_handler.images[i]
plt.figure(figsize=(8, 6), dpi=100)
plt.imshow(image, cmap='gray')
# ## 1 - Feature Extraction
#
# ### 1.1 - Extracting Features from an Image
#
# **Task**: Implement feature extraction from a single image. You can use any feature descriptor of your choice covered in the lectures, ORB for example.
#
#
# Note 1: Make sure you understand the structure of the keypoint descriptor object; this will be very useful for your further tasks. You might find [OpenCV: Keypoint Class Description](https://docs.opencv.org/3.4.3/d2/d29/classcv_1_1KeyPoint.html) handy.
#
# Note 2: Make sure you understand the image coordinate system, namely the origin location and axis directions.
#
# Note 3: We provide you with a function to visualize the detected features. Run the last 2 cells in section 1.1 to view the result.
#
# ***Optional***: Try to extract features with different descriptors such as SIFT, ORB, SURF and BRIEF. You can also try using detectors such as Harris corners or FAST and pairing them with a descriptor. Lastly, try changing the parameters of the algorithms. Do you see differences between the various approaches?
# You might find this link useful: [OpenCV:Feature Detection and Description](https://docs.opencv.org/3.4.3/db/d27/tutorial_py_table_of_contents_feature2d.html).
# In[10]:
def extract_features(image):
"""
Find keypoints and descriptors for the image
Arguments:
image -- a grayscale image
Returns:
kp -- list of the extracted keypoints (features) in an image
des -- list of the keypoint descriptors in an image
"""
### START CODE HERE ###
"""
# Initiate ORB detector
orb = cv.ORB_create()
# find the keypoints with ORB
kp = orb.detect(image,None)
# compute the descriptors with ORB
kp, des = orb.compute(image, kp)
# draw only keypoints location,not size and orientation
img2 = cv.drawKeypoints(image, kp, None, color=(0,255,0), flags=0)
plt.imshow(img2), plt.show()
"""
surf = cv2.xfeatures2d.SURF_create(500)
kp, des = surf.detectAndCompute(image,None)
"""
# Initiate FAST detector
star = cv.xfeatures2d.StarDetector_create()
# Initiate BRIEF extractor
brief = cv.xfeatures2d.BriefDescriptorExtractor_create()
# find the keypoints with STAR
kp = star.detect(image,None)
# compute the descriptors with BRIEF
kp, des = brief.compute(image, kp)
print("BRIEF descriptor size= ", brief.descriptorSize() )
"""
### END CODE HERE ###
return kp, des
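# The SURF call above needs an OpenCV build with the contrib xfeatures2d module. As a
# rough sketch (an alternative, not the notebook's reference solution), a plain-OpenCV
# fallback could use ORB; note that ORB yields binary descriptors, so a Hamming-distance
# matcher is the natural pairing in the matching step.
def extract_features_orb(image):
    """Sketch of an ORB-based extractor usable when xfeatures2d is unavailable."""
    orb = cv2.ORB_create(nfeatures=1500)
    kp, des = orb.detectAndCompute(image, None)
    return kp, des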
# In[11]:
i = 0
image = dataset_handler.images[i]
kp, des = extract_features(image)
print("Number of features detected in frame {0}: {1}\n".format(i, len(kp)))
print("Coordinates of the first keypoint in frame {0}: {1}".format(i, str(kp[0].pt)))
# In[12]:
def visualize_features(image, kp):
"""
Visualize extracted features in the image
Arguments:
image -- a grayscale image
kp -- list of the extracted keypoints
Returns:
"""
display = cv2.drawKeypoints(image, kp, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
#display = cv2.drawKeypoints(image, kp, None)
plt.imshow(display)
# In[13]:
# Optional: visualizing and experimenting with various feature descriptors
i = 0
image = dataset_handler.images_rgb[i]
visualize_features(image, kp)
# ### 1.2 - Extracting Features from Each Image in the Dataset
#
# **Task**: Implement feature extraction for each image in the dataset with the function you wrote in the above section.
#
# **Note**: If you do not remember how to pass functions as arguments, make sure to brush up on this topic. This lecture on [Passing Functions as Arguments](https://www.coursera.org/lecture/program-code/passing-functions-as-arguments-hnmqD) might be helpful.
# In[14]:
def extract_features_dataset(images, extract_features_function):
"""
Find keypoints and descriptors for each image in the dataset
Arguments:
images -- a list of grayscale images
extract_features_function -- a function which finds features (keypoints and descriptors) for an image
Returns:
kp_list -- a list of keypoints for each image in images
des_list -- a list of descriptors for each image in images
"""
kp_list = []
des_list = []
    ### START CODE HERE ###
    for image in images:
        kp, des = extract_features_function(image)
        kp_list.append(kp)
        des_list.append(des)
    ### END CODE HERE ###
return kp_list, des_list
# In[15]:
images = dataset_handler.images
kp_list, des_list = extract_features_dataset(images, extract_features)
i = 0
print("Number of features detected in frame {0}: {1}".format(i, len(kp_list[i])))
print("Coordinates of the first keypoint in frame {0}: {1}\n".format(i, str(kp_list[i][0].pt)))
# Remember that the lengths of the returned lists should be the same as the length of the image array
print("Length of images array: {0}".format(len(images)))
# ## 2 - Feature Matching
#
# The next step after extracting the features in each image is to match the features from subsequent frames. This is what needs to be done in this section.
#
# ### 2.1 - Matching Features from a Pair of Subsequent Frames
#
# **Task**: Implement feature matching for a pair of images. You can use any feature matching algorithm of your choice covered in the lectures, Brute Force Matching or FLANN based Matching for example.
#
# ***Optional 1***: Implement match filtering by thresholding the distance between the best matches. This might be useful for improving your overall trajectory estimation results. Recall that you have the option of specifying the number of best matches to be returned by the matcher.
#
# We have provided a visualization of the found matches. Do all the matches look legitimate to you? Do you think match filtering can improve the situation?
# In[36]:
def match_features(des1, des2):
"""
Match features from two images
Arguments:
des1 -- list of the keypoint descriptors in the first image
des2 -- list of the keypoint descriptors in the second image
Returns:
match -- list of matched features from two images. Each match[i] is k or less matches for the same query descriptor
"""
### START CODE HERE ###
    # Alternative (commented out): brute-force matching
    # bf = cv2.BFMatcher(cv2.NORM_L1, crossCheck=False)     # use cv2.NORM_HAMMING for binary descriptors
    # matches = bf.match(des1, des2)
    # matches = sorted(matches, key=lambda x: x.distance)   # sort by descriptor distance

    # Chosen approach: FLANN-based k-nearest-neighbour matching (k = 2)
    FLANN_INDEX_KDTREE = 0
    index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
    search_params = dict(checks=50)  # or pass an empty dictionary
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(np.asarray(des1, np.float32), np.asarray(des2, np.float32), k=2)
### END CODE HERE ###
return matches
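# A hedged alternative (not used below): brute-force k-NN matching produces the same
# two-nearest-neighbour structure expected by the ratio-test filter and avoids FLANN's
# float32 conversion. NORM_L2 suits SURF/SIFT descriptors; binary descriptors (ORB,
# BRIEF) would use NORM_HAMMING instead.
def match_features_bf(des1, des2):
    """Sketch of a brute-force k-NN matcher for float descriptors."""
    bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=False)
    return bf.knnMatch(des1, des2, k=2)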
# In[37]:
i = 0
des1 = des_list[i]
des2 = des_list[i+1]
match = match_features(des1, des2)
print("Number of features matched in frames {0} and {1}: {2}".format(i, i+1, len(match)))
# Remember that a matcher finds the best matches for EACH descriptor from a query set
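# A quick look at the k-NN structure (illustrative only): each entry of `match` holds the
# two nearest train descriptors for one query descriptor, ordered by ascending distance.
best, second_best = match[0]
print("Best match distance: {0:0.2f}, second-best: {1:0.2f}".format(best.distance, second_best.distance))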
# In[49]:
# Optional
def filter_matches_distance(match, dist_threshold):
"""
Filter matched features from two images by distance between the best matches
Arguments:
match -- list of matched features from two images
dist_threshold -- maximum allowed relative distance between the best matches, (0.0, 1.0)
Returns:
filtered_match -- list of good matches, satisfying the distance threshold
"""
filtered_match = []
### START CODE HERE ###
"""
for i,(m,n) in enumerate(match):
if m.distance < dist_threshold*n.distance:
filtered_match.append(m)
"""
for m,n in match:
if m.distance < dist_threshold*n.distance:
filtered_match.append(m)
### END CODE HERE ###
return filtered_match
# In[50]:
# Optional
i = 0
des1 = des_list[i]
des2 = des_list[i+1]
match = match_features(des1, des2)
dist_threshold = 0.6  # slightly better results may be obtained with 0.7
filtered_match = filter_matches_distance(match, dist_threshold)
print("Number of features matched in frames {0} and {1} after filtering by distance: {2}".format(i, i+1, len(filtered_match)))
# In[51]:
def visualize_matches(image1, kp1, image2, kp2, match):
"""
Visualize corresponding matches in two images
Arguments:
image1 -- the first image in a matched image pair
kp1 -- list of the keypoints in the first image
image2 -- the second image in a matched image pair
kp2 -- list of the keypoints in the second image
match -- list of matched features from the pair of images
Returns:
image_matches -- an image showing the corresponding matches on both image1 and image2 or None if you don't use this function
"""
    # Use cv2.drawMatches instead when the match list has been flattened by distance filtering:
    # image_matches = cv2.drawMatches(image1, kp1, image2, kp2, match, None, flags=2)
    image_matches = cv2.drawMatchesKnn(image1, kp1, image2, kp2, match, None, flags=2)
plt.figure(figsize=(16, 6), dpi=100)
plt.imshow(image_matches)
# In[52]:
# Visualize the first n matches; set n to None to view all matches
# set filtering to True if using match filtering, otherwise set to False
n = 20
filtering = False
i = 0
image1 = dataset_handler.images[i]
image2 = dataset_handler.images[i+1]
kp1 = kp_list[i]
kp2 = kp_list[i+1]
des1 = des_list[i]
des2 = des_list[i+1]
match = match_features(des1, des2)
if filtering:
dist_threshold = 0.6
match = filter_matches_distance(match, dist_threshold)
image_matches = visualize_matches(image1, kp1, image2, kp2, match[:n])
print("Number of features matched in frames {0} and {1}: {2}".format(i, i+1, len(match)))
# ### 2.2 - Matching Features in Each Subsequent Image Pair in the Dataset
#
# **Task**: Implement feature matching for each subsequent image pair in the dataset with the function you wrote in the above section.
#
# ***Optional***: Implement match filtering by thresholding the distance for each subsequent image pair in the dataset with the function you wrote in the above section.
# In[53]:
def match_features_dataset(images, des_list, match_features):
"""
Match features for each subsequent image pair in the dataset
    Arguments:
    images -- a list of grayscale images in the dataset
    des_list -- a list of descriptors for each image in the dataset
    match_features -- a function which matches features between a pair of images
Returns:
matches -- list of matches for each subsequent image pair in the dataset.
Each matches[i] is a list of matched features from images i and i + 1
"""
matches = []
### START CODE HERE ###
for i in range(len(images)-1):
match = match_features(des_list[i], des_list[i+1])
matches.append(match)
### END CODE HERE ###
return matches
# In[54]:
matches = match_features_dataset(images, des_list, match_features)
i = 0
print("Number of features matched in frames {0} and {1}: {2}".format(i, i+1, len(matches[i])))
# In[55]:
# Optional
def filter_matches_dataset(filter_matches_distance, matches, dist_threshold):
"""
Filter matched features by distance for each subsequent image pair in the dataset
Arguments:
filter_matches_distance -- a function which filters matched features from two images by distance between the best matches
matches -- list of matches for each subsequent image pair in the dataset.
Each matches[i] is a list of matched features from images i and i + 1
dist_threshold -- maximum allowed relative distance between the best matches, (0.0, 1.0)
Returns:
filtered_matches -- list of good matches for each subsequent image pair in the dataset.
                        Each filtered_matches[i] is a list of good matches, satisfying the distance threshold
"""
filtered_matches = []
### START CODE HERE ###
for m in matches:
match = filter_matches_distance(m, dist_threshold)
filtered_matches.append(match)
### END CODE HERE ###
return filtered_matches
# In[56]:
# Optional
dist_threshold = 0.6
filtered_matches = filter_matches_dataset(filter_matches_distance, matches, dist_threshold)
if len(filtered_matches) > 0:
# Make sure that this variable is set to True if you want to use filtered matches further in your assignment
is_main_filtered_m = True
if is_main_filtered_m:
matches = filtered_matches
i = 2
print("Number of filtered matches in frames {0} and {1}: {2}".format(i, i+1, len(filtered_matches[i])))
# ## 3 - Trajectory Estimation
#
# At this point you have everything to perform visual odometry for the autonomous vehicle. In this section you will incrementally estimate the pose of the vehicle by examining the changes that motion induces on the images of its onboard camera.
#
# ### 3.1 - Estimating Camera Motion between a Pair of Images
#
# **Task**: Implement camera motion estimation from a pair of images. You can use the motion estimation algorithm covered in the lecture materials, namely Perspective-n-Point (PnP), as well as Essential Matrix Decomposition.
#
# - If you decide to use PnP, you will need the depth maps of the frames, which are provided with the dataset handler. Check out Section 0 of this assignment to recall how to access them. As this method has been covered in the course, review the lecture materials if need be.
# - If you decide to use Essential Matrix Decomposition, more information about this method can be found in [Wikipedia: Determining R and t from E](https://en.wikipedia.org/wiki/Essential_matrix).
#
# More information on both approaches implementation can be found in [OpenCV: Camera Calibration and 3D Reconstruction](https://docs.opencv.org/3.4.3/d9/d0c/group__calib3d.html). Specifically, you might be interested in _Detailed Description_ section of [OpenCV: Camera Calibration and 3D Reconstruction](https://docs.opencv.org/3.4.3/d9/d0c/group__calib3d.html) as it explains the connection between the 3D world coordinate system and the 2D image coordinate system.
#
#
# ***Optional***: Implement camera motion estimation with PnP, PnP with RANSAC and Essential Matrix Decomposition. Check how filtering matches by distance changes the estimated camera movement. Do you see differences between the various approaches?
# In[57]:
def estimate_motion(match, kp1, kp2, k, depth1=None):
"""
Estimate camera motion from a pair of subsequent image frames
Arguments:
match -- list of matched features from the pair of images
kp1 -- list of the keypoints in the first image
kp2 -- list of the keypoints in the second image
k -- camera calibration matrix
Optional arguments:
depth1 -- a depth map of the first frame. This argument is not needed if you use Essential Matrix Decomposition
Returns:
rmat -- recovered 3x3 rotation numpy matrix
tvec -- recovered 3x1 translation numpy vector
image1_points -- a list of selected match coordinates in the first image. image1_points[i] = [u, v], where u and v are
coordinates of the i-th match in the image coordinate system
    image2_points -- a list of selected match coordinates in the second image. image2_points[i] = [u, v], where u and v are
                     coordinates of the i-th match in the image coordinate system
"""
rmat = np.eye(3)
tvec = np.zeros((3, 1))
image1_points = []
image2_points = []
### START CODE HERE ###
    # For each match: m.queryIdx indexes the keypoint/descriptor in the first image (kp1),
    # m.trainIdx indexes the keypoint/descriptor in the second image (kp2), and m.distance
    # is the descriptor distance (lower means a better match).
    for m in match:
        query_idx = m.queryIdx
        train_idx = m.trainIdx
        p1x, p1y = kp1[query_idx].pt
        image1_points.append([p1x, p1y])
        p2x, p2y = kp2[train_idx].pt
        image2_points.append([p2x, p2y])

    # Essential Matrix Decomposition: estimate E from the 2D-2D correspondences, then
    # recover the relative rotation and translation (translation is known only up to scale).
    E, mask = cv2.findEssentialMat(np.array(image1_points), np.array(image2_points), k)
    retval, rmat, tvec, mask = cv2.recoverPose(E, np.array(image1_points), np.array(image2_points), k)

    # Alternative (commented out): cv2.solvePnPRansac with 3D points taken from the depth
    # map is expected to give better results; see the sketch after this function.
    # _, rvec, tvec, inliers = cv2.solvePnPRansac(object_points, image_points, k, None)
### END CODE HERE ###
return rmat, tvec, image1_points, image2_points
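# A rough sketch of the optional PnP route mentioned above (an assumption about one
# reasonable implementation, not the assignment's reference solution): back-project the
# first-frame keypoints to 3D using the depth map and the calibration matrix, then solve
# for the motion with RANSAC. The 1000 m depth cap from Section 0 is used to skip
# unreliable points.
def estimate_motion_pnp(match, kp1, kp2, k, depth1, max_depth=1000.0):
    """Sketch: camera motion from 3D-2D correspondences via cv2.solvePnPRansac."""
    object_points = []
    image_points = []
    f_x, f_y = k[0, 0], k[1, 1]
    c_x, c_y = k[0, 2], k[1, 2]
    for m in match:
        u1, v1 = kp1[m.queryIdx].pt
        u2, v2 = kp2[m.trainIdx].pt
        z = depth1[int(v1), int(u1)]
        if z >= max_depth:
            continue  # depth here is only a lower bound; skip the point
        # Back-project pixel (u1, v1) at depth z into the first camera's frame
        object_points.append([z * (u1 - c_x) / f_x, z * (v1 - c_y) / f_y, z])
        image_points.append([u2, v2])
    object_points = np.array(object_points, dtype=np.float32)
    image_points = np.array(image_points, dtype=np.float32)
    _, rvec, tvec, inliers = cv2.solvePnPRansac(object_points, image_points, k, None)
    rmat, _ = cv2.Rodrigues(rvec)  # convert the rotation vector to a 3x3 matrix
    return rmat, tvec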
# In[58]:
i = 30
match = matches[i]
kp1 = kp_list[i]
kp2 = kp_list[i+1]
k = dataset_handler.k
depth = dataset_handler.depth_maps[i]
rmat, tvec, image1_points, image2_points = estimate_motion(match, kp1, kp2, k, depth1=depth)
print("Estimated rotation:\n {0}".format(rmat))
print("Estimated translation:\n {0}".format(tvec))
# **Expected Output Format**:
#
# Make sure that your estimated rotation matrix and translation vector are in the same format as the given initial values
#
# ```
# rmat = np.eye(3)
# tvec = np.zeros((3, 1))
#
# print("Initial rotation:\n {0}".format(rmat))
# print("Initial translation:\n {0}".format(tvec))
# ```
#
#
# ```
# Initial rotation:
# [[1. 0. 0.]
# [0. 1. 0.]
# [0. 0. 1.]]
# Initial translation:
# [[0.]
# [0.]
# [0.]]
# ```
# **Camera Movement Visualization**:
# You can use the `visualize_camera_movement` function that is provided to you. This function visualizes the final image matches from an image pair, connected with arrows corresponding to the direction of camera movement (when `is_show_img_after_move = False`). The function description:
# ```
# Arguments:
# image1 -- the first image in a matched image pair (RGB or grayscale)
# image1_points -- a list of selected match coordinates in the first image. image1_points[i] = [x, y], where x and y are
# coordinates of the i-th match in the image coordinate system
# image2 -- the second image in a matched image pair (RGB or grayscale)
# image2_points -- a list of selected match coordinates in the second image. image2_points[i] = [x, y], where x and y are
#                  coordinates of the i-th match in the image coordinate system
# is_show_img_after_move -- a boolean variable controlling the output (see the image_move description for more info)
#
# Returns:
# image_move -- an image with the visualization. When is_show_img_after_move=False, the image points from both images are visualized on the first image. Otherwise, only the image points from the second image are visualized on the second image
# ```
# In[59]:
i=30
image1 = dataset_handler.images_rgb[i]
image2 = dataset_handler.images_rgb[i + 1]
image_move = visualize_camera_movement(image1, image1_points, image2, image2_points)
plt.figure(figsize=(16, 12), dpi=100)
plt.imshow(image_move)
# In[60]:
image_move = visualize_camera_movement(image1, image1_points, image2, image2_points, is_show_img_after_move=True)
plt.figure(figsize=(16, 12), dpi=100)
plt.imshow(image_move)
# These visualizations might be helpful for understanding the quality of image points selected for the camera motion estimation
# ### 3.2 - Camera Trajectory Estimation
#
# **Task**: Implement camera trajectory estimation with visual odometry. More specifically, implement camera motion estimation for each subsequent image pair in the dataset with the function you wrote in the above section.
#
# ***Note***: Do not forget that the image pairs are not independent of each other: the i-th and (i + 1)-th image pairs share a common image.
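#
# One common way to express this chaining (the convention assumed in the implementation below): if R_k and t_k map points from frame k-1 into frame k, accumulate the camera pose as C_k = C_(k-1) @ [[R_k^T, -R_k^T t_k], [0, 1]] with C_0 = I; the camera location for frame k is then the translation part of C_k.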
# In[61]:
def estimate_trajectory(estimate_motion, matches, kp_list, k, depth_maps=[]):
"""
Estimate complete camera trajectory from subsequent image pairs
Arguments:
estimate_motion -- a function which estimates camera motion from a pair of subsequent image frames
matches -- list of matches for each subsequent image pair in the dataset.
Each matches[i] is a list of matched features from images i and i + 1
    kp_list -- a list of keypoints for each image in the dataset
k -- camera calibration matrix
Optional arguments:
depth_maps -- a list of depth maps for each frame. This argument is not needed if you use Essential Matrix Decomposition
Returns:
    trajectory -- a 3xlen numpy array of the camera locations, where len is the length of the list of images, and
trajectory[:, i] is a 3x1 numpy vector, such as:
trajectory[:, i][0] - is X coordinate of the i-th location
trajectory[:, i][1] - is Y coordinate of the i-th location
trajectory[:, i][2] - is Z coordinate of the i-th location
    * Consider that the origin of your trajectory coordinate system is located at the camera position
      when the first image (the one with index 0) was taken. The first camera location (index = 0) is given
      at the initialization of this function
"""
trajectory = np.zeros((3, 1))
### START CODE HERE ###
trajectory = [np.array([0, 0, 0])]
#T_k = [[R_(k,k-1) t_(k,k-1)] [0 1]]
T_k = np.eye(4)
for i in range(len(matches)):
match = matches[i]
kp1 = kp_list[i]
kp2 = kp_list[i+1]
depth = depth_maps[i]
rmat, tvec, image1_points, image2_points = estimate_motion(match, kp1, kp2, k, depth1=depth)
        # Very useful: http://ksimek.github.io/2012/08/22/extrinsic/
        # Invert the recovered (R, t): the pose of camera i+1 expressed in camera i's frame
        # has rotation R^T and translation -R^T t. Chaining these step poses gives the
        # pose of camera i+1 in the world (frame 0) coordinate system.
        Proj_new = np.eye(4)
        Proj_new[0:3, 0:3] = rmat.T
        T = -rmat.T @ tvec
        Proj_new[0:3, 3] = T.reshape(3)
        T_k = T_k @ Proj_new
        trajectory.append(T_k[:3, 3])
#print(trajectory[i])
trajectory = np.array(trajectory).T
### END CODE HERE ###
return trajectory
# In[62]:
depth_maps = dataset_handler.depth_maps
trajectory = estimate_trajectory(estimate_motion, matches, kp_list, k, depth_maps=depth_maps)
i = 1
print("Camera location in point {0} is: \n {1}\n".format(i, trajectory[:, [i]]))
# Remember that the length of the returned trajectory should be the same as the length of the image array
print("Length of trajectory: {0}".format(trajectory.shape[1]))
# **Expected Output**:
#
# ```
# Camera location in point i is:
# [[locXi]
# [locYi]
# [locZi]]```
#
# In this output: locXi, locYi, locZi are the coordinates of the corresponding i-th camera location
# ## 4 - Submission:
#
# Evaluation of this assignment is based on the estimated trajectory from the output of the cell below.
# Please run the cell below, then copy its output to the provided yaml file for submission on the programming assignment page.
#
# **Expected Submission Format**:
#
# ```
# Trajectory X:
# [[ 0. locX1 locX2 ... ]]
# Trajectory Y:
# [[ 0. locY1 locY2 ... ]]
# Trajectory Z:
# [[ 0. locZ1 locZ2 ... ]]
# ```
#
# In this output: locX1, locY1, locZ1; locX2, locY2, locZ2; ... are the coordinates of the corresponding 1st, 2nd, etc. camera locations
# In[63]:
# Note: Make sure to uncomment the line below if you modified the original data in any way
#dataset_handler = DatasetHandler()
# Part I. Feature Extraction
images = dataset_handler.images
kp_list, des_list = extract_features_dataset(images, extract_features)
# Part II. Feature Matching
matches = match_features_dataset(images, des_list, match_features)
# Set to True if you want to use filtered matches or False otherwise
is_main_filtered_m = True
if is_main_filtered_m:
dist_threshold = 0.75
filtered_matches = filter_matches_dataset(filter_matches_distance, matches, dist_threshold)
matches = filtered_matches
# Part III. Trajectory Estimation
depth_maps = dataset_handler.depth_maps
trajectory = estimate_trajectory(estimate_motion, matches, kp_list, k, depth_maps=depth_maps)
#!!! Make sure you don't modify the output in any way
# Print Submission Info
print("Trajectory X:\n {0}".format(trajectory[0,:].reshape((1,-1))))
print("Trajectory Y:\n {0}".format(trajectory[1,:].reshape((1,-1))))
print("Trajectory Z:\n {0}".format(trajectory[2,:].reshape((1,-1))))
# ### Visualize your Results
#
# **Important**:
#
# 1) Make sure your results visualization looks reasonable before submitting your results. You might want to download this project's dataset and check whether the trajectory you have estimated is consistent with the motion you see in the dataset frames.
#
# 2) Ensure that your trajectory axis directions follow the ones in the _Detailed Description_ section of [OpenCV: Camera Calibration and 3D Reconstruction](https://docs.opencv.org/3.4.3/d9/d0c/group__calib3d.html).
# In[64]:
visualize_trajectory(trajectory)
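# If the m2bk helper were unavailable, a rough standalone top-down plot could be produced
# directly with matplotlib (a sketch; it assumes the X and Z axes span the ground plane,
# following the OpenCV camera coordinate convention referenced above):
plt.figure(figsize=(8, 6), dpi=100)
plt.plot(trajectory[0, :], trajectory[2, :], marker='.')
plt.xlabel('X')
plt.ylabel('Z')
plt.title('Estimated camera trajectory (top-down view)')
plt.axis('equal')
plt.show()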
# Congrats on finishing this assignment!
# In[ ]: