Contents
这是 Udacity 无人驾驶课程的笔记。第一个作业是检测车道,简单地说就是把路面上的行车线找出来。
环境准备
这个作业需要有 Anaconda 的环境,然后装上 jupyter notebook, OpenCV 等必要的库。
具体的安装方法可参考(fork 自 Udacity):
https://github.com/feichashao/CarND-Term1-Starter-Kit/blob/master/doc/configure_via_anaconda.md
其中,安装 tensorflow 的时候,可能会遇到 Google 被墙的情况,所以安装的时候最好先挂个代理,在安装前设置环境变量, 比如:
# export http_proxy=http://8.8.8.8:5187/ # export https_proxy=$http_proxy
颜色选择(Color Selection)
颜色是区分行车线的其中一个因素。对于一种颜色,可以有不同的编码方式,常见的是RGB,YUV等。
由于行车线的颜色与周围环境有一定区分,我们可以对RGB分别设定阈值,把有用的信息分离出来。
### Source: Udacity CarND Term1 Lesson 1 Section 4. import matplotlib.pyplot as plt import matplotlib.image as mpimg import numpy as np %matplotlib inline image = (mpimg.imread('test.jpg')).astype('uint8') print('This image is: ',type(image), 'with dimensions:', image.shape) # Grab the x and y size and make a copy of the image ysize = image.shape[0] xsize = image.shape[1] color_select = np.copy(image) # Define color selection criteria ###### MODIFY THESE VARIABLES TO MAKE YOUR COLOR SELECTION red_threshold = 195 ### <----- green_threshold = 195 ### <----- blue_threshold = 195 ### <----- ###### rgb_threshold = [red_threshold, green_threshold, blue_threshold] # Do a boolean or with the "|" character to identify # pixels below the thresholds thresholds = (image[:,:,0] < rgb_threshold[0]) \ | (image[:,:,1] < rgb_threshold[1]) \ | (image[:,:,2] < rgb_threshold[2]) color_select[thresholds] = [0,0,0] plt.imshow(color_select) # Display the image plt.imshow(color_select)
RGB每一个Channel对应的数值范围是 0~255,这里对 RGB 都设置了 195 为阈值,过滤后可以看到左右行车线。
区域选择(Region Masking)
从汽车的前置摄像头中,行车线通常都在图像下半部分的一个梯形区域内,所以在提取行车线的时候,只关注这个梯形区域内的图像,可以避免其他区域的信息造成干扰。这个梯形区域如果选取地太大,则会引入更多无关信息(比如护栏,树木等),如果梯形区域选取太小,则可能看不见行车线,所以这里需要权衡。
在练习题的例子中,使用了三角形来选取区域。
# Source: Udacity CarND term1 Lession 1 Section 7 import matplotlib.pyplot as plt import matplotlib.image as mpimg import numpy as np %matplotlib inline image = mpimg.imread('test.jpg') # Grab the x and y size and make a copy of the image ysize = image.shape[0] xsize = image.shape[1] color_select = np.copy(image) line_image = np.copy(image) # Define color selection criteria # MODIFY THESE VARIABLES TO MAKE YOUR COLOR SELECTION red_threshold = 195 green_threshold = 195 blue_threshold = 195 rgb_threshold = [red_threshold, green_threshold, blue_threshold] # Define the vertices of a triangular mask. # Keep in mind the origin (x=0, y=0) is in the upper left # MODIFY THESE VALUES TO ISOLATE THE REGION # WHERE THE LANE LINES ARE IN THE IMAGE left_bottom = [130, 719] ###<<------- right_bottom = [1150, 719] ###<<------- apex = [640, 400] ###<<------- # Perform a linear fit (y=Ax+B) to each of the three sides of the triangle # np.polyfit returns the coefficients [A, B] of the fit fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1) fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1) fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1) # Mask pixels below the threshold color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \ (image[:,:,1] < rgb_threshold[1]) | \ (image[:,:,2] < rgb_threshold[2]) # Find the region inside the lines XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize)) region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \ (YY > (XX*fit_right[0] + fit_right[1])) & \ (YY < (XX*fit_bottom[0] + fit_bottom[1])) # Mask color and region selection color_select[color_thresholds | ~region_thresholds] = [0, 0, 0] # Color pixels red where both color and region selections met line_image[~color_thresholds & region_thresholds] = [255, 0, 0] # Display the image and show region and color selections plt.imshow(image) x = [left_bottom[0], right_bottom[0], apex[0], left_bottom[0]] y = [left_bottom[1], right_bottom[1], apex[1], left_bottom[1]] plt.plot(x, y, 'b--', lw=4) plt.imshow(color_select) plt.imshow(line_image)
图中,蓝线是所选择的区域,红线是颜色过滤(Color threshold)得出的像素。
Canny 边缘检测(Edge Detection)
边缘检测也是一种检测行车线的手段。以灰度图为例,每个像素点的灰度数值在[0,255]区间,行车线的颜色通常与路面有较大差异,我们可以利用路面到行车线的颜色突变来进行检测。
Canny edge detector 是其中一种边缘检测方法,在 OpenCV 中可以这样调用:
edges = cv2.Canny(gray, low_threshold, high_threshold)
Canny edge detector 的大致过程如下:
1. 首先会做一个 Gaussian filter 来除杂。
2. 然后会对 x,y 方向分别求梯度(gradient)。
3. 综合x,y的梯度得出综合的梯度G。
4. 如果梯度G大于 high_threshold,就会认为它是边缘像素,保留这些像素;然后把所有低于 low_threshold 的像素去除,在 [low_threshold, high_threshold]之间的像素,如果它位置邻于高于 high_threshold 的像素,则保留,其余的去除。
low_threshold 和 high_threshold 通常的取值范围是 low:high 是 1:2 或 1:3.
具体参考:
http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html
# Source: Udacity Carnd term1 Lesson 1 Sector 11 import matplotlib.pyplot as plt import matplotlib.image as mpimg import numpy as np import cv2 %matplotlib inline image = mpimg.imread('test.jpg') gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY) ## 转换成灰度图 # Define a kernel size for Gaussian smoothing / blurring kernel_size = 5 # Must be an odd number (3, 5, 7...) ## 使用 5 作为 kernel size 来除杂。 blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0) # Define our parameters for Canny and run it low_threshold = 50 ## 设置高低 threshold high_threshold = 120 edges = cv2.Canny(blur_gray, low_threshold, high_threshold) # Display the image plt.imshow(edges, cmap='Greys_r')
霍夫变换 (Hough Transform)
霍夫变换是一种以投票来检测线段的方式。它将图像从X-Y域转换到R-θ域,R是指X-Y域中像素点距离(0,0)的最短距离,θ是与X轴形成的角度。
X-Y域的一个点在R-θ域是一条直线,X-Y域的一条线段上的点在R-θ域会相交到一个点,这个点反过来就是X-Y域的一条直线。 所以我们只要在R-θ域找到各线的交点,就能找到X-Y域的线。
具体参考维基百科:
https://en.wikipedia.org/wiki/Hough_transform
在 OpenCV 中,可以通过如下方法来找到黑白图中的线段。黑白图是经过上面颜色和区域过滤得到的图像。
lines = cv2.HoughLinesP(edges, rho, theta, threshold, np.array([]), min_line_length, max_line_gap)
edges 是上面处理好的黑白图. rho 和 theta 表示距离(rho)和角度(theta)在R-θ中的分辨率,threshold 是指可以判断为一条线段所需的最小votes(R-θ中的有多少条直线相交于这一点), np.array([])是个placeholder不用在意,min_line_length指在X-Y域一条线段的长度最少是多少才认为是线段,max_line_gap是这条线段之间各个部分(segment)所允许的间隔(有多少个空缺点)。
# Source: Udacity CarND term1 Lesson 1 Sector 15 import matplotlib.pyplot as plt import matplotlib.image as mpimg import numpy as np import cv2 image = mpimg.imread('test.jpg') gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY) # Define a kernel size and apply Gaussian smoothing kernel_size = 5 blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0) # Define our parameters for Canny and apply low_threshold = 50 high_threshold = 150 edges = cv2.Canny(blur_gray, low_threshold, high_threshold) # Next we'll create a masked edges image using cv2.fillPoly() mask = np.zeros_like(edges) ignore_mask_color = 255 # This time we are defining a four sided polygon to mask imshape = image.shape vertices = np.array([[(0,imshape[0]),(450, 290), (490, 290), (imshape[1],imshape[0])]], dtype=np.int32) cv2.fillPoly(mask, vertices, ignore_mask_color) masked_edges = cv2.bitwise_and(edges, mask) # Define the Hough transform parameters # Make a blank the same size as our image to draw on rho = 2 # distance resolution in pixels of the Hough grid theta = np.pi/180 # angular resolution in radians of the Hough grid threshold = 15 # minimum number of votes (intersections in Hough grid cell) min_line_length = 40 #minimum number of pixels making up a line max_line_gap = 20 # maximum gap in pixels between connectable line segments line_image = np.copy(image)*0 # creating a blank to draw lines on # Run Hough on edge detected image # Output "lines" is an array containing endpoints of detected line segments lines = cv2.HoughLinesP(masked_edges, rho, theta, threshold, np.array([]), ###<<------- min_line_length, max_line_gap) # Iterate over the output "lines" and draw lines on a blank image for line in lines: for x1,y1,x2,y2 in line: cv2.line(line_image,(x1,y1),(x2,y2),(255,0,0),10) # Create a "color" binary image to combine with line image color_edges = np.dstack((edges, edges, edges)) # Draw the lines on the edge image lines_edges = cv2.addWeighted(color_edges, 0.8, line_image, 1, 0) plt.imshow(lines_edges)
Project 1 - Finding Lane Lines on the Road
万事具备,可以写作业了。这个作业的最终目的是提取出视频中的行车线。视频也是由一帧帧的图像组成的,所以只要做好了对单张图片的pipeline,也可以将其应用到视频上。
完整源码请见:
https://github.com/feichashao/CarND-LaneLines-P1/blob/master/P1.ipynb
准备工作
首先的首先是导入所需要的库:
#importing some useful packages import matplotlib.pyplot as plt import matplotlib.image as mpimg import numpy as np import cv2 %matplotlib inline
然后是一些 helper functions:
import math def grayscale(img): """ 功能:将 img 从彩色图像转换成灰度图像。 mpimg.imread 读取到是RGB,使用OpenCV进行转换 """ return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) def canny(img, low_threshold, high_threshold): """功能:边缘检测 Canny transform""" return cv2.Canny(img, low_threshold, high_threshold) def gaussian_blur(img, kernel_size): """功能:高斯模糊(除噪)""" return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0) def region_of_interest(img, vertices): """ 功能:选取图形区域 只保留由 vertices 组成的多边形内的图像 """ #defining a blank mask to start with mask = np.zeros_like(img) #defining a 3 channel or 1 channel color to fill the mask with depending on the input image if len(img.shape) > 2: channel_count = img.shape[2] # i.e. 3 or 4 depending on your image ignore_mask_color = (255,) * channel_count else: ignore_mask_color = 255 #filling pixels inside the polygon defined by "vertices" with the fill color cv2.fillPoly(mask, vertices, ignore_mask_color) #returning the image only where mask pixels are nonzero masked_image = cv2.bitwise_and(img, mask) return masked_image def draw_lines(img, lines, color=[255, 0, 0], thickness=2): """ 功能:把 lines 画到 img 上 """ for line in lines: for x1,y1,x2,y2 in line: cv2.line(img, (x1, y1), (x2, y2), color, thickness) def hough_lines(img, rho, theta, threshold, min_line_len, max_line_gap): """ 功能:进行霍夫变换找出线段 `img` 是 Canny transform 的输出 输出结果是霍夫变换后找到的线段 """ lines = cv2.HoughLinesP(img, rho, theta, threshold, np.array([]), minLineLength=min_line_len, maxLineGap=max_line_gap) line_img = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8) draw_lines(line_img, lines) return line_img def weighted_img(img, initial_img, α=0.8, β=1., λ=0.): """ 功能:将霍夫变换得到的线段 img 绘制到 原始图像 initial_img 上。 输出图像由此计算: initial_img * α + img * β + λ img, initial_img 的形状(长宽/Channel数量)必须相同 """ return cv2.addWeighted(initial_img, α, img, β, λ)
定义一些针对寻找行车线所需的函数:
def slop(line): """ 功能:计算斜率 input a line [x0, y0, x1, y1] return (y1-y0)/(x1-x0) """ return 1.0*(line[3]-line[1])/(line[2]-line[0]) def get_side_lines(lines, min_slop=-5, max_slop=5): """ 功能:保留合理范围的斜线。从图像上看,行车线不太可能是水平的,可以去除这些不合理的斜线。 input lines returned from hough output side lines that between the min_slop and max_slop. """ side_lines = [] for line in lines: if slop(line[0]) > min_slop and slop(line[0]) < max_slop: side_lines.append(line) return side_lines def generate_result_line(lines, y_min = 330): """ 功能: 对左线/右线得到的所有有效线段进行平均,获得最终的左线/右线。 比如左线,从hough中可以获得左线上的多条线段。首先求出这些线段的平均斜率, 再根据平均斜率,把每一条线段向下延伸,得到这条线段大概的x轴起点位置, 最后将这些线段延伸后的x轴起点位置进行平均,得到最终的x轴平均起点。 从这个x轴平均起点,按照平均斜率,向上延伸得到最终左线。 input left/right lines from get_side_lines(). this function will average and extend them to y_min and output a result line. """ # draw line from the image bottom y_max = 539 # calculate the average slop sum_slop = 0.1 sum_length = 0.1 for line in lines: length = abs(line[0][2] - line[0][0]) sum_length += length sum_slop += length * slop(line[0]) avg_slop = sum_slop / sum_length # calculate the x_start (the x of lane start point) # base on avg_slop x_start_sum = 0 x_start_count = 0 x_start = 0 for line in lines: x_start_sum += line[0][0] + (y_max - line[0][1]) / avg_slop x_start_sum += line[0][2] + (y_max - line[0][3]) / avg_slop x_start_count += 2 if x_start_count > 0: x_start = x_start_sum / x_start_count # Calculate the x of end point x_end = x_start + (y_min - y_max) / avg_slop # return the line as np.array return np.array([[x_start, y_max, x_end, y_min]], dtype=np.int32)
对单张图片进行处理
将之前提到的功能组合起来,得到寻找行车线的pipeline:
彩色到灰度转换 -> 高斯模糊(除噪) -> Canny 边缘检测 -> 有效区域选择 -> 霍夫变换寻找线段 -> 根据斜率获取左右线 -> 计算最终左右线.
# import images image = mpimg.imread('test_images/solidWhiteRight.jpg') # Transform to grayscale image_gray = grayscale(image) # Gaussian blur gaussian_kernel_size = 5 image_blur = gaussian_blur(image_gray, gaussian_kernel_size) # Canny edges canny_low_threshold = 90 canny_high_threshold = 270 image_edge = canny(image_blur, canny_low_threshold, canny_high_threshold) # Mask for interest region. upper_left = (450,280) upper_right = (510, 289) botton_left = (100, 539) botton_right = (860, 539) vertices = np.array([[botton_left, upper_left, upper_right, botton_right]], dtype=np.int32) edge_masked = region_of_interest(image_edge, vertices) # Filter lines by hough. hough_rho = 2 # distance resolution in pixels of the Hough grid hough_theta = np.pi/180 # angular resolution in radians of the Hough grid hough_threshold = 30 # minimum number of votes (intersections in Hough grid cell) hough_min_line_len = 20 #minimum number of pixels making up a line hough_max_line_gap = 20 # maximum gap in pixels between connectable line segments lines = cv2.HoughLinesP(edge_masked, hough_rho, hough_theta, hough_threshold, np.array([]), minLineLength=hough_min_line_len, maxLineGap=hough_max_line_gap) # Get left lines left_lines = get_side_lines(lines, -10, -0.2) # Get right lines right_lines = get_side_lines(lines, 0.2, 10) # Average and extend left/right line. merged_left_line = generate_result_line(left_lines) merged_right_line = generate_result_line(right_lines) # Draw line to a black image. line_img = np.zeros((image.shape[0], image.shape[1], 3), dtype=np.uint8) draw_lines(line_img, [merged_left_line, merged_right_line], thickness=8) # Merge lines into the original image. result_image = weighted_img(line_img, image) plt.imshow(result_image)
处理视频
能处理单个图像,处理视频也就好说了。使用 VideoFileClip 可以对每一帧应用上面的pipeline进行处理。
首先定义一个 processimage 函数,以一帧图像作为输入,输出则是处理好的图像。将上述pipeline直接放到函数内即可。
def process_image(image): return result_image
from moviepy.editor import VideoFileClip from IPython.display import HTML white_output = 'white.mp4' clip1 = VideoFileClip("solidWhiteRight.mp4") white_clip = clip1.fl_image(process_image) #NOTE: this function expects color images!! %time white_clip.write_videofile(white_output, audio=False)