CarND Project 1 - 车道检测


这是 Udacity 无人驾驶课程的笔记。第一个作业是检测车道,简单地说就是把路面上的行车线找出来。

Origin

target

环境准备

这个作业需要有 Anaconda 的环境,然后装上 jupyter notebook, OpenCV 等必要的库。
具体的安装方法可参考(fork 自 Udacity):
https://github.com/feichashao/CarND-Term1-Starter-Kit/blob/master/doc/configure_via_anaconda.md
其中,安装 tensorflow 的时候,可能会遇到 Google 被墙的情况,所以安装的时候最好先挂个代理,在安装前设置环境变量, 比如:

# export http_proxy=http://8.8.8.8:5187/ 
# export https_proxy=$http_proxy

颜色选择(Color Selection)

颜色是区分行车线的其中一个因素。对于一种颜色,可以有不同的编码方式,常见的是RGB,YUV等。

由于行车线的颜色与周围环境有一定区分,我们可以对RGB分别设定阈值,把有用的信息分离出来。

### Source: Udacity CarND Term1 Lesson 1 Section 4.
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
%matplotlib inline

image = (mpimg.imread('test.jpg')).astype('uint8')
print('This image is: ',type(image), 
         'with dimensions:', image.shape)

# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
color_select = np.copy(image)

# Define color selection criteria
###### MODIFY THESE VARIABLES TO MAKE YOUR COLOR SELECTION
red_threshold = 195   ### <-----
green_threshold = 195 ### <-----
blue_threshold = 195  ### <-----
######

rgb_threshold = [red_threshold, green_threshold, blue_threshold]

# Do a boolean or with the "|" character to identify
# pixels below the thresholds
thresholds = (image[:,:,0] < rgb_threshold[0]) \
            | (image[:,:,1] < rgb_threshold[1]) \
            | (image[:,:,2] < rgb_threshold[2])
color_select[thresholds] = [0,0,0]
plt.imshow(color_select)
# Display the image                 
plt.imshow(color_select)

RGB每一个Channel对应的数值范围是 0~255,这里对 RGB 都设置了 195 为阈值,过滤后可以看到左右行车线。

处理前
处理后

区域选择(Region Masking)

从汽车的前置摄像头中,行车线通常都在图像下半部分的一个梯形区域内,所以在提取行车线的时候,只关注这个梯形区域内的图像,可以避免其他区域的信息造成干扰。这个梯形区域如果选取地太大,则会引入更多无关信息(比如护栏,树木等),如果梯形区域选取太小,则可能看不见行车线,所以这里需要权衡。

在练习题的例子中,使用了三角形来选取区域。

# Source: Udacity CarND term1 Lession 1 Section 7
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
%matplotlib inline

image = mpimg.imread('test.jpg')

# Grab the x and y size and make a copy of the image
ysize = image.shape[0]
xsize = image.shape[1]
color_select = np.copy(image)
line_image = np.copy(image)

# Define color selection criteria
# MODIFY THESE VARIABLES TO MAKE YOUR COLOR SELECTION
red_threshold = 195
green_threshold = 195
blue_threshold = 195

rgb_threshold = [red_threshold, green_threshold, blue_threshold]

# Define the vertices of a triangular mask.
# Keep in mind the origin (x=0, y=0) is in the upper left
# MODIFY THESE VALUES TO ISOLATE THE REGION 
# WHERE THE LANE LINES ARE IN THE IMAGE
left_bottom = [130, 719]  ###<<-------
right_bottom = [1150, 719] ###<<-------
apex = [640, 400]     ###<<-------

# Perform a linear fit (y=Ax+B) to each of the three sides of the triangle
# np.polyfit returns the coefficients [A, B] of the fit
fit_left = np.polyfit((left_bottom[0], apex[0]), (left_bottom[1], apex[1]), 1)
fit_right = np.polyfit((right_bottom[0], apex[0]), (right_bottom[1], apex[1]), 1)
fit_bottom = np.polyfit((left_bottom[0], right_bottom[0]), (left_bottom[1], right_bottom[1]), 1)

# Mask pixels below the threshold
color_thresholds = (image[:,:,0] < rgb_threshold[0]) | \
                    (image[:,:,1] < rgb_threshold[1]) | \
                    (image[:,:,2] < rgb_threshold[2])

# Find the region inside the lines
XX, YY = np.meshgrid(np.arange(0, xsize), np.arange(0, ysize))
region_thresholds = (YY > (XX*fit_left[0] + fit_left[1])) & \
                    (YY > (XX*fit_right[0] + fit_right[1])) & \
                    (YY < (XX*fit_bottom[0] + fit_bottom[1]))
                    
# Mask color and region selection
color_select[color_thresholds | ~region_thresholds] = [0, 0, 0]
# Color pixels red where both color and region selections met
line_image[~color_thresholds & region_thresholds] = [255, 0, 0]

# Display the image and show region and color selections
plt.imshow(image)
x = [left_bottom[0], right_bottom[0], apex[0], left_bottom[0]]
y = [left_bottom[1], right_bottom[1], apex[1], left_bottom[1]]
plt.plot(x, y, 'b--', lw=4)
plt.imshow(color_select)
plt.imshow(line_image)

图中,蓝线是所选择的区域,红线是颜色过滤(Color threshold)得出的像素。
mask_output

Canny 边缘检测(Edge Detection)

边缘检测也是一种检测行车线的手段。以灰度图为例,每个像素点的灰度数值在[0,255]区间,行车线的颜色通常与路面有较大差异,我们可以利用路面到行车线的颜色突变来进行检测。

Canny edge detector 是其中一种边缘检测方法,在 OpenCV 中可以这样调用:

edges = cv2.Canny(gray, low_threshold, high_threshold)

Canny edge detector 的大致过程如下:
1. 首先会做一个 Gaussian filter 来除杂。
Canny_filter_noise

2. 然后会对 x,y 方向分别求梯度(gradient)。
Canny_mask

3. 综合x,y的梯度得出综合的梯度G。
Canny_gradient

4. 如果梯度G大于 high_threshold,就会认为它是边缘像素,保留这些像素;然后把所有低于 low_threshold 的像素去除,在 [low_threshold, high_threshold]之间的像素,如果它位置邻于高于 high_threshold 的像素,则保留,其余的去除。

low_threshold 和 high_threshold 通常的取值范围是 low:high 是 1:2 或 1:3.

具体参考:
http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html

# Source: Udacity Carnd term1 Lesson 1 Sector 11
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2
%matplotlib inline


image = mpimg.imread('test.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)  ## 转换成灰度图

# Define a kernel size for Gaussian smoothing / blurring
kernel_size = 5 # Must be an odd number (3, 5, 7...) ## 使用 5 作为 kernel size 来除杂。
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)

# Define our parameters for Canny and run it
low_threshold = 50  ## 设置高低 threshold
high_threshold = 120
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)

# Display the image
plt.imshow(edges, cmap='Greys_r')

Canny_output

霍夫变换 (Hough Transform)

霍夫变换是一种以投票来检测线段的方式。它将图像从X-Y域转换到R-θ域,R是指X-Y域中像素点距离(0,0)的最短距离,θ是与X轴形成的角度。

Source: Wikipedia
Source: Wikipedia

X-Y域的一个点在R-θ域是一条直线,X-Y域的一条线段上的点在R-θ域会相交到一个点,这个点反过来就是X-Y域的一条直线。 所以我们只要在R-θ域找到各线的交点,就能找到X-Y域的线。
By Daf-de - Own work, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=1121165
By Daf-de - Own work, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=1121165

具体参考维基百科:
https://en.wikipedia.org/wiki/Hough_transform

在 OpenCV 中,可以通过如下方法来找到黑白图中的线段。黑白图是经过上面颜色和区域过滤得到的图像。

lines = cv2.HoughLinesP(edges, rho, theta, threshold, np.array([]), min_line_length, max_line_gap)

edges 是上面处理好的黑白图. rho 和 theta 表示距离(rho)和角度(theta)在R-θ中的分辨率,threshold 是指可以判断为一条线段所需的最小votes(R-θ中的有多少条直线相交于这一点), np.array([])是个placeholder不用在意,min_line_length指在X-Y域一条线段的长度最少是多少才认为是线段,max_line_gap是这条线段之间各个部分(segment)所允许的间隔(有多少个空缺点)。

# Source: Udacity CarND term1 Lesson 1 Sector 15
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2

image = mpimg.imread('test.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_RGB2GRAY)

# Define a kernel size and apply Gaussian smoothing
kernel_size = 5
blur_gray = cv2.GaussianBlur(gray,(kernel_size, kernel_size),0)

# Define our parameters for Canny and apply
low_threshold = 50
high_threshold = 150
edges = cv2.Canny(blur_gray, low_threshold, high_threshold)

# Next we'll create a masked edges image using cv2.fillPoly()
mask = np.zeros_like(edges)   
ignore_mask_color = 255   

# This time we are defining a four sided polygon to mask
imshape = image.shape
vertices = np.array([[(0,imshape[0]),(450, 290), (490, 290), (imshape[1],imshape[0])]], dtype=np.int32)
cv2.fillPoly(mask, vertices, ignore_mask_color)
masked_edges = cv2.bitwise_and(edges, mask)

# Define the Hough transform parameters
# Make a blank the same size as our image to draw on
rho = 2 # distance resolution in pixels of the Hough grid
theta = np.pi/180 # angular resolution in radians of the Hough grid
threshold = 15     # minimum number of votes (intersections in Hough grid cell)
min_line_length = 40 #minimum number of pixels making up a line
max_line_gap = 20    # maximum gap in pixels between connectable line segments
line_image = np.copy(image)*0 # creating a blank to draw lines on

# Run Hough on edge detected image
# Output "lines" is an array containing endpoints of detected line segments
lines = cv2.HoughLinesP(masked_edges, rho, theta, threshold, np.array([]),  ###<<-------
                            min_line_length, max_line_gap)

# Iterate over the output "lines" and draw lines on a blank image
for line in lines:
    for x1,y1,x2,y2 in line:
        cv2.line(line_image,(x1,y1),(x2,y2),(255,0,0),10)

# Create a "color" binary image to combine with line image
color_edges = np.dstack((edges, edges, edges)) 

# Draw the lines on the edge image
lines_edges = cv2.addWeighted(color_edges, 0.8, line_image, 1, 0) 
plt.imshow(lines_edges)

hough_output

Project 1 - Finding Lane Lines on the Road

万事具备,可以写作业了。这个作业的最终目的是提取出视频中的行车线。视频也是由一帧帧的图像组成的,所以只要做好了对单张图片的pipeline,也可以将其应用到视频上。
完整源码请见:
https://github.com/feichashao/CarND-LaneLines-P1/blob/master/P1.ipynb

准备工作

首先的首先是导入所需要的库:

#importing some useful packages
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import cv2
%matplotlib inline

然后是一些 helper functions:

import math

def grayscale(img):
    """
    功能:将 img 从彩色图像转换成灰度图像。
    mpimg.imread 读取到是RGB,使用OpenCV进行转换
    """
    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    
def canny(img, low_threshold, high_threshold):
    """功能:边缘检测 Canny transform"""
    return cv2.Canny(img, low_threshold, high_threshold)

def gaussian_blur(img, kernel_size):
    """功能:高斯模糊(除噪)"""
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)

def region_of_interest(img, vertices):
    """
    功能:选取图形区域
    只保留由 vertices 组成的多边形内的图像
    """
    #defining a blank mask to start with
    mask = np.zeros_like(img)   
    
    #defining a 3 channel or 1 channel color to fill the mask with depending on the input image
    if len(img.shape) > 2:
        channel_count = img.shape[2]  # i.e. 3 or 4 depending on your image
        ignore_mask_color = (255,) * channel_count
    else:
        ignore_mask_color = 255
        
    #filling pixels inside the polygon defined by "vertices" with the fill color    
    cv2.fillPoly(mask, vertices, ignore_mask_color)
    
    #returning the image only where mask pixels are nonzero
    masked_image = cv2.bitwise_and(img, mask)
    return masked_image


def draw_lines(img, lines, color=[255, 0, 0], thickness=2):
    """
    功能:把 lines 画到 img 上
    """
    for line in lines:
        for x1,y1,x2,y2 in line:
            cv2.line(img, (x1, y1), (x2, y2), color, thickness)

def hough_lines(img, rho, theta, threshold, min_line_len, max_line_gap):
    """
    功能:进行霍夫变换找出线段
    `img` 是 Canny transform 的输出
    输出结果是霍夫变换后找到的线段
    """
    lines = cv2.HoughLinesP(img, rho, theta, threshold, np.array([]), minLineLength=min_line_len, maxLineGap=max_line_gap)
    line_img = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
    draw_lines(line_img, lines)
    return line_img

def weighted_img(img, initial_img, α=0.8, β=1., λ=0.):
    """
    功能:将霍夫变换得到的线段 img 绘制到 原始图像 initial_img 上。
    输出图像由此计算:
    initial_img * α + img * β + λ
    img, initial_img 的形状(长宽/Channel数量)必须相同
    """
    return cv2.addWeighted(initial_img, α, img, β, λ)

定义一些针对寻找行车线所需的函数:


def slop(line):
    """
    功能:计算斜率
    input a line [x0, y0, x1, y1]
    return (y1-y0)/(x1-x0)
    """
    return 1.0*(line[3]-line[1])/(line[2]-line[0])


def get_side_lines(lines, min_slop=-5, max_slop=5):
    """
    功能:保留合理范围的斜线。从图像上看,行车线不太可能是水平的,可以去除这些不合理的斜线。
    input lines returned from hough
    output side lines that between the min_slop and max_slop.
    """
    side_lines = []
    
    for line in lines:
        if slop(line[0]) > min_slop and slop(line[0]) < max_slop:
            side_lines.append(line)
    
    return side_lines


def generate_result_line(lines, y_min = 330):
    """
    功能: 对左线/右线得到的所有有效线段进行平均,获得最终的左线/右线。
    比如左线,从hough中可以获得左线上的多条线段。首先求出这些线段的平均斜率,
    再根据平均斜率,把每一条线段向下延伸,得到这条线段大概的x轴起点位置,
    最后将这些线段延伸后的x轴起点位置进行平均,得到最终的x轴平均起点。
    从这个x轴平均起点,按照平均斜率,向上延伸得到最终左线。
    input left/right lines from get_side_lines().
    this function will average and extend them to
    y_min and output a result line.
    """
    
    # draw line from the image bottom
    y_max = 539

    # calculate the average slop
    sum_slop = 0.1
    sum_length = 0.1
    for line in lines:
        length = abs(line[0][2] - line[0][0])
        sum_length += length
        sum_slop += length * slop(line[0])
    avg_slop = sum_slop / sum_length
        
    # calculate the x_start (the x of lane start point)
    # base on avg_slop
    x_start_sum = 0
    x_start_count = 0
    x_start = 0
    for line in lines:
        x_start_sum += line[0][0] + (y_max - line[0][1]) / avg_slop
        x_start_sum += line[0][2] + (y_max - line[0][3]) / avg_slop
        x_start_count += 2
    if x_start_count > 0:
        x_start = x_start_sum / x_start_count
        
        
    # Calculate the x of end point
    x_end = x_start + (y_min - y_max) / avg_slop
    
    # return the line as np.array
    return np.array([[x_start, y_max, x_end, y_min]], dtype=np.int32)

对单张图片进行处理

将之前提到的功能组合起来,得到寻找行车线的pipeline:
彩色到灰度转换 -> 高斯模糊(除噪) -> Canny 边缘检测 -> 有效区域选择 -> 霍夫变换寻找线段 -> 根据斜率获取左右线 -> 计算最终左右线.

# import images
image = mpimg.imread('test_images/solidWhiteRight.jpg')

# Transform to grayscale
image_gray = grayscale(image)

# Gaussian blur
gaussian_kernel_size = 5
image_blur = gaussian_blur(image_gray, gaussian_kernel_size)

# Canny edges
canny_low_threshold = 90
canny_high_threshold = 270
image_edge = canny(image_blur, canny_low_threshold, canny_high_threshold)

# Mask for interest region.
upper_left = (450,280)
upper_right = (510, 289)
botton_left = (100, 539)
botton_right = (860, 539)
vertices = np.array([[botton_left, upper_left, upper_right, botton_right]], dtype=np.int32)
edge_masked = region_of_interest(image_edge, vertices)

# Filter lines by hough.
hough_rho = 2  # distance resolution in pixels of the Hough grid
hough_theta = np.pi/180 # angular resolution in radians of the Hough grid
hough_threshold = 30     # minimum number of votes (intersections in Hough grid cell)
hough_min_line_len = 20  #minimum number of pixels making up a line
hough_max_line_gap =  20    # maximum gap in pixels between connectable line segments
lines = cv2.HoughLinesP(edge_masked, hough_rho, hough_theta, hough_threshold, np.array([]), minLineLength=hough_min_line_len, maxLineGap=hough_max_line_gap)

# Get left lines
left_lines = get_side_lines(lines, -10, -0.2)

# Get right lines
right_lines = get_side_lines(lines, 0.2, 10)

# Average and extend left/right line.
merged_left_line = generate_result_line(left_lines)
merged_right_line = generate_result_line(right_lines)

# Draw line to a black image.
line_img = np.zeros((image.shape[0], image.shape[1], 3), dtype=np.uint8)
draw_lines(line_img, [merged_left_line, merged_right_line], thickness=8)

# Merge lines into the original image.
result_image = weighted_img(line_img, image)
plt.imshow(result_image)

target

处理视频

能处理单个图像,处理视频也就好说了。使用 VideoFileClip 可以对每一帧应用上面的pipeline进行处理。

首先定义一个 processimage 函数,以一帧图像作为输入,输出则是处理好的图像。将上述pipeline直接放到函数内即可。

def process_image(image):
    return result_image
from moviepy.editor import VideoFileClip
from IPython.display import HTML

white_output = 'white.mp4'
clip1 = VideoFileClip("solidWhiteRight.mp4")
white_clip = clip1.fl_image(process_image) #NOTE: this function expects color images!!
%time white_clip.write_videofile(white_output, audio=False)