计算机视觉基础 - 特征向量生成算法 ORB & HOG

我们通常所说的图片中的关键点 Keypoints 是指图片当中具有明显区别意义或对被检测对象有定义性质的小区域,例如成角和边缘部分。执行边缘、成角和轮廓检测等前述所有的独立的特征点 Feature Points 的提取都是为了后续在算法实现中可以通过将这些具有识别能力的特征组合起来构成特征向量 Feature Vector 来更稳健的完成物体的检测和识别。在计算机视觉的领域里,特征向量生成算法也常被称为 Feature Descriptor。

A feature descriptor is a representation of an image or an image patch that simplifies the image by extracting useful information and throwing away extraneous information.

ORB

ORB, oriented fast and rotated brief 是一个可以快速实现特征检测 Feature Detection 和特征向量生成 Vector Creation 的算法,其中 FAST 算法部分实现特征检测,而 BRIEF 则用于实现特征向量生成。事实上,在计算机视觉的很多应用领域里,特征检测和识别都是分步骤执行的,这也是很多计算机视觉应用的标准流程。

ORB is basically a fusion of FAST keypoint detector and BRIEF descriptor with many modifications to enhance the performance. First it use FAST to find keypoints, then apply Harris corner measure to find top N points among them. It also use pyramid to produce multiscale-features.

FAST

在 ORB 算法的实现中,第一步就是要完成特征点的检测,这一步骤通过 FAST, Features from Accelerated Segments Test 算法来完成,其实现过程如下:

  1. 选定一个像素点 p 作为备选特征点,其强度记为 Ip
  2. 设定一个合适的阈值 t
  3. 选定围绕 p 点一周的 16 个像素点
  4. p 点是否构成特征点的判断标准为:这 16 个像素点中有 12 个像素的强度高于 Ip + t 或低于 Ip - t,否则 p 就不是特征点
  5. 为了提高算法的执行效率,算法中还添加了一个快速实现部分:可以首先只检测备选特征点周围 1,5,9,13 几个位置的点的强度,如果其中至少 3 个点的强度高于 Ip + t 或低于 Ip - t 才进一步检测 p 是否是特征点
FAST

具体的优缺点请见 OpenCV 的 文档 部分。

BRIEF

在完成了特征点检测后,ORB 通过 BRIEF, Binary Robust Independent Elementary Features 算法来生成二元的特征向量。

It takes smoothened image patch and selects a set of nd (x,y) location pairs in an unique way (explained in paper). Then some pixel intensity comparisons are done on these location pairs. For eg, let first location pairs be p and q. If I(p) < I(q), then its result is 1, else it is 0. This is applied for all the nd location pairs to get a nd dimensional bitstring.

引文中特征点附近的参考点的选择方式如下:先围绕特征点选择一个与这个特征点的距离服从标准差为 σ 的高斯分布的任意一点 p,在此基础上,再围绕 p 选择一个服从标准差为 σ / 2 的点 q,之后再将两者的强度进行对比,最终重复这个过程 nd 次。

在 ORB 算法的实现过程中,除了基本的 FAST 和 BRIEF 还引入了其他的措施,如 Image Pyramid 和 rBRIEF 来进一步提高算法对于缩放比例和图像旋转的稳健性。由于算法的快速识别能力,ORB 算法常用于在视频中执行具有稳定特征的对象识别(如人脸、车辆等)。

在实际使用中,会先利用 ORB 算法提取训练图片和目标图片的特征向量,之后再通过特征匹配来完成查找。

使用 ORB 进行特征提取的代码如下:


# Convert the training image to gray Scale
training_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Set the parameters of the ORB algorithm by specifying the maximum number of keypoints to locate and
# the pyramid decimation ratio
orb = cv2.ORB_create(200, 2.0)

# Find the keypoints in the gray scale training image and compute their ORB descriptor.
# The None parameter is needed to indicate that we are not using a mask.
keypoints, descriptor = orb.detectAndCompute(training_gray, None)

# Create copies of the training image to draw our keypoints on
keyp_without_size = copy.copy(training_image)
keyp_with_size = copy.copy(training_image)

# Draw the keypoints without size or orientation on one copy of the training image 
cv2.drawKeypoints(training_image, keypoints, keyp_without_size, color = (0, 255, 0))

# Draw the keypoints with size and orientation on the other copy of the training image
cv2.drawKeypoints(training_image, keypoints, keyp_with_size, flags = cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

ORB keypoints detection

Every keypoint has a center, a size, and an angle. The center determines the location of each keypoint in the image; the size of each keypoint is determined by the patch size used by BRIEF to create its feature vector; and the angle tells us the orientation of the keypoint as determined by rBRIEF.

使用 ORB 算法实现目标检测的代码如下:


# Convert the training image to gray scale
training_gray = cv2.cvtColor(training_image, cv2.COLOR_BGR2GRAY)

# Convert the query image to gray scale
query_gray = cv2.cvtColor(query_image, cv2.COLOR_BGR2GRAY)

# Set the parameters of the ORB algorithm by specifying the maximum number of keypoints to locate and
# the pyramid decimation ratio
orb = cv2.ORB_create(1000, 2.0)

# Find the keypoints in the gray scale training and query images and compute their ORB descriptor.
# The None parameter is needed to indicate that we are not using a mask in either case.
keypoints_train, descriptors_train = orb.detectAndCompute(training_gray, None)
keypoints_query, descriptors_query = orb.detectAndCompute(query_gray, None)

# Create a Brute Force Matcher object. Set crossCheck to True so that the BFMatcher will only return consistent
# pairs. Such technique usually produces best results with minimal number of outliers when there are enough matches.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck = True)

# Perform the matching between the ORB descriptors of the training image and the query image
matches = bf.match(descriptors_train, descriptors_query)

# The matches with shorter distance are the ones we want. So, we sort the matches according to distance
matches = sorted(matches, key = lambda x : x.distance)

# Connect the keypoints in the training image with their best matching keypoints in the query image.
# The best matches correspond to the first elements in the sorted matches list, since they are the ones
# with the shorter distance. We draw the first 30 mathces and use flags = 2 to plot the matching keypoints
# without size or orientation.
result = cv2.drawMatches(training_gray, keypoints_train, query_gray, keypoints_query, matches[:30], query_gray, flags = 2)

# Display the best matching points
plt.title('Best Matching Points')
plt.imshow(result)
plt.show()
Object detection and matching with ORB

HOG, Histogram of Oriented Gradient

对于 ORB 这类基于特征点匹配的算法来说,其对于检测那些有固定特征的且不易受背景环境影响的对象如人脸识别类的应用来说具有非常高的识别度和准确率,但对于更加一般的物体识别,如行人检测,则由于姿态和背景变化多样,因此并不能很好的寻找特定的特征点来完成。同理,前面讲到的边缘和轮廓检测这类方法同样也受到待识别物体和背景的对比情况的影响,因此需要引入诸如本节的 HOG 这类的算法。

In the HOG feature descriptor, the distribution ( histograms ) of directions of gradients ( oriented gradients ) are used as features. Gradients ( x and y derivatives ) of an image are useful because the magnitude of gradients is large around edges and corners ( regions of abrupt intensity changes ) and we know that edges and corners pack in a lot more information about object shape than flat regions.

由于直方图 Histogram 可以很好的展示其中的数据在不同取值范围内的分布情况,例如灰度图片中亮度强度的分布情况,HOG 算法是通过统计图片中的梯度方向的直方图来对图像进行分析,其实现过程如下:

  1. 计算图像中待检测区域 Detection Window 内每一个像素点的梯度大小和方向
  2. 将相邻的像素点分组在临近的正方形区域 Cell 内,例如 8 x 8 的区域内
  3. 在每一个区域内统计不同像素点的梯度方向在不同范围内的分布情况,进而建立梯度方向为区间的梯度大小的直方图
  4. 上述直方图中梯度的大小对于强度变化异常敏感,因此需要通过归一化 Normalization 来去除强度的影响,在实际算法实现中采用的是在一个大的 block 内来实现归一化,这一处理也有助于进一步减少图像表征信息的量
  5. 这个梯度方向的直方图本身即可以做为一个特征向量,可以用一系列的梯度方向的直方图构成的特征向量来训练一个分类器如 SVM 进行图像识别
  1. Given the image of particular object, set a detection window (region of interest) that covers the entire object in the image (see Fig. 3).
  2. Calculate the magnitude and direction of the gradient for each individual pixel in the detection window.
  3. Divide the detection window into connected cells of pixels, with all cells being of the same size (see Fig. 3). The size of the cells is a free parameter and it is usually chosen so as to match the scale of the features that want to be detected. For example, in a 64 x 128 pixel detection window, square cells 6 to 8 pixels wide are suitable for detecting human limbs.
  4. Create a Histogram for each cell, by first grouping the gradient directions of all pixels in each cell into a particular number of orientation (angular) bins; and then adding up the gradient magnitudes of the gradients in each angular bin (see Fig. 3). The number of bins in the histogram is a free parameter and it is usually set to 9 angular bins.
  5. Group adjacent cells into blocks (see Fig. 3). The number of cells in each block is a free parameter and all blocks must be of the same size. The distance between each block (known as the stride) is a free parameter but it is usually set to half the block size, in which case you will get overlapping blocks. The HOG algorithm has been shown empirically to work better with overlapping blocks.
  6. Use the cells contained within each block to normalize the cell histograms in that block (see Fig. 3). If you have overlapping blocks this means that most cells will be normalized with respect to different blocks . Therefore, the same cell may have several different normalizations.
  7. Collect all the normalized histograms from all the blocks into a single feature vector called the HOG descriptor.
  8. Use the resulting HOG descriptors from many images of the same type of object to train a machine learning algorithm, such as an SVM, to detect those type of objects in images. For example, you could use the HOG descriptors from many images of pedestrians to train an SVM to detect pedestrians in images. The training is done with both positive and negative examples of the object you want detect in the image.
  9. Once the SVM has been trained, a sliding window approach is used to try to detect and locate objects in images. Detecting an object in the image entails finding the part of the image that looks similar to the HOG pattern learned by the SVM.
Fig.3 - HOG Diagram, form Udacity
# Initialize a HOG detector
hog = cv2.HOGDescriptor()

svm_detector = hog.getDefaultPeopleDetector()

hog.setSVMDetector(svm_detector)

if img is None:
    raise Exception("Please check the availability of the image file!")
    
bboxes, weights = hog.detectMultiScale(img, winStride=(8, 8), padding=(32, 32), 
                                       scale=1.05, finalThreshold=2, hitThreshold=0)

参考阅读

  1. A brilliant illustration of HOG
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 159,716评论 4 364
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 67,558评论 1 294
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 109,431评论 0 244
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 44,127评论 0 209
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 52,511评论 3 287
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 40,692评论 1 222
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 31,915评论 2 313
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 30,664评论 0 202
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 34,412评论 1 246
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 30,616评论 2 245
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 32,105评论 1 260
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 28,424评论 2 254
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 33,098评论 3 238
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 26,096评论 0 8
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 26,869评论 0 197
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 35,748评论 2 276
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 35,641评论 2 271

推荐阅读更多精彩内容

  • rljs by sennchi Timeline of History Part One The Cognitiv...
    sennchi阅读 7,106评论 0 10
  • 时隔十年的《越狱》第五季终于在忠粉们的翘首期待中上映了,伴随着一代人成长的《越狱》不仅把美剧第一次大范围内介绍到中...
    酒研社阅读 798评论 0 0
  • 半路相逢亦是首歌 ——留在心底的飨宴(五) 此次提笔心情特别复杂,不知道你们发现没有,最近的我话不多了,喜欢上了在...
    双豪虎娘阅读 1,143评论 1 11
  • 我拥有明确的长短期目标,我拥有强烈的学习动力,我拥有整合资源的意愿和能力,我拥有挖掘与创造需求的能力。
    继续折腾折腾阅读 91评论 0 0
  • 榜样的力量 在我们生活中,能学习的优秀的人很多,但往往我们忽略了别人的优点,很多时候我们记住的是别人的缺...
    流年七里香农庄阅读 183评论 0 0