2021-06-23 基于AprilTag的位姿估计，原理，完整代码（相机坐标系、世界坐标系）

1.AprilTag概述

AprilTag是一种视觉基准系统，类似的视觉基准标签还有：ArUco、ARTag、TopoTag等。而AprilTag因其对旋转、光照、模糊具有良好的鲁棒性，因此被广泛应用于目标跟踪、室内定位、增强现实等领域。本文主要展示使用AprilTag标签，分别获取AprilTag标签在相机坐标系下和世界坐标系下位置。

生成AprilTag不同家族标签图像教程

2.AprilTag 相关资料

AprilTag论文三部曲：

AprilTag: A robust and flexible visual fiducial system. --2011
AprilTag 2: Efficient and robust fiducial detection. --2016
Flexible Layouts for Fiducial Tags. --2019（AprilTag3）

一些应用AprilTag的论文：

UAV Autonomous Landing Technology Based on AprilTags Vision Positioning Algorithm.
Pose Estimation for Multicopters Based on Monocular Vision and AprilTag.
Image Digital Zoom Based Single Target Apriltag Recognition Algorithm in Large Scale Changes on the Distance.

上述资料打包下载地址：https://cloud.189.cn/web/share?code=JRJRJbNfe2ay（访问码：7l96）

AprilTag论文源码地址：https://github.com/AprilRobotics/apriltag

3.AprilTag 原理

检测标签

检测线段
- 计算每个像素处的梯度大小和方向
- 利用聚类算法，将这些像素聚集成一个具有相似梯度方向和大小的分量
- 使用加权最小二乘法，每一点的权值就是该点的梯度的强度，把上述分量拟合成线段
检测矩形（基于深度优先搜索）
- 深度为1：考虑所有的线段
- 深度为2-4：考虑与上一条线段的末端足够近的线段，和服从逆时针方向的线段
单应性变换与外参估计

单应矩阵计算：

由于世界坐标和图像坐标之间的关系，能够通过一个单应矩阵H(homography)联系在一起，公式如下所示：

$s表示深度信息 \qquad \begin{bmatrix} u&v&1\\ \end{bmatrix}^T 表示图像坐标系 \qquad \begin{bmatrix} x_w& y_w& 1 \end{bmatrix}^T表示Z_w=0的世界坐标系$

$s \begin{bmatrix} u\\ v\\ 1\\ \end{bmatrix} = \begin{bmatrix} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & h_{22} \\ \end{bmatrix}\begin{bmatrix} x_w\\ y_w\\ 1\\ \end{bmatrix} \tag{1-1}$

将上式(1-1)展开可得：
$\begin{cases} su=h_{00}x_w+h_{01}y_w+h_{02}\\ sv=h_{10}x_w+h_{11}y_w+h_{12}\\ s=h_{20}x_w+h_{21}y_w+h_{22} \end{cases} \tag{1-2}$

进一步变换消去深度信息s，可得:
$\begin{bmatrix} x_w & y_w & 1 & 0 & 0 & 0 &-ux_w & -uy_w & -u \\ 0 & 0 & 0 & xw & y_w & 1 & -vx_w & -vy_w & -v \\ \end{bmatrix} \begin{bmatrix} h_{00}\\h_{01}\\h_{02}\\h_{10}\\h_{11}\\h_{12}\\h_{20}\\h_{21}\\h_{22} \end{bmatrix}=0 \tag{1-3}$

最终产生一个 $Ah=0$ 的矩阵求解问题，论文中根据从AprilTag标签图像坐标系上提取的四个角点，和在世界坐标系中假设的四个点对应，形成四个点对，然后采用直接线性法(DLT)求解该线性方程组。

外参估计：

到此为止，我们已经得到了单应矩阵H，然而估计标签的位姿还需要额外的信息：相机内参、AprilTag标签的物理尺寸。而这两个信息都能通过事先标定得知。

因此，接下来我们把单应矩阵写成如下公式(2-1)所示的形式：

$s = 比例因子（表示投影过程重点缩放平移关系） \quad \\ P = 3\times4的相机投影矩阵（表示三维世界坐标系到二维图像坐标系的变换）$

$E = 4\times3截断的外在矩阵（旋转矩阵+平移矩阵）$

$\begin{bmatrix} h_{00} & h_{01} & h_{02} \\ h_{10} & h_{11} & h_{12} \\ h_{20} & h_{21} & h_{22} \\ \end{bmatrix} = sPE = s\begin{bmatrix} 1/f_x & 0 & 0 & 0\\ 0 & 1/f_y & 0 & 0\\ 0 & 0 & 1 & 0\\ \end{bmatrix} \begin{bmatrix} R_{00} & R_{01} & T_{x} \\ R_{10} & R_{11} & T_{y} \\ R_{20} & R_{21} & T_{z} \\ 0 & 0 & 1 \end{bmatrix} \tag{2-1}$

将上式(1-1)展开可得：
$\begin{cases} h_{00}=sR_{00}/f_x\\ h_{01}=sR_{01}/f_x\\ h_{02}=sT_{x}/f_x\\ \qquad ......\\ h_{22}=s \end{cases} \tag{2-2}$
现在我们已知了单应矩阵 $h_{ij}$ 和相机内参 $f_x,f_y$ ，因此很容易求得 $R_{ij}$ 和 $T_k$ 。但比例因子 $s$ 还很难求得。所以，我们利用约束条件：旋转矩阵 $R$ 中列向量的模长均为1，能够计算得到 $s$ 的大小；而由标签必须位于摄像机前方可以推导出约束条件比例因子 $s$ 必须保证 $T_z<0$ ，从而得到 $s$ 的符号。

由于旋转矩阵 $R$ 为正交矩阵，因此 $R$ 被截断的列向量可以通过其与前两组列向量的内积为 $0$ 求解。

至此，我们得到了从世界坐标系转换到相机坐标系的旋转矩阵和平移矩阵，即相机的外参矩阵。

解码内容

建立两种阈值模型（超过该阈值即认为”1“，少于该阈值即认为是”0“），一种基于“黑色”像素强度的空间变化，一种基于“白色“像素强度的空间变化模型：
$I(x,y)=Ax+Bxy+Cy+D$
该模型有四个参数，能用最小二乘回归求解。最后阈值大小由两种模型预测的平均值取得。根据不同黑白色块解码标签得到不同的码型，将解码得到的码型与储存在本地的码型库相对比，排除其它码型的干扰，判断是否为正确的标签。

解算位姿

现在我们已知了由世界坐标系至相机坐标系的旋转矩阵 $R$ 和平移矩阵 $T$ ，现在可以根据这两个矩阵，估计相机在世界坐标系下的位置和相机坐标系下标签的位置。

1. 相机在世界坐标系下的位置

根据相机坐标系与世界坐标系之间的关系可知：
$\begin{bmatrix} X_{cam}\\ y_{cam}\\ Z_{cam}\\ \end{bmatrix}=R\begin{bmatrix} X_{world}\\ y_{world}\\ Z_{world}\\ \end{bmatrix}+T \tag{3-1}$

因为相机在相机坐标系下的坐标为 $P_{cam}=[0,0,0]$ ，因此可得：

$\begin{bmatrix} X_{world}\\ y_{world}\\ Z_{world}\\ \end{bmatrix} =-R^{-1}T$

从而得到相机在世界坐标系下的位置。

2.相机坐标系下标签的位置

因为标签中心在世界坐标系下的位置为 $P_{world}=[0,0,0]$ ，因此可得：
$\begin{bmatrix} X_{cam}\\ y_{cam}\\ Z_{cam}\\ \end{bmatrix} = T$
从而得到标签在相机坐标系下的位置。

4.为ApriTag论文源码 opencv_demo.cc 添加位姿估计的代码

运行环境

Ubuntu 18.04

Opencv 3.14

IDE: Clion

推荐在Linux环境下，先安装基于C++的OpenCV lib，再编译执行下列代码。

添加的改进代码如下：

首先在头文件中添加了用于计算线性代数和估计位姿的库（已注释的行）

#include <iostream>
#include "opencv2/opencv.hpp"
#include <eigen3/Eigen/Dense>  //necessary before including the opencv eigen lib
#include <opencv2/core/eigen.hpp> //include linear calculation lib

extern "C" {
#include "apriltag.h"
#include "apriltag_pose.h" //pose estimation lib
#include "tag36h11.h"
#include "tag25h9.h"
#include "tag16h5.h"
#include "tagCircle21h7.h"
#include "tagCircle49h12.h"
#include "tagCustom48h12.h"
#include "tagStandard41h12.h"
#include "tagStandard52h13.h"
#include "common/getopt.h"

其次，为了便于使用，添加了命令行参数代码，用于选择世界坐标系还是相机坐标系为基准

getopt_add_string(getopt, 'a', "axis", "world", "Choose axis to use (world/camera)");
const char *axisname = getopt_get_string(getopt, "axis");
unsigned int coordinate;
if(!strcmp(axisname,"world"))
    coordinate = 1;
else
    coordinate = 0;

接下来，输入相机的内参矩阵和打印出的标签大小（单位为m）

apriltag_detection_info_t info;
info.tagsize = 0.146-0.012*4;   //The size of Tag
info.det = NULL;
info.fx = 848.469;   //focal length of x
info.fy = 847.390;   //focal length of y
info.cx = cap.get(CV_CAP_PROP_FRAME_WIDTH)/2;
info.cy = cap.get(CV_CAP_PROP_FRAME_HEIGHT)/2;

根据解算单应矩阵得到的旋转矩阵和平移矩阵估计相机在世界坐标系中的位置（此处产生了一个问题：由apriltag源代码解算得到的旋转矩阵，在估计相机在世界坐标系中的位置时，不用乘以负号，与上一节公式的推导结果不同，如果谁知道原因还请赐教！）

if (coordinate) 
{
    Mat rvec(3, 3, CV_64FC1, pose.R->data); //rotation matrix
    Mat tvec(3, 1, CV_64FC1, pose.t->data); //translation matrix
    Mat Pos(3, 3, CV_64FC1);
    Pos = rvec.inv() * tvec;              //注意，此处的旋转矩阵的逆矩阵没有乘上负号
    cout << "x: " << Pos.ptr<double>(0)[0] << endl;
    cout << "y: " << Pos.ptr<double>(1)[0] << endl;
    cout << "z: " << Pos.ptr<double>(2)[0] << endl;
    cout << "-----------world--------------" << endl;
}

根据解算单应矩阵得到的旋转矩阵和平移矩阵估计AprilTag标签在相机坐标系中的位置

 else
 {
     cout << "x: " << pose.t ->data[0] << endl;
     cout << "y: " << pose.t ->data[1] << endl;
     cout << "z: " << pose.t ->data[2] << endl;
     cout << "-----------camera-------------" << endl;
 }

完整代码见第六节

5.为Apriltag-cpp添加位姿估计代码

Apriltag-cpp源码的github地址：https://github.com/swatbotics/apriltags-cpp

该代码由第三方开发者用C++重构了AprilTag论文源代码。

在OpenCV环境下编译前，只需要在camtest.cpp的第283行添加如下代码，（注意这里旋转矩阵 $R$ 需要和公式中推导的一样，在前面添加负号），然后编译执行即可。

cv::Mat R;
cv::Rodrigues(r, R);
cv::Mat Pos(3,3,CV_64F);
Pos = -R.inv() * t; //注意此处的 -R
std::cout << "x: " << Pos.ptr<double>(0)[0] << std::endl;
std::cout << "y: " << Pos.ptr<double>(1)[0] << std::endl;
std::cout << "z: " << Pos.ptr<double>(2)[0] << std::endl;
std::cout << "-----------world--------------" << std::endl;

即可解算出世界坐标系下相机的位置。如果需要计算相机坐标系下标签的位置，则与上一节中的代码同理。

6.opencv_demo.cc的完整代码：

GitHub地址：https://github.com/Reza666-cloud/apriltag-with-pose-estimation/blob/master/example/opencv_demo.cc

#include <iostream>
#include "opencv2/opencv.hpp"
#include <eigen3/Eigen/Dense>  //necessary before including the opencv eigen lib
#include <opencv2/core/eigen.hpp> //include linear calculation lib

extern "C" {
#include "apriltag.h"
#include "apriltag_pose.h" //pose estimation lib
#include "tag36h11.h"
#include "tag25h9.h"
#include "tag16h5.h"
#include "tagCircle21h7.h"
#include "tagCircle49h12.h"
#include "tagCustom48h12.h"
#include "tagStandard41h12.h"
#include "tagStandard52h13.h"
#include "common/getopt.h"
}

using namespace std;
using namespace cv;


int main(int argc, char *argv[])
{
    getopt_t *getopt = getopt_create();

    getopt_add_bool(getopt, 'h', "help", 0, "Show this help");
    getopt_add_bool(getopt, 'd', "debug", 0, "Enable debugging output (slow)");
    getopt_add_bool(getopt, 'q', "quiet", 0, "Reduce output");
    getopt_add_string(getopt, 'f', "family", "tag36h11", "Tag family to use");
    getopt_add_int(getopt, 't', "threads", "1", "Use this many CPU threads");
    getopt_add_double(getopt, 'x', "decimate", "2.0", "Decimate input image by this factor");
    getopt_add_double(getopt, 'b', "blur", "0.0", "Apply low-pass blur to input");
    getopt_add_bool(getopt, '0', "refine-edges", 1, "Spend more time trying to align edges of tags");
    getopt_add_string(getopt, 'a', "axis", "world", "Choose axis to use (world/camera)");


    if (!getopt_parse(getopt, argc, argv, 1) ||
            getopt_get_bool(getopt, "help")) {
        printf("Usage: %s [options]\n", argv[0]);
        getopt_do_usage(getopt);
        exit(0);
    }

    // Initialize camera
    VideoCapture cap(0);
    if (!cap.isOpened()) {
        cerr << "Couldn't open video capture device" << endl;
        return -1;
    }

    // Initialize tag detector with options
    apriltag_family_t *tf = NULL;
    const char *famname = getopt_get_string(getopt, "family");
    if (!strcmp(famname, "tag36h11")) {
        tf = tag36h11_create();
    } else if (!strcmp(famname, "tag25h9")) {
        tf = tag25h9_create();
    } else if (!strcmp(famname, "tag16h5")) {
        tf = tag16h5_create();
    } else if (!strcmp(famname, "tagCircle21h7")) {
        tf = tagCircle21h7_create();
    } else if (!strcmp(famname, "tagCircle49h12")) {
        tf = tagCircle49h12_create();
    } else if (!strcmp(famname, "tagStandard41h12")) {
        tf = tagStandard41h12_create();
    } else if (!strcmp(famname, "tagStandard52h13")) {
        tf = tagStandard52h13_create();
    } else if (!strcmp(famname, "tagCustom48h12")) {
        tf = tagCustom48h12_create();
    } else {
        printf("Unrecognized tag family name. Use e.g. \"tag36h11\".\n");
        exit(-1);
    }

    const char *axisname = getopt_get_string(getopt, "axis");
    unsigned int coordinate;
    if(!strcmp(axisname,"world"))
        coordinate = 1;
    else
        coordinate = 0;

//  input camera intrinsic matrix
    apriltag_detection_info_t info;
    info.tagsize = 0.146-0.012*4;   //The size of Tag
    info.det = NULL;
    info.fx = 848.469;   //focal length of x
    info.fy = 847.390;   //focal length of y
    info.cx = cap.get(CV_CAP_PROP_FRAME_WIDTH)/2;
    info.cy = cap.get(CV_CAP_PROP_FRAME_HEIGHT)/2;

    apriltag_detector_t *td = apriltag_detector_create();
    apriltag_detector_add_family(td, tf);
    td->quad_decimate = getopt_get_double(getopt, "decimate");
    td->quad_sigma = getopt_get_double(getopt, "blur");
    td->nthreads = getopt_get_int(getopt, "threads");
    td->debug = getopt_get_bool(getopt, "debug");
    td->refine_edges = getopt_get_bool(getopt, "refine-edges");

    Mat frame, gray;
    while (true) {
        cap >> frame;
        cvtColor(frame, gray, COLOR_BGR2GRAY);

        // Make an image_u8_t header for the Mat data
        image_u8_t im = { .width = gray.cols,
            .height = gray.rows,
            .stride = gray.cols,
            .buf = gray.data
        };

        zarray_t *detections = apriltag_detector_detect(td, &im);

        // Draw detection outlines
        for (int i = 0; i < zarray_size(detections); i++) {
            apriltag_detection_t *det;
            zarray_get(detections, i, &det);
            info.det = det;
            apriltag_pose_t pose;
            estimate_tag_pose(&info, &pose);
            //calculate cam position in world coordinate
            if (coordinate)
            {
                Mat rvec(3, 3, CV_64FC1, pose.R->data); //rotation matrix
                Mat tvec(3, 1, CV_64FC1, pose.t->data); //translation matrix
                Mat Pos(3, 3, CV_64FC1);
                Pos = rvec.inv() * tvec;
                cout << "x: " << Pos.ptr<double>(0)[0] << endl;
                cout << "y: " << Pos.ptr<double>(1)[0] << endl;
                cout << "z: " << Pos.ptr<double>(2)[0] << endl;
                cout << "-----------world--------------" << endl;
            }
            else
            {
                cout << "x: " << pose.t ->data[0] << endl;
                cout << "y: " << pose.t ->data[1] << endl;
                cout << "z: " << pose.t ->data[2] << endl;
                cout << "-----------camera-------------" << endl;
            }

            //draw the line and show tag ID
            line(frame, Point(det->p[0][0], det->p[0][1]),
                     Point(det->p[1][0], det->p[1][1]),
                     Scalar(0, 0xff, 0), 2);
            line(frame, Point(det->p[0][0], det->p[0][1]),
                     Point(det->p[3][0], det->p[3][1]),
                     Scalar(0, 0, 0xff), 2);
            line(frame, Point(det->p[1][0], det->p[1][1]),
                     Point(det->p[2][0], det->p[2][1]),
                     Scalar(0xff, 0, 0), 2);
            line(frame, Point(det->p[2][0], det->p[2][1]),
                     Point(det->p[3][0], det->p[3][1]),
                     Scalar(0xff, 0, 0), 2);

            stringstream ss;
            ss << det->id;
            String text = ss.str();
            int fontface = FONT_HERSHEY_SCRIPT_SIMPLEX;
            double fontscale = 1.0;
            int baseline;
            Size textsize = getTextSize(text, fontface, fontscale, 2,
                                            &baseline);
            putText(frame, text, Point(det->c[0]-textsize.width/2,
                                       det->c[1]+textsize.height/2),
                    fontface, fontscale, Scalar(0xff, 0x99, 0), 2);
        }
        apriltag_detections_destroy(detections);

        imshow("Tag Detections", frame);
        if (waitKey(30) >= 0)
            break;
    }

    apriltag_detector_destroy(td);

    if (!strcmp(famname, "tag36h11")) {
        tag36h11_destroy(tf);
    } else if (!strcmp(famname, "tag25h9")) {
        tag25h9_destroy(tf);
    } else if (!strcmp(famname, "tag16h5")) {
        tag16h5_destroy(tf);
    } else if (!strcmp(famname, "tagCircle21h7")) {
        tagCircle21h7_destroy(tf);
    } else if (!strcmp(famname, "tagCircle49h12")) {
        tagCircle49h12_destroy(tf);
    } else if (!strcmp(famname, "tagStandard41h12")) {
        tagStandard41h12_destroy(tf);
    } else if (!strcmp(famname, "tagStandard52h13")) {
        tagStandard52h13_destroy(tf);
    } else if (!strcmp(famname, "tagCustom48h12")) {
        tagCustom48h12_destroy(tf);
    }

    getopt_destroy(getopt);
    return 0;
}

参考资料：

三维空间坐标系转换：https://blog.csdn.net/fireflychh/article/details/82352710

相机位姿估计：https://www.cnblogs.com/singlex/p/pose_estimation_0.html

AprilTag原理：https://www.cnblogs.com/brt2/p/13547075.html

https://zhuanlan.zhihu.com/p/53367734

最后编辑于：2021.07.12 14:51:50

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 159,716评论 4赞 364
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 67,558评论 1赞 294
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 109,431评论 0赞 244
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 44,127评论 0赞 209
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 52,511评论 3赞 287
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 40,692评论 1赞 222
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 31,915评论 2赞 313
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 30,664评论 0赞 202
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 34,412评论 1赞 246
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 30,616评论 2赞 245
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 32,105评论 1赞 260
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 28,424评论 2赞 254
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 33,098评论 3赞 238
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 26,096评论 0赞 8
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 26,869评论 0赞 197
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 35,748评论 2赞 276
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 35,641评论 2赞 271