当前位置：首页 > ds >正文

【Depth与RGB对齐算法(D2C)】

ds 2025/8/29 18:35:04

本文介绍一下RGBD相机中深度图像与RGB图像的对齐算法。大部分RGBD相机是自带RGB与深度图像对齐功能的，开发者可以很方便地调用。但有的开发者使用的RGBD相机是自己用独立的RGB相机与深度相机组建而成的，这时候通常要自己实现RGB与深度对齐的算法。本篇博客的算法主要参考Realsense SDK中的代码。

D2C算法的核心，是解决深度图像从一个相机的图像坐标系到另外一个相机图像坐标系的映射问题，特别是在两个相机分辨率、焦距不同的情况下，如何又快又好地实现深度图像的映射。

实现D2C对齐的前提是需要知道两个相机的内参和相对外参。两个相机内参和相对外参可以通过标定获得。

一个简单直接的想法是，首先根据深度相机的内参，将深度图像转化为深度相机坐标系中的点云，然后将点云根据深度相机和RGB相机的相对外参，转换到RGB相机坐标系，最后根据RGB相机的内参，投影转换后的点云到RGB相机的图像坐标系，把点云Z值赋给最近邻的整像素。

用数学公式描述这一过程。首先，根据深度相机内参，将深度 $Depth\text{Depth}$ 转换为点云 $PDepthP_{\text{Depth}}$
$PDepth=[(u−cx)/fx∗Depth(u,v)(v−cy)/fy∗Depth(u,v)Depth(u,v)]P_{\text{Depth}}=\begin{bmatrix}(u-c_x)/f_x*\text{Depth}(u,v)\\(v-c_y)/f_y*\text{Depth}(u,v)\\\text{Depth}(u,v)\end{bmatrix}$ 其中， $(u, v)$ 为深度相机的图像坐标， $f_x,f_y,c_x,c_y$ 为深度相机内参的焦距和主点， $Depth\text{Depth}$ 为深度值。其次，根据相对外参，将深度相机坐标系中的点云 $PDepthP_{\text{Depth}}$ 转换为RGB相机坐标系中的点云 $PRGBP_{\text{RGB}}$
$PRGB=[XRGBYRGBZRGB]=R⋅PDepth+TP_{\text{RGB}} = \begin{bmatrix} X_{\text{RGB}}\\ Y_{\text{RGB}}\\ Z_{\text{RGB}} \end{bmatrix}=R \cdot P_{\text{Depth}}+T$ 其中， $R, T$ 为相对外参的旋转和平移矩阵。最后，将转换的点云 $PRGBP_{\text{RGB}}$ 投影到RGB相机的图像坐标系，将转换点云的Z值赋值给最近邻的图像坐标
$=f_x'X_{\text{RGB}}/Z_{\text{RGB}}+c_x'\\ v' =f_y'Y_{\text{RGB}}/Z_{\text{RGB}}+c_y'\\ \text{Depth}_\text{Aligned}(\text{round}(u'),\text{round}(v')) = Z_{\text{RGB}}$ 其中， $(u^{'}, v^{'})$ 为RGB相机的图像坐标， $f_x',f_y',c_x',c_y'$ 为RGB相机内参的焦距和主点， $DepthAligned\text{Depth}_\text{Aligned}$ 为对齐深度。

然而，由于RGB相机通常比深度相机有更高的图像分辨率，此方法得到的对齐深度图会存在很多空缺的像素。

本文介绍的算法是此想法的改进。核心思想是把深度图中的像素看成一个1x1大小矩形，而非一个点。对于深度图中所有的像素，计算以其为中心的矩形左上点和右下点在RGB相机图像坐标系中的映射坐标，将映射坐标所构成矩形区域中的所有像素，赋值为矩形左上点和右下点转换后点云Z值的均值。当像素已被赋值时，则取已有值和新赋值的最小值，以正确处理坐标转换带来的遮挡问题。

算法代码如下：

// d2c.h
#pragma once
#include <cstdint>struct Intrinsics { //内参int width, height; //图像长宽float fx, fy; //焦距float cx, cy; //主点
};struct Extrinsics { //外参float R[9]; //旋转矩阵float T[3]; //平移矩阵
};void DepthToColor(const uint16_t* depth, uint16_t* aligned_depth, const Intrinsics& depth_intrinsics, const Intrinsics& color_intrinsics, const Extrinsics& extrinsics);

//d2c.cpp
#include "d2c.h"
#include <algorithm>//将深度值根据深度相机内参转换为点云
static inline void DeprojectPointToPoint(float point[3], const Intrinsics& intrinsics, const float pixel[2], float depth) {point[0] = (pixel[0] - intrinsics.cx) * depth / intrinsics.fx;point[1] = (pixel[1] - intrinsics.cy) * depth / intrinsics.fy;point[2] = depth;
}
//根据外参，将深度相机坐标系中的点云转换到RGB相机坐标系
static inline void TransformPointToPoint(float to[3], const Extrinsics& extrinsics, const float from[3]) {to[0] = extrinsics.R[0] * from[0] + extrinsics.R[1] * from[1] + extrinsics.R[2] * from[2] + extrinsics.T[0];to[1] = extrinsics.R[3] * from[0] + extrinsics.R[4] * from[1] + extrinsics.R[5] * from[2] + extrinsics.T[1];to[2] = extrinsics.R[6] * from[0] + extrinsics.R[7] * from[1] + extrinsics.R[8] * from[2] + extrinsics.T[2];
}
//将RGB相机坐标系的点云投影为图像坐标
static inline void ProjectPointToPixel(float pixel[2], const Intrinsics& intrinsics, const float point[3]) {pixel[0] = (point[0] * intrinsics.fx / point[2]) + intrinsics.cx;pixel[1] = (point[1] * intrinsics.fy / point[2]) + intrinsics.cy;
}
//深度RGB对齐
void DepthToColor(const uint16_t* depth, uint16_t* aligned_depth, const Intrinsics& depth_intrinsics, const Intrinsics& color_intrinsics, const Extrinsics& extrinsics) {for (int depth_y = 0; depth_y < depth_intrinsics.height; ++depth_y) {int depth_pixel_index = depth_y * depth_intrinsics.width;for (int depth_x = 0; depth_x < depth_intrinsics.width; ++depth_x, ++depth_pixel_index) {//跳过深度值为0的像素if (uint16_t z = depth[depth_pixel_index]) {//将左上角点{ depth_x - 0.5f, depth_y - 0.5f }映射到RGB图像坐标float depth_pixel[2] = { depth_x - 0.5f, depth_y - 0.5f }, depth_point[3], other_point0[3], other_point1[3], other_pixel[2];DeprojectPointToPoint(depth_point, depth_intrinsics, depth_pixel, z);TransformPointToPoint(other_point0, extrinsics, depth_point);ProjectPointToPixel(other_pixel, color_intrinsics, other_point0);const int other_x0 = static_cast<int>(other_pixel[0] + 0.5f);//快速四舍五入const int other_y0 = static_cast<int>(other_pixel[1] + 0.5f);//将右下角点{ depth_x + 0.5f, depth_y + 0.5f }映射到RGB图像坐标depth_pixel[0] = depth_x + 0.5f; depth_pixel[1] = depth_y + 0.5f;DeprojectPointToPoint(depth_point, depth_intrinsics, depth_pixel, z);TransformPointToPoint(other_point1, extrinsics, depth_point);ProjectPointToPixel(other_pixel, color_intrinsics, other_point1);const int other_x1 = static_cast<int>(other_pixel[0] + 0.5f);const int other_y1 = static_cast<int>(other_pixel[1] + 0.5f);//映射后的矩形超出RGB图像范围则跳过if (other_x0 < 0 || other_y0 < 0 || other_x1 >= color_intrinsics.width || other_y1 >= color_intrinsics.height)continue;//计算转换点云的Z值均值const uint16_t new_z = (other_point0[2] + other_point1[2]) * 0.5;//对映射后矩阵中的像素进行赋值for (int y = other_y0; y <= other_y1; ++y) {for (int x = other_x0; x <= other_x1; ++x) {const int index = y * color_intrinsics.width + x;const uint16_t aligned_z = aligned_depth[index];aligned_depth[index] = aligned_z ? std::min(aligned_z, new_z) : new_z;//如果已有值，则取两者最小值}}}}}
}