RK3588 MNN CPU/Vulkan/OpenCL ResNet50 Inference Test
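- 1. Background
- 1.1 RK3588 chip features
- 1.2 Why MNN?
- 1.3 Test goals
- 2. Reference links
- 3. Procedure
- 3.1 Setting up the Vulkan environment
- 3.2 Installing the OpenCL environment
- 3.3 Running the `relu` operator with Vulkan
- 3.3.1 Installing `glslang-tools`
- 3.3.2 Writing the compute shader (`relu.comp`)
- 3.3.3 Generating the C++ code (`main.cpp`)
- 3.3.4 Compiling the compute shader
- 3.3.5 Compiling the C++ program
- 3.3.6 Running the program
- 3.4 Running `resnet50` inference with MNN
- 3.4.1 Building MNN
- 3.4.2 Generating the ONNX model, the calibration images, and the quantization config file
- 3.4.3 Converting the model
- 3.4.4 Generating, compiling, and running the test program
- 3.4.5 Analysis of the inference performance data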
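1. Background
1.1 RK3588 chip features
The Rockchip RK3588 is a high-performance SoC aimed at AIoT applications. It is built on an 8nm process and integrates:
- Four Cortex-A76 and four Cortex-A55 cores in a big.LITTLE arrangement
- A Mali-G610 MP4 GPU (supports Vulkan 1.2 and OpenCL 2.2)
- A 6 TOPS NPU (not covered in this test)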
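1.2 Why MNN?
MNN (Mobile Neural Network), the inference engine open-sourced by Alibaba, has the following advantages:
- Multi-platform support: covers iOS, Android, Linux, and Windows
- Heterogeneous computing: supports CPU, GPU, and NPU backends
- Lightweight: the core library is only about 500KB
- Quantization: supports FP16 and INT8 quantization for compression and speed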
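1.3 Test goals
The ResNet50 model is used to compare the performance of different compute backends:
- CPU: general-purpose computation, used to establish the baseline
- Vulkan: a modern cross-platform graphics and compute API with low-overhead parallel computation
- OpenCL: a general heterogeneous computing standard that supports several types of accelerator
- Quantization comparison: finding the balance point between accuracy and speed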
2. Reference links
- Mali610Vulkan
3. Procedure
3.1 Setting up the Vulkan environment
# Install the official Mali GPU driver (includes Vulkan support)
wget https://repo.rock-chips.com/edge/debian-release-v2.0.0/pool/main/r/rockchip-mali/rockchip-mali_1.9-12_arm64.deb
sudo dpkg -i rockchip-mali_1.9-12_arm64.deb
# Create a symlink so that the dynamic library can be found
sudo ln -s /usr/lib/aarch64-linux-gnu/libmali-valhall-g610-g6p0-wayland-gbm-vulkan.so /usr/lib/aarch64-linux-gnu/libmali.so
# Write the Vulkan driver manifest (ICD file)
sudo mkdir -p /etc/vulkan/icd.d/
echo '{
    "file_format_version": "1.0.0",
    "ICD": {
        "library_path": "/usr/lib/aarch64-linux-gnu/libmali-valhall-g610-g6p0-wayland-gbm-vulkan.so",
        "api_version": "1.0.0"
    }
}' | sudo tee /etc/vulkan/icd.d/mali.json
# Install the Vulkan tools and check the installation
apt install vulkan-tools vulkan-utils -y
vulkaninfo
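Key points of the configuration:
- `libmali.so` is the unified entry point of the Mali GPU driver
- The ICD (Installable Client Driver) file declares the path of the Vulkan driver library
- The `vulkaninfo` tool verifies that the driver was installed successfully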
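Before running `vulkaninfo`, it can help to confirm that the manifest written above is valid JSON and that its `library_path` points at a file that exists. The following is a minimal sketch under that assumption; the helper name `check_icd_manifest` is hypothetical and is not part of any Vulkan tooling.

```python
import json
import os


def check_icd_manifest(manifest_path):
    """Return a list of problems found in a Vulkan ICD manifest (empty if none).

    Hypothetical helper: it only checks the fields used in this article.
    """
    try:
        with open(manifest_path, encoding="utf-8") as f:
            manifest = json.load(f)
    except OSError as exc:
        return [f"cannot read manifest: {exc}"]
    except ValueError as exc:  # includes json.JSONDecodeError
        return [f"manifest is not valid JSON: {exc}"]

    if not isinstance(manifest, dict) or not isinstance(manifest.get("ICD"), dict):
        return ["missing 'ICD' object"]

    problems = []
    if "file_format_version" not in manifest:
        problems.append("missing 'file_format_version'")

    library_path = manifest["ICD"].get("library_path")
    if not library_path:
        problems.append("missing 'ICD.library_path'")
    elif os.path.isabs(library_path) and not os.path.exists(library_path):
        problems.append(f"driver library not found: {library_path}")
    return problems


if __name__ == "__main__":
    # Path taken from the setup commands in section 3.1.
    for problem in check_icd_manifest("/etc/vulkan/icd.d/mali.json"):
        print("WARNING:", problem)
```

An empty result only means the manifest is well-formed; `vulkaninfo` remains the real test that the driver loads.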
If everything is working, the console prints:
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
'DISPLAY' environment variable not set... skipping surface info
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
==========
VULKANINFO
==========
Vulkan Instance Version: 1.2.131
Instance Extensions: count = 10
====================
    VK_EXT_debug_report : extension revision 9
    VK_EXT_debug_utils : extension revision 1
    VK_EXT_headless_surface : extension revision 1
    VK_KHR_device_group_creation : extension revision 1
    VK_KHR_display : extension revision 23
    VK_KHR_external_fence_capabilities : extension revision 1
    VK_KHR_external_memory_capabilities : extension revision 1
    VK_KHR_external_semaphore_capabilities : extension revision 1
    VK_KHR_get_physical_device_properties2 : extension revision 2
    VK_KHR_surface : extension revision 25
Layers: count = 0
=======
Presentable Surfaces:
=====================
Groups:
=======
    Device Group Properties (Group 0):
        physicalDeviceCount: count = 1
            Mali-LODX (ID: 0)
        subsetAllocation = 0
    Device Group Present Capabilities (Group 0):
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
        Mali-LODX (ID: 0)
        Can present images from the following devices:
            Mali-LODX (ID: 0)
        Present modes:
            DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
    apiVersion = 4202661 (1.2.165)
    driverVersion = 25165824 (0x1800000)
    vendorID = 0x13b5
    deviceID = 0xa8670000
    deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
    deviceName = Mali-LODX
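The `apiVersion` field is a packed integer, and `vulkaninfo` shows the decoded form in parentheses. As a small worked example, the sketch below applies the standard Vulkan bit layout (7 bits major, 10 bits minor, 12 bits patch) to the value from the output above. The function name `decode_vulkan_version` is only illustrative.

```python
def decode_vulkan_version(packed):
    """Split a packed Vulkan API version into (major, minor, patch)."""
    major = (packed >> 22) & 0x7F   # bits 22..28
    minor = (packed >> 12) & 0x3FF  # bits 12..21
    patch = packed & 0xFFF          # bits 0..11
    return major, minor, patch


# apiVersion reported by the Mali driver in the vulkaninfo output
print(decode_vulkan_version(4202661))  # (1, 2, 165)
```

This applies to `apiVersion` only. The encoding of `driverVersion` is chosen by the driver vendor, so the same decoding is not guaranteed to be meaningful for it.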
3.2 Installing the OpenCL environment
# Replace the system default OpenCL library with the Mali driver
mv /lib/aarch64-linux-gnu/libOpenCL.so.1 /lib/aarch64-linux-gnu/libOpenCL.so.1.bk
ln -s /usr/lib/aarch64-linux-gnu/libmali.so /lib/aarch64-linux-gnu/libOpenCL.so.1
# Install the development toolchain
sudo apt install -y opencl-headers
sudo apt install -y ocl-icd-libopencl1
sudo apt install -y ocl-icd-opencl-dev
sudo apt install -y clinfo
clinfo
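Besides `clinfo`, a script can check that `libOpenCL.so.1` loads and reports at least one platform. The sketch below does this through `ctypes` with the standard `clGetPlatformIDs` entry point. The wrapper name `count_opencl_platforms` is hypothetical, and the sketch assumes the library exports the symbol directly, as an OpenCL implementation normally does.

```python
import ctypes


def count_opencl_platforms(library_name="libOpenCL.so.1"):
    """Return the number of OpenCL platforms, or None if the library is unusable."""
    try:
        lib = ctypes.CDLL(library_name)
        get_platform_ids = lib.clGetPlatformIDs
    except (OSError, AttributeError):
        # Library missing, not loadable, or without the expected symbol.
        return None

    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    get_platform_ids.argtypes = [
        ctypes.c_uint32,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint32),
    ]
    get_platform_ids.restype = ctypes.c_int32

    count = ctypes.c_uint32(0)
    status = get_platform_ids(0, None, ctypes.byref(count))
    if status != 0:  # CL_SUCCESS is 0; any error is treated as "no platforms"
        return 0
    return count.value


if __name__ == "__main__":
    print("OpenCL platforms:", count_opencl_platforms())
```

A result of `None` suggests the symlink or the library itself is the problem, while `0` suggests the library loads but exposes no platform.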
3.3 Running the `relu` operator with Vulkan
3.3.1 Installing `glslang-tools`
apt install glslang-tools -y
3.3.2 Writing the compute shader (`relu.comp`)
- How the compute shader works
ReLU (Rectified Linear Unit) is a commonly used activation function in deep learning. Its mathematical expression is:
$$f(x) = \max(0, x)$$
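A CPU reference for the same formula is useful later for checking the values the GPU writes back. The following is a minimal pure-Python sketch. The names `relu` and `outputs_match`, and the tolerance, are illustrative choices and do not come from the article's test program.

```python
def relu(values):
    """Apply f(x) = max(0, x) to every element of a sequence."""
    return [max(0.0, float(v)) for v in values]


def outputs_match(gpu_output, inputs, tolerance=1e-6):
    """Compare values read back from the GPU against the CPU reference."""
    reference = relu(inputs)
    if len(gpu_output) != len(reference):
        return False
    return all(abs(g - r) <= tolerance for g, r in zip(gpu_output, reference))


print(relu([-2.0, -0.5, 0.0, 1.5]))  # [0.0, 0.0, 0.0, 1.5]
```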
- Generating the GLSL shader code
cat > relu.comp <<-'EOF'
#version 450
layout(local_size_x = 256) in; // 256 threads per workgroup
layout(binding = 0) buffer InputBuffer {
    float inputData[];
};
layout(binding = 1) buffer OutputBuffer {
    float outputData[];
};
void main() {
    uint idx = gl_GlobalInvocationID.x; // global thread index
    outputData[idx] = max(inputData[idx], 0.0);
}
EOF
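Notes:
- `layout(local_size_x = 256) in;` sets the workgroup size of the compute shader.
- `binding = 0` and `binding = 1` bind the input and output buffers respectively.
- `gl_GlobalInvocationID.x` gives the global thread ID, so that the threads together cover every data element.
- `max(inputData[idx], 0.0)` implements ReLU: for each element, the output is the larger of the element and zero.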
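Because the shader has no bounds check, every dispatched thread reads and writes its own index. The host therefore has to dispatch enough workgroups to cover the data, and the buffers should be at least as long as the total thread count. The sketch below works through that arithmetic and simulates the dispatch on the CPU. Padding the buffers up to a multiple of 256 is an assumption made here for illustration, and the article's `main.cpp` may handle sizing differently.

```python
LOCAL_SIZE_X = 256  # must match layout(local_size_x = 256) in relu.comp


def workgroup_count(num_elements, local_size=LOCAL_SIZE_X):
    """Number of workgroups needed to cover num_elements (ceiling division)."""
    return (num_elements + local_size - 1) // local_size


def padded_length(num_elements, local_size=LOCAL_SIZE_X):
    """Buffer length rounded up to a whole number of workgroups."""
    return workgroup_count(num_elements, local_size) * local_size


def simulate_dispatch(input_data, local_size=LOCAL_SIZE_X):
    """CPU simulation of the shader: one loop iteration per GPU thread."""
    n = len(input_data)
    total = padded_length(n, local_size)
    in_buf = list(input_data) + [0.0] * (total - n)  # zero padding (assumption)
    out_buf = [0.0] * total
    for group in range(workgroup_count(n, local_size)):
        for local in range(local_size):
            idx = group * local_size + local  # gl_GlobalInvocationID.x
            out_buf[idx] = max(in_buf[idx], 0.0)
    return out_buf[:n]


print(workgroup_count(1000), padded_length(1000))  # 4 1024
```

For 1000 elements, four workgroups launch 1024 threads, so the last 24 threads touch only the padding.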
- Vulkan execution flow