当前位置: 首页 > ai >正文

.QOI: Lossless Image Compression in O(n) Time

专栏传送:[QOI] qoi_desc | qoi_encode | qoi_decode
(↓英文翻译,当学英语素材来的…😗)

Lossless Image Compression in O(n) Time

O(n) 时间内的无损图像压缩

Introducing QOI — the Quite OK Image Format. It losslessly compresses RGB and RGBA images to a similar size of PNG, while offering a 20x-50x speedup in compression and 3x-4x speedup in decompression. All single-threaded, no SIMD. It’s also stupidly simple.
介绍 QOI——相当不错的图像格式。它无损地将 RGB 和 RGBA 图像压缩为相似大小的 PNG,同时提供 20 倍至 50 倍的压缩速度和 3 倍-4 倍的解压缩速度

全单线程,无 SIMD。这也简单得愚蠢。

tl;dr: 300 lines of C, single header, source on github, benchmark results here.
tl;dr:300 行 C,单个标头,github 上的源代码,基准测试结果在这里。

在这里插入图片描述


QOI compression

QOI 压缩

I want to preface this article by saying that I have no idea what I’m doing. I’m not a compression guy. I barely understand how Huffman Coding and DCT works. Luckily, QOI uses neither.
我想在这篇文章的序言中说,我不知道我在做什么。我不是一个压缩的人。我几乎不明白霍夫曼编码DCT 是如何工作的。幸运的是,QOI 两者都没有使用。

I was just tinkering with some ideas that I thought would maybe compress images. The result surprised me quite a bit.
我只是在修补一些我认为可能会压缩图像的想法。结果让我大吃一惊。

霍夫曼编码

在这里插入图片描述

一种压缩数据的算法,通过给出现频率高的符号分配短码、频率低的分配长码,减少整体存储空间。

DCT(离散余弦变换)

将图像或信号从空间域转换到频率域的技术,保留主要能量成分便于压缩(如JPEG),丢弃人眼不敏感的高频细节。


Why? A Short Rant.

为什么?简短的咆哮。

File formats. They suck. I absolutely loathe the usual suspects. PNG, JPEG or even worse MPEG, MOV, MP4. They burst with complexity at the seams. Every tiny aspect screams “design by consortium”.
文件格式。他们很糟糕。我绝对厌恶那些常见的嫌疑人。PNG、JPEG 甚至更糟糕的 MEPG、MOV、MP4。它们在接缝处爆发出复杂性。每一个微小的方面都在尖叫"由联盟设计"。

A while ago I dabbled into MPEG a bit. The basic ideas for video compression in MPEG are ingenious, even more so for 1993, but the resulting file format is an abomination.
不久前,我稍微涉足了 MPEG。MPEG 中视频压缩的基本思想很巧妙,对于 1993 年来说更是如此,但最终的文件格式却令人厌恶。

I can almost picture the meeting of the Moving Picture Experts Group where some random suit demanded there to be a way to indicate a video stream is copyrighted. And thus, the copyright bit flag made its way into the standard and successfully stopped movie piracy before it even began.
我几乎可以想象运动图像专家组的会议,其中一些随机的诉讼要求有一种方法来表明视频流受版权保护。因此,copyright 位标志进入了标准,并在电影盗版开始之前就成功地阻止了它。

MPEG, an industry standard conceived 3 decades past, all patents long expired, all professional interest abandoned. Yet, the holy specification — there named ISO/IEC 11172-2 — is a well guarded secret, revealed only to those that fork over a cool $200 to endow the sacred work of the ISO.
MPEG,一个3年前构思的行业标准,所有专利早已过期,所有专业利益都被放弃了。然而,这个神圣的规范——被命名为 ISO/IEC 11172-2——是一个严密保守的秘密,只有那些花 200 美元以上来捐赠 ISO 神圣工作的人才会知道。

Alternative open video codecs exist, but are again immensely complex. They compete with the state of the art, require huge libraries, are compute hungry and difficult to work with. Alternatives for PNG all compete on the compression ratio too, with ever increasing complexity.
存在替代的开放视频编解码器,但同样非常复杂。它们与最先进的技术竞争,需要庞大的库,需要大量计算并且难以使用。PNG 的替代品也都在压缩比上竞争,复杂性不断增加。

There absolutely is a market for video, audio and image codecs that trade compression ratio for speed and simplicity, but no one is serving it. (Well, these guys maybe, but it’s all proprietary.)
视频、音频和图像编解码器绝对有一个市场,它们以压缩比换取速度和简单性,但没有人为它服务。(好吧,这些家伙也许是,但这都是专有的。)

Yes, stb_image saved us all from the pains of dealing with libpng and is therefore used in countless games and apps. A while ago I aimed to do the same for video with pl_mpeg, with some success.
是的,stb_image使我们所有人免于处理 libpng 的痛苦,因此被用于无数的游戏和应用程序中。不久前,我的目标是用pl_mpeg对视频做同样的事情,并取得了一些成功。

But with all that we learned, why did no one go back and implement a simple compression scheme to compete with PNG, but without the cruft? Why did no one implement a simple video compression scheme similar to MPEG, but in a sane file format instead?
但是,有了我们学到的一切,为什么没有人回去实施一个简单的压缩方案来与巴布亚新几内亚竞争,但又没有垃圾呢?为什么没有人实现类似于 MPEG 的简单视频压缩方案,而是以合理的文件格式实现?


Technical Details

技术细节

QOI encodes and decodes images in a single pass. It touches every pixel just once.
QOI 在一次通过中对图像进行编码和解码。它只接触每个像素一次。

Pixels are encoded as:
像素编码为:

  1. A run of the previous pixel
  2. 上一个像素的运行
  3. An index into an array of previously seen pixels
  4. 以前看到的像素数组的索引
  5. A difference to the previous pixel value in r,g,b
  6. 与之前的像素值(以 r,g,b 为单位)的差异
  7. Full r,g,b or r,g,b,a values
  8. 完整的 r,g,b 或 r,g,b,a 值

The resulting values are packed into chunks starting with a 2- or 8-bit tag (indicating one of those methods) followed by a number of data bits. All of these chunks (tag and data bits) are byte aligned, so there’s no bit twiddling needed between those chunks.
结果值被打包成块,以 2 位或 8 位 tag (表示其中一种方法)开始,后跟多个数据位。所有这些块(tag 和数据位)都是字节对齐的,因此这些块之间不需要位摆动


Chunk Types

块类型

  1. QOI_OP_RUN
id: qoi-op-run
name: 运行块结构
type: mermaid
content: |-┌─ QOI_OP_RUN ────────────┐│         Byte[0]         ││  7  6  5  4  3  2  1  0 ││───────┼─────────────────││  1  1 │       run       │└───────┴─────────────────┘
  • 2-bit tag b11
  • 6-bit run-length (1…62)
  1. QOI_OP_INDEX
id: qoi-op-index
name: 索引块结构
type: mermaid
content: |-┌─ QOI_OP_INDEX ──────────┐│         Byte[0]         ││  7  6  5  4  3  2  1  0 ││───────┼─────────────────││  0  0 │     index       │└───────┴─────────────────┘
  • 2-bit tag b00
  • 6-bit index (0…63)
  1. QOI_OP_DIFF
id: qoi-op-diff
name: 差值块结构
type: mermaid
content: |-┌─ QOI_OP_DIFF ───────────┐│         Byte[0]         ││  7  6  5  4  3  2  1  0 ││───────┼─────┼─────┼─────││  0  1 │  dr │  dg │  db │└───────┴─────┴─────┴─────┘
  • 2-bit tag b01
  • 2-bit RGB差值 (-2…1)
  1. QOI_OP_LUMA
id: qoi-op-luma
name: 亮度块结构
type: mermaid
content: |-┌─ QOI_OP_LUMA ───────────┬─────────────────────────┐│         Byte[0]         │         Byte[1]         ││  7  6  5  4  3  2  1  0 │  7  6  5  4  3  2  1  0 ││───────┼─────────────────┼─────────────┼───────────││  1  0 │   diff green    │   dr - dg   │  db - dg  │└───────┴─────────────────┴─────────────┴───────────┘
  • 2-bit tag b10
  • 6-bit绿色通道差值 (-32…31)
  • 4-bit红蓝差值偏移
  1. QOI_OP_RGB
id: qoi-op-rgb
name: RGB块结构
type: mermaid
content: |-┌─ QOI_OP_RGB ────────────┬─────────┬─────────┬─────────┐│         Byte[0]         │ Byte[1] │ Byte[2] │ Byte[3] ││  7  6  5  4  3  2  1  0 │ 7 .. 0  │ 7 .. 0  │ 7 .. 0  ││─────────────────────────┼─────────┼─────────┼─────────││  1  1  1  1  1  1  1  0 │   red   │  green  │  blue   │└─────────────────────────┴─────────┴─────────┴─────────┘
  • 8-bit tag b11111110
  • 24-bit RGB值
  1. QOI_OP_RGBA
id: qoi-op-rgba
name: RGBA块结构
type: mermaid
content: |-┌─ QOI_OP_RGBA ───────────┬─────────┬─────────┬─────────┬─────────┐│         Byte[0]         │ Byte[1] │ Byte[2] │ Byte[3] │ Byte[4] ││  7  6  5  4  3  2  1  0 │ 7 .. 0  │ 7 .. 0  │ 7 .. 0  │ 7 .. 0  ││─────────────────────────┼─────────┼─────────┼─────────┼─────────││  1  1  1  1  1  1  1  1 │   red   │  green  │  blue   │  alpha  │└─────────────────────────┴─────────┴─────────┴─────────┴─────────┘
  • 8-bit tag b11111111
  • 32-bit RGBA值

Onward

向前

Seriously, I’m dumbfounded. BMP and TIFF have run-length-encoding and then GIF comes around with LZW. But there’s nothing in between. Why? I found the space between RLE and LZW to be large enough to spend many days on. And there’s a lot more to explore.
说真的,我傻眼了。BMP 和 TIFF 具有游程编码,然后 GIF 与 LZW 一起出现。但中间没有任何东西。为什么?我发现 RLE 和 LZW 之间的空间足够大,可以花很多天的时间。还有很多东西需要探索。

Working on QOI was a lot of fun. I had a “test runner” with some sample images lying around. Seeing how every change I made affected the compression ratio was quite exciting.
从事 QOI 工作非常有趣。我有一个"测试运行器",周围有一些示例图像。看到我所做的每一个改变如何影响压缩比,我非常令人兴奋。

With some more work, QOI could serve as the basis for a lossless video codec, suitable for screencasts and the like.
通过更多的工作,QOI 可以作为无损视频编解码器的基础,适用于截屏视频等。

SIMD acceleration for QOI would also be cool but (from my very limited knowledge about some SIMD instructions on ARM), the format doesn’t seem to be well suited for it. Maybe someone with a bit more experience can shed some light?
QOI 的 SIMD 加速也很酷,但是(根据我对 ARM 上的一些 SIMD 指令的了解非常有限),这种格式似乎不太适合它。也许有更多经验的人可以阐明一些?

I’m also quite hyped to explore the even larger space of a simple, lossy image compression format. Many texture compression schemes have very exciting ideas, yet there’s nothing that competes with JPEG but with less complexity.

我也非常高兴能够探索简单、有损图像压缩格式的更大空间。

许多纹理压缩方案都有非常令人兴奋的想法,但没有什么可以与 JPEG 竞争,但复杂性较低。

http://www.xdnf.cn/news/15630.html

相关文章:

  • Flutter 应用如何设计通知服务
  • Nature Communications:人工有机传入神经为智能机器人提供闭环触觉反馈
  • 半小时部署本地deepseek【1】
  • Hadoop与云原生集成:弹性扩缩容与OSS存储分离架构深度解析
  • Django母婴商城项目实践(五)
  • Linux中的LVS集群技术
  • 二进制写入与文本写入的本质区别:系统视角下的文件操作
  • IT 和OT指的什么?
  • LangChain 源码剖析(七)RunnableBindingBase 深度剖析:给 Runnable“穿衣服“ 的装饰器架构
  • 基于现代R语言【Tidyverse、Tidymodel】的机器学习方法
  • KuiperInfer第八课-实现resnet推理
  • 在ComfyUI中CLIP Text Encode (Prompt)和CLIPTextEncodeFlux的区别
  • git是啥
  • Selenium自动化浏览器操作指南
  • 5 种可行的方法:如何将 Redmi 联系人备份到 Mac
  • 智能Agent场景实战指南 Day 16:Agent记忆系统设计
  • 微流控工程普鲁士蓝水凝胶微球用于增强骨关节炎抗氧化效应
  • PyTorch新手实操 安装
  • 如何区别HTML和HTML5?
  • 【移动端知识】移动端多 WebView 互访方案:Android、iOS 与鸿蒙实现
  • 格式转换Total Excel Converter:20 种格式XLS XLSX 批量转 PDFWord
  • SpringMVC + Tomcat10
  • 链路聚合技术
  • 时序数据库 Apache IoTDB 实战:基于 Kubernetes 的部署运维全指南
  • 抗辐照与国产替代:ASM1042在卫星光纤放大器(EDFA)中的应用探索
  • 新手向:图片批量裁剪工具
  • Jfinal+SQLite解决MYSQL迁移表未复制索引问题,完善迁移工具
  • AI问答-供应链管理:各种交通运输方式货运成本分析
  • 20.轮廓特征与近似,改变图像的轮廓识别画线的精确度,同时画出轮廓对应的矩形
  • 下载了docker但是VirtualBox突然启动不了了