.QOI: Lossless Image Compression in O(n) Time
专栏传送:[QOI] qoi_desc | qoi_encode | qoi_decode
(↓英文翻译,当学英语素材来的…😗)
Lossless Image Compression in O(n) Time
O(n) 时间内的无损图像压缩
Introducing QOI — the Quite OK Image Format. It losslessly compresses RGB and RGBA images to a similar size of PNG, while offering a 20x-50x speedup in compression and 3x-4x speedup in decompression. All single-threaded, no SIMD. It’s also stupidly simple.
介绍 QOI——相当不错的图像格式。它无损地将 RGB 和 RGBA 图像压缩为相似大小的 PNG,同时提供 20 倍至 50 倍的压缩速度和 3 倍-4 倍的解压缩速度。
全单线程,无 SIMD。这也简单得愚蠢。
tl;dr: 300 lines of C, single header, source on github, benchmark results here.
tl;dr:300 行 C,单个标头,github 上的源代码,基准测试结果在这里。
QOI compression
QOI 压缩
I want to preface this article by saying that I have no idea what I’m doing. I’m not a compression guy. I barely understand how Huffman Coding and DCT works. Luckily, QOI uses neither.
我想在这篇文章的序言中说,我不知道我在做什么。我不是一个压缩的人。我几乎不明白霍夫曼编码
和DCT
是如何工作的。幸运的是,QOI 两者都没有使用。
I was just tinkering with some ideas that I thought would maybe compress images. The result surprised me quite a bit.
我只是在修补一些我认为可能会压缩图像的想法。结果让我大吃一惊。
霍夫曼编码
一种压缩数据的算法,通过给出现频率高
的符号分配短码
、频率低的分配长码,减少整体存储空间。
DCT(离散余弦变换)
将图像或信号从空间域
转换到频率域
的技术,保留主要能量成分便于压缩(如JPEG),丢弃人眼不敏感的高频细节。
Why? A Short Rant.
为什么?简短的咆哮。
File formats. They suck. I absolutely loathe the usual suspects. PNG, JPEG or even worse MPEG, MOV, MP4. They burst with complexity at the seams. Every tiny aspect screams “design by consortium”.
文件格式。他们很糟糕。我绝对厌恶那些常见的嫌疑人。PNG、JPEG 甚至更糟糕的 MEPG、MOV、MP4。它们在接缝处爆发出复杂性。每一个微小的方面都在尖叫"由联盟设计"。
A while ago I dabbled into MPEG a bit. The basic ideas for video compression in MPEG are ingenious, even more so for 1993, but the resulting file format is an abomination.
不久前,我稍微涉足了 MPEG。MPEG 中视频压缩的基本思想很巧妙,对于 1993 年来说更是如此,但最终的文件格式却令人厌恶。
I can almost picture the meeting of the Moving Picture Experts Group where some random suit demanded there to be a way to indicate a video stream is copyrighted. And thus, the copyright bit flag made its way into the standard and successfully stopped movie piracy before it even began.
我几乎可以想象运动图像专家组的会议,其中一些随机的诉讼要求有一种方法来表明视频流受版权保护。因此,copyright 位标志
进入了标准,并在电影盗版开始之前就成功地阻止了它。
MPEG, an industry standard conceived 3 decades past, all patents long expired, all professional interest abandoned. Yet, the holy specification — there named ISO/IEC 11172-2 — is a well guarded secret, revealed only to those that fork over a cool $200 to endow the sacred work of the ISO.
MPEG,一个3年前构思的行业标准,所有专利早已过期,所有专业利益都被放弃了。然而,这个神圣的规范——被命名为 ISO/IEC 11172-2——是一个严密保守的秘密,只有那些花 200 美元以上来捐赠 ISO 神圣工作的人才会知道。
Alternative open video codecs exist, but are again immensely complex. They compete with the state of the art, require huge libraries, are compute hungry and difficult to work with. Alternatives for PNG all compete on the compression ratio too, with ever increasing complexity.
存在替代的开放视频编解码器,但同样非常复杂。它们与最先进的技术竞争,需要庞大的库,需要大量计算并且难以使用。PNG 的替代品也都在压缩比上竞争,复杂性不断增加。
There absolutely is a market for video, audio and image codecs that trade compression ratio for speed and simplicity, but no one is serving it. (Well, these guys maybe, but it’s all proprietary.)
视频、音频和图像编解码器绝对有一个市场,它们以压缩比换取速度和简单性,但没有人为它服务。(好吧,这些家伙也许是,但这都是专有的。)
Yes, stb_image saved us all from the pains of dealing with libpng and is therefore used in countless games and apps. A while ago I aimed to do the same for video with pl_mpeg, with some success.
是的,stb_image使我们所有人免于处理 libpng 的痛苦,因此被用于无数的游戏和应用程序中。不久前,我的目标是用pl_mpeg对视频做同样的事情,并取得了一些成功。
But with all that we learned, why did no one go back and implement a simple compression scheme to compete with PNG, but without the cruft? Why did no one implement a simple video compression scheme similar to MPEG, but in a sane file format instead?
但是,有了我们学到的一切,为什么没有人回去实施一个简单的压缩方案来与巴布亚新几内亚竞争,但又没有垃圾呢?为什么没有人实现类似于 MPEG 的简单视频压缩方案,而是以合理的文件格式实现?
Technical Details
技术细节
QOI encodes and decodes images in a single pass. It touches every pixel just once.
QOI 在一次通过中对图像进行编码和解码。它只接触每个像素一次。
Pixels are encoded as:
像素编码为:
- A run of the previous pixel
上一个像素的运行
- An index into an array of previously seen pixels
以前看到的像素数组的索引
- A difference to the previous pixel value in r,g,b
与之前的像素值(以 r,g,b 为单位)的差异
- Full r,g,b or r,g,b,a values
完整的 r,g,b 或 r,g,b,a 值
The resulting values are packed into chunks starting with a 2- or 8-bit tag (indicating one of those methods) followed by a number of data bits. All of these chunks (tag and data bits) are byte aligned, so there’s no bit twiddling needed between those chunks.
结果值被打包成块,以 2 位或 8 位 tag (表示其中一种方法)开始,后跟多个数据位。所有这些块(tag 和数据位)都是字节对齐的,因此这些块之间不需要位摆动。
Chunk Types
块类型
- QOI_OP_RUN
id: qoi-op-run
name: 运行块结构
type: mermaid
content: |-┌─ QOI_OP_RUN ────────────┐│ Byte[0] ││ 7 6 5 4 3 2 1 0 ││───────┼─────────────────││ 1 1 │ run │└───────┴─────────────────┘
- 2-bit tag
b11
- 6-bit run-length (1…62)
- QOI_OP_INDEX
id: qoi-op-index
name: 索引块结构
type: mermaid
content: |-┌─ QOI_OP_INDEX ──────────┐│ Byte[0] ││ 7 6 5 4 3 2 1 0 ││───────┼─────────────────││ 0 0 │ index │└───────┴─────────────────┘
- 2-bit tag
b00
- 6-bit index (0…63)
- QOI_OP_DIFF
id: qoi-op-diff
name: 差值块结构
type: mermaid
content: |-┌─ QOI_OP_DIFF ───────────┐│ Byte[0] ││ 7 6 5 4 3 2 1 0 ││───────┼─────┼─────┼─────││ 0 1 │ dr │ dg │ db │└───────┴─────┴─────┴─────┘
- 2-bit tag
b01
- 2-bit RGB差值 (-2…1)
- QOI_OP_LUMA
id: qoi-op-luma
name: 亮度块结构
type: mermaid
content: |-┌─ QOI_OP_LUMA ───────────┬─────────────────────────┐│ Byte[0] │ Byte[1] ││ 7 6 5 4 3 2 1 0 │ 7 6 5 4 3 2 1 0 ││───────┼─────────────────┼─────────────┼───────────││ 1 0 │ diff green │ dr - dg │ db - dg │└───────┴─────────────────┴─────────────┴───────────┘
- 2-bit tag
b10
- 6-bit绿色通道差值 (-32…31)
- 4-bit红蓝差值偏移
- QOI_OP_RGB
id: qoi-op-rgb
name: RGB块结构
type: mermaid
content: |-┌─ QOI_OP_RGB ────────────┬─────────┬─────────┬─────────┐│ Byte[0] │ Byte[1] │ Byte[2] │ Byte[3] ││ 7 6 5 4 3 2 1 0 │ 7 .. 0 │ 7 .. 0 │ 7 .. 0 ││─────────────────────────┼─────────┼─────────┼─────────││ 1 1 1 1 1 1 1 0 │ red │ green │ blue │└─────────────────────────┴─────────┴─────────┴─────────┘
- 8-bit tag
b11111110
- 24-bit RGB值
- QOI_OP_RGBA
id: qoi-op-rgba
name: RGBA块结构
type: mermaid
content: |-┌─ QOI_OP_RGBA ───────────┬─────────┬─────────┬─────────┬─────────┐│ Byte[0] │ Byte[1] │ Byte[2] │ Byte[3] │ Byte[4] ││ 7 6 5 4 3 2 1 0 │ 7 .. 0 │ 7 .. 0 │ 7 .. 0 │ 7 .. 0 ││─────────────────────────┼─────────┼─────────┼─────────┼─────────││ 1 1 1 1 1 1 1 1 │ red │ green │ blue │ alpha │└─────────────────────────┴─────────┴─────────┴─────────┴─────────┘
- 8-bit tag
b11111111
- 32-bit RGBA值
Onward
向前
Seriously, I’m dumbfounded. BMP and TIFF have run-length-encoding and then GIF comes around with LZW. But there’s nothing in between. Why? I found the space between RLE and LZW to be large enough to spend many days on. And there’s a lot more to explore.
说真的,我傻眼了。BMP 和 TIFF 具有游程编码,然后 GIF 与 LZW 一起出现。但中间没有任何东西。为什么?我发现 RLE 和 LZW 之间的空间足够大,可以花很多天的时间。还有很多东西需要探索。
Working on QOI was a lot of fun. I had a “test runner” with some sample images lying around. Seeing how every change I made affected the compression ratio was quite exciting.
从事 QOI 工作非常有趣。我有一个"测试运行器
",周围有一些示例图像。看到我所做的每一个改变如何影响压缩比
,我非常令人兴奋。
With some more work, QOI could serve as the basis for a lossless video codec, suitable for screencasts and the like.
通过更多的工作,QOI 可以作为无损视频编解码器的基础,适用于截屏视频等。
SIMD acceleration for QOI would also be cool but (from my very limited knowledge about some SIMD instructions on ARM), the format doesn’t seem to be well suited for it. Maybe someone with a bit more experience can shed some light?
QOI 的 SIMD 加速也很酷,但是(根据我对 ARM 上的一些 SIMD 指令的了解非常有限),这种格式似乎不太适合它。也许有更多经验的人可以阐明一些?
I’m also quite hyped to explore the even larger space of a simple, lossy image compression format. Many texture compression schemes have very exciting ideas, yet there’s nothing that competes with JPEG but with less complexity.
我也非常高兴能够探索简单、有损图像压缩格式的更大空间。
许多纹理压缩方案都有非常令人兴奋的想法,但没有什么可以与 JPEG 竞争,但复杂性较低。