Demo 10: a Proof of Concept #22
Comments
any idea why the size is worse than experimental? is that fixable? |
It's not strictly worse, though. On the "size kb" column, qoi-experi is smaller on kodak, textures and wallpaper (which are all opaque and, on average, more photographic) but qoi-demo10 is smaller on misc and screenshots.
I'm not sure about the underlying reasons other than there's a fixed amount of 8-bit opcode space and giving more opcodes to one thing means taking away opcodes from another thing. demo10 has a larger QOI_INDEX size. experi has a whole new op
As a data point, I added a simple "subtract green" transform to demo10's qoi_encode (and its inverse to qoi_decode):
--- qoi-demo10.h 2021-12-04 08:53:03.115172245 +1100
+++ qoi-subg10.h 2021-12-04 09:00:34.854176340 +1100
@@ -454,6 +454,8 @@
px.rgba.g = pixels[px_pos+1];
px.rgba.b = pixels[px_pos+2];
}
+ px.rgba.r -= px.rgba.g;
+ px.rgba.b -= px.rgba.g;
if (px.v == px_prev.v) {
run++;
@@ -947,6 +949,11 @@
}
}
+ for (int px_pos = 0; px_pos < px_len; px_pos += channels) {
+ pixels[px_pos + 0] += pixels[px_pos + 1];
+ pixels[px_pos + 2] += pixels[px_pos + 1];
+ }
+
return pixels;
}
Again, there's not an outright winner. Some compression sizes got slightly better. Some got slightly worse. It also tanked the throughput numbers, though. I'm not sure if that's fixable. This was a very quick evaluation focusing on file sizes.
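For readers who want to try the "subtract green" idea outside of demo10, here is a minimal standalone sketch of the transform and its inverse. The function names `subtract_green` and `add_green` are made up for this illustration and are not part of any QOI header; it assumes interleaved 8-bit pixels with 3 or 4 channels in R, G, B(, A) order. The point it shows is that unsigned 8-bit arithmetic wraps modulo 256, so the round trip is lossless, and that a gray pixel (r == g == b) becomes (0, g, 0).

```c
#include <stddef.h>
#include <stdint.h>

/* Forward transform: replace R and B with their difference from G.
   The cast back to uint8_t wraps modulo 256, so no information is lost.
   Assumption: channels is 3 or 4 and px_len is the byte length of the buffer. */
static void subtract_green(uint8_t *pixels, size_t px_len, int channels) {
    for (size_t i = 0; i + 2 < px_len; i += (size_t)channels) {
        pixels[i + 0] = (uint8_t)(pixels[i + 0] - pixels[i + 1]);
        pixels[i + 2] = (uint8_t)(pixels[i + 2] - pixels[i + 1]);
    }
}

/* Inverse transform: add G back to R and B, again modulo 256. */
static void add_green(uint8_t *pixels, size_t px_len, int channels) {
    for (size_t i = 0; i + 2 < px_len; i += (size_t)channels) {
        pixels[i + 0] = (uint8_t)(pixels[i + 0] + pixels[i + 1]);
        pixels[i + 2] = (uint8_t)(pixels[i + 2] + pixels[i + 1]);
    }
}
```

Calling `subtract_green` before encoding and `add_green` after decoding mirrors the shape of the patch above, but this version makes a separate pass over the whole buffer, which is one plausible reason for a throughput cost.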
|
Another data point, on the 512x512 pixel editions of the Tango Icon Library icons (as suggested in #17). 213 PNG images. Average PNG size is 51 KiB.
|
Honestly I didn't understand why increasing the color cache should affect the compression ratio much. This helps mostly for palettized images, but not for regular images where colors are not often reused. |
Here's a patch to count opcode histograms:
--- a/qoi-demo10.h 2021-12-04 08:53:03.115172245 +1100
+++ b/qoi-demo10.h 2021-12-06 09:22:55.592694824 +1100
@@ -1,3 +1,15 @@
+int histogram[9] = {0};
+const char* histnames[9] = {
+"QOI_INDEX1",
+"QOI_DIFF1",
+"QOI_DIFF2",
+"QOI_DIFF3",
+"QOI_DIFF4",
+"QOI_DIFF5",
+"QOI_RUN1",
+"QOI_RUN2",
+"QOI_RUN3",
+};
/*
QOI - The "Quite OK Image" format for fast, lossless image compression
@@ -386,19 +398,23 @@
static inline void qoi_encode_run(unsigned char *bytes, int *pptr, int run) {
if (run == 1) {
+histogram[1]++;
poke_u8(bytes+*pptr, QOI_DIFF1 | 0xA8);
*pptr += 1;
}
else if (run < 30) {
+histogram[6]++;
poke_u8(bytes+*pptr, QOI_RUN1 | (run << 3));
*pptr += 1;
}
else if (run < 256) {
+histogram[7]++;
poke_u8(bytes+*pptr+0, QOI_RUN2);
poke_u8(bytes+*pptr+1, run);
*pptr += 2;
}
else {
+histogram[8]++;
poke_u8(bytes+*pptr+0, QOI_RUN3);
poke_u8(bytes+*pptr+1, run>>0);
poke_u8(bytes+*pptr+2, run>>8);
@@ -471,6 +487,7 @@
int index_pos = QOI_COLOR_HASH(px);
if (index[index_pos].v == px.v) {
+histogram[0]++;
poke_u8(bytes+p, QOI_INDEX1 | (index_pos << 1));
p += 1;
}
@@ -488,6 +505,7 @@
((uint8_t)(dg + 2u) <= 3u) &&
((uint8_t)(db + 2u) <= 3u)
) {
+histogram[1]++;
poke_u8(bytes+p, QOI_DIFF1 |
((uint8_t)(dr + 2u) << 2u) |
((uint8_t)(dg + 2u) << 4u) |
@@ -499,6 +517,7 @@
((uint8_t)(dg + 8u) <= 15u) &&
((uint8_t)(db + 8u) <= 15u)
) {
+histogram[2]++;
poke_u16le(bytes+p, QOI_DIFF2 |
((uint8_t)(dr + 8u) << 4u) |
((uint8_t)(dg + 8u) << 8u) |
@@ -510,6 +529,7 @@
((uint8_t)(dg + 16u) <= 31u) &&
((uint8_t)(db + 16u) <= 31u)
) {
+histogram[3]++;
poke_u24le(bytes+p, QOI_DIFF3 |
((uint8_t)(dr + 16u) << 4u) |
((uint8_t)(dg + 16u) << 9u) |
@@ -518,6 +538,7 @@
p += 3;
}
else {
+histogram[4]++;
bytes[p++] = QOI_DIFF4;
bytes[p++] = dr;
bytes[p++] = dg;
@@ -531,6 +552,7 @@
((uint8_t)(db + 16u) <= 31u) &&
((uint8_t)(da + 16u) <= 31u)
) {
+histogram[3]++;
poke_u24le(bytes+p, QOI_DIFF3 |
((uint8_t)(dr + 16u) << 4u) |
((uint8_t)(dg + 16u) << 9u) |
@@ -539,6 +561,7 @@
p += 3;
}
else {
+histogram[5]++;
bytes[p++] = QOI_DIFF5;
bytes[p++] = dr;
bytes[p++] = dg;
and
--- a/qoibench.c 2021-12-06 09:29:35.576698449 +1100
+++ b/qoibench.c 2021-12-06 09:30:58.729699203 +1100
@@ -519,5 +519,12 @@
benchmark_print_result("Totals (AVG)", totals);
+int histotal = 0;
+for (int i = 0; i < 9; i++) {
+ histotal += histogram[i];
+}
+for (int i = 0; i < 9; i++) {
+ printf("%-12s%12d %0.4f\n", histnames[i], histogram[i], (double)histogram[i]/(double)histotal);
+}
return 0;
} |
Opcode Histograms
Edit: I split out an artificial QOI_REP1 row to count "a 1-length run", which demo10 can encode in 1 byte either as a QOI_INDEX1 or a QOI_DIFF1. I might be triple-counting (calling qoi_encode thrice), but it's the fractions that matter, not the absolute counts.
|
I'm not entirely sure either, but FWIW here's some data on how often the color cache hits, with this hash:
#define QOI_COLOR_HASH(C) ((C.rgba.r ^ C.rgba.g ^ C.rgba.b ^ C.rgba.a) & 127)
Color Cache Hit Rate
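As a rough illustration of what a hit-rate measurement like this involves, here is a small sketch that counts hits for a 128-entry cache using the QOI_COLOR_HASH macro quoted above. The `px_t` union, `make_px` and `count_cache_hits` are names invented for this example, not taken from qoi-demo10.h, and the real encoder only consults the cache after its run check, so its numbers would differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pixel type for this sketch: four 8-bit channels
   that can also be compared as one 32-bit value. */
typedef union {
    struct { uint8_t r, g, b, a; } rgba;
    uint32_t v;
} px_t;

#define QOI_COLOR_HASH(C) ((C.rgba.r ^ C.rgba.g ^ C.rgba.b ^ C.rgba.a) & 127)

/* Convenience constructor (illustrative helper). */
static px_t make_px(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
    px_t p;
    p.v = 0;
    p.rgba.r = r;
    p.rgba.g = g;
    p.rgba.b = b;
    p.rgba.a = a;
    return p;
}

/* Counts pixels whose exact color is already stored in its hash slot.
   On a miss the slot is overwritten, so colliding colors evict each other.
   The cache starts zeroed, so an all-zero pixel counts as a hit at once. */
static size_t count_cache_hits(const px_t *pixels, size_t n) {
    px_t index[128];
    size_t hits = 0;
    memset(index, 0, sizeof index);
    for (size_t i = 0; i < n; i++) {
        int pos = QOI_COLOR_HASH(pixels[i]);
        if (index[pos].v == pixels[i].v) {
            hits++;
        } else {
            index[pos] = pixels[i];
        }
    }
    return hits;
}
```

Dividing the returned count by `n` gives a hit rate for one image. Because the hash is a plain XOR of the channels, distinct colors can land in the same slot and evict each other, which caps the hit rate on images with many colors.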
|
In WebP lossless I use a 'double RLE' mode. The double RLE mode can repeat the previous pixel, or copy N pixels of the previous row at respective locations. I choose the longer of the RLEs. Having both RLEs instead of one turns out to be helpful on a large variety of images. In addition to it, I try the full LZ77 mode, and choose the one that generates less entropy in initial analysis. The double RLE mode uses only distance codes 0 and 1, reducing the entropy of the distance codes to one bit (see https://developers.google.com/speed/webp/docs/webp_lossless_bitstream_specification#42_encoding_of_image_data). |
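The "choose the longer of the two RLEs" step described above could look roughly like the sketch below. This is not WebP's code: `rle_choice`, `match_length` and `choose_double_rle` are hypothetical names, pixels are assumed to be packed 32-bit values in row-major order, and distances are plain pixel offsets (1 for the previous pixel, `width` for the pixel directly above) rather than WebP's distance codes. Ties go to the previous-pixel run, which is an arbitrary choice made here.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of the choice: which source to copy from and for how many pixels. */
typedef struct {
    size_t distance; /* 1 = previous pixel, width = same column, previous row */
    size_t length;   /* number of pixels that can be copied, 0 if none */
} rle_choice;

/* Length of the match between px[pos..] and px[pos - distance ..].
   Overlap is allowed, which is what makes distance 1 behave like a run. */
static size_t match_length(const uint32_t *px, size_t n, size_t pos, size_t distance) {
    size_t len = 0;
    if (distance == 0 || distance > pos) {
        return 0; /* no pixel at that distance yet */
    }
    while (pos + len < n && px[pos + len] == px[pos + len - distance]) {
        len++;
    }
    return len;
}

/* Try both candidate sources at position pos and keep the longer match. */
static rle_choice choose_double_rle(const uint32_t *px, size_t n, size_t width, size_t pos) {
    rle_choice prev = {1, match_length(px, n, pos, 1)};
    rle_choice above = {width, match_length(px, n, pos, width)};
    return (above.length > prev.length) ? above : prev;
}
```

With only two possible distances, the encoder needs just one bit to say which source was used, which matches the entropy argument in the comment above.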
Here are some pretty sweet throughput numbers on my laptop (with similar compression sizes) for a proof of concept, combining little-endianness (#3) with 7-bit indexes (#20). Little-endianness in particular allows us to remove a lot of branching in the decoder's inner loops.
I can't think of a good name for it right now, so I'm just going to call it "Demo 10".
Details: