jsPackCompress

Utility for packing JavaScript files into self-extracting compressed HTML. It can also add extra files (images, CSS, anything else) to the package.

While there are already tools like this (for example fetchcrunch or jsExe), they do not suit my needs because they share the same flaw: the compressed HTML file must be served from an HTTP server or it will not work.

You can read more about my motivation and how this tool works in the Why and how section.

⚠️ NOT FOR REGULAR WEB!

While it may be tempting to use this packing for regular web sites (or PWAs), it is a bad idea to do so.

You can achieve a much better result with properly configured compression on your HTTP server and by using service workers.

And even without that, the delay introduced by unpacking is quite noticeable, especially on low-end devices (like smartphones). Better to let users download more and then see the content instantly, without delay or page freeze.

So this tool is a curiosity better suited to niche projects, the demoscene, or game jams like js13kgames.

Note

If you still want to publish the packed file on the web, read about the --universal-decoder option.

Install

From NPM, for use as a command-line app:

npm install jspackcompress -g

From NPM for programmatic use:

npm install jspackcompress

Command line usage

Usage: jspackcompress [options] [extra files...]

Package JavaScript file into self-extracting compressed HTML file

Options:
  -i, --input <file>         JavaScript file to use as the entry point
  -o, --output <file>        output filename for the compressed HTML. If not
                             specified, the input file name with the extension
                             changed from `.js` to `.html` will be used
  -l, --level <number>       compression level (2–9). A lower value is faster
                             but less compressed; a higher value is slower but
                             more compressed (default: 9)
  -a, --arg-name <name>      name of the wrapper function argument to access
                             extra files (default: "$fs")
  -g, --getter <type>        how content of extra files should be returned.
                             Options include: `ArrayBuffer`, `Uint8Array`,
                             `String`, `Blob`, or `BlobURL`. Use `all` to have
                             all types as properties of the wrapper argument
                             (default: "all")
  -s, --small-decoder        always use small decoder
  -u, --universal-decoder    use universal decoder
  -c, --compact-html         generate compact HTML
  -e, --extra-head <string>  additional string to include in the <HEAD> section
  -b, --extra-body <string>  additional string to include in the <BODY> section
  -V, --version              output the version number
  -h, --help                 display help for command

You can use glob patterns to specify extra files:
  $ jspackcompress -i index.js -o game.html assets/*

Options worth explaining:

--arg-name

If the package consists of a single .js file, its contents are executed via eval after decompression.

But if the package contains extra files, their content must be passed to the main script somehow. This is achieved by executing the main script via Function. It looks something like this:

new Function("$fs", scriptContent)(filesData);

"$fs" is the default name for function argument. If it clashes with some variable in your code, or if you want to save some extra bytes, you can change name of this variable with --arg-name. Next section describes how to use this argument to get files content.

Caution

No checks are made that the value provided with --arg-name is a valid JavaScript variable name.

--getter

Before packing, all files are concatenated and compressed as a single binary blob. The uncompressed ArrayBuffer is then passed to the main script as the $fs argument.

To make access easier, a small bit of code is prepended to the main script that transforms this $fs into an object with the following methods and properties:

// get file content as ArrayBuffer (slice)
$fs.ab("image.webp");

// get file content as Uint8Array
$fs.u8("image.webp");

// get file content as a String
$fs.str("data.json");

// get file content as a Blob
$fs.blob("data.csv");

// get file content as a Blob URL
$fs.url("data.csv");

// original ArrayBuffer with all files
$fs.b;

// object with information about each file
// NOTE: the main script is not included in this object
$fs.list;
// and it looks like this:
{
  "1.webp": {
    o: 0,    // offset
    s: 12345 // size of the file
  },
  "2.webp": {
    o: 12345,
    s: 54321
  },
  ...
}
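For example, $fs.list together with $fs.b lets you slice a file out manually; as a rough sketch, this is approximately what $fs.ab("1.webp") does:

// rough sketch of what $fs.ab("1.webp") amounts to,
// using the offset/size info from $fs.list
let { o, s } = $fs.list["1.webp"];
let fileBytes = $fs.b.slice(o, o + s); // ArrayBuffer containing only that file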

This is the default. But if you don't need all of this and want to save some bytes, you can use --getter to limit the returned value to one type. Possible values are ArrayBuffer, Uint8Array, String, Blob, and BlobURL. In this case you use $fs directly as a function:

// if packed with `--getter BlobURL` then a Blob URL for the file is returned
let url = $fs("1.webp");
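For instance, in a package built with --getter BlobURL, loading an image could look like this (a sketch; the file name "player.webp" is just an example):

// sketch: assumes the package was built with `--getter BlobURL`
// and contains a file named "player.webp" (made-up name)
let img = new Image();
img.onload = () => document.body.appendChild(img);
img.src = $fs("player.webp"); // $fs is called directly and returns a Blob URL string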

--small-decoder

There are two decoders for the packed file. jsPackCompress chooses one depending on the size of the payload: if it is less than 100KiB, the smaller decoder is used; otherwise the bigger decoder is used.

The bigger one uses a look-up table (which needs to be filled first) for faster decoding, which is especially noticeable when the payload is several megabytes. But it becomes slower on small payloads (especially when the look-up table is bigger than the payload itself).

If you still need to save extra bytes (31 in this case), you can force the smaller decoder with this option.

--compact-html

The HTML of a packed file looks like this (newlines and indentation added for readability):

<!doctype html>
  <html>
    <head>
      <meta charset="iso-8859-15"/>
    </head>
  <body onload="decoderCode">
<!--BINARY_DATA

If you don't need the doctype and do not mind an empty SVG image in the HTML, you can use --compact-html, which will reduce it to this:

<meta charset="iso-8859-15"/>
<svg onload="decoderCode">
<!--BINARY_DATA

This can save you 35 bytes.

--universal-decoder

The default decoders require that the HTML file be parsed by the browser with the iso-8859-15 charset.

While the proper charset is set in the <meta charset> tag of the wrapping HTML, it can be overridden by the Content-Type header from the HTTP server. This means that if you want to put the packed file on the web, you must set up the server properly too.

If the server configuration can't be changed, you can use this option to generate a file with the universal decoder. This adds around 200 bytes to the wrapper HTML.

What it does is check whether the file was opened from something other than the file: protocol. In that case it fetches the file again (hopefully the browser will use the cached version this time) and decodes its content instead of the embedded string.
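As a rough sketch (not the actual decoder code), that check amounts to something like this:

// rough sketch of the universal-decoder behaviour described above (not the real code)
if (location.protocol === "file:") {
  // opened from disk: the <meta charset> is honored, use the embedded comment string
} else {
  // served over HTTP: the charset may have been overridden by the server,
  // so re-fetch the page itself and decode from its raw bytes instead
  fetch(location.href)
    .then(r => r.arrayBuffer())
    .then(buf => { /* decode from buf instead of the comment text */ });
}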

Using from code

If installed as a package dependency, you can use jsPackCompress like this:

import { pack } from "jspackcompress";

let result = await pack({
  script: "console.log('Hello, World!');",
  compressor: customCompressFunction,
  compressionlevel: 9,
  argName: "$fs",
  accessType: "all",
  compactHtml: false,
  useSmallDecoder: false,
  extraHead: "<title>Cool Page!</title>",
  extraBody: "<canvas>",
  files: {
    "1.webp": {
      content: fileData, // Uint8Array / node.js Buffer
      label: "1",
    },
  },
});

Note

The pack function is asynchronous. Use await or .then().

There are two extras in the config object compared to the CLI:

compressor: you can provide a custom compression function. It will be called with a Uint8Array as its only argument and is expected to return "raw deflate" compressed data as a Uint8Array (see the sketch after this list).

label in the list of files. This label will be used instead of the file name to access file data from code. So in this example, to get the Blob URL for 1.webp you would call $fs.url(1).
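A minimal sketch of such a compressor using Node's built-in zlib (assuming only the documented contract: Uint8Array in, raw-deflate Uint8Array out):

// minimal sketch of a custom compressor built on Node's zlib
// (only the documented contract is assumed: Uint8Array in, raw-deflate Uint8Array out)
import { deflateRawSync } from "node:zlib";

function customCompressFunction(data) {
  return new Uint8Array(deflateRawSync(data, { level: 9 }));
}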

The returned value of the pack function is an object structured like this:

{
  payload: Uint8Array, // packed HTML file content
  stats: {
    origSize: 90393, // size of uncompressed data
    cmpSize: 47463,  // size of compressed data
    escSize: 47761   // size of compressed data after escaping special symbols
  }
}
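A minimal sketch of writing that result to disk in Node (the output file name is arbitrary, and the omitted pack options are assumed to fall back to sensible defaults):

// minimal sketch: write the packed HTML to disk
// (assumes omitted pack() options fall back to reasonable defaults)
import { writeFile } from "node:fs/promises";
import { pack } from "jspackcompress";

let result = await pack({ script: "console.log('Hello, World!');" });
await writeFile("hello.html", result.payload); // payload is a Uint8Array
console.log(result.stats);                     // { origSize, cmpSize, escSize }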

How to build

Note

This repo uses git submodules. Use the --recursive option when cloning.

The JavaScript part of the package can be built by running the npm run build command.

Building WASM module

Requirements:

In a console with emsdk_env applied, go to the zopfli folder and execute:

emcmake cmake .
cmake --build .

This will also execute the npm run build command at the end.

Why and how

My initial goal was to pack a small game made with Godot into a single HTML file. Not a problem, since you can embed anything into HTML using base64. However, in this case the "small" game had a 42MiB WASM file with the complete engine.

By recompiling the web export template and removing unused engine features, I managed to reduce the WASM size to 27MiB, or 6.6MiB when gzipped.

This is when I discovered Efficient Compression Tool. Its modified zopfli algorithm works much faster and achieves a better compression ratio than the original zopfli (at least with fewer than 100 iterations; using 1000 iterations doesn't seem to produce better results).

Anyway, encoding this with base64 adds 2 extra megabytes to the HTML. Clearly, another encoding was needed for the compressed data.

I knew that jsExe exists, and that would have been an ideal solution. But unfortunately, modern browser security restricts access to image pixels when a page is opened with the file:// protocol. I searched for what else is used on the demoscene these days and found fetchcrunch: a modern solution, but with the same flaw. This time it is fetch that is not allowed on local files.

It seems that for my case, the compressed data must be part of the decoder code.

iso-8859-15

I tried storing binary data in JS strings as-is (with escaping, of course) and reading it back with charCodeAt. This did not work well with UTF-8 due to its multi-byte encoding, so I tried single-byte encodings and discovered that if you set the HTML codepage to "iso-8859-15", there are only 10 problematic bytes.

The first is the byte 0x0D (the \r character), which is converted to \n during HTML parsing. This meant I had to escape both \r and \\.

Then there are 9 bytes for which charCodeAt returns incorrect values:

{
  0: 65533,
  164: 8364,
  166: 352,
  168: 353,
  180: 381,
  184: 382,
  188: 338,
  189: 339,
  190: 376,
}
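As a sketch, reversing this during decoding only needs a small lookup built from the table above:

// sketch: map the 9 incorrect charCodeAt values back to the original byte values
let fix = { 65533: 0, 8364: 164, 352: 166, 353: 168, 381: 180,
            382: 184, 338: 188, 339: 189, 376: 190 };

function charToByte(ch) {
  let code = ch.charCodeAt(0);
  return code > 255 ? fix[code] : code; // values above 255 are the remapped ones
}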

I moved the compressed data into an HTML comment at the end of the file, avoiding the need to escape quotes and newlines. However, I still had to escape -->. Bonus: the browser doesn't try to render all these bytes as page text (which was another source of stutter during page loading with existing tools).

I wrote a function to escape the data. This function also checks whether there are any bytes less common than \\ and \r, and if so, swaps them to save on escaping. This swap is also added to the byte mapping.

Finally the decoding process looks like this:

  • get data as String from comment with document.body.lastChild.textContent
  • convert it to bytes with charCodeAt
  • remove escaping and fix \r char
  • pipe result through DecompressionStream('deflate-raw')
  • convert result to text with TextDecoder() and execute
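A rough sketch of these steps (the real decoder is heavily golfed, and the un-escaping and byte remapping are omitted here):

// rough sketch of the decode pipeline above (un-escaping and byte remapping omitted)
let text = document.body.lastChild.textContent;           // data from the trailing comment
let bytes = new Uint8Array(text.length);
for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i); // chars -> bytes
let stream = new Blob([bytes]).stream()
  .pipeThrough(new DecompressionStream("deflate-raw"));   // inflate
new Response(stream).arrayBuffer()
  .then(buf => eval(new TextDecoder().decode(buf)));      // decode to text and execute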

In the end, this escaping adds around 0.7% to the original file size, which is more than acceptable for me compared to the 33% size increase from base64.

License

MIT license.

Includes code from Efficient Compression Tool, which is under the Apache-2.0 license.