Tar support and Compression to .tar.gz (#81)
Tested on ESP32, RP2040 and ESP8266
tobozo authored Feb 1, 2025
1 parent 02d46e0 commit 9014862
Showing 28 changed files with 4,769 additions and 870 deletions.
53 changes: 10 additions & 43 deletions Doxyfile
@@ -1,62 +1,29 @@
EXTRACT_STATIC = YES
EXTRACT_ALL = YES


INPUT = "src"
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.c \
*.cc \
*.cxx \
*.cpp \
*.c++ \
*.java \
*.ii \
*.ixx \
*.ipp \
*.i++ \
*.inl \
*.idl \
*.ddl \
*.odl \
*.h \
*.hh \
*.hxx \
*.hpp \
*.h++ \
*.cs \
*.d \
*.php \
*.php4 \
*.php5 \
*.phtml \
*.inc \
*.m \
*.markdown \
*.md \
*.mm \
*.dox \
*.py \
*.f90 \
*.f \
*.for \
*.tcl \
*.vhd \
*.vhdl \
*.ucf \
*.qsf \
*.as \
*.js
*.h++
RECURSIVE = YES
EXCLUDE =
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS =
EXAMPLE_PATH =
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS =
EXAMPLE_PATH =
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE =
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE =
154 changes: 128 additions & 26 deletions README.md
@@ -16,15 +16,29 @@
- uzlib https://github.com/pfalcon/uzlib
- TinyUntar https://github.com/dsoprea/TinyUntar

ESP32-targz enables the channeling of gz :arrow_right: tar :arrow_right: filesystem data ~~without using an intermediate file~~ (bug: see [#4](https://github.com/tobozo/ESP32-targz/issues/4)).
ESP32-targz enables the channeling of gz :arrow_left::arrow_right: tar :arrow_left::arrow_right: filesystem data in both directions.

In order to reach this goal, TinyUntar was heavily modified to allow data streaming, uzlib is also customized.
Parental advisory: this project was made under the influence of hyperfocus and its code may contain comments that are unfit for children.


Scope
-----

- Compressing to `.tar.gz`
- Decompressing from `tar.gz`
- Compressing to `gz`
- Decompressing from `gz`
- Packing files/folders to `tar`
- Unpacking `tar`
- Supports any fs::FS filesystem (SD, SD_MMC, FFat, LittleFS) and Stream (HTTP, HTTPS, UDP, CAN, Ethernet)
- This is experimental, expect bugs!
- Contributions and feedback are more than welcome :-)


Tradeoffs
---------

When the output is the filesystem (e.g. NOT when streaming to TAR), gzip can work without the dictionary.
When decompressing to the filesystem (i.e. NOT when streaming to TAR), gzip can work without the dictionary.
Disabling the dictionary can cause huge slowdowns but saves ~36KB of RAM.

TinyUntar requires 512bytes only so its memory footprint is negligible.
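
The 512-byte figure matches the tar record size: headers and file data are both stored in whole 512-byte blocks, and an archive ends with two zero-filled blocks. The following is a minimal sketch of that arithmetic in plain C++, useful for estimating how much space an uncompressed `.tar` will need. The names `kTarBlockSize`, `paddedSize`, `tarEntrySize` and `kTarTrailerSize` are illustrative and are not part of ESP32-targz.

```cpp
#include <cstddef>

// Size of one tar record; headers and data are stored in whole records.
constexpr std::size_t kTarBlockSize = 512;

// Round a payload size up to the next multiple of the tar block size.
constexpr std::size_t paddedSize(std::size_t dataSize) {
  return (dataSize + kTarBlockSize - 1) / kTarBlockSize * kTarBlockSize;
}

// Bytes one regular file occupies in the archive:
// one header block followed by the padded file data.
constexpr std::size_t tarEntrySize(std::size_t fileSize) {
  return kTarBlockSize + paddedSize(fileSize);
}

// Bytes appended at the end of an archive: two zero-filled blocks.
constexpr std::size_t kTarTrailerSize = 2 * kTarBlockSize;
```

Summing `tarEntrySize()` over the files to pack and adding `kTarTrailerSize` gives a lower bound for the archive size; directory entries and any extended headers add more blocks on top.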
@@ -33,22 +47,10 @@ TinyUntar requires 512bytes only so its memory footprint is negligible.
Limitations
-----------

ESP32-TarGz can only have one **output** filesystem (see *Support Matrix*), and it must be set at compilation time (see *Usage*).
- ESP32-targz decompression can only have one **output** filesystem (see *Support Matrix*), and it must be set at compilation time (see *Usage*).
This limitation does not apply to the **input** filesystem/stream.


Scope
-----

- Compressing to `gz` (deflate/lz77)
- Decompressing `gz`
- Expanding `tar`
- Decompressing and expanding `tar.gz`
- Supports any fs::FS filesystem (SD, SD_MMC, FFat, LittleFS) and streams (HTTP, HTTPS, UDP, CAN, Ethernet)
- This is experimental, expect bugs!
- Contributions and feedback are more than welcome :-)



Support Matrix
--------------
@@ -68,7 +70,7 @@ Usage
-----


:warning: Important note: setting the `#define` **before** including `<ESP32-targz.h>` is recommended to prevent the library from defaulting to SPIFFS.
:warning: Optional: setting the `#define` **before** including `<ESP32-targz.h>` will alias a default flash filesystem to `tarGzFS`.


```C
@@ -296,8 +298,24 @@ ESP32 Only: Direct Update (no intermediate file) from `.tar.gz.` stream
```
LZPacker::compress() signatures:
-------------------------------
```cpp
// buffer to stream (best compression)
size_t compress( uint8_t* srcBuf, size_t srcBufLen, Stream* dstStream );
// buffer to buffer (best compression)
size_t compress( uint8_t* srcBuf, size_t srcBufLen, uint8_t** dstBufPtr );
// stream to buffer
size_t compress( Stream* srcStream, size_t srcLen, uint8_t** dstBufPtr );
// stream to stream
size_t compress( Stream* srcStream, size_t srcLen, Stream* dstStream );
// stream to file
size_t compress( Stream* srcStream, size_t srcLen, fs::FS*dstFS, const char* dstFilename );
// file to file
size_t compress( fs::FS *srcFS, const char* srcFilename, fs::FS*dstFS, const char* dstFilename );
// file to stream
size_t compress( fs::FS *srcFS, const char* srcFilename, Stream* dstStream );
```
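
When compressing to a buffer, it can help to sanity-check the result before sending or storing it. The sketch below assumes the output is a single-member gzip stream as described in RFC 1952 (magic bytes `0x1f 0x8b`, method byte `8` for deflate, and a little-endian `ISIZE` trailer holding the original size modulo 2^32). `looksLikeGzip` and `gzipOriginalSize` are hypothetical helpers written for this example; they are not provided by the library.

```cpp
#include <cstddef>
#include <cstdint>

// True when buf starts with the gzip magic bytes and the deflate method byte,
// and is at least long enough to hold the 10-byte header and 8-byte trailer.
inline bool looksLikeGzip(const std::uint8_t* buf, std::size_t len) {
  return buf != nullptr && len >= 18
      && buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 0x08;
}

// Reads the ISIZE trailer (last four bytes, little-endian): the size of the
// original data modulo 2^32. Returns 0 when the buffer is too short.
inline std::uint32_t gzipOriginalSize(const std::uint8_t* buf, std::size_t len) {
  if (buf == nullptr || len < 4) return 0;
  const std::uint8_t* p = buf + len - 4;
  return  static_cast<std::uint32_t>(p[0])
       | (static_cast<std::uint32_t>(p[1]) << 8)
       | (static_cast<std::uint32_t>(p[2]) << 16)
       | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

With the buffer-to-buffer signature listed above, a caller could compare `gzipOriginalSize(dstBuf, compressedSize)` against `srcBufLen` as a cheap consistency check. It does not replace the CRC verification done by a real decompressor.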

Compress to `.gz` (buffer to stream)
-------------------------------
@@ -344,11 +362,94 @@ Compress to `.gz` (stream to stream)
```


TarPacker::pack_files() signatures:
-------------------------------
```cpp
int pack_files(fs::FS *srcFS, std::vector<dir_entity_t> dirEntities, Stream* dstStream, const char* tar_prefix=nullptr);
int pack_files(fs::FS *srcFS, std::vector<dir_entity_t> dirEntities, fs::FS *dstFS, const char*tar_output_file_path, const char* tar_prefix=nullptr);
```
Pack to `.tar` (entities to File)
-------------------------------
```C
std::vector<TAR::dir_entity_t> dirEntities;
TarPacker::collectDirEntities(&dirEntities, &LittleFS, "/folder/to/pack");
auto packedSize = TarPacker::pack_files(&LittleFS, dirEntities, &LittleFS, "/my.archive.tar");
```

Pack to `.tar` (entities to Stream)
-------------------------------
```C
std::vector<TAR::dir_entity_t> dirEntities;
TarPacker::collectDirEntities(&dirEntities, &LittleFS, "/folder/to/pack");
File tarOutfile = LittleFS.open("/my.archive.tar", "w");
size_t packedSize = TarPacker::pack_files(&LittleFS, dirEntities, &tarOutfile);
tarOutfile.close();
```
TarGzPacker::compress() signatures:
-------------------------------
```cpp
int compress(fs::FS *srcFS, const char* srcDir, Stream* dstStream, const char* tar_prefix=nullptr);
int compress(fs::FS *srcFS, const char* srcDir, fs::FS *dstFS, const char* tgz_name, const char* tar_prefix=nullptr);
int compress(fs::FS *srcFS, std::vector<dir_entity_t> dirEntities, Stream* dstStream, const char* tar_prefix=nullptr);
int compress(fs::FS *srcFS, std::vector<dir_entity_t> dirEntities, fs::FS *dstFS, const char* tgz_name, const char* tar_prefix=nullptr);
```


Pack & compress to `.tar.gz` file/stream (no filtering on source files/folders list, recursion applies)
-------------------------------
```C
File TarGzOutFile = LittleFS.open("/my.archive.tar.gz", "w");
size_t compressedSize = TarGzPacker::compress(&LittleFS/*source*/, "/folder/to/compress", &TarGzOutFile);
TarGzOutFile.close();
```

Pack & compress to `.tar.gz` file/stream
-------------------------------

```C
std::vector<TAR::dir_entity_t> dirEntities;
TarPacker::collectDirEntities(&dirEntities, &LittleFS/*source*/, "/folder/to/compress");
// optionally filter content from dirEntities
File TarGzOutFile = LittleFS.open("/my.archive.tar.gz", "w");
size_t compressedSize = TarGzPacker::compress(&LittleFS/*source*/, dirEntities, &TarGzOutFile);
TarGzOutFile.close();
```
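
The filtering step mentioned in the example above can be done with the standard erase-remove idiom. The sketch below is plain C++ and uses a hypothetical `DirEntity` struct as a stand-in for `TAR::dir_entity_t`, whose actual fields are not shown here and may differ. It drops every entry whose path ends with a given suffix, for instance to keep a previously written archive out of the new one.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for TAR::dir_entity_t; the real struct may differ.
struct DirEntity {
  std::string path;
  bool isDir;
};

// True when s ends with suffix.
inline bool endsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size()
      && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Removes every entry whose path ends with the given suffix,
// keeping the remaining entries in their original order.
inline void dropBySuffix(std::vector<DirEntity>& entities, const std::string& suffix) {
  entities.erase(
      std::remove_if(entities.begin(), entities.end(),
                     [&](const DirEntity& e) { return endsWith(e.path, suffix); }),
      entities.end());
}
```

The same pattern works with any predicate (size, directory flag, path prefix); only the lambda needs to change.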
Pack & compress to `.tar.gz` file (no filtering on source files/folders list, recursion applies)
-------------------------------
```C
// the destination file is given by filesystem and path, so no File handle needs to be opened here
size_t compressedSize = TarGzPacker::compress(&LittleFS/*source*/, "/folder/to/compress", &LittleFS/*destination*/, "/my.archive.tar.gz");
```


Pack & compress to `.tar.gz` file
-------------------------------

```C
std::vector<TAR::dir_entity_t> dirEntities;
TarPacker::collectDirEntities(&dirEntities, &LittleFS/*source*/, "/folder/to/compress");
// optionally filter content from dirEntities
// the destination file is given by filesystem and path, so no File handle needs to be opened here
size_t compressedSize = TarGzPacker::compress(&LittleFS/*source*/, dirEntities, &LittleFS/*destination*/, "/my.archive.tar.gz");
```
Callbacks
TarGzUnpacker/GzUnpacker/TarUnpacker Callbacks
---------
```C
@@ -402,7 +503,7 @@ Callbacks
```

Return Codes
TarGzUnpacker/GzUnpacker/TarUnpacker Return Codes
------------

`*Unpacker->tarGzGetError()` returns a value when a problem occurred:
@@ -470,9 +571,9 @@ Test Suite
Known bugs
----------

- tarGzStreamExpander hates SPIFFS
- tarGzExpander/tarExpander: some formats aren't supported with SPIFFS (e.g contains symlinks or long filename/path)
- tarGzExpander without intermediate file hates situations with low heap
- SPIFFS is deprecated, migrate to LittleFS!
- tarGzExpander/tarExpander: symlinks or long filename/path not supported, path limit is 100 chars
- tarGzExpander without intermediate file uses a lot of heap
- tarGzExpander/gzExpander on ESP8266 : while the provided examples will work, the 32Kb dynamic allocation for gzip dictionary is unlikely to work in real world scenarios (e.g. with a webserver) and would probably require static allocation
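
Given the 100-character path limit listed above, it may be worth checking paths before packing them. This is a minimal sketch in plain C++, assuming that a path of exactly 100 characters is still accepted (tighten the constant to 99 if a terminating NUL turns out to be required). `kMaxTarPathLen` and `pathsTooLong` are illustrative names, not library API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Path length limit quoted in the known-bugs list (assumed inclusive).
constexpr std::size_t kMaxTarPathLen = 100;

// Returns the paths that exceed the limit, so they can be renamed or skipped
// before the archive is built.
inline std::vector<std::string> pathsTooLong(const std::vector<std::string>& paths) {
  std::vector<std::string> rejected;
  for (const auto& p : paths) {
    if (p.size() > kMaxTarPathLen) rejected.push_back(p);
  }
  return rejected;
}
```

An empty result means every path fits; otherwise the returned list can be logged or used to filter the entries to pack.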


@@ -482,13 +583,14 @@ Debugging:

- ESP32: use all of the "Debug level" values from the boards menu
- ESP8266: Warning/Error when "Debug Port:Serial" is used, and Debug/Verbose when "Debug Level:Core" is selected from the boards menu
- RP2040: only "Debug port: Serial" and "Debug Level: Core" enable logging


Resources
-----------
- [LittleFS for ESP32](https://github.com/lorol/LITTLEFS)
- [ESP8266 Sketch Data Upload tool for LittleFS](https://github.com/earlephilhower/arduino-esp8266littlefs-plugin)
- [ESP32 Sketch Data Upload tool for FFat/LittleFS/SPIFFS](https://github.com/lorol/arduino-esp32fs-plugin/releases)
- [Pico LittleFS Data Upload tool](https://github.com/earlephilhower/arduino-pico-littlefs-plugin)

![image](https://user-images.githubusercontent.com/1893754/99714053-635de380-2aa5-11eb-98e3-631a94836742.png)

@@ -501,7 +603,6 @@ Alternate links

Credits:
--------
- [pfalcon](https://github.com/pfalcon/uzlib) (uzlib maintainer)
- [dsoprea](https://github.com/dsoprea/TinyUntar) (TinyUntar maintainer)
- [lorol](https://github.com/lorol) (LittleFS-ESP32 + fs plugin)
@@ -511,5 +612,6 @@ Credits:
- [scubachristopher](https://github.com/scubachristopher) (contribution and support)
- [infrafast](https://github.com/infrafast) (feedback fueler)
- [vortigont](https://github.com/vortigont/) (inspiration and support)
- [hitecSmartHome](https://github.com/hitecSmartHome) (feedback fueler)


6 changes: 3 additions & 3 deletions examples/ESP32/WebServer_mod_gzip/GzipStaticHandler.h
@@ -281,7 +281,9 @@ bool GzCacheMiddleware::run(WebServer &server, Middleware::Callback next) {
return next();

assert(gzHandler);
return gzHandler->handle( server, server.method(), server.uri() );
if( gzHandler->handle( server, server.method(), server.uri() ) )
return true;
return next();
}


@@ -445,5 +447,3 @@ GzStaticRequestHandler &GzStaticRequestHandler::setFilter(WebServer::FilterFunct
_filter = filter;
return *this;
}


43 changes: 43 additions & 0 deletions examples/ESP32/WebServer_mod_gzip/WebServer_mod_gzip.ino
@@ -203,6 +203,49 @@ void setup()
// mod_gzip.enableCache(); // cache gz files
mod_gzip.disableCache(); // ignore existing gz files, compress on the fly


server.on("/json", []() { // send gz compressed JSON
// building HTTP response without "Content-Length" header isn't 100% standard, so we have to do this
int responseCode = 200;
const char* myJsonData = "{\"ceci\":\"cela\",\"couci\":\"couça\",\"patati\":\"patata\"}";
server.sendHeader(String(F("Content-Type")), String(F("application/json")), true);
server.sendHeader(String(F("Content-Encoding")), String(F("gzip")));
server.sendHeader(String(F("Connection")), String(F("close")));
String HTTPResponse = String(F("HTTP/1.1"))+' '+String(responseCode)+' '+server.responseCodeToString(responseCode)+"\r\n";
size_t headersCount = server.responseHeaders();
for(size_t i=0;i<headersCount;i++)
HTTPResponse.concat(server.responseHeaderName(i) + F(": ") + server.responseHeader(i) + F("\r\n"));
HTTPResponse.concat(F("\r\n"));
// send HTTP response
server.client().write(HTTPResponse.c_str(), HTTPResponse.length());

// stream compressed json
size_t compressed_size = LZPacker::compress( (uint8_t*)myJsonData, strlen(myJsonData), &server.client() );
log_i("Sent %d compressed bytes", compressed_size);
});


server.on("/spiffs.tar.gz", []() { // compress all filesystem files/folders on the fly
// building HTTP response without "Content-Length" header isn't 100% standard, so we have to do this
int responseCode = 200;
server.sendHeader(String(F("Content-Type")), String(F("application/tar+gzip")), true);
server.sendHeader(String(F("Connection")), String(F("close")));
String HTTPResponse = String(F("HTTP/1.1"))+' '+String(responseCode)+' '+server.responseCodeToString(responseCode)+"\r\n";
size_t headersCount = server.responseHeaders();
for(size_t i=0;i<headersCount;i++)
HTTPResponse.concat(server.responseHeaderName(i) + F(": ") + server.responseHeader(i) + F("\r\n"));
HTTPResponse.concat(F("\r\n"));
// sent HTTP response
server.client().write(HTTPResponse.c_str(), HTTPResponse.length());

// stream tar.gz data
std::vector<TAR::dir_entity_t> dirEntities; // storage for scanned dir entities
TarPacker::collectDirEntities(&dirEntities, &tarGzFS, "/"); // collect dir and files
size_t compressed_size = TarGzPacker::compress(&tarGzFS, dirEntities, &server.client());
log_i("Sent %d compressed bytes", compressed_size);
});


server.addMiddleware( &mod_gzip );

server.begin();