Merge pull request #2248 from opensafely-core/mikerkelly/backup-db-zstd
Make DB backup use Zstandard compression
mikerkelly authored Dec 17, 2024
2 parents ef25815 + 7984176 commit 75dee86
Showing 3 changed files with 16 additions and 8 deletions.
11 changes: 5 additions & 6 deletions DEVELOPERS.md
@@ -71,18 +71,17 @@ then restore from the decompressed backup file. On the production server:

 ```sh
 dokku enter opencodelists
-sqlite3 /app/db.sqlite3 ".backup /storage/backup/previous-db.sqlite"
+sqlite3 /storage/db.sqlite3 ".backup /storage/backup/previous-db.sqlite3"

-cp /storage/backup/db/{PATH_TO_BACKUP_GZ} /storage/backup
-gunzip /storage/backup/{PATH_TO_BACKUP_GZ}
-sqlite3 /app/db.sqlite3 ".restore /storage/backup/{PATH_TO_BACKUP_SQLITE}
+zstd -d /storage/backup/db/{PATH_TO_BACKUP_ZST} -o /storage/backup/restore-db.sqlite3
+sqlite3 /storage/db.sqlite3 ".restore /storage/backup/restore-db.sqlite3
 ```

 When all is confirmed working with the restore, you can delete
-`previous-db.sqlite3`.
+`previous-db.sqlite3` and `restore-db.sqlite3`.

 The latest backup is available via symlink at
-`/storage/backup/db/latest-db.sqlite3.gz`. You can use `scp`, `gunzip` and
+`/storage/backup/db/latest-db.sqlite3.zst`. You can use `scp`, `zstd -d` and
 `sqlite3 ".restore" to bring your local database into the same state as the
 production database. You may also wish to retrieve the coding systems
 databases, otherwise you will not be able to interact with codelists that
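The restore procedure in the DEVELOPERS.md change can be rehearsed locally before touching production. The following is a minimal sketch, not the project's own tooling: it works in a temporary directory, the table `t` and the file name `backup-db.sqlite3` are hypothetical stand-ins, and it only runs the round trip if both `zstd` and `sqlite3` are installed. Each dot-command is passed to `sqlite3` as a single, fully quoted argument.

```sh
#!/bin/sh
# Hypothetical local rehearsal: back up, compress, decompress, restore.
# Every path lives in a temporary directory; none are the production paths.
set -eu

status="skipped"
if command -v zstd >/dev/null 2>&1 && command -v sqlite3 >/dev/null 2>&1; then
    workdir="$(mktemp -d)"

    # Stand-in for the live database (table name "t" is made up).
    sqlite3 "$workdir/db.sqlite3" "CREATE TABLE t (x INTEGER); INSERT INTO t VALUES (1);"

    # Take a backup, then compress it; --rm deletes the uncompressed copy.
    sqlite3 "$workdir/db.sqlite3" ".backup $workdir/backup-db.sqlite3"
    zstd -q --rm "$workdir/backup-db.sqlite3"

    # Change the live database after the backup was taken.
    sqlite3 "$workdir/db.sqlite3" "INSERT INTO t VALUES (2);"

    # Decompress to a separate file (the .zst is kept) and restore from it.
    zstd -q -d "$workdir/backup-db.sqlite3.zst" -o "$workdir/restore-db.sqlite3"
    sqlite3 "$workdir/db.sqlite3" ".restore $workdir/restore-db.sqlite3"

    # Count rows after the restore; the row added post-backup should be gone.
    row_count="$(sqlite3 "$workdir/db.sqlite3" "SELECT COUNT(*) FROM t;")"
    echo "rows after restore: $row_count"

    rm -rf "$workdir"
    status="ok"
fi
echo "round trip: $status"
```

Decompressing to a new file with `-o`, rather than in place, leaves the compressed backup untouched, which mirrors the `restore-db.sqlite3` step in the updated instructions.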
11 changes: 9 additions & 2 deletions deploy/bin/backup.sh
@@ -14,17 +14,24 @@ BACKUP_FILEPATH="$BACKUP_DIR/$BACKUP_FILENAME"
 sqlite3 "$DATABASE_DIR/db.sqlite3" ".backup $BACKUP_FILEPATH"

 # Compress the latest backup.
-gzip -f "$BACKUP_FILEPATH"
+# Zstandard is a fast, modern, lossless data compression algorithm. It gives
+# marginally better compression ratios than gzip on the backup and much faster
+# compression and particularly decompression. We want the backup process to be
+# quick as it's a CPU-intensive activity that could affect site performance.
+# --rm flag removes the source file after compression.
+zstd "$BACKUP_FILEPATH" --rm

 # Symlink to the new latest backup to make it easy to discover.
 # Make the target a relative path -- an absolute one won't mean the same thing
 # in the host file system if executed inside a container as we expect.
-ln -sf "$BACKUP_FILENAME.gz" "$BACKUP_DIR/latest-db.sqlite3.gz"
+ln -sf "$BACKUP_FILENAME.zst" "$BACKUP_DIR/latest-db.sqlite3.zst"

 # Keep only the last 30 days of backups.
 # For now, apply this to both the original backup dir with backups based on the
 # Django dumpdata management command and the new dir with backups based on
 # sqlite .backup. Once there are none of the former remaining, the first line can be
 # removed, along with most of this comment.
 find "$DATABASE_DIR" -name "core-data-*.json.gz" -type f -mtime +30 -exec rm {} \;
+# We initially compressed with gzip, this can be removed when none left.
 find "$BACKUP_DIR" -name "*-db.sqlite3.gz" -type f -mtime +30 -exec rm {} \;
+find "$BACKUP_DIR" -name "*-db.sqlite3.zst" -type f -mtime +30 -exec rm {} \;
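The symlink and retention steps in `backup.sh` can be checked in isolation. The sketch below is an illustration under assumptions, not part of the commit: the dated file names are hypothetical (the real `BACKUP_FILENAME` format is not shown in this diff), the directory is a temporary one, and backdating relies on GNU `touch -d`.

```sh
#!/bin/sh
# Hypothetical demo of the relative symlink and the 30-day retention rule.
set -eu

backup_dir="$(mktemp -d)"

# Two stand-in backups (names are made up): one fresh, one backdated 40 days.
new_name="2024-12-17-db.sqlite3.zst"
old_name="2024-11-07-db.sqlite3.zst"
: > "$backup_dir/$new_name"
: > "$backup_dir/$old_name"
touch -d "40 days ago" "$backup_dir/$old_name"   # GNU touch syntax

# Relative target: the link stores only the file name, so it resolves the
# same way wherever the directory is mounted (host or container).
ln -sf "$new_name" "$backup_dir/latest-db.sqlite3.zst"
link_target="$(readlink "$backup_dir/latest-db.sqlite3.zst")"

# Delete regular files older than 30 days. The symlink also matches the
# name pattern, but -type f excludes it.
find "$backup_dir" -name "*-db.sqlite3.zst" -type f -mtime +30 -exec rm {} \;

# Record what survived before cleaning up.
old_removed="no"
if [ ! -e "$backup_dir/$old_name" ]; then old_removed="yes"; fi
new_kept="no"
if [ -f "$backup_dir/$new_name" ]; then new_kept="yes"; fi
link_kept="no"
if [ -L "$backup_dir/latest-db.sqlite3.zst" ]; then link_kept="yes"; fi

echo "latest -> $link_target"
echo "old removed: $old_removed, new kept: $new_kept, link kept: $link_kept"
rm -rf "$backup_dir"
```

Because `readlink` returns the bare file name rather than an absolute path, the link keeps working when the backup directory is reached through a different mount point.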
2 changes: 2 additions & 0 deletions docker/dependencies.txt
@@ -4,3 +4,5 @@ python3.12
 python3.12-venv
 sqlite3
 tzdata
+# Fast, modern compression utility. Compress backups. Search 'zstd' to find uses.
+zstd