Take backups via sqlite ".backup" #2214

Merged
merged 1 commit into from
Dec 11, 2024
2 changes: 1 addition & 1 deletion DEPLOY.md
@@ -66,7 +66,7 @@ Check cron tasks:
dokku$ dokku cron:list opencodelists
```

Backups are saved to `/var/lib/dokku/data/storage/opencodelists` on dokku3.
Backups are saved to `/var/lib/dokku/data/storage/opencodelists/backup` on dokku3.

### Manually deploying

46 changes: 35 additions & 11 deletions DEVELOPERS.md
@@ -46,23 +46,47 @@ A place to put scripts to be run via [runscript](https://django-extensions.readt

## Production database and backups

The production database and backups are located at `/var/lib/dokku/data/storage/opencodelists` on dokku3 (see also [deployment notes](DEPLOY.md)).
This database is the core (default) database;
the coding system databases are located within the `coding_systems` subdirectory.
Production data is stored on dokku3 at `/storage/` within the container layer
file system. This maps to `/var/lib/dokku/data/storage/opencodelists` in the
host operating system's file system. See also the [deployment notes](DEPLOY.md).

The backups are created with the dumpdata management command (`deploy/bin/backup.sh`).
They can be restored with:
`/storage/db.sqlite3` is the core Django database.

```sh
mv db.sqlite3 previous-db.sqlite3
`/storage/coding_systems` contains the coding system databases. These are
read-only. Refer to their README files for information on the source data and
creation process.

The core database is fully backed up daily on the local file system. Coding
system databases are not backed up locally but can be recreated from source.
Weekly backups of the droplets allow a restore of the file system.

The core database backups are located at `/storage/backup/db`. They are created
by `deploy/bin/backup.sh` scheduled via `cron` as configured in `app.json`.
Backups are taken via the `sqlite3` `.backup` command. These are effectively
copies of the database file. They are compressed to save space.

python manage.py migrate
To restore from a backup, use the command-line tool to create a fresh temporary
backup of the current state of the database (in case anything goes wrong),
then restore from the decompressed backup file. On the production server:

python manage.py loaddata core-data-<date>.json
```sh
dokku enter opencodelists
sqlite3 /app/db.sqlite3 ".backup /storage/backup/previous-db.sqlite3"

cp /storage/backup/db/{PATH_TO_BACKUP_GZ} /storage/backup
gunzip /storage/backup/{PATH_TO_BACKUP_GZ}
sqlite3 /app/db.sqlite3 ".restore /storage/backup/{PATH_TO_BACKUP_SQLITE}"
```

When all is confirmed working with the restore,
you can delete `previous-db.sqlite3`.
When all is confirmed working with the restore, you can delete
`previous-db.sqlite3`.

The latest backup is available via symlink at
`/storage/backup/db/latest-db.sqlite3.gz`. You can use `scp`, `gunzip` and
`sqlite3 ".restore"` to bring your local database into the same state as the
production database. You may also wish to retrieve the coding systems
databases, otherwise you will not be able to interact with codelists that
require them.
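
As a rough illustration of the local restore described above, here is a minimal Python sketch. It assumes the compressed backup has already been copied to the local machine (for example with `scp`). The function name `restore_from_gzip` is hypothetical and not part of this PR. It uses the standard library's `sqlite3.Connection.backup`, which is built on SQLite's online backup API, instead of calling the `sqlite3` command-line tool.

```python
import gzip
import shutil
import sqlite3
from pathlib import Path


def restore_from_gzip(backup_gz: Path, target_db: Path) -> None:
    """Decompress backup_gz and copy its contents over target_db.

    Hypothetical helper: a Python analogue of gunzip followed by
    sqlite3's ".restore", not code from the repository.
    """
    # Decompress next to the archive, dropping the ".gz" suffix.
    decompressed = backup_gz.with_suffix("")
    with gzip.open(backup_gz, "rb") as src, open(decompressed, "wb") as dst:
        shutil.copyfileobj(src, dst)

    source = sqlite3.connect(decompressed)
    target = sqlite3.connect(target_db)
    try:
        # Copies every page of the backup into the target database,
        # replacing whatever the target previously contained.
        source.backup(target)
    finally:
        source.close()
        target.close()

    # The decompressed copy is only a temporary intermediate.
    decompressed.unlink()
```

A call such as `restore_from_gzip(Path("latest-db.sqlite3.gz"), Path("db.sqlite3"))` would overwrite the local database, so it may be worth keeping a copy of the existing file first, as the production instructions do.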

## Local development

37 changes: 26 additions & 11 deletions deploy/bin/backup.sh
@@ -1,17 +1,32 @@
#!/bin/bash

set -euo pipefail
set -euxo pipefail

# We are changing the backup format and where they are stored. We want to
# retain 30 days of backups across both locations and formats. Once there
# are none of the old format remaining, this can be updated to just refer
# to the new location.
REPO_ROOT="/app"
BACKUP_DIR="/storage"
ORIGINAL_BACKUP_DIR="/storage"
BACKUP_DIR="$ORIGINAL_BACKUP_DIR/backup/db"

python \
"$REPO_ROOT"/manage.py \
dumpdata \
builder codelists opencodelists \
--indent 2 \
--verbosity 0 \
--output "${BACKUP_DIR}/core-data-$(date +%F).json.gz"
# Make the backup dir if it doesn't exist.
mkdir -p "$BACKUP_DIR"

# Keep only the last 30 backups
find "$BACKUP_DIR" -name "core-data-*.json.gz" | sort | head -n -30 | xargs rm
# Take a datestamped backup.
BACKUP_FILENAME="$BACKUP_DIR/$(date +%F)-db.sqlite3"
sqlite3 "$REPO_ROOT/db.sqlite3" ".backup $BACKUP_FILENAME"

# Compress the latest backup.
gzip -f "$BACKUP_FILENAME"

# Symlink to the new latest backup to make it easy to discover.
ln -sf "$BACKUP_FILENAME.gz" "$BACKUP_DIR/latest-db.sqlite3.gz"

# Keep only the last 30 days of backups.
# For now, apply this to both the original backup dir with backups based on the
# Django dumpdata management command and the new dir with backups based on
# sqlite .backup. Once there are none of the former remaining, the first line can be
# removed, along with most of this comment.
find "$ORIGINAL_BACKUP_DIR" -name "core-data-*.json.gz" -type f -mtime +30 -exec rm {} \;
find "$BACKUP_DIR" -name "*-db.sqlite3.gz" -type f -mtime +30 -exec rm {} \;
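
For readers who want to try the same flow outside the container, the new steps in `backup.sh` (datestamped backup, compression, "latest" symlink, age-based pruning) can be sketched in Python. This is an illustrative approximation, not code from the PR: the names `take_backup`, `prune_old_backups` and `RETENTION_DAYS` are hypothetical, the backup is taken with `sqlite3.Connection.backup` and not the `sqlite3` CLI, and the age check only approximates `find -mtime +30`, which counts whole 24-hour periods and searches subdirectories.

```python
import gzip
import shutil
import sqlite3
import time
from datetime import date
from pathlib import Path

RETENTION_DAYS = 30  # assumption: mirrors the "-mtime +30" threshold


def take_backup(db_path: Path, backup_dir: Path) -> Path:
    """Write <YYYY-MM-DD>-db.sqlite3.gz into backup_dir and return its path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    raw = backup_dir / f"{date.today().isoformat()}-db.sqlite3"

    source = sqlite3.connect(db_path)
    target = sqlite3.connect(raw)
    try:
        # Online backup: takes a consistent copy even if the source is in use.
        source.backup(target)
    finally:
        source.close()
        target.close()

    # Compress, then drop the uncompressed copy (as "gzip -f" does).
    compressed = raw.with_name(raw.name + ".gz")
    with open(raw, "rb") as src, gzip.open(compressed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    raw.unlink()

    # Repoint the "latest" symlink. A relative target is used here, whereas
    # the shell script links to the absolute path.
    latest = backup_dir / "latest-db.sqlite3.gz"
    if latest.is_symlink() or latest.exists():
        latest.unlink()
    latest.symlink_to(compressed.name)
    return compressed


def prune_old_backups(backup_dir: Path, pattern: str = "*-db.sqlite3.gz") -> list[Path]:
    """Delete regular files matching pattern that are older than RETENTION_DAYS."""
    cutoff = time.time() - RETENTION_DAYS * 86400
    removed = []
    for path in backup_dir.glob(pattern):
        # Skip symlinks such as latest-db.sqlite3.gz, like find's "-type f".
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Pruning by modification time, not by count, means a missed run does not cause older backups to be kept longer than intended, which seems to be the motivation for replacing the earlier `head -n -30` approach.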