How to Monitor Automated Database Backups Effectively

Backups are your last line of defense, but only if they are actually running. Here is a foolproof strategy to ensure your database is safe.

Schrödinger's Backup: "The condition of any backup is unknown until you attempt to restore from it."

While testing your restoration process is critical, there is an even more fundamental step that teams often miss: ensuring the automated backup script is actually triggering and completing successfully every single night.

Here is a step-by-step guide to writing a bulletproof database backup script that alerts you the moment it fails.

Step 1: Write a Verbose Script

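A good backup script should use strict error handling. In bash, this means starting with set -e so the script exits immediately if any command fails. Adding -u and -o pipefail (written together as set -euo pipefail) also stops the script on unset variables and on a failure anywhere in a pipeline, not just in its last command.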

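#!/bin/bash
# Exit on any error (-e), on unset variables (-u), and on failures inside pipelines
set -euo pipefail

# Define variables
DB_USER="admin"
DB_NAME="production_db"
BACKUP_DIR="/var/backups/postgres"
DATE=$(date +%Y-%m-%d_%H-%M-%S)
FILE_NAME="$BACKUP_DIR/$DB_NAME-$DATE.sql"

# Run the dump (variables are quoted so unusual characters cannot split the arguments)
echo "Starting backup..."
pg_dump -U "$DB_USER" "$DB_NAME" > "$FILE_NAME"

# Compress it (gzip replaces the .sql file with a .sql.gz file)
gzip "$FILE_NAME"

# Optional: Copy to S3
aws s3 cp "$FILE_NAME.gz" s3://my-company-backups/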
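
One caveat with set -e is that it only looks at the exit status of the last command in a pipeline. That does not affect the script above, which writes the dump to a file first, but it matters if you later pipe pg_dump straight into gzip. The sketch below is a demonstration only: false stands in for a failing pg_dump and cat for gzip.

```shell
#!/bin/bash
# Demonstration only: `false` stands in for a failing pg_dump, `cat` for gzip.

# Without pipefail, the pipeline's status is that of its last command (cat),
# so set -e does not trigger and the script keeps going.
without=$(bash -c 'set -e; false | cat; echo "continued"' || true)

# With pipefail, the failing first command fails the whole pipeline,
# so set -e stops the script before the echo runs.
with=$(bash -c 'set -eo pipefail; false | cat; echo "continued"' || true)

echo "without pipefail: '${without}'"
echo "with pipefail:    '${with}'"
```

In the first case the inner script carries on past the failure; in the second it stops, which is the behavior a backup script needs.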

Step 2: Add the Heartbeat

If pg_dump fails (e.g., bad password, disk full, database offline), the set -e instruction will terminate the script immediately.

This gives us the perfect mechanism for our dead man's switch. Since the script will only reach the very last line if every preceding command was successful, we place our CronRabbit ping at the very end.

# ... previous code ...
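aws s3 cp "$FILE_NAME.gz" s3://my-company-backups/

# If we get here, it was a total success!
# -f treats HTTP errors as failures, -sS hides the progress bar but still prints errors,
# -m 10 caps the request at 10 seconds, and --retry 3 rides out brief network blips
curl -fsS -m 10 --retry 3 https://ping.cronrabbit.com/your-backup-monitor-id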
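
If you would rather keep the ping out of the backup script itself, the same "ping only on success" rule can be expressed as a small wrapper. This is a minimal sketch, not part of CronRabbit: run_job and send_heartbeat are hypothetical helper names, and PING_URL reuses the placeholder URL from above.

```shell
#!/bin/bash
# Sketch of a success-only heartbeat wrapper. The helper names are hypothetical.
PING_URL="https://ping.cronrabbit.com/your-backup-monitor-id"  # placeholder ID

# send_heartbeat: pings the monitor; fails on HTTP errors or after 10 seconds.
send_heartbeat() {
  curl -fsS -m 10 --retry 3 "$PING_URL" > /dev/null
}

# run_job: runs the given command and sends the heartbeat only if it exits 0.
run_job() {
  if "$@"; then
    # A failed ping is only logged here; the missing heartbeat will still
    # raise an alert on the monitoring side.
    send_heartbeat || echo "warning: job succeeded but heartbeat failed" >&2
    return 0
  fi
  echo "job failed, heartbeat withheld" >&2
  return 1
}
```

With this in place, the cron entry would call something like run_job /usr/local/bin/db-backup.sh, where the script path is an assumption for illustration.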

Step 3: Configure the Alert

In the CronRabbit dashboard:

  1. Create a monitor named "Production DB Backup".
  2. Set the expected schedule to match your crontab (e.g., 0 2 * * * for 2 AM daily).
  3. Set the Grace Period to a reasonable amount (e.g., 2 hours).

That's it. If the cron daemon crashes, if the database is locked, or if the S3 bucket permissions change, the curl command will never execute. Two hours after the scheduled time, CronRabbit will blast an alert into your Slack channel, letting you know your safety net has a hole in it.
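
When choosing the grace period in step 3, it helps to know how long the backup actually takes, since a grace period shorter than the run time would raise false alarms. The sketch below times a stand-in job with bash's built-in SECONDS counter; fits_grace_period is a hypothetical helper, and sleep 1 takes the place of the real backup.

```shell
#!/bin/bash
# fits_grace_period (hypothetical helper): succeeds if a run time in seconds
# is strictly shorter than a grace period given in hours.
fits_grace_period() {
  local duration_s="$1" grace_h="$2"
  [ "$duration_s" -lt $(( grace_h * 3600 )) ]
}

# Time a stand-in job using bash's SECONDS counter.
SECONDS=0
sleep 1   # replace with the real backup script
duration=$SECONDS

if fits_grace_period "$duration" 2; then
  echo "ok: ${duration}s fits within a 2-hour grace period"
else
  echo "too slow: ${duration}s, consider a longer grace period"
fi
```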