Backup and Operations

Backup and Operations

1. Importance of Backup

Database backup is the most important task to prevent data loss.

┌──────────────────────────────────────────────────────────┐
│                    Backup Strategy                        │
├──────────────────────────────────────────────────────────┤
│  • Regular backups: Daily/weekly full backup              │
│  • Incremental backups: WAL archiving                     │
│  • Replication: Real-time replica servers                 │
└──────────────────────────────────────────────────────────┘

2. pg_dump - Logical Backup

Basic Backup

# Single database backup
pg_dump dbname > backup.sql

# Specify user/host
pg_dump -U username -h localhost dbname > backup.sql

# Compressed backup
pg_dump dbname | gzip > backup.sql.gz

Format Options

# Plain text SQL (-Fp, default)
pg_dump -Fp dbname > backup.sql

# Custom format (-Fc, compressed, selective restore)
pg_dump -Fc dbname > backup.dump

# Directory format (-Fd, parallel backup/restore support)
pg_dump -Fd dbname -f backup_dir

# Tar format (-Ft)
pg_dump -Ft dbname > backup.tar

Selective Backup

# Specific tables only
pg_dump -t users -t orders dbname > tables.sql

# Exclude specific tables
pg_dump -T logs -T temp_* dbname > backup.sql

# Schema only (exclude data)
pg_dump -s dbname > schema.sql

# Data only (exclude schema)
pg_dump -a dbname > data.sql

# Specific schema only
pg_dump -n public dbname > public_schema.sql

Backup from Docker

# Run pg_dump in Docker container
docker exec -t postgres-container pg_dump -U postgres dbname > backup.sql

# Compressed backup
docker exec -t postgres-container pg_dump -U postgres dbname | gzip > backup.sql.gz

3. pg_dumpall - Full Cluster Backup

Backs up all databases and global objects (users, permissions, etc.).

# Full cluster backup
pg_dumpall -U postgres > full_backup.sql

# Global objects only (users, roles, etc.)
pg_dumpall -U postgres --globals-only > globals.sql

# Roles only
pg_dumpall -U postgres --roles-only > roles.sql

4. pg_restore - Restore

Restoring SQL Files

# Restore plain SQL
psql dbname < backup.sql

# Create new database and restore
createdb newdb
psql newdb < backup.sql

Restoring Custom/Directory Format

# Restore custom format
pg_restore -d dbname backup.dump

# Restore to new database
createdb newdb
pg_restore -d newdb backup.dump

# Restore specific table only
pg_restore -d dbname -t users backup.dump

# Parallel restore (4 workers)
pg_restore -d dbname -j 4 backup_dir

Restore Options

# Drop existing objects before restore
pg_restore -d dbname --clean backup.dump

# Ignore errors and continue
pg_restore -d dbname --if-exists backup.dump

# Data only restore
pg_restore -d dbname --data-only backup.dump

# Schema only restore
pg_restore -d dbname --schema-only backup.dump

5. Physical Backup (pg_basebackup)

Backs up the entire data directory.

# Basic backup
pg_basebackup -D /backup/path -U postgres -Fp -Xs -P

# Compressed backup
pg_basebackup -D /backup/path -U postgres -Ft -z -P

# Option descriptions:
# -D: Backup directory
# -Fp: Plain format
# -Ft: Tar format
# -Xs: WAL streaming
# -z: gzip compression
# -P: Show progress

WAL Archiving Setup

postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'

6. Automated Backup Script

Daily Backup Script

#!/bin/bash
# daily_backup.sh

# Configuration
DB_NAME="mydb"
DB_USER="postgres"
BACKUP_DIR="/backup/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=7

# Create backup directory
mkdir -p $BACKUP_DIR

# Execute backup
pg_dump -U $DB_USER -Fc $DB_NAME > $BACKUP_DIR/${DB_NAME}_${DATE}.dump

# Compress
gzip $BACKUP_DIR/${DB_NAME}_${DATE}.dump

# Delete old backups
find $BACKUP_DIR -name "*.dump.gz" -mtime +$RETENTION_DAYS -delete

echo "Backup completed: ${DB_NAME}_${DATE}.dump.gz"

Cron Setup

# crontab -e
# Backup daily at 2 AM
0 2 * * * /scripts/daily_backup.sh >> /var/log/backup.log 2>&1

7. Monitoring

Database Size

-- Database sizes
SELECT
    datname,
    pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Table sizes
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) AS total_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC
LIMIT 10;

Connection Status

-- Current connection count
SELECT COUNT(*) FROM pg_stat_activity;

-- Connections by state
SELECT state, COUNT(*)
FROM pg_stat_activity
GROUP BY state;

-- Active queries
SELECT
    pid,
    now() - query_start AS duration,
    query,
    state
FROM pg_stat_activity
WHERE state != 'idle'
  AND query NOT LIKE '%pg_stat_activity%'
ORDER BY duration DESC;

Slow Queries

-- Queries running longer than 5 seconds
SELECT
    pid,
    now() - query_start AS duration,
    query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 seconds';

Lock Status

-- Queries waiting for locks
SELECT
    blocked.pid AS blocked_pid,
    blocked.query AS blocked_query,
    blocking.pid AS blocking_pid,
    blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
    ON blocking.pid = ANY(pg_blocking_pids(blocked.pid));

8. Performance Statistics

Table Statistics

-- Table access statistics
SELECT
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan,
    idx_tup_fetch,
    n_tup_ins,
    n_tup_upd,
    n_tup_del
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;

Index Usage

-- Unused indexes
SELECT
    schemaname,
    tablename,
    indexname,
    idx_scan,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

Cache Hit Rate

-- Cache hit rate (99%+ is good)
SELECT
    sum(blks_hit) * 100.0 / sum(blks_hit + blks_read) AS cache_hit_ratio
FROM pg_stat_database;

9. Maintenance

VACUUM

Cleans up unnecessary space.

-- Regular VACUUM
VACUUM;
VACUUM users;

-- VACUUM FULL (rebuilds table, locks table)
VACUUM FULL users;

-- VACUUM ANALYZE (includes statistics update)
VACUUM ANALYZE users;

ANALYZE

Collects statistics for query optimization.

ANALYZE;
ANALYZE users;

REINDEX

Rebuilds indexes.

REINDEX TABLE users;
REINDEX DATABASE mydb;

Autovacuum Settings

postgresql.conf:

autovacuum = on
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50

10. Log Configuration

postgresql.conf:

# Log destination
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d.log'

# Log level
log_min_messages = warning
log_min_error_statement = error

# Query logging
log_statement = 'ddl'           # none, ddl, mod, all
log_duration = off
log_min_duration_statement = 1000  # Queries longer than 1 second

# Connection logging
log_connections = on
log_disconnections = on

11. Security Settings

pg_hba.conf

# TYPE  DATABASE    USER        ADDRESS         METHOD

# Local connections
local   all         all                         peer

# IPv4 local connections
host    all         all         127.0.0.1/32    scram-sha-256

# Allow specific network
host    mydb        appuser     192.168.1.0/24  scram-sha-256

# Deny specific IP
host    all         all         192.168.1.100   reject

SSL Configuration

# postgresql.conf
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'

12. Practice Examples

Practice 1: Backup and Restore

# 1. Backup
pg_dump -U postgres -Fc mydb > mydb_backup.dump

# 2. Create new database
createdb -U postgres mydb_restored

# 3. Restore
pg_restore -U postgres -d mydb_restored mydb_backup.dump

# 4. Verify
psql -U postgres -d mydb_restored -c "SELECT COUNT(*) FROM users;"

Practice 2: Save Monitoring Queries

-- Create monitoring views
CREATE VIEW v_db_stats AS
SELECT
    datname,
    pg_size_pretty(pg_database_size(datname)) AS size,
    numbackends AS connections
FROM pg_database
WHERE datistemplate = false;

CREATE VIEW v_slow_queries AS
SELECT
    pid,
    now() - query_start AS duration,
    state,
    query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 seconds';

-- Usage
SELECT * FROM v_db_stats;
SELECT * FROM v_slow_queries;

Practice 3: Maintenance Script

-- Regular maintenance procedure
CREATE PROCEDURE run_maintenance()
AS $$
BEGIN
    -- Update statistics
    ANALYZE;

    -- Clean unnecessary space
    VACUUM;

    RAISE NOTICE 'Maintenance completed: %', NOW();
END;
$$ LANGUAGE plpgsql;

-- Execute
CALL run_maintenance();

13. Checklist

Daily Checks

  • [ ] Verify backup success
  • [ ] Check disk usage
  • [ ] Check connection count
  • [ ] Review error logs

Weekly Checks

  • [ ] Check index usage
  • [ ] Analyze slow queries
  • [ ] Monitor table size trends

Monthly Checks

  • [ ] Test backup restore
  • [ ] Clean up unnecessary data
  • [ ] Analyze performance trends

Conclusion

This concludes the PostgreSQL learning materials.

Review of Learning Sequence: 1. Basics → DB management → Tables → CRUD → Conditions/sorting 2. JOIN → Aggregation → Subqueries → Views/indexes 3. Functions → Transactions → Triggers → Backup/operations

For deeper learning: - PostgreSQL Official Documentation - PostgreSQL Tutorial

to navigate between lessons