July 29, 2025

PostgreSQL pg_upgrade: Workflow Analysis & Usage Guide

Master PostgreSQL major-version upgrades with pg_upgrade: explore its workflow, key options, and best practices for fast in-place data migration, and keep downtime to a minimum with expert tips.

Core Concepts of pg_upgrade

pg_upgrade enables fast PostgreSQL major-version upgrades by:

  1. Copying (or linking) data files directly from the old cluster to the new one, avoiding a full dump and reload of table data.
  2. Migrating database definitions (schemas and global objects) via pg_dump/pg_restore.

Key Requirements:

  • Runs only between compatible PostgreSQL versions (e.g., v11 → v16).
  • Requires both old and new PostgreSQL installations.
  • New instances must use initdb with matching parameters (e.g., --wal-segsize).
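As a concrete illustration, initializing the new cluster might look like the following; the binary paths, locale, and WAL segment size are assumptions for this sketch and must be adapted to match your old cluster.

```shell
# Sketch: initialize the new cluster with settings matching the old one.
# Paths, locale, and the 64MB WAL segment size are illustrative assumptions.
/usr/pgsql-16/bin/initdb \
  --pgdata=/var/lib/pgsql/16/data \
  --encoding=UTF8 \
  --locale=en_US.UTF-8 \
  --wal-segsize=64   # only needed if the old cluster used a non-default size
```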

Critical Usage Notes

1. Upgrade Modes

  • **--check**: Validates compatibility without modifying data (useful for pre-flight checks).
  • **--link/--clone**: Avoids copying data files by hard-linking (--link) or copy-on-write cloning (--clone); much faster, but with --link the old cluster must not be started again once the new one has run.
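A pre-flight run with --check might look like this; the version numbers and directory paths are assumptions for the sketch.

```shell
# Sketch: dry-run compatibility check; nothing is modified.
# Binary and data directory paths are illustrative assumptions.
/usr/pgsql-16/bin/pg_upgrade \
  --old-bindir=/usr/pgsql-11/bin \
  --new-bindir=/usr/pgsql-16/bin \
  --old-datadir=/var/lib/pgsql/11/data \
  --new-datadir=/var/lib/pgsql/16/data \
  --check
```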

2. Operational Constraints

  • Locking: pg_upgrade requires exclusive access; both the old and new clusters must be stopped while it runs.
  • Port Conflict: Defaults to port 50432 for temporary instances.
  • Manual Post-Upgrade Steps:
    • Sync postgresql.conf/pg_hba.conf.
    • Run ANALYZE to rebuild optimizer statistics (via the generated analyze_new_cluster.sh; in PostgreSQL 14+ use vacuumdb --all --analyze-in-stages instead).
    • Upgrade extensions (via update_extensions.sql).
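The manual post-upgrade steps above might be carried out roughly as follows; the paths are assumptions, and the helper scripts are the ones pg_upgrade generates in the current directory.

```shell
# Sketch: manual post-upgrade steps; paths are illustrative assumptions.
cp /var/lib/pgsql/11/data/pg_hba.conf /var/lib/pgsql/16/data/   # carry over auth rules
./analyze_new_cluster.sh                   # rebuild optimizer statistics (PG <= 13)
psql -f update_extensions.sql postgres     # upgrade installed extensions, if generated
```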

3. Limitations

  • No Automatic Syncs: Configuration files, tablespace paths, external plugins/shared libraries, and certain full-text search files must be migrated by hand.
  • High Availability: Requires per-node execution in clustered environments.

Detailed Workflow (Version 13.8+)

Phase 1: Pre-Upgrade Checks

  1. Compatibility Validation:
    • Verifies binary compatibility using pg_controldata.
    • Checks for deprecated features (e.g., system types like regclass).
  2. Old Instance Inspection:
    • Dumps global objects (users, tablespaces) with pg_dumpall --globals-only.
    • Extracts schema-only dumps for user databases (excluding template0).
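The inspection commands pg_upgrade issues internally can also be run by hand to audit what will be migrated; the port and database name here are assumptions for the sketch.

```shell
# Sketch: audit global objects and schemas before upgrading.
# Port and database name are illustrative assumptions.
/usr/pgsql-11/bin/pg_dumpall --globals-only --port=5432 > globals.sql
/usr/pgsql-11/bin/pg_dump --schema-only --port=5432 mydb > mydb_schema.sql
```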

Phase 2: New Instance Setup

  1. Initialization:
    • Creates a new data directory with initdb.
    • Ensures compatibility with old instance settings (e.g., locale, encoding).
  2. Data Import:
    • Restores global objects and user schemas using psql/pg_restore.
    • Prioritizes template1 to avoid dependency issues.

Phase 3: Data Migration

  1. File Copy Strategies:
    • Hard Links (--link): The new cluster's files are hard links to the old cluster's files (fastest; no extra disk space).
    • Clone (--clone): Uses copy-on-write file cloning (requires filesystem support, e.g., XFS reflinks, Btrfs, or APFS).
  2. Finalization:
    • Sets the new cluster's next transaction ID and WAL position with pg_resetwal.
    • Generates cleanup scripts (delete_old_cluster.sh).
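Putting the phases together, a hard-link upgrade might be invoked like this once --check passes; the paths are assumptions, and the cleanup script appears only after a successful run.

```shell
# Sketch: full upgrade in hard-link mode; run only after --check succeeds.
# Binary and data directory paths are illustrative assumptions.
/usr/pgsql-16/bin/pg_upgrade \
  --old-bindir=/usr/pgsql-11/bin \
  --new-bindir=/usr/pgsql-16/bin \
  --old-datadir=/var/lib/pgsql/11/data \
  --new-datadir=/var/lib/pgsql/16/data \
  --link
# After verifying the new cluster works, reclaim the old cluster's space:
./delete_old_cluster.sh
```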

Troubleshooting & Logs

Key log files for debugging:

  • pg_upgrade_internal.log: Tool execution details.
  • pg_upgrade_server.log: Server logs during startup/shutdown.
  • pg_upgrade_dump_*.log: Schema dump errors.
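A quick way to surface problems from these logs after a failed run; the error keywords are a minimal assumption, and the per-database dump logs can be added to the same pattern.

```shell
# Sketch: surface errors from the pg_upgrade log files in one pass.
grep -iE 'error|fatal' pg_upgrade_internal.log pg_upgrade_server.log \
  || echo "no errors found"
```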

Best Practices

  1. Test Thoroughly: Use --check in a staging environment.
  2. Parallelize: Leverage --jobs for faster exports/imports.
  3. Backup First: Always retain backups before upgrading.
  4. Monitor Disk Space: Data duplication can double storage needs.

For distributed or replicated systems, follow this rule: run pg_upgrade on every node, upgrading them one at a time, to maintain cluster integrity.

Need help optimizing your upgrade strategy? Share your use case! 🚀
