How to Migrate Cloud PostgreSQL to On-Prem with Minimal Downtime
In the last decade, cloud adoption has accelerated as organizations looked to scale quickly, reduce upfront capital investment, and leverage managed services. However, in recent years, a growing number of companies are choosing to migrate certain workloads back from the cloud to on-premises environments.
The reasons vary: cost reduction, compliance with data sovereignty laws, improved performance, and more. Once the decision is made, one of the most critical migration challenges is minimizing downtime. Today’s businesses cannot afford long service interruptions, and customers expect 24/7 availability.
This blog will explore best practices for cloud-to-on-premises database migration, featuring a step-by-step example of migrating from AWS RDS PostgreSQL to a self-hosted environment with ClusterControl.
Pre-Migration Planning
1. Workload Assessment
Before embarking on any database migration, a comprehensive, meticulous assessment of the existing database infrastructure is paramount. This initial phase is critical for identifying potential challenges, optimizing the migration strategy, and ensuring a smooth transition with minimal disruption. The assessment should begin with a thorough identification of the following key aspects:
- Database Type: OLTP (transaction-heavy) databases such as MySQL or PostgreSQL require different approaches compared to OLAP data warehouses.
- Criticality of Data: Which databases are mission-critical, and which can tolerate longer downtime?
- Size of Data: Multi-terabyte datasets require more sophisticated strategies than smaller workloads.
2. Sizing & Capacity Planning
For optimal performance and scalability, an on-premises infrastructure must be robust enough to manage existing workloads efficiently while simultaneously offering ample headroom to accommodate future growth. The key considerations:
- CPU and Memory: Match or exceed cloud specs.
- Storage: Low-latency SSDs for OLTP workloads.
- Network Bandwidth: High throughput between cloud and on-prem ensures faster sync.
3. Security & Compliance
Security and compliance are paramount considerations throughout the migration process, encompassing both the transfer of data and its subsequent state at rest. During data transfer, robust encryption protocols, such as TLS/SSL, are essential to prevent interception and unauthorized access. This ensures data integrity and confidentiality as it moves from the source to the destination environment. Furthermore, secure network configurations, including firewalls and intrusion detection systems, play a crucial role in safeguarding data in transit by mitigating potential cyber threats.
4. Network Connectivity
Data transfer is a critical component of any database migration, especially with large datasets. The initial snapshot of the database must be completed quickly to minimize downtime, and the ongoing replication during the migration process needs to be efficient to prevent any lag or data inconsistencies.
To facilitate large-scale data transfers between cloud environments and on-premises infrastructure, it’s crucial to establish secure, high-bandwidth connections. One common solution is to utilize VPN tunnels, which provide a secure pathway over the public internet.
However, for large volume transfers or those requiring guaranteed performance, dedicated private lines are highly recommended. These include services like AWS Direct Connect, Azure ExpressRoute, and GCP Interconnect. These dedicated connections bypass the public internet, offering lower latency, higher throughput, and greater reliability, which are essential for maintaining data integrity and minimizing timelines.
Migration Strategies
Migrating a database from a cloud environment to an on-premises infrastructure presents a unique set of challenges and considerations, demanding a carefully tailored approach. There is no single, universally applicable migration strategy; the optimal path depends heavily on various factors, including the database type, its size and complexity, the specific cloud provider and its services used, the target environment’s capabilities, downtime tolerance, data sensitivity, and regulatory compliance requirements.
Below are common approaches to online migration with minimal downtime:
1. Physical Replication
Physical replication typically utilizes WAL/binary logs to replicate data from a source to a target database. This method is well-suited for homogeneous migrations, e.g., MySQL → MySQL or PostgreSQL → PostgreSQL. Essentially, it establishes an on-premises replica node that is synchronized with the cloud environment. Note that some managed cloud services do not allow physical replication to external targets, which is one reason the use case later in this post relies on logical replication.
2. Logical Replication / Change Data Capture (CDC)
Logical replication operates at a higher level of abstraction compared to physical replication. Instead of replicating block-level changes or entire data files, it focuses on replicating the actual data modifications, such as individual row insertions, updates, and deletions, as well as the transactional order in which these changes occur. This method provides greater flexibility, as it is independent of the underlying storage engine or physical data layout.
A key advantage of logical replication is its ability to replicate data between different versions of a database or even different database systems, provided they support a compatible logical replication protocol. This makes it an ideal solution for heterogeneous environments, upgrades, and migrations.
Furthermore, it allows for selective replication, where only specific tables or subsets of data are replicated, reducing network bandwidth usage and storage requirements on the replica. You can utilize tools such as AWS DMS, GCP Data Migration Services, Debezium + Kafka, or pg_logical in PostgreSQL.
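For example, with PostgreSQL’s built-in logical replication, a publication can be limited to just the tables being migrated. The sketch below is illustrative; the table, publication, and connection details are placeholders, and the target must already contain the table schemas:
-- On the source: publish only the tables you want to migrate
CREATE PUBLICATION orders_pub FOR TABLE public.orders, public.customers;
-- On the target: subscribe to that publication
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=<source-host> dbname=mydb user=replica password=<password>'
    PUBLICATION orders_pub;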
Techniques to Minimize Downtime
1. Change Data Capture (CDC)
Change Data Capture (CDC) can play a crucial role in ensuring data consistency and minimal downtime during database migrations, particularly when moving from a cloud-based database to an on-premises solution. By continuously capturing and streaming write operations from the source cloud DB to the target on-prem DB in near real-time, CDC significantly reduces the data gap between the two systems. This real-time synchronization is essential for a smooth and efficient cutover process.
The cutover itself is designed to be as seamless as possible. Once the on-premises database has caught up with the cloud database, the critical step involves re-routing all query traffic. This means directing applications and services that previously accessed the cloud database to now point to the newly synchronized on-premises database. The downtime experienced during this switchover is limited exclusively to the brief period required for these database endpoint changes to propagate and take effect. This approach minimizes the impact on ongoing operations, ensuring that business continuity is maintained throughout the migration.
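Before scheduling the cutover, it helps to quantify how far behind the target is. If the CDC pipeline is PostgreSQL’s built-in logical replication, the outstanding WAL per slot can be checked on the source; managed CDC services such as AWS DMS expose lag through their own metrics instead:
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS replication_lag
FROM pg_replication_slots
WHERE slot_type = 'logical';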
2. Read Replica Promotion
In a hybrid cloud environment, a read replica promotion strategy essentially creates a unified, single cluster. This setup involves an active, primary database residing in the cloud, with a replica maintained on-premises. The core principle is to ensure continuous replication between the cloud and on-premises nodes until a planned cutover event.
It involves promoting the on-premises replica to become the new primary database. Once this promotion is complete, all query traffic is re-routed from the previous cloud primary to the newly promoted on-premises primary. This allows for a seamless transition, ensuring data consistency and continuous service availability during the cutover.
The on-premises replica’s ability to take over as the primary ensures business continuity, even in non-migration scenarios where the cloud primary might experience issues. This architecture provides flexibility, disaster recovery capabilities, and potentially reduced latency for on-premises applications accessing the local primary.
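On a self-hosted PostgreSQL standby, the promotion itself is a single call (PostgreSQL 12 or later); cloud-managed read replicas are promoted through the provider’s console or API instead. A minimal sketch:
-- Run on the on-prem standby once it has caught up with the source
SELECT pg_promote();
-- Verify the node now accepts writes
SELECT pg_is_in_recovery();  -- should return false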
Testing & Validation
Testing and validation are essential parts of the migration process. We need to ensure the data in the target matches the source exactly, which involves a multi-faceted approach to verify data integrity, consistency, and completeness.
Key Aspects of Testing and Validation:
- Data Integrity Verification: This involves checking for any corruption or alteration of data during transit. Techniques include:
  - Checksums and Hashing: Comparing checksums or hash values of data blocks between source and target.
  - Record Counts: Verifying that the total number of records in each table matches between the source and target (see the SQL sketch after this list).
  - Row-by-Row Comparison: For critical tables, a detailed comparison of individual rows and their values is essential to catch subtle discrepancies. This can be achieved through custom scripts or data comparison tools.
- Data Consistency Checks: Ensuring that relationships between data elements are maintained and that data types are accurately mapped.
  - Foreign Key Constraints: Validating that all foreign key relationships are correctly established and enforced in the target database.
  - Data Type Mapping: Confirming that data types from the source are appropriately converted to the target database’s data types, avoiding truncation or loss of precision.
  - Uniqueness Constraints: Verifying that unique keys and primary keys are correctly enforced.
- Data Completeness Validation: Guaranteeing that all expected data has been successfully migrated.
  - Schema Comparison: Comparing the schema of the source and target databases to ensure all tables, columns, indexes, and views are present.
  - Data Volume Comparison: Checking the total volume of data (e.g., in gigabytes) to ensure a similar size, accounting for potential differences in storage efficiency.
  - Specific Data Set Spot Checks: Selecting a representative sample of data and performing manual checks to confirm its presence and accuracy in the target.
- Application-Level Testing: Beyond just the database, it’s vital to test how applications interact with the newly migrated database.
  - Functional Testing: Ensuring that all application functionalities work as expected with the new database.
  - Performance Testing: Benchmarking the performance of the applications with the migrated database to ensure it meets or exceeds previous performance levels. This includes query response times, transaction throughput, and overall system responsiveness.
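As a starting point for the record-count and checksum checks above, plain SQL run on both the source and the target can go a long way. The table and column names below are placeholders, and the checksum only matches if both sides render rows to text identically:
-- Exact row count for a table; compare the result on source and target
SELECT count(*) FROM my_table;
-- Simple content checksum for a critical table
SELECT md5(string_agg(t::text, '' ORDER BY t.id)) AS table_checksum
FROM my_table AS t;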
Cutover Planning
The database cutover from cloud to on-premises typically occurs during low-traffic periods, e.g. nights / weekends. A rollback plan should be prepared to revert to the cloud database if validation issues arise. Additionally, don’t forget to inform users about the cutover beforehand.
Post-Migration Best Practices
After the move to your on-prem environment, focus on the following:
- Monitoring & Alerting: Keep monitoring the database workload after the migration is completed. Watch OS metrics such as CPU, memory, and disk usage, as well as database metrics like buffer pool hit ratio, connections, row operations, locks, and checkpoints. Tools such as Prometheus + Grafana, Percona Monitoring and Management, and pg_stat_statements can help (see the query sketch after this list).
- Performance Tuning: Tuning is a repeatable activity that needs to be monitored and corrected regularly. It can include database parameter tuning, query tuning, and indexing strategies.
- Security Hardening: Ensure the firewall rules meet your requirements, especially for connections from the application to the database. Implement least-privilege access and rotate credentials regularly.
- Cost Validation: After the migration, compare the actual on-prem TCO with the cloud environment, and don’t forget to factor in operational overhead.
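For the query-level visibility mentioned above, pg_stat_statements (when the extension is installed) is a convenient starting point. The column names below apply to PostgreSQL 13 and later; older releases use total_time and mean_time:
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;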
Use Case: PostgreSQL Cloud to On-Prem Migration with ClusterControl
In a Cloud PostgreSQL to on-prem migration, we will utilize logical replication to replicate data between the cloud and on-prem environments. Before setting up logical replication from Cloud RDS PostgreSQL to self-hosted PostgreSQL, we need to configure the following parameters in RDS PostgreSQL (a restart is required):
rds.logical_replication = 1
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
Step 1: Prepare On-Prem PostgreSQL Cluster
Set up a PostgreSQL streaming replication cluster on-prem (at least two nodes for HA). We can use ClusterControl to spin up a PostgreSQL cluster with two nodes (primary and replica) and configure replication slots for resilience.
After the PostgreSQL cluster is up and running in the on-prem environment, we need to ensure the following parameters are set:
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
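A quick way to confirm these values took effect after the restart, on both the RDS source and the on-prem nodes, is to query pg_settings:
SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');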
Step 2: Schema Backup and Restore
When using logical replication, only the schema (DDL) needs to be backed up and restored to mirror the existing database structure; the data itself is replicated separately. Make sure no DDL changes occur while logical replication is being set up, since schema changes are not replicated.
- Run a schema-only backup with pg_dump against the Cloud PostgreSQL node:
pg_dump -h mydb.xxxxxx.ap-southeast-1.rds.amazonaws.com -p 5432 -U dbadmin -d mydb --schema-only -f db_schema_dump.sql
- Create your on-prem database:
psql# CREATE DATABASE mydb;
- Restore the schema backup into your on-prem PostgreSQL database.
psql -h <onprem-host> -p 5432 -U <username> -f db_schema_dump.sql mydb
Step 3: Logical Replication (Publication & Subscription)
PostgreSQL logical replication relies on tables having a primary key because every replicated table must have a REPLICA IDENTITY, which is the primary key by default. If the table does not have a primary key, we must configure the table as REPLICA IDENTITY FULL.
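One way to spot such tables on the source before creating the publication is to query the catalogs for user tables that lack a primary key constraint (a sketch, limited to ordinary tables outside the system schemas):
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
      SELECT 1 FROM pg_constraint con
      WHERE con.conrelid = c.oid AND con.contype = 'p'
  );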
- If a table does not have a primary key, we need to configure the following in Cloud RDS PostgreSQL:
ALTER TABLE my_table REPLICA IDENTITY FULL;
- On Cloud RDS PostgreSQL (source), create a publication:
CREATE PUBLICATION mypub FOR ALL TABLES;
- On your on-prem PostgreSQL (target), create a subscription:
CREATE SUBSCRIPTION mysub
CONNECTION 'host=mydb.xxxxxx.ap-southeast-1.rds.amazonaws.com dbname=mydb user=replica password=pass'
PUBLICATION mypub
WITH (
copy_data = true,
create_slot = true,
enabled = true
);
The copy_data option performs an initial copy of the existing data from Cloud RDS to the on-prem PostgreSQL before streaming ongoing changes.
- Monitor data replication in the source (Cloud RDS PostgreSQL) and the self-hosted on-prem PostgreSQL by executing the following queries:
Cloud RDS PostgreSQL (Publisher):
psql# SELECT * FROM pg_replication_slots;
Self-hosted on-prem PostgreSQL (Subscriber):
psql# SELECT * FROM pg_stat_subscription;
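It is also worth confirming that the initial table copy has finished on the subscriber; any table not yet in the 'r' (ready) state is still being synchronized:
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
WHERE srsubstate <> 'r';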
Step 4: Cutover
The cutover process involves confirming the on-prem cluster is fully caught up (pg_stat_subscription for the logical subscription, pg_stat_replication for the on-prem replica) and then switching the application’s endpoint from the AWS RDS instance to your on-prem PostgreSQL cluster.
Downtime: Minimal, limited to DNS or connection string changes.
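One caveat with the built-in logical replication used here: sequence values are not replicated, so auto-increment counters must be advanced on the on-prem cluster before writes resume, and the subscription can be dropped once the cutover is final. A rough sketch, with the sequence name and value as placeholders:
SELECT last_value FROM my_table_id_seq;                 -- on the source, after writes have stopped
SELECT setval('my_table_id_seq', <value_from_source>);  -- on the target, before traffic resumes
DROP SUBSCRIPTION mysub;                                -- on the target, once the cutover is final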
Conclusion
Migrating databases from cloud to on-prem with minimal downtime is possible with careful planning, replication strategies, and thorough testing. Regardless of your database, the key is to minimize the cutover window, validate data consistency, and maintain business continuity.
Organizations should treat migration not as a one-time event, but as an iterative process: plan, test, migrate, validate, and optimize. By following these best practices, businesses can achieve a smooth transition back to on-prem while meeting performance, compliance, and cost objectives.
Ready to migrate your PostgreSQL database and maintain your orchestration on-prem?
Install ClusterControl in 10 minutes. Free 30-day Enterprise trial included!
Script Installation Instructions
The installer script is the simplest way to get ClusterControl up and running. Run it on your chosen host, and it will take care of installing all required packages and dependencies.
Offline environments are supported as well. See the Offline Installation guide for more details.
On the ClusterControl server, run the following commands:
wget https://severalnines.com/downloads/cmon/install-cc
chmod +x install-cc
With your install script ready, run the command below. Replace the S9S_CMON_PASSWORD and S9S_ROOT_PASSWORD placeholders with passwords of your choice, or remove the environment variables from the command to set the passwords interactively. If you have multiple network interface cards, assign one IP address for the HOST variable in the command using HOST=<ip_address>.
S9S_CMON_PASSWORD=<your_password> S9S_ROOT_PASSWORD=<your_password> HOST=<ip_address> ./install-cc # as root or sudo user
After the installation is complete, open a web browser, navigate to https://<ClusterControl_host>/, and create the first admin user by entering a username (note that “admin” is reserved) and a password on the welcome page. Once you’re in, you can deploy a new database cluster or import an existing one.
The installer script supports a range of environment variables for advanced setup. You can define them using export or by prefixing the install command.
See the list of supported variables and example use cases to tailor your installation.
Other Installation Options
Helm Chart
Deploy ClusterControl on Kubernetes using our official Helm chart.
Ansible Role
Automate installation and configuration using our Ansible playbooks.
Puppet Module
Manage your ClusterControl deployment with the Puppet module.
ClusterControl on Marketplaces
Prefer to launch ClusterControl directly from the cloud? It’s available on these platforms: