Migrating applications to AKS is fairly straightforward until data enters the picture. That is where things get complicated: downtime, consistency, replication, IOPS, latency, backups, and rollback all have to be handled at the same time.
Start from the Goal: RPO, RTO, Volume, Topology
Four key inputs:
- RPO (Recovery Point Objective): the maximum acceptable amount of data loss, measured as a window of time
- RTO (Recovery Time Objective): the maximum acceptable duration of downtime
- Volume and data type: dataset size, number of tables, number and size of files
- Topology: number of write sources and external producers
These factors determine the strategy, tooling, and network preparation.
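As a rough illustration of how volume and RTO interact, the sketch below estimates how long an offline copy would take and whether it fits the downtime budget. The function names and the 0.7 efficiency factor are assumptions made for this example, not measured values; replace them with numbers from a real test transfer.

```python
def transfer_hours(dataset_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to copy dataset_gb over a link_mbps link.

    efficiency is the assumed usable fraction of nominal bandwidth
    (protocol overhead, contention). 0.7 is a placeholder, not a measurement.
    """
    if dataset_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")
    megabits = dataset_gb * 8 * 1000  # GB -> megabits, decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_rto(dataset_gb: float, link_mbps: float, rto_hours: float,
             restore_hours: float = 0.0) -> bool:
    """True if an offline copy plus the restore fits inside the RTO budget."""
    return transfer_hours(dataset_gb, link_mbps) + restore_hours <= rto_hours
```

For example, `fits_rto(500, 1000, rto_hours=2, restore_hours=1)` is false under these assumptions, which would point away from a purely offline migration.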
Choosing a Strategy: Cold vs. Warm vs. Hot
Cold (offline): Stop writes, dump/snapshot, transfer to Azure, restore. Simplest approach, but longer downtime.
Warm (pre-replication): Set up replication in advance, then stop writes for a short cutover. Shorter downtime than cold, but more complex to operate.
Hot (near-zero): CDC (Change Data Capture) with both environments running in parallel. Near-zero downtime, but high complexity and risk of conflicts.
Storage-level migration: Managed disk snapshots, Azure File Sync, rsync, AzCopy. Does not address application-level consistency.
Cloud-native: Instead of running a database in AKS, use a managed service such as Azure SQL Database, Azure Database for PostgreSQL Flexible Server, Azure Database for MySQL Flexible Server, or Azure Cosmos DB.
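The choice between cold, warm, and hot can be sketched as "pick the simplest approach whose downtime fits the RTO". The helper below is a hypothetical decision aid, and the 15-minute default for a warm cutover is an assumed placeholder rather than a benchmark.

```python
from enum import Enum


class Strategy(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


def pick_strategy(rto_minutes: float, offline_copy_minutes: float,
                  warm_cutover_minutes: float = 15.0) -> Strategy:
    """Return the simplest strategy whose expected downtime fits the RTO.

    offline_copy_minutes: estimated dump + transfer + restore time.
    warm_cutover_minutes: assumed time to drain replication and switch over.
    """
    if rto_minutes <= 0:
        raise ValueError("rto_minutes must be positive")
    if offline_copy_minutes <= rto_minutes:
        return Strategy.COLD  # downtime budget covers a full offline copy
    if warm_cutover_minutes <= rto_minutes:
        return Strategy.WARM  # pre-replicate, then take a short cutover
    return Strategy.HOT       # only parallel operation with CDC fits
```

The ordering is deliberate: complexity and conflict risk grow from cold to hot, so the more complex option is chosen only when the simpler one cannot meet the budget.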
Storage Mapping in AKS
Recommended storage by use case:
- Transactional DB: Azure Disks (Premium SSD or Ultra Disk)
- Shared data: Azure Files
- High throughput: Azure NetApp Files (ANF)
- Analytics: Blob Storage
Common mistakes: underestimating IOPS, ignoring zonal locality between pods and their disks, omitting a PodDisruptionBudget, ignoring affinity rules, and never testing a restore from snapshots.
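Underestimated IOPS is the easiest of these mistakes to check before the migration. The sketch below compares provisioned IOPS with the observed peak plus headroom. The 30 percent headroom is an assumption for illustration; the peak should come from monitoring the source system, and the provisioned figure from the disk tier actually selected.

```python
def required_iops(peak_observed_iops: int, headroom_percent: int = 30) -> int:
    """Observed peak plus headroom, rounded up (integer math avoids float drift)."""
    if peak_observed_iops < 0 or headroom_percent < 0:
        raise ValueError("inputs must be non-negative")
    return (peak_observed_iops * (100 + headroom_percent) + 99) // 100


def is_undersized(provisioned_iops: int, peak_observed_iops: int,
                  headroom_percent: int = 30) -> bool:
    """True if the provisioned disk cannot cover the peak plus headroom."""
    return provisioned_iops < required_iops(peak_observed_iops, headroom_percent)
```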
Database Scenarios
PostgreSQL: Streaming replication or logical replication with CDC (Debezium/Kafka).
MySQL: Binlog replication or CDC; watch for GTID conflicts.
MongoDB: Oplog replication with close attention to oplog size and sharding.
SQL Server: Log shipping or transactional replication into Azure SQL Managed Instance.
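Whichever replication method is used, the cutover should wait until the target has caught up. For PostgreSQL streaming replication, lag can be expressed as the byte distance between two LSNs. The sketch below assumes the two LSN strings have already been read, for example from `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the replica; the helper names are made up for this example.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position.

    An LSN is two hexadecimal numbers: the high and low 32 bits of a
    64-bit position in the write-ahead log.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn))


def caught_up(primary_lsn: str, replica_lsn: str, max_lag_bytes: int = 0) -> bool:
    """True once the replica is within max_lag_bytes of the primary."""
    return replication_lag_bytes(primary_lsn, replica_lsn) <= max_lag_bytes
```

With writes blocked on the source, `caught_up(...)` with the default threshold of zero is a reasonable gate before promoting the target.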
Backups, DR, and Rollback
Before: take a full backup and prove that it restores. During: take a snapshot and keep the rollback plan ready. After: enable Azure Backup for AKS and test restores regularly.
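A "proven restore" means comparing the restored data with the source, not just checking that the restore command exited cleanly. The sketch below fingerprints tables by row count and content hash. It uses `sqlite3` only as a stand-in so the example is self-contained; the function names are hypothetical, and a real check would run the same idea against the actual database driver.

```python
import hashlib
import sqlite3
from typing import Dict, Tuple


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> Tuple[int, str]:
    """Return (row_count, sha256 hex digest) over rows ordered by key.

    table and key are interpolated into SQL, so they must be trusted
    identifiers, never user input.
    """
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"'):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def restore_matches(source: sqlite3.Connection, restored: sqlite3.Connection,
                    tables: Dict[str, str]) -> bool:
    """tables maps each table name to the column used for stable ordering."""
    return all(
        table_fingerprint(source, name, key) == table_fingerprint(restored, name, key)
        for name, key in tables.items()
    )
```

Ordering by a key column matters: without it, two identical tables could hash differently just because rows come back in a different order.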
Security and Compliance
Encrypt data at rest with customer-managed keys and in transit with TLS. Use Workload Identity for pod access to Azure resources. Send audit logs to Log Analytics or a SIEM.
Cutover Runbook
- Freeze deployments, write-block the source
- Delta sync and catch-up
- Backup on the source side
- Promote the cloud environment, rotate connection strings
- DNS/Ingress cutover (canary)
- Intensive monitoring
- Rollback if needed
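The runbook above can be sketched as an ordered list of steps with a single rollback hook, so that a failure at any point stops the cutover and reverts. This is a minimal illustration: `run_cutover` and the step callables are hypothetical, and real steps would call deployment tooling, database commands, and DNS or Ingress updates.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_cutover(steps: List[Step], rollback: Callable[[], None]) -> List[str]:
    """Run steps in order; on the first failure, call rollback and stop.

    Returns a log of what happened, one entry per executed step.
    """
    log: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:  # any failure aborts the cutover
            log.append(f"FAILED {name}: {exc}")
            rollback()
            log.append("ROLLED BACK")
            return log
        log.append(f"OK {name}")
    return log
```

Keeping the rollback as one callable, prepared and rehearsed ahead of time, matches the point of the runbook: the decision to revert should not require improvisation during the cutover window.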
Data drives the migration. Choose your strategy based on RPO and RTO. Test the restore beforehand. Never underestimate IOPS or topology. Always have a rollback plan ready.