The Cloud Migration That Almost Broke Our Export Service
We had three weeks to migrate file storage from AWS S3 to Azure before our AWS contract renewed. The codebase had a clean storage abstraction built for exactly this scenario. We almost shipped without checking whether everything actually used it. Three constraints raised the stakes:
- Hard deadline — AWS contract renewal in three weeks with no extension option
- Customer-facing file exports silently broken if the migration went wrong
- No equivalent staging environment to validate Azure behavior before go-live
File exports weren't a background feature — customers downloaded invoices, reports, and audit records through them. A broken export that fails silently doesn't produce an error page. It produces a missing file when someone needs it most.
The Scenario
You're the lead engineer at a B2B SaaS company. You have three weeks to migrate file storage from AWS S3 to Azure before your AWS contract auto-renews at a rate leadership won't approve. The codebase has a storage abstraction layer — an enum-routed system designed years ago with exactly this kind of migration in mind. It looks clean. How do you approach it?
No hints. Just judgment.
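For concreteness, an enum-routed storage abstraction of the kind described might look something like this. This is a hypothetical sketch, not the actual codebase; `StorageBackend`, `S3Store`, and `AzureBlobStore` are illustrative names, and the stores are stand-ins for real SDK wrappers:

```python
from enum import Enum


class StorageBackend(Enum):
    S3 = "s3"
    AZURE_BLOB = "azure_blob"


class S3Store:
    """Stand-in for the real S3-backed implementation (would wrap boto3)."""

    def get(self, key):
        return f"s3:{key}"


class AzureBlobStore:
    """Stand-in for the Azure Blob implementation added during the migration."""

    def get(self, key):
        return f"azure:{key}"


_BACKENDS = {
    StorageBackend.S3: S3Store(),
    StorageBackend.AZURE_BLOB: AzureBlobStore(),
}


def storage_for(backend: StorageBackend):
    """Every caller is supposed to route through this, never to a concrete client."""
    return _BACKENDS[backend]
```

The design intent is that swapping backends means changing one enum value at the routing layer. That intent only holds if no caller imports a concrete client directly, which is exactly the assumption worth auditing.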
Most engineers start a cloud storage migration by updating the abstraction layer and deploying — it feels clean because the abstraction was designed for exactly this. But that collapses two separate risks into one deployment: whether all traffic actually flows through the abstraction, and whether the new backend behaves identically under real load. An audit addresses the first. A hard cutover leaves you exposed on the second. Both need to be solved in the design before you write a line of migration code.
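The coverage audit can be largely mechanized: scan the tree for direct cloud-SDK imports outside the abstraction module. A minimal sketch, assuming a Python codebase where only a hypothetical `storage/backends.py` is allowed to touch boto3:

```python
import re
from pathlib import Path

# Direct-SDK imports that should only ever appear inside the abstraction layer.
DIRECT_SDK = re.compile(r"^\s*(import boto3|from boto3|import botocore)", re.MULTILINE)

# Hypothetical path of the one module allowed to use the SDK directly.
ALLOWED = {"storage/backends.py"}


def find_bypasses(root: Path):
    """Return source files that import the cloud SDK without going through
    the abstraction — the exact traffic a backend swap would silently miss."""
    hits = []
    for path in root.rglob("*.py"):
        rel = path.relative_to(root).as_posix()
        if rel in ALLOWED:
            continue
        if DIRECT_SDK.search(path.read_text()):
            hits.append(rel)
    return sorted(hits)
```

A grep is not proof of coverage (dynamic imports and vendored helpers can still slip through), but it turns "we believe everything uses the abstraction" into a checkable list, which is how a bypassing export service gets caught before cutover rather than after.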
The lessons that generalize:

- Abstraction layers only protect what actually flows through them — audit before any migration
- Design for parallel backends from the start, not as a contingency after problems appear
- Rollback should be a config change throughout the migration, not a code revert
- Silent failures in customer-facing features are discovered by customers, not monitoring
- Two risks in one deployment doubles your exposure — solve coverage and behavioral parity separately
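The parallel-backend and config-change-rollback lessons can be sketched together. This is an illustrative pattern under stated assumptions, not the actual implementation: `StorageRouter` and `DictStore` are hypothetical names, and the mutable `config` dict stands in for a runtime config service:

```python
class StorageRouter:
    """Route reads by runtime config so rollback is a config flip, not a
    redeploy. During the transition window, a miss on the new backend falls
    back to the old one, so a half-copied bucket never means a missing file."""

    def __init__(self, backends, config):
        self.backends = backends  # backend name -> client
        self.config = config      # mutable runtime config (e.g. from a config service)

    def get(self, key):
        primary = self.backends[self.config["primary"]]
        blob = primary.get(key)
        if blob is None and self.config.get("fallback"):
            # The object may not have been copied to the new backend yet.
            blob = self.backends[self.config["fallback"]].get(key)
        return blob


class DictStore:
    """Toy in-memory client standing in for a real S3 or Azure SDK wrapper."""

    def __init__(self, data):
        self.data = data

    def get(self, key):
        return self.data.get(key)
```

With this shape, cutover is `config["primary"] = "azure"` and rollback is the same line in reverse; neither touches code, and both backends stay live until the fallback is retired.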
The outcome:

- Export service bypass caught before migration — zero customer-facing file failures
- Migration completed on deadline with no production incidents
- Parallel backend pattern adopted as the standard for future infrastructure migrations
- Rollback capability maintained throughout — cutover was a config change, not a deployment event
Related
A Database Migration Took Down the Entire Platform
A routine schema migration brought down a multi-tenant SaaS platform for 47 minutes during business hours. The migration itself was correct. The deployment strategy was the failure.
We Split the Monolith and Made Everything Worse
A team extracted a billing service from a monolith to improve deploy velocity. Deploys got faster. Everything else got slower, harder to debug, and more fragile. The architecture was right. The boundary was wrong.