Continuous delivery is, in my opinion, the superior way to build and ship software. Since I’ve been part of a few different journeys to continuous delivery, I’ll use this article to capture my learnings.
Problem 🔗
Before discussing the learnings, let's take a quick look at the issues that organizations following GitFlow and similar approaches typically experience:
- It takes ages to get stuff to production.
- Poor quality, resulting in frequent incidents, bugs, and customer dissatisfaction.
- Big and thus risky release batches (since changes accumulate faster than the process can deliver).
- A lot of time wasted resolving merge conflicts.
- The release process has multiple unnecessary steps that add friction without increasing release confidence.
- The process for hotfixes is convoluted and fragile.
- All issues above are exacerbated the more collaborators you have working on the same codebase.
Transitioning to continuous delivery mitigates the issues above: it increases deployment frequency, reduces change failure rate, and improves the developer experience.
Learnings 🔗
Expect resistance and skepticism, but dare to keep your conviction. I’ve never seen anyone who transitioned to continuous delivery and wasn’t excited about the way of working.
Don’t underestimate the size of the investment. It requires time to build automation, develop new behaviors, and foster a new culture. Therefore, you need to get strong buy-in from senior leadership to ensure the teams will get adequate support and air cover.
Continuous delivery means a lot of things to a lot of people. Create a crisp definition and ways to assess progress. This is the best definition of continuous delivery that I’ve come across. That website is generally awesome, as it also contains actionable advice on how to undertake the journey.
Start small with a pilot team and let the engineers on the team communicate the success. Make sure you have some data in place beforehand (it doesn't have to be a full-blown metrics program) to quantify the impact.
There are a ton of Engineering Intelligence tools out there that can measure software delivery performance, but you don’t necessarily need that to establish a baseline. Dumping all data from your version control system into a database will get you 90%+ of the data you need.
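To illustrate how little tooling that baseline needs, here is a minimal sketch: parse pipe-delimited `git log --pretty=format:'%H|%an|%aI'` output into records you can load into any database. The sample log lines are made up for illustration; in practice you would pipe in real output from your repository.

```python
from datetime import datetime

# Stand-in for real output of: git log --pretty=format:'%H|%an|%aI'
SAMPLE_LOG = """\
a1b2c3|Alice|2024-03-01T10:15:00+00:00
d4e5f6|Bob|2024-03-01T14:30:00+00:00
a7b8c9|Alice|2024-03-02T09:05:00+00:00"""

def parse_log(text):
    """Turn pipe-delimited git log lines into dicts ready for a database."""
    rows = []
    for line in text.splitlines():
        sha, author, date = line.split("|")
        rows.append({
            "sha": sha,
            "author": author,
            "committed_at": datetime.fromisoformat(date),
        })
    return rows

commits = parse_log(SAMPLE_LOG)

# A first baseline metric: commits per author.
per_author = {}
for c in commits:
    per_author[c["author"]] = per_author.get(c["author"], 0) + 1
print(per_author)  # {'Alice': 2, 'Bob': 1}
```

From records like these you can derive deployment frequency proxies, commit cadence, and integration patterns before investing in a dedicated tool.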
Encourage teams to take risks and dare to deploy every commit. This is a major shift for those used to working with long-lived feature branches and separate `development` and `main` branches, but it's also the quickest way for people to see and experience the value. It's not as risky as it sounds, because each deployment is so small, and it accelerates the development of the right behaviors.
Invest in ephemeral environments. They unlock ample value and quickly set teams off toward continuous integration.
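As a concrete illustration of what "ephemeral" means in practice, the sketch below derives the Docker commands a pipeline might run to spin up a per-branch preview environment. The registry name, labels, and naming convention are assumptions for illustration, not a prescription; real setups often use Kubernetes namespaces or platform features instead.

```python
def preview_env_commands(branch: str, image: str = "registry.example.com/app"):
    """Build (but don't execute) the commands for a branch preview environment."""
    # Sanitize the branch name into a valid container/DNS label.
    slug = branch.lower().replace("/", "-")
    tag = f"{image}:{slug}"
    return [
        # Build and tag an image specific to this branch.
        f"docker build -t {tag} .",
        # Run it as a labeled, disposable container.
        f"docker run -d --name preview-{slug} -l ephemeral=true {tag}",
        # Tear-down step, run when the branch is merged or deleted.
        f"docker rm -f preview-{slug}",
    ]

for cmd in preview_env_commands("feature/login-form"):
    print(cmd)
```

The key property is that create and destroy are both automated, so every branch gets a realistic environment at near-zero marginal cost.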
Self-Assessment Maturity Model 🔗
Below is a table I’ve used (created together with ChatGPT) to enable teams to self-assess and set goals on their journey to continuous delivery. Simply add columns for current and target maturity.
| Practice | What good looks like | Typical issues | Possible indicators |
|---|---|---|---|
| Continuous Integration | • Trunk-based development: all work integrates into trunk. • Each developer integrates work to trunk at least daily. • Automated testing before merge to trunk. • Work is tested with other work automatically on merge. • All feature work stops when the build is red (“stop-the-line culture”). • New work does not break delivered work. | • Long-lived branches that diverge from main. • Manual merges and integration conflicts. • CI only runs nightly or per PR, not per commit. • Broken build tolerated for days. • Flaky or slow test suites discourage frequent integration. • No clear ownership of build health. | • Average branch lifetime (<2 days) • Build success rate (%) • Mean time to fix broken build • % of commits passing all tests • Time from commit → green build |
| Only Path to Any Environment | • All deployments must go through a single automated pipeline – one path for all environments. • No manual deployments bypassing the pipeline; the pipeline verdict controls deployability. | • Manual hotfixes or SSH deploys to staging/prod. • Multiple deployment scripts or ad-hoc Jenkins jobs. • QA or staging deploys done manually. • Pipeline drift between services or teams. | • % of deployments through pipeline • # of manual production interventions • Audit logs: pipeline vs manual deploys |
| Deterministic Pipeline | • Pipeline produces consistent, repeatable results for same inputs. • Pass = deployable, fail = fix. • No manual changes between stages. • All inputs version-controlled. • Flaky tests fixed immediately; dependencies locked. | • Pipeline occasionally “just fails” → reruns instead of investigation. • Environment-specific config differences. • Manual approvals without clear criteria. • Non-versioned test data or secrets. • Flaky test culture tolerated. | • Pipeline pass rate on first run • Flaky test rate • % builds requiring manual approval • Avg retries per commit |
| Definition of Deployable | • Automated quality gates enforced (lint, security, compliance). • Artifacts always meet the deployable definition. | • No shared agreement on “deployable”. • Manual QA sign-off required. • Late manual security checks. • Quality gates vary by team. • Failed tests ignored. | • % builds passing quality gates • % deploys blocked by failed checks • % tests automated |
| Immutable Artifact | • Build once, deploy same artifact everywhere. • No manual changes. • Everything version-controlled. | • Rebuilds per environment (“build in prod”). • Manual hotfixes on servers. • Snowflake builds. • Env-specific variants. • Inconsistent versioning. | • Artifact reuse ratio • # of manual rebuilds • % configs version-controlled |
| Prod-Like Test Environment | • Test env closely matches production. • Realistic testing before prod. | • Environment drift. • Missing / stale data. • Scale/performance gaps. • “Works in staging, fails in prod”. | • Environment drift score • % prod incidents not reproducible in staging • Deployment success staging→prod |
| Rollback On-Demand | • Fast, automated rollback paths. • Rollback supported in pipeline. | • Manual rollback steps. • Rollback untested. • Irreversible DB migrations. • Rollback causes downtime. • Not integrated into pipeline. | • Mean time to rollback • # rollback tests per month • % deploys with automated rollback • Deployment recovery time |
| Application Configuration | • Config separated from code. • Config versioned. • Changes flow via pipeline. | • Manual prod edits. • Secrets not automated. • Env-specific settings not tracked. • Config drift → “works on my machine”. | • % configs version-controlled • # manual config edits in prod • Config rollback success rate |
| Trunk-Based Development | • All changes integrated to trunk. • Short-lived branches. • Avoid merge hell. | • Branches live for weeks. • Merges painful and delayed. • Manual QA gating. • Parallel branch drift. • Fear of merging due to instability. | • Median branch lifetime • Merge frequency per developer • % commits merged conflict-free |
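Several indicators in the table, such as median branch lifetime, fall straight out of the version-control data mentioned earlier. A minimal sketch, using hypothetical merged-branch records:

```python
from datetime import datetime
from statistics import median

# Hypothetical merged-branch records (creation -> merge), the kind of
# data extractable from your version control system.
MERGES = [
    {"branch": "feat-a", "created": "2024-03-01T09:00", "merged": "2024-03-01T17:00"},
    {"branch": "feat-b", "created": "2024-03-01T10:00", "merged": "2024-03-03T10:00"},
    {"branch": "fix-c",  "created": "2024-03-02T08:00", "merged": "2024-03-02T12:00"},
]

def branch_lifetimes_hours(merges):
    """Lifetime of each branch, in hours, from creation to merge."""
    out = []
    for m in merges:
        created = datetime.fromisoformat(m["created"])
        merged = datetime.fromisoformat(m["merged"])
        out.append((merged - created).total_seconds() / 3600)
    return out

lifetimes = branch_lifetimes_hours(MERGES)
print(f"median branch lifetime: {median(lifetimes):.1f}h")  # 8.0h
```

Tracked over time per team, a number like this makes the self-assessment conversation concrete instead of anecdotal.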