Retro: Implementing continuous delivery

Continuous delivery is, in my opinion, the superior way to build and ship software. Since I’ve been part of a few different journeys to continuous delivery, I’ll use this article to capture my learnings.

Problem

Before discussing the learnings, let’s have a quick look at the typical issues that organizations following GitFlow and similar branching strategies usually experience: long-lived feature branches that diverge from main, painful and delayed merges, late integration of work, manual release steps, and infrequent, high-risk deployments.

Transitioning to continuous delivery mitigates these issues: it increases deployment frequency, reduces the change failure rate, and improves the developer experience.

Learnings

Self-Assessment Maturity Model

Below is a self-assessment model I’ve used (created together with ChatGPT) to enable teams to self-assess and set goals on their journey to continuous delivery. For each practice, simply note your current and target maturity.

Continuous Integration

What good looks like:
• Trunk-based development: all work integrates into trunk.
• Each developer integrates work to trunk at least daily.
• Automated testing before merge to trunk.
• Work is tested with other work automatically on merge.
• All feature work stops when the build is red (“stop-the-line culture”).
• New work does not break delivered work.
Typical issues:
• Long-lived branches that diverge from main.
• Manual merges and integration conflicts.
• CI only runs nightly or per PR, not per commit.
• Broken build tolerated for days.
• Flaky or slow test suites discourage frequent integration.
• No clear ownership of build health.
Possible indicators:
• Average branch lifetime (<2 days)
• Build success rate (%)
• Mean time to fix broken build
• % of commits passing all tests
• Time from commit → green build
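
A cheap way to make the stop-the-line rule concrete is a gate that feature work (and the pipeline itself) checks before proceeding. A minimal sketch in shell: the status file and the green/red convention are assumptions, not any particular CI server’s API.

```shell
# Stop-the-line gate: new feature work may only proceed while the latest
# trunk build is green. Reading the status from a local file keeps the
# sketch self-contained; a real setup would query the CI server instead.
latest_build_status() {
  cat "${STATUS_FILE:-build_status.txt}"
}

can_start_feature_work() {
  [ "$(latest_build_status)" = "green" ]
}
```

The point is that the check is scripted and shared, so “the build is red, stop” is enforced by the tooling rather than by memory.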

Only Path to Any Environment

What good looks like:
• All deployments must go through a single automated pipeline – one path for all environments.
• No manual deployments bypassing the pipeline; the pipeline verdict controls deployability.
Typical issues:
• Manual hotfixes or SSH deploys to staging/prod.
• Multiple deployment scripts or ad-hoc Jenkins jobs.
• QA or staging deploys done manually.
• Pipeline drift between services or teams.
Possible indicators:
• % of deployments through pipeline
• # of manual production interventions
• Audit logs: pipeline vs manual deploys
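
The “only path” practice boils down to a single deploy entrypoint with an explicit allow-list of targets. A sketch, where the environment names and the echoed deploy step are placeholders:

```shell
# One path to any environment: the same entrypoint and the same steps for
# every target. There is deliberately no side door for "quick" manual deploys.
deploy() {
  env="$1"; artifact="$2"
  case "$env" in
    test|staging|production) ;;   # the only allowed targets
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  # Real steps (push artifact, migrate, health-check) would go here.
  echo "deploying $artifact to $env"
}
```

Because every environment shares one code path, a deploy that works in staging exercises the exact mechanism production will use.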

Deterministic Pipeline

What good looks like:
• Pipeline produces consistent, repeatable results for the same inputs.
• Pass = deployable, fail = fix.
• No manual changes between stages.
• All inputs version-controlled.
• Flaky tests fixed immediately; dependencies locked.
Typical issues:
• Pipeline occasionally “just fails” → reruns instead of investigation.
• Environment-specific config differences.
• Manual approvals without clear criteria.
• Non-versioned test data or secrets.
• Flaky test culture tolerated.
Possible indicators:
• Pipeline pass rate on first run
• Flaky test rate
• % builds requiring manual approval
• Avg retries per commit
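
One way to make “same inputs, same result” checkable is to derive a build key from every version-controlled input: identical inputs must yield an identical key, and therefore an identical, cacheable result. A sketch assuming sha256sum is available:

```shell
# Deterministic inputs: hash everything that feeds the build (sources,
# lockfiles, the pipeline definition itself) into one key. If the key is
# identical, the pipeline has no excuse to behave differently.
build_key() {
  cat "$@" | sha256sum | cut -d' ' -f1
}
```

A pipeline that reruns with the same key but a different outcome has a flaky test or an unversioned input to hunt down, not a retry button to press.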

Definition of Deployable

What good looks like:
• Automated quality gates enforced (lint, security, compliance).
• Artifacts always meet the deployable definition.
Typical issues:
• No shared agreement on “deployable”.
• Manual QA sign-off required.
• Late manual security checks.
• Quality gates vary by team.
• Failed tests ignored.
Possible indicators:
• % builds passing quality gates
• % deploys blocked by failed checks
• % tests automated
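
A shared definition of deployable is easiest to enforce when it is literally a list of automated gates that must all pass. In this sketch the gate commands are stand-ins for real lint, test, and security tools:

```shell
# Definition of deployable as code: every gate must pass, and a failing
# gate names itself, so "deployable" is never a matter of opinion.
deployable() {
  for gate in "$@"; do
    $gate || { echo "gate failed: $gate" >&2; return 1; }
  done
  echo "deployable"
}
```

Teams then argue about which gates belong in the list, which is a much healthier argument than whether a given build “feels” releasable.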

Immutable Artifact

What good looks like:
• Build once, deploy the same artifact everywhere.
• No manual changes.
• Everything version-controlled.
Typical issues:
• Rebuilds per environment (“build in prod”).
• Manual hotfixes on servers.
• Snowflake builds.
• Env-specific variants.
• Inconsistent versioning.
Possible indicators:
• Artifact reuse ratio
• # of manual rebuilds
• % configs version-controlled
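
“Build once, deploy everywhere” usually means environments receive a content-addressed artifact, and promotion is just repointing, never rebuilding. Here a promotions file stands in for a registry, and the docker tag line is only an illustrative comment:

```shell
# Immutable artifact: every environment points at the same digest.
# Promotion appends a pointer; nothing is ever rebuilt per environment.
promote() {
  digest="$1"; env="$2"
  # With a container registry this would be a retag, e.g.:
  #   docker tag "app@$digest" "registry/app:$env"
  echo "$env $digest" >> "${PROMOTIONS:-promotions.txt}"
}

current_digest() {
  grep "^$1 " "${PROMOTIONS:-promotions.txt}" | tail -n 1 | cut -d' ' -f2
}
```

If staging and production ever resolve to different digests for the same release, the artifact-reuse indicator above has caught a rebuild sneaking in.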

Prod-Like Test Environment

What good looks like:
• Test environment closely matches production.
• Realistic testing before prod.
Typical issues:
• Environment drift.
• Missing / stale data.
• Scale/performance gaps.
• “Works in staging, fails in prod”.
Possible indicators:
• Environment drift score
• % prod incidents not reproducible in staging
• Deployment success rate staging → prod
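
Environment drift can be caught mechanically by diffing the settings each environment actually has; any diff output is drift. The file-per-environment layout is an assumption:

```shell
# Environment drift check: compare the sorted configuration of two
# environments. Identical files mean no drift; any diff output is drift.
drift() {
  a="$(mktemp)"; b="$(mktemp)"
  sort "$1" > "$a"
  sort "$2" > "$b"
  diff "$a" "$b"
}
```

Running such a check in the pipeline turns “works in staging, fails in prod” from a surprise into a failing build.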

Rollback On-Demand

What good looks like:
• Fast, automated rollback paths.
• Rollback supported in the pipeline.
Typical issues:
• Manual rollback steps.
• Rollback untested.
• Irreversible DB migrations.
• Rollback causes downtime.
• Not integrated into the pipeline.
Possible indicators:
• Mean time to rollback
• # rollback tests per month
• % deploys with automated rollback
• Deployment recovery time
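
Rollback stays fast when each deploy records the release it shipped, so rolling back is a repoint to the previous entry rather than a rebuild. The releases file here stands in for a real release store:

```shell
# Rollback on demand: deploys append to a release log; rollback drops the
# newest entry so the previous release becomes current again.
record_release() { echo "$1" >> "${RELEASES:-releases.txt}"; }
current_release() { tail -n 1 "${RELEASES:-releases.txt}"; }
rollback() {
  tmp="$(mktemp)"
  sed '$d' "${RELEASES:-releases.txt}" > "$tmp"
  mv "$tmp" "${RELEASES:-releases.txt}"
  current_release
}
```

Note that this only repoints the application release; irreversible DB migrations, one of the typical issues listed above, still need expand/contract-style schema changes to keep rollback safe.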

Application Configuration

What good looks like:
• Config separated from code.
• Config versioned.
• Changes flow via the pipeline.
Typical issues:
• Manual prod edits.
• Secrets not automated.
• Env-specific settings not tracked.
• Config drift → “works on my machine”.
Possible indicators:
• % configs version-controlled
• # manual config edits in prod
• Config rollback success rate
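
Separating config from code often comes down to reading KEY=VALUE settings from a version-controlled file per environment, so that a config change is a commit plus a pipeline run, never a manual edit in production. The config/<env>.env layout is an assumption:

```shell
# Application configuration outside the code: one version-controlled
# KEY=VALUE file per environment. The application only ever reads keys;
# it never hard-codes environment specifics.
get_config() {
  env="$1"; key="$2"
  grep "^$key=" "config/$env.env" | cut -d= -f2-
}
```

Because the files are versioned, config rollback rides the same mechanism as code rollback.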

Trunk-Based Development

What good looks like:
• All changes integrated to trunk.
• Short-lived branches.
• Merge hell avoided.
Typical issues:
• Branches live for weeks.
• Merges painful and delayed.
• Manual QA gating.
• Parallel branch drift.
• Fear of merging due to instability.
Possible indicators:
• Median branch lifetime
• Merge frequency per developer
• % commits merged conflict-free
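
The median branch lifetime indicator is easy to compute once you have branch ages in days (extracting those ages from git history is left out of this sketch):

```shell
# Median branch lifetime, given branch ages in days. Trunk-based teams
# aim to keep this well under two days.
median_days() {
  printf '%s\n' "$@" | sort -n |
    awk '{ a[NR] = $1 }
         END { if (NR % 2) print a[(NR + 1) / 2];
               else print (a[NR / 2] + a[NR / 2 + 1]) / 2 }'
}
```

Tracking the median rather than the average keeps one ancient zombie branch from hiding an otherwise healthy integration habit.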