Deployment despair

A repulsive example

Recently I pulled out a few strands of hair in despair when I opened a web app’s README and read the deployment section:

Ask around on Slack whether it is okay to deploy. Make sure that enough people and someone from (dev)ops are available to help in case of a necessary roll back. Watch at the CI deployment step to notice when something goes wrong, and be able to test the deployment once it’s done.

Nope, we’re not talking about a decade-old legacy product. This was a two-year-old codebase using Next.js, GraphQL and a popular headless CMS.

The deployment process itself wasn’t that hard, but there was a climate of fear and a resistance to deployments. It’s not that the people running the project weren’t technically capable of implementing automated testing (there was an abandoned Cypress flow, and a couple of useless Jest unit tests). The root cause was that the team was mismanaged. The devs, however, were responsible for their own misery too: they were not self-protective enough to spend a couple of hours improving the deployment process.

Deploying should be as casual as lacing a shoe: a regular, automatic action one can do without thinking, drunk in the middle of the night.

Benefits of frequent deployments

Let’s re-state the obvious:

  • The more atomic a deployment, the lower the risk of a long-term outage. That’s because reverting a tiny code change is easy, and so is finding the change responsible for a problem.
  • Frequent deployments result in atomic deployments.
  • Frequent deployments train everyone to get good at deploying. The annoying repetition will quickly motivate the team to automate deployments, and the automation will pay off quickly.
  • Frequent deployments allow the business to ship changes faster. This includes fixing the inevitable bugs that eventually make it to production. But really, the value of the higher iteration speed lies in the mindset shift throughout the organisation: we can ship small improvements anytime. That mindset goes hand in hand with agile development and running a lean startup.

How to get there?

If you develop a web app, your goal needs to be that no team member is ever scared to deploy their changes.

  • Cover the most popular real-world usage with tests. This means E2E tests. Unit and integration tests don’t hurt, but in my experience their main benefit is to help document the detailed acceptance criteria of the business logic for future team members and to prevent small regressions during development. Actual confidence in the deployment stems from knowing that the core user flow has been covered in an environment similar to the one in which the user navigates the product. If this is a ✅, the developer knows that they won’t be called on duty on the weekend. Non-essential bugs can wait.
  • Run those tests on every code change that is pushed to the git server. That’s continuous integration (CI). If git is used correctly (feature branches, atomic commits, frequent pushes), the developer has an automatic, tight feedback loop in which they are made aware that functionality is broken shortly after causing the issue.
  • Enforce change review. Code review of a pull request is typical, but having a product manager or QA look at a code change, to verify that the changed functionality works as expected and that there’s no immediately visible side effect, goes a long way towards peace of mind. Code review also helps team members learn from each other and can surface deeper hidden issues (like code smells). Reduce the friction of review as far as possible. Having a separate, accessible deployment for the code change is perfect (Netlify, Vercel, Heroku, AWS Amplify, and other providers offer a “branch deployment” feature; usually it’s a 1-click install when using GitHub).
  • Allow the person who makes a code change to deploy this change once tests are passing and the change review has given a green light, independent of other people’s code changes. Letting a change owner deploy distributes knowledge and responsibility. No one knows better than the code change owner when there’s a reason to be careful about a specific deployment. If merging a code change from a feature branch into the main branch results in an automatic deployment to production (with pre- and post-deploy test execution against production, please!), you can praise yourself for doing continuous deployment (CD).
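
The gates described in the list above can be sketched as two small decision functions: one that runs before a deployment and one that runs after it. This is only a minimal illustration in TypeScript, not the API of any CI provider. All names (`ChangeStatus`, `SmokeResult`, `preDeployDecision`, `postDeployDecision`) are hypothetical, and I’m assuming that some CI step has already collected the test, review and smoke-check results.

```typescript
// Hypothetical shapes for illustration; they do not come from a real CI tool.
interface ChangeStatus {
  ciTestsPassed: boolean;  // E2E suite is green for this code change
  reviewApproved: boolean; // code review and/or product/QA check is done
}

interface SmokeResult {
  path: string;   // page of the core user flow that was probed after deploying
  status: number; // HTTP status code the probe received
}

// Gate before deploying: the change owner may deploy once both checks are green.
function preDeployDecision(change: ChangeStatus): "deploy" | "blocked" {
  return change.ciTestsPassed && change.reviewApproved ? "deploy" : "blocked";
}

// Gate after deploying: roll back if any core page fails its smoke check.
function postDeployDecision(results: SmokeResult[]): "keep" | "rollback" {
  // Assumption: no smoke results at all means we have no evidence, so fail safe.
  if (results.length === 0) {
    return "rollback";
  }
  const allHealthy = results.every((r) => r.status >= 200 && r.status < 400);
  return allHealthy ? "keep" : "rollback";
}

// Example usage with made-up data:
console.log(preDeployDecision({ ciTestsPassed: true, reviewApproved: false })); // blocked
console.log(
  postDeployDecision([
    { path: "/", status: 200 },
    { path: "/checkout", status: 500 },
  ])
); // rollback
```

Keeping the decisions as plain functions, separate from the code that actually runs tests or probes URLs, makes the deployment rules themselves easy to test and to read for new team members.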

In what situation is CI/CD bad advice?

The general gist of simplifying deployments and being confident in them is always applicable. Thanks, dear DevOps movement, for making everyone aware!

If your product has a high risk profile, then you might not have any business deploying fast. A reckless need for speed is part of the culture that makes us bad at software engineering! 😬 Let’s say your product is a nuclear plant. Or election computers. Really any critical infrastructure. You might want to have more safeguards in place for software deployments. You might not want to allow individual contributors to deploy changes themselves. You might have technical reasons preventing frequent deployments, e.g. software development might not happen on-site with the deployment, and the site might be disconnected from public networks for security reasons.

But besides those examples?

For regular consumer-facing, web-based projects I’d rather “ask [for] forgiveness, not permission”. The former you’ll have to do for 1 out of 100 deployments, the latter 100 out of 100 times. I know which one seems the better trade-off to me.