Category: Reliability Strategy

  • How developers can survive “you build it, you run it”

    Introduction: As a developer, your involvement with your code might range from nothing at all once it’s been committed, all the way to looking after it right up to production. The latter is called the “you build it, you run it” model. It’s not going away. But that depends on your organization. It’s likely to…

  • Cost-benefit analysis of infrastructure-as-code (IaC)

    You might have heard that Infrastructure-as-code (IaC) contributes to better cloud-native software architecture. But what is IaC, what are its benefits & trade-offs and how can it be improved? This guide aims to give clarity around IaC through: It can serve as a starting point for business-specific conversations with stakeholders. At some point, senior management…

  • Reduce software outage risk with passive guardrails

    Shocking fact: only 10-25% of software outages are caused by hardware or network failure. The rest are the result of human error, such as misconfiguration (paraphrasing Martin Kleppmann, Designing Data-Intensive Applications). In this article, I will share with you how setting up passive guardrails in and around developer workflows can reduce the frequency and severity…

  • SRE’s role in safer infrastructure-as-code

    This article explores 2 simple ways for SREs to drive better practices and code hygiene within infrastructure-as-code (IaC) tooling like Terraform. Why bother? Because of its centrality to cloud infrastructure efficiency, it’s highly likely that you will get involved with an IaC problem at some point in your SRE career. I will mention Terraform from…