Category: Articles

Check out our written research on SRE and software operations topics ⬇️⬇️⬇️

  • How developers can survive “you build it, you run it”

    Introduction: As a developer, your involvement with your code might range from nothing at all once it’s been committed, all the way to looking after it right up to production. The latter is called the “you build it, you run it” model. It’s not going away. But that depends on your organization. It’s likely to…

  • 10 Tips for Onboarding New SRE Hires

    How new SRE hires can get stuck: There’s more than one way to mess up your new SRE hire and get them stuck in a loop. Here are 6 ways new hires will know you’ve made this mistake: This article will unpack these 6 sticking points and show how to solve them. Later on, I…

  • Cost-benefit analysis of infrastructure-as-code (IaC)

    You might have heard that Infrastructure-as-code (IaC) contributes to better cloud-native software architecture. But what is IaC, what are its benefits & trade-offs and how can it be improved? This guide aims to give clarity around IaC through: It can serve as a starting point for business-specific conversations with stakeholders. At some point, senior management…

  • Starting SRE at startups and smaller organizations

    Who should pay attention to this article: ❌  SRE at a very small startup with few users rarely makes a difference until you’ve reached a fair userbase size or have growing pains ❌  Many organizations without a strong money/legal incentive, e.g. SLAs tied to their operations, cannot justify diving into a complex field like SRE…

  • Inside Spotify’s Site Reliability Engineering (SRE) practice

    You’ve undoubtedly caught wind of the latest Netflix series, dubbed “The Playlist,” a show loosely inspired by the birth of Spotify. Chances are, you may have already devoured it in one glorious binge-watching session. As for me, I only got around to it recently. I was enticed by a YouTube ad that hinted at a…

  • How SRE reduces software operations costs

    I wax lyrical about this almost every day to engineering managers, tech executives, and even SRE managers themselves that… Site Reliability Engineering (SRE) is an indispensable asset for organizations that are seeking to reduce operating costs. You might not have felt that cost reduction pressure in the last few years. But that pressure is now…

  • Success factors for Site Reliability Engineering digital transformation

    This guide will help you better engage in business-level conversations about Site Reliability Engineering with key stakeholders. It is part of the SRE Digital Transformation series exploring how to integrate SRE into your organization. Introduction Site Reliability Engineering (SRE) is a powerful tool for achieving high software performance and reliability in enterprises, as well as…

  • How to pitch Site Reliability Engineering to executives and stakeholders

    This article will help you communicate the advantages of SRE to stakeholders through 3 arguments. It is part of the SRE Digital Transformation series exploring how to integrate SRE into your organization. Introduction It takes confidence and conviction to introduce significant changes that may affect the entire team or organization. You will naturally face resistance…

  • Reaffirming the value of SREs amid ongoing tech layoffs

    I’ve been curious about the prospects for Site Reliability Engineers (SREs) as companies scale back headcount across the board. This opinion piece will unpack the pressing issue. Many experts predict an ongoing downturn in the tech job market that could last for the next 3-5 years. An unfortunate turn for many employed in the tech…

  • Inside Disney’s Site Reliability Engineering practice

    Introduction: It is no small feat to run an ecosystem of entertainment experiences to delight a wide range of people, from young children to older “Disney adults”. Almost every Disney experience relies on a sophisticated technology stack working in the background. “Steve Jobs once said technology amplifies human ability. At Disney, we use technology to…