Tag: site reliability engineer management

Podcast

#21 – Better SRE in 2024 is all we can hope for

Episode 21 [SREpath Podcast]: Show notes. Sebastian is back for this episode to help set the direction for 2024. Over the holidays, we reflected on the problems SREs faced in 2023: job insecurity, burnout, and “that really shouldn’t be my sole job”. Sebastian and I talked about what we hope to bring…

January 6, 2024
Case Studies, Podcast, Team Development

#9 Inside Booking.com’s Site Reliability Engineering Practice

Episode 9 [SREpath Podcast]: Ash Patel interviews Samuele Tonon and Yoann Fouquet about their experiences managing and growing the Site Reliability Engineering (SRE) function at Booking.com. Booking.com is one of the world’s largest travel sites, with a market capitalization of over $100 billion and over 1.5 million bookings per day. Here are key highlights…

October 3, 2023
Articles, Team Development

10 Tips for Onboarding New SRE Hires

How new SRE hires can get stuck: There’s more than one way to mess up your new SRE hire and get them stuck in a loop. Here are 6 ways new hires will know you’ve made this mistake: This article will unpack these 6 sticking points and show how to solve them. Later on, I…

August 23, 2023
Articles, Team Development

Starting SRE at startups and smaller organizations

Who should pay attention to this article:
❌ SRE at a very small startup with few users rarely makes a difference until you’ve reached a fair userbase size or have growing pains
❌ Many organizations without a strong money/legal incentive, e.g. SLAs tied to their operations, cannot justify diving into a complex field like SRE…

August 1, 2023
Podcast

SREs are risk managers, IAC hate and more! [Audio]

Episode 1 [SRE Review Podcast]: In this maiden episode of SRE Review, I cover the following articles:

June 23, 2023
Podcast

#6 Building a successful SRE practice through capabilities

Episode 6 [SREpath Podcast]: We discuss the need for a framework to guide the development of Site Reliability Engineers (SREs) and drive value for organizations. You will learn how our pillar view of areas like observability and service management helps identify areas for improvement, and we emphasize the importance of focusing on a few key areas…

June 21, 2023
Podcast

#5 Where does SRE fit into your organization’s structure? [Audio]

Episode 5 [SREpath Podcast]: Throughout this episode, we discuss the different engagement models for Site Reliability Engineering (SRE) and how to fit SRE into an organization’s structure. Sebastian Vietz, an experienced SRE practitioner, suggests five different engagement models for SRE and emphasizes the importance of considering the cost associated with each model. The hosts also…

June 15, 2023
Podcast

#4 Should organizations care about SRE? [Audio]

Episode 4 [SREpath Podcast]: This episode discusses how Site Reliability Engineering (SRE) can be important to organizations. SRE can help: We will also cover how to integrate SRE into the organization’s culture for continuous improvement and innovation.

June 1, 2023
Articles, Team Development

Analysis of SRE and platform setup at 10+ tech companies

In this article, you will see a breakdown of the platform setup and SRE practices within 12 non-FAANG technology companies. This is based on the case studies by Andrios Robert. “There is a lot of content available on how Google did [Site Reliability Engineering]; let’s uncover what happens with the rest of the world.” —…

November 22, 2022
Articles, Opinion

Is platform engineering at risk of shiny object syndrome?

So much has been debated lately about the emergence of “Platform Engineering” as a solution to software operations problems. It’s an interesting proposition. However, it is not a silver bullet that will fix everything that didn’t work out with Dev versus Ops, DevOps, or SRE. We are missing something very important in our…

November 13, 2022