





















































Today’s CloudPro is a little different. You won’t hear from me. I’m stepping aside to hand the mic to someone very special: Alexandra McCoy.
She’s been deep in the trenches of SRE and reliability engineering for over a decade, and now she’s turned all that hard-won experience into a hands-on, workshop-style book that helps teams define and implement SLIs and SLOs.
She has just finished writing it, and I think it’s going to help a lot of engineers. It’s called SLIs and SLOs Demystified, and it’s out for pre-order this week.
I’m thrilled to have her guest-author today’s CloudPro. It’s honest, practical, and personal, just like the book. Over to you, Alexandra!
Cheers,
Editor-in-Chief
I’m so excited to be joining you for this special edition of CloudPro. Over the last few months, I’ve been working closely with the team at Packt to bring a project to life that I’ve had on my heart for a long time. It’s called SLIs and SLOs Demystified, and it’s finally ready.
This book is for anyone who’s ever sat in a postmortem wondering, “How did we miss this?”
Or been asked for an SLO and thought, “Okay, but where do I even start?”
It’s the book I wish I’d had years ago, when I first got pulled into reliability work and was piecing it all together from docs, Slack threads, and trial by fire.
We’ve also put together a little bonus for CloudPro readers:
30% off the book (for the next 72 hours only)
+ A free cheatsheet I made with the Packt team: a quick reference to use on the job.
The offer’s only live for the next 72 hours, so if it sounds helpful, don’t wait.
Just use the code CLOUDPRO at checkout.
I’ve been working in tech for a little over 13 years now, with more than a decade spent working on container orchestrators and cloud platforms.
For four of those years, I was an SRE at VMware, where I led incident response across global teams, helped product engineering teams build better dashboards, and ran SLI/SLO workshops.
Before that, I worked on Kubernetes platforms at IBM and Diamanti, and now I run my own consulting practice where I work with companies on their cloud-native architecture and operational strategy.
That’s the formal bit. But really, I’ve just spent a lot of time trying to make complex systems less painful for the people responsible for keeping them up.
I’ve always been a visual and hands-on learner. Early in my SRE journey, I read the Google SRE books and had the chance to work with a former Google SRE. The theory was excellent, but I often found myself wanting something more grounded. I wanted practical examples, clear steps, and a way to start working through the lower-level details without getting stuck in the abstract.
I also saw a pattern: in a lot of orgs, SRE became synonymous with just being on-call. The deeper parts of the practice (design thinking, meaningful measurement, aligning systems with business context) got lost. That’s really what motivated me to write this book.
SLIs and SLOs Demystified is designed to be clear and visual. It’s a practical guide with just enough structure to help you start. Because in the end, your metrics and calculations will always depend on your system’s architecture, so the best thing you can have is a confident, repeatable process that helps you figure it out.
I can tell you the book is practical and honest, but I’d rather show you:
From Chapter 7:
SLIs and their respective SLOs should be prioritized based on business impact and feasibility. Business impact refers to how directly an SLI contributes to customer experience, revenue, and customer satisfaction. Feasibility includes technical complexity, cost, and resource availability, as well as how easily the team can monitor and respond to the metric. For instance, prioritizing authentication success rate over payment processing latency may have a greater business impact if authentication issues are causing users to abandon the workflow. When considering the business impact, we want to ask ourselves the following questions:
What is the level of impact this change brings to the following?
👉Our customer bases
👉Our team
👉Our organization
Does the impact affect the business from a monetary standpoint?
👉If so, how?
👉Is this a SaaS offering?
👉Is this a licensed offering?
Have we assessed industry competition?
👉If so, does our solution offer something that everyone else’s does not?
Regarding feasibility, consider the following:
👉On a scale of 1 to 5, how easy is the technical implementation?
👉What does feasibility mean to the technical team members?
👉Are there other solutions available to achieve the desired outcome?
This also includes weighing the number of engineers and other staff the implementation might require.
The ranking system is based on internal dialogue between the individuals leading the initiative and the technical staff responsible for the respective technical components or designs. In our instance, we might consider the following:
There is no prioritization focused on which trait is of importance. That should be determined by the team based on the business and technical requirements of each SLI and SLO. In this example, we focused on building out three SLI and SLO metrics. However, it is also possible to work through this same flow, add items to the prioritization chart, and then loop through the process again, increasing the number of items the team will manage before implementation.
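As a rough illustration of the prioritization flow described above, here is a minimal Python sketch. It assumes each candidate gets a 1-to-5 rating for business impact and for feasibility, and that the two ratings are simply added with equal weight. That weighting, the `Candidate` and `prioritize` names, the third metric, and every score shown are hypothetical and not taken from the book; your team would agree on its own ratings and weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One SLI/SLO candidate with team-agreed ratings on a 1-to-5 scale."""
    name: str
    business_impact: int  # 1 = low impact, 5 = high impact
    feasibility: int      # 1 = hard to implement, 5 = easy to implement

    def __post_init__(self):
        # Reject ratings outside the agreed 1-to-5 scale.
        for field in ("business_impact", "feasibility"):
            value = getattr(self, field)
            if not 1 <= value <= 5:
                raise ValueError(f"{field} must be between 1 and 5, got {value}")

    @property
    def total(self) -> int:
        # Equal weighting is an assumption; the team decides which trait matters more.
        return self.business_impact + self.feasibility


def prioritize(candidates):
    """Return candidates ordered from highest to lowest combined rating."""
    return sorted(candidates, key=lambda c: c.total, reverse=True)


# Hypothetical ratings for illustration only; not taken from the book.
chart = [
    Candidate("authentication success rate", business_impact=5, feasibility=4),
    Candidate("payment processing latency", business_impact=4, feasibility=3),
    Candidate("search availability", business_impact=3, feasibility=5),
]

for rank, c in enumerate(prioritize(chart), start=1):
    print(f"{rank}. {c.name}: impact={c.business_impact}, "
          f"feasibility={c.feasibility}, total={c.total}")
```

To loop through the process again, append more `Candidate` entries to `chart` and rerun `prioritize`. Candidates with equal totals keep the order in which they were added, so the team still has to break ties through discussion.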
👉How to define SLIs and SLOs that actually work in practice
👉How to use observability and monitoring to catch problems earlier
👉How to make error budgets useful (instead of confusing)
👉How to align reliability with what your team and users really need
Whether you’re an SRE, a developer who got handed reliability work, or a PM trying to understand what “reliable” even means, I hope this book helps you feel a little more confident and a lot more equipped.
If you decide to pick up the book, thank you. That means a lot.
If not, and something in this issue helped you think differently about reliability, that’s good enough for me too.
Thanks for reading,
Alexandra
📢 If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.
If you have any comments or feedback, just reply to this email.
Thanks for reading and have a great day!