SLO Adoption and Usage in Site Reliability Engineering


Compliments of Google Cloud

Book Details

Authors Julie McCoy, Nicole Forsgren
Publisher O'Reilly Media
Published 2020
Edition 1st
Paperback 104 pages
Language English
ISBN-13 9781492075349
ISBN-10 1492075345
License Compliments of Google Cloud

Book Description

To realize the full benefits of SRE, organizations need well-thought-out reliability targets known as service level objectives (SLOs). SLOs are measured by service level indicators (SLIs), each a quantitative measure of an aspect of the service. As detailed in the following section, the measurable goals set forth in an organization's SLOs eliminate the conflicts inherent in change management and event handling that cause the pace of innovation to slow and business to suffer.

Understanding how well your service meets expectations also gives managers valuable business perspectives. SLO compliance can inform whether you invest in making your system faster, more available, or more resilient. Or, if your system consistently meets SLOs, you may decide to invest staff time in other priorities, such as new products or features.


This book is published as open-access, which means it is freely available to read, download, and share without restrictions.

If you enjoyed the book and would like to support the authors, you can purchase a printed copy (hardcover or paperback) from official retailers.

Similar Books


97 Things Every SRE Should Know

SRE

Site reliability engineering (SRE) is more relevant than ever. Knowing how to keep systems reliable has become a critical skill. With this practical book, newcomers and old hats alike will explore a broad range of conversations happening in SRE. You'll get actionable advice on several topics, including how to adopt SRE, why SLOs matter, when you ne

Modeling and Simulation in Python

Python

Modeling and Simulation in Python is a thorough but easy-to-follow introduction to physical modeling - that is, the art of describing and simulating real-world systems. Readers are guided through modeling things like world population growth, infectious disease, bungee jumping, baseball flight trajectories, celestial mechanics, and more while simult

Training Site Reliability Engineers

SRE

Learn how to train site reliability engineers at your organization in both general and domain-specific subject matter. With this detailed guide from Google's SRE team, you'll not only learn a set of training best practices Google uses for ramping up new SREs; you'll also explore use cases from smaller organizations that have successfully trained pe

Building Secure and Reliable Systems

Can a system be considered truly reliable if it isn't fundamentally secure? Or can it be considered secure if it's unreliable? Security is crucial to the design and operation of scalable systems in production, as it plays an important part in product quality, performance, and availability. In this book, experts from Google share best practices to h

High-Performance Caching with Nginx and Nginx Plus

Nginx

One of NGINX's most important capabilities is content caching, which is a highly effective method for improving a website's performance. In this ebook, the authors describe how NGINX caches content, how to implement caching and cache clustering, and some of the ways to improve performance. The text provides a deep dive into how content caching truly wo

Open Government

In a world where web services can make real-time data accessible to anyone, how can the government leverage this openness to improve its operations and increase citizen participation and awareness? Through a collection of essays and case studies, leading visionaries and practitioners both inside and outside of government share their ideas on how to