SRE > Culture
graph TD
SRE[Site Reliability Engineering]
SRE --> Cul[Culture]
Cul --> SLA
Cul --> SLO
Cul --> Inc[Incidents]
Inc --> Onc[On-call]
Inc --> ReM[Incident Response]
Inc --> PoM[Post-Mortem]
- Introductory
- Deeper Introduction
- Site Reliability Engineering - How Google Runs Production Systems 📕 🆓
- The Site Reliability Workbook - Practical Ways to Implement SRE 📕 🆓
- SRE - Keeping Google up and running 24/7 📼 🆓
- Keys to SRE - Google 📼 🆓
- Who/What? is SRE - Google (Panel) 📼 🆓
- Google Series on SRE - class SRE implements DevOps
- What's the Difference Between DevOps and SRE? 📼 🆓
- SLIs, SLOs, SLAs, oh my! 📼 🆓
- Risk and Error Budgets 📼 🆓
- Toil and Toil Budgets 📼 🆓
- Now SRE Everyone Else with CRE! 📼 🆓
- Managing Risks as a Site Reliability Engineer 📼 🆓
- Actionable Alerting for Site Reliability Engineers 📼 🆓
- Observability of Distributed Systems 📼 🆓
- Incident Management 📼 🆓
- Postmortems and Retrospectives 📼 🆓
- IBM Garage - Building SRE from Scratch
- Use cases
- Site Reliability Engineering - Google - Christof Leng 📼 🆓
- Implementing SLOs for a New Service - Squarespace 📼 🆓
- Shipping Software with an SRE Mindset - Circonus 📼 🆓
- Latency SLOs Done Right - Circonus 📼 🆓
- Site Reliability Engineering at Dropbox - Tammy Butow 📼 🆓
- 190 Countries and 5 core SREs - Netflix - Jonah Horowitz 📼 🆓
- The SRE I Aspire to Be - USENIX - Yaniv Aknin 📼 🆓
- People to Follow
- Monitoring (See Operations Section)
- Incidents
- Post-mortem