DevopsHQ #2017
Uploaded by neil-alwin-hermosilla
Category: Software
SLA - Service Level Agreement
An SLA is a contract between a service provider (internal or external) and the end user that defines the level of service the provider is expected to deliver. SLAs are output-based: their purpose is to define what the customer will receive.
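To make "level of service" concrete, an availability target can be turned into a downtime budget. The snippet below is a minimal sketch, not something from the original slides: the function names `allowed_downtime` and `sla_met` and the example targets are assumptions chosen for illustration.

```python
from datetime import timedelta


def allowed_downtime(availability_pct: float, period: timedelta) -> timedelta:
    """Return the maximum downtime permitted in `period` at the given availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    # The unavailable fraction of the period is the downtime budget.
    return period * (1.0 - availability_pct / 100.0)


def sla_met(downtime: timedelta, availability_pct: float, period: timedelta) -> bool:
    """True if the observed downtime stays within the budget for the period."""
    return downtime <= allowed_downtime(availability_pct, period)


if __name__ == "__main__":
    month = timedelta(days=30)  # assumed measurement window
    for target in (99.0, 99.9, 99.99):  # hypothetical SLA targets
        print(f"{target}% over 30 days allows {allowed_downtime(target, month)} of downtime")
```

Under these assumptions, a 99.9% target over a 30-day window leaves roughly 43 minutes of downtime, which is why each extra "nine" is much harder to promise.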
SRE - Site Reliability Engineering
Ben Treynor, Google’s mastermind behind SRE, still hasn’t published a single-sentence definition, but he describes site reliability engineering as “what happens when a software engineer is tasked with what used to be called operations.”
What to take note of
- People are not generally evil, but busy
- Consider time
- Consider stress
- Consider benefits
- Consider rewards
- Help people be great at their jobs
Improvement
- Post-Mortems
- Communication
- Collaboration
- Ownership
- Standardization
- Policy
- ISMS (Optional)
>>> LOOP
OPS METRICS
- CPU Utilization
- Memory Utilization
- Disk Utilization
- Process Monitoring
  - Webserver Processes
  - DB Server Processes
  - Custom Script Processes
- Uptime
- Server Load
- NTP (Time)
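A few of the metrics above can be sampled with the Python standard library alone. This is a rough sketch rather than a monitoring setup: the names `collect_metrics` and `breaches` and the values in `THRESHOLDS` are hypothetical, and a real deployment would more likely rely on a dedicated monitoring agent. Only disk utilization and server load are covered here.

```python
import os
import shutil

# Hypothetical alert thresholds; tune these to your own SLA.
THRESHOLDS = {"disk_pct": 90.0, "load_per_cpu": 1.0}


def collect_metrics(path: str = "/") -> dict:
    """Sample disk utilization and, where available, the 1-minute load per CPU."""
    usage = shutil.disk_usage(path)
    metrics = {"disk_pct": 100.0 * usage.used / usage.total if usage.total else 0.0}
    # os.getloadavg() exists only on Unix-like systems and may raise OSError.
    if hasattr(os, "getloadavg"):
        try:
            one_min, _, _ = os.getloadavg()
            metrics["load_per_cpu"] = one_min / (os.cpu_count() or 1)
        except OSError:
            pass
    return metrics


def breaches(metrics: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return the sorted names of metrics whose value exceeds its threshold."""
    return sorted(
        name
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    )


if __name__ == "__main__":
    sample = collect_metrics()
    print(sample, "breaches:", breaches(sample))
```

Dividing load by CPU count is a design choice made for this sketch, so that one threshold can apply to servers of different sizes.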
When you work as an SRE, everything in your metrics and stats counts.
An SLA is hard to maintain. It takes a badass SRE team, not just one person, to get things rolling.
- Always be proactive
- Always improve
- Always be ready
@NeilUpbeta01