Velocity 2015: Building Self-Healing Systems
![Page 1: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/1.jpg)
Building Self-Healing Systems
Todd Minnella and Matt Solnit, SOASTA
![Page 2: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/2.jpg)
Speaker Intro - Todd
● Director of Ops for
● Over 25 years in IT
● Experience with both academic and enterprise computing
● Favorite operating system is Tru64
● Enjoys solving problems...but loves sleep more!
![Page 3: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/3.jpg)
Speaker Intro - Matt
● VP of Engineering for
● Started programming with Atari BASIC in elementary school
● Ops on the side :-)
● First Velocity presentation!
![Page 4: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/4.jpg)
Who are you? :-)
http://www.cliarthut.com/clip-arts/751/who-are-you-clip-art-751173.jpg
![Page 5: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/5.jpg)
Agenda (1 of 2)
Part One - Theory
● Distributed Systems Challenges
● Mitigating Failure Impact
● Benefits and Risks
● Testing Requirements
● Methodology
![Page 6: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/6.jpg)
Agenda (2 of 2)
Part Two - Practice
● Description of Demo System
● Example #1 - Externally Triggered Full GC
● Example #2 - External System Restart
● Example #3 - System-initiated Support Case
● Tools Demonstrated
● Other Ideas for Automation
![Page 7: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/7.jpg)
Part One
Theory
![Page 8: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/8.jpg)
What makes a distributed system?
● Multiple components
● Different servers
● Different regions (data center or geo)
● A component failure != service or app failure
● Requires systems thinking
![Page 9: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/9.jpg)
Challenges faced by dist. systems
● Complexity
● Uncontrollable elements
● Hard to see the whole picture
● Impossible for a single person to manage
![Page 10: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/10.jpg)
What can we do about it?
Easy answer:
Add people!
But… easy != correct
![Page 11: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/11.jpg)
Better coping strategy
Enable your systems to heal themselves
...which is why we are here!
![Page 12: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/12.jpg)
Benefits of Self-Healing
● Better uptime (at the component level)
● Higher service quality
● Rapid identification of repeating issues
● Improved Ops team morale and productivity
![Page 13: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/13.jpg)
Risk of Self-Healing Systems
● Worse uptime (at the component level)
● Lower service quality
● Maintenance complexities
● Degraded Ops team morale and productivity
![Page 14: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/14.jpg)
Risks
![Page 15: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/15.jpg)
So why take the risks?
Implemented well, self-healing systems can make for happier customers!
![Page 16: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/16.jpg)
Failsafe Design
Bibel, G. D. Train Wreck: The Forensics of Rail Disasters. Baltimore: Johns Hopkins UP, 2012. 69-70. Print.
![Page 17: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/17.jpg)
Methodology
Identify the Problem
Design the Solution
Execute by Hand
Automate the solution
Watch and adjust
PSHAW!
![Page 18: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/18.jpg)
Part Two
Practice
![Page 19: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/19.jpg)
Demo Application
Java App Server Farm (n = 2)
Amazon Linux EC2 Instance
EC2 Elastic IP address
Load Balanced via DNS (Dyn Traffic Director)
Simple Web Application (HTTP/HTTPS)
![Page 20: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/20.jpg)
Example #1
Externally Triggered Full GC
![Page 21: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/21.jpg)
Real-life mPulse example
Started reporting Java statistics to monitoring tool in 2013.
When investigating outages, often found an exact correlation with large garbage collections (sound familiar?).
Set up an alert to fire when heap usage went above 70%.
Everybody into the war room!
![Page 22: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/22.jpg)
Real-life mPulse example, cont’d
![Page 23: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/23.jpg)
Real-life mPulse example, cont’d
Engineering looks for a possible memory leak.
Eventually someone says, “Just force a GC!”
Most of the time, this would fix it. The JVM isn’t perfect; if we help it, the system remains stable.
Occasionally this didn’t fix it, which would indicate an actual bug.
Engineering fixes, deploy, repeat!
![Page 24: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/24.jpg)
Real-life mPulse example, cont’d
“Intermittent gratification”
90% of the time, there was no need to gather everyone together.
![Page 25: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/25.jpg)
Engineering says…
Ops, can you fix it?
![Page 26: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/26.jpg)
Identify the Problem
1. Java isn’t garbage-collecting efficiently.
2. Tuning the JVM is time-consuming and dangerous.
3. Forcing a collection works, but it requires waking someone up.
![Page 27: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/27.jpg)
Describe a Solution (1 of 2)
Identify a metric for JVM Heap Use that is indicative of the problem:
Java VM Old % Used
Start monitoring/reporting this metric.
Specify a threshold for action:
Old % Used > 65%
![Page 28: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/28.jpg)
Describe a Solution (2 of 2)
When the threshold is reached, take an action:
Trigger a full garbage collection
After the action, monitor for success:
Old % Used < 65%
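The threshold, action, and success check on these two slides can be sketched as a pair of small functions. This is a minimal sketch, not the presenters' code: the names `decideAction` and `verifyRecovery` are hypothetical, and only the 65% threshold comes from the slides.

```javascript
// Threshold from the slides: act when old generation usage exceeds 65%.
const OLD_PCT_THRESHOLD = 65;

// Decide what to do for a given "Old % Used" reading.
// Returns "trigger-full-gc" above the threshold, otherwise "none".
function decideAction(oldPctUsed, threshold = OLD_PCT_THRESHOLD) {
  return oldPctUsed > threshold ? "trigger-full-gc" : "none";
}

// After the action, the check succeeds only if usage fell below the threshold.
function verifyRecovery(oldPctUsedAfter, threshold = OLD_PCT_THRESHOLD) {
  return oldPctUsedAfter < threshold;
}
```

Keeping the decision separate from the action makes it easy to log what would have happened before anything is allowed to touch a JVM.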
![Page 29: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/29.jpg)
Execute by Hand
Trigger the condition that causes the problem (or be patient and let it happen).
Once monitoring indicates high old % used, manually execute the full GC.
![Page 30: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/30.jpg)
Automate the Solution, Manually Trigger
Write a script to check for Java old % used.
Run the script via cron or similar mechanism.
Report when old % used exceeds threshold.
A DevOps human will trigger the full GC.
![Page 31: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/31.jpg)
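Script Snippet
# Threshold (old generation % used); 65 matches the threshold slide.
oldpcnttrigger=65
# Find the Tomcat JVM and read its GC utilization, skipping the header row.
JAVA_PID=$(pgrep -f -u tomcat /usr/lib/jvm/jre/bin/java)
RAW_JSTATS=$(jstat -gcutil "$JAVA_PID" | grep -v "S0")
# Column 4 of jstat -gcutil is "O" (old % used). $RAW_JSTATS is left
# unquoted on purpose so the shell collapses the column padding.
old_pcnt_used=$(echo $RAW_JSTATS | cut -f4 -d" ")
# Round to an integer so the shell test can compare it.
integer_old_pcnt_used=$(echo "$old_pcnt_used" | awk '{ printf ("%1.0f", $1) }')
if [ "$integer_old_pcnt_used" -gt "$oldpcnttrigger" ]; then
  echo "Would trigger full GC here"
fi
https://github.com/SOASTA/velocity-2015-self-healing-systems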
![Page 32: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/32.jpg)
DEMO (part 1)
![Page 33: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/33.jpg)
Automate the Solution, Automate the Trigger
Taking the script shown previously, combine the step that:
Reports that old % used > 65%
with the step that:
Triggers the full GC
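Combining the report step with the trigger step might look like the sketch below. It is written under stated assumptions: the slides do not say how the full GC is requested, so the use of `jcmd <pid> GC.run` is an assumption, and `parseOldPctUsed`, `checkAndHeal`, and the injected `runCommand` are hypothetical names. The function defaults to dry-run mode, so it only reports the command it would run.

```javascript
// Parse `jstat -gcutil <pid>` output (header row plus one data row) and return
// the "O" column (old generation % used) as a number.
function parseOldPctUsed(jstatOutput) {
  const rows = jstatOutput.trim().split("\n").map((line) => line.trim().split(/\s+/));
  const [header, values] = rows;
  const index = header.indexOf("O");
  if (index === -1 || !values) throw new Error("unexpected jstat output");
  return parseFloat(values[index]);
}

// Combine the report step and the trigger step. `runCommand` is injected so the
// sketch can run in dry-run mode without touching a real JVM.
function checkAndHeal(jstatOutput, pid, { threshold = 65, dryRun = true, runCommand } = {}) {
  const oldPct = parseOldPctUsed(jstatOutput);
  if (oldPct <= threshold) return { oldPct, action: "none" };
  // Assumption: `jcmd <pid> GC.run` is how the full GC is requested.
  const command = `jcmd ${pid} GC.run`;
  if (!dryRun) runCommand(command);
  return { oldPct, action: dryRun ? "would-run" : "ran", command };
}
```

Returning the decision as data, rather than only printing it, gives the script log something concrete to review in the watch-and-adjust step.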
![Page 34: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/34.jpg)
DEMO (part 2)
![Page 35: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/35.jpg)
Watch and adjust
Set up the automated script to run in as many test environments as are available/applicable.
Review the results (script log, metrics graphs).
Does it work?
Investigate any issues thoroughly.
Potentially, install the script in a dry-run mode in production.
![Page 36: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/36.jpg)
Go Live!
We recommend a gradual deployment.
Deploy to a subset of production, then assess.
Expand the subset, assess again.
When all of production is live, enjoy more sleep!
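One way to picture the gradual deployment is a helper that selects a growing subset of hosts for each stage. This is an illustrative sketch only: the function name `rolloutSubset`, the host names, and the stage fractions are all hypothetical, and the slides do not prescribe how the subset is chosen.

```javascript
// Pick the hosts for the current rollout stage. Hosts are sorted so that each
// stage is a superset of the previous one, and at least one host is selected
// whenever the list is non-empty.
function rolloutSubset(hosts, fraction) {
  const sorted = [...hosts].sort();
  const count = Math.max(1, Math.ceil(sorted.length * fraction));
  return sorted.slice(0, count);
}

// Hypothetical stages: a quarter of production, then half, then everything.
const ROLLOUT_STAGES = [0.25, 0.5, 1.0];
```

Between stages, the assessment from the slide still applies: review logs and metrics before moving to the next fraction.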
![Page 37: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/37.jpg)
Example #2
Externally Triggered Restart
![Page 38: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/38.jpg)
Real-life mPulse example
![Page 39: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/39.jpg)
Real-life mPulse example
What is a beacon?
{"timestamp":1392256183739,"drop_code":"crumb:missing","http_method":"GET","http_version":"HTTP/1.1","http_referrer":"","headers":{"host":"localhost:8080","accept":"*/*"},"params":{"nt_dns_end":"1392147897985","nt_load_end":"1392147912182","nt_first_paint":"1392147900.964995","mem.used":"131000000","nt_spdy":"0","nt_unload_end":"1392147898577","nt_dns_st":"1392147897985","nt_con_st":"1392147897985","rt.bmr.conEn":"834.00000000006","rt.bmr.resEn":"2320.0000000001637","mem.total":"199000000","nt_nav_st":"1392147897985","nt_domcontloaded_end":"1392147901891","dom.sz":"58549","rt.tstart":"1392147897985","rt.bmr.domSt":"419.0000000000964","nt_con_end":"1392147897985","nt_domint":"1392147901585","nt_red_end":"0","dom.ln":"939","nt_unload_st":"1392147898574","t_done":"14201","nt_load_st":"1392147912129","t_page":"13638","rt.end":"1392147912186","nt_domloading":"1392147898927","nt_res_end":"1392147898571","t_resp":"563","rt.bmr.domEn":"813.0000000001019","rt.tt":"14201","nt_red_cnt":"0","if":"","nt_fet_st":"1392147897985","nt_res_st":"1392147898548","nt_req_st":"1392147897995","nt_nav_type":"0","mob.ct":"0","dom.img":"16","nt_red_st":"0","rt.ss":"1392147897985","config.timedout":"true","rt.bmr.resSt":"2312.0000000001255","rt.si":"3el0j57fms0885mi-n0uk6y","rt.sl":"1","rt.bmr.fetSt":"16.000000000076398","rt.bmr.conSt":"813.0000000001019","nt_domcomp":"1392147912129","dom.script":"27","v":"0.9.1389663787","rt.bmr.reqSt":"834.00000000006","r":"","rt.bstart":"1392147906107","rt.obo":"0","rt.start":"navigation","nt_domcontloaded_st":"1392147901585"}}
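To make the beacon format concrete, the sketch below reads a few timing fields from a trimmed copy of the sample. The helper name `summarizeBeacon` is hypothetical, and the slides do not define what each field means; the sketch only relies on what the sample shows, namely that parameter values arrive as strings and that, in this beacon, `t_resp` plus `t_page` equals `t_done`.

```javascript
// A trimmed copy of the sample beacon (only the fields used below).
const beaconJson = JSON.stringify({
  timestamp: 1392256183739,
  drop_code: "crumb:missing",
  http_method: "GET",
  params: { t_resp: "563", t_page: "13638", t_done: "14201", "dom.sz": "58549" },
});

// Parameter values are strings in the beacon, so convert them before doing math.
function summarizeBeacon(json) {
  const beacon = JSON.parse(json);
  const p = beacon.params;
  return {
    receivedAt: new Date(beacon.timestamp).toISOString(),
    dropCode: beacon.drop_code,
    tResp: Number(p.t_resp),
    tPage: Number(p.t_page),
    tDone: Number(p.t_done),
  };
}
```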
![Page 40: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/40.jpg)
Real-life mPulse example, cont’d
Each server processes millions of these per day.
Beacons are logged to disk, eventually compressed and uploaded to S3.
![Page 41: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/41.jpg)
Real-life mPulse example, cont’d
Every so often, the background uploader thread stops working.
(we don’t know why yet)
When this happens, we get 10-12 hours before the disk fills up and the server dies.
![Page 42: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/42.jpg)
Real-life mPulse example, cont’d
A simple re-start fixes it.
SO...
While developers are investigating, Ops is getting paged (and woken up) to re-start boxes.
![Page 43: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/43.jpg)
Ops says…
We can do better!
![Page 44: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/44.jpg)
Identify the Problem (Demo App)
● Lack of activity indicates a failed thread
● While the issue goes unresolved, data is delayed (and the disk may fill)
![Page 45: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/45.jpg)
Describe a Solution
● A restart of the application solves the problem
● The application server needs to be removed from service prior to the restart
● The server hosting the application is an AWS instance, and a reboot is fast and effective
![Page 46: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/46.jpg)
Execute by Hand
1. Take the application out-of-service
2. Restart the application
3. Watch for Self-Check OK
4. Put the application back in-service
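The four manual steps translate naturally into an ordered sequence. The sketch below is a minimal illustration, not the presenters' tooling: `restartWithDrain` and the four step functions are hypothetical, and each step is injected so the same sequence can be exercised against fakes before it is wired to real DNS or server controls.

```javascript
// Run the four manual steps in order. Each step is an injected async function.
// `wait` is called between self-check attempts; the default does not pause,
// so a real caller would pass a function that sleeps.
async function restartWithDrain(steps, { maxChecks = 10, wait = async () => {} } = {}) {
  await steps.takeOutOfService();
  await steps.restart();
  // Poll the self-check; leave the server out of service if it never passes.
  for (let attempt = 1; attempt <= maxChecks; attempt++) {
    if (await steps.selfCheckOk()) {
      await steps.putInService();
      return { healthy: true, attempts: attempt };
    }
    await wait();
  }
  return { healthy: false, attempts: maxChecks };
}
```

If the self-check never passes, the sketch deliberately leaves the application out of service rather than returning an unhealthy server to rotation.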
![Page 47: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/47.jpg)
Automate the Solution, Manually Trigger
● Log metrics go to AWS CloudWatch
● Lack of activity triggers an Alarm
● Alarm triggers an SNS notification
● Human being makes the DNS changes and restarts the server.
![Page 48: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/48.jpg)
DEMO
![Page 49: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/49.jpg)
Developers say…
We can do better!
![Page 50: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/50.jpg)
Automate the Solution, Automate the Trigger
● EC2 and DynECT both have APIs
● DNS changes and reboot can all be automated
● Todd can sleep!
![Page 51: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/51.jpg)
Automate the Solution, Automate the Trigger
AWS Lambda
Upload code to Amazon (Node.js)
Attach it to a listener (SNS)
No instance required!
![Page 52: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/52.jpg)
Automate the Solution, Automate the Trigger
Lambda function listens on “logs are not being uploaded” notification.
Uses Dyn REST API to disable the DNS record.
Uses EC2 API to re-boot the instance.
![Page 53: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/53.jpg)
Automate the Solution, Automate the Trigger
Lambda function listens on “all OK” notification.
Uses Dyn REST API to re-enable the DNS record.
![Page 54: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/54.jpg)
Node.js code snippet
var dynect = require('./dynect_api.js');
var AWS = require('aws-sdk');

exports.cloudwatch_alarm_sns_handler = function(event, context) {
  event.Records.forEach(function(record) {
    var alarm = JSON.parse(record.Sns.Message);

    // Extract the instance status. ALARM means it's down, OK means it's up.
    var instance_up = alarm.NewStateValue !== "ALARM";

    // ...
  });
};
https://github.com/SOASTA/velocity-2015-self-healing-systems
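The snippet on the slide stops before the DNS and reboot calls. The sketch below shows one possible shape for the rest of the flow; it is not the code from the repository. The `dns` and `ec2` objects, their `enableRecord`, `disableRecord`, and `reboot` methods, and the example identifiers are hypothetical wrappers (in a real function they would call the Dyn REST API and an EC2 operation such as RebootInstances). As in the slide's snippet, any state other than ALARM is treated as "up".

```javascript
// Build a handler for CloudWatch alarm notifications delivered through SNS.
// `dns` and `ec2` are injected hypothetical wrappers, so the logic can be
// exercised without network access.
function makeAlarmHandler({ dns, ec2, instanceId, recordName }) {
  return async function handler(event) {
    const actions = [];
    for (const record of event.Records) {
      const alarm = JSON.parse(record.Sns.Message);
      // ALARM means the instance is down; anything else is treated as up.
      const instanceUp = alarm.NewStateValue !== "ALARM";
      if (instanceUp) {
        await dns.enableRecord(recordName);
        actions.push("dns-enabled");
      } else {
        // Remove the server from service before rebooting it.
        await dns.disableRecord(recordName);
        await ec2.reboot(instanceId);
        actions.push("dns-disabled", "rebooted");
      }
    }
    return actions;
  };
}
```

Disabling the DNS record before the reboot mirrors the manual procedure: take the server out of service first, then restart it.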
![Page 55: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/55.jpg)
New workflow
Look, no Todd!
![Page 56: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/56.jpg)
DEMO
![Page 57: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/57.jpg)
Watch and adjust
● Include Ops team on ALARM and SELFCHECKOK notifications
● Observe effects - use monitoring tools to assess availability
![Page 58: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/58.jpg)
Example #3
Application files support ticket
![Page 59: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/59.jpg)
Real-life mPulse example
● Customers configure raw beacon uploads to their own S3 buckets.
● Sometimes they break things (or the AWS access key is changed, etc.)
● We log the error, but we don’t monitor it and don’t notify customers.
![Page 60: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/60.jpg)
Identify the Problem
● Another example: a user connecting to a site can’t authenticate successfully
● Assumption is that this is a limited access site
![Page 61: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/61.jpg)
DevOps says…
Now, let’s help our customers succeed!
![Page 62: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/62.jpg)
Describe a Solution
● Notify the Customer Support team
● Provide Support with details so that they can proactively reach out
![Page 63: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/63.jpg)
Execute by Hand
● Examine the logs for the error
● Review the situation with Support
● Work with Support to handle a case end-to-end
![Page 64: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/64.jpg)
Automate the Solution, Manually Trigger
● Log metrics go to AWS CloudWatch
● Presence of error triggers an Alarm
● Alarm triggers an SNS notification
● Human being can then create a Zendesk case
![Page 65: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/65.jpg)
Automate the Solution, Automate the Trigger
● AWS Lambda listens on SNS notification
● Collects information from the notification
● Files a Zendesk case categorized to go to the correct team
![Page 66: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/66.jpg)
AWS Lambda Actions
On Failed Login notification
● Create a Zendesk case with user details
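As a rough illustration of the "collect information, then file a case" step, the sketch below turns an alarm notification into a ticket request body. It rests on assumptions: `buildTicketPayload`, the subject prefix, and the tag names are hypothetical; the alarm fields used are the ones CloudWatch normally includes in its SNS message; and the `ticket` / `subject` / `comment.body` / `tags` shape should be checked against the Zendesk ticket-creation documentation before use. Sending the request is left out on purpose.

```javascript
// Turn a CloudWatch alarm message (as delivered by SNS) into the JSON body for
// a Zendesk "create ticket" request. Tag names are hypothetical.
function buildTicketPayload(snsMessage) {
  const alarm = JSON.parse(snsMessage);
  return {
    ticket: {
      subject: `[auto] ${alarm.AlarmName}`,
      comment: {
        body: [
          `State: ${alarm.NewStateValue}`,
          `Reason: ${alarm.NewStateReason}`,
          `Time: ${alarm.StateChangeTime}`,
        ].join("\n"),
      },
      tags: ["self-healing", "auto-filed"],
    },
  };
}
```

Building the payload in a pure function keeps the case content easy to review with Support before any tickets are filed automatically.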
![Page 67: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/67.jpg)
Watch and adjust
● Ops reviews logs
● Ops meets with Support to review case frequency and outcomes
![Page 68: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/68.jpg)
Testing Requirements
● Start small
● Develop (and verify) in stages
● Let it run in a production-like environment
● Verify behavior in “dry-run” mode
![Page 69: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/69.jpg)
Tools Demonstrated - AWS
CloudWatch http://aws.amazon.com/cloudwatch/
EC2 http://aws.amazon.com/ec2/
Lambda http://aws.amazon.com/lambda/
Linux http://aws.amazon.com/amazon-linux-ami/
![Page 70: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/70.jpg)
Tools Demonstrated - Other
Datadog https://www.datadoghq.com/product/
Dyn Traffic Director http://dyn.com/traffic-director/
Monitis http://www.monitis.com/
PagerDuty http://www.pagerduty.com
Zendesk https://www.zendesk.com
![Page 71: Velocity 2015: Building Self-Healing Systems](https://reader035.fdocuments.us/reader035/viewer/2022070600/589fe2581a28abf3238b51e5/html5/thumbnails/71.jpg)
See SOASTA at booth #801