Debugging Session Preview - USENIX · Debugging is twice as hard as writing the code in the rst...

Post on 03-Aug-2020

5 views 0 download

Transcript of Debugging Session Preview - USENIX · Debugging is twice as hard as writing the code in the rst...

Debugging Session Preview

Xu Zhao

University of Toronto

Session: 3:50 pm - 5:10 pm, Tuesday

Why Care About Debugging?

I More than 50% of development time are on debugging

I Service down time is critical

Google’s blackout in 2013 caused40% drop in global Internet traffic

Amazon service down on prime day

Debugging is Hard

Debugging is twice as hard as writing the code in the firstplace. So if you write the code as cleverly as possible, you are,by definition, not smart enough to debug it.

– Brian Kernighan

Topics of Tomorrow’s Papers

Speedup Failure Resolution

Bug localization Identify the energy hotspot

Performance debugging Cluster-fueled debugger

I Quick service recovery by reverting the buggy commit

Traditional Debugging

CollectInformation

Understanding FailureFixingBug

DeployFix

Orca

CollectInformation

LocateBuggy

Commit

Revert toPre-bugVersion

I Quick service recovery by reverting the buggy commit

Traditional Debugging

CollectInformation

Understanding FailureFixingBug

DeployFix

Orca

CollectInformation

LocateBuggy

Commit

Revert toPre-bugVersion

I Power is the most contraining resource on mobile devicesI both iOS and Android provide following user interfaces

I Power is the most contraining resource on mobile devicesI Which part of the application is the energy hotspot?

I Breakdown the app into basic execution units calledapplication tasks.

ObservationEnergy consuming pattern can be very different even for similarapplication tasks.

Example: energy consumption breakdown of two IM apps

Other App TasksUI Task

UI Task Other App Tasks

Key Question

I Where is the bottleneck?

A

B

Disk

f1( )

f2( )

Idle

wait B

wait Disk

Disk write

Key Question

I Where is the bottleneck?

A

B

Disk

f1( )

f2( )

Idle

wait B

wait Disk

Disk write

On-CPU Analysis: does not reveal the blocking pattern

Key Question

I Where is the bottleneck?

A

B

Disk

f1( )

f2( )

Idle

wait B

wait Disk

Disk write

Off-CPU Analysis: result could be misleading

Key Question

I Where is the bottleneck?

A

B

Disk

f1( )

f2( )

Idle

wait B

wait Disk

Disk write

wPerf: correctly identify bottlenecking wait events

Highlights

I The best way to understand a failure is to replay it

I First distributed deterministic replay tool

I Debugging distributed systems as easy as GDB

Conclusion

Attend the 3:50pm session on Tuesday!