Performance Tuning 101
Problem Approach Nature of fixes War Stories
Performance Tuning 101: An introduction to performance tuning an application in a managed environment
Pavan KS (1), Janmejay Singh (2)
(1) mail: [email protected]: http://itspanzi.blogspot.com
(2) mail: [email protected]: http://codehunk.wordpress.com
Rootconf Bangalore, May 2012
Gist
Don’t be out there with a gun looking for problems.
Do not go overboard; know your end goal.
Intuition tends to work very poorly. Believe proof, not gut feel.
Understand your environment. Know what to measure.
Round pegs for round holes. Use appropriate tools.
Performance problems jump places, unlike functional bugs; look beyond local impact.
Tuning requires a mechanical approach; attack the worst problem first.
Prefer correctness over performance.
Reason about every fix (of paramount importance in a multithreaded app).
Do one fix at a time: it allows easy reasoning and verification, and can be reverted.
Don’t jump in the lake without knowing what’s waiting inside.
Context
App in reference: ThoughtWorks Studios - Go (CI/CD suite).
Java (backend) and JRuby on Rails (frontend) webapp.
Managed environment.
GNU/Linux.
Approach and learnings are independent of the tech stack.
Outline
1 Problem
  Identify scope
  Acceptance criteria
2 Approach
  How to analyze it
  Measure
  Fix
3 Nature of fixes
  Synchronization
  GC
  DB
  Thrashing
  Pitfalls
4 War Stories
  GC
  Locking
  Caching
  Javascript
Don’t go looking for performance issues
Doing it proactively is a wild goose chase!
Varying parameters and running into a real problem is improbable, if not impossible.
Remember...
Don’t be out there with a gun looking for problems.
It’s easier to reproduce and fix a performance issue when you have seen it happen in a real setup. That way, you also know for sure that it is real.
My app is slow, it doesn’t respond...
In case of Go, this could mean:
  The dashboard is slow
  Javascript is slow
  Builds taking too long to schedule
  Worker processes picking up work too late
  SCM changes reflect after a long time
  Process is hogging resources (requires 16 cores, 8 gig RAM for a basic setup)
Benchmark
How slow is slow?
How fast is fast?
Benchmark using a realistic setup.
Use a repeatable and automated setup.
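One way to turn "how slow is slow?" into a number is a small, repeatable timing harness with an explicit acceptance target. The sketch below is illustrative only: Python is used for brevity (the app in this talk is Java/JRuby), and the function names, run counts, and the choice of the 95th percentile as the criterion are assumptions, not something the talk prescribes.

```python
import statistics
import time


def percentile(samples, pct):
    """Return the nearest-rank percentile of `samples` (pct in 0..100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest-rank method: rank = ceil(pct / 100 * n), at least 1.
    rank = max(1, -(-len(ordered) * pct // 100))
    return ordered[int(rank) - 1]


def benchmark(operation, runs=50, warmup=5):
    """Time `operation` repeatedly and summarize the latencies in seconds."""
    for _ in range(warmup):
        operation()  # let caches and lazy initialization settle before measuring
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)
    return {
        "median": statistics.median(latencies),
        "p95": percentile(latencies, 95),
        "max": max(latencies),
    }


def meets_target(summary, p95_target_seconds):
    """Acceptance criterion (assumed): p95 latency must be within the target."""
    return summary["p95"] <= p95_target_seconds
```

Running the same harness before and after a change, against the same setup, gives a like-for-like comparison and a clear point at which to stop.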
Remember...
Do not go overboard.
Know your end goal.
Trust in testing
What if I start at fix?
What if I stop at fix?
Why do I need to iterate?
How long should I iterate?
Remember...
Intuition tends to work very poorly.
Believe proof, not gut feel.
What to measure
Easy and obvious:
  Page response time
Not so easy:
  App throughput
  Enqueue/dequeue time for message queues
  CPU/IO churn
  Database performance
  Lock contention
  Managed-environment-specific parameters (memory usage, GC churn, etc.)
  Environmental issues (too many processes, too little memory, very slow IO)
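App throughput is "not so easy" mostly because the app has to count its own events. A minimal sketch of such a counter follows, in Python for brevity; the class name `ChurnMeter` and the event names are hypothetical, and the injectable clock exists only to make the meter testable.

```python
import threading
import time
from collections import Counter


class ChurnMeter:
    """Counts app-level events and reports counts and rates (illustrative)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._counts = Counter()
        self._started = clock()

    def record(self, event, n=1):
        """Add `n` occurrences of `event`; safe to call from many threads."""
        with self._lock:
            self._counts[event] += n

    def snapshot(self):
        """Return {event: (count, events_per_second)} since the meter started."""
        with self._lock:
            elapsed = max(self._clock() - self._started, 1e-9)
            return {e: (c, c / elapsed) for e, c in self._counts.items()}
```

Comparing, say, a "created" rate with a "delivered" rate over the same window shows whether work is piling up faster than it is being drained.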
Remember...
Understand your environment.
Know what to measure.
Tools to measure
User load: ab, httperf
App throughput: measure app-level churn (creation/processing/deletion/delivery) counts.
Enqueue/dequeue time: use the broker’s management console/logs/statistics.
CPU/IO churn: load average, iostat, klogd, inotify.
Database performance: profiler.
Lock contention: profiler.
Managed-environment-specific parameters: GC logs, GC profiling.
Environmental issues: ps, dmesg, /proc/<pid>/status, /proc/net/dev, syslog, messages.
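As an illustration of the `/proc/<pid>/status` entry above, the sketch below pulls out a few fields that hint at memory pressure and context switching. It assumes a Linux kernel that exposes the `VmRSS`, `VmSwap`, `Threads`, `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields; the function names and the selection of fields are choices made for this example.

```python
def parse_proc_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status dump into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarize(fields):
    """Pick out a few numeric fields; missing fields are reported as 0."""
    def as_int(name):
        # Values look like '204800 kB' or '57'; take the leading number.
        raw = fields.get(name, "0").split()
        return int(raw[0]) if raw else 0

    return {
        "rss_kb": as_int("VmRSS"),
        "swap_kb": as_int("VmSwap"),
        "threads": as_int("Threads"),
        "voluntary_switches": as_int("voluntary_ctxt_switches"),
        "involuntary_switches": as_int("nonvoluntary_ctxt_switches"),
    }


def read_status(pid="self"):
    """Read and summarize the live status file (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return summarize(parse_proc_status(handle.read()))
```

Sampling this periodically and watching the deltas is usually more telling than a single reading: a non-zero `swap_kb` or a fast-growing involuntary switch count points at the environmental issues listed above.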
Remember...
Round pegs for round holes.
Use appropriate tools. Configure them correctly, with the least possible overhead.
How to carry out fixes
Understand libraries: read the fine print in the documentation, or better yet, read the code.
Be mechanical.
Use exaggerated load to magnify the problem.
Use the same load profile to test your fix.
Be rigorous.
Mob pair on complicated fixes, especially fixes involving data mutation across threads.
Attack one and only one problem.
Having said all that, correctness always comes first.
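"Use the same load profile to test your fix" is easiest to honor when the profile is generated deterministically. A minimal sketch in Python follows; the seed, the request paths, and the `exaggeration` multiplier are all hypothetical and stand in for whatever load actually reproduces your problem.

```python
import random


def load_profile(seed, requests, paths, exaggeration=1):
    """Build a deterministic list of request paths.

    `exaggeration` multiplies the request count to magnify a problem;
    the same arguments always yield the same sequence.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.choice(paths) for _ in range(requests * exaggeration)]


def replay(profile, handler):
    """Feed every request in the profile to `handler` and collect the results."""
    return [handler(path) for path in profile]
```

Because the profile depends only on its arguments, the run before the fix and the run after it see exactly the same requests in the same order, so any difference in the measurements can be attributed to the fix.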
Remember...
Requires mechanical approach
Attack worst problem first
Reason every fix
Reasoning is of paramount importance in multithreaded app
Do one fix at a time
Allows easy reasoning and verification, can be reverted
Prefer correctness over performance
Lock granularity and lockless
When you fix lock contention somewhere, ensure it hasn’t moved elsewhere, making overall performance worse.
More than one shared variable usually mandates synchronization; watch out!
Atomic operations and globally unique objects (like interned symbols) are your friends.
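The point about more than one shared variable can be illustrated with fields tied together by an invariant. In the sketch below (Python for brevity; the `BuildStats` class and its fields are hypothetical), `total` must always equal `passed + failed`, so all updates and reads happen under one lock.

```python
import threading


class BuildStats:
    """Shared fields tied by an invariant: total == passed + failed."""

    def __init__(self):
        self._lock = threading.Lock()
        self.passed = 0
        self.failed = 0
        self.total = 0

    def record(self, ok):
        # All related fields change under one lock, so no reader
        # ever observes them out of step with each other.
        with self._lock:
            if ok:
                self.passed += 1
            else:
                self.failed += 1
            self.total += 1

    def snapshot(self):
        """Return a consistent (passed, failed, total) triple."""
        with self._lock:
            return self.passed, self.failed, self.total


def hammer(stats, threads=8, per_thread=2000):
    """Update `stats` from several threads at once and return the final snapshot."""
    def work(offset):
        for i in range(per_thread):
            stats.record((i + offset) % 3 != 0)

    workers = [threading.Thread(target=work, args=(n,)) for n in range(threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return stats.snapshot()
```

A single counter could instead use an atomic increment; it is the relationship between several variables that makes a lock (or one atomically swapped immutable value) necessary.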
Remember...
Jumps places, unlike functional bugs
Look beyond local impact of a fix
Garbage Collection
When you hear unresponsiveness is intermittent, be alarmed!
  It probably is GC.
  It is likely not easy to fix.
  Root cause may be you or a library.
Scan GC logs.
Profile to see if the majority of allocations come from just a handful of calls.
  If it’s several calls, each contributing very little to the pie, don’t even try fixing it.
Understand GC ergonomics and each algorithm (if you have alternatives) in detail.
Tweak your max/min permitted allocation judiciously; growing and shrinking the heap is expensive (malloc/free may end up in system calls).
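The "handful of calls" test can be made mechanical once a profiler has produced per-call-site allocation totals. The sketch below is a plain calculation over such totals, written in Python for brevity; the 80% share and the limit of three sites are assumed thresholds for illustration, and the call-site names in any input are whatever your profiler reports.

```python
def top_contributors(allocations, share=0.8):
    """Return the smallest set of call sites (largest first) covering `share` of all bytes."""
    total = sum(allocations.values())
    if total == 0:
        return []
    picked, covered = [], 0
    for site, size in sorted(allocations.items(), key=lambda kv: kv[1], reverse=True):
        picked.append(site)
        covered += size
        if covered >= share * total:
            break
    return picked


def worth_fixing(allocations, share=0.8, max_sites=3):
    """Heuristic: pursue the fix only if a handful of sites dominate allocation."""
    sites = top_contributors(allocations, share)
    return 0 < len(sites) <= max_sites
```

When a couple of sites cover most of the allocation, fixing them is likely to show up in the GC logs; when dozens of sites each contribute a sliver, this heuristic says to leave it alone, as the slide advises.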
SQL Database
Optimize connection pool size.
  Expanding the pool is very expensive.
  Expansion is dispatched on the caller thread (unless your connection pool is super-human).
Watch out for slow queries; they can snowball.
Understand your DB settings, MVCC, table-locking scenarios, row-locking scenarios, and transaction isolation tradeoffs.
Contd...
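One way to keep pool expansion off the caller thread is to not expand at all: create every connection at startup and make callers wait when the pool is exhausted. The sketch below shows that idea with a generic `factory` callable rather than a real database driver; `FixedPool` is a hypothetical class written for this example, not the API of any particular connection-pool library.

```python
import queue
from contextlib import contextmanager


class FixedPool:
    """A pool that creates all its resources up front and never expands."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the creation cost once, at startup

    @contextmanager
    def acquire(self, timeout=None):
        # Blocks (up to `timeout` seconds) instead of growing the pool;
        # raises queue.Empty if nothing is returned in time.
        resource = self._idle.get(timeout=timeout)
        try:
            yield resource
        finally:
            self._idle.put(resource)
```

The tradeoff is that the size has to be chosen from measurement: too small and callers queue up behind each other, too large and the database carries idle connections.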
SQL Database contd...
Explain and Analyze are your friends.
Understanding the implications of file-system IO helps you realize table scans can be horrible.
  The DB runs with a fixed page cache.
  A table scan can wipe it all cold.
  Use indexes to avoid table scans.
  Indexes are order sensitive.
  Tweak inner queries to reduce the number of rows they select.
Every DB may be subtly different (joins can be more expensive than materialized views, if you have ’em).
Denormalization can help crack tough nuts.
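The order sensitivity of indexes is easy to see by asking the database for its plan. The sketch below uses SQLite (via Python's standard `sqlite3` module) purely as a stand-in; the `builds` table, its columns, and the index name are hypothetical, and other databases word their plans differently.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema, loosely modeled on a CI server's build history.
conn.execute(
    "CREATE TABLE builds (id INTEGER PRIMARY KEY, pipeline TEXT, stage TEXT, duration REAL)"
)
conn.execute("CREATE INDEX idx_pipeline_stage ON builds (pipeline, stage)")
conn.executemany(
    "INSERT INTO builds (pipeline, stage, duration) VALUES (?, ?, ?)",
    [(f"pipeline-{i % 20}", f"stage-{i % 5}", float(i)) for i in range(1000)],
)

# Filtering on the leading index column lets the index be used for a search.
plan_leading = query_plan(conn, "SELECT * FROM builds WHERE pipeline = ?", ("pipeline-3",))
# Filtering only on the second column cannot use the index order: a full scan.
plan_trailing = query_plan(conn, "SELECT * FROM builds WHERE stage = ?", ("stage-2",))
```

Printing the two plans shows a search using `idx_pipeline_stage` for the first query and a scan of `builds` for the second, even though `stage` is part of the index.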
Context Switch Thrashing
Too few cores with too many processes will obviously hurt.
Too many locks will cause too many sleeps, hence too much context switching.
A context switch may schedule a different process, causing CPU caches and the TLB to go cold.
Too frequent IO will cause io-wait, hence a switch.
  Use of swap can completely kill performance. Swap is for emergencies, not for regular usage.
  Over-logging can hurt, be careful!
  The concept of buffered IO exists for a reason, use it!
  Use memory-mapped IO if it makes sense.
Assigning a very low max memory limit can cause frequent GCs; allocate enough memory so it’s your functional threads that work and not the GC threads.
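The effect of buffered IO can be shown without touching a real disk. In the sketch below (Python for brevity), `CountingSink` is a hypothetical fake device that counts how many write calls reach it; each such call stands in for a trip to the operating system. The line format and buffer size are arbitrary choices for the example.

```python
import io


class CountingSink(io.RawIOBase):
    """A fake device that counts how many write calls reach it."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def writable(self):
        return True

    def write(self, data):
        self.calls += 1
        return len(data)  # pretend everything was written


def write_lines(stream, count):
    """Write `count` short log-like lines to `stream`, then flush."""
    for i in range(count):
        stream.write(b"build %06d scheduled\n" % i)
    stream.flush()


unbuffered = CountingSink()
write_lines(unbuffered, 1000)  # every line goes straight to the sink

buffered_sink = CountingSink()
write_lines(io.BufferedWriter(buffered_sink, buffer_size=8192), 1000)
```

After this runs, `unbuffered.calls` is 1000 while `buffered_sink.calls` is only a handful, because the buffer hands the sink a few large chunks instead of a thousand small ones.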
Common Pitfalls
- Prejudice
  - "SSL is bad"
  - Using asynchronicity to avoid fixing the root cause
- Lack of testing: in the Go context, the myth of a 150-agent limit.
- Lack of setup understanding: the misconception that a 64-bit JVM is always bad.
- Atomic operations and globally unique objects (like interned symbols) are your friends.
- Uncharted waters
  - ERB garbage churn
  - Strange JVM allocations
  - Single-threaded database
  - JRuby class hierarchy modification (runtime-level) locks
Know what you are getting into...
Gist, to reiterate...
- Do not go overboard, know your end goal.
- Intuition tends to work very poorly. Believe proof, not gut feel.
- Understand your environment. Know what to measure.
- Round pegs for round holes. Use appropriate tools.
- Performance problems jump places, unlike functional bugs. Look beyond local impact.
- Requires a mechanical approach. Attack the worst problem first.
- Prefer correctness over performance.
- Reason about every fix (of paramount importance in a multithreaded app).
- Do one fix at a time. It allows easy reasoning and verification, and can be reverted.
- Don't jump in the lake without knowing what's waiting inside.
Garbage churn issues
- Debug log statements
- ERB rendering
- Pooling objects with significant memory footprint
  - XML parser factory
  - Buffered reader
- Stream being dereferenced into a string
- Thread locals to the rescue (used for caching big reusable objects)
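A minimal sketch of the thread-local idea, assuming the big reusable object is a JAXP `DocumentBuilder`. The talk only says "XML parser factory", so that choice is an assumption. The class name `PerThreadParser` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper: keeps one DocumentBuilder per thread instead of
// creating a factory and a builder (and their garbage) on every call.
final class PerThreadParser {

    // DocumentBuilder is not guaranteed to be thread-safe, so it is cached
    // per thread rather than shared globally.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(() -> {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException(e);
            }
        });

    /** Returns the calling thread's builder, creating it on first use. */
    static DocumentBuilder builder() {
        return BUILDER.get();
    }

    /** Parses `xml` with the cached builder and returns the root element name. */
    static String rootName(String xml) throws Exception {
        return builder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement()
            .getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootName("<pipeline><stage/></pipeline>"));
        System.out.println("same builder reused: " + (builder() == builder()));
    }
}
```

One tradeoff to keep in mind: in a thread pool the cached object lives as long as its thread does, so this swaps garbage churn for a steady per-thread footprint.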
Contention and reasoning issues
- Locking on interned strings generated for context-sensitive locking
- Death of synchronized methods
- Copy onto the stack (using a lock), mutate locklessly, copy back to shared memory (using a lock)
- Use a read/write lock when appropriate
- Use atomic operations for lockless concurrency
- Concurrent data structure != thread safety
- Take difficult design tradeoffs to keep the code requiring a class of lock in one place
- Double-checked locking can be very helpful in frequently called code
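Two of the points above, atomic operations and "concurrent data structure != thread safety", fit in one sketch. The `HitCounters` class and its method names are hypothetical. `recordRacy` shows the trap: each map call is thread-safe on its own, but the get-then-put sequence is not atomic. `record` uses `ConcurrentHashMap.merge` and `AtomicLong`, which do the read-modify-write as a single step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter class, for illustration only.
final class HitCounters {
    private final ConcurrentHashMap<String, Long> perKey = new ConcurrentHashMap<>();
    private final AtomicLong total = new AtomicLong();

    // Racy even though the map is concurrent: get and put are separate steps,
    // so two threads can read the same old value and one update is lost.
    void recordRacy(String key) {
        Long old = perKey.get(key);
        perKey.put(key, old == null ? 1L : old + 1);
    }

    // Safe without explicit locks: merge updates the key atomically and
    // AtomicLong increments with a single atomic operation.
    void record(String key) {
        perKey.merge(key, 1L, Long::sum);
        total.incrementAndGet();
    }

    long count(String key) {
        return perKey.getOrDefault(key, 0L);
    }

    long total() {
        return total.get();
    }

    /** Runs `threads` workers, each calling record("build") `perThread` times. */
    static HitCounters hammer(int threads, int perThread) throws InterruptedException {
        HitCounters counters = new HitCounters();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.record("build");
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return counters;
    }

    public static void main(String[] args) throws InterruptedException {
        HitCounters counters = hammer(8, 10_000);
        System.out.println(counters.count("build") + " hits, total " + counters.total());
    }
}
```

Swapping `record` for `recordRacy` inside `hammer` will usually, though not on every run, produce a count below the expected total. That is exactly why such bugs are hard to catch by testing alone.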
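For the double-checked locking point, here is a minimal sketch of the idiom as it is usually written on Java 5 and later. `LazyValue` is a hypothetical name, and the sketch assumes the loader never returns null. The `volatile` field is essential: without it the idiom is broken under the Java memory model.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical lazy holder: the lock is taken only until the value exists,
// after which frequent callers pay for a single volatile read.
final class LazyValue<T> {
    private final Supplier<T> loader;
    private volatile T value; // volatile makes the unlocked first check safe

    LazyValue(Supplier<T> loader) {
        this.loader = loader;
    }

    T get() {
        T local = value;                 // first check, no lock
        if (local == null) {
            synchronized (this) {
                local = value;           // second check, under the lock
                if (local == null) {
                    local = loader.get(); // assumed to return non-null
                    value = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        LazyValue<String> config = new LazyValue<>(() -> {
            loads.incrementAndGet();
            return "expensive-config";
        });
        config.get();
        config.get();
        System.out.println("loader ran " + loads.get() + " time(s)");
    }
}
```

This pays off only in code that is called often enough for the lock to show up in a profile. Elsewhere a plain synchronized getter is easier to reason about.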
Caching
- Eager caching is dangerous (miss => load => prime => hit works better in practice).
- Testing the cache with boundary conditions
  - Test to ensure cache hits.
  - Ensure boundary conditions are well understood and overflow is judiciously defined/used.
- LRU eviction is the right (only?) solution sometimes.
- Double-check cache invalidation flows. It's hard, and sometimes bleeds across classes.
- Be vigilant when using hierarchical caches (no associative arrays or maps should be cached). It's easy to run out of memory with these.
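The miss => load => prime => hit flow and LRU eviction can be sketched together using `LinkedHashMap` in access order. `LruLoadingCache` and its methods are hypothetical names. The sketch uses one coarse lock for simplicity and treats a null from the loader as "not cached".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical bounded cache: nothing is loaded eagerly, entries are primed
// on the first miss and the least recently used entry is evicted on overflow.
final class LruLoadingCache<K, V> {
    private final int capacity;
    private final Function<K, V> loader;
    private final LinkedHashMap<K, V> entries;
    private int loads = 0;

    LruLoadingCache(int capacity, Function<K, V> loader) {
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder = true keeps the least recently accessed entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > LruLoadingCache.this.capacity;
            }
        };
    }

    synchronized V get(K key) {
        V value = entries.get(key);     // hit: also marks the entry as recently used
        if (value == null) {            // miss
            value = loader.apply(key);  // load
            loads++;
            entries.put(key, value);    // prime; may evict the eldest entry
        }
        return value;
    }

    synchronized void invalidate(K key) {
        entries.remove(key);
    }

    synchronized boolean isCached(K key) {
        return entries.containsKey(key); // does not affect access order
    }

    synchronized int loads() {
        return loads;
    }

    public static void main(String[] args) {
        LruLoadingCache<String, String> cache =
            new LruLoadingCache<>(2, key -> key.toUpperCase());
        cache.get("a");
        cache.get("b");
        cache.get("a"); // hit, so "b" is now the least recently used
        cache.get("c"); // over capacity, so "b" is evicted
        System.out.println("b cached: " + cache.isCached("b") + ", loads: " + cache.loads());
    }
}
```

The `loads()` counter and `isCached` exist mainly so tests can assert cache hits and boundary behaviour, which is the testing advice from the slide.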
Caching contd...
- Do not cache uncommitted data or invalidate the cache before commit (transactional caches may come in handy for this).
- Ensure transactions use cached data iff no other transaction modifies it, or ensure transactionally consistent invalidation.
- Understand your caching library. Too low an overflow limit may cause IO issues.
- Use one and only one cache instance (it's easy to end up running out of memory with this). We had view fragments, URLs, domain objects and view models cached in the same instance.
Javascript
- Frequent DOM offset/size calls hurt.
- Framework-provided foreach loops are bad, use native for loops.
- Using beautiful framework-provided methods like filter, find and select can have detrimental consequences.
- Avoid multiple lookups for the same element. Use a loaded copy as far as possible.
- Avoid CSS selectors in frequently called code.
- Avoid too much DOM churn.
- Use tools like the Chromium dev toolset to nail leaks and profile methods.
- Auto-refreshing dashboards should refresh only the visible and changed area.
Questions?