Outline Introduction Prototype Summary
Linux Kernel extensions to minimize effects of Software Aging
Ariel Sabiguero Andres Aguirre Fabricio Gonzalez
Daniel Pedraja Agustín Van Rompaey
Instituto de Computación, Facultad de Ingeniería, Universidad de la República
J. Herrera y Reissig 565, Montevideo, Uruguay
{asabigue|aaguirre}@fing.edu.uy {fabgonz|danigpc|fenix.uy}@gmail.com
20/10/2010
A. Sabiguero, A. Aguirre, F. Gonzalez, D. Pedraja, A. Van Rompaey Linux Kernel extensions to minimize effects of Software Aging
1 Introduction
  Concepts
  Finer-grained rejuvenation
2 Prototype
  Problem definition
  Key challenges addressed
  Kernel modifications performed
  Validation
  Performance testing
3 Summary
  ...ongoing work
  finally...
Concepts
Soft Errors
A soft error is a transient fault in a semiconductor device that can eventually cause loss of data integrity in memory.
It shows up as a change in a program instruction or a data value.
Soft errors do not permanently damage the system's hardware; the only damage is to the data being processed.
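To make the effect concrete, the following minimal sketch (not part of the original slides) models a soft error as a single inverted bit in a byte buffer. The helper name `flip_bit` and the sample buffer are assumptions chosen only for illustration.

```python
def flip_bit(buf: bytearray, byte_index: int, bit_index: int) -> None:
    """Model a soft error: invert one bit of buf in place (hypothetical helper)."""
    buf[byte_index] ^= 1 << bit_index


data = bytearray(b"kernel")
flip_bit(data, 0, 1)        # 'k' is 0x6B; inverting bit 1 yields 0x69, which is 'i'
corrupted = bytes(data)
flip_bit(data, 0, 1)        # inverting the same bit again restores the original byte
restored = bytes(data)
```

The hardware is unaffected in this model: only the stored value changes, and writing the correct value back fully repairs the damage.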
Software Aging & Rejuvenation
The term software aging refers to the progressive deterioration in the availability of OS resources caused by data corruption.
Software rejuvenation comprises proactive fault-management techniques that restore the system's internal state in order to prevent failures from occurring.
Finer-grained rejuvenation
A new approach
Instead of proactively rejuvenating a full process or system, we work at a finer grain.
We take advantage of the fact that program code and parts of program data remain constant during program execution.
We apply reactive rejuvenation to these constant areas of the system when they are modified.
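The reactive idea can be sketched in user space as follows. This is an illustrative model only, not the kernel implementation: the function name `rejuvenate_if_modified`, the in-memory `pristine` copy standing in for the backing file, and the use of CRC32 as the change check (the detection code named later in the talk) are assumptions.

```python
import zlib


def rejuvenate_if_modified(frame: bytearray, pristine: bytes, expected_crc: int) -> bool:
    """Restore a constant frame from its pristine copy if its CRC32 changed.

    Returns True when a restoration took place, False when the frame was intact.
    """
    if zlib.crc32(frame) == expected_crc:
        return False
    frame[:] = pristine      # reactive rejuvenation: rewrite only the damaged frame
    return True


pristine = bytes(range(256)) * 16            # 4096 bytes standing in for one frame
frame = bytearray(pristine)
expected = zlib.crc32(pristine)

untouched = rejuvenate_if_modified(frame, pristine, expected)   # nothing to do
frame[100] ^= 0x04                                              # simulated soft error
repaired = rejuvenate_if_modified(frame, pristine, expected)    # frame is restored
```

Only the affected frame is rewritten; the process that maps it keeps running, which is the contrast with a full process or system restart.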
Relevance of R.O. memory
Established software engineering practice discourages writing programs that change their own instructions.
Modern systems allow certain sections of a program to be defined as read-only, meaning that they remain constant throughout program execution.
Different portions of code and data are marked R.O. at compile time.
Modern compilers enforce the use of R.O. memory in their native formats (ELF, Executable and Linking Format, on Linux; PE, Portable Executable, on Windows).
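As an illustration of how the R.O. marking is visible in an ELF file, the sketch below walks a table of 32-bit ELF program headers and keeps the loadable segments whose flags lack write permission. The constants `PT_LOAD`, `PF_X`, `PF_W` and `PF_R` and the `Elf32_Phdr` field order come from the ELF format; the two headers and their addresses are synthetic values made up for the example, and `read_only_segments` is a hypothetical helper.

```python
import struct

PT_LOAD = 1                       # loadable segment
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4  # execute, write, read permission bits
# Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
PHDR32 = struct.Struct("<8I")


def read_only_segments(phdr_table: bytes) -> list:
    """Return the virtual addresses of loadable segments without write permission."""
    result = []
    for p_type, _off, vaddr, _paddr, _filesz, _memsz, flags, _align in PHDR32.iter_unpack(phdr_table):
        if p_type == PT_LOAD and not flags & PF_W:
            result.append(vaddr)
    return result


# Synthetic example: a text segment (read + execute) and a data segment (read + write).
text = PHDR32.pack(PT_LOAD, 0x0000, 0x08048000, 0x08048000, 0x1000, 0x1000, PF_R | PF_X, 0x1000)
data = PHDR32.pack(PT_LOAD, 0x1000, 0x08049000, 0x08049000, 0x0200, 0x0400, PF_R | PF_W, 0x1000)
ro_addrs = read_only_segments(text + data)
```

In this synthetic table only the text segment qualifies, which matches the intuition that code is the main body of constant memory in a running program.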
Problem definition
Objective & target platform
Detect and handle the occurrence of soft errors in R.O. memory.
Platform
  O.S.: GNU Linux Kernel 2.6.25.9
  Distribution: OpenSuSE 11.0
  Architecture: Intel x86
Key challenges addressed
Read-Only Memory in Linux
Characteristics
  Frame granularity
  Protection scheme: user space only
  Frames shared between tasks
Read-only subset: frames mapped into one or more processes with read-only access in every instance
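The definition of the read-only subset can be expressed as a small filter. The sketch below is a user-space model under stated assumptions: the mapping table is a plain list of `(frame_number, pid, writable)` tuples invented for the example, and `read_only_frames` is a hypothetical name, not a kernel function.

```python
def read_only_frames(mappings):
    """Return the sorted frame numbers that are read-only in every mapping.

    mappings: iterable of (frame_number, pid, writable) tuples.
    """
    seen = set()
    writable_somewhere = set()
    for frame, _pid, writable in mappings:
        seen.add(frame)
        if writable:
            writable_somewhere.add(frame)
    return sorted(seen - writable_somewhere)


mappings = [
    (10, 100, False),  # frame shared by two tasks, read-only in both
    (10, 101, False),
    (11, 100, False),  # read-only in one task ...
    (11, 101, True),   # ... but writable in another, so it is excluded
    (12, 101, True),   # writable frame
]
ro_subset = read_only_frames(mappings)
```

A single writable mapping is enough to exclude a frame, because a legitimate write through that mapping would otherwise be indistinguishable from corruption.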
Error detection mechanism
Memory change-detection algorithm
  Frame level
  Error detection code: CRC32
Search strategies
  System-level frame polling
  Task-subset polling
  Task scheduler checks
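A frame-level CRC32 check combined with a polling pass might look like the sketch below. It is a user-space model, not the kernel code: frames are plain byte buffers, `FRAME_SIZE` assumes the usual 4 KiB x86 page, and `build_crc_table` and `poll_frames` are hypothetical names.

```python
import zlib

FRAME_SIZE = 4096  # assumed frame size (typical x86 page)


def build_crc_table(frames):
    """Record one reference CRC32 per frame while the frames are known to be good."""
    return [zlib.crc32(frame) for frame in frames]


def poll_frames(frames, crc_table):
    """One polling pass: return the indices of frames whose CRC32 has changed."""
    return [i for i, frame in enumerate(frames) if zlib.crc32(frame) != crc_table[i]]


frames = [bytearray([i]) * FRAME_SIZE for i in range(4)]
crc_table = build_crc_table(frames)

clean_pass = poll_frames(frames, crc_table)   # no frame has changed yet
frames[2][1234] ^= 0x80                       # simulated single-bit soft error
dirty_pass = poll_frames(frames, crc_table)   # frame 2 is reported
```

The three search strategies listed above differ in which frames are fed to a pass like `poll_frames` and when it runs: all read-only frames in the system, only those of a chosen subset of tasks, or those of a task at scheduling time.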
Error Handling actions
Error correction code: Hamming
Automatic File Rejuvenation
User space rejuvenation assistance
  Error details ⇒ high-granularity actions
  Agent notifications ⇒ synchronous actions
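The slides name Hamming as the error correction code without giving its parameters. As an illustration only, the sketch below uses the classic Hamming(7,4) code, which is an assumption: it protects 4 data bits with 3 parity bits and corrects any single-bit error. The function names are hypothetical.

```python
def hamming74_encode(nibble: int) -> list:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d = [(nibble >> i) & 1 for i in range(4)]     # d[0] is the least significant bit
    code = [0] * 8                                # index 0 unused, positions are 1-based
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def hamming74_correct(bits: list):
    """Return (nibble, error_position); error_position is 0 if no error was detected."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                       # XOR of the positions holding a 1
    if syndrome:
        code[syndrome] ^= 1                       # the syndrome is the faulty position
    data_bits = (code[3], code[5], code[6], code[7])
    nibble = sum(bit << i for i, bit in enumerate(data_bits))
    return nibble, syndrome


codeword = hamming74_encode(0b1011)
received = list(codeword)
received[4] ^= 1                                  # corrupt position 5 (1-based)
decoded, position = hamming74_correct(received)
```

Unlike the CRC32 check, which only detects a change, a code of this family also locates the flipped bit, so the frame can be repaired in place without consulting a pristine copy.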
Kernel modifications performed
Kernel Map
Validation
Ensuring correctness of the implementation
Motivation: Separate bugs from Soft Errors
Challenges
  Low error probability in our typical scenario
  Hardware error generation is difficult and expensive
Fault Injection
  Software-based memory error simulation
  Kernel-integrated vs. high-level
  Exposed as a system call
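Software-based fault injection can be modelled in a few lines. The sketch below is a user-space stand-in for the kernel-integrated injector, written under stated assumptions: `inject_fault` is a hypothetical helper, memory is a plain byte buffer, and the fixed seed is there only to make the run reproducible.

```python
import random

FRAME_SIZE = 4096  # assumed frame size (typical x86 page)


def inject_fault(memory: bytearray, rng: random.Random):
    """Flip one randomly chosen bit; return (byte_offset, bit) so the test can verify it."""
    offset = rng.randrange(len(memory))
    bit = rng.randrange(8)
    memory[offset] ^= 1 << bit
    return offset, bit


memory = bytearray(4 * FRAME_SIZE)       # four zero-filled frames
reference = bytes(memory)                # known-good copy for comparison
rng = random.Random(2010)                # fixed seed: the same fault on every run

offset, bit = inject_fault(memory, rng)
hit_frame = offset // FRAME_SIZE         # frame the detector is expected to report
changed = [i for i in range(len(memory)) if memory[i] != reference[i]]
```

Because the injector reports where it struck, a validation run can check that the detector flags exactly `hit_frame`, which helps separate implementation bugs from genuine soft errors.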
Performance testing
Case study
We decided to evaluate the impact on an IO-bound application and a CPU-bound one.
Methodologically, we compare benchmarks run on the modified kernel against the same benchmarks run on a standard (vanilla) kernel.
The performance impact on a task depends on the resources it uses:
  Memory corruption correction routines barely compete with IO-bound loads.
  CPU-bound applications compete for the same resource, which impacts system performance.
Case study: performance results
[Figure: benchmark results in two panels, IO-bound and CPU-bound]
...ongoing work
Future work
Address loss of cache locality.
Consider power consumption due to continuous 100% CPU usage.
Focus on embedded solutions
  Improve CPU usage (a different approach than on desktops).
  Test on architectures other than x86.
Wish: to be able to test in an environment with a higher probability of soft errors (EMI).
finally...
Conclusions
We built and tested a prototype with the expected characteristics.
The rejuvenation implementation is software-based, in contrast to the traditional hardware-based scheme.
Our approach avoids a full system restart or a full process restart for the kind of errors addressed.
Being simple and non-intrusive, it is applicable to any piece of (Linux) software without modification.
Thank you for your time
Questions?