S4D keynote system-level debug Jakob Engblom October 2011
![Page 1: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/1.jpg)
System-Level Debug
Jakob Engblom, PhD
Technical Marketing Manager – Simics
Wind River, Stockholm, Sweden
![Page 2: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/2.jpg)
What’s the Problem?
![Page 3: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/3.jpg)
Debug has Always Been with Us
![Page 4: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/4.jpg)
©2011 Wind River. All Rights Reserved.
Debug Context Increasing in Size
[Figure: system and debug complexity (vertical axis) rising with design scale (horizontal axis): processor and memory; SoC devices; complete boards; devices, racks of boards, and backplanes; complete systems and networks]
![Page 5: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/5.jpg)
Multi-X is here to Stay
Multiple threads
Multiple processes
Multiple operating systems
Stacked operating systems (Hypervisor)
Multicore
Multiple chips
Multiple architectures
Heterogeneous architectures
You want to debug a single system as a unit
![Page 6: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/6.jpg)
Current Debugger Design Assumption
Target program
Debugger
![Page 7: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/7.jpg)
… We Need to Think Big
[Diagram: a debugger attached to a target program with several threads, running on a target OS]
![Page 8: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/8.jpg)
… We Need to Think Big
[Diagram: a debugger attached to several target programs and their threads, all running on one target OS]
![Page 9: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/9.jpg)
… We Need to Think Big
[Diagram: a debugger attached to several target OSes, each running several target programs and threads]
![Page 10: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/10.jpg)
Background: Simics Evolution
![Page 11: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/11.jpg)
System Debugging: The Beginning
When we started with Simics more than a decade ago, we included a simple debugger inspired by gdb
It had nice commands like:
– break (plant breakpoint)
– %r (read register value)
– sym (lookup symbol)
– x (examine memory)
– set-pc (change PC)
– ptime (print current time)
Worked fine for debugging an OS booting up on a virtual platform
Target OS
Debugger
![Page 12: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/12.jpg)
Simics Debugging: System Strikes
Over time, targets moved to multiple processors, multiple machines, and OS awareness with multiple processes
So now: where does “break” apply?
This is the essential question for all system-level debug
Over several generations, a design pattern emerged:
– Namespaces
– And then hierarchical namespaces
![Page 13: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/13.jpg)
Example of Namespacing
All operations need to specify what they operate on, typically by using command namespaces
– Example in full verbosity: tuna.vxworks.break (tuna.vxworkssymbols.pos myISR)
– tuna – name of machine
– vxworks – name of an execution context in the machine, corresponding to a VxWorks kernel
– vxworkssymbols – a symbol lookup engine for the binary corresponding to the VxWorks kernel
– pos myISR – command to find the position in the code of the function called myISR
– There can be more machines in the system, and more contexts
A plain “break” could have anything as its target
Using a “current processor” does not scale
– gdb uses a simple scheme like this for multithreaded debug
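The namespacing idea can be sketched in a few lines of Python. This is only an illustration and not the Simics command-line implementation: the classes `Machine`, `Context`, and `SymbolTable`, the `resolve` helper, the second machine name `cod`, and the addresses are all assumptions made for the example.

```python
# Hypothetical sketch of hierarchical command namespaces; not the Simics API.

class SymbolTable:
    """Symbol lookup engine for one binary (the role of 'vxworkssymbols')."""
    def __init__(self, symbols):
        self._symbols = dict(symbols)

    def pos(self, name):
        # Return the code address of the named function.
        return self._symbols[name]


class Context:
    """Execution context inside one machine (the role of 'vxworks')."""
    def __init__(self, path):
        self.path = path
        self.breakpoints = set()

    def break_(self, address):
        # 'break' is a Python keyword, hence the trailing underscore.
        self.breakpoints.add(address)
        return f"breakpoint at {address:#x} in {self.path}"


class Machine:
    """Top-level namespace: one target machine."""
    def __init__(self, name):
        self.name = name
        self.contexts = {}
        self.symbols = {}

    def add_context(self, ctx_name, symbols):
        self.contexts[ctx_name] = Context(f"{self.name}.{ctx_name}")
        self.symbols[ctx_name + "symbols"] = SymbolTable(symbols)


def resolve(machines, path):
    """Resolve 'machine.member' to a context or a symbol table."""
    machine_name, member = path.split(".")
    machine = machines[machine_name]
    if member in machine.contexts:
        return machine.contexts[member]
    return machine.symbols[member]


machines = {name: Machine(name) for name in ("tuna", "cod")}
machines["tuna"].add_context("vxworks", {"myISR": 0x1000})
machines["cod"].add_context("vxworks", {"myISR": 0x2400})

# Equivalent of: tuna.vxworks.break (tuna.vxworkssymbols.pos myISR)
address = resolve(machines, "tuna.vxworkssymbols").pos("myISR")
message = resolve(machines, "tuna.vxworks").break_(address)
print(message)
```

Because every operation names its target explicitly, the breakpoint lands only in `tuna.vxworks`; the identically named context on the other machine is untouched.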
![Page 14: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/14.jpg)
Applies to All Debugger Backends
Note that this is not just an issue for virtual platforms
– An ICE/hardware debugger can work with multiple cores in a single box, each with a separate OS
– A hypervisor debug agent has multiple guest OSes that it is simultaneously debugging
– A target agent can work with many programs at once
– A debugger can have multiple connections open to multiple backends in a single target system
  – For example, one JTAG connection per board in a rack, all controlled from the same host PC
Use the same debugger frontend across backends to provide a consistent debug experience
– Still exposing the strong points of each backend
![Page 15: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/15.jpg)
System-Level Issues
![Page 16: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/16.jpg)
Mixed Architectures
In a system setting, target machines can differ in architecture, word length, endianness
Debugger needs to be able to debug several target architectures in a single session
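[Diagram: a single debugger with PPC, IA, and AVR support attached to three targets: a 32-bit PPC big-endian machine running a target OS and apps, a 64-bit IA little-endian machine running a target OS and apps, and an 8-bit AVR running apps]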
![Page 17: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/17.jpg)
Example Problems
Reading variables and stack frames from memory
– Get endianness and word length right
Remember to deal with execution modes
– 64-bit processor running a 32-bit OS: how big are registers?
– Common on both x86 and PPC to have a 32-bit OS on 64-bit HW
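A minimal sketch of what "getting endianness and word length right" means when reading target memory. The function `read_word` and the `TARGETS` table are hypothetical; a real debugger would take the word size from the current execution mode of the target (for example 4 bytes for a 32-bit OS on 64-bit hardware) rather than from a fixed table.

```python
def read_word(memory: bytes, offset: int, word_bytes: int, big_endian: bool) -> int:
    """Decode one unsigned target word from raw memory bytes."""
    raw = memory[offset:offset + word_bytes]
    if len(raw) != word_bytes:
        raise ValueError("read past end of memory image")
    return int.from_bytes(raw, "big" if big_endian else "little")


# Hypothetical target descriptions: (word size in bytes, big-endian?)
TARGETS = {
    "ppc32": (4, True),
    "x86_64": (8, False),
    "avr": (1, False),
}

# The same eight bytes of memory, interpreted per target.
memory = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0])
for name, (size, big) in TARGETS.items():
    print(name, hex(read_word(memory, 0, size, big)))
```

The same bytes yield three different values, which is why the interpretation has to be chosen per target and per execution mode instead of once per debug session.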
![Page 18: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/18.jpg)
Keep it Local
When debugging multiple programs or systems at once, debug contexts have to be local to each target
– Symbol file associations, source code file paths, breakpoints, etc. have to be maintained per debugged entity
– Breakpoints and other actions have to apply to the smallest possible part of the target system
[Diagram: separate debugger contexts, one for each standalone target program and one for a target OS running several target programs and threads]
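One way to picture "keep it local" is a debug context object that owns its own symbols, source path, and breakpoints. The sketch below is an assumption for illustration only; `DebugContext`, `should_stop`, the context names, and the addresses are invented.

```python
class DebugContext:
    """Debugger state kept separately for one debugged entity."""
    def __init__(self, name, symbols, source_root):
        self.name = name
        self.symbols = dict(symbols)     # symbol name -> address
        self.source_root = source_root   # where this program's sources live
        self.breakpoints = set()

    def break_on(self, symbol):
        # Plant a breakpoint in this context only.
        address = self.symbols[symbol]
        self.breakpoints.add(address)
        return address


contexts = {
    "machine1/app1": DebugContext(
        "machine1/app1", {"main": 0x400, "init": 0x480}, "/src/app1"),
    "machine2/app3": DebugContext(
        "machine2/app3", {"main": 0x8000, "init": 0x8100}, "/src/app3"),
}

contexts["machine1/app1"].break_on("main")


def should_stop(context_name, pc):
    """A hit counts only if the running context owns a breakpoint at pc."""
    return pc in contexts[context_name].breakpoints


print(should_stop("machine1/app1", 0x400))   # breakpoint owned by app1
print(should_stop("machine2/app3", 0x8000))  # same symbol name, other program
```

Both programs have a `main`, but the breakpoint applies to the smallest possible part of the system: the one context where the user planted it.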
![Page 19: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/19.jpg)
Example: Eclipse Breakpoints
Eclipse saves breakpoints across sessions
– Same for expressions, watch expressions, …
When a name happens to match a current breakpoint context, it is replanted
– If several programs contain the same names, breakpoints will be planted in all matching locations
  – File name, variable name, etc.
– Very strange effects for a user
![Page 20: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/20.jpg)
Step Here, Stop There
When following multiple programs, you will end up with some programs or threads hitting breakpoints while others are stepping
– Essentially, non-stop debugging is the only reasonable choice
[Timeline diagram: App 1, App 2, and App 3 are switched in and out by the OS (legend: running, inactive, op) while the user issues step line, step in, and step out, and a breakpoint is hit. Annotations:]
– Stop the system for break in app 3, or wait until the step line in app 1 completes?
– If we stop due to app 1 step complete, we need to make the user understand that app 3 is still stepping…
– Step on an inactive task – we should stop once it activates
– The user now focuses on this application and decides to step out of the current function
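The run-control situation on this slide can be modelled as a small per-application state machine. This is a simplified sketch under stated assumptions: steps are counted in instructions, the class names `App` and `NonStopDebugger` are invented, and the OS scheduler is simulated by calling `execute` for whichever application is switched in.

```python
class App:
    def __init__(self, name):
        self.name = name
        self.state = "running"   # "running", "stepping", or "stopped"
        self.steps_left = 0


class NonStopDebugger:
    """Per-application run control: one app stopping does not stop the others."""
    def __init__(self, names):
        self.apps = {name: App(name) for name in names}
        self.events = []

    def step(self, name, instructions):
        # A step may be requested while the app is switched out; it only
        # makes progress once the OS switches the app in.
        app = self.apps[name]
        app.state = "stepping"
        app.steps_left = instructions

    def execute(self, active, instructions):
        """The OS has switched in `active` and gives it a time slice."""
        app = self.apps[active]
        for _ in range(instructions):
            if app.state != "stepping":
                continue   # running apps just run; stopped apps do nothing
            app.steps_left -= 1
            if app.steps_left == 0:
                app.state = "stopped"
                self.events.append((active, "step complete"))

    def status(self):
        return {name: app.state for name, app in self.apps.items()}


dbg = NonStopDebugger(["app1", "app2", "app3"])
dbg.step("app1", 3)
dbg.step("app3", 2)            # app3 is inactive when the step is issued
dbg.execute("app1", 3)         # app1 finishes its step and stops
mid_status = dbg.status()      # app3 must still be shown as stepping
dbg.execute("app2", 5)         # app2 keeps running, unaffected
dbg.execute("app3", 2)         # app3 is switched in and completes its step
print(mid_status, dbg.events)
```

The intermediate status is the point of the slide: when app 1 reports "step complete", the user interface still has to show that app 3 is in the middle of a step.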
![Page 21: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/21.jpg)
Single Connection
Ideally, use a single connection to the target system in its entirety
– Requires connection to handle heterogeneity
  – Most existing debug protocols (such as gdb-serial) assume a homogeneous counterpart
  – Target Communications Framework (TCF) can handle heterogeneous systems
– Homogeneous connection implies several separate connections to target
– Coordinating the run control across multiple connections can get painful
– Coordination hardware box close to target?
[Diagram: one debugger with a single connection to a system of three targets: a 32-bit PPC big-endian machine with target OS and apps, a 64-bit IA little-endian machine with target OS and apps, and an 8-bit AVR with apps]
![Page 22: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/22.jpg)
Single Connection
Example in action:
![Page 23: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/23.jpg)
Follow Multiple Programs
We want to see where several programs are executing
– At the same time, in the same debugger GUI
Requires a program-centric user interface
Also a problem with multicore debug today
– The default source-centric model of Eclipse CDT does not work well
– Many other debuggers have a good model for this already in place
[Diagram: Machine 1 runs a target OS with App 1 and App 2; Machine 2 runs a target OS with App 3; Machine 3 runs a target OS the user does not care about. The debugger shows App 1, App 2, and App 3 side by side.]
![Page 24: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/24.jpg)
Source File != Program
Typically, debuggers associate from a source code file to an executing binary to set breakpoints
But the “same” source file can be part of several programs executing in a system
– Same name (main.c) used in many programs – distinguish by compilation path
– Same file, in the same place in the file system, included into multiple programs with different compilation settings
  – Common code base of portable code
  – For example, a piece of middleware compiled for PPC-32-Linux, x86-64-Windows, ARM-32-VxWorks, all running in the same target system
![Page 25: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/25.jpg)
Source File != Program
Debugger needs to distinguish between the different contexts that share a source code file to correctly target breakpoints and resolve variable values
[Diagram: Machine 1, Machine 2, and Machine 3 each run a target OS and an application (App 1, App 2, App 3) that includes mymiddleware, all built from the same source file common/mymiddleware.c]
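A small sketch of why a source location alone is not a breakpoint target. The `LineTable` class, the context names, the line number, and the addresses are assumptions for illustration; in practice the mapping would come from the debug information of each binary.

```python
class LineTable:
    """Maps (source file, line) to every (context, address) compiled from it."""
    def __init__(self):
        self._locations = {}

    def add(self, context, source, line, address):
        self._locations.setdefault((source, line), []).append((context, address))

    def lookup(self, source, line, context=None):
        hits = self._locations.get((source, line), [])
        if context is None:
            return list(hits)   # ambiguous: the user or UI must choose
        return [hit for hit in hits if hit[0] == context]


table = LineTable()
table.add("machine1/app1", "common/mymiddleware.c", 42, 0x10002A40)
table.add("machine2/app2", "common/mymiddleware.c", 42, 0x00401C10)
table.add("machine3/app3", "common/mymiddleware.c", 42, 0x00008F30)

# One source line, three different code addresses on three machines.
everywhere = table.lookup("common/mymiddleware.c", 42)
only_app2 = table.lookup("common/mymiddleware.c", 42, context="machine2/app2")
print(len(everywhere), only_app2)
```

Requiring a context in the lookup forces the debugger to ask "which program?" before it plants a breakpoint or evaluates a variable at that line.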
![Page 26: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/26.jpg)
Example: Same-Name Programs
Machine 1 (shark, ppc-32)
Machine 2 (mackerel, x86-64)
Machine 3 (herring, x86-32)
![Page 27: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/27.jpg)
The Target is Already Running
In a classic debugger workflow, the debugger has the ability to launch the target program to debug
In a system-level debug setting, that does not work
– The target software is already there, and it is starting and stopping based on events inside and outside the target system
What does this require from a debugger?
– Ability to debug any existing process, including chasing it as it is switched in and out by the OS kernel
– Ability to hook into the start of a new process to attach to it as something starts it
– More smarts in the backend, fewer round-trips to frontend
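The "hook into the start of a new process" requirement could look roughly like this in an OS-aware backend. Everything here is hypothetical: the `Backend` class, the `attach_on_start` call, and the process names are invented, and process creation is simulated by calling `on_process_start` directly.

```python
class Backend:
    """Hypothetical OS-aware backend that sees process creation events."""
    def __init__(self):
        self.watches = {}    # program name -> frontend callback
        self.halted = []

    def attach_on_start(self, program, callback):
        # Registered once; no frontend round-trip for each process start.
        self.watches[program] = callback

    def on_process_start(self, pid, program):
        """Called by the OS-awareness layer whenever a process is created."""
        callback = self.watches.get(program)
        if callback is None:
            return False             # not interesting: let it run
        self.halted.append(pid)      # halt before the first instruction
        callback(pid, program)       # single notification to the frontend
        return True


attached = []
backend = Backend()
backend.attach_on_start(
    "mediaserver", lambda pid, program: attached.append((pid, program)))

# Simulated process creation on the target.
for pid, program in [(101, "init"), (102, "mediaserver"), (103, "logger")]:
    backend.on_process_start(pid, program)

print(attached, backend.halted)
```

The decision to halt is taken inside the backend, so uninteresting processes start without any traffic to the frontend.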
![Page 28: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/28.jpg)
Example
In fact the most cumbersome procedure with Android is not necessarily to set up the debugger, as this is a one-time step. What is tedious and time consuming is the ability to attach a debugger to the right process at the right point in time and under the right condition. Within Android, a single service is made up by the interaction of many software entities on all layers of the SW stack. As an Android integrator, you cannot be sure where the problem is rooted. This, you can only find out during debugging. Debugging however, requires the injection of a halt at the beginning of the thread/process as you do not launch those manually. It is practically not possible to inject those halts in all potential error prone processes, as you would need to attach a debugger to all of them and step over. For sure, it is not needed for all processes as some of them already wait. It is just very annoying and complicated to debug native code in Android practically [I have not seen anybody claiming something else so far]. At the end you fall back to printf/log based debugging and guessing rather than conducting a systematic analysis using a debugger.
http://www.synopsysoc.org/viewfromtop/2011/09/vp-software-debugging-myths-and-facts/comment-page-1/#comment-1506
![Page 29: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/29.jpg)
Search, Filter, Summary
We will need to use search and filter as a way to interact with a debugger
– Imagine 10 boards, each with 10 cores, each with 100 threads running – we quickly get to 10000+ nodes in the system
– Finding your way around requires tools more like desktop search
Such filtering is necessary to use bandwidth smartly
– Passing over the state of 10000 threads on each debugger stop is not practical
– Even for a virtual platform on the same host (“infinite bandwidth”), the data passing will take noticeable time
Smart summaries of system state to let users focus in the right place are also needed
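A sketch of backend-side search and summary for the system size mentioned on this slide (10 boards, 10 cores each, 100 threads each). The data model, the `query` and `summarize` helpers, and the choice of which threads are stopped are assumptions for the example.

```python
from collections import Counter
from itertools import product

# Build the system from the slide: 10 boards x 10 cores x 100 threads.
# Two threads on board 3, core 7 are arbitrarily marked as stopped.
threads = [
    {"board": b, "core": c, "thread": t,
     "state": "stopped" if (b, c, t % 50) == (3, 7, 0) else "running"}
    for b, c, t in product(range(10), range(10), range(100))
]


def query(nodes, **criteria):
    """Backend-side filter: return only nodes matching every criterion."""
    return [n for n in nodes if all(n[k] == v for k, v in criteria.items())]


def summarize(nodes):
    """Compact summary sent on each stop instead of the full state."""
    return Counter(n["state"] for n in nodes)


summary = summarize(threads)
stopped = query(threads, state="stopped")
print(len(threads), dict(summary), [n["thread"] for n in stopped])
```

On a stop, the frontend would receive the two-entry summary and then ask for the handful of stopped threads, rather than receiving all 10000 thread records.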
![Page 30: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/30.jpg)
Actual Time
In classic debug, time is logical
– Lines of code, values of loop variables, progress of processing
In system-level debug, physical time becomes important
– Did this breakpoint hit “just after” something else, or “way later”?
– Does a step of a certain thread take microseconds or seconds?
– Is this before or after something happened in another place?
– As you jump between threads, programs, and targets, just where are you in the overall system execution?
– Physical time answers a lot of these questions
Debuggers need to present current time(s), delta(s) in time
– There is more than one time in any moderately complex system
– (search the gdb mailing lists for a discussion we had on this)
Real time is obviously really important for debugging real-time systems (in a different way)
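Presenting current times and deltas can be sketched as follows. The `StopLog` class, the clock names, and the timestamps are assumptions; the point is only that each clock in the system gets its own "time since last stop".

```python
def format_delta(seconds):
    """Render a time difference with a unit a human can scan quickly."""
    magnitude = abs(seconds)
    if magnitude >= 1:
        return f"{seconds:+.3f} s"
    if magnitude >= 1e-3:
        return f"{seconds * 1e3:+.3f} ms"
    return f"{seconds * 1e6:+.3f} us"


class StopLog:
    """Remembers the last stop time per clock, since a system has several."""
    def __init__(self):
        self.last = {}

    def stop(self, clock, now):
        previous = self.last.get(clock)
        self.last[clock] = now
        if previous is None:
            return f"{clock}: t={now:.6f} s"
        delta = format_delta(now - previous)
        return f"{clock}: t={now:.6f} s ({delta} since last stop)"


log = StopLog()
first = log.stop("board0", 1.000000)
second = log.stop("board0", 1.000250)
third = log.stop("board1", 0.750000)   # a different clock, no delta yet
print(first, second, third, sep="\n")
```

With the delta shown next to each stop, "just after" versus "way later" becomes something the user can read off directly.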
![Page 31: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/31.jpg)
Example: Actual Time
![Page 32: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/32.jpg)
Trace to Analysis to Debug
Trace is the basis for modern debug and analysis
Trace processing finds an event
– (hiccup, delay, sudden spike in CPU usage, whatever)
– ... jump to the point in time, and the system, task, source file, and line of code where the suspicious event occurred
OS awareness needs to permeate debugging at all levels
[Diagram: target programs and threads]
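The trace-to-debug flow can be illustrated with a toy trace. The record layout, the `find_spike` helper, the threshold, and all trace values are assumptions; a real trace would come from the target or the virtual platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    time: float       # seconds of target time
    machine: str
    task: str
    source: str
    line: int
    cpu_load: float   # 0.0 to 1.0


def find_spike(trace, threshold):
    """Return the first record where CPU load rises above the threshold."""
    for previous, current in zip(trace, trace[1:]):
        if previous.cpu_load < threshold <= current.cpu_load:
            return current
    return None


trace = [
    TraceRecord(0.010, "machine1", "app1", "main.c", 12, 0.20),
    TraceRecord(0.020, "machine1", "app1", "main.c", 30, 0.25),
    TraceRecord(0.030, "machine2", "app3", "common/mymiddleware.c", 42, 0.95),
    TraceRecord(0.040, "machine2", "app3", "common/mymiddleware.c", 44, 0.90),
]

event = find_spike(trace, threshold=0.8)
# A debugger would now move its focus to this time, machine, task, and line.
print(event)
```

Each trace record carries the machine and task as well as the source position, which is where OS awareness is needed: without it the analysis could find the time of the spike but not the code to jump to.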
![Page 33: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/33.jpg)
Script!
To solve complex problems, you need to be able to script the debugger
– A GUI is nice… but sometimes you need to program
It allows program-specific custom automatic debug
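As an illustration of program-specific automatic debug, the sketch below lets a script decide at each breakpoint hit whether to stop. The `ScriptableDebugger` class, the `queue_push` location, and the queue invariant are all invented for the example, and the breakpoint hits are simulated.

```python
class ScriptableDebugger:
    """Hypothetical debugger core that lets scripts decide whether to stop."""
    def __init__(self):
        self.handlers = {}

    def on_break(self, location, handler):
        # handler(state) returns True to stop, False to continue silently.
        self.handlers[location] = handler

    def run(self, hits):
        """Feed simulated breakpoint hits; return the first one a script stops on."""
        for location, state in hits:
            handler = self.handlers.get(location)
            if handler is not None and handler(state):
                return location, state
        return None


def queue_is_corrupt(state):
    # Program-specific invariant: 0 <= head <= tail <= size.
    return not (0 <= state["head"] <= state["tail"] <= state["size"])


debugger = ScriptableDebugger()
debugger.on_break("queue_push", queue_is_corrupt)

hits = [
    ("queue_push", {"head": 0, "tail": 1, "size": 8}),
    ("queue_push", {"head": 1, "tail": 4, "size": 8}),
    ("queue_push", {"head": 5, "tail": 3, "size": 8}),   # invariant broken
    ("queue_push", {"head": 5, "tail": 4, "size": 8}),
]
stop = debugger.run(hits)
print(stop)
```

The user is interrupted only at the third hit, where the invariant first fails, instead of stepping through every call by hand in a GUI.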
![Page 34: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/34.jpg)
Conclusions
![Page 35: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/35.jpg)
System-Level Debug is a Challenge
Backends
– OS awareness, same frontend to many backends, more smarts
Communications to backend
– Heterogeneity, bandwidth, abstraction
User Interface
– Local actions, control over scopes, program-centric
Debugging concepts
– Time, trace-to-code, search-and-filter, scripting
![Page 36: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/36.jpg)
Questions? Comments? Hate mail?
![Page 37: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/37.jpg)
Backups
![Page 38: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/38.jpg)
Multiple Connections
Another case is that several debuggers have to be used at once to debug a single system
– One debugger might not support all architectures
– Specialized debuggers for architectures like DSPs need to be used for a subset of processors
[Diagram: separate debuggers attached to a 32-bit PPC big-endian target and a 64-bit IA little-endian target, each running a target OS and apps]
![Page 39: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/39.jpg)
Multiple Connections: Who’s in Charge?
Some interesting problems
– Debugger cannot assume it knows why a target stopped
– Debugger needs to ask the target for why and where it stopped
Debug protocols have to support state updates from the target to the debuggers
– Don’t use any blocking operations in the GUI, otherwise deadlock
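A sketch of the multi-debugger situation, assuming a target that pushes a "stopped" notification to every attached debugger and answers queries about the stop. The `Target` and `Debugger` classes, the debugger names, and the stop location string are invented; the non-blocking `poll` stands in for what a GUI event loop would call.

```python
import queue


class Target:
    """Hypothetical target shared by several debuggers."""
    def __init__(self):
        self.subscribers = []
        self.stop_info = None

    def subscribe(self, inbox):
        self.subscribers.append(inbox)

    def stop(self, reason, where, requested_by):
        self.stop_info = {"reason": reason, "where": where, "by": requested_by}
        for inbox in self.subscribers:
            inbox.put("stopped")   # state update pushed to every debugger

    def query_stop(self):
        return dict(self.stop_info)


class Debugger:
    def __init__(self, name, target):
        self.name = name
        self.target = target
        self.inbox = queue.Queue()
        self.view = None
        target.subscribe(self.inbox)

    def poll(self):
        """Called from the GUI event loop; never blocks waiting for the target."""
        try:
            self.inbox.get_nowait()
        except queue.Empty:
            return False
        # Do not assume the stop was ours: ask the target what happened.
        self.view = self.target.query_stop()
        return True


target = Target()
ppc_debugger = Debugger("ppc", target)
dsp_debugger = Debugger("dsp", target)

idle = ppc_debugger.poll()   # nothing has happened yet
target.stop("breakpoint", "dsp0:0x400", requested_by="dsp")
updated = ppc_debugger.poll()
print(idle, updated, ppc_debugger.view)
```

The PPC debugger learns about a stop it did not cause, and because it polls instead of blocking, neither debugger can hang waiting on the other.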
![Page 40: S4D keynote system-level debug Jakob Engblom October 2011€¦ · Simics Debugging: System Strikes Over time, targets moved to multiple processors, multiple machines, and OS awareness](https://reader033.fdocuments.us/reader033/viewer/2022052011/6026272f57b36e6af80e019c/html5/thumbnails/40.jpg)