-
Title Slide
Some Real-Time-ish Problems-ish
THOMAS GUSTAFSSON
-
Title and Content
• Present some real-time problems from my career
• Think hard and discuss among yourselves for a while
• Discussions about possible solutions
6 December 2019 Info class internal Department / Name / Subject 2
Agenda
-
Title and Content
• Scania is a world-leading manufacturer of
− heavy trucks
− buses
− engines
• Some 50000 employees around the world
• R&D in Södertälje, Sweden, and Sao Paulo, Brazil
Scania
-
Title and Content
• About 3500 employees, mainly located in Södertälje, Sweden
• Examples of R&D development
− engine
− gearbox
− chassis
− cab
− Scania electrical system
− a lot of ECUs and the software for some of them
• A lot of SW is developed by Scania, so there are a lot of SW developer opportunities
Scania R&D
-
Title and Content
• Test automation framework for complete-vehicle HIL testing (past)
• Logging system for autonomous vehicles (current)
• Base software for autonomous vehicles (current)
• All work has boiled down to system development in general, and sw development and sw engineering in particular
My career at Scania
-
Title and Content
• A Hardware-in-the-Loop (HIL) rig connects to the inputs and outputs of Electronic Control Units (ECUs) and “fools” them into behaving as if they were in a real vehicle
• Purpose of HIL is to
− run regression test suites, making sure new software changes do not change old behavior
− run dangerous tests, e.g., removing a brake management system sensor at 90 km/h
− manipulate hw that is difficult to get to in a real vehicle, e.g., sensors in the engine
Test automation framework for HIL testing
-
Title and Content
• A Scania vehicle is determined by its Scania On-board Product Specification (SOPS)
− A SOPS describes which function product codes (FPCs) the vehicle has, and their values
− For instance, FPC1 describes the type of vehicle, A is truck, and B is bus
• To run a test suite:
− read SOPS and configure input to test system according to FPC conditions
− flash ECUs
− parameterize ECUs
− run tests
− collect test results
Test automation framework for HIL testing
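The first step above, configuring the test run from FPC conditions, can be sketched as follows. This is only an illustration: the dictionary representation of a SOPS, the test names, and the helpers `condition_holds` and `select_tests` are assumptions made for the example, not Scania's actual format or framework. Only the FPC1 values (A is truck, B is bus) come from the slide.

```python
# Hypothetical representation: a SOPS as a mapping from FPC name to value.
# From the slide: FPC1 = "A" means truck, FPC1 = "B" means bus.
sops = {"FPC1": "A"}


def condition_holds(sops, condition):
    """True if every FPC in the condition has one of the allowed values."""
    return all(sops.get(fpc) in allowed for fpc, allowed in condition.items())


def select_tests(sops, tests):
    """Return the names of the tests whose FPC condition matches this vehicle."""
    return [name for name, condition in tests if condition_holds(sops, condition)]


# Hypothetical test suite: (test name, FPC condition). An empty condition
# means the test applies to every vehicle.
tests = [
    ("truck_brake_test", {"FPC1": {"A"}}),
    ("bus_door_test", {"FPC1": {"B"}}),
    ("common_lights_test", {}),
]

selected = select_tests(sops, tests)
```

For the truck SOPS above, `selected` contains the truck test and the common test but not the bus test.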
-
Title and Content
• Electrical system consists of some main CAN buses
− red for safety critical systems, e.g., engine management system, brake management system, air production system
− yellow for less critical systems, e.g., information cluster, external lights system
− green for non-critical systems, e.g., infotainment system
− brown for ADAS-related sensors
• And a lot of sub-buses to ECUs
• As many as possible of these CAN buses shall be recorded during testing in HIL
Test automation framework for HIL testing
-
Title and Content
• Web based GUI
• Utility programs such as a boot program, an HTML and WebSocket server, and a process monitor
• Log domains: CAN, CCP/XCP, cameras, IO, ethernet, etc.
• Each log domain is its own
• Originally Windows based but today Linux based
• C++ with a CMake build chain targeting both platforms
Logging system
-
Title and Content
• Most log domains implement the actor pattern
− a dedicated thread reading the sensor data and putting it into a FIFO
− a dedicated thread reading from the FIFO doing some useful stuff with the data
Logging system
[Figure: sensor to be logged, read by thread 1; thread 2 does something with the data, like publishing it to middleware]
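The two-thread structure described on this slide can be sketched as below. It is a minimal illustration in Python rather than the C++ of the actual logging system. The names `reader`, `worker`, `run_log_domain`, the stop sentinel, and the fixed sample count are assumptions made so the example terminates.

```python
import queue
import threading

_STOP = object()  # sentinel telling the second thread that no more data will come


def reader(read_sample, fifo, n_samples):
    """Thread 1: read sensor data and put it into the FIFO."""
    for _ in range(n_samples):
        fifo.put(read_sample())
    fifo.put(_STOP)


def worker(fifo, publish):
    """Thread 2: take data from the FIFO and do something useful with it."""
    while True:
        item = fifo.get()
        if item is _STOP:
            break
        publish(item)


def run_log_domain(read_sample, publish, n_samples):
    """Run one log domain until n_samples have been read and handled."""
    fifo = queue.Queue()
    t1 = threading.Thread(target=reader, args=(read_sample, fifo, n_samples))
    t2 = threading.Thread(target=worker, args=(fifo, publish))
    t1.start()
    t2.start()
    t1.join()
    t2.join()


# Usage: a fake sensor producing 0..4 and a "middleware" that is just a list.
samples = iter(range(5))
published = []
run_log_domain(lambda: next(samples), published.append, 5)
```

The FIFO decouples the two threads, so a slow consumer (for instance a disk write) does not delay reading from the sensor.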
-
Title and Content
Logging system
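[Figure: log computer. CAN bus 1 and CAN bus 2 connect to the CAN log domain; camera 1 and camera 2 connect via a switch to the camera log domain; lidar 1 and lidar 2 connect via a switch to the lidar log domain. The log domains connect to a publish/subscribe middleware, and data is written to disk.]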
-
Title and Content
• An application framework is needed for the applications to be based on
− abstraction layer so the underlying OS can be switched out
− an application consists of an init method, a step function, and an execution profile such as periodic execution
− upon startup, all applications start and the OS schedules them
Base software
-
Title and Content
• The phase between applications may change at each startup
• The behavior of the system can, but most likely will not, be slightly different each time it runs
• Determinism is important so test results are meaningful
Base software
-
Title and Content
• A small PIC processor reads the seat heater knob
− each knob position corresponds to a resistance
• A digital code shall be generated to an ASIC that controls the heater
Seat heater controller
[Figure: seat heater knob with positions 1 to 4, connected to the PIC, which is connected to the ASIC]
-
Title and Content
• Best practice requirements on code
− no interrupts, or show that interrupt bursts are properly handled
− code guidelines: no dynamic memory, no pointers
− consistency checks in code, e.g., checksums on permanent data structures, and variables originating from different code paths that shall match
− use of watchdog
Seat heater controller
-
Title and Content
• How to log many (40+) CAN buses in HIL?
• How to perform response time analysis on logged CAN traffic?
• How to ensure time sync between log domains?
• How to ensure synchronized execution of applications using the application framework?
• What is a good sw architecture for the seat heater problem?
Real-time problems to ponder
-
Title and Content
• Given:
− Around 40 CAN buses
− up to 10 meters apart (the CAN specification limits how long stubs can be)
− the CAN frames shall be synchronized in time
• Solution 1
− centralized recording
− time synchronization by the recorder’s clock
• Solution 2
− decentralized recording
− time synchronization must be solved by some means
Test automation framework and HIL testing
-
Title and Content
• CAN frames are sent on a bus
• Several CAN controllers can connect to the bus
• In the arbitration phase, it is determined which CAN controller may continue to send its frame
• All CAN controllers read and receive the frame at (roughly) the same time
Test automation framework and HIL testing
-
Title and Content
Test automation framework and HIL testing
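[Figure: CAN synchronization. A sender of synch messages sends a message with payload 123 on the bus. Two CAN controllers, each attached to its own computer, receive it. One computer records timestamp: 99, channel: 1, payload: 123; the other records timestamp: 100023, channel: 1, payload: 123.]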
-
Title and Content
• Now we know that the first CAN controller’s timestamp 99 refers to the same point in time as the second CAN controller’s timestamp 100023
• A computer program can thus
− get all CAN frames
− wait for the same synch message from each CAN controller
− form a global time
− sort buffered frames
− repeat
Test automation framework and HIL testing
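The merge procedure listed above can be sketched like this. It is a simplified illustration: the `Frame` type, the `find_sync` and `merge` helpers, and the choice of controller 1's clock as the global time are assumptions for the example. The timestamps 99 and 100023 and the payload 123 are the ones from the slide. Clock drift between synch messages is ignored here, which is why the real procedure repeats for every new synch message.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    timestamp: int  # local timestamp of the receiving CAN controller
    channel: int
    payload: int


def find_sync(frames, sync_payload):
    """Return the first frame carrying the synch message payload."""
    return next(f for f in frames if f.payload == sync_payload)


def merge(frames_by_controller, sync_payload, reference):
    """Map all frames to the reference controller's clock and sort them."""
    ref_sync = find_sync(frames_by_controller[reference], sync_payload).timestamp
    merged = []
    for controller, frames in frames_by_controller.items():
        # Both controllers saw the synch message at (roughly) the same instant,
        # so the difference of their timestamps is the clock offset.
        offset = find_sync(frames, sync_payload).timestamp - ref_sync
        for frame in frames:
            merged.append((frame.timestamp - offset, controller, frame))
    merged.sort(key=lambda entry: (entry[0], entry[1]))
    return merged


# Buffered frames per controller; the payload 123 frame is the synch message.
buffered = {
    1: [Frame(99, 1, 123), Frame(150, 1, 7)],
    2: [Frame(100023, 1, 123), Frame(100050, 1, 8)],
}
merged = merge(buffered, sync_payload=123, reference=1)
global_order = [(time, controller) for time, controller, _ in merged]
```

With these numbers the offset of controller 2 is 100023 - 99 = 99924, so its frame stamped 100050 lands at global time 126, before controller 1's frame at 150.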
-
Title and Content
• The analysis program performs a worst-case response time analysis on each message, based on published work co-authored by Reinder J. Bril (Davis, Burns, Bril, and Lukkien): Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised, 2007
• Message properties are taken from CAN databases, and which CAN frames are actually used is taken from real logs
• Outputs a list of potential response time problems
Test automation framework and HIL testing
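To give a feel for what such an analysis computes, here is a sketch of a simple sufficient (pessimistic) fixed-point test of the kind discussed in that line of work. It is not necessarily what the Scania program implements. Assumptions in this example: deadlines equal periods, all times are in bit times, blocking is taken as the larger of the longest lower-priority frame and the message's own frame, and the message set and the helper names `ceil_div`, `response_times`, and `potential_problems` are made up for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def response_times(messages, tau_bit=1):
    """Upper bounds on response times, in bit times.

    messages: list of dicts sorted by priority, highest first, with keys
    name, C (transmission time), T (period), J (queuing jitter).
    """
    results = {}
    for i, m in enumerate(messages):
        higher = messages[:i]
        lower = messages[i + 1:]
        blocking = max([k["C"] for k in lower], default=0)
        start = max(blocking, m["C"])  # pessimistic blocking term
        w = start
        while True:
            w_next = start + sum(
                ceil_div(w + k["J"] + tau_bit, k["T"]) * k["C"] for k in higher
            )
            # Stop at the fixed point, or once the bound exceeds the period.
            done = w_next == w or m["J"] + w_next + m["C"] > m["T"]
            w = w_next
            if done:
                break
        results[m["name"]] = m["J"] + w + m["C"]
    return results


def potential_problems(messages):
    """Names of messages whose bound exceeds their period."""
    bounds = response_times(messages)
    return [m["name"] for m in messages if bounds[m["name"]] > m["T"]]


# Hypothetical message set: 8-byte frames with 11-bit identifiers take at most
# 135 bit times including stuff bits; periods are 5, 10 and 20 ms at 500 kbit/s.
messages = [
    {"name": "A", "C": 135, "T": 2500, "J": 0},
    {"name": "B", "C": 135, "T": 5000, "J": 0},
    {"name": "C", "C": 135, "T": 10000, "J": 0},
]
bounds = response_times(messages)
```

In the real tool the message properties would come from the CAN databases and the set of frames from the logs, as the slide states.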
-
Title and Content
• It works, and since the load is split over several computers it can support logging 40+ CAN buses
• The solution consists of several programs
− synch message sender
− CAN frame receiver
− Merger that sorts all CAN frames
− Logger of sorted CAN frames to file
− Worst-case response time analysis program
• The distributed solution is much more complex than the centralized solution, even if they had the same number of lines of code
Reflections on CAN logging solution
-
Title and Content
• This problem can be split into two problems
− single computer logging
− distributed logging
• Log domains on single computer can rely on the same PC clock for timestamping
• Important to timestamp as close to the source as possible
• After the timestamping is done, the time it takes to save to disk does not matter
Synched time between log domains
-
Title and Content
• In distributed logging, a global time must be established
• There are protocols for this
− NTP can achieve millisecond level synch
− PTP can achieve sub-millisecond or even sub-microsecond level synch
• When a global time is established, the same holds as for a single computer
− timestamp as close to the source as possible
Synched time between log domains
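As an illustration of how such protocols establish a global time, the sketch below shows the standard NTP-style offset and round-trip delay calculation from four timestamps. The function name and the example numbers are made up, and the result is only accurate if the network delay is the same in both directions.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one request/response.

    t0: client clock when the request is sent
    t1: server clock when the request is received
    t2: server clock when the response is sent
    t3: client clock when the response is received
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2  # server clock minus client clock
    delay = (t3 - t0) - (t2 - t1)         # time spent on the network
    return offset, delay


# Example: the server clock is 1000 units ahead, each direction takes 5 units,
# and the server needs 2 units to answer.
offset, delay = ntp_offset_and_delay(t0=100, t1=1105, t2=1107, t3=112)
```

A log computer would then add the estimated offset to its local timestamps, taken as close to the source as possible, to express them in the global time.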
-
Title and Content
• On a high abstraction level there are two options for a sw architecture
− event based
− time triggered
• The most sensible sw architecture for this problem and such a limited CPU is time triggered
− state machine
− time slots
Seat heater
-
Title and Content
• Time slots
− start by resetting free running timer
− do some work
− wait until the timer reaches a specific value
− repeat for next slot
• Slots dedicated for
− sampling
− clocking digital signal
− logic
• Check different counters regularly and rewrite GPIO registers
Seat heater
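The time-slot scheme above can be simulated as follows. This is a Python sketch of the idea, not PIC code: the simulated `FreeRunningTimer`, the slot lengths, and the tick costs of the three work functions are all assumptions for illustration.

```python
class FreeRunningTimer:
    """Simulated free-running timer; every read advances it by one tick."""

    def __init__(self):
        self.value = 0

    def reset(self):
        self.value = 0

    def advance(self, ticks):
        """Model work that consumes a number of timer ticks."""
        self.value += ticks

    def read(self):
        self.value += 1
        return self.value


def run_cycle(timer, slots):
    """Run each slot once: reset the timer, do the work, wait out the slot."""
    trace = []
    for name, work, slot_length in slots:
        timer.reset()
        work(timer)
        while timer.read() < slot_length:
            pass  # busy-wait until the slot's timer value is reached
        trace.append((name, timer.value))
    return trace


# Hypothetical work functions; each just consumes some ticks.
def sample_knob(timer):
    timer.advance(7)


def clock_out_bit(timer):
    timer.advance(3)


def run_logic(timer):
    timer.advance(12)


SLOTS = [
    ("sampling", sample_knob, 50),
    ("clocking", clock_out_bit, 20),
    ("logic", run_logic, 30),
]
trace = run_cycle(FreeRunningTimer(), SLOTS)
```

Each slot ends at its fixed timer value no matter how long the work took, as long as the work fits in the slot. That is what makes the timing of the digital signal predictable without interrupts.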
-
Title and Content
• Solution in user-space
− one application must be aware of all other applications; call it the ticker, which knows about clock ticks
− each application must wait to be started by the ticker
− each application has a period time that is converted into clock ticks by the ticker
− when the remaining clock ticks reach zero for an application, the ticker releases it
− the synchronization primitives can be, e.g., mutexes and condition variables
Base software application synchronization
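A minimal sketch of the ticker idea, using a condition variable as suggested above. The `Ticker` class, its method names, and the simulated clock ticks are assumptions for illustration. In this sketch the ticker also waits for all released steps to finish before the next tick. That is a design choice made here to keep the example deterministic, not something stated on the slide.

```python
import threading


class Ticker:
    def __init__(self):
        self._cond = threading.Condition()
        self._apps = {}
        self._stopped = False
        self.tick_count = 0

    def register(self, name, period_ticks):
        """Called from an application's init method with its period in ticks."""
        with self._cond:
            self._apps[name] = {"period": period_ticks,
                                "remaining": period_ticks, "go": 0, "done": 0}

    def wait_for_go(self, name, seen):
        """Block at the start of the step function; False means shut down."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._apps[name]["go"] > seen or self._stopped)
            return self._apps[name]["go"] > seen

    def step_done(self, name):
        with self._cond:
            self._apps[name]["done"] += 1
            self._cond.notify_all()

    def tick(self):
        """One clock tick: release every application whose count reaches zero."""
        with self._cond:
            self.tick_count += 1
            for app in self._apps.values():
                app["remaining"] -= 1
                if app["remaining"] == 0:
                    app["remaining"] = app["period"]
                    app["go"] += 1
            self._cond.notify_all()
            # Lock-step: wait until every released step has completed.
            self._cond.wait_for(lambda: all(
                a["done"] == a["go"] for a in self._apps.values()))

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify_all()


def application(ticker, name, log):
    seen = 0
    while ticker.wait_for_go(name, seen):
        seen += 1
        log.append((ticker.tick_count, name))  # the step function body
        ticker.step_done(name)


def run_demo(periods, n_ticks):
    ticker, log, threads = Ticker(), [], []
    for name, period in periods.items():
        ticker.register(name, period)
        thread = threading.Thread(target=application, args=(ticker, name, log))
        thread.start()
        threads.append(thread)
    for _ in range(n_ticks):
        ticker.tick()
    ticker.stop()
    for thread in threads:
        thread.join()
    return sorted(log)


releases = run_demo({"A": 2, "B": 3}, n_ticks=6)
```

With periods of 2 and 3 ticks, application A runs at ticks 2, 4 and 6 and application B at ticks 3 and 6, in every run, so the phase between the applications no longer depends on startup timing.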
-
Title and Content
• Solution in user-space
Base software application synchronization
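[Figure: sequence between application 1 and the ticker. The init method of application 1 sends its period time to the ticker. The application waits at the start of its step function; after a number of clock ticks, the ticker says go! The application then waits at the start of its step function again until, after further clock ticks, the ticker says go!]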
-
Title and Content
• Solution in kernel space
− The operating system must have some mechanism to start applications in a synchronized way
− The operating system must have a notion of period times
− The operating system must have a notion of priority and possibly priority inheritance
Base software application synchronization