DeepPicar: A Low-cost Deep Neural Network-based
Autonomous Car
Michael Bechtel$, Elise McEllhiney$, Minje Kim^, Heechul Yun$
$ University of Kansas, ^ Indiana University Bloomington
End-to-End Deep Learning
• Produce control outputs directly from sensory inputs.
• Simplifies process by bypassing intermediary steps.
Adapted from http://rll.berkeley.edu/deeprlcourse/f17docs/lecture_1_introduction.pdf
DAVE-2
• 2016 project done by NVIDIA.
• Used the End-to-End approach with a Convolutional Neural Network (CNN).
• Could successfully drive a car on public roads.
Source: https://devblogs.nvidia.com/deep-learning-self-driving-cars/
DAVE-2's CNN
• DAVE-2 used a 9-layer CNN to drive their car
  • ~250K weights
  • ~27M connections
  • 3MB large
• Relatively small by today's standards
  • More recent networks have millions of weights and are >100MB large
[Figure: DAVE-2 CNN]
Outline
• Background
• DeepPicar Platform
• CNN Evaluation
• Shared Resource Isolation
• Embedded Platform Comparison
• Conclusions
DeepPicar
• A low-cost, small-scale replication of NVIDIA’s DAVE-2.
• Uses the exact same CNN.
• Runs on a Raspberry Pi 3/4 in real-time.
System Design
[Figure: system diagram. Components: camera, embedded computer, actuator, RC car, portable charger. Connection labels: USB, GPIO, jumper.]
Motor Control
• The RC car has two separate motors: steering and throttle
• Convert the steering angle to a PWM value
• Send a signal to the steering motor with the PWM value
[Figure: steering and throttle motors, connected with jumper wires]
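The angle-to-PWM conversion could be sketched as below. The slides do not give the actual mapping, so everything numeric here is an assumption: a hobby-servo style signal (20 ms period, 1.0 to 2.0 ms pulse) and a ±30 degree steering range. The function name `angle_to_duty_cycle` is hypothetical.

```python
def angle_to_duty_cycle(angle_deg, max_angle_deg=30.0,
                        min_pulse_ms=1.0, max_pulse_ms=2.0, period_ms=20.0):
    """Map a steering angle in degrees to a PWM duty cycle in percent.

    All default values are illustrative assumptions, not DeepPicar's settings.
    """
    # Clamp the request to the assumed mechanical range.
    clamped = max(-max_angle_deg, min(max_angle_deg, angle_deg))
    # Normalize: -max -> 0.0, centered -> 0.5, +max -> 1.0.
    fraction = (clamped + max_angle_deg) / (2.0 * max_angle_deg)
    # Interpolate the pulse width, then express it as a share of the period.
    pulse_ms = min_pulse_ms + fraction * (max_pulse_ms - min_pulse_ms)
    return 100.0 * pulse_ms / period_ms
```

The resulting percentage would then be handed to whatever PWM interface drives the steering motor; that interface is not described in the slides.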
CNN-Based Real-Time Control Loop
Image Collection
• Get images from the camera sensor using OpenCV (read())
• Configured to return a 320x240x3 image frame
  • The network requires 66x200x3 input
Image Preprocessing
• Transform the image's dimensions (also with OpenCV)
resize(): 320x240x3 → 66x200x3
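A minimal sketch of the capture and resize steps with OpenCV follows. One detail worth spelling out: the camera size is quoted as width x height (320x240), while the network input is quoted as height x width (66x200), and `cv2.resize` takes its target size as (width, height). The helper names, the camera index 0, and the lazy `import cv2` are assumptions made for illustration.

```python
CAMERA_WIDTH, CAMERA_HEIGHT = 320, 240   # frame size requested from the camera
NET_INPUT_SHAPE = (66, 200, 3)           # (height, width, channels) the network expects


def resize_target(net_input_shape):
    """Return the (width, height) tuple that cv2.resize expects as dsize."""
    height, width, _channels = net_input_shape
    return (width, height)


def open_camera(index=0):
    """Open a camera and request 320x240 frames (a driver may ignore the request)."""
    import cv2  # imported lazily so resize_target() works without OpenCV installed
    cap = cv2.VideoCapture(index)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, CAMERA_WIDTH)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, CAMERA_HEIGHT)
    return cap


def grab_and_preprocess(cap):
    """Read one frame and resize it to the network input size."""
    import cv2
    ok, frame = cap.read()  # array of shape (240, 320, 3) if the request was honored
    if not ok:
        raise RuntimeError("camera read failed")
    return cv2.resize(frame, resize_target(NET_INPUT_SHAPE))  # shape (66, 200, 3)
```

Any further preprocessing (normalization, color conversion) is not described in the slides and is left out here.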
CNN Inferencing
• Feed the preprocessed image to the network
Output: steering angle (radians)
Output Handling
• Convert network output to degrees
• Control car based on relative value
  • Angle > 15: turn left
  • Angle < -15: turn right
  • Else: go straight
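The thresholding rule above can be written as a small function. This is a sketch: the function name and the string labels are hypothetical, while the radians-to-degrees conversion and the ±15 degree thresholds follow the bullets.

```python
import math


def steering_command(angle_rad, threshold_deg=15.0):
    """Turn the network output (radians) into a discrete steering command."""
    angle_deg = math.degrees(angle_rad)
    if angle_deg > threshold_deg:
        return "left"
    if angle_deg < -threshold_deg:
        return "right"
    return "straight"
```

For example, an output of 0.5 rad (about 28.6 degrees) maps to "left", and an output of 0.1 rad (about 5.7 degrees) maps to "straight".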
CNN on Raspberry Pi 3
• The Pi 3 can run the CNN-based control loop at up to 40 Hz (25 ms per control period).
• CNN inferencing dominates the processing time (>80%).
[Figure: time breakdown]
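A per-stage time breakdown like this one can be collected with a simple timing wrapper. The sketch below is not the authors' measurement code; `timed_control_step` and `frequency_to_period_ms` are hypothetical helpers, and the stages are whatever callables make up one loop iteration (capture, preprocessing, inferencing, actuation).

```python
import time


def frequency_to_period_ms(hz):
    """Convert a control frequency to its period, e.g. 40 Hz -> 25 ms."""
    return 1000.0 / hz


def timed_control_step(stages):
    """Run (name, function) stages in order, feeding each result to the next.

    Returns the final result and a dict of per-stage wall-clock times in seconds.
    """
    timings = {}
    data = None
    for name, func in stages:
        start = time.perf_counter()
        data = func(data)
        timings[name] = time.perf_counter() - start
    return data, timings
```

Dividing each entry of `timings` by their sum gives the share of the period spent in each stage, which is how a figure such as ">80% in inferencing" would be derived.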
Effect of Number of Cores Used
• Performance improves with more cores: from 20 Hz (1 core) to 40 Hz (4 cores).
• But scalability is limited (due to parallelization overhead).
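To make "limited scalability" concrete, the two data points above give a 2x speedup on 4 cores, i.e. 50% parallel efficiency. The arithmetic, with hypothetical helper names:

```python
def speedup(base_hz, scaled_hz):
    """Throughput ratio between the multi-core and single-core runs."""
    return scaled_hz / base_hz


def parallel_efficiency(base_hz, scaled_hz, cores):
    """Speedup divided by core count; 1.0 would be perfect linear scaling."""
    return speedup(base_hz, scaled_hz) / cores


print(speedup(20, 40))                 # 2.0
print(parallel_efficiency(20, 40, 4))  # 0.5
```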
Effect of Multiple Concurrent Models
• CNNs experience modest slowdown (due to interference).
2Nx2C: 2 CNNs each using 2 cores
4Nx1C: 4 CNNs each using 1 core
1Nx1C: 1 CNN using 1 core
1Nx2C: 1 CNN using 2 cores
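The configuration labels can be read mechanically as "number of CNNs x cores per CNN". The sketch below parses a label and hands out disjoint core sets on a 4-core board. How the experiments actually pinned models to cores is not stated in the slides, so `parse_config`, `assign_cores`, and the contiguous core numbering are assumptions.

```python
import re


def parse_config(label):
    """Parse a label such as '2Nx2C' into (number of CNNs, cores per CNN)."""
    match = re.fullmatch(r"(\d+)Nx(\d+)C", label)
    if match is None:
        raise ValueError(f"unrecognized configuration label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def assign_cores(label, total_cores=4):
    """Give each CNN instance its own disjoint set of core IDs."""
    n_models, cores_each = parse_config(label)
    if n_models * cores_each > total_cores:
        raise ValueError("configuration needs more cores than are available")
    return [set(range(i * cores_each, (i + 1) * cores_each))
            for i in range(n_models)]

# On Linux, each process could then pin itself with
# os.sched_setaffinity(0, core_set); that step is omitted here.
```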
Effect of Memory Intensive Co-runners
Co-runners: BwRead (16MB 1D array read), BwWrite (16MB 1D array write)
• CNN can suffer very high (up to 11.6X) slowdown.
• Likely caused by contention in shared hardware resources.
Effect of Co-runners
Isolation Mechanisms
• L2 Cache Isolation: PALLOC (*)
  • Page-coloring based kernel-level memory allocator that partitions the cache by allocating memory pages to disjoint cache sets.
• DRAM Isolation: MemGuard (**)
  • Memory bandwidth reservation system that limits the bandwidth each core gets in a given interval (1 ms).
(*) H. Yun et al., “PALLOC: DRAM Bank-Aware Memory Allocator for Performance Isolation on Multicore Platforms.” RTAS’14
(**) H. Yun et al., “MemGuard: Memory Bandwidth Reservation System for Efficient Performance Isolation in Multi-core Platforms.” RTAS’13
PALLOC
• L2 cache is partitioned using bits 13 and 14.
• Four partitions are created with 4, 3, 2, and 1 colors.
  • 100%, 75%, 50%, and 25% L2 cache space availability.
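Two address bits give four page colors, and a partition that owns k of the four colors can use k/4 of the L2 cache. The sketch below illustrates that arithmetic; it assumes bits 13 and 14 refer to the physical address of a page, and the helper names are hypothetical rather than part of PALLOC's interface.

```python
COLOR_BITS = (13, 14)  # address bits used for coloring, per the slide


def page_color(phys_addr, bits=COLOR_BITS):
    """Extract the page color (0..3 for two bits) from a physical address."""
    color = 0
    for position, bit in enumerate(bits):
        color |= ((phys_addr >> bit) & 1) << position
    return color


def cache_share(num_colors, total_colors=4):
    """Fraction of the L2 cache available to a partition owning num_colors colors."""
    return num_colors / total_colors


print([cache_share(n) for n in (4, 3, 2, 1)])  # [1.0, 0.75, 0.5, 0.25]
```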
PALLOC cont.
• The CNN workload is insensitive to cache space availability.
[Figure: system diagram with labels DNN, Core1, Core2, Core3, Core4, LLC, and DRAM]
PALLOC cont.
• Cache partitioning is ineffective in protecting the CNN.
  • Using PALLOC provides no benefits.
MemGuard
• CNN performance is sensitive to memory bandwidth.
  • At least 400 MB/s required for ideal performance.
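To get a feel for what a 400 MB/s reservation means per 1 ms regulation interval, the sketch below converts it into a byte and cache-line budget and models the refuse-until-replenished behavior with a toy counter. It assumes 1 MB = 10^6 bytes and 64-byte cache lines, and it is a user-space illustration with hypothetical names, not MemGuard's kernel implementation.

```python
def budget_per_interval(bandwidth_mb_s, interval_ms=1.0, line_bytes=64):
    """Return (bytes, cache lines) a core may transfer in one regulation interval."""
    bytes_allowed = bandwidth_mb_s * 1_000_000 * interval_ms / 1000
    return bytes_allowed, bytes_allowed / line_bytes


class BandwidthBudget:
    """Toy per-core budget: accesses are refused once the interval's budget is spent."""

    def __init__(self, lines_per_interval):
        self.limit = lines_per_interval
        self.used = 0

    def access(self, n_lines):
        """Return True if the accesses fit the remaining budget, else False (throttled)."""
        if self.used + n_lines > self.limit:
            return False
        self.used += n_lines
        return True

    def new_interval(self):
        """Replenish the budget at the start of the next interval."""
        self.used = 0
```

Under these assumptions, 400 MB/s corresponds to 400,000 bytes, or 6,250 cache-line transfers, per 1 ms interval.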
MemGuard cont.
• Performance improves when co-runner bandwidths are limited.
  • Using MemGuard is very beneficial.
[Figure labels: solo; CNN bandwidth: 1000 MB/s]
Embedded Platform Comparison
• Does the CNN behave similarly on other platforms?
• We test the NVIDIA TX2 with and without its GPU.
• Three experiments were replicated on the other platforms.
Comparison of Multicore Experiments
• The CNN scales on all platforms except for TX2 (GPU).
• Scalability is still limited.
Comparison of Multimodel Experiment
• All platforms experience some interference.
• Slowdown is tolerable on all platforms.
Comparison of Co-runners
• Worst-case performance is bad on all platforms.
• But the Pi 3 is especially bad for memory write co-runners.
Conclusions
• DeepPicar Platform
  • Low-cost replication of the DAVE-2 autonomous car.
  • Runs the same CNN in real-time on a Raspberry Pi 3.
• Real-time CNN inferencing
  • Feasible on embedded multicore platforms.
  • Multiple CNNs can be co-scheduled.
  • Caution must be taken regarding interference.
• Shared Resource Isolation
  • L2 cache partitioning had no benefits.
  • Limiting core memory bandwidths was very effective.
Thank you
Disclaimer:
This research has been supported by the National Science Foundation (NSF) under the grant number CNS 1815959, and the National Security Agency (NSA) Science of Security Initiative. The Titan Xp and Jetson TX2 used for this research were donated by the NVIDIA Corporation.