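Machine Learning-Based AD/ADAS and Consumer Electronics Solutions

Xilinx Technology Day

Yuan Gang (原钢), AI Solutions Marketing Specialist, Xilinx
March 19, 2009

Xilinx Machine Learning-Based Automated Driving and ADAS Solutions

Xilinx ADAS Market Leadership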

Over 12 Years Semi Supplier Heritage

CY13 - CY17 > 60% CAGR

40M+ cumulative units shipped

14 Makes - 29 Models

9 Makes - 13 Models

Applications Already Covered by Xilinx Solutions

Auto Trailer Hitch

Full Display Mirror

Surround View

Front Camera - Mono

EV Car Charger System

Heads Up Display

Driver Monitoring System

LiDAR

Front Camera - Stereo

Xilinx Expands Its Automotive-Grade (XA) Product Family

Hardware Programmability Enables Higher-Performance Architectures

for (i = 0; i < num; ++i) {
    classification_process();
    hashing_process();
    encryption_process();
}

GPU Implementation
˃ Parallelizing: the same kernel runs many times on multiple small cores
˃ To run different applications, the GPU must load a different kernel (unload kernel, load kernel)

FPGA Implementation
˃ Parallelizing and pipelining
˃ No kernel loading/unloading is required to run different applications, thanks to pipelining
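
The contrast between swapping kernels and pipelining can be made concrete with a toy timing model. This is only a sketch: the stage latencies, the kernel-swap cost and both function names are hypothetical and do not describe any measured GPU or FPGA. The pipelined case uses the usual fill-time-plus-bottleneck formula.

```python
def sequential_with_kernel_swaps(num_items, stage_times, swap_cost):
    """Each item runs stage by stage; every stage pays swap_cost to load its kernel."""
    per_item = sum(t + swap_cost for t in stage_times)
    return num_items * per_item


def pipelined(num_items, stage_times):
    """All stages exist side by side; a new item enters every max(stage_times)."""
    if num_items == 0:
        return 0
    fill = sum(stage_times)  # time for the first item to pass through every stage
    return fill + (num_items - 1) * max(stage_times)


# Hypothetical latencies (arbitrary time units) for the three loop stages:
# classification, hashing, encryption.
stages = [2, 3, 1]
print(sequential_with_kernel_swaps(4, stages, swap_cost=1))  # 36
print(pipelined(4, stages))  # 15
```

In this model the pipelined total is bounded by the slowest stage, while the kernel-swap total grows with the sum of all stages plus the swap overhead.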

OTA Silicon and Dynamic Function

˃ Dynamic Function eXchange (DFX)
– Use the same FPGA for mutually exclusive functions
– E.g., driver monitoring and valet parking
– Time-multiplexing the hardware requires a smaller FPGA
– System cost and size reduction with fewer silicon chips

˃ OTA Silicon
– OEMs require OTA updates to enable upgradability for new innovation in emerging applications such as automated driving
– We provide software updates just like other SoC vendors, but we can go further by also updating the hardware = OTA Silicon

˃ 2D Object Detection
Vehicle: Car, SUV, Bus…
Pedestrian, Cyclist, Rider
Traffic-sign, Traffic-light

˃ 3D Object Detection

˃ Pose Estimation

˃ Lane Detection

˃ Drivable Space Detection

˃ Semantic Segmentation

Automotive Modules

2D Object Detection

˃ 2D Object Detection
Detection algorithms: SSD, Tiny YOLOv2, YOLOv2, Tiny YOLOv3, YOLOv3, Light-head RCNN, etc.
Datasets: KITTI, Cityscapes, BDD100K, private data, etc.

2D Object Detection

˃ SSD

Dataset: BDD100k and private data
Categories: Pedestrian, Car, Cyclist

GOPs (480*360) | Compress Ratio | mAP (GPU) | FPS (DPU, dual core, ZCU102)
117 | - | 46.8 | -
93.5 | 20% | 46.7 | -
69.7 | 40% | 46.3 | -
61 | 50% | 48.6 | -
46.9 | 60% | 48.1 | -
35.2 | 70% | 49.4 | -
28.9 | 75% | 48.6 | -
17.8 | 85% | 47.5 | -
12.1 | 90% | 46.2 | -
8.7 | 93% | 44.3 | -
6.3 | 95% | 42.7 | ~110 fps
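
The Compress Ratio column is consistent with the fraction of baseline operations removed by pruning. The sketch below assumes that definition (the slide does not state it) and recomputes the ratio from the GOPs column; `compression_ratio` is an illustrative helper name.

```python
def compression_ratio(baseline_gops, pruned_gops):
    """Fraction of the baseline operations removed by pruning (assumed definition)."""
    return 1.0 - pruned_gops / baseline_gops


BASELINE_GOPS = 117.0  # unpruned SSD at 480*360, from the table
for gops in (93.5, 69.7, 35.2, 6.3):
    ratio = compression_ratio(BASELINE_GOPS, gops)
    print(f"{gops:>5} GOPs -> {ratio:.0%} compression")
```

Under this definition the 61 GOPs row works out to roughly 48% rather than the 50% shown, so the slide values appear to be rounded.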

2D Object Detection: Small Object Detection

˃ RefineDet: small pedestrian detection
The original SSD model's performance on the small pedestrian dataset is 24% (mAP)
The RefineDet model's performance is now 31.8% (mAP)

˃ RefineDet Pruning

˃ FPS: baseline 210G 25fps, pruned 9.4G 101fps (ZCU102, triple B4096@330MHz)

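Chart: RefineDet compression, Operations (G) and mAP (%) per pruning step
Pruning step: baseline, 1, 2, 3, 4
mAP (%): 31.8, 31.78, 31.39, 30.27, 27.79
Operations (G), recoverable labels only: 17.7, 9.4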

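2D Object Detection

˃ YOLOv2 performance after compression on customer data

Chart: Pruning speed-up on hardware (2x DPU @ ZU9), YOLOv2 single-class detection on customer data. Series: Operations (G), mAP (%), fps.

Pruning loop | mAP (%) | fps
Baseline | 56.7 | 11.6
1 | 57.9 | 14.4
2 | 58.3 | 18.4
3 | 58.7 | 21.2
4 | 58.1 | 26.8
5 | 57.9 | 28.4
6 | 57.8 | 32.4
7 | 56.9 | 37.2
8 | 56.6 | 41.6
9 | 55.4 | 43.6
10 | 54.4 | 46.4

Operations (G), recoverable labels only: 52, 45, 34, 26.4, 21.2, 16.8, 12.8
Speed-up annotations on the chart: 2.8x, 4x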
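
The 2.8x and 4x annotations can be checked against the fps series. A minimal sketch, assuming the eleven fps labels map in order to the baseline and pruning loops 1 to 10:

```python
# fps labels from the chart, assumed to be ordered baseline, loop 1, ..., loop 10.
FPS_PER_LOOP = [11.6, 14.4, 18.4, 21.2, 26.8, 28.4, 32.4, 37.2, 41.6, 43.6, 46.4]


def speedups(fps_series):
    """Speed-up of every pruning loop relative to the first (baseline) entry."""
    baseline = fps_series[0]
    return [fps / baseline for fps in fps_series]


ratios = speedups(FPS_PER_LOOP)
print(f"loop 6: {ratios[6]:.1f}x, loop 10: {ratios[10]:.1f}x")
```

With that ordering, loop 6 gives about 2.8x and loop 10 gives 4x, matching the two annotations.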

2D Object Detection

˃ YOLOv3 performance after compression

Dataset: Cityscapes
Categories: Pedestrian, Car, Cyclist
Platform: ZCU102, triple B4096@330MHz

GOPs (512*256) | Compress Ratio | mAP (DarkNet) | mAP (DPU) | FPS (DPU)
53.7 | - | 53.7 | 53.1 | 43
24.5 | 54% | 53.7 | 53.7 | 61
17.0 | 68% | 54.0 | 53.4 | 74
13.7 | 75% | 56.1 | 55.4 | 82
10.7 | 80% | 55.4 | 52.9 | 86
7.5 | 86% | 57.0 | 55.3 | 93
5.7 | 89% | 55.2 | 53.0 | 97
4.0 | 93% | 51.2 | 49.3 | 100
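
One way to read this table is to pick the fastest pruned model whose on-device accuracy stays within a chosen budget of the unpruned baseline. The sketch below does that with the table values; the selection rule and the name `fastest_within_budget` are illustrative assumptions, not a procedure described on the slide.

```python
# (GOPs, mAP DarkNet, mAP DPU, FPS DPU), copied from the table.
YOLOV3_ROWS = [
    (53.7, 53.7, 53.1, 43),
    (24.5, 53.7, 53.7, 61),
    (17.0, 54.0, 53.4, 74),
    (13.7, 56.1, 55.4, 82),
    (10.7, 55.4, 52.9, 86),
    (7.5, 57.0, 55.3, 93),
    (5.7, 55.2, 53.0, 97),
    (4.0, 51.2, 49.3, 100),
]


def fastest_within_budget(rows, max_map_drop):
    """Return the highest-FPS row whose DPU mAP is within max_map_drop of the baseline."""
    baseline_map = rows[0][2]  # DPU mAP of the unpruned model
    acceptable = [row for row in rows if row[2] >= baseline_map - max_map_drop]
    return max(acceptable, key=lambda row: row[3])


for budget in (0.0, 0.5, 4.0):
    gops, _, map_dpu, fps = fastest_within_budget(YOLOV3_ROWS, budget)
    print(f"mAP drop <= {budget}: {gops} GOPs, mAP {map_dpu}, {fps} fps")
```

With no allowed drop this selects the 7.5 GOPs model at 93 fps, and allowing 0.5 points of mAP loss moves the choice to the 5.7 GOPs model at 97 fps.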

2D Object Detection

˃ SSD Lite
Backbone: Mobilenet_v2 (ReLU version)
Dataset: BDD100k
Input size: 480*360
Operations: 6.57G
mAP: 32.9
DPU (one core) FPS: 36 (ZU9), 21 (ZU2)

˃ Tiny YOLOv3
Datasets: KITTI, Cityscapes, BDD100K, private data, etc.
Input size: 416*416
Operations: 5.9G
DPU FPS: 170 (ZU9 dual core)

3D Object Detection

˃ 3D Object Detection
Reproduce recent advanced 3D detection methods (F-PointNet and AVOD) that combine LiDAR point cloud and RGB image information
Optimize post-processing

Pose Estimation

˃ Driver Monitoring, Gesture Recognition

˃ Single-Person Pose Estimation (after person detection)
Joints: head, neck, shoulder, elbow, wrist, hip, knee, ankle
Model: CNN with coordinate regression
300k training images, 70k test images, PCKh@0.5: 90.25%

˃ Multi-Person Pose Estimation
The model uses heatmaps to regress the joint locations and the lines between related joints
The model's OKS on the AI Challenger dataset is 0.32609
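
PCKh@0.5 counts a predicted joint as correct when it lies within 0.5 of the head segment length from the ground-truth joint. The sketch below computes it for one person; the coordinates and `head_size` are made-up illustration values, and the function name `pckh` is hypothetical.

```python
import math


def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of joints whose prediction is within alpha * head_size of ground truth."""
    threshold = alpha * head_size
    correct = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        if math.hypot(px - gx, py - gy) <= threshold:
            correct += 1
    return correct / len(gt)


# Made-up example: three joints, head segment length of 8 pixels (threshold = 4).
predicted = [(10, 10), (20, 20), (30, 35)]
ground_truth = [(10, 12), (20, 20), (30, 30)]
print(pckh(predicted, ground_truth, head_size=8))  # two of three joints are within 4 px
```

A dataset-level score such as the 90.25% above would average this over all annotated joints in the test set.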

˃ Motivation: detect lanes even when they are occluded by vehicles

˃ Algorithm: SCNN (left) and VPGNet (right)

˃ Dataset:
SCNN: 9,600 training and 1,300 test images captured from the SCNN dataset
VPGNet: 1,000 training and 200 test images from the Caltech-lane dataset
Input size: SCNN (800x288), VPGNet (640x480)

Lane Detection

˃ VPGNet compression:
Dataset: 960 training and 240 test images captured from different scenes
Evaluation metric: F1 score
Compressed to 10%; performance degrades by 2%

Lane Detection: Pruning

Chart: Operation (G) and F1 score (%) per pruning step
Pruning step: baseline, 1, 2, 3, 4
F1 score (%): 90, 88.9, 88.8, 88.5, 88
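
The F1 score used as the lane-detection metric is the harmonic mean of precision and recall. A minimal sketch, with made-up counts of true positives, false positives and false negatives (how a lane prediction is matched to ground truth is not specified on the slide and is left out here):

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from raw match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 90 correct lanes, 10 spurious detections, 10 missed lanes.
print(f"{f1_score(90, 10, 10):.3f}")
```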

Semantic Segmentation

˃ Semantic Segmentation
Use state-of-the-art algorithms for high performance
Compress large models and try lightweight models to ensure efficiency and performance

Algorithm | Input size | Model backbone | Operations | IoU (%) | FPS @ input size (ZCU9)
WiderRes38 | 1024 * 2048 | wider-Resnet-38 | 10T | 77.68 | -
SegNet | 1024 * 2048 | VGG 16 | 2.4T | 56 | -
FPN-Deephi | 1024 * 2048 | Google_v1 | 136G | 71.25 | -
Deeplabv3+ | 1024 * 2048 | Mobilenet_v2 | 49G | 70.88 | -
ESPNet | 512 * 1024 | - | 9.4G | 63.64 | 21.48 @ 256 * 512
ENet | 512 * 1024 | - | 9.36G | 57.9 | 54.86 @ 256 * 512
FPN-Deephi (light weight) | 256 * 512 | Google_v1 | 9G | 56.45 | 119 @ 256 * 512
Tiny-FPN | 512 * 512 | - | 1.8G | 60.2 | 117 @ 256 * 512
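
The IoU column is the usual segmentation metric: per class, the overlap between predicted and ground-truth pixels divided by their union, averaged over classes. The sketch below computes it from a pixel confusion matrix; the 2-class matrix is a made-up example and the function names are illustrative.

```python
def per_class_iou(confusion):
    """IoU per class from a square confusion matrix (rows: ground truth, cols: prediction)."""
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        ious.append(tp / union if union else float("nan"))
    return ious


def mean_iou(confusion):
    """Mean IoU over the classes that appear in ground truth or prediction."""
    valid = [iou for iou in per_class_iou(confusion) if iou == iou]  # drop NaN
    return sum(valid) / len(valid)


# Made-up pixel counts for two classes.
confusion = [
    [3, 1],
    [1, 5],
]
print(per_class_iou(confusion), mean_iou(confusion))
```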

Semantic Segmentation

˃ Semantic Segmentation

(a) Result of WiderRes38

(b) Result of FPN-Deephi (light weight)

Multi-Task Learning

˃ Multi-task learning
Shared feature extraction backbone
Improve accuracy through model architecture optimization
Multi-task model including 2D box detection, orientation and semantic segmentation (left)
Multi-task model including object detection, lane detection and drivable space detection (right)
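
The shared-backbone idea can be sketched structurally: features are extracted once per image and every task head reuses them. The classes and the toy "backbone" and "heads" below are hypothetical stand-ins for real CNN components, meant only to show the data flow.

```python
class MultiTaskModel:
    """Run one shared backbone, then feed its features to every task head."""

    def __init__(self, backbone, heads):
        self.backbone = backbone
        self.heads = heads  # dict: task name -> callable taking the shared features

    def __call__(self, image):
        features = self.backbone(image)  # computed once per image
        return {task: head(features) for task, head in self.heads.items()}


class ToyBackbone:
    """Stand-in for a CNN feature extractor: returns the sum of each image row."""

    def __init__(self):
        self.calls = 0

    def __call__(self, image):
        self.calls += 1
        return [sum(row) for row in image]


backbone = ToyBackbone()
model = MultiTaskModel(
    backbone,
    {
        "detection": lambda features: max(features),
        "segmentation": lambda features: [value > 3 for value in features],
    },
)
outputs = model([[1, 2], [3, 4]])
print(outputs, "backbone calls:", backbone.calls)
```

Because the backbone runs once regardless of the number of heads, adding a task costs only the extra head.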

Multi-Task Learning: Pruning

˃ Multi-task: 2D box detection, orientation and semantic segmentation
Dataset: BDD100k (train: 6967, test: 988)

Network | Input size | Compression ratio | Detection: mAP (IOU>0.5) | Segmentation: mIOU | Operations
VGG | 288x512 | Rate: 0 | 29% | 46.9% | 106.5G
Resnet50 | 480x640 | Rate: 0 | 42.4% | 48.3% | 72.7G
Resnet50 | 480x640 | Rate: 0.5 | 42.1% | 47.1% | 34.2G
Resnet50 | 480x640 | Rate: 0.6 | 40.5% | 45.8% | 27.5G
Resnet50 | 480x640 | Rate: 0.8 | 32.0% | 39.6% | 22.3G
Resnet18 | 480x640 | Rate: 0 | 26.5% | 39.9% | 27.7G
Resnet18 | 480x640 | Rate: 0.5 | 24.2% | 37.0% | 14.0G
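
The detection column reports mAP at IOU>0.5, meaning a predicted box only counts as a match when its intersection-over-union with a ground-truth box exceeds 0.5. A minimal sketch of that overlap test, assuming boxes in (x1, y1, x2, y2) corner format (the format is an assumption, and the function names are illustrative):

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when the prediction overlaps the ground truth by more than the threshold."""
    return box_iou(pred_box, gt_box) > threshold


print(is_match((0, 0, 10, 10), (0, 0, 10, 8)))   # IoU 0.8
print(is_match((0, 0, 10, 10), (5, 0, 15, 10)))  # IoU 1/3
```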

Multi-Task Learning: Pruning

˃ Multi-task: object detection, orientation, lane detection and drivable space detection
Dataset: BDD100k (train: 6967, test: 988)

Network | Input size | Compression ratio | Detection: mAP (IOU>0.5) | Segmentation: mIOU | Operations
VGG | 288x512 | Rate: 0 | 34.51% | 57.43% | 103.5G
VGG | 288x512 | Rate: 0.4 | 31.62% | 56.35% | 60.9G
VGG | 288x512 | Rate: 0.6 | 31.51% | 55.42% | 40.8G
Resnet18 | 288x512 | Rate: 0 | 24.80% | 56.26% | 13.8G
Resnet18 | 288x512 | Rate: 0.4 | 24.00% | 54.83% | 7.6G
Resnet18 | 288x512 | Rate: 0.5 | 23.27% | 54.30% | 6.3G
Resnet18 | 288x512 | Rate: 0.6 | 23.42% | 53.58% | 5.1G
Resnet50 | 288x512 | Rate: 0 | 35.55% | 58.81% | 34.1G
Resnet50 | 288x512 | Rate: 0.4 | 35.55% | 58.09% | 18.9G
Resnet50 | 288x512 | Rate: 0.5 | 35.29% | 57.61% | 15.7G
Resnet50 | 288x512 | Rate: 0.6 | 33.41% | 56.80% | 12.5G

Multi-Task Learning: Deployment on ZCU102

˃ 1CH multi-task model
Platform: ZU9
Network: ResNet 18 + 2D box detection, orientation and semantic segmentation
Input size: detection 480 * 360
Operation: detection 27.7G
FPS: ~29 fps

Existing Customer Cases

Major application | Functions | Device | CNN Demands | Target Perf.
Front camera | 2D object detection & classification | Zynq7020, ZU2/3/4/5 | Yolo and Tiny Yolo, SSD, ResNet, Mobilenet v2 | 5FPS ~ 15FPS
Front camera | Semantic Segmentation | Zynq7020, ZU3/4 | SegNet, FPN, ENet, ESPNet | 5FPS ~ 15FPS
SurroundView & Parking | Multi-channel object detection | ZU5/ZU9 | Yolo, SSD, Light-head RCNN | 10FPS/CH ~ 30FPS/CH
LiDAR | Object detection | ZU3 | SegNet, AVOD, F-PointNet | 15FPS ~ 25FPS
L2-L4 ECU | 2D and 3D object detection | ZU9/ZU11 | Yolo and Tiny Yolo, SSD; Complex Yolo | 10FPS/CH ~ 30FPS/CH
L2-L4 ECU | Semantic Segmentation | ZU9/ZU11 | SegNet, FPN, ENet, ESPNet | 10FPS/CH
Driver Monitoring | Pose Estimation | ZU9 | OpenPose | 20FPS

ADAS Domain Controller - Cameras

Rear Camera *1 Surround view fisheye camera *4

Front Camera *2

Front Cam (near): Detection & Segmentation

D Mode

R Mode

Fisheye Cam(1CH) @ turning: Segmentation

Fisheye Cam(4CH): Segmentation

Machine Learning-Based Consumer Electronics Solutions

Machine Learning in Consumer Electronics

Drones: obstacle recognition
Smart Appliance: intelligent control
Set Top Box: content recognition
Multi-function Printer: quality enhancement
Projector: quality enhancement, SR
Camcorder: scenario recognition

Why Xilinx?

Software programmability of an ARM®-based processor with the hardware programmability of an FPGA

Easy to design single-chip solution

Programmable hardware for diverse interface

Fusion of multi-function

Conceptual Zynq-Based Consumer Electronics System Architecture

Block diagram components:
PS: Dual-core A9
Peripherals
USB2.0 x2
GPIOs
SPIs/I2Cs
DVP Interface
(Motor Control)
Display Controller
MIPI DSI
UI LCD
POD Motors (x3)
Motors, POD & others
QSPI FLASH
DDR3 (x2)
POD Camera
Face Camera
AXI Bus Fabric

Adaptable. Intelligent.
