Data Science for Connected Vehicles

The Data Science behind Predictive Maintenance in Connected Vehicles. Esther Vasiete, Srivatsan Ramanujam, Pivotal Data Science. Data Engineers Guild Meetup, June 21, 2016.


How can these connected devices in our home be smart enough to make daily life easier?

How does this ... become this?

By recognizing this

And by processing this

Sensors + Other Unstructured Data

How can we know a tree has fallen on a power line before the residents complain?

How can we use data to help prevent accidents like the Macondo Disaster?

Gene Sequencing

Smart Grids

COST TO SEQUENCE ONE GENOME HAS FALLEN FROM $100M IN 2001 TO $10K IN 2011 TO $1K IN 2014

READING SMART METERS EVERY 15 MINUTES IS 3000X MORE DATA INTENSIVE

Stock Market

Social Media

FACEBOOK UPLOADS 250 MILLION PHOTOS EACH DAY

In all industries billions of data points represent opportunities for the Internet of Things

Oil Exploration

Video Surveillance

OIL RIGS GENERATE 25000 DATA POINTS PER SECOND

Medical Imaging

Mobile Sensors

To realize this opportunity requires the right tools and techniques

Problem Formulation

Modeling Step

Data Step

Apps Step

Data Lake

Ingest

Business Levers

Dashboard/App

PL/X

Modeling
• Data cleaning
• Data Exploration
• Feature Engineering

Model Validation

Feedback loop for continuous model improvement

Driver and Vehicle Metadata

Data Ingestion Platform


Data to Apps

Data Science Use-cases for connected cars


Data Science Use-Cases


● Predictive Car Maintenance
  ‒ More accurately predict part failure
  ‒ Optimize part repair and replacement schedule
● Leveraging Driving Behaviour
  ‒ Useful to differentiate insurance pricing based on driving style
  ‒ Optimize car design
● Improving GPS Systems
  ‒ Establish baseline for traffic congestion
  ‒ Create more meaningful metrics for routing
  ‒ Infer public transportation effects on traffic
  ‒ Predict how long incidents would take to clear
● Predictive Power for Assistance Systems
  ‒ Optimize fuel efficiency
  ‒ Predict the future state of a car in the next 2 minutes (starts, stops, emergency braking)
● Traffic Light Assistance
  ‒ Signal timing of traffic lights
  ‒ Crowd sourcing of traffic signals
  ‒ Optimize traffic light patterns to reduce congestion

Preventive Maintenance for Connected Vehicles


On-Board Diagnostics

Diagnostic Trouble Codes (DTC)

Unscheduled repairs

AB1029 – Power steering pump replacement

CT3408 – Wheel alignment

Solving the preventive maintenance problem

Automakers

Customer Satisfaction

Auto Repairs

Data Sources for Predictive Maintenance

Vehicle Data: VIN, Timestamp, DTC Code, Odometer, Speed, Acceleration, Engine Temperature, Engine Torque, GPS Coordinates, etc.

Car Repairs Data: VIN, Date vehicle in, Date vehicle out, Repair code, Parts replaced, Warranty claims, Repair Comments
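Both sources share the VIN, so one plausible way to combine them is to collect, for each repair, the DTCs the same vehicle reported shortly before it came in. The sketch below is only an illustration under stated assumptions: the record classes, the function name `dtcs_before_repairs`, the 30-day lookback window and the sample values are all hypothetical and do not come from the talk.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DtcEvent:
    """One row of vehicle data (hypothetical, reduced to three fields)."""
    vin: str
    day: date
    dtc_code: str


@dataclass(frozen=True)
class Repair:
    """One row of car repairs data (hypothetical, reduced to three fields)."""
    vin: str
    date_in: date
    repair_code: str


def dtcs_before_repairs(events, repairs, lookback_days=30):
    """For each repair, collect the DTC codes seen on the same VIN in the
    window [date_in - lookback_days, date_in). The window length is an
    assumption made for this sketch."""
    by_vin = defaultdict(list)
    for ev in events:
        by_vin[ev.vin].append(ev)

    rows = []
    for rep in repairs:
        start = rep.date_in - timedelta(days=lookback_days)
        codes = sorted(
            ev.dtc_code
            for ev in by_vin[rep.vin]
            if start <= ev.day < rep.date_in
        )
        rows.append((rep.vin, rep.repair_code, codes))
    return rows


# Illustrative values only.
events = [
    DtcEvent("VIN001", date(2016, 5, 20), "P0301"),
    DtcEvent("VIN001", date(2016, 6, 1), "B1342"),
    DtcEvent("VIN001", date(2016, 1, 5), "U0100"),  # outside the window
    DtcEvent("VIN002", date(2016, 6, 2), "C0035"),
]
repairs = [
    Repair("VIN001", date(2016, 6, 10), "AB1029"),
    Repair("VIN002", date(2016, 6, 10), "CT3408"),
]

rows = dtcs_before_repairs(events, repairs)
for row in rows:
    print(row)
```

Each resulting row pairs a repair code (the label) with the DTCs that preceded it (the raw material for features).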

Predicting Job Type from Diagnostic Trouble Codes (DTCs)

[Figure: timeline of DTC observations (categories B, P, C, U) between repair visits labeled with the job types Transmission, Engine and Regular check.]

Can the DTCs observed here predict this Job Type?
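The timeline groups DTCs by their leading letter. In the OBD-II convention these letters stand for Powertrain (P), Body (B), Chassis (C) and Network (U). A minimal sketch of turning the DTCs observed in one window into a fixed-length feature vector follows; the function name and the choice of simple counts as features are assumptions for illustration, not the features used in the talk.

```python
from collections import Counter

# OBD-II system letters: Powertrain, Body, Chassis, Network.
DTC_CATEGORIES = ("P", "B", "C", "U")


def dtc_category_counts(dtc_codes):
    """Count DTCs per category and return the counts in DTC_CATEGORIES order.
    Codes starting with any other letter are ignored."""
    counts = Counter(code[0].upper() for code in dtc_codes if code)
    return [counts.get(cat, 0) for cat in DTC_CATEGORIES]


# Illustrative window of DTCs observed before one repair visit.
window = ["B1342", "P0301", "B1342", "U0100"]
features = dtc_category_counts(window)
print(dict(zip(DTC_CATEGORIES, features)))
```

Counts are the simplest choice; recency or frequency-weighted variants would fit the same interface.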

Predicting Job Type: a multi-class classification problem

[Figure: Vehicle Features mapped to one of the repair codes DF1210, DF1215, DF2980, AB1029, AB1622, AB1625, AB8622, CT3402, CT3408, CT3560, CT2409.]

Hierarchical Classification Framework

[Figure: Vehicle Features feed a cascade of classifiers over the repair codes, shown in groups: DF1210, DF1215, DF2980; AB1029, AB1622, AB1625, AB8622; CT3402, CT3408, CT3560, CT2409.]
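A two-stage cascade of this kind can be sketched as follows: a first classifier picks the major system, then a classifier trained only on that system's examples picks the specific repair code. Everything beyond that outline is an assumption: the two-letter prefix is treated as the major system, a tiny nearest-centroid classifier stands in for whatever models the speakers used, and the feature vectors are toy data.

```python
from collections import defaultdict


class NearestCentroid:
    """Stand-in classifier: predicts the label whose mean feature vector is
    closest (squared Euclidean distance) to the input."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] += 1
        self.centroids = {
            label: [v / counts[label] for v in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: dist(self.centroids[lab]))


def major_system(repair_code):
    # Assumption: the two-letter prefix identifies the major system.
    return repair_code[:2]


class TwoStageClassifier:
    """Stage 1 predicts the major system; stage 2 predicts the repair code
    using a classifier trained only on that system's examples."""

    def fit(self, X, codes):
        systems = [major_system(c) for c in codes]
        self.stage1 = NearestCentroid().fit(X, systems)
        grouped = defaultdict(lambda: ([], []))
        for x, code, system in zip(X, codes, systems):
            grouped[system][0].append(x)
            grouped[system][1].append(code)
        self.stage2 = {
            s: NearestCentroid().fit(xs, ys) for s, (xs, ys) in grouped.items()
        }
        return self

    def predict(self, x):
        system = self.stage1.predict(x)
        return system, self.stage2[system].predict(x)


# Toy feature vectors, invented for this sketch.
X = [[5, 0, 0, 0], [4, 1, 0, 0], [0, 5, 0, 0],
     [0, 4, 1, 0], [0, 0, 5, 0], [0, 0, 4, 1]]
codes = ["DF1210", "DF1215", "AB1029", "AB1622", "CT3402", "CT3408"]

clf = TwoStageClassifier().fit(X, codes)
print(clf.predict([0, 0, 4, 1]))
print(clf.predict([5, 0, 0, 0]))
```

One consequence of the top-down design is that a stage 1 mistake cannot be corrected in stage 2, so the major-system classifier deserves the most validation effort.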

Model Parallelism

One or more jobs on the same day

Multi-labeling problem

One-vs-rest classifiers built in parallel

[Figure: binary 0/1 label indicators for Class 1, Class 2 and Class 3.]

One-vs-Rest Classification

Red vs. Non Red: on Segment 1

Green vs. Non Green: on Segment 2

Blue vs. Non Blue: on Segment N
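Because each "label vs. rest" classifier only needs its own column of 0/1 targets, the classifiers can be trained independently and therefore in parallel. The sketch below illustrates that idea under stated assumptions: worker threads stand in for the segments on the slide, a two-centroid rule stands in for the actual binary model, and the data and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def mean(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_binary(X, y):
    """Fit one 'label vs. rest' model: centroids of positives and negatives.
    Assumes every label has at least one positive and one negative sample."""
    pos = mean([x for x, t in zip(X, y) if t == 1])
    neg = mean([x for x, t in zip(X, y) if t == 0])
    return pos, neg


def predict_binary(model, x):
    pos, neg = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, neg))
    return 1 if d_pos < d_neg else 0


def train_one_vs_rest(X, Y, labels, max_workers=4):
    """Y[i][j] is 1 when labels[j] applies to sample i (multi-label)."""
    def fit(j):
        return train_binary(X, [row[j] for row in Y])

    # Each label is fitted by its own task; results keep label order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        models = list(pool.map(fit, range(len(labels))))
    return dict(zip(labels, models))


def predict_labels(models, x):
    """Return every label whose binary classifier fires for x."""
    return [label for label, m in models.items() if predict_binary(m, x)]


# Toy data, invented for this sketch.
labels = ["Class 1", "Class 2", "Class 3"]
X = [[3, 0, 0], [0, 3, 0], [0, 0, 3], [3, 3, 0], [0, 3, 3]]
Y = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]

models = train_one_vs_rest(X, Y, labels)
print(predict_labels(models, [3, 3, 0]))
print(predict_labels(models, [0, 0, 3]))
```

Unlike a single multi-class model, this setup can return several labels for one sample, which matches the case of more than one job on the same day.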

Key Takeaways

• Predictive maintenance problems are challenging because DTC signals are not always symptomatic of an ensuing repair.

• Given the hierarchical nature of repair codes, we built a two-stage hierarchical classification framework comprising a top-down cascade of classifiers.

• Major system jobs can be predicted ahead of the repair date.

Reference Architecture

[Figure: reference architecture. Components shown: vehicle data (streaming); Rabbit/Kafka source; connector; Spring Cloud Data Flow; Microservices (Spring Boot) exposing /load_model and /score_model; exploratory data analysis & model training; %%publish model info.; web or mobile app dashboard. Two paths are labeled: training (offline) and scoring (online).]
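The split between a load step and a score step can be illustrated with a small in-memory analogue. This is not the Spring Boot service from the talk: the class, the JSON payload layout and the logistic scoring rule are all assumptions made for this sketch, chosen only to show how a model published offline could be loaded once and then applied to each streaming record.

```python
import json
import math


class ScoringService:
    """Hypothetical in-memory analogue of /load_model and /score_model."""

    def __init__(self):
        self.model = None

    def load_model(self, payload):
        # payload: JSON text with 'weights' (feature name -> coefficient)
        # and 'bias'. This layout is an assumption, not the talk's format.
        spec = json.loads(payload)
        self.model = (dict(spec["weights"]), float(spec["bias"]))
        return {"status": "loaded", "features": sorted(self.model[0])}

    def score_model(self, message):
        # message: JSON text for one streaming vehicle record.
        if self.model is None:
            raise RuntimeError("no model loaded")
        weights, bias = self.model
        record = json.loads(message)
        z = bias + sum(
            w * float(record.get(name, 0.0)) for name, w in weights.items()
        )
        # Logistic link, assumed here purely for illustration.
        return {"vin": record.get("vin"), "score": 1.0 / (1.0 + math.exp(-z))}


service = ScoringService()
loaded = service.load_model(
    json.dumps({"weights": {"dtc_p": 1.0, "dtc_b": -1.0}, "bias": 0.0})
)
result = service.score_model(
    json.dumps({"vin": "VIN001", "dtc_p": 2, "dtc_b": 2})
)
print(loaded)
print(result)
```

Keeping loading separate from scoring means a retrained model can be swapped in without touching the code that consumes the stream.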