Machine Learning for the Sensored IoT

Post on 22-Jan-2018

Transcript of Machine Learning for the Sensored IoT

H2O.ai Machine Intelligence

Machine Learning for the Sensored Internet of Things

Hank Roark
hank@h2o.ai
@hankroark

Who am I?

▪ Data Scientist & Hacker @ H2O.ai
▪ Lecturer in Systems Thinking, University of Illinois at Urbana-Champaign
▪ John Deere: Research, Software Product Development, High Tech Ventures
  ▪ Lots of time dealing with data off of machines, equipment, satellites, radar, hand sampling, and so on.
  ▪ Geospatial and temporal / time series data, almost all from sensors.
▪ Previously at startups and consulting (Red Sky Interactive, Nuforia, NetExplorer, Perot Systems, a few of my own)
▪ Systems Design & Management, MIT
▪ Physics, Georgia Tech

IoT Data Comes From Lots of Places, Much of it from Sensors

The data is going to be huge, so get ready

Wow, how big is a brontobyte?

This much data will require a fast OODA loop

Many of these models will then be used in control systems

Image courtesy http://www.telecom-cloud.net/wp-content/uploads/2015/05/Screen-Shot-2015-05-27-at-3.51.47-PM.png

Machine Prognostics Use Case

Predicting turbofan remaining useful life from sensor data

Jupyter notebook @ http://bit.ly/1OmdBg7

Many more tips and tricks
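For readers without the notebook at hand, here is a minimal sketch of one common way to build a remaining-useful-life target from run-to-failure data. It assumes each unit is observed until failure, so the label is simply the unit's last recorded cycle minus the current cycle. The function name `add_rul_labels` and the `(unit_id, cycle)` row layout are hypothetical, chosen for illustration, and are not taken from the notebook.

```python
from collections import defaultdict


def add_rul_labels(rows):
    """Attach a remaining-useful-life label to each (unit_id, cycle) row.

    Assumption: every unit runs to failure, so its last observed cycle
    marks the failure point and RUL = last_cycle - cycle.
    """
    last_cycle = defaultdict(int)
    for unit_id, cycle in rows:
        last_cycle[unit_id] = max(last_cycle[unit_id], cycle)
    return [(unit_id, cycle, last_cycle[unit_id] - cycle)
            for unit_id, cycle in rows]


# Toy data: unit 1 fails after 3 cycles, unit 2 after 2 cycles.
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
labeled = add_rul_labels(rows)
for row in labeled:
    print(row)
```

If units are not observed all the way to failure, this labeling does not apply as written and the target needs a different definition.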

Key takeaways for modeling the sensored IoT

• Some sort of signal processing is usually helpful, but can introduce bias
  • Smoothers, filters, frequency domain, interpolation, LOWESS, ... , aka feature engineering or post-processing
  • Knowing a little about the physics of the system will be helpful here
• Validation strategy is important
  • Easy to memorize due to autocorrelation
• Sometimes the simplest things work
  • Treat each observation independently; use time, location, as data elements
• Uncertainty is the name of the game
  • Methods that will report out probabilities are often required (not shown here)
• The data can be big, get ready, it'll be a great ride
  • Scalable tools like H2O will help you model the coming brontobytes of data
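The first two takeaways can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the method used in the talk: a trailing moving average stands in for the smoothers mentioned above, and a split by whole unit stands in for a validation strategy that resists autocorrelation. The names `moving_average` and `split_by_unit`, and the toy data, are hypothetical.

```python
import numpy as np


def moving_average(signal, window):
    """Trailing moving average; output i averages samples i .. i+window-1.

    Returns len(signal) - window + 1 values, so no padding is invented
    at the edges.
    """
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="valid")


def split_by_unit(unit_ids, holdout_fraction=0.25, seed=0):
    """Return boolean (train, holdout) masks that keep each unit whole.

    Adjacent readings from one machine are highly correlated, so a
    row-level random split lets a model memorize its way to a good score.
    Holding out entire units avoids that leak.
    """
    unit_ids = np.asarray(unit_ids)
    rng = np.random.default_rng(seed)
    units = np.unique(unit_ids)
    rng.shuffle(units)
    n_holdout = max(1, int(round(len(units) * holdout_fraction)))
    is_holdout = np.isin(unit_ids, units[:n_holdout])
    return ~is_holdout, is_holdout


# Toy data: four units with three readings each.
unit_ids = np.repeat([1, 2, 3, 4], 3)
train_mask, holdout_mask = split_by_unit(unit_ids)
smoothed = moving_average(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), window=3)
print(smoothed)
print(set(unit_ids[train_mask]), set(unit_ids[holdout_mask]))
```

The window length is itself a modeling choice: a wider window removes more noise but also blurs real changes in the signal, which is one way smoothing can introduce bias.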