Connecting Stuff to Azure (IoT)

Mark Simms, Principal Group Program Manager Azure Customer Advisory Team (AzureCAT) Twitter: @mabsimms Connecting Stuff to Azure And all of the crazy stuff that happens along the way..

description

From cars, to thermostats, through media players and embedded controllers, devices are being connected to the Internet at a furious pace. This session will discuss and demonstrate coding practices from live Azure customers.

Transcript of Connecting Stuff to Azure (IoT)

Page 1: Connecting Stuff to Azure (IoT)

Mark Simms, Principal Group Program Manager, Azure Customer Advisory Team (AzureCAT)

Twitter: @mabsimms

Connecting Stuff to Azure

And all of the crazy stuff that happens along the way..

Page 2: Connecting Stuff to Azure (IoT)

Building an end-to-end Internet of Things (IoT) experience requires careful design and architecture choices.

In this session we are going to design a 1M device IoT solution, and go through the fire hose of choices.

All of these choices will be about balancing conflicting constraints regarding physics and economics.

Most of these choices are made in devices, and adapted to in the cloud.

The Azure Customer Advisory Team (AzureCAT) works with internal and external customers to build out some of the largest applications on Azure.

This presentation is an early preview of an IoT guidance series to be published by Patterns & Practices early in 2015.

This presentation will be an interactive design session.

We’re going to break down the sequence of choices that go into delivering IoT at scale in Azure

This is going to be a densely packed sequence of choices..

Note: all code samples will be pure OSS (Java, Kafka, etc)

Setting the stage

Page 3: Connecting Stuff to Azure (IoT)

What is IoT?

Page 4: Connecting Stuff to Azure (IoT)

Collecting information from lots of devices is cool – telematics

Merging perspectives between devices, systems and humans to build a better understanding of the world around us..

Then tying together insight with action – there lies the promise of IoT.

No really.. What is IoT?

http://en.wikipedia.org/wiki/Internet_of_Things

Page 5: Connecting Stuff to Azure (IoT)

End to End IoT Architecture

Not going to cover these today

Page 6: Connecting Stuff to Azure (IoT)

RFC1149?

https://pbs.twimg.com/media/Ak5nLs2CIAA1ypg.jpg

https://twitter.com/hashtag/ipoac

Page 7: Connecting Stuff to Azure (IoT)

In any IoT application at scale, the needs of the device outweigh the needs of the service.

Devices have to be designed, manufactured, programmed, shipped and provisioned.

Devices consume – battery, heat, network bandwidth, network sockets.

Commercially viable products and services must maintain operational CoGS (cost of goods sold)

Device choices endure – sometimes for decades!

Devices (or why physics always wins)

Page 8: Connecting Stuff to Azure (IoT)

Cost of the “oops” – cloud services

Oops. Inverted a variable assignment. Cost of fixing – edit, commit, push, build, deploy.

Page 9: Connecting Stuff to Azure (IoT)

Cost of the “oops” – hardware design

Oops. Put the piezoelectric buzzer on a general purpose I/O pin, not a PWM (pulse width modulation) pin.

Not enough processor cycles to handle software based PWM / would destroy battery life.

Cost to fix: $100k, including 6 week product ship delay.

Page 10: Connecting Stuff to Azure (IoT)

Choices – What powers the device?

Option: Battery (primary)
Upside: Device can operate in a mobile environment for extended periods of time.
Downside: Device now has a current / wattage budget (CPU cycles are not free). Efficient and safe battery charging requires sophisticated circuitry (you won’t do it in firmware).
Common examples: Mobile brains phones

Option: Battery (secondary)
Upside: Device can sustain function through transient power interrupts.
Downside: Efficient and safe battery charging requires sophisticated circuitry (you won’t do it in firmware). May have to add additional circuitry to run while charging.
Common examples: Laptops

Option: Main power (primary)
Upside: Device can leverage all available computing power (barring thermal constraints).
Downside: Device functionality susceptible to interruption during power supply events.
Common examples: 3D printer

Option: Main power + backup
Upside: Device can leverage all available computing power (barring thermal constraints), and operate at reduced capacity during power events.
Downside: Additional power management circuitry. Need to reduce current load during loss of main power.
Common examples: NEST thermostat
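The "current / wattage budget" for a battery-powered device can be made concrete with a little arithmetic. The sketch below is only an idealized estimate: the duty cycle, current draws and battery capacity are hypothetical numbers, not figures from the session, and real batteries lose capacity to self-discharge and temperature.

```python
def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Time-weighted average draw for a device that alternates between
    an active state and a sleep state."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return active_ma * duty_cycle + sleep_ma * (1.0 - duty_cycle)


def battery_life_hours(capacity_mah, avg_current_ma):
    """Idealized runtime: ignores self-discharge, temperature and voltage sag."""
    if avg_current_ma <= 0:
        raise ValueError("avg_current_ma must be positive")
    return capacity_mah / avg_current_ma


# Hypothetical figures, chosen only to make the arithmetic easy to follow:
# radio and CPU awake 1% of the time at 80 mA, asleep otherwise at 0.02 mA.
avg = average_current_ma(active_ma=80.0, sleep_ma=0.02, duty_cycle=0.01)
hours = battery_life_hours(capacity_mah=2000.0, avg_current_ma=avg)
print(f"average draw: {avg:.4f} mA, runtime: {hours / 24:.0f} days")
```

In this model the awake time dominates the budget, which is why "CPU cycles are not free" on a primary battery.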

Page 11: Connecting Stuff to Azure (IoT)

Choices – What connects the device to cloud services?

Option: Ethernet
Upside: Cheap, easy to install. No hard bandwidth or framing limitations.
Downside: Requires hard wired connection provided by end-user. May require additional configuration or security enhancements to route through firewalls, etc.
Common examples: Industrial PLC (programmable logic controllers)

Option: WiFi
Upside: Readily available on more sophisticated microcontrollers and embedded devices.
Downside: Requires ambient WiFi network, and method of managing security keys and access (including rotation). May require additional configuration or security enhancements to route through firewalls (commercial).
Common examples: NEST thermostat

Option: Cellular
Upside: Self-contained; plug and go.
Downside: Communication heavily metered – cost of operations (CoGS) borne by service operator.
Common examples: 3rd party car data logger

Option: Local (Bluetooth, Zigbee, etc)
Upside: Minimal cost and power requirements.
Downside: Short ranged, require field gateway or other “smart” edge device to proxy connections.
Common examples: iBeacon

Page 12: Connecting Stuff to Azure (IoT)

With the ubiquity of firewalls and NAT (network address translators), cloud services connecting inbound to devices is typically impractical.

If two local devices want to talk to each other, two options:

• Device A connects directly to device B, or vice-versa

• The devices communicate through a secured cloud endpoint (service assisted communication)

Who connects to whom?

Page 13: Connecting Stuff to Azure (IoT)

Messaging and Connectivity

Page 14: Connecting Stuff to Azure (IoT)

LiFX lightbulbs create a mesh network between each other

One lightbulb elects as master, and proxies to WiFi router

Devices shipped from factory with a single GLOBAL PRE-SHARED KEY.

Break one device – break them all.

Remediation options:

Global firmware update. How do the devices “call home” to get firmware updates? At scale there will always be devices behind the update curve.

Don’t make any mistakes in the bootloader for in-field firmware updates. A single RMA (return material authorization) can wipe out the profit from dozens of devices.

Move to provisioned key-per-device. Need to build and manage key infrastructure. Also need to incorporate key rotation (don’t make a mistake here or the device will be “bricked”).

Is there an out-of-band update mechanism (USB?). Is the end-user community amenable to handling firmware updates (industrial, technical vs. mass consumer)?

Peer to peer sounds cool!

http://contextis.com/resources/blog/hacking-internet-connected-light-bulbs/
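The "provisioned key-per-device" option can be sketched as follows. This is one possible approach, not the one used by any product named in the deck: each device key is derived from a master secret that never leaves the provisioning service, so compromising one device reveals only that device's key. The secret, the device IDs and the function names are all hypothetical.

```python
import hashlib
import hmac


def derive_device_key(master_secret: bytes, device_id: str) -> bytes:
    """Derive a per-device key as HMAC-SHA256(master_secret, device_id).
    The device stores only its own key; the master secret stays server-side."""
    return hmac.new(master_secret, device_id.encode("utf-8"), hashlib.sha256).digest()


def sign(device_key: bytes, payload: bytes) -> bytes:
    """Device side: authenticate a payload with the device's own key."""
    return hmac.new(device_key, payload, hashlib.sha256).digest()


def verify(master_secret: bytes, device_id: str, payload: bytes, tag: bytes) -> bool:
    """Cloud side: re-derive the device key and check the tag in constant time."""
    expected = sign(derive_device_key(master_secret, device_id), payload)
    return hmac.compare_digest(expected, tag)


# Hypothetical values for illustration only.
MASTER = b"hypothetical-master-secret-held-by-the-provisioning-service"
bulb_key = derive_device_key(MASTER, "bulb-0001")
tag = sign(bulb_key, b"state=on")
print(verify(MASTER, "bulb-0001", b"state=on", tag))
```

Key rotation is not covered here; it would need a key version carried alongside the device ID, which is exactly the kind of detail the slide warns about getting wrong.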

Page 15: Connecting Stuff to Azure (IoT)

Choices – Let’s connect!

Option: UDP
Upside:
• Simple; datagrams require no framing.
• Efficient on bandwidth metered links.
Downside:
• Impractical to secure channel.
• Need faith or out of band acknowledgement mechanism for reliable transfer.
• Cannot reliably support ordered data streams.
• Challenging to implement return-channel (cloud to device) for commands

Option: TCP/IP
Upside:
• Simple; minimal code footprint for RTOS class devices.
• Can use TLS to secure channel
• Bi-directional channel for notifications and commands
Downside:
• Need to handle framing on both sides of connection (or hard code avoidance of MTU limits from end to end)
• Firewall traversal is challenging

Option: HTTP/S
Upside:
• Straightforward firewall traversal, use of SSL for channel encryption and signing
• Built in framing, can leverage semantic conventions (REST) to publish data
Downside:
• Inefficient for Signal-to-Noise ratio of bytes on wire
• Heavy device stack footprint to implement general purpose HTTP client stack

Option: AMQP, MQTT
Upside:
• Bi-directional channel for notifications and commands
• Efficient use of bandwidth (batching, efficient framing, etc)
Downside:
• Firewall traversal is challenging
• Client stack may not fit on smaller devices
• Evolving standards and implementation levels
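"Need to handle framing on both sides of connection" is the main cost of raw TCP/IP: TCP delivers a byte stream, not messages, so a read may return half a message or several at once. A minimal sketch of one common answer, a length-prefixed frame, is below. The 2-byte big-endian prefix is an assumption made for illustration, not a format from the session.

```python
import struct

HEADER = struct.Struct("!H")  # assumed format: 2-byte big-endian payload length


def frame(payload: bytes) -> bytes:
    """Sender side: prefix the payload with its length."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length prefix")
    return HEADER.pack(len(payload)) + payload


class Deframer:
    """Receiver side: buffers arbitrary chunks and yields whole payloads."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every payload completed so far."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


# Two messages arriving in awkward 3-byte pieces, as a TCP read might deliver them.
stream = frame(b"temp=21") + frame(b"rpm=3000")
deframer = Deframer()
received = []
for i in range(0, len(stream), 3):
    received.extend(deframer.feed(stream[i:i + 3]))
print(received)
```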

Page 16: Connecting Stuff to Azure (IoT)

Choices – Let’s encode!

Option: XML
Upside:
• You have more money than you know what to do with. Enjoy another mojito on your yacht.
Downside:
• Extremely inefficient for both serialization/deserialization time and wire encoding.

Option: JSON
Upside:
• Self-describing (“tagged”) format requiring no type identifiers. Readable by convention.
Downside:
• Need to handle framing on both sides of connection (or hard code avoidance of MTU limits from end to end)
• Firewall traversal is challenging

Option: Tagged / Untagged “standard” Binary (Protobuf, Thrift, etc)
Upside:
• Highly efficient wire protocol with broad range of encoder bindings for various languages
• Can use common IDL (definition) to generate device and cloud code
• Built in support for protocol versioning
Downside:
• Implementation may not be compatible with RTOS class device BSP (board support packages)
• Until you’ve lived through the mistake, you probably won’t use the versioning features.

Option: Custom Binary (roll your own)
Upside:
• You can put “wrote yet another custom protocol” on your resume
• High degree of control over bit packing, ordering, etc.
• Can support any device.. Since you wrote it for that device
Downside:
• Very few implementations use code generation from a common definition (result -> divergent implementations with subtle differences)
• Rarely incorporate version management, self-describing type and version fields, rich variable support (arrays, maps, etc)
• Take on a life of their own, generating support burdens with inertia
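To give a feel for the wire-size trade-off between a self-describing format and a packed binary one, the sketch below encodes the same reading both ways. The record, its field names and the binary layout are hypothetical. The binary layout carries a leading version byte, the feature the slide says custom protocols rarely include.

```python
import json
import struct

# Hypothetical telemetry record; field names and layout are illustrative only.
reading = {"device_id": 1042, "timestamp": 1414800000, "speed_kph": 87.5, "rpm": 3150}

# Compact JSON: every message repeats the field names on the wire.
json_bytes = json.dumps(reading, separators=(",", ":")).encode("utf-8")

# Packed layout in network byte order, no padding:
# version (uint8), device_id (uint32), timestamp (uint32), speed (float32), rpm (uint16)
RECORD_V1 = struct.Struct("!BIIfH")
binary_bytes = RECORD_V1.pack(
    1, reading["device_id"], reading["timestamp"], reading["speed_kph"], reading["rpm"]
)


def decode(data: bytes) -> dict:
    """Check the version byte first so a future layout can be told apart."""
    version = data[0]
    if version != 1:
        raise ValueError(f"unsupported record version {version}")
    _, device_id, timestamp, speed_kph, rpm = RECORD_V1.unpack(data)
    return {"device_id": device_id, "timestamp": timestamp,
            "speed_kph": speed_kph, "rpm": rpm}


print(len(json_bytes), "bytes as JSON;", len(binary_bytes), "bytes packed")
```

The packed record is 15 bytes, several times smaller than the JSON form, but the reader must already know the layout. That is the support burden the "Custom Binary" row describes.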

Page 17: Connecting Stuff to Azure (IoT)

Enough Choice Lists – Let’s Design!

Page 18: Connecting Stuff to Azure (IoT)

Segment: Commercial or consumer?

Power: Parasitic or battery-assist?

Transport: Cellular or periodic WiFi?

Connection: UDP, TCP/IP, HTTP, etc?

Encoding: custom, binary, json?

Workflows:

• Telematics only (publish data)

• Latent notifications (time for a firmware update) and commands

• Interactive commands (unlock the car)

Design Challenge: Automotive Telematics Interface

Page 19: Connecting Stuff to Azure (IoT)

Building cloud services for IoT requires an understanding of the target device(s), and adapting to their needs

Green field (new) devices allow a broader range of choice – we’re at an inflection point:

• Many extant devices and platforms being retrofitted for IoT connectivity

• Stable investments in protocols, encoding approaches

• Highly sophisticated system-on-chip designs emerging (relaxing processing efficiency constraints)

• More ubiquitous / cheaper networking options (cellular data chips and plans more available – efficiency for long-term CoGS still crucial)

Recap: devices drive choices, cloud follows

Page 20: Connecting Stuff to Azure (IoT)

End to End IoT Architecture

Page 21: Connecting Stuff to Azure (IoT)

Building a cloud gateway; responsibilities:

• Scalability. Connections are a metered resource in a shared environment. In Azure each SLB (cloud service) will handle 60k-80k concurrent TCP/IP sockets.

• Security (authz/authn). Validate that connecting devices are allowed and trusted to send information.

• Connection affinity and command routing. How to route commands and notifications from other devices and entities down to specific devices.

• Protocol / encoding translation. Sparse/packaged network formats may not be optimal for hot-path (workflow processing, stream analytics) and cold-path (bulk analytics). May need to convert older protocols into canonical formats.

• Routing. Enrich incoming data streams with context (per-device, per-system), and route messages to the appropriate downstream consumers.

• System telemetry. How many devices are connecting, status, resources, errors.

• Load shedding / shock absorption. What happens when everybody decides to reconnect all at once (inrush effect)?

Now – connecting stuff to Azure!
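One way a gateway might absorb the inrush effect is to admit new connections through a token bucket and refuse the excess, leaving those devices to retry later. This is a sketch of that general technique, not the gateway shown in the session. The rate and burst values are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Admit at most `burst` connections at once, refilled at `rate_per_s`."""

    def __init__(self, rate_per_s: float, burst: int, now: float = 0.0):
        self.rate = rate_per_s
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def try_accept(self, now: float) -> bool:
        """Return True if a connection arriving at time `now` is admitted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # shed: the device is expected to back off and retry


# Hypothetical inrush: 1,000 devices reconnect in the same instant.
bucket = TokenBucket(rate_per_s=100, burst=50)
admitted_now = sum(bucket.try_accept(now=0.0) for _ in range(1000))
admitted_later = sum(bucket.try_accept(now=1.0) for _ in range(1000))
print(admitted_now, admitted_later)
```

Shedding only helps if refused devices wait before retrying; otherwise the same burst returns immediately.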

Page 22: Connecting Stuff to Azure (IoT)

What destination resource is encoded on the device?

• Hard coded IP address, factory-set?

• Hard coded IP address, set during provisioning (where does the provisioning information come from?)

• Host name, dynamic lookup via DNS (can I fit a DNS stack on the device)?

What are pros and cons of each approach?

Not so obvious learning moments

Page 23: Connecting Stuff to Azure (IoT)

Imagine an embedded Linux device, periodically publishing telemetry data (files) to a remote service. Using crontab:

0,15,30,45 * * * * /usr/local/bin/publish_data

What’s the problem with this approach?

Everybody.. Go!

Page 24: Connecting Stuff to Azure (IoT)


Delivering on Collection

Challenges and Physics

Synchronized Interval

• Devices publishing data at fixed interval and offset (e.g. every 15 minutes, on the quarter hour – 12:00:00, 12:15:00, etc)

• No guarantee of precise clock synchronization in a highly distributed system

Unsynchronized Interval

• Devices publishing data at fixed interval

• Start on device or application activation

• No guarantee of precise clock synchronization in a highly distributed system

[Figure: two charts with x-axis ticks 1–10 and y-axis scales 0–60 and 0–80; chart images not preserved in the transcript]

This is wasted money (unless you can auto scale in the troughs)
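The cost of synchronized publishing can be illustrated with a small simulation comparing peak load when every device publishes on the quarter hour against devices that each add a random offset. The device count, clock skew and seed are hypothetical, and random jitter is just one possible way to desynchronize devices.

```python
import random
from collections import Counter


def peak_sends_per_second(send_times):
    """Largest number of publishes landing in any one-second bucket."""
    return max(Counter(int(t) for t in send_times).values())


def synchronized_schedule(devices, interval_s, clock_skew_s, rng):
    """Every device fires on the interval boundary, off only by clock drift."""
    return [interval_s + rng.uniform(-clock_skew_s, clock_skew_s)
            for _ in range(devices)]


def jittered_schedule(devices, interval_s, rng):
    """Each device adds its own random offset somewhere within the interval."""
    return [interval_s + rng.uniform(0.0, interval_s) for _ in range(devices)]


rng = random.Random(2014)  # fixed seed so the sketch is repeatable
DEVICES, INTERVAL_S = 100_000, 15 * 60

sync_peak = peak_sends_per_second(
    synchronized_schedule(DEVICES, INTERVAL_S, clock_skew_s=2.0, rng=rng))
jitter_peak = peak_sends_per_second(
    jittered_schedule(DEVICES, INTERVAL_S, rng=rng))
print(f"synchronized peak: {sync_peak}/s, jittered peak: {jitter_peak}/s")
```

Capacity has to be provisioned for the peak, so the gap between the two figures roughly corresponds to the capacity left idle between bursts.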

Page 25: Connecting Stuff to Azure (IoT)

Azure Software Load Balancer kills inactive connections after 4 minutes

To increase the timeout value (max value 30 minutes):

Set-AzurePublicIP -PublicIPName webip -VM MyVM -IdleTimeoutInMinutes 15

http://azure.microsoft.com/blog/2014/08/14/new-configurable-idle-timeout-for-azure-load-balancer/

Managing Socket Lifetime

Page 26: Connecting Stuff to Azure (IoT)

Then you’ll need to “nudge” the socket with some data.

What about TcpKeepAlive?

But I want at least an hour!
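A "nudge" can be as simple as a tiny application-level heartbeat sent well inside the idle timeout. The slide leaves the TcpKeepAlive question open. One consideration is that OS-level keep-alive probes commonly default to starting after two hours of idle time, far longer than a 4 to 30 minute load balancer timeout, unless the OS setting is tuned. The sketch below shows both pieces. The heartbeat byte and the safety factor are assumptions, not values from the session.

```python
import socket

HEARTBEAT = b"\x00"  # hypothetical one-byte nudge that the gateway would ignore


def nudge_interval_s(idle_timeout_s: float, safety_factor: float = 0.5) -> float:
    """How often to send a heartbeat so the connection never looks idle.
    A safety factor below 1 leaves room for delays and clock differences."""
    if idle_timeout_s <= 0:
        raise ValueError("idle_timeout_s must be positive")
    if not 0.0 < safety_factor < 1.0:
        raise ValueError("safety_factor must be between 0 and 1")
    return idle_timeout_s * safety_factor


def enable_tcp_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send keep-alive probes on this socket. The probe schedule
    is an OS setting, so this alone may not beat a short idle timeout."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


# With the default 4 minute idle timeout, nudge every 2 minutes.
print(nudge_interval_s(4 * 60))
```

Every nudge costs battery and, on cellular links, metered bytes, so very long-lived sockets trade cloud-side convenience against device-side budget.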

Page 27: Connecting Stuff to Azure (IoT)

A really simple cloud gateway in Java with Jetty and Kafka

Page 28: Connecting Stuff to Azure (IoT)

End to End IoT Architecture

Page 29: Connecting Stuff to Azure (IoT)

• Decouple incoming data streams from consumers. Static topologies are brittle, and challenging to extend.

• Handle inrush shock, transient interrupts

• Enable “time travel” – downstream consumers can go back in time to begin reading data

• Rate conversion; can handle offsets between rate of production and rate of consumption

Why an Event Broker?

Page 30: Connecting Stuff to Azure (IoT)

Qualities of an event broker for IoT:

• Partitioned, append-only journal with tunable consistency

• Ingest 100k’s -> 1M+ events / second, with sub-second e2e latency

• Standards-based wire protocol

• Client cursors for externalized state management; no resource contention between readers

• Tunable retention policies

Azure options for an event broker:

• Event Hubs (PaaS service in public preview - http://azure.microsoft.com/en-us/services/event-hubs/)

• Kafka on Linux (IaaS)

Event Broker
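The "partitioned, append-only journal" and "client cursors" ideas can be shown with a toy in-memory model. This is a conceptual sketch only, not the Event Hubs or Kafka API, and the class and method names are made up. Events for one device always land in the same partition, and each reader tracks its own offset, so readers never contend with each other and can rewind.

```python
import zlib


class EventLog:
    """Toy partitioned, append-only journal held in memory."""

    def __init__(self, partitions: int):
        self._partitions = [[] for _ in range(partitions)]

    def partition_for(self, device_id: str) -> int:
        """Stable hash so one device's events stay ordered in one partition."""
        return zlib.crc32(device_id.encode("utf-8")) % len(self._partitions)

    def append(self, device_id: str, event: dict):
        """Append an event; return (partition, offset) of the new entry."""
        p = self.partition_for(device_id)
        self._partitions[p].append((device_id, event))
        return p, len(self._partitions[p]) - 1

    def read(self, partition: int, offset: int, max_events: int = 100) -> list:
        return self._partitions[partition][offset:offset + max_events]


class Consumer:
    """The cursor lives in the client; the log keeps no per-reader state."""

    def __init__(self, log: EventLog, partition: int, start_offset: int = 0):
        self.log, self.partition, self.offset = log, partition, start_offset

    def poll(self, max_events: int = 100) -> list:
        batch = self.log.read(self.partition, self.offset, max_events)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        """'Time travel': move the cursor back and re-read earlier events."""
        self.offset = offset


log = EventLog(partitions=1)
for seq in range(3):
    log.append("car-1", {"seq": seq})
fast, slow = Consumer(log, 0), Consumer(log, 0)
print(len(fast.poll()), len(slow.poll(max_events=2)))
```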

Page 31: Connecting Stuff to Azure (IoT)

React to the incoming message stream, apply business logic + enduring state:

• Retrieve next message. Determine (a) associated device state, and (b) processing logic

• Execute processing logic, update durable state

• Trigger any additional actions as a result of the processing logic (raise an alert, start another workflow, etc)

Workflow processing

Page 32: Connecting Stuff to Azure (IoT)

Common aspects:

• Most real-world IoT solutions have hundreds of message types and handlers (including multiple versions)

• Common contention points are the message dispatcher (retrieving logic) and state management.

• Relational databases are generally ill-suited state stores for IoT (everyone sharing the same transaction log)

Workflow processing
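The retrieve / determine / execute / trigger loop described on these two slides can be sketched with a dispatch table keyed by message type and version. Everything here is hypothetical: the message shapes, the handler names and the service threshold. A plain dictionary stands in for the durable per-device state store.

```python
from collections import defaultdict

HANDLERS = {}  # (message type, version) -> handler function


def handles(message_type: str, version: int):
    """Register a handler for one message type and version."""
    def register(fn):
        HANDLERS[(message_type, version)] = fn
        return fn
    return register


@handles("odometer", 1)
def odometer_v1(state: dict, body: dict) -> list:
    state["km"] = body["km"]
    return []  # no follow-up actions


@handles("odometer", 2)
def odometer_v2(state: dict, body: dict) -> list:
    # Version 2 of this hypothetical message reports metres instead of km.
    state["km"] = body["metres"] / 1000.0
    return ["service_due"] if state["km"] >= 15000 else []


def process(message: dict, state_store) -> list:
    """Look up (a) device state and (b) processing logic, run the handler,
    and return any additional actions it triggers."""
    handler = HANDLERS.get((message["type"], message["version"]))
    if handler is None:
        raise KeyError(f"no handler for {message['type']} v{message['version']}")
    state = state_store[message["device_id"]]
    return handler(state, message["body"])


store = defaultdict(dict)  # stand-in for a durable state store
print(process({"device_id": "car-1", "type": "odometer", "version": 2,
               "body": {"metres": 15_200_000}}, store))
```

The dictionary lookup and the state fetch are the two places the slide names as contention points once there are hundreds of handlers and a shared store.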

Page 33: Connecting Stuff to Azure (IoT)

Not all workloads are created equal

Pareto curve in message processing – can focus optimization efforts on a small number of message types.

Focus on optimizing these message handlers

These, probably not.
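Finding the handlers worth optimizing is a counting exercise once message types are logged. The sketch below picks the smallest set of types that covers a chosen share of observed traffic. The 80% threshold and the sample traffic mix are assumptions for illustration.

```python
from collections import Counter


def hot_message_types(observed_types, coverage: float = 0.8) -> list:
    """Return the most frequent message types, in order, until together
    they account for at least `coverage` of all observed messages."""
    counts = Counter(observed_types)
    total = sum(counts.values())
    hot, covered = [], 0
    for message_type, count in counts.most_common():
        if covered >= coverage * total:
            break
        hot.append(message_type)
        covered += count
    return hot


# Hypothetical traffic mix: a few types dominate the stream.
sample = ["gps"] * 70 + ["engine"] * 15 + ["door"] * 10 + ["firmware"] * 5
print(hot_message_types(sample))
```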

Page 34: Connecting Stuff to Azure (IoT)

A really simple message processor in C# with Event Hub

Page 35: Connecting Stuff to Azure (IoT)

When designing an IoT application, the needs of the device drive engineering choices (physics always wins)

Need to carefully design cloud services to meet devices, and handle scale / availability / compatibility

This stuff is hard at scale. We are committed to making it easier.

• Azure Intelligent Systems Service (http://www.microsoft.com/windowsembedded/en-us/intelligent-systems-service.aspx)

• Patterns and Practices – IoT Guidance (coming early 2015)

More details and context on building Azure cloud services at scale

• Building Big; Lessons Learned from Azure Customers (http://channel9.msdn.com/Events/Build/2014/3-633)

• Connecting the World: Building Services for Connected Devices (http://channel9.msdn.com/Events/Build/2014/3-634)

Takeaways

Page 36: Connecting Stuff to Azure (IoT)

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.