Three Buzzwords of our timedownload.microsoft.com/documents/hk/technet/techdays2015... ·...


Page 1: Three Buzzwords of our timedownload.microsoft.com/documents/hk/technet/techdays2015... · 2018-12-05 · Three Buzzwords of our time IOT - Big Data - Predictive Analytics ... Stored

Three Buzzwords of our time

IOT - Big Data - Predictive Analytics

A fourth – The Cloud


Agility Control

Elastically Scale Storage & Compute

Data Management & Governance

Access Control

Rich Stores & Compute Options

Discovery in Familiar Tools

Informed Decisions on Time

Re-imagining modern data analytics by balancing agility and control

Business

• Innovate Faster

• Discover New Opportunities

• Reliable Information

IT

• Lower Costs

• Minimize Complexity

• Improve Efficiency

• Control/Reduce Risks


Connect

Collect

Enrich

Transform

Publish

Data Consumption

Data Production

Information Production

Data Production

Operational Dashboards, etc

BI & Analytics

I need to learn big data technologies to develop the pipeline

Do I need to start developing every pipeline from scratch?

What does it take to deploy the pipeline once developed?

Do I need to touch the code of every step when metadata changes?

How do I monitor health and execution status of the pipeline?

How do I ensure an update does not break the entire pipeline?

How do I deal with streaming and batch requirements?

How do I ensure reliable execution and fault tolerance?

How do I make it available to consumers?

Authoring Operating Managing Lifecycle Publishing


Typical Azure Data Architecture

Stream Analytics

Transform Ingest

Web logs

Present & decide

Kinect In-Store Activity

Social Data

Event Hubs

HDInsight

Azure Data Factory

Azure SQL DB

Azure Blob Storage

Azure Machine Learning

Power BI

Web dashboards

Mobile devices

DW / Long-term storage

Predictive analytics

Event & data producers

APS


Cloud-scale telemetry ingestion from websites, apps, and devices

• Log millions of events per second in near real time

• Connect devices with flexible authorization and throttling

• Time-based event buffering

• Managed service with elastic scale

• Broad platform reach with native client libraries

• Pluggable adapters for other cloud services


Supports small number of queries that arrive at high volume

Streaming in Azure

Project Codename NRT

Input Adapter

Output Adapter

Complex Event Processor

NRT Cloud Service

Data Stores

Event Hub

Data Stores

Dashboards & Alerts

Sensors & Devices

Event Hub

Small number of high volume queries

Complex Event Processing

(aggregation, reduction, cleanup)

Predictable & repeatable results

SQL-like queries

Ingress Azure blobs and Event Hub

Egress to Azure DB, blobs, Event Hub

Dashboarding

Alerting & Bind notification

Anomaly detection

Compute datasets


Orchestrate data movement, machine learning, and Hadoop (via HDInsight) for on-premises and cloud data

Plan workflow dependencies and scheduling

Publish to Power BI users as a searchable data view

Lifecycle management, monitoring

Operationalize information production & governance

Orchestration and Data Production in Azure

Project Codename MDP


Support HBase as NoSQL columnar database on Azure Blobs

Support Storm as streaming

Hadoop in Azure

HDInsight

[Diagram: a Name Node, Job Tracker and HMaster Coordination node, alongside four sets of Data Node, Task Tracker and Region Server]

HBase as a columnar NoSQL transactional database running on Azure Blobs

Storm as a streaming service for near real time processing

Hadoop 2.4 support for 100x query gains on Hive queries

Mahout support for machine learning + Hadoop

Graphical User Interface for HIVE queries


Enjoy unprecedented efficiencies via a near-zero-maintenance database-as-a-service

Ensure predictable performance and elastic scale from one to thousands of databases

Support business continuity policies with self-service restore and disaster recovery

Drive DevOps tasks via programmatic APIs

Migrate LOB apps for reduced CAPEX & OPEX; drive database administration efficiencies at scale

SQL Database service

Relational database-as-a-service designed for devs & architects

Azure SQL Database


Enable collaborative data science work with anyone, anywhere via a personal Windows Azure Machine Learning Studio workspace

Bring in cloud data sources with the ease of a drop down menu

Use the same best-in-class algorithms in ML Studio that run Xbox and Bing

Quickly deploy models as Azure web services with Machine Learning API service

TBs of scalability via HDInsight

ML SDK enabling partners to build and monetize ML web services

Easily create sophisticated models using numerous languages, including R & Python

Deploy predictive models into production in minutes instead of days or weeks

Connect seamlessly with Excel for results visualization

Use historical data to predict future outcomes using cloud based machine learning

Azure Machine Learning


Power BI mobile app support for iOS devices

New data visualizations and self-service predictive analytics for forecasting and population plotting

Enhanced data source and data refresh support

Enhanced data management and governance built into Power BI

• Connectivity to on-premises data sources

• Mobile access to Power BI reports

Powerful new ways to work with data with Excel and Power BI for Office 365

Self-service analysis in Power View

Power BI


Intake millions of events per second

• Process data from connected devices/apps

• Integrated with a highly scalable publish-subscribe ingestor

Easy processing on continuous streams of data

• Transform, augment, correlate, temporal operations

• Detect patterns and anomalies in streaming data

• Correlate streaming with reference data
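Correlating a stream with reference data can be pictured as a lookup join. The following is only a minimal Python sketch of that idea, not how the service implements it; the function name `enrich`, the `device` key, and the sample records are hypothetical.

```python
def enrich(stream, reference, key):
    """Merge each streaming event with its matching reference row.

    Events whose key has no reference row are dropped (inner-join
    behaviour, an assumption made for this sketch).
    """
    for event in stream:
        row = reference.get(event[key])
        if row is not None:
            # Reference columns are added alongside the event payload.
            yield {**event, **row}


# Hypothetical reference data: static facts about each device.
devices = {"d1": {"site": "HK"}}

# Hypothetical streaming events.
readings = [
    {"device": "d1", "temp": 30},
    {"device": "d9", "temp": 10},  # no reference row, so it is dropped
]

enriched = list(enrich(readings, devices, "device"))
```

The reference side is a plain dictionary here because reference data is small and slow-changing compared with the stream.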


Guaranteed events delivery

• Guaranteed not to lose events or produce incorrect output

• Preserves event order on per-device basis

Guaranteed business continuity

• Guaranteed uptime (three nines of availability)

• Auto-recovery from failures

• Built in state management for fast recovery


Elasticity of the cloud for scale up or scale down

• Spin up any number of resources on demand

• Scale from small to large when required

• Distributed, scale-out architecture

• Scale using a slider in the Azure Portal, without writing code

Low startup costs

• Provision and run a streaming solution for as low as $25/month

• Pay only for the resources you use

• Ability to incrementally add resources

• Reduce costs when business needs change


End-to-End Architecture Overview

Data Source Collect Process Consume Deliver

Event Inputs
- Event Hub
- Azure Blob

Transform
- Temporal joins
- Filter
- Aggregates
- Projections
- Windows
- Etc.

Enrich

Correlate

Upcoming
- Call ML models

Outputs
- SQL Azure
- Azure Blobs
- Event Hub

Upcoming
- PowerBI (in Private Preview)
- Azure Tables

BI Dashboards

Predictive Analytics

Azure Storage

Azure “NRT”
• Temporal Semantics
• Guaranteed delivery
• Guaranteed up time

Reference Data
- Azure Blob
- …


Every event that flows through the system has a timestamp

SELECT FROM TIMESTAMP BY

SELECT FROM

Projecting timestamp into payload

SELECT System.Timestamp AS FROM
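The two query shapes above differ in where an event's timestamp comes from: with TIMESTAMP BY it is read from a payload field, and without it the arrival time is used. A minimal Python sketch of that choice, and of copying the assigned time back into the payload, follows. The helper names, the `EventTime` alias, and the sample tweet are hypothetical.

```python
def event_time(event, arrival_time, timestamp_field=None):
    """Pick the time used to place an event on the timeline.

    With a timestamp field (the TIMESTAMP BY case) the payload value is
    used; otherwise the event is stamped with its arrival time.
    """
    if timestamp_field is not None:
        return event[timestamp_field]
    return arrival_time


def project_timestamp(event, assigned_time, alias="EventTime"):
    """Copy the assigned timestamp into the payload under an alias."""
    return {**event, alias: assigned_time}


# Hypothetical tweet created at t=100 that arrives at t=105.
tweet = {"Topic": "azure", "CreatedAt": 100}

by_payload = event_time(tweet, arrival_time=105, timestamp_field="CreatedAt")
by_arrival = event_time(tweet, arrival_time=105)
projected = project_timestamp(tweet, by_payload)
```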


SELECT TimeZone, COUNT(*) AS Count
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY TimeZone, TumblingWindow(second,10)

Tell me the count of tweets per time zone every 10 seconds

[Figure: A 10-second Tumbling Window; events on a time axis (secs) grouped into consecutive, non-overlapping 10-second windows]
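The tumbling-window query can be mimicked offline to see which events land in which window. This is a rough Python sketch, not the service's implementation. The sample events are made up, and treating each window as start-inclusive is an assumption of the sketch; the service may handle boundaries differently.

```python
from collections import Counter


def tumbling_counts(events, size=10):
    """Count events per (window_start, key) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs. Each event
    belongs to exactly one window: [window_start, window_start + size).
    """
    counts = Counter()
    for ts, key in events:
        window_start = (ts // size) * size
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical tweets as (CreatedAt in seconds, TimeZone).
sample = [(1, "PST"), (5, "PST"), (4, "EST"), (12, "PST"), (16, "EST"), (18, "EST")]
result = tumbling_counts(sample)
```

Each event contributes to exactly one count, which is what distinguishes a tumbling window from the overlapping window types.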


SELECT Topic, COUNT(*) AS TotalTweets, AVG(SentimentScore)
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY Topic, HoppingWindow(second, 10 , 5)

Every 5 seconds give me the count of tweets and the average sentiment score over the last 10 seconds

[Figure: A 10-second Hopping Window with a 5-second “Hop”; events on a time axis grouped into overlapping 10-second windows that start 5 seconds apart]
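Because hopping windows overlap, one event can be counted in more than one result. The Python sketch below illustrates this under stated assumptions: windows end on multiples of the hop and cover [end - size, end), and the events and scores are invented. It is not the service's implementation.

```python
from collections import defaultdict


def hopping_stats(events, size=10, hop=5):
    """Return {(window_end, topic): (count, average_score)}.

    `events` is an iterable of (timestamp_seconds, topic, score). A window
    ending at `end` covers [end - size, end), so with size=10 and hop=5
    every event falls into two windows.
    """
    buckets = defaultdict(list)
    for ts, topic, score in events:
        end = (ts // hop + 1) * hop      # first window end after the event
        while end - size <= ts:          # every window that still covers ts
            buckets[(end, topic)].append(score)
            end += hop
    return {k: (len(v), sum(v) / len(v)) for k, v in buckets.items()}


# Hypothetical tweets as (CreatedAt in seconds, Topic, SentimentScore).
sample = [(1, "azure", 0.5), (4, "azure", 1.0), (7, "azure", 0.0)]
stats = hopping_stats(sample)
```

The tweet at second 7 appears in both the window ending at 10 and the one ending at 15, which is the overlap the figure depicts.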


SELECT Topic, COUNT(*)
FROM TwitterStream TIMESTAMP BY CreatedAt
GROUP BY Topic, SlidingWindow(second, 10)
HAVING COUNT(*) > 10

Give me the count of tweets for all topics which are tweeted more than 10 times in the last 10 seconds

[Figure: A 10-second Sliding Window; the window moves along the time axis with the events]
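A sliding window looks back a fixed interval from a moving point in time. The Python sketch below is a simplification: it re-evaluates the count only when a new event arrives, assumes events arrive in time order, and assumes the window covers (ts - size, ts]. The function name and sample data are hypothetical.

```python
from collections import defaultdict, deque


def sliding_alerts(events, size=10, threshold=10):
    """Return (timestamp, topic, count) whenever a topic exceeds the threshold.

    `events` is an iterable of (timestamp_seconds, topic), sorted by time.
    For each arrival, the count covers events with time in (ts - size, ts].
    """
    recent = defaultdict(deque)
    alerts = []
    for ts, topic in events:
        window = recent[topic]
        window.append(ts)
        # Drop timestamps that have slid out of the look-back interval.
        while window[0] <= ts - size:
            window.popleft()
        if len(window) > threshold:      # the HAVING COUNT(*) > threshold step
            alerts.append((ts, topic, len(window)))
    return alerts


# Hypothetical tweets; a threshold of 2 keeps the example short.
sample = [(1, "a"), (2, "a"), (3, "a"), (20, "a"), (21, "b")]
alerts = sliding_alerts(sample, size=10, threshold=2)
```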


Data sources

Consumed by BI

Integrated with Apps

Coordination and management

• Build and manage a network of data pipelines

• From a single pane of glass:

  • See full data and operational lineage

  • Monitor pipeline and dataset health

  • Control data production policy

Data stores and processing environments

• Work with your data

  • On-premises SQL Server

  • Azure DB, Azure Blobs, Azure table

• Compose and orchestrate data processing

  • HDInsight, Custom Code, etc.

AZURE DATA FACTORY

Relational and non-relational

On-premises or cloud

Batch or Stream

Hadoop (Hive, Pig, etc.)

Custom code

Data movement

Manage and monitor

Data and operational lineage

Coordination and scheduling

Policy

DATA PIPELINES

Activity Activity

Produce trusted information from raw data



Information assets Raw data Orchestrate, monitor

Data Factory

A platform for developers to compose data processing, storage and movement services to create & operationalize analytics Pipelines.

Pipeline

Pipelines are groups of data movement and/or processing Activities that accept N input Datasets and produce N output Datasets. Pipelines can be executed once or on a flexible range of schedules (hourly, daily, weekly, etc.).

Dataset

A Dataset is a named view of data. The data being described can vary from simple bytes, semi-structured data like CSV files all the way to Tables or Models.

Activity

An Activity is the unit of execution within the pipeline that can perform data movement or transformation. It can import/export data from disparate Data Stores (DB, files, SaaS services, etc.) used by the organization into a Data Hub.

Data Hub

A Data Hub is a pairing of collocated data storage and associated compute services. For example, a Hadoop cluster (HDFS as storage, Hive/Pig/etc. as compute) is a Data Hub. Similarly, an EDW can be modelled as a Data Hub (DB as storage, Sprocs and/or ETL tool as compute services).
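The relationship between these terms can be pictured with a toy model. The Python sketch below is purely illustrative and is not the Data Factory API: the class names mirror the glossary, while the fields, the in-memory `store` dictionary standing in for a Data Hub's storage, and the two sample activities are all hypothetical. Activities simply run in the order listed, with no scheduling or dependency resolution.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Activity:
    """Unit of execution: reads named input datasets, writes named outputs."""
    name: str
    inputs: List[str]
    outputs: List[str]
    run: Callable[[Dict[str, list]], Dict[str, list]]


@dataclass
class Pipeline:
    """A group of activities, executed here in the order they are listed."""
    name: str
    activities: List[Activity] = field(default_factory=list)

    def execute(self, store: Dict[str, list]) -> Dict[str, list]:
        for activity in self.activities:
            produced = activity.run({n: store[n] for n in activity.inputs})
            for n in activity.outputs:
                store[n] = produced[n]
        return store


def copy_raw(inputs):
    # Data movement: copy the raw dataset into a staging dataset.
    return {"staged": list(inputs["raw"])}


def keep_positive(inputs):
    # Transformation: keep only positive values.
    return {"clean": [x for x in inputs["staged"] if x > 0]}


pipeline = Pipeline("demo", [
    Activity("copy", ["raw"], ["staged"], copy_raw),
    Activity("filter", ["staged"], ["clean"], keep_positive),
])
store = pipeline.execute({"raw": [3, -1, 2]})
```

Datasets appear only as names in the `store` dictionary, which reflects the glossary's point that a Dataset is a named view of data rather than the data itself.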


C#

MapReduce

Hive

Pig

Stored Procedures


ETL Tool (SSIS, etc)

EDW (SQL Svr, Teradata, etc)

Extract

Original Data

Load

Transformed Data

Transform

BI Tools

Data Marts

Data Lake(s)

Ingest (EL)

Original Data

Dashboards

Apps

Scale-out Storage & Compute (HDFS, Blob Storage, etc)

Transform & Load

Streaming data


New Azure service for data developers and IT

Compose data processing, storage, and movement services to create and manage analytics pipelines

Rich, simple end-to-end pipeline monitoring and management

Initially focused on Azure and hybrid movement to/from on-premises SQL Server. Over time will expand to more storage and processing systems


HDInsight Hadoop for the Cloud


Azure HDInsight

Hadoop Meets the Cloud

Microsoft’s cloud Hadoop offering

100% open source Apache Hadoop

Built on the latest releases across Hadoop (2.4)

Up and running in minutes with no hardware to deploy

Harness existing .NET and Java skills to write MapReduce

Utilize familiar BI tools for analysis including Microsoft Excel


Demo: Getting Started With HDInsight


Hadoop 2.0


[Diagram: a Name Node, Job Tracker and HMaster Coordination node, alongside four sets of Data Node, Task Tracker and Region Server]




Fully managed

No software to install, no hardware to manage, and one portal to view and update.

Integrated

Simple drag, drop and connect interface for Data Science. No need for programming for common tasks.

Best in Class Algorithms + R

Built-in collection of best of breed algorithms. Support for R and popular CRAN packages.

Deploy in minutes

Operationalize models with a single click. Monetize in Machine Learning Marketplace.


Drag & Drop + Best in Class Algorithms


Live Connectivity to SQL Server Analysis Services

Live Query




http://aka.ms/DBI233

Session Evaluation