Conference Theme - Hitachi Solutions America · Introduction. Structured Data. Unstructured Data....


Think Digital Customer Conference 2019

Conference Theme

It can be challenging to understand what exactly Digital Transformation means for your business. We want to help you take a step back and reconsider how you run your business and perhaps even how you go to market in this new world we’re already living in.

We understand it can be challenging to think strategically rather than tactically about specific products and tools. We want to help you overcome this challenge, so that you don’t underutilize the power of the solutions Hitachi Solutions has to offer.

It’s time to rethink our approach; it’s time to Think Digital. Looking to solve a problem? Think Digital. Streamline your workload? Think Digital. Extend your reach? Better communicate? Increase sales? Think Digital. And we’re going to help you do just that, starting with this conference.

DIGITAL TRANSFORMATION CAN BE AN INTIMIDATING CONCEPT


Off to the Big Data Race: Performance, Speed and Storage

Orlando Gonzalez

Director, Data & Analytics

Email: [email protected]

#HSCCATLANTA19

Breakout Track 3: Analytics and AI


CONTENTS

01 Introduction
02 Structured Data
03 Unstructured Data
04 Modern Data Warehouse
05 Azure Data Services
06 Business Use Case
07 Azure Data Lake
08 HDInsight
09 Azure SQL DW
10 Databricks


Introduction

• How do I manage various data types?
• Which data service should you use?
• Correlation of data?
• Does all your data have value?
• Can you afford to keep everything?
• Who has a deep understanding of your data?


Structured Data

• Organization (tables, rows, columns)
• Standard data mining techniques
• Ongoing administration and maintenance
• SQL Server

• Relational database
• Pre-defined schemas for structure
• Upfront preparation and architecture required
• Changes in data type (numeric/text) require a schema change
• Transaction security, keys, locks, views


Structured Data

• Integrity
• Relational
• Columns have known data types
• Data/Log files
• Fixed partition sizes
• Concurrency & Locks
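The structured-data properties listed above (a schema defined up front, keys, integrity, transactions, and explicit schema changes) can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server; the `orders` table, its columns, and the helper names are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_orders_db() -> sqlite3.Connection:
    """Create an in-memory database whose schema is defined before any data arrives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,               -- key
            customer TEXT NOT NULL,                     -- known data type
            amount   REAL NOT NULL CHECK (amount >= 0)  -- integrity rule
        )
        """
    )
    return conn


def insert_order(conn, order_id, customer, amount) -> bool:
    """Insert one row in a transaction; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (order_id, customer, amount) VALUES (?, ?, ?)",
                (order_id, customer, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def add_column(conn, name, sql_type) -> None:
    """A new attribute needs an explicit schema change before it can be stored."""
    conn.execute(f"ALTER TABLE orders ADD COLUMN {name} {sql_type}")


conn = build_orders_db()
first = insert_order(conn, 1, "Contoso", 120.0)
duplicate = insert_order(conn, 1, "Fabrikam", 75.0)  # same key as the first row
missing = insert_order(conn, 2, None, 10.0)          # violates NOT NULL
add_column(conn, "region", "TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Only the first insert is accepted; the other two are rejected by the schema itself rather than by application code, which is the trade-off the slide points at: more upfront design in exchange for guaranteed structure.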


Unstructured Data

• Complex structured data
• Traditional database not needed for data management
• Impose schema on read
• Store data in its native format
• The business users decide on which data to interpret
• Scalable
• Less administration & maintenance


Unstructured Data

• No pre-defined data model
• All data types
• Structured information in different ways
• Large scale data mining
• Supports images, audio, video, email body text
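"Impose schema on read" and "store data in its native format" can be sketched as follows. This is a minimal illustration in plain Python, not a data lake API: the raw JSON lines, the field names, and the `read_with_schema` helper are all hypothetical.

```python
import json

# Raw records are stored exactly as they arrived; no model is enforced on write.
RAW_EVENTS = [
    '{"device": "sensor-1", "temp_c": "21.5", "ts": "2019-05-01T10:00:00"}',
    '{"device": "sensor-2", "temp_c": 19, "note": "recalibrated"}',
    '{"user": "alice", "action": "login"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a converter. Records that lack a field,
    or whose value cannot be converted, are skipped for this reader only;
    the raw data itself is left untouched.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        try:
            rows.append({field: cast(record[field]) for field, cast in schema.items()})
        except (KeyError, TypeError, ValueError):
            continue
    return rows


# Two consumers interpret the same raw data in different ways.
temperature_schema = {"device": str, "temp_c": float}
login_schema = {"user": str, "action": str}

temperatures = read_with_schema(RAW_EVENTS, temperature_schema)
logins = read_with_schema(RAW_EVENTS, login_schema)
```

Each reader decides which fields matter and how to type them, which is what lets business users choose which data to interpret without a schema change at the storage layer.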


Traditional Data Warehouse

• Single source, ERP or Transactional
• Simple data model
• Relational data sources
• Standard costing model
• Optimized design for analytical queries
• Traditional ETL design & data movement
• User maintained (Server, DB, Tuning)


Modern Data Warehouse

• Integrated MPP Data Platform
• Near Real-Time
• All Structured Data Types
• Scalability & Performance
• Supporting all levels of Analytics
• No single solution for a data estate
• All data may have value
• Updates to source systems or changing data types

• Database
• Data Warehouse
• Databricks
• Data Cube
• Data Lake
• Data Lake Analytics
• Data Catalog

• Cosmos DB
• Blob
• Event Hub
• Stream Analytics
• HDInsight
• Analysis Services



Modern Data Warehouse - Azure


Business Cases

• What are you solving for the business?
• Do not jump directly to technology architecture
• Products support business need and usage
• Lead >> Lag >> Match (think about scale & business comfort)
• Data trends in your industry
• Performance consideration, use the tools the way they were designed
• Integrate multiple data sources


Azure Data Lake Topology

[Diagram. Labels: Compute requirement, U-SQL, ADLS, WASB]


Azure HDInsight Topology

[Diagram. Labels: HDInsight Cluster (Head node, Back-up, Data node), Azure Data Lake Storage, Azure Storage Blob, Domain credentials, AAD tenant, Azure VNET to VNET peering]


Azure SQL Data Warehouse

[Diagram. Compute: three nodes, each with Cores, Memory, and an NVMe SSD holding Cache and TempDB. Remote storage: Data, Log, Snapshot backups]


Azure Databricks Topology

[Diagram. Labels: Azure Resource Manager (Storage, Compute, Network), Microsoft.Databricks RP, Azure Databricks Workspace, VNet, VMs, Clusters, Blob Storage, DBFS]


Modern Data Warehousing

Decision points and trade-offs, but not necessarily one versus the other...

| | HDInsight Hive | HDInsight Spark | Azure Data Lake | Azure Databricks | Azure SQL DW |
| --- | --- | --- | --- | --- | --- |
| Volume | Petabytes | Petabytes | Petabytes | Petabytes | Terabytes |
| Security | AAD, ADLS / Apache Ranger | AAD, ADLS | AAD, ADLS | AAD, ADLS, ADB role-based access | TDE, Threat Detect, CA, AAD |
| Languages | HiveQL | SparkSQL, HiveQL, Scala, Java, Python, R | U-SQL, R, Python | PySpark, SparkR, sparklyr, Scala, SparkSQL | T-SQL |
| Extensibility | Yes, .NET/SerDe | Yes, Maven/PyPI | Yes, .NET | Yes, 3rd party libs | Polybase |
| External Sources | ORC, CSV, Parquet + others | Parquet, JSON, Hive + others | Text, CSV, TSV, Custom | CSV, JSON, Parquet + Many Sources | Text, Hive RCFile, Hive ORC, Parquet |
| Admin | Medium-High | Medium-High | Low | Low | Low-Medium |
| Cost Model | Nodes & VM | Nodes & VM | Units/Jobs | VM, DBU | DWU, cDWU |
| Schema Definition | Schema on Read | Schema on Read | Schema on Read | Schema on Read | Schema on Write |
| Storage | Blob or ADLS | Blob or ADLS | ADLS | Blob or ADLS | Internal Storage |
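One way to work with these decision points is to encode a few rows of the comparison as data and filter on the constraints that matter for a given business case. The sketch below is only an illustration: the `SERVICES` dictionary copies the Languages, Admin, and Schema Definition rows from the comparison above, and the `candidates` helper is hypothetical.

```python
# Values transcribed from the comparison; nothing here queries Azure.
SERVICES = {
    "HDInsight Hive": {
        "languages": {"HiveQL"},
        "admin": "Medium-High",
        "schema": "read",
    },
    "HDInsight Spark": {
        "languages": {"SparkSQL", "HiveQL", "Scala", "Java", "Python", "R"},
        "admin": "Medium-High",
        "schema": "read",
    },
    "Azure Data Lake": {
        "languages": {"U-SQL", "R", "Python"},
        "admin": "Low",
        "schema": "read",
    },
    "Azure Databricks": {
        "languages": {"PySpark", "SparkR", "sparklyr", "Scala", "SparkSQL"},
        "admin": "Low",
        "schema": "read",
    },
    "Azure SQL DW": {
        "languages": {"T-SQL"},
        "admin": "Low-Medium",
        "schema": "write",
    },
}


def candidates(language=None, schema=None, admin=None):
    """Return service names matching every constraint that is not None."""
    matches = []
    for name, traits in SERVICES.items():
        if language is not None and language not in traits["languages"]:
            continue
        if schema is not None and traits["schema"] != schema:
            continue
        if admin is not None and traits["admin"] != admin:
            continue
        matches.append(name)
    return matches


tsql_options = candidates(language="T-SQL")
scala_options = candidates(language="Scala")
low_admin_scala = candidates(language="Scala", admin="Low")
```

A filter like this narrows the field but does not pick a winner; as the slide notes, the services are often combined rather than chosen one versus the other.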


Next Steps


Questions?

• LinkedIn: https://www.linkedin.com/in/orlando-gonzalez

• Websites:
• https://www.capaxglobal.com/
• https://us.hitachi-solutions.com/
• https://azure.microsoft.com/en-us/solutions/data-warehouse/


Modern Data Warehouse: ETL

Data processing with Azure Databricks

Sources
• Logs, files, and media (unstructured)
• Business and custom apps (structured)

Pipeline
• Orchestration (Azure Data Factory): load flat files into data lake on a schedule
• Ingest storage (Azure Storage / Data Lake Store)
• Data processing (Azure Databricks): read data from files using DBFS
• Serving storage (Azure SQL DW): load processed data into tables optimized for analytics
• Consumers: dashboards, applications

Transactional path
• Transactional storage (SQL DB): applications manage their transactional data directly
• Orchestration (Azure Data Factory): extract and transform relational data, load into DBFS
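The flow on this slide (orchestration, ingest storage, data processing, serving storage) can be sketched end to end with standard-library stand-ins. This is not Azure code: a dictionary plays the role of the data lake, a plain function plays the role of the Databricks job, and an in-memory SQLite table plays the role of the Azure SQL DW serving table. The file contents, path, and function names are hypothetical.

```python
import csv
import io
import sqlite3

# A flat file as it might arrive from a source system (illustrative data).
FLAT_FILE = (
    "order_id,customer,amount\n"
    "1,Contoso,120.50\n"
    "2,Fabrikam,not-a-number\n"
    "3,Contoso,30.00\n"
)


def ingest(flat_file_text):
    """Ingest storage: land the file unchanged, in its native format."""
    return {"landing/orders.csv": flat_file_text}


def process(lake, path):
    """Data processing: read the raw file, skip bad rows, aggregate per customer."""
    totals = {}
    for row in csv.DictReader(io.StringIO(lake[path])):
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # malformed amounts are dropped, not loaded
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + amount
    return sorted(totals.items())


def load(rows):
    """Serving storage: load processed rows into a typed table for analytics."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL NOT NULL)"
    )
    with conn:
        conn.executemany("INSERT INTO customer_totals VALUES (?, ?)", rows)
    return conn


def run_pipeline(flat_file_text):
    """Orchestration: run the stages in order, as a scheduler would."""
    lake = ingest(flat_file_text)
    rows = process(lake, "landing/orders.csv")
    return load(rows)


serving = run_pipeline(FLAT_FILE)
result = serving.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
```

The raw file stays untouched in the landing area while only cleaned, aggregated rows reach the serving table, which mirrors the split between schema-on-read storage and a schema-on-write warehouse.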