Joe Caserta President Elliott Cordo Chief Architect September 30, 2015, Javits Center, New York City...

Joe Caserta, President

Elliott Cordo, Chief Architect

September 30, 2015, Javits Center, New York City

Building a Data Lake for Digital Music Dominance

Big Data Strategy

Innovation

Technical Implementation

Awards and Recognition

The Music Maze

Build a Dynamic Platform – Paradigm Shift

OLD WAY:
• Structure → Ingest → Analyze
• Fixed Capacity
• Monolith

NEW WAY:
• Ingest → Analyze → Structure
• Dynamic Capacity
• Ecosystem

RECIPE:
• Cloud
• Data Lake
• Polyglot Warehouse

Move to the Cloud

Existing On-Premise Solution
• Challenges with operations of Hadoop servers in Data Center
• Increasing infrastructure complexity
• Keeping up with data growth

Cloud Advantages
• Reduced upfront capital investment
• Faster speed to value
• Elasticity

“Those that go out and buy expensive infrastructure find that the problem scope and domain shift really quickly. By the time they get around to answering the original question, the business has moved on.” - Matt Wood, AWS

Cost savings of dynamic capacity

Elasticity not only saves money

Essentially, Servers Suck

But more importantly, think Infrastructure as Code
• Your servers should be API calls
• Use stateless processes
• Make all resources ephemeral
• Make everything scalable and elastic!

Ephemeral?

Disposable:
• Processing Fleets
• Elastic Map Reduce Clusters
• Redshift Clusters

Use distributed services and systems to maintain state and preserve your data:
• Cassandra, Dynamo
• S3
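As a rough illustration of "disposable" resources, the sketch below wraps a cluster's lifetime in a context manager so it is always terminated when the work finishes, even on failure. `FakeCloud` is a hypothetical in-memory stand-in, not a real provider API, and the instance type string is only an example value.

```python
from contextlib import contextmanager


class FakeCloud:
    """Hypothetical in-memory stand-in for a cloud provider API."""

    def __init__(self):
        self.running = {}
        self._next_id = 0

    def launch(self, instance_type, count):
        # Record the cluster and hand back an identifier, as an API call would.
        self._next_id += 1
        cluster_id = f"cluster-{self._next_id}"
        self.running[cluster_id] = {"instance_type": instance_type, "count": count}
        return cluster_id

    def terminate(self, cluster_id):
        # Terminating an unknown cluster is a no-op, so retries are safe.
        self.running.pop(cluster_id, None)


@contextmanager
def ephemeral_cluster(cloud, instance_type, count):
    """Launch a cluster, yield its id, and always terminate it afterwards."""
    cluster_id = cloud.launch(instance_type, count)
    try:
        yield cluster_id
    finally:
        cloud.terminate(cluster_id)


cloud = FakeCloud()
with ephemeral_cluster(cloud, "m3.xlarge", 4) as cluster_id:
    print("running:", cluster_id, cloud.running[cluster_id])
print("left running after the job:", cloud.running)
```

The point of the design is that nothing outlives the job: the cluster exists only inside the `with` block.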
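One way to picture a stateless process whose state lives in a durable store: in this sketch plain dictionaries stand in for input and output buckets (an assumption made purely for illustration), and the worker function keeps nothing between calls, so any worker on any server can retry any key.

```python
def process_object(input_store, output_store, key):
    """Transform one object; all state lives in the stores, none in the worker.

    Returns True if work was done, False if the output already existed.
    """
    if key in output_store:
        # Already processed by some worker: safe to skip on retry.
        return False
    raw = input_store[key]
    # Placeholder transformation, standing in for real refinement logic.
    output_store[key] = raw.strip().lower()
    return True


# Dictionaries stand in for durable object storage in this sketch.
input_store = {"plays/0001.txt": "  Track-A  ", "plays/0002.txt": "TRACK-B"}
output_store = {}

for key in input_store:
    process_object(input_store, output_store, key)

print(output_store)
```

Because the output store is checked first, rerunning the loop after a worker dies repeats no work and loses no data.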

Anatomy of our Processing Fleet

S3 Input Buckets

Auto-scaling

S3 Output Buckets

Elastic Map Reduce

Hadoop on Demand
• No Operations – your cluster dies, so what?
• Bootstrap whatever processing engine makes sense
• Programmatically estimate instance type and cluster
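A minimal sketch of what programmatic estimation might look like, assuming a simple throughput heuristic. The default throughput, target runtime, and node cap are hypothetical placeholders, not figures from the talk, and the choice of instance type is left out of scope here.

```python
import math


def estimate_node_count(input_gb, gb_per_node_hour=50.0, target_hours=1.0, max_nodes=100):
    """Estimate how many nodes are needed to process input_gb within target_hours.

    All defaults are illustrative assumptions; real values would come from
    measuring past jobs.
    """
    if input_gb < 0 or gb_per_node_hour <= 0 or target_hours <= 0 or max_nodes < 1:
        raise ValueError("sizes and rates must be positive")
    needed = math.ceil(input_gb / (gb_per_node_hour * target_hours))
    # Always launch at least one node, and never exceed the configured cap.
    return max(1, min(needed, max_nodes))


print(estimate_node_count(500))
print(estimate_node_count(500, target_hours=2.0))
```

In practice the estimate would feed the cluster launch request, so each job gets a cluster sized for its own input rather than a fixed fleet.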

You May Need Some Persistent Servers

If at all possible they should be inherently scalable, distributed, and elastic

Move to a Data Lake Paradigm

Technology:
• Scalable distributed storage: S3
• Pluggable fit-for-purpose processing: EMR

Functional Capabilities:
• Remove barriers from data ingestion and analysis
• Storage and processing for all data
• Tunable Governance

Ingest Raw Data

Organize, Define, Complete

Munging, Blending

Machine Learning

Data Quality and Monitoring

Metadata, ILM, Security

Data Catalog

Data Integration

Fully Governed (trusted)

Arbitrary/Ad-hoc Queries and Reporting

Big Data Warehouse

Data Science Workspace

Data Lake – Integrated Sandbox

Landing Area – Source Data in “Full Fidelity”

Usage Pattern Data Governance

Metadata, ILM, Security

Putting it together: The Big Data Pyramid

Data Ingestion and Onboarding

• Incoming to S3:
  – Lightweight API wrapper
  – Web front end
  – Direct writes to S3

• Ingest the data in a reasonable partitioning schema: Bucket and Keys

• Turn analysts and data scientists loose: late-bind analytics
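The partitioning bullet above could be realized with a key-naming helper like the one sketched here. The `source/dataset/year=/month=/day=` layout and the example names are assumptions chosen for illustration, not the scheme used in the talk.

```python
from datetime import datetime, timezone


def build_key(source, dataset, event_time, filename):
    """Build a date-partitioned object key.

    event_time must be timezone-aware; it is normalized to UTC so that
    partitions do not depend on where the writer runs.
    """
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    ts = event_time.astimezone(timezone.utc)
    return (
        f"{source}/{dataset}/"
        f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/{filename}"
    )


# Hypothetical source, dataset, and file names.
key = build_key(
    "streaming-partner",
    "plays",
    datetime(2015, 9, 30, 12, 0, tzinfo=timezone.utc),
    "part-0001.json",
)
print(key)
```

Keys laid out this way let a job read a single day or month by prefix instead of scanning the whole bucket.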

But we need to feed the cash register

• Data needs to be refined and mapped:
  – Processing Fleet
  – EMR

• 80/20 rule: metadata driven when possible
• Abstract away “Big Data”
• And make sure it’s right!
  – Automated data quality checks using HAMBOT, soon to be open sourced
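To make "metadata driven" data quality checks concrete, here is a small sketch in which the rules are data rather than code. It is not HAMBOT and does not reflect its interface; the rule format, column names, and check names are all hypothetical.

```python
# Rules are plain metadata: adding a check means adding a row, not writing code.
RULES = [
    {"column": "track_id", "check": "not_null"},
    {"column": "play_seconds", "check": "min", "value": 0},
]


def run_checks(rows, rules):
    """Apply each rule to each row; return (row_index, column, check) failures."""
    failures = []
    for index, row in enumerate(rows):
        for rule in rules:
            value = row.get(rule["column"])
            if rule["check"] == "not_null":
                ok = value is not None
            elif rule["check"] == "min":
                ok = value is not None and value >= rule["value"]
            else:
                raise ValueError(f"unknown check: {rule['check']}")
            if not ok:
                failures.append((index, rule["column"], rule["check"]))
    return failures


rows = [
    {"track_id": "t1", "play_seconds": 30},
    {"track_id": None, "play_seconds": -5},
]
for failure in run_checks(rows, RULES):
    print("failed:", failure)
```

Keeping the rules in metadata is what lets the common 80% of checks be configured rather than hand-coded for each new data source.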

“…any decent sized enterprise will have a variety of different data technologies for different kinds of data. There will still be large amounts of it managed in relational stores, but increasingly we'll be first asking how we want to manipulate the data and only then figuring out what technology is the best bet for it.” - Martin Fowler

Think Data Ecosystem, Not Tech Stack

Polyglot in Practice

Best practices from traditional EDW
• Consolidation
• Data Governance
• Master Data
• Tuned for analytics

Applied to:
• Fit-for-purpose technologies and approaches
• Relational, MPP, Graph, KV, TimeseriesDB, Data Lake
• Apply “tunable governance” and traditional principles

Use the right tool for the job

The Landscape for Digital Dominance

Landing Queue

Data Lake

Data Science

Data Providers

Near Real-time

Data Science Clusters

EDW

Graph

RDS Metastore

Joe Caserta
President, Caserta Concepts
joe@casertaconcepts.com
@joe_Caserta

Elliott Cordo
Chief Architect, Caserta Concepts
elliott@casertaconcepts.com

• Award-winning company
• Transformative Data Strategies
• Modern Data Engineering
• Advanced Architecture

• Innovation Partner
• Strategic Consulting
• Advanced Technical Design
• Build & Deploy Solutions

• BDW Meetup
• New York City
• 3,000+ members
• Knowledge sharing

Data is not important, it’s what you do with it that’s important!

Thank You
