Taming the Beast: Extracting Value from Hadoop

Post on 22-Jan-2018

John L Myers

Enterprise Management Associates

Managing Research Director

JMyers@EnterpriseManagement.com

@johnlmyers44

Taming the Beast:

Extracting Value from Hadoop

Ingo Mierswa

RapidMiner

Founder & CTO

imierswa@rapidminer.com

Panel Moderator

Lyndsay Wise, Research Director, EMA

Lyndsay has over 10 years of experience in software research, BI consulting, and strategy development, specializing in software evaluation and best-fit solution selection. Her focus at EMA is on data integration, data governance, cloud technologies, data visualization, analytics, and collaboration.

Slide 2 © 2015 Enterprise Management Associates, Inc.

Featured Speakers

John Myers, Managing Research Director, EMA

John has over 10 years of experience working in areas related to business analytics in professional services consulting and product development roles. Additionally, John helps organizations solve their business analytics problems, whether they relate to operational platforms – such as customer care or billing – or applied analytical applications – such as revenue assurance or fraud management.

Ingo Mierswa, Founder & CTO, RapidMiner

Ingo, an industry-veteran data scientist, is the founder and CTO of RapidMiner, the industry’s #1 open source platform for predictive analytics. Ingo is passionate about the technological innovation enabled by the open source community and envisions a world where easy-to-use predictive analytics software empowers all business analysts and data scientists. Ingo is the author of numerous award-winning publications about predictive analytics and big data, and has spoken at countless industry events.

Logistics for Today’s Webinar

Questions

• Log questions in the Q&A panel located on the lower right corner of your screen

• Questions will be addressed during the Q&A session of the event

Event Recording

• An archived version of the event recording will be available at www.enterprisemanagement.com

Event Presentation

• A PDF of the PowerPoint presentation will be available

Join the Conversation…

Submit your questions or comments to the panel using: @wiseanalytics @johnlmyers44 @rapidminer #predictiveanalytics

Topic #1: Issues With Data Lakes

Adoption of Hadoop-based Data Lake Architectures

Topic #2: Obstacles Implementing Analytics On Hadoop

Obstacles Implementing Analytics

Topic #3: Processing Requirements for Predictive Analytics

Required Processing and Compute Latency for Big Data Projects

©2015 RapidMiner, Inc. All rights reserved. - 12 -

Architecture of Hadoop

Orchestration node

Worker nodes
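
As a rough illustration of the orchestration/worker split named on this slide, the Python sketch below has one orchestrating function hand data partitions to a pool of workers and then merge their partial results. It uses only the standard library, with local threads standing in for cluster nodes. The function names and sample data are hypothetical, and none of this is Hadoop's own API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def worker_count(partition):
    """Worker step: count the values found in one partition only."""
    return Counter(partition)


def orchestrate(partitions, max_workers=3):
    """Orchestration step: distribute partitions, then merge partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each partition is processed independently by a worker thread.
        for partial in pool.map(worker_count, partitions):
            total.update(partial)
    return total


# Hypothetical sample data: three partitions of viewing events.
partitions = [["news", "sports"], ["news", "film"], ["news"]]
totals = orchestrate(partitions)
print(totals["news"])  # prints 3
```

The point of the split is that each worker only sees its own partition, while the orchestrator only sees small partial results.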

Leverage Hadoop’s Compute Capacity

• Design advanced analytics workflows in your predictive analytics platform

• Ensure your solution automatically translates predictive analytics needs into native Hadoop code, e.g., MapReduce, Hive, Pig, or Spark

• Push predictive analytic instructions into your Hadoop cluster

• Hadoop performs calculations across the entire Hadoop cluster for a holistic view of your data

• Data remains in Hadoop

• Results are delivered to the business

• Recommendations

– GUI workflow language (code-free)

– Don’t forget about security

[Diagram: the Predictive Analytics Platform sends analytic instructions translated to native Hadoop; Hadoop runs the calculations and returns results; results are operationalized in business processes]
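
The push-down idea on this slide can be sketched as follows: a workflow step designed outside the cluster is translated into a query string that the cluster executes, so the data stays in Hadoop and only the aggregated result travels back. This is a minimal Python sketch under stated assumptions. `AggregateStep`, `to_hiveql`, and the table and column names are hypothetical, the generated text is ordinary HiveQL, and this is not a description of how RapidMiner performs the translation.

```python
import re
from dataclasses import dataclass

ALLOWED_FUNCTIONS = {"AVG", "SUM", "COUNT", "MIN", "MAX"}
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


@dataclass(frozen=True)
class AggregateStep:
    """Hypothetical workflow step: group a table and aggregate one column."""
    table: str
    group_by: str
    column: str
    function: str


def _checked(name):
    """Reject anything that is not a plain identifier (basic injection guard)."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def to_hiveql(step):
    """Translate the step into a HiveQL string to be run inside the cluster."""
    function = step.function.upper()
    if function not in ALLOWED_FUNCTIONS:
        raise ValueError(f"unsupported aggregate: {step.function}")
    table = _checked(step.table)
    group_by = _checked(step.group_by)
    column = _checked(step.column)
    return (
        f"SELECT {group_by}, {function}({column}) AS result "
        f"FROM {table} GROUP BY {group_by}"
    )


# Hypothetical step: average viewing minutes per channel.
step = AggregateStep(table="viewers", group_by="channel",
                     column="minutes", function="avg")
query = to_hiveql(step)
print(query)
```

The identifier check is a small nod to the slide's security recommendation: generated queries should never carry unvalidated user input into the cluster.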

Topic #4: Successful Big Data Analytics Projects

Project Success

OPERATIONALIZE Predictive Decisions

Close the Loop Between Insight and Action

Embed predictive models into critical business processes

Recommend best options for human or automated actions

Topic #5: Best Practices For Implementing Advanced/Modern Analytics

EFFORTLESS Predictive Analytics

Immediately Empower Analysts to Anticipate Opportunity & Risk

Easily Combine Any Data at Unlimited Scale with Any Model

Code-Free, Lightning-Fast, and Intuitive

Topic #6: Use Of Mixed Environments For Implementation Of Big Data Analytics

Growing Importance of Cloud Resources

Design Once, Deploy ANYWHERE

Leverage Investments in Existing and Future Systems

Design predictive analytics independent of platforms

Seamlessly execute predictive analytics in-memory or in any source, including data-at-rest or data-in-motion

Topic #7: Evolving Role of the Data Consumer

What We Used to Think of Analytical Users

Empowering the Line of Business

Topic #8: Use Cases – Monetizing Insights Buried In Your Multi-Structured Data

Drive Broadcast Revenue and Customer Retention

Challenge: Better understand TV viewing habits to prevent churn and optimize advertising

Solution: Process Big Data from three million TV viewers, in real time, to make program recommendations and personalized advertising

<5 s: time to generate high-value activities based on predictive analytics

“RapidMiner allows us to leverage Big Data, in real-time.”

-- Avi Bernstein, Professor at the University of Zurich, Department of Informatics

Track Data from Millions of Companies to Identify Critical Economic Drivers

Challenge: Monitor corporate performance data in real time to identify correlations, outliers, and economic drivers

Solution: Use RapidMiner to mash up data of UK businesses, rapidly prototype predictive models, and identify outlying, unusual data

4.5 M: subject matter experts’ content analyzed in the United Kingdom every single day

“We benefit from the availability of community extensions via the RapidMiner Marketplace. We can easily search for what others have designed in RapidMiner, and use the extensions that are a fit for us.”

-- Tom Gatten, CEO

Where To Go From Here?

• Data lakes are an emerging data management architecture

• There are issues with fully realizing value from data lakes

• Following best practices and patterns helps

Q&A – Please Log Questions in the Q&A Panel

• Visit RapidMiner.com to learn more about Effortless Predictive Analytics

• Learn more about leading IT analyst firm Enterprise Management Associates (EMA) at enterprisemanagement.com