Hortonworks Data Platform - Data Integration Services with HDP
Introduction to Hortonworks Data Platform for Windows
Transcript of Introduction to Hortonworks Data Platform for Windows
© Hortonworks Inc. 2013
Quick Housekeeping Rules
• Q&A panel is available if you have any questions during the webinar
• There will be time for Q&A at the end
• We will record the webinar for future viewing
• All attendees will receive a copy of the slides and recording
Introducing Hortonworks Data Platform for Windows
Enterprise Apache Hadoop for Windows Environments
March 2013
Our Speakers
John Kreisa
VP, Strategic Marketing
Saptak Sen
Sr. Product Manager
Rohit Bakshi
Product Manager
Agenda
• Why Hadoop on Windows?
• Hortonworks Data Platform for Windows
• Microsoft - Big Data and Apache Hadoop
• Hortonworks Data Platform under the covers
• Q&A
Polling Question
Where are you with Hadoop?
__ We are running it in production
__ We have it running in our labs
__ We are just investigating Hadoop
__ What is Hadoop?
Why Apache Hadoop on Windows?
• According to IDC, Windows Server held 73% market share in 2012
– Hadoop was traditionally built for Linux servers, so there are a large number of underserved organizations
• Apache Hadoop: de facto platform for processing massive amounts of unstructured data
– Complementary to existing Microsoft technologies
– There is a huge untapped community of Windows developers and ecosystem partners
• A strong Microsoft-Hortonworks partnership and 18 months of development make this a natural next step
What Makes Up Big Data?
Data sources by layer, with increasing data variety and complexity:
ERP (Megabytes): Purchase detail; Purchase record; Payment record
CRM (Gigabytes): Segmentation; Offer details; Customer Touches; Support Contacts
WEB (Terabytes): Web logs; Offer history; A/B testing; Dynamic Pricing; Affiliate Networks; Search Marketing; Behavioral Targeting; Dynamic Funnels
BIG DATA (Petabytes): User Generated Content; Mobile Web; SMS/MMS; Sentiment; External Demographics; HD Video, Audio, Images; Speech to Text; Product/Service Logs; Social Interactions & Feeds; Business Data Feeds; User Click Stream; Sensors / RFID / Devices; Spatial & GPS Coordinates
Transactions + Interactions + Observations = BIG DATA
Big Data: Big and Getting Bigger Fast!
• Unstructured data growth exceeds 80% year/year in most enterprises
– Machine-generated data is a key driver in data growth
• IDC projects digital universe will reach 40 zettabytes (ZB) by 2020*
– 1 ZB = 1,000,000,000,000 GBs!
– Projected to increase 15x by 2020
• According to 2012 Barclays CIO study, big data outranks virtualization as #1 spending initiative
*2012 IDC Digital Universe Study
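The unit conversion quoted on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) prefixes, where 1 GB is 10^9 bytes and 1 ZB is 10^21 bytes; the constant and function names are illustrative and do not come from the slides.

```python
# Decimal (SI) byte units. Binary units (GiB, ZiB) would give different figures.
GB = 10**9
ZB = 10**21


def zettabytes_to_gigabytes(zb: int) -> int:
    """Convert a whole number of zettabytes to gigabytes (SI units)."""
    return zb * ZB // GB


one_zb_in_gb = zettabytes_to_gigabytes(1)
projected_2020_in_gb = zettabytes_to_gigabytes(40)

print(f"1 ZB  = {one_zb_in_gb:,} GB")
print(f"40 ZB = {projected_2020_in_gb:,} GB")
```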
Enter Apache Hadoop
OSS that delivers high-scale storage & processing with enterprise-ready platform services
HADOOP CORE
Hortonworkers are the original architects, operators, and builders of core Hadoop
PLATFORM SERVICES Enterprise Readiness
HDFS MAP REDUCE
The core of the next generation data platform…
Introducing HDP for Windows
Hortonworks Data Platform (HDP) for Windows
• 100% Open Source Enterprise Hadoop
• Component and version compatible with Microsoft HDInsight
• Availability
– Beta release available now
– GA early 2Q 2013
OPERATIONAL SERVICES: Manage & Operate at Scale
DATA SERVICES: Store, Process and Access Data
HADOOP CORE: Distributed Storage & Processing
PLATFORM SERVICES: Enterprise Readiness
Hortonworks Data Platform for Windows
• Enterprise-grade Apache Hadoop on Windows
– Enables same experience for Hadoop on Windows & Linux
• More partners, more developers for Hadoop
– Makes native Apache Hadoop available to Windows ecosystem
– More options for Windows-focused organizations
• Hortonworks focus: Enterprise Apache Hadoop for all platforms
– Trusted, reliable, production-ready distribution for on-premises Hadoop on Windows deployments
• Built with joint investment and contributions from Microsoft
– Deep engineering relationship ensures tight integration and maximum performance
HDP: the first and only distribution available on Windows & Linux
Hortonworks: Best In Class Hadoop Support
• Experienced enterprise support team
– Experience supporting enterprise clients in production
– Core engineers have real operational experience: built and supported 44+K nodes in production
– Extensive experience in commercial big data offerings including HDP, MapR, Karmasphere
• Global 24x7 operation
– Support based in Sunnyvale, UK & India
• Stringent case management processes ensure high-quality customer service & responsiveness
Transferring Our Hadoop Expertise to You
The expert source for Apache Hadoop training & certification
• World-class training programs designed to help you learn fast
– Role-based hands-on classes with 50% lab time
– New HDP on Windows course
• Expert consulting services
– Programs designed to transfer knowledge
• Industry-leading Hadoop Sandbox program
– Fastest way to learn Apache Hadoop
– Multi-level tutorials for wide applicability
– Customizable and updateable
Hortonworks Snapshot
• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform
• We engineer, test & certify HDP for enterprise usage
• We employ the core architects, builders and operators of Apache Hadoop
• We drive innovation within Apache Software Foundation projects
• We are uniquely positioned to deliver the highest quality of Hadoop support
• We enable the ecosystem to work better with Hadoop
Develop Distribute Support
We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution
Endorsed by Strategic Partners
Headquarters: Palo Alto, CA
Employees: 180+ and growing
Investors: Benchmark, Index, Yahoo
Microsoft Big Data
Microsoft Big Data:
– Simplifies data management for IT
– Enables IT and users to easily enrich their data with the world’s data
– Delivers agility to end users through familiar tools like Excel
microsoft.com/bigdata
Simplicity for IT
Agility for End Users
Microsoft End-To-End Big Data Platform
Enhancing the Core of Apache Hadoop
Deliver high-scale storage & processing with enterprise-ready platform services
Unique Focus Areas:
• Bigger, faster, more flexible
– Continued focus on speed & scale and enabling near-real-time apps
• Tested & certified at scale
– Run ~1300 system tests on large clusters for every release
• Enterprise-ready services
– High availability, disaster recovery, snapshots, security, …
HADOOP CORE
Hortonworkers are the architects, operators, and builders of core Hadoop
PLATFORM SERVICES Enterprise Readiness
HDFS
MAP REDUCE
WEBHDFS
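WebHDFS, listed among the core components on this slide, exposes HDFS operations over HTTP using URLs of the form `http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. The following is a minimal sketch of how such a URL might be assembled. The host name and HDFS path are hypothetical, the port 50070 is assumed to be the NameNode HTTP port (the usual default in Hadoop 1.x), and the helper `webhdfs_url` is illustrative rather than part of any Hadoop client library.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, hdfs_path: str, op: str, **params: str) -> str:
    """Build a WebHDFS REST URL: http://host:port/webhdfs/v1/<path>?op=<OP>."""
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be an absolute HDFS path")
    # The operation name goes first, followed by any extra query parameters.
    query = urlencode({"op": op, **params})
    # quote() keeps "/" separators intact and escapes other unsafe characters.
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


# Hypothetical NameNode host and directory, used only for illustration.
url = webhdfs_url("namenode.example.com", 50070, "/user/demo/logs", "LISTSTATUS")
print(url)
```

Sending an HTTP GET to a URL like this would normally return a JSON directory listing. The request itself is left out because it needs a running cluster.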
Data Services for Full Data Lifecycle
Provide data services to store, process & access data in many ways
Unique Focus Areas:
• Apache HCatalog
– Metadata services for consistent table access to Hadoop data
• Apache Hive
– Explore & process Hadoop data via SQL & ODBC-compliant BI tools
Hortonworks enables Hadoop data to be accessed via existing tools & systems
DATA SERVICES: HCATALOG, HIVE, PIG, SQOOP
HADOOP CORE: Distributed Storage & Processing
PLATFORM SERVICES: Enterprise Readiness
Operational Services for Ease of Use
OPERATIONAL SERVICES
Include complete operational services for productive operations & management
• Apache Oozie: Manage and schedule job execution for Hadoop jobs
Only Hortonworks provides a complete open source Hadoop management tool
DATA SERVICES: Store, Process and Access Data
HADOOP CORE: Distributed Storage & Processing
PLATFORM SERVICES: Enterprise Readiness
Oozie
Inside HDP for Windows
Hortonworks Data Platform (HDP) for Windows
• 100% Open Source Enterprise Hadoop
• Component and version compatible with Microsoft HDInsight
• Availability
– Beta release available now
– GA early 2Q 2013
OPERATIONAL SERVICES (Manage & Operate at Scale): Oozie
DATA SERVICES (Store, Process and Access Data): HCATALOG, HIVE, PIG, SQOOP
HADOOP CORE (Distributed Storage & Processing): HDFS, WEBHDFS, MAP REDUCE
PLATFORM SERVICES
Seamless Interoperability with Your Microsoft Tools
• Integrated with Microsoft tools for native big data analysis
– Bi-directional connectors for SQL Server and SQL Azure through SQOOP
– Excel ODBC integration through Hive
• Addressing demand for Hadoop on Windows
– Ideal for Windows customers with Hadoop operational experience
• Enables all common Hadoop workloads
– Data refinement and ETL offload for high-volume data landing
– Data exploration for discovery of new business opportunities
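To make the SQL Server connector bullet above more concrete, here is a rough sketch of how a Sqoop import from SQL Server into HDFS might be assembled as a command line. The server, database, user, table, and target directory are all hypothetical, and the helper `sqoop_import_args` is illustrative only. The exact launcher name on Windows (for example a `.cmd` wrapper) is an assumption that should be checked against the installed distribution.

```python
def sqoop_import_args(jdbc_url, username, table, target_dir, mappers=1):
    """Return an argument list for a Sqoop import of one table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC connection string for the source database
        "--username", username,
        "-P",                             # prompt for the password instead of passing it inline
        "--table", table,                 # source table to import
        "--target-dir", target_dir,       # HDFS directory that receives the data
        "--num-mappers", str(mappers),    # number of parallel map tasks
    ]


# Hypothetical connection details, shown only to illustrate the argument layout.
args = sqoop_import_args(
    jdbc_url="jdbc:sqlserver://sqlhost.example.com:1433;databaseName=sales",
    username="etl_user",
    table="Orders",
    target_dir="/data/landing/orders",
    mappers=4,
)
print(args)
```

The argument list could be handed to `subprocess.run` on a machine where Sqoop and the SQL Server JDBC driver are installed. The reverse direction uses `sqoop export` with a similar set of options.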
APPLICATIONS
DATA SYSTEMS
Microsoft Applications
HORTONWORKS DATA PLATFORM for Windows
DATA SOURCES
MOBILE DATA
OLTP, POS SYSTEMS
Traditional Sources (RDBMS, OLTP, OLAP)
New Sources (web logs, email, sensor data, social media)
Demo Time!
Excel integration with HDP
• Interact with HDP through Excel
• Use Data Explorer to explore and turn raw data into valuable information
Maximize Your Hadoop Deployment Choice
• Use HDP for Windows for on-premises deployment on Windows Server
– Ideal for Windows users with Hadoop experience
– Perfect next step for those who are ready to move from POC to production
• Use HDInsight for Microsoft tooling, management and provisioning
– HDInsight Service offers the full benefit of Windows Azure (e.g. elasticity & low cost); available in Preview today
– HDInsight Server offers full integration of Hadoop with Microsoft tools on premises; Developer Preview available today
• Full interoperability and deployment choice across platforms
– Implement big data applications that run on-premises & in the cloud
– Leveraging open source HDP enables seamless interoperability across environments: Linux, Windows, Windows Azure
Next Steps
Download Hortonworks Sandbox: www.hortonworks.com/sandbox
Download Hortonworks Data Platform for Windows (Beta): www.hortonworks.com/download
Follow: @hortonworks, @hortonworks_U