Post on 15-Aug-2015
Hadoop in the Cisco Cloud
Kartik Kanakasbesan
kartikka@cisco.com
© 2015 Cisco and/or its affiliates. All rights reserved. Cisco PublicPresentation ID
Agenda
• Introduction
• What are Cisco Cloud Services?
• Cisco’s Big Data as a service
• Why Hadoop in the Cloud?
• Use Cases for Hadoop in the Cloud
• Customer Experiences so far
• Going forward
The Intercloud: The globally connected network of clouds
Enterprise Private Clouds
Public Clouds
Intercloud Alliance
Intercloud Services
INTERCLOUD
Intercloud Providers
CIMK v1.0
The Intercloud: Customer Value
INTERCLOUD
CONTROL: Across services, in every location
COMPLIANCE: Manage risk locally and globally
CHOICE: Cloud the way you need it
Public Clouds
Enterprise Private Clouds
Intercloud Alliance
Intercloud Services
Intercloud Providers
with…
and…
CIMK v1.0
Intercloud Services
Cisco & OpenStack – Delivering value
Cisco
• Worldwide leader in cloud building, cloud services and managed services
• Extensive global partner network
OpenStack
• 430 companies and growing
• 17000+ individual members
• 2655 cumulative contributions
Customer Value
• Global scale
• Workload mobility
• Open standards
Cisco Intercloud Services
OpenStack and standards based cloud
OPEN STANDARDS
Self-service infrastructure enabling application lifecycle
GLOBAL SCALE
PUBLIC CLOUD
PLATFORM APIS
Workload mobility with control and compliance
Empowering developers and cloud-scale applications
RAPID INNOVATION
Best-of-breed of Cisco’s products and best practices
Cisco Intercloud Services: Target Customers
Enterprise
• Hybrid workloads requiring common network and security policies
Network-Based Service Providers
• Value-added services with NGN/NFV
• Federation capabilities
Developers
• SaaS, network-centric workloads
• IoT/IoE, SP Video, Collaboration, and Mobility workloads
IoT World Forum Reference Model: Levels
Center
7. Collaboration & Processes (Involving People & Business Processes)
6. Application (Reporting, Analytics, Control)
5. Data Abstraction (Aggregation & Access)
4. Data Accumulation (Storage)
3. Edge Computing (Analysis & Transformation)
2. Connectivity (Communication & Processing Units)
1. Physical Devices & Controllers (The “Things” in IoT)
Edge
Sensors, Devices, Machines, Intelligent Edge Nodes of all types
IoT World Forum Reference Model
Center: Query Based, Data at Rest, Non-real Time
Edge: Event Based, Data in Motion, Real Time
Sensors, Devices, Machines, Intelligent Edge Nodes of all types
Cisco Intercloud services – platform
Cloud-Centric Networking, Security, Policy
Core Cloud Services Analytics Building Blocks
Application Enablement Tools
Enterprise/Hybrid Services
Cisco Micro Services
Collaboration SP Video Analytics
Inter-Region Virtual Private Backbone
Automated Private VPN Connectivity
Network-optimized Workload Placement
Managed Public + Private Cloud
Marketplace for Cisco, Third Party ISV, Enterprise Applications and Services
Third Party Open Source Tools
Security Network/Device Management IOE
Global SP Backbone
Dashboard
Basic Cloud Resource Monitoring
Network Performance Metrics
Deployment/ Management Tools, PaaS
VNF Library Orchestration, Auto-Scaling
Federated Network & Security Policies
Compute Storage Database Virtual Network LB VPN Hadoop
Service Chaining
Data Virtualization
Intercloud Fabric support for heterogeneous environments
Data Ingest
App-Level Sovereignty, Privacy Policies
Vision for Cisco Cloud Provided Data Service*
Marketplace: 3rd party algorithms etc.
OpenStack APIs, SQL, REST
CIS Provided Data Services
• Data Ingestion as a service
• Hadoop as a service
• Machine Learning as a service
• Data Warehouse services
• Data Virtualization services
• Other…
IOE/IOT Applications
• Proactive Maintenance
• Manufacturing Apps
• Machine as a service
• Oil and Gas
Service Provider Analytics Apps
• Network Diagnostic
• Service Provider Analytics
• Customer Loyalty Analytics
• Feature Analytics
Collaboration Analytics Apps
• Telepresence Analytics
• Collaboration Analytics
• Social Analytics
• Sentiment Analysis
Other Applications
• Marketing Apps
• Availability Analytics
• Demand Planning
• Sentiment Analysis
Data Sources
• Deliver an integrated and managed environment of these primitives
• Deliver analytics applications to customers (hybrid, on-premises, or software as a service)
• Remove the burden of managing the infrastructure
• Allow organizations and lines of business to focus on market opportunities and develop analytics applications
*Subject to change based on market feedback
Cisco’s Big Data as a service*
• Hadoop as a service (aaS)
• Data Ingestion-aaS
• Visualization-aaS
• Machine learning-aaS
• Data Virtualization-aaS
• Analytics-aaS
• All these services need
  • Provisioning
  • Monitoring
  • Scaling
• Consumption model
  • Integrated
  • Individually
• Cisco Branded service
  • Minimal flexibility on vendor choice
• Provisioned on Big Optimized instances
  • Local Storage
  • Object and Block Storage
*Subject to change based on market feedback
Cisco’s Hadoop as a service*
HaaS: Hadoop, Reliable, Secure & Monitored
OpenStack
Cisco’s Hadoop as a service
• Provides Cloudera’s market-leading Hadoop distribution
• Flexibility to deploy Hadoop-optimized templates for streaming and batch processing
• Data ingest with Apache Kafka
• Supports the Apache Spark stack
  • Core, SQL, MLlib, & GraphX
  • Running on YARN
• Secure access to Hadoop APIs
• Integrates with on-premises Hadoop distributions (if needed)
*Subject to change based on market feedback
Does Hadoop in the Cloud make sense?
Why Hadoop in the Cloud makes sense
• Reducing barriers in adopting Hadoop
• Cloud and Hadoop provide the perfect solution to “test” the Hadoop waters
• Help customers build IoE/IoT applications faster with Cisco’s solutions
• Run your Hadoop dev/test workloads in the cloud and provision them on premises
• Leverage Cisco’s Networking capabilities as a differentiator for your capabilities
• Provide consistent policies on the cloud just like on-premises
• Provide a scalable, reliable, and secure environment
Hadoop as a service (HaaS): Market size & forecast
• $16B by 2020*
• Targeted to grow at a 70.8% CAGR
• More than 20 players in the market
  • Highly fragmented: Amazon, Azure, IBM, Google, Rackspace, and many more
• North America is the leading market
  • Europe is further behind
• AP markets are maturing fast
*Source: GigaOM
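The forecast above states an end value ($16B by 2020) and a growth rate (70.8% CAGR) but not the base year. As a rough sketch of how the two figures relate, the snippet below applies the standard compound-growth formula, future = base × (1 + rate)^years. The function names and the six-year horizon are assumptions made for illustration, not figures from the GigaOM source.

```python
def project(base: float, cagr: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1.0 + cagr) ** years


def implied_base(future: float, cagr: float, years: int) -> float:
    """Invert the compound-growth formula to recover the starting value."""
    return future / (1.0 + cagr) ** years


if __name__ == "__main__":
    FORECAST_2020_BUSD = 16.0   # from the slide
    CAGR = 0.708                # from the slide
    YEARS = 6                   # assumption: forecast window starts in 2014
    base = implied_base(FORECAST_2020_BUSD, CAGR, YEARS)
    print(f"Implied starting market size: ${base:.2f}B")
```

Changing `YEARS` shows how sensitive the implied starting market size is to the unstated base year.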
Use Case: Preventative Maintenance
Lambda architecture in the Cloud for IoT/IoE (elastically scalable and secure)
• 1000’s of robots streaming messages (structured & unstructured data)
• Data aggregation at the plant floor done by a Cisco UCS box
• Data ingestion
• Data processing: Hadoop, in-memory database
• In-memory querying: real-time query with low latency
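The Lambda architecture named on this slide pairs a batch layer, which recomputes views from the full data set, with a speed layer that covers messages the batch layer has not yet processed. The toy sketch below illustrates that idea only. The class, its method names, and the per-robot message count are hypothetical, and plain in-memory Python stands in for Hadoop and the in-memory database.

```python
from collections import Counter


class LambdaPipeline:
    """Toy Lambda architecture: append-only log, batch view, speed view."""

    def __init__(self):
        self.master_log = []         # immutable, append-only record of every message
        self.batch_view = Counter()  # recomputed from the full log by the batch layer
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, robot_id, message):
        """Ingestion: store the raw message and update the real-time view."""
        self.master_log.append((robot_id, message))
        self.speed_view[robot_id] += 1

    def run_batch(self):
        """Batch layer: rebuild the view from scratch, then reset the speed view."""
        self.batch_view = Counter(robot_id for robot_id, _ in self.master_log)
        self.speed_view.clear()

    def query(self, robot_id):
        """Serving layer: merge the batch and speed views for a low-latency answer."""
        return self.batch_view[robot_id] + self.speed_view[robot_id]


if __name__ == "__main__":
    pipeline = LambdaPipeline()
    pipeline.ingest("robot-1", {"temp_c": 71})
    pipeline.ingest("robot-1", {"temp_c": 74})
    pipeline.run_batch()
    pipeline.ingest("robot-1", {"temp_c": 90})
    print(pipeline.query("robot-1"))
```

Queries stay correct between batch runs because the speed view only has to cover the messages that arrived after the most recent recomputation.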
Use Case: Omni-Channel Customer Journeys
The customers’ interaction with Cisco across multiple touch points to get the desired business outcome.
Channels
• Server Logs
• Social & Chat
• Mobile
• Call Center
Event Streams
Interaction Touch Points
• S/W Download
• Open Trouble Ticket
• Assign Engineer
• Update Trouble Ticket
• Close Trouble Ticket
• Resolve Trouble Ticket
• Read Support Documents
• View Design Documents
• View Tech Documents
• New Registration
• Bug Search
• FAQs
• Contract Details
• Product Details
• Device Coverage
Journey
• Case Resolution
• Software Upgrade
Pilot Test Data
• Test performed on one day’s production data
• Total no. of records processed – 110,852,667
• Total data ingest size – 32GB/day
• Total no. of M/R jobs in the data pipeline – 17
• Two test cycles
• Cycle 1: Heterogeneous CCS nodes (vCPUs, storage, memory)
• Cycle 2: Homogeneous CCS nodes
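For a sense of scale, the pilot figures above can be turned into an average record size and a sustained ingest rate. This is a back-of-the-envelope sketch: it assumes "32GB" means decimal gigabytes and that ingest is spread evenly over 24 hours, neither of which the slide states.

```python
RECORDS_PER_DAY = 110_852_667        # from the slide
INGEST_BYTES_PER_DAY = 32 * 10**9    # from the slide; decimal GB is an assumption
SECONDS_PER_DAY = 24 * 60 * 60


def avg_record_bytes(total_bytes: int, records: int) -> float:
    """Average size of one record in bytes."""
    return total_bytes / records


def per_second(daily_total: float) -> float:
    """Daily total converted to a per-second rate, assuming an even spread."""
    return daily_total / SECONDS_PER_DAY


if __name__ == "__main__":
    size = avg_record_bytes(INGEST_BYTES_PER_DAY, RECORDS_PER_DAY)
    print(f"Average record size: {size:.0f} bytes")
    print(f"Records per second:  {per_second(RECORDS_PER_DAY):.0f}")
    print(f"Ingest rate:         {per_second(INGEST_BYTES_PER_DAY) / 1e6:.2f} MB/s")
```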
AWS to CIS Migration – Success Criteria
Successful synthesis of customer interaction data
Successful automation of the end-end data process pipeline
Build behavioral insight services
Access to data and services via data discovery and visualization tools
Meet the performance, scale and platform stability requirements
Successful deployment of CiscoDV on CIS
Connect HDFS and Hive DS with CiscoDV via Hive and Impala
Build and expose insight services for consumption by limited users
AWS and CIS Data Node Sizing Comparison: Hadoop Cluster for Batch and Query Analytics
AWS Sizing
Node Service | AWS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | m3.2xlarge | 8 | 30 | 2x80 GB | 30 | Each Hadoop data node has 1500GB of EBS available for HDFS storage
CCS Sizing
Node Service | CCS Instance Type | vCPU | Mem | Storage | Number of Data Nodes | Comments
Data Nodes/Node Master | GP-2XLarge | 8 | 32 | 50 | 35 | Each Hadoop data node has 1500GB of volume storage available for HDFS storage
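The per-node figures in the sizing comparison can be rolled up into cluster totals to compare the two deployments side by side. The sketch below does that arithmetic. The `ClusterSizing` class is hypothetical, memory is assumed to be in GB, and the usable-capacity figure assumes the HDFS default replication factor of 3, which the slide does not confirm for either cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSizing:
    name: str
    data_nodes: int
    vcpus_per_node: int
    mem_per_node: int       # as listed on the slide; unit assumed to be GB
    hdfs_gb_per_node: int   # volume storage available for HDFS on each node

    def totals(self, replication: int = 3) -> dict:
        """Aggregate per-node figures; replication=3 is the HDFS default (assumed)."""
        raw_gb = self.data_nodes * self.hdfs_gb_per_node
        return {
            "vcpus": self.data_nodes * self.vcpus_per_node,
            "mem": self.data_nodes * self.mem_per_node,
            "raw_hdfs_gb": raw_gb,
            "usable_hdfs_gb": raw_gb / replication,
        }


AWS = ClusterSizing("AWS m3.2xlarge", 30, 8, 30, 1500)
CCS = ClusterSizing("CCS GP-2XLarge", 35, 8, 32, 1500)

if __name__ == "__main__":
    for cluster in (AWS, CCS):
        print(cluster.name, cluster.totals())
```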
CIS Performance of Batch Analytics – Limited Test
Test Details by M/R job
Columns: number of CCS nodes and test cycle (c1 = cycle 1, c2 = cycle 2)
Job Name                    12 (c1)  18 (c1)  24 (c1)  30 (c1)  18 (c2)  24 (c2)  30 (c2)  35 (c2)
New_cleanse                 249      176      143      117      82       67       55       51
Process_private_ip          27       14       11       10       7        5        6        6
join_web_and_ip_data        142      95       76       61       49       40       34       29
combine_ip_decorated_files  26       14       11       10       9        7        8        7
filterBotEntries            34       19       15       13       10       8        7        7
sessionize                  71       64       69       62       60       63       15       13
firstActivitiesFilter       26       15       13       10       9        8        6        6
allOtherActivitiesFilter    29       18       13       13       11       9        7        6
matchFirstActivities        21       13       11       13       13       11       8        8
buildActivities             27       15       12       10       7        6        9        9
filterBUG                   8        5        3        2        3        3        4        4
filterSEA                   8        5        3        2        3        3        4        4
filterTCO                   8        5        3        2        3        3        4        4
filterTDV                   8        5        3        2        3        3        4        4
filterWDV                   8        5        3        2        3        3        4        4
filterMOD                   8        5        3        2        3        3        4        4
filterTOOL                  8        5        3        2        3        3        4        4
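To compare configurations at a glance, the per-job timings can be summed into a pipeline total per configuration. The sketch below copies the values from the table. The slide does not state the time unit, so the totals are in that same unstated unit, and the configuration labels (`12n-c1` and so on) are shorthand introduced here.

```python
CONFIGS = ["12n-c1", "18n-c1", "24n-c1", "30n-c1",
           "18n-c2", "24n-c2", "30n-c2", "35n-c2"]

# Timings copied from the slide; the time unit is not stated there.
RAW = """
New_cleanse 249 176 143 117 82 67 55 51
Process_private_ip 27 14 11 10 7 5 6 6
join_web_and_ip_data 142 95 76 61 49 40 34 29
combine_ip_decorated_files 26 14 11 10 9 7 8 7
filterBotEntries 34 19 15 13 10 8 7 7
sessionize 71 64 69 62 60 63 15 13
firstActivitiesFilter 26 15 13 10 9 8 6 6
allOtherActivitiesFilter 29 18 13 13 11 9 7 6
matchFirstActivities 21 13 11 13 13 11 8 8
buildActivities 27 15 12 10 7 6 9 9
filterBUG 8 5 3 2 3 3 4 4
filterSEA 8 5 3 2 3 3 4 4
filterTCO 8 5 3 2 3 3 4 4
filterTDV 8 5 3 2 3 3 4 4
filterWDV 8 5 3 2 3 3 4 4
filterMOD 8 5 3 2 3 3 4 4
filterTOOL 8 5 3 2 3 3 4 4
"""


def parse(raw: str) -> dict:
    """Map each job name to its list of timings, one per configuration."""
    table = {}
    for line in raw.strip().splitlines():
        name, *values = line.split()
        table[name] = [int(v) for v in values]
    return table


def pipeline_totals(table: dict) -> dict:
    """Sum every job's timing column by column, keyed by configuration label."""
    return dict(zip(CONFIGS, (sum(col) for col in zip(*table.values()))))


TABLE = parse(RAW)
TOTALS = pipeline_totals(TABLE)

if __name__ == "__main__":
    for config, total in TOTALS.items():
        print(f"{config}: {total}")
```

The totals fall from 708 on 12 nodes (cycle 1) to 170 on 35 nodes (cycle 2). The two cycles used different node types, so the comparison across cycles reflects more than node count alone.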
In addition to Hadoop: Cisco Data Virtualization
• Discover data beyond the enterprise: Virtual integration that combines traditional enterprise data, Big Data stores on CIS and AWS, cloud data from SaaS providers, and Cisco customers and partners
• Seamless interoperability offers easy access to data across distributed data sources in the Intercloud analytics platform
• Universal data governance maximizes enforcement of data security rules
• Analytics Data Hubs: Deployment flexibility to build hybrid/virtual sandboxes that enable nimble data discovery and rapid data analytics to support multiple LOBs
CiscoDV on Intercloud Analytics Platform (CIS)
Scenario 1: CIS CiscoDV to Cisco Enterprise Data Store
Scenario 2: CIS CiscoDV to Impala and Hive on CIS Intercloud Analytics Platform
Scenario 3: CIS CiscoDV to Hive on AWS Big Data Cluster
Sample result for Cisco DV
Going forward
• Cisco’s Hadoop service is available only to select customers before General Availability
• Part of the broader Cisco Big Data as a service play
• Let us know the kinds of tools you use for visualization and machine learning
• How can we address your data challenges together?
Questions?
Thank you