Phases of Big Data Challenges @ Nokia
![Page 1: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/1.jpg)
Yekesa Kosuru, HERE.com
Nokia
Hadoop Innovation Summit February 20 & 21, San Diego 2013
Phases of Big Data Challenges @ Nokia
![Page 2: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/2.jpg)
Agenda
• Phases of Big Data Challenges @ Nokia
– Who we are
– Big data platform
– Use case data flows
– High level architecture
– Challenges
• Phases of challenges
![Page 3: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/3.jpg)
Who we are – disrupting the future
Accelerometer
GPS
Waterproof
12h Battery
Bluetooth
2GB Storage
Barometer
NFC
Gyroscope
Magnetometer
![Page 4: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/4.jpg)
Location Platform, Enabling Contextually Rich Mobile Experiences
Apps
Smart Data
Platform
Content
Positions, Maps, Traffic, Places, Directions, Guidance
![Page 5: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/5.jpg)
Big Data Flows and Differentiates
We Collect User Data…
…on All Supported Platforms…
…to Be Made Available for Analysis
Nokia Account
Big Data Analytics
Enabling feedback loops for continuous improvement, Location Optimized Experience, CRM, etc.
![Page 6: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/6.jpg)
Phase 0
2008 – ‘10: Build Technology Platform, Get Data
![Page 7: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/7.jpg)
Business Challenges
• Data silos, no unique identifiers, missing semantics
• Multiple sources - overlapping, conflicting
• Timely processing of high-volume, high-velocity data
• Partial, insufficient, inaccurate, inconsistent data
• Data/wire formats, security, privacy and other policies unknown
Central Big Data Platform created
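
The "multiple sources - overlapping, conflicting" challenge can be pictured as a merge step that keys records on a shared identifier and resolves conflicts by source precedence. The Python sketch below is only an illustration: the source names, field names, and the precedence rule are hypothetical and are not taken from the talk.

```python
from typing import Dict, List

# Hypothetical precedence: a lower number wins when sources disagree.
SOURCE_PRIORITY = {"survey": 0, "probe": 1, "third_party": 2}


def merge_records(records: List[dict]) -> Dict[str, dict]:
    """Merge records that share an 'id', field by field.

    For each field, the value from the highest-priority source that
    actually provides it (non-None) is kept.
    """
    merged: Dict[str, dict] = {}
    # Apply low-priority sources first so higher-priority ones overwrite them.
    ordered = sorted(
        records, key=lambda r: SOURCE_PRIORITY[r["source"]], reverse=True
    )
    for rec in ordered:
        target = merged.setdefault(rec["id"], {})
        for field, value in rec.items():
            if field in ("id", "source") or value is None:
                continue  # skip bookkeeping fields and missing values
            target[field] = value
    return merged
```

A rule this simple only works once records share an identifier, which is exactly what the "no unique identifiers" bullet says was missing at the start.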
![Page 8: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/8.jpg)
Using different big data sets…
…to verify Map accuracy and create Motion Graph
![Page 9: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/9.jpg)
Big Data Analytics Platform Data Flows
[Diagram; labels recovered from the slide:]
Sources: Web Applications, Device Applications, Probes, 3rd Party
Data: Activity Logs, Reference Data, Device, User Profile, POI, Map, Activity, Sensor
Processing: Data Intake; HDFS; ETL, data crunching, attribution, ML Algorithms; Aggregation
Stores: Analytics Cluster, Analytical DBMS, VShards (NoSQL), Data Asset Catalog
Outputs: Reports, Dashboards, Data Discovery, Interactive Queries, Batch Queries
![Page 10: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/10.jpg)
Technology Platform
Hadoop, R, VShards (KV)
SDK, Scribe, FTP
Hive, Pig
Analytical DBMS
Export/Import
Workflow Engine
Config./Deploy
Monitor, Alerts
Data Pipeline
Scheduler
Security/Kerberos & ACL
On-Premise & Cloud Infrastructure
![Page 11: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/11.jpg)
Data Platform
Self Serve Tools
ETL, Agg
Machine Learning
Data Quality
Data Asset Catalog
Data, Metadata, Operational Data
Collect, Ingest, Organize, Analyze, Deliver
Technology Platform
![Page 12: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/12.jpg)
Phase 1 – 2012
2008 – ‘10: Build Technology Platform, Get Data
2011: Enhance Platform, More Data, Simple Analytics, Data Crunching
2012: PBs of Data, Hundreds of Users, Thousands of Jobs, Complex Analytics, Multiple Clusters
![Page 13: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/13.jpg)
2012 Production Statistics
• 10s of PB of data across Nokia
• Multi-tenant, multi-petabyte analytics cluster
• 10-20K+ jobs per day
• 600+ internal users
• 300M+ KV queries
• Terabytes flowing in every day
• Multiple data centers around the world
![Page 14: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/14.jpg)
Challenges With Big Data
• Complex ecosystem of technologies - many moving parts, slower deploy cycles, complex data integration
• Capacity & scale issues – provision for peaks or sustained load, storage or compute?
• DBMS great for performance & data management, but can't scale - price/performance & ACIDity
• Hadoop great for ETL, but poor on query performance & data management, not interactive
• Data and metadata fragmentation
![Page 15: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/15.jpg)
Big Data Capacity Issues
• Spiky workloads
• Capacity provisioning
– Peaks
– Sustained loads
• How many clusters?
– SLA/Ad hoc/Research
– Multiple data centers
– Data duplication
• Tenancy – single/multi
• TCO – Hadoop can get expensive: storage & compute tightly coupled, idle machines
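
The peaks-versus-sustained trade-off can be made concrete with a little arithmetic. The Python sketch below uses a made-up hourly demand profile (the numbers are assumptions, not figures from the talk) to show how much of a peak-sized cluster sits idle, and how much demand goes unserved if the cluster is sized for the sustained load instead.

```python
from statistics import mean
from typing import Sequence


def utilization_if_sized_for_peak(load: Sequence[float]) -> float:
    """Fraction of capacity actually used when the cluster is sized for the peak."""
    return mean(load) / max(load)


def unmet_demand(capacity: float, load: Sequence[float]) -> float:
    """Total demand (node-hours) that exceeds a fixed cluster capacity."""
    return sum(max(0.0, demand - capacity) for demand in load)


# Hypothetical demand in nodes per hour: quiet for 20 hours, then a 4-hour spike.
hourly_load = [20] * 20 + [100] * 4

peak_utilization = utilization_if_sized_for_peak(hourly_load)
shortfall_at_sustained = unmet_demand(20, hourly_load)
```

With this profile a peak-sized cluster is used at one third of its capacity on average, while a cluster sized for the sustained 20 nodes leaves 320 node-hours of spike demand unserved. That gap is what makes fixed provisioning a hard choice for spiky workloads.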
![Page 16: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/16.jpg)
Cloud helps with some issues
• Operational & IT complexity reduced – API-based spin up & tear down – rapid deployments, faster cycles
• Pay for what is used
• Capacity issues mitigated - idle machines or peaks not an issue – elastically scale up and down
• De-coupled storage and compute makes sense
• Stateless architecture: recycle slow/bad machines; no need for rolling upgrades, do a rolling replace instead
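
The "rolling replace" idea can be sketched as follows: instead of upgrading machines in place, fresh machines are launched from a new image and the old ones are discarded, batch by batch. This is a minimal simulation in Python; `Node`, `launch`, and the image labels are hypothetical stand-ins for whatever cloud API is actually used, and the sketch assumes nodes hold no state worth migrating.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass(frozen=True)
class Node:
    node_id: int
    image: str  # hypothetical machine image / software version label


_ids = count(1)


def launch(image: str) -> Node:
    """Stand-in for a cloud API call that starts a fresh machine."""
    return Node(next(_ids), image)


def rolling_replace(fleet: List[Node], new_image: str, batch_size: int = 2) -> List[Node]:
    """Replace outdated nodes batch by batch instead of upgrading them in place.

    Replacements are launched before the old nodes are dropped, so the
    fleet never shrinks below its original size.
    """
    fleet = list(fleet)
    stale = [n for n in fleet if n.image != new_image]
    for start in range(0, len(stale), batch_size):
        batch = stale[start:start + batch_size]
        fleet.extend(launch(new_image) for _ in batch)  # spin up replacements
        fleet = [n for n in fleet if n not in batch]    # tear down old nodes
    return fleet
```

The same routine also covers the "recycle slow/bad machines" case: any node flagged as unhealthy can be treated as stale and swapped out in the next batch.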
![Page 17: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/17.jpg)
Phase 2
2008 – ‘10: Build Technology Platform, Get Data
2011: Enhance Platform, More Data, Simple Analytics
2012: PBs of Data, Hundreds of Users, Thousands of Jobs, Simple & Complex Analytics
2013: Still Pending Challenges
![Page 18: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/18.jpg)
Still Pending
• Data and Metadata fragmentation, need deeper integration into all tools/frameworks
• Advanced Analytics - Data science problems are hard & inefficient to implement in MapReduce/RDBMS
![Page 19: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/19.jpg)
Complex Analytics
• Mathematicians think in terms of arrays, not MapReduce
• Data science tools can’t efficiently handle big data
• Data partitioning is naïve, indexing won't scale
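
To illustrate the gap between array thinking and MapReduce, here is the same matrix-vector product written both ways in plain Python. This is an illustrative sketch only: the function names are made up, and the "MapReduce" version just mimics the map, shuffle, and reduce steps in memory rather than using Hadoop.

```python
from collections import defaultdict
from typing import Iterable, List, Sequence, Tuple


def matvec_array(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Array view: y[i] = sum over j of A[i][j] * x[j]."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matvec_mapreduce(
    entries: Iterable[Tuple[int, int, float]], x: Sequence[float]
) -> List[float]:
    """MapReduce view of the same product.

    entries: (row, column, value) triples, one per matrix cell.
    """
    # Map: each cell emits (row index, partial product).
    mapped = ((i, value * x[j]) for i, j, value in entries)
    # Shuffle: group the partial products by row index.
    groups = defaultdict(list)
    for key, partial in mapped:
        groups[key].append(partial)
    # Reduce: sum each group to get one output element per row.
    return [sum(groups[i]) for i in sorted(groups)]


A = [[1, 2], [3, 4]]
x = [10, 1]
cells = [(i, j, A[i][j]) for i in range(2) for j in range(2)]
```

Both functions return `[12, 34]` for this input, but the array form is one line that mirrors the formula, while the MapReduce form has to flatten the matrix into keyed records and reassemble the result.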
![Page 20: Phases of Big Data Challenges @ Nokia](https://reader031.fdocuments.us/reader031/viewer/2022020218/5593c2371a28abaa4a8b46ab/html5/thumbnails/20.jpg)
Big Data Technologies for Future