Wrangling Customer Usage Data with Hadoop
![Page 1: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/1.jpg)
Wrangling Customer Usage Data with Hadoop
Clearwire – Thursday, June 27th
Carmen Hall – IT Director
Mathew Johnson – Sr. IT Manager
![Page 2: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/2.jpg)
Starting With…
• …a little ingenuITy!
![Page 3: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/3.jpg)
ingenuITy Day @ Clearwire
• Opportunity for everyone in IT to innovate and present new and even crazy ideas
• One of those crazy ideas was from Roger Hosto
• Roger had the solution for Clearwire’s Big Data problem: Hadoop
![Page 4: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/4.jpg)
But Wait!
• Now we had a solution for Big Data
• We needed a Big Data opportunity
• We had just the thing…
![Page 5: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/5.jpg)
The Perfect Problem
• Customer Usage Data – our commodity to Wholesale partners
![Page 6: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/6.jpg)
Totally (un)Wired
• Americans used more than 1,304 petabytes of wireless data in 2012 - an increase of 69.3% over the previous 12 months' usage (827 PB)
• Clearwire processes over 3B individual usage detail records each month
![Page 7: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/7.jpg)
Shifting Landscape
• The U.S. wireless industry is a $195.5 billion enterprise - larger than publishing, agriculture, hotels and lodging, air transportation and movies – just to name a few
• Prepaid/Pay-As-You-Go services account for 23.4% of overall market penetration, which increases exposure to lost revenue if usage delivery is delayed.
• In some cases, a customer can consume data faster than we can bill for it
![Page 8: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/8.jpg)
Anatomy Of Latency - Legacy
[Diagram labels: ASN GW, PTS, SPB, Wholesale Partners, Internet, AAA, OSS, SDU, IT Usage Processing. Timing labels: 1 Hour; Up to 90 Minutes.]
![Page 9: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/9.jpg)
Let’s Talk Numbers
• Assume a 2GB plan
• An HD movie from Netflix consumes 2+ GB per hour
• Assume wholesale price = $6/GB
• Assume the retail price for a GB of data (as top up or overage) ranges from $20 – $100
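To make these assumptions concrete, the sketch below estimates how much unbilled overage could build up while usage records are still in flight. It is an illustration rather than a figure from the deck: the function name is hypothetical, the 2.5-hour latency is the legacy end-to-end delivery time quoted on the Real Results slide, and the customer is assumed to start with a full plan and stream continuously at exactly 2 GB per hour.

```python
def unbilled_overage_gb(plan_gb, rate_gb_per_hour, latency_hours):
    """GB consumed beyond the plan before usage is delivered (never negative)."""
    consumed = rate_gb_per_hour * latency_hours
    return max(0.0, consumed - plan_gb)


# Figures from the slide
PLAN_GB = 2.0
WHOLESALE_PER_GB = 6.0
RETAIL_PER_GB_LOW, RETAIL_PER_GB_HIGH = 20.0, 100.0

# Assumptions for illustration only
STREAM_RATE_GB_PER_HOUR = 2.0   # lower bound of "2+ GB per hour"
LEGACY_LATENCY_HOURS = 2.5      # legacy end-to-end delivery time

overage = unbilled_overage_gb(PLAN_GB, STREAM_RATE_GB_PER_HOUR, LEGACY_LATENCY_HOURS)
wholesale_exposure = overage * WHOLESALE_PER_GB
retail_exposure = (overage * RETAIL_PER_GB_LOW, overage * RETAIL_PER_GB_HIGH)

print(f"Overage before usage is delivered: {overage} GB")
print(f"Wholesale exposure: ${wholesale_exposure}")
print(f"Retail exposure: ${retail_exposure[0]} to ${retail_exposure[1]}")
```

Under these assumptions a single customer can run 3 GB past the plan before anyone downstream sees the usage, which is the exposure the latency slides are about.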
![Page 10: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/10.jpg)
As if that wasn’t enough -
• Clearwire was locked into a very expensive vendor contract which handled both network provisioning and usage delivery needs
• Legacy solution was not adaptable or flexible
• We needed something innovative, reliable, internally supportable, scalable – and we needed it fast
![Page 11: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/11.jpg)
Putting ingenuITy to Work!
• Roger’s idea was suddenly a project
• We needed to build a platform to ingest, process, and provide cleaned usage data for downstream applications – and quickly
• We needed:
  • A Hadoop Cluster
  • 24x7 Operations
  • Code to ingest data and handle a myriad of business rules
  • Integration with legacy and new systems
![Page 12: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/12.jpg)
Atlas was Born
• Development work began immediately on Clearwire’s private cloud infrastructure
• Selected BigTop Packaging of Apache Hadoop v1.0.1
• Custom code leveraging Hive and other common tools to ingest and process data was written
• Infrastructure was built
![Page 13: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/13.jpg)
Hybrid Approach to Hadoop
• Virtual Edge Nodes
  • Leveraged our existing private cloud
• Physical Data Nodes
  • Per Unit Cost (Storage & CPU) was lower than existing infrastructure
  • Smaller and more efficient than you think
  • 24 data nodes, each with 3TB of usable storage
  • Gives us 72TB of usable space
  • 3x block replication for production data
• Deployed identical DR/Analytics platform
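The capacity figures above can be checked with a short calculation. The sketch assumes that the 72TB figure is counted before replication and that every block is stored three times; the slide only states 3x replication for production data, so the unique-data figure is an upper-bound illustration, and the function name is hypothetical.

```python
def cluster_capacity_tb(nodes, usable_tb_per_node, replication):
    """Return (total usable TB, TB of unique data if every block is replicated)."""
    total = nodes * usable_tb_per_node
    return total, total / replication


# 24 data nodes, 3TB usable each, 3x block replication (figures from the slide)
total_tb, unique_tb = cluster_capacity_tb(nodes=24, usable_tb_per_node=3, replication=3)

print(f"Total usable space: {total_tb} TB")
print(f"Unique data at 3x replication: {unique_tb} TB")
```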
![Page 14: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/14.jpg)
Operational in No Time
• 2.5 months from project approval to production
• Leveraged our existing support organizations
• Solution leveraged common tools, did not require specialized teams
• Fault tolerance inherent within Hadoop helps us minimize late night calls
• An endless supply of data was quickly flowing through the system
• The results were looking good!
![Page 15: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/15.jpg)
Real Results
• 65% improvement in end-to-end delivery times
  • From 2.5 hours to 1.3 hours
• Reduced catch up time from upstream outages by more than half
• Reduced outage impacts by introducing flexibility to deliver partial files
• Eliminated 4-hour weekly usage delivery outages tied to provisioning system maintenance
![Page 16: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/16.jpg)
Anatomy of Latency - Now
[Diagram labels: ASN GW, PTS, SPB, Wholesale Partners, Internet, AAA, OSS, SDU, Atlas, Medusa. Timing labels: 1 Hour; Average of 15 Minutes (Atlas ~6 Minutes, Medusa ~9 Minutes).]
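The timing labels on the two latency diagrams add up to the end-to-end figures quoted on the Real Results slide. The sketch below does that sum. The stage names used as dictionary keys are assumptions about which part of each diagram a timing label belongs to, and the comparison is approximate because the legacy figure is a worst case ("up to") while the current one is an average.

```python
# Minutes per stage, read from the diagram timing labels.
# The key names are illustrative; the diagrams only give the durations.
LEGACY_MINUTES = {"upstream": 60, "it_usage_processing": 90}
NOW_MINUTES = {"upstream": 60, "atlas": 6, "medusa": 9}


def total_hours(stage_minutes):
    """Sum the per-stage minutes and convert to hours."""
    return sum(stage_minutes.values()) / 60


legacy_hours = total_hours(LEGACY_MINUTES)
now_hours = total_hours(NOW_MINUTES)

print(f"Legacy end-to-end: {legacy_hours} hours")
print(f"Current end-to-end: {now_hours} hours")
```

The current total of 1.25 hours corresponds to the rounded "1.3 hours" on the results slide, and the Atlas and Medusa stages together account for the 15-minute average.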
![Page 17: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/17.jpg)
Real (Financial) Results
• 6 month return on investment
• Delivered at 1/3 the cost of competing solutions
• Foundational – Enabling Wholesale support plan of legacy platform migration
• Saving Clearwire tens of millions of dollars over life of contract and internalizing support and development
![Page 18: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/18.jpg)
The Intangibles
• Proved to internal and external partners that we deliver what we promise with limited negative impacts to ongoing business
  • This was KEY to the speed at which we were able to migrate our billing platform
• Delivered more than just a single, targeted process – delivered an enterprise usage platform to grow from
• Kept true to our innovative spirit and the commitment to IT professionals that they can make a difference
![Page 19: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/19.jpg)
Evolution – Proving More
The Atlas Hadoop platform is now a go-to IT solution
• LTE Usage Data – Now in production
• Other Data Sources - ESR Data
• Data Replication and real-time ETL
• Exploring opportunities with network team to move closer to usage generation
• Changing mindset of what IT can mean to an organization
![Page 20: Wrangling Customer Usage Data with Hadoop](https://reader035.fdocuments.us/reader035/viewer/2022062814/5681686a550346895dded904/html5/thumbnails/20.jpg)
Q & A