MAPR GUIDE TO BIG DATA IN TELECOMMUNICATIONS

DATA CONVERGENCE IN TELECOMMUNICATIONS

INTRODUCTION

Few industries have more to offer and to gain from big data than telecommunications. For decades, communications service providers (CSPs) have transported and captured huge volumes of information about customer calling patterns, wireless data usage, location, network bandwidth statistics, and even the individual apps and webpages accessed by mobile devices.

Until recently, much of that data was discarded. There simply was no efficient way to mine value from it and storing it was expensive. However, this is all changing. Big data technologies, when combined with streaming analytics and analytics at scale, are enabling telecommunications companies to uncover significant new insights about their infrastructure and their customers. These insights are leading to massive changes in the ways they do business. In addition, with new legislation that opens the door for internet service providers to sell data about their customers’ online behavior, data may become a significant new revenue source.

As the range of telecommunications services has expanded with the adoption of mobile data, so have the potential applications to improve efficiency and generate new business. Providers can leverage this information to better understand their own business. For example, usage pattern data can guide bandwidth allocation and the positioning of equipment such as cell towers. It can also help identify new services to offer customers and even open up new revenue streams in areas such as targeted advertising.

Big data technologies are revolutionizing telecommunications. Tools like Apache Hadoop, streaming analytics, and machine learning are opening up new opportunities for CSPs to gain insights from data sets that were previously unwieldy.

This industry guide looks at the trends driving the big data revolution in telecommunications, how different segments of the industry are being affected, and how big data is being put to work in the field to change the competitive equation. This guide outlines a number of issues the industry faces and discusses how these trends are driving new solution areas. It also looks at use cases that show how big data and analytics are yielding game-changing breakthroughs for telecom providers today.

INDUSTRY TRENDS

To say the telecommunications industry is in transition is an understatement. Once a highly regulated business with fixed prices, monopoly market control, and little customer choice, the telecom market has been upended by digitization and mobility.

Customers now expect access from any device, anywhere. Their bandwidth needs have expanded to include video and high-speed data. Customers also have plenty of carriers to choose from, ranging from traditional phone service providers to cable companies to VoIP services. Thanks to full mobile phone number portability, customers can move their service from one carrier to another with virtually no disruption, enabling them to play carriers against each other.

In short, a once-staid industry is now hypercompetitive. This has created pressure on the large incumbents and opportunities for nimble niche players, though there’s opportunity for the big players as well. Telecom firms are aggressively seeking new lines of business, ranging from advertising to cloud hosting to original programming. They are leveraging their infrastructure to deliver new high-margin information and entertainment services to mobile customers, homes, and businesses. Smartphones and programming are profitable new revenue sources. Although competition has increased, customers are writing bigger checks to telecommunications service providers than they ever have.

Some of the priorities today’s telecom providers must address include the following:

• System-wide cost reduction and optimization

• Improving customer loyalty, acquisition, and retention

• Monetizing data and analytics

• Mass personalization

System-Wide Cost Reduction and Optimization

Customer demand for data is growing at an accelerating rate, putting pressure on all service providers to optimize infrastructure. Operational expenses typically consume 30-40% of revenue, and network operations account for about 45% of that cost.[1] A single cell tower can cost $250,000 to build, and equipment must be continually maintained and enhanced to support new protocols and services. For example, upgrading existing networks to 5G infrastructure is expected to cost more than $100 billion over the next 10 years.[2] As a result, service providers are always looking for ways to squeeze more capacity out of existing infrastructure, reduce the cost of expansion, and find new ways to leverage existing infrastructure profitably.
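As a rough worked example of the percentages above, the following Python sketch computes what share of total revenue network operations would consume if operating expenses are 30-40% of revenue and network operations are about 45% of operating expenses. The function name is an illustrative assumption; only the percentages come from the text.

```python
def network_ops_share(opex_share: float, network_share_of_opex: float = 0.45) -> float:
    """Return network operations cost as a fraction of total revenue."""
    return opex_share * network_share_of_opex

# Bounds quoted in the text: operating expenses are 30-40% of revenue.
low = network_ops_share(0.30)
high = network_ops_share(0.40)
print(f"Network operations: {low:.1%} to {high:.1%} of revenue")
```

Under these assumptions, network operations alone would account for roughly 13.5% to 18% of revenue.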

They are also seeking to better understand customer behavior in order to maximize margins. Just 1% of mobile users consume half of the world’s bandwidth.[3] Carriers want to identify these heavy users and charge them appropriately.
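One way a carrier might identify its heaviest users is to rank subscribers by usage and measure the share of traffic consumed by the top fraction. The sketch below is a minimal illustration on synthetic data; the function name, the user IDs, and the usage figures are all hypothetical.

```python
def top_share(usage_by_user: dict[str, float], top_fraction: float = 0.01) -> tuple[list[str], float]:
    """Return the heaviest users and the fraction of total usage they consume."""
    ranked = sorted(usage_by_user.items(), key=lambda kv: kv[1], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))  # always keep at least one user
    top = ranked[:n_top]
    total = sum(usage_by_user.values())
    share = sum(v for _, v in top) / total if total else 0.0
    return [user for user, _ in top], share

# Synthetic data: 99 light users at 1 GB each and one heavy user at 99 GB.
usage = {f"user{i}": 1.0 for i in range(99)}
usage["heavy"] = 99.0
heavy_users, share = top_share(usage)
print(heavy_users, f"{share:.0%}")
```

In this synthetic case the top 1% of users (a single subscriber) accounts for half of all traffic, mirroring the pattern described above.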

Two factors that will influence their planning are the move to software-defined networking (SDN) and the surge of new bandwidth demand associated with the growth of the Internet of Things (IoT). SDN promises to lower costs and increase flexibility, but there are big migration expenses involved. IoT will create new pressure on capacity planning, but also yield attractive new sources of revenue. Both technologies promise to create massive structural change in existing networks, and they will require careful planning and ROI analysis.

Improving Customer Loyalty, Acquisition, and Retention

The consumer market is essentially saturated, with more phones on the planet than people. This means growth must come from selling more products and services to existing customers and stealing customers away from competitors. Factors such as service quality, price, speed, and customer service are key variables in this equation.

Research shows that customers perceive little difference between telecommunication service providers. This means the ability to understand customer needs at a fine-grained level, provide fast and friendly service, and customize service plans for businesses and individual customers is crucial for success.

[1] Banerjee, Ari, Big Data & Advanced Analytics in Telecom: A Multi-Billion-Dollar Revenue Opportunity, Heavy Reading, December 2013.
[2] Goovaerts, Diana, iGR Study Forecasts $104B Cost to Upgrade LTE Networks, Build Out 5G Network, Wireless Week, December 7, 2015.
[3] O’Brien, Kevin, Top 1% of Mobile Users Consume Half of World’s Bandwidth, and Gap Is Growing, The New York Times, January 5, 2012.

KEY INDUSTRY STAKEHOLDERS

Telecommunications industry diversification has created a host of new stakeholders beyond carriers and their customers. Each has different information needs.

Carriers

Service providers need to capture detailed information about infrastructure, service quality levels, demand patterns, and other operational data in order to optimize resources and deliver high-quality user experiences. They also need to mine customer behavior data in order to fine-tune their product offerings and take advantage of new opportunities like advertising and paid information services.

Customers

In a market with a profusion of service offerings, many of them based on use, customers appreciate having up-to-the-minute data on account status and warnings of additional charges they may incur. CSPs can enhance customer satisfaction by offering complete reports and recommendations for account changes that can optimize each customer’s spending.

Suppliers

The high cost of infrastructure gives service providers plenty of incentive to understand the ROI of the dollars they spend on equipment and service. Sharing this information with suppliers can help them negotiate more favorable contracts and ensure that suppliers are delivering on service-level agreements.

Companies that supply end-user devices such as handsets and accessories want detailed sales information as well as recommendations for promotions and other incentives that can improve their bottom lines.

Regulators

Government overseers demand full transparency about service levels, rates, customer satisfaction, and other metrics. They also want evidence that service providers are staying within customer privacy and confidentiality guidelines. CSPs must not only capture this data, but tag and index it for rapid access since compliance audits may require a response in 48 hours or less.

Content Providers

This new but important constituency provides programming and information services that open up new revenue streams. Content providers require information on how their content is being catalogued, promoted, and consumed as well as details on royalties, licenses, and other forms of compensation.

Advertisers

Advertisers are another new and promising business opportunity. Ad buyers want clickstream data such as views, click-throughs, downloads, registrations, view times, and other engagement metrics. They also may demand this information within the context of location, time of day, app use, and other variables.

TELECOMMUNICATIONS DATA SOURCES AND STANDARDS

There is no shortage of data available to telecom providers both from within their networks and from external sources. Here are a few of the sources they can mine:

Internal systems used to provision services and manage customer accounts throw off huge volumes of data, and so do the endpoint devices that subscribers use. CSPs can choose to capture literally every bit that goes over their networks and correlate it with external factors such as time and location as well as the identity of individual subscribers. While regulations limit what carriers can do with individual subscriber data, carriers can anonymize this information and pull it together into profiles that are useful for everything from network optimization to advertising.
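As one illustration of masking subscriber identity before records are pooled into profiles, the sketch below replaces a subscriber ID with a keyed hash. This is pseudonymization rather than full anonymization, and it is only an assumption about how a carrier might approach the problem; the key, field names, and sample record are hypothetical.

```python
import hashlib
import hmac

def pseudonymize(subscriber_id: str, secret_key: bytes) -> str:
    """Replace a subscriber ID with a keyed SHA-256 digest (a stable pseudonym)."""
    return hmac.new(secret_key, subscriber_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key; a real deployment would store and rotate it securely.
key = b"example-key-not-for-production"

# Hypothetical usage record with the fields a profile might need.
record = {"subscriber_id": "15551230001", "cell": "tower-17", "hour": 14}
masked = {**record, "subscriber_id": pseudonymize(record["subscriber_id"], key)}
print(masked)
```

Because the same ID always maps to the same digest under a given key, records can still be grouped per subscriber without exposing the original identifier. Whether this satisfies a particular regulation is a separate question that depends on the jurisdiction.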

Independent research organizations are too numerous to list here. Scores of independent analyst firms as well as captive research organizations owned by the carriers themselves monitor various aspects of the industry. The U.S. Library of Congress maintains a sample list of sources.

Industry associations are also too numerous to list. Major U.S. groups include:

• Telecommunications Industry Association

• NCTA (The Internet & Television Association)

• Alliance for Telecommunications Industry Solutions (ATIS)

• Cellular Telecommunications Industry Association (CTIA)

• Rural Broadband Association (NTCA)

Many countries have equivalent bodies. Nearly all of these organizations publish research and data.

Government and regulatory agencies exist in every international market. In the U.S., the big four are the Federal Communications Commission, CTIA, NCTA and the National Association of Regulatory Utility Commissioners. Wikipedia lists more than 200 others.

Suppliers can be useful sources of information about equipment usage, technology trends, and advice on getting more bang for the buck from their products.

Competitors may publish their own research about their markets, and their public regulatory filings can yield insights on their own operations.

TELECOMMUNICATIONS DATA TYPES

Telecom carriers collect so much data about their customers that one of their biggest challenges is deciding what information not to keep. Fortunately, big data tools make it possible to build inexpensive data lakes and then decide what is most useful. Some of the data types carriers already collect that could be analyzed include the following:

Data Source: Description

CALL DETAIL RECORD (CDR): Analysis of individual customers can help carriers design service packages that appeal to different categories of customers, such as frequent callers, heavy users of international services, or those who call the same numbers often. When aggregated, these records can yield insight into usage patterns that allow carriers to more effectively manage their infrastructure and to optimize capacity.

MOBILE INTERNET USAGE: Data is the most expensive service telecom carriers provide, and it is also the most highly valued by customers. Understanding how different classes of customers use data can help CSPs be more creative about designing service plans for different categories of users.

SMART DEVICE/IoT USAGE: By tracking the types of apps and sites that customers access from their smart devices, mobile providers can make informed choices about providing alternative sources or forming partnerships that could result in bonus revenue streams via subscriptions or transaction fees. As the Internet of Things ramps up, carriers will increasingly be able to understand the types of devices that are accessing their network and design service packages around them.

AUTOMOTIVE DATA: Autonomous and semi-autonomous vehicles are essentially computers on wheels, generating enormous amounts of data that needs to be transmitted between vehicles and to central control hubs. Telecom providers are already partnering with automotive companies to outfit vehicles with the wireless capabilities required by smart cars. This will tax existing networks, but also create new revenue opportunities.

NETWORK EQUIPMENT DATA: Analyzing data from infrastructure equipment, such as voltage and current levels, outages, and operational efficiency, can help providers deploy those resources more efficiently, detect trouble areas, and perform preventive maintenance to avoid downtime and expensive after-hours repairs.

SERVER, NETWORK, AND APPLICATION LOGS: These can yield a bounty of information about customer behaviors that can be used to optimize bandwidth, improve service levels, and correlate customer behavior to external events such as storms and breaking news.

BILLING DATA: This can be a source of both insight and competitive advantage. Understanding how and when customers pay bills can help service providers reduce delinquency rates and create services that make payments easier for customers. Billing data can also be used to proactively optimize customers’ accounts and present them with information that helps them make more cost-effective use of services.

MACHINE-TO-MACHINE DATA: Taking advantage of existing networks to connect equipment within the service provider’s infrastructure can help providers balance resources to reduce slowdowns and outages. As the Internet of Things takes hold, services optimized to connect machines, such as medical equipment and automobiles, could uncover new revenue streams.

SOCIAL NETWORK DATA: Tapping into conversations on social networks is one of the best ways to ensure customer satisfaction, detect potential defections, and gain intelligence on competitors.
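To make the CDR entry above concrete, the following sketch aggregates simplified call records per caller and assigns a coarse segment such as "heavy international" or "repeat caller". The record layout, thresholds, and segment names are illustrative assumptions; real CDRs carry many more fields.

```python
from collections import defaultdict

# Hypothetical, simplified CDRs: (caller, callee, duration_minutes, is_international)
cdrs = [
    ("alice", "bob", 5, False),
    ("alice", "bob", 7, False),
    ("alice", "bob", 3, False),
    ("carol", "dave", 30, True),
    ("carol", "erin", 45, True),
    ("frank", "gina", 2, False),
]

def summarize(records):
    """Aggregate per-caller call count, international minutes, and calls per callee."""
    stats = defaultdict(lambda: {"calls": 0, "intl_minutes": 0, "callees": defaultdict(int)})
    for caller, callee, minutes, is_intl in records:
        s = stats[caller]
        s["calls"] += 1
        s["callees"][callee] += 1
        if is_intl:
            s["intl_minutes"] += minutes
    return stats

def segment(s, intl_threshold=60, repeat_threshold=3):
    """Assign a coarse segment label; both thresholds are illustrative assumptions."""
    if s["intl_minutes"] >= intl_threshold:
        return "heavy international"
    if max(s["callees"].values()) >= repeat_threshold:
        return "repeat caller"
    return "general"

segments = {caller: segment(s) for caller, s in summarize(cdrs).items()}
print(segments)
```

At production scale the same aggregation would typically run on a distributed engine rather than in a single process, but the grouping logic is the same.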

USE CASE EXAMPLES

All these resources have been used in real-life scenarios to change the telecommunications game.

Customer Loyalty and Acquisition

Once a highly regulated industry, telecom is now a virtual free-for-all. With the market nearly saturated, carriers focus much of their efforts on stealing customers from each other. And thanks to phone number portability and the decline of service agreements, it’s never been easier for customers to switch.

The telecommunications industry has the second-lowest customer satisfaction ratings, after government.[4] In the U.S., turnover rates for the major carriers typically run around 20% per year. Globally, the churn rate is much higher.[5] At the same time, the cost of acquiring a single new telecommunications customer has been pegged at more than $300.[6] With more than 17 million customers signing up for new plans or switching carriers each year, acquisition costs add up to more than $5 billion annually.
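The annual acquisition figure follows directly from the two numbers quoted above. A quick check in Python, using the text's lower bounds (17 million customers and $300 per acquisition); the variable names are illustrative:

```python
customers_per_year = 17_000_000   # new sign-ups or switches per year, from the text
cost_per_acquisition = 300        # US dollars per customer, lower bound from the text

total_cost = customers_per_year * cost_per_acquisition
print(f"${total_cost / 1e9:.1f} billion")  # prints "$5.1 billion"
```

Because both inputs are lower bounds, the result is a floor, consistent with "more than $5 billion".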

This has made customer service a critical competency for all providers.

Big data and analytics tools help carriers understand customer behaviors and interests at a fine level. For example, CSPs can use a variety of internal and external metrics for churn analysis, which alerts companies to signals that a customer may be about to defect. Evidence might include declining usage, repeated calls to the customer support center, or frequent dropped calls. Social media also presents valuable new ways to detect at-risk customers. By monitoring online sentiment and matching usernames to customer accounts, telcos can pinpoint dissatisfied customers and extend individualized retention offers. Social media is also a valuable feedback mechanism for new products and services. Customer reactions to new equipment, service plans, or offers can be captured almost immediately and used to adjust pricing or marketing plans proactively.

Churn analysis, driven by big data and analytics, enables telecom companies to identify potential defectors quickly and to target their retention strategies more selectively. For example, a company can look at a large pool of recent lost customers and cross-tabulate the data with other characteristics, such as marital status, age, location, volume of use, or payment delinquency. The same analysis can be performed on customers who have increased spending with their providers. This analysis yields “buckets” of customers that can be categorized according to their likely future spending patterns. Offers and incentives can then be targeted to groups of customers. Likely defectors can be intercepted, even if they have expressed no explicit plans to leave. This is important since research indicates that only 5% of dissatisfied customers ever overtly express dissatisfaction and about 80% of defectors give no reason for leaving.[7]
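The cross-tabulation described above can be sketched in a few lines of Python. The example groups customers into buckets by age band and payment delinquency and computes the churn rate of each bucket. The records, attribute choices, and the 50% at-risk cutoff are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical customer records: (age_band, payment_delinquent, churned)
customers = [
    ("18-29", True, True),
    ("18-29", True, True),
    ("18-29", False, False),
    ("30-49", False, False),
    ("30-49", False, False),
    ("30-49", True, True),
    ("50+", False, False),
    ("50+", False, True),
]

def churn_rates(records):
    """Return the churn rate for each (age_band, delinquent) bucket."""
    totals, churned = Counter(), Counter()
    for age_band, delinquent, did_churn in records:
        bucket = (age_band, delinquent)
        totals[bucket] += 1
        if did_churn:
            churned[bucket] += 1
    return {bucket: churned[bucket] / totals[bucket] for bucket in totals}

rates = churn_rates(customers)
# Buckets whose churn rate meets an illustrative 50% cutoff.
at_risk = sorted(bucket for bucket, rate in rates.items() if rate >= 0.5)
print(at_risk)
```

Current customers who fall into the high-churn buckets would then be candidates for targeted retention offers.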

[4] Benchmarks by Sector, American Customer Satisfaction Index, 2017.
[5] Dobardziev, Angel, Ovum’s global survey shows that telecoms operators could lose up to half their customers in the next year, Ovum TMT Intelligence, November 6, 2014.
[6] Safko, Lon, How Much Did That New Customer Cost You?, Entrepreneur, January 14, 2013.
[7] Barlow, Janelle M. & Moller, Claus, A Complaint Is a Gift: Recovering Customer Loyalty When Things Go Wrong, 2nd edition (Oakland, Calif.: Berrett-Koehler Publishers, 2008).

The impact of churn analysis can be substantial. McKinsey cited one telecom provider’s use of machine learning to combine sociodemographic data, information from customer calls, social media interactions, and network usage data to identify the customers who were most likely to defect or have trouble paying their bills. It cut churn by 3% and improved payment recovery by 35%.[8]

A key element of customer retention is delivering individualized service and promotions. Multi-channel call center automation software now makes it possible for companies to create unified views of their customers composed of feedback from phone calls, email, social media, and even in-store visits. By applying analytics to these rich profiles, telecom companies can group customers by category and customize marketing programs and offers to them. For example, heavy data users may be presented with bonus bandwidth or coupons for free streaming video, while customers with modest data needs may be offered discount upgrade incentives to get them to the next level. These tactics work; half of the business-to-business customers surveyed by Forrester Consulting rated personalized recommendations as the feature they would most like suppliers to offer.[9]

Capturing data from multiple sources in a reference database, as illustrated below, makes it possible for that information to be used in a variety of use cases ranging from searchable customer records to model training for machine learning applications.[10] The ability of Hadoop and NoSQL databases to combine and perform analytics on a mix of both structured and unstructured data makes practical applications that were previously impossible.

Customers also vote with their clicks, and this activity can be captured and analyzed to understand customers’ content needs. For example, customers who upload a lot of photographs or streaming video may be offered free accounts on media-sharing services or cloud storage space. Frequent music listeners may be offered gift cards for streaming music services. The cost of these giveaways is a pittance compared to the cost of acquiring a new customer.

[Figure: data sources (web logs, transactions, CSR notes and logs) feeding a reference DB that supports a search index, model training data, and a masked extract]

[8] Bughin, Jacques, Telcos: The untapped promise of big data, McKinsey Quarterly, June 2016.
[9] How B2B Vendors Are Working to Meet Buyers’ Omni-Channel Desires, MarketingCharts, November 17, 2014.
[10] Dunning, Ted & Friedman, Ellen, Prototypical Hadoop Use Cases, MapR Technologies e-book, 2010.

Social media sentiment analysis cuts both ways, and it can be useful in identifying at-risk customers of competitors. By monitoring negative comments about rivals, telecom companies can offer timely incentives to make the switch. Online real estate transactions or openings of new businesses can also trigger offers to customers who are new to the area or those who are leaving but can be retained. The same profiling techniques used to extend retention offers to customers based upon demographic characteristics can also customize offers to potential new customers.

Big data permits much finer levels of targeted marketing. Instead of creating direct mail according to geography, companies can segment customers by combinations of demographic and behavioral data. Email and display advertising campaigns can be customized based on the characteristics of individual customers, and tactics like online A/B testing quickly provide feedback on the most effective offers and messages.
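To illustrate how A/B test feedback might be evaluated, the sketch below compares the conversion rates of two offers with a standard two-proportion z-test. The conversion counts are invented for the example, and the choice of this particular test is an assumption rather than something the guide prescribes.

```python
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return the z statistic and two-sided p-value for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical campaign: offer A converts 100 of 2,000 recipients, offer B converts 150 of 2,000.
z, p = two_proportion_z(conv_a=100, n_a=2000, conv_b=150, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two offers is unlikely to be due to chance alone, which is the signal a marketer would use to pick the stronger message.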

Customers increasingly expect to be treated as markets of one. Big data and analytics are making this possible.

Product and Service Quality

Telecommunications is a capital-intensive business with global CSP capital expenditures expected to total more than $2 trillion by 2019.[11] Providers are under intense pressure to minimize dropped calls and data dead zones, which are among the biggest sources of customer dissatisfaction. Breadth of coverage is also a competitive advantage, so telecoms need to maximize reach while constantly monitoring their networks for outages, capacity thresholds, and other service quality issues.

The amount of data generated by network equipment is enormous and getting bigger. The advent of 4G mobile networks alone increased the data volumes from mobile devices about tenfold,[12] and the arrival of 5G networks over the next two years promises to do the same. Other growth factors include location-based data, streaming media, IPv6 addressing, and the arrival of an estimated 50 billion connected IoT devices by 2020. Much of this data will need to be analyzed in real time, both for service-level compliance and to realize the promise of new revenue sources through services like location-based and contextual marketing.

Before the arrival of big data systems like Hadoop, Spark, Flink, and the MapR Converged Data Platform, it was impractical for carriers to analyze more than a fraction of that information. But with the price/performance improvements that big data tools have introduced, carriers can now afford to sift through a much larger amount of activity on their networks. For example, Razorsight, a provider of analytics services that helps telecommunication companies optimize their sales and marketing activities, has seen the total cost of storage and processing drop from up to $20,000 per terabyte in a traditional data warehouse to less than $3,000 per terabyte with a converged solution from MapR Technologies.[13]
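To put the Razorsight figures in perspective, the sketch below computes the relative cost reduction implied by the two quoted bounds and applies it to a hypothetical 500-terabyte data set. The data set size is an assumption for illustration; only the per-terabyte figures come from the text.

```python
warehouse_cost_per_tb = 20_000  # US dollars, upper bound quoted for a traditional data warehouse
converged_cost_per_tb = 3_000   # US dollars, upper bound quoted for the converged solution

reduction = 1 - converged_cost_per_tb / warehouse_cost_per_tb
print(f"Reduction at the quoted bounds: {reduction:.0%}")

dataset_tb = 500  # hypothetical data set size
savings = dataset_tb * (warehouse_cost_per_tb - converged_cost_per_tb)
print(f"Difference for {dataset_tb} TB: ${savings:,}")
```

Since both figures are bounds rather than exact prices, the result is indicative only.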

[11] Ovum forecasts CSP capex over 2014–19 period will surpass US$2tn, Ovum TMT Intelligence, December 14, 2014.
[12] Big Data for the Telecommunications Industry, Informatica Corp., 2012.
[13] Nemschoff, Michele, Hadoop In Action: Razorsight Offers Telecom Clients Predictive Analytics Solutions Based On Hadoop And Apache Spark, MapR blog, September 8, 2015.

Big data ecosystems like the MapR Converged Data Platform are changing the cost calculus by drastically reducing the cost of managing the millions of records that flow from telecom systems and networks every second. Here are some examples of how individual components can be applied:

• Data platform (the MapR Converged Data Platform) can store and manage billions/trillions of files and petabytes of raw data.

• Event streaming (Apache Kafka or MapR Streams) can handle millions of messages from connected devices for storage and processing.

• Apache Flume is capable of ingesting millions of CDRs per second into a NoSQL database like MapR-DB or Cassandra.

• Apache Storm can process streaming data in real time and identify irregular or troublesome patterns.

• Apache Spark or Mahout can be applied to create machine learning models that anticipate capacity problems, usage spikes, and even equipment outages.

• Apache Flink is a true real-time processing engine that can analyze data streaming over the network.

When historical data is combined with stream processing and predictive analytics, telecom providers can optimize their networks’ performance to unprecedented levels. They can also reduce costs through predictive maintenance, which enables equipment problems to be diagnosed earlier and prevents expensive field repairs. IoT will be a major factor here. By capturing data streaming from connected devices, providers can pinpoint potential trouble areas and dispatch repair crews before a problem occurs.

Other operational benefits that can be realized through the use of big data include:

• Call routing efficiency can be improved to reduce customer hold times and optimize service representative efficiency.

• Demand forecasting can better prepare carriers for infrastructure upgrades or new services.

• Real-time call detail record analysis identifies service problems and forecasts capacity needs.

• Proactive customer care alerts customers to service problems or offers incentives for specific usage scenarios.

• Service plans can be optimized based upon actual use.

Analysis of operational data has traditionally been a batch process, but streaming analytics tools such as Kafka and Spark extend the same kind of analytics capabilities to data flowing across the network. Not only are there operational benefits to capturing this kind of data, but CSPs can use streaming technology to deliver on-the-spot promotions or alerts. We’ve all heard stories of users being blindsided by large overage charges for services they weren’t even aware they were using, such as global roaming. Streaming analytics and mobile alerts can help prevent these unpleasant surprises from damaging customer satisfaction.
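The overage-alert idea above can be sketched as a small stateful stream processor. In this illustration the events arrive from a plain Python list; in practice they would come from a streaming system such as Kafka, which is not shown here. The function name, the 80% warning level, and the sample events are assumptions.

```python
from collections import defaultdict
from typing import Iterable, Iterator

def overage_alerts(events: Iterable[tuple[str, float]], allowance_mb: float,
                   warn_at: float = 0.8) -> Iterator[tuple[str, str]]:
    """Yield (subscriber, level) the first time usage crosses each threshold."""
    used = defaultdict(float)   # running usage per subscriber
    sent = defaultdict(set)     # alert levels already issued per subscriber
    thresholds = (("warning", warn_at * allowance_mb), ("exceeded", allowance_mb))
    for subscriber, mb in events:
        used[subscriber] += mb
        for level, limit in thresholds:
            if used[subscriber] >= limit and level not in sent[subscriber]:
                sent[subscriber].add(level)
                yield subscriber, level

# Hypothetical usage events: (subscriber, megabytes consumed)
stream = [("alice", 300.0), ("bob", 100.0), ("alice", 550.0), ("alice", 200.0)]
alerts = list(overage_alerts(stream, allowance_mb=1000.0))
print(alerts)
```

Because the generator keeps only running totals, each alert can be issued as soon as the triggering event arrives rather than at the end of a billing batch.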

Telecom companies that best leverage streaming technologies will gain a competitive edge.

Security and Compliance

Telecommunications is a regulated industry, and service provider networks are prime targets for attackers. Again, big data has an important role to play.

Service providers must comply with many standards in areas like service levels, availability, pricing, and coverage. The penalties for failing to capture this information can be steep, and audit demands may carry deadlines of 48 hours or less. Historically, responding to regulatory requests has involved digging through racks of data archived on tape. Thanks to Hadoop, much of this data can now be stored online for rapid retrieval. Telecom providers can also use the technologies of big data to better understand their own operations, flag potential regulatory violations, and correct them.

Security is a perpetual cat-and-mouse game in which analytics is playing a growing role. For example, security information and event management (SIEM) tools are a growing class of real-time analytics tools that monitor security alerts generated by network hardware and applications. They constantly compare network activity to normal traffic patterns and flag anomalies that may indicate penetration or fraud. The concept has existed for more than a decade, but the new breed of machine learning and predictive analytics tools, combined with large data stores, promises to make this technology far more effective.
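The comparison of current activity against normal traffic patterns can be illustrated with a simple statistical baseline. The sketch below flags observations that lie more than three standard deviations from the baseline mean. This is a deliberately minimal stand-in for what SIEM products do, and the metric, data, and threshold are assumptions.

```python
from statistics import mean, stdev

def flag_anomalies(baseline: list[float], observed: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of observed values more than `threshold` standard deviations from the baseline mean."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return []  # a flat baseline gives no scale to measure deviation against
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]

# Hypothetical metric: login attempts per minute on one system.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
observed = [101, 99, 250, 100]
anomalies = flag_anomalies(baseline, observed)
print(anomalies)
```

Machine learning approaches replace the fixed mean and threshold with learned models of normal behavior, but the flag-what-deviates structure is the same.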

Telecom providers have long had the ability to capture all the data that streams across their networks, but they can now do so much more affordably thanks to big data and streaming analytics. The potential bottom-line impact is clear. The industry loses about $38 billion to fraud each year,[14] or about 1.7% of total revenues.
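The two fraud figures above imply a total industry revenue, which can be recovered with one division. The sketch is a consistency check on the quoted numbers, not an independent revenue estimate; the variable names are illustrative.

```python
fraud_loss = 38e9            # US dollars per year, from the text
fraud_share = 0.017          # fraud as a fraction of total revenues, from the text

implied_revenue = fraud_loss / fraud_share
print(f"Implied industry revenue: ${implied_revenue / 1e12:.2f} trillion")
```

The result, a little over $2.2 trillion, is the revenue base the quoted percentage assumes.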

The benefits of strong security go beyond just revenue impacts. As CSPs increasingly expand into cloud hosting, software-as-a-service, and managed services, their ability to secure their networks will be an increasingly important factor in customer satisfaction. For example, Macquarie Telecom, which provides secure communications services for 42% of government agencies in Australia, is using the MapR Converged Data Platform to monitor hundreds of systems and to aggregate logs that produce data in multiple formats. Data volumes have increased exponentially in recent years, and the MapR Converged Data Platform was the best one to handle the company’s capacity and speed requirements. The combination of real-time analytics and predictive security enables government employees to access internet information without worrying about malicious payloads. It has also made reporting more timely. Reports that used to take 14 days now take only two hours.[15]

14 Global Fraud Loss Survey, Communications Fraud Control Association, 2015
15 Macquarie Telecom Deploys MapR to Secure Australian Government Communications, MapR press release, November 24, 2015

THE MAPR CONVERGED DATA PLATFORM IN TELECOMMUNICATIONS

By pursuing our data-centric vision for a new generation of applications, MapR has created an applications platform that converges the management of data of any size, speed, and format. It was for this work that the company was recently awarded a patent (US9207930). This is the MapR Converged Data Platform.

Open Source Innovation on a Trusted Platform

The MapR Converged Data Platform is designed to deliver utility-grade data services and commercially supported open source innovations to development teams, IT operations, business analysts, and data scientists. Open source technology is a fantastic creative driver for the sophisticated new challenges that big data, and especially new data, uncovers.

Without a converged data platform, critical information can get stuck in data silos, and inefficient use of hardware resources can result in a costly cluster sprawl of underutilized servers and storage. With the MapR Platform, businesses can enjoy real-time insights based on secure, protected, high-fidelity data.

Seamless Integration with Existing Enterprise Systems

One of the most profound design decisions made by MapR was to create an enterprise-grade file and storage system to house the data in the big data ecosystem. The MapR File System, based on the trusted POSIX/NFS standard, makes it easier to get data in and out of the MapR Platform using familiar enterprise tools. MapR provides open APIs for developer access to data with standard interfaces like SQL, HDFS, HBase, JSON, and Kafka.
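
One practical consequence of a POSIX/NFS interface is that ordinary file I/O is enough to move data in and out. The sketch below illustrates that point with Python's standard library only. The directory layout, file name, and record fields are hypothetical, and a temporary directory stands in for a real NFS mount point.

```python
import csv
import tempfile
from pathlib import Path

def write_call_records(mount_point, records):
    """Write call records as CSV under a mounted file system path using
    ordinary file I/O; no special client library is involved."""
    target_dir = Path(mount_point) / "telecom" / "cdr"  # hypothetical layout
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "calls.csv"
    with target.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["caller", "callee", "seconds"])
        writer.writerows(records)
    return target

def read_call_records(path):
    """Read the CSV back as a list of dictionaries."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))

# A temporary directory stands in for the NFS mount point here.
with tempfile.TemporaryDirectory() as mount_point:
    path = write_call_records(mount_point, [("555-0100", "555-0199", 42)])
    rows = read_call_records(path)
    print(rows)
```

The same functions would work unchanged against any path the operating system exposes as a file system, which is the sense in which familiar tools carry over.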

Continuous Trusted Operations

With its consistent focus on the integrity of data, MapR has created a hardened, clustered platform that can withstand multiple hardware failures, data center outages, malicious attacks, and intrusions from cybercriminals. Many proven methods of data protection, such as failover, redundancy, and access controls, are built into the MapR Platform.

[Figure: MapR Converged Data Platform. Open source engines and tools, commercial engines and applications, cloud and managed services, custom apps, and search run on enterprise-grade platform services: MapR-FS (web-scale storage), MapR-DB (database), and MapR Streams (event streaming). These are exposed through HDFS, POSIX, HBase, JSON, and Kafka APIs, with high availability, real-time operation, unified security, multi-tenancy, disaster recovery, a global namespace, and unified management and monitoring.]

Big Data with Enterprise Stability

Game-changing big data applications and analytics will continue to rely upon open-source software. As a company founded in and contributing to the open-source world of Hadoop and Spark, MapR continues to define enterprise requirements and best practices for successfully using the latest open source innovations. We deliver monthly updates to open source software packages to ensure you have the latest innovations.

MapR Telecommunications Architecture

Data-Driven Improvement of Services or Product

Telecom companies need to share data between cell towers, users, and processing centers. Because the volumes can be very large, it’s important to process data from the source and efficiently transfer it to various data centers for further use. MapR Streams, a new distributed messaging system, is able to transport huge amounts of data and make it available with reliable geo-distributed replication across multiple data centers. With MapR Streams, you can replicate streams in a master-slave, many-to-one, or multi-master configuration between thousands of geographically distributed clusters.

[Figure: Replicating to another cluster. Topics A, B, and C are replicated from one cluster to another.]

[Figure: MapR telecommunications architecture.
Data sources: billing data, call data records, mobile usage, smart device data, network data, server logs, M2M data, social network data.
Ingest: streaming data ingest; POSIX/NFS file ingest.
Processing: MapR Converged Data Platform.
Insights: data exploration, dashboards, analytics, applications, search.
Stakeholders: carriers, customers, suppliers, regulators, content providers, advertisers.
Use cases: upselling and cross-selling, telemetry, product and service quality, customer segmentation, capacity planning, security and compliance, fraud detection, call network optimization, personalized offers, targeted marketing, recommendation engine, customer loyalty and acquisition.]

For example, one MapR customer uses MapR Streams to collect real-time data from all of its regional data centers and bring it to its central data center. Previously, the customer used FTP to transfer data from antennas to regional data centers and from there to headquarters, but the process suffered from severe latency.

Now data is collected at regional data centers with MapR Streams and made available in real time to regional dashboards.

MapR Streams topics at regional data centers are replicated in a many-to-one configuration to the central data center, making events available in real time to the headquarters dashboard. The company can now monitor global performance and react quickly to improve customer services.
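
The many-to-one pattern can be pictured with a small in-memory simulation. This is a sketch only: it does not use the MapR Streams API, in which replication is configured on the platform rather than written in application code. The `Cluster` class, the topic name, and the event fields are all hypothetical.

```python
from collections import defaultdict

class Cluster:
    """Toy stand-in for a cluster holding append-only topics."""
    def __init__(self, name):
        self.name = name
        self.topics = defaultdict(list)  # topic name -> list of events
        self.replicas = []               # downstream clusters

    def replicate_to(self, other):
        self.replicas.append(other)

    def publish(self, topic, event):
        self.topics[topic].append(event)
        # Forward each event to every downstream cluster, tagged with its origin.
        for replica in self.replicas:
            replica.topics[topic].append({**event, "origin": self.name})

central = Cluster("central")
regions = [Cluster(n) for n in ("region-a", "region-b", "region-c")]
for region in regions:
    region.replicate_to(central)  # many regional clusters, one central target

regions[0].publish("antenna-metrics", {"antenna": "A-17", "dropped_calls": 3})
regions[1].publish("antenna-metrics", {"antenna": "B-02", "dropped_calls": 0})
regions[2].publish("antenna-metrics", {"antenna": "C-44", "dropped_calls": 7})

# The central cluster sees events from every region; each region sees only its own.
total_dropped = sum(e["dropped_calls"] for e in central.topics["antenna-metrics"])
print(total_dropped)
```

Each regional dashboard would read its local topic, while the headquarters dashboard reads the combined topic on the central cluster.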

[Figure: Previous architecture. Regional data centers A, B, and C; FTP transfers; a staging file server; aggregation steps; monitoring and reporting systems at the central data center; dashboards for regional data centers A, B, and C; 20-30 minutes.]

[Figure: New pipeline. File server; producer (Java): monitoring directory, parsing CSV files, publishing messages to topic; topic; consumer (Java): parsing master data, subscribing topic, join tables, aggregation; filtering config; index (Elasticsearch); dashboard (Kibana).]
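
The producer and consumer stages named in the pipeline diagram for this deployment (parse CSV files and publish messages to a topic; subscribe, join with master data, and aggregate) can be sketched as follows. The customer's components were written in Java; this Python version is an illustration only. An in-memory queue stands in for the topic, and the column names and master data are hypothetical.

```python
import csv
import io
import queue
from collections import defaultdict

topic = queue.Queue()  # in-memory stand-in for a stream topic

def produce(csv_text):
    """Parse a CSV file's contents and publish one message per row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        topic.put(row)

def consume(master_data):
    """Drain the topic, join each message with master data on antenna_id,
    and aggregate dropped calls per region."""
    dropped_by_region = defaultdict(int)
    while not topic.empty():
        msg = topic.get()
        region = master_data.get(msg["antenna_id"], "unknown")
        dropped_by_region[region] += int(msg["dropped_calls"])
    return dict(dropped_by_region)

# Hypothetical file contents and master (reference) data.
csv_text = "antenna_id,dropped_calls\nA-17,3\nB-02,1\nA-18,4\n"
master_data = {"A-17": "north", "A-18": "north", "B-02": "south"}

produce(csv_text)
summary = consume(master_data)
print(summary)
```

In the deployment described, the aggregated results would be indexed for a dashboard rather than printed.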

Being able to process high-throughput, geo-distributed events in real time enables the company to understand how and where service issues are trending and how that is affecting customers. Crowd-based antenna optimization enables monitoring of rapidly changing network usage patterns, with the ability to reconfigure network support to handle short-term surges, such as heavy usage near a stadium during a sporting event.

Service optimization through equipment monitoring, capacity planning, and preventative maintenance cuts down on dropped calls, network coverage gaps, bandwidth issues, slow download times, long service wait times, and frequency switching.

Customer 360

Using data science to better understand and predict customer behavior is an iterative process that involves the following steps:

1. Data discovery, collection, correlation, and analysis of data across multiple data sources, including new data sources that traditional analytics or databases can’t use.

2. Application of machine learning algorithms to get value out of the data.

3. Use of models in production to make predictions.

4. Updating models with new data.
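
The four steps above can be walked through with a deliberately tiny model. This is a sketch under stated assumptions, not a recommended churn model: the single feature (support calls), the threshold-based learner, and the data are all hypothetical, chosen so the loop is easy to follow.

```python
def train_stump(rows):
    """Pick the support-call threshold that best separates churners.
    Each row is (support_calls, churned); predict churn when calls >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({calls for calls, _ in rows}):
        correct = sum((calls >= threshold) == churned for calls, churned in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def predict(threshold, support_calls):
    return support_calls >= threshold

def accuracy(threshold, rows):
    return sum(predict(threshold, c) == y for c, y in rows) / len(rows)

# Step 1: collected and correlated data (hypothetical): (support_calls, churned).
history = [(0, False), (1, False), (2, False), (5, True), (6, True), (1, False)]
train, test = history[:4], history[4:]

# Step 2: apply a learning algorithm to the training set.
model = train_stump(train)

# Step 3: evaluate on held-out data, then use the model to make predictions.
test_accuracy = accuracy(model, test)
will_churn = predict(model, 7)

# Step 4: update the model when new data arrives.
new_data = [(3, True), (4, True)]
updated_model = train_stump(history + new_data)
print(model, test_accuracy, will_churn, updated_model)
```

Note how the learned threshold moves once the new observations are included, which is the reason step 4 is part of the loop rather than an afterthought.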

[Figure: Network components send performance and other monitoring-related data to stream topics at regional data centers. Event stream replication carries the topics to the central data center, which also draws on other data sources, for a real-time dashboard, ad-hoc analysis, real-time analysis, and reporting.]

Factors that can be analyzed to better understand the customer include:

• Customer demographic data (age, marital status, etc.)

• Sentiment analysis of social media

• Customer usage patterns

• Geographic usage trends

• Calling-circle data

• Browsing behavior from clickstream logs

• Support center statistics

• Historical data that shows patterns of behavior that suggest churn

With this analysis, telecom companies can gain insights that help them predict and enhance the customer experience, prevent churn, and tailor marketing campaigns.
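
Bringing several of the factors above into a single per-customer record might look like the following sketch. The source names, field names, and values are hypothetical; the point is only the join on a shared customer identifier.

```python
from collections import Counter

def build_customer_features(crm, usage_events, support_tickets):
    """Join per-source records on customer_id into one feature row each."""
    data_mb = Counter()
    for event in usage_events:
        data_mb[event["customer_id"]] += event["mb"]
    tickets = Counter(t["customer_id"] for t in support_tickets)

    features = {}
    for customer in crm:
        cid = customer["customer_id"]
        features[cid] = {
            "age": customer["age"],           # demographic data
            "total_data_mb": data_mb[cid],    # usage pattern
            "support_tickets": tickets[cid],  # support center statistic
        }
    return features

crm = [{"customer_id": "c1", "age": 34}, {"customer_id": "c2", "age": 51}]
usage_events = [
    {"customer_id": "c1", "mb": 120},
    {"customer_id": "c1", "mb": 80},
    {"customer_id": "c2", "mb": 15},
]
support_tickets = [{"customer_id": "c2"}, {"customer_id": "c2"}]

features = build_customer_features(crm, usage_events, support_tickets)
print(features)
```

Rows like these are what a churn or response model would consume as input.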

The architecture below shows how batch processing on different data sources can be used to build and update a model, which can then be used for real-time predictions on streaming data.

[Figure: Data discovery and model creation. Historical data (customer data/CRM, call center records, application logs, web clickstream) goes through feature extraction into a training set and a test set for model training/building, model testing, predictions, and evaluation of results. Model types: prediction modeling, cohort analysis, customer lifetime value analysis, attrition modeling, response modeling, churn modeling. Production: new data from the same sources goes through feature extraction and the deployed model to produce predictions.]

[Figure: Data sources; collect data (stream topic); stream processing (process, derive features, model); batch processing (feature extraction and machine learning to build and update models); serve data.]

Data Warehouse Optimization

A leading telecommunications provider plans to improve reporting and analytics on all aspects of customer usage and billing with the expectation that it can reduce churn by identifying and addressing network hotspots. MapR is the only Converged Data Platform that can scale to meet the data volume needs of this company while also satisfying reporting requirements by reducing workload on existing data warehouse systems.

Threat Detection

Solutionary, a subsidiary of NTT Group, is a leader in managed security services. It provides threat intelligence, incident response, compliance and vulnerability management as a service, using a platform that collects and correlates vast amounts of data from logs, endpoints, firewalls, and network devices.

The company needed to improve scalability as data volume grew, but the task was cost-prohibitive using its existing Oracle database solution. It couldn’t process unstructured log data at scale, and there were also major performance issues.

Solutionary replaced its RDBMS solution with the MapR Converged Data Platform to achieve scalability while still meeting reliability requirements. The new solution combines machine learning algorithms, complex event processing, and predictive analytics to detect real-time security threats.
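
Complex event processing, one of the techniques named above, typically means evaluating rules over a moving window of events. The sketch below shows one such rule and nothing more: it is not Solutionary's implementation, and the rule (three failed logins from one source within 60 seconds), the event fields, and the addresses are hypothetical.

```python
from collections import defaultdict, deque

class FailedLoginDetector:
    """Raise an alert when one source IP produces `limit` failed logins
    within `window` seconds. A toy example of a sliding-window event rule."""
    def __init__(self, limit=3, window=60):
        self.limit = limit
        self.window = window
        self.recent = defaultdict(deque)  # source IP -> timestamps of failures

    def process(self, event):
        if event["type"] != "login_failed":
            return None
        times = self.recent[event["src"]]
        times.append(event["ts"])
        # Drop failures that have fallen out of the time window.
        while times and event["ts"] - times[0] > self.window:
            times.popleft()
        if len(times) >= self.limit:
            return f"ALERT {event['src']}: {len(times)} failed logins in {self.window}s"
        return None

events = [
    {"ts": 0,   "src": "10.0.0.5", "type": "login_failed"},
    {"ts": 10,  "src": "10.0.0.9", "type": "login_ok"},
    {"ts": 20,  "src": "10.0.0.5", "type": "login_failed"},
    {"ts": 45,  "src": "10.0.0.5", "type": "login_failed"},
    {"ts": 200, "src": "10.0.0.5", "type": "login_failed"},
]

detector = FailedLoginDetector()
alerts = [a for a in map(detector.process, events) if a]
print(alerts)
```

Only the third failure inside the window triggers an alert; the isolated failure at `ts=200` does not, because the earlier ones have expired.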

[Figure: Data warehouse optimization. Data sources (relational, SaaS, mainframe; documents, emails; blogs, tweets, link data; log files, clickstreams, sensors) feed an optimized data architecture on MapR-FS, MapR-DB, and MapR Streams for data transformation, enrichment, and integration, with ETL into operational reporting formats (e.g., Parquet) and the data warehouse. Outputs: operational apps (recommendations, fraud detection, logistics) and analytics (search; schema-less data exploration; BI, reporting, ad-hoc integrated analytics; machine learning).]

[Figure: Threat detection pipeline. Sources (security feeds, HTTP, syslog, firewall, other); data ingest (topics); stream processing; NoSQL storage; serve.]

MACQUARIE SECURE DATA SERVICES

Macquarie Telecom’s Government Division secures telecommunications for 42% of government agencies in Australia. Macquarie Telecom secures, monitors, and analyzes hundreds of system logs and other data that together comprise about one billion events every day. The company uses this information to predict and prevent cyber attacks. The division needed to transition from standard tools to big data analytics to be able to provide more real-time analysis on a rapidly growing amount of data.

Macquarie chose MapR to collect internet traffic that travels through its gateways into a centralized data lake. This repository stores information about when users open email attachments, visit websites, and download software to their devices. The MapR Platform then runs analytics to predict when and where attacks might come from and to enable insights about how to anticipate threats and proactively secure the government’s system.

[Figure: Macquarie deployment on the MapR Converged Data Platform. Data sources: relational, JSON, server logs, email, social. Data types: time series, structured data, JSON; unstructured data; NFS/raw files; real-time event data. Platform: MapR-FS, MapR-DB, and MapR Streams, with ETL into operational reporting formats (e.g., Parquet) and agile, self-service data exploration. Services: table replication (global multi-master, business continuity); multi-tenancy (job/data placement control, volumes); access controls (file, table, column, column family, doc, sub-doc levels); auditing (compliance, analyze user access); snapshots (track data lineage and history).]

The MapR Platform provides Macquarie with cost-effective scalable storage, analysis, and better performance. Macquarie can now provide timely, tailored reports to its government clients, allowing them to get more value from their data, make better predictions, and be more responsive to citizens.

All of the components of the use cases just discussed can run on the same cluster with the MapR Converged Data Platform, providing advantages such as:

• Less complexity, fewer moving parts, and fewer things to manage because multiple clusters for Streams/HBase/Spark/Hadoop can be merged into one cluster

• Joining data sources into one core data mediation platform so that applications consume data in an easier way

• Unified security

• High reliability and high availability with replication from data center to data center

Telecom is a classic example of the big data issues of huge volume and velocity, but CSPs also have demanding requirements for quick response, security, and reliability. The use cases we just described show how telecom companies can not only address these requirements, but also unlock value from data that was previously inaccessible or impractical to use.

[Figure: Sources/apps, bulk processing, and stream processing on the MapR Converged Data Platform (MapR-FS web-scale storage, MapR-DB database, MapR Streams event streaming).]

MapR and the MapR logo are registered trademarks of MapR and its subsidiaries in the United States and other countries. Other marks and brands may be claimed as the property of others. The product plans, specifications, and descriptions herein are provided for information only and subject to change without notice, and are provided without warranty of any kind, express or implied. Copyright © 2017 MapR Technologies, Inc.

For more information visit mapr.com

CONCLUSION

In a recent survey of 273 global telecom companies,16 McKinsey identified a strong appetite for big data projects, but relatively little use. While nearly half of respondents said their companies are considering investments in big data and analytics, only 30% had actually made them. Many of those reported disappointing results, with little incremental profit improvement. However, these results were mostly blamed on poor data quality, lack of talent, and under-investment.

In contrast, a small group of telecom providers had achieved “outsized benefit” from their investments. For example, one had used analytics models to predict the periods of heaviest network use resulting from customer video streaming. It was able to take steps to relieve congestion and reduce its planned capital expenditures by 15%. “The potential for companies that apply data science effectively is substantial,” McKinsey researcher Jacques Bughin wrote.

Effective use of big data requires commitment, a clear understanding of goals, and an investment in skills and technologies. There can be no question that telecommunications companies have many potential use cases that can significantly improve their understanding of customers and their own infrastructure. The best approach for early adopters is to identify projects with measurable short-term opportunities, then deploy a scalable platform that can adapt to a wide variety of data types and tools.

MORE INFORMATION AND USE CASES

• Big Data and MapR for Telecommunications

• Churn Prediction with PySpark using MLlib and ML Packages

• How to Use Data Science and Machine Learning to Revolutionize 360° Customer Views

• MapR Streams Apache Apex Telecom use case

• NTT Comware Deploys MapR to Power Hadoop-as-a-Service for SmartCloud®

• Razorsight Offers Telecom Clients Predictive Analytics Solutions based on Hadoop and Apache Spark

• Macquarie Telecom deploys MapR technology

• Quantium Delivers Lightning-Fast Customer Analytics Using Hadoop and Apache Spark

16 Bughin, Jacques, Telcos: The untapped promise of big data