© 2015 MapR Technologies 2
Data Is Doubling Every Two Years
Unstructured data will account for more than 80% of the data collected by organizations
Source: Human-Computer Interaction & Knowledge Discovery in Complex Unstructured, Big Data
[Chart: Total Data Stored by year (1980, 1990, 2000, 2010, 2020); labels: STRUCTURED DATA, SEMI-STRUCTURED DATA, IT Resources]
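As a rough illustration of the headline claim: if volume doubles every two years, growth over a period is 2 raised to (years / 2). The sketch below is only arithmetic on that assumption; the function name and the sample horizons are made up for illustration.

```python
def growth_factor(years: float, doubling_period: float = 2.0) -> float:
    """Return how many times larger the data volume is after `years`,
    assuming it doubles once every `doubling_period` years."""
    return 2.0 ** (years / doubling_period)


if __name__ == "__main__":
    # Sample horizons chosen only for illustration.
    for years in (2, 10, 20):
        print(f"after {years} years: x{growth_factor(years):g}")
```

Under this assumption, one decade corresponds to a 32-fold increase and two decades to roughly a thousand-fold increase.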
Data Increasingly Stored in Non-Relational Datastores
[Timeline: 1980, 1990, 2000, 2010, 2020]

             RELATIONAL DATABASES                     NON-RELATIONAL DATASTORES
Volume       GBs-TBs                                  TBs-PBs
Structure    Structured                               Structured, semi-structured and unstructured
Development  Planned (release cycle = months-years)   Iterative (release cycle = days-weeks)
Database     Fixed schema                             Dynamic / Flexible schema
             DBA controls structure                   Application controls structure
How To Bring SQL Into An Unstructured Future?
Familiarity of SQL
• SQL
• BI (Tableau, MicroStrategy, etc.)
• Low latency
Flexibility of NoSQL
• Scalability
• No schema management
– HDFS (Parquet, JSON, etc.)
– HBase
– …
• No transform or silos of data
Apache Drill Brings Flexibility & Performance
Access to any data type, any data source
• Relational
• Nested data
• Schema-less
Rapid time to insights
• Query data in-situ
• No schemas required
• Easy to get started
Integration with existing tools
• ANSI SQL
• BI tool integration
Scale in all dimensions
• TB-PB of scale
• 1000s of users
• 1000s of nodes
Granular Security
• Authentication
• Row/column level controls
• De-centralized
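To make "query data in-situ" and "nested data" concrete, here is a minimal sketch of submitting SQL to Drill over its REST interface (`POST /query.json` with a `queryType`/`query` JSON body). The host and port assume a local Drillbit on the default web port 8047, and the file path and field names in the sample query are hypothetical.

```python
import json
import urllib.request

# Assumption: a Drillbit running locally with its web/REST port at the default 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical file and fields; `dfs` is assumed to be the file-system storage plugin name.
# The table alias `t` is needed to reach the nested field `address.city`.
EXAMPLE_SQL = (
    "SELECT t.name, t.address.city AS city "
    "FROM dfs.`/data/customers.json` t LIMIT 10"
)


def build_query_request(sql: str, url: str = DRILL_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, url: str = DRILL_URL) -> list:
    """Send the query and return the result rows (requires a running Drill)."""
    with urllib.request.urlopen(build_query_request(sql, url)) as response:
        return json.load(response).get("rows", [])
```

With a Drillbit running, `run_query(EXAMPLE_SQL)` would read the JSON file where it sits, with no table definition or load step beforehand.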
Extending Self-Service to Schema-Free Data
Use cases for BI (vertical axis: Agility & Business Value)
• 1980s-1990s: IT-Driven BI (IT-created reports, spreadsheets)
• 2000s: IT-Driven BI, plus Self-Service BI (analyst-driven with IT support for ETL)
• Now: IT-Driven BI and Self-Service BI, plus Schema-Free Data Exploration (analyst-driven with no IT dependency)
Enabling “As-It-Happens” Business with Instant Analytics
Triggers: new business questions, source data evolution
Governed approach
• Hadoop data -> Data modeling -> Transformation -> Data movement (optional) -> Users
• Total time to insight: weeks to months
Exploratory approach
• Hadoop data -> Users
• Total time to insight: minutes
Drill’s Role in the Enterprise Data Architecture
Raw data
• JSON, CSV, ...
“Optimized” data
• Parquet, …
Centrally-structured data
• Schemas in Hive Metastore
Relational data
• Highly-structured data
Hive, Impala, Spark SQL
Oracle, Teradata
Exploration
(known and unknown questions)
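One way to picture exploration across these tiers is that the same SQL shape can point at a raw file, a Parquet file, or a Hive-managed table just by changing the table reference. The sketch below only builds query strings; the paths and table name are hypothetical, and `dfs` and `hive` are assumed storage plugin names.

```python
# Hypothetical paths and table names; `dfs` and `hive` are assumed storage plugin names.
SOURCES = {
    "raw_json": "dfs.`/data/raw/events.json`",
    "parquet": "dfs.`/data/optimized/events.parquet`",
    "hive_table": "hive.`events`",
}


def count_query(source_key: str) -> str:
    """Return a SQL statement that counts rows in the chosen source."""
    return f"SELECT COUNT(*) AS n FROM {SOURCES[source_key]}"


if __name__ == "__main__":
    for key in SOURCES:
        print(count_query(key))
```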
MapR with Drill is Top-Ranked SQL-on-Hadoop
Source: Gigaom Research, 2015
Key:
• Number indicates company’s relative strength across all vectors
• Size of ball indicates company’s relative strength along individual vector
Like other vendors’ offerings, Drill handles BI and interactive queries with great aplomb, but it is designed to serve these workloads with data complexity that goes well beyond the flat structured data that other SQL-on-Hadoop systems deal with.
Drill Benefits
Business users (Business Analyst, Data scientists)
• Self-service access to Hadoop data from BI tools
• Agility with no IT intervention
• Interactive performance
Technical IT (VP of Hadoop Dev., Director of BI & Analytics, Enterprise architect)
• Drive Hadoop adoption in company
• Enable better/new BI on raw, real-time and new data types
• Reduce cost of traditional systems
Additional Resources
Resource Hub: MapR.com/Drill
Tutorial: Apache Drill in 10 Minutes
Whitepaper: Faster Time to Value