Big Data - What's the Big Deal? - Fujitsusp.ts.fujitsu.com/.../ps-IT-Future-13-KZ-GF-ru.pdf ·...
0 Copyright 2013 FUJITSU
Big Data – What’s the Big Deal?
Gernot Fels, Fujitsu International Solution Marketing
Some dreams organizations dream
What if you could …
Predict what will happen
Market trends
Customer behavior
Business opportunities
Problems
Make the right decisions?
Accelerate your decisions?
Have the corresponding actions initiated automatically?
Fully understand the root cause of costs?
Skip useless activities?
Quantify and minimize risks?
Know what you don’t know?
Can these dreams come true?
Is Business Intelligence an option?
Concept
Turn data into information
Collect and consolidate data
Transform data into usable form
Analyze data (discover relations, patterns)
Objectives
Improve business processes
Reduce costs
Minimize risk
Increase business value
Diagram: ETL → Data warehouse → Analyze → Decide → Act
Decision support based on data instead of intuition.
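The collect, transform and analyze steps listed above can be illustrated with a minimal sketch. The two CSV extracts, the field names and the `extract`, `transform`, `load` and `analyze` helpers are hypothetical and not taken from the slides; a plain Python list stands in for the data warehouse.

```python
import csv
import io
from collections import defaultdict

# Two hypothetical source extracts with inconsistent formatting (illustrative data).
SOURCE_A = "region,product,sales\nEurope,Mice,750\nAmerica,Mice,1100\n"
SOURCE_B = "region,product,sales\n asia ,Rats,2000\nEUROPE,Rats,2000\n"


def extract(raw):
    """Collect: parse one CSV extract into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows):
    """Transform: normalize region names and convert sales to integers."""
    return [
        {
            "region": row["region"].strip().title(),
            "product": row["product"].strip(),
            "sales": int(row["sales"]),
        }
        for row in rows
    ]


def load(warehouse, rows):
    """Consolidate: append the cleaned rows to the stand-in warehouse."""
    warehouse.extend(rows)


def analyze(warehouse):
    """Analyze: total sales per region."""
    totals = defaultdict(int)
    for row in warehouse:
        totals[row["region"]] += row["sales"]
    return dict(totals)


warehouse = []
for source in (SOURCE_A, SOURCE_B):
    load(warehouse, transform(extract(source)))

sales_by_region = analyze(warehouse)
print(sales_by_region)
```

The point of the sketch is only the separation of steps: analysis runs on consolidated, cleaned data rather than on the raw sources.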
BI as it used to be
Few sources
Internal
Structured
GB and TB
At rest
Reports on history
Avoid risk
Periodic
Batch
Static model
Few direct users
On-premise
The world has changed since then
Many more data sources, much more data, but most of it remains untapped.
Today’s demands
Versatile sources
Internal and external
Un- / semi- / poly- / structured
TB and PB
At rest / in motion
Predict
Recognize opportunities
Ad-hoc
Real-time
Try and innovate
Many direct users
Anywhere, from any device
Welcome to the world of Big Data
Affordable technologies to quickly capture, store and analyze data.
Volume
Variety
Velocity
Versatility
Value
Big Data matters to every industry
Manufacturing
Energy
Maintenance
Agriculture
Marketing
Healthcare
Traffic, Transport
Public Sector
Retail
Finance
New opportunities, new values for enterprises and society.
Why traditional solutions do not work
Objective: Keep processing time constant while volumes increase.
Exponential growth of data
Row-oriented store
Key  Region   Product  Sales  Vendor
0    Europe   Mice     750    Tigerfoot
1    America  Mice     1100   Lionhead
2    America  Rabbits  250    Tigerfoot
3    Asia     Rats     2000   Lionhead
4    Europe   Rats     2000   Tigerfoot
SELECT SUM (Sales) GROUP BY Region
Legend: Data required; Data read, but not required; Index Area
RDBMS
Many interrelated tables, data in rows
Long-lasting queries (reading irrelevant data)
Rigid schema (pre-processing of unstructured data)
RDBMS and server scale-up
Server performance is limited
Hard limit for each resource type
RDBMS and server scale-out
Effort to coordinate access to shared data
Decreasing server efficiency with increasing number
Storage connection as bottleneck
Time-consuming analysis – Far away from real-time
What will work for Big Data?
New demands require new solution approaches.
Distributed parallel processing
Diagram: Client; Master (Name Node, Job Tracker); Slaves (each with Data Node and Task Tracker); DFS
Concept
Distribute data and I/O to many nodes
Local server storage
Move computing to where data resides
Shared nothing architecture, non-blocking network
Data replication to several nodes
Benefits
High performance, fast results
Unlimited scalability
Fault-tolerance
Cost-effective (standard servers with OSS)
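The idea of distributing data and moving the computation to where the data resides can be sketched as a map step per node followed by a reduce step on the coordinator. This is a minimal single-process simulation, not the API of any particular framework: the `NODES` partitions, `map_on_node` and `reduce_partials` are hypothetical names, and the nodes are processed one after another rather than in parallel.

```python
from collections import defaultdict

# Hypothetical cluster: each "data node" holds a local partition of the records.
NODES = {
    "node1": [("Europe", 750), ("America", 1100)],
    "node2": [("America", 250), ("Asia", 2000)],
    "node3": [("Europe", 2000)],
}


def map_on_node(records):
    """Runs where the data resides: aggregate the local partition only."""
    partial = defaultdict(int)
    for region, sales in records:
        partial[region] += sales
    return dict(partial)


def reduce_partials(partials):
    """Runs on the coordinator: merge the small partial results."""
    totals = defaultdict(int)
    for partial in partials:
        for region, sales in partial.items():
            totals[region] += sales
    return dict(totals)


# Each node processes its own data; only the partial sums would cross the network.
partials = [map_on_node(records) for records in NODES.values()]
totals = reduce_partials(partials)
print(totals)
```

In a real cluster the map calls would run concurrently on separate servers, and each partition would be replicated to several nodes so that a failed node's work can be rerun elsewhere.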
In-Memory DB (IMDB)
Concept
Load entire DB into RAM of server(s) for analysis
Operation in RAM
Persistence layer on disk
Data replication among servers for fast recovery
Scale-up and scale-out
Benefits
Fast data storing, retrieval, sort
Real-time analysis (1K-10K times faster than on disk)
Restriction
Data size is limited by RAM size and budget
Eliminate I/O.
Diagram: Apps access the IMDB, whose tables are held in the RAM of several servers; Persistence Layer: Replication, Backup, change log
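The IMDB concept (operation in RAM with a persistence layer on disk) can be tried out on a small scale with Python's built-in `sqlite3` module. SQLite is only a stand-in chosen for this sketch and is not a product named in the slides; the `sales` table, its column names and the backup file name are assumptions.

```python
import os
import sqlite3
import tempfile

ROWS = [
    (0, "Europe", "Mice", 750, "Tigerfoot"),
    (1, "America", "Mice", 1100, "Lionhead"),
    (2, "America", "Rabbits", 250, "Tigerfoot"),
    (3, "Asia", "Rats", 2000, "Lionhead"),
    (4, "Europe", "Rats", 2000, "Tigerfoot"),
]

# Operation in RAM: ":memory:" keeps the whole database in process memory.
mem_db = sqlite3.connect(":memory:")
mem_db.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, product TEXT, "
    "sales INTEGER, vendor TEXT)"
)
mem_db.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", ROWS)
mem_db.commit()

# Analysis runs entirely against RAM.
totals = dict(
    mem_db.execute("SELECT region, SUM(sales) FROM sales GROUP BY region")
)

# Persistence layer on disk: copy the in-memory database to a file.
disk_path = os.path.join(tempfile.mkdtemp(), "sales_backup.db")
disk_db = sqlite3.connect(disk_path)
mem_db.backup(disk_db)
disk_db.close()

# Recovery check: reopen the file and count the persisted rows.
restored = sqlite3.connect(disk_path)
restored_count = restored.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
restored.close()
mem_db.close()
print(totals, restored_count)
```

The sketch covers a single process only; replication among servers and scale-out, as listed on the slide, are outside its scope.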
In-Memory Data Grid (IMDG)
Concept
Distributed in-memory cache between apps and data
Caching of complex queries and reusable results
Access to disk only when needed
Transparent access by apps possible, optimized cache utilization by adapting apps
Persistence layer on disk
Data replication among servers for fast recovery
Scale-up and scale-out
Benefits
Fast response
Add performance as you grow
Business continuity
Increase app performance by reducing I/O.
Diagram: Apps access the IMDG, which caches queries, reports and web pages in the RAM of several servers; Persistence Layer: Replication, Backup, change log
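The caching idea behind an IMDG (go to disk only when needed, serve reusable results from RAM) can be sketched as a read-through cache. The `SlowStore` and `ReadThroughCache` classes and the cached keys are hypothetical; the sketch runs in one process and leaves out distribution, replication and eviction.

```python
class SlowStore:
    """Stand-in for the disk-based data source; counts how often it is hit."""

    def __init__(self, data):
        self._data = data
        self.reads = 0

    def query(self, key):
        self.reads += 1
        return self._data[key]


class ReadThroughCache:
    """Keeps results in RAM and goes to the backing store only on a miss."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._store.query(key)
        return self._cache[key]


store = SlowStore({"report:q1": "Q1 sales report", "page:home": "<html>...</html>"})
grid = ReadThroughCache(store)

first = grid.get("report:q1")   # miss: read from the store
second = grid.get("report:q1")  # hit: served from RAM
print(first, store.reads)
```

Because the application only calls `get`, the cache stays transparent to it; an application adapted to the grid could additionally decide what to preload or invalidate.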
All Flash Array (AFA)
Concept
Designed and optimized for flash
High-speed interconnect to servers
Scale-up and scale-out
No single point of failure
Benefits
Highest I/O performance (6- to 7-digit number of IOPS)
Low latency
Consistent user experience
Better utilization of capacity
Less space and power consumption
Data integrity, no data loss
More IOPS, shorter latency – Accelerated I/O.
Databases – Designed for Big Data
Flexible data model
Schema-less (simple structure)
Easily cope with new data types
Change format of data without app disruption
Ease of app development
Designed to be distributed
Data automatically spread across servers
Distributed query support
Designed for scale-out
Add / remove DB server to / from cluster
Standard server HW
Integrated caching
Reduce latency, increase throughput
Improve app responsiveness
Tolerate and recover from failure
Data replication (copies across cluster / across DC)
Automatic repair
Various flavors
Graph DB
Document-oriented DB
Key-value stores
Columnar stores
NoSQL databases help overcome the limitations of RDBMS.
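How data can be spread automatically across servers, with replicas to tolerate a failure, can be sketched with simple hash-based placement. The server names, the `owners` function and the `Cluster` class are hypothetical and do not describe any specific NoSQL product; the stored documents deliberately have different fields to show the schema-less model.

```python
import hashlib

SERVERS = ["db1", "db2", "db3"]  # hypothetical cluster members
REPLICAS = 2                     # copies kept of each item


def owners(key, servers=SERVERS, replicas=REPLICAS):
    """Pick the servers for a key: hash to a start index, then take neighbours."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(replicas)]


class Cluster:
    """Toy key-value cluster: one dict per server, writes go to all owners."""

    def __init__(self, servers=SERVERS):
        self.servers = list(servers)
        self.nodes = {name: {} for name in self.servers}

    def put(self, key, document):
        for name in owners(key, self.servers):
            self.nodes[name][key] = document

    def get(self, key, down=()):
        """Read from the first owner that is not marked as failed."""
        for name in owners(key, self.servers):
            if name not in down:
                return self.nodes[name].get(key)
        raise RuntimeError("all replicas unavailable")


cluster = Cluster()
cluster.put("user:42", {"name": "Ada", "tags": ["vip"]})
cluster.put("user:43", {"name": "Bob"})  # different fields, no schema change

primary = owners("user:42")[0]
survivor = cluster.get("user:42", down={primary})  # primary failed, replica answers
print(survivor)
```

Plain modulo placement moves most keys when a server is added or removed; production systems typically use consistent hashing or a similar scheme to limit that reshuffling.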
Column-oriented databases
Row-oriented data store
Billions of rows (records)
Many queries relate to small number of columns
Inefficient to read all rows
Column-oriented data store
Reduced I/O, fast analysis
Access only relevant columns
Split column into multiple sections for simple parallelized processing
High compression
Column contains only 1 data type
Often only a few distinct values per column (to store)
Typical compression factor 10-50
NoSQL databases help overcome the limitations of RDBMS.
Example table (shown for both the row-oriented and the column-oriented store)
Key  Region   Product  Sales  Vendor
0    Europe   Mice     750    Tigerfoot
1    America  Mice     1100   Lionhead
2    America  Rabbits  250    Tigerfoot
3    Asia     Rats     2000   Lionhead
4    Europe   Rats     2000   Tigerfoot
SELECT SUM (Sales) GROUP BY Region
Legend: Data required; Data read, but not required
The query needs only the Region and Sales columns; a row-oriented store still reads Product and Vendor, a column-oriented store does not.
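The difference between the two layouts for the query above can be made concrete with a small sketch. It uses the sample table from the slide; counting "values read" is a simplified cost model assumed here for illustration, and the function names are hypothetical. Run-length encoding is shown as one common example of why columns compress well, not as the method of any specific product.

```python
from collections import defaultdict

COLUMNS = ("Key", "Region", "Product", "Sales", "Vendor")
ROWS = [
    (0, "Europe", "Mice", 750, "Tigerfoot"),
    (1, "America", "Mice", 1100, "Lionhead"),
    (2, "America", "Rabbits", 250, "Tigerfoot"),
    (3, "Asia", "Rats", 2000, "Lionhead"),
    (4, "Europe", "Rats", 2000, "Tigerfoot"),
]

# Column-oriented layout: one list per column.
column_store = {name: [row[i] for row in ROWS] for i, name in enumerate(COLUMNS)}


def sum_sales_row_store(rows):
    """Row store: every field of every row is read to answer the query."""
    totals, values_read = defaultdict(int), 0
    for row in rows:
        values_read += len(row)
        totals[row[1]] += row[3]
    return dict(totals), values_read


def sum_sales_column_store(store):
    """Column store: only the Region and Sales columns are read."""
    regions, sales = store["Region"], store["Sales"]
    totals = defaultdict(int)
    for region, amount in zip(regions, sales):
        totals[region] += amount
    return dict(totals), len(regions) + len(sales)


def run_length_encode(values):
    """Compress runs of equal values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


row_totals, row_reads = sum_sales_row_store(ROWS)
col_totals, col_reads = sum_sales_column_store(column_store)

# Sorting is for illustration only; a real store would reorder all columns together.
encoded_regions = run_length_encode(sorted(column_store["Region"]))
print(row_reads, col_reads, encoded_regions)
```

Both layouts return the same totals, but under this cost model the row store touches all 25 values while the column store touches 10.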
Complex Event Processing (CEP)
Concept
Collect and analyze continuously generated data (event streams) in real-time
Compare events to set of rules
Logical and temporal correlation
Detect patterns
Raise alert when match is found / absence of events
Trigger actions in real-time
In-memory cache for acceleration (IMDS or IMDG)
Critical parameters
Throughput: 10K’s to M’s of events per sec
Latency: µs to ms (time from event input to output)
CEP requires parallel processing and in-memory technologies.
Diagram: Streaming, Sensor → Event input → CEP Engine (apply rules, state transition) → Event output → Control
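The CEP concept (compare each event to rules, correlate over time, alert on a match or on missing events) can be sketched as a toy engine with two rules. The `CepEngine` class, the thresholds, the alert names and the sample stream are all hypothetical; the sketch processes events one by one in a single thread and says nothing about the throughput or latency figures on the slide.

```python
from collections import deque


class CepEngine:
    """Toy rule engine: one temporal pattern rule and one absence rule."""

    def __init__(self, threshold=80, count=3, window=10, max_silence=30):
        self.threshold = threshold      # reading considered "high"
        self.count = count              # high readings needed for a match
        self.window = window            # seconds in which they must occur
        self.max_silence = max_silence  # seconds without events before alerting
        self._high = deque()            # timestamps of recent high readings
        self._last_seen = None

    def on_event(self, timestamp, value):
        """Process one event and return the alerts it triggers."""
        alerts = []

        # Absence rule: the gap since the previous event was too long.
        if self._last_seen is not None:
            if timestamp - self._last_seen > self.max_silence:
                alerts.append(("silence", timestamp))
        self._last_seen = timestamp

        # Pattern rule: enough high readings inside the time window.
        if value > self.threshold:
            self._high.append(timestamp)
        while self._high and timestamp - self._high[0] > self.window:
            self._high.popleft()
        if len(self._high) >= self.count:
            alerts.append(("overheat", timestamp))
            self._high.clear()
        return alerts


engine = CepEngine()
stream = [(0, 70), (2, 85), (5, 90), (8, 95), (60, 60)]  # (seconds, reading)
alerts = [alert for ts, value in stream for alert in engine.on_event(ts, value)]
print(alerts)
```

In this sketch the absence rule fires only when the next event arrives; an engine that must react during the silence itself would need a timer in addition to the event handler.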
Big Data solution architecture
Stages: Data Sources, Analytics Platform, Access
Process: Extract, Collect → Clean, Transform → Analyze, Visualize → Decide, Act
Data: Various data → Consolidated data → Distilled essence → Applied knowledge
Data sources: Un- / semi- / poly-structured data, IMDB, DB / DW
Analytics platform: Distributed Data Store with Map Reduce, Data Store, CEP, IMDB, DB / DW, IMDS, IMDG
Access: Visualization, Reporting, Notification, Queries, …
Big Data is not just about infrastructure
Hoarding data is not enough
Turn data into high-quality data to get high-quality results
Iterative and exploratory analysis
Understand data and discover meaning
Lots of questions
Which data?
What to look for?
Which questions to ask?
Which tools? How to use them?
What about data retention?
Analytic skills (data scientist)
Change to analytical culture
Deep knowledge of data, tools and intended results.
How Fujitsu can help
Optimum concept for each situation
Use right combination of technologies
Infrastructure products, software and middleware
Open for tools from leading ISV and OSS
Appliances for IM computing
End-to-end services
Assessment, consulting
Optimum solution design
Deployment, integration, support
Financial Services
Sourcing options
Self-managed (on-premise)
Managed services (on- / off-premise)
Fujitsu Cloud
One-stop shop for Big Data: Reduce complexity, time and risk.
…
Visit our Big Data Internet Site
For more information:
http://www.fujitsu.com/fts/solutions/high-tech/bigdata/
http://www.fujitsu.com/kz/solutions/high-tech/bigdata/
http://www.fujitsu.com/global/services/software/interstage/solutions/big-data
Summary
Big Data offers enormous potential for business opportunities and value
If you ignore it, your competitor won’t!
It changes the way companies make decisions, do business, succeed or fail
New infrastructure concepts
New types of analytics
New tools, SW and MW
New ideas and use cases
New skills
New value
Fujitsu – A one-stop shop for Big Data
Fujitsu turns Big Data into a Big Deal.