Galaxy Big Data with MariaDB
Transcript of Galaxy Big Data with MariaDB
Galaxy Semiconductor Intelligence Case Study: Big Data with MariaDB 10
Bernard Garros, Sandrine Chirokoff, Stéphane Varoqui
Galaxy Big Data Scalability: Menu
• About Galaxy Semiconductor (BG)
• The big data challenge (BG)
• Scalable, fail-safe architecture for big data (BG)
• MariaDB challenges: compression (SV)
• MariaDB challenges: sharding (SC)
• Results (BG)
• Next steps (BG)
• Q&A
About Galaxy Semiconductor
• A software company dedicated to semiconductors:
  ✓ Quality improvement
  ✓ Yield enhancement
  ✓ NPI acceleration
  ✓ Test cell OEE optimization
• Founded in 1988
• Track record of building products that offer the best user experience + premier customer support
• Products used by 3500+ users and all major ATE companies
Worldwide Presence
• Galaxy Teo, Ireland: HQ, G&A
• Galaxy East: Sales, Marketing, Apps
• Galaxy France: R&D, QA & Apps
• Galaxy West: Sales, Apps
• Partner Taiwan: Sales & Apps
• Partner Israel: Sales
• Partner Singapore: Sales & Apps
• Partner Japan: Sales & Apps
• Partner China: Sales & Apps
Test Data Production / Consumption
[Diagram: ATE test data files flow through ETL and data cleansing (Yield-Man) into the Galaxy TDR data cube(s); a second ETL stage feeds the consumers: Examinator-Pro, browser-based dashboards, custom agents for data mining, and automated agents for OEE, alarms, PAT, and SYA]
Growing Volumes
[Diagram: data volumes grow from MB (STDF files analyzed directly in GEX), to GB/TB (STDF files loaded by Yield-Man into a TDR and consumed by GEX, dashboards, and monitoring), to TB/PB (the same pipeline at far larger scale)]
Big Data, Big Problem
• More data can produce more knowledge and higher profits
• Modern systems make it easy to generate more data
• The problem is how to create a hardware and software platform that can make full and effective use of all this data as it continues to grow
• Galaxy has the expertise to guide you to a solution for this big data problem that includes:
  – Real-time data streams
  – High data insertion rates
  – A database that scales to extreme data volumes
  – Automatic compensation for server failures
  – Use of inexpensive, commodity servers
  – Load balancing
First-Level Solutions
• Partitioning
  – SUMMARY data
    • High-level reports
    • 10% of the volume
    • Must persist for a long period (years)
  – RAW data
    • Detailed data inspection
    • 90% of the volume
    • Must persist for a short period (months)
• PURGE
  – Partition RAW data tables by date (e.g. daily)
  – Instant purge by dropping partitions (sketched below)
• Parallel insertion
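A minimal sketch of the daily-partitioning-plus-instant-purge scheme described above; the table, columns, and partition names are illustrative, not Galaxy's actual schema:

```sql
-- Daily RANGE partitions on a RAW data table (illustrative schema).
CREATE TABLE raw_ptest_results (
  test_date DATE NOT NULL,
  lot_id    VARCHAR(32) NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (test_date, lot_id)
)
PARTITION BY RANGE COLUMNS (test_date) (
  PARTITION p20140601 VALUES LESS THAN ('2014-06-02'),
  PARTITION p20140602 VALUES LESS THAN ('2014-06-03'),
  PARTITION p20140603 VALUES LESS THAN ('2014-06-04')
);

-- Instant purge: dropping a partition is a fast metadata operation,
-- unlike a row-by-row DELETE over 90% of the volume.
ALTER TABLE raw_ptest_results DROP PARTITION p20140601;
```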
[Diagram: multiple Yield-Man instances inserting in parallel]
New customer use case
• The solution needs to be easy to set up
• The solution needs to handle large data volumes (~50 TB+)
• The solution needs to handle a high insertion rate of approximately 2 MB/sec

Solutions
• Solution 1: a single scale-up node (lots of RAM, lots of CPU, expensive high-speed SSD storage; a single point of failure, not scalable, heavy for replication)
• Solution 2: a cluster of commodity nodes (see the following slides)
Cluster of Nodes
[Diagram: test floor data sources (STDF data files, other test data files, and an event data stream carrying ATE config & maintenance events and real-time tester status) feed Yield-Man / PAT-Man instances and a real-time test data stream interface; RESTful APIs connect these, plus other customer applications and systems (Test Hardware Management System, MES), to the Galaxy cluster of commodity servers: a head node, a dashboard node, compute nodes, and DB nodes]
Easy Scalability
[Diagram: the same architecture scaled out simply by adding DB nodes and a compute node to the cluster]
MariaDB challenges
❏ From a single box to an elastic architecture
❏ Reducing the TCO
❏ OEM solution
❏ Minimizing the impact on existing code
❏ Reaching 200B records
A classic case
[Diagram: many sensors writing into one store that serves many concurrent queries]
❏ Millions of records/s arrive sorted by timeline
❏ Data is queried in a different order
❏ Indexes don't fit into main memory
❏ Disk IOPS become the bottleneck
B-tree gotcha
With 2 ms disk or network latency and 100 head seeks/s, there are two options:
❏ Increase concurrency
❏ Increase packet size
Both were increased long ago, using innodb_write_io_threads, innodb_io_capacity, and bulk loading (sketched below).
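A hedged sketch of those two knobs; the values are illustrative, not the presenters' actual settings:

```sql
-- innodb_write_io_threads is not dynamic: set it in my.cnf under [mysqld],
-- e.g. innodb_write_io_threads = 16, then restart the server.
-- innodb_io_capacity is dynamic and can be raised at runtime:
SET GLOBAL innodb_io_capacity = 2000;

-- Verify the current values:
SHOW GLOBAL VARIABLES LIKE 'innodb_%io%';
```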
B-tree gotcha
With a billion records, a single-partition B-tree no longer stays in main memory, so a single write produces read IOPS just to traverse the tree. Mitigations:
❏ Use partitioning
❏ Insert in primary-key order
❏ Use a big redo log and keep the number of dirty pages small
❏ Use covering indexes
The next step is to radically change the I/O pattern (a sketch of these mitigations follows).
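A minimal sketch of primary-key-order inserts plus a covering index, on an assumed illustrative schema:

```sql
-- An AUTO_INCREMENT primary key means rows arrive in key order, so new
-- rows always land in the rightmost leaf instead of random pages.
CREATE TABLE measurements (
  id        BIGINT NOT NULL AUTO_INCREMENT,
  sensor_id INT NOT NULL,
  ts        DATETIME NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (id),
  -- Covering index: a query filtering on (sensor_id, ts) that reads only
  -- `value` is answered from the index alone, with no row lookups.
  KEY ix_sensor_ts_value (sensor_id, ts, value)
) ENGINE=InnoDB
PARTITION BY HASH (id) PARTITIONS 8;
```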
Data Structuring / Modeling

             INDEX MAINTENANCE                                 NO INDEXES   COLUMN STORE
             T-TREE   B-TREE             FRACTAL TREE
Store        NDB      InnoDB, MyISAM,    TokuDB, LevelDB       Cassandra,   InfiniDB,
                      ZFS                                      HBase        Vertica
MEMORY
  WRITE      +++++    ++++               +++                   +++++        +++++
  READ 99%   ++       +                  ++++                  ++++++
  READ 1%    +++++    ++++               +++                   -------      ------
DISK
  WRITE               -                  +++                   ++++         +++++
  READ 99%            -                  +                     ++++         +++++
  READ 1%             +                  +++                   -----        -
Data Structure for Big Data

                          T-TREE   B-TREE                FRACTAL TREE           NO INDEXES   COLUMN STORE
Store                     NDB      InnoDB, MyISAM, ZFS   TokuDB, LevelDB        Cassandra,   InfiniDB
                                                                                HBase
Average compression rate  n/a      1/2                   1/6                    1/3          1/12
I/O size                  n/a      4K to 64K             variable (based on     64M          8M to 64M
                                                         compression & depth)
READ disk access model    n/a      O(log(N)/log(B))      ~O(log(N)/log(B))      O(N/B)       O(N/B, with block elimination)
WRITE disk access model   n/a      O(log(N)/log(B))      ~O(log(N)/B)           O(1/B)       O(1/B)
Top 10 Alexa’s PETA Bytes store is InnoDB
18
Top Alexa InnoDB
Galaxy TokuDB
❏ DBA to setup Insert buffer + Dirty pages ❏ Admins to monitor IO ❏ Admins to increase # nodes ❏ Use flash & hybride storage ❏ DBAs to parFFon and shard ❏ DBAs to organize maintenance ❏ DBAs to set covering and clustering
indexes ❏ Zipf read distribuFon ❏ Concurrent by design
❏ Remove fragmentaFon ❏ Constant insert rate regardless
memory/disk raFo ❏ High compression rate ❏ No control over client architecture ❏ All indexes can be clustered
Galaxy confidenFal 19
1/5 Compression on 6 Billion Rows
A key point for reaching 200 billion records (a hedged compression example follows).
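The deck doesn't show the exact settings; a minimal sketch of picking a TokuDB row format (compression codec) on an illustrative table:

```sql
-- TokuDB compresses per row format: TOKUDB_ZLIB is the default;
-- TOKUDB_LZMA trades more CPU for a higher compression rate.
CREATE TABLE s_ptest_results (
  id        BIGINT NOT NULL,
  test_date DATE NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (id, test_date)
) ENGINE=TokuDB
  ROW_FORMAT=TOKUDB_LZMA;
```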
2× slower insert time vs. uncompressed InnoDB, but 2.5× faster inserts than compressed InnoDB
A key point for reaching 200 billion records.
Galaxy Takeaway for 200 Billion Records
❏ Disk IOPS on InnoDB were the bottleneck, despite partitioning
❏ Moving to TokuDB moves the bottleneck to the CPU, for compression
❏ So how to increase performance further? Sharding!
Sharding to Fix the CPU Bottleneck

             T-TREE   B-TREE                    FRACTAL TREE              NO INDEXES   COLUMN STORE
Store        NDB      InnoDB, MyISAM, ZFS       TokuDB, LevelDB           Cassandra,   InfiniDB,
                                                                          HBase        Vertica
Clustering   Native   Manual: Spider, Vitess,   Manual: Spider, Vitess,   Native       Native
                      Fabric, Shardquery        Fabric, Shardquery
# of nodes   +++++    +++                       ++                        +++++        +
Spider: a MED (SQL/MED, Management of External Data) Storage Engine
No data is stored on Spider nodes.
Spider: a Sharding + HA Solution
• Preserves data consistency between shards (semi-transactions: SEMI TRX in ha_spider.cc)
• Allows shard replicas
• Enables joins between shards
A hedged setup sketch follows.
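The deck doesn't include Spider DDL; a minimal sketch of a two-node setup, with hypothetical host names, credentials, and table names:

```sql
-- On each data node: an ordinary TokuDB table that actually stores rows.
CREATE TABLE raw_results (
  id        BIGINT NOT NULL,
  test_date DATE NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (id)
) ENGINE=TokuDB;

-- On the head node: declare the remote data nodes...
CREATE SERVER data1 FOREIGN DATA WRAPPER mysql
  OPTIONS (HOST '10.0.0.1', DATABASE 'tdr', USER 'spider', PASSWORD 'secret', PORT 3306);
CREATE SERVER data2 FOREIGN DATA WRAPPER mysql
  OPTIONS (HOST '10.0.0.2', DATABASE 'tdr', USER 'spider', PASSWORD 'secret', PORT 3306);

-- ...and a Spider table that stores no data itself, only routes each row
-- to a data node by hash of the shard key.
CREATE TABLE raw_results (
  id        BIGINT NOT NULL,
  test_date DATE NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (id)
) ENGINE=Spider
  COMMENT 'wrapper "mysql", table "raw_results"'
  PARTITION BY HASH (id) (
    PARTITION pt1 COMMENT = 'srv "data1"',
    PARTITION pt2 COMMENT = 'srv "data2"'
  );
```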
Implemented Architecture
[Diagram: a head node (SPIDER, no data, monitoring) and compute nodes #1-#2 in front of data nodes #1-#4 (TokuDB, compressed data, partitions). SUMMARY data lives in universal tables; RAW data lives in sharded tables, each data node holding 1/4 or 1/2 of the rows. Re-sharding procedure: delay the current insertion, then replay the insertion with the new shard key]
Re-sharding Without Data Copy
[Diagram: before, a level-2 Spider table routes the CURRENT date range to Spider table L1.1, which shards by node modulo over nodes 01-02, on top of TokuDB tables partitioned by week (weeks 01-02). After, new date ranges route to Spider table L1.2, sharded over nodes 01-04 (weeks 03-04). Partition by date (e.g. daily), shard by node modulo at level 1, shard by date range at level 2: old data stays where it is. A hypothetical sketch follows]
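The exact DDL isn't shown in the deck; a hypothetical sketch of the level-2 table, which shards by date range so that a wider node set applies only to new data:

```sql
-- The level-2 Spider table routes each date range to a level-1 Spider
-- table ("l1_1" spans nodes 01-02; "l1_2" spans nodes 01-04); server and
-- table names are illustrative. Old ranges keep pointing at the old node
-- set, so re-sharding copies nothing.
CREATE TABLE raw_results_l2 (
  id        BIGINT NOT NULL,
  test_date DATE NOT NULL,
  value     DOUBLE,
  PRIMARY KEY (id, test_date)
) ENGINE=Spider
  COMMENT 'wrapper "mysql", table "raw_results_l1"'
  PARTITION BY RANGE COLUMNS (test_date) (
    PARTITION p_before VALUES LESS THAN ('2014-06-01') COMMENT = 'srv "l1_1"',
    PARTITION p_after  VALUES LESS THAN (MAXVALUE)     COMMENT = 'srv "l1_2"'
  );
```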
Proven Performance
Galaxy has deployed its big data solution at a major test subcontractor in Asia, with the following performance:
• Peak data insertion rate: 2 TB of STDF data per day
• Data compression of raw data: 60-80%
• DB retention of raw data: 3 months
• DB retention of summary data: 1 year
• Archiving of test data: automatic
• Target was 2 MB/sec; we get about 10 MB/sec
• Since June 17, steady production:
  – Constant insertion speed
  – 1,400 files/day, 120 GB/day
  – S_ptest_results: 92 billion rows / 1.5 TB across 4 nodes
  – S_mptest_results: 14 billion rows / 266 GB across 4 nodes
  – wt_ptest_results: 9 billion rows / 153 GB across 4 nodes
  – 50 TB available volume; total DB size is 8 TB across all 4 nodes
• 7 servers ($22k) + SAN ($$$) or DAS ($15k)
File count inserted per day
• Integration issues up to May 7
• Raw & summary-only data insertion up to May 18
• Raw & summary data insertion, problem solving, and fine tuning up to June 16
• Steady production insertion of raw & summary data since June 17
File count and data size per day
• Up to 2 TB inserted per day
• Up to 20k files per day
Raw data insertion duration over file size (each colored series is 1 day)
Consistent insertion performance.
What’s next?
• Make Yield-Man more Spider-aware:
  – Integrated scale-out (adding compute/data nodes)
  – Native database schema upgrades on compute/data nodes
• Add more monitoring capability for Spider events (node failure, table desynchronization across nodes…)
• Automate recovery after failures/issues; today:
  – A manual script detects desynchronization
  – pt-table-sync from Percona re-syncs manually
  – A manual script reintroduces table nodes into the cluster
(On the Spider 2014 roadmap)
Thank you!!
Q&A?