Hadoop-DS: A SQL over Hadoop Benchmark
© 2013 IBM Corporation
• Based on popular TPC-DS benchmark
• Mimics porting workload from RDBMS Data Warehouse to SQL over Hadoop solution
SQL Compatibility Matters:
• Big SQL is the only solution with a robust SQL engine able to execute all 99 queries, and with minimal porting effort
• Hive and Impala each took weeks to port the queries, and only a subset works because of SQL limitations, query timeouts, and runtime failures
Porting effort: <1 hour (Big SQL); ~4 weeks each for the other two engines
% working: 100% (Big SQL); 73% and 70% for the other two engines
Common set of 46 queries working
Independently audited
**See Speaker notes for disclaimer
Throughput Matters:
• Big SQL is 3.6x faster than Impala and 5.4x faster than Hive 0.13 on the 46 common queries at 10TB
• Big SQL is also able to execute all 99 queries with 6 concurrent streams at 10TB
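The two speedup factors above also imply how Impala and Hive 0.13 compare with each other. A minimal sketch, assuming "N x faster" means the other engine's elapsed time is N times Big SQL's on the same 46-query workload; the function and constant names are illustrative and not part of any benchmark kit:

```python
# Speedup factors reported on the slide for the 46 common queries at 10TB.
# Assumption: "N x faster" means the other engine's elapsed time is N times
# Big SQL's elapsed time for the same workload.
SPEEDUP_VS_BIG_SQL = {"Impala": 3.6, "Hive 0.13": 5.4}


def elapsed_relative_to(engine: str, baseline: str) -> float:
    """Return the engine's elapsed time divided by the baseline's elapsed time."""
    # Normalize Big SQL's elapsed time to 1.0 and express the others against it.
    times = {"Big SQL": 1.0, **SPEEDUP_VS_BIG_SQL}
    return times[engine] / times[baseline]


# How much longer Hive 0.13 takes than Impala under the assumption above.
print(round(elapsed_relative_to("Hive 0.13", "Impala"), 2))
```

Under that reading, Hive 0.13 takes about 1.5 times as long as Impala on the common query set.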
Scaling Matters:
• Big SQL completed 4 concurrent query streams at 30TB in 1.8x the time of a single query stream
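The scaling figure can be restated as a throughput ratio. A minimal sketch, assuming every stream runs the same query set as the single-stream baseline and that "1.8x" is a ratio of elapsed times; the function name is illustrative:

```python
def throughput_gain(streams: int, elapsed_ratio: float) -> float:
    """Work completed per unit time, relative to a single query stream.

    streams: number of concurrent query streams, each assumed to run the
        same query set as the single-stream baseline.
    elapsed_ratio: multi-stream elapsed time divided by single-stream time.
    """
    if streams < 1 or elapsed_ratio <= 0:
        raise ValueError("streams must be >= 1 and elapsed_ratio must be > 0")
    return streams / elapsed_ratio


# Figures from the slide: 4 streams at 30TB in 1.8x the single-stream time.
gain = throughput_gain(4, 1.8)
print(f"{gain:.2f}x the single-stream throughput")
```

Four times the work in 1.8 times the elapsed time comes to roughly 2.2 times the single-stream throughput, whereas running the four streams back to back would take 4x the time.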
Independently audited results.