Howard Fosdick (630)-279-4286
(C) 2004 FCI
World’s Largest Databases
Who Am I?
Hands-on DBA (and SA) for …
• Oracle, DB2, SQL Server
• Unix, Linux, Windows
• Founder IDUG, MWDUG, CAMP
• Author, Speaker
Independent Contractor (630)-279-4286
Outline
1. What’s a “Big Database”
2. DSS
3. OLTP
4. Observations
Statistics Sources
1. Winter Corp.
-- Database Top Ten
-- Yearly survey
-- Vendor neutral
-- Free at: www.wintercorp.com
2. Survey.com
-- High-End BI/DW Competitive Analysis
-- Survey of 150 companies w/ big warehouses
-- Free at: www.survey.com
“Thank You” to both sources
Classifying Large Databases
DSS
Decision Support Systems (DSS)
Online Analytical Processing (OLAP)
Data Warehouses (DW)
Multi-dimensional Databases (MDD)
+ Query oriented, mainly Read-only
OLTP
Online Transaction Processing (OLTP)
+ Update with short transactions (transaction = small CPU & data resources)
Commercial IT vs. Scientific/Research databases
What’s a Large Database ?
Database Size
- User data
- User data plus metadata & indexes
- DASD farm
Users
- Concurrent users
- Total user population
Load
- Concurrent queries
- Queries / day or hour (simple vs complex queries)
VLDB = Very Large Database
Good definitions and measurements are key to success
II. World’s Biggest DSS Systems
Data Warehouses VS. Data Marts
DW
• Application neutral
• Service multiple organizational needs
DM
• Application specific
• Organizationally focused
Largest systems are usually data warehouses
What’s Driving the Growth of Large Data Warehouses ?
Web Sites --
- Clickstream data
Retail --
- Transaction Level Detail (TLD)
!!!!! Super Big Groceries !!!!!
Preferred Customer Card #283736
Hello, I’m Scot94
03/04/04 02:38 3284 03 2918 33 Store 493 Loc 229
PRETTY-LADY HAIRCLR 1 5.99
AARP MAGAZINE 1 4.95
DIAPERS 2 10.00
BEER SIX-PACK 1 3.45
Tax 2.40
BAL 36.79
Cash 40.00
Change 3.21
Save this Receipt – Get $2.00 off on Prozac When You Buy Super-Baby Food !
Understanding customer behavior means $$$ !
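The receipt above is a small instance of transaction level detail: one row per item scanned, tied to a store and a customer card. A minimal Python sketch of how such rows might be modeled follows. The class and field names are hypothetical, and reading the diapers line as 2 units at 10.00 each is an assumption (it is the reading that reproduces the printed balance).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """One row of transaction level detail (hypothetical schema)."""
    card_no: str          # preferred customer card number
    store: int
    item: str
    qty: int
    unit_price: Decimal


# Values copied from the sample receipt; treating 10.00 as the
# per-unit price of DIAPERS is an assumption.
rows = [
    LineItem("283736", 493, "PRETTY-LADY HAIRCLR", 1, Decimal("5.99")),
    LineItem("283736", 493, "AARP MAGAZINE", 1, Decimal("4.95")),
    LineItem("283736", 493, "DIAPERS", 2, Decimal("10.00")),
    LineItem("283736", 493, "BEER SIX-PACK", 1, Decimal("3.45")),
]

subtotal = sum((r.qty * r.unit_price for r in rows), Decimal("0"))
tax = Decimal("2.40")
balance = subtotal + tax
change = Decimal("40.00") - balance

print(subtotal, balance, change)
```

Multiply four rows like these by every basket, every store, and every day, and the warehouse sizes discussed in this deck follow quickly.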
What’s Driving the Growth of Large Data Warehouses ?
Necessary Preconditions --
• Cheap Hardware
• Higher reliability / availability (based on dynamic hardware swapping)
• Better Software
• Lax privacy laws in USA
• EU curtails cross-usage of data
• EU has stronger privacy laws
World’s Largest DSS Systems
• Way bigger than just 3 years ago
• All Unix “mainframes”
• All use SANs (Storage Area Networks) (aka ESS)
• No IBM Mainframes
• No Windows or Wintel
• No SQL Server
• No Linux or Open Source databases
• NCR/Teradata niche market at 2.7% (Gartner 05/28/03)
• Goodbye Informix!
© 2003 Winter Corp.
Database Size = disk storage for user tables, indices, aggregates
Large DSS Systems
Query Users
Unix “mainframe”: Sun E12/15K, HP Superdome, IBM Regatta
Storage Area Network: EMC, Hitachi, HP, LSI
Unix “mainframes” –
+ Dynamically add/drop CPUs, RAM (Sun calls it partitioning)
+ High reliability (as good as clusters or Mainframes)
+ Capacity on Demand
SANs –
+ Flash (“snap”) backup (OS-level backup)
+ Large Cache
+ Intelligent data placement/movement
Example Evolution – Scaling a Unix “Mainframe”
8 CPUs @ 16 Gig RAM: 12 concurrent users
32 CPUs @ 64 Gig RAM: 25 concurrent users
64 CPUs @ 64 Gig RAM: 35 concurrent users
Other upgrades:
Oracle 8i -> 9i
Sun E10K -> E12K
World’s Largest DSS Systems -- Windows
• Way smaller than Unix systems
• Way bigger than just 3 years ago
• Oracle vs SQL Server (like market share battle for Windows DBMSs)
• Also use SANs (Storage Area Networks)
• No IBM DB2 UDB
• No Teradata
© 2003 Winter Corp.
World’s Largest DSS Systems -- By Peak Workload
© 2003 Winter Corp.
Where did IBM Mainframes Go ?
Big Silicon
Big Iron
+ Hello Linux !
+ Good for --
+ Consolidation platform
+ Legacy systems
+ Virtualization (multi-OS platform)
Poof!
-- Goodbye…
-- Largest databases
-- Smaller mainframes (VM, VSE)
-- Reliability advantage eroded
-- High cost per CPU
1994 2004
Oracle Rising
• Joined the Top Ten list 3 to 5 years ago
• 8i added essential DSS technologies ...
+ Partitions
+ New ROW ID (for bigger databases)
+ Thorough Parallelism (DML, DDL, utilities)
+ Index improvements (bit mapped IXs, function-based, desc, others)
+ Resource Manager (proactive)
+ Materialized Views
+ Large memory mgmt
+ Optimizer is Partition-aware
+ Online DDL operations and Utilities
Example Oracle Warehouses

| | Amazon | Best Buy | Colgate | Telecom Italia Mobile |
|---|---|---|---|---|
| System | HP Superdome | Sun 15K | IBM p690 Regatta | HP AlphaServer |
| Architecture | SMP | SMP | SMP | Cluster |
| Storage | EMC | EMC | IBM | EMC |
| Processors | 64 | 24 | 24 | 2 node cluster |
| Oracle Version | 9i | 8i | 9i | 8i |
| DB Size | 13 T | 6.3 T | 3.8 T | 16 T |
| Number of Tables | 600 | 4025 | 27,000 | 1,200 |
| Detail Data | Clickstream data | Sales Transaction data | Varied detail data | Call detail records |
| User Population | 800 | 16,000 | 6,200 | 400 |
| Concurrent Users | 55-60 | 600-700 | 600-700 | 55 |
| DBAs | 2 | 2 | n/a | 3 |
| Peak Workload | 4300 queries / day | 150,000 queries / 4 hour period | 14,200 steps / day | 700 M records loaded / day |
© 2003 Winter Corp.
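The peak workload figures above are quoted over different time windows, which makes them hard to compare at a glance. A rough Python sketch that puts the two query-based figures on a per-hour basis is shown below. Spreading Amazon's daily count evenly over 24 hours is an assumption made only for the arithmetic; the source does not say how the queries are distributed.

```python
# Peak workload as quoted on the slide: (query count, window in hours).
peak = {
    "Amazon": (4_300, 24),       # 4300 queries / day
    "Best Buy": (150_000, 4),    # 150,000 queries / 4 hour period
}

# Normalize every figure to queries per hour.
queries_per_hour = {site: count / hours for site, (count, hours) in peak.items()}

for site, rate in queries_per_hour.items():
    print(f"{site}: {rate:,.0f} queries per hour")
```

The other two sites report steps and records loaded rather than queries, so they are left out instead of being forced into the same unit.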
Why Not Oracle Clustering ?
+ Great for non-disruptive scaling of existing systems
. . . But the biggest systems tend not to use it
-- Unix “mainframe” no longer requires clustering for reliability, availability or easy scalability
-- Clustering means complexity in minimizing the…
-- Locking issues
9i improved this via Cache Fusion – but SMP Unix “mainframe” will still be favored
Where’s SQL Server 2000 ?
• Big in OLTP but lacks essential DSS technologies ...
-- Parallelism restricted to SELECTs
-- Needs it for other DML, DDL, utilities
-- Partitions
-- Wintel restriction
Yukon ?
-- Many new features. . . ready for “Top Ten” DSS ?
(Features = partitioning, database mirroring, mirrored backups, online Indexing & Restore, fast recovery, ANSI 1999 T-SQL, CLR support, native XML, XML Query, better .NET support, Reporting Services, Service Broker (async messaging), extensible data types…)
Where’s Open Source ?
Linux
+ 2.6 kernel now out
+ More CPUs (to 16)
+ More RAM (> 4+ Gig)
+ Better threading, file system support
MySQL and PostgreSQL
-- Top out at 500,000 page views per day (or 15 per second) (EWeek 2003)
+ Improving rapidly
Prediction – open source will support big databases but not “Top Ten” list sites
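A quick arithmetic check on the page-view figure, as a sketch in Python. Averaged over a full 24 hours, 500,000 page views per day comes to under 6 per second, so the quoted 15 per second presumably describes a busy period rather than a daily average. That reading is an assumption; the source does not explain the figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60

page_views_per_day = 500_000

# Average rate if the daily total were spread evenly over 24 hours.
avg_per_second = page_views_per_day / SECONDS_PER_DAY

# Hours needed to serve the daily total at a steady 15 views per second.
hours_at_15_per_second = page_views_per_day / 15 / 3600

print(round(avg_per_second, 2), round(hours_at_15_per_second, 1))
```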
Risks of Large DWs
• 40% of IT projects fail due to … Management (time & budget issues)
• “Large warehouses are unforgiving” -- Survey.com
• Design issues critical
• Database Design
• Query design (and EXPLAINs)
• ETL design and scheduling
• Pre-program wherever possible (control users and the resources they use)
• Monitoring and alerts
• Scale gradually (staggered loads on a schedule…)
• Benchmarks (after each Scaling Point)
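The point about query design and EXPLAINs can be sketched with a small example. SQLite (via Python's standard `sqlite3` module) is used here only as a convenient stand-in; the warehouses in this deck run Oracle, DB2 or Teradata, whose EXPLAIN syntax and output differ. The table, index and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store_id INTEGER, item TEXT, price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(493, "DIAPERS", 10.00), (493, "BEER SIX-PACK", 3.45), (12, "MAGAZINE", 4.95)],
)

QUERY = "SELECT item, price FROM sales WHERE store_id = ?"


def plan(sql: str) -> str:
    """Return the optimizer's plan text for a query (SQLite syntax)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (493,)).fetchall()
    # The last column of each row holds the human-readable plan step.
    return "; ".join(row[-1] for row in rows)


before = plan(QUERY)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_sales_store ON sales (store_id)")
after = plan(QUERY)    # the optimizer can now search the index

print(before)
print(after)
```

On a three-row table the difference is invisible at run time; on a multi-terabyte fact table, checking the plan before a query goes to production is what keeps one bad query from consuming the machine.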
Risks of Large DWs
• Partitioning data properly is critical
• For better physical management (utilities)
• Optimizers use this info
• Parallelism via multiple partitions
• How to partition
• Depends on data usage
• Examples: geographical, hash, unique id, ranges…
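Two of the partitioning schemes named above, range and hash, can be sketched in a few lines of Python. This only illustrates how a row is assigned to a partition; it is not how any particular DBMS implements it. The quarterly boundaries, the partition count of 8 and the function names are all hypothetical.

```python
import zlib
from bisect import bisect_right
from datetime import date

# Hypothetical range boundaries: each one starts a new quarter of 2004.
RANGE_BOUNDS = [date(2004, 4, 1), date(2004, 7, 1), date(2004, 10, 1)]


def range_partition(sale_date: date) -> int:
    """Partition number for a date: 0 is everything before the first bound."""
    return bisect_right(RANGE_BOUNDS, sale_date)


def hash_partition(customer_id: str, n_partitions: int = 8) -> int:
    """Spread keys across partitions with a stable hash (CRC32)."""
    return zlib.crc32(customer_id.encode("utf-8")) % n_partitions


print(range_partition(date(2004, 3, 4)))   # a date in the first quarter
print(hash_partition("283736"))            # a customer card number
```

Range partitioning suits data that is loaded and purged by date, since a whole partition can be managed as a unit. Hash partitioning trades that away for an even spread, which helps parallel queries.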
III. World’s Biggest OLTP Systems
World’s Largest OLTP Systems
© 2003 Winter Corp.
• Wintel “mainframes” arrive !
• SQL Server arrives
• Use SANs
• CA can do the job (but has tiny overall database market share)
• Oracle has big systems -- but not in the top ten
World’s Largest OLTP Systems -- Unix -- Windows
© 2003 Winter Corp.
World’s Largest OLTP Systems -- By Number of Rows
© 2003 Winter Corp.
OLTP Observations
• Wintel “mainframes” w/ SQL Server displace MVS/CICS
• SQL Server dominates Wintel OLTP
• Great for pre-programmed, resource-limited txns
• Oracle dominates Unix OLTP
IV. Observations
Architectures
Large SMP “mainframe”
Shared-disk Clusters
Shared-nothing (Massively Parallel Processing or MPP)
The “architectural debate” means far less than it used to !
Vendor Architectures
| Product | Architecture | Implementation |
|---|---|---|
| DB2 UDB for z/OS | Shared-disk clustering | DB2 Data Sharing on Sysplex |
| DB2 UDB for LUW | Shared nothing | DB2 UDB ESE partitioning feature |
| Oracle | Shared-disk clustering or SMP | Real Application Clusters (RAC) -- previously known as Oracle Parallel Server (OPS) |
| SQL Server 2000 | Shared nothing or SMP | Customer-developed partitioning based on SQL Server features |
| Teradata | Shared nothing | Teradata on NCR MPP |
DBMS Licensing Costs
Open Source (MySQL, PostgreSQL)
SQL Server 2000
DB2 UDB
Oracle
Teradata
Database pricing varies by the options selected and by the deal an IT organization cuts with the vendor. Your mileage may vary!
Biggest DSS Systems
$$$$$
Biggest OLTP Systems
TCO ?
+ Low-cost SQL Server supports the biggest OLTP systems
-- Pressure on Teradata to keep its niche
+ Open Source DBMSs have a role but it’s not “Top Ten” databases
$
DW Labor Costs
© 2002 Survey.com
Like TCO, Labor Costs may be an un-measurable …
• Figures applicable across sites ?
• Every vendor claims lowest labor costs
• “Terabytes per DBA” may be non-linear!
• 1 or 2 DBAs for a 24/7 site ?
• Development staff will be larger than Maintenance staff
• Your mileage will vary
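The “Terabytes per DBA” caution can be checked against the figures on the Example Oracle Warehouses slide earlier in this deck. The Python sketch below simply divides the reported DB size by the reported DBA count; the variable names are hypothetical, and the ratio says nothing about workload, schema complexity or development staff.

```python
# (DB size in terabytes, number of DBAs); None where the slide shows n/a.
warehouses = {
    "Amazon": (13.0, 2),
    "Best Buy": (6.3, 2),
    "Colgate": (3.8, None),
    "Telecom Italia Mobile": (16.0, 3),
}

tb_per_dba = {
    site: round(size_tb / dbas, 2)
    for site, (size_tb, dbas) in warehouses.items()
    if dbas  # skip sites with no DBA count reported
}

for site, ratio in tb_per_dba.items():
    print(f"{site}: {ratio} TB per DBA")
```

Even across three sites the ratio varies by roughly a factor of two, which is consistent with the warning that such figures may not transfer from one site to another.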
Multi-Machine Mixed Systems
45 Linux w/MySQL servers
(Transactional updates)
EWeek, 2/23/04
Sabre / Travelocity
17 Himalaya Non-stop w/Master database
(Fare look-up and routing)
Multi-Machine Mixed Systems
Omaha Steaks
17 Linux w/MySQL servers
(Shopping cart)
(Transactional updates)
* 50,000 to 68,000 daily sessions
* 1 year in Production / 8 Million sessions
iSeries DB2
EWeek 2003
Conclusions
• Databases are growing exponentially
• IT is closing in on Scientific/Research databases
• “Multiple machine” mixed systems are becoming popular
(Monolithic central databases are no longer the only game in town)
• “Mixed use” databases are becoming more common
• Multiple applications
• Read and update
• Open Source supports large systems -- but not “Top Ten”
• VLDBs are instructive – but unique in some ways
Questions?