GigaByte TeraByte PetaByte ExaByte In Search of PetaByte Databases Jim Gray Tony Hey.
-
Upload
connor-hall -
Category
Documents
-
view
221 -
download
2
Transcript of GigaByte TeraByte PetaByte ExaByte In Search of PetaByte Databases Jim Gray Tony Hey.
GigaByte
TeraByte
PetaByte
ExaByte
The Cost of Storage(heading for 1K$/TB soon)
y = 6.7x
y = 17.9x
0100200300400500600700800900
1000
0 20 40 60GB
$ IDE
SCSI
Price vs disk capacity
6
0
5
10
15
20
25
30
35
40
0 10 20 30 40 50 60GB
$
IDE
SCSI
k$/TB
12/1/1999
y = 3.8x
y = 13x
0100200300400500600700800900
1000
0 20 40 60 80Raw Disk unit Size GB
$
SCSI
IDE
Price vs disk capacity
0
5
10
15
20
25
30
35
40
0 20 40 60 80Disk unit size GB
$
SCSI
IDE
raw k$/TB
9/1/2000
y = 2.0x
y = 7.2x
0
200
400
600
800
1000
1200
1400
0 50 100 150 200Raw Disk unit Size GB
$ SCSI
IDE
Price vs disk capacity
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
0 50 100 150 200Disk unit size GB
$ SCSI
IDE
raw k$/TB
9/1/2001
GigaByte
TeraByte
PetaByte
ExaByteSummary• DBs own the sweet-spot:
– 1GB to 100TB
• Big data is not in databasesHPTS does not own high performance storage (BIG DATA)
• We should
• Cost of storage is people:
–Performance goal:1 Admin per PB
GigaByte
TeraByte
PetaByte
ExaByte
State is Expensive• Stateless clones are easy to manage
– App servers are middle tier
• Cost goes to zero with Moore’s law.– One admin per 1,000 clones.– Good story about scaleout.
• Stateful servers are expensive to manage– 1TB to 100TB per admin – Storage cost is going to zero(2k$ to 200k$).
• Cost of storage is management cost
GigaByte
TeraByte
PetaByte
ExaBytePersonal 100 GB today
The Personal Petabyte (someday)
• It’s coming (2M$ today…2K$ in 10 years)
• Today the pack rats have ~ 10-100GB– 1-10 GB in text (eMail, PDF, PPT, OCR…)– 10GB – 50GB tiff, mpeg, jpeg,…– Some have 1TB (voice + video).
• Video can drive it to 1PB.
• Online PB affordable in 10 years.
• Get ready: tools to capture, manage, organize, search, display will be big app.
GigaByte
TeraByte
PetaByte
ExaByte
10 TBAn Image Database: TerraServer
• Snapshot of the USA (1 meter granularity)– 10,000,000,000,000 (=10^13) sq meters– == 15TB raw (some duplicates)– == 5 TB cooked
• 5x compression• + Image pyramid• + gazetteer
• Interesting things: – Its all in the Database– Clustered (allows flaky hardware, online upgrade)
– Triplexed – snapshot each night
GigaByte
TeraByte
PetaByte
ExaByteDatabases (== SQL)
• VLDB survey (Winter Corp).
• 10 TB to 100TB DBs.– Size doubling yearly– Riding disk Moore’s law– 10,000 disks at 18GB is 100TB cooked.
• Mostly DSS and data warehouses.
• Some media managers
GigaByte
TeraByte
PetaByte
ExaByteDB iFS
• DB2: leave the files where they live– Referential integrity between DBMS and FS.
• Oracle: put the files in the DBMS– One security model– One storage management model– One space manager– One recovery manger– One replication system– One thing to tune.– Features: transactions,….
GigaByte
TeraByte
PetaByte
ExaByteInteresting facts
• No DBMSs beyond 100TB.
• Most bytes are in files.
• The web is file centric
• eMail is file centric.
• Science (and batch) is file centric.
• But….
• SQL performance is better than CIFS/NFS..– CISC vs RISC
GigaByte
TeraByte
PetaByte
ExaByte BarBar: the biggest DB
• 350 TB
• Uses Objectivity™
• SLAC events
• Linux cluster scans DB looking for patterns
GigaByte
TeraByte
PetaByte
ExaByte
300 TB (cooked)Hotmail / Yahoo
• Clone front ends ~10,000@hotmail.
• Application servers– ~100 @ hotmail – Get mail box– Get/put mail– Disk bound
• ~30,000 disks
• ~ 20 admins
GigaByte
TeraByte
PetaByte
ExaByteAOL (msn)
(1PB?)
• 10 B transactions per day (10% of that)
• Huge storage
• Huge traffic
• Lots of eye candy
• DB used for security/accounting.
• GUESS AOL is a petabyte – (40M x 10MB = 400 x 1012)
GigaByte
TeraByte
PetaByte
ExaByteGoogle
1.5PB as of last spring• 8,000 no-name PCs
– Each 1/3U, 2 x 80 GB disk, 2 cpu 256MB ram
• 1.4 PB online.• 2 TB ram online• 8 TeraOps • Slice-price is 1K$ so 8M$.• 15 admins (!) (== 1/100TB).
GigaByte
TeraByte
PetaByte
ExaByte
ComputationalScience
• Traditional Empirical Science – Scientist gathers data by direct observation– Scientist analyzes data
• Computational Science– Data captured by instruments
Or data generated by simulator– Processed by software– Placed in a database– Scientist analyzes database– tcl scripts
• on C programs – on ASCII files
GigaByte
TeraByte
PetaByte
ExaByteAstronomy
• I’ve been trying to apply DB to astronomy
• Today they are at 10TB per data set
• Heading for Petabytes
• Using Objectivity
• Trying SQL (talk to me offline)
GigaByte
TeraByte
PetaByte
ExaByteFast Moving Objects
• Find near earth asteroids:SELECT r.objID as rId, g.objId as gId, r.run, r.camcol, r.field as field, g.field as gField,
r.ra as ra_r, r.dec as dec_r, g.ra as ra_g, g.dec as dec_g,sqrt( power(r.cx -g.cx,2)+ power(r.cy-g.cy,2)+power(r.cz-g.cz,2) )*(10800/PI()) as distance
FROM PhotoObj r, PhotoObj g WHERE
r.run = g.run and r.camcol=g.camcol and abs(g.field-r.field)<2 -- the match criteria-- the red selection criteriaand ((power(r.q_r,2) + power(r.u_r,2)) > 0.111111 )and r.fiberMag_r between 6 and 22 and r.fiberMag_r < r.fiberMag_g and r.fiberMag_r < r.fiberMag_iand r.parentID=0 and r.fiberMag_r < r.fiberMag_u and r.fiberMag_r < r.fiberMag_zand r.isoA_r/r.isoB_r > 1.5 and r.isoA_r>2.0-- the green selection criteriaand ((power(g.q_g,2) + power(g.u_g,2)) > 0.111111 )and g.fiberMag_g between 6 and 22 and g.fiberMag_g < g.fiberMag_r and g.fiberMag_g < g.fiberMag_iand g.fiberMag_g < g.fiberMag_u and g.fiberMag_g < g.fiberMag_zand g.parentID=0 and g.isoA_g/g.isoB_g > 1.5 and g.isoA_g > 2.0-- the matchup of the pairand sqrt(power(r.cx -g.cx,2)+ power(r.cy-g.cy,2)+power(r.cz-g.cz,2))*(10800/PI())< 4.0and abs(r.fiberMag_r-g.fiberMag_g)< 2.0
• Finds 3 objects in 11 minutes• Ugly,
but consider the alternatives (c programs an files and…)
–
GigaByte
TeraByte
PetaByte
ExaByte
Particle Physics – Hunting the Higgs and Dark Matter
• April 2006: First pp collisions at TeV energies at the Large Hadron Collider in Geneva
• ATLAS/CMS Experiments involve 2000 physicists from 200 organizations in US, EU, Asia
• Need to store,access, process, analyse 10 PB/yr with 200 TFlop/s distributed computation
• Building hierarchical Grid infrastructure to distribute data and computation
• Many 10’s of million $ funding – GryPhyN, PPDataGrid, iVDGL, DataGrid, DataTag, GridPP
ExaBytes and PetaFlop/s by 2015
GigaByte
TeraByte
PetaByte
ExaByteAstronomy: Past and
Future of the Universe• Virtual Observatories – NVO, AVO, AstroGrid
– Store all wavelengths, need distributed joins– NVO 500 TB/yr from 2004
• Laser Interferometer Gravitational Observatory– Search for direct evidence for gravitational waves– LIGO 250 TB/yr, random streaming from 2002
• VISTA Visible and IR Survey Telescope in 2004– 250 GB/night, 100 TB/yr, Petabytes in 10 yrs
New phase of astronomy, storing, searching and analysing Petabytes of data
GigaByte
TeraByte
PetaByte
ExaByte
Engineering, Environment and
Medical Applications• Real-Time Health Monitoring– UK DAME project for Rolls Royce Aero Engines– 1 GB sensor data/flight, 100,000 engine hours/day
• Earth Observation – ESA satellites generate 100 GB/day– NASA 15 PB by 2007
• Medical Images to Information– UK IRC Project on mammograms and MRIs– 100 MB/mammogram, UK 3M/yr, US 26M/yr– 200 MB/patient, Oxford 500 women/yr
Many Petabytes of data of real commercial interest
GigaByte
TeraByte
PetaByte
ExaByte
Grids, Databases and Cool Tools
• Scientists:– will build Grids based on Globus Open Source m/w– will have instruments generating Petabytes of data– will annotate their data with XML-based metadata
Realize a version of Licklider and Taylor’s original vision of resource sharing and the ARPANET
• TP and DB community:- Should assist in developing Grid Interfaces to DBMS- Should develop ‘Cool Tools’ for Grid Services
There will be commercial Grid applications and viable business opportunities