Challenges in Exploiting Exponential Storage Gains
Seagate Research Lab Grand Opening, Pittsburgh, PA, 21 August 2002
Gordon Bell, Microsoft Bay Area Research Center
http://www.research.microsoft.com
Date posted: 18-Dec-2015

Page 1:

Challenges in Exploiting Exponential Storage Gains

Seagate Research Lab Grand Opening

Pittsburgh, PA

21 August 2002

Gordon Bell

Microsoft Bay Area Research Center
http://www.research.microsoft.com/~gbell

Page 2:

Bottom Lines aka “Killer apps” for storage everywhere we look

1. “MyLifeBits” recording almost everything
2. The most cost-effective, highest volume stores: consumer & home PCs – for video.
3. Small form factor drives: pocket form factor cameras, phones, tablets, … e-books
4. Largest stores include Operating System, database, and interconnection via LANs/WANs and in the “cloud”

Page 4:

MyLifeBits, The Challenge of a One…1K Tbyte, lifetime PCs:

Cyberizing everything… I’ve written, said, presented (incl. video), photos of physical objects & a few things I’ve read, heard, seen and might “want to see” on TV

Page 5:

"The PC is going to be the place where you store the information … really the center of control" Billg 1/7/2001

MyLifeBits is an “on-going” project following CyberAll to “cyberize” all of personal bits!

► Memory recall of books, CDs, communication, papers, photos, video
► Photos of physical object collections
► Elimination of all physical stores & objects
► Content source for home media: ambiance, entertainment, communication, interaction
   Freestyle for CDs, photos, TV content, videos

Goal: to understand the 1 TByte PC: need, utility, cost, feasibility, challenge & tools.

Page 6:

MyLifeBits charter: Memex
As We May Think - Vannevar Bush

“A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility”

“Selection by association, rather than indexing, may yet be mechanized”

Page 7:

Storing all we’ve read, heard, & seen

Human data-types            /hr      /day (/4yr)   /lifetime
read text, few pictures     200 K    2-10 M/G      60-300 G
speech text @120wpm         43 K     0.5 M/G       15 G
speech @1KBps               3.6 M    40 M/G        1.2 T
stills w/voice @100KB       200 K    2 M/G         60 G
video-like 50Kb/s POTS      22 M     .25 G/T       25 T
video 200Kb/s VHS-lite      90 M     1 G/T         100 T
video 4.3Mb/s HDTV/DVD      1.8 G    20 G/T        1 P
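
The /hr column of the table above can be roughly reproduced from the stated rates. The following is a minimal sketch, not taken from the talk: it assumes "1KBps" means 1,000 bytes per second, that "Kb/s" and "Mb/s" are bits per second, and that typed text averages about 6 bytes per word. The function names are hypothetical. The 4.3 Mb/s row comes out near 1.9 GB/hr, close to the slide's 1.8 G.

```python
SECONDS_PER_HOUR = 3600


def bytes_per_hour(bits_per_second: int) -> int:
    """Bytes accumulated in one hour of continuous capture at a given bit rate."""
    return bits_per_second * SECONDS_PER_HOUR // 8


def text_bytes_per_hour(words_per_minute: int, bytes_per_word: int = 6) -> int:
    """Bytes of text per hour; bytes_per_word=6 is an assumption, not from the slide."""
    return words_per_minute * 60 * bytes_per_word


# Rates taken from the slide, expressed in bits per second.
RATES_BPS = {
    "speech @1KBps": 8_000,            # assumed: 1,000 bytes/s
    "video-like 50Kb/s POTS": 50_000,
    "video 200Kb/s VHS-lite": 200_000,
    "video 4.3Mb/s HDTV/DVD": 4_300_000,
}

print(f"{'speech text @120wpm':26s} {text_bytes_per_hour(120) / 1e3:10.1f} KB/hr")
for name, bps in RATES_BPS.items():
    print(f"{name:26s} {bytes_per_hour(bps) / 1e6:10.1f} MB/hr")
```

The /day and /lifetime columns depend on how many hours per day are captured, which the slide does not state, so they are left out of the sketch.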

Page 8:

Character of Cyber All Use

[Diagram labels: User; Context / Timelines; Personal (including financial); Professional (work related); Archival; Working]

Page 9:

MyLifeBits use scenarios

1. Acquire from every potentially useful source including the web, voice and instant messages
2. Personal use of MLB for work to recall everything
3. Provide ambiance & entertainment: Personal/home broadcast, CD, Internet radio, TV screen saving
4. Creation of photo and video albums
   Events, places, trips, people, time intervals
-------------- Database land --------------
5. Personal/web hosted collections & catalogs
6. A Person (auto- or -biography) web hosted time line
   Historical events by type; Personal time line
   Compile a life’s story about (event types, range, etc.)
7. Individual…How I spent my year. A personal diary.
8. ISBQ: Interactive Story By Query

Page 10:

ISBQ Editor Interface

[Screenshot callouts:]
Query for media
Query results can be dragged and dropped into timeline below
Video and images can be added to HTML page
Audio track for story

Page 11:

Why annotate…the future?

► Future cameras have:
   Creation time, content info e.g. people, scene type
   GPS: place
   Voice annotation about the shot and scene
   Speech recognition of voice
► Is annotation = meta-data about an object?

Page 12:

Imagine the “killer app” for: The One Tbyte, Lifetime, PC

► MyLifeBits demonstrates need for lifetime memory!
► MODI (Microsoft Office Document Imaging)!
   The most significant Office™ addition since HTML.
► Technology to support the vision:
   1. Guarantee that data will live forever!
   2. A single index that includes mail, conversations, web accesses, and books!
   3. E-book…e-magazines reach critical mass!
   4. Telephony and audio capture are needed
   5. Photo & video “index serving”
   6. More meta-information … Office, photos
   7. Lots of GUIs to improve ease-of-use

Page 13:

MyMainBrain storage

► Everything stored in a database to facilitate searching, backup, complex attributes e.g. photo characteristics
► Audio, video, images(?) may also be stored in file system (for access).
► Ability to easily “annotate” and form “collections” of all the globs
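
The three bullets above (items in a database, media optionally left in the file system, annotations and collections over every item) can be illustrated with a small schema. This is a minimal sketch using Python's built-in sqlite3 module; the table names, column names, helper functions, and the sample path are all hypothetical and are not the schema used by the project described in the talk.

```python
import sqlite3

# Hypothetical schema: one row per stored object, with the media bytes
# left in the file system and referenced by path.
SCHEMA = """
CREATE TABLE item (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,          -- e.g. 'photo', 'audio', 'mail'
    path    TEXT,                   -- optional file-system location
    created TEXT                    -- ISO date string
);
CREATE TABLE annotation (
    item_id INTEGER NOT NULL REFERENCES item(id),
    note    TEXT NOT NULL
);
CREATE TABLE collection (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE collection_item (
    collection_id INTEGER NOT NULL REFERENCES collection(id),
    item_id       INTEGER NOT NULL REFERENCES item(id),
    PRIMARY KEY (collection_id, item_id)
);
"""


def open_store(path=":memory:"):
    """Create a store with the sketch schema (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, kind, path=None, created=None):
    cur = conn.execute(
        "INSERT INTO item (kind, path, created) VALUES (?, ?, ?)",
        (kind, path, created),
    )
    return cur.lastrowid


def annotate(conn, item_id, note):
    conn.execute(
        "INSERT INTO annotation (item_id, note) VALUES (?, ?)", (item_id, note)
    )


def add_to_collection(conn, name, item_id):
    conn.execute("INSERT OR IGNORE INTO collection (name) VALUES (?)", (name,))
    (cid,) = conn.execute(
        "SELECT id FROM collection WHERE name = ?", (name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO collection_item VALUES (?, ?)", (cid, item_id)
    )


def find_by_note(conn, text):
    """Ids of items having an annotation that contains the given text."""
    rows = conn.execute(
        "SELECT DISTINCT item.id FROM item "
        "JOIN annotation ON annotation.item_id = item.id "
        "WHERE annotation.note LIKE ? ORDER BY item.id",
        (f"%{text}%",),
    )
    return [row[0] for row in rows]


def items_in_collection(conn, name):
    rows = conn.execute(
        "SELECT item_id FROM collection_item "
        "JOIN collection ON collection.id = collection_item.collection_id "
        "WHERE collection.name = ? ORDER BY item_id",
        (name,),
    )
    return [row[0] for row in rows]
```

Keeping only a path in the item row mirrors the second bullet: the database holds the searchable attributes while large media can stay in ordinary files.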

Page 14:

The Home Digital Multimedia Network

► Vision: All digital content. IP on everything.
► Content source for home media: ambiance, entertainment, communication, and interaction
► Freestyle for CDs, photos, TV content, videos
► All listening/viewing stations will be digital.
► In the 10+year, short-term, Digital Transformers convert IP to legacy analog devices.
► Today Digital Transformers = computers!

Page 15:

The Connected Home

[Diagram labels: Peripherals; Screen devices; Gaming; Stereo; TV; TV; Digital photos; DSL-TELCO; SATELLITE; TERRESTRIAL; DIGITAL CABLE]

Page 16:

Home Networks: PC-based service

Servers:
• Hold & deliver audio, photos, video
• Encode TV content
Computers:
• Control, get content from web, servers
Monitors: HDTV
TV-sets: receive encoded & CATV content
C* = computer. X = digital transformer.

[Diagram labels: Home IP network; CATV Dist; Rec/AMP; X*; C.srv; Monitor; TV set; HDTV; Tuner; CATV Network; DSL, etc. input; broadcast; Spkr]

Page 17:

Home media network with Digital Transformers…

[Diagram labels: media server; IP phone; network mic; router; PC; Internet; network camera; radio; CATV; home network; stereo; network monitor; network speaker; TV; HDTV]

dt = digital transformer

Page 18:

A Digital Transformer for Audio: Gateway’s Connected Home Audio Player

Page 19:

Existing Home Entertainment Centers

[Diagram labels: DVD; radio; CD; DVD; cassette; PVR; set top; amp; HDTV receiver; surround speakers; camera; ... remotes; TV]

Page 20:

The “Black PC” aka DHEC: Digital Home Entertainment Center

[Diagram labels: monitor; amp; Digital Home Entertainment Center; surround speakers; camera; tablet PC on radio LAN]

Page 21:

ACTIVY Media Center
One H/W for multiple functions

Reduces the number of devices, remotes and wires around the TV

Page 22:

Pioneer Plasma Panel with 1280 x 768 pixels
TV & Computer: Web Surfing at 12’

Page 23:

Art

Page 24:

Caneel Bay Vacation Jan. 1998

Page 25:

Disks are becoming computers

► Smart drives
► Camera with micro-drive
► Replay / Tivo / Ultimate TV
► Phone with micro-drive
► MP3 players
► Tablet
► Xbox
► Many more…

[Diagram layers: Disk Ctlr + 1 GHz cpu + 1 GB RAM; Comm: Infiniband, Ethernet, radio…; OS; Applications: Web, DBMS, Files]

Courtesy of Jim Gray, Microsoft Bay Area Research

Page 26:

Chameleon

Page 27:

Chameleon: an XP/CE/Cellphone
(800x300 pixels, 5 GB; 256 MB computer)

Page 28:

Disk As Tape: What format?

► Today we ship NTFS/SQL disks.
► But that is not a good format for Linux.
► Solution: Ship NFS/CIFS/ODBC servers (not disks)
► Plug “disk” into LAN.
   DHCP then file or DB server via standard interface.
   Web Service in long term

Courtesy of Jim Gray

Page 29:

Gray’s $2.4 K, 1 TByte Sneakernet aka Disk Brick

Courtesy of Jim Gray, Microsoft Bay Area Research

Cost to move a Terabyte
Cost, time, and speed to move a Terabyte
Cost of a “Sneaker-Net” Terabyte

Page 30:

Cost to move a Terabyte

Context      Speed (Mbps)   Rent ($/month)   Raw $/Mbps   Raw $/TB sent   Time/TB
home phone   0.04           40               1,000        3,086           6 years
home DSL     0.6            70               117          360             5 months
T1           1.5            1,200            800          2,469           2 months
T3           43             28,000           651          2,010           2 days
OC3          155            49,000           316          976             14 hours
100 Mbps     100                                                          1 day
Gbps         1000                                                         2.2 hours
OC192        9600           1,920,000        200          617             14 minutes
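
The derived columns of this table follow from the link speed and monthly rent. The sketch below is one way to reproduce them; it assumes 1 TB = 10^12 bytes and a 30-day billing month, neither of which is stated on the slide, and the function names are hypothetical. Under those assumptions the Raw $/TB sent figures match the slide to the nearest dollar.

```python
BITS_PER_TB = 8 * 10**12              # assumption: 1 TB = 10**12 bytes
SECONDS_PER_MONTH = 30 * 24 * 3600    # assumption: 30-day billing month


def seconds_per_tb(speed_mbps: float) -> float:
    """Time to push one terabyte through a link running flat out."""
    return BITS_PER_TB / (speed_mbps * 1e6)


def dollars_per_tb(rent_per_month: float, speed_mbps: float) -> float:
    """Rent paid over the time it takes to send one terabyte."""
    return rent_per_month * seconds_per_tb(speed_mbps) / SECONDS_PER_MONTH


# (speed in Mbps, rent in $/month) as listed on the slide.
LINKS = {
    "home phone": (0.04, 40),
    "home DSL": (0.6, 70),
    "T1": (1.5, 1_200),
    "T3": (43, 28_000),
    "OC3": (155, 49_000),
    "OC192": (9_600, 1_920_000),
}

for name, (mbps, rent) in LINKS.items():
    days = seconds_per_tb(mbps) / 86_400
    print(f"{name:10s} {days:10.3f} days/TB   ${dollars_per_tb(rent, mbps):,.0f}/TB")
```

For example, a T1 at 1.5 Mbps needs about 62 days per terabyte, which at $1,200 per month is roughly $2,469.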

Page 31:

Cost, time of Sneaker-net vs Alts

Media         Robot$   Media$   TB read+write time   Ship time   Total time/TB   Mbps   Cost (10 TB)   $/TB shipped
CD 1500       2x800    240      60 hrs               24 hrs      6 days          28     $2 K           $208
DVD 200       2x8K     400      60 hrs               24 hrs      6 days          28     $20 K          $2,000
Tape 25       2x15K    1000     92 hrs               24 hrs      5 days          18     $31 K          $3,100
DiskBrick 7   1K       1,400    19 hrs               24 hrs      2 days          52     $2.6 K         $260

Courtesy of Jim Gray, Microsoft Bay Area Research
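
The Mbps column can be read as the effective bandwidth of shipping media: one terabyte divided by the read+write time plus the ship time. The sketch below is an interpretation, not a formula given on the slide, and it assumes 1 TB = 10^12 bytes; the function name is hypothetical. With that reading the DiskBrick row comes out at about 52 Mbps, matching the slide, while the other rows land close to but not exactly on the quoted figures.

```python
BITS_PER_TB = 8 * 10**12  # assumption: 1 TB = 10**12 bytes


def effective_mbps(read_write_hours: float, ship_hours: float,
                   terabytes: float = 1.0) -> float:
    """Effective bandwidth of shipping media, in megabits per second."""
    total_seconds = (read_write_hours + ship_hours) * 3600
    return terabytes * BITS_PER_TB / total_seconds / 1e6


# (read+write hours, ship hours) per terabyte, from the slide.
MEDIA = {"CD": (60, 24), "DVD": (60, 24), "Tape": (92, 24), "DiskBrick": (19, 24)}

for name, (rw_hours, ship_hours) in MEDIA.items():
    print(f"{name:10s} {effective_mbps(rw_hours, ship_hours):6.1f} Mbps")
```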

Page 32:

Gray’s $2,400 1 TByte Sneaker-net

Item                                                     Price
Cabinet (Lian Li PC-68 USG 12 bay case)                    138
Power Supply (Enermax EG465AX-VD 431W)                     117
Motherboard (Abit KX7A-RAID KT266A)                        108
Cpu (AMD 2GHz Athlon XP 1800+)                             110
1 GB Memory (2x512MB PC2100 266MHz DDR)                    120
1 TB Disks (7x Maxtor EIDE 153GB ATA/133 5400RPM)        1,281
Gbps Ethernet (SysKonnect SK-9D21 Gig copper)              219
DVD (Sony DDY1621 16x DVD)                                  45
Floppy & 3xIDE cables, Video Card                           57
OS (WindowsXP Pro OEM)                                      95
Database (SQL Server 2000 MSDE)                              0
Shipping                                                    50
Labor                                                      100
Total                                                   $2,440

Page 33:

Google

1.5PB as of last spring
8,000 no-name PCs
   Each 1/3U, 2 x 80 GB disk, 2 cpu 256MB ram
1.4 PB online.
2 TB ram online
8 TeraOps
Slice-price is 1K$ so 8M$.
15 admins (!) (== 1/100TB).
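
The cluster totals on this slide are products of the per-node figures. The sketch below simply multiplies them out; the names are hypothetical and decimal units are assumed. The RAM, cost, and admin ratios match the slide. The disk total comes to 1.28 PB, somewhat below the quoted 1.4 PB online, so the per-node numbers on the slide are presumably rounded.

```python
NODES = 8_000
DISKS_PER_NODE = 2
DISK_GB = 80
RAM_MB_PER_NODE = 256
PRICE_PER_NODE_USD = 1_000   # "slice-price is 1K$"
ADMINS = 15
QUOTED_STORE_TB = 1_500      # "1.5PB as of last spring"


def cluster_totals(nodes: int = NODES) -> dict:
    """Aggregate figures implied by the per-node numbers on the slide."""
    return {
        "disk_gb": nodes * DISKS_PER_NODE * DISK_GB,
        "ram_mb": nodes * RAM_MB_PER_NODE,
        "cost_usd": nodes * PRICE_PER_NODE_USD,
        "tb_per_admin": QUOTED_STORE_TB // ADMINS,
    }


totals = cluster_totals()
print(f"disk: {totals['disk_gb'] / 1e6:.2f} PB")
print(f"ram:  {totals['ram_mb'] / 1e6:.2f} TB")
print(f"cost: ${totals['cost_usd'] / 1e6:.0f}M")
print(f"one admin per {totals['tb_per_admin']} TB")
```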

Page 34:

Bottom Line

► The focus of computation has shifted from processing to storage.
► Every app and price level is storage oriented from in/on body, personal, home servers, to large scale commercial and scientific apps
► With databases, pre-computed indices beat exhaustive searches every time.
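
The last bullet can be illustrated with a toy inverted index. This is a minimal sketch with hypothetical function names and sample documents, not code from the talk: the index is built once, after which a lookup touches only the entry for the query word, whereas the exhaustive version rereads every document on every query. Both return the same result.

```python
from collections import defaultdict


def build_index(docs):
    """Pre-compute word -> set of document ids (one pass over all documents)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_index(index, word):
    """Answer a query from the pre-computed index without reading any document."""
    return sorted(index.get(word.lower(), ()))


def search_scan(docs, word):
    """Exhaustive search: reread every document for every query."""
    word = word.lower()
    return sorted(d for d, text in docs.items() if word in text.lower().split())


# Hypothetical sample documents.
docs = {
    1: "photos of Caneel Bay",
    2: "video of the bay",
    3: "mail about storage",
}
index = build_index(docs)
print(search_index(index, "bay"), search_scan(docs, "bay"))
```

The trade-off is that the index costs extra storage and must be kept up to date, which fits the talk's point that storage is now the cheaper resource.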

Page 35:

The End