
Optical and molecular technologies in modern computer systems

Rick Lytel

Distinguished Engineer & Director

RAS Computer Analysis Laboratory

Agenda

• Who uses computers, anyway?

• Lay of the land

• System scaling

• Role for new technologies

• Is this different from 10 years ago?

Datacenter Tiers

[Diagram: datacenter tiers. Users on SunRay, net appliances, thin clients, and PCs hit a web tier of many small web servers (horizontal scaling), an application tier of apps servers, and a database tier of a few big OLTP and DSS database servers with operational storage, data marts, OLAP, and warehouse storage (vertical scaling). System counts run from lots of small systems at the web tier to a few big systems at the database tier.]

Sun server platforms

F3800, F4800, F6800, and F15000 (2001 releases)

• Vertical scaling, 1 to 106 processors
• Web, app, and database servers
• SPARC, Solaris, and Java

Server revenue

Operating systems in place

Processor scaling

Performance scaling

Power scaling at Intel

Source: Fred Pollack, Intel Director of Microprocessor Research Labs, Micro32

Code scaling and complexity

[Chart: millions of lines of code (log scale, 0.01 to 100), 1975 to 2000. Milestones: UNIX, BSD, SunView, "the merge," the "UNIX wars," threads, Solaris 2.6, Java; *Windows 2000 shown for comparison.]

Power scaling (i.e., Moore's law) ubiquitous?

High-end system

[Diagram: high-end system cabinet, approximately 75" x 33" x 65", with two sets of 4 fan trays, 18 boardsets holding 18 CPU-memory boards and 18 I/O or dual-CPU boards, 2 system controllers, and six dual-input 4 kW AC to 48 V DC power supplies.]

System board

[Diagram: system board, 19.35" x 16.5", with four banks of 8 SDRAM DIMMs, 4 data switch ASICs, a data control ASIC, an address ASIC, boot bus ASICs, two sets of 8 CPU data switch ASICs, four CPUs with E$ DIMMs, and on-board power modules.]

F15K Components

CPU-Memory board (18)

I/O or MaxCPUboard (18)

System Controller board (2)

System Controller peripheral board (2)

Control expander frame (2)

System expander frame (18)

Expander board (18)

Fan trays (8)

Fan Center-planes (8)

Power centerplane

Logic centerplane

Control expander sockets (2)

System expander sockets (18)

Centerplane ASICs (20)

(One side shown)

Backplane connectivity

[Diagram: a passive centerplane connects the CPU/memory boards (processors P and memory M) and their PCI I/O assemblies on both sides through address buses and data crossbars.]

Interconnections in computing

Circuit            Distance         Speed     Width   Link BW   Carrier
Gate-gate          1-100 µm         1 GHz     100's   100 GHz   e⁻
Chip-chip          1 cm             500 MHz   100's   50 GHz    e⁻
Board-board        10-100 cm        500 MHz   100's   50 GHz    e⁻
Cabinet-cabinet    1-10 m           2.5 GHz   10's    25 GHz    hν
Floor-floor        10-100 m         100 MHz   1's     0.1 GHz   hν
Campus             100-1000 m       1 GHz     1's     1 GHz     hν
Intracity          1-10 km          2.5 GHz   1's     2.5 GHz   hν
Intercity          10-100 km        2.5 GHz   10's    25 GHz    hν, λ
Continental        100-1000 km      10 GHz    100's   1 THz     hν, λ
Intercontinental   1000-10000 km    10 GHz    100's   1 THz     hν, λ

The short-reach rows form "the computer"; the long-reach rows form "the network."
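The "Link BW" column is just the per-line signaling rate multiplied by the line count. A minimal sketch reproducing that arithmetic, using illustrative line counts (the table gives only orders of magnitude) and the slide's loose units, where "GHz" stands in for Gb/s of raw signaling:

```python
# Aggregate link bandwidth = per-line signaling rate x number of lines.
# Line counts are illustrative midpoints, not exact table values.

links = {
    # name: (per-line rate in GHz, number of lines)
    "gate-gate":       (1.0,  100),
    "chip-chip":       (0.5,  100),
    "cabinet-cabinet": (2.5,   10),
    "intercity":       (2.5,   10),
    "continental":     (10.0, 100),
}

for name, (rate_ghz, lines) in links.items():
    print(f"{name:16s} ~{rate_ghz * lines:7.1f} GHz aggregate")
```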

Scaling according to…

• EE: shrink the silicon process and lower the voltage
• ME: refrigerate the computer, then do what the EE does
• Optics: photons, not electrons, in interconnects
• Chemist: organic molecular wires & logic
• Biologist: nucleic acid logic & processors, PCR chips
• SS physicist: carbon nanotube gates, HT superconductors
• Theorist: quantum computing: it's been demo'd, QED
• Grad student: can I get a job?
• Marketing: "The network is the computer"
• Customer: more for less money
• Sys admin: more for less work
• Al Gore: "When I invented the internet…"

Expectation gap for optical solutions

What suppliers think we want

• Modest cost

• Reliability of best lasers

• Unique opportunity

• High density

• High speed

• WDM

• External modulation

What we actually want

• Copper cost

• Reliability of Si

• Standard product

• High density

• High speed

• What’s WDM?

• Huh? (we don’t care)

Real reason optics is interesting to us

• These two cables have the same bandwidth

• Big cable is 160 pair, 83 MHz LVDS, up to 10 m

• Small cable is 12 fiber, 1.25 Gbps per fiber multimode ribbon, up to 100 m

• Small cable scales to 2.5 Gbps per fiber, maybe to 10 Gbps
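A quick back-of-the-envelope check of the "same bandwidth" claim, assuming one bit per clock per LVDS pair (an assumption; the slide gives only the pair count and clock rate):

```python
# Rough comparison of the two cables' aggregate bandwidth.

copper_pairs, pair_rate_gbps = 160, 0.083   # 160 pair, 83 MHz LVDS
fibers, fiber_rate_gbps      = 12, 1.25     # 12-fiber multimode ribbon

print(f"copper: {copper_pairs * pair_rate_gbps:5.1f} Gb/s over <= 10 m")
print(f"fiber:  {fibers * fiber_rate_gbps:5.1f} Gb/s over <= 100 m")
print(f"fiber at 2.5 Gb/s/fiber: {fibers * 2.5:.0f} Gb/s")
```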

Modern systems have high RAS

Reliability
• component fail rates, in FITs (1 FIT = 1 failure per 10⁹ device-hours)
• subsystem FIT rates and failure modes

Availability
• system up-time as a fraction of total time (e.g., number of 9's)
• Markov models

Serviceability
• state capture and accessibility to service personnel
• ease of repair
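As a concrete illustration of how failure and repair rates turn into availability and "number of 9's," here is a minimal sketch; the FIT rate and repair time are illustrative assumptions, not figures from the slide:

```python
import math

# Availability from failure and repair rates.
fit = 5000                      # assumed failures per 1e9 device-hours
mtbf_h = 1e9 / fit              # mean time between failures, hours
mttr_h = 4.0                    # assumed mean time to repair, hours

availability = mtbf_h / (mtbf_h + mttr_h)
nines = -math.log10(1.0 - availability)

print(f"MTBF {mtbf_h:.0f} h -> availability {availability:.6f} ({nines:.1f} nines)")
```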

If you ignore RAS, it will make you pay

Nov 11, 2001

Nov 12, 2001

• Super-Kamiokande Observatory
  • electron neutrino detector
  • 50 million liters of water
• Fundamental discovery about 'missing' solar neutrinos
  • ~10 events per year
  • very high criticality
• One photomultiplier tube imploded
  • cascaded to ALL 11,200 tubes
  • "What happened?"

NOT DESIGNED WITH HIGH RAS

Graduate students in a boat, cleaning detectors

Glass shards at the bottom of the tank

Trends in microelectronics

• Processor logic density nearing air-cooling limits
  • low-power CMOS, selective clocking, Cu metal, SOI

• IBM’s already announced all four

• Low-voltage 70 nm, but soft error rates and noise margins are not yet measured

• ECC on the pipes, TLBs, ALUs, most registers

• s/w mitigation - checkpointing, CPU sparing

• Packaging moving back to multichip modules
  • IBM POWER4 chip multiprocessor (8 cores/module)

• diamond, Cu heat spreaders; novel air flow

• Moore’s (CMOS) law is bending but not yet broken

Trends in high availability systems

• S/w fault management stack

• system complexity prohibits 100% test coverage

• fault boundaries delineated and managed

• Lockstep cores, processor failover w/state capture

• Dynamic system monitoring and fail prediction

• Software rejuvenation to mitigate s/w aging

• 10,000s of embedded h/w sensors and registers

• End-to-end system checksum
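As an illustration of the last point, a toy end-to-end checksum in Python; this is a generic sketch of the idea, not Sun's actual fault-management implementation:

```python
import zlib

# The sender attaches a CRC to each payload; the receiver verifies it
# after the data has crossed every intermediate hop.

def send(payload: bytes):
    return payload, zlib.crc32(payload)

def receive(payload: bytes, crc: int) -> bytes:
    if zlib.crc32(payload) != crc:
        raise IOError("end-to-end checksum mismatch: data corrupted in flight")
    return payload

data, crc = send(b"cache line contents")
assert receive(data, crc) == b"cache line contents"
```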

InfiniBand as the system area network

[Diagram: an InfiniBand fabric connecting gateway servers, front-end servers, mail storage, and news/web storage, linked through a router to the Internet.]

• Fiber-optic Tx/Rx
• 2.5 Gbps/channel
• 1, 4, 12 channels
• Scaling path to 10 Gbps
• No WDM
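Aggregate link bandwidth follows directly from the channel count; the sketch below also shows usable bandwidth if 8b/10b encoding is assumed (the slide does not specify the encoding):

```python
# Link bandwidth for 1x, 4x, and 12x widths at 2.5 Gb/s per channel.

per_channel_gbps = 2.5
for width in (1, 4, 12):
    raw = width * per_channel_gbps
    print(f"{width:2d}x: {raw:5.1f} Gb/s raw, ~{raw * 0.8:4.1f} Gb/s if 8b/10b encoded")
```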

System board interconnection

[Diagram: a driver on one MCM feeds a VCSEL array; light travels through a lightguide or fiber to a photodiode array and receiver on a second MCM.]

• Free-space, compliant interconnect (patent pending)
• ~500 lines @ 1 Gbps each
• areal routing between system boards
• mechanical latch provides alignment

Challenges for chip-chip optics

• Optical elements are much larger than VLSI elements

• 2-10 µm VCSEL and 20-50 µm PD apertures

• 5-10 micron optical waveguides

• VLSI device elements are getting smaller

• 0.10 micron features, micron-sized gates

• 70 nm features are only a few years away

• Utilization of third dimension requires known good die

• Hybrids require new layout and validation tools

• Cost increases vs. performance gained?

The optical opportunities

Inside the box (packaging & process)

• Active cooling

• Low voltage CMOS

• Merged logic and memory

• Asynchronous circuits & systems

• SiGe HBT @ 100K gates, fT ~ 75 GHz

• Free-space or fiber board-board links

Outside the box (transport and switching)

• InfiniBand and 10 Gb Ethernet
• Router and OC-192 interfaces

A real system scaling limitation

• DRAM is cheap, but slow and far (100s of ns) from the processor

• Cache model uses SRAM (L1, L2, L3) with 2-10 clock tick latency

• SRAM limited to < 1 MB in 2000

• Frequent branches generate cache misses & add latency (see the sketch below)
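The effect of misses is captured by the standard average-memory-access-time (AMAT) formula. A minimal sketch with illustrative numbers, roughly matching the slide's 2-10 clock SRAM latency (at a few hundred MHz) and 100s-of-ns DRAM latency:

```python
# AMAT = hit time + miss rate x miss penalty.

clock_ns = 3.3                      # ~300 MHz processor clock
l2_hit_ns = 10 * clock_ns           # ~10 clock ticks to the SRAM cache
dram_penalty_ns = 200.0             # DRAM load-to-use latency

for miss_rate in (0.01, 0.05, 0.10):
    amat = l2_hit_ns + miss_rate * dram_penalty_ns
    print(f"miss rate {miss_rate:4.0%}: AMAT ~ {amat:5.1f} ns")
```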

[Chart: relative performance compared to 1980 (log scale, 1 to 1000), 1980 to 2000. CPU performance climbs far faster than DRAM, opening the CPU-memory performance gap (the "memory wall").]

UltraSPARC II cache memory

[Diagram: 300+ MHz UltraSPARC-II with on-chip I$ and D$, an external E-cache (tag RAM and data RAM), a data buffer, and a UPA "port" through the UPA switch and system controller to four banks of main memory. System address bus: 144 bits at 150 MHz; memory data path: 576/144 bits at 100 MHz; memory control via RAS/CAS. Labeled segment latencies: 6.6 ns, five hops of ~10 ns each, and 90 ns; load-to-use memory latency ~200 ns.]
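Summing the labeled segments accounts for most, though not all, of the ~200 ns load-to-use figure; treat this as a rough decomposition, since the diagram does not itemize the remainder:

```python
# Bus-segment latencies labeled in the diagram above.
segments_ns = [6.6, 10, 10, 10, 10, 10, 90]
print(f"labeled segments: {sum(segments_ns):.1f} ns of ~200 ns load-to-use")
```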

Closing the CPU-memory gap

• Merged logic and memory
  • fastest memory next to the CPU (practical)
• Hide latency
  • multithreading, prefetch, out-of-order execution
  • little help in SMP systems ("nearly uniform MA")
• Larger SRAM
  • 20 MB/die in 5-7 years
  • molecular electronics (discussed next)
• Faster DRAM
  • custom memory modules (lots of $$$)
• Optical interconnects? No...
  • latency is due to the cache model, not the wires
  • they increase, not decrease, power, latency, and noise
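For the "hide latency" option, a quick estimate of how much concurrency is needed: if each thread does a fixed amount of useful work between misses (an assumed figure below), the thread count must cover the full miss latency:

```python
import math

# Threads (or outstanding prefetches) needed to hide a DRAM miss.
miss_latency_ns = 200.0
work_per_miss_ns = 25.0             # assumed compute per thread per miss

threads = math.ceil(miss_latency_ns / work_per_miss_ns)
print(f"~{threads} threads to keep the processor busy during a miss")
```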

Potential SRAM cache densities

• Six transistors per SRAM cell

• Silicon transistors

• 10⁸ logic transistors/cm² in 2008 (SIA)

• 10⁹ SRAM transistors/cm² in 2008 (SIA)

• 20 MByte SRAM L2 (1 cm²) cache chip

• Nano-transistors (fast, < 1 nsec)

• 1 nm x 10 nm, so 10¹³ SRAM transistors/cm²

• ~0.2 TByte SRAM L2 (1 cm²) cache chip, but…

• power/bit must scale down accordingly
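The capacity figures follow from six transistors per cell and eight bits per byte; a minimal sketch reproducing them:

```python
# SRAM capacity from transistor density over 1 cm^2 of die area.

def sram_bytes_per_cm2(transistors_per_cm2: float) -> float:
    return transistors_per_cm2 / 6 / 8      # 6T cell, 8 bits per byte

print(f"Si   @ 1e9  T/cm2: {sram_bytes_per_cm2(1e9) / 1e6:5.0f} MB")   # ~20 MB
print(f"nano @ 1e13 T/cm2: {sram_bytes_per_cm2(1e13) / 1e12:5.2f} TB") # ~0.2 TB
```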

How much is a mole of memory?

• 1 mole = 6.022 × 10²³ 'things'

• A single processor generating addresses at 10 GHz will take 2 × 10⁶ years to touch every word

• Current large servers may have up to ~1 second of DRAM (memory measured as addressing time at 10 GHz)

• Would need ~10¹³ processors at 10 GHz to balance one mole of memory with present architectures
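Reproducing the addressing-time arithmetic (Avogadro's number of words divided by a 10 GHz address rate):

```python
# Time to touch every word in a mole of memory at 10 GHz.

AVOGADRO = 6.022e23                 # words in "a mole of memory"
rate_hz = 10e9                      # one address per 10 GHz cycle

seconds = AVOGADRO / rate_hz
print(f"one processor: {seconds / (3600 * 24 * 365):.1e} years")   # ~1.9e6 years
print(f"1e13 processors: {seconds / 1e13:.0f} s each")             # a few seconds each
```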

How large is a mole of memory?

• If we spread out a mole of memory on a 50 Å grid, it would cover 1.5 × 10⁷ m², a square 3.9 km on a side

• Assuming 10⁴ kT per fetch, 10¹³ processors fetching at 10 GHz dissipate 4 MW

• comparable to one of the ASCI machines

• at 10⁹ kT/op, this is 4 × 10⁵ MW

• If we pack it in a cube we get 0.075 m³, about 42 cm on a side

• ~5 MW/m³ at 10⁴ kT/op
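The area, volume, and power figures follow from the 50 Å pitch and kT at room temperature; a sketch reproducing them:

```python
# Area, volume, and power of a mole of memory on a 50 angstrom grid.

AVOGADRO = 6.022e23
pitch_m = 50e-10                        # 50 angstroms
kT_j = 1.38e-23 * 300                   # ~4.1e-21 J at 300 K

area_m2 = AVOGADRO * pitch_m**2         # ~1.5e7 m^2
volume_m3 = AVOGADRO * pitch_m**3       # ~0.075 m^3
power_w = 1e13 * 10e9 * 1e4 * kT_j      # 1e13 CPUs, 10 GHz, 1e4 kT/fetch

print(f"area:   {area_m2:.1e} m^2, {area_m2**0.5 / 1e3:.1f} km on a side")
print(f"volume: {volume_m3:.3f} m^3, {volume_m3**(1/3) * 100:.0f} cm on a side")
print(f"power:  {power_w / 1e6:.1f} MW at 1e4 kT per fetch")
```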

Applications to backing store

• Write once

• slow is acceptable

• behaving like tape is acceptable

• Archive the whole file state of the machine forever

• infinite undo

• should be non-volatile at zero power

• Trees more probable than meshes…

• Fat trees might be fault-tolerant

Logic gates

• Fan-in and fan-out are required for existing designs

• Level restoration required to assemble a functional system from billions of devices

• It is possible to obtain gain from a tunnel diode

• discrete-component tunnel-diode logic attempted in the 1960s

• it was intractable

• twitchy

• Three-terminal devices w/ power and ground planes

Power distribution

• All electrical logic families require a reference voltage or current to set threshold and a return path

• Immersion in a conductive liquid for global return...

• ...common ground return impedances are a noise source

• Assuming probabilistic assembly, it is very important that any power distribution follow the actual logic structure

• it might be easier to power the gates from an energetic compound dissolved in the ground return path

• pump liquid to clear decomposition products and move heat

Essential physics to solve

• Nano-transistor with gain

• Speed < 1 nsec desirable, 1 msec usable

• Low impedance power rails

• Long data buses for multiple SRAMs

• Energy per SRAM bit ~ 10²-10⁴ kT (see the conversion sketch below)

• Deterministic nets and devices
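Converting the kT energy target into joules and bus power, using an assumed 576-bit bus at 1 Gb/s per line (the bus width is borrowed from the UltraSPARC-II memory path earlier, purely for illustration):

```python
# Switching energy per bit, and the implied power for a wide on-chip bus.

kT_j = 1.38e-23 * 300                   # room temperature

for kt_per_bit in (1e2, 1e4):
    e_bit_j = kt_per_bit * kT_j
    bus_power_w = e_bit_j * 576 * 1e9   # 576 lines toggling at 1 Gb/s each
    print(f"{kt_per_bit:.0e} kT/bit = {e_bit_j:.1e} J/bit -> {bus_power_w:.1e} W bus power")
```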

Fundamental question for nano is...

• Start with molecules, molecular wires...

• Assemble mesoscopic wires, junctions...

• Assemble stable and predictable computing elements (e.g., gates) by forming structures using the wires, junctions...

• Assemble large-scale circuits for memory, logic, using the computing elements...

• Add ECC circuits...

• Add power and ground circuits, signal planes, I/Os…

How do the bit density, power density, and I/O bandwidth compare to silicon CMOS at a 50 nm feature size?

What’s changed since ten years ago?

Then

• “A wall” on packaging 50 MHz CMOS CPU

• 2 micron features

• 512 KB RAM chip

• MHz laser modulation

• ARPANET

• Mainframes

• Wires for transport

Now

• CMOS scaling to 5 GHz through 2010

• 0.1 micron features

• 64 MB RAM chip

• 10 GHz laser modulation

• Ubiquitous internet

• SMP servers

• Fiber, EDFA for transport

The more things change, the more they remain the same!