100Gb/s Electrical Links System View - ieee802.org Electrical Links System View ... technical...

100Gb/s Electrical Links System View

Mark Gustlin (Xilinx), Beth Kochuparambil (Cisco), Kent Lusted (Intel), Gary Nicholl (Cisco), David Ofelt (Juniper), Rob Stone (Broadcom)


Ground Rules

This presentation is about motivating use cases and the issues around each
◦ Expect other presentations in response, providing:
  ◦ objectives
  ◦ technical feasibility (TF)
  ◦ economic feasibility (EF)
  ◦ broad market potential (BMP)

Some traditional objectives (e.g., chip to module) may have different issues at each end of the link

System Overview

System Overview: Pizza Box

[Figure: block diagram of a pizza box system. Labels: ASIC, ASIC Escape, Retimer (optional, shown as "or" branches), Module Connector, Module Connection, Module* (existing PMDs), New Copper Cable PMDs.]

*: Module can be a traditional front-panel module or OBO

System Overview: Line Card

[Figure: block diagram of a line card. Labels: ASIC, ASIC Escape, Retimer (optional, shown as "or" branches), Backplane Connector, Backplane Link, Module Connector, Module Connection, Module* (existing PMDs), New Copper Cable PMDs.]

*: Module can be a traditional front-panel module or OBO

General System Observations

For line-card based systems, a faster line card is less useful:
◦ If there isn't backplane bandwidth to support it
◦ If there isn't enough faceplate density (or board surface area) to get the bandwidth out of the box

For standalone systems, a faster ASIC can still be useful
◦ No increase in faceplate density, but it will increase total bandwidth and crossbar radix

OIF 100Gb/s SERDES work:
◦ Are the current channels and objectives appropriate for the work here?

Boards are big; getting from the center to modules in the front corners can be tricky
◦ OBO don't necessarily help that much, since they can't all be packed right next to the ASIC

Copper Cable

Copper Cable

What are achievable reaches for:
◦ Passive copper cable?
◦ Active copper cable?

What are the system implications to achieve these reaches?

What are the end-user implications of these reaches?
◦ What common data center architectures work/don't work?

What is the cost delta for active vs. passive cable?
◦ Impacts broad market potential for Cu cable if the cost approaches optics

Data Center Architecture Influence

Source: http://www.ieee802.org/3/cfi/1117_3/CFI_03_1117.pdf

Copper Cables in Data Center Racks

[Figure: four rack topologies showing switches, servers, Cu cables, and fiber cables]
◦ "Inter-rack Switch": 5 m DAC reach, 802.3by (25GBASE-CR)
◦ "Intra-rack Switch": 3 m DAC reach, 802.3cd; 3 m DAC reach, 802.3by (25GBASE-CR-S)
◦ "End of Row Switch": no DAC reach
◦ "Middle of Rack Switch": 1-2 m? DAC reach

What length is too short?

Are FEC or host loss changes needed?

What is the impact to industry?

Module Connection

Module Connection (AUI)

Backwards compatibility issues (see following slides)

What are the system implications to make 100Gb/s AUIs work?
◦ Retimer proximity/universality, PCB material, fly-over cables required, etc.?
◦ Do these requirements affect EF or BMP?

What are the economic tradeoffs between redoing PMD budgets and spending significantly more on the system?

How different is the module TX/RX SERDES feature set from the ASIC TX/RX SERDES feature set?

Systems will need to support modules that run many rates of Ethernet
◦ Hard to have loss budgets that are different per rate when the same module can operate at different rates using breakout.

Backwards Compatibility Issues

Will we be able to or want to support existing (or soon to be existing) PMDs with a new AUI?
◦ Assumption is we will want to support 802.3bs/cd PMDs, especially the DR versions since those are 100G per lane

Issues include:
◦ FEC partitioning budgets
◦ FEC choice
◦ Latency
◦ Backwards compatibility

Solution space to investigate:
1. Make 100Gb/s chip-to-module links work with 802.3bs and 802.3cd electrical FEC budgets
2. Change end-to-end link budget of 802.3bs and 802.3cd PHYs to allocate more errors to the electrical links
3. Terminate FEC in module and regenerate FEC for the wire (segment-by-segment FEC)
4. Add a (hopefully lightweight) wrapper FEC to protect 100Gb/s electrical links

200/400GbE "Legacy" Module Based PHYs

200/400GE PHY    Technology/Reach              FEC           FEC Coverage
200GBASE-DR4     4 lanes SMF, 500 m reach      RS(544,514)   End to End
200GBASE-FR4     4 WDM SMF, 2 km reach         RS(544,514)   End to End
200GBASE-LR4     4 WDM SMF, 10 km reach        RS(544,514)   End to End
400GBASE-SR16    16 lanes MMF, 100 m reach     RS(544,514)   End to End
400GBASE-DR4     4 lanes SMF, 500 m reach      RS(544,514)   End to End
400GBASE-FR8     8 WDM SMF, 2 km reach         RS(544,514)   End to End
400GBASE-LR8     8 WDM SMF, 10 km reach        RS(544,514)   End to End

The majority of these PHYs are based on 50Gb/s per lane technology
◦ Supporting them with 100G electrical interfaces requires a reverse mux in the module
◦ Bit muxing is supported for changing lane widths

Only 400GBASE-DR4 uses 100Gb/s lane technology

FEC is end-to-end, RS(544,514)

PMD portion of the BER end-to-end budget is always 2.4 x 10^-4
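The reverse-mux requirement follows from simple lane arithmetic. The sketch below is illustrative only: it uses nominal rates (FEC and encoding overhead ignored), takes the lane counts from the 200/400GbE table, and the function names are hypothetical, not from any standard.

```python
# (total nominal rate in Gb/s, PMD lane count) per PHY, read off the 200/400GbE table
LEGACY_PHYS = {
    "200GBASE-DR4": (200, 4),
    "200GBASE-FR4": (200, 4),
    "200GBASE-LR4": (200, 4),
    "400GBASE-SR16": (400, 16),
    "400GBASE-DR4": (400, 4),
    "400GBASE-FR8": (400, 8),
    "400GBASE-LR8": (400, 8),
}


def pmd_lane_rate_gbps(phy):
    """Nominal rate carried by one PMD lane."""
    total, lanes = LEGACY_PHYS[phy]
    return total // lanes


def pmd_lanes_per_electrical_lane(phy, electrical_lane_gbps=100):
    """How many PMD lanes one electrical lane must be split across."""
    return electrical_lane_gbps // pmd_lane_rate_gbps(phy)


def needs_reverse_mux(phy, electrical_lane_gbps=100):
    """True when the module must fan one electrical lane out to several PMD lanes."""
    return pmd_lanes_per_electrical_lane(phy, electrical_lane_gbps) > 1


for name in LEGACY_PHYS:
    print(name, pmd_lane_rate_gbps(name), needs_reverse_mux(name))
```

Under these assumptions only 400GBASE-DR4 maps one electrical lane to one PMD lane; every other entry needs a 1:2 or 1:4 fan-out in the module.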

100GbE "Legacy" Module Based PHYs

100GE PHY        Technology/Reach              FEC           FEC Coverage
100GBASE-SR10    10 lanes MMF, 100 m reach     No FEC        N/A
100GBASE-SR4     4 lanes MMF, 100 m reach      RS(528,514)   PMD only
100GBASE-LR4     4 WDM SMF, 10 km reach        No FEC        N/A
100GBASE-ER4     4 WDM SMF, 40 km reach        No FEC        N/A
100GBASE-SR2     2 lanes MMF, 100 m reach      RS(544,514)   End to End
100GBASE-DR      1 lane SMF, 500 m reach       RS(544,514)   End to End

These PHYs are based on 10, 25, 50 and 100Gb/s per lane technology
◦ Supporting them (with the exception of DR) with 100G electrical interfaces requires a reverse mux in the module

Only 100GBASE-DR uses 100Gb/s lane technology

FEC is end-to-end for some (SR10, LR4 and ER4 don't require FEC at all)

There are many derivative PMDs in the industry specified by MSAs etc.
◦ They reuse the IEEE PCS/FEC

PMD portion of the BER end-to-end budget is 2.4 x 10^-4 for the DR PHY

Opt 1: Legacy BER Partitioning for 802.3cd/bs PHYs

For a complete PHY, the error partitioning below gives us a BER of 1 x 10^-13 (or equivalent frame loss ratio)

If we keep the C2M segment at a BER of 1 x 10^-5 or better for 100G per lane interfaces, then we can reuse the cd/bs PMDs as is
◦ Assuming we were to keep using the RS(544,514) FEC
◦ Allows for both module and host backwards compatibility (see figure)

Otherwise we can't directly support these existing PMDs

[Figure: a new host device (C2C or retimer, then C2M, 100G/lane AUI) and new module linked to a legacy module and legacy host device (C2M, then C2C or retimer, 50G/lane AUI). Segment labels: BER < 1 x 10^-5 for each electrical side, BER < 2.4 x 10^-4 for the PMD link. "544 FEC" is marked at each end.]
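To get a feel for how the segment budgets interact with RS(544,514), the following is a minimal sketch and not the analysis behind the slide. It assumes independent random bit errors (no bursts or error propagation), approximates the end-to-end pre-FEC BER as the sum of the segment BERs, and counts a codeword as lost when more than t = 15 of its 10-bit symbols are in error. The function names are hypothetical.

```python
import math

N, K = 544, 514          # RS(544,514): codeword and message length in symbols
T = (N - K) // 2         # correctable symbol errors per codeword
BITS_PER_SYMBOL = 10


def link_ber(segment_bers):
    """Approximate end-to-end pre-FEC BER as the sum of per-segment BERs."""
    return sum(segment_bers)


def uncorrectable_codeword_prob(ber):
    """P(more than T symbol errors in one codeword), independent bit errors assumed."""
    # Probability that a 10-bit symbol contains at least one bit error
    p = -math.expm1(BITS_PER_SYMBOL * math.log1p(-ber)) if ber < 1.0 else 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(T + 1, N + 1):
        # Binomial term computed in the log domain to avoid overflow/underflow
        log_term = (math.lgamma(N + 1) - math.lgamma(i + 1) - math.lgamma(N - i + 1)
                    + i * math.log(p) + (N - i) * math.log1p(-p))
        total += math.exp(log_term)
    return total


# Electrical segment, PMD segment, electrical segment, as labeled on the slide
pre_fec = link_ber([1e-5, 2.4e-4, 1e-5])
print(pre_fec, uncorrectable_codeword_prob(pre_fec))
```

The same function can be used to see how quickly the codeword loss probability grows if the electrical segments are allowed a looser BER, which is the tradeoff the later options explore.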

Opt 2: New BER Partitioning for 802.3cd/bs PMDs

Would need to redefine the PMD BER requirements (for example for 100GBASE-DR)

This would create new PMD definitions, not compatible with the 802.3cd/bs PMDs
◦ Might be possible to allow interop at reduced loss for legacy modules (might add confusion to the market?)
  ◦ But might not be able to, depending on the error floor?
◦ Allows for host backwards compatibility (see figure)

[Figure: a new host device (C2C or retimer, then C2M, 100G/lane AUI) and new module linked to a legacy module (marked "??") and legacy host device (C2M, then C2C or retimer, 50G/lane AUI), with a "Reduced Distance/Loss" annotation. Segment labels: new-side electrical link at a higher error ratio than a BER of 1 x 10^-5; PMD link at a lower error ratio than a BER of 2.4 x 10^-4; legacy-side electrical link at BER < 1 x 10^-5. "KP4 FEC" is marked at each end.]

Opt 3: Segment by Segment FEC

The 100G per electrical lane FEC will have to be terminated in the module, so no dependencies exist
◦ Same (RS(544,514)) or different FEC could be used for the electrical interface
◦ Adds latency to the span and complexity/power to the modules
◦ Allows for both module and host backwards compatibility (see figure)

[Figure: a host device (C2C or retimer, then C2M, 100G/lane AUI) and new module linked to a legacy module and legacy host device (C2M, then C2C or retimer, 50G/lane AUI). Segment labels: 100G/lane electrical link at BER < open, protected by "? FEC"; PMD link at BER < 2.4 x 10^-4 and legacy electrical link at BER < 1 x 10^-5, covered by KP4 FEC.]

Opt 3: BER Partitioning for 802.3ba/bm PHYs

If there is interest in supporting these older PMDs, the FEC must be segment by segment

The 100G per lane FEC will have to be terminated in the module, so no dependencies exist
◦ Allows for both module and host backwards compatibility (see figure)

[Figure: a host device (C2C or retimer, then C2M, 100G/lane AUI) and new module linked to a legacy module and legacy host device (C2M, then C2C or retimer, 25G/lane AUI). Segment labels: 100G/lane electrical link at BER < open, protected by "? FEC"; PMD link with no FEC or post-FEC BER < 1 x 10^-12; legacy electrical link at BER < 1 x 10^-12. The legacy path uses KR4 FEC or no FEC.]

Opt 4: New FEC Wrapper

This option adds some additional FEC (wrapper) to the data
◦ Allows for both module and host backwards compatibility (see figure)
◦ This additional FEC is point to point across the AUI interface and terminated in the module
◦ Will add additional latency
◦ Will increase the data rate
◦ Adds complexity/power to the module (and host device)

[Figure: a host device (C2C or retimer, then C2M, 100G/lane AUI) and new module linked to a legacy module and legacy host device (C2M, then C2C or retimer, 50G/lane AUI). Segment labels: 100G/lane electrical link at BER < open, protected by "Wrap FEC"; PMD link at BER < 2.4 x 10^-4; legacy electrical link at BER < 1 x 10^-5. "KP4 FEC" is marked at each end.]
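The "will increase the data rate" point can be made concrete with a short calculation. This is a sketch under stated assumptions: the 106.25 Gb/s base rate is the per-lane rate of a 100G lane already carrying RS(544,514)-encoded data (brought in from 802.3cd, not stated on the slide), and the (128,120) wrapper code is purely hypothetical, chosen only to show the arithmetic.

```python
def wrapped_lane_rate_gbps(base_rate_gbps, n, k):
    """Line rate after wrapping every k payload bits into an n-bit block."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return base_rate_gbps * n / k


def overhead_percent(n, k):
    """Extra bits on the wire, as a percentage of the payload."""
    return (n / k - 1.0) * 100.0


BASE_RATE_GBPS = 106.25      # assumed RS(544,514)-encoded 100G lane rate
WRAPPER_N, WRAPPER_K = 128, 120   # hypothetical wrapper code, for illustration only

print(wrapped_lane_rate_gbps(BASE_RATE_GBPS, WRAPPER_N, WRAPPER_K))
print(overhead_percent(WRAPPER_N, WRAPPER_K))
```

Any wrapper overhead raises the signaling rate the channel and SERDES must support, so a "lightweight" wrapper matters for the loss budget as well as for latency and power.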

ASIC Escape

ASIC Escape

Next-generation ASICs want to (at least) double their bandwidth
◦ The 50Gb/s generation has likely maxed out SERDES count per die/package
◦ No next-generation product for some architectures without 100Gb/s SERDES

Are the ASIC SERDES universal (CR/KR/C2C/C2M), or are they just (say) C2C, with retimers providing the other connectivity?

100Gb/s ASICs can use retimers to downshift to 50Gb/s AUIs (etc.)
◦ No faceplate density increase, but allows for ASIC bandwidth growth.

Switch Die Area Increasingly Consumed by I/O

Commercial example: Ethernet switch
– 128 x 25G lanes
– 30% of chip area is IO (shaded blue), 70% for other mission functions
– Grows to >>30% in the 50G IO generation!

Key requirement for 100G generation IO: preserve silicon area for value-added mission functions
– Minimize power
– Minimize area (cost and device feasibility)
– PCS & FEC logic commonality (reuse where possible)
– "Balanced" serdes choice: area & power vs. reach/capability

When is 100G electrical I/O required?

Driven primarily by switch package escape

Practical BGA limit is ~256 lanes in a 70 mm package, 1 mm ball pitch

Products announced already with 50G IO with this form factor (12.8T)

Switch processing capacity doubling every ~24 months

100G IO required in products by ~2019 to continue the growth trend at a constant rate!

After 2019, ASIC requirements are expected to exceed the BW delivered by a conventional BGA with 50G IO

[Figure: Package Design. Package size (mm, 30 to 110) vs. total IO bandwidth (Tb/s, 0 to 50). BGA: mature technology, lowest cost packaging, commercial products. LGA: larger packages, higher package loss, additional cost.]

[Figure: Leading Switch Capacity vs Year, with 10G IO, 25G IO, 50G IO, and 100G IO generations marked and 2019 highlighted.]
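The package-escape argument is lane arithmetic, sketched below. The 256-lane limit and the 12.8T figure come from the slide; the function names are hypothetical, and rates are nominal (FEC overhead ignored).

```python
BGA_LANE_LIMIT = 256   # practical limit quoted for a 70 mm BGA at 1 mm ball pitch


def package_bandwidth_gbps(lanes, lane_rate_gbps):
    """Total nominal IO bandwidth of a package."""
    return lanes * lane_rate_gbps


def lanes_needed(capacity_gbps, lane_rate_gbps):
    """Smallest lane count that carries the requested capacity (ceiling division)."""
    return -(-capacity_gbps // lane_rate_gbps)


def fits_in_bga(capacity_gbps, lane_rate_gbps):
    """True if the capacity can escape through a conventional BGA."""
    return lanes_needed(capacity_gbps, lane_rate_gbps) <= BGA_LANE_LIMIT


# 256 lanes of 50G IO gives the 12.8T products mentioned on the slide
print(package_bandwidth_gbps(BGA_LANE_LIMIT, 50))
# One more capacity doubling (25.6T) no longer fits at 50G, but does at 100G
print(lanes_needed(25600, 50), fits_in_bga(25600, 50), fits_in_bga(25600, 100))
```

With capacity doubling roughly every 24 months, the next step after 12.8T needs either twice the lanes (beyond the BGA limit) or twice the lane rate.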

Right sizing the IO standardization for switching applications

Non-retimed Feasibility Estimates

Interface / Architecture                          Approx. Channel Loss (ball-ball, dB)   Feasibility Rank
Chip to Module (assumes 1RU system)
  Conventional (10" PCB + front-panel module)     20 [1]                                 High
  Internally cabled host + front-panel module     15? [2]                                High
  Mid-board optical module                        10-15 [2]                              High
Chip to Chip
  Conventional (PCB + mezz connector)             40 [1]                                 Low
  Internally cabled + mezz connector              ~20-30                                 Med
Backplane (KR)
  1 m conventional PCB                            60 [1]                                 Very Low
  Orthogonal                                      20-40 [2]                              Med
  Cabled backplane                                20-35 [2]                              Med
Copper Cable (CR)
  Conventional DAC + PCB host                     60 [1]                                 Very Low
  Internally cabled host + DAC                    ~20-30                                 Med

[1] Projection: scaled 2x from 802.3cd existing channels
[2] Source: Nathan Tracy, TE Connectivity, DesignCon 2017, "CEI-112G: Considering Electrical Channels"

• C2M looks achievable

• Chip to Chip, Backplane, and Copper Cables require architecture change

Not surprisingly, tall poles appear to be associated with the longer channels

More data and study required on both channels, as well as serdes capability

Initial 100G/lane electrical applications are likely to be associated with switch package escape, where a high power, larger area serdes will be prohibitive, and time to standardization is critical

Project scope implications?

Backplane

Backplane Considerations

Line in the sand at 30 dB at 28 GHz
- Past project targets
- Need a starting point to discuss system routing needs

Traditional lengths struggle at this speed step.

PCB industry has made further progress in the last 12 months.

Slide/data used with permission from Nathan Tracy, TE Connectivity. Data presented at DesignCon 2017.

Backplane Considerations

Traditional backplane:
◦ Need to shorten to ~14-17" of Meg6 to meet 30 dB

Given the routing required on and off the LCs, routable distance on the backplane itself is too limited.

Traditional backplane architecture isn't very realistic at this speed.

Approximate calculations:
◦ Holding 4 dB for conn + vias
◦ 1.89 dB/in -> 13.75"
◦ 1.485 dB/in -> 17.5"
◦ Loss #s from Goergen_nea_01_0517
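The approximate calculations above can be reproduced in a few lines. This is a sketch of the arithmetic only: the 30 dB budget, 4 dB connector and via allowance, and per-inch loss figures come from the slide, while the function name is hypothetical.

```python
def max_trace_length_in(budget_db, fixed_loss_db, loss_db_per_in):
    """Longest trace that fits the budget after reserving the fixed losses."""
    if loss_db_per_in <= 0:
        raise ValueError("loss per inch must be positive")
    return (budget_db - fixed_loss_db) / loss_db_per_in


BUDGET_DB = 30.0        # line in the sand at 28 GHz
CONN_AND_VIAS_DB = 4.0  # held back for connectors and vias

for loss in (1.89, 1.485):
    length = max_trace_length_in(BUDGET_DB, CONN_AND_VIAS_DB, loss)
    print(f"{loss} dB/in -> {length:.2f} in")
```

The results land within rounding of the 13.75" and 17.5" figures quoted on the slide, and that total still has to cover the line-card routing at both ends of the link.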

Backplane Considerations

Orthogonal connector
◦ Shortening DC traces (on moderate materials) to closer to 6-8" fits in a 30 dB budget
◦ Using 8" flyover cable fits in a 30 dB budget

There are paths forward here.

Simulation results used with permission from Nathan Tracy, TE Connectivity.

[Figure: simulated channels: 6" DC trace in low cost FR4; 2" trace + 8" cable]

Backplane Considerations

Cabled backplane
◦ Shortening DC traces (on moderate materials) to closer to 4-6" fits in a 30 dB budget
◦ Using 8" flyover cable fits in a 30 dB budget

There are paths forward here.

Simulation results used with permission from Nathan Tracy, TE Connectivity.

[Figure: simulated channels: 4" PCB traces with 1 m cable; 4" DC trace in FR4; 2" trace + 8" cable; 8" flyover cables with 1 m cable]

Backplane Summary

Design choices will factor in more than ever before:
- Medium used (PCB vs. cable)
- Surface roughness
- Vias & stubs
- Ground layer thickness
- Radiated emissions shielding

Pure-backplane links do not have the burden of needing to support optics and passive Cu cable
◦ Universal SERDES on ASICs are desirable (see nicholl_3bs_01c_0115*), but may make things too difficult

Backplane options that remain at 30 dB are still useful for systems!

*http://www.ieee802.org/3/bs/public/15_01/nicholl_3bs_01c_0115.pdf

Retimers

Retimers

Are a useful tool to make connections function
◦ Can increase TF for trickier links

Increase cost per bit and power, and take up board surface area
◦ These have BMP and EF implications

Needed everywhere, or just sparingly where needed?
◦ If we use them everywhere, then ASIC SERDES can be simpler

Discussion

We should keep in mind how long each objective is expected to take to standardize
◦ A subset of the objectives (e.g. C2M) may be economically worth splitting off to finish faster

Need to get consensus on what use cases should be supported

Thanks!
