Post on 12-Sep-2021
Realizing a Power Efficient, Easy to Program Manycore:
The Tile ProcessorAnant Agarwal
Tilera, MIT
2 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Multicore is Everywhere
DesktopTVs
Cellphones Telepresence
Datacenters/Clouds
NetworkSwitchesNetworkSwitches
LaptopAutomobiles
3 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Parallel Computing No Longer Province of Rocket Scientists
The computing world is ready for radical change
Time
#Cores
2007
1
2006
2
4
32
2014
Quadcore
2005
64cores
Dualcore
1000 cores
Intel
Sun IBM Cell
nCores
8-24cores
Larrabee
4 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Tilera Announced the TILE-Gx100 in Oct `09
100 general-purpose coresRuns SMP LinuxStandard programming1.25GHz – 1.5GHzFull 64-bit processors32 MBytes total cache546 Gbps memory BW200 Tbps iMesh BW
80-120 Gbps packet I/O80 Gbps PCIe I/O
Wire-speed packet engine– 120Mpps
MiCA engines:– 40 Gbps crypto– 20 Gbps compressMemory ControllerMemory Controller Memory ControllerMemory Controller
Memory ControllerMemory Controller Memory ControllerMemory Controller
mPI
PEm
PIPE
MiscI/O
MiscI/O
MiCAMiCA
MiCAMiCA
3 PCIeInterfaces3 PCIe
Interfaces
NetworkI/O
Eight10Gb-or-
32 1Gb-or-
Two 40Gb
NetworkI/O
Eight10Gb-or-
32 1Gb-or-
Two 40Gb
FlexibleI/O
FlexibleI/O
5 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
But first, let’s rewind to 1996
6 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Research Vision to Commercial Product
2002
2007
100B transistors
2018
1B Transistors
in 2007
1996
The opportunity
CPUMem
1997
A blankslate
MIT Raw 16 cores Tile Processor
64 cores
Memory Controller (DDR3)
Memory Controller (DDR3) Memory Controller (DDR3)
Memory Controller (DDR3)
Memory Controller (DDR3)
Memory Controller (DDR3) Memory Controller (DDR3)
Memory Controller (DDR3)
mPI
PEm
PIPE
FlexibleI/O
FlexibleI/O
UART x2, USB x2,JTAG, I2C, SPI
UART x2, USB x2,JTAG, I2C, SPI
MiCAMiCA
MiCAMiCA
Ser
Des
Ser
Des
PCIe 2.08-lane
PCIe 2.08-lane
Ser
Des
Ser
Des
PCIe 2.04-lane
PCIe 2.04-lane
Ser
Des
Ser
Des
PCIe 2.08-lane
PCIe 2.08-lane
Inte
rlake
nIn
terla
ken
Inte
rlake
nIn
terla
ken
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
TileGx100100 cores
The future?
2010
7 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
The Opportunity
20MIPS cpuin 1987
1996…
Few thousand gates
8 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
The Opportunity
The billion transistor chip of 2007
9 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
How to Fritter Away Opportunity
100 ported RegFil and RR
Caches
Control
More resolution buffers, control
Does not scale
Burns a lot of power
10 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Key Multicore Challenges: The 3 P’s
Performance challenge– How to scale from 1 to 1000 cores -- the number of cores is
the new MegaHertz
Power efficiency challenge– Performance per watt is the new metric – power efficiency
trumps instruction-set (ISA) compatibility
Programming challenge– How to distinguish between a processor and a paper weight
11 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Our Early Raw Proposal
CPUMem
Got parallelism?
12 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
ASICs have high performance and low power• Custom-routed, short wires• Lots of ALUs, registers, memories – huge on-chip parallelism
memmem
mem
mem
mem
Take Inspiration from ASICs
But how to build a programmable chip?
13 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Replace Long Wires with Routed Interconnect
Ctrl
[IEEE Computer ’97]
14 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
16-Way ALU Clump Distributed ALUs
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
Bypass Net
RF
15 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Distributed ALUs, Routed Bypass Network
ALU
ALU
ALU ALU
ALU
ALU
R
Scalar Operand Network (SON) [TPDS 2005]
16 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
From a Large Centralized Cache…
17 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
…to a Distributed Shared Cache
ALU
ALU ALU ALU
ALU
ALU
R
$
[ISCA 1999]
18 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Distributed Everything + Routed InterconnectTiled Multicore
ALU
ALU
ALU ALU
ALU
ALU
R
$
Each tile is a processor, so programmable
19 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
On-Chip Interconnect Routes Messages
ALU
ALU
ALU ALU
ALU
ALU
R
$
head
erpa
yload
For distributed cache access, off-chip misses, I/O, user-level messages
20 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Tiled Multicore Captures ASIC Benefits and is ProgrammableScales to large numbers of coresModular – design and verify 1 tilePower efficient– Short wires plus locality opts –
CV2f– Chandrakasan effect, more cores at
lower freq and voltage – CV2f
ProcessorCore
= TileCore + Switch
S Current Bus Architecture
21 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
“Raw” Die Photo
16 tiles, 425MHz, 18 Watts (vpenta)IBM 0.18 micron process
[ISCA 2004]
…2002
22 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Research Vision to Commercial Product
2002
2007
100B transistors
100B transistors
2018
1B Transistors
in 2007
1996
The opportunity
CPUMem
1997
A blankslate
MIT Raw 16 cores Tile Processor
64 cores
Memory Controller (DDR3)
Memory Controller (DDR3) Memory Controller (DDR3)
Memory Controller (DDR3)
Memory Controller (DDR3)
Memory Controller (DDR3) Memory Controller (DDR3)
Memory Controller (DDR3)
mPI
PEm
PIPE
FlexibleI/O
FlexibleI/O
UART x2, USB x2,JTAG, I2C, SPI
UART x2, USB x2,JTAG, I2C, SPI
MiCAMiCA
MiCAMiCA
Ser
Des
Ser
Des
PCIe 2.08-lane
PCIe 2.08-lane
Ser
Des
Ser
Des
PCIe 2.04-lane
PCIe 2.04-lane
Ser
Des
Ser
Des
PCIe 2.08-lane
PCIe 2.08-lane
Inte
rlake
nIn
terla
ken
Inte
rlake
nIn
terla
ken
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
10 GbEXAUI
10 GbEXAUI S
erD
esS
erD
es4x GbESGMII
TILE-Gx100100 cores
The future?
2010
23 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Cloud Demands higher performance density and low power
Wireless infrastructure marketDemands higher throughput, more services and low power
Networking market Demands higher performance, better security and services
Digital multimedia market Newer algorithms (e.g., H.264) for higher compression demand more performance
Why Do We Care?Markets Demanding More Performance at Lower Power
Cable & BroadcastCable & BroadcastVideo ConferencingVideo Conferencing
SwitchesSwitchesSecurity AppliancesSecurity Appliances
GGSNGGSN Base StationBase Station
Cloud server rackCloud server rack
24 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
The Tile Processor is a System-on-a-Chip
Yes w/ DDCCache coherency
64Flexible I/O pins
27-34Typical power -9 device (W)
DDR2 bandwidth (peak Gbps)
PCIe interfaces
Ethernet bandwidth
I/O and Memory
Typical power -7 device (W)Typical power -5 device (W)
PowerClock speed (MHz)On chip bandwidth (Terabit/s)
Operations (16/32-bit BOPS)
On-chip cache (MB)# of cores
Performance
5.6
700, 866
TILEPro64
2 XAUI, 2GbE
N/A
200
2 x 4-lanes
17-21
64
221/166
38
TILEPro64 Block DiagramSe
rDes
SerD
es
GbE 0GbE 0
GbE 1GbE 1
FlexibleI/O
UARTJTAG
SPI, I2C
FlexibleI/O
UARTJTAG
SPI, I2C
DDR2 Controller 3DDR2 Controller 3
DDR2 Controller 0DDR2 Controller 0
DDR2 Controller 2DDR2 Controller 2
DDR2 Controller 1DDR2 Controller 1
PCIe 0PCIe 0
SerD
esSe
rDes
PCIe 1PCIe 1
XAUI 0XAUI 0
SerD
esSe
rDes
XAUI 1XAUI 1
SerD
esSe
rDes
TILEPro36 also available Tilera development systems
25 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Remember the 3 P’s?
Performance challenge– How to scale from 1 to 1000 cores -- the number of cores is
the new MegaHertz
Power efficiency challenge– Performance per watt is the new metric – power efficiency
trumps instruction-set (ISA) compatibility
Programming challenge– How to distinguish between a processor and a toaster
26 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Key Innovations
1. General purpose cores – Standard OS and
programming
3. Multicore Dynamic Distributed Cache– How to achieve cache coherence and run
standard software
2. iMesh™ Network – How to scale and be energy efficient
4. Multicore Hardwall™– How to virtualize multicore
5. Multicore Development Environment
– How to program
OS1/APP1
OS1/APP3OS2/APP2
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
27 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
1 - Full-Featured General Cores Enable Throughput Oriented Computing and Standard Languages
Processor– Each core is a complete computer– 3-way VLIW CPU– Designed for low power – 200mW per core– SIMD instructions: 32, 16, and 8-bit ops– Instructions for video (e.g., SAD) and networking– Protection and interrupts– Single core performance roughly the same as a
modern MIPS or ARM coreMemory
– L1 cache: 8KB I, 8KB D, 1 cycle latency– L2 cache: 64KB unified, 7 cycle latency– 32-bit virtual address space per process– 64-bit physical address space– Instruction and data TLBs– Cache integrated 2D DMA engine
Switch in each tile
Runs SMP LinuxRuns off-the-shelf open-source C/C++ programs
Core
Cache16K L1-I
64K L2
I-TLB
D-TLB2D
DMA8K L1-D
Register File
Three Execution Pipelines
TerabitSwitch
28 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
MDN TDNUDN IDN
SWITCH
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
Core
Cache + MMU
TerabitSwitch
2- iMesh On-Chip Network
Distributed resources– 2D Mesh peer-to-peer tile networks– 5 independent networks – Each with 32-bit channels, full duplex– Tile-to-memory, tile-to-tile, and tile-to-IO data transfer– Packet switched, wormhole routed, point-to-point– Near-neighbour flow control, dimension-ordered routing
Performance and energy efficiency– ASIC-like one cycle hop latency– 2 Tbps bisection bandwidth– 32 Tbps interconnect bandwidth– Low power
6 independent networks– One static, four dynamic – IDN – System and I/O– MDN – Cache misses, DMA, other memory– TDN, VDN – Tile to tile memory access and coherence– UDN, STN – User-level streaming and scalar transfer
Achieves scalability and power efficiency
VDN STN
29 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Meshes are Power Efficient
[Konstantakopoulos ’07]
More than 80% power savings over buses
30 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
3 – Coherent On-Chip Cache System Enables Standard Programming
Distributed cache– Each tile has local L1 and L2 cache– Aggregate of L2 serves as a globally shared L3
Dynamic Distributed Cache (DDC™)– Hardware based cache coherence– Hardware tracks sharers, invalidates stale
copies– One or multiple coherency domains– Dedicated network to manage coherency
Coherent direct-to-cache I/O– Header/packet delivered directly to tile caches– Cache coherent delivery– Significant DRAM bandwidth and latency
reductionSe
rDes
SerD
es
GbE 0GbE 0
GbE 1GbE 1
FlexibleI/O
UARTJTAG
SPI, I2C
FlexibleI/O
UARTJTAG
SPI, I2C
DDR2 Controller 3DDR2 Controller 3
DDR2 Controller 0DDR2 Controller 0
DDR2 Controller 2DDR2 Controller 2
DDR2 Controller 1DDR2 Controller 1
PCIe 0PCIe 0
SerD
esSe
rDes
PCIe 1PCIe 1
XAUI 0XAUI 0
SerD
esSe
rDes
XAUI 1XAUI 1
SerD
esSe
rDes
CompletelyCoherentSystem
tile a tile h
tile b
31 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
4 – Multicore Hardwall™ Technology for Virtualization and Protection in Cloud Environments
The virtualization and protection challengeMulticores need to run multiple OS’s and applications in cloud environmentsOS’s must be protected from each otherI/O and other shared resources must be virtualized
Multicore Hardwall technologyProtects applications and OS by prohibiting unwanted interactionsConfigurable to include one or many tiles in a protected areaSupported by Tilera hypervisor running on all the tiles
OS1/APP1
OS1/APP3
OS2/APP2
32 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Configurable Fine Grain Protection
01 2
3
TLB Access
01 2
3
DMA Engine
01 2
3
“User” Network
01 2
3
I/O Network
Key:0 – User Code1 – OS2 – Hypervisor or PAL3 – Hypervisor Debugger
Key:0 – User Code1 – OS2 – Hypervisor or PAL3 – Hypervisor Debugger
Full Stack Linux with Hypervisor
33 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
5 – Standard Tools and Software Stack
Multicore Development Environment
Standard application stack
Standard programmingSMP Linux 2.6.26ANSI C/C++pthreads
Integrated toolsSGI compilerStandard gdb gprofEclipse IDE
Innovative toolsMulticore debugMulticore profile
Standards-based tools
TILE hardware
Tile Tile Tile Tile …
Hypervisor
Operating System
Applications
Virtualization and high speed I/O drivers
Linux kernel and kernel drivers
Linux libraries
Applications libraries
Application layerOpen source appsStandard C/C++ libs
Operating System layer64-way SMP LinuxZero Overhead Linux Bare metal environment
Hypervisor layerVirtualizes hardwareI/O devices driversLoad balancer
34 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Multiple Software Environments to Meet Diverse Needs of Embedded and Cloud Systems
Standard SMP Linux– Standard Linux environment with processes and threads– Ideal for applications and control plane code requiring operating system services– Open source applications work out of the box
Standard SMP Linux with Zero Overhead Linux (ZOL)– Zero Linux overhead (Eliminates OS interrupts, timer ticks, etc..) – Transparent to programmer - no software change required – For high performance data-plane applications not requiring OS services
Bare metal environment– Full control of the hardware on up to 64 tiles– No operating system or hypervisor layers– For embedded applications requiring fine grain control of memory, and IO
Hybrid environment – Using 2 or all three of the above models – Each environment can be run on one or more Tiles– Ideal for customers aggregating data plane and control plane code on one chip
35 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Parallel Programming using Standard Models
64-way SMP Linux– Single system image across all tiles
Standard pthreads API– pthread_create()– Shared memory model by default– Synchronize using mutexes and locks
Standard Linux processes – fork(), exec()– Separate address space– Share memory: mmap(), mspaces– Communicate: Pipes / local sockets
Gentle slope programming optimizations using Linux extensions
– Control memory location and distribution– Control thread scheduling and location
New models and further optimizations using TMC library (Tile Multicore Components)
DPI
DPI
DPI
LoadBalancer Histogram
Packet stream
Deep packet inspection
PCIe 1MACPHY
PCIe 1MACPHY
PCIe 0MACPHY
PCIe 0MACPHY
SerdesSerdes
SerdesSerdes
Flexible IOFlexible IO
GbE 0GbE 0
GbE 1GbE 1Flexible IOFlexible IO
UART, HPIJTAG, I2C,
SPI
UART, HPIJTAG, I2C,
SPI
DDR2 Memory Controller 3DDR2 Memory Controller 3
DDR2 Memory Controller 0DDR2 Memory Controller 0
DDR2 Memory Controller 2DDR2 Memory Controller 2
DDR2 Memory Controller 1DDR2 Memory Controller 1
XAUIMAC
PHY 0
XAUIMAC
PHY 0SerdesSerdes
XAUIMAC
PHY 1
XAUIMAC
PHY 1SerdesSerdes
DPI
DPI
DPI
HST LB
36 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Scaling Up: TileGx100 Announced in Oct `09
100 general-purpose coresRuns SMP LinuxStandard programming1.25GHz – 1.5GHzFull 64-bit processors32 MBytes total cache546 Gbps memory BW200 Tbps iMesh BW
80-120 Gbps packet I/O80 Gbps PCIe I/O
Wire-speed packet engine– 120Mpps
MiCA engines:– 40 Gbps crypto– 20 Gbps compressMemory ControllerMemory Controller Memory ControllerMemory Controller
Memory ControllerMemory Controller Memory ControllerMemory Controller
mPI
PEm
PIPE
MiscI/O
MiscI/O
MiCAMiCA
MiCAMiCA
3 PCIeInterfaces3 PCIe
Interfaces
NetworkI/O
Eight10Gb-or-
32 1Gb-or-
Two 40Gb
NetworkI/O
Eight10Gb-or-
32 1Gb-or-
Two 40Gb
FlexibleI/O
FlexibleI/O
37 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Scaling Down: TILEGx16™ also Announced
16 Processor Cores1.0 &1.25 GHz speedsFull 64-bit processors5.2 MBytes total cache200 Gbps memory BW20 Tbps iMesh BW
24 Gbps total packet I/O – 2 ports 10GbE (XAUI)– 12 ports 1GbE (SGMII)
32 Gbps PCIe I/O
Wire-speed packet engine– 30Mpps
MiCA engine:– 10 Gbps crypto– 5 Gbps compress &
5 Gbps decompressMidrange 36 core part also announced
Memory Controller (DDR3)Memory Controller (DDR3)
mPI
PEm
PIPE
10 GbEXAUI
10 GbEXAUI
SerD
esSe
rDes
4x GbESGMII
10 GbEXAUI
10 GbEXAUI
SerD
esSe
rDes
4x GbESGMII
SerD
esSe
rDes4x GbE
SGMII
FlexibleI/O
FlexibleI/O
UART x2, USB x2,JTAG, I2C, SPI
UART x2, USB x2,JTAG, I2C, SPI
MiCAMiCA
SerD
esSe
rDes PCIe 2.0
4-lanePCIe 2.04-lane
SerD
esSe
rDes PCIe 2.0
4-lanePCIe 2.04-lane
Memory Controller (DDR3)Memory Controller (DDR3)
SerD
esSe
rDes PCIe 2.0
4-lanePCIe 2.04-lane
38 © 2009 Copyright Tilera Corporation. All Rights Reserved.© 2009 Copyright Tilera Corporation. All Rights Reserved.
Vision for the Future
The ‘core’ is the logic gate of the 21st century
pm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
spm
s
Corollary of Moore’s Law: The number of cores will double every 18 months