OpenVMS Technical Update Days
September 22nd, 2003
Bad Homburg, Germany
Dr. Herbert Cornelius
Intel EMEA
Intel® Itanium® Architecture
Technical Overview
The Enterprise Architecture for the next Decade
Intel Confidential
*Other brands and names are the property of their respective owners
© Copyright 2002-2003 Intel Corporation. All Rights Reserved.
Intel® Itanium® Architecture
Agenda
Enterprise Computing Trends
Intel® Itanium® Architecture
Enterprise System Platforms
Intel Software Tools
Enterprise Computing Areas
Engineering, Business, Science
Some Enterprise Computing History
1960s, 1970s, 1980s, 1990s, 2000s
Proprietary Solutions
RISC
Solutions based on Building Blocks using Industry Standard
Designed for Enterprise Computing
ARCHITECTURE
A new Architecture for Enterprise Computing
RISC Technology
CISC Technology
New Architectural features
• EPIC
• Predication
• Speculation
• Enhanced floating point performance
• Massive Resources
• 64-bit instruction set, registers & addressing
Enhanced reliability features
IA-32
Enterprise class OS
Why Intel?
Economy of Scale
Eco-System Investment
Performance
Memory Costs
Solution Costs
Intel® Architecture
RISC
www.intel.com/research/silicon
10 GHz, 1 Billion Transistors, ~2007 (est.)
Driving the Change in Computing Economics
Enabling Peta-Flop Computing
Moore’s Law will continue for the next 10 Years
Driving Performance Vectors
• Silicon Process
• Density
• Frequency
• Manufacturing
• Micro-Architecture
• Execution Units, Caches
• Threading
• Memory Subsystem
• I/O-Subsystem
• System Architecture
• Compilers
• Libraries
• Tools
• ISVs
Fundamental Architecture Challenges
• Sequentiality inherent in traditional architectures
• Complex hardware needed to (re)extract ILP
• Limited ILP available within basic blocks
• Branches make extracting ILP difficult
• Memory dependencies further limit ILP
• Increasing latency exacerbates ILP need
• Limited resources: A fundamental constraint
• Shared resources create more overhead
• Loop ILP extraction costs code size
• And the challenges continue ...
Itanium® Architecture overcomes these fundamental challenges!
Characteristics of High-End Processors
High-end Processors Require Significant Resources, Capabilities
Processor              On-die cache     Registers   Issue ports
Sun US III*            96k**            96          4
Itanium® 2 Processor   3 MB, 6 MB       264         11
Alpha EV7*             1.75 MB          152         4-6
IBM Power* 4           Up to 1.5 MB     72          8
Sources: IBM.com, HP.com, Sun.com**, Intel
**CPU connects to external 8 MB L2 cache
Intel® Itanium® Processor Family
2001: 800MHz, 4MB L3-Cache, 460GX Chip-set, OEM Chip-sets, 180nm
2002: 1GHz, 3MB iL3-Cache, E8870 Chip-set, OEM Chip-sets, 180nm
2003 (Madison**): 1.5GHz, 6MB iL3-Cache, E8870 Chip-set, OEM Chip-sets, 130nm
2004 (Madison9M**): >1.5GHz, 9MB iL3-Cache, E8870 Chip-set, OEM Chip-sets, 130nm
2005 (Montecito**): >>1.5GHz, larger L3-Cache, Enhanced Dual-Core, E8870 Chip-set, OEM Chip-sets, 90nm
common platform
**codename
All features and dates specified are targets provided for planning purposes only and are subject to change
Performance Advantage over RISC
1322 2119
http://www.intel.com/ebusiness/products/itanium/index.htm as of 06/30/2003
Itanium® Processor Architecture: Selected Features
• Instruction Level Parallelism (6-way)
• Large Register Files
• Automatic Register Stack Engine
• Predication
• Software Pipelining Support with Loop Control Hardware
• Register Rotation
• Sophisticated Branch Architecture
• Control & Data Speculation
• Powerful 64-bit Integer Architecture
• Advanced 82-bit Floating Point Architecture
• Multimedia Support (MMX™ Technology)
• 64-bit Addressing Flat Memory Model
• IA-32 Binary Execution Support
Itanium® 2 Processor Block Diagram
(schematic overview)
iL3 cache: 3-6MB (24-way, 128B CL)
Itanium® 2 Memory Cache Hierarchy
Itanium® 2 Processor (1.5GHz)
L1D: 16KB, 64B CL, 1 CLK
L1I: 16KB, 64B CL, 1 CLK
L2-Cache: 256KB, 128B CL, 8-way, 5-7 CLKS
L3-Cache: 3-6MB, 128B CL, 24-way, 14-17 CLKS
Memory (Controller): ~150 CLKS
Bandwidth labels in the diagram: 48GB/s (three links), 6.4 GB/s (one link)
3-level caching on Itanium® Architecture
• 1st level cache optimized for latency
• 2nd level cache optimized for bandwidth
• 3rd level cache optimized for size
• all integrated, non-blocking caches at full CPU frequency
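To make the latency figures above concrete, the following is a minimal sketch of an average-access-time estimate. The latencies are the low ends of the ranges on the slide; the miss rates are hypothetical and not from the presentation, and treating each level's latency as additive is a simplification.

```python
# Latencies in clocks, taken from the low end of the ranges on the slide.
LATENCY = {"L1": 1, "L2": 5, "L3": 14, "memory": 150}

# Hypothetical local miss rates, chosen only to make the arithmetic concrete.
MISS_RATE = {"L1": 0.05, "L2": 0.20, "L3": 0.30}


def average_access_time(latency, miss_rate):
    """Average clocks per access for a three-level cache hierarchy."""
    l3_time = latency["L3"] + miss_rate["L3"] * latency["memory"]
    l2_time = latency["L2"] + miss_rate["L2"] * l3_time
    return latency["L1"] + miss_rate["L1"] * l2_time


amat = average_access_time(LATENCY, MISS_RATE)
```

With these assumed miss rates the estimate stays under two clocks per access, which illustrates why the first level is tuned for latency and the third for size.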
Itanium® 2 Pipelines
IPG: IP Generate, L1I Cache (6 inst) and TLB access
ROT: Instruction Rotate and Buffer (6 inst)
EXP: Expand, Port Assignment and Routing
REN: Integer and FP Register Rename (6 inst)
REG: Integer and FP Register File read (6)
EXE: ALU Execute (6), L1D Cache and TLB access + L2 Cache Tag Access (4)
DET: Exception Detect, Branch Correction
WB: Writeback, Integer Register update
FP1-WB: FP FMAC pipeline (2) + reg write
L2N-L2I: L2 Queue Nominate/Issue (4)
L2A-W: L2 Access, Rotate, Correct, Write (4)
Short 8-stage in-order main pipeline
– In-order issue, out-of-order completion
– Reduced branch misprediction penalties
– Fully interlocked, no way-prediction or flush/replay mechanism
Pipelines are designed for very low latency
Core: IPG ROT EXP REN REG EXE DET WB
FPU: FP1 FP2 FP3 FP4 WB
L2: L2N L2I L2A L2M L2D L2C L2W
Large Register Set
Branch Registers (bits 63-0)
BR0-BR7
Integer Registers (bits 63-0, plus NaT bit)
GR0-GR31: 32 Static (GR0 = 0)
GR32-GR127: 96 Framed, Rotating
Predicate Registers
PR0-PR15: 16 Static (PR0 = 1)
PR16-PR63: 48 Rotating
F.P. Registers (bits 81-0)
FR0-FR31: 32 Static (FR0 = +0.0, FR1 = +1.0)
FR32-FR127: 96 Rotating
Parallel Execution Units, fully pipelined
Itanium® vs. Itanium® 2: Integer, F.P., Multimedia, Load/Store, Branch
Issue Ports/Units:
2 x F.P. MAC
2 x ALU/INT/MM
4 x ALU/MM/MEM
3 x BRANCH
EPIC (Explicit Parallel Instruction Computing)
Source Code -> Compiler -> Instructions
Instruction Bundles (3 Instr. each, 128 bit wide)
Instruction Groups (series of bundles)
Up to 6 instructions executed per clock
Michael S. Schlansker, B. Ramakrishna Rau: EPIC: Explicit Parallel Instruction Computing; IEEE Computer, February 2000, pp. 37-45
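The bundle format can be sketched in a few lines. The slide only states that a bundle is 128 bits wide and holds three instructions; the split into a 5-bit template followed by three 41-bit slots comes from the published architecture manuals, and the helper names and slot values below are hypothetical.

```python
# Illustrative model of a 128-bit instruction bundle:
# bits 0-4 hold the template, followed by three 41-bit instruction slots.
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOT_MASK = (1 << SLOT_BITS) - 1


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instruction slots")
    bundle = template
    for i, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
             for i in range(3)]
    return template, slots


# Template and slot values here are arbitrary placeholders, not real encodings.
example = pack_bundle(0x10, [0x1, 0x2, 0x3])
```

The template field is what lets the compiler, rather than the hardware, state which unit type each slot needs and where an instruction group ends.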
Itanium® 2 Dispersal Matrix
Legend: Possible Itanium® 2 full issue; Possible Itanium® processor and Itanium® 2 full issue
* hint in first bundle
Columns: MII MLI MMI MFI MMF MIB MBB BBB MMB MFB
Rows: MII MLI MMI MFI MMF MIB* MBB BBB MMB* MFB*
Itanium® 2 allows more compiler dispersal options
Itanium® Floating-Point Architecture
High Performance and High Precision
• Dual Fused Multiply-Add Operation (FMA)
– An efficient core computation unit
• Abundant Register resources
– 128 registers (32 static, 96 rotating)
• High Precision Data computations
– 82-bit unified internal format for all data types
• Software divide/square-root
– High throughput achieved via pipelining
Floating-Point: High Performance and High Precision
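The precision benefit of a fused multiply-add can be shown with a small sketch. It emulates a single-rounding FMA in double precision using exact rational arithmetic; it does not model the 82-bit register format, and the function name and operands are chosen only for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a, b, c):
    """Compute a*b + c exactly, then round once to double precision."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = a * b + c                   # product is rounded, then the sum is rounded
fused = fused_multiply_add(a, b, c)   # one rounding at the very end
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 in double precision, so the unfused result is 0.0 while the fused result keeps the small difference.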
Predication: Control Flow to Data Flow
Traditional Arch.: if -> cmp, br -> then / else
Itanium® Architecture: if -> cmp p1,p2 -> (p1) then, (p2) else
Removes/Reduces Branches and Enables Parallel Execution
64 predicate registers
Can be combined with logical ops
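The conversion from control flow to data flow can be sketched as follows. This is a behavioural model only: `guarded` is a hypothetical helper standing in for an instruction with a qualifying predicate, and the example function is not from the presentation.

```python
def guarded(pred, old, new):
    """Model one predicated instruction: commit `new` only if `pred` is true."""
    return new if pred else old


def abs_diff_branching(a, b):
    """Traditional form: compare, then branch to one of two arms."""
    if a > b:
        return a - b
    return b - a


def abs_diff_predicated(a, b):
    """If-converted form: no branch, both arms are issued."""
    p1 = a > b                      # cmp p1,p2: one compare, two predicates
    p2 = not p1                     # p2 is the complement of p1
    r = 0
    r = guarded(p1, r, a - b)       # (p1) "then" arm
    r = guarded(p2, r, b - a)       # (p2) "else" arm
    return r
```

Both arms are evaluated on every call, which mirrors how the two predicated instructions can issue in the same clock without waiting for a branch to resolve.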
Software Pipelining
• Traditional architectures use loop unrolling
– Results in code expansion and increased cache misses
• Itanium®-Processor Software Pipelining uses rotating registers
– Allows overlapping execution of multiple loop instances
• Predication controls the pipeline stages
Sequential Loop (over Time): load, compute, store
Software-Pipelined Loop (over Time): load, compute, store
Software Pipelining (cont.)
stage 1
stage 2
stage 3
stage 4
Loop Iteration
Special Loop control and branch registers, also usable for WHILE-loops
Predicate registers rotate as well and define the pipeline stages
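A minimal sketch of a three-stage software pipeline (load, compute, store) shows how stage predicates produce the prolog, kernel, and epilog without separate code for each. The function name and the scaling operation are hypothetical; one pass of the outer loop stands for one clock.

```python
def software_pipelined(values, scale):
    """Run load -> compute -> store as a 3-stage software pipeline."""
    n = len(values)
    stages = 3
    out = [None] * n
    loaded = {}      # iteration -> value produced by the load stage
    computed = {}    # iteration -> value produced by the compute stage
    cycles = 0
    for cycle in range(n + stages - 1):          # prolog + kernel + epilog
        # Stage predicates: stage s is active while it has a valid iteration.
        p = [0 <= cycle - s < n for s in range(stages)]
        # Later stages run first so each consumes the previous cycle's result.
        if p[2]:
            out[cycle - 2] = computed.pop(cycle - 2)
        if p[1]:
            computed[cycle - 1] = loaded.pop(cycle - 1) * scale
        if p[0]:
            loaded[cycle] = values[cycle]
        cycles += 1
    return out, cycles


result, cycles = software_pipelined([1, 2, 3, 4], 10)
```

For four iterations this takes 6 cycles instead of the 12 a strictly sequential load, compute, store loop would need in the same model.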
Register Rotation
• GR32-127 and FR32-127 can rotate (specified range)
• Separate rotating register base for each set (GR, FR)
• Loop branches decrement all register rotating bases (RRB)
• Instructions contain a “virtual” register number
– physical register # = RRB + virtual register #
i=0 i=1 i=2 i=3 i=4 i=5 i=6 i=7
same phy. reg., diff. virtual number
Predicate register range also rotates.
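The renaming rule can be sketched as below. The slide gives the mapping as RRB plus the virtual register number; the wrap-around within the rotating region and the fixed size of 96 registers are assumptions made here to keep the example self-contained (the slide notes that the rotating range is specified by software).

```python
ROT_BASE = 32   # first rotating register (GR32 / FR32)
ROT_SIZE = 96   # assumed size: the full GR32-127 or FR32-127 range


def physical(virtual, rrb):
    """Map a virtual register number to a physical one for a given RRB."""
    if virtual < ROT_BASE:
        return virtual                          # static registers are not renamed
    return ROT_BASE + (virtual - ROT_BASE + rrb) % ROT_SIZE


rrb = 0
written = physical(32, rrb)      # iteration i writes virtual register 32
rrb = (rrb - 1) % ROT_SIZE       # the loop branch decrements the RRB
read_next = physical(33, rrb)    # iteration i+1 reads virtual register 33
```

`written` and `read_next` name the same physical register, so a value produced in one iteration is picked up under the next virtual number in the following iteration without any copy instruction.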
Control & Data Speculation
Control Speculation moves loads above branches / calls
Original order: instr. 1, instr. 2, branch (Barrier), ld r1=, use = r1
Data Speculation moves loads above possibly conflicting stores
Original order: instr. 1, instr. 2, st[?] (Barrier), ld r1=, use = r1
Speculation reduces the impact of memory latency
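Control speculation can be modelled with a deferred-fault token. In this sketch `NAT` stands in for the NaT marker shown on the register slide, `speculative_load` plays the role of a speculative load, and the explicit check before use plays the role of the speculation check; all names are hypothetical and a dictionary stands in for memory.

```python
NAT = object()   # stands in for the NaT deferred-exception token


def speculative_load(memory, addr):
    """Speculative load: a faulting access yields NaT instead of raising."""
    return memory.get(addr, NAT)


def lookup(memory, addr, wanted):
    value = speculative_load(memory, addr)   # load hoisted above the branch
    if not wanted:                           # branch: the result may go unused
        return None
    if value is NAT:                         # check: recover only when needed
        raise KeyError(addr)                 # recovery path reports the real fault
    return value
```

A bad address costs nothing on the path that never uses the value, while the path that does use it still sees the fault.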
Advanced Load Address Table: ALAT
• ld.a inserts entries
• Conflicting stores remove entries
– also ld.c.clr, chk.a.clr
• Presence of entry indicates success
– chk.a branches when no entry is found
ALAT entries: (reg#, addr) pairs
ld.a reg# = ... inserts an entry
st[addr] is compared against the entries
chk.a reg# ? looks up the entry
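The three rules above can be sketched as a small table keyed by register number. This is a simplified model: it matches on exact addresses only, whereas real hardware also considers access size and partial address bits, and the class and method names are hypothetical.

```python
class ALAT:
    """Toy Advanced Load Address Table: register number -> load address."""

    def __init__(self):
        self.entries = {}

    def ld_a(self, reg, addr, memory, regs):
        """Advanced load: read the value and record (reg, addr)."""
        regs[reg] = memory[addr]
        self.entries[reg] = addr

    def store(self, addr, value, memory):
        """Store: write memory and drop every entry with a matching address."""
        memory[addr] = value
        for reg in [r for r, a in self.entries.items() if a == addr]:
            del self.entries[reg]

    def chk_a(self, reg):
        """True if the advanced load is still valid, False if recovery is needed."""
        return reg in self.entries


memory = {0x100: 1, 0x200: 2}
regs = {}
alat = ALAT()
alat.ld_a(4, 0x100, memory, regs)        # load moved above the stores
alat.store(0x200, 9, memory)             # unrelated store: entry survives
ok_after_unrelated = alat.chk_a(4)
alat.store(0x100, 7, memory)             # conflicting store: entry removed
ok_after_conflict = alat.chk_a(4)
```

When the check fails, recovery code would reload the register from memory, which is the branch target that chk.a takes when no entry is found.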
Itanium® vs. Itanium® 2 Assembly Code
3 clockticks on Itanium
.b1_2:
{ .mmf
(p16) ldfd f37=[r8],8
(p16) ldfd f45=[r3],8
(p19) fma.d f52=f40,f48,f0 ;;
}
{ .mmi
(p16) ldfd f32=[r33]
(p16) ldfd f40=[r2],8
nop.i 0 ;;
}
{ .mfi
(p23) stfd [r40]=f51
(p20) fma.d f48=f36,f44,f53
nop.i 0
}
{ .mib
(p16) add r32=8,r33
nop.i 0
br.ctop.sptk .b1_2 ;;
}
2 clockticks on Itanium 2 !
.b1_2:
{ .mfi
(p16) ldfd f43=[r8],8
(p19) fma.d f51=f46,f50,f0
nop.i 0
}
{ .mmf
(p16) ldfd f47=[r3],8
(p23) stfd [r32]=f56
(p21) fma.d f54=f37,f42,f53 ;;
}
{ .mii
(p16) ldfd f32=[r33]
nop.i 0
nop.i 0
}
{ .mmb
(p16) ldfd f37=[r2],8
(p16) add r32=8,r33
br.ctop.sptk .b1_2 ;;
}
A simple Example
. .
double precision, dimension(10000) :: a,b,c,d
do i=1,10000
  a(i)=a(i)*b(i)+c(i)*d(i)
enddo
. .
• DAXPY-like loop over floating-point vectors
• can be optimized differently for Itanium® and Itanium® 2
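One way to see why this loop takes 3 clocks per iteration on Itanium® and 2 on Itanium® 2 is a resource-bound estimate: count the operations each iteration needs and divide by the units available per clock. The operation counts match the assembly listings (four ldfd, one stfd, two fma.d). The Itanium® 2 unit counts follow the issue-port slide; the first-generation count of two memory units is an assumption not stated in this presentation.

```python
from math import ceil

# Operations per iteration of a(i) = a(i)*b(i) + c(i)*d(i).
OPS = {"mem": 5, "fp": 2}          # 4 loads + 1 store, 2 fused multiply-adds

# Units available per clock.
ITANIUM = {"mem": 2, "fp": 2}      # assumed: two memory units
ITANIUM2 = {"mem": 4, "fp": 2}     # four ALU/MM/MEM ports, two F.P. MAC units


def min_cycles_per_iteration(ops, units):
    """Resource-bound lower limit on clocks per loop iteration."""
    return max(ceil(ops[kind] / units[kind]) for kind in ops)


itanium_cycles = min_cycles_per_iteration(OPS, ITANIUM)
itanium2_cycles = min_cycles_per_iteration(OPS, ITANIUM2)
```

Under these counts the memory units are the bottleneck on both processors, so doubling them is what moves the bound from 3 clocks to 2.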
MP/DP CAPABLE
2002: Itanium® 2 Processor, 1GHz, 3MB iL3 cache (180nm)
2003: Itanium® 2 Processor (Madison**), 1.5GHz, 6MB iL3 cache (130nm)
2004: Itanium® 2 Processor (Madison9M**), >1.5GHz, 9MB iL3 cache (130nm)
2005: Montecito**, Dual Core, High Frequency, Large Caches per core (90nm)
future: Tanglewood**, Multi Core
DP-ONLY
Low Voltage Itanium® 2 Processor (Deerfield**), 1GHz, 1.5MB iL3 cache, 62W
Low Voltage Itanium® 2 Processor (Deerfield** follow-on)
Montecito-based Low Voltage
Future Processors, Low Voltage
Potential Enhancements: faster FSB/Links and optimized market segment SKUs
**codename
All features and dates specified are targets provided for planning purposes only and are subject to change
New Intel® Itanium® 2 Processors
3rd Generation Itanium® Architecture Processor
130nm Process, 410M Transistors
Up to 1.5GHz Frequency
6 GFLOPS DP-F.P Peak Performance
6MB integrated L3-Cache (48GB/s)
100% Software Binary Compatible
Pin-Compatible to Itanium® 2 Processor
Same Thermal Envelope
~1.3-1.5x faster than Itanium® 2 1GHz/3MB
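The peak figures on this slide follow from the clock rate and the unit counts given elsewhere in the deck. A short check, assuming each fused multiply-add counts as two floating-point operations:

```python
clock_hz = 1.5e9         # "Up to 1.5GHz Frequency"
fma_units = 2            # dual F.P. MAC units
flops_per_fma = 2        # one multiply plus one add per FMA (assumed counting)

peak_gflops = clock_hz * fma_units * flops_per_fma / 1e9

l3_bandwidth = 48e9      # "6MB integrated L3-Cache (48GB/s)"
l3_bytes_per_clock = l3_bandwidth / clock_hz
```

This reproduces the quoted 6 GFLOPS and implies that the L3 cache delivers 32 bytes per clock at 1.5GHz.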
Available Intel® Itanium® 2 Processors: widening the deployment areas
• 1.5 GHz, 6MB iL3 Cache
• 1.4 GHz, 4MB iL3 Cache
• 1.3 GHz, 3MB iL3 Cache
• 1.4 GHz, 1.5MB iL3, Compute Optimized, DP
• 1.0 GHz, 1.5MB iL3 Cache, DP, Low-Voltage
Positioning: Max. Performance, Best FLOP/Watt, Best $/FLOP
Itanium® 2 Reliability Features
**codename
Potential Future Directions
All features and dates specified are targets provided for planning purposes only and are subject to change
Itanium® Architecture Systems
High-end Itanium® 2-based systems: >2X more than Itanium!
Wide range of choice, e.g.
• 1P/2P WS: Shipping
• 2 CPUs: Shipping
• 4 CPUs: Shipping
• up to 128 CPUs: 2003
(not drawn to scale)
Itanium® 2-based Servers: Bringing High-End Data Center Capabilities to Intel® Architecture
• Large Memory Capacity
– Ex. 4P node w/48GB
– 512P+ system w/512GB
• Scalable to High-End Multi-Processing
– 32P+ SMP systems
– 512P+ Clustered configurations
• High-Bandwidth, Flexible I/O
– Large Qty PCI-X slots
– Dual GbE LAN
– Ultra 320 SCSI
– Remote I/O capabilities
• Partitioning
– Multiple System Images
– Static/Dynamic Domains
• High-End RAS
– Intelligent Platform Management
– Hardware redundancy for Fault-Tolerance
– Modular and Hot-Plug Capabilities
(Selected examples of some high-end OEM platform capabilities. Not all capabilities found on all platforms)
Performance Scaling
Scale-Up (SMP, ccNUMA)
Scale-Out (Cluster)
Scale Right
OSV Support for Itanium® Architecture
available, available, available
Port to Itanium Architecture underway
OpenVMS™, NonStop™ Kernel, Converged Enterprise UNIX*
IA-32 Execution Layer
Today: IPF code, IA-32 code; IA-32 H/W; Native IPF Hardware
Future: IPF code, IA-32 code; IA-32 EL; IA-32 H/W; Native IPF Hardware
Enables Increased Utilization of Itanium® Architecture Features
Comprehensive Software Toolset
Windows* and Linux*, IA-32 and Itanium®
• Compilers (C/C++, F77/F95)
• Performance Libraries (MKL, IPP)
• Performance Analyzer (VTune)
• Threading Tools
• Intel® Developer Services (IDS)
• Intel® Early Access Program (EAP)
Software Technologies
Software Development Tools
SW Products
• Compilers
• Intel® Threading Tools
• VTune™ Performance Analyzer
• Performance Libraries
Developer Services
www.intel.com/ids
IA-optimized Managed Runtime Choices
• Windows* Server 2003 framework for Itanium® processor family
• Framework includes
– CLR
– Base class Libraries
– ADO.NET
– ASP.NET
– Windows Forms
• BEA* WebLogic* and JRockit* JVM for Itanium® Processor Family
– Shipped Technology Preview on Windows* .NET Server 2003
– Limited Availability on Red Hat* Linux 11/7/02
– GA for both Windows* .NET Server 2003 and Red Hat: Q1’03
High-End Enterprise Applications(Databases, Business Intelligence, ERP / SCM)
(available or ongoing)
Itanium® Software Solution Support
Summary
The Economics of Enterprise Computing are changing.
Intel® Itanium® Architecture addresses all needs of Enterprise Computing.
Intel is playing a key role in accelerating Enterprise Solutions with technology leadership.
Technology Leadership
www.intel.com
Madison** Processor Features
**codename