IBM Power Systems Compiler Roadmap
Transcript of IBM Power Systems Compiler Roadmap
Compilation Technology
SCINET Briefing | Month day, 2008 © 2009 IBM Corporation
Software Group
IBM Power Systems Compiler Roadmap
Roch Archambault, IBM Toronto [email protected]
Agenda
Overall Roadmap
The Power Systems Compiler Products
Detailed Roadmaps
Common Features & Compiler Architecture
XL Fortran
XL C/C++
XL Compilers for Blue Gene
XL Compilers for Cell
XL UPC Compiler
Online Documentation
Performance Comparison
Q&A
Roadmap of XL Compiler Releases
[Timeline chart by development line; columns: 2007, 2008, Future]
XL C/C++ V9.0 and XL Fortran V11.1 for BG/P (SLES 10)
XL C/C++ V9.0 and XL Fortran V11.1 for BG/L (SLES 9)
XL C/C++ V9.0 and XL Fortran V11.1 for CELL
XL C/C++ V10.1 and XL Fortran V12.1 for AIX
XL C/C++ V10.1 and XL Fortran V12.1 for Linux
XL C/C++ V10.1 for CELL
XL C/C++ V11.1 and XL Fortran V13.1 for AIX
XL C/C++ V11.1 and XL Fortran V13.1 for Linux
Other chart labels: SLES 10, SLES 11
All information subject to change without notice
The Power Systems Compiler Products: Previous Versions
All POWER4, POWER5, POWER5+ and PPC970 enabled
XL C/C++ Enterprise Edition V8.0 for AIX
XL Fortran Enterprise Edition V10.1 for AIX
XL C/C++ Advanced Edition V8.0 for Linux (SLES 9 & RHEL4)
XL Fortran Advanced Edition V10.1 for Linux (SLES 9 & RHEL4)
XL C/C++ Advanced Edition V8.0.1 for Linux (SLES 10 & RHEL4)
XL Fortran Advanced Edition V10.1.1 for Linux (SLES 10 & RHEL4)
XL C/C++ Enterprise Edition for AIX, V9.0 (POWER6 enabled)
XL Fortran Enterprise Edition for AIX, V11.1 (POWER6 enabled)
XL C/C++ Advanced Edition for Linux, V9.0 (POWER6 enabled)
XL Fortran Advanced Edition for Linux, V11.1 (POWER6 enabled)
The Power Systems Compiler Products: Latest Versions
All POWER4, POWER5, POWER6 and PPC970 enabled
XL C/C++ for AIX, V10.1 (July 2008)
XL Fortran for AIX, V12.1 (July 2008)
XL C/C++ for Linux, V10.1 (September 2008)
XL Fortran for Linux, V12.1 (September 2008)
Blue Gene (BG/L and BG/P) enabled
XL C/C++ Advanced Edition for BG/L, V9.0
XL Fortran Advanced Edition for BG/L, V11.1
XL C/C++ Advanced Edition for BG/P, V9.0
XL Fortran Advanced Edition for BG/P, V11.1
Cell/B.E. cross compiler products:
XL C/C++ for Multicore Acceleration for Linux on System p, V9.0
XL C/C++ for Multicore Acceleration for Linux on x86 Systems, V9.0
XL Fortran for Multicore Acceleration for Linux on System p, V11.1
The Power Systems Compiler Products: Latest Versions
Technology Preview currently available from alphaWorks
XL UPC language support on AIX and Linux
Download: http://www.alphaworks.ibm.com/tech/upccompiler
XL C/C++ for Transactional Memory for AIX
Download: http://www.alphaworks.ibm.com/tech/xlcstm
The Power Systems Compiler Products: Future Versions
Cell/B.E. cross compilers:
XL C/C++ for Multicore Acceleration for Linux on Power Systems, V10.1
XL C/C++ for Multicore Acceleration for Linux on x86 Systems, V10.1
POWER7 support
XL C/C++ for AIX, V11.1
XL Fortran for AIX, V13.1
XL C/C++ for Linux, V11.1
XL Fortran for Linux, V13.1
All information subject to change without notice
Common Fortran, C and C++ Features
Linux (SLES and RHEL) and AIX, 32 and 64 bit
Debug support
Debuggers on AIX: TotalView (TotalView Technologies), DDT (Allinea), IBM Debugger and DBX
Debuggers on Linux: TotalView, DDT and GDB
Full support for debugging of OpenMP programs (TotalView)
Snapshot directive for debugging optimized code
Portfolio of optimizing transformations
Instruction path length reduction
Whole program analysis
Loop optimization for parallelism, locality and instruction scheduling
Use profile directed feedback (PDF) in most optimizations
Tuned performance on POWER3, POWER4, POWER5, PPC970, PPC440, PPC450, POWER6 and CELL systems
Optimized OpenMP
IBM XL Compiler Architecture
[Diagram: in the compile step, the C, C++ and Fortran front ends (FE) emit Wcode. At noopt and O2 the Wcode goes to the TOBEY back end; at O3, O4 and O5 it passes through the TPO optimizer (compile step optimization), which emits Wcode+. At O4 and O5, TPO also performs link step optimization over IPA objects and libraries, using PDF info from instrumented runs, and hands partitions to TOBEY. The system linker combines the optimized objects with other objects into an EXE or DLL.]
XL Fortran Roadmap: Strategic Priorities
Superior Customer Service
Continue to work closely with key ISVs and customers in scientific and technical computing industries
Compliance to Language Standards and Industry Specifications
OpenMP API V2.5 (Full) and OpenMP API V3.0 (Partial)
Fortran 77, 90 and 95 standards
Fortran 2003 Standard
Exploitation of Hardware
Committed to maximum performance on POWER4, PPC970, POWER5, POWER6, PPC440, PPC450, CELL and successors
Continue to work very closely with processor design teams
XL Fortran Version 12.1 for AIX/Linux – Summer/Fall 2008
New features since XL Fortran Version 10.1:
Continued rollout of Fortran 2003
Compliant to OpenMP V2.5
Perform subset of loop transformations at -O3 optimization level
Tuned BLAS routines (DGEMM and DGEMV) are included in compiler runtime (libxlopt)
Recognize matrix multiply and replace with call to DGEMM
Runtime check for availability of ESSL
Support for auto-simdization and VMX intrinsics (and data types) on AIX
Inline MASS library functions (math functions)
Partial support for OpenMP V3.0
Fine grain control for -qstrict option
Improved compile/link time
More Interprocedural data reorganization optimizations
XL C/C++ Roadmap: Strategic Priorities
Superior Customer Service
Compliance to Language Standards and Industry Specifications
ANSI / ISO C and C++ Standards
OpenMP API V3.0
Exploitation of Hardware
Committed to maximum performance on POWER4, PPC970, POWER5, PPC440, POWER6, PPC450, CELL and successors
Continue to work very closely with processor design teams
Exploitation of OS and Middleware
Synergies with operating system and middleware ISVs (performance, specialized function)
Committed to AIX Linux affinity strategy and to Linux on pSeries
Reduced Emphasis on Proprietary Tooling
Affinity with GNU toolchain
XL C/C++ Version 10.1 for AIX/Linux – Summer/Fall 2008
New features since XL C/C++ Version 8.0:
Exploit the C99 restrict keyword
Partial compliance to C++ TR1 libraries and Boost 1.34.1
Support for -qtemplatedepth which allows the user to control number of recursive template instantiations allowed by the compiler.
Exploit DFP and VMX on Power6.
Improved inline assembler support
Full support for OpenMP V3.0
Fine grain control for -qstrict option
Improved compile/link time
More Interprocedural data reorganization optimizations
Blue Gene Compilers
XL C/C++ Advanced Edition V8.0 for BG/L and XL Fortran Advanced Edition V10.1 for BG/L
Performance tuning of SPEC2000FP, DDCMD Kernels, NAS 3.2 Serial and sPPM.
Performance tuning of MASS library
Exploit 440D instructions for complex arithmetic
BG/L compiler white paper (Exploiting the Dual FPU in BG/L):
http://www.ibm.com/support/docview.wss?uid=swg27007511
June 2006 PTF (compiler refresh):
Support Blue Gene software release 3
Overall SPEC2000FP faster for 440D than 440
Updated white paper to reflect June 2006 PTF performance improvements
December 2006 PTF (compiler refresh)
Continue to improve 440D performance of benchmarks listed above
Updated white paper to reflect December 2006 PTF performance improvements
Blue Gene Compilers
XL C/C++ Advanced Edition for BG/P, V9.0 and XL Fortran Advanced Edition for BG/P, V11.1
Support for OpenMP, automatic parallelization and dynamic linking
Performance improvements: SIMD and other general optimizations
MASS/MASSV performance improvements
FEN (Front End Node) is SLES10
September 2008 Fortran PTF available:
http://www.ibm.com/support/docview.wss?rs=43&uid=swg24020392
September 2008 C/C++ PTF available:
http://www.ibm.com/support/docview.wss?rs=2239&uid=swg24020391
XL C/C++ Advanced Edition for BG/L, V9.0 and XL Fortran Advanced Edition for BG/L, V11.1
Same code base as BG/P release except FEN is SLES 9
GA is one month after BG/P GA
Cell/B.E. Compilers
Current cross compiler products:
Hosted on RHEL5U1 (Red Hat) and F7 (Fedora)
Hosted on x86 and PPC (separate products)
Support SDK 3.0 interfaces
Targets QS20, QS21 and QS22 Blades
IBM XL C/C++ for Multicore Acceleration for Linux, V9.0
IBM XL Fortran for Multicore Acceleration for Linux, V11.1 (Hosted on PPC only)
Future cross compiler products:
Hosted on RHEL5U2 and F9
Support SDK 3.1 interfaces
User directed single source compiler (using OpenMP)
IBM XL C/C++ for Multicore Acceleration for Linux, V10.1 (Hosted on x86 and PPC)
All information subject to change without notice
XL UPC Compiler Tech preview on alphaWorks
Based on XL C V10.1 compiler
Compiler generated interface to the runtime system is identical for shared and distributed memory implementations
Optimizations take advantage of system architecture knowledge
On AIX
Shared Memory (pthreads)
Distributed (LAPI)
On Linux
Shared Memory (pthreads)
Distributed (LAPI)
On BG/L
BG Message Layer
Using approximately 1000 test scenarios:
GWU UPC test suite
UPC version of NAS benchmarks
Berkeley UPC test suite
MTU UPC test suite
HPC Challenge suite
All information subject to change without notice
HPCC Stream Triad Results
UPC source:
shared [BF] int A[N], B[N], C[N];
upc_forall (i=0; i < N; i++; &A[i])
    A[i] = B[i] + k*C[i];

Naïve translation (branch and runtime calls on every iteration):
shared [BF] int A[N], B[N], C[N];
for (i=0; i < N; i++)
    if (upc_threadof(&A[i]) == MYTHREAD)
        A[i] = B[i] + k*C[i];

Optimized loop (local access):
shared [BF] int A[N], B[N], C[N];
for (i=MYTHREAD*BF; i<N; i+=THREADS*BF)
    for (j=i; j < i+BF; j++)
        A'[j] = B'[j] + k*C'[j];
[Chart, LAPI (distributed): bandwidth (axis labeled MB/s, 0 to 160) vs. threads (0 to 32). Series: MPI, Fully Optimized, No UPC Optimizations. Annotations: 155 GB/s, 138 GB/s, 12%, ~2000X improvement.]
[Chart, SMP: bandwidth (axis 0 to 220000) vs. threads (0 to 64). Series: No UPC Optimizations, Fully Optimized, OpenMP. Annotations: 221 GB/s, 216 GB/s, 3.2 GB/s, 66X improvement.]
OpenMP 3.0: Examples of Task
Recursive algorithm
int fib(int n) {
    int x, y;
    if (n<2) return n;
    {
        #pragma omp task shared(x)
        x=fib(n-1);
        #pragma omp task shared(y)
        y=fib(n-2);
    }
    #pragma omp taskwait
    return x+y;
}
Pointer chasing
#pragma omp parallel
{
    #pragma omp single
    {
        while(p) {
            #pragma omp task
            process(p);
            p=p->next;
        }
    }
}
General -qstrict Suboptions
-qstrict=all -- be strict or -qstrict=none -- relaxed about all changes
-qstrict=precision -- be strict or -qstrict=noprecision -- relaxed about changing precision
-qstrict=exceptions -- be strict or -qstrict=noexceptions -- relaxed about changing exceptions (whether more or less or moved)
[no]exceptions does not control everything that could produce different results, so different exceptions are possible even with -qstrict=noexceptions.
Each general suboption controls decisions itself, and also controls nested suboptions.
General -qstrict Suboptions
-qstrict=ieeefp -- be strict or -qstrict=noieeefp -- relaxed about violating IEEE 754
[no]ieeefp controls individual operations defined by IEEE 754, not how operations interact or are ordered.
Detail suboptions allow controlling specific aspects of [no]ieeefp.
Most operations are affected by multiple detail suboptions, and also by [no]exceptions.
-qstrict=order -- be strict or -qstrict=noorder -- relaxed about operation order
[no]order controls the order between operations, as defined by language semantics, not how each operation is implemented.
Detail suboptions allow controlling specific aspects of [no]order.
Examples
-qstrict=all:nooperationprecision:noreductionorder
- Be strict about everything,
- except about operationprecision, part of what's needed to allow x / loop_constant => x * (1 / loop_constant), which is needed to allow faster MOD (note this allows other changes too),
- and about reductionorder, to allow recognizing dot product and similar reductions and generating faster parallelized code, without allowing other reordering.
Examples
do i = 1, n
  a(i) = b(i) / x
end do

-qstrict=operationprecision:exceptions:zerosigns (divide by x)
        LFL   fp0=x(gr31,0)
        . . .
CL.30:
        LFL   fp2=b[](gr4,8)
        AI    gr3=gr3,8
        STFL  a[](gr3,0)=fp1
        DFL   fp1=fp2,fp0,fcr                =b(i)/x
        AI    gr4=gr4,8
        BCT   ctr=CL.30

-qstrict=nooperationprecision:noexceptions:nozerosigns (multiply by reciprocal of x)
        LFS   fp0=+CONSTANT_AREA(gr5,4)      =1
        . . .
        LFL   fp2=x(gr31,0)
        RCPFL fp0=fp0,fp2,fcr                =1/x
        . . .
CL.30:
        AI    gr3=gr3,8
        LFL   fp2=b[](gr4,8)
        AI    gr4=gr4,8
        STFL  a[](gr3,0)=fp1
        MFL   fp1=fp2,fp0,fcr                =b(i)*(1/x)
        BCT   ctr=CL.30
The IBM Rational C/C++ Café
ibm.com/rational/cafe/community/ccpp
Feature Request
Request for a feature to be supported by our compilers
C/C++ feature request page: http://www.ibm.com/support/docview.wss?uid=swg27005811
Fortran feature request page: http://www.ibm.com/support/docview.wss?uid=swg27005812
Or send e-mail to [email protected]
Documentation
An information center containing the documentation for the XL Fortran V12.1 and XL C/C++ V10.1 versions of the AIX compilers is available at: http://publib.boulder.ibm.com/infocenter/comphelp/v101v121/index.jsp
An information center containing the documentation for the XL Fortran V11.1 and XL C/C++ V9.0 versions of the AIX compilers is available at: http://publib.boulder.ibm.com/infocenter/comphelp/v9v111/index.jsp
Optimization and Programming Guide for XLF V12.1 is now available online at: http://publib.boulder.ibm.com/infocenter/comphelp/v101v121/index.jsp
New whitepaper “Overview of the IBM XL C/C++ and XL Fortran Compiler Family” available at: http://www.ibm.com/support/docview.wss?uid=swg27005175
This information center contains all the html documentation shipped with the compilers. It is completely searchable.
Please send any comments or suggestions on this information center or about the existing C, C++ or Fortran documentation shipped with the products to [email protected].
SPEC2006 FP Comparison Between Power6, Itanium-2 And Core Duo
[Bar chart: per-benchmark difference (axis -100% to 100%) for bwaves, gamess, milc, zeusmp, gromacs, cactusADM, leslie3d, namd, dealII, soplex, povray, calculix, gems, tonto, lbm, wrf, sphinx3 and Overall. Series: P6 vs IT2 (6), P6 vs DUO (7).]
Using base options from spec.org
Overall 11% faster than DUO and 8% faster than IT2
SPEC2006 FP Comparison Between AIX and Linux on Power6
[Bar chart: per-benchmark difference (axis -15% to 50%) for bwaves, gamess, milc, zeusmp, gromacs, cactusADM, leslie3d, namd, dealII, soplex, povray, calculix, gems, tonto, lbm, wrf, sphinx3 and Overall. Series: AIX vs Linux.]
Using peak options from spec.org (positive means AIX is faster than Linux)
Overall < 1%
SPEC2006 FP Comparison Between XL Compilers and GNU Compilers on Power6
[Bar chart: per-benchmark difference (axis 0% to 600%) for bwaves, gamess, milc, zeusmp, leslie3d, namd, soplex, povray, calculix, gems, lbm and sphinx3. Series: XL vs GNU.]
Using peak options with latest XL compilers and GNU compilers V4.2
GNU V4.2 failed to run gromacs, cactus, dealII, tonto and wrf
Blue Gene Compilers: Performance Results
Note that NAS and ddCMD actually improved with 440d, but bars are smaller due to 440 improvements
[Bar chart: 440D speedup (axis 0% to 80%)]
Improvements of -qarch=440d versus -qarch=440 using XL C/C++ V9.0 and XL Fortran V11.1 with -O5 (0709 driver)
OpenMP Scaling on BG/P: NAS OpenMP Benchmarks (Class A)
[Bar chart: OMP speedup on 4 threads with respect to serial (axis 0 to 4.5) for ft, mg, sp, lu, lu-hp, bt, is, ep, cg, ua and Average]
OpenMP Scaling on BG/P: Miscellaneous Codes
[Bar chart: OMP speedup on 4 threads with respect to a single thread (axis 0 to 4.5) for sppm, hycom, umt2k, sphot, wrf, overflow and CPMD]
SPECOMPM2001 Performance – IBM PPC Generations
[Bar chart: SPECmark (axis 0 to 50) for p690 Turbo 32x1.7, p5 570 16*1.9 and p6 hv8 8*4.2]
SPECOMPM2001 Performance – 64-way Competition
[Bar chart: SPECmark (axis 0 to 160) for Unisys ES7000/1 (3.4 GHz Xeon), HP AlphaServer (1.15 GHz Alpha), Fujitsu Primepower (2 GHz SPARC64 V), SGI Altix (1.6 GHz Itanium2) and IBM p5 595 (2.3 GHz POWER5)]
BACKUP SLIDES
History Of Compiler Improvement On Power4
Note: SPEC2000 base options improvements from www.spec.org
Compilers    2001        2002      2003        2004      2005       Compound       CAGR
             V5/V7.1.1   V6/V8.1   V6/V8.1.1   V7/V9.1   V8/V10.1   Over 4 Years   Rate
SpecINT      baseline    21%       0%          3%        7%         34%            7.6%
SpecFLOAT    baseline    12%       5%          18%       5%         46%            9.9%
History Of Compiler Improvement On Power5
Note: SPEC2000 base options improvements from www.spec.org
Compilers    2004       2005       2007       Compound
             V7/V9.1    V8/V10.1   V9/V11.1   Over 3 Years
SpecINT      baseline   4.3%       6.4%       11%
SpecFLOAT    baseline   5.4%       1.8%       7.3%
Installation of Multiple Compiler Versions
Installation of multiple compiler versions is supported
The vacppndi and xlfndi scripts shipped with VisualAge C++ 6.0 and XL Fortran 8.1 and all subsequent releases allow the installation of a given compiler release or update into a non-default directory
The configuration file can be used to direct compilation to a specific version of the compiler
Example: xlf_v8r1 -c foo.f
May direct compilation to use components in a non-default directory
Care must be taken when multiple runtimes are installed on the same machine (details on next slide)
Coexistence of Multiple Compiler Runtimes
Backward compatibility
C, C++ and Fortran runtimes support backward compatibility.
Executables generated by an earlier release of a compiler will work with a later version of the run-time environment.
Concurrent installation
Multiple versions of a compiler and runtime environment can be installed on the same machine
Full support in xlfndi and vacppndi scripts is now available
Limited support for coexistence
LIBPATH must be used to ensure that a compatible runtime version is used with a given executable
Only one runtime version can be used in a given process.
Renaming a compiler library is not allowed.
Take care in statically linking compiler libraries or in the use of dlopen or load.
Details in the compiler FAQ
http://www.ibm.com/software/awdtools/fortran/xlfortran/support/
http://www.ibm.com/software/awdtools/xlcpp/support/
A Unified Simdization Framework
[Diagram. Architecture independent stages: global information gathering (pointer analysis, alignment analysis), general transformation for SIMD (dependence elimination, data layout optimization), and simdization (straightline-code simdization, loop-level simdization). Architecture specific stage: SIMD intrinsic generator, with targets FPU, VMX and CELL. Further labels: Constant Propagation, Idiom Recognition, Diagnostic output.]
Blue Gene Compilers: Performance Results
Overall Improvement with -O5: V8/10.1 GA, PTF1, PTF2 vs. V7/9.1 Compilers
[Bar chart: improvement (axis 0% to 50%) for Spec2000FP, NAS 3.2 Serial, sPPM, ddcmd and uKernels. Series: V8/10.1 GA 440, V8/10.1 GA 440d, V8/10.1 PTF1 440, V8/10.1 PTF1 440d, V8/10.1 PTF2 440, V8/10.1 PTF2 440d.]
Note that NAS and ddCMD actually improved with 440d, but bars are smaller due to 440 improvements
Blue Gene Compilers: Performance Results
Overall Improvement with -O5: -qarch=440d vs. -qarch=440
[Bar chart: improvement (axis -60% to 80%) for Spec2000FP, NAS 3.2 Serial, sPPM, ddcmd and uKernels. Series: V7/9.1, V8/10.1 GA, V8/10.1 PTF1, V8/10.1 PTF2.]