Intel Compilers 9.x on the Intel® Core Duo™ Processor Windows version Intel Software College.
Copyright © 2006, Intel Corporation. All rights reserved.
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.
Objectives
At the successful completion of this module, you will be able to:
• Use key compiler optimization switches
• Optimize software for the Intel® Core™ Duo architecture
• Enhance performance with vectorization and other techniques
Agenda
Introduction
Compiler Switches
Dual Core
Vectorization
Key to optimizing: Intel® Core™ Duo
Exploiting Architectural Power requires Sophisticated Compilers
Optimal use of
• Registers & functional units
• Dual-Core/Multi-processor
• SSE instructions
• Cache architecture
C++ Compatibility with Microsoft
Source and binary compatible with VC 2003 under /Qvc71.
Source and binary compatible with VC 2005 under /Qvc8.
Microsoft* and Intel OpenMP binaries are not compatible.
• Use one compiler for all modules compiled with OpenMP
For more information, refer to the User’s Guide
Use Intel C++ Compiler in Microsoft IDE
Agenda
Introduction
Compiler Switches
• Intel® C++ compiler
Dual Core
Vectorization
General Optimizations
Windows*  Linux*  Mac*  Description
/Od       -O0     -O0   Disables optimizations
/Zi       -g      -g    Creates symbols
/O1       -O1     -O1   Optimizes for binary size: server code
/O2       -O2     -O2   Optimizes for speed (default)
/O3       -O3     -O3   Optimizes for data cache: loopy floating-point code
Multi-pass Optimization: Interprocedural Optimizations (IPO)
ip: Enables interprocedural optimizations for single-file compilation
ipo: Enables interprocedural optimizations across files
Can inline functions in separate files
Enhances optimization when used in combination with other compiler features
Windows* Linux* Mac*
/Qip -ip -ip
/Qipo -ipo -ipo
Multi-pass Optimization - IPO Usage: Two-Step Process
Pass 1: Compiling (produces virtual object files)
Windows*  icl -c /Qipo main.c func1.c func2.c
Linux*    icc -c -ipo main.c func1.c func2.c
Mac*      icc -c -ipo main.c func1.c func2.c
Pass 2: Linking (produces the executable)
Windows*  icl /Qipo main.obj func1.obj func2.obj
Linux*    icc -ipo main.o func1.o func2.o
Mac*      icc -ipo main.o func1.o func2.o
Profile Guided Optimizations (PGO)
Use execution-time feedback to guide many other compiler optimizations
Helps I-cache, paging, branch-prediction
Enabled optimizations:
• Basic block ordering
• Better register allocation
• Better decision of functions to inline
• Function ordering
• Switch-statement optimization
• Better vectorization decisions
Multi-pass Optimization - PGO: Three-Step Process
Step 1: Instrumented compilation
(Mac*/Linux*) icc -prof_gen[x] prog.c
(Windows*) icl -Qprof_gen[x] prog.c
Result: instrumented executable
Step 2: Instrumented execution
Run program on a typical dataset
Result: DYN file containing dynamic info: .dyn
Step 3: Feedback compilation
(Mac*/Linux*) icc -prof_use prog.c
(Windows*) icl -Qprof_use prog.c
Merged DYN summary file: .dpi
Delete old .dyn files if you do not want their info included
Agenda
Introduction
Compiler Switches
Dual Core
• Auto Parallelization
• OpenMP
• Threading Diagnostics
Vectorization
Auto-parallelization
Auto-parallelization: Automatic threading of loops without having to manually insert OpenMP* directives.
• Compiler can identify “easy” candidates for parallelization, but large applications are difficult to analyze.
Windows* Linux* Mac*
/Qparallel -parallel -parallel
/Qpar_report[n] -par_report[n] -par_report[n]
OpenMP* Threading Technology
Pragma-based approach to parallelism
Usage:
OpenMP switches: -openmp : /Qopenmp
OpenMP reports: -openmp-report : /Qopenmp-report
#pragma omp parallel for
for (i=0;i<MAX;i++)
  A[i]= c*A[i] + B[i];
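The pragma above can be placed in a complete function as follows. This is a minimal sketch: the function name `scale_add` and its parameters are illustrative and do not come from the slides. If the OpenMP switch is not given, the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes A[i] = c*A[i] + B[i] for i in [0, n).
   Iterations are independent, so they can be divided among threads. */
void scale_add(float *A, const float *B, float c, int n)
{
    int i;
    assert(A != NULL && B != NULL && n >= 0);

#pragma omp parallel for
    for (i = 0; i < n; i++)
        A[i] = c * A[i] + B[i];
}
```

The loop index is declared outside the loop, as in older C dialects; OpenMP makes the index of a `parallel for` loop private to each thread automatically.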
OpenMP: Workqueuing Extension Example
Intel Compiler’s Workqueuing extension
• Create queue of tasks
Works on:
• Recursive functions
• Linked lists, etc.
#pragma intel omp parallel taskq shared(p)
{
  while (p != NULL) {
#pragma intel omp task captureprivate(p)
    do_work1(p);
    p = p->next;
  }
}
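The workqueuing pragmas are an Intel-specific extension. For comparison only, the same linked-list traversal can be sketched with the task construct of standard OpenMP 3.0 and later, which is a different mechanism from the one on the slide. The node type, the body of `do_work1`, and the name `process_list` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical list node; the slides do not define one. */
struct node {
    int value;
    struct node *next;
};

/* Hypothetical unit of work: each task touches only its own node. */
static void do_work1(struct node *p)
{
    p->value *= 2;
}

/* One thread walks the list and creates a task per node;
   the other threads in the team pick up and run the tasks. */
void process_list(struct node *head)
{
#pragma omp parallel
    {
#pragma omp single
        {
            struct node *p = head;
            while (p != NULL) {
#pragma omp task firstprivate(p)
                do_work1(p);
                p = p->next;
            }
        }
    }
}
```

Here `firstprivate(p)` plays the role that `captureprivate(p)` plays in the slide: each task gets its own copy of the pointer as it was when the task was created.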
Parallel Diagnostics
Source Instrumentation for Intel Thread Checker
• Allows thread checker to diagnose threading correctness bugs
• To use -tcheck or /Qtcheck you must have Intel Thread Checker installed
• See the Thread Checker documentation:
• http://www.intel.com/support/performancetools/sb/CS-009681.htm
Windows* Linux* Mac*
/Qtcheck -tcheck No support
Agenda
Introduction
Compiler Switches
Dual Core
Vectorization
• SSE & Vectorization
• Vectorization Reports
• Explanations of a few specific vectorization inhibitors
SIMD – SSE, SSE2, SSE3 Support
Figure: data types packed into one 128-bit SSE register: 16x bytes, 8x words, 4x dwords, 2x qwords, 1x dqword, 4x floats, 2x doubles. Instruction sets shown: MMX*, SSE, SSE2, SSE3.
* MMX actually used the x87 floating-point registers; SSE, SSE2, and SSE3 use the new SSE registers
SSE3 Instructions
SIMD FP using AOS format*:  HADDPD, HSUBPD, HADDPS, HSUBPS
Thread synchronization:     MONITOR, MWAIT
Video encoding:             LDDQU
Complex arithmetic:         ADDSUBPD, ADDSUBPS, MOVDDUP, MOVSHDUP, MOVSLDUP
FP to integer conversions:  FISTTP
* Also benefits Complex and Vectorization
Using SSE3 - Your Task: Convert This…
Figure (128-bit registers): in scalar code each add uses a single slot of a register, for example A[0] + B[0] = C[0] and then A[1] + B[1] = C[1]; the remaining slots are not used.
for (i=0;i<=MAX;i++) c[i]=a[i]+b[i];
… Into This …
Figure (128-bit registers): the registers hold packed elements A[3] A[2] A[1] A[0] and B[3] B[2] B[1] B[0], and packed adds produce C[3] C[2] C[1] C[0].
for (i=0;i<=MAX;i++) c[i]=a[i]+b[i];
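What the vectorizer does for this loop can be written out by hand with SSE intrinsics. The sketch below assumes float arrays and a hypothetical function name `add_arrays`; it adds four elements per 128-bit register with the SSE packed add and finishes any leftover elements with a scalar loop. On a target without SSE it falls back to the plain loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#define USE_SSE 1
#else
#define USE_SSE 0
#endif

/* Hypothetical helper: c[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(float *c, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    assert(a != NULL && b != NULL && c != NULL);

#if USE_SSE
    /* Four floats per iteration; unaligned loads and stores are used
       because nothing is assumed about the alignment of the arrays. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb));
    }
#endif
    /* Scalar remainder (or the whole loop when SSE is unavailable). */
    for (; i < n; i++)
        c[i] = a[i] + b[i];
}
```

In practice the point of the compiler switches in this module is that this rewrite is not needed by hand; the intrinsics version only makes visible what the generated code looks like.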
Compiler Based Vectorization: Processor Specific
Use  Windows*     Linux*     Mac*                             Description
W    /QxW         -xW        Does not apply                   Generate instructions and optimize for Intel® Pentium® 4 compatible processors, including MMX, SSE and SSE2.
P    /QxP, /QaxP  -xP, -axP  Vectorization occurs by default  Generate instructions and optimize for Intel® processors with SSE3 capability, including Core Duo. These processors support SSE3 as well as MMX, SSE and SSE2.
Compiler Based Vectorization: Automatic Processor Dispatch (ax[?])
Single executable
• Contains code optimized for Intel® Core Duo processors plus generic code that runs on all IA-32 processors.
For each target processor it uses:
• Processor-specific instructions
• Vectorization
Low overhead
• Some increase in code size
Why Loops Don’t Vectorize
Independence
• Loop Iterations generally must be independent
Some relevant qualifiers:
• Some dependent loops can be vectorized.
• Most function calls cannot be vectorized.
• Some conditional branches prevent vectorization.
• Loops must be countable.
• Outer loop of nest cannot be vectorized.
• Mixed data types cannot be vectorized.
Why Didn’t My Loop Vectorize?
Windows*       Linux*        Mac*
-Qvec_reportn  -vec_reportn  -vec_reportn
Sets the diagnostic level written to stdout:
n=0: No diagnostic information
n=1: (Default) Loops successfully vectorized
n=2: Loops not vectorized – and the reason why not
n=3: Adds dependency Information
n=4: Reports only non-vectorized loops
n=5: Reports only non-vectorized loops and adds dependency info
Why Loops Don’t Vectorize
• “Existence of vector dependence”
• “Nonunit stride used”
• “Mixed Data Types”
• “Unsupported Loop Structure”
• “Contains unvectorizable statement at line XX”
• There are more reasons loops don’t vectorize, but we will discuss the reasons above
“Existence of Vector Dependency”
Usually indicates a real dependency between iterations of the loop, as shown here, where each iteration reads the value written by the previous one:
for (i = 1; i < 100; i++) x[i] = A * x[i - 1];
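Why a loop-carried dependence blocks vectorization can be seen by emulating vector execution order in plain C. This is a minimal sketch with hypothetical function names, assuming a loop in which each iteration reads the element stored by the previous iteration (`x[i] = a * x[i - 1]`). A vector instruction loads all of its inputs before storing any result, which the second function imitates for a width of four.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar order: iteration i sees the value stored by iteration i-1. */
void chain_scalar(float *x, size_t n, float a)
{
    assert(x != NULL);
    for (size_t i = 1; i < n; i++)
        x[i] = a * x[i - 1];
}

/* Emulated 4-wide vector order: all four loads happen before any store,
   so three of the four inputs are stale values. */
void chain_vector_order(float *x, size_t n, float a)
{
    size_t i = 1;
    assert(x != NULL);
    for (; i + 4 <= n; i += 4) {
        float t0 = x[i - 1], t1 = x[i], t2 = x[i + 1], t3 = x[i + 2];
        x[i]     = a * t0;
        x[i + 1] = a * t1;
        x[i + 2] = a * t2;
        x[i + 3] = a * t3;
    }
    for (; i < n; i++)   /* scalar remainder */
        x[i] = a * x[i - 1];
}
```

With x = {1, 1, 1, 1, 1} and a = 2, the scalar version yields 1, 2, 4, 8, 16, while the emulated vector order yields 1, 2, 2, 2, 2. Because the results differ, the compiler must refuse to vectorize such a loop.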
Defining Loop Independence
Iteration Y of a loop is independent of when (or whether) iteration X occurs.
int a[MAX], b[MAX];
for (j=0;j<MAX;j++) {
a[j] = b[j];
}
“Nonunit stride used”
for (I=0;I<=MAX;I++)
  for (J=0;J<=MAX;J++) {
    c[I][J]+=1;   // Unit stride
    c[J][I]+=1;   // Non-unit stride
    A[J*J]+=1;    // Non-unit stride
    A[B[J]]+=1;   // Non-unit stride
    if (A[MAX-J]==1) last1=J;   // Non-unit stride
  }
End result: loading the vector from memory may take more cycles than executing the operation sequentially.
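For a two-dimensional C array, the index that the inner loop varies decides the stride. A minimal sketch, assuming a hypothetical N x N int array and hypothetical function names: both functions increment every element, but only the first walks memory contiguously in its inner loop.

```c
#include <assert.h>

#define N 4   /* hypothetical array dimension */

/* Inner loop varies the last index: consecutive addresses (unit stride). */
void bump_row_order(int c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[i][j] += 1;
}

/* Inner loop varies the first index: each step jumps N elements
   ahead in memory (non-unit stride). */
void bump_column_order(int c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            c[j][i] += 1;
}
```

When a report flags non-unit stride for a nest like the second one, interchanging the two loops is often enough to restore unit stride, provided the loop body allows the reordering.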
“Mixed Data Types”
An example:
int howmany_close(double *x, double *y)
{
  int withinborder=0;
  double dist;
  for(int i=0;i<MAX;i++) {
    dist=sqrtf(x[i]*x[i] + y[i]*y[i]);  // float sqrtf applied to double data
    if (dist<5) withinborder++;
  }
  return withinborder;
}
Mixed data types are possible, but complicate things
• e.g.: 2 doubles vs 4 ints per SIMD register
Some operations with specific data types won’t work
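One way to reduce the type mixing in this example is sketched below, assuming double precision is what is wanted. It keeps all floating-point work in double, compares the squared distance against 25.0 so that no square-root call is needed, and takes the element count as a parameter `n` (an assumption; the slide uses MAX). Whether a given compiler version then vectorizes the loop is not guaranteed; the vectorization report is the way to check.

```c
#include <assert.h>
#include <stddef.h>

/* Counts points (x[i], y[i]) whose distance from the origin is below 5.
   Comparing squared distances avoids the sqrt call and the float/double mix. */
int howmany_close(const double *x, const double *y, int n)
{
    int withinborder = 0;
    assert(x != NULL && y != NULL);

    for (int i = 0; i < n; i++) {
        double dist2 = x[i] * x[i] + y[i] * y[i];   /* squared distance */
        if (dist2 < 25.0)
            withinborder++;
    }
    return withinborder;
}
```

The integer counter remains, so the loop still combines double data with an int reduction; the sketch removes only the float and double mismatch and the function call.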
“Unsupported Loop Structure”
Example:
struct _xx {
  int data;
  int bound;
};
void doit1(int *a, struct _xx *x) {
  for (int i=0; i<x->bound; i++) a[i] = 0;
}
An unsupported loop structure means the loop is not countable, or the compiler cannot construct a run-time expression for the trip count. Here the store to a[i] could modify x->bound, so the compiler cannot assume the bound stays fixed while the loop runs.
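A common remedy, sketched here under the hypothetical name `doit1_fixed`, is to copy the bound into a local variable before the loop so that the trip count is fixed on entry. This assumes the caller does not rely on stores through `a` changing `x->bound` while the loop runs.

```c
#include <assert.h>
#include <stddef.h>

struct _xx {
    int data;
    int bound;
};

/* Same work as doit1, but the bound is read exactly once. */
void doit1_fixed(int *a, const struct _xx *x)
{
    assert(a != NULL && x != NULL);
    const int n = x->bound;   /* loop is now countable: n cannot change */
    for (int i = 0; i < n; i++)
        a[i] = 0;
}
```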
“Contains unvectorizable statement”
for (i=1;i<nx;i++) {
B[i] = func(A[i]); }
Figure (128-bit registers): func is applied separately to each of A[0], A[1], A[2] and A[3] to produce B[0], B[1], B[2] and B[3].
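Whether a call blocks vectorization often depends on whether the compiler can see and inline the callee. A minimal sketch, assuming a hypothetical `func` that is simple enough to inline: defining it in the same file, or compiling with interprocedural optimization, gives the compiler the chance to replace the call with straight-line arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee: small, no side effects, visible to the caller. */
static inline float func(float v)
{
    return 2.0f * v + 1.0f;
}

/* Hypothetical wrapper around the loop from the slide. */
void apply_func(float *B, const float *A, int nx)
{
    assert(A != NULL && B != NULL);
    for (int i = 1; i < nx; i++)   /* starts at 1, as on the slide */
        B[i] = func(A[i]);
}
```

A callee with side effects, or one whose body lives in a library the compiler cannot see, would still be reported as an unvectorizable statement.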
Reference
Web-based and classroom training
• www.intel.com/software/college
White papers and technical notes
• www.intel.com/ids
• www.intel.com/software/products
Product support resources
• www.intel.com/software/products/support
Activity 1 - raytrace2: Initial Compilation
Set up environment and compile with both Microsoft* Visual C++ .NET (MSVC*) and Intel® C++ Compiler (icl)
Activity 2 - raytrace2: O3 Compilation
Use Intel compiler’s High Level Optimizer (-O3) for loop-centric code
Activity 3 - raytrace2: IPO Compilation
Use Intel compiler’s Inter-procedural Optimization (-Qipo)
Activity 4 - raytrace2: PGO Compilation
Use Intel compiler’s Profile-guided Optimization
Activity 5 - raytrace2: Vectorization
Use Intel compiler’s Vectorization optimization (-QxP)
Activity 6 - raytrace2: Putting it all together
Use all previous optimizations in tandem (-O3, -QxP, IPO and PGO)