Intro_To_OpenMP_Mattson.pdf
Transcript of Intro_To_OpenMP_Mattson.pdf
-
7/26/2019 Intro_To_OpenMP_Mattson.pdf
1/200
A "Hands-on" Introduction to OpenMP*
* The name "OpenMP" is the property of the OpenMP Architecture Review Board.
Tim Mattson
Intel Corp.
timothy.g.mattson@intel.com

Introd

Acknowledgements
This co

Preliminaries:
Active learning! We will mix short lect

traverse linked lists
Unit 4: a few advanced OpenMP topics
Mod 8: Tasks (linked lists the easy way)
Disc 7: Understanding Tasks
Mod 8: The scary stuff ... Memory model, atomics, and flush (pairwise synch).
Disc 8: The pitfalls of pairwise synchronization
Mod 9: Threadprivate Data and how to s

Moore's Law
Slide source: UCB CS 194 Fall 2010
In 1965, Intel co-fo

Consequences of Moore's law

The Hardware/Software contract
Write yo

Computer architecture and the power wall
[Figure: Power plotted against Scalar Performance for i486, Pentium, Pentium Pro, Pentium 4 (Wmt) and Pentium 4 (Psc), with a power-law fit of power against performance. Growth in power is unsustainable.]
Source: E. Grochowski of Intel

partial solution: simple low power cores
[Figure: the same plot of Power against Scalar Performance for i486, Pentium, Pentium Pro, Pentium 4 (Wmt) and Pentium 4 (Psc).]
CPUs with shallow pipelines use less power
Source: E. Grochowski of Intel

[Figure: one Processor running at frequency f between Input and Output, next to two Processors each running at f/2 between the same Input and Output.]
One processor at f:
Capacitance = C
Voltage = V
Frequency = f
Power = CV²f
Two processors at f/2:
Capacitance = 2.2C
Voltage = 0.6V
Frequency = 0.5f
Power = 0.396CV²f
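The second power figure follows from the first by direct substitution. The short derivation below uses only the values quoted on the slide; the subscripts 1 and 2 are labels introduced here for the one-processor and two-processor configurations.

```latex
P_1 = C V^2 f
\qquad
P_2 = (2.2\,C)\,(0.6\,V)^2\,(0.5\,f)
    = 2.2 \times 0.36 \times 0.5 \; C V^2 f
    = 0.396\, C V^2 f
```

Under these numbers the two slower processors deliver the same aggregate clock rate for roughly 40% of the power.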

Microprocessor trends
Individual processors are many core (and often heterogeneous) processors.
[Figure: example chips with the labels 80 cores; 30 cores, 8 wide SIMD; 1 CPU + 6 cores; 10 cores, 16 wide SIMD; 48 cores; 4 cores; 4 cores. Chips named: IBM Cell, NVIDIA Tesla C1060, Intel SCC Processor, AMD ATI RV770, ARM MPCORE, Intel Xeon processor.]
Source: OpenCL tutorial, Gaster, Howes, Mattson, and Lokhmotov, HiPEAC 2011
3rd party names are the property of their owners.

The result
A new contract ... HW people will do what's natural for them (lots of simple cores) and SW people will have to adapt (rewrite everything).
The problem is this was presented as an ultimatum ... nobody asked us if we were OK with this new contract ... which is kind of rude.

Concurrency vs. Parallelism
Two important definitions:
Conc
[Figure labels: Concurrent Programs; Parallel Programs]
Figure from "An Introduction to Concurrency in Programming Languages" by J. Sottile, Timothy G. Mattson, and Craig E Rasmussen, 2010

Concurrent vs. Parallel applications
Parallel application: An application for which the computations actually execute simultaneously in order to complete a problem in less time.
The problem doesn't inherently require concurrency ... you can state it sequentially.
Concurrent application: An application for which computations logically execute simultaneously due to the semantics of the application.
The problem is fundamentally concurrent.

The Parallel programming process:
[Figure labels: Original Problem; Find Concurrency; Tasks, shared and local data; Implementation strategy; Units of execution + new shared data for extracted dependencies; Corresponding source code. The figure repeats the following pseudocode once per unit of execution.]
  Program SPMD_Emb_Par ()
  {
    TYPE *tmp, *func();
    global_array Data(TYPE);
    global_array Res(TYPE);
    int Num = get_num_procs();
    int id = get_proc_id();
    if (id==0) setup_problem(N, Data);
    for (int I= ID; I<N;I=I+Num){
      tmp = func(I, Data);
      Res.accumulate( tmp);
    }
  }

OpenMP* Overview:
  omp_set_lock(lck)
  #pragma omp parallel for private(A, B)
  #pragma omp critical
  C$OMP parallel do shared(a, b, c)
  C$OMP

OpenMP Basic Defs: Sol

OpenMP core syntax
Most of the constr

Compiler notes: Intel on Windows
La
environment
cd to the directory that holds yo

Compiler notes: Vis

Compiler notes: Other
Lin

Shared memory computers
Shared memory computer: any computer composed of multiple processing elements that share an address space. Two Classes:
Symmetric multiprocessor (SMP): a shared address space with "equal-time" access for each processor, and the OS treats every processor the same way.
Non Uniform address space multiprocessor (NUMA): different memory regions have different access costs ... think of memory segmented into "Near" and "Far" memory.
[Figure: Proc1, Proc2, Proc3, ..., ProcN attached to a Shared Address Space]

Shared memory machines: SMP
Cray-2 ... the last large scale SMP computer. Released in 1985 with 4 "heads", 1.9 GFLOPS peak performance (fastest supercomputer in the world until 1990).
The vector units in each "head" had equal-time access to the memory organized into banks to support high-bandwidth parallel memory access.
Third party names are the property of their owners.

Shared memory machines: SMP
6 cores, 2-way multithreaded, 6-wide superscalar, quad-issue, 4-wide SIMD (on 3 of 6 pipelines)
4.5 KB (6 x 768 B) "Architectural" Registers, 192 KB (6 x 32 KB) L1 Cache, 1.5 MB (6 x 256 KB) L2 cache, 12 MB L3 Cache
MESI Cache Coherence, Processor Consistency Model
1.17 Billion Transistors on 32 nm process @ 2.6 GHz
[Figure: six L2 caches, one L3 cache and a Memory Controller]
Intel Core i7-970
Cache hierarchy means different processors have different costs to access different address ranges ... It's NUMA

Shared memory computers
Shared memory computers are everywhere ... most laptops and servers have multicore multiprocessor CPUs
The shared address space and (as we will see) programming models encourage us to think of them as SMP systems.
Reality is more complex ... any multiprocessor CPU with a cache is a NUMA system. Start out by treating the system as an SMP and just accept that much of your optimization work will address cases where that case breaks down.
[Figure: Proc1, Proc2, Proc3, ..., ProcN attached to a Shared Address Space]

Programming shared memory computers
Process
An instance of a program execution.
The execution context of a running program ... i.e. the resources associated with a program's execution.
[Figure: a process's memory (Stack, text, data, heap) holding funcA() with var1 and var2, main(), funcA(), funcB(), array1 and array2, next to its Process ID, User ID, Group ID, Files, Locks, Sockets, Stack Pointer, Program Counter and Registers]

Programming shared memory computers
Threads:
Threads are "light weight processes"
Threads share Process state among multiple threads ... this greatly reduces the cost of switching context.
[Figure: the same process with Thread 0 and Thread 1. Each thread has its own Stack, Stack Pointer, Program Counter and Registers; text, data, heap, Process ID, User ID, Group ID, Files, Locks and Sockets belong to the process.]

A shared memory program
An instance of a program:
One process and lots of threads.
Threads interact through reads/writes to a shared address space.
OS scheduler decides when to run which threads ... interleaved for fairness.
Synchronization to assure every legal order results in correct results.
[Figure: five threads, each with Private memory, around a Shared Address Space]

Exercise 1: Solution
  #include <stdio.h>
  #include <omp.h>
  int main()
  {
    #pragma omp parallel
    {
      int ID = omp_get_thread_num();
      printf(" hello(%d) ", ID);
      printf(" world(%d) \n", ID);
    }
  }

OpenMP Overview: How do threads interact?
OpenMP is a m

OpenMP Programming Model:
Fork-Join Parallelism:
Master thread spawns a team of threads as needed.
Parallelism added incrementally until performance goals are met: i.e. the sequential program evolves into a parallel program.
[Figure: fork-join diagram labelled Sequential Parts, Parallel Regions, A Nested Parallel region, Master Thread in red]

Thread Creation: Parallel Regions
You create threads in OpenMP* with the parallel constr

Thread Creation: Parallel Regions
Each thread executes the same code redundantly.
  double A[1000];
  #pragma omp parallel num_threads(4)
  {
    int ID = omp_get_thread_num();
    pooh(ID, A);
  }
  printf("all done\n");
[Figure: double A[1000]; omp_set_num_threads(4); then pooh(0,A), pooh(1,A), pooh(2,A) and pooh(3,A) side by side, followed by printf("all done\n");]
A single copy of A is shared between all threads.
Threads wait here for all threads to finish before proceeding (i.e. a barrier)

OpenMP: what the compiler does
  #pragma omp parallel num_threads(4)
  {
    foobar ();
  }

  void thunk ()
  {
    foobar ();
  }

  pthread_t tid[4];
  for (int i = 1; i < 4; ++i)
    pthread_create (
      &tid[i],0,thunk

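The parallel-region slide calls a function pooh() that the deck never defines. The sketch below is one way to make that example self-contained; the body of pooh(), the names run_region and MAX_THREADS, and the `_OPENMP` guards (which let the file also build without OpenMP) are assumptions added for illustration, not part of the slides.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 4

/* Hypothetical stand-in for the slide's pooh(): each thread marks its own slot. */
static void pooh(int id, double *a)
{
    a[id] = 100.0 + id;
}

/* Runs one parallel region and returns the size of the team that executed it. */
int run_region(double *a)
{
    int team = 1;                 /* shared: declared outside the region */
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int id = 0;               /* private: declared inside the region */
#ifdef _OPENMP
        id = omp_get_thread_num();
        if (id == 0)
            team = omp_get_num_threads();
#endif
        pooh(id, a);              /* every thread runs the same code on shared a */
    }                             /* implicit barrier: all threads finish here */
    return team;
}
```

Calling run_region with an array of MAX_THREADS doubles fills one slot per thread; the runtime may hand out fewer threads than requested, which is why the team size is returned rather than assumed.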
Mathematically, we know that:
We can approximate the integral as a sum of rectangles:
Where each rectangle has width Δx and height F(xi) at the middle of interval i.
[Figure: plot of the integrand for x between 0 and 1; vertical axis ticks 4.0, 2.0, 1.0, 0.0]

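The formulas that accompanied this slide did not survive extraction. A reconstruction, assuming the integrand used by the pi programs in this deck (the loop body `sum + 4.0/(1.0+x*x)` with `x = (i+0.5)*step`), is given below; N, x_i and Δx are notation chosen here.

```latex
\int_0^1 \frac{4}{1+x^2}\,dx = \pi
\qquad
\pi \approx \sum_{i=0}^{N-1} F(x_i)\,\Delta x,
\quad F(x) = \frac{4}{1+x^2},
\quad \Delta x = \frac{1}{N},
\quad x_i = (i + 0.5)\,\Delta x
```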
Exercises 2 to 4: Serial PI Program
static long n

Exercise 2
Create a parallel version of the pi program

Serial PI Program
static long n
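The serial program is cut off after its first declaration. A minimal sketch of the midpoint-rule computation it performs is shown below, written as a function so the step count can be varied; the name serial_pi is an assumption, while the loop body follows the fragments of the pi program that appear elsewhere in the deck.

```c
/* Midpoint-rule estimate of pi using num_steps rectangles on [0, 1]. */
double serial_pi(long num_steps)
{
    double step = 1.0 / (double)num_steps;   /* width of each rectangle */
    double sum = 0.0;

    for (long i = 0; i < num_steps; i++) {
        double x = (i + 0.5) * step;         /* midpoint of interval i */
        sum += 4.0 / (1.0 + x * x);          /* height of the rectangle */
    }
    return step * sum;
}
```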

  #include <omp.h>
  void main ()
  { int i, nthreads; double pi, sum[NUM_THREADS];
    step = 1

Algorithm strategy: The SPMD (Single Program Multiple Data) design pattern
Run the same program on P processing elements where P can be arbitrarily large.
Use the rank ... an ID ranging from 0 to (P-1) ... to select between a set of tasks and to manage any shared data structures.
This pattern is very general and has been used to support most (if not all) the algorithm strategy patterns.
MPI programs almost always use this pattern ... it is probably the most commonly used pattern in the history of parallel programming.

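As a sketch of the SPMD pattern applied to the pi program: every thread runs the same code, uses its ID to pick a cyclic share of the iterations, and writes its partial result into its own slot of a shared array. The function name spmd_pi and the `_OPENMP` guards are assumptions; the structure follows the truncated solution above.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define NUM_THREADS 4

double spmd_pi(long num_steps)
{
    double sum[NUM_THREADS] = {0.0};   /* one accumulator per thread (scalar promoted to array) */
    double step = 1.0 / (double)num_steps;
    int nthreads = 1;                  /* team size actually granted by the runtime */

    #pragma omp parallel num_threads(NUM_THREADS)
    {
        int id = 0, nthrds = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
#endif
        if (id == 0)
            nthreads = nthrds;         /* only one thread records the team size */

        /* Cyclic distribution: thread id handles i = id, id+nthrds, ... */
        for (long i = id; i < num_steps; i += nthrds) {
            double x = (i + 0.5) * step;
            sum[id] += 4.0 / (1.0 + x * x);
        }
    }

    double pi = 0.0;
    for (int i = 0; i < nthreads; i++)
        pi += sum[i] * step;
    return pi;
}
```

The adjacent elements of sum are the ones that later turn out to share a cache line.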
Results*
Original Serial pi program with 100000000 steps ran in 1.83 seconds.
threads   1st SPMD
1         1.86
2         1.03
3         1.08
4         0.97
*Intel compiler (icpc) with no optimization on Apple OS X 10.x with a dual core (four HW thread) Intel Core i5 processor at 1.7 Ghz and 4 Gbyte DDR3 memory at 1.333 Ghz.

Why such poor scaling? False sharing
If independent data elements happen to sit on the same cache line, each update will cause the cache lines to "slosh back and forth" between threads ... This is called "false sharing".
If you promote scalars to an array to support creation of an SPMD program, the array elements are contiguous in memory and hence share cache lines ... Results in poor scalability.
Solution: Pad arrays so elements you use are on distinct cache lines.

  #include <omp.h>
  void main ()
  { int i, nthreads; double pi, sum[NUM_THREADS][PAD];
    step = 1
    #pragma omp parallel
    { int i, id,nthrds;
      double x;
      id = omp_get_thread_num();
      nthrds = omp_get_num_threads();
      if (id == 0) nthreads = nthrds;
      for (i=id, sum[id]=0

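The padded version is truncated above, so here is a complete sketch of the same idea. It assumes a 64-byte cache line (hence PAD = 8 doubles); that figure, the function name padded_pi, and the `_OPENMP` guards are assumptions and would need adjusting for other machines.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define NUM_THREADS 4
#define PAD 8   /* assumed: 8 doubles = 64 bytes, one cache line */

double padded_pi(long num_steps)
{
    /* Each thread uses only sum[id][0]; the other PAD-1 entries are spacing
       that keeps different threads' accumulators on different cache lines. */
    double sum[NUM_THREADS][PAD] = {{0.0}};
    double step = 1.0 / (double)num_steps;
    int nthreads = 1;

    #pragma omp parallel num_threads(NUM_THREADS)
    {
        int id = 0, nthrds = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
#endif
        if (id == 0)
            nthreads = nthrds;

        for (long i = id; i < num_steps; i += nthrds) {
            double x = (i + 0.5) * step;
            sum[id][0] += 4.0 / (1.0 + x * x);
        }
    }

    double pi = 0.0;
    for (int i = 0; i < nthreads; i++)
        pi += sum[i][0] * step;
    return pi;
}
```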
Results*
threads   1st SPMD   1st SPMD padded
1         1.86       1.86
2         1.03       1.01
3         1.08       0.69
4         0.97       0.53

Do we really need to pad our arrays?
Padding arrays requires deep knowledge of the cache architecture. Move to a machine with different sized cache lines and your software performance falls apart.
There has got to be a better way to deal with false sharing.

Synchronization
The two most common forms of synchronization are:
Barrier: each thread waits at the barrier until all threads arrive.

Synchronization
High level synchronization:
  critical
  atomic
  barrier
  ordered
Low level synchronization (discussed later)
Synchronization is used to impose order constraints and to protect access to shared data

Synchronization: Barrier
Barrier: Each thread waits until all threads arrive.
  #pragma omp parallel
  {
    int id=omp_get_thread_n

Synchronization: critical

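The barrier example on the slide is cut off after its first statement. The sketch below shows the usual reason a barrier is needed: a second phase that reads values other threads wrote in the first phase. The function two_phase, the array size N and the arithmetic are illustrative assumptions.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define N 8

/* Phase 1 fills a[]; phase 2 reads a neighbour's entry, so it must not start early. */
void two_phase(int *a, int *b)
{
    #pragma omp parallel
    {
        int id = 0, nthrds = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
#endif
        for (int i = id; i < N; i += nthrds)
            a[i] = i * i;                      /* phase 1: each thread writes its own entries */

        #pragma omp barrier                    /* wait until every entry of a[] is written */

        for (int i = id; i < N; i += nthrds)
            b[i] = a[i] + a[(i + 1) % N];      /* phase 2: reads an entry another thread may own */
    }
}
```

Without the barrier, a fast thread could read a[(i + 1) % N] before its owner had written it.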
Atomic provides mutual exclusion but only applies to the update of a memory location (the update of X in the following example)
  #pragma omp parallel
  {
    double tmp, B;
    B = DOIT();
    tmp = big_ugly(B);
    #pragma omp atomic
    X += tmp;
  }
Additional forms of atomic were added in OpenMP 3.1. We will discuss these later.
The statement inside the atomic must be one of the following forms:
  x binop= expr
  x++   ++x   x--   --x
X is an lvalue of scalar type and binop is a non-overloaded built in operator.

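DOIT() and big_ugly() are placeholders on the slide. A runnable sketch of the same shape, with squaring standing in for the expensive computation (an assumption made so the result can be checked), looks like this:

```c
/* Sums i*i for i = 1..n; the expensive part runs unprotected, only the update is atomic. */
long atomic_sum(int n)
{
    long x = 0;                       /* shared accumulator */

    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        long tmp = (long)i * i;       /* stand-in for big_ugly(): done in parallel */
        #pragma omp atomic
        x += tmp;                     /* "x binop= expr" form: one protected update */
    }
    return x;
}
```

Only the single update statement is serialized; everything computed into tmp beforehand proceeds concurrently.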
Exercise 3
In exercise 2, you probably

Recall that promoting sum to an array made the coding easy, but led to false sharing and poor performance.
Example: Using a critical section to remove impact of false sharing

  #include <omp.h>
  void main ()
  { double pi; step = 1

Results*
threads   1st SPMD   1st SPMD padded   SPMD critical
1         1.86       1.86              1.87
2         1.03       1.01              1.00
3         1.08       0.69              0.68
4         0.97       0.53              0.53

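Only the opening lines of the critical-section program survive. The following is a sketch of the approach the slide title describes: each thread accumulates into a private scalar, so there is no shared array to suffer false sharing, and a critical section protects the single update of the shared result. The function name critical_pi and the `_OPENMP` guards are assumptions.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

double critical_pi(long num_steps)
{
    double pi = 0.0;                         /* shared result */
    double step = 1.0 / (double)num_steps;

    #pragma omp parallel
    {
        int id = 0, nthrds = 1;
        double sum = 0.0;                    /* private partial sum, one per thread */
#ifdef _OPENMP
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
#endif
        for (long i = id; i < num_steps; i += nthrds) {
            double x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }

        #pragma omp critical                 /* one thread at a time adds its share */
        pi += sum * step;
    }
    return pi;
}
```

Placing the critical section after the loop matters: putting it inside the loop would serialize every iteration.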
iterations. The size of the block starts large and shrinks down to size "chunk" as the calculation proceeds.
schedule(runtime)
Schedule and chunk size taken from the OMP_SCHEDULE environment variable (or the runtime library).
schedule(auto)
Schedule is left up to the runtime to choose (does not have to be any of the above).
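A small sketch of schedule(runtime) in use. The loop has deliberately uneven iterations (iteration i does i+1 units of work), which is the situation where the choice of schedule matters; the function name and workload are illustrative assumptions.

```c
/* Counts the total work in a triangular loop nest: returns n*(n+1)/2. */
long triangular_work(int n)
{
    long total = 0;

    /* The schedule kind and chunk size are read when the program runs,
       from OMP_SCHEDULE or a prior omp_set_schedule() call. */
    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j <= i; j++)
            total += 1;               /* later iterations of i do more work */
    }
    return total;
}
```

With this in place the schedule can be changed without recompiling, for example by setting the environment variable to `OMP_SCHEDULE="guided,4"` or `OMP_SCHEDULE="dynamic"` before launching the program.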

valent
double res[

Working with loops
Basic approach
Find compute intensive loops
Make the loop iterations independent .. So they can safely exec

Nested loops
  #pragma omp parallel for collapse(2)
  for (int i=0; i<N; i++) {
    for (int j=0; j<M; j++) {
      .....
    }
  }
Will form a single loop of length NxM and then parallelize that.
Usef

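The elided loop body can be filled in with something checkable. In this sketch the grid size, the function name fill_grid and the values stored are assumptions; the directive is the one shown on the slide with a reduction added so the function can return a result.

```c
#define N 4
#define M 5

/* Fills grid[i][j] with its row-major index and returns the sum of all entries. */
long fill_grid(int grid[N][M])
{
    long total = 0;

    /* collapse(2) merges the two perfectly nested loops into one
       iteration space of N*M iterations before dividing it among threads. */
    #pragma omp parallel for collapse(2) reduction(+:total)
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
            grid[i][j] = i * M + j;
            total += grid[i][j];
        }
    }
    return total;
}
```

With N = 4 the outer loop alone offers only four iterations to share out; collapsing gives the runtime twenty.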
We are combining val

OpenMP red

Many different associative operands can be
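The reduction slides are truncated, so here is a minimal sketch of the clause they introduce, using an average as the combined value. The function name and signature are assumptions.

```c
/* Returns the arithmetic mean of a[0..n-1]; n must be positive. */
double average(const double *a, int n)
{
    double ave = 0.0;

    /* Each thread gets a private copy of ave initialized to 0 (the identity
       for +); the copies are added into the original at the end of the loop. */
    #pragma omp parallel for reduction(+:ave)
    for (int i = 0; i < n; i++)
        ave += a[i];

    return ave / n;
}
```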

Exercise 4: Pi with loops
Go back to the serial pi program and parallelize it with a loop constr

  #include <omp.h>
  static long num_steps = 100000; double step;
  void main ()
  { int i; double x, pi, sum = 0.0;
    step = 1.0/(double) num_steps;
    #pragma omp parallel
    {
      double x;
      #pragma omp for reduction(+:sum)
      for (i=0;i< num_steps; i++){
        x = (i+0.5)*step;
        sum = sum + 4.0/(1.0+x*x);
      }
    }
    pi = step * sum;
  }
Create a scalar local to each thread to hold value of x at the center of each interval
Create a team of threads ... without a parallel construct, you'll never have more than one thread
Break up loop iterations and assign them to threads ... setting up a reduction into sum

Results*
threads   1st SPMD   1st SPMD padded   SPMD critical   PI Loop
1         1.86       1.86              1.87            1.91
2         1.03       1.01              1.00            1.02
3         1.08       0.69              0.68            0.80
4         0.97       0.53              0.53            0.68

Parallel loops

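The same program can be packaged as a function for experimentation. This is a sketch: the name loop_pi and the use of a parameter instead of the file-scope num_steps are assumptions, while the directives match the slide.

```c
/* Pi by the midpoint rule, parallelized with a worksharing loop and a reduction. */
double loop_pi(long num_steps)
{
    double step = 1.0 / (double)num_steps;
    double sum = 0.0;

    #pragma omp parallel                     /* create the team */
    {
        double x;                            /* private: one per thread */
        #pragma omp for reduction(+:sum)     /* split iterations, combine sums */
        for (long i = 0; i < num_steps; i++) {
            x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }
    }
    return step * sum;
}
```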
Loops (cont.)
OpenMP 3.0 guarantees that this works ... i.e. that the same schedule is used in the two loops:
  !$omp do schedule(static)
  do i=1,n
    a(i) = ....
  end do
  !$omp end do nowait
  !$omp do schedule(static)
  do i=1,n
    .... = a(i)
  end do

Loops (cont.)
Made schedule(runtime) more useful
  can get/set it with library routines omp_set_schedule() omp_get_schedule()
  allow implementations to implement their own schedule kinds
Added a new schedule kind AUTO which gives full freedom to the runtime to determine the scheduling of iterations to threads.
Allowed C++ Random access iterators as loop control variables in parallel loops

Unit 1: Getting started with OpenMP
Mod1: Introd
Disc 3: Synchronization overhead and eliminating false sharing
Mod 5: Parallel Loops (making the Pi program simple)
Disc 4: Pi program wrap-up

Synchronization: Barrier
Barrier: Each thread waits until all threads arrive.
  #pragma omp parallel shared (A, B, C) private(id)
  {
    id=omp_get_thread_n
[Annotations on the rest of the example: implicit barrier at the end of a for worksharing construct; no implicit barrier due to nowait; implicit barrier at the end of a parallel region]

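The three annotations refer to code that was lost in extraction. The sketch below shows where each kind of barrier falls; the function name, the arrays and the arithmetic are illustrative assumptions rather than the slide's own big_calc routines.

```c
/* c[i] = 2*in[i]; then b[i] = c[n-1-i] + 1, which needs all of c[] to be ready. */
void two_loops(int n, const double *in, double *c, double *b)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            c[i] = 2.0 * in[i];
        /* implicit barrier at the end of the for construct: c[] is complete here */

        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            b[i] = c[n - 1 - i] + 1.0;
        /* no implicit barrier due to nowait: a thread may move on immediately */
    }
    /* implicit barrier at the end of the parallel region: b[] is complete here */
}
```

The nowait is safe here only because nothing inside the region reads b[] after the second loop.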
Master Constr

Single worksharing Constr
      es(); }
    do_many_other_things();
  }

Sections worksharing Constr

environment variable to control the size of child threads' stack
  OMP_STACKSIZE
Also added an environment variable to hint to runtime how to treat idle threads
  OMP_WAIT_POLICY
    ACTIVE keep threads alive at barriers/locks
    PASSIVE try to release processor at barriers/locks
Process binding is enabled if this variable is true ... i.e. if true the runtime will not move threads around between processors.
  OMP_PROC_BIND tr

Most variables are shared by default
Global variables are SHARED among threads
  Fortran: COMMON blocks, SAVE variables, MODULE variables
  C: File scope variables, static
  Both: dynamically allocated memory (ALLOCATE, malloc, new)
But not everything is shared...
  Stack variables in subprograms (Fortran) or functions (C) called from parallel regions are PRIVATE
  Automatic variables within a statement block are PRIVATE.

Data sharing: Examples
  double A[10];
  int main() {
    int index[10];
    #pragma omp parallel
      work(index);
    printf("%d\n", index[0]);
  }

  extern double A[10];
  void work(int *index) {
    double temp[10];
    static int count;
    ...
  }
A, index and count are shared by all threads; temp is local to each thread.
[Figure: a single copy of A, index, count and a separate temp for each thread]

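A runnable variation on the example above, arranged so the sharing rules can be observed. The names calls, work and run_work, the values stored, and the `_OPENMP` guards are assumptions made for this sketch.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_THREADS 4

static int calls;                  /* file scope: shared by all threads */

static void work(int *index, int id)
{
    int temp = id * 10;            /* automatic variable in a called function: private */
    index[id] = temp;              /* index points into the caller's shared array */
    #pragma omp atomic
    calls++;                       /* shared counter, so the update is protected */
}

/* index must have room for MAX_THREADS ints; returns how many threads called work(). */
int run_work(int *index)
{
    calls = 0;
    #pragma omp parallel num_threads(MAX_THREADS)
    {
        int id = 0;                /* declared inside the region: private */
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
        work(index, id);
    }
    return calls;
}
```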
Data sharing: Changing storage attrib

When is the original variable valid?
  int tmp;
  void danger() {
    tmp = 0;
    #pragma omp parallel private(tmp)
      work();
    printf("%d\n", tmp);
  }
The original variable's value is unspecified if it is referenced outside of the construct
  void work() {
    tmp = 5;
  }
unspecified which copy of tmp
tmp has unspecified value

Firstprivate Clause
C++ objects are copy-constructed
  incr = 0;
  #pragma omp parallel for firstprivate(incr)
  for (i = 0; i <= MAX; i++) {
    if ((i%2)==0) incr++;
    A[i] = incr;
  }
Each thread gets its own copy of incr with an initial value of 0

Lastprivate Clause
(i.e., for i=(n-1))

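The lastprivate slide lost its example, so the sketch below combines both clauses in one loop. The function name, the variable names offset and last, and the arithmetic are assumptions chosen so the outcome does not depend on how iterations are divided among threads.

```c
/* Returns (n-1)*(n-1) + 10 for n > 0: the value computed by the final iteration. */
int last_square(int n)
{
    int offset = 10;   /* firstprivate: every thread's copy starts at 10 */
    int last = -1;     /* lastprivate: receives the value from iteration n-1 */

    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = i * i + offset;    /* each thread writes its own private copy */
    }
    /* Here last holds what the sequentially final iteration (i = n-1) assigned. */
    return last;
}
```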
A data environment test
Consider this example of PRIVATE and FIRSTPRIVATE
  variables: A = 1,B = 1, C = 1
  #pragma omp parallel private(B) firstprivate(C)
Are A,B,C local to each thread or shared inside the parallel region?
What are their initial values inside and values after the parallel region?
Inside this parallel region ...
  "A" is shared by all threads; equals 1
  "B" and "C" are local to each thread.
    B's initial value is undefined
    C's initial value equals 1
Following the parallel region ...
  B and C revert to their original values

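The answers above can be checked with a small program. In this sketch the struct env_result, the function name and the reduction used to collect each thread's observation are assumptions; B is assigned before it is read because its initial value inside the region is undefined.

```c
typedef struct {
    int a_after, b_after, c_after;   /* values seen after the parallel region */
    int c_initial_ok;                /* 1 if every thread saw C == 1 on entry */
} env_result;

env_result data_env_test(void)
{
    int A = 1, B = 1, C = 1;
    int c_ok = 1;

    #pragma omp parallel private(B) firstprivate(C) reduction(&&:c_ok)
    {
        c_ok = c_ok && (C == 1);     /* firstprivate copy starts from the original's value */
        B = 42;                      /* private copy: must be written before it is read */
        C = B + A;                   /* A is shared; reading it here is safe */
    }

    env_result r = { A, B, C, c_ok };
    return r;
}
```

When built with OpenMP enabled, the writes to B and C inside the region touch only the private copies, so the originals still hold 1 afterwards. Built without OpenMP the pragma is ignored and the same statements modify the originals, which is a useful way to see what the clauses are doing.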
Data Sharing: Default Clause
DEFAULT(PRIVATE): each variable in the construct is made private as if specified in a private clause; mostly saves typing
DEFAULT(NONE)

Data Sharing: Default Clause Example
  itotal = 1000
  C$OMP PARALLEL DEFAULT(PRIVATE) SHARED(itotal)
  np = omp_get_num_threads()
  each = itotal/np
  C$OMP END PARALLEL

  itotal = 1000
  C$OMP PARALLEL PRIVATE(np, each)
  np = omp_get_num_threads()
  each = itotal/np
  C$OMP END PARALLEL
These two code fragments are equivalent

Exercise 5: Mandelbrot set area
The s

Exercise 5 (cont.)
Once you have a working version, try to optimize the program?
Try different sched

  # define NPOINTS 1000
  # define MXITR 1000
  void testpoint(void);
  struct d_complex{
    double r; double i;
  };
  struct d_complex c;
  int numoutside = 0;

  int main(){
    int i, j;
    double area, error, eps = 1.0e-5;
  #pragma omp parallel for default(shared) private(c,eps)
    for (i=0; i<NPOINTS; i++) {
      for (j=0; j<NPOINTS; j++) {
        c.r = -2.0+2.5*(double)(i)/(double)(NPOINTS)+eps;
        c.i = 1.125*(double)(j)/(double)(NPOINTS)+eps;
        testpoint();
      }
    }
    area=2.0*2.5*1.125*(double)(NPOINTS*NPOINTS-numoutside)/(double)(NPOINTS*NPOINTS);
    error=area/(double)NPOINTS;
  }

  void testpoint(void){
  str

Find tools that work with your environment and learn to use them. A good parallel debugger can make a huge difference.
But parallel debuggers are not portable and you will assuredly need to debug "by hand" at some point.
There are tricks to help you. The most important is to use the default(none) clause
  #pragma omp parallel for default(none) private(c, eps)
  for (i=0; i<NPOINTS; i++) {
    for (j=0; j<NPOINTS; j++) {
      c.r = -2.0+2.5*(do

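To illustrate why default(none) helps: with it, the compiler rejects any variable used in the region whose sharing has not been stated, so a forgotten loop index or scratch variable becomes a compile-time error instead of a silent race. The function below is a made-up example, not the Mandelbrot code; its name and arguments are assumptions.

```c
/* Counts the even entries of v[0..n-1]. */
long count_even(const int *v, int n)
{
    long evens = 0;
    int i;

    /* Every variable referenced in the loop appears in exactly one clause. */
    #pragma omp parallel for default(none) shared(v, n) private(i) reduction(+:evens)
    for (i = 0; i < n; i++) {
        if (v[i] % 2 == 0)
            evens++;
    }
    return evens;
}
```

Removing `n` from the shared list, for example, turns into a diagnostic that names the variable.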
  # define NPOINTS 1000
  # define MXITR 1000
  struct d_complex{
    double r; double i;
  };
  void testpoint(struct d_complex);
  struct d_complex c;
  int numoutside = 0;

  int main(){
    int i, j;
    double area, error, eps = 1.0e-5;
  #pragma omp parallel for default(shared) private(c, j) \
      firstprivate(eps)
    for (i=0; i<NPOINTS; i++) {
      for (j=0; j<NPOINTS; j++) {
        c.r = -2.0+2.5*(double)(i)/(double)(NPOINTS)+eps;
        c.i = 1.125*(double)(j)/(double)(NPOINTS)+eps;
        testpoint(c);
      }
    }
    area=2.0*2.5*1.125*(double)(NPOINTS*NPOINTS-numoutside)/(double)(NPOINTS*NPOINTS);
    error=area/(double)NPOINTS;
  }

  void testpoint(str

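Since testpoint() is cut off in both versions, here is a complete sketch of a race-free area calculation. It departs from the slide in two labeled ways: the point count and iteration limit are function parameters rather than macros, and escaped points are counted with a reduction instead of a shared counter updated inside testpoint(). The iteration formula is the standard z = z*z + c with an escape test of |z|^2 > 4.

```c
struct d_complex { double r; double i; };

/* Returns 1 if the point c escapes within maxiter iterations, else 0. */
static int testpoint(struct d_complex c, int maxiter)
{
    struct d_complex z = c;
    for (int iter = 0; iter < maxiter; iter++) {
        double temp = (z.r * z.r) - (z.i * z.i) + c.r;
        z.i = z.r * z.i * 2 + c.i;
        z.r = temp;
        if ((z.r * z.r + z.i * z.i) > 4.0)
            return 1;
    }
    return 0;
}

/* Estimates the area of the Mandelbrot set on an npoints x npoints grid. */
double mandel_area(int npoints, int maxiter)
{
    const double eps = 1.0e-5;
    long numoutside = 0;
    long total = (long)npoints * npoints;

    #pragma omp parallel for reduction(+:numoutside)
    for (int i = 0; i < npoints; i++) {
        for (int j = 0; j < npoints; j++) {   /* j and c are declared here, so private */
            struct d_complex c;
            c.r = -2.0 + 2.5 * (double)i / (double)npoints + eps;
            c.i = 1.125 * (double)j / (double)npoints + eps;
            numoutside += testpoint(c, maxiter);
        }
    }
    return 2.0 * 2.5 * 1.125 * (double)(total - numoutside) / (double)total;
}
```

Passing c by value removes the shared global that caused trouble in the first version, and declaring j inside the loop makes it private without needing a clause.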
  #include <omp.h>
  static long num_steps = 100000; double step;
  void main ()
  { int i; double x, pi, sum = 0.0;
    step = 1.0/(double) num_steps;
    #pragma omp parallel for private(x) reduction(+:sum)
    for (i=0;i< num_steps; i++){
      x = (i+0.5)*step;
      sum = sum + 4.0/(1.0+x*x);
    }
    pi = step * sum;
  }
Note: we created a parallel program without changing any executable code and by adding 2 simple lines of text!
i private by default
For good OpenMP implementations, reduction is more scalable than critical.

*#$
*0 :4$-.$ - .$-, 0= .64$-;/e#4-8,- 0,# #-4-22$2
*0 /6-4$ ?04@ G$.?$$% .64$-;/Be#4-8,- 0,# =04
e#4-8,- 0,# /+%82$
*0 #4$>$%. :0%=2+:./ H#4$>$%. 4-:$/Ie#4-8,- 0,# :4+.+:-2e#4-8,- 0,# -.0,+:
e#4-8,- 0,# G-44+$4
e#4-8,- 0,# ,-/.$4
W-.- $%>+40%,$%. :2--.$ H>-4+-G2$i2+/.I=+4/.#4+>-.$ H>-4+-G2$i2+/.I
2-/.#4+>-.$ H>-4+-G2$i2+/.I
4$;-4+-G2$i2+/.I
Q#$%$ M-%.-;($V(.&) .& -
3/22- &$1-%-)$7 (.&) /0
M-%.-;($&
>%.,) )#$ M-('$ /0 )#$ 2-3%/
VA>W]
^,7 .)& M-('$ 5.(( ;$
999922
D/% )#$ 9$-% -,7 2/,)# /0 )#$&1$3 )#$ .21($2$,)-)./, '&$7
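A minimal sketch of the version check described above. The function names are assumptions; the macro `_OPENMP` is defined by the compiler only when OpenMP support is switched on, so the fallback value 0 is a convention chosen here.

```c
#include <stdio.h>

/* Returns the yyyymm date of the OpenMP spec in use, or 0 if OpenMP is disabled. */
long openmp_version(void)
{
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}

/* Prints the version in a human-readable form. */
void print_openmp_version(void)
{
    long v = openmp_version();
    if (v == 0)
        printf("compiled without OpenMP support\n");
    else
        printf("_OPENMP = %ld (year %ld, month %ld)\n", v, v / 100, v % 100);
}
```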
Consider simple list traversal
Given what we've covered about OpenMP, how would you process this loop in Parallel?
  p=head;
  while (p) {
    process(p);
    p = p->next;
  }
Remember, the loop worksharing construct only works with loops for which the number of loop iterations can be represented by a closed-form expression at compiler time. While loops are not covered.

Exercise 6: linked lists the hard way

*#&
Consider the program linked.c
Traverses a linked list comp[...]
p=head;
while (p) {
   process(p);
   p = p->next;
}
[...] cases in HPC ... Fortran arrays processed over regular loops.
Recursion and pointer chasing were so far removed from our Fortran focus that we didn't even consider more general structures.
Hence, even a simple list traversal is exceedingly difficult with the original versions of OpenMP.
Linked lists without [...]
while (p != NULL) {
   p = p->next;
   co[...]
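The fragment above is cut off in this transcript. A sketch of the usual three-pass approach it appears to begin (count the nodes, copy the pointers into an array, then run a worksharing loop over the array) might look as follows. The node layout, processwork, and process_list_hard_way are assumptions made for illustration, not the contents of linked.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; the fields are assumptions for this sketch. */
struct node {
    int data;
    int result;
    struct node *next;
};

/* Stand-in for the per-node work. */
static void processwork(struct node *p)
{
    p->result = p->data * p->data;
}

/* Returns the number of nodes processed, or -1 if allocation fails. */
int process_list_hard_way(struct node *head)
{
    struct node *p;
    struct node **parr;
    int count = 0;
    int i;

    /* Pass 1: count the nodes. */
    for (p = head; p != NULL; p = p->next)
        count++;
    if (count == 0)
        return 0;

    /* Pass 2: copy the node pointers into an array. */
    parr = malloc((size_t)count * sizeof *parr);
    if (parr == NULL)
        return -1;
    for (i = 0, p = head; p != NULL; p = p->next, i++)
        parr[i] = p;
    assert(i == count);

    /* Pass 3: the trip count is now known, so worksharing applies. */
    #pragma omp parallel for schedule(static, 1)
    for (i = 0; i < count; i++)
        processwork(parr[i]);

    free(parr);
    return count;
}
```

The extra passes and the temporary array are the price of turning a pointer-chasing while loop into a loop with a countable trip count.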
std::vector<node *> nodelist;
for (p = head; p != NULL; p = p->next) nodelist.push_back(p);
int j = (int)nodelist.size();
#pragma omp parallel for schedule(static,1)
for (int i = 0; i < j; ++i)
   processwork(nodelist[i]);
C++, defa[...]
We were able to parallelize the linked list traversal, but it was ugly and required multiple passes over the data.
To move beyond its roots in the array-based world of scientific comp[...]
Tasks are independent units of work.
Tasks are composed of:
   code to execute
   data environment
   internal control variables (ICV)
Threads perform the work of each task.
The runtime system decides when tasks are executed
   Tasks may be deferred
   Tasks may be executed immediately
[Figure: task execution shown in two panels, Serial and Parallel]
Definitions
Task construct: task directive pl[...]
[...] #pragma omp barrier
or task barriers: #pragma omp taskwait
#pragma omp parallel
{
   #pragma omp task
   foo();                  // Multiple foo tasks created here, one for each thread
   #pragma omp barrier     // All foo tasks guaranteed to be completed here
   #pragma omp single
   {
      #pragma omp task
      bar();               // One bar task created here
   }                       // bar task guaranteed to be completed here
}
Data Scoping with tasks: Fibonacci example.
int fib ( int n )
{
   int x,y;
   if ( n < 2 ) return n;
   [...]
[...] private in both tasks
x is a private variable
y is a private variable
This is an instance of the divide and conquer design pattern
What's wrong here?
A task's private variables are [...]
Data Scoping with tasks: Fibonacci example.
int fib ( int n )
{
   int x,y;
   if ( n < 2 ) return n;
   [...]
[...] private in both tasks
x & y are shared
Good sol[...]
[...] Traversal example
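The body of fib is cut off in this transcript. The sketch below shows one common way to write the task-based Fibonacci with x and y shared, as the slide text indicates; it is an illustration under that assumption, not a reconstruction of the exact slide. The driver fib_with_tasks is hypothetical.

```c
#include <assert.h>

/* x and y are shared so that the values written by the child tasks are
   visible to the parent after the taskwait. */
long fib(int n)
{
    long x, y;

    if (n < 2)
        return n;

    #pragma omp task shared(x)
    x = fib(n - 1);

    #pragma omp task shared(y)
    y = fib(n - 2);

    /* Wait for both child tasks before reading x and y. */
    #pragma omp taskwait
    return x + y;
}

/* Hypothetical driver: one thread starts the recursion and the whole
   team executes the generated tasks. */
long fib_with_tasks(int n)
{
    long result = 0;

    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

If x and y were left private to each task, the assignments inside the tasks would update copies that the parent never sees, which is the problem the "What's wrong here?" slide points at.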
List ml; //my_list
Element *e;
#pragma omp parallel
#pragma omp single
{
   for(e=ml->first;e;e=e->next)
      #pragma omp task
      process(e);
}
What's wrong here?
Possible data race! Shared variable e [...]
List ml; //my_list
Element *e;
#pragma omp parallel
#pragma omp single
{
   for(e=ml->first;e;e=e->next)
      #pragma omp task firstprivate(e)
      process(e);
}
Good sol[...] firstprivate
Exercise 7: tasks in OpenMP
Consider the program linked.c
Traverses a linked list comp[...]
#pragma omp parallel
{
  #pragma omp single
  {
    node * p = head;
    while (p) {
      #pragma omp task firstprivate(p)
        process(p);
      p = p->next;
    }
  }
}
1. Create a team of threads.
2. One thread executes the single construct ... other threads wait at the implied barrier at the end of the single construct
3. The single thread creates a task with its own value for the pointer p
4. Threads waiting at the barrier execute tasks.
Execution moves beyond the barrier once all the tasks are complete
Exec[...]
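The four steps above can be put into a small self-contained sketch. The node layout, process, and process_list_with_tasks are assumptions for illustration; only the pragma structure follows the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type and per-node work for this sketch. */
struct node {
    int data;
    int result;
    struct node *next;
};

static void process(struct node *p)
{
    p->result = 2 * p->data;
}

/* Returns the number of tasks created (one per node). */
int process_list_with_tasks(struct node *head)
{
    int count = 0;

    #pragma omp parallel
    {
        #pragma omp single
        {
            struct node *p = head;
            while (p) {
                /* Each task captures the value p has right now. */
                #pragma omp task firstprivate(p)
                process(p);

                p = p->next;
                count++;       /* only the single thread touches count */
            }
        }   /* implied barrier: all tasks are complete past this point */
    }
    assert(count >= 0);
    return count;
}
```

Compared with the array-based version, the list is walked once and no temporary storage is needed.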
#pragma omp parallel
{
  #pragma omp single
  { //block 1
    node * p = head;
    while (p) { // block 2
      #pragma omp task
        process(p);
      p = p->next; //block 3
    }
  }
}
Have potential to parallelize irregular patterns and recursive function calls
[Figure: execution timelines. Single Threaded: Block 1, then Block 2 and Task 1, Block 2 and Task 2, Block 2 and Task 3 in sequence. Four threads (Thr1, Thr2, Thr3, Thr4): the blocks run on one thread while the tasks run on others; labels: Time, Time Saved, Idle]
A real example: Symmetric rank-k update
[Figure: C += A A^T, with A partitioned into blocks A0 and A1; annotated updates "Add A1 A0^T" and "Add A0 A0^T"]
Note: the iteration sweeps through C and A, creating a new block of rows to be updated with new parts of A. These updates are completely independent.
Tze Meng Low, Kent Milfeld, Robert van de Geijn, and Field Van Zee. "Parallelizing FLAME Code with OpenMP Task Queues", [...], submitted.
#pragma omp parallel
{
  #pragma omp single
  {
    [...]
      #pragma omp task firstprivate(A0, [...] C11)
    [...]
  } // end of task-queue
} // end of parallel region
[Figure: syrk_ln (var2); MFLOPS/sec. (x 10^4) versus matrix dimension n (0 to 2000); curves: Reference, FLAME, OpenFLAME_nth1, OpenFLAME_nth2, OpenFLAME_nth3, OpenFLAME_nth4]
Top line represents peak of Machine (Itanium2 1.5GHz, 4CPU)
Note: the above graph is for the most naïve way of marching through the matrices. By picking blocks dynamically, much faster ramp-up can be achieved.
[Figure: shared memory holding variables a and b; threads proc1, proc2, proc3, ..., procN, each with a private view and threadprivate storage]
[Figure: source code in program order (Wa Wb Ra Rb ...) passes through the compiler to executable code in code order (Wb Rb Wa Ra ...); RWs in any semantically equivalent order]
Consistency: Memory Access Re-ordering
Re-ordering:
   Compiler re-orders program order to the code order
   Machine re-orders code order to the memory commit order
At a given point in time, the private view seen by a thread may be different from the view in shared memory.
Consistency Models define constraints on the orders of Reads (R), Writes (W) and Synchronizations (S)
   i.e. how do the values seen by a thread change as you change how ops follow () other ops.
   Possibilities incl[...]
Seq[...]
OpenMP defines consistency as a variant of weak consistency:
   Can not reorder S ops with R or W ops on the same thread
   Weak consistency g[...]ant to this disc[...]
Defines a seq[...]
Fl[...]
[...] private to a thread
   Fortran: CO[...]RIVATE
   with PRIVATE global variables are masked.
   THREADPRIVATE preserves global scope within each thread
Threadprivate variables can be initialized [...] COPYIN or at time of definition ([...]
[...]ate to create a co[...]
int co[...]ate(co[...]
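The code on this slide is cut off. A sketch of a per-thread counter built with threadprivate, under the assumption that this is what the fragment begins, could look like the following. The names counter and increment_counter are guesses from the surviving characters, and counters_are_independent is a hypothetical check added for illustration.

```c
#include <assert.h>

/* File-scope counter; threadprivate gives every thread its own copy. */
static int counter = 0;
#pragma omp threadprivate(counter)

int increment_counter(void)
{
    counter++;
    return counter;
}

/* Hypothetical check: every thread calls increment_counter() n times and
   should see exactly n, because no other thread touches its copy.
   Returns 1 if that holds for every thread, 0 otherwise. */
int counters_are_independent(int n)
{
    int all_ok = 1;

    assert(n > 0);
    #pragma omp parallel
    {
        int k, last = 0;

        counter = 0;                   /* reset this thread's copy */
        for (k = 0; k < n; k++)
            last = increment_counter();

        if (last != n) {
            #pragma omp critical
            all_ok = 0;
        }
    }
    return all_ok;
}
```

With an ordinary shared global instead of a threadprivate one, the unsynchronized increments would race and the check would be expected to fail intermittently.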
[...]ates from one member of a team to the rest of the team
[...] private (Nsize, choice)
{
   #pragma omp single copyprivate (Nsize, choice)
      inp[...]
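A minimal sketch of the copyprivate broadcast described above. The function input_parameters here is a hypothetical stand-in that takes pointers and returns made-up values, and broadcast_parameters is an illustrative wrapper; neither is the code from the slide.

```c
#include <assert.h>

/* Hypothetical stand-in for reading input; the values are made up. */
static void input_parameters(int *n_size, int *choice)
{
    *n_size = 1000;
    *choice = 3;
}

/* Returns 1 if every thread in the team ended up with the values read
   by the one thread that executed the single construct. */
int broadcast_parameters(void)
{
    int team = 0, good = 0;
    int Nsize = 0, choice = 0;

    #pragma omp parallel private(Nsize, choice)
    {
        /* One thread reads; copyprivate copies its private values into
           the private copies held by the other threads. */
        #pragma omp single copyprivate(Nsize, choice)
        input_parameters(&Nsize, &choice);

        #pragma omp critical
        {
            team++;
            if (Nsize == 1000 && choice == 3)
                good++;
        }
    }
    assert(team >= 1);
    return good == team;
}
```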
[...]ide three files for this exercise
   pi_mc.c: the monte carlo method pi program
   random.c: a simple random n[...]
*$% #4$>+0-2-2
[Figure: relative error (log scale, 0.00001 to 1) versus Log10 n[...] (1 to 6); curves: LCG, one thread; LCG, 4 threads, trial 1; LCG, 4 threads, trial 2; LCG, 4 threads, trial 3]
static long ADDEND = 150889;
static long PMOD = 714025;
long random_last = 0;
#pragma omp threadprivate(random_last)
do[...]
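The generator itself is cut off above. A sketch of a linear congruential generator that keeps its state in a threadprivate variable might look like this. ADDEND and PMOD come from the fragment; the MULTIPLIER value and the name lcg_random are assumptions (the name also avoids clashing with the POSIX random function).

```c
#include <assert.h>

static long MULTIPLIER = 1366;   /* assumed value, not shown in the fragment */
static long ADDEND = 150889;
static long PMOD = 714025;

/* Generator state; threadprivate gives each thread its own copy, so
   concurrent calls do not race on it. */
static long random_last = 0;
#pragma omp threadprivate(random_last)

/* Returns the next pseudo-random value in [0, 1). */
double lcg_random(void)
{
    long random_next = (MULTIPLIER * random_last + ADDEND) % PMOD;

    assert(random_next >= 0 && random_next < PMOD);
    random_last = random_next;
    return (double)random_next / (double)PMOD;
}
```

This removes the race on the generator state, but note that threads starting from the same seed would each produce the same sequence, so thread safety alone does not make the combined stream statistically sound.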
[Figure axes: Log10 Relative error versus Log10 n[...]]
[...]ersion gives the same answer each time yo[...]
[...]ly large. Use the rank, an ID ranging from 0 to (P-1), to select between a set of tasks and to manage any shared data structures.
This pattern is very general and has been used to support most (if not all) the algorithm strategy patterns.
MPI programs almost always use this pattern; it is probably the most commonly used pattern in the history of parallel programming.
OpenMP Pi program: SPMD pattern
#include <omp.h>
void main (int a[...] char *a[...]
{
   int i; double pi=0.0, step, sum = 0.0;
   step = 1.0/(double) num_steps ;
   #p[...]
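The rest of the SPMD program is cut off in this transcript. The sketch below shows one way the pattern is commonly written for the Pi integration, using the thread ID as the rank and a cyclic distribution of iterations. It is an illustration under those assumptions, not the slide's code; pi_spmd is a hypothetical name, and the #ifdef guards let it build without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

double pi_spmd(long num_steps)
{
    double pi = 0.0;
    double step;

    assert(num_steps > 0);
    step = 1.0 / (double)num_steps;

    #pragma omp parallel
    {
        int id = 0, nthreads = 1;
        long i;
        double x, sum = 0.0;            /* private partial sum */

#ifdef _OPENMP
        id = omp_get_thread_num();       /* the rank: 0 .. P-1 */
        nthreads = omp_get_num_threads();
#endif
        /* Cyclic distribution: rank id takes iterations id, id+P, id+2P, ... */
        for (i = id; i < num_steps; i += nthreads) {
            x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }

        /* Combine the partial results one thread at a time. */
        #pragma omp critical
        pi += sum * step;
    }
    return pi;
}
```

Every thread runs the same code and uses only its rank and the team size to decide which iterations are its own, which is the defining trait of SPMD.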
[...] loops. Loop iterations are divided between a collection of processing elements to compute tasks in parallel.
This design pattern is heavily used with data parallel design patterns.
OpenMP programmers commonly use this pattern.
#pragma omp parallel for shared(Results) schedule(dynamic)
for(i=0;i<N;i++){
   Do_work(i, Results);
}
OpenMP PI Program: Loop level parallelism pattern
#include <omp.h>
static long num_steps = 100000;
double step;
#define NUM_THREADS [...]
void main ()
{ int i; double x, pi, sum =0[...]
A problem includes a method to divide into subproblems and a way to recombine solutions of subproblems into a global solution.
Solution
   Define a split operation
   Continue to split the problem until subproblems are small enough to solve directly.
   Recombine solutions to subproblems to solve original global problem.
Note: Computing may occur at each phase (split, leaves, recombine).
Divide and conquer
Split the problem into smaller sub-problems. Continue until the sub-problems can be solved directly.
3 Options:
   Do work as you split into sub-problems.
   Do work only at the leaves.
   Do work as you recombine.
Program: OpenMP tasks (divide and conquer)
#include <omp.h>
static long num_steps = 100000000;
#define MIN_BLK 10000000
double pi_comp(int Nstart,int Nfinish,double step)
{  int i,iblk;
   double x, sum = 0.0,sum1, sum2;
   if (Nfinish-Nstart < MIN_BLK){
      for (i=Nstart;i< Nfinish; i++){
         x = (i+0.5)*step;
         sum = sum + 4.0/(1.0+x*x);
      }
   }
   else{
      iblk = Nfinish-Nstart;
      #pragma omp task shared(sum1)
         sum1 = pi_comp(Nstart, Nfinish-iblk/2,step);
      #pragma omp task shared(sum2)
         sum2 = pi_comp(Nfinish-iblk/2, Nfinish, step);
      #pragma omp taskwait
         sum = sum1 + sum2;
   }
   return sum;
}
int main ()
{
   int i;
   double step, pi, sum;
   step = 1.0/(double) num_steps;
   #pragma omp parallel
   {
      #pragma omp single
         sum = pi_comp(0,num_steps,step);
   }
   pi = step * sum;
}
Res[...]
Original serial pi program with 100000000 steps ran in 1.83 seconds.
*Intel compiler (icpc) with no optimization on Apple OS X 10.[...] with a dual core (four HW thread) Intel® Core™ i5 processor at 1.7 Ghz and 4 Gbyte DDR3 memory at 1.333 Ghz.
threads   1st SPMD   SPMD critical   PI Loop   Pi tasks
1         1.86       1.87            1.91      1.8[...]