Intro_To_OpenMP_Mattson.pdf


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    1/200

    A Hands-on Introduction to OpenMP*

    * The name OpenMP is the property of the OpenMP Architecture Review Board.

    Tim Mattson

    Intel Corp.

    timothy.g.mattson@intel.com

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    2/200

    Introd...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    3/200

    Acknowledgements

    This co...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    4/200

    Preliminaries:

    ...e learning! We will mix short lec...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    5/200

    ...erse linked lists

    Unit 4: a few advanced OpenMP topics

    Mod 8: Tasks (linked lists the easy way)

    Disc 7: Understanding Tasks

    Mod 8: The scary stuff: memory model, atomics, and flush (pairwise synch).

    Disc 8: The pitfalls of pairwise synchronization

    Mod 9: Threadprivate Data and how to s...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    7/200

    Moore's Law

    Slide source: UCB CS 194 Fall 2010

    In 1965, Intel co-fo...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    8/200

    Consequences of Moore's law

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    9/200

    The Hardware/Software contract

    Write yo...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    10/200

    Computer architecture and the power wall

    [Figure: Power versus Scalar Performance for the i486, Pentium, Pentium Pro, Pentium 4 (Wmt) and Pentium 4 (Psc); fitted curve power = perf ^ 1.74. Growth in power is unsustainable.]

    Source: E. Grochowski of Intel

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    11/200

    partial solution: simple low power cores

    [Figure: Power versus Scalar Performance for the i486, Pentium, Pentium Pro, Pentium 4 (Wmt) and Pentium 4 (Psc); fitted curve power = perf ^ 1.74. Mobile CPUs with shallow pipelines use less power.]

    Source: E. Grochowski of Intel

    [Figure: one processor running at frequency f (Input, Output) next to two processors each running at frequency f/2 (Input, Output).]

    One processor: Capacitance = C, Voltage = V, Frequency = f, Power = CV^2f

    Two processors: Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = 0.396CV^2f
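    The 0.396 factor can be checked by substituting the scaled values into the dynamic power relation quoted on the slide. The subscripted names below are labels introduced here for the two configurations; they are not from the slides.

    ```latex
    \begin{align*}
    P_{\text{one}} &= C V^{2} f \\
    P_{\text{two}} &= (2.2\,C)\,(0.6\,V)^{2}\,(0.5\,f)
                    = 2.2 \times 0.36 \times 0.5 \; C V^{2} f
                    = 0.396\, C V^{2} f
    \end{align*}
    ```

    Under these numbers the two slower cores draw roughly 40% of the power of the single fast core.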

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    14/200

    Microprocessor trends

    Individual processors are many core (and often heterogeneous) processors.

    [Figure: processor die photos labeled IBM Cell, NVIDIA Tesla C1060, Intel SCC Processor, AMD ATI RV770, ARM MPCORE and Intel® Xeon® processor, with the captions: 80 cores; 30 cores, 8 wide SIMD; 1 CPU + 6 cores; 10 cores, 16 wide SIMD; 48 cores; 4 cores; 4 cores.]

    Source: OpenCL tutorial, Gaster, Howes, Mattson, and Lokhmotov, HiPEAC 2011

    3rd party names are the property of their owners.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    15/200

    The result

    A new contract: HW people will do what's natural for them (lots of simple cores) and SW people will have to adapt (rewrite everything).

    The problem is this was presented as an ultimatum: nobody asked us if we were OK with this new contract, which is kind of rude.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    16/200

    Concurrency vs. Parallelism

    Two important definitions:

    Conc...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    17/200

    Concurrency vs. Parallelism

    Two important definitions:

    [Figure labels: Concurrent Programs, Parallel Programs.]

    Figure from "An Introduction to Concurrency in Programming Languages" by J. Sottile, Timothy G. Mattson, and Craig E. Rasmussen, 2010

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    18/200

    Concurrent vs. Parallel applications

    Parallel application: An application for which the computations actually execute simultaneously in order to complete a problem in less time. The problem doesn't inherently require concurrency; you can state it sequentially.

    Concurrent application: An application for which computations logically execute simultaneously due to the semantics of the application. The problem is fundamentally concurrent.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    19/200

    The parallel programming process:

    [Figure: Original Problem; Find Concurrency; Tasks, shared and local data; Implementation strategy; Units of execution + new shared data for extracted dependencies; Corresponding source code. The same program text is drawn once for each of four units of execution.]

    Program SPar ()
    {
        TYPE *tmp, *func();
        global_array Data(TYPE);
        global_array Res(TYPE);
        int Num = get_num_procs();
        int id = get_proc_id();
        if (id==0) setup_problem(N, Data);
        for (int I= ID; I<N; I=I+Num){
            tmp = func(I, Data);
            Res.accumulate( tmp);
        }
    }

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    20/200

    OpenMP* Overview:

    omp_set_lock(lck)

    #pragma omp parallel for private(A, B)

    #pragma omp critical

    C$OMP parallel do shared(a, b, c)

    C$OMP...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    21/200

    OpenMP Basic Defs: Sol...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    22/200

    OpenMP core syntax

    Most of the constr...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    24/200

    Compiler notes: Intel on Windows

    La... environment

    cd to the directory that holds yo...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    25/200

    Compiler notes: Vis...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    26/200

    Compiler notes: Other

    Lin...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    27/200

    Shared memory Computers

    Shared memory computer: any computer composed of multiple processing elements that share an address space. Two classes:

    Symmetric multiprocessor (SMP): a shared address space with equal-time access for each processor, and the OS treats every processor the same way.

    Non Uniform address space multiprocessor (NUMA): different memory regions have different access costs; think of memory segmented into "Near" and "Far" memory.

    [Figure: Proc1, Proc2, Proc3, ProcN attached to a Shared Address Space.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    32/200

    Shared memory machines: SMP

    Cray-2: the last large scale SMP computer. Released in 1985 with 4 "heads", 1.9 GFLOPS peak performance (fastest supercomputer in the world until 1990).

    The vector units in each "head" had equal-time access to the memory organized into banks to support high-bandwidth parallel memory access.

    Third party names are the property of their owners.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    33/200

    Shared memory machines: SMP

    Intel Core i7-9...

    6 cores, 2-way multithreaded, 6-wide superscalar, quad-issue, 4-wide SIMD (on 3 of 6 pipelines)

    4.5 KB (6 x 768 B) "Architectural" Registers, 192 KB (6 x 32 KB) L1 cache, 1.5 MB (6 x 256 KB) L2 cache, 12 MB L3 cache

    MESI cache coherence, Processor Consistency Model

    1.17 Billion Transistors on 32 nm process @ 2.6 GHz

    [Figure: six L2 caches above a shared L3 and a Memory Controller.]

    Cache hierarchy means different processors have different costs to access different address ranges. It's NUMA.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    34/200

    Shared memory computers

    Shared memory computers are everywhere: most laptops and servers have multicore multiprocessor CPUs.

    The shared address space and (as we will see) programming models encourage us to think of them as SMP systems.

    Reality is more complex: any multiprocessor CPU with a cache is a NUMA system. Start out by treating the system as an SMP and just accept that much of your optimization work will address cases where that case breaks down.

    [Figure: Proc1, Proc2, Proc3, ProcN attached to a Shared Address Space.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    35/200

    Programming shared memory computers

    Process: An instance of a program execution. The execution context of a running program, i.e. the resources associated with a program's execution.

    [Figure: memory layout of a process. Stack: funcA() with var1, var2. text: main(), funcA(), funcB(), ... data: array1, array2. heap. Process state: Process ID, User ID, Group ID, Files, Locks, Sockets, Stack Pointer, Program Counter, Registers.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    36/200

    Programming shared memory computers

    Threads:

    Threads are "light weight processes".

    Threads share Process state among multiple threads; this greatly reduces the cost of switching context.

    [Figure: one process (text: main(), funcA(), funcB(), ... data: array1, array2. heap. Process ID, User ID, Group ID, Files, Locks, Sockets) holding two threads. Thread 0 Stack: funcA() with var1, var2. Thread 1 Stack: funcB() with var1, var2, var3. Each thread has its own Stack Pointer, Program Counter and Registers.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    37/200

    A shared memory program

    An instance of a program: One process and lots of threads.

    Threads interact through reads/writes to a shared address space.

    OS scheduler decides when to run which threads, interleaved for fairness.

    Synchronization to assure every legal order results in correct results.

    [Figure: five threads, each with its own Private memory, around a Shared Address Space.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    38/200

    Exercise 1: Solution

    #include <stdio.h>   /* needed for printf */
    #include <omp.h>

    int main()
    {
        /* every thread in the team runs this block */
        #pragma omp parallel
        {
            int ID = omp_get_thread_num();
            printf(" hello(%d) ", ID);
            printf(" world(%d) \n", ID);
        }
        return 0;
    }

    Sample O...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    39/200

    OpenMP Overview: How do threads interact?

    OpenMP is a m...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    42/200

    OpenMP Programming Model:

    Fork-Join Parallelism: Master thread spawns a team of threads as needed.

    Parallelism added incrementally until performance goals are met: i.e. the sequential program evolves into a parallel program.

    [Figure: Sequential Parts alternating with Parallel Regions, including A Nested Parallel region; Master Thread in red.]

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    43/200

    Thread Creation: Parallel Regions

    You create threads in OpenMP* with the parallel constr...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    45/200

    Thread Creation: Parallel Regions

    Each thread executes the same code redundantly.

    double A[1000];
    #pragma omp parallel num_threads(4)
    {
        int ID = omp_get_thread_num();
        pooh(ID, A);
    }
    printf("all done\n");

    [Figure: after double A[1000] and omp_set_num_threads(4), four threads run pooh(0,A), pooh(1,A), pooh(2,A) and pooh(3,A) side by side; printf("all done\n") runs once they have all finished.]

    A single copy of A is shared between all threads.

    Threads wait here for all threads to finish before proceeding (i.e. a barrier).
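    The parallel region above can be tried as a small function. This is a sketch under assumptions: `pooh()` here is a hypothetical stand-in that only records which thread called it, `TEAM_SIZE` and `run_region` are names invented for the example, and the `#else` branch is a serial fallback so the code still builds when OpenMP is not enabled.

    ```c
    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch): one thread with ID 0. */
    static int omp_get_thread_num(void) { return 0; }
    #endif

    #define TEAM_SIZE 4   /* requested team size, as on the slide */

    /* Hypothetical stand-in for pooh(): mark the slot owned by this thread. */
    static void pooh(int ID, double *A) { A[ID] = 1.0; }

    /* Runs one parallel region and returns how many threads called pooh(). */
    int run_region(void)
    {
        double A[TEAM_SIZE] = {0.0};       /* a single copy of A, shared by all threads */

        #pragma omp parallel num_threads(TEAM_SIZE)
        {
            int ID = omp_get_thread_num(); /* declared inside the region, so private */
            pooh(ID, A);
        }                                  /* implicit barrier: all threads finish here */

        int called = 0;
        for (int i = 0; i < TEAM_SIZE; i++)
            if (A[i] != 0.0) called++;
        printf("all done, %d thread(s) ran pooh\n", called);
        return called;
    }
    ```

    The runtime may hand out fewer threads than requested, so the count returned by `run_region()` is between 1 and 4 rather than always 4.
    
    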

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    46/200

    OpenMP: what the compiler does

    #pragma omp parallel num_threads(4)
    {
        foobar ();
    }

    void thunk ()
    {
        foobar ();
    }

    pthread_t tid[4];
    for (int i = 1; i < 4; ++i)
        pthread_create (
            &tid[i],0,thun...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    47/200

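    Mathematically, we know that:

    \int_0^1 \frac{4.0}{1+x^2} \, dx = \pi

    We can approximate the integral as a sum of rectangles:

    \sum_{i=0}^{N} F(x_i) \, \Delta x \approx \pi

    Where each rectangle has width \Delta x and height F(x_i) at the middle of interval i.

    [Figure: plot of F(x) = 4.0/(1+x^2) for x from 0.0 to 1.0; the vertical axis runs from 0.0 to 4.0.]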

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    48/200

    Exercises 2 to 4: Serial PI Program

    static long n...
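    The listing on this slide is cut off in the transcript, so the following is a minimal serial sketch of the midpoint-rule sum described earlier, not the slide's own code. The name `serial_pi` and the step count of 100000 are assumptions made for the example.

    ```c
    #include <stdio.h>

    static long num_steps = 100000;   /* assumed step count */

    /* Midpoint-rule estimate of the integral of 4/(1+x^2) over [0,1]. */
    double serial_pi(void)
    {
        double step = 1.0 / (double) num_steps;   /* rectangle width */
        double sum = 0.0;

        for (long i = 0; i < num_steps; i++) {
            double x = (i + 0.5) * step;          /* middle of interval i */
            sum += 4.0 / (1.0 + x * x);           /* rectangle height F(x) */
        }
        return step * sum;
    }
    ```

    Calling `printf("%f\n", serial_pi());` from a `main` prints the estimate; the loop body is the part the later exercises distribute across threads.
    
    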

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    49/200

    Exercise 2

    Create a parallel version of the pi program


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    52/200

    #include <omp.h>

    void main ()
    { int i, nthreads; double pi, sum[NUM_THREADS]; step = 1...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    53/200

    Algorithm strategy: The SPMD (Single Program M...

    ...ng elements where P can be arbitrarily large.

    Use the rank (an ID ranging from 0 to (P-1)) to select between a set of tasks and to manage any shared data structures.

    This pattern is very general and has been used to support most (if not all) the algorithm strategy patterns.

    MPI programs almost always use this pattern; it is probably the most commonly used pattern in the history of parallel programming.
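    One way to apply the SPMD pattern to the pi sum is sketched below: each thread uses its ID to pick every `nthrds`-th interval and accumulates into its own array slot. `spmd_pi`, `NUM_THREADS` = 4 and the step count are assumptions of this sketch, and the `#else` stubs are a serial fallback for builds without OpenMP.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    #define NUM_THREADS 4             /* assumed team size */
    static long num_steps = 100000;   /* assumed step count */

    double spmd_pi(void)
    {
        double sum[NUM_THREADS] = {0.0};  /* scalar promoted to one slot per thread */
        double step = 1.0 / (double) num_steps;
        int nthreads = 1;

        #pragma omp parallel num_threads(NUM_THREADS)
        {
            int id = omp_get_thread_num();        /* the rank */
            int nthrds = omp_get_num_threads();
            if (id == 0) nthreads = nthrds;       /* team may be smaller than requested */

            /* cyclic distribution: thread id takes intervals id, id+nthrds, ... */
            for (long i = id; i < num_steps; i += nthrds) {
                double x = (i + 0.5) * step;
                sum[id] += 4.0 / (1.0 + x * x);
            }
        }

        double pi = 0.0;
        for (int i = 0; i < nthreads; i++)
            pi += sum[i] * step;
        return pi;
    }
    ```

    Neighbouring `sum[]` slots sit next to each other in memory, which matters for how this version scales.
    
    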

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    54/200

    Results*

    Original Serial pi program with 100000000 steps ran in 1.83 seconds.

    threads    1st SPMD
    1          1.86
    2          1.03
    3          1.08
    4          0.97

    *Intel compiler (icpc) with no optimization on Apple OS X 10... with a dual core (four HW thread) Intel® Core™ i5 processor at 1.7 GHz and 4 GByte DDR3 memory at 1.333 GHz.

    Why s...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    55/200

    If independent data elements happen to sit on the same cache line, each update will cause the cache lines to "slosh back and forth" between threads. This is called "false sharing".

    If you promote scalars to an array to support creation of an SPMD program, the array elements are contiguous in memory and hence share cache lines. This results in poor scalability.

    Solution: Pad arrays so elements you use are on distinct cache lines.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    56/200

    #include <omp.h>

    void main ()
    { int i, nthreads; double pi, sum[NUM_THREADS][PAD];
      step = 1...);
      #pragma omp parallel
      { int i, id, nthrds; double x;
        id = omp_get_thread_num();
        nthrds = omp_get_num_threads();
        if (id == 0) nthreads = nthrds;
        for (i=id, sum[id]...
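    Because the listing above is truncated, here is a hedged, self-contained sketch of the same padding idea. `padded_pi`, the team size and the step count are assumptions; `PAD` = 8 assumes 64-byte cache lines and 8-byte doubles, which will not hold on every machine.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    #define NUM_THREADS 4             /* assumed team size */
    #define PAD 8                     /* assumes 64-byte lines: 8 doubles per line */
    static long num_steps = 100000;   /* assumed step count */

    double padded_pi(void)
    {
        double sum[NUM_THREADS][PAD]; /* only column 0 of each row is used */
        double step = 1.0 / (double) num_steps;
        int nthreads = 1;

        for (int t = 0; t < NUM_THREADS; t++)
            sum[t][0] = 0.0;

        #pragma omp parallel num_threads(NUM_THREADS)
        {
            int id = omp_get_thread_num();
            int nthrds = omp_get_num_threads();
            if (id == 0) nthreads = nthrds;

            for (long i = id; i < num_steps; i += nthrds) {
                double x = (i + 0.5) * step;
                sum[id][0] += 4.0 / (1.0 + x * x);  /* each row starts PAD doubles apart */
            }
        }

        double pi = 0.0;
        for (int t = 0; t < nthreads; t++)
            pi += sum[t][0] * step;
        return pi;
    }
    ```

    The extra columns are never read; they only push each thread's accumulator a full row away from its neighbours.
    
    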

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    57/200

    Results

    Original Serial pi program with 100000000 steps ran in 1.83 seconds.

    threads    1st SPMD    1st SPMD padded
    1          1.86        1.86
    2          1.03        1.01
    3          1.08        0.69
    4          0.97        0.53

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    58/200

    Do we really need to pad o...

    ...ng arrays requires deep knowledge of the cache architecture. Move to a machine with different sized cache lines and your software performance falls apart.

    There has got to be a better way to deal with false sharing.


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    61/200

    Synchroni...

    The two most common forms of synchronization are:

    ...wait at the barrier until all threads arrive.

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    62/200

    Synchronization

    Synchronization is used to impose order constraints and to protect access to shared data.

    High level synchronization:
    critical
    atomic
    barrier
    ordered

    Low level synchronization

    Discussed later

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    63/200

    Synchronization: Barrier

    Barrier: Each thread waits until all threads arrive.

    #pragma omp parallel
    {
        int id=omp_get_thread_n...
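    A barrier matters when one phase reads what another thread wrote in the previous phase. The sketch below is an illustration under assumptions (`barrier_demo`, `TEAM_SIZE` and the arithmetic are invented for the example): every thread writes its own slot of `A`, waits, then reads its neighbour's slot.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    #define TEAM_SIZE 4   /* assumed upper bound on the team size */

    /* Returns 1 if every thread saw its neighbour's value after the barrier. */
    int barrier_demo(void)
    {
        int A[TEAM_SIZE] = {0}, B[TEAM_SIZE] = {0};
        int nthreads = 1;

        #pragma omp parallel num_threads(TEAM_SIZE)
        {
            int id = omp_get_thread_num();
            int n = omp_get_num_threads();
            if (id == 0) nthreads = n;

            A[id] = 10 * (id + 1);        /* phase 1: write my own slot */
            #pragma omp barrier           /* wait until all of A is written */
            B[id] = A[(id + 1) % n];      /* phase 2: read a neighbour's slot */
        }

        for (int i = 0; i < nthreads; i++)
            if (B[i] != 10 * (((i + 1) % nthreads) + 1))
                return 0;
        return 1;
    }
    ```

    Without the barrier a thread could read `A[(id + 1) % n]` before its neighbour had written it.
    
    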

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    64/200

    Synchronization: critical

    M...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    65/200

    Atomic provides mutual exclusion but only applies to the update of a memory location (the update of X in the following example)

    #pragma omp parallel
    {
        double tmp, B;
        B = DOIT();
        tmp = big_ugly(B);
        #pragma omp atomic
        X += tmp;
    }

    Additional forms of atomic were added in OpenMP 3.1. We will discuss these later.

    The statement inside the atomic must be one of the following forms:

    x binop= expr
    x++
    ++x
    x--
    --x

    X is an lvalue of scalar type and binop is a non-overloaded built in operator.
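    `DOIT()` and `big_ugly()` are not defined on the slide, so this sketch swaps in a simple computation to make the atomic update testable: it sums 1..n, with each thread building a private partial sum. The function name `atomic_sum` and the work split are assumptions of the example.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    /* Sums 1..n: private partial sums, combined with an atomic update. */
    long atomic_sum(int n)
    {
        long X = 0;                            /* shared accumulator */

        #pragma omp parallel
        {
            int id = omp_get_thread_num();
            int nthrds = omp_get_num_threads();
            long tmp = 0;                      /* private to each thread */

            for (int i = id + 1; i <= n; i += nthrds)
                tmp += i;

            #pragma omp atomic
            X += tmp;                          /* only this update is protected */
        }
        return X;
    }
    ```

    Each thread performs a single atomic update, so the cost of the synchronization stays small compared with the loop.
    
    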

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    66/200

    Exercise 3

    In exercise 2, you probably...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    67/200

    Recall that promoting sum to an array made the coding easy, but led to false sharing and poor performance.

    Example: Using a critical section to remove impact of false sharing

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    69/200

    #include <omp.h>

    void main ()
    { double pi; step = 1...
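    The listing is cut off after its first line, so the following is a hedged sketch of the critical-section approach rather than the slide's exact code. Each thread accumulates into a private scalar and adds it to `pi` once, inside a critical section. `critical_pi` and the step count are assumptions.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    static long num_steps = 100000;   /* assumed step count */

    double critical_pi(void)
    {
        double pi = 0.0;
        double step = 1.0 / (double) num_steps;

        #pragma omp parallel
        {
            int id = omp_get_thread_num();
            int nthrds = omp_get_num_threads();
            double sum = 0.0;                 /* private scalar: no shared array, no false sharing */

            for (long i = id; i < num_steps; i += nthrds) {
                double x = (i + 0.5) * step;
                sum += 4.0 / (1.0 + x * x);
            }

            #pragma omp critical
            pi += sum * step;                 /* one thread at a time updates pi */
        }
        return pi;
    }
    ```

    Keeping the critical section outside the loop is the design choice that matters: placing it inside would serialize every iteration.
    
    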

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    70/200

    Results

    Original Serial pi program with 100000000 steps ran in 1.83 seconds.

    threads    1st SPMD    1st SPMD padded    SPMD critical
    1          1.86        1.86               1.8...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    77/200

    ...ons. The size of the block starts large and shrinks down to size "chunk" as the calculation proceeds.

    schedule(runtime)

    Schedule and chunk size taken from the OMP_SCHEDULE environment variable (or the runtime library).

    schedule(auto)

    Schedule is left up to the runtime to choose (does not have to be any of the above).
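    A `schedule(runtime)` loop can be sketched as follows. The function name `squares` and the choice of a dynamic schedule with chunk size 4 are assumptions for illustration; `omp_set_schedule` is the OpenMP 3.0 library routine, and leaving that call out would let the `OMP_SCHEDULE` environment variable decide instead.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    /* Fills out[i] = i*i using whatever schedule is selected at run time. */
    void squares(int n, long *out)
    {
    #ifdef _OPENMP
        /* Assumed choice for this sketch: dynamic schedule, chunks of 4 iterations. */
        omp_set_schedule(omp_sched_dynamic, 4);
    #endif

        #pragma omp parallel for schedule(runtime)
        for (int i = 0; i < n; i++)
            out[i] = (long) i * i;     /* iterations are independent */
    }
    ```

    The result is the same under any schedule; only the assignment of iterations to threads changes.
    
    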


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    79/200

    ...valent

    double res...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    80/200

    Working with loops

    Basic approach

    Find comp...e loops

    Make the loop iterations independent .. So they can safely exec...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    81/200

    Nested loops

    #pragma omp parallel for collapse(2)
    for (int i=0; i<N; i++) {
        for (int j=0; j<M; j++) {
            .....
        }
    }

    Will form a single loop of length NxM and then parallelize that.

    Usef...
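    A filled-in version of the collapsed loop nest might look like this. The sizes `N` = 4 and `M` = 6, the array `grid` and the function name are assumptions chosen so the result is easy to check by hand.

    ```c
    #define N 4   /* assumed outer trip count */
    #define M 6   /* assumed inner trip count */

    /* Fills an N x M grid inside a collapsed loop nest and returns the sum of its entries. */
    long collapse_fill(void)
    {
        int grid[N][M];

        /* The two perfectly nested loops become one iteration space of N*M items. */
        #pragma omp parallel for collapse(2)
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < M; j++) {
                grid[i][j] = i * M + j;    /* each (i, j) pair is independent */
            }
        }

        long total = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < M; j++)
                total += grid[i][j];
        return total;
    }
    ```

    With only 4 outer iterations, parallelizing the outer loop alone could keep at most 4 threads busy; collapsing exposes 24 iterations.
    
    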

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    82/200

    We are combining val...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    83/200

    OpenMP red...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    84/200

    Many different associative operands can be...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    85/200

    Exercise 4: Pi with loops

    Go back to the serial pi program and parallelize it with a loop constr...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    87/200

    #include <omp.h>

    static long num_steps = 100000000;   /* step count quoted with the timings */
    double step;

    int main ()
    {
        int i; double x, pi, sum = 0.0;
        step = 1.0/(double) num_steps;
        #pragma omp parallel
        {
            double x;   /* scalar local to each thread */
            #pragma omp for reduction(+:sum)
            for (i=0; i<num_steps; i++){
                x = (i+0.5)*step;
                sum = sum + 4.0/(1.0+x*x);
            }
        }
        pi = step * sum;
        return 0;
    }

    Create a scalar local to each thread to hold the value of x at the center of each interval.

    Create a team of threads; without a parallel construct, you'll never have more than one thread.

    Break up loop iterations and assign them to threads, setting up a reduction into sum.

    Results*

    Original Serial pi program with 100000000 steps ran in 1.83 seconds.

    threads    1st SPMD    1st SPMD padded    SPMD critical    PI Loop
    1          1.86        1.86               1.87             1.91
    2          1.03        1.01               1.00             1.02
    3          1.08        0.69               0.68             0.80
    4          0.97        0.53               0.53             0.68

    *Intel compiler (icpc) with no optimization on Apple OS X 10... with a dual core (four HW thread) Intel® Core™ i5 processor at 1.7 GHz and 4 GByte DDR3 memory at 1.333 GHz.

    Parallel loops

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    90/200

    Loops (cont.)

    OpenMP 3.0 guarantees that this works, i.e. that the same schedule is used in the two loops:

    !$omp do schedule(static)
    do i=1,n
        a(i) = ....
    end do
    !$omp end do nowait
    !$omp do schedule(static)
    do i=1,n
        .... = a(i)
    end do
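    The same pattern can be written in C. This is a sketch, with `two_loops` and the array contents invented for the example. The two loops have the same trip count and the same static schedule, so the thread that reads `a[i]` in the second loop is the one that wrote it in the first, which is why the `nowait` is safe here.

    ```c
    /* First loop writes a[], second loop reads it; no barrier in between. */
    void two_loops(int n, double *a, double *b)
    {
        #pragma omp parallel
        {
            #pragma omp for schedule(static) nowait
            for (int i = 0; i < n; i++)
                a[i] = 2.0 * i;

            /* Same bounds and same static schedule: iteration i runs on the same thread. */
            #pragma omp for schedule(static)
            for (int i = 0; i < n; i++)
                b[i] = a[i] + 1.0;
        }
    }
    ```

    If the second loop read `a[i + 1]` or used a different schedule, the `nowait` would have to go.
    
    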

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    91/200

    Made schedule(runtime) more useful

    can get/set it with library routines

    omp_set_schedule()
    omp_get_schedule()

    allow implementations to implement their own schedule kinds

    Added a new schedule kind AUTO which gives full freedom to the runtime to determine the scheduling of iterations to threads.

    Allowed C++ Random access iterators as loop control variables in parallel loops

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    92/200

    Unit 1: Getting started with OpenMP

    Mod1: Introd...isited)

    Disc 3: Synchronization overhead and eliminating false sharing

    Mod 5: Parallel Loops (making the Pi program simple)

    Disc 4: Pi program wrap-...erse linked lists

    Unit 4: a few advanced OpenMP topics

    Mod 8: Tasks (linked lists the easy way)

    Disc 7: Understanding Tasks

    Mod 8: The scary stuff: memory model, atomics, and flush (pairwise synch).

    Disc 8: The pitfalls of pairwise synchronization

    Mod 9: Threadprivate Data and how to s...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    94/200

    Synchronization: Barrier

    Barrier: Each thread waits until all threads arrive.

    #pragma omp parallel shared (A, B, C) private(id)
    {
        id=omp_get_thread_n...

    implicit barrier at the end of a parallel region

    implicit barrier at the end of a for worksharing construct

    no implicit barrier due to nowait

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    95/200

    Master Constr...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    96/200

    Single worksharing Constr...

    ...es(); }
        do_many_other_things();
    }

    Sections worksharing Constr...
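    The single construct above survives only as a fragment, so this is a small hedged sketch of its behaviour rather than a copy of the slide: the block under `single` runs on exactly one thread of the team, and the others wait at its implicit barrier. `single_demo` and the counter are assumptions of the example.

    ```c
    /* Counts how many times the single block runs inside one parallel region. */
    int single_demo(void)
    {
        int exchanges = 0;

        #pragma omp parallel
        {
            /* per-thread work (the slide's do_many_things) would go here */

            #pragma omp single
            {
                exchanges++;   /* executed by one thread only, so no race */
            }                  /* implicit barrier at the end of single */

            /* per-thread work (the slide's do_many_other_things) would go here */
        }
        return exchanges;
    }
    ```

    The function returns 1 regardless of the team size.
    
    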

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    97/200

    ...ronment variable to control the size of child threads' stack

    OMP_STACKSIZE

    Also added an environment variable to hint to runtime how to treat idle threads

    OMP_WAIT_POLICY

    ACTIVE: keep threads alive at barriers/locks

    PASSIVE: try to release processor at barriers/locks

    Process binding is enabled if this variable is true, i.e. if true the runtime will not move threads around between processors.

    OMP_PROC_BIND tr...


  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    104/200

    Defa...

    ...ables are shared by default

    Global variables are SHARED among threads
    Fortran: COMMON blocks, SAVE variables, MODULE variables
    C: File scope variables, static
    Both: dynamically allocated memory (ALLOCATE, malloc, new)

    But not everything is shared...
    Stack variables in subprograms (Fortran) or functions (C) called from parallel regions are PRIVATE
    Automatic variables within a statement block are PRIVATE.

    Data sharing: Examples

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    105/200

    #include <stdio.h>

    double A[10];
    void work(int *index);   /* defined below */

    int main() {
        int index[10];
        #pragma omp parallel
            work(index);
        printf("%d\n", index[0]);
    }

    extern double A[10];
    void work(int *index) {
        double temp[10];
        static int count;
        ...
    }

    [Figure: A, index and count are drawn once, outside the threads; each thread has its own temp.]

    A, index and co...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    106/200

    ...g: Changing storage attrib...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    107/200

    ...ate Cla...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    108/200

    When is the original variable valid?

    The original variable's value is unspecified if it is referenced o...

    int tmp;
    void danger() {
        tmp = 0;
        #pragma omp parallel private(tmp)
            work();
        printf("%d\n", tmp);   /* tmp has unspecified value */
    }

    ...d work() {
        tmp = 5;               /* unspecified which copy of tmp */
    }

    Firstprivate Cla...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    109/200

    Firstprivate Cla...

    ...ariable

    C++ objects are copy-constr...

    incr = 0;
    #pragma omp parallel for firstprivate(incr)
    for (i = 0; i <= MAX; i++) {
        if ((i%2)==0) incr++;
        A[i] = incr;
    }

    Each thread gets its own copy of incr with an initial value of 0

    Lastprivate Cla...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    110/200

    Lastprivate Cla...

    ....e., for i=(n-1))

    Data Sharing:...

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    111/200

    A data environment test

    Consider this example of PRIVATE and FIRSTPRIVATE

    variables: A = 1, B = 1, C = 1
    #pragma omp parallel private(B) firstprivate(C)

    Are A, B, C local to each thread or shared inside the parallel region?

    What are their initial val...

    Inside this parallel region ...

    A is shared by all threads; equals 1

    B and C are local to each thread.

    B's initial value is undefined

    C's initial value equals 1

    Following the parallel region ...

    B and C revert to their original val...
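    The answers listed above can be checked with a short function. This is a sketch under assumptions: `data_env_test`, `TEAM_SIZE` and the `c_seen` array are invented for the example, and B is assigned before it is read because its initial value is undefined.

    ```c
    #ifdef _OPENMP
    #include <omp.h>
    #else
    /* Serial fallback (an assumption of this sketch). */
    static int omp_get_thread_num(void) { return 0; }
    static int omp_get_num_threads(void) { return 1; }
    #endif

    #define TEAM_SIZE 4   /* assumed upper bound on the team size */

    /* Returns 1 if every thread saw C == 1 on entry and A is still 1 afterwards. */
    int data_env_test(void)
    {
        int A = 1, B = 1, C = 1;
        int c_seen[TEAM_SIZE] = {0};
        int nthreads = 1;

        #pragma omp parallel private(B) firstprivate(C) num_threads(TEAM_SIZE)
        {
            int id = omp_get_thread_num();
            if (id == 0) nthreads = omp_get_num_threads();

            c_seen[id] = C;   /* firstprivate: each copy starts at the original value, 1 */
            B = id;           /* private: assign before reading */
            C += B;           /* changes this thread's copy only */
        }

        for (int i = 0; i < nthreads; i++)
            if (c_seen[i] != 1) return 0;
        return A == 1;        /* A is shared and was only read */
    }
    ```

    The check deliberately avoids reading B or C after the region, since the example is about their values inside it.
    
    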

  • 7/26/2019 Intro_To_OpenMP_Mattson.pdf

    112/200

    Data Sharing: Defa...

    ...variable in the constr...ate as if specified in a private cla...es typing

    DEFAULT(NONE)

    Data Sharing: Default Clause

              itotal = 1000
        C$OMP PARALLEL DEFAULT(PRIVATE) SHARED(itotal)
              np = omp_get_num_threads()
              each = itotal/np
        C$OMP END PARALLEL

              itotal = 1000
        C$OMP PARALLEL PRIVATE(np, each)
              np = omp_get_num_threads()
              each = itotal/np
        C$OMP END PARALLEL

    These two code fragments are equivalent.

    Exercise 5: Mandelbrot set area

    Exercise 5 (cont.)

    Once you have a working version, try to optimize the program.
    Try different schedules.


        # define NPOINTS 1000
        # define MXITR 1000
        void testpoint(void);
        struct d_complex{
           double r; double i;
        };
        struct d_complex c;
        int numoutside = 0;

        int main(){
           int i, j;
           double area, error, eps = 1.0e-5;
        #pragma omp parallel for default(shared) private(c,eps)
           for (i=0; i<NPOINTS; i++) {
              for (j=0; j<NPOINTS; j++) {
                 c.r = -2.0+2.5*(double)(i)/(double)(NPOINTS)+eps;
                 c.i = 1.125*(double)(j)/(double)(NPOINTS)+eps;
                 testpoint();
              }
           }
           area=2.0*2.5*1.125*(double)(NPOINTS*NPOINTS-numoutside)/(double)(NPOINTS*NPOINTS);
           error=area/(double)NPOINTS;
        }

        void testpoint(void){
           ...

    Find tools that work with your environment and learn to use them. A good parallel debugger can make a huge difference.
    But parallel debuggers are not portable, and you will assuredly need to debug by hand at some point.
    There are tricks to help you. The most important is to use the default(none) clause.

        #pragma omp parallel for default(none) private(c, eps)
           for (i=0; i<NPOINTS; i++) {
              for (j=0; j<NPOINTS; j++) {
                 c.r = -2.0+2.5*(double)(i)/(double)(NPOINTS)+eps;
                 ...

        # define NPOINTS 1000
        # define MXITR 1000
        struct d_complex{
           double r; double i;
        };
        void testpoint(struct d_complex);
        struct d_complex c;
        int numoutside = 0;

        int main(){
           int i, j;
           double area, error, eps = 1.0e-5;
        #pragma omp parallel for default(shared) private(c, j) \
                firstprivate(eps)
           for (i=0; i<NPOINTS; i++) {
              for (j=0; j<NPOINTS; j++) {
                 c.r = -2.0+2.5*(double)(i)/(double)(NPOINTS)+eps;
                 c.i = 1.125*(double)(j)/(double)(NPOINTS)+eps;
                 testpoint(c);
              }
           }
           area=2.0*2.5*1.125*(double)(NPOINTS*NPOINTS-numoutside)/(double)(NPOINTS*NPOINTS);
           error=area/(double)NPOINTS;
        }

        void testpoint(struct d_complex c){
           ...
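    Because the body of testpoint() is cut off in this transcript, here is a self-contained sketch of the corrected Mandelbrot area computation. It is an illustration under assumptions, not the original listing: the names escapes and mandel_area are invented, the grid size and iteration limit are passed as arguments, and a reduction on numoutside replaces the shared global counter.

```c
struct d_complex { double r; double i; };

/* Returns 1 if c escapes within max_iter iterations (outside the set). */
static int escapes(struct d_complex c, int max_iter)
{
    struct d_complex z = c;
    for (int iter = 0; iter < max_iter; iter++) {
        double temp = (z.r * z.r) - (z.i * z.i) + c.r;
        z.i = z.r * z.i * 2 + c.i;
        z.r = temp;
        if ((z.r * z.r + z.i * z.i) > 4.0) return 1;
    }
    return 0;
}

/* Estimates the area of the Mandelbrot set on an npoints x npoints grid. */
double mandel_area(int npoints, int max_iter)
{
    const double eps = 1.0e-5;
    int numoutside = 0;
    #pragma omp parallel for reduction(+:numoutside)
    for (int i = 0; i < npoints; i++) {
        for (int j = 0; j < npoints; j++) {   /* j and c are declared inside, so private */
            struct d_complex c;
            c.r = -2.0 + 2.5 * (double)i / (double)npoints + eps;
            c.i = 1.125 * (double)j / (double)npoints + eps;
            numoutside += escapes(c, max_iter);
        }
    }
    return 2.0 * 2.5 * 1.125 * (double)(npoints * npoints - numoutside)
           / (double)(npoints * npoints);
}
```

    Declaring j and c inside the loop body removes the two data-sharing mistakes of the first version without listing them in clauses.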


        #include <omp.h>
        static long num_steps = 100000; double step;
        int main ()
        {
            int i; double x, pi, sum = 0.0;
            step = 1.0/(double) num_steps;
        #pragma omp parallel for private(x) reduction(+:sum)
            for (i=0; i< num_steps; i++){     /* i private by default */
                x = (i+0.5)*step;
                sum = sum + 4.0/(1.0+x*x);
            }
            pi = step * sum;
        }

    Note: we created a parallel program without changing any executable code and by adding 2 simple lines of text!

    For good OpenMP implementations, reduction is more scalable than critical.


    To create a team of threads
        #pragma omp parallel
    To share work between threads:
        #pragma omp for
        #pragma omp single
    To prevent conflicts (prevent races)
        #pragma omp critical
        #pragma omp atomic
        #pragma omp barrier
        #pragma omp master
    Data environment clauses
        private (variable_list)
        firstprivate (variable_list)
        lastprivate (variable_list)
        reduction(+:variable_list)

    Where variable_list is a comma separated list of variables.

    Print the value of the macro _OPENMP, and its value will be yyyymm for the year and month of the spec the implementation used.
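    A short sketch of the _OPENMP check mentioned above. The function names are made up for this example; the only OpenMP feature used is the standard _OPENMP macro, which is undefined when the compiler is not in OpenMP mode.

```c
#include <stdio.h>

/* Returns the yyyymm value of _OPENMP, or 0 when compiled without OpenMP. */
long openmp_spec_date(void)
{
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}

void print_openmp_spec_date(void)
{
    long v = openmp_spec_date();
    if (v)
        printf("_OPENMP = %ld (year %ld, month %ld)\n", v, v / 100, v % 100);
    else
        printf("compiled without OpenMP support\n");
}
```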

    Consider simple list traversal

    Given what we've covered about OpenMP, how would you process this loop in parallel?

        p = head;
        while (p) {
            process(p);
            p = p->next;
        }

    Remember, the loop worksharing construct only works with loops for which the number of loop iterations can be represented by a closed-form expression at compile time. While loops are not covered.

    Exercise 6: linked lists the hard way

    Consider the program linked.c
    Traverses a linked list, computing ...


        p = head;
        while (p) {
            process(p);
            p = p->next;
        }

    OpenMP originally focused on common use cases in HPC: Fortran arrays processed over regular loops.
    Recursion and pointer chasing were so far removed from our Fortran focus that we didn't even consider more general structures.
    Hence, even a simple list traversal is exceedingly difficult with the original versions of OpenMP.

    Linked lists without tasks

        while (p != NULL) {
            p = p->next;
            count++;
        }
        ...
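    The listing above is cut off after the counting loop. One common way to finish the idea, sketched here under assumptions (the node layout, processwork body and function name are invented for illustration), is three passes: count the nodes, copy the node pointers into an array, then run a worksharing loop over the array.

```c
#include <stdlib.h>

struct node { int data; int result; struct node *next; };

/* Hypothetical per-node work. */
static void processwork(struct node *p) { p->result = p->data * p->data; }

/* Returns the number of nodes processed, or -1 if allocation fails. */
int process_list_with_array(struct node *head)
{
    int count = 0;
    for (struct node *p = head; p != NULL; p = p->next)
        count++;                                   /* pass 1: count the nodes */
    if (count == 0) return 0;

    struct node **parr = malloc((size_t)count * sizeof *parr);
    if (parr == NULL) return -1;
    struct node *p = head;
    for (int i = 0; i < count; i++) {              /* pass 2: collect pointers */
        parr[i] = p;
        p = p->next;
    }

    #pragma omp parallel for schedule(static, 1)   /* pass 3: parallel loop */
    for (int i = 0; i < count; i++)
        processwork(parr[i]);

    free(parr);
    return count;
}
```

    The loop now has a known trip count, which is what the worksharing construct requires, at the cost of two extra serial passes over the list.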

        std::vector<node *> nodelist;
        for (p = head; p != NULL; p = p->next)
            nodelist.push_back(p);

        int j = (int)nodelist.size();
        #pragma omp parallel for schedule(static,1)
        for (int i = 0; i < j; ++i)
            processwork(nodelist[i]);

    We were able to parallelize the linked list traversal, but it was ugly and required multiple passes over the data.
    To move beyond its roots in the array-based world of scientific computing ...


    Tasks are independent units of work.
    Tasks are composed of:
      code to execute
      data environment
      internal control variables (ICV)
    Threads perform the work of each task.
    The runtime system decides when tasks are executed:
      Tasks may be deferred
      Tasks may be executed immediately

    [Figure: task execution shown for the Serial and Parallel cases.]

    Definitions

    Task construct: task directive plus structured block

    Tasks are guaranteed to be complete at thread barriers:
        #pragma omp barrier
    or task barriers:
        #pragma omp taskwait

        #pragma omp parallel
        {
            #pragma omp task
            foo();                 /* multiple foo tasks created here, one for each thread */
            #pragma omp barrier    /* all foo tasks guaranteed to be completed here */
            #pragma omp single
            {
                #pragma omp task
                bar();             /* one bar task created here */
            }                      /* bar task guaranteed to be completed here */
        }

    Data Scoping with tasks: Fibonacci example.

        int fib ( int n )
        {
            int x,y;
            if ( n < 2 ) return n;
        #pragma omp task
            x = fib(n-1);
        #pragma omp task
            y = fib(n-2);
        #pragma omp taskwait
            return x+y;
        }

    n is private in both tasks.
    x is a private variable; y is a private variable.

    What's wrong here?
    A task's private variables are undefined outside the task.

    This is an instance of the divide and conquer design pattern.

    Data Scoping with tasks: Fibonacci example.

        int fib ( int n )
        {
            int x,y;
            if ( n < 2 ) return n;
        #pragma omp task shared(x)
            x = fib(n-1);
        #pragma omp task shared(y)
            y = fib(n-2);
        #pragma omp taskwait
            return x+y;
        }

    n is private in both tasks.
    x & y are shared.
    Good solution.
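    To make the shared(x)/shared(y) fix concrete, here is a small runnable sketch. The wrapper fib_parallel is an assumption added for the example: it opens the parallel region and lets one thread start the recursion so that the other threads can pick up the generated tasks.

```c
#include <assert.h>

int fib(int n)
{
    int x, y;
    assert(n >= 0);
    if (n < 2) return n;
    #pragma omp task shared(x)   /* x must be shared so the parent sees the result */
    x = fib(n - 1);
    #pragma omp task shared(y)
    y = fib(n - 2);
    #pragma omp taskwait         /* both child tasks are complete past this point */
    return x + y;
}

/* Hypothetical driver: one thread creates the root call, all threads run tasks. */
int fib_parallel(int n)
{
    int result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

    Creating a task per call is far too fine-grained for real use; the sketch only shows the data scoping.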

    Data Scoping with tasks: List Traversal example

        List ml; //my_list
        Element *e;
        #pragma omp parallel
        #pragma omp single
        {
            for(e=ml->first;e;e=e->next)
        #pragma omp task
                process(e);
        }

    What's wrong here?
    Possible data race! Shared variable e is read by the tasks while the loop updates it.

        List ml; //my_list
        Element *e;
        #pragma omp parallel
        #pragma omp single
        {
            for(e=ml->first;e;e=e->next)
        #pragma omp task firstprivate(e)
                process(e);
        }

    Good solution: e is firstprivate.

    Exercise 7: tasks in OpenMP

    Consider the program linked.c
    Traverses a linked list, computing ...


        #pragma omp parallel
        {
            #pragma omp single
            {
                node * p = head;
                while (p) {
                    #pragma omp task firstprivate(p)
                    process(p);
                    p = p->next;
                }
            }
        }

    1. Create a team of threads.
    2. One thread executes the single construct; other threads wait at the implied barrier at the end of the single construct.
    3. The single thread creates a task with its own value for the pointer p.
    4. Threads waiting at the barrier execute tasks.

    Execution moves beyond the barrier once all the tasks are complete.
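    The four steps above can be exercised with a complete function. This is a sketch: the struct item layout and the body of process are assumptions chosen so the result is easy to check.

```c
#include <stddef.h>

struct item { long value; long result; struct item *next; };

/* Hypothetical per-node work. */
static void process(struct item *p) { p->result = 2 * p->value; }

void process_list_with_tasks(struct item *head)
{
    #pragma omp parallel                 /* 1. create a team of threads */
    {
        #pragma omp single               /* 2. one thread walks the list */
        {
            struct item *p = head;
            while (p) {
                #pragma omp task firstprivate(p)   /* 3. task captures this p */
                process(p);
                p = p->next;
            }
        }   /* 4. implied barrier: waiting threads run tasks; all done past here */
    }
}
```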

    Execution of tasks

    Have potential to parallelize irregular patterns and recursive function calls.

        #pragma omp parallel
        {
            #pragma omp single
            {   // block 1
                node * p = head;
                while (p) {   // block 2
                    #pragma omp task
                    process(p);
                    p = p->next;   // block 3
                }
            }
        }

    [Figure: timeline comparing single-threaded execution of Block 1, the Block 2 tasks and Block 3 with execution on four threads (Thr 1 to Thr 4), marking idle threads and the time saved.]

    A real example: Symmetric rank-k update

    [Figure: C += A A^T with block labels A0, A1, C10, C11 and the updates Add A1 A0^T and Add A0 A0^T.]

    Note: the iteration sweeps through C and A, creating a new block of rows to be updated with new parts of A. These updates are completely independent.

    Tze Meng Low, Kent Milfeld, Robert van de Geijn, and Field Van Zee. "Parallelizing FLAME Code with OpenMP Task Queues", submitted.

        #pragma omp parallel
        {
            #pragma omp single
            {
                ...
                #pragma omp task firstprivate(...)
                ...
            } // end of task-queue
        } // end of parallel region

    [Figure: MFLOPS/sec. versus matrix dimension n (0 to 2000) for Reference, FLAME, and OpenFLAME with nth = 1, 2, 3 and 4. Top line represents peak ... (Itanium2 1.5GHz, 4CPU).]

    Note: the above graph is for the most naive way of marching through the matrices. By picking blocks dynamically, much faster ramp-up can be achieved.


    [Figure: memory model. Threads on proc1, proc2, proc3 ... procN each hold a private view of variables a and b. Source code in program order (Wa Wb Ra Rb ...) is turned by the compiler into executable code in code order (Wb Rb Wa Ra ...), and updates reach memory in commit order.]

    Consistency: Memory Access Re-ordering

    Re-ordering:
      Compiler re-orders program order to the code order
      Machine re-orders code order to the memory commit order
    At a given point in time, the private view seen by a thread may be different from the view in shared memory.
    Consistency Models define constraints on the orders of Reads (R), Writes (W) and Synchronizations (S)
      i.e. how do the values seen by a thread change as you change how ops follow (→) other ops.
      Possibilities include ...

    Sequential consistency

    OpenMP defines consistency as a variant of weak consistency:
      Can not reorder S ops with R or W ops on the same thread.
      Weak consistency guarantees ...
    The synchronization operation relevant to this discussion is flush.

    Flush
      Defines a sequence point ...

    Makes global data private to a thread
      Fortran: COMMON blocks
    Different from making them PRIVATE:
      with PRIVATE, global variables are masked.
      THREADPRIVATE preserves global scope within each thread.
    Threadprivate variables can be initialized using COPYIN or at time of definition.

    Use threadprivate to create a counter ...

        int counter = 0;
        #pragma omp threadprivate(counter)
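    A possible completion of the threadprivate counter, offered as a sketch since the rest of the listing is missing from this transcript. The checking function each_thread_counts_to is an assumption added so the per-thread behavior can be observed.

```c
static int counter = 0;
#pragma omp threadprivate(counter)

int increment_counter(void)
{
    counter++;          /* updates the calling thread's own copy */
    return counter;
}

/* Each thread resets its own copy, counts to n, and checks the result. */
int each_thread_counts_to(int n)
{
    int ok = 1;
    #pragma omp parallel reduction(&&:ok)
    {
        counter = 0;
        for (int k = 0; k < n; k++)
            increment_counter();
        ok = (counter == n);   /* no interference from other threads */
    }
    return ok;
}
```

    With a plain shared global, the increments from different threads would race and the final check could fail.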

    Used with a single region to broadcast values of privates from one member of a team to the rest of the team.

        #pragma omp parallel private (Nsize, choice)
        {
            #pragma omp single copyprivate (Nsize, choice)
                inp...
        }
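    A runnable sketch of the copyprivate broadcast. The helper read_parameters is a hypothetical stand-in for whatever fills in the values (the original call is truncated in this transcript), and the fixed values 100 and 2 exist only so the broadcast can be checked.

```c
/* Hypothetical stand-in for reading input parameters. */
static void read_parameters(int *nsize, int *choice)
{
    *nsize = 100;
    *choice = 2;
}

/* Returns 1 if every thread received the values set by the single thread. */
int broadcast_parameters(void)
{
    int nsize = 0, choice = 0;
    int ok = 1;
    #pragma omp parallel private(nsize, choice) reduction(&&:ok)
    {
        #pragma omp single copyprivate(nsize, choice)
        read_parameters(&nsize, &choice);   /* executed by one thread only */

        /* after the single, every thread's private copy holds the values */
        ok = (nsize == 100 && choice == 2);
    }
    return ok;
}
```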

    We provide three files for this exercise
      pi_mc.c: the Monte Carlo method pi program
      random.c: a simple random number generator


    [Figure: Log10 relative error versus Log10 number of samples (1 to 6) for LCG with one thread and LCG with 4 threads, trials 1, 2 and 3.]

        static long ADDEND = 150889;
        static long PMOD = 714025;
        long random_last = 0;
        #pragma omp threadprivate(random_last)
        double ...
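    A sketch of how the generator above might be completed. Two things are assumptions: the MULTIPLIER value 1366 does not appear in this transcript (it is the constant usually paired with this ADDEND and PMOD), and the function is named lcg_random here to avoid clashing with the C library's random().

```c
static const long MULTIPLIER = 1366;    /* assumed: not present in the transcript */
static const long ADDEND = 150889;
static const long PMOD = 714025;
static long random_last = 0;
#pragma omp threadprivate(random_last)   /* each thread keeps its own state */

/* Linear congruential generator returning a value in [0, 1). */
double lcg_random(void)
{
    long random_next = (MULTIPLIER * random_last + ADDEND) % PMOD;
    random_last = random_next;
    return (double)random_next / (double)PMOD;
}
```

    Making the state threadprivate removes the race on random_last, but threads that start from the same seed produce the same sequence, so a real Monte Carlo code would also need distinct, non-overlapping streams per thread.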

    [Figure: Log10 relative error versus Log10 number of samples.]
    Thread-safe version gives the same answer each time you run the program.

    Run the same program on P processing elements, where P can be arbitrarily large.
    Use the rank (an ID ranging from 0 to (P-1)) to select between a set of tasks and to manage any shared data structures.
    This pattern is very general and has been used to support most (if not all) the algorithm strategy patterns.
    MPI programs almost always use this pattern; it is probably the most commonly used pattern in the history of parallel programming.

    OpenMP Pi program: SPMD pattern

        #include <omp.h>
        int main (int argc, char *argv[])
        {
            int i;
            double pi = 0.0, step, sum = 0.0;
            step = 1.0/(double) num_steps;
            ...
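    Since the SPMD listing is cut off, here is a compact sketch of the pattern applied to the pi integration. It is an illustration, not the slide's code: the function name pi_spmd is invented, iterations are dealt out cyclically by thread ID, and the OpenMP runtime calls are guarded so the file also builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

double pi_spmd(long num_steps)
{
    double step = 1.0 / (double)num_steps;
    double pi = 0.0;
    #pragma omp parallel
    {
        int id = 0, nthrds = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();        /* rank: 0 .. P-1 */
        nthrds = omp_get_num_threads();   /* P */
#endif
        double sum = 0.0;                 /* private partial sum */
        for (long i = id; i < num_steps; i += nthrds) {
            double x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }
        #pragma omp critical              /* one thread at a time updates pi */
        pi += sum * step;
    }
    return pi;
}
```

    Every thread runs the same code; only the rank decides which iterations it handles, which is the essence of SPMD.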

    Collections of tasks are defined as iterations of one or more loops.
    Loop iterations are divided between a collection of processing elements to compute tasks in parallel.
    This design pattern is heavily used with data parallel design patterns.
    OpenMP programmers commonly use this pattern.

        #pragma omp parallel for shared(Results) schedule(dynamic)
        for(i=0;i<N;i++){
            Do_work(i, Results);
        }

    OpenMP PI Program: Loop level parallelism pattern

        #include <omp.h>
        static long num_steps = 100000; double step;
        #define NUM_THREADS 2
        int main ()
        {
            int i; double x, pi, sum = 0.0;
            ...

    A problem includes a method to divide into subproblems and a way to recombine solutions of subproblems into a global solution.
    Solution
      Define a split operation.
      Continue to split the problem until subproblems are small enough to solve directly.
      Recombine solutions to subproblems to solve original global problem.
    Note:
      Computing may occur at each phase (split, leaves, recombine).

    Divide and conquer
      Split the problem into smaller sub-problems. Continue until the sub-problems can be solved directly.
      3 Options:
        Do work as you split into sub-problems.
        Do work only at the leaves.
        Do work as you recombine.

    Program: OpenMP tasks (divide and conquer pattern)

        #include <omp.h>
        static long num_steps = 100000000;
        #define MIN_BLK 10000000

        double pi_comp(int Nstart, int Nfinish, double step)
        {
            int i, iblk;
            double x, sum = 0.0, sum1, sum2;
            if (Nfinish-Nstart < MIN_BLK){
                for (i = Nstart; i < Nfinish; i++){
                    x = (i+0.5)*step;
                    sum = sum + 4.0/(1.0+x*x);
                }
            }
            else{
                iblk = Nfinish-Nstart;
                #pragma omp task shared(sum1)
                sum1 = pi_comp(Nstart, Nfinish-iblk/2, step);
                #pragma omp task shared(sum2)
                sum2 = pi_comp(Nfinish-iblk/2, Nfinish, step);
                #pragma omp taskwait
                sum = sum1 + sum2;
            }
            return sum;
        }

        int main ()
        {
            int i;
            double step, pi, sum;
            step = 1.0/(double) num_steps;
            #pragma omp parallel
            {
                #pragma omp single
                sum = pi_comp(0, num_steps, step);
            }
            pi = step * sum;
        }

    Original serial pi program with 100000000 steps ran in 1.83 seconds.

    *Intel compiler (icpc) with no optimization on Apple OS X 10.7.3 with a dual core (four HW thread) Intel® Core™ i5 processor at 1.7 GHz and 4 GByte DDR3 memory at 1.333 GHz.

    threads   1st SPMD   SPMD critical   PI Loop   Pi tasks
    1         1.86       1.87            1.91      1.8...