From Damaris to CALCioM - Inria · ICS 2011 M. Dorier, G. Antoniu, F. Cappello, M. Snir, L. Orf....
From Damaris to CALCioM
Mitigating I/O Interference in HPC Systems
Matthieu Dorier – ENS Rennes, IRISA, Inria Rennes, KerData project team
Joint work with Rob Ross, Dries Kimpe, Gabriel Antoniu, Shadi Ibrahim
Within the Data@Exascale associated team
10th workshop of the Joint Lab for Petascale Computing, Urbana-Champaign, November 2013
Outline
• Damaris: After 3 years of collaboration…
• CALCioM: Towards cross-application coordination
Damaris after 3 years…
Overview
Originated from the “Shared Buffering System” designed in 2010 during an internship at NCSA, Damaris proposes to dedicate cores in multicore SMP nodes to data management, i.e. storage, in situ analysis and visualization.
Implementation
• Version 0.7.3 available
• http://damaris.gforge.inria.fr
• Version 1.0 for summer 2014
• 15095 lines of code
• API for C, C++ and Fortran simulations
• Easy configuration with XML
• In situ visualization with VisIt
• Python and C++ plugins
Damaris after 3 years…
People involved
Matthieu Dorier, Gabriel Antoniu, Lokman Rahmani, Roberto Sisneros, Dave Semeraro, Bob Wilhelmson, Rob Ross, Tom Peterka, Dries Kimpe, Marc Snir, Franck Cappello, Leigh Orf
Evaluated on
Blue Waters, Intrepid, Kraken, Jaguar, Grid’5000, BluePrint, Surveyor
Evaluated with
CM1, Nek5000, OLAM
Publications
• M. Dorier, advised by G. Antoniu. Damaris - Using Dedicated I/O Cores for Scalable Post-petascale HPC Simulations. ICS 2011.
• M. Dorier, G. Antoniu, F. Cappello, M. Snir, L. Orf. Damaris: How to Efficiently Leverage Multicore Parallelism to Achieve Scalable, Jitter-free I/O. In Proc. of IEEE CLUSTER 2012.
• M. Dorier, advised by G. Antoniu. Efficient I/O using Dedicated Cores in Large-Scale HPC Simulations. PhD forum of IPDPS 2013.
• M. Dorier, R. Sisneros, T. Peterka, G. Antoniu, D. Semeraro. Damaris/Viz, a Nonintrusive, Adaptable and User-Friendly In Situ Visualization Framework. In Proc. of IEEE LDAV 2013.
Mitigating I/O Interference in HPC Systems
Introduction to cross-application interference
Interference: performance degradation observed by an application in contention with other applications for the access to a shared resource.
• How often does I/O interference occur?
• What is the effect of I/O interference?
• How do we quantify and visualize it?
• How to mitigate it?
How often does I/O interference occur?
How often does interference occur?
“Intrepid has a really weird workload compared to most other systems, because of the large number of large jobs.”
Narayan Desai (ANL)
How often does interference occur?
I am an application, I start writing: what is the probability that at least one other application is also accessing the file system?
$$P(\text{another is doing I/O}) = 1 - \sum_{n=0}^{+\infty} P(X = n)\,\bigl(1 - E(\mu)\bigr)^{n}$$
where X is the number of running applications (random variable) and μ is the I/O time vs. computation time ratio of applications (random variable), assuming independence between X and μ.
On Intrepid: assuming E(μ) = 5%, P(another is doing I/O) = 64%.
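The probability formula on this slide can be evaluated numerically once a distribution for X is chosen. The sketch below is only an illustration: the function name and the example distribution are hypothetical, and the slides do not give the actual Intrepid job distribution behind the 64% figure.

```python
def prob_other_app_doing_io(x_distribution, mean_io_ratio):
    """Evaluate 1 - sum_n P(X = n) * (1 - E(mu))**n.

    x_distribution maps n (number of running applications) to P(X = n).
    mean_io_ratio is E(mu), the expected fraction of time spent in I/O.
    """
    total = sum(x_distribution.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("P(X = n) must sum to 1")
    # Probability that none of the n running applications is doing I/O,
    # averaged over the distribution of n.
    p_nobody = sum(p * (1.0 - mean_io_ratio) ** n
                   for n, p in x_distribution.items())
    return 1.0 - p_nobody


# Hypothetical distribution of X, chosen only to exercise the function.
example_distribution = {0: 0.2, 5: 0.5, 10: 0.3}
print(prob_other_app_doing_io(example_distribution, 0.05))
```

With no other application running (X = 0 with probability 1) the function returns 0, as the formula requires.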
What is the effect of I/O interference?
What is the effect of I/O interference?
IOR running on 336 cores, writing every 10 seconds in a 35-server PVFS file system on Grid’5000.
A second instance is started on 336 other cores, writing the same amount of data every 7 seconds.
I/O interference has a large impact on caching mechanisms.
How do we quantify and visualize I/O interference?
Interference factor
• The user is interested in the factor by which interference increases the I/O time:
$$I_X = \frac{T_X}{T_X(\text{alone})} > 1$$
• Considering n applications, we could (for example) want to minimize the sum of access times:
$$f = \sum_{X \in \text{app}} T_X$$
• These metrics can be adapted to anything (energy consumption, CPU cycles, etc.): f can be generalized as a metric for machine-wide efficiency.
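As a small illustration of the two metrics above, the following sketch computes the interference factor of each application and the sum of access times. The timings are hypothetical values chosen for the example, not measurements from the talk.

```python
def interference_factor(t_observed, t_alone):
    """I_X = T_X / T_X(alone); values above 1 indicate a slowdown."""
    if t_alone <= 0:
        raise ValueError("T_X(alone) must be positive")
    return t_observed / t_alone


def total_access_time(observed_times):
    """f = sum over applications X of T_X."""
    return sum(observed_times.values())


# Hypothetical I/O times in seconds (illustration only).
alone = {"A": 10.0, "B": 2.0}
observed = {"A": 12.0, "B": 4.0}

factors = {app: interference_factor(observed[app], alone[app]) for app in alone}
f = total_access_time(observed)
print(factors, f)
```

In this made-up case both applications lose the same 2 seconds, yet the smaller application sees the larger interference factor, which is why the per-application factor and the machine-wide sum can rank situations differently.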
Delta-graph
[Figure: delta-graph. Timeline labels: App A’s access, App B’s access, dt. Plot annotations: “I/O time when the application is alone”, “Performance degradation due to interference”.]
Results on Surveyor (2x2048 cores), each core writes 8 MB contiguously. The graph represents the point of view of one of the 2 applications.
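The delta-graph on this slide comes from measurements. To show the shape of the computation only, here is a toy sketch that builds a delta-graph from an assumed fair-share model, in which two overlapping accesses each progress at half speed. That model, the function names and the numbers are assumptions made for illustration; they are not the behavior observed on Surveyor.

```python
def fair_share_times(a_alone, b_alone, dt):
    """Return (T_A, T_B), the access durations seen by A and B.

    Assumed toy model: A starts at time 0 and B at time dt (dt may be
    negative); while both are active, each progresses at half speed.
    """
    if dt < 0:
        # B started first: swap roles and reuse the dt >= 0 case.
        t_b, t_a = fair_share_times(b_alone, a_alone, -dt)
        return t_a, t_b
    if dt >= a_alone:
        return a_alone, b_alone          # no overlap, no interference
    remaining_a = a_alone - dt           # work A still has when B arrives
    if remaining_a <= b_alone:
        # A finishes first, then B completes the rest alone.
        return dt + 2 * remaining_a, remaining_a + b_alone
    # B finishes first, then A completes the rest alone.
    return a_alone + b_alone, 2 * b_alone


def delta_graph(a_alone, b_alone, dts):
    """Points (dt, I_A): A's interference factor for each start offset dt."""
    return [(dt, fair_share_times(a_alone, b_alone, dt)[0] / a_alone)
            for dt in dts]


for dt, factor in delta_graph(10.0, 10.0, range(-10, 11, 5)):
    print(dt, factor)
```

Under this model the factor peaks when both applications start together and returns to 1 once the accesses no longer overlap.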
Bad luck for small applications
Experiment on Grid’5000, App B on 24 cores, App A on 744, writing 8 MB per process.
The smallest app observes up to a 14x decrease in performance! The biggest one does not even see it!
How to mitigate I/O interference? The CALCioM approach
The CALCioM architecture
[Figure: architecture diagram. Labels: App A, App B, CALCioM, I/O library, MPI I/O, Read/Write, File System, Coordination.]
Cross-Application Layer for Coordinated I/O Management
CALCioM’s API
CALCioM_Init(MPI_Comm c)
CALCioM_Prepare(MPI_Comm c, MPI_Info i)
CALCioM_Ask()
CALCioM_Check(int* status)
CALCioM_Wait()
CALCioM_Release()
CALCioM_Complete()
CALCioM_Finalize()
Possible coordination strategies
• “First come first served” (FCFS) serialization
• Interruption
[Figure: access timelines of App A and App B, offset by dt, illustrating the strategies.]
How to choose a coordination strategy
Q: Given application A with expected access time $T_A$ and application B with expected access time $T_B$, starting dt time units after application A’s access, should A be interrupted in favor of B? Or should B wait for A to terminate its access?
Example: if neither A nor B has anything else to do, optimizing global performance, i.e. minimizing an interference effect given by
$$f = \frac{T_A}{T_A(\text{alone})} + \frac{T_B}{T_B(\text{alone})}$$
tells us that B should interrupt A if and only if
$$dt < \frac{T_A(\text{alone})^2 - T_B(\text{alone})^2}{T_A(\text{alone})}$$
With $f = T_A + T_B$, the condition becomes
$$dt < T_A(\text{alone}) - T_B(\text{alone})$$
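The two conditions can be checked by comparing the metric f under both strategies directly. The sketch below assumes an idealized model that the slides do not spell out: serialization and interruption cost nothing by themselves, 0 ≤ dt < T_A(alone), and an interrupted A resumes exactly where it stopped. All function names are hypothetical.

```python
def serialize_times(ta_alone, tb_alone, dt):
    """B waits for A to finish: A is unaffected, B waits ta_alone - dt."""
    return ta_alone, (ta_alone - dt) + tb_alone


def interrupt_times(ta_alone, tb_alone, dt):
    """A is paused while B runs: B is unaffected, A is delayed by tb_alone."""
    return ta_alone + tb_alone, tb_alone


def sum_metric(ta, tb, ta_alone, tb_alone):
    """f = T_A + T_B."""
    return ta + tb


def normalized_metric(ta, tb, ta_alone, tb_alone):
    """f = T_A / T_A(alone) + T_B / T_B(alone)."""
    return ta / ta_alone + tb / tb_alone


def should_interrupt(ta_alone, tb_alone, dt, metric):
    """True if interrupting A yields a strictly smaller f than serializing B."""
    f_serialize = metric(*serialize_times(ta_alone, tb_alone, dt),
                         ta_alone, tb_alone)
    f_interrupt = metric(*interrupt_times(ta_alone, tb_alone, dt),
                         ta_alone, tb_alone)
    return f_interrupt < f_serialize


# Hypothetical access times: A needs 10 s alone, B needs 5 s alone.
for dt in (2, 6, 8):
    print(dt,
          should_interrupt(10.0, 5.0, dt, sum_metric),
          should_interrupt(10.0, 5.0, dt, normalized_metric))
```

Under these assumptions the comparison agrees with both closed-form conditions: for the sum metric, interruption wins exactly when dt < T_A(alone) − T_B(alone), and for the normalized metric when dt < (T_A(alone)² − T_B(alone)²) / T_A(alone).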
Integration in MPICH
• MPI_Init and MPI_Finalize overwritten in libcalciom.a
• MPI_File_open("myfile") → MPI_File_open("calciom:myfile")
• MPI_File_open("pvfs2:myfile") → MPI_File_open("calciom:pvfs2:myfile")
• Connection between applications: could be done through MPI_Comm_connect/accept (ideally would benefit from MPI_Comm_iconnect/iaccept) + interaction with the job scheduler
Experimental evaluation
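Example of application
2x2048 cores on Surveyor
• App A: 4 files, 4 MB per file per process, contiguous layout
• App B: 1 file, 4 MB per file per process, contiguous layout
Metric: $f = T_A + T_B$; interruption condition: $dt < T_A(\text{alone}) - T_B(\text{alone})$
[Figure panels: App B (small I/O load), App A (big I/O load)]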
Example of application
App B arrives first, App A is serialized after B.
[Figure panels: App B (small I/O load), App A (big I/O load)]
Example of application
App B arrives during the write of the first 3 files of App A. The condition indicates that A should be interrupted.
The level of interruption produces different patterns.
[Figure panels: App B (small I/O load), App A (big I/O load)]
Example of application
App B arrives during the last write of App A. The condition dictates that B is serialized after A.
[Figure panels: App B (small I/O load), App A (big I/O load)]
Synthesis
CALCioM manages to improve the computational efficiency of the set of applications by avoiding interference, and thus improves the efficiency of the entire machine.
Conclusion
Conclusion
• Interference between applications impacts system efficiency
• CALCioM:
  • Communication layer between independent applications
  • Cross-application coordination through exchange of knowledge on I/O patterns
  • Several policies implemented: FCFS, interruption
real life interference
Thank you! Questions?