(Very) Loose proposal to revamp MPI_INIT and MPI_FINALIZE
© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public
(Very) Loose Proposal to Revamp MPI_INIT and MPI_FINALIZE
These are the kinds of crazy ideas that we discuss at the MPI Forum
Jeffrey M. Squyres
Cisco Systems
23 September 2015
Before MPI-3.1, this could be erroneous
void *my_thread1_main(void *context) {
    int flag;
    MPI_Initialized(&flag);
    // …
    return NULL;
}

void *my_thread2_main(void *context) {
    int flag;
    MPI_Initialized(&flag);
    // …
    return NULL;
}

int main(int argc, char **argv) {
    MPI_Init_thread(…, MPI_THREAD_FUNNELED, …);
    pthread_create(…, my_thread1_main, NULL);
    pthread_create(…, my_thread2_main, NULL);
    // …
}

The two MPI_Initialized calls might run at the same time (!)
The MPI-3.1 solution
• MPI_INITIALIZED (and friends) are allowed to be called at any time
    …even by multiple threads
    …regardless of MPI_THREAD_* level
• This is a simple, easy-to-explain solution
    And probably what most applications do, anyway
• But many other paths were investigated
MPI_INIT / FINALIZE limitations
• Cannot call MPI_INIT more than once
• Cannot set error behavior of MPI_INIT
• Cannot re-initialize MPI after it has been finalized
• Cannot init MPI from different entities within a process without a priori knowledge / coordination

MPI Process
// Library 1
MPI_Initialized(&flag);
if (!flag) MPI_Init(…);

// Library 2
MPI_Initialized(&flag);
if (!flag) MPI_Init(…);
THIS IS INSUFFICIENT / POTENTIALLY ERRONEOUS
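To see why "check, then init" is insufficient, it helps to replay one bad interleaving by hand. The sketch below does not use MPI at all: `demo_query_initialized` and `demo_init` are hypothetical stand-ins for MPI_Initialized and MPI_Init, and the interleaving is replayed sequentially in one thread so that the outcome is deterministic. It only illustrates the ordering problem, not any real MPI implementation's behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a library's internal state; this is not MPI. */
static bool demo_initialized = false;
static int  demo_init_calls  = 0;

/* Stand-in for MPI_Initialized: reports whether init has happened yet. */
static void demo_query_initialized(int *flag) {
    *flag = demo_initialized ? 1 : 0;
}

/* Stand-in for MPI_Init: counts calls so that a double init is observable. */
static void demo_init(void) {
    demo_init_calls++;
    demo_initialized = true;
}

/* Replays one possible interleaving of two libraries that each run
 * "check, then init". Returns how many times init was invoked. */
int demo_replay_bad_interleaving(void) {
    int flag1, flag2;

    demo_initialized = false;
    demo_init_calls  = 0;

    demo_query_initialized(&flag1);   /* Library 1 checks */
    demo_query_initialized(&flag2);   /* Library 2 checks before library 1 inits */
    assert(flag1 == 0 && flag2 == 0); /* both saw "not initialized" */

    if (!flag1) demo_init();          /* Library 1 initializes */
    if (!flag2) demo_init();          /* Library 2 initializes a second time */

    return demo_init_calls;
}
```

Because the check and the init are two separate steps, both libraries can pass the check before either one initializes, and init then runs twice.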
1994 called.
They want their API design back.
What we should have
• Call MPI_INIT as many times as you like
• By whomever wants to call it

MPI Process
// Library 1
MPI_Init(…);
// Library 2
MPI_Init(…);
// Library 3
MPI_Init(…);
// Library 4
MPI_Init(…);
// Library 5
MPI_Init(…);
// Library 6
MPI_Init(…);
// Library 7
MPI_Init(…);
// Library 8
MPI_Init(…);
// Library 9
MPI_Init(…);
// Library 10
MPI_Init(…);
// Library 11
MPI_Init(…);
// Library 12
MPI_Init(…);
…but that has its own complications
Do you have to call MPI_FINALIZE exactly that many times?
Do you allow MPI_INIT after MPI_FINALIZE?
Or perhaps you only allow MPI_INIT before MPI has been finalized?
How can you tell if it’s safe to call MPI_INIT? Atomic “test-and-init”?
I IS CONFUSED
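One way to make these questions concrete is to look at a reference-counted scheme, which is just one possible answer and not something the slides propose. The sketch below assumes hypothetical functions `demo_acquire` and `demo_release` (neither is part of MPI) that perform the check and the setup under a single mutex, so the "test-and-init" is atomic. It also assumes that a release must match every acquire, and that setup may run again after the last release.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical bookkeeping; none of these names are part of MPI. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_refcount  = 0;  /* outstanding acquire calls */
static int demo_setups    = 0;  /* how many times setup actually ran */
static int demo_teardowns = 0;  /* how many times teardown actually ran */

/* Atomic "test-and-init": the check and the setup happen under one lock.
 * Returns 1 if this call performed the setup, 0 otherwise. */
int demo_acquire(void) {
    int did_setup = 0;
    pthread_mutex_lock(&demo_lock);
    if (demo_refcount == 0) {
        demo_setups++;              /* one-time setup would go here */
        did_setup = 1;
    }
    demo_refcount++;
    pthread_mutex_unlock(&demo_lock);
    return did_setup;
}

/* Matching release: only the last caller tears down.
 * Returns 1 if this call performed the teardown, 0 otherwise. */
int demo_release(void) {
    int did_teardown = 0;
    pthread_mutex_lock(&demo_lock);
    assert(demo_refcount > 0);      /* release without acquire is a caller bug */
    demo_refcount--;
    if (demo_refcount == 0) {
        demo_teardowns++;           /* teardown would go here */
        did_teardown = 1;
    }
    pthread_mutex_unlock(&demo_lock);
    return did_teardown;
}

int demo_setup_count(void)    { return demo_setups; }
int demo_teardown_count(void) { return demo_teardowns; }
```

Even this small sketch has to pick answers: it requires exactly as many releases as acquires, and it permits a fresh setup after the final teardown. Other choices are equally defensible, which is the complication the questions above point at.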
We need something new
WARNING!
The following are just (incomplete) crazy ideas
New MPI concept: a session

void *my_thread1_main(void *context) {
    MPI_Session session;
    MPI_Session_create(…, &session);

    // Do MPI things

    MPI_Session_free(&session);
    return NULL;
}

void *my_thread2_main(void *context) {
    MPI_Session session;
    MPI_Session_create(…, &session);

    // Do MPI things

    MPI_Session_free(&session);
    return NULL;
}

int main(int argc, char **argv) {
    pthread_create(…, my_thread1_main, NULL);
    pthread_create(…, my_thread2_main, NULL);
    …
}
Now featuring
100% less MPI_INIT!
Create communicators from sessions

void *my_thread1_main(void *context) {
    MPI_Session session;
    MPI_Comm comm;
    MPI_Session_create(…, &session);
    MPI_Comm_create_from_session(session, &comm);

    // Do MPI things with comm

    MPI_Comm_free(&comm);
    MPI_Session_free(&session);
    return NULL;
}

void *my_thread2_main(void *context) {
    MPI_Session session;
    MPI_Comm comm;
    MPI_Session_create(…, &session);
    MPI_Comm_create_from_session(session, &comm);

    // Do MPI things with comm

    MPI_Comm_free(&comm);
    MPI_Session_free(&session);
    return NULL;
}
Problems that sessions solve
Each entity (library?) in an OS process can have its own session
Any session-local state can be encapsulated in the handle
Entities can create / destroy sessions at any time …in any thread
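The idea of encapsulating session-local state in the handle can be illustrated without MPI. In the rough sketch below, `demo_session` and its fields are hypothetical; the slides do not say what a real session handle would contain. The point is only that each entity owns its own object, so state recorded against one session does not leak into another.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical session object; the real contents of a session handle
 * are not specified in the slides. */
typedef struct demo_session {
    int id;           /* which entity owns this session */
    int error_count;  /* an example of session-local state */
} demo_session;

/* Creates an independent session; returns NULL if allocation fails. */
demo_session *demo_session_create(int id) {
    demo_session *s = malloc(sizeof *s);
    if (s != NULL) {
        s->id = id;
        s->error_count = 0;
    }
    return s;
}

/* Records an error against this one session only. */
void demo_session_note_error(demo_session *s) {
    assert(s != NULL);
    s->error_count++;
}

/* Frees the session and clears the caller's handle so that it
 * cannot be reused by accident. */
void demo_session_free(demo_session **s) {
    assert(s != NULL);
    free(*s);
    *s = NULL;
}
```

Since there is no global state here, two libraries can create and free their sessions in any order, and from any thread, without coordinating with each other.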
…but what about MPI_COMM_WORLD?
MPI_COMM_WORLD. Sigh.
• When is MPI_COMM_WORLD created (and/or initialized)?
• When is MPI_COMM_WORLD destroyed?
• Can you use MPI_COMM_WORLD with any session?
There doesn’t seem to be an obvious relation between MPI_COMM_WORLD and individual sessions (ditto for MPI_COMM_SELF)
What if we get rid of MPI_COMM_WORLD?
Problems that solves
• Addresses logical inconsistency with session concept
• Clean separation of communicators between sub-entities
    …maybe slightly better than we have it today (sub-entities dup’ing COMM_WORLD)
• Side effects:
    Fault tolerance issues become easier
    Opens some possibilities for scalability improvements
Problems that creates
• Users will riot
…but what if they don’t?
Open questions
• What would be the forward / backward compatibility strategy?
    E.g., deprecate INIT, FINALIZE, INITIALIZED, FINALIZED…?
• What are the other arguments to MPI_SESSION_CREATE?
• Can you call both MPI_INIT and MPI_SESSION_CREATE in the same process?
• Can you do anything else with a session?
Sooo… what happens next?
Come to MPI Forum meetings
Discuss this and other scintillating MPI topics
Thank you.