Introduction Ensemble Class Argument Expansion
A Class to Manage Large Ensembles and Batch Execution in Python
PyCon Canada
Andre R. Erler
November 12th, 2016
Andre R. Erler ([email protected]) Large Ensembles and Batch Execution with Python
Outline

1. Introduction
   - Science is Repetitive
   - What I do
2. Batch Execution using an Ensemble Class
   - The Ensemble Class
   - A Helper Class
3. Argument Expansion
   - Outer Product Implementation
4. Summary & Conclusion
Science is Repetitive

To reach conclusive results, scientific experiments usually have to be repeated many times, either to establish statistical significance or to test a range of parameter values for optimization.

Experiments are planned and conducted in large batches, or so-called ensembles.

Automation

It is therefore desirable to automate the most repetitive tasks and to create tools for this purpose.
Coupling Climate Models with Hydrologic Models

[Figure: Surface temperature in a global and a nested regional climate model]

I run climate and hydrologic models to study the impact of climate change on water resources and to generate projections of future hydro-climate.

[Figure: Athabasca River watershed: groundwater depth (top) and surface water depth (bottom)]
High Performance Computing

- High-resolution climate simulations:
  - 4 days on 128 cores and 300 GB of storage per model year
  - 36 ensemble members, 15 years each
- Surface-subsurface hydrologic simulations:
  - 1 day on 2 cores per model year
  - also 15 years each, 100+ ensemble members
Motivation: Batch Processing

- In computational sciences, repetitive tasks can be automated/scripted

Boilerplate Code

Python simplifies scripting a lot, but we still have a lot of boilerplate code! This can be simplified further.

Python is an Ideal Scripting Language

    ensemble = [...]  # a list of objects ("members")

    # for loop iterating over list
    tmp = []  # store results
    for member in ensemble:  # iterate over list
        tmp.append(member.operation(*args, **kwargs))
    ensemble = tmp

    # list comprehension is already much shorter!
    ensemble = [m.operation(*args, **kwargs) for m in ensemble]
Motivation: Batch Processing

- In computational sciences, repetitive tasks can be automated/scripted

The Ensemble Class

- Emulate a container type
- Redirect method calls to ensemble members

An Ideal Use-case Example

    ensemble = Ensemble(*[...])  # create Ensemble object

    # apply member methods to entire ensemble
    ensemble = ensemble.operation_1(*args, **kwargs)
    ...
    ensemble = ensemble.operation_N(*args, **kwargs)

    member_N = ensemble[n]  # access elements by index
    member_key = ensemble[key]  # ... or by name/key
    ...
The Ensemble Class

Implementation Snippet

    class Ensemble(object):
        _members = None  # members
        ...
        def __getitem__(self, i):
            # get individual members
            if isinstance(i, int):
                # access like list/tuple
                return self._members[i]
            elif isinstance(i, str):
                ...
        def __iter__(self):
            # iterate over members
            mm = self._members
            return mm.__iter__()
        ...

Emulating the Python Container Type:

1. Support several built-in methods, such as __len__, __contains__, __iter__
2. Item access and assignment like list or dict using __getitem__ and __setitem__

Return Values

Calls to member methods return a new container or Ensemble with the results.
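The container behaviour listed above could look roughly like the following. This is a minimal sketch, not the implementation from the talk: the `Member` class and the use of a `name` attribute for key lookup are assumptions made for illustration.

```python
class Member:
    """Hypothetical ensemble member; the `name` attribute is an assumption."""

    def __init__(self, name):
        self.name = name


class Ensemble:
    """Minimal container emulation: len(), `in`, iteration, item access."""

    def __init__(self, *members):
        self._members = list(members)

    def __len__(self):
        return len(self._members)

    def __contains__(self, item):
        # accept either a member name or a member object
        if isinstance(item, str):
            return any(m.name == item for m in self._members)
        return item in self._members

    def __iter__(self):
        return iter(self._members)

    def _index(self, key):
        # translate an int or a member name into a list position
        if isinstance(key, int):
            return key
        if isinstance(key, str):
            for i, m in enumerate(self._members):
                if m.name == key:
                    return i
            raise KeyError(key)
        raise TypeError("key must be int or str")

    def __getitem__(self, key):
        return self._members[self._index(key)]

    def __setitem__(self, key, member):
        self._members[self._index(key)] = member


ens = Ensemble(Member("ctrl"), Member("warm"))
names = [m.name for m in ens]   # iteration works like a list
second = ens["warm"]            # lookup by name
ens["warm"] = Member("cold")    # replace a member by name
```

Routing both access and assignment through one `_index` helper keeps integer and name lookups consistent.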
The Ensemble Class

Implementation of Method Redirection:

1. Redirect calls to member methods/attributes by overloading __getattr__
2. Execute call on all Ensemble members
3. Return a new container or Ensemble with results

Ensemble Wrapper

Methods require the helper class EnsWrap to apply arguments.

Implementation Snippet

    class Ensemble(object):
        _members = None  # members
        ...
        def __getattr__(self, attr):
            # check if callable
            mem0 = self._members[0]
            # assuming homogeneity...
            f = getattr(mem0, attr)
            if callable(f):
                # return Ensemble Wrapper
                v = EnsWrap(self, attr)
            else:
                # just return values
                v = [getattr(m, attr)
                     for m in self._members]
            return v
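A self-contained sketch of the redirection idea follows. It is an illustration under assumptions, not the talk's code: a closure stands in for the EnsWrap helper class, and the `Run` class with its `years` attribute and `extend` method is hypothetical.

```python
class Ensemble:
    """Redirect unknown attributes and method calls to all members."""

    def __init__(self, *members):
        self._members = list(members)

    def __getattr__(self, attr):
        # __getattr__ only runs when normal lookup fails; refusing private
        # names avoids infinite recursion if _members is not set yet
        if attr.startswith("_"):
            raise AttributeError(attr)
        # assumes a non-empty, homogeneous ensemble
        first = getattr(self._members[0], attr)
        if callable(first):
            def call_all(*args, **kwargs):
                # execute the method on every member, collect the results
                results = [getattr(m, attr)(*args, **kwargs)
                           for m in self._members]
                return Ensemble(*results)
            return call_all
        # plain attribute: return the list of values
        return [getattr(m, attr) for m in self._members]


class Run:
    """Hypothetical member: a simulation with a length in years."""

    def __init__(self, years):
        self.years = years

    def extend(self, n):
        return Run(self.years + n)


ens = Ensemble(Run(10), Run(15))
longer = ens.extend(5)   # method call is applied to every member
```

Because the method returns a new Ensemble, calls can be chained, e.g. `ens.extend(5).extend(1)`.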
A Helper Class

Implementation Snippet

    class EnsWrap(object):
        ...
        def __init__(self, ens, attr):
            self._ensemble = ens  # members
            self._attr = attr  # member method
        def __call__(self, **kwargs):
            # iterate over members
            new = Ensemble()
            for m in self._ensemble:
                f = getattr(m, self._attr)
                # execute member method
                new.append(f(**kwargs))
            # return new ensemble
            return new
        ...

Implementation of the Ensemble Wrapper:

1. Initialize with ensemble members and the called attribute/method
2. Use the __call__ method to execute the member method with arguments

Parallelization

Simple parallelization using multiprocessing.Pool's apply_async can be applied.
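The parallel variant might be sketched as below. Two things here are assumptions rather than the talk's code: the helper `call_on_members` and the `Model` class are hypothetical, and the sketch uses `multiprocessing.pool.ThreadPool`, the thread-based pool with the same `apply_async` interface, so that it runs without pickling the members. With `multiprocessing.Pool` the members and their methods would have to be picklable, and the calling code would need an `if __name__ == "__main__":` guard on platforms that spawn worker processes.

```python
from multiprocessing.pool import ThreadPool


def call_on_members(members, attr, n_workers=2, **kwargs):
    """Call method `attr` on every member concurrently.

    Results are returned in member order, because each pending
    result is collected in the order it was submitted.
    """
    with ThreadPool(processes=n_workers) as pool:
        pending = [pool.apply_async(getattr(m, attr), kwds=kwargs)
                   for m in members]
        # get() blocks until the corresponding call has finished
        return [p.get() for p in pending]


class Model:
    """Hypothetical member with a single method to execute."""

    def __init__(self, name):
        self.name = name

    def run(self, years=1):
        return f"{self.name}:{years}"


results = call_on_members([Model("a"), Model("b")], "run", years=15)
```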
How can we use Ensembles with Argument Lists?

A Trivial Case

    # this defeats the purpose
    members = [member.operation(arg1=arg) for arg in arg_list]
    Ensemble(*members)  # initialize new ensemble

    # a better solution: pass list directly
    ensemble.operation(arg1=arg_list, inner_list=['arg1'])

Argument lists can easily be implemented in the __call__ method of the ensemble wrapper EnsWrap by creating a list of arguments for each member.

    # construct argument list
    args_list = expandArgList(**kwargs)
    # loop over lists
    ens = self._ensemble
    for m, args in zip(ens, args_list):
        f = getattr(m, self._attr)
        # execute member method with args
        new.append(f(**args))
How can we use Ensembles with Argument Lists?

A More Complex Case: the Outer Product List

    # again, this defeats the purpose
    arg_list = []
    for arg1 in arg_list1:  # construct arg_list from two lists
        for arg2 in arg_list2:  # i.e. all possible combinations
            arg_list.append(dict(arg1=arg1, arg2=arg2))
    # apply list to ensemble
    ensemble.operation(arg1=arg_list, inner_list=['arg1'])

    # a better solution is to expand the lists internally
    ensemble.operation(arg1=arg_list1, arg2=arg_list2,
                       outer_list=['arg1', 'arg2'])

The Outer Product expansion of multiple argument lists creates argument lists with all possible combinations of arguments. Inner Product expansion pairs the i-th elements of each list, like Python's zip function.
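The difference between the two expansions can be illustrated with the standard library. This is a sketch under assumptions: `expand_args` is a hypothetical helper, not the function from the talk, and `itertools.product` varies the last argument fastest, which may differ from the ordering produced by the talk's recursive implementation.

```python
from itertools import product


def expand_args(inner_list=(), outer_list=(), **kwargs):
    """Return a list of keyword dicts, one per call to be made."""
    names = list(inner_list) or list(outer_list)
    # arguments that are not expanded are passed unchanged to every call
    fixed = {k: v for k, v in kwargs.items() if k not in names}
    value_lists = [kwargs[n] for n in names]
    if inner_list:
        combos = zip(*value_lists)       # pair i-th elements
    else:
        combos = product(*value_lists)   # all combinations
    return [dict(fixed, **dict(zip(names, combo))) for combo in combos]


inner = expand_args(arg1=[1, 2], arg2=["a", "b"], flag=True,
                    inner_list=["arg1", "arg2"])
outer = expand_args(arg1=[1, 2], arg2=["a", "b"], flag=True,
                    outer_list=["arg1", "arg2"])
```

Two lists of length 2 give 2 argument sets under inner expansion and 4 under outer expansion.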
Argument Expansion via Outer Product

Recursive Implementation of Outer Product:

1. Separate expansion arguments from others
2. Recursively expand argument list
3. Generate argument set for each ensemble member

Decorator Class

Argument Expansion is most useful as a Decorator class.

Implementation of Recursion

    def expandArgsList(args_list,
                       exp_args, kwargs):
        # check recursion condition
        if len(exp_args) > 0:
            # expand arguments
            now_arg = exp_args.pop(0)
            new_list = []  # new arg list
            for narg in kwargs[now_arg]:
                for arg_list in args_list:
                    # extend a copy, so partial lists are not shared
                    new_list.append(arg_list + [narg])
            # next recursion level
            args_list = expandArgsList(
                new_list, exp_args, kwargs)
        ...
        # terminate: return arg lists
        return args_list
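One way the decorator idea might look is sketched below. The class name `ExpandArgs`, the `outer_list` handling and the `simulate` function are assumptions for illustration, and `itertools.product` replaces the recursive expansion for brevity.

```python
from functools import update_wrapper
from itertools import product


class ExpandArgs:
    """Decorator class (hypothetical): call the wrapped function once
    for every combination of the arguments named in `outer_list`."""

    def __init__(self, func):
        self.func = func
        update_wrapper(self, func)  # keep the name and docstring of func

    def __call__(self, outer_list=(), **kwargs):
        names = list(outer_list)
        # 1. separate expansion arguments from the others
        fixed = {k: v for k, v in kwargs.items() if k not in names}
        results = []
        # 2. expand, 3. call once per generated argument set
        for combo in product(*(kwargs[n] for n in names)):
            results.append(self.func(**fixed, **dict(zip(names, combo))))
        return results


@ExpandArgs
def simulate(resolution, scenario, years=15):
    # stand-in for an expensive model run
    return (resolution, scenario, years)


runs = simulate(resolution=[10, 30], scenario=["hist", "rcp85"],
                outer_list=["resolution", "scenario"])
```

Without `outer_list` the decorated function is called exactly once with the arguments as given, so existing call sites keep working, although the result is then wrapped in a one-element list.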
Summary & Conclusion

The Ensemble Class
- Functions like a container type and redirects calls to (parallelized) member methods

Argument Expansion
- Systematic expansion of argument lists from inner or outer product (with decorator)

Sprint Project: Publish Ensemble Class

Create a stand-alone module with the Ensemble class and the argument expansion code for others to use, and add support for array-like item access/assignment.
Thank You! ∼ Questions?