Counting Probability Convergence in a Multithreaded Counting...

44
Multithreaded Counting Scherrer et al. Introduction PDtree Data Structure Multithreaded Framework Dealing with Nondeter- minism Summary Probability Convergence in a Multithreaded Counting Application Chad Scherrer 1 Nathaniel Beagley 1 Jarek Nieplocha 1 Andrés Márquez 1 John Feo 2 Daniel Chavarría-Miranda 1 1 Pacific Northwest National Laboratory 2 Cray, Incorporated Workshop on Multithreaded Architectures and Applications March 30, 2007

Transcript of Counting Probability Convergence in a Multithreaded Counting...

Page 1: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Probability Convergence in a MultithreadedCounting Application

Chad Scherrer1 Nathaniel Beagley1 Jarek Nieplocha1

Andrés Márquez1 John Feo2 Daniel Chavarría-Miranda1

1Pacific Northwest National Laboratory

2Cray, Incorporated

Workshop on Multithreaded Architectures and ApplicationsMarch 30, 2007

Page 2: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Outline

1 Introduction

2 PDtree Data Structure

3 Multithreaded Framework

4 Dealing with Nondeterminism

Page 3: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Outline

1 Introduction

2 PDtree Data Structure

3 Multithreaded Framework

4 Dealing with Nondeterminism

Page 4: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

The Problem

We are given data in the form of a sequence of tuples,

[(a1, b1, c1) , . . . , (an, bn, cn)] .

We wish to be able to quickly answer queries of the form

count (A = a2, B = ∗, C = c17) .

Note that some variables may be unspecified.

In many modeling contexts, the queries may take on amore restricted form, e.g., at most two fixed values. Wewish to take advantage of any such structure.

Page 5: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

The Problem

We are given data in the form of a sequence of tuples,

[(a1, b1, c1) , . . . , (an, bn, cn)] .

We wish to be able to quickly answer queries of the form

count (A = a2, B = ∗, C = c17) .

Note that some variables may be unspecified.

In many modeling contexts, the queries may take on amore restricted form, e.g., at most two fixed values. Wewish to take advantage of any such structure.

Page 6: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

The Problem

We are given data in the form of a sequence of tuples,

[(a1, b1, c1) , . . . , (an, bn, cn)] .

We wish to be able to quickly answer queries of the form

count (A = a2, B = ∗, C = c17) .

Note that some variables may be unspecified.

In many modeling contexts, the queries may take on amore restricted form, e.g., at most two fixed values. Wewish to take advantage of any such structure.

Page 7: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

The Problem

We are given data in the form of a sequence of tuples,

[(a1, b1, c1) , . . . , (an, bn, cn)] .

We wish to be able to quickly answer queries of the form

count (A = a2, B = ∗, C = c17) .

Note that some variables may be unspecified.

In many modeling contexts, the queries may take on amore restricted form, e.g., at most two fixed values. Wewish to take advantage of any such structure.

Page 8: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Our Approach

Design a tree structure for storing multivariate count data,allowing a user-specified nesting.

Queries can be answered at any time as the tree ispopulated. For testing, we assume each new observationhas a corresponding set of queries.

Parallelize by breaking sequence into blocks, possiblyintroducing a race condition.

Prove a bound on the effects of the race condition thatshrink as data volume grows.

Page 9: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Our Approach

Design a tree structure for storing multivariate count data,allowing a user-specified nesting.

Queries can be answered at any time as the tree ispopulated. For testing, we assume each new observationhas a corresponding set of queries.

Parallelize by breaking sequence into blocks, possiblyintroducing a race condition.

Prove a bound on the effects of the race condition thatshrink as data volume grows.

Page 10: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Our Approach

Design a tree structure for storing multivariate count data,allowing a user-specified nesting.

Queries can be answered at any time as the tree ispopulated. For testing, we assume each new observationhas a corresponding set of queries.

Parallelize by breaking sequence into blocks, possiblyintroducing a race condition.

Prove a bound on the effects of the race condition thatshrink as data volume grows.

Page 11: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Our Approach

Design a tree structure for storing multivariate count data,allowing a user-specified nesting.

Queries can be answered at any time as the tree ispopulated. For testing, we assume each new observationhas a corresponding set of queries.

Parallelize by breaking sequence into blocks, possiblyintroducing a race condition.

Prove a bound on the effects of the race condition thatshrink as data volume grows.

Page 12: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Outline

1 Introduction

2 PDtree Data Structure

3 Multithreaded Framework

4 Dealing with Nondeterminism

Page 13: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 14: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 15: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 16: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 17: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 18: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Introducing PDtrees

An ADtree [Moore and Lee] is a nested data structure thatstores “All Dimensions”, in that counts are stored for everypossible combination of variables.Storage costs for an ADtree depend on the number ofvariables, the number of levels of each variable, and thedependence structure among the variables.The time required to populate an ADtree is linear in thenumber of observations but exponential in the number ofvariables.If this expense is unacceptable, a PDtree (for “PartialDimensions”) might be appropriate.Nesting structure is specified in an auxilliary data structurecalled a guide tree.Nesting structure can be changed without the need torecompile.

Page 19: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Building a PDtree from a Guide Tree

Efficiently storing data for a Bayesian network

Start with Bayesian network A → B → C → D.Only need to store counts for {AB, B, BC , C , CD}.This is equivalent to storing {B, C , A|B, C |B, D|C}.

Page 20: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Building a PDtree from a Guide Tree

Efficiently storing data for a Bayesian network

Start with Bayesian network A → B → C → D.Only need to store counts for {AB, B, BC , C , CD}.This is equivalent to storing {B, C , A|B, C |B, D|C}.

Page 21: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Building a PDtree from a Guide Tree

Efficiently storing data for a Bayesian network

Start with Bayesian network A → B → C → D.Only need to store counts for {AB, B, BC , C , CD}.This is equivalent to storing {B, C , A|B, C |B, D|C}.

Page 22: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Outline

1 Introduction

2 PDtree Data Structure

3 Multithreaded Framework

4 Dealing with Nondeterminism

Page 23: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

An Implementation on the Cray MTA-2

First node for each variable is implemented as an array,because all possible values will be taken on.

Lower branches are implemented as linked lists, and valuesbecome increasingly sparse.

Page 24: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

An Implementation on the Cray MTA-2

First node for each variable is implemented as an array,because all possible values will be taken on.

Lower branches are implemented as linked lists, and valuesbecome increasingly sparse.

Page 25: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 1

while true {ptr = readfe(node.next)if ptr is null

ptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)

end if} end while

Branches in a PDtree arecurrently implementedusing a linked list.Synchronized read andwrite implemented withreadfe and writeef, resp.This version is overlyserial.Critical section per linkrather than only at theend of the list.

Page 26: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 1

while true {ptr = readfe(node.next)if ptr is null

ptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)

end if} end while

Branches in a PDtree arecurrently implementedusing a linked list.Synchronized read andwrite implemented withreadfe and writeef, resp.This version is overlyserial.Critical section per linkrather than only at theend of the list.

Page 27: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 1

while true {ptr = readfe(node.next)if ptr is null

ptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)

end if} end while

Branches in a PDtree arecurrently implementedusing a linked list.Synchronized read andwrite implemented withreadfe and writeef, resp.This version is overlyserial.Critical section per linkrather than only at theend of the list.

Page 28: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 1

while true {ptr = readfe(node.next)if ptr is null

ptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)

end if} end while

Branches in a PDtree arecurrently implementedusing a linked list.Synchronized read andwrite implemented withreadfe and writeef, resp.This version is overlyserial.Critical section per linkrather than only at theend of the list.

Page 29: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 1

while true {ptr = readfe(node.next)if ptr is null

ptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)

end if} end while

Branches in a PDtree arecurrently implementedusing a linked list.Synchronized read andwrite implemented withreadfe and writeef, resp.This version is overlyserial.Critical section per linkrather than only at theend of the list.

Page 30: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 2

while true {ptr = node.nextif ptr is null

ptr = readfe(node.next)if ptr is not null then continueptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)node = ptr

end if} end while

Changes are shown in red.Test the pointer beforelocking it.Must retest after readfe incase another threadsgrabs the lock to insert anew node.This version scales linearlyup to 32 processors.

Page 31: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 2

while true {ptr = node.nextif ptr is null

ptr = readfe(node.next)if ptr is not null then continueptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)node = ptr

end if} end while

Changes are shown in red.Test the pointer beforelocking it.Must retest after readfe incase another threadsgrabs the lock to insert anew node.This version scales linearlyup to 32 processors.

Page 32: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 2

while true {ptr = node.nextif ptr is null

ptr = readfe(node.next)if ptr is not null then continueptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)node = ptr

end if} end while

Changes are shown in red.Test the pointer beforelocking it.Must retest after readfe incase another threadsgrabs the lock to insert anew node.This version scales linearlyup to 32 processors.

Page 33: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 2

while true {ptr = node.nextif ptr is null

ptr = readfe(node.next)if ptr is not null then continueptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)node = ptr

end if} end while

Changes are shown in red.Test the pointer beforelocking it.Must retest after readfe incase another threadsgrabs the lock to insert anew node.This version scales linearlyup to 32 processors.

Page 34: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Multithreaded List Insertion, Take 2

while true {ptr = node.nextif ptr is null

ptr = readfe(node.next)if ptr is not null then continueptr = memory for new nodeinitialize new nodewriteef(node.next, ptr)break

else if next node is the one I wantincrement counterwriteef(node.next, ptr)break

elsewriteef(node.next, ptr)node = ptr

end if} end while

Changes are shown in red.Test the pointer beforelocking it.Must retest after readfe incase another threadsgrabs the lock to insert anew node.This version scales linearlyup to 32 processors.

Page 35: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Outline

1 Introduction

2 PDtree Data Structure

3 Multithreaded Framework

4 Dealing with Nondeterminism

Page 36: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Sequential Vs. Parallel Counts

n Sequential Parallel, 3 threads0 − −−−1 +− −−+−2 + +− +−+−−3 + + +− +−+ +−−4 + + + +− +−+ +−+−

In general, using k threads in the parallel implementation givesa maximal count deviation of k − 1.

Page 37: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Sequential Vs. Parallel Counts

n Sequential Parallel, 3 threads0 − −−−1 +− −−+−2 + +− +−+−−3 + + +− +−+ +−−4 + + + +− +−+ +−+−

In general, using k threads in the parallel implementation givesa maximal count deviation of k − 1.

Page 38: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Maximal Count Deviation

Lemma

Let cseq (n) and cpar (n) be the number of times a particularcollection of variables takes on a specified configuration, giventhe number n of observations so far, for a sequential andparallel implementation, respectively. If the parallelimplementation uses k threads, then

|cpar (n)− cseq (n)| < k .

Now let p̂par (n) =cpar (n)

n and p̂seq (n) =cseq(n)

n be theestimated probabilities of a given value after n observations.

Page 39: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Maximal Count Deviation

Lemma

Let cseq (n) and cpar (n) be the number of times a particularcollection of variables takes on a specified configuration, giventhe number n of observations so far, for a sequential andparallel implementation, respectively. If the parallelimplementation uses k threads, then

|cpar (n)− cseq (n)| < k .

Now let p̂par (n) =cpar (n)

n and p̂seq (n) =cseq(n)

n be theestimated probabilities of a given value after n observations.

Page 40: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Probability Convergence

Theorem

For a counting application, suppose a sequential implementationis compared to a parallel implementation using k threads, andlet n be the number of observations. The estimatedprobabilities are then related by

p̂par (n) = p̂seq (n) + O(

kn

).

Proof.

Using the result from the lemma,

|p̂par (n)− p̂seq (n)| =∣∣∣∣cpar (n)

n− cseq (n)

n

∣∣∣∣ <kn.

Page 41: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Probability Convergence

Theorem

For a counting application, suppose a sequential implementationis compared to a parallel implementation using k threads, andlet n be the number of observations. The estimatedprobabilities are then related by

p̂par (n) = p̂seq (n) + O(

kn

).

Proof.

Using the result from the lemma,

|p̂par (n)− p̂seq (n)| =∣∣∣∣cpar (n)

n− cseq (n)

n

∣∣∣∣ <kn.

Page 42: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Summary

A PDtree data structure has similar benefits to an ADtree,but allows specification of the nesting structure, leading tommory savings and speed improvements.

Parallelism is easily achieved on a Cray MTA-2, but a racecondition is introduced.

The numeric effect of this race condition decays as 1n .

Page 43: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Summary

A PDtree data structure has similar benefits to an ADtree,but allows specification of the nesting structure, leading tommory savings and speed improvements.

Parallelism is easily achieved on a Cray MTA-2, but a racecondition is introduced.

The numeric effect of this race condition decays as 1n .

Page 44: Counting Probability Convergence in a Multithreaded Counting …hpc.pnl.gov/mtaap/Copy_of_mtaap_files/scherrer.pdf · 2014-02-07 · Multithreaded Counting Scherrer et al. Introduction

MultithreadedCounting

Scherrer etal.

Introduction

PDtree DataStructure

MultithreadedFramework

Dealing withNondeter-minism

Summary

Summary

A PDtree data structure has similar benefits to an ADtree,but allows specification of the nesting structure, leading tommory savings and speed improvements.

Parallelism is easily achieved on a Cray MTA-2, but a racecondition is introduced.

The numeric effect of this race condition decays as 1n .