
Parallel Processing in High Performance Computational Electromagnetics

David Abraham, under the supervision of Professor Dennis D. Giannacopoulos
Department of Electrical and Computer Engineering, McGill University, Montreal, Québec, Canada

INTRODUCTION

Modern simulation methods based on techniques such as finite element analysis have had much success performing electromagnetic modelling and simulation on traditional single-processor CPUs. However, owing to the ever-increasing complexity of modern simulations and applications, these computationally intensive methods now incur very long computation times. As such, greater focus is being given to finding ways to increase the computation speed of electromagnetic simulations. One promising approach is to parallelize the computations, taking advantage of the rapid increase in the production, popularity, and availability of multi-core processors.

OBJECTIVE

We wish to determine the effects of parallel computing on elementary finite element simulations by comparing computation times between simulations run on 1, 2, 3, and 4 cores of a modern multi-core processor.

The same basic finite element simulation was run multiple times in order to acquire the following data. To ensure accuracy and to account for computational delays caused by other processes running on the machine at the time of execution, each simulation was repeated 5 times and the results were statistically analyzed. Data were collected for each of 1, 2, 3, and 4 cores (known as "workers" in the Matlab environment) in 3 different simulation scenarios containing different numbers of unknown variables and elements. The results are summarized in the tables below.

On average, the simulations ran 2.12 times as fast on 4 cores as on 1. As expected, the more cores that were used, the faster the simulation, with the speedup being largely independent of the number of elements.
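
For reference, the "Ratio (S/M)" column in the tables that follow is the average single-core (serial) time divided by the average multi-core time, and "% Time" is its reciprocal. For example, for the 8040-potential scenario on 4 cores (values rounded):

Ratio = T_1 / T_4 = 16.13 s / 7.41 s ≈ 2.18,    % Time = T_4 / T_1 ≈ 46%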

RESULTS

Computation Time in Seconds vs. Number of Workers (Cores)

Scenario 1: 8040 potentials, 15721 elements

Cores | Trial 1   | Trial 2   | Trial 3   | Trial 4   | Trial 5   | Average    | Std. Dev.   | Ratio (S/M) | % Time
1     | 15.811531 | 16.051262 | 15.764749 | 16.608443 | 16.389314 | 16.1250598 | 0.366421853 | 1.00        | 100%
2     | 9.801491  | 10.185931 | 10.504653 | 10.250295 | 10.397885 | 10.228051  | 0.26906067  | 1.58        | 63.43%
3     | 9.082127  | 9.22324   | 9.705682  | 9.632143  | 9.501662  | 9.4289708  | 0.267281605 | 1.71        | 58.47%
4     | 7.438147  | 7.164749  | 7.404521  | 7.627576  | 7.408214  | 7.4086414  | 0.164547564 | 2.18        | 45.87%

Scenario 2: 31721 potentials, 62726 elements

Cores | Trial 1    | Trial 2    | Trial 3    | Trial 4    | Trial 5    | Average     | Std. Dev.   | Ratio (S/M) | % Time
1     | 176.885318 | 174.052936 | 178.592732 | 177.473435 | 175.431596 | 176.4872034 | 1.775131224 | 1.00        | 100%
2     | 122.885197 | 120.186661 | 122.058474 | 126.028733 | 124.249454 | 123.0817038 | 2.208818043 | 1.43        | 69.74%
3     | 103.414182 | 105.35724  | 103.801511 | 107.268522 | 107.790581 | 105.5264072 | 1.976539066 | 1.67        | 59.79%
4     | 78.870842  | 82.338225  | 80.445475  | 80.832922  | 83.103374  | 81.1181676  | 1.659362622 | 2.18        | 45.87%

Scenario 3: 126008 potentials, 250588 elements

Cores | Trial 1     | Trial 2     | Trial 3     | Trial 4     | Trial 5     | Average     | Std. Dev.   | Ratio (S/M) | % Time
1     | 2559.257279 | 2685.694734 | 2685.591752 | 2674.833031 | 2672.823466 | 2655.640052 | 54.20716589 | 1.00        | 100%
2     | 1812.530122 | 1644.850036 | 1644.980962 | 1678.549471 | 1649.60475  | 1686.103068 | 72.05175966 | 1.57        | 63.49%
3     | 1588.346659 | 1597.217871 | 1570.345639 | 1534.964378 | 1564.359257 | 1571.046761 | 24.1421111  | 1.70        | 58.91%
4     | 1237.873169 | 1333.481115 | 1322.490923 | 1390.133754 | 1377.979164 | 1332.391625 | 60.0943562  | 1.99        | 50.17%

CONCLUSION

It has been demonstrated that, even at an elementary level, finite element analysis can benefit from substantial reductions in computation time through the use of modern multi-core processors. Although only portions of the matrix algorithms were parallelized in this treatment, it is believed that even better performance could be obtained by parallelizing other steps, such as the triangulation and the solving of the linear system. The results shown here could also apply to many other programming fields where an increase in speed is desired.

FUTURE PROJECTS

Future areas of interest include exploring the possibility of cloud or cluster computing, as well as executing simulations on graphics cards.

BACKGROUND

One of the most widespread and useful techniques in electromagnetic simulation is Finite Element Analysis (FEA). This method breaks a region down into a finite number of segments or pieces known as elements (generally triangular or tetrahedral in shape), upon which an approximate solution to Maxwell's equations can be determined. Within this method there are two key steps required to complete a simulation: the first is to discretize the region into elements to allow the approximation, and the second is to solve the matrix of coefficients associated with the resulting set of linear equations in order to obtain the unknown potentials.

Although decreasing the size of the elements (and hence increasing their number) improves the accuracy of the solution, it also results in increased computation times. Since modern applications require ever larger and more accurate simulations, improving the efficiency and speed of FEA is becoming a necessity. The method of interest here is parallel computing: allowing multiple processors, or multiple cores of a single processor, to work on different parts of the simulation simultaneously. By parallelizing the most time-consuming steps, such as the triangulation of the region or the linear algebra routines, we stand to gain significant improvements depending on the number of processors or "workers" available for the computation.
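
To make the two steps concrete, the following is a minimal serial sketch in Matlab; it is illustrative only and not the author's original code. It assumes a small Laplace problem on the unit square with linear triangular elements, a 25-by-25 grid of points, and Dirichlet boundary values on the top and bottom edges.

```matlab
% Minimal serial sketch (illustrative assumptions, not the author's original
% code): Laplace's equation on the unit square with linear triangular
% elements, bottom edge held at 0 V and top edge at 1 V.
[gx, gy] = meshgrid(linspace(0, 1, 25));
pts = [gx(:), gy(:)];
dt  = delaunayTriangulation(pts);          % step 1: discretize into triangles
tri = dt.ConnectivityList;
n   = size(pts, 1);

A = sparse(n, n);                          % global coefficient matrix
for e = 1:size(tri, 1)
    idx  = tri(e, :);
    xy   = pts(idx, :);
    % Gradients of the linear shape functions and the element area
    B    = [xy(2,2)-xy(3,2), xy(3,2)-xy(1,2), xy(1,2)-xy(2,2);
            xy(3,1)-xy(2,1), xy(1,1)-xy(3,1), xy(2,1)-xy(1,1)];
    area = 0.5 * abs(det([ones(3,1), xy]));
    A(idx, idx) = A(idx, idx) + (B' * B) / (4 * area);   % element contribution
end

% Dirichlet conditions on the top and bottom edges; the side edges are left
% with natural (zero normal flux) conditions.
fixed      = pts(:,2) < eps | pts(:,2) > 1 - eps;
phi        = zeros(n, 1);
phi(fixed) = pts(fixed, 2);                % 0 V at the bottom, 1 V at the top
free       = ~fixed;
phi(free)  = A(free, free) \ (-A(free, fixed) * phi(fixed));   % step 2: solve
```

In the experiments described in the MATERIALS AND METHODS section, it is the matrix-population loop above that dominates the run time and was therefore the step chosen for parallelization.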

MATERIALS AND METHODS

In order to test the effects of parallel computing on finite element analysis simulations, a very basic finite element simulation program was written in Matlab and made to perform a variety of simulations serially (see the images in previous sections for example solutions). The time required for each simulation was measured and averaged over 5 iterations. After serial execution, the Parallel Computing Toolbox within Matlab was used to modify the code so that matrix population (the rate-determining step in this code) could be parallelized. The simulations varied in size from 15721 elements to 250588 elements and were executed on 1, 2, 3, and 4 cores of an Intel i7 Q740 processor running at 1.734 GHz.
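
As an illustration of how the matrix-population step can be parallelized with the Parallel Computing Toolbox, here is a minimal sketch using a parfor loop. It is not the author's original code: the function and variable names (assembleParallel, localStiffness, nodes, elements) are assumptions for illustration, and current Matlab releases open worker pools with parpool where the 2011-era toolbox used matlabpool.

```matlab
% Minimal sketch (illustrative, not the author's code): parallel population of
% the global FEM coefficient matrix using the Parallel Computing Toolbox.
function A = assembleParallel(nodes, elements, numWorkers)
    % nodes:    N-by-2 array of node coordinates
    % elements: M-by-3 array of node indices (linear triangular elements)
    nElem = size(elements, 1);
    nNode = size(nodes, 1);

    if isempty(gcp('nocreate'))
        parpool(numWorkers);           % start the requested pool of workers
    end

    % Each triangle contributes a 3-by-3 local matrix, stored as 9
    % (row, column, value) triplets per element.
    I = zeros(9, nElem);
    J = zeros(9, nElem);
    V = zeros(9, nElem);

    parfor e = 1:nElem                 % matrix population runs in parallel
        idx = elements(e, :);          % global node numbers of this element
        Ke  = localStiffness(nodes(idx, :));
        [jj, ii] = meshgrid(idx, idx); % ii: global row, jj: global column of each Ke entry
        I(:, e) = ii(:);
        J(:, e) = jj(:);
        V(:, e) = Ke(:);
    end

    % Accumulate the triplets into the sparse global matrix; duplicate
    % (row, column) pairs are summed automatically.
    A = sparse(I(:), J(:), V(:), nNode, nNode);
end

function Ke = localStiffness(xy)
    % Element matrix for Laplace's equation on a linear triangle;
    % xy is the 3-by-2 array of vertex coordinates.
    B    = [xy(2,2)-xy(3,2), xy(3,2)-xy(1,2), xy(1,2)-xy(2,2);
            xy(3,1)-xy(2,1), xy(1,1)-xy(3,1), xy(2,1)-xy(1,1)];
    area = 0.5 * abs(det([ones(3,1), xy]));
    Ke   = (B' * B) / (4 * area);
end
```

For a serial baseline, the same loop can simply be run as an ordinary for loop; the sparse triplet accumulation keeps the loop iterations independent, which is what allows parfor to distribute them across workers.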

ACKNOWLEDGEMENTS

I would like to thank Prof. D. Giannacopoulos for all his help and insight during the course of this project, the Faculty of Engineering at McGill for giving me the opportunity to participate, and the Natural Sciences and Engineering Research Council of Canada for providing funding.

Images:
Intel i7: "Intel core i7." Technology@Intel. Web. 4 Aug 2011. <http://blogs.intel.com/technology/2009/04/intel_unveils_new.php>
Mathworks: "Mathworks Logo." NC State University. Web. 4 Aug 2011. <http://ncsuecocar.com/?page_id=64>