The Adaptive Cross-Approximation Technique for the 3-D Boundary-Element Method.pdf



IEEE TRANSACTIONS ON MAGNETICS, VOL. 38, NO. 2, MARCH 2002 421

The Adaptive Cross-Approximation Technique for the 3-D Boundary-Element Method

Stefan Kurz, Member, IEEE, Oliver Rain, and Sergej Rjasanow

Abstract—It is well known that the classical boundary-element method (BEM) yields fully populated matrices. Their manipulation is cumbersome with respect to memory consumption and computational costs. This paper describes a novel approach where the matrices are split into collections of blocks of various sizes. Those blocks which describe remote interactions are adaptively approximated by low-rank submatrices. This procedure reduces the algorithmic complexity for matrix setup and matrix-by-vector products to approximately $O(N)$. The proposed method has been examined in a testing environment and implemented into an existing BEM-finite-element method (FEM) code for electromagnetic and electromechanical problems. The advantages of the new method are demonstrated by means of several examples.

Index Terms—Boundary-element methods, fast methods, finite-element methods.

I. INTRODUCTION

THE APPLICATION of the boundary-element method (BEM) for the solution of linear electromagnetic problems has many advantages. Only the boundaries of the considered domains need to be discretized, open boundary problems pose no additional difficulties, and problems including motion can be treated elegantly. However, application of the BEM leads to dense matrices. The storage requirements and computational costs are of $O(N^2)$, where $N$ is the number of unknowns, when a preconditioned iterative solver is applied. This means that only relatively small problems can be solved on usual PCs or workstations. One remedy could be the exploitation of the parallelism inherent to the BEM [1].

In this paper, a different approach is presented, which reduces the algorithmic complexity for matrix setup and matrix-by-vector products to approximately $O(N)$. This approach is called adaptive cross approximation (ACA) and will be explained in detail in Section II. The first part of Section III is devoted to the solution of the Laplace equation by means of the ACA-BEM. These computations have been performed in an ACA testing environment to collect information about memory requirements, compression rates, and CPU times. In a second step, the ACA algorithm has been implemented into an existing BEM-finite-element method (FEM) code for the solution of electromagnetic and electromechanical problems. The second part of Section III reports results obtained by this code in connection with ACA. A comparison to existing fast methods for the BEM can be found in Section IV.

Manuscript received July 5, 2001; revised October 25, 2001.
S. Kurz and O. Rain are with the Robert Bosch GmbH, 70049 Stuttgart, Germany (e-mail: [email protected]; [email protected]).
S. Rjasanow is with the Universität des Saarlandes, Fachbereich Mathematik, 66041 Saarbrücken, Germany (e-mail: [email protected]).
Publisher Item Identifier S 0018-9464(02)02351-8.

Fig. 1. Clustering for a simple example with ten collocation points. A large distance between two collocation points results in a large difference of the respective equation numbers.

II. THE ADAPTIVE CROSS APPROXIMATION

Large dense matrices coming from integral equations have no explicit structure in general. However, it is possible to find a permutation so that the matrix with permuted rows and columns contains rather large blocks close to some low-rank matrices [2]–[5].

To find a suitable permutation, a cluster tree is constructed by recursively partitioning the collocation points according to some geometrical criterion. A simple example of such a clustering is given in Fig. 1. A large distance between two collocation points results in a large difference of the respective equation numbers. Next, cluster pairs which are geometrically well separated are identified. They will be regarded as “admissible” cluster pairs, e.g., the clusters {1, 2, 3, 4, 5} and {8, 9, 10} in Fig. 1. The cluster tree together with the set of “admissible” cluster pairs allows the matrix to be split into a collection of blocks of various sizes. The block structure for the simple example is shown in Fig. 2. Since the off-diagonal blocks which describe remote interactions are close to some low-rank matrices, it is natural to approximate them by low-rank matrices. We are thus led to the following matrix approximation problem for the individual blocks of the given matrix.

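Given a matrix $A \in \mathbb{R}^{n \times m}$ and an accuracy $\varepsilon > 0$, find an approximant $\tilde{A} \in \mathbb{R}^{n \times m}$ with $\|A - \tilde{A}\|_F \le \varepsilon \|A\|_F$ and provide the minimal possible value for $\operatorname{rank}(\tilde{A})$.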

0018-9464/02$17.00 © 2002 IEEE


Fig. 2. The permuted matrix for the example depicted in Fig. 1 contains rather large off-diagonal blocks which describe remote interactions and which are close to some low-rank matrices.
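The clustering and block splitting described above can be illustrated with a short sketch. It is not the authors' implementation: the bisection along the longest bounding-box axis, the leaf size, and the admissibility test (smaller cluster diameter at most `eta` times the gap between the bounding boxes) are common choices assumed here, and all function names are hypothetical.

```python
import math

def bounding_box(points, idx):
    """Axis-aligned bounding box (lower corner, upper corner) of points[idx]."""
    dim = len(points[0])
    lo = [min(points[i][d] for i in idx) for d in range(dim)]
    hi = [max(points[i][d] for i in idx) for d in range(dim)]
    return lo, hi

def build_cluster_tree(points, idx, leaf_size=2):
    """Recursively bisect the index set along the longest box axis."""
    node = {"idx": idx, "box": bounding_box(points, idx), "children": []}
    if len(idx) > leaf_size:
        lo, hi = node["box"]
        axis = max(range(len(lo)), key=lambda d: hi[d] - lo[d])
        ordered = sorted(idx, key=lambda i: points[i][axis])
        half = len(ordered) // 2
        node["children"] = [
            build_cluster_tree(points, ordered[:half], leaf_size),
            build_cluster_tree(points, ordered[half:], leaf_size),
        ]
    return node

def diameter(box):
    lo, hi = box
    return math.dist(lo, hi)

def box_gap(box_a, box_b):
    """Distance between two axis-aligned boxes (0 if they touch or overlap)."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    gaps = [max(0.0, lo_b[d] - hi_a[d], lo_a[d] - hi_b[d])
            for d in range(len(lo_a))]
    return math.hypot(*gaps)

def is_admissible(a, b, eta=1.0):
    """Assumed criterion: clusters are separated and one is small relative to the gap."""
    gap = box_gap(a["box"], b["box"])
    return gap > 0.0 and min(diameter(a["box"]), diameter(b["box"])) <= eta * gap

def block_partition(a, b, eta=1.0):
    """Split the block a x b into admissible (low-rank) and dense sub-blocks."""
    if is_admissible(a, b, eta):
        return [("low-rank", a["idx"], b["idx"])]
    if not a["children"] or not b["children"]:
        return [("dense", a["idx"], b["idx"])]
    blocks = []
    for child_a in a["children"]:
        for child_b in b["children"]:
            blocks.extend(block_partition(child_a, child_b, eta))
    return blocks

# Ten collocation points on a line, loosely mirroring the ten-point example.
points = [(float(i), 0.0, 0.0) for i in range(10)]
root = build_cluster_tree(points, list(range(10)))
blocks = block_partition(root, root)
```

Each entry of `blocks` names a sub-block by its row and column index sets; only the blocks tagged `"low-rank"` would be passed to a low-rank approximation, while the `"dense"` ones are stored as they are.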

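In the approximation problem above, $\|\cdot\|_F$ denotes the Frobenius norm of the matrix. The solution of this problem is given by the singular-value decomposition (SVD) of the block

$$\tilde{A} = \sum_{i=1}^{r} \sigma_i u_i v_i^T \qquad (1)$$

where $(\sigma_i, u_i, v_i)$, $i = 1, \dots, r$, denote the $r$ greatest singular triples of the matrix $A$ and the rank $r$ is chosen so that the required accuracy of the approximation is fulfilled.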

Since the SVD requires the computation of the whole matrix in advance, and since the SVD is rather expensive in terms of numerical work (cubic in the block dimension for a square block), this analytical solution is not practicable.
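To make the choice of the rank concrete, recall the Eckart-Young result: the Frobenius error of a truncated SVD is determined by the discarded singular values alone. Writing $\sigma_1 \ge \sigma_2 \ge \dots$ for the singular values of an $n \times m$ block $A$, $\tilde{A}$ for its rank-$r$ truncation, and $\varepsilon$ for the prescribed accuracy (the symbol $p$ is introduced here only for illustration), the minimal rank can be read off from the spectrum:

```latex
\| A - \tilde{A} \|_F^2 = \sum_{i=r+1}^{p} \sigma_i^2 , \qquad
\| A \|_F^2 = \sum_{i=1}^{p} \sigma_i^2 , \qquad p = \min(n, m),
\\[4pt]
r(\varepsilon) = \min \Bigl\{ r : \sum_{i=r+1}^{p} \sigma_i^2
  \le \varepsilon^2 \sum_{i=1}^{p} \sigma_i^2 \Bigr\} .
```

As a small worked case, take singular values $1$, $10^{-2}$, $10^{-4}$ and $\varepsilon = 10^{-3}$. Truncating at $r = 1$ leaves an error of about $10^{-2}$, which is too large, whereas $r = 2$ leaves an error of $10^{-4}$, so $r(\varepsilon) = 2$.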

We now present the ACA algorithm, which makes it possible to generate only a few rows and columns of the matrix $A$ and to approximate the rest of the matrix using only this information.

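Let $S_0 = 0$, $R_0 = A$, choose a starting row index $i_0$, and for $k = 0, 1, 2, \dots$ compute

Step 1: the row $i_k$ of the error, $R_k^T e_{i_k} = A^T e_{i_k} - \sum_{l=1}^{k} (u_l)_{i_k} v_l$;

Step 2: the pivot position $j_k = \arg\max_j |(R_k)_{i_k j}|$ and the pivot element $\gamma_k = (R_k)_{i_k j_k}$;

Step 3: the normalized row $v_{k+1} = \gamma_k^{-1} R_k^T e_{i_k}$;

Step 4: the column $j_k$ of the error, $u_{k+1} = R_k e_{j_k} = A e_{j_k} - \sum_{l=1}^{k} (v_l)_{j_k} u_l$;

Step 5: the next pivot row $i_{k+1} = \arg\max_{i \ne i_k} |(u_{k+1})_i|$;

Step 6: the new approximation $S_{k+1} = S_k + u_{k+1} v_{k+1}^T$.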

This algorithm produces a sequence of decompositions of the matrix $A$ into a sum $A = S_k + R_k$, where $S_k$ is a low-rank matrix ($\operatorname{rank} S_k \le k$) and $R_k$ denotes the error of the approximation. It is important to remark that neither the matrix $A$ nor the error $R_k$ will be computed completely. In the first step of the algorithm, the row with index $i_k$ of the matrix $A$ is generated and the corresponding row of the error $R_k$ is computed. During this computation, the position $j_k$ and the value of the maximum element (in absolute value) in row $i_k$ of $R_k$ are determined (Step 2). This element is called the pivot element. In Step 3, row $i_k$ of $R_k$ is normalized by the pivot element and denoted by $v_{k+1}$. Since the position $j_k$ of the pivot element in row $i_k$ of $R_k$ is known, we are able to compute the corresponding column of this matrix and denote it by $u_{k+1}$ (Step 4). During this computation, the position of the next pivot element in column $j_k$ is fixed ($i_{k+1}$) in Step 5. The last step of the algorithm updates the approximation $S_k$ to $S_{k+1}$. Note that the approximation $S_k$ contains the exact pivot rows $i_l$ and pivot columns $j_l$ of the matrix $A$ for all $l < k$. An appropriate stopping criterion is given by

$$\|u_{k+1}\|_F \, \|v_{k+1}\|_F \le \varepsilon \, \|S_{k+1}\|_F \qquad (2)$$

Since the matrix $A$ will not be generated completely, only the norm of the approximation $S_{k+1}$ is available. This norm can be computed recursively in the following way:

$$\|S_{k+1}\|_F^2 = \|S_k\|_F^2 + 2 \sum_{l=1}^{k} \left(u_{k+1}^T u_l\right) \left(v_l^T v_{k+1}\right) + \|u_{k+1}\|_F^2 \, \|v_{k+1}\|_F^2 \qquad (3)$$

The amount of numerical work required by the ACA algorithm is $O(r^2 (n + m))$. Thus, if the numerical rank $r$ of the approximation remains constant (which is usually the case), then the total numerical work for the approximation and the memory requirements are both of the order $O(n + m)$.

Fig. 3. TEAM problem 10. An exciting coil is set between two steel channels, and a steel plate is inserted between the channels. The surfaces of this geometry have been discretized by linear triangular elements to obtain an input mesh for the ACA testing environment.
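The procedure of this section can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' code: the block is accessed only through two hypothetical callbacks `get_row` and `get_col`, the starting row is arbitrarily taken as row 0, rows already used as pivots are excluded from later pivot searches, and the stopping test compares the norm of the newest rank-one term with the recursively updated norm of the approximant.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def aca(get_row, get_col, n, m, eps=1e-4):
    """Cross approximation of an n x m block from single rows and columns.

    get_row(i) / get_col(j) return row i / column j of the block as lists,
    so the block is never assembled. Returns (us, vs) such that the block
    is approximated by the sum over l of outer(us[l], vs[l]).
    """
    us, vs = [], []
    used_rows = set()
    norm_s2 = 0.0   # squared Frobenius norm of the approximant, updated recursively
    i = 0           # starting pivot row (arbitrary choice made here)
    while len(us) < min(n, m):
        used_rows.add(i)
        # Row i of the error: original row minus contributions of earlier terms.
        row = get_row(i)
        for u, v in zip(us, vs):
            row = [r - u[i] * vj for r, vj in zip(row, v)]
        # Pivot: entry of largest modulus in that row.
        j = max(range(m), key=lambda c: abs(row[c]))
        pivot = row[j]
        if pivot != 0.0:
            v_new = [r / pivot for r in row]      # normalized error row
            # Column j of the error.
            u_new = get_col(j)
            for u, v in zip(us, vs):
                u_new = [c - v[j] * ui for c, ui in zip(u_new, u)]
            # Recursive update of the squared norm of the approximant.
            nu2, nv2 = dot(u_new, u_new), dot(v_new, v_new)
            cross = sum(dot(u_new, u) * dot(v, v_new) for u, v in zip(us, vs))
            norm_s2 += 2.0 * cross + nu2 * nv2
            us.append(u_new)
            vs.append(v_new)
            # Stop once the newest term is small relative to the approximant.
            if math.sqrt(nu2 * nv2) <= eps * math.sqrt(max(norm_s2, 0.0)):
                break
        # Next pivot row: largest entry of the newest column among unused rows.
        unused = [t for t in range(n) if t not in used_rows]
        if not unused:
            break
        i = max(unused, key=lambda t: abs(us[-1][t])) if us else unused[0]
    return us, vs

# Usage: 1/|x - y| interaction between two well-separated point clusters.
xs = [(i / 60, (7 * i % 60) / 60, (13 * i % 60) / 60) for i in range(60)]
ys = [(20 + j / 80, (11 * j % 80) / 80, (17 * j % 80) / 80) for j in range(80)]

def kernel(x, y):
    return 1.0 / math.dist(x, y)

us, vs = aca(lambda i: [kernel(xs[i], y) for y in ys],
             lambda j: [kernel(x, ys[j]) for x in xs],
             len(xs), len(ys), eps=1e-4)
```

The number of generated terms, `len(us)`, plays the role of the rank of the approximant. Only that many rows and columns of the block are ever evaluated, which is where the savings in setup time and memory come from.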

III. EXAMPLES

A. Application to the Laplace Equation

Now we apply the ACA algorithm to two mesh sequences. The aim of these computations is to examine the numerical properties of the ACA algorithm rather than to solve a technical problem. The ACA testing environment deals with an exterior Dirichlet problem for the Laplace equation $\Delta u = 0$. The considered boundary surface is discretized by linear triangular elements. The potential $u$ is represented by a single- and a double-layer potential (direct method). Nodal collocation yields a linear system whose system matrices are approximated by means of the ACA. The approximated system is solved iteratively by using the generalized minimum residual method (GMRES). In all computations, we set the accuracy $\varepsilon = 5 \cdot 10^{\dots}$.
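Within an iterative solver such as GMRES, the system matrix enters only through matrix-by-vector products. For a block stored in factored form, this product can be evaluated without forming the block, as the following minimal sketch illustrates. The function name and the list-of-vectors storage are assumptions made for illustration and are not taken from the original code.

```python
def low_rank_matvec(us, vs, x):
    """Return y = (sum over l of us[l] vs[l]^T) x in O(r (n + m)) operations.

    us and vs hold r column vectors of length n and m respectively;
    at least one term is assumed to be present.
    """
    y = [0.0] * len(us[0])
    for u, v in zip(us, vs):
        coeff = sum(vj * xj for vj, xj in zip(v, x))   # scalar v_l^T x
        for i, ui in enumerate(u):
            y[i] += coeff * ui
    return y

# Rank-2 example: the 3 x 2 block u1 v1^T + u2 v2^T applied to x = (1, 1).
us = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
vs = [[1.0, 0.0], [0.0, 2.0]]
y = low_rank_matvec(us, vs, [1.0, 1.0])   # [1.0, 4.0, 3.0]
```

A full product with the compressed matrix would apply this routine to every admissible block and an ordinary dense product to the remaining blocks, accumulating the results into the output vector.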

First, we consider the geometry of testing electromagnetic analysis methods (TEAM) problem 10 [6]. The coarsest mesh with approximately 5000 collocation points is shown in Fig. 3.

We perform two mesh refinements in order to get meshes with about 20 000 and 80 000 collocation points, respectively. This means that the memory required for a fully populated matrix, as well as the cost of the matrix-vector multiplication, would grow by a factor of 16 with each refinement step. Table I shows the memory requirements using the ACA algorithm for the three TEAM meshes. Both the actual size of the approximants and their size relative to full storage are given. Taking into account the available resources, application of the standard BEM would be possible on the coarsest mesh only.

TABLE I. MEMORY REQUIREMENTS USING THE ACA ALGORITHM

TABLE II. COMPUTATION TIMES USING THE ACA ALGORITHM

The values refer to a 1.2-GHz AMD Athlon PC. Note that the table shows the wallclock time and not the CPU time. Therefore, it includes the swap time which the computer needed during the computation for the finest mesh. Still, even the wallclock time does not grow like $O(N^2)$.

Fig. 4. Electromechanical relay. The magnetic circuit consists of a pole core, a magnetic yoke, and a movable armature. Again, the surfaces have been discretized by linear triangular elements to obtain an input mesh for the ACA testing environment.

We can observe the almost linear behavior of the memory usage. The average scaling factor of the matrix size after the first mesh refinement is equal to 6.0 and decreases to 5.2 after the second one. Thus, the growth rate of the matrix size approaches linear behavior for large $N$. Also, the time needed for an iteration step of GMRES and for generation of the approximants grows almost linearly, because the cost of the corresponding matrix-vector multiplication performed in GMRES depends directly on the matrix size. These data are given in Table II.
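The scaling figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes 8-byte real matrix entries and treats the mesh sizes as exactly 5000, 20 000, and 80 000 points, whereas the text gives them only approximately; the helper names are hypothetical.

```python
import math

BYTES_PER_ENTRY = 8  # assumption: double-precision real entries

def full_storage_gb(n):
    """Storage of a dense n x n matrix in gigabytes (1 GB = 1e9 bytes)."""
    return n * n * BYTES_PER_ENTRY / 1e9

def growth_exponent(size_ratio, refinement_ratio=4.0):
    """Exponent p in size ~ N**p implied by one refinement step."""
    return math.log(size_ratio) / math.log(refinement_ratio)

for n in (5_000, 20_000, 80_000):
    print(f"N = {n:6d}: full matrix needs about {full_storage_gb(n):5.1f} GB")

print(f"full matrix, factor 16:       p = {growth_exponent(16.0):.2f}")
print(f"ACA, first refinement (6.0):  p = {growth_exponent(6.0):.2f}")
print(f"ACA, second refinement (5.2): p = {growth_exponent(5.2):.2f}")
```

Under these assumptions a single dense matrix would occupy roughly 0.2 GB, 3.2 GB, and 51.2 GB on the three meshes, while the observed ACA scaling factors correspond to growth exponents of about 1.29 and 1.19, compared with 2 for full storage.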

The second mesh sequence is based on the geometry of an electromechanical relay as shown in Fig. 4 and explained in more detail in [1] and [7].

Again, we consider three meshes and study the behavior of the memory usage and the cost of the matrix-vector multiplication.

TABLE III. MEMORY REQUIREMENTS USING THE ACA ALGORITHM

TABLE IV. COMPUTATION TIMES USING THE ACA ALGORITHM

The values refer to a 1.2-GHz AMD Athlon PC.

The size of the approximants and their relative size are given in Table III. The average scaling factors due to the mesh refinements are 5.7 and 4.8, respectively. Thus, we again observe the asymptotically linear behavior of the memory consumption. Analogously to the first example, we give the time spent generating the approximants as well as the cost of an iteration step in Table IV.

The numerical examples above show that the memory usage of the BEM matrices computed by the ACA method grows almost linearly with the number of unknowns on the boundary. The same behavior is observed for the time needed to generate the approximants and for the matrix-vector multiplication. Hence, by using the ACA method we are able to handle BEM problems whose solution by the standard BEM would be impossible with the same resources.

B. Application to a BEM-FEM-Code

Electromagnetic devices can be analyzed by the coupled BEM-FEM method, where the conducting and magnetic parts are discretized by finite elements. In contrast, the surrounding space is described with the BEM. This discretization scheme is well suited for problems including moving parts and has been described in detail elsewhere [7]–[9].

In the air domain, the BEM is applied to solve the equation $\Delta \mathbf{A} = -\mu_0 \mathbf{J}$, where $\mathbf{A}$ is the Coulomb gauged magnetic vector potential and $\mathbf{J}$ an impressed source current density. This vector equation decouples into three scalar equations for the Cartesian components of $\mathbf{A}$, so that we are left with the same situation as in the ACA testing environment. We implemented the ACA algorithm into the BEM-FEM code and performed computations for the examples depicted in Figs. 3 and 4. However, quadratic six-noded triangles in connection with quadratic ten-noded tetrahedra have been employed for this analysis.
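The decoupling mentioned above relies on the fact that, in Cartesian coordinates, the vector Laplacian acts on each component separately. As a brief illustration, with $A_i$ and $J_i$ denoting the Cartesian components of the vector potential and of the source current density and $\mu_0$ the permeability of free space (notation assumed here for illustration):

```latex
\Delta \mathbf{A} = -\mu_0 \mathbf{J}
\quad\Longleftrightarrow\quad
\Delta A_i = -\mu_0 J_i , \qquad i \in \{x, y, z\} .
```

Each of the three scalar Poisson equations involves the same Laplace kernel, which is why the scalar setting of the testing environment carries over to the vector problem.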

TEAM problem 10 has been treated as a magnetostatic problem (for details, see [10]). The symmetry of the problem has intentionally been disregarded. Some results are collected in Table V.

The difference of the flux densities with and without ACA (0.5%) is much smaller than the difference to the measured value of 1.67 T (3.4%), which is due to the still relatively coarse mesh. However, the computer resources for ACA-BEM dropped to about half the amount needed for the standard BEM.


TABLE V. MESH AND COMPUTATIONAL DATA FOR TEAM PROBLEM 10

The values refer to a 300-MHz Sun Ultra workstation.

TABLE VI. MESH AND COMPUTATIONAL DATA FOR THE RELAY

The values refer to a 300-MHz Sun Ultra workstation.

As a final example, the closing process of the electromechanical relay has been studied, where only half of the mesh shown in Fig. 4 was considered by taking advantage of the symmetry (for details, see [1] and [7]). Results of this computation are given in Table VI.

This example requires an enormous amount of CPU time, because there are many time steps and the BEM matrices have to be reprocessed frequently due to the motion of the armature. The ACA implementation for problems with symmetry is not yet optimized. Despite that fact, the memory requirement could still be reduced to 50% of the previous value.

IV. CONCLUSION

The memory consumption of the standard BEM turns out to be the limiting factor in many practical applications. The above results show that the ACA-BEM is a feasible means to overcome these limitations.

The main advantage of the ACA method over the other fast BEM techniques (H-matrices [4], pseudoskeleton approximations [5], multipole decomposition [11]) is that only the original entries of the system matrix are used for its approximation. Thus, the already-developed procedures for generating the BEM matrices can be used after some minor modifications. The ACA algorithm is not difficult to implement, in contrast to the practical implementation of the Taylor series or spherical harmonics used in the multipole method. On the other hand, the multipole method allows the rapid computation of fields and potentials in the BEM domain once the problem has been solved [12].

The second advantage of the ACA-BEM is that any prescribed accuracy of the approximation can easily be reached. In the worst case, the whole matrix will be generated without any error. Using a sequence of less and less accurate approximations of the same coarse discretization, we are able to fix the bound of the acceptable approximation error. Then, a suitable reduction of this bound, accounting for the increased dimension of the matrix, can be used for the final computations on the fine grid.

REFERENCES

[1] V. Rischmüller, M. Haas, S. Kurz, and W. M. Rucker, “3D transient analysis of electromechanical devices using parallel BEM coupled to FEM,” IEEE Trans. Magn., vol. 36, pp. 1360–1363, July 2000.

[2] M. Bebendorf, “Approximation of boundary element matrices,” Numer. Math., vol. 86, no. 4, pp. 565–589, 2000.

[3] M. Bebendorf and S. Rjasanow, “Matrix compression for the radiation heat transfer in exhaust pipes,” in Multifield Problems, A.-M. Sändig, W. Schiehlen, and W. L. Wendland, Eds. Berlin, Germany: Springer-Verlag, 2000, pp. 183–191.

[4] W. Hackbusch, “A sparse matrix arithmetic based on H-matrices—Part I,” Computing, vol. 62, no. 2, pp. 89–108, 1999.

[5] S. A. Goreinov, E. E. Tyrtyshnikov, and N. L. Zamarashkin, “A theory of pseudoskeleton approximations,” Linear Algebra Applicat., vol. 261, pp. 1–21, 1997.

[6] T. Nakata, N. Takahashi, and K. Fujiwara, “Summary of results for benchmark problem 10 (steel plates around a coil),” COMPEL, pp. 335–344, Sept. 1992. [Online]. Available: http://ics.ec-lyon.fr/team.html.

[7] S. Kurz, U. Becker, and H. Maisch, “Dynamic simulation of electromechanical systems—From Maxwell’s theory to common rail diesel injection,” Naturwissenschaften, 2001, to be published.

[8] S. Kurz, J. Fetzer, G. Lehner, and W. M. Rucker, “A novel formulation for 3D eddy current problems with moving bodies using a Lagrangian description and BEM-FEM coupling,” IEEE Trans. Magn., vol. 34, pp. 3068–3073, Sept. 1998.

[9] S. Kurz, J. Fetzer, G. Lehner, and W. M. Rucker, “Numerical analysis of 3D eddy current problems with moving bodies using BEM-FEM coupling,” Surveys Math. Ind., vol. 9, pp. 131–150, 1999.

[10] K. Preis et al., “Numerical analysis of 3D magnetostatic fields,” IEEE Trans. Magn., vol. 27, pp. 3798–3803, Sept. 1991.

[11] V. Rokhlin, “Rapid solution of integral equations of classical potential theory,” J. Comput. Phys., vol. 60, no. 2, pp. 187–207, 1985.

[12] A. Buchau, W. Rieger, and W. M. Rucker, “Fast field computations with the fast multipole method,” COMPEL, vol. 20, no. 2, pp. 547–561, 2001.