Scalable Solution of the Linear Dynamics Problems in the...



Page 1: Scalable Solution of the Linear Dynamics Problems in the ...past.isma-isaac.be/downloads/isma2016/papers/isma2016_0563.pdf · the recently added linear dynamics features useful for

Scalable solution of the linear dynamics problems in the frequency domain

M. Belyi¹, A. Larionov¹, I. Tsukanov¹, V. Belsky¹, M. Kim¹

¹ Dassault Systèmes SIMULIA

1301 Atwood Avenue, Suite 101W, RI 02919, United States

e-mail: [email protected]

Abstract

This paper presents a snapshot of the current linear dynamics capabilities in Abaqus. We discuss some of

the recently added linear dynamics features useful for the car noise and vibration analysis. Inclusion of

nonlinear preloading and unsymmetric effects in linear dynamic simulations is illustrated with an example

of the frequency response analysis for a model of a full car with rolling tires. Parallel performance of the

Abaqus modal frequency response solver is illustrated with examples of large scale simulations.

1 Introduction

Abaqus has historically provided leading technology in nonlinear finite element analysis. In the last

decade, SIMULIA has invested significant resources to deliver efficient linear dynamics functionality that

meets our customers' current and future needs in terms of the finite element model size, modal content,

response domain size, and performance.

Abaqus linear dynamics functionality provides technology for accurate modelling of engineering problems

including advanced mechanical behavior. Linear dynamics simulations with Abaqus can capture nonlinear

effects of pre-loading, unsymmetric effects (for example, gyroscopic effects), acoustic-structural coupling,

and frequency-dependent material behavior. The modal analysis procedures are based on the high

performance Automatic MultiLevel Substructuring (AMLS) technology implemented in Abaqus that

provides an effective tool for large scale linear dynamic simulations. Parallel performance of the Abaqus

modal procedures enables effective large-scale automotive noise and vibration analysis in the so-called

“mid-frequency” range using the traditional finite element approach.

Solution of the linear dynamics problems in the frequency domain for damped finite element models of

engineering systems requires solving systems of complex linear algebraic equations with millions of

equations for many excitation frequency values. Direct or iterative solution at each frequency is not

usually feasible because of the computational cost. The mode-superposition method is widely used for solving

large-scale linear dynamics problems. It includes extraction of the natural modes of vibration and solving

the reduced system of equations of motion in the frequency domain. However, in practical engineering

simulations such reduced systems of equations can have tens of thousands of complex linear equations.

Thus, solving for many excitation frequencies still can be costly.

We present high performance algorithms used in Abaqus for solving large scale linear dynamic problems

in the frequency domain including the AMS eigensolver based on the AMLS technology and different

approaches to solving the reduced modal frequency response problems with respect to the principal modal

coordinates. We discuss high performance parallel implementation of the solution algorithms for

symmetric multiprocessing systems. Also, we discuss modal dynamic computations on graphics processing units

(GPUs). Several examples of large-scale linear dynamic simulations demonstrate performance and scaling

of the presented algorithms.


2 Maturity of the Abaqus Linear Dynamics Functionality

2.1 Brief overview

Abaqus provides state-of-the-art functionality for large-scale linear dynamic simulations and,

in particular, for large-scale automotive noise and vibration analysis.

Abaqus models for structural components can be represented in the form of finite elements, substructures,

and global matrices. Substructures and matrices can be used for data exchange in collaborative workflows.

Abaqus substructuring capabilities allow generating dynamic substructures with mixed-interface dynamic

modes including fixed-interface and free-interface substructures as particular cases. Substructures can

include damping; they can capture effects of nonlinear preloading and unsymmetric effects. The

substructure generation algorithm based on the AMS technology recently implemented in Abaqus

demonstrates unique performance and scalability. Matrix modelling capabilities allow using

subassemblies represented by global sparse or dense matrices that can be generated by Abaqus or by some

other CAE software. The matrix-based subassemblies can be instantiated and attached through interface

nodes. Overall, the matrix modelling abstraction in Abaqus is almost as flexible as substructuring.

Rich damping capabilities provided by Abaqus include material-level, element-level, and system-level

damping options. Viscous and structural (hysteretic) damping can be introduced as a material property, as

global mass- or stiffness-proportional damping, or as direct modal damping, or it can be imported in the

form of matrices. Recently, modal damping was enabled for substructures in all dynamic analysis

procedures including nonlinear dynamic simulations using substructures.

Also, Abaqus provides many features for analyzing results of the frequency response analysis. Below, we

consider two such features that can be used in the car noise and vibration simulations.

2.2 Energy calculation capabilities

Abaqus allows calculation of different energy-type variables for the obtained frequency response. The set

of energy output variables includes kinetic and potential energy for structural and acoustic domains,

energy loss from viscous and structural damping, specifically, modal or global damping contributions to

the energy loss, and many other output variables. Energy variables integrated through the whole model or

parts of the model can be viewed as X-Y plots, and contour plots of the energy densities can be visualized.

In Figure 1 we present the kinetic energy density distribution obtained from the frequency response

analysis at an excitation frequency of 6 Hz for a full car model loaded with applied accelerations (the Ford

Taurus body-in-white model is used for illustrative purposes only and is courtesy of the Public Finite

Element Model Archive of the National Crash Analysis Center at George Washington University).

Figure 1: Kinetic energy density at 6Hz

3200 PROCEEDINGS OF ISMA2016 INCLUDING USD2016


2.3 Acoustic contribution factors

In order to reduce the noise levels of the newly designed products, design engineers need to be able to

determine the noise sources, their location and contribution to the total acoustic pressure. In addition, the

design engineers also need to know which vibration frequencies contribute the most to the noise. To

provide efficient guidance to the design engineers, the Acoustic Contribution Factors feature was

implemented in Abaqus. Acoustic Contribution Factors represent partitions of the total acoustic pressure

with respect to vibration frequencies, loads, or structural components. Knowledge of the Acoustic

Contribution Factors provides design engineers with an efficient analysis tool that can:

• Identify the sources of the noise and their contribution to the acoustic pressure

• Determine which frequencies contribute the most to the noise

• Identify the location of a point where the loudest noise comes from

The new functionality comprises calculation of the Acoustic Contribution Factors in Abaqus/Standard and

a plugin for Abaqus/Viewer that supports their visualization and analysis. Currently Abaqus supports

calculation and visualization of Panel, Acoustic Modal and Structural Modal Contribution Factors. Other

Contribution Factors such as Acoustic Load, Acoustic Load Modal, and Grid Contribution Factors can be

calculated, and the plugin can be easily extended to visualize them. Figures 2 and 3 illustrate the Acoustic

Contribution Factors feature.

Figure 2: GUI for the Acoustic Contribution Factors: a panel selection

Figure 3: Acoustic Contribution Factors graphical representation.

STRUCTURAL DYNAMICS: METHODS AND CASE STUDIES 3201


3 Advanced Frequency Response Analysis with Abaqus

We present an example of the frequency response analysis including advanced mechanical behavior that is

typical for Abaqus.

3.1 Including rolling tires in the full vehicle linear dynamic simulation

In a traditional automobile noise and vibration analysis, stationary tires are defined and subjected to

vertical dynamic loading. The actual operating conditions of a tire involve rolling, however, and the

vibration characteristics of rolling tires depend on the rolling velocity and differ considerably from

those of stationary tires. Specifically, the rolling condition contributes loads in the fore-aft direction as

well as the vertical direction.

Small amplitude vibrations of a tire on the road can be treated as a linear superposition of small amplitude

steady state vibrations on a highly nonlinear base state. For stationary tires, this base state is the footprint

configuration of the tire. The base state contains nonlinearities arising from the load-deflection behavior of

various rubber compounds, contact between the tire and the road, reinforcement behavior, etc.

It is common practice to employ a mixed Eulerian-Lagrangian scheme to compute the steady state rolling

configuration of the tire. This methodology uses a reference frame that is attached to the axle of the

rotating tire. An observer in this frame sees the tire as points that do not move, although the material of

which the tire is made moves through these points. Small amplitude vibrations can then be superposed on

the rolling configuration corresponding to the velocity of interest.

The dynamic substructure of a rolling tire can be created and incorporated in a full vehicle assembly, thus

eliminating the need to use a fully meshed representation of the tire. With these modeling capabilities, tire

manufacturers can provide automotive designers with richer, more comprehensive numeric representations

of their tires’ behavior – without divulging their detailed tire FE models.

The simulation presented here demonstrates the analysis methodology for including the effect of rolling

tires on the vehicle forced frequency response. A dynamic substructure of a rolling tire FE model is

created and assembled into a full vehicle model. The AMS eigensolver is used to extract the eigensolution

of the vehicle assembly, which is subsequently used in the steady-state dynamic analyses.

The model under consideration represents a typical passenger car tire. The road is modeled as an

analytical rigid surface. Contact is defined between this surface and the tire. A simple Mooney-Rivlin law

is used for the strain energy potential of the various rubber materials. This simulation ignores the

viscoelastic nature of the rubber. The plies and belts are modeled using rebar layers embedded in the

surrounding rubber matrix. Linear elastic material properties are applied to the reinforcement fibers.

The analysis includes the following steps.

Step 1. Rim mounting and inflation analysis

Rim mounting and inflation are carried out using an axisymmetric model of the tire cross-section to save

analysis time. Axisymmetric elements with twist capture the out-of-plane deformation introduced by the

belts.

Step 2. Footprint loading

Symmetric model generation is used to revolve the axisymmetric cross-section into a three-dimensional

model, in this case with uniform mesh density around the circumference. The results from the end of the

inflation step are transferred to the three-dimensional model to act as the base state for the ensuing

footprint loading simulation. The vehicle load contribution is applied as a concentrated load to the

reference point of the road rigid surface.


Step 3. Steady state rolling analysis

The static footprint configuration acts as the base state for the subsequent steady state transport analysis at

the desired velocity. The steady state transport analysis accounts for the effect of inertial and frictional

forces. The coefficient of friction between the road and the tire is set to 1.0. The free-rolling configuration

corresponding to 50 km/h is computed. If the substructure were being developed for a stationary tire, the

steady state transport analysis would not be necessary.

Step 4. Eigenmode extraction for a tire

For the purposes of generating a substructure, the modes of the tire are extracted. For a stationary

configuration this is done after the static footprint step and for a dynamic configuration after the steady

state rolling step. Inclusion of these modes when building the substructure provides for a better dynamic

approximation. Note that the natural frequency extraction does not consider gyroscopic effects or

damping. These real modes can be either the fixed- or free-interface type. Contact conditions from the

base state are preserved in both the normal and tangential directions. Points that are slipping (equivalent

shear stress greater than critical shear stress) are free to move tangential to the contact surface, while

points that are sticking are kept fixed.

Step 5. Substructure generation

The dynamic substructure is created by retaining both nodal degrees of freedom and the

eigenmodes calculated from Step 4. The substructure’s reduced stiffness, mass, structural

damping, and viscous damping matrices are calculated and stored; these are necessary to capture

the dynamic properties at the usage level. It is necessary to use the unsymmetric solver so that

gyroscopic effects are considered.

Step 6. Eigenmode extraction for entire vehicle

The tire substructure is imported, repositioned (translated and rotated) and used in the vehicle assembly.

For the noise and vibration analysis of the large assembly model, the AMS eigensolver can be used to

efficiently extract the modes in the frequency range of interest.

Step 7. Steady state dynamic analysis (frequency response calculation)

The forced response analysis can be performed using the mode-based steady-state dynamics procedure for

the stationary tire case, or the subspace-based steady-state dynamics procedure for the rolling tire case.

The subspace projection method uses the eigenmodes extracted in the preceding frequency extraction step.

3.2 Results and Discussion

The rolling effect changes the frequency response prediction at the full-vehicle level. In

order to highlight the difference in energy transfer from the road to the vehicle body due to the rolling

effect, the stationary tire and rolling tire substructures are created based on Steps 1-5 as described above.

Road asperities act as a source of excitation for tire vibrations, which produce spindle forces that act as the

primary source of excitation for the vehicle. A unit harmonic load is applied to each of the four road

reference nodes in the vertical direction as shown in Figure 4. These loads are applied simultaneously in-

phase between left and right tires and out-of-phase between front and rear tires. Mode extraction using the

AMS eigensolver followed by an excitation frequency sweep from 1 to 200 Hz is performed for both the

stationary- and rolling-tire cases. A global structural damping factor of six percent is applied.


Figure 4: Vehicle assembly with applied harmonic load

Figures 5 and 6 compare results obtained for vehicle models with stationary and rolling tires. The solution

for a model with stationary tires is shown in blue. The red curve shows the solution for a model with

rolling tires with a travelling speed of 50 km/h.

The ground reaction force in the lateral direction and the vertical displacements at a roof node are shown in

Figures 5 and 6. The two responses highlight the mode splitting phenomenon. When symmetry causes two

modes to occur at the same frequency in an unloaded, stationary tire, the symmetry is broken when the

footprint load is applied. This causes the two modes to split slightly. When the tire is rolling, the

gyroscopic effect further splits the two symmetry-broken modes as the tire speed increases.

Figure 5: Ground reaction force in lateral direction

Figure 6: Vertical displacements at a roof node


4 Modal Frequency Response Solver

4.1 Frequency response analysis

The frequency response problem for structures simulated with the finite element method can be defined by

the following equation that has to be solved at every given frequency

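$$\left[ K - \omega^2 M + i\left(\omega C + S\right) \right] U = F, \qquad \omega = \omega_1, \omega_2, \ldots, \omega_n \tag{1}$$

Here $\omega$ is the angular frequency of the time-harmonic excitation, and $i = \sqrt{-1}$; $n$ is the number of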

excitation frequencies; M is the mass matrix which is real, symmetric and positive-semidefinite; K is

the stiffness matrix, S is the structural damping matrix, and C is the viscous damping matrix. The

matrices K , S , and C are real-valued matrices. They can be symmetric or unsymmetric depending on

the problem being solved. F is the load matrix that may depend on the excitation frequency and may be

complex. Columns of the matrix F represent complex load vectors associated with the harmonic

excitation. Each column of the complex matrix U represents response of the finite element model to the

corresponding load vector in the matrix F .

In many engineering applications problem (1) may have millions of equations, and it can be solved at

thousands of frequencies. Direct or iterative solution of problem (1) at each frequency is not usually

feasible because of the computational cost. The modal approximation is a well-established method which

is commonly used to reduce the computational cost of solving problem (1). The frequency response is

represented in the form

$$U = \Phi \bar{X} \tag{2}$$

where $\bar{X}$ is the modal frequency response, also called the generalized displacements, and $\Phi$ is the rectangular matrix of the modal subspace containing $m$ modal vectors as columns. Usually, these vectors are the free-vibration mode shapes of the finite element model, but they can be accompanied by other vectors to enhance the accuracy of the solution. The modal subspace vectors are assumed to be mass-normalized:

$$\Phi^T M \Phi = \bar{I} \tag{3}$$

where $\bar{I}$ is the $m \times m$ identity matrix. In equations (2) and (3) and below, the upper bar indicates that a matrix belongs to the modal space (has $m$ rows).

The generalized displacements $\bar{X}$ are obtained by solving the modal frequency response problem

$$\left[ \bar{K} - \omega^2 \bar{I} + i\left(\omega \bar{C} + \bar{S}\right) \right] \bar{X} = \bar{F}, \qquad \omega = \omega_1, \omega_2, \ldots, \omega_n \tag{4}$$

where $\bar{K} = \Phi^T K \Phi$, $\bar{S} = \Phi^T S \Phi$, $\bar{C} = \Phi^T C \Phi$, and $\bar{F} = \Phi^T F$.

Equation (4) is of order $m$, much smaller than the original equation (1), and as such is much cheaper to solve. However, in practical engineering simulations $m$ can be in the thousands or tens of thousands. Thus, solving equation (4) at every excitation frequency can still be very costly.
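The modal approximation (2)-(4) can be illustrated on a very small case. The sketch below is not Abaqus code; it assumes a hypothetical 2-DOF system with an identity mass matrix, structural damping proportional to stiffness, and no viscous damping, so that the projected matrices are diagonal and the mode shapes are known in closed form. All names and numerical values are illustrative assumptions. With the complete modal basis, the modal solution should reproduce the direct solution of (1).

```python
import math

# Hypothetical 2-DOF system (all values are illustrative assumptions).
M = [[1.0, 0.0], [0.0, 1.0]]      # mass matrix (identity, so unit-length modes are mass-normalized)
K = [[2.0, -1.0], [-1.0, 2.0]]    # stiffness matrix
eta = 0.06                        # structural damping factor, S = eta * K
F = [1.0, 0.0]                    # load vector
omega = 1.2                       # excitation frequency, rad/s

# Eigenpairs of K phi = lambda M phi, known in closed form for this example.
r = 1.0 / math.sqrt(2.0)
modes = [([r, r], 1.0), ([r, -r], 3.0)]   # (mode shape, eigenvalue)


def direct_response(w):
    """Solve [K - w^2 M + i S] U = F for the 2x2 system by Cramer's rule."""
    a = [[K[i][j] - w**2 * M[i][j] + 1j * eta * K[i][j] for j in range(2)]
         for i in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(F[0] * a[1][1] - a[0][1] * F[1]) / det,
            (a[0][0] * F[1] - a[1][0] * F[0]) / det]


def modal_response(w):
    """Solve the diagonal modal system (4), then map back to U with (2)."""
    U = [0j, 0j]
    for phi, lam in modes:
        f_bar = sum(p * f for p, f in zip(phi, F))      # projected load
        x_bar = f_bar / (lam - w**2 + 1j * eta * lam)   # one scalar equation per mode
        U = [u + p * x_bar for u, p in zip(U, phi)]
    return U


U_direct = direct_response(omega)
U_modal = modal_response(omega)
print(U_direct)
print(U_modal)
```

In a practical model only a subset of the modes is retained, so the two results agree only approximately; the example uses the full basis to make the equivalence exact up to round-off.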

When all the matrices $\bar{K}$, $\bar{C}$, and $\bar{S}$ are diagonal, the solution is inexpensive, since the matrix of system (4) is diagonal. We assume that at least one of these matrices is a dense square matrix (not diagonal). Solving the modal frequency response problem in the general case includes factorization of a complex dense square matrix of order $m$ at every frequency. The total number of arithmetic operations with complex numbers in the modal frequency response calculation can be estimated as

$$Z_0 = \frac{\alpha\, n\, m^3}{3} + O\left(m^2\right) \tag{5}$$

Here and below the standard “big O” notation is used, and the coefficient $\alpha$ depends on the matrix symmetry: $\alpha = 1$ if the matrix is symmetric, and $\alpha = 2$ otherwise. In estimate (5) we assume that the number of right-hand sides (load vectors) is significantly smaller than $m$.
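To give a feeling for the magnitude of estimate (5), the following sketch evaluates its leading term for a hypothetical problem size. The function name, the argument name `alpha` for the symmetry coefficient (1 for symmetric, 2 otherwise), and the chosen sizes are assumptions made for illustration only.

```python
def complex_ops_estimate(n_freq, m_modes, alpha):
    """Leading term of estimate (5): alpha * n * m^3 / 3 complex operations."""
    return alpha * n_freq * m_modes**3 / 3.0


# Hypothetical problem size: 10,000 modes and 1,000 excitation frequencies.
z_sym = complex_ops_estimate(1000, 10_000, alpha=1)
z_unsym = complex_ops_estimate(1000, 10_000, alpha=2)
print(f"symmetric:   {z_sym:.2e} complex operations")
print(f"unsymmetric: {z_unsym:.2e} complex operations")
```

The cubic dependence on the number of modes is the reason the remainder of this section concentrates on reducing the number and cost of dense factorizations.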

4.2 “Complex” solver vs. “real” solver

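Equation (4) can be written as a system of equations with respect to the real and imaginary parts of the response $\bar{X} = \bar{X}_{Re} + i \bar{X}_{Im}$:

$$\begin{bmatrix} \bar{K} - \omega^2 \bar{I} & -\left(\omega \bar{C} + \bar{S}\right) \\ \omega \bar{C} + \bar{S} & \bar{K} - \omega^2 \bar{I} \end{bmatrix} \begin{bmatrix} \bar{X}_{Re} \\ \bar{X}_{Im} \end{bmatrix} = \begin{bmatrix} \bar{F}_{Re} \\ \bar{F}_{Im} \end{bmatrix} \tag{6}$$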

The total number of arithmetic operations with the real numbers required for solving system (6) can be

estimated as

$$R_0 = \frac{8 \alpha\, n\, m^3}{3} + O\left(m^2\right) \tag{7}$$

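In many practical cases the modal stiffness matrix $\bar{K}$ is diagonal. In that case the cost of solving equation (6) can be reduced by approximately a factor of two by manipulating the left-hand-side blocks. The response can be obtained by the following computations

$$\begin{aligned} \bar{X}_{Im} &= \left[ D + \left(\omega \bar{C} + \bar{S}\right) D^{-1} \left(\omega \bar{C} + \bar{S}\right) \right]^{-1} \left( \bar{F}_{Im} - \left(\omega \bar{C} + \bar{S}\right) D^{-1} \bar{F}_{Re} \right), \\ \bar{X}_{Re} &= D^{-1} \left(\omega \bar{C} + \bar{S}\right) \bar{X}_{Im} + D^{-1} \bar{F}_{Re} \end{aligned} \tag{8}$$

where the matrix $D = \bar{K} - \omega^2 \bar{I}$ is diagonal. The $m^3$-proportional matrix operations include one real $m \times m$ matrix multiplication and one matrix factorization. Therefore, the computational work can be estimated (in real floating point operations) as

$$R_1 = \frac{4 \alpha\, n\, m^3}{3} + O\left(m^2\right) \tag{9}$$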
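The block elimination behind technique (8) can be checked numerically on a small case. The sketch below is an illustration rather than the Abaqus implementation: it assumes a hypothetical two-mode problem with a diagonal modal stiffness and dense modal damping matrices, uses a plain Gaussian elimination helper (`solve`, written here only for the example) in place of an optimized factorization, and requires that the diagonal matrix D is non-singular at the chosen frequency.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(A)
    aug = [list(row) + [b[i]] for i, row in enumerate(A)]   # augmented copy of the inputs
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


# Hypothetical two-mode data (illustrative values only).
omega = 2.0
k_diag = [1.0, 9.0]                    # diagonal modal stiffness
C = [[0.10, 0.02], [0.03, 0.20]]       # dense modal viscous damping
S = [[0.05, 0.01], [0.01, 0.08]]       # dense modal structural damping
F = [1.0 + 0.5j, -0.3 + 0.2j]          # complex modal load

m = len(k_diag)
d = [k - omega**2 for k in k_diag]     # diagonal of D = K - w^2 I
B = [[omega * C[i][j] + S[i][j] for j in range(m)] for i in range(m)]
F_re = [f.real for f in F]
F_im = [f.imag for f in F]

# Imaginary part: [D + B D^-1 B] X_im = F_im - B D^-1 F_re
DinvB = [[B[i][j] / d[i] for j in range(m)] for i in range(m)]   # row scaling, O(m^2)
lhs = [[(d[i] if i == j else 0.0) + sum(B[i][k] * DinvB[k][j] for k in range(m))
        for j in range(m)] for i in range(m)]                    # one real m x m product
rhs = [F_im[i] - sum(B[i][k] * F_re[k] / d[k] for k in range(m)) for i in range(m)]
X_im = solve(lhs, rhs)                                           # one real solve of order m

# Real part: X_re = D^-1 (B X_im + F_re)
X_re = [(sum(B[i][k] * X_im[k] for k in range(m)) + F_re[i]) / d[i] for i in range(m)]

# Reference: solve the complex system (D + i B) X = F directly.
A = [[(d[i] if i == j else 0.0) + 1j * B[i][j] for j in range(m)] for i in range(m)]
X_ref = solve(A, F)
print([complex(a, b) for a, b in zip(X_re, X_im)])
print(X_ref)
```

The two printed vectors should agree to round-off, which confirms that (8) is an exact rearrangement of (6) rather than an approximation.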

Further reduction of computational work can be achieved by taking into account the sparse structure of the

left hand side in equation (6) for certain special cases. Let us consider an important special case of the

coupled structural-acoustic modal frequency response analysis based on using uncoupled modes –

eigenmodes calculated separately for the structural and acoustic parts of the model. After projection of the

structural, acoustic and coupling structural-acoustic finite element operators on the modal subspace, the

modal frequency response problem takes the form

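$$\left( \begin{bmatrix} \bar{K}_S & \bar{Q}_{AS}^T \\ 0 & \bar{K}_A \end{bmatrix} - \omega^2 \begin{bmatrix} \bar{I}_S & 0 \\ \bar{Q}_{AS} & \bar{I}_A \end{bmatrix} + i\omega \begin{bmatrix} \bar{C}_S & 0 \\ 0 & \bar{C}_A \end{bmatrix} + i \begin{bmatrix} \bar{S}_S & 0 \\ 0 & \bar{S}_A \end{bmatrix} \right) \begin{bmatrix} \bar{U} \\ \bar{P} \end{bmatrix} = \begin{bmatrix} \bar{F}_S \\ \bar{F}_A \end{bmatrix} \tag{10}$$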

Here the subscript $S$ indicates structural operators projected on the structural modes; the subscript $A$ indicates acoustic operators projected on the acoustic modes; $\bar{Q}_{AS}$ is the coupling matrix multiplied by the acoustic eigenvectors from the left and by the structural eigenvectors from the right; $\bar{U}$ and $\bar{P}$ are the generalized displacements corresponding to the structural and acoustic modes, respectively. Using the Everstine symmetric potential formulation [1] we rewrite equation (10) in the form

$$\left( \begin{bmatrix} \bar{K}_S & 0 \\ 0 & \bar{K}_A \end{bmatrix} - \omega^2 \begin{bmatrix} \bar{I}_S & 0 \\ 0 & \bar{I}_A \end{bmatrix} + i\omega \begin{bmatrix} \bar{C}_S & \bar{Q}_{AS}^T \\ \bar{Q}_{AS} & \bar{C}_A \end{bmatrix} + i \begin{bmatrix} \bar{S}_S & 0 \\ 0 & \bar{S}_A \end{bmatrix} \right) \begin{bmatrix} \bar{U} \\ \bar{V} \end{bmatrix} = \begin{bmatrix} \bar{F}_S \\ \bar{G}_A \end{bmatrix} \tag{11}$$

where the potential $\bar{V}$ is defined by the equation $\bar{P} = i\omega \bar{V}$, $\omega \neq 0$, and $\bar{G}_A = -\frac{i}{\omega} \bar{F}_A$. When the matrices $\bar{K}_S$ and $\bar{K}_A$ are diagonal, we can use a technique similar to (8) for solving equation (11).

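Rewriting equation (11) with respect to the real and imaginary parts of the generalized displacements, we can address another important special case when all the projected matrices in at least one domain, structural or acoustic, are diagonal:

$$\begin{bmatrix} \bar{K}_S - \omega^2 \bar{I}_S & -\left(\omega \bar{C}_S + \bar{S}_S\right) & 0 & -\omega \bar{Q}_{AS}^T \\ \omega \bar{C}_S + \bar{S}_S & \bar{K}_S - \omega^2 \bar{I}_S & \omega \bar{Q}_{AS}^T & 0 \\ 0 & -\omega \bar{Q}_{AS} & \bar{K}_A - \omega^2 \bar{I}_A & -\left(\omega \bar{C}_A + \bar{S}_A\right) \\ \omega \bar{Q}_{AS} & 0 & \omega \bar{C}_A + \bar{S}_A & \bar{K}_A - \omega^2 \bar{I}_A \end{bmatrix} \begin{bmatrix} \bar{U}_{Re} \\ \bar{U}_{Im} \\ \bar{V}_{Re} \\ \bar{V}_{Im} \end{bmatrix} = \begin{bmatrix} \bar{F}_{S,Re} \\ \bar{F}_{S,Im} \\ \bar{G}_{A,Re} \\ \bar{G}_{A,Im} \end{bmatrix} \tag{12}$$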

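For example, if all the modal matrices in the structural domain, $\bar{K}_S$, $\bar{C}_S$, and $\bar{S}_S$, are diagonal, then the upper diagonal 2x2 block on the left-hand side of equation (12) can be inverted easily, and the solution of equation (12) can be obtained by the following computations

$$\begin{aligned} \begin{bmatrix} \bar{V}_{Re} \\ \bar{V}_{Im} \end{bmatrix} &= \left[ D_A + \omega^2 \mathbf{Q} D_S^{-1} \mathbf{Q}^T \right]^{-1} \left( \begin{bmatrix} \bar{G}_{A,Re} \\ \bar{G}_{A,Im} \end{bmatrix} - \omega \mathbf{Q} D_S^{-1} \begin{bmatrix} \bar{F}_{S,Re} \\ \bar{F}_{S,Im} \end{bmatrix} \right), \\ \begin{bmatrix} \bar{U}_{Re} \\ \bar{U}_{Im} \end{bmatrix} &= D_S^{-1} \left( \begin{bmatrix} \bar{F}_{S,Re} \\ \bar{F}_{S,Im} \end{bmatrix} + \omega \mathbf{Q}^T \begin{bmatrix} \bar{V}_{Re} \\ \bar{V}_{Im} \end{bmatrix} \right) \end{aligned} \tag{13}$$

where

$$\mathbf{Q} = \begin{bmatrix} 0 & -\bar{Q}_{AS} \\ \bar{Q}_{AS} & 0 \end{bmatrix}, \quad D_S = \begin{bmatrix} \bar{K}_S - \omega^2 \bar{I}_S & -\left(\omega \bar{C}_S + \bar{S}_S\right) \\ \omega \bar{C}_S + \bar{S}_S & \bar{K}_S - \omega^2 \bar{I}_S \end{bmatrix}, \quad D_A = \begin{bmatrix} \bar{K}_A - \omega^2 \bar{I}_A & -\left(\omega \bar{C}_A + \bar{S}_A\right) \\ \omega \bar{C}_A + \bar{S}_A & \bar{K}_A - \omega^2 \bar{I}_A \end{bmatrix}.$$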

Another approach to solving equation (4) is based on complex arithmetic. In Abaqus 2016 we implemented the new SMP complex solver using high performance BLAS and BLAS extension kernels from Intel(R) Math Kernel Library 11.1 for blocked matrix operations. We found that for many practical cases the complex solver performs on par with or better than the real solver, and for some important special cases its performance can be superior to that of the real solver. For example, we found that for the special case with the diagonal modal stiffness matrix $\bar{K}$ the brute force complex solver performs and scales approximately as the real solver using technique (8). Comparing the computational work estimates (5) and (9), we conclude that for our implementation of the complex solver an average complex operation is approximately equivalent to 4 real flops. Also, the complex solver sometimes handles important special cases more effectively than the real solver. For example, for the special case of the structural-acoustic analysis when all the modal matrices in the structural domain are diagonal, we can write equation (11) in the form

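$$\begin{bmatrix} B_S & i\omega \bar{Q}_{AS}^T \\ i\omega \bar{Q}_{AS} & B_A \end{bmatrix} \begin{bmatrix} \bar{U} \\ \bar{V} \end{bmatrix} = \begin{bmatrix} \bar{F}_S \\ \bar{G}_A \end{bmatrix} \tag{14}$$

where $B_S = \bar{K}_S - \omega^2 \bar{I}_S + i\left(\omega \bar{C}_S + \bar{S}_S\right)$ and $B_A = \bar{K}_A - \omega^2 \bar{I}_A + i\left(\omega \bar{C}_A + \bar{S}_A\right)$.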

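As the complex matrix $B_S$ is diagonal, we can obtain the solution of (14) by the following operations

$$\begin{aligned} \bar{V} &= \left[ \omega^2 \bar{Q}_{AS} B_S^{-1} \bar{Q}_{AS}^T + B_A \right]^{-1} \left( \bar{G}_A - i\omega \bar{Q}_{AS} B_S^{-1} \bar{F}_S \right), \\ \bar{U} &= B_S^{-1} \left( \bar{F}_S - i\omega \bar{Q}_{AS}^T \bar{V} \right) \end{aligned} \tag{15}$$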


Our examples show that this approach can perform up to two times better than the real solver using

technique (13). A similar technique can be used when not all the modal matrices in the structural domain are

diagonal, but all the modal matrices are diagonal in the acoustic domain.
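The elimination of the structural unknowns in (14)-(15) is a Schur complement reduction, which can be sketched for a tiny case as follows. This is an illustration under stated assumptions, not the Abaqus solver: the coupling blocks are assumed to enter system (14) as +iωQ and +iωQᵀ, the mode counts and all numerical values are hypothetical, and `solve` is a plain Gaussian elimination helper written only for this example.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(A)
    aug = [list(row) + [b[i]] for i, row in enumerate(A)]   # augmented copy of the inputs
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


omega = 3.0
# Structural domain: all modal matrices diagonal (hypothetical values).
kS, cS, sS = [4.0, 25.0], [0.05, 0.10], [0.20, 0.50]
# Acoustic domain: diagonal stiffness, dense viscous damping (hypothetical values).
KA = [[16.0, 0.0], [0.0, 36.0]]
CA = [[0.02, 0.01], [0.01, 0.03]]
SA = [[0.0, 0.0], [0.0, 0.0]]
Q = [[0.3, -0.1], [0.2, 0.4]]      # coupling Q_AS: rows acoustic, columns structural
FS = [1.0 + 0j, 0.0 + 0j]          # structural modal load
GA = [0.0 + 0j, 0.5j]              # acoustic right-hand side of (14)

nS, nA = len(kS), len(KA)
bS = [kS[j] - omega**2 + 1j * (omega * cS[j] + sS[j]) for j in range(nS)]   # diagonal of B_S
BA = [[KA[a][c] - (omega**2 if a == c else 0.0) + 1j * (omega * CA[a][c] + SA[a][c])
       for c in range(nA)] for a in range(nA)]

# Schur complement system: [w^2 Q B_S^-1 Q^T + B_A] V = G_A - i w Q B_S^-1 F_S
schur = [[BA[a][c] + omega**2 * sum(Q[a][j] * Q[c][j] / bS[j] for j in range(nS))
          for c in range(nA)] for a in range(nA)]
rhs = [GA[a] - 1j * omega * sum(Q[a][j] * FS[j] / bS[j] for j in range(nS))
       for a in range(nA)]
V = solve(schur, rhs)              # dense solve of acoustic order only

# Back-substitution: U = B_S^-1 (F_S - i w Q^T V)
U = [(FS[j] - 1j * omega * sum(Q[a][j] * V[a] for a in range(nA))) / bS[j]
     for j in range(nS)]

# Reference: assemble and solve the full coupled system (14).
full = [[(bS[i] if i == j else 0j) for j in range(nS)]
        + [1j * omega * Q[a][i] for a in range(nA)] for i in range(nS)]
full += [[1j * omega * Q[a][j] for j in range(nS)] + BA[a] for a in range(nA)]
ref = solve(full, FS + GA)
print(U + V)
print(ref)
```

The dense factorization is performed only on a matrix of the acoustic order, which is where the saving comes from when the structural mode count dominates.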

In addition to the pure SMP complex solver, we implemented a GPU complex solver that uses a GPU device for performing massive computations. The GPU and SMP solvers can be used together for solving the modal frequency response problems on a multicore machine with a GPU device. We implemented the hybrid “multicore+GPU” algorithm using MAGMA 1.6 kernels (the MAGMA project by the University of Tennessee provides a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures) for parallel computations on the GPU and traditional LAPACK and BLAS libraries on the CPU side. Performance of the current (brute force) implementation of the GPU solver for a single GPU device (NVIDIA K40m) is approximately similar to the SMP complex solver performance using 14 cores of a 28-core Haswell machine. Thus, using the hybrid solver we can expect about 1.5 times faster solution on such a machine, since the GPU adds the equivalent of 14 cores to the 28 available: (28 + 14)/28 = 1.5.

4.3 “Group” solver

The following general idea can be used to improve the performance of the modal frequency response solver [2]. Consider $N$ systems of equations with non-singular square matrices of order $m$:

$$A_k x_k = f_k, \quad k = 1, \dots, N \tag{16}$$

Let us introduce the matrix

$$A = A_1 A_2 \cdots A_N \tag{17}$$

Obviously, the solutions of equations (16) can be represented in the form

$$x_1 = A_2 A_3 \cdots A_N A^{-1} f_1, \quad x_2 = A_3 \cdots A_N A^{-1} A_1 f_2, \quad \dots, \quad x_N = A^{-1} A_1 \cdots A_{N-1} f_N \tag{18}$$

Calculations by formulas (17) and (18) include one matrix factorization, $N-1$ matrix products, and a number of matrix-vector products that have computational complexity $O(m^2)$. If the matrix $A$ could be obtained in $O(m^2)$ operations instead of $N-1$ direct matrix multiplications, then solving all equations (16) would require a single matrix factorization, which is about $N$ times faster than solving these equations one by one.
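The idea of recovering every solution from a single factorization of the product matrix can be checked numerically. The following is a minimal NumPy/SciPy sketch on small random matrices; the sizes, the random test data, and the helper name `solve_kth` are assumptions made for illustration. Here the product is still formed by direct multiplication, so the sketch demonstrates correctness of the representation, not the speedup.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
m, N = 6, 3

# Well-conditioned test matrices (shifted random matrices) and right-hand sides.
A_list = [rng.standard_normal((m, m)) + m * np.eye(m) for _ in range(N)]
f_list = [rng.standard_normal(m) for _ in range(N)]

# Product matrix A = A_1 A_2 ... A_N and its single LU factorization.
A = np.linalg.multi_dot(A_list)
lu = lu_factor(A)


def solve_kth(k):
    """Return x_k = A_{k+1}...A_N  A^{-1}  A_1...A_{k-1} f_k  (k is 1-based)."""
    y = f_list[k - 1]
    for Aj in reversed(A_list[: k - 1]):   # apply A_{k-1}, ..., A_1 in turn
        y = Aj @ y
    y = lu_solve(lu, y)                    # reuse the one factorization of A
    for Aj in reversed(A_list[k:]):        # apply A_N, ..., A_{k+1} in turn
        y = Aj @ y
    return y


for k in range(1, N + 1):
    direct = np.linalg.solve(A_list[k - 1], f_list[k - 1])
    print(k, np.allclose(solve_kth(k), direct))
```

Only matrix-vector products and one pair of triangular solves are needed per system once the factorization of the product is available.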

Returning to the modal frequency response analysis, let us partition the set of excitation frequencies into smaller groups with at most $N$ frequencies per group:

$$\{\omega_1, \dots, \omega_N\};\ \{\omega_{N+1}, \dots, \omega_{2N}\};\ \dots;\ \{\omega_{(k-1)N+1}, \dots, \omega_{kN}\};\ \dots \tag{19}$$

We will try to achieve better performance of the frequency response calculation by processing the equations at all frequencies within a single group simultaneously.

First, consider the special case when the modal viscous damping matrix is zero: $C = 0$. For this case, problem (4) for a group of frequencies $\omega_1, \dots, \omega_N$ reduces to the form

$$A_k X_k = F_k, \quad k = 1, \dots, N \tag{20}$$

where $A_k = B - \lambda_k I$, $F_k = F(\omega_k)$, $\lambda_k = \omega_k^2$, and $B = K + iS$ is the complex stiffness matrix. Note that the matrices $A_k$ are permutable, that is, they commute with one another. The product of these matrices can be presented in the form

$$A = A_1 A_2 \cdots A_N = p(B) \tag{21}$$


where $p$ is a polynomial of order $N$ defined by its roots $\lambda_1, \lambda_2, \dots, \lambda_N$:

$$p(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_N) = a_0 + a_1 \lambda + \cdots + a_{N-1} \lambda^{N-1} + a_N \lambda^N \tag{22}$$

The coefficients $a_0, a_1, \dots, a_N$ can be calculated by Vieta’s formulas or by a simple algorithm. If the powers of the matrix $B$, namely $B, B^2, \dots, B^N$, are pre-calculated and stored in computer memory, then the matrix $p(B) = a_0 I + a_1 B + \cdots + a_{N-1} B^{N-1} + a_N B^N$ can be obtained in $O(m^2)$ operations. The modal frequency response for a portion of $N$ frequencies is calculated as follows:

$$X_k = [p(B)]^{-1} p_k(B) F_k, \quad k = 1, \dots, N \tag{23}$$

where $p_k(B) = \prod_{j=1,\, j \ne k}^{N} (B - \lambda_j I)$. Calculating the frequency response at all frequencies in the portion requires a single complex matrix factorization. The total cost of solving the modal frequency response problem with zero viscous damping, including the cost of preparatory calculations (number of arithmetic operations with complex numbers), can be estimated as

$$Z_1 = \left( \frac{1}{3} \frac{n}{N} + N \right) m^3 + O(m^2) \tag{24}$$
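The group procedure for the undamped-viscous case can be sketched numerically as follows. This is a minimal NumPy/SciPy illustration on small made-up matrices, not the Abaqus implementation: the matrix sizes, the test data, and the names `lam`, `powers`, and `group_solve` are assumptions of this example, and `numpy.poly` stands in for the "simple algorithm" that produces the polynomial coefficients from the roots.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
m, N = 8, 4                       # modal subspace size, frequencies per group

# Hypothetical modal operators: diagonal stiffness, full structural damping.
K = np.diag(np.linspace(1.0, 4.0, m))
G = rng.standard_normal((m, m))
S = 0.02 * (G + G.T)
B = K + 1j * S                    # complex stiffness matrix

omegas = np.array([0.5, 0.7, 0.9, 1.1])          # one group of frequencies
lam = omegas**2                                   # roots of the polynomial p
F = [rng.standard_normal(m) + 0j for _ in range(N)]

# Coefficients a_0..a_N of p; np.poly returns the highest power first.
a = np.poly(lam)[::-1]

# Pre-computed powers I, B, B^2, ..., B^N (done once, reusable for all groups).
powers = [np.eye(m, dtype=complex)]
for _ in range(N):
    powers.append(powers[-1] @ B)

# p(B) as a linear combination of stored powers, then one factorization.
pB = sum(a_j * P for a_j, P in zip(a, powers))
lu = lu_factor(pB)


def group_solve(k):
    """X_k = p(B)^{-1} p_k(B) F_k for the k-th frequency (0-based index)."""
    y = F[k].copy()
    for j in range(N):
        if j != k:
            y = B @ y - lam[j] * y        # apply (B - lam_j I) as a mat-vec
    return lu_solve(lu, y)


for k in range(N):
    direct = np.linalg.solve(B - lam[k] * np.eye(m), F[k])
    print(k, np.allclose(group_solve(k), direct))
```

The powers of $B$ do not depend on the excitation frequency, so in this sketch they would be computed once and shared by every group; only the coefficients, the linear combination, and the factorization change from group to group.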

The optimal partitioning of the excitation frequencies that minimizes the principal term of $Z_1$ is $N_{opt} = \sqrt{n/3}$. For example, if $n = 1000$ and $N = N_{opt} = 18$, then $Z_1 \approx 0.1 Z_0$. In practice, when choosing the value of $N$ we have to take into account the memory consumption for the matrices $B, B^2, \dots, B^N$ and take care of the accuracy of the solution. As powers of the complex stiffness matrix $B$ are used, the value of $N$ should guarantee that the absolute values of the elements of the matrices $B, B^2, \dots, B^N$ are reasonably small to enable accurate calculations. Using $N < N_{opt}$ can be very practical: for our example, if $N = 4$, then $Z_1 \approx 0.26 Z_0$.
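The quoted ratios can be reproduced from the leading term of the cost estimate. The short calculation below is a sketch that assumes the reference cost is $Z_0 = (n/3)\,m^3$, that is, one complex factorization per frequency; this value of $Z_0$ is an assumption inferred from the ratios in the text, and the function name is illustrative.

```python
import math


def group_cost_ratio(n: int, N: int) -> float:
    """Leading-term ratio Z_1 / Z_0.

    Z_1 = (n / (3 N) + N) m^3 is the group-solver estimate;
    Z_0 = (n / 3) m^3 is assumed for one factorization per frequency.
    The common factor m^3 cancels.
    """
    return (n / (3 * N) + N) / (n / 3)


n = 1000
N_opt = math.sqrt(n / 3)
print(round(N_opt, 2))                     # 18.26
print(round(group_cost_ratio(n, 18), 2))   # 0.11
print(round(group_cost_ratio(n, 4), 2))    # 0.26
```

Even a small group size recovers most of the benefit: moving from $N = 4$ to $N = 18$ changes the ratio only from about 0.26 to about 0.11, while storing more than four times as many matrix powers.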

4.4 AMS eigensolver

Every dynamic modal analysis includes solving the generalized eigenvalue problem for natural vibration mode extraction

$$K \phi = \omega^2 M \phi \tag{25}$$

Automated Multi-Level Substructuring (AMLS) is a state-of-the-art technology for solving large eigenvalue problems [3,4,5]. In AMLS, a finite element model is automatically divided into many substructures on multiple levels. Based on that partitioning tree, the entire model is projected onto a reduced (truncated) substructure modal space by solving many small substructure eigenvalue problems. Then, the eigensolution is computed in the reduced substructure modal space, and the full eigenmodes or selected parts of the eigenmodes can be recovered. This method can be divided into three phases: the reduction phase, the reduced eigensolution phase, and the recovery phase. In the reduction phase, we factorize the stiffness matrix and solve the substructure eigenproblems to get the AMLS transformation matrix, and we project the system matrices (mass, stiffness, and damping matrices, and right-hand side vectors) onto the truncated substructure modal space. Then, the reduced eigensolution phase solves the reduced eigenvalue problem with the projected stiffness and mass matrices. Finally, the recovery phase recovers the eigenmodes in the original finite element space. The AMS eigensolver developed in Abaqus can solve large problems with tens of millions of degrees of freedom. It can run in parallel on shared memory computers with multiple processors. Also, the AMS eigensolver provides a selective recovery capability, which recovers the eigenvectors for a user-defined node set only. To demonstrate the AMS eigensolver performance, consider a benchmark model of an impeller with 8.2 million finite element degrees of freedom. Figure 7 illustrates typical AMS eigensolver timing and demonstrates the performance improvements of the pure “CPU” and combined “CPU+GPU” AMS implementations in the latest version of Abaqus.
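The three phases can be illustrated on a toy problem. The sketch below is a single-level reduction with two substructures and one interface degree of freedom (a fixed-interface component mode synthesis), applied to a spring-mass chain; it is only meant to show the reduce, solve, and recover pattern and is not the multilevel AMS algorithm. The model, the sizes `n` and `q`, and all variable names are assumptions of this example.

```python
import numpy as np
from scipy.linalg import eigh, block_diag

n, q = 41, 5                      # full model size; modes kept per substructure
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # spring chain
M = np.eye(n)

c = n // 2                        # single interface degree of freedom
subs = [np.arange(0, c), np.arange(c + 1, n)]
iface = np.array([c])
order = np.concatenate(subs + [iface])    # interiors first, interface last

# Reduction phase: truncated fixed-interface modes and static constraint modes.
Phi, Psi = [], []
for idx in subs:
    Kii, Mii = K[np.ix_(idx, idx)], M[np.ix_(idx, idx)]
    _, modes = eigh(Kii, Mii)                 # small substructure eigenproblem
    Phi.append(modes[:, :q])                  # keep the q lowest modes
    Psi.append(-np.linalg.solve(Kii, K[np.ix_(idx, iface)]))

T = np.block([
    [block_diag(*Phi), np.vstack(Psi)],
    [np.zeros((len(iface), len(subs) * q)), np.eye(len(iface))],
])
Kp, Mp = K[np.ix_(order, order)], M[np.ix_(order, order)]
K_red, M_red = T.T @ Kp @ T, T.T @ Mp @ T     # projected system matrices

# Reduced eigensolution phase.
lam_red, Q = eigh(K_red, M_red)

# Recovery phase: map reduced eigenvectors back to the original ordering.
modes_full = np.empty((n, Q.shape[1]))
modes_full[order, :] = T @ Q

lam_full = eigh(K, M, eigvals_only=True)
# Ratios close to 1 (never below 1 in exact arithmetic, by Rayleigh-Ritz).
print(lam_red[:3] / lam_full[:3])
```

In this toy case an 11-dimensional reduced problem replaces the 41-dimensional one; selective recovery would correspond to evaluating only the rows of `T @ Q` that belong to the requested degrees of freedom.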

Usually, coupled structural-acoustic modal frequency response analysis is based on uncoupled modes, that is, eigenmodes calculated separately for the structural and acoustic parts of the model. However, in automotive noise and vibration simulations for problems with strong structural-acoustic coupling, the coupled structural-acoustic eigenproblem should be solved to form a representative modal subspace.

The coupled structural-acoustic eigenproblem is classically formulated in terms of structural displacement and acoustic pressure degrees of freedom [6]. Both the left-hand side and right-hand side matrices (typically referred to as the stiffness and mass matrices) are unsymmetric due to the structural-acoustic coupling term. This term is present in the upper right corner of the stiffness matrix and in the lower left corner of the mass matrix. Therefore, standard eigenvalue extraction methods for symmetric eigenvalue problems are not applicable to this original formulation. A special system transformation based on the introduction of an auxiliary variable can be used to transform the original unsymmetric eigenproblem into a symmetric indefinite eigenproblem of a larger size [7]. This problem can be solved using the Lanczos eigensolver (which is available in Abaqus); however, this approach is impractical when many (thousands of) coupled structural-acoustic eigenmodes need to be extracted. AMLS technology cannot be applied directly to the symmetrized eigenproblem because its left-hand side matrix is indefinite, whereas AMLS technology requires symmetric, non-negative-definite matrices. In addition, both the left-hand side and right-hand side matrices of the symmetrized formulation are singular (and the intersection of the singularity subspaces of these matrices is not empty), which represents another limitation for AMLS technology. To overcome this complication, a special new method that allows the coupled structural-acoustic eigenproblem to be solved using AMLS technology was implemented in Abaqus. Solving the coupled structural-acoustic eigenproblem requires much more computational work than uncoupled eigenmode extraction. Figure 8 shows typical timing data for the coupled structural-acoustic eigenmode extraction analysis using the AMS eigensolver and demonstrates performance improvements for this algorithm.
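For orientation, the classical displacement-pressure formulation mentioned above is commonly written in the block form below. The notation is an assumption of this sketch and does not appear in the original: $K_s$, $M_s$ denote the structural stiffness and mass matrices, $K_a$, $M_a$ their acoustic counterparts, $C$ the structural-acoustic coupling matrix, $\rho$ the fluid density, $u$ the structural displacements, and $p$ the acoustic pressures. Signs and scaling of the coupling blocks vary between references.

```latex
\begin{equation*}
\begin{bmatrix} K_s & -C \\ 0 & K_a \end{bmatrix}
\begin{Bmatrix} u \\ p \end{Bmatrix}
= \omega^2
\begin{bmatrix} M_s & 0 \\ \rho\, C^{T} & M_a \end{bmatrix}
\begin{Bmatrix} u \\ p \end{Bmatrix}
\end{equation*}
```

The coupling block sits in the upper right corner of the left-hand side matrix and in the lower left corner of the right-hand side matrix, so neither matrix is symmetric even though each diagonal block is.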

Figure 7: Benchmark impeller model: 8.2M DOF; 86 modes; 28-Core Haswell, 2xK40m, 768GB RAM

[Figure 7 plot: elapsed time (sec.) and speedup factor for three runs labeled Abaqus 6.14-3…, Abaqus 2016…, and Abaqus 2016… (labels truncated; CPU and CPU+GPU variants); series: AMS Solver, AMS Driver, STD. Data labels: elapsed times 1541, 1228, 702; speedup factors 1.00, 1.25, 2.20; 1.00, 1.16, 1.74; 1.00, 1.16, 1.68.]


Figure 8: Benchmark vehicle body model: 18.8M DOF; 4580 modes; 20-Core Ivy Bridge, 192 GB RAM

4.5 Frequency response calculation

In this section we present performance results for modal frequency response analyses using the solvers discussed above. The elapsed wall time is presented for the frequency response analysis phases. The timing data were obtained by running the analyses on a 28-core Haswell machine with two K40m cards.

The first example is a full car model with 17.5M degrees of freedom. To perform the frequency response calculation we extract 10871 uncoupled structural and acoustic eigenmodes (up to 500 Hz), including 316 acoustic eigenmodes. Material structural damping not proportional to the global stiffness is specified for this model together with diagonal modal viscous damping. The frequency response is calculated for 114 load cases at 490 points in the frequency domain. Figure 9 presents the parallel scaling of the frequency response analysis.

Figure 9: Vehicle body model: 17.5M DOF; 10871 modes; 28-Core Haswell, 2xK40m, 768GB RAM

(Std – total Abaqus/Standard time; AMS – eigensolver time; SSD – frequency response solver time)

In Figure 10 we compare the performance and scaling of the “pure CPU” implementation of the modal frequency response solver and the combined “CPU+GPU” implementation. Using 28 cores and one GPU device, the solution time is 1115 sec versus 1549 sec for the pure CPU solver, that is, ~1.39 times faster.

[Figure 8 plot: elapsed time (hrs.) by phase (PRE, FREQ, SSD) for Abaqus/AMS 6.14-4 Coupled Modes, Abaqus/AMS 2016 Coupled Modes, Abaqus/AMS 2017 Coupled Modes, and Abaqus/AMS 2016 Uncoupled Modes. Data labels: 0.29, 0.31, 0.31, 0.31; 2.80, 2.46, 2.15, 0.73; 0.23, 0.14, 0.13, 0.20.]

[Figure 9 plot: elapsed time (sec.) versus number of cores (2, 4, 7, 14, 28); series: Std, AMS, SSD.]


Figure 10: Vehicle body model: 17.5M DOF; 10871 modes; 28-Core Haswell, 2xK40m, 768GB RAM

(CPU – “pure CPU” solver; GPU – combined “CPU+GPU” solver)

For the same model we performed the modal frequency response analysis using coupled structural-

acoustic eigenmodes. It took Abaqus 5845 seconds to solve the coupled structural-acoustic eigenproblem

(up to 500 Hz) while the uncoupled eigenmodes were extracted in 3830 sec. Thus, solution of the coupled

problem was ~1.5 times more expensive. In Figure 11 we present solution time for the frequency response

analysis based on coupled eigenmodes using different variants of the frequency response solver discussed

above in sections 4.1-4.3. For this particular case, we dropped the viscous modal damping terms to

benchmark the current implementation of the group solver. The group solver time is the best; it is even

better than the best time obtained for the uncoupled modes case (Figure 12; 1159 sec).

Finally, in Figure 12 we show the modal frequency response solver timing data for the full car model with

20.4 million degrees of freedom. 30000 uncoupled structural and acoustic eigenmodes (including 1800

acoustic eigenmodes) were extracted in the range below 1500 Hz. Only diagonal modal damping is

specified for this analysis, but the modal operator is not diagonal because of the coupling terms.

Frequency response was calculated at 2050 frequency points for 4 load cases. The new complex solver is

more than two times faster for this case than the old real solver.

Figure 11: Vehicle body model: 17.5M DOF; 10871 modes; 28-Core Haswell, 2xK40m, 768GB RAM

[Figure 11 plot: SSD elapsed time (sec.) for Abaqus 6.14, Abaqus 2016 - CPU, Abaqus 2016 - CPU + 1 GPU, and Group solver (CPU); one off-scale value is annotated ~27,000 (estimated).]

[Figure 10 plot: elapsed time (sec.) versus number of cores (1, 7, 14, 28); series: CPU, GPU.]


Figure 12: Full car model: 20.4M DOF; 30000 modes; 28-Core Haswell, 2xK40m, 768GB RAM

References

[1] G. C. Everstine, “A Symmetric Potential Formulation for Fluid-Structure Interaction,” Journal of Sound and Vibration, vol. 79, no. 2, pp. 157-160.

[2] M. Belyi, “Accelerated Modal Frequency Response Calculation,” U.S. Patent Application No. 13/730,403, filed on December 28, 2012.

[3] J. K. Bennighof, R. B. Lehoucq, “An Automated Multilevel Substructuring Method for Eigenspace Computation in Linear Elastodynamics,” SIAM Journal on Scientific Computing, vol. 25, no. 6, pp. 2084-2106, 2004.

[4] M. F. Kaplan, “Implementation of Automated Multilevel Substructuring for Frequency Response Analysis of Structures,” Ph.D. dissertation, Department of Aerospace Engineering & Engineering Mechanics, University of Texas at Austin, 2001.

[5] M. Kim, “An Efficient Eigensolution Method and Its Implementation for Large Structural Systems,” Ph.D. dissertation, Department of Aerospace Engineering & Engineering Mechanics, University of Texas at Austin, 2004.

[6] O. C. Zienkiewicz, R. F. Newton, “Coupled Vibrations of a Structure Submerged in a Compressible Fluid,” Proc. of the Symposium on Finite Element Techniques, Stuttgart, 1969.

[7] R. Ohayon, “Fluid-Structure Modal Analysis. New Symmetric Continuum-Based Formulations. Finite Element Applications,” Proceedings of the International Conference on Advances in Numerical Methods in Engineering: Theory and Applications, edited by G. N. Pande and J. Middleton, Martinus Nijhoff, 1987.

[Figure 12 plot: elapsed time (sec.) versus number of cores (1, 7, 14, 28); series: Abaqus 6.14, Abaqus 2016; one off-scale value is annotated ~56,000 (estimated).]
