arXiv:1911.04240v1 [cs.LG] 6 Nov 2019people.cs.vt.edu/naren/papers/PhyNetSIAMDM2020.pdfedge of the...

Physics-guided Design and Learning of Neural Networksfor Predicting Drag Force on Particle Suspensions in Moving Fluids

Nikhil Muralidhar ∗† Jie Bu ∗† Ze Cao ‡ Long He ‡ Naren Ramakrishnan ∗

Danesh Tafti ‡ Anuj Karpatne ∗†

AbstractPhysics-based simulations are often used to model and un-derstand complex physical systems and processes in domainslike fluid dynamics. Such simulations, although used fre-quently, have many limitations which could arise either dueto the inability to accurately model a physical process ow-ing to incomplete knowledge about certain facets of the pro-cess or due to the underlying process being too complex toaccurately encode into a simulation model. In such situa-tions, it is often useful to rely on machine learning meth-ods to fill in the gap by learning a model of the complexphysical process directly from simulation data. However, asdata generation through simulations is costly, we need todevelop models, being cognizant of data paucity issues. Insuch scenarios it is often helpful if the rich physical knowl-edge of the application domain is incorporated in the ar-chitectural design of machine learning models. Further, wecan also use information from physics-based simulations toguide the learning process using aggregate supervision to fa-vorably constrain the learning process. In this paper, wepropose PhyDNN , a deep learning model using physics-guided structural priors and physics-guided aggregate super-vision for modeling the drag forces acting on each parti-cle in a Computational Fluid Dynamics-Discrete ElementMethod(CFD-DEM). We conduct extensive experiments inthe context of drag force prediction and showcase the use-fulness of including physics knowledge in our deep learningformulation both in the design and through learning pro-cess. Our proposed PhyDNN model has been compared toseveral state-of-the-art models and achieves a significant per-formance improvement of 8.46% on average across all base-line models. The source code has been made available∗ andthe dataset used is detailed in [1, 2].

1 Introduction

Machine learning (ML) is ubiquitous in several disci-plines today and with its growing reach, learning mod-els are continuously exposed to new challenges andparadigms. In many applications, ML models aretreated as black-boxes. In such contexts, the learningmodel is trained in a manner completely agnostic to therich corpus of physical knowledge underlying the processbeing modeled. This domain-agnostic training mightlead to many unintended consequences like the modellearning spurious relationships between input variables,

∗Dept. of Computer Science, Virginia Tech, VA, USA†Discovery Analytics Center, Virginia Tech, VA, USA‡Dept. of Mechanical Engineering, Virginia Tech, VA, USA∗https://github.com/nmuralid1/PhyDNN.git

PhyDNN

Physics-Guided Model ArchitectureInputs

(Reynolds Number,

Solid Fraction,

Neighboring

Particle Positions)

Physical

Loss(Aggregate

Supervision)

Outputs

Drag Force 𝑓𝑥

Velocity FieldsPressure Fields

Intermediate Variables

Figure 1: Our proposed PhyDNN Model.

models learning representations that are not easily ver-ifiable as being consistent with the accepted physicalunderstanding of the process being modeled. Moreover,in many scientific disciplines, generating training datamight be extremely costly due to the nature of the datageneration collection process. To effectively be usedacross many scientific applications, it is important fordata mining models to be able to leverage the rich phys-ical knowledge in scientific disciplines to fill the void dueto the lack of large datasets and be able to learn goodprocess representations in the context of limited data.This makes the model less expensive to train as well asmore interpretable due to the ability to verify whetherthe learned representation is consistent with the existingdomain knowledge.

In this paper, we attempt to bridge the gap betweenphysics-based models and data mining models by incor-porating domain knowledge in the design and learning ofmachine learning models. Specifically, we propose threeways for incorporating domain knowledge in neural net-works: (1) Physics-guided design of neural network ar-chitectures, (2) Learning with auxiliary tasks involvingphysical intermediate variables, and (3) Physics-guidedaggregate supervision of neural network training.

We focus on modeling a system in the domain ofmulti-phase flows (solid particles suspended in movingfluid) which have a wide range of applicability in fun-damental as well as industrial processes [3]. One of thecritical interaction forces in these systems that has alarge bearing on the dynamics of the system is the dragforce applied by the fluid on the particles and vice-versa.The drag force can be obtained by Particle Resolved

arX

iv:1

911.

0424

0v1

[cs

.LG

] 6

Nov

201

9

Simulations (PRS) at a high accuracy. It captures thevelocity and pressure field surrounding each particle inthe suspension that can later be used to compute thedrag force. However, PRS is quite expensive and only afew 100s or at most 1000s of particles can be resolved ina calculation utilizing grids of O(108) degrees of freedomand utilizing O(102) processors or cores. Thus morepractical simulations have to resort to coarse-graining,e.g., Discrete Element Method (DEM) and CFD-DEM.In these methods, a particle is treated as a point mass(not resolved) and the fluid drag force acting on theparticles in suspension is modeled.

Current practice in simulations is to use the meandrag force acting on the particle suspension as a functionof flow parameters (Reynolds number) and the particlepacking density (solid fraction - φ) [4–6]. Giventhe variability of drag force on individual particles insuspension, this paper explores techniques in physics-guided machine learning to advance the current state-of-the-art for drag force prediction in CFD-DEM bylearning from a small amount of PRS data.

Our contributions are as follows:•We introduce PhyDNN , a novel physics-guided modelarchitecture that yields state-of-the-art results for theproblem of particle drag force prediction.• We introduce physics-guided auxiliary tasks to trainPhyDNN more effectively with limited data.•We augment PhyDNN architecture with aggregate su-pervision applied over the auxiliary tasks to ensure con-sistency with physics knowledge.• Finally, we conduct extensive experimentation to un-cover several useful properties of our model in settingswith limited data and showcase that PhyDNN is consis-tent with existing physics knowledge about factors in-fluencing drag force over a particle, thus yielding greatermodel interpretability.

2 Related Work

There have been multiple efforts to leverage domainknowledge in the context of increasing the performanceof data-driven or statistical models. Methods havebeen designed to influence training algorithms in MLusing domain knowledge, e.g., with the help of phys-ically based priors in probabilistic frameworks [7–9],regularization terms in statistical models [10, 11], con-straints in optimization methods [12, 13], and rules inexpert systems [14, 15]. In a recent line of research,new types of deep learning models have been proposed(e.g., ODEnet [16] and RKnet [17]) by treating sequen-tial deep learning models such as residual networks andrecurrent neural networks as discrete approximations ofordinary differential equations (ODEs).

Yaser et al. show hints, i.e., prior knowledge can be

incorporated into learning-from-example paradigm [15].In [18] the authors explored the idea of incorporatingdomain knowledge directly as a regularizer in neuralnetworks to influence training and showed better gener-alization performance. In [19,20] domain knowledge wasincorporated into a customized loss function for weaksupervision that relies on no training labels.

There have been efforts to incorporate prior knowl-edge about a problem (like low rank structure of con-volutional filters to be designed) into model architec-ture design (structural priors) [21]. Also, to design neu-ral network architectures, to incorporate feature invari-ance [22], implicit physics rules [23] to enable learningrepresentations consistent with physics laws and explic-itly incorporating knowledge as constraints [24]. In [25]the authors propose a neural network model where eachindividual neuron learns ”laws” similar to physics lawsapplied to learn the behavior and properties of com-plex many-body physical systems. In [26], the authorspropose a theory that details how to design neural net-work architectures for data with non-trivial symmetries.However none of these efforts are directly applicable toencode the physical relationships we are interested inmodeling.

3 Proposed PhyDNN Framework

3.1 Problem Background: Given a collection of N3D particles suspended in a fluid moving along the Xdirection, we are interested in predicting the drag forceexperienced by the ith particle, Fi, along theX directiondue to the moving fluid. This can be treated as asupervised regression problem where the output variableis Fi, and the input variables include features capturingthe spatial arrangement of particles neighboring particlei, as well as other attributes of the system such asReynolds Number, Re, and Solid Fraction (fraction ofunit volume occupied by particles), φ. Specifically,we consider the list of 3D coordinates of 15-nearestneighbors around particle i, appended with (Re, φ) asthe set of input features, represented as a flat 47-lengthvector, Ai.

A simple way to learn the mapping from Ai to Fiis by training feed-forward deep neural network (DNN)models, that can express highly non-linear relationshipsbetween inputs and outputs in terms of a hierarchy ofcomplex features learned at the hidden layers of thenetwork. However, black-box architectures of DNNswith arbitrary design considerations (e.g., layout of thehidden layers) can fail to learn generalizable patternsfrom data, especially when training size is small. Toaddress the limitations of black-box models in our targetapplication of drag force prediction, we present a novelphysics-guided DNN model, termed PhyDNN , that uses

Input Layer [1]Fully-connected LayerAcvaon: ELUInput dimension: 47Output dimension: 128

Shared Layers [2]4 Fully-connected Layers

Acvaon: ELUInput dimension: 128Output dimension: 128

Pressure Field [3]Fully-connected Layer


Velocity Field [4]Fully-connected Layer


Convoluon Layer [5]1D Convoluonal LayerAcvaon: LinearInput dimension: (2,10)Output dimension: (2, 8)

Pooling Layer [6]1D MaxPooling LayerAcvaon: LineaerInput dimension: (2, 8)Output dimension: (2, 4)

Shear Component [7]Fully-connected LayerAcvaon: LinearInput dimension: 4Output dimension: 3

Pressure Component [8]Fully-connected LayerAcvaon: LinearInput dimension: 4Output dimension: 3

Output Layer [9]Fully-connected LayerAcvaon: LinearInput dimension: 6Output dimension: 1

Figure 2: PhyDNN Architecture

physical knowledge in the design and learning of theneural network, as described in the following.

3.2 Physics-guided Model Architecture: In or-der to design the architecture of PhyDNN, we deriveinspiration from the known physical pathway from theinput features Ai to drag force Fi, which is at the ba-sis of physics-based model simulations such as ParticleResolved Simulations (PRS). Essentially, the drag forceon a particle i can be easily determined if we know twokey physical intermediate variables: the pressure field(Pi) and the velocity field (Vi) around the surface ofthe particle. It is further known that Pi directly affectsthe pressure component of the drag force, FPi , and Vi

directly affects the shear component of the drag force,FSi . Together, FPi and FSi add up to the total dragforce that we want to estimate, i.e., Fi = FPi + FSi .

Using this physical knowledge, we design ourPhyDNN model so as to express physically meaningfulintermediate variables such as the pressure field, veloc-ity field, pressure component, and shear component inthe neural pathway from Ai to Fi. Figure 2 shows thecomplete architecture of our proposed PhyDNN modelwith details on the number of layers, choice of activa-tion function, and input and output dimensions of ev-ery block of layers. In this architecture, the input layerpasses on the 47-length feature vectors Ai to a collec-tion of four Shared Layers that produce a common setof hidden features to be used in subsequent branchesof the neural network. These features are transmittedto two separate branches: the Pressure Field Layer andthe Velocity Field Layer, that express Pi and Vi, respec-tively, as 10-dimensional vectors. Note that Pi and Vi

represent physically meaningful intermediate variablesobserved on a sequence of 10 equally spaced points onthe surface of the particle along the X direction.

The outputs of pressure field and velocity fieldlayers are combined and fed into a 1D Convolutional

layer that extracts the sequential information containedin the 10-dimensional Pi and Vi vectors, followedby a Pooling layer to produce 4-dimensional hiddenfeatures. These features are then fed into two newbranches, the Shear Component Layer and the PressureComponent Layer, expressing 3-dimensional FS

i and FPi ,

respectively. These physically meaningful intermediatevariables are passed on into the final output layer thatcomputes our target variable of interest: drag forcealong the X direction, Fi. Note that we only makeuse of linear activation functions in all of the layers ofour PhyDNN model following the Pressure Field andVelocity Field layers. This is because of the domaininformation that once we have extracted the pressureand velocity fields around the surface of the particle,computing Fi is relatively straightforward. Hence, wehave designed our PhyDNN model in such a way thatmost of the complexity in the relationship from Ai to Fiis captured in the first few layers of the neural network.The layout of hidden layers and the connections amongthe layers in our PhyDNN model is thus physics-guided.Further, the physics-guided design of PhyDNN ensuresthat we hinge some of the hidden layers of the networkto express physically meaningful quantities rather thanarbitrarily complex compositions of input features, thusadding to the interpretability of the hidden layers.

3.3 Learning with Physical Intermediates: It isworth mentioning that all of the intermediate variablesinvolved in our PhyDNN model, namely the pressurefield Pi, velocity field Vi, pressure component FP

i , andshear component FS

i , are produced as by-products of thePRS simulations that we have access to during training.Hence, rather than simply learning on paired examplesof inputs and outputs, (Ai, Fi), we consider learning ourPhyDNN model over a richer representation of trainingexamples involving all intermediate variables along withinputs and outputs. Specifically, for a given input Ai,

we not only focus on accurately predicting the outputvariable Fi at the output layer, but doing so whilealso accurately expressing every one of the intermediatevariables (Pi,Vi,F

Pi ,F

Si ) at their corresponding hidden

layers. This can be achieved by minimizing the followingempirical loss during training:

LossMSE = MSE(F, F̂ ) + λP MSE(P, P̂)

+λV MSE(V, V̂) + λFP MSE(FP, F̂P)

+λFS MSE(FS, F̂S)

where MSE represents the mean squared error, x̂ repre-sents the estimate of x, and λP , λV , λFP , and λFS rep-resent the trade-off parameters in miniming the errorson the intermediate variables. Minimizing the aboveequation will help in constraining our PhyDNN modelwith loss terms observed not only on the output layerbut also on the hidden layers, grounding our neural net-work to a physically consistent (and hence, generaliz-able) solution. Note that this formulation can be viewedas a multi-task learning problem, where the predictionof the output variable can be considered as the primarytask, and the prediction of intermediate variables can beviewed as auxiliary tasks that are related to the primarytask through physics-informed connections, as capturedin the design of our PhyDNN model.

3.4 Using Physics-guided Loss: Along with learn-ing our PhyDNN using the empirical loss observed ontraining samples, LossMSE , we also consider adding anadditional loss term that captures our physical knowl-edge of the problem and ensures that the predictionsof our PhyDNN model do not violate known physicalconstraints. In particular, we know that the distribu-tion of pressure and velocity fields over different com-binations of Reynolds number (Re) and solid fraction(φ) show varying aggregate properties (e.g., differentmeans), thus exhibiting a multi-modal distribution. Ifwe train our PhyDNN model on data instances belong-ing to all (Re,φ) combinations using LossMSE , we willobserve that the trained model will under-perform onsome of the modes of the distribution that are under-represented in the training set. To address this, wemake use of a simple form of physics-guided aggregatesupervision, where we enforce the predictions P̂(Re,φ)

and V̂(Re,φ) of the pressure and velocity fields arounda particle respectively, at a given combination of (Re,φ)to be close to the mean of the actual values of P and Vproduced by the PRS simulations at that combination.If P (Re,φ) and V (Re,φ) represent the mean of the pres-sure and velocity field respectively for the combination(Re, φ), we consider minimizing the following physics-

guided loss:

LossPHY =∑

Re

∑

φ

MSE(µ(P̂(Re,φ)), P (Re,φ))

+MSE(µ(V̂(Re,φ)), V (Re,φ))

The function µ(·) : R −→ R is a mean function. Wefinally consider the combined loss LossMSE +LossPHYfor learning our PhyDNN model.

4 Dataset Description

The dataset used has 5824 particles. Each particle has47 input features including three-dimensional coordi-nates for fifteen nearest neighbors relative to the targetparticle’s position, the Reynolds number (Re) and solidfraction (φ) of the specific experimental setting (thereare a total of 16 experimental settings with different(Re, φ) combinations). Labels include the drag force inthe X-direction Fi ∈ R1×1 as well as variables for auxil-iary training, i.e., pressure fields (Pi ∈ R10×1), velocityfields (Vi ∈ R10×1), pressure components (FP

i ∈ R3×1)and shear components of the drag force (FS

i ∈ R3×1)†.

4.1 Experimental Setup All deep learning modelsused have 5 hidden layers, a hidden size of 128 andwere trained for 500 epochs with a batch size of 100.Unless otherwise stated, 55% of the dataset was used fortraining while the remaining data was used for testingand evaluation. We applied standardization to the allinput features and labels in the data preprocessing step.

Baselines: We compare the performance ofPhyDNN with several state-of-the-art regression base-lines and a few close variants of PhyDNN .•Linear Regression (Linear Reg.), Random Forest Re-gression (RF Reg.), Gradient Boosting Regression (GBReg.) [27]: We employed an ensemble of 100 estimatorsfor RF, GB Reg. models and left all other parametersunchanged.• DNN: A standard feed-forward neural network modelfor predicting the scalar valued particle drag force Fi.• DNN+ Pres: A DNN model which predicts the pres-sure field around a particle (Pi) in addition to Fi.• DNN+ Vel: A DNN model which predicts the velocityfield around a particle (Vi) in addition to Fi.• DNN-MT-Pres: Similar to DNN+ Pres except thatthe pressure and drag force tasks are modeled in a multi-task manner with a set of disjoint layers for each of thetwo tasks and a separate set of shared layers.•DNN-MT-Vel; Similar to DNN-MT-Pres except in thiscase the auxiliary task models the velocity field aroundthe particle (Vi) in addition to drag force (Fi).

†Further details about the dataset included in the appendix.

We employ three metrics for model evaluation:MSE & MRE: We employ the mean squared error

(MSE) and mean relative error (MRE) [2] metrics toevaluate model performance. Though MSE can capturethe absolute deviation of model prediction from theground truth values, it can vary a lot for different scalesof the label values, e.g., for higher drag force values,MSE is prone to be higher, vice versa. Thus, the needfor a metric that is invariant to the scale of the labelvalues brings in the MRE as an important supplementalmetric in addition to MSE.

MRE =1

m

m∑

i=1

|F̂i − Fi|F (Re,φ)

F (Re,φ) is the mean drag force for (Re, φ) set-

ting and F̂i the predicted drag force for particle i.

AU-REC: The thirdmetric we employ is thearea under the relativeerror curve (AU-REC).The relative error curverepresents the cumu-lative distribution ofrelative error betweenthe predicted drag forcevalues and the groundtruth PRS drag forcedata. AU-REC calculates the area under this curve.The AU-REC metric ranges between [0,1] and higherAU-REC values indicate superior performance.

5 Experimental Results

We conducted multiple experiments to characterizeand evaluate the model performance of PhyDNN withphysics-guided architecture and physics-guided aggregatesupervision. Cognizant of the cost of generation of dragforce data, we aim to evaluate models in settings wherethere is a paucity of labelled training data.

5.1 Physics-Guided Auxiliary Task SelectionWhen data about the target task is limited, we may em-ploy exogenous inputs of processes that have an indirectinfluence over the target process to alleviate the effectsof data paucity on model training. An effective wayto achieve this is through multi-task learning. We firstevaluate multi-task model performance relative to thecorresponding single-task models to demonstrate per-formance gains. Table 1 shows the results of severalmulti-task and single task architectures that we testedto establish the superiority of multi-task models in thecontext of the particle drag force prediction task. It

Model MSE MRE AU-REC(% Imp.)

Linear Reg. 47.47 38.48 0.71332 (-19.54)

RF Reg. 29.33 19.13 0.82148 (-7.3)

GB Reg. 25.02 17.55 0.83692 (-5.60)

DNN 20.50 16.72 0.84573 (-4.61)

DNN-MT-Pres 20.12 15.66 0.85593 (-3.45)

DNN-MT-Vel 19.98 15.69 0.85556 (-3.49)

PhyDNN-FP

x FSx

15.54 14.06 0.87232 (-1.61)

PhyDNN 14.28 12.59 0.88657 (–)

Table 1: We compare the performance of PhyDNN andits variant PhyDNN-FP

x FSx (only x-components of pressure

and shear drag are modeled) with many state-of-the-artregression baselines and show that the PhyDNN model yieldssignificant performance improvement over all other modelsfor the particle drag force prediction task. The last columnof the table reports the AU-REC metric and also quantifiesthe percentage improvement of the best performing modeli.e PhyDNN w.r.t all other models in the context of the AU-REC metric.

is widely known and accepted in physics that the dragforce on each particle in fluid-particle systems such asthe one being considered in this paper, is influencedstrongly by the pressure and velocity fields acting onthe particles [2]. Hecnce, we wish to explicitly model thepressure and velocity fields around a particle, in addi-tion to the main problem of predicting its drag force. Tothis end, we design two multi-task models, DNN-MT-Pres, DNN-MT-Vel, as described in section 4.1. Wenotice that the two multi-task models DNN-MT-Presand DNN-MT-Vel outperform not only the DNN modelbut also their single task counterparts (DNN+ Pres ,DNN+ Vel). This improvement in performance may beattribured to the carefully selected auxiliary tasks to aidin learning the representation of the main task. Thisphysics-guided auxiliary task selection is also impera-tivie to our process of development of PhyDNN modelsto be detailed in section 5.2. However, Table 1 also un-covers another interesting property which is the DNN+models underperforming compared to their DNN-MTcounterparts. Although the DNN+ and the DNN-MTmodels are predicting the same set of 11 values i.e 1drag force value and 10 pressure or velocity samples inthe vicinity of the particle, the DNN+ models maketheir predictions as part of a single task. Hence, theimportance of the main task is diminished by the 10additional auxiliary task values as the model tries tolearn a jointly optimal representation. However, in thecase of the DNN-MT models, each task has a set of dis-joint hidden layers geared specifically towards learningthe representation of the main task and a similar setof layers for the auxiliary task (in addition to a set ofshared layers), which yields more flexibility in learningrepresentations specific to the main and auxiliary task

as well as a shared common representation. In addi-tion, it is straightforward to explicitly control the effectof auxiliary tasks on the overall learning process in amulti-task setting.

5.2 Physics-Guided Learning Architecture Sec-tion 5.1 showcases the effectiveness of multi-task learn-ing and of physics-guided auxiliary task selection forlearning improved representations of particle drag force.We now delve deeper and inspect the effects of expand-ing the realm of auxiliary tasks. In addition to this, wealso use our domain knowledge regarding the physicsof entities affecting the drag force acting on each par-ticle, to influence model architecture through physics-guided structural priors. As mentioned in Section 3,PhyDNN has four carefully and deliberately chosen aux-iliary tasks (pressure field prediction, velocity field pre-diction, predicting the pressure component(s) of drag,predicting the shear components of drag) aiding themain task of particle drag force prediction. In additionto this, the auxiliary tasks are arranged in a sequen-tial manner to incorporate physical inter-dependenciesamong them leading up to the main task of parti-cle drag force prediction. The effect of this carefullychosen physics-guided architecture and auxiliary taskscan be observed in Table 1. We now inspect the dif-ferent facets of this physics-guided architecture of thePhyDNN model‡.

0.0 0.2 0.4 0.6 0.8 1.0Percentage Error Compared to PRS Data

0.0

0.2

0.4

0.6

0.8

1.0

Cum

ulat

ive

Dist

ribut

ion

DNNPhyDNNMean

Figure 3: The cumulative distribution function of relative er-ror for all (Re,φ) combinations. Overall, the PhyDNN modelcomfortably outperforms the DNN model and the Meanbaseline (dotted red line).

We first characterize the performance of ourPhyDNN models with respect to the DNN and meanbaseline. Fig. 3 represents the cumulative distributionof relative error of the predicted drag forces and thePRS ground truth drag force data. We notice thatboth DNN and PhyDNN outperform the mean baseline

‡PhyDNNwas found to be robust to changes in auxiliary taskhyperparameters, results included in appendix.

which essentially predicts the mean value per (Re,φ)combination. The PhyDNN model significantly out-performs the DNN (current state-of-the-art [2]) modelto yield the best performance overall. We also testedDNN variants with dropout and L2 regularization butfound that performance deteriorated. Results wereexcluded due to space constraints.

5.3 Performance With Limited Data Bearing inmind the high data generation cost of the PRS simu-lation, we wish to characterize an important facet ofthe PhyDNN model, namely, its ability to learn effec-tive representations when faced with a paucity of train-ing data. Hence, we evaluate the performance of thePhyDNN model as well as the other single task andmulti-task DNN models, on different experimental set-tings obtained by continually reducing the fraction ofdata available for training the models. In our experi-ments, the training fraction was reduced from 0.85 (i.e85% of the data used for training) to 0.35 (i.e 35% ofthe data used for training).

0.35 0.45 0.55 0.65 0.75 0.85Training Fraction

0.75

0.80

0.85

0.90

AU-R

EC

PhyDNNRF Reg.

Linear Reg.GB Reg.

MeanDNN

Figure 4: Model performance comparison for different levelsof data paucity.

Fig. 4 showcases the model performance in settings withlimited data. We see that PhyDNN model significantlyoutperforms all other models in most settings (sparseand dense). We note that GB Reg. yields comparableperformance to the PhyDNN model for the case of 0.35training fraction. However, the gradient boosting (andall the other regression models except DNN) fail to learnuseful information as more data is provided for training.We also notice that the DNN model fails to outperformthe PhyDNN model for all but the last setting i.e thesetting with 0.85 training fraction.

5.4 Characterizing PhyDNN Performance ForDifferent (Re,φ) Settings. In addition to quantita-tive evaluation, qualitative inspection is necessary fora deeper, holistic understanding of model behavior.Hence, we showcase the particle drag force predictionsby the PhyDNN model for different (Re,φ) combina-

tions in Fig. 5§. We notice that the PhyDNN modelyields accurate predictions (i.e yellow and red curvesare aligned). This indicates that the PhyDNN model isable to effectively capture sophisticated particle inter-actions and the consequent effect of said interactions onthe drag forces of the interacting particles. We noticethat for high (Re,φ) as in Fig. 5b, the drag force i.ePRS curve (yellow) is nonlinear in nature and that themagnitude of drag forces is also higher at higher (Re,φ)settings. Such differing scales of drag force values canalso complicate the drag force prediction problem as itis non-trivial for a single model to effectively learn suchmulti-modal target distributions. However, we find thatthe PhyDNN model is effective in this setting.

0 20 40 60 80 100 120 140Particle Index

2

4

6

8

10

12

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(a) Re = 10, φ = 0.2

0 50 100 150 200Particle Index

40

60

80

100

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(b) Re = 200, φ = 0.35

Figure 5: Each figure shows a comparison betweenPhyDNN predictions (red curve) and ground truth drag forcedata (yellow curve), for different (Re,φ) cases. We also show-case the mean drag force value for each (Re,φ) case (black).

Thus far, we characterized the performance of thePhyDNN model in isolation for different (Re,φ) con-texts. In order to gain a deeper understanding ofthe performance of PhyDNN models for different (Re,φ)combinations, we show percentage improvement for theAUREC metric of PhyDNN model and three other mod-els in Fig. 6a - Fig. 6c. We choose DNN, DNN-MT-Pres, DNN-MT-Vel as these are the closest by designto PhyDNN among all the baselines we consider in thispaper. In Fig. 6, we see that PhyDNN outperformsthe other models in most of the (Re,φ) settings.PhyDNN when compared with the DNN model achievesespecially good performance for low solid fraction set-tings which may be attributed to the inability of theDNN model to learn effectively with low data volumesas lower solid fractions have fewer training instances. Inthe case of the DNN-MT models, the PhyDNN modelachieves significant performance improvement for highsolid fraction (φ = {0.2, 0.3, 0.35}) cases and also for Re= {10, 200}, indicating that PhyDNN is able to performwell in the most complicated scenarios (high Re, highφ). PhyDNN is able to achieve superior performance in14 out of the 16 (Re, φ) settings across all three models.

§Figures for all (Re,φ) combinations are in the appendix.

5.5 Verifying Consistency With DomainKnowledge A significant advantage of physics-guidedmulti-task structural priors is the increased inter-pretability provided by the resulting architecture.Since each component of the PhyDNN model has beendesigned and included based on sound domain theory,we may employ this theoretical understanding to verifythrough experimentation that the resulting behaviorof each auxiliary component is indeed consistent withknown theory. We first verify the performance of thepressure and shear drag component prediction task inthe PhyDNN model. It is well accepted in theory thatfor high Reynolds numbers, the proportion of the shearcomponents of drag (FS) decreases [2]. In order toevaluate this, we consider the ratio of the magnitude ofthe predicted pressure components in the x-direction(FPx ∈ FP) to the magnitude of the predicted shearcomponents in the x-direction (FSx ∈ FS) for every (Re,φ) setting¶. The heatmap in Fig. 7 depicts the compar-ison of this ratio of predicted pressure components topredicted shear components to a similar ratio derivedfrom the ground truth pressure and shear components.We notice that there is good agreement between thepredicted and ground truth ratios for each (Re, φ)setting and also that the behavior of the predictedsetting is indeed consistent with known domain theoryas there is a noticeable decrease in the contribution ofthe shear components as we move toward high Re andhigh solid fraction φ settings.

5.6 Auxiliary Representation Learning WithPhysics-Guided Statistical Constraints Two ofthe auxiliary prediction tasks involve predicting thepressure and velocity field samples around each par-ticle. We hypothesized that since the drag force of aparticle is influenced by the pressure and velocity fields,modeling them explicitly should help the model learnan improved representation of the main task of particledrag force prediction. In Fig. 8, we notice that ground-truth pressure field PDFs exhibit a grouped structure.Interestingly, the pressure field PDFs can be dividedinto three distinct groups with all the pressure fieldswith φ = 0.2 being grouped to the left of the plot, pres-sure fields with φ = 0.1 being grouped toward the bot-tom, right of the plot and the rest of the PDFs form-ing a core (highly dense) group in the center. Hence,we infer that solid fraction has a significant influenceon the pressure field. It is non-trivial for models toautomatically replicate such multi-modal and groupedbehavior and hence we introduce physics-guided statisti-

¶Similar behavior was recorded even when ratios were takenfor all three pressure and shear drag components.

(a) PhyDNN vs. DNN (b) PhyDNN vs. DNN-MT-Pres. (c) PhyDNN vs. DNN-MT-Vel

Figure 6: Each figure indicates the percentage improvement in the context of the AU-REC metric of the PhyDNN modelover the DNN (Fig. 6a), DNN-MT-Pres (Fig. 6b) and DNN-MT-Vel (Fig. 6c). Red squares show that PhyDNN does betterand blue squares indicate that other models outperform PhyDNN . PhyDNN yields significant performance improvementover other models.

0.1 0.2 0.3 0.35Solid Fraction

1050

100

200

Reyn

olds

Num

ber

0.83 1.00 1.30 1.68

0.63 1.21 1.52 2.15

0.74 1.66 1.93 2.77

1.47 2.43 3.05 3.65

Predictions

0.1 0.2 0.3 0.35Solid Fraction

1050

100

200

Reyn

olds

Num

ber

0.82 1.15 1.45 1.78

1.02 1.40 1.72 2.18

1.24 1.65 2.04 2.58

1.68 2.36 2.95 3.61

Ground Truth

1.0

1.5

2.0

2.5

3.0

3.5

1.0

1.5

2.0

2.5

3.0

3.5

Figure 7: Heatmap showing ratio of absolute value ofpressure drag (FP

x ) x-component to shear drag (FSx )

x-component i.e( |FP

x ||FS

x |). Left figure shows ratio for

PhyDNN predictions and the figure on the right shows thesame ratio for ground truth data. We notice that distribu-tion of ratios in both figures is almost identical.

cal priors through aggregate supervision during modeltraining of PhyDNN . We notice that the learned dis-tribution with aggregate supervision Fig. 8 (center) hasa similar grouped structure to the ground truth PDFpressure field. For the purpose of comparison, we alsoobtained the predicted pressure field PDFs of a versionof PhyDNN trained without aggregate supervision andthe result is depicted in Fig. 8 (right). We notice thatthe PDFs exhibit a kind of mode collapse behavior anddo not display any similarities to ground truth pressurefield PDFs. Similar aggregate supervision was also ap-plied to the velocity field prediction task and we foundthat incorporating physics-guided aggregate supervisionto ensure learning representations consistent with the-ory, led to significantly improved model performance.

6 Conclusion

In this paper, we introduce PhyDNN . A physics in-spired deep learning model developed to incorporatefluid mechanical theory into the model architecture

and proposed physics informed auxiliary tasks selec-tion to aid with training under data paucity. We con-duct a rigorous analysis to test PhyDNN performancein settings with limited labelled data and find thatPhyDNN significantly outperforms all other state-of-the-art regression baselines for the task of particle dragforce prediction achieving an average performance im-provement of 8.46% across all models. We verify thateach physics informed auxiliary task of PhyDNN is con-sistent with existing physics theory, yielding greatermodel interpretability. Finally, we also showcase theeffect of augmenting PhyDNN with physics-guided ag-gregate supervision to constrain auxiliary tasks to beconsistent with ground truth data. In future, we willaugment PhyDNN with more sophisticated learning ar-chitectures.

References

[1] L. He, D. K. Tafti, and K. Nagendra, “Evaluation ofdrag correlations using particle resolved simulations ofspheres and ellipsoids in assembly,” Powder technology,vol. 313, 2017.

[2] L. He and D. K. Tafti, “A supervised machine learningapproach for predicting variable drag forces on spher-ical particles in suspension,” Powder technology, vol.345, 2019.

[3] J. Li and J. Kuipers, “Gas-particle interactions indense gas-fluidized beds,” Chemical Engineering Sci-ence, vol. 58, no. 3-6, 2003.

[4] C. Wen, “Yh yu. mechanics of fluidization,” in Chem-ical Engineering Progress Symposium Series, vol. 62,no. 62, 1966.

[5] R. Di Felice, “The voidage function for fluid-particleinteraction systems,” International Journal of Multi-phase Flow, vol. 20, no. 1, 1994.

1 0 1 2 3 4 5Ground Truth

0.0

0.5

1.0

1.5

2.0

2.5

Re=10-SF=0.1Re=10-SF=0.2Re=10-SF=0.3Re=10-SF=0.35




6 4 2 0 2 4 6Prediction With Aggregate Supervision

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

6 4 2 0 2 4 6Prediction Without Aggregate Supervision

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

Figure 8: The figure depicts the densities of the ground truth (left) and predicted (center, right) pressure fields of thePhyDNN model for each (Re,φ). Specifically, we wish to highlight the effect of aggregate supervision (physics-guidedstatistical prior) on the predicted pressure field. Notice that the PDFs of the pressure fields predicted with aggregatesupervision are relatively more distributed similar to the ground truth distribution of pressure field PDFs as opposed tothe plot on the right which represents predicted pressure field PDFs in the abscence of aggregate supervision and incorrectlydepicts a some what uniform behavior for all the PDFs of different (Re,φ) cases.

[6] S. Tenneti, R. Garg, and S. Subramaniam, “Draglaw for monodisperse gas–solid systems using particle-resolved direct numerical simulation of flow past fixedassemblies of spheres,” International journal of multi-phase flow, vol. 37, no. 9, 2011.

[7] K. C. Wong, L. Wang, and P. Shi, “Active model withorthotropic hyperelastic material for cardiac imageanalysis,” in Functional Imaging and Modeling of theHeart. Springer, 2009.

[8] J. Xu et al., “Robust transmural electrophysiologi-cal imaging: Integrating sparse and dynamic physio-logical models into ecg-based inference,” in MICCAI.Springer, 2015.

[9] H. Denli et al., “Multi-scale graphical models forspatio-temporal processes,” in NeurIPS, 2014.

[10] S. Chatterjee et al., “Sparse group lasso: Consistencyand climate applications.” in SDM12’. SIAM, 2012.

[11] J. Liu et al., “Accounting for linkage disequilibrium ingenome-wide association studies: a penalized regres-sion method,” Statistics and its interface, vol. 6, no. 1,2013.

[12] A. J. Majda and J. Harlim, “Physics constrained non-linear regression models for time series,” Nonlinearity,vol. 26, no. 1, 2012.

[13] A. J. Majda and Y. Yuan, “Fundamental limitationsof ad hoc linear and quadratic multi-level regressionmodels for physical systems,” Discrete and ContinuousDynamical Systems B, vol. 17, no. 4, 2012.

[14] D. Waterman, “A guide to expert systems,” 1986.[15] Y. S. Abu-Mostafa, “Learning from hints in neural

networks,” Journal of complexity, vol. 6, no. 2, 1990.[16] T. Q. Chen et al., “Neural ordinary differential equa-

tions,” in NeurIPS, 2018.[17] M. Zhu, B. Chang, and C. Fu, “Convolutional neu-

ral networks combined with runge-kutta methods,”arXiv:1802.08831, 2018.

[18] A. Karpatne et al., “Theory-guided data science: Anew paradigm for scientific discovery from data,” IEEETKDE, vol. 29, no. 10, 2017.

[19] H. Ren et al., “Learning with weak supervision fromphysics and data-driven constraints.” AI Magazine,vol. 39, no. 1, 2018.

[20] R. Stewart and S. Ermon, “Label-free supervision ofneural networks with physics and domain knowledge,”in AAAI, 2017.

[21] Y. A. Ioannou, “Structural priors in deep neural net-works,” Ph.D. dissertation, University of Cambridge,2018.

[22] J. Ling, A. Kurzawski, and J. Templeton, “Reynoldsaveraged turbulence modelling using deep neural net-works with embedded invariance,” Journal of FluidMechanics, vol. 807, 2016.

[23] S. Seo and Y. Liu, “Differentiable physics-informedgraph networks,” arXiv:1902.02950, 2019.

[24] J. Z. Leibo et al., “View-tolerant face recognitionand hebbian learning imply mirror-symmetric neuraltuning to head orientation,” Current Biology, vol. 27,no. 1, 2017.

[25] B. Anderson, T.-S. Hy, and R. Kondor, “Cor-morant: Covariant molecular neural networks,”arXiv:1906.04015, 2019.

[26] R. Kondor and S. Trivedi, “On the generalization ofequivariance and convolution in neural networks to theaction of compact groups,” arXiv:1802.03690, 2018.

[27] F. Pedregosa et al., “Scikit-learn: Machine learning inPython,” JMLR, vol. 12, 2011.

AppendixPhysics-guided Design and Learning of Neural Networks

for Predicting Drag Force on Particle Suspensions in Moving Fluids

Nikhil Muralidhar ∗† Jie Bu ∗† Ze Cao ‡ Long He ‡ Naren Ramakrishnan ∗

Danesh Tafti ‡ Anuj Karpatne ∗†

1 Dataset Description

The dataset used has 5824 particles. Each particle has47 input features including three-dimensional coordi-nates for fifteen nearest neighbors relative to the targetparticle’s position, the Reynolds number (Re) and solidfraction (φ) of the specific experimental setting (thereare a total of 16 experimental settings with different(Re, φ) combinations). Labels include the drag force inthe X-direction Fx ∈ R1×1 as well as variables for aux-iliary training, i.e., pressure fields (P ∈ R10×1), velocityfields (V ∈ R10×1), pressure components (FP ∈ R3×1)and shear components of the drag force (FS ∈ R3×1).

Features Range of DataX ∈ R15×1 −2.93 ∼ 2.95Y ∈ R15×1 −2.96 ∼ 2.95Z ∈ R15×1 −2.96 ∼ 2.96Re ∈ R1×1 {10, 50, 100, 200}φ ∈ R1×1 {0.1, 0.2, 0.3, 0.35}

Table 1: The 47 input features of the dataset. Indexes arethe column index of the features in the dataset. X,Y,Zcorrespond to the x, y, z coordinates respectively for the 15nearest neighbors of a particular particle. Re is the Reynoldsnumbers. The φ is the global solid fraction for the particularexperimental setting.

2 Experimental Results

2.1 Characterizing PhyDNN Performance ForDifferent (Re,φ) Settings. In addition to quanti-tative evaluation, qualitative inspection is necessaryfor a deeper, holistic understanding of model behav-ior. Hence, we showcase the particle drag force predic-tions by the PhyDNN model for different (Re,φ) combi-nations in Fig. 1. We notice that the PhyDNN modelyields accurate predictions (i.e yellow and red curves

∗Dept. of Computer Science, Virginia Tech, VA, USA†Discovery Analytics Center, Virginia Tech, VA, USA‡Dept. of Mechanical Engineering, Virginia Tech, VA, USA

Labels Range of DataFx ∈ R1×1 0.74 ∼ 107.93P ∈ R10×1 −1.26 ∼ 4.72V ∈ R10×1 0.07 ∼ 5.29FPx ∈ R1×1 0.27 ∼ 92.32FPy ∈ R1×1 −15.54 ∼ 19.61FPz ∈ R1×1 −14.23 ∼ 15.28FSx ∈ R1×1 0.42 ∼ 17.19FSy ∈ R1×1 −2.88 ∼ 3.57FSz ∈ R1×1 −2.95 ∼ 5.05

Table 2: Fx is the drag force the particle experienced onthe x direction, which is the target variable to predict.FPx , F

Py , F

Pz represent the pressure drag components in the

x,y,z directions respectively. FSx , F

Sy , F

Sz represent the shear

drag components in the x,y,z directions respectively.

are aligned). This indicates that the PhyDNN model isable to effectively capture sophisticated particle inter-actions and the consequent effect of said interactions onthe drag forces of the interacting particles. We noticethat for high (Re,φ) as in Fig. 1p, the drag force i.ePRS curve (yellow) is nonlinear in nature and that themagnitude of drag forces is also higher at higher (Re,φ)settings. Such differing scales of drag force values canalso complicate the drag force prediction problem as itis non-trivial for a single model to effectively learn suchmulti-modal target distributions. However, we find thatthe PhyDNN model is effective in this setting.

2.2 Hyperparameter Sensitivity Each of the fourauxiliary tasks in the PhyDNN models, is governedby a hyperparameter during model training (refer toSection 3 in the main paper). In our experiments, weonly tune the hyperparameters for the pressure fieldand velocity field prediction tasks leaving all otherhyperparameters set to static values for all experiments.We employ a grid search procedure on the validationset to select the optimal hyperparameter values for thepressure and velocity field prediction auxiliary tasks in

arX

iv:1

911.

0424

0v1

[cs

.LG

] 6

Nov

201

9


1

2

3

4

5

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(a) Re = 10, φ = 0.1

0 20 40 60 80 100 120 140Particle Index

2

4

6

8

10

12

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(b) Re = 10, φ = 0.2


68

1012141618

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(c) Re = 10, φ = 0.3


5101520253035

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(d) Re = 10, φ = 0.35


2

4

6

8

10

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(e) Re = 50, φ = 0.1

0 20 40 60 80 100 120 140Particle Index

468

101214161820

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(f) Re = 50, φ = 0.2

0 25 50 75 100 125 150 175 200Particle Index

10.012.515.017.520.022.525.027.5

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(g) Re = 50, φ = 0.3


101520253035404550

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(h) Re = 50, φ = 0.35


2.55.07.5

10.012.515.017.520.0

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(i) Re = 100, φ = 0.1

0 20 40 60 80 100 120 140 160Particle Index

5

10

15

20

25

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(j) Re = 100, φ = 0.2


15

20

25

30

35

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(k) Re = 100, φ = 0.3


20

30

40

50

60

70

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(l) Re = 100, φ = 0.35

0 10 20 30 40 50 60 70Particle Index

5

10

15

20

25

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(m) Re = 200, φ = 0.1

0 20 40 60 80 100 120Particle Index

10152025303540

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(n) Re = 200, φ = 0.2

0 25 50 75 100 125 150 175Particle Index

20253035404550

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(o) Re = 200, φ = 0.3


40

60

80

100

Drag

For

ce

MeanPRS

PhyDNN Avg.PhyDNN

(p) Re = 200, φ = 0.35

Figure 1: Each figure shows a comparison between PhyDNN predictions (red curve) and ground truth drag force data(yellow curve), for different (Re,φ) cases..We also showcase the mean drag force value for each (Re,φ) case (black).

Training Fraction λP λV0.35 1e−2 1e−1

0.45 1e−4 1e−1

0.55 1e−4 1e−3

0.65 1e−4 1e−1

0.75 1e−2 1e−3

0.85 1e−4 1e−2

Table 3: The table showcases hyperparameter values ofPhyDNN , for different levels of training fractions each ob-tained through gridsearch. It must be noted that only thehyperparameters for the pressure and velocity field predic-tion auxiliary tasks were tuned and the rest of the valueswere kept constant for all experiments λFP = 0.01, λFS =0.01.

the PhyDNN model. In order to characterize the effectof this hyperparameter selection procedure on the modelevaluation, we evaluate the sensitivity of the model todifferent hyperparameter values.

0.35 0.45 0.55 0.65 0.75 0.85Training Fraction

0.0

0.2

0.4

0.6

0.8

AU-R

EC

Static Grid-Search

Figure 2: Hyperparameter sensitivity evaluation of thegrid search hyperparameter selection procedure for thePhyDNN model. We notice that PhyDNN is robust to differ-ent settings of hyperparameters as we do not see significantchanges in the AU-REC between the settings where hyperpa-rameters for the PhyDNN were selected through grid searchon the validation set (green) and the settings wherein the hy-perparameter values were set by hand before the experiment(blue).

We design the hyperparameter sensitivity experimentto inspect how model performance varies with differenttraining fractions (i.e different experimental settings).We conduct an experiment by reducing the trainingfraction from 0.85 (85% of data used for training) to0.35 (35% data used for training). Fig. 2 shows the re-sults of our experiment wherein the blue bars indicatethe AU-REC values obtained when the PhyDNN modelwas trained with a static (predefined) set of hyperpa-rameters∗. The green bars indicate the setting wherethe optimal hyperparameters for pressure and velocityfield prediction for the PhyDNN model were obtained

∗The optimal hyperparameters for the 0.55 training fractioncase were used for all other settings.

through gridsearch on the validation set. We noticethat over all the training fractions, there is no signif-icant difference between the two models and hence con-clude that the PhyDNN model is robust across differenthyperparameter settings. Exact hyperparameter valuesare detailed in Table. 3.

arXiv:1911.04240v1 [cs.LG] 6 Nov 2019people.cs.vt.edu/naren/papers/PhyNetSIAMDM2020.pdfedge of the...

Documents

Transcript of arXiv:1911.04240v1 [cs.LG] 6 Nov 2019people.cs.vt.edu/naren/papers/PhyNetSIAMDM2020.pdfedge of the...