Fatigue life prediction of pipeline with equivalent initial flaw size … · 2020. 3. 23. ·...

15
RESEARCH Open Access Fatigue life prediction of pipeline with equivalent initial flaw size using Bayesian inference method Milad Salemi and Hao Wang * Abstract Majority of pipeline infrastructure are old and susceptible to possible catastrophic failures due to fatigue. Timely maintenance is the key to keep pipeline in serviceable and safe condition. This paper proposed a Bayesian inference methodology based on the observed crack growth measurements and cycle data that predicts the probability density of failure after initially estimating the equivalent initial flaw size (EIFS). The model was first developed based on one- dimensional crack growth problem in plate with edge crack. Then the model was expanded to two-dimensional crack growth problem in pipe wall. Stress intensity factors (SIF) at the crack tip in pipe model were calculated using finite element (FE) analysis for different crack lengths and depths. Polynomial function and Gaussian process were used to develop surrogate models of SIF. The analysis demonstrated that the proposed Bayesian inference method with hyperparameters generated accurate inferred results for probability density function (PDF) of both EIFS and the number of cycles to failure. Introduction Fatigue cracking is an inherently stochastic problem affected by various sources of variability and uncertainty. In the particular case of oil and gas pipeline fatigue- induced cracks, it is important to effectively and effi- ciently monitor and inspect pipelines. Due to the old age and large size of pipeline transmission network, main- taining these infrastructures within safe and serviceable conditions is not an easy task. As a result, it is important to contentiously improve and update efficiency of means and interval of monitoring and maintenance by use of innovative scientific methodologies. Such methodologies should be able to account for probabilistic nature of fatigue process, such as uncertainties in material proper- ties, variability of internal pressure within the pipe, and reliability and accuracy of inspection data about the condition of pipe structure. Traditionally probabilistic models within the classical statistics scope deal with analyzing statistics of the error due to differences between model predictions and meas- urement/observation data. The accuracy of such models rely upon the quality and quantity of available observation data. On the other hand, Bayesian statistics view on model definition quantifies the extent to which the model repre- sents the data and determines the probability of the model being correct itself. This makes Bayesian models suitable for cases in which the relevant data is limited. In recent years there has been a great focus on data driven solutions with help of machine learning methods in detecting patterns and performance prediction in various fields of science and engineering. Aeronautical engineering and aircrafts structural failures has been the major front for doing research related to fatigue in various metallic alloys [16]. Knowledge about equiva- lent initial flaw size (EIFS) in material is key to estimate crack initiation and progression process [7]. Different researchers have studied various aspects of applying Bayesian inference in fatigue modeling. Makeev et al. [3] investigated Bayesian inference of EIFS using Weibull and log-normal Distribution. In the follow up work, Cross et al. [1] extended the inference of Makeevs work © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. * Correspondence: [email protected] Department of Civil and Environmental Engineering, School of Engineering, Rutgers, The State University of New Jersey, Piscataway, USA Journal of Infrastructure Preservation and Resilience Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 https://doi.org/10.1186/s43065-020-00005-y

Transcript of Fatigue life prediction of pipeline with equivalent initial flaw size … · 2020. 3. 23. ·...

  • RESEARCH Open Access

    Fatigue life prediction of pipeline withequivalent initial flaw size using Bayesianinference methodMilad Salemi and Hao Wang*

    Abstract

    Majority of pipeline infrastructure are old and susceptible to possible catastrophic failures due to fatigue. Timelymaintenance is the key to keep pipeline in serviceable and safe condition. This paper proposed a Bayesian inferencemethodology based on the observed crack growth measurements and cycle data that predicts the probability densityof failure after initially estimating the equivalent initial flaw size (EIFS). The model was first developed based on one-dimensional crack growth problem in plate with edge crack. Then the model was expanded to two-dimensional crackgrowth problem in pipe wall. Stress intensity factors (SIF) at the crack tip in pipe model were calculated using finiteelement (FE) analysis for different crack lengths and depths. Polynomial function and Gaussian process were used todevelop surrogate models of SIF. The analysis demonstrated that the proposed Bayesian inference method withhyperparameters generated accurate inferred results for probability density function (PDF) of both EIFS and thenumber of cycles to failure.

    IntroductionFatigue cracking is an inherently stochastic problemaffected by various sources of variability and uncertainty.In the particular case of oil and gas pipeline fatigue-induced cracks, it is important to effectively and effi-ciently monitor and inspect pipelines. Due to the old ageand large size of pipeline transmission network, main-taining these infrastructures within safe and serviceableconditions is not an easy task. As a result, it is importantto contentiously improve and update efficiency of meansand interval of monitoring and maintenance by use ofinnovative scientific methodologies. Such methodologiesshould be able to account for probabilistic nature offatigue process, such as uncertainties in material proper-ties, variability of internal pressure within the pipe, andreliability and accuracy of inspection data about thecondition of pipe structure.Traditionally probabilistic models within the classical

    statistics scope deal with analyzing statistics of the error

    due to differences between model predictions and meas-urement/observation data. The accuracy of such modelsrely upon the quality and quantity of available observationdata. On the other hand, Bayesian statistics view on modeldefinition quantifies the extent to which the model repre-sents the data and determines the probability of the modelbeing correct itself. This makes Bayesian models suitablefor cases in which the relevant data is limited.In recent years there has been a great focus on data

    driven solutions with help of machine learning methodsin detecting patterns and performance prediction invarious fields of science and engineering. Aeronauticalengineering and aircrafts structural failures has been themajor front for doing research related to fatigue invarious metallic alloys [1–6]. Knowledge about equiva-lent initial flaw size (EIFS) in material is key to estimatecrack initiation and progression process [7]. Differentresearchers have studied various aspects of applyingBayesian inference in fatigue modeling. Makeev et al. [3]investigated Bayesian inference of EIFS using Weibulland log-normal Distribution. In the follow up work,Cross et al. [1] extended the inference of Makeev’s work

    © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you giveappropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate ifchanges were made. The images or other third party material in this article are included in the article's Creative Commonslicence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commonslicence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtainpermission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

    * Correspondence: [email protected] of Civil and Environmental Engineering, School of Engineering,Rutgers, The State University of New Jersey, Piscataway, USA

    Journal of InfrastructurePreservation and Resilience

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 https://doi.org/10.1186/s43065-020-00005-y

    http://crossmark.crossref.org/dialog/?doi=10.1186/s43065-020-00005-y&domain=pdfhttp://creativecommons.org/licenses/by/4.0/mailto:[email protected]

  • to multivariable inference of other crack growth parame-ters in addition to EIFS using Markov Chain MonteCarlo (MCMC) method. Liu et al. [2] used a fracture-mechanics based approach to estimate EIFS based fa-tigue limit of the material and compared the results withthe observed material microstructure and subsequentlypresented prediction on fatigue life. Sankararaman et al.[4, 5] performed detailed inference of multiple parame-ters involved in crack growth model in conjunction withfinite element (FE) simulations under variable amplitudeloading. Xie et al. [8] used Bayesian inference to inferparameters of crack growth model in addition to fatiguelife. They tested their model with pipeline field data andreported desirable results.Gobbato et al. [9] introduced a recursive Bayesian prog-

    nosis methodology to predict and update remainingfatigue life (RFL) of structural components in aerospacestructures using continues non-destructive evaluation(NDE) data. Ribeiro et al., [10] further employed Bayesianinference to estimate RFL of fixed offshore structures.Babuška et al. [11] used Bayesian network to predictfatigue parameters of 75S-T6 aluminum alloys by usingstrain-life (S-N) curve data. Arzaghi et al. [12] used Bayes-ian network inference to propose a dynamic decisionmaking and planned maintenance framework for sub-seapipelines. Li et al. [13] recommended a probabilistic healthprognosis for aircraft wing structure due to an edge crackusing dynamic Bayesian networks (DBN). They considereduncertainties and proposed methodology which reducedcomputational cost by modifying DBN updating intervalrequirements.

    Objective and scopeThis study aims to employ a combination of these ap-proaches to establish a hybrid model that improves itsprediction for future state of pipeline condition as moreinspection data is entered into the model. The modelhas backward and forward inference. At each addition ofdata point on number of cycle and crack size first themodel infers the equivalent initial flaw size (EIFS) prob-ability density (PDF) and subsequently uses that to inferthe PDF for number of cycles to failure. The model canbe used to prioritize maintenance orders by isolatingsections that are at greatest risk of failure.First, a base Bayesian framework model is established

    based on synthetically generated data sets generated fromcrack growth in steel plate from available equations in lit-erature for edge crack in plates. Secondly, FE simulationswere performed on pipe model for several cases of cracklength and depth in pipe wall thickness. Surrogate modelswere fitted to compute SIF for intermediary points amongactual FE simulated SIF values. The proposed modelpredicts the failure based on 2-dimenssional crack growthin the pipe wall thickness. Results of the proposed

    methodology are presented and ground truth values werecompared with inferred values.

    Base model development and results for steelplateEquivalent initial flaw size (EIFS) definition conceptUsing the available empirical solution for crack propaga-tion in plates with a single edge notch in a load con-trolled boundary conditions (constant remote tensilestress) the base model was established. To calculatestress intensity factor (SIF) at a given crack length, theequation proposed by [14] was used to calculate shapefactor in the general SIF formula:

    KI ¼ σffiffiffiffiffiffiπa

    pF

    ab

    � �ð1Þ

    Fab

    � �¼

    ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2bπa

    tanπa2b

    � �r�0:752þ 2:02 a

    b

    � �þ 0:37 1− sin πa2b

    � �� �3cos

    πa2b

    � �ð2Þ

    Where KI is mode-I SIF, σ is applied nominal constantremote stress, a is crack length, b is plate width, and F isshape factor given by Tada’s equation and is valid forany ab ratio.The crack growth model will follow Paris’ crack

    growth regime [15]:

    dadN

    ¼ C1 ΔKð ÞC2 ð3Þ

    Where, C1 and C2 are material properties calibratedbased on experimental data, and ΔK is stress intensityfactor range between minimum and maximum appliedremote stress. dadN is the rate of crack growth with thechange of loading cycle.Combining the equation for SIF and Paris’s regime:

    N ¼Z ac

    da

    C1 ΔKð ÞC2 ð4Þ

    N ¼Z ac

    da

    C1 σffiffiffiffiffiffiπa

    pF ab� �� �C2 ð5Þ

    Where, c is initial crack length. This integral can benumerically calculated using Simpson’s method.The experimental data follows a general function in

    the form of:

    N ¼ f a;ϕð ÞWhere, ϕ is the vector of model parameters such as

    loading, material properties, and geometry. The elementsof ϕ are considered random variables in a probabilisticscheme. The parameters governing the distribution ofeach random variable in ϕ are called hyper-parameters, .

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 2 of 15

  • The probability of observing target cycle for a givendata point of (N, ϕ) can be framed using normaldistribution.

    p N ja; θ; βð Þ ¼ 1ffiffiffiffiffiffiffiffiffiffi2πβ2

    q exp − N− f a;ϕð Þ½ �22β2

    !ð6Þ

    Where, p is conditional probability of observing Ncycle given model parameters, which includes EIFS,crack length a at cycle N, and associated noise in thecrack growth model with standard deviation of β:

    log N̂� � ¼ log Nð Þ � β ð7Þ

    Where, N̂ is the noisy measurement of the cycle com-pared to the true cycle .Using Bayesian inference the joint probability of target

    data, N, given model for k number of data points can bederived as:

    p N ja; ϕkf g; βð Þ ¼Ymk¼1

    p Nk jak ;ϕk ; βð Þ

    ¼Ymk¼1

    1ffiffiffiffiffiffiffiffiffiffi2πβ2

    q exp − Nk− f ak ;ϕkð Þ½ �22β2

    !ð8Þ

    The initial flaw size (IFS) in fatigue analysis refers to thesmall flaws that exist in material’s grain size scale. Suchflaws do not necessarily follow crack growth regimes suchas Paris’ equation which relates long crack propagationrates to material properties. The small crack growth rateexhibits oscillatory behavior compared to generally mono-tonic behavior in long crack growth rates.The equivalent initial flaw size (EIFS) is the concept

    that allows interpretation of IFS into long crack analysisrealm. This allows implementation of long-crack basedcrack propagation models from the beginning life of amaterial undergoing cyclic loading. In other words, thenumber of cycles required to reach a certain cracklength af considering IFS and EIFS are equal. Their re-spective short-crack and long-crack growth models gs(a)and gl(a) are as follows [2]:

    N f ¼Z a fIFS

    dags að Þ

    ¼Z a fEIFS

    dagl að Þ

    EIFS direct probability density inference - p(θ)To develop and verify base model, it was assumed thatall the uncertainties are the result of EIFS or initial cracklength, θ. The modified distribution is as follows:

    p N ja; θ; βð Þ ¼Ymk¼1

    p Nk jak ; ck ; βð Þ

    ¼Ymk¼1

    1ffiffiffiffiffiffiffiffiffiffi2πβ2

    q exp − Nk− f ak ; ckð Þ½ �22β2

    !ð9Þ

    Where now only EIFS, c, is the only parameter whosedistribution is directly inferred.To solve for p for each data point, k, we have to solve

    for N̂ by solving for f(ak, ck) following Eq. (5). To calcu-late the one dimensional integral in Eq. (5) Simpson’squadratic integration rule can be employed which is dis-cussed in detail in [16].To evaluate the proposed base model, various sets of

    data points where generated and β =10% noise wasadded to the data to account for uncertainty withincrack growth model. The training data were generatedaccording to the following framework:

    1- It was assumed that the initial flaw size, θ, follows anormal distribution with assumed mean andstandard deviation. k samples were drawn from thisdistribution.

    2- Uniform distributions was assumed for final crack size,a, and k samples were drawn from this distribution.The lower bound of this uniform distribution wasassumed to be at least 3 times of the assumed standarddeviation of the normal distribution for c, larger thanthe mean assumed for c.

    3- Given a and c for each data point thecorresponding number of cycles, N, for that datapoint was calculated following Eq. (5) and usingSimpson’s quadratic numerical integration method.

    4- A random noise with zero mean and β standarddeviation was added to the calculated N.

    Figure 1 shows a sample of synthetically generateddata points. In Fig. 1(a) final crack size and its corre-sponding cycle number is visualized for k data point, inwhich for this example they are 20 data points.Table 1 shows the selected properties from previous

    study [8] and assumed parameters for crack growth inthe base model for a steel plate with an edge crack.For each data point (N, a) an EIFS likelihood probabil-

    ity density is calculated and the product of all data pointlikelihoods as shown in Eq. (9) will establish estimatedprobability density of EIFS. Figure 2 shows box plots ofEIFS PDF for the dataset containing 20 data points. Thevariation of the inferred EIFS distribution at each datapoint is clearly distinguishable.The final inferred EIFS distribution shown in Fig.1(c)

    and summarized in Table 2 based on 20 data pintsshows that the estimation for mean is very close to themean of the true distribution but the standard deviationor spread is not captured very well. To improve the in-ferred distribution, we will try to infer the distribution

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 3 of 15

  • parameters instead of distribution itself in the nextsection.

    EIFS probability density inference with hyper parameters-p(θ| α)In this section instead of inference on EIFS itself, theparameters of the assumed governing distribution forEIFS conditioned on its distribution parameters will beinferred:

    p θjαð Þ ¼Ymk¼1

    p θk jαð Þ ð12Þ

    p N ja;α; βð Þ ¼Z

    p N ja; θ;α; βð Þp cja;α; βð Þdc

    ¼Z

    p N ja; θ; βð Þp cjαð Þdθ ð13Þ

    p N ja;α; βð Þ ¼Ymk¼1

    Z1ffiffiffiffiffiffiffiffiffiffi2πβ2

    q exp − Nk− f ak ; θð Þ½ �22β2

    !p θjαð Þdθ

    ð14Þ

    Now in this equation where we have introduced aprior probability for EIFS, p(θ| α), which is conditionedon α . This forms a hierarchical Bayes model. In this ap-proach an additional integration is introduced. This

    integration can be solved by applying Simpson’s quad-ratic integration twice.Let us elaborate inference process of hyper-parameters

    of EIFS distribution. We select some possible lower andhigher bounds for both mean and standard deviation.Mathematically the bounds for mean can be from −∞ to+∞, and for standard deviation it can be from 0 to +∞;but to save a huge computation cost especially if we areconsidering a finer step size or mesh, with a good guessbased on engineering judgement, we can select appropri-ate bounds. The physics of the problem such as max-imum possible crack depth and minimum possible crackdepth, a non- negative value, would give initial intuition.In the next step we select a small enough step size to

    capture a smooth outcome. This can be achieved by run-ning few trial cases. After selecting the appropriate stepsize for both standard deviation range and mean range aθ distribution can be generated (bounds of θ (EIFS) waschosen from zero to close to maximum possible crackdepth or plate thickness, b) .Integrating the resulting joint probability density

    along mean and standard deviation separately yieldsthe marginal probability densities of mean and stand-ard deviation of the EIFS. As it can be seen fromFig. 3(d) and the accuracy of EIFS distribution estima-tion using hierarchical model and hyper-parametershas increased significantly. It is clear that inferencedirectly on EIFS distribution can estimate the meanwithin an acceptable range with less than 0.1 mmerror (less than 6%), but it fails to predict standarddeviation and it has more than 90% error in standarddeviation prediction. On the other hand, the hierarch-ical model was able to predict both mean and stand-ard deviation with high accuracy. The error forestimating EIFS distribution mean was 2% and errorfor estimating standard deviation of EIFS distributionwas 4.6%.

    Fig. 1 Synthetically generated data in plate model. a (N, a) pair data set sample, b True N versus noisy N, and c Comparison of inferred EIFSdistribution and True EIFS distribution

    Table 1 Material and model parameters used in the platemodel

    Parameter Definition - Value

    C1 3.39e-13

    C2 2.9

    β 10%

    σ 40 MPa

    b (plate width/pipe wall thickness) 7.1 mm

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 4 of 15

  • Comparison of results between simple maximum like-lihood (direct EIFS inference) and hierarchical Bayesiananalysis are shown in Table 2.

    Estimating probability and cycle of failureTo estimate probability of failure we need to set a failurecrack length/depth criteria, af. This could be a percent-age of the width in which crack is being propagated.From previous section we determined probability densityof the EIFS for the plate problem. For the plate problemonly a is the variable of crack that is growing and cracklength does not apply.Now, having the EIFS distribution, distribution of

    number cycles required to reach failure can be pro-actively estimated as inspection data is gathered overtime. In the case of pipe problem, the inspection datacould include length and depth of the cracks detected inthe wall of pipe section.To test this hypothesis, synthetic inspection data

    points corresponding the number of cycles and respect-ive crack length at that cycle (or time) needed to gener-ated. Paris’ crack growth model was used to progresscrack length from the assumed initial size to assumedfinal crack size for each synthetically generated datapoint.Initial crack length is randomly sampled from EIFS

    distribution. As for the failure crack depth, a certain

    percentage of plate width, b, (and later pipe wall thick-ness in the pipe model) is assumed as the mean of fail-ure crack depth (maximum allowable crack length). Toinclude uncertainty for the failure crack length, a stand-ard deviation is added to the assumed mean of failurecrack depth to generate a normal distribution for thefinal crack length. In addition, for each randomly sam-pled EIFS, a corresponding failure crack depth is sam-pled from failure crack depth distribution and hence thenumber of cycles were calculated as shown in the flowchart in Fig. 4. The flow chart shown in this figure canbe used in either plate problem or pipe problem. In theplate problem, being a 2D model, crack only grows inone dimension (one crack front), while in the pipemodel crack grows along length of pipe and depth(through pipe thickness). Consequently, their corre-sponding SIF functions are single variate and bivariate,respectively.For each synthetic data point generated using the algo-

    rithm that was introduced earlier, the number of cyclesto the failure point corresponding to the sampled EIFSand failure crack depth is estimated. Using Bayesian ana-lysis two forecast scenarios are viable:

    1- Estimating the distribution of number of loadingcycles to failure (Nf) based on observed numbers ofdata points sampled from EIFS and failure crack

    Fig. 2 Box plot illustrating EIFS distribution for individual data points

    Table 2 Comparison of EIFS inference results using discussed methods

    Parameters True Value Inference on EIFS Inference on Hyper Parameters

    Mean 1.715 mm 1.831 1.750 mm

    Standard deviation 0.195 mm 0.014 0.204 mm

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 5 of 15

  • depth distributions. This can only be done on thesynthetic data or field data where data was collectedafter observing actual failure in the field or lab test.

    2- Estimating the distribution of number of loadingcycles to failure (Nf) based on observed numbers ofdata points sampled from EIFS and any randomfinal crack depth that is sampled from a uniformlydistributed final crack size from the sampled EIFSto any value smaller than assumed value for meanof failure crack depth. i.e. We will be able to predictprobability distribution of number of loading cyclesto failure from observed data (loading cycles vscrack depth) as new data points are added.

    It should be mentioned the distributions resulted fromscenario-1 may be used as an informative prior in theBayesian process of estimating the distribution of num-ber of cycles to failure in scenario-2. The mathematicalexpression is as follows:

    p N jNrange; β� � ¼Ym

    k¼1p Nk jNrange; β� �

    ¼Ymk¼1

    1ffiffiffiffiffiffiffiffiffiffi2πβ2

    q exp − Nk−Nrange� �22β2

    !

    Where, Nrange, is a vector with lower bound andhigher bound that covers the range of the number of cy-cles that may cause failure.The upper bound of Nrange can be calculated by as-

    suming an extreme case of smallest possible EIFS andlargest possible crack depth. The probability of suchcombination is very low in reality and even in case ofsynthetically generated data. For example, Fig. 5, showsan illustration of possible failure cycle outcomes for asteel plate with an edge crack. This example includes250,000 data points which was generated by sampling500 EIFS and 500 af from distributions EIFS~N(1.5,0.12)and af~N(6,0.1), respectively. As it can be seen in Fig. 5,for this particular example the number of cycles doesnot reach 3 million. While calculating the number ofcycles for the extreme case by considering IFS = 0.1mm,and af = 6.99mm, yields the maximum number of cyclesat about 3.8 millions.With this overview of methodology, we now investi-

    gate the estimations of number cycles to failure for

    Fig. 3 a Joint probability density function of mean and standard deviation of EIFS, b Marginal probability density of mean of EIFS, c Marginalprobability density of standard deviation of EIFS. d Estimated EIFS density vs. True assumed density for EIFS

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 6 of 15

  • various synthetic data sizes by directly estimating Nrangedistribution for the plate problem with edge crack. Thefollowing flowchart, shown in Fig. 6 demonstrates theprocess for estimating the distribution of number ofcycles to failure.The non-informative prior was selected as unit inte-

    ger. The informative prior may be a PDF generatedthe similar way as the example shown in Fig. 5 bysampling from EIFS and an assumed failure cracklength. In this particular method using a good in-formative prior will reduce the standard deviation ofthe final posterior PDF.To demonstrate this methodology with example, re-

    sults of various estimations for number of cycles tofailure is shown in Fig. 7 (top). In this example theinference was performed on four datasets comprising 5,15, 30, and 50 data points.

    From the Fig. 7 (left) we can observe that as more datapoints are added (blue 5 samples and red 50 samples)the distribution becomes sharper at peak and closer tothe mean of sample distribution shown in black dashlines. In the case of EIFS inference this method is betterif only a single estimate is desired rather than the wholedistribution. Table 3 lists the parameters of thedistributions.To have the standard deviation also represented in our in-

    ference process we will again resort to inferring hyper pa-rameters of the Nrange distribution as we did for EIFS.Figure 7 (middle) shows an example of resulting joint likeli-hood probability density for mean and standard deviation ofthe Nrange distribution with 50 samples. To the right we cansee progression of the Nrange distribution inference with 5samples (blue curve) to 50 samples (red curve). The dashedblack distribution is observed sample distribution. As it can

    Fig. 4 Flow chart that shows derivation of number of cycles at failure depth in generating synthetic data

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 7 of 15

  • be seen clearly there is very good match between predictionsand observed samples. Table 4 summarizes the parametersof this distributions.It is worth noting that final number cycles were more

    affected by the distribution of the EIFS than the finalcrack size during process of generating synthetic data.This another instance that the exemplifies importance ofEIFS distribution inference.

    Pipe model development and Bayesian inferenceresultsIn this section first the methodology to compute stressintensity factor (SIF) using FE models is discussed. In

    addition, surrogate models were introduced tointerpolate SIF values at the crack length and depthsthat FE simulation were not performed. First, EIFS forcrack depth and were inferred and those inferred valueswere used to estimate the distribution for number ofcycles to estimate cycle at final crack length.

    Finite element (FE) modelingThe finite element modeling of this research included3D simulation to calculate at the crack tip. Commer-cially available multi-physics software ABAQUS wasused to perform FE simulations. For purpose of thisstudy typical steel material properties (E = 200 GPa, υ =

    Fig. 5 An illustrative example of possible distribution of number cycles to failure for a steel plate with thickness of 7mm. assumptions: EIFS~N(1.5,0.12),and af~N(6,0.1)

    Fig. 6 Flowchart of deriving estimated distribution for number cycles for final (or failure) crack size distribution

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 8 of 15

  • 0.3) were assumed for steel and it was assumed that steelhas elastic behavior within the scope of SIF simulations.To investigate crack growth and the respective SIF

    evaluation more accurately a 3D model was developedand a semi-elliptical (halved ellipse) shape was assumedfor the crack shape as previously it was reported in theliterature to emulate realistic crack shape more closely[8, 17, 18]. The diameter of pipe was selected as D =863.6mm and pipe wall thickness of t = 7.1mm. Thesenumbers were adopted from [8]. The length of the pipesection model was chosen to be 1000mm.To reduce running time of each analysis case a prelim-

    inary mesh sensitivity analysis was performed to deter-mine largest mesh size where solution reaches stablestate and will not be improved by any further mesh sizereduction at the vicinity of the crack. This analysis wasperformed for the crack with length of L = 10mm anddepth of =1mm . The analysis concluded that a meshsize of approximately 0.5mm (equivalent to 23 nodes) issmall enough to produce the stable response.To further optimize required running time for each

    analysis case, the symmetry of the model was taken ad-vantage of, to reduce model size. To this end, as shownin Fig. 8, the full pipe model was once reduced in halfdue to symmetry along y axis (x-z plane) and a secondtime along z axis (x-y plane). In addition, bias meshsizing was used to incrementally increase mesh size of

    the pipe as we go farther away from location of embed-ded crack to reduce computation effort.It should be noted that the region where crack was

    embedded was always meshed uniformly in size and cor-responding to the appropriate mesh size derived frommesh sensitivity analysis that was pointed earlier. Thisincremental progression of mesh sizing can be observedin Fig. 8(b) where the elliptical crack is embedded in thequartile model (the model with symmetry along Y and Zaxis) and mesh dimension along perimeter of pipe andlong Z axis is increased as we go away from the crack re-gion. To verify validity of this model reduction approachwe performed analysis for all models introduced in Fig.8. The observed corresponding SIF calculations resultsin the quartile model was validated and verified as analternative to the full model.Figure 8(c) shows a snippet of cross section of the pipe

    along Z axis where the crack is embedded (crack is posi-tioned in the middle of the pipe). In this cross section wecan calculate the SIF for Mode-I fracture along the perim-eter of semi-elliptical crack. All combination crack length(10, 15, 20, 25, 30, 35, and 40mm) and depth (1, 2, 3, 4, 5,and 6mm) generates 42 simulation cases.It is worth noting that in Fig. 4 flowchart the criteria

    for finding maximum number of cycles is controlled bythe maximum depth of crack through wall thickness ofthe pipe . There are two reasoning for this criteria: first

    Fig. 7 Estimating the distribution for number of cycles to reach final crack size. Inferring Nrange likelihood density directly for various sample sizes(left), Inferring Nrange hyper parameters’ joint likelihood (middle), and likelihood densities of Nrange for various sample sizes (right)

    Table 3 Comparison of direct inference of Nrange on differentsample sizes

    NoSamples

    STD Mean

    True Estimate True Estimate

    5 515,847 52,757 1,422,988 1,115,177

    15 530,064 32,175 1,569,746 1,148,483

    30 425,478 24,247 1,572,589 1,256,449

    50 368,879 19,258 1,553,850 1,303,446

    Table 4 Comparison of inference of Nrange with hyperparameters on different sample sizes

    NoSamples

    STD Mean

    True Estimate True Estimate

    5 482,924 460,000 1,318,528 1,290,000

    15 425,447 450,000 1,516,247 1,520,000

    30 428,778 420,000 1,493,829 1,460,000

    50 412,633 420,000 1,578,909 1,570,000

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 9 of 15

  • as the ratio of crack length to crack depth (La) increasesthe stress intensity factor along crack length becomesless critical and the stress intensity factor along crackdepth is dominant. Second, in the crack growth process,due to physics of the problem the room for crack growthalong the length of the pipe is many times larger thanthe room for crack growth along crack depth throughthe pipe wall thickness. This methodology can be re-duced to only account for crack growth along the depthand neglect the crack length growth evolution if the finalcrack length is not of the interest for a particular prob-lem such as edge crack growth in the plate problem dis-cussed earlier. It is important to note that wile cracklength would be neglected as a failure assessment criter-ion, updating crack length is necessary to have correctcrack depth growth progress.The simulation results reveal that at shorter crack lengths

    as the depth of crack increases the SIF becomes more criticalat the endpoints along the crack length, as shown in Fig. 9(a).As the length of the crack becomes longer the critical SIF re-mains at the deepest point of the crack. In other words, thelonger the crack length becomes the more dominant crackdepth becomes. This is the assertion and justification to anearlier explanation about the implementation made in the al-gorithm in Fig. 4 that crack depth is the main derive behindcrack growth progression. The cases illustrated in Fig. 9(a)the critical points of stress intensity are clearly visible at redareas along crack perimeter.

    Surrogate model of SIFDue to high computation cost and time, evaluating SIFvalues for all possible combinations of crack lengths and

    depths are not feasible. In a crack growth model and theflowchart shown in Fig. 4 in each cycle SIF needs to becalculated for the updated L and a. To this end a func-tion was defined that can continuously compute SIF atany L and a within the lower and upper bounds of thecrack length and depth that was evaluated in the FEmodel. This function is known as surrogate function.The surrogate function forms a 3D surface. This sur-

    face can be estimated using various functions such aspolynomial based function or probabilistic based func-tion such as Gaussian process (GP) model. Here we willinvestigate both of these function for the simulated casesand compare their fitting.

    Polynomial surface fittingA bivariate polynomial function with variables L and awas used as inputs which are crack length and depth, re-spectively. Consequently, the SIF value can be calculatedas follows:

    cSIF a; Lð Þ ¼ XdegLi¼0

    Xdegaj¼0

    ci; jLia j

    Where dega and degL are the highest degree for eachvariable (L and a) in the bivariate polynomial. Here wechose dega = degL = 2 . In addition, ci, j are constant coef-ficients of the polynomial that will be evaluated usinglinear optimization. As we have two sets of SIF thisfunction needs to be determined twice: once as SIFa andsecond time as SIFL; which are SIF values at the front ofthe crack along the length and SIF values at the front ofthe crack along the depth, respectively. See Fig. 8(c) for

    Fig. 8 3D pipe model with three different representations for modeling

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 10 of 15

  • visualized locations of crack fronts annotated with SIFaand SIFL.

    Gaussian process (GP) fittingIn the GP fitting method each SIF data point is fitted toa normal distribution and the expected value of the dis-tribution is chosen as fitted value at that point.

    Assuming input variables (L and a) in form of a vector

    Xi = X1, X2, …, Xm for m data points, the output, cSIFða; LÞ , would be in form of Y(X1), Y(X2), …, Y(Xm). At eachnon training point, X∗:

    K Xi;X j� � ¼ Cov f Xið Þ; f X j� �� � ¼ e− 12ℓ2 Xi−X jj j2

    Fig. 9 a Stress contour at crack front for different crack lengths and depths. b Fitted surrogate functions using 42 data points for SIF along crackdepth. c Fitted surrogate functions using 42 data points for SIF along crack length (Crack lengths are in mm and SIF in MPa·mm− 1)

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 11 of 15

  • cSIF a; Lð Þ ¼ Y � X�ð Þ ¼ E Y � X�ð ÞjX�;X;Y½ �¼ K X�;Xð ÞK X;Xð Þ−1Y

    Var Y � X�ð ÞjX�;X;Y½ � ¼ K X�;X�ð Þ−K X�;Xð ÞK X;Xð Þ−1K X;X�ð Þ

    Where, K is the kernel or covariance function, f is theprocess function, and ℓ is the characteristic length of thecovariance function. In this study Radial-basis function(RBF) kernel (squared-exponential (SE) kernel) was used.Too small ℓ values cause oscillatory behavior between

    training data points as result of faster variations of thefunction. For details on Gaussian process implementa-tion refer to [19, 20].Totally 42 cases of FE simulations were conducted.

    These two surface fitting methods’ accuracy were com-pared with different number of training points. Simula-tions were performed at 6 crack depths and 7 cracklength, which in total yields 42 cases of simulation.

    Figure 9 illustrates results of fitting surface to the FEsimulations. In these figures the fitted surface is shownin blue and the all data points are shown in red circles.Overall it can be said that both methods perform very

    good with predictions. But caution is needed when deal-ing with GP method as the fitting in this method is verysensitive to the length scale parameter (ℓ). In addition,GP method has a tendency of overfitting if not tunedwell with a good covariance function (kernel) [20]. Thepolynomial based fitting method showed that it consist-ently gets better as more training data is included in thefitting process. It is worth noting that neither of thesemethods (especially GP method) are capable of havinggood extrapolated predictions so it is necessary to haveas many boundary points as possible to make the fittedmodel more accurate. Nevertheless, both these methodswere incorporated in the developed algorithm that com-putes the SIF values for predicting number of cycles inthe crack growth model.

    Fig. 10 Synthetic data points generated for 2 dimensional crack growth in pipe wall. Crack depth vs. cycle (left), crack length vs. cycle (middle),and true cycle vs. noise added cycle (right)

    Fig. 11 Comparison of estimated and true marginal EIFS likelihood distributions. Joint likelihood distribution of crack length and depth (left),Crack depth distribution (middle), Crack length (right) for 30 data points

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 12 of 15

  • Inference of EIFS for pipeApplying the aforementioned method to crack growth inpipeline problem brings a few more challenges. The casefor crack growth in an edge crack in steel plate only as-sumes crack growth in single dimension while the crackgrowth in pipe wall is assumed as a semi-elliptical shapethat has two growth fronts; along the depth and alongthe length of the crack (minor and major axis of ellipserespectively). This property makes this problem similarto the problem of EIFS distribution with hyper-parameters (introduced in the section “EIFS probabilitydensity inference with hyper parameters- p(θ|α)”) whichwould require cubic likelihood array. Applying the 2-dimmensional version of crack growth algorithm illus-trated in Fig. 4 we will generate 30 data points as shownin Fig. 10. The finer the step size for cycles and meshsize for crack length and depth, the longer it takes forthe code to yield the results.Consequently, there exists a likelihood distribution for

    every possible pair of crack length and crack depth. So,in the end we will find a joint likelihood distribution ofEIFS for crack length and crack depth. The example ofsuch joint likelihood distribution is shown in Fig. 11(left). We can solve for marginal likelihoods of EIFS forcrack length and crack depth by integrating out theother variable using Simpsons’ quadrature technique in-troduced earlier. The computed marginal likelihoods ofcrack length and depth are shown in Fig. 11 for crackdepth (middle) and crack length (right). The resultsshow that the algorithm has good accuracy in estimating(represented by blue curves) mean of the true distribu-tion (represented by red as ground truth distributionand green as sample distribution). Similar to the plateproblem, the standard deviation of crack depth andlength is not well estimated in this method.

    To improve inference on standard deviation we coulduse hyper-parameter estimation as we did for plate prob-lem. In comparison to the plate problem where crackgrowth was single dimensional, we only had to solvemarginal likelihoods once for each data point. In thecase of pipe crack growth given we consider hyper pa-rameters (mean and standard deviation) for both crackdepth and crack length, we will have 4 hyper parametersto estimate. Solving this problem with Simpson’s quad-rature will be extremely expensive. To solve this prob-lem Monte Carlo simulations would work moreefficiently as also mentioned in [1, 2, 4]. We chose to as-sume standard deviation for our problem in the sectionwhere we infer number of cycle to final crack size. Thisis because we did not perform hyper parameter esti-mates for mean and standard deviation of crack depthand length separately as we did in case of plate problem(see Fig. 3, and Table 2). Such computation can be per-formed using MCMC [21].The properties of estimates for EIFS are summarized

    in Table 5. The estimated values of standard deviation inthis table are not capturing the variability compared totrue values. The assumed values were selected based onmany observations and engineering judgments. Valuesbetween 0.1 to 0.5 mm for depth and 0.5 to 1.5 mm canbe considered good guess range for standard deviationsaccording to author’s observations in other cases. We

    Table 5 Estimated vs. true distribution values for EIFS alongdepth and length

    EIFS Mean STD

    True Estimate True Estimate Assumed

    Depth 1.74 1.794 0.222 0.052 0.19

    Length 15.166 15.429 1 0.555 1

    Fig. 12 Estimating the distribution for number of cycles to reach final crack size. Inferring Nrange likelihood density directly for various samplesizes (left), Inferring Nrange hyper parameters’ joint likelihood (middle), and Likelihood densities of Nrange for various sample sizes (right)

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 13 of 15

  • will use these estimated values as input for the model todo inference of the likelihood distribution for thenumber of cycles to failure in the next section.

    Estimating probability and cycle of failureHere we will follow the similar steps as we did for plateproblem to estimate the distribution for the number ofcycles to failure, Nrange. Using the methodology dis-cussed in section “Estimating probability and cycle offailure” for plate problem we will estimate the distribu-tion of number of cycles to failure. As shown in Fig. 12(top) the direct inference on Nrange distribution is shownfor various sample sizes. We can see the progressive ap-proach of the predictions towards the true sample distri-bution from 5 samples (blue) to 50 samples (red). Theground truth of the sample distribution is shown inblack dashed line. As with previous observation usingdirect inference we can see that while the standard de-viation is not accurately captured the mean or expectedvalue of number of cycles to final crack size is very wellcaptured. The summary of data for this model ispresented in Table 6.Using hyper parameters to perform estimations we will

    have the joint marginal likelihood distribution of meanand standard deviation of number of cycles to failure,shown in Fig. 12 (bottom left).Integrating along mean and standard deviation sep-

    arately we have marginal distribution for mean andstandard deviation of number of cycles to failure sep-arately. Using the expected value of the latter distri-butions we construct the distribution for number ofcycles to failure. These distributions for various num-ber data points or samples are shown in Fig. 12 (bot-tom right). We can see that as more data points are

    added to the model, from samples (blue curve) to 50samples (red curve), the estimated distribution veryclosely matches the true sample distribution shown inblack dashed line. The results for various sample sizesare summarized in Table 7.

    Conclusions and discussionsIn this paper a Bayesian inference methodology was im-plemented to accurately estimate the time (cycle) whenthe structure (pipe or plate) has the most probability offailure based on observed crack growth measurementsand cycle data (equivalent to field inspection) which wasgenerated synthetically. A methodology to estimateEquivalent Initial Flaw Size (EIFS) was introduced andthe distribution for number of cycles to failure waspredicted.A base model was initially developed based on edge

    crack growth in a steel plate and verified the method-ology. Then the method was expanded to model predic-tions for two-dimensional crack growth in pipe wallthickness. Finite Element (FE) Modeling was employedto calculate stress intensity factor (SIF) at finite pints ofcrack length and depth combinations. Two surrogatemodels were used to interpolate the SIF values at thepoints in which FE simulations were not conducted. TheBayesian inference was performed with and withouthyper parameters and the result demonstrated the gainin accuracy with use of hyperparameters.Comparison of estimation and true values showed that

    the proposed methodology has strong grounds for accur-ately predicting most critical time (cycle) when thestructure may become susceptible to failure as more in-spection data is collected and input into the model. Themodel can be further customized and more variables canbe inferred simultaneously. Consequently, the more vari-ables of the model to be inferred, the more complex andcertainly more computationally expensive it becomes.

    Abbreviations3D: three-dimensional; DBN: dynamic Bayesian networks; EIFS: equivalentinitial flaw size; FE: finite element; GP: Gaussian process; IFS: initial flaw size;MCMC: Markov Chain Monte Carlo; NDE: non-destructive evaluation;PDF: probability density function; RFL: remaining fatigue life; SIF: stressintensity factor; S-N: strain-life

    AcknowledgementsThe authors acknowledge the financial support provided by the USDOTPipeline and Hazardous Materials Safety Administration (PHMSA).

    Authors’ contributionsMS conducted the analysis and wrote the draft. HW developed the studyplan and revised the draft. The authors read and approved the final manuscript.

    FundingUSDOT PHMSA Grant DTPH5615HCAP04L.

    Availability of data and materialsAll data used or generated by this study is available from the correspondingauthor by reasonable request.

    Table 6 Comparison of direct inference of Nrange on differentsample sizes

    No ofSamples

    STD Mean

    True Estimate True Estimate

    5 836 321 8240 7080

    15 898 202 8455 7732

    30 1039 147 8626 7986

    50 1000 116 8580 8165

    Table 7 Comparison of inference of Nrange with hyperparameters on different sample sizes

    No ofSamples

    STD Mean

    True Estimate True Estimate

    5 3885 4620 6966 6800

    15 3195 3320 6363 6400

    30 2876 2900 6918 6800

    50 2657 2620 7136 7000

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 14 of 15

  • Ethics approval and consent to participateNot applicable.

    Consent for publicationNot applicable.

    Competing interestsThe authors declare that they have no competing interests.

    Received: 19 December 2019 Accepted: 24 February 2020

    References1. Cross R, Makeev A, Armanios E (2007) Simultaneous uncertainty

    quantification of fracture mechanics based life prediction modelparameters. Int J Fatigue 29(8):1510–1515

    2. Liu Y, Mahadevan S (2009) Probabilistic fatigue life prediction using anequivalent initial flaw size distribution. Int J Fatigue 31(3):476–487

    3. Makeev A, Nikishkov Y, Armanios E (2007) A concept for quantifyingequivalent initial flaw size distribution in fracture mechanics based lifeprediction models. Int J Fatigue 29(1):141–145

    4. Sankararaman S, Ling Y, Mahadevan S (2010) Statistical inference ofequivalent initial flaw size with complicated structural geometry and multi-axial variable amplitude loading. Int J Fatigue 32(10):1689–1700

    5. Sankararaman S, Ling Y, Mahadevan S (2011) Uncertainty quantification andmodel validation of fatigue crack growth prediction. Eng Fract Mech78(7):1487–1504

    6. Zarate BA, Caicedo JM, Yu J, Ziehl P (2012) Bayesian model updating andprognosis of fatigue crack growth. Eng Struct 45:53–61

    7. Fawaz SA (2003) Equivalent initial flaw size testing and analysis of transportaircraft skin splices. Fatigue Fracture Eng Mater Struct 26(3):279–290

    8. Xie M, Bott S, Sutton A, Nemeth A, Tian Z (2018) An integrated prognosticsapproach for pipeline fatigue crack growth prediction utilizing inlineinspection data. J Press Vessel Technol 140(3):031702

    9. Gobbato M, Kosmatka JB, Conte JP (2014) A recursive Bayesian approach forfatigue damage prognosis: an experimental validation at the reliabilitycomponent level. Mech Syst Signal Process 45(2):448–467

    10. Ribeiro T, Borges L, Rigueiro C (2019) A case study on the use of Bayesianinference in fracture mechanics models for inspection planning. J Fail AnalPrev 19(4):1043–1054

    11. Babuška I, Sawlan Z, Scavino M, Szabó B, Tempone R (2016) Bayesianinference and model comparison for metallic fatigue data. ComputMethods Appl Mech Eng 304:171–196

    12. Arzaghi E, Abaei MM, Abbassi R, Garaniya V, Chin C, Khan F (2017) Risk-based maintenance planning of subsea pipelines through fatigue crackgrowth monitoring. Eng Fail Anal 79:928–939

    13. Li C, Mahadevan S, Ling Y, Choze S, Wang L (2017) Dynamic Bayesian networkfor aircraft wing health monitoring digital twin. AIAA J 55(3):930–941

    14. Tada, H., Paris, P. C., & Irwin, G. R. (1973). The stress analysis of cracks.Handbook, Del Research Corporation

    15. Paris P, Erdogan F (1963) A critical analysis of crack propagation laws16. Atkinson KE (2008) An introduction to numerical analysis. Wiley, Hoboken,

    p 20317. Newman JC Jr, Raju IS (1984a) Prediction of fatigue crack-growth patterns

    and lives in three-dimensional cracked bodies. In: Proceedings of the 6thInternational Conference on Fracture (ICF6), New Delhi, India, pp 1597–1608

    18. Newman JC Jr, Raju IS (1984b) Stress-intensity factor equations for cracks inthree-dimensional finite bodies subjected to tension and bending loads

    19. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O et al(2011) Scikit-learn: machine learning in python. J Mach Learn Res 12(Oct):2825–2830

    20. Williams CK, Rasmussen CE (1996) Gaussian processes for regression. In:Advances in neural information processing systems, pp 514–520

    21. Geyer CJ (1992) Practical markov chain Monte Carlo. Stat Sci 7(4):473–483

    Publisher’s NoteSpringer Nature remains neutral with regard to jurisdictional claims inpublished maps and institutional affiliations.

    Salemi and Wang Journal of Infrastructure Preservation and Resilience (2020) 1:2 Page 15 of 15

    AbstractIntroductionObjective and scopeBase model development and results for steel plateEquivalent initial flaw size (EIFS) definition conceptEIFS direct probability density inference - p(θ)EIFS probability density inference with hyper parameters- p(θ| α)Estimating probability and cycle of failure

    Pipe model development and Bayesian inference resultsFinite element (FE) modelingSurrogate model of SIFPolynomial surface fittingGaussian process (GP) fitting

    Inference of EIFS for pipeEstimating probability and cycle of failure

    Conclusions and discussionsAbbreviationsAcknowledgementsAuthors’ contributionsFundingAvailability of data and materialsEthics approval and consent to participateConsent for publicationCompeting interestsReferencesPublisher’s Note