Data-driven Predictive Analytics for a Spindle's Health · 2016-09-07
Divya Sardana, Raj Bhatnagar (University of Cincinnati); Radu Pavel, Jon Iverson (TechSolve, Inc., Cincinnati)

Transcript

Page 1

Data-driven Predictive Analytics for a Spindle's Health

Divya Sardana, Raj Bhatnagar
University of Cincinnati

Radu Pavel, Jon Iverson
TechSolve, Inc., Cincinnati

Page 2

Motivation

Goal in the manufacturing industry:
• Improve machine performance.
• Decrease machine downtime.

A potential step towards achieving this goal: early detection of emerging faults and degradation trends.

Existing systems: condition-based maintenance (CBM) systems continuously monitor machines and collect data related to machine status and performance (e.g., vibration, temperature, wear debris data).

Challenge: managing and analyzing the enormous amount of CBM data to accurately detect equipment degradation and predict remaining useful life.

Page 3

What We Do: Problem Definition

• We present a methodology for exploiting large quantities of CBM data and building purely data-driven models of failure to predict the remaining useful life of a spindle.
  – Investigate whether the monitored data (temperature, current, and vibration) are sufficient for building a model.
  – Extract features from the monitored data.
  – Build prediction models using features aggregated over 24–72 hour windows of time to predict the remaining useful life of the spindle.

Page 4

Related Work: Machine Health Prognostic Techniques

Model-based approaches:
• Assume the availability of a physical model of failure progression.
• E.g., crack propagation models [1][2].

Data-driven approaches:
• Based on the analysis of CBM data such as temperature, vibration, and current.
• Time-, frequency-, and time-frequency-domain analysis (e.g., FFT [3], wavelet transform analysis [4]) is used to extract features from the raw data.

Page 5

Related Work: Data-Driven Approaches

• Gebraeel and Lawley [5] developed a neural-network-based prediction model.
• Huang et al. [6] used self-organizing maps and back-propagation neural-network-based failure prediction models.
• In 2014, Dong et al. [7] used a Support Vector Machine (SVM) to predict bearing degradation using time-frequency-domain analysis of the vibration signal.

Page 6

Our Proposed Solution: Broad Overview

Data Monitoring Unit:
• Collection of spindle health data using sensors (testbed setup at TechSolve, Inc.).
• ~400 GB of monitored angular acceleration data.

Prognostic Unit:
• Data exploration for feature discovery.
• Training of regression models for spindle failure prediction.

Page 7

Data Monitoring Unit

• Testbed setup
• Data collection details

Page 8

Testbed Setup

Fig. 1. Spindle testbed at TechSolve (poly-V belt, thermocouple, load sensor shown).
Fig. 2. Bottom view of the loading mechanism.
Fig. 3. Data channels diagram for the spindle testbed.

Table 1. Sensors used in the spindle testbed:
• Accelerometer: IMI Sensors 607A11 (one)
• Thermocouples: Watlow, J-type, Teflon coated (three)
• Current sensor: Ohio Semitronics SCT 050CX5 (one)

Page 9

Data Collection Details

Table 2. Spindle datasets used:
• Dataset 4: start 7 Sept 2011, failure 4 March 2012, total life 180 days, NSK bearings, 170 GB on disk.
• Dataset 5: start 15 May 2012, failure 21 Sept 2012, total life 129 days, BARDEN bearings, 155 GB on disk.
• Dataset 6: start 29 April 2013, failure 2 August 2013, total life 96 days, NSK bearings, 99 GB on disk.

• The spindle was run at approximately 9,100 rpm.
• Data were collected as 1-minute samples recorded at least once every five minutes.
• The accelerometer samples data at a rate of 25,000 samples per second.
• Data were saved in a National Instruments format called TDMS (a loading sketch follows below).
• Three artificially degraded experiments (Datasets 1, 2, and 3) and three run-to-failure experiments (Datasets 4, 5, and 6) were conducted.
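For readers who want to reproduce the data-loading step, the following is a minimal sketch, assuming the open-source npTDMS package; the group and channel names are hypothetical placeholders, not the actual names used in the TechSolve testbed files.

```python
# Minimal sketch of loading one 1-minute acceleration sample from a TDMS file.
# Assumes the npTDMS package (pip install npTDMS); the group and channel names
# below ("Acceleration", "Spindle") are hypothetical placeholders.
import numpy as np
from nptdms import TdmsFile

def load_acceleration(path: str, group_name: str = "Acceleration",
                      channel_name: str = "Spindle") -> np.ndarray:
    """Return the raw accelerometer samples (25,000 samples/s) as a float array."""
    tdms_file = TdmsFile.read(path)
    channel = tdms_file[group_name][channel_name]
    return np.asarray(channel[:], dtype=float)

# Example (hypothetical file name):
# samples = load_acceleration("dataset4/2011-09-07_00-05.tdms")
# print(samples.shape)  # ~1.5 million samples for a 1-minute recording
```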

Page 10

Prognostic Unit

• Data exploration for feature discovery
• Training of regression models for spindle failure prediction

Page 11

Data Exploration for Feature Discovery

Page 12

Primitive Frequency-Domain Features

0.4 seconds' worth of a data signature (containing 10,000 values) → DFT

Primitive features per signature:
• Level 1: energy in the lowest third of the spectrum
• Level 2: energy in the middle third of the spectrum
• Level 3: energy in the highest third of the spectrum

Fig. 4. DFT of spindle acceleration during early life (Level 1 / Level 2 / Level 3 bands).
Fig. 5. DFT of spindle acceleration just prior to failure (Level 1 / Level 2 / Level 3 bands).
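As a concrete illustration of the per-signature features, the sketch below (my own minimal example, not the authors' code) takes the DFT of a 10,000-sample signature and sums the spectral energy over the lowest, middle, and highest thirds of the spectrum.

```python
# Minimal sketch of the per-signature Level-1/2/3 energy features:
# DFT a 0.4 s (10,000-sample) acceleration signature and sum spectral
# energy over the lowest, middle, and highest thirds of the spectrum.
import numpy as np

def primitive_energy_features(signature: np.ndarray) -> tuple[float, float, float]:
    """Return (level1, level2, level3) band energies for one signature."""
    spectrum = np.fft.rfft(signature)          # one-sided DFT of the real signal
    power = np.abs(spectrum) ** 2              # spectral energy per bin
    third = len(power) // 3
    level1 = power[:third].sum()               # lowest third of the spectrum
    level2 = power[third:2 * third].sum()      # middle third
    level3 = power[2 * third:].sum()           # highest third
    return float(level1), float(level2), float(level3)

# Example with a synthetic 10,000-sample signature:
# sig = np.random.randn(10_000)
# print(primitive_energy_features(sig))
```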

Page 13

Primitive Frequency-Domain Features: Lifetime Variation

Fig. 6. A compressed lifetime historical record of the primitive energy features in Dataset 4 [8].

Burst region: when the average of the energy values in any one partition (low, medium, or high frequencies) of the spectrum remains consistently above a selected burst-threshold energy for longer than a selected minimum burst-threshold duration, we consider that a burst of energy has been observed in the signatures.

Calm region: there are time periods during which the energy in each of the three frequency zones is low. We call these periods Calm operation.
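The burst definition above amounts to a thresholding rule on the per-signature energy series. The sketch below is a minimal illustration; the threshold names and values are assumed placeholders, as the actual settings used on the testbed data are not given in the slides.

```python
# Minimal sketch of burst detection in one frequency zone: a burst is a run of
# consecutive signatures whose energy stays above burst_threshold_energy for at
# least min_burst_duration signatures. Threshold values are assumed placeholders.
import numpy as np

def find_bursts(energies: np.ndarray,
                burst_threshold_energy: float,
                min_burst_duration: int) -> list[tuple[int, int]]:
    """Return (start_index, end_index) pairs of detected bursts (end exclusive)."""
    bursts = []
    above = energies > burst_threshold_energy
    start = None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                          # a candidate burst begins
        elif not flag and start is not None:
            if i - start >= min_burst_duration:
                bursts.append((start, i))      # long enough to count as a burst
            start = None
    if start is not None and len(above) - start >= min_burst_duration:
        bursts.append((start, len(above)))     # burst running to the end of the record
    return bursts

# Example: level-1 energies for a day of signatures (synthetic):
# e = np.abs(np.random.randn(288))
# print(find_bursts(e, burst_threshold_energy=2.0, min_burst_duration=3))
```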

Page 14

High-Level Aggregate Features

Aggregate features are computed over 24/48/72-hour windows of signatures from the primitive per-signature features (Level 1/2/3 energies).

Aggregate features per window:
1. Average energy in the low-frequency zone
2. Average energy in the medium-frequency zone
3. Average energy in the high-frequency zone
4. Maximum energy in the low-frequency zone
5. Maximum energy in the medium-frequency zone
6. Maximum energy in the high-frequency zone
7. Number of bursts in the low-frequency zone
8. Number of bursts in the medium-frequency zone
9. Number of bursts in the high-frequency zone
10. Average duration of bursts in the low-frequency zone
11. Average duration of bursts in the medium-frequency zone
12. Average duration of bursts in the high-frequency zone

Target value to predict per window: percentage time-to-failure at the end of each window.

Feature matrix: all feature vectors with 12 columns and one target value (FM_24, FM_48, FM_72).
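To make the window aggregation concrete, here is a minimal pandas sketch (my own illustration, not the authors' pipeline) that rolls per-signature rows of Level-1/2/3 energies, indexed by timestamp, into one 12-dimensional feature vector per fixed window plus the percentage time-to-failure target. The column names, thresholds, and helper names are assumptions.

```python
# Minimal sketch of building one row of FM_24/FM_48/FM_72: aggregate the
# per-signature Level-1/2/3 energies over a fixed window into 12 features plus
# the percentage time-to-failure target. Column names and thresholds are assumed.
import numpy as np
import pandas as pd

def burst_stats(e: np.ndarray, thr: float, min_len: int) -> tuple[int, float]:
    """(number of bursts, average burst duration in signatures) for one zone."""
    runs, run = [], 0
    for flag in np.append(e > thr, False):     # trailing False flushes the final run
        if flag:
            run += 1
        else:
            if run >= min_len:
                runs.append(run)
            run = 0
    return len(runs), (float(np.mean(runs)) if runs else 0.0)

def window_features(df: pd.DataFrame, failure_time: pd.Timestamp,
                    total_life: pd.Timedelta,
                    thr: float = 1.0, min_len: int = 3) -> pd.Series:
    """df: one window's signatures (datetime index, columns level1/level2/level3)."""
    feats = {}
    for zone in ("level1", "level2", "level3"):
        e = df[zone].to_numpy()
        n, avg_dur = burst_stats(e, thr, min_len)
        feats.update({f"avg_{zone}": e.mean(), f"max_{zone}": e.max(),
                      f"n_bursts_{zone}": n, f"avg_burst_dur_{zone}": avg_dur})
    # Target: percentage time-to-failure at the end of the window.
    feats["ttf_pct"] = 100.0 * (failure_time - df.index[-1]) / total_life
    return pd.Series(feats)

# Usage: group signatures into fixed windows and apply window_features to each,
# e.g. df.groupby(pd.Grouper(freq="72h")).apply(...), to assemble FM_72.
```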

Page 15

Training of Regression Models for Spindle Failure Prediction

Page 16

Training Models

Training Phase 1: Build
(a) individual regression models for each dataset (4, 5, and 6) and each window size (24, 48, and 72 hours);
(b) a combined regression model over all datasets together, one for each window size (24, 48, and 72 hours).

Training Phase 2: Improve feature-vector selection by using clustering to extract the feature vectors that correspond to bursts (see the sketch after this list).
• Feature-vector similarity calculated using the correlation coefficient (cutoff = 0.5).
• The MCL clustering algorithm [9] is used.
• The feature vectors belonging to 'burst' clusters are used to construct an enhanced feature matrix.

Training Phase 3: Retrain the combined regression models using the enhanced feature matrix of burst vectors from Training Phase 2.
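Phase 2 can be prototyped as follows: build a graph whose edges connect feature vectors with correlation above 0.5, then run Markov Clustering. This is a minimal sketch of a simplified MCL (expansion/inflation on the column-stochastic matrix), not the ClusterMaker/Cytoscape implementation used in the talk; inflation = 3 matches the reported setting.

```python
# Minimal sketch of Phase 2: correlation-thresholded similarity graph over the
# window feature vectors, followed by a simplified Markov Clustering (MCL).
# Illustrative re-implementation only; the talk used the ClusterMaker plugin.
import numpy as np

def mcl(adj: np.ndarray, inflation: float = 3.0, iters: int = 100) -> list[set[int]]:
    """Cluster nodes of an undirected graph given its adjacency matrix."""
    m = adj.astype(float) + np.eye(len(adj))           # add self-loops
    m /= m.sum(axis=0, keepdims=True)                  # make columns stochastic
    for _ in range(iters):
        m = m @ m                                      # expansion
        m = m ** inflation                             # inflation
        m /= m.sum(axis=0, keepdims=True)
    clusters = {}
    for node in range(len(adj)):
        attractors = frozenset(np.flatnonzero(m[:, node] > 1e-6))
        clusters.setdefault(attractors, set()).add(node)   # nodes sharing attractors
    return list(clusters.values())

def cluster_feature_vectors(fm: np.ndarray, cutoff: float = 0.5) -> list[set[int]]:
    """fm: rows are the 12-dimensional window feature vectors (e.g. C-FM_72)."""
    corr = np.corrcoef(fm)                             # pairwise correlation of rows
    adj = (corr > cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)
    return mcl(adj)

# burst_candidates = [c for c in cluster_feature_vectors(c_fm_72) if len(c) >= 3]
```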

Page 17

Input for the Regression Models

• Individual regression models
  – Generate a feature matrix (FM) for each dataset and each window size, along with the target vectors:
    • 4-FM_24, 5-FM_24, 6-FM_24
    • 4-FM_48, 5-FM_48, 6-FM_48
    • 4-FM_72, 5-FM_72, 6-FM_72
  – Each target vector contains the 'time-to-failure' values.

• Combined regression model
  – Normalize the feature vectors of each dataset (for window size 72) and combine them into a combined feature matrix (C-FM_72). A sketch of this step follows below.
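The exact normalization is not specified in the slides; the sketch below assumes per-dataset min-max scaling of each feature column before concatenation, which is one plausible reading.

```python
# Minimal sketch of building C-FM_72: normalize each dataset's feature matrix
# column-wise and stack them. Min-max scaling per dataset is an assumption; the
# slides only state that the feature vectors are normalized before combining.
import numpy as np

def minmax_normalize(fm: np.ndarray) -> np.ndarray:
    lo, hi = fm.min(axis=0), fm.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)             # avoid division by zero
    return (fm - lo) / span

def combine_feature_matrices(fms: list[np.ndarray]) -> np.ndarray:
    """fms: [4-FM_72, 5-FM_72, 6-FM_72] feature matrices without the target column."""
    return np.vstack([minmax_normalize(fm) for fm in fms])

# c_fm_72 = combine_feature_matrices([fm4_72, fm5_72, fm6_72])
```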

Page 18

Significance Study of Regression Models

All regression analysis experiments were performed using an Excel plugin called NumXL. We conducted significance tests for the constructed models, covering goodness of fit and the F-test.

Goodness of fit
Goodness of fit describes how well a model fits the set of observations; such measures summarize the discrepancy between the observed values and the values expected under the model. We use three standard measures [21]: the coefficient of determination R², the adjusted R², and the sum of squared errors (SSE). Let $\hat{y}_j$ be the values predicted (fitted) by the model, $y_j$ the actual observations, and $n$ the total number of observations.

Coefficient of determination (R²): the proportion of the total variation in the dependent variable $y$ that can be explained by the regression model,
$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}.$$

Total sum of squares (SST): the total amount of variation in the observed $y$ values,
$$\mathrm{SST} = \sum_{j=1}^{n} (y_j - \bar{y})^2.$$

Sum of squared residuals (SSE): the amount of variability that the regression model cannot explain,
$$\mathrm{SSE} = \sum_{j=1}^{n} (y_j - \hat{y}_j)^2.$$

Adjusted R²: a modification of R² that accounts for the residual degrees of freedom. It adjusts for the number of explanatory terms in the model relative to the number of data points, since R² automatically and spuriously increases when extra explanatory variables are added. The smaller the difference between R² and the adjusted R², the better the fit:
$$\text{Adjusted } R^2 = 1 - \frac{\mathrm{SSE}\,(n-1)}{\mathrm{SST}\,v},$$
where $v$ is the residual degrees of freedom of the model.

F-test
• A statistical test used to assess the applicability of the regression model to the given problem.
• A p-value cutoff of 5% was used to reject the null hypothesis.
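For reference, these goodness-of-fit quantities are straightforward to compute from the observed and fitted time-to-failure values. The sketch below is a generic illustration, not the NumXL implementation used in the talk; the n_params argument (number of model coefficients) is an assumption.

```python
# Minimal sketch of the goodness-of-fit measures defined above: SST, SSE, R^2,
# and adjusted R^2 computed from observed and fitted values. n_params (number of
# coefficients including the intercept) is an assumed argument.
import numpy as np

def goodness_of_fit(y: np.ndarray, y_hat: np.ndarray, n_params: int) -> dict[str, float]:
    n = len(y)
    sse = float(np.sum((y - y_hat) ** 2))              # unexplained variability
    sst = float(np.sum((y - y.mean()) ** 2))           # total variation in y
    r2 = 1.0 - sse / sst
    v = n - n_params                                   # residual degrees of freedom
    adj_r2 = 1.0 - (sse * (n - 1)) / (sst * v)
    return {"SSE": sse, "SST": sst, "R2": r2, "adj_R2": adj_r2}

# Example (hypothetical arrays):
# stats = goodness_of_fit(actual_ttf, predicted_ttf, n_params=13)  # 12 features + intercept
# print(stats["R2"], stats["adj_R2"])
```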

Page 19

Individual Models: Significance Results

Table 3. Regression results using individual models for different window sizes:

Model | R² (%) | Adj. R² (%) | Std. Error | Obs. | F | p-value (%)
4_72 | 87.8 | 75.0 | 0.15 | 48 | 6.87 | 0.0
4_48 | 77.4 | 65.5 | 0.18 | 71 | 6.55 | 0.0
4_24 | 57.5 | 48.6 | 0.22 | 139 | 6.44 | 0.0
5_72 | 87.7 | 69.2 | 0.16 | 41 | 4.75 | 0.1
5_48 | 71.6 | 53.1 | 0.20 | 62 | 3.88 | 0.0
5_24 | 55.0 | 43.8 | 0.22 | 121 | 4.89 | 0.0
6_72 | 99.0 | 92.9 | 0.08 | 29 | 16.35 | 0.7
6_48 | 92.1 | 82.2 | 0.13 | 44 | 9.28 | 0.0
6_24 | 83.4 | 77.0 | 0.14 | 88 | 13.16 | 0.0

(Models are named as dataset followed by the window size in hours, e.g. 4_72.)

• All nine trained models are significant as per the F-test, so the null hypothesis can be rejected: a polynomial relationship between the aggregated features and the predicted time-to-failure values does exist for the chosen window sizes.
• The difference between R² and adjusted R² is quite small in all cases, so we can infer that the models fit well.
• For each dataset, R² increases and SSE decreases as the window size grows from 24 to 72 hours, matching the intuition that a larger window better captures aggregate statistics such as average burst duration and number of bursts for long disturbance periods.

Page 20

Combined Model: Clustering Analysis

Clustering goal: separate the feature vectors that belong to bursty regions from those that belong to calm regions, across all the datasets.

The MCL results were obtained using its implementation in the ClusterMaker plugin [14] for the graph analysis tool Cytoscape. MCL has a single parameter, inflation, which tunes the granularity of the clusters; it was set to 3 in our analysis. Out of a total of 119 feature vectors in C-FM_72, MCL assigned 110 to clusters of size >= 3, resulting in six clusters.

Measures used to label a cluster as 'Bursty' vs. 'Calm':

Feature Burst Duration, FBD(f) (called Total Burst Duration, TBD, in the paper): for a feature vector f, the total burst duration in each frequency zone is obtained by multiplying the average burst duration by the number of bursts in that zone; these durations are normalized by the maximum burst durations occurring in the corresponding zones of the complete dataset the feature vector belongs to, giving nb1, nb2, and nb3. Then
$$\mathrm{FBD} = nb_1 + nb_2 + nb_3.$$
A high FBD for a feature vector implies a large amount of burst activity within the window corresponding to that feature vector.

Cluster Burst Duration, CBD(c): the median of the FBD values of all feature vectors belonging to cluster c. A non-zero CBD implies that more than half of the time windows in the cluster contain a large amount of energy burst activity; a CBD of zero implies that more than half of the windows had no burst activity. CBD therefore characterizes a cluster as consisting of primarily bursty or primarily calm windows.

Table 5. Clusters obtained using MCL clustering on all feature vectors in C-FM_72:

Cluster # | #Nodes | #4 | #5 | #6 | CBD
1 | 3 | 1 | 1 | 1 | 300
2 | 10 | 8 | 0 | 3 | 53.4
3 | 47 | 6 | 23 | 8 | 11.83
4 | 31 | 20 | 1 | 10 | 0
5 | 11 | 6 | 0 | 5 | 0
6 | 8 | 0 | 6 | 2 | 0

Based on the CBD values, clusters 1, 2, and 3 correspond to bursty time windows and clusters 4, 5, and 6 to calm time windows. Among the bursty clusters, clusters 1 and 3 contain feature vectors from each of the three datasets, whereas cluster 2 contains feature vectors only from datasets 4 and 6. Attention was therefore narrowed to clusters 1 and 3, since they help identify the characteristics of bursts that occur in all three datasets and thus support a more generic prognostic framework.

Fig. 7. Cluster #1. Nodes are labeled by dataset and actual percentage time-to-failure, and drawn with width proportional to FBD. Cluster 1 contains feature vectors from dataset 5 at 0.0% time-to-failure, dataset 4 at 26.8%, and dataset 6 at 23.5%; these vectors are strongly correlated with each other and correspond to the peak FBD values of their respective datasets. Thus, the peaks of energy bursts in different datasets share similar features, and the peak burst does not necessarily occur at the very end of the spindle's lifetime: even when a peak burst is followed by relatively calmer regions, as in datasets 4 and 6, the spindle may not have much remaining life left after the peak burst occurs. This also strengthens the hypothesis that the chosen aggregate features help group similar feature vectors together.

Fig. 8. Cluster #3.
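The FBD and CBD measures reduce to a few array operations once the per-window burst statistics are available. Below is a minimal sketch using assumed column names (n_bursts_*, avg_burst_dur_*) that match the aggregate features listed earlier in the talk.

```python
# Minimal sketch of FBD (per feature vector) and CBD (per cluster), built from
# the burst-count and average-burst-duration aggregate features. Column names
# are assumptions matching the 12 aggregate features listed earlier.
import numpy as np
import pandas as pd

ZONES = ("level1", "level2", "level3")

def feature_burst_duration(fm: pd.DataFrame) -> pd.Series:
    """FBD for every feature vector: sum of normalized per-zone burst durations."""
    fbd = pd.Series(0.0, index=fm.index)
    for zone in ZONES:
        total = fm[f"n_bursts_{zone}"] * fm[f"avg_burst_dur_{zone}"]
        max_total = total.max()
        fbd += total / max_total if max_total > 0 else 0.0   # normalize within dataset
    return fbd

def cluster_burst_duration(fbd: pd.Series, clusters: list[set[int]]) -> list[float]:
    """CBD for each cluster: median FBD of its member feature vectors."""
    return [float(fbd.iloc[sorted(members)].median()) for members in clusters]

# Usage: compute FBD within each dataset separately (so the normalization is
# per dataset), concatenate, then compute CBD for the MCL clusters; clusters
# with non-zero CBD are labeled 'bursty', the rest 'calm'.
```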

Page 21

Comparison of Results for Three Types of Models (Window Size 72)

Of all the window sizes, the models trained on 72-hour windows were the most significant for all three datasets, so the comparison uses window size 72 throughout:
• Individual models for each dataset (4, 5, 6), trained on 4-FM_72, 5-FM_72, and 6-FM_72 (indv-model).
• Combined model trained on C-FM_72 (comb-model).
• Model trained using only the burst feature vectors, i.e. the vectors from clusters 1 and 3 combined into Clus-FM_72 (clus-model).

Page 22

All models: Dataset 4

Page 23

All models: Dataset 5

Page 24

All models: Dataset 6

Page 25

Deployment of the Different Regression Models

Fig. 7. Cluster #1 obtained after clustering of C-FM_72.
Fig. 8. Cluster #3 obtained after clustering of C-FM_72.
Fig. 9. Actual and predicted TTF values for dataset 4, window size 72 hours.
Fig. 10. Actual and predicted TTF values for dataset 5, window size 72 hours.

R² values were computed for each dataset to evaluate the predictions made by comb-model and clus-model. Table 3 lists the R² values obtained for each dataset with its respective indv-model; Table 6 compares those values with the ones obtained using comb-model and clus-model. For each of the three datasets, Figures 9, 10, and 11 plot the actual time-to-failure (TTF) values against the values predicted by all three trained regression models.

As the plots show, the regression models trained on the individual feature matrices 4-FM_72, 5-FM_72, and 6-FM_72 obtain the highest R². When the regression is performed on the combined matrix C-FM_72, the R² value drops for each dataset; this was expected, since C-FM_72 contains feature vectors from all three datasets. MCL clustering of the feature vectors in C-FM_72 identified the highly similar burst feature vectors across the datasets, and when the regression model is retrained using only the feature matrix corresponding to the burst clusters (Clus-FM_72), the R² value improves considerably. The improvement is significant for datasets 4 and 6, and smaller for dataset 5. For every dataset, the clus-model predicts the peak bursts with very good accuracy.

Table 6. R² values for the individual, combined, and clustered regression models for window size 72:

Dataset # | indv TTF R² (%) | comb TTF R² (%) | clus TTF R² (%)
4 | 87.8 | 75.0 | 96.8
5 | 87.7 | 39.0 | 52.0
6 | 99.0 | 35.0 | 90.2

Significance and impact: in the manufacturing industry, the spindle is a key component of the machine tool. Any breakdown of the spindle halts the machine until a replacement is installed, so predicting its time-to-failure even a few weeks before the actual failure can be of great help.

Page 26

Conclusion

• Using the huge amount of CBM data to predict the remaining useful life of a spindle can be of great help to the manufacturing industry.
• We proposed a data-driven prognostic methodology that exploits large quantities of CBM data to develop regression-based spindle failure models.
• We deployed these models to predict the remaining useful life of the spindle for three datasets generated by a testbed setup at TechSolve, Inc.

Page 27

Future Work

• The regression- and clustering-based methodology presented here can be seen as a proof of concept.
• With more testbed experiments, the proposed methodology can be used to build a generic framework for predicting the remaining useful life of a spindle.

Page 28

References
[1] James A. Harter, "Comparison of contemporary FCG life prediction tools," International Journal of Fatigue 21 (1999), S181–S185.
[2] Sean Marble and Brogan P. Morton, "Predicting the remaining life of propulsion system bearings," IEEE Aerospace Conference, 2006.
[3] Jihong Yan, Chaozhong Guo, and Xing Wang, "A dynamic multi-scale Markov model based methodology for remaining life prediction," Mechanical Systems and Signal Processing 25 (2011), no. 4, 1364–1376.
[4] Hasan Ocak, Kenneth A. Loparo, and Fred M. Discenzo, "Online tracking of bearing wear using wavelet packet decomposition and probabilistic modeling: A method for bearing prognostics," Journal of Sound and Vibration 302 (2007), no. 4, 951–961.
[5] Nagi Z. Gebraeel, Mark Lawley, et al., "A neural network degradation model for computing and updating residual life distributions," IEEE Transactions on Automation Science and Engineering 5 (2008), no. 1, 154–163.
[6] Runqing Huang, Lifeng Xi, Xinglin Li, C. Richard Liu, Hai Qiu, and Jay Lee, "Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods," Mechanical Systems and Signal Processing 21 (2007), no. 1, 193–207.
[7] Shaojiang Dong, Shirong Yin, Baoping Tang, Lili Chen, and Tianhong Luo, "Bearing degradation process prediction based on the support vector machine and Markov model," Shock and Vibration 2014 (2014).
[8] Divya Sardana, Raj Bhatnagar, Radu Pavel, and Jon Iverson, "Investigations on spindle bearings health prognostics using a data mining approach," Proceedings of the Society for Machinery Failure Prevention Technology (MFPT), 2014.
[9] Stijn van Dongen, "Graph clustering via a discrete uncoupling process," SIAM Journal on Matrix Analysis and Applications 30 (2008), no. 1, 121–141.

Page 29

Thank you!