Professor J.-C. Lu Industrial and Systems Engineering Georgia Institute of Technology
A Journey of Learning from Statistics to Manufacturing,
Logistics, Engineering Design and to Information Technology
Professor J.-C. Lu
Industrial and Systems Engineering
Georgia Institute of Technology
Contents
1 Introduction
2 Statistics in Reliability
3 Quality Improvement in Manufacturing
4 Data Mining in Manufacturing
5 Product Design, Manufacturing and Service Chain Management System
6 Information Technology in Education
1. Introduction
• Traditional Research Approach (diagram labels): Thesis Background; Application #1, Application #2, …, Application #k; “Modifications”; “Extensions”; New Methods; New “Areas”
• Non-Traditional Research Approach (diagram labels): Real-life Problems; Team-work Practical Problem Solving; Academic Problem Formulation; Best Practice; Literature Review; New Methods or New Areas in Research
• Other labels on the diagram: Time; Academia; Business; Cross-disciplines; Discipline-focused; Impact Analysis; Application-oriented Literature
2. Statistics in Reliability
Traditional Research Approach:
Lu, J. C. (1989), “Weibull Extensions of the Freund and Marshall-Olkin Bivariate Exponential,” IEEE Transactions on Reliability, 38(5), 615-619.
Lu, J. C. and Bhattacharyya, G. K. (1990), “Some New Constructions of Bivariate Weibull Models,” Annals of the Institute of Statistical Mathematics, 42(3), 543-559.
Lu, J. C. (1990), “Least Squares Estimation for the Multivariate Weibull Model of Hougaard Based on Accelerated Life Test of System and Component,” Communications in Statistics, 19(10), 3725-3739.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a Bivariate Exponential Model of Gumbel Based on Life Test of System and Components,” Journal of Statistical Planning and Inference, 27, 383-396.
Lu, J. C. and Bhattacharyya, G. K. (1991), “Inference Procedures for a Bivariate Exponential Model of Gumbel,” Statistics and Probability Letters, 12, 37-50.
Lu, J. C. (1997), “A New Plan for Life-Testing Two-Component Parallel Systems,” Statistics and Probability Letters, 34(1), 19-32.
$(x_{(1)}, y_{[1]}),\ (x_{(2)}, y_{[2]}),\ \ldots,\ (x_{(r)}, y^*_{[r]}),\ (x^*_{(r+1)}, y^*_{[r+1]}),\ \ldots,\ (x^*_{(n)}, y^*_{[n]})$.
The life-testing experiment was terminated at $x_{(r)}$, and data with superscript “*” are censored at $x_{(r)}$.
$x_{(1)} < x_{(2)} < \cdots < x_{(r)}$ are order statistics, and $y_{[1]}, y_{[2]}, \ldots, y_{[r]}$ are concomitant order statistics.
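A minimal sketch of how such a censored sample could be assembled from paired lifetimes. The rule used here (any lifetime exceeding the stopping time x(r) is recorded as censored at x(r)) is an assumption made for illustration, and the function name and record fields are hypothetical, not taken from the cited paper.

```python
def censor_at_rth_failure(pairs, r):
    """Censor paired lifetimes (x, y) when the r-th smallest x is observed.

    Assumption for illustration: a lifetime larger than the stopping time
    is recorded as censored at the stopping time.
    """
    if not 1 <= r <= len(pairs):
        raise ValueError("r must be between 1 and the number of pairs")
    # Order the pairs by x; each y travels with its x (the concomitant).
    ordered = sorted(pairs, key=lambda p: p[0])
    stop = ordered[r - 1][0]  # the experiment ends at x_(r)
    records = []
    for x, y in ordered:
        records.append({
            "x": min(x, stop), "x_censored": x > stop,
            "y": min(y, stop), "y_censored": y > stop,
        })
    return stop, records


stop, records = censor_at_rth_failure([(3, 5), (1, 2), (2, 9), (7, 4)], r=2)
for rec in records:
    print(rec)
```

The sort keeps each y attached to its x, which is what makes the y values concomitants rather than order statistics in their own right.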
Sample Publications from the Traditional Research Approach:
Chen, D., and Lu, J. C. (1998), “The Asymptotics of Maximum Likelihood Estimates of Parameters Based on a Data Type Where Failure and Censoring Times are Dependent,” Statistics and Probability Letters, 36, 379-391.
Chen, D., Li, C. S., Lu, J. C., and Park, J. (2000), “Simple Parameter Estimation for Bivariate Shock Models with Singular Distribution for Censored Data with Concomitant Order Statistics,” Australian and New Zealand Journal of Statistics, 42(3), 323-336.
Non-traditional Research Approaches:
A. Started working with Nortel in the printed circuit board (PCB) manufacturing area in 1989. Received the first Nortel grant in 1990. Published the first paper (a JASA case study) in 1994.
B. Started working with NCSU’s Semiconductor Center in 1990. Early publications appeared in 1991 (proceedings), 1993 (engineering journal) and 1997 (statistics journal).
Reliability Degradation Studies (First example of the Non-traditional Research Approach):
Lu, J. C., Park, J. and Yang, Q. (1997), “Statistical Inference of a Time-to-Failure Distribution from Linear Degradation Data,” Technometrics, 39(4), 391-400.
Su, C., Lu, J. C., Chen, D., and Hughes-Oliver, J. M. (1999), “A Linear Random Coefficient Degradation Model with Random Sample Size,” Lifetime Data Analysis, 5, 173-183.
Chen, D., Lu, J. C., Huo, X., and Ming, Y. (2001), “Optimum Percentile Estimating Equations for Nonlinear Random Coefficient Models,” Journal of Statistical Planning and Inference, 275-292.
NSF DMII-ORPS Program, “Modeling Accelerated Degradation Data for Product Reliability Improvement and Warranty Analysis,” 2001-2003 (with Paul Kvam).
Linear Degradation Model (semiconductor manufacturing):
$y_{ij} = \beta_{0i} + \beta_{1i} \log(t_{ij}) + \varepsilon_{ij}$,
i = 1, 2, …, k (#replicates),
j = 1, 2, …, $n_i$ (#successive repeated measurements),
$y_{ij}$ = current, threshold voltage shift or transconductance degradation,
$t_{ij}$ = time.
Linear Random Coefficient Model:
Assume the coefficients $\beta_0$ and $\beta_1$ of a unit have a bivariate normal distribution with mean $(\mu_0, \mu_1)$, variance $(\sigma_0^2, \sigma_1^2)$ and correlation $\rho$.
Define the failure time T as the time that the degradation reaches a specified level $y_f$, and set $y_f = \beta_0 + \beta_1 T$.
The distribution of the failure time $T = (y_f - \beta_0)/\beta_1$ is
$\Pr(T \le t) = \Pr\left( (y_f - \beta_0)/\beta_1 \le t \right) = \Phi\{A / B\}$, where $\Phi$ is the standard normal distribution function, $A = \mu_0 + t\mu_1 - y_f$ and
$B = \sqrt{C}$, $C = \sigma_0^2 + \sigma_1^2 t^2 + 2 t \rho \sigma_0 \sigma_1$.
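As a rough illustration, the failure-time distribution {A / B} above can be evaluated numerically. The sketch below assumes the curly-brace expression is the standard normal distribution function applied to A / B, and it writes the means of the intercept and slope as mu0 and mu1, their standard deviations as sigma0 and sigma1, and their correlation as rho. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def failure_time_cdf(t, y_f, mu0, mu1, sigma0, sigma1, rho):
    """Pr(T <= t) for the linear random coefficient model.

    A = mu0 + t*mu1 - y_f
    C = sigma0^2 + sigma1^2 * t^2 + 2*t*rho*sigma0*sigma1, B = sqrt(C)
    """
    a = mu0 + t * mu1 - y_f
    c = sigma0 ** 2 + (sigma1 * t) ** 2 + 2.0 * t * rho * sigma0 * sigma1
    return normal_cdf(a / math.sqrt(c))


# Hypothetical parameter values, chosen only to exercise the formula.
for t in (3.0, 5.0, 7.0):
    print(t, failure_time_cdf(t, y_f=5.0, mu0=0.0, mu1=1.0,
                              sigma0=0.5, sigma1=0.2, rho=0.3))
```

When the mean degradation path reaches the failure level exactly at t, A is zero and the function returns 0.5, which is a convenient sanity check.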
Non-linear Degradation Model (motivated from both semiconductor and PCB manufacturing studies):
$Y_i = f(X_i, \beta_i) + \varepsilon_i$, $\beta_i = \beta + b_i$ (random effects).
Note that $E(Y_i) \ne f(X_i, E(\beta_i)) = f(X_i, \beta)$.
Thus, $f(X_i, \beta)$ is not the mean response of the population, and may not be the median of the distribution of $Y_i$ even when zero is the distribution mean of errors $\varepsilon_i$.
By correcting the bias of the median regression, estimates of $\beta$ were obtained from solving a system of (optimum) unbiased percentile estimating equations (PEE). The asymptotic distribution of the estimates was derived. Several examples of asymptotic efficiency evaluations were given.
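A small numerical sketch of why the population mean response differs from f evaluated at the mean coefficient when f is nonlinear in the random coefficient. The exponential form of f and the two-point distribution of the random effect are assumptions chosen only to make the gap easy to compute. They are not the models used in the cited studies.

```python
import math


def f(x, beta):
    """Hypothetical nonlinear response curve, used only for illustration."""
    return math.exp(beta * x)


def mean_response(x, beta, spread):
    """E[f(x, beta + b)] when b is -spread or +spread with probability 1/2."""
    return 0.5 * (f(x, beta - spread) + f(x, beta + spread))


x, beta = 2.0, 0.5
print("f at the mean coefficient:", f(x, beta))
print("mean of f over the random effect:", mean_response(x, beta, 0.3))
```

With no random effect (spread 0) the two quantities coincide. Any positive spread pushes the mean response above f(x, beta) here because the exponential is convex in beta.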
3. Quality Improvement in Manufacturing
Non-Traditional Research (examples):
Mesenbrink, P., Lu, J. C., McKenzie, R., and Taheri, J. (1994), “Characterization and Optimization of a Wave Soldering Process,” Journal of the American Statistical Association (JASA), 89, 1209-1217.
Gardner, M. M., Lu, J. C., et al. (NCSU ECE and TI researchers) (1997), “Equipment Fault Detection using Spatial Signatures,” IEEE Trans. on Components, Hybrids and Manufacturing, 20(4), 295-304.
Hughes-Oliver, J. M., Lu, J. C., Davis, J. C., and Gyurcsik, R. S. (1998), “Achieving Uniformity in a Semiconductor Fabrication Process using Spatial Modeling,” JASA, 93, 36-45.
Lu, J. C., et al. (Semiconductor Research Corporation (SRC) and NCSU ECE researchers) (1998), “A New Device Design Methodology,” IEEE Trans. on Electron Devices - Special Issue on Process Integration and Manufacturability, 45(3), 634-642.
Li, C. S., Lu, J. C., Park, J., Kim, K. M., Brinkley, P. A., and Peterson, J. (1999), “A Multivariate Zero-inflated Poisson Distribution and its Inferences,” Technometrics, 41(1), 29-38.
4. Data Mining in Manufacturing
Rying, E. A., Bilbro, G. L., Ozturk, M. C., and Lu, J. C. (2000), “In Situ Selectivity and Thickness Monitoring based on Quadrupole Mass Spectroscopy during Selective Silicon Epitaxy,” Proceedings of the 197th Meeting of the Electrochemical Society, 383-392.
Lu, J. C. (2001), “Methodology of Mining Massive Data Set for Improving Manufacturing Quality/Efficiency,” Chapter 11 (pp. 255-288) in Data Mining for Design and Manufacturing, edited by D. Braha, Kluwer Academic Publishers: New York.
Lada, E. K., Lu, J. C., and Wilson, J. R. (2002), “A Wavelet Based Procedure for Process Fault Detection,” IEEE Trans. on Semiconductor Manufacturing, 15(1), 79-90.
Rying, E. A., Bilbro, G. L., and Lu, J. C. (in press), “Focused Local Learning with Wavelet Neural Networks,” IEEE Trans. on Neural Networks.
Porter, A. L., Kongthon, A., and Lu, J. C. (in press), “Research Profiling – Improving the Literature Review: Illustrated for the Case of Data Mining of Large Datasets,” Scientometrics.
Data from Nortel’s Antenna Manufacturing Process
Figure 1: Auto-Correlation Map: Keywords – Data Mining for Large Data Sets
[Figure: auto-correlation map of cleaned keywords, with groupings labeled A, B, C and D. Legend (similarity): > 0.75, 0.50 - 0.75, 0.25 - 0.50, < 0.25.]
Keywords shown: data mining; learning (artificial intelligence); very large databases; data reduction; pattern recognition; pattern classification; pattern clustering; decision trees; statistical analysis; unsupervised learning; tree data structures; data analysis; neural nets; fuzzy set theory; parallel algorithms; spatial data structures; fuzzy logic; trees (mathematics); parallel programming; wavelet transforms; temporal databases; fuzzy neural nets; classification; feature extraction; image processing; backpropagation; remote sensing; image classification; Bayes methods; image recognition.
Node size reflects relative frequency in the dataset of 991 abstract records. Placement is based on a VantagePoint proprietary Multi-dimensional Scaling (MDS) routine. Topics depicted close together are more apt to be associated based on extent of co-occurrence in particular abstract records. Connecting lines, as per the legend, indicate relative degree of association, using a Path Erasing algorithm (note that the absence of a link suggests less association, not no association). That is the better indicator of linkage. Groupings A-D reflect our interpretations.
Discrete Wavelet Transform:
Data Reduction Procedures
1 Linear and Nonlinear Approximation in Signal Processing
2 Information Metric Based Procedures
3 Data Denoising Procedures
4 Our Methods RRE_h and RRE_s
5 Comparisons
• Testing Curves
• “Data without Noises”
• “Data with Inherent Random Noises”
Linear and Nonlinear Approximation in Signal Processing
Information Metric Based Procedure – AMDL (Approximation Minimum Description Length)
Saito’s (1994) method selects C to minimize
$\mathrm{AMDL}(C) = 1.5\, C \log_2 N + 0.5\, N \log_2 \left[ \sum_{i=1}^{N} (y_i - \hat{y}_{i,C})^2 \right]$.
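A minimal sketch of how the AMDL criterion could be scanned over C. It reads C as the number of retained wavelet coefficients and the fitted values as the reconstruction from the C largest coefficients, which is an interpretation rather than something stated on the slide. The orthonormal Haar transform, the function names, and the small floor that keeps the logarithm finite when the residual is zero are all assumptions made for illustration.

```python
import math


def haar_dwt(signal):
    """Orthonormal Haar wavelet transform of a sequence of length 2^J."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    s = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Coarser levels are placed in front of finer ones.
        details = [(a - b) / s for a, b in pairs] + details
        approx = [(a + b) / s for a, b in pairs]
    return approx + details


def amdl_select(signal, floor=1e-12):
    """Return (C, AMDL(C)) minimizing 1.5*C*log2(N) + 0.5*N*log2(RSS_C)."""
    coeffs = haar_dwt(signal)
    n = len(coeffs)
    energies = sorted((w * w for w in coeffs), reverse=True)
    # The transform is orthonormal, so the residual sum of squares after
    # keeping the c largest coefficients is the energy of the dropped ones.
    tail = [0.0] * (n + 1)
    for c in range(n - 1, -1, -1):
        tail[c] = tail[c + 1] + energies[c]
    best_c, best_score = 0, math.inf
    for c in range(1, n):
        rss = max(tail[c], floor)  # floor is an assumption to avoid log2(0)
        score = 1.5 * c * math.log2(n) + 0.5 * n * math.log2(rss)
        if score < best_score:
            best_c, best_score = c, score
    return best_c, best_score


step = [1, 1, 1, 1, -1, -1, -1, -1]
print(amdl_select(step))
```

A pure step signal has a single nonzero Haar coefficient, so the scan settles on C = 1. On noisy data the first term penalizes each extra coefficient while the second rewards a smaller residual.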
Data De-noising Procedures:
Donoho and Johnstone (1995) considered the nonparametric regression model, $y_i = f_i + \varepsilon_i$, i = 1, 2, …, N, where $\varepsilon_i$ are i.i.d. normal variables with zero mean and constant variance. The goal of the data de-noising procedures is to find a smooth estimate to minimize the mean square error (MSE). Three methods, VisuShrink, RiskShrink and SURE (Stein’s Unbiased Risk Estimate), were compared in our studies.
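For orientation, a minimal sketch of the thresholding step behind VisuShrink, which applies soft thresholding to wavelet coefficients at the universal threshold sigma * sqrt(2 log N). The function names are hypothetical, the noise level sigma is assumed known here (in practice it is estimated from the data), and the coefficient values are made up.

```python
import math


def universal_threshold(sigma, n):
    """Universal threshold sigma * sqrt(2 * ln(n)) for n observations."""
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(w, lam):
    """Shrink a coefficient toward zero by lam; small ones become zero."""
    return math.copysign(max(abs(w) - lam, 0.0), w)


# Hypothetical wavelet coefficients from a series of N = 1024 points.
coeffs = [5.1, -0.8, 0.3, -4.2, 1.1]
lam = universal_threshold(sigma=1.0, n=1024)
print(lam, [soft_threshold(w, lam) for w in coeffs])
```

RiskShrink and SURE differ in how the threshold is chosen. The shrinkage function itself can stay the same.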
Seven Testing Curves, Two Real-life Data Examples
Comparison Results (“Data without Noise”)
Comparison Results (“Data with Inherent Random Noises”)
Decision Rules (based on the “reduced-size data”)
1 Chi-square tests
2 Multi-scale Statistical Process Control (SPC)
3 (Functional) Principal Component Analysis (PCA)
4 Bayesian Odds-ratio Probability-based Classification (and Canonical Variation Analysis)
5 Decision Tree (CART)
6 Scalogram (from Signal Processing Literature)
7 Integrated Energy Metrics
Scalogram
Challenges: derive the distribution of the “energy,”
$E_j = \sum_k I(|w_{jk}| > \lambda)\, w_{jk}^2$,
where $\lambda$ is decided from the data reduction method, and $w_{jk}$ is the wavelet coefficient.
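A short sketch of computing the thresholded energy at each resolution level. It assumes the indicator keeps coefficients whose absolute value exceeds a threshold (called lam below) supplied by the data reduction step. The function name and the coefficient values are hypothetical.

```python
def level_energies(coeffs_by_level, lam):
    """Sum of squared wavelet coefficients per level, keeping only |w| > lam."""
    return [
        sum(w * w for w in level if abs(w) > lam)
        for level in coeffs_by_level
    ]


# Hypothetical coefficients for two resolution levels.
coeffs_by_level = [[3.0, -0.5], [0.2, -2.0, 1.0]]
print(level_energies(coeffs_by_level, lam=0.9))
```

Plotting these per-level energies against the level index gives the scalogram used as a decision rule on the reduced-size data.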
Key Challenges in Data Mining Procedures in Manufacturing Applications:
The replication size in “fault classes” is small. Proposal: generating “learning data”.
Example: Rying (2001) conducted 25 runs of RTCVD experiments with four induced fault cases.
Nominal Runs:
Four Induced Fault Cases
Challenges in Learning-data Generations:
1. It is difficult to generate the “data shifting patterns” (e.g., Rying’s nominal data) in the wavelet domain, which holds far fewer data points than the original data domain, where the data size can be large.
Idea: “Zoom in” on the regions where “fault data patterns” occurred, and generate the shifted data in the original data domain within these focused regions.
Illustration Example:
“Zoom-in Procedure”:
Generate Replicates in the Wavelet Domain with the following “Patching Technique”:
5. Product Design, Manufacturing and Service (PDMS) Chain Management System
Initiatives in iTimes (Information Technology Integrated Manufacturing Enterprise System)
Enabling Technologies
Interoperability: Fine and Coarse Grained
Decision Making and Design Synthesis
Engr. Modeling, Validation, Testbeds
IT Architectures for Affordable Change
Tools for Modeling
Application Areas
Materials Design
Additive Fabrication
Aero/Auto/Elec Systems
Education
Engineering Domains
E-Design, Engineering Supply Chains
Customer-Driven Design/Engineering
Simulation-Based Design
Environments for Field Service Engineering
Current Involvement in iTimes:
(1) developing a collaborative game theory based decision support system for structuring interactions among partners in the ePDMS chain (e.g., random coefficient based evolution modeling of utility functions changing over the “co-developing periods”);
(2) extracting design-relevant relationships from “data” collected from various sources, e.g., past designs, conditions of machines on the factory floor at distributed sites, etc.;
(3) monitoring and controlling resource (e.g., energy) utilization and environmental impact.
Challenges in Data Mining on Product Design
(1) “Retrieving past design information”:
How to define “similarity” in 3-D geometric objects with spatial relationships?
Is it possible to develop a “multi-resolution” presentation of design models or data?
(2) Source of “variation” in design
(3) Relationship between design, manufacturing and service activities.
Analysis Models of Varying Fidelity
[Figure: a Design Model (CAD) of an airframe subassembly alongside Analysis Models (CAE) of diverse fidelities (1D beam/stick model, 3D continuum/brick model), with associativity gaps marked between them.]
Informal Associativity Diagram
Constrained Object-based Analysis Template: Constraint Schematic View
[Figure: constraint schematic for a Solder Joint Plane Strain Model, relating a Design Model to an Analysis Model. Labels recoverable from the figure: PWA Component Occurrence (component, base: alumina; solder joint; epoxy; PWB, core: FR4); Plane Strain Bodies System with plane strain bodies i = 1...4, each with geometry, a primary structural material and a linear-elastic model; a bilinear-elastoplastic model; deformation model; total height; total thickness; detailed shape (rectangle); approximate maximum inter-solder joint distance; solder joint shear strain range; stress-strain models 1 to 3; layers 1 SMM, 2 ABB, 3 APM, 4 CBAM.]
Fine-Grained Associativity
6. Information Technology in Education
Architecture of the Integrated Curriculum (IC)-ePDMS System
[Figure: architecture diagram. Components shown:]
• Web-based User Interface
• IC web-page links
• Laboratory project
• Modeling and analysis tools in “existing systems”
• ePDMS decision support tools
• Middleware (e.g., CORBA, SOAP, Jini, etc.)
• Case study database
• Simulated enterprise operation system
• Industrial practicum reports and case studies
• CaMILE