Perspectives on System Identification Lennart Ljung Linköping University, Sweden.
The Problem
Flight tests with Gripen at high alpha.
Person in magnet camera, stabilizing a pendulum by thinking "right"/"left" (fMRI picture of brain).
The Confusion
Support Vector Machines * Manifold Learning * Prediction Error Method * Partial Least Squares * Regularization * Local Linear Models * Neural Networks * Bayes Method * Maximum Likelihood * Akaike's Criterion * The Frisch Scheme * MDL * Errors In Variables * MOESP * Realization Theory * Closed Loop Identification * Cramér-Rao * Identification for Control * N4SID * Experiment Design * Fisher Information * Local Linear Models * Kullback-Leibler Distance * Maximum Entropy * Subspace Methods * Kriging * Gaussian Processes * Ho-Kalman * Self Organizing Maps * Quinlan's Algorithm * Local Polynomial Models * Direct Weight Optimization * PCA * Canonical Correlations * RKHS * Cross Validation * Co-integration * GARCH * Box-Jenkins * Output Error * Total Least Squares * ARMAX * Time Series * ARX * Nearest Neighbors * Vector Quantization * VC-dimension * Rademacher Averages * Manifold Learning * Local Linear Embedding * Linear Parameter Varying Models * Kernel Smoothing * Mercer's Conditions * The Kernel Trick * ETFE * Blackman-Tukey * GMDH * Wavelet Transform * Regression Trees * Yule-Walker Equations * Inductive Logic Programming * Machine Learning * Perceptron * Backpropagation * Threshold Logic * LS-SVM * Generalization * CCA * M-estimator * Boosting * Additive Trees * MART * MARS * EM Algorithm * MCMC * Particle Filters * PRIM * BIC * Innovations Form * AdaBoost * ICA * LDA * Bootstrap * Separating Hyperplanes * Shrinkage * Factor Analysis * ANOVA * Multivariate Analysis * Missing Data * Density Estimation * PEM
This Talk
Two objectives:
• Place System Identification on the global map. Who are our neighbours in this part of the universe?
• Discuss some open areas in System Identification.
The Communities
Constructing (mathematical) models from data is a prime problem in many scientific fields and many application areas.
Many communities and cultures around the area have grown, with their own nomenclatures and their own "social lives".
This has created a very rich, and somewhat confusing, plethora of methods and approaches for the problem.
A picture: there is a core of central material, encircled by the different communities.
Estimation
All data contain information and misinformation ("signal and noise").
Squeeze out the relevant information in the data, but NOT MORE!
So we need to meet the data with a prejudice!
Estimation Prejudices
Nature is simple! (Occam's razor)
God is subtle, but He is not malicious. (Einstein)
So, conceptually:
Ex: Akaike:
Regularization:
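The equations on these slides did not survive the transcript. As a hedged reconstruction from the standard literature (not necessarily the exact slide content), Akaike's criterion and a regularized fit both implement the "simplicity prejudice" by adding a penalty to the data fit:

```latex
% Akaike's Information Criterion: fit plus a complexity penalty
\hat{\theta}_{\mathrm{AIC}} = \arg\min_{\theta} \; \bigl[-2 \log L(\theta) + 2\,\dim(\theta)\bigr]

% Regularization: fit plus a penalty on parameter size
\hat{\theta}_{\mathrm{reg}} = \arg\min_{\theta} \; \sum_{t=1}^{N} \bigl(y(t) - \hat{y}(t \mid \theta)\bigr)^{2} + \lambda \,\|\theta\|^{2}
```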
Estimation and Validation
So don't be impressed by a good fit to data in a flexible model set!
Bias and Variance
MSE = BIAS (B) + VARIANCE (V)
Error = Systematic + Random
This bias/variance tradeoff is at the heart of estimation!
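The decomposition is easy to check numerically. A minimal Python sketch (the shrinkage estimator and all numbers are made up for illustration): a biased estimator of a mean trades systematic error for a smaller random error, and its Monte Carlo MSE matches bias² + variance.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, N, trials = 1.0, 1.0, 10, 200_000

# Shrinkage estimator theta_hat = c * mean(y): biased, but lower variance
c = 0.5
est = c * rng.normal(theta, sigma, size=(trials, N)).mean(axis=1)

mse = np.mean((est - theta) ** 2)          # Monte Carlo mean squared error
bias2 = (c * theta - theta) ** 2           # systematic part, (0.5 - 1)^2
var = c**2 * sigma**2 / N                  # random part, 0.25 / 10
print(mse, bias2 + var)                    # the two agree closely
```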
Information Content in Data and the CR Inequality
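For reference, the Cramér-Rao inequality bounds the covariance of any unbiased estimator by the inverse Fisher information, which quantifies the information content of the data:

```latex
\mathrm{Cov}(\hat{\theta}) \succeq I(\theta)^{-1}, \qquad
I(\theta) = \mathbb{E}\!\left[\nabla_{\theta} \log p(y;\theta)\,
            \nabla_{\theta} \log p(y;\theta)^{\top}\right]
```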
The Communities Around the Core I
Statistics: the mother area
  … EM algorithm for ML estimation
  Resampling techniques (bootstrap …)
  Regularization: LARS, Lasso …
Statistical learning theory
  Convex formulations, SVM (support vector machines)
  VC-dimensions
Machine learning
  Grown out of artificial intelligence: logical trees, self-organizing maps.
  More and more influence from statistics: Gaussian processes, HMM, Bayesian nets
The Communities Around the Core II
Manifold learning
  Observed data belong to a high-dimensional space.
  The action takes place on a lower-dimensional manifold: find that!
Chemometrics
  High-dimensional data spaces (many process variables).
  Find linear low-dimensional subspaces that capture the essential state: PCA, PLS (Partial Least Squares), …
Econometrics
  Volatility clustering
  Common roots for variations
The Communities Around the Core III
Data mining
  Sort through large databases looking for information: ANN, NN, trees, SVD …
  Google, business, finance …
Artificial neural networks
  Origin: Rosenblatt's perceptron
  Flexible parametrization of hypersurfaces
Fitting ODE coefficients to data
  No statistical framework: just link ODE/DAE solvers to optimizers
System Identification
  Experiment design
  Dualities between time and frequency domains
System Identification – Past and Present
Two basic avenues, both laid out in the 1960s:
• Statistical route: ML etc.: Åström-Bohlin 1965
  • Prediction error framework: postulate a predictor and apply curve-fitting
• Realization-based techniques: Ho-Kalman 1966
  • Construct/estimate states from data and apply LS (subspace methods).
Past and Present:
• Useful model structures
• Adapt and adopt the core's fundamentals
• Experiment design …
  • … with intended model use in mind ("identification for control")
System Identification – Future: Open Areas
• Spend more time with our neighbours! (Report from a visit later on)
• Model reduction and system identification
• Issues in identification of nonlinear systems
• Meet demands from industry
• Convexification: formulate the estimation task as a convex optimization problem
Model Reduction
System identification is really "system approximation", and therefore closely related to model reduction.
Model reduction is a separate area with an extensive literature (``another satellite''), which can be more seriously linked to the system identification field.
• Linear systems - linear models: divide, conquer and reunite (outputs)!
• Nonlinear systems - linear models: understand the linear approximation. Is it good for control?
• Nonlinear systems - nonlinear reduced models: much work remains
Linear Systems - Linear Models: Divide, Conquer, Reunite!
Helicopter data: 1 pulse input; 8 outputs (only 3 shown here).
State-space model of order 20 wanted.
First fit all 8 outputs at the same time.
Next fit 8 SISO models of order 12, one for each output.
Linear Systems - Linear Models: Divide, Conquer, Reunite!
Now, concatenate the 8 SISO models, reduce the 96th-order model to order 20, and run some more iterations:

  mm = [m1; …; m8]; mr = balred(mm, 20); model = pem(zd, mr); compare(zd, model)
Linear Models from Nonlinear Systems
Model reduction from nonlinear to linear could be surprising!
Nonlinear Systems
• A user's guide to nonlinear model structures suitable for identification and control
• Unstable nonlinear systems, stabilized by unknown regulator
• Stability handle on NL black-box models
Industrial Demands
Data mining in large historical process data bases ("K, M, G, T, P")
A serious integration of physical modeling and identification (not just parameter optimization in simulation software)
PM 12, Stora Enso Borlänge: 75000 control signals, 15000 control loops.
All process variables, sampled at 1 Hz for 100 years = 0.1 PByte
Industrial Demands: Simple Models
Simple models/experiments for certain aspects of complex systems:
Use input that enhances the aspects, … and also conceals irrelevant features.
• Steady-state gain for arbitrary systems: use constant input!
• Nyquist curve at phase crossover: use relay feedback experiments
But more can be done … Hjalmarsson et al.: "Cost of Complexity".
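The first item is simple enough to sketch in Python (the plant here is a made-up stable first-order system, treated as a black box by the experimenter): hold the input constant, wait for the output to settle, and read off the steady-state gain without estimating anything else.

```python
import numpy as np

# A stable plant; its internals are assumed unknown to the experimenter
def plant(u):
    y = np.zeros_like(u)
    for t in range(1, len(u)):
        y[t] = 0.9 * y[t - 1] + 0.2 * u[t - 1]
    return y

u = np.ones(500)             # constant input: the "simple experiment"
y = plant(u)
gain = y[-1] / u[-1]         # settled output over input
print(gain)                  # close to 0.2 / (1 - 0.9) = 2.0
```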
An Example of a Specific Aspect
Estimate a non-minimum-phase zero in complex systems (without estimating the whole system) – for control limitations.
A NMP zero at … for an arbitrary system can be estimated by using the input …
Example: 100 complex systems, all with a zero at 2, are estimated as 2nd-order FIR models.
System Identification – Future: Open Areas
• Spend more time with our neighbours! (Report from a visit later on)
• Model reduction and system identification
• Issues in identification of nonlinear systems
• Meet demands from industry
• Convexification: formulate the estimation task as a convex optimization problem
Convexification I
Are local minima an inherent feature of a model structure?
Example: Michaelis-Menten kinetics.
Massage the equations: the result is a linear regression that relates the unknown parameters and measured variables. We can thus find them by a simple least squares procedure. We have, in a sense, convexified the problem.
Is this a general property? Yes, any identifiable structure can be rearranged as a linear regression (Ritt's algorithm).
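A minimal Python sketch of the Michaelis-Menten massage (the parameter values are made up for the demo): the rate law v = Vmax·s/(Km + s) is nonlinear in (Vmax, Km), but multiplying out gives v·s = Vmax·s - Km·v, which is linear in the parameters, so least squares recovers them directly.

```python
import numpy as np

Vmax_true, Km_true = 3.0, 0.5          # made-up "true" kinetics
s = np.linspace(0.1, 5.0, 50)          # substrate concentrations
v = Vmax_true * s / (Km_true + s)      # measured rates (noise-free here)

# Rearranged model: v*s = Vmax*s - Km*v  ->  linear regression
Phi = np.column_stack([s, -v])         # regressors for [Vmax, Km]
Vmax, Km = np.linalg.lstsq(Phi, v * s, rcond=None)[0]
print(Vmax, Km)                        # recovers 3.0, 0.5
```

With noisy measurements of v the rearrangement puts noise in the regressors too (an errors-in-variables effect), so the convexified estimate is typically a starting point rather than the final answer.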
Convexification II: Manifold Learning
1. X: original regressors
2. g(x): nonlinear, nonparametric recoordinatization
3. Z: new regressor, possibly of lower dimension
4. h(z): simple convex map
5. Y: goal variable (output)
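The pipeline above can be sketched in Python. As a simplification, g is taken to be linear PCA via the SVD (a stand-in for the nonlinear recoordinatization on the slide), and the data and dimensions are made up: 10-dimensional regressors that actually live on a 2-dimensional subspace, a 2-dimensional Z, and a least-squares map h.

```python
import numpy as np

rng = np.random.default_rng(2)

# X: high-dimensional regressors that actually live on a 2-D subspace
Z_true = rng.normal(size=(300, 2))
A = rng.normal(size=(2, 10))
X = Z_true @ A                         # 300 samples in 10-D

# g(x): PCA via the SVD -> Z, the new low-dimensional regressors
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T

# h(z): a simple convex (least-squares) map from Z to the goal variable Y
Y = X @ rng.normal(size=10)            # some target depending on X
theta = np.linalg.lstsq(Z, Y - Y.mean(), rcond=None)[0]
resid = Y - Y.mean() - Z @ theta
print(np.max(np.abs(resid)))           # tiny: 2 coordinates suffice
```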
Conclusions
System identification is a mature subject … same age as IFAC, with the longest-running symposium series …
… and much progress has allowed important industrial applications …
… but it still has an exciting and bright future!
Thanks
Research: Martin Enqvist, Torkel Glad, Håkan Hjalmarsson, Henrik Ohlsson, Jacob Roll
Discussions: Bart de Moor, Johan Schoukens, Rik Pintelon, Paul van den Hof
Comments on paper: Michel Gevers, Manfred Deistler, Martin Enqvist, Jacob Roll, Thomas Schön
Comments on presentation: Martin Enqvist, Håkan Hjalmarsson, Kalle Johansson, Ulla Salaneck, Thomas Schön, Ann-Kristin Ljung
Special effects: Effektfabriken AB, Sciss AB