Alessio Cirone 11/09/2019
Zoom on Virgo
Advanced Virgo is a laser interferometer devoted to the detection of gravitational waves of astrophysical and cosmological origin. Its detection capability depends entirely on the instrument's spectral sensitivity.
Newtonian noise of seismic origin
It would be a limiting noise source for AdV+ in the 10-40 Hz band.
Seismic waves
Density fluctuations of the surrounding
media (rocks, air, water…)
Gravity gradients
Newtonian attraction
Test mass displacement
A Newtonian noise cancellation scheme can be realized using an array of seismometers deployed near the test masses. What we are missing is:
1. Newtonian noise has never been directly measured
• Theoretical models
• Semi-realistic FE simulations (including soil properties and the surrounding infrastructure)
2. Sensors
• Optimized placement (number and position)
• Type selection (accelerometers and/or tiltmeters)
• Underground detectors?
3. Data processing
• A second feasible subtraction algorithm in addition to the Wiener filter (the gold standard)
M.W. Coughlin et al., PRL, 2018.
$S_j(F_x, F_y)$
Foundations, building …
$\delta\rho = -\nabla \cdot \big(\rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t)\big)$
• 38 sensors inside the West End Building (WEB), 2018
• 38 sensors inside the North End Building (NEB), 2019
Are they all needed?
E. Calloni, Internal note: VIR-0637A-19
M. Beker et al, 2016, TLE
Einstein Telescope
F. Badaracco and J. Harms, CQG, 2019.
Deep neural network (DNN)
Robust with respect to terrain parameters, sensor placement, number and type
[Figure: network diagram with labels $F_x$, $F_y$, $S_1$, $S_2$, $S_3$, …, $S_k$]
Part 1
Prediction of one sensor displacement from the entire array (neural network compared to Wiener filter)
Part 2
Optimal sensor configuration from partial information (ANSYS simulations + machine learning)
The neural network (DNN) is compared with the Wiener filter (WF)
1. Dataset
2. Pre-processing (resampling, filtering, normalization …)
3. Train/test set and prediction set
4. DNN (genetic algorithm, best DNNs) and WF
5. Residuals
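The slides do not say how the WF baseline is implemented. As a rough illustration only, a time-domain FIR Wiener filter can be estimated by least squares from lagged copies of the sensor channels. In the sketch below the function names, the number of taps and the synthetic data are all assumptions, not the Virgo pipeline.

```python
import numpy as np


def build_lagged(inputs, n_taps):
    """Stack the current and the n_taps - 1 previous samples of every sensor."""
    n_samples = inputs.shape[0]
    cols = [inputs[n_taps - 1 - lag:n_samples - lag] for lag in range(n_taps)]
    return np.hstack(cols)


def fit_wiener_fir(inputs, target, n_taps):
    """Least-squares FIR coefficients mapping the sensor array to the target."""
    design = build_lagged(inputs, n_taps)
    coeffs, *_ = np.linalg.lstsq(design, target[n_taps - 1:], rcond=None)
    return coeffs


def predict_wiener(inputs, coeffs, n_taps):
    return build_lagged(inputs, n_taps) @ coeffs


# Synthetic stand-in data (not Virgo data): 3 sensors, target built from them.
rng = np.random.default_rng(0)
sensors = rng.standard_normal((2000, 3))
target = 0.5 * sensors[:, 0] + 0.01 * rng.standard_normal(2000)
target[1:] += 0.3 * sensors[:-1, 1]  # one-sample-delayed contribution

n_taps = 2  # hypothetical filter length
coeffs = fit_wiener_fir(sensors[:1400], target[:1400], n_taps)
prediction = predict_wiener(sensors[1400:], coeffs, n_taps)
residual = target[1400:][n_taps - 1:] - prediction
residual_ratio = residual.var() / target[1400:].var()
```

The ratio of residual variance to target variance plays the role of the "residuals" box in the pipeline; the same quantity can be computed for any other predictor to compare it with the WF.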
Complete WEB/NEB datasets (from 13 days to a few months)
• Train/test: 1 hour, randomly selected (70 % for train, 30 % for test)
• Prediction: 1 hour, randomly selected
• Same for DNN and WF
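A minimal sketch of this segment selection follows. It assumes a hypothetical 100 Hz sampling rate and a contiguous 70/30 cut inside the train/test hour; the slides do not state whether the split is contiguous or shuffled, and the helper names are illustrative.

```python
import random


def pick_segments(n_samples, segment_len, seed=0):
    """Pick two distinct random segments: one for train/test, one for prediction."""
    n_slots = n_samples // segment_len
    if n_slots < 2:
        raise ValueError("dataset too short for two segments")
    rng = random.Random(seed)
    fit_slot, predict_slot = rng.sample(range(n_slots), 2)

    def to_range(slot):
        return slot * segment_len, (slot + 1) * segment_len

    return to_range(fit_slot), to_range(predict_slot)


def split_train_test(start, stop, train_fraction=0.7):
    """Contiguous split of [start, stop) into train and test index ranges."""
    cut = start + int(round((stop - start) * train_fraction))
    return (start, cut), (cut, stop)


SAMPLE_RATE_HZ = 100                     # assumption, not stated on the slide
ONE_HOUR = 3600 * SAMPLE_RATE_HZ
N_SAMPLES = 13 * 24 * ONE_HOUR           # 13 days of data

fit_segment, prediction_segment = pick_segments(N_SAMPLES, ONE_HOUR)
train_range, test_range = split_train_test(*fit_segment)
```

Using the same index ranges for both DNN and WF keeps the comparison fair, as the slide requires.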
Basic sequential model (input: n sensors). Layer blocks shown in the diagram:
• (Conv. 1D – Activation – Max Pooling 1D) × 2
• Fully connected – Activation
• Fully connected – Batch Norm. – Activation
• Conv. 1D – Activation
• Global Average Pooling 1D + Fully connected – Activation
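To make the building blocks concrete, here is a NumPy sketch of what one "Conv. 1D – Activation – Max Pooling 1D" block computes, applied twice and followed by global average pooling. The kernel width, number of filters, ReLU activation and pool size are assumptions; the slides do not give these hyperparameters, and the actual networks are built with Keras.

```python
import numpy as np


def conv1d(x, kernels, bias):
    """'Valid' 1D convolution (cross-correlation, as in most DL libraries).

    x: (n_steps, n_channels); kernels: (width, n_channels, n_filters).
    """
    width = kernels.shape[0]
    n_out = x.shape[0] - width + 1
    out = np.empty((n_out, kernels.shape[2]))
    for t in range(n_out):
        out[t] = np.tensordot(x[t:t + width], kernels, axes=([0, 1], [0, 1])) + bias
    return out


def relu(x):
    return np.maximum(x, 0.0)


def max_pool1d(x, pool=2):
    """Non-overlapping max pooling along the time axis."""
    n_out = x.shape[0] // pool
    return x[:n_out * pool].reshape(n_out, pool, x.shape[1]).max(axis=1)


def global_average_pool1d(x):
    return x.mean(axis=0)


rng = np.random.default_rng(0)
window = rng.standard_normal((64, 10))   # 64 time steps, 10 sensors (made up)

hidden = window
for _ in range(2):                       # the "x 2" repetition on the slide
    kernels = 0.1 * rng.standard_normal((3, hidden.shape[1], 8))
    hidden = max_pool1d(relu(conv1d(hidden, kernels, np.zeros(8))))

features = global_average_pool1d(hidden)
```

Each block shortens the time axis (64 to 31 to 14 steps here) while the global average pooling collapses it into one feature vector for the fully connected layers.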
We use the Python-based Keras library with the TensorFlow backend, plus some extra packages such as SCOOP (Scalable COncurrent Operations in Python) and DEAP (Distributed Evolutionary Algorithms in Python).
Good foundations: Python based, open access, CPU & GPU friendly
• Test cluster of about 140 cores for distributed processing
• Virtual environment with the Anaconda Python distribution
• Robust, portable environment for Virgo to run on site and act offline for Newtonian noise subtraction
• At first the DNN is trained on the simulated data (process done on all the CPUs in parallel)
Survival of the fittest: looking for good performance and low network complexity
Evolutionary algorithm
• New networks to be tested (variable number of neurons, number of layers, layer types …)
• Fit, evaluate, predict
• Select the best networks (hall of fame)
• Mate and mutate
Looking for performance & simplicity
• In the GA, new networks are created by “mating” two high-performance networks
• The new networks inherit some properties from both parents, in particular their weights, thereby applying a heuristic form of transfer learning
[Figure labels: crossover cut; newly initialized weights]
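One way to read the "crossover cut" and "newly initialized weights" labels is a one-point crossover over the layer list, where the layer right after the cut is re-initialized when its input size no longer matches. The sketch below is an interpretation under that assumption, using plain Python lists rather than the DEAP/Keras objects of the actual code; every name in it is hypothetical.

```python
import copy
import random


def random_weights(n_in, n_out, rng):
    """Freshly initialized weight matrix (n_in rows, n_out columns)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]


def make_network(n_inputs, layer_units, rng):
    layers, n_in = [], n_inputs
    for units in layer_units:
        layers.append({"units": units,
                       "weights": random_weights(n_in, units, rng),
                       "inherited": False})
        n_in = units
    return layers


def mate(parent_a, parent_b, rng):
    """One-point crossover: layers before the cut from A, the rest from B."""
    cut = rng.randint(1, min(len(parent_a), len(parent_b)) - 1)
    child = copy.deepcopy(parent_a[:cut] + parent_b[cut:])
    for layer in child:
        layer["inherited"] = True
    # The first layer after the cut may no longer match its new input size:
    # in that case its weights are re-initialized instead of inherited.
    n_in = child[cut - 1]["units"]
    if len(child[cut]["weights"]) != n_in:
        child[cut]["weights"] = random_weights(n_in, child[cut]["units"], rng)
        child[cut]["inherited"] = False
    return child, cut


rng = random.Random(1)
parent_a = make_network(10, [32, 16, 1], rng)
parent_b = make_network(10, [64, 8, 1], rng)
child, cut = mate(parent_a, parent_b, rng)
```

The child starts training from mostly pre-trained weights, which is the sense in which mating acts as a heuristic transfer learning step.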
• The DNN takes the temporal data of the sensor array as input and another single sensor as output

Time   Sensor 1   Sensor 2   …   Sensor k   Target sensor
0.01   1.24E-11   1.26E-13   …   2.72E-14   2.30E-12
0.02   3.16E-10   5.31E-13   …   9.48E-13   5.11E-11
0.03   3.66E-09   4.03E-13   …   1.57E-11   4.68E-12

• Monte Carlo simulation with n selected input sensors out of N = 36 (target sensor: 30)
• Random data selection for each MC choice
• A single predictor for DNN is selected: the mean performance value is taken from the best DNNs
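The Monte Carlo draw of input sensors can be sketched with the standard library. The sensor numbering below (IDs 1 to 37 with the target 30 removed, which leaves 36 candidates) is an assumption made for illustration, as are the function name and the number of draws.

```python
import random

TARGET_SENSOR = 30
# Hypothetical numbering: 36 candidate sensors once the target is excluded.
CANDIDATES = [s for s in range(1, 38) if s != TARGET_SENSOR]


def draw_configurations(candidates, n_inputs, n_draws, seed=0):
    """Draw n_draws random subsets of n_inputs sensors, without repetition."""
    rng = random.Random(seed)
    return [sorted(rng.sample(candidates, n_inputs)) for _ in range(n_draws)]


configurations = draw_configurations(CANDIDATES, n_inputs=10, n_draws=5)
```

Each configuration would then be paired with a randomly selected data segment and evaluated with both the DNN and the WF.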
[Figure: example of seismic spectra of the target channel and the predicted output with DNN and WF, together with the residuals and the residual mean (entire array as input)]
Mean residual distributions for DNN and WF (n selected sensors out of 36; values shown on the plot: 3.34, 0.66, 0.61, 3.16)
• DNN statistically better than WF
• Redundancy: even with few sensors we can achieve residuals comparable to the full-array case, if the sensors are properly selected
• The higher the sensor-target correlation, the better the residuals
• High correlation between DNN and WF results (Pearson's correlation: 0.736)
• There are also some cases in which the WF performs much worse than the DNN
• This behaviour is under investigation
$\frac{1}{2}(DNN + WF)$ can be competitive with, an alternative to, or complementary to WF alone, even in a sub-optimal sensor configuration
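For reference, both quantities on this slide are simple to compute. The sketch below shows the Pearson coefficient between two lists of residuals and the averaged predictor ½(DNN + WF); the residual values are invented placeholders, not the results behind the 0.736 figure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_predictor(dnn_prediction, wf_prediction):
    """Sample-by-sample mean of the two predictions: 1/2 (DNN + WF)."""
    return [0.5 * (d + w) for d, w in zip(dnn_prediction, wf_prediction)]


# Placeholder residuals for a handful of sensor configurations.
dnn_residuals = [0.60, 0.70, 0.50, 0.90]
wf_residuals = [0.70, 0.80, 0.60, 1.50]
correlation = pearson(dnn_residuals, wf_residuals)
combined = average_predictor([1.0, 3.0], [3.0, 5.0])
```

Averaging helps most when the two predictors make partly independent errors, which is why the outlier cases where the WF degrades are worth understanding.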
10 input sensors: [4, 13, 15, 17, 19, 27, 29, 31, 36, 37]
Target sensor: 30
Complete WEB dataset (13 days); train/test and prediction segments randomly selected
• Time domain: input = time history (~ms), output = time sample to predict
• Frequency domain: input = FFT (time window: 40 s to 1k s), output = FFT to predict
• Better residuals in the frequency domain, both for DNN and WF (time window for the FFT: 72 s)
• What if we increase the time window?
• It turns out that the frequency-domain WF performance strongly depends on the time window length
• On the contrary, the DNN seems to follow a constant trend
• To explore longer time windows, we would need more computational resources
Part 2
Optimal sensor configuration from partial information (ANSYS simulations + machine learning)
Many ANSYS simulations: random stochastic sources and a virtual network of N surface sensors (regular grid or not)
$S_j(F_x, F_y)$
Foundations, building …
$\delta\rho = -\nabla \cdot \big(\rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t)\big)$
• Problem
Material = reinforced concrete = concrete + steel, so $E$ and $\nu$ are unknown
• Simplification
We divide the structure into macro-elements (plinths, foundation beams, slabs …)
For each macro-element, we consider the percentage of steel (0.5 – 2.5 %)
We homogenize the cross-section, enlarge the dimensions and keep the elastic modulus of the concrete.
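The slides do not give the homogenization formula. One common approach that matches the description is the transformed-section idea from structural engineering: the steel area is replaced by an equivalent concrete area scaled by the modular ratio $E_s / E_c$, so the section grows while the concrete modulus is kept. The sketch below assumes that approach, matches axial stiffness only, and uses typical textbook moduli (210 GPa and 30 GPa) that are not taken from the source.

```python
def homogenized_area(gross_area, steel_fraction, e_steel=210e9, e_concrete=30e9):
    """Equivalent all-concrete area with the same axial stiffness E*A."""
    modular_ratio = e_steel / e_concrete
    steel_area = steel_fraction * gross_area
    concrete_area = gross_area - steel_area
    return concrete_area + modular_ratio * steel_area


def enlarged_side(side, steel_fraction, e_steel=210e9, e_concrete=30e9):
    """Side of the enlarged square section made of plain concrete."""
    area = homogenized_area(side ** 2, steel_fraction, e_steel, e_concrete)
    return area ** 0.5


# Steel percentages quoted on the slide: 0.5 % to 2.5 %.
for fraction in (0.005, 0.015, 0.025):
    print(fraction, round(enlarged_side(1.0, fraction), 4))
```

With these assumed moduli, 2 % of steel enlarges the equivalent area by 12 %, so the correction to the dimensions stays at the level of a few percent over the quoted range.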
• Full covariance matrix (as if sensors were installed at every grid node)
• Sensor – Newtonian noise covariance
• From each model, generate several partial models with a variable number of sensors randomly picked from the original grid (covering $0.1 < n/N < 0.99$)
• A total of 10k variants computed on 200 models
[Figure: full model and derived partial models]
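Deriving a partial model amounts to keeping the rows and columns of the full covariance that belong to a random subset of sensors. A minimal NumPy sketch follows, with synthetic data standing in for an ANSYS model; the function name and the treatment of the $n/N$ bounds (inclusive after rounding) are assumptions.

```python
import numpy as np


def sample_partial_model(c_ss, c_sn, rng, min_frac=0.1, max_frac=0.99):
    """Keep a random subset of sensors with roughly min_frac < n/N < max_frac."""
    n_total = c_ss.shape[0]
    n_low = max(1, int(np.ceil(min_frac * n_total)))
    n_high = max(n_low, int(np.floor(max_frac * n_total)))
    n_keep = int(rng.integers(n_low, n_high + 1))
    kept = np.sort(rng.choice(n_total, size=n_keep, replace=False))
    return kept, c_ss[np.ix_(kept, kept)], c_sn[kept]


# Synthetic stand-in for one full model: 25 sensors, 500 samples.
rng = np.random.default_rng(0)
sensor_data = rng.standard_normal((500, 25))
newtonian = sensor_data @ rng.standard_normal(25) + 0.1 * rng.standard_normal(500)

full_c_ss = np.cov(sensor_data, rowvar=False)
centered = sensor_data - sensor_data.mean(axis=0)
full_c_sn = centered.T @ (newtonian - newtonian.mean()) / (len(newtonian) - 1)

kept, partial_c_ss, partial_c_sn = sample_partial_model(full_c_ss, full_c_sn, rng)
```

Repeating the draw 50 times on each of the 200 models gives the 10k variants quoted on the slide.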
PCA on the composite vector to extract the important features and reduce the datasets
• 10k samples
• 51 scores on the eigenvectors
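The PCA step can be sketched with a plain SVD. The example below uses a much smaller synthetic dataset than the 10k samples and 51 scores quoted above, and the function names are illustrative; it only shows the mapping onto the eigenvectors and back.

```python
import numpy as np


def fit_pca(samples, n_components):
    """Return the mean and the leading principal directions (as rows)."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:n_components]


def to_scores(samples, mean, components):
    """Project composite vectors onto the eigenvectors."""
    return (samples - mean) @ components.T


def from_scores(scores, mean, components):
    """Map scores back into the composite-vector space."""
    return scores @ components + mean


# Synthetic composite vectors driven by 5 hidden factors (made-up sizes).
rng = np.random.default_rng(0)
composite = rng.standard_normal((1000, 5)) @ rng.standard_normal((5, 60))

mean, components = fit_pca(composite, n_components=5)
scores = to_scores(composite, mean, components)
reconstructed = from_scores(scores, mean, components)
```

The network then only has to learn a map between low-dimensional score vectors instead of full covariance matrices.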
DNN to predict the full covariance matrix from partial information ($n < N$ sensors)
[Figure: partial and full covariance matrices]
To be continued …
Part 1:
• The Wiener filter is the best predictor for stationary linear stochastic signals.
• We are investigating possible causes (e.g. transients or non-idealities in the data, the WF implementation, …) that could explain the better performance of the DNNs.
• DNN drawback: Virgo has to be limited by Newtonian noise in that frequency band, so that we can take it as the target signal for prediction. In contrast, the WF uses the correlations between seismic sensors and test masses, which allows long integration over time.
Part 2:
• Simulations in progress
• MATLAB toy model for parameter tuning and initial efficiency estimation is ready
• Post-processing and DNN in the exploratory phase
Thank you for your attention!
Backup slides
• Supervised learning on 100 networks (only fully connected layers) with different architectures (number of neurons per layer and number of layers) gives an average performance of 10%
• Not enough to find the best network structure
• Move on to convolutional NNs and genetic algorithms, sped up with transfer learning
If we increase either the history (convolutional window) or the tau for a future prediction, the performance decreases
[Plot labels: history; unsurprising; surprising]
The training gets worse because the networks become too complex
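The two parameters can be made concrete with a windowing helper: each training example holds `history` consecutive samples of every input sensor, and the label is the target sensor `tau` samples after the end of the window. This is a sketch of one plausible convention; the exact indexing used in the study is not given on the slide.

```python
import numpy as np


def make_windows(inputs, target, history, tau):
    """Build (window, label) pairs.

    Window i covers inputs[i : i + history]; its label is the target value
    tau samples after the last sample of the window.
    """
    n_windows = inputs.shape[0] - history - tau + 1
    if n_windows <= 0:
        raise ValueError("not enough samples for this history and tau")
    windows = np.stack([inputs[i:i + history] for i in range(n_windows)])
    first_label = history - 1 + tau
    labels = target[first_label:first_label + n_windows]
    return windows, labels


# Tiny made-up example: 10 samples, 2 input sensors.
inputs = np.arange(20.0).reshape(10, 2)
target = 10.0 * np.arange(10.0)
windows, labels = make_windows(inputs, target, history=3, tau=2)
```

A longer history increases the input size of the first convolutional layer, which is consistent with the remark that the networks become more complex and harder to train.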
ANSYS simulations: random stochastic sources and a virtual network of N surface sensors (regular grid or not)
$S_i(F_x, F_y)$; foundations, building …
$\delta\rho = -\nabla \cdot \big(\rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t)\big)$
DNN to predict the full covariance matrix (as if sensors were installed at every grid node) from partial information ($n < N$ sensors)
Optimize the sensor positions by minimizing:
$R(\omega) = 1 - \dfrac{C_{SN}^{\dagger}(\omega) \cdot C_{SS}^{-1}(\omega) \cdot C_{SN}(\omega)}{C_{NN}(\omega)}$
If the simulations are sufficiently varied, we can assume that an approximation of the real thing is within the DNN training scope
Discrete sampling of the sensor – sensor covariance matrix
The neural network reconstructs the full covariance matrix
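The residual $R(\omega)$ is straightforward to evaluate at a single frequency once the covariances are known. The sketch below uses a hand-made 2-sensor example and an exhaustive search over subsets; the numbers and the brute-force search are illustrative assumptions, not the optimization used in the study.

```python
from itertools import combinations

import numpy as np


def relative_residual(c_ss, c_sn, c_nn):
    """R = 1 - C_SN^dagger C_SS^-1 C_SN / C_NN at one frequency."""
    gain = np.vdot(c_sn, np.linalg.solve(c_ss, c_sn))  # vdot conjugates c_sn
    return 1.0 - gain.real / c_nn


def best_subset(c_ss, c_sn, c_nn, n_keep):
    """Exhaustive search for the n_keep sensors giving the smallest residual."""
    def residual_of(subset):
        idx = list(subset)
        return relative_residual(c_ss[np.ix_(idx, idx)], c_sn[idx], c_nn)

    return min(combinations(range(len(c_sn)), n_keep), key=residual_of)


# Made-up covariances for two sensors.
c_ss = np.array([[1.0, 0.5], [0.5, 1.0]])
c_sn = np.array([0.8, 0.4])
c_nn = 1.0

full_residual = relative_residual(c_ss, c_sn, c_nn)
single_best = best_subset(c_ss, c_sn, c_nn, n_keep=1)
```

In this toy case the second sensor adds nothing once the first is used, a small instance of the redundancy that motivates optimizing sensor placement. An exhaustive search scales combinatorially, so a real array would need a smarter optimizer.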
Earth surface
• The “test mass” is placed at (0, 0, 1)
• Full spread of sensors in a regular or irregular grid
• Randomly placed sources (arbitrary number)
• Sources are of the form $\sin(\omega t + \varphi_0 + \varphi_g)$, where:
  $\omega$ is a random value with $f_{min} < \omega / 2\pi < f_{max}$
  $\varphi_0$ is a random phase, $0 < \varphi_0 < 2\pi$
  $\varphi_g$ is a Gaussian random variable with tunable $\sigma$, to add phase noise of varying intensity
• The source signal is delayed ($v_s$ being a generic speed of sound) and attenuated by the distance to the sensor
• Sensors have added Gaussian noise with tunable SNR
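A compact NumPy version of such a model is sketched below (the toy model mentioned in the conclusions is written in MATLAB). The 1/distance attenuation law, the per-sample phase noise, the geometry and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_sensors(sources, sensors, t, rng, f_min=5.0, f_max=30.0,
                     speed=1000.0, phase_sigma=0.1, snr=10.0):
    """Sensor time series from sinusoidal sources, delayed and attenuated."""
    signals = np.zeros((len(sensors), len(t)))
    for position in sources:
        omega = 2.0 * np.pi * rng.uniform(f_min, f_max)
        phi_0 = rng.uniform(0.0, 2.0 * np.pi)
        phi_g = rng.normal(0.0, phase_sigma, size=len(t))   # phase noise
        distance = np.linalg.norm(sensors - position, axis=1)
        delayed_t = t[None, :] - distance[:, None] / speed
        # Assumed attenuation law: amplitude ~ 1 / distance (clipped at 1 m).
        amplitude = 1.0 / np.maximum(distance, 1.0)
        signals += amplitude[:, None] * np.sin(omega * delayed_t + phi_0 + phi_g)
    noise_std = signals.std(axis=1, keepdims=True) / snr    # amplitude SNR
    return signals + noise_std * rng.standard_normal(signals.shape)


rng = np.random.default_rng(0)
t = np.arange(500) / 100.0                                   # 5 s at 100 Hz
grid = np.linspace(-20.0, 20.0, 5)
sensors = np.array([[x, y, 0.0] for x in grid for y in grid])   # surface grid
sources = rng.uniform([-25.0, -25.0, -50.0], [25.0, 25.0, 0.0], size=(10, 3))

data = simulate_sensors(sources, sensors, t, rng)
sensor_covariance = np.cov(data)
```

The resulting sensor covariance matrix is the kind of "full model" quantity from which partial models are later sampled.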
reference mass @ [0 0 1]
[Figure labels: test mass, sensor array, seismic sources, sensor covariance matrix]
• Data: 5 s @ 100 Hz, range [5, 30] Hz, SNR = 10 (amplitude), speed of sound = $10^3$ m/s
• 200 models with 10 sources each, uniformly dispersed in a cube of 50 m side (~ $1.3 \times 10^5$ m$^3$)
• Covariance matrix and vector computed as if the full sensor array were present
1) take a full model
2) sample an arbitrary [low] number of sensors (5 sensors in the figure)
3) build the partial covariance matrix and NN
4) form the composite vector
5) map onto the eigenvectors
6) pass through the neural network ensemble
7) remap into the composite vector
8) transform into the covariance matrix (CM)
[Figure: sensor covariance matrix and Newtonian noise / sensor covariance, with 95% confidence interval]