Alessio Cirone 11/09/2019

Zoom on Virgo

Advanced Virgo is a laser interferometer devoted to the detection of gravitational waves of astrophysical and cosmological origin. Its detection power depends entirely on the spectral sensitivity of the instrument.


Newtonian noise of seismic origin

It would be a limiting noise source for AdV+ in the 10-40 Hz band.



Seismic waves -> density fluctuations of the surrounding media (rocks, air, water…) -> gravity gradients -> Newtonian attraction -> test-mass displacement


A Newtonian noise cancellation scheme can be realized using an array of seismometers deployed near the test masses.

Reference: M. W. Coughlin et al., PRL, 2018.

What are we missing?

1. Newtonian noise has never been directly measured
• Theoretical models
• Semi-realistic FE simulations (including soil properties and the surrounding infrastructure)

2. Sensors
• Optimized placement (number and position)
• Type selection (accelerometers and/or tiltmeters)
• Underground detectors?

3. Data processing
• A second feasible subtraction algorithm in addition to the Wiener filter (the gold standard)


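[Figure: FE model (foundations, building …) with labels $S_j(F_x, F_y)$]

$\delta\rho(\mathbf{r}, t) = -\nabla \cdot \left( \rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t) \right)$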











Sensors deployed:

• 38 inside the WEB (2018)
• 38 inside the NEB (2019)

Are they all needed?











E. Calloni, Internal note: VIR-0637A-19

M. Beker et al, 2016, TLE











Einstein Telescope

F. Badaracco and J. Harms, CQG, 2019.











Deep Neural Network (DNN)

[Figure: network diagram with nodes labelled $F_x$, $F_y$, $S_1$, $S_2$, $S_3$, …, $S_k$]

Robust with respect to terrain parameters and to sensor placement, number and type











Part 1
Prediction of one sensor displacement from the entire array (Neural network compared to Wiener filter)

Part 2
Optimal sensor configuration from partial information (Ansys simulations + machine learning)


The Neural Network (DNN) is compared with the Wiener filter (WF)

Workflow:
Dataset -> Pre-processing (resampling, filtering, normalization …) -> Train/test set and Prediction set
• DNN branch: Genetic Algorithm -> Best DNNs -> Residuals
• WF branch: WF -> Residuals
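For readers unfamiliar with the WF branch of this workflow, the following is a minimal sketch of a multichannel FIR Wiener filter fitted by least squares in the time domain. The function names (`lagged_matrix`, `fit_wiener`, `predict_wiener`), the number of taps and the synthetic data are assumptions made for illustration; this is not the implementation behind the results in these slides.

```python
import numpy as np

def lagged_matrix(x, n_taps):
    """Stack n_taps delayed copies of every channel of x (samples x channels)."""
    n_samples, n_channels = x.shape
    rows = n_samples - n_taps + 1
    cols = [x[n_taps - 1 - k : n_taps - 1 - k + rows, c]
            for c in range(n_channels) for k in range(n_taps)]
    return np.column_stack(cols)

def fit_wiener(x, y, n_taps):
    """Least-squares FIR coefficients predicting y from the lagged inputs."""
    a = lagged_matrix(x, n_taps)
    coeffs, *_ = np.linalg.lstsq(a, y[n_taps - 1:], rcond=None)
    return coeffs

def predict_wiener(x, coeffs, n_taps):
    """Apply the fitted coefficients to a new stretch of input data."""
    return lagged_matrix(x, n_taps) @ coeffs

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 3))           # synthetic "sensor array"
# Synthetic target: channel 0 now plus channel 1 delayed by one sample.
# (np.roll wraps around at index 0, but that sample is never used below.)
y = 0.7 * x[:, 0] + 0.3 * np.roll(x[:, 1], 1)

n_taps = 4                                   # assumed filter length
coeffs = fit_wiener(x[:1400], y[:1400], n_taps)
pred = predict_wiener(x[1400:], coeffs, n_taps)
truth = y[1400 + n_taps - 1:]
relative_residual = (truth - pred).std() / truth.std()
```

Because the synthetic target is an exact linear combination of lagged inputs, the residual here is at the level of rounding error; real seismic data would leave a finite residual.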


Complete WEB/NEB datasets (from 13 days to a few months)

Two 1-hour segments, randomly selected along the time axis:
• 1 hour for train/test (70 % for train, 30 % for test)
• 1 hour for prediction

Same for DNN and WF
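The data selection described above can be sketched as follows. The sampling rate, the helper name `pick_two_segments` and the chronological (rather than shuffled) 70/30 split are assumptions, since the slides do not specify them.

```python
import numpy as np

def pick_two_segments(n_samples, segment_len, rng):
    """Return start indices of two non-overlapping segments chosen at random."""
    while True:
        a, b = rng.integers(0, n_samples - segment_len + 1, size=2)
        if abs(int(a) - int(b)) >= segment_len:
            return int(a), int(b)

fs = 50                       # assumed sampling rate in Hz (hypothetical)
hour = 3600 * fs              # samples in one hour
n_samples = 13 * 24 * hour    # 13 days of data
rng = np.random.default_rng(1)

fit_start, pred_start = pick_two_segments(n_samples, hour, rng)

# 70 % / 30 % split of the first hour, done with integer arithmetic
n_train = hour * 7 // 10
train_idx = np.arange(fit_start, fit_start + n_train)
test_idx = np.arange(fit_start + n_train, fit_start + hour)
# The second hour is kept aside for the final prediction
pred_idx = np.arange(pred_start, pred_start + hour)
```

The same index sets would then be passed to both the DNN and the WF, so that the two methods are compared on identical data.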


Network architectures (input: n sensors)

Basic sequential model: (Conv. 1D – Activation – Max Pooling 1D) × 2

[Figure: successive builds of the architecture diagram. Alternative blocks, separated by "/" in the diagram: Conv. 1D – Activation; Fully connected – Activation; Fully connected – Batch Norm. – Activation. The last build adds a Global Average Pooling 1D block and a further Fully connected – Activation block, joined by "+".]

• Test cluster of about 140 cores for distributed processing
• Virtual environment with the Anaconda Python distribution
• Robust, portable environment for Virgo, to run on site and act offline for Newtonian noise subtraction
• At first the DNN is trained on the simulated data (process done on all the CPUs in parallel)

Good foundations: Python-based, open access, CPU & GPU friendly

We use the Python-based Keras library with the TensorFlow backend, plus some extra packages such as SCOOP (Scalable COncurrent Operations in Python) and DEAP (Distributed Evolutionary Algorithms in Python)


Evolutionary algorithm

Survival of the fittest, looking for good performance and low network complexity

Cycle:
New networks to be tested (variable number of neurons, number of layers, layer types …) -> Fit, Evaluate, Predict -> Select the best networks -> Mate and mutate -> New networks to be tested

Hall of fame: the best networks, selected for performance & simplicity
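The cycle above can be illustrated with a toy evolutionary loop in plain Python. Everything here is a simplified stand-in: the genome encoding, the `fitness` function (a placeholder for "fit, evaluate, predict" on real data, with a penalty on the number of layers) and the population sizes are assumptions, and the sketch does not use DEAP or Keras.

```python
import random

LAYER_TYPES = ["conv1d", "dense"]
UNIT_CHOICES = [8, 16, 32, 64]

def random_network(rng):
    """A genome is a list of (layer_type, n_units) pairs."""
    return [(rng.choice(LAYER_TYPES), rng.choice(UNIT_CHOICES))
            for _ in range(rng.randint(1, 5))]

def fitness(genome):
    """Placeholder score: rewards a total of 64 units, penalises depth."""
    error = abs(sum(units for _, units in genome) - 64)
    return -(error + 2 * len(genome))

def mate(a, b, rng):
    """One-point crossover on the two layer lists."""
    return a[:rng.randint(1, len(a))] + b[rng.randint(0, len(b)):]

def mutate(genome, rng):
    """Replace one randomly chosen layer by a new random one."""
    child = list(genome)
    child[rng.randrange(len(child))] = (rng.choice(LAYER_TYPES),
                                        rng.choice(UNIT_CHOICES))
    return child

rng = random.Random(0)
population = [random_network(rng) for _ in range(20)]
hall_of_fame = max(population, key=fitness)
initial_best = fitness(hall_of_fame)

for generation in range(30):
    # Select the best networks, then mate and mutate them
    best = sorted(population, key=fitness, reverse=True)[:5]
    hall_of_fame = max(best + [hall_of_fame], key=fitness)
    children = [mutate(mate(rng.choice(best), rng.choice(best), rng), rng)
                for _ in range(15)]
    population = best + children        # the selected parents survive

hall_of_fame = max(population + [hall_of_fame], key=fitness)
```

Keeping the selected parents in the next generation means the best score can never decrease from one generation to the next.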


• In the GA, new networks are created by "mating" two high-performance networks
• The new networks inherit some properties from both parents, in particular their weights, thereby applying a heuristic form of transfer learning

[Figure: two parent networks joined at the crossover cut, with newly initialized weights]
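One possible reading of the mating step is sketched below: the child takes the first layers of one parent and the last layers of the other, keeps the inherited weights, and re-initializes only the layer at the cut when its input size no longer matches. The layer representation (plain dictionaries) and the rule used at the cut are assumptions made for illustration, not the scheme actually used in this work.

```python
import random

def make_layer(n_inputs, units, rng):
    """Hypothetical dense layer: a dict holding sizes and a weight matrix."""
    weights = [[rng.uniform(-1, 1) for _ in range(units)]
               for _ in range(n_inputs)]
    return {"n_inputs": n_inputs, "units": units, "weights": weights}

def crossover_with_weights(parent_a, parent_b, cut_a, cut_b):
    """Child = layers [0, cut_a) of parent_a + layers [cut_b, end) of parent_b.

    Inherited layers keep their trained weights. The first layer after the
    cut gets weights=None (to be newly initialized) if its input size does
    not match the output size of the layer before it.
    """
    head = [dict(layer) for layer in parent_a[:cut_a]]
    tail = [dict(layer) for layer in parent_b[cut_b:]]
    if head and tail and tail[0]["n_inputs"] != head[-1]["units"]:
        tail[0]["n_inputs"] = head[-1]["units"]
        tail[0]["weights"] = None
    return head + tail

rng = random.Random(0)
parent_a = [make_layer(10, 32, rng), make_layer(32, 16, rng), make_layer(16, 1, rng)]
parent_b = [make_layer(10, 64, rng), make_layer(64, 8, rng), make_layer(8, 1, rng)]

child = crossover_with_weights(parent_a, parent_b, cut_a=2, cut_b=2)
```

In this example the child reuses the first two trained layers of `parent_a`, while the output layer taken from `parent_b` is flagged for fresh initialization because its input size changes from 8 to 16.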


• The DNN takes the time series of the sensor array as input and a single, separate sensor as output

Time    Sensor 1    Sensor 2    …    Sensor k    Target sensor
0.01    1.24E-11    1.26E-13    …    2.72E-14    2.30E-12
0.02    3.16E-10    5.31E-13    …    9.48E-13    5.11E-11
0.03    3.66E-09    4.03E-13    …    1.57E-11    4.68E-12

[Figure: array map showing the target sensor (30) and the selected input sensors]

• Monte Carlo (MC) simulation with n input sensors out of N = 36
• Random data selection for each MC choice
• A single predictor for DNN is selected: the mean performance value is taken from the best DNNs
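The Monte Carlo selection of input sensors can be sketched as below. The sensor indices, the number of trials and the `placeholder_residual` function (which stands in for training and evaluating a DNN or WF on the chosen sensors) are hypothetical.

```python
import random
import statistics

def monte_carlo_subsets(candidates, n, n_trials, evaluate, rng):
    """Draw n_trials random subsets of n sensors and score each one."""
    results = []
    for _ in range(n_trials):
        subset = tuple(sorted(rng.sample(candidates, n)))
        results.append((subset, evaluate(subset)))
    return results

def placeholder_residual(subset):
    """Stand-in for the residual obtained with a given set of input sensors."""
    return 1.0 / len(subset)

rng = random.Random(0)
candidates = list(range(36))      # hypothetical indices for the N = 36 sensors
results = monte_carlo_subsets(candidates, n=10, n_trials=50,
                              evaluate=placeholder_residual, rng=rng)
mean_residual = statistics.mean(score for _, score in results)
```

In a real run, `evaluate` would also draw a random data segment for each subset, as stated in the second bullet above.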


[Figure: example of seismic spectra of the target channel and the predicted output with DNN and WF, together with the residuals and the residual mean (entire array as input)]


[Figure: mean residual distributions for DNN and WF, as a function of the n selected sensors out of 36; values annotated on the plot: 3.34, 0.66, 0.61, 3.16]

• DNN statistically better than WF
• Redundancy: even with few sensors we can achieve residuals comparable to the full-array case, if the sensors are properly selected
• The higher the sensor-target correlation, the better the residuals




• High correlation between DNN and WF results (Pearson's correlation: 0.736)
• There are also some cases in which the WF performs much worse than the DNN
• This behaviour is under investigation

$\frac{1}{2}(\mathrm{DNN} + \mathrm{WF})$ can be competitive with, an alternative to, or complementary to the WF alone, even in a sub-optimal sensor configuration
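As a small illustration of why the average of the two predictors can help, the sketch below builds two hypothetical predictions with independent errors, averages them, and computes their Pearson correlation. The synthetic data and error levels are assumptions; with strongly correlated errors the gain from averaging would be smaller.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def rms(x):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(x ** 2)))

rng = np.random.default_rng(0)
target = rng.standard_normal(5000)
# Two hypothetical predictors with independent errors of equal size
pred_dnn = target + 0.5 * rng.standard_normal(5000)
pred_wf = target + 0.5 * rng.standard_normal(5000)
pred_avg = 0.5 * (pred_dnn + pred_wf)

res_dnn = rms(target - pred_dnn)
res_wf = rms(target - pred_wf)
res_avg = rms(target - pred_avg)
corr = pearson(pred_dnn, pred_wf)
```

With independent errors the averaged residual is smaller than either individual residual, by roughly a factor of the square root of two.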


Time domain and frequency domain

10 input sensors: [4, 13, 15, 17, 19, 27, 29, 31, 36, 37]
Target sensor: 30

Complete WEB dataset (13 days); train/test and prediction segments randomly selected along the time axis

• Time domain: input = time history (~ ms), output = time to predict
• Frequency domain: input = FFT (time window: 40 s to 1000 s), output = FFT to predict


• Better residuals in the frequency domain, both for DNN and WF
• What if we increase the time window?

Time window for the FFT: 72 s


• It turns out that the frequency-domain WF performance strongly depends on the time window length
• In contrast, the DNN seems to follow a constant trend
• In order to explore longer time windows, we would need more computational resources


Part 1
Prediction of one sensor displacement from the entire array (Neural network compared to Wiener filter)

Part 2
Optimal sensor configuration from partial information (Ansys simulations + machine learning)


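Many ANSYS simulations: random stochastic sources and a virtual network of N surface sensors (regular grid or not)

[Figure: FE model (foundations, building …) with labels $S_j(F_x, F_y)$]

$\delta\rho(\mathbf{r}, t) = -\nabla \cdot \left( \rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t) \right)$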


• Problem
Material = reinforced concrete = concrete + steel => $E$, $\nu$ unknown

• Simplification
We divide the structure into macro-elements (plinths, foundation beams, slabs …)
For each macro-element, we consider the percentage of steel (0.5 – 2.5 %)
We homogenize the cross-section, enlarge the dimensions and keep the elastic modulus of the concrete.


• Full covariance matrix (as if the sensors were installed at every grid node)
• Sensor – Newtonian noise covariance


• From each model, generate several partial models with a variable number of sensors randomly picked from the original grid (covering $0.1 < n/N < 0.99$)
• A total of 10k variants computed on 200 models

[Figure: FULL model and derived PARTIAL models]
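A minimal sketch of how partial models could be derived from one full model is given below. The grid size, the synthetic covariance data and the function name `partial_models` are hypothetical; only the fraction range $0.1 < n/N < 0.99$ and the figure of 50 variants per model (10k variants over 200 models) are taken from the text.

```python
import numpy as np

def partial_models(c_ss, c_sn, n_variants, rng, lo=0.1, hi=0.99):
    """Yield (indices, partial C_SS, partial C_SN) for random sensor subsets."""
    n_total = c_ss.shape[0]
    for _ in range(n_variants):
        frac = rng.uniform(lo, hi)                 # fraction n / N to keep
        n = max(1, int(round(frac * n_total)))
        idx = np.sort(rng.choice(n_total, size=n, replace=False))
        yield idx, c_ss[np.ix_(idx, idx)], c_sn[idx]

rng = np.random.default_rng(0)
n_total = 40                                       # hypothetical grid size N
# Synthetic data: 40 "sensor" columns plus one column playing the NN role
data = rng.standard_normal((500, n_total + 1))
cov = np.cov(data, rowvar=False)
c_ss = cov[:n_total, :n_total]                     # sensor-sensor covariance
c_sn = cov[:n_total, n_total]                      # sensor-NN covariance

variants = list(partial_models(c_ss, c_sn, n_variants=50, rng=rng))
```

Each variant keeps track of the selected indices, so the partial matrices can later be embedded back into the full grid layout.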


Composite vector

PCA to extract the important features and reduce the datasets

51 scores on the eigenvectors
10k samples
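The PCA step can be sketched with a plain SVD, as below. The sample count, vector length and synthetic data are stand-ins for the real composite vectors; only the number of retained scores (51) comes from the slide.

```python
import numpy as np

def pca_fit(samples, n_components):
    """Return the mean and the leading principal directions (as rows)."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_scores(samples, mean, components):
    """Project the samples onto the principal directions."""
    return (samples - mean) @ components.T

def pca_reconstruct(scores, mean, components):
    """Map the scores back to the original (composite-vector) space."""
    return scores @ components + mean

rng = np.random.default_rng(0)
# Hypothetical composite vectors: 1000 samples of length 200 that live
# in a 51-dimensional subspace by construction
latent = rng.standard_normal((1000, 51))
mixing = rng.standard_normal((51, 200))
samples = latent @ mixing

mean, components = pca_fit(samples, n_components=51)
scores = pca_scores(samples, mean, components)
max_error = np.abs(pca_reconstruct(scores, mean, components) - samples).max()
```

Here the reconstruction is exact up to rounding because the synthetic data have rank 51; for real data the number of components sets the trade-off between compression and reconstruction error.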


DNN to predict the full covariance matrix from partial information ($n < N$ sensors)

[Figure panels: Full, Partial]

To be continued …


Part 1:
• The Wiener filter is the best predictor for stationary linear stochastic signals.
• We are investigating possible causes (e.g. transients or non-idealities in the data, the WF implementation, …) that could explain the better performance of the DNNs.
• DNN drawback: Virgo has to be limited by Newtonian noise in that frequency band, so that we can take it as the target signal for prediction. In contrast, the WF uses the correlations between seismic sensors and test masses, which allows long integration over time.

Part 2:
• Simulations in progress
• MATLAB toy model for parameter tuning and initial efficiency estimation is ready
• Post-processing and DNN in the exploratory phase



Thank you for your attention!

Backup slides


• Supervised learning on 100 networks (only fully connected layers) with different architectures (number of neurons per layer and number of layers) gives an average performance of 10%
• Not enough to find the best network structure

Move on to convolutional NNs and genetic algorithms, sped up with transfer learning


If we increase either the history (convolutional window) or the tau for a future prediction, the performance decreases

[Figure labels: history; Unsurprising; Surprising]

The training gets worse -> the networks become too complex


ANSYS simulations: random stochastic sources and a virtual network of N surface sensors (regular grid or not)

DNN to predict the full covariance matrix (as if the sensors were installed at every grid node) from partial information ($n < N$ sensors)

Optimize the sensor positions by minimizing:

$$R(\omega) = 1 - \frac{C_{SN}^{\dagger}(\omega) \cdot C_{SS}^{-1}(\omega) \cdot C_{SN}(\omega)}{C_{NN}(\omega)}$$

If the simulations are sufficiently varied, we can assume that an approximation of the real thing is within the DNN training scope

[Figure: FE model (foundations, building …) with labels $S_i(F_x, F_y)$ and $\delta\rho(\mathbf{r}, t) = -\nabla \cdot \left( \rho_0(\mathbf{r})\, \boldsymbol{\xi}(\mathbf{r}, t) \right)$]

Discrete sampling of the sensor – sensor covariance matrix -> the Neural Network reconstructs the full covariance matrix
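The residual $R(\omega)$ defined above can be evaluated at a single frequency as in the following sketch. The synthetic Fourier amplitudes, the way the spectra are estimated by averaging, and the function name `relative_residual` are assumptions made for illustration.

```python
import numpy as np

def relative_residual(c_ss, c_sn, c_nn):
    """R = 1 - C_SN^dagger C_SS^{-1} C_SN / C_NN at one frequency."""
    explained = np.vdot(c_sn, np.linalg.solve(c_ss, c_sn)).real
    return 1.0 - explained / c_nn

rng = np.random.default_rng(0)
n_sensors, n_avg = 6, 4000
# Hypothetical complex Fourier amplitudes at one frequency bin
sensors = (rng.standard_normal((n_avg, n_sensors))
           + 1j * rng.standard_normal((n_avg, n_sensors)))
mixing = rng.standard_normal(n_sensors)
noise = rng.standard_normal(n_avg) + 1j * rng.standard_normal(n_avg)
# Synthetic "Newtonian noise": a mix of the sensor signals plus extra noise
nn = sensors @ mixing + 0.1 * noise

c_ss = sensors.conj().T @ sensors / n_avg      # sensor-sensor spectra
c_sn = sensors.conj().T @ nn / n_avg           # sensor-NN cross spectra
c_nn = float(np.mean(np.abs(nn) ** 2))         # NN power spectrum

r_full = relative_residual(c_ss, c_sn, c_nn)
r_three = relative_residual(c_ss[:3, :3], c_sn[:3], c_nn)
```

Using only the first three sensors can never give a smaller residual than using all six, which is the property an optimization over sensor positions exploits: it looks for the subset or layout that loses the least.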


• Earth surface; the "test mass" is placed at (0, 0, 1)
• Full spread of sensors on a regular or irregular grid
• Randomly placed sources (arbitrary number)
• Sources are of the form $\sin(\omega t + \varphi_0 + \varphi_g)$, where:
  $\omega$ is a random value with $f_{min} < \omega / 2\pi < f_{max}$
  $\varphi_0$ is a random phase, $0 < \varphi_0 < 2\pi$
  $\varphi_g$ is a Gaussian random variable with tunable $\sigma$, adding phase noise of various intensity
• The source signal is delayed ($v_s$ being a generic speed of sound) and attenuated by the distance to the sensor
• Sensors have added Gaussian noise with tunable SNR
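The source and sensor model described above can be sketched as follows. The 1/distance attenuation law, the interpolation used for the delay, the positions and all numerical values are illustrative assumptions; the slides only state that the signal is delayed and attenuated with distance.

```python
import numpy as np

def source_signal(t, omega, phi0, sigma_phase, rng):
    """sin(omega t + phi0 + phi_g) with Gaussian phase noise phi_g."""
    phi_g = sigma_phase * rng.standard_normal(t.shape)
    return np.sin(omega * t + phi0 + phi_g)

def sensor_record(t, source, distance, speed, snr, rng):
    """Delay and attenuate a source signal, then add Gaussian sensor noise."""
    delayed = np.interp(t - distance / speed, t, source, left=0.0)
    clean = delayed / max(distance, 1.0)       # assumed 1/d attenuation
    noise = rng.standard_normal(t.shape) * clean.std() / snr
    return clean + noise

fs, duration = 100.0, 5.0                      # assumed sampling and length
t = np.arange(int(fs * duration)) / fs
rng = np.random.default_rng(0)

freq = rng.uniform(5.0, 30.0)                  # f_min < omega / 2 pi < f_max
phi0 = rng.uniform(0.0, 2.0 * np.pi)
src = source_signal(t, 2.0 * np.pi * freq, phi0, sigma_phase=0.1, rng=rng)

src_pos = rng.uniform(-25.0, 25.0, size=3)     # hypothetical source position
sensor_pos = np.array([10.0, 0.0, 0.0])        # hypothetical sensor position
distance = float(np.linalg.norm(sensor_pos - src_pos))
record = sensor_record(t, src, distance, speed=1.0e3, snr=10.0, rng=rng)
```

Summing such records over all sources for every sensor, and doing the same for the test-mass location, would give the time series from which the covariance matrices are estimated.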


[Figure: test mass (reference mass at [0 0 1]), sensor array and seismic sources, with the resulting sensor covariance matrix]

• DATA: 5 s @ 100 Hz, range [5, 30] Hz, SNR = 10 (amplitude), speed of sound = $10^3$ m/s
• 200 models with 10 sources each, uniformly dispersed in a cube of 50 m side (~ $1.3 \times 10^5\ \mathrm{m}^3$)
• Covariance matrix & vector computed as if the full sensor array were present


1) take a full model
2) sample an arbitrary [low] number of sensors (5 sensors in the example shown)
3) build the partial covariance matrix and NN
4) form composite vector
5) map onto eigenvectors
6) pass through Neural Network ensemble
7) remap into composite
8) transform into CM


[Figure panels: sensors covariance matrix; Newtonian noise / sensors covariance; 95% confidence interval]
