Post on 14-Apr-2018
7/29/2019 B4_Detetcion_Kadambe
1/106
Detection & Classification of RF signals and,
Physical and Network Layer Behavior of
Software Defined/Cognitive Radios
Dr. Shubha Kadambe
Advanced Technology Center, Rockwell Collins
400 Collins Rd., NE
Cedar Rapids, IA 52498
slkadamb@rockwellcollins.com
310-263-8455
Why?
Detection & classification of RF signals is front-end processing for
Geolocation for networking radios
Interoperability, i.e., making two different radios talk to each other
Spectrum sensing and management
Detection and classification of physical and network layer behavior is needed for
Security
Spectral management and dominance
Why?
Detection and Classification is a classical problem that has many
applications
In particular, for RF signals the applications are
Military: Signal intelligence (SIGINT), Electronic intelligence (ELINT) and Communication intelligence (COMINT), Electronic Warfare
Commercial: Software defined radio (SDR)/Cognitive Radio (CR)
In military applications it is a challenging task since
New RF threats are introduced every day
Friendly forces should have spectral dominance in the presence of hostile signals
The spectrum of these signals may range from high frequency (HF) to the millimeter frequency band, and their format can vary from simple narrowband modulations to wideband schemes.
Techniques need to operate in real time to make critical decisions quickly in electronic warfare & tactical operations.
Why?
In software defined radio (SDR)/CR
Information is transmitted to reconfigure the SDR system.
These techniques can be used with an intelligent transceiver to increase efficiency by reducing the overhead
They should be able to operate at very low SNR & in the presence of interferers
Detection
Problem formulation
A general detection problem corresponds to:
Choosing between two hypotheses
$$H_0: y(t) = n(t), \quad 0 \le t \le T$$
$$H_1: y(t) = s(t) + n(t), \quad 0 \le t \le T$$
where $s(t)$ is the signal of interest and $n(t)$ is random noise
Common forms of signals are:
Completely known: $s(t) = m(t)$
Known except for a few parameters, such as:
$$s(t) = A(t)\cos(\omega_0 t + \theta_0)$$
where some combination of $\{A, \omega_0, \theta_0\}$ may be unknown or random
Purely stochastic: $s(t) = z(t)$
Common assumptions about the noise are:
Zero mean white Gaussian
Zero mean colored with a white Gaussian component
Purely colored with zero mean (generally not used)
Detection: deterministic known signal
In general, a decision is made by
Deriving a statistic based on n(t)
Comparing it to a preset threshold
If $n(t)$ is white Gaussian noise with mean 0 and variance $\sigma_N^2$, then the pdfs under the two hypotheses can be shown to be:
$$p(y(t) \mid H_0) = (2\pi\sigma_N^2)^{-K/2} \exp\left(-\frac{1}{2\sigma_N^2}\sum_{k=1}^{K} y_k^2\right)$$
$$p(y(t) \mid H_1) = (2\pi\sigma_N^2)^{-K/2} \exp\left(-\frac{1}{2\sigma_N^2}\sum_{k=1}^{K} (y_k - s_k)^2\right)$$
where $y_k$ & $s_k$ are sampled versions of $y(t)$ & $s(t)$
The likelihood ratio is given by:
$$\Lambda(y) = \frac{p(y(t) \mid H_1)}{p(y(t) \mid H_0)}$$
By considering $\ln \Lambda(y)$ & substituting for the pdfs we get:
$$\ln \Lambda(y) = \frac{1}{\sigma_N^2}\sum_{k=1}^{K} y_k s_k - \frac{1}{2\sigma_N^2}\sum_{k=1}^{K} s_k^2$$
Detection: deterministic known signal
Using the previous equation, the detection hypothesis test is:
Choose $H_1$ if:
$$\frac{1}{\sigma_N^2}\sum_{k=1}^{K} y_k s_k \ge \ln \Lambda_0 + \frac{1}{2\sigma_N^2}\sum_{k=1}^{K} s_k^2$$
where $\Lambda_0$ corresponds to the likelihood ratio for $H_0$.
For the continuous case it is:
Choose $H_1$ if:
$$\frac{1}{\sigma_N^2}\int y(t)\, s^*(t)\, dt \ge \ln \Lambda_0 + \frac{1}{2\sigma_N^2}\int |s(t)|^2\, dt$$
Detection: deterministic known signal
From the previous equation it can be seen that the processing needed is:
Correlating the stored signal s(t) with the received signal y(t), or
Passing y(t) through a filter matched to s(t)
This is the matched filter approach of the early radar literature
It is the optimum detector from the decision theoretic point of view
Matched filtering provides the optimum solution but is not very realistic, since the signal is not completely known in practice
It provides a theoretical bound and is used to compare detectors
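The correlator form of the known-signal detector can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampled test signal (a cosine), the noise variance, the zero log-threshold and the function names are all hypothetical choices, not taken from the slides.

```python
import numpy as np

def matched_filter_statistic(y, s, noise_var):
    """Correlate the received samples y with the known signal s, scaled by 1/sigma_N^2."""
    return np.dot(y, s) / noise_var

def detect_known_signal(y, s, noise_var, ln_threshold=0.0):
    """Choose H1 (return True) when the correlator output exceeds
    ln(threshold) plus half the signal energy scaled by 1/sigma_N^2."""
    statistic = matched_filter_statistic(y, s, noise_var)
    threshold = ln_threshold + np.dot(s, s) / (2.0 * noise_var)
    return bool(statistic > threshold)

# Hypothetical example: a sampled cosine in unit-variance white Gaussian noise.
rng = np.random.default_rng(0)
k = np.arange(256)
s = np.cos(2 * np.pi * 0.05 * k)
noise_var = 1.0
noise = rng.normal(0.0, np.sqrt(noise_var), size=k.size)
decision_h1 = detect_known_signal(s + noise, s, noise_var)   # signal present
decision_h0 = detect_known_signal(noise, s, noise_var)       # noise only
```

Correlating against the stored signal and filtering with the time-reversed signal give the same statistic at the sampling instant, which is why the slide treats the two as interchangeable.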
Detection: unknown signal parameters
A generalization of the detection problem arises when the signal s(.) has unknown parameters.
In this case the hypotheses are:
$$H_0: y(t) = n(t), \quad 0 \le t \le T$$
$$H_1: y(t) = s(t; \theta) + n(t), \quad 0 \le t \le T$$
where $\theta$ is a vector of unknown parameters
This leads to the Generalized Likelihood Ratio Test (GLRT), and the GLRT is given by:
$$\Lambda_{GLRT} = \max_{\theta} \frac{y^T P(\theta)\, y}{\sigma_N^2} + \frac{y^T y}{\sigma_N^2}$$
GLRT detector block diagram (Estimator-Correlator)
received signal y(t) -> segment the signal -> two branches:
Max. likelihood signal parameter estimator: $\max_{\theta} y^T P(\theta)\, y$, scaled by $\sigma_N^2$
Correlator: $y^T y$, scaled by $\sigma_N^2$
The two branch outputs are summed and compared with the threshold T
yes: signal present
no: signal absent
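GLRT special case
Consider a signal with uniformly distributed random phase:
$$s(t; \theta) = A(t)\cos(\omega_0 t + \phi(t) + \theta), \quad p(\theta) = \frac{1}{2\pi}, \quad 0 \le \theta \le 2\pi$$
with $A(t)$, $\phi(t)$ slowly varying compared to $\cos(\omega_0 t)$.
Under this narrowband assumption it can be shown that:
$$\Lambda = \exp\left(-\frac{E}{2\sigma_N^2}\right) I_0\left(\frac{1}{\sigma_N^2}\left[V_c^2 + V_s^2\right]^{1/2}\right)$$
where
$$E = \int s^2(t)\, dt = \frac{1}{2}\int A^2(t)\, dt$$
$I_0$ is the zeroth order modified Bessel function of the first kind, and
$$V_c = \int y(t)\, A(t)\cos(\omega_0 t + \phi(t))\, dt$$
$$V_s = \int y(t)\, A(t)\sin(\omega_0 t + \phi(t))\, dt$$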
GLRT special case
The main operation on the received signal is:
$$\left[V_c^2(t) + V_s^2(t)\right]^{1/2}$$
which corresponds to the output of a filter matched to the narrowband signal $A(t)\cos(\omega_0 t + \phi(t))$
Note that analysis of narrowband signals using a complex envelope was first suggested by Gabor
Woodward used this for signal detection in radar and developed an ambiguity function to understand the resolution limits of radar
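The quadrature operation above can be illustrated with a short Python sketch. It assumes uniformly sampled data, approximates the integrals by sums, and uses hypothetical names and parameter values; it is not the author's implementation.

```python
import numpy as np

def quadrature_envelope(y, amplitude, phase, omega0, dt):
    """Return sqrt(Vc^2 + Vs^2) for received samples y.

    amplitude and phase play the roles of A(t) and phi(t); they may be
    scalars or arrays of the same length as y. Integrals are approximated
    by Riemann sums with step dt (an assumption of this sketch).
    """
    t = np.arange(y.size) * dt
    arg = omega0 * t + phase
    v_c = np.sum(y * amplitude * np.cos(arg)) * dt   # in-phase correlation
    v_s = np.sum(y * amplitude * np.sin(arg)) * dt   # quadrature correlation
    return np.hypot(v_c, v_s)

# Hypothetical check: the envelope does not depend on the unknown phase theta.
dt, n_samples, omega0 = 1e-3, 1000, 2 * np.pi * 50.0
t = np.arange(n_samples) * dt
env_a = quadrature_envelope(np.cos(omega0 * t + 0.3), 1.0, 0.0, omega0, dt)
env_b = quadrature_envelope(np.cos(omega0 * t + 2.0), 1.0, 0.0, omega0, dt)
```

Because the two correlations are combined as a magnitude, the unknown carrier phase drops out, which is the reason this structure suits the random-phase case.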
Detection
time-frequency domain
Estimating signal parameters can be transferred to time-frequency domain
Feature selection easier
Can reduce noise effect
Several time-frequency distributions exist
Wigner
Windowed spectrum
Gabor
Choi-Williams
Reduced interference distribution (RID)
We consider one that tries to reduce the cross-terms
Cross-term Deleted Wigner Representation (CDWR)
Definitions - 1
For a given signal x(t), the Gabor coefficients and the Wigner distribution (WD) are defined as:
$$G_x(m,n) = \int x(t)\, \gamma_{m,n}^*(t)\, dt$$
$$W_x(t,\omega) = \int x\left(t + \frac{\tau}{2}\right) x^*\left(t - \frac{\tau}{2}\right) e^{-j\omega\tau}\, d\tau$$
respectively, where $\gamma(t)$ is an analysis window that is biorthogonal to the synthesis window $h(t)$, and $\gamma_{m,n}(t)$, $h_{m,n}(t)$ are their time- and frequency-shifted versions.
Complementary Gabor coefficients $\hat{G}_x(m,n)$ can be obtained by reversing the roles of $h$ and $\gamma$.
Using these coefficients, x(t) can be expanded as:
$$x(t) = \sum_{m,n} G_x(m,n)\, h_{m,n}(t) = \sum_{m,n} \hat{G}_x(m,n)\, \gamma_{m,n}(t)$$
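Definitions - 2
Substituting the Gabor expansions of x(t) in the WD definition, after some algebraic simplifications it can be shown that:
$$W_x(t,\omega) = \sum_{m,n} \sum_{p,q} G_x(m,n)\, \hat{G}_x^*(p,q)\, W_{h_{m,n},\gamma_{p,q}}(t,\omega)$$
where each $W_{h_{m,n},\gamma_{p,q}}$ is the cross-WD of two shifted windows: a copy of the window WD centered midway between the time-frequency locations $(m,n)$ and $(p,q)$, multiplied by an oscillating phase factor that depends on $(m-p)$ and $(n-q)$.
The auto-WD terms are those with m = p, n = q.
Retaining only the auto terms gives the crossterm deleted cross biorthogonal representation (XBIO)
When $\hat{G}_x(m,n) = G_x(m,n)$, the XBIO is a time-frequency representation (TFR): the Crossterm Deleted Wigner Representation (CDWR)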
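Definitions of XBIO & CDWR
The XBIO of x(t) is:
$$XB_x(t,\omega) = \sum_{m,n} G_x(m,n)\, \hat{G}_x^*(m,n)\, W_{h_{m,n},\gamma_{m,n}}(t,\omega)$$
Similarly, the CDWR of x(t) is:
$$CDWR_x(t,\omega) = \sum_{m,n} |G_x(m,n)|^2\, W_{h_{m,n}}(t,\omega)$$
where the subscript $m,n$ denotes the window WD shifted to the $(m,n)$-th point of the time-frequency lattice.
CDWR is a special case of XBIO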
Example: The CDWR of a linear chirp
Detection
Detectors:
Matched filter:
$$\eta_{mf} = \int r(t)\, s^*(t)\, dt$$
Auto-CDWR:
$$\eta_{acdwr} = \iint CDWR_{r,r}(t,f)\, CDWR_{s,s}^*(t,f)\, dt\, df = 8\left|\int r(t)\, s^*(t)\, dt\right|^2$$
Cross-CDWR:
$$\eta_{xcdwr} = \iint CDWR_{r,s}(t,f)\, CDWR_{s,s}^*(t,f)\, dt\, df = 8A\left|\int r(t)\, s^*(t)\, dt\right|^2$$
where A is the energy of the signal.
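Detection - Performance of detectors
The performance measure is:
$$SNR = \frac{\left|E[\eta \mid H_1] - E[\eta \mid H_0]\right|}{\left\{\frac{1}{2}\left[\mathrm{var}(\eta \mid H_1) + \mathrm{var}(\eta \mid H_0)\right]\right\}^{1/2}}$$
Using this measure, expressions for $SNR_{mf}$, $SNR_{acdwr}$ and $SNR_{xcdwr}$ can be obtained in terms of the signal energy $A$ and the noise variance $\sigma_N^2$, and it can be shown that:
The performance of the detector based on XCDWR is better than that of the ACDWR and is equivalent to the detector based on the matched filter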
Block diagram of the CDWR based detector
received signal r(t) -> segment the signal -> is the prototype reference signal estimated?
no: compute CDWR of r(t) -> if > T1, estimate the prototype signal $\hat{s}(t)$
yes: compute CDWR of r(t) & $\hat{s}(t)$ -> compute cross-correlation -> if > T2, signal present; otherwise signal absent
Synthetic data details
Data consists of modulated Gaussian pulses
The signal parameters are:
amplitude, arrival time, spread of the pulses, modulation frequency, phase and the sparseness, i.e., the number of pulses within a frame of data
They were randomly varied for different experiments
The received signal was embedded in white Gaussian noise with zero mean &
various noise variances that correspond to SNRs from 0 dB to -12 dB
The experiment was repeated 100 times.
Detector performance (synthetic signal)
Simulation setup
Synthetic Gaussian pulses with random arrival time, width, modulation frequency and density (of pulses - overlapped) are generated.
White Gaussian noise of various SNRs (+3 to -6 dB) is added to the generated synthetic signal.
At each SNR, the detection experiment was iterated 10,000 times.
For each iteration,
In the case of the CDWR, cross-correlation coefficients are computed,
whereas for the GLRT, signal parameters are estimated and $\Lambda_{GLRT}$ is computed.
The signal is detected if the correlation coefficient or $\Lambda_{GLRT}$ is above a certain threshold.
The threshold is set for a fixed probability of false alarm. The probability of detection is computed for different threshold values and ROC curves are generated. These curves are plotted in the next viewgraph
From this figure, it can be seen that the CDWR based detector performs better than the GLRT at low SNRs ( < 3 dB ).
Detectors performance
ROC curves for XCDWR and GLRT for noisy signal with different SNRs.
[Figure: Probability of Detection vs. Probability of False Alarm, both axes 0 to 1. Solid = GLRT, dotted = XCDWR critical sampling, dashed = XCDWR 2x oversampling. Curves are labeled 3 dB, 0 dB, -3 dB and -6 dB.]
Detector performance (real acoustic signal)
Acoustic signal and correlation coefficients in the case of ACDWR and XCDWR
Why is the CDWR detector's performance better?
In the case of the CDWR, the prototype signal is estimated after projecting the received signal onto the time-frequency plane. Advantages of this are:
time-frequency localization and
reduced noise effect, and hence a better estimate.
The noise effect can be minimized by designing the analysis and synthesis window functions (which are used in the computation of the CDWR) under certain constraints, such as minimum energy.
However, in the case of the GLRT, the accuracy of the signal parameter estimates deteriorates as the noise level increases. Therefore, the s^(t) of the CDWR is closer to s(t) than that of the GLRT.
Hence, at high SNR these two detectors perform almost equally, whereas at low SNR the CDWR detector performs better.
Detection of RF signals
Manmade signals can be considered as cyclostationary random processes
They exhibit peaks in the spectrum
Popular techniques are:
radiometry based peak detection
spectral correlation detection
Cyclostationary feature detector
Channelized receiver
Radiometric detector
Block diagram: received signal -> bandpass filter -> squaring device ( )^2 -> integrator -> hypothesis test -> detection decision
Detects energy in the bandwidth of the bandpass filter using non-coherent processing
The resultant test statistic is compared to a threshold, which can be established using various detection criteria (Bayes, Neyman-Pearson, etc.) and which varies as a function of channel characteristics.
The signal of interest is declared "present" whenever the test statistic exceeds the threshold.
Cyclostationary
feature detector
Assumption: the zero mean discrete time signal x(n) exhibits wide sense second order cyclostationarity
$$R_{xx}[n,l] = E\{x(n)\, x^*(n+l)\}$$
is periodic in n for a fixed lag l = ±1, ±2, ...
Example: Orthogonal Frequency Division Multiplexing (OFDM) signal
The Fourier coefficient of $R_{xx}[n,l]$, the cyclic autocorrelation function, is:
$$R_{xx}^{\alpha_k}(l) = \lim_{N\to\infty} \frac{1}{N} \sum_{n=0}^{N-1} R_{xx}[n,l]\, e^{-j 2\pi k n / N}$$
In practice it is estimated as:
$$\hat{R}_{xx}^{\alpha_k}(l) = \frac{1}{N} \sum_{n=0}^{N-1} x(n)\, x^*(n+l)\, e^{-j 2\pi k n / N}$$
where $\alpha_k$ is the cyclic frequency of index k & is related to symbol rate, modulation scheme and carrier frequency
Cyclostationary feature detector
$\hat{R}_{xx}^{\alpha_k}(l) \neq 0$ if $\alpha_k$ is a cyclic frequency, but it can be nonzero even when $\alpha_k$ is not a cyclic frequency because of the estimation, so a statistical test is needed for detection
One such test is based on GLRT
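The finite-sample estimate of the cyclic autocorrelation can be sketched as follows. The function name, the restriction to non-negative lags, and the cosine test signal are assumptions made for illustration; the statistical test itself is not implemented here.

```python
import numpy as np

def cyclic_autocorrelation(x, k, lag):
    """Estimate (1/N) * sum_n x(n) x*(n+lag) exp(-j 2 pi k n / N).

    Only samples for which x(n + lag) exists are used, so N = len(x) - lag.
    The lag is assumed to be non-negative in this sketch.
    """
    x = np.asarray(x, dtype=complex)
    n_total = x.size - lag
    n = np.arange(n_total)
    products = x[:n_total] * np.conj(x[lag:lag + n_total])
    return np.sum(products * np.exp(-2j * np.pi * k * n / n_total)) / n_total

# Hypothetical example: a cosine with period 8 samples. Its lag product
# x(n)^2 contains a component at 1/4 cycle per sample, i.e. index k = 16
# when N = 64, and nothing at an unrelated index such as k = 5.
x = np.cos(2 * np.pi * np.arange(64) / 8)
at_cyclic = abs(cyclic_autocorrelation(x, k=16, lag=0))
off_cyclic = abs(cyclic_autocorrelation(x, k=5, lag=0))
```

With noisy data the off-cycle value is small but not zero, which is the reason the slide calls for a statistical test on the estimate.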
Wideband detector
Need to
Detect wide and narrow band signals simultaneously
Detect multiple signals that are present simultaneously
Solutions
Channelized detector
Time-frequency representation / time-frequency atom based detector
Channelized detector
Reliable detection and arbitration of UWB signals that coexist with NB signals is very difficult, if not impossible, with radiometric techniques.
Channelization techniques are one of the alternatives to radiometric detection to address this problem
Previous work has been reported on a multi-radiometer system for detecting impulse radio signals.
It uses a form of temporal channelization, i.e., the observed frame time (Tf) of the received UWB signal is equally divided into M segments.
Each segment of time data is then processed by a wideband radiometer using TR = Tf/M over bandwidth WRAD ~ WUWB (known bandwidth case).
The M radiometer outputs are then logically combined such that if any of the individual outputs is positive, the UWB signal is declared present (detection occurs).
Ref: Brett D. Gronholz, Michael A. Temple, Robert F. Mills & Willie H. Mims, "Communication Channel Assessment: Detection of Ultra Wideband Signals Using a Channelized Receiver," 2005 International Conference on Wireless Networks, Communications and Mobile Computing
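The temporal channelization described above (M equal time segments, one radiometer per segment, logical OR of the outputs) can be sketched as below. The function name, the common threshold for all segments and the test frame are hypothetical.

```python
import numpy as np

def multi_radiometer_detect(frame, n_segments, threshold):
    """Split one observed frame into n_segments time segments, measure the
    energy of each, and declare detection if any segment exceeds the threshold."""
    segments = np.array_split(np.asarray(frame, dtype=float), n_segments)
    energies = np.array([np.sum(np.abs(seg) ** 2) for seg in segments])
    return bool(np.any(energies > threshold)), energies

# Hypothetical frame: silence except for one short pulse.
frame = np.zeros(100)
frame[42] = 5.0
detected, energies = multi_radiometer_detect(frame, n_segments=4, threshold=10.0)
```

Shortening the integration time to Tf/M keeps a short pulse from being averaged against a long stretch of noise, at the cost of M threshold comparisons per frame.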
Channelized detector
Channel outputs are collectively processed to arrive at the desired conclusion.
The intent is to exploit the power of channelization and develop arbitration techniques to establish how many and what type of signals are present.
Block Diagram of Channelized Receiver Using Total Bandwidth WTot Spanned by M Filters:
BPF1, BPF2, ..., BPFM, each followed by an A/D, all feeding a digital processor
Issues with Channelized detector
The fundamental receiver challenge is to determine the
number of signals present and
spectral characteristics of each, i.e., center frequency, bandwidth, and/or other parameters of interest
Specifically, the first part of the fundamental challenge involves
non-cooperative communication channel assessment,
i.e., given total channel bandwidth WTot, determine
1) if there is a signal present, and 2) what features the signal(s) have, i.e., NB, UWB, etc.
Time-frequency atom based wideband multiple signal detector
Block diagram (processing chain, with the quantity passed to the next stage in parentheses):
signal -> average signal
-> compute time-frequency representation (time-frequency spectrum)
-> obtain time-frequency atoms (atoms)
-> compute spectral energy in each atom & mean spectral energy across the time dimension (energy distribution)
-> compute histogram & determine threshold (threshold)
-> detect all energy peaks above the threshold (# of local peaks & energy values)
-> estimate beginning & end of signals (# of detected signals and their estimated beginning and end)
Rockwell Collins Proprietary
How does it work?
The received signal is buffered for ten frames.
It is averaged over ten frames to reduce noise effect.
A two-dimensional time-frequency representation, the spectrogram, is computed using the averaged signal.
Since it is a 2D representation,
if there are multiple signals with different bandwidths and center frequencies, occurring at different times,
it exhibits spectral energy around each signal's frequency band and time.
Spectrogram example
How does it work (cont.)?
The time-frequency space of the spectrogram is divided into smaller regions called time-frequency atoms.
The spectral energy in each of these atoms is computed.
Then it is averaged over time.
This results in a spectral energy distribution across frequencies.
An example 2D spectral energy in each time-frequency atom and the mean spectral energy distribution are shown in the two following slides, respectively.
Example 2D spectral energy
The energy bands corresponding to each signal are much clearer in the 2D energy plot of the time-frequency atoms than in the spectrogram, which is why time-frequency atoms are considered.
Example mean spectral energy distribution
How does it work (cont.)?
Using the spectral energy distribution, a histogram is computed.
The threshold is chosen as the value corresponding to the bin into which the maximum number of values fall (the first bin in the example histogram).
Such a selection is made because most of the lower spectral energy values correspond to the background.
This helps in characterizing the background statistically and fixing the threshold value adaptively based on the changes in the background noise level.
An example histogram is shown in the next slide.
Example histogram of spectral energy
distribution
How does it work (cont.)?
From the figure in the previous slide, it can be seen that bin 1 has the largest number of entries.
From the 2D spectral energy plot,
it can be seen that very few time-frequency atoms have high energy, indicated by darker bands, and
most of the time-frequency atoms have low energy values.
Hence, by choosing the threshold value that corresponds to bin 1, we eliminate the background noise.
How does it work (cont.)?
From the spectral energy distribution, local peaks and the associated center frequencies are located first.
If a local peak is above the chosen threshold, then a decision is made that a signal is present at that peak location.
The # of chosen local peaks indicates the number of signals present.
The associated locations of the peaks in frequency determine the center frequencies of those signals.
How does it work (cont.)?
For example, in the following figure, it can be noticed that seven local peaks are above the background level.
The peak locator also decides whether neighboring peaks are too close.
If they are, it ignores those peaks.
Hence, in the example below, it ignores the neighboring peaks that are present on either side of the peaks located at frequency indices 16 and 22, resulting in 4 peaks instead of 7.
These four peaks correspond to the four signals present.
The associated peak locations correspond to the center frequencies of those signals.
How does it work (cont.)?
Using the estimated center frequency and the associated spectral energy,
the time-frequency space of the 2D energy distribution of time-frequency atoms is searched in the time dimension to estimate the beginning and end of that signal in time.
Finally,
the estimated # of signals, their center frequencies and their beginning and end times are output to the user.
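The whole chain described on the preceding slides (average, spectrogram, atoms, mean energy, histogram threshold, peak picking, begin/end search) can be sketched in Python as follows. This is only a rough illustration under assumptions: the FFT length, atom size, number of histogram bins, the use of the upper edge of the fullest bin as the threshold, and the minimum peak separation are all hypothetical parameters, and a non-overlapping rectangular-window spectrogram stands in for whatever time-frequency representation the author used.

```python
import numpy as np

def atom_energies(signal, nfft=64, atom_t=4, atom_f=2):
    """Spectrogram power summed over atoms of atom_t time bins x atom_f frequency bins."""
    n_frames = signal.size // nfft
    frames = signal[:n_frames * nfft].reshape(n_frames, nfft)
    power = np.abs(np.fft.rfft(frames, axis=1)[:, :nfft // 2]) ** 2   # (time, frequency)
    nt, nf = n_frames // atom_t, (nfft // 2) // atom_f
    blocks = power[:nt * atom_t, :nf * atom_f].reshape(nt, atom_t, nf, atom_f)
    return blocks.sum(axis=(1, 3))                                    # (atom time, atom frequency)

def detect_signals(frames, n_bins=10, min_separation=2):
    """Return a list of (frequency atom index, first time atom, last time atom)."""
    averaged = np.mean(frames, axis=0)             # average the buffered frames
    atoms = atom_energies(averaged)
    mean_energy = atoms.mean(axis=0)               # average over the time dimension
    counts, edges = np.histogram(mean_energy, bins=n_bins)
    threshold = edges[np.argmax(counts) + 1]       # upper edge of the fullest bin
    peaks = []
    for f in range(mean_energy.size):
        left = mean_energy[f - 1] if f > 0 else -np.inf
        right = mean_energy[f + 1] if f < mean_energy.size - 1 else -np.inf
        is_peak = mean_energy[f] > threshold and mean_energy[f] >= left and mean_energy[f] >= right
        if is_peak and (not peaks or f - peaks[-1] >= min_separation):
            peaks.append(f)                        # skip peaks too close to the previous one
    results = []
    for f in peaks:
        active = np.flatnonzero(atoms[:, f] > threshold)   # search along time
        results.append((f, int(active[0]), int(active[-1])))
    return results

# Hypothetical test signal: one tone on FFT bin 10, present for the middle half of the frame.
n = np.arange(2048)
tone = np.where((n >= 512) & (n < 1536), np.cos(2 * np.pi * 10 * n / 64), 0.0)
detections = detect_signals(np.tile(tone, (10, 1)))
```

The number of entries in the returned list plays the role of the number of detected signals, and each entry carries the center-frequency atom and the estimated begin and end in atom units.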
Classification
Classification block diagram
Detected signals are processed
FFT, spectrogram, T-F distribution
Features are extracted
Features are used in classification
Classification
Model based
Clustering based
Classifiers
Open set
Closed set
Block diagram: input signals -> pre-processing -> feature extraction -> feature classification -> signal classes
Classifiers
Model based
Hidden Markov Model
Neural network
Probabilistic neural network (PNN)
Gaussian mixture model
Clustering
K-means
Discriminant analysis
Support Vector Machine

Gaussian Mixture Model (GMM) based signal Classifier Architecture
Training: training data (I & Q samples of signals) -> cumulant based feature extraction -> GMM parameter (mean vector & covariance matrix) estimation
Testing: test signal (I & Q samples) -> cumulant based feature extraction -> Bayesian classifier -> class id
Rockwell proprietary
Classifier: block diagram of computation of feature vector
Windowed noisy signal (I & Q) -> linear transformations 1, ..., L -> compute cumulants (one set per transformation)
Windowed noise (I & Q) -> linear transformations 1, ..., L -> compute cumulants (one set per transformation)
Both sets of cumulants -> additional processing (frequency offset and bandwidth correction) -> features
Gaussian Mixture Model (GMM) Parameter Estimation
Expectation Maximization Algorithm
An algorithm which helps in estimating the parameters of a multivariate distribution given the data
Input: data ($x$), number of mixtures/model ($C$)
Output: $C$ mean vectors ($\mu$), $C$ covariance matrices ($\Sigma$)
Multivariate Normal Distribution:
$$p(x, \{\mu_i, \Sigma_i\}) = \frac{1}{\sqrt{(2\pi)^F \det(\Sigma_i)}} \exp\left(-\frac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right)$$
where $F$ is the number of features (5-8 in our example)
Hence the likelihood (probability of data given a classification) is:
$$p(x \mid h) = \sum_{i=1}^{C} p(x, \{\mu_i, \Sigma_i\})$$
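Evaluating the class likelihood from already estimated mixture parameters can be sketched as below. The sketch does not implement the EM algorithm itself; it only evaluates the multivariate normal density and sums it over the C components as written above. The optional `weights` argument is an assumption added here for mixtures with unequal component weights, and the names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density with F = len(mean) features."""
    f = mean.size
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** f * np.linalg.det(cov))
    # solve(cov, diff) computes cov^{-1} diff without forming the inverse
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def class_likelihood(x, means, covs, weights=None):
    """p(x | h): sum of the component densities of one class model."""
    if weights is None:
        weights = np.ones(len(means))   # unweighted sum, as in the slide formula
    return float(sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)))

# Hypothetical two-component model in a 2-D feature space.
means = [np.zeros(2), np.zeros(2)]
covs = [np.eye(2), np.eye(2)]
likelihood = class_likelihood(np.zeros(2), means, covs)
```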
Bayesian Classifier
A Bayesian Classifier is a statistical classifier which utilizes Bayes rule to make the classification decision:
$$p(h \mid x) = \frac{p(x \mid h)\, p(h)}{p(x)}$$
where $h$ is a class and $x$ is the data.
Each class is equally likely, hence:
$$p(h) = \frac{1}{N}$$
where $N$ is the number of classes
$p(x)$ can be approximated as:
$$p(x) = \sum_{i=1}^{N} p(x \mid h_i)\, p(h_i)$$
Pick the class with the largest $p(h \mid x)$
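The decision rule can be written out directly. The sketch below takes the per-class likelihoods p(x | h_i) as given (for example from a mixture model per class) and uses equal priors as stated above; the function names and the example numbers are hypothetical.

```python
import numpy as np

def bayes_posteriors(likelihoods):
    """Posterior p(h_i | x) for equally likely classes, given p(x | h_i)."""
    likelihoods = np.asarray(likelihoods, dtype=float)
    prior = 1.0 / likelihoods.size                 # p(h) = 1/N
    evidence = np.sum(likelihoods * prior)         # p(x)
    return likelihoods * prior / evidence

def classify(likelihoods):
    """Index of the class with the largest posterior."""
    return int(np.argmax(bayes_posteriors(likelihoods)))

# Hypothetical likelihoods for three classes.
posteriors = bayes_posteriors([0.2, 0.6, 0.2])
best = classify([0.2, 0.6, 0.2])
```

With equal priors the posterior ordering is the same as the likelihood ordering, so the normalization matters only when the posterior value itself is needed, for instance to compare against a confidence level.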
Classifier
Unknown class
Unknown class detection
Uses a distance measure based on the Bhattacharyya distance
If a new signal is detected that the classifier is not trained for, it is classified as unknown
Unknown class processing
Features of unknown classes are passed to the Cognitive Engine
It learns about new classes
It provides the information required to retrain the classifier for these new classes
Example: Learning unknown signals
[Figure: eight 3-D scatter plots of the feature space]

Classifier Performance For USRP Radio Generated Data
USRP radio (uses GNU Radio software) data collection
Data collected outdoors on the RC campus
5 GNU Radio waveforms considered - DBPSK, DQPSK, D8PSK, GMSK, Gaussian noise
Bit rate was 500 kbps for the data waveforms
Background noise collected for noise correction
Results shown on next slide
High SNR case - uses only the background noise (SNR varied, but was at least 10 dB)
Low SNR case - added AWGN to the collected data so that the SNR is around -3 dB
DQPSK and D8PSK are very close and should be hard to distinguish
We have success in classifying them
Classification accuracy for USRP radio generated data

10 dB SNR
          Noise  DBPSK  DQPSK  D8PSK  GMSK  Gaussian
Noise       100      0      0      0     0         0
DBPSK         0    100      0      0     0         0
DQPSK         0      0    100      0     0         0
D8PSK         0      0      0    100     0         0
GMSK          0      0      0      0   100         0
Gaussian      0      0      0      0     0       100

-3 dB SNR
          Noise  DBPSK  DQPSK  D8PSK  GMSK  Gaussian
Noise       100      0      0      0     0         0
DBPSK         0     95      0      5     0         0
DQPSK         0      0     87     13     0         0
D8PSK         0      0     24     76     0         0
GMSK          0      0      2      1    97         0
Gaussian      0      0      0      0     0       100
Real-World Data
Signals Collected
3 HDTV Stations (2 VHF, 1 UHF)
2 FM Radio Stations (96.5 MHz, 106.1 MHz)
Push-to-talk signal (FM modulation)
Weather Radio Station (AM modulation)
CB Radio (AM modulation)
Results shown on next slide
Each class had its own noise
For omnipresent signals, noise sampled in adjacent band
Only non-noise classes shown in results
High SNR case - uses only the background noise (SNR varied, but was at least 10 dB)
Low SNR case - added AWGN to the data so that the input SNR is around -3 dB
Classification accuracy for Real-World data

10 dB SNR
      TV1  TV2  TV3  FM1  FM2   WT   WX   CB
TV1   100    0    0    0    0    0    0    0
TV2     0  100    0    0    0    0    0    0
TV3     0    0  100    0    0    0    0    0
FM1     0    0    0  100    0    0    0    0
FM2     0    0    0    0  100    0    0    0
WT      0    0    0    0    0   93    7    0
WX      0    0    0    5    0    0   91    0
CB      0    0    0    0    0    0    0  100

-3 dB SNR
      TV1  TV2  TV3  FM1  FM2   WT   WX   CB
TV1   100    0    0    0    0    0    0    0
TV2     3   97    0    0    0    0    0    0
TV3     0    0  100    0    0    0    0    0
FM1     0    0    0   91    8    0    1    0
FM2     0    0    2    0   98    0    0    0
WT      0    3    0    0    0   87    9    1
WX      0    0    0   24    2    7   58    8
CB      0    0    0    0    0    1    3   96
Classifier performance on field data before learning

True \ Test  MSK  GMSK  BPSK  QPSK  16QAM  Unknown
MSK           60     0     0     0      0        0
GMSK           0    60     0     0      0        0
BPSK           0     0    60     0      0        0
QPSK           0     0     0    60      0        0
16QAM          0     0     0     0     60        0
OOK            0     0     0     0      0       60
FM             0     2     0     0      0       58
2FSK           0     0     0     0      0       60
4FSK           0     0     0     0      0       60
AM             0     3     0     0      0       57
DSBSC          0     0     0     0      0       60

Training on synthetic signals of the 5 known classes; testing on 11 real-world signals
Classifier performance on field data after learning

Columns C1 to C4 are the learned clusters: Cluster 1 (AM), Cluster 2 (2FSK), Cluster 3 (OOK), Cluster 4 (4FSK)
True \ Test  MSK  GMSK  BPSK  QPSK  16QAM  C1  C2  C3  C4  Unknown
MSK           60     0     0     0      0   0   0   0   0        0
GMSK           0    60     0     0      0   0   0   0   0        0
BPSK           0     0    60     0      0   0   0   0   0        0
QPSK           0     0     0    60      0   0   0   0   0        0
16QAM          0     0     0     0     60   0   0   0   0        0
OOK            0     0     0     0      0   0   0  60   0        0
FM             0     2     0     0      0   0   0  13  15       30
2FSK           0     0     0     0      0   0  59   0   0        1
4FSK           0     0     0     0      0   0   0   0  58        2
AM             0     3     0     0      0  50   0   0   0       10
DSBSC          0     0     0     0      0   0   0  10   0       50

Learning resulted in four clusters of size >= 50
We have shown:
1) Our classifier works on field data
2) We can train on synthesized data and then test on real-world data.
3) We can learn unknown signals and define new classes.
Modulation of RF signals - Radar
To uniquely represent different types of modulation of radar impulses and to classify them,
we have developed a multi-class classifier that is constructed by combining a set of binary support vector machines (SVMs).
we have derived a set of innovative features using both higher order statistics and information measures such as Renyi entropy and relative entropy.
Features
Features used to represent signal information content include:
Renyi entropy
Energy ratio
Frequency change
Higher order statistics - skewness
Relative entropy
Why these features?
In general, signals are distorted by the transmission channel and the receiver system
Complete signal information is not available
Need robust, unique and optimum features that help in accurate representation
The selected features represent distorted signals uniquely and robustly
Entropy is a measure of information content; it uniquely represents the information content of a signal
Renyi entropy is a generalized version of Shannon entropy and is more robust
Relative entropy is a measure of relative information; it indicates how the information is changing relatively
Statistical features such as skewness are robust
Features such as energy ratio and frequency change uniquely represent signals
Feature 1: Renyi entropy
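Notations:
$s(t)$: signal; $S(\omega)$: FFT of $s(t)$
$e(t)$: envelope of $s(t)$; $E(\omega)$: FFT of $e(t)$
Feature 1: Renyi entropy
$$F_1 = H_\alpha(FE(\omega))$$
where
$$FE(\omega) = FFT(e(t) - \bar{e}(t))$$
and
$$H_\alpha(x) = \frac{1}{1-\alpha} \log_2 \sum_i p_x^\alpha(i), \quad \alpha \ge 0,\ \alpha \ne 1, \text{ and } p \text{ a probability}$$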
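A histogram-based estimate of the Renyi entropy can be sketched as follows. The order alpha = 2, the number of bins and the function name are hypothetical choices; the slides only state that the probabilities come from a histogram.

```python
import numpy as np

def renyi_entropy(values, alpha=2.0, n_bins=32):
    """Renyi entropy (base 2) of the histogram of `values`, for alpha != 1."""
    counts, _ = np.histogram(values, bins=n_bins)
    p = counts[counts > 0] / counts.sum()          # empty bins contribute nothing
    return float(np.log2(np.sum(p ** alpha)) / (1.0 - alpha))

# Hypothetical check: four equally populated bins carry 2 bits for any alpha.
uniform_values = np.tile([0.0, 1.0, 2.0, 3.0], 5)
entropy_uniform = renyi_entropy(uniform_values, alpha=2.0, n_bins=4)
entropy_constant = renyi_entropy(np.ones(10), alpha=2.0, n_bins=4)
```

For the feature above, `values` would be the magnitudes of the envelope spectrum; that mapping is an assumption of this sketch.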
Feature 2 : energy ratio
Feature 2: Energy ratio (of the envelope $e$ and the signal $s$)
$$F_2 = \frac{\sigma_e}{\sigma_s}$$
where
$$\sigma_x = E[(x - m_x)(x - m_x)^*]$$
and $m_x = E[x]$. $E[\cdot]$ is an expectation operator.
Feature 3: frequency change
Feature 3: Frequency change
Let
$$s(t) = \bigcup_{i=1}^{n} s_i(t)$$
where the $s_i(t)$ are segments of $s(t)$, then
$$F_3 = \max(fs) - \min(fs)$$
where
$$fs = \{f_i : i = 1, 2, \ldots, n\}$$
and
$$f_i = \mathrm{center}(S_i(\omega))$$
Feature 4 : skewness
Feature 4: Higher order statistics - skewness
$$F_4 = \frac{1}{\sigma_{FE}^3}\, E\left[(FE(\omega) - m_{FE})^3\right]$$
where
$$FE(\omega) = FFT(e(t) - \bar{e}(t))$$
Feature 5: relative entropy
Feature 5: Relative entropy
Let $e_1(t)$ and $e_2(t)$ be the upper and lower envelopes of $s(t)$, then
$$F_5 = D(FE_1(\omega), FE_2(\omega))$$
where
$$FE_i = FFT(e_i - \bar{e}_i)$$
and
$$D(x, y) = \sum_i p_x(i) \log\frac{p_x(i)}{p_y(i)} + \sum_j p_y(j) \log\frac{p_y(j)}{p_x(j)}$$
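The symmetric relative entropy D(x, y) can be estimated from two histograms over a common set of bins. In the sketch below the number of bins, the small constant `eps` added to avoid taking the logarithm of zero, and the function name are assumptions, not details from the slides.

```python
import numpy as np

def symmetric_relative_entropy(x, y, n_bins=16, eps=1e-12):
    """D(x, y) = sum p_x log(p_x/p_y) + sum p_y log(p_y/p_x), from histograms."""
    lo = min(np.min(x), np.min(y))
    hi = max(np.max(x), np.max(y))
    px, _ = np.histogram(x, bins=n_bins, range=(lo, hi))   # shared bin edges
    py, _ = np.histogram(y, bins=n_bins, range=(lo, hi))
    px = px / px.sum() + eps      # eps keeps empty bins out of log(0)
    py = py / py.sum() + eps
    return float(np.sum(px * np.log(px / py)) + np.sum(py * np.log(py / px)))

# Hypothetical data: two small samples with different bin occupancies.
a = np.array([0.0, 0.0, 1.0, 1.0])
b = np.array([0.0, 1.0, 1.0, 1.0])
d_ab = symmetric_relative_entropy(a, b, n_bins=2)
d_ba = symmetric_relative_entropy(b, a, n_bins=2)
d_aa = symmetric_relative_entropy(a, a, n_bins=2)
```

Summing both directions makes the measure symmetric in its two arguments, unlike the one-sided relative entropy.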
Signals considered and their generation
Considered signals are:
Analogue
AM with and without ripple
FM chirp with and without ripple
Digital
QPSK
Signal generation and description:
Synthesized using a realistic channel model and a receiver system
Analogue signals were generated using the above system
Digital QPSK signals were also generated using the above system
Signals are quite distorted - the full (or sometimes even half) signal spectrum is not available

Simulation details: computation of features
For each signal type, 400 pulses are used: 200 for training and 200 for testing
The ground truth of each signal pulse is known
The ripple frequency, pulse width, pulse rising and falling edges and additive noise are randomly varied
In the case of QPSK, phase, pulse width and noise are randomly varied
Four representative simulated noisy pulses are plotted in the next slide

Simulated noisy pulses - example

Simulation details: computation of features 2
The plot the additive Gaussian noise corrupted the envelopes of thepulses
such that the rippled and non-rippled pulses are not easy to distinguish.
Most of our features are extracted from the envelopes of pulses,
we used a simple but computationally efficient peak detecting technique. For the spectral based features, we used the windowed FFT.
For the computation of both Renyi entropy and the relative entropy, theprobability values are obtained from the histogram.
Next slide presents a plot of three features relative entropy, frequencychange and skewness for pulses of 10dB SNR.
From this plot it can be seen that these features form clusters.
our chosen novel features can represent different classes of modulatedpulses (that are closely related) fairly accurately.
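The two building blocks mentioned above, envelope extraction by peak detection and histogram-based entropy, might look roughly like the following. The slides do not specify the peak detector or the Renyi order, so the linear interpolation between local extrema, the default `alpha=2.0`, the bin count, and all names here are assumptions made for illustration.

```python
import math


def peak_envelope(signal, upper=True):
    """Linearly interpolate between local maxima (or minima) of `signal`."""
    sign = 1.0 if upper else -1.0
    s = [sign * v for v in signal]
    # Local peaks, with the two end points added so the envelope spans the pulse.
    peaks = [0] + [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1]] + [len(s) - 1]
    env = []
    for left, right in zip(peaks, peaks[1:]):
        for i in range(left, right):
            frac = (i - left) / (right - left)
            env.append(sign * (s[left] + frac * (s[right] - s[left])))
    env.append(signal[-1])
    return env


def histogram_probs(values, bins=16):
    """Probability estimates from a fixed-width histogram of `values`."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]


def renyi_entropy(probs, alpha=2.0):
    """H_alpha = log(sum p^alpha) / (1 - alpha), defined for alpha != 1."""
    if alpha == 1.0:
        raise ValueError("alpha = 1 is the Shannon limit; use alpha != 1")
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)
```

A call such as `renyi_entropy(histogram_probs(peak_envelope(pulse)))` then gives one scalar feature per pulse.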
Cluster plot of signal features
Feature clusters of signals
Classifier - SVM
An SVM is a supervised statistical learning machine.
In its learning process, an SVM constructs an optimal hyper-plane as its decision surface using a small set of training data called support vectors, which are the data points closest to the decision surface and the most difficult to classify.
The optimization for computing the decision surface is achieved by the principle of structural risk minimization.
Since in many applications optimal decision surfaces could be non-linear, an SVM uses a set of non-linear transfer functions (inner-product kernels) to map the data from the input space into a high-dimensional feature space such that a non-linear decision surface in the input space becomes a linear decision surface (optimal hyper-plane) in the feature space.
Motivation
H3 doesn't separate the two classes. H1 does, with a small margin, and H2 does with the maximum margin.
SVM - Construction
The procedure for constructing an SVM can be described as follows:
Let the data set $\{(x_i, d_i)\}_{i=1}^{N}$ be the training set and $K(x_i, x_j)$ be the inner-product kernel function
The objective function for constructing an optimal decision surface is:
$$J(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j K(x_i, x_j)$$
subject to the constraints
$$(1)\ \sum_{i=1}^{N} \alpha_i d_i = 0 \quad \text{and} \quad (2)\ 0 \le \alpha_i \le C \ \text{ for } i = 1, 2, 3, \ldots, N$$
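To make the objective and its two constraints concrete, here is a small sketch that evaluates them for a candidate multiplier vector. It does not solve the optimization; the multipliers are written `alpha`, and the function names, the tolerance, and the linear kernel used in the toy check are assumptions made for this example.

```python
def dual_objective(alpha, d, x, kernel):
    """J(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j d_i d_j K(x_i, x_j)."""
    n = len(alpha)
    quad = sum(
        alpha[i] * alpha[j] * d[i] * d[j] * kernel(x[i], x[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(alpha) - 0.5 * quad


def is_feasible(alpha, d, c, tol=1e-9):
    """Check constraint (1) sum alpha_i d_i = 0 and constraint (2) 0 <= alpha_i <= C."""
    balanced = abs(sum(a * di for a, di in zip(alpha, d))) <= tol
    boxed = all(-tol <= a <= c + tol for a in alpha)
    return balanced and boxed


def linear_kernel(u, v):
    """Plain inner product, used here only to keep the toy example simple."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Toy problem: one point per class on the real line, labels d = +1 / -1.
x = [(1.0,), (-1.0,)]
d = [1, -1]
alpha = [0.5, 0.5]
value = dual_objective(alpha, d, x, linear_kernel)  # 1.0 - 0.5 * 1.0 = 0.5
```

For this two-point problem the multipliers `[0.5, 0.5]` satisfy both constraints with `C = 1` and give `J = 0.5`.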
SVM - Construction
C is a user-specified positive parameter called the cost of mistakes.
The optimal parameter vector $\alpha^*$ is determined by maximizing $J(\alpha)$, i.e.,
$$\alpha^* = \arg\max_{\alpha} J(\alpha)$$
Then, the optimal decision surface can be written as:
$$f(x) = \sum_{i=1}^{N_s} \alpha_i^* d_i K(v_i, x) + b$$
where $v_i$ is a support vector, $N_s$ is the number of support vectors, and $b$ is the bias term
The decision surface $f(x)$ is an optimal hyperplane in the feature space (the hidden space related by the kernel function $K(x, y)$)
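Once the optimal multipliers, support vectors and bias are known, evaluating the decision surface is a single weighted sum over the support vectors. The sketch below assumes those quantities are already available from a solver; the function names, the linear kernel, and the toy values are illustrative only.

```python
def decision_function(x, support_vectors, alpha_star, d, b, kernel):
    """f(x) = sum_i alpha*_i d_i K(v_i, x) + b over the support vectors v_i."""
    return sum(
        a * di * kernel(v, x) for a, di, v in zip(alpha_star, d, support_vectors)
    ) + b


def predict(x, support_vectors, alpha_star, d, b, kernel):
    """Binary label from the sign of the decision function."""
    return 1 if decision_function(x, support_vectors, alpha_star, d, b, kernel) >= 0 else -1


def linear_kernel(u, v):
    """Plain inner product, chosen only to make the toy numbers easy to follow."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Toy model: support vectors at +1 and -1, equal multipliers, zero bias,
# so that f(x) reduces to x itself.
support_vectors = [(1.0,), (-1.0,)]
alpha_star = [0.5, 0.5]
d = [1, -1]
b = 0.0
score = decision_function((2.0,), support_vectors, alpha_star, d, b, linear_kernel)
```

Only the support vectors enter the sum, which is why the cost of classifying a new pulse scales with the number of support vectors rather than with the training set size.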
SVM - Construction
The most common kernel functions that are used in practice are the polynomial and Gaussian functions:
$$K(x, y) = (x^T y + 1)^p$$
$$K(x, y) = \exp\left(-\frac{1}{2\sigma^2} \lVert x - y \rVert^2\right)$$
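Both kernels are short to write down, and the polynomial one offers an easy check of the kernel idea: for two-dimensional inputs and p = 2, it equals an ordinary inner product in a six-dimensional feature space. The sketch below shows this; the default values of `p` and `sigma` and the explicit feature map are chosen for illustration and are not taken from the slides.

```python
import math


def polynomial_kernel(x, y, p=2):
    """K(x, y) = (x^T y + 1)^p."""
    return (sum(xi * yi for xi, yi in zip(x, y)) + 1.0) ** p


def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def poly2_feature_map(x):
    """Explicit map whose inner product reproduces the p = 2 kernel for 2-D inputs."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)


def dot(u, v):
    """Ordinary inner product in the feature space."""
    return sum(ui * vi for ui, vi in zip(u, v))


x, y = (1.0, 2.0), (3.0, -1.0)
k_direct = polynomial_kernel(x, y)                            # (1*3 + 2*(-1) + 1)^2 = 4
k_mapped = dot(poly2_feature_map(x), poly2_feature_map(y))    # same value via the map
```

The kernel evaluates the feature-space inner product without ever forming the mapped vectors, which is what keeps non-linear decision surfaces affordable.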
Finding the optimum hyperplane
Maximum-margin hyperplane and margins for an SVM trained with samples from two classes. Samples on the margin are called the support vectors.
SVM - Multiclass
As described, SVMs use an optimal hyper-plane as a decision surface for the classification of input data.
Since a hyper-plane can only separate two classes, SVMs were originally developed for binary classification problems.
The optimal design of SVMs for multi-class classification is still a research topic.
We have extended the binary approach to the multi-class classification problem by classifying each class against all the other classes iteratively.
Multi-class Classifier
Our approach can be described as follows:
Let $\{c_i : i = 1, 2, 3, \ldots, N\}$ be N classes of signals.
We construct N classifiers $\{f_i : i = 1, 2, 3, \ldots, N-1\}$, and each classifier is trained by the one-class-versus-the-rest method; that is, the classifier $f_i$ is trained for $c_i$ versus the rest of the classes.
Then, in the signal classification phase, the classifiers operate according to the following decision rule:
$$x \in c_i \quad \text{if} \quad f_i(x) = \max\{f_k(x) : f_k(x) > 0;\ k = 1, 2, 3, \ldots, N-1\}$$
where the function $f_k(x)$ provides the distance of $x$ to the decision surface.
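The decision rule amounts to picking the class whose classifier gives the largest positive score. A minimal sketch follows, using toy scoring functions in place of trained SVMs; the slides do not say what happens when no classifier returns a positive score, so the `fallback` argument is an assumption, as are the function and label names.

```python
def classify_one_vs_rest(x, classifiers, labels, fallback=None):
    """Return the label whose classifier has the largest positive score f_k(x)."""
    scores = [f(x) for f in classifiers]
    positive = [(s, label) for s, label in zip(scores, labels) if s > 0]
    if not positive:
        return fallback  # assumed behavior: no classifier claims the sample
    return max(positive, key=lambda pair: pair[0])[1]


# Toy stand-ins for trained one-versus-rest decision functions on a 1-D feature:
# each is positive only within distance 1 of its class center.
classifiers = [
    lambda x: 1.0 - abs(x - 0.0),
    lambda x: 1.0 - abs(x - 2.0),
    lambda x: 1.0 - abs(x - 4.0),
]
labels = ["class A", "class B", "class C"]
```

For example, `classify_one_vs_rest(1.8, classifiers, labels)` selects `"class B"` because only the second toy classifier returns a positive score there.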
Multi-class Classifier
While classifying the features using our multi-class SVM-based classifier, we used both Gaussian and polynomial kernel functions.
We obtained slightly better classification performance with the polynomial kernel function; however, classification was faster with the Gaussian kernel function.
Hence, we used the Gaussian kernel function in our experiments.
Simulation details - Classification
All four signal types were considered
Four features (Renyi entropy, relative entropy, energy ratio and frequency change) were used
Classification results for pulses with 10 dB SNR are reported in the form of a confusion matrix in the next slide.
From this, it can be seen that these features represent the information content of a signal fairly accurately
Classification results
200 pulses for training and 200 pulses for testing
Rows: true classes; columns: computed classes

True class       FM ripple   FM non-ripple   AM ripple   No-ripple
FM ripple        0.91        0.08            0.02        0.0
FM non-ripple    0.0         1.0             0.0         0.0
AM ripple        0.0         0.0             0.95        0.05
No-ripple        0.0         0.0             0.04        0.96

Table 1: Classification results for SNR = 10 dB
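The confusion matrix in Table 1 can be summarized with a short calculation. The sketch below reads the diagonal as per-class accuracy and averages it; this equals overall accuracy only under the assumption that every class has the same number of test pulses (the slides state 200 per signal type). The variable and function names are illustrative.

```python
# Rows are true classes, columns are computed classes, in the same order.
CLASSES = ["FM ripple", "FM non-ripple", "AM ripple", "No-ripple"]
CONFUSION = [
    [0.91, 0.08, 0.02, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.95, 0.05],
    [0.0, 0.0, 0.04, 0.96],
]


def per_class_accuracy(matrix, classes):
    """Diagonal entries: fraction of each true class labeled correctly."""
    return {name: matrix[i][i] for i, name in enumerate(classes)}


def mean_accuracy(matrix):
    """Average of the diagonal; overall accuracy when classes are balanced."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


def largest_confusion(matrix, classes):
    """Return (true class, computed class, rate) for the biggest off-diagonal entry."""
    rate, i, j = max(
        (matrix[i][j], i, j)
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j
    )
    return classes[i], classes[j], rate


accuracy = mean_accuracy(CONFUSION)  # (0.91 + 1.0 + 0.95 + 0.96) / 4 = 0.955
```

The largest single confusion in the table is FM ripple pulses labeled as FM non-ripple, at a rate of 0.08.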
Classification Performance vs. SNR
Mutual information measure for the selection of non-redundant features/SOI
1. Renyi entropy
2. Frequency change
3. Energy ratio
4. Skewness
5. Relative entropy
6. Kurtosis
7. Pulse bandwidth
8. Ripple frequency
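One common way to measure redundancy between two of these features is a histogram estimate of their mutual information: a pair with high mutual information carries largely the same information. The slides do not give the estimator or any selection threshold, so the following is only a sketch under that assumption, with a hypothetical bin count and function names.

```python
import math
from collections import Counter


def discretize(values, bins=8):
    """Map each value to a fixed-width histogram bin index."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(a, b, bins=8):
    """I(A; B) = sum p(i, j) log(p(i, j) / (p(i) p(j))) from joint bin counts (nats)."""
    da, db = discretize(a, bins), discretize(b, bins)
    n = len(da)
    pa, pb, pab = Counter(da), Counter(db), Counter(zip(da, db))
    return sum(
        (c / n) * math.log((c / n) / ((pa[i] / n) * (pb[j] / n)))
        for (i, j), c in pab.items()
    )
```

Applied to every pair of feature columns computed over the training pulses, this yields a matrix in which near-zero entries mark complementary features and large entries mark candidates for removal.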
Conclusions
Various techniques for detection and classification are discussed
Particular emphasis is given to techniques applicable at low SNRs
Robust detectors and classifiers that work at low SNRs are still a research topic
Opportunities exist for both military and commercial applications
Phy layer and network layer behavior learning
Why Phy layer behavior learning?
CR/SDRs are being used in both military and commercial applications
Adversaries can use them to attack radios without much effort
Most CR/SDR research has focused on Quality of Service (QoS).
How do these algorithms respond to the actions of malicious users?
Types of Attacks
Primary User Emulation (PUE)
Denial of Service (DoS)
Spectral Honeypot Attacks (SHA)
Primary User Emulation
Actively attempting to confuse a CR/SDR into thinking that your signal is a primary user signal.
Specifically designed to attack the signal classification component of a CR/SDR.
If the attacker is using a published standard (IEEE 802.11), it is impossible to discern malicious users without additional behavioral information.
Makes feature selection within classifiers a critical algorithm decision.
Denial of Service
Goal is to disrupt some service provided by the communication node.
CR/SDRs know how to move channels when they are being jammed.
What if you forced a DSA radio to continually move? It would never be able to initiate real packet transfer, thus achieving a DoS.
Spectral Honeypot Attack
Given a certain band in the spectrum, lure or force the CR/SDRs to that band for a malicious purpose:
Man-in-the-middle attack
Forced degradation of the secondary signal
Can use a PUE attack to force the radio into the band of your choice.
VT's Experimental Setup
Three DSA 2100 Radios built by Shared Spectrum Company.
One in base station mode.
Two in subscriber mode.
Vector Signal Generator
Tektronix RSA3408 Real-Time Spectrum Analyzer
USRP built by Ettus Research
VT's Experimental Results
Demonstrated a DoS attack causing ~82% performance degradation in DSA radios.
Can significantly degrade performance even with cheap COTS components like a USRP.
Demonstrated a Honeypot Attack using PUE.
Using two different methods, forced the radio to the target band in 5.6 seconds and 3.7 seconds.
Why network layer behavior learning?
Interference can occur in a normally operated mobile wireless network due to hidden terminals
This condition can be intentionally created by a stealthy adversary
Programmable radios make it easy for attackers to emulate normal interference.
Need to distinguish malicious interference from normal interference and to understand the types of interference
Types of malicious attacks
Jamming attack by
Selective packet blocking (e.g., ACK/control packets) by killing ACK/CTS
Blocking the preamble (synching) such that a radio cannot lock on to a signal
Byzantine: a node is compromised by the adversary to intentionally act inconsistently to throw off routing protocols
Spoofing: a device pretends to be an access point and thus obtains information about the identity of wireless devices
Environment alteration: purposely increasing the background noise level to alter the power control strategies of the devices
Why network layer behavior learning?
Types of normal/benign attacks
Background congestion, distance and mobility: can occur due to
a noisy radio environment because of a high ambient noise level
too many nodes that are close to each other and are constantly transmitting
Hidden node: two nodes are not within sensing range of each other but still interfere with a third node
Can appear in different topologies
Cross protocol/technology: devices using different protocols with overlapping frequency ranges
An Extension of PER-RSS Consistency Check
Entire signal space consists of three regions
Interference-free: no hidden terminal
Normal interference: caused by legitimate hidden terminals
Intentional interference: malicious jamming
Thresholds are empirically chosen using the support vector machine technique.
PER: Packet Error Rate, RSS: Received Signal Strength
WINLAB
Challenges Remain
A smart reactive jammer can take advantage of the capture effect to throttle the victim's throughput while keeping a low PER.
[Figure: normalized throughput (%) versus transmission link distance (meters), 160 to 220 m; curves: Random, Reactive]