![Page 1: A Plot for Visualizing Multivariate Data Rida E. A. Moustafa George Mason University ADM Group,AAL rmoustaf@galaxy.gmu.edu rmustafa@aalcpas.com.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649c755503460f949285ee/html5/thumbnails/1.jpg)
A Plot for Visualizing Multivariate Data
Rida E. A. Moustafa
George Mason University; ADM Group, AAL
rmoustaf@galaxy.gmu.edu, rmustafa@aalcpas.com
Talk Outline
• The theory of the MV-plot.
• Detecting linear structures with the MV-plot.
• Detecting nonlinear structures with the MV-plot.
• Comparisons with other methods and application to real data.
MV-Plot Theory
Given an observation $x = (x_1, x_2, \dots, x_d)$, we define $m$ and $v$ as follows:

$$m = f(x) = \frac{1}{d}\sum_{j=1}^{d} |x_j|$$

$$v = g(x, f(x)) = \sqrt{\frac{1}{d}\sum_{j=1}^{d} |x_j - f(x)|^2}$$
Computing m and v for every observation produces a vector of m values and a vector of v values.
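As a minimal sketch of this step, the Python below computes one (m, v) pair per observation. It assumes m is the mean of |x_j| over the d coordinates and v is the root-mean-square deviation of the coordinates from m. The names `mv_point` and `mv_transform` and the toy data are illustrative and do not come from the talk.

```python
import math


def mv_point(x):
    """Return (m, v) for one observation x, a sequence of d numbers.

    Assumed definitions: m = mean of |x_j|,
    v = sqrt(mean of |x_j - m|^2).
    """
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum(abs(xj - m) ** 2 for xj in x) / d)
    return m, v


def mv_transform(rows):
    """Map every observation (row) to its (m, v) pair.

    Returns two lists: all m values and all v values, in row order.
    """
    pairs = [mv_point(row) for row in rows]
    ms = [m for m, _ in pairs]
    vs = [v for _, v in pairs]
    return ms, vs


# Toy data: three observations in d = 3, already scaled to (0, 1).
rows = [[0.2, 0.4, 0.6], [0.5, 0.5, 0.5], [0.1, 0.9, 0.5]]
ms, vs = mv_transform(rows)
for m, v in zip(ms, vs):
    print(f"m = {m:.4f}, v = {v:.4f}")
```

Plotting `vs` against `ms` as a scatter plot would give the two-dimensional view that the rest of the talk refers to as the MV-space.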
What is the relationship between m and v?
MV-Relationship in 2-d
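$$m_i = \frac{1}{2}\sum_{j=1}^{2} |x_{ij}| = \frac{1}{2}\big(|x_{i1}| + |x_{i2}|\big)$$

$$v_i = \sqrt{\frac{1}{2}\sum_{j=1}^{2} |x_{ij} - m_i|^2} = \frac{1}{2}\,|x_{i1} - x_{i2}|$$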
• Normalizing the data to the range (0,1) removes the need for the absolute value when computing m.
• In 2-d, m and v are close to the principal components (PC).
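The 2-d case can be checked numerically. The sketch below assumes nonnegative coordinates (for example, data scaled to (0,1)), with m the mean of |x_j| and v the root-mean-square deviation from m. It compares those definitions with the closed forms (x1 + x2)/2 and |x1 - x2|/2. It also illustrates one reading of the "PC" remark: projections onto the unit vectors (1,1)/√2 and (1,-1)/√2 equal √2·m and ±√2·v. That reading of "PC" as principal components along the diagonal is an interpretation, and the helper name `mv_2d` is illustrative.

```python
import math
import random


def mv_2d(x1, x2):
    """Assumed MV definitions specialised to d = 2."""
    m = (abs(x1) + abs(x2)) / 2
    v = math.sqrt(((x1 - m) ** 2 + (x2 - m) ** 2) / 2)
    return m, v


random.seed(0)
points = [(random.random(), random.random()) for _ in range(1000)]

max_err = 0.0
for x1, x2 in points:
    m, v = mv_2d(x1, x2)
    # Closed forms for nonnegative 2-d data.
    max_err = max(max_err, abs(m - (x1 + x2) / 2))
    max_err = max(max_err, abs(v - abs(x1 - x2) / 2))
    # Projections onto the diagonal and anti-diagonal unit vectors.
    p1 = (x1 + x2) / math.sqrt(2)
    p2 = (x1 - x2) / math.sqrt(2)
    max_err = max(max_err, abs(p1 - math.sqrt(2) * m))
    max_err = max(max_err, abs(abs(p2) - math.sqrt(2) * v))

print(f"largest discrepancy: {max_err:.2e}")
```

When the data cloud is elongated along the diagonal, these two projections coincide with the principal axes, which is where the resemblance is strongest.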
MV-plot detects linear structure(s)
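For data on a line, $x_{i2} = w_1 x_{i1} + w_0$:

$$m_i = \tfrac{1}{2}\big((1 + w_1)\,x_{i1} + w_0\big); \qquad v_i = \tfrac{1}{2}\big((1 - w_1)\,x_{i1} - w_0\big)$$

(the sign of the expression for $v_i$ flips when $(1 - w_1)\,x_{i1} < w_0$, since $v_i \ge 0$). Both are therefore of the form

$$m_i = a_1 x_{i1} + a_0; \qquad v_i = a'_1 x_{i1} + a'_0$$

where the coefficients depend only on $w_0$ and $w_1$.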
If the data are linear in the original space, they will also be linear in the MV-space.
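A small numerical check of this claim in 2-d, as a sketch only: it assumes m is the mean of |x_j| and v is the root-mean-square deviation from m, and it uses a hypothetical line with w1 = 0.5 and w0 = 0.1, sampled where both coordinates are nonnegative and (1 - w1)·x1 - w0 stays positive. The function name `mv_point` is illustrative.

```python
import math


def mv_point(x):
    """(m, v) under the assumed definitions: mean of |x_j| and RMS deviation."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v


w1, w0 = 0.5, 0.1                          # hypothetical line x2 = w1 * x1 + w0
xs = [0.3 + 0.05 * k for k in range(15)]   # x1 from 0.30 to 1.00
line = [(x1, w1 * x1 + w0) for x1 in xs]

mv = [mv_point(p) for p in line]

# Compare with the closed forms m = ((1 + w1) x1 + w0) / 2 and
# v = ((1 - w1) x1 - w0) / 2.
for (x1, _), (m, v) in zip(line, mv):
    assert math.isclose(m, 0.5 * ((1 + w1) * x1 + w0))
    assert math.isclose(v, 0.5 * ((1 - w1) * x1 - w0))

# Collinearity in MV-space: the slope from the first point to every
# other point is the same, namely (1 - w1) / (1 + w1).
m0, v0 = mv[0]
slopes = [(v - v0) / (m - m0) for m, v in mv[1:]]
print(min(slopes), max(slopes))
```

If the sampled range crossed the point where (1 - w1)·x1 = w0, the image would be two line segments meeting at v = 0 rather than a single one, because v cannot be negative.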
MV-plot detects linear structure(s) (continued)
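In $d$ dimensions, for data on a hyperplane $x_{id} = \sum_{j=1}^{d-1} w_j x_{ij} + w_0$:

$$m_i = \frac{1}{d}\Big(\sum_{j=1}^{d-1} (1 + w_j)\,x_{ij} + w_0\Big)$$

The slide gives a corresponding expression for $v_i$ in terms of $d$, $w_j$, $x_{ij}$ and $w_0$, and concludes that both quantities take the form

$$m_i = \sum_{j=1}^{d-1} a_j x_{ij} + a_0$$

$$v_i = \sum_{j=1}^{d-1} a'_j x_{ij} + a'_0$$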
Detecting Linear structure(s) Example I
Detecting Linear structure(s) Example II
Detecting Linear structure(s) Example III
Detecting nonlinear data with MV-plot
The MV-plot can detect nonlinear structure in the data set without any changes to the equations.
Detecting nonlinear structure
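$$(x, \sin(x)): \quad m = x + \sin(x), \quad v = |x - \sin(x)|$$

$$(x, \cos(x)): \quad m = x + \cos(x), \quad v = |x - \cos(x)|$$

(written up to the constant factor 1/2 of the 2-d formulas)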
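To illustrate the nonlinear case, the sketch below maps points on the curve (x, sin x) into MV-space. It assumes m is the mean of |x_j| and v is the root-mean-square deviation from m, so the factor 1/2 appears explicitly. The range [0, π] is chosen so that both coordinates are nonnegative; the function name `mv_point` is illustrative.

```python
import math


def mv_point(x):
    """(m, v) under the assumed definitions: mean of |x_j| and RMS deviation."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v


# Points on the curve (x, sin x) for x in [0, pi].
n = 50
xs = [math.pi * k / (n - 1) for k in range(n)]
curve = [(x, math.sin(x)) for x in xs]
mv = [mv_point(p) for p in curve]

# Closed forms with the factor 1/2 restored.
for x, (m, v) in zip(xs, mv):
    assert math.isclose(m, (x + math.sin(x)) / 2, abs_tol=1e-12)
    assert math.isclose(v, abs(x - math.sin(x)) / 2, abs_tol=1e-12)

# Unlike a straight line, the image in MV-space is curved:
# slopes between consecutive points are not constant.
slopes = [
    (mv[k + 1][1] - mv[k][1]) / (mv[k + 1][0] - mv[k][0])
    for k in range(n - 1)
]
print(f"slope range: {min(slopes):.3f} to {max(slopes):.3f}")
```

The same formulas are used as for linear data; only the shape traced out in the (m, v) plane changes.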
Detecting Sphere(s)
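Case I:
• The sphere has radius R
• The sphere is centered at the origin

$$v_i^2 = \frac{1}{d}\sum_{j=1}^{d} (x_{ij} - m_i)^2 = \frac{1}{d}\Big(\sum_{j=1}^{d} x_{ij}^2 - d\,m_i^2\Big)$$

$$v_i^2 + m_i^2 = \frac{R^2}{d}.$$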
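The circle relation v² + m² = R²/d for a sphere centered at the origin can be verified with random points. This is a sketch under one explicit assumption: m is taken as the signed mean of the coordinates (not the mean of absolute values), because the identity relies on the sum of squared deviations equalling the sum of squares minus d·m². The helper names and the values d = 5, R = 2 are illustrative.

```python
import math
import random


def random_point_on_sphere(d, radius, rng):
    """Random point on the sphere of the given radius centred at the origin."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(gj * gj for gj in g))
    return [radius * gj / norm for gj in g]


def mv_signed(x):
    """(m, v) with m taken as the signed coordinate mean (assumption)."""
    d = len(x)
    m = sum(x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v


rng = random.Random(1)
d, R = 5, 2.0
max_err = 0.0
for _ in range(500):
    m, v = mv_signed(random_point_on_sphere(d, R, rng))
    max_err = max(max_err, abs(v * v + m * m - R * R / d))

print(f"max |v^2 + m^2 - R^2/d| = {max_err:.2e}")
```

In the (m, v) plane every such point lies on a circle of radius R/√d, so spheres of different radii would show up as concentric arcs.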
Detecting Sphere(s)
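Case II:
• The sphere has radius R
• The sphere center $x^c$ is not the origin

With $m_i$ and $v_i$ computed on the centered coordinates $x_{ij} - x_j^c$:

$$v_i^2 = \frac{1}{d}\sum_{j=1}^{d} \big((x_{ij} - x_j^c) - m_i\big)^2 = \frac{1}{d}\Big(\sum_{j=1}^{d} (x_{ij} - x_j^c)^2 - d\,m_i^2\Big)$$

$$v_i^2 + m_i^2 = \frac{R^2}{d}.$$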
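For a sphere that is not centered at the origin, the sketch below assumes that the center is subtracted before m and v are computed, and that m is the signed mean of the centered coordinates. Under those assumptions the relation v² + m² = R²/d holds again; without centering it does not. The center, radius, dimension and the name `mv_centered` are hypothetical values chosen for illustration.

```python
import math
import random


def mv_centered(x, center):
    """(m, v) of the coordinates after subtracting the given centre."""
    d = len(x)
    y = [xj - cj for xj, cj in zip(x, center)]
    m = sum(y) / d
    v = math.sqrt(sum((yj - m) ** 2 for yj in y) / d)
    return m, v


rng = random.Random(2)
d, R = 4, 1.5
center = [0.5, -1.0, 2.0, 0.25]   # hypothetical sphere centre


def point_on_sphere():
    """Random point on the sphere of radius R around `center`."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(gj * gj for gj in g))
    return [cj + R * gj / norm for cj, gj in zip(center, g)]


centered_err = 0.0
raw_err = 0.0
for _ in range(500):
    x = point_on_sphere()
    m, v = mv_centered(x, center)
    centered_err = max(centered_err, abs(v * v + m * m - R * R / d))
    m0, v0 = mv_centered(x, [0.0] * d)   # same formulas, no centring
    raw_err = max(raw_err, abs(v0 * v0 + m0 * m0 - R * R / d))

print(f"with centring:    {centered_err:.2e}")
print(f"without centring: {raw_err:.2e}")
```

In practice the center would have to be estimated, for example from the coordinate-wise mean of the points, which is a further assumption not covered here.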
Detecting Sphere(s)
Application to Real Data
• Fisher’s IRIS data (150x4): 3 classes of 50 points each
• Process control data (600x60): 6 classes of 100 points each
• Pollen data (3,848x5) (Wegman’s data): 2 classes (linear and nonlinear)
Related Dimension Reduction Methods
• Multidimensional Scaling
• Fisher Discriminant Analysis
• Principal Components
IRIS (R. A. Fisher) Dataset: 150 cases in 4-dim
Time Series Dataset: 600 cases in 60-dim
Pollen dataset: 3,848 points in 5-dim
Other methods:
• Require more storage and computing time.
• Even if they work, we expect poor results on this particular data set (Wegman 2002).
Pollen dataset
Linear and Nonlinear mixed structures.
The linear structure in the Pollen data set
17 + 16 + 18 + 17 + 14 + 16 = 98 linear points; 3,750 nonlinear points
Summary
The MV-algorithm can discover linear and nonlinear patterns at the same time.
The MV-algorithm can discover symmetric data.
The MV-algorithm handles large multivariate data sets.