Data transformation
Data Transformation
Summer Data Jam
Chris Orwa
14th July 2015
Principal Component Analysis
Principal component analysis (PCA) is a technique used
to emphasize variation and bring out strong patterns in a
dataset. It's often used to make data easy to explore and
visualize.
Statistically, the principal components are the eigenvectors
of the data's covariance matrix.
Let us Look at Some Concepts
Covariance
The covariance of two variables x and y in a data sample
measures how the two vary together: it is positive when they
tend to rise and fall together, and negative when one tends
to rise as the other falls.
R code
duration = faithful$eruptions
waiting = faithful$waiting
cov(duration, waiting)
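To make the definition concrete, here is a minimal sketch that computes the same sample covariance by hand and compares it with cov(). The name cov_manual is only illustrative; the n - 1 divisor is the one cov() uses by default.

```r
# Old Faithful data ships with base R (datasets package)
duration <- faithful$eruptions
waiting <- faithful$waiting

# Sample covariance from the definition: average product of the
# deviations from each mean, using the n - 1 divisor
n <- length(duration)
cov_manual <- sum((duration - mean(duration)) * (waiting - mean(waiting))) / (n - 1)

# Compare the hand-computed value with the built-in
all.equal(cov_manual, cov(duration, waiting))
```

The value is positive, which matches the intuition that longer eruptions go with longer waiting times.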
Covariance Matrix
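A covariance matrix collects the covariance of every pair of variables: variances sit on the diagonal and covariances off it. As a small sketch, assuming the same faithful dataset used on the previous slide, cov() applied to a whole data frame returns this matrix. The name S is only illustrative.

```r
# Covariance matrix of all columns in the faithful data frame
S <- cov(faithful)
S

# The matrix is symmetric, because cov(x, y) equals cov(y, x)
isSymmetric(S)

# The diagonal holds the variance of each variable
all.equal(unname(diag(S)),
          c(var(faithful$eruptions), var(faithful$waiting)))
```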
Eigenvectors
An eigenvector of a square matrix is a nonzero vector whose
direction is unchanged by the associated linear
transformation: the matrix only scales it, and the scale
factor is the corresponding eigenvalue.
R code
B <- matrix(1:9, 3)
eigen(B)
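As a quick check of the definition, the sketch below takes the first eigenvalue and eigenvector that eigen() returns for the same matrix B and confirms that multiplying by B only rescales the vector. The names e, lambda and v are only illustrative.

```r
B <- matrix(1:9, 3)
e <- eigen(B)

# First eigenvalue and its eigenvector (stored as a column)
lambda <- e$values[1]
v <- e$vectors[, 1]

# Defining property: B %*% v equals lambda * v, up to rounding error
all.equal(as.vector(B %*% v), lambda * v)
```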
Principal Component Analysis
R code
# load data
a = read.csv('my_data.csv')
# perform PCA
c = prcomp(a)
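To tie the concepts together, here is a minimal sketch showing that prcomp() agrees with the eigen-decomposition of the covariance matrix. It uses the built-in faithful dataset in place of my_data.csv, which is an assumption made so the example runs on its own. It also relies on the prcomp() defaults (centering on, scaling off). Eigenvectors are only defined up to sign, so the comparison uses absolute values.

```r
# PCA with prcomp (centers the data by default, does not scale it)
pca <- prcomp(faithful)

# Eigen-decomposition of the covariance matrix of the same data
e <- eigen(cov(faithful))

# Variances of the principal components equal the eigenvalues
all.equal(unname(pca$sdev^2), e$values)

# Loadings equal the eigenvectors, up to the sign of each column
all.equal(abs(unname(pca$rotation)), abs(e$vectors))

# Proportion of total variance explained by each component
pca$sdev^2 / sum(pca$sdev^2)
```

When variables are on very different scales, as eruptions and waiting are here, passing scale. = TRUE to prcomp() is often worth considering; that corresponds to working with the correlation matrix instead.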