microRNA-mRNA interaction identification in Wilms tumor using principal component analysis based unsupervised feature extraction
Y-h. Taguchi
Department of Physics
Chuo University
Tokyo
Japan
What is PCA based unsupervised FE?
N features × M samples: N × M matrix X (numerical values)
Samples: categorical multiclasses
In contrast to the usual usage of PCA, not samples but features are embedded into a Q-dimensional space.
[Figure: PCA of X. PC loadings are attributed to samples (PC1); PC scores are attributed to features (PC1 vs PC2), drawn as +]
Synthetic example
10 samples + 10 samples (two classes)
90 features (no distinction between classes) + 10 features
Distributions: N(0), N(μ), [N(μ) + N(0)]/2
N(μ): normal distribution with mean μ and SD ½
[Figure: PC scores of the features, PC1 vs PC2. +: top 10 outliers]
Accuracy (100 trials): 89.5% (…), 52.6% (…)
Thus, extracting outliers selects features distinct between two classes in an unsupervised way.
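The synthetic example can be sketched in Python with NumPy. The data layout is an assumption, since the exact assignment of distributions to the blocks of the matrix did not survive on the slide: here 90 features are drawn from N(0) for all 20 samples, and 10 features have mean μ in the second class only, with a hypothetical μ = 2 and SD ½. Features (rows) are embedded by PCA after centering each sample, and the 10 features farthest from the origin in the PC1-PC2 plane are taken as outliers. The function names are illustrative only.

```python
import numpy as np

def make_synthetic(mu=2.0, sd=0.5, n_null=90, n_signal=10, n_per_class=10, seed=0):
    """Return a (n_null + n_signal) x (2 * n_per_class) matrix.

    Assumed layout: only the last n_signal rows differ between the two classes.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sd, size=(n_null + n_signal, 2 * n_per_class))
    x[n_null:, n_per_class:] += mu  # signal features get mean mu in the second class
    return x

def feature_pc_scores(x, n_components=2):
    """Embed features (rows of x) into PC space; samples (columns) get the loadings."""
    centered = x - x.mean(axis=0, keepdims=True)  # center each sample over the features
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # one row per feature
    loadings = vt[:n_components].T                   # one row per sample
    return scores, loadings

def top_outliers(scores, k=10):
    """Indices of the k features farthest from the origin in PC space."""
    dist = np.linalg.norm(scores, axis=1)
    return np.sort(np.argsort(dist)[-k:])

x = make_synthetic()
scores, loadings = feature_pc_scores(x)
selected = top_outliers(scores)
print(selected)
```

No class labels are passed to any of these functions, which is the sense in which the selection is unsupervised.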
Application: microRNA-mRNA interaction identification in Wilms tumor
What is microRNA (miRNA)?
[Figure: DNA → mRNA → protein; miRNA targets mRNA]
Difficulty of inference of miRNA-mRNA interaction
* Too many pairs: mRNA ~ 10^4, miRNA ~ 10^3 → pairs ~ 10^7
* Computational prediction is sequence based
How to solve this problem?
Prescreening mRNA/miRNA based on differential expression (DE)
Ex.: functional miRNA-mRNA pairs in disease
→ mRNA/miRNA with significant DE: normal vs patients
[Figure: mRNA and miRNA expression in normal vs patients; matching mRNA and miRNA by negative correlation]
Problem: "significant DE" is arbitrary
Screening criteria: P-value + fold change (FC)
P-value: fixed number of mRNA/miRNA, N; variable sample number, M. M: large → P: small
FC: typical threshold: 2 or ½, but on what basis?
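The dependence of the P-value on the sample number M can be illustrated with a short calculation. This sketch assumes a two-sample z test with known SD (a simplification of the t test), a hypothetical fixed mean difference of 0.5 SD, and equal group sizes; none of these values come from the slides.

```python
import math

def two_sample_z_pvalue(mean_diff, sd, n_per_group):
    """Two-sided P-value of a two-sample z test with known, equal SD."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / standard_error
    return math.erfc(z / math.sqrt(2.0))

# Same effect size (hypothetical: 0.5 SD), increasing number of samples per group.
group_sizes = [5, 20, 80, 320]
p_values = [two_sample_z_pvalue(0.5, 1.0, n) for n in group_sizes]
for n, p in zip(group_sizes, p_values):
    print(f"n per group = {n:4d}  P = {p:.3g}")
```

The effect size never changes, yet a fixed P-value threshold is crossed as soon as enough samples are collected.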
Examples of "significant DE" in previous research
[Table: previous studies on cancers; surviving entries: "None", "No mention"]
In real studies... Control P-value and FC → good results/feasibility → no discussions
If biologically feasible, no problem?
(No discussion about P-value and FC)
→ "Which ones are DE mRNA/miRNA?"
→ A true answer exists (but is unknown)
→ A data driven strategy can help us
Idea: PCA based unsupervised FE
Fixed number of mRNA/miRNA, N; M: variable. What is convergent as M → ∞?
⇓
Distributions of PC scores (genes) should converge as M → ∞.
[Figure: gene expression matrix, N (mRNA/miRNA) × M (≪ N) samples, normal and patients]
PC loading (converge M → ∞): PC1 over the M samples, normal vs patients
Significance: t test, P < 0.05
PC scores (PC1 vs PC2): Gaussian (assumed), cf. probabilistic PCA
Outliers* = selected
*: multiple normal + χ² dist., BH corrected P-value < 0.01
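The outlier selection step can be sketched as follows. This is a minimal illustration, not the author's code: it assumes that two PCs are used, that each PC score is standardized by its own SD, and that the squared distance then follows a χ² distribution with two degrees of freedom under the Gaussian assumption (for which the upper tail probability is exactly exp(-x/2)). The function names and the toy scores are hypothetical.

```python
import numpy as np

def chi2_pvalues_2pc(scores):
    """P-values for the rows of an (N, 2) PC score array under a Gaussian null."""
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != 2:
        raise ValueError("expected an (N, 2) array of PC scores")
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    stat = (z ** 2).sum(axis=1)
    return np.exp(-stat / 2.0)  # chi-squared upper tail for 2 degrees of freedom

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest P-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Toy stand-in for PC scores: Gaussian bulk plus five hypothetical outlier features.
rng = np.random.default_rng(1)
toy_scores = rng.normal(0.0, 1.0, size=(1000, 2))
toy_scores[:5] += 8.0
adjusted = bh_adjust(chi2_pvalues_2pc(toy_scores))
selected = np.flatnonzero(adjusted < 0.01)
print(selected)
```

The 0.01 cutoff is applied to the adjusted P-values, so the number of selected features is decided by the data rather than by a fixed fold-change threshold.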
[Workflow figure]
Expression matrix (controls vs patients): mRNA, miRNA
→ Feature embedding
→ Outliers: mRNAs, miRNAs
→ miRNA-mRNA pairs (miRTarBase)
→ Reciprocal pairs
vs
Sequence based miRNA-mRNA interaction prediction
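The pairing step can be sketched as a filter over a table of validated interactions. Everything in this sketch is hypothetical: the miRNA and gene names, the direction labels, and the in-memory list standing in for miRTarBase. "Reciprocal" is assumed here to mean that a miRNA and its target mRNA change in opposite directions between controls and patients.

```python
def reciprocal_pairs(validated_pairs, mirna_direction, mrna_direction):
    """Keep validated (miRNA, mRNA) pairs whose selected members change in opposite directions.

    validated_pairs: iterable of (miRNA, mRNA) tuples, e.g. taken from a database.
    mirna_direction / mrna_direction: dicts mapping selected features to "up" or "down".
    """
    kept = []
    for mirna, mrna in validated_pairs:
        if mirna not in mirna_direction or mrna not in mrna_direction:
            continue  # at least one member was not selected as an outlier
        if mirna_direction[mirna] != mrna_direction[mrna]:
            kept.append((mirna, mrna))
    return kept

# Hypothetical toy inputs (not taken from the study).
validated = [("miR-a", "GENE1"), ("miR-a", "GENE2"), ("miR-b", "GENE3"), ("miR-c", "GENE1")]
selected_mirnas = {"miR-a": "up", "miR-b": "down"}
selected_mrnas = {"GENE1": "down", "GENE2": "up", "GENE3": "up"}

pairs = reciprocal_pairs(validated, selected_mirnas, selected_mrnas)
print(pairs)
```

Restricting the search to selected miRNAs and mRNAs is what keeps the number of candidate pairs far below the ~10^7 possible combinations.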
Results:
Samples: (Ludwig et al., IJMS, 2016)
         (P)atients  (N)ormal
mRNA     28          4
miRNA    62          4
Selected: mRNA 1114, miRNA 55
Discrimination (PCA+LDA+LOOCV)
       mRNA        miRNA
       P     N     P     N
P      27    0     61    0
N      1     4     1     4
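The overall accuracy implied by the two confusion matrices can be computed directly. The counts below read the duplicated digits on the slide as 61 and 4, which is consistent with the 62 patient and 4 normal miRNA samples; treating the diagonal as correct classifications is the usual reading of such a table.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Rows and columns are ordered P (patients), N (normal), as on the slide.
mrna_confusion = [[27, 0], [1, 4]]
mirna_confusion = [[61, 0], [1, 4]]

mrna_accuracy = accuracy(mrna_confusion)
mirna_accuracy = accuracy(mirna_confusion)
print(f"mRNA:  {mrna_accuracy:.3f}")
print(f"miRNA: {mirna_accuracy:.3f}")
```

In both cases a single sample is misclassified.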
R = -0.126 (P = 0.008)
R = -0.267 (P < 10^-16)
3,42
Survival analysis with genes targeted by multiple miRNAs (OncoLnc.org, Bold: Kidney cancers)
3,4
2
Conclusion:
Integrated analysis of mRNA expression, miRNA expression and mRNA-miRNA interaction enables us to identify more biologically feasible mRNAs than considering only differential expression of mRNAs.