ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond
![Page 1: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/1.jpg)
ArrayTrack
--- Data management, analysis and interpretation
tool for DNA microarray and beyond
![Page 2: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/2.jpg)
ArrayTrack – A brief history of the 5-year development cycle
• AT version 1 (2001)
  – Filter array; data management tool
• AT version 2 (2002): in-house microarray core facility
  – Customized two-color arrays; data management, analysis and interpretation
  – Open to the public (late 2003)
• AT version 3.1 (2004): VGDS
  – Affymetrix; analysis capability enhanced
• AT version 3.2 (2005): MAQC
  – Tested on 7 commercial platforms (Affy, Agilent one- and two-color arrays, ABI, CodeLink, Illumina …)
  – Integrated with other software (IPA, MetaCore, DrugMatrix, CEBS, SAS/JMP …)
• AT version 4 (2006 – present)
  – CDISC/SEND standard
  – VGDS VXDS
![Page 3: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/3.jpg)
ArrayTrack: Client-Server Architecture
[Diagram labels: SERVER; CLIENT; Pub data (gene annotation, pathways …); Study data (clinical and non-clinical data); Analysis tools; Microarray; Proteomics; Metabolomics; CDISC/SEND; MIAME; NCBI, KEGG, GO …]
![Page 4: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/4.jpg)
ArrayTrack: An Integrated Solution
[Diagram: ArrayTrack at the center, connected to microarray data, proteomics data, metabolomics data, chemical data, clinical and non-clinical data, and public data]
![Page 5: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/5.jpg)
ArrayTrack Website: http://www.fda.gov/nctr/science/centers/toxicoinformatics/ArrayTrack/
![Page 6: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/6.jpg)
ArrayTrack: MicroarrayDB-LIB-TOOL
- An integrated environment for microarray data management, analysis and interpretation
[Diagram labels: MicroarrayDB; LIB; TOOL; uploading; gene selection; exploring; interpretation (pathways, GO)]
![Page 7: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/7.jpg)
ArrayTrack for Microarray Data Management and Analysis
[Diagram labels: Hypothesis; Exp Design; Microarray Exp; Data management; Data analysis; Data interpretation]
ArrayTrack Components: MicroarrayDB, GeneLib, GeneTools
![Page 8: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/8.jpg)
MicroarrayDB
MicroarrayDB – Storing data associated with a microarray exp
Microarray database:
• Handling both one- and two-channel data, including Affy data
• Only the CEL file is required for Affy data
• Supporting toxicogenomics research by storing tox parameters, e.g., dose schedule and treatment, sacrifice time
• MIAME supportive, capturing the key data of a microarray experiment
• Will be MAGE-ML compliant to ensure interchangeability between ArrayTrack and other public databases
![Page 9: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/9.jpg)
LIB Component – Containing functional information for microarray data interpretation
[Diagram labels: MicroarrayDB; LIB; Public Databases; Human Genome Project; Mirrored Databases]
Functional data:
• Individual gene analysis
• Pathway-based analysis
• Gene Ontology-based analysis
• Linking expression data to the traditional toxicological data
![Page 10: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/10.jpg)
MicroarrayDB
LIB
TOOL
TOOL Component - Containing functionality for microarray data analysis
Analysis tools:
• Four normalization methods
  – Mean/median scaling for Affy data
  – LOWESS for 2-color array
• Gene selection method
  – T-test, permutation t-test, …
  – Filtering using fold changes, intensity, flag info …
  – Volcano plot, p-value plot …
• Data exploring (e.g., HCA, PCA)
• Many visualization tools (e.g., flexible scatter plot, bar chart viewer, …)
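The permutation t-test named under the gene selection methods can be illustrated with a minimal sketch. This is not ArrayTrack's own implementation: the function names, the choice of Welch's t statistic, the number of permutations and the sample intensities are all assumptions made for illustration.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (each needs >= 2 values)."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        # Degenerate case: no spread within either group.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def permutation_t_test(treated, control, n_perm=2000, seed=0):
    """Two-sided permutation p-value for one gene (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(welch_t(treated, control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:n_treated], pooled[n_treated:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical log2 intensities for a single gene.
treated = [8.1, 8.4, 8.3, 8.6, 8.2]
control = [6.0, 6.3, 5.9, 6.1, 6.2]
p_value = permutation_t_test(treated, control)
```

The p-value is the fraction of random relabelings whose statistic is at least as extreme as the observed one, so it needs no distributional assumption; the fixed seed only makes the sketch reproducible.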
![Page 11: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/11.jpg)
[Workflow diagram labels: Importing data; Normalization; Gene Selection; Interpretation; Data exploring; Apply to]
Supporting Eight Platforms
• Affy, Agilent, ABI, Combimatrix, Eppendorf, GE Healthcare, Illumina and customized arrays
• Affy data
  – Probe data (.cel file)
  – Probe-set data
Batch import
Individual hyb import
![Page 12: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/12.jpg)
MicroarrayDB
LIB
TOOL
![Page 13: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/13.jpg)
[Workflow diagram labels: Importing data; Normalization; Gene Selection; Interpretation; Data exploring; Apply to]
• Importing data: data uploading and QC
• Normalization: four normalization methods, including LOWESS
• Gene Selection: significant genes can be identified based on:
  – Cut-off of p-value (with or without Bonferroni correction), fold-change, intensity or combinations thereof
  – Volcano Plot (considering both p and fold-change)
  – P-Value Plot (considering false positives/negatives)
• Data exploring: PCA; 2-way HCA; Scatter Plot; expression pattern using the bar chart plot
• Interpretation: pathway analysis; Gene Ontology analysis; individual gene analysis
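The gene selection criteria on this slide (a p-value cut-off with or without Bonferroni correction, combined with fold-change) can be sketched as below. The function name, the default thresholds and the gene identifiers are hypothetical; the sketch assumes p-values and log2 fold-changes have already been computed for every gene.

```python
def select_genes(results, alpha=0.05, min_abs_log2_fc=1.0, bonferroni=True):
    """Return genes passing both the p-value and the fold-change cut-off.

    results maps a gene ID to (p_value, log2_fold_change).
    With Bonferroni correction the p-value cut-off is alpha divided by
    the number of genes tested.
    """
    if not results:
        return []
    threshold = alpha / len(results) if bonferroni else alpha
    return sorted(
        gene
        for gene, (p_value, log2_fc) in results.items()
        if p_value <= threshold and abs(log2_fc) >= min_abs_log2_fc
    )


# Hypothetical test results for four genes.
results = {
    "GENE_A": (0.001, 2.0),
    "GENE_B": (0.02, 1.5),
    "GENE_C": (0.0001, 0.3),
    "GENE_D": (0.5, -3.0),
}
strict = select_genes(results)                     # Bonferroni-corrected cut-off
relaxed = select_genes(results, bonferroni=False)  # raw p-value cut-off
```

Requiring both conditions corresponds to picking the upper-left and upper-right corners of a volcano plot, where genes are both statistically significant and strongly changed.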
![Page 14: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/14.jpg)
Data Interpretation - GO-based analysis using GOFFA
• GOFFA – Gene Ontology For Functional Analysis
• It is developed based on the Gene Ontology (GO) database
• Important for grouping the genes into functional classes
• GO – Three ontologies
  – Molecular function: activities performed by individual gene products at the molecular level, such as catalytic activity, transporter activity, binding
  – Biological process: broad biological goals accomplished by ordered assemblies of molecular functions, such as cell growth, signal transduction, metabolism
  – Cellular component: the place in the cell where a gene product is found, such as nucleus, ribosome, proteasome
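Grouping genes into functional classes can be sketched as follows. This is an illustration only, not a description of how GOFFA works internally: the annotation table, the gene IDs and both function names are hypothetical, and the hypergeometric over-representation p-value is a commonly used statistic that is assumed here rather than taken from the slides.

```python
from collections import defaultdict
from math import comb


def group_by_go_term(gene_list, annotations):
    """Group the genes of a list by the GO terms they are annotated with."""
    groups = defaultdict(list)
    for gene in gene_list:
        for term in annotations.get(gene, ()):
            groups[term].append(gene)
    return dict(groups)


def over_representation_p(k, n, K, N):
    """Hypergeometric P(X >= k).

    N: genes on the array, K: of those annotated with the term,
    n: genes in the list, k: genes in the list annotated with the term.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical annotations: gene ID -> GO terms (ontology: term).
annotations = {
    "g1": ["molecular function: catalytic activity"],
    "g2": ["molecular function: catalytic activity", "cellular component: nucleus"],
    "g3": ["cellular component: nucleus"],
    "g4": ["molecular function: binding"],
}
groups = group_by_go_term(["g1", "g2"], annotations)
p_catalytic = over_representation_p(k=2, n=2, K=2, N=4)
```

A small p-value suggests the term appears in the gene list more often than expected by chance given its frequency on the whole array.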
![Page 15: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/15.jpg)
[Diagram labels: Study domain (Study DB, TOOL); Array domain (MicroarrayDB, TOOL); LIB]
![Page 16: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/16.jpg)
[Workflow diagram labels: Importing data; Normalization; Gene Selection; Interpretation; Data exploring; Apply to]
Data Interpretation
• GOFFA: Gene Ontology-based tool
• Pathway-based tools:
  – Ingenuity Pathways Analysis
  – KEGG
  – PathArt
• Gene Annotation
![Page 17: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/17.jpg)
Ingenuity Pathways Analysis (IPA)
• KEGG and PathArt provide canonical pathways
• IPA provides both canonical and de novo pathways
[Diagram: Ingenuity Pathways Analysis: interrogate genes or proteins on “omics” scale; conduct statistical analysis; elucidate functional pathways; understand markers of efficacy and safety]
![Page 18: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/18.jpg)
Review Tool for Pharmacogenomics Data Submission: ArrayTrack
• Receive the data; support future regulatory policy
• Analyze the data
• Verify the biological interpretation
ArrayTrack Components: MicroarrayDB, Lib, Tool
[Diagram labels: Data repository; Analysis; Interpretation]
![Page 19: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/19.jpg)
Future Direction - Toxicoinformatics Integrated System (TIS)
[Diagram labels: MicroarrayDB, GeneLib, GeneTools; ProteomicsDB, ProteinLib, ProteinTools; MetabonomicsDB; PathwayLib, PathwayTools; ToxicantLib]
![Page 21: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/21.jpg)
ArrayTrack – Summary
• An integrated solution for microarray data management, analysis and interpretation
• Review tool for FDA pharmacogenomics data submission
  – A training course is provided to the FDA reviewers every two months
  – At present, ~40 reviewers have been trained
• Freely available to the public (http://edkb.fda.gov/webstart/arraytrack)
• Users at big Pharma, academic and government institutions; U.S., Europe & Asia
![Page 22: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/22.jpg)
ArrayTrack Tutorial
Topics Contents
1. (Basic) Comparing two groups (e.g., treated vs control groups)
  • Statistical methods (t-test, permutation t-test, ANOVA) for group comparison
  • Differentially Expressed Genes (DEGs) identification
  • Biological interpretation (individual gene analysis) using LIB
  • Pathway analysis (KEGG, PathArt, IPA, MetaCore, KeyMolnet)
  • Gene Ontology analysis using GOFFA
2. Comparing multiple groups (e.g., multiple doses, time points)
3. VennDiagram
  • Determine the common genes/pathways/functions shared by two or three gene lists (extended to cross-experiment and cross-platform comparison and systems biology)
  • Apply VennDiagram to the external files
4. Data exploring tools
  • Principal Component Analysis (PCA)
  • Hierarchical Cluster Analysis (HCA)
  • Apply HCA and PCA to the external files
  • Extensive features in HCA
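The VennDiagram topic (finding what two or three gene lists share) comes down to set operations. A minimal sketch, assuming each gene list is a plain collection of gene IDs; the function name, list names and IDs are hypothetical:

```python
def venn_regions(gene_lists):
    """Assign every gene to the exact combination of lists that contain it.

    gene_lists maps a list name to an iterable of gene IDs.
    Returns {frozenset of list names: set of gene IDs in exactly those lists}.
    """
    sets = {name: set(genes) for name, genes in gene_lists.items()}
    regions = {}
    for gene in set().union(*sets.values()):
        members = frozenset(name for name, genes in sets.items() if gene in genes)
        regions.setdefault(members, set()).add(gene)
    return regions


# Hypothetical gene lists from three comparisons.
gene_lists = {
    "dose_low": ["g1", "g2", "g3"],
    "dose_high": ["g2", "g3", "g4"],
    "time_24h": ["g3", "g5"],
}
regions = venn_regions(gene_lists)
shared_by_all = regions.get(frozenset(gene_lists), set())
```

Each key corresponds to one region of the Venn diagram, so the genes common to all lists sit under the key that names every list.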
![Page 23: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/23.jpg)
Topics Contents
5. Assessing gene expression profiles using BarChart
  • Access BarChart from the TOOL box
  • Access BarChart from the t-test result table
  • Access BarChart from ChipLib and other Libs
  • How to use BarChart for cross-experiment comparison
  • Assign group by color
6. GeneList – An important concept in ArrayTrack
  • Create a gene list through data filtering and statistical analysis
  • Import/export a gene list
  • Conduct normalization filtered by a gene list
  • Conduct statistical analysis (t-test/ANOVA, PCA, HCA and others) based on a gene list
  • Export a dataset by specifying the gene list (extended for cross-platform and cross-experiment comparison)
7. Normalization methods
  • For Affymetrix platform: MAS5, RMA, DChip, Plier, Plier+16
  • For other platforms: 7 methods (e.g., LOWESS)
8. How to create your own workspace
  • Copy/Paste/duplicate an experiment
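As an illustration of the normalization topic, the mean/median scaling mentioned earlier in these slides for Affy data can be sketched in a few lines. This is a generic sketch, not ArrayTrack's code: the function name, the target value of 1000 and the intensities are assumptions.

```python
import statistics


def median_scale(arrays, target=1000.0):
    """Rescale each hybridization so that its median intensity equals target.

    arrays maps a hybridization name to a list of raw intensities.
    """
    scaled = {}
    for name, intensities in arrays.items():
        median = statistics.median(intensities)
        if median == 0:
            raise ValueError(f"median intensity of {name} is zero")
        factor = target / median
        scaled[name] = [value * factor for value in intensities]
    return scaled


# Hypothetical raw intensities for two hybridizations.
raw = {
    "hyb1": [100, 200, 400],
    "hyb2": [50, 100, 200],
}
normalized = median_scale(raw)
```

After scaling, every hybridization has the same median, which removes overall brightness differences between arrays while leaving the relative intensities within each array unchanged.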
![Page 24: ArrayTrack --- Data management, analysis and interpretation tool for DNA microarray and beyond](https://reader030.fdocuments.us/reader030/viewer/2022033108/56815cb0550346895dcaade9/html5/thumbnails/24.jpg)
9. Import/Export
  • Manual import and batch import
  • Options of data exporting
  • Export a selected dataset by specifying a sub-list of genes
  • Export multiple experiments and/or platforms using selected gene ID types (e.g., RefSeq)
10. Other useful functions
  • Correlation matrix
  • IDConverter – converting one gene ID to another (e.g., from AffyID to AgilentID, GenBank# or LocusLinkID, or vice versa)
  • ScatterPlot – pair-wise plot
  • JoinTable – combine two tables
  • SplitTable – if a table contains multiple hybridization data in columns with genes in rows, the function splits the table into individual tables with single hybridization data
  • GetUniqueID – if a table contains duplicated IDs, the function picks out the unique IDs
11. Basic scripting for querying (raw and normalized) data and tables
  • Query data from the database tree (how to use *), e.g., *EST, EST*, *EST*=EST
  • Query data in tables, e.g., contain, like (%) and inlist
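The `*` wildcard behaviour described for database tree queries (`*EST` ends with, `EST*` starts with, `*EST*` contains, and a bare `EST` treated like `*EST*`) can be mimicked with Python's standard `fnmatch` module. This is a sketch of the matching rules only: the function name and the experiment names are hypothetical, and case-insensitive matching is an assumption.

```python
from fnmatch import fnmatchcase


def query_names(names, pattern):
    """Return the names matching a *-wildcard pattern.

    A pattern without any * is treated as a "contains" search,
    following the slide's note that *EST* = EST.
    Note: fnmatch also gives ? and [ ] a special meaning.
    """
    if "*" not in pattern:
        pattern = f"*{pattern}*"
    # Assumption: matching ignores case.
    return [name for name in names if fnmatchcase(name.upper(), pattern.upper())]


# Hypothetical experiment names in the database tree.
names = ["EST_liver", "rat_EST", "my_EST_set", "control"]
starts_with = query_names(names, "EST*")
ends_with = query_names(names, "*EST")
contains = query_names(names, "EST")
```

In table queries the `like (%)` operator plays the same role, with `%` taking the place of `*`.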