GEO (Gene Expression Omnibus)


Deepak Sambhara

Georgia Institute of Technology

21 June, 2006


What is GEO?

- A gene expression repository created by the NCBI

- Located at: http://www.ncbi.nlm.nih.gov/projects/geo

- Supports data submissions, browsing, query and retrieval.

- Organized on three levels: platforms, series, and samples
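The three levels can be told apart by accession prefix. The sketch below is a minimal illustration, not part of the original slides: it assumes the standard GEO prefixes (GPL for platforms, GSE for series, GSM for samples) and also recognizes GDS, the curated DataSet records that appear later in this deck. The function name and the accession numbers used to exercise it are hypothetical.

```python
import re

# Accession prefixes used by GEO records (GDS is the curated DataSet type,
# which sits alongside the three levels named on the slide).
GEO_PREFIXES = {
    "GPL": "platform",
    "GSE": "series",
    "GSM": "sample",
    "GDS": "dataset",
}

_ACCESSION_RE = re.compile(r"^(GPL|GSE|GSM|GDS)(\d+)$")


def classify_accession(accession):
    """Return the GEO record level for an accession, or None if unrecognized."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return GEO_PREFIXES[match.group(1)]


# Hypothetical accession numbers, used only to exercise the function.
for acc in ("GPL96", "GSE1000", "GSM12345", "GDS500", "XYZ1"):
    print(acc, "->", classify_accession(acc))
```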


Why Use GEO?

- Validating PADRE against public data

- Thorough data for microarray experiments

- Designing the interface of MAGMA


Background and Significance

- MIAME (Minimum Information About a Microarray Experiment) compliant

- An effort to help standardize publicly available data

- http://www.mged.org/Workgroups/MIAME/

MIAME CHECKLIST

- Experimental Design

- Samples used, extract preparation and labeling

- Hybridization procedures and parameters

- Measurement data and specifications

- Array Design


QUERY Search

- Search by DataSets, gene profiles, GEO accession numbers, or GEO BLAST

- Can modify queries using search tabs on results page

- Search tabs: limits, history, clipboard, and query translation

E.g. Filter for only experiments with .CEL files
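The slides describe searching through the web interface. For scripted searches, one option is NCBI's E-utilities, which expose the GEO DataSets database under the name `gds`. The sketch below only builds the request URL and performs no network access; the function name and the default `retmax` value are assumptions made for illustration. Any field-qualified term (for example, to restrict results to experiments with .CEL files) should be checked against GEO's query documentation before use.

```python
from urllib.parse import urlencode

# Base URL of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, retmax=20):
    """Build an ESearch URL that queries GEO DataSets (db=gds) for `term`.

    `retmax` caps how many record IDs the service returns.
    """
    params = {"db": "gds", "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Same search term as the demo later in the deck.
url = build_geo_search_url("cancer")
print(url)
```

Fetching the URL (for instance with `urllib.request.urlopen`) returns an XML document listing matching record IDs.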


QUERY Results

- Listed by relevance; sortable by datasets, platforms, and series

- Up to 500 results per page; each result shows a summary of the experiment, and results can be displayed as briefs, PubMed links, etc.

- If .CEL files exist, downloadable on results page.

- Click GEO accession number to access experiment page


Browsing

- Can browse by data sets (Result page with all experiments) or GEO Accessions

- GEO Accessions can be browsed by Platforms, Samples, or Series


Demo

Go to: http://www.ncbi.nlm.nih.gov/projects/geo


Search data sets for “cancer”


Download .CEL files

Click GEO Accession link to access experiment


Take note of chip platform

Find the corresponding .pdf document using PubMed IDs

Take note of the classes and the number of arrays


Download DataSet file (Raw data) and Annotation file

DataSet SOFT files list gene expression for all patients
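A downloaded DataSet SOFT file is plain text: metadata lines start with `^`, `!`, or `#`, and the expression table sits between `!dataset_table_begin` and `!dataset_table_end` as tab-separated rows. The sketch below reads that table from an in-memory string. The accession, sample IDs, and expression values in the sample text are hypothetical, and the treatment of `null` as a missing value is an assumption about how absent measurements are written.

```python
# Tiny made-up SOFT excerpt; all accessions and values are hypothetical.
SAMPLE_SOFT = (
    "^DATASET = GDS0000\n"
    "!dataset_title = Hypothetical example\n"
    "#ID_REF = Platform reference identifier\n"
    "!dataset_table_begin\n"
    "ID_REF\tIDENTIFIER\tGSM0001\tGSM0002\n"
    "1007_s_at\tDDR1\t120.5\t98.1\n"
    "1053_at\tRFC2\tnull\t45.0\n"
    "!dataset_table_end\n"
)


def parse_dataset_table(soft_text):
    """Return (sample_ids, rows); rows maps probe ID -> list of floats or None."""
    in_table = False
    table_lines = []
    for line in soft_text.splitlines():
        if line.startswith("!dataset_table_begin"):
            in_table = True
        elif line.startswith("!dataset_table_end"):
            break
        elif in_table and line:
            table_lines.append(line)
    if not table_lines:
        return [], {}

    # First table line is the header: ID_REF, IDENTIFIER, then one column per sample.
    header = table_lines[0].split("\t")
    sample_ids = header[2:]
    rows = {}
    for line in table_lines[1:]:
        fields = line.split("\t")
        # Assumption: missing measurements appear as "null" or an empty field.
        rows[fields[0]] = [
            None if value in ("", "null") else float(value) for value in fields[2:]
        ]
    return sample_ids, rows


sample_ids, rows = parse_dataset_table(SAMPLE_SOFT)
print(sample_ids)
print(rows)
```

For a real file, the same function can be applied to the text read from disk (decompressing first if the download is gzipped).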


Web-based analysis through Hierarchical Clustering, Value Distributions and t-tests


Can plot selected gene profiles using a region of interest box


Click Value Distribution to see the distribution of average gene expression values, which helps with outlier detection
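The same kind of check can be approximated offline once expression values are in hand. The sketch below is one possible approach, not a description of how GEO computes its plot: it summarizes each sample with a five-number summary and flags samples whose median sits far from the median of all sample medians. The function names, the `tolerance` parameter, and the example values are all hypothetical.

```python
import statistics


def summarize_sample(values):
    """Five-number summary of one sample, ignoring missing (None) values.

    Needs at least two non-missing values.
    """
    clean = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(clean, n=4)
    return {
        "min": clean[0],
        "q1": q1,
        "median": statistics.median(clean),
        "q3": q3,
        "max": clean[-1],
    }


def flag_outlier_samples(samples, tolerance=2.0):
    """Return names of samples whose median is more than `tolerance`
    away from the median of all sample medians (an assumed, simple rule)."""
    medians = {name: summarize_sample(vals)["median"] for name, vals in samples.items()}
    center = statistics.median(medians.values())
    return sorted(name for name, m in medians.items() if abs(m - center) > tolerance)


# Made-up expression values for three hypothetical samples.
samples = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 3, 4, 5, 6],
    "C": [11, 12, 13, 14, 15],
}
print(flag_outlier_samples(samples))
```

A fixed tolerance is crude; on real data a scale-aware threshold (for example, based on the spread of the medians) would likely be a better choice.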


One- or two-tailed t-tests are run to compare two classes in the experiment

Significance levels can be adjusted from 0.001 to 0.100
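To make the class comparison concrete, the sketch below computes a two-sample t statistic for one gene across two classes. The slides do not say which t-test variant GEO uses, so this assumes the equal-variance (pooled) Student's t-test; the function name and the expression values are hypothetical. Turning the statistic into a p-value to compare against a chosen significance level needs the t-distribution, which is not in the Python standard library, so that step is left out here.

```python
import math
import statistics


def student_t(class_a, class_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom.

    Assumes equal variances in both classes and at least two values per class.
    Raises ZeroDivisionError if every value in both classes is identical.
    """
    n_a, n_b = len(class_a), len(class_b)
    mean_a, mean_b = statistics.fmean(class_a), statistics.fmean(class_b)
    var_a, var_b = statistics.variance(class_a), statistics.variance(class_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Made-up expression values for one gene in two hypothetical classes.
t_stat, dof = student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(t_stat, 4), dof)
```

For a two-tailed test the p-value uses both tails of the t-distribution with `dof` degrees of freedom; a one-tailed test uses only the tail in the hypothesized direction.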


Shows Probe Set IDs found significant based on the chosen class comparisons


Features

PROS

- User-friendly interface

- MIAME compliant

- Web-based analysis

- Raw data/Annotation files available

- Vastly expansive/thorough compared to other microarray databases

CONS

- GSE series/GDS series differences

- Must have PubMed ID

- .CEL files not available for all datasets

- .CEL files are individually zipped

- No Quality Control Information