Page 1: GEO (Gene Expression Omnibus)

GEO (Gene Expression Omnibus)

Deepak Sambhara

Georgia Institute of Technology

21 June, 2006

Page 2: GEO (Gene Expression Omnibus)

What is GEO?

- A gene expression repository created by the NCBI

- Located at: http://www.ncbi.nlm.nih.gov/projects/geo

- Supports data submissions, browsing, query and retrieval.

- Organized on three levels: platforms, series, and samples
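
The three levels can be told apart by accession prefix. The following is a minimal sketch, assuming the standard GEO prefixes (GPL for platforms, GSE for series, GSM for samples, plus GDS for curated DataSets); the function name and the accession numbers in the usage lines are illustrative placeholders, not part of the slides.

```python
import re

# Accession prefixes used by GEO. GDS (curated DataSets) is listed
# alongside the three submitter-supplied levels.
GEO_LEVELS = {
    "GPL": "platform",
    "GSE": "series",
    "GSM": "sample",
    "GDS": "dataset",
}

_ACCESSION = re.compile(r"^(GPL|GSE|GSM|GDS)(\d+)$")


def classify_accession(accession: str) -> str:
    """Return the GEO level for an accession, or raise ValueError."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised GEO accession: {accession!r}")
    return GEO_LEVELS[match.group(1)]


# Usage with placeholder accession numbers.
for acc in ("GPL0001", "GSE0001", "GSM0001"):
    print(acc, "->", classify_accession(acc))
```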

Page 3: GEO (Gene Expression Omnibus)

Why Use GEO?

- Validating PADRE by invalidating public data

- Thorough data for microarray experiments

- Designing interface of MAGMA

Page 4: GEO (Gene Expression Omnibus)

Background and Significance

- MIAME (Minimum Information About a Microarray Experiment) Compliant

- Effort to help standardize publicly available data

- http://www.mged.org/Workgroups/MIAME/

MIAME CHECKLIST

- Experimental Design

- Samples used, extract preparation and labeling

- Hybridization procedures and parameters

- Measurement data and specifications

- Array Design
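
The checklist above can be treated as a set of required sections. This is only a sketch of that idea: the dictionary keys and the function name are hypothetical, and it does not reflect how GEO itself validates submissions.

```python
# One key per MIAME checklist item from the slide (key names are hypothetical).
MIAME_SECTIONS = (
    "experimental_design",
    "samples_extract_labeling",
    "hybridization",
    "measurement_data",
    "array_design",
)


def missing_miame_sections(record):
    """Return the checklist sections that are absent or blank in `record`."""
    missing = []
    for section in MIAME_SECTIONS:
        value = record.get(section)
        if value is None or not str(value).strip():
            missing.append(section)
    return missing


# Usage: a hypothetical experiment description with two sections filled in.
record = {
    "experimental_design": "two-class comparison",
    "array_design": "single-channel oligonucleotide array",
}
print(missing_miame_sections(record))
```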

Page 5: GEO (Gene Expression Omnibus)

QUERY Search

- Search by Data Sets, Gene profiles, GEO Accession numbers, or GEO BLAST

- Can modify queries using search tabs on results page

- Search tabs: limits, history, clipboard, and query translation

E.g. Filter for only experiments with .CEL files
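
The same kind of filtered query can be expressed as a URL. The sketch below only builds the URL and makes no network call. It assumes the NCBI E-utilities `esearch` endpoint with `db=gds`, and it assumes that `cel[suppFile]` is the field tag for restricting results to records with .CEL supplementary files; neither appears in the slides, so check both against current NCBI documentation before relying on them.

```python
from urllib.parse import urlencode

# E-utilities search endpoint (an assumption; the slides use the web interface).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, cel_only=False, retmax=20):
    """Build an esearch URL for GEO DataSets; optionally require .CEL files."""
    if cel_only:
        # Assumed field tag for supplementary file type.
        term = f"({term}) AND cel[suppFile]"
    params = {"db": "gds", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Usage: the "cancer" query from the demo, restricted to .CEL files.
print(build_geo_search_url("cancer", cel_only=True))
```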

Page 6: GEO (Gene Expression Omnibus)

QUERY Results

- Listed by relevance; sortable by: datasets, platforms and series

- Up to 500 results per page; each result shows a summary of the experiment, and results can be listed as briefs, PubMed links, etc.

- If .CEL files exist, downloadable on results page.

- Click GEO accession number to access experiment page

Page 7: GEO (Gene Expression Omnibus)

Browsing

- Can browse by data sets (Result page with all experiments) or GEO Accessions

- GEO Accessions can be browsed by Platforms, Samples, or Series

Page 8: GEO (Gene Expression Omnibus)

Demo

Go to http://www.ncbi.nlm.nih.gov/projects/geo

Page 9: GEO (Gene Expression Omnibus)

Search data sets for “cancer”

Page 10: GEO (Gene Expression Omnibus)

Download .CEL files

Click GEO Accession link to access experiment

Page 11: GEO (Gene Expression Omnibus)

Take note of chip platform

Find the corresponding .pdf document using PubMed IDs

Take note of the classes and the number of arrays

Page 12: GEO (Gene Expression Omnibus)

Download DataSet file (Raw data) and Annotation file

DataSet SOFT file lists gene expression for all patients
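
Once downloaded, a DataSet SOFT file can be read with a few lines of code. The sketch below assumes the usual layout of a GDS SOFT file: `!key = value` metadata lines, then a tab-separated table between `!dataset_table_begin` and `!dataset_table_end` whose first two columns are `ID_REF` and `IDENTIFIER`. The embedded sample is invented for illustration and is not real GEO data.

```python
def _to_float(cell):
    """Convert a table cell to float; 'null' or blank becomes None."""
    cell = cell.strip()
    if cell == "" or cell.lower() == "null":
        return None
    return float(cell)


def parse_gds_soft(text):
    """Return (metadata, sample_ids, expression) parsed from GDS SOFT text."""
    metadata, samples, expression = {}, [], {}
    in_table, header = False, None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("!dataset_table_begin"):
            in_table, header = True, None
        elif line.startswith("!dataset_table_end"):
            in_table = False
        elif in_table:
            fields = line.split("\t")
            if header is None:
                # First table row holds column names; samples start at column 3.
                header = fields
                samples = fields[2:]
            else:
                expression[fields[0]] = [_to_float(v) for v in fields[2:]]
        elif line.startswith("!") and " = " in line:
            key, value = line[1:].split(" = ", 1)
            metadata.setdefault(key, value)  # keep the first value seen per key
    return metadata, samples, expression


# Invented sample in the assumed layout (placeholder accessions and values).
SAMPLE_SOFT = "\n".join([
    "^DATASET = GDS0000",
    "!dataset_title = Hypothetical example data set",
    "#ID_REF = Platform reference identifier",
    "#IDENTIFIER = Gene symbol",
    "!dataset_table_begin",
    "ID_REF\tIDENTIFIER\tGSM0000001\tGSM0000002",
    "probe_a\tGENE_A\t7.5\tnull",
    "probe_b\tGENE_B\t3.25\t4.0",
    "!dataset_table_end",
])

metadata, samples, expression = parse_gds_soft(SAMPLE_SOFT)
print(metadata["dataset_title"], samples, expression)
```

Real files are much larger, so reading line by line from an open file handle rather than from one string would be the natural next step.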

Page 13: GEO (Gene Expression Omnibus)

Web-based analysis through Hierarchical Clustering, Value Distributions and t-tests
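
To make the clustering step concrete, here is a small sketch of agglomerative hierarchical clustering. It assumes average linkage and Euclidean distance; the slides do not say which linkage or distance GEO uses, and the sample names and values are invented.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Merge the closest clusters until `n_clusters` remain.

    profiles: dict mapping a sample name to its list of expression values.
    Returns a sorted list of sorted name lists.
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and the number of profiles")

    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        dists = [euclidean(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Usage with invented two-gene profiles for four samples.
profiles = {"s1": [1.0, 1.0], "s2": [1.1, 0.9], "s3": [5.0, 5.0], "s4": [5.2, 4.9]}
print(average_linkage(profiles, 2))
```

This naive version recomputes every pairwise linkage at each step, which is fine for a handful of arrays but too slow for thousands of probes.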

Page 14: GEO (Gene Expression Omnibus)

Can plot selected gene profiles using a region of interest box

Page 15: GEO (Gene Expression Omnibus)

Click Value Distribution to view the distribution of average gene expression values, which helps with outlier detection
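
GEO presents value distributions as a plot to be inspected by eye. As a rough numerical counterpart, the sketch below flags samples whose median expression sits far from the other samples. The rule used (distance from the median of medians, measured in median absolute deviations, with a threshold `k`) is an assumption chosen for illustration, not GEO's method, and the data are invented.

```python
from statistics import mean, median


def sample_summaries(expression):
    """Per-sample mean and median; `expression` maps sample -> list of values."""
    return {s: {"mean": mean(v), "median": median(v)} for s, v in expression.items()}


def flag_outlier_samples(expression, k=3.0):
    """Return samples whose median is more than k MADs from the typical median."""
    medians = {s: median(v) for s, v in expression.items()}
    center = median(medians.values())
    mad = median(abs(m - center) for m in medians.values())
    if mad == 0:
        return []  # no spread between samples, so nothing can be flagged
    return sorted(s for s, m in medians.items() if abs(m - center) > k * mad)


# Usage with invented values: sample "e" sits far above the others.
expression = {
    "a": [5.0, 6.0, 7.0],
    "b": [5.1, 6.1, 7.1],
    "c": [4.9, 5.9, 6.9],
    "d": [5.2, 6.2, 7.2],
    "e": [11.0, 12.0, 13.0],
}
print(flag_outlier_samples(expression))
```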

Page 16: GEO (Gene Expression Omnibus)

One- or two-tailed t-tests are run to compare two classes in the experiment

Significance levels can be adjusted from 0.001 to 0.100
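
The class comparison can be sketched offline as follows. This assumes a Student's two-sample t-test with pooled (equal) variance, since the slides do not say which variant GEO runs. The p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed. All function names and data are illustrative.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value: 1 minus the area under the density on [-|t|, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # integrate 0..|t| with Simpson's rule, then double
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2 * total * h / 3)


def student_t_test(class_a, class_b, tails=2):
    """Return (t, p) for a pooled-variance t-test between two classes."""
    n_a, n_b = len(class_a), len(class_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(class_a) + (n_b - 1) * variance(class_b)) / df
    if pooled == 0:
        raise ValueError("both classes have zero variance")
    t_stat = (mean(class_a) - mean(class_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    p = t_two_tailed_p(t_stat, df)
    # Halving is valid only if the observed direction is the hypothesised one.
    return t_stat, (p if tails == 2 else p / 2)


def significant_probes(expression, idx_a, idx_b, alpha=0.05, tails=2):
    """Probe IDs whose p-value is below alpha (0.001 to 0.100, as on the slide)."""
    if not 0.001 <= alpha <= 0.100:
        raise ValueError("alpha must lie between 0.001 and 0.100")
    hits = []
    for probe, values in expression.items():
        a = [values[i] for i in idx_a]
        b = [values[i] for i in idx_b]
        if student_t_test(a, b, tails)[1] < alpha:
            hits.append(probe)
    return hits


# Usage with invented values: columns 0-2 are class A, columns 3-5 are class B.
probes = {
    "probe_up": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "probe_flat": [2.0, 2.1, 1.9, 2.0, 2.05, 1.95],
}
print(significant_probes(probes, [0, 1, 2], [3, 4, 5], alpha=0.001))
```

Because `math.gamma` overflows for large arguments, this sketch suits the small class sizes typical of a single experiment; a statistics library would be the more robust choice for anything larger.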

Page 17: GEO (Gene Expression Omnibus)

Shows Probe Set IDs found significant based on the chosen class comparisons

Page 18: GEO (Gene Expression Omnibus)

Features

PROS

- User-friendly interface

- MIAME Compliant

- Web based analysis

- Raw data/Annotation files available

- Vastly expansive/thorough compared to other microarray databases

CONS

- GSE series/ GDS series differences

- Must have PubMed ID

- .CEL files not available for all datasets

- .CEL files are individually zipped

- No Quality Control Information