08-10-2004 MCGH Analyzer. Hans A. Kestler, André Müller.


MCGH Analyzer

Hans A. Kestler, André Müller


Data processing steps

• Scanning of the DNA chips (normal and switched)
  – 2 channels (Cy5 and Cy3)

• Compute the mean/median over the pixels

• Further processing with MCGH Software


MCGH software

• Background reduction
  calculate intensities according to the background

• Quality control of the spots
  reject spots not fitting the quality criteria

• Accumulate spots to clones

• Check test
  reject clones not fitting the visual options

• Select control clones

• Reduce control clones

• Main calculation loop


Overview


Background reduction

Background reduction to get intensities
1. No reduction
2. Fixed reduction
3. Local reduction
4. Global reduction
5. Local + Fixed reduction
6. Global + Fixed reduction

Compute log ratios
• log( IntCy3 / IntCy5 )
• log( IntCy5 / IntCy3 )
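The six reduction modes and the two log-ratio directions could be sketched as follows. This is only an illustration: the function names, the mode strings, the reading of "local" as a per-spot background and "global" as a chip-wide background, and the base-2 logarithm are all assumptions, since the slides do not specify them.

```python
import math

def reduce_background(fg, local_bg, global_bg, fixed, mode):
    """Return the background-corrected intensity of one spot in one channel.

    fg        -- foreground intensity of the spot
    local_bg  -- background measured around this spot (assumed meaning of "local")
    global_bg -- background estimated over the whole chip (assumed meaning of "global")
    fixed     -- constant offset chosen by the user
    """
    if mode == "none":
        return fg
    if mode == "fixed":
        return fg - fixed
    if mode == "local":
        return fg - local_bg
    if mode == "global":
        return fg - global_bg
    if mode == "local+fixed":
        return fg - local_bg - fixed
    if mode == "global+fixed":
        return fg - global_bg - fixed
    raise ValueError(f"unknown reduction mode: {mode}")

def log_ratio(int_cy3, int_cy5, cy3_over_cy5=True):
    """Return the log ratio of two corrected intensities (base 2 is an assumption).

    Returns None when either intensity is not positive, because the
    logarithm is undefined there.
    """
    if int_cy3 <= 0 or int_cy5 <= 0:
        return None
    ratio = int_cy3 / int_cy5 if cy3_over_cy5 else int_cy5 / int_cy3
    return math.log2(ratio)
```

For example, `reduce_background(1000, 100, 120, 50, "local+fixed")` subtracts both the local background and the fixed offset before the ratio is formed.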


Quality control

Reject spots with

• flags marked by the scanning software (bad, not found, absent, normal ...)

• a background intensity brighter than the foreground (new!)

• Min/Max reduction:
  – Reject the n smallest ratios
  – Reject the n largest ratios
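A minimal sketch of these three rejection rules is given below. The dictionary keys, the set of flags treated as bad, and the choice to apply the Min/Max reduction only to spots that survived the first two rules are assumptions made for illustration.

```python
def reject_spots(spots, bad_flags=frozenset({"bad", "not found", "absent"}),
                 n_smallest=0, n_largest=0):
    """Return the set of indices of rejected spots.

    Each spot is a dict with the hypothetical keys
    "flag", "fg" (foreground), "bg" (background) and "ratio".
    """
    rejected = set()
    for i, spot in enumerate(spots):
        # Rule 1: flag set by the scanning software.
        # Rule 2: background brighter than the foreground.
        if spot["flag"] in bad_flags or spot["bg"] > spot["fg"]:
            rejected.add(i)

    # Rule 3: Min/Max reduction on the remaining spots, ordered by ratio.
    remaining = sorted(
        (i for i in range(len(spots)) if i not in rejected),
        key=lambda i: spots[i]["ratio"],
    )
    if n_smallest > 0:
        rejected.update(remaining[:n_smallest])
    if n_largest > 0:
        rejected.update(remaining[-n_largest:])
    return rejected
```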


Spots to clones

Accumulate the non-rejected spot values

• Mean

• Standard deviation

• Median

over

• Intensities (Cy3, Cy5)

• log Ratios

New feature: Reject clones with fewer than SpotLowerBound valid spots.
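The accumulation step could look roughly like this for the log ratios (the intensities would be handled the same way). The input layout, a list of `(clone_id, log_ratio)` pairs for the non-rejected spots, and the function name are assumptions; only `SpotLowerBound` comes from the slide.

```python
from collections import defaultdict
from statistics import mean, median, stdev

def accumulate_spots(valid_spots, spot_lower_bound=2):
    """Group non-rejected spot values by clone and summarise each clone.

    valid_spots      -- iterable of (clone_id, log_ratio) pairs
    spot_lower_bound -- clones with fewer valid spots are marked rejected
    """
    by_clone = defaultdict(list)
    for clone_id, value in valid_spots:
        by_clone[clone_id].append(value)

    clones = {}
    for clone_id, values in by_clone.items():
        clones[clone_id] = {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            # The sample standard deviation needs at least two values.
            "sd": stdev(values) if len(values) > 1 else 0.0,
            "rejected": len(values) < spot_lower_bound,
        }
    return clones
```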


Check test

Reject clones if at least one of these conditions holds:

1. Me(di)an background intensity > Background upper bound

2. Me(di)an Cy3 intensity < Me(di)an Cy3 background intensity × Intensity lower bound

3. Standard deviation Cy3 Ratio > Ratio SD upper bound
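The three conditions can be written as a small predicate. The dictionary keys are hypothetical, and using a single background value for condition 1 is an assumption, because the slide does not say which channel is meant.

```python
def failed_checks(clone, background_upper_bound, intensity_lower_bound,
                  ratio_sd_upper_bound):
    """Return the numbers (1, 2, 3) of the check-test conditions a clone violates.

    An empty list means the clone passes; any entry means it is rejected.
    """
    failed = []
    # 1. Background too bright.
    if clone["background"] > background_upper_bound:
        failed.append(1)
    # 2. Cy3 signal too weak relative to its background.
    if clone["cy3"] < clone["cy3_background"] * intensity_lower_bound:
        failed.append(2)
    # 3. Ratio too noisy across the spots of the clone.
    if clone["ratio_sd"] > ratio_sd_upper_bound:
        failed.append(3)
    return failed
```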


Select control clones

Only non-rejected clones will be selected as control clones.

• Manual selection
  Select clones with id = '91' or 'k' or 'K' or '?91' as control clones

• Automatic selection
  – No [AutoBand]: [CutoffPercentage] clones from the middle band
  – [AutoBand]: Select band around the median
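One possible reading of the three selection modes is sketched below. The slides do not define how the middle band or the band around the median is computed, so taking the central `cutoff_percentage` percent of the clones ordered by ratio, and a fixed half-width around the median, are both assumptions, as are all function and key names.

```python
from statistics import median

# Ids that mark a manually selected control clone (taken from the slide).
CONTROL_IDS = {"91", "k", "K", "?91"}

def select_manual(clones):
    """Pick clones whose id marks them as controls."""
    return [c for c in clones if c["id"] in CONTROL_IDS]

def select_middle(clones, cutoff_percentage):
    """Pick the central cutoff_percentage percent of clones, ordered by ratio."""
    ordered = sorted(clones, key=lambda c: c["ratio"])
    keep = max(1, round(len(ordered) * cutoff_percentage / 100))
    start = (len(ordered) - keep) // 2
    return ordered[start:start + keep]

def select_band(clones, half_width):
    """Pick clones whose ratio lies within half_width of the median ratio."""
    centre = median(c["ratio"] for c in clones)
    return [c for c in clones if abs(c["ratio"] - centre) <= half_width]
```

In all three cases the input list is expected to contain only non-rejected clones.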


Reduce control clones

Some of the control clones will be rejected ...

• [Cutoff Percentage]
  Reject the n smallest ratios

• Without [Cutoff Band]
  Reject the n largest ratios

• [Cutoff Band]
  Reject band around the median


Main calculation loop

1. Calculate control means (the mean/median over all control clones/spots)

2. Normalize ratios (subtract control mean from the ratio)

3. Calculate tolerance value T

$$T^2 = \frac{s^2 \, t^2_{n-1,\,\alpha/2}}{n}$$

   s: standard deviation of the ratios of the observed clone
   n: the number of valid spots in this clone
   t: value of the t-statistic
   α: significance level

4. [ Force T-Test ] Reject clones with T > [ Force T Value ]

5. [ C Check ] Replace tolerance values with possibly greater values.

6. Find the clone with maximum tolerance and reject it if its tolerance value T is > [ Force T Value ]

7. Perform [ T Test ] and evaluate result value.

Everything has to be recalculated if a control clone is rejected.
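Steps 2 and 3 can be illustrated with a short sketch. It assumes that T is the usual half-width of a t-based confidence interval, T = t · s / √n, which is a reading of the slide rather than something it states explicitly. The t-value is passed in by the caller so that no statistics library is needed; the function names are hypothetical.

```python
import math
from statistics import stdev

def normalize(ratios, control_mean):
    """Step 2: subtract the control mean from every ratio."""
    return [r - control_mean for r in ratios]

def tolerance(ratios, t_value):
    """Step 3 (assumed form): T = t * s / sqrt(n).

    ratios  -- ratios of the valid spots of one clone
    t_value -- critical value of the t-distribution for n - 1 degrees
               of freedom at the chosen significance level
    """
    n = len(ratios)
    if n < 2:
        raise ValueError("at least two valid spots are needed")
    return t_value * stdev(ratios) / math.sqrt(n)
```

For a clone with four valid spots and a two-sided 5 % level, the caller would pass `t_value=3.182` (three degrees of freedom).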


The C Check

The clone tolerance values are now recalculated according to the following scheme:

[Equation for T_new garbled in the source; it is computed from a t-value and the quantities listed below.]

$m_r$: mean over ratios of the current clone
$s_r^2$: variance over ratios of the current clone
$n_r$: number of valid spots in this clone
$m_c$: mean over control spot ratios
$s_c^2$: variance over control spot ratios
$n_c$: number of control spots

If the new tolerance value is greater than the old one, T is replaced by the new value.


The T Test

If [ Force T ] is set, T is set to the [ Force T Value ]; otherwise it is the greatest tolerance value found in the clones.

$$\hat{z}_1 = \frac{m_r - m_c - T}{\sqrt{\dfrac{(n_c-1)\,s_c^2 + (n_r-1)\,s_r^2}{n_c+n_r-2}\cdot\dfrac{n_c+n_r}{n_c\,n_r}}}$$

$$\hat{z}_2 = \frac{m_r - m_c + T}{\sqrt{\dfrac{(n_c-1)\,s_c^2 + (n_r-1)\,s_r^2}{n_c+n_r-2}\cdot\dfrac{n_c+n_r}{n_c\,n_r}}}$$

$m_r$: mean over ratios of the current clone
$s_r^2$: variance over ratios of the current clone
$n_r$: number of valid spots in this clone
$m_c$: mean over control spot ratios
$s_c^2$: variance over control spot ratios
$n_c$: number of control spots
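The two test statistics can be sketched in code as follows. The sketch assumes the standard pooled two-sample standard error and the sign convention m_r − m_c ∓ T; neither the ordering of the means nor the sign of T can be confirmed from the slide, and the function names are hypothetical.

```python
import math

def pooled_standard_error(n_r, var_r, n_c, var_c):
    """Pooled standard error of the difference between two means."""
    pooled_var = ((n_c - 1) * var_c + (n_r - 1) * var_r) / (n_c + n_r - 2)
    return math.sqrt(pooled_var * (n_c + n_r) / (n_c * n_r))

def z_statistics(m_r, var_r, n_r, m_c, var_c, n_c, tolerance):
    """Return (z1, z2): the mean difference shifted by -T and +T,
    each divided by the pooled standard error (assumed convention)."""
    se = pooled_standard_error(n_r, var_r, n_c, var_c)
    diff = m_r - m_c
    return (diff - tolerance) / se, (diff + tolerance) / se
```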


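The T Test (2)

Calculation of the result value R

• No [ T Test ]: thresholding

$$R = \begin{cases} 1 & \text{if } m_r - m_c > t^{+} \\ -1 & \text{if } m_r - m_c < t^{-} \\ 0 & \text{if } t^{-} \le m_r - m_c \le t^{+} \end{cases}$$

  $t^{+}$: positive threshold value
  $t^{-}$: negative threshold value

• [ T Test ]

[Decision rule garbled in the source; R takes the value 1, −1 or 0 depending on comparisons of $\hat{z}_1$ and $\hat{z}_2$ with t-values.]

In this routine the test T > [ Force T Value ] will be performed repeatedly.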
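The thresholding variant (no [ T Test ]) is simple enough to sketch directly. The sketch assumes that R is +1 when the clone mean exceeds the control mean by more than the positive threshold, −1 when the difference falls below the negative threshold, and 0 in between; the function and parameter names are hypothetical.

```python
def result_by_threshold(m_r, m_c, t_pos, t_neg):
    """Return the result value R for one clone by plain thresholding.

    m_r   -- mean ratio of the clone
    m_c   -- mean ratio of the control clones
    t_pos -- positive threshold value
    t_neg -- negative threshold value (a negative number)
    """
    diff = m_r - m_c
    if diff > t_pos:
        return 1    # above the positive threshold
    if diff < t_neg:
        return -1   # below the negative threshold
    return 0        # within the thresholds
```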


NCBI Clone Database

• Integration of the NCBI “component” database

• Automatic mapping of clone IDs to accession numbers, genomic clone locations and clone status information according to an up-to-date database

• Direct import of the NCBI file format


Database-generated Information

[Screenshot with the labels Accession-Number, Start-Base, End-Base, Clone-State]


Batch Processing

• One or more file pairs can be added to a session

• All computations are performed simultaneously on the included datasets


Diagram functions

• Ratio profiles of multiple clone sets can be shown in one diagram


Ideogram Browser 1

• Independent portable Java application

• Automation from MCGH-Analyzer with JNI

• Generation of ideogram drawings from the NCBI map database

• Direct representation of gained and lost markers of multiple clone sets

• Scalable and scrollable graphs


Ideogram Browser 2


Software Structure 1

• Excel as a convenient platform with a widely known user interface for
  – Table representation
  – Diagram drawing
  – User interaction

• Windows DLL written in C++ for high performance using COM automation

• Platform-independent Java application for visualizing ideograms (can be docked to the DLL via JNI)


Software Structure 2


Future Features

• Copy number estimation
  – Global thresholds
  – Adaptive (local) thresholds

• Wavelets

• Adaptive weights smoothing

• NCBI database online update

• Interface to the R platform