Exercises Proteomics II

Exercise 1

The PaxDB database collects proteomics data on protein abundance in several species and tissues. A frequently observed trend in proteomics is a high degree of variability between results when several groups use different methods to address the same question. PaxDB uses a 'meta-analysis' to combine multiple datasets into one, in the hope that the combined set is better than each individual set.

a) Analyse the mouse data for the brain proteome (disregard experiments with coverage < 10%). Combine the various experiments into a single table (Excel or OOcalc; use the VLOOKUP function).

b) Generate a 2D x-y plot of the expression values (in ppm) for all pairwise combinations of experiments. Also calculate the correlation coefficient for each pair (use only genes with expression information for both dimensions). Which pair has the best correlation?
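
Hint: the exercise is meant to be done in a spreadsheet, but the logic of tasks a) and b) can also be sketched in a few lines of Python. The following is only an illustration under stated assumptions: the in-memory layout of `datasets`, the function names, and all numbers are made up for this example and are not PaxDB's download format or real abundance values. Each experiment is treated as a lookup table from identifier to ppm (the role VLOOKUP plays in the spreadsheet), and the Pearson correlation is computed only over proteins present in both experiments of a pair.

```python
from itertools import combinations
from math import sqrt


def combine_datasets(datasets, min_coverage=10.0):
    """Join experiments into one table keyed by protein identifier.

    `datasets` maps an experiment name to a dict with a "coverage" value
    (percent) and an "abundance" dict {identifier: ppm}. This layout is a
    hypothetical in-memory form, not the PaxDB file format.
    """
    kept = [name for name, d in datasets.items() if d["coverage"] >= min_coverage]
    ids = sorted({pid for name in kept for pid in datasets[name]["abundance"]})
    # None marks a protein an experiment did not report (like #N/A from VLOOKUP).
    table = {pid: [datasets[name]["abundance"].get(pid) for name in kept] for pid in ids}
    return kept, table


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(columns, table):
    """Correlate every pair of experiments over their shared proteins only."""
    results = {}
    for i, j in combinations(range(len(columns)), 2):
        shared = [(row[i], row[j]) for row in table.values()
                  if row[i] is not None and row[j] is not None]
        if len(shared) >= 2:
            xs, ys = zip(*shared)
            results[(columns[i], columns[j])] = pearson(xs, ys)
    return results


# Made-up numbers for illustration only; these are not PaxDB values.
example = {
    "exp_A": {"coverage": 35.0, "abundance": {"P1": 10.0, "P2": 20.0, "P3": 30.0}},
    "exp_B": {"coverage": 22.0, "abundance": {"P1": 20.0, "P2": 40.0, "P3": 60.0, "P4": 80.0}},
    "exp_C": {"coverage": 15.0, "abundance": {"P2": 5.0, "P3": 1.0, "P4": 7.0}},
    "exp_D": {"coverage": 4.0, "abundance": {"P1": 1.0}},  # dropped: coverage < 10%
}
columns, table = combine_datasets(example)
correlations = pairwise_correlations(columns, table)
best_pair = max(correlations, key=correlations.get)
```

In this toy table `exp_D` is excluded by the coverage filter, and each pair is compared on a different number of shared proteins, which is worth keeping in mind when ranking pairs by their coefficient.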

Exercise 2

c) Now extend the table of exercise 1 by adding one human brain dataset (e.g. the PaxDB combined dataset). The correct way of doing this would involve mapping the gene names to HomoloGene IDs. For simplicity's sake, skip this step and just use the gene names for the join (convert the mouse gene names to uppercase).

d) Calculate the correlation of each individual mouse dataset to the human data. Which one shows the best correlation? Is this the one with the best PaxDB score? Generate a 2D plot figure.

e) Repeat the analysis of the previous section, but use ranks rather than ppm values. Are the results the same as in exercise 2b?

f) Click on some of the top brain-expressed proteins (human dataset). Identify at least one protein with very brain-specific expression and one that also shows high expression in other tissues.
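
Hint: for tasks c) to e), the uppercase join and the rank-based comparison could look roughly like the sketch below. It is an illustration only: the function names and all values are invented, and matching on uppercased symbols is the shortcut described in c), so genes whose mouse and human symbols differ are silently lost. Computing the Pearson coefficient on ranks (with tied values sharing their average rank) gives the Spearman rank correlation.

```python
from math import sqrt


def join_on_uppercase(mouse, human):
    """Pair mouse and human ppm values by uppercased gene symbol."""
    mouse_upper = {}
    for name, ppm in mouse.items():
        # If two mouse symbols collide after uppercasing, keep the first one.
        mouse_upper.setdefault(name.upper(), ppm)
    shared = sorted(mouse_upper.keys() & human.keys())
    return {gene: (mouse_upper[gene], human[gene]) for gene in shared}


def average_ranks(values):
    """1-based ranks in ascending order; tied values get their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def spearman(xs, ys):
    """Rank correlation: Pearson applied to the ranks of both variables."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up numbers for illustration only; these are not PaxDB values.
mouse = {"Snap25": 1.0, "Gapdh": 10.0, "Mbp": 100.0, "Actb": 1000.0, "Gm1234": 5.0}
human = {"SNAP25": 2.0, "GAPDH": 3.0, "MBP": 50.0, "ACTB": 40.0, "TTR": 7.0}

joined = join_on_uppercase(mouse, human)
mouse_ppm, human_ppm = zip(*joined.values())
r_ppm = pearson(mouse_ppm, human_ppm)    # dominated by the most abundant proteins
r_rank = spearman(mouse_ppm, human_ppm)  # depends only on the ordering
```

Because ppm values span several orders of magnitude, a few very abundant proteins can dominate the ppm-based coefficient, whereas the rank-based one is unaffected by the size of the gaps; this is one reason the two analyses may disagree.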