Personal Genome Project (PGP)
• Harvard Medical School IRB Human Subjects protocol Approved Aug-2005.
• Highly-informed individuals consenting to potentially non-anonymous genomes & extensive phenotypes (medical records, imaging, omics). Volunteer waiting list ** http://pgen.us **
• Cell lines in Coriell NIGMS Repository (B-cells, keratinocytes, fibroblasts)
Church GM (2005) The Personal Genome Project. Nature Molecular Systems Biology doi:10.1038/msb4100040
Kohane IS, Altman RB (2005) Health-information altruists--a potentially critical resource. N Engl J Med. 353:2074-7.
McGuire AL, Gibbs RA (2006) Genetics. No longer de-identified. Science. 312:370-1.
Why?
• Genealogy, forensics, ‘obvious’ traits
• Causative mutations: new/old; germ/soma
• Allergens, microbes, viruses (bioweather map)
• Immune T & B-cell receptors
• Pathology/cancer RNA/DNA
How? $300M in 2003, $3K (free) in 2007
• ‘Next-generation’ sequencing technology
• Polymerase colonies (Polonies) 1% cost
• Selecting regions (1%: exons, enhancers, etc.)
PGP: 1 million Genome/Phenomes via electronic medical records
Obvious Trait Genetics: Lower Risk Medical Systems Biology
Hair: Baldness [alopecia] (minoxidil)
Eyes: [Near/Far-sightedness] (glasses); Iris color [ARMD] (glasses)
Face: [Developmental syndromes, Wrinkles] (Botox)
Brain: ADHD (Ritalin); Depression (Prozac)
Sleep & Circadian (caffeine, amphetamine, modafinil)
Motion sickness (Dramamine and Scopolamine)
Allergies (antihistamines, cortisone, epinephrine, theophylline)
Headache (analgesics)
Lip: [Cleft palate] (surgery); [Hirsutism] (calcium thioglycolate)
Ears: Sensitivity (hearing aids)
Nose: Size [breathing disorders]
Mouth: Halitosis, throat exams; aerosols [airborne pathogens]
Skin: Body odor, Perspiration, Pheromones, Surface texture [psoriasis], Immune components (acne treatments) [acne]
Skin color [vitamin D & sunburn]
Hands: Dermatoglyphics [syndromes], [Arthritis] (corticosteroids)
Internal sensors: Proprioceptor, Repetitive stress syndrome
Body: Height [short stature] (hGH) [Marfan]
Weight [obese] (phenethylamine); [anorexia]
Metabolic polymorphisms (vitamins, minerals, insulin)
Back: Strain sensitivity [IDD] (analgesics)
Feet: Plantar fasciitis (orthotic shoes)
Athlete’s foot (miconazole, itraconazole, terbinafine, salicylate)
Status quo “de-identification”
International HapMap Project, reconsent form: “It will be very hard for anyone to learn anything about you personally from any of this research because none of the samples, the database, or the HapMap will include your name or any other information that could identify you or your family.”
Genome Wide Association Studies: The proposed GWAS Policy calls for investigators funded by the NIH for GWAS 1) to submit de-identified genotypic and phenotypic data to a centralized NIH repository; and, 2) to submit documentation that describes how the investigators will protect privacy and confidentiality of research participants.
Problems with Status quo “de-identification”
1) Less integrated, holistic, comprehensive
2) Less enabling of system-wide medicine
3) Subjects not informed enough to “opt-out”
4) Life-saving info can’t be shared with subjects
5) False sense of anonymity (next slide)
Is “de-identification” realistic?
1) Re-identification after “de-identification” using other public data. Group Insurance Commission list of birth date, gender, and zip code was sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998).
2) Hacking. “Drug Records, Confidential Data vulnerable via Harvard ID number & PharmaCare loophole” (2005). A hacker gained access to confidential medical info at the U. Washington Medical Center: 4000 files (names, conditions, etc., 2000).
3) Combination of surnames from genotype with geographical info. An anonymous sperm donor was traced on the internet in 2005 by his 15-year-old son, who used his own Y chromosome genealogy to access surname relations.
4) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, racial features, dysmorphologies, etc. are known & the list is growing.
5) Unexpected self-identification. An example of this at Celera undermined confidence in the investigators. Kennedy D. Science. 2002 297:1237. Not wicked, perhaps, but tacky.
6) A tiny amount of DNA data in the public domain with a name leverages the rest. This would allow the vast amount of DNA data in the HapMap (or other study) to be identified. This can happen, for example, in court cases even if the suspect is acquitted.
7) 26 million Veterans’ medical records including SSN and disabilities stolen Jun 2006.
8) Unauthorized access to DNA-bearing samples.
9) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance.
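Item 1 describes a linkage attack: a “de-identified” table is joined to a named public table on shared quasi-identifiers (birth date, gender, zip code). A minimal sketch of that idea follows. All records, field names, and the `link_records` helper are invented for illustration; nothing here reproduces the actual 1998 analysis.

```python
from collections import defaultdict

def link_records(deidentified, public, keys=("birth_date", "gender", "zip")):
    """Map each de-identified record index to a name when exactly one
    public record shares all quasi-identifier values (hypothetical helper)."""
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])
    matches = {}
    for i, record in enumerate(deidentified):
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match means the record is re-identified
            matches[i] = names[0]
    return matches

# Synthetic "de-identified" medical records (no names).
medical = [
    {"birth_date": "1950-01-02", "gender": "M", "zip": "00001", "diagnosis": "condition A"},
    {"birth_date": "1962-03-04", "gender": "F", "zip": "00002", "diagnosis": "condition B"},
]
# Synthetic public list with names, standing in for a voter-registration file.
voters = [
    {"name": "Person One", "birth_date": "1950-01-02", "gender": "M", "zip": "00001"},
    {"name": "Person Two", "birth_date": "1962-03-04", "gender": "F", "zip": "00002"},
    {"name": "Person Three", "birth_date": "1962-03-04", "gender": "F", "zip": "00002"},
]

matches = link_records(medical, voters)
print(matches)  # {0: 'Person One'}
```

The second medical record stays unlinked only because two public entries share its quasi-identifiers; the fewer people who share a combination, the weaker the anonymity.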
Ethical, Legal, Social (ELSI) Advisors
Misha Angrist: Duke Inst. Genome Sciences & Policy
Terry Bard: HMS IRB, BIDMC Chaplain
Dan Brock: Harvard Program in Ethics & Health
Ruth Chadwick: CESAGen, Cardiff Univ.
Mildred Cho: Stanford Ethics
Robert Cook-Deegan: Duke Center for Genome Ethics, Law, & Policy
Lisa Geller: Wilmer-Hale IP Dept.
Eric Juengst: CWRU Center for Biomedical Ethics
Jeantine Lunshof: EMGO Institute, Amsterdam
Amy McGuire: Baylor Bioethics
Paul Rabinow: UC Berkeley Anthropology
John Robertson: Univ. of Texas School of Law
Peter Singer: Univ. of Toronto Joint Centre for Bioethics
Daniel Vorhaus: Harvard Law School
Laurie Zoloth: Director, Bioethics, Northwestern Univ.
‘Next Generation’ Sequencing Status
Multi-molecule      Reaction       Volume
AB/APG              Ligase beads   1 fL
454/Roche           Pol beads      100,000 fL
Solexa              Pol term       1 fL
CGI                 Ligase         1 fL
Affymetrix          Hybr array     100 fL

Single molecules    Reaction       Volume
Helicos Biosci      Pol            <1 fL
Visigen Biotech     Pol FRET       <1 fL
Pacific Biosci      Pol            <1 fL
Agilent             Nanopores      <1 fL

fL = 1E-15 liters (femto)
Next generation sequencing: Polonies
Beads or not, Ligase or Polymerase
[Figure labels: G, A, C, T]
Shendure, Porreca, et al. (2005) Science 309:1728
Polony Sequencing Equipment (HMS/AB/APG)
[Figure labels: HPLC autosampler (96 wells); syringe pump; flow-cell; temperature control; microscope with xyz controls; CCD camera]
[Figure labels: In vitro paired tag libraries; Bead polonies via emulsion PCR; Monolayer gel immobilization; Enrich amplified beads; SBE or SBL sequencing; Epifluorescence & Flow Cell]
SOFTWARE
Images → Tag Sequences
Tag Sequences → Genome
Shendure, Porreca, Reppas, Lin, McCutcheon, Rosenbaum, Wang, Zhang, Mitra, Church (2005) Science 309:1728.
Integrated Polony Sequencing Pipeline (open source hardware, software, wetware)
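The two software steps named in the pipeline (Images → Tag Sequences, Tag Sequences → Genome) can be illustrated with a toy sketch. This is not the PGP or Shendure et al. software: it assumes each bead already has four fluorescence intensities per sequencing cycle, that the channel order is G, A, C, T, and that tags map by exact string match. Real pipelines also need image registration, quality filtering, and mismatch-tolerant alignment. The names `call_tags` and `map_tags` are hypothetical.

```python
BASES = "GACT"  # assumed channel order, for illustration only

def call_tags(intensities):
    """intensities[bead][cycle] is a list of four channel intensities.
    Call the brightest channel at each cycle and return one tag per bead."""
    tags = []
    for bead in intensities:
        tag = ""
        for channels in bead:
            brightest = max(range(4), key=lambda c: channels[c])
            tag += BASES[brightest]
        tags.append(tag)
    return tags

def map_tags(tags, reference):
    """Return {tag: first exact-match position in reference, or -1 if absent}."""
    return {tag: reference.find(tag) for tag in tags}

# Two made-up beads, three cycles each, channels in G, A, C, T order.
intensities = [
    [[0.1, 0.2, 0.1, 0.9], [0.1, 0.8, 0.2, 0.1], [0.2, 0.1, 0.7, 0.1]],  # T, A, C
    [[0.9, 0.1, 0.1, 0.2], [0.8, 0.2, 0.1, 0.1], [0.1, 0.1, 0.9, 0.2]],  # G, G, C
]
reference = "GATTACAGGC"  # made-up reference sequence

tags = call_tags(intensities)
print(tags)                       # ['TAC', 'GGC']
print(map_tags(tags, reference))  # {'TAC': 3, 'GGC': 7}
```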
Consensus error rate   Method             Total errors (E. coli)   Total errors (Human)
1E-4                   Bermuda/HapMap     500                      600,000
4E-5                   454 @40X           200                      240,000
3E-7                   Polony-SbL @6X     0                        1800
1E-8                   Goal for 2006      0                        60
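The human column is consistent with multiplying each consensus error rate by a genome size of about 6E9 bases (600,000 / 1E-4). That size is inferred from the table and is not stated on the slide. A minimal sketch of the arithmetic, with `expected_errors` as a hypothetical helper name:

```python
HUMAN_GENOME_BP = 6_000_000_000  # assumption: inferred from 600,000 errors at 1E-4

def expected_errors(error_rate, genome_bp):
    """Expected number of erroneous consensus bases across a genome."""
    return error_rate * genome_bp

# Error rates and labels taken from the table above.
rows = [
    ("Bermuda/HapMap", 1e-4),
    ("454 @40X", 4e-5),
    ("Polony-SbL @6X", 3e-7),
    ("Goal for 2006", 1e-8),
]

for label, rate in rows:
    errors = round(expected_errors(rate, HUMAN_GENOME_BP))
    print(f"{label:16s} {rate:.0e}  {errors:,}")
```

Rounded to whole bases this gives 600,000, 240,000, 1,800 and 60, matching the human column.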
Why low error rates?
Goal of genotyping & resequencing: Discovery of variants. E.g. cancer somatic mutations ~1E-6 (or lab evolved cells).
Also, effectively reduce (sub)genome target size by enrichment for exons or common SNPs to reduce cost & # false positives.
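The argument for low error rates can be put in numbers. If true variants occur at about 1E-6 per base, the number of false calls per true variant is roughly the error rate divided by 1E-6, and shrinking the target lowers the absolute count of false positives in proportion. The sketch below assumes a 6E9-base genome and a 1% target fraction (the 1% echoes the “Selecting regions” bullet earlier in the deck). Both helper names are hypothetical.

```python
VARIANT_RATE = 1e-6      # from the slide: somatic mutations ~1E-6 per base
GENOME_BP = 6e9          # assumption: human genome size used for illustration
TARGET_FRACTION = 0.01   # assumption: enrich for ~1% of the genome

def false_per_true(error_rate, variant_rate=VARIANT_RATE):
    """Approximate number of false calls for every true variant."""
    return error_rate / variant_rate

def false_positives(error_rate, target_bp):
    """Expected absolute number of false calls over a target of target_bp bases."""
    return error_rate * target_bp

for rate in (1e-4, 4e-5, 3e-7, 1e-8):
    whole = false_positives(rate, GENOME_BP)
    enriched = false_positives(rate, GENOME_BP * TARGET_FRACTION)
    print(f"rate {rate:.0e}: {false_per_true(rate):g} false per true, "
          f"{whole:,.0f} whole genome, {enriched:,.1f} in 1% target")
```

Under these assumptions, enrichment reduces the absolute number of false positives a hundredfold but leaves the false-to-true ratio unchanged, which is why the error rate itself still has to fall below the variant frequency.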
Monitoring resistance to BCR-ABL-kinase inhibitors with polonies during CML patient therapy
Nardi, Raz, Chao, Wu, Stone, Cortes, Deininger, Church, Zhu, Daley (submitted)
[Figure labels: E255K, T315I, M244V]