Selection of Candidate Genes for Population Studies

Zuo-Feng Zhang, MD, PhD

Epidemiology 243: Molecular Epidemiology

Gene Selection for Molecular Studies

• Selection of putative genetic factors is the central issue in molecular epidemiological studies, even though the selection of putative risk factors is equally important, because the focus of molecular epidemiology is the assessment of gene-environment interaction

Two Types of Genes

• High Risk Genes

• Low Risk Genes

Familial Disease Genes (High Risk Genes):

-High penetrance

-High AR/RR

-Gene frequency: low (<1%)

-Study setting: family

-Study type: Linkage

-PAR: low

-Role of Environment: Modest

Example of High Risk Genes

• Mutations of TP53 gene

• BRCA1 and BRCA2

• RB gene mutations

Susceptibility Genes (Low Risk Genes)

-Low penetrance

-Low AR/RR

-Gene frequency: high (>1%-90%)

-Study setting: population

-Study type: association

-PAR: high

-Role of Environment: critical
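
The contrast in PAR between the two types of genes can be illustrated with Levin's formula, PAR = p(RR - 1) / (1 + p(RR - 1)), where p is the carrier frequency. The sketch below is only an illustration: the carrier frequencies and relative risks are hypothetical values chosen to resemble a rare high-risk gene and a common low-risk gene, not figures from the lecture.

```python
def population_attributable_risk(carrier_freq: float, relative_risk: float) -> float:
    """Levin's formula: PAR = p(RR - 1) / (1 + p(RR - 1))."""
    excess = carrier_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical illustrative values, not taken from the slides.
high_risk = population_attributable_risk(carrier_freq=0.001, relative_risk=10.0)
low_risk = population_attributable_risk(carrier_freq=0.40, relative_risk=1.5)

print(f"rare high-risk gene:  PAR = {high_risk:.1%}")
print(f"common low-risk gene: PAR = {low_risk:.1%}")
```

Under these assumed inputs the common, weakly acting variant accounts for a much larger share of cases in the population than the rare, strongly acting one, which is the pattern the two lists describe.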

Approach for High Risk Genes

• Functional approach (forward genetics): from genotype to phenotype

• Positional approach (reverse genetics): from phenotype to genotype

Functional Approach: An Example

From patients with DNA repair defects:

• A cell line is created

• A certain fragment of a human chromosome is added

• A repair-competent phenotype is produced

Positional Approach

• Linkage analysis

• Loss of heterozygosity (LOH)

• Chromosome abnormalities

Linkage analysis

• It is a method to identify disease loci

• Family-based; needs a sufficient sample size

• Germline DNA from affected and unaffected individuals

• A genetic mechanism (autosomal dominant/recessive)

• A set of markers

Loss of Heterozygosity (LOH)

• Needs both normal and tumor tissues

• The loss of signal in the targeted tissue (tumor) in comparison with normal tissue

• If LOH is consistently observed in a particular region, this indicates that an important gene lies in that region.
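
The tumor-versus-normal comparison can be sketched as an allelic imbalance ratio at a marker that is heterozygous in normal tissue. This is a minimal sketch under stated assumptions: the ratio (T1/T2) / (N1/N2), the 0.5 cutoff, the marker names, and the signal intensities are all illustrative and do not come from the lecture.

```python
def loh_ratio(normal_a1: float, normal_a2: float,
              tumor_a1: float, tumor_a2: float) -> float:
    """Allelic imbalance ratio (T1/T2) / (N1/N2) at a heterozygous marker."""
    return (tumor_a1 / tumor_a2) / (normal_a1 / normal_a2)


def has_loh(ratio: float, cutoff: float = 0.5) -> bool:
    # Assumed scoring rule: a ratio below the cutoff, or above its
    # reciprocal, is scored as loss of one allele in the tumor.
    return ratio < cutoff or ratio > 1.0 / cutoff


# Hypothetical signal intensities: (normal allele 1, normal allele 2,
# tumor allele 1, tumor allele 2).
markers = {
    "marker_A": (1000.0, 950.0, 980.0, 300.0),
    "marker_B": (800.0, 820.0, 790.0, 805.0),
}

for name, signals in markers.items():
    ratio = loh_ratio(*signals)
    status = "LOH" if has_loh(ratio) else "retained"
    print(f"{name}: ratio = {ratio:.2f} ({status})")
```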

Chromosome Abnormalities

• Deletion

• Insertion

• Microsatellite instability

In-depth Approaches to Identify Candidate Genes

• When the above three methods indicate a chromosomal region, further work is needed to identify particular candidate genes:

-Mutation screening

-Restoration of the normal phenotype by transfection of a normal allele

-Mouse model of disease by introducing defective mutations

Approaches for Low Risk Genes

• Linkage analysis may not be feasible because it requires a relatively large sample size (if the OR = 2, 2,500 families would be needed)

Approaches for Low Risk Genes

• New techniques will be needed to identify the low risk susceptibility genes

-Automated micro-array genechips

-SNP identification

Selection of Putative Genes (1)

• Inter-individual variation in the trait exists in the population

-If there is very small variation of the phenotype in the population, the rationale to examine the genotype is weak.

-If there is a very large variation of the phenotype, other potential factors need to be considered

Selection of Putative Genes (2)

• The gene is involved in the process related to carcinogenesis:

-DNA repair

-Chromosome stability

-Activities of oncogenes/tumor suppressor genes

-Cell cycle control/signal transduction

Selection of Putative Genes (3)

• The trait exhibits an inheritance pattern consistent with Mendelian transmission

• Any phenotype should have a genetic basis

Selection of Putative Genes (3)

• Certain phenotypes, such as “mutagen sensitivity”, have been reported to be associated with many smoking-related cancers; however, the precise nature of this susceptibility factor remains incompletely understood because the genotype associated with mutagen sensitivity is still unclear.

Selection of Putative Genes (4)

• Gene action exists in the relevant organ.

-CYP1A1 is largely absent from liver, but present in lung

-CYP2D6 is expressed in brain

-GSTM1 has some expression in lung

-GSTP1 is expressed in lung

Selection of Putative Genes (5)

• Gene location and characterization.

-Similar gene structure may indicate similar function

-Most mutations occur in the coding sequence, but mutations in intragenic noncoding regions may also occur

-Specific point mutations may indicate specific exposures

Selection of Putative Genes

• Polymorphisms and mutation

• Gene-Gene interactions

• Animal models

• Human studies

• Genotype-phenotype

• Relation to disease

• Ethnic variation

Selection of Putative Genes

• Gene-Gene interaction (phase I and phase II).

-CYP1A1 and GSTM1 and lung cancer risk, PAH (carcinogens)

-CYP2A6 and CYP2D6, NNK

2-1. Background: Summary of characteristics and significance of the genes of interest

Each entry lists: full gene name (gene symbol, location); polymorphisms to be studied; function and significance of the polymorphism; frequency distribution*

Phase I genes

• Cytochrome P450 subfamily I polypeptide 1 (CYP1A1, 15q22-q24)
-Polymorphisms: m1 (Msp I), m2 (Ile462Val)
-Function: Increases microsomal enzyme activity or catalytic activity for activating procarcinogens, including PAHs and aromatic amines

• Microsomal epoxide hydrolase (mEH, 1q42.1)
-Polymorphism: Exon 3 (Tyr113His)
-Function: Reduces enzyme activity for metabolising polycyclic aromatic hydrocarbons
-Frequency: Tyr113His* = 37-47%: 35-55%: 8-18%
-Polymorphism: Exon 4 (His139Arg)
-Function: Increases enzyme activity for metabolising polycyclic aromatic hydrocarbons
-Frequency: His139Arg* = 58-70%: 23-44%: 1-7%

Phase II genes

• Glutathione S-transferase M1 (GSTM1, 1p13.3)
-Polymorphism: *A/B/null
-Function: Null type expresses no enzyme activity for detoxifying xenobiotics

• Glutathione S-transferase P1 (GSTP1, 16p13.1)
-Polymorphisms: Ile105Val, Ala114Val
-Function: Reduces enzyme activity for detoxifying products of oxidative stress
-Frequency: Ile105Val* = 27-33%: 35-40%: 10-5%
-Frequency: Ala114Val* = 78-84%: 15-20%: 1-2%

• Diaphorase (NADH/NADPH) 4 / Cytochrome b-5 reductase (DIA4/NQO1, 16q22.1)
-Polymorphism: Pro187Ser (C609T)
-Function: 187Ser is associated with reduced enzyme activity for metabolising tobacco-smoke carcinogens
-Frequency: Pro187Ser* = 65%: 30%: 5%

DNA repair genes

• X-ray repair complementing defective repair in Chinese hamster cells 1 (XRCC1, 19q13.2)
-Polymorphisms: Arg194Trp, Arg280His, Arg399Gln
-Function: Leads to mitotic delay in response to IR, which might alter DNA repair capacity (DRC) and increase genotoxic damage
-Frequency: Arg194Trp* = 40-50%: 44-50%: 6-10%
-Frequency: Arg280His* = 80-85%: 12-15%: 1-3%
-Frequency: Arg399Gln* = 40-54%: 55-33%: 5-13%

Cell cycle control genes

• Tumour protein p53 (TP53, 17p13.1)
-Polymorphism: Arg72Pro
-Function: Amino acid substitution might decrease DNA-binding ability
-Frequency: Arg72Pro* = 50%: 40-35%: 10-15%

• Cyclin-dependent kinase inhibitor 2A, a.k.a. p16 (CDKN2A/P16, 9p21)
-Polymorphism: Ala148Thr
-Function: Uncertain. Exploratory analysis

• Cyclin D1 (PRAD1: parathyroid adenomatosis 1) (CCND1, 11q13)
-Polymorphism: G870A
-Function: Splicing alteration. The A allele results in an altered protein that bypasses the G1/S checkpoint more easily
-Frequency: G870A* = 30-33%: 50-53%: 15-20%

* frequency distribution = wild type: heterozygous variant: homozygous variant
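
A genotype distribution of the form wild type: heterozygous: homozygous variant implies a variant allele frequency and can be compared with Hardy-Weinberg expectations. The sketch below applies the Pro187Ser distribution given above (65%: 30%: 5%) to a hypothetical sample of 200 individuals; the sample size and the function names are assumptions made only for illustration.

```python
def variant_allele_frequency(n_wt: int, n_het: int, n_hom: int) -> float:
    """Frequency of the variant allele from genotype counts."""
    return (n_het + 2 * n_hom) / (2 * (n_wt + n_het + n_hom))


def hwe_chi_square(n_wt: int, n_het: int, n_hom: int) -> float:
    """Chi-square statistic (1 df) for departure from Hardy-Weinberg proportions."""
    n = n_wt + n_het + n_hom
    q = variant_allele_frequency(n_wt, n_het, n_hom)
    p = 1.0 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt, n_het, n_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 65%: 30%: 5% applied to a hypothetical sample of 200 individuals.
counts = (130, 60, 10)
q = variant_allele_frequency(*counts)
chi2 = hwe_chi_square(*counts)

print(f"variant allele frequency = {q:.2f}")
print(f"HWE chi-square = {chi2:.3f} (compare with 3.84 for 1 df, alpha = 0.05)")
```

The statistic grows with sample size for a fixed genotype distribution, so the comparison is meaningful only with the actual genotype counts of a study.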

2-1. Background: Theoretical model of gene-gene/environmental interaction pathway

[Figure, built up over five slides. Elements shown:

-Exposures: tobacco consumption, occupational exposures, environmental exposure

-Environmental carcinogens / procarcinogens: PAHs, Xenobiotics, Arene, Alkine, etc.

-Active carcinogens and detoxified carcinogens, with the metabolic genes CYP1A1 (MspI, Ile462Val), mEH (Tyr113His, His139Arg), NQO1 (Pro187Ser), GSTM1 (Null), and GSTP1 (Ile105Val, Ala114Val)

-DNA damage: either repaired (normal cell) or not repaired (defective DNA repair gene), with XRCC1 (Arg194Trp, Arg399Gln, Arg280His)

-Cell cycle (G, S, G2, M), with P53 (Arg72Pro), P16 (Ala148Thr), and Cyclin D1 (G870A)

-Outcomes: carcinogenesis if cell cycle control is lost; programmed cell death]

2-1. Background: Summary of epidemiological literature for the genes of interest

Each entry lists: full gene name (gene symbol, location); polymorphisms to be studied; previous epidemiological literature

Phase I genes

• Cytochrome P450 subfamily I polypeptide 1 (CYP1A1, 15q22-q24)
-Polymorphisms: m1 (Msp I), m2 (Ile462Val)
-Literature: Early Japanese studies showed an association; later studies did not confirm the association

• Microsomal epoxide hydrolase (mEH, 1q42.1)
-Polymorphisms: Exon 3 (Tyr113His), Exon 4 (His139Arg)
-Literature: Not consistent. A large UK study found it to be a risk factor in nonsmokers, but protective in heavy smokers

Phase II genes

• Glutathione S-transferase M1 (GSTM1, 1p13.3)
-Polymorphism: *A/B/null
-Literature: Many studies. Weak or marginal effect. Not conclusive.

• Glutathione S-transferase P1 (GSTP1, 16p13.1)
-Polymorphisms: Ile105Val, Ala114Val
-Literature: Marginal increase of risk. Evidence inconsistent. May be important in heavy smokers.

• Diaphorase (NADH/NADPH) 4 / Cytochrome b-5 reductase (DIA4/NQO1, 16q22.1)
-Polymorphism: Pro187Ser (C609T)
-Literature: Conflicting results. Potential interaction with smoking.

DNA repair genes

• X-ray repair complementing defective repair in Chinese hamster cells 1 (XRCC1, 19q13.2)
-Polymorphisms: Arg194Trp, Arg280His, Arg399Gln
-Literature (three remarks in the source table): Limited studies, no association has been found so far; increases the risk, but evidence not consistent; increases the risk in smokers, but not conclusive so far

Cell cycle control genes

• Tumour protein p53 (TP53, 17p13.1)
-Polymorphism: Arg72Pro
-Literature: Suggested increased risk, but results are not conclusive

• Cyclin-dependent kinase inhibitor 2A, a.k.a. p16 (CDKN2A/P16, 9p21)
-Polymorphism: Ala148Thr
-Literature: No study so far

• Cyclin D1 (PRAD1: parathyroid adenomatosis 1) (CCND1, 11q13)
-Polymorphism: G870A
-Literature: One study showed increased risk of head and neck cancer. No study on lung cancer risk so far

UCLA Prostate Cancer SPORE Development Project

Single Nucleotide Polymorphisms (SNPs) of Genes in the DNA Double Strand Break Repair (DSBR) Pathways and Risk of Prostate Cancer, A Preliminary Study

Zuo-Feng Zhang, MD, PhD

Department of Epidemiology

UCLA School of Public Health

Epidemiological Observations: Involvement of DSBR Pathway Genes in Prostate Cancer Risk

• The risk of prostate cancer is known to be elevated in carriers of germline mutations in BRCA2

• Increased risk of prostate cancer is also observed in carriers of BRCA1 and CHEK2 mutations, and is also associated with SNPs of the ATM gene

• Those observations indicate possible involvement of DNA DSBR pathway genes

[Figure: DSBR pathways and associated genes. Labels shown: homologous recombination; non-homologous recombination; damage recognition cell cycle delay response (DRCCD); genes BRCA2, BRCA1, ATM, CHEK2 (RAD53).]

Hypotheses

• Single Nucleotide Polymorphisms (SNPs) of genes in the DNA Double Strand Break Repair (DSBR) Pathways may be associated with the susceptibility to prostate cancer.

• We further hypothesize that the SNPs of the DSBR pathways may interact with each other and may modify the effects of environmental factors on the risk of prostate cancer.

Specific Aim 1

• To assay Single Nucleotide Polymorphisms (SNPs) of genes in the double strand break (DSB) repair pathway, including genes involved in Homologous Recombinational Repair (HRR): RAD51, RAD52, RAD54L, NBS1, XRCC2, XRCC3, BRCA1, and BRCA2; in Non-homologous end-joining (NHEJ): LIG4 and XRCC4; and in the damage recognition cell cycle delay response (DRCCD) pathway: ATM, BRCA1, CHEK1, CHEK2 (RAD53), P53, and HUS1.

Specific Aim 2

• To evaluate the independent effects of SNPs of the DSB repair pathway when potential confounding factors, such as age, race, and education, are controlled, and to assess potential combined effects of SNPs

• To explore possible modification by these SNPs of the effects of nutritional factors on the risk of prostate cancer.

Proposed Experimental Approach

• This study is based on a case-control study with a total of 122 cases with prostate cancer and 135 healthy controls. All cases and controls were interviewed by a research nurse using a standard epidemiological questionnaire at MSKCC from 1993 to 1997.

• Blood samples and tumor tissue specimens were collected.

• The SNPs will be genotyped in individual DNA samples using the SNPlex platform by ABI. The UCLA Sequencing and Genotyping Core Facility has recently added Applied Biosystems' high-throughput SNP genotyping assay, SNPlex, to the available services. This assay is flexible, robust and highly reproducible.

www.genetics.ucla.edu/genotyping

JCCC Genotyping Core: ABI SNPlex, a New High Throughput Approach to Identify SNPs of Susceptibility Genes

Zhang Lab SNP Genotyping Pilot Project

• 75% passed design process

• 48 SNPs chosen for first pool

• Whole Genome Amplification of DNA for 3080 samples

• 122,496 SNPs genotyped since January

Preliminary Results

• 99.4% reproducibility by automated scoring.

• 99.7% reproducibility by manual scoring.

• 6 SNPs never worked

• 96% call rate of remaining markers

• Comparable to results reported by other labs

Study Population

Progress of the Study

• For Specific Aim 1, we have assayed selected single nucleotide polymorphisms (SNPs) of genes in the double strand break repair (DSBR) pathway, including the genes BRCA1, NBS1, TP53, APEX1, CHEK1, CHEK2, and ATM, in 68 cases with prostate cancer and 90 healthy male controls using the ABI SNPlex platform.

Progress of the Study

• For Specific Aim 2, we explored the independent effects of SNPs of the genes mentioned above in the DSBR pathway when potential confounding factors, such as age, race, and education, were controlled.

Results of the Preliminary Study

• The adjusted ORs are:

4.6 (95% CI: 0.6-34.1) for BRCA1 (rs8176109)

5.0 (95% CI: 1.1-22.3) for NBS1 (rs9995)

3.1 (95% CI: 0.46-21.2) for TP53 (rs2909430)

2.0 (95% CI: 0.49-8.02) for APEX1 (rs3136820)

2.6 (95% CI: 0.58-11.6) for CHEK1 (rs506504)

0.6 (95% CI: 0.17-2.3) for ATM (rs228591).
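
One way to read these intervals: a 95% CI that excludes 1 corresponds to a two-sided p-value below 0.05, and the standard error of the log OR can be approximated from the interval width as (ln U - ln L) / (2 x 1.96). The sketch below applies this to the values listed above; the function names are illustrative, and the normal approximation on the log scale is an assumption about how the intervals were constructed.

```python
import math

# Adjusted ORs and 95% confidence limits as reported on the slide.
results = {
    "BRCA1 (rs8176109)": (4.6, 0.6, 34.1),
    "NBS1 (rs9995)": (5.0, 1.1, 22.3),
    "TP53 (rs2909430)": (3.1, 0.46, 21.2),
    "APEX1 (rs3136820)": (2.0, 0.49, 8.02),
    "CHEK1 (rs506504)": (2.6, 0.58, 11.6),
    "ATM (rs228591)": (0.6, 0.17, 2.3),
}


def log_or_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error of ln(OR) from a confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def excludes_null(lower: float, upper: float) -> bool:
    """True when the interval does not contain OR = 1."""
    return lower > 1.0 or upper < 1.0


significant = [label for label, (_, lo, hi) in results.items()
               if excludes_null(lo, hi)]

for label, (odds_ratio, lo, hi) in results.items():
    print(f"{label:18s} OR = {odds_ratio:<4} SE(ln OR) = {log_or_se(lo, hi):.2f}")
print("CI excludes 1:", significant)
```

The wide intervals reflect the small number of subjects genotyped so far.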

Future Plan

• We will continue our proposed specific aims by assaying additional SNPs in the DSBR pathway genes as well as other pathways including other DNA repair pathways, metabolic, inflammatory, and cell cycle pathways among prostate cancer cases and controls. We will explore the independent effect of those SNPs on the risk of prostate cancer. We will also add the haplotype tagging SNPs of the DSBR pathways in order to identify haplotypes associated with prostate cancer risk. Those additional studies will have a greater impact on the translational research objectives of the SPORE.

BRCA1 Haplotypes and Risk of Prostate Cancer

haplotype        E(freq)   E[Freq(0)]  E[Freq(1)]  Case (N)  Ctrl (N)  OR    95%CI_L  95%CI_U
GATTATTT         0.527216  0.515241    0.548668    72.4      121.6     1
TGACGCCC         0.279648  0.309341    0.226583    29.9      73.0      0.69  0.41     1.15
GAATATTT         0.051616  0.054866    0.04582     6.0       12.9      0.78  0.29     2.15
GAATATCT         0.049668  0.035849    0.074367    9.8       8.5       1.95  0.74     5.11
Other haplotype  0.091852  0.084703    0.104562    13.8      20.0      1.16  0.55     2.44
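
The ORs in this table appear consistent with a crude odds ratio for each haplotype relative to the most common haplotype (GATTATTT), computed from the estimated case and control counts. The sketch below reproduces that calculation; treating GATTATTT as the reference is an inference from the OR of 1 in its row, and small differences from the tabulated values may reflect rounding of the estimated counts.

```python
# Estimated haplotype counts from the table: (cases, controls).
haplotype_counts = {
    "GATTATTT": (72.4, 121.6),  # most common haplotype, used as reference
    "TGACGCCC": (29.9, 73.0),
    "GAATATTT": (6.0, 12.9),
    "GAATATCT": (9.8, 8.5),
    "Other haplotype": (13.8, 20.0),
}
REFERENCE = "GATTATTT"


def crude_or(haplotype: str) -> float:
    """Odds ratio of a haplotype relative to the reference haplotype."""
    cases, controls = haplotype_counts[haplotype]
    ref_cases, ref_controls = haplotype_counts[REFERENCE]
    return (cases / ref_cases) / (controls / ref_controls)


for name in haplotype_counts:
    print(f"{name:16s} OR = {crude_or(name):.2f}")
```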

Translational Potential of the Study

• Our results, based on a relatively small sample size, suggest potential involvement of SNPs of the DSBR pathway genes in the development of prostate cancer.

• If confirmed by studies with larger sample sizes, SNPs in DSBR pathway genes may be used in individual risk assessment and in the identification of high-risk populations for intervention and chemoprevention

The Selection Criteria of SNPs

• functional SNPs if possible

• amino-acid-changing SNPs;

• SNPs in the functional region of the gene or SNPs without amino acid changes that were hypothesized to affect the transcription/translation of the protein;

• the rare allele frequency of SNPs must be equal to or higher than 5% in the general population
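
The criteria above can be sketched as a simple filter over annotated SNP records. Everything in this example is hypothetical: the record fields, the identifiers, and the frequencies are placeholders, and the sketch treats functional annotation as required, although the first criterion states it only as a preference.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rs_id: str                   # placeholder identifier, not a real rs number
    gene: str
    minor_allele_freq: float     # frequency in the general population
    amino_acid_change: bool      # non-synonymous coding SNP
    in_functional_region: bool   # e.g. hypothesized to affect transcription


def meets_criteria(snp: Snp, min_maf: float = 0.05) -> bool:
    """Keep putatively functional SNPs with minor allele frequency >= 5%."""
    functional = snp.amino_acid_change or snp.in_functional_region
    return functional and snp.minor_allele_freq >= min_maf


# Hypothetical candidate list.
candidates = [
    Snp("rs_example_1", "GENE_A", 0.22, True, False),
    Snp("rs_example_2", "GENE_A", 0.02, True, False),   # too rare
    Snp("rs_example_3", "GENE_B", 0.31, False, False),  # no functional annotation
    Snp("rs_example_4", "GENE_B", 0.05, False, True),
]

selected = [snp.rs_id for snp in candidates if meets_criteria(snp)]
print(selected)
```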

Proposed Study of Lung Cancer among Non-smokers

Motives and Conceptual Framework For Study of Genetic Susceptibility to Lung Cancer among Non-smokers

• About 16% of male smokers and 10% of female smokers will eventually develop lung cancer, which suggests that exposure to other environmental carcinogens and individual genetic susceptibility may play an important role in lung cancer among non-smokers.

• It is suggested that 26% of lung cancers are associated with genetic susceptibility (Lichtenstein P, et al. NEJM, 2000)

• We hypothesize that the variation of genetic susceptibility, or single nucleotide polymorphisms (SNPs) of genes in inflammation, DNA repair, and cell cycle control pathways, may be important in the development of lung cancer among non-smokers.

Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K. NEJM, 2000

Theoretical model of gene-gene/environmental interaction pathway for lung cancer

[Figure. Elements shown:

-Exposures: tobacco consumption, occupational exposures, environmental exposure

-Environmental carcinogens / procarcinogens: PAHs, Xenobiotics, Arene, Alkine, etc.

-Active carcinogens and detoxified carcinogens, with the metabolic genes CYP1A1 (MspI, Ile462Val), mEH (Tyr113His, His139Arg), NQO1 (Pro187Ser), GSTM1 (Null), and GSTP1 (Ile105Val, Ala114Val)

-DNA damage: either repaired (normal cell) or not repaired (defective DNA repair gene), with XRCC1 (Arg194Trp, Arg399Gln, Arg280His)

-Cell cycle (G0, G1, S, G2, M), with P53 (Arg72Pro), P16 (Ala148Thr), and Cyclin D1 (G870A)

-Outcomes: carcinogenesis if cell cycle control is lost; programmed cell death]

Issues in genetic association studies

• Many genes

– ~25,000 genes, many can be candidates

• Many SNPs

– ~10,000,000 SNPs, ability to predict functional SNPs is limited

• Methods to select SNPs:

– Only functional SNPs in a candidate gene

– Systematic screen of SNPs in a candidate gene

– Systematic screen of SNPs in an entire pathway

– Genomewide screen

– Systematic screen for all coding changes

Selection of SNPs (Genome-wide association studies)

– Molecular

• Higher requirements: Affymetrix and Perlegen

– Analytical

• Highest requirements: Data management, automation

– Advantages

• No biological assumptions and can identify novel genes/pathways

• Excellent chance to identify risk alleles

• Utility in individual risk assessment

– Disadvantages

• High costs

• Concern of multiple tests
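
The concern about multiple tests can be made concrete with a Bonferroni correction. The sketch below assumes a scan of 500,000 SNPs (the size of the 500K array discussed in this lecture) and a nominal significance level of 0.05; Bonferroni is one common, conservative choice and is not stated in the slides as the method used.

```python
N_SNPS = 500_000   # assumed number of SNPs tested in a genome-wide scan
ALPHA = 0.05       # nominal per-study significance level

# Bonferroni: divide the nominal level by the number of tests.
bonferroni_threshold = ALPHA / N_SNPS

# Expected number of SNPs passing an uncorrected p < 0.05 if no SNP
# is truly associated with disease.
expected_false_positives = N_SNPS * ALPHA

print(f"per-SNP threshold: {bonferroni_threshold:.1e}")
print(f"expected false positives without correction: {expected_false_positives:,.0f}")
```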

500K SNP Coverage

Median intermarker distance: 3.3 kb

Mean intermarker distance: 5.4 kb

Average heterozygosity: 0.30

Average minor allele frequency: 0.22

SNPs in genes: 196,384

80% of genome within 10 kb of a SNP

LIG SNP and Passive Smoking

Figure 1. The effects of SNPs on the Risk of Lung Cancer among Smokers and Non-smokers

[Bar chart. Vertical axis: OR, scaled from 0 to 8. Horizontal axis: BRCA1, CHEK1, XRCC3, IFNG, IL-10, ALDH2. Series: Smokers, Non-Smokers, ETS Exp, Non ETS Exp.]

Hypothesis

• The overall hypothesis is that multiple sequence variants in the genome are associated with the risk of lung cancer among non-smokers. Specifically, we hypothesize that a number of common nonsmoking lung cancer risk-modifying SNPs are in strong LD with the SNPs arrayed on the 500K GeneChip®.

Figure 2. Structure and Governance of ILCCO

[Organization chart: an Executive Committee, and five working groups (DNA Repair, Familial Cases, Rare Histology, Young Onset, Nonsmokers), each with a coordinator and members.]

Specific Aims

• Aim 1. To perform exploratory tests for association between 500K SNPs across the genome and lung cancer risk among 200 non-smoking lung cancer patients and 200 controls.

• Aim 2. To perform first stage of confirmatory association tests between lung cancer risk and more than 1,000 SNPs implicated in Aim 1 among an independent set of 600 pairs of cases and controls.

Specific Aims

• Aim 3. To perform second stage of confirmatory association tests between lung cancer risk and more than 500 SNPs that were replicated in Aim 2 among an additional 600 cases and 600 controls. Additional SNPs will also be added from our ongoing pathway specific analyses of DNA repair, cell cycle regulation, inflammation and metabolic pathways based on non-smokers in our lung cancer study.

• Aim 4. To perform fine mapping association studies in the flanking regions of each of the 30-100 SNPs confirmed in Aim 3 among the entire 1,400 cases and 1,400 controls. The large number of cases with non-smoking lung cancer in this study population also allows us to identify SNPs that are associated with risk of the disease among nonsmokers.

Specific Aims

• Aim 5. To explore the generalizability of the SNPs identified in Specific Aims 1-4 within a Chinese population of 600 nonsmoking lung cancer cases and 600 nonsmoking controls. The relatively homogeneous Chinese population not only allows us to further confirm the associations, but also improves our ability to finely map the SNPs associated with lung cancer risk among non-smokers.


Discussion: Costs

• Affy 500K SNP chip: $1,000/case

- 2,000 x $1,000 = $2 M

- 1,000 x $1,000 = $1 M

- 500 x $1,000 = $0.5 M

• 500 x 3,000 (SNP) x $0.15 = $225,000

• 500 x 30 (SNP) x $0.15 = $2,250
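The cost arithmetic on this slide can be reproduced with a small sketch. The unit prices ($1,000 per chip and $0.15 per SNP genotype) are taken from the slide; the function names are hypothetical, and prices are held in cents to avoid floating-point rounding.

```python
# Unit prices from the slide, stored in cents to keep the arithmetic exact.
CHIP_COST_CENTS = 1000 * 100   # Affy 500K SNP chip: $1,000 per sample
GENOTYPE_COST_CENTS = 15       # custom genotyping: $0.15 per SNP per sample

def chip_cost(n_samples):
    """Total cost in dollars of running one chip per sample."""
    return n_samples * CHIP_COST_CENTS / 100

def panel_cost(n_samples, n_snps):
    """Total cost in dollars of genotyping n_snps SNPs in each sample."""
    return n_samples * n_snps * GENOTYPE_COST_CENTS / 100

# Whole-genome chip at the three sample sizes listed on the slide.
for n in (2000, 1000, 500):
    print(f"{n} samples on chip: ${chip_cost(n):,.0f}")

# Follow-up panels of 3,000 and 30 SNPs in 500 samples.
for snps in (3000, 30):
    print(f"500 samples x {snps} SNPs: ${panel_cost(500, snps):,.0f}")
```

The comparison shows why a staged design is attractive: genotyping a reduced SNP panel costs far less per subject than running the full chip on everyone.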


Table 1. Distributions of demographic and potential risk factors

| Key Variable | Controls N (%) | Cases N (%): Head & Neck Cancer | Cases N (%): Lung Cancer | Adjusted OR (95% CI) [a]: Head & Neck Cancer | Adjusted OR (95% CI) [a]: Lung Cancer |
|---|---|---|---|---|---|
| Age [b] | | | | | |
| <40 | 83 (8.0) | 59 (9.8) | 14 (2.3) | -- | -- |
| 40-45 | 139 (13.4) | 50 (8.3) | 47 (7.7) | -- | -- |
| 45-50 | 175 (16.8) | 117 (19.5) | 109 (17.8) | -- | -- |
| 50-55 | 324 (31.1) | 150 (25.0) | 192 (31.4) | -- | -- |
| >55 | 319 (30.7) | 225 (37.4) | 249 (40.8) | -- | -- |
| Sex [b] | | | | | |
| Female | 417 (40.1) | 147 (24.5) | 308 (50.4) | -- | -- |
| Male | 623 (59.9) | 454 (75.5) | 303 (49.6) | -- | -- |
| Race-Ethnicity | | | | | |
| Caucasian American | 634 (61.0) | 341 (56.9) | 359 (58.9) | 1.0 | 1.0 |
| Mexican American | 150 (14.4) | 70 (11.7) | 53 (8.7) | 0.5 (0.4-0.8) | 1.1 (0.7-1.7) |
| African American | 102 (9.8) | 69 (11.5) | 96 (15.7) | 1.0 (0.7-1.5) | 1.9 (1.3-2.8) |
| Asian American | 62 (6.0) | 64 (10.7) | 70 (11.5) | 2.7 (1.8-4.1) | 4.6 (3.0-7.1) |
| Others | 91 (8.8) | 55 (9.2) | 32 (5.2) | 1.0 (0.6-1.5) | 0.9 (0.5-1.5) |

1040 controls, 601 head and neck cancer cases, and 611 lung cancer cases


Table 1 (continued)

| Key Variable | Controls N (%) | Cases N (%): Head & Neck Cancer | Cases N (%): Lung Cancer | Adjusted OR (95% CI) [a]: Head & Neck Cancer | Adjusted OR (95% CI) [a]: Lung Cancer |
|---|---|---|---|---|---|
| Pack-years of Tobacco Smoking | | | | | |
| Never | 492 (47.3) | 182 (30.3) | 110 (18.0) | 1.0 | 1.0 |
| 1-20 | 353 (34.0) | 147 (24.4) | 102 (16.7) | 1.0 (0.7-1.3) | 1.4 (1.0-1.9) |
| 20-40 | 136 (13.1) | 146 (24.3) | 202 (33.1) | 2.0 (1.4-2.7) | 7.9 (5.5-11.4) |
| >40 | 58 (5.6) | 126 (21.0) | 197 (32.2) | 3.5 (2.3-5.3) | 22.6 (14.6-35.0) |
| Alcohol Drinking (Drinks per day) | | | | | |
| 0-6 | 1003 (96.8) | 526 (87.8) | 576 (94.4) | 1.0 | 1.0 |
| >6 | 33 (3.2) | 73 (12.2) | 34 (5.6) | 2.1 (1.3-3.3) | 0.9 (0.5-1.6) |
| Education | 1039 (100) | 601 (100) | 611 (100) | 0.9 (0.9-0.9) | 0.9 (0.9-1.0) |
| Fruit Intake Frequency (yearly) | | | | | |
| <354 | 259 (25.1) | 205 (34.5) | 197 (32.7) | 1.0 | 1.0 |
| 354-650 | 258 (24.9) | 137 (23.1) | 143 (23.8) | 0.8 (0.6-1.1) | 1.0 (0.7-1.4) |
| 650-1037 | 259 (25.1) | 119 (20.0) | 130 (21.6) | 0.9 (0.6-1.2) | 1.0 (0.7-1.5) |
| >1037 | 258 (24.9) | 133 (22.4) | 132 (21.9) | 0.9 (0.6-1.2) | 1.3 (0.9-1.9) |
| Vegetable Intake Frequency (yearly) | | | | | |
| <526 | 259 (25.1) | 194 (32.6) | 208 (34.6) | 1.0 | 1.0 |
| 526-822 | 258 (24.9) | 128 (21.6) | 173 (28.7) | 0.8 (0.6-1.1) | 1.0 (0.7-1.4) |
| 822-1217 | 259 (25.1) | 142 (23.9) | 132 (21.9) | 0.9 (0.7-1.3) | 0.8 (0.5-1.1) |
| >1217 | 258 (24.9) | 130 (21.9) | 89 (14.8) | 0.9 (0.6-1.3) | 0.6 (0.4-0.8) |

[a] Adjusted for the variables listed in the table.

[b] Odds ratios are not presented because age and sex are matching variables, and their odds ratios calculated by the regression models are not valid.
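To show how the counts in Table 1 relate to an odds ratio, the sketch below computes a crude (unadjusted) OR with a Woolf-type confidence interval for lung cancer among smokers with more than 40 pack-years versus never smokers. This is only an illustration: the table reports ORs adjusted for the other listed variables, so the crude value is not expected to match the adjusted 22.6. The function name and the use of the Woolf (log) method are assumptions, not the authors' stated procedure.

```python
import math

def crude_odds_ratio(exp_cases, ref_cases, exp_controls, ref_controls, z=1.96):
    """Crude OR and Woolf (log-based) confidence interval from a 2x2 table.

    exp_* are counts in the exposed category, ref_* in the reference category.
    z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exp_cases * ref_controls) / (ref_cases * exp_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / exp_cases + 1 / ref_cases + 1 / exp_controls + 1 / ref_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts from Table 1, lung cancer cases vs. controls:
# >40 pack-years: 197 cases, 58 controls; never smokers: 110 cases, 492 controls.
or_crude, ci_low, ci_high = crude_odds_ratio(197, 110, 58, 492)
print(f"Crude OR = {or_crude:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

The gap between the crude and adjusted estimates reflects the adjustment for the other variables in the table.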


GSTM1 and Lung Cancer among Non-Smokers

[Figure: OR (y-axis) by GSTM1 status (Normal, Null) among non-smokers (Smoking: No). Labeled value: OR 1.19 (0.69-2.03).]


GSTT1 and Lung Cancer among Non-Smokers

[Figure: OR (y-axis) by GSTT1 status (Normal, Null) among non-smokers (Smoking: No). Labeled value: OR 1.53 (0.83-2.81).]


p53 codon 72 and Lung Cancer among Non-Smokers

[Figure: OR (y-axis) by p53 codon 72 genotype (A/A or A/P, P/P) among non-smokers (Smoking: No). Labeled value: OR 0.79 (0.32-1.95).]


GSTP1 and Lung Cancer among Non-Smokers

[Figure: OR (y-axis) by GSTP1 genotype (Ile/Ile, Any Val) among non-smokers (Smoking: No). Labeled value: OR 0.69 (0.39-1.24).]
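The four non-smoker results on these slides (GSTM1, GSTT1, p53 codon 72, GSTP1) can be read side by side with a short sketch that checks whether each reported interval includes the null value of 1. The ORs and intervals are the values printed on the slides; treating the first-listed genotype on each slide as the reference category is an assumption, and the function name is illustrative.

```python
# (label, OR, lower, upper) as printed on the four non-smoker slides.
# Assumption: the first-listed genotype on each slide is the reference group.
results = [
    ("GSTM1 null vs. normal", 1.19, 0.69, 2.03),
    ("GSTT1 null vs. normal", 1.53, 0.83, 2.81),
    ("p53 codon 72 P/P vs. A/A or A/P", 0.79, 0.32, 1.95),
    ("GSTP1 any Val vs. Ile/Ile", 0.69, 0.39, 1.24),
]

def interval_includes_null(lower, upper, null=1.0):
    """True when the confidence interval contains the null odds ratio."""
    return lower <= null <= upper

for label, odds_ratio, lower, upper in results:
    flag = "includes 1" if interval_includes_null(lower, upper) else "excludes 1"
    print(f"{label}: OR {odds_ratio} ({lower}-{upper}) -> interval {flag}")
```

An interval that spans 1 means the data are compatible with no association at the chosen confidence level, which is common for single low-risk genes in a subgroup of this size.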


Lung Cancer

| Variable 1 (Tobacco) | Variable 2 (XPG) | Cases N (%) | Controls N (%) | OR (95% CI) [a] |
|---|---|---|---|---|
| Never | Arg/Arg | 8 (1.6) | 43 (4.7) | 1.0 |
| 0-20 Pack-years | Arg/Arg | 8 (1.6) | 20 (2.2) | 2.5 (0.8-8.4) |
| >20 Pack-years | Arg/Arg | 25 (5.0) | 17 (1.9) | 13.1 (4.5-38.7) |
| Never | His/His+His/Arg | 79 (15.9) | 392 (43.0) | 1.9 (0.8-4.4) |
| 0-20 Pack-years | His/His+His/Arg | 71 (14.3) | 290 (31.8) | 2.2 (0.9-5.3) |
| >20 Pack-years | His/His+His/Arg | 306 (61.6) | 150 (16.4) | 22.9 (9.5-55.3) |

[a] Unconditional analysis, adjusted for age, race-ethnicity, educational level, tobacco smoking or alcohol drinking.
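Because the tobacco and XPG results are reported as joint effects against a single reference group (never smokers with Arg/Arg), they can be summarized with two commonly used interaction measures. The sketch below is illustrative only: the slide does not report these measures, the function names are hypothetical, and the inputs are the adjusted point estimates for the >20 pack-years stratum with no account taken of their wide confidence intervals.

```python
# Joint odds ratios for lung cancer from the tobacco x XPG table
# (reference group: never smokers with Arg/Arg, OR = 1.0).
or_smoking_only = 13.1   # >20 pack-years, Arg/Arg
or_genotype_only = 1.9   # never smokers, His/His+His/Arg
or_both = 22.9           # >20 pack-years, His/His+His/Arg

def multiplicative_ratio(or_joint, or_a, or_b):
    """Ratio of the joint OR to the product of the single-exposure ORs.

    A value near 1 suggests no departure from a multiplicative model.
    """
    return or_joint / (or_a * or_b)

def reri(or_joint, or_a, or_b):
    """Relative excess risk due to interaction (additive scale).

    A value above 0 suggests a more-than-additive joint effect.
    """
    return or_joint - or_a - or_b + 1.0

ratio = multiplicative_ratio(or_both, or_smoking_only, or_genotype_only)
excess = reri(or_both, or_smoking_only, or_genotype_only)
print(f"Multiplicative interaction ratio: {ratio:.2f}")
print(f"RERI: {excess:.1f}")
```

With these point estimates, the joint effect is close to multiplicative and larger than additive, though the small Arg/Arg cell counts make both summaries imprecise.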
