    Considerable attention is currently focused on identifying the events that lead to the development of so-called tumor-initiating cells, as understanding this might facilitate the design of more effective cancer therapies. It is becoming increasingly evident that, in addition to genetic alterations, tumor development involves the alteration of gene expression patterns owing to epigenetic changes. Recent studies have implicated the Polycomb group proteins (PcG proteins) as key contributors to these changes. The PcG proteins form multiprotein repressive complexes, called Polycomb repressive complexes (PRCs), which repress transcription by a mechanism that probably involves the modification of chromatin.

    Several genetic studies in different organisms have firmly established the vital and conserved roles of PcG proteins in embryonic development and adult somatic cell differentiation. Moreover, recent studies have demonstrated that the PcG proteins are required for maintaining the correct identities of stem, progenitor and differentiated cells. The genome-wide mapping of PcG target genes in mammalian cells has offered scientists the opportunity to start to unravel the molecular mechanisms of PcG protein action. The PcG proteins have been found to bind and repress the promoters of genes that encode proteins with key roles in cell fate determination in many different cell lineages. Although these data support the large body of evidence that points to crucial roles for the PcG proteins in both development and adult homeostasis, we are only beginning to understand how the PcG proteins actually regulate their target genes.

    Initial studies have established that the PcG proteins are displaced from certain target genes, for example the homeobox (Hox) genes, on their transcriptional activation during differentiation. However, subsequent studies demonstrated that the binding of PcG proteins is much more dynamic than anticipated, showing that the PcG proteins are also recruited to the promoters of certain genes in response to differentiation signals and, importantly, that this recruitment is required for their silencing during differentiation. On the basis of these results, we and others have proposed a model in which the PcG proteins function dynamically during development and differentiation to lock off the expression of alternative fate regulators in any particular lineage. In this Review we propose that the deregulation of these mechanisms is central to tumor initiation.

    PcG proteins in cancer

    The PcG proteins are essential for the maintenance of both normal and cancer stem cell populations20–22. This is partly attributed to their ability to bind to and repress the CDKN2B and CDKN2A loci, which encode the tumour suppressors INK4B (encoded by CDKN2B), INK4A and ARF (both encoded by CDKN2A)21,23–31. INK4A and INK4B function upstream in the RB pathway, and ARF functions upstream in the p53 pathway25,32. In addition to frequent genetic alterations, this locus is often epigenetically silenced by DNA methylation in cancer, and the PcG proteins have been proposed to contribute to this26. Many additional PcG target genes accumulate DNA methylation on their promoters in cancer, such as Wilms tumor 1 (WT1), retinoic acid receptor-β (RARB), Kruppel-like factor 4 (KLF4), inhibitor of DNA binding 4 (ID4), GATA binding protein 3 (GATA3), chromodomain helicase DNA binding protein 5 (CHD5) and PU.1 (also known as SPI1)13. The reports that enhancer of zeste homologue 2 (EZH2)33 and chromobox homologue 7 (CBX7)34 can physically associate with DNA methyltransferases (DNMTs) suggest a mechanism whereby the PcG proteins directly contribute to the altered DNA methylation profiles that

    Polycomb group proteins: navigators of lineage pathways led astray in cancer

    Adrian P. Bracken* and Kristian Helin

    Abstract | The Polycomb group (PcG) proteins are transcriptional repressors that regulate lineage choices during development and differentiation. Recent studies have advanced our understanding of how the PcG proteins regulate cell fate decisions and how their deregulation potentially contributes to cancer. In this Review we discuss the emerging roles of long non-coding RNAs (ncRNAs) and a subset of transcription factors, which we call cell fate transcription factors, in the regulation of PcG association with target genes. We also speculate about how their deregulation contributes to tumorigenesis.

    genes7–9, and poorly differentiated and aggressive human tumours show preferential repression of PcG target genes6. Taken together, these results suggest a possible scenario in which PcG proteins and DNA methylating enzymes (such as DNMTs) cooperate to aberrantly silence pro-differentiation and anti-proliferative genes, which leads to the accumulation of a population of cells unable to respond to differentiation signals. It is thought that the consequent block of differentiation may allow these tumour-initiating cells to linger and accumulate the additional epigenetic and/or genetic alterations necessary to develop into a tumour.

    However, a key question remains unanswered: what triggers the aberrant silencing of PcG target genes that is observed in many cancer types? One potential scenario is that PcG proteins, such as EZH2 and BMI1, become aberrantly upregulated, leading to the progressive recruitment of DNMTs to PcG target genes, a switch to a more permanent transcriptional silencing and the generation of tumour-initiating cells. Supporting evidence for this hypothesis includes the fact that several PcG proteins are highly expressed in cancer10. For example, BMI1 is amplified and overexpressed in B cell lymphoma and functions as an oncogene that cooperates with Myc in a mouse model of lymphoma35–38. Similarly, suppressor of zeste 12 homologue (SUZ12) is translocated in endometrial cancer39, and EZH2 is amplified and highly expressed in many tumour types40–47. Potentially contributing to these increased EZH2 levels, the microRNA miR-101 has recently been reported to directly target EZH2 and is itself deleted in some cancers48,49. However, despite the functional evidence for a role of PcG proteins, particularly BMI1, in the development of cancer, the higher levels of these proteins frequently observed in tumours could partly be a consequence of the high proportion of proliferating and/or stem-like cells in tumours. For example, BMI1 has been reported to be highly expressed in normal stem cells50, and EZH2 expression correlates with proliferation rate as it is controlled by the RB–E2F pathway41. Therefore, in this Review we discuss an alternative and complementary hypothesis in which PcG proteins are led astray in cancer by the deregulation of factors that are required for their association with target genes. We propose that the deregulation of these factors directly contributes to the aberrant modulation of transcriptional programmes observed in many cancers.

    PcG recruitment to target genes

    PcG proteins do not have the ability to bind specific DNA motifs. Therefore, a key mechanistic question concerns how they are recruited to and displaced from their target genes during lineage specification. The answer to this question not only has implications for our fundamental understanding of lineage choice during development and differentiation, but may also considerably contribute to our understanding of the initiating events in cancer.

    Transcription factors recruit PcG proteins. In Drosophila melanogaster, several transcription factors are required to recruit PcG proteins to polycomb repressive elements (PREs) during development51. One such transcription factor, encoded by Yy1 (also known as Pho), has recently been shown to co-occupy most PREs with PRC1 and PRC2 components in D. melanogaster embryos and larvae19,52. The PRE in D. melanogaster is not an easily recognizable DNA sequence motif as it is not a single transcription factor binding site. Instead, it is a collection of binding sites, defined as an element of several hundred base pairs. To date, PREs have not been defined in mammalian cells, despite the mapping of several thousand binding sites for the PcG proteins12–14. This suggests that many different mammalian transcription factors contribute to the recruitment of the PcG proteins. In fact, if one looks at the target genes regulated by the PcG proteins in mammalian cells and considers how they are expressed in different cell types, it becomes difficult to imagine that only a few transcription factors are involved in PcG recruitment and displacement. It is likely that the requirement of multiple transcription factors confers a much greater flexibility of target gene regulation. On this basis, it will be essential to define these transcription factors, because their deregulation could be key to inducing cancer; that is, they could work as oncogenes or tumour suppressors.

    So which transcription factors control the association of PcG proteins with their target genes? It has been estimated that the human genome encodes approximately 2,600 transcription factors53,54. We propose that cell fate transcription factors (CFTFs) are strong candidates for the regulation of PcG protein recruitment to and dissociation from their target genes. We define CFTFs as all transcription factors that function to regulate cell fate decisions during either embryogenesis or adult cell differentiation. Interestingly, most if not all CFTFs are themselves PcG target genes12–14. Some examples include the Hox, Sox, Runx, Fox, Pax and Gata transcription factor families. Functionally, they are known to regulate many key cell fate decisions, both in stem cells and during differentiation, by activating

    At a glance

    The Polycomb group (PcG) proteins regulate cell fate decisions during development and differentiation. They form multiprotein repressive complexes called Polycomb repressive complexes (PRCs), which modify chromatin.

    The PcG proteins bind and repress the promoters of hundreds of genes encoding proteins with roles in cell fate determination.

    It is unclear how PcG proteins are displaced and recruited to different subsets of target genes during cell fate decisions. However, cell fate transcription factors (CFTFs) and long non-coding RNAs (ncRNAs) are emerging as potential regulators.

    There is growing evidence that many PcG target genes are silenced in advanced cancer and that this may be the result of an epigenetic switch to DNA methylation during neoplastic progression.

    Several PcG proteins are known to be deregulated in cancer. We propose that the deregulation of CFTFs and long ncRNAs also leads to the misexpression of PcG target genes.


    774 | NOVEMBER 2009 | VOLUME 9

    2009 Macmillan Publishers Limited. All rights reserved
    NATURE REVIEWS | CANCER VOLUME 9 | NOVEMBER 2009 | 775

    [Figure 3 panel titles: Recruitment of PcGs by CFTFs; Displacement of PcGs by CFTFs; Recruitment of PcGs by long ncRNAs; Coordinated regulation of PcGs by long ncRNAs and CFTFs. Panel labels: Polycomb target gene promoter; long ncRNA promoter.]

    The potential role of CFTFs and long ncRNAs in cancer

    What is known about the role of CFTFs and long ncRNAs in cancer? The genetic evidence supporting a role for transcription factors in cancer is probably stronger than for any other functional group of proteins. For example, MYC is one of the best characterized human oncogenes, and TP53 (which encodes p53 in humans) and RB1 are the two most studied human tumour-suppressor genes. There is also strong evidence that at least 30 CFTFs are genetically altered and contribute to cancer in a tissue-specific manner (TABLE 1). The precise mechanisms of action are still poorly understood in many cases. However, the evidence suggests that their normal roles in the regulation of lineage-specific cell fate decisions become perturbed on their mutation, amplification, translocation and deletion in cancer. Some specific examples include the oncogenes MYB85, SOX2 (REF. 180), MITF86 and GATA2 (REF. 87), and the tumour suppressors GATA3 (REF. 88), CEBPα89, IKAROS and PAX5 (REF. 90). Importantly, in addition to CFTFs being genetically altered in human cancers, mouse models have established their significance as cancer-relevant genes (TABLE 1).

    The model emerging is that CFTFs can be subdivided into two classes on the basis of their normal function, and that the deregulation of both classes can potentially contribute to the formation of non-differentiated or tumour-initiating cells (FIG. 4). Oncogenes belong to the first class, as they are normally expressed in stem or progenitor cells, and the tumour suppressors are in the second class, as they are normally expressed during differentiation and are required for lineage specification. We propose that the deregulation of either class of CFTFs leads to the accumulation of

    Box 1 | Long non-coding RNAs

    Long non-coding RNAs (ncRNAs) are > 200 nucleotides in length.

    Around 3,500 human and 2,000 mouse long ncRNAs have been identified to date.

    Long ncRNAs are often transcribed from gene loci that are overlapping and interspersed among coding genes.

    Long ncRNAs were thought to be transcriptional noise. However, the fact that many long ncRNAs are expressed in tightly regulated temporal and regional patterns suggests that they have important biological functions.

    Some long ncRNAs have been shown to have diverse cellular functions, including imprinting, X chromosome inactivation, chromatin remodelling and transcriptional

    Several long ncRNAs are emerging as regulators of chromatin-modifying complex (such as polycomb repressive complex 2 (PRC2), MLL, G9A, CoREST and SMCX) recruitment to target genes.

    Some long ncRNAs are emerging as candidate oncogenes and tumour-suppressor genes. (For further information see REF. 67).

    Figure 3 | Potential mechanisms by which cell fate transcription factors and long non-coding RNAs function to regulate Polycomb group protein association with target genes during lineage choices and specification. a | Cell fate transcription factors (CFTFs) recruit Polycomb group (PcG) proteins to target genes during lineage decisions. b | CFTFs induce the dissociation of PcG proteins from target genes during lineage decisions. c | Long non-coding RNAs (ncRNAs) recruit PcG proteins to target genes during lineage decisions. d | Coordinated action of CFTFs and long ncRNAs is necessary to recruit PcG proteins to or dissociate them from target genes during lineage determination. The long ncRNAs can function either in cis or in trans.


    cells incapable of undergoing differentiation (FIG. 4). These pre-tumorigenic cells then have the potential to further progress to become tumours after the accumulation of additional genetic and/or epigenetic alterations. To illustrate this hypothesis we describe some examples of these two classes of CFTFs and pay particular attention to those CFTFs for which there is evidence of a functional interaction with PcG proteins.

    OCT4 is normally expressed in pluripotent cells of the early embryo and in ES cells, and it is required for maintaining these cells in an undifferentiated state91. A potential oncogenic activity of OCT4 was revealed when it was shown to be highly expressed in human germ cell tumours and was required for their continued growth92. In addition, the ectopic expression of OCT4 blocks progenitor cell differentiation and causes dysplasia in epithelial tissues93. Importantly, OCT4 occupies several hundred PcG target genes in human ES cells and is thought to contribute to the sustained recruitment of PcG proteins to the promoters of differentiation genes14. Therefore, the upregulation of OCT4 in cancer might lead to the persistent or sustained PcG-mediated repression of differentiation

    Table 1 | Cell fate transcription factors that are deregulated in human cancer

    Gene name | Role | Genetic alteration in human cancer | Evidence in cancer | Target genes
    ETV5 | Oncogene | Translocated in prostate cancer126 | In vitro model126 |
    ETV7 | Oncogene | Overexpressed in lymphoma127 | In vivo model127 |
    GATA6 | Oncogene | Amplified in pancreatic cancer128 | In vitro model128 |
    HOXA9 | Oncogene | Translocated in myeloid leukaemia102 | In vivo model129 |
    LMO1 | Oncogene | Translocated in T cell leukaemia130 | In vivo model131 |
    LMO2 | Oncogene | Translocated in T cell leukaemia130 | In vivo model132 |
    MITF | Oncogene | Amplified in melanoma133 | In vitro model133 | CDKN2A (INK4A) (REF. 134)
    MYB | Oncogene | Mutated in colon cancer and translocated in T-ALL85 | In vivo model135 |
    MYCN | Oncogene | Amplified in neuroblastoma136 | In vivo model137 |
    OTX2 | Oncogene | Amplified in medulloblastoma138,139 | In vitro model139 |
    PAX3 | Oncogene | Translocated in alveolar rhabdomyosarcoma140 | In vivo model141 |
    PLZF | Oncogene | Translocated in acute promyelocytic leukaemia142 | In vivo model143 |
    RUNX1 | Oncogene | Translocated in AML144 | In vivo model145 | NF1 (REF. 146)
    TAL1 | Oncogene | Translocated or mutated in T-ALL131,147 | In vivo model131,147 | CD4 (REF. 147)
    TBX2 | Oncogene | Amplified in breast cancer148 | In vitro model148 | (REF. 148)
    TITF1 | Oncogene | Amplified in lung cancer149,150 | In vitro model149,150 |
    CDX2 | TS | Mutated in colon cancer151,152 | In vivo model153 | CDKN1A (REF. 154)
    CEBPA | TS | Mutated in AML155 | In vivo model156 |
    FOXP3 | TS | Mutated and deleted in breast cancer157,158 | In vitro and in vivo models157,158 | ERBB2 (REF. 158) and SKP2 (REF. 157)
    GATA3 | TS | Silenced and mutated in breast cancer159,160 | In vivo model159 | FOXA1 (REF. 159) and CDKN2C161
    HOXA5 | TS | Silenced in breast cancer162 | In vitro model162 | TP53 (REF. 162)
    IKAROS | TS | Deleted in AML90,107 | In vitro model65 | HES1 (REF. 65)
    ING1 | TS | Mutated in squamous cell carcinoma163,164 | In vivo model165 |
    KLF6 | TS | Mutated in prostate cancer166 and deregulated in GBM167 | In vivo model167 | ATF3 (REF. 168)
    PAX5 | TS | Mutated, deleted or fused in ALL107 | In vitro model107 | CD19 and CD72 (REF. 107)
    PU.1 | TS | Inactivated or mutated in AML169,170 | In vivo model171,172 | JUNB171
    RUNX3 | TS | Mutated or silenced in gastric cancer173 | In vivo model174 |
    SMAD4 | TS | Mutated in pancreatic cancer175 | In vivo model176 |
    WT1 | Both | Mutated in hepatic cancer177,178 | In vitro model179 | CDKN1A (REF. 179)

    ALL, acute lymphoblastic leukaemia; AML, acute myeloid leukaemia; ATF3, activating transcription factor 3; CEBPA, CCAAT/enhancer binding protein-α; CDKN1A, cyclin-dependent kinase inhibitor 1A; CDKN2A, cyclin-dependent kinase inhibitor 2A; CDKN2C, cyclin-dependent kinase inhibitor 2C; CDX2, caudal type homeobox 2; ETV, ets variant; Fox, forkhead box; GBM, glioblastoma multiforme; HES1, hairy and enhancer of split 1; Hox, homeobox; LMO, LIM domain only; ING1, inhibitor of growth family, member 1; KLF6, Kruppel-like factor 6; MITF, microphthalmia-associated transcription factor; NF1, neurofibromin 1; OTX2, orthodenticle homeobox 2; Pax, paired box; Runx, runt-related transcription factor; SKP2, S-phase kinase-associated protein 2; TAL1, T-cell acute lymphocytic leukaemia 1; T-ALL, T cell acute lymphoblastic leukaemia; TBX2, T-box 2; TS, tumour suppressor; WT1, Wilms tumor 1.


    genes and a consequent block of the ability of cells to respond to differentiation cues (FIG. 4). Several other CFTFs, such as MYB85, PLZF94, HOXA9 (REFS 95–97), PAX3 (REF. 98) and PAX7 (REF. 99), are known to function in tissue-specific stem and progenitor cells and have been found to have gain of function in cancer. PLZF recruits BMI1 and the associated PRC1 complex to repress the Hoxd locus during mouse development61. PLZF is expressed in haematopoietic stem and progenitor cells and is an essential regulator of spermatogonial stem cell maintenance94,100. Importantly, the PLZF–retinoic acid receptor-α (RARα) fusion protein, like the promyelocytic leukaemia (PML)–RARα fusion protein, can aberrantly recruit PcG proteins to target genes during cancer development62,101. This raises the possibility that other CFTFs form fusion proteins with this ability. For example, HOXA9 is expressed in haematopoietic stem cells and progenitors and is translocated in myeloid leukaemia102. Similarly, the PAX3 and PAX7 CFTFs function during embryonic myogenesis (muscle development) and are translocated in alveolar rhabdomyosarcoma, a childhood cancer of skeletal muscle cells58. Several other CFTFs that are normally expressed in undifferentiated cells are deregulated in cancer without being expressed as fusion proteins (TABLE 1). For example, MYB is expressed in colon stem cells and progenitor cells; it is genetically disrupted in colon cancer by a mutation in an intron, leading to higher expression levels85. SOX2, like OCT4, occupies a subset of PcG target genes in ES cells and is also expressed in tissue-specific stem and progenitor cells, including neural, lung and oesophageal cells103–105. Interestingly, SOX2 is amplified in both lung and oesophageal squamous cell carcinomas, suggesting that its role in cancer is to maintain cells in a pre-terminally differentiated state180. It will be interesting to discover whether MYB, PAX3, PAX7 and other CFTFs are involved in regulating PcG target genes and/or whether they regulate PcG function.

    We propose that the second group of CFTFs promotes differentiation by recruiting PcG proteins to stem cell genes and/or by displacing PcG proteins from differentiation target genes (FIG. 4). When inactivated in cancer (for example by deletion or mutation) this would lead to cells being unable to repress stem cell genes and/or activate a programme of differentiation genes. Two potential examples, PAX5 and IKAROS, are required for B lymphocyte differentiation58,106 and are frequently deleted in acute lymphoblastic leukaemias90,107. Interestingly, mice with loss of function of Ikaros or deletion of Pax5 have reduced H3K27me3 at certain loci, suggesting that these CFTFs functionally interact with PcG proteins65,108. Another example is GATA3, which is known to be required for luminal cell differentiation in breast epithelia88 and is mutated in breast cancers109. C/EBPα, however, is required for granulocytic differentiation of bipotent granulocyte-macrophage progenitor cells and is mutated in acute myeloid leukaemia89. Most of these lineage-specific CFTFs remain to be characterized in terms of their interactions with PcG proteins and epigenetic modifiers per se (TABLE 1).

    The idea that the deregulation of CFTFs can change cell fate and lead to tumour development has been highlighted by the recent interest in cellular reprogramming110–112 (BOX 3). Adult somatic cells can be induced to trans-differentiate into cell types of other lineages or de-differentiate into embryonic stem-like cells called induced pluripotent stem cells. It is now clear that the controlled gain or loss of expression of specific sets of CFTFs in different contexts has the power to reprogramme cell identity. This has been shown to lead to a resetting of the epigenetic landscape in these cells113,114, further supporting the hypothesis that the deregulation of even one CFTF could potentially induce epigenetic reprogramming and contribute to tumour initiation. Importantly, it is also likely that oncogenic CFTFs (such as OCT4 and SOX2) when activated or tumour suppressive CFTFs (such as PAX5 and IKAROS) when inactivated lead to a block of the differentiation of immature cells or a de-differentiation of more mature cells (FIG. 4).

    Do CFTFs always have to be genetically altered as outlined in TABLE 1? The most likely answer is no. It is well established that cellular signalling pathways are commonly deregulated in human cancer115. Most if

    Box 2 | Overview of transcriptome methodologies

    Global analysis of mRNA expression

    Expression microarrays. These are used to quantify mRNA expression levels in a cell. They consist of an arrayed series of thousands of microscopic spots of DNA oligonucleotides, each representing a gene, which are used as probes to hybridize a cDNA or cRNA sample. After hybridization, the microarray is scanned and software is used to determine the expression levels of the mRNAs represented. Expression microarrays are restricted to the detection of genes that are represented on the array, and this can be limited in certain cases. For example, to cover the entire human genome would require as many as 20 chips. This becomes technically challenging, labour intensive and expensive.

    RNA sequencing. This is a relatively new technology that applies high-throughput next-generation sequencing technologies to sequence cDNA to obtain information about the RNA content of a sample120. This method is both quantitatively and qualitatively superior to expression arrays. It allows the accurate quantification of genes expressed at low levels and qualitatively allows the monitoring of all non-annotated genes, including long non-coding RNAs.

    Genome-wide location analysis

    ChIP-on-chip or ChIP-chip. Chromatin immunoprecipitation (ChIP) is a method to map the DNA location of transcription factors, chromatin remodellers or histone modifications. The principle underpinning this assay is that DNA binding proteins in living cells are bound to DNA. By using an antibody that is specific to the DNA binding protein, one can immunoprecipitate the protein–DNA complex. The immunoprecipitated DNA is then isolated and detected by hybridizing the amplified DNA on a microarray (chip) (ChIP-on-chip or ChIP-chip). The lysate used for the immunoprecipitation can be prepared from non-treated cells, or cells treated with formaldehyde, which cross-links proteins bound to DNA.

    ChIP–sequencing. This is a recent advancement of the global analysis of transcription factor binding sites by ChIP. It applies high-throughput next-generation sequencing technologies to sequence the ChIP DNA. This method is better than ChIP-chip, because it allows the researcher to ascertain the location of binding sites anywhere in the genome. Although this is also possible with ChIP-chip, for the human genome this would require as many as 20 chips. Moreover, the ChIP–sequencing technology generates results with greater resolution and higher signal-to-noise values.


    [Figure 4 labels: stem cell; differentiated cell; tumour-initiating cell; stem cell gene; differentiation gene; gain of CFTF1 activity (e.g. in OCT4 or SOX2); loss of CFTF2 activity (e.g. in PAX5 or C/EBPα); additional genetic and epigenetic alterations.]

    not all of these pathways regulate cell fate decisions by controlling the abundance and/or activity of downstream effector CFTFs. So, for example, although OCT4 has not been shown to be genetically altered in germ cell cancers, it responds to signalling from fibroblast growth factors, the leukaemia inhibitory factor–signal transducer and activator of transcription 3 pathway, the transforming growth factor-β–bone morphogenic protein pathway and the nodal and Wnt pathways, any of which could be deregulated116. It is therefore logical to assume that many additional CFTFs are deregulated in cancer as a consequence of altered signalling pathways. Consistent with this, several CFTFs are epigenetically, rather than genetically, deregulated in cancer. For example, PAX2 is upregulated in endometrial cancer117, and HOXA5 is downregulated in breast cancer109.

    The potential of long ncRNAs as drivers of tumour formation is also apparent. There is emerging, if so far limited, evidence that long ncRNA genes are indeed cancer relevant. The expression of HOTAIR is higher in metastatic breast cancer than in primary breast epithelial cells (R. Guptha and H.Y. Chang, personal communication). Moreover, these observations provide evidence that HOTAIR contributes to the metastatic phenotype, and that this correlated with the aberrant recruitment of PcG proteins to multiple target genes. In another recent study, Yu et al. searched for antisense transcripts associated with 21 well-known tumour suppressor genes118. They identified a 34.8 kb transcript (which they called p15AS) that was associated with the cyclin-dependent kinase inhibitor and PcG target gene CDKN2B, which is frequently silenced in leukaemia. The authors examined the expression of both CDKN2B and p15AS in leukaemic

    Figure 4 | Gain and loss of cell fate transcription factors may lead to the formation of tumour-initiating cells. This model illustrates how the loss or gain of function of two putative cell fate transcription factors (CFTFs) may lead to the formation of a tumour-initiating cell. a | Normal differentiation of a stem or progenitor cell. The levels of CFTF1 decrease during differentiation and the levels of CFTF2 and CFTF3 increase. The decrease of CFTF1 and the increase of CFTF2 lead to displacement of Polycomb group (PcG) proteins from the promoter of the differentiation gene. The increase in CFTF3 levels leads to the recruitment of PcG proteins to the promoter of the stem cell gene. b | Conversion of normal stem cells to tumour-initiating cells. In this case, the levels of CFTF1 become aberrantly high in the stem cells and as a consequence the differentiation gene remains repressed, rendering the cell insensitive to differentiation signals. c | The conversion of differentiated cells to tumour-initiating cells. In this scenario, CFTF2 function is lost in a differentiated cell (for example, by mutation or deletion of the gene) and therefore the cell reverts or de-differentiates to a more stem cell-like state or to a tumour-initiating cell in which the differentiation gene is aberrantly silenced. Notably, it is also possible that CFTF1 could be activated in a differentiated cell or that CFTF2 could be deleted or mutated in a stem cell or progenitor. In addition, the loss of CFTF3 activity could lead to an inability to repress the stem cell gene.
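
    The logic of the Figure 4 model can be written as a small Boolean sketch. This is only an illustration under stated assumptions: it treats each CFTF as simply active or inactive, assumes that PcG proteins leave the differentiation gene only when CFTF1 has fallen and CFTF2 has risen, and assumes that PcG occupancy always means repression. The names CellState and gene_expression are hypothetical and do not come from the Review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Activity of the three putative CFTFs of Figure 4 (True = active).

    Hypothetical simplification: real CFTF levels are graded, not binary.
    """
    cftf1: bool  # stem cell CFTF; keeps PcG on the differentiation gene
    cftf2: bool  # differentiation CFTF; helps displace PcG from that gene
    cftf3: bool  # differentiation CFTF; recruits PcG to the stem cell gene


def gene_expression(state: CellState) -> dict:
    """Return which of the two model genes are expressed (True = on)."""
    # Assumption: PcG is displaced from the differentiation gene only when
    # CFTF1 is low AND CFTF2 is high (panel a of the figure).
    pcg_on_differentiation_gene = state.cftf1 or not state.cftf2
    # Assumption: CFTF3 alone is sufficient to recruit PcG to the stem cell gene.
    pcg_on_stem_cell_gene = state.cftf3
    return {
        "stem_cell_gene": not pcg_on_stem_cell_gene,
        "differentiation_gene": not pcg_on_differentiation_gene,
    }


# The four situations described in the caption.
stem_cell = CellState(cftf1=True, cftf2=False, cftf3=False)
differentiated_cell = CellState(cftf1=False, cftf2=True, cftf3=True)
gain_of_cftf1 = CellState(cftf1=True, cftf2=True, cftf3=True)    # panel b
loss_of_cftf2 = CellState(cftf1=False, cftf2=False, cftf3=True)  # panel c

for name, state in [
    ("stem cell", stem_cell),
    ("differentiated cell", differentiated_cell),
    ("gain of CFTF1", gain_of_cftf1),
    ("loss of CFTF2", loss_of_cftf2),
]:
    print(name, gene_expression(state))
```

    Under these assumptions both deregulated states end with the differentiation gene silenced, which is the shared feature of the tumour-initiating cells in panels b and c. Whether the stem cell gene is also repressed in those cells depends on CFTF3, which is set to active in this sketch purely for illustration.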


    [Box 3 figure labels: pluripotent cells (ES and iPS cells); progenitors (neural, blood, pancreatic and mesenchymal progenitors); mature cells (B cells, T cells, mast cells, red blood cells, skin fibroblasts, acinar cells and beta cells). Factors shown on the conversions: OCT4, KLF4, SOX2 and MYC; OCT4 and KLF4; OCT4, KLF4, SOX2, MYC and C/EBPα; loss of PAX5; PDX1, NGN3 and MAFA.]

    leukocytes and found that in most cases p15AS expression was increased with a concomitant decrease in INK4B expression. Ectopic expression of p15AS was shown to increase DNA methylation levels at the CDKN2B promoter. An interesting and so far unexplored possibility is that the aberrantly high levels of this long ncRNA that are observed in cancer could lead to the permanent repression of the CDKN2B locus through PcG recruitment and subsequent accumulation of DNA methylation. Another study identified a 7 kb long ncRNA, named hcn, as a marker for mouse hepatocellular carcinoma (HCC)119. Expression of this long ncRNA was found to be eightfold higher in a mouse model of HCC compared with matched normal liver tissue. This high expression was observed in all stages of HCC, implicating it as a potential initiating lesion in the development of cancer. Furthermore, the authors identified a human orthologue of hcn, metastasis associated lung adenocarcinoma transcript 1 (MALAT1), which is highly expressed in human cancers. It will be interesting to determine the biological function of these long ncRNAs and in particular whether they function to recruit PcG proteins to target genes during lineage choices and whether their deregulation contributes to cancer. It is clear that long ncRNAs represent a promising candidate set of potential oncogenes and tumour suppressor genes.


    Current research efforts are directed at understanding the mechanisms by which PcG proteins are recruited to and displaced from their target genes during lineage specification. Both CFTFs and long ncRNAs are emerging as key regulators of these events. In the next few years, we will see a number of papers in which the target genes of cancer-relevant CFTFs will be delineated. Analogous to the unravelling of the transcriptional networks controlling ES cells, we expect that similar efforts will establish the transcriptional networks of adult stem cells, progenitors and differentiated cells. Studies will also address the hypothesis that PcG recruitment is regulated by CFTFs and long ncRNAs during lineage choice. It will be important to determine whether specific subsets of PcG target genes are activated or repressed by specific CFTFs or long ncRNAs during lineage choices. Moreover, it will be interesting to determine to what extent CFTFs regulate the expression of long ncRNAs, and whether long ncRNAs dictate the recruitment of the PcG proteins. These studies will provide important information regarding the molecular mechanisms that control normal cell fate decisions and also how the deregulation of key players (that is, CFTFs and long ncRNAs) might lead to cancer. For instance, do the many alterations in CFTFs that have been documented in various tumours (TABLE 1) contribute to the altered epigenetic profiles observed in these tumours? Currently, remarkably little is known about these specific transcriptional alterations and how they are influenced by the deregulation of the CFTFs and/or long ncRNAs.

    Over the next 5 years many functionally important long ncRNAs will probably be uncovered. It will be fascinating to learn how many of the long ncRNAs will emerge as bona fide oncogenes and tumour suppressor genes. Anticancer therapies targeting lineage-specific CFTFs or long ncRNAs may have advantages over drugs directed at PcG proteins or DNA methylation enzymes, as they are more likely to be cell type specific. In conclusion, it is clear that a better understanding of the role of CFTFs and long ncRNAs in modulating the epigenetic activity of the PcG proteins will provide more targets for anticancer therapy, and therefore is promising for the tailor-made individualized treatment of cancer patients.

    Box 3 | The power of transcription factors: cellular reprogramming

    Terminally differentiated mammalian cells are considered to be stable and rarely, apart from in aberrant situations such as cancer development, do they de- or trans-differentiate121. Owing to this property it was originally believed that the fates imposed on somatic cells during development were final (or terminal), and that cells could not be reprogrammed. However, the successful cloning of animals by somatic cell nuclear transfer demonstrated that unidentified factor(s) in the oocyte had the ability to revert or alter developmental decisions. Since these initial experiments, several studies have demonstrated that cell fate transcription factors (CFTFs), when deleted or induced exogenously, can cause a switch in the fate of various somatic cell types111 (see the figure). For example, the removal of Pax5 from mouse B lymphocytes results in their de-differentiation to blood progenitor cells that are capable of forming several haematopoietic lineages122. On the exogenous induction of CCAAT/enhancer-binding protein-α (C/EBPα), B cells and T cells can be converted into macrophages123,124. Pancreatic exocrine cells can be converted to insulin-producing β-cells in vivo, by a combination of neurogenin 3 (NGN3), pancreatic and duodenal homeobox 1 (PDX1) and MAFA CFTFs125. In their seminal paper, Takahashi and Yamanaka demonstrated that four CFTFs (OCT4, SOX2, KLF4 and MYC) were capable of reprogramming adult mouse skin fibroblasts into a population of cells that, in many ways, are both molecularly and functionally indistinguishable from embryonic stem (ES) cells112. These cells, termed induced pluripotent stem (iPS) cells, have also been derived from human skin fibroblasts and several other cells of alternative lineages111.


