METHODS AND PROTOCOLS

Expression of family 3 cellulose-binding module (CBM3) as an affinity tag for recombinant proteins in yeast

Wen Wan & Dongmei Wang & Xiaolian Gao & Jiong Hong

Received: 3 February 2011 / Revised: 4 May 2011 / Accepted: 6 May 2011 / Published online: 9 June 2011
© Springer-Verlag 2011

Abstract Easy and low-cost protein purification methods for the mass production of commonly used enzymes that play important roles in biotechnology are in high demand. In this study, we developed a fast, low-cost recombinant protein purification system in the methylotrophic yeast Pichia pastoris using the family 3 cellulose-binding module (CBM3)-based affinity tag. The codon of the cbm3 gene from Clostridium thermocellum was optimized based on the codon usage of P. pastoris. The CBM3 tag was then fused with enhanced green fluorescent protein (CBM3-EGFP) or with inulinase and expressed in P. pastoris to demonstrate its ability to function as an affinity tag in a yeast expression system. We also examined the effects of glycosylation on the secreted CBM3-tag. The secreted wild-type CBM3-EGFP was glycosylated; however, this had little influence on the adsorption of the fusion protein to the regenerated amorphous cellulose (RAC; maximum adsorption capacity of 319 mg/g). Two CBM3-EGFP mutants lacking glycosylation sites were also constructed. The three CBM3-EGFPs expressed in P. pastoris and the CBM3-EGFP expressed in Escherichia coli all had similar RAC adsorption capacity. To construct a tag-free recombinant protein purification system based on CBM3, a CBM3-intein-EGFP fusion protein was expressed in P. pastoris. This fusion protein was stably expressed and the self-cleavage of intein was efficiently induced by DTT or L-cysteine. In this study, we were able to purify the recombinant fusion protein with high efficiency using both intein and direct fusion-based strategies.

Keywords Cellulose-binding module · Affinity tag · Yeast · Glycosylation

Introduction

Affinity chromatography using a variety of affinity tags on various resins is commonly used in many laboratory and biotechnology applications (Arnau et al. 2006; Esposito and Chatterjee 2006; Hartley 2006; Fong et al. 2010). Many of the current affinity chromatography matrices, including Ni-NTA agarose and glutathione resins, are too costly to be used in the large-scale purification of nontherapeutic proteins, including industrial enzymes (Fong et al. 2010). More cost-effective methods are also needed for the purification of more expensive therapeutic proteins. Developing alternative methods for the mass purification of recombinant proteins that are simple, of low cost, and environmentally friendly remains a challenging endeavor (Arnau et al. 2006; Przybycien et al. 2004).

The cellulose-binding module (CBM) is an attractive affinity tag for protein purification for several reasons: (1) the highly specific binding ability of the protein fused with a CBM tag, (2) its low nonspecific binding for other proteins, (3) the efficient release of bound protein under nondenaturing conditions, (4) its enhanced protein folding and secretion (Murashima et al. 2003), and (5) its increased protein yield (Ahn et al. 2004). Cellulose offers a number of advantages that make it an ideal matrix for large-scale affinity purification. At less than $5 per gram, regenerated amorphous cellulose (RAC) is inexpensive. This matrix has physical properties that are highly amenable to protein purification. For example, it can endure high-speed centrifugation at 12,000×g and is stable in a variety of buffers with varying pH levels. Cellulose is inert, i.e., it does not react with most substrates in buffers or protein side groups. Cellulose has low nonspecific affinity for most proteins and is commercially available in many different forms, including cotton, filter membranes, powder, fibers, and beads. Furthermore, cellulose is safe and has been approved for many pharmaceutical and human applications.

Electronic supplementary material The online version of this article (doi:10.1007/s00253-011-3373-5) contains supplementary material, which is available to authorized users.

W. Wan · D. Wang · X. Gao · J. Hong (*)
School of Life Science, University of Science and Technology of China, Hefei, Anhui, People’s Republic of China
e-mail: [email protected]

W. Wan · X. Gao · J. Hong
Hefei National Laboratory for Physical Science at the Microscale, Hefei, Anhui, People’s Republic of China

X. Gao
Department of Biology and Biochemistry, University of Houston, Houston, TX, USA

Appl Microbiol Biotechnol (2011) 91:789–798
DOI 10.1007/s00253-011-3373-5

The CBM tag has been used for recombinant protein purification on commercial cellulose matrix or powder (i.e., Avicel, a microcrystalline cellulose; SigmaCell; or amorphous cellulose) in different capacities (Ramirez et al. 1993; Kavoosi et al. 2004; Tomme et al. 1998; Murashima et al. 2003; Ahn et al. 2004; Hong et al. 2008b, 2007b; Shoseyov et al. 2006). These reports have been reviewed previously (Shoseyov et al. 2006; Levy and Shoseyov 2004, 2002). However, most of the CBM-based purification systems have been developed for recombinant proteins expressed in Escherichia coli.

Among eukaryotic expression systems, yeast offers several advantages, including allowing proteins to undergo post-translational modification. Even more important, some proteins can only be expressed in eukaryotic cells. However, only a few yeast expression systems using CBM affinity tags have been reported (Boraston et al. 2001a). While the application of family 3 CBM (CBM3) from C. thermocellum in protein immobilization and as an affinity tag has been widely reported in E. coli (Hong et al. 2008a, b; Levy and Shoseyov 2002; Shoseyov et al. 2006; Guerreiro et al. 2008), systematic research on its use in yeast has not been reported.

Affinity tags are robust tools in protein purification. However, several applications, including pharmaceutical protein purification, require their removal. This procedure is expensive and remains a significant issue that must be resolved before affinity tag-based purification methods become widely used (Fong et al. 2010). Self-cleavable intein-based tags offer an alternative method for the removal of these tags. Inteins can excise themselves and/or join two fragments together when the pH or the thiol reagent concentration is adjusted. An intein mutated at a terminal amino acid can cleave one portion of a fusion protein from the tag (Zhao et al. 2008; Shoseyov et al. 2006; Chong et al. 1996; Chong and Xu 1997; Fong et al. 2010; Elleuche and Poggeler 2010). Self-cleaving inteins have been used to replace more costly peptide-specific protease-based methods and to simplify the purification process. Although some inteins from eukaryotic cells have been used in bacterial expression systems, as in the IMPACT system from New England Biolabs (Ipswich, MA, USA), the application of intein-based affinity tag cleavage in eukaryotic cells is rare. To date, Sce VMA (an intein derived from the Saccharomyces cerevisiae VMA1) has only been used to express human granulocyte macrophage colony stimulating factor in Pichia pastoris (Babu et al. 2008).

In this study, we developed a protein expression and purification method in P. pastoris using codon-optimized CBM3 from C. thermocellum. We also investigated the effect of N-glycosylation on CBM3 and its adsorption capacity. Our results show that CBM3 is effective in yeast, and that the cellulose adsorption capacity of the fusion protein does not change significantly compared with bacterial CBM3. A tag-free target protein purification system was also constructed with the intein Sce VMA, and our results show that this system is also effective with CBM3 in P. pastoris.

Methods and materials

Chemicals and strains

All chemicals were reagent grade and purchased from Sigma (St. Louis, MO) or Sangon Biotech, Inc. (Shanghai, China), unless otherwise noted. Microcrystalline cellulose (FMC PH-105 (20 μm)) was obtained from FMC Co. (Philadelphia, PA). E. coli XL10 gold was used as a host cell for DNA manipulation. Luria-Bertani (LB) medium with 100 μg/mL ampicillin was used to culture E. coli. P. pastoris KM71H (Invitrogen, Carlsbad, CA, USA) was used for recombinant protein expression. YPD medium (1% yeast extract, 2% peptone, and 2% dextrose) and BMGY medium (1% yeast extract, 2% peptone, 100 mM potassium phosphate pH 6.0, 1.34% YNB, 4×10⁻⁵% biotin, 1% glycerol) were used to culture P. pastoris. BMMY medium (1% yeast extract, 2% peptone, 100 mM potassium phosphate, pH 6.0, 1.34% YNB, 4×10⁻⁵% biotin, with 0.5% methanol) was used for P. pastoris protein expression. Expression strains were obtained by electroporation transformation of expression plasmids to P. pastoris KM71H as previously described (Hong et al. 2007a). The oligonucleotides were synthesized by Sangon Biotech, Inc. (Shanghai, China) (Table 1). RAC was prepared from microcrystalline cellulose (FMC PH-105) with the method previously described (Hong et al. 2008a).

Construction of the expression plasmids

To express the CBM3 fusion protein in yeast, the codons of CBM3 (GenBank accession no. EEU00265) were optimized based on the codon usage of P. pastoris. The codon-optimized cbm3 gene (GenBank accession number HQ232851) was synthesized as previously described (Zhou et al. 2004) and subcloned into the EcoR I and BamH I restriction sites of the pUCm-T plasmid (pUCBM). The egfp gene was then PCR amplified from pCG (Hong et al. 2008b) using the EGFP-SalI-F and EGFP-HindIII-R primers (Table 1) and inserted into the Sal I and Hind III restriction sites of pUCBM (pUCG). To insert the cbm3-egfp fusion gene into the pPICZα A expression vector (Invitrogen), the cbm3-egfp gene was PCR amplified using the CBM3-EcoRI-F and EGFP-KpnI-R primers (Table 1). Following digestion with EcoR I and Kpn I, the cbm3-egfp DNA was inserted into the same restriction sites of pPICZα A. In the resulting pPCG plasmid, the CBM3-EGFP was fused with the α-factor signal sequence at its N terminus. Various cbm3 mutants (described below) were constructed from this cbm3-egfp gene and inserted into the pPICZα A construct.

The plasmid (pGIC) expressing the EGFP-intein-CBM3 (GIC) fusion protein was constructed based on the pTYB1 plasmid (New England Biolabs). The egfp DNA fragment was amplified from pPCG using the EGFP-NdeI-F and EGFP-XhoI-R primers (Table 1) and subsequently digested using Nde I/Xho I. This fragment was inserted into the Nde I/Xho I digested pTYB1 plasmid in front of intein Sce VMA to produce the pTGI plasmid. Cbm3 DNA fragments were also amplified from pPCG using the CBM3-AgeI-F and CBM3-PstI-R primers (Table 1), digested with Age I/Pst I, and inserted into Age I/Pst I digested pTGI plasmid behind intein Sce VMA to produce the pTGIC plasmid. To insert the egfp-intein-cbm3 fusion gene into the pPICZ A expression plasmid, the egfp-intein-cbm3 gene was amplified with the GIC-EcoRI-F and GIC-NotI-R primers (Table 1). Following digestion with EcoR I and Not I, this gene was inserted into the pPICZ A plasmid to form the pGIC expression plasmid.

The pPCI plasmid, which expresses the fusion protein of CBM3mt2 and inulinase (CI), was constructed based on the plasmid for CBM3mt2-EGFP expression. The inulinase gene Inu1 was PCR amplified from Kluyveromyces marxianus genomic DNA using the INU-XbaI-F and INU-XbaI-R primers. Following its digestion with Xba I, this gene was inserted into the Xba I-digested plasmid, replacing the EGFP. The resulting plasmid was named pPCI.

Construction of CBM3 mutants

Glycosylation is the most common post-translational modification of proteins secreted by eukaryotic cells. Glycosylation occurs at asparagine residues at Asn-Xxx-Ser/Thr sequence motifs (N-linked) or serine and threonine residues (O-linked). Three potential N-glycosylation sites (Asn14, Asn68, and Asn124) were found in CBM3. To eliminate these sites, cbm3 mutants (Asn to Gln), in which the N-glycosylation sites were either partially or completely removed, were constructed using overlap extension PCR (Table 1).
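The motif search described above can be sketched in a few lines of Python. This is only an illustration: the function names and the toy sequence are hypothetical (the toy string is not the CBM3 sequence), and the exclusion of proline at the Xxx position is a commonly applied refinement that is assumed here rather than taken from the text.

```python
import re

# N-X-S/T sequon. The lookahead allows overlapping motifs to be reported.
# Excluding proline at position X is an assumption added for illustration.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues that start an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def mutate_asn_to_gln(protein: str, positions) -> str:
    """Replace Asn (N) with Gln (Q) at the given 1-based positions."""
    residues = list(protein.upper())
    for pos in positions:
        if residues[pos - 1] != "N":
            raise ValueError(f"residue {pos} is {residues[pos - 1]}, not Asn")
        residues[pos - 1] = "Q"
    return "".join(residues)


# Hypothetical toy sequence, NOT the CBM3 sequence.
toy = "MANGSAAANPTKKNITNNSS"
sites = find_sequons(toy)
print("potential N-glycosylation sites:", sites)
print("after Asn->Gln at all sites:", find_sequons(mutate_asn_to_gln(toy, sites)))
```

Applied to a real sequence, the same scan would list the candidate sites, and the Asn to Gln substitution at a subset of them corresponds to the partial or complete mutants described above.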

Table 1 Primers used in this study

Primer            Sequence (5′→3′) a
CBM3WT            ACGGAATTCCAGTTTCTGGTAATTTGAAAGTTGAATTTTATAATTC
CBM3mt1F          ACGGAATTCCAGTTTCTGGTAATTTGAAAGTTGAATTTTATAATTCTCAACCATC
CBM3mt2F          TATTGGTTCTCAAGGTTCTTATAATGG
CBM3mt2R          ATAAGAACCTTGAGAACCAATAATAG
CBM3mt3F          TTGGTCTCAATATACTCAATCTAATG
CBM3mt3R          ATTGAGTATATTGAGACCAATCATTTTTAG
CBM3-EcoRI-F      ACGGAATTCCAGTTTCTG
CBM3-BamHI-R      CGATGGATCCTGGTTCTTTAC
EGFP-XbaI-R       CTGTTCTAGACTTGTACAGCTCGTCCATG
EGFP-SalI-F       ACTGGTCGACATGGTGAGCAAGGGCGAGGAG
EGFP-KpnI-R       GGGGTACCTTGTACAGCTCGTCCATG
EGFP-HindIII-R    ACTGAAGCTTACTTGTACAGCTCGTCCATG
EGFP-NdeI-F       ACTGTTCATATGGTGAGCAAGGGCGAGAGCT
EGFP-XhoI-R       CCGCTCGAGCTTGTACAGCTCGTCCATGCCGA
CBM3-AgeI-F       ACTGAGACCGGTCCAGTTTCTGGTAATTTGAA
CBM3-PstI-R       AAAACTGCAGTTATGGTTCTTTACCCCAA
GIC-EcoRI-F       CCGGAATTCAATAATGTCTGTGAGCAAGGGCGAGGA
GIC-NotI-R        ATAAGAATGCGGCCGCTTATGGTTCTTTACCCCAA
INU-XbaI-F        AACTAGTCTAGAGATGGTGACAGCAAGGCCATC
INU-XbaI-R        AACTAGTCTAGAAGAACGTTAAATTGGGTAACG

a Restriction enzyme sites are underlined
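Overlap extension PCR relies on a forward and a reverse mutagenic primer that share a complementary region. As a rough check, the sketch below takes the CBM3mt2F and CBM3mt2R sequences from Table 1 and finds the region shared by the forward primer and the reverse complement of the reverse primer. The helper names are hypothetical, and the code does not check the reading frame or the position of the primers within cbm3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous stretch shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Primer sequences copied from Table 1.
mt2_f = "TATTGGTTCTCAAGGTTCTTATAATGG"
mt2_r = "ATAAGAACCTTGAGAACCAATAATAG"

overlap = longest_common_substring(mt2_f, reverse_complement(mt2_r))
print(overlap, len(overlap))
```

The shared region contains CAA, a glutamine codon, which is consistent with an Asn to Gln substitution being carried by both primers.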


Although genes containing all possible mutants were constructed (seven in total), once the N68Q mutant was shown to prevent glycosylation, expression of the remaining mutants was halted. Only the wild-type CBM3 (CBM3wt, containing three potential N-glycosylation sites), the CBM3mt (no N-glycosylation sites), and the CBM3mt2 (N68Q mutant) were expressed (Table 2). E. coli CBM3-EGFP was obtained as previously described (Hong et al. 2008b) and used as a control to assess the adsorption ability of the newly constructed plasmids.
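The count of seven follows from the three potential sites: every non-empty subset of sites gives one Asn to Gln mutant, so 2³ − 1 = 7. A small enumeration, using the site positions named in the text (the function name is hypothetical):

```python
from itertools import combinations

# Potential N-glycosylation sites as named in the text.
SITES = (14, 68, 124)


def asn_to_gln_mutants(sites):
    """All non-empty subsets of sites, each as a tuple of substitution labels."""
    mutants = []
    for k in range(1, len(sites) + 1):
        for subset in combinations(sites, k):
            mutants.append(tuple(f"N{pos}Q" for pos in subset))
    return mutants


mutants = asn_to_gln_mutants(SITES)
for labels in mutants:
    print("+".join(labels))
print(len(mutants), "mutants in total")
```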

Expression of the cbm3 fused protein

The expression plasmids (pPCG, pGIC, and pPCI) were linearized using Sac I and transformed into P. pastoris KM71H by electroporation using the EasySelect Pichia Expression Kit according to the manufacturer’s protocol, modified as previously described (Hong et al. 2007a). The transformed cells were then spread on YPDS (YPD medium containing 1 M sorbitol) plates containing 100 μg/mL Zeocin. Following 3 days of incubation at 30°C, 96 colonies were inoculated on a YPDS plate containing 1,000 μg/mL Zeocin for the selection of multicopy integrated strains.

The selected strains were inoculated in 500 mL of BMGY medium in 1-L Erlenmeyer flasks and cultivated at 30°C and 250 rpm for 24 h. The cells were recovered by centrifugation at 2,000×g for 10 min at room temperature and then resuspended in 100 mL BMMY medium in 1-L Erlenmeyer flasks. The cultures were then incubated at 30°C for 5 days (at 250 rpm). To maintain induction, 0.5 mL of 100% methanol was added every 24 h. For CBM3-EGFP and CBM3mt2-INU, the culture supernatants were harvested by centrifugation at 4°C and used for protein purification directly. For EGFP-intein-CBM3, the cells were harvested by centrifugation at 4°C, washed once with cell lysis buffer (20 mM Tris-HCl buffer at pH 8.5, 500 mM NaCl, and 1 mM EDTA), and then resuspended in 30 mL of cell lysis buffer. The cells were disrupted in an ice bath by ultrasonication (Fisher Scientific Sonic Dismembrator Model 500) with 5-s pulses at 40% strength for 10 min. The cell lysate was used for protein purification.

Purification of the CBM3 fused protein and tag-free EGFP

After the culture supernatants (CBM3-EGFPs and CBM3mt2-INU) and the lysate (GIC) were obtained, the target proteins were purified using two different purification paths. In the first path, the RAC slurry was directly added to the supernatant and mixed. The amount of RAC used was estimated based on results from the E. coli expression system, such that the load did not exceed ~3 μmol of target protein per 1 g of RAC (as the fluorescence could not easily be detected in BMMY medium, the amount of target protein was roughly estimated using SDS-PAGE). After mixing for approximately 30 min at room temperature, the mixture was loaded into an empty column. In the second purification path, as in most affinity chromatography protocols, the supernatant was added to an RAC column, and the target protein (CBM3-EGFPs, CBM3mt2-INU, or EGFP-intein-CBM3) was directly adsorbed to the column.
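The loading limit of about 3 μmol of target protein per gram of RAC translates into a minimum amount of RAC once the protein mass has been estimated. A minimal sketch follows; the function name, the 15 mg example, and the 50 kDa molar mass are illustrative assumptions, not values from the protocol.

```python
def rac_needed_g(protein_mg: float, molar_mass_kda: float,
                 max_umol_per_g: float = 3.0) -> float:
    """Minimum grams of RAC so that loading stays at or below max_umol_per_g."""
    if protein_mg < 0 or molar_mass_kda <= 0 or max_umol_per_g <= 0:
        raise ValueError("inputs must be positive")
    # 1 kDa corresponds to 1 mg per umol, so mg / kDa gives umol.
    protein_umol = protein_mg / molar_mass_kda
    return protein_umol / max_umol_per_g


# Hypothetical example: 15 mg of a fusion protein assumed to be ~50 kDa.
grams = rac_needed_g(protein_mg=15.0, molar_mass_kda=50.0)
print(f"use at least {grams:.2f} g RAC")
```

Because the protein amount is itself a rough SDS-PAGE estimate, a result like this is best read as a lower bound with some margin added.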

For each purification path, after the sample was loaded and adsorbed on the RAC column, a 5-fold RAC volume of 50 mM Tris-HCl buffer (pH 8.0) was used to wash the column and remove impure proteins from the RAC matrix. For proteins lacking the intein-based tags (CBM3-EGFPs and CBM3mt2-INU), the target protein was eluted using a 2-fold RAC volume of 100% ethylene glycol (EG). This elution was accomplished by thoroughly mixing the EG with RAC and centrifuging the column in a larger tube that is compatible with the centrifuge rotor (Fig. 1). Centrifugation speed was dependent on the tolerance of both the tube and the column. In this study, a speed of 5,000×g was used. To increase post-purification yield, the elution procedure was performed two times. Purified proteins can be stored in EG solution at −20°C. When they are ready for use, the EG can be removed by dialysis and the dilute protein can be either re-concentrated using ultrafiltration centrifugal tubes or used directly. Although glycerol can be used to elute the purified protein in place of EG, its high viscosity makes it less ideal for use in this method. For the GIC purification, by contrast, thiols (DTT or cysteine) were added after the washing step to induce intein self-cleavage, allowing the target protein to be cleaved from CBM3 and intein Sce VMA. The tag-free protein was then eluted by centrifugation.

Self-cleavage efficiency of intein at various temperatures and various concentrations of thiols

Table 2 CBM3 and its mutants

Amino acid substitution   Potential N-glycosylation site   Polypeptide abbreviation
None                      Asn-14, Asn-68, and Asn-124      CBM3wt
N68Q                      Asn-14 and Asn-124               CBM3mt2
N12Q N68Q N124Q           None                             CBM3mt

To examine the self-cleavage efficiency of intein, we first mixed 200 μL of RAC (5 mg/mL) with 2 mL of cell lysate of GIC at room temperature for 30 min. Following adsorption of the GIC, impure proteins were removed by washing the RAC with 5 volumes of 50 mM Tris-HCl (pH 8.0), 1 mM EDTA. Next, 10, 25, 50, or 100 mM of DTT or cysteine was added to examine the effects of the inducer concentration on self-cleavage of the target protein at 23°C. The efficiency of cleavage at different temperatures was determined by incubating the GIC-adsorbed RACs with 50 mM of the inducer at 4°C, 16°C, 23°C, or 37°C. After incubation (ranging from several hours to several days), the cleaved EGFP was recovered by centrifugation, and its fluorescent intensity was quantified.
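One plausible way to turn the fluorescence readings into a cleavage percentage is sketched below. It assumes that the fluorescence of the released EGFP is normalized to the fluorescence of the GIC adsorbed on RAC before cleavage, with an optional blank subtracted; the exact normalization used in the study is not spelled out here, and all readings in the example are hypothetical.

```python
from statistics import mean, stdev


def cleavage_percent(released: float, adsorbed_before: float,
                     blank: float = 0.0) -> float:
    """Released EGFP fluorescence as a percentage of adsorbed GIC fluorescence."""
    denominator = adsorbed_before - blank
    if denominator <= 0:
        raise ValueError("adsorbed fluorescence must exceed the blank")
    return 100.0 * (released - blank) / denominator


# Hypothetical triplicate readings (arbitrary fluorescence units).
adsorbed = 1000.0
released_triplicate = [740.0, 760.0, 750.0]

percents = [cleavage_percent(r, adsorbed) for r in released_triplicate]
print(f"cleavage: {mean(percents):.1f} ± {stdev(percents):.1f} % (n={len(percents)})")
```

Reporting the mean and standard deviation over three replicates mirrors the mean±SD, n=3 convention used in the figure captions.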

Adsorption of CBM3-EGFP on RAC

The equilibrium adsorption of CBM3-EGFP on RAC was conducted at room temperature. CBM3-EGFP was mixed with RAC for 30 min, the supernatant was recovered by centrifugation at 12,000×g for 5 min, and the fluorescent intensity of the recovered CBM3-EGFP was quantified. The maximum adsorption capacity of RAC was calculated based on the Langmuir equilibrium as previously described (Hong et al. 2007b).
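For readers unfamiliar with the Langmuir model, the sketch below uses its standard form, E_a = A_max·K_a·E_f / (1 + K_a·E_f), where E_f is the free protein concentration and E_a the amount adsorbed per gram of RAC. The fitting route shown (a Hanes-Woolf linearization with ordinary least squares) is an assumption chosen for simplicity and is not necessarily the procedure of the cited reference; the data are synthetic and the parameter values hypothetical.

```python
def langmuir(free: float, a_max: float, k_a: float) -> float:
    """Adsorbed amount (mg/g) at a free protein concentration (mg/L)."""
    return a_max * k_a * free / (1.0 + k_a * free)


def fit_langmuir(free, bound):
    """Estimate (a_max, k_a) via the linear form free/bound = free/a_max + 1/(a_max*k_a)."""
    x = list(free)
    y = [f / b for f, b in zip(free, bound)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # equals 1 / a_max
    intercept = mean_y - slope * mean_x  # equals 1 / (a_max * k_a)
    return 1.0 / slope, slope / intercept


# Synthetic, noise-free data from hypothetical parameters (not measured values).
free_conc = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
bound_amt = [langmuir(f, a_max=300.0, k_a=5.0) for f in free_conc]

a_fit, k_fit = fit_langmuir(free_conc, bound_amt)
print(f"A_max = {a_fit:.1f} mg/g, K_a = {k_fit:.2f} L/mg")
```

With real, noisy measurements a nonlinear least-squares fit of the untransformed equation is often preferred, since linearization reweights the errors.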

EGFP fluorescence detection, inulinase activity assay, and protein analysis

EGFP fluorescent intensity (excitation at 485 nm and emission at 528 nm) was detected using a SpectraMax M5 Multi-Mode Microplate Reader (Molecular Devices, Inc., Sunnyvale, California).

Inulinase activity was determined according to the method described by Gong et al. (2007). A reaction mixture containing 0.02 mL of the enzyme and 0.18 mL of 2.0% (w/v) inulin in 50 mM citrate phosphate buffer (pH 5.0) was incubated at 50°C for 20 min. The reaction was stopped by incubation at 100°C for 10 min. The amount of reducing sugar produced in the reaction mixture was assayed by the DNS method (Miller 1959). One inulinase unit (U) was defined as the amount of enzyme that produced 1 μmol of reducing sugar per minute under the assay conditions used in this study.
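The unit definition above can be applied as in the following sketch. It assumes the DNS reading has already been converted to μmol of reducing sugar with a standard curve (not shown), and the function name, dilution parameter, and example value are hypothetical; only the 20 min incubation and the 0.02 mL enzyme volume come from the assay description.

```python
def inulinase_u_per_ml(reducing_sugar_umol: float,
                       minutes: float = 20.0,
                       enzyme_ml: float = 0.02,
                       dilution_factor: float = 1.0) -> float:
    """Activity in U/mL, with 1 U = 1 umol reducing sugar released per minute."""
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("time and enzyme volume must be positive")
    units_in_assay = reducing_sugar_umol / minutes
    return units_in_assay / enzyme_ml * dilution_factor


# Hypothetical reading: 0.4 umol reducing sugar formed in the assay.
activity = inulinase_u_per_ml(0.4)
print(f"{activity:.2f} U/mL")
```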

The protein concentrations were determined by the Bradford method and standardized using bovine serum albumin (Bradford 1976). SDS-PAGE was performed in gels containing 12% or 10% (w/v) acrylamide and 0.1% SDS (w/v) using a Tris-glycine buffer system (Laemmli 1970). Following separation, the resolved proteins were visualized with Coomassie Brilliant blue R-250 staining.

Because of the high fluorescent background of medium supernatants and cell lysates, EGFP could not be measured using fluorescence-based methods to determine the purification yields. In this study, the crude and pure EGFPs were analyzed by SDS-PAGE. The amount of purified target protein was determined using the band’s density and the volume of the loaded protein. To accurately demonstrate the potential protein yield of our CBM3 tag system, we fused and expressed the inulinase gene from K. marxianus with CBM3 and determined its yield based on the enzyme’s activity.

Fig. 1 The purification of CBM3-EGFP with regenerated amorphous cellulose (RAC). 1, The column filled with RAC; 2, partly loaded column with supernatant; 3, fully loaded column with supernatant; 4, wash-off of the impure protein with Tris-HCl buffer (100 mM, pH 8.0); 5, mixture of the CBM3-EGFP bound RAC and EG; 6, elution of the CBM3-EGFP through centrifugation; 7, purified CBM3-EGFP and eluted RAC column

Fig. 2 SDS-PAGE analysis of purified CBM3-EGFPs. Lane 1, CBM3mt-EGFP; lane 2, CBM3mt-EGFP with Endo Hf digestion; lane 3, CBM3mt2-EGFP; lane 4, CBM3mt2-EGFP with Endo Hf digestion; lane 5, CBM3wt-EGFP; lane 6, CBM3wt-EGFP with Endo Hf digestion; lane 7, CBM3-EGFP from E. coli; M, protein molecular weight marker


Results

Expression and purification of the CBM3 mutant fusion protein variants

We identified three potential N-glycosylation sites in the amino acid sequence of CBM3. In order to determine the influence of glycosylation on the CBM3-based purification, we fused EGFP with seven mutant CBM3s partially or completely lacking potential N-glycosylation sites. Analysis of the RAC adsorption capacity and SDS-PAGE results of these mutants revealed that the N68Q mutation prevented glycosylation of the fusion protein (Fig. 2); therefore, only CBM3mt-EGFP, CBM3wt-EGFP, and CBM3mt2-EGFP were expressed.

The supernatant from each expressed culture was collected daily and checked by SDS-PAGE. The band of each CBM3-EGFP was detectable by SDS-PAGE analysis after 5 days of culture (data not shown), and the 5-day culture supernatant was used to purify the target protein. The expression levels of the three variants were different (Table s1 in the Electronic supplementary material (ESM)), which was attributed to differences among the selected strains (data not shown).

After adsorption, washing, and elution, the pure CBM3-EGFPs were recovered and analyzed by SDS-PAGE. In each experiment, only the specific target protein band was obtained for each purified CBM3-EGFP (Fig. s1 in the ESM; Fig. 2). Quantification of the bands showed that nearly half of the target proteins were recovered following purification (Table s1 in the ESM).

Our SDS-PAGE analysis results showed that only CBM3wt-EGFP was glycosylated (as indicated by the smeared band in Fig. 2, lane 5). This protein was deglycosylated following treatment with Endo Hf (New England Biolabs, USA) (Fig. 2, lane 6). No other samples displayed shifted bands following Endo Hf treatment.

Adsorption of CBM3-EGFP variants on RAC

Although CBM3wt was glycosylated, the maximum adsorption capacity of CBM3wt-EGFP to RAC was 319 mg/g. This was not significantly decreased relative to the maximum adsorption capacity of non-glycosylated CBM3wt-EGFP from E. coli (339 mg/g). The CBM3mt2-EGFP (N68Q mutant) had the highest observed maximum adsorption capacity (365 mg/g; Table 3).
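The comparison can be made explicit with a short calculation on the mean values reported in Table 3. The sketch below computes each variant's capacity relative to the E. coli protein and the molar mass implied by the mg/g and μmol/g columns; the variable and function names are hypothetical, and the standard deviations are ignored.

```python
# Mean maximum adsorption capacities from Table 3 (standard deviations omitted).
CAPACITY_MG_PER_G = {
    "CBM3e-EGFP": 339.0,
    "CBM3wt-EGFP": 319.0,
    "CBM3mt2-EGFP": 365.0,
    "CBM3mt-EGFP": 291.0,
}
CAPACITY_UMOL_PER_G = {
    "CBM3e-EGFP": 6.78,
    "CBM3wt-EGFP": 6.38,
    "CBM3mt2-EGFP": 7.30,
    "CBM3mt-EGFP": 5.82,
}


def percent_change(value: float, reference: float) -> float:
    """Signed change of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference


reference = CAPACITY_MG_PER_G["CBM3e-EGFP"]
for name, mg_per_g in CAPACITY_MG_PER_G.items():
    change = percent_change(mg_per_g, reference)
    # mg/g divided by umol/g gives mg/umol, which is numerically kDa.
    implied_kda = mg_per_g / CAPACITY_UMOL_PER_G[name]
    print(f"{name}: {change:+.1f}% vs E. coli, implied molar mass {implied_kda:.1f} kDa")
```

On these means, the glycosylated wild-type fusion comes out roughly 6% below the E. coli protein.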

GIC expression and cleavage condition

The egfp-intein-cbm3 gene was inserted into the pPICZ A plasmid and expressed in yeast. Following sonication, tag-free EGFP was purified using a procedure similar to that described for the purification of CBM3-EGFPs, with the exception of the elution step (Fig. 1). The EGFP was cleaved from the adsorbed intein-CBM3 (IC) as a pure, tag-free protein by DTT or L-cysteine induction. The cleavage conditions used for purification were established using varying DTT and L-cysteine concentrations and temperatures. As shown in Figs. 3 and 4, the cleavage of intein in EGFP-intein-CBM3 was faster at higher temperatures. A 48-h induction with 50 mM DTT at 37°C resulted in approximately 75% cleavage of EGFP from EGFP-intein-CBM3, whereas at 4°C only 45% was cleaved. After a 35-h induction using 50 mM cysteine at 37°C, more than 90% of EGFP was cleaved from EGFP-intein-CBM3; by contrast, only approximately 60% was cleaved at 4°C. A higher concentration of the inducer also increased the speed of intein cleavage. With 100 mM DTT, 80% of EGFP was cleaved in 24 h, while with 10 mM DTT, only approximately 41% was cleaved in 48 h. Similarly, with 100 mM L-cysteine, more than 90% of EGFP was cleaved in 24 h; however, with 10 mM L-cysteine, only 76% was cleaved in 36 h.

Fig. 3 The self-cleavage of GIC with DTT as the inducer. Data are given as mean±SD, n=3. a With various concentrations of DTT at 23°C. b Under various temperatures with 50 mM DTT. The fluorescence of GIC adsorbed to RAC before cleavage was set as 0%

Table 3 Binding properties of CBM3-EGFP variants to RAC

Variants        Maximum adsorption capacity    Ka
                mg/g       μmol/g              L/mg         L/μmol
CBM3e-EGFP      339±11     6.78±0.22           4.08±0.88    0.082±0.018
CBM3wt-EGFP     319±2.0    6.38±0.04           7.40±0.11    0.15±0.0022
CBM3mt2-EGFP    365±11     7.30±0.22           6.98±0.35    0.14±0.0070
CBM3mt-EGFP     291±13     5.82±0.26           4.83±0.83    0.097±0.017

Data are given as mean±SD, n=3
Ka association constant, CBM3e the CBM3-EGFP from E. coli

Our results also showed that L-cysteine more efficiently induced self-cleavage of the purified protein than DTT. With 50 mM L-cysteine at 23°C, more than 90% of EGFP was cleaved from the affinity tag in 24 h (Fig. 5, lanes 4 and 5); however, with 50 mM DTT at 23°C, only 70% of EGFP was cleaved following a 48 h induction. Using L-cysteine as the inducer, 4.71 mg of tag-free EGFP was recovered from 30 mL of lysate (492.57 mg crude protein). The specific fluorescence of the purified protein was 2.81×10⁵/mg of protein.

Expression and purification of CBM3-inulinase fusion protein

To evaluate the CBM3 tag system used in this study, expression and purification of the CBM3mt2-INU fusion protein were performed using the same procedure described for CBM3-EGFP. Adsorption of the CBM3mt2-INU protein on RAC (determined by subtracting the activity in flow through and wash fractions from the activity in medium supernatant) was 74.36%.

The activity and yield of the CBM3mt2-INU fusion protein were determined at each step of this purification procedure. In total, the purified yield of the enzyme was 40.9% and a purification of 11.86-fold was achieved (Table 4).
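The quantities reported here follow the usual purification-table definitions: adsorption by difference, yield as the fraction of total activity recovered, and purification fold as the ratio of specific activities. The sketch below illustrates these definitions with hypothetical numbers; the function names are illustrative and the inputs are not the measured values from Table 4.

```python
def adsorbed_percent(supernatant_u: float, flow_through_u: float,
                     wash_u: float) -> float:
    """Activity bound to RAC, by difference, as a percentage of the input."""
    return 100.0 * (supernatant_u - flow_through_u - wash_u) / supernatant_u


def purification_summary(crude_activity_u: float, crude_protein_mg: float,
                         pure_activity_u: float, pure_protein_mg: float) -> dict:
    """Yield (%) and purification fold from total activity and total protein."""
    crude_specific = crude_activity_u / crude_protein_mg  # U/mg
    pure_specific = pure_activity_u / pure_protein_mg     # U/mg
    return {
        "yield_percent": 100.0 * pure_activity_u / crude_activity_u,
        "fold": pure_specific / crude_specific,
    }


# Hypothetical numbers for illustration only.
bound = adsorbed_percent(supernatant_u=1000.0, flow_through_u=200.0, wash_u=50.0)
summary = purification_summary(crude_activity_u=1000.0, crude_protein_mg=200.0,
                               pure_activity_u=400.0, pure_protein_mg=8.0)
print(f"adsorbed: {bound:.1f}%")
print(f"yield: {summary['yield_percent']:.1f}%, purification: {summary['fold']:.1f}-fold")
```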

The purified CBM3mt2-INU protein was analyzed by SDS-PAGE. A portion of the CBM3mt2-INU fusion protein was glycosylated (Fig. 6, lanes 3 and 4). The sugar chains were removed following Endo Hf treatment (because Endo Hf and CBM3mt2-INU are similar in molecular weight, their bands overlap in the SDS-PAGE gel) (Fig. 6, lane 5).

Fig. 4 The self-cleavage of GIC with L-cysteine as the inducer. Data are given as mean±SD, n=3. a With various concentrations of L-cysteine at 23°C. b Under various temperatures with 50 mM L-cysteine. The fluorescence of GIC adsorbed to RAC before cleavage was set as 0%

Fig. 5 SDS-PAGE analysis of the purification of EGFP using intein-CBM3 fusion strategy. Only very small amounts of the GIC band were detected, but GIC was still adsorbed. After self-cleavage, the intein-CBM3 (IC) was still adsorbed on RAC, but EGFP was released to solution. M molecular weight marker; lane 1, crude extract from cells expressing GIC fusion protein; lane 2, crude extract from cells expressing GIC fusion protein after RAC adsorption (flow-through fraction); lane 3, protein adsorbed on RAC and stripped by SDS; lane 4, IC retained on RAC after self-cleavage and stripped by SDS; lane 5, purified tag-free EGFP

Discussion

To investigate the influence of glycosylation on the CBM3 affinity tag, wild-type CBM3 and two mutants lacking N-glycosylation sites were fused with EGFP and expressed extracellularly. Glycosylation of CBM3 did not significantly change the ability of CBM3-EGFP to adsorb to cellulose (adsorption was about 6% lower than that of non-glycosylated CBM3-EGFP expressed in E. coli). Although three glycosylation sites were predicted (Asn-Xaa-Ser/Thr sequence), SDS-PAGE of the crude and purified CBM3-EGFP and its mutants revealed that only N68 was glycosylated (Fig. 2). This glycosylation could be effectively removed by Endo Hf treatment (Fig. 2).
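The prediction of potential N-glycosylation sites from the Asn-Xaa-Ser/Thr motif can be sketched as a simple sequence scan. The code below is an illustration, not the prediction tool used in the study: the function name and the test sequence are hypothetical, and the exclusion of proline at the Xaa position is the commonly applied sequon rule rather than something stated in the text. As the SDS-PAGE results show, a predicted sequon is not necessarily glycosylated in vivo.

```python
import re

# N-X-S/T sequon; the lookahead lets overlapping sequons (e.g. "NNTS") all be found.
# Excluding Pro at the X position is an added assumption (standard sequon rule).
SEQUON = re.compile(r"N(?=[^P][ST])")


def predict_n_glyc_sites(protein_seq: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    seq = protein_seq.upper()
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


# Hypothetical test sequence (not the CBM3 sequence):
# N3-G-S is a sequon, N8-P-T is excluded (Pro), N12-K-T is a sequon.
print(predict_n_glyc_sites("MANGSAQNPTLNKTW"))
```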

The three CBM3-EGFPs and the E. coli CBM3-EGFP had comparable maximum adsorptions. CBM3mt2-EGFP had the highest adsorption capacity to RAC, and CBM3mt-EGFP had the lowest (Table 3). Although CBM3wt-EGFP was glycosylated, its adsorption capacity was not significantly reduced, in contrast to previous reports on CBM2a (Boraston et al. 2001b). According to the three-dimensional structure of CBM3 reported by Tormo et al. (PDB, 1NBC), the three potential glycosylation sites are not located in the binding motif of this affinity tag (Tormo et al. 1996). Furthermore, only Asn68 was actually glycosylated; Asn14, which is one of the anchoring amino acid residues during adsorption, was not. This observation may explain why these potential glycosylation sites did not hinder the adsorption of CBM3.

Inteins have been widely used to remove affinity tags following protein purification; the majority of these applications, however, have been performed in E. coli. Some inteins undergo substantial in vivo self-cleavage. For example, the in vivo self-cleavage of the Ssp DnaB intein in E. coli can be clearly detected by SDS-PAGE, even when the fusion protein is expressed at low temperature (i.e., 18°C) or for a short period of time (several hours) (Hong et al. 2008a). In contrast, when proteins are expressed in yeast, several days are needed to produce sufficient amounts of the target protein. These observations suggest that not all inteins are suitable for purification in yeast. In vivo self-cleavage of Sce VMA is weak under most conditions and, therefore, this intein was selected for use in this study. Our results show that this fusion protein is sufficiently stable during expression and purification. There was no detectable cleavage product in the absence of the thiol inducer, even following adsorption to RAC and several days of incubation at 4°C (data not shown).

In this study, EGFP-intein-CBM3 expression was conducted at the same time as expression of the CBM3-EGFPs. To avoid the potential influence of glycosylation and of thiol compounds in the medium, EGFP-intein-CBM3 was expressed intracellularly. Given our findings that glycosylation does not significantly influence CBM3 adsorption to cellulose and that no obvious self-cleavage was detected after a 24-h incubation of EGFP-intein-CBM3 in BMMY medium (data not shown), extracellular expression of the EGFP-intein-fused protein is likely possible.

Fig. 6 SDS-PAGE analysis of the purification of inulinase using the CBM3 fusion strategy. The CBM3mt2-INU protein (CI) was barely detectable in the culture supernatant (lane 1), but CI could still be recovered after RAC adsorption (lanes 3 and 4). M, molecular weight marker; lane 1, medium supernatant of CI expression; lane 2, supernatant after RAC adsorption; lane 3, protein adsorbed on RAC and stripped by SDS; lane 4, purified CI after EG elution; lane 5, Endo Hf-digested CI

Table 4 Purification of CBM3-inulinase fusion protein by CBM3 tag

| Step | Volume (mL) | Total protein (mg) | Total activity (U/mL)^a | Specific activity (U/mg protein) | Yield (%) | Purification factor |
|---|---|---|---|---|---|---|
| Medium supernatant | 5 | 2.24±0.04 | 10.99±0.23 | 24.83 | 100 | 1 |
| Purified inulinase | 1.76 | 0.077±0.004 | 12.77±0.33 | 294.38 | 40.9±0.019 | 11.86 |

Data are given as mean±SD, n=3
^a One unit of inulinase activity is the amount of enzyme that yields 1 μmol of fructose per minute
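The yield and purification factor in Table 4 follow from standard purification-table arithmetic, which the sketch below reproduces from the tabulated means. The class and function names are hypothetical, and the activity column is treated as a concentration (U/mL) multiplied by volume, which is an interpretation of the table header. Recomputing from the rounded means gives a yield of about 40.9%, matching the table, and specific activities of roughly 24.5 and 292 U/mg, slightly different from the tabulated 24.83 and 294.38; the difference presumably reflects rounding or per-replicate averaging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One row of a purification table (mean values only)."""
    name: str
    volume_ml: float
    total_protein_mg: float
    activity_u_per_ml: float

    @property
    def total_units(self) -> float:
        # Total activity (U) = activity concentration (U/mL) x volume (mL).
        return self.activity_u_per_ml * self.volume_ml

    @property
    def specific_activity(self) -> float:
        # Specific activity (U/mg) = total activity / total protein.
        return self.total_units / self.total_protein_mg


def yield_percent(step: Step, start: Step) -> float:
    """Activity recovered in `step` as a percentage of the starting material."""
    return 100.0 * step.total_units / start.total_units


def purification_factor(step: Step, start: Step) -> float:
    """Fold increase in specific activity relative to the starting material."""
    return step.specific_activity / start.specific_activity


# Mean values taken from Table 4.
crude = Step("Medium supernatant", 5.0, 2.24, 10.99)
pure = Step("Purified inulinase", 1.76, 0.077, 12.77)

print(f"Yield: {yield_percent(pure, crude):.1f}%")
print(f"Purification factor: {purification_factor(pure, crude):.2f}-fold")
```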


Many impure proteins remained on RAC following the wash step (Fig. 5, lane 3). These proteins were irreversibly adsorbed to RAC and could only be removed by boiling in SDS solution. Because they could not be eluted under mild conditions, they did not affect purification of the target protein by either EG elution or self-cleavage. As shown in Figs. 2, 5, and 6, the proteins eluted by both methods were highly pure.

We have previously demonstrated that the target protein could be purified even at very low concentrations (50 μg/mL in 2 mg/mL impure protein). Notably, 60–80% of the target protein could still be adsorbed to RAC (Hong et al. 2008b). In this study, the CBM3mt2-INU in the culture supernatant (38 μg/mL in 0.45 mg/mL impure protein) was purified at a low concentration (Fig. 6, lane 1). Approximately 74% of CBM3mt2-INU was adsorbed to RAC (Fig. 6, lane 3). This result suggests that CBM3 can be used as a high-efficiency affinity tag in yeast.

In our previous report, CBM3 was used as an affinity tag to purify recombinant protein in E. coli (Hong et al. 2008b). The fusion protein adsorption, impure protein removal, and purified protein recovery were all achieved through centrifugation. This method is both simple and efficient. However, it requires a large wash buffer volume to remove impure proteins, and the dead volume in the RAC slurry reduces the purity and yield of the purified protein. In this study, a column was used for protein purification, and the protein solution and buffer were passed through the column without centrifugation. Centrifugation was performed in a 50-mL tube after EG was added to elute CBM3-EGFP, to reduce purification time and improve yield. Two to three elutions are recommended to further improve purification yield. In the tag-free EGFP purification, centrifugation was not necessary after cleavage was induced. Centrifugation can, however, reduce the elution time and improve the overall yield. Because too much centrifugation can block the filter at the bottom of the column and cause a very slow flow rate later in the procedure, centrifugation is not recommended during the loading and washing steps. For large volumes, the centrifugation method remains a viable choice.

While RAC was used in this study, other cellulose matrices can also be used to purify CBM fusion proteins. Avicel is a microcrystalline cellulose and has previously been used to adsorb CBM fusion protein; however, it has one seventeenth the capacity of RAC (Hong et al. 2007b, 2008b). Bacterial microcrystalline cellulose has a high adsorption capacity (Hong et al. 2007b); however, the preparation of this matrix is a time-intensive procedure. Filter papers can also be used. While these are easy to manipulate, they have one fourth the adsorption capacity of RAC (Hong et al. 2007b).
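The relative capacities above translate directly into the amount of matrix needed for a given protein load. The following sketch is a rough sizing aid under stated assumptions: it uses the maximum RAC adsorption capacity of 319 mg/g reported in the abstract for secreted wild-type CBM3-EGFP, applies the one-seventeenth and one-fourth ratios to that value, and assumes the matrix can be loaded to saturation. The function name and the `safety_factor` parameter are hypothetical.

```python
# Maximum adsorption capacity of RAC reported in the abstract (mg protein per g matrix).
RAC_CAPACITY_MG_PER_G = 319.0

# Capacity of each matrix relative to RAC, as stated in the text.
RELATIVE_CAPACITY = {
    "RAC": 1.0,
    "Avicel": 1.0 / 17.0,
    "filter paper": 1.0 / 4.0,
}


def matrix_mass_needed(protein_mg: float, matrix: str,
                       safety_factor: float = 1.0) -> float:
    """Grams of matrix needed to bind `protein_mg` at saturation.

    Assumes the RAC capacity and the relative ratios carry over to the
    fusion protein of interest; `safety_factor` > 1 adds headroom.
    """
    if protein_mg < 0:
        raise ValueError("protein_mg must be non-negative")
    capacity = RAC_CAPACITY_MG_PER_G * RELATIVE_CAPACITY[matrix]
    return safety_factor * protein_mg / capacity


for name in RELATIVE_CAPACITY:
    grams = matrix_mass_needed(10.0, name)
    print(f"{name}: {grams:.3f} g of matrix to bind 10 mg of fusion protein")
```

In practice, binding well below saturation is common, so a safety factor greater than 1 would normally be applied.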

In conclusion, CBM3 is suitable for use as an affinity tag for the purification of yeast-expressed recombinant proteins, and the Sce VMA intein is a good tool for removing the affinity tag from these proteins. By combining these two fusion-based recombinant protein purification methods, we have developed a low-cost, highly scalable method for large-scale protein purification in yeast.

Acknowledgments This work was supported by a grant-in-aid from the Anhui Provincial Natural Science Foundation (grant no. 090413075), the State Education Ministry and Specialized Research Fund for the Doctoral Program of Higher Education of China (grant no. 20093402120027), the National Basic Research Program of China (grant no. 2011CBA00801), and the Fundamental Research Funds for the Central Universities (grant no. WK2070000007). The authors declare that they have no conflicts of interest.

References

Ahn J, Choi E, Lee H, Hwang S, Kim C, Jang H, Haam S, Jung J (2004) Enhanced secretion of Bacillus stearothermophilus L1 lipase in Saccharomyces cerevisiae by translational fusion to cellulose-binding domain. Appl Microbiol Biotechnol 64:833–839

Arnau J, Lauritzen C, Petersen GE, Pedersen J (2006) Current strategies for the use of affinity tags and tag removal for the purification of recombinant proteins. Protein Expr Purif 48(1):1–13

Babu KS, Antony A, Muthukumaran T, Meenakshisundaram S (2008) Construction of intein-mediated hGMCSF expression vector and its purification in Pichia pastoris. Protein Expr Purif 57(2):201–205. doi:10.1016/j.pep.2007.10.004

Boraston AB, McLean BW, Guarna MM, Amandaron-Akow E, Kilburn DG (2001a) A family 2a carbohydrate-binding module suitable as an affinity tag for proteins produced in Pichia pastoris. Protein Expr Purif 21(3):417–423

Boraston AB, Warren RAJ, Kilburn DG (2001b) Glycosylation by Pichia pastoris decreases the affinity of a family 2a carbohydrate-binding module from Cellulomonas fimi: a functional and mutational analysis. Biochem J 358:423–430

Bradford MM (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72:248–254

Chong S, Xu MQ (1997) Protein splicing of the Saccharomyces cerevisiae VMA intein without the endonuclease motifs. J Biol Chem 272(25):15587–15590

Chong S, Shao Y, Paulus H, Benner J, Perler FB, Xu MQ (1996) Protein splicing involving the Saccharomyces cerevisiae VMA intein. The steps in the splicing pathway, side reactions leading to protein cleavage, and establishment of an in vitro splicing system. J Biol Chem 271(36):22159–22168

Elleuche S, Poggeler S (2010) Inteins, valuable genetic elements in molecular biology and biotechnology. Appl Microbiol Biotechnol 87(2):479–489. doi:10.1007/s00253-010-2628-x

Esposito D, Chatterjee DK (2006) Enhancement of soluble protein expression through the use of fusion tags. Curr Opin Biotechnol 17(4):353–358

Fong BA, Wu WY, Wood DW (2010) The potential role of self-cleaving purification tags in commercial-scale processes. Trends Biotechnol 28(5):272–279. doi:10.1016/j.tibtech.2010.02.003


Gong F, Sheng J, Chi ZM, Li J (2007) Inulinase production by a marine yeast Pichia guilliermondii and inulin hydrolysis by the crude inulinase. J Ind Microbiol Biotechnol 34(3):179–185. doi:10.1007/s10295-006-0184-2

Guerreiro CIPD, Fontes CMGA, Gama M, Domingues L (2008) Escherichia coli expression and purification of four antimicrobial peptides fused to a family 3 carbohydrate-binding module (CBM) from Clostridium thermocellum. Protein Expr Purif 59(1):161–168. doi:10.1016/j.pep.2008.01.018

Hartley JL (2006) Cloning technologies for protein expression and purification. Curr Opin Biotechnol 17(4):359–366

Hong J, Tamaki H, Kumagai H (2007a) Cloning and functional expression of thermostable beta-glucosidase gene from Thermoascus aurantiacus. Appl Microbiol Biotechnol 73(6):1331–1339

Hong J, Ye XH, Zhang YHP (2007b) Quantitative determination of cellulose accessibility to cellulase based on adsorption of a nonhydrolytic fusion protein containing CBM and GFP with its applications. Langmuir 23(25):12535–12540

Hong J, Wang YR, Ye XH, Zhang YHP (2008a) Simple protein purification through affinity adsorption on regenerated amorphous cellulose followed by intein self-cleavage. J Chromatogr A 1194(2):150–154

Hong J, Ye XH, Wang YR, Zhang YHP (2008b) Bioseparation of recombinant cellulose-binding module-proteins by affinity adsorption on an ultra-high-capacity cellulosic adsorbent. Anal Chim Acta 621(2):193–199

Kavoosi M, Meijer J, Kwan E, Creagh AL, Kilburn DG, Haynes CA (2004) Inexpensive one-step purification of polypeptides expressed in Escherichia coli as fusions with the family 9 carbohydrate-binding module of xylanase 10A from T. maritima. J Chromatogr B 807:87–94

Laemmli UK (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227:680–685

Levy I, Shoseyov O (2002) Cellulose-binding domains: biotechnological applications. Biotechnol Adv 20(3–4):191–213

Levy I, Shoseyov O (2004) Cross bridging proteins in nature and their utilization in bio- and nanotechnology. Curr Protein Pept Sc 5(1):33–49

Miller GL (1959) Use of dinitrosalicylic acid reagent for determination of reducing sugar. Anal Chem 31:426–428

Murashima K, Kosugi A, Doi RH (2003) Solubilization of cellulosomal cellulases by fusion with cellulose-binding domain of noncellulosomal cellulase EngD from Clostridium cellulovorans. Proteins 50(4):620–628

Przybycien TM, Pujar NS, Steele LM (2004) Alternative bioseparation operations: life beyond packed-bed chromatography. Curr Opin Biotechnol 15:469–478

Ramirez C, Fung J, Miller RC, Antony R, Warren J, Kilburn DG (1993) A bifunctional affinity linker to couple antibodies to cellulose. Nat Biotech 11(12):1570–1573

Shoseyov O, Shani Z, Levy I (2006) Carbohydrate binding modules: biochemical properties and novel applications. Microbiol Mol Biol R 70(2):283–295

Tomme P, Boraston AB, McLean B, Kormos JM, Creagh AL, Sturch K, Gilkes NR, Haynes CA, Warren RA, Kilburn DG (1998) Characterization and affinity applications of cellulose-binding domains. J Chromatogr B Biomed Sci Appl 715:283–296

Tormo J, Lamed R, Chirino AJ, Morag E, Bayer EA, Shoham Y, Steitz TA (1996) Crystal structure of a bacterial family-III cellulose-binding domain: a general mechanism for attachment to cellulose. EMBO J 15(21):5739–5751

Zhao Z, Lu W, Dun B, Jin D, Ping S, Zhang W, Chen M, Xu MQ, Lin M (2008) Purification of green fluorescent protein using a two-intein system. Appl Microbiol Biotechnol 77(5):1175–1180. doi:10.1007/s00253-007-1233-0

Zhou X, Cai S, Hong A, You Q, Yu P, Sheng N, Srivannavit O, Muranjan S, Rouillard JM, Xia Y, Zhang X, Xiang Q, Ganesh R, Zhu Q, Matejko A, Gulari E, Gao X (2004) Microfluidic PicoArray synthesis of oligodeoxynucleotides and simultaneous assembling of multiple DNA sequences. Nucleic Acids Res 32(18):5409–5417. doi:10.1093/nar/gkh879
