Identification and biochemical characterization of the UDP ...

72
WAGENINGEN UNIVERSITY LABORATORY OF ENTOMOLOGY Identification and biochemical characterization of the UDP-glycosyltransferases UGT73C10 and UGT73C11 in biosynthesis of saponins in Barbarea vulgaris ssp arcuata. No......................................................... 010.21 Name .......................................... Sylvia Drok Study programme.................................... MBI Period .................. november 2009- june 2010 Thesis/Internship ENT.......................... 80439 1st Examiner ............................. Marcel Dicke 2nd Examiner ............................ Peter de Jong

Transcript of Identification and biochemical characterization of the UDP ...

Page 1: Identification and biochemical characterization of the UDP ...

WAGENINGEN UNIVERSITY

LABORATORY OF ENTOMOLOGY

Identification and biochemical characterization of the UDP-glycosyltransferases UGT73C10 and UGT73C11 in biosynthesis of saponins in Barbarea vulgaris ssp arcuata. No.........................................................010.21

Name .......................................... Sylvia Drok

Study programme.................................... MBI

Period.................. november 2009- june 2010

Thesis/Internship ENT..........................80439

1st Examiner .............................Marcel Dicke

2nd Examiner............................ Peter de Jong

Page 2: Identification and biochemical characterization of the UDP ...
Page 3: Identification and biochemical characterization of the UDP ...

Preface

This master thesis project was performed at the faculty of LIFE, department of Plant Biology

and Biotechnology at the Univerisity of Copenhagen. It is a part of my Msc Biology and

counts for 39 ECTS, in accordance with circa a half year study.

Above all, I would like to thank my supervisors Søren Bak, Jörg M. Augustin and Peter de

Jong for their guidance and help during my thesis. I would like to thank Jörg for being so

patient and helpful with me and for the interesting discussions. I’m proud to have gained

even a part of his knowledge, which will be valuable for me in any field within Biology. I’m

also very grateful to Søren Bak for guiding me through the writing process and believing in

me even when I did not. Moreover, I would like to thank Peter de Jong for his support on a

longer distance, who always replied with some inspiring and friendly words.

For the experimental part of my thesis I would like to thank Carl Erik Olsen for the LCMS and

NMR analyses and Esben Hansen form Evolva his help with the large scale production of

monoglucosides.

At last, I would like to thank my colleagues at the department for being so friendly and social

from the very start. Many thanks my office mates, who have been a great help and

inspiration during this thesis. It was a pleasure to be part of this for half a year and I hope we

will meet again soon.

Sylvia Drok

Copenhagen, June 2010

Page 4: Identification and biochemical characterization of the UDP ...
Page 5: Identification and biochemical characterization of the UDP ...

Table of contents

Preface

Summary

1. Introduction 1

1.1 Secondary metabolites in insect resistance. 1

1.2 The interaction between Barbarea vulgaris ssp arcuata and Phyllotreta nemorum 2

1.3 Saponins-natural soap compounds 4

1.4 Saponin modes of resistance 5

1.5 The biosynthetic pathway of hederagenin and oleanolic acid 6

1.6 Glycosyltransferases 10

1.7 Thesis project 13

2. Materials and methods 15

2.1 Cloning the sequences 15

2.2 Heterologous expression of the UGTs 16

2.3 Characterization 17

3. Results 21

3.1 Phylogenetic analysis of the UGTcDNA and amino acid sequences 21

3.2 UGT sequence analysis 24

3.3 Cloning and expression of the UGTs 27

3.4 Characterization of the UGTs 29

3.5 UGT substrate specificity 33

3.6 Kinetics 37

4. Conclusions and discussion 43

5. Perspectives 53

6. References 57

7. Appendix I 60

Page 6: Identification and biochemical characterization of the UDP ...
Page 7: Identification and biochemical characterization of the UDP ...

Abstract

Plants are sessile organisms and in order to defend themselves against pathogens and pest

they have developed a wide variety of chemical and mechanical ways of defense. The

chemical defense of Barbarea vulgaris ssp arcuata against the flee beetle Phyllotreta

nemorum was correlated with the presence of four saponins, among which hederagenin and

oleanolic acid cellobioside (Kuzina et al., 2009). Both saponins were present in the resistant

G-type Barbarea but absent in the susceptible P-type plants. The proposed biosynthetic

pathway of the two saponins shares its first steps with the phytosterol pathway, but

branches with an alternative set of cyclization steps of 2,3 oxisqualene into β-amyrin, from

which both oleanolic acid and hederagenin are derived. Glycosylation of these aglycones is

performed by family 1 UDP- glycosyltransferases (UGTs). A UGT was identified that could

transfer the first glucose moiety to hederagenin and oleanolic acid aglyones, as in the first

glucylation step in the biosynthesis of Barbarea vulgaris var. variegata. A search for

orthologs in B.v.arcuata revealed five close homogous sequences, two derived from the G-

type and three from the P-type plant. In this master thesis project these five sequences were

cloned, expressed and characterized. UGT73C10, UGT73C11, UGT73C12 and UGT73C13

utilized oleanolic acid and hederagenin as acceptor substrates, and NMR analysis revealed

that a hederagenin monoglucoside was glycosylated at the C3 position. UGT73C9 was only

active on 2,4,6-trichlorophenol (TCP). UGT73C10 and UGT73C11 were highly specific towards

sapogenins, in vitro, suggesting that they could be involved in the biosynthesis pathway of

oleanolic acid and hederagenin cellobioside. The difference in the production of saponins in

the G and P plant is most likely not due to biochemical differences of UGTs involved in the

first glycosylation step.

Page 8: Identification and biochemical characterization of the UDP ...
Page 9: Identification and biochemical characterization of the UDP ...

Introduction

1

1. Introduction

Plants are autotrophic organisms, meaning they are able to produce their own organic

compounds from carbon dioxide and water, using light as an energy source. Therefore they

are at the bottom of the food chain and serve as a putative food source for a broad range of

herbivores, pests and pathogens. Because plants are sessile organisms, they cannot escape

their environment and have to cope with the accompanying biotic and abiotic stress. Plants

have developed several ways to counteract these threats, e.g. by trying to avoid their

enemies or by defending themselves using mechanical and chemical defenses. The chemical

defense compounds are typically secondary metabolites: compounds which are not directly

involved in growth, development or reproduction of the plant but have important roles in

the interaction between the plant and its environment (Albers et al., 1996). Secondary

metabolites are usually derived from the basic metabolic pathway; and are considered to

originate from random mutations in that pathway which had accidental beneficial effects.

1.1 Secondary metabolites in plant defense

There are three major classes within plant secondary metabolites: terpenes (or terpenoids),

phenolic compounds and nitrogen containing compounds. Terpenoids are derived from

isoprenoids building blocks which contain 5 carbons. Nomenclature of the terpenoids is

based on the amount of 5-carbon isoprenoids, namely monoterpenes (2 isoprenoids, 10

carbons), sesquiterpenes (15 carbons), diterpenes (20 carbons), triterpenes (30 carbons),

etc. Terpenes are often toxins and feeding deterrents that function as a defense mechanism

widespread in the plant kingdom. Monoterpenes accumulate in the resin in conifers. Mono

and sesquiterpenes are the essential oils present in many plant species and well-known to

have insect repellent properties. Within the triterpenes there is a compound, azadirachtin,

which is perhaps the most powerful insect deterrent known as it works at a dose of 50 parts

per billion in some cases (Aerts, 1997).

Phenolic compounds are derived from aromatic amino acids, often phenylalanine. They are a

big class within the secondary metabolites with a wide variety of biological functions.

Anthocyanins are thought to play a role in the attraction of pollinators while other phenolic

compounds may play a role in UV protection of the plant. Some compounds are suggested to

Page 10: Identification and biochemical characterization of the UDP ...

Introduction

2

act as allelopathic compounds thus influencing the neighboring plants e.g. when released in

the soil. Last, a group of phenolics, isoflavonoids and tannins are known to have

antimicrobial and feeding deterrent activities on herbivores.

The third major class of secondary metabolite is the nitrogen containing compounds, which

function as chemical defense as well. Nitrogen containing compounds like alkaloids are often

found to be toxic themselves. Other nitrogen containing compounds e.g. cyanogenic

glucosides and glucosinolates are glycosides of otherwise toxic compounds. Glucosinolates

are an important group within the chemical defense of crucifers. They are wide spread over

the brassicales, and are known to act as antibiotics, fungal growth inhibitor and toxins to

nematodes as well as a wide range of generalist herbivores (Renwick, 2002). Therefore

glucosinolates are considered to be the first line of defense in crucifers. However, crucifer

specialist insects have adapted to these glucosinolates and are able to detoxify or sequester

them, or even use them as a trigger for oviposition on the host plant (reviewed by Renwick,

2002). As a counter adaptation some crucifers have developed a second line of defense

which includes compounds acting as repellents or are toxic to their specific attacker. Such

second line of defense can be found in the form of saponins in Barbarea, a genus of the

crucifer family. Saponins are a wide ranging class of compounds but within the crucifer

family they only occur in the Barbarea genus. In Barbarea vulgaris two saponins are found to

be toxic to the diamond back moth which is a common pest on rape seed (Agerbirk et al.,

2003a). Recently its toxicity to the flee beetle Phyllotreta nemorum has also been shown

(Kuzina et al., 2009).

1.2 The interaction between Barbarea vulgaris ssp arcuata and Phyllotreta nemorum

The interaction between both winter cress and Barbarea vulgaris spp arcuata L. and the flee

beetle Phyllotretata nemorum is a unique model system to study plant insect interactions.

The plant, B. vulgaris spp arcuata (from now on referred to as Barbarea) is polymorphic with

respect to insect resistance. It has two genotypes: a resistant G-type (with glabrous leaves)

and a pubescent P-type which is susceptible to the flee beetles. In experiments in vitro no

larvae could survive on G-type whereas 91% of the larvae survived on the P-type (Kuzina et

al., 2009). It was observed that the flee beetle larvae often started feeding on the G- type

but stopped feeding after a few bites and starved to death. Whether their death is caused by

Page 11: Identification and biochemical characterization of the UDP ...

Introduction

3

the lack of food intake or the toxicity of the G-type leaves themselves remains unclear.

However, in some populations in Denmark flee beetles were collected which were able to

survive on the G type of Barbarea. The R gene that confers this virulence (resistance towards

the plant) is dominant as both RR and Rr genotypes confer resistance to G-type Barbarea

while the rr genotype does not (Nielsen, 1997). Thus, Barbarea plants are polymorphic in

insect resistance and the beetles are polymorphic with respect to resistance toward

Barbarea, making it a unique model system for the understanding of plant-insect

interactions and speciation.

Barbarea vulgaris grows naturally Europe and is naturalized in North-America. Several

ecotypes exist of which two, ssp vulgaris and ssp arcuata, occur in Denmark. Ssp acruata is

more common than ssp vulgaris. The P and G genotypes within B. vulgaris ssp arcauta are

morphologically, cytogically and genetically different and deviate by glucosinolate profile,

leaf pubescence and flee beetle resistance (Kuzina et al., 2009; Agerbirk et al., 2003b). In this

thesis also a third subspecies is mentioned, B. vulgaris var. variegata, which most likely is a

cultivated form of Babarea vulgaris and is commercially available, used as decoration in

gardens and eventually in salads. It has partially white leaves, which are glabrous and confer

resistance towards the Diamont Back moth, Plutella xylostella (Shinoda et al., 2002).

Using untargeted metabolite profiling and clustering analyses, Kuzina et al. (2009) identified

four saponins to be correlated with resistance against flee beetles. Two of those saponins

were novel compounds. The other two were oleanolic acid cellobioside and hederagenin

cellobioside which both are known, in Barbarea, to confer resistance to the diamond back

moth Plutella xylostella L. (Agerbirk et al., 2003a). Both saponins occur in the resistant G-

type but are absent, or only present in very low concentration in the P-type plants.

Hederagenin cellobioside is more abundant than the oleanolic acid cellobioside (Kuzina et

al., 2009) and shows a strong negative correlation with flee beetle survival (Nielsen et al.,

2010). This correlation is weaker and sometimes not significant at all for oleanolic

cellobioside (Nielsen et al., 2010). Therefore hederagenin cellobioside is considered to be

the most important saponin in resistance towards flee beetles. The glucosinolate content of

the P and G type plant show no correlation with resistance (Agerbirk et al., 2003b).

Page 12: Identification and biochemical characterization of the UDP ...

Introduction

4

1.3 Saponins – natural soap compounds

Saponins are the glycosides of an isoprenoidal originated aglycone (also ‘sapogenin’).

Saponins can have a steroidal or triterpenoid backbone, depending on the folding of the 30C

backbone in the biosynthesis. Saponins occur all over the plant kingdom but can also be

found in ancient animals like seacucumbers (holodurenthia) and starfish (asteroidea). The

name saponin is derived from their soap-like abilities (sapo = Latin for soap). Various

saponins are commercially used for a variety of purposes including drugs, medicines,

precursors for hormone synthesis, adjuvants, sweeteners, taste modifiers and cosmetics

(Osbourn, 1996). Furtermore anti-inflammetory, anti-molluscial, anti-microbial and anti-

fungal properties have been described for these compounds (Sparg et al. 1987).

The sugar moieties of triterpenoid aglycones commonly contain glucose, arabinose, xylose,

rhamnose or glycoronic acid (Vincken et al., 2007) and these sugar moieties can be attached

to O (OH or COOH), S and N atoms. The amount, origin and pattern (e.g. branching) of the

sugars leads to enormous variation. An aglycone with one or more sugars attached to one

position (usually the C3 position of triterpenoid aglycones) is referred to as a

monodesmoside. Bidesmosides saponins have their sugars attached to two different

positions on the aglycone, typically at the C26 (steroidal) or C28 (triterpenoid) position.

Didesmosidic triterpenoid saponins often lack the bioactivity of the corresponding

monodesmosides, but can be converted back into their monodesmoside by removal of the

C26 or C28 sugar chain (Osbourn, 1996). Glycosylation (also for monoglycosides) is

important for storage but it also allows the plant to accumulate potentially toxic compounds

in high concentration so when needed they can be released in such concentrations (Jones

and Vogt, 2001).

Various other roles of glycosylation have been described, e.g. as signaling molecules for

plant-microbe interactions, plant-to-plant signaling and transportation (Jones and Vogt,

2001). Glycosylation often results in a loss of bioactivity and can be used for storage and

detoxification of a compound. In contrast, in the interaction of Barbarea vulgaris and

Phyllotreta nemorum the glycosylation of the aglycone causes the bioactivity of the toxins.

The diglucosides of hederagenin and oleanolic acid have shown to cause plant resistance to

Page 13: Identification and biochemical characterization of the UDP ...

Introduction

5

the flee beetle, while the aglycone itself does not show any effect on flee-beetle survival

(Nielsen et al., 2010).

1.4 Saponin modes of resistance

The mechanism behind saponin resistance has been mostly studied for fungi. This resistance

is believed to be due to a complex formation between saponins and sterols in the cell

membrane of the fungus which leads to a loss of membrane integrity. The precise

mechanism of complex formation and membrane leakage is not fully understood. Electron

microscope analyses suggest a formation of membrane pores, while other studies suggest

that the complexes interfere with membrane integrity by extracting the sterols from the cell

membrane (figure 1). (Morrissey and Osbourn, 1999)

The sugar chain attached to the C-3 plays a

crucial role in the complex formation as it

mediates the aggregation of the sterols in

the membrane. The sugar is therefore

important and when removed it leads to loss

of anti-fungal activity (Armah et al., 1999).

The complex formation with sterols is a

rather general mechanism, and therefore

potentially toxic for every organism with

sterols in their membranes. To protect itself

from its toxic compounds the plant

sequesters the saponins in the vacuole; the

vacuole membrane may therefore have a

low sterol content or different types of

sterols that are less suitable for complex

formation (Osbourn, 2003).

Figure 1: two models of the interaction of saponins

with membrane sterols, using pore formation (A) or

sterols extraction (B), leading to cell leakage

(Morrissey and Osbourn, 1999).

A B

Page 14: Identification and biochemical characterization of the UDP ...

Introduction

6

Comparable strategies are described in which a pathogen can counter this mechanism of

saponin resistance. Resistance of oomycetes towards saponins has been associated with a

lack of sterols in the oomycete cell membranes. Also experiments with sterol-deficient

mutants of Neusospora crassia and Fusarium solanum show an increase of resistance

towards the steroidal alkaloid α-tomatine (Prakash et al., 1999; Defago and Kern, 1983). A

second way of resistance to saponins may be by degradation of saponins from the host

plant, which is found in a number of fungi. Degradation of saponins is usually performed by

hydrolysis of sugar molecules, often the ones at the C-3 position of the saponin (Osbourn,

1996). Although not much is known about the saponin resistance to insects, the resistance in

flee beetle towards the saponins of B. vulgaris could be based on the same principle, as the

R genes code for a beta-glucosidase (Ikink, 2008).

1.5 The biosynthetic pathway of hederagenin cellobioside and oleanolic cellobioside.

Triterpenoid saponins are derived from metabolites produced during the anabolism of

phytosterols, therefore they share the first steps of the phytosterol pathway. This includes

the linkage (head-tail) of two famesildisphosphate (FPP) molecules (15 carbon) into squalene

by a squalene synthase (figure 2) and the subsequent oxidization of squalene into 2,3,

oxisqualene (Yendo, 2010).

Figure 2: A) synthesis of squalene from condenstaion of FFP B) cyclisation of 2,3, xisqualene

into β-amyrin by a β-amyrin synthase. (Yendo, 2010)

Page 15: Identification and biochemical characterization of the UDP ...

Introduction

7

Further cyclization steps of 2,3-Oxisqualene by oxysqualene synthases (OCSs) form a branch

point between the biosynthetic pathway of sterols and triterpenoid saponins. OSCs utilize

2,3-oxisqualene as substrate and depending on the type of OSC, different compounds will be

released. In the sterol pathway 2,3, oxisqualene is converted into lanosterol (in fungi or

animals) or cycloartenol (in plants) (Vincken et al., 2007; Haralampidis, 2002; Phillips et al.,

2006; (Yendo, 2010). In the triterpenoid saponin pathway an OSC converts 2,3, oxisqualene

into β-amyrin through diverse intermediates (figure 2) (Yendo, 2010). An OSC can be specific

for one end product or multifunctional and release several end products. It is not known if

the sterol and triterpenoid pathway are connected through one multifunctional OSC or if

each pathway is facilitated by one OSC. From β-amyrin all kinds of aglycone backbones can

be derived through oxidation and substation. Examples are shown in figure 3 (Yendo, 2010).

The enzymes involved in cyclization of of 2,3 oxisqualene are largely uncharacterized. The

first OCS, that could cyclize and release cycloartenol, was found in Arabidopsis by screening

of extracts in a yeast lanosterol deficient mutant (Corey et al., 1993). Other OSCs have been

identified based on sequence homology. β-amyrin producing OSCs have been characterized

Figure 3: β-amyrin is a precursor for many sapogenin bakcbones. Glycosylation of hederagenin by UGT73K1 in

Medicago trancatula is shown in G. (Yendo, 2010 )

Page 16: Identification and biochemical characterization of the UDP ...

Introduction

8

for trees (Betula platyphylla), monocots (avena strigosa), Euphorbia tirucalli and several

legumes like Panax ginseng, Medicago trancatula and Pisum sativa (Phillips et al., 2006).

Four multifunctional OSCs have been found in Arabidopsis thaliana and could, among other

compounds, also release β-amyrin (Phillips et al., 2006). No β-amyrin releasing OSCs have

been found in Barbarea yet.

β-amyrin is a precursor for a wide variety of saponin aglycones (figure 3). Structural changes

that confer these different backbones are catalyzed by a class of enzymes named

chytochrome p450 mono-oxidases. The p450s involved in the pathway of oleanolic acid and

hederagenin, have not been chacarterized yet. Intermediates in the proposed pathway of

hederagenin and oleanolic acid in Barbarea vulgaris (figure 4), are produced by one or

possibly more p450s. The last step, the oxidation of the C23 position, will most likely be

performed by a different p450, as this modification is done on a different Carbon atom than

the first set of modifications.

In the final step of the biosynthetic pathway hederagenin and oleanolic acid are

glycosylated. There is accumulating evidence that the glycosylation occurs by one sugar

moiety at a time, but it cannot be excluded that a chain of sugars can be transferred in one

step (Haralampidis, 2002).

Analysis of metabolite extracts of the G-type plant did not reveal the presence of β-amyrin

or any of the intermediates. This may be explained by a high turnover rate of the

intermediates, which causes that the reaction intermediates are not freely available in the

cytosol. P450s are membrane bound enzymes, while UGTs are generally cytosolic. In many

metabolic pathways, e.g., alkaloid, phenylpropanoids, cyanogenic glycoside and the first part

of the isoprenoid pathway enzymes have been found to form a multi-enzyme complex

(metabolon) and form a channel for the pathway. Channeling of a pathway results in

concentration of the substrates, decreases the transit time for intermediates because the

active sites of the enzymes are close together, and avoid metabolite cross-talk and avoid

toxic intermediates to diffuse and cause problems elsewhere (Jorgensen et al., 2005; (Bak,

2006).

Page 17: Identification and biochemical characterization of the UDP ...

Introduction

9

C Y P 4 5 0

H O H O

O

H O

C O O H

H O

C H 2 O H

C H O

H O o l e a n o l i c a l d e h y d e

. b e t a . - a m y r i n c y c l o a r t e n o l

o l e a n o l i c a c i d

e r y t h r o d i o l

C S ( c y c l o a r t e n o l c y c l a s e ) β a s ( β - a m y r i n s y n t h a s e )

2 , 3 - o x i d o s q u a l e n e

p h y t o s t e r o l p a t h w a y

H O C H 2 O H

C O O H

h e d e r a g e n i n

C O O H

O O O H

H O H O C H 2 O H

C O O H

O O O H H O H O

C H 2 O H C H 2 O H

C O O H

H O O H

O O O O

H O H O

O H C H 2 O H C H 2 O H

C O O H

O O H

H O O

O O H

H O H O O C H 2 O H

C H 2 O H C H 2 O H

U G T

o l e a n o l i c a c i d c e l l o b i o s i d e

h e d e r a g e n i n c e l l o b i o s i d e

3 - O - . b e t a . - D - g l u c o p y r a n o s y l - o l e a n o l i c a c i d

3 - O - . b e t a . - D - g l u c o p y r a n o s y l - h e d e r a g e n i n

0 4 / 0 8 - 0 9

C Y P 4 5 0

C Y P 4 5 0

C Y P 4 5 0 U G T

U G T

U G T

Figure 4: Proposed biosynthetic pathway of hederagenin and oleanolic acid cellobioside in Barbarea vulgaris

arcuata.

Page 18: Identification and biochemical characterization of the UDP ...

Introduction

10

1.6 Glycosyltransferases

Enzymes that transfer a donor sugar molecule to an acceptor substrate are

glycosyltransferases (GT). The hierarchical classification is divided in superfamilies, families

and subfamilies. At present time 91 superfamilies have been identified (Campbell et al.,

1997). The plant GTs belongs to superfamily 1, a class of enzymes, which use UDP-sugar as

sugar donor. The classification into families and subfamilies are based on sequence

homology (Mackenzie et al., 1997; Campbell et al., 1997). However, a high sequence identity

does not automatically mean that the enzymes share same substrate specificity. Thus UGTs

within one family can accept all kinds of different substrates.

Despite a low sequence identity, the secondary and tertiary structure is conserved in most of

the UGTs, and some intrinsic structural features are conserved within a superfamily

(Coutinho et al., 2003). The tertiary structure can have a GT-A and GT-B fold (figure 5).

Within both folds an enzyme can be inverting (clan I and II) or retaining (clan II and IV). The

Figure 5: The hierarchical classification of glycosyltransferases families and subfamilies including their intrinsic

structural features, illustrated for those glycosyltransferases with a reported 3-D structure. (Coutinho et al., 2003)

Page 19: Identification and biochemical characterization of the UDP ...

Introduction

11

type of bold and the feature of being either an inverting or retaining enzyme are conserved

within the GT superfamilies.

The term inverting and retaining refers to the putative change of conformation of the sugar

moiety during glycosylation. A sugar moiety can be either in α or β conformation, depending

on the stereochemical properties of the OH group on the C1 position of the sugar. If the OH

group points in the opposite direction of the methyl group on the C6 position than the sugar

is in α conformation, and if it points in the same direction (upwards) the sugar is in a β

conformation. In an UDP-sugar, the sugar moiety is attached to the uridine-diphosphate in

the α-conformation. When attaching the sugar to the aglycone backbone, the sugar

conformation can either be retained (retaining) or inverted (inverting) into a β conformation

(figure 6). The first sugar moiety of hederagenin cellobioside and oleanolic acid cellobioside

has a β conformation so the UGT responsible for this step would be an inverting enzyme.

The GT-B fold separates two subunits placed at the C and N terminal end, respectively. The C

domain forms mainly interactions with the donor substrate, whereas the N domain mainly

interacts with the acceptor substrate (Wang, 2009; (Hansen et al., 2009). The sequence

identity within Arabidopsis UGTs is rather low, ca 10 percent (Vogt and Jones, 2000).

However, alignment of these sequences revealed a 44 amino acids long motif containing

very conserved amino avid sites. This motif was named the Plant Secondary Product

Glycosyltranserase (PSPG) motif (figure 7) (Vogt and Jones, 2000; Hughes and Hughes, 1994).

Figure 6: UDP-glucosyltransferes can be inverting or retaining. The OH group at the C1 position is indicated in

tred.

NH

O

ON

O

OHOH

HH

HH

OP

O-

O

OPO

O -

O

O

H

HO

OH

H

H

OHHH

OH

O

O

H

HO

OH

H

H

OHHH

OH

R

OO

H

HO

OH

H

H

OHH

H

OH

R

Inverting

Retaining

α-glucoside

β-glucoside

UDP-glucose

Page 20: Identification and biochemical characterization of the UDP ...

Introduction

12

Crystal structures indicate that the sugar is located in a tunnel in the C domain of the UGT

and interacts with amino acids in the PSPG motif (Hughes and Hughes 1999). Directed

mutagenesis of amino acids show that one amino acid substitution can change the donor

specificity e.g. from UDP-glucuronic acid to UDP-glucose (Noguchi et al., 2009).

The mechanism behind the acceptor site is even more complex and less understood. Most

Interaction between en enzyme and acceptor substrate occur in the N-domain of the UGT,

but no conserved motif has been identified. A lot of progress has been made to understand

the mechanism for acceptor binding in UGTs using crystal structures, homology based

modeling in combination with site directed mutagenesis and even swopping of N domains

among different UGTs (Hansen et al., 2009). The state of knowledge about UGT modeling is

recently reviewed in Osmani et al., 2009. Mutagenesis studies found that a single amino

substitution can lead to a change of substrate specificity (Wang, 2009). Despite promising

developments in this research field, the substrate specificity is, at present time, not

predictable from the UGT amino acid sequence.

In the past decade a lot of progress has been made on the identification and biochemical

characterization of UGTs. Lots of UGTs are found to glycosylate a wide variety of acceptor

substrates, mostly to form monoglucosides or bidesmosedic diglycosides. Most of the work

on characterization of triterpenoid saponins has been done on UGTs in legumes such as

soybean (Glycine max), alfalfa (Medicago sativa), ginseng (Panax ginseng) and the model

plant Medicago trancatula. These studies revealed several UGTs that transfer the sugar to

the aglycone, yielding in a monoglucoside. Not much is known about glycosyltransferases

that can catalyze the second step of adding a second sugar moiety to a monoglucoside to

yied a monodesmosedic diglucoside. In soybean UGTs have been identified of which one can

catalyse transfer of a UDP-galactose onto the sugar moiety of the soyasaponin (UGT73P2)

and another UGT could transfer a UDP-rhamnose moiety to the latter soyasaponin

diglycoside (UGT91H4) (Shibuya, 2010). They are the first examples of GTs which transfer the

Figure 7: the PSPG motif. A red color indicates highly conserved (identity >80%) amino acids. Blue is

moderately conserved (identity >50%) and black not conserved (>50%). (Vogt and Jones, 2000)

Page 21: Identification and biochemical characterization of the UDP ...

Introduction

13

second and third sugar to form monodesmosedic tri-glucosides of a triterpene saponin.

However, these UGTs utilize UDP-rhamnose and UDP-galactose and no UGT is characterized

that can transfer a glucose moiety to a glucose-monoglucoside as in the second step of the

biosynthesis pathway of saponins in Barbarea vulgaris.

Dr Tetsuro Shinoda, a Japanese collaborator, identified a UDP-glycosyltransferase (UGT)

from B. vulgaris variegata that was able to transfer the first glucose molecule to the

hederagenin and oleanolic acid aglycone (Shinoda, unpublished). Orthologues of that gene in

Barbarea vulgaris arcuata are the focus of this master thesis project.

1.7 This master thesis project

Jörg M. Augustin, a PhD student who is working on the biosynthetic pathway in B. v. arcuata

and was the daily supervisor in this thesis, searched for close homologs of Shinoda’s UGT in

the Danish B. v. arcuata. Five sequences were isolated of which two were from the G plant

and three were from the P plant (table 1). The five sequences were named by the UGT

Nomenclature Committee. The gene from B.v.variegata has not been named yet and is

referred to as UGT73Cshi. as it is a gene from the UGT73C subfamily isolated by Shinoda.

According to their nucleotide and amino acid sequence identity UGT73C9, UGT73C10 and

UGT73C11 are referred to as orthologs and UGT73C12 and UGT73C13 are referred to as

paralogs of UGT73Cshi. To make the thesis better understandable, the genes derived from

the P type will be marked in blue and the genes from the G type in red.

The aim of this master thesis is to clone, express and characterize these five genes. The main

purpose of the biochemical characterization is to elucidate the biosynthetic pathway of

saponins in Barbarea vulgaris. In addition the possible role of these UGTs in plant resistance

is examined. Small amino acid differences can have large effects on the enzyme function

(Wang, 2009; (Hansen et al., 2009). Such differences in functions between the UGTs might

P-type (non-resistant) G-type (resistant)

Orthologs UGT73C9, UGT73C10 UGT73C11

Paralogs UGT73C12 UGT73C13

Table 1: Orthologs and paralogs of B.v.variegata UGT73shi in B.v. arcuata.

Page 22: Identification and biochemical characterization of the UDP ...

Introduction

14

therefore explain the different saponin content and resistance found in G and P type of B.v.

arcuata.

Thus, the research questions for this thesis are as follows: i) Are the five UGTs capable of

transferring the first sugar moiety to the oleanolic acid and hederagenin aglycones? ii) How

specific are the UGTs for these aglycones? iii) Is there a difference in biochemical properties

between the enzymes originating from P and G plants? iv) Is there a difference in

biochemical properties between the orthologs and paralogs?

The starting point of this thesis will be the cDNAs encoding the five respective UGTs

obtained by Jorg M. Augustin. These cDNAs will be cloned into E.coli, expressed and

characterized to answer the above-mentioned questions. The characterization of the UGTs

will be based on the activity assays. All enzymes will be tested for glycosylation of oleanolic

acid and hederagenin. Initial activity assays will be analysed by Thin Layer Chromatography

(TLC), a fast and cheap way to determine glycosylation activity of the UGTs. TLC will also be

used to optimize the activity assay protocols when needed. Liquid Chromatography Mass

Spectrometry (LCMS) will be used to determine the mass of the oleanolic acid and

hederagenin glycosides. Nuclear Magnetic Resonance (NMR) spectroscopy can be used to

determine the position and conformation (α or β) of the sugar moiety.

The specificity of the UGTs towards different acceptor substrates is studied. The substrates

include the sapogenins, flavonols and sterols. The results will be analyzed by TLC and/or

kinetics. Kinetic parameters offer a good way to compare the ‘specificity’ of a UGT. For

enzymes following Michaelis–Menten kinetics the most important parameters are the Vmax,

the maximum velocity and the KM, a constant that represents the affinity of an enzyme

towards a substrate. A low KM indicates a high affinity for the substrate, as it reaches its own

top speed already at low substrate concentrations. The amount of substrate molecules per

enzyme molecule per time is represented by the Kcat.

Page 23: Identification and biochemical characterization of the UDP ...

Materials and Methods

15

2. Materials and methods

2.1 Cloning the five sequences

The five UGTs sequences, provided by Jörg M. augustin, served as a template for Polymerase

Chain Reaction (PCR) amplification. The primers were designed by Jörg and were extended

at the 5’end, containing a BamHI or NheI restriction site for the reverse and forward primers,

respectively, which is shown in blue and red in table 2. Two forward and two reverse primes

were designed to cover the four sequences.

For the PCR reaction, the primers were in 0.5μM concentration, mixed together with 2U/μl

Phusion® High-Fidelity DNA polymerase, 0.4mM dNTP, 1μl template in 1x Phusion® HC

buffer. To find the optimal annealing temperature of the primers, a gradient PCR was

performed with an initial denaturation step of 30 sec at 98˚C, 26 cycles of 10 sec at 98˚C

(denaturation), a 56˚C gradient for 20 sec (annealing); 90 sec at 72˚C for (extention);

followed by final 10min 72˚C extension. In order to obtain sufficient amounts of the

fragments, the PCR was subsequently performed in 50μl reactions with the same conditions

but with 56˚C annealing temperature. The PCR amplicons were separated according to size

by gel electrophoreses (TAE, 1.2 % agarose). For estimation of the DNA size, a 1 Kb plus DNA

Ladder (invitrogen™) was loaded flanking the DNA samples.

Ligation

10ml of an E.coli culture containing the pET-28c vector was grown overnight (37˚C) and the

plasmids were isolatedusing Qiagen MiniPrep Spin kit. The vectors and fragments were

restricted with BamHI and NheI restriction enzymes and subsequently cleaned up using

Nucleospin Extract II. A molar ratio 3:1 between insert:vector in a total of 100ng in a 10μl

reaction was used. The corresponding formula (ngvector *kbinsert)/kbvector *3 = nginsert, was used

to estimate the optimal ligation conditions. The length of the vector and insert were resp

5370bp and 1490bp, therefore 45ng insert and 55ng vector were used for the ligation. The

concentration of the vector and fragment solutions was determined with Nanodrop. The

ligation was performed in 10μl with 3 Weiss units pGEM® DNA T4 ligase overnight at 4˚C.

Transformation

10-20ng plasmids per 50μl competent E. coli XJb autolysis™ culture were used for

transformation. Transformations were performed either through electroporation or a 50

seconds heat shock on 42˚C. After a heat shock, the were incubated with 200μl SOC medium

for 1h 37˚C to recover and subsequently plated on LB+50μM kanamycin overnight 37˚C.

Insert containing colonies were selected by colony PCR with T7 primers flanking the multiple

cloning site of the pEt28c vector. Concentrations of the T7 primers and Hotmaster Taq

UGT73C9, UGT73C10,

UGT73C11, UGT73C12

forward 5’-ATATGGCTAGCATGGTTTCCGAAATCACCCA-3’

UGT73C13 forward 5’-ATATGGCTAGCATG GTTTCAGAAATTACCCAT-3’

UGT73C9, UGT73C11

UGT73C10,

reverse 5’-AATTCGGATCCTCAATTATTAGATTGTGCTAGTTGC-3’

UGT73C12, UGT73C13 reverse 5’-AATTCGGATCCTCAATTATTGGATTGTGCTAGTTGC-3’

Table 2: primers used for amplification of the sequences

Page 24: Identification and biochemical characterization of the UDP ...

Materials and Methods

16

polymerase were respectively 2,5μM and 5U/μl, with 0.2mM dNTPs in 1x Hotmaster Taq

buffer. A PCR program with initial denaturation of 3 min 94˚C, 40 cycles of 94˚C 20 sec,

annealing 51˚C 20sec, extension 2min 65˚C, and a final extension of 65˚C 10min as final step

was used for the colony PCR. Test restrictions, incubating 100ng plasmid, 0.3μl of each

restriction enzyme together for 1h at 37˚C, were performed to identify the inserted

fragments (table 3). Plasmids containing the right sequence were sequenced

(MWG/Eurofins). Stock cultures were made out of single colonies. UGT73C10 was cloned

and transformed later by Jörg Augustin and included in the expression and characterization

studies.

Expression of the UGTs

Crude extracts

For expression of the recombinant UGTs, 1.5ml LB medium (+50µM kanamycin) was

inoculated with 10μl transformed E. coli culture and grown for 12h at 30°C, 350rpm.

Subsequently, 3ml TB medium and arabinose (final concentration 3mM) were added and

gene expression was induced by 0.5mM Isopropyl β-D-1-thiogalactopyranoside (IPTG).

Incubation was continued for 24h at 15°C. E. coli cells were harvested by spinning (4°C,

14000xg, 20 minutes), resuspention of the pellet in 750µl 50mM Tris HCL pH7.5 and 1x

Complete protease inhibitor (EDTA free), and stored in -80 until further usage.

A 250μl sample of every culture was taken after both incubation steps, and the OD600 was

measured with an Amersham® Ultrospec 3100 pro spectrophotometer.

After freezing at -80˚C, the cultures were thaw and spinned for 20min at 20000xg. 75µl 1%

[W/v] Protamine Sulphate was added to remove remaining DNA, and after spinning another

20min 20000xg the supernatant was used as crude extract. These crude extracts are used for

the activity assays. The presence of the UGTs in the crude extract was analyzed by a SDS

Page and in some cases by Western Blotting.

SDS sample preparation

A solution containing 10% glycerol, 2% SDS, 5% β-mercaptoethanol 62.5mM Tris pH 6.8 and

a spatula tip Bromphenol Blue was used as 4x concentrated sample buffer. Prior to loading

the samples were reduced and denaturated by heating for 5-10 min 95˚C with 1-2x sample

buffer. To study the level of expression of the heterologous expressed genes, samples of

E.coli cultures were analyzed by SDS-PAGE. The volume of these culture samples was

normalized using the OD0.6, whereby 1 OD600 corresponded with 30μl culture sample. For the

corresponding crude extracts 8.5 OD600 corresponded with 6μl crude extract. When the

PstI EcoRV PCiI expected length of the fragments (bp)

UGT73C9 6034 5776; 3796 4592; 1979; 259 UGT73C11 6034 3796 4592; 2238 UGT73C12 3796 6830 UGT73C13 3796 688 3722; 3108

Table 3: restriction positions of PStI, EcoRV and PCiI on pET28+ with UGT73C9, UGT73C11, UGT73C12,

UGT73C13 inserts.

Page 25: Identification and biochemical characterization of the UDP ...

Materials and Methods

17

purpose was to analyze differences among the crude extracts, the volume crude extract

used as a sample was constant among the samples (e.g. 5μl).

SDS-Poly Acrylamide Gel Electrophoresis and Western blotting

The samples were loaded on 12% or 14% SDS-polyacrylamide gels and run for 1.5h at

maximum 200V and 100A (per gel) in electrophoresis buffer containing 0.05M in Tris,

192mM glycine and 3.47mM SDS. For SDS-PAGE analysis, the gels were stained with 10%

acetic acid, 50% ethanol and 0.2 % (w/v) coomassie brilliant R, by incubation for 60-90 sec in

the microwave (700W) until the gels colored dark-blue. Destaining was performed with 7%

acetic acid in 2x 4 minutes in the microwave. To obtain maximum contrast, the gels could be

further destained in 7% acetic acid overnight, by gently shaking at 20˚C.

Western blotting was performed on unstained SDS pages. Proteins were transferred to a

PVDF membrane (Immuno-blot™) using a Biorad™ criterion blotter. Blotting was done at

350mA for 1.5h. The transfer buffer contained 25mM Tris and 192mM Glycine, and was

cooled during blotting. Blocking of the membrane was performed with 5% skimmed milk in

PSB-Tween buffer (8.0mM K2HPO4, 0.15M NaCl, 3.9mM KH2PO4) for 0.5-3 h 20˚C or

overnight 4˚C. After washing with PSB Tween buffer, the membrane was incubated with

anti-his antibodies in a 5% skimmed milk in PSB-Tween buffer. As secondary antibody HRP

anti-mouse (1:1000) was incubated at RT for 2 hours 20˚C or 4˚C overnight. The secondary

antibody was visualized using Super Signal West® Dura Extended Duration Substrate and the

Auto Chemi System.

Activity assays

UDP-sugar assays (developed by Jörg M. Augustin)

In the initial protocol a mixture (total volume 20µl) of 100mM Tris HCL pH 7.19, 1μl UDP-

glucose (1.85 MBq/2mL, Amersham®), 10mM Dithiothreitol (DTT), 0.1mM acceptor

substrate (solved in 80% ethanol) and 5μl crude extract was incubated for 30min at 37°C.

The reactions were stopped by adding 20μl Ethyl Acetate, and spinned for 5min 20000xg to

achieve phase separation. The organic phase (ca 20μl) was dried using a Scanvac vacuum

centrifuge, resolved in 6μl ethyl acetate and subsequently transferred to a TLC silica gel 60

F254plate. The eluent was a mixture of 32:9:1 CHCl3/MeOH/H2O (as Shinoda, 2002). The TLC

plates were processed in a phosphor-imager screen for 24h. Radio labeled compounds were

depicted on the imager-screen and visualized with a (Molecular™ Dynamics Storm 860)

phosphor imager.

Subsequently, the above protocol was optimized. Both the concentration and pH of the Tris

buffer were modified into 50mM Tris-HCL pH7.5. The DTT added to the lyses buffer, prior to

freezing -80, instead of to the assays themselves and the assays were performed at 30˚C

instead of 37˚C. When the purpose of the assays was to study substrate specificity, only

10μM substrate was used. For assays with only hederagenin and oleanolic acid as substrates,

the 32:9:1 CHCl3/MeOH/H2O mixture was used as eluent. For assays with flavonols a mixture

of 5:2:1:1 EtAc/MeOH/HCOOH/H2O (Kurosawa, 2002) and for activity assays with saponins

(incl. β-amyrin), flavonols, sterols and TCP a mixture of 7.5:0.5:1:1 EtAc/MeOH /HCOOH/H2O

was used.

Page 26: Identification and biochemical characterization of the UDP ...

Materials and Methods

18

Quantification of the TLC results

The phospho-imager data were analyzed with ImageQuant 5.0 software. For quantification

the brightness/contrast ratio was optimalized and the spots were selected manually. The

intensity of these selected regions was analyzed by the ImageQuant 5.0 software. The

average intensity of the spots was used as a measure of for the amount of end product

detected.

LC-MS assays

For LC-MS assays the above reaction as for the UDP-sugar assays was used, but substituted

the radiolabeled UDP-glucose with 1mM unlabeled UDP-Glucose as sugar donor. The

reaction was stopped with 100μl methanol. After spinning at 10000xg the supernatant was

purified with 0.45nm filter and used diluted 5x in 50% methanol.

Solvent A in the LCMS analysis was: 0.1%HCOOH, 50uM NaCl and B: 80% MeCN, 0.1%

HCOOH. Three programs were used to detect the monoglucosides. The TOF analysis used a

gradient from 88% A and 12%B to 0%A and 100%B during 19 minutes. The second program,

used for the MS/MS techniques run the samples on a gradient from 63%A and 37%B to

20%A and 80%B in 9 minutes. A third program, used to detect flavonol and TCP

monoglucosides, a 7 min gradient from 98%A and 2%B to 60%A and 40%B was used. The

data were analyzed with Bruker Daltonics Data Analysis software.

Up scaling gene expression for NMR (protocol provided by Evolva).

For NMR, a sufficient amount (>2 mg) of the monoglucoside was obtained using a 1,5L

bacterial culture and a three day 50ml activity assay. 500mL NZYM medium was inoculated

with 1-2 transformed culture and grown overnight at 30˚C. 1000 ml of the same medium

was added, together with arabinose (final concentration 3mM) and IPTG (final concentration

of 0,1mM), and incubation continued at 15˚C for 24h. Cells were harvested in a 50mL tube

by spinning for 7min at 6500xg at 4˚C and resuspention in a lysis buffer (10mM tris, 5mM

MgCl2, 1mM Ca 3 tablets/100 ml Complete® protease inhibitor) to total volume of 30mL.

The cultures were stored at -80˚C.

After thawing the cultures to room temperature, 250μl DNasI solution (1.4mg/mL) was

added. After the viscosity had decreased sufficiently 1/3 volume of 4x binding buffer (80mM

Tris-Hcl pH7.4; 2M NaCl) was added. After centrifugation for 15min 15600xg 4˚C, the

supernatant was transferred to a fresh tube and spinned for 20min at 26000xg to remove

the remaining DNA. 3mL of His-select (sigma P6611) resin was incubated with the

supernatant for 2h at 4˚C, subsequently centrifuged for 4min at 2000xg and 4˚C; the

supernatant was discarded. The resin was washed 2-3 times with 1x binding buffer, with a

final volume of 10-15 ml.

The final reaction had a total volume of 40-50 ml containing the resin, 100mM Tris-HCL

pH8.0, 5mM DTT, 5mM MgCl2, 1mM hederagenin, 1mM UDP-glucose and 1% Fermentas

FastAP phosphatase 1U/μl. The reaction was incubated on 30˚C under gently stirring, and

was monitored by TLC analysis and staining with 10% sulphuric acid, which showed both

aglycone and monoglucoside. When the ratio aglycone:monoglucoside did not notably

change anymore, the reaction was stored at 4˚C until further usage. Prior to HPLC analysis,

Page 27: Identification and biochemical characterization of the UDP ...

Materials and Methods

19

supernatant was filtrated with a 0,22μM and filter (syringe) and the pH was adjusted to pH3-

4 by addition of Trifluoroactetic adid (TFA). HLPC purification was performed by Evolva.

NMR The NMR analysis was performed and interpreted by Carl Erik Olsen. NMR spectra were

recorded in methanol-d4 on a Bruker Avance 400 instrument using TMS as internal standard. 1H-NMR (methanol-d4, 400 MHz) δ: 0.71 (3H, s, CH3), 0.81 (3H, s, CH3), 0.91 (3H, s, CH3), 0.94

(3H, s, CH3), 0.98 (3H, s, CH3), 1.17 (3H, s, CH3), 0.7-2.1 (overlapping multiplets from steroid

skeleton), 2,84 (1H, dd, J=4.2 & 13.9 Hz, H-18), 3.16 (1H, dd, J= 7.8 & 8.7 Hz, H-2'), 3.25-3.36

(m, overlapped by solvent), 3.60-3.69 (3H, m), 3,82 (1H, dd, J=2.1 & 12.0 Hz, H-6’), 4.38 (1H,

d, J=7.9 Hz, H-1’), 5.24 (1H, broad t, J=3.5 Hz, H-12). 13

C-NMR (methanol-d4, 100.6 MHz) δ:

13.4 (C-24), 16.4 (CH3), 17.8 (CH3), 18.9 (CH2), 24.0 (CH3), 24.1 (CH2), 24.6 (CH2), 26.3 (CH2),

26.5 (CH3), 28.9 (CH2), 31.6, 33.5 (CH2), 33.6 (CH3), 33.9 (CH2), 35.0 (CH2), 37.7, 39.5 (CH2),

40.6, 42.8, 43.0, 43.9, 47.3 (CH2), 47.7, 48.2, 62.8 (CH2, C-6’), 64.9 (CH2, C-23), 71.6 (C-4’), 75.7

(C-2’), 77.8 (C-5’), 78.4 (C-3’), 83.5 (C-3), 105.8 (C-1’), 123.6 (C-12), 145.3 (C-13), 181.9 (C-28).

His-tag purification

His-tag purification was performed with the native and denaturized protein. This purification

method is based on binding of the His-tag of the protein to a nickel-containing resin.

Therefore 750μl of a crude extract was incubated with resin for 1h at 4˚C, spun through a

NucleoSpin® RNA Plant filter afterwards and washed twice with 750μl washing buffer.

Elution of the protein was performed in two steps with 250μl elution buffer and a final clean

up step was performed with 750μl of a strong buffer (e.g. 1M EDTA) to make sure all the

proteins were unbound from the resin. The fractions of all steps were collected and run on a

SDS page.

For the native enzyme purification, the conditions for the resin binding step were 300mM

NaCl, 10mM Immidazol, 50mM DTT and 50mM Tris pH 7.5. The washing buffer contained

300NaCl, 20mM Immidazol, 50mM DTT and 50mM Tris pH 7.5. Elution was obtained with a

buffer of 300NaCl, 50mM Tris-HCl pH 7.5 and 50mM DTT and 150mM Immidazol. The Clean

up was performed with 1M NaCl. In some cases EDTA was used to elute the protein, with an

elution buffer containing 200mM and 2M EDTA as clean up buffer. EDTA attracts nickel and

removes the nickel from the resin, thus competing with the resin.

For denaturized purification the 750μl crude was incubated with the resin in 8M urea and

50mM Tris. The washing step contained 8M urea and 10mM Tris-HCl, with a pH6.3. Elution

steps were performed with the same buffer but the pH was respectively 5.9 and 4.5 for the

two steps. Clean up was performed with 600μl NaAC –HCL pH3.

The eluted fractions were concentrated using to one IVSS vivaspin 500 centrifugal concentrat

(centricon) and spinned for 5 min at 12500xg, or until a residual volume of 50μl was

obtained. The centricon was washed 2-3x with an appropriate buffer (e.g. 100mM Tris-HCl

pH7.5 and 10mM Tris). The final volume of the purified solutions was ca 50μl.

Page 28: Identification and biochemical characterization of the UDP ...

Materials and Methods

20

Protein measurements

The concentrations of the purified enzyme were measured with Nanodrop at A280;

performingPierce BCA assays or by analyzing the intensity on SDS of Western with Image J. A

standard curve was made with known concentrations of BSA, essentially according to the

following the Pierce BCA Assay kit instructions, but using 5μl sample, 95μl sample buffer in a

96 multi-wells plate. After incubation the samples were kept on ice until they were analyzed

with Nano-drop, using the corresponding BCA-program.

Standard curves with both BSA or known concentrations purified UGT were made with SDS-

PAGEs and Western blots (performed as described above) and the gels were analyzed with

Image J freeware.

Sequence analysis

Alignments of the sequences and corresponding phylogenetic trees and similarity tables

were produced using MEGA4©, a program that makes use of a Custal W algorithm. Penalty

for Gap was set on 10 and Gap extention 0.5.

The crystal structures were obtained from the Protein data bank (www.pbd.org) and used

for analysis I with Pymol 1.3 (source: www.pymol.org.) The structures used were UGT71G1

containing UDP-glucose: PBD 2ACW and VvGT: PBD 2C9Z.

Page 29: Identification and biochemical characterization of the UDP ...

Results

21

Figure 8: NJ bootstrap consensus tree of all to date known members of the UGT73C subfamily. The

genes from Barbarea vulgaris (both ssp arcuata and variegata) are indicated. Sequences derived

from the G and P types are marked in respectively red and blue.

3. Results

3.1 Phylogenetic analysis of the UGT cDNA and amino acid sequences

The sequences of all known members of the UGT73C subfamily, including B.v.variegata

UGT73Cshi and the five UGT derived from B.v.arcuata were aligned with Clustal W. The

parameters were a gap penalty of 10 and a gap extension penalty of 0.5 for both the

pairwise and multiple alignments. A Neighbor Joining tree was constructed based on this

alignment, for both the nucleotide and amino acid sequence (figure 8). As statistical

bootstrapping was used (1000x). As an out group UGT71G1 is used, a family 1

glycosyltransferase from Medicago trancutala that is able to glycosylate isoflavones and

flavonols and the sapogenin medicagenic acid (He et al., 2008; (Achnine et al., 2005). The p-

distance, defined as ‘the proportion (p) of amino acid (or nucleotide for DNA sequences)

sites at which the two sequences to be compared are different’ (Mega4 2010), is used to

calculate the percentage pairwise identity (1- p-distance) as shown in table 4.

The five UGT sequences from Barbarea vulgaris ssp arcuata form together with the gene

from B. vulgaris ssp variegata a distinct clade within the UGT73C subfamily (figure 8). Two

UGT73Cshi

UGT73C11

UGT73C10

UGT73C9

UGT73C12

UGT73C13

B. vulgaris

UGT73C5

UGT73C6

UGT73C2

UGT73C3

UGT73C4

UGT73C1

UGT73C7

UGT73C8

UGT71G1

100

98 98

100 83

93

99 100

100

100 92 99

0.05

Page 30: Identification and biochemical characterization of the UDP ...

Results

22

subclades can be distinguished: UGT73C12 and UGT73C13 form one subclade and UGT73C9,

UGT73C10 and UGT73C11 and UGT73Cshi cluster together in the second. Orthologs and

paralogs cluster together in a QTL analysis of Barbarea vulgaris (Kuzina, Augustin,

unpublished). Their neighboring positions imply gene duplication.

UGT73C11 is the closest homolog to B.v. variegata UGT73Cshi, they share only one amino

acid substitution of an aspartic acid (UGT73Cshi) into a glutamic acid (UGT73C11) at position

338 (figure 9). Therefore UGT73C11 will be referred to as the ortholog for UGT73Cshi in the G

type plant of B vulgaris ssp arcuata. The NJ analysis on amino acid level indicates that

UGT73C10 is the ortholog in the P-type plant (98% similar to UGT73C11). UGT73C12 and

UGT73C13 share 98% (table 4) of their amino acid sequence and cluster together in both

nucleotide and amino acid bootstrap consensus trees. UGT73C12 was isolated from the P

plant and UGT73C13 from the G plant, indicating that there are orthologs towards one

another but paralogs of the orthologs of UGT73Cshi from B.v. variegata.

G1 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13

C1 0.30

C2 0.30 0.67

C3 0.29 0.65 0.77

C4 0.28 0.66 0.76 0.85

C5 0.29 0.68 0.73 0.76 0.75

C6 0.27 0.65 0.70 0.74 0.73 0.87

C7 0.26 0.50 0.51 0.52 0.50 0.50 0.50

C8 0.27 0.44 0.41 0.44 0.42 0.43 0.43 0.41

C9 0.29 0.67 0.69 0.71 0.72 0.77 0.75 0.48 0.40

C10 0.29 0.67 0.70 0.73 0.73 0.79 0.76 0.49 0.41 0.95

C11 0.29 0.67 0.70 0.72 0.73 0.79 0.76 0.49 0.41 0.95 0.98

C12 0.28 0.68 0.70 0.74 0.74 0.83 0.79 0.51 0.43 0.90 0.92 0.92

C13 0.28 0.67 0.70 0.74 0.74 0.83 0.79 0.51 0.43 0.90 0.92 0.92 0.98

UGT73Cshi 0.29 0.67 0.69 0.72 0.73 0.79 0.76 0.49 0.40 0.95 0.98 1.00 0.92 0.92

Table 4: percentage amino acid sequence identity of UGT73C subfamily members

Page 31: Identification and biochemical characterization of the UDP ...

Results

23

The closest related known subfamily members of the UGTs derived from B.vulgaris are

UGT73C5 and UGT73C6 from Arabidopsis thaliana. They are ca 80% percent identical to the

UGTs derived from Barbarea vulgaris. UGT73C5 has been characterized as a gene involved in

the glucosylation of brassinosteroids, a class plant hormones that are involved cell

elongation and differentiation (Poppenberger et al., 2005). Brassinosteroids are triterpene

structures, derived from the phytosterol pathway, with campesterol as precursor.

Glucosylation of brassinosteroids has a regulatory function as it inactivates the

brassinosteroids, and over expression of UGT73C5 in Arabidopsis led to a dwarf phenotype.

Besides glycosylation of brassinosteroids, UGT73C5 is also capable of glycosylating the

cytokinins (Poppenberger et al., 2005) trans-zeatin and dihydrozin (Hou et al., 2004) and

both UGT73C5 and UGT73C6 can utilize zearelone, a myco-toxin with estrogenic activity

(Poppenberger et al., 2006). UGT73C6 has shown to be in vivo involved in the glycosylation

of quercetin at the 3 and 7 positions (Jones et al., 2003)

UGT73C3, UGT73C4, UGT73C1 and UGT73C are ca 70% identical, UGT73C3 and UGT73C4

closer related than UGT73C2 and UGT73C1. Little is known about the function or activity of

these four UGTs. All are derived from A. thaliana. Like UGT73C5, UGT73C1 was capable of

glycosylating of trans-zeatin and dihydrozeatin. UGT73C3 and UGT73C4, also included in this

research did not show activity on these substrates (Hou et al., 2004). Within the UGT73C

subfamily the A. thaliana UGT73C7 and M. truncatula UGT73C8 are the least similar to the

UGT from Barbarea vulgaris, with a sequence identity of 40-50 percent.

Page 32: Identification and biochemical characterization of the UDP ...

Results

24

3.2 UGT sequence analysis

UGT73C9 shows an altered solubility and activity towards oleanolic acid and hederagenin

(see 3.3) compared to the other four UGTs. To gain more insight in the amino acid

substitutions underlying these differences, the sequences were aligned with UGT71G1, a

UGT that can utilize flavonols and saponins and of which a crystal structure is available. The

sites that differ from the consensus sequence are highlighted in figure 9.

Figure 9: CLC alignnment of UGT73C9, UGT73C10, UGT73C11, UGT73C12, UGT73C13, UGT73shi and UGT71G1

using Clustal W algorithm. Amino acids that differ from the consensus sequence are marked in pink. The PSPG box

is underlined in black.

Page 33: Identification and biochemical characterization of the UDP ...

Results

25

The crystal structure of UGT71G1 was used to study the amino acid substitutions in

UGT73C9 according to the alignment above. A crystal structure of UGT71G1 containing the

UDP-sugar was available (PBD 2ACW), and aligned with VvGT (PBD 2C9Z) (Offen et al., 2006)

containing quercetin and UDP as substrates. Thus in figure 10 both substrates are visualized,

derived from different crystal structures. The substrates are shown as sticks and are stained

by elements in which carbon is green for the UDP-sugar in the UGT71G1 model, whereas

quercetin is stained by elements but with purple carbon. The corresponding amino acids of

the positions in UGT71G1 where UGT73C9 differed from other Barbarea UGT73C are shown

as sticks and marked in red. This visualization can indicate how the amino acids of the

substitutions in UGT73C9 can interact with UDP-glucose and an acceptor substrate. The

active center is shown in figure 10.

Figure 10: Alignment of the crystal structure of UGT71G1 with UDP-glucose, and VvGT containing quercetin

and UDP. Only the active center of the enzyme is shown. The substrates are shown as sticks and are stained

by elements with O in red, N in blue and P in orange. However, Carbon atoms are shown white in UDP, green

in UDP glucose and purple in quercetin. UGT71C1 and VvGT are overlaid and showed as cartoon in which

UGT71G1 is orange and VvGT is blue. Amino acid of UGT71G1 which have a unique amino acid substitution in

UGT73C9 are shown as sticks and indicated in red.

Page 34: Identification and biochemical characterization of the UDP ...

Results

26

Table 5 one summarized the amino acid substitutions that are indicated in red in figure 10.

At positions 210, 297, and 379 corresponding with position202, 286 and 395 of UGT71G1,

UGT73C9 conferred unique amino acid substitutions towards compared with UGT73C10 and

UGT73C11. Even though these sites are in the active center, these sites do not seem to have

in direct contact with the substrate. However, an effect of these amino acid substitutions on

activity cannot be excluded as the amino acids are so close to the active center.

The Tyr202 of UGT71G1, however, might well interact with the acceptor substrate (figure

10). In the functional UGT73C10 this tyrosine is replaced by a lysine and in the non-

functional UGT73C9 by an arginine. Lysine and arginine are very similar amino acids, but the

extra length of the arginine in UGT73C9 may have a steric effect to the acceptor substrate,

as it changes the size and shape of the cavity that fits the acceptor substrates.

UGT71G1 position UGT71G1 UGT73C10 position UGT73C10 & 11 UGT73C9

194 Asn 202 Val Ala

202 Tyr 210 Lys Arg

286 Met 297 Ile Asn

379 Tyr 395 Phe Ile

380 Ala 396 Gly Val

Interesting is an amino acid substitution in the PSPG box, even the site itself is not very

conserved according to the PSPG box in Vogt and Jones 2000. All B. vulgaris except UGT73C9

contain an alanine on that position, and UGT71G1 a glycine. UGT73C9, however, contains a

valine, which contains has an extra CH3 group. Figure 10 shows this alanine is located exactly

in between donor and acceptor and the bigger size of the extra methyl group in valine may

form a steric barrier between donor and acceptor substrate, and thereby blocking the

activity of the enzyme. This could explain the lack of activity on sapogenins and the low

activity on TCP. Further studies will be needed to study the effect of this amino acid

substitution.

However, the altered solubility of UGT73C9 can be casued by amino acid substitutions

outside the active center. For example, the substitution of an arginine into an a cysteine on

position 167 introduces an extra S group on this position which might interfere with the

folding of the enzyme.

Table 5: amino acid substitution

Page 35: Identification and biochemical characterization of the UDP ...

Results

27

3.3 Cloning and heterologous expression of the UGTs

The cDNA sequences of UGT73C9, UGT73C11, UGT73C12 and UGT73C13 were amplified by

PCR, purified, ligated into a pET28c vector and transformed into DH5α or Xjb strains of E.

coli. Insert containing colonies were selected by colony of culture PCR. The five UGTs could

be individually distinguished on their EcoRV, PstI and PciI restriction pattern. Plasmids

containing the right insert were selected and sequenced. Often the sequences contained

point mutations compared to the original sequence, and were not used in this thesis. The

UGT73C10 sequence was cloned and transformed by Jörg M. Augustin and included in the

expression and characterization studies.

To study the expression of the recombinant UGTs, samples of the bacterial culture before

and after expression, the crude extracts and corresponding insoluble phase were analyzed

by SDS-PAGE figure 11. The SDS-PAGE analysis reveals induction of proteins with a size of

approximately 55kDa, which is within the expected UGT size region. UGT expression is found

in all cultures besides the empty vector culture which reveals one or more proteins with

approximately the same size as a background in the SDS-PAGE analysis. All the expressed

proteins could be seen in the soluble phase except for UGT73C9, which could not be

distinguished from the background (figure 11C) but is present in the insoluble phase as

Figure 11: SDS-PAGE analysis of bacterial cultures before (A) and after (B) induction with IPTG. C) The

corresponding according crude extract (soluble phase) D) insoluble phase. The size (kDa) of the fragments in the

marker lane is indicated on the left. The size region of the recombinant UGT is indicated with a black arrow.

75

50

25

37

Empty

vector

100

UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73C13 UGT73Cshi

150

75

100

50

25

37

Empty

vector UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73Cshi UGT73C13

150

75

100

50

25

37

UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73C13

Empty

vector UGT73Cshi

UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73C13

Empty

vector UGT73Cshi

150

100

75

50

25

37

A B

C D

Before induction After induction

Soluble phase Insoluble phase

Page 36: Identification and biochemical characterization of the UDP ...

Results

28

inclusion bodies (figure 11D).

To study the formation of inclusion bodies in UGT73C9, a sequential dilution series of the

IPTG concentration was used for induction of expression (figure 12A,B). Western Blot

analysis revealed the presence of UGT73C9 in the crude extract (figure 12D). However, the

concentration of UGT73C9 in the crude extract is obviously much lower than the

concentration of the other UGTs.

Figure 12: SDS-page (A,B,C) and Western Blot (D) of UGT73C9. Bacterial culture before (A) and after (B)

induction by IPTG. (C) crude extract, and D) the corresponding Western Blot analysis. The concentration IPTG

used to induce gene expression is indicated below each lane. The arrows indicated the position of the

heterologous expressed UGT.

A

250 150

75 100

50

25

37

20 0,1mM 0,05mM 25μM 12,5μM 6,75μM 3μM 0,75μM 1,5μM

D

100 75

50

25

37

20

B 0,1mM 0,05mM 25μM 12,5μM 6,75μM 3μM 1,5μM 0,75μM

75

100

50

25

37

C 0,1mM 0,05mM 25μM 12,5μM 6,75μM 3μM 1,5μM 0,75μM 0,1mM 0,05mM 25μM 12,5μM 6,75μM 3μM 1,5μM 0.75μM

Before induction After induction

Soluble Phase Soluble Phase

Page 37: Identification and biochemical characterization of the UDP ...

Results

29

3.4 Characterization of the UGTs

To study the activity of the heterologous expressed UGTs, activity assays were performed

with radioactively 14

C labeled UDP-glucose as donor and oleanolic acid and hederagenin as

acceptor substrates. The radioactive monoglucosides could be detected with a phosphor-

imager. The TLC analysis revealed glycosylation activity for UGT73C10, UGT73C11,

UGT73C12, UGT73C13 and UGT73Cshi (positive control) for the substrates oleanolic acid and

hederagenin (figure 13). No monoglucosides were detected for the empty vector crude

extract. Also UGT73C9 did not show any activity on these aglycones, which can be due to

various reasons such as a low concentration of UGT73C9 in the crude extract, an inactive

form of the protein in the crude extract, a very low affinity of UGT73C9 for hederagenin and

oleanolic acid, or a combination of the above mentioned reasons. However, in activity assays

with different substrates reveal that UGT73C9 is able to utilize 2,4,6-trichlorophenol (TCP)

(figure 14D), thus UGT73C9 is active in the crude extract of UGT73C9.

Activity assays were performed to study the activity of the UGTs under different conditions

regarding pH, temperature, buffer concentration, etc. The TLC results indicate that the

amount of end product correlates with the pH of the Tris-HCl buffer, with an optimal pH of

UGT73C9 UGT7310 UGT73C11 UGT73C12 UGT73C13 neg UGT73C

_shi

he

d

ol

he

d

ol

he

d

ol

he

d

ol

he

d

ol

he

d

ol

he

d

ol

ol = oleanolic acid

hed = hederagenin

oleanolic acid

monoglucoside

hederagenin

monoglucoside

Figure 13: TLC analysis of activity assays of UGT73C9, UGT73C10, UGT73C11, UGT73C12 and UGT73C13. The

substrates and UGT indicated below the lanes. ’neg’ refers to a crude extract that was not expressing one of the

five UGTs (empty vector). Oleanolic acid and hederagenin monoglucosides are indicated with an arrow. Activity

assays were performed on 37˚̊̊C with 0,1mM substrate concentration. Corresponding crude extracts are show in

figure 11C.

Page 38: Identification and biochemical characterization of the UDP ...

Results

30

pH7.5 (figure 14A). No pattern was found in LCMS analysis of assays of UGT73C12 with 20,

50 and 100mM Tris concentrations. Interestingly, the concentration of hederagenin and

oleanolic acid in a range of 10-100µM did not seem to have an effect on the amount of

monoglucoside produced in the assays, suggesting that these concentration of oleanolic acid

and hederagenin are within the Vmax range of UGT73C11. 10 times dilution of the crude

extract, however, correlated with a 2.5 times smaller TLC dot intensity. LCMS data of

UGT73C12 revealed that the enzyme seemed to have a similar activity well on 30 and 37˚C,

however no monoglucosides could be detected after 30 min incubation at 20˚C. Activity

assays using fresh crude extract and the same crude extract after storage on ice (0˚C)

overnight indicated that both UGT73C12 and UGT3C10 are stable on ice for at least 20h, as

no difference in the amount of monoglucoside could be detected.

Figure 14: TLC analysis of UGT73C members of B.v.arcuata in varying assay conditions. Assays were in

principle preformed with 100µM Tris pH 7.2 and 100µM substrate concentration, and varying conditions

treatments are indicated for each lane the. A) UGT73C11 activity with varying pH and concentrationTris

buffer. B) Activity of UGT73C11 with 10-100μM oleanolic acid and hederagenin concentration. C) activity of

UGT73C9, UGT73C10 and UGTC12 on TCP before and after 24h storage on ice. Below: UGT739 (D), UGT73C10

(E), UGT73C12 (F) tested for activity on different substrates: ol = oleanolic acid, he = hederagenin, qu =

quercetin, ka = kaempferol, TCP = 2,4,6, tri-chlorophenol (UGT73C11 and UGT73C13 not shown).

ol he qu ka TCP

UGT73C10

ol he qu ka

UGT73C9

TCP

1x

pH7.2

10x

pH7.2

10x

pH7,0

10x

pH6.8

10x

pH7.5

10x

pH7.8

UGT73C11

ol he qu ka

UGT73C12

TCP

100

μM

50

μM

20

μM

10

μM

100

μM

50

μM

20

μM

10

μM

hederagenin

UGT73C11

oleanolic acid UGT73C9 UGT73C10 UGT73C12

24h 24h 24h 0h 0h 0h

TCP

A B C

D E F

Page 39: Identification and biochemical characterization of the UDP ...

Results

31

In contrast, after freezing (-20) the activity of UGT73C12 gradually decreased (figure 15).

Addition of up to 50% glycerol limited the effect of freezing to some extent, but it was not

sufficient to maintain the original activity. UGT73C11 did not show loss of activity after

freeze thawing, with or without addition of glycerol. The percentage glycerol itself did not

have a notable effect on enzyme activity (figure 15).

The activity assays with oleanolic acid and hederagenin as acceptor substrate and were

analyzed by LCMS to determine the mass of the compounds detected by the TLC analysis.

The predicted mass of the monoglucosides is 619 for oleanolic acid and 634 for hederagenin,

based on the masses of the oleanolic acid (mw 457) and hederagenin (mw 472) aglycone.

Different LCMS programs were applied to detect the monoglucosides, using either positive

or the negative detection mode of the program.

An LCMS analysis of activity assays of UGT73C10 in negative mode is shown in figure 16. The

main compound in the peak at 6.1 min in assays with hederagenin has a weight of 679 m/z,

which corresponds to the hederagenin formic acid adducts (figure 16). The fragments of this

compound in MS/MS analysis represent the monoglucoside (m/z 634), and the difference in

mass is represented by the loss of a formic acid ion. The MS/MS/MS revealed a compound

with the mass of the hederagenin aglycone (m/z 472) is found. The loss of the sugar moiety

is represented by a loss in m/z 162.

UGT73C13

0

500

1000

1500

2000

2500

0 10 20 30 40 50

0

1

2

3

UGt73C11

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

0 10 20 30 40 50

0

1

2

3

Figure 15: The TLC intensity of activity assays UGT73C13 and UGT73C12 with addition 0, 10, 20, 30, 40 and

50% glycerol to the crude extracts after one, two and 3 freeze-thaw cycles (respectively 0, 1, 4 and 5 days in -

20˚C). The volume of the crude extract used in the activity assays was adjusted to the percentage glycerol in

the crude extract. The percentage glycerol is indicated below each lane and the amount of freeze-thaw cycles

is represtented by the bar colour.

Page 40: Identification and biochemical characterization of the UDP ...

Results

32

The oleanolic acid formic acid adduct was found at 8.3 min with a mass of m/z 619 that

represented the monoglucoside, and the aglycone was represented by a fragment of m/z

456 in the MS/MS. In the positive mode 50μM NaCl (mw 23) was added to the mobile phase

to facilitate the detection of the monoglucosides, and the masses we found were m/z 619

and m/z 656 representing the Na+ adducts of the hederagenin and oleanolic acid

monoglucosides.

Figure 16: LCMS analysis of activity assays with UGT73C10 using oleanolic acid and hederagenin

aglycones as acceptor substrate. A) Total Ion Chromatochram of an overlaid of assays with

hederagenin (blue), oleanolic acid (red) or hederagenin but without addition of UDP-glucose (dark

purple). Peaks containing the monoglucosides are marked with a black arrow. B) The fragmentation

pattern of the peak at 6.1 min. The red square indicates the compound analyzed with MS/MS (C) and

MS/MS/MS (D). The blue square indicates of which compound the fragmentation pattern is derived.

393.6

471.6

499.1

-MS3(680.2->634.1), 6.1min #460

0

1

2

3

4

4x10

200 300 400 500 600 700 800 900 1000 m/z

2 4 6 8 10 12 14 Time [min]0

2

4

6

87x10

Intens.

2 4 6 8 10 12 14 Time [min]0

2

4

6

87x10

Intens.

2 4 6 8 10 12 14 Time [min]0

2

4

6

87x10

Intens.

hederagenin

633.9

701.8

769.7

837.7905.6

973.6 1041.5

679.9-MS, 6.1min #458

0

2

4

6

6x10Intens.

200 300 400 500 600 700 800 900 1000 m/z

633.9-MS2(680.2), 6.1min #459

0

2

4

6

6x10Intens.

200 300 400 500 600 700 800 900 1000 m/z

A

B

C

D

hderagenin

monoglucoside

oleanolic acid

monoglucoside

Page 41: Identification and biochemical characterization of the UDP ...

Results

33

An NMR analysis was performed on the hederagenin monoglucoside to determine the

position of glucosylation and the conformation of the sugar. For this purpose >2 mg of the

hederagenin monoglucoside was required. Therefore activity assays were performed on a

large scale with 1.5L UGT73C11 containing culture and a 50ml activity assay for two days. Ca

24mg hederagenin aglycone was used for the activity assay; however, only 2.1mg of the

monoglucoside was purified. The unexpected low amount of purified monoglucoside is

probably due to a loss of monoglucoside during the pH adjustment and 0.21μm filtration of

the activity assay prior to the HPLC purification, as precipitation of one of the compounds in

the assay was observed. The NMR analysis of the hederagenin monoglucoside revealed that

the glucose moiety was attached to the C3 position of the aglycone and the sugar was in β-

conformation (figure 17).

3.5 UGT substrate specifity

UGT73C9, UGT73C10, UGT73C11, UGT73C12, UGT73C13 and also A. thaliana UGT73C5

UGT73C6, were tested for their ability to utilize different compounds as substrate. A

relatively low substrate concentration was used (10μM) so that the differences in efficiency

of the enzyme per substrate would be detectable. The acceptor substrates included

oleanolic acid and hederagenin, the substrates for the presumed biosynthetic step. Also β-

amyrin, the presumed precursor of oleanolic acid and hederagenin, was tested. Other

acceptor substrates included flavonols, quercetin and kaempferol, and phytosterols

(sitosterol, campesterol, sigmasterol, obtusifoliol) (figure 18). Quercetin and kaempferol are

the major flavonols in model plant A.thaliana and many UGTs that utilize sapogenins are also

active on quercetin aglycones. The saponin biosynthesis pathway is derived from the

phytosterol pathway, thus it would be relevant to also test sterols as substrate. Sitosterol,

sigmastero, campesterol and obtusifoliol were available as substrates. 2,4,6-trichlorophenol

Figure 17: 3-O-glucoside of hederagenin with the sugar in β-conformation. The C molecules of

hederagenin and the sugar moiety are numbered.

CH3 CH3

H3C

H3C CH2OH

CH3

COOH

CH3

34

17

OOOH

HOHO

OH

1

5

10

6

8

26

12

25

20

22

14

30 29

2816

27

24

18

1'

4'

Hederagenin monoglucoside

23

a

b

MW 472+162=634

Page 42: Identification and biochemical characterization of the UDP ...

Results

34

(TCP) served as a positive control for activity of the UGTs and was used in 100μM

concentration. TLC results of these activity assays are shown in figure 19 and summarized in

table 6.

The activity assays reveal that UGT73C10, UGT73C11, UGT73C12 and UGT73C13 can utilize

oleanolic acid and hederagenin as a substrate. Two glycosylation patterns were found, one

for UGT73C10, UGT73C11 and one for UGT73C12, UGT73C13. Therefore only one

representative of each (UGT73C10 and UGT73C12) is shown in figure 19. The spot intensity

of all is summarized in table 6. UGT73C10 and UGT73C11 were clearly active towards

oleanolic acid, hederagenin, and β-amyrin aglycones. A smear in the lanes of quercetin and

kaempferol indicated also activity for the flavonols, but much weaker than the activity on

sapogenins. Also UGT73C12 and UGT73C13 were active on oleanolic acid and hederagenin,

but the spots were less intense than those of UGT73C10 and UGT73C11 suggesting a lower

efficiency of UGT73C12 and UGT73C13 under these conditions. The activity of UGT73C12 on

TCP was similar to that of UGT73C10.

Whereas UGT73C10 and UGT73C11 seem to utilize β-amyrin, oleanolic acid and hederagenin

equally efficient, it is not clear if the Rf 0.73 compound in UGT73C12 (and UGT73C13) activity

assays is a real β-amyrin monoglucoside. (Impurities of) 80% ethanol appears to be

glucosylated and the products gave a Rf 0.28 and 0.73, which is the same retention time as

Figure 18: Substrates used in the activity assays. The tripenoid sapogenin aglycones hederagenin, oleanolic acid

and β-amyrin are marked in blue, the two flavonols kaempferol and quercetin in green, the four sterols

sigmasterol, sitosterol, campesterol, obtusifoliol in brown and 2,4,6 trichlorophenol (TCP) in black.

2,4,6-trichlorophenol

OHO

OH

HO

OH O

OH

quercitin

OH

OOH

HO O

OH

kaempferol

O

HO

hederagenin

O

OH

HO

oleanolic acid

HO

-amyrin

OH

OH

OH

Cl

Cl

Cl

HO

sigmasterolHO

HO

obtusifoliolHO

campesterolsitosterol

β

Page 43: Identification and biochemical characterization of the UDP ...

Results

35

the β-amyrin monoglycoside (which is solubilized in 80% ethanol). LCMS analysis could

neither detect β-amyrin in assays with UGT73C12 and UGT73C13 nor assays in with

UGT73C10 and UGT73C11. Thus, whether UGT73C12 and UGT73C13 can utilize β-amyrin as

a substrate could not be determined. A weak spot is detected in the lane of UGT73C9 and

TCP, but not for any other substrate tested.

Activity towards kaempferol and quercetin was detected for all four UGTs, although it was

very weak compared to oleanolic acid and hederagenin. The more intense spot in UGT73C6

reveals a quercetin monoglucoside with an Rf of 0.58 (data not shown). In the lanes with the

sterols as acceptor substrate, vague spots are detected with an Rf of 0.67. An additional

Figure 19: TLC analysis of activity assays with UGT73C12 (A), UGT73C10 (B) and UGT73C6 (C) tested with

different aglycones. Saponins (oleanolic acid and hederagenin), flavonols (quercetin and keampferol) and

sterols (obtusifoliol, sitosterol, sigmasterol and campesterol) were used at 10µM concentration. TCP was

tested on 100µM and served as positive control. Substrates are indicated below each lane: ol = oleanolic acid,

hed = hederagenin, que = quercetin, kae = kaempferol, obt = obtusifoliol, sito = sitosterol, sig = sigmasterol, β-

amyrin, TCP= 2,4,6-trichlorophenol. The retention factor of the main compounds is indicated on the right

side. As eluent a mixture of 7.5/.5/1/1 EtAc/MeOH/HCOOH /H2O was used.

0,73

0.66

0.28

0.69

0.58

0.52

ol hed que kae obt sito sig cam β-am TCP eth

0.73 0.69 0.66

0.58

0.53

0.28

ol hed que kae obt sito sig cam β-am TCP eth

A

B

UGT73C12

UGT73C10

Page 44: Identification and biochemical characterization of the UDP ...

Results

36

smear is detected in the lane with UGT73C12 and sigmasterol. However, the monoglucosides

are solubilized in 1%tween instead of 80% ethanol, and a control experiment with 1% tween

as substrate is lacking to determine if the compounds on Rf 0.67 or 0.73 are sterol-

glucosides.

The intensity of the main compounds for each lane is summarized in table 6. The intensity

values in table supports the difference in glycosylation for UGT73C10-UGT73C11 and

UGT73C12-UGT73C13 as mentioned earlier. However, the enzyme concentration is not

taken into account, as we did not find a feasible way to determine the amount of enzyme in

the crude extract in this master (see section 3.6). Correlations between the intensity of the

UGT band on SDS-PAGE and the activity on either the positive control TCP or any other

compound in the assays were not found (compare table 6 with figure 20). However, figure

14 shows that a 10x dilution of a crude extract does affect the amount of end products

formed. The affinity for TCP may differ among the enzymes, and also the percentage

enzymes in an active state may differ between the UGTs, which could explain why this

correlation between the SDS-PAGE and the difference in activity towards TCP or any other

substrate was not found.

oleanolic

acid hederagenin quercetin kaempferol obtusifoliol sitosterol sigmasterol campesterol β-amyrin TCP*

UGT73C9 nd nd nd nd nd nd nd nd nd 13

UGT73C10 480 717 33 24 17 13 12 11 439 292

UGT73C11 507 893 23 31 14 10 nd nd 329 471

UGT73C12 91 85 nd 28 13 13 15 13 26 298

UGT73C13 70 49 12 10 11 10 10 10 31 190

UGT73C5 76 20 nd 44 14 16 15 16 29 459

UGT73C6 nd nd 285 56 nd nd nd nd nd 563

* The TCP concentration was 100μM, whereas 10μM was used for the other substrates.

** nd= not detected.

UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73C13 UGT73C5 UGT73C6 NC

Table 6: summary of the spot intensity of the TLC analysis with different acceptor substrates

Figure 20: Crude extracts used for the activity assays in figure 19 and table 6. The recombinant UGTs

are indicated below each lane and their size is indicated by the black arrow. NC is the negative

control, a crude extract containing an empty vector. UGT73C5 is less intense because of a pipetting

mistake.

Page 45: Identification and biochemical characterization of the UDP ...

Results

37

Even though TCP could not be used to normalize the UGT concentration and compare crude

extracts among each other, the ratio in which a UGT can utilize the substrates could be used

to study efficiency of a UGT towards different substrates. For example, UGT73C10 and

UGT73C11 are more efficient in utilizing oleanolic acid and hederagenin than they are with

the flavonols and sterols. Moreover, the difference between the capacity of UGT73C12 and

UGT73C13 to utilize sapogenins or flavonols is smaller than that of UGT73C10 and

UGT73C11. This indicates that UGT73C10 and UGT73C11 are more specific to oleanolic acid

and hederagenin than UGT73C12 and UGT73C13. To predict differences in KM values of the

enzymes towards the different substrates, information is needed about the Vmax of the UGTs

for every substrate. The KM for UGT73C11 towards oleanolic acid and hederagenin is most

likely <10μM, as the spot intensity did not decrease when 10μM instead of 100μM of

substrate concentration was used.

oleanolic

acid

hederagenin quercetin kaempferol obtusifoliol sitosterol sigmasterol campesterol β-amyrin TCP*

UGT73C9 nd nd nd nd nd nd nd nd nd 1.00

UGT73C10 1.64 2.45 0.11 0.08 0.06 0.04 0.04 0.04 1.50 1.00

UGT73C11 1.08 1.90 0.05 0.07 0.03 0.02 nd nd 0.70 1.00

UGT73C12 0.30 0.29 0.00 0.09 0.04 0.04 0.05 0.04 0.09 1.00

UGT73C13 0.37 0.26 0.07 0.05 0.06 0.05 0.05 0.05 0.16 1.00

UGT73C5 0.17 0.04 nd 0.10 0.03 0.03 0.03 0.03 0.06 1.00

UGT73C6 nd nd 0.51 0.10 nd nd nd nd 0.00 1.00

* The TCP concentration was 100μM, whereas 10μM was used for the other substrates.

** nd= not detected.

3.6 Kinetics

A more convenient way to study the substrate specificity of an enzyme is to determine

kinetic parameters like the KM, Vmax, and kcat for each substrate. The Vmax is the maximum

velocity in which an enzyme can utilize its substrates. With low concentrations of substrate

the amount of formed product is linear related to the amount of substrate available for the

enzyme, as there are ‘free’ enzymes enough to catalyze the reaction. In higher substrate

concentrations the enzymes get saturated with substrates and the amount of product

formed is than dependent on the maximum speed of the enzyme (Vmax). The substrate

concentration on which the protein reaches half of the Vmax is the KM, and is a measure for

the efficiency of the enzyme towards a substrate. A high KM implies that the enzyme reaches

Table 7: spot intensity relative to TCP

Page 46: Identification and biochemical characterization of the UDP ...

Results

38

its maximum velocity at a high concentration and therefore a low efficiency towards the

substrate. To determine the KM and Vmax of the enzyme, information on the concentration of

the enzyme is not necessary as long it is constant between the assays. The kcat is the

turnover number, thus the number of times each enzyme site converts substrate to product

per unit time. Therefore, precise information about the enzyme concentration and substrate

concentration is inevitable.

To explore the optimal conditions for kinetics a test time series was made with UGT73C11

and hederagenin (figure 21). The time series indicated that the linear range of the graph lies,

under these conditions, within the first few minutes of the times series. Kinetics should be

performed with the linear range of the graph, thus the conditions need to be optimized

regarding enzyme and substrate concentration. A difficulty in finding these conditions is that

the enzyme concentration differs between the crude extracts; therefore the optimal

conditions also differ per crude extract.

Therefore the recombinant UGTs were purified by his-tag purification. However, in

B.v.variegata UGT73Cshi a loss of activity after purification was found (Augustin, personal

communication) Activity assays of UGT73C10 and UGT73C12 did not show an obvious loss of

activity (figure 21), but one should keep in mind that the enzyme concentration after

purification is likely to be higher than in the crude extract (recombinant UGTs from 750μl

crude extract were concentrated into 50μl purified solution). Therefore it was preferred to

use the untreated crude extract for kinetics and activity assays. Several methods have been

explored to determine the UGT concentration circumventing purification.

Figure 21: Time series of UGT73C11 with 1mM UDP-glucose 100μM hederagenin in Tris-HCL pH 7.19. The peak

area on the Y axis is derived from LCMS data.

0

100000

200000

300000

400000

500000

600000

0 5 10 15 20 25time (min)

pe

ak a

rea

UGT73C11

Page 47: Identification and biochemical characterization of the UDP ...

Results

39

Figure 22: TLC analysis of the activity of

UGT73C10 and UGT72C12 towards

oleanolic acid (ol) and hederagenin (he)

before (B) and after (A) His-tag purification.

For the first approach two 2ml samples derived

from one culture were used, using one batch as

crude extract while determining the total amount of

UGT in the other batch by his-tag purification and

protein measurements. These batches contained an

equal amount of the same bacterial culture and

therefore theoretically an equal total amount of the

UGT. Determining the total amount of protein in one

batch would enable us to calculate back the

concentration of this protein in the crude extract of

the other batch. However, the His-tag enzyme purification (purification of native enzyme

using immidazol or EDTA elution step as well as purification based on the denaturized

enzyme) always resulted in a loss of enzyme in both the resin binding and elution step and

was therefore unsuitable for this purpose.

For the second approach the intensity of purified UGT on an SDS-PAGE was analyzed, using

Image J. BSA was used to produce a standard curve to determine the concentration of the

purified UGT and the UGT sample in each lane was used to explore the sensitivity and

reliability of this method. The standard curve was used in a range of 200-800μg/ml BSA,

using 2,5μl samples, and resulted in an R2

of 0,994 (figure 24). The standard deviation of 7

repetitions of the purified UGT was ca 10% of the total value, and 4% after discarding the

most extreme values. It was however not feasible to compare the standard curve of purified

enzyme (BSA) with the UGT bands in the crude extract, due to the high background in the

SDS-PAGE analysis of the crude extracts (figure 23).

UGT73C10 UGT73C12

B A B A B A B A

ol ol he he

Page 48: Identification and biochemical characterization of the UDP ...

Results

40

However, the purified and known concentration of UGT (determined by SDS-PAGE analysis)

could be used to make a standard curve for Western Blotting. Anti-his antibodies were used

so that only the recombinant UGTs were detected, thus circumventing the previous

background problems of the in SDS-PAGE analysis.

Figure 23: Analysis of the SDS-PAGE with ImageJ. A) a standard curve with 100-800μg/ml BSA, and

accordingly in each lane a sample of 4x diluted purified UGT73C10 of unknown concentration. Selected

regions used for ImageJ were indicated with blue and yellow. B) Image J analysis of the intensity of the

BSA and UGT73C10 bands in lane 3 and 4. C) SDS-PAGE containing crude extracts in lane 2-7 and a 200,

400 and 600 μg/ml reference band of BSA in lane 8,9 and 10. D) Lane 1, 2 and 3 from the corresponding

Image J analysis.

B

D C

UGT

UGT73C9 UGT73C10 UGT73C11 UGT73C12 UGT73C13 Empty

vector

200μg/ml

BSA

400μg/ml

BSA

600μg/ml

BSA

UGT BSA

A 200μg/ml

BSA

400μg/ml

BSA

600μg/ml

BSA

100μg/ml

BSA

300μg/ml

BSA

500μg/ml

BSA

700μg/ml

BSA

800μg/ml

BSA

BSA

Page 49: Identification and biochemical characterization of the UDP ...

Results

41

An initial Western Blot was performed on a series of 250, 500, 750, 1000 and 1250μg/ml

purified UGT. After black and white inversion, the same Image J analysis technique was used

to determine the band intensity as used for the SDS-PAGE analysis. Western Blotting is a

sensitive method and the lane with 1000μg/ml was clearly overloaded. However, in the

enzyme range of 250-750ug/ml, the correlation between band intensity and enzyme

concentration was linear, with an R2=0.99, but more measuring points are necessary to

obtain a more reliable result. Moreover the enzyme concentration of the (4x diluted) crude

extract was below this range of 250-750μg/ml. Because of time limitations this strategy was

not further optimizations during this thesis.

Figure 23: Correlation between the intensity of the peak area in Image J analysis of the SDS page (see figure

23) and the concentration BSA.

y = 10.024x + 956.75

R2 = 0.994

0

1,000

2,000

3,000

4,000

5,000

6,000

7,000

8,000

9,000

10,000

0 100 200 300 400 500 600 700 800 900

µg/ml

pea

k ar

eaBSA

Linear (BSA)

Page 50: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

42

Page 51: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

43

4. Conclusions and discussion

This thesis is the first characterizion of a candidate UGT in the saponin biosynthesis pathway

of Barbarea vulgaris arcuata. The main focus was to study if the genes could be involved in

the biosynthesic pathway of saponins, but also to study differences between genes with

respect to their role in resistance of Babarea towards the flee beetle. Therefore four

research focus questions were proposed, which will be the structure of following discussion.

These research questions are: i) Are the five UGTs from B. vulgaris arcuata capable of

glycosylating the hederagenin and oleanolic acid aglycone on C3 position? ii) How specific

are these UGTs towards oleanolic acid and hederagenin as acceptor substrate? iii) Is there a

difference in activity between the UGTs from the P and the G type plant? iv) Is there a

difference in activity between the ortholog the paralogs?

i) Are the five UGTs from B. vulgaris arcuta capable of glucosylation of hederagenin and

oleanolic acid aglycone at the 3 position?

The TLC analysis of figure 13 reveals activity of UGT73C10, UGT73C11, UGT73C12 and

UGT73C13 towards oleanolic acid and hederagenin. More than one glycosylation product

was present in each lane and no hederagenin and oleanolic acid 3-O-glucosides were

available to find out the Rf value of these monoglucosides to compare which spot represents

the 3-O-glucoside. Stainings with 10% sulphuric acid, which gave different color reactions for

both aglycones, the hederagenin stock solution turned out to be not 100% pure and

contained minor amounts of oleanolic acid. Two other compounds with Rf 0.73 and Rf 0.28

also appeared in a control lane with 80% alcohol and are apparently glycosides of ethanol or

impurities thereof. However, the extra spots presumably derived from ethanol and

impurities of hederagenin with oleanolic acid (and perhaps vice versa) do not account for all

the spots shown in the lanes in figure 13.

In contrast, LCMS analysis of the activity assays revealed only one extra peak for each

monoglucoside. Lacking (synthetically produced) 3-O-monoglucosides as reference for LCMS,

we assumed that the extra peak in LCMS analysis corresponds with the most intense spots

on TLC analysis. The fragments found in activity assays represented the oleanolic acid and

Page 52: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

44

hederagenin monoglucosides and the corresponding aglycones. Thus, UGT73C10,

UGT73C11, UGTT73C12 and UGT73C13 can utilize oleanolic acid and hederagenin as a

substrate to produce oleanolic acid and hederagenin monoglucosides. TLC and LCMS data

however do not provide any inf

N ormation about the position or conformation of the sugar moiety.

The UGT73C subfamily belongs to superfamily 1, a class of inverting enzymes (figure 5). This

means that the sugar moiety of UDP-glucose, which is in α-conformation, will be inverted to

a β-conformation glucose moiety when attached to the aglycone. The NMR analysis of a

hederagenin-monoglucoside produced with UGT73C11 confirmed that the sugar moiety in

that monoglucoside was in β-conformation, and we have no reason to believe that the highly

similar UGT73C10, UGT73C12 and UGT73C13 do not share this feature.

The classification of the UGT73C does not make predictions about the position of

glycosylation or substrates specificity. However, B.v.variegata UGT73CShi was shown to

glucosylate oleanolic acid and hederagenin at the C3 position, and the high similarity (over

90% identity on amino acid level) between UGT73CShi and the B.v.arcuata UGTs suggested

that the latter may also glycosylate the C3 position of sapogenins. This was supported by the

high activity of UGT73C10 and UGT73C11 towards β-amyrin, a sapogenin with only one OH

group, located at the C3-position. UGTs can be rather promicous towards their substrate

specificity and are usually rather regiospecific than substrate specific (Vogt and Jones, 2000).

Thus, an enzyme which utilized β-amyrin is like to be able to utilize the 3-OH group on

oleanolic acid, hederagenin and other sapogenins. The NMR analysis revealed that the

hederagenin monoglucoside produced by UGT73C11 is indeed a 3-O-glucoside. Considering

the regiospecificity and the high similarity to UGT73CShi from B.v. variegata, I would expect

the oleanolic acid monoglucoside to be a 3-O-glucoside as well.

Moreover, TLC and LCMS analysis of the oleanolic acid and hederagenin monoglucosides

produced by UGT73C10, UGT73C11, UGT73C12 and UGT73C13 could not detect any

differences in the monoglucosides produced by UGT73C11 and the other three UGTS. This

indicates that also the oleanolic acid and hederagenin monoglucosides produced by these

UGTs are 3-O-monoglucoside.

Page 53: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

45

UGT73C9 does not show any activity towards oleanolic acid and hederagenin. The Western

blot in figure 12 revealed the presence of the enzyme in the crude extract and the activity

towards TCP in figure 14 shows that at least a percentage of the enzyme is present in an

active form. However, it is not easy to determine if the lack of oleanolic acid and

hederagenin monoglucosides is due to the low concentration of (active) UGT73C9 in the

crude extract, a low affinity for these substrates or both of the above mentioned reasons. On

the other hand, the affinity of UGT73C10, UGT73C11, UGT73C12 and UGT73C13 towards

oleanolic acid and hederagenin was either similar or higher than the affinity for TCP. Thus, if

the lack of detection of activity of UGT73C9 would be only due to its low concentration in

the crude extract, we should have detected some oleanolic acid and hederagenin

monoglucosides TLC analyses of UGT73C9 activity assays, as we can see the TCP

monoglucoside (figure 14).

The aberrant activity of UGT73C9 lies within the little amino acid substitutions between

UGT73C9 and e.g. UGT73C10 (95 % identity). That makes these enzymes very suitable for

further studies to unravel the mechanism of enzyme activity. Homology studies with e.g. the

crystallized UGT71G1, which is also known to glycosylate flavonols and sapogenins, revealed

that a few of the amino acid substitutions that were unique for UGT73C9 were located

within the active site and therefore may be involved in the enzyme-substrate binding.

However, further studies (e.g. site directed mutagenesis) are beyond the scope and time

limits of this master thesis.

ii How specific are these UGTs towards oleanolic acid and hederagenin as acceptor

substrate?

The most common way to study the acceptor specificity of an enzyme is by determining

kinetic parameters like KM, Vmax and kcat. Several conditions are required to perform

kinetics. For example, the enzyme concentration is required to determine some kinetic

parameters (e.g. the kcat) and to optimize the activity assay conditions prior to kinetics.

Furthermore the reaction should only one end product, and ideally no reverse reaction

should occur. The activity assays of the UGTs studied in this thesis did not seem to meet

these requirements, as the TLC analysis of activity assays of hederagenin and oleanolic acid

always showed more than one product, and more importantly because the methods to

Page 54: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

46

determine the UGT concentration in the crude extract were either not accurate enough or

too time consuming for this purpose. Therefore it was decided not to pursue kinetics but to

study the enzyme specificity by TLC analysis instead.

No difference was found within the activity of UGT73C11 towards oleanolic acid and

hederagenin in 10μM, 20μM, 50μM and 100μM concentrations. For the flavonol substrates,

however, the difference of end product resulting from activity assays is much lower in a

10μM substrate concentration assay than 100μM. This indicated that the KM of UGT73C

enzymes towards both flavonols is higher than 10μM, whereas the KM towards flavonols

could be below 10µM. However, the intensity of the spots in figure 14 with figure 19 and

table 6 may be well comparable as these activity assays as the crude extract and buffer pH

differs from the assays in figure 19.

The intensity of the main spots was shown in table 6. The spots on the TLC were manually

selected and the average intensity of each region was calculated by ImageQuant 5.0 and

shown in the table below. Thus, these numbers have no value itself but represent the

intensity of the spots on the TLC plates. The background value of each TLC plate was

between 5 and 7, and was comparable among TLC plates. The size of the selected region of

the homogeneous background had no influence on the average value of that region, but the

way of selecting the spot as region had a large influence on the average value. A ‘too large’

spot, including background, would decrease the average value and a ‘too small’ spot would

overestimate the average value of a spot. However, the ratios between the spots would be

maintained as long as the way of selecting the spot regions is the kept constant.

The relative activity to TCP is shown in table. To compare the ratio between enzymes, the

activity towards TCP should be equal for all enzymes under these conditions. No apparent

relation was found between the estimated UGT concentration in the crude extract on SDS-

PAGE and the amount of TCP end products detected in TLC analysis. This means that either

the affinity for TCP differs among enzymes under these substrate concentrations, or that the

percentage of active enzyme differs. It also means that in the ratio’s can only be compared

within enzymes and not among enzymes.

Page 55: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

47

UGT73C10 and UGT73C11 seem to have a higher efficiency towards the sapogenin aglycones

than to any of the other tested substrates on 10μM concentration. UGT73C10 and

UGT73C11 produced 1-2 times as much monoglucoside of oleanolic acid (10μM) than they

do using (100μM) TCP as substrate. Their activity on flavonols and sterols is low, generally 1-

10% of their activity towards TCP. Not that the concentration of the TCP in these assays is 10

times higher than the other aglycones and TCP serves purely as a positive control.

UGT73C12 and UGT73C13 seem to be both less efficient and less specific towards oleanolic

acid and hederagenin. The spot intensity of (table 6) is generally 5x lower than that of

UGT73C10 and UGT73C11. However spot intensity could be influences by other factors like a

difference in the active enzyme in the crude extract or even a difference in the optimal

conditions between these enzymes, as those have only been tested for UGT73C11. However,

the ratio saponin:flavonol and sterols is relatively low in UGT73C12 and UGT73C13

compared to UGT73C10 and UGT73C11. The activity of UGT73C12 and UGT73C13 on

oleanolic acid and hederagenin is 30% of the activity on TCP and the activity on the flavonols

is also 5-10%. The difference in efficiency on these compounds is relatively small (compared

to UGT73C10 and UGT73C11). Thus, despite possible differences in the UGT concentration in

the assays etc. we can still conclude that UGT7312 and UGT73C13 are less specific for

oleanolic acid and hederagenin than UGT73C10 and UGT73C11 are for these substrates.

The pattern of activity of UGT73C5 towards saponins, flavonols and sterols is similar to that

of UGT73C12 and UGT73C13, but the enzyme might be even less specific for these

compounds. This is not surprising, as literature already mentions that UGT73C5 utilizes wide

spectrum of triterpene structures, such as cytokinins and, brassinosteroids. It would be

interesting to see if UGT73C12 and UGT73C13 are also capable of utilizing those aglycones.

Interestingly, within the sapogenins UGT73C10 and UGT73C11 appear to glycosylate

oleanolic acid, hederagenin and β-amyrin evenly well. As β-amyrin is the precursor of both

oleanolic acid and hederagenin, it would be inconvenient to glycosylate β-amaryin instead of

the aglycones. Not much is known about the function of a β-amyrin 3-O-glycoside. However,

evidence is accumulating that metabolic pathways in the plant are highly channeled

Page 56: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

48

(reviewed in Jørgensen, 2005) and therefore UGT73C10 and UGT73C11 may not interact

physically with β-amyrin.

The high specificity of UGT73C10 and UGT73C11 for sapogenins indicates that they are

involved in the saponin pathway. However, in vitro assays do not necessarily describe the in

vivo function. The expression in planta, interaction with other genes, availability of the

substrate and many more factors all contribute to the function of a gene in the plant. Blast

of the sequences against pyrosequencing database showed that all four genes are

transcribed but not much have been tested for furthermore.

Stronger evidence could be by gained by knocking down the UGT of interest, using e.g. RNAi

to target all copies of the gene and it is possible to knock-down (preferably out) both

orthologs and paralogs at once. In combination with a line of over expression of these UGTS

would provide more insight in the function of the UGTs in the biosynthetic pathway of

saponins. This would, however, require that the plant can be transformed and that has not

been done yet for this plant species.

Kinetics

Quantification of the end products of activity assays would be much easier to interpret when

the amount of enzyme in the activity assay would be constant among the assays. In this

thesis, different approaches ways have been explored to measure the UGT concentration in

the crude extract. This was based on the assumption that UGTs partially lose activity during

the process of purification, which itself was based on observations of J.M. Augustin during

pilot experiments. However, no obvious loss of activity detected in figure 22 for both paralog

(UGT73C10) and ortholog (UGT73C12). Crucial is the comparison of both UGT concentrations

before and after purification in this case, which could have been estimated by SDS-PAGE

analysis.

Whether or not the assumption about loss of activity after purification is justified, in this

thesis I preferred to work with fresh crude extract. In particular because of the loss of

activity of UGT73C13 after storage at -20˚C, and purification every time to have freshly

purified enzymes (UGT73C12 and UGT73C13) is time consuming. The method which makes

Page 57: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

49

use of purified UGT to make standard curve using Western blot (see results) seemed

promising. However, Western Blotting is a very time consuming technique and even though

the B.vulgaris UGTs seem to be stable on ice (figure), a faster method is preferred to

determine the amount of UGT in a crude extract. A well-established method is using an S-tag

fusion of the UGTs for quantification. S-tag assays are a labor intensive but sensitive

measurement, but the assays only take on hour, making them suitable for both a repetition

of the specificity assays and to continue kinetics.

iii is there a difference in activity towards oleanolic acid and hederagenin between the UGTs

isolated from the P and the G type plant?

This question was posed with respect to the resistance of the B.v.arcuata towards the flee

beetles. The G type of B.v.arcuata is resistant towards the flee beetles, whereas the P type is

not. A difference in functionality between enzymes derived from the P type and G type could

contribute a difference in the presence of the saponins and therefore the resistance of

B.v.arcuata towards flees beetles. Especially the activity of hederagenin cellobioside,

because that is the most abundant and most biologically active compound (Nielsen et al.,

2010).

The data show only little, if any, difference between UGT73C10 and UGT73C11, the two

UGTs that are most likely involved in the saponin biosynthesis pathway. The intensity of the

spots is comparable for the activity on oleanolic acid (respectively 480 and 505 for UGT7310

and UGT73C11) but UGT73C11, derived from the G plant, seems to show a slightly higher

efficiency on hederagenin than UGT73C10 (intensity of respectively 893 and 713 for

UGT73C11 and UGT73C10). However, this difference would definitely not explain the

difference in the presence of hederagenin monoglucoside in G and the P plant, as UGT73C10

is still very efficient with hederagenin.

Using protein extracts of G and P plants instead of crude extracts for activity assays, both G

and P protein extracts were able to glycosylate hederagenin and oleanolic acid (Augustin,

unpublished). The cellobioside, which is shown to be bio-active against flee beetles, was

found in both the G and P type plants. Mapping of the UGT73C members of B.v.arcuata and

Page 58: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

50

variegata revealed that the UGTs are not located within the QTL for resistance (Kuzina et al.,

2009). These results indicate that the difference in production of saponins in G and P is due

to a difference earlier in the saponin biosynthesis pathway. The higher functionality of

UGT73C11 (if significant at all) than indicates that UGT73C11 is under a selection pressure

towards efficiency of utilizing hederagenin to produce cellobiosides, whereas UGT73C10 is

not.

iv) Is there a difference in activity between the orthologs (of UGT73CShi in B.v.variegata) and

their paralogs?

The genes of Barbarea vulgaris var. variegata and both the G and P type of Barbarea vulgaris

ssp arcuata are homologs, meaning that they have a common ancestor. Within homology

there are separate terms for genes were sequence divergence follows speciation (orthologs)

or gene duplication (paralogs) after speciation (Fitch, 2000). The most recent common

ancestor of the taxa is called the cenancestor. Assuming that the separation between species

and subspecies or variants is an earlier event than separation of one subspecies into types,

the UGTs derived from the G and the P type plant would have a more recent cenancestor

than the UGT from the two subspecies of Barbarea vulgaris.

However, the high sequence identity suggest that UGT73Cshi may be extremely close related

to UGT73C11 or that it even may be considered as the same gene. The difference between

the varient variegata seems to be smaller than the difference between the G and P type,

which is unlikely to be explained either by very high selection pressure of the beetles. First of

all the beetles selection pressure has not been validated and second a high number of silent

mutations would have been present in both sequences. It is more likely that the current

classificatio of the P and G type and variant variegata do not reflect their genetic distances.

The origin of B.v. variegata is referred to as a variant, and is probably a cultivar of B.vulgaris.

The high % identity suggests it may well be a cultivar of the G type of B.v.arcuata. In

contrast, recent microsatellite analysis of G and P plants and different subspecies of

Barbarea vulgaris suggests that the P and G type plants should be classified as different

subspecies or even species, rather than two types within one subspecies (Hauser et al,

unpublished).

Page 59: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

51

The high similarity of UGT73Cshi, UGT73C11 and UGT73C10 suggest that these genes are the

orthologs in B.v. variegata, the G and P plant respectively. They most likely are derived from

one gene in the cenancestor of B.v.arcuata and B.v.variegata. Within the subspecies

arcuata, UGT73C9 and UGT73C10 in the P type are paralogs of each other, in which

UGT73C10 is the functional ortholog of UGT73C11 from the G type. UGT73C12 and

UGT73C13 are paralogs to UGT73C10, but to be able to say wheather they are orthologs or

paralogs towards UGT73Cshi, in the way we used the terms in this thesis, information about

the event of gene duplication in variegata is required. The UGT73C12 and UGT73C13 have

not been found in variegata because we haven’t looked there yet. However, if variegata is

indeed a cultivar of the G type, it will most likely also contain an ortholog of UGT73C12 and

UGT73C13. Therefore, and also to facilitate readability, in this thesis UGT73C12 and

UGT73C13 are referred to as paralogs to UGT73Cshi, even though strictly spoken we need to

search for UGT73C12 orthologs in variegata to confirm these relations.

The gene duplication that led to the presence of both UGT73C9 and UGT73C10 may have

occurred after separation into the P and G type plant, as no UGT has been isolated in the G-

plant that is closest to UGT73C9. However, inconsistencies with the sequence of the PCR

products of the UGT73C11 sequence in the G plant indicate that there are more copies of

UGT73C11 present in the G-type plant (Augustin, personal communication). Because all

inconsistencies were silent mutations, this extra copy (or copies) was never isolated.

So far we have seen that the plants contain orthologs and paralogs of the UGT from

B.v.variegata, and the research question was if, and how, these paralogs and orthologs differ

in activity and biochemical properties. Most of the experiments regarding optimal conditions

for activity assays were conducted with UGT73C11 only, under the assumption that all

enzymes would respond the same on the Tris buffer concentration, pH, ect. However, we did

find a difference in activity after freeze-thawing between the ortholog UGT73C11 and

paralog UGT73C13. UGT73C13 lost partially activity after storage at -20˚C, while this effect

was not detectable for UGT73C11. Moreover a difference in activity was found between the

orthologs and the paralogs. The genes that are the orghologs to B.v.variegata UGT73CSShi

had a higher specific towards sapogenin substrates than the paralogs.

Page 60: Identification and biochemical characterization of the UDP ...

Conclusions and discussion

52

Because of the terms orthologs, paralogs, and the high efficiency of the orthologs for

oleanolic acid and hederagenin it is tempting to say that the orthologs are the ‘original’

genes from which by an event of duplication the paralogs are derived, which than by lack of

selection pressure lost their function and became less specific for the saponin biosynthesis

pathway. However, the term orthologs and paralogs depend on the point of reference,

which in this case is the UGT from B.v.variegata. This point of reference was chosen from a

biochemical point of view, to elucidate the saponin biosynthesis pathway, but has no

meaning from an evolutionary point of view. Without a reference point the P and G plant

contain (at least) two copies of a gene and the question arises which is the copy of which.

Secondary metabolites are often derived from primary metabolites; copies of genes in the

primary metabolite pathway could get mutations and might change function, whereby the

newly produced metabolite ‘accidentally’ provides a beneficial effect on the interaction with

the plant and its environment. UGT73C12 and UGT73C13, orthologs to one another, show in

their glucosylation pattern a great similarity with the A.thaliana UGT73C5, which is a UGT

involved in the regulation of Brassinolide, a plant hormone involved in growth and

development (Poppenberger et al., 2005). Thus, perhaps UGT73C12 and UGT73C12 are

orthologs to A.thaliana UGT73C5 and shared a common ancestor further back in the

Brassicacea that possibly had a function in hormone regulation. Than UGT73C10 and

UGT73C11 are the result of a duplication of UGT73C12 and gained a new function in the

metabolic pathway of saponins, rather than that UGT73C10 lost their function. This theory

is, however, purely speculative. However, the thought that the characterization of these

genes could bring us closer to the origin of resistance of Barbarea vulgaris against flee

beetles is fascinating.

Page 61: Identification and biochemical characterization of the UDP ...

Perspectives

53

5. Perspectives

This thesis describes the characterization of what could be the first characterized gene

involved in the biosynthetic genes in the pathway of Barbarea vulgaris ssp arcuata.

However, the experiments are purely in vitro experiments and therefore provide only

information about the capacity of the UGTs under lab conditions.

Within the in vitro study, there are some gaps in the data. The data regarding optimization

of the activity assays should be interpreted with care, as both repetitions is lacking and the

range of the conditions is limited. For example, the pH should be tested over a larger range

and be repeated just to make sure that pH 7.5 was not just an accidental high value.

Moreover, the activity might changes with the buffer composition and optimum pH can

change among buffers (Hou et al., 2004). Moreover, the variation among assays is not

tested here. The intensity of the spots is influence by the duration time of an assay, the

percentage active enzyme present but also by the precision of the extraction. To limit the

latter variation, the specificity assays were always extracted twice.

The results obtained from TLC assays do suggest differences in specificity among the

enzymes. However, kinetic parameters are still necessary because they are more exact,

provide more detailed information and allow comparison with other publications. To

perform these kinetic assays, an S-tag fusion of the proteins would be a suitable approach,

because the methods explored in this thesis are very time consuming and might have a high

inaccuracy.

Interesting point would be to test the UGTs for more substrates. Other saponins,

soybeansaponin and medicagenin would give insight in the regiospecificity and substrate

specificity, and brassinosteroids substrates would be interesting to test possible similarities

in function towards UGT73C5. Since the NMR we now have purified hederagenin

monoglucoside available. It would be interesting to test the four recombinant UGTs on their

activity on this monoglucoside, as in fig 12 spots appear on Rf ≈0.3 which could be

hederagenin and oleanolic diglicosides but which were not reproducible within this thesis.

Using the purified monoglucoside the C3 of hederagenin is no longer avaiable and the

Page 62: Identification and biochemical characterization of the UDP ...

Perspectives

54

activity of the UGT for other positions can be tested. Moreover, the monoglucoside can be

used to test candidate UGTs for the second step of glycosylation yielding the cellobiosides.

Identification of these UGTs would be very interesting, as it would be the first enzyme

known to transfer a sugar onto a monoglycoside.

The large-scale activity assay unfortunately yielded in a low amount of purified

monoglucosides. However, the monoglucoside would be very interesting to test for its

biological activity towards flee beetles. Earlier bioassays with flee beetles revealed

bioactivity for the cellobioside but no bioactivity for the monoglucoside. As the resistance of

flee beetles might be due to a β-glucosidase, it would be really interesting to see if the

monoglucoside has bioactivity towards the common type of susceptible flee beetles.

To get more insight in the phylogenetic relations between the subspecies and types within

Barbarea vulgaris, one should screen B.v.variegata for orthologs of e.g. UGT73C12 or

UGT73C13 and UGT73C9. Those data would provide more insight in the events of gene

duplication among these subspecies and types and contribute to the (currently ongoing)

revision of the status of the P and G type and B.v.variegata.

The in vitro studies performed in this thesis do not prove the involvement of the UGTs in the

biosynthetic pathway of saponins. The role of the UGTs in the saponin pathway can be

studied by studying plants over expression or lacking expression of the UGT. RNAi could be a

suitable method to knock-down the UGTs as it can target all five UGTs in the plant. To

produce a real knock out, e.g. T-DNA, may be not feasible to produce as the amount of

copies of these UGTs are unknown, but are at least two or three copies per plant. Moreover,

the copies map together and may be very difficult to cross into each other to generate a

knock-outl ine. However, this RNAI technique requires transformation of the plant and that

has not been done yet for B. vulgaris.

The biological function of a UGT in the plant is influenced by many factors, e.g. the

expression level of the gene, regulation, localization and substrate availability. The

expression levels between the four five genes can be studied with RT-PCR. Localization

studies (roughly using RT-PCR in different plant organs or more detailed e.g. using GUS-

Page 63: Identification and biochemical characterization of the UDP ...

Perspectives

55

staining) could provide more in insight in the localization and therefore in the function of

these proteins. In addition, it would be interesting to study which factors influence gene

expression in Barbarea. Often UGT expression in the plant can be induces by Methyl

Jasmonate, a plant hormone that is released with plant stress, caused by cell damage

(Osbourn, 2003). As the cellobiosides have a function of insect resistance, it would be

interesting to see if (imitation of) insect damage increases genes involved in the saponin

pathway.

Moreover, the resistance of Babarea vulgaris towards the flee beetles declines in the

autumn (Agerbirk et al., 2001) which seems to be due to a decline of saponin content in the

same period (Nielsen, personal communication). In addition, resistance of the plants has

been shown to be inducible with light and temperature conditions. It would be very

interesting to study how the pathway is regulated and which genes are inducible by which

factors. However, the saponin pathway may be regulated at a different place in the pathway,

for instance at the OSC or p450 level. The other way around, studying gene expression

before and after induction of saponins using microarray can be used to search for candidate

genes.

By date the P and G type plant have been sequenced using Pyrosequencing and candidate

OSCs, p450s, and UGTs involved in the second glycosylation step, are being selected.

Candidate OSC genes will be screened by heterologous expression in a lanosterol sterol

deficient yeast strain, as used in (Corey, 1993)

The resistance of B.vulgaris towards the flee beetles is conferred by the two saponin

cellobiosides in which hederagenin cellobioside us the most bioactive and most abundand

compound (Nielsen et al., 2010). Although the precise mechanism is not completely

elucidated yet, studies agree on complex formation between saponins and the sterols in the

membrane of the pest or pathogen. This mode of resistance is a general mechanism it can

confer resistance against many other insect species. Therefore genes in this pathway could

be used for crop engineering, once the pathway is elucidated. Barbarea vulgaris is closely

related to Brassica napus (rape seed) and moreover the Brassica family contains many

eatable crop species.

Page 64: Identification and biochemical characterization of the UDP ...

Perspectives

56

Crystal structures of UGTs became available to gain more insight into the precise mechanism

of glucosylation and with directed mutagenesis the role of individual amino acids in donor

and acceptor binding have been elucidated. The knowledge about donor and acceptor

substrate binding in UGTs is expanding rapidly and will be used soon to modify UGTs to

exactly the function that you want them to have. UGT73C9 and UGT73C10 would be an

interesting pair of UGTs to include in this research field. The amino acid sequence is highly

similar (95% identity), yet UGT73C9 has an altered (solubility and) activity on hederagenin

and oleanolic acid. Homology studies of revealed that UGT of amino acids located in the

activity site, which could have an effect on the activity of UGT73C9. For the altered solubility,

other amino acids might be involved. Because of their many medicinal or soap like,

antiflammatory and other abilites, saponins are of great interest for the pharmaceutical

purposed, thus so are the GTs producing them. Using glycosyltransferases to produce

monoglucosides is relatively easy compared to chemical synthesis. A better understanding of

the mechanism behind acceptor-binding lead to efficient enhancement of UGT and the

ability to ‘design’ UGTs for any substrate could be in the near future.

Page 65: Identification and biochemical characterization of the UDP ...

References

57

6. References

Achnine, L., Huhman, D. V., Farag, M. A., Sumner, L. W., Blount, J. W. & Dixon, R. A. 2005.

Genomics-based selection and functional characterization of triterpene

glycosyltransferases from the model legume Medicago truncatula. Plant Journal, 41,

875-887.

Aerts, R. J. M., A.J. 1997. Feeding Deterrence and Toxicity of Neem Triterpenoids. Journal of

Chemical Ecology, 23, 2117-2132.

Agerbirk, N., Olsen, C. E., Bibby, B. M., Frandsen, H. O., Brown, L. D., Nielsen, J. K. & Renwick,

J. a. A. 2003a. A saponin correlated with variable resistance of Barbarea vulgaris to

the diamondback moth Plutella xylostella. Journal of Chemical Ecology, 29, 1417-

1433.

Agerbirk, N., Olsen, C. E. & Nielsen, J. K. 2001. Seasonal variation in leaf glucosinolates and

insect resistance in two types of Barbarea vulgaris ssp arcuata. Phytochemistry, 58,

91-100.

Agerbirk, N., Orgaard, M. & Nielsen, J. K. 2003b. Glucosinolates, flea beetle resistance, and

leaf pubescence as taxonomic characters in the genus Barbarea (Brassicaceae) (vol

63, pg 69, 2003). Phytochemistry, 64, 1177-1177.

Albers, J., Hartmann, P., Baumbach, A. & Germany. 1996. Zivilprozessordnung : mit

Gerichtsverfassungsgesetz und anderen Nebengesetzen, München, C.H. Beck.

Armah, C. N., Mackie, A. R., Roy, C., Price, K., Osbourn, A. E., Bowyer, P. & Ladha, S. 1999.

The membrane-permeabilizing effect of avenacin A-1 involves the reorganization of

bilayer cholesterol. Biophysical Journal, 76, 281-290.

Bak, S., Paquette, S., Morant, M., 2006. Cyanogenic glycosides: a case study for evolution

and application of cytochromes p450. Phytochemistry Reviews, 20.

Campbell, J. A., Davies, G. J., Bulone, V. & Henrissat, B. 1997. A classification of nucleotide-

diphospho-sugar glycosyltransferases based on amino acid sequence similarities.

Biochemical Journal, 326, 929-939.

Corey, E. J., Matsuda, S. P. T. & Bartel, B. 1993. ISOLATION OF AN ARABIDOPSIS-THALIANA

GENE ENCODING CYCLOARTENOL SYNTHASE BY FUNCTIONAL EXPRESSION IN A

YEAST MUTANT LACKING LANOSTEROL SYNTHASE BY THE USE OF A

CHROMATOGRAPHIC SCREEN. Proceedings of the National Academy of Sciences of

the United States of America, 90, 11628-11632.

Coutinho, P. M., Deleury, E., Davies, G. J. & Henrissat, B. 2003. An Evolving Hierarchical

Family Classification for Glycosyltransferases. Journal of Molecular Biology, 328, 307-

317.

Defago, G. & Kern, H. 1983. INDUCTION OF FUSARIUM-SOLANI MUTANTS INSENSITIVE TO

TOMATINE, THEIR PATHOGENICITY AND AGGRESSIVENESS TO TOMATO FRUITS AND

PEA-PLANTS. Physiological Plant Pathology, 22, 29-37.

Fitch, W. 2000. Homology: a personal view on some of the problems. Trends in Genetics, 16,

227-231.

Hansen, E. H., Osmani, S. A., Kristensen, C., Moller, B. L. & Hansen, J. 2009. Substrate

specificities of family 1 UGTs gained by domain swapping. Phytochemistry, 70, 473-

482.

Haralampidis, K., Trojanowska M., Osbourn Ae., 2002. biosynthesis of triterpenoid saponins

in plants. Advances in Biochemical Engineering/Biotechnology, 75, 31-49.

Page 66: Identification and biochemical characterization of the UDP ...

References

58

He, X. Z., Li, W. S., Blount, J. W. & Dixon, R. A. 2008. Regioselective synthesis of plant

(iso)flavone glycosides in Escherichia coli. Applied Microbiology and Biotechnology,

80, 253-260.

Hou, B. K., Lim, E. K., Higgins, G. S. & Bowles, D. J. 2004. N-glucosylation of cytokinins by

glycosyltransferases of Arabidopsis thaliana. Journal of Biological Chemistry, 279,

47822-47832.

Hughes, J. & Hughes, M. A. 1994. MULTIPLE SECONDARY PLANT PRODUCT UDP-GLUCOSE

GLUCOSYLTRANSFERASE GENES EXPRESSED IN CASSAVA (MANIHOT-ESCULENTA

CRANTZ) COTYLEDONS. DNA Sequence, 5, 41-49.

Ikink, G. 2008. Identification of genes involved in resistance of flea beetles (phyllotreta

nemorum) agains plant host defense. Msc, Wageningen University and Research

centre.

Jones, P., Messner, B., Nakajima, J. I., Schaffner, A. R. & Saito, K. 2003. UGT73C6 and

UGT78D1, glycosyltransferases involved in flavonol glycoside biosynthesis in

Arabidopsis thaliana. Journal of Biological Chemistry, 278, 43910-43918.

Jones, P. & Vogt, T. 2001. Glycosyltransferases in secondary plant metabolism: tranquilizers

and stimulant controllers. Planta, 213, 164-174.

Jorgensen, K., Rasmussen, A. V., Morant, M., Nielsen, A. H., Bjarnholt, N., Zagrobelny, M.,

Bak, S. & Moller, B. L. 2005. Metabolon formation and metabolic channeling in the

biosynthesis of plant natural products. Current Opinion in Plant Biology, 8, 280-291.

Kurosawa, Y. S. H. T. M. 2002. UDP-glucuronic acid:soyasapogenol glucuronosyltransferase

involved in saponin biosynthesis in germinating soybean seeds. Planta, 215, 620-629.

Kuzina, V., Ekstrom, C. T., Andersen, S. B., Nielsen, J. K., Olsen, C. E. & Bak, S. 2009.

Identification of Defense Compounds in Barbarea vulgaris against the Herbivore

Phyllotreta nemorum by an Ecometabolomic Approach. Plant Physiology, 151, 1977-

1990.

Mackenzie, P. I., Owens, I. S., Burchell, B., Bock, K. W., Bairoch, A., Belanger, A.,

Fournelgigleux, S., Green, M., Hum, D. W., Iyanagi, T., Lancet, D., Louisot, P.,

Magdalou, J., Chowdhury, J. R., Ritter, J. K., Schachter, H., Tephly, T. R., Tipton, K. F. &

Nebert, D. W. 1997. The UDP glycosyltransferase gene superfamily: Recommended

nomenclature update based on evolutionary divergence. Pharmacogenetics, 7, 255-

269.

Morrissey, J. P. & Osbourn, A. E. 1999. Fungal resistance to plant antibiotics as a mechanism

of pathogenesis. Microbiology and Molecular Biology Reviews, 63, 708-+.

Nielsen, J. K. 1997. Variation in defences of the plant Barbarea vulgaris and in

counteradaptations by the flea beetle Phyllotreta nemorum. Entomologia

Experimentalis Et Applicata, 82, 25-35.

Nielsen, J. K., Nagao, T., Okabe, H. & Shinoda, T. 2010. Resistance in the Plant, Barbarea

vulgaris, and Counter-Adaptations in Flea Beetles Mediated by Saponins. Journal of

Chemical Ecology, 36, 277-285.

Noguchi, A., Horikawa, M., Fukui, Y., Fukuchi-Mizutani, M., Iuchi-Okada, A., Ishiguro, M.,

Kiso, Y., Nakayama, T. & Ono, E. 2009. Local Differentiation of Sugar Donor Specificity

of Flavonoid Glycosyltransferase in Lamiales. Plant Cell, 21, 1556-1572.

Offen, W., Martinez-Fleites, C., Yang, M., Kiat-Lim, E., Davis, B. G., Tarling, C. A., Ford, C. M.,

Bowles, D. J. & Davies, G. J. 2006. Structure of a flavonoid glucosyltransferase reveals

the basis for plant natural product modification. Embo Journal, 25, 1396-1405.

Osbourn, A. 1996. Saponins and plant defence - A soap story. Trends in Plant Science, 1, 4-9.

Page 67: Identification and biochemical characterization of the UDP ...

References

59

Osbourn, A. E. 2003. Saponins in cereals. Phytochemistry, 62, 1-4.

Phillips, D. R., Rasbery, J. M., Bartel, B. & Matsuda, S. P. T. 2006. Biosynthetic diversity in

plant triterpene cyclization. Current Opinion in Plant Biology, 9, 305-314.

Poppenberger, B., Berthiller, F., Bachmann, H., Lucyshyn, D., Peterbauer, C., Mitterbauer, R.,

Schuhmacher, R., Krska, R., Glossl, J. & Adam, G. 2006. Heterologous expression of

Arabidopsis UDP-glucosyltransferases in Saccharomyces cerevisiae for production of

zearalenone-4-O-glucoside. Applied and Environmental Microbiology, 72, 4404-4410.

Poppenberger, B., Fujioka, S., Soeno, K., George, G. L., Vaistij, F. E., Hiranuma, S., Seto, H.,

Takatsuto, S., Adam, G., Yoshida, S. & Bowles, D. 2005. The UGT73C5 of Arabidopsis

thaliana glucosylates brassinosteroids. Proceedings of the National Academy of

Sciences of the United States of America, 102, 15253-15258.

Prakash, A., Sengupta, S., Aparna, K. & Kasbekar, D. P. 1999. The erg-3 (sterol Delta(14,15)-

reductase) gene of Neurospora crassa: generation of null mutants by repeat-induced

point mutation and complementation by proteins chimeric for human lamin B

receptor sequences. Microbiology-Sgm, 145, 1443-1451.

Renwick, J. a. A. 2002. The chemical world of crucivores: lures, treats and traps. Entomologia

Experimentalis Et Applicata, 104, 35-42.

Shibuya, M., Nishimura, K., Yasuyama, N., Ebikuka, Y. 2010. Identification and

characterization of glycosyltransferases involved in the biosynthesis of soyasaponin I

in Glycine max. Febs Letters, 584, 7.

Shinoda, T., Nagao, T., Nakayama, M., Serizawa, H., Koshioka, M., Okabe, H. & Kawai, A.

2002. Identification of a triterpenoid saponin from a crucifer, Barbarea vulgaris, as a

feeding deterrent to the diamondback moth, Plutella xylostella. Journal of Chemical

Ecology, 28, 587-599.

Vincken, J. P., Heng, L., De Groot, A. & Gruppen, H. 2007. Saponins, classification and

occurrence in the plant kingdom. Phytochemistry, 68, 275-297.

Vogt, T. & Jones, P. 2000. Glycosyltransferases in plant natural product synthesis:

characterization of a supergene family. Trends in Plant Science, 5, 380-386.

Wang, X. Q. 2009. Structure, mechanism and engineering of plant natural product

glycosyltransferases. Febs Letters, 583, 3303-3309.

Yendo, A. C. a. C. F. D. G., G. ; Fett-Neto, A.G. 2010. Production of Plant Bioactive

Triterpenoid Saponins: Elicitation Strategies and Target Genes to Improve Yields.

Molecular Biotechnology.

Page 68: Identification and biochemical characterization of the UDP ...

Appendix

60

Appendix I

Alignment of the UGT73C subfamily

UGT71G1 ------GCTG CAGGAATTCG GCACGAGGGA AAGAACAAAG AAA------- ---------- --ATGTCTAT GAGTGATA UGT73C1 ---------C AAAAAAAATA AACCTTAGAA GAGAGTTCCT ---------- ---------- -CATGGCATC GGA----- UGT73C2 CACACACACA CACACACACA CAATCGGAAT CAAAGCAAAA AAGAACTTAG AGATAGGTCC TCATGGCTTT CGAGAAGA UGT73C3 ---------- -AAAAACACA TAACTTGAGT ATTCCTCATT GAGTCGTTGC AAACGG---- -TATGGCTAC GGAAAAAA UGT73C4 -----AATCC AAAAACCACA AAACTTGAGA GTTCATCATT GAG---TTGG AACCGG---- -TATGGCTTC CGAAAAAT UGT73C5 ---------- -----AAGCA AAACTTGAGA GTTTCTTACT AAG------- -GTTGAATC- --ATGGTTTC CGAAACAA UGT73C7 ---------- ---------- --CCTTGTAA GATCTTCTAC A--------- ---------- --ATGTGTTC TCA----- UGT73C6 ---------- ----GAAACA AAACTTGAGA GGTTCTTACT AAA------- -GTTGCATCG TCATGGCTTT CGAAAAAA UGT73C10 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC CGAAATCA UGT73C12 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC CGAAATCA UGT73C11 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC CGAAATCA UGT73C13 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC AGAAATTA UGT73C9 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC CGAAATCA UGT73C8 ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTATC CCAAG--- UGT73Cshi ---------- ---------- ---------- ---------- ---------- ---------- --ATGGTTTC CGAAATCA UGT71G1 TAAACAAG-- -AATTCAGAA CTCATCTTCA TTCCTGCACC ----AGGAAT T--GGCCACT TAGCTTCAGC TCTTGAAT UGT73C1 -------ATT TCGTCCTCCT CTTCATTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATCCCAAT GGTAGATA UGT73C2 CCCGCCAATT TCTTCCTCCG CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATCCCCAT GGTGGATA UGT73C3 CCCACCAATT TCATCCTTCT CTTCACTTTG TCCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GATTGATA UGT73C4 CCCACAAAGT TCATCCTCCT CTTCACTTTA TTCTTTTCCC TTTCATGGCT CAGGGCCACA TGATTCCCAT GATTGATA UGT73C5 CCAA---ATC TTCTCCA--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C7 ---------- TGATCCT--- CTTCACTTCG TCGTAATACC CTTTATGGCC CAAGGCCATA TGATCCCATT GGTCGACA UGT73C6 ACAACGAACC TTTTCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C10 CCCATAAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C12 CCCATAAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C11 CCCATAAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C13 CCCATAAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C9 CCCATCAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTACATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT73C8 ---------- --ATCCAAAG GTACATTTTG TGTTATTTCC AATGATGGCA CAAGGCCACA TGATCCCCAT GATGGACA UGT73Cshi CCCATAAATC TTATCCT--- CTTCACTTTG TTCTCTTCCC TTTCATGGCT CAAGGCCACA TGATTCCCAT GGTTGATA UGT71G1 TTGCAAAACT TTTAAC-CAA C-------CA TGAC---AAA AATCTTTACA TCACAGTCTT CTGCATCAAG TTTCCAGG UGT73C1 TTGCAAGGCT CCTGGCTCAG CGC---GGGG TGACTATAAC CATTGTCACT ACACCTCAAA ACGCAGGCCG GTTC-AAG UGT73C2 TTGCAAGGAT CTTGGCTCAG CGC---GGGG TGACTATTAC CATTGTCACG ACGCCTCACA ACGCAGCCAG GTTC-AAA UGT73C3 TTGCAAGACT CTTGGCTCAG CGT---GGTG TGACCATAAC AATTGTCACG ACACCTCACA ACGCAGCAAG GTTT-AAG UGT73C4 TAGCAAGGCT CTTGGCTCAG CGC---GGTG CGACAGTAAC TATTGTCACG ACACGTTATA ATGCAGGGAG GTTC-GAG UGT73C5 TTGCAAGGCT CTTGGCTCAG CGT---GGTG TGATCATAAC AATTGTCACG ACGCCTCACA ATGCAGCGAG GTTC-AAG UGT73C7 TCTCTAGGCT CTTGTCCCAG CGCCAAGGCG TGACTGTCTG CATCATCACA ACTACTCAAA ATGTAGCCAA GATC-AAG UGT73C6 TTGCAAGGCT CTTGGCTCAG CGA---GGTG TGCTTATAAC AATTGTCACG ACGCCTCACA ATGCAGCAAG GTTC-AAG UGT73C10 TTGCAAGGCT CTTGGCTCAG CGC---GGTG TGAAAATAAC AATTGTCACA ACGCCGCACA ATGCAGCGAG GTTC-GAG UGT73C12 TTGCAAGGCT CTTGGCCCAG CGC---GGTG TGAAAATAAC AATTGTCACA ACCCCGCACA ATGCAGCGAG GTTC-AAG UGT73C11 TTGCAAGGCT CTTGGCTCAG CGC---GGTG TGAAAATAAC AATTGTCACA ACGCCGCACA ATGCAGCGAG GTTC-GAG UGT73C13 TTGCAAGGCT CTTGGCCCAG CGC---GGTG TGAAGATAAC AATTGTCACA ACGCCGCACA ATGCAGCGAG GTTC-GAG UGT73C9 TTGCAAGGCT CTTGGCTCAG CGC---GGTG TGAAAATAAC AATTGTCACA ACGCCGCAAA ATGCAGCGAG GTTC-GAG UGT73C8 TTGCAAAAAT ATTGGCACAA CATCAAAATG TTATTGTTAC AATAGTAACC ACCCCAAAAA ATGCATCTCG TTTC-ACA UGT73Cshi TTGCAAGGCT CTTGGCTCAG CGC---GGTG TGAAAATAAC AATTGTCACA ACGCCGCACA ATGCAGCGAG GTTC-GAG UGT71G1 CATGCCCTTT GCAGATTCAT A-TATCAAAT CAGTTTTAGC CTCACAACCA CAAATTCAAC TGA--TTGAT CTTCCTGA UGT73C1 AACGTTCT-- ---TAGCCGG GCTATCCAAT C-----CGG- CTTGC--CCA TCAATCTCGT GCAAGTAAAG TTTCCATC UGT73C2 GATGTCCT-- ---AAACCGG GCCATCCAGT C-----AGG- CTTGC--ACA TTAGGGTTGA GCATGTGAAG TTTCCTTT UGT73C3 AATGTCCT-- ---AAACCGA GCGATCGAGT C-----TGG- CTTGG--CCA TCAACATACT GCATGTGAAG TTTCCATA UGT73C4 AATGTCTT-- ---AAGTCGT GCCATGGAGT C-----TGG- TTTAC--CCA TCAACATAGT GCATGTGAAT TTTCCATA UGT73C5 AATGTCCT-- ---AAACCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAACTTAGT GCAAGTCAAG TTTCCATA UGT73C7 A----CTT-- ---CACTCTC ATTTTCCTCT T-----TG-- TTTGCGACTA TCAACATCGT TGAAGTTAAG TTTCTGTC UGT73C6 AATGTCCT-- ---AAACCGT GCCATTGAGT C-----TGG- TTTGC--CCA TCAACCTAGT GCAAGTCAAG TTTCCATA UGT73C10 AATGTCCT-- ---AAGCCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT73C12 AATGTCCT-- ---AAGTCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT73C11 AATGTCCT-- ---AAGCCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT73C13 AATGTCCT-- ---AAACCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT73C9 AATGTCCT-- ---AAGCCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT73C8 TCAATTGT-- ---AGCACGT TGTGTTGAAT A-----TGGT CTTGA---CA TTCAATTAGT CCAACTTGAA TTTCCATG UGT73Cshi AATGTCCT-- ---AAGCCGT GCCATTGAGT C-----TGG- CTTGC--CCA TCAGCATAGT GCAAGTCAAG CTTCCATC UGT71G1 AGTAGAACCA CCTCCACAAG AGCTACTAAA ATCTCCAGAA TTTT-ACATC TTGACTTTTT TGGAGAGTCT CATACCTC UGT73C1 TCAAGAATCG GGTTCACCGG A-------AG GACAGGAGAA TTTGGACTTG CTCGATTCAT TGGGGGCTTC ATTAACCT UGT73C2 TCAAGAAGCT GGTTTGCAAG A-------AG GACAAGAGAA TGTTGATTTT CTTGACTCAA TGGAGTTAAT GGTACATT UGT73C3 TCAAGAGTTT GGTTTGCCAG A-------AG GAAAAGAGAA TATAGATTCG TTAGACTCAA CGGAGTTGAT GGTACCTT UGT73C4 TCAAGAATTT GGTTTGCCAG A-------AG GAAAAGAGAA TATAGATTCG TATGACTCAA TGGAGCTGAT GGTACCTT UGT73C5 TCTAGAAGCT GGTTTGCAAG A-------AG GACAAGAGAA TATCGATTCT CTTGACACAA TGGAGCGGAT GATACCTT UGT73C7 TCAACAAACG GGTTTGCCAG A-------AG GGTGCGAGAG TTTAGATATG TTGGCTTCAA TGGGCGATAT GGTGAAGT UGT73C6 TCAAGAAGCT GGTCTGCAAG A-------AG GACAAGAAAA TATGGATTTG CTTACCACGA TGGAGCAGAT AACATCTT UGT73C10 TCAAGAAGCT GGCTTACCAG A-------AG GAAATGAGAC TTTCGATTCA CTTGTCTCAA CAAAGTTGCT GGTACCTT UGT73C12 TCAAGAAGCT GGCTTACCAG A-------AG GAAATGAGAC TCTCGATTCA CTTGTCTCGA TGGAGTTGAT GATACATT UGT73C11 TCAAGAAGCT GGCTTACCAG A-------AG GAAATGAGAC TTTCGATTCA CTTGTCTCGA TGGAGTTGCT GGTACCTT UGT73C13 TCAAGAAGCT GGCTTACCAG A-------AG GAAATGAGAC TTTCGATTCA CTTGTCTCGA TGGAGTTGCT GGTACCTT UGT73C9 TCAAGAAGCT GGCTTACCAG A-------AG GAATTGAGAC TTTCGAGTCA CTTGTCTCGA TGGAGTTGCT GGTACCTT UGT73C8 TAAAGAGTCT GGATTACCAG A-------AG GGTGTGAGAA TCTTGACATG CTACCTGCAC TTGGCATGGC CTCAAACT UGT73Cshi TCAAGAAGCT GGCTTACCAG A-------AG GAAATGAGAC TTTCGATTCA CTTGTCTCGA TGGAGTTGCT GGTACCTT UGT71G1 ATGTCAAAGC AACTA----- --TCAAAACC ATTTTATCAA AC-------A AAGTTG---- ----TTGGGT TAGTCCTA UGT73C1 TCTTCAAAGC ATTTAGCCTG CTCGAGGAAC CAGTCGAGAA GCTCTTGAAA GAGATTC--- AACCTAGGCC AAACTGCA

Page 69: Identification and biochemical characterization of the UDP ...

Appendix

61

UGT73C2 TCTTTAAAGC GGTTAACATG CTTGAAAATC CGGTCATGAA GCTCATGGAA GAGATGA--- AACCTAAACC AAGCTGCC UGT73C3 TCTTCAAAGC GGTGAACTTG CTTGAAGATC CGGTCATGAA GCTCATGGAA GAGATGA--- AACCTAGACC TAGCTGTC UGT73C4 TCTTTCAAGC AGTTAACATG CTCGAAGATC CGGTCATGAA GCTCATGGAA GAGATGA--- AACCTAGACC TAGCTGTA UGT73C5 TCTTTAAAGC GGTTAACTTT CTCGAAGAAC CAGTCCAGAA GCTCATTGAA GAGATGA--- ACCCTCGACC AAGCTGTC UGT73C7 TCTTTGATGC TGCCAACTCA CTTGAGGAGC AAGTTGAGAA AGCTATGGAA GAGATGGTTC AGCCGCGGCC AAGCTGCA UGT73C6 TCTTTAAAGC GGTTAACTTA CTCAAAGAAC CAGTCCAGAA CCTTATTGAA GAGATGA--- GCCCGCGACC AAGCTGTC UGT73C10 TCTTTAAAGC GGTTAACATG CTTGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT73C13 TCTTAAAAGC GGTTAACATG CTGGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT73C11 TCTTTAAAGC GGTTAACATG CTTGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT73C13 TCTTTAAATC GGTTAACATG CTTGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT73C9 TCTTTAAAGC GGTTAACATG CTTGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT73C8 TCCTCAATGC ATTAAAGTTT TTCCAACAAG AAGTTGAAAA GCTATTTGAA GAGTTCA--C AACACCAGC- AACTTGTA UGT73Cshi TCTTTAAAGC GGTTAACATG CTTGAAGAAC CGGTCCAGAA GCTCTTTGAA GAGATGA--- GCCCTCAACC AAGCTGTA UGT71G1 -GATTTCT-- TTTGTGTTT- ---CA-ATGA TTGATGTTGG AAATGAATTT GGTATCCCT- -TCTTATTTG TTTCTAAC UGT73C1 TAATCGCTGA CATGTGTTTG CCTTATACAA ACAGAATTGC CAAGAATCTT GGTATACCAA AAATCATCT- TTCATGGC UGT73C2 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATCGC TAAGAGGTTC AATATCCCAA AGATCGTTT- TCCATGGC UGT73C3 TAATTTCTGA TTGGTGTTTG CCTTATACAA GCATAATCGC CAAGAACTTC AATATACCAA AGATAGTTT- TCCACGGC UGT73C4 TTATTTCTGA TTTGCTCTTG CCTTATACAA GCAAAATCGC AAGGAAATTC AGTATACCAA AGATAGTTT- TCCACGGC UGT73C5 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATCGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C7 TCATTGGAGA CATGAGCCTT CCTTTCACTT CAAGACTTGC CAAGAAATTC AAGATCCCCA AACTTATCT- TCCATGGG UGT73C6 TAATCTCTGA TATGTGTTTG TCGTATACAA GCGAAATCGC CAAGAAGTTC AAAATACCAA AGATCCTCT- TCCATGGC UGT73C10 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATCGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C12 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATCGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C11 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATAGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C13 TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATCGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C9 TAATTTCTGA TTTTTGTTTG CATTATACAA GCAAAATCGC TAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT73C8 TCATCTCTGA TATGTGTTTG CCATATACAT CCCACGTAGC TAGAAAGTTC AATATTCCAA GAATTACTT- TTCTTGGA UGT73Cshi TAATTTCTGA TTTTTGTTTG CCTTATACAA GCAAAATAGC CAAGAAGTTC AATATCCCAA AGATCCTCT- TCCATGGC UGT71G1 ATCAAATG-- TTGGTTTTTT AAGTCT-CAT G-CTTTCCCT TAAAAACCGC CAAATCGAAG AAGTTTTCGA TGATTCCG UGT73C1 ATGTGTTGCT TCAATCTTCT TTGTACGCAC A--TAATGCA CCAAAACCAC ---------G AGTTCTTGGA AACTATAG UGT73C2 GTGTCTTGCT TTTGTCTTTT GAGTATGCAT A--TTCTACA CCGAAACCAC ---------A ATATCTTACA TGCTTTAA UGT73C3 ATGGGTTGCT TTAATCTTTT GTGTATGCAT G--TTCTACG CAGAAACTTA ---------G AGATCCTAGA GAATGTAA UGT73C4 ACGGGTTGCT TTAATCTTTT GTGTATGCAT G--TTCTACG CAGAAACCTC ---------G AGATCTTGAA GAACTTAA UGT73C5 ATGGGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAGAACCGT ---------G AGATCTTGGA CAATTTAA UGT73C7 TTTTCTTGTT TCAGCCTCAT GTCTATACAA G--TGGTTCG AGAAAGC--- ---------G GGATCTTGAA AATGATAG UGT73C6 ATGGGTTGCT TTTGTCTTCT GTGTGTTAAC G--TTCTGCG CAAGAACCGT ---------G AGATCTTGGA CAATTTAA UGT73C10 ATGTGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAGAACCGT ---------G AGATCTTGGA AAACTTAA UGT73C12 ATGTGCTGCT TTTGTCTTCT GTGTATGCAT A--TTTTACG CAAGAACCGT ---------G AGATCGTGGA AAACTTAA UGT73C11 ATGTGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAAAACCGT ---------G AGATCTTGGA AAACTTAA UGT73C13 ATGTGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAGAACCAT ---------G AGATCGTGGA AAACTTAA UGT73C9 ATGTGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAGAACTGT ---------G AGATCTTGGA AAACTTAA UGT73C8 GTAAGTTGCT TCCATCTCTT CAATATGCAT AATTTTCATG TCAATAATAT G-----GCGG AAATCATGGC CAATAAAG UGT73Cshi ATGTGTTGCT TTTGTCTTCT GTGTATGCAT G--TTTTACG CAAAAACCGT ---------G AGATCTTGGA AAACTTAA UGT71G1 ACCGTGATCA TCAGTTGTTG AATATTCCTG GTATCTCAAA CCAAGTTC-- CTTCTAATGT TTTA---CCT GATG---- UGT73C1 AGTCTGACAA GGAATACTTC CCCATTCCTA ATTTCCCTGA CAGAGTTGAG TTCACAAAAT CTCAGCTTCC AATGGTAT UGT73C2 AGTCGGACAA AGAGTATTTC TTGGTTCCTA GTTTTCCAGA TAGAGTTGAA TTTACAAAGC TTCAAGTTAC TGTGAAAA UGT73C3 AGTCGGATGA AGAGTATTTC TTGGTTCCTA GTTTTCCTGA TAGAGTTGAA TTTACAAAGC TTCAACTTCC TGTGAAAG UGT73C4 AGTCGGATAA AGATTATTTC CTGGTTCCTA GTTTTCCTGA TAGAGTTGAA TTTACAAAGC CTCAAGTTCC AGTGGAAA UGT73C5 AGTCAGATAA GGAGCTTTTC ACTGTTCCTG ATTTTCCTGA TAGAGTTGAA TTCACAAGAA CGCAAGTTCC GGTAGAAA UGT73C7 AATCAAACGA CGAGTATTTT GATTTGCCCG GCTTGCCTGA CAAAGTTGAG TTCACGAAAC CTCAGGTCTC TGTGTTGC UGT73C6 AGTCTGATAA GGAGTACTTC ATTGTTCCTT ATTTTCCTGA TAGAGTTGAA TTCACAAGAC CTCAAGTTCC GGTGGAAA UGT73C10 AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC ATTGGCAA UGT73C12 AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC AGTGGCAA UGT73C11 AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC AATGGCAA UGT73C13 AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC AGTGGCAA UGT73C9 AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC AATGGCAA UGT73C8 AATCT----- -GAATACTTT GAATTGCCTG GTATCCCTGA CAAAATTGAA ATGACAATAG CACAAACAGG ACTAGGAG UGT73Cshi AGTCTGACAA GGAGCATTTC GTTGTTCCTT ATTTTCCTGA TCGAGTTGAA TTCACAAGAC CTCAAGTTCC AATGGCAA UGT71G1 CTTGTTTTAA TA---AAGAT GGTGGATATA TTGCTTATTA TAAACTAGCT GAGAGGTTTA GAGACACCAA AGGGATTA UGT73C1 TA---GTTGC TG---GAGAT TGGAAAGACT TCCTTGACGG AATGACAG-- AAGGGGATAA CACTTC-TTA TGGTGTGA UGT73C2 CAAACTTTAG TG---GAGAT TGGAAAGAGA TCATGGACGA ACAGGTGG-- ATGCTGATGA CACGTC-CTA TGGTGTAA UGT73C3 CAAATGCAAG TG---GAGAT TGGAAAGAGA TAATGGATGA AATGGTAA-- AAGCAGAATA CACATC-CTA TGGTGTGA UGT73C4 CAACTGCAAG TG---GAGAT TGGAAAGCGT TCTTGGACGA AATGGTAG-- AAGCAGAATA CACATC-CTA TGGTGTGA UGT73C5 CATATGTTCC AGCTGGAGAC TGGAAAGATA TCTTTGATGG TATGGTAG-- AAGCGAATGA GACATC-TTA TGGTGTGA UGT73C7 AACCTGTTGA AG---GAAAT ATGAAAGAGA GTACGGCCAA GATTATTG-- AAGCTGATAA TGACTC-TTA TGGTGTTA UGT73C6 CATATGTTCC TGC---AGGC TGGAAAGAGA TCTTGGAGGA TATGGTAG-- AAGCGGATAA GACATC-TTA TGGTGTTA UGT73C10 CATATGTTCC TG---GGGAA TGGCACGAGA TCAAGGAGGA TATGGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT73C12 CATATGTTCC TG---GAGAC TGGCACGAGA TCACGGAGGA TATGGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT73C11 CATATGTTCC TG---GAGAG TGGCACGAGA TCAAGGAGGA TATAGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT73C13 CATATGTTCC TG---GAGAC TGGCACGAGA TCACGGGGGA TATGGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT73C9 CATATGCTCC TG---GAGAC TGGCAAGAGA TCAGGGAGGA TATCGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT73C8 GATTAAAGGG TG---AAGTT TGGAAACAAT TTAATGATGA CTTGCTTG-- AAGCTGA-AA TAGGTAGTTA TGGGATGC UGT73Cshi CATATGTTCC TG---GAGAG TGGCACGAGA TCAAGGAGGA TATAGTAG-- AAGCGGATAA GACTTC-CTA TGGTGTGA UGT71G1 TTGTTAATAC CTTTTCAGAT TTGGAACAAT CTTCTATTGA TGCATTATAT GATCATGA-- --TGAGAAAA TCCCTCCT UGT73C1 TTGTTAACAC GTTTGAAGAG CTCGAGCCAG CTTATGTTAG AG-ACTACAA GAAGGTTAAA GCGGGTAAGA TATGGAGC UGT73C2 TTGTCAACAC ATTTCAGGAT TTGGAGTCTG CCTATGTGAA AA-ACTACAC GGAGGCTAGG GCTGGTAAAG TATGGAGC UGT73C3 TCGTCAACAC ATTTCAGGAG TTGGAGCCAC CTTATGTCAA AG-ACTACAA AGAGGCAATG GATGGAAAAG TATGGTCC UGT73C4 TCGTCAACAC ATTTCAGGAG TTGGAGCCTG CTTATGTCAA AG-ACTACAC GAAGGCTAGG GCTGGAAAAG TATGGTCC UGT73C5 TCGTCAACTC ATTTCAAGAG CTCGAGCCTG CTTATGCCAA AG-ACTACAA GGAGGTAAGG TCCGGTAAAG CATGGACC UGT73C7 TTGTGAACAC TTTTGAAGAG TTAGAGGTTG ATTATGCAAG AG-AATATAG GAAAGCAAGG GCTGGAAAAG TTTGGTGC UGT73C6 TAGTCAACTC ATTTCAAGAG CTCGAACCTG CGTATGCCAA AG-ACTTCAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT73C10 TAGTCAACAC ATATCAAGAG CTCGAGCCTG CTTATGCCAA CG-GCTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT73C12 TAGTCAACAC ATATCAAGAG CTCGAGCCTG CTTATGCCAA CG-ACTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC

Page 70: Identification and biochemical characterization of the UDP ...

Appendix

62

UGT73C11 TAGTCAACAC ATATCAAGAG CTCGAGCCTG CTTATGCCAA CG-ACTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT73C13 TAGTCAACAC ATGTCAAGAG CTCGAGCCTG CTTATGCCAA CG-ACTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT73C9 TAGTCAACAC ATATCAAGAG CTCGAGCCTG CTTATGCCAA CG-ACTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT73C8 TCGTGAATTC TTTTGAAGAG TTAGAGCCAA CATATGCAAG GG-ATTATAA AAAGGTAAGA AATGATAAAG TTTGGTGT UGT73Cshi TAGTCAACAC ATATCAAGAG CTCGAGCCTG CTTATGCCAA CG-ACTACAA GGAGGCAAGG TCTGGTAAAG CATGGACC UGT71G1 ATCTATGCTG TTGGTCCTT- TGTTAGATCT CAAAGGTCAG CCTAACCCTA AATTGGATCA A--------- -------- UGT73C1 ATCGGACCGG TT--TCCTTG TGCAACAAGT TAGGAGAAGA CCAAGCTGAG AGGGGAAACA A--------- GGCGGACA UGT73C2 ATCGGTCCGG TT--TCCTTG TGCAACAAGG TAGGAGAAGA CAAAGCTGAG AGGGGAAACA A--------- GGCAGCCA UGT73C3 ATTGGACCCG TT--TCCTTG TGTAACAAGG CAGGTGCAGA CAAAGCTGAG AGGGGAAGCA A--------- GGCCGCCA UGT73C4 ATTGGACCTG TT--TCCTTG TGCAACAAGG CAGGTGCTGA TAAAGCTGAG AGGGGAAACC A--------- GGCCGCCA UGT73C5 ATTGGACCCG TT--TCCTTG TGCAACAAGG TAGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- ATCAGACA UGT73C7 GTTGGACCTG TT--TCCTTG TGCAATAGGT TAGGGTTAGA CAAAGCTAAA AGAGGAGATA A--------- GGCTTCTA UGT73C6 ATTGGACCTG TT--TCCTTG TGCAACAAGG TAGGAGTAGA CAAAGCAGAG AGGGGAAACA A--------- ATCAGATA UGT73C10 ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT73C12 ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCGGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT73C11 ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT73C13 ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT73C9 ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT73C8 ATTGGTCCTG TA--TCACTT AGTAACACTG ATTACTTGGA TAAGGTTCAA AGAGGTAATA ATAATAATAA GGTTTCAA UGT73Cshi ATTGGACCTG TT--TCCTTG TGCAACAAGG TGGGAGCCGA CAAAGCAGAG AGGGGAAACA A--------- AGCAGACA UGT71G1 --GCTCAGCA TGATCTTATA TTGAAATGGC TAGATGAGCA GCCAGATAAA TCAGTTGTTT TTTTATGTTT TGGAAGCA UGT73C1 TTGATCAAGA CGAG--TGTA TT-AAATGGC TTGATTCTAA AGAAGAAGGG TCGGTGCTAT ATGTTTGCCT TGGAAGTA UGT73C2 TTGATCAAGA CGAG--TGTA TT-AAATGGC TTGATTCTAA AGATGTAGAG TCGGTGCTGT ATGTTTGCCT TGGAAGTA UGT73C3 TTGATCAAGA TGAG--TGTC TT-CAATGGC TTGATTCTAA AGAAGAAGGT TCGGTGCTCT ATGTTTGCCT TGGAAGTA UGT73C4 TTGATCAAGA TGAG--TGTC TT-CAATGGC TTGATTCTAA AGAAGATGGT TCGGTGTTAT ATGTTTGCCT TGGAAGTA UGT73C5 TTGATCAAGA TGAG--TGCC TT-AAATGGC TCGATTCTAA GAAACATGGC TCGGTGCTTT ACGTTTGTCT TGGAAGTA UGT73C7 TTGGTCAAGA CCAA--TGTC TT-CAATGGC TTGACTCTCA AGAAACTGGT TCAGTGCTCT ACGTTTGCCT TGGAAGTC UGT73C6 TTGATCAAGA TGAG--TGCC TT-GAATGGC TCGATTCTAA GGAACCGGGA TCTGTGCTCT ACGTTTGCCT TGGAAGTA UGT73C10 TTGATCAAGA TGAG--TGTC TT-AAATGGC TTGATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT73C12 TTGATCAAGA TGAG--TGTC TT-AAATGGC TTAATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT73C11 TTGATCAAGA TGAG--TGTC TT-AAATGGC TTGATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT73C13 TTGATCAAGA TGAG--TGTC TT-AAATGGC TTAATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT73C9 TTGATCAAGA TGAG--TGTC TT-AAATGGC TTGATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT73C8 ATGATGAATG GGAG--CACT TG-AAGTGGC TTGATTCTCA TAAACAAGGG AGTGTTATCT ACGCATGCTT TGGCAGTT UGT73Cshi TTGATCAAGA TGAG--TGTC TT-AAATGGC TTGATTCTAA AGAAGAAGGT TCGGTTCTAT ATGTTTGCCT TGGAAGTA UGT71G1 TGGGAGTTAG CTTTGGTCCA TCTCAAATAA GAGAGATAGC ATTAGGACTT -AAGCATAGT GGGGTTAGGT TCTTGTGG UGT73C1 TATG---CAA TCTTCCTCTG TCTCAGCTCA AAGAGCTCGG CTTAGGCCTC GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C2 TATG---CAA TCTTCCTCTG GCTCAGCTTA GAGAGCTCGG GCTAGGCCTC GAGGCAACTA AAAGACCATT CATTTGGG UGT73C3 TATG---TAA TCTTCCTTTG TCTCAGCTCA AGGAGCTGGG GCTAGGCCTT GAGGAATCTC GAAGATCTTT TATTTGGG UGT73C4 TCTG---TAA TCTACCTTTG TCTCAGCTCA AGGAGCTGGG GCTAGGCCTT GAAAAATCCC AAAGATCTTT TATTTGGG UGT73C5 TCTG---TAA TCTTCCTTTG TCTCAACTCA AGGAGCTGGG ACTAGGCCTA GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C7 TATG---TAA TCTTCCCTTG GCTCAGCTCA AAGAGCTGGG ACTAGGCCTT GAGGCATCTA ATAAACCTTT CATATGGG UGT73C6 TTTG---TAA TCTTCCTCTG TCTCAGCTCC TTGAGCTGGG ACTAGGCCTA GAGGAATCCC AAAGACCTTT CATCTGGG UGT73C10 TCTG---CAG TCTTCCTCTG TCTCAGCTCA AGGAGCTGGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C12 TCTG---CAA TCTTCCTCTG TCTCAGCTCA AGGAGCTCGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C11 TCTG---CAG TCTTCCTCTG TCTCAGCTCA AAGAGCTGGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C13 TCTG---CAA TCTTCCTCTG TCTCAGCTCA AGGAGCTCGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C9 ACTG---CAG TGTTCCTCTG TCTCAGCTCA AGGAACTGGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT73C8 TATG---CAA TTTAACACCA CCACAGTTGA TAGAGCTTGG TTTAGCATTA GAAGCAACAA AAAGACCCTT TATTTGGG UGT73Cshi TCTG---CAG TCTTCCTCTG TCTCAGCTCA AAGAGCTGGG GCTAGGCCTT GAGGAATCCC AAAGACCTTT CATTTGGG UGT71G1 TC-TAACAGT GCAGAGAAA- ---AAAGTGT TCCCAGAAGG GTTTTT--AG AATGGAT--- -----GGAAT TGGAAGGT UGT73C1 TCATAAGAGG TTGGGAGAAG TATAACGAGT TACTTGAATG GATCTCAGAG AGCGGTTATA AGGAAAGAAT CA-AAGAA UGT73C2 TCATAAGAGG TGGGGGAAAG TATCATGAAC TAGCTGAGTG GATCTTAGAG AGCGGTTTTG AAGAAAGAAC CA-AAGAG UGT73C3 TCATAAGAGG TTCGGAAAAG TATAAAGAAC TATTTGAGTG GATGTTGGAG AGCGGTTTTG AAGAAAGAAT CA-AAGAG UGT73C4 TCATAAGAGG TTGGGAAAAG TATAATGAAC TATATGAGTG GATGATGGAG AGCGGTTTTG AAGAAAGAAT CA-AAGAG UGT73C5 TCATAAGAGG TTGGGAGAAG TACAAAGAGT TAGTTGAGTG GTTCTCGGAA AGCGGCTTTG AAGATAGAAT CC-AAGAT UGT73C7 TTATAAGAGA ATGGGGAAAA TATGGAGATT TAGCAAATTG GATGCAACAA AGCGGATTTG AAGAGCGGAT CA-AAGAT UGT73C6 TCATAAGAGG TTGGGAGAAA TACAAAGAGT TAGTTGAGTG GTTCTCGGAA AGCGGCTTTG AAGATAGAAT CC-AAGAT UGT73C10 TCGTAAGAGG TTGGGAGAAG AACAAAGAGT TACTTGAGTG GTTCTCGGAG AGCGGATTTG AAGAAAGAGT AA-AAGAC UGT73C12 TCATAAGAGG TTGGGAGAAG AACAAAGAGT TACATGAGTG GTTCTCGGAG AGCGGATTCG AAGAAAGAAT CA-AAGAC UGT73C11 TCGTAAGAGG TTGGGAGAAG AACAAAGAGT TACTTGAGTG GTTCTCGGAG AGCGGATTTG AAGAAAGAGT AA-AAGAC UGT73C13 TCATAAGAGG TTGGGAGAAG AACAAAGAGT TACTAGAGTG GTTCTCGGAG AGCGGATTCG AAGAAAGAAT CA-AAGAC UGT73C9 TCGTAAGAGG TTGGGAGAAG AACAAAGAGT TACTTGAGTG GTTCTCGGAG AGCGGATTTG AAGAAAGAGT AA-AAGAC UGT73C8 TTTTAAGAGA AGGAAATCAG TTAGAAGAGT TGAAAAAATG GCTTGAGGAG AGTGGATTTG AGGGAAGAAT CA-ATGGT UGT73Cshi TCGTAAGAGG TTGGGAGAAG AACAAAGAGT TACTTGAGTG GTTCTCGGAT AGCGGATTTG AAGAAAGAGT AA-AAGAC UGT71G1 AAGGGA---A TGATATGTGG ATGGGCACCA CAAGTTGAGG TTTTGGCACA TAAGGCTATT GGTGGATTTG TTTCACAT UGT73C1 AGAGGCCTTC TCATAACAGG ATGGTCGCCT CAAATGCTTA TCCTTACACA TCCTGCCGTT GGAGGATTCT TGACACAT UGT73C2 AGAAGCCTTC TCATAAAAGG ATGGTCGCCT CAAATGCTTA TCCTTTCACA CCCTGCCGTT GGAGGATTCC TGACACAT UGT73C3 AGAGGACTTC TCATTAAAGG GTGGGCACCT CAAGTCCTTA TCCTTTCACA TCCTTCCGTT GGAGGATTCC TGACACAC UGT73C4 AGAGGACTTC TTATTAAAGG GTGGTCACCT CAAGTCCTTA TCCTTTCACA TCCTTCCGTT GGAGGATTCC TGACACAC UGT73C5 AGAGGACTTC TCATCAAAGG ATGGTCCCCT CAAATGCTTA TCCTTTCACA TCCATCAGTT GGAGGGTTCC TAACACAC UGT73C7 AGAGGACTGG TGATCAAAGG TTGGGCGCCG CAAGTTTTCA TCCTCTCACA CGCATCCATT GGAGGGTTTT TGACTCAC UGT73C6 AGAGGACTTC TCATCAAAGG ATGGTCCCCT CAAATGCTTA TCCTTTCACA TCCTTCTGTT GGAGGGTTCT TAACGCAC UGT73C10 AGAGGGCTTC TCATCAAAGG ATGGTCACCT CAAATGCTTA TCCTTGCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT73C12 AGAGGACTTC TCATCAAAGG ATGGGCTCCT CAAATGCTTA TACTTTCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT73C11 AGAGGGCTTC TCATCAAAGG ATGGTCACCT CAAATGCTTA TCCTTGCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT73C13 AGAGGACTTC TCATCAAAGG ATGGGCTCCT CAAATGCTTA TACTTTCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT73C9 AGAGGGCTTC TCATCAAAGG ATGGTCACCT CAAATGCTTA TCCTTGCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT73C8 AGAGGCCTAG TGATTAAGGG TTGGGCTCCT CAGTTATTGA TATTGTCGCA TCTTGCAATT GGAGGATTCT TAACACAT UGT73Cshi AGAGGGCTTC TCATCAAAGG ATGGTCACCT CAAATGCTTA TCCTTGCACA TCATTCCGTT GGAGGGTTCT TAACACAC UGT71G1 TGTGGATGGA ATTCTATTTT GGAAAGTATG TGGTTTGGTG TACCAATATT GACATGGCCT ATTTATGCAG AACAACAG UGT73C1 TGTGGATGGA ACTCTACTCT TGAAGGAATC ACTTCAGGCG TTCCATTACT CACGTGGCCA CTGTTTGGAG ACCAATTC

Page 71: Identification and biochemical characterization of the UDP ...

Appendix

63

UGT73C2 TGTGGATGGA ACTCAACTTT AGAAGGAATC ACCTCAGGGG TTCCATTGAT CACTTGGCCA TTATTTGGAG ACCAATTC UGT73C3 TGTGGATGGA ACTCGACTCT CGAAGGAATC ACCTCAGGCA TTCCACTGAT CACTTGGCCG CTGTTTGGAG ACCAATTC UGT73C4 TGTGGATGGA ACTCGACTCT CGAAGGAATC ACCTCAGGCA TTCCACTGAT CACTTGGCCG CTGTTTGGAG ACCAATTC UGT73C5 TGTGGTTGGA ACTCGACTCT TGAGGGGATA ACTGCTGGTC TACCGCTACT TACATGGCCG CTATTCGCAG ACCAATTC UGT73C7 TGTGGATGGA ACTCGACACT AGAAGGAATT ACTGCAGGAG TTCCATTATT GACATGGCCT TTGTTTGCTG AACAATTC UGT73C6 TGCGGATGGA ACTCGACTCT TGAGGGGATA ACTGCTGGTC TACCAATGCT TACATGGCCA CTATTTGCAG ACCAATTC UGT73C10 TGTGGATGGA ACTCGACCCT CGAAGGAATC ACTTCAGGCG TTCCATTGCT CACTTGGCCA CTGTTTGGAG ACCAATTC UGT73C12 TGTGGATGGA ACTCGACTCT TGAGGGGCTA ACCGCTGGTC TACCACTGCT GACATGGCCG CTTTTCGCAG ACCAGTTC UGT73C11 TGTGGATGGA ACTCGACCCT CGAAGGAATC ACTTCAGGCA TTCCATTGCT CACTTGGCCA CTGTTTGGAG ACCAATTC UGT73C13 TGTGGATGGA ACTCGACTCT TGAGGGGCTA ACCGCTGGTC TACCACTGCT GACATGGCCG CTTTTCGCAG ACCAGTTC UGT73C9 TGTGGATGGA ACTCGACCCT CGAAGGAATC ACTTCAGGCA TTCCATTGCT CACTTGGCCA CTGATTGTAG ACCAATTC UGT73C8 TGTGGTTGGA ATTCTACTCT TGAAGCAATA TGTGCTGGTG TACCAATGGT TACATGGCCA CTTTTTGCGG ATCAATTT UGT73Cshi TGTGGATGGA ACTCGACCCT CGAAGGAATC ACTTCAGGCA TTCCATTGCT CACTTGGCCA CTGTTTGGAG ACCAATTC UGT71G1 CTTAAT--GC TTTTAGGTTG GTGAAGGAAT GGGGGGTAGG TTTGGGACTG AGAGTGGACT ATAGAAAGGG TAGTGATG UGT73C1 TGCAATGAGA AATTGG--CG GTGCAGATAC TAAAAGCCGG TGTGAGAGCT GGGGTTGA-- ----AGAGTC C-ATGAGA UGT73C2 TGCAACCAGA AACTGA--TC GTGCAGGTGC TAAAAGCAGG TGTAAGTGTT GGGGTTGA-- ----AGAGGT C-ATGAAA UGT73C3 TGCAACCAAA AACTGG--TC GTTCAAGTAC TAAAAGCCGG TGTAAGTGCC GGGGTTGA-- ----AGAAGT C-ATGAAA UGT73C4 TGCAACCAAA AACTGG--TC GTTCAAGTAC TAAAAGCCGG TGTAAGTGCC GGGGTTGA-- ----AGAAGT C-ATGAAA UGT73C5 TGCAATGAGA AATTGG--TC GTTGAGGTAC TAAAAGCCGG TGTAAGATCC GGGGTTGA-- ----ACAGCC T-ATGAAA UGT73C7 TTGAATGAGA AGTTAG--TT GTGCAGATAC TAAAAGCAGG GTTAAAGATA GGAGTAGA-- ----GAAATT G-ATGAAA UGT73C6 TGCAACGAGA AACTGG--TC GTACAAATAC TAAAAGTCGG TGTAAGTGCC GAGGTTAA-- ----AGAGGT C-ATGAAA UGT73C10 TGCAACCAAA AACTTG--TC GTGCAGGTGC TAAAAGTGGG TGTAAGTGCC GGGGTTGA-- ----AGAGGT T-ACGAAT UGT73C12 TGCAACGAGA AACTTG--CC GTGCAGGTAT TAAAAGCCGG TGTAAGCGCC GGGGTTGA-- ----CCAGCC T-ATGAAA UGT73C11 TGCAACCAAA AACTTG--TC GTGCAGGTGC TAAAAGTGGG TGTAAGTGCC GGGGTTGA-- ----AGAGGT T-ACGAAT UGT73C13 TGCAACGAGA AACTTG--CC GTGCAGGTAT TAAAAGCCGG TGTAAGCGCC GGGGTTGA-- ----CCAGCC T-ATGAAA UGT73C9 TGCAACCAAA AACTTG--TC GTGCAGGTGC TAAAAGTGGG TGTAAGTGCC GGGGTTGA-- ----AGAGGT T-ACGAAT UGT73C8 TTGAATGAAA GTTTTG--TT GTGCAAATAT TAAAAGTTGG TGTGAAAATT GGGGT-GA-- ----AGAGTC CTATGAAA UGT73Cshi TGCAACCAAA AACTTG--TC GTGCAGGTGC TAAAAGTGGG TGTAAGTGCC GGGGTTGA-- ----AGAGGT T-ACGAAT UGT71G1 TTGTAGCGGC CGAGGAGATT GAGAAAGGAT TGAAGGATTT GATGGATAAA GATAGCATTG TACACAAGAA GGTTCAAG UGT73C1 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTACT- GGTGGATAAA GAAGGAGTA- -----AAGAA GG--CAGT UGT37C2 T--------- -GGGGAGAAG AGGAGAGTAT TGGAGTGTT- AGTGGATAAA GAAGGAGTG- -----AAGAA GG--CAGT UGT73C3 T--------- -GGGGAGAAG AAGATAAAAT AGGAGTGTT- AGTGGATAAA GAAGGAGTG- -----AAAAA GG--CTGT UGT73C4 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTGTT- AGTGGATAAA GAAGGAGTA- -----AAGAA GG--CAGT UGT73C5 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTGTT- GGTGGATAAA GAAGGAGTG- -----AAGAA GG--CAGT UGT73C7 T--------- -ATGGAAAAG AAGAGGAGAT AGGAGCGAT- GGTGAGCAGA GAATGTGTG- -----AGAAA AG--CTGT UGT73C6 T--------- -GGGGAGAAG AAGAGAAGAT AGGAGTGTT- GGTGGATAAA GAAGGAGTG- -----AAGAA GG--CAGT UGT73C10 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTATT- AGTGGATAAA GAGGGAGTG- -----AAGAA GG--CAGT UGT73C12 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTGTT- GGTGGATAAA GAAGGAGTG- -----AAGAA GG--CAGT UGT73C11 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTATT- AGTGGATAAA GAGGGAGTG- -----AAGAA GG--CAGT UGT73C13 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTGTT- GGTGGATAAA GAAGGAGTG- -----AAGAA GG--CAGT UGT73C9 T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTATT- AGTGGATAAA GAGGGAGTG- -----AAGAA GG--CAGT UGT73C8 T--------- -GGGGTGAAG AAGAGGAT-- -GGTGTGTT- GGTGAAGAAG GAAGATATT- -----GAAAG AG--GAAT UGT73Cshi T--------- -GGGGAGAAG AGGAGAAAAT AGGAGTATT- AGTGGATAAA GAGGGAGTG- -----AAGAA GG--CAGT UGT71G1 AGATGAAAGA GATGTCTAGG AATGCTGTTG TTGATGGTGG ATCTTCTTTA ATTTCTGTTG GAAAACTTAT TGATGATA UGT73C1 GGAGGAATTG -ATGGGTGAT AGTAATGATG CTAAGGAGAG A--------- ---------A GAAAAAGAGT GAAAGAGC UGT37C2 GGACGAAATA -ATGGGCGAG AGTGATGAAG CAAAAGAGAG A--------- ---------A GAAAAAGAGT CAGAGAGC UGT73C3 GGAAGAATTG -ATGGGTGAT AGTGATGATG CAAAAGAGAG G--------- ---------A GAAGAAGAGT CAAAGAGC UGT73C4 GGAAGAGTTA -ATGGGTGCG AGTGATGATG CAAAAGAGAG G--------- ---------A GAAGAAGAGT CAAAGAGC UGT73C5 GGAAGAATTA -ATGGGTGAG AGTGATGATG CAAAAGAGAG A--------- ---------A GAAGAAGAGC CAAAGAGC UGT73C7 GGATGAGCTA -ATGGGTGAT AGTGAAGAAG CAGAAGAGAG A--------- ---------A GAAGAAAAGT TACAGAAC UGT73C6 GGAAGAACTA -ATGGGTGAG AGTGATGATG CAAAAGAGAG A--------- ---------A GAAGAAGAGC CAAAGAGC UGT73C10 GGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAAAT A--------- ---------A GAAAAAGAGT CAAAGAGC UT73C12 GGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAGAT A--------- ---------A GAAGAAGAGC CAAAGAGC UGT73C11 TGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAAAG A--------- ---------A GAAAAAGAGT CAAAGAGC UGT73C13 GGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAGAT A--------- ---------A GAAGAAGAGC CAAAGAGC UT73C9 GGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAAAG A--------- ---------A GAAAAAGAGT CAAAGCGC UGT73C8 AGAAAAGTTA -ATGGATGAG ACAAGTGAAT GTAAAGAAAG A--------- ---------A GAAAAAGGAT TAGAGAGC UGT73Cshi TGAAGAATTA -ATGGGTGAG AGTGATGATG CTAAAGAAAG A--------- ---------A GAAAAAGAGT CAAAGAGC UGT71G1 TTACAGGAAG CAACTGATAA ACTGTCTTTT TTTGCTACAT AGGTGGAGTT TCCCTTTCTT GGAATCAATG GATGAAGA UGT73C1 TTGGAG-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAAGAAGG UGT73C2 TTGGAG-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAAGAAGG UGT73C3 TTGGAG-AAT TAGCTCACAA A--------- ---GCTG--- ---------- ---------- --------TG GAAAAAGG UGT73C4 TTGGAG-AAT CAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAAGAAGG UGT73C5 TTGGAG-ATT CAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAAGAAGG UGT73C7 TTAGTG-ACT TGGCAAATAA G--------- ---GCTT--- ---------- ---------- --------TG GAAAAAGG UGT73C6 TTGGAG-AAT CAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAAGAAGG UGT73C10 TTGGAC-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT73C12 TTGGAG-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT73C11 TTGGAC-AAT TAGCTCAAAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT73C13 TTGGAG-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT73C9 TTGGAC-AAT TAGCTCACAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT73C8 TTGCTG-AGA TGGCTAAAAA G--------- ---GCTG--- ---------- ---------- --------TA GAAAAAGG UGT73Cshi TTGGAC-AAT TAGCTCAAAA G--------- ---GCTG--- ---------- ---------- --------TG GAGGAAGG UGT71G1 AGACATTCTA TATGTTATAT TGTTTTGTTG AGGGATGTCA TTTTATATAC TATATTCTAC CTAAAAAAC- TGTTGAAA UGT73C1 AGGC--TCT- ---------- ---------- -----TCTCA TTCCA-ACAT CACATTCTTG CTACAAGACA TAATGCAA UGT73C2 AGGC--TCT- ---------- ---------- -----TCTCA TTCTA-ATAT CATATTTTTG CTACAAGATA TAATGCAA UGT73C3 AGGC--TCT- ---------- ---------- -----TCTCA TTCTA-ACAT CACACTCTTG CTACAAGACA TAATGCAA UGT73C4 AGGC--TCT- ---------- ---------- -----TCTCA TTCTA-ACAT CACATACTTG CTACAAGACA TAATGCAA UGT73C5 AGGC--TCT- ---------- ---------- -----TCTCA TTCTA-ACAT CTCTTTCTTG CTACAAGACA TAATGGAA UGT73C7 AGGA--TCT- ---------- ---------- -----TCAGA TTCTA-ATAT CACATTGCTC ATTCAAGATA TTATGGAG UGT73C6 AGGC--TCC- ---------- ---------- -----TCTCA TTCTA-ATAT CACTTTCTTG CTACAAGACA TAATGCAA UGT73C10 AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCTTG CTAGAAGACA TAATGCAA UGT73C12 AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCCTT CTAGAAGACA TAATGCAA

Page 72: Identification and biochemical characterization of the UDP ...

Appendix

64

UGT73C11 AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCTTG CTAGAAGACA TAATGCAA UGT73C12 AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCCTT CTAGAAGACA TAATGCAA UGT73C9 AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCTTG CTAGAAGACA TAATGCAA UGT73C8 TGGA--TCT- ---------- ---------- -----TCTCA CTCTA-ATAT TTCTTTGTTC ATCCAAGATA TCATG-AA UGT73Cshi AGGC--TCA- ---------- ---------- -----TCTCA TTCTA-ATAT CACATCCCTT CTAGAAGACA TAATGCAA UGT71G1 GAATAAAAGT TGAATGTGGA ATTAGTAGCA TATTTGTGTA TAGCAAATTT AATCAAGCTA GCACATGTGC CTATCTTT UGT73C1 TTAGAACAAC CCAA----GA AATGATTGTA ACTTTTCTT- ---------- -------AGA TTG--TAA-- ------CT UGT73C2 CAAGTAGAAT CCAA----GA GTTGA----- ---------- ---------- ---------- ---------- -------- UGT73C3 CTAGCACAAT TCAA----GA ATTGAGCATA TGTCATTTTA TGTTCATAGA AATTTAAACA TTAAACAG-- ------TT UGT73C4 CAAGTGAAAT CCAA----GA ACTGA--TTA CATCATTTTT AGT------- ------AACA TTA------- -------C UGT73C5 CTGGCAGAAC CCAA----TA ATTGAGTATA CGTCATCTT- --TTTAAAGG AATTTAAAAA TTAAATAG-- ------TT UGT73C7 CAATCACAA- ---------A ATCAATTTTA ---------- ---------- ---------- ------AA-- ------TT UGT73C6 CTAGCACAGT CCAA----TA ATTGAGTATA TGTCATATT- --TTCAAAGG AATTTAAACA TTCTATAG-- ------TT UGT73C10 CTAGCACAAC CTAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT73C12 CTAGCACAAT CCAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT73C11 CTAGCACAAT CTAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT73C13 CTAGCACAAT CCAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT73C9 CTAGCACAAT CTAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT73C8 GAAAAATAA- ---A----GA TATGATGTCA T--------- ---------- ---------- ---------- ------CA UGT73Cshi CTAGCACAAT CTAA----TA ATTGA----- ---------- ---------- ---------- ---------- -------- UGT71G1 TTTTATTTCA GTACTGCTTT TCTTTGGAGG GTTGTTTATA TAATATTTTT TTATTAACAG CTGAACTAAT GAAGT--- UGT73C1 TTTCTTTTGT ATTACCCAAA AAAAATGATT GTGACGTTTG TTCT------ ---------- ---------- -------- UGT73C2 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C3 TTTGATTTCT ATATTGGAGA AATTTAAATC AGAGCCTTTG TTAAACACGT GGATAATGAA TCAGAAGAAG ATAAATCA UGT73C4 CATAATTT-- ---------- AATTCAAATT AAG---TCTG TT---CTTCT GA--GATGTA TTAAAAAAAA CTTA--TA UGT73C5 TT-GTTTTCT GTATTTGTGA AATTTAAAAC AGAGTCTTAG TTACACATTT GTACGAATTA GCAGAAGACA AC------ UGT73C7 TTGGTATTTG GCACCAGAGA AAAACA---- ---------- ---------- ---------- ---------- -------- UGT73C6 TTTGTTTTCT GTATTTGTGA AATTTAAAAC AGAGTCTTAG TTACACATTT GTACGAATTA GCAGAAGACA ACTAACTA UGT73C10 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C12 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C11 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C13 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C9 ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT73C8 TTTATCCATG GAAATGCC-- AATTCAAAAT GA-------- ---------- ---------- ---------- -------- UGT73Cshi ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- UGT71G1 ---------- ---------- --------- UGT73C1 ---------- ---------- --------- UGT73C2 ---------- ---------- --------- UGT73C3 GTGAT----- ---------- --------- UGT73C4 ATGATGCA-- ---------- --------- UGT73C5 ---------- ---------- --------- UGT73C7 ---------- ---------- --------- UGT73C6 GTGTAACTTC CACATTCTGG AGTTATCTT UGT7310 ---------- ---------- --------- UGT73C12 ---------- ---------- --------- UGT73C11 ---------- ---------- --------- UGT73C13 ---------- ---------- --------- UGT73C9 ---------- ---------- --------- UGT73C8 ---------- ---------- --------- UGT73Cshi ---------- ---------- ---------