Protein Sequence AnalysisThe identification of amino acid residues in modern protein sequence...


Identification of Modified PTH-Amino Acids in Protein Sequence Analysis

First Edition

©1993 Association of Biomolecular Resource Facilities

Compiled for The Association of Biomolecular Resource Facilities

by Mark W. Crankshaw and Gregory A. Grant

Departments of Medicine and Molecular Biology & Pharmacology
Washington University School of Medicine
St. Louis, Missouri 63110

CONTENTS

Introduction
Suggestions for Using This Guide
List of Abbreviations
Explanation of Numbering Convention

PTH-Amino Acids
    Reference Standard I - PTH-Amino Acids, Applied Biosystems Sequencer (Chromatogram, Table)
    Reference Standard II - PTH-Amino Acids, Milligen/Biosearch Sequencer (Chromatogram, Table)

PTC-Amino Acids
    Reference Standard III - PTC-Amino Acids, Applied Biosystems Sequencer (Chromatogram, Table)

Sequencing Artifacts or Associated Amino Acid Derivatives
    Reference Standard IV - Sequencing Artifacts, Applied Biosystems Sequencer (Chromatogram, Table)

Side Chain Protected Amino Acids Used in Peptide Synthesis
    Reference Standard V - Boc Synthesis, Applied Biosystems Sequencer (Chromatogram, Table)
    Reference Standard VI - Boc Synthesis, Porton Sequencer (Chromatogram, Table)
    Reference Standard VII - Fmoc Synthesis, Applied Biosystems Sequencer (Chromatogram, Table)

Table I - Elution Times of Selected PTH-Amino Acids on an Applied Biosystems Sequencer
Table II - Elution Times of Selected PTH-Amino Acids on an Applied Biosystems Sequencer

Index of Compounds
List of Contributors
Suggested References

Introduction

The identification of amino acid residues in modern protein sequence analysis employing automated Edman degradation is dependent on the elution position of the PTH-amino acids on high pressure liquid chromatography systems. This method relies on a comparison of the elution position of the unknown PTH-amino acid with that of reference standards. This is relatively straightforward for the genetically encoded amino acids, but becomes problematic when modified or unusual amino acid residues are present in the sample being sequenced. Since the method does not provide a direct identification of the PTH-amino acid, additional analysis by chemical or physical means is necessary. However, it is often helpful to have some knowledge of where PTH-amino acids with known modifications elute in these systems. This provides a starting point for the investigator and an additional level of knowledge upon which to proceed.

This compilation has been undertaken by the Association of Biomolecular Resource Facilities (ABRF) in an attempt to consolidate information of this type for easy reference. It should be noted that an exhaustive review of the literature has not been attempted. Rather, members of the ABRF were asked to submit any information they had in this regard for inclusion in this booklet. This compilation is intended for use by the entire scientific community and is available to anyone who is interested. Single copies can be obtained for personal use by contacting the ABRF business office by letter or FAX. The address is 9650 Rockville Pike, Bethesda, MD 20814-3998 and the FAX number is 301-530-7049. Parties interested in large quantities may also inquire through the business office.

Also, please note that it has not been possible to independently verify the information presented here. In many instances the entries come from personal observations of the contributors and there are no published references to provide rigorous documentation. When literature references have been provided, they are included. Therefore, this information is intended to be used only as a guide. Additional supporting analyses should be employed to verify the identity of any unknown residue. Anyone noting errors or conflicts is invited to send documentation.

It is recognized that this compilation is by no means complete. Anyone who has additional information is encouraged to submit their data for inclusion in subsequent editions. Those wishing to contribute may forward their material either to the ABRF business office or directly to the authors at the Department of Molecular Biology and Pharmacology, Campus Box 8103, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, Missouri 63110.

Finally, we would like to thank all those who sent contributions. Their names are included to acknowledge their contribution. Their time and efforts are greatly appreciated and it is hoped that you will find this a useful and worthwhile research tool. We also thank Edna Silvestri for excellent clerical assistance.

Suggestions For Using This Guide

This guide has been compiled as an aid to the initial investigation of the identity of an unknown peak in phenylthiohydantoin (PTH) amino acid chromatograms in automated Edman sequencing. The guide is arranged so that you may approach this through one of three routes:

1) Reference Chromatograms. Locate the approximate position of an unknown peak on the appropriate reference standard and refer to the respective listing of modified amino acids which generally elute in that area. Note that there is more than one chromatogram listed, each of which refers to a particular type, source of origin, or instrument.

2) Elution Tables. Tables of elution position have been supplied by some contributors. These overlap to some extent with the reference chromatograms and also contain information not included in the chromatograms. Estimate the unknown's elution time and look through the Tables for a possible match.

3) Index of Compounds. If you know or can hypothesize the identity of a modified amino acid that you expect to be present, you can use the master list which refers to the appropriate chromatogram or table.
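The first two routes both begin by placing the unknown peak between two neighboring standards. A minimal sketch of that step is given here in Python. The retention times in `STANDARDS` are hypothetical placeholders chosen only for illustration and are not taken from this guide; the function name `flanking_standards` is likewise an assumption. Substitute the times measured on your own sequencer-HPLC system.

```python
from bisect import bisect_left

# Hypothetical retention times (minutes) for a few PTH-amino acid standards.
# Placeholder values for illustration only; they are NOT from this guide.
STANDARDS = [
    ("Asp", 5.0),
    ("Ser", 8.0),
    ("Ala", 12.0),
    ("Tyr", 14.0),
    ("Pro", 17.0),
    ("Leu", 24.0),
]


def flanking_standards(unknown_time, standards=STANDARDS):
    """Return the (earlier, later) standard names that bracket unknown_time.

    None is returned on a side where no standard exists. A peak that
    coincides exactly with a standard is reported with that standard
    as the later neighbor.
    """
    ordered = sorted(standards, key=lambda item: item[1])
    times = [time for _, time in ordered]
    index = bisect_left(times, unknown_time)
    earlier = ordered[index - 1][0] if index > 0 else None
    later = ordered[index][0] if index < len(ordered) else None
    return earlier, later


if __name__ == "__main__":
    # An unknown peak at 13.0 min falls between the Ala and Tyr placeholders.
    print(flanking_standards(13.0))
```

The pair returned can then be compared with the compounds listed for that region of the appropriate reference standard. As with the guide itself, the result is only a starting point and does not identify the residue.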

Caution!

It is extremely important that you use this information only as a guide to a possible identity for your unknown and that you are aware of the approximate nature of the placement of the contributed modified amino acids on the reference standards and tables. Every effort has been made to present the information as accurately as possible, but no two sequencer-HPLC systems are exactly alike. A major change over the past year affecting Applied Biosystems users is the switch from user-adjusted Na-acetate buffers to the new "pre-mix" system. In general the PTH-amino acids elute in relatively the same area, but some significant shifts may occur. Other variables, such as initial conditions, gradient, column temperature, and mobile phase additives, can make it difficult to precisely correlate the elution position from one HPLC system to another. However, we believe this representation is sufficiently accurate that it will be useful.

All amino acids are the PTH derivative unless otherwise stated. PTH-amino acids are referenced with arabic numerals (see explanation of numbering convention on next page), contributors are referenced with upper case letters, and footnotes are designated by lower case roman numerals in italics as superscripts.

List of Abbreviations

Acetamidomethyl-cysteine                               ACM-Cys
Alanine                                                Ala, A
α-Aminobutyric Acid                                    Abu
Applied Biosystems Incorporated                        ABI
Arginine                                               Arg, R
Arginine (diallyloxycarbonyl)                          Arg(Aloc)2
Arginine (mesitylene-2-sulfonyl)                       Arg(Mts)
Arginine (4-methoxy-2,3,6-trimethylbenzenesulfonyl)    Arg(Mtr)
Arginine (4-toluenesulfonyl)                           Arg(Tos)
Asparagine                                             Asn, N
Aspartic Acid                                          Asp, D
Aspartic Acid (cyclohexyl)                             Asp(OcHex)
Aspartic Acid (O-allyl)                                Asp(OAl)
Aspartic Acid associated peak                          Asp'
Aspartic Acid (O-benzyl)                               Asp(OBzl)
Aspartic Acid (O-tert-butyl)                           Asp(OtBu)
Association of Biomolecular Resource Facilities        ABRF
S-Carboxamidomethyl-cysteine                           Cys(SCAM)
γ-Carboxyglutamic Acid                                 Gla
S-Carboxymethyl-cysteine                               Cys(SCM)
Citrulline                                             Cit
Cysteine                                               Cys, C
Cysteine (allyl)                                       Cys(Al)
Cysteine (allyloxycarbonyl)                            Cys(Aloc)
Cysteine (4-methoxybenzyl)                             Cys(4-CH3OBzl) or Cys(SMob)
Cysteine (4-methylbenzyl)                              Cys(4-CH3Bzl) or Cys(SMeb)
Cysteine (3-nitro-2-pyridylsulfenyl)                   Cys(Npys)
Cysteine (tert-butyl)                                  Cys(tBu)
Cystine                                                Cys2
N-dimethyl, N'-phenylthiourea                          DMPTU
O,O-dimethylphosphotyrosine                            Tyr(PO3Me2)

N,N'-diphenylthiourea                                  DPTU
N,N'-diphenylurea                                      DPU
Dithiothreitol                                         DTT
DTT adduct of dehydroalanine                           S'
DTT adduct(s) of dehydro-α-aminoisobutyric acid        T'
Fluorenylmethyloxycarbonyl                             Fmoc
Glutamic Acid                                          Glu, E
Glutamic Acid (O-allyl)                                Glu(OAl)
Glutamic Acid associated peak                          Glu'
Glutamic Acid (O-benzyl)                               Glu(OBzl)
Glutamic Acid (cyclohexyl)                             Glu(OcHex)
Glutamic Acid (O-9-fluorenylmethyl)                    Glu(OFm)
High Pressure Liquid Chromatography                    HPLC
Histidine                                              His, H
Histidine (allyloxycarbonyl)                           His(Aloc)
Histidine (3-benzyl)                                   His(3-Bzl)
Histidine (3-benzyloxymethyl)                          His(Bom)
Histidine (tert-butyloxymethyl)                        His(Bum)
Histidine (2,4-dinitrophenyl)                          His(Dnp)
5-Hydroxylysine                                        Hyl
Hydroxyproline                                         Hyp
Isoleucine                                             Ile, I
Lanthionine                                            Lan
Leucine                                                Leu, L
Lysine                                                 Lys, K
Lysine (allyloxycarbonyl)                              Lys(Aloc)
Lysine (chlorobenzyloxycarbonyl)                       Lys(ClZ)
Lysine (N-ε-dinitrophenyl)                             Lys(Dnp)
Lysine (N-ε-9-fluorenylmethyloxycarbonyl)              Lys(Fmoc)
Methionine                                             Met, M
1-Methylhistidine                                      His(1-Me)
3-Methylhistidine                                      His(3-Me)
Nitroarginine                                          Arg(NO2)
Norleucine                                             Nle
Norvaline                                              Nva
Ornithine                                              Orn

Ornithine (benzyloxycarbonyl)                          Orn(Z)
Phenylalanine                                          Phe, F
Phenylalanine (p-amino-benzyloxycarbonyl)              Phe(p-amino-Z)
Phenylisothiocyanate                                   PITC
Phenylthiocarbamyl                                     PTC
Phenylthiohydantoin                                    PTH
Phenylthiohydantoin amino acid                         PTH-aa
S-pyridylethyl cysteine                                PECys or Cys(SPE)
Proline                                                Pro, P
Serine                                                 Ser, S
Serine (allyloxycarbonyl)                              Ser(Aloc)
Serine (benzyl)                                        Ser(Bzl)
Solid phase peptide synthesis                          SPPS
Threonine                                              Thr, T
Threonine (allyloxycarbonyl)                           Thr(Aloc)
Threonine (benzyl)                                     Thr(Bzl)
Threonine (tert-butyl)                                 Thr(tBu)
Tryptophan                                             Trp, W
Tryptophan associated peak                             Trp'
Tryptophan (Nin-formyl)                                Trp(CHO)
Tyrosine                                               Tyr, Y
Tyrosine (allyl)                                       Tyr(Al)
Tyrosine (2-bromobenzyloxycarbonyl)                    Tyr(2BrZ)
Tyrosine (tert-butyl)                                  Tyr(tBu)
Valine                                                 Val, V

Explanation of Numbering Convention

Category

Modified PTH-aa on Applied Biosystems Instruments

Modified PTH-aa on Milligen/Biosearch Instruments

Phenylthiocarbamyl amino acids on Applied Biosystems Instruments

Side chain protected amino acids used in Boc SPPS on Applied Biosystems Instruments

Side chain protected amino acids used in Boc SPPS on Porton Instruments: 600-699

Side chain protected amino acids used in FMOC SPPS on Applied Biosystems Instruments: 800-899


Reference Standard I

Modified PTH Amino Acids on Applied Biosystems

[Chromatogram not reproduced.]

Modified Amino Acids for Reference Standard I

Applied Biosystems Instruments

PTH#  Name                                   Contributor             Footnote

1     γ-Carboxyglutamic Acid (Gla)           I, R                    i, ii
2     O-Fucosyl threonine                    K                       iii
3     S-Carboxymethyl cysteine               J, M, R
4     Homoserine                             M, R
5     S-Carboxamidomethyl cysteine           D, J
6     Carboxamidomethyl methionine           T
7     Methionine sulfone                     C, M, R, T
8     N-ε-Succinyl lysine                    B, D
9     5-Hydroxylysine derivative             R                       iv
10    Hydroxyproline (Hyp)                   B, D, F, G, I, J, R, T
11    N-ε-Acetyl lysine                      M, P, Q, T
12    Methyl histidine                       O, R
13    O-Methyl threonine                     M
14    Cystine                                O
15    O-Methyl glutamic acid                 B, U                    ii
16    N-ε-Methyl lysine                      B, C, J, M, P, Q, R
17    N-ε-Dimethyl lysine                    C, J, P, Q              v
18    N-ε-Trimethyl lysine                   C, E, J, P, Q           v
19    Canavanine                             C
20    α-Aminobutyric acid (Abu)              G
21    Methyl arginine                        R
22    S-Methyl cysteine                      M
23    DL-Homocystine                         M
24    5-Hydroxylysine (Hyl)                  D, K, R
25    Iodotyrosine                           C, T
26    α,γ-Diaminobutyric acid                I
27    Ornithine (Orn)                        R, T
28    O-Methyl tyrosine                      T
29    Lanthionine (Lan)                      R                       vi
30    Norleucine (Nle)                       C, M
31    p-Chlorophenylalanine                  T
32    Diiodotyrosine                         C

i In systems where Gla is efficiently extracted from the cartridge it runs as a broad peak immediately in front of Asp. A certain percentage (~5-10%) can decarboxylate to Glu, which is the only PTH-aa seen if Gla is not extracted from the cartridge.

ii Reference 35 shows Gla as di-O-methyl-Gla using a methylation procedure and a modified gradient. In addition, O-methyl Asp and O-methyl Glu are also described.

iii Successive Edman cycles result in deglycosylation.

iv Of three 5-Hydroxy-Lys contributors, this is the only one to indicate this peak.

v The methyl lysines (mono-, di-, and tri-) have proven to be particularly problematic in establishing their elution position. Different contributors have shown them eluting in basically two places. The majority show them eluting in a wide area between alanine and DPTU and also after leucine. In some instances, a contributor will indicate both positions, and in others, only one of the two positions. In an effort to resolve this discrepancy, we obtained samples of the mono-, di- and trimethyl lysines and ran them on a standard Applied Biosystems 477A sequencer simply by loading an aliquot into the reaction vessel and running a sequencer cycle. Both mono- and dimethyl lysine show both early and late peaks, while trimethyl lysine shows only the early peak. The elution order of the early peaks is mono- before di- before tri-, with a fairly limited range between Ala and Met. The late eluting peaks tend to co-elute just after Leu (and Nle). So, what is the explanation? Without chemical proof we can only speculate, but we offer the following possibility. Mono- and dimethyl lysine are alkyl amines that may be capable of becoming protonated and assuming a positive charge. Trimethyl lysine is a quaternary amine that is always positively charged. As such, the charge should cause them to run relatively early in the chromatogram and, like His and Arg, their elution position will probably be very sensitive to ionic strength. Hence, varying ionic strength from different systems may explain the wide variance reported in the elution positions of the early peaks. The late peaks may be due to a portion of the mono- and dimethyl lysine side chains reacting with PITC in a manner similar to that of lysine, since they retain a free pair of electrons on the nitrogen that can participate in nucleophilic attack on the PITC. In our hands, this appears to be a major reaction for monomethyl lysine and a minor reaction for dimethyl lysine, but others have reported variable ratios. This variability may be cycle dependent. Trimethyl lysine does not possess an unbonded pair of electrons and thus would not be expected to react with PITC at this position. Hence, we do not see a late eluting peak.

Again, it must be stressed that this is only a hypothesis and particular care must be taken in interpreting your results. However, the general behavior of the methylated lysines, whatever the reason, is well documented and should aid in their identification.

vi Also reports minor peaks between Ser and Gln and at dehydroalanine.

Reference Standard II

Modified PTH Amino Acids on Milligen/Biosearch

[Chromatogram not reproduced.]

Modified Amino Acids for Reference Standard II

Milligen/Biosearch Instruments

PTH#  Name                 Contributor  Footnote

200   Hydroxy-Pro (Hyp)    L
201   O-Phospho-Ser        L            i

i Both Ser and O-Phospho-Ser are converted to dehydroalanine.

Reference Standard III

PTC Amino Acids on Applied Biosystems


PTC Amino Acids from Reference Standard III

Applied Biosystems Instruments

PTH# Name Contributor

D

R

R

F, R

R

F, R

R


Reference Standard IV

Sequencing Artifacts

[Chromatogram not reproduced. Labeled peaks: Unknown at Phe/Ile, Coomassie Blue, Trp associated peak, Glu associated peak, Asp associated peak, R4 contamination, Methanol/PITC, Thr associated peaks, Ser associated peaks, Tris, Unknown at Asn/Ser, Aniline.]

Sequencing Artifacts or Associated Derivatives of

Normal Amino Acids on Applied Biosystems

Reference Standard IV

PTH# Name Contributor

i Probably derived from PITC as a consequence of high residual acid during sample loading.

ii Broad peak between Asn and Ser occasionally seen in very high sensitivity runs.

iii Only on instruments using methanolic conversion. Results from reaction between methanol and PITC carried into the flask with S3.

iv Unknown contaminant. Peak results from heating and drying in the conversion flask.

v Unidentified derivatives associated with Asp or Glu in addition to PTH-Asp and PTH-Glu respectively.

vi Unidentified derivative of Trp often seen as the major peak.

vii Sharp peak often seen between Phe and Ile.

viii Consists of peaks at 1) Ser and Ser", 2) between Ser' doublet and Pro, and 3) between Trp and Phe.

ix Peroxides in R4 may degrade Lys.

Reference Standard V

Side-chain Protected Amino Acids Used in Boc SPPS on Applied Biosystems

[Chromatogram not reproduced; peaks are labeled 500 through 511.]

Side Chain Protected Amino Acids used in Boc SPPS

Applied Biosystems Instruments

Reference Standard V i

PTH#  Name                                        Contributor  Footnote

500   Acetamidomethyl cysteine (ACM-Cys)          J
501   Arg-p-toluenesulfonyl (Tos)                 N
502   Trp-Nin-formyl (CHO)                        N
503   Ser-benzyl (Bzl)                            N
504   Arg-mesitylene-2-sulfonyl (Mts)
505   Asp-O-benzyl (OBzl)                         N
506   Thr-benzyl (Bzl)                            N
507   Glu-O-benzyl (OBzl)                         N
508   Cys-4-methoxybenzyl (4-CH3OBzl)             N
509   Lys-chlorobenzyloxycarbonyl (ClZ)                        ii
510   Cys-4-methylbenzyl (4-CH3Bzl)
511   Tyr-2-bromobenzyloxycarbonyl (2BrZ)         N

i Standard chromatogram provided by Michael Kochersperger of Applied Biosystems. See reference 23.

ii Lys-(2ClZ) is more commonly used given its greater stability in long syntheses and it co-elutes with Lys-(ClZ).

Reference Standard VI

Side-chain Protected Amino Acids Used in Boc SPPS on Porton Instruments

[Chromatogram not reproduced; labeled peaks include 600 through 604.]

Side Chain Protected Amino Acids used in Boc SPPS

Porton Instruments

Reference Standard VI i

PTH# Name Contributor

i Chromatogram provided by Audree Fowler.

Reference Standard VII

Side-chain Protected Amino Acids Used in FMOC SPPS on Applied Biosystems

[Chromatogram not reproduced; labeled peaks include 800, 801, and 818 through 821.]

Side chain Protected Amino Acids used in FMOC Synthesis

Applied Biosystems Instruments

Reference Standard VII i, ii

PTH# Name Contributor

i Standard chromatogram provided by Michael Kochersperger of Applied Biosystems. See reference 23.

ii Locations are approximate. Originally done with different gradient conditions and are now represented on a typical resin bound sequencing standard (ABI "rez" cycle).

iii Side chain deprotection occurs during conversion.

iv Byproduct resulting from conversion.

v Side chain deprotection accumulates during repeated Edman cycles.

Tables

The following tables of elution positions were submitted already assembled and are reproduced here essentially as received.

TABLE I

Orn, ornithine; Trp', unidentified derivative of PTH-Trp;

TABLE II

3 me his
ala
arg
asn
asp
cys
di iodo his
di iodo tyr
dimethyl lys
gln
glu
gly
his
hydroxy pro
ile
leu
lys
met
mono iodo his
mono iodo tyr
mono methyl lys
phe
phos-ser
phos-tyr
pro
ser
thr
trp
tyr
val

after his about 1.5'//>A (broad shape)
H//X>Y 0.8'
ala<X<ser prime//approx near methyl his//H
D<X<N(nearer to N)//D(deamidated N)//S'
M
R<X<Y//S'//H//E//G(Q/E ratio approx =, not as in a Q
call)//T//P//R//X<H(S-β-propionamido Cys [acrylamide mod.])
//Y//X<P 1.5'
>his 3'
2'>nleu
DMPTU<X<H
E(deamidated Q)//G(ptcE?)//X>A 1'
G(or slightly after=ptc E)//dptu<X<W//Y or Y<X<P
X<D 1'(ptcG)
I//A<X<R(me his)
1'>ala ptc hPro 1'<his
X>nleu
tyr<X<pro(1/2 way)//M//V<X<dptu//X>W(close to W)
Q<X<T//(S)//X<P(metO?)//dptu<X<W 0.5'>dptu
>his 2'
X<W 0.2'
X>L
dptu//dptu<X<W//H?
ser/ser'=2.5;//ser/ser"=0.8;//ser
approx. 3.5' into run (w/DTT et al.) need nMol amts.
Y<X<P(ptc P)
D(approx 50% of S)//X<Y(1'<Y,=S')//S"=1'>Y//V<X<dptu//
N//Q//G//approx M//ser/ser'=17.9; ser/ser"=5.4
Q//X<P 1.5'(T")//X>dmptu(close to E)//X<F(close)// approx M
V<X<dptu 1'//F<X,X<K//X<P 0.8'
ala<X<tyr(1/2 way)
dptu<X<trp//X<pro(0.2')

Example Translation of Shorthand

Asn- D<X<N(nearer to N)//D(deamidated N)//S'<X<R(?)//M(?) =

•Asn may/will have an unknown (X) peak eluting between Asp and Asn, and it is nearer to Asn than it is to Asp. (X = the unusual/unknown peak)

•A peak may/will appear at Asp which is deamidated Asn.

•A peak may(?) appear between Ser' and Arg. (? means this is not verified by more than one run.)

•A peak may arise at the Met location. This is not verified by more than one run.

Gln- E(deamidated Q)//G(ptcE?)//X>A 1' =

•A peak may/will appear at Glu. This is deamidated Gln.

•A peak may/will appear at Gly, possibly PTC Glu.

•An unknown (X) peak may/will arise after Ala by about 1 minute.
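The shorthand translated above can also be split mechanically into its separate observations. The following Python sketch is one possible reading of the notation and is not part of the contributed table: it assumes that `//` separates observations, that a trailing parenthetical is a free-text note, that `(?)` marks an observation not verified by more than one run, and that any observation containing `X` with `<` or `>` describes an unknown peak positioned relative to named standards. The function name `parse_shorthand` and the field names are hypothetical.

```python
import re

# Matches a parenthetical note at the very end of one observation.
TRAILING_NOTE = re.compile(r"\(([^()]*)\)\s*$")


def parse_shorthand(entry):
    """Split a shorthand entry into one dictionary per '//'-separated observation."""
    observations = []
    for segment in entry.split("//"):
        segment = segment.strip()
        if not segment:
            continue  # skip empty pieces left by a trailing '//'
        match = TRAILING_NOTE.search(segment)
        note = match.group(1) if match else None
        position = segment[: match.start()].strip() if match else segment
        # 'X' together with '<' or '>' is read as an unknown peak located
        # relative to standards; anything else as a peak at a named position.
        is_unknown = "X" in position and re.search(r"[<>]", position) is not None
        observations.append(
            {
                "position": position,
                "kind": "unknown_peak" if is_unknown else "peak_at_position",
                "note": note,
                "unverified": note == "?",
            }
        )
    return observations


if __name__ == "__main__":
    for item in parse_shorthand("D<X<N(nearer to N)//D(deamidated N)//S'<X<R(?)//M(?)"):
        print(item)
```

Applied to the Asn entry, this yields four observations, matching the four bullet points of the translation. Entries whose notes sit in the middle of an observation, or whose lines were wrapped in the table, would need manual review.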


Index of Compounds (Roman numeral refers to chromatogram number)

Acetamidomethyl-cysteine                                V, VII, Table I, Ref. 17
N-ε-Acetyl-lysine                                       I, Table I, Ref. 26
Alanine                                                 All, Tables I & II
α-Aminobutyric Acid                                     I, Table I, Ref. 17
Aniline                                                 IV
Arginine                                                All, Tables I & II
Arginine (diallyloxycarbonyl)                           VII, Ref. 16
Arginine (diallyloxycarbonyl) derivative                VII, Ref. 16
Arginine (Mesitylene-2-sulfonyl)                        V, Table I
Arginine (4-Methoxy-2,3,6-trimethylbenzenesulfonyl)     VII, Table I
Arginine (4-Toluenesulfonyl)                            V, Table I
Asparagine                                              All, Tables I & II
Aspartic Acid                                           All, Tables I & II
Aspartic Acid (cyclohexyl)                              Table I
Aspartic Acid (O-allyl)                                 VII, Ref. 16
Aspartic Acid associated peak                           IV
Aspartic Acid (O-benzyl)                                V, VI, Table I, Ref. 23
Aspartic Acid (O-tert-butyl)                            VII, Ref. 23
4-O-Benzylhydroxyproline                                Table I
Biotinylated lysine                                     Table I, Ref. 34
Canavanine                                              I
S-Carboxamidomethyl-cysteine                            I
Carboxamidomethyl-methionine                            I
γ-Carboxyglutamic Acid                                  I, Ref. 35
S-Carboxymethyl-cysteine                                I
p-Chloro-phenylalanine                                  I
Citrulline                                              Table I
Coomassie Blue                                          IV, Ref. 36
β-Cyclohexylalanine                                     Table I
Cysteine                                                Table II
Cysteine (allyl)                                        , Ref. 16
Cysteine (allyloxycarbonyl)                             , Ref. 16
Cysteine (4-methoxybenzyl)                              V, Table I

Cysteine (4-methylbenzyl)                               V, Table I
Cysteine (3-nitro-2-pyridylsulfenyl)                    VII, Ref. 17
Cystine                                                 I, Ref. 15
Dehydroalanine                                          Table I
Dehydro-α-aminoisobutyric acid                          Table I
3,4-Dehydroproline                                      Table I
α,γ-Diaminobutyric acid                                 I
O-(2,6-dichlorobenzyl)-tyrosine                         Table I
3-(2,6-dichlorobenzyl)-tyrosine                         Table I
N-ε-(2,3-dihydroxypropyl)-lysine                        Ref. 25
Diiodohistidine                                         Table II
Diiodotyrosine                                          I, Table II
Dimethyllysine                                          I, Table II, Ref. 26
N-dimethyl, N'-phenylthiourea                           All, Table I, Ref. 36
O,O-dimethylphosphotyrosine                             Table I
N,N'-diphenylthiourea                                   All, Table I, Ref. 36
N,N'-diphenylurea                                       Table I, Ref. 36
Dithiothreitol                                          All, Ref. 36
DTT adduct of dehydroalanine                            All, Table I
DTT adduct(s) of dehydro-α-aminoisobutyric acid         IV, Table I
O-Fucosylthreonine                                      I, Ref. 19
Glutamic Acid                                           All, Table I, Table II
Glutamic Acid (O-allyl)                                 VII, Ref. 16
Glutamic Acid (O-allyl) derivative                      VII, Ref. 16
Glutamic Acid associated peak                           IV
Glutamic Acid (O-benzyl)                                V, Table I, Ref. 23
Glutamic Acid (cyclohexyl)                              Table I
Glutamic Acid (O-9-Fluorenylmethyl)                     Table I
Glutamine                                               All, Table I, Table II
Glycine                                                 All, Table I, Table II
Histidine                                               All, Table I, Table II
Histidine (allyloxycarbonyl)                            VII, Ref. 16
Histidine azobenzene arsonate                           Ref. 33
Histidine (3-benzyl)                                    Table I
Histidine (3-benzyloxymethyl)                           Table I
Histidine (tert-butyloxymethyl)                         , Ref. 17

Histidine (2,4-dinitrophenyl)                           Table I
Homoarginine                                            Table I
DL-Homocystine                                          I
Homophenylalanine                                       Table I
Homoserine                                              I, Table I
5-Hydroxylysine                                         I, Ref. 30
5-Hydroxylysine derivative                              I, Ref. 30
Hydroxyproline                                          I, Ref. 30
N-γ-hydroxyethyl-glutamine                              Table I
Iodotyrosine                                            I, Table II
Isoleucine                                              All, Table I, Table II
Lanthionine                                             I, Ref. 30
Leucine                                                 All, Table I, Table II
Lysine                                                  All, Table I, Table II
Lysine (allyloxycarbonyl)                               VII, Ref. 16
Lysine azobenzene arsonate                              Ref. 33
Lysine (N-ε-Chlorobenzyloxycarbonyl)                    V, Table I
Lysine (N-ε-dinitrophenyl)                              Table I
Lysine (N-ε-9-Fluorenylmethyloxycarbonyl)               Table I
Methanol/PITC conversion artifact                       IV, Ref. 36
Methionine                                              All, Table I, Table II
Methionine sulfone                                      I, Ref. 30
N-α-methylalanine                                       Table I
Methylarginine                                          I, Ref. 30
S-methylcysteine                                        I
O-methyl glutamic acid                                  I, Ref. 35
N-γ-methyl glutamine                                    Table I
Methylhistidine                                         I, Ref. 30
1-Methylhistidine                                       Table I
3-Methylhistidine                                       Table I
N-ε-methyllysine                                        I, Table II, Ref. 14, 26, 30
N-α-methylphenylalanine                                 Table I
O-Methylthreonine                                       I
O-Methyltyrosine                                        I
Naphthylalanine                                         VI, Table I
Nitroarginine                                           Table I

p-Nitrophenylalanine Table I

3-Nitrotyrosine Ref. 24

Norleucine I, Table I

Norvaline Table I

Ornithine I

Ornithine (benzyloxycarbonyl) Table I

Phenylalanine All, Table I, Table II

Phenylalanine (p-amino-benzyloxycarbonyl) Table I

Phenylalanine (p-amino-benzyloxycarbonyl) prime Table I

Phenylthiocarbamyl alanine III, Ref. 30

Phenylthiocarbamyl glycine III

Phenylthiocarbamyl isoleucine III, Ref. 30

Phenylthiocarbamyl leucine III, Ref. 30

Phenylthiocarbamyl lysine III, Table I, Ref. 30

Phenylthiocarbamyl methionine III, Ref. 30

Phenylthiocarbamyl valine III, Ref. 30

O-phosphoserine II, Table II

O-phosphotyrosine Table I, Table II

β-(3-pyridyl)alanine Table I

S-Pyridylethyl cysteine I, III, IV

Proline All, Table I, Table II

R4 contamination peak IV

Resumption of interrupted sequence artifacts IV

Serine All, Table I, Table II

Serine (allyloxycarbonyl) VII, Ref. 16

Serine (allyloxycarbonyl) derivative VII, Ref. 16

Serine associated peaks (Ser') IV, Table I

Ser (benzyl) V, VI

N-ε-succinyl lysine I, Ref. 14

Threonine All, Table I, Table II

Threonine (allyloxycarbonyl) VII, Ref. 16

Threonine associated peaks (Thr') IV, Table I

Threonine (benzyl) V

Threonine (benzyl) prime V

Threonine (tert-butyl) VII

N-ε-trimethyl lysine I, Ref. 26


Tris artifact IV, Ref. 36

Tryptophan All, Table I, Table II

Tryptophan associated peak IV, VII, Table I

Tryptophan (N^-formyl) V, Table I

Tyrosine All, Table I, Table II

Tyrosine (allyl) VII, Ref. 16

Tyrosine azobenzene arsonate Ref. 33

Tyrosine (2-bromobenzyloxycarbonyl) V, VI, Table I

Tyrosine (tert-butyl) VII

Unknown at Asparagine/Serine IV

Unknown at Phenylalanine/Isoleucine IV

Valine All, Table I, Table II

Note: "All" indicates that the compound is represented in each of the reference standards.


LIST OF CONTRIBUTORS

Instrument

A. Andersen, Thomas T. Porton

Dept Biochem & Mol Biol A-10

Albany Med Col

Protein Chemistry Core Facility

Albany NY 12208

B. Barra, Donatella Applied Biosystems

Dept di Scienze Biochimiche

Univ La Sapienza

Piazzale Aldo Moro 5

00185 Rome, Italy

C. Beach, Carol M. Applied Biosystems

Dept of Biochemistry

Univ of Kentucky

Chandler Medical Center

Lexington KY 40536-0084

D. Cook, Richard F. Applied Biosystems

MIT

E17-310

Cambridge MA 02139

E. Crimmins, Dan L. Applied Biosystems

Howard Hughes Medical Inst.

Washington Univ Sch of Med

660 South Euclid - PO Box 8022

St. Louis MO 63110

F. Dorwin, Sarah A. Applied Biosystems

D-93D/AP-9A-Corporate Mol Biol

Abbott Laboratories

One Abbott Park Road

Abbott Park IL 60064-3500

G. Fields, Gregg B. Applied Biosystems

Dept Lab Medicine & Pathology

Univ of Minnesota

420 Delaware St. SE - Box 107

Minneapolis MN 55455-0392


Instrument

H. Fowler, Audree V. Porton

Dept of Biological Chemistry

UCLA Sch of Med

Los Angeles CA 90024-1737

I. Gaathon, Ariel Applied Biosystems

Bletterman Macromol Res Lab

Interdepartment Equipment Div

P.O. Box 1172

Jerusalem 91010 Israel

J. Grant, Gregory A. Applied Biosystems

Crankshaw, Mark W.

Dept Molec Biology & Pharmacol

Washington Univ Sch of Med

Campus Box 8103

St Louis, MO 63110

K. Harris, Reed J. Applied Biosystems

#62

Genentech Inc

460 Point San Bruno Blvd

So San Francisco CA 94080

L. Hoffman, Donald R. Milligen/Biosearch

Dept of Pathology & Lab Med

East Carolina Univ Sch of Med

7S-10 Brody Sciences Bldg

Greenville NC 27858-3254

M. Hoogerheide, John G. Applied Biosystems

1140-230-3

The Upjohn Co

7000 Portage Road

Kalamazoo MI 49001-0199

N. Kochersperger, Michael Applied Biosystems

Applied Biosystems

850 Lincoln Centre Drive

Foster City CA 94404

O. Lane, William S. Applied Biosystems

Microchemistry Facility

Harvard Univ

16 Divinity Avenue

Cambridge MA 02138


Instrument

P. Mandel, Lydia C. Applied Biosystems

Molec Biology Core Facility

Univ Missouri Sch Bas Life Sci

5100 Rockhill Road - 109-BSB

Kansas City MO 64110-2499

Q. Niece, Ronald L. Applied Biosystems

Biotechnology Center

Univ of Wisconsin

1710 University Avenue

Madison WI 53705

R. Paroutaud, P.S. Applied Biosystems

Applied Biosystems S.A.R.L.

13 Rue de la Perdrix, BP 50086

Z.A.C. Paris Nord II

95948 Roissy Charles de Gaulle Cedex, France

S. Pohl, Jan Applied Biosystems

Microchemical Facility - Rm. 5220

Emory Univ

1327 Clifton Road NE

Atlanta GA 30322

T. Siegel, Ned R. Applied Biosystems

Smith, Christine

Biological Sciences - AA21

Monsanto Co

700 Chesterfield Pkwy North

Chesterfield MO 63198

U. Williamson, Matthew K. Applied Biosystems

Dept of Biology-0322

University of California-San Diego

9500 Gilman Drive

La Jolla CA 92093-0322


SUGGESTED REFERENCES

(* indicates a reference included by a contributor)

Books

1. Sequencing of Proteins and Peptides. Allen, G., Elsevier, 1981.

2. Techniques in Protein Chemistry III. Angeletti, R.H., Ed., Academic Press, 1992.

3. Techniques in Protein Chemistry IV. Angeletti, R.H., Ed., Academic Press, 1993.

4. Practical Protein Chemistry - A Handbook. Darbre, A., Ed., John Wiley and Sons, 1986.

5. Methods in Protein Sequence Analysis. Elzinga, M., Ed., Humana Press, 1982.

6. Protein Sequencing - A Practical Approach. Findlay, J.B.C. and Geisow, M.J., IRL Press, 1989.

7. Handbook of HPLC for the Separation of Amino Acids, Peptides and Proteins. Hancock, W.S., Ed., CRC Press, Vol. I & II, 1984.

8. Techniques in Protein Chemistry I. Hugli, T.E., Ed., Academic Press, 1989.

9. High-Performance Liquid Chromatography of Peptides and Proteins: Separation, Analysis and Conformation. Mant, C.T. and Hodges, R.S., CRC Press, 1991.

10. Methods of Protein Microcharacterization - A Practical Handbook. Shively, J.E., Ed., Humana Press, 1986.

11. Post-Translation Modifications of Proteins. Tuboi, S., Taniguchi, CRC Press, 1993.

12. Current Research in Protein Chemistry: Techniques, Structure, and Function. Villafranca, J.J., Ed., Academic Press, 1990.

13. Techniques in Protein Chemistry II. Villafranca, J.J., Ed., Academic Press, 1989.

Articles

14.* The protein sequence of glutamate dehydrogenase from Sulfolobus solfataricus, a thermoacidophilic archaebacterium, Barra, D., Eur. J. Biochem., 203, 81-87, 1992.


15.* Complete assignment of neurophysin disulfides indicates pairing in two separate domains, Burman, S., Wellner, D., Chait, B., Chaudhary, T., Breslow, E., Proc. Natl. Acad. Sci. USA, 86, 429-433, 1989.

16.* The Development of High-Performance Liquid Chromatographic Analysis of Allyl and Allyloxycarbonyl Side-Chain-Protected Phenylthiohydantoin Amino Acids, Fields, C.G., Loffet, A., Kates, S.A., and Fields, G.B., Anal. Biochem., 203, 245-251, 1992.

17.* Edman Degradation Sequence Analysis of Resin-Bound Peptides Synthesized by 9-Fluorenylmethoxycarbonyl Chemistry, Fields, C.G., VanDrisse, V. and Fields, G.B., Pep. Res., 6, 39-46, 1993.

18. Solvent system for the rapid identification of phenylthiohydantoin derivatives of amino acids by high-performance liquid chromatography, Fonck, C., Frutiger, S., and Hughes, G.J., J. Chromatogr., 370-2, 339-343, 1986.

19.* O-Linked Fucose Is Present in the First Epidermal Growth Factor Domain of Factor XII but Not Protein C, Harris, R.J., J. Biol. Chem., 267-8, 5102-5107, 1992.

20. Microsequence Analysis of Peptides and Proteins: II. Separation of Amino Acid Phenylthiohydantoin Derivatives by High-Performance Liquid Chromatography on Octadecylsilane Supports, Hawke, D., Yuan, P-M., and Shively, J.E., Anal. Biochem., 120, 302-311, 1982.

21. Isocratic separation of phenylthiohydantoin-amino acids by reversed-phase high-performance liquid chromatography, Hayakawa, K. and Oizumi, J., J. Chromatogr., 487-1, 161-166, 1989.

22. Instability of phenylthiohydantoin amino acids, Jansen, E.H. & Both-Miedema, R., J. Chromatogr., 435-2, 363-367, 1988.

23.* Sequencing of peptides on solid phase supports, Kochersperger, M.L., Blacher, R., Kelly, P., Pierce, L., and Hawke, D.H., American Biotechnology Laboratory, 1989.

24. Preparation and characterization of 5-(4-hydroxy-3-nitrobenzyl)-3-phenyl-2-thiohydantoin, the phenylthiohydantoin derivative of 3-nitrotyrosine, Lilova, A., Kleinschmidt, T., Nedkov, P., and Braunitzer, G., Biol. Chem. Hoppe Seyler, 367-10, 1055-9, 1986.

25. Preparation and characterization of N epsilon-(2,3-dihydroxypropyl)-L-lysine and its phenylthiohydantoin derivative, Lilova, A., Biol. Chem. Hoppe Seyler, 368-11, 1489-1493, 1987.

26.* Sequence Analysis of Acetylation and Methylation in Two Histone H3 Variants of Alfalfa, Mandel, L.C., J. Biol. Chem., 265-28, 17157-17161, 1990.

27. Separation of amino acid phenylthiohydantoin derivatives by high-pressure liquid chromatography, Meuth, J.L. and Fox, J.L., Anal. Biochem., 154, 478-84, 1986.


28. High-sensitivity phenylthiohydantoin amino acid analysis on-line to a gas phase protein sequencer, Murphy, R., J. Chromatogr., 408, 388-392, 1987.

29. Retention behaviour of phenylthiohydantoin amino acids in micro high-performance liquid chromatography with octadecyl bonded glasses and silicas, Okamoto, M., J. Chromatogr., 396, 345-349, 1987.

30.* Unpublished Studies on Unusual and Post Translational Modified Amino Acids (available upon request), and poster reprint from Protein Society Meeting, San Diego, CA, July 1993 (will be available in future), by Paroutaud, P.S. Contact Ruth Steinbrich, Applied Biosystems USA, 1-800-874-9868.

31. An optimized procedure for the separation of amino acid phenylthiohydantoins by reversed-phase HPLC, Persson, B. and Eaker, D., J. Biochem. Biophys. Methods, 21-4, 341-350, 1990.

32. Analysis of phenylthiohydantoin amino acid mixtures for sequencing by thermospray liquid chromatography/mass spectrometry, Pramanik, B.C., Hinton, S.M., Millington, D.S., Dourdeville, T.A., and Slaughter, C.A., Anal. Biochem., 175-1, 305-318, 1988.

33. Protein modification by diazotized arsanilic acid: synthesis and characterization of the phenylthiohydantoin derivative of azobenzene arsonate-coupled tyrosine, histidine, and lysine residues and their sequential allotment in labeled peptides, Schwaller, B. & Sigrist, H., Anal. Biochem., 177-1, 183-187, 1989.

34. Biotinylated peptides/proteins. I. Identification of biotinylated lysyl phenylthiohydantoins, Smith, J.S., Anal. Biochem., 197-1, 254-257, 1991.

35.* Direct Identification of γ-Carboxyglutamic Acid in the Sequencing of Vitamin K Dependent Proteins, Williamson, M.K., Anal. Biochem., 199, 93-97, 1991.

Bulletins

36. Artifact Peaks In HPLC Analysis of PTH Amino Acids, User Bulletin #5, Applied Biosystems, 1984.

37. Sequence Analysis of Synthetic, Side-Chain Protected, Resin-Bound Peptides, User Bulletin #13, Applied Biosystems, 1985.

38. PTH Amino Acid Analysis, Hunkapiller MW, Applied Biosystems, 1985.
