Codon usage bias
Ref: Chapter 9
Xuhua Xia
[email protected]
http://dambe.bio.uottawa.ca
-
Objectives
Understand how codon usage bias affects translation efficiency and gene expression
Biomedical relevance
  Protein drugs in the pharmaceutical industry
  Transgenic experiments in agriculture
Factors affecting codon usage bias
Indices measuring codon usage bias
Develop bioinformatic skills to study genomic codon usage
-
Codon Usage Bias
Observation: Strongly biased codon usage in a variety of genomes, ranging from viruses, mitochondria, and plastids to prokaryotes and eukaryotes.
Hypotheses:
  Differential mutation hypothesis, e.g., the transcriptional hypothesis of codon usage (Xia 1996 Genetics 144:1309-1320)
  Differential selection hypothesis, e.g., (Xia 1998 Genetics 149:37-44)
Predictions:
  From the mutation hypothesis: concordance between codon usage and mutation pressure.
  From the selection hypothesis: concordance between differential availability of tRNA and differential codon usage. The concordance is stronger in highly expressed genes than in lowly expressed genes (CAI is positively correlated with gene expression).
-
Table 9-2, yeast
Xia 2007. Bioinformatics and the cell.
AA(1)  Codon(2)  T(3)  w(4)   F(5)
Arg    AGA       11    1      314
Arg    AGG       1     0.091  1
Asn    AAC       10    1      208
Asn    AAU       0     0      11
Asp    GAC       16    1      202
Asp    GAU       0     0      112
Cys    UGC       4     1      3
Cys    UGU       0     0      39
Gln    CAA       9     1      153
Gln    CAG       1     0.111  1
Glu    GAA       14    1      305
Glu    GAG       2     0.143  5
His    CAC       7     1      102
His    CAU       0     0      25
Leu    UUA       7     0.7    42
Leu    UUG       10    1      359
Lys    AAA       7     0.5    65
Lys    AAG       14    1      483
Phe    UUC       10    1      168
Phe    UUU       0     0      19
Ser    AGC       2     1      6
Ser    AGU       0     0      4
Tyr    UAC       8     1      141
Tyr    UAU       0     0      10
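The w column of Table 9-2 appears to equal T divided by the largest T within each amino acid (for example 1/11 = 0.091 for AGG). The sketch below reproduces that and counts the families in which the most-used codon is also the one with the largest T. It assumes that T is a tRNA gene count and F a codon count, which the slide does not state; the function names are illustrative.

```python
# Rows copied from Table 9-2: (amino acid, codon, T, F).
# Assumption: T is a tRNA gene count and F a codon count.
ROWS = [
    ("Arg", "AGA", 11, 314), ("Arg", "AGG", 1, 1),
    ("Asn", "AAC", 10, 208), ("Asn", "AAU", 0, 11),
    ("Asp", "GAC", 16, 202), ("Asp", "GAU", 0, 112),
    ("Cys", "UGC", 4, 3), ("Cys", "UGU", 0, 39),
    ("Gln", "CAA", 9, 153), ("Gln", "CAG", 1, 1),
    ("Glu", "GAA", 14, 305), ("Glu", "GAG", 2, 5),
    ("His", "CAC", 7, 102), ("His", "CAU", 0, 25),
    ("Leu", "UUA", 7, 42), ("Leu", "UUG", 10, 359),
    ("Lys", "AAA", 7, 65), ("Lys", "AAG", 14, 483),
    ("Phe", "UUC", 10, 168), ("Phe", "UUU", 0, 19),
    ("Ser", "AGC", 2, 6), ("Ser", "AGU", 0, 4),
    ("Tyr", "UAC", 8, 141), ("Tyr", "UAU", 0, 10),
]

def relative_w(rows):
    """Return {codon: T / (largest T among the amino acid's codons)}."""
    t_max = {}
    for aa, _, t, _ in rows:
        t_max[aa] = max(t_max.get(aa, 0), t)
    return {codon: t / t_max[aa] for aa, codon, t, _ in rows}

def concordant_families(rows):
    """Amino acids whose most-used codon also has the largest T."""
    best_t, best_f = {}, {}
    for aa, codon, t, f in rows:
        if aa not in best_t or t > best_t[aa][0]:
            best_t[aa] = (t, codon)
        if aa not in best_f or f > best_f[aa][0]:
            best_f[aa] = (f, codon)
    return [aa for aa in best_t if best_t[aa][1] == best_f[aa][1]]

w = relative_w(ROWS)
agree = concordant_families(ROWS)
print(round(w["AGG"], 3), w["UUA"], w["AAA"])
print(len(agree), "concordant families out of", len({aa for aa, *_ in ROWS}))
```

In these rows Cys is the only family where the two rankings disagree.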
-
Conflict: Initiation and Elongation
Met codon usage from the 12 CDSs: AUA 214, AUG 37
Possible tRNAs: tRNAMet/CAU, tRNAMet/UAU
The vertebrate mitochondrial genome has only one tRNAMet. Which one to have?
  tRNAMet/CAU: Good for initiation, but not efficient for AUA codons even with the C modified to 5-formylcytidine
  tRNAMet/UAU: Good for AUA codons, but not good for initiation
Anticodon CAU favors the AUG codon.
Nature has chosen CAU: all mitochondrial genomes with a single tRNAMet have a CAU anticodon.
Problem with AUA codons in translation?
Xia et al. 2007. PLoS One
-
Hypothesis and Predictions
AUA codons (Met family): favoured by mutation, but not by tRNA-mediated selection, because the first (wobble) position of the tRNA anticodon is C.
A-ending codons in other R-ending codon families: favoured by mutation, and also favoured by tRNA-mediated selection, because the first (wobble) position of the tRNA anticodon is U.
Predictions:
1. The proportion of A-ending codons (or their RSCU) should be smaller in the Met codon family than in other R-ending codon families, where $P_{NNA} = N_{NNA} / (N_{NNA} + N_{NNG})$.
2. Availability of tRNAMet/UAU should increase $P_{AUA}$.
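A minimal sketch of the proportion used in Prediction 1, taking P_NNA as the A-ending count divided by the sum of the A-ending and G-ending counts. The counts are the AUA/AUG and UUA/UUG counts listed in this deck for Aplidium conicum mtDNA (NC_013584); the function name is illustrative.

```python
def p_nna(n_nna, n_nng):
    """Proportion of A-ending codons in a two-codon R-ending family."""
    total = n_nna + n_nng
    if total == 0:
        raise ValueError("codon family is absent from the sequence")
    return n_nna / total

# Aplidium conicum: AUA = 161, AUG = 57, UUA = 353, UUG = 103
p_aua = p_nna(161, 57)
p_uua = p_nna(353, 103)
print(round(p_aua, 4), round(p_uua, 4))
```

Both values agree with the P_AUA and P_UUA reported for this species in the deck's data for Fig. 5.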
-
Selection against AUA codons
Carullo, M. and Xia, X. 2008. J Mol Evol 66:484-493.
                   Met    Leu    Glu    Lys    Gln    Arg    Trp
Species            AUA    UUA    GAA    AAA    CAA    AGA    UGA
A. gossypii        1.473  1.993  1.826  1.852  1.917  2      2
C. glabrata        1.043  1.995  2.000  1.938  1.889  2      2
K. thermotolerans  0.556  1.973  1.910  1.948  1.945  2      1.967
S. cerevisiae      1.140  1.969  1.800  1.883  1.794  1.947  1.908
S. castelli        1.299  1.994  1.891  1.981  1.969  2      1.918
S. servazzii       1.321  1.931  1.702  1.824  1.841  1.959  2
Y. lipolytica      1.440  1.968  1.536  1.859  1.963  1.922  1.882
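As a quick check of Prediction 1 against the table above, the sketch below asks, species by species, whether the AUA value is lower than the value of every other A-ending codon. It assumes the tabulated numbers are RSCU values, which the slide implies but does not label.

```python
# Values per species in the column order AUA, UUA, GAA, AAA, CAA, AGA, UGA.
# Assumption: the numbers are RSCU values of the A-ending codon.
RSCU_A = {
    "A. gossypii": [1.473, 1.993, 1.826, 1.852, 1.917, 2, 2],
    "C. glabrata": [1.043, 1.995, 2.000, 1.938, 1.889, 2, 2],
    "K. thermotolerans": [0.556, 1.973, 1.910, 1.948, 1.945, 2, 1.967],
    "S. cerevisiae": [1.140, 1.969, 1.800, 1.883, 1.794, 1.947, 1.908],
    "S. castelli": [1.299, 1.994, 1.891, 1.981, 1.969, 2, 1.918],
    "S. servazzii": [1.321, 1.931, 1.702, 1.824, 1.841, 1.959, 2],
    "Y. lipolytica": [1.440, 1.968, 1.536, 1.859, 1.963, 1.922, 1.882],
}

def aua_is_lowest(values):
    """True if the first value (AUA) is below all other families."""
    return values[0] < min(values[1:])

supported = [sp for sp, vals in RSCU_A.items() if aua_is_lowest(vals)]
print(len(supported), "of", len(RSCU_A), "species fit the prediction")
```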
-
Xia, X. 2012. In: RS Singh et al. Evolution in the fast lane: Rapidly evolving genes and genetic systems. Oxford University Press.
Fig. 5. Relationship between PAUA and PUUA, highlighting the observation that PAUA is greater when both a tRNAMet/CAU and a tRNAMet/UAU are present than when only tRNAMet/CAU is present in the mtDNA, for bivalve species (a) and chordate species (b). The filled squares are for mtDNA containing both tRNAMet/CAU and tRNAMet/UAU genes, and the open triangles are for mtDNA without a tRNAMet/UAU gene.
-
Fig. 5 panels: (a) bivalve species, (b) chordate species. Axes: PUUA and PAUA.

Bivalve (Group = 1 when a tRNAMet/UAU gene is present; GX = Group x PUUA)
Species                    Accession  tRNAMet  PUUA   PAUA   Group
Acanthocardia tuberculata  NC_008452  CAU/CAU  65.65  56.28  0
Hiatella arctica           NC_008451  CAU/CAU  73.96  60.34  0
Crassostrea virginica      NC_007175  CAU/CAU  63.18  47.34  0
C. gigas                   NC_001276  CAU/CAU  64.83  46.89  0
V. philippinarum           NC_003354  CAU/CAU  80.18  74.07  0
Placopecten magellanicus   NC_007234  CAU/CAU  37.5   33.99  0
Mytilus trossulus          NC_007687  CAU/UAU  58.66  60.61  1
M. galloprovincialis       NC_006886  CAU/UAU  63.83  65.02  1
M. edulis                  NC_006161  CAU/UAU  63.12  65.3   1

Regression of PAUA on PUUA, Group and GX (bivalves)
Multiple R 0.9375, R Square 0.8790, Adjusted R Square 0.8064, Standard Error 5.3288, Observations 9
ANOVA       df  SS         MS        F        Significance F
Regression  3   1031.1996  343.7332  12.1050  0.0099
Residual    5   141.9801   28.3960
Total       8   1173.1797
            Coefficient  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   -2.3701      10.6992         -0.2215  0.8335   -29.8734   25.1332
PUUA        0.8646       0.1631          5.3000   0.0032   0.4453     1.2839
Group       8.8782       83.9268         0.1058   0.9199   -206.8625  224.6188
GX          0.0589       1.3544          0.0435   0.9670   -3.4227    3.5404

Regression of PAUA on PUUA and Group (bivalves)
Multiple R 0.9375, R Square 0.8789, Adjusted R Square 0.8386, Standard Error 4.8654, Observations 9
ANOVA       df  SS         MS        F        Significance F
Regression  2   1031.1460  515.5730  21.7796  0.0018
Residual    6   142.0337   23.6723
Total       8   1173.1797
            Coefficient  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   -2.4249      9.7007          -0.2500  0.8109   -26.1617   21.3118
PUUA        0.8655       0.1479          5.8531   0.0011   0.5037     1.2273
Group       12.5226      3.4578          3.6215   0.0111   4.0616     20.9836

Lancelets and tunicates (PUUA and PAUA computed from the codon counts of each mtDNA genome)
Species                    Accession  PUUA    PAUA    tRNAMet
Branchiostoma belcheri     NC_004537  0.7198  0.2919  CAU
Branchiostoma floridae     NC_000834  0.7207  0.3134  CAU
Branchiostoma lanceolatum  NC_001912  0.7268  0.3119  CAU
Epigonichthys maldivensis  NC_006465  0.7360  0.3260  CAU
Aplidium conicum           NC_013584  0.7741  0.7385  CAU/UAU
Ciona intestinalis         NC_004447  0.9256  0.8741  CAU/UAU
Ciona savignyi             NC_004570  0.7511  0.8068  CAU/UAU
Clavelina lepadiformis     NC_012887  0.8652  0.8125  CAU/UAU
Diplosoma listerianum      NC_013556  0.9126  0.9320  CAU/UAU
Doliolum nationalis        NC_006627  0.7104  0.7065  CAU/UAU
Halocynthia roretzi        NC_002177  0.6071  0.6568  CAU/UAU
Herdmania momus            NC_013561  0.3926  0.3617  CAU/UAU
Microcosmus sulcatus       NC_013752  0.7129  0.5861  CAU/UAU
Phallusia fumigata         NC_009834  0.4045  0.4571  CAU/UAU
Phallusia mammillata       NC_009833  0.4908  0.4072  CAU/UAU
Styela plicata             NC_013565  0.6445  0.5273  CAU/UAU
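The two-predictor regression for the bivalve data (PAUA on PUUA and a 0/1 Group indicator for the presence of tRNAMet/UAU) can be refitted with ordinary least squares. The sketch below solves the normal equations in plain Python rather than using the spreadsheet tool that produced the summary output; the function name ols is illustrative.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in X) for j in range(k)]
        row.append(sum(r[i] * yi for r, yi in zip(X, y)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Bivalve values from the deck (percentages)
P_UUA = [65.65, 73.96, 63.18, 64.83, 80.18, 37.5, 58.66, 63.83, 63.12]
P_AUA = [56.28, 60.34, 47.34, 46.89, 74.07, 33.99, 60.61, 65.02, 65.3]
GROUP = [0, 0, 0, 0, 0, 0, 1, 1, 1]  # 1 = tRNAMet/UAU gene present

X = [[1.0, x, g] for x, g in zip(P_UUA, GROUP)]
b0, b_uua, b_group = ols(X, P_AUA)
print(round(b0, 4), round(b_uua, 4), round(b_group, 4))
```

The Group coefficient estimates how much higher PAUA is, at a given PUUA, in genomes that carry a tRNAMet/UAU gene.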
-
Calculation of RSCU
RSCU is codon-specific.
RSCU and proportion: different scaling.
Codon  AA   N    RSCU   Codon  AA   N    RSCU   Codon  AA   N    RSCU
GCU    Ala  52   0.84   CCU    Pro  42   0.87   UAA    *    8    3.2
GCC    Ala  91   1.47   CCC    Pro  63   1.31   UAG    *    1    0.4
GCA    Ala  103  1.66   CCA    Pro  85   1.76   AGA    *    1    0.4
GCG    Ala  2    0.03   CCG    Pro  3    0.06   AGG    *    0    0
GAA    Glu  78   1.64   CAA    Gln  79   1.82   AAA    Lys  90   1.78
GAG    Glu  17   0.36   CAG    Gln  8    0.18   AAG    Lys  11   0.22
GGU    Gly  29   0.53   CGU    Arg  7    0.44   ACU    Thr  44   0.57
GGC    Gly  62   1.13   CGC    Arg  11   0.7    ACC    Thr  96   1.25
GGA    Gly  97   1.77   CGA    Arg  42   2.67   ACA    Thr  153  1.99
GGG    Gly  31   0.57   CGG    Arg  3    0.19   ACG    Thr  15   0.19
UUA    Leu  110  1.11   AUA    Met  218  1.66   UGA    Trp  92   1.77
UUG    Leu  16   0.16   AUG    Met  44   0.34   UGG    Trp  12   0.23
CUU    Leu  62   0.62   UCU    Ser  51   1.11   GUU    Val  40   0.84
CUC    Leu  95   0.95   UCC    Ser  65   1.42   GUC    Val  48   1.01
CUA    Leu  285  2.86   UCA    Ser  99   2.16   GUA    Val  87   1.83
CUG    Leu  29   0.29   UCG    Ser  5    0.11   GUG    Val  15   0.32
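The RSCU values in the table are consistent with the usual definition: a codon's count divided by the mean count of the codons in its synonymous family. The sketch below computes RSCU and the plain proportion for the Ala family to show the difference in scaling; the function names are illustrative.

```python
def rscu(counts):
    """RSCU per codon: count divided by the mean count in the family."""
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}

def proportion(counts):
    """Proportion per codon: count divided by the family total."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

# Ala family counts from the table above
ala = {"GCU": 52, "GCC": 91, "GCA": 103, "GCG": 2}

ala_rscu = rscu(ala)
ala_prop = proportion(ala)
for codon in ala:
    print(codon, round(ala_rscu[codon], 2), round(ala_prop[codon], 3))
```

RSCU values in a family sum to the number of codons in the family (4 here), whereas proportions sum to 1, so RSCU equals the proportion multiplied by the family size.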
-
Calculation of CAI
Compound 6- or 8-fold codon families should be broken into two codon families.
CAI is gene-specific: 0 ≤ CAI ≤ 1.
CAI values computed with different reference sets are not comparable.
Problem with computing w as $w_i = F_i / F_{i.max}$: suppose an amino acid is rarely used in highly expressed genes. There is then little selection on it, and its codon usage might be close to even, with $w_i \approx 1$. If a lowly expressed gene happens to be made entirely of this amino acid, its CAI would be close to 1, which is misleading. There has been no good alternative; further research is needed.
N2,3,4: Number of 2-, 3-, 4-fold codon families
Codon counts (left) and the gene's observed codon frequencies with w (right)
Codon  AA  Count   Codon  AA  ObsFreq  w
AAA    K   1855    UGA    *   0        0.388
AAC    N   1434    UAG    *   0        0.243
AAG    K   2464    UAA    *   0        1
AAT    N   1646    GCA    A   1        0.606
ACA    T   1067    GCU    A   15       1
ACC    T   1056    GCG    A   0        0.253
ACG    T   478     GCC    A   8        0.75
ACT    T   1122    UGC    C   3        1
AGA    R   950     UGU    C   3        0.909
AGC    S   463     GAU    D   9        1
AGG    R   187     GAC    D   11       0.581
AGT    S   623     GAG    E   11       0.862
ATA    I   340     GAA    E   15       1
ATC    I   1574    UUU    F   4        0.552
ATG    M   1739    UUC    F   9        1
ATT    I   1666    GGA    G   11       1
CAA    Q   2106    GGU    G   14       0.248
CAC    H   751     GGG    G   0        0.06
CAG    Q   966     GGC    G   5        0.136
CAT    H   910     CAC    H   5        0.825
CCA    P   2617    CAU    H   2        1
CCC    P   222     AUC    I   12       0.945
CCG    P   521     AUU    I   11       1
CCT    P   461     AUA    I   0        0.204
CGA    R   698     AAG    K   27       1
CGC    R   525     AAA    K   5        0.753
CGG    R   216     UUA    L   0        0.265
CGT    R   1115    UUG    L   12       0.74
CTA    L   274     CUA    L   0        0.181
CTC    L   1244    CUC    L   2        0.822
CTG    L   591     CUG    L   4        0.39
CTT    L   1514    CUU    L   4        1
GAA    E   2523    AUG    M   10       1
GAC    D   1416    AAU    N   2        1
GAG    E   2175    AAC    N   14       0.871
GAT    D   2437    CCG    P   2        0.199
GCA    A   1107    CCU    P   7        0.176
GCC    A   1371    CCA    P   7        1
GCG    A   463     CCC    P   9        0.085
GCT    A   1827    CAG    Q   5        0.459
GGA    G   3372    CAA    Q   2        1
GGC    G   460     CGG    R   1        0.194
GGG    G   204     CGC    R   1        0.471
GGT    G   837     CGA    R   1        0.626
GTA    V   497     AGA    R   4        0.852
GTC    V   1239    AGG    R   2        0.168
GTG    V   725     CGU    R   6        1
GTT    V   1645    AGU    S   2        0.529
TAA    *   103     AGC    S   3        0.393
TAC    Y   1129    UCU    S   3        0.856
TAG    *   25      UCG    S   3        0.621
TAT    Y   949     UCA    S   0        1
TCA    S   1177    UCC    S   5        0.764
TCC    S   899     ACA    T   2        0.951
TCG    S   731     ACC    T   14       0.941
TCT    S   1007    ACG    T   0        0.426
TGA    *   40      ACU    T   12       1
TGC    C   738     GUG    V   6        0.441
TGG    W   686     GUC    V   15       0.753
TGT    C   671     GUA    V   3        0.302
TTA    L   401     GUU    V   7        1
TTC    F   1640    UGG    W   4        1
TTG    L   1120    UAU    Y   1        0.841
TTT    F   906     UAC    Y   8        1
Worked computation of w from reference codon frequencies
Codon  AA  ObsFreq  RefCodFreq  w
UGA    *   0        6           0.375
UAG    *   0        4           0.250
UAA    *   0        16          1.000
GCA    A   1        195         0.606
GCU    A   15       322         1.000
GCG    A   0        81          0.252
GCC    A   8        242         0.752
UGC    C   3        123         1.000
UGU    C   3        112         0.911
GAU    D   9        69          1.000
GAC    D   11       40          0.580
GAG    E   11       289         0.863
GAA    E   14       335         1.000
UUU    F   3        118         0.554
UUC    F   9        213         1.000
Sheet3
52   0.8387096774
91   1.4677419355
103  1.6612903226
2    0.0322580645
248
Sheet1
Codon  AA   N    RSCU    Codon  AA   N    RSCU    Codon  AA   N    RSCU
GCU    Ala  52   0.84    CCU    Pro  42   0.87    UAA    *    8    3.2
GCC    Ala  91   1.47    CCC    Pro  63   1.31    UAG    *    1    0.4
GCA    Ala  103  1.66    CCA    Pro  85   1.76    AGA    *    1    0.4
GCG    Ala  2    0.03    CCG    Pro  3    0.06    AGG    *    0    0
GAA    Glu  78   1.64    CAA    Gln  79   1.82    AAA    Lys  90   1.78
GAG    Glu  17   0.36    CAG    Gln  8    0.18    AAG    Lys  11   0.22
GGU    Gly  29   0.53    CGU    Arg  7    0.44    ACU    Thr  44   0.57
GGC    Gly  62   1.13    CGC    Arg  11   0.7     ACC    Thr  96   1.25
GGA    Gly  97   1.77    CGA    Arg  42   2.67    ACA    Thr  153  1.99
GGG    Gly  31   0.57    CGG    Arg  3    0.19    ACG    Thr  15   0.19
UUA    Leu  110  1.11    AUA    Met  218  1.66    UGA    Trp  92   1.77
UUG    Leu  16   0.16    AUG    Met  44   0.34    UGG    Trp  12   0.23
CUU    Leu  62   0.62    UCU    Ser  51   1.11    GUU    Val  40   0.84
CUC    Leu  95   0.95    UCC    Ser  65   1.42    GUC    Val  48   1.01
CUA    Leu  285  2.86    UCA    Ser  99   2.16    GUA    Val  87   1.83
CUG    Leu  29   0.29    UCG    Ser  5    0.11    GUG    Val  15   0.32
-
Weak mRNA predictive power
FRS2, ENO1
-
Effect of Codon Usage Bias
FRS2, ENO1
-
Problems with CAI
Formulation
  Reference set
  w = 0
Implementation
  AUG, UGG
  Multiple codon families for one amino acid
  Dependence on AT%
Solutions (Xia, X. 2007. Evolutionary Bioinformatics)
-
RSCU (HIV-1 vs Human)
Fig. 1. Relative synonymous codon usage (RSCU) of HIV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids. van Weringh et al. 2011. MBE.
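As a reminder of what is being plotted, RSCU for a codon is its observed count divided by the count expected if all synonymous codons in its family were used equally. The sketch below is a minimal Python illustration with a hypothetical function name. It uses the Ala counts from the codon usage table in this document (GCU 52, GCC 91, GCA 103, GCG 2).

```python
def rscu(counts):
    """RSCU = observed count / (family total / number of codons in family)."""
    total = sum(counts.values())
    if total == 0:
        # Family not used at all; RSCU is undefined, so report zeros here.
        return {codon: 0.0 for codon in counts}
    expected = total / len(counts)
    return {codon: n / expected for codon, n in counts.items()}

ala = {"GCU": 52, "GCC": 91, "GCA": 103, "GCG": 2}
ala_rscu = rscu(ala)
for codon, value in ala_rscu.items():
    print(codon, round(value, 2))
```

RSCU values within a family sum to the number of codons in the family, so a value above 1 marks a codon used more often than expected under even usage.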
-
Research
Observation on HIV-1:
  Strong surplus of A-ending codons
  High mutation rate
Hypothesis: Strong A-biased mutation disrupts codon adaptation.
Predictions:
  Strong A-biased mutation (confirmed)
  If the mutation rate is lower, codon adaptation will be better. The related HTLV-1 parasitizes the same cells as HIV-1 but has a lower mutation rate, so HTLV-1 genes should exhibit better codon adaptation.
-
RSCU (HTLV-1 vs Human)
Relative synonymous codon usage (RSCU) of HTLV-1 compared to RSCU of highly expressed human genes. Data points for codons ending with A, C, G or U are annotated with different combinations of colors and symbols. A-ending codons exhibit strong discordance in their usage between HIV-1 and human and are annotated with their coded amino acids.
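The prediction that HTLV-1 should track human codon usage more closely than HIV-1 can be checked roughly for the 14 A-ending codons. The Python sketch below is an illustration only, with hypothetical function names. The RSCU values are rounded to three decimals from the FIg1 worksheet rows for A-ending codons later in this document. The summary statistics used (count of over-used codons and mean log2 ratio) are a convenience chosen here, not the analysis of the original study.

```python
import math

codons = ["GCA", "GAA", "GGA", "AUA", "AAA", "CUA", "UUA",
          "CCA", "CAA", "AGA", "CGA", "UCA", "ACA", "GUA"]
human = [0.742, 0.783, 0.928, 0.358, 0.761, 0.296, 0.677,
         0.932, 0.520, 0.973, 0.752, 0.792, 0.971, 0.388]
hiv1 = [2.027, 1.307, 2.078, 1.594, 1.270, 1.154, 1.382,
        1.958, 1.091, 1.443, 2.267, 2.115, 1.880, 2.081]
htlv1 = [0.510, 1.224, 1.180, 0.797, 1.500, 1.073, 1.505,
         0.986, 0.945, 0.788, 0.932, 0.815, 0.821, 0.583]

def n_overused(virus, host):
    """Number of codons with higher RSCU in the virus than in the host."""
    return sum(v > h for v, h in zip(virus, host))

def mean_log2_ratio(virus, host):
    """Average log2(virus RSCU / host RSCU); 0 means no systematic shift."""
    return sum(math.log2(v / h) for v, h in zip(virus, host)) / len(host)

for name, virus in (("HIV-1", hiv1), ("HTLV-1", htlv1)):
    print(name,
          "over-used A-ending codons:", n_overused(virus, human),
          "of", len(codons),
          "| mean log2 ratio:", round(mean_log2_ratio(virus, human), 2))
```

A smaller mean log2 ratio for HTLV-1 than for HIV-1 is in the direction the mutation hypothesis predicts, although a formal test would need the full codon set.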
Fig2A
--AverageError--AverageErrorArg(AGA)
--HIV WTHIV WT--HIV MUTHIV MUTArg(AGG)
Ile-UAU1.339670.70913Ile-UAU0.061390.022161.88917405842.7703068592Ile(AUA)
Lys1,21.008130.29927Lys1,20.060080.017493.36863033383.4351057747Ile(AUY)
Lys30.640520.25086Lys30.033952.02E-042.5532966595167.9196755367Leu(UUA)
Asn-GUU0.458820.12913Asn-GUU0.041770.00469Leu(UUG)
Sec-UCA10.269970.19873Sec-UCA10.02978.58E-04Lys(AAA)
Ile-IAU/GAU0.267240.08196Ile-IAU/GAU0.137110.050753.26061493412.7016748768Lys(AAG)
His-GUG0.209870.0246His-GUG0.102210.04699Gly(GGA)
Gly-GCC/CCC0.201640.08479Gly-GCC/CCC0.048590.010462.37811062634.6453154876Gly(GGB)
Pro-IGG/CGG/UGG0.194840.11125Pro-IGG/CGG/UGG0.055520.03084Val(GUA)
Thr-IGU/CGU0.190040.09363Thr-IGU/CGU0.015490.00784Val(GUB)
Met-CAU0.184240.03201Met-CAU0.041640.00719Thr(ACA)
Arg-CCG/UCG0.171890.00734Arg-CCG/UCG0.042140.00163Thr(ACB)
Glu-CUC/UUC0.157570.01142Glu-CUC/UUC0.048960.01635
Arg-ICG0.13350.02488Arg-ICG0.048020.00623
Asp-GUC0.130290.03841Asp-GUC0.057370.01104
Ala-IGC/CGC/UGC0.113210.01105Ala-IGC/CGC/UGC0.073770.03882
Ala-CGC0.099250.01093Ala-CGC0.06940.0283
Leu-UAA0.090.03616Leu-UAA0.037380.008112.48893805314.6091245376
Cys-GCA0.082810.02926Cys-GCA0.026610.00898
Leu-IAG/UAG0.079650.04071Leu-IAG/UAG0.041780.01059
Sec-UCA20.078830.01611Sec-UCA20.060110.02047
Val-IAC/CAC0.073870.0285Val-IAC/CAC0.048530.021182.59192982462.291312559
Gly-UCC0.070750.05595Gly-UCC0.028950.006171.26452189454.6920583468
Tyr-GUA0.068330.01475Tyr-GUA0.03620.01522
Val-UAC0.066170.02318Val-UAC0.028230.01322.85461604832.1386363636
Arg-CCU0.0660.03084Arg-CCU0.054380.022582.1400778212.4083259522
Thr-CGU0.062150.0152Thr-CGU0.019210.006794.08881578952.8291605302
Ser-GCU0.060460.03637Ser-GCU0.019630.00788
Gln-CUG/UUG0.058650.00177Gln-CUG/UUG0.021390.00217
Glu-UUC0.054240.01134Glu-UUC0.019180.00184
Ser-CGA0.05150.02118Ser-CGA0.022440.01514
Arg-UCU0.049420.00253Arg-UCU0.027670.0021819.533596837912.6926605505
Thr-UGU0.048140.00106Thr-UGU0.021480.007245.41509433962.9833333333
Phe-GAA0.046840.00865Phe-GAA0.017510.00531
Leu-CAA0.045040.015Leu-CAA0.049560.025033.00266666671.9800239712
Trp-CCA0.035750.00952Trp-CCA0.018460.0082
Leu-CAG0.031960.01431Leu-CAG0.021570.01228
Met-i0.026230.00355Met-i0.01610.00704
Ser-IGA/UGA0.016610.00137Ser-IGA/UGA0.010610.00118
Leu-UAA20.00780.0069Leu-UAA20.015430.00306
----------
mGln0.030420.01646mGln0.024780.01961
mGlu0.027030.00706mGlu0.022080.00217
mAsp0.026590.01066mAsp0.027750.01513
mPro0.021320.01909mPro0.017370.01656
mPhe0.01940.00432mPhe0.019640.01056
mVal0.018480.01059mVal0.013240.01119
mMet0.0180.00455mMet0.014660.00276
mAsn0.017520.01484mAsn0.013560.01279
mHis0.015730.0025mHis0.009070.00417
mArg0.015480.00258mArg0.018810.00967
mLeu-UAA0.014480.00168mLeu-UAA0.011340.00185
mSer-GCU0.013010.00997mSer-GCU0.032180.01353
mLys0.012971.82E-04mLys0.009640.00171
mIle0.011240.01005mIle0.009550.00536
mTyr0.00990.00729mTyr0.010460.00579
mSer-UGA0.008550.00753mSer-UGA0.007530.00519
mAla0.008450.00634mAla0.00890.0041
mThr0.008290.0069mThr0.007680.00573
mLeu-UAG0.007640.0068mLeu-UAG0.00610.00542
mTrp0.006070.00397mTrp0.004850.00343
mGly0.005640.00434mGly0.004560.00349
mCys0.004510.00403mCys0.003710.00317
----
Fig3A
--AverageErroraverageerror
--293T293THIV WTHIV WTlogRatio
Lys32.727370.861517.732910.22252.701
Lys1,21.772370.3779416.891671.517263.253
Asn-GUU3.379360.9394110.624751.435411.653
Glu-CUC/UUC4.980350.154254.208870.1755-0.243
Pro-IGG/CGG/UGG6.221192.608614.164570.61297-0.579
Ile-UAU0.625490.093814.11520.090092.718
Thr-IGU/CGU4.258170.429063.974360.66775-0.100
Gly-GCC/CCC2.119070.106552.699340.109310.349
Asp-GUC4.002340.30412.615540.97891-0.614
Arg-CCG/UCG2.666220.723152.544680.0461-0.067
Met-CAU1.501090.072892.118140.672720.497
Gly-UCC3.42040.354741.86260.87535-0.877
His-GUG0.874940.087051.812620.850641.051
Cys-GCA2.775990.029421.753370.67296-0.663
Arg-ICG2.679020.695931.615640.14948-0.730
Glu-UUC5.990950.969791.491130.22708-2.006
Tyr-GUA4.600010.388831.433040.22857-1.683
Gln-CUG/UUG4.280.020431.393620.06732-1.619
Ile-IAU/GAU0.581020.046731.371280.437861.239
Val-IAC/CAC2.169470.286911.29550.35448-0.744
Ser-GCU3.022390.87371.112010.08361-1.443
Arg-CCU2.614020.635731.109090.12204-1.237
Sec-UCA10.732880.179030.982760.455890.423
Thr-UGU2.791380.004850.958740.14684-1.542
Met-i5.145860.286850.945250.02846-2.445
Trp-CCA3.71460.276330.935160.1088-1.990
Phe-GAA2.116090.343760.912240.31935-1.214
Arg-UCU2.414170.652360.824090.15629-1.551
Ala-CGC0.759960.03750.730380.29535-0.057
Ser-CGA2.157040.714340.721130.08719-1.581
Leu-UAA1.430610.115050.670520.10534-1.093
Ser-IGA/UGA2.638970.526460.658340.01926-2.003
Leu-CAG2.051010.227620.650740.13045-1.656
Leu-CAA1.834630.142290.650650.06785-1.496
Leu-IAG/UAG0.997450.1190.584310.03237-0.772
Val-UAC1.053270.136390.541150.12489-0.961
Thr-CGU1.371450.014210.481580.01917-1.510
Ala-IGC/CGC/UGC0.363350.041140.409020.181430.171
Sec-UCA20.337760.07780.223780.05677-0.594
Leu-UAA20.828320.177070.180240.07853-2.200
--
Table
Codon familyRSCU HumanRSCU HIV-1log2(RSCUHIV1/Human)RankRSCUtRNAHIV1tRNAGagVLPlog2(tRNAHIV1/GagVLP)RanktRNA--MeanWTMeanMut
Arg(AGA)0.971.440.5780.04940.02770.83684Ala-CGC0.099250.0694
Arg(AGG)1.030.56-0.8840.06600.05440.27942Ala-IGC/CGC/UGC0.113210.07377
Ile(AUA)0.241.592.73141.33970.06144.447714Arg-CCG/UCG0.171890.04214
Ile(AUY)2.641.41-0.9030.26720.13710.96285Arg-CCU0.0660.05438
Leu(UUA)0.681.381.02110.09000.03741.26778Arg-ICG0.13350.04802
Leu(UUG)1.320.62-1.0910.04500.0496-0.13801Arg-UCU0.049420.02767
Lys(AAA)0.761.270.7490.64050.03404.237813Asn-GUU0.458820.04177
Lys(AAG)1.240.73-0.7651.00810.06014.068712Asp-GUC0.130290.05737
Gly(GGA)0.932.081.16120.07080.02901.28929Cys-GCA0.082810.02661
Gly(GGB)3.071.92-0.6860.20160.04862.053110Gln-CUG/UUG0.058650.02139
Val(GUA)0.392.082.42130.06620.02821.22897Glu-CUC/UUC0.157570.04896
Val(GUB)3.611.92-0.9120.07390.04850.60613Glu-UUC0.054240.01918
Thr(ACA)0.971.941.00100.04810.02151.16426Gly-GCC/CCC0.201640.04859
Thr(ACB)3.032.06-0.5670.25220.03472.861511Gly-UCC0.070750.02895
0.32246362160.578021978His-GUG0.209870.10221
Ile-IAU/GAU0.267240.13711
Ile-UAU1.339670.06139
SUMMARY OUTPUTLeu-CAA0.045040.04956
Leu-CAG0.031960.02157
Regression StatisticsLeu-IAG/UAG0.079650.04178
Multiple R0.5780219780.5780219780.0303830139Leu-UAA0.090.03738
R Square0.3341094071Leu-UAA20.00780.01543
Adjusted R Square0.2786185243Lys1,21.008130.06008
Standard Error3.5530516214Lys30.640520.03395
Observations14mAla0.008450.0089
mArg0.015480.01881
ANOVAmAsn0.017520.01356
dfSSMSFSignificance FmAsp0.026590.02775
Regression176.009890109976.00989010996.02097841230.0303830139mCys0.004510.00371
Residual12151.490109890112.6241758242Met-CAU0.184240.04164
Total13227.5Met-i0.026230.0161
mGln0.030420.02478
CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%mGlu0.027030.02208
Intercept3.16483516482.0057639441.57787020470.1405793728-1.20534904457.5350193742-1.20534904457.5350193742mGly0.005640.00456
RankRSCU0.5780219780.23556502872.45376820670.03038301390.06476987191.09127408420.06476987191.0912740842mHis0.015730.00907
mIle0.011240.00955
mLeu-UAA0.014480.01134
mLeu-UAG0.007640.0061
mLys0.012970.00964
mMet0.0180.01466
mPhe0.01940.01964
mPro0.021320.01737
mSer-GCU0.013010.03218
mSer-UGA0.008550.00753
mThr0.008290.00768
mTrp0.006070.00485
mTyr0.00990.01046
mVal0.018480.01324
Phe-GAA0.046840.01751
Pro-IGG/CGG/UGG0.194840.05552
Sec-UCA10.269970.0297
Sec-UCA20.078830.06011
Ser-CGA0.05150.02244
Ser-GCU0.060460.01963
Ser-IGA/UGA0.016610.01061
Thr-CGU0.062150.01921
Thr-IGU/CGU0.190040.01549
Thr-UGU0.048140.02148
Trp-CCA0.035750.01846
Tyr-GUA0.068330.0362
Val-IAC/CAC0.073870.04853
Val-UAC0.066170.02823
Chart (Table): RanktRNA plotted against RankRSCU
FIg1
HIV-1HTLV-1
AACodonRSCUHumA-endingC-endingG-endingU-endingA-endingC-endingG-endingU-endingEnding
AGCA0.74202898552.02666666670.51A
EGAA0.78346028291.30708661421.224A
GGGA0.92817679562.07751937981.18A
IAUA0.35839160841.59414225940.797A
KAAA0.76137339061.26970954361.5A
LCUA0.29585087191.15384615381.073A
LUUA0.67714285711.38235294121.505A
PCCA0.93228655541.95789473680.986A
QCAA0.51993262211.09090909090.945A
RAGA0.97315436241.44329896910.788A
RCGA0.75247524752.26666666670.932A
SUCA0.79188900752.11494252870.815A
TACA0.97052541651.88018433180.821A
VGUA0.38819875782.08144796380.583A
AGCC1.74057971010.82.438C
CUGC1.20276953510.41176470591.435C
DGAC1.17019230770.91176470591.234C
FUUC1.19561454130.74418604651.073C
GGGC1.5453827940.52713178291.28C
HCAC1.18640576730.71264367821.228C
IAUC1.69055944060.5397489541.219C
LCUC1.06674684310.69230769231.404C
NAAC1.16544117650.61728395061.108C
PCCC1.46418056920.69473684211.729C
RCGC1.60936093610.53333333331.233C
SAGC1.404761904811.281C
SUCC1.60939167560.82758620692.222C
TACC1.7291755660.92165898621.932C
VGUC1.01397515530.41628959281.75C
YUAC1.19484702090.50549450551.307C
AGCG0.34347826090.21333333330.303G
EGAG1.21653971710.69291338580.776G
GGGG0.87134964481.0232558141.06G
KAAG1.23862660940.73029045640.5G
LCUG2.11906193631.28205128210.581G
LUUG1.32285714290.61764705880.495G
PCCG0.47301275760.16842105260.425G
QCAG1.48006737790.90909090911.055G
RAGG1.02684563760.55670103091.212G
RCGG1.11611161121.06666666671.053G
SUCG0.36926360730.22988505750.148G
TACG0.37761640320.18433179720.256G
VGUG1.93478260870.88687782810.861G
AGCU1.17391304350.960.749U
CUGU0.79723046491.58823529410.565U
DGAU0.82980769231.08823529410.766U
FUUU0.80438545871.25581395350.927U
GGGU0.65509076560.37209302330.48U
HCAU0.81359423271.28735632180.772U
IAUU0.9510489510.86610878660.984U
LCUU0.51834034880.87179487180.942U
NAAU0.83455882351.38271604940.892U
PCCU1.13052011781.17894736840.86U
RCGU0.52205220520.13333333330.782U
SAGU0.595238095210.719U
SUCU1.22945570970.82758620690.815U
TACU0.92268261431.01382488480.991U
VGUU0.66304347830.61538461540.806U
YUAU0.80515297911.49450549450.693U
Chart (FIg1): axes RSCU (Human) and RSCU (HIV-1); series: A-ending, C-ending, G-ending, U-ending; point labels: V, T, S, R, R, Q, P, L, L, K, I, G, E, A
Chart (Fig.2): axes RSCU (Human) and RSCU (HTLV-1); series: A-ending, C-ending, G-ending, U-ending
EarlyLate
AACodonEndingRSCUHumRSCUHIV1LateRSCUHIV1EarlyRSCUHTL1LateRSCUHTL1EarlySUMMARY OUTPUT: LateRSCUHumRSCUHIV1LateSUMMARY OUTPUTAACodonRSCUHumRSCUHIV1Late
AGCAA-A0.74202898552.1221.6670.5430.6150.74202898552.122AGCA0.74202898552.122
EGAAE-A0.78346028291.4140.7271.2241.231Regression Statistics1.74057971010.805Regression StatisticsAGCC1.74057971010.805
GGGAG-A0.92817679562.1261.7331.3621.28Multiple R0.47718215290.34347826090.171Multiple R0.054264191AGCG0.34347826090.171
IAUAI-A0.35839160841.6430.6430.7650.75R Square0.22770280711.17391304350.902R Square0.0029446024AGCU1.17391304350.902
KAAAK-A0.76137339061.3590.7621.4471.417Adjusted R Square0.20790031491.17019230770.887Adjusted R Square-0.0326645189DGAC1.17019230770.887
LCUAL-A0.29585087191.1581.121.1190.909Standard Error0.38156973990.82980769231.113Standard Error0.5464828602DGAU0.82980769231.113
PCCAP-A0.93228655542.0811.4120.9550.988Observations410.78346028291.414Observations30EGAA0.78346028291.414
QCAAQ-A0.51993262211.1170.9521.0150.6291.21653971710.586EGAG1.21653971710.586
RAGAR-A0.97315436241.4611.53811.143ANOVA1.18640576730.73ANOVAGGGA0.92817679562.126
TACAT-A0.97052541651.9511.50.7870.733dfSSMSFSignificance F0.81359423271.27dfSSMSFSignificance FGGGC1.5453827940.421
VGUAV-A0.38819875782.1591.20.6340.667Regression11.67415778681.674157786811.49869448830.00160720860.29585087191.158Regression10.02469553830.02469553830.08269236390.7757990677GGGG0.87134964481.011
AGCC1.74057971010.8050.8332.3432.769Residual395.67822318880.14559546641.06674684310.737Residual288.36201846170.2986435165GGGU0.65509076560.442
DGAC1.17019230770.8870.751.2571.273Total407.35238097562.11906193631.333Total298.386714HCAC1.18640576730.73
GGGC1.5453827940.4210.81.2771.60.51834034880.772HCAU0.81359423271.27
HCAC1.18640576730.730.7141.1651CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%0.93228655542.081CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%IAUA0.35839160841.643
IAUCI-C1.69055944060.4521.7141.2241.2Intercept0.55138258120.1451056823.7998689890.0004956790.25787863850.8448865240.25787863850.8448865241.46418056920.748Intercept0.93557266010.24525795383.81464758020.00068916520.43318452251.43796079760.43318452251.4379607976IAUC1.69055944060.452
LCUC1.06674684310.7370.481.2661.545RSCUHum0.4486418090.13230476193.39097249890.00160720860.1810301710.71625344710.1810301710.71625344710.47301275760.065RSCUHum0.06442733990.2240461560.28756279990.7757990677-0.39451040010.5233650799-0.39451040010.5233650799IAUU0.9510489510.905
PCCC1.46418056920.7480.5881.6522.3531.13052011781.106KAAA0.76137339061.359
SAGC1.40476190480.8721.21.3850SUMMARY OUTPUT: early0.51993262211.117KAAG1.23862660940.641
TACC1.7291755660.86411.772.7331.48006737790.883LCUA0.29585087191.158
VGUC1.01397515530.4090.61.8051.667Regression Statistics1.40476190480.872LCUC1.06674684310.737
AGCG0.34347826090.1710.1670.2860Multiple R0.36769703540.59523809521.128LCUG2.11906193631.333
EGAG1.21653971710.5861.2730.7760.769R Square0.13520110990.92817679562.126LCUU0.51834034880.772
GGGG0.87134964481.0111.20.8511.12Adjusted R Square0.11302677931.5453827940.421PCCA0.93228655542.081
KAAG1.23862660940.6411.2380.5530.583Standard Error0.61577178190.87134964481.011PCCC1.46418056920.748
LCUG2.11906193631.3331.120.6610.455Observations410.65509076560.442PCCG0.47301275760.065
PCCG0.47301275760.0650.5880.3780.1880.97052541651.951PCCU1.13052011781.106
QCAG1.48006737790.8831.0480.9851.371ANOVA1.7291755660.864QCAA0.51993262211.117
RAGG1.02684563760.5390.46210.857dfSSMSFSignificance F0.37761640320.222QCAG1.48006737790.883
TACG0.37761640320.2220.1670.1970.267Regression12.31190139282.31190139286.09719016160.01802421050.92268261430.963RAGA0.97315436241.461
VGUG1.93478260870.7731.80.8290.667Residual3914.78782060720.3791748874-0.14360111660.054264191RAGG1.02684563760.539
AGCU1.17391304350.9021.3330.8290.615Total4017.099722SAGC1.40476190480.872
DGAU0.82980769231.1131.250.7430.727SAGU0.59523809521.128
GGGU0.65509076560.4420.2670.5110CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%TACA0.97052541651.951
HCAU0.81359423271.271.2860.8351Intercept0.47278688790.23416947162.0189945540.0503998903-0.00086557170.9464393476-0.00086557170.9464393476TACC1.7291755660.864
IAUU0.9510489510.9050.6431.011.05RSCUHum0.52721311210.213511532.46924890640.01802421050.09534528290.95908094120.09534528290.9590809412TACG0.37761640320.222
LCUU0.51834034880.7721.280.9541.091TACU0.92268261430.963
PCCU1.13052011781.1061.4121.0150.471VGUA0.38819875782.159
SAGU0.59523809521.1280.80.6152VGUC1.01397515530.409
TACU0.92268261430.9631.3331.2460.267VGUG1.93478260870.773
VGUU0.66304347830.6590.40.7321VGUU0.66304347830.659
0.54440046870.7330651607
PCCAP-A0.93228655542.0811.4120.9550.988
PCCC1.46418056920.7480.5881.6522.353
PCCG0.47301275760.0650.5880.3780.188
PCCU1.13052011781.1061.4121.0150.471
xxia: minimum number of codons/family: 14
Chart (EarlyLate): axes RSCU (Human) and RSCU (HIV-1 early and late genes); series: RSCUHIV1Late, RSCUHIV1Early; point labels: A-A, E-A, G-A, I-A, K-A, L-A, P-A, Q-A, R-A, T-A, V-A, I-C; trendline: y = 0.3084x + 0.6916, R2 = 0.1024
Chart (SlopeDiff): axes RSCU (Human) and RSCU (HTLV-1 early and late genes); series: RSCUHTL1Late, RSCUHTL1Early; trendlines: y = 0.4486x + 0.5514, R2 = 0.2277, p = 0.0016; y = 0.5272x + 0.4728, R2 = 0.1352, p = 0.0180
Chart (EarlyLateCAI): axes RSCU (Human) and RSCU (HIV1-late); series: RSCUHIV1Late
Fig. 2Final
AACodonHumanCFMusCFHIVCFRSCUHumRSCUMusRSCUHIVObsFreqHTL1RSCUHTL1ObsFreqHTL1EarlyRSCUHTL1EarlyObsFreqHTL1LateRSCUHTL1LateObsFreqHIV1EarlyRSCUHIV1EarlyObsFreqHIV1LateRSCUHIV1LateRSCUAACodonRSCUHIV1EarlyRSCUHIV1Late
AGCA51234611140.74200.85672.0267320.5140.615190.543101.667872.122Codon familyRank(Isupply)HIV1EarlyHIV1LateGGGA1.7332.126
AGCC12016299451.74061.55920.80001532.438182.769822.34350.833330.805Arg(AGA)41.5381.461GGGC0.80.421
AGCG2371499120.34350.37100.2133190.30300100.28610.16770.171Arg(AGG)20.4620.539GGGG1.21.011
AGCU8104901541.17391.21310.9600470.74940.615290.82981.333370.902Ile(AUA)140.6431.643GGGU0.2670.442
CUGC6083055141.20281.05640.4118611.43581331.46750.83380.348Ile(AUY)52.3571.357HCAC0.7140.73
CUGU4032729540.79720.94361.5882240.56581120.53371.167381.652Leu(UUA)81.2501.358HCAU1.2861.27
DGAC12176493621.17021.13820.9118871.234141.273441.25760.75470.887Leu(UUG)10.7500.642IAUA0.6431.643
DGAU8634916740.82980.86181.0882540.76680.727260.743101.25591.113Lys(AAA)130.7621.359IAUC1.7140.452
EGAA108064651660.78350.81391.3071711.224161.231301.224120.7271281.414Lys(AAG)121.2380.641IAUU0.6430.905
EGAG16779421881.21651.18610.6929450.776100.769190.776211.273530.586Gly(GGA)91.7332.126KAAA0.7621.359
FUUC10365193321.19561.17560.7442661.073200.909331.15830.667240.727Gly(GGB)102.2671.874KAAG1.2380.641
FUUU6973642540.80440.82441.2558570.927241.091240.84261.333421.273Val(GUA)71.2002.159LCUA1.121.158
GGGA58846521340.92821.08972.0775591.18161.28321.362131.7331012.126Val(GUB)32.8001.841LCUC0.480.737
GGGC9795667341.54541.32740.5271641.28201.6301.27760.8200.421Thr(ACA)61.5001.951LCUG1.121.333
GGGG5523722660.87130.87181.0233531.06141.12200.85191.2481.011Thr(ACB)112.5002.049LCUU1.280.772
GGGU4153036240.65510.71110.3721240.4800120.51120.267210.442LUUA1.251.358
HCAC5763671311.18641.23560.7126891.228161461.16550.714230.73LUUG0.750.642
HCAU3952271560.81360.76441.2874560.772161330.83591.286401.27RAGA1.5381.461
IAUA20514441270.35840.40041.5941510.797100.75250.76530.6431091.643RAGG0.4620.539
IAUC9675728431.69061.58820.5397781.219161.2401.22481.714300.452RCGA31.714
IAUU5443648690.95101.01150.8661630.984141.05331.0130.643600.905RCGC0.3330.857
KAAA88753381530.76140.74501.26971111.5171.417551.44780.7621251.359RCGG0.6671.143
KAAG14438993881.23861.25500.7303370.570.583210.553131.238590.641RCGU00.286
LCUA2461713450.29590.36951.15381071.073200.909611.11971.12331.158SAGC1.20.872
LCUC8874765271.06671.02780.69231401.404341.545691.26630.48210.737SAGU0.81.128
LCUG17629179502.11911.97991.2821580.581100.455360.66171.12381.333SUCA12.169
LCUU4312887340.51830.62270.8718940.942241.091520.95481.28220.772SUCC10.814
LUUA2371153940.67710.57051.3824701.50541381.43451.25741.358SUCG0.3330.203
LUUG4632889421.32291.42950.6176230.49541150.56630.75350.642SUCU1.6670.814
MAUG1012539777391017655TACA1.51.951
NAAC9515391501.16541.16900.6173721.108101.111381.01371.167360.571TACC10.864
NAAU68138321120.83460.83101.3827580.89280.889370.98750.833901.429TACG0.1670.222
PCCA4754201930.93231.13721.95791090.986210.988480.955121.412642.081TACU1.3330.963
PCCC7464621331.46421.25090.69471911.729502.353831.65250.588230.748VGUA1.22.159
PCCG241152780.47300.41330.1684470.42540.188190.37850.58820.065VGUC0.60.409
PCCU5764428561.13051.19861.1789950.86100.471511.015121.412341.106VGUG1.80.773
QCAA46330641140.51990.56131.09091200.945110.629681.015100.952861.117VGUU0.40.659
QCAG13187854951.48011.43870.90911341.055241.371660.985111.048680.883
RAGA43529901400.97321.01721.4433130.78841.14391201.5381031.461
RAGG4592889541.02680.98280.5567201.21230.8579160.462380.539
RCGA2091507170.75250.82322.2667310.93271.273160.979361.714
RCGC447247141.60941.34970.5333411.23371.273231.39410.33330.857
RCGG310215081.11611.17441.0667351.05330.545160.9720.66741.143
RCGU145119510.52210.65270.1333260.78250.909110.6670010.286
SAGC8264679531.40481.28631.0000411.28100271.38591.2340.872
SAGU3502596530.59520.71371.0000230.71942120.61560.8441.128
SUCA3712798460.79190.93512.1149440.81580.681300.98431322.169
SUCC7544382181.60941.46440.82761202.222252.128632.06631120.814
SUCG173106450.36930.35560.229980.14820.1750.16410.33330.203
SUCU5763725181.22951.24490.8276440.815121.021240.78751.667120.814
TACA56837511020.97051.09521.8802480.821110.733240.78791.5791.951
TACC10125099501.72921.48880.92171131.932412.733541.7761350.864
TACG2211446100.37760.42220.1843150.25640.26760.19710.16790.222
TACU5403404550.92270.99391.0138580.99140.267381.24681.333390.963
VGUA25015511150.38820.42182.0814210.58340.667130.63461.2952.159
VGUC6533822231.01401.03930.4163631.75101.667371.80530.6180.409
VGUG12466962491.93481.89310.8869310.86140.667170.82991.8340.773
VGUU4272375340.66300.64580.6154290.80661150.73220.4290.659
WUGG572339298721340881
YUAC7424116231.19481.17850.5055661.307221.833291.05561.091150.429
YUAU5002869680.80520.82151.4945350.69320.167260.94550.909551.571
*UAA44284240201
*UAG20197700034
*UGA68669122010
401672403783568
DiffMutationEarlyLate
AACodonRSCUHIV1LateRSCUHumHumGGroupSUMMARY OUTPUT
AGCA2.1220.742028985500
AGCC0.8051.740579710100Regression Statistics
AGCG0.1710.343478260900Multiple R0.2990972868
AGCU0.9021.173913043500R Square0.089459187
DGAC0.8871.170192307700Adjusted R Square0.0415359863
DGAU1.1130.829807692300Standard Error0.5284094753
EGAA1.4140.783460282900Observations41
EGAG0.5861.216539717100
GGGA2.1260.928176795600ANOVA
GGGC0.4211.54538279400dfSSMSFSignificance F
GGGG1.0110.871349644800Regression21.04243820520.52121910261.86671978660.1685345406
GGGU0.4420.655090765600Residual3810.61022979480.2792165735
HCAC0.731.186405767300Total4011.652668
HCAU1.270.813594232700
IAUA1.6430.35839160841.6431CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%
IAUC0.4521.69055944060.4521Intercept1.03873697990.215347174.82354599670.00002300370.60278942941.47468453040.60278942941.4746845304
IAUU0.9050.9510489510.9051RSCUHum-0.11143043250.1864465213-0.5976535880.5536162996-0.4888716790.266010814-0.4888716790.266010814
KAAA1.3590.76137339061.3591HumG0.27094832330.15985659261.69494619420.0982666541-0.05266442710.5945610738-0.05266442710.5945610738
KAAG0.6411.23862660940.6411
LCUA1.1580.295850871900
LCUC0.7371.066746843100SUMMARY OUTPUT
LCUG1.3332.119061936300
LCUU0.7720.518340348800Regression Statistics
PCCA2.0810.932286555400Multiple R0.2844298639
PCCC0.7481.464180569200R Square0.0809003475
PCCG0.0650.473012757600Adjusted R Square0.0573336897
PCCU1.1061.130520117800Standard Error0.5240366769
QCAA1.1170.519932622100Observations41
QCAG0.8831.480067377900
RAGA1.4610.97315436241.4611ANOVA
RAGG0.5391.02684563760.5391dfSSMSFSignificance F
SAGC0.8721.404761904800Regression10.94270488990.94270488993.43283075120.0714906748
SAGU1.1280.595238095200Residual3910.70996311010.2746144387
TACA1.9510.970525416500Total4011.652668
TACC0.8641.72917556600
TACG0.2220.377616403200CoefficientsStandard Errort StatP-valueLower 95%Upper 95%Lower 95.0%Upper 95.0%
TACU0.9630.922682614300Intercept0.9225583630.091896290610.03912515400.7366805721.10843615390.7366805721.1084361539
VGUA2.1590.38819875782.1591HumG0.28864610160.15578997241.85278999110.0714906748-0.02646885810.6037610612-0.02646885810.6037610612
VGUC0.4091.01397515530.4091
VGUG0.7731.93478260870.7731
VGUU0.6590.66304347830.6591
xxia: 0: Codon families without documented selective packaging. 1: With selective packaging, i.e., those in Table 1 other than Thr and Gly.
Seq      SeqLen  CAI      CAI2
tat      261     0.66875  0.71957
rev      351     0.66211  0.72057
nef      621     0.67523  0.72625
gag-pol  4308    0.59163  0.6524
vif      579     0.61941  0.68549
vpr      291     0.64272  0.69085
vpu      249     0.49068  0.56748
env      2571    0.61924  0.68272

t-Test: Two-Sample Assuming Unequal Variances
                              Variable 1    Variable 2
Mean                          0.6686966667  0.592736
Variance                      0.0000430357  0.0035822537
Observations                  3             5
Hypothesized Mean Difference  0
df                            4
t Stat                        2.8098988703
P(T
-
Table 2. Frequency of A residues, length and codon adaptation index (CAI) for the three HIV-1 early (tat, rev and nef) and five late (gag-pol, vif, vpu, vpr, and env) coding sequences (CDS).
Any problem with the mutation hypothesis?

Gene  CDS (bp)  CAI
tat   261       0.66875
rev   351       0.66211
nef   621       0.67523
gag   1503      0.62784
pol   3012      0.58139
vif   579       0.61941
vpr   291       0.64272
vpu   249       0.49068
env   2571      0.61924
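The early versus late comparison can be reproduced with a two-sample t-test assuming unequal variances (Welch's test), which is the test named in the accompanying worksheet. The Python sketch below is a minimal illustration with a hypothetical function name. It uses the worksheet's CAI values, in which gag-pol is treated as a single late CDS (0.59163), rather than the separate gag and pol rows of the table above.

```python
import math
from statistics import mean, variance

# CAI values from the worksheet that accompanies Table 2.
early = [0.66875, 0.66211, 0.67523]                    # tat, rev, nef
late = [0.59163, 0.61941, 0.64272, 0.49068, 0.61924]   # gag-pol, vif, vpr, vpu, env

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # sample variance of the mean, group a
    vb = variance(b) / len(b)   # sample variance of the mean, group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(early, late)
print("t =", round(t_stat, 4), " df =", round(df, 2))
```

The worksheet reports t Stat = 2.8099 with df = 4. The sketch leaves df unrounded, so it comes out slightly above 4.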
You may be wondering about the Cys codon family, which has 4 tRNAs matching UGC but none matching UGU. We would have predicted that UGC should be preferred, but the opposite is true. Why? One might think that, because Cys is rarely used, the codon family is not under selection, so that codon usage is at the mercy of mutation bias. Because the yeast genome is AT-biased, we would then expect U-ending codons to be more frequent than C-ending codons. Are you happy with this explanation? Unfortunately, the explanation is wrong, and the correct answer is still elusive.