Supplementary Figure 1 - Nature


[Figure: panels a-d. Axes: Intensity (cps) vs Time (min); Intensity (cps) vs m/z (Da).]

Supplementary Figure 1 Saccharopine identification facilitated by GWAS results.

(a) Manhattan plot displaying the GWAS result for the content of mr208; the strongest association is indicated by a red arrow. (b) The most strongly associated SNP, sf0233224406, is 22 kb away from OsLKR, the gene encoding saccharopine dehydrogenase, suggesting that mr208 could be saccharopine. (c) mr208 detected with MRM transition 277/84 in a rice grain sample. (d) mr208 was confirmed as saccharopine by comparing its RT and fragmentation pattern with those of the commercial standard.

[Figure: panels a-d. Axes: Intensity (cps) vs Time (min); Intensity (cps, ×10^5) vs mass-to-charge (m/z, Da). Labeled peak at m/z 465.1033.]

Supplementary Figure 2 Delphinidin 3-O-glucoside identification facilitated by GWAS results.

(a) Manhattan plot displaying the GWAS result for the content of mr063; the strongest association is indicated by a red arrow. (b) The most strongly associated SNP, sf0605268699, is 45 kb away from OsC1. (c) Mass peak (RT = 5.2 min) and spectrum of delphinidin 3-O-glucoside in a rice grain sample. (d) Delphinidin 3-O-glucoside was confirmed by high-resolution MS and the MS/MS spectral pattern.

[Figure: two histograms. (a) x-axis: CV (%), bins 0-50, 50-75, 75-100, 100-125, 125-150, 150-175, 175-200, >200; y-axis: Percentage (%). (b) x-axis: H2, bins 0.0-0.1 to 0.9-1; y-axis: Number of metabolites.]

Supplementary Figure 3 The coefficients of variation (CV) and the broad-sense heritability (H2) results for each metabolite.

(a) Distribution of the phenotypic coefficients of variation (CV) of metabolic traits (n = 587). (b) Distribution of broad-sense heritability (H2) of metabolic traits (P < 0.05, two-way ANOVA) detected in the association panel.
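As a rough illustration of the quantity plotted in panel a, the sketch below computes a coefficient of variation as the standard deviation divided by the mean, expressed as a percentage, and assigns it to the bins shown on the x-axis. The use of the sample standard deviation, the handling of values that fall exactly on a bin edge, and the function names are assumptions made for this example; the source does not state them.

```python
import bisect
import statistics

# Bin edges and labels copied from the x-axis of panel a.
CV_BIN_EDGES = [50, 75, 100, 125, 150, 175, 200]
CV_BIN_LABELS = ["0-50", "50-75", "75-100", "100-125",
                 "125-150", "150-175", "175-200", ">200"]


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100.0


def cv_bin(cv_percent):
    """Map a CV (in percent) to a histogram bin label.

    Assumption: a value exactly on an edge goes to the higher bin.
    """
    return CV_BIN_LABELS[bisect.bisect_right(CV_BIN_EDGES, cv_percent)]


# Hypothetical metabolite levels, for illustration only.
levels = [1.0, 3.0]
cv = coefficient_of_variation(levels)
print(round(cv, 2), cv_bin(cv))
```

Applying `coefficient_of_variation` to each of the 587 traits and counting the labels returned by `cv_bin` would give a histogram of the same shape as panel a.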

[Figure: bar plots for Leaf (a) and Grain (b). x-axis: Genome (chromosomes 1-12); y-axis: Number of significant loci.]

Supplementary Figure 4 Number of significant associations in the mGWAS results for rice leaf (a) and rice grain (b).

[Figure: genome-wide association maps (x-axis: Genome, chromosomes 1-12) for metabolites measured in both tissues, with color keys for -logP (6 to 20) and CV (0.1 to 1.0).]

Supplementary Figure 5 Comparison of the associations for metabolites shared between rice grain (up) and rice leaf (down).

Pro, proline; Tyr, tyrosine; Arg, arginine; His, histidine; Lys, lysine; Phe, phenylalanine; Ala, alanine; Val, valine; Try, tryptamine; Ile, isoleucine; LPC (12:1), lysophosphatidyl choline (12:1); LA, linoleic acid; LPC (18:1), lysophosphatidyl choline (18:1); LPC (16:2), lysophosphatidyl choline (16:2); Api 7-O-G, apigenin 7-O-glucoside; Api 7-O-R, apigenin 7-O-rutinoside; C-P-api O-R, C-pentoside-apigenin O-rutinoside; Tri O-R-mh, tricin O-rutinoside-malonylhexoside; Lut 6-C-G, luteolin 6-C-glucoside; di-C,C-P-api, di-C,C-pentosyl-apigenin; Nar 7-O-G, naringenin 7-O-glucoside; o-meApi C-P, O-methylapigenin C-pentoside; Peo 3-O-H, peonidin 3-O-hexoside; Tri 4'-gg-O-H, tricin 4'-O-(β-guaiacylglyceryl) ether O-hexoside; Api, apigenin; Chr, chrysoeriol.

[Figure: panels a-f. (a) MS/MS spectrum, Intensity (cps) vs m/z (Da), with labeled fragments at m/z 179.2, 267.2, 297.1, 326.9, 351.0, 381.1, 411.2 and 447.2 and the structure of O-methylapigenin 6-C-hexoside. (b) -Log10(P) vs Chromosome. (c) Gene model. (d) LD heat map, r2 scale 0 to 1. (e) Relative content of O-methylapigenin 6-C-hexoside by allele (GACA vs GACACA); n = 146/340, P = 4.9 × 10^-55, R2 = 77%. (f) Intensity (cps) vs RT (min) for Os04g11970 and empty vector; peak labels: O-methylapigenin 6-C-glucoside, apigenin 6-C-glucoside (standard); labeled RTs 8.2, 7.9 and 7.9 min.]

Supplementary Figure 6 Functional annotation of Os04g11970 and the assignment of associated sites.

(a) Structure and LC-MS/MS fragmentation of O-methylapigenin 6-C-hexoside. The structure and the major fragments of O-methylapigenin 6-C-hexoside are shown. (b) Manhattan plot displaying the GWAS result for the content of O-methylapigenin 6-C-hexoside. (c) Gene model of Os04g11970. The filled black box represents the coding sequence. The grey vertical lines mark the polymorphic sites identified by high-throughput sequencing, and the stars represent the associated sites. (d) A representation of the pairwise r2 values (a measure of LD) among all polymorphic sites in Os04g11970, where the darkness of the color of each box corresponds to the r2 value according to the legend. (e) Box plot of O-methylapigenin 6-C-hexoside content, plotted by genotype at the associated site vf0406561691 on Chr4. (f) LC-MS chromatograms of in vitro enzyme assays showing the enzyme activity of recombinant Os04g11970 (up). Protein extract from E. coli containing the pDEST15 empty vector was used as a negative control (down).

[Figure: (a) PCA plot of metabolites, each point labeled with its mr identifier, grouped into clusters 1-6. (b) GGM network with sub networks 1-3. Legend (metabolite classes): unknown, flavonoids, others, AA and NA ders, phenolamines, fatty acids, vitamins, terpenoids. Legend (edges): positive, negative.]

Supplementary Figure 7 The PCA (a) and GGM (b) results of rice grain.

AA and NA ders, amino acid and nucleic acid derivatives.

[Figure: MS/MS spectra with structures; axes: Intensity (cps) vs m/z (Da), range 50-300. Labeled peaks (m/z), listed in the order the spectra appear:
(a) Tryptamine: 77.0, 91.1, 117.1, 144.1, 161.0. N-acetyltryptamine: 77.1, 91.1, 117.1, 127.1, 144.1, 186.1, 203.1. N-benzoyltryptamine: 77.1, 91.1, 105.1, 117.1, 128.0, 144.1, 204.1, 247.2, 265.1. N-cinnamoyltryptamine: 77.0, 90.9, 103.1, 117.1, 130.2, 144.1, 191.1, 273.9, 291.0.
(b) Serotonin: 61.1, 77.0, 89.1, 105.1, 115.1, 132.1, 160.1, 177.1. N-benzoylserotonin: 77.1, 95.1, 105.1, 115.2, 132.1, 143.1, 160.1, 210.1, 229.0, 249.0, 264.1, 280.9. N-salicyloylserotonin: 77.1, 105.1, 115.2, 132.1, 160.1, 209.2, 265.1, 281.1, 297.2.]

Supplementary Figure 8 Mass spectra and structures of selected metabolites from the GGM results.

(a) Tryptamine-related metabolites and (b) serotonin-related metabolites.
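The spectra in this figure carry labeled fragment peaks, and related compounds share several of them. A minimal sketch of such a shared-fragment comparison is given below. The peak lists are the labels read from the tryptamine and N-cinnamoyltryptamine spectra in panel a; the matching tolerance of 0.3 Da and the function name are assumptions chosen for this example, not parameters reported in the source.

```python
def shared_fragments(peaks_a, peaks_b, tol=0.3):
    """Return (mz_a, mz_b) pairs whose m/z values agree within tol Da.

    tol is an illustrative tolerance, not a value taken from the source.
    """
    return [(a, b) for a in peaks_a for b in peaks_b if abs(a - b) <= tol]


# Peak labels read from panel a of Supplementary Figure 8.
tryptamine = [77.0, 91.1, 117.1, 144.1, 161.0]
n_cinnamoyltryptamine = [77.0, 90.9, 103.1, 117.1, 130.2,
                         144.1, 191.1, 273.9, 291.0]

for mz_a, mz_b in shared_fragments(tryptamine, n_cinnamoyltryptamine):
    print(f"shared fragment: {mz_a} ~ {mz_b}")
```

With these inputs the m/z 144.1 fragment is among the matches, while the tryptamine peak at m/z 161.0 has no counterpart in the second list.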

[Figure: one Manhattan plot and one quantile-quantile plot per trait. Panel titles: mr896 (N-Benzoyltryptamine); mr903 (N-Benzoylserotonin); mr904 (N-Cinnamoyltryptamine); mr908 (N-Salicyloylserotonin); 1000 grain weight; Grain length; Grain thickness; Grain width; Hull color; Seed color.]

Supplementary Figure 9 The related Manhattan plots (up) and quantile-quantile plots (down).

[Figure: histogram of permutation results; x-axis ticks 0-12, y-axis ticks 0-3000. Annotations: mean = 3.0; 95% quantile = 5.3.]

Supplementary Figure 10 Permutation test of the number of homologous or co-linear loci occurring by chance.

Given the number of loci studied in the two species, an average of 3.0 out of 42 homologous or co-linear loci could be due to chance alone. The 95% quantile of the distribution of homologous or co-linear metabolite-metabolite loci is 5.3.
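The permutation procedure itself is not described here, so the following is only a toy sketch of the general idea: place two sets of loci at random in a shared set of genomic bins, count how often they coincide, and summarize the resulting null distribution by its mean and 95% quantile. The bin count, the numbers of loci, the number of permutations and the seed are all hypothetical values chosen so that the expected overlap is `n_a * n_b / n_bins = 3`; they are not taken from the study.

```python
import random
import statistics


def null_overlap_counts(n_bins, n_a, n_b, n_perm, seed=0):
    """Simulate chance overlaps between two randomly placed sets of loci.

    All parameters are hypothetical; the study's actual permutation
    scheme is not given in the source.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        loci_a = set(rng.sample(range(n_bins), n_a))
        loci_b = rng.sample(range(n_bins), n_b)
        counts.append(sum(1 for locus in loci_b if locus in loci_a))
    return counts


counts = null_overlap_counts(n_bins=1000, n_a=50, n_b=60, n_perm=2000)
null_mean = statistics.fmean(counts)
# statistics.quantiles with n=20 returns 19 cut points; the last is the 95% quantile.
null_q95 = statistics.quantiles(counts, n=20)[-1]
print(f"mean = {null_mean:.2f}, 95% quantile = {null_q95:.2f}")
```

An observed count far above the 95% quantile of such a null distribution, as with the 42 loci mentioned in the caption, would be unlikely to arise by chance alone.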

[Figure: panels a-e show paired regional association plots for rice (top) and maize (bottom); axes: -log(P) vs position (Mb); r2 color scale 0 to 0.8; gene models drawn beneath each plot. (a) Rice Chr8, lead SNP sf0802256337, gene labels include OsTDC1; maize Chr10, PZE-110043443. (b) Rice Chr1, sf0130643809, gene labels include OsUGT-3; maize Chr3, chr3.S_200219661. (c) Rice Chr1, sf0123644140, gene labels include RUGT-5; maize Chr3, chr3.S_215786480. (d) Rice Chr6, sf0603183527; maize Chr6, SYNGENTA0813. (e) Rice Chr11, sf1114353342, gene labels include Os11g25454; maize Chr4, chr4.S_3210771. (f) LC-MS chromatograms, Intensity (cps) vs RT (min), for Os11g25454 and empty vector; labeled peaks: apigenin 7-O-glucoside (RT 8.7), apigenin (RT 11.4), apigenin standard (RT 11.4).]

Supplementary Figure 11 Colinear genomic regions and the homologous loci (or genes) of tryptamine (a), 3', 4', 5'-tricetin O-hexoside (b), chrysoeriol (c), caffeic acid (d) and apigenin 7-O-glucoside (e) between rice grain and maize kernel. (f) Annotation of Os11g25454 as the candidate gene underlying the mGWAS signal for apigenin 7-O-glucoside, with its function verified in vitro. LC-MS chromatograms of the in vitro enzyme assay show the enzyme activity of recombinant Os11g25454 (up). Protein extract from E. coli containing the pDEST15 empty vector was used as a negative control (down).

[Figure: neighbor-joining tree with bootstrap values at the nodes; scale bar 0.1. Taxa, in the order shown: At3GlcT, At3AraT, At3RhaT, Vv3GlcT, Ph3GlcT, Pf3GlcT, Hv3GlcT, Zm3GlcT, Os11g25454, UGT84A1, UGT84A2, At5GlcT, Ph5GlcT, Pf5GlcT, Vh5GlcT, Os07g32060, Os01g53460, Os06g18010, Os06g18140, GRMZM2G383404, Os06g18670, Os06g18790, At7RhaT, At7GlcT, DbB5GlcT, NtIS5a, Gt3GlcT, Os02g37690, CmF7G2RhaT, BpA3G2GlcAT, UGT79B1, IpA3G2GlcT, PhA3G2RhaT, Os11g26950.]

Supplementary Figure 12 Phylogenetic analysis of glucosyltransferase genes from the plant glucosyltransferase family.

The neighbor-joining tree was constructed using aligned full-length amino acid sequences. Bootstrap values from 1,000 replicates are indicated at each node. Bar = 0.1 amino acid substitutions per site.

[Figure: (a) Relative expression of Os06g18670 in WT and lines 1, 2 and 3. (b) Relative expression of Os06g18790 in WT and lines 9, 7 and 5. (c, d) Relative content of metabolites mr004, mr584, mr430, mr582, mr082, mr543 and mr069 in WT and the transgenic lines.]

Supplementary Figure 13 Transgenic results for Os06g18670 and Os06g18790.

Expression levels of Os06g18670 (a) and Os06g18790 (b); (c and d) relative content of selected flavonoids in transgenic rice individuals. WT, the transgenic background variety ZH11. P values were calculated using Student's t-test. Data are shown as means ± s.e.m., n = 3. mr004, di-C,C-pentosyl-apigenin; mr584, C-hexosyl-luteolin O-p-coumaroylhexoside; mr430, luteolin 6-C-glucoside; mr582, C-hexosyl-apigenin O-p-coumaroylhexoside; mr082, C-hexosyl-apigenin O-feruloylhexoside; mr543, C-hexosyl-chrysoeriol O-hexoside; mr069, di-C,C-hexosyl-apigenin derivative.

[Figure: bar plots for WT, overexpression lines (OX: 12, 2, 13, 4) and RNAi lines (3, 9, 1, 8). (a) Relative expression. (b) Trigonelline content (×10^6). (c) Grain width (mm), annotated with P values: 3.6E-05, 2.5E-06, 6.6E-06, 5.3E-08, 5.8E-04, 3.9E-06, 1.9E-06, 9.5E-07.]

Supplementary Figure 14 Phenotype data in transgenic plants.

Shown are bar plots of the mRNA level of Os02g57760 (a) and the content of trigonelline (b) in transgenic-positive individuals. (c) Comparison of grain width between transgenic plants and wild type. P values were calculated using Student's t-test. Data are shown as means ± s.e.m., n = 3.

Supplementary Note 1

Metabolite identification and putative annotation strategies.

We determined the relative levels of 837 distinct metabolic traits in rice grains using a newly developed liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based, widely targeted metabolic profiling method (ref. 1). Of the 837 metabolic features, 80 were identified based on comparisons of MS/MS spectra, exact mass and retention time with those of authentic standards (Supplementary Data 2). A total of 230 were putatively annotated based on high-resolution MS, MS/MS spectra and other strategies, including: i) linking the unknown metabolites to functionally related genes based on genetic mapping (ref. 2), and/or further verifying the genes using in vivo or in vitro strategies; ii) connecting two metabolites with similar MS/MS spectra (ref. 3); and iii) combining the Gaussian graphical model (GGM; ref. 4) with the similarity of the MS/MS spectra. The details are provided below.

To annotate more metabolites, we associated the unknown metabolites with annotated genes based on our high-resolution genetic mapping, as previously reported (ref. 2). For example, the strong association between SNP sf0233224406, located 22 kb from OsLKR (ref. 5; encoding saccharopine dehydrogenase), and the mr208 (m/z 277) level suggested that this metabolite could be saccharopine or its derivative. We subsequently identified the mr208 metabolite as saccharopine by comparing the retention time and fragmentation pattern of this metabolite with those of the commercial standard (Supplementary Figure 1).

Similarly, the strong association between SNP sf0605268699, located 45 kb from OsC1 (ref. 6; encoding a MYB transcription factor), and the mr063 (m/z 465.1033) level suggested that this metabolite could be an anthocyanin or its derivative. We subsequently identified the mr063 metabolite as delphinidin 3-O-glucoside by high-resolution MS and the fragmentation pattern of this metabolite (Supplementary Figure 2). Using this approach, the mGWAS enabled the putative annotation of more than 40 metabolites (Supplementary Data 4).
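The SNP-to-gene distances quoted above can be reasoned about with a small helper. The sketch below assumes that identifiers such as sf0233224406 encode a two-digit chromosome followed by an eight-digit base-pair position. This convention is an inference, not something the source states, although it agrees with Supplementary Figure 11, where sf0802256337 is plotted on rice Chr8 within a 2.20-2.32 Mb window. The gene interval used in the example is a hypothetical placeholder chosen only to yield a 22 kb distance; it is not the real location of OsLKR.

```python
import re

# Assumed ID layout: "sf" or "vf", two-digit chromosome, eight-digit position.
_SNP_ID = re.compile(r"^[sv]f(\d{2})(\d{8})$")


def parse_snp_id(snp_id):
    """Split an ID like 'sf0233224406' into (chromosome, position_bp)."""
    match = _SNP_ID.match(snp_id)
    if match is None:
        raise ValueError(f"unrecognized SNP identifier: {snp_id!r}")
    return int(match.group(1)), int(match.group(2))


def distance_to_gene_kb(pos_bp, gene_start_bp, gene_end_bp):
    """Distance in kb from a position to the nearest gene boundary (0 if inside)."""
    if gene_start_bp <= pos_bp <= gene_end_bp:
        return 0.0
    return min(abs(pos_bp - gene_start_bp), abs(pos_bp - gene_end_bp)) / 1000.0


chromosome, position = parse_snp_id("sf0233224406")
# Hypothetical gene interval, for illustration only.
print(chromosome, position, distance_to_gene_kb(position, 33_246_406, 33_250_000))
```

In practice the gene coordinates would come from the genome annotation used for the mapping, and candidate genes would be those within a chosen window around the lead SNP.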

We used the Gaussian graphical model (GGM) to reconstruct pathways involving directly related metabolites. A GGM is based on pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites (ref. 4). First, we performed principal component analysis (PCA) on the genotype mean values to summarize the correlations and pinpoint groups of correlated metabolites. We found some obvious clusters, such as flavonoids, some amino acids and terpenoids, suggesting strong correlations within these groups (Supplementary Fig. 7a).
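To make the idea of a correlation "conditioned against all other metabolites" concrete, the sketch below derives partial correlations from the inverse of a correlation matrix, using the standard identity that the partial correlation between variables i and j equals -P_ij / sqrt(P_ii * P_jj), where P is the inverse matrix. This is a plain matrix inversion on a hypothetical three-variable example, not the empirical Bayes estimator used in the study, and all names and numbers are illustrative.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Partial correlation of each pair, conditioned on all other variables."""
    prec = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
             for j in range(n)] for i in range(n)]


# Hypothetical chain A - B - C: A and C correlate only through B (0.4 = 0.8 * 0.5).
corr = [[1.0, 0.8, 0.4],
        [0.8, 1.0, 0.5],
        [0.4, 0.5, 1.0]]
pcor = partial_correlations(corr)
print(round(pcor[0][1], 3), round(pcor[0][2], 3))
```

In this toy case A and C have a Pearson correlation of 0.4 but a partial correlation of essentially zero, which is why a GGM keeps the direct A-B and B-C edges and drops the indirect A-C edge.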

For the GGM calculation in this article, a full data matrix was constructed from 502 samples and 587 metabolites. A GGM with an empirical Bayes approach [7] was employed to estimate partial correlations and reconstruct a GGM network from the dataset. In addition to the results for the full population, we included separate GGM analyses for the two genetic subgroups (indica and japonica). We compared the GGM networks of the three groups, filtered at a significance threshold of P < 2.9E-07 based on the Bonferroni correction. Together, the resulting GGM consists of a total of 2119 connections (Supplementary Data 15). In accordance with previous observations, we consistently observed associations between biochemically related metabolites from various metabolic pathways in both the overall network (Supplementary Fig. 7b) and the top list of high-scoring GGM edges. Examples include naringenin 7-O-glucoside and apigenin 7-O-glucoside (P-cor = 0.30, Pearson’s correlation coefficient), which are involved in flavonoid metabolism, and threonyl carbamoyl adenosine and [1,2,4]triazolo[1,5-a]pyrimidine-7-carboxamide,4,5,6,7-tetrahydro-N-(2-methoxy-5-methylphenyl)-5-oxo- (P-cor = -0.08, Pearson’s correlation coefficient), which represent related nucleic acid derivatives (Supplementary Fig. 7b). Then, we searched for high-scoring correlated pairs of an unknown and a known metabolite that might provide a biochemical context for the unknown metabolite. For example, the correlation between tryptamine (mr653) and mr904 was 0.09. Both metabolites shared the major m/z 144 fragment (the main ion for tryptamine), suggesting that mr904 was a tryptamine derivative. We putatively annotated mr904 as N-cinnamoyltryptamine by comparing its MS and fragmentation patterns with those of tryptamine (Supplementary Data 15). Over 30 metabolites were putatively annotated using this approach (Supplementary Data 4 and 15).
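The GGM step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the empirical Bayes estimator of ref. 7 chooses its shrinkage intensity analytically, whereas the fixed `shrinkage` value, the function names and the significance level `alpha = 0.05` used here are assumptions made purely for illustration. The sketch shows how partial correlations follow from the inverse of a regularized correlation matrix, and that dividing 0.05 by the number of metabolite pairs among 587 metabolites gives a value consistent with the reported threshold of 2.9E-07.

```python
import numpy as np
from math import comb


def partial_correlations(X, shrinkage=0.1):
    """Partial correlations from a samples-by-metabolites matrix X.

    The correlation matrix is shrunk toward the identity with a FIXED
    intensity (an assumption; ref. 7 estimates it from the data) so that
    it stays invertible when metabolites outnumber samples.
    """
    R = np.corrcoef(X, rowvar=False)
    p = R.shape[0]
    R_shrunk = (1.0 - shrinkage) * R + shrinkage * np.eye(p)
    omega = np.linalg.inv(R_shrunk)      # precision matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)       # standardize and flip the sign
    np.fill_diagonal(pcor, 1.0)
    return pcor


def bonferroni_threshold(n_variables, alpha=0.05):
    """Bonferroni threshold over all unordered variable pairs."""
    return alpha / comb(n_variables, 2)


if __name__ == "__main__":
    # 587 metabolites give 171,991 pairs, so the threshold is about 2.9e-07.
    print(comb(587, 2), f"{bonferroni_threshold(587):.1e}")
```

An edge would then be retained when the P-value of its partial correlation falls below this threshold; computing those P-values is left out of the sketch.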

Supplementary Note 2

The process and criteria for the assignment of candidate genes responsible for the variation of metabolic traits based on mGWAS.

To identify the candidate genes responsible for the variation of metabolic traits, we applied the following criteria: i) estimating the allelic effect of each genotypic class in close proximity to the most significant peak SNPs and confirming the associated SNP/InDel; ii) looking for a protein or protein cluster encoded at these loci that was biochemically and/or biologically related to the associated metabolic trait; iii) performing cluster analysis of the candidate genes relative to homologous genes with known functions; iv) cross-referencing with results from linkage mapping; and v) verifying the candidate genes according to their tissue-specific expression patterns.

For example, SNP sf0310132518, located 12 kb from Os03g18130 (encoding a putative asparagine synthetase), was significantly associated (P = 5.7E-07, LMM, n = 502) with asparagine (mr173). The high sequence identity (68% at the amino acid level) between Os03g18130 and AtASN2 suggested that Os03g18130 encoded an asparagine synthetase. This hypothesis was supported by the preferential expression of Os03g18130 together with a higher accumulation of this metabolite in the rice grain.

We also observed a 2 bp deletion that resulted in a frameshift in Os04g11970. This deletion was highly significantly associated (P = 6.7E-47, LMM, n = 502) with the variation and the absence of O-methylapigenin C-hexoside, strongly suggesting that it is a loss-of-function allele of this candidate.

SNP sf0524319598, located in Os05g41645 (encoding a putative chalcone synthase), was significantly associated (P = 3.3E-52, LMM, n = 502) with C-pentosyl-apigenin O-rutinoside (mr080). The high sequence identity (48% at the amino acid level) between Os05g41645 and AtTT4 suggested that Os05g41645 encoded a chalcone synthase underlying the variation of this flavonoid.

SNP sf0137818225, located 27 kb from Os01g65260 (encoding a putative amidophosphoribosyltransferase), was significantly associated (P = 2.5E-10, LMM, n = 502) with threonyl carbamoyl adenosine (mr408). These data strongly suggested that Os01g65260 encoded an amidophosphoribosyltransferase involved in the accumulation of threonyl carbamoyl adenosine.

Together, more than 30 candidate genes were newly identified by examining the mGWAS data from the rice grain alone, in addition to 30 genes that were previously identified in studies using mutants or recombinant and natural populations (Supplementary Data 14).

Supplementary Note 3

Comparative mGWAS between rice and maize.

Comparative linkage mapping between crop plants, such as wheat, maize, and rice [8,9], has revealed good correspondence among QTLs for traits including seed size, shattering habit, and flowering time, and has been suggested as a useful tool for predicting the loci of homologous major genes [10-12]. We modified and extended this concept in our mGWAS for candidate gene mining based on the co-linear mapping of the targeted metabolic trait(s) between species (e.g., searching for candidates within homologous or co-linear regions co-mapped by the same metabolites detected in both species). Because orthologous genes in rice and maize may differ in substrate specificity (e.g., being responsible for similar but not identical metabolites), metabolites with similar structures were also included in the comparison.

In this study, the rice (Nipponbare, MSU version 6.1) and maize (B73, RefGen_v2) genomes were used for the identification and characterization of homologous regions. The sequence alignment analysis was based on the VISTA sequence alignment program [13]. Detailed information concerning the homologous fragments between the two species is available from the VISTA database (http://genome.lbl.gov/vista/index.shtml) [14].

We previously performed metabolic profiling of 983 metabolic features in 702 diverse maize accessions and identified hundreds of significant locus-trait associations in maize kernel through mGWAS [15]. To investigate the common genetic control of metabolism between rice and maize, we focused on the 123 co-detected metabolic features in rice grains and maize kernels (Supplementary Data 4). The co-detected metabolic traits were used to filter the loci detected through mGWAS in the rice and maize grain. The genome-wide threshold was set at P = 1.8E-06 (MLM, n = 339) for maize [17] and at P = 1.3E-06, 1.8E-06 and 4.1E-06 (LMM, n = 502) for the whole panel, indica and japonica rice [2], respectively. According to these thresholds, we obtained a total of 420 (Supplementary Data 16) and 292 (Supplementary Data 17) loci for the 123 co-detected metabolic features in rice and maize, respectively. We searched for homologous loci mapped by the same metabolites or metabolites of similar structures between the two species by referring to the VISTA database (http://genome.lbl.gov/vista/index.shtml) and detected 42 loci for 23 metabolites or metabolites of similar structures in both species (Supplementary Data 18).

To test the significance of the homologous or co-linear GWAS loci, we adopted the randomization test of Churchill and Doerge (1994) [16] to determine the proportion of overlaps expected to occur by chance. The number of homologous or co-linear GWAS loci expected at random was calculated as follows. All SNP hits of co-detected metabolites were randomly distributed over the 420 and 292 identified association positions in rice grain and maize kernel, respectively. Then, we counted the number of homologous or co-linear loci for metabolites with the same or similar structures using rice and maize fragments according to their local LD decays. This procedure was repeated 10,000 times and yielded a distribution of the expected numbers of homologous or co-linear loci. Then, this distribution was compared against the outcome for the actual data. The mean and 95% quantile of the distribution for the metabolite-metabolite homologous or co-linear loci were 3.0 and 5.3, respectively (Supplementary Fig. 10), suggesting that the majority of the observed overlaps could not be explained by chance alone.
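The randomization procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: each locus is reduced to a hypothetical (metabolite, syntenic block) pair, an overlap is counted when the same pair occurs in both species, and metabolite labels are shuffled across the fixed association positions within each species. The local LD-decay windows and the matching of structurally similar metabolites used in the study are not modeled, and all function names are illustrative.

```python
import random


def count_overlaps(rice, maize):
    """Number of (metabolite, block) pairs present in both species."""
    return len(set(rice) & set(maize))


def permutation_null(rice, maize, n_perm=10_000, seed=0):
    """Null distribution of overlap counts.

    Metabolite labels are shuffled over the fixed association positions
    (blocks) independently in each species, then overlaps are recounted.
    """
    rng = random.Random(seed)
    rice_met, rice_blk = zip(*rice)
    maize_met, maize_blk = zip(*maize)
    null = []
    for _ in range(n_perm):
        r = list(rice_met)
        m = list(maize_met)
        rng.shuffle(r)
        rng.shuffle(m)
        null.append(count_overlaps(zip(r, rice_blk), zip(m, maize_blk)))
    return null


def empirical_p(observed, null):
    """One-sided empirical P-value with the usual +1 correction."""
    exceed = sum(1 for x in null if x >= observed)
    return (1 + exceed) / (1 + len(null))


def quantile(values, q):
    """Simple empirical quantile (nearest-rank on the sorted values)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]
```

With real data, the observed count (42 loci in this study) would be compared with the mean and the 95% quantile of the list returned by `permutation_null`.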

Next, we looked for homologous gene(s) within the homologous or co-linear loci between rice and maize using an expectation value (E) of 1E-10 as the significance threshold [17]. Using this approach, a number of candidate genes were assigned (Supplementary Fig. 11 and Supplementary Data 19), including reported genes for metabolic traits such as tryptophan decarboxylase OsTDC1 (Os08g04540), which catalyzes the conversion of tryptophan into tryptamine in rice [18] (Supplementary Fig. 11a), and two flavonoid O-UDP-glucosyl transferases (OsUGT-3 [19] and RUGT-5 [20]) underlying the variation of 3’,4’,5’-tricetin O-hexoside and chrysoeriol, respectively (Supplementary Figs. 11b-c).
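As a small illustration of the E-value cutoff, the following Python sketch keeps only gene pairs whose expectation value does not exceed 1E-10. The `Hit` record, the gene identifiers and the choice of an inclusive comparison are hypothetical; the source does not state which search tool or output format was used.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One sequence-similarity hit between a rice and a maize gene (hypothetical layout)."""
    rice_gene: str
    maize_gene: str
    evalue: float


def significant_hits(hits: List[Hit], max_evalue: float = 1e-10) -> List[Hit]:
    """Keep hits with an E-value at or below the threshold (inclusive by assumption)."""
    return [h for h in hits if h.evalue <= max_evalue]


if __name__ == "__main__":
    # Placeholder identifiers, not genes from the study.
    example = [
        Hit("rice_gene_A", "maize_gene_A", 3e-45),
        Hit("rice_gene_B", "maize_gene_B", 2e-4),
    ]
    for hit in significant_hits(example):
        print(hit.rice_gene, hit.maize_gene, hit.evalue)
```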

Supplementary References

1. Chen, W. et al. A novel integrated method for large-scale detection, identification, and quantification of widely targeted metabolites: application in the study of rice metabolomics. Mol Plant 6, 1769-1780 (2013).

2. Chen, W. et al. Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism. Nat Genet 46, 714-721 (2014).

3. Matsuda, F. et al. Metabolome-genome-wide association study dissects genetic architecture for generating natural variation in rice secondary metabolism. Plant J 81, 13-23 (2015).

4. Krumsiek, J. et al. Mining the unknown: a systems approach to metabolite identification combining genetic and metabolic information. PLoS Genet 8, e1003005 (2012).

5. Kawakatsu, T. & Takaiwa, F. Differences in transcriptional regulatory mechanisms functioning for free lysine content and seed storage protein accumulation in rice grain. Plant Cell Physiol 51, 1964-1974 (2010).

6. Saitoh, K., Onishi, K., Mikami, I., Thidar, K. & Sano, Y. Allelic diversification at the C (OsC1) locus of wild and cultivated rice: nucleotide changes associated with phenotypes. Genetics 168, 997-1007 (2004).

7. Schäfer, J. & Strimmer, K. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21, 754-764 (2005).

8. Moore, G., Devos, K.M., Wang, Z. & Gale, M.D. Cereal genome evolution. Grasses, line up and form a circle. Curr Biol 5, 737-739 (1995).

9. Ahn, S. & Tanksley, S.D. Comparative linkage maps of the rice and maize genomes. Proc Natl Acad Sci U S A 90, 7980-7984 (1993).

10. Lin, Y.R., Schertz, K.F. & Paterson, A.H. Comparative analysis of QTLs affecting plant height and maturity across the Poaceae, in reference to an interspecific sorghum population. Genetics 141, 391-411 (1995).

11. Ming, R. et al. Comparative analysis of QTLs affecting plant height and flowering among closely-related diploid and polyploid genomes. Genome 45, 794-803 (2002).

12. Paterson, A.H. et al. Convergent domestication of cereal crops by independent mutations at corresponding genetic loci. Science 269, 1714-1718 (1995).

13. Frazer, K.A., Pachter, L., Poliakov, A., Rubin, E.M. & Dubchak, I. VISTA: computational tools for comparative genomics. Nucleic Acids Res 32, W273-279 (2004).

14. Mayor, C. et al. VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics 16, 1046-1047 (2000).

15. Wen, W. et al. Metabolome-based genome-wide association study of maize kernel leads to novel biochemical insights. Nat Commun 5, 3438 (2014).

16. Churchill, G.A. & Doerge, R.W. Empirical threshold values for quantitative trait mapping. Genetics 138, 963-971 (1994).

17. Dean, R.A. et al. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature 434, 980-986 (2005).

18. Kang, S., Kang, K., Lee, K. & Back, K. Characterization of rice tryptophan decarboxylases and their direct involvement in serotonin biosynthesis in transgenic rice. Planta 227, 263-272 (2007).

19. Kim, B.G. et al. Flavonoid O-diglucosyltransferase from rice: molecular cloning and characterization. J Plant Biol 52, 41-48 (2009).

20. Ko, J.H., Kim, B.G., Hur, H.G., Lim, Y. & Ahn, J.H. Molecular cloning, expression and characterization of a glycosyltransferase from rice. Plant Cell Rep 25, 741-746 (2006).