Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 ›...

47
CMU SCS Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

Transcript of Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 ›...

Page 1: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Tensor Analysis – Applications and Algorithms

Christos Faloutsos CMU

Page 2: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery

– Brain scans – coupled matrix-tensor factorization

– Power grid •  Applications – anomaly detection •  Algorithms •  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 2

Page 3: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery

– Brain scans – coupled matrix-tensor factorization

– Power grid •  Applications – anomaly detection •  Algorithms •  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 3

Page 4: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Neuro-semantics

13�

3.�Additional�Figures�and�legends��

Figure�S1.�Presentation�and�set�of�exemplars�used�in�the�experiment. Participants were presented 60 distinct word-picture pairs describing common concrete nouns. These consisted of 5 exemplars from each of 12 categories, as shown above. A slow event-related paradigm was employed, in which the stimulus was presented for 3s, followed by a 7s fixation period during which an X was presented in the center of the screen. Images were presented as white lines and characters on a dark background, but are inverted here to improve readability. The entire set of 60 exemplars was presented six times, randomly permuting the sequence on each presentation.

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

•  Brain Scan Data*

•  9 persons •  60 nouns

•  Questions •  218 questions •  ‘is it alive?’, ‘can

you eat it?’

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

SIAM, July 2017 4 (c) C. Faloutsos, 2017

*Mitchell et al. Predicting human brain activity associated with the meanings of nouns. Science,2008. Data@ www.cs.cmu.edu/afs/cs/project/theo-73/www/science2008/data.html

Page 5: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Neuro-semantics

13�

3.�Additional�Figures�and�legends��

Figure�S1.�Presentation�and�set�of�exemplars�used�in�the�experiment. Participants were presented 60 distinct word-picture pairs describing common concrete nouns. These consisted of 5 exemplars from each of 12 categories, as shown above. A slow event-related paradigm was employed, in which the stimulus was presented for 3s, followed by a 7s fixation period during which an X was presented in the center of the screen. Images were presented as white lines and characters on a dark background, but are inverted here to improve readability. The entire set of 60 exemplars was presented six times, randomly permuting the sequence on each presentation.

•  Brain Scan Data*

•  9 persons •  60 nouns

•  Questions •  218 questions •  ‘is it alive?’, ‘can

you eat it?’

SIAM, July 2017 5 (c) C. Faloutsos, 2017

Patterns?

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 6: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Neuro-semantics

13�

3.�Additional�Figures�and�legends��

Figure�S1.�Presentation�and�set�of�exemplars�used�in�the�experiment. Participants were presented 60 distinct word-picture pairs describing common concrete nouns. These consisted of 5 exemplars from each of 12 categories, as shown above. A slow event-related paradigm was employed, in which the stimulus was presented for 3s, followed by a 7s fixation period during which an X was presented in the center of the screen. Images were presented as white lines and characters on a dark background, but are inverted here to improve readability. The entire set of 60 exemplars was presented six times, randomly permuting the sequence on each presentation.

•  Brain Scan Data*

•  9 persons •  60 nouns

•  Questions •  218 questions •  ‘is it alive?’, ‘can

you eat it?’

Tofullyspecify

amodelwithint

hiscom-

putationalmode

lingframework,o

nemustfirst

defineasetofin

termediateseman

ticfeatures

f 1(w)f 2(w)…f n(w)tobeextract

edfromthetext

corpus.Inthispa

per,eachintermed

iatesemantic

featureisdefined

intermsoftheco

-occurrence

statisticsofthein

putstimuluswo

rdwwitha

particularotherwo

rd(e.g.,“taste”)or

setofwords

(e.g.,“taste,”“tas

tes,”or“tasted”)w

ithinthetext

corpus.Themode

listrainedbythe

applicationof

multipleregressio

ntothesefeature

sf i(w)andthe

observedfMRIim

ages,soastoobt

ainmaximum-

likelihoodestima

tesforthemodel

parametersc vi

(26).Oncetrained,

thecomputational

modelcanbe

evaluatedbygivin

gitwordsoutside

thetraining

setandcomparin

gitspredictedfM

RIimagesfor

thesewordswith

observedfMRIda

ta.Thiscom

putationalmodeli

ngframeworkis

basedontwokey

theoreticalassump

tions.First,it

assumesthesema

nticfeaturesthatd

istinguishthe

meaningsofarbit

raryconcretenoun

sarereflected

inthestatisticsof

theirusewithina

verylargetext

corpus.Thisassum

ptionisdrawnfro

mthefieldof

computationallin

guistics,wherest

atisticalword

distributionsare

frequentlyusedt

oapproximate

themeaningofd

ocumentsandw

ords(14–17).

Second,itassume

sthatthebrainact

ivityobserved

whenthinkinga

boutanyconcrete

nouncanbe

derivedasaweig

htedlinearsumo

fcontributions

fromeachofits

semanticfeatures.

Althoughthe

correctnessofthis

linearityassumpt

ionisdebat-

able,itisconsist

entwiththewide

spreaduseof

linearmodelsin

fMRIanalysis(2

7)andwiththe

assumptionthatf

MRIactivationo

ftenreflectsa

linearsuperpositio

nofcontributions

fromdifferent

sources.Ourtheore

ticalframeworkdo

esnottakea

positiononwheth

ertheneuralactiv

ationencoding

meaningislocali

zedinparticular

corticalre-

gions.Instead,it

considersallcort

icalvoxelsand

allowsthetrainin

gdatatodetermin

ewhichloca-

tionsaresystemat

icallymodulated

bywhichas-

pectsofwordme

anings.

Results.Weevaluated

thiscomputational

mod-elusing

fMRIdatafromn

inehealthy,college

-ageparticipan

tswhoviewed60

differentword-pic

turepairspre

sentedsixtimes

each.Anatomicall

yde-finedreg

ionsofinterestwe

reautomaticallyla

beledaccording

tothemethodolog

yin(28).The60

ran-domlyo

rderedstimuliinc

ludedfiveitems

fromeachof1

2semanticcatego

ries(animals,body

parts,buildings

,buildingparts,clo

thing,furniture,in

sects,kitcheni

tems,tools,vegeta

bles,vehicles,and

otherman-mad

eitems).Areprese

ntativefMRIimage

foreachstim

uluswascreatedb

ycomputingthem

eanfMRIre

sponseoveritssi

xpresentations,a

ndthemeanof

all60oftheserep

resentativeimages

wasthensub

tractedfromeach

[fordetails,see(

26)].Toinstan

tiateourmodeling

framework,wefir

stchoseas

etofintermediate

semanticfeatures.

Tobeeffective,

theintermediatese

manticfeaturesm

ustsimultane

ouslyencodethew

idevarietyofsema

nticcontento

ftheinputstimul

uswordsandfacto

rtheobserved

fMRIactivationin

tomoreprimitivec

om-

Predicted

“celery” = 0.84

“celery”“airplan

e”

Predicted:

Observed:

AB

+...

high

average below average

Predicted “celer

y”:+ 0.35+ 0.32

“eat”“taste”

“fill”

Fig.2.Predicting

fMRIimages

forgivenstimulus

words.(A)

Formingapredict

ionforpar-

ticipantP1fort

hestimulus

word“celery”afte

rtrainingon

58otherwords.Le

arnedc vico-efficients

for3ofthe25s

e-manticfe

atures(“eat,”“tast

e,”and“fill”

)aredepictedby

thevoxelcolo

rsinthethreeima

gesatthetop

ofthepanel.Thec

o-occurrenc

evalueforeacho

fthesefeaturesfor

thestimulusword

“celery”is

showntothelefto

ftheirrespectiveim

ages[e.g.,thevalu

efor“eat(celery)”i

s0.84].The

predictedactivation

forthestimuluswo

rd[shownatthebo

ttomof(A)]isal

inearcombinationo

fthe25semantic

fMRIsignatures,w

eightedby

theirco-occurrence

values.Thisfigure

showsjustonehor

izontalslice[z=

–12mminMontr

ealNeurological

Institute(MNI)spa

ce]ofthepredict

edthree-dim

ensionalimage.(

B)Predictedand

observedfMRIima

gesfor“celery”a

nd“airplane”after

trainingthatuses5

8otherwords.The

twolongredandb

lueverticalstreaks

nearthetop(post

eriorregion)ofthe

predictedandobse

rvedimagesaret

heleftandrightfu

siformgyri.

A B

C

Mean overparticipants

Participant P5

Fig.3.Locations

ofmostac

curatelypre-

dictedvoxels.Sur

face(A)andg

lassbrain(B)

renderingofthecor

rela-tionbetw

eenpredicted

andactualvoxelac

tiva-tionsfor

wordsoutside

thetrainingsetfor

par-ticipantP

5.Thesepanelssho

wclusterscontaining

atleast10contigu

ousvoxels,eachof

whosepredicted-

actualcorrelationis

atleast0.28.These

voxelclustersared

istributedthroughou

tthecortexan

dlocatedinthelef

tandrightoccipit

alandparietallobe

s;leftandrightfu

siform,postcentra

l,andmiddlefronta

lgyri;leftinferiorfr

ontalgyrus;medial

frontalgyrus;anda

nteriorcingulate.

(C)Surfacerenderi

ngofthepredicted-

actualcorrelation

averagedoveralln

ineparticipan

ts.Thispanelrepres

entsclusterscontai

ningatleast10co

ntiguousvoxels,ea

chwithaveragec

orrelationofatleast

0.14.

30MAY2008V

OL320SCIENC

Ewww.science

mag.org1192RESEAR

CHARTICLES

on May 30, 2008 www.sciencemag.org Downloaded from

Tofullyspecify

amodelwithin

thiscom-

putationalmode

lingframework

,onemustfirst

defineasetof

intermediatese

manticfeatures

f 1(w)f 2(w)…f n(w)tobeextrac

tedfromthetext

corpus.Inthisp

aper,eachinter

mediatesemanti

cfeature

isdefinedinte

rmsoftheco-o

ccurrence

statisticsofthe

inputstimulus

wordwwitha

particularotherw

ord(e.g.,“taste”

)orsetofword

s(e.g.,“ta

ste,”“tastes,”or

“tasted”)within

thetextcorpus.

Themodelistra

inedbytheappl

icationof

multipleregressio

ntothesefeatu

resf i(w)andthe

observedfMRI

images,soasto

obtainmaximum

-likelihoo

destimatesfor

themodelparam

etersc vi

(26).Oncetraine

d,thecomputatio

nalmodelcanbe

evaluatedbygiv

ingitwordsou

tsidethetrainin

gsetand

comparingitsp

redictedfMRIim

agesforthesew

ordswithobser

vedfMRIdata.

Thiscomputatio

nalmodeling

frameworkis

basedontwok

eytheoreticalas

sumptions.First

,itassumes

thesemanticfea

turesthatdisting

uishthemeaning

sofarbitraryco

ncretenounsare

reflected

inthestatisticso

ftheirusewithi

naverylargetex

tcorpus.

Thisassumption

isdrawnfromthe

fieldofcomputa

tionallinguistic

s,wherestatist

icalword

distributionsare

frequentlyused

toapproximate

themeaningof

documentsand

words(14–17).

Second,itassum

esthatthebrain

activityobserve

dwhenth

inkingaboutan

yconcretenou

ncanbe

derivedasawe

ightedlinearsu

mofcontributio

nsfromea

chofitsseman

ticfeatures.Alth

oughthe

correctnessofth

islinearityassu

mptionisdebat

-able,it

isconsistentw

iththewidesprea

duseof

linearmodelsin

fMRIanalysis(2

7)andwiththe

assumptiontha

tfMRIactivatio

noftenreflects

alinearsu

perpositionofc

ontributionsfrom

differentsources.

Ourtheoreticalfr

ameworkdoesn

ottakeaposition

onwhetherthen

euralactivation

encoding

meaningisloc

alizedinpartic

ularcorticalre

-gions.I

nstead,itconsid

ersallcorticalv

oxelsand

allowsthetrain

ingdatatodeterm

inewhichloca-

tionsaresystem

aticallymodulat

edbywhichas-

pectsofwordm

eanings.

Results.Weevaluate

dthiscomputatio

nalmod-

elusingfMRId

atafromninehea

lthy,college-age

participantswho

viewed60diffe

rentword-pictur

epairspr

esentedsixtime

seach.Anatom

icallyde-

finedregionsofi

nterestwereauto

maticallylabele

daccordin

gtothemethodo

logyin(28).Th

e60ran-

domlyordered

stimuliincluded

fiveitemsfrom

eachof12sema

nticcategories(a

nimals,bodypart

s,building

s,buildingparts

,clothing,furnit

ure,insects,

kitchenitems,to

ols,vegetables,

vehicles,andoth

erman-ma

deitems).Arepr

esentativefMRI

imagefor

eachstimuluswa

screatedbycomp

utingthemean

fMRIresponse

overitssixpres

entations,andth

emeanof

all60oftheser

epresentativeim

ageswas

thensubtractedf

romeach[ford

etails,see(26)]

.Toinsta

ntiateourmodeli

ngframework,w

efirstchosea

setofintermedia

tesemanticfeatu

res.Tobe

effective,theint

ermediatesema

nticfeaturesmu

stsimultan

eouslyencodeth

ewidevarietyof

semanticcontent

oftheinputstim

uluswordsand

factorthe

observedfMRIa

ctivationintomor

eprimitivecom-

Predicted

“celery” = 0.84

“celery”

“airplane”

Predicted:

Observed:

AB

+...

high

average below averag

e

Predicted “cele

ry”:+ 0.35+ 0.32

“eat”“taste”

“fill”

Fig.2.Predictin

gfMRIimages

forgivenstimu

luswords.(A)

Formingapredic

tionforpar-

ticipantP1for

thestimulus

word“celery”aft

ertrainingon

58otherwords.L

earnedcvico-

efficientsfor3of

the25se-

manticfeatures

(“eat,”“taste,”

and“fill”)ared

epictedbythe

voxelcolorsinth

ethreeimages

atthetopofthe

panel.Theco-

occurrencevalue

foreachofthese

featuresforthes

timulusword“ce

lery”isshownto

theleftoftheirre

spectiveimages[

e.g.,thevaluefor

“eat(celery)”is

0.84].Thepredict

edactivationfor

thestimulusword

[shownatthebo

ttomof(A)]isa

linearcombinatio

nofthe25sema

nticfMRIsignatu

res,weightedby

theirco-occurren

cevalues.Thisf

igureshowsjust

onehorizontals

lice[z=

–12mminMont

realNeurologica

lInstitute(MNI)

space]ofthepr

edictedthree-dim

ensionalimage.

(B)Predictedan

dobservedfMR

Iimagesfor

“celery”and“air

plane”aftertrain

ingthatuses58

otherwords.The

twolongredand

blueverticalstre

aksnearthetop

(posteriorregion

)ofthepredicte

dandobs

ervedimagesare

theleftandrigh

tfusiformgyri.

A B

C

Mean overparticipants

Participant P5

Fig.3.Location

sofmostac

curatelypre-

dictedvoxels.S

urface(A)and

glassbrain(B)

renderingofthe

correla-tionbet

weenpredicted

andactualvoxel

activa-tionsfor

wordsoutside

thetrainingsetf

orpar-ticipantP

5.Thesepanelssh

owclusterscontai

ningatleast10c

ontiguousvoxels,

eachofwhose

predicted-actualc

orrelationisatleas

t0.28.Thesevox

elclustersaredis

tributedthrougho

utthecortexan

dlocatedinthele

ftandrightoccip

italandparietal

lobes;leftandri

ghtfusiform,

postcentral,andm

iddlefrontalgyri;

leftinferiorfronta

lgyrus;medialfron

talgyrus;andant

eriorcingulate

.(C)Surfaceren

deringofthepr

edicted-actualcor

relationaveraged

overallnine

participants.This

panelrepresents

clusterscontaining

atleast10contig

uousvoxels,each

withaverage

correlationofatle

ast0.14.

30MAY2008

VOL320SCI

ENCEwww.sc

iencemag.org

1192RESEARCHART

ICLES

on May 30, 2008 www.sciencemag.org Downloaded from

Tofullyspecify

amodelwithint

hiscom-

putationalmode

lingframework,o

nemustfirst

defineasetofin

termediateseman

ticfeatures

f 1(w)f 2(w)…f n(w)tobeextract

edfromthetext

corpus.Inthispa

per,eachintermed

iatesemantic

featureisdefined

intermsoftheco

-occurrence

statisticsofthein

putstimuluswo

rdwwitha

particularotherwo

rd(e.g.,“taste”)or

setofwords

(e.g.,“taste,”“tas

tes,”or“tasted”)w

ithinthetext

corpus.Themode

listrainedbythe

applicationof

multipleregressio

ntothesefeature

sf i(w)andthe

observedfMRIim

ages,soastoobt

ainmaximum-

likelihoodestima

tesforthemodel

parametersc vi

(26).Oncetrained,

thecomputational

modelcanbe

evaluatedbygivin

gitwordsoutside

thetraining

setandcomparin

gitspredictedfM

RIimagesfor

thesewordswith

observedfMRIda

ta.Thiscom

putationalmodeli

ngframeworkis

basedontwokey

theoreticalassump

tions.First,it

assumesthesema

nticfeaturesthatd

istinguishthe

meaningsofarbit

raryconcretenoun

sarereflected

inthestatisticsof

theirusewithina

verylargetext

corpus.Thisassum

ptionisdrawnfro

mthefieldof

computationallin

guistics,wherest

atisticalword

distributionsare

frequentlyusedt

oapproximate

themeaningofd

ocumentsandw

ords(14–17).

Second,itassume

sthatthebrainact

ivityobserved

whenthinkinga

boutanyconcrete

nouncanbe

derivedasaweig

htedlinearsumo

fcontributions

fromeachofits

semanticfeatures.

Althoughthe

correctnessofthis

linearityassumpt

ionisdebat-

able,itisconsist

entwiththewide

spreaduseof

linearmodelsin

fMRIanalysis(2

7)andwiththe

assumptionthatf

MRIactivationo

ftenreflectsa

linearsuperpositio

nofcontributions

fromdifferent

sources.Ourtheore

ticalframeworkdo

esnottakea

positiononwheth

ertheneuralactiv

ationencoding

meaningislocali

zedinparticular

corticalre-

gions.Instead,it

considersallcort

icalvoxelsand

allowsthetrainin

gdatatodetermin

ewhichloca-

tionsaresystemat

icallymodulated

bywhichas-

pectsofwordme

anings.

Results.Weevaluated

thiscomputational

mod-elusing

fMRIdatafromn

inehealthy,college

-ageparticipan

tswhoviewed60

differentword-pic

turepairspre

sentedsixtimes

each.Anatomicall

yde-finedreg

ionsofinterestwe

reautomaticallyla

beledaccording

tothemethodolog

yin(28).The60

ran-domlyo

rderedstimuliinc

ludedfiveitems

fromeachof1

2semanticcatego

ries(animals,body

parts,buildings

,buildingparts,clo

thing,furniture,in

sects,kitcheni

tems,tools,vegeta

bles,vehicles,and

otherman-mad

eitems).Areprese

ntativefMRIimage

foreachstim

uluswascreatedb

ycomputingthem

eanfMRIre

sponseoveritssi

xpresentations,a

ndthemeanof

all60oftheserep

resentativeimages

wasthensub

tractedfromeach

[fordetails,see(

26)].Toinstan

tiateourmodeling

framework,wefir

stchoseas

etofintermediate

semanticfeatures.

Tobeeffective,

theintermediatese

manticfeaturesm

ustsimultane

ouslyencodethew

idevarietyofsema

nticcontento

ftheinputstimul

uswordsandfacto

rtheobserved

fMRIactivationin

tomoreprimitivec

om-

Predicted

“celery” = 0.84

“celery”“airplan

e”

Predicted:

Observed:

AB

+...

high

average below average

Predicted “celer

y”:+ 0.35+ 0.32

“eat”“taste”

“fill”

Fig.2.Predicting

fMRIimages

forgivenstimulus

words.(A)

Formingapredict

ionforpar-

ticipantP1fort

hestimulus

word“celery”afte

rtrainingon

58otherwords.Le

arnedc vico-efficients

for3ofthe25s

e-manticfe

atures(“eat,”“tast

e,”and“fill”

)aredepictedby

thevoxelcolo

rsinthethreeima

gesatthetop

ofthepanel.Thec

o-occurrenc

evalueforeacho

fthesefeaturesfor

thestimulusword

“celery”is

showntothelefto

ftheirrespectiveim

ages[e.g.,thevalu

efor“eat(celery)”i

s0.84].The

predictedactivation

forthestimuluswo

rd[shownatthebo

ttomof(A)]isal

inearcombinationo

fthe25semantic

fMRIsignatures,w

eightedby

theirco-occurrence

values.Thisfigure

showsjustonehor

izontalslice[z=

–12mminMontr

ealNeurological

Institute(MNI)spa

ce]ofthepredict

edthree-dim

ensionalimage.(

B)Predictedand

observedfMRIima

gesfor“celery”a

nd“airplane”after

trainingthatuses5

8otherwords.The

twolongredandb

lueverticalstreaks

nearthetop(post

eriorregion)ofthe

predictedandobse

rvedimagesaret

heleftandrightfu

siformgyri.

A B

C

Mean overparticipants

Participant P5

Fig.3.Locations

ofmostac

curatelypre-

dictedvoxels.Sur

face(A)andg

lassbrain(B)

renderingofthecor

rela-tionbetw

eenpredicted

andactualvoxelac

tiva-tionsfor

wordsoutside

thetrainingsetfor

par-ticipantP

5.Thesepanelssho

wclusterscontaining

atleast10contigu

ousvoxels,eachof

whosepredicted-

actualcorrelationis

atleast0.28.These

voxelclustersared

istributedthroughou

tthecortexan

dlocatedinthelef

tandrightoccipit

alandparietallobe

s;leftandrightfu

siform,postcentra

l,andmiddlefronta

lgyri;leftinferiorfr

ontalgyrus;medial

frontalgyrus;anda

nteriorcingulate.

(C)Surfacerenderi

ngofthepredicted-

actualcorrelation

averagedoveralln

ineparticipan

ts.Thispanelrepres

entsclusterscontai

ningatleast10co

ntiguousvoxels,ea

chwithaveragec

orrelationofatleast

0.14.

30MAY2008V

OL320SCIENC

Ewww.science

mag.org1192RESEAR

CHARTICLES

on May 30, 2008 www.sciencemag.org Downloaded from

Tofullyspecify

amodelwithin

thiscom-

putationalmode

lingframework

,onemustfirst

defineasetof

intermediatese

manticfeatures

f 1(w)f 2(w)…f n(w)tobeextrac

tedfromthetext

corpus.Inthisp

aper,eachinter

mediatesemanti

cfeature

isdefinedinte

rmsoftheco-o

ccurrence

statisticsofthe

inputstimulus

wordwwitha

particularotherw

ord(e.g.,“taste”

)orsetofword

s(e.g.,“ta

ste,”“tastes,”or

“tasted”)within

thetextcorpus.

Themodelistra

inedbytheappl

icationof

multipleregressio

ntothesefeatu

resf i(w)andthe

observedfMRI

images,soasto

obtainmaximum

-likelihoo

destimatesfor

themodelparam

etersc vi

(26).Oncetraine

d,thecomputatio

nalmodelcanbe

evaluatedbygiv

ingitwordsou

tsidethetrainin

gsetand

comparingitsp

redictedfMRIim

agesforthesew

ordswithobser

vedfMRIdata.

Thiscomputatio

nalmodeling

frameworkis

basedontwok

eytheoreticalas

sumptions.First

,itassumes

thesemanticfea

turesthatdisting

uishthemeaning

sofarbitraryco

ncretenounsare

reflected

inthestatisticso

ftheirusewithi

naverylargetex

tcorpus.

Thisassumption

isdrawnfromthe

fieldofcomputa

tionallinguistic

s,wherestatist

icalword

distributionsare

frequentlyused

toapproximate

themeaningof

documentsand

words(14–17).

Second,itassum

esthatthebrain

activityobserve

dwhenth

inkingaboutan

yconcretenou

ncanbe

derivedasawe

ightedlinearsu

mofcontributio

nsfromea

chofitsseman

ticfeatures.Alth

oughthe

correctnessofth

islinearityassu

mptionisdebat

-able,it

isconsistentw

iththewidesprea

duseof

linearmodelsin

fMRIanalysis(2

7)andwiththe

assumptiontha

tfMRIactivatio

noftenreflects

alinearsu

perpositionofc

ontributionsfrom

differentsources.

Ourtheoreticalfr

ameworkdoesn

ottakeaposition

onwhetherthen

euralactivation

encoding

meaningisloc

alizedinpartic

ularcorticalre

-gions.I

nstead,itconsid

ersallcorticalv

oxelsand

allowsthetrain

ingdatatodeterm

inewhichloca-

tionsaresystem

aticallymodulat

edbywhichas-

pectsofwordm

eanings.

Results.Weevaluate

dthiscomputatio

nalmod-

elusingfMRId

atafromninehea

lthy,college-age

participantswho

viewed60diffe

rentword-pictur

epairspr

esentedsixtime

seach.Anatom

icallyde-

finedregionsofi

nterestwereauto

maticallylabele

daccordin

gtothemethodo

logyin(28).Th

e60ran-

domlyordered

stimuliincluded

fiveitemsfrom

eachof12sema

nticcategories(a

nimals,bodypart

s,building

s,buildingparts

,clothing,furnit

ure,insects,

kitchenitems,to

ols,vegetables,

vehicles,andoth

erman-ma

deitems).Arepr

esentativefMRI

imagefor

eachstimuluswa

screatedbycomp

utingthemean

fMRIresponse

overitssixpres

entations,andth

emeanof

all60oftheser

epresentativeim

ageswas

thensubtractedf

romeach[ford

etails,see(26)]

.Toinsta

ntiateourmodeli

ngframework,w

efirstchosea

setofintermedia

tesemanticfeatu

res.Tobe

effective,theint

ermediatesema

nticfeaturesmu

stsimultan

eouslyencodeth

ewidevarietyof

semanticcontent

oftheinputstim

uluswordsand

factorthe

observedfMRIa

ctivationintomor

eprimitivecom-

Predicted

“celery” = 0.84

“celery”

“airplane”

Predicted:

Observed:

AB

+...

high

average below averag

e

Predicted “cele

ry”:+ 0.35+ 0.32

“eat”“taste”

“fill”

Fig.2.Predictin

gfMRIimages

forgivenstimu

luswords.(A)

Formingapredic

tionforpar-

ticipantP1for

thestimulus

word“celery”aft

ertrainingon

58otherwords.L

earnedcvico-

efficientsfor3of

the25se-

manticfeatures

(“eat,”“taste,”

and“fill”)ared

epictedbythe

voxelcolorsinth

ethreeimages

atthetopofthe

panel.Theco-

occurrencevalue

foreachofthese

featuresforthes

timulusword“ce

lery”isshownto

theleftoftheirre

spectiveimages[

e.g.,thevaluefor

“eat(celery)”is

0.84].Thepredict

edactivationfor

thestimulusword

[shownatthebo

ttomof(A)]isa

linearcombinatio

nofthe25sema

nticfMRIsignatu

res,weightedby

theirco-occurren

cevalues.Thisf

igureshowsjust

onehorizontals

lice[z=

–12mminMont

realNeurologica

lInstitute(MNI)

space]ofthepr

edictedthree-dim

ensionalimage.

(B)Predictedan

dobservedfMR

Iimagesfor

“celery”and“air

plane”aftertrain

ingthatuses58

otherwords.The

twolongredand

blueverticalstre

aksnearthetop

(posteriorregion

)ofthepredicte

dandobs

ervedimagesare

theleftandrigh

tfusiformgyri.

A B

C

Mean overparticipants

Participant P5

Fig.3.Location

sofmostac

curatelypre-

dictedvoxels.S

urface(A)and

glassbrain(B)

renderingofthe

correla-tionbet

weenpredicted

andactualvoxel

activa-tionsfor

wordsoutside

thetrainingsetf

orpar-ticipantP

5.Thesepanelssh

owclusterscontai

ningatleast10c

ontiguousvoxels,

eachofwhose

predicted-actualc

orrelationisatleas

t0.28.Thesevox

elclustersaredis

tributedthrougho

utthecortexan

dlocatedinthele

ftandrightoccip

italandparietal

lobes;leftandri

ghtfusiform,

postcentral,andm

iddlefrontalgyri;

leftinferiorfronta

lgyrus;medialfron

talgyrus;andant

eriorcingulate

.(C)Surfaceren

deringofthepr

edicted-actualcor

relationaveraged

overallnine

participants.This

panelrepresents

clusterscontaining

atleast10contig

uousvoxels,each

withaverage

correlationofatle

ast0.14.

30MAY2008

VOL320SCI

ENCEwww.sc

iencemag.org

1192RESEARCHART

ICLES

on May 30, 2008 www.sciencemag.org Downloaded from

airplane

dog

noun

s

questions

voxels SIAM, July 2017 6 (c) C. Faloutsos, 2017

Patterns?

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

To fully specify a model within this com-putational modeling framework, one must firstdefine a set of intermediate semantic featuresf1(w) f2(w)…fn(w) to be extracted from the textcorpus. In this paper, each intermediate semanticfeature is defined in terms of the co-occurrencestatistics of the input stimulus word w with aparticular other word (e.g., “taste”) or set of words(e.g., “taste,” “tastes,” or “tasted”) within the textcorpus. The model is trained by the application ofmultiple regression to these features fi(w) and theobserved fMRI images, so as to obtain maximum-likelihood estimates for the model parameters cvi(26). Once trained, the computational model can beevaluated by giving it words outside the trainingset and comparing its predicted fMRI images forthese words with observed fMRI data.

This computational modeling framework isbased on two key theoretical assumptions. First, itassumes the semantic features that distinguish themeanings of arbitrary concrete nouns are reflected

in the statistics of their use within a very large textcorpus. This assumption is drawn from the field ofcomputational linguistics, where statistical worddistributions are frequently used to approximatethe meaning of documents and words (14–17).Second, it assumes that the brain activity observedwhen thinking about any concrete noun can bederived as a weighted linear sum of contributionsfrom each of its semantic features. Although thecorrectness of this linearity assumption is debat-able, it is consistent with the widespread use oflinear models in fMRI analysis (27) and with theassumption that fMRI activation often reflects alinear superposition of contributions from differentsources. Our theoretical framework does not take aposition on whether the neural activation encodingmeaning is localized in particular cortical re-gions. Instead, it considers all cortical voxels andallows the training data to determine which loca-tions are systematically modulated by which as-pects of word meanings.

Results. We evaluated this computational mod-el using fMRI data from nine healthy, college-ageparticipants who viewed 60 different word-picturepairs presented six times each. Anatomically de-fined regions of interest were automatically labeledaccording to the methodology in (28). The 60 ran-domly ordered stimuli included five items fromeach of 12 semantic categories (animals, body parts,buildings, building parts, clothing, furniture, insects,kitchen items, tools, vegetables, vehicles, and otherman-made items). A representative fMRI image foreach stimulus was created by computing the meanfMRI response over its six presentations, and themean of all 60 of these representative images wasthen subtracted from each [for details, see (26)].

To instantiate our modeling framework, we firstchose a set of intermediate semantic features. To beeffective, the intermediate semantic features mustsimultaneously encode thewide variety of semanticcontent of the input stimulus words and factor theobserved fMRI activation intomore primitive com-

Predicted“celery” = 0.84

“celery” “airplane”

Predicted:

Observed:

A B

+.. .

high

average

belowaverage

Predicted “celery”:

+ 0.35 + 0.32

“eat” “taste” “fill”

Fig. 2. Predicting fMRI imagesfor given stimulus words. (A)Forming a prediction for par-ticipant P1 for the stimulusword “celery” after training on58 other words. Learned cvi co-efficients for 3 of the 25 se-mantic features (“eat,” “taste,”and “fill”) are depicted by thevoxel colors in the three imagesat the top of the panel. The co-occurrence value for each of these features for the stimulus word “celery” isshown to the left of their respective images [e.g., the value for “eat (celery)” is0.84]. The predicted activation for the stimulus word [shown at the bottom of(A)] is a linear combination of the 25 semantic fMRI signatures, weighted bytheir co-occurrence values. This figure shows just one horizontal slice [z =

–12 mm in Montreal Neurological Institute (MNI) space] of the predictedthree-dimensional image. (B) Predicted and observed fMRI images for“celery” and “airplane” after training that uses 58 other words. The two longred and blue vertical streaks near the top (posterior region) of the predictedand observed images are the left and right fusiform gyri.

A

B

C

Mean over

participants

Participant P

5

Fig. 3. Locations ofmost accurately pre-dicted voxels. Surface(A) and glass brain (B)rendering of the correla-tion between predictedand actual voxel activa-tions for words outsidethe training set for par-

ticipant P5. These panels show clusters containing at least 10 contiguous voxels, each of whosepredicted-actual correlation is at least 0.28. These voxel clusters are distributed throughout thecortex and located in the left and right occipital and parietal lobes; left and right fusiform,postcentral, and middle frontal gyri; left inferior frontal gyrus; medial frontal gyrus; and anteriorcingulate. (C) Surface rendering of the predicted-actual correlation averaged over all nineparticipants. This panel represents clusters containing at least 10 contiguous voxels, each withaverage correlation of at least 0.14.

30 MAY 2008 VOL 320 SCIENCE www.sciencemag.org1192

RESEARCH ARTICLES

on

May

30,

200

8 w

ww

.sci

ence

mag

.org

Dow

nloa

ded

from

Page 7: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Neuro-semantics

Dog

Airplane

Puppy

Toful

lyspe

cifya

modelw

ithin

thisc

om-

putation

almo

delin

gfram

ework

,one

mustf

irst

defin

easet

ofint

ermediates

emantic

featur

esf 1(w

)f 2(w)…

f n(w)to

beext

racted

from

thetex

tcor

pus.I

nthis

paper,

eachinte

rmedi

atesem

antic

featur

eisd

efinedinterm

softh

eco-o

ccurre

nce

statisticso

fthe

input

stimu

luswo

rdw

witha

particu

laroth

erwo

rd(e.g

.,“taste”

)orsetof

words

(e.g.,“

taste,”

“tastes,”

or“ta

sted”)

within

thetex

tcor

pus.T

hemo

delist

rained

bythe

applica

tiono

fmu

ltiplereg

ressio

ntot

hesef

eature

sfi(w

)and

theobser

vedfM

RIimage

s,soa

stoobtain

maxim

um-

likelih

oodestimate

sfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putati

onalm

odelca

nbe

evalua

tedby

giving

itword

souts

idethe

training

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

withobser

vedfM

RIdat

a.Th

iscom

putati

onal

model

ingframe

work

isbas

edon

twokey

theore

ticalassu

mption

s.First

,itass

umes

thesem

anticfea

tures

thatd

istingu

ishthe

meani

ngso

farbitrary

concre

tenouns

arereflec

ted

inthe

statist

icsofthe

iruse

withina

verylarg

etext

corpus.T

hisass

umption

isdraw

nfrom

thefieldof

computati

onal

linguisti

cs,wh

erestatist

icalw

orddis

tributionsa

refrequent

lyuse

dtoa

pprox

imate

theme

aning

ofdocum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinking

about

anycon

crete

noun

canbe

derive

dasa

weigh

tedlinear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.

Although

thecor

rectne

ssof

thisli

nearity

assum

ption

isdeba

t-abl

e,iti

sconsisten

twith

thewides

pread

useof

linear

model

sinfM

RIana

lysis(

27)and

withthe

assum

ption

thatfM

RIact

ivationo

ftenr

eflect

salinear

superp

osition

ofcon

tributio

nsfro

mdifferent

source

s.Ourthe

oretica

lfram

ework

doesnotta

kea

positi

onon

wheth

erthe

neural

activa

tionenc

oding

meaning

isloc

alized

inpartic

ularc

ortica

lre-

gions.

Instea

d,itc

onsid

ersall

cortica

lvoxels

andallo

wsthe

training

datatod

eterm

inewh

ichloc

a-tions

aresys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elusi

ngfM

RIdat

afrom

nineh

ealthy

,colleg

e-age

partici

pantswh

oview

ed60

different

word-

picture

pairs

presen

tedsix

times

each.

Anato

mical

lyde-

fined

region

sofin

terestw

ereaut

omatic

allylab

eled

accord

ingtothe

metho

dology

in(28

).The

60ran

-dom

lyord

eredstim

uliinc

luded

fivei

temsf

romeac

hof12s

emant

iccategor

ies(an

imals,bo

dypar

ts,bui

ldings,

buildin

gparts

,cloth

ing,fu

rniture,i

nsects,

kitchen

items,t

ools,v

egetab

les,vehi

cles,a

ndoth

erma

n-made

items).

Arep

resent

ativef

MRIim

agefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,and

theme

anof

all60

ofthe

serep

resent

ativeim

agesw

asthe

nsubtracte

dfrom

each[

fordet

ails,se

e(26)].

Toins

tantiat

eourmo

deling

frame

work,

wefirs

tcho

seaseto

finterm

ediate

semant

icfeat

ures.T

obe

effect

ive,th

einte

rmedi

atesem

antic

features

must

simultane

ouslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einputstim

ulusw

ordsa

ndfac

torthe

observ

edfM

RIact

ivation

intomo

reprim

itivec

om-

Pred

icted

“celer

y” = 0

.84

“celer

y”“ai

rplan

e”

Pred

icted

:

Obse

rved:

AB

+...

high

avera

ge below

avera

ge

Pred

icted

“cele

ry”:

+ 0.35

+ 0.32

“eat”

“taste

”“fil

l”

Fig.2

.Pred

icting

fMRIim

ages

forgiv

enstim

ulusw

ords.(

A)For

minga

predic

tionfor

par-

ticipan

tP1

forthe

stimulu

swo

rd“ce

lery”a

ftertr

aining

on58

other

words

.Learn

edc vi

co-effi

cients

for3o

fthe

25se-

mantic

featur

es(“e

at,”“ta

ste,”

and“fil

l”)are

depicte

dbyt

hevox

elcolo

rsint

hethree

images

atthe

topof

thepan

el.The

co-occ

urrenc

evalu

efor

eacho

fthese

features

forthe

stimulu

sword

“celery

”is

shown

tothe

leftofthe

irresp

ective

images

[e.g.,the

value

for“ea

t(cele

ry)”is

0.84].

Thepre

dicted

activa

tionfor

thestim

ulusw

ord[sh

owna

ttheb

ottom

of(A)]is

alinea

rcomb

ination

ofthe

25sem

anticf

MRIsi

gnatur

es,weigh

tedby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12m

min

Montr

ealNe

urolog

icalIn

stitute

(MNI)s

pace]

ofthe

predic

tedthr

ee-dim

ension

alima

ge.(B)

Predic

tedand

observ

edfMR

Iima

gesfor

“celery

”and

“airpl

ane”a

ftertra

iningthat

uses5

8othe

rword

s.The

twolon

gred

andblu

evert

icalst

reaks

nearth

etop

(poste

riorre

gion)

ofthe

predic

tedand

observ

edima

gesare

theleft

andrig

htfus

iform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

uratel

ypre

-dic

tedvox

els.S

urface

(A)and

glass

brain

(B)ren

dering

ofthe

correla

-tion

betwee

npre

dicted

andact

ualvox

elact

iva-

tionsf

orwords

outside

thetrai

nings

etfor

par-

ticipan

tP5.T

hesep

anelss

howclu

stersc

ontain

ingatl

east1

0cont

iguous

voxels,

eacho

fwhos

epre

dicted

-actua

lcorrelati

onisa

tleast

0.28.The

sevox

elclusters

aredistrib

utedthro

ughout

thecor

texand

locate

dint

heleft

andrigh

toccip

italand

parieta

llobes

;lefta

ndrigh

tfusifo

rm,

postce

ntral,

andmid

dlefronta

lgyri;l

eftinfe

riorfronta

lgyrus

;medi

alfron

talgyr

us;and

anterio

rcin

gulate

.(C)S

urface

render

ingof

thepre

dicted

-actua

lcorr

elation

averag

edove

ralln

inepar

ticipan

ts.Thi

spane

lrepre

sentsc

lusters

contain

ingatlea

st10c

ontigu

ousvox

els,eac

hwith

averag

ecorr

elation

ofatle

ast0.1

4.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncem

ag.or

g11

92RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifyam

odelw

ithint

hiscom

-put

ational

model

ingfra

mewo

rk,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f

n(w)to

beext

racted

from

thetex

tcor

pus.In

thispap

er,eac

hinterm

ediate

semant

icfea

tureis

define

dinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwith

apar

ticular

otherw

ord(e.g

.,“tast

e”)ors

etofw

ords

(e.g.,“

taste,”

“tastes,

”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltipler

egress

ionto

these

feature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

ameter

scvi

(26).O

ncetrai

ned,th

ecomp

utation

almode

lcanb

eeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIima

gesfor

thesew

ordsw

ithobs

erved

fMRI

data.

Thisc

omput

ational

model

ingfram

ework

isbas

edon

twokey

theore

ticalas

sumptio

ns.Firs

t,itass

umes

thesem

anticf

eatures

thatd

istingui

shthe

meani

ngsofa

rbitrary

concre

tenoun

sarere

flected

inthe

statisti

csofth

eiruse

within

avery

largete

xtcor

pus.Th

isassu

mption

isdraw

nfrom

thefiel

dof

comput

ational

linguis

tics,w

heres

tatistic

alwo

rddis

tributio

nsare

frequen

tlyuse

dtoa

pproxi

mate

theme

aning

ofdoc

ument

sand

words

(14–17

).Sec

ond,it

assum

esthat

thebra

inactiv

ityobs

erved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linears

umof

contrib

utions

fromeac

hofit

ssem

anticf

eatures.

Althou

ghthe

correc

tnesso

fthisl

inearit

yassu

mption

isdeba

t-abl

e,itis

consist

entwit

hthe

widesp

readu

seof

linearm

odelsi

nfMRI

analys

is(27)

andwit

hthe

assum

ptionth

atfMR

Iactiva

tionofte

nrefle

ctsa

linears

uperpo

sitiono

fcontri

bution

sfrom

differen

tsou

rces.O

urtheo

retical

framew

orkdoe

snotta

kea

positio

nonw

hether

theneu

ralacti

vation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todeter

minew

hichlo

ca-tion

sare

system

atically

modul

atedby

which

as-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singfM

RIdat

afrom

nineh

ealthy,

colleg

e-age

partici

pantsw

hovie

wed6

0diffe

rentw

ord-pic

turepai

rspre

sented

sixtim

eseac

h.An

atomic

allyde-

finedre

gions

ofinte

restwe

reauto

matica

llylab

eledacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccateg

ories(a

nimals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

created

bycom

puting

theme

anfM

RIresp

onseo

verits

sixpre

sentatio

ns,and

theme

anofa

ll60o

fthese

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(2

6)].

Toinst

antiate

ourmo

deling

framew

ork,w

efirst

chosea

setofi

nterme

diates

emant

icfeatu

res.To

beeffe

ctive,t

heinte

rmedia

tesem

anticf

eatures

must

simulta

neously

encode

thewid

evarie

tyofse

mantic

conten

tofthe

inputs

timulu

sword

sandfa

ctorth

eobs

erved

fMRI

activat

ioninto

morep

rimitiv

ecom-

Predic

ted“ce

lery” =

0.84

“celer

y”“air

plane”

Predic

ted:

Obser

ved:

AB

+...

high

averag

e below

averag

e

Predic

ted “ce

lery”:+ 0

.35+ 0

.32

“eat”

“taste

”“fill

Fig.2

.Predic

tingfMR

Iimage

sfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-ticip

antP1

forthe

stimulu

swor

d“cele

ry”afte

rtrain

ingon

58oth

erword

s.Lear

nedc vic

o-effi

cientsf

or3o

fthe2

5se-

mantic

feature

s(“eat

,”“tast

e,”and

“fill”)a

redepic

tedby

thevox

elcolor

sinthe

threeim

ages

atthe

topoft

hepan

el.The

co-occ

urrence

valuefo

reach

ofthes

efeatu

resfor

thestim

uluswor

d“cele

ry”is

shown

tothe

leftoft

heirresp

ectivei

mages

[e.g.,th

evalue

for“ea

t(celer

y)”is

0.84].

Thepre

dicted

activati

onfor

thestim

uluswor

d[show

natth

ebotto

mof

(A)]isa

linearc

ombin

ationo

fthe2

5sema

nticfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values.

Thisfi

guresh

owsjust

onehor

izontal

slice[z

=

–12mm

inMo

ntrealN

eurolo

gicalIn

stitute(

MNI)s

pace]o

fthep

redicte

dthre

e-dime

nsional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”an

d“airp

lane”a

ftertrai

ningth

atuses

58oth

erword

s.The

twolon

gred

andblu

evertic

alstrea

ksnear

thetop

(poster

iorreg

ion)of

thepre

dicted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surface

(A)and

glassb

rain(B)

render

ingoft

hecorre

la-tion

between

predict

edand

actualv

oxelac

tiva-

tionsfo

rword

soutsi

dethe

trainin

gsetfo

rpar-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofwh

osepre

dicted-

actualc

orrelatio

nisatl

east0.

28.The

sevoxe

lcluster

sared

istribut

edthro

ughout

thecort

exand

located

inthe

leftand

rightoc

cipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmiddl

efront

algyri;

leftinfe

riorfron

talgyru

s;media

lfronta

lgyrus;

andant

erior

cingulat

e.(C)

Surface

render

ingof

thepre

dicted-

actualc

orrelatio

navera

gedove

ralln

inepar

ticipant

s.This

panelr

eprese

ntsclus

terscon

taining

atleas

t10con

tiguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y2008

VOL3

20SC

IENCE

www.s

cience

mag.o

rg11

92RESEA

RCHA

RTICL

ES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

model

within

thisc

om-

putati

onalm

odelin

gfram

ework

,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f n(w

)tobe

extrac

tedfro

mthe

text

corpus

.Inthis

paper,

eachin

termedi

atesem

antic

featur

eisdef

inedinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwit

hapar

ticular

otherw

ord(e.g

.,“tast

e”)or

setof

words

(e.g.,“

taste,”

“tastes

,”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltiplereg

ressio

ntot

hesef

eature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putatio

nalmo

delcan

beeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

witho

bserve

dfMR

Idata

.Th

iscom

putatio

nalmo

deling

framew

orkis

based

ontwo

keythe

oretica

lassum

ptions.

First,

itass

umes

thesem

anticfea

turesth

atdis

tinguis

hthe

meani

ngsof

arbitra

rycon

creten

ounsa

rerefle

cted

inthe

statist

icsofthe

iruse

within

avery

largete

xtcor

pus.T

hisass

umptio

nisdra

wnfro

mthe

fieldo

fcom

putatio

nalling

uistics

,wher

estati

stical

word

distrib

utions

arefreq

uently

usedt

oappr

oxima

tethe

meani

ngof

docum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.A

lthough

thecor

rectne

ssof

thisline

aritya

ssump

tionisd

ebat-

able,i

tiscon

sistent

witht

hewid

esprea

duse

ofline

armode

lsinf

MRIan

alysis

(27)an

dwith

theass

umptio

nthat

fMRI

activa

tionoft

enref

lectsa

linear

superp

osition

ofcon

tributio

nsfrom

differe

ntsou

rces.O

urtheo

retical

frame

workd

oesnot

takea

positio

nonw

hether

theneu

ralact

ivation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todet

ermine

which

loca-

tionsa

resys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singf

MRId

atafro

mnin

eheal

thy,co

llege-a

gepar

ticipan

tswho

viewe

d60d

ifferen

tword

-picture

pairs

presen

tedsix

times

each.

Anato

mically

de-fine

dregi

onsof

interes

twere

autom

atically

labele

dacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccategor

ies(an

imals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,an

dthe

mean

ofall

60of

these

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(

26)].

Toins

tantiat

eourm

odeling

framew

ork,w

efirst

chosea

setofi

nterm

ediate

semant

icfeat

ures.T

obe

effectiv

e,the

interm

ediate

semant

icfea

turesm

ustsim

ultaneo

uslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einput

stimulu

sword

sand

factor

theobs

erved

fMRI

activa

tioninto

morep

rimitiv

ecom

-

Predic

ted“ce

lery”

= 0.84

“celer

y”“ai

rplan

e”

Predic

ted:

Obse

rved:

AB

+...

high

averag

e below

averag

e

Predic

ted “c

elery”

:

+ 0.35

+ 0.32

“eat”

“taste

”“fill

Fig.2

.Pred

ictingfM

RIima

gesfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-tici

pantP

1for

thestim

ulus

word“

celery”

aftertr

aining

on58

otherw

ords.L

earned

c vico-

efficien

tsfor

3ofth

e25s

e-ma

nticfea

tures(“

eat,”“

taste,”

and“fil

l”)are

depicte

dbyth

evox

elcolo

rsinth

ethree

images

atthe

topoft

hepan

el.The

co-occ

urrence

valuef

oreac

hofth

esefea

turesfo

rthes

timulu

sword

“celery

”issho

wntot

heleft

ofthei

rrespe

ctiveim

ages[e

.g.,the

valuefo

r“eat(

celery)”

is0.8

4].The

predict

edacti

vation

forthe

stimulu

sword

[shown

atthe

bottom

of(A)]

isaline

arcom

binatio

nofth

e25s

emant

icfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12mm

inMo

ntreal

Neurolo

gical

Institu

te(MN

I)spac

e]of

thepre

dicted

three-

dimens

ional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”a

nd“ai

rplane”

aftertr

aining

thatus

es58o

therw

ords.T

hetwo

long

redand

bluev

ertical

streaks

nearth

etop(p

osterio

rregio

n)ofth

epred

icted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surfac

e(A)

andgla

ssbrain

(B)ren

dering

ofthe

correla

-tion

betwee

npred

icted

andactu

alvoxe

lactiva

-tion

sforw

ordso

utside

thetrai

nings

etfor

par-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofw

hose

predict

ed-actu

alcorre

lation

isatle

ast0.2

8.Thes

evoxe

lcluste

rsared

istribu

tedthro

ughout

thecor

texand

located

inthe

leftand

righto

ccipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmidd

lefron

talgyr

i;leftin

feriorf

rontal

gyrus;

media

lfronta

lgyrus;

andant

erior

cingula

te.(C)

Surface

render

ingof

thepre

dicted-

actual

correla

tionave

raged

overa

llnine

particip

ants.T

hispan

elrepr

esents

clusters

contain

ingatl

east1

0cont

iguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncema

g.org

1192RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

modelw

ithin

thisc

om-

putation

almo

delin

gfram

ework

,one

mustf

irst

defin

easet

ofint

ermediates

emantic

featur

esf 1(w

)f 2(w)…

f n(w)to

beext

racted

from

thetex

tcor

pus.I

nthis

paper,

eachinte

rmedi

atesem

antic

featur

eisd

efinedinterm

softh

eco-o

ccurre

nce

statisticso

fthe

input

stimu

luswo

rdw

witha

particu

laroth

erwo

rd(e.g

.,“taste”

)orsetof

words

(e.g.,“

taste,”

“tastes,”

or“ta

sted”)

within

thetex

tcor

pus.T

hemo

delist

rained

bythe

applica

tiono

fmu

ltiplereg

ressio

ntot

hesef

eature

sfi(w

)and

theobser

vedfM

RIimage

s,soa

stoobtain

maxim

um-

likelih

oodestimate

sfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putati

onalm

odelca

nbe

evalua

tedby

giving

itword

souts

idethe

training

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

withobser

vedfM

RIdat

a.Th

iscom

putati

onal

model

ingframe

work

isbas

edon

twokey

theore

ticalassu

mption

s.First

,itass

umes

thesem

anticfea

tures

thatd

istingu

ishthe

meani

ngso

farbitrary

concre

tenouns

arereflec

ted

inthe

statist

icsofthe

iruse

withina

verylarg

etext

corpus.T

hisass

umption

isdraw

nfrom

thefieldof

computati

onal

linguisti

cs,wh

erestatist

icalw

orddis

tributionsa

refrequent

lyuse

dtoa

pprox

imate

theme

aning

ofdocum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinking

about

anycon

crete

noun

canbe

derive

dasa

weigh

tedlinear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.

Although

thecor

rectne

ssof

thisli

nearity

assum

ption

isdeba

t-abl

e,iti

sconsisten

twith

thewides

pread

useof

linear

model

sinfM

RIana

lysis(

27)and

withthe

assum

ption

thatfM

RIact

ivationo

ftenr

eflect

salinear

superp

osition

ofcon

tributio

nsfro

mdifferent

source

s.Ourthe

oretica

lfram

ework

doesnotta

kea

positi

onon

wheth

erthe

neural

activa

tionenc

oding

meaning

isloc

alized

inpartic

ularc

ortica

lre-

gions.

Instea

d,itc

onsid

ersall

cortica

lvoxels

andallo

wsthe

training

datatod

eterm

inewh

ichloc

a-tions

aresys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elusi

ngfM

RIdat

afrom

nineh

ealthy

,colleg

e-age

partici

pantswh

oview

ed60

different

word-

picture

pairs

presen

tedsix

times

each.

Anato

mical

lyde-

fined

region

sofin

terestw

ereaut

omatic

allylab

eled

accord

ingtothe

metho

dology

in(28

).The

60ran

-dom

lyord

eredstim

uliinc

luded

fivei

temsf

romeac

hof12s

emant

iccategor

ies(an

imals,bo

dypar

ts,bui

ldings,

buildin

gparts

,cloth

ing,fu

rniture,i

nsects,

kitchen

items,t

ools,v

egetab

les,vehi

cles,a

ndoth

erma

n-made

items).

Arep

resent

ativef

MRIim

agefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,and

theme

anof

all60

ofthe

serep

resent

ativeim

agesw

asthe

nsubtracte

dfrom

each[

fordet

ails,se

e(26)].

Toins

tantiat

eourmo

deling

frame

work,

wefirs

tcho

seaseto

finterm

ediate

semant

icfeat

ures.T

obe

effect

ive,th

einte

rmedi

atesem

antic

features

must

simultane

ouslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einputstim

ulusw

ordsa

ndfac

torthe

observ

edfM

RIact

ivation

intomo

reprim

itivec

om-

Pred

icted

“celer

y” = 0

.84

“celer

y”“ai

rplan

e”

Pred

icted

:

Obse

rved:

AB

+...

high

avera

ge below

avera

ge

Pred

icted

“cele

ry”:

+ 0.35

+ 0.32

“eat”

“taste

”“fil

l”

Fig.2

.Pred

icting

fMRIim

ages

forgiv

enstim

ulusw

ords.(

A)For

minga

predic

tionfor

par-

ticipan

tP1

forthe

stimulu

swo

rd“ce

lery”a

ftertr

aining

on58

other

words

.Learn

edc vi

co-effi

cients

for3o

fthe

25se-

mantic

featur

es(“e

at,”“ta

ste,”

and“fil

l”)are

depicte

dbyt

hevox

elcolo

rsint

hethree

images

atthe

topof

thepan

el.The

co-occ

urrenc

evalu

efor

eacho

fthese

features

forthe

stimulu

sword

“celery

”is

shown

tothe

leftofthe

irresp

ective

images

[e.g.,the

value

for“ea

t(cele

ry)”is

0.84].

Thepre

dicted

activa

tionfor

thestim

ulusw

ord[sh

owna

ttheb

ottom

of(A)]is

alinea

rcomb

ination

ofthe

25sem

anticf

MRIsi

gnatur

es,weigh

tedby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12m

min

Montr

ealNe

urolog

icalIn

stitute

(MNI)s

pace]

ofthe

predic

tedthr

ee-dim

ension

alima

ge.(B)

Predic

tedand

observ

edfMR

Iima

gesfor

“celery

”and

“airpl

ane”a

ftertra

iningthat

uses5

8othe

rword

s.The

twolon

gred

andblu

evert

icalst

reaks

nearth

etop

(poste

riorre

gion)

ofthe

predic

tedand

observ

edima

gesare

theleft

andrig

htfus

iform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

uratel

ypre

-dic

tedvox

els.S

urface

(A)and

glass

brain

(B)ren

dering

ofthe

correla

-tion

betwee

npre

dicted

andact

ualvox

elact

iva-

tionsf

orwords

outside

thetrai

nings

etfor

par-

ticipan

tP5.T

hesep

anelss

howclu

stersc

ontain

ingatl

east1

0cont

iguous

voxels,

eacho

fwhos

epre

dicted

-actua

lcorrelati

onisa

tleast

0.28.The

sevox

elclusters

aredistrib

utedthro

ughout

thecor

texand

locate

dint

heleft

andrigh

toccip

italand

parieta

llobes

;lefta

ndrigh

tfusifo

rm,

postce

ntral,

andmid

dlefronta

lgyri;l

eftinfe

riorfronta

lgyrus

;medi

alfron

talgyr

us;and

anterio

rcin

gulate

.(C)S

urface

render

ingof

thepre

dicted

-actua

lcorr

elation

averag

edove

ralln

inepar

ticipan

ts.Thi

spane

lrepre

sentsc

lusters

contain

ingatlea

st10c

ontigu

ousvox

els,eac

hwith

averag

ecorr

elation

ofatle

ast0.1

4.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncem

ag.or

g11

92RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifyam

odelw

ithint

hiscom

-put

ational

model

ingfra

mewo

rk,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f

n(w)to

beext

racted

from

thetex

tcor

pus.In

thispap

er,eac

hinterm

ediate

semant

icfea

tureis

define

dinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwith

apar

ticular

otherw

ord(e.g

.,“tast

e”)ors

etofw

ords

(e.g.,“

taste,”

“tastes,

”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltipler

egress

ionto

these

feature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

ameter

scvi

(26).O

ncetrai

ned,th

ecomp

utation

almode

lcanb

eeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIima

gesfor

thesew

ordsw

ithobs

erved

fMRI

data.

Thisc

omput

ational

model

ingfram

ework

isbas

edon

twokey

theore

ticalas

sumptio

ns.Firs

t,itass

umes

thesem

anticf

eatures

thatd

istingui

shthe

meani

ngsofa

rbitrary

concre

tenoun

sarere

flected

inthe

statisti

csofth

eiruse

within

avery

largete

xtcor

pus.Th

isassu

mption

isdraw

nfrom

thefiel

dof

comput

ational

linguis

tics,w

heres

tatistic

alwo

rddis

tributio

nsare

frequen

tlyuse

dtoa

pproxi

mate

theme

aning

ofdoc

ument

sand

words

(14–17

).Sec

ond,it

assum

esthat

thebra

inactiv

ityobs

erved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linears

umof

contrib

utions

fromeac

hofit

ssem

anticf

eatures.

Althou

ghthe

correc

tnesso

fthisl

inearit

yassu

mption

isdeba

t-abl

e,itis

consist

entwit

hthe

widesp

readu

seof

linearm

odelsi

nfMRI

analys

is(27)

andwit

hthe

assum

ptionth

atfMR

Iactiva

tionofte

nrefle

ctsa

linears

uperpo

sitiono

fcontri

bution

sfrom

differen

tsou

rces.O

urtheo

retical

framew

orkdoe

snotta

kea

positio

nonw

hether

theneu

ralacti

vation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todeter

minew

hichlo

ca-tion

sare

system

atically

modul

atedby

which

as-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singfM

RIdat

afrom

nineh

ealthy,

colleg

e-age

partici

pantsw

hovie

wed6

0diffe

rentw

ord-pic

turepai

rspre

sented

sixtim

eseac

h.An

atomic

allyde-

finedre

gions

ofinte

restwe

reauto

matica

llylab

eledacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccateg

ories(a

nimals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

created

bycom

puting

theme

anfM

RIresp

onseo

verits

sixpre

sentatio

ns,and

theme

anofa

ll60o

fthese

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(2

6)].

Toinst

antiate

ourmo

deling

framew

ork,w

efirst

chosea

setofi

nterme

diates

emant

icfeatu

res.To

beeffe

ctive,t

heinte

rmedia

tesem

anticf

eatures

must

simulta

neously

encode

thewid

evarie

tyofse

mantic

conten

tofthe

inputs

timulu

sword

sandfa

ctorth

eobs

erved

fMRI

activat

ioninto

morep

rimitiv

ecom-

Predic

ted“ce

lery” =

0.84

“celer

y”“air

plane”

Predic

ted:

Obser

ved:

AB

+...

high

averag

e below

averag

e

Predic

ted “ce

lery”:+ 0

.35+ 0

.32

“eat”

“taste

”“fill

Fig.2

.Predic

tingfMR

Iimage

sfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-ticip

antP1

forthe

stimulu

swor

d“cele

ry”afte

rtrain

ingon

58oth

erword

s.Lear

nedc vic

o-effi

cientsf

or3o

fthe2

5se-

mantic

feature

s(“eat

,”“tast

e,”and

“fill”)a

redepic

tedby

thevox

elcolor

sinthe

threeim

ages

atthe

topoft

hepan

el.The

co-occ

urrence

valuefo

reach

ofthes

efeatu

resfor

thestim

uluswor

d“cele

ry”is

shown

tothe

leftoft

heirresp

ectivei

mages

[e.g.,th

evalue

for“ea

t(celer

y)”is

0.84].

Thepre

dicted

activati

onfor

thestim

uluswor

d[show

natth

ebotto

mof

(A)]isa

linearc

ombin

ationo

fthe2

5sema

nticfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values.

Thisfi

guresh

owsjust

onehor

izontal

slice[z

=

–12mm

inMo

ntrealN

eurolo

gicalIn

stitute(

MNI)s

pace]o

fthep

redicte

dthre

e-dime

nsional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”an

d“airp

lane”a

ftertrai

ningth

atuses

58oth

erword

s.The

twolon

gred

andblu

evertic

alstrea

ksnear

thetop

(poster

iorreg

ion)of

thepre

dicted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surface

(A)and

glassb

rain(B)

render

ingoft

hecorre

la-tion

between

predict

edand

actualv

oxelac

tiva-

tionsfo

rword

soutsi

dethe

trainin

gsetfo

rpar-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofwh

osepre

dicted-

actualc

orrelatio

nisatl

east0.

28.The

sevoxe

lcluster

sared

istribut

edthro

ughout

thecort

exand

located

inthe

leftand

rightoc

cipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmiddl

efront

algyri;

leftinfe

riorfron

talgyru

s;media

lfronta

lgyrus;

andant

erior

cingulat

e.(C)

Surface

render

ingof

thepre

dicted-

actualc

orrelatio

navera

gedove

ralln

inepar

ticipant

s.This

panelr

eprese

ntsclus

terscon

taining

atleas

t10con

tiguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y2008

VOL3

20SC

IENCE

www.s

cience

mag.o

rg11

92RESEA

RCHA

RTICL

ES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

model

within

thisc

om-

putati

onalm

odelin

gfram

ework

,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f n(w

)tobe

extrac

tedfro

mthe

text

corpus

.Inthis

paper,

eachin

termedi

atesem

antic

featur

eisdef

inedinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwit

hapar

ticular

otherw

ord(e.g

.,“tast

e”)or

setof

words

(e.g.,“

taste,”

“tastes

,”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltiplereg

ressio

ntot

hesef

eature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putatio

nalmo

delcan

beeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

witho

bserve

dfMR

Idata

.Th

iscom

putatio

nalmo

deling

framew

orkis

based

ontwo

keythe

oretica

lassum

ptions.

First,

itass

umes

thesem

anticfea

turesth

atdis

tinguis

hthe

meani

ngsof

arbitra

rycon

creten

ounsa

rerefle

cted

inthe

statist

icsofthe

iruse

within

avery

largete

xtcor

pus.T

hisass

umptio

nisdra

wnfro

mthe

fieldo

fcom

putatio

nalling

uistics

,wher

estati

stical

word

distrib

utions

arefreq

uently

usedt

oappr

oxima

tethe

meani

ngof

docum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.A

lthough

thecor

rectne

ssof

thisline

aritya

ssump

tionisd

ebat-

able,i

tiscon

sistent

witht

hewid

esprea

duse

ofline

armode

lsinf

MRIan

alysis

(27)an

dwith

theass

umptio

nthat

fMRI

activa

tionoft

enref

lectsa

linear

superp

osition

ofcon

tributio

nsfrom

differe

ntsou

rces.O

urtheo

retical

frame

workd

oesnot

takea

positio

nonw

hether

theneu

ralact

ivation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todet

ermine

which

loca-

tionsa

resys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singf

MRId

atafro

mnin

eheal

thy,co

llege-a

gepar

ticipan

tswho

viewe

d60d

ifferen

tword

-picture

pairs

presen

tedsix

times

each.

Anato

mically

de-fine

dregi

onsof

interes

twere

autom

atically

labele

dacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccategor

ies(an

imals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,an

dthe

mean

ofall

60of

these

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(

26)].

Toins

tantiat

eourm

odeling

framew

ork,w

efirst

chosea

setofi

nterm

ediate

semant

icfeat

ures.T

obe

effectiv

e,the

interm

ediate

semant

icfea

turesm

ustsim

ultaneo

uslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einput

stimulu

sword

sand

factor

theobs

erved

fMRI

activa

tioninto

morep

rimitiv

ecom

-

Predic

ted“ce

lery”

= 0.84

“celer

y”“ai

rplan

e”

Predic

ted:

Obse

rved:

AB

+...

high

averag

e below

averag

e

Predic

ted “c

elery”

:

+ 0.35

+ 0.32

“eat”

“taste

”“fill

Fig.2

.Pred

ictingfM

RIima

gesfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-tici

pantP

1for

thestim

ulus

word“

celery”

aftertr

aining

on58

otherw

ords.L

earned

c vico-

efficien

tsfor

3ofth

e25s

e-ma

nticfea

tures(“

eat,”“

taste,”

and“fil

l”)are

depicte

dbyth

evox

elcolo

rsinth

ethree

images

atthe

topoft

hepan

el.The

co-occ

urrence

valuef

oreac

hofth

esefea

turesfo

rthes

timulu

sword

“celery

”issho

wntot

heleft

ofthei

rrespe

ctiveim

ages[e

.g.,the

valuefo

r“eat(

celery)”

is0.8

4].The

predict

edacti

vation

forthe

stimulu

sword

[shown

atthe

bottom

of(A)]

isaline

arcom

binatio

nofth

e25s

emant

icfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12mm

inMo

ntreal

Neurolo

gical

Institu

te(MN

I)spac

e]of

thepre

dicted

three-

dimens

ional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”a

nd“ai

rplane”

aftertr

aining

thatus

es58o

therw

ords.T

hetwo

long

redand

bluev

ertical

streaks

nearth

etop(p

osterio

rregio

n)ofth

epred

icted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surfac

e(A)

andgla

ssbrain

(B)ren

dering

ofthe

correla

-tion

betwee

npred

icted

andactu

alvoxe

lactiva

-tion

sforw

ordso

utside

thetrai

nings

etfor

par-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofw

hose

predict

ed-actu

alcorre

lation

isatle

ast0.2

8.Thes

evoxe

lcluste

rsared

istribu

tedthro

ughout

thecor

texand

located

inthe

leftand

righto

ccipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmidd

lefron

talgyr

i;leftin

feriorf

rontal

gyrus;

media

lfronta

lgyrus;

andant

erior

cingula

te.(C)

Surface

render

ingof

thepre

dicted-

actual

correla

tionave

raged

overa

llnine

particip

ants.T

hispan

elrepr

esents

clusters

contain

ingatl

east1

0cont

iguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncema

g.org

1192RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

modelw

ithin

thisc

om-

putation

almo

delin

gfram

ework

,one

mustf

irst

defin

easet

ofint

ermediates

emantic

featur

esf 1(w

)f 2(w)…

f n(w)to

beext

racted

from

thetex

tcor

pus.I

nthis

paper,

eachinte

rmedi

atesem

antic

featur

eisd

efinedinterm

softh

eco-o

ccurre

nce

statisticso

fthe

input

stimu

luswo

rdw

witha

particu

laroth

erwo

rd(e.g

.,“taste”

)orsetof

words

(e.g.,“

taste,”

“tastes,”

or“ta

sted”)

within

thetex

tcor

pus.T

hemo

delist

rained

bythe

applica

tiono

fmu

ltiplereg

ressio

ntot

hesef

eature

sfi(w

)and

theobser

vedfM

RIimage

s,soa

stoobtain

maxim

um-

likelih

oodestimate

sfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putati

onalm

odelca

nbe

evalua

tedby

giving

itword

souts

idethe

training

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

withobser

vedfM

RIdat

a.Th

iscom

putati

onal

model

ingframe

work

isbas

edon

twokey

theore

ticalassu

mption

s.First

,itass

umes

thesem

anticfea

tures

thatd

istingu

ishthe

meani

ngso

farbitrary

concre

tenouns

arereflec

ted

inthe

statist

icsofthe

iruse

withina

verylarg

etext

corpus.T

hisass

umption

isdraw

nfrom

thefieldof

computati

onal

linguisti

cs,wh

erestatist

icalw

orddis

tributionsa

refrequent

lyuse

dtoa

pprox

imate

theme

aning

ofdocum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinking

about

anycon

crete

noun

canbe

derive

dasa

weigh

tedlinear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.

Although

thecor

rectne

ssof

thisli

nearity

assum

ption

isdeba

t-abl

e,iti

sconsisten

twith

thewides

pread

useof

linear

model

sinfM

RIana

lysis(

27)and

withthe

assum

ption

thatfM

RIact

ivationo

ftenr

eflect

salinear

superp

osition

ofcon

tributio

nsfro

mdifferent

source

s.Ourthe

oretica

lfram

ework

doesnotta

kea

positi

onon

wheth

erthe

neural

activa

tionenc

oding

meaning

isloc

alized

inpartic

ularc

ortica

lre-

gions.

Instea

d,itc

onsid

ersall

cortica

lvoxels

andallo

wsthe

training

datatod

eterm

inewh

ichloc

a-tions

aresys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elusi

ngfM

RIdat

afrom

nineh

ealthy

,colleg

e-age

partici

pantswh

oview

ed60

different

word-

picture

pairs

presen

tedsix

times

each.

Anato

mical

lyde-

fined

region

sofin

terestw

ereaut

omatic

allylab

eled

accord

ingtothe

metho

dology

in(28

).The

60ran

-dom

lyord

eredstim

uliinc

luded

fivei

temsf

romeac

hof12s

emant

iccategor

ies(an

imals,bo

dypar

ts,bui

ldings,

buildin

gparts

,cloth

ing,fu

rniture,i

nsects,

kitchen

items,t

ools,v

egetab

les,vehi

cles,a

ndoth

erma

n-made

items).

Arep

resent

ativef

MRIim

agefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,and

theme

anof

all60

ofthe

serep

resent

ativeim

agesw

asthe

nsubtracte

dfrom

each[

fordet

ails,se

e(26)].

Toins

tantiat

eourmo

deling

frame

work,

wefirs

tcho

seaseto

finterm

ediate

semant

icfeat

ures.T

obe

effect

ive,th

einte

rmedi

atesem

antic

features

must

simultane

ouslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einputstim

ulusw

ordsa

ndfac

torthe

observ

edfM

RIact

ivation

intomo

reprim

itivec

om-

Pred

icted

“celer

y” = 0

.84

“celer

y”“ai

rplan

e”

Pred

icted

:

Obse

rved:

AB

+...

high

avera

ge below

avera

ge

Pred

icted

“cele

ry”:

+ 0.35

+ 0.32

“eat”

“taste

”“fil

l”

Fig.2

.Pred

icting

fMRIim

ages

forgiv

enstim

ulusw

ords.(

A)For

minga

predic

tionfor

par-

ticipan

tP1

forthe

stimulu

swo

rd“ce

lery”a

ftertr

aining

on58

other

words

.Learn

edc vi

co-effi

cients

for3o

fthe

25se-

mantic

featur

es(“e

at,”“ta

ste,”

and“fil

l”)are

depicte

dbyt

hevox

elcolo

rsint

hethree

images

atthe

topof

thepan

el.The

co-occ

urrenc

evalu

efor

eacho

fthese

features

forthe

stimulu

sword

“celery

”is

shown

tothe

leftofthe

irresp

ective

images

[e.g.,the

value

for“ea

t(cele

ry)”is

0.84].

Thepre

dicted

activa

tionfor

thestim

ulusw

ord[sh

owna

ttheb

ottom

of(A)]is

alinea

rcomb

ination

ofthe

25sem

anticf

MRIsi

gnatur

es,weigh

tedby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12m

min

Montr

ealNe

urolog

icalIn

stitute

(MNI)s

pace]

ofthe

predic

tedthr

ee-dim

ension

alima

ge.(B)

Predic

tedand

observ

edfMR

Iima

gesfor

“celery

”and

“airpl

ane”a

ftertra

iningthat

uses5

8othe

rword

s.The

twolon

gred

andblu

evert

icalst

reaks

nearth

etop

(poste

riorre

gion)

ofthe

predic

tedand

observ

edima

gesare

theleft

andrig

htfus

iform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

uratel

ypre

-dic

tedvox

els.S

urface

(A)and

glass

brain

(B)ren

dering

ofthe

correla

-tion

betwee

npre

dicted

andact

ualvox

elact

iva-

tionsf

orwords

outside

thetrai

nings

etfor

par-

ticipan

tP5.T

hesep

anelss

howclu

stersc

ontain

ingatl

east1

0cont

iguous

voxels,

eacho

fwhos

epre

dicted

-actua

lcorrelati

onisa

tleast

0.28.The

sevox

elclusters

aredistrib

utedthro

ughout

thecor

texand

locate

dint

heleft

andrigh

toccip

italand

parieta

llobes

;lefta

ndrigh

tfusifo

rm,

postce

ntral,

andmid

dlefronta

lgyri;l

eftinfe

riorfronta

lgyrus

;medi

alfron

talgyr

us;and

anterio

rcin

gulate

.(C)S

urface

render

ingof

thepre

dicted

-actua

lcorr

elation

averag

edove

ralln

inepar

ticipan

ts.Thi

spane

lrepre

sentsc

lusters

contain

ingatlea

st10c

ontigu

ousvox

els,eac

hwith

averag

ecorr

elation

ofatle

ast0.1

4.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncem

ag.or

g11

92RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifyam

odelw

ithint

hiscom

-put

ational

model

ingfra

mewo

rk,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f

n(w)to

beext

racted

from

thetex

tcor

pus.In

thispap

er,eac

hinterm

ediate

semant

icfea

tureis

define

dinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwith

apar

ticular

otherw

ord(e.g

.,“tast

e”)ors

etofw

ords

(e.g.,“

taste,”

“tastes,

”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltipler

egress

ionto

these

feature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

ameter

scvi

(26).O

ncetrai

ned,th

ecomp

utation

almode

lcanb

eeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIima

gesfor

thesew

ordsw

ithobs

erved

fMRI

data.

Thisc

omput

ational

model

ingfram

ework

isbas

edon

twokey

theore

ticalas

sumptio

ns.Firs

t,itass

umes

thesem

anticf

eatures

thatd

istingui

shthe

meani

ngsofa

rbitrary

concre

tenoun

sarere

flected

inthe

statisti

csofth

eiruse

within

avery

largete

xtcor

pus.Th

isassu

mption

isdraw

nfrom

thefiel

dof

comput

ational

linguis

tics,w

heres

tatistic

alwo

rddis

tributio

nsare

frequen

tlyuse

dtoa

pproxi

mate

theme

aning

ofdoc

ument

sand

words

(14–17

).Sec

ond,it

assum

esthat

thebra

inactiv

ityobs

erved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linears

umof

contrib

utions

fromeac

hofit

ssem

anticf

eatures.

Althou

ghthe

correc

tnesso

fthisl

inearit

yassu

mption

isdeba

t-abl

e,itis

consist

entwit

hthe

widesp

readu

seof

linearm

odelsi

nfMRI

analys

is(27)

andwit

hthe

assum

ptionth

atfMR

Iactiva

tionofte

nrefle

ctsa

linears

uperpo

sitiono

fcontri

bution

sfrom

differen

tsou

rces.O

urtheo

retical

framew

orkdoe

snotta

kea

positio

nonw

hether

theneu

ralacti

vation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todeter

minew

hichlo

ca-tion

sare

system

atically

modul

atedby

which

as-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singfM

RIdat

afrom

nineh

ealthy,

colleg

e-age

partici

pantsw

hovie

wed6

0diffe

rentw

ord-pic

turepai

rspre

sented

sixtim

eseac

h.An

atomic

allyde-

finedre

gions

ofinte

restwe

reauto

matica

llylab

eledacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccateg

ories(a

nimals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

created

bycom

puting

theme

anfM

RIresp

onseo

verits

sixpre

sentatio

ns,and

theme

anofa

ll60o

fthese

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(2

6)].

Toinst

antiate

ourmo

deling

framew

ork,w

efirst

chosea

setofi

nterme

diates

emant

icfeatu

res.To

beeffe

ctive,t

heinte

rmedia

tesem

anticf

eatures

must

simulta

neously

encode

thewid

evarie

tyofse

mantic

conten

tofthe

inputs

timulu

sword

sandfa

ctorth

eobs

erved

fMRI

activat

ioninto

morep

rimitiv

ecom-

Predic

ted“ce

lery” =

0.84

“celer

y”“air

plane”

Predic

ted:

Obser

ved:

AB

+...

high

averag

e below

averag

e

Predic

ted “ce

lery”:+ 0

.35+ 0

.32

“eat”

“taste

”“fill

Fig.2

.Predic

tingfMR

Iimage

sfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-ticip

antP1

forthe

stimulu

swor

d“cele

ry”afte

rtrain

ingon

58oth

erword

s.Lear

nedc vic

o-effi

cientsf

or3o

fthe2

5se-

mantic

feature

s(“eat

,”“tast

e,”and

“fill”)a

redepic

tedby

thevox

elcolor

sinthe

threeim

ages

atthe

topoft

hepan

el.The

co-occ

urrence

valuefo

reach

ofthes

efeatu

resfor

thestim

uluswor

d“cele

ry”is

shown

tothe

leftoft

heirresp

ectivei

mages

[e.g.,th

evalue

for“ea

t(celer

y)”is

0.84].

Thepre

dicted

activati

onfor

thestim

uluswor

d[show

natth

ebotto

mof

(A)]isa

linearc

ombin

ationo

fthe2

5sema

nticfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values.

Thisfi

guresh

owsjust

onehor

izontal

slice[z

=

–12mm

inMo

ntrealN

eurolo

gicalIn

stitute(

MNI)s

pace]o

fthep

redicte

dthre

e-dime

nsional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”an

d“airp

lane”a

ftertrai

ningth

atuses

58oth

erword

s.The

twolon

gred

andblu

evertic

alstrea

ksnear

thetop

(poster

iorreg

ion)of

thepre

dicted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surface

(A)and

glassb

rain(B)

render

ingoft

hecorre

la-tion

between

predict

edand

actualv

oxelac

tiva-

tionsfo

rword

soutsi

dethe

trainin

gsetfo

rpar-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofwh

osepre

dicted-

actualc

orrelatio

nisatl

east0.

28.The

sevoxe

lcluster

sared

istribut

edthro

ughout

thecort

exand

located

inthe

leftand

rightoc

cipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmiddl

efront

algyri;

leftinfe

riorfron

talgyru

s;media

lfronta

lgyrus;

andant

erior

cingulat

e.(C)

Surface

render

ingof

thepre

dicted-

actualc

orrelatio

navera

gedove

ralln

inepar

ticipant

s.This

panelr

eprese

ntsclus

terscon

taining

atleas

t10con

tiguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y2008

VOL3

20SC

IENCE

www.s

cience

mag.o

rg11

92RESEA

RCHA

RTICL

ES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

model

within

thisc

om-

putati

onalm

odelin

gfram

ework

,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f n(w

)tobe

extrac

tedfro

mthe

text

corpus

.Inthis

paper,

eachin

termedi

atesem

antic

featur

eisdef

inedinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwit

hapar

ticular

otherw

ord(e.g

.,“tast

e”)or

setof

words

(e.g.,“

taste,”

“tastes

,”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltiplereg

ressio

ntot

hesef

eature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putatio

nalmo

delcan

beeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

witho

bserve

dfMR

Idata

.Th

iscom

putatio

nalmo

deling

framew

orkis

based

ontwo

keythe

oretica

lassum

ptions.

First,

itass

umes

thesem

anticfea

turesth

atdis

tinguis

hthe

meani

ngsof

arbitra

rycon

creten

ounsa

rerefle

cted

inthe

statist

icsofthe

iruse

within

avery

largete

xtcor

pus.T

hisass

umptio

nisdra

wnfro

mthe

fieldo

fcom

putatio

nalling

uistics

,wher

estati

stical

word

distrib

utions

arefreq

uently

usedt

oappr

oxima

tethe

meani

ngof

docum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.A

lthough

thecor

rectne

ssof

thisline

aritya

ssump

tionisd

ebat-

able,i

tiscon

sistent

witht

hewid

esprea

duse

ofline

armode

lsinf

MRIan

alysis

(27)an

dwith

theass

umptio

nthat

fMRI

activa

tionoft

enref

lectsa

linear

superp

osition

ofcon

tributio

nsfrom

differe

ntsou

rces.O

urtheo

retical

frame

workd

oesnot

takea

positio

nonw

hether

theneu

ralact

ivation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todet

ermine

which

loca-

tionsa

resys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singf

MRId

atafro

mnin

eheal

thy,co

llege-a

gepar

ticipan

tswho

viewe

d60d

ifferen

tword

-picture

pairs

presen

tedsix

times

each.

Anato

mically

de-fine

dregi

onsof

interes

twere

autom

atically

labele

dacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccategor

ies(an

imals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,an

dthe

mean

ofall

60of

these

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(

26)].

Toins

tantiat

eourm

odeling

framew

ork,w

efirst

chosea

setofi

nterm

ediate

semant

icfeat

ures.T

obe

effectiv

e,the

interm

ediate

semant

icfea

turesm

ustsim

ultaneo

uslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einput

stimulu

sword

sand

factor

theobs

erved

fMRI

activa

tioninto

morep

rimitiv

ecom

-

Predic

ted“ce

lery”

= 0.84

“celer

y”“ai

rplan

e”

Predic

ted:

Obse

rved:

AB

+...

high

averag

e below

averag

e

Predic

ted “c

elery”

:

+ 0.35

+ 0.32

“eat”

“taste

”“fill

Fig.2

.Pred

ictingfM

RIima

gesfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-tici

pantP

1for

thestim

ulus

word“

celery”

aftertr

aining

on58

otherw

ords.L

earned

c vico-

efficien

tsfor

3ofth

e25s

e-ma

nticfea

tures(“

eat,”“

taste,”

and“fil

l”)are

depicte

dbyth

evox

elcolo

rsinth

ethree

images

atthe

topoft

hepan

el.The

co-occ

urrence

valuef

oreac

hofth

esefea

turesfo

rthes

timulu

sword

“celery

”issho

wntot

heleft

ofthei

rrespe

ctiveim

ages[e

.g.,the

valuefo

r“eat(

celery)”

is0.8

4].The

predict

edacti

vation

forthe

stimulu

sword

[shown

atthe

bottom

of(A)]

isaline

arcom

binatio

nofth

e25s

emant

icfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12mm

inMo

ntreal

Neurolo

gical

Institu

te(MN

I)spac

e]of

thepre

dicted

three-

dimens

ional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”a

nd“ai

rplane”

aftertr

aining

thatus

es58o

therw

ords.T

hetwo

long

redand

bluev

ertical

streaks

nearth

etop(p

osterio

rregio

n)ofth

epred

icted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surfac

e(A)

andgla

ssbrain

(B)ren

dering

ofthe

correla

-tion

betwee

npred

icted

andactu

alvoxe

lactiva

-tion

sforw

ordso

utside

thetrai

nings

etfor

par-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofw

hose

predict

ed-actu

alcorre

lation

isatle

ast0.2

8.Thes

evoxe

lcluste

rsared

istribu

tedthro

ughout

thecor

texand

located

inthe

leftand

righto

ccipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmidd

lefron

talgyr

i;leftin

feriorf

rontal

gyrus;

media

lfronta

lgyrus;

andant

erior

cingula

te.(C)

Surface

render

ingof

thepre

dicted-

actual

correla

tionave

raged

overa

llnine

particip

ants.T

hispan

elrepr

esents

clusters

contain

ingatl

east1

0cont

iguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncema

g.org

1192RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

modelw

ithin

thisc

om-

putation

almo

delin

gfram

ework

,one

mustf

irst

defin

easet

ofint

ermediates

emantic

featur

esf 1(w

)f 2(w)…

f n(w)to

beext

racted

from

thetex

tcor

pus.I

nthis

paper,

eachinte

rmedi

atesem

antic

featur

eisd

efinedinterm

softh

eco-o

ccurre

nce

statisticso

fthe

input

stimu

luswo

rdw

witha

particu

laroth

erwo

rd(e.g

.,“taste”

)orsetof

words

(e.g.,“

taste,”

“tastes,”

or“ta

sted”)

within

thetex

tcor

pus.T

hemo

delist

rained

bythe

applica

tiono

fmu

ltiplereg

ressio

ntot

hesef

eature

sfi(w

)and

theobser

vedfM

RIimage

s,soa

stoobtain

maxim

um-

likelih

oodestimate

sfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putati

onalm

odelca

nbe

evalua

tedby

giving

itword

souts

idethe

training

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

withobser

vedfM

RIdat

a.Th

iscom

putati

onal

model

ingframe

work

isbas

edon

twokey

theore

ticalassu

mption

s.First

,itass

umes

thesem

anticfea

tures

thatd

istingu

ishthe

meani

ngso

farbitrary

concre

tenouns

arereflec

ted

inthe

statist

icsofthe

iruse

withina

verylarg

etext

corpus.T

hisass

umption

isdraw

nfrom

thefieldof

computati

onal

linguisti

cs,wh

erestatist

icalw

orddis

tributionsa

refrequent

lyuse

dtoa

pprox

imate

theme

aning

ofdocum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinking

about

anycon

crete

noun

canbe

derive

dasa

weigh

tedlinear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.

Although

thecor

rectne

ssof

thisli

nearity

assum

ption

isdeba

t-abl

e,iti

sconsisten

twith

thewides

pread

useof

linear

model

sinfM

RIana

lysis(

27)and

withthe

assum

ption

thatfM

RIact

ivationo

ftenr

eflect

salinear

superp

osition

ofcon

tributio

nsfro

mdifferent

source

s.Ourthe

oretica

lfram

ework

doesnotta

kea

positi

onon

wheth

erthe

neural

activa

tionenc

oding

meaning

isloc

alized

inpartic

ularc

ortica

lre-

gions.

Instea

d,itc

onsid

ersall

cortica

lvoxels

andallo

wsthe

training

datatod

eterm

inewh

ichloc

a-tions

aresys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elusi

ngfM

RIdat

afrom

nineh

ealthy

,colleg

e-age

partici

pantswh

oview

ed60

different

word-

picture

pairs

presen

tedsix

times

each.

Anato

mical

lyde-

fined

region

sofin

terestw

ereaut

omatic

allylab

eled

accord

ingtothe

metho

dology

in(28

).The

60ran

-dom

lyord

eredstim

uliinc

luded

fivei

temsf

romeac

hof12s

emant

iccategor

ies(an

imals,bo

dypar

ts,bui

ldings,

buildin

gparts

,cloth

ing,fu

rniture,i

nsects,

kitchen

items,t

ools,v

egetab

les,vehi

cles,a

ndoth

erma

n-made

items).

Arep

resent

ativef

MRIim

agefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,and

theme

anof

all60

ofthe

serep

resent

ativeim

agesw

asthe

nsubtracte

dfrom

each[

fordet

ails,se

e(26)].

Toins

tantiat

eourmo

deling

frame

work,

wefirs

tcho

seaseto

finterm

ediate

semant

icfeat

ures.T

obe

effect

ive,th

einte

rmedi

atesem

antic

features

must

simultane

ouslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einputstim

ulusw

ordsa

ndfac

torthe

observ

edfM

RIact

ivation

intomo

reprim

itivec

om-

Pred

icted

“celer

y” = 0

.84

“celer

y”“ai

rplan

e”

Pred

icted

:

Obse

rved:

AB

+...

high

avera

ge below

avera

ge

Pred

icted

“cele

ry”:

+ 0.35

+ 0.32

“eat”

“taste

”“fil

l”

Fig.2

.Pred

icting

fMRIim

ages

forgiv

enstim

ulusw

ords.(

A)For

minga

predic

tionfor

par-

ticipan

tP1

forthe

stimulu

swo

rd“ce

lery”a

ftertr

aining

on58

other

words

.Learn

edc vi

co-effi

cients

for3o

fthe

25se-

mantic

featur

es(“e

at,”“ta

ste,”

and“fil

l”)are

depicte

dbyt

hevox

elcolo

rsint

hethree

images

atthe

topof

thepan

el.The

co-occ

urrenc

evalu

efor

eacho

fthese

features

forthe

stimulu

sword

“celery

”is

shown

tothe

leftofthe

irresp

ective

images

[e.g.,the

value

for“ea

t(cele

ry)”is

0.84].

Thepre

dicted

activa

tionfor

thestim

ulusw

ord[sh

owna

ttheb

ottom

of(A)]is

alinea

rcomb

ination

ofthe

25sem

anticf

MRIsi

gnatur

es,weigh

tedby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12m

min

Montr

ealNe

urolog

icalIn

stitute

(MNI)s

pace]

ofthe

predic

tedthr

ee-dim

ension

alima

ge.(B)

Predic

tedand

observ

edfMR

Iima

gesfor

“celery

”and

“airpl

ane”a

ftertra

iningthat

uses5

8othe

rword

s.The

twolon

gred

andblu

evert

icalst

reaks

nearth

etop

(poste

riorre

gion)

ofthe

predic

tedand

observ

edima

gesare

theleft

andrig

htfus

iform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

uratel

ypre

-dic

tedvox

els.S

urface

(A)and

glass

brain

(B)ren

dering

ofthe

correla

-tion

betwee

npre

dicted

andact

ualvox

elact

iva-

tionsf

orwords

outside

thetrai

nings

etfor

par-

ticipan

tP5.T

hesep

anelss

howclu

stersc

ontain

ingatl

east1

0cont

iguous

voxels,

eacho

fwhos

epre

dicted

-actua

lcorrelati

onisa

tleast

0.28.The

sevox

elclusters

aredistrib

utedthro

ughout

thecor

texand

locate

dint

heleft

andrigh

toccip

italand

parieta

llobes

;lefta

ndrigh

tfusifo

rm,

postce

ntral,

andmid

dlefronta

lgyri;l

eftinfe

riorfronta

lgyrus

;medi

alfron

talgyr

us;and

anterio

rcin

gulate

.(C)S

urface

render

ingof

thepre

dicted

-actua

lcorr

elation

averag

edove

ralln

inepar

ticipan

ts.Thi

spane

lrepre

sentsc

lusters

contain

ingatlea

st10c

ontigu

ousvox

els,eac

hwith

averag

ecorr

elation

ofatle

ast0.1

4.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncem

ag.or

g11

92RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifyam

odelw

ithint

hiscom

-put

ational

model

ingfra

mewo

rk,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f

n(w)to

beext

racted

from

thetex

tcor

pus.In

thispap

er,eac

hinterm

ediate

semant

icfea

tureis

define

dinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwith

apar

ticular

otherw

ord(e.g

.,“tast

e”)ors

etofw

ords

(e.g.,“

taste,”

“tastes,

”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltipler

egress

ionto

these

feature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

ameter

scvi

(26).O

ncetrai

ned,th

ecomp

utation

almode

lcanb

eeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIima

gesfor

thesew

ordsw

ithobs

erved

fMRI

data.

Thisc

omput

ational

model

ingfram

ework

isbas

edon

twokey

theore

ticalas

sumptio

ns.Firs

t,itass

umes

thesem

anticf

eatures

thatd

istingui

shthe

meani

ngsofa

rbitrary

concre

tenoun

sarere

flected

inthe

statisti

csofth

eiruse

within

avery

largete

xtcor

pus.Th

isassu

mption

isdraw

nfrom

thefiel

dof

comput

ational

linguis

tics,w

heres

tatistic

alwo

rddis

tributio

nsare

frequen

tlyuse

dtoa

pproxi

mate

theme

aning

ofdoc

ument

sand

words

(14–17

).Sec

ond,it

assum

esthat

thebra

inactiv

ityobs

erved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linears

umof

contrib

utions

fromeac

hofit

ssem

anticf

eatures.

Althou

ghthe

correc

tnesso

fthisl

inearit

yassu

mption

isdeba

t-abl

e,itis

consist

entwit

hthe

widesp

readu

seof

linearm

odelsi

nfMRI

analys

is(27)

andwit

hthe

assum

ptionth

atfMR

Iactiva

tionofte

nrefle

ctsa

linears

uperpo

sitiono

fcontri

bution

sfrom

differen

tsou

rces.O

urtheo

retical

framew

orkdoe

snotta

kea

positio

nonw

hether

theneu

ralacti

vation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todeter

minew

hichlo

ca-tion

sare

system

atically

modul

atedby

which

as-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singfM

RIdat

afrom

nineh

ealthy,

colleg

e-age

partici

pantsw

hovie

wed6

0diffe

rentw

ord-pic

turepai

rspre

sented

sixtim

eseac

h.An

atomic

allyde-

finedre

gions

ofinte

restwe

reauto

matica

llylab

eledacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccateg

ories(a

nimals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

created

bycom

puting

theme

anfM

RIresp

onseo

verits

sixpre

sentatio

ns,and

theme

anofa

ll60o

fthese

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(2

6)].

Toinst

antiate

ourmo

deling

framew

ork,w

efirst

chosea

setofi

nterme

diates

emant

icfeatu

res.To

beeffe

ctive,t

heinte

rmedia

tesem

anticf

eatures

must

simulta

neously

encode

thewid

evarie

tyofse

mantic

conten

tofthe

inputs

timulu

sword

sandfa

ctorth

eobs

erved

fMRI

activat

ioninto

morep

rimitiv

ecom-

Predic

ted“ce

lery” =

0.84

“celer

y”“air

plane”

Predic

ted:

Obser

ved:

AB

+...

high

averag

e below

averag

e

Predic

ted “ce

lery”:+ 0

.35+ 0

.32

“eat”

“taste

”“fill

Fig.2

.Predic

tingfMR

Iimage

sfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-ticip

antP1

forthe

stimulu

swor

d“cele

ry”afte

rtrain

ingon

58oth

erword

s.Lear

nedc vic

o-effi

cientsf

or3o

fthe2

5se-

mantic

feature

s(“eat

,”“tast

e,”and

“fill”)a

redepic

tedby

thevox

elcolor

sinthe

threeim

ages

atthe

topoft

hepan

el.The

co-occ

urrence

valuefo

reach

ofthes

efeatu

resfor

thestim

uluswor

d“cele

ry”is

shown

tothe

leftoft

heirresp

ectivei

mages

[e.g.,th

evalue

for“ea

t(celer

y)”is

0.84].

Thepre

dicted

activati

onfor

thestim

uluswor

d[show

natth

ebotto

mof

(A)]isa

linearc

ombin

ationo

fthe2

5sema

nticfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values.

Thisfi

guresh

owsjust

onehor

izontal

slice[z

=

–12mm

inMo

ntrealN

eurolo

gicalIn

stitute(

MNI)s

pace]o

fthep

redicte

dthre

e-dime

nsional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”an

d“airp

lane”a

ftertrai

ningth

atuses

58oth

erword

s.The

twolon

gred

andblu

evertic

alstrea

ksnear

thetop

(poster

iorreg

ion)of

thepre

dicted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surface

(A)and

glassb

rain(B)

render

ingoft

hecorre

la-tion

between

predict

edand

actualv

oxelac

tiva-

tionsfo

rword

soutsi

dethe

trainin

gsetfo

rpar-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofwh

osepre

dicted-

actualc

orrelatio

nisatl

east0.

28.The

sevoxe

lcluster

sared

istribut

edthro

ughout

thecort

exand

located

inthe

leftand

rightoc

cipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmiddl

efront

algyri;

leftinfe

riorfron

talgyru

s;media

lfronta

lgyrus;

andant

erior

cingulat

e.(C)

Surface

render

ingof

thepre

dicted-

actualc

orrelatio

navera

gedove

ralln

inepar

ticipant

s.This

panelr

eprese

ntsclus

terscon

taining

atleas

t10con

tiguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y2008

VOL3

20SC

IENCE

www.s

cience

mag.o

rg11

92RESEA

RCHA

RTICL

ES

on May 30, 2008 www.sciencemag.org Downloaded from

Toful

lyspe

cifya

model

within

thisc

om-

putati

onalm

odelin

gfram

ework

,one

mustf

irstdef

ineas

etof

interm

ediate

semant

icfea

tures

f 1(w)f 2

(w)…f n(w

)tobe

extrac

tedfro

mthe

text

corpus

.Inthis

paper,

eachin

termedi

atesem

antic

featur

eisdef

inedinterm

softh

eco-o

ccurre

ncesta

tistics

ofthe

input

stimulu

sword

wwit

hapar

ticular

otherw

ord(e.g

.,“tast

e”)or

setof

words

(e.g.,“

taste,”

“tastes

,”or“

tasted”

)withi

nthe

text

corpus

.The

model

istrain

edby

theapp

lication

ofmu

ltiplereg

ressio

ntot

hesef

eature

sf i(w)

andthe

observ

edfM

RIima

ges,so

astoo

btainm

aximu

m-like

lihood

estima

tesfor

themo

delpar

amete

rsc vi

(26).O

ncetrai

ned,th

ecom

putatio

nalmo

delcan

beeva

luated

bygiv

ingitw

ordso

utside

thetrai

ning

setand

compar

ingits

predic

tedfM

RIimage

sfor

these

words

witho

bserve

dfMR

Idata

.Th

iscom

putatio

nalmo

deling

framew

orkis

based

ontwo

keythe

oretica

lassum

ptions.

First,

itass

umes

thesem

anticfea

turesth

atdis

tinguis

hthe

meani

ngsof

arbitra

rycon

creten

ounsa

rerefle

cted

inthe

statist

icsofthe

iruse

within

avery

largete

xtcor

pus.T

hisass

umptio

nisdra

wnfro

mthe

fieldo

fcom

putatio

nalling

uistics

,wher

estati

stical

word

distrib

utions

arefreq

uently

usedt

oappr

oxima

tethe

meani

ngof

docum

entsa

ndwo

rds(14

–17).

Second

,itass

umes

thatth

ebrain

activit

yobse

rved

when

thinkin

gabou

tany

concre

tenou

ncan

beder

iveda

sawe

ighted

linear

sumof

contrib

utions

from

eacho

fitss

emant

icfea

tures.A

lthough

thecor

rectne

ssof

thisline

aritya

ssump

tionisd

ebat-

able,i

tiscon

sistent

witht

hewid

esprea

duse

ofline

armode

lsinf

MRIan

alysis

(27)an

dwith

theass

umptio

nthat

fMRI

activa

tionoft

enref

lectsa

linear

superp

osition

ofcon

tributio

nsfrom

differe

ntsou

rces.O

urtheo

retical

frame

workd

oesnot

takea

positio

nonw

hether

theneu

ralact

ivation

encodi

ngme

aning

isloc

alized

inpar

ticular

cortica

lre-

gions.

Instea

d,itco

nsider

sallco

rticalv

oxelsa

ndallo

wsthe

trainin

gdata

todet

ermine

which

loca-

tionsa

resys

tematic

allymo

dulate

dbyw

hicha

s-pec

tsofw

ordme

anings

.

Results.W

eeval

uated

thiscom

putatio

nalmo

d-elu

singf

MRId

atafro

mnin

eheal

thy,co

llege-a

gepar

ticipan

tswho

viewe

d60d

ifferen

tword

-picture

pairs

presen

tedsix

times

each.

Anato

mically

de-fine

dregi

onsof

interes

twere

autom

atically

labele

dacc

ording

tothe

metho

dology

in(28)

.The

60ran

-dom

lyord

ereds

timuli

includ

edfive

itemsf

romeac

hof12

semant

iccategor

ies(an

imals,

bodyp

arts,

buildin

gs,bui

ldingp

arts,cl

othing

,furnit

ure,in

sects,

kitchen

items,t

ools,v

egetab

les,veh

icles,a

ndoth

erma

n-made

items).

Arepr

esenta

tivefM

RIima

gefor

eachs

timulu

swas

createdb

ycom

puting

theme

anfM

RIrespon

seove

ritss

ixpre

sentati

ons,an

dthe

mean

ofall

60of

these

repres

entativ

eimage

swas

thens

ubtrac

tedfro

meac

h[for

details

,see(

26)].

Toins

tantiat

eourm

odeling

framew

ork,w

efirst

chosea

setofi

nterm

ediate

semant

icfeat

ures.T

obe

effectiv

e,the

interm

ediate

semant

icfea

turesm

ustsim

ultaneo

uslye

ncode

thewid

evarie

tyofse

mantic

conten

tofth

einput

stimulu

sword

sand

factor

theobs

erved

fMRI

activa

tioninto

morep

rimitiv

ecom

-

Predic

ted“ce

lery”

= 0.84

“celer

y”“ai

rplan

e”

Predic

ted:

Obse

rved:

AB

+...

high

averag

e below

averag

e

Predic

ted “c

elery”

:

+ 0.35

+ 0.32

“eat”

“taste

”“fill

Fig.2

.Pred

ictingfM

RIima

gesfor

given

stimulu

sword

s.(A)

Formin

gapre

diction

forpar

-tici

pantP

1for

thestim

ulus

word“

celery”

aftertr

aining

on58

otherw

ords.L

earned

c vico-

efficien

tsfor

3ofth

e25s

e-ma

nticfea

tures(“

eat,”“

taste,”

and“fil

l”)are

depicte

dbyth

evox

elcolo

rsinth

ethree

images

atthe

topoft

hepan

el.The

co-occ

urrence

valuef

oreac

hofth

esefea

turesfo

rthes

timulu

sword

“celery

”issho

wntot

heleft

ofthei

rrespe

ctiveim

ages[e

.g.,the

valuefo

r“eat(

celery)”

is0.8

4].The

predict

edacti

vation

forthe

stimulu

sword

[shown

atthe

bottom

of(A)]

isaline

arcom

binatio

nofth

e25s

emant

icfMR

Isigna

tures,w

eighte

dby

theirc

o-occu

rrence

values

.Thisf

igure

shows

justo

nehor

izonta

lslice

[z=

–12mm

inMo

ntreal

Neurolo

gical

Institu

te(MN

I)spac

e]of

thepre

dicted

three-

dimens

ional

image.

(B)Pre

dicted

andobs

erved

fMRIim

agesf

or“ce

lery”a

nd“ai

rplane”

aftertr

aining

thatus

es58o

therw

ords.T

hetwo

long

redand

bluev

ertical

streaks

nearth

etop(p

osterio

rregio

n)ofth

epred

icted

andobs

erved

images

arethe

leftand

rightfu

siform

gyri.

A B

C

Mean overparticipants

Participant P5

Fig.3

.Loca

tionso

fmo

stacc

urately

pre-

dicted

voxels.

Surfac

e(A)

andgla

ssbrain

(B)ren

dering

ofthe

correla

-tion

betwee

npred

icted

andactu

alvoxe

lactiva

-tion

sforw

ordso

utside

thetrai

nings

etfor

par-

ticipant

P5.The

sepane

lsshow

clusters

contain

ingatl

east10

contigu

ousvox

els,eac

hofw

hose

predict

ed-actu

alcorre

lation

isatle

ast0.2

8.Thes

evoxe

lcluste

rsared

istribu

tedthro

ughout

thecor

texand

located

inthe

leftand

righto

ccipital

andpar

ietallo

bes;le

ftand

rightfu

siform,

postcen

tral,an

dmidd

lefron

talgyr

i;leftin

feriorf

rontal

gyrus;

media

lfronta

lgyrus;

andant

erior

cingula

te.(C)

Surface

render

ingof

thepre

dicted-

actual

correla

tionave

raged

overa

llnine

particip

ants.T

hispan

elrepr

esents

clusters

contain

ingatl

east1

0cont

iguous

voxels,

eachw

ithave

rageco

rrelatio

nofat

least0.

14.

30MA

Y200

8VO

L320

SCIEN

CEww

w.scie

ncema

g.org

1192RESE

ARCH

ARTIC

LES

on May 30, 2008 www.sciencemag.org Downloaded from

✔ ✔ ✔

✔ ✔ ✔

✔ ✔

Y

X

7 SIAM, July 2017 (c) C. Faloutsos, 2017

Page 8: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Coupled Matrix-Tensor Factorization

(CMTF)

Y X

a1 aF

b1 bF

c1 cF

a1 aF

d1 dF

SIAM, July 2017 8 (c) C. Faloutsos, 2017

+

+

Page 9: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Neuro-semantics

50 100 150 200 250

50

100

150

200

250

3000

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

Premotor Cortex

50 100 150 200 250

50

100

150

200

250

300

0.005

0.01

0.015

0.02

0.025

0.03

0.035

Group1

Group 2 Group 4

Group 3

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?bear does it grow?cow is it alive?coat was it ever alive?

bear does it grow?cow is it alive?coat was it ever alive?

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

bed does it use electricity?house can you sit on it?car does it cast a shadow?

bed does it use electricity?house can you sit on it?car does it cast a shadow?

Figure 4: Turbo-SMT finds meaningful groups of words, questions, and brain regions that are (both negativelyand positively) correlated, as obtained using Turbo-SMT. For instance, Group 3 refers to small items that canbe held in one hand,such as a tomato or a glass, and the activation pattern is very di↵erent from the one ofGroup 1, which mostly refers to insects, such as bee or beetle. Additionally, Group 3 shows high activation in thepremotor cortex which is associated with the concepts of that group.

v

1

and v

2

which were withheld from the training data,the leave-two-out scheme measures prediction accuracyby the ability to choose which of the observed brainimages corresponds to which of the two words. Aftermean-centering the vectors, this classification decisionis made according to the following rule:

kv1 � v̂1k2 + kv2 � v̂2k2 < kv1 � v̂2k2 + kv2 � v̂1k2

Although our approach is not designed to make predic-tions, preliminary results are very encouraging: Usingonly F=2 components, for the noun pair closet/watch

we obtained mean accuracy of about 0.82 for 5 out of the9 human subjects. Similarly, for the pair knife/beetle,we achieved accuracy of about 0.8 for a somewhat dif-ferent group of 5 subjects. For the rest of the humansubjects, the accuracy is considerably lower, however, itmay be the case that brain activity predictability variesbetween subjects, a fact that requires further investiga-tion.

5 Experiments

We implemented Turbo-SMT in Matlab. Our imple-mentation of the code is publicly available.5 For the par-allelization of the algorithm, we used Matlab’s ParallelComputing Toolbox. For tensor manipulation, we used

5http://www.cs.cmu.edu/

~

epapalex/src/turbo_smt.zip

the Tensor Toolbox for Matlab [7] which is optimizedespecially for sparse tensors (but works very well fordense ones too). We use the ALS and the CMTF-OPT[5] algorithms as baselines, i.e. we compare Turbo-

SMT when using one of those algorithms as their coreCMTF implementation, against the plain execution ofthose algorithms. We implemented our version of theALS algorithm, and we used the CMTF Toobox6 im-plementation of CMTF-OPT. We use CMTF-OPT forhigher ranks, since that particular algorithm is moreaccurate than ALS, and is the state of the art. All ex-periments were carried out on a machine with 4 IntelXeon E74850 2.00GHz, and 512Gb of RAM. Wheneverwe conducted multiple iterations of an experiment (dueto the randomized nature of Turbo-SMT), we reporterror-bars along the plots. For all the following experi-ments we used either portions of the BrainQ dataset,or the whole dataset.

5.1 Speedup As we have already discussed in the In-troduction and shown in Fig. 1, Turbo-SMT achievesa speedup of 50-200 on the BrainQ dataset; For allcases, the approximation cost is either same as the base-lines, or is larger by a small factor, indicating thatTurbo-SMT is both fast and accurate. Key facts that

6http://www.models.life.ku.dk/joda/CMTF_Toolbox

9 SIAM, July 2017 (c) C. Faloutsos, 2017

=

Page 10: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Neuro-semantics

10 SIAM, July 2017 (c) C. Faloutsos, 2017

50 100 150 200 250

50

100

150

200

250

3000

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

Premotor Cortex

50 100 150 200 250

50

100

150

200

250

300

0.005

0.01

0.015

0.02

0.025

0.03

0.035

Group1

Group 2 Group 4

Group 3

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?bear does it grow?cow is it alive?coat was it ever alive?

bear does it grow?cow is it alive?coat was it ever alive?

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

bed does it use electricity?house can you sit on it?car does it cast a shadow?

bed does it use electricity?house can you sit on it?car does it cast a shadow?

Figure 4: Turbo-SMT finds meaningful groups of words, questions, and brain regions that are (both negativelyand positively) correlated, as obtained using Turbo-SMT. For instance, Group 3 refers to small items that canbe held in one hand,such as a tomato or a glass, and the activation pattern is very di↵erent from the one ofGroup 1, which mostly refers to insects, such as bee or beetle. Additionally, Group 3 shows high activation in thepremotor cortex which is associated with the concepts of that group.

v

1

and v

2

which were withheld from the training data,the leave-two-out scheme measures prediction accuracyby the ability to choose which of the observed brainimages corresponds to which of the two words. Aftermean-centering the vectors, this classification decisionis made according to the following rule:

kv1 � v̂1k2 + kv2 � v̂2k2 < kv1 � v̂2k2 + kv2 � v̂1k2

Although our approach is not designed to make predic-tions, preliminary results are very encouraging: Usingonly F=2 components, for the noun pair closet/watch

we obtained mean accuracy of about 0.82 for 5 out of the9 human subjects. Similarly, for the pair knife/beetle,we achieved accuracy of about 0.8 for a somewhat dif-ferent group of 5 subjects. For the rest of the humansubjects, the accuracy is considerably lower, however, itmay be the case that brain activity predictability variesbetween subjects, a fact that requires further investiga-tion.

5 Experiments

We implemented Turbo-SMT in Matlab. Our imple-mentation of the code is publicly available.5 For the par-allelization of the algorithm, we used Matlab’s ParallelComputing Toolbox. For tensor manipulation, we used

5http://www.cs.cmu.edu/

~

epapalex/src/turbo_smt.zip

the Tensor Toolbox for Matlab [7] which is optimizedespecially for sparse tensors (but works very well fordense ones too). We use the ALS and the CMTF-OPT[5] algorithms as baselines, i.e. we compare Turbo-

SMT when using one of those algorithms as their coreCMTF implementation, against the plain execution ofthose algorithms. We implemented our version of theALS algorithm, and we used the CMTF Toobox6 im-plementation of CMTF-OPT. We use CMTF-OPT forhigher ranks, since that particular algorithm is moreaccurate than ALS, and is the state of the art. All ex-periments were carried out on a machine with 4 IntelXeon E74850 2.00GHz, and 512Gb of RAM. Wheneverwe conducted multiple iterations of an experiment (dueto the randomized nature of Turbo-SMT), we reporterror-bars along the plots. For all the following experi-ments we used either portions of the BrainQ dataset,or the whole dataset.

5.1 Speedup As we have already discussed in the In-troduction and shown in Fig. 1, Turbo-SMT achievesa speedup of 50-200 on the BrainQ dataset; For allcases, the approximation cost is either same as the base-lines, or is larger by a small factor, indicating thatTurbo-SMT is both fast and accurate. Key facts that

6http://www.models.life.ku.dk/joda/CMTF_Toolbox

Small items -> Premotor cortex

=

✔ Unsupervised ✔ Matches intuition

Page 11: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Neuro-semantics

11 SIAM, July 2017 (c) C. Faloutsos, 2017

50 100 150 200 250

50

100

150

200

250

3000

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

Premotor Cortex

50 100 150 200 250

50

100

150

200

250

300

0.005

0.01

0.015

0.02

0.025

0.03

0.035

Group1

Group 2 Group 4

Group 3

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?bear does it grow?cow is it alive?coat was it ever alive?

bear does it grow?cow is it alive?coat was it ever alive?

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

bed does it use electricity?house can you sit on it?car does it cast a shadow?

bed does it use electricity?house can you sit on it?car does it cast a shadow?

Figure 4: Turbo-SMT finds meaningful groups of words, questions, and brain regions that are (both negativelyand positively) correlated, as obtained using Turbo-SMT. For instance, Group 3 refers to small items that canbe held in one hand,such as a tomato or a glass, and the activation pattern is very di↵erent from the one ofGroup 1, which mostly refers to insects, such as bee or beetle. Additionally, Group 3 shows high activation in thepremotor cortex which is associated with the concepts of that group.

v

1

and v

2

which were withheld from the training data,the leave-two-out scheme measures prediction accuracyby the ability to choose which of the observed brainimages corresponds to which of the two words. Aftermean-centering the vectors, this classification decisionis made according to the following rule:

kv1 � v̂1k2 + kv2 � v̂2k2 < kv1 � v̂2k2 + kv2 � v̂1k2

Although our approach is not designed to make predic-tions, preliminary results are very encouraging: Usingonly F=2 components, for the noun pair closet/watch

we obtained mean accuracy of about 0.82 for 5 out of the9 human subjects. Similarly, for the pair knife/beetle,we achieved accuracy of about 0.8 for a somewhat dif-ferent group of 5 subjects. For the rest of the humansubjects, the accuracy is considerably lower, however, itmay be the case that brain activity predictability variesbetween subjects, a fact that requires further investiga-tion.

5 Experiments

We implemented Turbo-SMT in Matlab. Our imple-mentation of the code is publicly available.5 For the par-allelization of the algorithm, we used Matlab’s ParallelComputing Toolbox. For tensor manipulation, we used

5http://www.cs.cmu.edu/

~

epapalex/src/turbo_smt.zip

the Tensor Toolbox for Matlab [7] which is optimizedespecially for sparse tensors (but works very well fordense ones too). We use the ALS and the CMTF-OPT[5] algorithms as baselines, i.e. we compare Turbo-

SMT when using one of those algorithms as their coreCMTF implementation, against the plain execution ofthose algorithms. We implemented our version of theALS algorithm, and we used the CMTF Toobox6 im-plementation of CMTF-OPT. We use CMTF-OPT forhigher ranks, since that particular algorithm is moreaccurate than ALS, and is the state of the art. All ex-periments were carried out on a machine with 4 IntelXeon E74850 2.00GHz, and 512Gb of RAM. Wheneverwe conducted multiple iterations of an experiment (dueto the randomized nature of Turbo-SMT), we reporterror-bars along the plots. For all the following experi-ments we used either portions of the BrainQ dataset,or the whole dataset.

5.1 Speedup As we have already discussed in the In-troduction and shown in Fig. 1, Turbo-SMT achievesa speedup of 50-200 on the BrainQ dataset; For allcases, the approximation cost is either same as the base-lines, or is larger by a small factor, indicating thatTurbo-SMT is both fast and accurate. Key facts that

6http://www.models.life.ku.dk/joda/CMTF_Toolbox

Evangelos Papalexakis, Tom Mitchell, Nicholas Sidiropoulos, Christos Faloutsos, Partha Pratim Talukdar, Brian Murphy, Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200x, SDM 2014

Small items -> Premotor cortex

Page 12: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery

– Brain scans – coupled matrix-tensor factorization

– Power grid •  Applications – anomaly detection •  Algorithms •  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 12

Page 13: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

PowerCast Mining electric power data

SIAM, July 2017 (c) C. Faloutsos, 2017 13

Hyun Ah Song, Bryan Hooi, Marko Jereminov, Amritanshu Pandey, Larry Pileggi, and CF, PowerCast: Mining and Forecasting Power Grid Sequences, PKDD’17, Skopje, FYROM

Page 14: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Problem definition •  Given: real and imaginary current and voltage •  Forecast: power demand in the future, and •  Guess: how the forecasts will change under

various scenarios (e.g. population drops in half, etc)

SIAM, July 2017 (c) C. Faloutsos, 2017 14

… … … …

What-if: population drops? What-if: more factories are built?

?

?

?

?

Given Forecast

What-if: normal condition? PowerCast

Page 15: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Domain knowledge

SIAM, July 2017 (c) C. Faloutsos, 2017 15

Coils/capacitors

resistors Ir

Ii

Page 16: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

PowerCast

SIAM, July 2017 (c) C. Faloutsos, 2017 16

!"!#$"$#

!"!#$"$#

timeoftheday

days %…………

Tensorize Transfertoexpertdomain

&'("(#

timeoftheday

days %

!" ) = & ) $" ) − ' ) $# ) + ("())!# ) = ' ) $" ) +& ) $# ) + (#())

/"0

1"2"

= 3 4"5

"67×

ARSAR…

oror

Extensionby

Forecast(days)/"

1"2"

= 3 4"5

"67×&'

("(#

%

&'("(#

…%

Tensorextension(forecast)

Tensordecomposition Backtooriginal form(fromBIGdomain)

!"!#$"$#

…%

DETAILS

Page 17: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

PowerCast

SIAM, July 2017 (c) C. Faloutsos, 2017 17

!"!#$"$#

!"!#$"$#

timeoftheday

days %…………

…Tensorize Transferto

expertdomain

&'("(#

timeoftheday

days %

!" ) = & ) $" ) − ' ) $# ) + ("())!# ) = ' ) $" ) +& ) $# ) + (#())

/"0

1"2"

= 3 4"5

"67×

ARSAR…

oror

Extensionby

Forecast(days)/"

1"2"

= 3 4"5

"67×&'

("(#

%

&'("(#

…%

Tensorextension(forecast)

Tensordecomposition Backtooriginal form(fromBIGdomain)

!"!#$"$#

…%

DETAILS

Page 18: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

PowerCast

SIAM, July 2017 (c) C. Faloutsos, 2017 18

!"!#$"$#

!"!#$"$#

timeoftheday

days %…………

Tensorize Transfertoexpertdomain

&'("(#

timeoftheday

days %

!" ) = & ) $" ) − ' ) $# ) + ("())!# ) = ' ) $" ) +& ) $# ) + (#())

/"0

1"2"

= 3 4"5

"67×

ARSAR…

oror

Extensionby

Forecast(days)/"

1"2"

= 3 4"5

"67×&'

("(#

%

&'("(#

…%

Tensorextension(forecast)

Tensordecomposition Backtooriginal form(fromBIGdomain)

!"!#$"$#

…%

DETAILS

Page 19: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Tensor factors (concepts)

SIAM, July 2017 (c) C. Faloutsos, 2017 19

weekends weekends weekends 9am 6pm4pm

“background”“human-activities”

G(e.g.““)

G(e.g.““)

B(e.g.“”)

B(e.g.“”)

1st component:

2nd component:

Long-term-concepts (ur)

Daily-concepts (vr)

User-profile-concepts (wr)

!"!#$"$#

!"!#$"$#

timeoftheday

days %…………

Tensorize Transfertoexpertdomain

&'("(#

timeoftheday

days %

!" ) = & ) $" ) − ' ) $# ) + ("())!# ) = ' ) $" ) +& ) $# ) + (#())

/"0

1"2"

= 3 4"5

"67×

ARSAR…

oror

Extensionby

Forecast(days)/"

1"2"

= 3 4"5

"67×&'

("(#

%

&'("(#

…%

Tensorextension(forecast)

Tensordecomposition Backtooriginal form(fromBIGdomain)

!"!#$"$#

…%

Page 20: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Forecast

SIAM, July 2017 (c) C. Faloutsos, 2017 20

PowerCast forecasts 24-steps (1-day) ahead on CMU data more accurate than other competitors

Page 21: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Anomaly detection & explanation

SIAM, July 2017 (c) C. Faloutsos, 2017 21

Anomaly! A: Q: What happened?

‘background’ component increased!

“background”“human-activities”

G(e.g.““)

G(e.g.““)

B(e.g.“”)

B(e.g.“”)

1st component:

2nd component:

Case 1?

Case 2?

?:

Page 22: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Anomaly detection & explanation

SIAM, July 2017 (c) C. Faloutsos, 2017 22

Anomaly! A: Q: What happened?

‘background’ component increased!

“background”“human-activities”

G(e.g.““)

G(e.g.““)

B(e.g.“”)

B(e.g.“”)

1st component:

2nd component:

Case 1?

Case 2?

?: Case 1?

Case 2?

80ºF 88ºF

Page 23: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery •  Applications – anomaly detection

– Phone-call data –  Intrusion detection

•  Algorithms •  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 23

Page 24: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Anomalies in phone-call data •  PARAFAC decomposition •  Results for who-calls-whom-when

–  4M x 15 days

SIAM, July 2017 (c) C. Faloutsos, 2017 24

= + + caller

callee

?? ?? ??

Page 25: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Anomaly detection in time-

evolving graphs

•  Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks

~200 calls to EACH receiver on EACH day!

1 caller 5 receivers 4 days of activity

SIAM, July 2017 25 (c) C. Faloutsos, 2017

=

Page 26: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Anomaly detection in time-

evolving graphs

•  Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks

~200 calls to EACH receiver on EACH day!

1 caller 5 receivers 4 days of activity

SIAM, July 2017 26 (c) C. Faloutsos, 2017

=

Page 27: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Anomaly detection in time-

evolving graphs

•  Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks

~200 calls to EACH receiver on EACH day!

1 caller 5 receivers 4 days of activity

Tencent, 6/22 27 (c) C. Faloutsos, 2017

=

Page 28: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Anomaly detection in time-

evolving graphs

•  Anomalous communities in phone call data: – European country, 4M clients, data over 2 weeks

~200 calls to EACH receiver on EACH day!

SIAM, July 2017 28 (c) C. Faloutsos, 2017

=

Miguel Araujo, Spiros Papadimitriou, Stephan Günnemann, Christos Faloutsos, Prithwish Basu, Ananthram Swami, Evangelos Papalexakis, Danai Koutra. Com2: Fast Automatic Discovery of Temporal (Comet) Communities. PAKDD 2014, Tainan, Taiwan.

Page 29: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery •  Applications – anomaly detection

– Phone-call data –  Intrusion detection

•  Algorithms •  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 29

Page 30: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

ParCube at Work: Port-Scanning

LBNL Network Tra�c This dataset consists of (source, destination, port #)triplets, where each value of the corresponding tensor is the number of packetssent. The snapshot of the dataset we used, formed a 65170 ⇥ 65170 ⇥ 65327tensor of 27269 non-zeros. We ran Algorithm 3 using s = 5 and r = 10 and wewere able to identify what appears to be a port-scanning attack: The componentshown in Fig. 9 contains only one source address (addr. 29571), contacting onedestination address (addr. 30483) using a wide range of near-consecutive ports(while sending the same amount of packets to each port), a behaviour whichshould certainly raise a flag to the network administrator, indicating a possibleport-scanning attack.

0 1 2 3 4 5 6 7

x 104

0

10

20

0 1 2 3 4 5 6 7

x 104

0

0.5

1

0 1 2 3 4 5 6 7

x 104

0

0.05

0.1

Src

Dst

Port Scaning AttackPort

Fig. 9. Anomaly on the Lbnl data: We have one source address (addr. 29571), con-tacting one destination address (addr. 30483) using a wide range of near-consecutiveports, possibly indicating a port scanning attack.

Facebook Wall posts This dataset 5 first appeared in [25]; the specific partof the dataset we used consists of triplets of the form (Wall owner, Poster,day), where the Poster created a post on the Wall owner’s Wall on the specifiedtimestamp. By choosing daily granularity, we formed a 63891 ⇥ 63890 ⇥ 1847tensor, comprised of 737778 non-zero entries; subsequently, we ran Algorithm 3using s = 100 and r = 10. In Figure 10 we present our most surprising findings:On the left subfigure, we demonstrate what appears to be the Wall owner’sbirthday, since many posters posted on a single day on this person’s Wall; thisevent may well be characterized as an ”anomaly”. On the right subfigure, wedemonstrate what ”normal” Facebook activity looks like.

NELL This dataset consists of triplets of the form (noun-phrase, noun-phrase,context). which form a tensor with assorted modes of size 14545⇥14545⇥28818and 76879419 non-zeros, and as values the number of occurrences of each triplet.The context phrase may be just a verb or a whole sentence. After computing theParafac decomposition of the tensor using ParCube with s = 500, and r = 10repetitions, we computed the noun-phrase similarity matrix AAT + BBT and

5 Download Facebook at http://socialnetworks.mpi-sws.org/data-wosn2009.

html

Src

Dst

Port

+…+ ≅ LBNL Network Data

SIAM, July 2017 (c) C. Faloutsos, 2017 30 Papalexakis et al. ECML-PKDD 2012

Page 31: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery •  Applications – anomaly detection •  Algorithms

– ParCube (and TurboSMT) – S-HOT for higher order Tucker

•  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 31

Page 32: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS ParCube/Turbo-SMT: Triple-sparse Parallel Tensor & Coupled

Decomposition

Xr

X1

X

≈ …

FACTORMERGE

32 SIAM, July 2017 (c) C. Faloutsos, 2017 Papalexakis et al. ECML-PKDD 2012 / SDM 2014

Page 33: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS Speedup

SIAM, July 2017 33

Baseline (ALS)

~ 1 day

4 Intel Xeon E74850 512Gb RAM, Fedora 14

Data size ~500Mb

0123456789

101112

0.001 0.01 0.1 1

Rela%veerror

Rela%verun%me

8 workers

100x faster

1/20

1/10

1/5 1/2

sample size

(c) C. Faloutsos, 2017

Page 34: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery •  Applications – anomaly detection •  Algorithms

– ParCube (and TurboSMT) – S-HOT for higher order Tucker

•  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 34

Page 35: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

•  Tucker Decomposition:

SIAM, July 2017 35 (c) C. Faloutsos, 2017

Jinoh Oh, Kijung Shin, Evangelos E. Papalexakis, Christos Faloutsos, and Hwanjo Yu, S-HOT: Scalable High-Order Tucker Decomposition, WSDM 2017

S-HOT: Scalable High-Order Tucker Decomposition

X C A(2)

A(1

)

Page 36: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

High-Order Tucker Decomposition (Example)

•  Input tensor – X: a 5-way sparse tensor (size: 1M … 1M, #non-zeros: 100M)

•  Output tensors: – C: a 5-way core tensor (size: 10 … 10, #non-zeros : ~100K) – A, …, A(5): factor matrices (size: 1M 10, #non-zeros : ~10M)

SIAM, July 2017 (c) C. Faloutsos, 2017 36

Page 37: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

High-Order Tucker Decomposition (Main idea)

SIAM, July 2017 (c) C. Faloutsos, 2017 37

Huge, & dense

Page 38: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

High-Order Tucker Decomposition (Main idea)

SIAM, July 2017 (c) C. Faloutsos, 2017 38

Huge, & dense

-> DON’T materialize it

Page 39: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Scalability of S-HOT •  Baselines: Standard ALS (Naïve), & (opt)

SIAM, July 2017 (c) C. Faloutsos, 2017 45

Page 40: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Scalability of S-HOT •  Baselines: Standard ALS (Naïve), & (opt) •  I = 1M; J=10; M=1B; N=5 modes

SIAM, July 2017 (c) C. Faloutsos, 2017 46

Page 41: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Discovery using S-HOT •  Microsoft Academic Graph (42M papers × 25K venues × 115M authors × 54K keywords)

SIAM, July 2017 (c) C. Faloutsos, 2017 47

Page 42: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Discovery using S-HOT •  Microsoft Academic Graph (42M papers × 25K venues × 115M authors × 54K keywords)

SIAM, July 2017 (c) C. Faloutsos, 2017 48

Page 43: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

Roadmap •  Applications – pattern discovery •  Applications – anomaly detection •  Algorithms

– ParCube (and TurboSMT) – S-HOT for higher order Tucker

•  Conclusions

SIAM, July 2017 (c) C. Faloutsos, 2017 49

Page 44: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

(c) C. Faloutsos, 2017 50

Thanks

SIAM, July 2017

Thanks to: NSF IIS-0705359, IIS-0534205, CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, Google, INTEL, HP, iLab

Disclaimer: All opinions are mine; not necessarily reflecting the opinions of the funding agencies

Page 45: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

(c) C. Faloutsos, 2017 51

CONCLUSION#1

•  MANY applications for tensors

SIAM, July 2017

=

50 100 150 200 250

50

100

150

200

250

3000

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

50 100 150 200 250

50

100

150

200

250

3000

0.01

0.02

0.03

0.04

0.05

Premotor Cortex

50 100 150 200 250

50

100

150

200

250

300

0.005

0.01

0.015

0.02

0.025

0.03

0.035

Group1

Group 2 Group 4

Group 3

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

beetle can it cause you pain?pants do you see it daily?bee is it conscious?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?

Nouns Questions Nouns Questions

Nouns Questions

beetle can it cause you pain?bear does it grow?cow is it alive?coat was it ever alive?

bear does it grow?cow is it alive?coat was it ever alive?

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

glass can you pick it up?tomato can you hold it in one hand?bell is it smaller than a golfball?’

bed does it use electricity?house can you sit on it?car does it cast a shadow?

bed does it use electricity?house can you sit on it?car does it cast a shadow?

Figure 4: Turbo-SMT finds meaningful groups of words, questions, and brain regions that are (both negativelyand positively) correlated, as obtained using Turbo-SMT. For instance, Group 3 refers to small items that canbe held in one hand,such as a tomato or a glass, and the activation pattern is very di↵erent from the one ofGroup 1, which mostly refers to insects, such as bee or beetle. Additionally, Group 3 shows high activation in thepremotor cortex which is associated with the concepts of that group.

v

1

and v

2

which were withheld from the training data,the leave-two-out scheme measures prediction accuracyby the ability to choose which of the observed brainimages corresponds to which of the two words. Aftermean-centering the vectors, this classification decisionis made according to the following rule:

kv1 � v̂1k2 + kv2 � v̂2k2 < kv1 � v̂2k2 + kv2 � v̂1k2

Although our approach is not designed to make predic-tions, preliminary results are very encouraging: Usingonly F=2 components, for the noun pair closet/watch

we obtained mean accuracy of about 0.82 for 5 out of the9 human subjects. Similarly, for the pair knife/beetle,we achieved accuracy of about 0.8 for a somewhat dif-ferent group of 5 subjects. For the rest of the humansubjects, the accuracy is considerably lower, however, itmay be the case that brain activity predictability variesbetween subjects, a fact that requires further investiga-tion.

5 Experiments

We implemented Turbo-SMT in Matlab. Our imple-mentation of the code is publicly available.5 For the par-allelization of the algorithm, we used Matlab’s ParallelComputing Toolbox. For tensor manipulation, we used

5http://www.cs.cmu.edu/

~

epapalex/src/turbo_smt.zip

the Tensor Toolbox for Matlab [7] which is optimizedespecially for sparse tensors (but works very well fordense ones too). We use the ALS and the CMTF-OPT[5] algorithms as baselines, i.e. we compare Turbo-

SMT when using one of those algorithms as their coreCMTF implementation, against the plain execution ofthose algorithms. We implemented our version of theALS algorithm, and we used the CMTF Toobox6 im-plementation of CMTF-OPT. We use CMTF-OPT forhigher ranks, since that particular algorithm is moreaccurate than ALS, and is the state of the art. All ex-periments were carried out on a machine with 4 IntelXeon E74850 2.00GHz, and 512Gb of RAM. Wheneverwe conducted multiple iterations of an experiment (dueto the randomized nature of Turbo-SMT), we reporterror-bars along the plots. For all the following experi-ments we used either portions of the BrainQ dataset,or the whole dataset.

5.1 Speedup As we have already discussed in the In-troduction and shown in Fig. 1, Turbo-SMT achievesa speedup of 50-200 on the BrainQ dataset; For allcases, the approximation cost is either same as the base-lines, or is larger by a small factor, indicating thatTurbo-SMT is both fast and accurate. Key facts that

6http://www.models.life.ku.dk/joda/CMTF_Toolbox

Page 46: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

(c) C. Faloutsos, 2017 52

CONCLUSION#2 •  Domain-experts: valuable

SIAM, July 2017

Q: What happened?

Case 1?

Case 2?

80ºF 88ºF

Page 47: Tensor Analysis – Applications and Algorithmsusers.wfu.edu › ballard › SIAM-AN17 › faloutsos.pdf · Tensor Analysis – Applications and Algorithms Christos Faloutsos CMU

CMU SCS

(c) C. Faloutsos, 2017 53

CONCLUSION#2 •  Domain-experts: valuable

SIAM, July 2017

Q: What happened?

Case 1?

Case 2?

80ºF 88ºF

Thank you!