MULTIMODAL LINGUISTICS: Directions of research

Andrej A. Kibrik (Institute of Linguistics, RAN)

kibrik@comtv.ru

CML-2008, Montenegro, September 2008


The mainstream linguistic approach

Language consists of hierarchically organized segmental units, such as phonemes, morphemes, words, phrases, and sentences

Linguistic form is thus equated with verbal form

Search for “linguistic form” in Google: the first result is “A meaningful unit of language, such as an affix, a word, a phrase, or a sentence.” (TheFreeDictionary.com)

“Taken together, linguistic signs form a sign system of a special kind: language. <…> The most typical linguistic sign is the word <…> The form of expression of any verbal sign consists of phonemes” (Lingvističeskij ènciklopedičeskij slovar’ [Linguistic Encyclopedic Dictionary], p. 167)


In related disciplines

Assumption typically held by other cognitive scientists, for example psychologists: language consists of words, sentences, and other verbal units

“With no more than 50 to 100 K words humans can create and understand an infinite number of sentences” (Bernstein et al. 1994: 349-350)

When psychologists and neuroscientists work with “language”, they almost invariably think that language is a set of individual words or, at most, sentences


However

There are prosodic, that is, non-verbal aspects to sound

Imagine prosody-free talk or, vice versa, talk behind a wall

Apart from sound, there are other channels of communication, in the first place through vision (body language, gesture, gaze, etc.)


Multimodality

In order to understand language and communication, all aspects of linguistic form should be taken into account

This is what is sometimes called the multimodal approach

Modality, or mode, refers to a distinct type of input

In particular, a modality is a kind of stimulus associated with one of the human senses, particularly hearing and sight

So the verbal component, prosody, and body language all count as modes or modalities

Hence the notion of multimodality


Goals of this talk

Emphasize the importance of prosody and visual aspects of communication in linguistic research

Show how prosody and visual communication interact with the verbal component, thus suggesting not only the multimodal, but also the cross-modal approach

Propose that linguistics cannot progress without seriously taking multimodality into account


Are these goals relevant and important?

After all, linguists and other scholars have already been pursuing these issues for many decades, and the respective research traditions are quite rich

But:

First, prosody and visual communication are marginalized in linguistics: they are confined to certain “pockets” of the overall linguistic panorama and are tolerated by the mainstream as “paralinguistics”

Second, those focusing on these information channels often treat them as a “thing in itself”, without integration with the verbal component


Plan of talk

I. Prosody and the sentence

II. Gestures and reference

III. Relative contribution of three information channels

IV. Signed languages and reference

V. Wider context


I. PROSODY

Prosodic components:
• pausing
• accents
• pitch
• tempo (of various scope)
• registers
• degrees of reduction
• glottal features
• loudness
• ...

Prosody is responsible for discourse segmentation into Elementary Discourse Units (EDUs), identified on the basis of several prosodic components and strongly correlated with clauses


An example of prosodically oriented discourse transcription

....(1.5) /\Озеро ...(0.5) какое-то,
....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some

..(0.3) (Или /\речка,
..(0.3) (Ili /\rečka,
Either river

или /\озеро,
ili /\ozero,
or lake

но по-моему \озеро,
no po-moemu \ozero,
but I guess lake

потому что’ ..(0.2) как-то-оw
potomu čto’ ..(0.2) kak-to-oW
because somehow

...(0.6) \маленькое такое,
...(0.6) \malen’koe takoe,
small such

\небольшое.)
\nebol’šoe.)
minor

....(1.0) ’и-иh ...(0.7) через /него
....(1.0) ’i-iH ...(0.7) čerez /nego
and across it

..(0.3) как-то \бревно какое-то,
..(0.3) kak-to \brevno kakoe-to,
somehow log some

типа \моста.
tipa \mosta.
like bridge
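The notation in this transcript is regular enough to be read mechanically. The following is a minimal Python sketch, not part of the original study. It assumes that a run of dots followed by a number in parentheses marks a pause with its duration, and that "/", "\" or "/\" directly before a word marks a pitch accent on that word. Both readings are inferred from the transcript itself, and the function names are hypothetical.

```python
import re

# Assumption: a pause is written as two or more dots followed by a
# duration in parentheses, e.g. "..(0.3)" or "....(1.5)".
PAUSE_RE = re.compile(r"\.{2,}\((\d+(?:\.\d+)?)\)")

# Assumption: a pitch accent is written as "/\", "/" or "\" directly
# before the accented word.
ACCENT_RE = re.compile(r"(/\\|/|\\)(\w+)")


def pause_durations(line):
    """Return the pause durations found in one transcript line."""
    return [float(d) for d in PAUSE_RE.findall(line)]


def pitch_accents(line):
    """Return (accent mark, accented word) pairs found in one line."""
    return ACCENT_RE.findall(line)


# Three lines of the transliterated "lake" example.
lines = [
    r"....(1.5) /\Ozero ...(0.5) kakoe-to,",
    r"..(0.3) (Ili /\rečka,",
    r"no po-moemu \ozero,",
]

for line in lines:
    print(pause_durations(line), pitch_accents(line))
```

A real transcription system carries more marks (lengthening, glottal features, false starts) than this sketch handles, so it should be read as an illustration of the pause and accent marks only.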


Night Dream Stories

• Corpus of spoken Russian stories
• Speakers: children and adolescents
• Subject matter: retelling of night dreams
• Discourse type: monologic narrative (personal stories)
• Joint study with Vera Podlesskaya and a group of our graduate students


Segmentation (lines)

[Same transcript as in the example above; this slide draws attention to its segmentation into lines.]


Pauses

[Same transcript as in the example above; this slide draws attention to the pauses, written as runs of dots followed by a number in parentheses, e.g. ....(1.5) or ..(0.3).]


Pitch accents

[Same transcript as in the example above; this slide draws attention to the pitch accents, written as /, \ or /\ before the accented word.]


Tempo: wide and narrow scope

[Same transcript as in the example above; this slide draws attention to tempo marking.]


Other prosodic phenomena

[Same transcript as in the example above; this slide draws attention to the remaining prosodic marks.]


Prosody and sentence

Does spoken language consist of sentences?

Sheer facts:
• Spoken language is the primary form of language
• Spoken language does not contain periods, question marks and other explicit signals of sentence boundaries

Research question: Is sentence, as a theoretical construct, as identifiable and as basic for the primary form of language as it is (or as it is thought to be) for written language?


Transitional continuity

Term by J. DuBois et al. 1992

Alternative term by Sandro V. Kodzasov: phase

Discourse semantic category: ‘end’ vs. ‘non-end’ (= expectation of a forthcoming end)

End of tentative sentence – falling tonal accent

Non-end – rising tonal accent


A canonical example of the transitional continuity distinction

z57:15-16

..(0.4) /\Мы-ы’ ..(0.4) \как бы за них /взя-ались,
..(0.4) /\My-y’ ..(0.4) \kak by za nix /vzja-alis’,
We sort of at them got.hold
Rising (“comma”): non-end

...(0.5) и-и ввь= || ..(0.2) полетели \вве-ерх.
...(0.5) i-i vv’= || ..(0.2) poleteli \vve-erx.
and flew upward
Falling (“period”): end

If things were that easy, sentence would be uncontroversial


Uncanonical situation: Non-end with a falling tonal accent

....(1.5) /\Озеро ...(0.5) какое-то,
....(1.5) /\Ozero ...(0.5) kakoe-to,
Lake some

..(0.3) (Или /\речка,
..(0.3) (Ili /\rečka,
Either river

или /\озеро,
ili /\ozero,
or lake

но по-моему \озеро,
no po-moemu \ozero,
but I guess lake

потому что’ ..(0.2) как-то-оw
potomu čto’ ..(0.2) kak-to-oW
because somehow

...(0.6) \маленькое такое,
...(0.6) \malen’koe takoe,
small such

\небольшое.)
\nebol’šoe.)
minor

....(1.0) ’и-иh ...(0.7) через /него
....(1.0) ’i-iH ...(0.7) čerez /nego
and across it

..(0.3) как-то \бревно какое-то,
..(0.3) kak-to \brevno kakoe-to,
somehow log some

типа \моста.
tipa \mosta.
like bridge


The problem of two kinds of falling

The existence of non-final falling calls relevance of sentence into question

However, the distinction between two kinds of falling is very systematic

The two kinds of falling:
• are prosodically distinct
• have distinct discourse functions


Prosodic criteria of the final vs. non-final falling distinction

1. Target frequency band
2. Post-accent behavior
3. Pausing pattern
4. Reset vs. latching
5. Steepness of falling
6. Interval of falling


Target frequency band

Final falling (“period”): targets the bottom of the speaker’s F0 range

Non-final falling (“falling comma”): targets a level several dozen Hz (several semitones) higher
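The relation between a difference in Hz and a difference in semitones depends on where in the speaker's range it falls, since semitones measure a frequency ratio. The following Python sketch shows the standard conversion (12 semitones per octave, that is, per doubling of frequency). The two frequency values are hypothetical and chosen for illustration only; they are not measurements from the corpus.

```python
import math


def semitones_between(f_high_hz, f_low_hz):
    """Interval in semitones between two frequencies (12 per octave)."""
    return 12 * math.log2(f_high_hz / f_low_hz)


# Hypothetical values, for illustration only: a speaker whose F0 range
# bottoms out at 100 Hz, and a non-final fall that stops at 130 Hz.
floor_hz = 100.0
nonfinal_target_hz = 130.0

interval = semitones_between(nonfinal_target_hz, floor_hz)
print(f"{nonfinal_target_hz - floor_hz:.0f} Hz above the floor = {interval:.1f} semitones")
```

With these assumed values, a gap of a few dozen Hz corresponds to a few semitones, which is the order of magnitude the slide describes; for a speaker with a higher F0 floor the same gap in Hz would amount to fewer semitones.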


F0 graph for the “lake” example

[F0 graph; labelled stretches: “\ozero,” “\malen’koe takoe,” “\nebol’šoe.” “\brevno kakoe-to,” “\mosta.”]


Representation of EDU continuity types in corpus

[Bar chart: number of EDUs per continuity type]
• Final falling: 894
• Non-final falling: 606
• (Non-final) rising: 1188


The status of sentence

In the speech of most speakers final falling is clearly distinct from non-final patterns

Final intonation, expressly distinct from non-final intonation (both rising and falling), makes the notion of sentence valid for spoken discourse

Speakers “know” when they complete a sentence and when they do not

Apparently, spoken sentences are the prototype of written sentences

However, identification of sentences is possible only on the basis of a complex analytic procedure


Conclusions on prosody and sentence

Sentence is an intermediate hierarchical grouping between a whole discourse and an EDU (roughly, clause)

Sentence is a complex, non-elementary unit of spoken language

These conclusions, possible only due to prosodic analysis, are of prime importance for linguistic theory

The notion of sentence, so salient in theories restricted to the verbal component alone, can only be evaluated relying on prosodic evidence


II. GESTURE

In the course of linguistic communication, it is not just that the speaker speaks and the addressee listens

In addition, the speaker displays, and the addressee observes:
• Gesture
• Gaze
• Facial expression
• Posture
• Proxemics
• Cultural symbolism
• ...

(see, for example, Kreidlin 2002, Butovskaya 2004)


Gestures

Gestures are kinetic behaviors of arms and other limbs, capable of conveying meaning from speaker to addressee.

Among the various types of gestures (see e.g. McNeill 1992) pointing gestures are one of the most salient types.


Pointing

Voz’mite igruški tam! ‘Get the toys over there!’


Elements of a canonical pointing act


Phylogeny and ontogeny

Pointing gestures:
• Appear to be an exclusive property of humans (Tomasello et al. 2007)
• Are a very ancient gesture type (Kreidlin 2007)
• Appear at the end of the first year of life
• Can participate in binary multimodal constructions “word + gesture”, such as open POINT (Butcher and Goldin-Meadow 2000)


Reference and pointing

Reference is a fundamental linguistic phenomenon, accounting for about every third word in running discourse

Studies of reference (deixis, anaphora, etc.) are among the central concerns of modern linguistics

Pointing is the developmental source of reference


Pointing, deixis, and exophora

Deixis is the most widely recognized function of pointing

However, quite frequently pointing is associated with exophora, that is, mention of perceptually activated referents (O'Neill 1996, Levy 2000: 219, Nikolaeva 2003)


Exophoric reference (from Nikolaeva 2003)

a. My s Anatoliem uže mnogo let očen’ rabotaem,

<three intervening clauses>

e. on mnogo raz zavjazyval,

‘Anatolij and I have been working together for many years, <…> he has given up (drinking) many times’


Anaphora

Anaphora (reference to items activated by prior discourse) is secondary to exophora (reference to items activated by perceptual availability)

Exophora is the ontological source of anaphora

Anaphora occasionally occurs with pointing


Pointing and prosody

Pointing and accentuation are analogous phenomena, both associated with making an item salient

Levy (2000): energy expenditure

Nikolaeva (p.c.): pointing invariably co-occurs with accent


Substitution: Referent vs. demonstratum

Reference to non-specific items:

Vot počemu my i obraščaemsja poroj k psixologam.

‘This is why we address psychologists now and then’

This phenomenon is known as deferred ostension, analogic deixis, ostensive metonymy, etc.

In substitution, reference does not have to be non-specific

He got a big scar here (pointing to one’s cheek) (Levelt 1989)


Virtual pointing

Pointing to imaginary targets

cf. Buehler’s Deixis am Phantasma, McNeill’s abstract pointing


Frequency in two discourse types

Nikolaeva 2003 (TV shows):
• 5.4 pointing gestures per 100 EDUs
• 2.7 of these are virtual pointing

Nikolaeva p.c. 2007 (retelling of a film):
• 4.2 pointing gestures per 100 EDUs
• All are virtual pointing

Pointing in exophora/anaphora is as frequent as in deixis


a. … əə Kogda on exal po= po doroge,

b. on əə mm … poravnjalsja s devočkoj,

‘As he rode along the road, he passed a girl <...>’


d. on zasmotrelsja na neë,

‘he gaped at her’


Establishment of spatial relations

• By illustrative gestures, as in the previous example
• By verbal devices:

a. i naprotiv menja sideli dve devočki-mulatki,

<21 intervening clauses>

y. vot êti dve devočki naprotiv i ja,

‘And across from me sat two brown-skinned girls, <…> these two girls and I <...>’

It makes no difference to the referential system which device was used to establish spatial relations

Verbal and gestural material is jointly used to convey the inner cognitive representation from the speaker to the addressee


Conclusions on pointing and reference

The pointing gesture is the developmental source of reference

The use of pointing is intimately connected to reference

Reference, a central linguistic phenomenon, cannot be understood if we fail to take gesture into account


III. Relative contribution of three information channels

Discourse
  Vocal channels
    Verbal channel
    Prosodic channel
  Visual channel


What is the contribution of different channels?

Traditional approach of mainstream linguistics: the verbal channel is so central that prosody and the visual channel are at best downgraded as “paralinguistics”

Applied psychology: it is often stated that (figures go back to Mehrabian 1971):
• body language conveys 55% of information
• prosody conveys 38% of information
• the verbal component conveys 7% of information

“Words may be what men use when all else fails” (Kreidlin 2002: 6)

Who is right?


Experimental study

• Isolate three information channels
• Present a sample discourse in all possible variants (2^3 = 8)
• Present each of the eight variants to a group of subjects
• Assess the degree of understanding in each case

El’bert 2007, Kibrik and El’bert 2008
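The count of eight variants follows from each of the three channels being either present or absent. A minimal Python sketch of that enumeration is given below; the channel labels follow the talk, while the variable names and the order in which the variants are listed are arbitrary and do not reflect how the experimental groups were numbered.

```python
from itertools import product

CHANNELS = ("verbal", "prosodic", "visual")

# Every channel is either present or absent, giving 2**3 = 8 variants.
variants = [
    tuple(ch for ch, present in zip(CHANNELS, mask) if present)
    for mask in product((True, False), repeat=len(CHANNELS))
]

for number, variant in enumerate(variants, start=1):
    print(number, ", ".join(variant) or "[none]")
```

The empty variant corresponds to presenting the context only, with none of the three channels of the experimental excerpt available.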


Experimental material

Russian TV serial “Tajny sledstvija” – “Mysteries of the investigation”

• Experimental excerpt: 3 min. 20 sec.
• Preceded by an 8-minute context (that starts from the beginning of the series)
• The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general

Two vocal channels have been separated:
• verbal alone: running subtitles
• prosodic alone: superimposed filter creating the “behind a wall” effect

Subjects:
• 99 participants, divided into 8 groups
• Native speakers of Russian
• Each group comprised 10 to 17 subjects


Full version


Visual + verbal channels


Visual + prosodic channels


Procedure

Every subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerned with the experimental excerpt alone

Questionnaire was constructed in accordance with the received principles of test tasks (Panchenko 2000)

23 questions in the questionnaire

A subject was supposed to choose only one answer out of four listed variants, for example:

What does Tamara Stepanovna offer Masha before the beginning of the conversation?
a. to take off her coat
b. to have a cup of tea
c. to have a seat
d. to have a drink

Percentage of correct answers is used as an assessment of a subject’s degree of understanding


Results

Group  Experimental material    Information channels       Number of channels  Mean % of correct answers
1      Original                 verbal, prosodic, visual   3                   87.4%
2      Sound                    verbal, prosodic           2                   70.4%
3      Subtitles + video        verbal, visual             2                   73.9%
4      Prosody + video          prosodic, visual           2                   51.2%
5      Subtitles                verbal                     1                   72.0%
6      Prosody                  prosodic                   1                   51.1%
7      Video                    visual                     1                   61.7%
8      Nothing (context only)   [none]                     0                   38.3%
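One way to get an overview of these results is to average the group means by the number of channels available. The Python sketch below copies the figures from the results above; the data structure and the function name are hypothetical, and the averaging is unweighted, which is a simplification because the groups differed in size (10 to 17 subjects).

```python
# Mean percentage of correct answers per group, copied from the results
# above. Keys are the sets of channels available to each group.
results = {
    frozenset({"verbal", "prosodic", "visual"}): 87.4,  # group 1
    frozenset({"verbal", "prosodic"}): 70.4,            # group 2
    frozenset({"verbal", "visual"}): 73.9,              # group 3
    frozenset({"prosodic", "visual"}): 51.2,            # group 4
    frozenset({"verbal"}): 72.0,                        # group 5
    frozenset({"prosodic"}): 51.1,                      # group 6
    frozenset({"visual"}): 61.7,                        # group 7
    frozenset(): 38.3,                                  # group 8
}


def mean_by_channel_count(table):
    """Average the group means over groups with the same number of channels."""
    buckets = {}
    for channels, score in table.items():
        buckets.setdefault(len(channels), []).append(score)
    return {n: round(sum(v) / len(v), 1) for n, v in sorted(buckets.items())}


print(mean_by_channel_count(results))
```

Averaged this way, the two-channel groups score only slightly above the one-channel groups, while the full three-channel version stands well above both.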


Each of the three information channels, taken in isolation, is quite informative (groups 5 to 7: 72.0%, 51.1% and 61.7% of correct answers, against 38.3% with the context only)


The hierarchy of informativeness: verbal > visual > prosodic (72.0% > 61.7% > 51.1% for groups 5, 7 and 6)


Combining the verbal channel with one additional channel does not increase the percentage of correct answers (groups 2 and 3: 70.4% and 73.9%, against 72.0% for the verbal channel alone in group 5)


The combination ‘prosodic plus visual’ (group 4: 51.2%) leads to a significantly lower result than the other pairs of channels (groups 2 and 3: 70.4% and 73.9%).


Relative contribution of the three channels

For the sake of simplicity, assume that all three channels are independent

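Sum of the single-channel scores (groups 5 to 7): 72 + 51 + 62 = 185, so each channel's share is its score divided by 1.85

Results:
• Verbal channel: 39% (72 : 1.85 ≈ 39)
• Prosodic channel: 28% (51.1 : 1.85 ≈ 28)
• Visual channel: 33% (61.7 : 1.85 ≈ 33)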
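The arithmetic above is a plain normalization of the three single-channel scores so that they sum to 100%. A minimal Python sketch, using the unrounded scores from the results (72.0, 51.1, 61.7) and hypothetical variable names:

```python
# Single-channel scores (mean % of correct answers) from groups 5 to 7.
single_channel = {"verbal": 72.0, "prosodic": 51.1, "visual": 61.7}

total = sum(single_channel.values())

# Share of each channel, under the simplifying assumption stated above
# that the three channels contribute independently.
shares = {
    name: round(100 * score / total)
    for name, score in single_channel.items()
}

print(shares)
```

Using the unrounded sum (184.8 rather than 185) gives the same rounded shares as on the slide.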


Conclusions about the relative weight of three information channels

All information channels are highly significant: the traditional linguistic viewpoint is erroneous

The verbal channel is the leading one: the viewpoint popular in applied psychology is erroneous

Information from the prosodic and the visual channels is primarily used through integration with the verbal channel, at least for this discourse type


IV. Signed languages

NATURAL LANGUAGES
  SPOKEN
  SIGNED

DEAF SIGN LANGUAGES
• natural, full-fledged human languages
• visual-spatial languages
• use hands and arms, facial expressions, eye gaze, head and body posture to encode linguistic information
• manual signs are produced in a three-dimensional space immediately in front of the signer – the signing arena
• 121 sign languages (http://www.ethnologue.com): American Sign Language, Russian Sign Language, …


Reference in RSL

Prozorova 2006, Kibrik and Prozorova 2007

Goal: to characterize referential choice of a deaf sign language as contrasted to that of spoken languages


RSL data collection

• ‘The Pear Stories’ film (Chafe 1980)
• Corpus of 10 video-recorded RSL narratives based on the retellings of the Pear Film
• Speakers: 6 men and 4 women, age 15-55, all based in Moscow
• 7 animate referents in the Pear Film
• 657 clauses
• 542 referential expressions (animate)


Deictic demonstrative reference in RSL

operates in the perceived space P

deictic expressions: pointing signs (pointing with an index finger towards the intended referent)

(2) DEM_cat ILL ‘He is ill’

64

Major anaphoric options in RSL

Full NPs (114)

Zero expressions (401)

Demonstratives (27)

65

Full NP

BOY YOUNG AGE CYCLE ‘A young boy is riding a bicycle’

67

Zero expressions

1. BOY YOUNG AGE CYCLE
2. Ø_boy STOP
3. Ø_boy HUMAN-STAND_right-down
4. Ø_boy LOOK_right-down P-E-A-R

1. A young boy is riding a bicycle.
2. He stops.
3. He stands upright.
4. He sees the pears.

68

Anaphoric zero reference

Interlocutors’ shared cognitive representation contains not only perceived referents, but also referents conceived of (remembered or imagined)

We call this representation the conceived space C

Mentioning referents that are present, or activated, in the conceived space is what is known as anaphora

Anaphoric referential choice depends on a referent’s activation in the conceived space: high activation favors a zero expression, low activation a full NP

69

Demonstrative

1. Ø_boy CYCLE
2. Ø_boy GO_signer-forward AWAY_signer-forward
3. DEM_man-right SEE NEG
4. Ø_man PICK-ROUND

1. He cycles.
2. He goes away.
3. That one doesn’t see.
4. He picks pears.

70

Anaphoric demonstrative reference

In signed discourse the signer maps referents from the inner conceived space C onto the external signing arena

Mapping includes various parameters of referents: locations, orientations, physical interactions, even abstract relations between them

Thus a constructed space C’ is created, inhabited by referents conceived of

71

How are locations of referents established in the constructed space?

Signed discourse takes place in the three-dimensional signing arena

The topology of the signing arena isomorphically represents the topology of the scenes that signers remember from the film

The signer establishes the locations of referents in his signing arena

These locations are isomorphic to the locations of the referents in the film, as remembered by the signer

72

An episode from the Pear Film

73

A retelling

1. ONE-MOVE_front-signer MAN_i
2. ONE-MOVE_front-signer SHE-GOAT
3. BOY GIRL UNCLEAR
4. SHE-GOAT
5. Ø_goat TWO-HORN HAVE.NEG
6. DEM_i-front PULL

1. A man is coming,
2. with a she-goat.
3. Male, female: it is unclear.
4. It’s a she-goat:
5. It has no horns.
6. This one is pulling it.

74

Anaphoric demonstratives

Once the signer has explicitly indicated the location/path of a referent, nominal demonstratives may be used for further mentions of this referent

Thus demonstratives are the basic device used for repeated mention of referents in the constructed space

Formally they are the same as deictic demonstratives

Often called ‘personal pronouns’ in the literature

Demonstratives are based on the mechanism of virtual pointing, but it is conventionalized in RSL

What is a kind of ad hoc, fluid device in spoken languages is an established, nearly lexical device in RSL

75

Two discourse factors and anaphoric referential devices

factor 1:      RD=1            RD=2    RD=3+   TOTAL
factor 2:      Ant=S   Ant=O

full NP        <1 %    33 %    14 %    57 %    59
zero NP        99 %    42 %    67 %    27 %    401
nominal DEM    <1 %    25 %    19 %    16 %    27

TOTAL          346     24      43      74      487
               (100%)  (100%)  (100%)  (100%)
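
As a reading aid, the column percentages in the table above can be tabulated to see which device is most frequent under each condition. The Python sketch below is illustrative only: the names SHARES and preferred_device are hypothetical, and the two cells reported as "<1 %" are stored as 1.0, an upper-bound stand-in rather than a reported value.

```python
# Column percentages copied from the table above
# (outer keys: discourse condition, inner keys: referential device).
# Cells reported as "<1 %" are stored as 1.0, an upper-bound stand-in (assumption).
SHARES = {
    "RD=1, Ant=S": {"full NP": 1.0, "zero NP": 99.0, "nominal DEM": 1.0},
    "RD=1, Ant=O": {"full NP": 33.0, "zero NP": 42.0, "nominal DEM": 25.0},
    "RD=2": {"full NP": 14.0, "zero NP": 67.0, "nominal DEM": 19.0},
    "RD=3+": {"full NP": 57.0, "zero NP": 27.0, "nominal DEM": 16.0},
}


def preferred_device(condition: str) -> str:
    """Return the device with the largest share under the given condition."""
    shares = SHARES[condition]
    return max(shares, key=shares.get)


if __name__ == "__main__":
    for condition in SHARES:
        print(condition, "->", preferred_device(condition))
```

On these figures zero expressions are the most frequent device everywhere except RD=3+, where full NPs take over.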

76

Use of zero expressions under RD > 1

49 usages (12% of all zeroes)

Pragmatic and semantic clues that help to identify the referent of a zero expression:

certain predicates associated with a particular referent (RIDE-BICYCLE; HOLD-BICYCLE)

the process of role-shifting (Padden 1986): by shifting (rotating) the body and changing his/her facial expression, the signer shows that s/he is currently “acting” for one of the referents
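
The figure of 49 zero expressions can be cross-checked against the table of discourse factors on slide 75, assuming it corresponds to the zero NP shares in the RD=2 and RD=3+ columns (67 % of 43 and 27 % of 74). The Python sketch below does that arithmetic. The names are hypothetical, and the per-column counts are reconstructed from rounded percentages, so they are estimates rather than reported values.

```python
# Column totals and zero NP shares taken from the table on slide 75.
COLUMN_TOTALS = {"RD=2": 43, "RD=3+": 74}
ZERO_SHARE = {"RD=2": 0.67, "RD=3+": 0.27}
ALL_ZEROES = 401  # row total for zero NP in the same table


def estimated_zero_count() -> int:
    """Estimate the number of zero expressions under RD > 1 from rounded shares."""
    return sum(round(COLUMN_TOTALS[c] * ZERO_SHARE[c]) for c in COLUMN_TOTALS)


def share_of_all_zeroes() -> int:
    """Return the estimate as a whole-number percentage of all zero expressions."""
    return round(100 * estimated_zero_count() / ALL_ZEROES)


if __name__ == "__main__":
    print(estimated_zero_count(), "zero expressions,", share_of_all_zeroes(), "% of all zeroes")
```

Under this assumption the reconstruction agrees with the slide: roughly 29 + 20 = 49 usages, about 12 % of the 401 zero expressions.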

77

Role-shifting

1. Ø_boy LOOK_down
2. Ø_boy BE-ABOUT ONE PEAR ONE TAKE-ROUND
3. Ø_boy LOOK_up
[role-shifting]
4. DEM_man-up PICK-ROUND
[role-shifting]
5. Ø_boy LOOK_down
6. Ø_boy TAKE-ROUND

1. He (the boy) looks down.
2. He is about to take one pear.
3. He looks up.
[role-shifting]
4. That one (the man) is picking pears.
[role-shifting]
5. He (the boy) looks down.
6. He takes one.

78

Referential function of nominal demonstratives

Nominal demonstratives are not particularly sensitive to discourse factors:

factor 1:      RD=1            RD=2    RD=3+   TOTAL
factor 2:      Ant=S   Ant=O

nominal DEM    <1 %    25 %    19 %    16 %    27

79

Full NPs vs nominal demonstratives

In case of intermediate referent activation, full NPs and demonstratives compete

In case of low activation (RD=3+) full NPs strongly prevail (57%)

Apparently, information on the location of a referent in the constructed space can be assumed available to the addressee only for a limited time

80

Full NPs vs demonstratives

1. Ø_boy CYCLE
2. Ø_boy OBJECT-MOVE_signer-forward
3. Ø_boy GO-AWAY_signer-left-forward
4. DEM_up MAN STILL PICK-PEAR
5. CYCLE DEM_boy-front
6. Ø_boy OBJECT-MOVE_signer-forward

1. He (the boy) is cycling.
2. He is riding forward.
3. He goes away.
4. That man is still picking pears.
5. This one is cycling.
6. He is riding forward.

81

Conclusions on reference in RSL

Types of referential devices and factors of reference are analogous to those of spoken languages

Some devices, only embryonically present in spoken languages, are strongly employed in RSL: virtual pointing and role-shifting

This is apparently due to the fundamentally spatio-visual character of RSL

Studying signed languages gives us a new perspective on spoken languages

Recognition of two fundamental types of languages, spoken and signed, appears indispensable for a general theory of language

82

V. A wider picture

The world surrounding us is multimodal

We are multimodal organisms

Obviously, language and communication are multimodal

As often happens, those specializing in applied fields have understood the importance of multimodality before pure scholars and theorists

83

Multimodality in technology

TV is superior to radio

Multimodal communication devices

Internet, especially Web 2.0, is all multimodal

Multimodal GPS

84

The multimodal flight finder enables rapid task completion by enabling the user to interact via a multiplicity of user interaction modalities

85

Stages of multimodal integration, from Cohen and Oviatt 2006

86

Multimodality in biological sciences

“Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.”

(Cohen and Oviatt 2006)

87

Multimodality in communication studies and semiotics

Kress G & van Leeuwen T (2001). Multimodal discourse: the modes and media of contemporary communication. London: Arnold.

“A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message. All modes, speech and writing included, are then seen as always partial bearers of meaning only. This is a fundamental challenge to hitherto current notions of ‘language’ as a full means of making meaning” (Kress, 2002: 6).

88

Multimodal Analysis Lab (Singapore): collaboration of social scientists and computer scientists

89

Multimodality in computational linguistics

Gibbon D, Mertins I & Moore R (eds.) (2000). Handbook of multimodal and spoken dialogue systems: resources, terminology and product evaluation. Dordrecht: Kluwer.

90

Multimodal corpora

LREC-2008 (Language Resources and Evaluation Conference)

Blache P., Bertrand R., Ferré G. 2008. Creating and exploiting multimodal annotated corpora.

Gallo C.G., Jaeger T.F., Allen J., Swift M. 2008. Production in a multimodal corpus: How speakers communicate complex actions.

Kitazawa Sh., Kiriyama Sh., Kasami T., Ishikawa Sh., Otani N., Horiuchi H., Takebayashii Y. 2008. A Multimodal infant behavior annotation for developmental analysis of demonstrative expressions.

91

Synthesis

LeVine P & Scollon R (eds.) (2004). Discourse and technology: multimodal discourse analysis. Washington, DC: Georgetown University Press.

92

Conclusions

“Normal” linguists, researching conventional verbal material, need to understand that further progress in linguistics is impossible if one ignores the multimodality of language

Language as understood by 20th-century mainstream linguistics is an abstraction, very remote from reality. We live in a multimodal world; this is where language evolved and where it functions, and this is what we need to realize if we want to understand it

Taking the multimodal perspective into account can help to adequately approach classical questions of narrow linguistics

The choice between continuing the habits of mainstream linguistics and switching to multimodality amounts to the choice between:

Impoverishment vs. richness

Stagnation vs. innovation

Isolation vs. interdisciplinarity

93

Acknowledgements

Julia Nikolaeva

Vera Podlesskaya

Evgenia Prozorova

Ekaterina El’bert