Using spoken corpora to investigate pragmatic
variation
Workshop on New Trends in Spoken Corpora, Santiago de Compostela
Brian Clancy, Mary Immaculate College, University of Limerick
Pragmatics
• The study of interactional meaning…
• A man and a friend are playing golf one day. One of them is about to chip onto the green when he sees a long funeral procession on the road next to the course. He stops mid-swing, takes off his hat, closes his eyes and bows down on his knees in prayer. His friend says ‘Wow! That is the most thoughtful and touching thing I have ever seen. You truly are a kind man.’
• The other man replies ‘Yeah, well, we were married for thirty-five years.’
(Culpeper and Haugh, 2014: 4)
Corpus linguistics
• Modern corpora have made it increasingly possible to study pragmatic phenomena:
  • Demographic information now almost standard in spoken corpora;
  • An increasing preoccupation with representativeness now allows us to study variation across register, context-type, genre;
  • Annotation extremely useful for research into pragmatics;
  • Vertical reading allows for the processing of large datasets.
• Authentic data…
Pragmatics + Corpus Linguistics = Corpus pragmatics
• How the interpersonal is encoded in language – in routinised ways, looking for patterns and evidence (Clancy and O’Keeffe, 2015)
• Corpus linguistics is a sympathetic methodological companion (dare I suggest, revolutionary)
• Corpus pragmatic studies have added value:
  • Results are considered in context;
  • All sizes of corpora considered;
  • Instances of pragmatic annotation growing;
  • Highly iterative;
  • More nuanced…
Big ‘V’ and little ‘v’
• Question: What can corpus pragmatics tell us about language Variety and variety?
• Variety = defined geographically and ‘user’ related
• variety = defined situationally and ‘use’ related
(Quirk, 1995; Clancy, 2010)
The Limerick Corpus of Irish English
• 1 million words of spoken Southern Irish English
• Context-types = transactional; professional; pedagogical; socialising; intimate
• Demographic speaker information such as gender, age, birthplace, occupation and level of education recorded
• Complemented by reference to ANC, BNC (and Baby), ICE-Ireland, LIBEL…
Data distribution in LCIE (%)
[Pie chart: share of LCIE data by context-type. Labels: Intimate, Socialising, Pedagogical, Professional, Transactional. Values as extracted: 56, 26, 5, 12, 1.]
Start with frequency…
N    BABY BNC   LCIE
1    I          the
2    you        I
3    the        and
4    it         you
5    and        to
6    a          it
7    to         a
8    that       that
9    yeah       of
10   oh         yeah
11   in         in
12   of         was
13   no         is
14   it’s       like
15   well       know
16   what       he
17   on         on
18   is         they
19   have       have
20   know       there
21   one        no
22   do         but
23   was        for
24   got        be
25   we         what
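A ranked frequency list of this kind can be sketched in a few lines. This is a minimal illustration only: the tokenisation rule and the function name `frequency_list` are assumptions, not the procedure used to build the lists above, and the sample string is adapted from the hurling extract in LCIE.

```python
import re
from collections import Counter

def frequency_list(text, top_n=25):
    """Return the top_n (word, count) pairs, most frequent first."""
    # Lowercase, and keep apostrophes inside words (it's, d'you).
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    return Counter(tokens).most_common(top_n)

sample = "Come on Wexford. Now come on Wexford. Come on now."
for rank, (word, count) in enumerate(frequency_list(sample, 4), start=1):
    print(rank, word, count)
```

A real list would be built from the full transcripts, and decisions such as whether laughter tags count as tokens change the ranking.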
Keyword list
N    Keyword
1    am
2    ah
3    em
4    ya
5    like
6    shure
7    the
8    laughing
9    aam
10   umhum
11   kind
12   of
13   laughs
14   laughter
15   was
16   cause
17   would
18   ye
19   um
20   in
21   now
22   goin
23   were
24   Jesus
25   and
LCIE list with BABY BNC as reference corpus
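How a keyword list is derived can be sketched as follows. The slides do not say which keyness statistic was used; log-likelihood is assumed here as one common option. All counts and corpus sizes in the example are hypothetical, and the function name `log_likelihood` is illustrative.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of one word: target corpus vs reference corpus."""
    combined = freq_target + freq_ref
    total = size_target + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score

# Hypothetical raw counts: word -> (count in target, count in reference).
counts = {"shure": (640, 1), "the": (15000, 30000), "ya": (400, 20)}
TARGET_SIZE, REF_SIZE = 500_000, 1_000_000  # hypothetical corpus sizes

keywords = sorted(
    counts,
    key=lambda w: log_likelihood(counts[w][0], TARGET_SIZE, counts[w][1], REF_SIZE),
    reverse=True,
)
print(keywords)
```

In this sketch "the" has the same relative frequency in both corpora, so it scores zero and sorts last. The score alone does not show direction: a word that is unusually rare in the target corpus also scores highly, so relative frequencies need to be checked before calling an item a positive keyword.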
Keyword list
N    Keyword
1    ah
2    am
3    ya
4    em
5    no
6    he
7    shure
8    laughing
9    cos
10   aam
11   she
12   eh
13   umhum
14   laughs
15   ye
16   now
17   laughter
18   s
19   d’you
20   will
21   mm
22   Jesus
23   muffled
24   t
25   there
LCIE list with ANC as reference corpus
Frequency information - shure
LCIE SPICE BNC
1277 310 1
Normalised per million words
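Normalisation makes counts from corpora of different sizes comparable. A minimal sketch of the arithmetic, where the function name `normalise` and the example count are hypothetical rather than taken from the corpora above:

```python
def normalise(raw_count, corpus_size, base=1_000_000):
    """Scale a raw count to occurrences per `base` words."""
    return raw_count * base / corpus_size

# Hypothetical example: 640 hits in a 500,000-word corpus.
print(normalise(640, 500_000))                # per million words
print(normalise(640, 500_000, base=500_000))  # per 500,000 words
```

The `base` parameter covers both conventions used in these slides: per million words and per 500,000 words.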
Observations on shure
• Sure has a range of functions – both emphatic and indexical – and can be used to mark causality, mockery, contrast and consensuality (Amador-Moreno, 2010).
• Sure usually precedes an assertion; new information presented as old, indirectly requesting agreement (Kallen, 2006).
• Sure is used as a mitigator or hedge in initial position, its most common pattern of use being but shure, used to soften the effect of but in the introduction of a counterpoint (O’Keeffe, 2011).
• Sure confirms presupposed knowledge – includes general knowledge, which is not the property of any specific individual in the conversation (Kallen and Kirk, 2012)
Now…
• https://www.youtube.com/watch?v=gT9xuXQjxMM
• Father Ted: The Passion of St Tibulus
• Rich pickings!
• Wierzbicka (1991: 341): ‘There are few aspects of any language which reflect the culture of a given speech community better than its particles.’
Oh are you now…
‘And most ominous of all, of course, is the apparently innocuous question followed by a comma and the adverb “now”. No feckless husband who declares that he is going to the pub can fail to notice the chill that descends on the room when his wife replies: “Oh are you, now?”’
(McNally, 2007)
Frequency list data (Clancy and Vaughan, 2012)
Corpus                        LCIE   BNC
Position                      30th   63rd
Frequency per million words   4860   2864
Now occurs in formal contexts...
• ‘The text types where we find the largest number of examples of now are more formal than ordinary conversation and contain more structure’ (Aijmer, 2002: 69).
• ‘...now is more likely [than well] to occur in formal contexts’ (Defour, 2008: 63).
Watching sport [hurling]…
<$1> Come on Murphy. Oh keep it out. Aw no. Good man Fitz. Come on. Janey. Now.
Come on Wexford. Now come on Wexford. Come on Wexford. Now come on Wexford. Come on. Come on Wexford. Come on. Now. Have a shot. Now come on Wexford. Come on now Wexford. A point now. Come on Wexford. Come on Wexford. Come on. Come on. Go, go with it. Now. Shit. Come on Wexford. Come on Wexford. Come on. Come on Wexford. Come on. Come on Wexford. Come on. Great goal. Shit. Move it for God's sake. Now come on Wexford. Come on Wexford. Come on come on come on. Good man Bowe. Wexford be careful or you'll be put off. Now come on Wexford. Now come on. Now come on Wexford. Come on Wexford. Come on Wexford now. Come on now. Ehh. Now come on. Come on now Wexford. Come on Wexford. Come on. Come on Wexford. Steady up now. Steady up. Come on Wexford. Come on into him. Come on Wexford. Come on Wexford. Come on Wexford. Come on now. Come on now. Come on. Go with it.
<$2> Go on.
<$1> Go on Wexford. Ahh shit.
Formal versus informal
[Bar chart: Intimate, Pedagogical, Professional, Socialising; y-axis 0 to 12000. Only two data labels survive: 11054 and 11097.]
Sample concordance lines
N Concordance
1 now yourself I'm not going to do it at all for they're in that . So you can rename them
2 now the only thing is its pulling all you can you go . Its up there . Yeah . There you are
3 now. Yeah . There tis now . No you see I when it flashes on charge you put it like that
4 now. Yes . The guy who sings has dyed red to kind of you know you can see it there
5 Now how do you listen to those kind of song was nice . The German song was .
6 Now is that a good shot or is that not a . Now . There's the fire . There's the fire .
7 Now. There's the fire . There's the fire . Now the religious artefacts to this fella ¦ fellow .
8 Now am where is my pictures do you know? onto your system? Do shure stick them on .
9 Now do you want those things will I put them Syl Adley's place look . Syl . He has his
10 now Dermot Lynch said he'd send me all good I'm going to print that off . That's it
11 now. background talking He'll go that Derek two three laughing It's sent . Is it? It's gone
12 now in Paris and I said what's this and she and she goes I was telling you about it
13 now in+ she's off in two weeks . +she's going like yeah sure there's Ah Eva's off to Canada
14 now. Ah yeah with the exams . ah you'd get $1> How are you Gerry? Not too bad Eileen
15 now. If you'd do well Derek listen have a go I don't know about being mechanically minded
16 now is just Yeah . The other thing is on . That's her that's the mother . The other one
17 now? Which no . The bananas . Are they you want they're not holding at all Tommy
18 now and the fruit there+ Yeah . +and they go at all Terry I only bought them yesterday
19 now. you know what I mean shure give John laughing I know well that's another story
20 now. See ya take care of yourself alright . Alright bye
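Concordance lines like these put the node word in a fixed column so that patterns to its left and right can be read vertically. A minimal key-word-in-context (KWIC) sketch follows; the function name `kwic` and the character-based context window are assumptions, not a description of the software used for the lines above, and the sample string is adapted from the hurling extract.

```python
import re

def kwic(text, node, width=30):
    """Return one aligned concordance line per occurrence of `node`."""
    lines = []
    pattern = rf"\b{re.escape(node)}\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every node starts in the same column.
        lines.append(f"{left:>{width}} {match.group(0)} {right}")
    return lines

sample = "Come on now Wexford. A point now. Steady up now."
for line in kwic(sample, "now", width=20):
    print(line)
```

Sorting the lines by the first word to the right or left of the node is the usual next step, since it groups recurring patterns such as but shure or come on now.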
Temporal
<$3> Ah yeah I'm not interested in the money anyway I like it where I am you know shure I've a rich man.
<$2> I know you don't need it <$E> $3 laughs <\$E> bitch I've to think of money all the time.
<$3> Yeah you know like I'll get a rise a tenner rise now in September woo hoo+
<$2> Oh really.
<$3> +but like then like I get a hundred and ninety seven now+
<$2> Ah yeah yeah yeah.
<$3> +so I'm not really gonna get much more I don't think I don't really know but if I was then I’d have to travel and then I'd be paying.
Topic-related
<$1> So we're around <$G?> it would be costs and six hundred Euros approx we'll say a thousand legal fees and five.
<$2> Yeah.
<$3> Em yeah your land registry fees now about depending on if it's a couple or a single person <$G?> <$H> supply us with the relevant information so that's <$G1> Euro <$G?>.
<$1> Yeah.
Pragmatic
<$2> You tend to just like the teabags to be waved at the cup don't ya Pa?
<$4> Ah give 'em a good squeeze there now.
<$2> Will I?
<$4> Yeah don't squeeze the bollocks out of it now or anything now <$E> $2 laughs <\$E> <$G3> out like.
<$2> In between a a good squeeze and a squeeze in the balls.
<$4> Yeah exactly.
Pragmatic
<$1> Has she hit the old am teenage <$O1> y= years <$O1>?
<$3> <$O1> Sort of <\$O1> yeah I suppose she has now really.
<$1> Doin' up the hair as George Willaby used say to me coming down every morning with a different hairstyle <$E> laughter <\$E>.
<$3> Yeah she's a bit like that all right she bought she got her dress for the wedding.
<$1> Is it?
<$3> It's lovely now.
there = location
I hate going out in Limerick, total weirdos and all out there… (LCIE, 2002)
A different there…
• https://www.youtube.com/watch?v=KMbQ-aMvS44
Are you right there Father Ted?
(Phone rings)
Father Ted: Father Crilly speaking.
Bishop Brennan: Crilly, it's Bishop Brennan.
Father Ted: Oh, feck! What? (French accent) Who is this? There is no Crilly 'ere. (Hangs up) God, I just said "feck" to Bishop Brennan.
Dougal: He won't like that.
Father Ted: It's ok I put on an accent so he'll think he had the wrong number.
(Phone rings again)
Bishop Brennan: Crilly.
Father Ted: Hello Bishop Brennan. I think you got the wrong number there.
Bishop Brennan: Shut up Crilly. Shut up…
there and Irish English
• Attested examples…
  • Just read your e-mail there.
  • There’s someone looking for you there.
  • I was talking to Máiréad there…
Frequency data (Vaughan and Clancy, 2014)
Corpus        Frequency per million words
LCIE          7224
ICE-Ireland   6487
BNC_S         5367
ANC_S         4258
T.P. Dolan
there adv., used as a filler in HE dialogue, without necessarily indicating location. ‘Her brother-in-law came up here there last week’ (Longford)
(Dolan, 2012: 251)
There – categories in the literature
• Existential there: There’s an awful lot of tourists. [LCIE]
• Referential there: Just press that button there. [LCIE] It’s really neat and she says that everybody up there wears buttons on their socks. [ANC]
• Fixed phrases: We had a little bit of snow here and there. [ANC]
• Presentative there: A: Can I have the remote control? B: There we go. [LCIE]
• Conversational there: Hello there and a very good afternoon to you. [LCIE]
Functions across varieties
[Bar chart: functions of there in ANC, LCIE and BNC; y-axis 0 to 350. X-axis labels as extracted: Ex, Ref, There FP, Pres, Con, Quot. No data labels survive.]
Spatial Temporal
<$3> What's up with you?
<$1> What's up with you?
<$3> What's up with you?
<$1> I asked you first.
<$E> pause <\$E>
<$1> Hey Lisa did you record anything there over the weekend that we can stick on?
<$2> No. I should have recorded Jagged Edge last night and you could watch it.
Quotative there
<$5> Oh he went mad, so we were sayin we were just laughin and out there. And I was fuckin gettin free cocktails and everything, and emm next thing Gavin came over and took it personally from some fellow, and he says “No you'll have to do the proper thing, the real proper thing now”. He got down on the one knee and he was there “Would you marry me?” I was there oh my God, he was there would you marry me, I said I'd marry you. And Gavin turned around and said one for the price of two, we have an open invitation now.
<$6> Ahh.
<$5> And he kept on sayin would you, would you and this really bugged me. Would you would you definitely and here's the happy couple to be.
<$6> Ahh.
<$2> When are yiz gettin engaged?
<$5> I’m not gettin engaged.
What is intimate discourse?
• ‘Everyday life is made up of a multitude of small, if not small-minded, immediate concerns. Dinner has to be cooked, children put to bed, furniture selected and purchased. Such things have to be dealt with immediately if everything else is to proceed’ (Varenne, 1992: 13).
• ‘With those closest to us, we laugh, we chat, we fight, we gossip, we bond. [Intimate discourse] is the interaction that lies at the heart of our everyday linguistic experience’ (Clancy, fc 2016).
But…
• ‘Most dismiss these activities as irrelevant to the business of life’ (Varenne, 1992: 13).
• ‘Intimate discourse is so familiar to us all means opinions about it are readily proffered and yet rarely informed or supported with any consideration of systematic corpus evidence from the context-type itself’ (Clancy, fc 2016).
• How can corpus pragmatics help in understanding intimate discourse?
LINT (The Limerick Corpus of Intimate Talk)
• Sub-corpus of LCIE
• Approx. 500,000 words
• Can be sub-divided again into couples, family and friends
‘Mundane’
• People talking together, ‘conversation’, is one of the most mundane of all topics (Ten Have, 2007: 3).
The mundane and the intimate
• It is not surprising then that more and more aspects of mundane socialising activities among friends and family that were once done face-to-face are now done on-line (Palencia and Lower, 2013: 617-8).
‘Banal’
• ‘Hyperbole (also referred to as exaggeration or overstatement) has been studied in rhetoric and in literary contexts, but only relatively recently in banal, everyday contexts’ (McCarthy and Carter, 2004: abstract).
‘Casual’
• ‘At other times we talk simply for the sake of talking itself. An example of this is when we get together with friends or workmates over coffee or dinner and just “have a chat”. It is to these informal interactions that the label casual conversation is usually applied’ (Eggins and Slade, 1997: 6).
‘Casual’
Pronouns
• O’Keeffe (2006: 98-99) – we, our and us ‘are central to the process of establishing and maintaining a sense of commonality and inclusion in everyday casual conversation between people who have a real common bond.’
We, our and us (normalised per million words)
[Bar chart: we, our and us in LINT, LIBEL and BNC; y-axis 0 to 9000. Data labels as extracted, apparently one group per corpus: 4214, 495, 789; 7245, 792, 636; 7732, 1263, 1069.]
He and she…
[Bar chart: I, you, he and she in LINT, LIBEL and BNC; y-axis 0 to 30000. Data labels as extracted, apparently one group per corpus: 27511, 25067, 9554, 7702; 13653, 26394, 3776, 1436; 23991, 22187, 5622, 3157.]
General extenders
Stan: Wow. I can’t believe Ms. Ellen was a criminal Iraqi fugitive.
Wendy: Yeah you just never know.
Stan: Well I guess I’m sorry that I was ignoring you and stuff.
[Wendy smiles.]
Wendy: Happy Valentine’s Day, Stan.
[Wendy puckers. Stan looks a little scared, but then moves his mouth towards hers. Both kids open their mouths slightly. Stan vomits into Wendy’s open mouth.]
Wendy: Ew.
Stan: Sorry.
(South Park, Season 1, Episode 11)
General extenders
• And stuff is a ‘powerful marker of intimacy based on shared feelings or experiences’ (Aijmer, 2013: 140).
And stuff (normalised per million words)
Corpus   And stuff
LINT     120
BNC      42
Disjunctives (normalised per million words)
[Bar chart: or something, or anything and or whatever in LINT and BNC; y-axis 0 to 800. Data labels as extracted, apparently one group per corpus: 688, 248, 160; 210, 76, 103.]
Keywords
N    Family      Friends
1    he          like
2    goin        it’s
3    child       eh
4    shure       know
5    cause       cos
6    tis         kind
7    Nana        mm
8    now         em
9    Shelley     just
10   Killian     actually
11   mmm         mean
12   Lauren      my
13   printer     fuck
14   quare       d’ya
15   somethin    awh
16   yesterday   shit
17   room        am
18   Mammy       but
19   comin       fuckin
20   her         really
Taboo language
• Links taboo language to group affinity – ‘the stronger the group affinity, the more swearing’ (Stenström et al., 2002: 77).
• Taboo items have a social function as ‘intimacy signals’ aimed at building up an informal, chummy atmosphere (Stenström, 1991).
Distribution of FUCK (normalised 500,000 words)
[Bar chart: fuckin(g), fuck, fucker(s), fucked and fuck's in Family and Friends; y-axis 0 to 700. Data labels as extracted, apparently one group per sub-corpus: 183, 69, 10, 5, 2; 574, 315, 30, 57, 38.]
Distribution today, tomorrow, yesterday (normalised 500,000 words)
[Bar chart: today, tomorrow and yesterday in Family and Friends; y-axis 0 to 350. Data labels as extracted, apparently one group per sub-corpus: 298, 174, 298; 220, 176, 99.]
Distribution of pragmatic markers (normalised 500,000)
[Bar chart: like, you know, kind of, just, actually, I mean, really, now and shure in Family and Friends; y-axis 0 to 9000. Data labels as extracted, apparently one group per sub-corpus: 2908, 1142, 274, 1507, 286, 188, 645, 2808, 1073; 7758, 1693, 741, 2340, 707, 624, 1069, 2004, 463.]
So…
• The blend of pragmatics and corpus linguistics (corpus pragmatics) provides us with a robust platform for the study of linguistic variation.
• It is, however, necessary to drill down into the data in order to properly determine the characteristics of both a variety and a Variety.
Shameless plug!
References
Aijmer, K., 2002. English Discourse Particles: Evidence from a Corpus. Amsterdam: John Benjamins.
Aijmer, K., 2013. Understanding Pragmatic Markers. Edinburgh: Edinburgh University Press.
Amador-Moreno, C., 2010. An Introduction to Irish English. London: Equinox.
Clancy, B., 2010. Building a corpus to represent a variety of a language. In: A. O’Keeffe and M. McCarthy (eds.), The Routledge Handbook of Corpus Linguistics. London: Routledge, 80-92.
Clancy, B. and E. Vaughan, 2012. It’s lunacy now: A corpus-based pragmatic analysis of the use of now in contemporary Irish English. In: B. Migge and M. Ní Chiosáin (eds.), New Perspectives on Irish English. Amsterdam: John Benjamins, 225-246.
Culpeper, J. and M. Haugh, 2014. Pragmatics and the English Language. London: Palgrave.
Defour, T., 2008. ‘The speaker’s voice: A diachronic study on the use of well and now as pragmatic markers.’ English Text Construction, 1(1), 62-82.
Dolan, T.P., 2012. A Dictionary of Hiberno-English: The Irish Use of English (3rd Edition). Dublin: Gill & Macmillan.
Eggins, S. and D. Slade, 1997. Analysing Casual Conversation. London: Continuum.
Kallen, J., 2006. Arrah, like, you know: The dynamics of Discourse Marking in ICE-Ireland. Plenary paper presented at Sociolinguistics Symposium, July, Limerick. Available on-line: http://www.tara.tcd.ie/bitstream/handle/2262/50586/Arrah%20like%20y%27know.pdf?sequence=1 (accessed 13.08.2014).
Kallen, J. and J. Kirk, 2012. SPICE-Ireland: A user’s guide. Belfast: Cló Ollscoil na Banríona.
McCarthy, M. and R. Carter, 2004. ‘“There’s millions of them”: Hyperbole in everyday conversation.’ Journal of Pragmatics, 36, 149-184.
McNally, F., 2007. An Irishman’s Diary. The Irish Times, 23rd August, p. 17.
O’Keeffe, A., 2006. Investigating Media Discourse. London: Routledge.
O’Keeffe, A., 2011. ‘Teaching and Irish English.’ English Today, 27(2), 58-64.
Quirk, R., 1995. Grammatical and Lexical Variance in English. London: Longman.
Palencia, M.E. and A. Lower, 2013. ‘Your kids are so stinkin’ cute! :-): Complimenting behaviour on Facebook among family and friends.’ Intercultural Pragmatics, 10(4), 617-646.
Stenström, A-B., 1991. Expletives in the London-Lund Corpus. In: K. Aijmer and B. Altenberg (eds.), English Corpus Linguistics: Studies in honour of Jan Svartvik. London: Longman, 239-253.
Stenström, A-B., G. Andersen and I-K. Hasund, 2002. Trends in Teenage Talk: Corpus compilation, analysis and findings. Amsterdam: John Benjamins.
Ten Have, P., 2007. Doing Conversation Analysis: A practical guide. London: Sage.
Vaughan, E. and B. Clancy, 2014. The devil is in the detail: Using corpora to investigate spoken language varieties. Paper presented at the American Association for Corpus Linguistics, Arizona, September.
Wierzbicka, A., 1991. Cross-cultural Pragmatics: The Semantics of Human Interaction. Berlin: Mouton de Gruyter.
Distribution of personal pronouns (normalised 500,000 words)
[Bar chart: I, you, he, she, we and they in Family and Friends; y-axis 0 to 14000. Data labels as extracted, apparently one group per sub-corpus: 11662, 10422, 5146, 3815, 1806, 2935; 11783, 10831, 3528, 3140, 1946, 2845.]