Michaela Mahlberg Literature and Corpora Corpus Linguistics Summer Institute, 30 June – 3 July...

Michaela Mahlberg: Literature and Corpora. Corpus Linguistics Summer Institute, 30 June – 3 July 2008, University of Liverpool

  • Slide 1

Michaela Mahlberg
Literature and Corpora
Corpus Linguistics Summer Institute, 30 June – 3 July 2008
University of Liverpool

  • Slide 2

Today's talk
Literature and computer corpora?
Corpus stylistics and the theoretical context
Examples
Work in progress and further directions

  • Slide 3

I am in a car park in Leeds when I tell my husband I don't want to be married to him anymore. David isn't even in the car park with me. He's at home, looking after the kids, and I have only called him to remind him that he should write a note for Molly's class teacher. The other bit just sort of... slips out. This is a mistake, obviously. Even though I am, apparently, and to my immense surprise, the kind of person who tells her husband that she doesn't want to be married to him anymore, I really didn't think that I was the kind of person to say so in a car park, on a mobile phone. That particular self-assessment will now have to be revised, clearly. I can describe myself as the kind of person who doesn't forget names, for example, because I have remembered names thousands of times and forgotten them only once or twice. But for the majority of people, marriage-ending conversations happen only once, if at all. If you choose to conduct yours on a mobile phone, in a Leeds car park, then you cannot really claim that it is unrepresentative, in the same way that Lee Harvey Oswald couldn't really claim that shooting presidents wasn't like him at all. Sometimes we have to be judged by our one-offs.

  • Slide 4

  • Slide 5

Corpus approaches to literature?
Corpus Linguistics 2005
Oxford Workshop 2006
PALA 2006
Corpus Linguistics 2007
Corpus style mailing list

  • Slide 6

Literature in Corpora?
naturalness, mainstream, repeated patterns
in a general corpus a literary text will disappear below the waves (Sinclair 2007)
a novel as a unit of meaning (world of the text)
text length and balance
copyright

  • Slide 7

Literature and computers
the gap is still immense between what readers can do effortlessly, and what a computer can do. Scholars interested in calling on a computer to aid their research are limited to a very narrow range of possible operations, and such operations still fall largely outside the mainstream work of literary scholarship. (Miall 1996: online)

  • Slide 8

Literature and language descriptions / grammars
Literature is a prime example of language in use; no systematic apparatus can claim to describe language if it does not embrace the literature also; and not as a freakish development, but as a natural specialization of categories which are required in other parts of the descriptive system. Further, the literature must be describable in terms which accord with the priorities of literary critics. (Sinclair 2004: 51)

  • Slide 9

Literariness?
a cline of literariness in language use with some uses of language being marked as more literary than others in certain domains and for certain judges within that domain (Carter 2004: 69)

  • Slide 10

Corpus stylistics: corpus linguistics + literary stylistics?
criticism of literary stylistics
corpus-based vs. corpus-driven: stylistics checklist (nouns, verbs, simple sentences, cohesion, etc.) or challenging linguistic categories (lexical items, patterns)

  • Slide 11

Corpus stylistics: corpus linguistics + literary stylistics?
can be interpreted in CL terms: corpus work is based on comparison
style is distinctive: in essence, the set or sum of linguistic features that seem to be characteristic: whether of register, genre, or period, etc. (Wales 2001: 371)

  • Slide 12

Corpus stylistics: corpus linguistics + literary stylistics?
can be interpreted in CL terms:
primary deviation: norms of the language as a whole
secondary deviation: norms of literary composition, e.g. author, genre
tertiary deviation: internal, norms of a text
the description of deviations from linguistic norms (Leech 1985)

  • Slide 13

characterising meanings of words
word out of textual context
KWIC
word in a specific text?

  • Slide 14

Local textual functions
local: apply to (a group of) lexical items in a (group of) texts
textual: focus on lexical items in relation to features of texts

  • Slide 15

Clusters as pointers to local textual functions
Repetition as evidence of functional relevance.
A cluster is a sequence of words that is used repeatedly ('at the same time', 'on the one hand ... on the other hand', 'at the end of the ...')

  • Slide 16

Clusters: counting and comparing (norms and deviations)
Counting:
'at the end of the' is one of the most frequent 5-word clusters in English
'it appeared to me that' occurs 7x in Great Expectations

  • Slide 17

Clusters: counting and comparing (norms and deviations)
Comparing:
General corpora
Dickens corpus: ~4.5 million words, 23 texts
19th century novel corpus (19C): ~4.5 million words, 29 texts, 18 authors
key clusters

  • Slide 18

Striking and long clusters in Dickens: 51 8-word clusters (min.
5)
THE ANGLO-BENGALEE DISINTERESTED LOAN AND LIFE ASSURANCE COMPANY (8 words, MC)
NOT TO PUT TOO FINE A POINT UPON IT (9 words, BH)
(THE) UNITED METROPOLITAN IMPROVED HOT MUFFIN AND CRUMPET BAKING AND PUNCTUAL DELIVERY COMPANY (13 words, NN)

  • Slide 19

corpus-driven categories for the 66 key clusters (dynamic groups, ad hoc labels)
1) Labels 20
2) Speech 14
3) As if 6
4) Body parts 9
5) Time and place 5
6) Rest 12

  • Slide 20

Labels
Cluster                             D     19C
THE FATHER OF THE MARSHALSEA        45    0
THE PERSON OF THE HOUSE             37    0
THE LADY OF THE CARAVAN             22    0
MAN OF THE NAME OF                  22    1
THE OLD MAN WITH A                  21    0
CAPTAIN GILLS SAID MR TOOTS         20    0
MY DEAR SAID THE JEW                19    0
MR PICKWICK AND HIS FRIENDS         19    0
GENTLEMAN IN THE WHITE WAISTCOAT    18    0
THE GENTLEMAN IN THE WHITE          17    0
HOW NOT TO DO IT                    16    0

  • Slide 21

Speech
Cluster                             D     19C
DO ME THE FAVOUR TO                 31    0
WHAT DO YOU MEAN BY                 73    15
BEG YOUR PARDON SIR SAID            25    0
UPON MY WORD AND HONOUR             25    1
I BEG YOUR PARDON SIR               56    11
HOW DO YOU FIND YOURSELF            23    0
HOW DO YOU DO MR                    29    2
YOU BE SO GOOD AS                   19    1
WHAT I AM GOING TO                  29    3
AM GLAD TO SEE YOU                  24    2

  • Slide 22

Body part clusters
Cluster                             D     19C
HIS HANDS IN HIS POCKETS            90    13
WITH HIS HANDS IN HIS               60    12
HANDS IN HIS POCKETS AND            40    5
WITH HIS HAND TO HIS                31    2
LAYING HIS HAND UPON HIS            22    1
HIS HEAD AS IF HE                   18    0
THE PALMS OF HIS HANDS              17    0
HIS HEAD ON ONE SIDE                30    4
HIS HAND AS IF HE                   15    1

  • Slide 23

As if
Cluster                             D     19C
AS IF HE WOULD HAVE                 41    2
AS IF HE WERE A                     45    7
AS IF HE WERE GOING                 32    3
IF HE WERE GOING TO                 26    3
AS IF IT WERE A                     72    23

  • Slide 24

Time and place
Cluster                             D     19C
ON THE TOP OF HIS                   21    0
AT THE UPPER END OF                 23    1
ON THE OPPOSITE SIDE OF             54    15
AFTER A GREAT DEAL OF               16    0
THE OPPOSITE SIDE OF THE            70    26

  • Slide 25

IN THE COURSE OF THE
A QUARTER OF AN HOUR
AT THE BOTTOM OF THE
IN THE MIDDLE OF THE
AT THE TOP OF THE
ON THE OTHER SIDE OF
AT THE END OF THE
THE OTHER SIDE OF THE
UP AND DOWN THE ROOM

  • Slide 26

What do (key) clusters tell us?
Dickens uses more 5-word clusters.
Dickens seems to like certain clusters.
Why?
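
The counting and comparing steps behind these tables can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used for the talk: the tokeniser is deliberately crude, `clusters` and `log_likelihood` are hypothetical helper names, and log-likelihood is assumed as the keyness statistic because the slides do not say which test identified the key clusters. The corpus sizes are the approximate 4.5 million words given for each corpus.

```python
import math
import re
from collections import Counter


def clusters(text, n=5):
    """Count every contiguous n-word sequence (cluster) in a text."""
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Two-corpus log-likelihood for one cluster; larger means more distinctive."""
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Counting: keep only clusters that are actually repeated.
sample = "He stood with his hands in his pockets. He sat with his hands in his pockets."
repeated = {c: f for c, f in clusters(sample).items() if f > 1}
print(repeated)

# Comparing: WHAT DO YOU MEAN BY occurs 73 times in Dickens and 15 times in 19C.
print(round(log_likelihood(73, 15, 4_500_000, 4_500_000), 2))
```

Because both corpora are roughly the same size, the raw frequencies in the D and 19C columns can be compared directly; with corpora of different sizes the expected values in `log_likelihood` do the normalising.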

  • Slide 27

Local textual functions: body part clusters
Contextualising
Highlighting

  • Slide 28

Details of local textual functions
Contextualising
'You see, Mr Richard,' said Brass, thrusting his hands in his pockets, and rocking himself to and fro on his stool, 'the fact is, (Old Curiosity Shop)
'Let me see then,' resumed Mr Boffin, with his hand to his chin. 'It was Secretary that you named; wasn't it?' (Our Mutual Friend)
"Eh? What do you say I have got of my own?" asked Mr. Smallweed with his hand to his ear. (Bleak House)

  • Slide 29

Highlighting
He was the meekest of his sex, the mildest of little men. He sidled in and out of a room, to take up the less space. He walked as softly as the Ghost in Hamlet, and more slowly. He carried his head on one side, partly in modest depreciation of himself, partly in modest propitiation of everybody else. It is nothing to say that he hadn't a word to throw at a dog. He couldn't have thrown a word at a mad dog. He might have offered him one gently, or half a one, or a fragment of one; for he spoke as slowly as he walked; but he wouldn't have been rude to him, and he couldn't have been quick with him, for any earthly consideration.
Mr. Chillip, looking mildly at my aunt with his head on one side, and making her a little bow, said, in allusion to the jewellers' cotton, as he softly touched his left ear: 'Some local irritation, ma'am?' (David Copperfield)

  • Slide 30

'That little man of a doctor, with his head on one side,' said my aunt, 'Jellips, or whatever his name was, what was he about?
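
The contextualising and highlighting examples above are the kind of evidence a concordance retrieves. A minimal sketch of a keyword-in-context (KWIC) lookup follows; `kwic` is a hypothetical helper rather than the tool used for the talk, and the sample is assembled from the David Copperfield sentences quoted on the slides.

```python
import re


def kwic(text, phrase, width=40):
    """Return (left context, match, right context) for each occurrence of phrase."""
    collapsed = " ".join(text.split())  # normalise whitespace and line breaks
    hits = []
    for m in re.finditer(re.escape(phrase), collapsed, flags=re.IGNORECASE):
        left = collapsed[max(0, m.start() - width):m.start()]
        right = collapsed[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


sample = (
    "He carried his head on one side, partly in modest depreciation of himself, "
    "partly in modest propitiation of everybody else. "
    "Mr. Chillip, looking mildly at my aunt with his head on one side, "
    "and making her a little bow, said, in allusion to the jewellers' cotton."
)

hits = kwic(sample, "his head on one side")
for left, match, right in hits:
    # Right-align the left context so the cluster lines up in one column.
    print(f"{left:>40} [{match}] {right}")
```

Reading the aligned lines side by side is what allows a cluster to be assigned a local textual function: the same five words recur, but the surrounding text shows whether they accompany speech or single out a character.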

  • Slide 31

Character: clusters associated with the character
Mr Snagsby: NOT TO PUT TOO FINE, PUT TOO FINE A POINT, TO PUT TOO FINE A, FINE A POINT UPON IT, TOO FINE A POINT UPON
Mr Guppy: MAN OF THE NAME OF, OF THE NAME OF GUPPY, YOUNG MAN OF THE NAME, THE YOUNG MAN OF THE, CIRCUMSTANCES OVER WHICH I HAVE, OVER WHICH I HAVE NO, WHICH I HAVE NO CONTROL, YOUR LADYSHIP SAYS MR GUPPY
Mr Bagnet: OLD GIRL SAYS MR BAGNET, BUT I NEVER OWN TO, I NEVER OWN TO IT, NEVER OWN TO IT BEFORE, BEFORE HER DISCIPLINE MUST BE, HER DISCIPLINE MUST BE MAINTAINED, IT BEFORE HER DISCIPLINE MUST, OWN TO IT BEFORE HER, THE OLD GIRL SAYS MR, TO IT BEFORE HER DISCIPLINE

  • Slide 32

Mr Jellyby: HIS HEAD AGAINST THE WALL, WITH HIS HEAD AGAINST THE
Mr Bucket: SIR LEICESTER DEDLOCK BARONET I, BY SIR LEICESTER DEDLOCK BARONET, NOW SIR LEICESTER DEDLOCK BARONET, INSPECTOR BUCKET OF THE DETECTIVE
Mr George: YOUR FRIEND IN THE CITY, I ASK YOUR PARDON SIR
Richard: AS WELL AS ANYTHING ELSE, DO AS WELL AS ANYTHING ELSE
Esther: I THOUGHT IT BEST TO, WHEN WE CAME TO THE

  • Slide 33

Mr Vholes: IN THE VALE OF TAUNTON
Mr Jarndyce: HAVE SOMETHING TO SAY ABOUT, SOMETHING TO SAY ABOUT IT, WILL HAVE SOMETHING TO SAY
Miss Flite: I EXPECT A JUDGMENT SHORTLY
Charley: IF YOU PLEASE MISS SAID
Chadband: IN A SPIRIT OF LOVE, RIGHT THAT I SHOULD BE, YOU ARE TO US A
Grandfather Smallweed: MY FRIEND IN THE CITY, TO LOOK AFTER THE PROPERTY
Krook: MY NOBLE AND LEARNED BROTHER
Jo: WOS WERY GOOD TO ME

  • Slide 34

And she rings for Mercury to show the young man of the name of Guppy out. But in that house, in that same moment, there happens to be an old man of the name of Tulkinghorn. And that old man, coming with his quiet footstep to the library, has his hand at that moment on the handle of the door--comes in--and comes face to face with the young man as he is leaving the room. (Bleak House, Chapter 33)

  • Slide 35

Example of Bleak House
Labels most frequent (59 out of 97)
Speech labels (e.g.
Esther)
Character pairs and relationships (Guppy, Bagnet, Bucket, Charley, ...)
Body part cluster labels (Mr Jellyby)
groups of characters that get labels (cf. Tulkinghorn, Lady Dedlock)
point of view (Snagsby)

  • Slide 36

Local textual functions:
Clusters not automatically associated with the same functions in all texts.
5-word clusters useful for comparison: Dickens uses more clusters and cluster functions (in 19C clusters: max. character-cluster link in 19C = 3)
the 5 functional groups tend to cover 5-word clusters
differences between novels (BH ~350,000: 97 5-word clusters, 59 labels, 2 As If; GE ~180,000: 21 5-word clusters, 4 As If)
the longer the cluster, the more text-specific; long clusters useful for literary analysis (the young man of the name of Guppy)

  • Slide 37

[Figure labels: 19C, Dickens, general, novel]

  • Slide 38

Classification of cluster types into 5 functional groups, figures per 100,000 words
(For each text all 5-word clusters >4 are classified = 716 different types)

  • Slide 39

Clusters and characterisation
small numbers have to be treated with caution, clusters cannot provide full picture of individual character, clusters only point to what is explicitly there, 5-clusters as formal starting-point
characterisation is a process, forming impressions of characters in our minds (Culpeper 2001)
but: features of characters have to be seen with regard to textual world, behaviour interpreted against norms of the text world

  • Slide 40

Speech clusters: definition
Signals of interaction, formal criteria for definition (inferences about personality?)
i have no doubt of
i do not know what
i am not at all
you need not be afraid

  • Slide 41

Speech clusters: definition
Speech clusters are not the same as imitation of spoken language
not a bit of it: all 21 FDS/DS
1  are reconciled then?' said Perker. 'Not a bit of it,' answered Wardle; 'she
2  ping her hands and shaking her head. 'Not a bit of it.' 'At least, his name
3  Dick. 'No I haven't,' she returned, 'not a bit of it.
Don't you mind about
4  ing her hands, and shaking her head. 'Not a bit of it.' 'Handford then,' su
5  of course it's not. Is it in Africa? Not a bit of it. Is it in America? YOU
6  on't believe it,' cried the gentleman, 'not a bit of it. It's an excuse not t

  • Slide 42

all that sort of thing
tendency for FDS/DS but:
Mr. Pickwick's upright and honourable bearing, coupled with that force and energy of speech which so eminently distinguished him, would have carried conviction to any reasonable mind; but, unfortunately, at that particular moment, the mind of Mr. Peter Magnus was in anything but reasonable order. Consequently, instead of receiving Mr. Pickwick's explanation as he ought to have done, he forthwith proceeded to work himself into a red-hot, scorching, consuming passion, and to talk about what was due to his own feelings, and all that sort of thing; adding force to his declamation by striding to and fro, and pulling his hair--amusements which he would vary occasionally, by shaking his fist in Mr. Pickwick's philanthropic countenance.

  • Slide 43

Speech clusters are not the same as the speech tics, character tags, or idiolects described in Dickens criticism
I'll tell you wot it: all 6 in PP
but I never own to: all 6 in BH
But I never own to it before her. Discipline must be maintained.
(character tags and flat characters: same behaviour in different situations)

  • Slide 44

politeness
promoting social harmony (cf. Leech 1983, Brown & Levinson 1987)
am delighted to see you
am glad to see you
am much obliged to you
beg your pardon sir
how do you do mr
I wish you good night

  • Slide 45

impoliteness: attacking face (cf. Culpeper 1996)
what have you got to...
Now, it so happened that Mr. Fang was at that moment perusing a leading article in a newspaper of the morning,... He was out of temper; and he looked up with an angry scowl. 'Who are you?' said Mr. Fang. The old gentleman pointed, with some surprise, to his card. 'Officer!' said Mr.
Fang, tossing the card contemptuously away with the newspaper. 'Who is this fellow?' 'My name, sir,' said the old gentleman, speaking like a gentleman, 'my name, sir, is Brownlow.

  • Slide 46

'Officer!' said Mr. Fang, throwing the paper on one side, 'what's this fellow charged with?' 'He's not charged at all, your worship,' replied the officer. 'He appears against this boy, your worship.' His worship knew this perfectly well; but it was a good annoyance, and a safe one..... 'Now,' said Fang, 'what's the charge against this boy? What have you got to say, sir?... 'Swear the man,' growled Mr. Fang, with a very ill grace. 'Now, man, what have you got to say?'

  • Slide 47

what have you got to (21)
what have you got to say for yourself ...
what have you got to say to me ...
power relationships of characters (not only magistrate)

  • Slide 48

pointers to impoliteness and conflict
Cluster                  DCorp   19C
what do you mean by      73      15
what do you want here    15      0
(cf. also key clusters)

  • Slide 49

The respected Mr Lammle was a bully, by nature and by usual practice. Perceiving, as Fledgeby's affronts cumulated, that conciliation by no means answered the purpose here, he now directed a scowling look into Fledgeby's small eyes for the effect of the opposite treatment. Satisfied by what he saw there, he burst into a violent passion and struck his hand upon the table, making the china ring and dance. 'You are a very offensive fellow, sir,' cried Mr Lammle, rising. 'You are a highly offensive scoundrel. What do you mean by this behaviour?' [...] 'I say,' repeated Fledgeby, with laborious explanatory politeness, 'I beg your pardon. (OMF)

  • Slide 50

'Now, sir,' said Mr Dorrit, turning round upon him and seizing him by the collar when they were safely alone. 'What do you mean by this?' The amazement and horror depicted in the unfortunate John's face--for he had rather expected to be embraced next--were of that powerfully expressive nature that Mr Dorrit withdrew his hand and merely glared at him. 'How dare you do this?'
said Mr Dorrit. 'How do you presume to come here? How dare you insult me?'.... 'I humbly beg your pardon, sir....

  • Slide 51

... Mrs Quilp, to whom, after contemplating her for some time in silence, he communicated a violent start by suddenly yelling out--'Halloa!' 'Oh, Quilp!' cried his poor little wife, looking up. 'How you frightened me!' 'I meant to, you jade,' returned the dwarf. 'What do you want here?...

  • Slide 52

aggression as personality trait, villains, hostile relationships
but also local context: example of PP

  • Slide 53

Dickens very conveniently was a prolific writer

  • Slide 54

Joint work with Dan McIntyre

  • Slide 55

Aims of the Bond Project
Why Bond?
popular fiction under-represented within stylistics (though see Ryder 1999; Montoro 2007)
Some literary critical interest (Lindner 2003, Comentale et al. 2005) though considerable lack of focus on the text
Why focus on Casino Royale?
First book in the series; film version released this year, bestseller
What questions do we want to investigate?
What are the stylistic characteristics of Casino Royale?
How is the character of Bond constructed linguistically?
Does a corpus stylistic analysis tie in with literary critical comment on Bond?

  • Slide 56

Key semantic domains via Wmatrix
assigns semantic tags, 21 discourse fields

  • Slide 57

Top 10 key semantic domains
Casino Royale compared against BNC Written Sampler
Rank   Semantic domain                      Log-likelihood value   Examples
(1)    Pronouns                             1375.62                it, he, his, him
(2)    Anatomy and physiology               1030.25                body, arm, profile, chin
(3)    Games                                391.55                 casino, gambler, croupier
(4)    Light                                348.72                 light, illuminated, sunshine
(5)    Unmatched                            217.84                 salle, privee, caisse
(6)    Furniture and household fittings     171.18                 table, stool, pillow, bed
(7)    Degree                               146.83                 as [high as], as [serious as]
(8)    Parts of buildings                   137.51                 vault, passages, doors
(9)    Location and direction               131.71                 away from, top, left
(10)   Darkness                             104.00                 dark, unlit, darkness

  • Slide 58

References to Bond's BODY

  • Slide 59

Key words: building blocks for the world of the text
fewer key words and more selective than Wmatrix
183 words 189 key words in category to start analysis Anatomy and physiology
groups of key words built bottom-up (by means of concordancing)

  • Slide 60

Names (people and places)
Bond, Bond's
M
Chiffre, Chiffre's, Le
Vesper, Vesper's, Lynd
Pont, du
Mathis
Splendide, Paris, Jamaica, Leningrad

  • Slide 61

Casino
physical aspects of the casino (table, cards, plaques, shoe, rail)
people and their roles in the casino (players, croupier, spectators)
aspects of games (maximums, bet, slipped)
numbers and money (neuf, notes)
French words and phrases in the games
Messieurs, mesdames, les jeux sont faits. Un banco de cinq cent mille.

  • Slide 62

Spying
gunmen, gunman, organization, smoke-bomb, double, bureau, agent, memorandum, 007, MWD
trace
at Royale trying to trace the Jamaican millionaire
it would take hours to trace the ownership to him
There was no trace of the gunman,
Next he examined a faint trace of talcum powder on the inner rim
effort to find out, mystery
with a trace of impatience
Le Chiffre showed no trace of emotion.
characterisation

  • Slide 63

Settings and props
people/jobs: concierge, patron, sommelier
places: boulevard, night-club, coast, villa, side-road
food, drink: champagne, caviar
clothes: pyjama-coat, dinner-jacket
furniture: arm-chair, chair
Vesper and Bond having a meal
Le Chiffre's benzedrine inhaler
cane chair and cane carpet-beater in torture scene

  • Slide 64

Themes and characterisation
Key word    Semantic tag
luck        A1.4 Chance, luck
black       O4.3 Colour and colour patterns
villains    G2.1- Crime
evil        G2.2- Unethical
all villains (7) and most evil (15 of 19) in chapter 20

  • Slide 65

Key word luck
1) Bond as a gambler
Bond had always been a gambler...., and he accepted the fact, he would be brought to his knees by love or by luck. (Chapter 7)
2) Bond and Vesper/women
Perhaps I will bring you luck (Chapter 5)
he felt vague disquiet. On an impulse he touched wood. (Chapter 5)

  • Slide 66

Key words black, villains, evil
The villains and heroes get all mixed up... Now in order to tell the difference between good and evil, we have manufactured two images representing the extremes representing the deepest black and the purest white and we call them God and the Devil. (Chapter 20)
Clothes
Vesper: black dress, black velvet skirt, black velvet ribbon on her hat, black hair
Bond: black satin tie, black hair

  • Slide 67

in KW list Anatomy and physiology category not so clearly visible, but:
'Well, when you get back to London you will find there are other Le Chiffres seeking to destroy you and your friends and your country. [...] And now that you have seen a really evil man, you will know how evil they can be and you will go after them to destroy them in order to protect yourself and the people you love. [...] You may want to be certain that the target really is black, but there are plenty of really black targets around. There's still plenty for you to do. And you'll do it. And when you fall in love and have a mistress or a wife and children to look after, it will seem all the easier.'
Mathis opened the door and stopped on the threshold. 'Surround yourself with human beings, my dear James. They are easier to fight for than principles.' He laughed. 'But don't let me down and become human yourself. We would lose such a wonderful machine.' (Chapter 20)

  • Slide 68

Conclusions
corpus findings can provide insights into the world of a novel
the need for local descriptions
linking linguistic and literary discussion

  • Slide 69

Future work:
bottom-up and top-down: reception
effects on reader
script and film

  • Slide 70

Further detail.
Mahlberg, M. 2007. "Corpora and translation studies: textual functions of lexis in Bleak House and in a translation of the novel into German". In V. Intonti, G. Todisco, and M. Gatto (eds), La Traduzione. Lo Stato dell'Arte. Translation. The State of the Art. Ravenna: Longo, 115-135.
-- 2007. "Clusters, key clusters and local textual functions in Dickens". Corpora.
-- 2007. "Corpus stylistics: bridging the gap between linguistic and literary studies". In Hoey, M., Mahlberg, M., Stubbs, M., Teubert, W. forthcoming. Text, Discourse and Corpora. Theory and Analysis. London: Continuum.
-- forthcoming, 2007. "A corpus stylistic perspective on Dickens' Great Expectations". In M. Lambrou and P. Stockwell. Contemporary Stylistics. London: Continuum.