What is a corpus?
a collection of spoken or written authentic texts that is representative of a particular area of language use, by virtue of its size and composition
usually computer-readable and accessible with tools such as concordancers, which can find and sort language patterns
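To make the idea of a concordancer concrete, the following is a minimal keyword-in-context (KWIC) sketch. It is an illustration only, not the implementation of any particular concordancing tool: the function names `concordance` and `format_hits`, the `width` parameter, and the sample sentences are all assumptions made for this example.

```python
import re


def concordance(text, keyword, width=30):
    """Find every whole-word match of `keyword` and return
    (left context, node word, right context) tuples,
    sorted alphabetically by the right-hand context."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    # Sorting on the right context groups similar patterns together,
    # which is how a concordancer helps reveal recurring phrasing.
    hits.sort(key=lambda hit: hit[2].lower())
    return hits


def format_hits(hits, width=30):
    """Align the node word in a central column, KWIC style."""
    return [f"{left:>{width}} [{node}] {right:<{width}}"
            for left, node, right in hits]


# Invented sample text, standing in for a (much larger) real corpus.
sample = ("I would like to thank my supervisor. I also thank my family. "
          "We thank the participants for their time.")

for line in format_hits(concordance(sample, "thank")):
    print(line)
```

Sorting on the right-hand context is only one option; sorting on the left context instead would group the words that precede the search term.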
Kinds of corpora
General corpora
aim to represent language in its broadest sense and to serve as a widely available resource for baseline or comparative studies of general linguistic features (Reppen and Simpson 2004: 95).
Specialized corpora
a corpus of texts of a particular type, such as newspaper editorials, geography textbooks, academic articles in a particular subject, lectures, casual conversations, essays written by students, etc. It aims to be representative of a given type of text and is used to investigate a particular type of language (Hunston 2002: 14).
Examples of specialized corpora
The Michigan Corpus of Academic Spoken English
The British Academic Spoken English corpus
The British Academic Written English corpus
The TOEFL Spoken and Written Academic Language corpus
Design and construction of corpora
Authenticity, representativeness and validity
Kinds of texts to include
Size of the texts
Sampling and representativeness
Discourse characteristics of conversational English
Non-clausal units
Personal pronouns and ellipsis
Situational ellipsis
Non-clausal units as elliptic replies
Repetition
Lexical bundles
Performance phenomena of conversational English
Silent and filled pauses
Utterance launchers and filled pauses
Attention signals
Response elicitors
Non-clausal items as response forms
Extended co-ordination of clauses
Constructional principles of conversational English
Keep talking
Limited planning ahead
Qualification of what has been said
Prefaces
Tags
Corpus studies of the social nature of discourse
Spoken language in academic settings (Swales 2003)
Dissertation acknowledgements (Hyland 2004)
Collocation and corpus studies
Dissertation acknowledgements (Hyland and Tse 2004)
Personal ads (Ooi 2001)
Corpus studies and academic writing
Academic Vocabulary in Context (Hirsh 2010)
University Language (Biber 2006)
Register, Genre, and Style (Biber and Conrad 2009)
Metadiscourse (Hyland 2005)
Academic Discourse (Hyland 2009)
Disciplinary Identities (Hyland 2012)