Linguistics Journal Volume 8 Issue 1 2014

ISSN 1738-1460

VOLUME 8 ISSUE 1 2014

The Linguistics Journal

July 2014 Volume 8 Issue 1

Editors: Paul Robertson and Biljana Čubrović

The Linguistics Journal
July 2014 Volume 8, Number 1
http://www.linguistics-journal.com

© English Language Education Publishing, Brisbane, Australia

This E-book is in copyright. Subject to statutory exception, no reproduction of any part may take place without the written permission of English Language Education Publishing.

No unauthorized photocopying. All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying or otherwise, without the prior written permission of English Language Education Publishing.

[email protected]

Editors: Dr. Paul Robertson and Dr. Biljana Čubrović
Chief Editor: Dr. Biljana Čubrović
Senior Advisor: Dr. John Adamson
Journal Production Editor: Dr. Erin Carrie

ISSN 1738-1460

Table of Contents

Foreword by Biljana Čubrović

Research Articles

1. Dina Awad
   Diverse Acquisition Patterns

2. Ibrahim M. R. Al-Shaer
   The Use of Third-Person Pronouns by Native and Non-Native Speakers of English

3. Napasri Timyam
   An Analysis of Learner Use of Argument Structure Constructions: A Case of Thai Learners Using the Passive and Existential Constructions in English

4. Mohammad Aliakbari, Mahmoud Qaracholloo and Ali Mansouri Nejad
   Social Class and Language Structure: A Methodological Inquiry into Bernstein's Theory of Sociology of Education

Research Notes

5. Ming Wei
   Code-Switching in a Virtual English Community in China: An International Perspective

6. Jabulani Sibanda
   Interrogating Current Conceptualisations of ‘Word’ for Word Knowledge Studies: Challenges and Prospects

7. María José Serrano and Miguel A. Aijón Oliva
   On Gendered Styles and their Socio-Cognitive Foundations

Foreword

This year’s edition of the journal comprises seven articles: four full research articles and three

research notes. Thanks are extended primarily to the authors who have contributed to this

edition, and the Associate Editors, reviewers, and the production team under Dr. Erin Carrie

for their efforts in preparing the papers for publication. This last year has been unique for the

journal in terms of the significant changes affecting the Editorial Board, a healthy volume of

submissions and a large number of new reviewers together with a brand new production team

who have joined the journal in their new roles. Congratulations must be extended to all the

new editors, who have become part of the team recently and have already proved to be

dependable, constructive and highly professional. Special thanks go to John Adamson, who

has moved on to a sister journal but has helped me take over the position of the Chief Editor

he has successfully held for many years and helped with all my questions and concerns since

January 2014.

The first contribution, entitled "Diverse Acquisition Patterns" by Dina Awad,

elaborates on second language acquisition issues, featuring one of the most problematic areas

of English grammar - the articles as used by native speakers of Arabic. Awad's original study

of the acquisition of the definite and indefinite articles in SLA shows that the developmental

patterns of the two articles are divergent in both accuracy rates and error types, and that they

cannot be easily predicted because their acquisition is influenced by multiple and diverse

factors, such as proficiency level, first language, task-type and the processing demands of

each linguistic feature. The next research article, contributed by Ibrahim M. R. Al-Shaer, is

"The Use of Third-Person Pronouns by Native and Non-Native Speakers of English",

especially in the context of pronoun-antecedent agreement, an area where it proves difficult to

draw the line between standard and non-standard usage. Similar to Dina Awad's study of the

acquisition of articles by Arabic non-native speakers of English, Al-Shaer looks into the

differences in the use of pronouns. The results of the study show that most native speakers

choose third-person pronouns depending on the socio-cultural context and pragmatic factors,

bending the formal rule of pronoun-antecedent agreement, especially when dealing with

gender-unspecified words. However, the majority of non-native speakers show an inclination

to follow prescriptive grammar rules, due to the absence of social and cultural sensitivity

evidenced in English as L2. Napasri Timyam's study, entitled "An Analysis of Learner Use of

Argument Structure Constructions: A Case of Thai Learners Using the Passive and

Existential Constructions in English", focuses on the aforementioned two types of common

constructions in English with the aim of discovering the deviations in terms of their general

characteristics in the written English of non-native speakers with a Thai language background.

The results reveal that Thai learners’ constructions differ from the prevalent native speaker

norms in that they are much more limited in terms of structural complexity, semantic and

pragmatic functions. In the last paper in the research article section, "Social Class and

Language Structure: A Methodological Inquiry into Bernstein's Theory of Sociology of

Education", Mohammad Aliakbari, Mahmoud Qaracholloo and Ali Mansouri Nejad explore

the manifestations and credibility of Bernstein's Language Codes Theory in an Iranian context

so as to check whether there are any significant differences between working- and middle-

class Iranian native speakers in the domain of linguistic patterns usage. Even though

Bernstein's view of the relationship between language and social class has been largely

disputed, Aliakbari and colleagues provide some evidence supporting the manifestations of

the two dichotomous language codes: restricted code (lower strata of society) and elaborated

code (higher socioeconomic class of language users).

Three additional research notes are presented in the next section of this edition. The

first article, entitled "Code-Switching in a Virtual English Community in China: An

International Perspective" and written by Ming Wei, looks into the concept of code-switching

as used in chat rooms. The study examines how code-switching negotiates social and

interactional meanings in virtual conversations as conducted by Chinese speakers of English,

as well as how it contributes to the creation of an authentic, slightly adapted context of social

interaction between interlocutors. Speakers tend to adjust their choice of code as well as the

degree of code-switching, both of which are firmly entrenched in social distance and face

management in synchronous conversations. The study also shows how manipulation of code

interpretation and selection was achieved in the virtual English community. Jabulani

Sibanda’s paper, "Interrogating Current Conceptualisations of ‘Word’ for Word Knowledge

Studies: Challenges and Prospects", questions the efficacy of the conceptualisation of the

construct ‘word’ represented by different terms, ‘token’, ‘type’, ‘lemma’, and ‘word family’,

as units of measurement of the English lexicon as seen in the vocabulary expansion of South

African learners of English. Sibanda points out that an implementation of an extension of

Nation and Bauer’s (1983) levels of ‘word family’ membership, through an association of

inflected and derived forms with base words, seems a desirable proposition in second

language acquisition studies. Last but not least, the concluding research note, entitled "On

Gendered Styles and their Socio-Cognitive Foundations", is written by María José Serrano

and Miguel A. Aijón Oliva. The main purpose of their investigation is to outline a theoretical

and analytical framework that reconciles the quantitative and qualitative perspectives on

language and gender as used by male vs. female speakers of European Spanish. The authors

develop a view of the statistical patterning of linguistic usage which reflects the meaningful

use of linguistic elements in local contexts.

We hope you find the articles in the 2014 edition of the journal interesting. Your own

submissions and feedback are always welcome, and we look forward to receiving them.

Biljana Čubrović, Ph.D.

Chief Editor

Diverse Acquisition Patterns

Dina Awad

Leicester University

[email protected]

Bioprofile: Dina Awad holds a Ph.D. degree in Linguistics from Lancaster University (2011).

She received her M.A. in English Language Teaching and Applied Linguistics from King’s

College London in 2001. She is currently a lecturer at Leicester University, UK. Research

interests include second language acquisition, cognitive linguistics and teaching methods.

Abstract

Acquiring a second language is a complex and nonlinear process in which learner hypotheses

and production constantly change and evolve towards the target language. In order to find out

more about developmental patterns in SLA, we examined the L2 use of English articles in the

free composition of students in the United Arab Emirates, all of whom are L1 speakers of

Arabic. The participants were grouped into three proficiency levels (PL) according to the

Oxford Placement Test (OPT) to simulate diachronic progression. It was expected that

learners’ performance on both articles would improve with higher competence. However, by

comparing accuracy and error rates across the three groups, we found that the articles ‘a(n)’

and ‘the’ not only develop independently of each other but can sometimes progress in

diverse directions. The most influential factors that contributed to determining the final

outcome were the non-existence of a one-to-one form-function relation between the two

English articles, the dissimilarity between L1/L2 representations of definiteness and number,

and learners’ competence levels.

Keywords: English, second language acquisition, articles, pattern

Introduction

English articles have always been difficult for second language learners regardless of their

first language, and the difficulty persists into advanced levels. Articles are notorious as one

of the most difficult features of English to learn or teach (Kaluza 1963, Brown 1973, Dulay et al. 1982, Pica 1983,

Master 1990, inter alia), and their misuse ranks highest among L2 learners’ errors (Covitt

1976, cited in Celce-Murcia and Larsen-Freeman 1999, Richards and Simpson 1974). Sharma

(2005) established that article errors account for 60.37% of the total number of errors

committed by L1 Indian learners of English, while Thu (2005) found that article errors

constituted 31.5% of all errors made by L1 Vietnamese learners. Thus, articles represent

an area of ‘considerable prominence in any error analysis’ since, as traditionally believed,

performance regarding articles reflects overall linguistic competence (Oller and Redding

1971: 85). Later, researchers such as Lightfoot (1998) suggested that learners’ performance

on articles does not necessarily reflect their PLs. Bataineh (2005) found that senior Jordanian

learners overused the indefinite article more frequently than lower ability learners.

Research on SLA of English articles has shown that articles develop at different rates

(Chaudron and Parker 1990, Kellerman 1977, inter alia) due to the differences in the meaning

and function of each article. While the function of the definite article in English is to signal

that a particular entity in a limited context is uniquely identified by the interlocutors in a

particular pragmatic setting (Hawkins 1978, Lyons 1999), its absence is sufficient to indicate

indefiniteness, such as the case with plural and uncountable nouns. This leaves the indefinite

article primarily with a cardinality function assigned only to singular indefinite contexts.

Therefore, the disparity could arise from the fact that there is no one-to-one relationship

between the two.

In addition to the point that the two articles develop independently of each other, what

is proposed in this paper is that this nonlinearity can, under certain conditions, culminate in a

progression in different directions. The purpose of this study is to draw attention to the

complexity of article acquisition in L2 and to alert educators that progression in L2 does not

always correlate positively with performance, as advanced learners can make more errors, in

certain contexts, than weaker ones before finally improving. This pattern has often been

described as U-shaped development (see Kellerman 1977, Master 1997, Haznedar 2001), but

the consistent article errors even for advanced level L2 users undermine this proposition.

Literature Review

The two articles in English have not been reported to be acquired at the same time or to follow

the same route of development in SLA. Studies show that each article is produced and

mastered at a different stage and incurs different error types at different PLs. Several factors,

such as difference in function, L1 grammar and task type, determine the L2 development map

of each article separately.

Except for two known studies (Leung 2001, Young 1996), most researchers seem to

agree that mastering the definite article precedes that of the indefinite (Hakuta 1976, Huebner

1983, Master 1987, Thomas 1989, Yamada and Matsuura 1982). The rationale is that

definiteness, as a semantic concept, is at least encoded before indefiniteness (Chaudron and

Parker 1990) which involves grammatical notions of number and countability. This position

is corroborated by findings from many studies (e.g., Hamdallah 1988, Kharma and Hajjaj

1989, Maalej 2004). It is therefore noticeable that better performance on the definite article at

earlier stages is a common occurrence in the SLA process.

From a transfer perspective, the absence of articles in the L1 impedes L2

acquisition, and vice versa (Ringbom 1987, Goad and White 2004). Despite the fact that

Arabic is considered a language with definiteness grammaticalised (+ART), there is no

explicit marker of indefiniteness. Suffix accents, or nunation (Smith 2001), sometimes mark

indefinite nouns, but their presence is optional and largely limited to classic, formal and

written registers. The indefinite NP in ‘This is a big house’, for example, can be expressed

formally where explicit markers appear as suffixes (1), or informally (2) without markers.

هذا بيتٌ كبيرٌ (1)

Haatha bait-un kabeer-un

Dem:prox house-N-Indef-Sg big-Adj-Indef-Sg

This house big

هذا بيت كبير (2)

Haatha bait kabeer

This house big

Learners could transfer the semantic notions from Arabic in which the absence of definite

marking in an NP is a sufficient indication of its indefinite status. This principle, however,

is not entirely exclusive to Arabic. Leech contends that ‘it is convenient, from many points of

view, to regard an initial determiner as obligatory for English noun phrases, so that the

absence of an article is itself a mark of indefiniteness’ (1992: 15).

Studies on the production of learners whose L1s lack formal representation of articles

(–ART) suggest that the failure to supply articles persists into advanced stages (Thomas

1989, Master 1997, Trenkić 2002, Ekiert 2004). Zdorenko and Paradis (2008) recorded more

omissions in the L2 production of Korean, Chinese and Japanese (–ART) learners of English

than in the production of Spanish, Romanian and Arab (+ART) learners. The high omission rates

of the indefinite article observed in Arab learners’ production are a typical instance of what

Eckman (1977) describes as the most difficult aspect to acquire in the target language, namely

the production of elements which are not present in the L1 but marked in L2. Tsimpli’s

(2003) conviction that the absence of features in the L1 causes syntactic representations in L2

production to become defective applies to the difficulties which Arab learners encounter.

SLA researchers, such as Hawkins and Chan (1997) and Prévost and White (2000),

ascribed the difficulty that second language learners (2LL) have in the employment of a

feature that does not exist in their L1 to a failure in mapping functional features present in the

L2 onto their production of the target language (the Failed Functional Features Hypothesis, FFFH). With L1 transfer most operative at

weaker PLs (Odlin 1989, Sharma 2005, Slabovka 2000, Snape 2005) better performance is

expected on the definite than the indefinite article.

Previous research has provided evidence for the tendency of Arab learners to overuse

the definite article across indefinite contexts (cf. Bataineh 2005, Kharma 1981, Maalej 2004).

This error was attributed to two different sources. One group of researchers (e.g., Al-Fotih

2003, Diab 1996, Habash 1982, Kharma and Hajjaj 1989) ascribes the error to the negative

transfer of the definite article norms in Arabic. While the definite marker in Arabic is used to

generalise as well as to identify, rendering all generic references grammatically definite

(Hawas 1989, Kremers 2003), non-referential NPs in English are largely left unmarked¹ as

native speakers’ most favourable option (Behrens 2005). The fact that the definite article

tends to be overused in non-specific contexts while the indefinite is expected to be

underrepresented can cause a gap in the development of the two articles.

Other researchers, including those whose data was collected from free production,

such as Abi Samra (2003) and Bataineh (2005), believe that the-flooding tendency is a

universal interlanguage (IL) phenomenon, a stage that all L2 learners go through regardless of their L1. In a

study on university students in the United Arab Emirates, Crompton (2011) contends that the most

common error is the overuse of the definite article in generic contexts. Therefore, overuse of

‘the’ is expected in indefinite plural/uncountable contexts especially at lower PLs.

The higher overuse rate of ‘the’ at lower PLs is not by any means exclusive to Arab

L1 learners. Similar findings were reported in SLA studies on other L1s, including languages

that possess or lack a formal representation of articles (Huebner 1983, Nagata et al. 2005,

Thomas 1989, Young 1996). Master’s (1987) study of Japanese learners, for example, found

that the definite article was flooded into indefinite contexts although Japanese does not

possess an article system.

Test type can also influence article choice causing inconsistency in production and

accuracy/error rates. Research findings suggest that free writing tasks yield higher accuracy

rates than controlled cloze tests. Dulay et al. (1982) argue that errors in form-focused tests

occur when formally learned rules have not yet become part of the learners’ linguistic

competence, i.e. learners need time to practice their explicitly learned L2 rules in order to

produce grammatically appropriate forms in free production. This is largely attributed to

avoidance strategies that are available to learners in production-based tasks (Kharma and

Hajjaj 1989, Mizuno 1985, Tarone and Parrish 1988). Accordingly, learners resort to other

determiners such as quantifiers and demonstratives to reduce the risk of committing errors in

article use. In this case, when given the choice, the definite article presents a safer option

since it collapses elements of countability and number, which endanger the grammatical

accuracy of the NP. Furthermore, ‘the’ is already available in learners’ subconscious and

easily automated in free production, while the indefinite article, learned mostly through

explicit instruction, is more accessible in tasks that draw on metalinguistic information such

as cloze tests. With communicating meaning being the primary goal in a free production task,

learners’ attention might not be fully directed towards form, causing the production and

accuracy rates of the indefinite article to be relatively low.

Advocates of teaching articles (e.g., Master 1997) in the EFL/ESL classroom propose

that informing learners of explicit rules can eventually lead to automated use; that is,

knowing how a feature operates precedes, and leads to, the voluntary application of

these rules in communicative settings (DeKeyser 2003, Doughty 2003, Ellis 2001, VanPatten

1994). In other words, more time is needed for this declarative knowledge to become internally

proceduralised and voluntarily produced in meaningful output. Therefore, participants could

have achieved different results had the task been form-focused.

Method

Participants

Sixty undergraduate students from different colleges at the UAE University, United Arab

Emirates, volunteered to participate in this study. Each participant was given a reference

number. A background survey was conducted to ensure uniformity of first

language (Arabic), and participants who had studied in English-medium schools or lived in

an English-speaking country for more than three months were excluded.

Materials and Procedure

The Oxford Placement Test (OPT) was used to determine proficiency levels. Participants

who scored at Elementary level or below (up to 30 out of 60) were placed in the weakest group (G1), while those

whose scores were between 31 and 44 formed the second group (G2), to include Lower

Intermediate and average Intermediate levels by OPT standards. The highest group, (G3),

included students with 45 points and above, i.e. Upper Intermediate and Advanced by OPT

banding. In order to ensure sufficient gaps between the groups, borderline scores were

excluded from the test, leaving 51 students to take the following test.

Table 1 Banding criteria according to OPT results

Levels               OPT Scores   Range   Groups
Beginner             0-17         0-30    1
Elementary           18-29
Lower Intermediate   30-39        31-45   2
Upper Intermediate   40-47
Advanced             48-54        46-60   3
Very Advanced        54-60
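
The banding in Table 1 can be sketched as a small function. This is only an illustration: the function name `opt_group` is hypothetical, the boundaries are taken from the Range column of Table 1, and the sketch does not model the exclusion of borderline scores because the cut-offs used for that step are not reported.

```python
def opt_group(score: int) -> int:
    """Return the proficiency group (1, 2 or 3) for an OPT score out of 60.

    Assumption: group boundaries follow the Range column of Table 1
    (0-30 -> G1, 31-45 -> G2, 46-60 -> G3).
    """
    if not 0 <= score <= 60:
        raise ValueError("OPT scores are expected to lie between 0 and 60")
    if score <= 30:
        return 1
    if score <= 45:
        return 2
    return 3


if __name__ == "__main__":
    # Print the group for scores on either side of each boundary.
    for s in (18, 30, 31, 45, 46, 60):
        print(s, "-> G%d" % opt_group(s))
```

Keeping the boundaries in one place makes it easy to rerun the grouping with different cut-offs if the banding is revised.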

The succession of levels is an attempt to follow, synchronically, natural L2 progress,

otherwise operationalised longitudinally, as variation across proficiencies can reflect some

aspects observed in diachronic development (Raymond et al. 2002).

The data was collected from a composition task in which learners were asked to

describe their hometowns in 350-500 word essays. The topic provides students with an

opportunity to express themselves freely and creatively by introducing new information and

referring to it later in the text, which ensures the availability of definite and indefinite

constructions. The prompts gave no indication of the purpose of the test so that

the production would better reflect learners’ communicative competence as it approximates

real-life interaction (Lightbown and Spada 1999, Power 2003). Free production tests are known to

direct learners’ attention mainly towards delivering meaning, providing the researcher with a

sample of L2 data in non-test situations (Skehan 1989). Therefore, the outcomes of this study

might not resemble those obtained from cloze tests.

Observed by teachers, the participants were given one hour to write. Time pressure

adds a processing constraint on participants to prevent conscious contemplation of the forms

produced (cf. Robinson 1996, Sorace 1996).

Data Analysis and Statistical Analysis

NPs were numbered by order of appearance in each essay and described in terms of the

criteria that determine article use i.e. definiteness, countability and number (Celce-Murcia

and Larsen-Freeman 1999, Quirk et al. 1991). NPs were described as possessing (1) or

lacking (0) these criteria in order to facilitate calculations.

(3) *It’s quite big town. (14C4)²

(4) *He told us an interesting stories. (1A22).

According to the categories of analysis, the NP in 3 was described as [Def=0] [Count=1]

[Sing=1] while the NP in example 4 was [Def=0] [Count=1] [Sing=0].
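
The binary coding of NPs can be illustrated with a short sketch. The class name `NPContext` and the function `expected_article` are hypothetical, and the mapping from features to the expected article is a simplification of the description given in the Introduction (definite contexts take ‘the’, indefinite singular countable contexts take ‘a(n)’, and other indefinite contexts take no article); it ignores the excluded NP types such as proper nouns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NPContext:
    """Binary description of an NP: 1 = feature present, 0 = absent."""
    definite: int
    countable: int
    singular: int


def expected_article(ctx: NPContext) -> str:
    """Return 'the', 'a(n)' or '0' (no article) for a coded NP.

    Assumption: a simplified rule, not the full coding scheme of the study.
    """
    if ctx.definite:
        return "the"
    if ctx.countable and ctx.singular:
        return "a(n)"
    return "0"


# Examples 3 and 4 from the text, coded as described there.
example_3 = NPContext(definite=0, countable=1, singular=1)  # "quite big town"
example_4 = NPContext(definite=0, countable=1, singular=0)  # "interesting stories"

if __name__ == "__main__":
    print(expected_article(example_3))
    print(expected_article(example_4))
```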

Article use was categorised as either correct or incorrect. The approach followed to

determine the correctness of the definite article is derived from Liu and Gleason’s (2002)

classification of non-generic contexts of definite article use, which, in turn, is based on the

theory of definiteness advanced by Hawkins (1978). Semantically definite NPs, such as

proper nouns, pronouns and demonstratives as well as quantifiers that exclude articles were

not included in the dataset because [+Def] categorisation would automatically require the

supply of the definite article, leading to confusion in subsequent calculations. However, the

determiner some was accepted as a correct indefinite plural marker. Incorrect use was

subdivided into errors of overuse, omission and replacement. Overuse errors refer to

instances where articles should not have appeared (Pica 1983) while omission errors denote

the failure to supply either article in contexts where they are deemed obligatory. Thus, the

error in example 3 is that of omission while in 4 it is overuse. Replacement errors refer to the

employment of the indefinite article with uniquely identifiable referents (a-for-the), or the

supply of the definite article in indefinite singular contexts (the-for-a). A sample datasheet is

shown in Table 2.
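
The error categories described above can be expressed as a simple decision procedure. The following is a minimal sketch, assuming each NP has already been assigned an expected article and a supplied article, each written as 'the', 'a(n)' or '0' (no article); the function name `classify_use` and this string encoding are hypothetical conveniences rather than part of the study's procedure.

```python
VALID = {"the", "a(n)", "0"}


def classify_use(expected: str, supplied: str) -> str:
    """Classify one instance of article use.

    Returns 'correct', 'omission', 'overuse', 'a-for-the' or 'the-for-a'.
    """
    if expected not in VALID or supplied not in VALID:
        raise ValueError("articles must be one of 'the', 'a(n)' or '0'")
    if supplied == expected:
        return "correct"
    if supplied == "0":
        # An article was obligatory but none was supplied.
        return "omission"
    if expected == "0":
        # An article appeared where none should have.
        return "overuse"
    # Both are articles but the wrong one was chosen (replacement error).
    return "a-for-the" if supplied == "a(n)" else "the-for-a"


if __name__ == "__main__":
    print(classify_use("a(n)", "0"))   # example 3: omission
    print(classify_use("0", "a(n)"))   # example 4: overuse
```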

Table 2 Sample data sheet

Student name: 11aw

Columns: Ref. No. | NP | Article use: the (Correct, Omission, Overuse); a(n) (Correct, Overuse, Omission); 0 (Correct) | NP description: Definite, Countable, Singular

1 | a special place to live | 1 1 1

2 | place to work | 1 1 1

3 | has relatives | 1 1

4 | I live in small town | 1 1 1

5 | the most beautiful town | 1 1 1 1

6 | in the world | 1 1 1 1

7 | It has many places which attract tourists | 1 1

8 | from all over the world | 1 1 1 1

9 | It has mountains | 1 1

10 | beaches where you breathe | 1 1

11 | fresh air | 1

12 | and farms where you find | 1 1

13 | different kinds of | 1 1

14 | fruit and | 1

15 | Vegetable | 1 1

16 | It has the most important factor which are | 1 1 1 1

17 | Safety | 1

18 | Quietness | 1

19 | and purity | 1

20 | whenever I have a problem | 1 1 1

21 | my place in the society | 1

22 | small simple houses | 1 1

Two speakers of English as a first language volunteered to review the datasheets to ensure the

reliability of the coding. Some expressions were marked as (grammatically) correct, although

more target-like constructions would have been preferable.

Learner data                 Native-like choice
(10) As a conclusion         in conclusion
(11) The houses of people    People’s houses

To calculate accuracy rates, the number of correct supplies was divided by the total sum of

NP environments in which articles should have appeared.

\[
\text{Correct use (\%)} = \frac{\text{Number of observed occurrences}}{\text{Total number of obligatory contexts}} \times 100
\]
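
The accuracy calculation can be sketched in a few lines. The function name `accuracy_rate` is hypothetical; the counts in the usage example are the G1 figures reported in the Results section (150 correct uses of ‘the’ in 199 obligatory contexts, and 64 correct uses of ‘a(n)’ in 106).

```python
def accuracy_rate(correct: int, obligatory: int) -> float:
    """Return correct supplies as a percentage of obligatory contexts."""
    if obligatory <= 0:
        raise ValueError("there must be at least one obligatory context")
    if not 0 <= correct <= obligatory:
        raise ValueError("correct supplies cannot exceed obligatory contexts")
    return 100.0 * correct / obligatory


if __name__ == "__main__":
    print(round(accuracy_rate(150, 199)))  # G1, 'the'  -> 75
    print(round(accuracy_rate(64, 106)))   # G1, 'a(n)' -> 60
```

The same function applies to the error rates described below, with the number of error instances in place of the correct supplies and the relevant context count as the denominator.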

Outcomes were measured in percentages to allow comparisons across groups with varying

numbers of participants and unequal obligatory contexts. In principle, the formula used in

calculating errors was similar to the one used for accuracy, i.e., the observed instances were

compared against the total number of contexts where such occurrences were expected to

appear. For example, to calculate the percentage of overuse of a(n), the following equation

was used:

\[
\text{Overuse of a(n) (\%)} = \frac{\text{Number of incorrect instances of a(n)}}{\text{Total number of [–Def] [–Count] NPs}} \times 100
\]

A similar method was followed to examine the occurrence of the indefinite article in plural

contexts, simply by changing the [–Count] contexts into [–Sing] ones. The overuse rates of

the definite article were calculated by dividing the total number of overuse instances in

learner data by the total number of indefinite NPs in a given group.

The omission rates of ‘a(n)’ were obtained by comparing the total number of

obligatory contexts, i.e. [–Def] [+Count] [+Sing] NPs, against observed instances. The same

method was used for definite article omissions.

Replacement errors had to be calculated in a manner that would make the two articles

more comparable since it is grammatically acceptable for the definite article to replace the

indefinite while the reverse is not always possible. Therefore, only [+Sing] [+Count] nouns

were selected as constants for both articles, leaving definiteness as the only variable

that determines the appropriate choice of either article.

\[
\text{a-for-the (\%)} = \frac{\text{Total number of overuse instances of a(n)}}{\text{Total number of [+Def] NPs}} \times 100
\]

A similar calculation was used to examine the error of replacing the indefinite article with the

definite.

Finally, to ensure that there is consistency within the responses of each group, the

following analysis was performed.

Table 3 Consistency within groups

Group   N    Range   1st quartile   Median   3rd quartile
G1      19   18-30   27             30       32
G2      20   31-45   35             37       40
G3      17   46-53   45             47       49

There was also sufficient cross-group difference to justify the categorisation. In order to

measure cross-group variance, we used an unpaired, two-tailed t-test assuming equal

variance with 95% confidence, comparing two groups at a time. Cross-group differences were

statistically significant as is shown in Table 4.
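
For readers unfamiliar with the test, the statistic behind an unpaired t-test with equal variances assumed can be sketched as follows. The function name `pooled_t` is hypothetical and the two score lists are invented purely for illustration; they are not data from this study. The sketch returns the t statistic and degrees of freedom only; obtaining the two-tailed p value additionally requires the t distribution, from a table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for an unpaired t-test assuming equal variances."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    if pooled_var == 0:
        raise ValueError("both groups are constant; t is undefined")
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_error, df


if __name__ == "__main__":
    # Invented scores for two hypothetical groups (not study data).
    group_a = [20, 22, 24, 26, 28]
    group_b = [30, 32, 34, 36, 38]
    t, df = pooled_t(group_a, group_b)
    print(t, df)
```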

Table 4 Cross group variation

Group   N    M        CI at 95%
G1      19   124.07   ±7.89
G2      20   159.1    ±8.26
G3      17   177.57   ±8.57

Comparison   p
G1 v G2      <0.0001
G1 v G3      <0.0001
G2 v G3      0.0057

Results

Accuracy

G1 Learners employed the definite article correctly 150 times in 199 obligatory contexts,

while ‘a(n)’ was correctly supplied 64 times in 106 indefinite singular contexts. The

significant difference (p=0.0079) strongly suggests that Arab learners initially perform better

on the definite than the indefinite article.

G2 achieved higher accuracy rates on both articles. The gap between the accuracy rates of the

two articles was smaller. However, the error pattern remained in line with that detected in G1’s

production as the accuracy rates of the definite article (84%) remained significantly higher

than those of the indefinite (p=0.0395).

G3 The highest accuracy rates were, as expected, achieved by more advanced learners.

Unlike the results from the two lower groups, there was little difference in the accuracy rates

of the definite and indefinite articles. However, G3 performed better on the indefinite (89%)

than the definite (86%) article.

Figure 1 Accuracy rates across groups

The results show sustained improvement in learners’ performance on both articles, yet the

progress on the indefinite article was more noticeable and consistent, correlating positively

with PLs, with significant differences between all groups. On the other hand, the difference between

G2 and G3’s accuracy rates of the definite article was not significant (p=0.3742) as is shown

in Table 5.

Table 5 Accuracy rates of both articles compared across groups

Group   N    Correct the   %    Correct a/an   %
G1      19   150/199       75   64/106         60
G2      17   245/293       84   86/116         73
G3      12   191/221       86   57/64          89

Comparison   p (the)   p (a/an)
G1 v G2      0.0276    0.0277
G2 v G3      0.3742    0.0080
G1 v G3      0.0039    <0.0001

Diverse acquisition patterns can be detected as the highest scores shift from being achieved

on one article (the) to the other (a) with PL progression. The diagram in Figure 2 further

illustrates this trend.
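
The shift described here can be checked directly from the counts in Table 5. The sketch below is illustrative only: the dictionary layout and the function names `percent` and `leading_article` are hypothetical, while the counts are those reported in Table 5.

```python
# Correct supplies and obligatory contexts per group, from Table 5.
COUNTS = {
    "G1": {"the": (150, 199), "a(n)": (64, 106)},
    "G2": {"the": (245, 293), "a(n)": (86, 116)},
    "G3": {"the": (191, 221), "a(n)": (57, 64)},
}


def percent(correct: int, total: int) -> float:
    """Return correct supplies as a percentage of obligatory contexts."""
    return 100.0 * correct / total


def leading_article(group: str) -> str:
    """Return the article with the higher accuracy rate in a group."""
    rates = {art: percent(*counts) for art, counts in COUNTS[group].items()}
    return max(rates, key=rates.get)


if __name__ == "__main__":
    for group in COUNTS:
        print(group, leading_article(group))
```

With these counts, the definite article leads in G1 and G2 and the indefinite article leads in G3, which is the crossover that the trend-lines depict.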

Figure 2 Accuracy trend-lines for both articles

Omission

G1 This group omitted the definite article in 48 obligatory instances, which is 24% of all

definite contexts. The failure to supply ‘a(n)’ with indefinite singular countable nouns was

the most noticeable difficulty in the lower group’s performance: omission of the

indefinite article accounted for 44% of all grammatical errors recorded in G1’s production,

more than any other error type. G1 omitted the

indefinite article 42 times in 106 contexts; that is, weaker learners failed to supply

‘a(n)’ 40% of the time with singular indefinite NPs.

G2 Although there were fewer omissions by this group than by the weaker group (p=0.0476),

G2’s performance was similar to that of G1 as intermediate PL participants omitted the

indefinite article more frequently than the definite article. The omission of ‘a(n)’ constituted

34% of all grammatical errors made by G2. They omitted the indefinite article 30 times in

116 indefinite singular NP contexts (26%), a significantly higher rate than that of the definite

article (17%).

G3 Omission rates of the indefinite article seem to have decreased regularly and significantly

as PLs improved. However, it was interesting to find that G3 participants made more

omissions of the definite than the indefinite article. The omission rate of the definite article

was 12%, while that of ‘a(n)’ was only 11%.

Diverse, if not inverse, patterns are clearly evident in Figure 3.

Figure 3 Omission patterns across groups with linear trend.

Overuse

G1 The results obtained from the weaker group’s production reveal that indefinite nouns were

unconventionally preceded by the definite form 52 times in 385 possible contexts (14%). All

of these instances were non-referential plural/uncountable contexts. Compared with the overuse

of the indefinite article, which was lower than 2%, the overuse of the definite article was

significantly higher (p<0.0001). The definite article was overused considerably more often

than ‘a(n)’ was supplied ungrammatically, whether in plural/uncountable constructions or in

contexts where the definite article should have appeared.

G2 The recorded overuse rate of the definite article dropped to 10% (47 out of 461

indefinite contexts) in the production of the intermediate group, with most instances observed

in generic, non-referential contexts, as is the case in learners’ L1. The ungrammatical supply of

‘a(n)’ with plural and uncountable nouns did not exceed 2.3%, which means that the disparity

between the overuse rates of the two articles was smaller than that found in the

weaker group’s performance.

G3 The most noticeable improvement in learners’ production was the significant and

systematic drop in the overuse rate of the definite article with improved L2 competence. The

definite article was overused in 14 of 236 indefinite NP environments, which reduces the rate

to only 6%. However, the advanced group’s overuse rates of the indefinite article were

slightly higher than those of the two weaker groups, as shown in Table 6.

Table 6 Overuse rates

Group  N   the     %      a(n)   %  p (the : a)
G1     19  52/385  13.51  6/279  2  <0.0001
G2     17  47/461  10.2   8/345  2  <0.0001
G3     12  14/236  5.93   4/172  2  0.0603
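Because Table 6 rounds the indefinite-article rates to whole percentages, the slight rise across groups is not visible in the table itself. As a sketch, the rates can be recomputed from the reported counts at a finer precision. The variable and function names below (`overuse`, `rate`, `rates`) are illustrative and not the author's.

```python
# Overuse counts / possible contexts, copied from Table 6.
overuse = {
    "G1": {"the": (52, 385), "a(n)": (6, 279)},
    "G2": {"the": (47, 461), "a(n)": (8, 345)},
    "G3": {"the": (14, 236), "a(n)": (4, 172)},
}


def rate(errors, contexts, digits=2):
    """Overuse rate as a percentage, rounded to `digits` decimals."""
    return round(100.0 * errors / contexts, digits)


# Percentage rates per group and article, e.g. rates["G1"]["the"].
rates = {
    group: {article: rate(*counts) for article, counts in row.items()}
    for group, row in overuse.items()
}

for group, row in rates.items():
    print(f"{group}: the = {row['the']:.2f}%, a(n) = {row['a(n)']:.2f}%")
```

At two decimal places the rates for ‘a(n)’ rise slightly from G1 to G3 while the rates for ‘the’ fall, which is the contrast described in the text.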

From the table above, it is noticeable that while the overuse of the definite article falls

sharply, overuse of the indefinite article rises slightly, which suggests the onset of a flooding

stage. Figure 4 illustrates the contrast in error trends.

Figure 4 Overuse rates of articles across groups

Replacement

The phenomenon of diverse acquisition patterns is most evident in replacement errors.

Replacement errors constituted 59% of all errors committed in the test, a considerable rate

compared to the total sum of all other errors (41%).

G1 In analysing data entries, it was evident that the definite article was the preferred option,

especially for weaker learners, as it replaced the indefinite article in many [+Count] [+Sing]

contexts. This group overused the definite article to replace the indefinite four times as often

as they did the opposite. The indefinite article replaced the definite in only one instance out

of 111 possible replacement contexts.

G2 The intermediate group made fewer replacement errors. The improvement is also evident

in the narrower gap between the two replacement rates. G2 participants used

‘the’ to replace ‘a(n)’ twice as often as they used ‘a(n)’ to replace ‘the’, an

improvement on the four-fold ratio observed in the production of G1. However,

despite the improvement, intermediate learners still preferred to substitute the indefinite

article with the definite rather than the reverse, while the rate of supplying ‘a(n)’ instead of ‘the’

increased from 0.9% to 1.2%.

G3 At a later learning stage, the higher group’s replacement rates became very close, i.e. the

rate of replacing the-for-a was almost equal to that of replacing

a-for-the, with the indefinite article slightly preferred. A summary of the above results is presented in

Table 7.

Table 7 Replacement errors

Group  N   the for a(n)  %    a(n) for the  %
G1     19  5/106         4.7  1/111         0.9
G2     17  4/116         3.4  2/161         1.2
G3     12  2/64          3.1  4/116         3.4

The inclination to substitute ‘a(n)’ with ‘the’ was reduced with improving PLs while the

production of the indefinite article in definite contexts increased steadily.

The error map of replacement in the learners’ data is most reflective of diverse

acquisition patterns. This is perhaps clearer in the presentation in Figure 5.

Figure 5 Replacement errors

Discussion

Accuracy

The accuracy rates of the weaker group were higher than those reported by studies on learners

of –ART L1s (cf. Butler 2002, Ekiert 2004, Master 1997, Trenkić 2002), which confirms

propositions of stronger L1 influence at lower L2 levels. This can be construed as positive

transfer of L1 semantic properties to the L2 as both languages concord on most conditions for

obligatory supply. The lower accuracy scores of the indefinite article resulting from little

production or erroneous use also suggest stronger L1 influence at earlier stages. G1 learners

seem not to have internalised the rules governing the use of the indefinite article to

automatically supply it where necessary. It is not surprising that G2 learners still performed

better on the definite article despite the improvement in PL. This type of test better reflects

implicit knowledge, in which a feature with a semantic equivalent in the participants’ L1 is

more accessible than the indefinite article, which is not readily available in the learners’

subconscious knowledge and perhaps requires direct prompts to activate the newly learned

L2 form. G3’s higher PL is reflected in the accuracy rates of the indefinite article, which

approach and then exceed those of the definite. Although the difference between the accuracy

rates of the two articles in G3 is small and not statistically significant, it points to a change

of trend (see Figure 1). Thus, we can assume that with stronger L2 ability, learners’ mastery

of the two articles becomes more comparable.

Omission

Because the task focused on expressing thoughts and describing locations and attractions, and

the rubric gave no hint of the purpose of the test, this type of test was expected to

yield a high number of omissions. This lends support to Granfeldt’s (2000)

observation that accuracy will decrease if learners’ attentional resources (Bialystok and Ryan

1985) are channelled towards goals other than accuracy.

G1 participants’ omission rates of the indefinite article were significantly higher than

those of the definite. The failure to provide the indefinite article can also be driven by

learners’ assumption that its absence does not constitute a hindrance to successful

communication of ideas. It is likely that weaker learners have subconsciously applied the

Economy Principle (Poulisse 1997) whereby maximal comprehensibility is achieved while

exerting minimal processing effort. G2 learners might have also found it redundant to mark

nominals overtly for indefiniteness if their [-DEF] value is readily inferred by the absence of

the definite marker. However, lower omission rates suggest that G2 learners have become

more aware of the conditions of indefinite article employment while beginning to realise the

limitations of the definite article to specific environments rather than its generalising function

in Arabic. Since free composition better reflects subconscious knowledge, lower omissions

and higher production of ‘a(n)’ indicate that G3 learners’ command of the indefinite article

has become sufficiently internalised to be produced spontaneously in communicative output.

Although the disparity in the omission rates of the two articles was not significant in

G3’s results, the switch in tendency is quite clear. While the weaker and intermediate learners

omitted the indefinite article more frequently than the definite, the advanced group were more

aware of the necessity to provide ‘a(n)’ and at the same time reduced their provision of the

definite even in obligatory contexts. This is consistent with the findings of researchers such as

Chaudron and Parker (1990), Cziko (1986), Ekiert (2004) and Habuto (2000).

Overuse

The overuse errors made by the weaker group were lower than originally expected. A

possible rationale for this is that free production tests are known to yield lower overuse rates

(see Tarone and Parrish 1988), since learners are not directed to provide a particular form,

a direction which is known to encourage overuse in cloze tests (see Note 3). While the weaker group

overwhelmingly preferred the definite article, this was less noticeable in G2’s production.

The decreased difference between the overuse rates of the two articles marks a change in

learners’ underlying hypotheses on article use and indicates fluctuation characteristic of their

IL stage. This type of overuse is typical of what Richards (1971) refers to as partial

understanding of target language features. The significantly lower overuse rates of the

indefinite article compared to those of the definite in G1 and G2 production may not be entirely

due to learners’ developed awareness of article use. Instead, they could well be attributed to task

type and L1 transfer.

The increase in overuse errors of ‘a(n)’ by the advanced group could be interpreted as

a form of regression, but it could also be a result of hyper-correction as learners try to avoid

omission errors committed during past learning experience, over-applying instructions to

produce ‘a(n)’, which leads to a flooding stage similar to the one observed in definite article

use. Richards (1976) maintains that failure to observe restrictions of countability and number

in article use may be due to faulty analogy. In many cases, the analogy is derived from

formulaic expressions learned as chunks in existential and have constructions memorised at

earlier stages and incorrectly overgeneralised.

Replacement

The reason underlying the preference of G1 to replace the indefinite article with the definite is

mainly developmental, through flooding and avoidance, but also involves L1 influence in the

absence of an explicit marker of indefiniteness in L1. Although both rates of replacement

errors are very low in G2, what emerges at this stage is an obvious change of trend

from that observed in the production of the weaker group. G3’s preference for the indefinite

article as a replacement for the definite is probably a result of learners’ recently increased awareness of

the importance of supplying the indefinite article. Moreover, this result could have been

equally influenced by the receding influence of L1 represented by the drop in the overuse

rates of ‘the’ before singular indefinite nouns since the use of the definite singular to deliver

generic reference is highly recurrent in Arabic. Although this use is acceptable in certain

expressions in English (e.g., She plays the piano), it is not likely that learners have been

sufficiently exposed to authentic material to the extent that would enable them to detect

similar uses and employ them unprompted. If we suppose that, in marking indefiniteness,

Arabic is an –ART language, then G3’s understanding of the indefinite article corresponds to

that of Leung’s (2001) Japanese (–ART) learners who preferred a-for-the more often than

the-for-a.

This suggests that Arab learners experience a mapping problem of ‘a(n)’ into IL

grammar, which is more in line with the performance of Japanese, Chinese and Korean

learners (–ART) than with that of the Spanish and Romanian groups in Snape et al.’s (2006) and

Zdorenko and Paradis’s (2008) studies.

Implications

The results of this study show that second language development is neither homogeneous nor

simultaneous. The advancement in one aspect of L2 knowledge does not imply an identical level

of achievement in another. Rather, there is evidence for a complex, non-linear and sometimes

inverse progression, guided by multiple factors such as proficiency level, first language, task-

type and the processing demands of each linguistic feature.

The developmental patterns of the two articles are divergent in both accuracy rates and

error types. The learning curve seems to start with higher awareness and a better supply of a

feature which already exists in the L1 (the definite article), but with improved PLs and

reduced L1 influence, the trend gradually shifts towards a better conceptualisation, and

therefore a higher production, of the newly acquired feature (the indefinite article). Error

patterns are also converse. Learners begin by overproviding the definite article in non-

referential contexts, and gradually reduce production until it is undersupplied in obligatory

contexts at later developmental stages. In contrast, the overuse of the indefinite article is

scarce in the production of weaker learners, yet with overall L2 progress, rates exceeded

those of the definite.

A mirror image of the above pattern is observed in omission errors, as high rates of

indefinite article omissions were observed in early stages. With better PLs, the rates fell

considerably. Although the definite article was properly supplied in obligatory contexts at

elementary levels scoring very low omission rates, the error increased in the production of

more able groups leading to higher omissions. A diverse progression map is also detected in

replacement errors as participants started with higher the-for-a rates but ended with greater

a-for-the substitutions. The switch of preferences from ‘the’ to ‘a(n)’ reflects the regular and

systematic move from limited, L1 influenced use towards more target-like, internalised

knowledge.

It is worth mentioning that if occurrences of the indefinite article within formulaic

expressions had been excluded from our calculations, since they are mostly memorised and not

automatically produced in corresponding contexts, the contrast between the rates would have

been sharper. It is therefore safe to propose that articles not only develop independently of

one another but can also progress in diverse directions.

References

Abi Samra, N. (2003). An analysis of errors in Arabic speakers’ English writings. American

University of Beirut. Retrieved 25 October, 2005 from

http://abisamra03.tripod.com/nada/languageacq-erroranalysis.html

Al-Fotih, T. A. (2003). Acquisition of the English articles by Arabic-speaking students.

Indian Linguistics, 64, 157-174.

Bataineh, R. F. (2005). Jordanian undergraduate EFL students’ errors in the use of the

indefinite article. Asian EFL Journal, 7(1), 56-76.

Behrens, L. (2005). Genericity from a cross-linguistic perspective. Linguistics, 43(2), 275–

344.

Bialystok, E. and E. B. Ryan. (1985). A metacognitive framework for the development of

first and second language skills. In D. L. Forrest-Pressley, G. E. Mackinnon, and T. G.

Waller (Eds.), Metacognition, cognition, and human performance: Vol. 1. Theoretical

perspectives (pp. 207-252). San Diego, CA: Academic Press.

Butler, Y. G. (2002). Second language learners’ theories on the use of English articles: An

analysis of the metalinguistic knowledge used by Japanese students in acquiring the

English article system. Studies in Second Language Acquisition, 24(3), 451-480.

Celce-Murcia, M. and D. Larsen-Freeman. (1999). The Grammar Book. Los Gatos: Sky Oaks

Production.

Chaudron, C. and K. Parker. (1990). Discourse markedness and structural markedness: The

acquisition of English noun phrases. Studies in Second Language Acquisition, 12(1), 43–

64.

Crompton, P. (2011). Article errors in the English writing of advanced L1 Arabic learners:

The role of transfer. Asian EFL Journal, 50, 4-32.

Cziko, G. (1986). Testing the language bioprogram hypothesis: A review of children’s acquisition of

articles. Language, 62, 878-898.

DeKeyser, R. M. (2003). Implicit and explicit learning. In C. J. Doughty and M. H. Long

(Eds.), The Handbook of second language acquisition (pp. 313-348). Malden, MA:

Blackwell.

Diab, N. (1996). The transfer of Arabic in the English writings of Lebanese students. The

ESPecialist, 18(1), 71-83.

Doughty, C. J. (2003). Instructed SLA: Constraints, compensation, and enhancement. In C. J.

Doughty and M. H. Long (Eds.), The Handbook of Second Language Acquisition. (pp.

256-310). Malden, MA: Blackwell.

Dulay, H., M. Burt, and S. Krashen. (1982). Language Two. New York: Oxford University

Press.

Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning,

27(2), 315-330.

Ekiert, M. (2004). Acquisition of the English article system by speakers of Polish in ESL and

EFL settings. Columbia University Working Papers in TESOL and Applied Linguistics,

4(1), 1-23.

Ellis, R. (2001). Investigating form-focused instruction. Language Learning, 51(1), 1–46.

Foster, P. and P. Skehan. (1996). The influence of planning and task type on second language

performance. Studies in Second Language Acquisition, 18(3), 299-323.

Garcia Mayo, M. P. (2008). The acquisition of four nongeneric uses of the article the by

Spanish EFL learners. System, 36, 550–565.

Goad, H. and L. White. (2004). Ultimate attainment of L2 inflection: Effects of L1 prosodic

structure. European Second Language Association Yearbook, 4 (pp. 119-145). John

Benjamins.

Granfeldt, J. (2000). The acquisition of the determiner phrase in bilingual and second

language French. Bilingualism: Language and Cognition, 3, 263-280.

Habash, Z. (1982). Common errors in the use of English prepositions in the written work of

UNRWA students at the end of the preparatory cycle in the Jerusalem area. Retrieved 3

July, 2006 from http://www.zeinab-habash.ws/education/books/master.pdf

Habuto, J. (2000). Comprehensible output hypothesis: Study of Japanese ESL students and

the acquisition of the English article system. In Moroishi, M. (Ed.), Classroom Second

Language Acquisition. FLL679S.

Hakuta, K. (1976). A case study of a Japanese child learning English as a second language.

Language Learning, 26, 321-351.

Hamdallah, R. (1988). Syntactic errors in written English: Study of errors made by Arab

students of English. Unpublished doctoral dissertation. Lancaster University, UK.

Hawas, H. M. (1989). The articles in English and Arabic: A contrastive study. Indian Journal

of Applied Linguistics, 15(2), 23-52.

Hawkins, J. A. (1978). Definiteness and indefiniteness. London: Croom Helm.

Hawkins, R. (2004). Explaining full and partial success in the acquisition of second language

grammatical properties. Paper presented at J-SLA, Gunma Prefectural Women’s

University, Gunma, Japan.

Hawkins, R. and Y. Chan. (1997). The partial availability of universal grammar in second

language acquisition: The failed functional features hypothesis. Second Language

Research, 13(3), 187–226.

Haznedar, B. (2001). The acquisition of the IP system in child L2 English. Studies in Second

Language Acquisition, 23(1), 1–39.

Huebner, T. (1983). A longitudinal analysis of the acquisition of English. Ann Arbor.

Kellerman, E. (1977). Towards a characterization of the strategies of transfer in second

language learning. Interlanguage Studies Bulletin, 2, 58-145.

Kharma, N. (1981). Analysis of the errors committed by Arab university students in the use

of the English definite/indefinite articles. International Review of Applied Linguistics, 19,

331-345.

Kharma, N. and A. Hajjaj. (1989). Errors in English among Arabic speakers: Analysis and

remedy. London: Longman Group UK Limited.

Kremers, J. M. (2003). The Arabic noun phrase. LOT: The Netherlands.

Larsen-Freeman, D. (1997). Chaos/complexity science and second language acquisition.

Applied Linguistics, 18(2), 141-165.

Leech, G. (1992). Introducing English grammar. London: Penguin.

Lightbown, P. M. and N. Spada. (1999). How Languages are Learned (2nd ed.). Oxford:

Oxford University Press.

Lightfoot, A. R. (1998). Japanese second-language learners and the English article system: A

study in error analysis. University of Leeds. Retrieved 6 November, 2008 from

http://ardle.net/linguistics.html

Liu, D. and J. I. Gleason. (2002). Acquisition of the article the by nonnative speakers of

English: An analysis of four nongeneric uses. Studies in Second Language Acquisition,

24(1), 1-26.

Lyons, C. (1999). Definiteness. Cambridge Textbooks in Linguistics. Cambridge University

Press.

Maalej, Z. (2004). On the misuse of determination in Arab students’ writing. University of

Manouba-Tunis. Retrieved 18 February, 2006 from www.executivetranslators.com

Master, P. (1987). A cross-linguistic interlanguage analysis of the acquisition of articles.

Unpublished doctoral dissertation. University of California, Los Angeles.

Master, P. (1990). Teaching the English articles as a binary system. TESOL Quarterly, 24,

461–478.

Master, P. (1997). The English article system: acquisition, function, and pedagogy. System,

25, 215-232.

Mizuno, H. (1985). A psycholinguistic approach to the article system in English. JACET

Bulletin, 16, 1-29.

Nagata, R., T. Iguchi, K. Wakidera, F. Masui and A. Kawai. (2005). Recognizing article

errors in the writing of Japanese learners of English. Systems and Computers in Japan,

36(7), 54-62.

Oller, J. W. and E. Z. Redding. (1971). Article usage and other language skills. Language

Learning, 21(1), 85-95.

Parrish, B. (1987). A new look at methodologies in the study of article acquisition for learners

of ESL. Language Learning, 37, 361–383.

Pica, T. (1983). Adult acquisition of English as a second language under different conditions

of exposure. Language Learning, 33, 465-97.

Poulisse, N. (1997). Some words in defense of the psycholinguistic approach: a response to

Firth and Wagner. The Modern Language Journal, 81(3), 324-328.

Power, T. (2003). Communicative language teaching: The appeal and poverty of

communicative language teaching. Retrieved 15 June, 2007 from

http://www.btinternet.com/~ted.power/esl0404.html

Prévost, P. and L. White. (2000). Missing surface inflection or impairment in second

language acquisition? Evidence from tense and agreement. Second Language Research,

16, 103-133.

Raymond, W., J. A. Fisher, and A. F. Healy. (2002). Linguistic knowledge and language

performance in English article variant preference. Language and Cognitive Processes,

17(6), 613–662.

Richards, J. C. (1971). A non-contrastive approach to error analysis. English Language

Teaching Journal, 25, 204-19.

Richards, J. C. (1976). The role of vocabulary teaching. TESOL Quarterly, 10(1), 77-89.

Ringbom, H. (1987). The role of the first language in foreign language learning. Clevedon,

UK: Multilingual Matters.

Robinson, P. (1996). Learning simple and complex second language rules under implicit,

incidental, rule-search, and instructed conditions. Studies in Second Language Acquisition,

18(1), 27–67.

Sharma, D. (2005). Transfer and universals in Indian English article use. Studies in Second

Language Acquisition, 27(4), 535-566.

Skehan, P. (1989). Language testing. Language Teaching, 22, 1-13.

Slabakova, R. (2000). L1 transfer revisited: the L2 acquisition of telicity marking in English

by Spanish and Bulgarian native speakers. Linguistics, 38(4), 739-770.

Smith, B. (2001). Learner English: A teacher’s guide to interference and other problems.

Cambridge: Cambridge University Press.

Snape, N. (2005). The uses of articles in L2 English by Japanese and Spanish speakers. Paper

submitted to the annual conference on language acquisition. Essex Graduate Student

Papers in Language and Linguistics, 7, (pp. 1-23).

Snape, N., Y. I. Leung and H-C. Ting. (2006). Comparing Chinese, Japanese and Spanish

speakers in L2 English article acquisition: evidence against the fluctuation hypothesis. In

M. Grantham O’Brien, C. Shea, and J. Archibald (Eds.), Proceedings of the 8th Generative

Approaches to Second Language Acquisition Conference (pp. 132-139). Somerville, MA:

Cascadilla Proceedings Project.

Sorace, A. (1996). The use of acceptability judgments in second language acquisition

research. In W. Ritchie and T. Bhatia (Eds.), Handbook of Second Language Acquisition

(pp. 375–409). San Diego, CA: Academic Press.

Tarone, E. and B. Parrish. (1988). Task-related variation in interlanguage: the case of articles.

Language Learning, 38, 21-44.

Thomas, M. (1989). The acquisition of English articles by first- and second-language

learners. Applied Psycholinguistics, 10, 335-355.

Trenkić, D. (2000). The acquisition of English articles by Serbian speakers. Unpublished

doctoral dissertation. University of Cambridge.

Trenkić, D. (2002). Form-meaning connections in the acquisition of English articles. In

Foster Cohen, S., T. Ruthenberg and M. Poschen (Eds.), European Second Language

Association Yearbook, 2 (pp. 115-133). Amsterdam: John Benjamins.

Tsimpli, I. M. (2003). Clitics and determiners in L2 Greek. In J. M. Liceras, H. Zobl and H.

Goodluck (Eds.), Proceedings of the 6th Generative Approaches to Second Language

Acquisition Conference (pp. 331-339). Somerville, MA: Cascadilla Proceedings Project.

VanPatten, B. (1994). Cognitive aspects of input processing in second language acquisition.

In P. Heshemipour, I. Maldonado, and M. Van Naerssen (Eds.), Festschrift in honour of

Tracy D. Terrill (pp. 170-183). New York: McGraw-Hill.

Yamada, J. and N. Matsuura. (1982). The use of the English article among Japanese students.

RELC Journal, 13, 50-63.

Young, R. (1996). Form-function relations in articles in English interlanguage. In R. Bayley

and D. R. Preston (Eds.), Second language acquisition and linguistic variation (pp. 135-

175). Amsterdam: John Benjamins.

Zdorenko, T. and J. Paradis. (2008). The acquisition of articles in child second language

English: Fluctuation, transfer or both? Second Language Research, 24(2), 227-250.

Notes

1. Some researchers (e.g., Master 1987) consider bare nouns as marked with a ‘zero article’.

2. The number in brackets reflects the student’s serial number, her PL group (A/B/C), and the

ordinal number of the NP in the essay.

3. For task type effect on L2 production, see Foster and Skehan (1996).

The Use of Third-Person Pronouns by Native and Non-Native Speakers of English

Ibrahim M. R. Al-Shaer Al-Quds Open University

Bioprofile:

Dr Ibrahim Al-Shaer has 23 years of experience in higher education. He spent his first 7 years

of professional experience teaching different English language and linguistics courses at

several universities. He was also the Director of Al-Quds Open University in Bethlehem for

10 years. He is currently the President Assistant for Innovation and Excellence.

Dr Al-Shaer obtained a Bachelor of Arts in English language and a Diploma in secondary

education in 1986 from Bethlehem University. He is a recipient of a 1989 scholarship from

the British Council, to study for a Master of Linguistics for ELT at Lancaster University. Dr

Al-Shaer is also a recipient of a 1998 scholarship from ASAI in conjunction with Al-Quds

Open University to study for a Ph.D. in Applied Linguistics from the University of Reading.

Dr Al-Shaer’s main research interests are in the fields of psycholinguistics, construction

grammar, semantics, syntax, ELT applications, writing skill, corpus linguistics, e-learning,

innovation, and creativity.

Abstract

This study addresses research questions concerning the use of third-person pronouns by

native and non-native speakers of English. For this purpose, a corpus-based analysis of these

pronouns in naturally-occurring data was carried out, highlighting the different constraints

that cause writers to choose one pronoun over another. Then, thirteen sentences with tricky

third-person pronouns taken from the IBM-Lancaster Associated Press corpus were presented

in writing to two groups of native and non-native speakers of English. The results indicated

that most native speakers choose third-person pronouns depending on the socio-cultural

context and pragmatic factors, showing an inclination to bend the formal rule of pronoun-

antecedent agreement. However, the majority of non-native speakers had a tendency to abide

by the prescriptive rule of pronoun-antecedent agreement, showing little or no sensitivity to

context. The study concluded that pronoun-antecedent agreement has proven to be an area

where it is difficult to draw a line between standard and non-standard usage.

Keywords: third-person pronouns, cohesive devices, native speakers, non-native speakers,

pragmatic constraints

Introduction

Traditionally speaking, pronouns are simply defined as words used instead of a noun or a

noun phrase to avoid repetition. Quirk et al. (1985) have defined pronouns in English as

‘noun-like’ items that differ from nouns in having distinct forms for case, person,

number, and gender. Fromkin et al. (2007) have described

them as substantives whose interpretation depends on syntax and context.

Standard English grammar provides the reader with the prescriptive rule that ‘a

pronoun must agree with its antecedent for person, number, and gender’ (Kroeger 2005: 138).

When the gender of an antecedent is unspecified, as with student, nurse, everyone, standard

grammar states that the default pronoun to be employed is the masculine one. According to

the Chicago Manual of Style (2010), this approach is no longer acceptable as it is taken to be

outdated and sexist. As such, other approaches are adopted in an attempt to offer a

‘gender-neutral’ resolution, as in (1).

(1) a. A student must do his/her homework.

b. A student must do their homework.

But some people find repeating his or her throughout a long piece of writing irritating and

others find using plural pronouns in such contexts ungrammatical. For example, Mangan

(2010) has asked for a gender-neutral third-person singular pronoun. Einsohn has gone even

further, saying that ‘the newer grammar books recommend using the plural pronoun after an

indefinite subject’ (2011: 361).

Third-person pronouns are the only class of pronouns which are inherently cohesive,

in that a third-person pronoun form typically refers anaphorically or cataphorically to another

item in the text. By contrast, first- and second-person forms do not normally refer to the text

at all; their referents are defined by the speaker and hearer speech roles and are normally

interpreted exophorically by reference to the situation. A third-person form implies the

presence of a referent somewhere in the text, and in the absence of such a referent the text

appears incomplete.

Third-person pronouns are very important for the semantic interpretation of texts

because they contribute to cohesion. The concept of cohesion is a semantic one referring to

the relations of meaning that exist within a text that define it as a text (Halliday and Hasan

1976). As such, cohesion is not a structural relation; although cohesion relations could be in

the same sentence, they are not restricted by sentence boundaries. In its most normal form, it

is simply the presupposition of something that has been mentioned somewhere in the text

(endophora), whether in the preceding sentences (anaphora) or in the following ones

(cataphora). In addition, third-person pronouns may sometimes co-refer with entities which

cannot be found in the text itself but in the extralinguistic context (exophora) (Quirk et al.

1985).

According to Wilson (1990, cited in Partington 2003), the first-person pronoun we can

be used by politicians in their strategies either inclusively to convey solidarity or exclusively

to stress joint responsibility. Clearly, there is more to pronouns than the simple formal

definition which describes them as words used instead of nouns that must agree with their

referents in gender and number. Pronouns can reflect language users' attitudes and social

orientations. As Curzan has put it:

[P]ronoun selection depends on speaker attitudes and involvement as well as

cultural prototypes [and] all of these factors in turn rest on the same

foundation: the concepts of sex and gender held by language users and the

society in which they express themselves. (2003: 29)

In the same vein, Gocheco (2012: 5) has claimed that ‘pronouns, among other linguistic features, can shed light on how participants project themselves and how they express associations with others’.

Following this brief introduction, the remainder of this paper is presented in four sections.

Section two outlines the general research methodology. Section three offers the results of the

corpus-based analysis of the behavior of third-person pronouns in journalistic texts and

presents the elicited performance of native and non-native speakers in a set of sentences

selected from Associated Press news articles as compared with their syntactic, semantic and

pragmatic behavior in the data. Section four offers a discussion of the results. Section five

gives a brief summary of the conclusions drawn from this empirical work.

Statement of the Problem

The researcher’s students, as EFL learners, often complain that they are confused by pronoun-antecedent agreement when interacting with native speakers. For instance, one student complained that she did not know whether generic he is still used in present-day English as inclusive of she, or as an acceptable choice to refer to generic antecedents like someone or dual-gender words like student. She wanted to know whether the coordinate construction he or she irritates native speakers, and whether the plural they is acceptable to refer to individuals of unknown gender. When the researcher approached a native speaker of American English for advice, she replied: ‘Who knows exactly what pronoun to use anymore!’ This puts a greater burden on non-native teachers who have limited exposure to English, as non-native learners of English, especially beginners, need explicit rules to learn the language; otherwise, they will be lost. Given this challenge, the current study attempts to provide insights into the reality of pronoun agreement and the challenges it poses to both native and non-native speakers.

Research Objectives and Questions

Since the great bulk of linguistic arguments and conclusions are currently derived from reliable evidence stemming from gauging native speakers’ natural performance, from spoken or written corpora, or from contrastive studies, this paper is concerned with the natural function and behavior of third-person pronouns (he, she, it, they) as cohesive elements in naturally-occurring data rather than with their rigid theoretical features. It then offers a comparative analysis of the spontaneous choices and preferences of a group of American native English speakers and those of a group of non-native Palestinian speakers on the same items extracted from Associated Press articles.

More specifically, the current study can contribute by providing some answers to the

following questions:

1. What grammatical, textual, and extralinguistic factors constrain co-reference in

American English journalistic texts?

2. Are there group differences between native and non-native speakers of English on the

use of third-person pronouns?

3. To what extent do the factors of gender and age play a role in shaping language users’ choice of one pronoun over another?

Methods

The data used in this study come from two sources, which are combined to generate insights concerning third-person pronoun-antecedent agreement. The first source was a collection of examples taken from the IBM-Lancaster Associated Press corpus (A001 – A010), consisting of some one million words of tagged 1970s American Press material. The second source was a survey of native and non-native speakers’ performance.

Corpus-based analysis

For the purpose of this analysis, the Associated Press was selected for its prestigious character and because the topics it deals with are of interest to international audiences, although they are directed at the general American public.

In this analysis, the frequency distributions of the various types of cohesive devices

were presented and examined. Then, aspects of usage in the corpora that required the choice

of a given pronoun were identified and described. All examples were manually processed and

systematically classified in order to identify the environments in which the writer chose one

pronoun over another.

Survey of native and non-native speakers’ performance

Instrument

The second source of data is a survey of 40 native and 40 non-native speakers’ usage of pronoun-antecedent agreement in a selected set of sentences mostly taken from the Associated Press corpus. As shown in the Appendix, the survey consists of two parts. In the first part, the participants were asked to fill in each blank space with an appropriate third-person pronoun to complete the sentence, and in the second part they were asked to mark their preferred choice: either a passive construction with the third-person singular neuter it as its subject or an active construction.

Participants

The survey involved two groups of participants. The first consisted of native speakers with no background in linguistics. Since the tested materials were assumed to be so basic and universal that they could be generalized beyond the given sample, a non-probability sampling method, snowball sampling, was used. Snowballing allowed for locating information-rich key informants. The first wave of participants was given selection criteria (e.g., age, gender, and no background in linguistics) that helped ‘randomize’ the sampling process; they were also asked to recommend, for the second wave, potential participants who lived the farthest away. This sampling was not a stand-alone tool; it was simply a way of selecting participants before the survey was conducted.

The 40 native participants were selected from the US states of Kansas and Missouri.

The median age was 35 and ages ranged from 18 to 65. The second group consisted of non-

native speaker participants. They were third-year English majors studying at Al-Quds Open

University who are native speakers of Arabic. Their median age was 24, and ages ranged

from 18 to 36 years.

All participants were instructed to base their responses solely on their immediate

reactions, without worrying too much about any rules they might have learnt about so-called

‘correct’ English. Respondents needed approximately ten minutes to complete the survey.

Findings

Corpus-based analysis

The main concern of this paper is the use of third-person pronouns as cohesive elements.

However, the existence of other cohesive devices in the corpus affects the frequency

distribution of these pronouns. It is therefore useful to give a sense of how frequently such

cohesive devices occur in comparison with the referential uses of third-person pronouns

(Table 1). In this respect, Halliday has said:

Continuity may be established in a text by the choice of words. This may

take the form of word repetition; or the choice of a word that is related in

some way to a previous one—either semantically, such that the two are in

the broadest sense synonymous, or collocationally, such that the two have a

more than ordinary tendency to co-occur. (1985: 289)

Table 1 Distribution of cohesive devices in the corpus

Cohesive devices                          Number   Percentage   Total
Lexical devices   Specific to general         96        15%
                  General to specific         25         4%
                  Repetition                 364        57%
                  Substitution               153        23%
                  Ellipsis                     5         1%     643 = 50.5%
Reference         Cataphoric                   8       1.5%
                  Anaphoric                  576        91%
                  Undecided                   47       7.5%     631 = 49.5%
Grand total                                 1274       100%     1274

Starting with repetition, the data show that the use of this device has an important role in

journalistic language. In the data, there are 364 instances of repetition out of a grand total of

1274 different cohesive elements (see Table 1). In many texts, some nouns or noun phrases

are continuously repeated many times, though it is possible to use other cohesive devices in

the same places. For instance, in A001 127/ 128/ 129/ 130, the noun phrase the offender is

repeated four times. This phenomenon has one possible interpretation: the writer might have

found it safer to avoid the dilemma of choosing a pronoun appropriate to the situation.

Another case of repetition appears when the text is condensed with many nouns and

noun phrases. For instance, in A 008 43-50, which is a very short sports report about

basketball, repetition is the only cohesive device used throughout the whole text. A potential

reason for this is that as there are nine noun phrases (teams and players), the writer had to

repeat every noun to avoid confusion on the part of the reader.

The next partially lexical cohesive device used in the data is substitution. What

distinguishes this device from pronominal substitution is that it operates on both the syntactic

and the semantic levels. In other words, grammatically speaking, the substitution element has

to match its referent in terms of syntactic features (especially word class); and lexically, the

substitution element helps produce more coherent texts and solves the problem of intensive

use of other types of cohesive device.

As shown in Table 1 above, lexical substitution is the next most frequent cohesive device after repetition. Of the 1274 cohesive devices identified, 274 are substitution cases (general to specific = 25, specific to general = 96, synonyms or others = 153).
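The arithmetic behind Table 1 can be rechecked with a short script. What follows is a minimal sketch in Python and is not part of the original analysis: the counts are copied from Table 1, while the dictionary layout and the function names (group_totals, shares) are assumptions made purely for illustration. Because the table reports rounded figures, the recomputed percentages may differ slightly from the printed ones.

```python
# Counts copied from Table 1. The layout and all names are illustrative
# assumptions, not part of the study itself.
TABLE_1 = {
    "lexical": {
        "specific_to_general": 96,
        "general_to_specific": 25,
        "repetition": 364,
        "substitution": 153,  # glossed in the text as "synonyms or others"
        "ellipsis": 5,
    },
    "reference": {
        "cataphoric": 8,
        "anaphoric": 576,
        "undecided": 47,
    },
}


def group_totals(table):
    """Sum the counts inside each group of cohesive devices."""
    return {group: sum(counts.values()) for group, counts in table.items()}


def shares(counts):
    """Percentage of each device within its own group, to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


totals = group_totals(TABLE_1)
grand_total = sum(totals.values())

# The three lexical substitution sub-classes that the text adds up.
substitution_cases = sum(
    TABLE_1["lexical"][key]
    for key in ("general_to_specific", "specific_to_general", "substitution")
)

if __name__ == "__main__":
    print("Group totals:", totals, "Grand total:", grand_total)
    print("Substitution cases:", substitution_cases)
    for group, counts in TABLE_1.items():
        print(group, shares(counts))
```

Keeping the raw counts separate from the derived percentages makes it easy to see which figures are observed and which are computed from them.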

To begin with the first sub-class, general to specific: in A 004 1 (A man opened fire with a 22 caliber rifle …), then in A 004 3 (The man was subdued by bar patrons…..), and finally in A 004 5 (A deputy at El Paso County Jail said Barry Chvarak 21…), the writer starts the report with a general noun accompanied by the indefinite article a, continues with the same noun with the definite article the, and finally replaces it with the man's name (a man » the man » Barry Chvarak).

The second substitution sub-class is from specific to general. For example, in A 001

45 (A dormitory fire at the University of Northern Colorado that sent hundreds of students

scurrying from the building on Saturday…), this long noun phrase modified by the relative

clause is used to describe – in a very specific way – the word fire, which is replaced later by

another synonym (blaze) in A 001 47 (… just after the blaze was discovered about 3.00 a.m.

…).

The third sub-division of lexical substitution is the use of synonyms. A 005 52-55 (The pact approved by the Association's executive… I think this contract is a major step forward… Under the agreement, the players would receive an increase…) offers a good example of a series of synonyms (the pact » this contract » the agreement) that refer to the same idea.

The last sub-class is lexical ellipsis. Halliday and Hasan (1976: 142) have argued that ‘the starting point of the discussion of ellipsis can be the familiar notion that it is “something left unsaid” … is used in the special sense of “going without saying”’.

In the data, only 5 cases of lexical ellipsis have been identified. For instance, in A 005 102-103 (Police said the impact split the car in half…One of the passengers, Kern Jones of Cushing, remained in critical condition…), the phrase One of the passengers could have been written in a more explicit way (e.g., One of the car’s passengers). However, the omission of the word car, in the researcher’s opinion, does not change the meaning or cause any confusion because the semantic relation between the words passenger and car is very strong in this context.

Last but not least, an important characteristic of the data in this study is the use of

referential chains which are produced by the combination of lexical cohesion (repetition and

synonyms) and reference. A typical referential chain in the data can be found in A 004 1-21,

in which the noun man (A 004 1) is gradually replaced by the man (A 004 8), Chvarak (A

004 8), he (A 004 12), the guy and his brother (A 004 13), he (A 004 14), he and his vehicle

(A 004 15), the suspect (A 004 21).

As for the referential functions of third-person pronouns, as presented in Table 1

above, despite the significance of other reference tools and lexical devices in journalistic

discourse, the study reveals the predominance of anaphoric reference in the data. In A 001 –

A 010, out of 631 referential occurrences of the third-person pronouns, there are 576

anaphoric cases. Taking one of these cases in A 009 10 (The Whalers have been playing their

home games in the Springfield...), the possessive pronoun their refers anaphorically to the

noun Whalers.

With regard to cataphoric reference, in A 003 19 Although it has expressed support

for holding the Games, the German Olympic committee has taken no final stand …, the

pronoun it cataphorically refers to the noun phrase the German Olympic committee which is

introduced later in the text.

Table 1 above indicates the low occurrence of cataphora in the journalistic texts; only

8 of the 631 referential uses of third-person pronouns are cataphoric. This is not surprising as

cataphora is generally uncommon. However, a possible specific interpretation of this is that

because journalistic writing is usually meant to be more straightforward and accessible to

ordinary readers than other types of writing, journalists might try to use the simplest

referential devices and avoid more difficult ones, such as cataphora.

A point to be added here is that, where the cataphoric reference occurs, the anaphoric

one is possible as well. In other words, ‘we can equate two synonymous sentences in which

the positions of the pronoun and antecedent can be reversed’ (Quirk et al. 1985: 351). As

such, the above example can be easily transformed from cataphora into anaphora, as in:

Although the German Olympic committee has expressed support for holding the Games, it

has taken no final stand...

Moreover, the data reveal that the linguistic choices that writers tend to make may

affect the semantic and stylistic interpretation of a text. On the one hand, in A 010 125 (He

refused to take questions and returned inside), the writer did not repeat the pronoun he after

the conjunction and, though s/he could have done so. On the other hand, in A 007 107

(Taylor played for the Denver Broncos and Houston Oilers when those clubs were part of the

American Football League, and he coached for seven seasons…), the writer does repeat he.

The researcher’s interpretation of these two cases is that, in the first, there is only one referent

and so the writer may have found it redundant to use he again. In the second case, the

repetition of the pronoun he helps the reader easily identify the right referent because of the

long distance between the pronoun and its antecedent.

There is another special case where reference is both anaphoric and cataphoric at the same time. For example, in A 010 11 (They were: Suffolk, Queens, Brooklyn, Nassau, Erie, and Manhattan), the third-person pronoun they co-refers backward with an antecedent in the previous sentence, A 010 10 (Six countries accounted for 38.6 percent of all traffic related deaths…), as well as forward with the items (Suffolk, Queens, …) that come after the pronoun in the same sentence. This is a case of copular relationship, which is not the same as cohesion, as it is a structural relationship.

All the cases discussed thus far involve specific reference; the pronoun has a definite

referent somewhere in the text. This implies that the data do not have any third person

pronouns with generic reference, i.e. there is no co-reference with any unspecified entity such

as people, animals, plants, etc.

The use of these various types of lexical substitutions appears to depend on the nature

of this genre. The fact that journalistic writing is intended for ordinary readers imposes some

constraints on the way information is presented. On the one hand, journalists tend to take into

account the tastes and needs of those who may get irritated by the intensive use of repetition,

confusion, ellipsis, or even pronominal reference and difficult synonyms. This is the device

known as elegant variation (Fowler 1965). On the other hand, they avoid the tricky pronouns

(e.g., cataphora, generic masculine he, and plural they referring to singular referents with

unspecified gender), and use, for example, repetition or substitution — which proved to be

the most frequently used devices in the data. These techniques spare their readers potential

ambiguity or complexity.

The analysis above reveals that the traditional prescriptive rule that antecedents must

take gender- and number-matched pronouns is not highly respected. In the data, the plural

their is used to refer to the singular antecedent anyone in one case and to the people of

Canada in another. The pronoun he is used instead of the pronoun it to co-refer with the

animal horse. Moreover, the third-person singular neuter it is used as a subject of a passive

construction where the more straightforward active construction could have been used. These

interesting cases and others break the prescriptive requirements for the use of third-person

singular pronouns. Clearly, much still remains to be done to clarify how this affects native

and non-native speakers’ usage.

Survey of native and non-native speakers’ performance

This section offers a comparison of native and non-native speakers’ use of third-person pronouns in problematic sentences taken from the Associated Press corpus and other sources.

In the light of what prescriptive grammarians say concerning particular points of pronoun-antecedent agreement and the journalists’ choices of third-person pronouns within the given environments, the performance of native speakers as compared with that of non-natives on the usage of 13 sentences will be examined. It should be noted that Figures 1-10 employ these abbreviations: (NNs = non-natives; Ns = natives; F = female; M = male; O = old; Y = young; S = sentence).

Sentence 1: Every student must bring … books to class.

According to prescriptive grammarians, a pronoun must agree with its antecedent in gender and in number (Celce-Murcia 1985). Despite those grammarians’ dissatisfaction with using they or the coordinate construction he or she as alternatives to the masculine he when the gender of a referent is unknown (Quirk et al. 1985), usage is changing. As shown in Figure 1 below, the results of the survey confirm this observation: out of 40 native speakers, 16 chose his/her, 13 went for the plural their, and only 11 chose his to refer to the genderless word student. Native speakers were almost equally divided between their and his/her and showed readiness to bend the traditional rule, although more of the older native speakers opted to abide by the formal rule.

Interestingly, although this sentence would be grammatical as Every student must bring books to class, the ‘null’ option did not show up in the native performance. This can be attributed to the way the sentence was presented to informants: it had an indicated gap after ‘bring’, and informants were told to write a third-person pronoun in the blank.

Figure 1 Comparison of Ns and NNs' performance for S1 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: his, his/her, their)

Equally important, the results revealed that the majority of non-native speakers (26 out of 40)

abided by the prescriptive rule and chose the masculine pronoun his. This suggests that native

speakers are not conscious of or do not follow any systematic criterion while using their

language, and non-native speakers need well-defined rules to follow.
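The proportions behind this comparison can be made explicit with a few lines of code. The sketch below, again in Python, is offered only as an illustration and was not part of the survey procedure: it uses just the counts reported in the text for Sentence 1, and the variable and function names are hypothetical. For non-native speakers, only the count for his is stated, so no other options are filled in.

```python
GROUP_SIZE = 40  # participants per group, as stated in the Instrument section

# Native-speaker responses to Sentence 1, as reported in the text.
NATIVE_S1 = {"his/her": 16, "their": 13, "his": 11}

# For non-native speakers, the text reports only the count for "his".
NON_NATIVE_S1_HIS = 26


def percentage(count, total=GROUP_SIZE):
    """Share of a group, in percent, that chose a given option."""
    return 100 * count / total


native_shares = {option: percentage(n) for option, n in NATIVE_S1.items()}
non_native_his_share = percentage(NON_NATIVE_S1_HIS)

if __name__ == "__main__":
    for option, share in native_shares.items():
        print(f"Ns  {option}: {share:.1f}%")
    print(f"NNs his: {non_native_his_share:.1f}%")
```

Expressed this way, the masculine his accounts for 27.5% of native responses but 65% of non-native responses to Sentence 1.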

Sentence 2: A child learns to speak the language of … environment. (Quirk et al. 1985:

316)

According to Quirk et al. (1972: 360), words like child are ‘exceptionally referred to by the pronoun its’. According to the survey results, very few native speakers (only 4 out of 40) used the pronoun its to refer to child. Almost half of them (19 of 40) chose his/her.

Native speakers were not keen to make a gender distinction and used the coordinate pronouns his or her or the plural their to refer to the noun child. However, about one fourth of the respondents (11 out of 40) made a gender distinction in favor of the masculine pronoun his. Perhaps parents tend to refer to their baby with personal reference, and those without children may prefer to use non-personal reference. Quirk et al. (1985: 316) have described the latter as ‘emotionally unrelated to the child’.

Figure 2 Comparison of Ns and NNs' performance for S2 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: his, his/her, their, its)

Half of the non-native speakers chose its based on what they learn in their EFL

classes, and nearly the other half opted for his mainly because this pronoun in Arabic has a

generic reference.

An interesting result here has to do with the clear interaction between age and gender in the native speakers’ performance regarding the use of his/her. Younger males use it twice as much as older males, but older and younger females use it most frequently.

Sentence 3: Ridden by jockey Aki Kato, Tally Ho the Fox, scored … second consecutive

stakes win.

According to Quirk et al. (1985), the pronoun it is mainly used to refer to lower animals. In

the data, there is an exceptional occurrence which does not lend itself to the formal rule. In

this sentence, the pronoun his is used to refer to the horse Tally Ho the Fox instead of the

pronoun it. If the horse is viewed as a non-personal entity, it is mainly referred to by the

neutral pronoun it. But, according to Quirk et al. (1985), people express male/female gender

distinctions with higher animals.

In this case, in which syntactic and lexicogrammatical rules do not seem to be in

operation, readers would not have been able to understand what Tally Ho the Fox was, if the

immediate environment had not provided them with the information ‘ridden by jockey Aki

Kato …’ (A004 45). The particular choice of the masculine he to refer to a horse may depend

on a number of variables, primarily the speaker’s relation to the species in question, but also

on her/his individual preference for pronoun usage. If the horse was not male, then it may be

explained in terms of human-like behavior of the horse which scores just like humans do. One

may add an additional factor encouraging the use of he or she: the horse was named.

Animals are mainly referred to with non-personal gender pronouns (it, its, itself).

However, Quirk et al. (1985: 314) have asserted that ‘persons are not only human beings, but

may also include supernatural beings… and higher animals’.

Figure 3 Comparison of Ns and NNs' performance for S3 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: its, his/her, his)

The majority of native speakers (23 out of 40) opted for the masculine his to refer to the word horse. This is not surprising in racing contexts or with pets. However, a striking finding with regard to age and gender is that most of the younger females went for his/her.

Most non-native speakers (26 out of 40) went for the non-personal pronoun its to refer to the horse. Just over one third of them (14 out of 40) chose the pronoun his, taking the horse as male, and none chose his or her.

According to Quirk et al. (1985), since English lacks gender-neutral third-person

singular pronouns, the plural they represents an alternative to using the masculine pronoun he

in reference to mixed-gender groups or persons of unknown gender.

Sentence 4: When the average person walks into a bank, … looks over brochures in the

lobby.

In Figure 4, the results show that almost half of the native speakers, mostly the young group,

chose he or she (18 out of 40). Only one fourth of the native speakers (10 out of 40) went for

the singular they as an alternative to the masculine generic pronoun he.

Figure 4 Comparison of Ns and NNs' performance for S4 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: they, he/she, he)

This result is consistent with the findings of a study by Madson and Hessling (2001) on American readers' perceptions of four alternatives to the masculine generic pronoun, in which the respondents rated the they version as lowest in overall quality. However, it is inconsistent with Johnson's (2004) claim that many English speakers prefer the singular they; Johnson proposes, based on evidence, endorsement of the singular they rather than other alternative strategies. In the researcher’s analysis, the form of the verb looks makes the alternative they ungrammatical; otherwise it would have been used more. Not surprisingly, perhaps, 10 older native speakers went for he, as compared to only 2 younger ones. This reflects the gap between the old and young cohorts.

As for non-native speakers, the majority of them fell back on what they learnt in their

EFL classes (30 out of 40) and chose he as a pronoun referring to the antecedent average

person.

Sentence 5: It was a singular act of courage on the part of Canada to spirit out of Iran a

group of diplomats who were not even … own citizens.

This sentence begins with the ‘prop it’, the most neutral and semantically unmarked of the personal pronouns. The ‘prop it’ in this sentence appears to function as an empty theme (Quirk et al. 1985). This ‘prop it’ is followed by the verb to be and a construction which ‘makes it natural to achieve focus on the item that follows: in effect, end focus within an SVC clause’ (Quirk et al. 1985: 1384). This equals extraposition of subject clauses.

The observation here has to do with the writer’s apparent violation of number

concord, that is, her/his choice of the plural pronoun their to refer to a singular entity Canada.

This choice does have a purpose if explained within the politeness framework.

According to Brown and Levinson (1987: 180), ‘plurality signifies respect throughout the pronominal paradigm of reference’. Likewise, Lin has argued that ‘the idea of plural is naturally and historically connected with power’ (1988: 159-160). It is also believed that plurality is a very old and ubiquitous metaphor for power, the earliest instance of which ‘has been used to address the emperors of Rome in the 4th century’ (Brown and Gilman 1960). Obviously, the writer takes Canada as a plural to show collectivity (as a nation consisting of millions of people).

Surprisingly, the pronoun her, which represents another alternative, did not show up in the native speakers’ performance. As shown in Figure 5, the data illustrate that the vast majority of native speakers (36) and non-native speakers (37) chose the pronoun its. Although the contextual information surrounding the antecedent Canada in sentence (5) presents it as a political entity, the respondents’ immediate reactions portray Canada as a geographical entity (i.e., inanimate).

Figure 5 Comparison of Ns and NNs' performance for S5 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: their, its)

Sentence 6: I don’t think anyone would approve of having … children attend classes in

this setting.

Figure 6 shows that the majority of non-native speakers (27 out of 40) chose the masculine pronoun his to refer to the non-specific referent anyone. Native speakers, however, were less observant of prescriptive rules: 12 of them chose the plural their, 12 chose the coordinate construction his or her, 12 voted for one’s, and only 10 older speakers chose his.

Regarding native speakers' performance, the results support Holmes’ (1998) conclusion, drawn from an analysis of generic pronouns in New Zealand, that 80% of non-specific referents, such as anyone, are referred to by they. However, the results show that almost half of non-native speakers stuck to the traditional rule and chose his rather than their.

By using their as a gender-free pronoun, the majority of native participants in this survey appeared to be socially sensitive and keen to avoid gender bias. This is consistent with Mair and Leech’s conclusion that ‘an ideological motivation (avoidance of sexual inequality) [can be a reason, among others] for replacing an older pronoun usage by a newer one’ (2006: 336). In addition, big differences were observed in the survey in the natives’ performance between older and younger females regarding this point.

Figure 6 Comparison of Ns and NNs' performance for S6 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: his, his/her, one’s, their)

In this sentence, the plural pronoun their is used, in defiance of strict number

concord, in co-reference to the indefinite pronoun anyone. This violation appears to have a

different interpretation from the one mentioned above. The reporter, here, may have used the

plural as a convenient means of avoiding the traditional use of the third person masculine he,

as the syntactically unmarked form. Since the gender of the indefinite pronoun anyone is

unspecified, the writer chooses their in order to avoid possible attacks from those who view

the use of the generic he as a kind of sexual bias in language. In addition, by the choice of

their to co-refer with anyone, s/he also avoids being vulnerable to ‘the objection of seeming

to have a male orientation’ (Greenbaum et al. 1990: 451). The choice of their seems to be

governed by more contingent, context-dependent pragmatic as well as social orientations.

Sentence 7: The hairdresser turned down the offer and … returned inside.

The overwhelming majority of non-native speakers (37 out of 40) chose the feminine pronoun

she to refer to the antecedent hairdresser, and almost half of the native speakers (18 out of

40) chose she as well. This is by no means surprising since the default inference of

hairdresser in some communities is female.

Figure 7 Comparison of Ns and NNs' performance for S7 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: he, Ø, she, he/she)

Sentence 8: The blacksmith remained silent and … refused to leave the coach.

As shown in Figure (8), the overall results show that the antecedent word blacksmith is

treated as male and referred to as he.

Figure 8 Comparison of Ns and NNs' performance for S8 (bar chart, scale 0-10; groups: Ns and NNs, each by gender (M, F) and age (Y, O); response categories: he/she, she, Ø, he)

The overwhelming majority of non-native speakers (38 out of 40) chose the masculine

pronoun he to refer to the antecedent blacksmith, and half of the native speakers (20 out of

40) chose it as well. This is by no means surprising since the default inference of blacksmith

in some communities is male. Although none of the non-native speakers treated blacksmith as

female, three native speakers chose the feminine she and seven preferred the

coordinate construction he or she.

Sentence 9: The Titanic was massive because … killed thousands and thousands of

people.

The neuter pronoun it is almost always used to refer to a single thing. However, there are,

according to Quirk et al. (1985), a few exceptions. For example, the feminine pronoun she

can exceptionally be used, in a case of personification, to refer to a ship.

[Figure 9: bar chart of pronoun choices for S9 (she, it) by speaker group (Ns, NNs), gender (M, F) and age (Y, O); scale 0 to 10]

Figure 9 Comparison of Ns and NNs' performance for S9

However, the results presented in Figure (9) show that the majority of native and non-native

speakers treated even the ship Titanic as a single thing rather than female and chose the

neuter pronoun it.

In this particular case, it seems that the reference is not to the ship itself but to the disaster

named after the ship involved in it. Clearly, the option of she for the ship is not

a popular choice, and almost all language users indicated that the gap is best filled by it.

Sentence 10: The offender argued logically and calmly. This could eventually help

change the attitudes of the taxpayers and officials, who are in a position to give more

support to … as well as to the victims.

As shown in Figure (10), the majority of native (24) and non-native speakers (27) chose the

masculine pronoun him to refer to the antecedent word offender. This is in line with the

widely-held assumption that the default gender interpretation of offender is male. It seems

that this applies to both American and Palestinian communities.

[Figure 10: bar chart of pronoun choices for S10 (them, him/her, her, him) by speaker group (Ns, NNs), gender (M, F) and age (Y, O); scale 0 to 10]

Figure 10 Comparison of Ns and NNs' performance for S10

Interestingly, when the same sentence is considered in its context, the reporter did not use any

pronoun and chose to repeat the noun phrase the offender, although it would have been

equally explicit if it had been replaced by a pronoun. The same noun phrase is repeated in

A 001 127 and in A 001 128, and the context itself gives enough information for the reader to

interpret it in the right way. A possible explanation is that the reporter wants to avoid the

dilemma of choosing a pronoun appropriate to the situation. According to Mair and Leech

(2006), the generic use of he for both male and female was prevalent in the 1960s, but it

declined in the 1990s owing to the efforts of women’s movements. The feminist

recommendations in this regard, together with the need to fill the gap left by the downfall of

the generic he, allowed for the deeply-rooted they to re-emerge.

A few years ago, the choice of the third person plural they would have been totally

unacceptable in terms of number concord, since the offender signals a singular entity which

requires a singular co-referent. Choice of the third person masculine he could have been seen

as male oriented or another manifestation of the subjection of women to men, whereas the

third person feminine she would have been awkward, since readers are not used to the idea of

the feminine as a generic pronoun. The writer successfully managed to avoid the dilemma by

repeating the noun.

Discussion

The two most striking results to emerge from both the corpus-based analysis and the survey

can be summarized as follows. The first is that the traditional prescriptive rule that

antecedents must take gender- and number-matched pronouns is not strictly observed. In the

journalistic corpus data, many reporters’ pronoun choices hinge upon contingent,

context-dependent pragmatic, social and cultural factors. For example, the plural their is used to refer

to the singular antecedent anyone in one case and to the state Canada in another. The pronoun

he is also used to refer to the animal horse instead of the pronoun it. Moreover, the results

obtained from the survey show that native speakers, depending on their age and gender, are

deeply divided about which pronoun to use when dealing with entities of unknown gender.

In addition, the systematic way in which language users’ responses pattern around certain

pronouns provides evidence that the age factor, for instance, constrains their choices

and leads to apparent sensitivity of judgment relative to the given socio-cultural context.

Usage nowadays thus appears to be changing under the pressure of social, cultural, and pragmatic

constraints.

Further evidence can be obtained from sentences (11-13) below, which are extracted

from the survey. They show how the third-person singular neuter it is used as the subject of a

passive construction where the more straightforward active construction could have been

used.

11. A. It is hoped that as a result, the public might view the offender in a more positive light.

B. I hope that the public might view the offender in a more positive light.

12. A. It was not known immediately what interest rates would be charged.

B. The minister did not know immediately what interest rates would be charged.

13. A. It also was announced that Bowman's squad had lost three players to injury.

B. The coach announced that Bowman's squad had lost three players to injury.

To begin with sentence (11) (A 001 28), the third person singular neuter it is used as a subject

of a passive construction. A possible interpretation of this choice is that since newspaper

language has to be objective and unbiased, the writer tries to disassociate her/himself from the

potential hope expressed in the utterance, simply because her/his role is to report facts, not to

express feelings and hopes.

Let us consider for a moment how the sentence could have been stated otherwise: I

hope that the public might view the offender in a more positive light. The use of the first

person pronoun, together with the modal auxiliary might, expresses a strong hope on the part

of the writer, but at the same time, it can be viewed as a kind of ‘mild imperative’ (an

ethical code requires that you should view the offender ...), or as a strong suggestion, which

automatically turns a simple utterance into a ‘Face Threatening Act’ (FTA) (Brown and

Levinson 1987: 10).

The ‘Face Threatening Act’, which is closely related to the notion of politeness,

imposes many constraints on the linguistic choices language users make, both in spoken and

written discourse. Brown and Levinson (1987), in discussing the notion of politeness, have

proposed that ‘face’ consists of two related aspects. ‘Negative face’ refers to the want of

every individual that his actions be unimpeded by others (i.e., one's freedom of action and

freedom from imposition). ‘Positive face’ refers to ‘the want of every member that his wants

be desirable to at least some others’ (Brown and Levinson 1987: 61).

Brown and Levinson (1987) have also highlighted the options available to the speaker

who must decide whether and how to utter a Face Threatening Act, that is, an act which poses

a threat to either the positive or the negative face of the addressee. These options range from

simply not doing the FTA, through doing it off record, to doing the act baldly, with little or

no concern for face (on record, without redressive action). Between these two extremes, for a

speaker who chooses to do the FTA but who wishes to show an appropriate concern for ‘face’,

there are various ‘FTA minimizing strategies’ and devices for mitigating the illocutionary force of

particular utterances (cf. Brown and Levinson 1987, Lakoff 1972, Leech 1983).

Therefore, one could say that the journalist uses a ‘negative politeness strategy’, that

is, the passive construction, in order to preserve the addressee's (the public's) negative face, as

well as to avoid any kind of impingement on their desire to be free from imposition.

Likewise, in example (12), ‘it was not known immediately what interest rates would

be charged...’ (A 009 83), the writer again chooses the passive construction without an agent,

in order to avoid putting the blame on anyone, whether on the authorities or on the particular

president of the institution in our case. If the reporter had used the third person plural they to mean

‘persons unspecified’, or ‘persons with responsibility’ (Halliday and Hasan 1976: 53), s/he

would have again performed an FTA, that is, s/he would have shown disapproval or

contempt, both of which threaten the addressee's positive face want by indicating that

‘the speaker doesn't care about the addressee's feelings, wants, etc.’ (Brown and Levinson

1987: 66).

Something similar can be observed in (13): ‘it also was announced that Bowman's

squad, which already had lost three players to injury…’ (A 007 13 in the data). The pronoun

it here does not co-refer with a previous antecedent, but it occupies the subject role of the

utterance. The journalist may have used this construction in order to avoid attribution of

blame or responsibility to persons involved in the situation.

When the same examples were presented to native and non-native speakers out of

their context, they, as shown in Figure (11) below, overwhelmingly selected the active form.

Putting the results obtained from the analysis and the survey together shows how the

contextual meaning of individual examples shapes their structural form relative to what the

speaker intends her/his meaning to be (i.e., by means of a pragmatic rather than a syntactic or

semantic explanation). Clearly, pragmatics goes a step further than text and textual meaning,

clarifying what exactly ‘a piece of language means to a given person – to the speaker or

addressee – in a given speech situation’ (Leech 1980: 80).

These pragmatic constraints on the occurrence of the third-person pronouns refute the

claim that since semantic interpretation is the ‘study of what a piece of language means’

(Leech 1980: 80), pragmatic explanation of any piece of spoken or written discourse is

redundant.

The second striking result is that the obtained information on the spontaneous choices

of Americans as compared with those of non-native speakers paints a rather fuzzy picture.

However, a couple of patterns are worth mentioning here. Native speakers were more flexible

than non-native speakers in their choices of the plural they or the coordinate he or she to refer

to singular words with unspecified gender. When using they as a gender-free pronoun, native

speakers show social sensitivity in seeking to avoid gender bias in their communities. Interestingly, a

clear interaction between age and gender in the native participants, influencing the use of

his/her, has been observed in many cases; younger males used it twice as much as older

males, but older and younger females used it most and to the same extent. Not surprisingly,

perhaps, older native speakers went for the masculine pronoun he to refer to non-specific

referents like anyone.

The data analysis highlights the problem of the nonexistence of gender-neutral

singular pronouns in English. An antecedent like student or anyone does not display whether

the referent is male or female. This study has shown the usage of third-person pronouns in

native and non-native speakers’ completion of sentences extracted from Associated Press

news articles.

This is fair enough, if the issue is restricted to native speakers living in one society.

But can this be easily adopted by non-native speakers to become socially sensitive to the

culture-specific rules in English-speaking countries? Should EFL teachers and students

follow what prescriptive grammarians say, or study language as it is used by its speakers?

Although the plural they or the coordinate construction he or she is widely acceptable

nowadays in English-speaking societies to refer to gender-unknown singular words, their use

poses a real problem for non-native speakers who need systematic formal rules that can be

easily followed.

[Figure 11: bar chart comparing Passive and Active choices for S11, S12 and S13 among Native Speakers and Non-native Speakers; value labels in the extraction: 80, 87.5, 95, 92.5, 100, 97.5 and 7.5]

Figure 11 Comparison of Ns and NNs' performance for S11 – S13

Clearly, non-native speakers of English tend to follow the prescriptive rule that a

pronoun must agree with its antecedent in gender and number without paying much attention

to social developments in the English-speaking communities. As the survey results indicate,

there is a clear gap between native and non-native speakers’ performance on the choice of

third-person pronouns. This can be explained in two ways. First, it may be partly attributed to

language interference in which L2 learners in general, and Arabic-speaking learners in

particular, transfer the pronoun system of their native language to L2 (Al-Jarf 2010). Unlike

nouns in Arabic, which show grammatical gender, nouns and indefinite pronouns in

English (e.g., someone, anyone) do not display gender (Khalil 1999). Second, it may be

attributed to the lack of cultural knowledge and awareness on the part of the non-native

speakers. To bridge such a gap and avoid intercultural miscommunication, culture teaching is

badly needed to develop EFL students’ cultural awareness and competence. Clearly, EFL

teachers need to integrate some cultural knowledge into classroom teaching of certain

grammar points.

These findings run counter to the prescriptive requirements for the use of third-person singular

pronouns. The overall impression one gets from the discussion above is that there is no

concrete well-defined criterion as to what pronoun to use when talking about an entity with

unspecified gender. Language users, whether native or non-native, need to be sensitive to the

culture-specific rules in English-speaking countries in order to use third-person pronouns

appropriately. Some people may accept that it is important to raise EFL learners’ and

teachers’ awareness of native speakers’ use, and to train them on how to notice the difference

in cultural orientations. Others may argue that non-native speakers should not be left at the

mercy of native speakers’ attitudes and desires, and that they should not be left hanging

between strict prescriptive rules and users' actual practices or applications.

Summary and Conclusion

The purpose of this study is threefold: first to highlight the factors that constrain co-reference

in American English journalistic texts using grammatical, textual and extralinguistic

parameters; second to examine the extent to which native and non-native speakers of English

differ in terms of their use of third-person pronouns; and third to measure the impact of the

factors of gender and age on language users’ choice of one pronoun rather than another.

The analysis of the frequency and usage of third-person pronouns in Associated Press

articles has provided insights on their important role in achieving cohesion. It has also offered

a pragmatic explanation of the speaker’s (reporter’s) intended meaning in order to account for

third-person pronouns as cohesive devices, which could not be interpreted either syntactically

or semantically. Moreover, the study has shown how the reporters’ pronoun choices reflect

their relation to participants’ acts in the discourse. That is,

third-person pronouns, among other linguistic features, have displayed how reporters project

themselves and how they express associations or disassociations with others’ acts.

In order to carry out a performance comparison of native and non-native speakers’ use

of English third-person pronouns, thirteen sentences with tricky pronouns taken from the

corpus were presented in writing to two groups of native and non-native speakers. The results

have revealed that most native speakers chose third person pronouns depending on the socio-

cultural context and pragmatic factors, tending to bend the formal rule of pronoun-antecedent

agreement, especially when dealing with gender-unspecified words. However, the majority of

non-native speakers showed an inclination to abide by the prescriptive rules of grammar,

demonstrating little social and cultural sensitivity.

This seems to imply that a treatment of third-person pronouns, or pronouns in general,

based on syntactic conditions alone, may not lead to a consistent and convincing explanation

of their behavior. Going a little bit further, the results of this study suggest that the choice of

different forms in a particular discourse type may be a matter of emotional reflection, as well

as a matter of particular linguistic needs and attitudes, which have to be taken seriously into

consideration well before attempting any kind of syntactic, semantic or pragmatic analysis.

This has been reflected, for example, in the native speakers’ divided responses regarding the

antecedent child. In many cases, the results obtained from native speakers have shown an

interesting interaction between age and gender, influencing the use of one pronoun rather than

another. The performance of younger males or females was different in many cases from that

of older males and females. Pronoun-antecedent agreement has proven to be an area where it

is difficult to draw the line between standard and non-standard usage.

It should be noted that this study has not fully covered the broad topic of pronoun

usage. One limitation stems from the fact that the data was of a particular discourse type,

namely American newspaper reports, which are copy-edited according to prescriptive

stylebooks. Other limitations may be attributed to the participants’ characteristics. However,

this study should, hopefully, provide insights into the reality of pronouns and the challenges

they pose to both native and non-native speakers.

Acknowledgements:

I am immensely grateful to Professors Mike Garman and Aziz Khalil for their invaluable

comments on an earlier draft of this paper. Special thanks go to Professor Steve Schwegler

for helping me recruit American participants for the survey. I also extend my deepest thanks

to the anonymous reviewers for The Linguistics Journal for their helpful remarks and

insightful suggestions. My special thanks are also due to all American and Palestinian

participants who willingly volunteered to complete the survey.

References

Al-Jarf, R. (in press). Interlingual pronoun errors in English-Arabic translation. King Saud

University. Retrieved July 21, 2013 from

http://faculty.ksu.edu.sa/aljarf/Publications/Forms/AllItems.asp

Brown, P. and S. C. Levinson (1987). Politeness: Some universals in language usage.

Cambridge: Cambridge University Press.

Brown, R. and A. Gilman (1960). The pronouns of power and solidarity. In T. A. Sebeok

(Ed.), Style in language (pp. 253-276). Cambridge, Mass: MIT Press.

Celce-Murcia, M. (1985). Making informed decisions about the role of grammar in

language teaching. TESOL Newsletter, 1, 4-5.

Christophersen, P. and A. Sandred. (1969). An advanced English grammar. London:

Macmillan.

Curzan, A. (2003). Gender shifts in the history of English. Cambridge: Cambridge University

Press.

Einsohn, A. (2011). The copyeditor's handbook: A guide for book publishing and corporate

communications, with exercises and answer keys. Berkeley: University of California

Press.

Fowler, H. W. (1965). Fowler's modern English usage. In E. Gowers (Ed.), A dictionary of

modern English usage (2nd ed.). London: Oxford University Press.

Fromkin, V., R. Rodman and N. Hyams. (2007). An introduction to language (8th ed.). New

York: Thomson Corporation.

Gocheco, P. (2012). Pronominal choice: a reflection of culture and persuasion in Philippine

political campaign discourse. The Philippine ESL Journal, 8, 4-25.

Greenbaum, S., R. Quirk, G. Leech and J. Svartvik. (1990). A student’s grammar of the

English language. Essex: Longman.

Halliday, M. A. K. and R. Hasan. (1976). Cohesion in English. London: Longman.

Halliday, M. A. K. (1985). An introduction to functional grammar. London: Edward Arnold.

Holmes, J. (1998). Generic pronouns in the Wellington corpus of spoken New Zealand

English. Kotare: New Zealand notes and queries, 1(1), 32-40.

Johnson, S. (2004). Exploring the use of the 'they' pronoun singularly in English. California

Linguistics Notes, 29(1), 1-5.

Khalil, A. (1999). A contrastive grammar of English and Arabic. Amman: Jordan Book

Center.

Kroeger, P. R. (2005). Analyzing grammar: An introduction. Cambridge: Cambridge

University Press.

Lakoff, G. (1972). Hedges, fuzzy logic and multiple meaning criteria. Papers from the

Chicago Linguistic Society 8, 183-228.

Lakoff, R. T. (1984). Remarks on THIS and THAT. Papers from the Chicago Linguistic

Society 10, 345-356.

Leech, G. N. (1980). Explorations in semantics and pragmatics. Amsterdam: John

Benjamins.

Leech, G. N. (1983). The principles of pragmatics. London: Longman.

Lin, Yang-Yong (1988). The English pronoun of address: A matter of self-compensation.

Sociolinguistics, 2, 157-180.

Linde, C. (1979). Focus of attention and the choice of pronouns in discourse. In T. Givón

(Ed.), Syntax and semantics, 12. New York: Academic Press.

Lyons, J. (1975). Deixis as a source of reference. In E. L. Keenan (Ed.), Formal semantics of

natural language (pp. 61-83). Cambridge: Cambridge University Press.

Lyons, J. (1977). Semantics. Cambridge: Cambridge University Press.

Madson, L. and R. Hessling. (2001). Readers' perceptions of four alternatives to masculine

generic pronouns. Journal of Social Psychology, 141(1), 156-158.

Mair, Ch. and G. N. Leech. (2006). Current changes in the English syntax. In B. Aarts and A.

McMahon (Eds.), The Handbook of English Linguistics (pp. 318-342). Oxford:

Blackwell.

Mangan, L. (2010). All style and substance. The Guardian. Retrieved 24 July, 2010 from

http://www.theguardian.com/lifeandstyle/mind-your-language/2010/jul/24/style-

guide-grammar-lucy-mangan

Partington, A. (2003). Politics, power and politeness. In A. Partington (Ed.), The linguistics of

political argument (pp. 124 - 155). London and New York: Routledge.

Quirk, R., S. Greenbaum, G. N. Leech and J. Svartvik. (1972). A grammar of contemporary

English. London: Longman.

Quirk, R., S. Greenbaum, G. N. Leech and J. Svartvik. (1985). A comprehensive grammar of

the English language. London: Longman.

The Chicago Manual of Style (16th ed.). (2010). Chicago: University of Chicago Press.

Appendix I Use of Third-Person Pronouns in English

Dear colleague,

The purpose of this mini-research project is to survey native speakers’ opinions about the use of the third-person

pronouns. I would be grateful if you could spare ten minutes to do the following exercises. Don’t worry about

any rules you may have learnt about what ‘proper’ or ‘correct’ English is. Work as quickly as you can – what we

are interested in is your immediate reaction.

Thanks for your co-operation

Native Speaker ( ...................... ) Non-native speaker (..............................)

Age (........................ ) Sex (...........................................)

Part I: Fill in each blank in the following sentences with an appropriate third-person pronoun and briefly

explain why.

- Every student must bring (1) … books to class.

…………………………………………………………………………………………………………………

- The child learns to speak the language of (2) … environment.

…………………………………………………………………………………………………………………

- Ridden by jockey Aki Kato, Tally Ho the Fox, scored (3) … second consecutive stakes win.

…………………………………………………………………………………………………………………

- When the average person walks into a bank, (4) … looks over brochures in the lobby.

…………………………………………………………………………………………………………………

- It was a singular act of courage on the part of Canada to spirit out of Iran a group of diplomats who were not

even (5) … own citizens.

…………………………………………………………………………………………………………………

- I don’t think anyone would approve of having (6) … children attend classes in this setting.

…………………………………………………………………………………………………………………

- The hairdresser turned down the offer and (7) … returned inside.

…………………………………………………………………………………………………………………

- The blacksmith remained silent and (8) … refused to leave the coach.

…………………………………………………………………………………………………………………

- The Titanic was massive because (9) … killed thousands and thousands of people.

…………………………………………………………………………………………………………………

- The offender argued logically and calmly. This could eventually help change the attitudes of the taxpayers

and officials, who are in a position to give more support to (10) … as well as to the victims.

…………………………………………………………………………………………………………………

Part II: Which sentence would you prefer to use in your writing? Please tick the box next to it.

11. A. □ It is hoped that as a result, the public might view the offender in a more positive light.

B. □ I hope that the public might view the offender in a more positive light.

12. A. □ It was not known immediately what interest rates would be charged.

B. □ The minister did not know immediately what interest rates would be charged.

13. A. □ It also was announced that Bowman's squad had lost three players to injury.

B. □ The coach announced that Bowman's squad had lost three players to injury.

Thank you

An Analysis of Learner Use of Argument Structure Constructions: A Case of Thai

Learners Using the Passive and Existential Constructions in English

Napasri Timyam

Kasetsart University

Bioprofile: Napasri Timyam earned her Ph.D. degree in Linguistics from the Department of

Linguistics, University of Hawaii at Manoa, USA. She is currently an assistant professor at

the Department of Foreign Languages, Faculty of Humanities, Kasetsart University, Thailand.

Her current research interests include syntactic theory, Thai learners of ELF, and acquisition

of child Thai.

Abstract

Taking the Construction Grammar (CxG) and English as a Lingua Franca (ELF) approaches

together, this study examined whether Thai learners’ use of the English passive and

existential constructions deviated from the native speaker norms and how such deviations

reflected the general, universal characteristics of ELF. Data were taken from 70 English-

major students who represented ELF speakers at the upper-intermediate level. Two kinds of

writing tasks were designed – writing with prompts and free essay writing.

The results revealed that passive and existential sentences produced by Thai learners –

compared to native speakers – are much more limited in structural complexity and in

semantic and pragmatic functions. Moreover, the results showed that Thai learners’ use of

the English clausal constructions is also governed by three general and universal

characteristics, i.e., simplicity, regularity, and analogy, which have been found in the

phonology and pragmatics of different varieties of ELF.

The study extended the CxG scope from L1 settings to L2 phenomena by showing the

differences in constructional use between native and non-native speakers, which can be used

as guidelines for teaching argument structures to English learners. It also broadened the scope

of ELF research; ELF deviations at all levels – sounds, words, phrases, discourse, and also

sentences – are governed by some universal characteristics which reflect speakers’ motivation

to shape English in the direction that results in a simple and effective form of communication.

Keywords: argument structure constructions, varieties of ELF, the passive construction, the

existential construction

Introduction

In the theory of Construction Grammar (CxG), all levels of description in language lie in the

notion of ‘construction’, which refers to a pairing of form and meaning. Morphemes, words,

idioms, and phrasal patterns are all constructions since they are instances of form-meaning

correspondences (Fillmore 1988). Generalizations about particular arguments being topical,

focused, inferable, etc., as well as facts about the actual use such as frequencies are also

stated as part of the constructional representation (Goldberg 2002, 2009). Such a perspective on

constructional properties suggests a more precise definition of a construction implied in the

theory, i.e., an association of form, meaning, and use.

Clause-level syntactic patterns, often referred to as argument structures, are one type

of construction because they are associated with a particular form, meaning, and use. A

fundamental idea behind the CxG approach to argument structure constructions is that they

designate event types, which are basic to human experience. The meanings of these event

types are rather general and abstract (Goldberg 1995). For instance, in English, the transitive

construction (of the form Subject-Verb-Object, as in Pat opened the door) denotes something

acting on something; the ditransitive construction (Subject-Verb-Object1-Object2, as in Pat

gave Jill a gift) denotes possessive transfer from one participant to another.

Compared to constructions at the lower levels, argument structures are more difficult

to acquire. When English-speaking children encounter new words, for example, they can

quite quickly pick up the form and meaning of those unfamiliar expressions from the

immediate context. In contrast, properties of an argument structure are general and abstract.

Children need to be exposed to a number of instances of one argument structure before they

can make generalizations about the form, meaning, and use inherently attached to that

construction.

Children learn their first language by making generalizations and drawing conclusions

based on the linguistic input they have received. They tend to lose this innate linguistic ability

when they grow up (Bley-Vroman 1988), and the process of learning a second language is

more explicit and depends heavily on instructors’ explanations. Given this finding in the

acquisition literature, the task of learning an argument structure becomes even more

challenging for second language learners. While some constructional features are noticeable

and easy to describe, many general and abstract constructional features are hard to explain. In

order to appropriately use one argument structure, English learners need to recognize all of its

syntactic, semantic, and pragmatic principal properties. Given that deviations in their use of

clausal patterns are common, it is evident that this is not always the case.

Nevertheless, deviations produced by English learners do not always occur

sporadically. According to research on English as a Lingua Franca (ELF) – an approach to

the study of English used for communication among people speaking different languages – there is

a universal motivation underlying learners’ usage of English. In other words, there are some

general characteristics which are produced by English learners across language backgrounds,

including repetition, explicitness, and regularization. Due to these universal features,

researchers adhering to this approach believe that the notion of ELF encompasses not only the

use of English internationally, but also the use and modification of a particular form of

English which does not necessarily conform to native speaker norms (Dewey 2007, Jenkins

2006, Seidlhofer 2001).

Based on this tenet of the ELF approach, ELF researchers regard common and

systematic forms of deviations as ‘variations’ which are part of the natural process of

language contact and language change, rather than ‘errors’ caused by incomplete acquisition

of the target language. Moreover, they hold that despite differences in minute detail, different

groups of L2 users have developed ELF in a rather similar direction, with general and

universal characteristics underlying the use of constructions at all levels.

Taking the two lines of the CxG and ELF approaches together, this study aims to

investigate learners’ use of argument structure constructions in English. Since a clausal

construction possesses general and abstract properties, the study hypothesizes that English

learners do not acquire all features associated with the construction. As a result, their use of

the construction deviates from the native speaker norms. Yet, such differences should not be

completely unexpected, nor should they lead to disunity, because they are partly governed by the universal

principles of second language usage which have been observed in numerous varieties of ELF,

particularly in their phonological and pragmatic features (Cogo and Dewey 2006, Seidlhofer

2004).

The study focuses on Thai learners using the passive and existential constructions in

English. The passive is a construction in which the subject corresponds to the theme, as in

The glass was broken by the boy. The existential construction expresses the existence of an

entity, as in There are two books on the table. These constructions were chosen for the case

study for two major reasons. First, both constructions are known to possess a number of

linguistic and pragmatic properties, which trigger variations in the use of English learners.

Second, the two constructions have their unmarked counterparts. The active (as in The boy

broke the glass) is the unmarked structure for the passive while the non-existential structure

(as in Two books are on the table) is the unmarked version of the existential-there sentence.

Thus, the use of the passive and existential constructions is a ‘linguistic option’. Native

speakers choose them over the more basic structures due to very specific properties. It is

difficult for L2 learners to differentiate between the alternative structures and recognize all

properties particularly attached to each of the two constructions. In sum, by dealing with the

passive and existential constructions, the objectives of the study are: (1) to investigate Thai

learners’ use of the English constructions, in comparison with the native speaker norms, and

(2) to analyze the deviations in terms of the general, universal characteristics of ELF.

Literature Review

The review of the literature covers four areas: (1) CxG, (2) ELF, (3) the English passive

construction, and (4) the English existential construction.

Construction Grammar

The basic tenet of CxG is that constructions – form-meaning correspondences – constitute the

basic units of language (Goldberg 1995). The main objective of the theory is to provide a full

range of facts in language on the basis of various types of constructions available in human

languages.

Argument structures hold a special interest in the theory. This type of construction is

marked by syntactic, semantic, and pragmatic properties. According to the Principle of No

Synonymy of Grammatical Forms, the form of a construction is very specific; even slight

changes in a sentence structure can result in differences in meaning – either denotational or

pragmatic meaning (Goldberg 1995). Thus, pairs of alternating sentences such as an active

and its passive counterpart belong to different constructions that denote subtle differences in

meaning. Semantically, an argument structure designates a scene basic to human experience,

and its meaning can be polysemous, having a family of different, but related senses. As a

result, there are semantic variations in the way speakers use a construction. For example,

while the English ditransitive typically expresses successful transfer, some ditransitive

sentences denote other related senses of transfer, including future transfer, intended transfer,

and negation of transfer. Pragmatically, the use of a construction varies along different kinds

of pragmatic dimensions, such as packaging of information structure, grammatical heaviness,

and register. All of these properties in form, meaning, and use contribute to the existence of

an argument structure construction in a language.

English as a Lingua Franca

Lingua Franca refers to a language used as a means of communication between people who

speak different languages. Speakers of English as a lingua franca are those who have learned

English as an additional language, and to whom it serves as the most useful instrument for

communication that cannot be conducted in the mother tongue, be it in business, casual

conversation, science, politics, etc. (Seidlhofer 2001).

With the more important role of non-native speakers and the increased acceptance of

various forms of English in the globalization period, ELF – a recent approach to the study of

English – has emerged. The growing body of ELF research has revealed the patterns of

change and linguistic fluidity emerging in the way English is transformed in lingua franca

interaction (Dewey 2007, Jenkins, Cogo and Dewey 2011). Close examination of a number of

features, mostly at the levels of phonology and pragmatics, reflects the underlying motivation

of ELF speakers. That is, there is a tendency to shape the language in a direction that

renders a ‘simple and effective’ form of communication. As Breiteneder (2009) pointed out,

this is a universal tendency for second language usage; speakers from different lingua cultures

who enter into intercultural communication situations usually shift their focus to simplicity

and effectiveness. ELF researchers (e.g., Breiteneder 2009, Cogo and Dewey 2006, Dewey

2007) have summarized a set of general characteristics found in ELF interactions among

speakers across various linguistic and cultural backgrounds. Table 1 presents these shared

characteristics, all of which contribute to simplicity and effectiveness in communication.

Table 1 General and Universal Characteristics of ELF

Characteristic Definition

Repetition ELF speakers often repeat their words and other speakers’ words. Repetition

is an accommodation strategy to achieve efficiency of communication,

signal agreement and alignment, show attention and engagement in the

conversation, and establish cohesion.

Explicitness and redundancy Extra words are inserted to ensure clarity of the conversation.

Simplification Complex forms are replaced by simple, shortened forms. Complex rules are

simplified.

Regularization ELF speakers make use of rule regularizations to make the rules more

general and consistent and to avoid exceptions.

Analogy ELF speakers prefer generalizing uses of expressions to all or more varied

contexts on the basis of predominant cases.

Table 2 NS Norms of the Passive Construction

Property NS Norm

Syntax:

A passive verb appears in many forms, with various tenses, aspects, and auxiliaries.

An agent is usually omitted when it is unknown or irrelevant to the point being

discussed, when it is predictable by the context or world knowledge, and when it

refers to people in general.

An agent is retained when it conveys new information. Typically, it is introduced

by the preposition by.

Semantics and pragmatics:

The theme functions as the topic of a passive sentence; it usually expresses given

information.

Speakers tend to choose the passive when an agent at sentence-final position is

structurally heavy.

Non-basic passives:

The get passive is used with an event whose subject is partly responsible for the

result, or which happens unexpectedly.

The ditransitive passive is formed from a ditransitive verb (e.g., she was sent a note).

The prepositional passive is formed from an intransitive verb that occurs with a

preposition (e.g., the project was thought about).

The English Passive Construction

The passive is the construction in which the theme, instead of the agent, is linked to the

subject. As the passive sentence in (1) illustrates, the theme NP (the thieves) serves as the

subject, and the agent NP (police) is downgraded to an oblique.

(1) The thieves were caught by police.

In terms of form, the passive structure includes a theme subject, a passive verb form

(usually consisting of be and a past participle), and an optional agent phrase. As to meaning, a

passive sentence is used to talk about an action from the viewpoint of the theme. Apart from

this basic form and meaning, the English passive is associated with a set of syntactic,

semantic, and pragmatic properties. The major characteristics of the construction as

frequently discussed in the literature (e.g., Downing and Locke 2006, Finegan 2004, O’Grady

2001, Parrott 2000) are listed in Table 2.

Table 3 NS Norms of the Existential Construction

Property NS Norm

Syntax:

The form of be is varied, with various tenses, aspects, and auxiliaries.

In addition to be, a small number of verbs appear in the construction. Most are

intransitive verbs.

The displaced subject denotes countable, uncountable, or abstract entities.

The displaced subject tends to be long, having various kinds of modifiers.

The bare existential structure contains there, be, and a displaced subject. The

extended existential structure also contains an extension – often a locative or

temporal expression.

Existential sentences often appear in the declarative form and in the simple structure.

Semantics and pragmatics:

The existential construction typically serves a presentational function. It draws an

addressee’s attention to the displaced subject.

The displaced subject typically conveys new information; its position is usually

occupied by an indefinite noun phrase.

The English Existential Construction

The existential construction expresses the existence of an entity. The locative expression

functions as the expletive subject which appears in the form of unstressed there.

(2) There were ten students in the classroom.

The existential construction requires an unusual agreement pattern (O’Grady 2001).

As the example in (2) shows, the verb agrees with the pivot noun phrase that follows, rather

than with the expletive subject there, which is neutral for number. As a result, the pivot

nominal is called a ‘displaced subject’, i.e., the real subject that is moved from the pre-verbal

original position to the position after the verb.

In terms of form, the existential structure consists of the expletive there, the verb be, a

displaced subject, and an optional extension. As to meaning, an existential sentence denotes

the presence of something. Apart from this basic form and meaning, the English existential

construction is associated with a set of syntactic, semantic, and pragmatic properties. The

major characteristics of the construction as frequently discussed in the literature (e.g., Collins

2002, Downing and Locke 2006, Huddleston and Pullum 2005, O’Grady 2001) are listed in

Table 3.

Research Methodology

The study employed a qualitative approach, assigning a writing task with prompts and a

free writing task to collect data and interpreting the results in terms of the common and

systematic characteristics of Thai learners’ use of the English passive and existential

constructions. The details of the subjects and instruments are as follows:

Subjects

Since their deviations should reflect systematic variations, not the sporadic errors of beginning

learners, the target population was upper-intermediate Thai ELF learners who had received

formal instruction in English and had been schooled to conform to Standard English norms

over several years. Both the purposive and random sampling procedures were used to select

the representatives of the population. That is, the subjects were among those who met the

following language criteria. First, undergraduate students majoring in English at Kasetsart

University who had been in the program for more than one year were targeted since they had

studied the four skills of English extensively – listening, speaking, reading, and writing,

especially during the period of their study at the university. Second, to ensure that they had

upper-intermediate level English knowledge and skills, only those with an average grade of

over 3.25 for all English classes taken at the university were considered. The subjects were

randomly selected from this group of students who met the two criteria.

Subjects meeting the selection criteria were in the third and fourth years of their study.

They were in the regular and special programs of English, affiliated with the Department of

Foreign Languages, Faculty of Humanities. The two programs shared the same curriculum;

they differed only in the class times. There were 139 third-year students and 122 fourth-year

students, yielding 261 third-year and fourth-year students in the two English programs.

Thirty-five third-year students and 35 fourth-year students were chosen to participate in the study. Of

these 70 subjects, 50 (71.4%) were female, and 20 (28.6%) were male; 40 subjects (57.1%)

studied in the regular program while 30 (42.9%) studied in the special program. The average

age of all the subjects was 22, and the average number of years of English study was 16.
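
The random selection step described above can be illustrated with a minimal sketch. It assumes that the 139 third-year and 122 fourth-year students formed the pools from which 35 subjects per year were drawn, which the text does not state explicitly. The student identifiers, the function name `sample_by_year`, and the seed are hypothetical.

```python
import random


def sample_by_year(third_year_ids, fourth_year_ids, per_year=35, seed=None):
    """Draw an equal-sized simple random sample (without replacement) from each year group."""
    rng = random.Random(seed)
    return {
        "third_year": rng.sample(third_year_ids, per_year),
        "fourth_year": rng.sample(fourth_year_ids, per_year),
    }


# Hypothetical identifiers; only the pool sizes come from the text.
third_year_pool = [f"T{i:03d}" for i in range(1, 140)]   # 139 students
fourth_year_pool = [f"F{i:03d}" for i in range(1, 123)]  # 122 students

chosen = sample_by_year(third_year_pool, fourth_year_pool, per_year=35, seed=2014)
total_chosen = len(chosen["third_year"]) + len(chosen["fourth_year"])
```

Sampling each year group separately guarantees the 35/35 split reported in the text; a single draw of 70 from the combined pool of 261 would not.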

Instruments

Two types of writing tasks were designed. In order that the subjects could concentrate on

their writing, they were assigned to do the tasks in two separate sessions, which took place on

different days. There was no time limit on finishing each task; however, most subjects could

finish within two hours. The designs and instructions of the tasks are as follows:

Writing Task with Prompts

The writing task with prompts included two sub-tasks – picture description and Thai-English

translation. For the first sub-task, three pictures depicting people doing various activities in

different places (such as a beach where people were doing relaxing activities) were prepared.

Based on several syntactic studies which have demonstrated that production of the target

structure is likely to be enhanced by using lexical items as prompts (e.g., McDonough and

Kim 2009), twelve expressions relevant to the scene depicted in each picture were given.

They included five critical items that prompted the target structures (i.e., the verb be, the verb

get, two past participle forms, and the expletive there) and seven fillers, which were related to

other constructions or provided unfamiliar vocabulary (e.g., on vacation, easel). To minimize

hints to the students, words corresponding to the same target structure (e.g., there and be,

be/get and a past participle) were placed separately, with one or more fillers between them.

Each subject was randomly presented with one of the three pictures. At the top of the

picture, there was an instruction to write what they saw by using about ten to twelve

sentences. A list of twelve lexical items was provided in the box below the instruction. The

subjects were encouraged to use the given expressions in their description of the picture, and

they were allowed to use any of those expressions more than once.

As to the second sub-task, a test containing eight Thai sentences was constructed. Two

sentences were targeted at the passive construction; two were targeted at the existential

construction; and the other four served as fillers. A list of twelve expressions relevant to the

content in the eight sentences was provided. The list included five critical items prompting

the target structures (the verb be, the verb get, two past participle forms, and the expletive

there) and seven fillers related to other constructions or providing unfamiliar vocabulary. The

critical items of the same target structure were placed separately.

The students were instructed to translate all sentences into English. They were

encouraged to use the given expressions in the box below the instruction. They were allowed

to use any of the expressions more than once.

Free Writing Task

For the second session of the tasks, the subjects were asked to write one essay on a topic of

their own interest or choose one of six suggested topics. Three of these topics were

non-academic (e.g., ‘my favorite hobbies’) while the other three were concerned with more

serious or academic issues (e.g., ‘the problem of deforestation in my country’). The objective

of this task was to stimulate the subjects to tell stories they were interested in – concerning

relaxing or serious topics – from their own experience and linguistic knowledge by using

expressions and structures they were familiar with.

The subjects were given sheets of paper; instructions were in English. They were told

to write an essay (approximately 1,200-1,500 words) about one topic. Moreover, to ensure

that the target sentences obtained from the task would be sufficiently substantial for the

analysis, the researcher encouraged the subjects to write more than one essay on different

topics. Since all of the subjects had taken several English writing classes, most of them were

able to write on two topics and some subjects could finish three topics in one writing session.

The total number of the essays written by the 70 subjects was 158.

Results

This section is divided into three parts. The first part involves the passive construction. The

second part discusses the existential construction. The last part analyzes how the Thai

learners’ use of the constructions reflects the general, universal characteristics of ELF.

Thai Learners’ Use of the English Passive Construction

Table 4 presents the number of passive sentences and passive verb phrases taken from each

task and sub-task. Since several sentences contained more than one passive verb phrase, the

number of passive verb phrases exceeded that of passive sentences.

Table 4 Number of Passive Sentences and Passive Verb Phrases

Task/Sub-task Number of Passive Sentences Number of Passive Verb Phrases

Picture description 65 70

Translation 271 277

Essay writing 455 501

Total 791 848
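
As a quick consistency check on Table 4, the sketch below sums the transcribed counts. It also shows where the two denominators used in the later passive tables come from: 501 passive verb phrases from essay writing, and 347 from the two sub-tasks with prompts combined. The variable names are illustrative only.

```python
# Counts transcribed from Table 4: (passive sentences, passive verb phrases).
table4 = {
    "Picture description": (65, 70),
    "Translation": (271, 277),
    "Essay writing": (455, 501),
}

total_sentences = sum(sentences for sentences, _ in table4.values())
total_verb_phrases = sum(phrases for _, phrases in table4.values())

# Elicited data: the two sub-tasks of the writing task with prompts.
elicited_sentences = table4["Picture description"][0] + table4["Translation"][0]
elicited_verb_phrases = table4["Picture description"][1] + table4["Translation"][1]

print(total_sentences, total_verb_phrases)        # 791 848
print(elicited_sentences, elicited_verb_phrases)  # 336 347
```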

Of these three data sources, the sentences from the students’ essays are considered the

best indicator of how the Thai students used the English construction. The sentences from

essay writing are naturalistic, or naturally occurring data; the students produced these

sentences from their own linguistic repertoire, with no hints or stimulation to use any

particular features through the provided word prompts, pictures, or Thai counterpart

sentences. Accordingly, the results from the essay and the writing with prompts are presented

separately for both the passive and existential constructions. This is to see whether the results

from the naturalistic data and elicited data supported each other regarding the Thai students’

use of the constructions.

Table 5 Passive Verb Forms

Essay Writing
Verb Form Frequency
Present simple 232 (46.3%)
Past simple 67 (13.4%)
The modal can 56 (11.2%)
Present perfect 29 (5.8%)
The modal should 26 (5.2%)
To infinitive 25 (4.9%)
Future simple 23 (4.6%)
Present participle & gerund 8 (1.6%)
The modal may 8 (1.6%)
The modal could 7 (1.4%)
The modal must 4 (0.8%)
The modal have to 4 (0.8%)
The modal would 3 (0.6%)
Bare infinitive 2 (0.4%)
Present continuous 2 (0.4%)
The modal might 2 (0.4%)
Past continuous 1 (0.2%)
Past perfect 1 (0.2%)
Imperative 1 (0.2%)
Total 501

Picture Description & Translation
Verb Form Frequency
Present simple 147 (42.4%)
Present perfect 60 (17.3%)
The modal can 53 (15.3%)
Past simple 38 (11%)
Future simple 34 (9.8%)
To infinitive 4 (1.2%)
Present participle & gerund 4 (1.2%)
Present continuous 2 (0.6%)
Present perfect continuous 2 (0.6%)
Past continuous 1 (0.2%)
The modal would 1 (0.2%)
The modal may 1 (0.2%)
Total 347

1. Passive Verb Forms

Despite the variety of passive verb forms they used, the Thai students predominantly wrote passive

sentences in the present simple tense in essay writing (46.3%) and in the picture description

and translation (42.4%). Since Thai verbs do not have inflection to show the time reference,

this finding shows that many Thai students generalize the use of the present simple tense to

various situations – not only facts or timeless events but also other situations whose time

reference they do not want to specify.

2. Auxiliary Verbs

The students’ passive sentences were predominantly formed by the typical passive auxiliary

verb be in essay writing (97.2%) and in the picture description and translation (96.8%). This

finding shows that Thai students usually produce the basic form of the English passive verb

phrase; variant forms containing other auxiliaries are uncommon.

Table 6 Auxiliary Verbs

Essay Writing
Auxiliary Frequency
be 487 (97.2%)
become 6 (1.2%)
get 4 (0.8%)
feel 3 (0.6%)
look 1 (0.2%)
Total 501

Picture Description & Translation
Auxiliary Frequency
be 336 (96.8%)
get 10 (2.9%)
seem 1 (0.3%)
Total 347
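
The percentages in Table 6 can be reproduced from the raw frequencies. The sketch below is illustrative only: the helper name `percentage_distribution` is an assumption, and rounding to one decimal place is inferred from the table rather than stated by the author.

```python
def percentage_distribution(counts):
    """Return each category's share of the total as a percentage, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Frequencies transcribed from Table 6.
essay_auxiliaries = {"be": 487, "become": 6, "get": 4, "feel": 3, "look": 1}
prompted_auxiliaries = {"be": 336, "get": 10, "seem": 1}

essay_pct = percentage_distribution(essay_auxiliaries)
prompted_pct = percentage_distribution(prompted_auxiliaries)

print(essay_pct["be"], prompted_pct["be"])  # 97.2 96.8
```

The same helper applies to any of the frequency tables in this section, since each reports counts against a single total.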

3. The Agentless Passive

The students frequently omitted agent phrases in essay writing (77.4%) and in the picture

description and translation (66%). This suggests that Thai students perceive the most distinct

pragmatic property of the construction, i.e., to talk about an event from the perspective of the

theme, thereby making the agent less prominent and very often eliminating it from

the structure (O’Grady 2001). Because they are aware of the agent’s downgraded status,

Thai students tend to produce the passive without explicitly identifying the doer of the action.

Table 7 Agentless Passive

Essay Writing Picture Description & Translation

Agent Phrase Frequency Agent Phrase Frequency

Passives with no agent 388 (77.4%) Passives with no agent 229 (66%)

Passives with an agent 113 (22.6%) Passives with an agent 118 (34%)

Total 501 Total 347

4. Contexts for Agent Omission

The students most often omitted the agent phrase when it was unidentified or irrelevant to the

point being discussed in essay writing (61.3%) and in the picture description and translation

(65.1%). This is the context that is also most typical in native speaker English (Finegan

2004). This finding supports the result reported under the previous heading. Thai students perceive the

English passive as the structure for downgrading the agent role; thus, they tend to choose the

passive when they do not know or are not interested in the agent.

Table 8 Contexts for Agent Omission

Essay Writing Picture Description & Translation

Context Frequency Context Frequency

Unknown or irrelevant 238 (61.3%) Unknown or irrelevant 149 (65.1%)

Predictable by context 87 (22.4%) Referring to people 63 (27.5%)

Referring to people 41 (10.6%) Predictable by context 16 (7%)

Predictable by world knowledge 22 (5.7%) Predictable by world knowledge 1 (0.4%)

Total 388 Total 229

5. Prepositions of the Agent Phrases

The students mostly put the preposition by before the agent phrase in essay writing (76.1%)

and in the picture description and translation (60.2%). This finding reveals once again that

Thai students usually write passive sentences of the basic, typical pattern; they frequently use

the typical preposition by as the agent marker.

Table 9 Prepositions of Agent Phrases

Essay Writing
Preposition Frequency
by 86 (76.1%)
to 11 (9.7%)
with 5 (4.4%)
because of 4 (3.5%)
due to 4 (3.5%)
from 3 (2.7%)
Total 113

Picture Description & Translation
Preposition Frequency
by 71 (60.2%)
from 41 (34.7%)
with 6 (5.1%)
Total 118

Table 10 Weight of Agent Phrases

Essay Writing Picture Description & Translation

Weight Frequency Weight Frequency

1-2 words 51 (45.1%) 1-2 words 37 (31.4%)

3-4 words 28 (24.8%) 3-4 words 35 (29.7%)

5-6 words 11 (9.7%) 5-6 words 37 (31.4%)

7-8 words 7 (6.2%) 7-8 words 6 (5.1%)

9-10 words 7 (6.2%) 9-10 words 1 (0.8%)

11-12 words 2 (1.8%) 11-12 words 1 (0.8%)

13 words or more 7 (6.2%) 13 words or more 1 (0.8%)

Total 113 Total 118

6. Weight of the Agent Phrases

The agent phrases mostly belonged to the two lightweight groups containing not more than

four words in essay writing (69.9%) and in the picture description and translation (61.1%).

This indicates that Thai students do not associate the construction with the end-weight

principle. For English speakers, the passive is preferred when the retained agent phrase is

long because it is allowed to occur at the end of the sentence – the usual position for a heavy

element in the language (Downing and Locke 2006). For Thai students, however, the agent

phrase tends to be short. This finding is not surprising given that the principal pragmatic

property of the passive is concerned with the theme being topical and the agent being

downgraded. Many Thai students are aware only of this distinct pragmatic property, which involves

the omission of the agent, and they do not recognize other additional functions including the

end-weight principle, which involves the presence of the agent.
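
The share of lightweight agent phrases can be recomputed from the counts in Table 10. The following is a minimal sketch; the band labels and the name `light_share` are illustrative, and the two lightest bands are taken to correspond to "not more than four words".

```python
# Agent-phrase counts by length band, transcribed from Table 10.
essay_weights = {"1-2": 51, "3-4": 28, "5-6": 11, "7-8": 7,
                 "9-10": 7, "11-12": 2, "13+": 7}
prompted_weights = {"1-2": 37, "3-4": 35, "5-6": 37, "7-8": 6,
                    "9-10": 1, "11-12": 1, "13+": 1}

LIGHT_BANDS = ("1-2", "3-4")  # agent phrases of not more than four words


def light_share(weights):
    """Percentage of agent phrases that fall in the lightweight bands."""
    light = sum(weights[band] for band in LIGHT_BANDS)
    return round(100 * light / sum(weights.values()), 1)


essay_light = light_share(essay_weights)        # 79 of 113 phrases
prompted_light = light_share(prompted_weights)  # 72 of 118 phrases
print(essay_light, prompted_light)              # 69.9 61.0
```

Computed from the raw counts, the second figure comes to 61.0%. The 61.1% given in the text matches the sum of the two rounded band percentages (31.4% + 29.7%), so the small difference appears to be a rounding effect.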

7. The Theme Subjects

The theme subjects were often expressed as given and definite noun phrases in essay writing

(43.3%) and in the picture description and translation (32.8%). In fact, the most common

correlation in the sub-tasks with prompts was new and indefinite subjects (34.9%). Since

these sub-tasks gave the pictures and sentences for translation with no prior context, the

students were likely to present the subject nouns mentioned for the first time as new and

indefinite. Yet, a close relation between given information and definiteness could be

identified. Therefore, in general, Thai students produce passive subjects showing the most

typical correlation between information structure and definiteness. The subjects of their

passive sentences – like those of native speakers – are usually given and definite.

Table 11 Theme Subjects

Essay Writing Picture Description & Translation

Theme Subject Frequency Theme Subject Frequency

Given & definite 217 (43.3%) New & indefinite 121 (34.9%)

Given & indefinite 103 (20.5%) Given & definite 114 (32.8%)

New & indefinite 80 (16%) Given & indefinite 68 (19.6%)

New & definite 76 (15.2%) New & definite 41 (11.8%)

Dummy it 20 (4%) Dummy it 2 (0.6%)

Interrogative pronoun 5 (1%) Interrogative pronoun 1 (0.3%)

Total 501 Total 347

8. Sentence Types by Grammatical Structures

The students often produced passive sentences in two structures – the simple and complex

structures – in essay writing (81.1%) and in the picture description and translation (96.4%).

This means that when writing in English, Thai students often express their idea in one

independent clause, i.e., the simple structure, which is considered the basic sentence structure.

In cases where they want to expand the message, they usually do it by adding one or more

dependent clauses to the independent clause, resulting in the complex structure.

Table 12 Sentence Types by Grammatical Structures

Essay Writing Picture Description & Translation

Sentence Type Frequency Sentence Type Frequency

Complex 235 (51.6%) Simple 258 (76.8%)

Simple 134 (29.5%) Complex 66 (19.6%)

Compound-complex 58 (12.7%) Compound 10 (3%)

Compound 28 (6.2%) Compound-complex 2 (0.6%)

Total 455 Total 336

9. Sentence Types by Communicative Purposes

The predominant sentence type of the passive produced by the students in essay writing and

in the picture description and translation was the declarative (95.8% and 99.4%, respectively).

As with several of the earlier results, this finding suggests that Thai students usually produce

passives of the basic form; most passive sentences belong to the declarative structure, which

is considered the canonical sentence type.

Table 13 Sentence Types by Communicative Purposes

Essay Writing
Sentence Type Frequency
Declarative 480 (95.8%)
Indirect interrogative 17 (3.4%)
Direct interrogative 4 (0.8%)
Total 501

Picture Description & Translation
Sentence Type Frequency
Declarative 345 (99.4%)
Indirect interrogative 2 (0.6%)
Total 347

10. Basic and Non-Basic Passives

The passive sentences mostly belonged to the basic passive structure in essay writing (98%)

and in the picture description and translation (97.1%). Once again, the finding shows Thai

students’ preference for the basic structure; they tend to produce passive sentences of the

basic type. Non-basic passives are rare.

Table 14 Basic and Non-Basic Passives

Essay Writing
Passive Type Frequency
Basic 491 (98%)
Ditransitive passive 6 (1.2%)
Get passive 4 (0.8%)
Total 501

Picture Description & Translation
Passive Type Frequency
Basic 337 (97.1%)
Get passive 10 (2.9%)
Total 347

Thai Learners’ Use of the English Existential Construction

Table 15 presents the number of existential sentences and clauses taken from each task and

sub-task. Since some sentences contained two existential clauses, the number of the

existential clauses was a little higher than that of the existential sentences.

Table 15 Number of Existential Sentences and Clauses

Task/Sub-task Number of Existential Sentences Number of Existential Clauses

Picture description 121 125

Translation 162 163

Essay writing 244 248

Total 527 536

1. Verb Forms

The students wrote existential sentences mainly in the present simple tense for essay writing

(87.1%) and the picture description and translation (93.1%). This shows that many Thai

students generalize the use of the present simple tense to talk about not only facts and habits,

but also other event types in which they do not want to clarify the time reference.

Table 16 Verb Forms

Essay Writing
Verb Form Frequency
Present simple 216 (87.1%)
Past simple 16 (6.5%)
Future simple 6 (2.4%)
Present perfect 2 (0.8%)
The modal may 2 (0.8%)
The modal would 2 (0.8%)
The modal must 2 (0.8%)
Past perfect 1 (0.4%)
The lexical verb seem to 1 (0.4%)
Total 248

Picture Description & Translation
Verb Form Frequency
Present simple 268 (93.1%)
Past simple 10 (3.5%)
Present perfect 5 (1.7%)
Future simple 4 (1.4%)
The modal might 1 (0.3%)
Total 288

2. Types of Verbs

The students overwhelmingly chose the typical verb be in essay writing (98%) and in the

picture description and translation (100%). The finding reflects that Thai students usually

produce existential sentences of the basic form. Moreover, it suggests that they consider the

form ‘there + be’ an essential part of the construction; they treat this specific pattern as an

idiomatic expression whose elements always co-occur and do not allow much variation.

Table 17 Types of Verbs

Essay Writing
Verb Frequency
be 243 (98%)
come 2 (0.8%)
come up with 1 (0.4%)
remain 1 (0.4%)
seem to be 1 (0.4%)
Total 248

Picture Description & Translation
Verb Frequency
be 288 (100%)
Total 288

3. Types of the Displaced Subjects

The students strongly associated the existential construction with countable nouns in essay

writing (85.5%) and in the picture description and translation (95.1%). This reflects that Thai

students use the construction mainly to talk about the presence of countable, discrete entities.

This is not surprising because countable nouns are the most common type of nouns, and all

the results reported so far have shown that Thai students tend to use the basic, typical forms

of the English constructions.

Table 18 Types of Displaced Subjects

Essay Writing
Displaced Subject Frequency
Countable noun 212 (85.5%)
Abstract noun 15 (6.1%)
Uncountable noun 13 (5.2%)
Indefinite pronoun 8 (3.2%)
Total 248

Picture Description & Translation
Displaced Subject Frequency
Countable noun 274 (95.1%)
Uncountable noun 13 (4.5%)
Countable & uncountable nouns 1 (0.4%)
Total 288

Table 19 Weight of Displaced Subjects

Essay Writing Picture Description & Translation

Weight Frequency Weight Frequency

1-2 words 36 (14.5%) 1-2 words 27 (9.4%)

3-4 words 34 (13.7%) 3-4 words 65 (22.5%)

5-6 words 47 (19%) 5-6 words 25 (8.7%)

7-8 words 42 (16.9%) 7-8 words 28 (9.7%)

9-10 words 27 (10.9%) 9-10 words 46 (16%)

11-12 words 17 (6.9%) 11-12 words 31 (10.8%)

13-14 words 12 (4.8%) 13-14 words 11 (3.8%)

15 words or more 33 (13.3%) 15 words or more 55 (19.1%)

Total 248 Total 288

4. Weight of the Displaced Subjects

The displaced subjects mostly belonged to heavy weight groups containing more than four

words in essay writing (71.8%) and in the picture description and translation (68.1%). This

implies that Thai students are aware of the most distinctive pragmatic function of the construction, i.e.,

to introduce a new referent into the discourse (Collins 2002). Since the displaced subject is

new or unfamiliar to the addressee, it needs detailed description, resulting in the form of a

long noun phrase.
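The two "more than four words" percentages quoted above can be rechecked from the frequencies in Table 19. The short Python sketch below is an illustration added for the reader, not part of the study's own procedure. The names ESSAY, PICTURE and heavy_share are hypothetical, and "heavy" is assumed to mean every weight group from 5-6 words upward.

```python
# Frequencies copied from Table 19, ordered from "1-2 words" to "15 words or more".
ESSAY = [36, 34, 47, 42, 27, 17, 12, 33]
PICTURE = [27, 65, 25, 28, 46, 31, 11, 55]


def heavy_share(counts, first_heavy_bin=2):
    """Percentage of displaced subjects in the heavy groups.

    Assumption: the first two bins (1-2 and 3-4 words) are "light",
    so the heavy groups start at index 2 (5-6 words).
    """
    total = sum(counts)
    heavy = sum(counts[first_heavy_bin:])
    return round(100 * heavy / total, 1)


print("Essay writing:", heavy_share(ESSAY))
print("Picture description and translation:", heavy_share(PICTURE))
```

Under that assumption the sketch sums 178 of 248 subjects for essay writing and 196 of 288 for the picture description and translation, which matches the reported 71.8% and 68.1%.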

5. Information Structure and Definiteness of the Displaced Subjects

Most of the displaced subjects were expressed as new and indefinite noun phrases in essay

writing (78.6%) and in the picture description and translation (99%). This is also the most

typical kind of correlation for native speakers (Huddleston and Pullum 2005). This finding

confirms that Thai students are aware of the principal pragmatic function of the construction.

They usually encode the displaced subject that is newly introduced as an indefinite noun

phrase.

Table 20 Information Structure and Definiteness of Displaced Subjects (frequency by task)

Displaced Subject     Essay Writing    Picture Description & Translation
New & indefinite      195 (78.6%)      285 (99%)
Given & indefinite    50 (20.2%)       0
New & definite        2 (0.8%)         3 (1%)
Given & definite      1 (0.4%)         0
Total                 248              288

6. Types of Existential Sentences

The number of bare existential sentences was much higher than extended ones in essay

writing (73.4%) and in the picture description and translation (75.3%). This indicates that

Thai students usually produce existential sentences in the basic structure, containing just the

three main components (there + be + displaced subject), without an additional extension.

Table 21 Types of Existential Sentences (frequency by task)

Type        Essay Writing    Picture Description & Translation
Bare        182 (73.4%)      217 (75.3%)
Extended    66 (26.6%)       71 (24.7%)
Total       248              288

7. Types of Modifiers of the Bare Existential

Most of the displaced subjects of the bare structure contained one or more modifiers in essay

writing (89%) and in the picture description and translation (98.6%). Moreover, relative

clauses and prepositional phrases accounted for a large proportion of modifiers in both data

sources (53.1% and 68.6%, respectively). They are among the most common modifiers of

nouns in English (Downing and Locke 2006). Since Thai students are aware that the main

pragmatic function of the construction is to introduce a novel entity, they tend to give the full

description of this unfamiliar referent by adding various pre- and post-modifiers.

Table 22 Types of Modifiers of the Bare Existential

Essay Writing
Modifier                      Frequency
Relative clause               71 (29.2%)
Prepositional phrase          58 (23.9%)
Adjective                     42 (17.3%)
Infinitive phrase             29 (11.9%)
Present participial phrase    14 (5.8%)
Adjective phrase              9 (3.7%)
Noun phrase                   7 (2.9%)
Past participial phrase       7 (2.9%)
Noun                          6 (2.5%)
Total                         243

Picture Description & Translation
Modifier                      Frequency
Relative clause               101 (35%)
Prepositional phrase          97 (33.6%)
Present participial phrase    47 (16.3%)
Adjective                     28 (9.7%)
Noun                          7 (2.4%)
Past participial phrase       4 (1.4%)
Adjective phrase              3 (1%)
Noun phrase                   1 (0.3%)
Adverb                        1 (0.3%)
Total                         289

8. Types of Extensions of the Extended Existential

Locative extensions were common in the extended existential sentences in essay writing

(72.2%) and in the picture description and translation (77.3%). The locative expression is the

most common type of extension in native speaker English (Huddleston and Pullum 2005).

Once again, the result shows that Thai students usually produce existential sentences of the

basic, typical structure. They prefer the locative expression, which is the most typical

extension of the extended existential structure.

Table 23 Types of Extensions of the Extended Existential

Essay Writing
Extension     Frequency
Locative      52 (72.2%)
Temporal      19 (26.4%)
Comparison    1 (1.4%)
Total         72

Picture Description & Translation
Extension     Frequency
Locative      58 (77.3%)
Temporal      17 (22.7%)
Total         75

9. Sentence Types by Grammatical Structures

The students mostly produced existential sentences in two structures – the complex and

simple structures – in essay writing (80.3%) and in the picture description and translation

(93.6%). Like the passive, when writing an argument structure, Thai students often express

their idea in one independent clause, i.e., the simple structure. When they want to expand the

message, they usually do it by adding one or more dependent clauses to the independent

clause, creating the complex structure.

Table 24 Sentence Types by Grammatical Structures (frequency by task)

Sentence Type       Essay Writing    Picture Description & Translation
Complex             126 (51.6%)      135 (47.7%)
Simple              70 (28.7%)       130 (45.9%)
Compound-complex    39 (16%)         7 (2.5%)
Compound            9 (3.7%)         11 (3.9%)
Total               244              283

10. Sentence Types by Communicative Purposes

The predominant sentence type of the existential construction produced by the students in

essay writing and in the picture description and translation was the declarative (98.8% and

100% respectively). This finding suggests again that Thai students usually produce existential

sentences of the basic structure; most sentences belong to the declarative, canonical form.

Table 25 Sentence Types by Communicative Purposes

Essay Writing
Sentence Type             Frequency
Declarative               245 (98.8%)
Indirect interrogative    3 (1.2%)
Total                     248

Picture Description & Translation
Sentence Type             Frequency
Declarative               288 (100%)
Total                     288

Thai Learners and Universal Characteristics of ELF

All these characteristics of passive and existential sentences reveal one fact about Thai

learners’ usage of the English constructions. That is, when used by native speakers, the two

constructions are known to be associated with a variety of basic and non-basic properties.

However, when used by Thai learners, the two constructions are simplified and generalized to

such an extent that they usually exhibit only the most distinct, fundamental properties in

terms of syntax, semantics, and pragmatics. Accordingly, sentences in the two constructions

produced by Thai learners are much more limited in terms of structural complexity and

semantic and pragmatic value. It is important to note that passive and existential sentences

produced by native speakers are also associated with basic linguistic characteristics, but the

association is not as strong, and thus various non-basic patterns are prevalent.

Applying such unique usage to the ELF framework, we find that the properties of the

passive and existential constructions produced by Thai learners serve to reflect three general

and universal characteristics of ELF. These include (i) simplification, (ii) regularization, and

(iii) analogy.

Simplification – Simple and Basic Structural Patterns

Simplification is revealed in many properties of the constructions produced by Thai learners.

Most of them involve syntax; shortened and basic syntactic forms are preferred to complex

and non-basic ones. This characteristic results in the association of the constructions with

simple, basic structural patterns.

For instance, passive and existential sentences are usually of the basic type; the

passive consists of the typical auxiliary be and a past participle while the existential structure

is made up of the expletive there, the typical verb be, and the displaced subject. More

complex or non-basic forms, such as ditransitive passives and extended existential sentences,

are not frequently found among Thai learners.

Regularization and Analogy – No Variety in Form and Meaning

Regularization and analogy are reflected by many properties of the two constructions

produced by Thai learners. They involve syntax, semantics, and pragmatics; various kinds of

constructional features are regularized and generalized to become more general and consistent

on the basis of predominant cases. These characteristics result in no great variety in the use of

the constructions.

In terms of syntax, for example, passive and existential sentences do not appear in

various verb forms. In most cases, they are in the present simple tense, which is regarded as

the unmarked verb form of English. Moreover, since by is the typical marker of the passive

agent (Parrott 2000), most agent phrases produced by Thai learners – by means of analogy –

are introduced by this preposition. Likewise, since the majority of nouns are countable,

almost all existential sentences produced by Thai learners talk about the presence of this type

of nouns which function as the displaced subject.

As to semantics and pragmatics, for example, both the theme subject of the passive

and the displaced subject of the existential follow the main tendencies of the constructional

usage. On the basis of predominant cases, the former usually appears as given and definite

and the latter as new and indefinite. Moreover, like native speakers who mainly choose the

passive when they want to focus the theme and downgrade the agent, Thai learners frequently

omit the agent phrase in their passive sentences. Likewise, the forms of displaced subjects in

Thai learners’ existential sentences are quite consistent. As entities newly introduced, most

displaced subjects are structurally heavy, containing various modifiers, particularly relative

clauses and prepositional phrases, which are among the most common kinds of English noun

modifiers.

Associated with these characteristics – simplification, regularization, and analogy –

English passive and existential sentences produced by Thai learners are involved with only

the most distinct and fundamental properties in syntax, semantics, and pragmatics. Moreover,

their uses are more regular and consistent, not as varied as those of native speakers. In other

words, due to these universal tendencies of second language usage, Thai learners treat the

passive and existential constructions in English as ‘idiomatic expressions’ or ‘pre-fabricated

chunks’ which are made up of rather fixed components and do not allow much variation and

flexibility in both form and meaning.

Discussion

Based on the characteristics of the students’ use of the English passive and existential

constructions, we can draw four general properties of argument structures typically produced

by Thai learners of English.

The Present Simple Tense

Thai learners usually produce argument structure constructions in the present simple tense.

The preference for this tense is largely due to first language interference. Thai verbs do not

have inflection to show tense or time reference; situation and context provide clues to avoid

any ambiguity (Swan and Smith 2001). The present simple tense in English is ‘the unmarked

tense’; it is used to describe general actions and states which are not viewed as being in any

way temporary or limited in time (Parrott 2000). Thus, the present simple tense is generalized

to talk about various situations whose time reference is not needed. For example, the results

of the study show that although passive and existential sentences produced by Thai learners

appear in various verb forms, the predominant verb form for both constructions is the present

simple tense.

The Basic Sentence Type

Thai learners usually express argument structure constructions in the basic sentence type. On

the criterion of grammatical structures, sentences in a clausal construction occur frequently in

the simple structure. When they are used to convey an extended message, they are likely to

appear in the complex structure, by attaching one or more dependent clauses to the existing

independent clause. As to the criterion of communicative purposes, sentences in a clausal

construction frequently appear in the declarative form, with the basic SVO order. For

instance, the results of the study indicate that many passive and existential sentences

produced by Thai learners are in the simple, declarative structures.

The Most Basic Structure

Many English clausal constructions have their variant structures, which slightly differ in form

and meaning (Goldberg 1995). Thai learners prefer the most basic structure, which is made

up of only the core components of the construction. For example, the basic passive

construction, consisting of the typical auxiliary be and a past participial verb, is the kind of

passive structure most commonly produced by Thai learners. Likewise, the bare existential

construction, consisting of the expletive subject there, the typical verb be, and the displaced

subject, is the most prevalent existential structure found in Thai learners’ writing.

The Most Distinct, Fundamental Meaning

Thai learners are usually aware of the most distinct, fundamental meaning. Thus, their usage

of the construction is relatively limited, without variation in meaning. For example, the

English passive typically serves to put the theme as the topic (O’Grady 2001). Many Thai

learners produce passive sentences with this pragmatic tendency by having a given and

definite theme subject and an omitted agent phrase. Likewise, the English existential typically

has the presentational function of a novel entity (Collins 2002). Thai learners’ existential

sentences tend to have a new and indefinite displaced subject, which is in the form of a long

noun phrase having several modifiers to describe the subject referent.

Additional semantic or pragmatic properties are unlikely to be observed by Thai

learners. For instance, another pragmatic property of the English passive involves the end-

weight principle: the passive is used to place a long agent phrase in sentence-final position

(Downing and Locke 2006). Contrary to this principle, Thai learners’ passive sentences tend

to contain short agent phrases. Likewise, the existential construction has some additional

functions, such as providing circumstantial background and reintroducing a referent already

mentioned (Collins 2002, Ward and Birner 1995). Thai learners’ existential sentences do not

usually convey these functions; they are used mostly for the presentational function.

Therefore, compared to the native speaker norms, Thai learners’ use of argument

structure constructions is much more limited in syntax, semantics, and discourse functions. In

general, there is the tendency for Thai learners to treat argument structure constructions in

English as ‘idiomatic expressions’ or ‘pre-fabricated chunks’ which are made up of rather

fixed components and are used to convey one meaning, and hence do not allow much

variation in both form and meaning.

What motivates such usage among Thai learners? Like many deviations found in the

phonology and pragmatics of other varieties of ELF, the motivation of Thai learners’ distinct

usage of clausal constructions is the need for simplicity and effectiveness in communication.

Thai learners have developed their own version of an argument structure construction, which

is simpler and more consistent than the native speaker norms. Because this version is

associated with one particular form and one particular meaning, with little variation, it

ensures mutual understanding and successful communication. Therefore, the present study

supports the precept of the ELF approach, which holds that there is a universal tendency for

L2 speakers to make some changes in the way they use English and shift their focus to

simplicity and effectiveness in communication.

Conclusion and Suggestions

In conclusion, this study investigated Thai learners’ use of the English passive and existential

constructions. Data were taken from 70 English-major undergraduate students who

represented ELF speakers at the upper-intermediate level. Two kinds of writing tasks were

designed to collect the data – writing with prompts and free essay writing. The results

revealed that Thai learners’ constructions deviate from the native speaker norms in that they

are much more limited in terms of structural complexity as well as semantic and pragmatic

functions. Moreover, the results reflected that Thai learners’ use of the English constructions

is also governed by three general and universal characteristics, i.e., simplification, regularization, and

analogy, which have also been found in different varieties of ELF.

The results have a pedagogical implication for teaching English argument structures to

non-native speakers. A traditional way of teaching an argument structure in many Asian

schools is by introducing the form and emphasizing its grammatical properties (i.e., a

grammar-based approach). However, this teaching method is not especially effective,

particularly in Thailand where graduates do not have sufficient skills in English (Kirkpatrick

2012). As shown by the study, there is the tendency for L2 speakers to use a clausal

construction in a simple and consistent pattern by associating it with only the most basic and

distinct properties in syntax, semantics, and pragmatics. This overall result suggests that the

process of teaching an argument structure should be divided into steps based on all the

properties involved. Basic and principal properties in both form and meaning should be

introduced earlier than non-basic and additional ones because they can be treated like

formulaic expressions or chunks, which are easier to acquire. Once learners can pick up the

basic form and meaning of a construction, teachers should present its variant characteristics

and the specific nuances of semantic and pragmatic meanings conveyed by them so that the

learners can use the construction in a more complex and varied way. Such steps of teaching

an argument structure are in accordance with Ellis’ (2005) principle of second language

acquisition that formulaic expressions serve as a basis for the later development of more

complicated features which require a rule-based competence.

The study has extended the scope of CxG from L1 settings to L2 phenomena. Most

studies in the CxG approach have focused on the formal properties of various constructions in

English and other languages from the perspective of native speakers’ reception and

production. The results of this study have revealed differences in the constructional use

between L1 and L2 speakers, which serve to provide guidelines of teaching argument

structure constructions to English learners. Moreover, the study has broadened the scope of

ELF research, which has focused on phonological and pragmatic features of ELF interactions,

with just a little description at the lexical-grammatical level (Cogo and Dewey 2006,

Seidlhofer 2004). The results have demonstrated that ELF speakers’ deviations from Standard

English at all levels – sounds, words, phrases, discourse, and also sentences – are governed by

the universal characteristics of second language usage, which reflects the underlying

motivations of ELF speakers to shape the language in the direction that results in a ‘simple

and effective’ form of communication.

However, all data in the study involved only written English. In fact, the spoken form

of language is considered more natural (Stewart, Jr. and Vaillette 2001), and an analysis of

data taken from both written and spoken English should reflect more precise characteristics of

the constructions. Moreover, the subjects in the study were from only one institution; data

from various institutions should better represent Thai ELF learners. Therefore, future research

that includes both written and spoken English and participants from various institutions

should be able to describe Thai learners’ use of English argument structure constructions in

more precise and specific detail.

Acknowledgements

This research project was supported by the Department of Foreign Languages, Faculty of

Humanities, Kasetsart University.

References

Bley-Vroman, R. (1988). The fundamental character of foreign language learning. In W.

Rutherford and M. Sharwood-Smith (Eds.), Grammar and second language teaching: A

book of readings (pp. 19-30). Rowley, MA: Newbury House.

Breiteneder, A. (2009). English as a lingua franca in Europe: An empirical perspective. World

Englishes, 28(2), 256-269.

Cogo, A. and M. Dewey. (2006). Efficiency in ELF communication: From pragmatic motives

to lexico-grammatical innovation. Nordic Journal of English Studies, 5(2), 59-94.

Collins, P. (2002). Some discourse functions of existentials in English. In C. Allen (Ed.), The

Proceedings of the 2001 Conference of the Australian Linguistic Society (pp. 1-6).

Australia: Canberra.

Dewey, M. (2007). English as a lingua franca and globalization: An interconnected

perspective. International Journal of Applied Linguistics, 17(3), 332-354.

Downing, A. and P. Locke. (2006). English grammar: A university course (2nd ed.). London

and New York: Routledge.

Ellis, R. (2005). Principles of instructed language learning. Asian EFL Journal, 7(3), 9-24.

Fillmore, C. J. (1988). The mechanisms of “construction grammar”. BLS 14, 35-55.

Finegan, E. (2004). Language: Its structure and use (4th ed.). Boston, MA: Wadsworth.

Goldberg, A. E. (1995). A construction grammar approach to argument structure. Chicago

and London: University of Chicago Press.

Goldberg, A. E. (2002). Construction grammar. In L. Nadel (Ed.), Encyclopedia of Cognitive Science (pp.

813-816). London: Macmillan.

Goldberg, A. E. (2009). The nature of generalization in language. Cognitive Linguistics, 20(1), 93-127.

Huddleston, R. and G. Pullum. (2005). A student’s introduction to English grammar.

Cambridge: Cambridge University Press.

Jenkins, J. (2006). Points of view and blind spots: ELF and SLA. International Journal of

Applied Linguistics, 16(2), 137-162.

Jenkins, J., A. Cogo, and M. Dewey. (2011). Review of developments in research into English as a

Lingua Franca. Language Teaching, 44(3), 281-315.

Kirkpatrick, R. (2012). English education in Thailand: 2012. Asian EFL Journal, 61.

Retrieved July 20, 2013 from http://www.asian-elf-journal.com

McDonough, K. and Y. Kim. (2009). Syntactic priming, type frequency, and EFL learners’

production of wh-questions. The Modern Language Journal, 93(3), 386-398.

O’Grady, W. (2001). The syntax files. Honolulu: University of Hawai‘i at Manoa.

Parrott, M. (2000). Grammar for English language teachers. Cambridge: Cambridge

University Press.

Seidlhofer, B. (2001). Closing a conceptual gap: The case for a description of English as a

lingua franca. International Journal of Applied Linguistics, 11(2), 133-158.

Seidlhofer, B. (2004). Research perspectives on teaching English as a lingua franca. Annual Review of

Applied Linguistics, 24, 209-239.

Stewart, Jr., T. and N. Vaillette (Eds.). (2001). Language files: Materials for an introduction

to language and linguistics (8th ed.). Columbus: The Ohio State University Press.

Swan, M. and B. Smith. (2001). Learner English: A teacher’s guide to interference and other

problems (2nd ed.). Cambridge: Cambridge University Press.

Ward, G. and B. J. Birner. (1995). Definiteness and the English existential. Language, 71(4),

722-742.

Social Class and Language Structure: A Methodological Inquiry into Bernstein's Theory of Sociology of Education

Mohammad Aliakbari, Mahmoud Qaracholloo and Ali Mansouri Nejad

Ilam University

Bioprofiles: Mohammad Aliakbari is an Associate Professor of TEFL at Ilam University, Iran. His areas of interest embrace SLA, sociolinguistics and bilingualism. Mahmoud Qaracholloo holds an M.A. in TEFL. His research interests are different aspects of English teaching, issues of sociolinguistics, and discourse analysis. Ali Mansouri Nejad is a Ph.D. candidate at the University of Ilam, Iran. His areas of interest include critical discourse analysis (CDA), co-teaching and genre analysis.

Abstract

The present study aimed at finding the differences between the language patterns of Iranian

working-class and middle-class speakers. To see if the language patterns produced by

members of different social classes have particular attributes, 100 participants from a western

city of Iran were selected from working and middle-class members. The working-class

members were selected from among salespersons, sale-assistants, and shopkeepers who

worked in groceries, department stores, supermarkets and cafés with no higher education. The

subjects selected for the middle-class sample included 16 participants with Ph.D. degrees who

were professors at Ilam University and 34 master's students of Ilam University who were

language teachers. Prompts with two topics were administered to both groups to write what

they wished for. After excluding part of the data which was not suitable for the purpose of

this study, the texts were analyzed in terms of the frequencies of total number of words,

content-words repetitions, personal pronouns, impersonal pronouns, structurally-complete

sentences, quasi-sentences, noun groups, adjective groups and verb groups. The χ² results

indicated significant differences between working and middle-class samples in terms of the

total number of words, content-words repetitions, impersonal pronouns, quasi-sentences, and

verb groups. Moreover, the findings of the study showed that middle-class members were

more productive and creative than persons from lower classes. Accordingly, this study can be

regarded as partial support of Bernstein's Language Codes Theory in an Iranian context.

Keywords: language codes theory, restricted code, elaborated code, working-class, middle-

class

1. Background

It is often claimed that social class structure is mirrored in the language patterns produced by

speakers (Holmes 1992) and that ‘there is a direct and reciprocal relationship between a

particular kind of social structure, in both its establishment and maintenance, and the way

people in that social structure use language’ (Wardhaugh 2006: 336). It is also held that

the quality of the speakers' language patterns changes according to their socio-economic

status. Therefore, the way language production interacts with social class has provided a rich

area of investigation (e.g., Allafchi 1998, Hoff-Ginsberg 1998, Richardson et al. 1976,

Walker et al. 1994).

Research along these lines has received much interest in the Iranian context in recent

years. Drawing on the relationship between language 1 and language 2 proficiency, Hosseini

(2003) studied learners’ writing characteristics in light of their socio-economic status in

Iran. The study revealed that learners with high and low socio-economic status performed

differently in their writing. Further, no significant relationship was identified between L1 and

L2 proficiency in terms of socio-economic statuses. Likewise, Aliakbari et al. (2012) analyzed

the relationship between social class and language patterns among a group of elementary

school students in Iran. The result of their study illustrated a significant relationship between

students' use of grammatical categories and their social class.

Bernstein (1973a) argues that the linguistic differences of various social class

structures lead to two dichotomous language codes: a restricted code and an elaborated code;

the former concerns the language produced by working-class people, and the latter deals with

the language patterns of middle-class language users. The difference between restricted and

elaborated language codes is so interwoven that Bernstein has developed them into two

dichotomous language codes, each one holding its own particular characteristics. More

specifically, it is argued that working-class people do not have access to the elaborated code

and language users or speakers with lower socio-economic statuses speak a language that is

not useful for academic or educational purposes.

The aforementioned language codes are thought to have advantages and disadvantages.

Ginsberg (2006) considers that less academic achievement can be attributed to insufficient

language skills. She contends that children of low socio-economic status usually

achieve less than middle-class students. Such a conclusion was strongly supported

by a host of studies which have given a specific attention to social class and written

composition (Richardson et al. 1976), the number of produced vocabulary (Tizard and

Hughes 1984), and vocabulary growth (Walker et al. 1994). Bernstein (1973a) points out that

the process of schooling needs specific language patterns to which low working-class

students have less access. In agreement with Bernstein, Christie (1999) writes that middle-

class children have access to the language code needed for educational purposes and are

successful at schools, whereas children from lower social classes lack access to it. To

maintain the platform for the present research, more elaboration of Bernstein's theory of

sociology of education and his restricted and elaborated language codes seems warranted.

1.1. Bernstein's Theory of Sociology of Education

Bernstein's social theory has been considered as a theory of sociology of education because it

is highly associated with the linguistic differences across social classes and the great effects

that linguistic differences have on the educational processes. Allafchi (1998) believes that

Bernstein was influenced by scholars such as Sapir, Mead, Von Humboldt, Cassirer, Firth,

Malinowski, Vygotsky and Luria. According to Sadovnik (2001), Durkheim has also played a

fundamental role in the formation of Bernstein's thought and Bernstein (1972) himself

acknowledged the great influence of Durkheim on his viewpoints. He believed that Durkheim

had a truly remarkable insight into the relationship between symbolic orders, social

relationship, and the structure of experience. Accepting Durkheim's social opinion, Bernstein

established the foundations of his social theory. Just like Sadovnik (2001), Atkinson (1981)

also explains that Bernstein's theory is rooted in Durkheimian ideology. However, he states that

Bernstein's sociology gradually moved toward European structuralism. According to

Allafchi (1998), as a structuralist, Bernstein was highly indebted to Whorf who believed in a

single universalistic relationship between language and worldview. Sadovnik (2001: 2) notes

that ‘from his early study on language, communication, codes, and schooling, to his later

works on pedagogic discourse, practice and educational transmission, Bernstein produced a

theory of social and educational codes and their effect on social reproduction’. The influence

of Bernstein's theory was so noticeable that Karabel and Halsey (1977) called Bernstein's

work in the field of sociology of education the ‘harbinger of a new synthesis’. In line

with Karabel and Halsey (1977: 62), Robertson (2008) called Bernstein a central actor in

developing a new sociology of education.

1.2. Restricted and Elaborated Language Codes

The distinction between public and formal languages was the source for the introduction and

development of language codes theory that stood as the core of Bernstein's social and

educational theory. Bernstein introduced and developed the language codes theory in the 1960s,

1970s and 1980s. As a pioneer, he investigated the interaction between informal languages,

power and shared meaning (Bernstein 1958, 1960, 1961). The study on the nature of informal

and formal languages led to the introduction of restricted and elaborated language codes.

Bernstein concentrated all his attention on the development of restricted and elaborated

language codes (Bernstein 1962a, 1962b). Sadovnik (2001) reports that Bernstein (1972,

1973a) investigated the relationships between socio-economic status, family, and the

regeneration of systems of meaning. He also differentiated between the restricted code of the

working-class and the elaborated code of the middle-class. Bernstein (1973a) acknowledges

that schools require an elaborated code for success to which working-class children may have

no access. Sadovnik (2001) considers restricted codes as context-dependent and particularistic

and elaborated codes as context-independent and universalistic. In addition, an elaborated

code closely corresponds to horizontal discourse introduced by Bernstein as common sense

knowledge. On the other hand, a restricted code is intricately interwoven with vertical

discourse, a style of interrogation and text creation (Bernstein 1999: 159).

Bernstein (1972) differentiated among four socialization agencies that aid the

production of restricted and elaborated language codes: the job, the educational setting, peer-

age class, and the family. He further considered the family as the most important element in the

process of socialization. A number of studies reflected his views toward the role of family

(Bornstein, Haynes and Painter 1998, Dollaghan et al. 1999, Naigles and Hoff-Ginsberg

1998). In this regard, he differentiated between positional and person-oriented families

(Bernstein 1972). In positional and working-class families, children's roles are often

determined by position. As a consequence, children are subordinate to their parents and do

not have permission to participate in many conversations. Such children are, therefore, not

allowed to produce individualized speech. By contrast, in person-oriented families,

typical of middle-class families, children's individual capacities and interests are taken into

account. They even enjoy the privilege to discuss issues with their parents. Thus, an intense

system of communication is established.

For a better understanding of these concepts, some main characteristics of the

informal and formal languages which are respectively in line with restricted and elaborated

language codes (Bernstein 1973b: 42-43, 55) are presented in the following table.

Table 1 Characteristics of the public/informal and formal languages.

Informal languages | Formal languages

Short, grammatically simple, often unfinished sentences with a poor syntactical construction. | Accurate grammatical order and syntax regulate what is said.

Simple and repetitive use of conjunctions (so, then, and). | Logical modifications and stress are mediated through a grammatically complex sentence construction, especially through a range of conjunctions and relative clauses.

Modifications, qualifications, and logical stress will tend to be indicated by non-verbal means. | Frequent use of prepositions which indicate logical relationships as well as prepositions which indicate temporal and spatial contiguity.

Frequent use of short commands and questions. | Frequent use of impersonal pronouns (it, one).

Rigid and limited use of adjectives and adverbs. | A discriminative selection from a range of adjectives and adverbs.

Infrequent use of impersonal pronouns (it, one), as subject of a conditional sentence. | Individual qualification is verbally mediated through the structure and relationships within and between sentences. That is, it is explicit.

Statements formulated as questions which set up a sympathetic circularity: just fancy? Isn't it terrible? Isn't it a shame? It's only natural, isn't it? | A language use which points to the possibilities inherent in a complex conceptual hierarchy for the organizing of experience.

A statement of fact is often used as both a reason and a conclusion: you are not going out. I told you to hold on tight (mother to child on bus, as repeated answer to child's why). | Universal

Individual selection from a group of traditional phrases plays a great part. | Low structural prediction

Symbolism is of a low order of generality. |

The personal qualification is left out of the structure of the sentence; therefore it is a language of implicit meaning. |

Communicated feelings will be diffused and crudely differentiated when a public language is being used, for if a personal qualification is to be given to this language, it can be done only by non-verbal means, primarily by change in volume and tone accompanied by pictures, bodily movement, facial expression, and physical set. |

Having been inspired by the theoretical position reviewed earlier, this study aimed to

investigate the relationship between social classes and language patterns with a particular

reference to Iran. The study is undertaken with the following research question in mind: is

there any significant difference between working- and middle-class language users in their use of

language patterns?

2. The Study

Bernstein's theory was mainly based on speech; however, less attention has been paid to

written performance. The similarities between spoken and written discourse (Akinnaso 1985),

the interplay between speech and writing (Gillam and Johnson 1992, Olson 1995, Strömqvist

et al. 2002, Tseng 2002), and the representation of speech by writing (Olson 1993),

justified further studies on the linguistic differences between working- and middle-class

writings. Inspired by this assumption, the researchers were encouraged to investigate the

quality of writing in the compositions of working and middle-class language speakers in the

Iranian context.

Meanwhile, Bernstein's remarks on the linguistic differences between working and

middle-classes have led to a number of language productivity studies. Although references

were made to a few studies carried out in the Iranian society, the nature of language across

social classes is still unclear and demands further research. Worthy of note is the fact that

previous studies have used the overall number of words as the criterion of linguistic

productivity, with little or no focus on the grammatical categories of words. As a result, the

present study compared the linguistic productivity of working and middle-class subjects. To

do so, the language patterns produced by working and middle-class language speakers were

investigated in terms of various grammatical categories.

The question of whether Bernstein's theoretical framework applies in an EFL context such

as Iran motivated the present investigation. Prior to this study, many attempts were made to test

Bernstein's model in English-speaking societies (ESL contexts), where the working and middle

classes are more clearly distinguished. In an eastern society such as Iran, by contrast,

assigning elaborated and restricted codes to their respective socio-economic statuses is a

daunting task because the sociocultural background of eastern society obscures the

differentiation between socio-economic classes. Thus, the study is intended to

examine how Bernstein's Language Codes Theory functions in the Iranian context.

2.1. Participants

A total of 100 subjects participated in the study. Working and middle-class members were

selected according to the level of education and occupation and two indexes of social class.

The social class indexes employed for subject sampling included Socio-economic Status

Scores by Nam and Powers (1983) and Hollingshead's two-factor Index of Social Position

(1957) which have been developed based on two countrywide surveys in the US. Working-

class members were salespersons, sale-assistants, and shopkeepers from among low-educated

and low-income people who had a low score (29) on Nam and Powers' Socio-economic

Status Scores (1983). The salespersons, sale-assistants, and shopkeepers who participated in

the study worked in groceries, department stores, and supermarkets in Ilam, a western

city of Iran. The sample comprised 9 females and 41 males whose ages ranged from 18 to 50.

Based on the aforementioned indexes, the middle-class subjects were 16 professors at Ilam

University with Ph.D. degrees and 34 Master students from different tracks at the same

university. All the professors were male, aged between 30 and 60, while Master students

comprised 2 females and 32 males whose age varied from 24 to 30. M.A. students were

studying in their third semester. The university professors' scores on Nam and Powers' Socio-

economic Status Scores (1983) were between 70 and 99, and Master students were considered as

the main specialist group according to Hollingshead's two-factor Index of Social Position

(1957).

2.2. Language Pattern Elicitation Prompt

To obtain a rich corpus of language data, a prompt was designed. The prompt included two

topics, life and home country. Participants were asked to write about these topics. The topics

were in Persian and the subjects were required to write their compositions in Farsi, the

language of the participants. The selected topics were ideological notions that could prompt

participants, whether highly or less educated, to write (an example of the English version of

the prompt is provided in Appendix A).

2.3. Raters

Two Master students analyzed and investigated the language data retrieved from working and

middle-class members. A number of attributes made them qualified enough for analyzing the

data. Both of them were native speakers of Persian who had received Persian Language and

Literature and Humanities Diploma issued by the Office of Education which indicated that

they attended many Persian language and literature courses at high school. They were thoroughly

familiar with the Persian language grammar and structure. Both raters had also passed a

course on Persian language and literature in their B.A. with excellent marks. In addition, the

correlation coefficient of 0.78 between their analyses of the data indicated inter-rater reliability.

2.4. Administration

After subject sampling, during the following week, the copies of the prompts were given to

members of both classes individually and in their workplaces. The prompts were given to

university professors in their offices, and to salespersons, sale-assistants, and shopkeepers in

groceries, department stores, and supermarkets. The procedure was somewhat different for

Master students. Since not all Master students were classmates, and they did not have workplaces,

they were provided with the prompts in the dormitory, classroom, or the campus. Although

the prompts were administered at different places, all the subjects were asked to write their

texts or paragraphs on the spot, without delay. The reason for adopting this

procedure was to make the situation more natural and to prevent the participants from

cheating. Although the subjects were asked to write impromptu and not to quote and copy

from any sources, some of the collected writings included inappropriate data. Therefore, those

texts which showed cases of plagiarism were excluded from the study. Illegible

and overly long texts were left out as well. Finally, from each group, 30 written responses which were

appropriate to the purpose of this study were selected for the analysis.

2.5. Data Analysis Procedure

The raters analyzed the language data elicited from both groups and investigated the Persian

grammatical categories (GCs). The investigation of the GCs was based on Ahmadi Givi and

Anvari's (2006) model. Consultation with full faculty members of the Persian language

department of Ilam University made it clear that Ahmadi Givi and Anvari's (2006)

classification of Persian language GCs is the most up to date, authoritative and

comprehensive index in the Persian language. The raters analyzed the language data for their

total number of words (TNWs), content-words repetitions (CWRs), personal pronouns (PPs),

impersonal pronouns (IPs), structurally-complete-sentences (SCSs), quasi-sentences (QSs),

noun groups (NGs), adjectives groups (AGs), and verb groups (VGs). First, the TNWs

produced by each class of participants were counted by the raters. Then, the frequency of

CWRs, i.e., words which had been repeated at least twice, was determined for each class of

participants. Next, all the variations of PPs, including subjects, objects, possessives, reflexive

and emphatic pronouns were counted. Since Persian is a pro-drop language, the subjects of

the sentences are sometimes deleted and the verb suffixes indicate the subject of the sentence.

For example, in the verb xord-am (I ate), am refers to the first person singular I. In pro-

dropped sentences the verb suffixes were regarded as the subject of the sentences and were

counted as PPs. The frequency of IPs, those referring to indefinite human beings, like

someone, somebody, and everybody, was determined as well. Those sentences which were

complete in their surface structure or had all the features of a complete sentence were counted

and labeled as SCSs. In contrast to SCSs, some sentences are semantically complete but do not

have all the features of a complete sentence. For example, such sentences may lack

verbs but still present a complete idea. Structurally or syntactically incomplete sentences

were counted individually and were labeled as QSs. Finally, the frequencies of NGs, AGs,

and VGs were enumerated for each class of participants. According to Ahmadi Givi and

Anvari (2006), NGs, AGs, and VGs are very vast categories which comprise many cases, but

for the sake of precision, this study was limited to only those groups of nouns, adjectives, and

verbs that were linked to each other by the Persian conjunction va (and).

To illustrate the analysis procedure, the next two paragraphs present a word by word

translation of two pieces of language data in which all the syntactic and grammatical elements

of the Persian language were presented with no change.

Life

Good life with particular meanings for each man (1)*. For some people, happiness means

having cars, house, and many properties (2). But for others, a simple house is enough for

the family to be happy (3). Many believe ordinary and common life accompanies salvation,

but luxurious life destroys comfort (4).

Home country

Home country, the place where human beings are born, grow up, and live (5)*. We

accommodate in the Muslim country named Iran (6). Iranians have a specific interest in this

treasure, because this country has achieved revolution due to the attempt of many people (7).

We lost many youths for this; therefore, we must love our home country like our essence and

spirit (8).

The italicized words are English language specific which did not exist in the Persian text, but

their existence in the English translation was compulsory. The TNWs, excluding the italicized

ones, was 102. The numbers within the parentheses indicate the sentences. The asterisks show

the QSs. The total number of sentences in these data was 8: 6 SCSs and 2 QSs. It was found that

the language data included 5 PPs. The words others, many, and this were the IPs in this

prompt. We, our, home country, house, people, and life are the CWRs. The samples included

14 CWRs. Finally, the bold words indicate NGs, AGs, and VGs. The samples included 2

NGs, 1 AG, and 1 VG.

3. Results

3.1. Descriptive Presentation of Data

Table 2 (Appendix B) displays the frequency of the grammatical categories in the middle-

class. The middle-class data included a total of 3049 TNWs, 412 CWRs, 123 PPs, 80 IPs, 164

SCSs, 55 QSs, 57 NGs, 15 AGs, and only 10 VGs. As Table 3 shows, the minimum and

maximum number of words produced was 13 and 193, respectively. The middle-class

members produced 101.6333 words on average (Table 3). The frequency of PPs was much

higher than IPs. The number of SCSs was nearly triple that of QSs. Among NGs, AGs, and

VGs, the highest and the lowest portions were for NGs and VGs respectively. The division of

TNWs by the number of all sentences (SCSs and QSs) indicated that average sentence length

for middle-class data was 14.004.

Table 3 Descriptive statistics of grammatical categories in middle-class data

GCs    N    Range    Minimum  Maximum  Sum      Mean      SD
TNWs   30   180.00   13.00    193.00   3049.00  101.6333  45.45971
CWRs   30   32.00    .00      32.00    412.00   13.7333   8.30012
PPs    30   14.00    .00      14.00    123.00   4.1000    3.65164
IPs    30   10.00    .00      10.00    80.00    2.6667    2.82029
SCSs   30   13.00    .00      13.00    164.00   5.4667    3.28773
QSs    30   12.00    .00      12.00    55.00    1.8333    2.75535
NGs    30   7.00     .00      7.00     57.00    1.9000    1.82606
AGs    30   2.00     .00      2.00     15.00    .5000     .62972
VGs    30   3.00     .00      3.00     10.00    .3333     .71116

As for the working-class, Table 4 (Appendix B) shows the frequency and distribution of

the grammatical categories in the collected data. Data presented in Table 4 show that there

were 2766 words, 525 CWRs, 131 PPs, 32 IPs, 154 SCSs, 81 QSs, 75 NGs, 15 AGs, and just

2 VGs. As shown in Table 5, the minimum and maximum numbers of words were 16 and 203

respectively. The frequency of PPs was much higher than IPs. The number of SCSs was

nearly twice that of QSs. Similar to middle-class data, among NGs, AGs, and VGs, the

highest and the lowest portions were for NGs and VGs respectively. The division of TNWs

by the number of all sentences (SCSs and QSs) showed that average sentence length for the

working-class data was 11.77. A summary of the descriptive analysis of the grammatical

categories collected from the working-class prompts is presented in Table 5.

Table 5 Descriptive statistics of grammatical categories in working-class data

GCs    N    Range    Minimum  Maximum  Sum      Mean      SD
TNWs   30   187.00   16.00    203.00   2766.00  92.2000   45.65644
CWRs   30   48.00    2.00     50.00    525.00   17.5000   10.80788
PPs    30   11.00    .00      11.00    131.00   4.3667    3.16754
IPs    30   6.00     .00      6.00     32.00    1.0667    1.59597
SCSs   30   10.00    .00      10.00    154.00   5.1333    3.10432
QSs    30   14.00    .00      14.00    81.00    2.7000    3.86987
NGs    30   9.00     .00      9.00     75.00    2.5000    2.46003
AGs    30   6.00     .00      6.00     15.00    .5000     1.19626
VGs    30   1.00     .00      1.00     2.00     .0667     .25371
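The average sentence length reported for the working-class data can be checked against the sums in Table 5. The following is only an illustrative sketch, not the authors' own procedure: the helper name `average_sentence_length` is hypothetical, and the sketch assumes that average sentence length is simply the TNWs divided by the number of SCSs plus QSs, as the text describes.

```python
def average_sentence_length(total_words: int, scs: int, qs: int) -> float:
    """Return words per sentence, counting both complete (SCS) and quasi (QS) sentences."""
    sentences = scs + qs
    if sentences == 0:
        raise ValueError("no sentences to average over")
    return total_words / sentences


# Sums taken from Table 5 (working-class data).
working_class = average_sentence_length(total_words=2766, scs=154, qs=81)
print(round(working_class, 2))  # 2766 / 235 -> 11.77
```

The result agrees with the 11.77 words per sentence reported for the working-class data.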

Table 6 Percentages of GCs in proportion to the TNWs, along with percentages of SCSs and QSs in proportion to the total number of sentences

GCs                Middle-class   Working-class
CWRs               18.980         13.433
Pronouns   PPs     4.401          4.736
           IPs     2.608          1.084
Sentences  SCSs    74.885         65.531
           QSs     25.114         34.468
NGs                1.853          2.711
AGs                0.487          0.542
VGs                3.250          0.072

After data collection, the percentages of the frequencies of GCs in each social class

were computed in proportion to the TNWs produced by the same social class. Table 6 also

shows the percentages of SCSs and QSs in each social class computed in proportion to the

total number of sentences produced by the same social class. Although for categories PPs,

IPs, NGs, and AGs, the percentages were nearly the same for both social classes, the

percentages of CWRs, SCSs, QSs, and VGs differed between the groups. Middle-class

members produced higher percentages of CWRs, SCSs, and VGs. However, the percentage of

QSs was greater for the working-class members.
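The sentence-type percentages can be recomputed from the category sums in Tables 3 and 5. The sketch below is illustrative only; the helper name `sentence_shares` is hypothetical, and it assumes each percentage is the count of SCSs (or QSs) divided by the total number of sentences (SCSs plus QSs) produced by the same class.

```python
def sentence_shares(scs: int, qs: int) -> tuple[float, float]:
    """Return (SCS %, QS %) of all sentences produced by one group."""
    total = scs + qs
    if total == 0:
        raise ValueError("no sentences in this group")
    return 100 * scs / total, 100 * qs / total


# Sums taken from Table 3 (middle class) and Table 5 (working class).
middle_scs, middle_qs = sentence_shares(scs=164, qs=55)
working_scs, working_qs = sentence_shares(scs=154, qs=81)

print(f"middle class:  SCSs {middle_scs:.3f}%  QSs {middle_qs:.3f}%")
print(f"working class: SCSs {working_scs:.3f}%  QSs {working_qs:.3f}%")
```

Under this assumption the computed shares agree with the sentence rows of Table 6 to within rounding in the third decimal place.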

3.2. Inferential Data Analysis

In order to see if there were any significant differences between the two groups in their

frequencies of the GCs, nine chi-square (χ²) tests were run for the given categories. The results of the χ² tests indicated

significant differences in five cases, and four of the differences in the frequencies of GCs

were found insignificant.

In the case of the TNWs, the middle-class language data comprised more words. There

was a significant difference (χ² = 13.584, p < .01) between the two groups in the TNWs. CWRs

were another point of discrepancy between the two classes. Working-class members

used words more repetitively than members of the middle class, and the Chi square result

indicated a significant difference (χ² = 13.621, p < .01) between the two social classes.

Another significant difference (χ² = 20.571, p < .01) was reported for the frequencies of

IPs, where the number produced by the middle class was two and a half times that of the

working class (80 vs. 32). Although the middle-class data exceeded the working-class data in the

frequencies of the TNWs and IPs, working-class members produced more QSs, and the

difference in the frequency of QSs was found to be significant (χ² = 4.971, p < .05) as well.

Finally, a significant difference was reported in the frequency of VGs (χ² = 5.333, p < .05)

between the two classes of language users (Table 7).

Although the results of the five GCs indicated that the differences between two classes

were significant, supporting Bernstein's theory, some discrepant results were also reported.

There was not much difference between the middle-class and working-class members in terms of the frequency

of PPs. Working-class members used just 8 more PPs than middle-class ones (131 vs. 123), and this trivial difference

in the number of PPs led to no significant difference (χ² = .320, p > 0.05) between the two

groups. Similar to PPs, the frequency of SCSs was nearly the same for both social classes, and the Chi

square results indicated no significant difference (χ² = 0.314, p > 0.05) between the two social

classes. Besides, no significant difference was reported (χ² = 2.455, p > 0.05) for the frequency

of NGs. Finally, since the frequency of AGs was exactly the same for both social classes, χ² was 0 and

p was equal to 1.00. A summary of the Chi square results with respect to the distribution of

grammatical categories is shown in Table 7.

Table 7 The results of χ² for the differences in the frequencies of the grammatical categories

GCs    TNWs     CWRs     PPs    IPs      SCSs   QSs     NGs     AGs     VGs
χ²     13.584   13.621   .320   20.571   .314   4.971   2.455   .000    5.333
Sig.   .000**   .000**   .572   .000**   .575   .026*   .117    1.000   .021*

** p < 0.01; * p < 0.05
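The article does not state which form of the χ² test was run. As an assumption, the sketch below treats each comparison as a one-sample goodness-of-fit test in which the two class totals are compared against equal expected frequencies (df = 1). The function name `chi_square_equal_split` is hypothetical. Under that assumption, the category sums from Tables 3 and 5 give the χ² values reported in Table 7 for IPs, QSs, and VGs.

```python
import math


def chi_square_equal_split(count_a: int, count_b: int) -> tuple[float, float]:
    """One-sample chi-square against equal expected counts (df = 1).

    Returns (chi-square statistic, p-value).
    """
    expected = (count_a + count_b) / 2
    if expected == 0:
        raise ValueError("both counts are zero")
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For df = 1, the chi-square survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# (middle-class sum, working-class sum) taken from Tables 3 and 5.
counts = {"IPs": (80, 32), "QSs": (55, 81), "VGs": (10, 2)}
for category, (middle, working) in counts.items():
    chi2, p = chi_square_equal_split(middle, working)
    print(f"{category}: chi2 = {chi2:.3f}, p = {p:.3f}")
```

Using only the standard library keeps the sketch self-contained; a statistics package would normally be preferred for anything beyond two categories.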

4. Discussion

The present study attempted to compare working and middle-class language users in an

Iranian context with respect to the frequency of certain GCs in their compositions. As was

reported in the previous sections, some discrepant findings arose out of the data analysis. In

terms of the frequency of the TNWs, a significant difference was found between the groups.

Specifically, middle-class members produced a greater number of words. This means

that middle-class members were more productive and creative than subjects from the lower

social class. Although the participants were asked to write about the same topics, the

professors and master students were more productive. The difference in the productivity level

of the two groups leads to two possible conclusions: first, with respect to the relationship

between language and thought, professors and Master students might have read more books;

they are more prepared to discuss abstract concepts such as life. In other words, they are more

thoughtful and have more ideas to express. The second conclusion that is more in line with

Bernstein's language code theory is that the higher linguistic creativity of the middle-class

members may have nothing to do with one's thought but with the developed language pattern,

which has the potential to accommodate any abstract topic. The topics selected for participants to

write about were so general and ideological that people with different levels of education

could write about them. Therefore, the first remark that educated people can discuss more because

they are more thoughtful and have more opinions for the discussion cannot be taken seriously.

On the other hand, production of more words can be discussed in terms of a more developed

language pattern which provides language speakers with more words to use in language

production.

While middle-class members were more productive in their writing, working-class

members were more repetitive in their terminologies. That is, middle-class members

expressed themselves using a variety of vocabularies, but the self-presentations of working-

class members were more bound to a range of repetitive words that were more or less

synonymous with the topics they were to write about. It can, thus, be claimed that the

use of word repetition by working-class members is due to their lack of access to a

sufficient stock of terms in their language code to express themselves easily. By contrast,

middle-class members seem to have access to a more lexically developed language which

allows language producers to express the same intentions with different lexical items.

As for the IPs, there was a significant difference between the two groups of participants

in that middle-class members used more IPs. In contrast with PPs as placeholders for proper

or common nouns with real referents in the world, IPs refer to no definite persons in the real

world and are used to express facts or opinions anonymously. In general, IPs are devices that

are used to express ideas context-independently. The higher number of IPs means that middle-class

language production is less context- or situation-bound. It can, therefore, be claimed that

middle-class members express ideas as generalizations. In other words, they usually

overgeneralize their beliefs to be more acceptable in different situations. Stated otherwise,

middle-class language pattern can be regarded as a general, or in Bernstein's terms,

‘universal’ language code.

Difference in language production of the subjects was noteworthy for QSs. It was found

that QSs were more common among members of lower social classes. As noted by Ahmadi

Givi and Anvari (2006), QSs are shorter and more concise sentences because they lack some

elements of SCSs. Results of data analysis indicated that such sentences are typical of

working-class members. This finding supports Bernstein's idea that restricted code is full of

short and incomplete sentences, either grammatically or semantically.

In the case of VGs, a significant difference was reported between two social classes as

well. Middle-class members showed a stronger preference for VGs, which are examples of language

elaboration devices. They are used to express meaning more explicitly and in detail. In the

current research only those categories of verbs that have been linked together by the Persian

conjunction word va (and) were included. The verbs that follow the previous verb by a

conjunction give more explanation to the meaning of the previous verb. In such groups of

verbs, neighboring verbs influence each other semantically. The more verbs that accompany

each other, the more comprehensive and exact meaning is expressed. It can be claimed that

this language pattern which is typical of middle-class members is semantically precise. Such

a precision is gained through a link of linguistic elements that express ideas explicitly. This

remark supports Bernstein's theory in that elaborated language code is more explicit and

semantically precise and expresses all meaning through linguistic structures.

PPs, as opposite elements to IPs, were another point of investigation in the study. PPs

replace the proper and common nouns in the real context and are indicators of a context-

dependent language code. The more members use PPs in their speaking or writing, the more

context-dependent and specific their language will be. Although, according to Bernstein (1973a), the restricted

language code is context-dependent and full of PPs, in the present study, no significant

difference was found between the frequencies of PPs in performance of the participants in the

given classes. PPs and IPs stand as dichotomous concepts, each of which is typical of one of the

codes developed by Bernstein. In this study, the higher frequency of IPs among middle-class

members was confirmed, but the frequency of PPs was nearly the same for both classes, which

did not support Bernstein's claim. The percentages of PPs in proportion to the TNWs

produced by each class also exhibited no difference between the two classes.

Since the structure of SCSs follows common grammatical logic, such a sentence

has all the grammatical elements and is hence longer and more logical. Data analysis indicated no

significant difference between the frequencies of SCSs. However, the percentages of SCSs in

proportion to the total number of sentences produced by each class indicated a considerable

difference between the two social classes. Therefore, it was shown that middle-class members

produced a higher percentage of SCSs, in contrast with the working-class members, who

produced a higher percentage of QSs.

As indicated by Ahmadi Givi and Anvari (2006), just like VGs, AGs and NGs are

appropriate tools to produce a more elaborated language code. It was found that middle-class

members preferred to use VGs more than working-class members, but no significant

difference between the frequencies of AGs and NGs was reported. In other words, in case of

AGs and NGs, Bernstein's theory was not supported either.

5. Conclusion

Seeking the distribution and significance of language users' linguistic patterns within distinct

social classes, the study was an attempt to underline the interplay between language

production and socio-economic classes. Elaboration of the interaction can provide a better

view of applicability of linguistic categories within the social frameworks. Although the

investigation of differences in the frequency of GCs in the language data collected from both

groups did not yield uniform results, Bernstein's remark on the linguistic differences between

language speakers from various social classes was supported to some extent. Middle-class

members were found to be more productive and creative than persons from lower classes. The

degree of access to a sufficient range of vocabulary or terminology differed across groups.

Working-class members had limited access to terms to easily express themselves. Since

middle-class members used many more IPs, it was concluded that their language code is less

situation-or-context-specific. In other words, middle-class language code is a general or

universal pattern which is easily overgeneralized to different occasions. In addition, it was

found that working-class members usually express their meanings using shorter sentences.

Finally, although the distribution of AGs and NGs was the same across two classes of Iranian

native speakers, the middle-class's preference for the production of more VGs indicates that

their language is more elaborated and explicit. All in all, the data collected in the given

Iranian context support Bernstein's language code theory to a certain extent.

6. Research Implications

The findings of the study can have some implications for language studies, sociolinguistics,

schooling and education in Iran and similar contexts. First, it can contribute to the field of

discourse studies. Since a central emphasis of Bernstein's theory is the impact that context

imposes on the production of linguistic structures, discourse analysts can take advantage of

this study about the production of the linguistic structures. This study can support Bernstein's

differentiation between horizontal and vertical discourses that can also be a good framework

available for discourse analysts. Second, although sociologists and sociolinguists usually

consider factors like occupation and education as indicators of social class, the present study

advocates linguistic structure as a new indicator for that purpose. The difference in the

language structures produced by people from different social classes justifies sociolinguistic

perspectives on the application of the language pattern as a device for determination of social

class. Third, even though the present study was conducted among adult participants, its

findings can be beneficial to language teachers in making them alert to the fact that students

from different social class families do not have identical access to language knowledge in

schooling even though they have completed a similar level of education. Aware of the socio-

economic status of students and their unequal access to language use, teachers can

minimize the language loss of working-class students by holding classes attended by

students with different socio-economic backgrounds. Such heterogeneity might provide

working-class students with better access to language through proximity to middle-class

students. Finally, although the current results bear most directly on society at large, they are also relevant,

at least in part, to the educational context for students from families with

different socio-economic statuses. Thus, material and syllabus designers can also benefit from

the results of the present study in pedagogical contexts. They could include socio-economic

considerations in materials and syllabuses to compensate for the language loss of working-

class children.

References

Ahmadi Givi, H. and H. Anvari. (2006). Persian syntax (3rd ed.). Tehran, Iran: Fatemi
Publication.

Akinnaso, F. N. (1985). On the similarities between spoken and written language. Language

and Speech, 28(4), 323-359.

Aliakbari, M., M. Samaie, K. Sayehmiri and M. Qaracholloo. (2012). The grammatical

correlates of social class factors: The case of Iranian fifth-graders. Linguistikonline,

56(6), 3-20.

Allafchi, J. (1998). The relationship between social class and speech codes with respect to

syntactic complexity. Unpublished Master's dissertation. Shiraz University, Iran.

Atkinson, P. (1981). Bernstein's structuralism. Educational Analysis, 3(1), 85-96.

Bernstein, B. (1958). Some sociological determinants of perception: An enquiry into
sub-cultural differences. British Journal of Sociology, 9(10), 159-174.

Bernstein, B. (1960). Language and social class: A research note. British Journal of

Sociology, 11(3), 271-276.

Bernstein, B. (1961). Social structure, language and learning. Educational Research, 3(3),

163-176.

Bernstein, B. (1962a). Linguistic codes, hesitation phenomena and intelligence. Language

and Speech, 5(1), 31-46.

Bernstein, B. (1962b). Social class, linguistic codes and grammatical elements. Language and

Speech, 5(4), 221-240.

Bernstein, B. (1972). A sociolinguistic approach to socialization with some reference to
educability. In J. J. Gumperz and D. Hymes (Eds.), Directions in sociolinguistics: The
ethnography of communication. New York: Holt, Rinehart and Winston.

Bernstein, B. (1973a). Class, codes and control, Vol 1. London: Routledge and Kegan Paul.

Bernstein, B. (1973b). Class, codes and control, Vol 2. London: Routledge and Kegan Paul.

Bernstein, B. (1999). Vertical and horizontal discourse: An essay. British Journal of
Sociology of Education, 20(2), 157-173.

Bornstein, M. H., M. O. Haynes, and K. M. Painter. (1998). Sources of child vocabulary

competence: A multivariate model. Journal of Child Language, 25, 367-393.

Christie, F. (1999). Pedagogy and the shaping of consciousness: Linguistic and social

processes. London: Continuum.

Dollaghan, C. A., T. F. Campbell, J. L. Paradise, H. M. Feldman, J. E. Janosky, D. N. Pitcairn

and M. Kurs-Lasky. (1999). Maternal education and measures of early speech and

language. Journal of Speech, Language and Hearing Research, 42, 1432-1443.

Gillam, R. B. and J. R. Johnston. (1992). Spoken and written language relationships in

language/learning-impaired and normally achieving school-age children. Journal of

Speech and Hearing Research, 35, 1303-1315.

Ginsborg, J. (2006). The effects of socio-economic status on children's language acquisition
and use. In J. Clegg and J. Ginsborg (Eds.), Language and social disadvantage:
Theory into practice (pp. 9-27). Chichester: John Wiley and Sons.

Hoff-Ginsberg, E. (1998). The relation of birth order and SES to children's language

experience and language development. Applied Psycholinguistics, 19, 603-629.

Hollingshead, A. B. (1957). Two factor index of social position. New Haven, CT: Privately

printed.

Holmes, J. (1992). An introduction to sociolinguistics. London: Longman.

Hosseini, A. (1993). The relationship between L1 academic proficiency and foreign language

learning with respect to socio-economic background of learners. Unpublished

Master's dissertation. University for Teacher Education, Tehran, Iran.

Karabel, J. and A. H. Halsey. (1977). Power and ideology in education. New York: Oxford

University Press.

Naigles, L. R. and E. Hoff-Ginsberg. (1998). Why are some verbs learned before other verbs?

Effects of input frequency and structure on children's early verb use. Journal of Child

Language, 25, 95-120.

Nam, C. B. and M. G. Powers. (1983). The socioeconomic approach to status measurement.

Houston: Cap and Gown.

Olson, D. R. (1993). How writing represents speech. Language and Communication, 13(1),
1-17.

Olson, D. R. (1995). Towards a psychology of literacy: On the relations between speech and

writing. Cognition, 60, 83-104.

Richardson, K., M. Calnan, J. Essen and L. Lambert. (1976). The linguistic maturity of
11-year olds: Some analysis of the written compositions of children in the national child
development study. Journal of Child Language, 3, 99-115.

Robertson, I. (2008). An introduction to Basil Bernstein's sociological theory of pedagogy.

Retrieved from http://sites.google.com/site/robboian/IntroBernstein.pdf?attredirects=0

Sadovnik, A. R. (2001). Basil Bernstein. Prospects: The Quarterly Review of Comparative

Education, 31(4), 687-703.

Strömqvist, S., V. Johansson, S. Kriz, H. Ragnarsdóttir, R. Aisenman and D. Ravid. (2002).

Toward a cross-linguistic comparison of lexical quanta in speech and writing. Written

Language and Literacy, 5(1), 45-67.

Tizard, B. and M. Hughes. (1984). Young children learning: Talking and thinking at home

and at school. London: Fontana.

Tseng, M. Y. (2002). On the interplay between speech and writing: Where Wordsworth and

Zen discourse meet. Journal of Literary Semantics, 31(2), 171-198.

Walker, D., C. Greenwood, B. Hart and J. Carta. (1994). Prediction of school outcomes based
on early language production and socioeconomic factors. Child Development, 65,
606-621.

Wardhaugh, R. (2006). An introduction to sociolinguistics (5th ed.). Oxford: Oxford
University Press.

Appendix A

The present prompt has been developed for research purposes. We would appreciate your help
in carrying out the research.

It should be mentioned that, since no personal information about the respondent's identity is
requested, all the opinions presented in the prompt will remain confidential and will be used
only for research purposes.

Write what you like about the following topics.

Life

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

Home country

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

…………………………………………………………………………………………………

Many thanks

Appendix B

Table 2: Frequency of the grammatical categories in the middle-class group

Middle class   TNWs  CWRs  PPs  IPs  SCSs  QSs  NGs  AGs  VGs
1                96    14    9    0     7    2    4    0    0
2               116    13    1    1    13    1    1    1    0
3                65     7    0    6     5    2    0    0    0
4               113    24    7    1     1    6    4    1    0
5                83    22    2    3     2    4    2    0    0
6               103    21    4    0     9    1    0    0    0
7               193    32    4    4    11    0    2    1    0
8               101     9    0    2     4    0    1    1    0
9               128    17    6   10     9    0    1    1    0
10               93    10    8    0     7    0    4    0    0
11              161    21    1    5     6    1    3    0    1
12               93    13    5    4     8    1    1    0    0
13               13     2    0    0     0    2    1    0    0
14              158    19    8    7     5    2    2    1    2
15              185    21    7    3     2    8    2    0    3
16               93    10    9    1     7    0    0    0    0
17              126    12   14    2     7    0    0    1    1
18              189    22    1    7     5    1    6    1    0
19               57    13   10    0     3    0    0    1    0
20               39     2    0    1     2    1    2    0    0
21               75    30    0    0     2   12    4    0    0
22               83     4    2    2     2    0    3    0    1
23               70     7    4    4     6    0    1    0    0
24               38     2    1    0     4    3    0    1    0
25              110    12    6    2     5    1    7    2    0
26               60     0    4    0     5    5    0    0    1
27               73     5    1    3     3    0    2    0    0
28               73    14    2    0     5    0    0    0    0
29              160    20    2    9    13    0    2    2    0
30              102    14    5    3     6    2    2    1    1
Total          3049   412  123   80   164   55   57   15   10

Table 4: Frequency of the grammatical categories in the working-class group

Working-class  TNWs  CWRs  PPs  IPs  SCSs  QSs  NGs  AGs  VGs
1               122    12    5    1     5    0    0    0    1
2                32     4    2    0     5    0    0    0    0
3                68    18    0    2     0    6    0    0    1
4                37     7    3    0     3    0    2    1    0
5                71    21    1    1     0    8    3    0    0
6                31    13    1    0     3    3    1    1    0
7                59    16    0    0     3    5    0    0    0
8                97    24    1    6     5    3    1    0    0
9                71    13    3    0     5    1    1    0    0
10               42     6    4    5     3    0    1    0    0
11               16     2    0    0     1    2    0    0    0
12               51    11    8    0     6    0    1    0    0
13              117    30    8    0     6    5    3    0    0
14               87    28    6    0    10    0    4    0    0
15              106    11    8    1     7    0    2    1    0
16               95    34    5    1     0   14    0    0    0
17              100    15    6    0     7    0    6    0    0
18               82    25    3    0     5    6    4    0    0
19               37    11    2    0     5    2    3    0    0
20               54    16    3    1     0    8    0    0    0
21              141    16    0    0     8    0    3    2    0
22              107    11   11    0     8    0    1    0    0
23              166    15    7    4    10    0    5    0    0
24              125     5    2    3     5    0    2    6    0
25              203    50    9    2     9    1    7    0    0
26              165    18    8    2     9    0    8    0    0
27              143    38    5    0     2   13    9    1    0
28              101    11    6    2     8    0    3    1    0
29              113    15    9    0     7    3    3    2    0
30              127    29    5    1     9    1    2    0    0
Total          2766   525  131   32   154   81   75   15    2

Abbreviations: TNWs = total number of words; WR = word repetition; PP = personal

pronoun; IPA = impersonal pronoun/adjective; GC = grammatical category; SCS =

structurally-complete sentence; QS = quasi-sentence; NG = noun group; AG = adjective

group; VG = verb group.
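
Because the two groups produced different numbers of words overall (3049 versus 2766), the raw totals are easier to compare once they are normalized. The following Python sketch, which is not part of the original analysis, converts the Total rows of Tables 2 and 4 into rates per 100 words. The per-100-word normalization and the names TOTALS and rate_per_100_words are illustrative assumptions.

```python
# Group totals copied from the "Total" rows of Tables 2 and 4.
CATEGORIES = ["CWRs", "PPs", "IPs", "SCSs", "QSs", "NGs", "AGs", "VGs"]
TOTALS = {
    "middle": {"TNWs": 3049, "CWRs": 412, "PPs": 123, "IPs": 80,
               "SCSs": 164, "QSs": 55, "NGs": 57, "AGs": 15, "VGs": 10},
    "working": {"TNWs": 2766, "CWRs": 525, "PPs": 131, "IPs": 32,
                "SCSs": 154, "QSs": 81, "NGs": 75, "AGs": 15, "VGs": 2},
}


def rate_per_100_words(group, category):
    """Occurrences of a category per 100 words written by the group."""
    totals = TOTALS[group]
    return 100 * totals[category] / totals["TNWs"]


# Print the two groups side by side for each grammatical category.
for category in CATEGORIES:
    middle = rate_per_100_words("middle", category)
    working = rate_per_100_words("working", category)
    print(f"{category:5s} middle={middle:6.2f} working={working:6.2f}")
```

Rates of this kind only describe the totals; they say nothing about whether a difference between the groups is statistically reliable.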

Code-Switching in a Virtual English Community in China: An International Perspective

Ming Wei

University of International Business and Economics

Bioprofile: Ming Wei is an Associate Professor at the University of International Business
and Economics. She earned her doctorate in Linguistics/TESL in 2009 at Oklahoma State
University in the United States, after teaching at Beijing Foreign Studies University for
five years. She received her Master's degree in Linguistics from Nankai University, China,
in 1999.

Abstract

This paper investigates how a Net-based environment promotes code-switching practices

among English learners in China and examines how such practices negotiate social and

interactional meanings. Based on the analysis of conversations in an English chat room in

China from the interactional perspective, this study demonstrates how code-switching

contributes to the creation of an authentic and distinctive context of social interaction. It
reveals how the speaker's adjustment of code choice and degree of code-switching are firmly
anchored to the situational need to manage face and social distance in synchronous
conversations, as well as how manipulation of code interpretation and selection is achieved.
It was found that people's use of code affects the addressee's involvement in the ongoing
dialogue in that it either acknowledges the latter's intention behind the code choice or inhibits
behavior perceived as inappropriate. The paper also discusses how code choice may relate to

the local setting of learners studying English in mainland China.

Keywords: code-switching, chat room, social meanings, interactional perspective, English

learning

1. Introduction

Code-switching has been extensively studied in the past few decades in terms of its patterns

and meanings in oral production (e.g., Adendorff 1996, Cheng and Butler 1991, Gumperz

1982, Hoffman 1991, Lu 1991, Myers-Scotton 1989). It has been found to be a discursive

convention which can index contextual and metalinguistic information that is conveyed by

other means (e.g. prosody) in monolingual settings. This is particularly relevant to online
communications which, in addition to being social and context dependent, are structurally
simpler in order to meet specific interactive purposes and overcome their lack of a
conventional form of presence (Bays 1998, Crystal 2001). However, little is known about how
interactional frameworks are built in fluid virtual communities populated by English learners,
especially strangers whose identities and presence are primarily maintained by their verbal
practices. Using an interactional perspective, this study analyzed interactions of English
learners in China in an online chat room to uncover how code-switching helps speakers manage
social distance and facework, as well as how this affects the addressee's choice of code. It
also aims to contribute to the research on second language learning by identifying some gaps
in learners' interactional competence in English through examining where they switch to their
native language.

1.1. Code-Switching and Interaction

Over the past few decades, numerous scholars have dealt with code-switching, which has been
described as two languages juxtaposed or alternated in discourse, typically within a single
conversation, or within a sentence or utterance (Auer 1998, Liebscher and Dailey-O'Cain
2005). In a prototypical case, code-switching occurs in a sociolinguistic context

in which speakers orient towards a preference for one language at a time (Auer 1998). As an

integral aspect of conversational analysis, it is one of the contextualization conventions which

are acquired through interactions where people participate in a particular network of

relationships (Gumperz 1982).

Code-switching is intention-driven and functionally motivated (Adendorff 1996,

Hoffman 1991, Myers-Scotton 1989). For example, Saville-Troike (1982) identified eight

different functions such as softening or strengthening a request or command, humorous

effect, or lexical need. Gardner-Chloros (1991) argues that code-switching may occur as an

effect of the topic or the roles of the participants. Auer (1998, 2007) asserts that, as an index
of certain extralinguistic social categories, it can be interpreted by participants as indicating

either some aspects of the situation (discourse-related switching), or some features of the

code-switching speaker (participant-related switching).

In discussing how a code signifies a network of interpersonal relationships, McConvell
(1988) believes that we should consider the standpoint and attitude speakers wish to express
and the social domain in which they wish to relate to the interlocutor or the referent. Tay
(1989) argues that it can contribute to solidarity and rapport in multilingual discourses.
Code-switching has been associated with footing, which is defined as the speaker's 'alignment,
or set, or stance, or posture, or projected self' (Goffman 1981: 128) and 'the projection of a
speaker's stance towards an utterance (its truth value and emotional content), as well as
towards other parties and events' (Levinson 1988, as cited in Wine 2008: 2). Through its
departure from the established language-of-interaction, code-switching signals 'otherness' of
the upcoming contextual frame and thereby achieves a change of 'footing' (Auer 1998). In
other words, it can affect conversational status and social distance among interlocutors during
the production and reception of an utterance. As a form of footing shift, code-switches can be
temporary suspensions of social relations that are later resumed, or they can change the nature
of whole activities (Levinson 1992).

Existing studies have been primarily oriented towards the way speakers alternate
languages and how this indexes speakers' purposes and the communication situation. There
are, however, several exceptions which look into how code-switching relates to the interaction
between interlocutors. For example, in a pioneering study of language alternation in
Italian-German peer talk and adult-child conversations, Auer (1984) analyzed both speaker
adjustments and participation framework phenomena in relation to code-switching and
demonstrated how code-switching may be used to attain a shift in the recipient constellation.
Cromdal and Aronsson (2000) examined in depth speakers' mutual adjustment of actions and
reception of code-switches, revealing that footings are intrinsically interactional
achievements. Su (2009) suggests that code-switching can negotiate interpersonal relationships
in a face-threatening situation at the interactional level, and can make it easier for the
addressee to identify changes in frames, alignment and footing and react accordingly. The
interactional perspective has thus informed research on how code-switching affects the
participation framework. Nevertheless, bilingual conversations have rarely been approached
explicitly from the perspective of whether and how specific code shifts can affect others'
choice of code.

1.2. Code-Switching and Language Learning

Besides revealing the interactive mechanism, analyzing the way codes switch has been
considered relevant to language learning. In the unfolding of meaning, switches can be
indicative of different stages in learners' learning and use of the target language. In
particular, code alternation can fulfill a wide range of cognitive, linguistic, interactional
and discourse functions in the L2 setting (e.g., Van Lier 1996, Simon 2001).
Code-switching has traditionally been seen as an asset in communication. As pointed out by
Goffman (1981: 156), switching codes requires 'the capacity of a dexterous speaker to jump
back and forth, keeping different circles in play'. Heller (1988) sees it as a constructive verbal
strategy used in social interaction which facilitates the effort of interlocutors to seek common
ground in bilingual conversation. Cheng and Butler (1989) contend that it can be seen as an
asset when it is employed to promote the content and the essence of the message. In two more
recent studies (Liebscher 2005, Olmstead 2004), code-switching has been shown to be a
useful conversational resource that enhances sociability by building shared understanding
about the ongoing interaction and indicating participants' orientation toward the interaction
and toward each other. Some other scholars have related code-switching to language
deficiency. For example, Auer's (1984) non-classroom data show that code-switching could
be an indication of a momentary lack of competence. Cheng and Butler (1989) are also
concerned that it can be a deficit when used to the extent that it interferes with
communication. Sert (2005) likewise reminds us that code-switching may interfere with mutual
intelligibility when learners interact with native speakers of the target language and may
cause long-term damage to the foreign language learning process. Whether code-switching
plays a positive or negative role depends largely on the addressee and the specific goal of
interaction. However, from the perspective of second language learning, studying the way
learners switch between languages could reveal non-native learners' communicative capacity,
under the assumption that the function performed by the use of the native language in
target-language-based conversation may indicate a gap in capability and a lack of comfort in
the target language relative to the native language.

1.3. Code-Switching and the Internet

Language used in Internet communication has triggered immense research interest in recent
years. The increasingly widespread use of the Internet, which has developed from a peripheral
cultural phenomenon to an important locale of 'cultural transformation and production in its
own right', has given rise to new varieties of communities (Porter 1996: 17). Porter describes
the new phenomenon of the Internet as an interface not between the user and the computer,
but between the user and the collective imagination of the vast virtual audience, where one
can find that the whole range of interpersonal dynamics has adjusted to the distinct conditions
of online connectivity. In turn, this establishes conventions of 'self-presentation and
argument, widely shared systems of value and belief, complete lexicons of gestural symbols
to convey nuances of personal style, and modified standards of social decorum that facilitate
easy interactions with strangers' (ibid.: 13). In particular, anonymity has been noted by several
authors as a defining feature of cyberspace which makes it possible to consciously shape
one's persona by creating alternative versions of one's self (Baym 1995, Wilbur 1997). In
addition, the lack of obligation on the part of the participants in virtual encounters
contributes to a fluidity or changeability that other aspects of life do not have
(Healy 1997, Wilbur 1997). Not surprisingly, the Internet has become a site where virtual
communities of social and cultural interest groups are organized and new modes of
communication are formed.

The Internet chat group is a typical example of such virtual communities, which are
defined by Rheingold (1993: 7) as 'social aggregations that emerge from the Net when
enough people carry on those public discussions long enough, with sufficient human feeling,
to form webs of personal relationships in cyberspace'. Previous studies (e.g., Friermuth 2001,
Hall 1996, Lam 2004, Tepper 1997) have dealt with online chatting as a distinct form of
communication in the make-believe world. For instance, Bays (1998) asserts that the
combination of textuality and temporality contributes to a conversational mode of the
environment which 'allows for an enlarged possibility for identity experimentation and fictive
exaggeration of discursive action'. Crystal (2001) also points out that in synchronous

communication in computer-mediated contexts, the form of talk has been traditionally seen as

social rather than serious in its content in that it is more context dependent and structurally

simpler to serve specific interactive purposes.

A major distinction has been made between online and real world communication

concerning the form of presence. The subtleties in conventional conversations typically

conveyed by physical qualities such as vocal intonation, stress and gesture become

problematic in the chat room where the encounter is typically not face to face. However, as

proposed by Bays (1998), the need for the underlying sense of presence can be fulfilled by the

physical setting of the computer and the scrolling dialogue, which indicates that there is some

unseen user out there typing and sending responses to their messages, as well as some

discursive strategies, such as addressivity, which allow the users to engage personally in the

electronic setting. According to Bays (1998), participants readjust their contributions for a

valid and desired exchange by recreating presence as the cognitive foundation of conversation

where parallels to ordinary conversation can be found through discursive conventions.

Code-switching has been found to be one such discursive convention, which can
index contextual and metalinguistic information that is conveyed in other ways, e.g.,
prosody, in monolingual settings (Gumperz 1982). Comparable to prosodic parameters and as
a contextualization strategy, it helps create situational co-presence in a pseudo-physical
environment (Auer 1988, Nilep 2006). It has been found to work as a feasible strategy
for sustaining viable social encounters. For example, Bays (1998) asserts that alternative
language choice is used as a strategy to achieve and handle disagreement in the Internet chat
room. In Lam's (2004) investigation of two Chinese immigrant high school girls in the US,
the examination of their code-switching practices revealed that the girls' participation in the
chat room should be understood in relation to their experiences in the national context of the
US and demonstrated how alternative identities are sought in the virtual world. Ho (2006)
looked into the bilingual practices of tertiary students in Hong Kong when using ICQ, an
instant messaging computer program. She found that English and Chinese were
complementary to each other in helping participants handle the pressure of instant
communication. Cárdenas-Claros and Isharyanti's (2009) study of some of their MSN
Messenger (another instant messaging program) contacts and Goldbarg's (2009) analyses of
survey results from her personal contacts suggest that online chatters prefer their first
language for conveying more personal content and feelings. These studies have illuminated
how people's choice of code may relate to social realities in virtual communities and have
highlighted the relevance of code-switching to learners' verbal behaviors in the online
context. Nevertheless, the majority of existing studies focused on people who already knew
each other. Relatively little is known about code-switching between total strangers in an
online environment, where the validity and durability of their identities rely almost
exclusively on their presence and behaviors in the virtual context.

1.4. This Study

Although English has been widely accepted as an indispensable tool for achieving academic
and career advancement in China, learners generally do not have much exposure to
English-speaking environments outside classroom settings. In such a context, where English
is rarely used for daily communication, the Internet chat room represents a unique locale of
interaction; it has been regarded by many English learners as a useful and handy site to
practice their English, especially spoken English. It is comparable to the so-called 'English
corner' outside EFL classrooms, where spoken English is practiced in the physical world
and where the conversation is typically the first encounter between interactors who do not
know each other.

However, not much information is available as to how this group of learners interacts
synchronously in a Net-based environment where co-presence is maintained primarily
through one's literary practice. By studying the code-switching practice and its functions in
the English chat room from an interactional perspective, it is hoped that we can understand
how social relations and interactional meanings are co-constructed through particular forms
of discursive practices. Meanwhile, by analyzing the sequential position in which a
code-switch occurs and how the code choice of one interlocutor affects that of the other, we
can catch a glimpse of the dynamics at play which prompt code-switches and affect the
reception and code choice by the addressee. Finally, it is presumed that code shifts in contexts
where co-presence is maintained primarily through verbal practices not accompanied by
prosody or body language could indicate learners' (lack of) target language competence to
meet various interactional needs.

2. Methods

2.1. Data Collection

The chat room under study is called English WW, a component of www.bliao.com, which is
the largest chatting website in China and is composed of various freely accessible chat rooms
catering for people with different interests. This particular room is one of the twelve English
chat rooms on this website intended for people to practice conversational skills in English; in
other words, the chatters are typically English learners. Chatters use nicknames they make up
for this chat room, which enables them to remain anonymous regarding their real-life
identities. As observed from the exchanges, interactors come from a broad range of
backgrounds, including college students, white-collar workers and teachers.

This chat room was selected because it was the most populated English chat room of
the website, with an average of 50-60 participants at a time, which provided a rich resource
for linguistic research. Just as importantly, the settings of this chat room enabled the
researcher to easily copy the ongoing conversation.

The researcher logged into the chat room and observed it for about two hours on

twelve consecutive days. The conversations in progress were copied and saved for subsequent

analysis, resulting in approximately 18 hours of verbal exchanges. Due to the voluntariness,

anonymity, irregularity and fluidity of online communities, it was impossible to obtain

demographic information from the participants or keep track of them once they quit the chat

room. Each line is prefaced by the names of both the speaker and the addressee, making it

possible for the investigator to piece them together and obtain individual conversations from

the synchronous multiple conversations which mingled together on the screen.

Then instances of code-switching were identified and examined to find under what

circumstances participants shifted codes, and whether and how the code-switching affected

social relations with the interlocutor and the code choice of the interlocutor.
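
The threading step described above, in which each line carries the names of both speaker and addressee, can be sketched in Python as follows. This is only an illustration: the record layout (speaker, addressee, text), the function name thread_conversations, the sample lines and the nicknames Leo and Amy are hypothetical, and the source does not say that any script was used.

```python
from collections import defaultdict


def thread_conversations(records):
    """Group chat lines into two-party conversations.

    Each record is a (speaker, addressee, text) tuple; this layout is an
    assumption, since the source only says that both names preface a line.
    """
    threads = defaultdict(list)
    for speaker, addressee, text in records:
        # A frozenset ignores direction, so A->B and B->A share one thread.
        pair = frozenset((speaker, addressee))
        threads[pair].append((speaker, text))
    return dict(threads)


# Hypothetical interleaved log: two conversations mixed on one screen.
log = [
    ("Tim", "Vicki", "do u think about leaving?"),
    ("Leo", "Amy", "hi, where r u from?"),
    ("Vicki", "Tim", "how?"),
    ("Amy", "Leo", "beijing"),
]

threads = thread_conversations(log)
for pair, lines in threads.items():
    print(sorted(pair), lines)
```

Keying on the unordered pair keeps the original order of turns within each conversation, which matters for any analysis of the sequential position of a code-switch.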

2.2. Codes Used in the Chat Room

English proficiency varied greatly from chatter to chatter; some people demonstrated

noticeable and frequent grammatical errors in English. However, the major focus of the

present study is not English proficiency variation, but the way people shifted between English

and Chinese. It was noted through observational data that this English-based chat room had

been turned into a peculiar English-based bilingual community through the use of a mixed-

code variety of language among the interactors, consisting of English and pinyin, i.e.,

romanized Chinese. Pinyin is a system of romanization for Standard Mandarin. It was

adopted in 1979 by China as the method of phonetic instruction in mainland China and

established by the International Organization for Standardization (ISO) as the standard

romanization for modern Chinese. Pinyin uses Roman letters not to represent the shapes of

Chinese characters, but to spell the sounds of Standard Mandarin (Swofford 2006). It has also

become a convenient tool for entering Chinese language text on computers. Pinyin was found
to be preferred to Chinese characters in the chat room under study, which could be
primarily a result of the participants' wish to avoid the trouble of having to switch between
English and Chinese characters, or partly due to the consideration that, in an English context,
Chinese characters would appear somewhat abrupt. Therefore, the following section will

focus on the switch between English and pinyin within utterances and across utterance

boundaries. Among the great number of such switches, a large part of which took place in

brief phatic verbal exchanges, three excerpts were selected for detailed qualitative analyses

because they provided relatively complete communicative settings which made it possible to

carry out more meaningful, objective, and rational interpretation and discussion from the

interactional perspective.

3. Discussion

It was found that in many cases, chatters used various combinations of English and pinyin,

which seemed to have worked well with this chat group. A noticeable aspect of the

phenomenon of code-switching was the attachment of Chinese particles to the end of English

utterances. Although pragmatic particles do not contribute significantly to the propositional

content, they affect the utterance as a whole in that they provide contextual coordinates for

the proper interpretation of the speaker's utterances in ongoing discourse (Ostman 1982). In
traditional Chinese grammar, sentence-final particles are referred to as yǔqì cí 'mood words',
which suggests that their function is primarily to relate in various ways the hosting utterance
to the conversational context and to indicate how this utterance is to be interpreted by the
hearer (Li and Thompson 1981). Although these particles are optional as far as
grammaticality judgments are concerned, they are pragmatically informative and express the
speaker's attitude or emotional state in the communication interchange. As pointed out by

Chao (1968), they are important devices in Chinese that fulfill many of the functions of

intonation in other languages, such as English, which is especially meaningful in online chat

rooms where there is a lack of prosodic features.

In the collected corpus, many conversation participants adopted Chinese sentence-final particles to cue the modality of their utterances and their orientation to the addressee. This is particularly interesting because there is no one-to-one correspondence between pinyin and Chinese characters, not to mention that pinyin mixed in English utterances was not marked with tones, an important feature of Chinese pronunciation. In the following excerpt, particles constitute all of the code-switches from English to pinyin. Tim and Vicki (pseudonyms) are talking about Vicki’s relationship with her boyfriend. Vicki is not very happy with her boyfriend and Tim is trying to help her out by offering suggestions.

Excerpt 1

1 Tim: do u think about leaving?
2 Vicki: leaving from him?
3 Vicki: how?
4 Vicki: I don't want to mention it first.
5 Tim: by starting to try another guy ne
6 Vicki: no guy I can try a
7 Vicki: I don't want to play game in fair
8 Tim: well..u r living in the kingdom of gals?
9 Tim: it's not playing a
10 Vicki: iiiiiiiiiii)
11 Tim: yes?
12 Vicki: I don't know what should I do
13 Tim: he loves you?
14 Vicki: he said
15 Vicki: but I can't fell that
16 Vicki: feel
17 Vicki: right now I am working in one company, no gals, so many guys
18 Tim: which stage does a girl feel the love to her from bf most?
19 Vicki: I have no time to make other bf due to busy
20 Tim: I see
21 Vicki: at my stage ba
22 Tim: I mean, at the beginning of an affair, or in the mid, or after getting married ne
23 Vicki: I wish I have bf, we leave in different city but not far from
24 Vicki: marriage is far from me I think
25 Tim: hehe
26 Vicki: I never think about it

Tim and Vicki have been conversing solely in English for 16 minutes. Then, in reply to Vicki’s question in line 3, Tim code-switches in line 5 to pinyin ne instead of using a question mark. Ne, a rough equivalent of ‘how about’, has the function of converting a statement into a question in a context that is already known (Chu 1998). The tentativeness achieved by suffixing such a force-reducing particle saves Vicki’s negative face and indicates Tim’s awareness of the potential risk of being perceived as impolite and intrusive in advising people, especially strangers, on their personal affairs. This is Tim’s adaptation to this peculiar virtual environment, which lacks the nonverbal subtleties that can otherwise be conveyed by body language or voice features. Tim’s change of code to signal his pragmatic intent triggers Vicki’s incorporation of pinyin a in line 6, which, as a sentence-final particle similar in pragmatic function to ne, reduces the assertiveness of the message conveyed by the sentence (Li and Thompson 1981). This suggests Vicki’s attempt to mitigate the tone of her negative reply and is a sign that she is aware of Tim’s insertion and the potential threat to Tim’s positive face. The same particle is also invoked by Tim in line 9 as a face-saving tone softener for his disagreement with Vicki’s comparison between love and game playing. Vicki’s second code-switch, to ba in line 21, implies her desire to ‘solicit the approval or agreement of the hearer with respect to the information conveyed by the sentence’ (Li and Thompson 1981: 307); its semantic function resembles that of questions such as ‘don’t you think so?’ or ‘wouldn’t you agree?’ in English. This also seems to contribute partly to Tim’s code change in line 22, which combines the English utterance with the Chinese mitigator and question marker ne.

In brief, Tim initiates the use of Chinese sentence-final particles; Vicki, after a little while, makes a similar choice of code, which in turn affects Tim’s language use. This extract shows that switching to romanized particles is used to adjust and negotiate interlocutors’ involvement in this virtual environment and to affect the mutual interpretation of and participation in the ongoing dialogue. Such careful tagging reduces the assertiveness of the otherwise monolingual English utterances and indicates that the accommodation of the interlocutors probably stems from face-maintaining considerations when making suggestions, showing disagreement or indicating tentativeness. This mixture of codes facilitates the building of rapport and intimacy between the speakers involved.
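The particle tagging discussed for Excerpt 1 could, in principle, be approximated automatically. The following is a minimal Python sketch, not the procedure used in this study: it assumes that a switch can be identified simply by checking whether the last word of a turn is one of the romanized particles mentioned in the text. The names PARTICLES and final_particle are hypothetical, and the heuristic is naive (an English turn that happens to end in the article a would be misclassified), so its output would still need manual checking.

```python
import re

# Romanized (toneless) sentence-final particles mentioned in the text.
# Assumption: this small set is illustrative, not an exhaustive inventory.
PARTICLES = {"ne", "a", "ba", "ya", "la", "le"}


def final_particle(utterance):
    """Return the turn-final particle, or None if the last word is not one."""
    words = re.findall(r"[a-z']+", utterance.lower())
    if words and words[-1] in PARTICLES:
        return words[-1]
    return None


# A few turns from Excerpt 1, keyed by their line numbers in the excerpt.
excerpt_1 = {
    1: "do u think about leaving?",
    5: "by starting to try another guy ne",
    6: "no guy I can try a",
    9: "it's not playing a",
    20: "I see",
    21: "at my stage ba",
    22: "I mean, at the beginning of an affair, or in the mid, "
        "or after getting married ne",
}

tagged = {n: final_particle(u) for n, u in excerpt_1.items()}
switch_lines = [n for n, p in tagged.items() if p is not None]
print(switch_lines)  # lines whose final word is a listed particle
```

For the turns listed, the sketch flags lines 5, 6, 9, 21 and 22, which are the lines analysed above as code-switches.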

It was also found that the insertion of pinyin minimized the chance of communication breakdowns by softening an atmosphere that would otherwise be tense. In the following excerpt, Tina and Jason ask for each other’s means of contact, i.e., Tina’s number on QQ, another chat program, and Jason’s email address. However, neither of them succeeds.

Excerpt 2 — Part 1

27 Tina: are you playing music ne?
28 Jason: yeah
29 Jason: a loving one
30 Jason: got it
31 Jason: so noisy here
32 Tina: huh
33 Jason: hi
34 Jason: do you have a qq
35 Tina: no ya
36 Jason: i have no email
37 Tina: why ??
38 Jason: wo ye mei you a
   [I don't have it either a]

Tina’s attachment of ya to her reply in line 35 to Jason’s request for her QQ number is a tone softener, counteracting the forcefulness of a negative reply that is potentially face-threatening and offensive because it is likely to be interpreted as a refusal. Jason immediately returns the rejection in line 36 in an unmarked manner to save his own face by claiming that he does not have an email, which obviously takes Tina by surprise and threatens her positive face, as shown by line 37. At this point, the atmosphere is getting tenser and the conversation seems to have reached a deadlock. Then Jason’s unexpected and thorough change of code in line 38 suggests his realization of the possible embarrassment caused by his bluntness in line 36; it may be a repair of line 36 based on Tina’s use of the mixed code in line 35. From exclusively English to exclusively pinyin, this drastic code change indicates Jason’s timely adjustment to the changing context. Their conversation continues.

Excerpt 2 — Part 2

39 Tina: no one have no email in this world
40 Jason: send you my pic by qq
41 Tina: no qq here
42 Jason: oh
43 Jason: so pitiful
44 Tina: soooooooooooooooooo
45 Tina: even bargain
46 Jason: what?
47 Tina: even bargain ya
48 Tina: you have what i dont have
49 Jason: hehe
50 Tina: and i have what you dont have ya
51 Jason: never mind ya
52 Jason: qq is more convenient than email, at least for me
53 Tina: really? I see

Jason is not annoyed by Tina’s disbelief about his not having an email. Instead, in line 40, he offers to send his picture to Tina through QQ, which is what Tina has just claimed she does not have, only to be rejected indirectly by Tina in line 41 on the grounds that it is an ‘even bargain’ (line 45). Jason’s what in line 46 shows that he is either surprised or does not understand what Tina says, which provides a need and an opportunity for Tina to repair the satirical tone and provocativeness of line 45 through the attachment of ya to lines 47 and 50. This modification makes the tone lighter and more playful, and thereby reduces the tension. As a result, Jason seems to be able to conform to this newly emerging norm of code use, and follows Tina in the use of ya for his never mind (line 51), which steers the conversation in a more friendly direction. Thus the wh-questions of line 37 and line 46 both set off a subsequent change of code use: before them, the participants use English when trying to sound assertive to keep their own face; after them, they code-switch to pinyin to various extents to save the addressees’ face. Moreover, in this process, the participants are keenly sensitive to the subtle messages conveyed through code alternation by the interlocutor and often adjust their code choice accordingly. Social relations are thus implicitly co-constructed in the virtual environment through a distinctive way of speaking as people modify their own behaviors in the sequential context of a conversation.

This virtual community is also a place where people’s behaviors are manipulated explicitly through shifts in code. In the following dialogue between Justin and Linda, the mixed-code variety of language use not only heightens the interpersonal nature of the conversation but also signifies a process in which Linda gets socialized to behave more politely in this peculiar environment of interaction.

Excerpt 3 — Part 1

54 Justin: hi, Linda, nice to see you
55 Justin: i am wonderful, and you?
56 Linda: glad to hear that
57 Justin: :)
58 Linda: compare with u, i should say it's as usual
59 Justin: that is not easy, many people get worse and worse, don't be greedy la
60 Linda: haha

Here Justin’s use of an imperative in line 59 is a Chinese way of establishing intimacy and solidarity, but there is still a potential risk of being perceived as rude and offensive. Therefore, la is attached to line 59 as a mitigator. This particle usually appears as a sentence suffix, used in many Chinese dialects to present a sentence as rather light-hearted and to invite solidarity. This combination of an English imperative and a Chinese particle proves effective because Linda is obviously not offended, but amused and pleased, as shown in line 60. The exchange continues as follows.

Excerpt 3 — Part 2

61 Justin: dui ba?
   [Is it right?]
62 Linda: dui ni ge tou la
   [right you quantifier head la]
63 Justin: ah?
64 Justin: bu xing,
   [You cannot do this.]
65 Justin: hai shi dui ni de tou ba
   [It would be better if …]

Justin’s total code-switch for his tag question in line 61 makes the tone even milder, which further counteracts the assertiveness in line 59. This is followed by another complete change of code on the part of Linda in line 62. But Linda’s hasty judgment of their social distance causes her to make fun of Justin. Her ni ge tou la is a teasing way in Chinese of expressing disagreement with, or a negative opinion of, what the addressee has said, usually used casually as a pet phrase with intimate friends or people lower in power rank. It can be seen as Linda’s effort to reduce her social distance from Justin. This bold use, as an indication of an attempt at greater intimacy, is potentially face-threatening and sounds abrupt and impolite in this context, where interlocutors are usually strangers to each other. It turns out to be detrimental to the atmosphere and gives rise to a communication crisis, which is substantiated by Justin’s ah in line 63, showing his surprise at the way Linda talked to him. His subsequent bu xing in line 64 and hai shi dui ni de tou ba in line 65 reveal that he is obviously offended and are strong protests against Linda’s rude verbal behavior. In particular, in line 65, Justin returns what Linda says in line 62 to Linda, with the addition of the prefix hai shi (meaning ‘it would be better if’) and the suffix ba. This seemingly polite expression in reality conveys his dissatisfaction with Linda’s manner, on the one hand, and works as a mitigator, on the other hand, in the sense that it saves Linda’s face through the joint use of the force-reducing prefix and suffix. Therefore, the code-switch in lines 64 and 65 can be perceived as Justin’s blunt correction of Linda’s verbal behavior and an indication of a change in alignment.

Excerpt 3 — Part 3

66 Justin: zen me la? bu hao yi si?
   [Are you OK? Do you feel embarrassed?]
67 Linda: dui bu qi
   [I'm sorry.]
68 Justin: bu yao jin
   [Never mind.]
69 Linda: en
   [All right]

It is worth mentioning that Linda then pauses for almost half a minute, which is a likely indication of her embarrassment resulting from Justin’s explicit expression of displeasure. Justin’s continuing use of pinyin in line 66, which clearly shows his concern about how Linda feels, suggests his awareness of Linda’s loss of face due to his utterance in line 65 and has a remedial function for that line. It helps alleviate the tension building up between them that put the conversation on the verge of a breakdown, and finally gets Linda to apologize, using the same code, in line 67 for what she says in line 62. Thus, Justin finally regains his face; subsequently, his bu yao jin in line 68 clearly marks his willing acceptance of Linda’s apology, which is acknowledged by Linda, whose onomatopoeic ‘en’ in line 69 puts an end to the unpleasant and embarrassing part of their verbal interchange.

Excerpt 3 — Part 4

70 Linda: I will leave
71 Justin: for what?
72 Linda: for working
73 Justin: i know le
74 Linda: bai la
75 Justin: that is for money
76 Justin: bai bai

But Linda’s switching back to English in line 70 is an intentional attempt to increase social distance; she obviously does not feel at ease about what has just happened. This results in the same code change on the part of Justin in line 71, who then attaches another romanized Chinese particle, le, to I know in line 73. According to Li and Thompson (1981: 240), the basic communicative function of le is to signal a ‘currently relevant state’; in other words, it claims that a state of affairs has special current relevance with respect to some particular situation. In this case, it signals to Linda in a mild way that Justin has already understood the reason why she is leaving and represents Justin’s effort to soften the tense atmosphere through manipulating code use. It is followed by Linda’s interesting combination of bai (a loan of English bye), which is commonly used among young Chinese intimates, and a sentence-final particle la. Justin responds in a similar fashion, which concludes this online encounter. The code shift to pinyin resumed by Justin in line 73 and used by both participants thus helps to restore rapport between the two.

In short, Justin’s playful tone, accomplished by his incorporation of pinyin into his utterances, enhances the intimacy with Linda and also leads to Linda’s change of code as well as her blunt tone, which is suggestive of her misjudgment of their social distance. Her face-threatening teasing causes some discomfort in Justin and turns out to be unacceptable to him. The succeeding use of pinyin is corrective, enabling Justin to make Linda realize that he does not like the way she treated him, which is followed by his inviting Linda back into the conversation after realizing Linda’s loss of face. Social distance is then increased by Linda by switching back to English as a retreat from the embarrassment, and is ultimately reduced by Linda by an interesting mixture of codes when she leaves the chat room. Both acts immediately change Justin’s code choice. Therefore, code-switching facilitates Justin and Linda’s face management and proximity manipulation. It is particularly interesting that the extent of code-switching seems to vary with the atmosphere and purpose of the speaker. A complete switch to pinyin or English highlights the utterance and explicitly marks speech acts such as seeking agreement, protesting, apologizing or bidding farewell, indicating the negotiation of social meanings between the two interlocutors.

4. Conclusions

The above analysis reveals that code-switching, which has been shown to be an interactive and dynamic negotiation process during which participants shape their social positions and build their virtual environment, helps Chinese learners of English actively co-construct social meanings and relations in this virtual chat room. Their code choice and degree of code-switching are firmly anchored to the situational needs of social distance and face maintenance. The analyzed conversations lend further support to Olmstead-Wang’s (2004: 23) claim that code-switching, which indicates participants’ orientation toward the interaction and toward each other, is a positive conversational resource that enhances sociability, and allows ‘shared understandings about the purpose of the interaction to enter into the language practice’. It helps people convey subtle messages that underlie the propositional content and signals a role shift in the social alignments of the participants. From the interactive perspective, one person’s selection of code constrains the interpretation and the code choice of the addressee, which in turn has a considerable effect on their context. People’s use of code affects the addressee’s involvement in the ongoing dialogue in that it either acknowledges the latter’s intention behind the code choice or corrects his or her behavior perceived as inappropriate.

Chatters in this online virtual community have been shown to draw on the linguistic and discursive resources of both English and Chinese in the development of a distinct virtual social network, which contributes to the creation of their relationships as bilingual speakers who resort to code shifts, especially from English to pinyin, for more subtle interactive and social purposes. This use of hybrid language also shapes roles for interlocutors in either encouraging or inhibiting certain types of verbal behaviors. Social distance, identities, and facework are negotiated rather than pre-established and fixed, which is particularly meaningful in a context where participants are strangers and other contextualization cues such as prosodic features and body language are not available.

Approaching code-switching in a computer-mediated community in an EFL setting from an interactional perspective demonstrates how the electronic chat room provides an authentic and distinct context of social interaction. It illuminates how language is a valuable asset that enriches our knowledge of the way specific interactive purposes are served in an online environment typically populated by strangers. Meanwhile, an examination of the way Chinese and English are mixed as contextualization cues to index social meanings can inform our understanding of how people adjust to the practices of the virtual community they are involved in.

Furthermore, given the lack of visual and audio aids in the context under study, the investigation of literacy practices in this peculiar online setting also sheds some light on how the verbal behaviors of English learners in the chat room relate to their local experiences of English learning. It has to be recognized that the Internet offers unique opportunities for EFL learners in China to use the target language. It provides a platform for people not only to practice their English, but also to create a new collective identity not simply as English speakers or Chinese speakers, but as learners trying to converse in a language that is rarely used in their daily life. On the one hand, this mixed-code variety works well among the interlocutors, since there are no obvious signs of confusion or misunderstanding and speakers seem to have managed to get across the propositional and non-propositional messages effectively. On the other hand, shifting skillfully to Chinese to various extents complements the use of English in expressing subtle interactive and social meanings, which should have been attended to in English, given the purpose of the chat rooms. Their shift to Chinese runs counter to their purposes in a sense. This phenomenon, in which members of this community use English primarily for ideational content and frequently resort to Chinese for interactive and emotional nuance, may suggest their underdeveloped ability to attend to the social and pragmatic aspects of communication in English relative to Chinese. Therefore, from the perspective of language learning, this study makes another good case for improving learners’ interactive competence in English in EFL settings where exposure to authentic language use is rather limited.

References

Adendorff, R. (1996). The functions of code switching among high school teachers and students in KwaZulu and implications for teacher education. In K. M. Bailey and D. Nunan (Eds.), Voices from the language classroom: Qualitative research in second language education (pp. 388–406). Cambridge: Cambridge University Press.

Auer, P. (1984). Bilingual conversation. Amsterdam: Benjamins.

Auer, P. (1988). A conversation analytic approach to code-switching and transfer. In M. Heller (Ed.), Codeswitching: Anthropological and sociolinguistic perspectives (pp. 187-213). Berlin: Mouton de Gruyter.

Auer, P. (1998). Code-switching in conversation: Language, interaction and identity. New

York: Routledge.

Auer, P. (2007). A postscript: code-switching and social identity. Journal of Pragmatics,

37(3), 403-410.

Baym, N. K. (1995). The emergence of community in computer-mediated communication. In

S. G. Jones (Ed.), Cybersociety: Computer-mediated communication and community

(pp. 138-163). Thousand Oaks: SAGE.

Bays, H. (1998). Framing and face in internet exchanges: A socio-cognitive approach. Linguistik Online, 1. Retrieved June 9, 2008 from http://viadrina.euv-frankfurt-o.de/~wjournal/bays.htm

Cárdenas-Claros, M. S. and N. Isharyanti. (2009). Code-switching and code mixing in

internet chatting. The JALT CALL Journal, 5(3), 67-78.

Chao, Y. R. (1968). A grammar of spoken Chinese. Berkeley: University of California Press.

Cheng, L. and K. Butler. (1989). Code-switching: A natural phenomenon vs language ‘deficiency’. World Englishes, 8(3), 293-309.

Chu, C. C. (1998). A discourse grammar of Mandarin Chinese. New York: Peter Lang

Publishing.

Cromdal, J. and K. Aronsson. (2000). Footing in bilingual play. Journal of Sociolinguistics,

4(3), 435-457.

Crystal, D. (2001). Language and the Internet. Cambridge: Cambridge University Press.

Freiermuth, M. R. (2001). Features of electronic synchronous communication: A comparative analysis of online chat, spoken and written texts. Unpublished Master’s dissertation, Oklahoma State University, Stillwater.

Gardner-Chloros, P. (1991). Language selection and switching in Strasbourg. Oxford:

Clarendon Press.

Goffman, E. (1981). Forms of talk. Philadelphia: University of Pennsylvania Press.

Goldbarg, R. N. (2009). Spanish-English codeswitching in email communication. Language

@ Internet, 6. Retrieved June 19, 2009 from

http://www.languageatinternet.de/articles/2009/2139

Gumperz, J. J. (1982). Introduction: Language and the communication of social identity. In J.

J. Gumperz (Ed.), Language and social identity (pp. 1-21). Cambridge: Cambridge

University Press.

Hall, K. (1996). Cyberfeminism. In S. C. Herring (Ed.), Computer-mediated communication: Linguistic, social and cross-cultural perspectives (pp. 147-170). Amsterdam: John Benjamins.

Healy, D. (1996). Cyberspace and place: The internet as middle landscape on the electronic frontier. In D. Porter (Ed.), Internet culture (pp. 55-72). New York: Routledge.

Heller, M. (1988). Codeswitching: Anthropological and sociolinguistic perspectives. Berlin:

Mouton de Gruyter.

Ho, J. W. Y. (2006). Functional complementarity between two languages in ICQ.

International Journal of Bilingualism, 10(4), 429-451.

Hoffman, C. (1991). Introduction to bilingualism. New York: Longman.

Lam, W. S. (2004). Second language socialization in a bilingual chat room: Global and local considerations. Language Learning & Technology, 8(3), 44-65.

Li, C. N. and S. A. Thompson. (1981). Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.

Liebscher, G. and J. Dailey-O'Cain. (2005). Learner code-switching in the content-based

foreign language classroom. The Modern Language Journal, 89(2), 234-247.

Lu, J. Y. (1991). Code-switching between Mandarin and English. World Englishes, 10(2),

139-151.

McConvell, P. (1988). Mix-im-up: Aboriginal code-switching, old and new. In M. Heller (Ed.), Codeswitching: Anthropological and sociolinguistic perspectives (pp. 97-149). Berlin: Mouton de Gruyter.

Myers-Scotton, C. (1988). Self-enhancing codeswitching as interactional power. Language

and Communication, 8(3), 199-211.

Nilep, C. (2006). ‘Code-switching’ in sociocultural linguistics. Colorado Research in Linguistics, 19(1). Retrieved June 19, 2008, from http://www.colorado.edu/ling/CRIL/Volume19_Issue1/paper_NILEP.pdf

Olmstead-Wang, S. (2004). Constructing sociability through code-switching in Mandarin-English family conversations. Unpublished doctoral dissertation. The University of Alabama, Tuscaloosa.

Ostman, J. O. (1982). The symbiotic relationship between pragmatic particles and impromptu speech. In N. E. Enkvist (Ed.), Impromptu speech: A symposium (pp. 147-177). Abo: Akademi.

Porter, D. (1996). Introduction. In D. Porter (Ed.), Internet culture (pp. 11-18). New York: Routledge.

Rheingold, H. (1993). The Virtual Community: Homesteading on the electronic frontier.

Reading: Addison-Wesley.

Saville-Troike, M. (1982). The ethnography of communication. Oxford: Blackwell.

Sert, O. (2005). The functions of code-switching in ELT classrooms. The Internet TESL Journal, 11(8). Retrieved February 20, 2009 from http://iteslj.org/Articles/Sert-CodeSwitching.html

Simon, D. L. (2001). Towards a new understanding of codeswitching in the foreign language classroom. In R. Jacobson (Ed.), Codeswitching worldwide II (pp. 311–342). Berlin: Mouton de Gruyter.

Van Lier, L. (1996). Conflicting voices. In K. Bailey and D. Nunan (Eds.), Voices from the classroom. Cambridge: Cambridge University Press.

Levinson, S. C. (1992). Activity types and language. In P. Drew and J. Heritage (Eds.), Talk

at work: Interaction in institutional settings (pp. 181-205). Thousand Oaks: SAGE.

Su, H. (2009). Code-switching in managing a face-threatening communicative task: Footing

and ambiguity in conversational interaction in Taiwan. Journal of Pragmatics, 41(2),

372-392.

Tay, M. W. (1989). Code-switching and code mixing as a communicative strategy in

multilingual discourse. World Englishes, 8(3), 293-309.

Tepper, M. (1997). Usenet communities and the cultural politics of information. In D. Porter (Ed.), Internet culture (pp. 39-54). New York: Routledge.

Swofford, M. (2006). The Three ‘NOTs’ of Hanyu Pinyin. Retrieved March 15, 2006 from http://www.pinyin.info

Wilbur, S. P. (1996). An archaeology of cyberspaces: Virtuality, community, identity. In D. Porter (Ed.), Internet culture (pp. 5-22). New York: Routledge.

Wine, L. (2008). Towards a deeper understanding of framing, footing, and alignment.

Working Papers in TESOL & Applied Linguistics, 8(2), 1-3.

Interrogating Current Conceptualisations of ‘Word’ for Word Knowledge Studies:

Challenges and Prospects

Jabulani Sibanda

Rhodes University

Bioprofile: Jabulani Sibanda is currently studying for a Ph.D. degree with Rhodes University

(South Africa). He has taught and published in the area of second language teaching and

literacy. His primary research interests are in second language teaching and research, and

literacy development.

Abstract

The present paper interrogates the efficacy of the conceptualisation of the construct ‘word’ represented by ‘token’, ‘type’, ‘lemma’, and ‘word family’ as units of measurement in English vocabulary knowledge research studies. It uses Grade 3 second language learners of English in South Africa as the context for investigating the adequacy and validity of each of the word units. The paper argues that the ‘token’ and ‘type’ units’ disregard of the important principle of ‘learning burden’ and the ‘lemma’ and ‘word family’ units’ overextension of the principle militate against their validity as units of vocabulary measurement. The paper casts doubt on the feasibility of objectively defining ‘lemma’ and ‘word family’ membership with precision, which compromises their efficacy as units of word measurement. An extension of Nation and Bauer’s (1983) levels of ‘word family’ membership, through a determination of the inflected and derived forms of base words that learners show a propensity to acquire and the order of that acquisition, is proposed as a desirable and requisite way forward.

Keywords: token, type, lemma, word family, learning burden, word knowledge

Introduction

The lofty place of words in language proficiency has long been acknowledged in statements like ‘what learners carry around with them are dictionaries and not grammar books’ (Baxter 1980) and ‘without grammar very little can be conveyed, without words nothing can be conveyed’ (Wilkins 1972: 111). Both statements attest to the superior effect of vocabulary over grammar for the development of language proficiency. In fact, grammar and language proficiency are an outgrowth of one’s lexical competency, which renders word knowledge a proxy for language proficiency. Research has consistently testified to vocabulary having higher correlations with language proficiency than other measures (Qian 2002, Koda 2005, Chen 2011). Words have both an upward and a downward influence: downward to their constituent morphemes and upward to the larger units of which they are parts. In the latter, they form the basis of all language as they are the basic units of meaning upon which larger structures like phrases, sentences, and paragraphs hinge. The bulk of vocabulary research focuses on individual words. The exalted status of words in language proficiency, coupled with Mármol’s (2011: 12) observation that ‘despite new trends in vocabulary research that focus on higher units as collocations or idioms, there is no doubt that the word is the main unit in vocabulary quantification and language by and large’, demonstrates the merit of closely examining the concept ‘word’, which the present paper seeks to do. The interrogation of the efficacy of the current conceptualisations of the construct ‘word’ is done in the context of Grade 3 second language (L2) learners transitioning to reading to learn in Grade 4. Such a context, it is hoped, will illustrate the need for a further reconceptualisation of the construct ‘word’ for word knowledge measurements on Foundation Phase (FP) L2 learners.

The Context

The Grade 3 learners who speak any of the 10 official languages of South Africa (excluding English) as their Home Language (HL) or First Language (L1), and who are on the verge of a transition to Grade 4, form the context on which the paper’s discussion hinges. The table below indicates the Home Language distribution according to the 2011 census.

SOUTH AFRICAN LANGUAGES – 2011

Language        Number of speakers*   % of total
Afrikaans                 6 855 082        13.5%
English                   4 892 623         9.6%
IsiNdebele                1 090 223         2.1%
IsiXhosa                  8 154 258          16%
IsiZulu                  11 587 374        22.7%
Sepedi                    4 618 576         9.1%
Sesotho                   3 849 563         7.6%
Setswana                  4 067 248           8%
Sign language               234 655         0.5%
SiSwati                   1 297 046         2.5%
Tshivenda                 1 209 388         2.4%
Xitsonga                  2 277 148         4.5%
Other                       828 258         1.6%
TOTAL                    50 961 443**       100%

* Spoken as a home language
** Unspecified and not applicable excluded
Source: Statistics SA
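As a rough check on the table, the percentage column can be recomputed from the speaker counts. The sketch below is illustrative only: it assumes the counts and the stated total exactly as printed in the table, and the names speakers, STATED_TOTAL and share are hypothetical.

```python
# Speaker counts copied from the table (2011 census, home language).
speakers = {
    "Afrikaans": 6_855_082,
    "English": 4_892_623,
    "IsiNdebele": 1_090_223,
    "IsiXhosa": 8_154_258,
    "IsiZulu": 11_587_374,
    "Sepedi": 4_618_576,
    "Sesotho": 3_849_563,
    "Setswana": 4_067_248,
    "Sign language": 234_655,
    "SiSwati": 1_297_046,
    "Tshivenda": 1_209_388,
    "Xitsonga": 2_277_148,
    "Other": 828_258,
}

STATED_TOTAL = 50_961_443  # total as printed in the table


def share(language):
    """Percentage of the stated total, rounded to one decimal place."""
    return round(100 * speakers[language] / STATED_TOTAL, 1)


# Combined share of every row other than English.
non_english = sum(n for lang, n in speakers.items() if lang != "English")
non_english_share = round(100 * non_english / STATED_TOTAL, 1)

print(share("IsiZulu"), share("English"), non_english_share)
```

On these figures, the rows other than English (including ‘Sign language’ and ‘Other’) together account for about 90.4% of the stated total.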

Third graders from such linguistic demographic profiles are expected to learn in their HL for the duration of the Foundation Phase (Grade R-3) and shift largely to English as the Language of Learning and Teaching from fourth grade onwards (South Africa Department of Education Curriculum and Assessment Policy Statement (CAPS) 2011). Prior to the CAPS dispensation (which has only been phased in with effect from 2012), schools were at liberty to determine the point at which they wanted to introduce English as a subject in their FP curriculum. The current third graders, therefore, have a diverse duration of exposure to English ranging from -1 year to a maximum of 4 years for those who have had exposure to English since Grade R. Although they have been in school for almost three years, there is a sense in which the majority of them are beginners in terms of exposure to English. The fact that, for most of them, English is not sufficiently reinforced at home (CAPS 2011) represents a challenge, which is accentuated by the fact that the focus of fourth grade reading is reading to learn, which is qualitatively more challenging than the FP learning to read. The assumption is that by the end of third grade the learners have attained reading proficiency in the language they are going to use to learn, and are now well positioned to use their reading proficiency to learn textual material. Even among HL speakers of English, a fourth grade slump, a designation of the ‘…sudden drop-off between third and fourth grade in the reading scores…’ (Hirsch 2003: 10), is a common phenomenon. For second language learners who have had scant exposure to English both at home and at school, the slump could only be worse. Given how much vocabulary is a proxy for language proficiency, a measure of such learners’ vocabulary knowledge would be indicative of their chances of surviving the impending slump. The question meriting consideration is whether there is a conceptualisation of the construct ‘word’ which is equal to the task of indicating the actual word knowledge of learners with the profile described.

Conceptualisation of the Construct ‘Word’

The infamous question ‗What is a word?‘ has plagued the field of vocabulary testing for years

and has defied singularity or uniformity of definition. Discrepancies in vocabulary size

estimates are primarily a result of lack of consensus on what constitutes a word for word-

counting purposes. Put differently, if a child knows all the words in the statement, ‗The boy

did not go to the shops when the other boys were going’, how many words do they actually

know? Should we keep counting the word ‗the‘ the three times it recurs in the statement or

should we just count it once? Can we not presuppose the knowledge of ‗boys‘ to be an

outgrowth of the knowledge of ‗boy‘ to warrant treating them as the same word? Should ‗go‘

and ‗going‘ not be taken as one word in different forms? Such fundamental questions lead to

diverse conceptualisations of the construct ‗word‘. In a bid to respond to such questions, the

field of vocabulary measurement has landed itself with four conceptualisations namely: word

as token, word as type, word as lemma, and word as word family. The relative merits of these

word constructs in relation to the context of this paper require examination. D'Anna,

Zechmeister and Hall‘s (1991: 111) question, ‗When we say that a child learns 3,000 or 5,000

words per year, what exactly are we talking about?‘ is as valid now as it was then.

Word as Token

Ordinarily we identify words ‗…simply by the space between the strings of letters in written

language‘ (Luitel 2011: 59). This is consistent with Carter in Catalán and Francisco‘s (2008:

151) definition of a token as ‗…any sequence of letters (and a limited number of other

characteristics such as hyphen and apostrophe) bounded on either side by a space or

punctuation mark‘. Any expression devoid of any spaces within it and separated by spaces

from other expressions is consistent with the view of word as token. Such a conceptualisation

can, however, be faulted on the basis of its failure to account for some compound

constructions like ‗cannot‘ which can be regarded as one or two words depending on how

they are written. In addition, should hyphens be considered as spaces or not? If they should, what

do we say about the inconsistency in the division of compound words like ‗injustice‘ and ‗in-

laws‘? Some words like ‗ice cream‘ are visualised and thought of as one word despite having

two forms and there is the complication of whether we need to consider the forms making up

the expression or the concept represented by the forms. Does the fact that an ice cream is one

item make the word a single word or does the presence of two forms make it two words?

Mármol (2011) contends that because such words represent a single concept and learners

learn and understand them as just one concept, they should be considered as single words.

The criterion of spaces reflects the uninterruptibility of words: one cannot insert anything

inside a word as one can inside a sentence. Inserting another word between a word and its

inflection is impossible, but one can always add a qualifier to say more about a verb or noun

in a sentence. Tokens are also referred to as running words in a text and ‗…each

occurrence of a form is counted separately‘ (Luitel 2011: 59). Tokens indicate the total

number of words in a text or corpus yielding the quantity of input in a text in raw terms

(Mármol 2011). According to Nation (2001), tokens are the conceptualisation of ‗word‘ we

would be making reference to when we talk about a summary, a telegram, or a research paper

being so many words long. Every occurrence of each word is counted despite the recurrence

of some words in the text.
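
The token count described above can be sketched in a few lines of Python. This is a minimal illustration rather than an established procedure: the names TOKEN_PATTERN and tokens are introduced here for illustration only, and treating hyphens and apostrophes as word-internal characters is just one possible reading of the definition of a token quoted above.

```python
import re

# Assumption: a token is a run of letters that may contain internal
# hyphens or apostrophes, bounded by spaces or punctuation.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def tokens(text):
    """Return every running word in the text; repeated forms are kept."""
    return TOKEN_PATTERN.findall(text)


sentence = "The boy did not go to the shops when the other boys were going"
print(len(tokens(sentence)))        # 14: each 'the' is counted separately
print(len(tokens("I cannot go")))   # 3: 'cannot' written solid is one token
print(len(tokens("I can not go")))  # 4: written apart, it becomes two tokens
print(tokens("ice cream"))          # ['ice', 'cream']: two forms, one concept
print(tokens("in-laws"))            # ['in-laws']: the hyphen is not a space here
```

Under these assumptions, the same learner knowledge yields different counts depending purely on how ‗cannot‘ or ‗ice cream‘ happens to be written, which is the weakness of the token construct discussed above.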

There are limitations to the application of the token as a conceptualisation of word in

vocabulary measurement. Most vocabulary measurement studies utilise word frequencies to

determine the most frequent words and the extent of learners‘ knowledge of them. Using the

token as a unit of measurement would make computation of word frequencies impossible

since every stand-alone form is regarded as a different word. Token as a unit of analysis treats

every form as diverse from the others implying that each form has to be learnt separately. In a

statement ‗Your mother was talking to my mother in your garden’, the words mother and

your, which appear twice each, are regarded as four different words yet everything about

them (orthographic make-up, meaning, and pronunciation) remains the same. Apart from

treating the same form as a different word whenever it recurs in text, forms like boy and boys

are presumably learnt one by one. This would make vocabulary acquisition and learning a

painfully slow process. What should, and does, happen is that sometimes we learn the

meanings of some words by inferring them from those related words which are already part of

our repertoire. Even the English Second Language (ESL) third graders profiled in this paper

can deductively recover some words‘ meanings from those they already know. The token,

therefore, falls short as a unit of word counting for word knowledge studies in this and other

contexts. Word as type addresses some of the limitations of the token construct and so

deserves some scrutiny.

Word as Type

According to Read (2000), in the conceptualisation of word as type, only the word form that

is dissimilar from all the others in an utterance is counted. Any recurring word form is only

counted once. Using the ‘Your mother was talking to my mother in your garden’ example, we

can note that although there are ten tokens, there are only eight types since the words ‗your‘

and ‗mother‘ appear twice in the statement. If we adopt the word as type as the unit of

quantification, all words identically spelt will be considered as one word. Word types would,

then, be all those items with different orthographic identity. Nation (2001: 7) observes that

conceptualising words as types is necessary when responding to questions like ‗How large

was Shakespeare‘s vocabulary?‘ Conceiving a word as a type is based on two assumptions:

first, that knowing a particular word in one context translates to its knowledge in different

contexts making it one word no matter the number of times it recurs in a text; and, second,

that every individual word type is unique and its understanding does not depend on an

understanding of another. Learners‘ knowledge of some words should, therefore, not be

inferred from their knowledge of other words. Both assumptions are questionable. The fact

that a word‘s identity rests on its orthographic composition or spelling leads to problems with

homonyms which take on several meanings depending on the context of use. An overused but

apt example is that of ‗bank‘. Word as type considers such a form as one word when it can be many

words. Some words also function as both nouns and verbs depending on their use. An

example would be the form ‗pin‘ in the statement ‘Get the pin and pin the papers.‘ The first

pin is a noun and the second is a verb, and knowledge of the first does not guarantee that of

the other. This discounts the assumption that because the word is spelt the same, it is the same

word wherever it is encountered. The other limitation of the word as type construct is the

disregard of the idea that some words‘ meanings can be extrapolated from knowledge of

related others. Knowing the word ‗boys’ logically presupposes knowledge of the word ‗boy’

and the two would well be considered as one word even for ESL FP learners. Such a

generalisation lacking in type is the basis upon which the lemma is built.
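
The difference between counting tokens and counting types can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions: the function name type_counts is introduced here, words are split on spaces only, and case is ignored so that ‗Your‘ and ‗your‘ count as the same type.

```python
from collections import Counter


def type_counts(text):
    """Map each distinct word form (type) to its number of occurrences."""
    # Assumption: case is ignored, so 'Your' and 'your' are one type.
    return Counter(word.lower() for word in text.split())


counts = type_counts("Your mother was talking to my mother in your garden")
n_tokens = sum(counts.values())  # every occurrence counted
n_types = len(counts)            # each distinct spelling counted once
print(n_tokens, n_types)         # 10 8

# Forms that recur and are therefore collapsed into a single type.
print(sorted(w for w, n in counts.items() if n > 1))  # ['mother', 'your']

# The noun 'pin' and the verb 'pin' share a spelling, so they are
# collapsed into one type even though they are different words.
pins = type_counts("Get the pin and pin the papers")
print(pins["pin"])               # 2
```

The last lines show the homonym problem raised above: a count based on spelling alone cannot separate the noun from the verb, nor can it relate ‗boy‘ to ‗boys‘.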

Word as Lemma

The lemma is preferred for lexical quantification because it overcomes the limitation of

having to consider each word form as a unique form unrelated to the other forms, as the

type and token conceptualisations do. Gardner (2007) notes that, in a lemma, all lexical forms

share the same stem and word class, and differ only in inflection or orthographic make-up.

The words write, writes, writing, written and wrote are all verbs emanating from the base

form write. The ‗-s‘, ‗-ing‘, ‗-en‘ are the inflections which are just indicative of a change in

grammatical functioning of the same base word write. The lemma is based on the assumption

that the knowledge of the inflected forms is eased and expedited once the base form, as well

as the morphological inflections, are known. The learning burden, which Nation (2001)

defines as the amount of effort required to learn a new word, is eliminated or eased

considerably if the base word is known. Knowledge of the inflectional system of English

would ease the learning of the inflected forms on the basis of the knowledge of the base form.

The other justification for considering inflected forms as one word with the base form is that

inflectional morphemes do not create new words; they merely modify the form in which they occur to

indicate grammatical functioning, such as plurality. The base form which has to be known in

this instance is write and what the inflections do is to give grammaticality to the functioning

of the same word in different contexts.

The requirement of having all members of a lemma belong to the same word class

would disqualify the form writer from the lemma of write, writes, writing, written and wrote

as it belongs to the class of nouns. It would become a base word for a different lemma of

writer, writers, writer’s and writers’. The assumption is that the learning burden of words

emanating from the base form belonging to the same word class is less than that of inflected

forms from the same base which cut across word classes. Browne, Cihi and Culligan (2007:

2) exemplify and corroborate this assumption when they posit that the ‗…statistical item

difficulty factors for ‗accept‘, ‗accepts‘ and ‗accepting‘ are very close, whereas the statistical

difficulties for ‗acceptable‘, ‗acceptance‘ and ‗unacceptable‘, are all quite different. One

hypothesis is that the brain treats these six items as four different Base Words.‘ Such an

argument necessitates and rationalises the confinement of members of a lemma to a single

word class. The example of the six word forms given fits the argument well but, going back to

the examples of inflected forms emanating from write, one may argue that knowledge of the

base form write may make the form writer easier for one learner than the forms wrote or written,

which belong to the same word class as write. That the definition of a lemma cited above

accommodates irregular verbs like went for go, sought for seek or am, is, are, was, were,

being for be within a lemma makes the assumption that belonging to the same part of speech

as the base reduces the learning burden of a word highly suspect. As Gardner (2007: 244)

observes, ‗the case of the irregulars poses serious quandaries relating to the psychological

validity of such family relationships — namely, that the opaque spelling and phonological

connections between the lemma headword and the family members will surely cause more

and different learning problems than their more transparent counterparts‘. This defeats the

whole principle of learning burden which the lemma was created to uphold.
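
The word-class restriction on lemma membership can be made concrete with a small Python sketch. The lookup table LEMMA_TABLE and the function group_by_lemma are hypothetical and hand-built for this illustration; an actual study would rely on a full lemmatised list, and how irregular forms such as wrote are assigned remains the open question discussed above.

```python
from collections import defaultdict

# Hypothetical hand-built lookup: word form -> (headword, word class).
# Irregular forms are placed in the lemma of 'write' here, which is
# exactly the decision the surrounding discussion calls into question.
LEMMA_TABLE = {
    "write": ("write", "verb"),
    "writes": ("write", "verb"),
    "writing": ("write", "verb"),
    "written": ("write", "verb"),
    "wrote": ("write", "verb"),
    "writer": ("writer", "noun"),
    "writers": ("writer", "noun"),
}


def group_by_lemma(forms):
    """Group word forms under their (headword, word class) pair."""
    lemmas = defaultdict(list)
    for form in forms:
        # Forms missing from the table are treated as their own lemma.
        key = LEMMA_TABLE.get(form, (form, "unknown"))
        lemmas[key].append(form)
    return dict(lemmas)


forms = ["write", "writes", "wrote", "written", "writer", "writers"]
groups = group_by_lemma(forms)
print(len(forms), len(groups))  # 6 2
print(groups[("writer", "noun")])  # ['writer', 'writers']
```

Six forms reduce to two lemmas because writer changes word class, even though, as argued above, writer may be easier for some learners to recover from write than wrote or written.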

Nation (2001: 8) registers concern over the inclusion of irregular forms within a

lemma when he notes that ‗one problem in forming lemmas is to decide what will be done

with irregular forms such as mice, is, brought, beaten and best. The learning burden of these

is clearly heavier than the learning burden of regular forms like books, runs, talked, washed

and fastest. Should the irregular forms be counted as a part of the same lemma as their base

word or should they be put into separate lemmas?‘ The orthographic constitution or spelling

of the word ‘best’ is not in any way indicative of stemming from the base form ‗good‘.

Including it within the lemma of ‘good’ would present an even higher burden of recovering

its meaning from the latter than it would be in learning its antonym ‘bad’ for instance.

Irregular plurals or verbal forms may need to be considered independently from their

headwords but such exclusion would mean quite a number of words would just be treated as

types or tokens as they cannot belong to lemmas. The words like good, better, best would not

be part of any lemma, as would all the irregular forms. The lemma should be a grouping of all

those words whose understanding is almost made obvious whenever the base form is known,

rather than a collection of words, which are brought together by virtue of them being inflected

from the same base form. Irregular forms normally use inflections diverse from regular ones

which gives an abstract status to morphemes. The regularity of frequent or regular inflections

stems from them being the inflections added to the vast majority of content words (verbs,

nouns, adjectives, and adverbs) to reflect grammatical properties such as tense, number, and

degree. The criteria of inflection and belonging to the same word class are not tight enough to

ensure only those words whose meanings are easily recoverable from the meaning of the base

gain entrance into the lemma.

Nation (2001) broadens the scope of a lemma to include the contracted forms. One

may express reservations over the inclusion of contracted forms on at least two grounds. First,

knowledge of the contracted form requires knowledge of, not only the base form, but also that

of ‗not‘ since the contracted form is both a fusion and reduction of two words (for example,

can + not = can’t). Second, there are transparent and opaque kinds of contractions and the

opaque contractions cannot easily be inferred from the base form + ‘not‘. Transparent

contractions would be forms like have + not = haven’t, do + not = don’t and the opaque

forms would be will + not = won’t, am + not = ain’t, shall + not = shan’t. The opaque

contractions have a higher learning burden which does not justify treating them as part of the

same lemma as the base especially for vocabulary knowledge measurement on second

language Foundation Phase learners. Asserting that beginners can associate such irregular

forms with their headwords is fundamentally unrealistic.

Possibly from realising the problems of having overly accommodative criteria for a

lemma, Milton (2009: 10) makes the conception of a lemma less accommodative but more

manageable by narrowing its definition saying, it ‗...includes a headword and its most

frequent inflections and this process must not involve changing the part of speech from that of

the headword‘. In formulaic terms, the definition of a lemma can be represented, thus:

Lemma = headword + most frequent inflections + their contracted forms (belonging to same

class)

The use of the phrase ‗most frequent‘ is noteworthy and could well be interchanged or used

together with ‗transparent‘. The only problem with ‗most frequent‘ is that it leaves the

determination of most frequent to the researcher‘s discretion in the absence of frequency lists

of inflected forms. The ‗frequency‘ also needs qualification, whether it is the frequency with

which the inflected form is used in a text, or the frequency that stems from the number of

English words that an inflection inflects. The former kind of frequency would be relative to

text as frequent forms in one text may be less frequent in another. A definition of word whose

criterion is of a relative nature is not tight enough to allow easy and objective application. The

latter kind of frequency does not guarantee that inflections that have a lower spread in their

use are more difficult than those that impact a wide range of word forms in the language.

The lemma is also based on an assumption that inflections are easier than other forms

of affixation (prefixation and suffixation) which can be challenged. Some suffixes like ‗-able‘

and ‗-less‘ and prefixes like ‗un-‘ have meaning in and of themselves which can be used to

recover the meaning of a suffixed and prefixed form like ‗suitable‘, ‗careless‘ and ‗unfair‘;

yet, inflections are devoid of such independent meaning. Such systematic use of affixes can

be used to significantly reduce the learning burden of the words derived from a known base

form. That the inflections ‗-s‘ and ‗-es‘ can be used for both verb and plural forms can be a

confounding factor on its own. This is not to imply such is absent from affixed forms.

In this paper, reference has repeatedly been made to the base form, better known as the

headword but what really constitutes or counts as a headword is not clear. Nation (2001)

raises Sinclair‘s concern whether a headword should be the base form or the most frequent

form. The base form may not be the most common form or the form that learners are likely to

acquire first. The base itself can be recoverable from the most common form which justifies

the supposed complication of which to consider as the headword, the base form or the most

common form. That the construct lemma is elusive to define with precision explains why,

although the comparative and superlative forms have always been considered English

inflections, Nation (2001) notes that, in the computerised, lemmatised list of the Brown

Corpus (Francis and Kučera 1982), these are excluded.

Stubbs (2002) proposes an additional criterion for membership into a lemma: the

requirement that all the members share the same meaning, a criterion challenged for its failure

to distinguish a lemma from a lexeme. The lexeme also denotes a group of words sharing the

same meaning and same word class which the lemma does as well. An additional criterion

complicates the determination of what it is that should gain admission into the lemma

membership. Acknowledging the difficulty of constituting a lemma and the unconvincing

conclusions often emanating from ‗…generalizations about whole lemma…‘, Knowles

and Mohd Don (2004: 71) advise researchers ‗…to consider individual words‘ or ‗…actually

even individual word meanings…‘ as the basis for their word count and analyses. This is

almost a call to revert to conceptualisation of word as type.

Brain research has provided insights which support the learning burden principle but

not the constitution of lemmas. Browne, Cihi and Culligan (2007: 2) assert that ‗…the brain

stores and processes lemmas having similar difficulty factors as forms of the same word,

and…stores and processes lemmas having different difficulty factors as different words‘. The

idea of coming up with a formula for defining what qualifies as a lemma is a noble one which

seeks to make the determination of lemmas objective. We have already seen how some

inflected or contracted forms are more difficult than others, implying that there is no

justification in generalising that because a word is an inflection or contraction of a base form

then it should enjoy lemma membership. Browne, Cihi and Culligan‘s (2007) observation that

some lemmas are registered by the brain as separate words, rather than one word, casts doubt

on the validity of lemmas as a unit of vocabulary counting and analysis. That the brain does

not always store and process lemmas as we constitute them points to the need for either a

revisit of the constitution of lemmas if not a creation of another unit of counting.

Word as Word Family

Nation (2001) identifies the components of a word family as a headword, its inflected forms

and closely derived forms (derivatives). Derivation differs from inflection in that, while

inflection does not produce separate words, derivation creates separate but morphologically

related words usually involving some change in form. A subjective element is introduced by

the expression ‗closely derived forms‘ as one cannot make, with objective certainty, the

determination of the closely derived forms and those not so closely derived. Words from

across word classes can gain membership into a word family. Word families have their basis

on the understanding that the acquisition of thousands of words is through the application of

rules which make words into morphological families which ensures ‗…little or no extra

learning when one or more of the members is already known to the learner‘ (Chung 2009:

162). For instance, the process of affixation, which includes prefixation and suffixation, eases

the learning of a lot of words. A word family, therefore, ‗…includes a wider range of

inflections and derivations…as the basis of word counts‘ (Milton 2009: 11). Our word family

formula would be:

Word Family = Base form + Basic inflected forms + Transparent derivatives

The learning burden principle is the basis upon which the word family unit is constructed.

Knowledge of the base form engenders knowledge of its inflections and its close derivatives.

The word family unit is more accommodative of members than the lemma. In

the first place, there is an inclusion of derivatives which are not included in the lemma, and

second, the restriction of having members belong to the same word class does not apply.

Word family members traverse boundaries of grammatical classes. Several lemmas usually

find themselves part of a single word family. From the base form long can come long, longer,

longest, longevity, longish, length, lengthen, lengthy; and all these can be considered as one

word under the word family unit of analysis. Certainly, all these forms cannot have similar

learning burden from the base form to warrant inclusion in the same word family. Even

derived forms differ in their complexity and difficulty of comprehension (Browne, Cihi and

Culligan 2007). That all these forms would be known once the base form is known is the

argument behind the word family unit. Mármol (2011: 12) challenges such an assumption by

pointing out that ‗…we cast doubt on the idea that a child acquiring bed has also acquired

bedroom. There is the possibility that an adult could guess the meaning of the latter, but a

young language learner in his first stages of acquisition may not be able to make those

inferences.‘ The word family unit depends for its use on the learner‘s possession of an

intricate knowledge of morphological inflections of the English language in order to make

intelligent guesses about the meaning of some words on the basis of knowledge of their base

form. Evidently, learners, such as the ones described in this paper, would not possess the

native-like knowledge of morphological relations between words in a family. Schmitt and

Zimmerman‘s (2002) study which required non-native postgraduate and undergraduate

participants to identify the derivational forms of stimulus stem words revealed that

participants could only rarely provide all the different derivations of the stimulus words. This

suggested only partial knowledge of derivational forms on the part of the participants. Bauer

and Nation (1993) even add that learners should know that ‗mean’ does not derive from ‗me‘,

despite the orthographic or spelling string for ‗me‘ occurring in ‗mean‘. Learners should also

have some implicit knowledge of the role of affixes (prefixes and suffixes) in word formation

and word meaning, as well as use permissible base-affix combinations in speech and writing.

Because it takes in a broader membership and treats the different members as one, most if not

all the challenges confounding the application of a word family for word frequency counts

and word knowledge analysis are similar to, and even take a greater magnitude than, those of

lemmas as discussed in this paper. The challenge of deciding what should be included in a

word family and what should not is as manifest in the word family unit as in lemmatisation.

Bauer and Nation (1993) studied inflections and affixations of English words based

on their productivity, frequency, regularity and predictability and came up with a scheme for

defining word-families. They came up with seven levels or a word family scale based on an

analysis of the 1,000,000 token Lancaster-Oslo-Bergen (LOB) corpus dealing mainly with

affixation. These levels were supposed to form the basis for teaching and learning of English

words. The scheme is a welcome acknowledgement that learners‘ knowledge of affixation

develops with more experience of the language. A sensible word family for one learner may

be beyond another learner‘s current level of proficiency. This necessitates the scaling of word

families from the most elementary and transparent members to those of less obvious

possibilities (Nation 2001). At level 1, learners are assumed to treat each form as a different

word. The table below, adapted from Bauer and Nation (1993: 254), takes the scale from the

second level to the seventh level of inflections and affixations.

Level  Affixation and inflection

1      No affixes.

2      -s, -ing, -ed, -er, -est (all inflections)

3      -able, -er, -ish, -less, -ly, -ness, -th, -y, non-, un- (Most frequent and regular
       derivational affixes)

4      -al, -ation, -ess, -ful, -ism, -ist, -ity, -ize, -ment, -ous, in- (Frequent,
       orthographically regular affixes)

5      -age, -al, -ally, -an, -ance, -ant, -ary, -atory, -dom, -eer, -en, -ence, -ent, -ery,
       -ese, -esque, -ette, -hood, -l, -ian, -ite, -let, -ling, -ly, -most, -ory, -ship,
       -ward, -ways, -wise, ante-, anti-, arch-, bi-, circum-, counter-, en-, ex-, fore-,
       hyper-, inter-, mid-, mis-, neo-, post-, pro-, semi-, sub-, un- (Regular but
       infrequent affixes)

6      -able, -ee, -ic, -ify, -ion, -ist, -ition, -ive, -th, -y, pre-, re- (Frequent but
       irregular affixes)

7      ab-, ad-, com-, de-, dis-, ex-, and sub- (Classical roots and affixes)

N.B.: Bracketed words in italics at the end of levels 4 through 7 are not part of the original.

Gardner (2007: 247) appreciates the ‗…apparent advantage of this seven-level

categorization scheme … that Word or Word Family can be operationalized at various

defensible levels for analysis and comparative analysis purposes — at least in terms of

learners‘ abilities to associate morphologically related words‘. Bauer and Nation (1993) need

to be applauded for hierarchically organising word family levels which can be matched with

learners‘ competence levels. For a learner operating at level 5, for instance, all the words in

levels 1 to 5 emanating from the same base would be considered as a single word, but those in

levels 6 and 7 would be regarded as different words from their base form. It is also significant

that such categorisation was done systematically on the basis of the rigorous criteria identified

above (their productivity, frequency, regularity and predictability) and on a large corpus (a

million words) which gives the categorisation a substantial measure of validity.
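
How the unit of counting shifts with the learner's level can be sketched in Python. The table FAMILY_TABLE and the function count_words are hypothetical and built only for this illustration; the level given to each form is an assumption read off the affix table above (-s at level 2, -ous at level 4, dis- at level 7), not an assignment taken from Bauer and Nation (1993).

```python
# Hypothetical entries: word form -> (base form, assumed level on the scale).
FAMILY_TABLE = {
    "advantage": ("advantage", 1),     # base form, no affix
    "advantages": ("advantage", 2),    # -s is listed at level 2
    "advantageous": ("advantage", 4),  # -ous is listed at level 4
    "disadvantage": ("advantage", 7),  # dis- is listed at level 7
}


def count_words(forms, learner_level):
    """Count distinct 'words' for a learner operating at the given level."""
    units = set()
    for form in forms:
        # Forms missing from the table are treated as unaffixed base forms.
        base, level = FAMILY_TABLE.get(form, (form, 1))
        # A form at or below the learner's level merges with its base;
        # a form above that level is counted as a separate word.
        units.add(base if level <= learner_level else form)
    return len(units)


forms = list(FAMILY_TABLE)
for learner_level in (1, 5, 7):
    print(learner_level, count_words(forms, learner_level))
# 1 4
# 5 2
# 7 1
```

The same four forms count as four words for a level 1 learner, two for a level 5 learner and one for a level 7 learner, so any reported vocabulary size is only interpretable alongside the level at which the word family was defined.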

Gardner, however, notes as problematic, the repetition of many affixed forms at the

different levels, failure to acknowledge that ‗derivational prefixes and derivational suffixes

may present different learning dilemmas for developing readers‘, as well as assuming ‗that

learners‘ exposure to, and acquisition of, morphologically-related words is somehow linear in

nature — in other words, that language learners acquire base forms before their inflected and

derived family members‘ (2007: 247).

Such an assumption is refuted by Biemiller and Slonim (2001), who note that young

children may actually acquire many derived forms before they acquire their root-form

counterparts. Concerning the duplication of affixes, an example would be the suffix ‘-able’ in

level 3 and in level 6 which presents uncertainty about membership level of forms like

‘suitable’ on the word family scale. The assumption of the linear nature of exposure and

acquisition of word family members rests on a shaky pedestal. A form like disadvantage

(level 7, according to the taxonomy of levels of inflections and affixations) can have a lower

learning burden than advantageous (level 4).

Such categorisation as Bauer and Nation (1993) come up with seems to come as a

solution to the challenge of determining what qualifies as a member of a word family. The

present paper, however, takes exception to the idea of basing the categorisation of the word

family levels solely on the basis of a corpus without complementing it with empirical

evidence of the ease with which learners acquire the different affixed forms. This is not a

criticism of Bauer and Nation‘s (1993) work but a pointer to the need for further large scale

research to corroborate the match between the levels of the corpus analysis and the

psychological realities of learners‘ word learning and acquisition.

Conclusion: The Potential Way Forward

A resuscitation and extension of the morpheme studies offers a potential way forward. Studies

by Dulay and Burt (1974), Fathman (1975) and Makino (1980), cited in Krashen (1982), showed

that acquisition of English grammatical structures follows a 'natural order' which is

predictable and is independent of instruction, learners' age, L1 background, or conditions of

exposure. Such studies need to be

conducted with both L1 and L2 learners of different language backgrounds to determine the

extent of the match between the language corpus ideals and the psychological realities of the

learners. Brown‘s (1973) longitudinal study reported in Kwon (2005: 4) produced the

following order of L1 acquisition of English Morphemes.

Rank   Morpheme
1      Present progressive (-ing)
2/3    in, on
4      Plural (-s)
5      Past irregular
6      Possessive (-‘s)
7      Uncontractible copula (is, am, are)
8      Articles (a, the)
9      Past regular (-ed)
10     Third person singular (-s)
11     Third person irregular
12     Uncontractible auxiliary (is, am, are)
13     Contractible copula
14     Contractible auxiliary

The above hierarchical ordering is limited in two ways. First, the studies are based on native

English language speakers and importing the ranking wholesale to ESL learners may be

misleading. Second, the ordering is based exclusively on morpheme studies when in fact most

high frequency words are just sight words which cannot be reduced to their morphological

composition. The paper, therefore, argues for extensive testing and documentation of the

acquisition order of English affixed forms (suffixed and prefixed for both inflections and

derivations). The testing should cover a wide range of learner profiles from diverse language

backgrounds and competence levels. The resultant taxonomy should ensure that only those

lexical forms which pose negligible or no learning burden in the event that the base form is

known, are regarded as one word. Two forms may justifiably be regarded as one for one

learner but not for another depending on their level of competence. A taxonomy of word

conceptualisation levels is, therefore, needed where, at the first level, some lexical forms may

be regarded as separate words but, at the next levels, be considered as one word. Researchers

would then choose the level at which they conceptualise word for their word knowledge

measurements depending on the competence level of the learners. A departure from a ‘one

size fits all’ approach would make possible the replication of studies. One would just need to specify

that they based their studies on level 3 of the word conceptualisation taxonomy. Explicit rules

would need to be generated for word membership at each level and exceptions identified.

Even teachers would know which lexical forms they need to give preference to for explicit

instruction depending on the competence level of the learners. A move away from the current

word conceptualisations would ensure more realistic and valid conclusions on word

knowledge measurement studies.

References

Baxter, J. (1980). The dictionary and vocabulary behavior: A single word or a handful?

TESOL Quarterly, 14, 325-336.

Bauer, L. and I. S. P. Nation. (1993). Word families. International Journal of Lexicography,

6, 253–279.

Catalán, R., J. and R. M. Francisco. (2008). Vocabulary input in EFL textbooks. RESLA, 21,

147-165.

Chen, K., Y. (2011). The impact of EFL students’ vocabulary breadth of knowledge on

literal reading comprehension. Asian EFL Journal, 51, 30-40.

Chung, T., M. (2009). The newspaper word list: A specialised vocabulary for reading

newspapers. JALT Journal, 31(2), 159-182.

D'Anna, C., A., E. B. Zechmeister and J. W. Hall. (1991). Toward a meaningful definition

of vocabulary size. Journal of Literacy Research 23(1), 109-122.

Gardner, D. (2007). Validating the construct of word in applied corpus-based vocabulary

research: A critical survey. Applied Linguistics, 28(2), 241-265.

Hirsch, E. D. (2003). Reading comprehension requires knowledge – of words and the world:

Scientific insights into the fourth-grade slump and the nation’s stagnant

comprehension scores. American Educator. American Federation of Teachers.

Knowles, G. and Z. Mohd Don. (2004). The notion of a ‘lemma’: Headwords, roots and

lexical sets. International Journal of Corpus Linguistics 9(1), 69-81.

Koda, K. (2005). Insight into second language reading: A cross-linguistic approach.

Cambridge: Cambridge University Press.

Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford:

Pergamon.

Kwon, E. Y. (2005). The ‘natural order’ of morpheme acquisition: A historical survey and

discussion of three putative determinants. Teachers College, Columbia University

Working Papers in TESOL and Applied Linguistics, 5(1), 1-21.

Luitel, B. (2011). Vocabulary in the new B.Ed. general English under Tribhuvan University.

Nepal English Language Teachers’ Association Journal of NELTA, 16(1-2), 59-69.

Mármol, A. (2011). Vocabulary input in classroom materials: Two EFL coursebooks used

in Spanish schools by Gema. RESLA, 24, 9-28.

Milton, J. (2009). Measuring second language vocabulary acquisition. Bristol: Multilingual

Matters.

Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge

University Press.

Qian, D. D. (2002). Investigating the relationship between vocabulary knowledge and

academic reading performance: An assessment perspective. Language Learning,

52(3), 513-536.

Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.

Schmitt, N. and C. B. Zimmerman. (2002). Derivative word forms: What do learners

know? TESOL Quarterly, 36(2), 145-171.

South Africa Department of Basic Education Curriculum and Assessment Policy Statement

(CAPS). (2012). Foundation Phase First Additional Language. Pretoria: Government

Printer.

Stubbs, M. (2002). Words and phrases: Corpus studies of lexical semantics. Oxford:

Blackwell Publishing.

On Gendered Styles and their Socio-Cognitive Foundations

María José Serrano

Universidad de La Laguna

Miguel A. Aijón Oliva

Universidad de Salamanca

Bioprofiles:

María José Serrano is a Full Professor at the University of La Laguna (Spain). Her research

areas are syntactic variation, sociolinguistics, pragmatics and cognitive linguistics. Some of

her recent publications are: Sociolingüística (2011, Ediciones del Serbal) and Variación

variable (2011 ed., Círculo Rojo). She has also published articles in Spanish in Context,

Language Sciences, Language & Communication and Folia Lingüistica, among others.

Miguel A. Aijón Oliva is an Associate Professor at the University of Salamanca (Spain). His

research interests include variation in Spanish morphosyntax and the development of styles

from pragmatic, sociolinguistic and cognitive viewpoints. His recent work has been published

in journals like Language Sciences, Language and Communication and Folia Linguistica.

Abstract

The dynamic construction of gender as a set of communicative values is subject to

ever-growing interest amidst the social sciences. The main purpose of this investigation is to

outline a theoretical and analytical frame that reconciles both the quantitative and qualitative

perspectives on language and gender. A view is developed of the statistical patterning of

linguistic usage as reflecting the meaningful use of linguistic elements in local contexts. The

adequacy of such an approach is subsequently tested through the analysis of syntactic choices

in male vs. female speakers. Syntactic variants are not synonymous, but entail particular

discursive and cognitive meanings, and thus may contribute to the shaping of different

communicative styles.

The syntactic phenomenon chosen for the study is the expression vs. omission of

Spanish pronoun subjects in spontaneous conversation and media discourse. Three different

forms are separately studied: yo ‘I’, nosotros ‘we’ and tú ‘you’ (sing.), and their notional

peculiarities are taken into account. In all cases, quantitative analyses confirm frequency

differences in the distribution of expression vs. omission across male and female discourse.

More specifically, men display significantly higher rates of expressed subject pronouns, while

women are more inclined towards omission. Statistical calculations are then complemented

by the contextual observation of some gendered stylistic values that seem to be brought into

play through syntactic choice. A relationship is suggested between gendered styles and the

discursive-cognitive continuum from objectivity to subjectivity, this being reflected in a wide

range of communicative possibilities.

Keywords: gender, syntactic variation, pronouns, style

1. Gender as Style1

The study of the relationships between sex/gender and communication is an ever-developing

area of social science research with quite a long history behind it, and one that currently

offers some of the most promising prospects for sociolinguistics. Far from traditional clichés

and prejudices on the subject, a fair deal of consensus has been reached regarding the fact that

linguistic-communicative usage is usually less conditioned by biological sexual factors than

by psychosocial ones (see Eckert 1989). Sex/gender needs to be analyzed within socially and

situationally contextualized approaches, observing how identities are constructed and

reformulated through linguistic choice.

Dialectological and sociolinguistic studies conducted in diverse human communities

have long pointed out differences in the communicative norms followed by men vs. women.

Most investigations have focused on the supposed peculiarities of female behaviour, thus

more or less implicitly certifying male speech as the unmarked or standard variety. The result

of such an orientation for linguistic research has been more thorough knowledge of women’s

discourse and of the social contexts and practices across which it is developed (Coates 2003:

3; Edwards 2009: 146). However, this tendency is also being counterbalanced by the

appearance of more investigations specifically devoted to male self-presentation and

socialization through speech (Coates 2003; Jordan-Jackson and Davis 2005; Kiesling 2005).

The truth is that each gender group is repeatedly found to follow partly different

interactional patterns that, in our view, may well be the manifestation of different basic

socio-communicative styles. The view of gendered linguistic usage as a matter of style makes it

possible to move beyond the ‘reactive’ orientations promoted by classic sociolinguistic

quantitative research, which inevitably lead to somewhat fixed and static conceptions of

gender, as of any other social ascription (Bell 1999: 524). Seeking the balance point

between the conditionings imposed on gender by societal structures on one hand, and speaker

agency and creative elaboration on the other, seems to offer the most realistic and potentially

fruitful path at the present state of knowledge.

1 This paper is part of the research project ‘Los estilos de comunicación y sus bases cognitivas en el

estudio de la variación sintáctica en español’ (FFI2009-07181/FILO), funded by the Spanish

Ministerio de Ciencia e Innovación.

Style can be understood as any system of (linguistic and other) meaningful choices

that helps someone shape some (social, professional, emotional, ...) self-image; this being

perceived by the speaker as optimal for the achievement of certain interactional goals in a

particular context. The making of styles needs to be found and analyzed within real discourse,

where it may be feasible to describe the relevant circumstances of the situation, the social

features of the participants, and how they all interact with creative linguistic choice (Aijón

Oliva and Serrano 2013: 11-45; Serrano and Aijón Oliva 2011: 139). Sociolinguistic meaning

does not arise from ‘extralinguistic’ factors, but from the joint action of linguistic and any

other semiotic choices across symbolic communicative acts (Coupland 2007: 3).

The application of these principles to the study of language and gender naturally

results in a view of masculinity and femininity as sets of values that are partly received from

social structure, but that can and need to be continuously elaborated in interaction. Speakers

are not ‘male’ or ‘female’ once and for all, nor do they need to be just one or the other.

Rather, they can choose the extent to which they want to associate themselves with some

gender label, and even what the labels themselves might imply in a certain context; their

stylistic work will aim to shape a corresponding self-image towards others. In this paper, we

will conduct an analysis of stylistic choice and the configuration of gender as a socio-semiotic

category, with regard to a phenomenon of Spanish morphosyntax: variable formal expression

of first- and second-person clause subjects.

2. Subject Variation in Spanish: Discursive and Cognitive Interpretation

The approach to linguistic choice as stylistic work outlined in the preceding section also quite

naturally allows for a view of variation in morphosyntax and the lexicon as inherently

meaningful, not just socially and pragmatically, but even at the semantic and cognitive levels.

A linguistic structure mirrors the structure of human cognition; it is shaped by the human

perception of the surrounding world just as it helps shape it (Croft and Cruse 2004; Langacker

2009). This, in turn, leads to the assumption that the meaning of a construction will never be

exactly the same as that of seemingly synonymous alternatives, a tenet put forward by, for

example, construction grammar (cf. Goldberg 2003) and other related theoretical approaches.

The relevance of these views for the rethinking and refinement of the analysis of

linguistic variation can hardly be overstated. Following such reasoning, the cognitive

properties of morphosyntactic choices should be at the base of any usage patterns and

tendencies they might reveal. In fact, we believe the conjunction of so-called internal

meaning with social and situational features is what engenders socio-communicative styles;

that is, it creates systems of meaning affecting all possible levels of communicative choice

(Aijón Oliva and Serrano 2010a: 9; Serrano and Aijón Oliva 2011: 142).

Variation between the expression and omission of subject pronouns in languages such

as Spanish is one among many syntactic phenomena that lend themselves to style

construction. A relationship between pronoun usage and meaningful social factors such as

gender can thus be hypothesized and scientifically tested. It will be our task to ascertain

whether there are statistical differences in subject expression according to speaker gender, as

has been found in many other facts of linguistic variation. But, more importantly, if this is the

case, we will try to advance some explanation of such statistical patterning, by investigating

which semiotic facets of gender seem to be conveyed through syntactic choice in particular

discursive genres and contexts, and how this can be related to the meanings inherently linked

to grammatical forms.

In order to do this, we will first examine whether the syntactic phenomenon under

study is, in fact, the carrier of meaning differences at internal linguistic levels. As can be

inferred from a number of previous studies (Delbecque 2005; Siewierska 2004), variation

between the expression and omission of subject pronouns seems to be a formal reflection of

the degree of cognitive salience achieved by discursively encoded entities. When the referent

of a clause subject is under the attention focus, and can thus be considered salient or

accessible, its formulation tends to be perceived as unnecessary for the communicative

purposes of the speaker (Langacker 2009: 112). This is particularly evident in languages with

a relatively rich inflectional morphology such as Spanish, where the identity of clause

subjects can easily be tracked through verb agreement morphemes (cant-o ‘I sing’, cant-as

‘you (s.) sing’, and so on), which, in fact, makes subject omission the unmarked choice in

most discourse types (Serrano 2013: 276-281).

At the same time, discourse-oriented studies on subject variation such as the ones

cited above have often explained subject expression through informativeness, understood as

the degree of mental processing required by textual elements, given their newness or

unpredictability for participants (Beaugrande and Dressler 1997: 201). Informativeness is not

unrelated to salience but could, rather, be considered a textual correlate of it, albeit an

inversely proportional one; in general, the most salient entities are also the least informative

ones, due to their very accessibility and continuity across discourse stretches. Both salience

and informativeness should be conceived of as gradual magnitudes that are largely dependent

on the particular context, the relationship between the participants and other factors. Their

existence confirms the notion that different syntactic forms such as subject expression and

omission can hardly be seen as synonymous – they represent different views of non-linguistic

situations encoded through linguistic means (Serrano 2013: 284-288).

The analysis of subject pronouns, given their deictic nature and their power to endow

real-world entities with different degrees of cognitive salience within discourse, suggests that

their choice is a formal manifestation of abstract cognitive dimensions underlying speech, and

particularly of the continuum between objectivity and subjectivity. The latter is the tendency

of discourse and perception to revolve around subjects (mainly human participants, these

being the entities most frequently encoded as clause subjects in conversational speech and

other discourse types), while objectivity would imply the converse orientation towards

non-participants: third-person human and non-human entities. There is, in fact, a significant

amount of evidence pointing to objectivity-subjectivity as a very powerful notion for the

theoretical explanation of linguistic variation and style construction (Aijón Oliva and Serrano

2013, Kerbrat-Orecchioni 1980, Kristiansen 2008). In the present study, we will try to

elucidate whether this may bear some relationship to the shaping of gendered identities

through syntactic choice.

3. Corpora and Methodology

Two corpora of European Spanish were analyzed for the present research. The first one is the

Corpus Conversacional del Español de Canarias (CCEC), which comprises a series of

transcribed oral interactions among Canary Island speakers in different communicative

situations, basically divided into two types: spontaneous conversations and talk shows

broadcast on regional TV. Both gender groups and a variety of speaker social ascriptions are

sufficiently represented across the texts. The second corpus under analysis is the Corpus de

Lenguaje de los Medios de Comunicación de Salamanca (MEDIASA), devoted to

representing the media discourse of a Spanish Peninsular central town. It incorporates not just

oral but also written texts which are taken from local newspapers. Oral materials come from

the transcription of radio broadcasts pertaining to different media genres, and in which

socially and professionally heterogeneous speakers take part. Together, both corpora make it

possible to observe and analyze a fairly wide range of contexts and interactions.

As discussed in the preceding section, variation between the omission and expression

of subjects is a matter of semiotic choice whereby different meanings are communicated

through different syntactic configurations. Now it is necessary to specify the empirical

methodology required to analyze the projection of semiotic choice in syntactic form and its

communicative repercussions.

In this sense, the calculation of descriptive frequencies, that is, of the percentages of

one variant against the other, is the most basic tool for the quantitative assessment of

variation. This is referred to as the relative variable approach. However, the consideration of syntactic

options as meaningful options by themselves and not in opposition to other variants suggests

the incorporation of a complementary statistical method that can, in some way, better suit this

conception of linguistic variability. This we shall refer to as the absolute variable

methodology (Aijón Oliva and Serrano 2012: 80-94). It is based on the assumption that any

form-meaning pairing is contextually chosen for its own value and not just as opposed to any

other options. Consequently, aside from assessing its frequency against those of its alleged

alternatives, it may be interesting to calculate it in overall terms according to an independent

measure, such as word number. In our case, this means assuming that the total frequencies of,

e.g., expressed subjects across some text, group of speakers, etc. can be scientifically

revealing in themselves and irrespective of their relationship to omitted-subject rates. Thus, a

frequency index of each form per 10,000 words will be used to clarify the tendencies

suggested by percentage data.
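
The two measures described above can be illustrated with a minimal sketch. It is not the authors' own procedure: the function names frequency_index and relative_percentage are hypothetical, and the only data used are the (yo) creo counts that the paper reports for the MEDIASA corpus in Table 2.

```python
def frequency_index(occurrences: int, word_count: int, per: int = 10_000) -> float:
    """Absolute measure: occurrences of a form normalised per `per` words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return occurrences / word_count * per


def relative_percentage(variant_count: int, alternative_count: int) -> float:
    """Relative measure: share of one variant against its alternative, in percent."""
    total = variant_count + alternative_count
    if total == 0:
        raise ValueError("no tokens of either variant")
    return variant_count / total * 100


# Counts reported for (yo) creo in the MEDIASA corpus (Table 2).
men = frequency_index(232, 177_332)
women = frequency_index(51, 116_288)
print(round(men, 1), round(women, 1))  # 13.1 4.4
```

Because the index is computed against word number rather than against the competing variant, it can be obtained for a single form even when counts of the alternative are unavailable, which is the point of the absolute variable methodology.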

Now it must be acknowledged that statistical patterns, useful and revealing as they

may be, would make little sense if they had no relationship to the actual instances of

communication they emerge from. We believe there is an essential connection between the

quantitative and the qualitative sides of sociolinguistic variation; one that has been generally

neglected, but that is indispensable for the future construction of a general theory. In the case

of our study, the conjunction of statistical and interactional findings seems particularly crucial

if we aim to explain communicative styles as the contextual construction of identities by men

and women.

Our analysis and discussion of syntactic variation and its stylistic implications for the

notion of gender will be divided into the next three sections, each one focusing on a different

subject pronoun: yo ‘I’, nosotros ‘we’ and tú ‘you (singular)’.

4. The First-Person Singular yo ‘I’ in (yo) creo ‘I think’ Constructions

In general terms, yo is the most frequent subject in Spanish clauses. Its statistical dominance

can be taken as a formal reflection of the general egocentric orientation of human language

(Keysar 2007, Serrano 2014), even if its occurrence rates are obviously quite variable

depending on the context and discourse type, usually becoming higher in contentious or

persuasive speech. The argumentative potential of first-person subjects is particularly obvious

in the context of verbal lexemes acting as indicators of modality, among which creer ‘to

think’ seems to be the most frequent one in Spanish discourse. This is why our present

analysis will be restricted to the construction (yo) creo and its basic usage patterns.

Qualitative contextual analysis suggests that formal expression of the subject (yo creo

or else creo yo) represents the paradigmatic case of the aforesaid association of the structure

with personal opinion and argumentation, as seen in example (1), regarding the procedure that

should be followed in a Carnival competition. The speaker emphasizes the personal nature of

her stance.

[Female]:

(1) La gente que venía de la Península no sabía valorar un traje\yo creo que las personas

famosas\que vienen aquí al Carnaval\deberían de ser invitados\yo creo que podía haber un|||un

jurado más específico sobre el tema que estamos tratando\ (CCEC Conv<MaTe09>)

‘People who used to come from Peninsular Spain were not apt to evaluate a costume. I think

famous people taking part in the Carnival competition should be specifically chosen. I think

there should be a jury composed of experts on the topic we’re dealing with.’

On the other hand, omission of the subject (Ø creo) tends to be preferred for the presentation

of contents as hypothetical or as having a more general and less personal scope. In (2), the

speaker is expressing what she believes to be a mere possibility rather than a personal

position. That is, the omission of yo seems to displace potentially contentious discourse

towards objectivity.

[A: Male, B: Female]:

(2) A: Yo por lo que he leído en prensa\tengo la idea de que tu madre||dejó escrito algo

B: No sabemos\Ø creo que fue algo sobre un dinero que le debía\para que se le pagara\

(CCEC Conv<MaTe09>)

‘A: Based on what I read in the newspapers, [I] believe your mother left something written.

B: We don’t really know. [I] think it might have to do with some money she owed and wanted

to be returned.’

The variability and its discursive repercussions are explainable through the higher salience

and accessibility of omitted subjects. Avoiding overt self-indexation, the speaker builds a

more objective self-image that can be perceived as advantageous in contexts such as that of

(2). It is interesting to point out the fact that (yo) creo is one of the rare Spanish constructions

in which expression of the first-person subject is altogether more frequent than its omission,

as discussed in previous works (see Aijón Oliva and Serrano 2010b). This suggests that its

basic function is that of indexing the speaker in discourse, rather than strictly introducing a

belief or opinion, as the verb lexeme would indicate.

If just the overall frequencies of (yo) creo are calculated, whether with expressed or

omitted yo, we find that its occurrence is notably more usual in male speech. Table 1 shows

that, in the CCEC corpus, men are ahead of women by 6.5 items of (yo) creo per 10,000

words.

Table 1 Overall frequency of (yo) creo (expressed and omitted) according to gender (CCEC media texts)

Gender   Word number   Overall occurrences of (yo) creo   Frequency index per 10,000 words
Men      48,035        136                                28.3
Women    19,654        43                                 21.8

In the case of the MEDIASA corpus, the contrast is even sharper, with the scores of men

outweighing those of women by a three-to-one ratio (Table 2). That is, male speech seems to be

characterized by a stronger tendency towards discourse modalization through self-indexing

choices such as (yo) creo. However, such a hypothesis needs to be confirmed by analyzing

other facts of grammatical choice in discourse.

Table 2 Overall frequency of (yo) creo (expressed and omitted) according to gender (MEDIASA corpus)

Gender   Word number   Overall occurrences of (yo) creo   Frequency index per 10,000 words
Men      177,332       232                                13.1
Women    116,288       51                                 4.4

5. The First-Person Plural nosotros ‘we’

The referential content of nosotros ‘we’ is naturally diffuse, making it a highly versatile

pronoun in discourse. However, the higher informational load associated with subject

expression (see Section 2) usually results in a somewhat sharper demarcation of its referential

scope. This often goes together with some intention to single out a group of people, in which

the speaker includes him/herself and may exclude or include others.

Expressed nosotros is typical of discourse characterized by overt argumentation, a

pragmatic function that sometimes makes it useful to suggest the speaker’s inclusion in a

particular human group (examples 3 and 4).

[Female]:

(3) Que nosotros hemos tomado decisiones en reuniones \y después el resto de la gente no está

informada de lo que hay que hacer\ (CCEC Conv<ElEn08>)

‘We have made decisions in our meetings, but the rest of the people have no way of knowing

what is to be done.’

[Male]:

(4) Nosotros sólo pedimos que se cumplan los compromisos que estaban acordados. (MEDIASA

<Ent-Ad-131104-17>)

‘We are only asking for the commitments agreed on to be fulfilled.’

Omission is fostered by a high degree of subject salience in the context; but, due to the

peculiar discursive projections of nosotros, it is also often related to referentially vague uses

in which the first-person plural indexes a general community or performs a merely discursive

function. These usually promote a universal interpretation of the content. Omitted nosotros

helps move attention away from particular human subjects and place the interest of discourse

on objects being talked about – in other words, it enhances objectivity. In (5), the content is

presented as relating to any human being and not just a definite group, while in (6) the form

digamos ‘let’s say’ basically acts as a discourse marker.

[Male]:

(5) Los muertos nos permiten comprender la vida que Ø hemos construido y a su través Ø

entramos en la razón de ser de lo que Ø hemos sido y hecho. (MEDIASA <Art-Ga-051104-

5c>)

‘The dead help us understand the life [we] have built, and it is through them that [we] discover

the raison d’être of all that [we] have been and done.’

[Male]:

(6) pretende: / por un lado / e: sacar:: / es:cenas en las que se muestra: / e: Ø digamos: / la

barbarie entre comillas: / de: los republicanos: / Y: lo buenos: que eran / también entre

comillas: / los nacionales (MEDIASA <Inf-SE-180603-14:10>)

‘It is his intention to capture scenes showing, [we] say, the so-called barbarity of the Spanish

Republicans, as well as the supposed goodness of the Nationalists.’

From a cognitive viewpoint, any use of nosotros can be described as an extension of the first

person towards a larger group. Thus, whenever the first-person plural perspective is adopted,

the speaker will be included in some way, even if just in a metaphorical sense. But, crucially,

his/her personal sphere will be extended to include others as well.2 Salience and

informativeness can account for the observed variation, and thus contribute to shaping

communicative styles oriented to subjectivity or to objectivity.

The results from the CCEC corpus are clearly indicative of gender differences:

Omitted nosotros as an expressive choice is, in fact, much more usual in women’s

conversational speech. The objective presentation of facts and ideas through subject omission

would thus seem to be a trait more typical of female communicative styles, placing them

away from the pole of subjectivity (Table 3).

2 In this respect, inclusion against exclusion of the audience in the scope of nosotros appears as

particularly significant, even if it will not be possible to investigate the subject in this paper.

Table 3 Overall frequency of omitted nosotros according to gender (CCEC conversational texts)

Gender   Word number   Overall occurrences of omitted nosotros   Frequency index per 10,000 words
Men      27,867        37                                        13.2
Women    51,677        168                                       32.5

6. The Second-Person Singular tú ‘you’

As is the case with nosotros, matters of referential variability are important to the discursive

and pragmatic study of tú. The most significant fact in this respect is the existence of

non-specific uses of the pronoun, whereby some particular content can be presented as more

general; in fact, this is a possibility shared with English and other languages. The discursive

effect of generalization and objectivity is achieved by iconically associating the content of the

utterance with the hearer, even if deixis is hardly literal in this case. The switch from first to

second person seems to characterize the utterance as having a broader scope; this use of the

second person could, thus, be termed objectivizing tú (Serrano and Aijón Oliva 2012). Its

basic communicative motivations are evident whenever it is clear that the speaker is

drawing on personal experience, as can be perceived in excerpts (7) and (8).

[Female]:

(7) No es que tu hijo o tu hija tengan hijos\ es que tú te conviertes en abuela\ a mí eso me parece

más fuerte\ (CCEC Conv<ElEn08>)

‘It’s not just that your son or daughter may become a parent; you in turn will become a

grandmother, and that’s what feels most shocking to me.’

[Female]:

(8) desde luego es en la Única cadena / que se: puede hablar / porque en las otras / cuanto Ø

empiezas a decir algo de esto / te cortan (MEDIASA <Var-Co-230503-12:30>)

‘This is indeed the only radio station where one can talk freely; in others, whenever [you] start

saying things like these, they’ll cut you off.’

Both nosotros and objectivizing tú can be seen as discursive-cognitive extensions of yo,

aimed to widen or blur first-person deixis for a variety of communicative goals. As is the case

with referentially deictic instances of tú, formal expression in its nonspecific use is variable.

Whenever personal circumstances or positions are attributed to a second-person subject, they

seem to move beyond their particular notional sphere and acquire a more general value,

relieving the speaker from direct responsibility, and promoting discursive and cognitive

objectivity.

According to the scores in both corpora, it is somewhat more frequent for female

speech to carry out a transition from the first to the second person. That is, women are slightly

more inclined towards the indexation of their interactional partner (tú) in discourse, iconically

involving him/her in the content discussed. This could be interpreted as a quantitative

reflection of the collaborative or supportive orientation often attributed to female speech in

gender studies (e.g., Johnstone, Ferrara and Bean 1992: 150; Maltz and Borker 2011: 488)

(Table 4, Table 5).

Table 4 Overall frequency of objectivizing tú according to gender (CCEC conversational texts)

Gender   Word number   Overall occurrences of objectivizing tú   Frequency index per 10,000 words
Male     27,867        17                                        6.1
Female   51,677        38                                        7.3

Table 5 Overall frequency of objectivizing tú according to gender (MEDIASA texts)

Gender   Word number   Overall occurrences of objectivizing tú   Frequency index per 10,000 words
Male     177,332       105                                       5.9
Female   116,288       75                                        6.4

However, the analysis of tú calls for further elaboration of the objectivity-subjectivity

continuum as an abstract dimension explaining pronoun usage and style construction.

Whereas the choice of second-person pronouns is itself related to the realm of subjectivity, in

a given context it may, in fact, be intended to downplay the more 'prototypical' subjectivity

conveyed by the first person. Even so, in general terms we can once again point out some

preference of women for the variants conveying objectivity and a lesser tendency to impose

personal views on discourse, favouring agreement and collaboration instead. Our analysis

has shown that such orientation does not surface only in general discursive and interactional

strategies, but also in local grammatical facts such as subject choice and formulation.

7. Conclusions

In the present study, we have analyzed the statistical variation and some interactional

projections of the expression vs. omission of three Spanish subject pronouns; we hypothesized

that the syntactic variants under study might constitute formal-semantic choices helping the

development of communicative styles. More specifically, such choices might be associated

with the interactional construction of sex/gender as a stylistic category.

Our results seem to largely confirm the hypotheses assumed, as well as support and

explain certain previous findings on male vs. female ways of communicating, particularly

those regarding the supposed collaborative orientation of female speech. The notion that

women tend to favour interactional co-operation and agreement, while men orient themselves

more clearly towards self-expression and imposition, is widespread in gender studies. But we

have also tried to offer a cognitive explanation for such social variability. This can be

condensed in the abstract continuum between objectivity and subjectivity, understood as a

dimension conditioning all levels of form and meaning. In this sense, the analysis suggests

that female speech is particularly inclined to syntactic choices promoting objectivity (or,

perhaps more precisely, downplaying subjectivity), whereas the opposite tendency seems to

characterize male communicative styles.

Our positive conclusions on the connection between pronoun usage and gendered

identities are not meant to imply that such usage is perceived as anything like a gender

marker in Spanish-speaking communities, but rather that it is one among the variety of

semiotic resources used for the (sometimes quite subtle) construction of gender in interaction.

A line of research like the one outlined here should further incorporate other meaningful

linguistic and communicative phenomena, as well as refine the analysis of interactional

contexts, in order to achieve a more realistic picture of the ways male and female identities

are contextually shaped, and of the cognitive orientations towards reality underlying such

identities.

This should probably start from the joint consideration of the whole paradigm of

grammatical persons, each of which can be seen as embodying a different perspective along

the subjectivity-objectivity continuum. For example, the singular first person can be viewed

as signaling the highest degree of subjectivity, while the plural downplays this value by

including the speaker in a wider group. In turn, second and third persons, as well as their

different variants, will promote different perceptions and interpretations of the content of

discourse. If a relationship can be demonstrated between the choice of person as a discursive-

cognitive perspective and the construction of gender as well as other relevant identity

features, a further step will be achieved towards the theoretical, explanatory model of

sociolinguistic variation that we see as a desirable scientific goal. The handling of general

cognitive notions such as subjectivity in the description and explanation of styles is, in our

view, the key to transcending the peculiarities of the communities and interactional domains

analyzed. In sum, further research from this viewpoint in different settings and languages

should be carried out in order to check the wider validity of the claims put forward here.

References

Aijón Oliva, M. Á. and M. J. Serrano. (2010a). Las bases cognitivas del estilo lingüístico.

Sociolinguistic Studies 4, 115-144.

Aijón Oliva, M. Á. and M. J. Serrano. (2010b). El hablante en su discurso: Expresión y

omisión del sujeto de creo. Oralia 13, 7-38.

Aijón Oliva, M. Á. and M. J. Serrano. (2012). Towards a comprehensive view of variation in

language: The absolute variable. Language & Communication 32, 80-94.

Aijón Oliva, M. Á. and M. J. Serrano. (2013). Style in syntax: Investigating variation in

Spanish pronoun subjects. Bern: Peter Lang.

Beaugrande, R. A. and W. Dressler. (1997). Introducción a la lingüística del texto.

Barcelona: Ariel.

Bell, A. (1999). Styling the other to define the self: A study in New Zealand identity making.

Journal of Sociolinguistics 3, 523-541.

Coates, J. (2003). Men talk. Oxford: Blackwell.

Coupland, N. (2007). Style: Language variation and identity. Cambridge: Cambridge

University Press.

Croft, W. and D. A. Cruse. (2004). Cognitive linguistics. Cambridge: Cambridge University

Press.

Delbecque, N. (2005). El análisis de corpus al servicio de la gramática cognoscitiva: Hacia

una interpretación de la alternancia lineal SV / VS. In G. Knauer and V. Bellosta von

Colbe (Eds.), Variación sintáctica en español: Un reto para las teorías de la sintaxis

(pp. 51-74). Tübingen: Niemeyer.

Eckert, P. (1989). The whole woman: Sex and gender differences in variation. Language

Variation and Change 1, 245-267.

Edwards, J. (2009). Language and identity: An introduction. Cambridge: Cambridge

University Press.

Goldberg, A. E. (2003). Constructions: A new theoretical approach to language. Trends in

Cognitive Sciences 7, 219-224.

Johnstone, B., K. Ferrara and J. M. Bean. (1992). Gender, politeness, and discourse

management in same-sex and cross-sex opinion-poll interviews. Journal of

Pragmatics 18, 145-170.

Jordan-Jackson, F. F. and K. A. Davis. (2005). Men talk: An exploratory study of

communication patterns and communication apprehension of black and white males.

Journal of Men’s Studies 13, 347-367.

Kerbrat-Orecchioni, C. (1980). La enunciación: De la subjetividad en el lenguaje. Buenos

Aires: Hachette.

Keysar, B. (2007). Communication and miscommunication: The role of egocentric processes.

Intercultural Pragmatics 4, 71-84.

Kiesling, S. F. (2005). Homosocial desire in men's talk: Balancing and re-creating cultural

discourses of masculinity. Language in Society 34, 695-726.

Kristiansen, G. (2008). Style shifting and shifting styles: A socio-cognitive approach to lectal

variation. In G. Kristiansen and R. Dirven (Eds.), Cognitive sociolinguistics:

Language variation, cultural models, social systems (pp. 45-88). Berlin: Mouton de

Gruyter.

Langacker, R. W. (2009). Investigations in cognitive grammar. Berlin: Mouton de Gruyter.

Maltz, D. N. and R. A. Borker. (2011). A cultural approach to male-female

miscommunication. In J. Coates and P. Pichler (Eds.), Language and gender: A

reader (pp. 487-502). Oxford: Wiley-Blackwell.

Serrano, M. J. (2013). De la cognición al texto: El efecto de la prominencia cognitiva y la

informatividad discursiva en el estudio de la variación de los sujetos pronominales.

Estudios de Lingüística de la Universidad de Alicante 27, 275-29

Serrano, M. J. (in press). El sujeto y la subjetividad: Variación del pronombre yo en géneros

textuales del Español de Canarias. Revista Signos: Estudios de Lingüística, 47, 85.

Serrano, M. J. and M. Á. Aijón Oliva. (2011). Syntactic variation and communicative style.

Language Sciences 33, 138-153.

Serrano, M. J. and M. Á. Aijón Oliva. (2012). Cuando tú eres yo: La inespecificidad

referencial de tú como objetivación del discurso. Nueva Revista de Filología

Hispánica 60(2), 541-563.

Siewierska, A. (2004). On the discourse basis of person agreement. In T. Virtanen (Ed.),

Approaches to cognition through text and discourse (pp. 33-48). Berlin: Mouton de

Gruyter.