Annelore Willems, Gert De Sutter Faculty of Translation Studies
Where shall I put this?
Distance-to-V, length and verb disposition effects on PP placement in Belgian Dutch
Annelore Willems, Gert De Sutter Faculty of Translation Studies
University College Ghent – Ghent University
{annelore.willems,gert.desutter}@hogent.be
New Ways of Analyzing Variation 2012
(1) A multifactorial investigation of PP placement in Dutch subordinate clauses
(2) Refine common assumptions in syntactic and psycholinguistic theory
Dutch language users do not strive to maximally reduce the distance between dependent elements
Goals
Midfield:
dat ik[SU] binnen een vijftiental ondernemingen van de Bel20 contacten heb[V-final].
that I[SU] within about fifteen enterprises of the Bel20 contacts have[V-final].
Postfield:
dat ik[SU] contacten heb[V-final] binnen een vijftiental ondernemingen van de Bel20.
that I[SU] contacts have[V-final] within about fifteen enterprises of the Bel20.
The structural position before V-final (midfield) is the standard slot for PPs, with the slot after V-final serving as an overflow position for an overloaded midfield (ANS 1997, Jansen 1979)
The distance between SU and V should be reduced as much as possible (Jansen 1979, Van Haeringen 1949)
Research object
Dutch Parallel Corpus (DPC)
• a 10-million-word parallel corpus of Dutch, English and French
• sentence-aligned, with basic linguistic annotations
• 5 different text genres; only journalistic texts are used for this presentation
Data selection:
• dependent clauses starting with the grammatical conjunction dat (= that)
• PPs for which variation between extraposition and non-extraposition is possible
• Belgian Dutch
Method: Corpus and data
Logistic regression analysis and generalised linear mixed model
PP position (midfield vs. postfield) as binary response variable
Predictor variables:
Fixed effects
1. The length of the PP
2. The distance-to-V
3. The distance between V and the end of the clause
Random effects
1. Verbs
2. Prepositions
Method: Statistical evaluation
Results
Overview general distribution
Monofactorial analysis
1. Fixed effect 1: Length of PP
2. Fixed effect 2: Distance-to-V
3. Fixed effect 3: Distance between V and end
Multifactorial analysis
Overview Results
[…] dat de Belgen aan de Olympische Spelen deelnamen
[…] that the Belgians in the Olympics took part
[…] dat de Belgen deelnamen aan de Olympische Spelen
[…] that the Belgians took part in the Olympics
Distribution of PPs in midfield or postfield
Midfield  Postfield
58%  42%
Operationalised in terms of syllables. Example:
[…] dat de Belgen aan /de/ O/lym/pi/sche/ Spe/len deelnamen = 8 syllables
Also counted in terms of words
Fixed effect 1: The length of the PP
1 = 2 syllables
2 = 3 to 7 syllables
3 = 8 to 12 syllables
4 = 13 or more syllables
AV = postfield
MV = midfield
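The syllable count and the four length categories above can be sketched in a few lines of Python. This is only an illustration: the function names `count_syllables` and `length_category` are hypothetical, and the sketch assumes that syllable boundaries have already been marked by hand with slashes or spaces, as in the example from the slides.

```python
import re


def count_syllables(segmented: str) -> int:
    """Count syllables in a phrase whose syllable boundaries are marked by '/' or spaces."""
    return len([part for part in re.split(r"[/\s]+", segmented) if part])


def length_category(syllables: int) -> int:
    """Map a syllable count onto the four PP length categories listed in the slides."""
    if syllables < 2:
        # The categories start at 2 syllables; shorter input is treated as an error here.
        raise ValueError("a PP is assumed to have at least 2 syllables")
    if syllables == 2:
        return 1
    if syllables <= 7:
        return 2
    if syllables <= 12:
        return 3
    return 4


# The PP from the example clause, segmented as on the slide.
pp = "aan /de/ O/lym/pi/sche/ Spe/len"
n_syllables = count_syllables(pp)
category = length_category(n_syllables)
print(n_syllables, category)
```

With the example PP this gives 8 syllables, which falls in category 3 (8 to 12 syllables).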
Statistical evaluation:
Fixed effect 1: The length of the PP
Length of the PP O.R. p-value
syllables 12.05 < 2e-16 ***
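An odds ratio (O.R.) such as the one above is a multiplier on the odds of the modelled outcome, not on its probability. The sketch below shows the conversion under two assumptions that the slides do not state explicitly: that the modelled outcome is postfield placement, and that the O.R. applies per one-step increase in the predictor as coded. The baseline probability of 0.20 is purely hypothetical, and `apply_odds_ratio` is an illustrative helper, not part of the original analysis.

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float, steps: int = 1) -> float:
    """Return the probability obtained by multiplying the baseline odds by odds_ratio**steps."""
    odds = p_baseline / (1.0 - p_baseline)      # probability -> odds
    new_odds = odds * odds_ratio ** steps        # the O.R. acts on the odds scale
    return new_odds / (1.0 + new_odds)           # odds -> probability


baseline = 0.20  # hypothetical probability of the outcome at the lower predictor level
p_next = apply_odds_ratio(baseline, 12.05)
print(round(p_next, 3))
```

Under these assumptions, a baseline of 0.20 would rise to roughly 0.75 after one step, which illustrates how strong an O.R. of 12.05 is.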
Operationalised as the number of syllables between SU and V (e.g. Jansen 1978, Gibson 2000)
Example: dat ik binnen een vijftiental ondernemingen van de Bel20 con/tac/ten heb. = 3 syllables
Also counted in terms of words and phrases
Fixed effect 2: Distance-to-V
1 = 0 syllables
2 = 1 to 6 syllables
3 = 7 or more syllables
AV = postfield
MV = midfield
Statistical evaluation:
Fixed effect 2: Distance-to-V
Distance-to-V O.R. p-value
syllables 1.48 0.0001***
Operationalised in terms of syllables. Example:
Dat mensen een sympathieke collega zullen verkiezen als/ part/ner. = 3 syllables
Also counted in terms of words
Fixed effect 3: Distance between V and the end
1 = 0 syllables
2 = 1 or more syllables
AV = postfield
MV = midfield
Statistical evaluation:
Fixed effect 3: Distance between V and the end
Distance-V-end O.R. p-value
syllables 0.43 <2e-16 ***
No correlation
No interaction
Multicollinearity
C concordance = 0.75
Logistic regression analysis
Factor O.R. p-value
Length PP 12.48 <2e-16 ***
Distance-to-V 1.62 0.0001***
Distance-V-end 0.48 5.47e-13 ***
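The C (concordance) value reported for the model summarises how well predicted probabilities separate the two outcomes: it is the share of (outcome 1, outcome 0) pairs in which the clause with outcome 1 received the higher predicted probability, with ties counted as half. A minimal sketch of that computation follows; the probabilities and outcomes are invented toy data, and `concordance_index` is a hypothetical helper rather than the routine used for the presentation.

```python
from itertools import product


def concordance_index(probs, outcomes):
    """Share of (1, 0) outcome pairs ranked correctly by predicted probability; ties count 0.5."""
    positives = [p for p, y in zip(probs, outcomes) if y == 1]
    negatives = [p for p, y in zip(probs, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    score = 0.0
    for p_pos, p_neg in product(positives, negatives):
        if p_pos > p_neg:
            score += 1.0
        elif p_pos == p_neg:
            score += 0.5
    return score / (len(positives) * len(negatives))


# Toy data: predicted probability of outcome 1 and the observed outcome per clause.
toy_probs = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
toy_outcomes = [1, 1, 1, 0, 0, 0]
c_value = concordance_index(toy_probs, toy_outcomes)
print(round(c_value, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values of 0.75 and above indicate that the models discriminate between the two positions clearly better than chance.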
Verbs and prepositions as random variables
C Concordance = 0.86
Generalised mixed effect model
variance Std.Dev.
Verbs 0.53 0.72
Prepositions 0.36 0.59
O.R. p-value
Length PP 16.89 <2e-16 ***
Distance-to-V 1.68 0.00***
Distance-V-end 0.43 6.15e-14 ***
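The standard deviations reported for the random effects (0.72 for verbs, 0.59 for prepositions) can be made more tangible by converting them to odds multipliers. The sketch below assumes that both random effects are random intercepts on the log-odds scale, which the slides do not state explicitly; `odds_multiplier` is an illustrative name.

```python
import math


def odds_multiplier(std_dev: float) -> float:
    """Odds multiplier for a group lying one standard deviation above the average intercept."""
    return math.exp(std_dev)


# Standard deviations taken from the random-effects table in the slides.
random_effect_sd = {"Verbs": 0.72, "Prepositions": 0.59}
multipliers = {name: odds_multiplier(sd) for name, sd in random_effect_sd.items()}
for name, value in multipliers.items():
    print(f"{name}: odds x {value:.2f} at +1 SD, x {1 / value:.2f} at -1 SD")
```

Under this assumption, a verb one standard deviation above the average roughly doubles the odds of the modelled outcome, and a preposition one standard deviation above the average multiplies them by about 1.8.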
Collostructional analysis (Gries, Stefanowitsch 2004):
An analysis of the verbs/prepositions that are distinctive for each construction may help us elucidate the existence and degree of fine semantic differences that might explain the different restrictions.
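One common way to quantify how distinctive an item is for one of two constructions is a Fisher exact test on a 2x2 table (item vs. all other items, midfield vs. postfield), with the score reported as the negative base-10 logarithm of the p-value. Whether the scores in this presentation were computed exactly this way is an assumption. The sketch below uses only the standard library; the function names are hypothetical and the frequencies are invented for illustration.

```python
from math import comb, log10


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


def distinctiveness(item_mid: int, item_post: int, other_mid: int, other_post: int):
    """Return the preferred position of an item and its score (-log10 of the p-value)."""
    p_value = fisher_exact_two_sided(item_mid, item_post, other_mid, other_post)
    n_total = item_mid + item_post + other_mid + other_post
    expected_mid = (item_mid + item_post) * (item_mid + other_mid) / n_total
    if item_mid > expected_mid:
        preferred = "midfield"
    elif item_mid < expected_mid:
        preferred = "postfield"
    else:
        preferred = "neither"
    return preferred, -log10(p_value)


# Invented frequencies: one preposition occurs 30x in the midfield and 10x in the postfield,
# while all other prepositions together occur 70x and 90x respectively.
preferred, strength = distinctiveness(30, 10, 70, 90)
print(preferred, round(strength, 2))
```

Higher scores indicate a stronger association between the item and the position it prefers; on this scale a score above 1.3 corresponds to p < 0.05.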
Interpretation random effects
Collostructional analysis (Gries, Stefanowitsch 2004):
PP disposition
MV              AV
in 7.33         van 5.29
binnen 2.7      voor 3.64
na 2.68         aan 1.5
tijdens 1.76    met 1.47
Collostructional analysis (Gries, Stefanowitsch 2004):
Verb disposition
MV                 AV
Doen 3.02          Recht hebben 1.89
Komen 2.53         Rol spelen 1.89
Staan 1.95         Deel uitmaken 1.65
Beschikken 1.78    Bezig zijn 1.42
Halen 1.62         Tevreden zijn 1.42
lijden 1.52        Verantwoordelijk zijn 1.41
1. Postfield position is more often preferred than midfield position
2. PP placement is determined by 3 length factors and 2 random effects
Summary
Common assumption:
dat ik contacten heb[V-final] binnen een vijftiental ondernemingen van de Bel20.
But the structural position before V-final (midfield) is not the standard slot for PPs.
Implications for linguistic theory
Midfield / Postfield
OR Distance-V-end 0.43 > OR Distance-to-V 1.68
Subject and verb in subordinate clauses are mostly not adjacent
Distance between subject and V is not to be reduced as much as possible in Dutch dependent clauses
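The comparison of the two odds ratios is a comparison of effect strength rather than of raw values: an O.R. below 1 and an O.R. above 1 point in opposite directions, so they are best compared on the log scale, or equivalently after inverting the one below 1. The short sketch below is one reading of that comparison; `effect_strength` is a hypothetical helper.

```python
import math


def effect_strength(odds_ratio: float) -> float:
    """Absolute log odds ratio: a direction-free measure of effect size."""
    return abs(math.log(odds_ratio))


# Odds ratios from the mixed model reported in the slides.
odds_ratios = {"Distance-V-end": 0.43, "Distance-to-V": 1.68}
strengths = {name: effect_strength(value) for name, value in odds_ratios.items()}
inverted = 1 / odds_ratios["Distance-V-end"]
print(strengths, round(inverted, 2))
```

On this reading, an O.R. of 0.43 corresponds to a multiplier of about 2.33 in the opposite direction, which is a stronger effect than the 1.68 found for Distance-to-V.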
Implications for linguistic theory
Thank you!
For further [email protected]
The length of the PP [words]
1 = 2 or 3 words
2 = 4 to 6 words
3 = 7 to 11 words
4 = 12 or more words
AV = extraposition
MV = midfield
Logistic regression analysis:
Correlation = 0.81, p < 2.2e-16
The length of the PP
Factor O.R. p-value
words 25.79 1.62e-11 ***
syllables 12.05 < 2e-16 ***
Distance-to-V [words]
1 = 0 words
2 = 1 or 2 words
3 = 3 or more words
AV = extraposition
MV = midfield
Distance-to-V [phrases]
1 = 0 phrases
2 = 1 phrase
3 = 2-4 phrases
AV = extraposition
MV = midfield
Logistic regression analysis:
Correlation [words, syllables] = 0.84, p < 2.2e-16
Correlation [words, phrases] = 0.64, p < 2.2e-16
Correlation [syllables, phrases] = 0.58, p < 2.2e-16
Distance-to-V
Factor O.R. p-value
words 1.41 0.00***
syllables 1.48 0.00***
phrases 0.49
Verbs as random variable
Mixed effect in logistic regression:
C Concordance = 0.85
Verb disposition
Factor O.R. p-value
Length PP 15.61 <2e-16 ***
Distance-to-V 1.6 0.00***
Distance-V-end 0.42 1.61e-15 ***