Contrasting Polish and English Derivational Groups
Karolina Tymowicz
based on
• Jadacka, H. Rzeczownik polski jako baza derywacyjna, WN-PWN 1995
• independent contrastive study of 540 Polish-English pairs of derivatives
November 28th 2000
Outline
• Defining terms:
– Derivational group
– Derivational base
– Affixes
– Similarity of and within derivational groups
• Procedure of comparison
• Conclusions
Derivational group
• A well-ordered system constructed around an underived entry word concentrating all the derivatives connected with it by means of direct or indirect process of derivation
• a hierarchical structure in which each element functions as a link between other derivatives and the BASE
Derivational base
• The item to which an affix is added to derive a new word-form
• the word-forms consisting of the derivational base and an affix are called DERIVATIVES
– e.g. STYLE - STYLIZE - STYLIZER
– e.g. CENTRE - CENTRIC - CENTRICALLY
Affix
• a morpheme that is added to a word, and which changes the meaning or function of the word
• affixes are bound forms that can be added:
– to the beginning of a word = a prefix, e.g.: unkind
– to the end of a word = a suffix, e.g.: kindness
Similarity within derivational groups
Four kinds of similarity within derivational groups are considered.
Three types of translational similarity:
– translational similarity between morphemes
– translational similarity between derivatives
– translational similarity between derivational groups
and one type of grapho-etymological similarity:
– graphemic and etymological similarity between bases
degrees of translational similarity between morphemes (incl. bases)
def. translational similarity between L1 and L2 morphemes is the degree to which an L1 morpheme can correctly be rendered as the corresponding L2 morpheme (i.e. the morpheme occupying the same position with respect to the base).
• no similarity, e.g.
ponad- vs. -less in P. ponad-czasowy, E. time-less
• 1st degree of similarity, e.g.
bez- vs. -less in P. bez-głośny, E. voice-less
• 2nd degree of similarity, e.g.
-ik vs. -er in P. głośn-ik, E. loudspeak-er
-czas- vs. time- in P. ponad-czas-owy, E. time-less
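The three degrees can be sketched as a small classifier. This is only one reading of the examples above, not a rule stated in the study: it assumes that "no similarity" means the two morphemes differ in meaning, that the 1st degree means the same meaning in a different position relative to the base, and that the 2nd degree means the same meaning in the same position. The `Morpheme` type, its `gloss` labels and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str      # surface form, e.g. "bez-"
    gloss: str     # hand-assigned meaning label (an assumption of this sketch)
    position: str  # "prefix", "base" or "suffix"


def morpheme_similarity(l1: Morpheme, l2: Morpheme) -> int:
    """Return 0, 1 or 2 under the assumed reading of the slide's examples."""
    if l1.gloss != l2.gloss:
        return 0  # different meaning: no similarity
    if l1.position != l2.position:
        return 1  # same meaning, different position relative to the base
    return 2      # same meaning, same position


ponad = Morpheme("ponad-", "beyond", "prefix")
bez = Morpheme("bez-", "without", "prefix")
less = Morpheme("-less", "without", "suffix")
ik = Morpheme("-ik", "agent/instrument", "suffix")
er = Morpheme("-er", "agent/instrument", "suffix")

degrees = [
    morpheme_similarity(ponad, less),
    morpheme_similarity(bez, less),
    morpheme_similarity(ik, er),
]
```

The glosses are assigned by hand here; in practice that judgement is the hard part and was made by the analyst.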
degrees of translational similarity between derivatives
def.: a joint translational similarity between all the corresponding morphemes of the Polish and English derivatives, whereby two morphemes are corresponding iff they occupy the same position with respect to the base.
e.g.  Pol.      Eng.
      za-    =  a-
      les’-  =  forest
      -ać       (no corresponding morpheme)
degrees of translational similarity between derivational groups
similarity between derivational groups is a function of
– the grapho-etymological similarity of their bases,
– and the translational similarity of all their derivatives.
Degrees of graphemic-etymological similarity between derivational bases
def. Similarity established between two bases with respect to their etymological and graphemic features, with the assumption of their translational equivalence and irrespective of the translational equivalence of their derivatives
– no similarity, e.g. dom vs. house
– remote similarity, e.g. brat vs. brother
– close similarity, e.g. styl vs. style
Scale of translational similarity between derivatives
The scale used here consists of 12 levels of similarity numbered from 0 to 11, where 0 stands for the lowest level of similarity and 11 denotes the highest.
0 1 2 3 4 5 6 7 8 9 10 11
Treatment of compound derivatives
If a single compound derivative of the form “A-B” or “AB” (but not “A B”) has an equivalent in the other language in the form of 2 separate words “C D”, then it is included in our classification as long as
• C is a direct translation of A and D is a direct translation of B
• or C is a direct translation of B and D is a direct translation of A.
This convention has been adopted because
• Jadacka’s derivational groups contain only derivatives of the form ‘AB’ or ‘A-B’, but no ‘A B’ derivatives
• Jadacka’s work constituted the main and most reliable source of derivatives and derivational groups considered in the study.
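The inclusion rule for compounds can be written down as a short check. This is a minimal sketch, assuming a hand-made lexicon of direct translations; `DIRECT_TRANSLATIONS` and both function names are hypothetical, and the lexicon holds only the pairs needed for two examples from the scale.

```python
from typing import Tuple

# Toy lexicon of direct translations (hypothetical, for illustration only).
DIRECT_TRANSLATIONS = {
    ("voice", "głosowa"),
    ("mail", "poczta"),
    ("free", "wolny"),
    ("style", "styl"),
}


def is_direct_translation(x: str, y: str) -> bool:
    """True if y is listed as a direct translation of x in the toy lexicon."""
    return (x, y) in DIRECT_TRANSLATIONS


def is_includable(compound: Tuple[str, str], phrase: Tuple[str, str]) -> bool:
    """Apply the convention: compound 'A-B'/'AB' vs. two-word equivalent 'C D'."""
    a, b = compound
    c, d = phrase
    same_order = is_direct_translation(a, c) and is_direct_translation(b, d)
    swapped = is_direct_translation(b, c) and is_direct_translation(a, d)
    return same_order or swapped


voice_mail_ok = is_includable(("voice", "mail"), ("poczta", "głosowa"))
free_style_ok = is_includable(("free", "style"), ("styl", "wolny"))
```

Both examples pass through the swapped-order branch, since the Polish phrase puts the head noun first.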
Compound derivatives 1
11. P. BASE1 + BASE2 + SUFFIX =
E. BASE1 + BASE2 + SUFFIX
e.g.: słowo - word
słowo-twór-stwo   word form-ation
10. E. BASE1 + (BASE2 + SUFFIX) =
P. (BASE2 + SUFFIX) + BASE1
e.g.: krew - blood
blood-stain-ed   poplamio-ny krwią
9. E. BASE1 + BASE2 =
P. BASE2 + (BASE1 + SUFFIX)
e.g.: głos - voice
voice-mail   poczta głos-owa
Compound derivatives 2
8. P. BASE1 + BASE2 =
E. BASE1 + BASE2
e.g.: słowo - word
pół-słowo   half-word
7. E. BASE1 + BASE2 =
P. BASE2 + BASE1
e.g.: styl - style
free-style   styl wolny
Single derivatives 1
6. P. BASE + SUFFIX =
E. BASE + SUFFIX
e.g.: las - forest
les’-nik   forest-er
P. BASE + SUFFIX + SUFFIX =
E. BASE + SUFFIX + SUFFIX
e.g.: styl - style
styl-ist-yczny   styl-ist-ic
P. PREFIX + BASE + SUFFIX =
E. PREFIX + BASE + SUFFIX
e.g.: las - forest
wy-les’-anie   de-forest-ation
P. PREFIX + BASE + SUFFIX + SUFFIX =
E. PREFIX + BASE + SUFFIX + SUFFIX
e.g.: centrum - centre
de-centr-al-izować   de-centr-al-ize
Single derivatives 2
5. P. PREFIX + BASE + SUFFIX =
E. BASE + SUFFIX + SUFFIX
e.g.: dziecko - child
bez-dziet-ność   child-less-ness
4. P. PREFIX + BASE + SUFFIX =
E. BASE + SUFFIX
e.g.: pan - lord
wielko-pań-ski   lord-ly
3. P. PREFIX + BASE + SUFFIX =
E. PREFIX + BASE
e.g.: las - forest
za-leś-ać   a-forest
Single derivatives 3
2. P. BASE + SUFFIX =
E. BASE + ____
e.g.: słowo - word
słow-nik   word-book
P. BASE + SUFFIX =
E. BASE
e.g.: dziecko - child
dziec-inka   child
1. P. BASE + SUFFIX + SUFFIX =
E. _____ + _______ + SUFFIX
e.g.: słowo - word
słow-nik-arz   lexico-graph-er
P. BASE + SUFFIX =
E. _____ + SUFFIX
e.g.: znak - sign
znacz-nik   mark-er
Single derivatives 4
0. E. BASE + BASE =
P. _____
e.g.: time - czas
time-piece   zegarek
P. BASE + SUFFIX =
E. _____
e.g.: kość - bone
kos-tka   ankle
E. PREFIX + BASE =
P. _______
e.g.: child - dziecko
grand-child   wnuk
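For single derivatives, levels 6 to 2 of the scale can be read as a lookup from a pair of morpheme patterns to a level. The sketch below encodes only the patterns listed on the slides and assumes that the two bases are translation equivalents; levels 1 and 0, where the other language uses an unrelated base, cannot be told apart by shape alone and are left out. The table and function names are hypothetical.

```python
P, B, S = "PREFIX", "BASE", "SUFFIX"

# (Polish pattern, English pattern) -> level, as listed on the slides.
# Assumes the BASE slots hold translation-equivalent bases.
SINGLE_DERIVATIVE_LEVELS = {
    ((B, S), (B, S)): 6,              # les'-nik / forest-er
    ((B, S, S), (B, S, S)): 6,        # styl-ist-yczny / styl-ist-ic
    ((P, B, S), (P, B, S)): 6,        # wy-les'-anie / de-forest-ation
    ((P, B, S, S), (P, B, S, S)): 6,  # de-centr-al-izować / de-centr-al-ize
    ((P, B, S), (B, S, S)): 5,        # bez-dziet-ność / child-less-ness
    ((P, B, S), (B, S)): 4,           # wielko-pań-ski / lord-ly
    ((P, B, S), (P, B)): 3,           # za-leś-ać / a-forest
    ((B, S), (B,)): 2,                # dziec-inka / child
}


def similarity_level(polish, english):
    """Return the listed level, or None if the pattern pair is not on the slides."""
    return SINGLE_DERIVATIVE_LEVELS.get((tuple(polish), tuple(english)))


bezdzietnosc_level = similarity_level([P, B, S], [B, S, S])
```

Returning `None` for unlisted pairs keeps the sketch from guessing a level that the scale does not define.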
Experiment
• 540 Polish-English pairs of derivatives were judged as to their similarity according to the 12-point scale presented above
• the translational similarity points for each pair of derivatives obtained for each of the Polish and English bases together with the grapho-etymological similarity between these bases were analysed statistically
Statistical tests applied in the study
• in spite of the non-normality of the data, the following parametric tests were applied:
– MANOVA for translational similarity between derivatives by
» grapho-etymological similarity between the bases these derivatives were obtained from, and
» direction of translation (Polish-English: based on Jadacka ‘95 and Collins Polish-English Electronic Dictionary; English-Polish: based on Harper-Collins Electronic Dictionary and Collins English-Polish Electronic Dictionary)
– Multiple Range Tests for translational similarity of the derivatives, irrespective of whether they were obtained through Polish-English or English-Polish translation, by grapho-etymological similarity between the Polish and English bases they were derived from
– Multiple Range Tests for translational similarity of the derivatives obtained through Polish-English translation, by grapho-etymological similarity between the Polish and English bases they were derived from
• additionally, some non-parametric tests were applied:
– Mann-Whitney W test to compare the medians of the similarity points obtained for the derivatives in Polish-English translation with the medians of the similarity points obtained for the derivatives in English-Polish translation
Some results: MANOVA
• Type III Sums of Squares were used
• All F-ratios were based on the residual mean square error.
Source                          Sum of Squares    Df   Mean Square   F-Ratio   P-Value
A:graph_ethym_sim_betw_bases           590,704     2       295,352     53,53    0,0000
B:direction_of_translation             195,227     1       195,227     35,38    0,0000
RESIDUAL                               2957,27   536        5,5173
TOTAL (CORRECTED)                      3903,44   539
The P-values test the statistical significance of each of the sources. Since both P-values are less than 0,05, the grapho-etymological similarity between bases and the direction of translation each have a statistically significant effect on the translational similarity between the derivatives obtained from these bases at the 95,0% confidence level.
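The mean squares and F-ratios in the table follow directly from the sums of squares and degrees of freedom, and can be rechecked with a few lines. The sketch copies the figures from the table (decimal commas read as decimal points); the variable and function names are illustrative only.

```python
# Sums of squares and degrees of freedom copied from the MANOVA table.
effects = {
    "graph_ethym_sim_betw_bases": (590.704, 2),
    "direction_of_translation": (195.227, 1),
}
residual_ss, residual_df = 2957.27, 536

# All F-ratios are based on the residual mean square error.
residual_ms = residual_ss / residual_df


def f_ratio(sum_of_squares: float, df: int) -> float:
    """Mean square of the effect divided by the residual mean square."""
    return (sum_of_squares / df) / residual_ms


f_ratios = {name: round(f_ratio(ss, df), 2) for name, (ss, df) in effects.items()}
```

The effect and residual sums of squares do not add up to the corrected total. That is expected with Type III sums of squares when the design is unbalanced.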
Some results: Multiple Range Tests
Contrast    Difference    +/- Limits
0 - 1         0,197742      1,25397
0 - 2       *-2,60124       0,488299
1 - 2       *-2,79898       1,30672
* denotes a statistically significant difference.
This means that
• the derivational groups of the Polish and English bases that were judged to bear no grapho-etymological similarity (0) and the derivational groups of the bases that were judged to be remotely similar (1) do not differ significantly with respect to the translational similarity of the derivatives that constitute them (contrast 0-1)
• on the other hand, the derivational groups of closely similar bases (2) differ significantly from both of the other groups (contrasts 0-2 and 1-2) as far as the translational similarity of their derivatives is concerned; both differences are negative, i.e. the derivatives of closely similar bases score higher
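The asterisks in the contrast table follow a simple rule: a contrast is flagged when the absolute difference between the two group means exceeds its +/- limit. A minimal sketch using the figures from the table (decimal commas read as decimal points); the names are illustrative only.

```python
# (group_a, group_b): (difference of means, +/- limit), copied from the table.
contrasts = {
    (0, 1): (0.197742, 1.25397),
    (0, 2): (-2.60124, 0.488299),
    (1, 2): (-2.79898, 1.30672),
}


def is_significant(difference: float, limit: float) -> bool:
    """A contrast is significant when |difference| exceeds the limit."""
    return abs(difference) > limit


significant = {
    pair: is_significant(diff, limit) for pair, (diff, limit) in contrasts.items()
}
```

Only the two contrasts involving level 2 (close similarity) come out as significant, which matches the asterisks in the table.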
[Chart: frequency and cumulative % by degree of similarity; 540 observations = 100%]
Applications of the study
The results of the study provide insights into the possibility of automatic translation of UNKNOWN L1 derivatives on the basis of
– the L2 equivalents of the component morphemes of the L1 derivative
– the degree of grapho-etymological similarity between the bases of these derivatives
For example, assume
• we do not know the equivalent of the derivative leśnik
• we can interpret bases even if they are modified by other morphemes (las → leś-)
• we know the equivalents of the component morphemes:
  les’- (= las)   forest
  -nik            -er
• we know the grapho-etymological similarity between the bases (= 0)
Hence, we guess with relatively low certainty that the English equivalent of leśnik is forester
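The leśnik example amounts to a morpheme-by-morpheme lookup followed by concatenation. The sketch below is a toy illustration under stated assumptions, not the procedure used in the study: the two dictionaries are hand-made, contain only the entries from the example, and cover only the BASE + SUFFIX pattern. All names are hypothetical.

```python
# Modified base form -> citation form of the base (hand-made, hypothetical).
BASE_VARIANTS = {"leś": "las"}

# Known L2 equivalents of L1 morphemes (only the entries from the example).
EQUIVALENTS = {"las": "forest", "-nik": "-er"}


def guess_equivalent(base: str, suffix: str):
    """Guess an English derivative for a Polish BASE + SUFFIX derivative.

    Returns None when an equivalent of either morpheme is unknown.
    """
    citation_form = BASE_VARIANTS.get(base, base)
    base_eq = EQUIVALENTS.get(citation_form)
    suffix_eq = EQUIVALENTS.get(suffix)
    if base_eq is None or suffix_eq is None:
        return None
    return base_eq + suffix_eq.lstrip("-")


guess = guess_equivalent("leś", "-nik")
```

A guess produced this way is only a candidate; whether it is an actual English word has to be checked separately.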
Pessimistic scenario for automatic translation of derivatives
• correct translation: 38%
• incorrect translation: 62%
Optimistic scenario for automatic translation of derivatives
• correct translation: 47%
• incorrect translation: 53%
Very optimistic scenario for automatic translation of derivatives
• correct translation: 56%
• incorrect translation: 44%
Conclusions
• COMPOSITIONALITY: The meaning of the derivative is a direct function of the meaning of its morphemes in approx. 38-56% of cases
• Assuming we know the equivalents of all the morphemes of an L1 derivative, we have an approx. 38-56% chance of producing a comprehensible L2 derivative
• The grapho-etymological similarity of L1 and L2 bases influences the translational similarity of their derivational groups