
Regular polysemy in context: a corpus-based analysis of regular polysemy in Spanish nouns

Irene Renau

Pontificia Universidad Católica de Valparaíso


Fondecyt Project no. 1191204



Introduction to the project

01 Theoretical background

02 Corpus-based methodology

03 Preliminary results

04 Final remarks



Introduction to the project

Fondecyt Project no. 1191204 (ANID, Government of Chile), 2019-2022, aims to:

1. Conduct corpus analysis of highly frequent Spanish nouns to detect regular polysemy

2. Classify types of regular polysemy of Spanish nouns

3. Provide an open lexical database mapping corpus data with semantic description


Theoretical background

Regular polysemy is defined by Apresjan (1974) as follows:

“Polysemy of the word A with the meanings ai and aj is called regular if, in the given language, there exists at least one other word B with the meanings bi and bj, which are semantically distinguished from each other in the same way as ai and aj and ai and bj are nonsynonymous” (Apresjan 1974: 16).


Theoretical background

Word A:

violeta ‘violet’

meaning ai: PLANT

meaning aj: FLOWER

Word B:

margarita ‘daisy’

meaning bi: PLANT

meaning bj: FLOWER

Word C:

lavanda ‘lavender’

meaning ci: PLANT

meaning cj: FLOWER

Word D:

fucsia ‘fuchsia’

meaning di: PLANT

meaning dj: FLOWER

“To plant violets”.

“A bouquet of violets”.
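Apresjan's criterion can be read as a simple check over a lexicon. The following is a minimal sketch, not the project's implementation: the `LEXICON` dictionary and the `regular_alternations` function are hypothetical names, the data are only the four nouns listed above, and the nonsynonymy condition of the definition is not modelled.

```python
from collections import defaultdict
from itertools import combinations

# Toy lexicon (assumption): the four nouns and two semantic types shown above.
LEXICON = {
    "violeta": {"PLANT", "FLOWER"},
    "margarita": {"PLANT", "FLOWER"},
    "lavanda": {"PLANT", "FLOWER"},
    "fucsia": {"PLANT", "FLOWER"},
}


def regular_alternations(lexicon):
    """Return each pair of semantic types that recurs in at least two words."""
    words_by_pair = defaultdict(set)
    for word, types in lexicon.items():
        # Sort so that (FLOWER, PLANT) and (PLANT, FLOWER) count as one pair.
        for pair in combinations(sorted(types), 2):
            words_by_pair[pair].add(word)
    # Simplified Apresjan condition: the alternation must hold for another word too.
    return {pair: words for pair, words in words_by_pair.items() if len(words) >= 2}


result = regular_alternations(LEXICON)
```

With this toy data, the only alternation returned is FLOWER / PLANT, supported by all four nouns. A word whose pair of meanings is shared by no other word would be filtered out as irregular polysemy.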


Theoretical background

Regular polysemy is created by metonymy, e.g. “PLANT for PLANT PART”. Lakoff and Johnson (1980) explain that conceptual metonymy is a cognitive resource which can be found in the use of language.

Pustejovsky (1995) postulates that these pairs (e.g. PLANT / FLOWER) are packed together in the same semantic type. The context determines whether meaning ai, meaning aj, or both are used (copredication, e.g. “a heavy and boring book”).

Differentiation between meanings is fuzzier and more gradual than in metaphor (e.g. “I bought a book” → PHYSICAL OBJECT / INFORMATION).


Theoretical background

Regular polysemy seems to be a predictable phenomenon due to its systematicity (Peters and Kilgarriff, 2000). E.g., one can predict that a document and its content will be called by the same name (book, novel, volume, film, trilogy, draft, etc.).

The degree of lexicalization and conventionalization of a meaning is not predictable, as is usual in language (Rojas, 2011). E.g., board as ‘content’ (read the board)??

There are extra-linguistic constraints (cultural, technological…) on the stabilization of these pairs of meanings in the lexicon. E.g., an unremarkable flower may have no name of its own.


Theoretical background

Dictionaries vary in how they represent regular polysemy, both across languages and between dictionaries of the same language (Renau, 2021):

Semantic type     Spanish (4 dictionaries)   Spanish   DIEC        TLF        Zing.       Coll.       Total
                                                       (Catalan)   (French)   (Italian)   (English)

alcoholic drink   1  1  1  1                 4         1           1          0           0           6
colour            0  0  0  0                 0         0           0          0           0           0
fruit             1  0  0  0                 1         1           1          1           0           5
herbal tea        0  0  0  0                 0         0           0          0           0           0
plant             1  1  1  1                 4         1           1          1           1           8
product           0  0  1  1                 2         1           1          0           0           6
seed              0  1  1  0                 2         0           1          1           0           4

colour            0  0  0  0                 0         0           0          0           0           0
flower            0  0  0  0                 0         0           1          0           0           1
fruit             0  1  1  1                 3         1           1          1           1           7
plant             0  1  0  0                 1         0           0          1           0           2


Theoretical background

Most of the data for the study of regular polysemy in previous literature come from introspection or secondary sources (mainly WordNet). The lack of direct data sources makes it difficult to define the phenomenon on an empirical basis.

Some exceptions (corpus-based approaches): Goossens (2012), Ježek and Vieu (2014), Berri (2014), Ramírez (2020), Renau (2021). Corpus-based approaches show the complexity and richness of this phenomenon:

[Figure: semantic types found in the corpus analysis for anís ‘anise’ (Renau, 2021): plant, seed, herbal tea, alcoholic drink, colour]


Theoretical background

It is not obvious how to structure this complexity in the microstructure of the pedagogical dictionary while meeting the needs of the learner (easy to understand, avoiding verbosity, etc.).


Corpus-based methodology

We used Sketch Engine (Kilgarriff et al., 2004) and the EsTenTen18 corpus.

We conducted manual corpus analysis using the annotation mode (Baisa et al., 2020).

We selected the nouns from a list of the 5,000 most frequent Spanish nouns, taken from the EsTenTen and the Corpes XXI corpora combined.

We started from the groups of regular polysemy already found in the data (311 groups; Renau et al., in preparation).


Corpus-based methodology

We proceed in a similar way to our previous project, in which we annotated verbs with the Corpus Pattern Analysis technique (Hanks, 2004, 2013).

However, part of the task has to do with pragmatic or “encyclopaedic” interpretation of the context.

We use a slightly modified version of the CPA Ontology for the selection of semantic types.

preparar café ‘to prepare coffee’ → beverage

sembrar café ‘to plant coffee’ → plant

café caliente ‘hot coffee’ → beverage

El café estaba lleno ‘The café was full’ → place

“Yo reaccioné de forma alérgica a los siguientes alimentos: plátanos, manzanas, naranjas, chocolate, huevos, fresas y café” ‘I had an allergic reaction to the following foods: bananas, apples, oranges, chocolate, eggs, strawberries and coffee’.
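The role of context in the café examples can be illustrated with a cue-word lookup. This is only a sketch under stated assumptions: the annotation in the project was manual, and the `CUES` table and `annotate` function below are hypothetical, built solely from the four examples above.

```python
# Hypothetical cue lexicon: collocates of "café" taken from the examples above.
CUES = {
    "preparar": "beverage",
    "caliente": "beverage",
    "sembrar": "plant",
    "lleno": "place",
}


def annotate(context, cues=CUES):
    """Return the semantic type signalled by the first cue word, or None."""
    for token in context.lower().split():
        if token in cues:
            return cues[token]
    # No lexical cue: the annotator has to rely on pragmatic or
    # encyclopaedic interpretation, as in the allergy example.
    return None
```

For instance, `annotate("sembrar café")` returns `"plant"`, while the list of foods in the allergy sentence contains no cue word and returns `None`, which is where the human annotator's world knowledge comes in.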


Preliminary results

Main semantic types of the group | Examples of analyzed nouns | No. of concordances | Examples of use

COLOUR / STUFF | amatista, azabache, azurita, bronce, esmeralda | 3,032 | vestido de amatista, extracción de amatista

MUSICAL INSTRUMENT OR VOICE / HUMAN | acordeón, barítono, batería, contralto, guitarra | 5,175 | tocar la guitarra, Las guitarras desafinaron

ACTION / RESULT | agendamiento, alegato, apertrechamiento, apozamiento, austericidio | 3,750 | durante su alegato, El film es una especie de alegato

ANIMAL / FOOD | gallina, pollo, vaca | 450 | criar pollos, comer pollo

ARTIFACT / INFORMATION / INSTITUTION / GENRE | apunte, artículo, cuaderno, novela, revista | 2,500 | manchar la novela, leer la novela, Lo contrataron en la revista, experto en novela decimonónica

INSTITUTION / POSITION / LOCATION / HUMAN / TIME PERIOD | arzobispado, capitanía, comisión, consulado, … | 1,844 | organizar la presidencia, Fue promovido a la presidencia, explosión en la presidencia, La presidencia te ayudará, durante su presidencia

BUILDING / INSTITUTION / HUMAN / DIGITAL PLATFORM / EVENT | aduana, ayuntamiento, banco, biblioteca, … | 4,219 | La biblioteca está cerrada, el director de la biblioteca, La biblioteca rinde homenaje al escritor, Abre la biblioteca y descárgate los libros, organizar un congreso

DANCE / MUSICAL COMPOSITION / GENRE / EVENT | bachata, chachachá, cueca, guaguancó, milonga | 1,902 | bailar una milonga, componer una milonga, experto en milonga, organizar una …

IDEOLOGY / HUMAN GROUP / TIME PERIOD / SYSTEM | allendismo, anarquismo, comunismo, ecologismo, fascismo | 1,456 | conceptos económicos del comunismo, crimen perpetrado por el comunismo, durante el comunismo, la caída del comunismo

… | anís, café, mostaza, laurel, rosa | 2,150 | plantar café, oler una rosa, Queremos dos cafés, falda mostaza, añadir laurel al guiso, untar la mostaza, mesa de laurel

CONTAINER / STUFF OR LIQUID / UNIT OF MEASURE | acuario, balde, bañera, barril, biberón | 2,494 | lavar las tazas, beberse una taza, añadir dos tazas de harina

Total: 252 analyzed nouns; 28,972 analyzed concordances


Preliminary results

Pairs of semantic types usually combine into “chains” of more than two components.

When a specific semantic type appears, it usually predicts the appearance of its pair in the corpus sample, e.g. HUMAN / INSTITUTION, INSTITUTION / BUILDING, ARTIFACT / INFORMATION.

Some of these pairs had not been reported in the literature.


BUILDING: ir a la biblioteca ‘to go to the library’

INFORMATION: leerse toda la biblioteca ‘to read the whole library’

INSTITUTION: fundar una biblioteca ‘to found a library’

HUMAN: La biblioteca decidió cerrar ‘The library decided to close’

ARTIFACT: una biblioteca de roble ‘an oak
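One possible way to model how pairs link up into chains is to treat each pair as an edge in a graph and read the chains off as connected components. This is an illustrative sketch, not the procedure used in the project; `PAIRS` holds only the three pairs named on this slide and `chains` is a hypothetical helper.

```python
from collections import defaultdict

# The three pairs of semantic types mentioned above.
PAIRS = [
    ("HUMAN", "INSTITUTION"),
    ("INSTITUTION", "BUILDING"),
    ("ARTIFACT", "INFORMATION"),
]


def chains(pairs):
    """Group pairs that share a semantic type into chains (connected components)."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal collecting every type reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result
```

On these three pairs, HUMAN / INSTITUTION and INSTITUTION / BUILDING merge into one three-member chain because they share INSTITUTION, while ARTIFACT / INFORMATION stays a separate chain.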



Preliminary results

Types of regular polysemy can be classified in a hierarchy, from the most general to the most specific. E.g.:

whole / part

animal / animal part

animal / artifact

animal / food

animal / meat

animal / fur

(cabeza de ganado) (abrigo de zorro) (pollo a la parrilla)


Preliminary results

There are many cases of vagueness in which both semantic types seem to be present, e.g.:

“Sacó su primer libro al año siguiente” ‘He/she brought out his/her first book the following year’

“La poetisa publicó un solo libro” ‘The poet published only one book’


Verbs that may activate both meanings simultaneously include vender ‘to sell’, publicar ‘to publish’, comprar ‘to buy’, etc.

Verbs for meaning differentiation: escribir ‘to write’ (for INFORMATION), pesar ‘to be heavy’ (for ARTIFACT), etc.
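The contrast between the two groups of verbs can be sketched by letting each verb select a set of semantic types for libro ‘book’ rather than a single one. The `VERB_TYPES` table and the `reading` function are assumptions made for illustration; they list only the verbs mentioned above.

```python
BOTH = frozenset({"ARTIFACT", "INFORMATION"})

# Hypothetical mapping from verbs to the semantic types they activate in "libro".
VERB_TYPES = {
    "vender": BOTH,
    "publicar": BOTH,
    "comprar": BOTH,
    "escribir": frozenset({"INFORMATION"}),
    "pesar": frozenset({"ARTIFACT"}),
}


def reading(verb):
    """Return the single type a verb selects, 'vague' if several, 'unknown' otherwise."""
    types = VERB_TYPES.get(verb)
    if types is None:
        return "unknown"
    if len(types) > 1:
        return "vague"  # both meanings are active at once
    return next(iter(types))
```

Under these assumptions, `reading("publicar")` yields `"vague"`, whereas `reading("escribir")` and `reading("pesar")` resolve to INFORMATION and ARTIFACT respectively.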


Annotated concordance sample for rojo ‘red’

Lemma | Left context | Form | Right context | Part of speech | Annotation

rojo | es más común, donde incluso he necesitado aplicar medicamento en los folículos para poder controlarla,saludos </s><s> | roja | directa futbol en vivo gratis ver ... Disfrute y comparta lo mejor en partidos de futbol en vivo ... Y para poder ver la | S | Artifact

rojo | S" en la capa en varios tamaños y colores. </s><s> A veces está en amarillo con delineado negro y otros artistas pusieron la "S" en | rojo | con un fondo amarillo. </s><s> La capa es un aspecto estilístico del traje, y más específicamente la forma en que se conecta al | S | Colour

rojo | luego del pasillo tras la entrada aguardaba pacientemente con una sorpresa. </s><s> La sangre recorría la madera tiñendo de | rojo | su color opaco, al alzar la mirada del suelo se podían seguir varias líneas hasta llegar al punto alto de la pared, donde | S | Colour

rojo | regionales. </s><s> A futuro vamos a plantear la opción de sacar las butacas y dejar sólo el cemento donde se podría pintar de | rojo | , así evitamos gastos", sostuvo el alcalde. </s><s> Nos visitará la arquitecta María José Leveratto quien nos hablará de la | S | Colour

rojo | , las diferencias políticas se desarrollan en un saludable clima de intercambio de ideas, nadie cruza el semáforo en | rojo | , los motociclistas no roban, los supermercados son pletóricos y la gente anda por la calle con el teléfono en la mano. </s> | S | Colour

rojo | mundo"), donde se indica con diferentes colores el ratio de infección –desde el verde, para regiones seguras, hasta el | rojo | oscuro, para brotes severos de malware– Este mapa, además, es interactivo: "planeando" con el cursor sobre los | S | Colour

rojo | '' delanteros, firmados por Peugeot Sport y tapizados en combinación de Alcantara y una malla negra con pespuntes en | rojo | , y en la decoración en negro lacado del habitáculo –también con contrastes en rojo en los cinturones o las alfombrillas | S | Colour

rojo | hecho a los que ofrece el propio Juke. </s><s> Mientras que en la gama del Juke los tonos disponibles son amarillo, blanco, azul, | rojo | , gris y negro, para el Nissan Note se marca la diferencia con algunas tonalidades nuevas. </s><s> De esta forma, en el Nissan | S | Colour

rojo | colegiado también evitaba una goleada escandalosa al mirar para otro lado en un claro penalti a Airam y no sancionar con | roja | una agresión al canario. </s><s> El ariete de Puerto de la Cruz relegaba a un Villar que se marchaba lesionado, el único lunar de | S | Artifact

rojo | cosechado en el Bernabeu, perdida la Copa ante el Real Madrid, los de Guardiola deben centrarse en el encuentro ante los | rojillos | para mantener su cómoda ventaja en el campeonato (ocho puntos) y restar un partido al calendario. </s><s> Real-Valencia. </s><s> El | S | Person

rojo | y la región todavía se mantiene en latín el nombre de la provincia (Estiria) y el escudo común de los brazos de plata, | rojo | y armados con cuernos. </s><s> Por lo tanto, hay una distinción entre el concepto geográfico de Estiria y el oficial-político | S | Colour

rojo | en forma de corazón para expresar un claro mensaje de amor: I love you. </s><s> El osito es de tonos marrones combinando con | rojo | y marrón más claro. </s><s> Demuéstrale tu amor con este encantador osito de peluche. </s><s> Tamaño 25 cm. </s><s> "Fue una sorpresa enorme, no | S | Colour

rojo | López de Uralde, reaccionó con flema en su cuenta de Twitter: "Atacan la sede de @equo en Madrid dejando pintadas de '' | Rojos | No''. </s><s> No se enteran: somos verdes". </s><s> En el mismo distrito de Madrid el Partido Comunista recibió la visita de unos | S | Person

rojo | como el club ganó su primera Copa de Alemania en 1968. </s><s> Incluso en 1974 el equipo fue relegado a la segunda división, los | rojos | no se dieron por vencidos y en 2008 puso en marcha un proyecto para traer de nuevo la vieja gloria del club. </s><s> En 2010 el | S | Person

rojo | azul oscuro apenas presentan diferencias y en ocasiones resultan difíciles de distinguir. </s><s> Combinaciones de verde y | rojo | suelen ser desagradables a la vista. </s><s> Otras combinaciones que debes evitar son rojo-azul, verde-azul y marrón-negro. </s> | S | Colour

rojo | que las paredes, los techos y los pisos estén revestidos o pintados en este maravilloso color y que luego, los salones en | rojo | rubí estén equipados con muebles en color marrón, negro o blanco. </s><s> Esta combinación se ve fabulosa. </s><s> Otra opción seria | S | Colour

rojo | final, los goles de Mallarach, Víctor Gutiérrez y Roger Tahull en un parcial espectacular de 3–0 han dejado el choque al | rojo | vivo. </s><s> Grecia ha logrado adelantarse a 1:20 en superioridad y Munárriz ha tenido un último remate que se ha estrellado | S | Colour

rojo | has conseguido traer todo esto aquí? </s><s> - No tranquilo no me iba a reir pero es que tu edredón es ''''muy de Gryffindor'''' con ese | rojo | fuego y el león dorado. - le dije riendo. - Así que te has estado leyendo los siete libros que úsabamos en el colegio. </s><s> ¿Me lo | S | Colour

rojo | que si es cierto esque uefa no usa misma vara de medir, a pepe por una jugada parecida a esta en la que si toca balon le saquen | roja | , y al del dinamo no le saque ni amarilla por una cuchillada. </s><s> Por lo demas, decir que benzema es un jugador de un talento | S | Artifact



Rojo ‘red’

Semantic type    n      %
Artifact         11     8.40
Colour           93     70.99
Institution      11     8.40
Person           16     12.21
Total nouns      131    100

Amarillo ‘yellow’

Semantic type    n      %
Artifact         39     21.31
Colour           131    71.58
Person           12     6.56
Fish             1      0.55
Total nouns      183    100

Azul ‘blue’

Semantic type    n      %
Colour           130    87.25
Institution      3      2.01
Person           16     10.74
Total nouns      149    100

Verde ‘green’

Semantic type    n      %
Colour           69     69.00
Institution      9      9.00
Person           14     14.00
Vegetation       8      8.00
Total nouns      100    100

Naranja ‘orange’

Semantic type    n      %
Colour           99     23.19
Fruit            269    63.00
Institution      4      0.94
Person           11     2.58
Flavour/aroma    44     10.30
Total nouns      427    100

Violeta ‘violet’

Semantic type    n      %
Colour           48     57.14
Flower           14     16.67
Plant            9      10.71
Flavour/aroma    13     15.48
Total nouns      84     100
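The percentages follow directly from the annotation counts. As a minimal sketch, the snippet below rebuilds the distribution for rojo ‘red’ from a list of labels; the `annotations` list is reconstructed from the reported counts (with the type labels in English), and `distribution` is a hypothetical helper, not part of the project's tooling.

```python
from collections import Counter

# One label per annotated noun occurrence of "rojo", rebuilt from the reported counts.
annotations = (
    ["Artifact"] * 11 + ["Colour"] * 93 + ["Institution"] * 11 + ["Person"] * 16
)


def distribution(labels):
    """Map each semantic type to (count, percentage rounded to two decimals)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 2)) for label, n in counts.items()}


rojo = distribution(annotations)
```

Here `rojo["Colour"]` is `(93, 70.99)`, i.e. 93 of the 131 noun occurrences, which matches the figure reported for rojo.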


Final remarks

Corpus analysis provides richer and more complex information than is available in the literature or in Spanish dictionaries.

So far, regular polysemy has been found in all analysed words. It seems to be a more widespread phenomenon than expected.

We still have to add more nouns to the analysis, and create a taxonomy of types of regular polysemy.

Lexicographic representation of regular polysemy could include, among other things, the use of signposts with a general description of the semantic type. Some of the meanings may not require a specific definition, only corpus examples.


Thank you!
