A Linguistic Approach for Multilingual Machine Translation System

Fahima Bouzit & Mohamed Tayeb Laskri

Rencontres sur la Recherche en Informatique (R2I '11), June 12-14, 2011



PLAN

Introduction

Analysis Levels

Machine Translation

Proposed Approach:

Fillmore Theory

Conceptual Dependency

Semantic Traits of Chafe

Frame Based Representation

Conclusion & Perspectives


Introduction

Language

Natural Language Processing (NLP)


NLP Schools:

Linguistic Approaches

Probabilistic (Statistical) Approaches


Analysis Levels


Morphology

Syntax

Semantics

Pragmatics


Machine Translation

Translation

Source language

Target language

Machine


The challenge in machine translation is how to program a computer that will "understand" a text as a person does and will "create" a new text in the target language that "sounds" as if it had been written by a person. This problem may be approached in a number of ways.


Translation

Decoding the meaning of the source text

Re-encoding this meaning in the target language


Basic Model of a Machine Translation System

Source sentence: وجدت الطفلة الصغيرة الصفحة المطلوبة

1. a trouvé / la fille / petit / la page / demandé

2. la petit fille a trouvé la page demandé

3. la petite fille a trouvé la page demandée
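The three numbered lines of the basic model can be read as word-for-word transfer, re-ordering, and agreement. The sketch below is one possible reading, not the authors' implementation: the lexicon, the `FEMININE` and `PRENOMINAL` tables, and all function names are hypothetical, and the grouping rule (a noun followed by its adjectives, verb first) only covers this single sentence.

```python
# Hypothetical toy lexicon for the slide's example sentence only.
LEXICON = {
    "وجدت": {"fr": "a trouvé", "pos": "VERB"},
    "الطفلة": {"fr": "la fille", "pos": "NOUN", "gender": "f"},
    "الصغيرة": {"fr": "petit", "pos": "ADJ"},
    "الصفحة": {"fr": "la page", "pos": "NOUN", "gender": "f"},
    "المطلوبة": {"fr": "demandé", "pos": "ADJ"},
}
FEMININE = {"petit": "petite", "demandé": "demandée"}  # assumed agreement table
PRENOMINAL = {"petit"}  # adjectives placed before the noun in French


def transfer(sentence):
    """Step 1: word-for-word lookup, keeping the Arabic word order."""
    return [dict(LEXICON[word]) for word in sentence.split()]


def group_words(units):
    """Split a verb-initial sentence into the verb and noun groups."""
    verb, rest = units[0], units[1:]
    groups = []
    for unit in rest:
        if unit["pos"] == "NOUN":
            groups.append({"noun": unit, "adjs": []})
        else:
            groups[-1]["adjs"].append(unit)
    return verb, groups


def realize(group, agree):
    """Order one noun group the French way; step 3 applies gender agreement."""
    article, noun = group["noun"]["fr"].split(" ", 1)
    before, after = [], []
    for adj in group["adjs"]:
        form = adj["fr"]
        if agree and group["noun"]["gender"] == "f":
            form = FEMININE[form]
        (before if adj["fr"] in PRENOMINAL else after).append(form)
    return " ".join([article, *before, noun, *after])


def translate(sentence, agree=True):
    """Step 2 (subject before verb) plus, optionally, step 3 (agreement)."""
    verb, groups = group_words(transfer(sentence))
    subject, *objects = groups
    parts = [realize(subject, agree), verb["fr"]]
    parts += [realize(g, agree) for g in objects]
    return " ".join(parts)


SENTENCE = "وجدت الطفلة الصغيرة الصفحة المطلوبة"
print(translate(SENTENCE, agree=False))  # line 2 of the slide
print(translate(SENTENCE, agree=True))   # line 3 of the slide
```

Calling `translate` with `agree=False` stops after re-ordering, which makes the effect of the agreement step visible on its own.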


Proposed Architecture

Arabic sentence → Analysis → Frame in Arabic → Translation → Frame in French → Construction → French sentence
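The architecture chains three stages: analysis into an Arabic frame, translation of that frame into a French frame, and construction of the French sentence. A minimal sketch of that chain follows. The `Frame` fields, the tiny lexicon and the verb-subject-object assumption in `analyse` are illustrative choices, not details given in the slides; the example sentence and its French rendering are taken from the Examples slides.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical frame with three slots; the real frames are richer."""
    verb: str
    agent: str
    obj: str


# Hypothetical Arabic -> French lexicon covering one example sentence.
AR_FR = {
    "نسخ": "a copié",
    "المهندس": "l'ingénieur",
    "قاعدة المعلومات": "la base de données",
}


def analyse(sentence: str) -> Frame:
    """Toy analysis: assumes verb-subject-object order, one-word verb and subject."""
    verb, agent, *rest = sentence.split()
    return Frame(verb=verb, agent=agent, obj=" ".join(rest))


def translate(frame: Frame) -> Frame:
    """Map each slot of the Arabic frame to its French counterpart."""
    return Frame(AR_FR[frame.verb], AR_FR[frame.agent], AR_FR[frame.obj])


def construct(frame: Frame) -> str:
    """Build a French subject-verb-object sentence from the French frame."""
    text = f"{frame.agent} {frame.verb} {frame.obj}"
    return text[0].upper() + text[1:]


def arabic_to_french(sentence: str) -> str:
    return construct(translate(analyse(sentence)))


print(arabic_to_french("نسخ المهندس قاعدة المعلومات"))
```

Keeping the frame as the only interface between stages is what lets the construction stage be swapped for another target language.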


Proposed Approach

Fillmore theory

Conceptual Dependency (Schank)

Nouns Classification (Chafe)

Frame based representation (Minsky)


Fillmore theory


The sentence:

Verb = kernel

Other components of the sentence = peripherals

Typological nature of verbs


The case AGENT: syntactic case = subject.

The case OBJECT: syntactic case = object complement, or syntactic case = subject with verb mode = passive.

The case INSTRUMENT: grammatical case = dative; preposition = ب, باستعمال, بواسطة.
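The three rules above can be sketched as a small rule function. This is an illustration under stated assumptions: the `Constituent` class, the label strings and the rule order (passive subjects are checked before the AGENT rule) are hypothetical, and the presence of a preposition stands in for the dative condition.

```python
from dataclasses import dataclass
from typing import Optional

# Instrument prepositions as listed on the slide.
INSTRUMENT_PREPOSITIONS = {"ب", "باستعمال", "بواسطة"}


@dataclass
class Constituent:
    text: str
    syntactic_case: str  # "subject", "object_complement" or "prepositional"
    preposition: Optional[str] = None


def fillmore_case(c: Constituent, verb_mode: str = "active") -> Optional[str]:
    """Return AGENT, OBJECT or INSTRUMENT, or None when no rule applies."""
    if c.syntactic_case == "subject":
        # Assumption: the subject of a passive verb is the OBJECT, else the AGENT.
        return "OBJECT" if verb_mode == "passive" else "AGENT"
    if c.syntactic_case == "object_complement":
        return "OBJECT"
    if c.preposition in INSTRUMENT_PREPOSITIONS:
        return "INSTRUMENT"
    return None


# طبع الطفل النص بالطابعة (the child printed the text with the printer)
print(fillmore_case(Constituent("الطفل", "subject")))
print(fillmore_case(Constituent("النص", "object_complement")))
print(fillmore_case(Constituent("الطابعة", "prepositional", "ب")))
```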


The case SOURCE: grammatical case = dative; preposition = مِنْ.

Or: a place noun playing the role of a direct object complement of certain known verbs, such as ترك and غادر, as in غادر الطفل الموقع (the child left the site).


The case DESTINATION: grammatical case = dative; preposition = نحو, إلى, لـ, باتجاه, صوب.

Or: a place noun playing the role of a direct object complement of certain known verbs, such as قصد, e.g. الموقع in قصد المسافر الموقع (the traveler headed for the site).
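SOURCE and DESTINATION share the same two-way rule: either a preposition marks the role, or a place noun is the direct object of a known motion verb. A hedged sketch follows. The function name and its boolean parameters are hypothetical, and the word sets contain only the prepositions and verbs named on the slides (with من written without diacritics).

```python
SOURCE_PREPOSITIONS = {"من"}
DESTINATION_PREPOSITIONS = {"نحو", "إلى", "لـ", "باتجاه", "صوب"}
SOURCE_VERBS = {"ترك", "غادر"}
DESTINATION_VERBS = {"قصد"}


def place_case(verb, preposition=None, is_place_noun=False, is_direct_object=False):
    """Return SOURCE or DESTINATION for a constituent, or None if no rule fires."""
    # Rule 1: the preposition decides.
    if preposition in SOURCE_PREPOSITIONS:
        return "SOURCE"
    if preposition in DESTINATION_PREPOSITIONS:
        return "DESTINATION"
    # Rule 2: a place noun used as direct object of a known verb.
    if is_place_noun and is_direct_object:
        if verb in SOURCE_VERBS:
            return "SOURCE"
        if verb in DESTINATION_VERBS:
            return "DESTINATION"
    return None


# الموقع in غادر الطفل الموقع and in قصد المسافر الموقع
print(place_case("غادر", is_place_noun=True, is_direct_object=True))
print(place_case("قصد", is_place_noun=True, is_direct_object=True))
```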


The case FURNISHER: syntactic case = indirect object complement; animation = animated; kind of verb = transfer verb; particle = مِنْ or مِنْ عند; e.g. الطفل in استلم الأستاذ رسالة من الطفل (the teacher received a letter from the child).


The case BENEFICIARY: syntactic case = direct object complement; animation = animated; kind of verb = transfer verb, such as استلم, سلّم, أرسل, حصل, تسلّم, أعطى; particle = إلى or لـ; e.g. الأستاذ in أرسل الطفل الرسالة الإلكترونية إلى الأستاذ (the child sent the e-mail to the teacher).

Or: syntactic case = indirect object complement; animation = animated; kind of verb = transfer verb; e.g. طارق in أعطى الأب هدية لطارق (the father gave a gift to Tarek).
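FURNISHER and BENEFICIARY reuse particles that also mark SOURCE and DESTINATION, so the verb kind and the animation trait do the disambiguation. The sketch below shows that check only. The function name and return labels are hypothetical, the syntactic-case conditions are left out for brevity, and the particles are written without diacritics.

```python
# Transfer verbs and particles as listed on the slides.
TRANSFER_VERBS = {"استلم", "سلّم", "أرسل", "حصل", "تسلّم", "أعطى"}
FROM_PARTICLES = {"من", "من عند"}
TO_PARTICLES = {"إلى", "لـ"}


def transfer_role(verb, particle, animated):
    """Return FURNISHER or BENEFICIARY for an animated participant of a transfer verb."""
    if verb not in TRANSFER_VERBS or not animated:
        return None  # another case (e.g. SOURCE or DESTINATION) may apply instead
    if particle in FROM_PARTICLES:
        return "FURNISHER"
    if particle in TO_PARTICLES:
        return "BENEFICIARY"
    return None


# الطفل in استلم الأستاذ رسالة من الطفل
print(transfer_role("استلم", "من", animated=True))
# الأستاذ in أرسل الطفل الرسالة الإلكترونية إلى الأستاذ
print(transfer_role("أرسل", "إلى", animated=True))
```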


Traits of Chafe

[Figure: trait matrices for the nouns المستعمل (the user) and الشاشة (the screen). Each noun is marked (+) or (-) for the traits Animated, Human, Feminine, Unique, Concrete, Potent and Countable.]
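A noun's Chafe traits can be stored as a bundle of boolean features and tested against the conditions a case imposes, such as "animation = animated". The sketch below is illustrative only: the individual (+)/(-) values for the two nouns are plausible guesses, not values read from the slide, and `Traits`, `signs` and `satisfies` are hypothetical names.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Traits:
    animated: bool
    human: bool
    feminine: bool
    unique: bool
    concrete: bool
    potent: bool
    countable: bool

    def signs(self) -> str:
        """Render the bundle in the slide's (+)/(-) notation."""
        return " ".join(
            f"{name}({'+' if value else '-'})" for name, value in asdict(self).items()
        )


# Illustrative values only (assumptions, not taken from the slide).
USER = Traits(animated=True, human=True, feminine=False, unique=False,
              concrete=True, potent=True, countable=True)      # المستعمل
SCREEN = Traits(animated=False, human=False, feminine=True, unique=False,
                concrete=True, potent=False, countable=True)   # الشاشة


def satisfies(traits: Traits, **required: bool) -> bool:
    """Check a selectional restriction, e.g. satisfies(noun, animated=True)."""
    return all(getattr(traits, name) == value for name, value in required.items())


print(USER.signs())
print(satisfies(USER, animated=True), satisfies(SCREEN, animated=True))
```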


Frames


Fig 1. General Frame


Fig 2. Specialized Frame


Conceptual Dependency


PROPEL: apply a force to something
MOVE: move a body part
GRASP: grasp an object
INGEST: ingest, for an animate being
EXPEL: physically expel, for an animate being
PTRANS: move a physical object
ATRANS: modify an abstract relationship, such as possession
SPEAK: produce a sound; supports an action such as "communicate"
ATTEND: direct one's attention to a perception or stimulus
MTRANS: transfer information
MBUILD: create a new thought
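The eleven primitives can be held in an enumeration so that each verb in the dictionary points to one of them. A minimal sketch follows. The verb-to-primitive table is a hypothetical illustration (the slides do not give this mapping), using verbs that appear elsewhere in the deck.

```python
from enum import Enum


class Primitive(Enum):
    PROPEL = "apply a force to something"
    MOVE = "move a body part"
    GRASP = "grasp an object"
    INGEST = "ingest, for an animate being"
    EXPEL = "physically expel, for an animate being"
    PTRANS = "move a physical object"
    ATRANS = "modify an abstract relationship, such as possession"
    SPEAK = "produce a sound"
    ATTEND = "direct one's attention to a perception or stimulus"
    MTRANS = "transfer information"
    MBUILD = "create a new thought"


# Hypothetical dictionary entries, for illustration only.
VERB_TO_PRIMITIVE = {
    "أعطى": Primitive.ATRANS,   # gave: possession changes
    "استلم": Primitive.ATRANS,  # received: possession changes
    "غادر": Primitive.PTRANS,   # left: physical location changes
}


def primitive_for(verb):
    """Look up the primitive act behind a verb, or None if it is not listed."""
    return VERB_TO_PRIMITIVE.get(verb)


print(primitive_for("أعطى").name, "-", primitive_for("أعطى").value)
```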



Re-organization: French

Le livre est vendu / beau
La revue est vendue / belle
Les livres sont vendus / beaux
Les revues sont vendues / belles

Less in English
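The four French sentences vary the article, the copula and the participle or adjective with the gender and number of the noun, while English only varies the noun and the copula. A sketch of that re-organization step follows, assuming a hypothetical form table limited to the words on the slide; the English glosses ("book", "sold") are illustrative and do not come from the slide.

```python
# Hypothetical form tables covering only the slide's examples.
GENDER = {"livre": "m", "revue": "f"}
FORMS = {
    "vendu": {("m", "sg"): "vendu", ("f", "sg"): "vendue",
              ("m", "pl"): "vendus", ("f", "pl"): "vendues"},
    "beau": {("m", "sg"): "beau", ("f", "sg"): "belle",
             ("m", "pl"): "beaux", ("f", "pl"): "belles"},
}


def french(noun: str, number: str, attribute: str) -> str:
    """Article, copula and attribute all agree with the noun."""
    gender = GENDER[noun]
    if number == "pl":
        subject = f"les {noun}s"
    else:
        subject = f"{'le' if gender == 'm' else 'la'} {noun}"
    copula = "est" if number == "sg" else "sont"
    text = f"{subject} {copula} {FORMS[attribute][(gender, number)]}"
    return text[0].upper() + text[1:]


def english(noun: str, number: str, attribute: str) -> str:
    """Only the noun and the copula change; the attribute is invariable."""
    subject = f"the {noun}{'s' if number == 'pl' else ''}"
    copula = "is" if number == "sg" else "are"
    text = f"{subject} {copula} {attribute}"
    return text[0].upper() + text[1:]


for noun in ("livre", "revue"):
    for number in ("sg", "pl"):
        print(french(noun, number, "vendu"), "/", FORMS["beau"][(GENDER[noun], number)])
print(english("book", "pl", "sold"))
```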


Examples

أعاد المستعمل تسمية الملف
طبع الطفل النص بالطابعة
طبع الطفل النص بسرعة
طُبع النص بالطابعة
نسخ المهندس قاعدة المعلومات
أرسل الطفل رسالة إلكترونية إلى الأستاذ
أرسل الطفل رسالة إلكترونية إلى أستاذه


L'utilisateur a re-nommé le fichier
L'enfant a imprimé le texte avec l'imprimante
Enfant a imprimé le texte rapidement
Le texte a été imprimé
L'ingénieur a copié la base de données
L'enfant a envoyé un email à l'enseignant
Envoyer un email à son enseignant

The user re-named the file
The Child printed the text with the printer
The Child printed text
The Printed text printer
The Engineer copied the database
The child sended an email to the teacher
The child sended an email to his teacher


Examples

L'utilisateur de renommer le fichier
Imprimante enfant du texte imprimé
Enfant texte imprimé rapidement
Imprimé imprimante texte
Ingénieur de base de données de copie
Envoyer un email à l'enfant de l'enseignant
Envoyer un email à l'enfant mentor


Examples

User re-naming the file
Child printer printed text
Child printed text quickly
Printed text printer
Copy Database Engineer
Send an email to the child the teacher
Send an email to the child mentor


Conclusion & perspectives

Our translation system, some modules of which were presented in this paper, performs semantic processing of texts using purely linguistic tools, with the DCF method as its basis.

This method has proved appropriate to the Arabic language and to its syntactic and semantic particularities [3][4][6].


Enrich dictionaries to cover other domains

Multilingual system

Fusion of linguistic and probabilistic approaches


Enrich dictionaries to cover other domains

Rich Dictionaries + Well Defined Rules → Better Translation


Multilingual system

Sentence in Italian / French / English / Arabic → Internal Representation (« Meaning ») → Sentence in Italian / French / English / Arabic
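One practical consequence of routing every language through a shared internal representation is the number of modules to build. The sketch below only does that count for the four languages on the slide; the comparison with direct pairwise translators is standard arithmetic and is not a figure stated by the authors.

```python
from itertools import permutations

LANGUAGES = ["Italian", "French", "English", "Arabic"]


def direct_translators(languages):
    """One translator per ordered (source, target) pair: n * (n - 1)."""
    return list(permutations(languages, 2))


def pivot_modules(languages):
    """One analyser and one generator per language: 2 * n."""
    analysers = [f"{lang} -> meaning" for lang in languages]
    generators = [f"meaning -> {lang}" for lang in languages]
    return analysers + generators


print(len(direct_translators(LANGUAGES)), "direct translators")
print(len(pivot_modules(LANGUAGES)), "modules with an internal representation")
```

Adding a fifth language would need one new analyser and one new generator in the pivot design, against eight new direct translators.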


Fusion of linguistic and probabilistic approaches

Linguistic Approach + Statistical Approach (fusion) → Better Translation
