MALIM – A NEW COMPUTATIONAL APPROACH
OF MALAY MORPHOLOGY
Ainun Najwa Bt Aziz P61811
Fatimah Zawani Bt Abdullah P61028
Mohd Rashidie B. Ramli P62451
Mohd Yunus Sharum, Muhammad Taufik Abdullah, Md Nasir Sulaiman, Masrah Azrifah
Azmi Murad & Zaitul Azma Zainon Hamzah
INTRODUCTION
A major problem in Malay morphological processing lies in analysis.
Existing model: finite-state, two-level formalism.
Hypothesis: higher accuracy in morphological analysis can be achieved by widening the decision-selection domain.
The MALIM approach is implemented using S-A-P-I.
MALAY MORPHOLOGY
The basic target of S-A-P-I is to analyze affixation, especially multiple affixation.
Affixation may involve one or several of these processes: prefixation, suffixation, circumfixation and infixation.
Three basic categories of Malay reduplication:
1. Full reduplication
2. Partial reduplication
3. Rhythmic reduplication
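The three reduplication categories can be told apart, at least roughly, by surface pattern. The sketch below is a Python illustration only, not MALIM's Perl code. The pattern names, the `classify_reduplication` function and the example words are assumptions introduced here. A pattern match yields only a candidate: treating every non-identical hyphenated pair as rhythmic over-generates, and each proposed root would still need checking against a root lexicon.

```python
import re

# Hypothetical surface patterns (assumptions for illustration only).
FULL = re.compile(r"^([a-z]+)-\1$")                      # buku-buku
PARTIAL = re.compile(r"^([b-df-hj-np-tv-z])e(\1[a-z]+)$")  # lelaki = l + e + laki
HYPHENATED = re.compile(r"^([a-z]+)-([a-z]+)$")          # sayur-mayur

def classify_reduplication(word):
    """Return (category, candidate_root), or None if no pattern matches."""
    w = word.lower()
    m = FULL.match(w)
    if m:
        return ("full", m.group(1))
    m = PARTIAL.match(w)
    if m:
        # First consonant copied, followed by 'e', then the root itself.
        return ("partial", m.group(2))
    m = HYPHENATED.match(w)
    if m:
        # Two different halves: treated here as a rhythmic candidate.
        return ("rhythmic", m.group(1))
    return None
```

For example, `classify_reduplication("lelaki")` proposes the root `laki`, while a word such as `makan` matches no pattern and returns `None`.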
THE S-A-P-I APPROACH
Uses the divide-and-conquer technique to handle Malay morphological analysis.
S-A-P-I (‘search-all-pick-if…’) algorithm.
Advantage: we can search for the most appropriate result, since all possible options have been gathered from the decision-selection domain.
Side effect: multiple outputs due to ambiguity.
Two techniques improve the analysis results: separating and filtering.
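The search-all-then-pick idea can be illustrated with a minimal Python sketch. It is not the MALIM implementation: the tiny affix lists, the four-word lexicon and the function names `search_all`, `pick_if` and `analyze` are assumptions made for illustration. The first stage gathers every prefix/root/suffix split the affix lists allow; the second keeps only splits whose root is in the lexicon.

```python
# Toy data, assumed for illustration; MALIM's real lexicon and rules differ.
PREFIXES = ("", "ber", "di", "me", "mem", "men", "meng", "pe", "ter")
SUFFIXES = ("", "an", "i", "kan")
LEXICON = {"ajar", "beri", "ikan", "makan"}

def search_all(word):
    """Search-all: collect every (prefix, root, suffix) split of the word."""
    candidates = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if suffix and not rest.endswith(suffix):
                continue
            root = rest[:len(rest) - len(suffix)] if suffix else rest
            if root:
                candidates.append((prefix, root, suffix))
    return candidates

def pick_if(candidates, lexicon):
    """Pick-if: keep only candidates whose root is a known root word."""
    return [c for c in candidates if c[1] in lexicon]

def analyze(word):
    return pick_if(search_all(word), LEXICON)
```

With this toy data, `analyze("berikan")` returns two analyses, `("", "beri", "kan")` and `("ber", "ikan", "")`, which shows the multiple-output side effect that further separating and filtering would have to resolve.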
MALIM – MORPHOLOGICAL ANALYZER FOR LINGUISTIC INDECISION OF MALAY
A morphological analyzer which implements the S-A-P-I approach.
Developed in Perl. Relevant characteristics of Perl:
1. Supports regular expressions, a notation for describing regular languages.
2. Supports lexical processing.
MALIM contains a basic but comprehensive root lexicon as reference (5710 root words).
MALIM contains a set of 80 morphosyntactic rules.
Limitations of the implementation:
1. Does not include infixation analysis.
2. Does not include analysis of complex affixation/reduplication.
3. Does not analyze rhythmic and free reduplication.
4. Limited in analyzing affixation/reduplication of compound words and phrases.
To overcome these limitations: use a strategy resembling the direct mapping approach.
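A direct-mapping strategy can be pictured as a lookup table consulted before any rule-based analysis. The Python sketch below is an assumption about how such a table might look, not MALIM's actual data or code. The `CHEAT_LIST` entries, the `rule_based_analysis` stub and the `analyze_with_mapping` function are hypothetical names. The two entries are infixed words, a case the rule-based analysis does not cover.

```python
# Hypothetical direct-mapping table: surface form -> (root, note).
CHEAT_LIST = {
    "gemuruh": ("guruh", "infix -em-"),
    "telunjuk": ("tunjuk", "infix -el-"),
}

def rule_based_analysis(word):
    """Stand-in for the rule-based analyzer; it returns no analyses here."""
    return []

def analyze_with_mapping(word):
    """Consult the direct-mapping table first, then fall back to the rules."""
    entry = CHEAT_LIST.get(word.lower())
    if entry is not None:
        return [entry]
    return rule_based_analysis(word)
```

The trade-off is that every entry has to be listed by hand, so the table only covers the forms someone has anticipated.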
METHOD EXPERIMENT
Types of experiment:
1) Testing the processing model (S-A-P-I)
2) Splitting the lexicon (into mono-syllabic and multi-syllabic)
3) Morphosyntactic rule filtering
4) First syllabic reduplication analysis
5) Clitics/particles extraction
6) The effects of the ‘cheat-list’ (direct mapping)
METHOD EXPERIMENT
Experiment settings:
Set 1: MALIM (complete)
Set 2: MALIM without lexicon splits
Set 3: MALIM without morphosyntactic rule filtering
Set 4: MALIM without first syllabic reduplication analysis
Set 5: MALIM without clitics/particles extraction
Set 6: MALIM without ‘cheat list’
Set X: MALIM with basic capabilities only (meets all the conditions of Set 2 to Set 6); used as the control set
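The settings amount to an ablation design: each of Sets 2 to 6 switches off exactly one capability. The Python sketch below writes that out as feature flags. The flag names and the `build_sets` function are assumptions introduced here, and reading Set X as "all five capabilities switched off" is an interpretation of the slide, not something it states outright.

```python
# Hypothetical flag names, listed in the order of Set 2 to Set 6.
FEATURES = (
    "lexicon_split",
    "rule_filtering",
    "first_syllable_reduplication",
    "clitic_extraction",
    "cheat_list",
)

def build_sets():
    """Build the feature-flag configuration for each experiment set."""
    sets = {"Set 1": {f: True for f in FEATURES}}  # complete MALIM
    for number, feature in enumerate(FEATURES, start=2):
        config = {f: True for f in FEATURES}
        config[feature] = False                     # remove one capability
        sets[f"Set {number}"] = config
    sets["Set X"] = {f: False for f in FEATURES}    # control: basic only
    return sets

EXPERIMENT_SETS = build_sets()
```

Comparing Set 1 against each of Sets 2 to 6 isolates the contribution of one capability, and Set X gives the baseline with none of them.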
CONTRIBUTION
1) Introduced a new and more accurate approach to morphological analysis using S-A-P-I.
2) Solved most problems in Malay morphology, except those involving multi-word expressions (compound words) and certain reduplicated words.
CONCLUSION
1) MALIM uses only controlled sample data, which is not drawn from daily-life usage.
2) Thus, the tests may not pose the real challenge of solving real-world problems.
3) In future, a test run using real-life data, such as data from a corpus, may be performed to verify the performance.