An exploration towards a generalized Phonetic
Transliteration of Classical Arabic
Featuring complete computerized rules of recitation and a special
transliteration system for Classical Arabic letters such as ‘عين’ (‘ع’)
By Gregory Morse
www.islamsource.info
With help from Israa Alaradi
May Allah accept this effort at advancing transliteration of this most blessed language! Truly, if
Abu Bakr as-Siddiq and Uthman ibn Affan spent their valuable time as leaders advancing the preservation
of the Quran in Arabic writing, then how would it not be worth our time to transliterate it into English
in the modern context? Some history: Al-Khalil ibn Ahmad al-Farahidi (died 786) devised a tashkil
system to replace that of Abu al-Aswad. His system has been universally used since the early 11th
century, and includes six diacritical marks: fatha (a), damma (u), kasra (i), sukun (vowel-less), shadda
(double consonant), and madda (vowel prolongation; applied to the alif).
This paper is the start of what will ultimately grow into a complete transliteration system from
Classical Arabic into English. It begins with one of the hardest problems such a system must do
justice to, and treats it phonologically.
The Arabic letter ‘ع’ is one of the most difficult for English speakers to pronounce, and one of
the most difficult for which to devise a consistent and extensive set of transliteration rules.
Any evaluation of a transliteration must, above all, take into account the vowel sounds on either side
of the letter. English does not use the pharynx (throat) as a place of articulation, but it does have
guttural sounds that can serve as the closest phonetic approximation and best preserve the spirit of
the transliteration. Guttural speech sounds are those with a primary place of articulation near the
back of the oral cavity. The rules here are prefix and postfix based, but they can be combined, as some
of the examples show.
The International Phonetic Alphabet (IPA) lists the letter as a voiced pharyngeal approximant or
fricative indicated by ‘ʕ’. Although traditionally placed in the fricative row of the IPA chart, ‘ʕ’ is usually
an approximant. The IPA symbol itself is ambiguous, but no language is known to make a phonemic
distinction between fricatives and approximants at this place of articulation. The approximant is
sometimes specified as ‘ʕ̞’ or as ‘ɑ̯’. A pharyngeal consonant is a type of consonant which is articulated
with the root of the tongue against the pharynx.
The best approximation will attempt to bring the point of articulation to the very bottom and
back of the throat, where the letter is pronounced. English has one glottal consonant, the letter ‘h’
as in ‘high’. Glottal consonants, also called laryngeal consonants, are consonants articulated with the
glottis. There are also five velar consonants: the nasal ‘ng’ as in ‘sing’, ‘w’ as in ‘weep’, ‘ch’ as in
‘loch’, ‘g’ as in ‘gaggle’, and ‘k’ as in ‘kiss’. Velars are consonants articulated with the back part of
the tongue (the dorsum) against the soft palate, the back part of the roof of the mouth (known also as
the velum).
Vowels differ a great deal between dialects of spoken English, and even between other languages
that have moved to Romanized alphabets; they, rather than the consonants, are generally the
distinguishing factor. The primary focus here is on the back vowels, whose defining characteristic is
that the tongue is positioned as far back as possible in the mouth without creating a constriction that
would be classified as a consonant. In all dialects of English, the back vowels include ‘oo’ as in
‘boot’, ‘oo’ as in ‘hook’, ‘o’ as in ‘not’ or ‘ough’ as in ‘thought’, and ‘a’ as in ‘bath’, as well as a
number of diphthongs which make use of these back vowels.
The only letters used beyond those needed to transliterate the vowels in their normal cases will
thus come from this grouping of back vowels and glottal and velar consonants. The exception is the
letter ‘e’, which is used several times here. Although ‘e’ normally appears in front or central vowel
combinations, it is used in place of ‘i’ because it is generally more open than close, meaning the
tongue sits low in the mouth rather than at the top.
The Arabic letters ‘ء’ and ‘ه’ (the latter equal to an English ‘h’) are glottal, and the pharyngeal
letters are ‘ح’ and ‘ع’. The uvular letters are ‘خ’ (equal to English ‘ch’ as in ‘loch’), ‘غ’ (equal to
English ‘g’), and ‘ق’ (equal to English ‘q’), while the velar letters are ‘ك’ (equal to English ‘k’) and
the consonantal form of ‘و’ (equal to English ‘w’). The rules therefore apply to the letters without
English equivalents: ‘ء’, ‘ح’ and of course ‘ع’.
The so-called emphatic Arabic letters ‘ص’, ‘ض’, ‘ط’, ‘ظ’, and ‘ل’ only in the word ‘الله’ (without
emphasis equal to ‘s’, ‘d’, ‘t’, ‘th’, and ‘l’ in English respectively), though dental, are velarized or
pharyngealized, so these same rules could be applied to them. However, since the point of articulation
is not in the throat and it is only the emphasis of the letter that carries it back, the chart would
apply only to the vowels after those letters and not before. ‘ء’, on the other hand, is a glottal stop,
which neither continues nor starts a sound. The name Allah is already a well-accepted English word, and
any Arabic loan words which have made it into the English dictionary with a regularized spelling should
be accepted as such, even if a better transliteration is possible.
Some aspects of recitation, especially nasalization and prolongation, cannot be expressed with
phonetic symbols in English. A metadata scheme is required which can provide this information
symbolically, through a color-coded scheme, or through a combination thereof.
It is also worth noting that there are two possible subsystems of Classical Arabic transliteration
into English. The first transliterates quite literally, letter by letter. The second looks at sentences
composed of words, primarily because a word-by-word approach would have problems at the end of
sentences, where letters are often omitted. This style also intelligently assimilates letters when
appropriate according to the moon and sun letter categories, and silences letters that become
combining, placeholder-like letters. Furthermore, transliteration can be styled as plain roman, or as a
Romanized form that uses marked-up letters outside the normal English alphabet, or even capitalization.
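The difference between the two subsystems can be illustrated on the definite article. The following is only a minimal sketch under stated assumptions: the function name `transliterate_article` and the `assimilate` flag are hypothetical, the input is assumed to be an unvocalized word beginning with the article, and the consonant values are the ones listed for the sun letters in the letter table of this paper.

```python
# Sun letters, as grouped in the paper's sun/moon letter table.
SUN_LETTERS = set("تثدذرزسشصضطظلن")

# Consonant values for the sun letters, taken from the paper's letter table.
SUN_LATIN = {
    "ت": "t", "ث": "th", "د": "d", "ذ": "th", "ر": "r", "ز": "z", "س": "s",
    "ش": "sh", "ص": "s", "ض": "d", "ط": "t", "ظ": "th", "ل": "l", "ن": "n",
}

# The article written with a plain alif or with alif wasla, followed by laam.
ARTICLE_FORMS = ("ال", "ٱل")


def transliterate_article(word, assimilate=True):
    """Return the Latin form of the definite article at the start of `word`.

    With assimilate=False the laam is always written (letter-by-letter style).
    With assimilate=True the laam is replaced by the following sun letter.
    Hypothetical helper: assumes an unvocalized word, so word[2] is a letter.
    """
    if not word.startswith(ARTICLE_FORMS) or len(word) < 3:
        raise ValueError("word does not start with the definite article")
    first_letter = word[2]
    if assimilate and first_letter in SUN_LETTERS:
        return "a" + SUN_LATIN[first_letter]
    return "al"


if __name__ == "__main__":
    for w in ("الشمس", "القمر"):
        print(transliterate_article(w, assimilate=False), transliterate_article(w))
```

A word starting with a moon letter keeps the laam in both styles, so only sun-letter words distinguish the two subsystems in this sketch.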
Several stages are used in this phonetic transliteration scheme. The first is an error check, which
filters out illegal usages, such as special Arabic letters and diacritics that are not permitted at a
given location in the word. This can help improve the consistency of diacritic usage in Classical
Arabic texts, and it provides a consistent platform for checking optional but desirable diacritics,
such as the one which represents the lack of a vowel, since their absence can be marked as an error or
at least a warning. The second stage adds metadata for the rules of recitation to the Arabic.
Processing can be stopped at this stage to produce the Arabic rules of recitation; otherwise processing
continues and the transliteration takes this metadata into account. The third stage decomposes
combination Arabic symbols into finer symbols in the order given. The fourth stage applies
transliteration in the given order, using some transliteration functions to keep the tables simple.
The rules currently use a regular expression based grammar: ‘\b’ for word boundary, ‘\B’ for not a word
boundary, ‘\s’ for whitespace, ‘\w’ for a letter of a word, ‘$’ for end of expression, ‘^’ for beginning
of expression (or negation when inside a grouping), ‘(’ followed by ‘)’ to represent groupings, ‘[’
followed by ‘]’ to represent character classes, ‘+’ for combination (concatenation) or meaning 1 or
more, ‘|’ as an or separator, ‘*’ meaning 0 or more, ‘?’ meaning 0 or 1, and ‘<’ followed by ‘>’ then
‘</’ followed by ‘>’ to represent surrounding metadata. Character classes represent the various
groupings under consideration, including ‘letters’, ‘sunletters’, ‘moonletters’, ‘specialgutteral’, and
‘specialleadinggutteral’; a character can appear in more than one group. The tables have been factored
into the sub-tables ‘letterrules’ and ‘gutteralrules’, which can be referenced. ‘letterspelling’ is a
special usage in which the letters are spelled out with Arabic letters as given in the table.
‘decomposeletters’ takes a word and processes each letter through the rule separately. For the rules of
recitation, the original letters never change in any way; the metadata tags give pronunciation
guidance only.
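A small sketch of the second stage may help make the grammar concrete. It is not the paper's implementation: the function names `groups_of`, `tag_bounce` and `strip_tags` are hypothetical, only three of the character classes are included, and the bounce rule is simplified to "a vibration (qalqalah) letter followed by sukoon", ignoring the word-position conditions that separate the small, moderate and great bounce.

```python
import re

SUKOON = "\u0652"  # Arabic sukoon, the diacritic marking the lack of a vowel

# A few of the paper's character classes; a letter may sit in several groups.
CLASSES = {
    "strength": "ءكقطدجتب",
    "vibration": "قطدجب",
    "nasal": "نم",
}


def groups_of(letter):
    """Return the names of every class that contains `letter`, sorted."""
    return sorted(name for name, members in CLASSES.items() if letter in members)


def tag_bounce(text):
    """Wrap each vibration letter that carries a sukoon in <bounce> metadata.

    Simplified assumption: position in the word is not checked.
    """
    pattern = re.compile("([" + CLASSES["vibration"] + "])" + SUKOON)
    return pattern.sub(
        lambda m: "<bounce>" + m.group(1) + "</bounce>" + SUKOON, text
    )


def strip_tags(tagged):
    """Remove all metadata tags, recovering the original Arabic text."""
    return re.sub(r"</?\w+>", "", tagged)


if __name__ == "__main__":
    # yaa + fatha, qaaf + sukoon, taa' (emphatic) + fatha, 'ayn + damma
    word = "\u064A\u064E\u0642\u0652\u0637\u064E\u0639\u064F"
    tagged = tag_bounce(word)
    print(tagged)
    print(strip_tags(tagged) == word)
    print(groups_of("\u0642"))
```

Because tags only wrap letters and never rewrite them, removing the tags restores the input exactly, which matches the requirement that the original letters never change during the recitation stage.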
The system suffers from ambiguity problems, given that English uses combinations like ‘th’ and
‘sh’ to best represent Arabic sounds, yet also uses ‘t’, ‘s’, and ‘h’ separately. But since the primary
emphasis is on phonetic correctness, resolving ambiguity is second in priority. The traditional ‘kh’
transliteration is changed here to ‘ch’, as it is phonetically correct by analogy with the word ‘loch’,
and is even unambiguous given that ‘c’ is not otherwise used. Where traditional schemes have variously
used ‘dz’, ‘z’, ‘dh’ or ‘th’, this scheme uses the only phonetic possibility, ‘th’. This ‘th’ could
potentially be changed to either ‘tth’ as in ‘Matthew’ or ‘fth’ as in ‘twelfth’, though ambiguity would
remain. Corrections must be made here for the final revision, since ambiguity should not be accepted
unless it is the only phonetically acceptable answer, so more research is required in that area. It
probably cannot be resolved, given that English phonics plays the primary role in the ambiguity, as
highlighted simply by the ‘th’ in ‘thin’ versus ‘this’.
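The two kinds of ambiguity described above can be checked mechanically. The sketch below is illustrative only: the helper names `collisions` and `segmentations` are hypothetical, and the table is a subset of the consonant values given in the letter table of this paper. It reports which Latin spellings are shared by several Arabic letters, and in how many ways a Latin string can be split into units of the table.

```python
from collections import defaultdict

# Subset of the paper's letter table (Arabic letter -> Latin spelling).
LETTER_TO_LATIN = {
    "ت": "t", "ط": "t",
    "ث": "th", "ذ": "th", "ظ": "th",
    "د": "d", "ض": "d",
    "س": "s", "ص": "s",
    "ح": "h", "ه": "h",
    "ش": "sh",
    "خ": "ch",
}

# Every distinct Latin unit the table can produce.
UNITS = sorted(set(LETTER_TO_LATIN.values()))


def collisions(table):
    """Map each Latin spelling used by more than one letter to those letters."""
    inverse = defaultdict(list)
    for arabic, latin in table.items():
        inverse[latin].append(arabic)
    return {latin: letters for latin, letters in inverse.items() if len(letters) > 1}


def segmentations(latin, units):
    """Return every way of splitting `latin` into a sequence of table units."""
    if not latin:
        return [[]]
    results = []
    for unit in units:
        if latin.startswith(unit):
            for rest in segmentations(latin[len(unit):], units):
                results.append([unit] + rest)
    return results


if __name__ == "__main__":
    print(collisions(LETTER_TO_LATIN))
    print(segmentations("th", UNITS))  # digraph versus two single letters
    print(segmentations("ch", UNITS))  # 'c' alone is not a unit
```

In this subset ‘th’ and ‘sh’ each split in two ways, while ‘ch’ splits in only one, which is the property the text relies on when preferring ‘ch’ over ‘kh’.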
A color coding scheme is provided. When going directly to Arabic, the color coding rules can be
applied immediately after stage two. Otherwise they are applied after stage three, followed by a custom
fourth stage with a transliteration scheme that does not itself attempt to transliterate the color
coding metadata. Any of the various transliteration schemes in existence can serve as that custom
fourth stage, since it need only provide a 1 to 1 scheme for the primary 28 Arabic letters, the 3 short
vowels, the 3 long vowels and the 3 tanweens: by stage 4 the text has been broken down into those
letters plus the coloring metadata.
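One possible way to apply the color coding to tagged text is sketched here. The tag names and RGB values come from the color table of this paper; the function name `colored_runs` is hypothetical, and the handling of nested tags (the innermost tag that has a color wins, anything else is black) is an assumption, since the table does not say how nesting should be colored.

```python
import re

BLACK = (0, 0, 0)  # normal text

# Tag name -> RGB value, as listed in the paper's color table.
TAG_COLORS = {
    "necessaryprolong": (139, 0, 0),
    "obligatoryprolong": (175, 17, 28),
    "permissibleprolong": (255, 69, 0),
    "normalprolong": (213, 139, 24),
    "nasalize": (0, 255, 0),
    "assimilate": (128, 128, 128),
    "assimilateincomplete": (128, 128, 128),
    "emphasis": (0, 0, 139),
    "bounce": (0, 0, 255),
}


def colored_runs(tagged):
    """Split tagged text into (text, rgb) runs.

    Assumes well-formed, properly nested tags without attributes.
    """
    runs = []
    open_tags = []
    for token in re.split(r"(</?\w+>)", tagged):
        if not token:
            continue
        if token.startswith("</"):
            if open_tags:
                open_tags.pop()
        elif token.startswith("<"):
            open_tags.append(token[1:-1])
        else:
            # Assumption: the innermost open tag that has a color decides it.
            color = next(
                (TAG_COLORS[t] for t in reversed(open_tags) if t in TAG_COLORS),
                BLACK,
            )
            runs.append((token, color))
    return runs


if __name__ == "__main__":
    print(colored_runs("a<bounce>b</bounce>c"))
```

Returning plain runs rather than markup keeps the sketch independent of the output medium; a renderer for Arabic or for a transliteration can then map each run to whatever styling it supports.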
Error check Error
^) + أ | ^ | ^ ) |
^ + إ | ^ | ؤ +
(^ | ^ | ئ + (
^ ^) | ة + | ^ )
Missing diacritic
ى +
([nonvowelletter
ي| و| و| و| ء | [
ي| ي| ) + (\b |
[nonvowelletter]
| ا| و| ء| (ي
Missing diacritic
Sukoon
(س ك ون)
(^ ى| + إ| أ| آ ) +
\b
Must not appear at end of
word
( | | | ى| ء|
ة| ) + \B
Can only appear at end of
word
\b + ( | ت | ب | و
ل | ك | ت )* + ( | ؤ
| | | | | | ا| ا | ئ
| | |)
Must not appear at
beginning of word
\B + ^( | ت | ب | و
ل | ك | ت ٱ + *(
Must appear at beginning
of word
( \ \ \ \ \ \ ) |
+ ( + \ \
(ء | ة) | *( \ \ \ \
+ ) + (ٱ | آ) |
ى | ( \ \ \ \ \
+ ( | \ \ \ \
\ ) + ^ ة + +
Not valid combination
ا | ا Needs to be recomposed ىء | وء | اء | ء
Conversions Metadata Recitation
rule (تجويد)
| | ۩ + \s* <compulsorystop> </ Stopping
+ \b compulsorystop> |
<endofversestop></
endofversestop > |
<prostration>۩</prostrati
on> + $ + \s* + \b
قف) (و
| | + \w* +
| | |
<canstoporcontinue> </c
anstoporcontinue> |
<betternottostop> </bett
ernottostop> | <stopatfirstnotsecond> <
/stopatfirstnotsecond> +
\w* +
<stopatsecondnotfirst> <
/stopatsecondnotfirst> |
<bettertostopbutpermissib
letocontinue> </bettertos
topbutpermissibletocontin
ue> |
<bettertocontinuebutperm
issibletostop> </betterto
continuebutpermissibleto
stop> |
<subtlestopwithoutbreath> </subtlestopwithoutbre
ath>
| | | | +
\s* + $
<empty> | | | |
</empty> + \s* + $
) + ة | | | | <helperheh>ة</helperheh
| | ) + \s* +
$
> + ( | | | | | |
) + \s* + $
( | | ) + ( | د
ج| ط| ق| ب ) +
+ \B
( | | ) + <bounce>( | د
ج| ط| ق| ب )</bounce> +
+ \B
Small
bounce
( يص غر أ لق لق ل ة )
( ج| ط| ق| ب| د )
+ ( | | | |
| ) + \s* + $
<bounce>( | ط| ق| ب| د
) + <bounce/>(ج | | |
| | ) + \s* + $
Moderate
bounce
( س أ لق لق ل ة ت و ط م )
( ج| ط| ق| ب| د )
+ + + \s* +
$
<bounce>( | ط| ق| ب| د
+ <bounce/>(ج + +
\s* + $
Great
bounce ( ى أ لق لق ل ة ك بر )
ن <normalprolong><nasalize> ن</nasalize></normalp
rolong>
Nasalize
character
doubled
( غ نة ح رف
ش دد ة (م
)) | ن ن| | | )
+ \b + \s*) + ( | غ
ء| ه| ح| ج| ع )
)) | ن ن| | | ) + \b +
\s*) + ( ء| ه| ح| ج| ع| غ )
Vowel-less
noon clear
( ن ة ن ون س اك
ار ظه لق ى ؤلت ح )
)) | ن ن| | | )
+ ( | | | )
+ \b + \s*) + ( | ب
ب | ب )
<empty> ن</empty> +
<nasalize> </nasalize> |
((<empty>ن</empty> |
<dividetanween(, ,
<empty>, </empty>)> |
| </dividetanween> +
<nasalize>( | | |
)</nasalize>) + \b + \s*)
Vowel-less
noon covered ( ن ون
ن ة ب س اك إ قل )
+ ( ب | ب | ب )
)) | ن ن| | | )
+ \b + \s*) + ( | ي
و| م| ن ) | \b +
b\ + (ن | يس )
<assimilate> ن</assimilat
e> | (<assimilate>ن</assimilat
e> | <dividetanween(, ,
<assimilate>, </assimilate>)> | |
</dividetanween>) + \b +
\s*) +
<assimilator><normalpro
long><nasalize>( | م| ن| ي
nasalize></normalpro/>(و
long></assimilator> | \b +
b\ + (ن | يس )
Vowel-less
noon
assimilating
nasalization
( ن ة ن ون س اك
غ نة إ دغ ام )
)) | ن ن| | | )
+ \b + \s*) + ( | ل
ن | (ر اق م ر
<assimilate> ن</assimilat
e> |
(<assimilate>ن</assimilat
e> | <dividetanween(, ,
<assimilate>,
</assimilate>)> | |
</dividetanween>) + \b + \s*) + <assimilator>( | ل
ن | <assimilator/>(ر اق م ر
Vowel-less
noon
assimilating
( ن ة ن ون س اك
(إ دغ ام
)) | ن ن| | | )
+ \b + \s*) + ( ص
ق| ظ| ط| ض| )
<normalprolong><nasalize> ن</nasalize></normalp
rolong> |
((<normalprolong><nasalize>ن</nasalize></norma
Vowel-less
noon hide
heaviness
( ن ة ن ون س اك
ق يق ى إ خف اء ح
lprolong> |
<dividetanween(, ,
<normalprolong><nasaliz
e>,
</nasalize></normalprolong>)> | |
</dividetanween>) + \b +
\s*) + ( ق| ظ| ط| ض| ص )
يم (ت فخ
)) | ن ن| | | )
+ \b + \s*) + ( | ت
| ز| ذ| د| ج| ث
ك| ف| ش| س )
<normalprolong><nasalize> ن</nasalize></normalp
rolong> |
((<normalprolong><nasal
ize>ن</nasalize></norma
lprolong> |
<dividetanween(, ,
<normalprolong><nasaliz
e>,
</nasalize></normalprolo
ng>)> | |
</dividetanween>) + \b + \s*) + ( | ز| ذ| د| ج| ث| ت
ك| ف| ش| س )
Vowel-less
noon hide
lightness ( ن ة ن ون س اك
ق يق ى إ خف اء ح
(ت رق يق
م <normalprolong><nasalize> م</nasalize></normalp
rolong>
Character
doubled
( ش دد ة ح رف م )
(*b + \s\ + م | م )
) + ب + | | )
(<normalprolong><nasali
ze> م</nasalize></normal
prolong> |
Vowel-less
meem hide ( يم ن ة م س اك
<normalprolong><nasaliz
e>م</nasalize></normalp
rolong> + \b + \s*) + \b +
\s* + ب + ( | | )
ش ف و ي إ خف اء )
(*b + \s\ + م | م )
) + م + | | )
(<assimilate> م</assimilat
e> | <assimilate>م</assimilate
> + \b + \s*) +
<assimilator><normalprolong><nasalize>م</nasali
ze></normalprolong></assimilator> + ( | | )
Vowel-less
meem
assimilating
small
identical ( يم م
ن ة إ دغ ام س اك
اث ل ين ت م غ م يرص )
(*b + \s\ + م | م )
+ ( | ح| ج| ث| ت
س| ز| ر| ذ| د| خ
| ط| ض| ص| ش|
| ك| ق| غ| ع| ظ
ي| ء| ن| ل |
ي ) + ( | | )
) + (*b + \s\ + م | م ) | ث| ت
| س| ز| ر| ذ| د| خ| ح| ج
| غ| ع| ظ| ط| ض| ص| ش
ي| ء| ن| ل| ك| ق ي| ) +
( | | )
Vowel-less
meem clear
( يم ن ة م س اك
ش ف و ي إ ظه ار )
(*b + \s\ + م | م )
+ ( ف| و و| ) +
( | | )
) + (*b + \s\ + م | م ) ف| و
و| ) + ( | | )
Vowel-less
meem clear greater ( يم م
ن ة ار س اك إ ظه
ي أ ش د ش ف و )
^ + \s* + ٱل +
([sunletter] + |
[moonletter])
^ + \s* +
<helperfatha>ٱ</helperfat
ha> + ل + ([sunletter] +
| [moonletter])
Empty
Hamza ( ة ه مز
صل (و
^ + \s* + (ٱ +
[letter] + ( | |
)? + [letter] +
( | ض وا | (( | ٱم
ش وا ٱق ض وا | ٱب ن وا | ٱم
أ ت | ٱئ ت وا | ر م | ٱم | ٱس
| ٱب ن ت | ٱث ن ين | ٱث ن ت ين
ا | ٱب ن ؤ ر ٱم
^ + \s* +
<helperkasra>ٱ</helperka
sra> + ([letter] + ( | |
)? + [letter] + ( | )) |
ض وا ش وا | م ئ ت وا | ق ض وا | ب ن وا | م
أ ت | ر م | م | ب ن ت | ث ن ين | ث ن ت ين | س
ا | ب ن ؤ ر م
^ + \s* + ٱ +
[letter] + ( | |
|) + [letter] +
^ + \s* + <helperdamma>ٱ</helper
damma> + [letter] + ( |
| ) + [letter] +
\w+ + \s* + \b +
ٱ
\w+ + \s* + \b +
<empty>ٱ</empty>
) + *b + \s\ + ل ل
ر| )
<assimilate>ل</assimilat
e> + \b + \s* +
<assimilator>( | ل
<assimilator/>(ر
Laam of
verb
( م أ لف عل ل )
ان ب ل ر ب ل | ه ل) | )
+ \b + \s* + ( | ل
(ر
ان ب ل ر ب | ه ) | ) +
<assimilate>ل</assimilat
e> + \b + \s* + <assimilator>( | ل
<assimilator/>(ر
Laam of
particle
( م رف ل أ لح )
(^ + \s* | \b) + ( و
| ك | ت | ت | ب |
+ ٱل + *(ل
[sunletter] +
(^ + \s* | \b) + ( | ت | ب | و
ل | ك + ٱ + ?(
<assimilate>ل</assimilat
e> +
<assimilator>[sunletter]</
Assimilate
laam sun letters ( إ د غ ام
م ي أ لل س شم )
assimilator> +
(^ + \s* | \b) + ( و
| ك | ت | ت | ب |
+ *(ل
(<empty>آ<emp
ty> | <helperfatha>آ</
helperfatha> | ٱ |
+ ل + (أ
[moonletter]
(^ + \s* | \b) + [ | ت | ب | و
ل | ك )? +
(<empty>آ</empty> |
<helperfatha>آ</helperfat
ha> | ل + (أ | ٱ +
[moonletter]
Clear laam
moon letters ( م امإ د غ أ لل
ي ر (ق م
\b + لل
+ ل
<helperkasra></helperkas
ra> +
<helperalifwasl></helper
alifwasl> +
<helperlaam></helperlaa
m> + ل
( |) + ([letter]
+ ٱلل + (|
( |) + ([letter] + |) +
ٱلل
Laam
heaviness
م) يم ل (ت فخ
+ ([letter] +
ل | ٱلل + (| + ([letter] + ل | ٱلل + (|
Laam lightness (م ل
(ت رق يق
) + ر | ) | (
) | ر + (| |) |
ر + [elevation]
+ ( | | ) | (
+ (ي^) + (| +
) + ر | ) | ( | | ر + (
( ر | (| + [elevation]
+ ( | | ) | ( | ) +
+ (ي^) ) + ر + | | |
| | [letter] + ٱر | $ + (
Raa’
heaviness ( اء يم ر ت فخ )
) + ر | | |
| | + ٱر | $ + (
[letter] + ( | )
+ ( | )
ر ر | +
[lowness] | +
[lowness] +
) + ر+ | | |
| | ) + $ |
ير + ( | | |
| | ) + $
ر ر | + [lowness] | +
[lowness] + ) + ر+ |
| | | | ير | $ + ( +
( | | | | | ) + $
Raa’
lightness
( اء ت رق يق ر )
[nonvowelletter ي | و | ] + +
[nonvowelletter]
+ ( | | )
<assimilate>[nonvowellet
ter]</assimilate> + +
<assimilator>[nonvowell
etter]</assimilator> + ( |
| )
Assimilate
small
identical
( ن إ د غ ام ث ل ت م أ ل م
ير غ (أ لص
[nonvowelletter]
+ ( | | ) +
[nonvowelletter]
+ ( | | تأ م | (
+ ن + ن + ا +
[nonvowelletter] + ( | |
) + [nonvowelletter] +
( | | + تأ م | (
<assimilate>ن</assimilat
e> + +
<assimilator><dipthong>dipthong></assimilato/>ن
r> + ا
Assimilate
large
identical ( ن إ د غ ام ث ل ت م أ ل م
(أ لك ب ير
[nonvowelletter] + ( | | ) +
[nonvowelletter]
+
[nonvowelletter] + ( | |
) + [nonvowelletter] +
Absolute
identical
( ن إ د غ ام ث ل ت م أ ل م
(أ لم طل ق
+ assimilate>[letter]>) م + *b + \s\ + ب ) Assimilate
ذ + *b + \s\ + ث |
+ *b + \s\ + ذ |
د | ظ + \b + \s*
ت | ت + + \b +
\s* + ت | د + \b
+ \s* + ط | ط +
\b + \s* + ت) +
( | | )
+ </assimilate> + \b + \s*
+
<assimilator>[letter]</ass
imilator> |
<assimilateincomplete>ط
</assimilateincomplete>
+ \b + \s* +
<assimilatorincomplete>assimilatorincomplete/>ت
>)+ ( | | )
small
similar ( إ د غ ام
ان س ان ت ج أ ل م
ير غ (أ لص
Great
similar
( ان س ان ت ج أ ل م
(أ لك ب ير
Absolute
similar
( ان س ان ت ج أ ل م
(أ لم طل ق
) + ن م| و| ي ل ) | (
| ل + ن | ر + (ن |
ل قكم أ ل م ن خ
Assimilate
small
proximate ( إ د غ ام
ب ان ت ق ار أ لم
ي غ رأ لص )
Great
proximate
( ب ان ت ق ار أ لم
(أ لك ب ير
Absolute
proximate ( ب ان ت ق ار أ لم
(أ لم طل ق
Distanced
د ان) ت ب اع (أ لم
ا ي| و| <normalprolong> ا ي| |
و </normalprolong>
Original
lengthening
( د ط ب يع ي م )
ا + \s* + $
+ ا
<helperfatha> </helperfa
tha> + \s* + $
Steady
lengthening
stop ( د ث اب ت م
قف (و
اء يء| وء|
<obligatoryprolong> ا |
ي و|
</obligatoryprolong> + ء
Connected
lengthening
essential ( د م
ل ت ص ب م اج و )
و ي |
<permissibleprolong> و
|ي </permissibleprolong>
+
Soft
lengthening ( د ل ين م )
ا ي | و |
<permissibleprolong> ا |
ي و|
</permissibleprolong> +
Lengthening
presented to
sukoon ( د م
ض ع ار
(ل ل سك ون
( ا ي| و| ) + <necessaryprolong> ا | Weighted
ي و|
</necessaryprolong> +
compulsory
lengthening words ( د م
م ز ي ل م كل
ث قل (م
( ا ي| و| ) +
<necessaryprolong> ا |
ي و|
</necessaryprolong> +
Lightened
compulsory
lengthening
words ( د م
م ز ي ل م كل
ف ف خ (م
ال م ص | ال م ر | ال م
<necessaryprolong> + ا
+ <necessaryprolong/>ل م
(ر | ص )
Weighted
compulsory
lengthening
letters ( د م
م ز رف ي ل ح
ث قل (م
decomposeletters( ص | ال م | ال ر| ال م
ر | طه| ك هيع ص | ال م
م | يس | طس | طس
ق | ع س ق | حم | ص
ن | )
م |<necessaryprolong> | ا
| ق | س | ك | ص | ل
| <necessaryprolong/>ن
<necessaryprolong> ع</ne
cessaryprolong> |
<normalprolong> ر ي | ه |
<normalprolong/>ح | ط |
Lightened
compulsory
lengthening
letters ( د م
م ز رف ي ل ح
ف ف خ (م
( ا ي| و| ) +
\b + \s* + ء
<obligatoryprolong>( ا |
ي و| ) + \b + \s* +
<obligatoryprolong/>ء
Allowed
separated
lengthening ( د ل م نف ص م
ائ ز (ج
ه + ۥ | ه + ۥ
ۥ <obligatoryprolong> + ه
+ <obligatoryprolong> ه
+ <normalprolong>| ۥ +
</normalprolong>
Separated
lengthening
small
connection
( د ل م نف ص م
ل ى ى س ص غر )
ه ه |
ۥ <obligatoryprolong> + ه
+ <obligatoryprolong> |
ه + <normalprolong> ۦ +
</normalprolong>
Separated
lengthening
great
connection
( د ل م نف ص م
ل ى ى س ك بر )
ا أ ا | إ ا | ء
( أ| إ| ء ) +
<normalprolong>ا </nor
malprolong> +
Exchange
lengthening ( د ب د ل م )
+ *s\ + ?(ا) +
$
<normalprolong><helperf
atha> </helperfatha> + ا |
<helperalif></helperalif>
</normalprolong> + \s* +
$
Substitute
lengthening
( د اد م و ع )
ٱل ء
+ ء
<necessaryprolong>ٱ </n
ecessaryprolong> + ل
Distinction
lengthening
( د أ لف رق م )
ي | ي ي
<normalprolong> ي</nor
malprolong> + ي |
<normalprolong> ي</nor
malprolong>
Stabilization
Lengthening
( د ين م ت مك )
+ <empty/>ص<empty> ص
Seen
essential ( ين ب س اج و )
ص
<firstnotsecond>ص</first
notsecond> +
<secondnotfirst></second
notfirst>
Seen noted
( ين شه ور س م )
Metadata, Color / Name: RGB code, and Description

Metadata: <tag>\w*</tag> where tag = empty | helperfatha | helperkasra | helperdamma | helpermeem | assimilator | assimilatorincomplete | dipthong | compulsorystop | endofversestop | prostration | canstoporcontinue | betternottostop | stopatfirstnotsecond | stopatsecondnotfirst | bettertostopbutpermissibletocontinue | bettertocontinuebutpermissibletostop | subtlestopwithoutbreath
Color: ● black: RGB(0,0,0)
Description: Normal text

Metadata: <necessaryprolong>\w*</necessaryprolong>
Color: ● dark red: RGB(139,0,0)
Description: Necessary prolongation 6 vowels / 6 harakah

Metadata: <obligatoryprolong>\w*</obligatoryprolong>
Color: ● blood red: RGB(175,17,28)
Description: Obligatory prolongation 4 or 5 vowels / 4-5 harakah

Metadata: <permissibleprolong>\w*</permissibleprolong>
Color: ● orange red: RGB(255,69,0)
Description: Permissible prolongation 2,4,6 vowels / 2-4-6 harakah

Metadata: <normalprolong>\w*</normalprolong>
Color: ● cumin red: RGB(213,139,24)
Description: Normal prolongation 2 vowels

Metadata: <nasalize>\w*</nasalize>
Color: ● green: RGB(0, 255, 0)
Description: Nasalization 2 vowels / gunnah

Metadata: <assimilate>\w*</assimilate> | <assimilateincomplete>\w*</assimilateincomplete>
Color: ● grey: RGB(128, 128, 128)
Description: Unannounced (silent) / ’idgaam

Metadata: <emphasis>\w*</emphasis>
Color: ● dark blue: RGB(0, 0, 139)
Description: Emphatic pronunciation of the letter (R) / Tafcheem (R)

Metadata: <bounce>\w*</bounce>
Color: ● blue: RGB(0, 0, 255)
Description: Unrest letters (Echoing Sound) / qalqalah
Conversions Result
| آ ا
Alif
chanjareeah ( ي ة أ ل ف ر نج خ )
ا ء
اء إ | أ
وء ؤ
ىء ئ
و ۥ
ي
| م
ن
س
<empty> </empty> Sukoon (س ك ون)
<helperfatha>\w
*</helperfatha>
<helperkasra>\
w*</helperkasra
>
<helperdamma>
\w*</helperdam
ma>
<empty>\w*</e
mpty>
| ء
\b + ( ال م ص | ال م |
ر | ال ر | ك هيع ص | ال م
م | طه | طس | طس
| حم | ص | يس
| ع س ق b\ + (ن | ق
[letterspelling]
Harfu l-
muqatta'at
( رف ح
ق طع ات (ألم
ى ا Alif
maqsoorah ( ة أ ل ف قص ور م )
ى ي
) + ة | | | |
| ) ت
Taa
marbootah ( رب وط ة ت اء م + ة ( ه
( | | | | |
ا| ي| و| )
+ \b + \s* + اء
( ا| | | | | | |
ي و| ) + \s* + \b
\B +
[waslloanword]
+ (| | | | |
| )
\B + [waslloanwordrules]
+ (| | | | | | )
[loanword |
name] + (| | |
| | | )
[loanwordrules |
namerules] + (| | | |
| | )
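Diacritic Combination and Transliteration

Diacritic combination: <normalprolong>\w*</normalprolong>
Transliteration: \w* + - + \w*

Diacritic combination: <necessaryprolong>\w*</necessaryprolong>
Transliteration: \w* + - + \w* + - + \w* + - + \w* + - + \w* + - + \w*

Diacritic combination: <obligatoryprolong>\w*</obligatoryprolong>
Transliteration: \w* + - + \w* + - + \w* + - + \w* + (- + \w*)

Diacritic combination: <permissibleprolong>\w*</permissibleprolong>
Transliteration: \w* + - + \w* + (- + \w* + - + \w*) + (- + \w* + - + \w*)

Diacritic combination: <nasalize>\w*</nasalize>
Transliteration: \w*

Diacritic combination: <assimilate>\w*</assimilate> | <assimilateincomplete>\w*</assimilateincomplete>
Transliteration: \w*

Diacritic combination: <emphasis>\w*</emphasis>
Transliteration: R

Diacritic combination: <bounce>\w*</bounce>
Transliteration: \w* + - + \w*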
( ا| | | |
ي و| ) +
[specialleadingg
utteral] + ( | |
ا| | | | |
ي و| )
[gutteralrules]
[specialgutteral]
+ ( | | | |
ا| | ي| |
و )
[gutteralrules]
( ا| | | |
ي و| | ء) + (
[specialleadingg
utteral])
[gutteralrules]
[letter] [letter]-[letter]
ا aa
و oo
ي ee
[letter] [letterrules]
an
un
in
a
u
i
’ ء
[letter] | [letterrules] | [letterspelling]
Sun
letter
Transliteration |
Spelling
Moon
letter
Transliteration |
Spelling
[sunletters] [moonletters]
ت ا ء t ت[whispering] [strength]
[lowness]
[opening] [restraint]
أ ل ف ا
ث ا ء th ث[whispering]
[weakness]
[lowness] [opening]
[restraint]
ب ا ء b ب[audibility]
[strength]
[lowness] [opening]
[fluency]
[vibration]
د ال d د[audibility]
[strength]
[lowness] [opening]
[restraint]
[vibration]
يم j ج ج [audibility]
[strength]
[lowness] [opening]
[restraint]
[vibration]
ذ ال th ذ[audibility]
[weakness]
[lowness] [opening]
[restraint]
ا ء h ح ح [whispering]
[weakness]
[lowness] [opening]
[restraint]
ا ء r ر ر [audibility]
[moderation] [lowness]
[opening]
[fluency] [inclination]
[repetition]
ا ء ch خ خ [whispering]
[weakness] [elevation]
[opening]
[restraint]
ى z ز /ز
ي ن ز
[audibility] [weakness]
[lowness]
[opening] [restraint]
[whistling]
ع ي ن ع[audibility] [moderation]
[lowness]
[opening] [restraint]
ين s س س [whispering] [weakness]
[lowness]
[opening] [restraint]
[whistling]
غ ي ن g غ[audibility] [weakness]
[elevation]
[opening] [restraint]
ين sh ش ش [whispering]
[weakness] [lowness]
[opening]
[restraint] [diffusion]
ف ا ء f ف[whispering]
[weakness] [lowness]
[opening]
[fluency]
اد s ص ص [whispering]
[weakness] [elevation]
[closing]
[restraint] [whistling]
ق اف q ق[audibility]
[strength] [elevation]
[opening]
[restraint] [vibration]
اد d ض ض [audibility]
[weakness]
[elevation] [closing]
[restraint]
[elongation]
ك اف k ك[whispering]
[strength]
[lowness] [opening]
[restraint]
ط ا ء t ط[audibility]
[strength] م m يم م [audibility]
[moderation]
[elevation]
[closing] [restraint]
[vibration]
[lowness]
[opening] [fluency]
[nasal]
ظ ا ء th ظ[audibility]
[weakness] [elevation]
[closing]
[restraint]
ه ا ء h ه[whispering]
[weakness] [lowness]
[opening]
[restraint]
ل م l ل[audibility]
[moderation]
[lowness] [opening]
[fluency]
[inclination]
او w و و
ن ون n ن[audibility] [moderation]
[lowness]
[opening]
[fluency]
[nasal]
ي ا ء y ي
و و و
او و
ي و ي ي اي ي
يي ي
ى اي
ى ى
aw
ay
او غ ي ر ٱل و
ية د ٱل م
ٱل ي ا ء غ ي ر
ية د ٱل م
[audibility] [weakness]
[lowness]
[opening] [restraint]
[ease]
ة ‘ ء ز ه م [audibility] [strength]
[lowness]
[opening] [restraint]
[nonvowelletter]
| س | ز | ر | ذ | د | خ | ح | ج | ث | ت | ب
| ك | ق | ف | غ | ع | ظ | ط | ض | ص | ش
ه | ن | م | ل
[specialleadinggutteral] ح ع |
[specialgutteral] ض | ص ط | ظ | |
[specialleadinggutteral]
[audibility]
|غ |ع |ظ |ط |ض |ز |ر |ذ |د |ج |ب
◌و | |ن |م |ل |ق ◌و | ◌ ◌و | ◌ و | ◌
◌ ◌او | ◌ ◌و | ي | ◌ ◌◌ ◌ي | ◌ ي | ◌
◌ ◌ي | ◌ ◌اي | ◌ ◌ي | ◌ ◌◌ ◌يي | ◌ اي ◌
|◌ ◌ى | ◌ ◌ى | ◌ ء |ى ◌
[whispering] ه |ك |ف |ص |ش |س |خ |ح |ث |ت
[weakness]
|ظ |ض |ص |ش |س |ز |ذ |خ |ح |ث
◌و | |ه |ف |غ ◌و | ◌ ◌و | ◌ ◌و | ◌ او ◌
|◌ ◌و | ي | ◌ ◌◌ ◌ي | ◌ ◌ي | ◌ ي | ◌
◌ ◌اي | ◌ ◌ي | ◌ ◌◌ ◌يي | ◌ ◌اي | ◌ ى ◌
|◌ ◌ى | ◌ ى ◌
[moderation] ن |م |ل |ع |ر
[strength] ء |ك |ق |ط |د |ج |ت |ب
[lowness]
ش |س |ز |ر |ذ |د |ح |ج |ث |ت |ب
◌و | |ه |ن |م |ل |ك |ف |ع | و | ◌
◌ ◌و | ◌ ◌و | ◌ ◌او | ◌ ◌و | ي | ◌ ◌◌ ي ◌
|◌ ◌ي | ◌ ◌ي | ◌ ◌اي | ◌ ◌ي | ◌ ◌◌ يي ◌
|◌ ◌اي | ◌ ◌ى | ◌ ◌ى | ◌ ء |ى ◌
[elevation] ق | غ | ظ | ط | ض | ص | خ
[opening]
|س |ز |ر |ذ |د |خ |ح |ج |ث |ت |ب
و | |ه |ن |م |ل |ك |ق |ف |غ |ع |ش
◌ ◌و | ◌ ◌و | ◌ ◌و | ◌ ◌او | ◌ و | ي ◌ ◌◌
|◌ ◌ي | ◌ ◌ي | ◌ ◌ي | ◌ ◌اي | ◌ ي ◌ ◌◌
|◌ ◌يي | ◌ ◌اي | ◌ ◌ى | ◌ ◌ى | ◌ | ى ◌
ء
[closing] ظ | ط | ض | ص
[restraint]
|ش |س |ز |ذ |د |خ |ح |ج |ث |ت
و | |ه |ك |ق |غ |ع |ظ |ط |ض |ص
◌ ◌و | ◌ ◌و | ◌ ◌و | ◌ ◌او | ◌ و | ي ◌ ◌◌
|◌ ◌ي | ◌ ◌ي | ◌ ◌ي | ◌ ◌اي | ◌ ي ◌ ◌◌
|◌ ◌يي | ◌ ◌اي | ◌ ◌ى | ◌ ◌ى | ◌ ء |ى ◌
[fluency] ن | م | ل | ف | ر | ب
[vibration] ق | ط | د | ج | ب
[inclination] ل | ر
[repetition] ر
[whistling] ص | س | ز
[diffusion] ش
[elongation] ض
[nasal] ن | م
[ease]
◌و | ◌و | ◌ ◌و | ◌ ◌و | ◌ ◌او | ◌ و | ◌ ◌◌
◌ي | ◌ي | ◌ ◌ي | ◌ ◌| ي◌ اي | ◌
◌ ◌ي | ◌ ◌◌ ◌يي | ◌ ◌اي | ◌ ◌ى | ◌ ى | ◌
◌ ى ◌
<dividetanween(vow
elopentag,
vowelclosetag,
noonopentag,
noonclosetag)> | |
</dividetanween>
<empty> | | </empty> +
vowelopentag(<helperfatha>
</helperfatha> | <helperkasra>
</helperkasra> | <helperdamma>
</helperdamma>)vowelclosetag +
noonopentag<helpernoon></helpernoo
n>noonclosetag
[gutteralletter] | [gutteralrules]
Letter
combination
Transliterati
on
Example
word Example Transliteration
ع ا | ع aw | awn
ل ي ك م يع | ع م اج awlaykum |
jameeawn
ل م e | en ع | ع ش ف يع | ع elma | shafeehen
يع o | on ع | ع يع | ب د س م badeeho | sameehon
ىع | ع ا waa س ع ى | ط ع ام tawwaamu | sahwaa
يد ا aee ع ى | ع ي ى | ب ع ع م bahaeedan | mahaeea
Appendix A - List of English language dictionary loan words from classic Arabic
[waslloanword] | [loanwordrules] | [waslloanwordrules]
Allah llah هللا
[loanword] | [loanwordrules]
Quran/Qur'an/Koran ان ق ر
Hadith د يث ح
Islam م إ سل
Muslim/Moslem سل م م
Imam ام إ م
Sheikh/Sheik/Shaykh/Shaikh ش يخ
Caliph ل يف ة خ
Caliphate ف ة أ خل
Hajj/Hadj ج ح
Hajji/Hadji/Haji ي ج اج / ح / ح
ة ج ح
Muharram/Moharram م ر ح م
Ramadan ض ان م ر
Safar ف ر ص
Safari ف ر ص
Halal ل ل ح
Haram ام ر ح
Sunnah/Sunna س نة
Ihram ام إ حر
Fitna ف تن ة
Makkah/Makah/Mecca كة م
Medina ين ة د م
Salah/Salat ة ص ل
Nabi ن ب ي
Jihad/Jehad اد ه ج
Zakah/Zakat ك اة ز
Sadaqat د ق اة ص
Wudu د وء و
Sawm ص وم
Adhan أ ذان
Khutbah طب ه خ
Miraj اج عر م
Hijrah/Hijra/Hegira ة ه جر
Shahada/Shahadah اد ة شه
Mujahid/Mujahideen/Mujahedin/Mujaheddin د اه ج ين / م د اه ج م
Salaam م س ل
Hijab/Hejab اب ج ح
Qadi/Kadi/Cadi ة ق عد
Hakim/Hakeem ك م يم / ح ك ح
Shia/Shiah/Shii يع ة يع ي / ش ش
Qiblah/Qibla/Kiblah/Kibla ق بل ة
Qiyas/Kiyas ق ي اس
Fiqh/Fikh ف قه
Surah/Sura ة س ور
Sayyid/Said/Sayed/Sayid س ي د
Shirk رك ش
Shaitan/Shaytan ش يط ان
Talaq ق ط ل
Niqab ن ق اب
Deen/Din ين د
Riba ب ا ر
Shariah/Sharia ي عة ش ر
Appendix B – Names
ع ون eoo ع و ج raajeeoona ر
ع ah ين ت ع nastahaeenu ن س
ع e ن ع م nekma
ع o ب ع د bokda
اع aae ل ك اع jaaeeluka ج
يع eeh يع badeeho ب د
وع ooh ي وع د ون yoohawdoona
shoaeyban ش ع ي ب ا aey ع ي
ن awh ع و ع و farawhna ف ر
ع (nothing) أ ن ع م anawma
nahkbudu ن ع ب د k ع
[name] [namerules]
Hadith Qudsi د يث ق دس ي ح
Imam Nabawi’s Forty Hadith د يث ن ب و ي ح
Sahih Al-Bukhari يح ح ي ص ار أ لب خ
Sahih Muslim يح ح سل م ص م
Al-Sunan Al-Sughra ى أ لس ن ن أ لص غر
Sunan Abi Dawood د أ ب ي س ن ن د او
Sunan Al-Tirmidhi ع ام ي ج ذ أ لت رم
Sunan Ibn Maja ه إ بن س ن ن اج م
Kitab Al-Kafi ت اب أ لك اف ي أ لك
Man La Yahduruhu Al-Faqih ه ل م ن أ لف ق يه ي حض ر
Tahdhib Al-Ahkam أ أل حك ام ت هذ يب
Al-Istibsar ست بص ار أ إل
Appendix C – International Phonetic Alphabet pictures and diagrams for Arabic, English and in
general
Part of Speech | Specific point of articulation (مخرج) | Letters

The nasal passage (ألخيشوم)
This is a single point of articulation for the sound of nasalization (غنة) that comes from the nose by blocking the flow of air from the mouth with the tongue ‘ن’ or lips ‘م’. This is an inherent characteristic of the two letters that cannot change. Letters: م | ن

The two lips (ٱلشفتين)
Between the two lips. Letters: ب | م | و
Between the inside of the lower lip and the upper incisors. Letters: ف

The tongue (ٱللسان)
The innermost part of the tongue next to the throat touching the roof of the mouth opposite it (أقصى ٱللسان). Letters: ق
The innermost part of the tongue towards the mouth touching the roof of the mouth opposite it. Letters: ك
One or both edges of the tongue, usually the left, along with the upper back molars. Letters: ض
Between the edges of the tongue, usually the right, and the gums of the front molars, canine and incisors. Letters: ل
Between the tip of the tongue and the gums of the two upper central incisors. Letters: ن
Between the upper part of the tip of the tongue and the gums of the two upper central incisors. Letters: ر
The middle of the tongue with what is opposite it from the roof of the mouth (وسط ٱللسان). Letters: ي | ش | ج
The tip of the tongue near the inner plates of the upper central incisors. Letters: ص | س | ز
The tip of the tongue along with its upper surface touching the roots of the central incisors. Letters: ط | د | ت
Between the upper surface of the tongue near its end and the tips of the two upper central incisors. Letters: ظ | ذ | ث

The throat (ٱلحلق)
The deepest part of the throat (أقصى ٱلحلق). Letters: ه | ء
The middle of the throat (وسط ٱلحلق). Letters: ح | ع
The nearest part of the throat (أدنى ٱلحلق). Letters: خ | غ

The chest or interior (ٱلجوف)
This is a single point of articulation comprising the empty space of the open mouth for the letters of long vowels (مد) or elongation preceded by the Arabic vowels. Letters: ا | و | ي
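The table of articulation points can be held as a small lookup structure. The following is a minimal sketch, not the paper's implementation: the names `MAKHARIJ` and `points_of_articulation` are hypothetical, and the English labels are shortened paraphrases of the table rows. Because some letters are listed under more than one point (for example م under both the nasal passage and the lips), the lookup returns a list.

```python
# Hypothetical lookup built from the articulation table above.
# Each entry: (part of speech, shortened point description, letters).
MAKHARIJ = [
    ("nasal passage", "nasalization", "من"),
    ("two lips", "between the two lips", "بمو"),
    ("two lips", "lower lip and upper incisors", "ف"),
    ("tongue", "innermost part, next to the throat", "ق"),
    ("tongue", "innermost part, towards the mouth", "ك"),
    ("tongue", "edge(s) with the upper back molars", "ض"),
    ("tongue", "edges with the gums of the front teeth", "ل"),
    ("tongue", "tip with the gums of the upper central incisors", "ن"),
    ("tongue", "upper part of the tip with the same gums", "ر"),
    ("tongue", "middle of the tongue", "يشج"),
    ("tongue", "tip near the inner plates of the upper incisors", "صسز"),
    ("tongue", "tip and upper surface at the incisor roots", "طدت"),
    ("tongue", "upper surface near the incisor tips", "ظذث"),
    ("throat", "deepest part", "هء"),
    ("throat", "middle", "حع"),
    ("throat", "nearest part", "خغ"),
    ("interior", "open space of the mouth (long vowels)", "اوي"),
]


def points_of_articulation(letter):
    """Return every (part, point) pair under which a single letter is listed."""
    if len(letter) != 1:
        raise ValueError("expected exactly one character")
    return [(part, point) for part, point, letters in MAKHARIJ
            if letter in letters]
```

For example, `points_of_articulation("ع")` yields the single throat entry, while `points_of_articulation("م")` yields two entries, one for each row in which the letter appears.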
Characteristics ( صفات)
Characteristic Opposite
Strength (ٱلشدة)
Trapping the flow of sound, strengthening the complete reliance on the point of articulation, associated with the letters in this phrase: أجد قط بكت.
Moderation (ٱلتوسط)
In between the strength and the weakness is the moderation, where the sound emerges but does not flow from the point of articulation, associated with the letters in this phrase: لن عمر.
Weakness (ٱلرخاوة)
A flow of sound during pronunciation, weakening the reliance on the point of articulation, associated with all the letters not included in “Strength” and “Moderation”.
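The three-way split above reduces to two letter sets and a default. The sketch below is illustrative only: the function name `sound_flow` is hypothetical, and treating ء as equivalent to the أ of the Strength phrase is an assumption made here, not something the table states.

```python
# Letters of the Strength phrase; 'ء' is added on the assumption that
# the 'أ' in the phrase stands for hamza in general.
STRENGTH = set("أجدقطبكت") | {"ء"}
# Letters of the Moderation phrase.
MODERATION = set("لنعمر")


def sound_flow(letter):
    """Classify a letter as strength, moderation, or weakness.

    Anything not in the two phrases falls through to "weakness",
    so the caller is expected to pass Arabic letters only.
    """
    if letter in STRENGTH:
        return "strength"
    if letter in MODERATION:
        return "moderation"
    return "weakness"
```

Making "weakness" the fall-through case mirrors the definition in the table, which describes it as every letter not covered by the other two phrases.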
Whispers (ٱلهمس)
A flow of breath (air) during pronunciation due to weakness in the reliance on the point of articulation, associated with the letters in this phrase: فحثه شخص سكت.
Audible (ٱلجهر)
The trapping of the flow of breath (air) due to heavy dependence on the point of articulation, associated with all the letters not in “Whispers”.
Heaviness/Elevation (ٱلإستعلاء)
Raising the tongue to the roof of the mouth during articulation, associated with the letters in this phrase: خص ضغط قظ.
Lightness/Lowering (ٱلإستفال)
Lowering the tongue to the floor of the mouth during articulation, associated with all the letters not in “Elevation”.
Closing (ٱلإطباق)
The meeting of the tongue and what is opposite it from the roof of the mouth during articulation, associated with the letters: ظ | ط | ص | ض.
Opening (ٱلإنفتاح)
The separation of the tongue from the roof of the mouth during articulation, associated with all the letters not included in “Closing”.
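The three opposite pairs just described (whispers/audible, elevation/lowering, closing/opening) are each defined by one letter set plus its complement, so a letter can be described by three binary features. The following is a sketch under that reading; the function name and the dictionary keys are hypothetical labels chosen for illustration.

```python
# Letter sets taken from the phrases and lists in the table.
WHISPERS = set("فحثهشخصسكت")
ELEVATION = set("خصضغطقظ")
CLOSING = set("صضطظ")


def paired_characteristics(letter):
    """Return the three paired characteristics of a letter.

    Each opposite is the complement of its set, so the result is only
    meaningful when the argument is an Arabic letter.
    """
    return {
        "breath": "whispers" if letter in WHISPERS else "audible",
        "height": "elevation" if letter in ELEVATION else "lowering",
        "contact": "closing" if letter in CLOSING else "opening",
    }
```

One property that falls out of the sets as listed: every Closing letter is also an Elevation letter, so "closing" never co-occurs with "lowering".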
Fluency (ٱلإذلاق)
The easy flowing of the letters ب | ل | ن | م | ر | ف from the tip of the tongue and lips. However, this characteristic and its opposite are not part of the study of the rules of recitation; they are listed here for completeness.
Restraint (ٱلإصمات)
The emergence of the remaining letters not included in “Fluency” from inside of the mouth and throat.
Whistling (ٱلصفير)
A sound emerging between the tip of the tongue and the upper central incisors which resembles the sound of a bird, associated with ز | س | ص. It is usually like a buzzing sound with ز.
Vibration (ٱلقلقلة)
The vibration of the point of articulation with the emergence of the letter when it has sukoon ( ـْ ), associated with the letters in this phrase: قطب جد.
Ease (ٱللين)
This is pronunciation without exertion or difficulty. It is associated with the letters ي | و with sukoon ( ـْ ) preceded by fathah ( ـَ ).
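Unlike the set-membership characteristics, Vibration and Ease depend on the diacritics around a letter. The sketch below assumes fully vocalized input in which every mark directly follows its letter; the helper names (`clusters`, `find_vibration`, `find_ease`) are hypothetical. It only detects a written sukoon, so anything that arises from pausing on an unmarked letter is outside its scope.

```python
# Arabic combining marks: tanween (3), fathah, dammah, kasrah, shaddah, sukoon.
MARKS = set("\u064b\u064c\u064d\u064e\u064f\u0650\u0651\u0652")
SUKOON = "\u0652"
FATHAH = "\u064e"
VIBRATION_LETTERS = set("قطبجد")
EASE_LETTERS = set("وي")


def clusters(text):
    """Group each base character with the marks that follow it."""
    out = []
    for ch in text:
        if ch in MARKS and out:
            letter, marks = out[-1]
            out[-1] = (letter, marks + ch)
        else:
            out.append((ch, ""))
    return out


def find_vibration(text):
    """Indices (counted in letters) of vibration letters carrying sukoon."""
    return [i for i, (letter, marks) in enumerate(clusters(text))
            if letter in VIBRATION_LETTERS and SUKOON in marks]


def find_ease(text):
    """Indices of و or ي with sukoon whose preceding letter has fathah."""
    cl = clusters(text)
    return [i for i in range(1, len(cl))
            if cl[i][0] in EASE_LETTERS
            and SUKOON in cl[i][1]
            and FATHAH in cl[i - 1][1]]
```

Grouping marks with their letter first means the checks do not depend on whether a shaddah is stored before or after the vowel mark on the preceding letter.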
Inclination/Drifting (ٱلإنحراف)
The inclination of the letter after its articulation from the point of articulation towards another point of articulation, associated with ر | ل. The letter ل inclines towards the tip of the tongue and ر inclines towards the point of articulation of ل.
Repetition (ٱلتكرير)
This is the natural tendency to vibrate or roll the tongue during articulation of the letter ر. However, this is to be avoided in correct pronunciation by controlling the tongue and not relaxing it.
Diffusion (ٱلتفشى)
The spreading of air throughout the mouth during articulation of the letter ش.
Elongation (ٱلإستطالة)
This is the extension of the sound over the entire edge of the tongue from front to back during articulation, associated with the letter ض.
Nasalization (ٱلغنة)
This is the sound emitted from the nose, an inherent characteristic of the letters م | ن when accompanied by sukoon ( ـْ ) or shaddah ( ـّ ). Nasalization emerges from the nose when the flow of sound is blocked in the mouth, by the tongue with ن and by the lips with م.
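The condition for nasalization, ن or م carrying sukoon or shaddah, is compact enough to express as a regular expression. This is a sketch only: the name `find_nasalization` is hypothetical, vocalized input is assumed, and the optional vowel slot in the pattern exists because a shaddah may be stored after the vowel mark in normalized text.

```python
import re

# ن or م, then optionally one vowel or tanween mark (U+064B to U+0650),
# then shaddah (U+0651) or sukoon (U+0652).
GHUNNAH = re.compile("[\u0646\u0645][\u064b-\u0650]?[\u0651\u0652]")


def find_nasalization(text):
    """Return (string index, letter) for each ن or م with sukoon or shaddah."""
    return [(m.start(), m.group()[0]) for m in GHUNNAH.finditer(text)]
```

The indices returned here are positions in the string, marks included, rather than letter counts.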