Appendix A Buckwalter Transliteration

M. Elmahdy et al., Novel Techniques for Dialectal Arabic Speech Recognition, DOI 10.1007/978-1-4614-1906-8, © Springer Science+Business Media New York 2012

Table A.1 Arabic letters with Windows CP-1256, ISO 8859-6, and Unicode character encodings and the corresponding Buckwalter transliteration

Letter  Description               Win. CP-1256  ISO 8859-6  Unicode  Buckwalter
ء       Letter Hamza              C1            C1          U+0621   '
آ       Letter Alef, Madda above  C2            C2          U+0622   |
أ       Letter Alef, Hamza above  C3            C3          U+0623   >
ؤ       Letter Waw, Hamza above   C4            C4          U+0624   &
إ       Letter Alef, Hamza below  C5            C5          U+0625   <
ئ       Letter Yeh, Hamza above   C6            C6          U+0626   }
ا       Letter Alef               C7            C7          U+0627   A
ب       Letter Beh                C8            C8          U+0628   b
ة       Letter Teh Marbuta        C9            C9          U+0629   p
ت       Letter Teh                CA            CA          U+062A   t
ث       Letter Theh               CB            CB          U+062B   v
ج       Letter Jeem               CC            CC          U+062C   j
ح       Letter Hah                CD            CD          U+062D   H
خ       Letter Khah               CE            CE          U+062E   x
د       Letter Dal                CF            CF          U+062F   d
ذ       Letter Thal               D0            D0          U+0630   *
ر       Letter Reh                D1            D1          U+0631   r
ز       Letter Zain               D2            D2          U+0632   z
س       Letter Seen               D3            D3          U+0633   s
ش       Letter Sheen              D4            D4          U+0634   $
ص       Letter Sad                D5            D5          U+0635   S
ض       Letter Dad                D6            D6          U+0636   D
ط       Letter Tah                D8            D7          U+0637   T
ظ       Letter Zah                D9            D8          U+0638   Z
ع       Letter Ain                DA            D9          U+0639   E
غ       Letter Ghain              DB            DA          U+063A   g
ـ       Tatweel                   DC            E0          U+0640   _
ف       Letter Feh                DD            E1          U+0641   f
ق       Letter Qaf                DE            E2          U+0642   q
ك       Letter Kaf                DF            E3          U+0643   k
ل       Letter Lam                E1            E4          U+0644   l
م       Letter Meem               E3            E5          U+0645   m
ن       Letter Noon               E4            E6          U+0646   n
ه       Letter Heh                E5            E7          U+0647   h
و       Letter Waw                E6            E8          U+0648   w
ى       Letter Alef Maksura       EC            E9          U+0649   Y
ي       Letter Yeh                ED            EA          U+064A   y
◌ً       Fathatan                  F0            EB          U+064B   F
◌ٌ       Dammatan                  F1            EC          U+064C   N
◌ٍ       Kasratan                  F2            ED          U+064D   K
◌َ       Fatha                     F3            EE          U+064E   a
◌ُ       Damma                     F5            EF          U+064F   u
◌ِ       Kasra                     F6            F0          U+0650   i
◌ّ       Shadda                    F8            F1          U+0651   ~
◌ْ       Sukun                     FA            F2          U+0652   o
◌ٰ       Letter Superscript Alef   –             –           U+0670   `
ٱ       Letter Alef Wasla         –             –           U+0671   {
پ       Letter Peh                81            –           U+067E   P
چ       Letter Tcheh              8D            –           U+0686   J
ڤ       Letter Veh                –             –           U+06A4   V
گ       Letter Gaf                90            –           U+06AF   G
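Because the mapping in Table A.1 is one-to-one, it can be applied mechanically in both directions. The following Python sketch is an illustration only and is not part of the original appendix. The names BW_TO_CODEPOINT, buckwalter_to_arabic, arabic_to_buckwalter, and legacy_bytes are hypothetical. The dictionary is transcribed from the Unicode and Buckwalter columns of the table, and passing unlisted characters through unchanged is an assumption about how unknown input should be handled. The legacy byte values are obtained from Python's built-in cp1256 and iso8859_6 codecs rather than typed in by hand.

```python
# Buckwalter symbol -> Unicode code point, transcribed from Table A.1.
BW_TO_CODEPOINT = {
    "'": 0x0621, "|": 0x0622, ">": 0x0623, "&": 0x0624, "<": 0x0625,
    "}": 0x0626, "A": 0x0627, "b": 0x0628, "p": 0x0629, "t": 0x062A,
    "v": 0x062B, "j": 0x062C, "H": 0x062D, "x": 0x062E, "d": 0x062F,
    "*": 0x0630, "r": 0x0631, "z": 0x0632, "s": 0x0633, "$": 0x0634,
    "S": 0x0635, "D": 0x0636, "T": 0x0637, "Z": 0x0638, "E": 0x0639,
    "g": 0x063A, "_": 0x0640, "f": 0x0641, "q": 0x0642, "k": 0x0643,
    "l": 0x0644, "m": 0x0645, "n": 0x0646, "h": 0x0647, "w": 0x0648,
    "Y": 0x0649, "y": 0x064A, "F": 0x064B, "N": 0x064C, "K": 0x064D,
    "a": 0x064E, "u": 0x064F, "i": 0x0650, "~": 0x0651, "o": 0x0652,
    "`": 0x0670, "{": 0x0671, "P": 0x067E, "J": 0x0686, "V": 0x06A4,
    "G": 0x06AF,
}

BW_TO_AR = {bw: chr(cp) for bw, cp in BW_TO_CODEPOINT.items()}
AR_TO_BW = {ar: bw for bw, ar in BW_TO_AR.items()}


def buckwalter_to_arabic(text):
    """Map each Buckwalter symbol to its Arabic character (assumption:
    characters that are not in the table are kept as they are)."""
    return "".join(BW_TO_AR.get(ch, ch) for ch in text)


def arabic_to_buckwalter(text):
    """Inverse mapping, with the same pass-through assumption."""
    return "".join(AR_TO_BW.get(ch, ch) for ch in text)


def legacy_bytes(bw_symbol):
    """Return (CP-1256 byte, ISO 8859-6 byte) for a Buckwalter symbol.
    An entry is None where the code page has no such character."""
    char = BW_TO_AR[bw_symbol]
    encoded = []
    for codec in ("cp1256", "iso8859_6"):
        try:
            encoded.append(char.encode(codec))
        except UnicodeEncodeError:
            encoded.append(None)
    return tuple(encoded)


if __name__ == "__main__":
    print(buckwalter_to_arabic("ktAb"))
    print(legacy_bytes("T"))
```

For example, buckwalter_to_arabic("ktAb") returns the four letters Kaf, Teh, Alef, Beh, and legacy_bytes("T") reproduces the differing CP-1256 and ISO 8859-6 values listed for the letter Tah.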

Appendix B Egyptian Colloquial Arabic Lexicon

Table B.1 ECA corpus lexicon with Arabic orthography, Buckwalter transliteration, and SAMPA phonetic transcription

Arabic  Buckwalter  SAMPA
آه  |h  ?A:
أبجدي  >bjdy  ?abgadi
أبدا  >bdA  ?abadan
أبريل  >bryl  ?abri:l
أبو  >bw  ?abu
أبيض  >byD  ?AbyAd'
إتإخر  <t<xr  it?AxxAr
إتفاق  <tfAq  ?ittifa:?
إتناشر  <tnA$r  ?itnA:SAr
إتنين  <tnyn  ?itne:n
أثار  >vAr  ?AsA:r
أجازة  >jAzp  ?aga:za
أحمر  >Hmr  ?AX\mAr
إحنا  <HnA  ?iX\na
أخبارك  >xbArk  ?AxbA:rAk
أخضر  >xDr  ?Axd'Ar
أد  >d  ?add
إدارة  <dArp  ?idA:ra
إذاعة  <*AEp  ?iza:?\a
إذن  <*n  ?izn
إذنك  <*nk  ?iznak
أراضي  >rADy  ?ArA:d'i
أربعة  >rbEp  ?ArbA?\a
أربعتاشر  >rbEtA$r  ?ArbA?\tA:SAr
أربعين  >rbEyn  ?arbi?\i:n
أربع  >rbE  ?ArbA?\
أرض  >rD  ?Ard'
أرنب  >rnb  ?arnab
إزاي  <zAy  izza:y
أزرق  >zrq  ?azra?
إزيك  <zyk  ?izzayyak
أسانسير  >sAnsyr  ?AsAnse:r
أسد  >sd  ?asad
إسكندرية  <skndryp  ?iskindiriyya
أسوان  >swAn  ?AswA:n
إسود  <swd  ?iswid
أسيوط  >sywT  ?asyu:t
أشكال  >$kAl  ?aSka:l
أشكرك  >$krk  aSkurak
أشكرك  >$krk(2)  ?ASkurAk
أصفر  >Sfr  ?As'fAr
أصل  >Sl  ?As'l
إعلان  <ElAn  ?i?\la:n
إعلانات  <ElAnAt  ?i?\lA:na:t
أعمل  >Eml  ?a?\mil
أعياد  >EyAd  ?a?\ya:d

أغسطس  >gsTs  ?aGust'us
أفريقيا  >fryqyA  ?afriqya
أفغانستان  >fgAnstAn  ?afGanista:n
أكتوبر  >ktwbr  ?ukto:bAr
إلا  <lA  illa
الإبجدية  Al<bjdyp  il?abgadiyya
الإحمر  Al<Hmr  il?AX\mAr
الإربع  Al<rbE  lArbA?\
الإردن  Al<rdn  ?il?urdun
الإرضي  Al<rDy  il?Ard'i
الإنفاق  Al<nfAq  l?anfa:?
الإتنين  Al<tnyn  litne:n
الإسعاف  Al<sEAf  il?is?\a:f
البحر  AlbHr  ?ilbAX\r
البلد  Albld  ilbalad
التلات  AltlAt  ittala:t
الجزاير  AljzAyr  ?iggaza:yir
الجمعة  AljmEp  iggum?\a
الحد  AlHd  ilX\add
الحنفية  AlHnfyp  ilX\anafiyya
الخميس  Alxmys  ilxami:s
الخير  Alxyr  ilxe:r
الدائري  AldA}ry  idda:?iri
الدسم  Aldsm  iddasam
الدم  Aldm  iddamm
الدور  Aldwr  ?iddo:r
الدين  Aldyn  iddi:n
الساعة  AlsAEp  ?issa:?\a
السبت  Alsbt  issabt
السلامة  AlslAmp  ssala:ma
السلام  AlslAm  ?issala:mu
الشمس  Al$ms  iSams
الشنطة  Al$nTp  iSSAnt'a
الصحة  AlSHp  is's'iX\X\A
الطريق  AlTryq  ?it't'Ari:?
الظهر  AlZhr  id'd'uhr
العربية  AlErbyp  il?\ArAbiyya
العموم  AlEmwm  l?\umu:m
الغردقة  Algrdqp  ?ilGarda?a
ألف  >lf  ?alf
ألف  >lf(2)  ?alif
الفسحة  AlfsHp  ilfusX\a
ألفين  >lfyn  ?alfe:n
القاهرة  AlqAhrp  ?ilqA:hirA
القطر  AlqTr  ?il?At'r
الله  Allh  ?AllA:
الله  Allh(2)  ?AllA:h
اللي  Ally  lli
اللي  Ally(2)  ?illi
الماسورة  AlmAswrp  ?ilmAsu:rA
ألمانيا  >lmAnyA  ?almanya
المرة  Almrp  ?ilmArrA
المشكلة  Alm$klp  ilmuSkila
المغرب  Almgrb  ?ilmaGrib
الملوك  Almlwk  lmulu:k
المواعين  AlmwAEyn  ?ilmawa?\i:n
الموضوع  AlmwDwE  ilmAwd'u:?\
النهار  AlnhAr  innAhA:r
النهارده  AlnhArdh  innAhArdA
النور  Alnwr  innu:r
الهرم  Alhrm  ?ilhArAm
الهول  Alhwl  lho:l
الوقت  Alwqt  ilwa?t
الوقفة  Alwqfp  ilwa?fa
أماكن  >mAkn  ?ama:kin
أمان  >mAn  ?ama:n
إمبارح  <mbArH  imba:riX\
إمبارح  <mbArH(2)  ?imba:riX\
أمريكا  >mrykA  ?amri:ka
أنا  >nA  ?ana
أنبوبة  >nbwbp  ?anbu:bit
أهل  >hl  ?ahl
أهلا  >hlA  ?ahlan
أهله  >hlh  ?ahlu

أوتوبيس  >wtwbys  ?utubi:s
أوضة  >wDp  ?o:d'it
أول  >wl  ?awwil
أولى  >wlY  ?u:la
أي  >y  ?ayy
إيطاليا  <yTAlyA  ?it'Alya
إيه  <yh  ?e:
إيه  <yh(2)  ?e:h
أيوه  >ywh  ?aywa
باب  bAb  ba:b
بالك  bAlk  ba:lak
باللبن  bAllbn  billaban
بترول  btrwl  bitro:l
بتنجان  btnjAn  bitinga:n
بتوجعني  btwjEny  btiwga?\ni
برتقان  brtqAn  burtu?a:n
برسيم  brsym  barsi:m
برقوق  brqwq  bar?u:?
بزر  bzr  bizr
بسلة  bslp  bisilla
بصل  bSl  bAs'Al
بطاطس  bTATs  bAt'A:t'is
بطاطين  bTATyn  bAt'At'i:n
بطانية  bTAnyp  bAt't'Aniyya
بطن  bTn  bAt'n
بطني  bTny  bAt'ni
بطيخ  bTyx  bAt't'i:x
بعد  bEd  ba?\d
بعدين  bEdyn  ba?\de:n
بعض  bED  bA?\d
بقالك  bqAlk  ba?a:lak
بقالي  bqAly  ba?a:li
بقدونس  bqdwns  ba?du:nis
بقرة  bqrp  ba?ArA
بكره  bkrh  bukrA
بكرة  bkrp  bukrA
بلح  blH  balaX\
بلدي  bldy  baladi
بنات  bnAt  bana:t
بنت  bnt  bint
بنزينة  bnzynp  banzi:na
بنزين  bnzyn  banzi:n
بنطلون  bnTlwn  bAnt'Alo:n
بنعناع  bnEnAE  bini?\na:?\
به  bh  bih
بهارات  bhArAt  buhArA:t
بوابة  bwAbp  bawwa:ba
بوتجاز  bwtjAz  butaga:z
بودرة  bwdrp  budrA
بوري  bwry  bu:ri
بوسطة  bwsTp  bust'a
بوكية  bwkyp  buke:h
بيبسي  bybsy  pepsi
بيت  byt  be:t
بيتزا  bytzA  pitza
بيضا  byDA  be:d'a
بيض  byD  be:d'
بيضة  byDp  be:d'A
بينا  bynA  bi:na
تاخد  tAxd  ta:xud
تالآف  tAl|f  tala:f
تاني  tAny  ta:ni
تحت  tHt  taX\t
تحليل  tHlyl  taX\li:l
تذكرة  t*krp  tAzkArA
ترابيزة  trAbyzp  t'ArAbe:zA
ترابيزة  trAbyzp(2)  t'ArAbe:zit
ترتيب  trtyb  tarti:b
ترعة  trEp  tir?\a
تسع  tsE  tisa?\
تسعة  tsEp  tis?\a
تسعتاشر  tsEtA$r  tisa?\tA:SAr

تسعمية  tsEmyp  tus?\umiyya
تسعميت  tsEmyt  tus?\umi:t
تسعين  tsEyn  tis?\i:n
تشكيلة  t$kylp  taSki:la
تصبح  tSbH  tis'bAX\
تعبك  tEbk  ta?\abak
تعيش  tEy$  ti?\i:S
تفاح  tfAH  tuffa:X\
تفاحة  tfAHp  tuffa:X\a
تقريبا  tqrybA  ta?ri:ban
تلات  tlAt  talat
تلاتاشر  tlAtA$r  tAlAttA:SAr
تلاتة  tlAtp  tala:ta
تلاتين  tlAtyn  talati:n
تلت  tlt  tilt
تلتمية  tltmyp  tultumiyya
تلتميت  tltmyt  tultumi:t
تمانية  tmAnyp  tamanya
تمانين  tmAnyn  tamani:n
تمثيل  tmvyl  tamsi:l
تمساح  tmsAH  timsa:X\
تمن  tmn  taman
تمنتاشر  tmntA$r  tAmAntA:SAr
تمنمية  tmnmyp  tumnumiyya
تمنميت  tmnmyt  tumnumi:t
تنساش  tnsA$  tinsa:S
تنس  tns  tinis
توصلني  twSlny  tiwAs's'Alni
تيل  tyl  ti:l
تيه  tyh  tih
ثه  vh  sih
جاذبية  jA*byp  ga:zibiyya
جايين  jAyyn  gayyi:n
جبر  jbr  gAbr
جبنة  jbnp  gibna
جديد  jdyd  gidi:d
جديدة  jdydp  gidi:da
جراج  jrAj  gArA:Z
جرالك  jrAlk  gArA:lak
جرجير  jrjyr  gargi:r
جرس  jrs  gArAs
جرنال  jrnAl  gurnA:l
جزار  jzAr  gAzza:r
جزر  jzr  gAzAr
جزمة  jzmp  gazma
جزيرة  jzyrp  gizi:ra
جعان  jEAn  ga?\a:n
جلطة  jlTp  gAlt'A
جمبري  jmbry  gambari
جمل  jml  gamal
جمهور  jmhwr  gumhu:r
جمهورية  jmhwryp  gumhuriyyit
جنية  jnyp  gine:
جوافة  jwAfp  gawa:fa
جواهرجي  jwAhrjy  gawahirgi
جورج  jwrj  ZorZ
جيش  jy$  ge:S
جيم  jym  gi:m
جينس  jyns  Zins
جيوش  jyw$  guyu:S
حاجة  HAjp  X\a:ga
حال  HAl  X\a:l
حتة  Htp  X\itta
حجز  Hjz  X\agz
حد  Hd  X\add
حداشر  HdA$r  X\idA:SAr
حروف  Hrwf  X\uru:f
حزام  HzAm  X\iza:m
حساب  HsAb  X\isa:b
حساسية  HsAsyp  X\asasiyya
حصان  HSAn  X\us'A:n
حصل  HSl  X\As'al
حفلة  Hflp  X\aflit
حق  Hq  X\a??
حلاق  HlAq  X\alla:?
حمار  HmAr  X\umA:r

حمام  HmAm  X\amma:m
حمدلله  Hmdllh  X\amdilla
حه  Hh  hAh
حوالين  HwAlyn  X\awa:le:n
حوالي  HwAly  X\awa:li
حوض  HwD  X\o:d
حيوانات  HywAnAt  X\ayawa:na:t
خاتم  xAtm  xa:tim
خدني  xdny  xudni
خرطوم  xrTwm  xart'u:m
خشب  x$b  xaSab
خصوصية  xSwSyp  xus'us'iyya
خضرا  xDrA  xAd'rA
خلاط  xlAT  xAllA:t
خلي  xly  xalli
خمس  xms  xamas
خمسة  xmsp  xamsa
خمستاشر  xmstA$r  xAmAstA:SAr
خمسمية  xmsmyp  xumsumiyya
خمسميت  xmsmyt  xumsumi:t
خمسين  xmsyn  xamsi:n
خه  xh  xAh
خوخ  xwx  xo:x
خيار  xyAr  xiyA:r
خير  xyr  xe:r
دا  dA  da
دال  dAl  da:l
دايمة  dAymp  dayma
درجة  drjp  dArAgA
دروس  drws  duru:s
دعوة  dEwp  da?\wa
دقايق  dqAyq  da?a:yi?
دقيقة  dqyqp  di?i:?a
دكتور  dktwr  dukto:r
دلوقتي  dlwqty  dilwa?ti
ديسمبر  dysmbr  disimbir
ذال  *Al  za:l
ذرة  *rp  dura
ذي  *y  zayy
راجل  rAjl  ra:gil
ربعمية  rbEmyp  rub?\umiyya
ربعميت  rbEmyt  rub?\umi:t
ربع  rbE  rub?\
ربنا  rbnA  rAbbina
رحلة  rHlp  riX\la
رخصة  rxSp  ruxs'it
رز  rz  ruzz
رصاص  rSAS  rus'A:s'
رغيف  rgyf  riGi:f
رمادي  rmAdy  rumA:di
ره  rh  rih
رومي  rwmy  ru:mi
رياضة  ryADp  riyA:d'A
زبدة  zbdp  zibda
زراعة  zrAEp  zirA:?\A
زراعي  zrAEy  zirA:?\i
زعلان  zElAn  za?\la:n
زيادة  zyAdp  ziya:da
زيارة  zyArp  ziyA:rA
زيارتك  zyArtk  ziyArtak
زيت  zyt  ze:t
زيتون  zytwn  zatu:n
زيرو  zyrw  zi:ru
زين  zyn  ze:n
سادة  sAdp  sa:da
ساقعة  sAqEp  sa??\a
سباحة  sbAHp  siba:X\a
سبانخ  sbAnx  saba:nix
سبب  sbb  sabab
سبتمبر  sbtmbr  sibtimbir
سبع  sbE  saba?\
سبعة  sbEp  sab?\a

سبعتاشر  sbEtA$r  sAbA?\tA:SAr
سبعمية  sbEmyp  sub?\umiyya
سبعميت  sbEmyt  sub?\umi:t
سبعين  sbEyn  sab?\i:n
ستاشر  stA$r  sittA:SAr
ست  st  sit
ستة  stp  sitta
ستمية  stmyp  suttumiyya
ستميت  stmyt  suttumi:t
ستين  styn  sitti:n
سجن  sjn  sign
سخنة  sxnp  suxna
سرة  srp  surrA
سعر  sEr  si?\r
سعيدة  sEydp  sa?\i:da
سفرة  sfrp  sufra
سفير  sfyr  safi:r
سكر  skr  sukkar
سلامتك  slAmtk  salamtak
سمحت  smHt  samaX\t
سمسم  smsm  simsim
سمك  smk  samak
سمكة  smkp  samaka
سنين  snyn  sini:n
سواق  swAq  sawwa:?
سواقة  swAqp  siwa:?a
سوداني  swdAny  suda:ni
سوريا  swryA  surya
سين  syn  si:n
شارع  $ArE  Sa:ri?\
شامي  $Amy  Sa:mi
شاي  $Ay  Sa:y
شجرة  $jrp  SAgarA
شفته  $fth  Suftu
شقة  $qp  Sa??a
شكر  $kr  Sukr
شكرا  $krA  SukrAn
شكل  $kl  Sakl
شمس  $ms  Sams
شمسية  $msyp  Samsiyya
شمعة  $mEp  Sam?\a
شنطة  $nTp  SAnt'A
شهادة  $hAdp  Siha:dit
شواية  $wAyp  Sawwa:ya
شوربة  $wrbp  Surba
شوكة  $wkp  So:ka
شوية  $wyp  Swayya
شين  $yn  Si:n
صابون  SAbwn  s'Abu:n
صاد  SAd  s'A:d
صالون  SAlwn  s'alo:n
صباح  SbAH  s'AbA:X\
صفر  Sfr  s'ifr
صندوق  Sndwq  sandu:?
صوت  Swt  s'o:t
ضاد  DAd  d'A:d
ضغط  DgT  d'AGt'
طبق  Tbq  t'AbA?
طبي  Tby  t'ibbi
طعمية  TEmyp  t'A?\miyya
طفاية  TfAyp  t'AffA:ya
طماطم  TmATm  t'AmA:t'im
طموح  TmwH  t'umu:X\
طه  Th  t'Ah
طوبة  Twbp  t'u:bA
طول  Twl  t'u:l
طيارة  TyArp  t'AyyA:rA
طيران  TyrAn  t'AyArA:n
ظه  Zh  D'Ah
عارف  EArf  ?\a:rif
عالريحة  EAlryHp  ?\arri:X\a
عالسلامة  EAlslAmp  ?\assala:ma
عجيب  Ejyb  ?\agi:b
عدس  Eds  ?\ads

عربي  Erby  ?\ArAbi
عربيات  ErbyAt  ?\ArAbiyya:t
عربية  Erbyp  ?\ArAbiyya
عربية  Erbyp(2)  ?\ArAbiyyit
عسل  Esl  ?\asal
عشر  E$r  ?\ASAr
عشرة  E$rp  ?\ASA:rA
عشرين  E$ryn  ?\iSri:n
عصافير  ESAfyr  ?\As'Afi:r
عصفورة  ESfwrp  ?\As'fu:rA
عصير  ESyr  ?\As'i:r
عظام  EZAm  ?\iD'A:m
عقلي  Eqly  ?\a?li
علبة  Elbp  ?\ilbit
على  ElY  ?\ala
عليكو  Elykw  ?\ale:ku
عمري  Emry  ?\umri
عن  En  ?\an
عنب  Enb  ?\inab
عندك  Endk  ?\andak
عنكبوت  Enkbwt  ?\ankabu:t
عنوان  EnwAn  ?\inwA:n
عيد  Eyd  ?\i:d
عيش  Ey$  ?\e:S
عين  Eyn  ?\e:n
غروب  grwb  Guru:b
غير  gyr  Ge:r
غيرها  gyrhA  Gerha
غين  gyn  Ge:n
فاتت  fAtt  fa:tit
فاصوليا  fASwlyA  fAs'ulya
فاضل  fADl  fA:d'il
فاضية  fADyp  fAd'ya
فالقلب  fAlqlb  fil?alb
فبراير  fbrAyr  fibrA:yir
فراخ  frAx  fira:x
فرامل  frAml  fArA:mil
فرخة  frxp  farxa
فرصة  frSp  furs'A
فرنسا  frnsA  fArAnsA
فضة  fDp  fAd'd'A
فضلك  fDlk  fAd'lAk
فظيع  fZyE  faD'i:?\
فلافل  flAfl  fala:fil
فلفل  flfl  filfil
فلوس  flws  filu:s
فه  fh  fih
فودافون  fwdAfwn  vodafon
فوق  fwq  fo:?
فول  fwl  fu:l
في  fy  fi
فيزا  fyzA  vi:za
فيلا  fylA  villa
فيلم  fylm  film
فينو  fynw  fi:nu
قاف  qAf  qA:f
قانون  qAnwn  qAnu:n
قبل  qbl  ?abl
قديم  qdym  ?adi:m
قرار  qrAr  qArA:r
قرد  qrd  ?ird
قرية  qryp  qAryA
قريش  qry$  ?ari:S
قطة  qTp  ?ut't'A
قطن  qTn  ?ut'n
قطنة  qTnp  ?ut'nA
قلب  qlb  ?alb
قلم  qlm  ?alam
قمح  qmH  ?amX\
قمر  qmr  ?AmAr
قهوة  qhwp  ?ahwa
قيمة  qymp  ?i:ma
كاف  kAf  ka:f
كام  kAm  ka:m

كامل  kAml  ka:mil
كان  kAn  ka:n
كبدة  kbdp  kibda
كبريت  kbryt  kabri:t
كتاب  ktAb  kita:b
كتكوت  ktkwt  katku:t
كداب  kdAb  kadda:b
كده  kdh  kida
كدة  kdp  kida
كمان  kmAn  kama:n
كنبة  knbp  kanaba
كندا  kndA  kanada
كندوز  kndwz  kandu:z
كهرباء  khrbA'  kAhrAbA
كوبري  kwbry  kubri
كورة  kwrp  ko:rA
كوسة  kwsp  ko:sa
كيس  kys  ki:s
كيفك  kyfk  ke:fak
لإ  l<  la?
لإا  l<A  la?a
لازم  lAzm  la:zim
لام  lAm  la:m
لبن  lbn  laban
لحمة  lHmp  laX\ma
لقطة  lqTp  lu?t'a
لمبة  lmbp  lAmbA
لو  lw  law
لوبيا  lwbyA  lubya
لوحة  lwHp  lo:X\a
ليا  lyA  liyya
ليلة  lylp  le:la
ليمون  lymwn  lamu:n
ما  mA  ma
مارس  mArs  ma:ris
مالك  mAlk  ma:lak
مالوش  mAlw$  malu:S
ماليش  mAly$  mali:S
مانجة  mAnjp  manga
مايو  mAyw  ma:yu
مترو  mtrw  mitru
متوسط  mtwsT  mutAwAssit
مجنون  mjnwn  magnu:n
مجهول  mjhwl  maghu:l
محاسب  mHAsb  muX\a:sib
محامي  mHAmy  muX\a:mi
محدش  mHd$  maX\addS
محطة  mHTp  mAX\At't'it
محمرة  mHmrp  miX\AmmAra
مخدرات  mxdrAt  muxAddArA:t
مدمس  mdms  midammis
مراية  mrAyp  mira:ya
مربوطة  mrbwTp  mArbu:t'A
مرة  mrp  mArrA
مرتب  mrtb  murattab
مساء  msA'  masa:?
مساعدة  msAEdp  musa?\da
مستشفى  mst$fY  mustaSfa
مستغرب  mstgrb  mistAGrAb
مسدودة  msdwdp  masdu:da
مسلوق  mslwq  maslu:?
مسموح  msmwH  masmu:X\
مش  m$  miS
مشمش  m$m$  miSmiS
مشوي  m$wy  maSwi
مشوية  m$wyp  maSwiyya
مصر  mSr  mAs'r
مضرب  mDrb  mAd'rAb
مطار  mTAr  mAt'A:r
مطبخ  mTbx  mAt'bAx
مطعم  mTEm  mAt'?\Am
مع  mE  ma?\a
معاك  mEAk  ma?\a:k
معايا  mEAyA  ma?\a:ya
معدنية  mEdnyp  ma?\daniyya
معقول  mEqwl  ma?\?u:l

معلقة  mElqp  ma?\la?a
معلقة  mElqp(2)  ma?\la?it
مغامرات  mgAmrAt  muGAmrA:t
مفتاح  mftAH  mufta:X\
مفلفل  mflfl  mefalfel
مفيش  mfy$  mafi:S
مقلي  mqly  ma?li
مكان  mkAn  maka:n
مكرونة  mkrwnp  mAkAro:na
مكسوف  mkswf  maksu:f
مكوة  mkwp  makwa
مكوجي  mkwjy  makwagi
ملح  mlH  malX\
ممثلة  mmvlp  mumassila
ممكن  mmkn  mumkin
من  mn  min
منديل  mndyl  mandi:l
منطقة  mnTqp  mAnt'i?a
منها  mnhA  minha
مهندس  mhnds  muhandis
مواعيد  mwAEyd  mawa?\i:d
موجود  mwjwd  mawgu:d
موز  mwz  mo:z
موسيقى  mwsyqY  musi:qA
مية  myp  miyya
مية  myp(2)  mAyyA
ميت  myt  mi:t
ميتين  mytyn  mite:n
ميلاد  mylAd  mila:d
ميم  mym  mi:m
ناخد  nAxd  na:xud
نادي  nAdy  na:di
ناس  nAs  na:s
نايم  nAym  na:yim
نحل  nHl  naX\l
نشاط  n$AT  nASA:t
نصيحة  nSyHp  nAs'i:X\a
نظارة  nZArp  nAd'd'A:rA
نعم  nEm  na?\am
نعناع  nEnAE  ni?\na:?
نفس  nfs  nafs
نقطة  nqTp  nu?t'it
نمل  nml  naml
نملة  nmlp  namla
نوفمبر  nwfmbr  nuvimbir
نوم  nwm  no:m
نون  nwn  nu:n
نيون  nywn  niyo:n
همزة  hmzp  hamza
هنا  hnA  hina
هندسة  hndsp  handasa
هه  hh  hih
وإربعة  w<rbEp  w?ArbA?\a
وإربعتاشر  w<rbEtA$r  w?ArbA?\tA:SAr
وإربعين  w<rbEyn  w?arbi?\i:n
وإتناشر  w<tnA$r  w?itnA:SAr
وإتنين  w<tnyn  w?itne:n
وإنت  w<nt  winta
واحد  wAHd  wa:X\id
واحدة  wAHdp  waX\da
واخد  wAxd  wa:xid
وادي  wAdy  wa:di
واو  wAw  wA:w
وتاخد  wtAxd  wita:xud
وتسعة  wtsEp  wtis?\a
وتسعتاشر  wtsEtA$r  wtisa?\tA:SAr
وتسعمية  wtsEmyp  wtus?\umiyya
وتسعين  wtsEyn  wtis?\i:n
وتلاتاشر  wtlAtA$r  wtAlAttA:SAr
وتلاتة  wtlAtp  wtala:ta
وتلاتين  wtlAtyn  wtalati:n
وتلت  wtlt  wtilt
وتلتمية  wtltmyp  wtultumiyya
وتمانية  wtmAnyp  wtamanya

وتمانين  wtmAnyn  wtamani:n
وتمنتاشر  wtmntA$r  wtAmAntA:SAr
وتمنمية  wtmnmyp  wtumnumiyya
وحداشر  wHdA$r  wX\idA:SAr
وخمسة  wxmsp  wxamsa
وخمستاشر  wxmstA$r  wxAmAstA:SAr
وخمسمية  wxmsmyp  wxumsumiyya
وخمسين  wxmsyn  wxamsi:n
ورايح  wrAyH  wrA:yiX\
وربع  wrbE  wrub?\
وربعمية  wrbEmyp  wrub?\umiyya
ورد  wrd  ward
ورقة  wrqp  wara?a
وسبعة  wsbEp  wsab?\a
وسبعتاشر  wsbEtA$r  wsAbA?\tA:SAr
وسبعمية  wsbEmyp  wsub?\umiyya
وسبعين  wsbEyn  wsab?\i:n
وستاشر  wstA$r  wsittA:SAr
وستة  wstp  wsitta
وستمية  wstmyp  wsuttumiyya
وستين  wstyn  wsitti:n
وسهلا  wshlA  wasahlan
وشه  w$h  wiSSu
وصول  wSwl  wus'u:l
وعشرة  wE$rp  w?\ASA:rA
وعشرين  wE$ryn  w?\iSri:n
ولد  wld  walad
ومية  wmyp  wmiyya
وميتين  wmytyn  wmite:n
ونص  wnS  wnus's'
وواحد  wwAHd  wwa:X\id
ياما  yAmA  yama
يديك  ydyk  yiddi:k
يسلمك  yslmk  ysallimak
يعني  yEny  ya?\ni
يغسل  ygsl  yiGsil
يله  ylh  yAllA
يناير  ynAyr  yana:yir
ينسون  ynswn  yansu:n
يه  yh  yih
يوفقك  ywfqk  ywaffa?ak
يوليه  ywlyh  yulya
يوليو  ywlyw  yulyu
يوم  ywm  yo:m
يونيه  ywnyh  yunya
يونيو  ywnyw  yunyu
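Table B.1 lists some written forms twice, with the second entry marked (2) and carrying a different SAMPA string (for example >lf and >lf(2)). Reading these as alternative pronunciations of the same written form is an interpretation; the table itself does not say so. Under that assumption, the Python sketch below groups such rows into a pronunciation lexicon. The function name build_lexicon and the whitespace-separated "Buckwalter SAMPA" row format are hypothetical and chosen only for this illustration.

```python
import re
from collections import defaultdict

# Matches a Buckwalter key with a trailing variant marker such as ">lf(2)".
VARIANT = re.compile(r"^(?P<word>.+?)\((?P<index>\d+)\)$")


def build_lexicon(rows):
    """Group 'Buckwalter SAMPA' rows into {word: [pronunciations]}.

    Assumption: a key ending in '(n)' is the n-th pronunciation variant
    of the same written word. Rows without exactly two fields are skipped.
    """
    lexicon = defaultdict(list)
    for row in rows:
        fields = row.split()
        if len(fields) != 2:
            continue
        key, sampa = fields
        match = VARIANT.match(key)
        word = match.group("word") if match else key
        if sampa not in lexicon[word]:
            lexicon[word].append(sampa)
    return dict(lexicon)


if __name__ == "__main__":
    sample = [">lf ?alf", ">lf(2) ?alif", "Allh ?AllA:", "Allh(2) ?AllA:h", "bAb ba:b"]
    for word, prons in build_lexicon(sample).items():
        print(word, prons)
```

Keeping every variant under a single key mirrors the common practice of multiple-pronunciation dictionaries in speech recognition toolkits, where a decoder may choose among the listed pronunciations.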

Appendix C Sennheiser ME-3 Specifications

The Sennheiser ME-3 is a high-quality headset microphone intended for music and speech applications. Its super-cardioid condenser design offers excellent feedback rejection.

Table C.1 Specifications of the Sennheiser ME-3 microphone

AF sensitivity                      1.6 mV/Pa
Max. sound pressure level (active)  150 dB
Pick-up pattern                     Super-cardioid
Transducer principle                Electret condenser
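The two numeric entries in Table C.1 can be restated in other common units. The Python sketch below is an illustration, not part of the manufacturer's data: it converts the sensitivity to decibels relative to 1 V/Pa and the maximum sound pressure level to pascals. The reference values of 1 V/Pa and 20 µPa are the conventional acoustic references and are assumptions here, since the table does not state which references it uses. The function names are hypothetical.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # conventional 0 dB SPL reference (assumed)


def sensitivity_db_re_1v_per_pa(sensitivity_mv_per_pa):
    """Convert a sensitivity in mV/Pa to dB relative to 1 V/Pa."""
    return 20.0 * math.log10(sensitivity_mv_per_pa / 1000.0)


def spl_to_pascal(spl_db):
    """Convert a sound pressure level in dB SPL to an RMS pressure in Pa."""
    return REFERENCE_PRESSURE_PA * 10.0 ** (spl_db / 20.0)


if __name__ == "__main__":
    print(round(sensitivity_db_re_1v_per_pa(1.6), 2))  # about -55.92
    print(round(spl_to_pascal(150.0), 2))              # about 632.46
```

Under these assumptions, 1.6 mV/Pa corresponds to roughly −55.9 dB re 1 V/Pa, and 150 dB SPL corresponds to an RMS pressure of roughly 632 Pa.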

Fig. C.1 Polar diagram for Sennheiser ME-3

Fig. C.2 Frequency response curve for Sennheiser ME-3

Appendix D Buddy 6G USB Specifications

The Buddy 6G USB adapter, manufactured by InSync Speech Technologies, Inc., is based on the Micronas UAC3556b microchip. It has a built-in high-quality sound card that replaces a desktop or laptop computer's sound card for high-performance speech input and output. It offers full-duplex operation for connection with a microphone and speakers, which makes it well suited to speech recognition applications.

Table D.1 Specifications of the Buddy 6G USB adapter based on the Micronas UAC3556b microchip

Microphone                 Supports 8/16 bit mono recording at 6.4 kHz to 48 kHz, sensitivity −54 ± 4 dB, impedance < 650 Ohms.
Speaker Output             Supports 16/24 bit mono/stereo at 6.4 kHz to 48 kHz. Includes a low-power stereo amplifier.
Signal to Noise Ratio      SNR is typically −92 dB for A/D (recording) and −96 dB for D/A (playback).
Total Harmonic Distortion  THD is better than −90 dB for both A/D (recording) and D/A (playback).
Power                      Self-powered from the USB bus with less than 100 mA current at 5 V DC.
Operating Temperature      Minimum −10°C (14°F), maximum 70°C (158°F).
Storage Temperature        Minimum −40°C (−40°F), maximum 75°C (167°F).
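The recording formats in Table D.1 translate directly into data rates for uncompressed audio. The Python sketch below is an illustration under the assumption that recordings are stored as plain linear PCM with no container overhead, which the table does not specify. It computes the byte rate at the two extremes of the microphone input range and the upper bound on power draw implied by the Power row. The function name is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Byte rate of uncompressed linear PCM audio (assumed storage format)."""
    return sample_rate_hz * bits_per_sample * channels // 8


# Extremes of the microphone input range listed in Table D.1 (mono).
lowest_rate = pcm_bytes_per_second(6400, 8)     # 8-bit at 6.4 kHz
highest_rate = pcm_bytes_per_second(48000, 16)  # 16-bit at 48 kHz

# Upper bound on power draw from "less than 100 mA at 5 V DC".
max_power_watts = 0.100 * 5.0

if __name__ == "__main__":
    print(lowest_rate, highest_rate, max_power_watts)
```

Under these assumptions the microphone input produces between 6,400 and 96,000 bytes per second, and the adapter draws less than 0.5 W from the USB bus.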


References

Abdou, S., Hamid, S. E., Rashwan, M., Samir, M., Abd-Elhamid, O., Shahin, M., and Nazih,W. (2006) Computer Aided Pronunciation Learning System Using Speech Recognition Tech-niques. In Proceedings of International Conference on Speech and Language Processing IN-TERSPEECH

Afify, M., Nguyen, L., Xiang, B., Abdou, S., and Makhoul, J. (2005) Recent Progress in ArabicBroadcast News Transcription at BBN. In Proceedings of International Conference on Speechand Language Processing INTERSPEECH, Lisbon, Portugal, pp. 1637–1640

Afify, M., Sarikaya, R., Kuo, H. J., Besacier, L., and Gao, Y. (2006) On the use of morphologicalanalysis for dialectal Arabic speech recognition. In Proceedings of International Conference onSpeech and Language Processing INTERSPEECH, Pittsburgh, Pennsylvania, pp. 277–280

Alshalabi, R. (2005) Pattern-based Stemmer for Finding Arabic Roots. Information TechnologyJournal 4(1), pp. 38–43

Appen Pty Ltd, Sydney, Australia (2006a) Iraqi Arabic Conversational Telephone Speech. Linguis-tic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2006S45

Appen Pty Ltd, Sydney, Australia (2006b) Gulf Arabic Conversational Telephone Speech. Linguis-tic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2006S43

Appen Pty Ltd, Sydney, Australia (2007) Levantine Arabic Conversational Telephone Speech. Lin-guistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2007S01

Atiyya, M., Choukri, K., and Yaseen, K. (2005) Specifications of the Arabic Written Corpus. Nem-lar project

Barras, C., Geoffroisb, E., Wuc, Z., and Libermanc, M. (2000) Transcriber: Development and useof a tool for assisting speech corpora production. Speech Communication 33(1–2), pp. 5–22

Billa, J., Noamany, M., Srivastava, A., Liu, D., Stone, R., Xu, J., Makhoul, J., and Kubala, F.(2002) Audio Indexing of Arabic Broadcast News. In Proceedings of the IEEE InternationalConference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, pp. 5–8

Buckwalter, T. (2002a) Arabic Transliteration. URL: http://www.qamus.org/transliteration.htmBuckwalter, T. (2002b) Buckwalter Arabic Morphological Analyzer Version 1.0. Linguistic Data

Consortium, University of Pennsylvania, LDC Catalog No.: LDC2002L49Cambridge (2010) HTK—Hidden Markov Model Toolkit—Speech Recognition toolkit. URL:

http://htk.eng.cam.ac.uk/Canavan, A., Zipperlen, G., and Graff, D. (1997) CALLHOME Egyptian Arabic Speech. Linguistic

Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC97S45Clarkson, P., and Rosenfeld, R. (1997) Statistical Language Modeling Using the CMU-Cambridge

Toolkit. In Proceedings of ISCA EurospeechCarnegie Mellon University (2010a) Sphinx—Speech Recognition Toolkit. URL: http://

cmusphinx.sourceforge.net/Carnegie Mellon University-Cambridge (2010b) CMU-Cambridge Statistical Language Modeling

toolkit. URL: http://www.speech.cs.cmu.edu/SLM/toolkit.html

M. Elmahdy et al., Novel Techniques for Dialectal Arabic Speech Recognition,DOI 10.1007/978-1-4614-1906-8, © Springer Science+Business Media New York 2012

103

Page 17: Appendix A Buckwalter Transliteration

104 References

Darwish, K. (2002) Building a shallow Arabic morphological analyzer in one day. In Proceedingsof ACL workshop on computational approaches to semitic languages

Djoudi, M., Fohr, D., and Haton, J. P. (1989) Phonetic study for automatic recognition of Ara-bic. In Proceedings of first European conference on speech communication and technology(Eurospeech), Paris, France, pp. 2268–2271

Djoudi, M., Aouizerat, H., and Haton, J. P. (1990) Phonetic study and recognition of standard Ara-bic emphatic consonants. In Proceedings of First International conference on spoken languageprocessing (ICSLP), Kobe, Japan, pp. 957–960

El-Halees, Y. (1989) A study of subglottal pressure for emphatic and non-emphatic sounds inArabic. In Proceedings of first European conference on speech communication and technology(Eurospeech), Paris, France

Elmahdy, M., Gruhn, R., Minker, W., and Abdennadher, S. (2009a) Survey on Common ArabicLanguage Forms from a Speech Recognition Point of View. In Proceedings of the InternationalConference on Acoustics (NAG-DAGA), Rotterdam, Netherlands, pp. 63–66

Elmahdy, M., Gruhn, R., Minker, W., and Abdennadher, S. (2009b) Effect of Gaussian Densi-ties and Amount of Training Data on Grapheme-Based Acoustic Modeling for Arabic. In Pro-ceedings of the IEEE international conference on natural language processing and knowledgeengineering (IEEE NLP-KE), Dalian, China

Elmahdy, M., Gruhn, R., Minker, W., and Abdennadher, S. (2009c) Modern Standard Arabic BasedMultilingual Approach for Dialectal Arabic Speech Recognition. In International Symposiumon Natural Language Processing (SNLP), Bangkok, Thailand, pp. 169–174

Elmahdy, M., Gruhn, R., Minker, W., and Abdennadher, S. (2010) Cross-Lingual Acoustic Mod-eling for Dialectal Arabic Speech Recognition. In Proceedings of International Conference onSpeech and Language Processing INTERSPEECH, Makuhari, Japan, pp. 873–876

Elmahdy, M., Gruhn, R., Abdennadher, S., and Minker, W. (2011) Rapid Phonetic Transcriptionusing Everyday Life Natural Chat Alphabet Orthography for Dialectal Arabic Speech Recog-nition. In Proceedings of the IEEE International Conference on Acoustics, Speech, and SignalProcessing (ICASSP), Prague, Czech Republic

ELRA: European Language Resources Association (2010) URL: http://www.elra.info/Fegen, C., Steker, S., Soltau, H., Metze, F., and Schultz, T. (2003) Efficient Handling of Multilin-

gual Language Models. In Proceedings of Automatic Speech Recognition and UnderstandingWorkshop (ASRU), St. Thomas, Virgin Islands, pp. 441–446

Ferguson, C. (1959) Diglossia. Word 15, pp. 325–340Fung, P., and Schultz, T. (2008) Multilingual Spoken Language Processing. IEEE Speech Process-

ing Magazine 25(3), pp. 89–97Gal, Y. (2002) An HMM approach to vowel restoration in Arabic and Hebrew. In Proceedings of

the ACL-02 workshop on Computational approaches to semitic languages, USA, Associationfor Computational Linguistics

Gales, M. J. F., Diehl, F., Raut, C. K., Tomalin, M., Woodland, P. C., and Yu, K. (2007) Develop-ment of a Phonetic System for Large Vocabulary Arabic Speech Recognition. In Proceedings ofAutomatic Speech Recognition and Understanding Workshop (ASRU), Kyoto, Japan, pp. 24–29

Gibbon, D., Moore, R., and Winski, R. (1997) SAMPA computer readable phonetic alphabet. InHandbook of Standards and Resources for Spoken Language Systems. Mouton de Gruyter,Berlin. Part IV, section B

Google Labs (2009) Google Transliteration. URL: http://www.google.com/ta3reeb/Google Labs (2010) Google Tashkeel. URL: http://tashkeel.google.comGruhn, R., and Nakamura, S. (2001) Multilingual, Speech Recognition with the CALLHOME

Corpus (ASJ2001), vol. 1. Acoustical Society of Japan, Japan, pp. 153–154Gu, L., Zhang, W., Tahir, L., and Gao, Y. (2007) Statistical Vowelization of Arabic Text for Speech

Synthesis in Speech-to-Speech Translation Systems. In International Conference on Speechand Language Processing INTERSPEECH, Antwerp, Belgium, pp. 1901–1904

Habash, N., and Rambow, O. (2007) Arabic Diacritization through Full Morphological Tagging.In Proceedings of NAACL HLT, pp. 53–56

Page 18: Appendix A Buckwalter Transliteration

References 105

Hassan, Z. M., and Esling, J. H. (2007) Laryngoscopic (Articulatory) and Acoustic Evidence of aPrevailing Emphatic Feature Over the Word in Arabic. In Proceedings of the 16th InternationalCongress of Phonetic Sciences

Hinds, M., and Badawi, E. (2009) A Dictionary of Egyptian Arabic. Librairie du Liban, ReprintedHoles, C. (2004) Modern Arabic: Structures, Functions, and Varieties. Georgetown University

Press, WashingtonHuang, X., Acero, A., and Hon, H. (2001) Spoken language processing: a guide to theory, algo-

rithm, and system development. Prentice Hall, New YorkISO 8859-6 (1987) Information processing—8-bit single-byte coded graphic character sets—Part

6: Latin/Arabic alphabet. International Organization for StandardizationJurafsky, D., and Martin, J. H. (2009) Speech and language processing: An introduction to nat-

ural language processing, computational linguistics, and speech recognition, second edition.Prentice Hall, New York

Kaye, A. S. (1970) Modern Standard Arabic and the Colloquials. Lingua 24, pp. 374–391Kilany, H., Gadalla, H., Arram, H., Yacoub, A., El-Habashi, A., and McLemore, C. (2002) Egyp-

tian Colloquial Arabic Lexicon. Linguistic Data Consortium, University of Pennsylvania, LDCCatalog No.: LDC99L22

Kirchhoff, K., and Vergyri, D. (2005) Cross-Dialectal Data Sharing For Acoustic Modeling inArabic Speech Recognition. Speech Communication 46(1), pp. 37–51

Kirchhoff, K., Bilmes, J., Das, S., Duta, N., Egan, M., Ji, G., He, F., Henderson, J., Liu, D., Noa-many, M., Schone, P., Schwarta, R., and Vergyri, D. (2002) Novel approaches to Arabic speechrecognition: report from the 2002 Johns-Hopkins summer workshop. Technical report, JohnsHopkins University

Lagally, K. (1992) ArabTEX Typesetting Arabic with vowels and ligatures. In Proceedings of theEuroTEX92 conference, Prague

Lamel, L., Messaoudi, A., and Gauvain, J. (2007) Improved Acoustic Modeling for Transcrib-ing Arabic Broadcast Data. In International Conference on Speech and Language ProcessingINTERSPEECH, pp. 2077–2080

Lamere, P., Kwok, P., Gouvea, E. B., Raj, B., Singh, R., Walker, W., and Wolf, P. (2003) The CMUSPHINX-4 speech recognition system. In Proceedings of the IEEE International Conferenceon Acoustics, Speech, and Signal Processing (ICASSP), vol. 46(1), pp. 37–51

Linguistic Data Consortium (LDC) (2010) University of Pennsylvania. URL: http://www.ldc.upenn.edu/

Lee, C., and Gauvain, J. (1993) Speaker Adaptation Based on MAP Estimation of HMM Param-eters. In Proceedings of the IEEE International Conference on Acoustics, Speech, and SignalProcessing (ICASSP), pp. II–558

Leggetter, C. J., and Woodland, P. C. (1995) Maximum likelihood linear regression for speakeradaptation of the parameters of continuous density hidden Markov models. Computer Speechand Language 9, pp. 171–185

Maamouri, M., Graff, D., Jin, H., Cieri, C., and Buckwalter, T. (2004) Dialectal ArabicOrthography-based Transcription and CTS Levantine Arabic Collection. Paper presented atthe Parallel STT-NA Tracks Session of the EARS RT-04 Workshop, Palisades IBM ExecutiveCenter, New York

Maamouri, M., Graff, D., and Cieri, C. (2006) Arabic Broadcast News Transcripts. Linguistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC99L22

Maamouri, M., Buckwalter, T., Graff, D., and Jin, H. (2007) Fisher Levantine Arabic Conversational Telephone Speech. Linguistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2007S02

Maegaard, B., Damsgaard, J. L., Krauwer, S., and Choukri, K. (2004) NEMLAR: Arabic Language Resources and Tools. In Proceedings of Arabic Language Resources and Tools Conference, Cairo, Egypt, pp. 42–54

Makhoul, J., Zawaydeh, B., Choi, F., and Stallard, D. (2005) BBN/AUB DARPA Babylon Levantine Arabic Speech and Transcripts. Linguistic Data Consortium, University of Pennsylvania, LDC Catalog No.: LDC2005S08

Messaoudi, A., Lamel, L., and Gauvain, J. (2004) Transcription of Arabic Broadcast News. In International Conference on Spoken Language Processing (INTERSPEECH), Jeju Island, Korea, pp. 1701–1704

Messaoudi, A., Gauvain, J., and Lamel, L. (2006) Arabic Broadcast News Transcription using a One Million Word Vocalized Vocabulary. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 1, pp. 1093–1096

Microsoft Innovation Lab, Cairo (2009) Microsoft Maren. URL: http://www.microsoft.com/middleeast/egypt/cmic/maren/

Nelken, R., and Shieber, S. M. (2005) Arabic Diacritization Using Weighted Finite-State Transducers. Workshop On Computational Approaches To Semitic Languages 5(2), pp. 79–86

Newman, D. (2002) The Phonetic Status of Arabic within the World’s Languages: The Uniqueness of the Lughat Al-Daad. Antwerp papers in linguistics 100, pp. 65–75

Ney, H., Essen, U., and Kneser, R. (1994) On structuring probabilistic dependencies in stochastic language modeling. Computer Speech and Language 8(1), pp. 1–28

Ng, T., Nguyen, K., Zbib, R., and Nguyen, L. (2009) Improved Morphological Decomposition for Arabic Broadcast News Transcription. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Taipei, Taiwan, pp. 4309–4311

Paulsson, K., Choukri, K., Mostefa, D., DiPersio, D., Glenn, M., and Strassel, S. (2009) A Large Arabic Broadcast News Speech Data Collection. In Proceedings of the Second International Conference on Arabic Language Resources and Tools, Egypt, pp. 280–284

Rabiner, L., and Juang, B. (1993) Fundamentals of Speech Recognition. Prentice Hall, New York

Rabiner, L. R. (1989) A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE 77(2), pp. 257–286

Razak, Z., Ibrahim, N. J., Idris, M. Y. I., Tamil, E. M., Yakub, M., Yusoff, Z. M., and Rahman, N. N. A. (2008) Quranic Verse Recitation Recognition Module for Support in j-QAF Learning: A Review. IJCSNS International Journal of Computer Science and Network Security 8(8), pp. 207–216

RDI (2007) Fassieh. URL: http://www.rdi-eg.com/

Rybach, D., Hahn, S., Gollan, C., Schluter, R., and Ney, H. (2007) Advances in Arabic Broadcast News Transcription At RWTH. In Proceedings of Automatic Speech Recognition and Understanding Workshop (ASRU), Kyoto, Japan, pp. 449–454

The Nemlar project (2005) URL: http://www.nemlar.org/

Sarikaya, R., Emam, O., Zitouni, I., and Gao, Y. (2006) Maximum Entropy Modeling for Diacritization of Arabic Text. In Proceedings of International Conference on Speech and Language Processing INTERSPEECH, pp. 145–148

Schultz, T., and Waibel, A. (2001) Language Independent and Language Adaptive Acoustic Modeling for Speech Recognition. Speech Communication 35, pp. 31–51

Stevens, V., and Salib, M. (2005) A Pocket Dictionary of the Spoken Arabic of Cairo. The American University in Cairo Press, Cairo

Vergyri, D., and Kirchhoff, K. (2004) Automatic diacritization of Arabic for acoustic modeling in speech recognition. In Proceedings of COLING Computational Approaches to Arabic Script-based Languages, Geneva, Switzerland, pp. 66–73

Vergyri, D., Kirchhoff, K., Gadde, R., Stolcke, A., and Zheng, J. (2005) Development of a conversational telephone speech recognizer for Levantine Arabic. In Proceedings of International Conference on Speech and Language Processing INTERSPEECH, Lisboa, pp. 1613–1616

Waibel, A., Geutner, P., Mayfield-Tomokiyo, L., Schultz, T., and Woszczyna, M. (2000) Multilinguality in Speech and Spoken Language Systems. Proceedings of the IEEE, Special Issue on Spoken Language Processing 88(8), pp. 1297–1313

Xiang, B., Nguyen, K., Nguyen, L., Schwartz, R., and Makhoul, J. (2006) Morphological decomposition for Arabic broadcast news transcription. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. I, pp. 1089–1092

Yaghan, M. A. (2008) Arabizi: a contemporary style of Arabic slang. Design Issues 24(2), pp. 39–52

Yaseen, M., Attia, M., Maegaard, B., Choukri, K., Paulsson, N., Haamid, S., Krauwer, S., Bendahman, C., Fersoe, H., Rashwan, M., Haddad, B., Mukbel, C., Mouradi, A., Al-Kufaishi, A., Shahin, M., Chenfour, N., and Ragheb, A. (2006) Building Annotated Written and Spoken Arabic LRs in NEMLAR Project. In Proceedings of International Conference on Language Resources and Evaluation (LREC)

Young, S., Evermann, G., Gales, M., Hain, T., Kershaw, D., Liu, X., Moore, G., Odell, J., Ollason, D., Povey, D., Valtchev, V., and Woodland, P. (1996) The HTK Book. Cambridge University Press, Cambridge

Zitouni, I., Olive, J., Iskra, D., Choukri, K., Emam, O., Gedge, O., Maragoudakis, E., Tropf, H., Moreno, A., Rodriguez, A. N., Heuft, B., and Siemund, R. (2002) ORIENTEL: Speech-Based Interactive Communication applications for the Mediterranean and the Middle East. In Proceedings of International Conference on Speech and Language Processing INTERSPEECH, pp. 325–328

Index

A
Arabic Chat Alphabet, 5, 71

B
Bayes rule, 8
Buckwalter, 25, 72, 87

C
Classical Arabic, 13, 17
Context dependent acoustic model, 9
Context independent acoustic model, 9

D
Data pooling, 39, 61
Diacritization, 3, 19, 20, 25, 53, 55, 71
Dialectal Arabic, 4, 13, 18, 27, 29, 33, 54, 63
Diglossia, 1
Distinct tri-phones coverage, 27

E
Egyptian Colloquial Arabic, 18, 25, 27, 33–35, 60, 72
Emphatic phonemes, 15, 35

F
Fussha, 15

G
Gaussian, 57
Gaussian mixture model, 10, 33, 42, 73
Grapheme-to-phoneme, 19, 23, 32
Graphemic Acoustic Modeling, 53
Graphemic lexicon, 55

H
Hidden Markov Model, 9, 33–35, 42, 55, 56, 73

I
IPA, 17, 30, 73, 79, 80

L
Levantine colloquial Arabic, 25, 30, 67
Lexicon, 10, 29, 55

M
Maximum A-Posteriori, 42, 43
Maximum Likelihood Linear Regression, 42, 43
MFCC, 7, 34, 57
Modern standard Arabic, 1, 2, 4, 13, 15, 25
Morphological analyzer, 25
Morphological complexity, 19
Multilingual acoustic model, 35, 60

N
N-gram, 8, 46, 49

O
Out-of-vocabulary, 2, 19, 71

P
Phoneme set, 30
Phoneme sets normalization, 36
Phonemic acoustic modeling, 33
Phonetic transcription, 15, 19, 71

Q
Quran, 17

R
Real Time Factor, 47

S
SAMPA, 15, 17, 20, 26, 73, 79, 80, 87
Spelling variants, 63
State tying, 9
Supervised adaptation, 41, 61

T
Text-To-Speech, 20
Tied-states, 9, 34, 39, 57, 74

U
Unicode, 87
Unsupervised adaptation, 44, 63