DEVELOPMENT OF SHAHMUKHI TO GURMUKHI...

References

1 Chinnakotla, M.K., Damani, O.P., Satoskar, A. 2010. Transliteration for

Resource-Scarce Languages. ACM Transactions on Asian Language

Information Processing (TALIP), Vol. 9:4, pp. 1-30.

2 Oh, J.-H., Choi, K.-S., Isahara, H. 2006. A comparison of different Machine

Transliteration Models. Journal of Artificial Intelligence Research, Vol. 27,

pp. 119-151.

3 Knight, K., Graehl, J. 1997. Machine transliteration. In Proceedings of the

35th Annual Meeting of the Association for Computational Linguistics. pp.

128-135.

4 Al-Onaizan, Y., Knight, K. 2002. Machine transliteration of names in Arabic

text. In Proceedings of the ACL workshop on Computational approaches to

Semitic languages. Philadelphia, PA, pp. 1–13.

5 Fujii, A., Ishikawa, T. 2001. Japanese/English Cross-Language Information

Retrieval: exploration of query translation and transliteration. Computers and

the Humanities, Vol. 35, pp. 389-420.

6 Lin, W.-H., Chen, H.-H. 2002. Backward machine transliteration by learning

phonetic similarity. In Proceedings of the 6th Conference on Natural Language

Learning. Taipei, Taiwan, pp. 1–7.

7 Kawtrakul, A., Deemagarn, A., Thumkanon, C., Khantonthong, N.,

McFetridge, P. 1998. Backward transliteration for Thai document retrieval. In

Proceedings of IEEE Asia-Pacific Conference. pp. 563–566.

8 Lee, J.S., Choi, K.-S. 1998. English to Korean Statistical Transliteration for

Information Retrieval. Computer Processing of Oriental languages, Vol. 12,

pp. 17-37.

9 Jeong, K.S., Myaeng, S. H., Lee, J. S., Choi, K.-S. 1999. Automatic

identification and back-transliteration of foreign words for information

retrieval. Information Processing and Management, Vol. 35, pp. 523–540.

10 Kim, J. J., Lee, J. S., Choi, K.-S. 1999. Pronunciation unit based automatic

English-Korean transliteration model using neural network. In Proceedings of

Korea Cognitive Science Association. pp. 247–252.

11 Li, H., Zhang, M., Su, J. 2004. A joint source-channel model for machine

transliteration. In Proceedings of the 42nd Annual Meeting of the Association

for Computational Linguistics. Barcelona, Spain, pp. 159–166.

12 Kang, B. J., Choi, K.-S. 2000. Automatic transliteration and back-

transliteration by decision tree learning. In Proceedings of Conference on

Language Resources and Evaluation. Athens, Greece, pp. 1135–1141.

13 Theeramunkong, T., Usanavasin, S. 2001. Non-dictionary-based Thai word

segmentation using decision trees. In Proceedings of the first international

conference on Human language technology research (HLT '01) ACL.

Stroudsburg, PA, USA, pp. 1-5.

14 Jung, Y., Lee, D., Yoon, A., Kwon, H.-C. 2004. Transliteration system for

Arabic-Numeral Expressions using decision tree for intelligent Korean TTS,

30th Annual Conference of IEEE, Vol. 1, pp. 657–662.

15 Kang, I. H., Kim, G. 2000. English-to-Korean transliteration using multiple

unbounded overlapping phoneme chunks. In Proceedings of the 18th

Conference on Computational Linguistics. Germany, pp. 418–424.

16 Goto, I., Kato, N., Uratani, N., Ehara, T. 2003. Transliteration considering

Context Information based on the Maximum Entropy Method. In Proceedings

of the MT-Summit IX. New Orleans, USA, pp. 125–132.

17 Karimi, S., Turpin, A., Scholer, F. 2006. English to Persian transliteration. In

Proceedings of String Processing and Information Retrieval. Lecture Notes in

Computer Science, Vol. 4209. Glasgow, UK, pp. 255–266.

18 Brown, P.F., Della Pietra, S.A., Della Pietra, V. J., Mercer, R. L. 1993. The

mathematics of statistical machine translation: Parameter estimation.

Computational Linguistics, Vol. 19:2, pp. 263-311.

19 Virga, P., Khudanpur, S. 2003. Transliteration of proper names in cross-

language applications. In Proceedings of the 26th Annual International ACM

SIGIR Conference on Research and Development in Information Retrieval.

Toronto, Canada, pp. 365–366.

20 Ekbal, A., Naskar, S. K., Bandyopadhyay, S. 2006. A modified joint source-

channel model for transliteration. In Proceedings of the 21st International

Conference on Computational Linguistics and the 44th Annual Meeting of the

ACL. Sydney, Australia, pp. 191–198.

21 Knight, K., Morgan, K., Graehl, J. 1998. Machine Transliteration.

Computational Linguistics, Vol. 24:4, pp. 599-612.

22 Stalls, B., Knight, K. 1998. Translating names and technical terms in Arabic

text. In Proceedings of the COLING/ACL Workshop on Computational

Approaches to Semitic Languages. Montreal, Canada, pp. 34–41.

23 Jung, S. Y., Hong, S. L., Paek, E. 2000. An English to Korean transliteration

model of extended Markov window. In Proceedings of the 18th Conference on

Computational linguistics. Germany, pp. 383–389.

24 Meng, H.M., Lo, W-K., Chen, B., Tang, K. 2001. Generate phonetic cognates

to handle name entities in English-Chinese cross-language spoken document

retrieval. In Proceedings of the IEEE workshop on Automatic Speech

Recognition and Understanding. Madonna di Campiglio, Italy, pp. 311-314.

25 Wutiwiwatchai, C., Thangthai, A. 2010. Syllable-based Thai-English machine

transliteration. In Proceedings of the 2010 Named Entities Workshop (NEWS

'10). Stroudsburg, PA, USA, pp. 66-70.

26 Jiang, X., Sun, L., Zhang, D. 2009. A syllable based name transliteration

system. In Proceedings of the 2009 Named Entities Workshop (NEWS '09).

ACL-IJCNLP, pp. 96-99.

27 Arbabi, M., Fischthal, S. M., Cheng, V. C., Bart, E. 1994. Algorithms for

Arabic name transliteration. IBM Journal of Research and Development, Vol.

38:2, pp. 183-194.

28 Lee, J. S. 1999. An English to Korean transliteration and re-transliteration

model for CLIR. Ph.D. Thesis, Computer Science Department, KAIST.

29 Bilac, S., Tanaka, H. 2004. A hybrid back-transliteration system for Japanese.

In Proceedings of the 20th international conference on Computational

Linguistics (COLING). Geneva, Switzerland, pp. 597–603.

30 Bilac, S., Tanaka, H. 2004. Improving back-transliteration by combining

information sources. In Proceedings of First International Joint Conference

on Natural Language Processing. Hainan Island, pp. 542–547.

31 Bilac, S., Tanaka, H. 2005. Direct combination of spelling and pronunciation

information for robust back-transliteration. In Proceedings of Conferences on

Computational Linguistics and Intelligent Text Processing. Mexico City,

Mexico, pp. 413–424.

32 Oh, J.-H., Choi, K.-S., Isahara, H. 2006. A machine transliteration model

based on correspondence between graphemes and phonemes. ACM

Transactions on Asian Language Information Processing (TALIP), Vol. 5, pp.

185–208.

33 Karimi, S., Scholer, F., Turpin, A. 2011. Machine Transliteration Survey.

ACM Computing Surveys, Vol. 43, pp. 1-57.

34 Nelken, R., Shieber, S. M. 2005. Arabic Diacritization Using Weighted Finite-

State Transducers. In Proceedings of ACL Workshop on Computational

Approaches to Semitic Languages. Ann Arbor, Michigan, pp. 79-86.

35 Ananthakrishnan, S., Narayanan, S., Bangalore, S. 2005. Automatic

diacritization of Arabic transcripts for automatic speech recognition. In

Proceedings of ICON-05. Kanpur, India.

36 Wellisch, H. H. 1977. The Conversion of Scripts, its Nature, History and

Utilization. New York: Wiley Publication

37 ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman

Scripts. 1997. Library of Congress, Washington, D.C. Retrieved May 12,

2010, from http://www.loc.gov/catdir/cpso/roman.html

38 ISO: 15919 Standards. 2001. International standard for Transliteration of

Devanagari and related Indic scripts into Latin characters. Retrieved May 12,

2010, from http://webstore.ansi.org/

39 Chopde, A. 1991. ITRANS: Encoding and Indian Language Transliteration

Package. Retrieved May 12, 2010, from http://www.aczoom.com/itrans/

40 Krishnan, R.K. 2005. Acharya: A Multilingual Computing for Literacy and

Education. SDL, IIT Madras, India. Retrieved May 12, 2010, from

http://acharya.iitm.ac.in

41 Madhavi, G., Balakrishnan, M., Balakrishnan, N., Reddy, R. 2005. Om: One

tool for many (Indian) languages. Journal of Zhejiang University Science. pp.

1348-1353.

42 IAST: International Alphabet of Sanskrit Transliteration. Retrieved May 12,

2010, from http://www.omniglot.com/writing/sanskrit.htm

43 National Library at Kolkata Romanization, Retrieved May 12, 2010, from

http://en.wikipedia.org/wiki/National_Library_at_Kolkata_romanization

44 Harvard-Kyoto Conversion, Retrieved May 12, 2010, from

http://en.wikipedia.org/wiki/Harvard-Kyoto;

http://www.encyclo.co.uk/define/Harvard-Kyoto

45 Jayaraman, A., Sangani, S., Ganapathiraju, M. 2004. OmSE: Tamil Search

Engine. In Proceedings of Tamil Internet conference. Singapore.

46 Arokia, R. A., Maganti, H. 2009. Transliteration based Search Engine for

Multilingual Information Access. In Proceedings of CLIAWS3, Third

International Cross Lingual Information Access Workshop. Boulder,

Colorado, pp. 12–20.

47 Sinha, R.M.K. 2009. A Journey from Indian Scripts Processing to Indian

Language Processing. Annals of the History of Computing, IEEE, Vol. 31:1,

pp. 8-31.

48 The Unicode Standard, Version 4.0. 2003. The Unicode Consortium, Addison-

Wesley, Boston, MA.

49 Indic Scripts and Languages. 2011. Frequently Asked Questions, Unicode,

Inc. Retrieved March 11, 2011, from http://unicode.org/faq/indic.html

50 Sidhu, S. 2006. ISCII to Unicode Conversion Issues for Gurmukhi. Unicode

Technical Note #30, Unicode, Inc. Retrieved May 12, 2010, from

http://unicode.org/notes/tn30/utn30-gurmukhi.pdf

51 Vishwabharat@tdil: Language Technology Flash. 2008. Technology

Development for Indian Languages Programme, Department of IT, New

Delhi. Retrieved May 21, 2010, from http://tdil.mit.gov.in/jan-dec-2008.htm

52 Wan, S., Verspoor, C. 1998. Automatic English-Chinese name transliteration

for development of multilingual resources. In Proceedings of the 17th

International Conference on Computational linguistics. Montreal, Canada, pp.

1352–1356.

53 Darwish, K., Doermann, D., Jones, R., Oard, D., Rautiainen, M. 2001. TREC-

10 Experiments at University of Maryland CLIR and Video. In Proceedings of

the Tenth Text Retrieval Conference TREC-2001. NIST. Retrieved February

11, 2008, from http://trec.nist.gov/pubs/trec10/t10_proceedings.html

54 Abduljaleel, N., Larkey, S. 2003. Statistical transliteration for English-Arabic

cross language information retrieval. In Proceedings of the 12th International

Conference on Information and Knowledge Management. New Orleans, pp.

139-146.

55 Buckwalter, T. 2002. Buckwalter Arabic Transliteration. Retrieved February

22, 2007, from http://www.qamus.org/transliteration.htm

56 Buckwalter, T. 2004. Issues in Arabic orthography and morphology analysis.

In Proceedings of Computational Approaches to Arabic Script based

Languages, COLING 2004. A. Farghaly and K. Megerdoomian, Eds. Geneva,

Switzerland, pp. 31–34.

57 Oh, J.-H., Choi, K.-S. 2002. An English-Korean transliteration model using

pronunciation and contextual rules. In Proceedings of the 19th International

Conference on Computational linguistics. Taipei, Taiwan, pp. 758-764.

58 Qu, Y., Grefenstette, G., Evans, D.A. 2003. Automatic transliteration for

Japanese-to-English text retrieval. In Proceedings of the 26th Annual

International ACM SIGIR Conference on Research and Development in

Information Retrieval. pp. 353-360.

59 Gao, W., Wong, K.-F., Lam, W. 2004. Improving transliteration with precise

alignment of phoneme chunks and using contextual features. In Proceedings

of Information Retrieval Technology, Asia Information Retrieval Symposium.

Lecture Notes in Computer Science, Vol. 3411, Beijing, China, pp. 106–117.

60 Gao, W., Wong, K.-F., Lam, W. 2004. Phoneme-based transliteration of

foreign names for OOV problem. In Proceedings of the 1st International Joint

Conference on Natural Language Processing. Sanya City, China, pp. 374–

61 Oh, J.-H., Choi, K.-S. 2005. Machine learning based English to Korean

transliteration using grapheme and phoneme information. IEICE Transactions

on Information and Systems E88-D, Vol. 7, pp. 1737–1748.

62 Levenshtein, V. I. 1966. Binary codes capable of correcting deletions,

insertions and reversals. Soviet Physics Doklady, Vol. 10, pp. 707-710.

63 Oh, J.-H., Choi, K.-S. 2005. An Ensemble of Grapheme and Phoneme for

Machine Transliteration. In Proceedings of 2nd International Joint

Conference on Natural Language Processing. Jeju Island, pp. 450-461.

64 Huang, F. 2005. Cluster-specific named entity transliteration. In Proceedings

of the Human Language Technology Conference and Conference on Empirical

Methods in Natural Language Processing. Vancouver, Canada, pp. 435-442.

65 Oh, J.-H., Choi, K.-S., Isahara, H. 2006. A hybrid model for extracting

transliteration equivalents from parallel corpora. In Proceedings of 9th

International Conference of the Text, Speech and Dialogue. Brno, Czech

Republic, pp. 119–126.

66 Oh, J.-H., Choi, K.-S., Isahara, H. 2006. Improving machine transliteration

performance by using multiple transliteration models. In Proceedings of the

21st International Conference on Computer Processing of Oriental

Languages. Singapore, pp. 85–96.

67 Zelenko, D., Aone, C. 2006. Discriminative methods for transliteration. In

Proceedings of the 2006 Conference on Empirical Methods in Natural

Language Processing. Sydney, Australia, pp. 612–617.

68 Sherif, T., Kondrak, G. 2007. Bootstrapping a stochastic transducer for

Arabic-English transliteration extraction. In Proceedings of the 45th Annual

Meeting of the Association of Computational Linguistics. Czech Republic, pp.

864–871.

69 Ristad, E. S., Yianilos, P. N. 1998. Learning string edit distance. IEEE

Transactions on Pattern Analysis and Machine Intelligence, Vol. 20:5, pp.

522–532.

70 Freeman, A., Condon, S., Ackerman, C. 2006. Cross linguistic name matching

in English and Arabic. In Proceedings of Human Language Technology

Conference of the NAACL. New York City, USA, pp. 471–478.

71 Kondrak, G. 2000. A new algorithm for the alignment of phonetic sequences.

In Proceedings of the First Meeting of the North American Chapter of the

Association for Computational Linguistics. pp. 288–295.

72 Yoon, S.-Y., Kim, K.-Y., Sproat, R. 2007. Multilingual Transliteration using

Feature based Phonetic Method. In Proceedings of 45th Annual Meeting of the

ACL. Prague, pp. 112-119.

73 David, M. 2007. Machine Transliteration of Proper Names. Master’s thesis,

School of Informatics, University of Edinburgh.

74 Jiang, L., Zhou, M., Chien, L., Niu, C. 2007. Named Entity Translation with

Web Mining and Transliteration. In Proceedings of the 20th International

Joint Conference on Artificial Intelligence. pp. 1629–1634.

75 Zhao, B., Bach, N., Lane, I., Vogel, S. 2007. A Log-linear Block

Transliteration Model based on Bi-Stream HMMs. In Proceedings of NAACL

HLT. pp. 364–371.

76 Karimi, S., Scholer, F., Turpin, A. 2007. Collapsed consonant and vowel

models: New approaches for English-Persian transliteration and back-

transliteration. In Proceedings of the 45th Annual Meeting of the Association

for Computational Linguistics. Prague, Czech Republic, pp. 648–655.

77 Oh, J.-H., Isahara, H. 2007. Machine transliteration using multiple

transliteration engines and hypothesis re-ranking. In Proceedings of the 11th

Machine Translation Summit. Copenhagen, Denmark, pp. 353–360.

78 Al-Onaizan, Y., Knight, K. 2002. Translating named entities using

monolingual and bilingual resources. In Proceedings of the 40th Annual

Meeting of the Association for Computational Linguistics. Philadelphia, PA,

pp. 400–408.

79 Goldwasser, D., Roth, D. 2008. Active sample selection for named entity

transliteration. In Proceedings of ACL-08: HLT. Short Papers, Columbus,

Ohio, pp. 53-56.

80 Goldberg, Y., Elhadad, M. 2008. Identification of transliterated foreign words

in Hebrew script. In Proceedings of Conference on Intelligent Text Processing

and Computational Linguistics. A. Gelbukh Eds. LNCS 4919, pp. 466–477.

81 Hermjakob, U., Knight, K., Daumé-III, H. 2008. Name translation in statistical

machine translation learning when to transliterate. In Proceedings of ACL-08:

HLT. Columbus, Ohio, pp. 389–397.

82 Kirschenbaum, A., Wintner, S. 2009. Lightly Supervised Transliteration for

Machine Translation. In Proceedings of 12th Conference of the European

Chapter of the ACL. Athens, pp. 433-441.

83 Karimi, S. 2008. Machine transliteration of proper names between English

and Persian. Ph.D. thesis, RMIT University, Melbourne, Australia.

84 Deselaers, T., Hasan, S., Bender, O., Ney, H. 2009. A Deep Learning

Approach to Machine Transliteration. In Proceedings of 4th Workshop on

Statistical Machine Translation, the 12th Conference of EACL. Athens, pp.

233-241.

85 Collobert, R., Weston, J. 2008. A unified architecture for natural language

processing: deep neural networks with multitask learning. In Proceedings of

the 25th International Conference on Machine Learning (ICML '08). ACM,

New York, NY, USA, pp. 160-167.

86 Finch, A., Sumita, E. 2010. Transliteration using a phrase-based statistical

machine translation system to re-score the output of a joint multigram model.

In Proceedings of the 2010 Named Entities Workshop (NEWS '10). ACL,

87 Kalyanasundaram, K. 1997. An Overview of Different Tools for Word-

Processing of Tamil and a Proposal Towards Standardisation. In Proceedings

of International Symposium for Tamil Information Processing and Resources

on the Internet. National University of Singapore, Singapore, pp. 17-18.

88 UTRANS – Transliteration tool. 2002. C-DAC Annual Report 1999-2000.

Retrieved February 17, 2011, from http://www.cdac.in/html/pdf/annual

report.pdf

89 Enhanced Transliteration. 2008. C-DAC Pune, Maharashtra. Retrieved

February 17, 2011, from

http://pune.cdac.in/html/gist/products/enhanced_translit.aspx

90 Mobile Computing: Embedding Indian languages into Cellular Phones. 2011.

C-DAC Pune, Maharashtra. Retrieved February 17, 2011, from

http://pune.cdac.in/html/gist/lang_tools/embedded/mobile.aspx

91 Sasikumar, M., Dadhekar, A., Pimpale, P., More, P., Nikumbh, S., Mukherjee,

A. 2009, March 17. Xlit: Tool for Transliteration between English and Indian

Language, C-DAC Mumbai. Retrieved February 17, 2011, from

http://www.cdacmumbai.in/xlit

92 Rao, K.V. 2003. Rice Inverse Transliterator (RIT) Software. Retrieved

February 17, 2011, from http://www.teluguworld.org/RIT/rit.html

93 Prasad, A. 2010. SANSCRIPT: South Asian Network Software Conferring

Relatively Immediate Phonetic Transliteration. Retrieved February 19, 2011,

from http://www.learnsanskrit.org/tools/sanscript.php

94 Verma, R.K., Lehal, G.S. 2006. GTrans - Gurmukhi to Roman Transliteration,

Punjabi University, Patiala, India. Retrieved February 17, 2007, from

www.learnpunjabi.org/gtrans/index.asp

95 Google Input Method: Type anywhere in your language. 2011. Google Inc.,

CA, United States. Retrieved February 17, 2011, from

http://www.google.com/ime/transliteration/help.html

96 UzZaman, N., Zaheen, A., Khan, M. 2006. A Comprehensive Roman

(English) to Bangla Transliteration Scheme, In Proceedings of International

Conference on Computer Processing on Bangla (ICCPB-2006). Dhaka,

Bangladesh.

97 Vijaya, M.S., Dhanalakshmi, V., Shivapratap, G., Ajith, V.P., Soman, K.P.

2008. Sequence Labeling Approach for English to Tamil Transliteration using

Memory based Learning, In Proceedings of 6th International Conference on

Natural Language Processing ICON-2008. Pune, India.

98 Daelemans, W., Zavrel, J., Berck, P., Gillis, S. 1996. MBT: A Memory-Based

Part of speech Tagger Generator. In Proceedings of the Fourth Workshop on

Very Large Corpora. Copenhagen: ACL SIGDAT, pp. 14-27.

99 Surana, H., Singh, A.K. 2008. A More Discerning and Adaptable Multilingual

Transliteration Mechanism for Indian Languages. In Proceedings of the

International Joint Conference on Natural Language Processing’08. Asian

Federation of Natural Language Processing. Hyderabad, India.

100 Malik, M. G. A., Boitet, C., Bhattacharyya, P. 2008. Hindi Urdu Machine

Transliteration using Finite-state Transducers. In Proceedings of 22nd

International Conference on Computational Linguistics (COLING).

Manchester, UK, pp. 537-544.

101 Ganesh, S., Harsha, S., Pingali, P., Verma, V. 2008. Statistical Transliteration

for Cross Language Information Retrieval using HMM Alignment and CRF.

In Proceedings of Workshop on CLIA, Addressing the Needs of Multilingual

Societies. IJCNLP, pp. 42-47.

102 Chinnakotla, M.K., Ranadive, S., Damani, O.P., Bhattacharyya, P. 2008.

Hindi and Marathi to English Cross Language Information. In Proceedings of

the 2nd International Workshop on Cross Lingual Information Access,

Addressing the Information Need of Multilingual Societies. pp. 64-71.

103 Lehal, G.S. 2009. A Gurmukhi to Shahmukhi Transliteration System. In

Proceedings of ICON-2009: 7th International Conference on Natural

Language Processing. Macmillan Publishers, India.

104 Haque, R., Dandapat, S., Srivastava, A.K., Naskar, S.K., Way, A. 2009.

English-Hindi Transliteration Using Context-Informed PB-SMT: the DCU

system for NEWS 2009. In Proceedings of workshop on Named Entities

(NEWS-09), Joint conference of the 47th Annual Meeting of the ACL and the

4th International Joint Conference on Natural Language Processing.

ACL/IJCNLP, Singapore, pp. 104-107.

105 Varadarajan, B., Rao, D. 2009. ε-extension Hidden Markov Models and

weighted transducers for machine transliteration. In Proceedings of the 2009

Named Entities Workshop: Shared Task on Transliteration (NEWS '09).

Association for Computational Linguistics. Stroudsburg, PA, USA, pp. 120-

106 Das, A., Ekbal, A., Mandal, T., Bandyopadhyay, S. 2009. English to Hindi

machine transliteration system at NEWS 2009. In Proceedings of the 2009

Named Entities Workshop: Shared Task on Transliteration (NEWS '09).

107 Chinnakotla, M.K., Damani, O.P. 2009. Experiences with English-Hindi,

English-Tamil and English-Kannada Transliteration Tasks at NEWS 2009. In

Proceedings of workshop on Named Entities (NEWS-09), Joint Conference of

the 47th Annual Meeting of the ACL and the 4th International Joint

Conference on Natural Language Processing. ACL/IJCNLP, Singapore, pp.

44-47.

108 Bushra, J., Tafseer, A. 2009. Hindi to Urdu conversion: beyond simple

transliteration. In Proceedings of Conference on Language and Technology

2009. Lahore, Pakistan, pp. 24-31.

109 Hindi to Urdu Transliterator. 2009. Center for Research in Urdu Language

Processing (CRULP), Lahore. Retrieved April 9, 2010, from

http://www.crulp.org/software/langproc/h2utransliterator.html

110 Malik, M.G.A. 2006. Hindi Urdu Machine Transliteration System. MS thesis,

Department of Linguistics, University of Paris 7, Paris

111 Lehal, G.S., Saini, T.S. 2010. A Hindi to Urdu Transliteration System, In

Proceedings of ICON-2010: 8th International Conference on Natural

Language Processing. Macmillan Publishers, India.

112 Antony, P.J., Ajith, V.P., Soman, K.P. 2010. Kernel Method for English to

Kannada Transliteration. In Proceedings of International Conference on

Recent Trends in Information, Telecommunication and Computing (ITC).

Kochi, Kerala, pp. 336-338.

113 Saravanan, K., Udupa, R., Kumaran, A. 2010. Cross Lingual Information

Retrieval System Enhanced with Transliteration Generation and Mining, In

Proceedings of the Forum for Information Retrieval Evaluation (FIRE-2010)

Workshop. Kolkata, India. Retrieved December 29, 2010, from

http://research.microsoft.com/apps/pubs/default.aspx?id=120228

114 Khapra, M.M., Kumaran, A., Bhattacharyya, P. 2010. Everybody Loves a

Rich Cousin: An Empirical Study of Transliteration through Bridge

Languages. In Proceedings of Human Language Technologies: The 2010

Annual Conference of the North American Chapter of the Association for

Computational Linguistics (HLT '10). Stroudsburg, PA, USA, pp. 420-428.

115 Kumaran, A., Khapra, M.M., Bhattacharyya, P. 2010. Compositional Machine

Transliteration. ACM Transactions on Asian Language

Information Processing (TALIP), Vol. 9:4, pp. 1-29.

116 Gal, Y. 2002. An HMM Approach to Vowel Restoration in Arabic and

Hebrew, In Proceedings of ACL Workshop on Computational Approaches to

Semitic Languages. pp. 27-33.

117 Levinger, M., Itai, A., Ornan, U. 1995. Learning Morpho-lexical probabilities

from an untagged corpus with an application to Hebrew. Computational

Linguistics, Vol. 21:3, pp. 383-404.

118 Beesley, K.R. 1996. Arabic finite-state morphological analysis and generation.

In Proceedings of the 16th conference on Computational linguistics (COLING

'96). ACL, Stroudsburg, PA, USA, pp. 89-94.

119 El-Sadany, T., Hashish, M. 1988. Semi-automatic vowelization of Arabic

verbs. In Proceedings of 10th NC Conference. Jeddah, Saudi Arabia.

120 El-Imam, Y. 2003. Phonetization of Arabic: rules and algorithms. Computer

Speech and Language, Vol. 18, pp. 339-373

121 Kirchhoff, K., Bilmes, J., Das, S., Duta, N., Egan, M., Ji, G., He, F.,

Henderson, J., Liu, D., Noamany, M., Schone, P., Schwartz, R.,

Vergyri, D. 2002. Novel approaches to Arabic speech recognition - final

report from the JHU summer workshop 2002. Technical report, Johns Hopkins

University.

122 Vergyri, D., Kirchhoff, K. 2004. Automatic diacritization of Arabic for

acoustic modeling in speech recognition. In Proceedings of COLING and

Computational Approaches to Arabic Script-based Languages. Geneva,

Switzerland, pp. 66-73.

123 Emam, O., Fisher, V. 2004. A hierarchical approach for the statistical

vowelization of Arabic text. Technical report, IBM patent filed, DE9-2004-

0006, US patent application US2005/0192809 A1.

124 Elshafei, M., Al-Muhtaseb, H., Alghamdi, M. 2006. Statistical Methods for

Automatic Diacritization of Arabic Text. In Proceedings of the Saudi 18th

National Computer Conference. Riyadh, Vol. 18, pp. 301-306.

125 Zitouni, I., Sorensen, J.S., Sarikaya, R. 2006. Maximum entropy based

restoration of Arabic diacritics. In Proceedings of the 21st International

Conference on Computational Linguistics and 44th Annual Meeting of the

Association for Computational Linguistics. Sydney, Australia, pp. 577-584.

126 Habash, N., Rambow, O. 2007. Arabic Diacritization through Full

Morphological Tagging. In Proceedings of NAACL HLT, ACL Companion

Volume. Rochester, NY, pp. 53–56.

127 Habash, N., Rambow, O. 2005. Arabic tokenization, Part-of-Speech tagging

and morphological disambiguation in one fell swoop. In Proceedings of the

43rd Annual Meeting on Association for Computational Linguistics (ACL '05).

128 Kübler, S., Mohamed, E. 2008. Memory-based vocalization of Arabic. In

Proceedings of the LREC Workshop on HLT and NLP within the Arabic

World. Marrakech, Morocco.

129 Daelemans, W., Zavrel, J., Sloot, K.V.D., Bosch, A.V.D. 2007. TiMBL:

Tilburg memory based learner reference guide. Technical Report ILK 07-07,

Induction of Linguistic Knowledge, Computational Linguistics, Tilburg

University.

130 Shaalan, K., Hitham, M.A.B., Ziedan, I. 2009. A hybrid approach for building

Arabic diacritizer. In Proceedings of the EACL 2009 Workshop on

Computational Approaches to Semitic Languages (Semitic '09). Association

for Computational Linguistics. Stroudsburg, PA, USA, pp. 27-35.

131 Rashwan, M.A.A., Al-Badrashiny, M., Attia, M., Abdou, S.M. 2009. A

Hybrid System for Automatic Arabic Diacritization. In Proceedings of the

Second International Conference on Arabic Language Resources and Tools.

Cairo, Egypt, pp. 54-60.

132 Haertel, R.A., McClanahan, P., Ringger, E.K. 2010. Automatic Diacritization

for Low-Resource Languages Using a Hybrid Word and Consonant CMM. In

Proceedings of Human Language Technologies: The 2010 Annual Conference

of the North American Chapter of the ACL. Los Angeles, California, pp. 519–

133 Malik, M.G.A. 2006. Punjabi Machine Transliteration. In Proceedings of the

21st International Conference on Computational Linguistics and 44th Annual

Meeting of the ACL. pp. 1137-1144.

134 Lewis, M. P. (Ed.) 2009. Ethnologue: Languages of the World, Sixteenth

edition. Online version, Dallas, Tex.: SIL International. Retrieved February

22, 2010, from http://www.ethnologue.org

135 Clews, J. 1997. Digital Language Access: Scripts, Transliteration and

Computer Access. D-Lib Magazine. Retrieved December 22, 2006, from

http://www.dlib.org/dlib/march97/sesame/03clews.html

136 Nasir, R. Punjabi Research and Criticism: A Brief Study. Punjabi Department,

Govt. M.A.O College, Lahore, Pakistan. Retrieved February 27, 2011, from

http://www.apnaorg.com/research-papers/nasir-rana-1/

137 Sekhon, S. S. 1996. A History of Panjabi Literature, Publication Bureau,

Punjabi University, Patiala, Vol. 1 & 2, Punjab, India.

138 Singh, H. 1997. Medieval Indian Literature: An Anthology. Paniker K.

Ayyappa, Ed. Sahitya Akademi Publication, Vol. 2, pp. 417-452.

139 Quraishi, W. 1987. Punjabi Adab di Traqqi Wich Angraizan da Hissa. Six

monthly (July-December) “Kohl”, No. 19, Lahore, pp. 12-13.

140 Malik, S. 1991. Punjabi kitabyiat. Pakistan Academy of Letters. Islamabad,

Vol. 1, pp. “ر”.

141 Gill, H., Gleason, H. A. 1963. A Reference Grammar of Punjabi. Publication

Bureau, Punjabi University, Patiala, India

142 Durrani, N., Hussain, S. 2010. Urdu Word Segmentation. In Proceedings of

the 2010 Annual Conference of the North American Chapter of the ACL. Los

Angeles, California, pp. 528–536.

143 Hussain, S., Afzal, M. 2001. Urdu Computing Standards: Urdu Zabta Takhti

(UZT) 1.01. In Proceedings of IEEE International Multi-Topic Conference.

Lahore, pp. 223-228.

144 Durrani, N. 2007. Typology of Word and Automatic Word Segmentation in

Urdu Text Corpus. MS Thesis, National University of Computer and

Emerging Sciences, Lahore, Pakistan.

145 Shannon, C.E. 1948. A mathematical theory of communication. Bell System

Technical Journal, Vol. 27, pp. 379-423.

146 Shannon, C.E. 1948. A mathematical theory of communication. Bell System

Technical Journal, Vol. 27, pp. 623-656.

147 Yannakoudakis, E.J., Angelidakis, J. 1988. An insight into the entropy and

redundancy of English dictionary. IEEE Transactions on Pattern Analysis and

Machine Intelligence PAMI, Vol. 10, pp. 960-970.

148 Rashid, A., Lehal, G.S. 2005. Development of Urdu word processing

components. MS Thesis, DCS, Punjabi University, Patiala, India.

149 Davis, M., Whistler, K. (Eds.) 2010. Unicode Normalization Forms, Technical

Reports, Unicode Standard Annex #15, Revision 33. Retrieved February 22,

2011, from http://www.unicode.org/reports/tr15/tr15-33.html

150 Urdu Normalization Utility v1.0. 2009. Centre for Research in Urdu Language

Processing, National University of Computer and Emerging Sciences, Lahore,

Pakistan. Retrieved April 14, 2011 from

http://www.crulp.org/software/langproc/urdunormalization.htm

151 Wittenburg, P., Peters, W., Drude, S. 2002. Analysis of Lexical Structures

from Field Linguistics and Language Engineering. In Proceedings of LREC

2002 Conference. Las Palmas, May, pp. 682-686.

152 McEnery, T., Wilson, A. 1993. Corpora and Translation: Uses and Future

Prospects. Retrieved April 14, 2011, from

http://ucrel.lancs.ac.uk/tech_papers.html

153 Leech, G. 1992. Corpus Annotation Schemes. In Proceedings of Pisa

Workshop on European Corpus Resources.

154 Johansson, S. 2007. Seeing through Multilingual Corpora: on the use of

corpora in contrastive studies, Vol. 26, John Benjamins Publishing Company,

Philadelphia, USA.

155 Baker, J.P., McEnery, A.M., Leisher, M., Cunningham, H., Gaizauskas, R.

2000. Mapping multiple South Asian 8 bit character sets to the Unicode

standard. In Proceedings of the Linguistic Exploration Workshop. University

of Pennsylvania. Retrieved April 16, 2007, from

http://www.ldc.upenn.edu/exploration/expl2000/papers/mcenery/mcenery.pdf

156 Papageorgiou, C.P. 1994. Japanese Word segmentation by hidden Markov

model. In Proceedings of the HLT Workshop. pp. 283–288.

157 Xu, J., Matusov, E., Zens, R., Ney, H. 2005. Integrated Chinese Word

Segmentation in Statistical Machine Translation. In Proceedings of the

International Workshop on Spoken Language Translation. Pittsburgh, PA, pp.

141–147.

158 Akram, M., Hussain, S. 2010. Word Segmentation for Urdu OCR System. In

Proceedings of the 8th Workshop on Asian Language Resources. Beijing,

China, pp. 88-94.

159 Sproat, R., Shih, C., Gale, W., Chang, N. 1996. A stochastic finite state word

segmentation algorithm for Chinese. Computational Linguistics, Vol. 22:3, pp.

377–404.

160 Lehal, G.S. 2010. A Word Segmentation System for Handling Space Omission

Problem in Urdu Script, In Proceedings of the 1st Workshop on South and

Southeast Asian Natural Language Processing (WSSANLP) and 23rd COLING.

Beijing, pp. 43–50.

161 Lehal, G.S. 2009. A Two Stage Word Segmentation System for Handling

Space Insertion Problem In Urdu Script, In Proceedings of World Academy of

Science, Engineering and Technology. Bangkok, Thailand, pp. 321-324.

162 Naseem, T., Hussain, S. 2007. Spelling Error Trends in Urdu. In Proceedings

of Conference on Language Technology (CLT07). University of Peshawar,

Pakistan.

163 Kukich, K. 1992. Techniques in Automatically Correcting Words in Texts.

ACM Computing Surveys, Vol. 24:4, pp. 377-439.

164 Odell, M.K., Russell, R.C. U.S. Patent Numbers, 1,261,167 (1918) and

1,435,663 (1922). U.S. Patent Office, Washington D.C., USA.

165 Gadd, T.N. 1988. Fisching fore werds: Phonetic retrieval of written text in

information systems. Program, Electronic Library and Information Systems,

Vol. 22:3, pp. 222–237.

166 Gadd, T.N. 1990. PHONIX: The Algorithm. Program: Autom. Libr. Inf. Syst,

Vol. 24:4, pp. 363–366.

167 Hodge, V., Austin, J. 2001. An Evaluation of Phonetic Spell Checkers.

Technical Report YCS-2001-338 (2001), Department of Computer Science,

University of York, UK.

168 McCabe, M.C., Chowdhury, A., Grossman, D., Frieder, O. 1999. A unified

environment for fusion of information retrieval approaches. In Proceedings of

the 1999 Conference on Information and Knowledge Management (CIKM99).

New York, ACM Press, pp. 330–334.

169 Celko, J. 1995. Joe Celko’s SQL for smarties: Advanced SQL programming.

2nd edition, Burlington, MA: Morgan Kaufmann.

170 Pfeifer, U., Poersch, T., Fuhr, N. 1995. Searching proper names in databases.

In Proceedings of the Conference on Hypertext Information Retrieval

Multimedia. Konstanz, Germany, pp. 259–275.

171 Pfeifer, U., Poersch, T., Fuhr, N. 1996. Retrieval effectiveness of name search

methods. Information Processing and Management, Vol. 32:6, pp. 667–679.

172 Liu, Y. 1987. New Advances in computers and Natural Language Processing

in China. Information Science, Vol. 8, pp. 64–70.

173 Palmer, D.D. 1997. A trainable Rule-based Algorithm for Word Segmentation.

In Proceedings of the eighth conference on European Chapter of the

Association for Computational Linguistics, EACL '97. Stroudsburg, PA, USA,

pp. 321-328.

174 Kawtrakul, A., Thumkanon, C., Poovorawan, Y., Varasrai, P., Suktarachan,

M. 1997. Automatic Thai Unknown Word Recognition. In Proceedings of the

Natural Language Processing Pacific Rim Symposium. Phuket, Thailand, pp.

341-348.

175 Mekanavin, S., Charenpornsawat, P., Kijsirikul, B. 1997. Feature-based Thai

Words Segmentation. In Proceedings of the Natural Language Processing

Pacific Rim Symposium. Phuket, Thailand, pp. 41-48.

176 Liang, N. 1986. A written Chinese automatic segmentation system-CDWS.

Journal of Chinese Information Processing, Vol. 1:1, pp. 44-52.

177 Li, B.Y., Lin, S., Sun, C.F., Sun, M.S. 1991. A maximum-matching word

segmentation algorithm using corpus tags for disambiguation, In Proceedings

of ROCLING IV. Taipei, pp. 135-146.

178 Gu, P., Mao, Y. 1994. The adjacent matching algorithm of Chinese automatic

word segmentation and its implementation in the QHFY Chinese-English

system. In Proceedings of International Conference on Chinese Computing.

Singapore.

179 Yeh, C.L., Lee, H.J. 1991. Rule-based word identification for mandarin

Chinese sentences: A unification approach. Computer Processing of Chinese

and Oriental Languages, Vol. 5:2, pp. 97–118.

180 Ramshaw, L., Marcus, M. 1995. Text chunking using transformation-based

learning. In Proceedings of the Third Workshop on Very Large Corpora

(WVLC-3). pp. 82-94.

181 Nie, J.Y., Jin, W., Hannan, M.L. 1994. A hybrid approach to unknown word

detection and segmentation of Chinese. In Proceedings of International

Conference on Chinese Computing. Singapore.

182 Nie, J.Y., Hannan, M.L., Jin, W. 1995. Combining dictionary, rules, and

statistical information in segmentation of Chinese. Computer Processing of

Chinese and Oriental Languages, Vol. 9:2, pp. 125–143.

183 Lê, H.P., Nguyễn, T.M.H., Roussanaly, A., Hồ, T.V. 2008. A hybrid approach

to word segmentation of Vietnamese texts. In Proceedings of 2nd

International Conference on Language and Automata Theory and

Applications. Tarragona, Spain.

184 Islam, M. A., Inkpen, D., Kiringa, I. 2007. A Generalized Approach to Word

Segmentation using Maximum length descending frequency and Entropy rate,

In Proceedings of CICLing 2007. Springer, LNCS 4394, pp. 175-185.

185 Sproat, R., Shih, C. 1990. A statistical method for finding word boundaries in

Chinese text. Computer Processing of Chinese and Oriental Languages, Vol.

4, pp. 336–351.

186 Yang, C.C., Luk, J., Yung, S., Yen, J. 2000. Combination and boundary

detection approach for Chinese indexing. Journal of the American Society for

Information Science, Special Topic Issue on Digital Libraries, Vol. 51:4, pp.

340–351.

187 Chien, L. F. 1997. PAT-Tree-based Keyword Extraction for Chinese

Information Retrieval. In Proceedings of the 20th Annual International ACM

Conference on Research and Development in Information Retrieval (SIGIR

1997). Philadelphia, pp. 50-58.

188 Yang, C.C., Li, K.W. 2005. A heuristic method based on a Statistical

Approach for Chinese Text Segmentation, Journal of the American Society for

Information Science and Technology, Vol. 56:13, pp. 1438–1447.

189 Brent, M., Cartwright, T. 1996. Distributional regularity and phonotactics are

useful for segmentation. Cognition, Vol. 61, pp. 93-125.

190 Brent, M. 1999. An efficient, probabilistically sound algorithm for

segmentation and word discovery. Machine Learning, Vol. 34, pp. 71–106.

191 Christiansen, M., Allen, J. 1997. Coping with Variation in Speech

Segmentation. In Proceedings of GALA 1997: Language Acquisition,

Knowledge Representation and Processing. pp. 327-332.

192 Christiansen, M., Allen, J., Seidenberg, M. 1998. Learning to Segment Speech

Using Multiple Cues: A Connectionist Model. Language and Cognitive

Processes, Vol. 13, pp. 221-268.

193 Pearl, L., Goldwater, S., Steyvers, M. 2011. Online Learning Mechanisms for

Bayesian Models of Word Segmentation. Research on Language &

Computation. doi:10.1007/s11168-011-9074-5.

194 Shannon, C.E. 1951. Prediction and entropy of printed English. Bell System

Technical Journal, Vol. 30, pp. 50-64.

195 Manning, C. D., Schütze, H. 2003. Foundations of Statistical Natural

Language Processing. The MIT Press, Cambridge London, England.

196 Rabiner, L.R. 1989. A Tutorial on Hidden Markov Models and Selected

Applications in Speech Recognition, Proceedings of the IEEE, Vol. 77:2, pp.

257-285.

197 Gotoh, Y., Renals, S. 2003. Statistical language modelling. In Text and Speech

Triggered Information Access, S. Renals, G. Grefenstette Eds. Springer-

Verlag, Heidelberg, Germany, pp. 78–105.

198 Jurafsky, D., Martin, J. 2008. Speech and Language Processing: An

Introduction to Natural Language Processing, Computational Linguistics and

Speech Recognition. Prentice Hall.

199 Grishman, R., Hirschman, L., Nhan, N.T. 1986. Discovery procedures for

sublanguage selectional patterns: initial experiments. Computational

Linguistics, Vol.12:3, pp. 205-214.

200 Schütze, H. 1992. Dimensions of meaning. In Proceedings of Supercomputing

‘92. Minneapolis, MN, pp. 787-796.

201 Schütze, H. 1993. Word space. In S. J. Hanson, J. D. Cowan, and C. L. Giles,

(Eds.), Advances in Neural Information Processing Systems 5. Morgan

Kaufmann, San Mateo, California, pp. 895-902.

202 Essen, U., Steinbiss, V. 1992. Co-occurrence smoothing for stochastic

language modeling. In Proceedings of ICASSP, Vol. 1, pp. 161-164.

203 Grishman, R., Sterling, J. 1993. Smoothing of automatically generated

selectional constraints. In Proceedings of DARPA Conference on Human

Language Technology. San Francisco, California, pp. 254-259.

204 Dagan, I., Marcus, S., Markovitch, S. 1993. Contextual word similarity and

estimation from sparse data. In Proceedings of 31st Annual Meeting of the

Association for Computational Linguistics. Columbus, Ohio, pp. 164-171.

205 Dagan, I., Marcus, S., Markovitch, S. 1995. Contextual word similarity and

estimation from sparse data. Computer Speech and Language, Vol. 9, pp. 123-

206 Karov, Y., Edelman, S. 1996. Learning similarity-based word sense

disambiguation from sparse data. In Proceedings of the Fourth Workshop on

Very Large Corpora. Copenhagen, Denmark, pp. 42-55.

207 Lin, D. 1997. Using syntactic dependency as local context to resolve word

sense ambiguity. In Proceedings of 35th Annual Meeting of the ACL and 8th

Conference of the European Chapter of the Association for Computational

Linguistics. Madrid, Spain, pp. 64-71.

208 Resnik, P. 1992. WordNet and distributional analysis: A class-based approach

to lexical discovery. In Proceedings of AAAI Workshop on Statistically-based

Natural Language Processing Techniques. Menlo Park, California, pp. 56-64.

209 Resnik, P. 1995. Disambiguating noun groupings with respect to WordNet

senses. In Proceedings of the Third Workshop on Very Large Corpora.

Cambridge, pp. 54-68.

210 Jiang, J.J., Conrath, D.W. 1997. Semantic similarity based on corpus statistics

and lexical taxonomy. In Proceedings of International Conference Research

on Computational Linguistics (ROCLING). Taiwan, pp. 1-15.

211 Frakes, W. B., Baeza-Yates, R. (Eds.) 1992. Information Retrieval, Data

Structures and Algorithms. Prentice Hall, New York.

212 Niwa, Y., Nitta, Y. 1994. Co-occurrence Vectors from Corpora vs. Distance

Vectors from Dictionaries. In Proceedings of the 17th International

Conference on Computational Linguistics, COLING’94. pp. 304-309.

213 Brown, P., Della-Pietra, S., Della-Pietra, V., Mercer, R. 1991. Word sense

disambiguation using statistical methods. In Proceedings of 29th Annual

Meeting of the Association for Computational Linguistics. Berkeley,

California, pp. 264-270.

214 Gale, W.A., Church, K.W., Yarowsky, D. 1993. A method for disambiguating

word senses in a large corpus. Computers and the Humanities, Vol. 26, pp.

415-439.

215 Dagan, I., Lee, L., Pereira, F. 1997. Similarity-based methods for word sense

disambiguation. In Proceedings of 35th Annual Meeting of the Association for

Computational Linguistics and 8th Conference of the European Chapter of the

Association for Computational Linguistics. Madrid, Spain, pp. 56-63.

216 Dagan, I., Pereira, F., Lee, L. 1994. Similarity-based estimation of word co-

occurrence probabilities. In Proceedings of 32nd Annual Meeting of the

Association for Computational Linguistics. Las Cruces, New Mexico, pp. 272-

217 Resnik, P. 1995. Disambiguating noun groupings with respect to WordNet

senses. In Proceedings of the Third Workshop on Very Large Corpora.

Cambridge, MA, pp. 54-68.

218 Choueka, Y., Klein, S.T., Neuwitz, E. 1983. Automatic retrieval of frequent

idiomatic and collocational expressions in a large corpus. Journal of the

Association for Literary and Linguistic Computing, Vol. 4:1, pp. 34-38.

219 Smadja, F., McKeown, K. 1990. Automatically extracting and representing

collocations for language generation. In Proceedings of 28th Annual Meeting

of the ACL. Pittsburgh, PA, pp. 252-259.

220 McCradell, D. R. 1995. A lexical-semantic and statistical approach to lexical

collocation extraction for Natural Language Generation. Ph.D. dissertation,

University of Maryland, Baltimore, MD.

221 Dale, R., Moisl, H., Somers, H. (Eds.) 2000. Handbook of Natural Language

Processing, Marcel Dekker Inc. New York, Basel, pp. 459-475.

222 Gale, W.A., Church, K.W., Yarowsky, D. 1992. Using bilingual materials to

develop word sense disambiguation methods. In Proceedings of Fourth

International Conference on Theoretical and Methodological Issues in

Machine Translation (TMI-92), Empiricist vs. rationalist methods in MT.

Montreal, pp. 101-112.

223 Lau, R., Rosenfeld, R., Roukos, S. 1993. Trigger-based language models: A

maximum entropy approach. In Proceedings of the IEEE International

Conference on Acoustics, Speech and Signal Processing, ICASSP-II.

Minneapolis, MN, pp. 45-48.

224 Kaplan, A. 1955. An experimental study of ambiguity and context, Machine

Translation, Vol. 2:3, pp. 39-46.

225 Choueka, Y., Lusignan, S. 1985. Disambiguation by short contexts.

Computers and the Humanities, Vol. 19:3, pp. 147-157.

226 Josan, G.S., Lehal, G.S. 2008. Size of N for Word Sense Disambiguation

using N Gram Model for Punjabi Language. International Journal of

Translation, Vol. 20, pp. 47-56.

227 Goyal, V. 2010. Development of a Hindi to Punjabi Machine Translation

System. Ph.D. Thesis, Punjabi University, Patiala, India.

228 Chen, S.F., Goodman, J. 1998. An Empirical Study of Smoothing Techniques

for Language Modeling. Technical Report, (TR-10-98). Computer Science

Group, Harvard University, Cambridge, Massachusetts, pp. 1-63.

229 Gale, W.A., Church, K.W. 1990. Estimation procedures for language context:

poor estimates are worse than none. In Proceedings of Computational

Statistics (COMPSTAT), 9th Symposium. Dubrovnik, Yugoslavia, pp. 69-74.

230 Gale, W.A., Church, K.W. 1994. What's wrong with adding one? In N.

Oostdijk and P. de Haan, (Eds.), Corpus-Based Research into Language.

Rodopi, Amsterdam.

231 Witten, I.H., Bell, T.C. 1991. The zero-frequency problem: Estimating the

probabilities of novel events in adaptive text compression. IEEE Transactions

on Information Theory, Vol. 37:4, pp. 1085-1094.

232 Nádas, A. 1984. Estimation of probabilities in the language model of the IBM

speech recognition system. IEEE Transactions on Acoustics, Speech and

Signal Processing, ASSP, Vol. 32:4, pp. 859-861.

233 Katz, S.M. 1987. Estimation of probabilities from sparse data for the language

model component of a speech recognizer. IEEE Transactions on Acoustics,

Speech and Signal Processing, ASSP, Vol. 35:3, pp. 400-401.

234 Jelinek, F., Mercer, R.L. 1980. Interpolated estimation of Markov source

parameters from sparse data. In Proceedings of the Workshop on Pattern

Recognition in Practice. Amsterdam, The Netherlands: North-Holland.

235 Ney, H., Essen, U., Kneser, R. 1994. On structuring probabilistic dependences

in stochastic language modeling. Computer Speech and Language, Vol. 8:1,

pp. 1-38.

236 Church, K.W., Gale, W.A. 1991. A comparison of the enhanced Good-Turing

and deleted estimation methods for estimating probabilities of English

bigrams. Computer Speech and Language, Vol. 5, pp. 19-54.

237 Thede, S.M., Harper, M.P. 1999. A Second-Order Hidden Markov Model for

Part-of-speech Tagging, In Proceedings of the 37th annual meeting of the ACL

on Computational Linguistics. pp. 175-182.

238 Li, H., Kumaran, A., Zhang, M., Pervouchine, V. 2009. Whitepaper of NEWS

2009 Machine Transliteration Shared Task, In Proceedings of the 2009 Named

Entities Workshop. ACL-IJCNLP, Suntec, Singapore, pp. 19–26.

239 Li, H., Kumaran, A., Zhang, M., Pervouchine, V. 2010. Whitepaper of NEWS

2010 Shared Task on Transliteration Generation. In Proceedings of the 2010

Named Entities Workshop, ACL. Uppsala, Sweden, pp. 12–20.

240 Papineni, K., Roukos, S., Ward, T., Zhu, W.-J. 2002. BLEU: a method for

automatic evaluation of Machine Translation. In Proceedings of the 40th

Annual Meeting of the Association for Computational Linguistics.

Philadelphia, pp. 311-318.

241 Turian, J.P., Shen, L., Melamed, I.D. 2003. Evaluation of Machine Translation

and its Evaluation. In Proceedings of the MT Summit IX. New Orleans, LA,

pp. 386-393.

242 Melamed, I.D. 1995. Automatic Evaluation and Uniform Filter Cascades for

Inducing N-Best Translation Lexicons. In Proceedings of the third Workshop

on Very Large Corpora (WVLC3). Boston.

243 Thompson, H. 1991. Automatic evaluation of translation quality: Outline of

methodology and report on pilot experiment. In Proceedings of the

Evaluators’ Forum (ISSCO). Geneva, Switzerland, pp. 215–223.

244 White, J., O’Connell, T., Carlson, L. 1993. Evaluation of machine translation.

In Human Language Technology: Proceedings of the Workshop (ARPA). pp.

206–210.

245 Brew, C., Thompson, H. 1994. Automatic Evaluation of Computer Generated

Text: A Progress Report on the TextEval Project. In Human Language

Technology: Proceedings of the Workshop (ARPA/ISTO). pp. 108–113.

246 Rajman, M., Hartley, T. 2001. Automatically predicting MT systems rankings

compatible with Fluency, Adequacy or Informativeness scores. In Proceedings

of the Workshop on Machine Translation Evaluation: Who Did What To

Whom. Spain, pp. 29–34.

247 Doddington, G. 2002. Automatic evaluation of machine translation quality

using n-gram co-occurrence statistics. In Proceedings of the second

international conference on Human Language Technology Research. pp. 138–

248 Riezler, S., Maxwell-III, J.T. 2005. On some pitfalls in automatic evaluation

and significance testing for MT. In Proceedings of ACL Workshop on Intrinsic

and Extrinsic Evaluation Measures for MT and Summarization. pp. 57–64.

249 Cormen, T., Leiserson, C., Rivest, R., Stein, C. 2001. Introduction to

Algorithms. 2nd Edition, MIT Press.

250 Saini, T.S., Lehal, G.S., Kalra, V.S. 2008. Shahmukhi to Gurmukhi

transliteration system. In Proceedings of 22nd International Conference on

Computational Linguistics (Coling). Manchester, UK, pp. 177-180.

Publications Based on the Work Presented in this Thesis

Journals

[1] Saini, T. S., Lehal, G. S. 2008. Shahmukhi to Gurmukhi

Transliteration System: A Corpus based Approach, Research in

Computing Science (Mexico), Vol. 33, pp. 151-162.

International Conference

[2] Saini, T. S., Lehal, G. S., Kalra, V. S. 2008. Shahmukhi to

Gurmukhi Transliteration System. In Proceedings of 22nd

International Conference on Computational Linguistics:

Demonstration Papers. August 18-22, Manchester, United

Kingdom, pp. 177-180.

[3] Lehal, G. S., Saini, T. S. 2011. A Transliteration based Word

Segmentation System for Shahmukhi Script. In Proceedings of

ICISIL. Springer, Communication in Computer and Information

Science, CCIS-139, pp. 136-143.
