[IEEE 2009 Second International Symposium on Electronic Commerce and Security - Nanchang City, China (2009.05.22-2009.05.24)]

NaXi Pictographs Input Method Based on Primitives

Hai Guo, Jingying Zhao
Department of Computer Science and Engineering
University of Dalian Nationalities
Dalian, China

e-mail: [email protected]

Abstract—Naxi pictographs are the only hieroglyphic script still in use anywhere in the world, and they play a positive role in research on the evolution of human language. Based on an analysis of the characteristics and information processing of Naxi pictographs, this paper proposes a transcription scheme for a Naxi pictograph input method that moves from phonetic codes to graphic primitives, and optimizes this scheme. The primitive-based input method is developed on the Windows XP platform using the IMM-IME interface, and it has yielded notable economic and social benefits.

NaXi Pictograph; Graphic Primitives Input Method; information development

I. INTRODUCTION

The Naxi pictographic script, which consists of 2120 characters, belongs to the Naxi language of the Yi branch of the Tibeto-Burman languages. Since it is the only ancient script that has been handed down and is still in use in the world, it is of great value for studying the historical development of ancient writing. In the early 20th century, scholars at home and abroad first turned their attention to Naxi pictographs, and many universities and research institutions in America, Japan and Europe have since carried out thorough research on the subject. Traditional Naxi pictograph information processing relies on manual methods such as hand drawing, scanning and stencil making. Because of the complexity of the script (for example, each of the words “”, “”, “”, “”, “” [1] has various ways of writing), it takes at least ten years to master the writing patterns of the 2120 commonly used characters. The low efficiency of the traditional approach makes it unsuitable for modern character information processing. It is therefore necessary to develop Naxi pictograph outline fonts and an input method in order to support the processing of Naxi pictographs. This paper mainly focuses on the features of Naxi pictographs, which lays a foundation for further study.

II. THE FEATURES OF NAXI PICTOGRAPH

As a special writing system, Naxi pictographs differ greatly from other languages such as Chinese and English in word formation, spelling and typesetting. As a result, experience with information processing for English and Chinese offers little guidance when constructing a Naxi pictograph information platform.

Naxi pictographs, as a pictographic script at an early stage in the evolution of writing, share some features with Chinese, English and some minority languages, but the differences are more numerous.

• A single Naxi character can form a word or a whole sentence, which is very different from Chinese characters.

• Because the script is pictographic, most Naxi people can read it, but only a few can write it fluently.

• In traditional Naxi scriptures, unlike ancient Chinese books, the spacing between characters is not fixed, and the height difference between characters is relatively large.

• Chinese and English characters can be divided into two types: handwritten and printed. Since Naxi pictographs date from an early stage in the evolution of writing, only handwritten forms exist and there is no concept of a printed form [2-3].

III. THE DEVELOPMENT OF NAXI PICTOGRAPH OUTLINE FONTS

The development of Naxi pictograph outline fonts proceeds through several stages: script design, font scanning, digital fitting, font trimming, character checking and font integration. The outline fonts are designed with reference to A Dictionary of Naxi Pictograph Sound-Indication, written by the expert Li Lincan, which is scanned into the computer. After processing, the scanned script forms a high-quality dot-matrix library, and each entry in the library is assigned a corresponding code. The digital fitting phase uses the dual-mode conversion algorithm to automatically extract from the dot-matrix library digital information (outline curves) that approximates the original script as closely as possible. The outline curve of the Naxi pictograph “elephant” is shown in Fig. 1. Because Naxi pictographs are highly complex and varied, it is very important that the contour points, lines, angles and positions can be controlled by parameters [2].

Figure 1. Naxi pictograph “elephant” outline curve delineation

2009 Second International Symposium on Electronic Commerce and Security
978-0-7695-3643-9/09 $25.00 © 2009 IEEE
DOI 10.1109/ISECS.2009.89

IV. THE INPUT METHOD PRINCIPLE

As Windows is the most widely used operating system at present, this paper focuses on an input system for Naxi pictographs based on Windows. An input method on Windows transforms a standard ASCII string into a Naxi word or string according to a particular coding rule. Users cannot be expected to write a transformation program themselves for each application, so the task of inputting Naxi pictographs should be handled by the Windows system itself.

As shown in Fig. 2, a keyboard event of the Naxi pictograph input system is first received by the Windows file user.exe, which transfers the event to the Input Method Manager (IMM). The IMM then conveys the event to the input method editor, which translates the keyboard event into its corresponding Naxi character (or string) with reference to the user's encoding dictionary. When this is done, the translated event is propagated back to user.exe and then to the executing application, which completes the input process for a Naxi pictograph [4-5].
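The routing described above can be modeled in a few lines. The following Python sketch only illustrates the control flow and is not the Windows API: the class names `NaxiIme` and `InputMethodManager`, the sample dictionary entry, the placeholder output string, and the choice to pass unknown codes through unchanged are all assumptions made for illustration.

```python
class NaxiIme:
    """Toy stand-in for the input method editor (hypothetical)."""

    def __init__(self, dictionary):
        # Maps a typed code string to the text it should produce.
        self.dictionary = dictionary

    def translate(self, code):
        # Return the dictionary entry for the code, or None if unknown.
        return self.dictionary.get(code)


class InputMethodManager:
    """Toy stand-in for the IMM: forwards input to the active IME."""

    def __init__(self, ime):
        self.ime = ime

    def handle(self, code):
        translated = self.ime.translate(code)
        # Assumption: codes the IME does not know are passed through unchanged.
        return translated if translated is not None else code


def deliver_to_application(imm, typed_codes):
    """Route each typed code through the IMM and collect what the app receives."""
    return [imm.handle(code) for code in typed_codes]


# Hypothetical dictionary: the code "afgcs" stands for some Naxi character.
imm = InputMethodManager(NaxiIme({"afgcs": "<NAXI:afgcs>"}))
received = deliver_to_application(imm, ["afgcs", "xyz"])
```

In this model the application never talks to the IME directly; it only sees what the manager hands back, which mirrors the division of labor shown in Fig. 2.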

The IMM-IME structure provides various input methods for applications, and each thread of an application can keep an active input window. Inserting NaXi pictograph messages into the message loop does not disturb the processing order of other messages. The header file immdev.h should be included in order to use these new features. The detailed working principle of the NaXi pictograph primitives input method is shown in Fig. 3.

V. THE DESIGN AND REALIZATION OF PRIMITIVES INPUT ENCODING SCHEME

A. The Scheme of NaXi Primitives Encoding

1) The Structure Coding Method of NaXi Pictograph: NaXi pictographs have four common structures: the undivided whole structure, coded by b; the upper and lower structure, coded by s; the right and left structure, coded by z; and the surrounding structure, coded by b. Examples of characters with the undivided whole structure are "", "", "", "", "" and so on; with the upper and lower structure, "", "", "", "", "", ""; with the right and left structure, "", "", "", "", ""; and with the surrounding structure, "", "", "", "", "". After this coarse segmentation and coding, fine-grained coding is introduced.

2) The Graphic Primitive Coding Method of NaXi Pictograph: NaXi pictograph characters have no radical components of the kind found in Chinese characters, so this paper proposes a representation based on graphic primitives, inspired by syntactic pattern recognition. The basic graphic elements, or graphic primitives, are the point, line, circle, circular curve, left oblique line, right oblique line, vertical line, vertical curve, oval curve and rectangle. The codes of the basic graphic primitives are shown in Table I. Unlike Chinese and other scripts, many NaXi pictograph characters contain a quantity, for example "" (dice), the number of dots; "" (fire), the number of vertical curves; and "" (treasure), the number of circles. The quantity is therefore coded as well: one is 'y', two is 'e', three is 's', four is 'f', five is 'w', six is 'l', seven is 'q', eight is 'b', nine is 'j', and any number greater than nine is 'd'.
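The quantity rule above amounts to a small lookup with a fallback. A minimal Python sketch follows; the function name `quantity_code` and the decision to reject counts below one are assumptions, while the letter assignments are taken directly from the text.

```python
# Quantity codes as listed in the text: 1..9 map to single letters,
# and any count above nine maps to 'd'.
QUANTITY_CODES = {1: "y", 2: "e", 3: "s", 4: "f", 5: "w",
                  6: "l", 7: "q", 8: "b", 9: "j"}


def quantity_code(count):
    """Return the letter that encodes how many times a primitive repeats."""
    if count < 1:
        # Assumption: a quantity is only coded for positive counts.
        raise ValueError("count must be a positive integer")
    return QUANTITY_CODES.get(count, "d")
```

For example, `quantity_code(3)` gives `'s'` and `quantity_code(12)` gives `'d'`.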

Figure 2. The Input Method Principle of NaXi Pictograph

Figure 3. Detailed Working Principle of NaXi Pictograph Primitives Input Method


TABLE I. GRAPHIC PRIMITIVE CODE OF NAXI PICTOGRAPH

Graphic Primitive      Code
point                  a
line                   b
circle                 c
circular curve         f
left oblique line      g
right oblique line     h
vertical line          i
vertical curve         g

3) The Graphic Primitive Coding Order of NaXi Pictograph Character: The graphic primitives in a NaXi pictograph character are coded from top to bottom, from left to right, and from outside to inside. The code table is built according to this rule; some characters are coded as follows: af, afgcs, afadcf, afgcg.
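Once a code table exists, an input method typically needs to list the characters whose codes begin with what the user has typed so far. The sketch below shows one way to do that with a sorted table and binary search; it is an assumption about how lookup could work, not the authors' implementation. The four codes are the samples quoted above, and the glyph names are placeholders because the actual characters are not reproduced here.

```python
import bisect

# The four sample codes quoted in the text; the glyph names are placeholders.
CODE_TABLE = sorted([
    ("af", "glyph-1"),
    ("afgcs", "glyph-2"),
    ("afadcf", "glyph-3"),
    ("afgcg", "glyph-4"),
])


def candidates(prefix):
    """Return (code, glyph) entries whose code starts with the typed prefix."""
    # A one-element tuple sorts before any entry that shares the same code,
    # so bisect_left lands on the first possible match.
    start = bisect.bisect_left(CODE_TABLE, (prefix,))
    matches = []
    for code, glyph in CODE_TABLE[start:]:
        if not code.startswith(prefix):
            break
        matches.append((code, glyph))
    return matches
```

Typing `afg` would then narrow the candidate window to the entries coded `afgcg` and `afgcs`, while `af` alone still matches all four samples.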

B. The Realization of Naxi Pictograph IME

The Naxi pictograph Input Method Editor (IME) is implemented as a Dynamic Link Library (DLL) located in the Windows System or System32 directory. The only difference from an ordinary DLL is that the input method uses .IME as its file extension. An IME needs to offer two units: an IME conversion interface and an IME user interface.

The former is realized through a group of functions exported by the IME module and called by the IMM, and the latter through a group of windows that receive messages and provide the IME user interface.

The conversion interface is composed of several interface functions whose signatures and behavior are specified by the IME development interface. The IMM performs conversion by calling the corresponding interface function. The IME user interface consists of a number of related windows that receive and handle messages from the IMM and also serve as the interactive interface with the user.

1) IME Conversion Interface Function Realization: Of all the functions called by the IMM, four are especially important for the Naxi pictograph primitive input method and should be implemented first.

a) ImeInquire: When a user selects the Naxi pictograph input method, the IMM calls this function first to obtain information about the input method. The function should return the initial information of the IME, set every attribute of the method in the IMEINFO structure, and name the window class of the user interface.

b) ImeConfigure: The IMM calls this function when a user sets attributes of the input method through the control panel or the system icon. It can show a configuration dialog in which the user sets the options of the Naxi pictograph input method.

c) ImeProcessKey: The IMM calls this function when a keyboard event needs to be handled. The function preprocesses the keyboard event, and the system uses the returned value, together with the specific context, to decide whether the event should be passed to the IME. If the returned value is true, the keyboard message should be conveyed to the IME, and ImeToAsciiEx is called next. If the returned value is false, the IME does not need to process the keyboard message, and the IMM passes the message directly to the application.

d) ImeToAsciiEx: According to the context of the Naxi input method, this function generates the conversion result using the conversion engine of the IME and puts the relevant character messages into the specified buffer. The returned value is the number of messages; if this number is larger than the length of the buffer, the system reads the messages from the hMsgBuf item of the Naxi input method context instead. Together with ImeProcessKey, this function forms the main body of the conversion engine of a keyboard-based IME.
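The interplay between ImeProcessKey and ImeToAsciiEx can be illustrated with a small model. The Python sketch below is not the Win32 interface: `InputContext`, `process_key`, `to_ascii_ex`, `feed`, the message tuples, and the rule that lowercase letters extend a code while a space commits it are all hypothetical simplifications. What it does preserve from the text is the two-step protocol (a boolean pre-check followed by conversion) and the fallback to a context-held buffer when the message count exceeds the caller's buffer.

```python
class InputContext:
    """Toy model of the IME input context (not the Win32 structure)."""

    def __init__(self):
        self.composition = ""  # code letters typed so far
        self.msg_buf = []      # overflow storage, standing in for hMsgBuf


def process_key(ctx, key):
    """Model of ImeProcessKey: True means the IME wants this key."""
    if key.isalpha() and key.islower():
        return True
    # Assumption: a space commits a pending code; other keys are not handled.
    return key == " " and bool(ctx.composition)


def to_ascii_ex(ctx, key, out_buf, capacity):
    """Model of ImeToAsciiEx: return the number of generated messages.

    If the count exceeds `capacity`, the messages are stored in
    ctx.msg_buf instead of out_buf.
    """
    if key == " ":
        messages = [("RESULT", ctx.composition), ("END", "")]
        ctx.composition = ""
    else:
        ctx.composition += key
        messages = [("COMPOSITION", ctx.composition)]
    target = out_buf if len(messages) <= capacity else ctx.msg_buf
    target.extend(messages)
    return len(messages)


def feed(ctx, keys, capacity=1):
    """Drive both functions the way a manager would.

    Returns (keys passed straight to the application, messages delivered).
    """
    passed, delivered = [], []
    for key in keys:
        if not process_key(ctx, key):
            passed.append(key)
            continue
        out_buf = []
        count = to_ascii_ex(ctx, key, out_buf, capacity)
        if count > capacity:
            # Buffer too small: read the messages from the context instead.
            delivered.extend(ctx.msg_buf)
            ctx.msg_buf.clear()
        else:
            delivered.extend(out_buf)
    return passed, delivered
```

Feeding the keys `a`, `f`, space and `1` into a fresh context, for instance, produces two composition updates, then a committed result read from the overflow buffer, and finally passes `1` to the application untouched.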

2) IME User Interface Realization: The user interface of the Naxi input method mainly includes the user interface window (window class and window procedure), the user interface window components (the window classes and window procedures of the status window, composition window and candidate window), the setup window of the input method, the soft keyboard, the indicator window of the taskbar (icon, menu, tooltips) and the hot key of the input method. When programming the user interface window of the Naxi input method, the main task is handling the IME messages from the default IME window; WM_IME_SETCONTEXT, WM_IME_COMPOSITION and WM_IME_SELECT are relatively important and should be processed first. Through the process discussed above, we developed outline fonts and an input method for Naxi pictographs. Fig. 4 shows this input method in use in Word.

VI. CONCLUSION

The realization of the NaXi pictograph primitive input method fills a gap in domestic research on NaXi pictograph information processing. After practical use at many research institutes, the system has been well received, and it satisfies the daily requirements of NaXi pictograph information processing.

This implementation can serve as a solid foundation for NaXi pictograph information processing and for follow-up development.


Figure 4. Naxi Pictograph Primitive Input Method

ACKNOWLEDGMENT

This project is supported by the National Natural Science Foundation of China under Grant No. 60803096.

REFERENCES

[1] Joseph Francis Charles Rock, A Nakhi-English Encyclopedic Dictionary, I.M.E.O, Rome, 1963.

[2] Liu Yongkui, Guo Hai, Lu Guiyan, Li Hongyan, "Input technology and information processing of NaXi pictograph", Journal of Computational Information Systems, v3, n1, February 2007, p361-368.

[3] Guo Hai, Che Wengang, Nie Juan, Li Bin, et al., "Web embedding fonts technology of Naxi pictographs", Jisuanji Gongcheng/Computer Engineering, v31, n17, Sep 5, 2005, p203-204+207.

[4] Guo Hai, Zhao Jing-ying, "Development of the NaXi Pictographs Information Processing System", Control & Automation, v22, n22, 2006, p122-124.

[5] Xie Qian, Jiang Li, Wu Jian, et al., "Research on Chinese Linux input method engine standard", Jisuanji Yanjiu yu Fazhan/Computer Research and Development, v43, n11, November 2006, p1965-1971.

[6] Tseng Chun-Han, Chen Chia-Ping, "Chinese input method based on reduced Mandarin phonetic alphabet", INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP, v2, 2006, p733-736.

[7] Tanimoto Yoshio, Nanba Kuniharu, Rokumyo Yasuhiko, et al., "Evaluation system of suitable computer input device for patients", Proceedings of the Third Workshop - 2005 IEEE Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS 2005, 2007, p369-373.
