Table of Contents
1. Scope of the Journal
2. The Model
3. The Advisory and Editorial Board
4. Papers
First Published in the United States of America. Copyright © 2012 Foundation of Computer Science Inc. Originated and printed by Foundation of Computer Science Press, New York, USA
Scope of the Journal
International Journal of Computer Applications (IJCA) creates a place for the publication of papers that cover frontier issues in Computer Science and Engineering and their applications, and that will define a new wave of breakthroughs. The journal is an initiative to recognize the efforts of the scientific community worldwide toward inventing new-age technologies. Our mission, as part of the research community, is to bring the highest quality research to the widest possible audience. International Journal of Computer Applications is a global effort to consolidate dispersed knowledge and aggregate it in a searchable and indexable form.
The perspectives presented in the journal range from big-picture analyses that address global and universal concerns to detailed case studies that speak of localized applications of the principles and practices of computational algorithms. The journal is relevant for academics in computer science, the applied sciences, the professions and education, research students, public administrators in local and state government, representatives of the private sector, trainers and industry consultants.
Indexing
International Journal of Computer Applications (IJCA) maintains high-quality indexing services such as Google Scholar, CiteSeer, UlrichsWeb, DOAJ (Directory of Open Access Journals) and the Scientific Commons Index, University of St. Gallen, Switzerland. The articles are also indexed with the SAO/NASA ADS Physics Abstract Service supported by Harvard University and NASA, Informatics, and the ProQuest CSA Technology Research Database. IJCA constantly works to expand its content worldwide for the betterment of the scientific, research and academic communities.
Topics
International Journal of Computer Applications (IJCA) supports a wide range of topics in computer science applications, such as: Embedded Systems, Pattern Recognition, Signal Processing, Robotics and Micro-Robotics, Theoretical Informatics, Quantum Computing, Software Testing, Computer Vision, Digital Systems, Pervasive Computing, etc.
The Model
Open Review
International Journal of Computer Applications' approach to peer review is open and inclusive, while being based on the most rigorous and merit-based 'blind' peer-review processes. Our referee processes are criterion-referenced, and referees are selected on the basis of subject-matter and disciplinary expertise. Ranking is based on clearly articulated criteria. The result is a refereeing process that is scrupulously fair in its assessments while offering a carefully structured and constructive contribution to the shape of the published paper.
Intellectual Excellence
The result is a publishing process that is without prejudice to institutional affiliation, stage of career, national origin or disciplinary perspective. If a paper is excellent, and has been systematically and independently assessed as such, it will be published. This is why International Journal of Computer Applications has so much exciting new material: much of it originates from well-known research institutions, but a considerable amount of brilliantly insightful and innovative material also comes from academics in lesser-known institutions in the developing world, emerging researchers, people working in hard-to-classify interdisciplinary spaces, and researchers in liberal arts colleges and teaching universities.
The Advisory and Editorial Board
The current Editorial and Advisory committee of the International Journal of Computer Applications (IJCA) includes research center heads, faculty deans, department heads, professors, research scientists, and experienced software development directors and engineers.
Dr. T. T. Al Shemmeri, Staffordshire University, UK Bhalaji N, Vels University
Dr. A.K.Banerjee, NIT, Trichy Dr. Pabitra Mohan Khilar, NIT Rourkela
Amos Omondi, Teesside University Dr. Anil Upadhyay, UPTU
Dr Amr Ahmed, University of Lincoln Cheng Luo, Coppin State University
Dr. Keith Leonard Mannock, University of London Harminder S. Bindra, PTU
Dr. Alexandra I. Cristea, University of Warwick Santosh K. Pandey, The Institute of CA of India
Dr. V. K. Anand, Punjab University Dr. S. Abdul Khader Jilani, University of Tabuk
Dr. Rakesh Mishra, University of Huddersfield Kamaljit I. Lakhtaria, Saurashtra University
Dr. S.Karthik, Anna University Dr. Anirban Kundu, West Bengal University of Technology
Amol D. Potgantwar, University of Pune Dr Pramod B Patil, RTM Nagpur University
Dr. Neeraj Kumar Nehra, SMVD University Dr. Debasis Giri, WBUT
Dr. Rajesh Kumar, National University of Singapore Deo Prakash, Shri Mata Vaishno Devi University
Dr. Sabnam Sengupta, WBUT Rakesh Lingappa, VTU
D. Jude Hemanth, Karunya University P. Vasant, Universiti Teknologi PETRONAS
Dr. A.Govardhan, JNTU Yuanfeng Jin, YanBian University
Dr. R. Ponnusamy, Vinayaga Missions University Rajesh K Shukla, RGPV
Dr. Yogeshwar Kosta, CHARUSAT Dr. S. Radha Rammohan, D.G. of Technological Education
T.N.Shankar, JNTU Prof. Hari Mohan Pandey, NMIMS University
Dayashankar Singh, UPTU Prof. Kanchan Sharma, GGS Indraprastha Vishwavidyalaya
Bidyadhar Subudhi, NIT, Rourkela Dr. S. Poornachandra, Anna University
Dr. Nitin S. Choubey, NMIMS Dr. R. Uma Rani, University of Madras
Rongrong Ji, Harbin Institute of Technology, China Dr. V.B. Singh, University of Delhi
Anand Kumar, VTU Hemant Kumar Mahala, RGPV
Prof. S K Nanda, BPUT Prof. Debnath Bhattacharyya, Hannam University
Dr. A.K. Sharma, Uttar Pradesh Technical University Dr. A.S.Prasad, Andhra University
Rajeshree D. Raut, RTM, Nagpur University Deepak Joshi, Hannam University
Dr. Vijay H. Mankar, Nagpur University Dr. P K Singh, U P Technical University
Atul Sajjanhar, Deakin University RK Tiwari, U P Technical University
Navneet Tiwari, RGPV Dr. Himanshu Aggarwal, Punjabi University
Ashraf Bany Mohammed, Petra University Dr. K.D. Verma, S.V. College of PG Studies & Research
Totok R Biyanto, Sepuluh Nopember R.Amirtharajan, SASTRA University
Sheti Mahendra A, Dr. B A Marathwada University Md. Rajibul Islam, University Technology Malaysia
Koushik Majumder, WBUT S.Hariharan, B.S. Abdur Rahman University
Dr.R.Geetharamani, Anna University Dr.S.Sasikumar, HCET
Rupali Bhardwaj, UPTU Dakshina Ranjan Kisku, WBUT
Gaurav Kumar, Punjab Technical University A.K.Verma, TERI
Prof. B.Nagarajan, Anna University Vikas Singla, PTU
Dr H N Suma, VTU Dr. Udai Shanker, UPTU
Anu Suneja, Maharshi Markandeshwar University Prof. Rachit Garg, GNDU
Aung Kyaw Oo, DSA, Myanmar Dr Lefteris Gortzis, University of Patras, Greece.
Suhas J Manangi, Microsoft Mahdi Jampour, Kerman Institute of Higher Education
Prof. D S Suresh, Pune University Prof.M.V.Deshpande, University of Mumbai
Dr. Vipan Kakkar, SMVD University Dr. Ian Wells, Swansea Metropolitan University, UK
Dr. M Ayoub Khan, Ministry of Communications and IT, Govt. of India Yongguo Liu, University of Electronic Science and Technology of China
Prof. Surendra Rahamatkar, VIT Prof. Shishir K. Shandilya, RGPV
M.Azath, Anna University Liladhar R Rewatkar, RTM Nagpur University
R. Jagadeesh K, Anna University Amit Rathi, Jaypee University
Dr. Dilip Mali, Mekelle University, Ethiopia. Dr. Paresh Virparia, Sardar Patel University
Morteza S. Kamarposhti, Islamic Azad University of Firoozkuh, Iran Dr. D. Gunaseelan, Directorate of Technological Education, Oman
Dr. M. Azzouzi, ZA University of Djelfa, Algeria. Dr. Dhananjay Kumar, Anna University
Jayant shukla, RGPV Prof. Yuvaraju B N, VTU
Dr. Ananya Kanjilal, WBUT Daminni Grover, IILM Institute for Higher Education
Vishal Gour, Govt. Engineering College Monit Kapoor, M.M University
Dr. Binod Kumar, ISTAR Amit Kumar, Nanjing Forestry University, China.
Dr.Mallikarjun Hangarge, Gulbarga University Gursharanjeet Singh, LPU
Dr. R.Muthuraj, PSNACET Mohd.Muqeem, Integral University
Dr. Chitra A. Dhawale, Symbiosis Institute of Computer Studies and Research Dr. Abdul Jalil M. Khalaf, University of Kufa, IRAQ
Dr. Rizwan Beg, UPTU R.Indra Gandhi, Anna University
V.B Kirubanand, Bharathiar University Mohammad Ghulam Ali, IIT, Kharagpur
Dr. D.I. George A., Jamal Mohamed College Kunjal B.Mankad, ISTAR
Raman Kumar, PTU Lei Wu, University of Houston – Clear Lake, Texas.
G. Appasami , Anna University S.Vijayalakshmi, VIT University
Dr. Gurpreet Singh Josan, PTU Dr. Seema Shah, IIIT, Allahabad
Dr. Wichian Sittiprapaporn, Mahasarakham University, Thailand Chakresh Kumar, MRI University, India
Dr. Vishal Goyal, Punjabi University, India Dr. A.V.Senthil Kumar, Bharathiar University, India
R.C.Tripathi, IIIT-Allahabad, India Prof. R.K. Narayan , B.I.T. Mesra, India
PAPERS
System Progress Estimation in Time based Coordinated Checkpointing Protocols
Authors : P. K. Suri, Meenu Satiza
1-6
Adaptive Learning for Algorithm Selection in Classification
Authors : Nitin Pise, Parag Kulkarni
7-12
Routing Protocol for Mobile Nodes in Wireless Sensor Network
Authors : Bhagyashri Bansode, Rajesh Ingle
13-16
32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale, and
Atom Pineview-D Intel General Purpose Processors
Authors : Izzeldin Ibrahim Mohd, Chay Chin Fatt, Muhammad N. Marsono
17-23
Recognizing and Interpreting Sign Language Gesture for Human Robot Interaction
Authors : Shekhar Singh, Akshat Jain, Deepak Kumar
24-31
Change Data Capture on OLTP Staging Area for Nearly Real Time Data Warehouse based on Database Trigger
Authors : I Made Sukarsa, Ni Wayan Wisswani, K. Gd. Darma Putra, Linawati
32-37
Decision Support System for Admission in Engineering Colleges based on Entrance Exam Marks
Authors : Miren Tanna
38-41
A Genetic Algorithm based Fuzzy C Mean Clustering Model for Segmenting Microarray Images
Authors : Biju V G, Mythili P
42-48
System Progress Estimation in Time based Coordinated
Checkpointing Protocols
P. K. Suri
Dean, Research and Development; Chairman, CSE/IT/MCA, HCTM Technical Campus, Kaithal, Haryana, India
Meenu Satiza
HCTM Technical Campus
ABSTRACT
A mobile computing system consists of mobile and stationary nodes. Checkpointing is an efficient fault-tolerance technique used in distributed systems. Checkpointing in mobile systems faces many new challenges, such as low wireless bandwidth, frequent disconnections and lack of stable storage on mobile nodes. Coordinated checkpointing that minimizes the number of processes taking useless checkpoints is a suitable approach for introducing fault tolerance in such systems. Time-based checkpointing protocols eliminate communication overhead by avoiding extra control messages and useless checkpoints. Such protocols directly access stable storage when checkpoints are saved. In this paper a new probabilistic approach for evaluating system progress is devised that is suitable for mobile distributed applications. The system behavior is observed by varying system parameters such as fault rate, clock drift rate, saved-checkpoint time and checkpoint intervals. A validation regarding system progress is made via a simulation technique. The simulation results show that the proposed probabilistic model is well suited for mobile computing systems.
General Terms
Checkpointing, System progress, Simulation
Keywords
Distributed system, fault tolerance, time-based checkpointing, system progress, consistent checkpoint
1. INTRODUCTION
Checkpointing is a major fault-tolerance technique in which the state of a process is saved in stable storage so that the process can be restarted in case of a fault. There are two main categories of checkpointing techniques: (i) coordinated and (ii) uncoordinated checkpointing. In coordinated checkpointing, the processes send control messages to their dependent processes to save their states at the same time. This results in a globally consistent state from which the system recovers when a fault occurs. In uncoordinated checkpointing, the processes save their states independently; when a fault occurs, processes roll back to a point of recovery. Recently, a new class of time-based coordinated checkpointing techniques has been introduced that avoids extra coordination messages among dependent processes. The time-based approach relies on loosely synchronized timers, and timer information is piggybacked on application messages. The performance of time-based checkpointing protocols depends on the application's and system's characteristics, such as checkpoint intervals, saved-checkpoint time, resynchronization time and clock drift. We propose a probabilistic model for system progress with particular system parameter values. This model shows how system operations can affect system performance, and the simulation results show the states at which the system performs well for particular values of the defined parameters. A simulation model is also developed to validate the system progress.
1.1 Related work
In 1985, Chandy and Lamport [1] proposed a global snapshot algorithm for distributed systems. The global state is achieved by coordinating all the processors and logging the channel state at the time of checkpointing. Special messages called markers are used for coordination and for identifying the messages originating in different checkpointing intervals.
In 1987, Koo and Toueg [5] proposed a two-phase minimum-process blocking scheme for distributed systems. The outcome of the algorithm is a consistent global checkpointing state that involves only the participating processes and prevents the livelock problem (where a single failure can cause infinite rollbacks).
In 1996, Ravi Prakash and Mukesh Singhal [11] presented a synchronous non-blocking snapshot collection algorithm for mobile systems that does not force every node to take a local snapshot. They also proposed a minimal rollback/recovery algorithm in which the computation at a node is rolled back only if it depends on operations that have been undone due to the failure of node(s).
In 1996, N. Neves and W. K. Fuchs [9] presented a time-based checkpointing protocol that eliminates the communication overhead present in traditional checkpointing protocols. The protocol was implemented on the CM-5 and its performance was compared across several applications.
In 2001, Guohong Cao and Mukesh Singhal [4] introduced the concept of the "mutable checkpoint", which is neither a tentative checkpoint nor a permanent checkpoint, to design efficient checkpointing algorithms for mobile computing systems. Mutable checkpoints can be saved anywhere, e.g., in the main memory or on the local disk of mobile hosts (MHs). In this way, taking a mutable checkpoint avoids the overhead of transferring large amounts of data to the stable storage at mobile support stations (MSSs) over the wireless network.
In 2002, Chi-Yi Lin et al. [7] proposed an improved time-based checkpointing protocol by integrating an improved timer synchronization technique. The time synchronization mechanism uses the accurate timers in MSSs as an absolute reference, since the timers in fixed hosts (MSSs) are more reliable than those in MHs.
In 2003, Chi-Yi Lin and Sy-Yen Kuo [8] proposed an efficient time-based non-blocking checkpointing protocol. The protocol reduces the number of checkpoints transmitted over the wireless link and uses synchronized timers to indirectly coordinate the creation of checkpoints. In this system, each process first takes a soft checkpoint, which is saved in the main memory of the mobile host. If the process is irrelevant to the initiator, the soft checkpoint can be discarded; otherwise it is saved on the local disk at a later time as a hard checkpoint. As a result, the number of disk accesses in mobile hosts can be reduced. A further advantage of the time-based approach is that it removes the need for explicit coordination messages.
In 2006, Men Chaoguang et al. [2] proposed a two-phase time-based checkpointing strategy that eliminates orphan and in-transit messages. The strategy evaluates a time-based adaptive checkpointing scheme in which processes need neither block their computation nor log all messages. The inconsistency issues of the strategy are also discussed.
In 2007, Awasthi and Kumar [6] proposed a probabilistic approach based on keeping track of the direct dependencies of processes. The initiator MSS collects the direct dependency vectors of all processes and sends the checkpoint request to all dependent MSSs. This step reduces the time needed to collect the coordinated checkpoint, and it also reduces the number of useless checkpoints and the blocking of processes. Selective messages are buffered at the receiver end, and exact dependencies among processes are maintained. Hence, useless checkpoint requests and duplicate checkpoint requests are reduced.
In 2008, Suchistmita Chinara and Santanu Kumar Rath [3] proposed an energy-efficient, mobility-adaptive distributed clustering algorithm for mobile ad-hoc networks, in which better cluster stability and a low maintenance overhead are achieved through volunteer and non-volunteer cluster heads. The proposed algorithm is divided into cluster formation, an energy consumption model and cluster maintenance. The objective of the algorithm is to minimize the re-affiliation rate (re-affiliation is the situation in which a member node, when conditions change, searches for another head). The simulation experiment compares the IDs of members: a high-ID member acts as cluster head, and cluster maintenance overhead is reduced over time.
In 2011, Anil Panghal, Sharda Panghal and Mukesh Rana [10] presented a comprehensive study of the existing techniques, namely checkpoint-based recovery and log-based recovery. Based on the study, they conclude that log-based recovery techniques, which combine checkpointing with logging of nondeterministic events during pre-failure execution, are suitable for systems that frequently interact with the outside world. They also conclude that communication-induced checkpointing reduces the message overhead and, if implemented along with checkpoint staggering, can prove to be the best method for recovery in distributed systems.
2. PROBLEM FORMULATION
In this paper a number of time-based checkpointing protocols are analyzed. In [9], Neves and Fuchs introduced the concept of timers to reduce communication overhead. They made the following assumptions:
(a) The processes involved in checkpointing have loosely synchronized clocks.
(b) All the processes are approximately synchronized and have a deviation from real time in their local clock timers. The local clock drift rate between the processes is assumed to be ρ.
(c) The timers will terminate at most 2ρT/(1 − ρ^2) ≈ 2ρT seconds apart, where T is the initial timer value. Normally the drift rate ρ attains values between 10^-5 and 10^-8.
(d) The clocks will show a maximum drift of 2NρT after N checkpoint intervals.
Consider Figure 1, in which P1 and P2 are two processes. Message M1 is sent from process P1 to P2 in its Nth checkpoint interval, and message M2 is sent in the same interval of P1 to the (N+1)th interval of P2. Suppose a fault arrives on the timeline of process P2. It is observed that checkpoint N+1 is saved in P2 before the fault occurs, and P2 has yet to receive M2; P2 therefore has no information about message M2. Such situations can be handled by resending unacknowledged messages.
According to Neves, too much time is wasted in storing checkpoints, and a process has to block its execution for a long time, which is impractical. Such inconsistencies from in-transit or orphan messages can be handled by using the time-based checkpointing approach, in which messages are sent along with timer information. According to the Men Chaoguang approach [2], orphan messages can be eliminated by using a communication-induced approach, and in-transit messages can be stored in a message-logging queue. Figure 2 illustrates these situations.
Consider two processes P1 and P2 with timers T1 and T2 respectively. Let MD = D + 2ρT be the maximum deviation between the timers of the two processes (T1 − T2). tmax is the maximum delivery time by which process P2 should receive message M1, and tmin is the minimum delivery time of message M4. ED = MD − tmin is the effective deviation, during which the processes cannot send or receive messages. M2 and M3 lie in the effective deviation, and they give rise to inconsistency due to orphan and in-transit messages. To handle the orphan message M3, a communication-induced checkpoint is placed before the delivery of M3, and the in-transit message M2 can be retrieved from the message-logging queue.
It is observed that the parameter values ρ, T, D, tmin, tmax, the fault rate λ, the saved-checkpoint time S, and the time t at which a fault occurs affect the performance of a mobile distributed system. In this paper a probabilistic model is developed in which the system performance is evaluated by varying these system parameter values.
Figure 1: Inconsistent state (messages M1 and M2 between checkpoint intervals N and N+1 of processes P1 and P2; a fault arrives on P2's timeline)
3. SYSTEM PROGRESS EVALUATION
3.1 Probabilistic model development
When faults occur in the system, resynchronization is performed. Here the system's progress is defined as the ratio of constructive computational work to the total work during a given interval of time.
In order to perform a simulation experiment on a distributed system, a random sample of times t1, t2, t3, ..., tn is generated by transforming n uniform random numbers u1, u2, u3, ..., un in the interval (0, 1), where λ is a positive constant depending on the characteristics of the distributed system [12]. The general term of time tk is
tk = –(1/λ)*ln(uk), where k ∈ [1, n]
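For illustration, this inverse-transform step can be sketched in a few lines of Python (NumPy assumed; the function name and seed are ours, not part of the paper):

```python
import numpy as np

def fault_times(n, lam, seed=0):
    """Draw n fault-arrival times t_k = -(1/lam) * ln(u_k) by inverse-transform
    sampling of uniform random numbers u_k in (0, 1)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(np.finfo(float).tiny, 1.0, size=n)  # keep u strictly above 0
    return -np.log(u) / lam

print(fault_times(5, lam=1e-5))  # five exponentially distributed fault times
```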
Let ts be the time to store a checkpoint, tmin the minimum checkpoint delivery time, tmax the maximum checkpoint delivery time, Tdiff the maximum difference between the timers of different processes, L the length of the checkpoint intervals before resynchronization, ρ the clock drift rate between the processes, fr the fault rate, Tw the probabilistic wasted time of fault occurrence, Twr the probabilistic wasted time of fault occurrence between resynchronizations, and tr the resynchronization time.
Let T1, T2, T3, ..., Tnmax be the checkpoint intervals between resynchronizations. When no fault occurs, the number of intervals equals the maximum number of checkpoint intervals, nmax.
The system progress is evaluated by developing the following probabilistic model. Here
ts ≤ Tdiff + 2*nmax*L*ρ – tmin
so that
(ts + tmin – Tdiff)/(2*L*ρ) ≤ nmax
nmax = ceil((ts + tmin – Tdiff)/(2*L*ρ))
Let Tcons be the time interval during which constructive computational work is done, given as
Tcons = L – ts – tk, where k ∈ [1, n]
The probability density function of fault occurrence is the exponential density f(t) = fr*e^(–fr*t).
Let Ir be the expected number of intervals between resynchronizations: the sum over intervals k < nmax of k weighted by the probability of a fault occurring in the kth interval, plus nmax weighted by the probability of no fault occurring in any interval up to nmax.
In Fig. 3, a set of nmax checkpoint-interval numbers ({1, 2, 3, ..., k, k+1, ..., nmax}) is considered on the timeline of the process. The probability of a fault occurring in the kth checkpoint interval since the last resynchronization, Pr[k], is given as
Pr[k] = e^(–fr*L*k) – e^(–fr*L*(k+1))
Ir = Σ(k=1 to nmax–1) k*Pr[k] + nmax*e^(–fr*L*nmax)
which gives the closed form
Ir = (1 – e^(–fr*L*nmax))/(e^(fr*L) – 1)
The probabilistic wasted time due to fault occurrence, Tw, and the wasted time between resynchronizations, Twr, are given as
Tw = (e^(–fr*Tcons)*(–fr*Tcons – 1) + 1)/(fr*(1 – e^(–fr*Tcons)))
Twr = (1 – e^(–fr*L*nmax))*(ts + Tw) + e^(–fr*L*nmax)*tr
Let Tr be the expected total time between resynchronizations:
Tr = Ir*L + Twr
Let TCW be the expected time used in constructive work between resynchronizations:
TCW = Ir*Tcons
The system progress (SP) of a process = TCW/Tr
The system progress of all the processes = Σ TCW/Tr
The system progress of the complete system having n processes = (Σ TCW/Tr)/n
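The closed-form expressions above can be evaluated directly. The following Python sketch is our own rendering of the model, not the authors' simulator; the parameter defaults follow Table 1, and tk (the sampled fault time inside an interval) is set to zero here. With the Table 1 values it reproduces the first rows of Table 2:

```python
import math

def system_progress(L, ts, fr, rho, tr=0.1, Tdiff=0.01, tmin=0.001, tk=0.0):
    """Closed-form system-progress model for a single process.

    L: checkpoint-interval length, ts: time to store a checkpoint,
    fr: fault rate, rho: clock drift rate, tr: resynchronization time,
    Tdiff: max timer difference, tmin: min delivery time,
    tk: sampled fault time inside an interval (0 for the closed form)."""
    # Maximum number of checkpoint intervals between resynchronizations
    nmax = math.ceil((ts + tmin - Tdiff) / (2.0 * L * rho))
    Tcons = L - ts - tk                          # constructive part of one interval
    e_n = math.exp(-fr * L * nmax)               # Pr[no fault before resync]
    Ir = (1.0 - e_n) / (math.exp(fr * L) - 1.0)  # expected intervals per resync
    a = fr * Tcons
    # Expected wasted time when a fault occurs inside an interval
    Tw = (math.exp(-a) * (-a - 1.0) + 1.0) / (fr * (1.0 - math.exp(-a)))
    Twr = (1.0 - e_n) * (ts + Tw) + e_n * tr     # wasted time between resyncs
    Tr = Ir * L + Twr                            # total time between resyncs
    TCW = Ir * Tcons                             # constructive-work time
    return TCW / Tr

# Table 1 values for the checkpoint-interval experiment:
print(system_progress(L=100, ts=0.7, fr=1e-5, rho=1e-6))    # ~0.9925 (Table 2, row 1)
print(system_progress(L=10100, ts=0.7, fr=1e-5, rho=1e-6))  # ~0.9503 (Table 2, row 2)
```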
3.2 Validation of system progress
To confirm the correctness of the system-progress evaluation, a validation is implemented to establish the confidence level of the simulation. To achieve a validation with a good confidence level, first 1000 runs of the simulation experiment are made from 10 samples with 100 checkpoint intervals; then 2000, 3000, ..., 10000 runs are made, and the average value of the system progress, its standard deviation (SD), and the upper (UL) and lower (LL) confidence limits of the system progress are computed. Further, the corresponding interval of interest Tcons, and from it the corresponding optimal system progress, is evaluated for the system by varying the parameters λ, ρ, ts, fr and L. The simulation technique used is as follows.
Let us take n independent samples of time intervals of length L and, for these n samples, the corresponding system-progress values SP1, SP2, SP3, ..., SPn. Their mean μ and standard deviation σ are then evaluated. The sample mean of all system-progress values is evaluated by the formula:
SPmean = Σ SPi/n
The variance σ^2 can be estimated as:
σ^2est = (1/(n–1))*Σ(SPk – SPmean)^2
Fig. 2: Elimination of the inconsistent state
Fig. 3: Fault arrival in the kth checkpoint interval
The general relationship between the parameters is given as
Pr{μ – t ≤ SPmean ≤ μ + t} = 1 – α
where t is the tolerance on either side of the mean within which the estimate is expected to fall with probability 1 – α. With the standard normal distribution function
Φ(y) = ∫ from –∞ to y of (1/√(2π))*e^(–z^2/2) dz
and z = √n*(SPmean – μ)/σest, the upper confidence limit UL and lower confidence limit LL of the system progress can be obtained as
UL = SPmean + (y(1–α/2)*σest)/√n
LL = SPmean – (y(1–α/2)*σest)/√n
y(1–α/2) = 2.58 (99% confidence level)
The interval (LL, UL) will contain the true mean with the specified experimental confidence [12].
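A minimal Python sketch of these confidence-limit formulas follows; the sample values are hypothetical placeholders, not results from the paper:

```python
import math

def confidence_limits(sp_values, y=2.58):
    """Upper/lower confidence limits for the mean system progress of a
    list of simulated SP values (y = 2.58 gives the 99% level)."""
    n = len(sp_values)
    sp_mean = sum(sp_values) / n
    var_est = sum((sp - sp_mean) ** 2 for sp in sp_values) / (n - 1)
    half_width = y * math.sqrt(var_est / n)
    return sp_mean - half_width, sp_mean + half_width

# Ten hypothetical system-progress samples (placeholders, not paper data):
samples = [0.9951, 0.9949, 0.9953, 0.9950, 0.9952,
           0.9948, 0.9954, 0.9951, 0.9950, 0.9952]
ll, ul = confidence_limits(samples)
print(ll, ul)  # the width (ul - ll) should stay below 2 * tolerance = 0.002
```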
3.3 Simulation results
The simulation results show that the system performance is affected by various factors such as the number of checkpoint intervals, the clock drift rate of processors, the fault rate of processors, and the time to save checkpoints. In our simulation experiment, the variation of these factors against the system progress is shown in tabular as well as graphical form.
3.3.1 Checkpoint interval vs. system progress
Table 1 lists the parameter values used in the proposed model. The first column shows the varied quantity: fault rate (Table 4), drift rate (Table 5) or checkpoint interval (Table 2); the remaining columns show the other variables and their fixed values.
Table 1: Parametric values of the system model
Varied parameter          Other parameters (fixed values)
(common to all)           tr = 0.1, Tdiff = 0.01, tmin = 0.001
Fault rate (fr)           L = 3600, ts = 0.7, ρ = 0.000001
Drift rate (ρ)            fr = 0.00001, L = 3600, ts = 0.7
Checkpoint interval (L)   fr = 0.00001, ρ = 0.000001, ts = 0.7
According to these values the system progress is evaluated and the respective graphs are drawn. First, for increasing values of the checkpoint interval L, correspondingly decreasing values of the system progress are obtained; that is, as the checkpoint intervals grow, the system progress decreases (Fig. 4), so the system progress is affected by the checkpointing.
Table 2. Checkpoint intervals vs. system progress
Checkpoint intervals (L)    System progress (SP)
100 0.9925
10100 0.950282
20100 0.902832
30100 0.857017
40100 0.81285
50100 0.770316
Fig 4: Checkpoint intervals vs. System Progress
3.3.2 Saved checkpoint time vs. system progress
Table 3 shows that as the time to save a checkpoint increases, the system progress decreases. This is illustrated in Fig. 5, and is expected: the longer it takes to save a checkpoint, the lower the system progress.
Table 3. Saved checkpoint time vs. system progress
Saved checkpoint time (ts)    System progress (SP)
1 0.98183
2 0.981554
3 0.981278
4 0.980999
5 0.980723
6 0.980441
7 0.980166
8 0.979887
9 0.979609
10 0.979331
11 0.979053
12 0.978775
13 0.978498
14 0.978221
15 0.977942
International Journal of Computer Applications (0975 – 8887)
Volume 52– No.11, August 2012
5
Fig 5: Saved checkpoint time vs. System Progress
3.3.3 Fault rate vs. system progress
This subsection describes how the fault rate affects the system progress. Table 4 shows that as the fault rate increases, the system progress tends to decrease. This is illustrated in Fig. 6.
Table 4. Fault rate vs. system progress
Fault rate (fr)    System progress (SP)
1.00E-16 0.999803
1.00E-15 0.999741
1.00E-14 0.999767
1.00E-13 0.999783
1.00E-12 0.999802
1.00E-11 0.999761
1.00E-10 0.999726
1.00E-09 0.999765
1.00E-08 0.999658
1.00E-07 0.999625
Fig 6: Fault Rate vs. System Progress
3.3.4 Drift rate vs. system progress
This subsection describes how the drift rate affects the system progress. For low values of the drift rate the system progress is slightly higher; overall, the system progress of the non-blocking protocol is not much affected by different values of the drift rate. Table 5 and Fig. 7 illustrate this.
Table 5. Drift rate vs. system progress
Drift rate (ρ)    System progress (SP)
0.1 0.994184
0.01 0.993871
1.00E-03 0.993605
1.00E-04 0.983866
1.00E-05 0.993416
1.00E-06 0.994208
1.00E-07 0.993747
1.00E-08 0.994071
1.00E-09 0.993928
1.00E-10 0.994053
Fig 7: Drift Rate vs. System Progress
3.3.5 System progress validation
In Table 6, the first entry of the first column indicates that 10 samples of checkpoint intervals of length 100 are taken and the corresponding system progress for 100, 200, ..., 1000 checkpoint intervals is evaluated; their average is shown in the second column (i.e., 0.99520). The remaining columns show the standard deviation and the upper and lower confidence limits respectively. Similarly, the system progress of the other samples, with 2000, 3000, ..., 10000 checkpoint intervals, is validated. Similar validation can be applied to the other system parameters. The difference between the upper and lower confidence limits should be less than twice the tolerance value; here the tolerance value is 0.001 for 99% confidence.
Table 6. System progress validation
Sample No.    System progress average    σest    Upper confidence limit    Lower confidence limit
1000 0.99520 0.00114 0.9961 0.99427
2000 0.99180 0.00141 0.9929 0.99065
3000 0.98702 0.00146 0.9882 0.98583
4000 0.98215 0.00014 0.9833 0.98095
5000 0.97727 0.00014 0.9784 0.97606
6000 0.97238 0.00147 0.9735 0.97117
7000 0.96750 0.00147 0.9687 0.96629
8000 0.96263 0.00147 0.9638 0.96143
9000 0.95777 0.00146 0.9589 0.95658
10000 0.95293 0.00146 0.9541 0.95174
4. CONCLUSION
In this paper the problem of fault arrival is discussed. A probabilistic model is developed for evaluating the system progress of the processes for a particular set of parameters. The system progress is evaluated by introducing times generated by the negative exponential distribution. The system progress is optimized at particular values of the system parameters. A validation of the system progress on the basis of the checkpoint-interval length (L) is derived; such validation can also be carried out for the other parameters, such as drift rate, fault rate and saved checkpoint time.
5. ACKNOWLEDGMENTS
Sincere thanks to the HCTM Technical Campus Management, Kaithal-136027, Haryana, India, for their constant encouragement.
6. REFERENCES
[1] Chandy, K. M. and Lamport, L. "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63-75, Feb. 1985.
[2] Chaoguang M., Yunlong Z. and Wenbin Y., “A two-
phase time-based consistent checkpointing strategy,” in
Proc. ITNG’06 3rd IEEE International Conference on
Information Technology: New Generations, April 10-12,
2006, pp. 518–523.
[3] Chinara Suchistmita and Rath S.K.“An Energy Efficient
Mobility Adaptive Distributed Clustering Algorithm for
Mobile ad-hoc Network” 978-1-4244-2963-9/08 (2008)
IEEE.
[4] Guohong Cao and Singhal Mukesh, “Mutable
Checkpoints: a new checkpointing approach for Mobile
Computing Systems”, IEEE Transaction on Parallel and
Distributed Systems, vol. 12, no. 2, pp. 157-172,
February 2001
[5] Koo, R. and Toueg, S. "Checkpointing and Rollback-Recovery for Distributed Systems", IEEE Transactions on Software Engineering, SE-13(1), pp. 23-31, January 1987.
[6] Kumar Lalit, Kumar Awasthi, “A Synchronous
Checkpointing Protocol for Mobile Distributed Systems:
Probabilistic Approach” International Journal of
Information and Computer Security, Vol.1, No.3 .pp
298-314, 2007.
[7] Lin C., Wang S., and Kuo S., “A Low Overhead
Checkpointing Protocol for Mobile Computing System”
in Proc of the 2002 IEEE Pacific Rim International
Symposium on dependable computing (PRDC’02).
[8] Lin C., Wang S., and Kuo S., “An efficient time-based
checkpointing protocol for mobile computing systems
over wide area networks,” in Lecture Notes in Computer
Science 2400, Euro-Par 2002, Springer-Verlag, 2002, pp.
978–982. Also in Mobile Networks and Applications,
2003, vol. 8, no. 6, pp. 687–697.
[9] Neves N., Fuchs W.K., “Using time to improve the
performance of coordinated checkpointing,” In:
Proceedings of 2nd IEEE International Computer
Performance and Dependability Symposium, Urbana-
Champaign, USA, 1996, pp.282 –291.
[10] Panghal, Anil, Panghal, Sharda and Rana, Mukesh, "Checkpointing Based Rollback Recovery in Distributed Systems", Journal of Current Computer Science and Technology, Vol. 1, Issue 6, 2011, pp. 258-266.
[11] Prakash R. and Singhal M., “Low-Cost Checkpointing
and Failure Recovery in Mobile Computing Systems”,
IEEE Transaction on Parallel and Distributed Systems,
vol. 7, no. 10, pp. 1035-1048, October1996.
[12] Deo, Narsingh, "System Simulation with Digital Computer".
Adaptive Learning for Algorithm Selection in
Classification
Nitin Pise
Research Scholar, Department of Computer Engg. & IT, College of Engineering, Pune, India
Parag Kulkarni, PhD
Adjunct Professor, Department of Computer Engg. & IT, College of Engineering, Pune, India
ABSTRACT
No learner is generally better than another. If a learner performs better than another learner in some learning situations, then the first learner usually performs worse than the second in other situations. In other words, no single learning algorithm can perform well and uniformly outperform other algorithms over all learning or data mining tasks. There is an increasing number of algorithms and practices that can be used for the very same application. With the explosion of available learning algorithms, a method for helping the user select the most appropriate algorithm or combination of algorithms to solve a problem is becoming increasingly important. In this paper we use meta-learning to relate the performance of machine learning algorithms to the characteristics of different datasets. The paper concludes by proposing a system which can learn dynamically from the given data.
General Terms
Machine Learning, Pattern Classification
Keywords
Learning algorithms, dataset characteristics, algorithm selection
1. INTRODUCTION
Knowledge discovery [3] is an iterative process. The analyst must select the right model for the task to be performed and, within it, the right algorithm, always considering the special morphological characteristics of the problem. The algorithm is then invoked and its output is evaluated. If the evaluation results are poor, the process is repeated with new selections. A plethora of commercial and prototype systems with a variety of models and algorithms is at the analyst's disposal; however, the selection among them is left to the analyst. The machine learning field has been evolving for a long time and has given us a variety of models and algorithms to perform classification, e.g., decision trees, neural networks, support vector machines [4], rule inducers, nearest neighbor, etc. The analyst must select among them the ones that best match the morphology and the special characteristics of the problem at hand. This selection is one of the most difficult problems, since there is no model or algorithm that performs better than all others independently of the particular problem characteristics. A wrong choice of model can have an even more severe impact: a hypothesis appropriate for the problem at hand might be ignored because it is not contained in the model's search space.
There is an increasing number of algorithms and practices that can be used for the very same application. Extensive research has been performed to develop appropriate machine learning techniques for different data mining tasks, and this has led to a proliferation of learning algorithms. However, previous work has shown that no learner is generally better than another: if a learner performs better than another learner in some learning situations, the first usually performs worse than the second in other situations [5]. In other words, no single learning algorithm can outperform the other algorithms over all classification tasks. This has been confirmed by the "no free lunch" theorems [6]. The major reasons are that a learning algorithm performs differently on different datasets and that different algorithms embody different varieties of 'inductive bias' [7]. In real-world applications, users need to select an appropriate learning algorithm according to the classification task to be performed [8],[9]. If the algorithm is selected inappropriately, the result can be slow convergence or a sub-optimal local minimum. Meta-learning has been proposed to deal with the issue of algorithm selection [10]. One of the aims of meta-learning is to help the user determine the most suitable learning algorithm(s) for the problem at hand. The task of meta-learning is to find functions that map datasets to predicted data mining performance (e.g., predictive accuracy, execution time, etc.). To this end meta-learning uses a set of attributes, called meta-attributes, to represent the characteristics of classification tasks, and searches for correlations between these attributes and the performance of learning algorithms. Instead of executing all learning algorithms to find the optimal one, meta-learning is performed on the meta-data characterizing the data mining tasks. The effectiveness of meta-learning is largely dependent on the description of tasks (i.e., the meta-attributes).
Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a vote of their predictions. Combining classifiers, and studying methods for constructing good ensembles of classifiers that achieve higher accuracy, is an important research topic [1][2]. The drawback of ensemble learning is that, for it to be computationally efficient, the approximation of the posterior needs to have a simple factorial structure; this means that most dependence between the various parameters cannot be estimated. It is difficult to measure the correlation between classifiers built from different types of learners, there are learning-time and memory constraints, and the learned concept is difficult to understand.
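For illustration, the simple voting step that ensembles use to combine base-classifier predictions can be sketched as follows (our example, not a particular system's API):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine base classifiers' class predictions for one instance
    by simple unweighted voting."""
    return Counter(predictions).most_common(1)[0][0]

print(majority_vote(["spam", "ham", "spam"]))  # -> spam
```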
We therefore propose adaptive learning. We need an algorithm for selecting methods for a classification task: the datasets are characterized and mapped to learning algorithms or methods, and an adaptive function must be generated. Adaptive learning will be built on top of ensemble methods.
2. RELATED WORKS
Several algorithm selection systems and strategies have been proposed previously [3][10][11][12]. STATLOG [14] extracts various characteristics from a set of datasets and combines these characteristics with the performance of the algorithms. Rules are generated to guide inducer selection based on the dataset characteristics. This method is based on the morphological similarity between the new dataset and an existing collection of datasets: when a new dataset is presented, its characteristics are compared to those of the collection of old datasets, which costs a lot of time. Predictive clustering trees for ranking are proposed in [15], using relational descriptions of the tasks. The relative performance of the algorithms on a given dataset is predicted from a given relational dataset description. The results are not very good, with most relative errors over 1.0, which is worse than the default prediction. The Data Mining Advisor (DMA) [16] is a system that already has a set of algorithms and a collection of training datasets, and the performance of the algorithms on every subset of the training datasets is known. When the user presents a new dataset, DMA first finds a similar subset in the training datasets; it then retrieves information about the performance of the algorithms, ranks them, and gives an appropriate recommendation. Our approach is inspired by the method used in [16].
Most work in this area aims to relate properties of the data to the effect of learning algorithms, including several large-scale studies such as the STATLOG (Michie et al., 1994) and METAL (METAL-consortium, 2003) projects. We use the term in a broader sense, referring both to 'manual' analysis of learner performance by querying, and to automatic model building by applying learning algorithms over large collections of meta-data. An instance-based learning algorithm (k-nearest neighbor) is used to determine which training datasets are closest to a test dataset based on the similarity of features, and then to predict the ranking of each algorithm based on the performance on the neighboring datasets.
3. LEARNING ALGORITHMS AND DATASET CHARACTERISTICS
In general there are two families of algorithms: the statistical algorithms, which are best applied by an experienced analyst since they require a lot of technical skill and specific assumptions, and the data mining tools, which do not require much model specification but offer few diagnostic tools. Each family has reliable and well-tested algorithms that can be used for prediction. For the classification task [11], the most frequently encountered algorithms are logistic regression (LR), decision trees and decision rules, neural networks (NN) and discriminant analysis (DA). For regression, multiple linear regression (MLR), classification & regression trees (CART) and neural networks have been used extensively.
In the classification task, the error rate is defined straightforwardly as the percentage of misclassified cases in the observed-versus-predicted contingency table. When NNs are used to predict a scalar quantity, the square of the correlation between the predicted outcome and the target response is analogous to the r-squared measure of MLR. Therefore the error rate for the prediction task can be defined as:
Error rate = 1 – correlation^2(observed, predicted)
In both tasks, the error rate varies from zero to one, with one indicating bad performance of the model and zero the best possible performance.
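A small NumPy illustration of the two error-rate definitions, with made-up observed and predicted values:

```python
import numpy as np

# Classification: error rate = fraction of misclassified cases
observed_cls = np.array([1, 0, 1, 1, 0, 1])
predicted_cls = np.array([1, 0, 0, 1, 0, 1])
error_rate_cls = np.mean(observed_cls != predicted_cls)   # 1/6 here

# Prediction: error rate = 1 - correlation^2(observed, predicted)
observed = np.array([2.0, 3.1, 4.2, 5.0, 6.3])
predicted = np.array([2.2, 2.9, 4.0, 5.4, 6.1])
r = np.corrcoef(observed, predicted)[0, 1]
error_rate_reg = 1.0 - r ** 2
print(error_rate_cls, error_rate_reg)
```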
The dataset characteristics are related to the type of problem. For the classification task, the number of classes, the entropy of the classes and the percentage of the modal class can be used as useful indicators. The relevant ones for the regression task might be the mean value of the dependent variable, the median, the mode, the standard deviation, skewness and kurtosis. Some database measures include the number of records, the percentage of the original dataset used for training and for testing, the number of missing values and the percentage of incomplete records. Useful information also lies in the total number of variables. For the categorical variables of the database, the number of dimensions in homogeneity analysis, the average gain of the first and second eigenvalues of homogeneity analysis, and the average attribute entropy are the corresponding statistics. For the continuous variables, the average mean value, the average 5% trimmed mean, the median, the variance, the standard deviation, the range, the inter-quartile range, skewness, kurtosis and Huber's M-estimator are some of the useful statistics that can be applied to capture the information in the data set.
The determinant of the correlation matrix is an indicator of the interdependency of the attributes in the data set. The average correlation, as captured by the Cronbach-α reliability coefficient, may also be an important statistic. By applying principal component analysis to the numerical variables of the data set, the first and second largest eigenvalues can be observed.
If the data set for a classification task has categorical explanatory variables, then the average information gain and the noise-to-signal ratio are two useful information measures, while the average Goodman and Kruskal tau and the average chi-square significance value are two statistical indicators. In the case of continuous explanatory variables, Wilks' lambda and the canonical correlation of the first discriminant function may be measures of the discriminating power within the data set.
By comparing a numeric variable with a nominal variable using Student's t-test, two important statistics are produced to indicate the degree of their relation, namely eta squared and the significance of the F-test.
Table 1. DCT dataset properties [17]
Nr_Attributes    Nr_num_attributes
Nr_sym_attributes    Nr_examples
Nr_classes    MissingValues_Total
MissingValues_relative    Mean_Absolute_Skew
MStatistic    MeanKurtosis
NumAttrsWithOutliers    MstatDF
MstatChiSq    SDRatio
WilksLambda    Fract
Cancor    BartlettStatistic
Class Entropy    Mutual Information
Joint Entropy    Equivalent_nr_of_attrs
Entropy Attributes    NoiseSignalRatio
International Journal of Computer Applications (0975 – 8887)
Volume 52– No.11, August 2012
9
4. PROPOSED METHOD
Here we consider the properties of learning scenarios. We need to classify the learning scenario, so we extract features of the input data or datasets, using the concept of meta-learning. Meta-learning relates algorithms to their areas of expertise using specific problem characteristics: the idea is to learn about classifiers or learning algorithms in terms of the kind of data for which they actually perform well. Using dataset characteristics, called meta-features, one predicts the performance of the individual learning algorithms. These features are divided into several categories (a sketch of computing some of them follows the list):
- Sample or general features: the number of classes, the number of attributes, the number of categorical attributes, the number of samples or instances, etc.
- Statistical features: canonical discriminants, correlations, skew, kurtosis, etc.
- Information-theoretic features: class entropy, signal-to-noise ratio, etc.
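As a sketch of what such meta-feature extraction might look like (our illustration, assuming NumPy and SciPy; the function name and the particular subset of features are ours):

```python
import numpy as np
from scipy import stats

def extract_meta_features(X, y):
    """Compute a handful of the meta-features listed above for a numeric
    dataset: X is an (n_examples, n_attributes) array, y the class labels."""
    classes, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return {
        # sample / general features
        "n_examples": X.shape[0],
        "n_attributes": X.shape[1],
        "n_classes": len(classes),
        # statistical features
        "mean_abs_skew": float(np.mean(np.abs(stats.skew(X, axis=0)))),
        "mean_kurtosis": float(np.mean(stats.kurtosis(X, axis=0))),
        # information-theoretic features
        "class_entropy": float(-np.sum(p * np.log2(p))),
    }

X = np.random.default_rng(0).normal(size=(100, 5))
y = np.array([0, 1] * 50)
print(extract_meta_features(X, y))  # class_entropy = 1.0 for a balanced binary task
```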
We propose an adaptive methodology. Different aspects can be considered, e.g., parameters such as the input data, the learning methods, the learning policies and combinations of learning methods. There can be a single learner or multiple learners, and we can use simple voting or averaging when combining the outputs of the different learners.
5. EXPERIMENTS
5.1 Experimental Descriptions
Here we need to map the dataset's characteristics to the performance of the algorithms. We capture knowledge about the algorithms from experiments by calculating each algorithm's accuracy on each dataset. After the experiments, the accuracy of each algorithm on every dataset is saved in the knowledge base for future use. The ranking procedure is shown in Figure 1.
Given a new dataset, we use k-NN [7] to find the most similar datasets in the knowledge base. k-nearest-neighbor learning is the most basic instance-based method. The nearest neighbors of an instance are defined in terms of the standard Euclidean distance. Let an arbitrary instance x be described by the feature vector
<a1(x), a2(x), ..., an(x)>
where ar(x) denotes the value of the rth attribute of instance x. Then the distance between two instances xi and xj is defined to be
d(xi, xj) = √( Σ(r=1 to n) (ar(xi) – ar(xj))^2 )
Twenty-four characteristics are used to compare the similarity of two datasets. A distance function based on the characteristics of the two datasets is used to find the most similar neighbors, whose performance is expected to be similar or relevant to the new dataset. The recommended ranking for the new dataset is built by aggregating the learning algorithms' performance on the similar datasets. The knowledge base (KB) stores the dataset characteristics and the learning algorithms' performance on each dataset.
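A minimal sketch of this ranking procedure follows (our illustration; the knowledge-base layout, array names and value of k are assumptions, not the authors' implementation):

```python
import numpy as np

def recommend_ranking(new_meta, kb_meta, kb_accuracy, k=3):
    """Rank algorithms for a new dataset from its meta-feature vector.

    new_meta:    the 24 meta-features of the new dataset, shape (24,)
    kb_meta:     stored dataset characteristics, shape (n_datasets, 24)
    kb_accuracy: past accuracies, shape (n_datasets, n_algorithms)
    Returns algorithm indices ordered best-first."""
    # Euclidean distance d(xi, xj) over the meta-feature vectors
    dists = np.sqrt(((kb_meta - new_meta) ** 2).sum(axis=1))
    neighbors = np.argsort(dists)[:k]               # the k most similar datasets
    mean_acc = kb_accuracy[neighbors].mean(axis=0)  # aggregate their performance
    return np.argsort(-mean_acc)                    # higher mean accuracy ranks first
```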
Fig 1: The Ranking of Learning Algorithms (the characteristics of a new dataset are calculated and matched via k-NN against the knowledge base of dataset characteristics and learning algorithms' performance; the k similar datasets yield a ranking of learning algorithms and, after decision making, a recommended learning algorithm)
6. RESULTS AND DISCUSSIONS
Here we have used the Adult dataset [13], which has the following features:
- 48842 instances
- 14 attributes (6 continuous, 8 nominal)
- information on adults such as age, gender, ethnicity, marital status, education, native country, etc.
- instances classified into either "Salary > 50K" or "Salary <= 50K"
Table 2 shows the ranking of the eight algorithms used on the Adult dataset from the UCI Repository. The table gives the highest rank to the LogitBoost algorithm, followed by J48 and OneR, and the lowest rank to the ZeroR algorithm.
Table 2. Ranking of different algorithms on Adult Dataset
Algorithm Rank
LogitBoost 1
J48 2
OneR 3
DecisionStump 4
IB1 5
IBK 6
NaiveBayes 7
ZeroR 8
Table 3. Correctly & incorrectly classified instances for the Adult dataset
Algorithm    % correctly classified    % incorrectly classified
LogitBoost 84.68 15.32
ZeroR 76.07 23.93
Fig. 2: % classified instances with the top-ranked algorithm LogitBoost on the Adult dataset
Figure 2 shows the percentage of classified instances with the top-ranked algorithm, LogitBoost, on the Adult dataset: 84.68% of the instances are correctly classified. Figure 3 shows the percentage of classified instances with the lowest-ranked algorithm, ZeroR: 76.07% of the instances are correctly classified.
Fig. 3: % classified instances with the lowest-ranked algorithm ZeroR on the Adult dataset
7. CONCLUSIONS AND FUTURE WORK
In this paper, we present our preliminary work on using a meta-learning method to help the user select the most appropriate learning algorithms effectively and to give a ranking recommendation automatically. It will assist both novice and expert users: the ranking system can reduce the search space, provide recommendations and guide the user to select the most suitable algorithms. Thus the system will help the user learn adaptively using experience from past data. In future work, we will investigate our proposed method further and test it extensively on other datasets. Meta-learning helps improve results over the basic algorithms: using meta-characteristics on the Adult dataset to determine an appropriate algorithm, almost 85% correct classification is achieved with the LogitBoost algorithm, so out of the eight algorithms LogitBoost is recommended to the user.
8. ACKNOWLEDGMENTS
Our thanks to the experts who have contributed towards the development of the different algorithms and made them available to users.
9. REFERENCES
[1] Kuncheva, L., Bezdek, J., and Duin, R. 2001 Decision Templates for Multiple Classifier Fusion: An Experimental Comparison, Pattern Recognition, 34(2),
pp. 299-314, 2001.
[2] Dietterich, T. 2002 Ensemble Methods in Machine
Learning 1st Int. Workshop on Multiple Classifier
Systems, in Lecture Notes in Computer Science, F. Roli
and J. Kittler, Eds. Vol. 1857, pp.1-15, 2002.
[3] Alexmandros, K. and Melanie, H. J. 2001 Model
Selection via Meta-Learning: A Comparative Study.
International Journal on Artificial Intelligence Tools.
Vol. 10, No. 4 (2001).
[4] Joachims, T. 1998 Text Categorization with Support
Vector Machines: Learning with Many Relevant
Features. Proceedings of the European Conference on
Machine Learning, Springer.
[5] Schaffer, C. 1994 Cross-validation, stacking and bi-level stacking: Meta-methods for classification learning, in Cheeseman, P. and Oldford, R.W. (eds), Selecting Models from Data: Artificial Intelligence and Statistics IV, 51-59.
[6] Wolpert, D. 1996 The lack of a Priori Distinctions between Learning Algorithms, Neural Computation, 8,
1996, 1341-1420.
[7] Mitchell, T. 1997 Machine Learning, McGraw Hill.
[8] Brodley, C. E. J.1995 Recursive automatic bias selection
for classifier construction, Machine Learning, 20, 63-94.
[9] Schaffer, C. J. 1993 Selecting a Classification Method by Cross Validation, Machine Learning, 13, 135-143.
[10] Kalousis, A. and Hilario, M. 2000 Model Selection via
Meta-learning: a Comparative study, Proceedings of the
12th International IEEE Conference on Tools with AI,
Canada, 214-220.
[11] Koliastasis, D. and Despotis, D. J. 2004 Rules for
Comparing Predictive Data Mining Algorithms by Error
Rate, OPSEARCH, VOL. 41, No. 3.
[12] Fan, L., Lei M. 2006 Reducing Cognitive Overload by
Meta-Learning Assisted Algorithm Selection,
Proceedings of the 5th IEEE International Conference on
Cognitive Informatics, pp. 120-125, 2006.
[13] Frank, A. and Asuncion, A. 2010. UCI machine learning
Repository [http://archive.ics.uci.edu/ml]. Irvine, CA:
University of California, School of Information and
Computer Science.
[14] Michie, D. and Spiegelhalter, D. 1994 Machine Learning, Neural and Statistical Classification. Ellis Horwood Series in Artificial Intelligence, 1994.
[15] Todorvoski, L. and Blockeel, H. 2002 Ranking with
Predictive Clustering Trees, Efficient Multi-Relational
Data Mining, 2002.
[16] Alexandros, K. and Melanie, H. J. 2001 Model Selection via Meta-Learning: A Comparative Study. International Journal on Artificial Intelligence Tools, Vol. 10, No. 4 (2001).
[17] Peng, Y., Flach, P., Soares, C. and Brazdil, P. 2002 Improved Dataset Characterization for Meta-learning, Springer LNCS 2534, pp. 141-152, 2002.
Routing Protocol for Mobile Nodes in Wireless Sensor
Network
Bhagyashri Bansode
Department of Computer Engineering, Pune Institute of Computer Technology, Pune, Maharashtra, India
Rajesh Ingle, PhD
Department of Computer Engineering, Pune Institute of Computer Technology, Pune, Maharashtra, India
ABSTRACT
A wireless sensor network is made up of sensor nodes that are fixed or mobile. LEACH is a cluster-based protocol that uses time-division multiple access and supports mobile nodes in a WSN. A mobile node changes clusters, and LEACH waits for two TDMA cycles to update the cluster; within these two cycles a mobile node that has changed cluster heads cannot send data to any other cluster head, which causes packet loss. We propose an adaptive Low Packet Loss Routing (LPLR) protocol that supports mobile nodes with low packet loss. This protocol uses time-division multiple access scheduling to conserve the battery of the sensor nodes. We form clusters, and each cluster head updates its cluster after every TDMA cycle to reduce packet loss. The proposed protocol sends data to cluster heads in an efficient manner based on received signal strength. The performance of the proposed LPLR protocol is evaluated using NS-2.34 on the Linux 2.6.23.1.42.fc8 platform. It has been observed that the proposed protocol reduces packet loss compared to the LEACH-Mobile protocol.
Keywords
Cluster based routing, mobility, LEACH-Mobile, WSN
1. INTRODUCTION
A wireless sensor network (WSN) consists of spatially
distributed autonomous sensors to monitor physical or
environmental conditions, such as temperature, sound,
vibration, pressure, humidity, motion or pollutants and to
cooperatively pass their data through the network to a main
location. Modern networks are bi-directional, also enabling
control of sensor activity. The development of wireless sensor
networks was motivated by military applications such as
battlefield surveillance; today such networks are used in many
industrial and consumer applications, such as industrial
process monitoring and control, machine health monitoring.
A WSN consists of mobile or fixed sensor nodes, and in some cases a hybrid of both. All nodes sense and send data to a server, which increases communication overhead. Such a network may contain hundreds or thousands of sensor nodes, and the main challenges in a WSN are reducing energy consumption and keeping packet loss low at each node. Many routing protocols exist, such as Destination Sequenced Distance Vector (DSDV), Dynamic Source Routing (DSR), and Ad hoc On Demand Distance Vector (AODV) [1]. These protocols can be applied to WSNs, but they are not suitable for tiny, low-capacity sensor nodes because they require high power consumption. Flat-based multi-hop routing protocols designed for static WSNs [2-6] have also been applied to WSNs with mobile nodes; however, they do not support sensor node mobility.
The main challenge in a WSN is to minimize energy consumption in each sensor node. Many researchers therefore concentrate on routing protocols that consume less power and hence prolong the network's life span. Wireless ad hoc network routing protocols have also been proposed for routing in WSNs.
Low Energy Adaptive Clustering Hierarchy-Mobile (LEACH-Mobile) [7] is a routing protocol that supports WSNs with mobile nodes. It adds a membership declaration to the LEACH protocol. LEACH-Mobile selects cluster heads randomly and forms clusters, and each cluster head creates a Time Division Multiple Access (TDMA) schedule. Nodes sense and send their data to the cluster head according to this schedule. Node mobility makes maintaining a cluster difficult because mobile nodes change cluster continuously. LEACH-Mobile updates the cluster only after every two cycles of the TDMA schedule, so packet loss occurs between those two cycles: a mobile node that is not near any cluster cannot send data to any cluster head, which causes packet loss.
In LEACH-Mobile, a cluster head decides that a sensor node has moved out of its cluster only after two consecutive failed TDMA cycles, and during these two cycles the sensor node loses packets. In LPLR, the cluster head does not wait for two consecutive TDMA cycles; it decides that a member node has moved out of its cluster after a single TDMA cycle. Data loss is reduced because the moving node sends its data to a new cluster head along with a join acknowledgment message.
We propose a new low-packet-loss, power-efficient routing protocol for WSNs, called Low Packet Loss Routing protocol for mobile nodes in wireless sensor networks (LPLR Mobile-WSN). In the proposed protocol, the cluster head sends a data request message to each of its members. When the cluster head does not receive data from a member, the packet is considered lost and the cluster head deletes that sensor node's membership from the cluster. Conversely, when a sensor node does not receive a data request message from its cluster head, it tries to join a new cluster to avoid packet loss. A cluster head gives incoming nodes from other clusters an entry in its TDMA schedule. The transmitter sends its messages according to the received signal strength of the data request message from the cluster head.
Table 1 Abbreviations
WSN Wireless Sensor Network
DSDV Destination Sequenced Distance Vector
DSR Dynamic Source Routing
AODV Ad hoc On Demand Distance Vector
LEACH-Mobile Low Energy Adaptive Clustering
Hierarchy-Mobile
TDMA Time Division Multiple Access
LPLR Low Packet Loss Routing
CSMA/CA MAC Carrier Sense Multiple Access with Collision Avoidance Medium Access Control
CH Cluster Head
SN Sensor Node
2. LOW PACKET LOSS ROUTING
Low Packet Loss Routing (LPLR) is a routing protocol for wireless sensor networks that aims to handle packet loss and use energy resources efficiently. In this protocol, the cluster head receives data not only from its members during their TDMA-allocated time slots but also from lost sensor nodes from other clusters. A WSN consists of both mobile and fixed nodes, and after cluster formation a mobile node can change cluster. LPLR admits such a mobile node into a new cluster and gives it a TDMA slot to send data.
2.1 Selection of Cluster Head
Protocols such as TEEN [8] and APTEEN [9] use stationary but dynamically changing cluster heads. In some protocols with mobile sensor nodes, the cluster head is selected according to a mobility factor [10]: the node with the smallest mobility factor in each cluster is chosen as cluster head. In LEACH-Mobile the cluster head is assumed to be stationary in order to control mobility. In the proposed protocol we elect cluster heads randomly; they are assumed to remain stationary through the rounds.
2.2 Formation of Cluster
After a cluster head has been selected, it broadcasts an advertisement message to the rest of the sensor nodes in the network, as in LEACH and LEACH-Mobile. For these advertisement messages, cluster heads use a Carrier Sense Multiple Access with Collision Avoidance Medium Access Control (CSMA/CA MAC) protocol, and all cluster heads use the same transmit energy. Sensor nodes must keep their receivers on in order to receive the advertisement messages. After a sensor node has received advertisement messages from one or more cluster heads, it compares their received signal strengths and decides which cluster it will belong to. Assuming symmetric propagation channels, the sensor node selects the cluster head that requires the minimum transmit energy for communication; in the case of a tie, a cluster head is chosen at random. After deciding on its cluster, the node sends a registration message to inform the cluster head. These registration messages are transmitted to the cluster heads using the CSMA/CA MAC protocol, and during this phase all cluster heads must keep their receivers on.
2.3 TDMA Schedule Creation
After the cluster head receives registration messages from the nodes that would like to join the cluster, it creates a TDMA schedule based on the number of nodes and assigns each node a time slot in which to transmit data. This schedule is broadcast to all the sensor nodes in the cluster, and all sensor nodes transmit data according to it.
2.4 Data Transmission
Once the clusters are created and the TDMA schedule is fixed, data transmission from sensor nodes to their cluster heads begins according to the schedule. Upon receiving a data request from the cluster head, a sensor node switches on its radio transmitter, adjusts its transmission power, and sends its data. At the end of the transmission the node turns off its radio, which saves battery. The cluster head must keep its radio on to send data request messages, receive data from the sensor nodes, and exchange the other messages needed to maintain the network. When a sensor node receives a data request message from the cluster head, it sends its data back to the cluster head; if it does not receive a data request message from its cluster head, it sends the message to a free cluster head.
3. LPLR ALGORITHM
Cluster heads: CH1 to CH6. Sensor nodes: S1 to S51.
1. Select cluster heads randomly: choose CH1 to CH6 from the sensor nodes S1 to S51.
2. All cluster heads broadcast an advertisement message to S1 to S51, excluding the cluster heads themselves.
3. Each sensor node receives advertisement messages from CH1 to CH6.
4. Each sensor node compares the received signal strengths and selects the cluster head (CH1 to CH6) with the maximum received signal strength.
5. The sensor node sends a registration message to its chosen cluster head.
6. Each cluster head creates a TDMA schedule and broadcasts it to all member nodes.
7. According to the TDMA schedule, the cluster head sends a data request message to each member node.
8. The member node sends its data to the cluster head.
Figure 1: Messages of Cluster Head and Sensor Node
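The following minimal Python sketch illustrates the join and scheduling steps above. It is only an illustration of the described flow, not the authors' NS2 implementation; the function names and the 10% cluster-head fraction (taken from Table 2) are assumptions made for the example.

    import random

    def select_cluster_heads(nodes, fraction=0.10):
        # Step 1: pick roughly 10% of the nodes at random to act as cluster heads.
        k = max(1, int(len(nodes) * fraction))
        return random.sample(nodes, k)

    def choose_head(advertisements):
        # Steps 3-4: given {head_id: received_signal_strength}, a node joins
        # the head heard with maximum RSSI (minimum transmit energy needed).
        return max(advertisements, key=advertisements.get)

    def build_tdma_schedule(member_ids):
        # Step 6: one slot per registered member, broadcast by the head.
        return {node_id: slot for slot, node_id in enumerate(member_ids)}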
3.1 Cluster Head
The cluster head broadcasts an advertisement message. If it receives registration messages, it creates a TDMA schedule
and broadcasts that schedule. When the cluster head finishes receiving data messages, it checks whether it has received data from all members. If any member node did not send a data message, the cluster head removes that sensor node from the cluster. The cluster head then broadcasts the advertisement message to all nodes again and updates the cluster; updating the cluster lets new mobile nodes enter the cluster's TDMA schedule. Figure 1 shows the messages sent and received by the cluster head.
3.2 Sensor Node
A sensor node receives advertisement messages from one or more cluster heads and selects a head according to received signal strength. The node sends a registration message to the selected head and gets an entry in that cluster's TDMA schedule. A member node sends data to the cluster head, according to the TDMA schedule, when it receives a data request message from the cluster head. If a sensor node does not receive any advertisement or data request message, it sends its data to a free cluster head. Figure 1 shows the messages sent and received by the sensor node.
4. SIMULATION RESULT
We simulated LPLR and LEACH-Mobile using NS-2.34 on a Linux 2.6.23.1.42.fc8 platform with the parameters shown in Table 2. The basic hardware used for this simulation was a Pentium 4 processor with 512 MB RAM and a 10 GB hard disk.
Table 2: Performance Parameters
Parameter                          Value
Network Size (L*W)                 1800*840
Number of Sensor Nodes (N)         51
Percentage of Cluster Heads        10%
Percentage of Mobile Sensor Nodes  90%
Sensing Range                      500 m
Sensor Speed                       1-17 m/s
Figure 2: Total Number of Received Packets
Figure 2 shows the total number of packets received by cluster heads from member nodes. We applied LEACH-Mobile to the network and observed the number of packets received by each cluster head from its member nodes; we then applied LPLR to the same WSN and observed the same metric. As Figure 2 shows, LPLR achieves a significant improvement over LEACH-Mobile, so we can conclude that packet loss decreases with LPLR.
Figure 3: Remaining Energy of Nodes
Figure 3 shows the remaining energy of every sensor node in the WSN. A member node wakes up to send data according to the TDMA schedule and is otherwise in sleep mode, which reduces energy consumption. Comparing the remaining energy of sensor nodes under LEACH-Mobile and LPLR in Figure 3, we can say that a sensor node conserves more battery using LPLR than LEACH-Mobile.
Figure 4: Packet Delivery Ratio
Figure 4 shows the packet delivery ratio of the cluster heads under LPLR and LEACH-Mobile. From this figure we can compare the delivery ratio of both protocols; LPLR performs more efficiently than LEACH-Mobile.
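Assuming the standard definition of this metric (the paper does not spell it out), the packet delivery ratio of a cluster head is

    PDR = packets received by the cluster head / packets sent by its member nodes

so, for example, 450 packets received out of 500 sent gives PDR = 450/500 = 0.90.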
5. CONCLUSION
We proposed a cluster-based routing protocol, LPLR, which is more efficient than the LEACH-Mobile protocol. In the proposed protocol, all sensor nodes are maintained by their cluster head, which also maintains the TDMA schedule. The cluster head collects data from member nodes and sends it to the server. If a sensor node fails or its battery is discharged, the cluster head removes that node's record from the TDMA schedule and updates the schedule. Cluster head failure, however, can affect the operation of the WSN: member nodes cannot send data to a failed cluster head, which causes packet loss. We
are working on the cluster-head failure case to obtain better results than the proposed LPLR protocol.
In summary, we proposed an efficient routing protocol for mobile nodes in wireless sensor networks. The LPLR protocol is efficient in energy consumption and packet delivery. It forms clusters, and each cluster head creates and maintains a TDMA schedule. A sensor node wakes up only to send data and then returns to sleep mode, which conserves its battery. An important feature of this protocol is that it updates the cluster after every TDMA cycle, so every mobile node can enter a new cluster and send its data. LPLR also maintains a free cluster head: a mobile sensor node that is not in the range of any cluster, or that has moved out of one cluster and is waiting for a new one, sends its data to the free cluster head, which reduces packet loss. We simulated LEACH-Mobile and the proposed LPLR protocol and obtained more efficient results with LPLR.
6. REFERENCES
[1] C. Perkins and P. Bhagwat, "Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers," presented at the ACM '94 Conference on Communications Architectures, Protocols and Applications, 1994.
[2] W. Heinzelman, J. Kulik, and H. Balakrishnan, "Adaptive protocols for information dissemination in wireless sensor networks," Proc. 5th ACM/IEEE Mobicom Conference (MobiCom '99), Seattle, WA, August 1999, pp. 174-185.
[3] J. Kulik, W. R. Heinzelman, and H. Balakrishnan, "Negotiation-based protocols for disseminating information in wireless sensor networks," Wireless Networks, Vol. 8, 2002, pp. 169-185.
[4] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed diffusion: a scalable and robust communication paradigm for sensor networks," Proc. of ACM MobiCom '00, Boston, MA, 2000, pp. 56-67.
[5] D. Braginsky and D. Estrin, "Rumor routing algorithm for sensor networks," Proc. of the 1st Workshop on Sensor Networks and Applications (WSNA), Atlanta, GA, October 2002.
[6] Y. Yao and J. Gehrke, "The cougar approach to in-network query processing in sensor networks," in SIGMOD Record, September 2002.
[7] Guofeng Hou and K. Wendy Tang, "Evaluation of LEACH protocol subject to different traffic models," presented at the First International Conference on Next Generation Networks (NGNCON 2006), Hyatt Regency Jeju, Korea, July 9-13, 2006.
[8] A. Manjeshwar and D. P. Agarwal, "TEEN: a routing protocol for enhanced efficiency in wireless sensor networks," presented at the 1st Int. Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing, April 2001.
[9] A. Manjeshwar and D. P. Agarwal, "APTEEN: A hybrid protocol for efficient routing and comprehensive information retrieval in wireless sensor networks," Parallel and Distributed Processing Symposium (IPDPS 2002), pp. 195-202.
[10] M. Liliana, C. Arboleda, and N. Nasser, "Cluster-based Routing Protocol for Mobile Sensor Networks," presented at the 3rd Int. Conf. on Quality of Service in Heterogeneous Wired/Wireless Networks, Waterloo, Ontario, Canada, 2006.
32-Bit NxN Matrix Multiplication: Performance Evaluation for Altera FPGA, i5 Clarkdale, and Atom Pineview-D Intel
General Purpose Processors
Izzeldin Ibrahim Mohd
Faculty of Elect. Engineering,
Universiti Teknologi Malaysia,
81310 JB, Johor
Chay Chin Fatt
Intel Technology Sdn. Bhd., PG9 (Intel U Building), Bayan Lepas, 11900 Penang
Muhammad N. Marsono
Faculty of Elect. Engineering,
Universiti Teknologi Malaysia,
81310 JB, Johor
ABSTRACT
Nowadays mobile devices represent a significant portion of the market for embedded systems and are in continuous demand in daily life. From the end-user perspective, size, weight, and features are the key quality criteria. These criteria have become the usual design constraints in the embedded systems design process and have a high impact on power consumption. This paper surveys and explores different low-power design techniques for FPGAs and processors. We compare, evaluate, and analyze the power and energy consumption of three different designs: an Altera Cyclone II FPGA with a systolic-array matrix multiplier implemented on it, and the i5 Clarkdale and Atom Pineview-D Intel general purpose processors. Each multiplies two nxn 32-bit matrices and produces a 64-bit output matrix. We conclude that the FPGA is more power- and energy-efficient at small matrix sizes. However, general purpose processor performance approaches that of the FPGA at larger matrix sizes, as the larger cache in a general purpose processor helps reduce latency. We also conclude that the latency of the FPGA implementation can be improved if more systolic-array processing elements are implemented in parallel to allow more concurrency.
General Terms
Computational Mathematics
Keywords
FPGA, Matrix Multiplication, General Purpose Processor,
Systolic Array, Energy Consumption
1. INTRODUCTION
With the drastic improvement and maturation of Field Programmable Gate Array (FPGA) technology, FPGAs have become one of the choices available to designers beyond traditional solutions such as general purpose processors and digital signal processors (DSPs). Their reconfigurable nature, and the fact that they can be programmed to implement any digital circuit, makes them a strong candidate for most computation-intensive applications. This includes areas such as signal processing and encryption engines, which involve large amounts of real-time data processing. FPGAs provide better throughput and latency because they can be customized to optimize the execution of a particular process or algorithm.
Traditionally, research on FPGAs has focused mainly on reducing area overhead and increasing speed [7-11]. With the emergence of portable and mobile devices, which have become a necessity for most people today, device performance is no longer judged mainly on latency and throughput; energy efficiency is a key factor as well. In summary, the performance of electronic devices is not just about speed: energy efficiency must be treated as a major design consideration. Designers focus on producing high-throughput solutions while keeping power consumption low.
Many studies and experiments have compared the energy efficiency of FPGAs, DSPs, embedded processors [1], and general purpose processors. However, for general purpose processors in particular, most experiments do not compare against the latest commercial processors on the market, for which manufacturers claim several low-power design techniques have been adopted. With today's advances in semiconductor process technology, which lead to lower leakage current, and the flexibility of software-based power saving, the power dissipation and energy consumption of general purpose processors have greatly improved. The key question is how well a current general purpose processor performs in terms of energy efficiency compared to an FPGA, particularly for signal-processing-centric applications. This paper evaluates and discusses this question in detail by executing nxn matrix multiplication on an Altera Cyclone II FPGA [17-19] and on Intel processors, and comparing the performance of the devices in terms of power dissipation and energy efficiency.
2. RELATED WORK
Ronald Scrofano et al. showed that matrix multiplication of two nxn matrices can be done most efficiently, in terms of energy and power, on an FPGA in their paper "Energy Efficiency of FPGAs and Programmable Processors for Matrix Multiplication" [1]. They compared the energy efficiency of nxn matrix multiplication on a Xilinx Virtex-II, a Texas Instruments DSP (TMS320C6415), and an Intel XScale PXA250, using a linear array architecture for the matrix multiplication module. In their work, no actual power measurement was made on the FPGA; only the energy of each Processing Element (PE) was estimated. Each PE component (multiplier, adder, register, RAM) was modeled in VHDL and synthesized, and the design was placed and routed after synthesis. The place-and-route results and simulation outputs were used as input to the Xilinx XPower tool to estimate power. This estimation method effectively assumes that the location of each component in the FPGA does not affect power consumption. That is not true, since component placement has a big impact on routing, and routing capacitance is a major factor in power
consumption. Thus, the accuracy of the power estimates reported by Scrofano et al. is a concern.
Seonil Choi et al. developed energy-efficient designs for matrix multiplication based on uniprocessor and linear array architectures, for both on-chip and off-chip storage [2-6]. No measurement was taken on an actual hardware implementation; instead, energy consumption was estimated using the Xilinx XPower tool. They showed that the linear array architecture needs 49 cycles for 6x6 matrices and up to 256 cycles for 15x15 matrices.
3. SYSTOLIC ARRAY MATRIX MULTIPLICATION ON ALTERA CYCLONE II
The traditional way of implementing a systolic array for matrix multiplication is to match the systolic array order to the problem size [12-16]. For example, an 8x8 matrix requires an 8x8-order systolic array and a 16x16 matrix requires a 16x16-order systolic array. The basic building block of the systolic array is 2x2, and each 2x2 systolic array needs 4 multiply-and-accumulate units (MACs), as shown in Figure 1. The number of MACs increases rapidly with problem size; Table 1 below shows the number of MACs required for each matrix size.
Table 1: Resource utilization for different matrix sizes
Matrix Size   Number of 2x2 systolic arrays   Number of MACs
2x2           1                               4
4x4           4                               16
8x8           16                              64
16x16         64                              256
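The counts in Table 1 follow directly from the construction: an nxn multiplier built from the 2x2 block uses (n/2)^2 of the 2x2 systolic arrays, and since each contains 4 MACs the total is 4 x (n/2)^2 = n^2 MACs. For example, n = 16 gives (16/2)^2 = 64 arrays and 16^2 = 256 MACs, matching the last row.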
Our target FPGA device for power measurement is the Cyclone II EP2C35F672C6, which has only 33,216 logic elements and 70 embedded 9-bit multipliers. Moreover, 32-bit input data and 64-bit output data imply that we need fairly large MACs with wide bus widths. If we followed the method of matching the systolic array order to the matrix size, resource utilization would exceed the capacity of the target device. Instead, we use a 4x4 systolic array, built by connecting four 2x2 systolic arrays as shown in Figure 2, as the basic building block for matrix multiplication up to 16x16. In other words, the same 4x4 systolic array is used to implement the 2x2, 4x4, 8x8, and 16x16 matrix multiplication modules, at the expense of increased latency for larger problem sizes. Figure 3 below illustrates the implementation with a single 4x4 systolic array.
Figure 1: Functional Block Diagram of 2x2 Systolic Array
Figure 2: Functional Block Diagram of 4x4 Systolic Array
Clearly, an algorithm is needed in order to use a single 4x4 systolic array for all matrix sizes. The steps below show how to construct an 8x8 matrix multiplication module from a single 4x4 systolic array.
Step 1: Divide the 8x8 input matrices A and B into four 4x4 sub-matrices each, named A1-A4 and B1-B4.
Step 2: The output matrix C is obtained by passing the sub-matrices of A and B through the 4x4 systolic array and adding up the results as follows:
C1 = A1xB1 + A2xB3
C2 = A1xB2 + A2xB4
C3 = A3xB1 + A4xB3
C4 = A3xB2 + A4xB4
Figure 3: Implementation of matrix multiplier with single
4x4 systolic array
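To make the scheduling concrete, the sketch below computes an 8x8 product using only a 4x4 multiplier primitive, mirroring Steps 1 and 2 above. This is a minimal Python/NumPy illustration of the block decomposition, not the authors' HDL; mult4x4 is an assumed stand-in for the single 4x4 systolic array.

    import numpy as np

    def mult4x4(a, b):
        # Stand-in for the single 4x4 systolic-array multiplier on the FPGA.
        return a @ b

    def mult8x8(a, b):
        # 32-bit operands, 64-bit accumulation, as in the paper's datapath.
        c = np.zeros((8, 8), dtype=np.int64)
        for i in (0, 4):              # block row of C
            for j in (0, 4):          # block column of C
                for k in (0, 4):      # accumulate A-block x B-block products
                    c[i:i+4, j:j+4] += mult4x4(a[i:i+4, k:k+4], b[k:k+4, j:j+4])
        return c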
The same method can be used for larger matrix sizes such as 16x16. For a 16x16 matrix, we first divide the input matrix into four 8x8 sub-matrices, and each 8x8 sub-matrix is further divided into four 4x4 sub-matrices. Figure 4 below illustrates the method of constructing the 8x8 matrix multiplication module from a single 4x4 systolic array.
Figure 4: Constructing 8x8 matrix multiplier from single
4x4 systolic array
Figure 5 shows the high-level functional block diagram of the overall design. Both 32-bit matrices A and B are stored in Random Access Memory (RAM). A module (Reg_delay) staggers the inputs of matrix A and matrix B as required by the systolic-array algorithm. The 4x4 matrix multiplication module computes the products of the elements of matrix A and matrix B. The Dataout module adds the 4x4 multiplication results and writes each element of the output matrix to RAM_out; the final matrix multiplication result is stored in RAM.
Figure 5: High Level Functional Block Diagram of Matrix
Multiplication Module
4. MATRIX MULTIPLICATION ON GENERAL PURPOSE PROCESSORS
For the general purpose processors, we used the Intel i5 Clarkdale and the Intel Atom Pineview-D. Matrix multiplication runs in MATLAB on these two processors. MATLAB was used because it incorporates the Linear Algebra Package (LAPACK), which greatly improves the performance of matrix multiplication. A single iteration of the matrix multiplication would not take advantage of the large cache in the Intel processors, so 1000 iterations of the 32-bit matrix multiplication are executed; this ensures that the processor cache is fully utilized. A sketch of this workload appears below.
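As a rough illustration of the measured workload (the paper used MATLAB with LAPACK; this Python/NumPy stand-in is only assumed to share the same 1000-iteration structure):

    import time
    import numpy as np

    def workload(n, iters=1000):
        # Two random nxn 32-bit operands, 64-bit products, repeated so
        # that the processor caches stay warm during the measurement window.
        a = np.random.randint(0, 2**31, size=(n, n), dtype=np.int64)
        b = np.random.randint(0, 2**31, size=(n, n), dtype=np.int64)
        t0 = time.perf_counter()
        for _ in range(iters):
            c = a @ b
        return time.perf_counter() - t0   # total execution time in seconds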
5. POWER AND ENERGY MEASUREMENTS
Power is measured using a Tektronix current probe with a current amplifier and a Tektronix digital oscilloscope (TDS7104) to capture both the current and the voltage. The power is obtained by multiplying the voltage by the current; average voltage and current are measured within the window of the matrix multiplication process. The energy consumption is obtained by multiplying the average power by the total execution time.
5.1 Power and Energy Measurements on the ALTERA DE2 Board
Figure 6 shows the experimental setup for the power measurement on the ALTERA board. The Cyclone II Altera FPGA (EP2C35F672C6) is powered by three power rails: the core voltage (VCC_INT, 1.2 V), the IO voltage (VCCIO, 3.3 V), and the PLL voltage (VCC12, 1.2 V). Figure 7 below shows the power supply pins of the Cyclone II FPGA. To obtain the total power consumed by the entire IC, the current drawn by each of these rails is measured.
Figure 6: Hardware setup of current measurement on
Altera DE2 board
Figure 7: Power Supply Pins of Cyclone II
EP2C35F672C6
The IO voltage (VCCIO) of the Cyclone II is powered directly by the 3.3 V rail on the Altera DE2 board through a 0-ohm resistor (R92), as shown in Figure 8. The current drawn by the Cyclone II can therefore be measured easily by replacing the 0-ohm resistor with a wire loop on which the current probe is clamped.
Both the core voltage and the PLL voltage are connected to the output of a low-dropout regulator (LDO). Since no shunt resistor exists on the output of the LDO, a workaround is needed to measure the power consumed by these two voltage rails: the output pin (pin 2) of the LDO (U24) is lifted and a wire loop is inserted for the current probe to clamp onto, as shown in Figure 8. The current is measured at the output of the LDO rather than its input, in order to exclude the efficiency loss of the LDO, which would otherwise be included in the measurement.
Figure 8: Power Block of ALTERA DE2 Board
5.2 Power and Energy Measurements on the General Purpose Processors
In the computer system, the power supply unit supplies 5 V, 3.3 V, and 12 V to the motherboard. The 12 V rail is the main voltage used by the processor's switching voltage regulator; it is supplied through a 2x2 or 2x4 power connector from the power supply and is down-regulated to the processor core and IO voltages. Thus, measuring the 12 V input current at the 2x2 or 2x4 power connector gives the total current consumed by the processor.
Since the computer system runs an operating system, background activities also consume processor resources, and these must be taken into account; otherwise the measured current would include not only the matrix multiplication process but also the other processes running in the background. One way to overcome this is to measure the current on the 12 V rail when the system is idle, i.e., powered up with no active process running except the background activities initiated by the operating system. Another set of current measurements is then taken while the system runs the matrix multiplication routine. By subtracting the idle current from the current consumed during the matrix multiplication routine, we obtain the net current used to execute the matrix multiplication alone. With this methodology we can measure both the power and the energy consumed by the processor for the matrix multiplication workload, as shown in Figure 9. A small sketch of this subtraction appears after Figure 9.
Figure 9: Hardware setup of current measurement on the Atom Pineview-D processor
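The following minimal sketch restates the subtraction methodology in code; the function name and arguments are illustrative, not measurements or tooling from the paper.

    def net_energy(v_rail, i_busy, i_idle, runtime_s):
        # Net power excludes background/OS activity measured at idle.
        p_net = v_rail * (i_busy - i_idle)   # watts
        return p_net * runtime_s             # joules: energy = avg power x time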
6. RESULTS AND ANALYSIS
In this section we analyze the power and energy results obtained from the Altera DE2 board and from the two platforms with the Intel Core i5 and Intel Atom Pineview-D processors, and we discuss our observations. All measurements are taken over 1000 iterations of matrix multiplication on each of the three designs.
6.1 Power
From the results shown in Table 2 and Figure 10, it is clear that the FPGA dissipates the least power, compared to the Intel i5 Clarkdale and the Intel Atom Pineview-D. Our results show that the Intel Atom Pineview-D performs better than the Intel i5 Clarkdale in terms of power dissipation, so the Intel Atom series can be used for low-power applications. The Intel Core i5 consumes about 16 times as much power as the Altera Cyclone II, while the Intel Atom consumes about 2.6 times more power than the Cyclone II.
An interesting observation is that all three implementations consume roughly the same amount of power regardless of matrix size. This is because the same hardware units operate in every case, regardless of problem size and data content.
Table 2: Power Dissipation versus Matrix Size for the three Implementations
Matrix Size   ALTERA DE2 Cyclone II (mW)   Intel Atom Pineview-D (mW)   Intel i5 Clarkdale (mW)
2x2           237.05                       794.24                       5046.38
4x4           287.70                       791.43                       5152.54
8x8           290.27                       788.36                       5095.23
16x16         293.53                       840.42                       5156.37
Figure 10: Power Dissipation versus Matrix Sizes for the
three Implementations
For matrix multiplication on the FPGA, we used a single 4x4 systolic array as the basic building block for all matrix sizes, as stated in the previous section; this is why the power dissipated is the same across matrix sizes. We can conclude that for a higher-order systolic array, power consumption would increase in proportion to the additional logic elements. For example, if we chose an 8x8 systolic array as the basic building block, the power dissipated by the FPGA would move closer to that of the Intel Atom Pineview-D; however, this would come with the advantage of reduced latency.
6.2 Energy
A device with lower power dissipation does not necessarily give longer battery life: it may dissipate less power yet take much longer to complete a task than a higher-power device that operates faster. A more accurate indicator of battery life is energy, which is calculated by multiplying average power by latency.
Table 3: Energy Consumption versus Matrix Size for the three Implementations
              ALTERA DE2 Cyclone II      Intel Atom Pineview-D      Intel i5 Clarkdale
Matrix Size   Latency (ms)  Energy (mJ)  Latency (ms)  Energy (mJ)  Latency (ms)  Energy (mJ)
2x2           0.25          0.06         13            10.33        1.80          9.08
4x4           1.14          0.33         13.8          10.92        2.20          11.34
8x8           7.76          2.25         19            14.98        2.50          12.74
16x16         60.02         17.62        41.4          34.79        3.90          20.11
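As a consistency check, each energy entry is the product of the corresponding average power in Table 2 and the latency in Table 3. For the 2x2 case on the FPGA, for example, 237.05 mW x 0.25 ms ≈ 0.059 mJ ≈ 0.06 mJ, and for the i5, 5046.38 mW x 1.80 ms ≈ 9.08 mJ.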
As shown in Table 3 and Figure 11, the ALTERA FPGA consumes the least energy for matrix sizes up to 16x16. However, for matrix sizes above 16x16, a linear extrapolation of the line graph suggests that the Intel Core i5 would perform better than the FPGA in terms of energy consumption. The main reason is that the latency of the FPGA implementation increases at a higher rate than that of the Intel Core i5 as the matrix size grows beyond 16x16; the scheduling we applied to the FPGA design to reduce resource utilization is the main cause of this excessive latency growth. The latency could be reduced by using a higher-order systolic array as the basic building block. The Intel Atom, by contrast, is not an economical solution for matrix-multiplication-centric applications: even though it dissipates less power than the Intel Core i5, it consumes more energy overall on the same workload. We do not deny, however, that minimal power dissipation is still important for keeping heat dissipation low and the thermal solution simple.
On the other hand, if we increase the order of the systolic array to match the problem size, latency improves significantly; for example, a 16x16 systolic array could be used for a 16x16 problem. To estimate the energy consumption, we must understand both the increase in resource utilization and the improvement in latency when a higher-order systolic array is used.
Figure 11: Energy Consumption versus Matrix Sizes for
the three Implementations
A 4x4 systolic array is formed from four 2x2 systolic arrays, an 8x8 systolic array from four 4x4 systolic arrays, and so on. With scheduling, we use a single 4x4 systolic array for an 8x8 problem; without scheduling, we would need four 4x4 systolic arrays. Resource utilization therefore increases by a factor of 4, and if we assume that power dissipation is proportional to resource utilization, power dissipation increases by a factor of 4 as well.
As for latency, an 8x8 matrix needs 8 iterations if a single 4x4 systolic array is used, whereas four 4x4 systolic arrays reduce the latency by a factor of 8. Overall energy consumption is thus reduced by 50% (power dissipation increases 4 times, but latency decreases 8 times). Figure 12 shows the estimated energy consumption when the order of the systolic array matches the matrix size. Extrapolating the line graph linearly, the FPGA remains the candidate that consumes the least energy even for matrix sizes above 16x16.
Figure 12: Energy consumption versus matrix size with
estimated result on higher order systolic array
7. CONCLUSION
Due to the hardware limitation of the Altera DE2 with its Cyclone II EP2C35F672C6, which has only 33,216 logic elements, the systolic-array matrix multiplication design had to be scheduled around a single 4x4 systolic array. This approach ensured that the design would fit on the Altera DE2 board, but it introduces significant latency. The design can be modified to use a higher-order systolic array, minimizing latency, and implemented on an FPGA family with a higher logic count, such as the Cyclone IV EP4C115 with 115,000 logic elements.
Using more resources for parallel processing always reduces latency but increases power dissipation, because a more powerful chip requires more silicon area and circuitry. Comparing the Atom Pineview-D and the Core i5, the Core i5 clearly performs better but suffers from higher power dissipation. For matrix sizes below 16x16, we observed that the FPGA is the best candidate in terms of power and energy consumption; for matrix sizes above 16x16, the Core i5 becomes more favorable. The larger data and instruction caches of the Core i5 are the main reason its latency does not increase at as high a rate as on the Atom Pineview-D and the Cyclone II as matrix size grows. However, if we increase the order of the systolic array to match the matrix size, our estimated results show that the FPGA remains the most economical candidate for matrix multiplication.
8. ACKNOWLEDGMENTS
We would like to thank Universiti Teknologi Malaysia for funding support. We would also like to take this opportunity to express our appreciation to Intel Technology Sdn. Bhd. Malaysia, in particular the Intel Test and Tool Operation (iTTO), for making this work possible.
9. REFERENCES
[1] R. Scrofano, S. Choi, and V. K. Prasanna, "Energy Efficiency of FPGAs and Programmable Processors for Matrix Multiplication," in Proc. of IEEE Intl. Conf. on Field Programmable Technology, pp. 422-425, 2002.
[2] S. Choi, V. K. Prasanna, and J. Jang, "Minimizing energy dissipation of matrix multiplication kernel on Virtex-II," in Proc. of SPIE, Vol. 4867, pp. 98-106, 2002.
[3] J. Jang, S. Choi, and V. K. Prasanna, "Energy efficient matrix multiplication on FPGAs," in Proc. of 12th Intl. Conf. on Field Programmable Logic and Applications, pp. 534-544, 2002.
[4] J. Jang, S. Choi, and V. K. Prasanna, "Area and Time Efficient Implementations of Matrix Multiplication on FPGAs," in Proc. of IEEE Intl. Conf. on Field Programmable Technology, pp. 93-100, 2002.
[5] H. T. Kung and C. E. Leiserson, "Systolic arrays (for VLSI)," Introduction to VLSI Systems, 1980.
[6] V. K. P. Kumar and Y. Tsai, "On synthesizing optimal family of linear systolic arrays for matrix multiplication," IEEE Trans. Comput., vol. 40, no. 6, pp. 770-774, 1991.
[7] Lamoureux, J. and Luk, W., "An overview of Low-Power Techniques for Field-Programmable Gate Arrays," in Adaptive Hardware and Systems (AHS '08), NASA/ESA, 2008.
[8] Sutter, G. and Boemo, E., "Experiments in low power FPGA design," Lat. Am. Appl. Res., vol. 37, no. 1, pp. 99-104, 2007.
[9] Dave, N., Fleming, K., Myron King, Pellauer, M., and Vijayaraghavan, M., "Hardware Acceleration of Matrix Multiplication on a Xilinx FPGA," in Formal Methods and Models for Codesign (MEMOCODE 2007), 5th IEEE/ACM International Conference, May 30-June 2, 2007.
[10] Aslan, S., Desmouliers, C., Oruklu, E., and Saniie, J., "An Efficient Hardware Design Tool for Scalable Matrix Multiplication," in Circuits and Systems (MWSCAS), 2010 53rd IEEE International Midwest Symposium, pp. 1262-1265, 2010.
[11] H. T. Kung, "Why Systolic Architecture," in IEEE Computer, pp. 37-46, 1982.
[12] Ju-Wook Jang, Seonil B. Choi, and Viktor K. Prasanna, "Energy and Time Efficient Matrix Multiplication on FPGAs," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 11, November 2005.
[13] Qasim, S. M., Abbasi, S. A., and Almashary, B., "A proposed FPGA-based parallel architecture for matrix multiplication," in Circuits and Systems, 2008 (APCCAS 2008), IEEE Asia Pacific Conference, pp. 1763-1766, 2008.
[14] Syed M. Qasim, Ahmed A. Telba, and Abdulhameed Y. AlMazroo, "FPGA Design and Implementation of Matrix Multiplier Architectures for Image and Signal Processing Applications," in IJCSNS International Journal of Computer Science and Network Security, vol. 10, no. 2, Feb 2010.
[15] AHM Shapri and N. A. Z. Rahman, "Performance Analysis of Two-Dimensional Systolic Array Matrix Multiplication with Orthogonal Interconnections," in International Journal on New Computer Architectures and Their Applications (IJNCAA) 1(3): 1090-1100, 2011.
[16] Jonathan Break, "Systolic Arrays and their Applications," in http://www.cs.ucf.edu/courses/cot4810/fall04/.../Systolic_Arrays.ppt
[17] Altera Inc., Cyclone II Device Handbook, Volume 1, available at www.altera.com
[18] Altera Inc., DE2 Development and Education Board User Manual, available at www.altera.com
[19] Altera Inc., DE2 Development and Education Board Schematic, available at www.altera.com
Recognizing and Interpreting Sign Language Gesture for Human Robot Interaction
Shekhar Singh
Assistant Professor, CSE Department
PIET, Samalkha, Panipat, India
Akshat Jain
Assistant Professor, CSE Department
PIET, Samalkha, Panipat, India
Deepak Kumar
Assistant Professor, CSE Department
PIET, Samalkha, Panipat, India
ABSTRACT
Visual interpretation of sign language gestures can be useful for achieving natural human robot interaction. This paper describes a sign-language-gesture-based recognition, interpretation, and imitation learning system that uses Indian Sign Language to perform human robot interaction in real time, enabling convenient gesture-based communication with a humanoid robot. The classification, recognition, learning, and interpretation process is carried out by extracting features from Indian Sign Language (ISL) gestures; the chain code and Fisher score together form the feature vector for classification and recognition. Two statistical approaches are used, the Hidden Markov Model (HMM) technique and a feed-forward back-propagation neural network (FNN), in order to achieve satisfactory recognition accuracy. The sensitivity, specificity, and accuracy were found to be 98.60%, 97.64%, and 97.52%, respectively. We conclude that the FNN gives fast and accurate recognition and is a promising tool for recognition and interpretation of sign language gestures for human computer interaction. The overall recognition and interpretation accuracy of the proposed system is 95.34%, making this approach suitable as an automated real-time human computer interaction tool.
General Terms
Human Robot Interaction; Gesture, Indian Sign Language;
Vector Quantization and LBG algorithm; Hidden Markov
Model; Neural Network.
Keywords
HRI, ISL, CLAHE, Chain Code, HMM, Fisher Score, FNN,
Gesture recognizing, Gesture Interpretation
1. INTRODUCTION
The growing commitment of society to reducing barriers for persons with disabilities, together with advances in computing and pattern recognition methods, has motivated the development of the present system, which recognizes, learns, and interprets Indian Sign Language gestures. Sign-language-gesture-based recognition, learning, and interpretation is one of the most promising research areas given its huge range of applications [28]. An artificial intelligence framework is being built around sign language gestures in order to support accurate human robot interaction, and we have chosen suitable recognition, learning, and interpretation techniques for establishing gesture-based communication with a humanoid robot. According to various researchers, the hidden Markov model (HMM) is a well-accepted method for gesture recognition, but we find that it takes more time to recognize sign language gestures. For this reason we look for a different method that recognizes, learns, and interprets sign language gestures with the same accuracy but within a much shorter time. Indian Sign Language (ISL) has its own grammatical and syntactical meaning in the linguistic form of signs. It is a visual-spatial language that uses the hands, arms, facial expressions, and head or body postures in such a manner that linguistic information can be conveyed effectively. The construction of an Indian Sign Language (ISL) gesture [1, 28] can be defined by several parameters, such as the shape of the hand, the location of hand movements along straight or circular paths, the orientation of the hand, facial expressions, body or head posture, and eye gaze. The sign language gesture primitives are captured from Indian Sign Language symbols. This poses a challenge in terms of complex symbolic gestural representation and proper linguistic understanding. Such a system could serve as a helping agent for sharing knowledge among hearing-impaired people in their own community and greatly increase their conversational reach. This paper introduces a strategy for dynamic hand gesture recognition, learning, and interpretation with reduced processing time, which will be helpful for designing a concrete human robot interaction (HRI) system. It addresses the challenges of extracting and interpreting dynamic features from continuous signals.
The current prototype strongly supports real-time mimicry on a humanoid robot. Future expansions of the recognition, learning, and interpretation system include translation between verbal expression and sign language. Such a system would be useful for hearing-impaired people to exchange information through the human robot interaction (HRI) approach, which demands a real-time gesture recognition system that translates sign language. Techniques using a neural network and an HMM have been applied separately in our current system to recognize Indian Sign Language (ISL) gestures and to generate mimicry on the humanoid robot accordingly. We then identify the most suitable recognition, learning, and interpretation system according to the time taken to recognize, learn, and interpret the sign language gestures. The automatic recognition procedure is based on adequately modeling the hand sign with a neural network; for this, the Fisher score is calculated, extracted from an HMM [28, 31]. The HMM is fed with the chain code determined from the hand sign image. The first step of this process is the capture of samples: a database of hand sign pictures of the sign text must be created from numerous persons, ideally as many as possible. In our case it comprised sixty different people (50 hand signs each). Subsequently, image preprocessing is applied, transforming the color image into a black-and-white one of reduced size, which defines the
hand outline. From this outline we extract a series of parameters, assembled into a vector that defines the contour of the hand sign with a chain code [29, 32], which is classified and parameterized by an HMM. From the HMM, parameters based on the emission probabilities of the vectors are extracted; these determine the Fisher score [28, 32], which is classified with a neural network [33, 34, 39, 40]. The system is supervised with a training process, in which it learns to differentiate some hand signs from others, and a test process, in which the models are verified. The process is summarized in Figure 1. The second section describes the creation of the database; the third, the applied image processing; then the calculation of the Fisher score by HMM; the fifth section, the neural network classifier; the sixth section, all the experiments performed; and finally the conclusions and references. Figure 1 shows the proposed system for recognizing and interpreting sign language gestures for human robot interaction.
2. INDIAN SIGN LANGUAGE GESTURE ACQUISITION SYSTEM
The prime focus in recording Indian Sign Language gestures is to create a repository of dynamic ISL video gestures (sequences of images) drawn from different Indian Sign Language class/word dictionaries. Indian Sign Language dynamic gestures were recorded at a fixed frame rate (10 fps) and with a fixed camera location relative to the subject. These samples are used during the classification, learning, and interpretation process. All the Indian Sign Language gestures involve various kinds of hand motion. A SONY Handycam with 8-megapixel resolution was used to capture videos of several Indian Sign Language gestures. One elementary approach to image processing relies on background uniformity, so a dark background was chosen to handle grayscale images effectively.
To maintain a controlled environment, background uniformity was kept while recording the videos in real time; this reduces the computational complexity of background removal and increases recognition and interpretation accuracy in real time [15]. A single gesture video is restricted to 20 frames, obtained by selecting 20 frames equally spaced in time from the originally captured video. The background is dark. Every ISL gesture corresponds to some class or word that can be expressed by waving both hands in an appropriate manner [16]. For enhanced preprocessing, Indian Sign Language gestures require fast and accurate hand movements. Several operations are applied to all the Indian Sign Language (ISL) videos before the classification process.
3. PREPROCESSING AND FEATURE EXTRACTION
First, all the Indian Sign Language videos are split into sequences of image frames (RGB). The frames are converted into grayscale images and the background is subtracted in order to reduce computational complexity. In this process the color images are transformed into binary images of the hand sign shape (white and black) with a fixed height of 400 pixels, preserving the aspect ratio. The steps, sketched in code after this list, are:
1. Filter the RGB image to eliminate noise and hue/saturation effects.
2. Convert the RGB image to a YCbCr color image, eliminating hue and saturation [40].
3. Convert the YCbCr image to grayscale, eliminating hue and saturation [40].
4. Filter the grayscale image to eliminate noise and hue/saturation effects.
5. Enhance contrast using histogram equalization [29]. Here the histogram of the gray levels is equalized under a linear function, without affecting the darkest points, only the brightest parts (the hand sign); this marks the differences between the shadow of the hand, the background, and the hand itself.
6. Trim the frame to eliminate border effects.
7. Convert the image to a binary image by thresholding. The threshold is computed by Otsu's method, which chooses the threshold that minimizes the intra-class variance of the thresholded black and white pixels. This step finishes determining the hand as an object.
8. Apply morphological operators [30]: dilation first, then erosion, so that the image recovers its original size. Noise elimination is another desired effect: dilation closes the holes that may remain in the contour line, the hand is then filled as an object, and the erosion finally recovers the original dimensions and eliminates noise introduced by the photograph itself.
9. Reduction: the image size is reduced to one quarter, to reduce the size of the final vector.
10. Calculation of the contour [31, 32]: the hand contour, which delimits the hand sign against the image background, is computed so that the connection between pixels is exactly one pixel wide with 8-neighbor connectivity; this yields the chain code.
11. Adjustment of the wrist: the image is trimmed slightly in order to determine the side from which the hand sign arises. Adjustment of the image height: finally, the height is fixed while maintaining proportionality with the width. In this way no information is lost, regardless of whether the hand sign is horizontal, vertical upward, or vertical downward.
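A compact Python/OpenCV sketch of steps 1-9 above is shown below. It is only an assumed rendering of the pipeline; the kernel sizes and the scale factor are illustrative, not taken from the paper.

    import cv2
    import numpy as np

    def preprocess(frame_bgr):
        frame = cv2.GaussianBlur(frame_bgr, (5, 5), 0)            # steps 1/4: denoise
        ycbcr = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)          # step 2
        gray = ycbcr[:, :, 0]                                     # step 3: keep luma only
        gray = cv2.equalizeHist(gray)                             # step 5: contrast
        _, bw = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # step 7: Otsu
        kernel = np.ones((3, 3), np.uint8)
        bw = cv2.erode(cv2.dilate(bw, kernel), kernel)            # step 8: dilate, then erode
        return cv2.resize(bw, None, fx=0.5, fy=0.5)               # step 9: quarter the area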
Figure 1 shows the image preprocessing and feature extraction steps of the proposed system, as well as the overall system for recognizing and interpreting sign language gestures for human robot interaction. Feature extraction for hand gesture recognition is done using the chain code and Fisher score [17], which provide considerable flexibility with respect to scene illumination invariance; the classification and interpretation policy is thus robust to changing illumination conditions. The Fisher scores of the dynamic gestures form the feature vector based on the image features. The HMM and the neural network are robust and efficient algorithms used to classify ISL dynamic gestures through pattern classification and recognition. The algorithm is built on the chain code and Fisher score of the local orientation of edges in an image, and these are treated as the feature vector for motion classification of Indian Sign Language gestures. The algorithm is fast and strong enough to compute the feature vectors of a sequence of images, so the calculation of edge directions can be performed in real-time applications. It is robust to changes in scene illumination and lighting conditions, since the edges in the image sequences remain the same [18]. All the Indian Sign Language gestures were captured
in different lighting conditions. Another advantage of the chain code and Fisher score is their translation-invariance property: the same frames at different positions in a gesture produce the same feature vectors. The chain code and Fisher score of the local orientations are calculated for all frames of the moving gesture, and translating a frame within the gesture does not change them. The overall algorithm [37, 38] has been described for evaluating the feature vector of the recognition, learning, and interpretation system.
4. ISL RECOGNITION AND INTERPRETATION TECHNIQUE
The classification and interpretation process is carried out with two statistical techniques: the HMM technique and the neural network. The Hidden Markov Model technique is preceded by vector quantization with the Linde-Buzo-Gray algorithm [38]. The HMM describes the construction of the model needed to generate the observation sequences; here we use a left-right HMM, defined as in [37]. In our research work we generated 10 different HMM models for 21 different ISL gestures for classification and interpretation. Figure 1 shows the process of image processing, chain code calculation, Fisher score calculation, HMM, quantization, learning, recognition, and interpretation of sign language gestures.
4.1 Fisher Score Calculation
Once all the outline images of the hand signs are obtained, the Fisher score is calculated. This process comprises three steps:
a. Extraction of parameters from the outline: the chain code [29].
b. Creation of an HMM with the chain code as input [31, 32].
c. Calculation of the Fisher score from the gradient of the logarithm of the observation symbol probability distribution [28, 32].
Fig 1: Block diagram of the proposed approach for hand tracking and gesture recognition. Processing is organized into six
layers.
[Figure 1 layers: image acquisition (color image sequence, initialization); preprocessing of the Indian Sign Language gesture (color smoothing, color space transformation, contrast-limited adaptive histogram equalization); segmentation (skin color probabilities assigned to pixels, skin-colored blob computation, background subtraction); hand tracking in the model space (particle filters for the left and right arms, hand gesture detection, HMM for body orientation); parameterization (extraction of parameters, chain code, Fisher score calculation via the Hidden Markov Model); gesture classification (feed-forward back-propagation neural network) and the final gesture recognition decision.]
The chain-code vector of the hand-sign contour is obtained with the mask of figure 2 by observing the position of each pixel relative to its adjacent one. The vector is formed by numbers from 1 to 8 that describe the outline of the hand sign. The extracted information describes the sequence of the hand together with temporal information, because all the hand signs are acquired in the same order and sense. This information is very important for the HMM-based recognizer, which uses it to distinguish the different hand signs. A starting criterion is fixed for obtaining this vector by first verifying whether the hand sign is horizontal, vertical from below or vertical from above, since the code is not rotation invariant. The search for white pixels begins with the following order of priority: the first column on the left (horizontal hand sign), the last row (vertical from below) or the first row (vertical from above). A minimal sketch of this tracing follows the mask in Fig 2.
1 2 3
8 X 4
7 6 5
Fig 2: Mask of composition of the vector of the chain code
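As an illustration, the following is a minimal Python sketch of an 8-directional chain code tracer using the neighbour numbering of the Fig 2 mask. It assumes a one-pixel-wide closed outline in a binary image and a start pixel already chosen by the priority rule above; the function name and loop structure are illustrative, not the authors' implementation.

```python
import numpy as np

# Neighbour offsets indexed by the codes of the Fig 2 mask:
# 1 = up-left, 2 = up, 3 = up-right, 4 = right,
# 5 = down-right, 6 = down, 7 = down-left, 8 = left.
OFFSETS = {1: (-1, -1), 2: (-1, 0), 3: (-1, 1), 4: (0, 1),
           5: (1, 1), 6: (1, 0), 7: (1, -1), 8: (0, -1)}

def chain_code(outline, start):
    """Follow a one-pixel-wide outline from `start`, emitting one
    code 1..8 per step until no unvisited neighbour remains."""
    codes, current, visited = [], start, {start}
    while True:
        for code, (dr, dc) in OFFSETS.items():
            nxt = (current[0] + dr, current[1] + dc)
            if (0 <= nxt[0] < outline.shape[0]
                    and 0 <= nxt[1] < outline.shape[1]
                    and outline[nxt] and nxt not in visited):
                codes.append(code)
                visited.add(nxt)
                current = nxt
                break
        else:                      # no unvisited neighbour: contour done
            return codes
```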
4.2 Transformation of Parameters with HMM
Supervised classification of the chain code using an HMM determines the maximum success rate, and the forward and backward parameters of the HMM are then extrapolated into the calculation of the Fisher score. The HMM employed is a Bakis (left-right) model, trained with the Baum-Welch procedure to maximize the probability of success [28]. Eight symbols per state have been used. The creation of the HMM models has two phases, training and testing. Finally, the number of states (N) and the percentage of training samples have been used as the parameters for finding the highest success rate.
4.3 Fisher Score
Finally, we propose the transformation that relates the HMM probabilities to the Fisher score [28, 32]. The goal is to unite the probability given by the HMM with the discrimination given by the neural network; the tie between them is the Fisher score. This score calculates the gradient with respect to the parameters of the HMM, in particular with respect to the probability of emitting a data vector x while in a certain state q ∈ {1, …, N}, given by the symbol probability matrix in state q, b_q(x), as indicated in the following equation:

P(x | q, λ) = b_q(x)   (Eq. 1)

Taking the derivative of the logarithm of this probability, in order to calculate its gradient, yields the components of the Fisher kernel:

∂ log P(x | q, λ) / ∂b_q(x) = ζ(x, q)/b_q(x) − ζ(q)   (Eq. 2)

The approximations and the calculation of this equation can be found in [28, 31, 32]. Here ζ(x, q) represents the expected number of times the model is in state q emitting the symbol x during the generation of a sequence [28, 31], and ζ(q) the expected number of times it is in state q during the generation of the sequence [28, 31]. These values are obtained directly and efficiently from the forward-backward algorithm applied to the HMM [28, 31]. The score applied to the neural network, U_x, follows from Eq. 2 using natural-gradient techniques [1]:

U_x = ∇_λ log P(x | q, λ)   (Eq. 3)

where U_x defines the direction of maximum slope of the logarithm of the probability of observing a certain symbol in a state.
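As a minimal illustration of Eq. 2, the sketch below computes the emission-parameter Fisher score from the expected counts ζ(x, q) and ζ(q) produced by a forward-backward pass. The array names and shapes are assumptions for the sketch, not the authors' code.

```python
import numpy as np

def fisher_score(zeta_xq, zeta_q, B):
    """Fisher score with respect to the emission matrix (Eq. 2).

    zeta_xq : (N, M) expected count of state q emitting symbol x,
              taken from the forward-backward algorithm
    zeta_q  : (N,)   expected count of visits to state q
    B       : (N, M) emission probability matrix b_q(x)
    Returns the flattened score vector U_x fed to the neural network.
    """
    U = zeta_xq / B - zeta_q[:, None]   # zeta(x,q)/b_q(x) - zeta(q)
    return U.ravel()
```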
4.4 Feed Forward Back Propagation Technique
The objective of this study is to classify the Fisher kernel data of sign language letter symbols using a feed forward back propagation neural network with Levenberg-Marquardt (LM) as the training algorithm. The LM algorithm has been used because the training process converges quickly as the solution is approached. Sigmoid and hyperbolic tangent functions are applied in the learning process. The feed forward back propagation neural network is used to classify sign language gestures according to the Fisher score characteristic [33, 34, 39, 40]. It is created by generalizing the gradient descent with momentum weight and bias learning rule to multiple-layer networks and nonlinear differentiable transfer functions. Input vectors and the corresponding target vectors are used to train the network until it can classify the defined patterns. The training algorithms use the gradient of the performance function to determine how to adjust the weights to minimize the error. The gradient is determined using back propagation, which performs computations backwards through the network; this computation is derived using the chain rule of calculus. The transfer functions of the hidden and output layers are both tan-sigmoid.
Training and Testing:
The proposed network was trained with Fisher score data cases. When the training process was completed, the final weights of the network were saved for the testing procedure. The time needed to train the training datasets was approximately 4.60 seconds. The testing process was done for 60 cases; these 60 cases were fed to the proposed network and their outputs recorded.
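The following hedged sketch shows the classifier structure described above: one hidden layer, tan-sigmoid transfer functions in the hidden and output layers, and the gradient-descent-with-momentum rule generalized through back propagation. The paper trains with Levenberg-Marquardt; plain momentum descent is substituted here only to keep the example short, and the function name and hyperparameters are illustrative.

```python
import numpy as np

def train_ffbp(X, T, hidden=20, lr=0.01, momentum=0.9, epochs=500):
    """One-hidden-layer feed forward back propagation network with
    tan-sigmoid transfer functions in the hidden and output layers,
    trained by gradient descent with momentum on the weights (plain
    gradient steps on the biases keep the sketch short)."""
    rng = np.random.default_rng(0)
    n, d = X.shape
    k = T.shape[1]
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, k)); b2 = np.zeros(k)
    vW1 = np.zeros_like(W1); vW2 = np.zeros_like(W2)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden layer
        Y = np.tanh(H @ W2 + b2)            # output layer
        dY = (Y - T) * (1 - Y ** 2)         # chain rule through tanh
        dH = (dY @ W2.T) * (1 - H ** 2)     # back-propagated error
        vW2 = momentum * vW2 - lr * (H.T @ dY) / n
        vW1 = momentum * vW1 - lr * (X.T @ dH) / n
        W2 += vW2; b2 -= lr * dY.mean(axis=0)
        W1 += vW1; b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2
```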
5. VECTOR QUANTIZATION TECHNIQUE
A discrete HMM is used for the recognition of Indian sign language gestures, so the chain code feature vector needs to be converted into a finite set of symbols from a codebook. The VQ technique plays a key role in the HMM-based approach, converting continuous Indian sign language (ISL) gestural signals into a discrete sequence of symbols for the discrete HMM. The VQ concept is entirely determined by a codebook composed of fixed prototype vectors. Fig 3 shows that the quantization process is divided into two parts: the first produces a codebook, and the second updates the codewords by training all the vectors against their nearest prototypes. Its strength lies in reducing data redundancy and the distortion between the quantized data and the original data. A VQ method is therefore required that minimizes this distortion measure. To compute the minimum average distortion for a set of vectors, we use the iterative algorithm proposed by Linde, Buzo and Gray [22], known as the LBG vector quantization design algorithm. The algorithm generates an optimal codebook (in our case of size 16) for isolated ISL gestures.
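A minimal sketch of LBG codebook design as described above: split the current codebook by a small perturbation, then alternate nearest-neighbour assignment and centroid updates until the average distortion stops improving. The function names and the stopping tolerance are assumptions; the paper only fixes the codebook size (16).

```python
import numpy as np

def lbg_codebook(vectors, size=16, eps=0.01, tol=1e-4):
    """LBG design: start from the global centroid, double the
    codebook by splitting every codeword, then refine each level
    with assignment / centroid-update passes. `size` must be a
    power of 2."""
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        prev = np.inf
        while True:
            # assign each vector to its nearest codeword
            d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            distortion = d[np.arange(len(vectors)), nearest].mean()
            if prev - distortion < tol:
                break
            prev = distortion
            # move each codeword to the centroid of its cell
            for k in range(len(codebook)):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

def quantize(vectors, codebook):
    """Map each vector to its nearest codeword index (the discrete
    symbol sequence consumed by the discrete HMM)."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)
```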
6. RECOGNIZING AND INTERPRETATION BY HUMANOID ROBOT
Fig 3 shows the process of quantization, learning, recognizing and interpretation of sign language gestures. An underlying concept of learning gestures has been introduced for the humanoid robot so that it can eventually perform several tasks [27]. Integrating the humanoid robot with Indian sign language gestures provides an elegant way of communicating through mimicry. The real-time robotics simulation software WEBOTS is adopted to generate HOAP-2 actions accurately. The learning process marks an intelligent behavior of HOAP-2, which sustains its learning capability in any type of environment. The learning process is handled by the HOAP-2 robot controller, which invokes a Comma Separated Value (CSV) file in order to perform the gestures in real time. All the classified gestures carry useful information about all the joints of the upper body of the humanoid robot.
6.1 Learning ISL Gestures using HMM with Vector Quantization Techniques
In order for the humanoid robot to learn the ISL gestures, the following preprocessing steps are needed:
a. Capture the ISL gesture as an input gesture.
b. Apply an algorithm to extract the orientation histogram and construct the feature vector.
c. Generate an initial codebook from the feature vector of each gesture, then apply the LBG algorithm to generate an optimized codebook.
d. Each row corresponds to the number of a codeword, which forms the quantized vector used by the HMM algorithm.
In this preprocessing step we have generated 30 symbol sequences for each ISL gesture, as each gesture is captured with an equal number of frames. The next stage trains each gesture with a Hidden Markov Model, whose parameters can be determined efficiently. We calculated the transition and emission probabilities of the HMM from the known states and known sequences. The training algorithm for each gesture measures the accurate transition and emission probabilities, which are used for finding the most probable sequence.

We assumed 5 hidden states for each ISL gesture and made the state sequences in the distribution 1 to 5, with the total sequence number matching the codebook size. The observation sequence for each state is represented by a row vector using the calculated codebook of all the training samples; each observation sequence for each training gesture corresponds to a single row vector. We then apply the algorithm for estimating the transition and emission probabilities of each gesture model and preserve it for the recognition of unknown gestures. Every HMM model is uniquely attached to one gesture and trained with different samples of that ISL gesture. This trained model is used for the recognition of a new gesture, which is tested against all the trained HMM models. The new gesture is declared classified when it yields a maximum-likelihood state path with a trained gesture. We compute the percentage of each probable state for all the training samples that agrees with the likely state sequences of the test gesture, and the gesture is declared classified with the maximum percentage. Fig 3 shows the process of quantization, learning, recognizing and interpretation of sign language gestures using the HMM and neural network.
Fig 3: Learning of ISL gesture using HMM, Fisher Score
and Neural Network technique
[Figure 3 pipeline: capture video → split the gesture video into a sequence of color image frames → convert to gray scale frames → preprocess all frames → construct the feature space for all gestures → vector quantization for codebook generation → generate an HMM model for each gesture → recognize the test gesture by maximum likelihood and generate the Fisher score → learning of the humanoid robot from the recognized gesture using the Fisher score and neural network; a database stores the chain code, Fisher score and codebook.]
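The sketch below illustrates the training and recognition steps of section 6.1 under stated assumptions (5 hidden states, a 16-symbol codebook, known state sequences for training): maximum-likelihood counting estimates of the transition and emission matrices, and Viterbi scoring of a test symbol sequence against each gesture model.

```python
import numpy as np

def estimate_hmm(state_seqs, symbol_seqs, n_states=5, n_symbols=16):
    """Count-based estimates of the transition and emission matrices
    from known state and symbol sequences (one pair per training
    sample); a small additive constant avoids zero probabilities."""
    A = np.full((n_states, n_states), 1e-3)
    B = np.full((n_states, n_symbols), 1e-3)
    for q, o in zip(state_seqs, symbol_seqs):
        for t in range(len(q) - 1):
            A[q[t], q[t + 1]] += 1          # state transitions
        for t in range(len(q)):
            B[q[t], o[t]] += 1              # symbol emissions
    return A / A.sum(1, keepdims=True), B / B.sum(1, keepdims=True)

def viterbi_loglik(obs, pi, A, B):
    """Log-likelihood of the most probable state path of a test
    symbol sequence; the gesture model that maximizes this value is
    declared the classified gesture (pi is the initial state
    distribution, e.g. uniform)."""
    d = np.log(pi) + np.log(B[:, obs[0]])
    for o in obs[1:]:
        d = (d[:, None] + np.log(A)).max(axis=0) + np.log(B[:, o])
    return d.max()
```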
6.2 Learning ISL Gestures using a Neural Network
The objective of this study is to classify the Fisher kernel data of sign language gestures using a feed forward back propagation neural network with Levenberg-Marquardt (LM) as the training algorithm. The network classifies sign language gestures according to the Fisher score characteristic [33, 34, 39, 40]. Fig 3 shows the process of quantization, learning, recognizing and interpretation of sign language gestures using the neural network.
7. RESULT ANALYSIS
Only 21 ISL gestures are selected for our work, each performed by ten different persons. During the acquisition of the ISL video gestures we kept the same number of frames for each gesture; this makes the recognition process easier but reduces the freedom in performing the gesture. In the HMM classification process each model is tested with the Viterbi algorithm, where the same number of frames is required to obtain the matching percentage of a gesture. We have taken 10 samples of each gesture for training and 10 separate samples of each gesture for testing. We generated the probable path from the trained hidden model for each training sample of each gesture using the Viterbi algorithm, and applied the same process to the test samples. In the recognition phase of known gestures, we take the test samples one by one and compare each with all the samples of every trained gesture iteratively, producing the recognition percentage of that particular sample against the entire training set. In this process we created separate training and test sets of most probable paths. If a particular gesture matches 60% or more, according to the maximum-likelihood state, with one of the training samples of that gesture, it is considered classified. We have achieved up to 98% recognition accuracy with both techniques. The whole prototype was tested and simulated on an Intel Core 2 Duo system with MATLAB code. Each gesture for mimicry generation needs special care of the dedicated joints responsible for performing that gesture. The experiments were carried out with independent samples for training and testing, repeated five times, and the results are expressed by their average and variance.
Table 1: Success rate of the HMM as a function of the percentage of training samples and the number of states

Training samples | 20 states      | 45 states      | 45 states      | 65 states      | 100 states
20%              | 75.66% ± 22.82 | 86.60% ± 3.33  | 87.74% ± 2.47  | 85.38% ± 7.67  | 64.40% ± 45.65
40%              | 78.20% ± 12.83 | 88.70% ± 3.50  | 88.58% ± 7.24  | 92.10% ± 1.82  | 73.20% ± 26.50
60%              | 76.78% ± 11.20 | 91.90% ± 2.95  | 87.30% ± 3.90  | 87.98% ± 3.10  | 68.10% ± 25.87
80%              | 77.31% ± 31.42 | 92.75% ± 14.19 | 93.20% ± 8.30  | 92.50% ± 14.15 | 75.65% ± 16.80
The experiments were executed sequentially, first on the HMM to reach its maximum success rate, varying the number of states and the percentage of training samples. The experimental results when varying these parameters are shown in Table 1. The recognition accuracy improved as the number of states increased, with the highest accuracy obtained at 65 states. Although the recognition accuracy hardly improved when the number of samples increased, the highest accuracy was obtained with 80% training samples. Using only the position of the hands, the recognition accuracy was 94.50%. From Table 1 it is deduced that the best configuration is 80% training samples and 65 states, with a rate of 94.50% and a small variance. From this model the Fisher kernel is generated and applied to the neural network. The neural network presents better results, reaching an average rate of 95.34% with the smallest variance. The overall accuracy of Indian sign language gesture recognition in the training, validation and testing modes is 98.60%, 97.64% and 97.52%, respectively.
8. CONCLUSION
In this work we have used Indian sign language gestures as the communicating agent between human and robot. This is our first step in designing a prototype vision-based human robot interaction (HRI) system for speech- and hearing-impaired persons, who could use the humanoid robot as a translator or helping agent and communicate with it through Indian sign language gestures. Based on our observations, we select the neural network combined with the Fisher score as the best recognition tool in terms of time taken and accuracy. In the present work we only implement a mimicry action by the humanoid robot. In principle these techniques could be used to recognize other human gestures, although the complexity of classification would increase because of the ambiguity in normal human gestures; we chose ISL because its rigid vocabulary makes the classification simpler. The present work only recognizes a single gesture at a time. It will be challenging to recognize multiple gestures or a sequence of gestures one after the other; the task will be to separate each gesture from the sequence and apply the gesture recognition techniques described in this paper. In this article a robust and novel automatic Indian sign language gesture recognition system has been presented. The overall accuracy of sign language recognition in the training, validation and testing modes is 98.60%, 97.64% and 97.52%. We conclude that the proposed system gives fast and accurate Indian sign language gesture recognition, and given the encouraging test results, we are confident that an automatic sign language gesture recognition system can be developed.
9. REFERENCES
[1] Zhang, J., Zhao, M.: A vision-based gesture recognition
system for human-robot interaction. Robotics and
Biomimetics (ROBIO), 2009 IEEE International
Conference on, vol., no., pp.2096-2101, 19-23 Dec.
(2009). doi: 10.1109/ROBIO.2009.5420512
[2] Calinon, S., Guenter, F., Billard, A.: On Learning,
Representing and Generalizing a Task in a Humanoid
Robot. IEEE Trans. on Systems, Man and Cybernetics,
Part B, Vol. 37, No. 2, pp. 286-298 (2007). doi:
10.1109/TSMCB.2006.886952
[3] Calinon, S., Guenter, F., Billard, A.: Goal-Directed
Imitation in a Humanoid Robot. International Conference
on Robotics and Automation (ICRA), pp. 299-304
(2005).
[4] Pantic, M., Rothkrantz, L. J. M.: Toward an affect-
sensitive multimodal human-computer interaction. IEEE,
vol.91, no.9, pp. 1370- 1390, Sept. (2003).
[5] Bhuyan, M. K., Ghosh, D., Bora, P. K.: A Framework for Hand Gesture Recognition with Applications to Sign Language. India Conference, 2006 Annual IEEE, pp. 1-6, Sept. (2006). doi: 10.1109/INDCON.2006.302823
[6] Prasad, J. S., Nandi, G. C.: Clustering Method
Evaluation for Hidden Markov Model Based Real-Time
Gesture Recognition. Advances in Recent Technologies
in Communication and Computing, ARTCom '09, pp. 419-423, 27-28 Oct. (2009).
[7] Lee, H. J., Chung, J. H.: Hand gesture recognition using
orientation histogram. TENCON 99. Proceedings of the
IEEE Region 10 Conference, vol.2, no., pp.1355-1358
vol.2, Dec. (1999). doi: 10.1109/TENCON.1999.818681
[8] Freeman, W. T., Roth, M.: Orientation histograms for
hand gesture recognition. Intl. Workshop on Automatic
Face- and Gesture- Recognition, IEEE Computer
Society, Zurich, Switzerland, pp.296—301, June (1995).
MERL-TR94-03.
[9] Nandy, A., Prasad, J. S., Chakraborty, P., Nandi, G. C.,
Mondal, S.: Classification of Indian Sign Language In
Real Time. International Journal on Computer
Engineering and Information Technology (IJCEIT), Vol.
10, No. 15, pp. 52-57, Feb. (2010).
[10] Nandy, A., Prasad, J. S., Mondal, S., Chakraborty, P.,
Nandi, G. C.: Recognition of Isolated Indian Sign
Language gesture in Real Time. BAIP 2010, Springer
LNCS-CCIS, Vol. 70, pp. 102-107, March (2010). doi:
10.1007/978-3-642-12214-9_18.
[11] Dasgupta, T., Shukla, S., Kumar, S., Diwakar, S., Basu, A.: A Multilingual Multimedia Indian Sign Language Dictionary Tool. The 6th Workshop on Asian Language Resources, pp. 57-64 (2008).
[12] Kim, J., Thang, N. D., Kim, T.: 3-D hand motion
tracking and gesture recognition using a data glove.
Industrial Electronics, 2009. ISIE 2009. IEEE
International Symposium on, vol., no., pp.1013-1018, 5-8
July (2009). doi: 10.1109/ISIE.2009.5221998
[13] Jiangqin, W., Wen, G., Yibo, S., Wei, L., Bo, P.: A
simple sign language recognition system based on data
glove. Signal Processing Proceedings, 1998. ICSP '98.
1998 Fourth International Conference on, vol.2, no.,
pp.1257-1260 vol.2 (1998). doi:
10.1109/ICOSP.1998.770847
[14] Ishikawa, M., Matsumura, H.: Recognition of a hand-
gesture based on self-organization using a DataGlove.
Neural Information Processing, 1999. Proceedings.
ICONIP '99. 6th International Conference on, vol.2, no.,
pp.739-745 vol.2 (1999). doi:
10.1109/ICONIP.1999.845688
[15] Swee, T. T., Ariff, A. K., Salleh, S. H., Seng, S. K.,
Huat, L. S.: Wireless data gloves Malay sign language
recognition system. Information, Communications &
Signal Processing, 2007 6th International Conference on,
vol., no., pp.1-4, 10-13 Dec. (2007). doi:
10.1109/ICICS.2007.4449599
[16] Liang, R. H., Ouhyoung, M.: A real-time continuous
gesture recognition system for sign language. Automatic
Face and Gesture Recognition, 1998. Proceedings. Third
IEEE International Conference on, vol., no., pp.558-567,
14-16 Apr (1998).
[17] Won, D., Lee, H. G., Kim, J. Y., Choi, M., Kang, M. S.:
Development of a wearable input device based on human
hand-motions recognition. Intelligent Robots and
Systems, 2004. (IROS 2004). Proceedings. 2004
IEEE/RSJ International Conference on, vol.2, no., pp.
1636- 1641 vol.2, 28 Sept.-2 Oct. (2004). doi:
10.1109/IROS.2004.1389630
[18] Kuzmanic, A., Zanchi, V.: Hand shape classification
using DTW and LCSS as similarity measures for vision-
based gesture recognition system. EUROCON, 2007.
The International Conference on "Computer as a Tool",
vol., no., pp.264-269, 9-12 Sept. (2007). doi:
10.1109/EURCON.2007.4400350
[19] Hienz, H., Grobel, K., Offner, G.: Real-time hand-arm
motion analysis using a single video camera. Automatic
Face and Gesture Recognition, 1996., Proceedings of the
Second International Conference on , vol., no., pp.323-
327, 14-16 Oct. (1996).
[20] Hasanuzzaman, M., Ampornaramveth, V., Zhang, T.,
Bhuiyan, M. A., Shirai, Y., Ueno, H.: Real-time Vision-
based Gesture Recognition for Human Robot Interaction.
Robotics and Biomimetics, 2004. ROBIO 2004. IEEE
International Conference on, vol., no., pp.413-418, 22-26
Aug. (2004). doi: 10.1109/ROBIO.2004.1521814.
[21] Rabiner, L. R.: A tutorial on hidden Markov models and
selected applications in speech recognition. IEEE, vol.77,
no.2, pp.257-286, Feb. (1989). doi: 10.1109/5.18626.
[22] Vector Quantization Technique and LBG Algorithm.
www.cs.ucf.edu/courses/cap5015/vector.ppt.
[23] Michailovich, O., Rathi, Y., Tannenbaum, A.: Image
Segmentation Using Active Contours Driven by the
Bhattacharyya Gradient Flow. IEEE Transactions on
Image Processing, vol.16, no.11, pp.2787-2801, Nov.
(2007). doi: 10.1109/TIP.2007.908073
[24] Kailath, T.: The Divergence and Bhattacharyya Distance
Measures in Signal Selection. IEEE Transactions on
Communication Technology, vol.15, no.1, pp.52-60, Feb.
(1967).
[25] Nayak, S., Sarkar, S., Loeding, B.: Distribution-Based
Dimensionality Reduction Applied to Articulated Motion
Recognition. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, vol.31, no.5, pp.795-810, May
(2009). doi: 10.1109/TPAMI.2008.80.
[26] Nandy, A., Mondal, S., Prasad, J. S., Chakraborty, P.,
Nandi, G. C.: Recognizing & interpreting Indian Sign
Language gesture for Human Robot Interaction.
Computer and Communication Technology (ICCCT),
2010 International Conference on , vol., no., pp. 712-717,
17-19 Sept. (2010). doi: 10.1109/ICCCT.2010.5640434.
[27] Mitra, S., Acharya, T.: Gesture Recognition: A Survey.
IEEE Transactions on Systems, Man, and Cybernetics,
Part C: Applications and Reviews, vol.37, no.3, pp.311-
324, May (2007). doi: 10.1109/TSMCC.2007.893280.
[28] Lawrence R. Rabiner, "A tutorial on Hidden Markov models and Selected Applications in Speech Recognition", in Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.
[29] L. O'Gorman and R. Kasturi, Document Image Analysis, IEEE Computer Society Press, 1995.
[30] J. Serra, Image Analysis and Mathematical Morphology,
Academic Press, 1982.
[31] C. Travieso, C. Morales, I. Alonso and M. Ferrer, "Handwritten digits parameterisation for HMM based recognition", Proceedings of Image Processing and its Applications, vol. 2, pp. 770-774, July 1999.
[32] E. Gomez, C.M. Travieso, J.C. Briceño, M.A. Ferrer, "Biometric Identification System by Lip Shape", in Proceedings of the 36th International Carnahan Conference on Security Technology, Atlantic City, October 2002, pp. 39-42.
[33] L. Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms, and Applications", Prentice-Hall, Inc., 1994, pp. 304-315.
[34] K. Murakami, H. Taguchi: Gesture Recognition using Recurrent Neural Networks. In CHI '91 Conference Proceedings, pp. 237-242, ACM, 1991.
[35] Chang, J. Chen, W. Tai, and C. Han, "New Approach for Static Gesture Recognition", Journal of Information Science and Engineering, vol. 22, pp. 1047-1057, 2006.
[36] S. Naidoo, C. Omlin and M. Glaser, "Vision-Based Static Hand Gesture Recognition Using Support Vector Machines", 1998, pp. 88-94.
[37] Vladimir I. Pavlovic, Rajeev Sharma, Thomas S. Huang, "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, July 1997, pp. 677-695.
[38] Davis, J., Shah, M.: Visual gesture recognition. IEE Proceedings - Vision, Image and Signal Processing, vol. 141, no. 2, pp. 101-106, 1994.
[39] "Hand Gesture Recognition using Radial Basis Function (RBF) Networks and Decision Trees", International Journal of Pattern Recognition and Artificial Intelligence, vol. 11, no. 6, pp. 845-850, 1997.
[40] Face detection in complicated backgrounds and different illumination conditions by using YCbCr color space and neural network. Pattern Recognition Letters, vol. 28, no. 16, pp. 2190-2200, 1 December 2007.
Change Data Capture on OLTP Staging Area for Nearly Real Time Data Warehouse based on Database Trigger
I Made Sukarsa, Department of Information Technology, Faculty of Engineering, Udayana University, Bali, Indonesia
Ni Wayan Wisswani, Department of Informatics Management, Politeknik Negeri Bali, Bali, Indonesia
I K. Gd. Darma Putra, Department of Information Technology, Faculty of Engineering, Udayana University, Bali, Indonesia
Linawati, Department of Electrical Engineering, Faculty of Engineering, Udayana University, Bali, Indonesia
ABSTRACT
A conventional data warehouse produces summaries from an organization's information systems over a long time period. This means management cannot get the most up-to-date data whenever it is needed. A nearly real time data warehouse, which manages the ETL process with more compact data and a shorter period, is therefore needed.
The design of the nearly real time data warehouse in this research is implemented in two steps. The first step models the data collection technique so that the ETL manages more compact data. This is done by putting the staging area on the Online Transactional Processing (OLTP) system, which minimizes failures in moving data from the OLTP to the staging area. In addition, a CDC method is applied on the OLTP, implemented with active database triggers. The trigger captures the data changes on the OLTP, transforms them and loads them into the staging area in one pass. The second step is the synchronization process that moves the data from the staging area to the nearly real time data warehouse. This is done by mapping the movement with SQLyog; the mapping result is executed by the Windows task scheduler.
General Terms
Modelling System, Data Warehouse
Keywords
Nearly real time data warehouse, Change Data Capture,
Surrogate key, Trigger.
1. INTRODUCTION
A data warehouse is a necessity for an organization. A data warehouse (DWH) can be the data source for all the integrated report making needed to support decision making [1]. Data from various OLTP sources is processed through stages consisting of Extract, Transform and Loading (ETL). ETL is built on a tier placed between the source data and the DWH, also known as the staging area [2]. The extract part takes data from multiple sources within a specific time period to be carried to the DWH. The data is cleaned, integrated and transformed into a specific format by the transform component and then moved to the DWH by the loading component.
A conventional ETL machine works in a time-variant manner, saving the data periodically in accordance with the organization's business process flow [3]. This characteristic makes the DWH unable to give the most up-to-date information on every event in the transactional system. In fact, a real time data warehouse is really needed for decision making that requires the most current information [4].
A real time data warehouse would show the ETL results at the exact time of the transaction on the source system [5]. But ETL, as the core of the data warehouse [6], cannot truly work in real time [7], because it needs time to process data from various sources in large amounts and has to pass through several communication components [8]. The delay that ETL needs to process these summaries gave rise to the term Nearly Real Time Data Warehouse (NRTDWH) [7].
To produce an NRTDWH, ETL can be implemented by applying Change Data Capture (CDC) [9]. CDC detects changes in the data sources and captures them for the destination databases that need them [10]. This ability lets CDC capture data changes efficiently [11], making an NRTDWH easier to implement. Based on the above, the effort to create an NRTDWH through CDC modeling becomes very important.
2. RELATED WORK
Several studies on CDC modeling and real time data warehousing have been carried out. [12] models the CDC process using log analysis and introduces a semi real time DWH architecture that builds a real time data warehouse using the CDC mechanism owned by Oracle. [10] models the data change capture process using a set of web services. Capture modeling with web services is also done by [13], which introduces a multi-level real time data cache architecture to facilitate a real time data warehouse. Meanwhile, [8] models ETL for a real time DWH using a scheduling algorithm to balance queries and updates, with thread control triggers based on the ETL machine.
In our research we develop a trigger-based CDC model that captures data changes on different source systems. The same trigger transforms the capture result in one pass and then loads it into a staging area placed on the OLTP.
The capture, transform and load (CTL) design lets the DWH receive the data summary faster, because the ETL processes a smaller amount of data and the CTL result is final data that already conforms to the DWH structure. As a consequence, the synchronization of all the data sources into the DWH does not need any further transformation.
3. CAPTURE, TRANSFORM AND LOAD
3.1 CTL Framework
The CTL model architecture for the NRTDWH developed in this research is visualized in figure 1:
[Figure 1 components, per OLTP source: an ODS (MySQL) where an OLTP user changes data ("data berubah" = changed data); a change data capture trigger that transforms and loads the changes into the staging area on the same ODS; a scheduler that loads each staging area into the nearly real time data warehouse; and a viewing layer for the end user.]
Figure 1. General architecture of the system
In this model, the transform and load processes are conducted by each OLTP engine, which reduces the time delay, because the staging area is located at each OLTP and no new staging area has to be built as in existing models. The integration process is completed on the OLTP, so the data warehouse receives the final data.
The NRTDWH in this research is produced by the CTL process on different OLTP sources. The model starts to work when a user enters new data, or changes or deletes a record or some fields on the OLTP.
An insert event makes a trigger capture the inserted data and save it as a new record in the appropriate table of the staging area. An update to one or more fields of a record makes a trigger capture the change; the result is used to update data or is saved as a new record in the corresponding staging table. If a delete happens, the deleted data changes some fields of the active record in the staging area; the delete can also make the trigger insert a new row into the appropriate staging table. CTL works as in figure 2 below.
[Figure 2 flow: capture of inserted, updated or deleted data → determination of the manipulation → transform of the data in accordance with the appropriate structure → load of the change into the OLTP staging area → load to the DWH.]
Figure 2. CTL process flow
When the transform of the captured result is done, the trigger performs one of these processes (a sketch of the insert path follows the list):
1. Simple transform process. This performs field adjustment and data formatting between the captured data and the structure of the staging area. It happens when the information on the related topic in the staging area comes from one table and does not need a relation with other tables.
2. Leveled transform process. This is completed with advanced query join operations and other look-up style operations. It is done when the information comes from several tables on the OLTP.
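As a sketch of the insert path of the CTL trigger, the Python snippet below installs a single-statement MySQL trigger on the th_thesis table that loads a row straight into the staging dimension on the same host. The source column names (judul, peneliti, id_prodi), the connection parameters and the trigger body are assumptions for illustration; the paper does not list the OLTP schema.

```python
import mysql.connector

# Trigger body is illustrative: the th_thesis source columns
# (judul, peneliti, id_prodi) are assumptions, since the paper
# does not list the OLTP schema.
TRIGGER_SQL = """
CREATE TRIGGER ctl_thesis_insert AFTER INSERT ON th_thesis
FOR EACH ROW
  INSERT INTO dimensi_tesis
      (id_thesis, judul_penelitian_baru, nama_peneliti,
       id_prodi, mulai, status)
  VALUES (NEW.id_thesis, NEW.judul, NEW.peneliti,
          NEW.id_prodi, NOW(), 'active')
"""

conn = mysql.connector.connect(host="localhost", user="etl",
                               password="secret",
                               database="oltp_thesis")
conn.cursor().execute(TRIGGER_SQL)   # staging table lives on the same host
conn.close()
```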
All the CTL results saved in the staging area are then moved to the NRTDWH by a task scheduler based on the metadata mapping design. This metadata is the basic rule for joining data from every OLTP source into the NRTDWH. To make the data warehouse easier to understand, its data is presented through a data mart application.
3.2 Dimensional Modelling
In this research, all of the OLTP systems use the same MySQL database platform. The OLTP provides the data that the NRTDWH needs, while the staging area loads the CTL results into dimension and fact tables that are ready to be joined into the NRTDWH. Figure 3 shows the star schema placed in each staging area on the OLTP and the dimensional model of the data warehouse.
[Figure 3 schemas. Staging OLTP 1 (dissertation system): dimensi_prodi (PK id_sd_prodi), dimensi_disertasi (PK id_sd_disertasi), fak_pengunjung_perdisertasi, fakta_prodi_disertasi, fg_pengunjung_perprodi, fg_pengunjung_perprodi_perbulan. Staging OLTP 2 (thesis system): dimensi_prodi (PK id_st_prodi), dimensi_tesis (PK id_st_thesis), fak_pengunjung_pertesis, fakta_prodi_tesis, fak_pengunjung_per_prodi (daily and monthly). DWH: dimensi_prodi (PK id_dwh_prodi), dimensi_tesisdisertasi (PK id_dwh_ts_ds), fak_pengunjung_tsds, fakta_prodi_tsds, fg_pengunjungprodi, fg_pengunjung. Each table carries the natural keys (id_prodi, nama_prodi, judul_penelitian, nama_peneliti, status, mulai, selesai) plus visitor counts (jumlahpengunjung) and dates (waktu, tgl, bulan).]
Figure 3. Dimensional Modeling
Even though the data comes from different sources, the process of joining data from the staging areas into the NRTDWH does not need an advanced transformation to form new surrogate keys on the dimensions and facts, and yet all data in the NRTDWH can still be differentiated. This is because the surrogate key in this research is designed to keep the identity of its OLTP source. This surrogate key model also prevents the join from failing on identical data.
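A one-line sketch of this idea, assuming (hypothetically) that the source tag is simply prefixed to the natural key; the paper does not give the exact key format:

```python
def surrogate_key(source_tag, natural_id):
    """Illustrative surrogate key: prefixing the OLTP source tag keeps
    rows from different staging areas distinguishable after the join,
    so identical natural keys from the thesis and dissertation systems
    cannot collide in the NRTDWH (the exact format is an assumption)."""
    return f"{source_tag}_{natural_id}"

# e.g. the same prodi id from the two OLTP sources stays distinct:
assert surrogate_key("st", 101) != surrogate_key("sd", 101)
```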
3.3 Nearly Real Time Data Warehouse
The NRTDWH in this research is achieved in several ways:
a. A staging area design united with the OLTP database. This shortens the time of the change capture process from the OLTP to the staging area, so the transform can be done immediately. The model also minimizes communication failure, because the data source and target are on the same host. Placing the staging area on the OLTP also makes the synchronization into the NRTDWH easier: the whole data process is done on the OLTP, so all the data saved in the staging area is final data conforming to the structure the NRTDWH wants.
b. Shortening the data load time span to the NRTDWH with a trigger. A trigger lets the capture be done in a shorter period of time than other CDC methods. A shorter capture process naturally reduces the time needed for the transform and load on the staging area.
c. Joining the transform to the change data capture. Doing the CTL in one pass with the same trigger minimizes the delay between capture and transform. This speeds up the load into the staging area, so the synchronization can also be scheduled at shorter intervals.
d. Using triggers, functions and procedures as the transform engine. In this research, all the capture, transform and load processing is run by triggers, functions and stored procedures. Triggers are chosen because the processing is fast and the daily transactions keep working without disturbance, since the stored routines run inside the DBMS. A trigger also knows which event changed the record on the OLTP, so the changed data is transformed directly, without comparison against data already saved in the DWH. This makes the NRTDWH easier to achieve.
3.4 The Synchronization Process
The synchronization process moves and joins the processed data loaded into the staging area of each OLTP. It consists of two main components. The first does the metadata mapping, built with SQLyog Ultimate; the metadata is used as the base rule when synchronization happens, and all the mappings are saved in a separate job file for every source. The second component is the scheduler, which contains the time schedule for moving data to the DWH; it runs the job files according to the metadata scheme. The scheduler is created with the Windows operating system and is scheduled every minute.
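A hedged sketch of one scheduler pass is shown below, assuming mysql-connector-python connections to a staging area and to the NRTDWH and a two-entry version of the metadata mapping (see Table 1 in section 3.5.2); in the actual system this step is run by SQLyog job files under the Windows task scheduler rather than by Python.

```python
import mysql.connector

# Metadata mapping sketch (staging table -> NRTDWH table).
MAPPING = [("dimensi_tesis", "dimensi_ts_ds"),
           ("fakta_prodi_tesis", "fakta_prodi_tsds")]

def synchronize(staging_conn, dwh_conn):
    """One pass: copy the final rows of each staging table into its
    NRTDWH destination. The connection handling and the statement
    shape are assumptions; the rows need no further transformation
    because CTL already produced them in the DWH structure."""
    src, dst = staging_conn.cursor(), dwh_conn.cursor()
    for source, target in MAPPING:
        src.execute(f"SELECT * FROM {source}")
        rows = src.fetchall()
        if rows:
            placeholders = ", ".join(["%s"] * len(rows[0]))
            dst.executemany(
                f"REPLACE INTO {target} VALUES ({placeholders})", rows)
    dwh_conn.commit()
```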
3.5 Testing and Results
The testing of the CDC model in this research uses three applications: the thesis system and the dissertation system, which act as OLTPs, and the Udayana University data mart application. The testing is done by manipulating dummy data spread over each OLTP; the manipulation is applied only to the OLTP tables that feed the DWH.
The testing is done in two phases. The first phase verifies that the CTL process on the staging area completes successfully. The second proves that the synchronization from the staging area of each OLTP to the NRTDWH is successfully done by the scheduler.
3.5.1 Capture, Transform and Load Testing
The trigger performs the CTL process before and after every insert, update and delete on an OLTP. These manipulations influence the fact and dimension tables of each staging area. One of the observed CTL processes is the manipulation of the th_thesis table by insert, update and delete. The insert into the th_thesis table is done through the form shown in figure 4.
Figure 4: Insert form of the thesis OLTP system.
When an insert happens through this form, the CTL on the th_thesis table inserts a new row into the dimension and fact tables of the staging area, so that the dimension table looks like figure 5.
Figure 5. Insert result in the thesis dimension
The fact table pengunjung_pertesis then looks like figure 6.
Figure 6. Insert result in the fact table pengunjung_pertesis
Another fact table influenced by this process is the prodi_tesis table; when CTL succeeds it looks like figure 7.
Figure 7. Insert result in prodi_tesis
The pengunjung_prodi fact table also changes when the insert into th_tesis is done; the CTL result on this table is shown in figure 8.
Figure 8. Insert result in the pengunjung_prodi table of the thesis OLTP system.
The insert into the th_thesis table also influences the pengunjung_prodi_perbulan fact table, changing it as in figure 9.
Figure 9. Insert result in the pengunjung_prodi_perbulan table of the thesis system.
The update process that influences the dimension and fact tables is done in two ways. First, the th_thesis table is updated through a form like figure 10.
Figure 10. Update form of the thesis OLTP system.
An update through this form changes the researcher name field, the research title or the id_prodi field, triggering CTL and influencing the dimension and fact tables of the staging area. If the change affects only the name and title fields of the input data, CTL changes the thesis dimension as in figure 11.
Figure 11. Update result in the thesis dimension
If the change is made to the id_prodi field, the prodi dimension changes as in figure 12.
Figure 12. Update result of the id_prodi field in the prodi dimension
A change of the id_prodi field also influences the fact tables of the staging area. The fact table that changes is the pengunjung_pertesis table, shown in figure 13.
Figure 13. Update result in pengunjung_per_thesis
After CTL runs, the prodi_tesis fact table looks like figure 14.
Figure 14. Update result of the id_prodi field in the prodi_tesis table.
Because of these processes, the pengunjung_prodi fact table looks like figure 15.
Figure 15. Update result in the pengunjung_prodi table of the thesis system
The other fact that changes is fg_pengunjungprodibulan, as in figure 16.
Figure 16. Update result in the pengunjung_prodi_perbulan table of the thesis system.
The second update method on th_thesis is done through a form like figure 17.
Figure 17. Update form of the lihat field of the thesis table.
User activity through this form changes the value of the lihat field saved in the th_thesis table. This change makes CTL work, so the pengunjung_per_tesis table looks like figure 18.
Figure 18. Update result of the lihat field in the pengunjung_pertesis table
Another table changed by this process is the pengunjung_prodi fact table; the result looks like figure 19.
Figure 19. Update result in the pengunjung_prodi table.
The CTL process triggered by the lihat field also changes the pengunjung_prodi_perbulan table, as shown in figure 20.
Figure 20. Update result in the pengunjung_prodi_perbulan table.
The delete process on the thesis system is done through a form like figure 21.
Figure 21. Delete form of the th_thesis table on the thesis OLTP system.
A delete through this form triggers the CTL process, changing records in several staging tables. The first table to change is the thesis dimension table, as shown in figure 21.
Figure 21. Delete result in the thesis dimension table.
Another table that changes is fakta_prodi_tesis; after this process it looks like the following figure.
Figure 22. Delete result in the prodi_tesis fact table
The pengunjung_per_prodi table then looks like figure 23.
Figure 23. Delete result in the pengunjung_prodi table
3.5.2 Data Synchronization Process to the Data Warehouse
The data synchronization from the OLTP sources to the NRTDWH is done by a scheduler that works according to the designed scheme. Data successfully moved from the staging areas is joined into the NRTDWH based on the metadata shown in table 1 below.
Table 1. Staging area to DWH metadata

Source staging area | Source table                | Destination table on NRTDWH
DWH disertasi       | Dimensi disertasi           | Dimensi_ts_ds
DWH disertasi       | Dimensi prodi               | Dimensi_prodi
DWH disertasi       | Fak_pengunjung_perdisertasi | Fak_pengunjungtsds
DWH disertasi       | Fakta_prodi_disertasi       | Fakta_prodi_tsds
DWH disertasi       | Fg_pengunjung_prodi         | Fakta_pengunungprodi
DWH disertasi       | Fgpengunjungprodibln        | Fakpengunjungprodbln
DWH thesis          | Dimensi tesis               | Dimensi_ts_ds
DWH thesis          | Dimensi prodi               | Dimensi_prodi
DWH thesis          | Fak_pengunjung_pertesis     | Fak_pengunjungtsds
DWH thesis          | Fakta_prodi_tesis           | Fakta_prodi_tsds
DWH thesis          | Fg_pengunjung_prodi         | Fakta_pengunungprodi
DWH thesis          | Fg_kunjungprodibulan        | Fakkunjungprodbulan
The above metadata is the rule base of the synchronization process. Figure 24 below shows the history of successful synchronizations captured from the job scheduler.
Figure 24. Job scheduler history
One of the successful synchronization processes is shown in figure 25.
Figure 25. Synchronization result in the prodi_tesis_disertasi table of the NRTDWH
The synchronization results saved in the dimension and fact tables of the NRTDWH are presented through a data mart application, which makes the data easier to read and helps the end user extract its full meaning. Through this application, the data in the NRTDWH first goes through a masking process, done by synchronizing the prodi dimension table with the related fact. One of the masking processes is done between the values of the prodi dimension table shown in figure 26 and the record values in figure 25.
Figure 26. Data in the prodi dimension table of the NRTDWH
Based on this, the masking result in the testing application looks like figure 27.
Figure 27. Masking result of the dimension and fact tables
Through this application the masked result can also be viewed as graphics. The graphic obtained from the data of the prodi_disertasi fact is shown in figure 28.
Figure 28. Masking graphic result
4. CONCLUSION AND FUTURE WORK
This research has developed a method to create a nearly real time data warehouse from several different OLTP systems on the same platform. The NRTDWH is achieved by implementing trigger-based CTL, which runs the transform and load processes in one pass on a staging area placed on the OLTP. Future research can apply CTL to create a nearly real time data warehouse for data sources on different platforms, and measure the OLTP performance under the extra burden of the staging machine. Data integration issues also need special attention to meet more dynamic modeling. If such further research is done, a data warehouse implementation model that is closer to real time can be obtained by cutting the processing time in the staging area.
5. ACKNOWLEDGMENTS
Our special thanks to the Divinkom Department of Udayana University, Bali, Indonesia, who contributed to the application test of the model.
6. REFERENCES
[1] Robert M. Bruckner, Beate List, and Josef Schiefer,
Striving towards Near Real-Time Data Integration for
Data Warehouses , Data Warehousing and Knowledge
Discovery Lecture Notes in Computer Science, 2002,
Volume 2454/2002, 173-182, DOI: 10.1007/3-540-
46145-0_31
[2] Javed, Muhammad Younus, Nawaz, Asim, 2010. Data Load Distribution by Semi Real Time Data Warehouse. In: Computer and Network Technology (ICCNT), 2010 Second International Conference on, pp. 556-560.
[3] Inmon, W.H. 2005. Building The Data Warehouse
Fourth Edition. Canada : Wiley Publishing.Inc.
[4] Simitsis, A., Vassiliadis, P., Sellis, T.: Optimizing ETL Processes in Data Warehouses. In Data Engineering, 2005, ICDE 2005, Proceedings of the 21st International Conference on, pp. 564-575.
[5] Vandermay, John., 2001. Considerations for Building a
Real-time Data Warehouse
[6] Savitri, F.N., Laksmiwati, H.: Study of localized data cleansing process for ETL performance improvement in independent datamart. Electrical Engineering and Informatics (ICEEI), 2011 International Conference on. [downloaded: 13 August 2011]
[7] Langseth ,Justin., 2004, Real-Time Data Warehousing:
Challenges and Solutions.
[8] Jie Song; Yubin Bao; Jingang Shi; 2010, A Triggering
and Scheduling Approach for ETL . Computer and
Information Technology (CIT), 2010 IEEE 10th
International Conference on , Page(s): 91 – 98.
[9] R. Kimball and J. Caserta, The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. John Wiley & Sons, 2004.
[10] Mitchell J Eccles, David J Evans and Anthony J
Beaumont, True Real-Time Change Data Capture
WithWeb Service Database Encapsulation, 2010, 2010
IEEE 6th World Congress on Services
[11] Attunity Ltd, 2009. Efficient and Real Time Data Integration With Change Data Capture. Available at http://www.attunity.com/cdc_for_etl
[12] Jingang Shi, Yubin Bao, Fangling Leng, Ge Yu, 2008. Study on Log-Based Change Data Capture and Handling Mechanism in Real-Time Data Warehouse. In International Conference on Computer Science and Software Engineering, CSSE 2008, Volume 4, December 12-14, 2008, Wuhan, China, pp. 478-481, IEEE Computer Society.
[13] Liu Jun; Hu ChaoJu; Yuan HeJin. 2010. Application of
Web Services on The Real-time Data Warehouse
Technology, Advances in Energy Engineering (ICAEE),
2010 International Conference on , Page(s): 335 – 338
Decision Support System for Admission in Engineering Colleges based on Entrance Exam Marks
Miren Tanna
B.E. Student, Dept. of Computer Engineering, Thakur College of Engineering & Technology, Mumbai, India
ABSTRACT
Making a wise career decision is very important for everyone. In recent years, decision support tools and mechanisms have assisted us in making the right career decisions. This paper attempts to enable a student who wishes to pursue Engineering to make good decisions with the help of a Decision Support System. The last 3 years' information has been obtained from the website of the Directorate of Technical Education, India (DTE), which makes it freely available. Using decision rules, results are computed from which a student can choose which stream and college he/she can opt for on the basis of the Entrance Exam marks he/she has scored. To make the results more relevant, a search in the already created decision system is performed. A student has to enter his/her Entrance Exam scores and the stream he/she wishes to opt for. Based on the entered information, the decision system returns colleges and streams categorized as Ambitious, Best Bargain and Safe.
General Terms
Data Mining, Decision Support System.
Keywords
Prediction, Result prediction.
1. INTRODUCTION
Universities possess large amounts of demographic data about students and colleges. This data is present without any form of analysis. Informal analysis requires one to read through each line of the data, which is not economical.
Studies have been conducted in similar areas, such as understanding student data [1], where a decision tree algorithm is applied and evaluated on university records, producing graphs that are useful both for predicting graduation and for finding factors that lead to graduation. Another study uses student data to predict which branch a student has high chances of being placed into [2], making use of adjacency lists, information gain theory and a confusion matrix.
This proposal deals with two problems: using the available data to predict in which college, and in which stream, a student has high chances of getting admitted; and providing relevant results by learning from previous system states and revising itself each time.
Decision Support Systems are mostly interactive systems, often required by humans to provide necessary information based on specific inputs; they are also adaptable computer-based information systems. Such a decision system utilizes not only decision rules, models and a comprehensive database, but also the decision maker's own insights, leading to specific, implementable decisions in solving problems that would be difficult for a human to handle alone [3].
2. PROBLEM DEFINITION
Admission into professional colleges for the engineering degree course is based on scores of the Common Entrance Test (CET). Students are allotted colleges based on these scores, with seats allotted on the basis of availability in CAP rounds. The lowest score accepted in a college for a certain CAP round is known as the cut-off score.
Universities under DTE collect data about CET scores and admissions from each college under that particular university. Analyzing this extensive data provides an opportunity to predict the admission pattern for a particular score, branch and even a CAP round. Presently there are no resources to sort out colleges based on the parameters of marks, branches and CAP rounds, so a student is less informed about the colleges he is eligible for. Here we propose a technique that uses a Decision Support System to provide a student with such decisions. The decisions taken by the system should not only focus on present decisions but should also take past decisions into account.
The proposal mainly discusses the use of a DSS for finding the most appropriate colleges for students based on their CET scores. However, the scope of the project can be extended to include the common entrance exam that is being envisioned to bring uniformity and fairness to the current admission system. The algorithm that has been developed can be modified so that it functions properly for the new pattern as well when obtaining admission to different colleges. The main focus here is the Engineering field, and the data has been collected accordingly, so students opting for engineering may enter their marks in order to get an appropriate result for the colleges suitable for them. Similarly, this system can be used for several other fields too, such as Medicine, Pharmacy, etc.
3. DATA MINING Data mining is a process that analyses (often large)
observational data sets to find relationships within it and to
summarize this data in a way that can be used by humans for
various purposes [4]. Techniques such as Bayes’ theorem,
neural networks and decision trees are an integral part of the
data mining process. Data mining is the process of extracting out knowledge from a
set of data. It discovers new patterns from data sets using
various methods of artificial intelligence, machine learning
and database systems. It is one of the steps of the Knowledge
Discovery in Databases process. In the Knowledge Discovery
Process the uncovered hidden knowledge can be identified as
relationships or patterns. The relationships may be between
two or more different objects which may change over a period
of time. Discovery of relationships is a key result of data
mining [5].
If knowledge discovery is one aspect of data mining,
prediction is the other. Here we look for a specific association
with regard to an event or condition. Pattern discovery is
another outcome of data mining operations: data mining tools mine the usage patterns of thousands of users and discover the potential patterns of usage.
The most common example of data mining is the analysis of shopping trends: products that shoppers most commonly buy together are placed next to each other to increase sales.
4. DECISION SUPPORT SYSTEM
A DSS is designed specifically to facilitate decision processes.
It should support rather than automate decision making, and
should adapt quickly to the changing requirements of decision
makers [6]. Decision Support Systems are found most useful
as they couple human decision making skills along with the
computational capability of a computer to improve the quality
of decisions. It is a computer-based support system for
management decision makers who deal with semi-structured
problems [7].
We propose a combination of the data-driven and knowledge-driven types of Decision Support System. Data-driven decision support systems are based on access to and manipulation of internal, external and sometimes real-time data of an organization. Simple file
systems accessed by query and retrieval tools provide the
most elementary level of functionality [8]. Knowledge-driven
DSS suggest or recommend actions to users. They use
business rules, knowledge bases and human expertise in the form of programmed internal logic. A knowledge-driven DSS takes decisions and performs tasks that would otherwise require a human expert. The generic tasks include classification, configuration, diagnosis, interpretation, planning and prediction [9].
A DSS often requires user involvement in the
construction of problem representation and model
verification. They also require direct user involvement in the
analysis and evaluation of decision outcomes. These activities
involve subjective judgments and, therefore, a DSS should
focus on effective support and not on automatic selection. An
effective DSS is one which is flexible, adaptable to changing
user scenarios, its environment and one which learns from
knowledge gained on past user scenarios [10].
5. CENTRAL TENDENCY
Central tendency refers to measures that summarize a data set by a single central value. The most common and most effective numerical measure of the "center" of a set of data is the (arithmetic) mean [11].
Let x_1, x_2, ..., x_N be a set of N values or observations of some attribute, such as salary. Equation 1 shows the mean of this set of values:

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i        (1)

For example, the mean of the cut-off marks 144, 149 and 126 is (144 + 149 + 126)/3, approximately 139.
6. IMPLEMENTATION
The CET scores are stored in the database and processed to obtain the relevant results. Colleges are sorted on the basis of previous years' results as well as each CAP round.
High and low values are taken into consideration to apply
conditions to the entries in the database to categorize them.
The results are categorized as Ambitious, Best Bargain and
Safe colleges.
High = Score + offset        (2)
Low = Score - offset         (3)

where offset is the factor that decides the relevance of the results. The algorithm for this logic is as follows:
1) Accept input from the user.
2) Search the database for pre-existing results.
3) If a result is found, display it to the user. This reduces complexity and provides a quick result. Update the low and high values of the existing result using the current input of the user.
4) Else, do the following:
   a) Search the entire database.
   b) Mark entries with mean greater than the input but less than or equal to High as Ambitious: the candidate has only a slim chance of getting into that college and stream.
   c) Mark entries with mean less than the input but greater than or equal to Low as Best Bargain: this is the best possible college-branch result for the candidate, who is most likely to get in.
   d) Mark entries with mean less than or equal to Low as Safe: the candidate has the highest chance of getting into these colleges.
5) Enter the result generated in step 4 into the database to make it available for future users.

Mean is the average of the marks of all years for a particular CAP round. At steps 4 and 5, the system learns on its own from the marks and previous years' results, revising itself each time it is used. Therefore the system is in a new state after each use, except when results already exist in the table.
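As an illustration, a minimal Python sketch of steps 2-5 is given below. It assumes an in-memory list of (college, branch, CAP round, mean) records and a simple dictionary cache in place of the MySQL tables; all names and the offset value are illustrative only. Ties at the lower boundary follow Table 3: an entry whose mean equals Low is marked Safe.

# Illustrative sketch of the decision logic (steps 2-5); the proposal
# stores records and cached results in MySQL rather than in memory.
OFFSET = 2          # deciding factor for the relevance of the results
cache = {}          # results of previous queries, keyed by input score

def classify(score, records):
    # Steps 2-3: reuse a pre-existing result if one is available.
    if score in cache:
        return cache[score]
    high, low = score + OFFSET, score - OFFSET   # equations (2) and (3)
    result = []
    for college, branch, cap, mean in records:   # step 4a: full scan
        if mean <= low:                          # step 4d
            result.append((college, branch, cap, "Safe"))
        elif score < mean <= high:               # step 4b
            result.append((college, branch, cap, "Ambitious"))
        elif mean < score:                       # step 4c: low < mean < score
            result.append((college, branch, cap, "Best Bargain"))
    cache[score] = result                        # step 5: store for future users
    return result

# Running the Table 2 means through the sketch for an input of 140.
records = [(1, 1, 1, 139), (1, 1, 2, 144), (1, 1, 3, 128),
           (2, 4, 1, 141), (2, 4, 2, 138), (2, 4, 3, 142)]
for row in classify(140, records):
    print(row)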
7. TEST RESULTS
Sample data collected from the DTE website is used to test the working of this decision system. The DTE compiles the data every year, after the admission process is over, with the help of feedback from the colleges under it. This data is available on its website [12]. Table 1
shows the sample data. The user has to provide his/her CET
score. The system will provide the user its decision and will
update itself and be ready for future decision queries.
Table 1. Sample data of college cut-off scores for a particular branch and year

College ID  Branch ID  CAP Round  2009 Marks  2010 Marks  2011 Marks
1           1          1          144         149         126
1           1          2          142         146         0
1           1          3          148         136         101
2           4          1          150         147         128
2           4          2          147         149         120
2           4          3          142         0           0
Decision rules have been tested using data from the years 2009 to 2011. MySQL is used for processing the data.
Table 2. Mean scores of three years for a college for a particular CAP round

College ID  Branch ID  CAP Round  Mean
1           1          1          139
1           1          2          144
1           1          3          128
2           4          1          141
2           4          2          138
2           4          3          142
Table 3. Result shown to the user

College ID  Branch ID  CAP Round  Type
1           1          1          Best Bargain
1           1          3          Safe
2           4          1          Ambitious
2           4          2          Safe
College ID refers to a college and Branch ID refers to an engineering stream. Mumbai University offers specializations in Computer, Information Technology, Electronics, Mechanical and many more streams. CAP Rounds are the admission rounds undertaken by a college if seats are available under a particular stream. 2009 Marks, 2010 Marks and 2011 Marks are the cut-off marks of that year for a particular college, branch and CAP round; 0 marks indicate that no seats were available in a particular CAP round in that year. If scores are available for all three years, the mean is calculated over three years; if scores are available for only two years, the mean of those two years is calculated. The values of the mean calculated by the system are shown in Table 2.

Let the offset value be 2; if a higher offset value is used, the results may not be realistic. Let the user input be 140.
The system will follow the algorithm as follows.

1) The system checks whether results are already available for a score of 140. Assume they are not.
2) The system calculates High = 142 and Low = 138 using (2) and (3) respectively.
3) The system then searches the database and performs the following updates:
   a) Entries with mean greater than the input (140) but less than or equal to High (142) are listed as Ambitious.
   b) Entries with mean less than the input (140) but greater than or equal to Low (138) are listed as Best Bargain.
   c) Entries with mean less than or equal to Low (138) are listed as Safe.
4) The results of steps a, b and c are updated in the database for future use.
5) If another search on marks of 140 is performed, the system shows the values calculated in the previous steps. This saves resources by avoiding unnecessary recalculation.

Results provided to the user are shown in Table 3. These results are the decisions made by the system, which it considers optimal for a user with CET marks of 140.

A user can also specify the branch of engineering in
which he/she is interested in getting decisions about. This
way, more user-oriented decisions can be generated.
8. CONCLUSION
The papers referred to helped in forming a better idea of the technique to be applied in this proposed system, and of how the system must work with minimum faults. They also clarified which techniques carry which advantages and disadvantages when implemented.

The advantage of the system proposed in this paper is that it uses past inputs, which enables the user to get a more realistic result compared to a system generating results based purely on assigned thresholds. The cut-off marks keep changing year after year due to the varying difficulty levels of different years, so it is necessary for the system to be able to update itself. The system proposed in this paper keeps revising itself after each calculation so as to provide the user with the most accurate and up-to-date information regarding the colleges they are eligible for. As the system is put into use, more data will be collected each year; with more data, a stronger system can be made available to the students/users.
Choosing the right career path, coupled with the right
institution is extremely important for any student. With a large amount of data at hand, it is important that it be analyzed efficiently. Hence this work demonstrates how data
mining technologies can be used to help take wise career
decisions. This method can further be extended by using
various probability based prediction methods.
If the current system is kept in mind, results from the different entrance exams such as the state CETs, IIT-JEE, BITSAT and AIEEE can be taken into account, and a system can be developed to give proper results to students according to their scores in the different exams. However, different students may score differently in different tests, so the system must be robust enough to decide which score is the best among all and should be taken into consideration. Moreover, not all students appear for all the exams to gain admission to an engineering college. Keeping this in mind, options must be made available to students while selecting the scores to be entered for the different entrance examinations.
9. ACKNOWLEDGMENTS
I would like to thank Prof. Shiwani Gupta for encouraging us
to implement this project, Ushang Thakker for assisting me in
designing the logic and Lavanya Singh for helping me in
preparing this paper.
10. REFERENCES
[1] Elizabeth Murray, Using decision trees to understand student data, Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, 2005, unpublished.
[2] Sudheep Elayidom, Dr. Sumam Mary Idikkula, Joseph
Alexander, Anurag Ojha, Applying data mining
techniques for placement chance prediction, 2009
International Conference on Advances in Computing,
Control, and Telecommunication Technologies,
Trivandrum, Kerala, 2009, published, pp 669-671.
[3] V. S. Janakiraman, K. Sarukesi, Decision Support
Systems, Chapter 6, PHI Learning Pvt. Ltd., New Delhi,
2004, p 26.
[4] D. J. Hand, Heikki Mannila, Padhraic Smyth, Principles
of data mining, Cambridge, MIT Press, 2001, p 1.
[5] Paulraj Ponniah, Data Warehousing Fundamentals, John
Wiley & Sons, 2001, p 402-403.
[6] Daniel J. Power, Decision Support Systems: Concepts
and Resources for Managers, Greenwood Publishing
Group, 2002, p 6-13.
[7] Keen, P. G. W. and M. S. Scott-Morton. Decision
Support Systems: An Organizational Perspective.
Reading, MA: Addison-Wesley, 1978.
[8] Frada Burstein, C. W. Holsapple, Handbook on Decision
Support, vol. 1, Springer-Verlag, Berlin, 2008, p 127.
[9] Daniel J. Power, Decision Support Basics, Business
Expert Press, 2009, p 41.
[10] Gregory E. Kersten, Zbigniew Mikolajuk, Anthony G. O.
Yeh, Decision support systems for sustainable
development, Kluwer Academic Publishers, Norwell,
2000, p42.
[11] Jiawei Han, Micheline Kamber, Data Mining Concepts
and Techniques, Chapter 2, Morgan Kaufmann
Publishers, San Francisco, 2006, p. 51
[12] Directorate of Technical Education [Online]. Available:
http://www.dte.org.in/fe2011/StaticPages/Default.aspx
A Genetic Algorithm based Fuzzy C Mean Clustering
Model for Segmenting Microarray Images
Biju V G
Division of Electronics, School of Engineering
Cochin University of Science and Technology

Mythili P
Division of Electronics, School of Engineering
Cochin University of Science and Technology
ABSTRACT
A genetic algorithm based Fuzzy C Mean (GAFCM) technique for segmenting the spots of complementary DNA (c-DNA) microarray images to find gene expression is proposed in this paper. To evaluate the performance of the algorithm, simulated microarray slides whose actual mean values were known were generated and used for testing. K-means, Fuzzy C Means (FCM) and the proposed GAFCM algorithm were applied to the simulated images to separate the foreground (FG) spot signal information from the background (BG), and the results were compared. The strength of the algorithm was tested by evaluating the segmentation matching factor, coefficient of determination, concordance correlation and gene expression values. From the results it is observed that the segmentation ability of GAFCM is better than that of the FCM and K-means algorithms.
Keywords
K-means, FCM, GAFCM, Genetic Algorithm, Segmentation, Gene expression
1. INTRODUCTION
The c-DNA microarray is one of the most fundamental and powerful tools in biotechnology and has been utilized in
many biomedical applications such as cancer research,
infectious disease diagnosis and treatment, toxicology
research, pharmacology research, and agricultural
development. The enormous improvement of technology in
the last decade provides the ability to simultaneously identify
and quantify thousands of genes by their gene expression [1].
The spots on a microarray are segmented from the
background to compute the red to green intensity ratio to give
the gene expression. The three basic operations to compute
the spot intensities are gridding, segmentation and intensity
extraction. These operations are used to find the accurate
location of the spot, separate spot FG from BG and the
calculation of the mean red and green intensity ratio.
In the last decade, several software packages and algorithms
were developed for segmenting spots in microarray images.
Fixed circle segmentation was the first algorithm used in
ScanAlyze Software [2], where all spots were considered to
be circular with a predefined fixed radius. An adaptive circle
segmentation technique was employed in the GenePix
software [3], where the radius of each spot was not considered
constant but adapts to each spot separately. Dapple software
estimated the radius of the spot using the laplacian based edge
detection [4]. An adaptive shape segmentation technique was
used in the Spot software [5]. A histogram-based
segmentation method was used in the ImaGene software [6].
Later watershed [7] and the seeded region algorithms [8] were
employed. The disadvantage of the above-mentioned software packages and algorithms was that either the spots were assumed to be circular in shape or a priori knowledge of the precise position of the spot's center was a prerequisite [9].
Further segmentation algorithms based on the statistical
Mann–Whitney test were also used [10], which assess the statistically significant difference between the FG and BG. Lately, the K-means and FCM clustering algorithms have been used for spot segmentation [11][12].
The present work mainly focuses on the microarray spot
segmentation ability of the proposed GAFCM algorithm over
the FCM and K-means algorithms. Gridding is done by an automatic intensity-profile technique using both horizontal and vertical intensity profiles, and the spots are addressed on the basis of this gridding information. The K-means, FCM and GAFCM algorithms were developed in MATLAB [13]. For the evaluation and testing of the algorithms, both simulated and real microarray images were used. The performance of the algorithms was tested by evaluating the segmentation matching factor (SMF), coefficient of determination (r2), concordance correlation (Pc) and spot gene expression value.
2. METHODS
The aim of microarray image processing is to extract each
spotted DNA sequence as well as its background estimates
and quality measures. This can be achieved in three steps:
gridding, segmentation and information extraction as shown
in Figure 1. In the gridding process, the coordinates of each
spot are determined. In the segmentation process, the
pixels are segmented as BG or FG, and in the third step
the intensities are extracted and the gene expressions are
obtained. The results are useful for accurate microarray
analysis which involves data normalization, filtering and data
mining. Clustering is the most common technique that is used
for the segmentation of the microarray images. The idea of the
clustering application is to divide the pixels of the image into
several clusters (usually two clusters) and then to characterize
these clusters as FG or BG. The K-means segmentation
algorithm is based on the traditional K-means clustering
technique [14]. It employs a square-error criterion, which is
calculated for each of the two clusters. A brief idea of FCM
[15] is given in Section 3 and the proposed GAFCM is
described in detail in Section 4.
Figure 1 Block diagram of microarray image processing: DNA microarray image, gridding, automatic spot cropping based on gridding, segmentation of spot from background, red and green channel intensity extraction, and computation of gene expression.
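As an illustration of the two-cluster idea, the following Python sketch partitions the pixel intensities of a cropped spot into BG and FG with plain K-means and a square-error criterion; it is a minimal sketch on a 1-D intensity feature, not the MATLAB implementation used in this work.

import numpy as np

def kmeans_two_cluster(pixels, max_iter=100):
    """Cluster 1-D pixel intensities into two groups (BG and FG)."""
    x = np.asarray(pixels, dtype=float)
    centers = np.array([x.min(), x.max()])   # darkest/brightest initialization
    for _ in range(max_iter):
        # Assign each pixel to its nearest center.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([x[labels == k].mean() if np.any(labels == k)
                                else centers[k] for k in (0, 1)])
        if np.allclose(new_centers, centers):   # converged
            break
        centers = new_centers
    # Square-error criterion, evaluated for each of the two clusters.
    sse = sum(np.sum((x[labels == k] - centers[k]) ** 2) for k in (0, 1))
    return labels, centers, sse   # label 1 = brighter (FG) cluster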
3. FUZZY C MEAN (FCM) ALGORITHM
Let x = {x_i}, i = 1, ..., N be the pixels of a single microarray spot, where N is the number of pixels in the spot. These pixels have to be clustered into two classes, BG and FG. Let c_j, j = 1, 2 be the cluster centers of the FG and BG pixels respectively. Each pixel has a membership degree u_ij for each cluster and is assigned to a particular cluster based on the value of the membership degree function. The algorithm iteratively improves the membership degree function until there is no change in the cluster centers.
The sum of the membership values of a pixel over all clusters should satisfy Equation 1:

\sum_{j=1}^{2} u_{ij} = 1, \quad i = 1, \ldots, N        (1)
The Euclidean distance from a pixel to a cluster center is given by

d_{ij} = \| x_i - c_j \|        (2)
The aim of this method is to minimize the absolute value of the difference between two consecutive objective functions F_t and F_{t+1}, given by Equations 3 and 4:

F_t = \sum_{i=1}^{N} \sum_{j=1}^{2} u_{ij}^{m} d_{ij}^{2}        (3)

\varepsilon = | F_{t+1} - F_t |        (4)
where m is the fuzziness parameter and ε is the error which has to be minimized. In each iteration, the updated membership u_ij and the cluster centers c_j are given by Equations 5 and 6:

u_{ij} = \left[ \sum_{k=1}^{2} \left( d_{ij} / d_{ik} \right)^{2/(m-1)} \right]^{-1}        (5)

c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} x_i}{\sum_{i=1}^{N} u_{ij}^{m}}        (6)
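A compact Python sketch of the update loop defined by Equations 1-6, for 1-D spot intensities and two clusters, is given below; the fuzziness parameter m = 2 and the min/max initialization are assumptions made for illustration.

import numpy as np

def fcm_two_cluster(pixels, m=2.0, eps=1e-5, max_iter=100):
    """Fuzzy C-means on 1-D pixel intensities with two clusters (BG, FG)."""
    x = np.asarray(pixels, dtype=float)
    c = np.array([x.min(), x.max()])              # initial cluster centers
    f_prev = np.inf
    for _ in range(max_iter):
        d = np.abs(x[:, None] - c[None, :]) + 1e-12     # distances, Eq. (2)
        # Membership update, Eq. (5); each row sums to 1 as in Eq. (1).
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)),
                         axis=2)
        c = (u ** m).T @ x / np.sum(u ** m, axis=0)     # center update, Eq. (6)
        f = np.sum((u ** m) * d ** 2)                   # objective, Eq. (3)
        if abs(f_prev - f) < eps:                       # stopping rule, Eq. (4)
            break
        f_prev = f
    labels = np.argmax(u, axis=1)    # defuzzify: assign by largest membership
    return u, c, labels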
4. GENETIC ALGORITHM BASED FCM OPTIMIZATION (GAFCM)
GA is a powerful, stochastic non-linear optimization tool
based on the principles of natural selection and evolution
[16][17][18][19][20]. To find the optimum fuzzy partitions of
a microarray spot signal, a new GA based fuzzy c mean
clustering method has been proposed. Clustering using
GAFCM can be achieved using the following steps, where each chromosome in the population of the GA encodes a possible partition of the image and the goodness of the chromosome is computed using a fitness function. The technique is described as follows.
A. Population initialization
The chromosomes are made up of real numbers representing the BG and FG pixel intensity centers of a microarray spot. These values are randomly initialized over all possible intensity values in the search space under evaluation.
B. Fitness computation
The fitness of a chromosome is calculated in two steps. In the first step, the membership values u_ij of the image data points to the different clusters are computed with the FCM algorithm explained in Section 3. In the second step, the fitness value itself is computed and used as the measure of the goodness of the chromosome.
Saha et al. have given a fitness function for the segmentation of satellite images [21][22]. This has been further modified for finding the cluster centers of c-DNA microarray spots and is given in Equation 7.
Fit = \frac{D_c}{E \cdot E_c}        (7)

where

E_c = | F_{t+1} - F_t |        (8)

D_c = \max_{j \neq k} \| c_j - c_k \|        (9)

E = \sum_{j=1}^{2} \sum_{i=1}^{N} | u_{ji} - G_{ji} |        (10)
Ec is the same as Equation 4: the difference between two successive objective function values in FCM, which is to be minimized. Dc is the maximum Euclidean distance between two cluster centers among all centers. E is the error value computed against Gij, a 2×N reference matrix whose first row is the one-dimensional binary image corresponding to the simulated spot and whose second row is the complement of the first row. The objective is to maximize Fit so as to achieve proper clustering; to ensure this, the E and Ec values have to decrease and Dc has to increase.
C. Selection, Crossover and Mutation
The roulette wheel selection method is applied to the population, where each chromosome receives a selection probability proportional to its fitness value. Crossover and mutation are the two genetic operators used for the creation of new chromosomes. After repeating steps A, B and C for a fixed number of iterations, the best cluster centers are selected [23]. The flow chart for performing GAFCM is given in Figure 2.
Figure 2 Flow chart of the GAFCM algorithm: crop the spot sub-image based on gridding; initialize the center-encoded population matrix P(K) of size N×2; for each chromosome, update the uij matrix, calculate the cj matrix based on uij, and find Ec, E, Dc and Fit; apply selection, crossover and mutation and update P(K); repeat until the desired number of iterations is reached; then select the best uij and cj and cluster the spot pixels into BG and FG.
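The following Python sketch illustrates steps A-C of the GAFCM loop. The population size, the Gaussian mutation, the single-point crossover and the exact fitness combination (here Dc divided by the FCM objective, standing in for Equation 7, with the ground-truth error term E omitted since it applies only to simulated spots) are all assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)

def fitness(chrom, x, m=2.0):
    """Fitness of one chromosome (a BG/FG center pair): one FCM membership
    pass, then reward well-separated centers with a low objective value."""
    d = np.abs(x[:, None] - chrom[None, :]) + 1e-12
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    ec = np.sum((u ** m) * d ** 2)       # FCM objective for these centers
    dc = abs(chrom[1] - chrom[0])        # distance between the two centers
    return dc / (ec + 1e-12)             # maximize Dc, minimize the objective

def gafcm(x, pop_size=20, iters=50):
    x = np.asarray(x, dtype=float)
    # A. Population initialization: random BG/FG intensity-center pairs.
    pop = rng.uniform(x.min(), x.max(), size=(pop_size, 2))
    for _ in range(iters):
        # B. Fitness computation for every chromosome.
        fit = np.array([fitness(ch, x) for ch in pop])
        # C. Roulette-wheel selection proportional to fitness ...
        parents = pop[rng.choice(pop_size, size=pop_size, p=fit / fit.sum())]
        # ... single-point crossover between the two halves of the mating pool ...
        half = pop_size // 2
        children = parents.copy()
        children[:half, 1], children[half:, 1] = (parents[half:, 1].copy(),
                                                  parents[:half, 1].copy())
        # ... and a small Gaussian mutation, clipped to the intensity range.
        children += rng.normal(0.0, 1.0, size=children.shape)
        pop = np.clip(children, x.min(), x.max())
    return pop[np.argmax([fitness(ch, x) for ch in pop])]   # best center pair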
5. EVALUATION OF THE PROPOSED
METHOD
To quantify the effectiveness of the proposed approach,
simulated as well as real microarray images from the Stanford
Microarray Database (SMD) have been used. The spots were
gridded and segmented using K-means, FCM and GAFCM independently. Simulated microarray images were used for validation and comparison since their gene expressions are known. Spots were simulated with realistic characteristics to ensure that the result looks like a true c-DNA image consisting of more than 1000 spots. Hence a real c-DNA image was used as a template, and its binary version was produced by employing a threshold technique [24].
After converting it into a binary image, the spot area is
replaced by random values of mean intensities. In the
simulated microarray image the mean intensity value of each
spot was predefined, ranging between 0 and 255 for both the
R and G channels [24]. BG intensities were replaced by a
single intensity value.
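A minimal Python sketch of this simulation step is given below. It assumes circular spots on a blank grid rather than the thresholded real-image template used here; the sizes, the BG level and the intensity spread are illustrative.

import numpy as np

def simulate_spot(size=21, radius=6, bg=20.0, rng=np.random.default_rng(1)):
    """Simulate one channel of one spot: a single BG intensity, with the
    spot area filled around a predefined mean in [0, 255]."""
    yy, xx = np.mgrid[:size, :size]
    mask = (xx - size // 2) ** 2 + (yy - size // 2) ** 2 <= radius ** 2
    mean_fg = rng.uniform(0.0, 255.0)      # predefined spot mean intensity
    img = np.full((size, size), bg)        # single BG intensity value
    img[mask] = np.clip(rng.normal(mean_fg, 5.0, mask.sum()), 0.0, 255.0)
    return img, mask, mean_fg              # mask is the ground-truth FG area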
The accuracy of any segmentation technique can be evaluated using three parameters: the segmentation matching factor SMF, the coefficient of determination r2 and the concordance correlation Pc. The SMF [25][26][27] for every binary spot produced by the clustering algorithm is given by

SMF = 100 \times \frac{A_{seg} \cap A_{act}}{A_{seg} \cup A_{act}}        (11)

where Aseg is the area of the spot as determined by the proposed algorithm and Aact is the actual spot area. A perfect match is indicated by a 100% score; any score higher than 50% indicates reasonable segmentation, whereas a score less than 50% indicates poor segmentation. The coefficient of
determination r2 [24][28][29] indicates the strength of the linear association between the simulated and calculated spots, as well as the proportion of the variance of the calculated data:

r^2 = 1 - \frac{\sum_{i} (I_{act,i} - I_{seg,i})^2}{\sum_{i} (I_{act,i} - I_{mean})^2}        (12)

where Iseg and Iact are the mean intensity values of the calculated and simulated spots respectively, and Imean is the overall mean spot intensity value of the simulated image. The algorithm whose r2 value is closer to 1 has the better performance.
The concordance correlation Pc was calculated using the equation

P_c = \frac{2 S_{AB}}{S_A^2 + S_B^2 + (\bar{A} - \bar{B})^2}        (13)
where A and B are the two samples, \bar{A} and \bar{B} are their mean values, SA and SB are the standard deviations of the samples, and SAB is their covariance. The higher the Pc value, the better the performance of the algorithm. Further, the proposed algorithm's performance has
algorithm. Further the proposed algorithm’s performance has
been tested in the presence of noise. This was done by
corrupting the simulated spot with additive white Gaussian
noise whose signal-to-noise ratio (SNR) ranges from 1 to 19
dB [30].
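The three figures of merit, and the noise corruption, can be computed as in the Python sketch below; the SMF, r2 and Pc functions follow Equations 11-13 as printed above, and the AWGN power scaling is an assumption.

import numpy as np

def smf(seg_mask, act_mask):
    """Segmentation matching factor, Eq. (11): overlap over union, in %."""
    inter = np.logical_and(seg_mask, act_mask).sum()
    union = np.logical_or(seg_mask, act_mask).sum()
    return 100.0 * inter / union

def r2(i_seg, i_act):
    """Coefficient of determination, Eq. (12), over per-spot mean intensities."""
    i_seg, i_act = np.asarray(i_seg, float), np.asarray(i_act, float)
    return 1.0 - (np.sum((i_act - i_seg) ** 2)
                  / np.sum((i_act - i_act.mean()) ** 2))

def concordance(a, b):
    """Lin's concordance correlation, Eq. (13)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    s_ab = np.mean((a - a.mean()) * (b - b.mean()))     # covariance
    return 2.0 * s_ab / (a.var() + b.var() + (a.mean() - b.mean()) ** 2)

def add_awgn(img, snr_db, rng=np.random.default_rng(2)):
    """Corrupt an image with additive white Gaussian noise at a given SNR (dB)."""
    p_signal = np.mean(np.asarray(img, float) ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    return img + rng.normal(0.0, np.sqrt(p_noise), np.shape(img))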
6. RESULTS AND DISCUSSION
The segmentation ability of K-means, FCM and the proposed GAFCM algorithm is compared by computing the SMF, r2 and Pc values explained in Section 5. The K-means,
FCM and GAFCM algorithms were applied independently on
these images for the classification of the BG and FG pixels.
Several microarray images with different FG means were simulated and spots were randomly selected from these images. The SMF values for the three algorithms are shown in Figure 3, with the original spots, actual boundaries and the results obtained for the various methods. It is evident from the results that GAFCM shows an overall SMF of 98.56%, compared to FCM with 97.19% and K-means with 68.78%.
The average SMF, r2 and Pc values shown in Table 1 are obtained from the simulated microarray image shown in Figure 4, before corrupting it with noise.
Table 1 The SMF, r2 and Pc value for a simulated
microarray image before adding noise.
KM FCM GAFCM
SMF 82.304 98.3447 99.3357
r2 0.80188 0.968114 0.991427
Pc 0.77947 0.968089 0.991424
The segmentation ability of the proposed method in the
presence of noise has been studied. To do this, additive white Gaussian noise was gradually added to the simulated microarray images. The SMF, r2 and Pc values of the noisy
images were computed using K-means, FCM and GAFCM
algorithms. The SNR value is varied from 1 dB to 19 dB.
Figure 5 shows the graph of SMF vs SNR for the three
algorithms and Table 2 gives the corresponding numerical
value. It can be seen from the graph that the SMF is considerably higher for FCM and GAFCM than for K-means. Although the GAFCM and FCM curves are close, GAFCM segmentation is better than FCM for both low- and high-noise images. The result shows that the overall SMF
value varies from 97.050% to 70.551%, 96.807% to 69.645%
and 85.418% to 53.940% for GAFCM, FCM and K-means
respectively. This reveals that GAFCM has the best SMF value.
The Coefficient of determination (r2) for simulated
microarray images for K-means, FCM and GAFCM are
shown in Table 3. The graph between r2 and SNR in dB is
shown in Figure 6. The method that scores r2 value closer to 1
has better performance. The r2 value of GAFCM is closer to 1
compared to FCM and K-means for low-noise images. As the SNR varies from 1 to 19 dB, r2 varies from 0.1296 to 0.7501, from 0.1079 to 0.6935 and from 0.0036 to 0.2880 for GAFCM, FCM and K-means respectively.
The concordance correlation (Pc) values obtained for K-
means, FCM and GAFCM are shown in Table 4. Figure 7
shows the graph between Pc and SNR in dB. The higher the value of Pc, the better the segmentation for that algorithm. From Table 4 it can be seen that, as the SNR varies from 1 to 19 dB, the Pc value varies from 0.0960 to 0.7477, from 0.0796 to 0.6916 and from 0.0007 to 0.2878 for GAFCM, FCM and K-means respectively. This clearly
indicates that the proposed GAFCM has better segmentation
capability for the current application.
Figure 3 Comparison results for seven segmented spots
obtained from seven simulated images.
Figure 4 Simulated microarray image used to calculate the
gene expression.
Figure 5 SMF calculated for simulated image corrupted
with additive white Gaussian noise having different levels
of SNR (dB) using K-means, FCM, GAFCM algorithms.
Table 2 Comparison of the K-means, FCM and GAFCM algorithms based on segmentation matching factor (SMF) for simulated microarray images with different levels of additive white Gaussian noise, SNR (dB).
SNR(dB) KM FCM GAFCM
1 53.93972 69.64504 70.55050
3 58.52296 78.66445 79.11223
5 63.03961 84.53164 84.63773
7 67.87467 88.79217 89.11575
9 72.60327 92.44617 92.73175
11 77.90749 92.61146 93.02225
13 81.82369 94.17475 94.70089
15 84.01279 95.58631 96.18429
17 85.22194 96.1873 96.28328
19 85.41774 96.80675 97.05008
Table 3 Comparison of the K-means, FCM and GAFCM algorithms based on coefficient of determination (r2) for simulated microarray images with different levels of additive white Gaussian noise, SNR (dB).
SNR(dB) KM FCM GAFCM
1 0.003582 0.107935 0.129569
3 0.002433 0.070657 0.08278
5 0.009682 0.200522 0.217191
7 0.014513 0.380952 0.414809
9 0.034473 0.348032 0.382025
11 0.091063 0.310028 0.361558
13 0.211104 0.35561 0.454974
15 0.273211 0.613217 0.657108
17 0.301239 0.619506 0.728683
19 0.287993 0.693543 0.750119
Figure 6 r2 calculated for simulated image corrupted with
additive white Gaussian noise having different levels of
SNR (dB) using K-means, FCM, GAFCM algorithms.
Table 4 Comparison of the K-means, FCM and GAFCM algorithms based on concordance correlation (Pc) for simulated microarray images with different levels of additive white Gaussian noise, SNR (dB).
SNR(dB) KM FCM GAFCM
1 0.0007 0.0796 0.0960
3 0.0003 0.0447 0.0497
5 0.0028 0.1813 0.1977
7 0.0052 0.3601 0.3923
9 0.0190 0.3429 0.3778
11 0.0762 0.2910 0.3412
13 0.2058 0.3551 0.4546
15 0.2730 0.6120 0.6536
17 0.3012 0.6173 0.7257
19 0.2878 0.6916 0.7477
Figure 7 Pc calculated for simulated image corrupted with
additive white Gaussian noise having different levels of
SNR (dB) using K-means, FCM, GAFCM algorithms.
The aim of microarray image processing is to find the gene expression value, which is the logarithm of the ratio of the mean red and green channel intensities of a spot. The closeness of the computed gene expression value to the actual value shows the performance of the algorithm. To validate this, several microarray images were simulated and tested. Figure 4 shows one such simulated image, and the corresponding result is shown in Table 5. The better the segmentation technique, the closer the gene expression value will be to the actual value. Table 5 shows the gene expression values obtained for a simulated microarray image of 16 spots using the three segmentation methods, along with the actual gene expression values. It can be seen that the gene expression value measured by GAFCM is closest to the actual value, compared to FCM and K-means. This shows that the GAFCM algorithm has better scope in microarray image spot segmentation applications. A short sketch of this computation is given after Table 5.
Table 5 Comparison of gene expression values computed using the K-means, FCM and GAFCM algorithms.

Spot No   KM        FCM       GAFCM     Actual
1 -0.01147 -0.06477 -0.04779 -0.04779
2 0.04617 -0.12034 -0.12034 -0.12034
3 0.03171 -0.09431 -0.09431 -0.09431
4 0.16624 0.08583 0.085828 0.091598
5 -0.12983 -0.19036 -0.17852 -0.17852
6 -0.00411 -0.11734 -0.11734 -0.10333
7 -0.05711 -0.1459 -0.13697 -0.13276
8 0.12509 -0.00511 -0.00511 -0.00386
9 -0.02495 -0.07131 -0.07716 -0.07716
10 -0.04111 -0.09078 -0.09078 -0.09078
11 -0.05853 -0.15023 -0.15023 -0.15023
12 0.06195 0.0167 0.016696 0.016696
13 -0.02509 -0.10586 -0.09059 -0.09059
14 0.03494 -0.04701 -0.04701 -0.04922
15 -0.11408 -0.2259 -0.2259 -0.2259
16 0.0467 -0.07544 -0.0705 -0.02818
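The following is a minimal Python sketch of the gene-expression computation for one spot; the base-2 logarithm is an assumption (the text above says only "logarithm"), as is the convention of averaging over the segmented FG pixels.

import numpy as np

def gene_expression(red_fg, green_fg):
    """Gene expression of a spot: log2 ratio of the mean red FG intensity
    to the mean green FG intensity (log base assumed)."""
    return np.log2(np.mean(red_fg) / np.mean(green_fg))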
7. CONCLUSION
Segmentation is an important part of microarray image processing. Microarray spot segmentation for estimating gene expression using K-means, FCM and the proposed GAFCM has been carried out. It is seen that the proposed GAFCM algorithm is more efficient than FCM and K-means in terms of clustering the signal FG and BG pixels. Errors during segmentation lead to inaccurate calculation of gene expression values in the intensity extraction step. None of the above-mentioned algorithms performs well at high noise
levels. This can be rectified by using suitable filtering
techniques. As future work, noise removal has to be addressed to obtain a smoother image, and an improved clustering algorithm is to be developed so that low signal intensity spots can be segmented more effectively.
8. REFERENCES
[1] Y. H. Yang, M. J. Buckley, S. Dudoit, and T. P. Speed
(2002), “Comparison of methods for image analysis on
c- DNA microarray data,” J. Comput. Graphical Statist.,
vol. 11, pp. 108–136
[2] M. B. Eisen (1999). ScanAlyze [Online]. Available: http://rana.lbl.gov/EisenSoftware.htm
[3] GenePix 4000, A User’s Guide (1999), Axon Instruments,
Inc., Foster City, CA.
[4] J. Buhler, T. Ideker, and D. Haynor, “Dapple: improved
techniques for finding spots on DNA microarrays,”
Technical Report. UWTR 2000-08-05, UV CSE,
Seattle,Washington, USA.
[5] M. J. Buckley. (2000). The spot user’s guide.
CSIRO Mathematical and Information Science [Online].
Available:
http://www.cmis.csiro.au/IAP/Spot/spotmanual.html.
[6] ImaGene, ImaGene 6.1 User Manual (2006). [Online]. Available: http://www.biodiscovery.com/index/papps-webfiles-action.
[7] S. Beucher and F. Meyer (1993), “The morphological
approach to segmentation: The watershed
transformation,” Opt. Eng., vol. 34, pp. 433–481.
[8] R. Adams and L. Bischof (Jun. 1994), “Seeded region growing,” IEEE Trans. Pattern Anal. Mach. Intell., vol.
16, no. 6, pp. 641–647.
[9] D. Bozinov and J. Rahenfuhrer (2002), “Unsupervised
technique for robust target separation and analysis of
DNA microarray spots through adaptive pixel
clustering,” J. Bioinform., vol. 18, pp. 747–756.
[10] Y. Chen, E. R. Dougherty, and M. L. Bittner (1997), “Ratio-based decisions and the quantitative analysis of c-DNA microarray images,” J. Biomed. Opt., vol. 2, pp. 264–374.
[11] S. Wu and H. Yan (2003), “Microarray Image Processing Based on Clustering and Morphological
Analysis”, Proc. Of First Asia-Pasific Bioinformatics
Conference, Adelaide, Australia, pp. 111-118.
[12] Volkan Uslan and İhsan Ömür Bucak (2010). Microarray
methods. Mathematical and Computational
Applications, Vol. 15, No. 2, pp. 240-247, © Association
for Scientific Research
[13] The Math Works, Inc., Software, MATLABR (2010a).
Natick, MA.
[14] MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In 5th Berkeley Symposium on Mathematical Statistics and Probability, 1, 281-297. Berkeley: University of California Press.
[15] J. C. Bezdek (1981), Pattern Recognition with Fuzzy
Objective Function Algorithms, Plenum Press, New
York.
[16] D. E. Goldberg (1989), Genetic Algorithms in Search,
Optimization & Machine Learning, Boston: Addison-
Wesley, Reading, ch. 1.
[17] L.Davis (Ed.)(1991), Handbook of Genetic Algorithms,
Van Nostrand Reinhold, New York.
[18] Z. Michalewicz (1992), Genetic Algorithms + Data Structures = Evolution Programs, Springer, New York.
[19] J.L.R. Filho, P.C. Treleaven, C. Alippi (1994), Genetic
algorithm programming environments, IEEE Comput.
27, 28-43.
[20] U. Maulik and S. Bandyopadhyay (2000),
“Genetic algorithm based clustering technique,” Pattern
Recog., vol. 33, pp. 1455–1465.
[21] Saha, S. and Bandyopadhyay, S. (2007), Fuzzy symmetry based real-coded genetic clustering technique for automatic pixel classification in remote sensing imagery, Fundamenta Informaticae.
[22] S. Bandyopadhyay and S. Saha (2007), “GAPS: A clustering method using a new point symmetry based
distance measure,” Pattern Recog., vol. 40, pp. 3430–
3451.
[23] F. Herrera, M. Lozano, and J. L. Verdegay (Nov 1998),
“Tackling Real Coded Genetic Algorithms: Operators
and Tools for Behavioural Analysis,” Artificial
Intelligence Review, vol. 12, no. 4, pp. 265–319.
[24] O. Demirkaya, M. H. Asyali, and M.M. Shoukri (2005),
“Segmentation of c-DNA microarray spots using Markov
random field modeling,” Bioinformatics, vol. 21, no. 13,
pp. 2994–3000.
[25] D. Tran and M. Wagner (2002), “Fuzzy C-means
clustering-based speaker verification,” in Lecture Notes
in Computer Science: Advances in Soft Computing—
AFSS 2002, N. R. Pal and M. Sugeno, Eds. New York:
Springer-Verlag, pp. 318–324.
[26] D. Betal, N. Roberts, and G. H. Whitehouse (1997),
“Segmentation and numerical analysis of micro
calcifications on mammograms using mathematical
morphology,” Br. J. Radiol., vol. 70, no. 837, pp. 903–
917.
[27] E.I. Athanasiadis, D.A. Cavouras, P.P. Spyridonos,
D.Th.Glotsos, I.K. Kalatzis, G.C. Nikiforidis (July
2009), Complementary DNA microarray image
processing based on the Fuzzy Gaussian mixture
model, in: IEEE Transaction on Information
Technology in Biomedicine, vol. 13, issue 4.
[28] E.I. Athanasiadis, D.A. Cavouras, P.P. Spyridonos,
D.Th.Glotsos, I.K. Kalatzis, G.C. Nikiforidis (2011), A
Wavelet based markov random field segmentation model
in segmenting microarray experiments, in: Computer
methods and programs in biomedicine 104,307-315.
[29] A.Lehmussola, et al. (2006), Evaluating the performance
of microarray segmentation algorithms, Bioinformatics
22, 2910–2917.
[30] K. Blekas, N. Galatsanos, A. Likas, and I. E. Lagaris
(Jul. 2005), “Mixture model analysis of DNA
microarray images,” IEEE Trans. Med. Imag., vol. 24,
no. 7, pp. 901–907.