8/3/2019 Pattern Recognition by AMBUJ MISHRA
Study of Pattern Recognition using
Rough-Fuzzy and Neuro- Fuzzy approach
Abstract
The development of intelligent classifiers and efficient classification techniques is essential
these days for tackling complex real-world problems. The most common approach to developing
expressive and human-readable representations of knowledge is the use of if-then production rules,
but real-life problem domains usually lack generic and systematic expert rules for mapping feature
patterns onto their underlying classes. Fuzzy sets constitute the oldest and most widely reported
soft computing paradigm. They are well suited to modelling the different forms of uncertainty and
ambiguity often encountered in real life. Integration of fuzzy sets with other soft computing tools
has led to the generation of more powerful, intelligent and efficient systems.
This discussion examines the contribution of fuzzy set theory to improved pattern recognition
systems that use neuro-fuzzy and rough-fuzzy theories.
-Introduction
A recognition system should have sufficient provision for representing the uncertainties
involved at every stage, i.e., in defining image regions, their features and the relations among
them, and in their matching, so that it retains as much as possible of the information content of
the original input image for making a decision at the highest level. It therefore becomes natural,
convenient and appropriate to avoid committing to a specific hard decision by allowing the
segments, skeletons or contours to be fuzzy subsets of the image, the subsets being characterized
by the possibility of a pixel belonging to them. Fuzzy measures and methodologies, like fuzzy
entropy, fuzzy geometry, the fuzzy medial axis transform, the fuzzy Hough transform, and other
fuzzy processing algorithms, have been developed to manage this uncertainty.
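As an illustration, the fuzzy entropy of De Luca and Termini, one of the measures mentioned above, can be sketched as follows (the membership values fed to it here are hypothetical):

```python
import math

def fuzzy_entropy(memberships):
    """De Luca-Termini fuzzy entropy of a membership array.

    Returns 0.0 for a crisp set (all memberships 0 or 1) and 1.0
    when every membership equals 0.5 (maximum fuzziness)."""
    def shannon(mu):
        # Convention: 0 * ln(0) = 0, so crisp memberships contribute nothing.
        if mu in (0.0, 1.0):
            return 0.0
        return -(mu * math.log(mu) + (1 - mu) * math.log(1 - mu))
    n = len(memberships)
    return sum(shannon(mu) for mu in memberships) / (n * math.log(2))

print(fuzzy_entropy([0.0, 1.0, 1.0, 0.0]))  # crisp set -> 0.0
print(fuzzy_entropy([0.5, 0.5, 0.5, 0.5]))  # maximally fuzzy -> 1.0
```

Such a measure can score how ambiguous a fuzzy segmentation is before any hard decision is taken.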
Fuzzy sets were introduced in 1965 by Zadeh [1] as a new way of representing vagueness in
everyday life. This theory provides an approximate and yet effective means for describing the
characteristics of a system that is too complex or ill-defined to admit precise mathematical analysis.
The fuzzy approach is based on the premise that the key elements in human thinking are not just
numbers but can be approximated to labels of fuzzy sets, or, in other words, classes of objects in
which the transition from membership to non-membership is gradual rather than abrupt. Much of the logic
behind human reasoning is not the traditional two-valued or even multi-valued logic, but logic with
fuzzy truths, fuzzy connectives, and fuzzy rules of inference. This fuzzy logic plays a basic role in
various aspects of the human thought process. Fuzzy set theory is the oldest and most widely
reported component of present-day soft computing (or computational intelligence), which deals
with the design of flexible information processing systems.
These provide soft decisions by taking into account characteristics like tractability,
robustness and low cost, and bear a close resemblance to human decision making.
The significance of fuzzy set theory in the realm of pattern recognition [2-12] is adequately justified in:
- Representing linguistically phrased input features for processing.
- Providing an estimate (representation) of missing information in terms of membership values.
- Representing the multiclass membership of ambiguous patterns, and generating rules and inferences
in linguistic form.
- Extracting ill-defined image regions, primitives and properties, and describing the relations among
them as fuzzy subsets.
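The first and third points can be illustrated with a minimal sketch: a hypothetical real-valued feature is mapped onto the linguistic terms low, medium and high, yielding a multiclass membership vector. The triangular membership functions and their breakpoints below are assumptions for illustration, not values from the cited work:

```python
def triangular(a, b, c):
    """Return a triangular membership function rising from a, peaking at b,
    falling to zero at c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu

# Hypothetical fuzzification of a feature ranging over [0, 10]
# into the linguistic terms low, medium and high.
terms = {
    "low":    triangular(-5.0, 0.0, 5.0),
    "medium": triangular(0.0, 5.0, 10.0),
    "high":   triangular(5.0, 10.0, 15.0),
}

def fuzzify(x):
    """Membership of x in every linguistic term (multiclass membership)."""
    return {name: mu(x) for name, mu in terms.items()}

print(fuzzify(3.0))  # {'low': 0.4, 'medium': 0.6, 'high': 0.0}
```

The same value belongs partly to low and partly to medium, which is exactly the gradual transition described above.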
-Neural Network and Fuzzy Theory
A judicious integration of neural networks and fuzzy theory, commonly known as neuro-fuzzy
computing, is the most visible hybrid paradigm and has been adequately investigated. It
allows one to incorporate the generic advantages of artificial neural networks and fuzzy logic,
such as massive parallelism, robustness, learning, and the handling of uncertainty and imprecision,
into the system. Moreover, some application-specific merits can also be incorporated. For example,
in the case of pattern classification and rule generation, one can exploit the capability of neural
nets to generate highly nonlinear decision boundaries, and model the uncertainties in the input
description and output decision with the concept of fuzzy sets. Often the neuro-fuzzy model is found
to perform better than either a neural network or a fuzzy system considered individually. In effect
there is a symbiotic combination of different soft computing tools for flexible information
processing, in order to deal with real-life ambiguous situations and to achieve tractable, robust and low-cost solutions [13, 14].
Neuro-fuzzy hybridization is done broadly in two ways:
1- A neural network equipped with the capability of handling fuzzy information [termed a fuzzy-
neural network (FNN)], and
2- A fuzzy system augmented by neural networks to enhance some of its characteristics, like
flexibility, speed and adaptability [termed a neural-fuzzy system (NFS)] [15].
In an FNN, the input signals and/or connection weights and/or the outputs are fuzzy
subsets or membership values to some fuzzy sets [16]. Usually linguistic values such as low, medium
and high, or fuzzy numbers or intervals, are used to model these. Neural networks with fuzzy
neurons are also termed FNNs, as they are capable of processing fuzzy information.
An NFS, on the other hand, is designed to realize the process of fuzzy reasoning, where the
connection weights of the network correspond to the parameters of fuzzy reasoning. The NFS
architecture has distinct nodes for antecedent clauses, conjunction operators, and consequent
clauses.
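A minimal sketch of such an architecture, assuming a two-input, two-rule, zero-order Sugeno-style network; the membership functions, rule count and consequent constants are all illustrative choices, not the cited design:

```python
import math

def gaussian_mf(c, s):
    """Gaussian membership node; the centre c and width s play the role
    of tunable connection weights in an NFS."""
    return lambda x: math.exp(-((x - c) ** 2) / (2 * s ** 2))

# Hypothetical two-input, two-rule, zero-order Sugeno-style NFS.
# Layer 1: antecedent-clause nodes (one membership function per clause).
rules = [
    # (membership for input 0, membership for input 1, consequent constant)
    (gaussian_mf(0.0, 0.5), gaussian_mf(0.0, 0.5), 0.0),
    (gaussian_mf(1.0, 0.5), gaussian_mf(1.0, 0.5), 1.0),
]

def nfs_forward(x0, x1):
    # Layer 2: conjunction nodes combine clause memberships with min.
    strengths = [min(mf0(x0), mf1(x1)) for mf0, mf1, _ in rules]
    # Layer 3: consequent nodes - the output is the firing-strength-weighted
    # average of the consequent constants.
    return sum(w * c for w, (_, _, c) in zip(strengths, rules)) / sum(strengths)

print(nfs_forward(0.0, 0.0))  # close to 0 (first rule dominates)
print(nfs_forward(1.0, 1.0))  # close to 1 (second rule dominates)
```

Learning in such a network would adjust the centres, widths and consequent constants, which is what "connection weights correspond to the parameters of fuzzy reasoning" means in practice.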
-Rough Sets and Fuzzy Set Theory
An exhaustive fuzzy rule induction algorithm (RIA) [17] uses descriptive set representation
features [18]. The current RIA, provided with sets of continuous feature values, induces classification
rules to partition the feature patterns into underlying categories. It is chosen for properties such as
the graceful handling of missing, inaccurate or vague data, domain independence, and
incremental operation [19]. However, as with many RIAs, this algorithm exhibits high computational
complexity due to its generate-and-test nature. The effects of this become evident where patterns
of high dimensionality need to be processed.
In order to speed up the RIA, a pre-processing step is required. This is particularly important
for tasks where learned rule sets need regular updating to reflect changes in the description of
domain features. This step, herein implemented using rough set theory [20], reduces the
dimensionality of potentially very large feature sets without losing the information needed for rule
induction. It has an advantageous side-effect in that it removes redundancy from the historical data.
This also helps simplify the design and implementation of the actual pattern classifier itself, by
determining what features should be made available to the system. In addition, the reduced input
dimensionality increases the processing speed of the classifier, leading to better response times.
Most significant, however, is the fact that rough set feature reduction (RSFR) preserves the
semantics of the surviving features after removing any redundant ones. This is essential in satisfying
the requirement of user readability of the generated knowledge model, as well as ensuring the
understandability of the pattern classification process.
The FAPACS (Fuzzy Automatic Pattern Analysis and Classification System) algorithm [21, 22]
is able to discover fuzzy association rules in relational databases. It works by locating pairs of
attributes that satisfy an interestingness measure, defined in terms of an adjusted difference
between the observed and expected values of relations. This algorithm is capable of expressing
linguistically both the regularities and the exceptions discovered within the data.
A common disadvantage of these techniques is their sensitivity to high dimensionality. This
may be remedied using Principal Components Analysis (PCA), a well-known tool for data analysis and
transformation [23, 24]. However, although PCA is an efficient methodology, it irreversibly destroys
the underlying semantics of the dataset. Further reasoning about the data is almost always humanly
impossible, prohibiting the use of PCA as a dataset pre-processor for symbolic or descriptive fuzzy
modelling. By implication, only purely numerical (non-symbolic) datasets may be processed by PCA.
Most semantics-preserving dimensionality reduction (feature selection) approaches tend to
be domain specific, utilising well-known features of particular application domains. RSFR
offers an alternative approach that preserves the underlying semantics of the data while allowing
reasonable generality.
-Rough-Fuzzy Rule Induction
The present approach deals with patterns involving large feature sets by applying a
dimensionality reduction algorithm to the feature set to discover a smaller set of features that
conveys all the information with as little redundancy as possible. Patterns formed using the resulting
dimensionally reduced feature set are then extracted from the original pattern descriptions and fed
to a rule induction algorithm to generate a suitable rule set. A rough-fuzzy system following this
approach integrates the following modules:
Feature Reduction: Reads a set of feature patterns and outputs one of reduced dimensionality;
implemented with the RSFR algorithm.
Rule Induction: Reads a sequence of feature patterns and outputs a set of if-then rules connecting
features and their implied classes; implemented through the RIA.
Fuzzy Reasoner: A standard approximate reasoner that interprets the induced fuzzy rule set and uses
it to classify previously unseen feature patterns.
a- The fuzzy rule induction algorithm
The rule induction algorithm presented in Ref. [17] extracts linguistically expressed fuzzy
rules from real-valued examples. Although this RIA was proposed for use in conjunction with
neural network-based classifiers, it appears to be independent of the actual type of classifier used.
Provided with training data, the RIA induces approximate relationships between the characteristics
of the conditional attributes (pattern features) and their decision attributes (underlying classes). The
premise attributes of the induced rules are represented by fuzzy variables, facilitating the modelling
of the inherent uncertainty of the knowledge domain.
An additional input item needed by this algorithm is the fuzzification of the conditional
attributes, as feature patterns are in general assumed to be available as real numbers (though these
numbers may not be accurately measured). For presentational simplicity, the terms dataset and set
of historical patterns are hereafter used interchangeably, as are the terms historical patterns and
training data (or examples).
A decision region is a set comprising example row numbers for which a certain decision
attribute yi has a certain value c: D_{yi=c} = {x : yi^x = c, x ∈ {1, …, k}}, where yi^x is the value of the
decision attribute yi as given by the example with number x, and k is the total number of examples in
the dataset.
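A minimal sketch of this definition, using a hypothetical three-example dataset:

```python
# Hypothetical training set: each entry maps a decision attribute
# to its value for that example.
examples = [
    {"class": "setosa"},     # example 1
    {"class": "virginica"},  # example 2
    {"class": "setosa"},     # example 3
]

def decision_region(examples, attribute, value):
    """D_{attribute=value}: the set of example numbers (1-based)
    whose decision attribute takes the given value."""
    return {x for x, ex in enumerate(examples, start=1)
            if ex[attribute] == value}

print(decision_region(examples, "class", "setosa"))  # {1, 3}
```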
Next, the algorithm generates the set of all possible combinations of the fuzzy sets defined on
the underlying ranges of the n conditional attributes. Each of the n attributes has its own value range
divided up into a number of fuzzy sets; such a range is hereafter called a fuzzy region for short. For a
domain with n conditional attributes, each of which is represented by f_x fuzzy sets (1 ≤ x ≤ n), there
will be ∏_{i=1}^{n} f_i possible combinations, each referred to as a fuzzy set vector. Each vector represents
an emerging pattern of rule conditions that may lead to a fuzzy rule, provided that dataset examples
support it. Note that, in implementing this algorithm, it is infeasible and undesirable to carry out
this step directly. Alternative, computationally equivalent methods are employed to avoid the
overhead of generating and storing this potentially vast set of combinations.
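One such computationally equivalent method is lazy enumeration: a generator yields the fuzzy set vectors one at a time instead of materializing the whole ∏ f_i product. A sketch, with assumed fuzzy region labels standing in for the fuzzy sets:

```python
from itertools import product

# Hypothetical fuzzification: each conditional attribute is covered by
# a few named fuzzy sets; the labels stand in for membership functions.
fuzzy_regions = [
    ["low", "medium", "high"],   # attribute 1: f1 = 3 fuzzy sets
    ["cold", "warm"],            # attribute 2: f2 = 2 fuzzy sets
]

# itertools.product yields the f1 * f2 fuzzy set vectors one at a time,
# so the potentially vast combination set is never stored in memory.
vectors = product(*fuzzy_regions)
print(sum(1 for _ in vectors))  # 3 * 2 = 6 combinations
```

With many attributes the product grows exponentially, which is exactly why the combinations are streamed rather than stored.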
Based on this, it is possible to measure the evidence contributed by an example x towards
the establishment of a fuzzy rule denoted by a fuzzy set vector p = (μ1, μ2, …, μn) (as produced in the
previous step). The t-norm operator is used to this end: T_p(x) = min(μ1(x1), μ2(x2), …, μn(xn))
(fuzzy intersection, or t-norm), where x1, x2, …, xn are the conditional attribute values. Sets of these
values are formed for each decision yi = c: T_p^{yi=c} = {w : w = T_p(xj), j ∈ D_{yi=c}}.
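A sketch of this evidence computation, with two assumed piecewise-linear fuzzy sets standing in for the memberships in p and a hypothetical pair of examples from one decision region:

```python
# Assumed fuzzy sets for the vector p (illustrative, defined on [0, 1]).
def mu_low(v):  return max(0.0, 1.0 - v)
def mu_high(v): return max(0.0, min(1.0, v))

p = (mu_low, mu_high)  # one membership function per conditional attribute

def t_norm(p, x):
    """T_p(x) = min over attributes of the membership of x's value."""
    return min(mu(v) for mu, v in zip(p, x))

# Hypothetical attribute-value tuples for the examples in D_{yi=c}.
region_examples = [(0.2, 0.9), (0.4, 0.6)]
T = [t_norm(p, x) for x in region_examples]
print(T)  # [0.8, 0.6]
```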
In a typical fuzzy region, no more than two fuzzy sets overlap for each underlying real
element, even though the region is split into two or more fuzzy sets. This causes most of the memberships
to evaluate to zero, in turn forcing T_p(x) to evaluate to zero as well, due to the use of min().
For each decision yi = c, the maximum element of T_p^{yi=c} is then evaluated (fuzzy union, or s-
norm): S_p^{yi=c} = max{w : w ∈ T_p^{yi=c}}. This results in a multidimensional array of candidate rules
describing the mapping between conditional and decision attributes. There is one array for each pair
of decision attribute and decision attribute value (i.e. one for each decision region, as calculated
above).
Fuzzy rules may not be generated directly from these arrays, as there are candidate rules for
each possible decision yi = c and every possible combination of fuzzified conditional attributes. This
implies that the candidate rule set may comprise contradicting rules. A means of deciding which
candidate rule best describes a given fuzzy set vector is needed.
A constant ε, dubbed the uncertainty margin (or tolerance), is used to implement this control.
A fuzzy rule concludes that class yi has value c, given attribute values matching the fuzzy premises in
p, if and only if the corresponding candidate rule exceeds by at least ε the candidate rules
supporting any competing classification yi = c', where c' ∈ Yi (the set of possible values for
decision class yi) and c' ≠ c. If no candidate rule can be decided upon (i.e. all values lie within the ε
neighbourhood), the whole rule is undecided for the fuzzy set vector in question. That is,
yi = c if (∀ c' ∈ Yi \ {c}) S_p^{yi=c} ≥ S_p^{yi=c'} + ε
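This selection step can be sketched as follows, with hypothetical evidence values S_p^{yi=c} per candidate class and an assumed margin ε:

```python
def select_rule(candidates, epsilon):
    """Pick the class whose evidence exceeds every competitor's by at
    least epsilon; return None when the rule is undecided.

    candidates: hypothetical mapping class value -> S_p^{yi=c}."""
    best = max(candidates, key=candidates.get)
    if all(candidates[best] >= s + epsilon
           for c, s in candidates.items() if c != best):
        return best
    return None

print(select_rule({"A": 0.9, "B": 0.3}, epsilon=0.2))   # 'A'
print(select_rule({"A": 0.55, "B": 0.5}, epsilon=0.2))  # undecided -> None
```

A larger ε yields fewer but more confident rules; ε = 0 keeps every non-tied winner.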
Two objects x, y in U are indiscernible with respect to an attribute subset P if and only if
f(x, q) = f(y, q) for every q ∈ P. The indiscernibility relation for all P ⊆ A is written IND(P).
U/IND(P) is used to denote the partition of U given IND(P) and is calculated as:
U/IND(P) = ⊗{U/IND(q) : q ∈ P}
where
I ⊗ J = {X ∩ Y : X ∈ I, Y ∈ J, X ∩ Y ≠ ∅}.
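The partition U/IND(P) can be computed directly by grouping objects on their P-attribute values; a sketch over a hypothetical three-object information table:

```python
from collections import defaultdict

# Hypothetical information table: each object in U maps attribute -> value.
U = {
    1: {"a": 0, "b": 1},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 0},
}

def partition(U, P):
    """U/IND(P): group objects whose values agree on every q in P."""
    classes = defaultdict(set)
    for x, row in U.items():
        classes[tuple(row[q] for q in P)].add(x)
    return list(classes.values())

print(partition(U, ["a", "b"]))  # [{1, 2}, {3}]
```

Grouping on value tuples is equivalent to intersecting the single-attribute partitions with ⊗, but avoids building them separately.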
A rough set approximates a traditional set using a pair of sets, named the lower and upper
approximations of the set in question. The lower and upper approximations of a set Y ⊆ U (given an
equivalence relation IND(P)) are defined as:
P_Y = ∪{X : X ∈ U/IND(P), X ⊆ Y}
P-Y = ∪{X : X ∈ U/IND(P), X ∩ Y ≠ ∅}
Assuming P and Q are equivalence relations on U, the important concept of the positive region
POS_P(Q) is defined as:
POS_P(Q) = ∪_{X ∈ U/IND(Q)} P_X
The positive region contains all patterns in U that can be classified into the classes of U/IND(Q)
using the information in attribute set P. From this, the degree of dependency of a set Q of
classification variables, Q ⊆ B, on a set of features P, where P ⊆ A, is defined as:
γ_P(Q) = ||POS_P(Q)|| / ||U||
where ||S|| denotes the cardinality of set S.
The degree of dependency γ_P(Q) of a set P of conditional attributes (features) with respect to
a set Q of decision attributes (pattern classes) provides a measure of how important P is in
classifying the dataset examples into Q. If γ_P(Q) = 0, then the classification Q is independent of the
attributes in P, and so the attributes in P are of no use to this classification. If γ_P(Q) = 1, then Q is
completely dependent on P, and the attributes are indispensable. Values 0 < γ_P(Q) < 1 indicate
partial dependency.
Given the set R of reducts found, the minimal reducts form the subset R_min ⊆ R of smallest cardinality:
R_min = {X : X ∈ R, ∀Y ∈ R, ||X|| ≤ ||Y||}
[16]- S.K. Pal, A. Skowron (Eds.), Rough-Fuzzy Hybridization: A New Trend in Decision Making,
Springer, Heidelberg, 1999.
[17]- A. Lozowski, T.J. Cholewo, J.M. Zurada, Crisp rule extraction from perceptron network
classifiers, Proceedings of the International Conference on Neural Networks, Volume of Plenary, Panel and Special Sessions, 1996, pp. 94-99.
[18]- Q. Shen, J.G. Marin-Blazquez, A. Tuson, Tuning fuzzy membership functions with neighbourhood search techniques: a comparative study, Proceedings of the Third IEEE International Conference on Intelligent Engineering Systems, 1999, pp. 337-342.
[19]- A. Chouchoulas, Q. Shen, Rough set-aided rule induction for plant monitoring, Proceedings of the 1998 International Joint Conference on Information Science (JCIS'98), Vol. 2, 1998, pp. 316-319.
[20]- Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers, Dordrecht, 1991.
[21]- W.H. Au, K.C.C. Chan, An effective algorithm for discovering fuzzy rules in relational databases, Proceedings of the Seventh IEEE International Conference on Fuzzy Systems,
1998, pp. 1314-1319.
[22]- K. Chan, A.A. Wong, APACS: a system for automatic analysis and classification of conceptual
patterns, Comput. Intelligence 10 (1990) 119-131.
[23]- P. Devijver, J. Kittler, Pattern Recognition: A Statistical Approach, Prentice Hall, Englewood Cliffs, NJ, 1982.
[24]- B. Flury, H. Riedwyl, Multivariate Statistics: A Practical Approach, Prentice Hall, Englewood Cliffs, NJ, 1988.