8/3/2019 Pattern Recognition by AMBUJ MISHRA
Study of Pattern Recognition using
Rough-Fuzzy and Neuro- Fuzzy approach
Abstract
The development of intelligent classifiers and efficient classification techniques is essential
these days for tackling complex real-world problems. The most common approach to developing
expressive and human-readable representations of knowledge is the use of if-then production rules,
but real-life problem domains usually lack generic and systematic expert rules for mapping feature
patterns onto their underlying classes. Fuzzy sets constitute the oldest and most widely reported
soft computing paradigm. They are well suited to modelling the different forms of uncertainty and
ambiguity often encountered in real life. Integration of fuzzy sets with other soft computing tools
has led to the generation of more powerful, intelligent and efficient systems.
This discussion examines the contribution of fuzzy set theory to improved pattern recognition
systems that use neuro-fuzzy and rough-fuzzy theories.
-Introduction
A recognition system should have sufficient provision for representing the uncertainties
involved at every stage, i.e., in defining image regions, their features and the relations among
them, and in their matching, so that it retains as much as possible of the information content of
the original input image for making a decision at the highest level. It therefore becomes natural,
convenient and appropriate to avoid committing to a specific hard decision by allowing the
segments, skeletons or contours to be fuzzy subsets of the image, the subsets being characterized
by the possibility of a pixel belonging to them. Fuzzy measures and methodologies, like fuzzy
entropy, fuzzy geometry, the fuzzy medial axis transform, the fuzzy Hough transform, and other
fuzzy processing algorithms, have been developed to manage this uncertainty.
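As an illustration, the fuzzy entropy of De Luca and Termini, one of the measures mentioned above, can be sketched as follows (the membership values fed to it here are hypothetical):

```python
import math

def fuzzy_entropy(memberships):
    """De Luca-Termini fuzzy entropy of a membership array.

    Returns 0.0 for a crisp set (all memberships 0 or 1) and 1.0
    when every membership equals 0.5 (maximum fuzziness)."""
    def shannon(mu):
        # Convention: 0 * ln(0) = 0, so crisp memberships contribute nothing.
        if mu in (0.0, 1.0):
            return 0.0
        return -(mu * math.log(mu) + (1 - mu) * math.log(1 - mu))
    n = len(memberships)
    return sum(shannon(mu) for mu in memberships) / (n * math.log(2))

print(fuzzy_entropy([0.0, 1.0, 1.0, 0.0]))  # crisp set -> 0.0
print(fuzzy_entropy([0.5, 0.5, 0.5, 0.5]))  # maximally fuzzy -> 1.0
```

Such a measure can score how ambiguous a fuzzy segmentation is before any hard decision is taken.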
Fuzzy sets were introduced in 1965 by Zadeh [1] as a new way of representing vagueness in
everyday life. This theory provides an approximate and yet effective means for describing the
characteristics of a system that is too complex or ill-defined to admit precise mathematical analysis.
The fuzzy approach is based on the premise that the key elements in human thinking are not just
numbers but can be approximated to labels of fuzzy sets, or, in other words, classes of objects in
which the transition from membership to non-membership is gradual rather than abrupt. Much of the logic
behind human reasoning is not the traditional two-valued or even multi-valued logic, but logic with
fuzzy truths, fuzzy connectives, and fuzzy rules of inference. This fuzzy logic plays a basic role in
various aspects of the human thought process. Fuzzy set theory is the oldest and most widely
reported component of present-day soft computing (or computational intelligence), which deals
with the design of flexible information processing systems.
These provide soft decisions by taking into account characteristics like tractability,
robustness and low cost, and bear a close resemblance to human decision making.
The significance of fuzzy set theory in the realm of pattern recognition [2-12] is adequately justified in:
- Representing linguistically phrased input features for processing.
- Providing an estimate (representation) of missing information in terms of membership values.
- Representing the multiclass membership of ambiguous patterns, and generating rules and inferences
in linguistic form.
- Extracting ill-defined image regions, primitives and properties, and describing the relations among
them as fuzzy subsets.
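The first and third points can be illustrated with a minimal sketch: a hypothetical real-valued feature is mapped onto the linguistic terms low, medium and high, yielding a multiclass membership vector. The triangular membership functions and their breakpoints below are assumptions for illustration, not values from the cited work:

```python
def triangular(a, b, c):
    """Return a triangular membership function rising from a, peaking at b,
    falling to zero at c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu

# Hypothetical fuzzification of a feature ranging over [0, 10]
# into the linguistic terms low, medium and high.
terms = {
    "low":    triangular(-5.0, 0.0, 5.0),
    "medium": triangular(0.0, 5.0, 10.0),
    "high":   triangular(5.0, 10.0, 15.0),
}

def fuzzify(x):
    """Membership of x in every linguistic term (multiclass membership)."""
    return {name: mu(x) for name, mu in terms.items()}

print(fuzzify(3.0))  # {'low': 0.4, 'medium': 0.6, 'high': 0.0}
```

The same value belongs partly to low and partly to medium, which is exactly the gradual transition described above.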
-Neural Network and Fuzzy Theory
A judicious integration of neural networks and fuzzy theory, commonly known as neuro-fuzzy
computing, is the most visible hybrid paradigm and has been adequately investigated. It
allows one to incorporate the generic advantages of artificial neural networks and fuzzy logic,
such as massive parallelism, robustness, learning, and the handling of uncertainty and imprecision,
into the system. Moreover, some application-specific merits can also be incorporated. For example,
in the case of pattern classification and rule generation, one can exploit the capability of neural
nets to generate highly nonlinear decision boundaries, and model the uncertainties in the input
description and output decision with the concept of fuzzy sets. Often the neuro-fuzzy model is found
to perform better than either a neural network or a fuzzy system considered individually. In effect
there is a symbiotic combination of different soft computing tools for flexible information
processing, in order to deal with real-life ambiguous situations and to achieve tractable, robust and low-cost solutions [13, 14].
Neuro-fuzzy hybridization is done broadly in two ways:
1- A neural network equipped with the capability of handling fuzzy information [termed a fuzzy-
neural network (FNN)], and
2- A fuzzy system augmented by neural networks to enhance some of its characteristics, like
flexibility, speed and adaptability [termed a neural-fuzzy system (NFS)] [15].
In an FNN, the input signals and/or connection weights and/or the outputs are fuzzy
subsets or membership values to some fuzzy sets [16]. Usually linguistic values such as low, medium
and high, or fuzzy numbers or intervals, are used to model these. Neural networks with fuzzy
neurons are also termed FNNs, as they are capable of processing fuzzy information.
An NFS, on the other hand, is designed to realize the process of fuzzy reasoning, where the
connection weights of the network correspond to the parameters of fuzzy reasoning. The NFS
architecture has distinct nodes for antecedent clauses, conjunction operators, and consequent
clauses.
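A minimal sketch of such an architecture, assuming a two-input, two-rule, zero-order Sugeno-style network; the membership functions, rule count and consequent constants are all illustrative choices, not the cited design:

```python
import math

def gaussian_mf(c, s):
    """Gaussian membership node; the centre c and width s play the role
    of tunable connection weights in an NFS."""
    return lambda x: math.exp(-((x - c) ** 2) / (2 * s ** 2))

# Hypothetical two-input, two-rule, zero-order Sugeno-style NFS.
# Layer 1: antecedent-clause nodes (one membership function per clause).
rules = [
    # (membership for input 0, membership for input 1, consequent constant)
    (gaussian_mf(0.0, 0.5), gaussian_mf(0.0, 0.5), 0.0),
    (gaussian_mf(1.0, 0.5), gaussian_mf(1.0, 0.5), 1.0),
]

def nfs_forward(x0, x1):
    # Layer 2: conjunction nodes combine clause memberships with min.
    strengths = [min(mf0(x0), mf1(x1)) for mf0, mf1, _ in rules]
    # Layer 3: consequent nodes - the output is the firing-strength-weighted
    # average of the consequent constants.
    return sum(w * c for w, (_, _, c) in zip(strengths, rules)) / sum(strengths)

print(nfs_forward(0.0, 0.0))  # close to 0 (first rule dominates)
print(nfs_forward(1.0, 1.0))  # close to 1 (second rule dominates)
```

Learning in such a network would adjust the centres, widths and consequent constants, which is what "connection weights correspond to the parameters of fuzzy reasoning" means in practice.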
-Rough Sets and Fuzzy Set Theory
An exhaustive fuzzy rule induction algorithm (RIA) [17] uses descriptive set representation
features [18]. The current RIA, provided with sets of continuous feature values, induces classification
rules to partition the feature patterns into underlying categories. It is chosen for properties such as
the graceful handling of missing, inaccurate or vague data, domain independence, and
incremental operation [19]. However, as with many RIAs, this algorithm exhibits high computational
complexity due to its generate-and-test nature. The effects of this become evident where patterns
of high dimensionality need to be processed.
In order to speed up the RIA, a pre-processing step is required. This is particularly important
for tasks where learned rule sets need regular updating to reflect changes in the description of
domain features. This step, herein implemented using rough set theory [20], reduces the
dimensionality of potentially very large feature sets without losing the information needed for rule
induction. It has an advantageous side-effect in that it removes redundancy from the historical data.
This also helps simplify the design and implementation of the actual pattern classifier itself, by
determining what features should be made available to the system. In addition, the reduced input
dimensionality increases the processing speed of the classifier, leading to better response times.
Most significant, however, is the fact that rough set feature reduction (RSFR) preserves the
semantics of the surviving features after removing any redundant ones. This is essential in satisfying
the requirement of user readability of the generated knowledge model, as well as ensuring the
understandability of the pattern classification process.
The FAPACS (Fuzzy Automatic Pattern Analysis and Classification System) algorithm [21, 22]
is able to discover fuzzy association rules in relational databases. It works by locating pairs of
attributes that satisfy an interestingness measure, defined in terms of an adjusted difference
between the observed and expected values of relations. This algorithm is capable of expressing
linguistically both the regularities and the exceptions discovered within the data.
A common disadvantage of these techniques is their sensitivity to high dimensionality. This
may be remedied using Principal Components Analysis (PCA), a well-known tool for data analysis and
transformation [23, 24]. However, although PCA is an efficient methodology, it irreversibly destroys
the underlying semantics of the dataset. Further reasoning about the data is almost always humanly
impossible, prohibiting the use of PCA as a dataset pre-processor for symbolic or descriptive fuzzy
modelling. By implication, only purely numerical (non-symbolic) datasets may be processed by PCA.
Most semantics-preserving dimensionality reduction (feature selection) approaches tend to
be domain specific, utilising well-known features of particular application domains. RSFR
offers an alternative approach that preserves the underlying semantics of the data while allowing
reasonable generality.
-Rough-Fuzzy Rule Induction
The present approach deals with patterns involving large feature sets by applying a
dimensionality reduction algorithm to the feature set to discover a smaller set of features that
conveys all the information with as little redundancy as possible. Patterns formed using the resulting
dimensionally reduced feature set are then extracted from the original pattern descriptions and fed
to a rule induction algorithm to generate a suitable rule set. A rough-fuzzy system following this
approach integrates the following modules:
Feature Reduction: Reads a set of feature patterns and outputs one of reduced dimensionality;
implemented with the RSFR algorithm.
Rule Induction: Reads a sequence of feature patterns and outputs a set of if-then rules connecting
features and their implied classes; implemented through the RIA.
Fuzzy Reasoner: A standard approximate reasoner that interprets the induced fuzzy rule set and uses
it to classify previously unseen feature patterns.
a- The fuzzy rule induction algorithm
The rule induction algorithm presented in Ref. [17] extracts linguistically expressed fuzzy
rules from real-valued examples. Although this RIA was proposed for use in conjunction with
neural network-based classifiers, it appears to be independent of the actual type of classifier used.
Provided with training data, the RIA induces approximate relationships between the characteristics
of the conditional attributes (pattern features) and their decision attributes (underlying classes). The
premise attributes of the induced rules are represented by fuzzy variables, facilitating the modelling
of the inherent uncertainty of the knowledge domain.
An additional input item needed by this algorithm is the fuzzification of the conditional
attributes, as feature patterns are in general assumed to be available as real numbers (though these
numbers may not be accurately measured). For presentational simplicity, the terms dataset and set
of historical patterns are hereafter used interchangeably, as are the terms historical patterns and
training data (or examples).
A decision region is a set comprising example row numbers for which a certain decision
attribute yi has a certain value c: D_{yi=c} = {x : yi^x = c, x ∈ {1, …, k}}, where yi^x is the value of the
decision attribute yi as given by the example with number x, and k is the total number of examples in
the dataset.
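A minimal sketch of this definition, using a hypothetical three-example dataset:

```python
# Hypothetical training set: each entry maps a decision attribute
# to its value for that example.
examples = [
    {"class": "setosa"},     # example 1
    {"class": "virginica"},  # example 2
    {"class": "setosa"},     # example 3
]

def decision_region(examples, attribute, value):
    """D_{attribute=value}: the set of example numbers (1-based)
    whose decision attribute takes the given value."""
    return {x for x, ex in enumerate(examples, start=1)
            if ex[attribute] == value}

print(decision_region(examples, "class", "setosa"))  # {1, 3}
```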
Next, the algorithm generates the set of all possible combinations of the fuzzy sets defined on
the underlying ranges of the n conditional attributes. Each of the n attributes has its own value range
divided up into a number of fuzzy sets; such a range is hereafter called a fuzzy region for short. For a
domain with n conditional attributes, each of which is represented by f_x fuzzy sets (1 ≤ x ≤ n), there
will be ∏_{i=1}^{n} f_i possible combinations, each referred to as a fuzzy set vector. Each vector represents
an emerging pattern of rule conditions that may lead to a fuzzy rule, provided that dataset examples
support it. Note that, in implementing this algorithm, it is infeasible and undesirable to carry out
this step directly. Alternative, computationally equivalent methods are employed to avoid the
overhead of generating and storing this potentially vast set of combinations.
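One such computationally equivalent method is lazy enumeration: a generator yields the fuzzy set vectors one at a time instead of materializing the whole ∏ f_i product. A sketch, with assumed fuzzy region labels standing in for the fuzzy sets:

```python
from itertools import product

# Hypothetical fuzzification: each conditional attribute is covered by
# a few named fuzzy sets; the labels stand in for membership functions.
fuzzy_regions = [
    ["low", "medium", "high"],   # attribute 1: f1 = 3 fuzzy sets
    ["cold", "warm"],            # attribute 2: f2 = 2 fuzzy sets
]

# itertools.product yields the f1 * f2 fuzzy set vectors one at a time,
# so the potentially vast combination set is never stored in memory.
vectors = product(*fuzzy_regions)
print(sum(1 for _ in vectors))  # 3 * 2 = 6 combinations
```

With many attributes the product grows exponentially, which is exactly why the combinations are streamed rather than stored.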
Based on this, it is possible to measure the evidence contributed by an example x towards
the establishment of a fuzzy rule denoted by a fuzzy set vector p = (μ1, μ2, …, μn) (as produced in the
previous step). The t-norm operator is used to this end: T_p(x) = min(μ1(x1), μ2(x2), …, μn(xn))
(fuzzy intersection, or t-norm), where x1, x2, …, xn are the conditional attribute values. Sets of these
values are formed for each decision yi = c: T_p^{yi=c} = {w : w = T_p(xj), j ∈ D_{yi=c}}.
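A sketch of this evidence computation, with two assumed piecewise-linear fuzzy sets standing in for the memberships in p and a hypothetical pair of examples from one decision region:

```python
# Assumed fuzzy sets for the vector p (illustrative, defined on [0, 1]).
def mu_low(v):  return max(0.0, 1.0 - v)
def mu_high(v): return max(0.0, min(1.0, v))

p = (mu_low, mu_high)  # one membership function per conditional attribute

def t_norm(p, x):
    """T_p(x) = min over attributes of the membership of x's value."""
    return min(mu(v) for mu, v in zip(p, x))

# Hypothetical attribute-value tuples for the examples in D_{yi=c}.
region_examples = [(0.2, 0.9), (0.4, 0.6)]
T = [t_norm(p, x) for x in region_examples]
print(T)  # [0.8, 0.6]
```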
In a typical fuzzy region, no more than two fuzzy sets overlap for each underlying real
element, even though the region is split into two or more fuzzy sets. This causes most of the memberships
to evaluate to zero, in turn forcing T_p(x) to evaluate to zero as well, due to the use of min().
For each decision yi = c, the maximum element of T_p^{yi=c} is then evaluated (fuzzy union, or s-
norm): S_p^{yi=c} = max{w : w ∈ T_p^{yi=c}}. This results in a multidimensional array of candidate rules
describing the mapping between conditional and decision attributes. There is one array for each pair
of decision attribute and decision attribute value (i.e. one for each decision region, as calculated
above).
Fuzzy rules may not be generated directly from these arrays, as there are candidate rules for
each possible decision yi = c and every possible combination of fuzzified conditional attributes. This
implies that the candidate rule set may comprise contradicting rules. A means of deciding which
candidate rule best describes a given fuzzy set vector is needed.
A constant ε, dubbed the uncertainty margin (or tolerance), is used to implement this control.
A fuzzy rule concludes that class yi has value c, given attribute values matching the fuzzy premises in
p, if and only if the corresponding candidate rule exceeds by at least ε the candidate rules
supporting any competing classification yi = c', where c' ∈ Yi (the set of possible values for
decision class yi) and c' ≠ c. If no candidate rule can be decided upon (i.e. all values lie within the ε
neighbourhood), the whole rule is undecided for the fuzzy set vector in question. That is,
yi = c if (∀ c' ∈ Yi \ {c}) S_p^{yi=c} ≥ S_p^{yi=c'} + ε
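This selection step can be sketched as follows, with hypothetical evidence values S_p^{yi=c} per candidate class and an assumed margin ε:

```python
def select_rule(candidates, epsilon):
    """Pick the class whose evidence exceeds every competitor's by at
    least epsilon; return None when the rule is undecided.

    candidates: hypothetical mapping class value -> S_p^{yi=c}."""
    best = max(candidates, key=candidates.get)
    if all(candidates[best] >= s + epsilon
           for c, s in candidates.items() if c != best):
        return best
    return None

print(select_rule({"A": 0.9, "B": 0.3}, epsilon=0.2))   # 'A'
print(select_rule({"A": 0.55, "B": 0.5}, epsilon=0.2))  # undecided -> None
```

A larger ε yields fewer but more confident rules; ε = 0 keeps every non-tied winner.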
Two objects x, y in U are indiscernible with respect to an attribute subset P if and only if
f(x, q) = f(y, q) for every q ∈ P. The indiscernibility relation for all P ⊆ A is written IND(P).
U/IND(P) is used to denote the partition of U given IND(P) and is calculated as:
U/IND(P) = ⊗{U/IND(q) : q ∈ P}
where
I ⊗ J = {X ∩ Y : X ∈ I, Y ∈ J, X ∩ Y ≠ ∅}.
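The partition U/IND(P) can be computed directly by grouping objects on their P-attribute values; a sketch over a hypothetical three-object information table:

```python
from collections import defaultdict

# Hypothetical information table: each object in U maps attribute -> value.
U = {
    1: {"a": 0, "b": 1},
    2: {"a": 0, "b": 1},
    3: {"a": 1, "b": 0},
}

def partition(U, P):
    """U/IND(P): group objects whose values agree on every q in P."""
    classes = defaultdict(set)
    for x, row in U.items():
        classes[tuple(row[q] for q in P)].add(x)
    return list(classes.values())

print(partition(U, ["a", "b"]))  # [{1, 2}, {3}]
```

Grouping on value tuples is equivalent to intersecting the single-attribute partitions with ⊗, but avoids building them separately.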
A rough set approximates a traditional set using a pair of sets, named the lower and upper
approximations of the set in question. The lower and upper approximations of a set Y ⊆ U (given an
equivalence relation IND(P)) are defined as:
P_Y = ∪{X : X ∈ U/IND(P), X ⊆ Y}
P-Y = ∪{X : X ∈ U/IND(P), X ∩ Y ≠ ∅}
Assuming P and Q are equivalence relations on U, the important concept of the positive region
POS_P(Q) is defined as:
POS_P(Q) = ∪_{X ∈ U/IND(Q)} P_X
The positive region contains all patterns in U that can be classified into the classes of U/IND(Q)
using the information in attribute set P. From this, the degree of dependency of a set Q of
classification variables, Q ⊆ B, on a set of features P, where P ⊆ A, is defined as:
γ_P(Q) = ||POS_P(Q)|| / ||U||
where ||S|| denotes the cardinality of set S.
The degree of dependency γ_P(Q) of a set P of conditional attributes (features) with respect to
a set Q of decision attributes (pattern classes) provides a measure of how important P is in
classifying the dataset examples into Q. If γ_P(Q) = 0, then the classification Q is independent of the
attributes in P, and so the attributes in P are of no use to this classification. If γ_P(Q) = 1, then Q is
completely dependent on P, and the attributes are indispensable. Values 0 < γ_P(Q) < 1 indicate
partial dependency.
Given the set R of reducts found, the minimal reducts form the subset R_min ⊆ R of smallest cardinality:
R_min = {X : X ∈ R, ∀Y ∈ R, ||X|| ≤ ||Y||}
[16]- S.K. Pal, A. Skowron (Eds.), Rough-Fuzzy Hybridization: A New Trend in Decision Making,
Springer, Heidelberg, 1999.
[17]- A. Lozowski, T.J. Cholewo, J.M. Zurada, Crisp rule extraction from perceptron network
classifiers, Proceedings of the International Conference on Neural Networks, Volume of Plenary, Panel and Special Sessions, 1996, pp. 94-99.
[18]- Q. Shen, J.G. Marin-Blazquez, A. Tuson, Tuning fuzzy membership functions with neighbourhood search techniques: a comparative study, Proceedings of the Third IEEE International Conference on Intelligent Engineering Systems, 1999, pp. 337-342.
[19]- A. Chouchoulas, Q. Shen, Rough set-aided rule induction for plant monitoring, Proceedings of the 1998 International Joint Conference on Information Science (JCIS'98), Vol. 2, 1998, pp. 316-319.
[20]- Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer Academic Publishers, Dordrecht, 1991.
[21]- W.H. Au, K.C.C. Chan, An effective algorithm for discovering fuzzy rules in relational databases, Proceedings of the Seventh IEEE International Conference on Fuzzy Systems,
1998, pp. 1314-1319.
[22]- K. Chan, A.A. Wong, APACS: a system for automatic analysis and classification of conceptual
patterns, Comput. Intelligence 10 (1990) 119-131.
[23]- P. Devijver, J. Kittler, Pattern Recognition: A Statistical Approach, Prentice Hall, Englewood Cliffs, NJ, 1982.
[24]- B. Flury, H. Riedwyl, Multivariate Statistics: A Practical Approach, Prentice Hall, Englewood Cliffs, NJ, 1988.