Pattern Recognition by AMBUJ MISHRA

download Pattern Recognition by AMBUJ MISHRA

of 8

Transcript of Pattern Recognition by AMBUJ MISHRA

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    1/8

    Study of Pattern Recognition using

    Rough-Fuzzy and Neuro- Fuzzy approach

    Abstract

    Development of intelligent classifier and efficient classification techniques is inevitable these

    days to meet real world complex problems. The most common approach to developing expressive

    and human readable representations of knowledge is the use of if-then production rules but real life

    problem domain usually lack generic and systematic expert rules for mapping feature patterns onto

    their underlying classes. Fuzzy sets constitute the oldest and most reported soft computing

    paradigm. They are well suited to modelling different forms of uncertainties and ambiguities, often

    encountered in real life. Integration of fuzzy sets with other soft computing tools has led to the

    generation of more powerful, intelligent and efficient systems.

    In this discussion we are seeking for the contribution of fuzzy set theory in the improved pattern

    recognition systems which use neuro-fuzzy and rough-fuzzy theories.

    -Introduction

    A recognition system should have sufficient provision for representing the uncertainties

    involved at every stage, i.e., in defining image regions, its features and relations among them, and in

    their matching, so that it retains as much as possible the information content of the original input

    image for making a decision at the highest level. It, therefore, becomes natural, convenient and

    appropriate to avoid committing ourselves to a specific hard decision by allowing the segments or

    skeletons or contours to be fuzzy subsets of the image; the subsets being characterized by the

    possibility of a pixel belonging to them. Fuzzy measures and methodologies, like fuzzy entropy, fuzzy

    geometry, and fuzzy medial axis transform, fuzzy Hough transform, and other fuzzy processing

    algorithms, have been developed to manage the uncertainty.

    Fuzzy sets were introduced in 1965 by Zadeh [1] as a new way of representing vagueness in

    everyday life. This theory provides an approximate and yet effective means for describing the

    characteristics of a system that is too complex or ill-defined to admit precise mathematical analysis.

    Fuzzy approach is based on the premise that key elements in human thinking are not just numbers

    but can be approximated to tables of fuzzy sets, or, in other words, classes of objects in which the

    transition from membership to non-membership is gradual rather than abrupt. Much of the logic

    behind human reasoning is not the traditional two-valued or even multi-valued logic, but logic with

    fuzzy truths, fuzzy connectives, and fuzzy rules of inference. This fuzzy logic plays a basic role in

    various aspects of the human thought process. Fuzzy set theory is the oldest and most widely

    reported component of present-day soft computing (or computational intelligence), which deals

    with the design of flexible information processing systems.

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    2/8

    These provide soft decision by taking into account characteristics like tractability,

    robustness, low cost, etc., and have close resemblance to human decision making.

    The significance of fuzzy set theory in the realm of pattern recognition [2-12] is adequately justified

    in

    - Representing linguistically phrased input features for processing.

    - Providing an estimate (representation) of missing information in terms of membership values.- Representing multiclass membership of ambiguous patterns and in generating rules and inferences

    in linguistic form.

    - Extracting ill-defined image regions, primitives, and properties and describing relations among

    them as fuzzy subsets.

    -Neural network and fuzzy Theory

    A judicious integration of neural network and fuzzy theory, commonly known as neuro-fuzzy

    computing , is a hybrid paradigm which is most visible and has been adequately investigated. This

    allows one to incorporate the generic advantages of artificial neural networks and fuzzy logic-like

    massive parallelism, robustness, learning, and handling of uncertainty and impreciseness, into the

    system. Moreover, some application specific merits can also be incorporated. For example, in the

    case of pattern classification and rule generation, one can exploit the capability of neural nets in

    generating highly nonlinear decision boundaries, and model the uncertainties in the input

    description and output decision by the concept of fuzzy sets. Often the neuro-fuzzy model is found

    to perform better than either a neural network or a fuzzy system considered individually. In effect

    there is a symbiotic combination of different soft computing tools for flexible information

    processing, in order to deal with real life ambiguous situations and to achieve tractability,robustness, and low-cost solutions [13, 14].

    Neuro-fuzzy hybridization is done broadly in two ways:

    1- A neural network equipped with the capability of handling fuzzy information [termed fuzzy-

    neural network (FNN)], and

    2- A fuzzy system augmented by neural networks to enhance some of its characteristics likeflexibility, speed, and adaptability [termedneural-fuzzy system (NFS)] [15].

    In an FNN either the input signals and/or connection weights and/or the outputs are fuzzy

    subsets or membership values to some fuzzy sets [16]. Usually linguistic values suchas low, medium,

    and high, or fuzzy numbers or intervals are used to model these. Neural networks with fuzzyneurons are also termed FNN as they are capable of processing fuzzy information.

    A NFS, on the other hand, is designed to realize the process of fuzzy reasoning, where the

    connection weights of the network correspond to the parameters of fuzzy reasoning. The NFS

    architecture has distinct nodes for antecedent clauses, conjunction operators, and consequent

    clauses.

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    3/8

    -Rough sets and Fuzzy set Theory

    An exhaustive fuzzy rule induction algorithm (RIA) [17] uses descriptive set representation

    features [18].The current RIA, provided with sets of continuous feature values, induces classification

    rules to partition the feature patterns into underlying categories. It is chosen for properties such asgraceful handling of missing and inaccurate information or vague data, domain independence,

    incremental operation [19]. However, as with many RIAs, this algorithm exhibits high computational

    complexity due to its generate-and-test nature. The effects of this become evident where patterns

    of high dimensionality need to be processed.

    In order to speed up the RIA, a pre-processing step is required. This is particularly important

    for tasks where learned rule sets need regular updating to reflect the changes in the description of

    domain features. This step, herein implemented using rough set theory [20], reduces the

    dimensionality of potentially very large feature sets without losing information needed for rule

    induction. It has an advantageous side-effect in that it removes redundancy from the historical data.

    This also helps simplify the design and implementation of the actual pattern classifier itself; bydetermining what features should be made available to the system. In addition, the reduced input

    dimensionality increases the processing speed of the classifier, leading to better response times.

    Most significant, however, is the fact that rough set feature reduction (RSFR) preserves the

    semantics of the surviving features after removing any redundant ones. This is essential in satisfying

    the requirement of user readability of the generated knowledge model, as well as ensuring the

    understand ability of the pattern classification process.

    The FAPACS (Fuzzy Automatic Pattern Analysis and classification system) algorithm [21, 22],

    is able to discover fuzzy association rules in relational databases. It works by locating pairs of

    attributes that satisfy an interestingness measure that is defined in terms of an adjusted difference

    between the observed and expected values of relations. This algorithm is capable of expressinglinguistically both the regularities and the exceptions discovered within the data.

    A common disadvantage of these techniques is their sensitivity to high dimensionality. This

    may be remedied using Principal Components Analysis (PCA), a well-known tool for data analysis and

    transformation [23, 24]. However, although PCA is an efficient methodology, it irreversibly destroys

    the underlying semantics of the dataset. Further reasoning about the data is almost always humanly

    impossible, prohibiting the use of PCA as a dataset pre-processor for symbolic or descriptive fuzzy

    modelling. By implication, only purely numerical (non-symbolic) datasets may be processed by PCA.

    Most semantics-preserving dimensionality reduction (feature selection) approaches tend to

    be domain specific, however, utilising well-known features of specific application domains. RSFRoffers an alternative approach that preserves the underlying semantics of the data while allowing

    reasonable generality.

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    4/8

    Rough-Fuzzy Rule Induction

    Present approach deals with pattern involving large set of features by applying a

    dimensionality reduction algorithm on the feature set to discover a smaller set of features that

    conveys all the information with as little redundancy as possible. Patterns formed using the resulting

    dimensionally reduced feature set are then extracted from the original pattern descriptions and fed

    to a rule induction algorithm to generate the suitable rule set. A rough fuzzy system following this

    approach integrates the following modules-

    Feature Reduction: Reads a set of feature patterns and outputs with reduced dimensionality,

    implemented with the RSFR algorithm.

    Rule Induction: Reads a sequence of feature patterns and outputs a set of if then rules connecting

    feature and their implied classes, implemented through RIA algorithm.

    Fuzzy Reasoner: A standard approximate reasoner that interprets the induced fuzzy rule set and uses

    it to classify previously unseen feature patterns.

    a- The fuzzy rule induction algorithm

    The rule induction algorithm as presented in Ref. [17] extracts linguistically expressed fuzzy

    rules from real valued examples. Although this RIA was proposed to be used in conjunction with

    neural network-based classifiers, it appears to be independent of the actual type of classifier used.

    Provided with training data, the RIA induces approximate relationships between the characteristics

    of the conditional attributes (pattern features) and their decision attributes (underlying classes). The

    premise attributes of the induced rules are represented by fuzzy variables, facilitating the modelling

    of the inherent uncertainty of the knowledge domain.

    An additional input item needed by this algorithm is the fuzzification of the conditional

    attributes as feature patterns are in general assumed to be available in real numbers (though these

    numbers may not be accurately measured). For presentational simplicity, the terms dataset and set

    of historical patterns are hereafter used interchangeably, and so are the term historical patterns and

    the term training data (or examples).

    A decision region is a set comprising examples row numbers, for which a certain decision

    attribute yi has a certain value c: Dyi=c = {x: yxi = cx {1; : : : ; k}}, where yxi is the value of the

    decision attribute yi as given by the example with number x, and k is the total number of examples in

    the dataset.

    Next, the algorithm generates a set of all the possible combinations of fuzzy sets defined in

    the underlying ranges of the n conditional attributes. Each of the n attributes has its own value range

    divided up into a number of fuzzy sets. Such a range is hereafter called a fuzzy region for short. For a

    domain with n conditional attributes, each of which is represented by fx fuzzy sets (1< x< n), there

    will be i=1i=n

    fi possible combinations, each referred to as a fuzzy set vector. Each vector represents

    an emerging pattern of rule conditions that may lead to a fuzzy rule, provided that dataset examples

    support it. Note that, in implementing this algorithm, it is infeasible and undesirable to implement

    directly this step. Alternative, computationally equivalent methods are employed to avoid the

    overhead of generating and storing this potentially vast set of combinations.

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    5/8

    Based on this, it is possible to measure the evidence contributed by an example x towards

    the establishment of a fuzzy rule denoted by a fuzzy set vector p = 1, 2, n(as produced in theprevious step). The t-norm operator is used to this end. Hence, Tp(x)=min(1(x1), 2(x2),.., n(xn))(Fuzzy intersection or t-norm), where x1; x2; : : : ; xn are conditional attribute values. Sets of these

    values are formed for each decision yi = c: Tpyi=c

    = {w: w = Tp(xj) j Dyi=c}.

    In a typical fuzzy region, no more than two fuzzy sets overlap for each underlying real

    element, while the region is split into two or more fuzzy sets. This causes most of the memberships

    to evaluate to zero, in turn forcing Tp(x) to also evaluate to zero, due to the use of min().

    For each decision yi = c, the maximum element of Tp yi=c is then evaluated (Fuzzy union or s-

    norm): Sp yi=c = max{x: x Tp yi=c}. This results in a multidimensional array of candidate rules

    describing the mapping between conditional and decision attributes. There is one array for each pair

    of decision attributes and decision attribute values (i.e. one for each decision region, as calculated

    above).

    Fuzzy rules may not be generated directly from these arrays, as there are candidate rules foreach possible decision yi = c and every possible combination of fuzzified conditional attributes. This

    implies that the candidate rule set may comprise contradicting rules. A means of deciding which

    candidate rule best describes a given fuzzy set vector is needed.

    A constant dubbed the uncertainty margin (or tolerance) is used to implement this control.A fuzzy rule concludes that class yi has value c given attribute values matching the fuzzy premises in

    p, if and only if the corresponding candidate rule is equal to at least more than the candidate rulessupporting any competing classifications yi = c_, where c_ Yi (the set of possible values for

    decision class yi) and c_ =c. If no candidate rule can be decided upon (i.e. all values are within the neighbourhood), the whole rule is undecided for the fuzzy set vector in question. That is,

    yi = c if ( c Yi {c}) Spyi=c S

    pyi=c

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    6/8

    in U are indiscernible with respect to P if and only if f(x; q)=f(y; q) qP. The indiscernibility

    relation for all PA is written as IND (P). U=IND (P) is used to denote the partition of U given IND(P)

    and is calculated as:

    U=IND(P) = {q P: U=IND(q)}

    Where

    I J = {X Y : X I; Y J; X Y =}:

    A rough set approximates traditional sets using a pair of sets named the lower and upper

    approximation of the set in question. The lower and upper approximations of a set PU (given an

    equivalence relation IND(P)) are defined as:

    P_Y = {X : X U=IND(P); X Y };

    P-Y = {X : X U=IND(P); X Y = }:

    Assuming P and Q are equivalence relations in U, the important concept positive region POSP(Q) is

    defined as:

    POSP(Q) = XQ P_X ;

    A positive region contains all patterns in U that can be classified in attribute set Q using the

    information in attribute set P. From this, the degree of dependency of a set Q of classification

    variables, Q B on a set of features P, where P A, is defined as :

    P(Q) = ||POSP(Q)|| .||U|| (7)

    where ||S|| denotes the cardinality of set S.

    The degree of dependency P(Q) of a set P of conditional attributes (features) with respect toa set Q of decision attributes (pattern classes) provides a measure of how important P is in

    classifying the dataset examples into Q. IfP(Q) = 0, then classification Q is independent of the

    attributes in P, hence the decision attributes are of no use to this classification. If = 1, then Q iscompletely dependent on P, hence the attributes are indispensable. Values 0

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    7/8

    Rmin R:

    Rmin = {X : X R; Y R; ||X ||

  • 8/3/2019 Pattern Recognition by AMBUJ MISHRA

    8/8

    1999.

    [16]- S.K. Pal, A. Skowron (Eds.), Rough-Fuzzy Hybridization: A New Trend in Decision Making,

    Springer, Heidelberg, 1999.

    [17]- A. Lozowski, T.J. Cholewo, J.M. Zurada, Crisp rule extraction from perceptron network

    classifiers, Proceedings of the International Conference on Neural Networks, Volume ofPlenary, Panel and Special Sessions, 1996, pp. 9499.

    [18]- Q. Shen, J.G. Marin-Blazquez, A. Tuson, Tuning fuzzy membership functions withneighbourhood search techniques: a comparative study. Proceedings of the Third IEEEInternational Conference on Intelligent Engineering Systems, 1999, pp. 337342.

    [19]- A. Chouchoulas, Q. Shen, Rough set-aided rule induction for plant monitoring, Proceedingsof the 1998 International Joint Conference on Information Science (JCIS98), Vol. 2, 1998,pp. 316319.

    [20]- Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer AcademicPublishers, Dordrecht, 1991.

    [21]- W.H. Au, K.C.C. Chan, An e1ective algorithm for discovering fuzzy rules in relationaldatabases, Proceedings of the Seventh IEEE International Conference on Fuzzy Systems,

    1998, pp. 13141319.[22]- K. Chan, A.A. Wong, Apacs: a system for automatic analysis and classification of conceptual

    patterns, Comput.Intelligence 10 (1990) 119131.

    [23]- P. Devijver, J. Kittler, Pattern Recognition: A Statistical Approach, Prentice Hall, EnglewoodCli1s, NJ, 1982.

    [24]- B. Flury, H. Riedwyl, Multivariate Statistics: A Practical Approach, Prentice Hall, EnglewoodCli1s, NJ, 1988.