
Unsupervised Segmentation of Collagen Fiber Distribution in Different Stages of OSF

Tathagata Ray (a), Jyotirmoy Chatterjee (b), Anirban Mukherjee (a), Mousumi Pal (c), Keya Chaudhuri (d), Ranjan Rashmi Paul (c), Pranab K. Dutta (a)

(a) Department of Electrical Engineering, Indian Institute of Technology, Kharagpur, 721302, W.B.

(b) School of Medical Science and Technology, Indian Institute of Technology, Kharagpur, 721302, W.B.

(c) Department of Oral and Maxillofacial Pathology, Gurunank Institute of Dental Science and Research, Panihati, Kolkata 700114, W.B.

(d) Molecular and Human Genetics Division, Indian Institute of Chemical Biology, Kolkata 700 032, India.

Corresponding Author: Pranab K. Dutta, E-mail: [email protected]

Abstract

The objective of this paper is to describe the comparative efficacy of three signal processing based feature extraction methodologies for classifying normal oral mucosa and the early and advanced stages of Oral Submucous Fibrosis (OSF) by unsupervised segmentation of sub-epithelial collagen fibers. Wavelet, wavelet packet and discrete cosine transform (DCT) based energy features are extracted from Transmission Electron Micrographs (TEM) of collagen fibers of different stages of OSF in non-overlapping blocks, which in turn are classified by fuzzy c-means clustering. The overall efficacy of the DCT based features has been found to be better than that of their wavelet based counterparts.

Keywords: DCT mask, wavelet transform, segmentation, OSF, collagen fiber, rotational invariance

1. Introduction

A high incidence of oral cancer is mainly due to late diagnosis of potential precancerous

lesions and conditions.1-3 There is consistent evidence that the prognosis of oral cancer is

better when it is diagnosed at an early stage, i.e. in a precancerous condition.4 Oral Submucous

Fibrosis (OSF) is such a precancerous condition of oral cavity and oro-pharynx having


insidious chronic progressive nature and a high degree of malignant potentiality.5

Histopathologically various changes in epithelium with concurrent sub-epithelial fibrosis

are major characteristics of OSF, but very few studies have addressed the collagen

fibrosis in a definite quantitative manner. In this regard computer aided diagnostic (CAD)

approach aided with statistical modeling has been successfully employed to analyse

collagen ultrastructure.6 The transmission electron micrographs of sub epithelial fibrillar

collagen population of early and advanced stages of OSF have also been analyzed by

CAD approach coupled with wavelet-Artificial Neural Network (wavelet-ANN) to

compare with the fibers of normal oral mucosa.7 The wavelet features have been

compared for the supervised stage classification of OSF.7 The Haar features were used

for ANN based supervised classification of different stages of OSF.7 Few reports are

available regarding successful applications of machine learning in precancerous

diagnosis.8-13

Porter showed that energies (averaged $l_1$ norm) of the combined LH and HL channels for

all three level decompositions along with the energy of the LL channel of the lowest level

decomposition of 3 level wavelet transform using Daubechies wavelet filter of support 8

(DB8) can serve as rotational invariant features in case of texture classification.14 In the

present study this philosophy has been used to mark different regions of the input image.

Now this study mainly aims at differentiating axial, transverse and collagen free regions

present in OSF in an unsupervised way as a texture classification problem. The energy of

three levels DB8 wavelet function is used as the discriminating test features. In general,

neighboring pixels within an image tend to be highly correlated. The DCT has been

shown to be near optimal for a large class of images in energy concentration and


decorrelating.15,16 DCT has the construction of a decorrelated basis system which will

almost always reduce the dimensionality of a feature vector space. If the Markov-1 model

fits the image well, then the best option would be to use the DCT as the decorrelating

transform.17 It decomposes the signal into underlying spatial frequencies, which then

allow further processing techniques to reduce the precision of the DCT coefficients

consistent with the Human Visual System (HVS) model. The horizontal and vertical AC

coefficients of DCT have been selected for rotational invariant features to represent the

regularity, complexity and some texture features of an image. This feature extraction

process has been followed by Fuzzy c-means technique.18

Earlier the input TEM images were directly fed to feature extraction and classification

stage.7 But it has been observed that the content of section of the incision biopsy is

heterogeneous in nature. A single section contains both transverse and axially-cut

collagen. Moreover, this section contains some collagen free region. Naturally, there are

some voids (collagen-free portion) in any transverse or axial section of the incision biopsy.

Fig. 1. Less-advanced stage of OSF consisting of transverse, axial and collagen free regions. (Labelled regions in the figure: collagen free region, axial region, transverse region.)

The simultaneous presence of transverse, axial and collagen-free sections (as shown in Fig. 1) of the input image makes the processing task challenging. In order to

enhance the performance of the ANN-based classifier the input image has to be clustered

first. Out of the three clusters, the zone showing the transverse section of collagen will be

identified. The ANN-based staging of the OSF will be performed on this transverse

section of the collagen image identified in the previous step. The complete scheme is

shown in Fig 2. In order to check the performance of this clustering technique, the

computed output has been compared with the reference segmentation map manually

drawn by two oral oncologists. These medical experts have marked the three regions

(transverse, longitudinal, and collagen free) independently. The independently marked

common area has been taken as desired segmentation map and the region segmentation

accuracy has been evaluated for various stages of OSF images. In essence, this study identifies suitable features from TEM images. The features considered are based on the wavelet transform, the wavelet packet transform and the DCT. The rotational invariance of these features has also been verified. These features are subsequently used to ascertain whether, under certain circumstances, one feature set outperforms the others. The scope of this paper has been marked in Fig. 2. Region-wise segmentation is carried out on a block-wise decomposition because it requires less computational time than pixel-wise processing. In addition, the rotational invariant method based on DCT proposed here has been found to be superior to the rotational invariant wavelet based methods. The proposed method also has the advantage of ease of fabrication and implementation, as it is derived from the DCT.


This paper is organized as follows. Section 1 has covered the introduction. Section 2 deals with the selection of patients, the clinical classification of OSF stages and the transmission electron microscopic (TEM) study. Sections 3 and 4 discuss the feature extraction and clustering methodology respectively. The results and observations are presented in Section 5. Finally, Section 6 is devoted to the conclusion.

2. Selection of Patients, Clinical Classification of OSF Stages and Transmission Electron Microscopic (TEM) Study

In selection of study-subjects, primarily oral health was examined as per standard

procedures and provisional clinical diagnosis was made for OSF patients with oral

lesions. Subsequently OSF cases were confirmed histopathologically (still regarded as

the gold standard of OSF diagnosis), through routine H&E light microscopic evaluation

of incisional biopsy under their (patient’s) prior consent at the Department of Oral and

Maxillofacial Pathology, R. Ahmed Dental College & Hospital, Kolkata.19 The examined

cases are given as less advanced (n=55), advanced (n=60) and normal (n=30).

Fig. 2 Block diagram for staging process of OSF. (Blocks in the figure: input image; region-wise segmentation; identification of cluster (transverse or axially cut area); ANN-based staging of OSF evaluated on transverse/axial section; desired common segmentation map by two oral oncologists; evaluation of clustering accuracy.)

The classification/grading of OSF (less-advanced and advanced) has been done according to the degree of trismus (i.e. reduction in the overall mouth opening, visual observation etc. by experienced oral pathologists), which has direct correlation with the degree of fibrosis, progression of the disease and location of OSF lesion in oral mucosa.20 Clinically, varied

degrees of trismus (inability to open the mouth) in these patients are also evident which

has a direct correlation with the oral location of the OSF lesion and degree of fibrosis due

to excessive formation of subepithelial collagen fibers.20 However, trismus is not the only

basis of diagnosis of OSF i.e. the reverse is not always true. The unaffected area of the

biopsy was treated as the representative of normal mucosa (test normal). In histologically

confirmed OSF cases (less advanced and advanced) both the test normal and affected

mucosal regions of all biopsies were further subjected to transmission electron

microscopic evaluation for capturing ultra-structural features of sub-epithelial collagen

fibers.20 It is to be noted that the test normal sample has been taken from the visually and

clinically unaffected zones of oral mucosa. In this work, the ultrastructural changes in

TEM images are captured by the feature extraction and classification techniques as

discussed in section 3. The clinical diagnosis of OSF at macroscopic and microscopic

level by the oncologists provides the ground truth for less advanced stage and advanced

stage.

3. Feature Extraction

3a. Wavelet Transformation

The 2-D wavelet transform performs a spatial frequency analysis on an image by

repeatedly decomposing the image in the lower frequency sub-bands.21 The output

depends on the type of wavelet, the decomposition and wavelet filter specifications. The

features have been derived from three level wavelet decomposition coefficients. The


energy level of the main channels of the wavelet decomposition has been found to be

effective as rotational invariant features for texture segmentation.14 In general, a texture has components in both vertical and horizontal frequencies whatever its amount of rotation may be. Features based on the plain wavelet transform represent these frequencies separately, so they change when the texture is rotated, i.e. they are rotation variant. Therefore, in the proposed scheme rotation invariance has been achieved by combining pairs of diagonally opposite wavelet channels (HL and LH) to form single features. This approach is thus

entirely based on the composition of spatial frequencies within the texture and is not

heavily dependent on the texture’s directionality. The LH and HL channels (as shown in

Fig 3.) in each level of decomposition are grouped together to produce four main

frequency bands. The HH channels are not used owing to their poor signal to noise ratio

which degrades the classification accuracy. The energy levels in each of these chosen

bands are calculated as the mean of the magnitudes of their wavelet coefficients. The four

dimensional feature vector $F^{w} = [\,dc^{w}_{1}\ \ ec^{w}_{1}\ \ ec^{w}_{2}\ \ ec^{w}_{3}\,]^{T}$ has been used as the characteristics of MxN block. The region classification has been performed on this four dimensional feature space. The energy or averaged $l_1$-norm for the $n$-th channel is given by

$$dc^{w}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left|x_{LLn}(i,j)\right| \quad \text{for } n = 1 \tag{1a}$$

$$ec^{w}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left\{\left|x_{HLn}(i,j)\right| + \left|x_{LHn}(i,j)\right|\right\} \quad \text{for } n = 1, 2, 3 \tag{1b}$$

where the channel is of dimensions MxN (usually $M = N$), $i$ and $j$ are the indices of row and column of the channel and $x_{LLn}$ is the LL channel wavelet coefficient of level $n$.
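As a rough illustration of eqs. (1a) and (1b), the following minimal Python sketch computes the energy features from sub-bands that are assumed to be already available as lists of rows. The function names `mean_abs` and `wavelet_energy_features` and the toy coefficient values are hypothetical and are not taken from the paper; the wavelet decomposition itself (DB8, three levels) is not reproduced here.

```python
def mean_abs(channel):
    """Averaged l1-norm of one channel given as a list of rows."""
    total = sum(abs(v) for row in channel for v in row)
    count = sum(len(row) for row in channel)
    return total / count


def wavelet_energy_features(ll, detail_pairs):
    """Return [dc_1, ec_1, ec_2, ...] in the spirit of eqs. (1a) and (1b).

    ll           -- LL channel (list of rows)
    detail_pairs -- one (HL, LH) pair per level; both channels of a pair
                    are assumed to have the same shape
    """
    features = [mean_abs(ll)]
    for hl, lh in detail_pairs:
        # Add the two magnitudes coefficient by coefficient, then average.
        combined = [[abs(a) + abs(b) for a, b in zip(r1, r2)]
                    for r1, r2 in zip(hl, lh)]
        features.append(mean_abs(combined))
    return features


# Toy sub-bands (hypothetical values, one level only).
ll = [[4, -4], [2, 2]]
hl = [[1, -1], [0, 2]]
lh = [[0, 3], [-1, 1]]

features = wavelet_energy_features(ll, [(hl, lh)])
# Exchanging HL and LH leaves the combined feature unchanged.
swapped = wavelet_energy_features(ll, [(lh, hl)])
```

Because the HL and LH magnitudes are summed, the combined feature does not depend on which of the two channels carries the energy, which is the property the grouping of channels relies on.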


3b. Wavelet Packet Transformation

Wavelet packets are a generalization of orthonormal and compactly supported wavelets.22 In wavelet analysis a signal is split into an approximation and a detail. The approximation is then itself split into a second level approximation and detail, and the process is repeated. In wavelet packet analysis, however, the detail as well as the approximation can be split. The difference between the wavelet transform and the wavelet packet transform is that the latter also recursively decomposes the higher frequency components, thus constructing a tree-structured multiband extension of the wavelet transform. An example of a three level wavelet packet transform is shown in Fig. 4.

Fig. 3 Grouping of wavelet channels to form 4 bands to produce rotation invariant features. (Channel labels in the figure: LL, HL, LH, HH; bands numbered 1 to 4.)


The decomposition of the root image (top image shown in Fig. 4) creates four components at the next level ((1,1), ..., (1,4)). Likewise, 64 channels of wavelet coefficients ((3,1), ..., (3,64)) have been obtained from the 3 level wavelet packet transform. From these, 17 rotational invariant energy features (eq. 2) have been considered, so the wavelet packet features may also be used for stage classification. Here the feature vector is $F^{wp} = [\,dc^{wp}_{1}\ \ ec^{wp}_{1}\ \ ec^{wp}_{2}\ \ \ldots\ \ ec^{wp}_{16}\,]^{T}$, where

$$dc^{wp}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left|x_{LLn}(i,j)\right| \quad \text{for } n = 1 \tag{2a}$$

$$ec^{wp}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left\{\left|x_{HLn_1}(i,j)\right| + \left|x_{LH(n_1+1)}(i,j)\right|\right\} \quad \text{for } n = 1, 2, 3, \ldots, 16 \tag{2b}$$

where $n_1 = 2, 5, 8, \ldots, 62$ and the channel is of dimensions MxN (usually $M = N$), $i$ and $j$ are the indices of row and column of the channel.

Fig. 4 Tree structure of three level wavelet packet transform. (The root image splits into channels (1,1) to (1,4), then (2,1) to (2,16), then (3,1) to (3,64).)

3c. Discrete Cosine Transform (DCT)

A local linear transform can be interpreted as a spatial filtering approach for image

decomposition and texture representation.24,18 The corresponding 2D DCT mask may be

obtained from its 1D version as it is a separable transform.

A 1D DCT basis vector $u_m$ of support $N$ is expressed as

$$u_m(k) = \begin{cases} \dfrac{1}{\sqrt{N}} & \text{for } m = 1 \\[2mm] \sqrt{\dfrac{2}{N}}\cos\left\{\dfrac{(2k-1)(m-1)\pi}{2N}\right\} & \text{for } m = 2, \ldots, N \end{cases} \tag{3}$$

Fig. 5 DCT basis patterns (8 X 8 pixels). Neutral gray represents zero, white represents positive amplitudes, and black represents negative amplitudes.23


These 1D DCT vectors can then be used to generate 2D transform filters appropriate for images. For this, column basis vectors are multiplied with row vectors of identical length to produce a set of $N^2$ 2D filters, where $N$ is the vector length. The $mn$-th entity in this filter bank, $d_{mn}$, is given as

$$d_{mn} = u_m(k)\,u_n(k)' \quad \text{where } 1 \le m, n \le N$$

3c.1 Proposed Modification

In Table 1 the highest filter coefficient mask of size 8X8 has been taken from rightmost

bottom corner of Fig 5. It has been assumed in this work that the 4X4 submask A serves as the approximation filter mask. Consequently V and H have been treated as the vertical and horizontal filter masks respectively. It is to be noted that the submasks V and H are the horizontally and vertically flipped versions of A with a negative sign:

$$A = [\,AC_1\ AC_2\ AC_3\ AC_4\,] = [\,AR_1\ AR_2\ AR_3\ AR_4\,]^{T}$$

where $AC_j$ and $AR_j$ are column vectors, and

$$V = (-1)\,[\,AC_4\ AC_3\ AC_2\ AC_1\,], \qquad H = (-1)\,[\,AR_4\ AR_3\ AR_2\ AR_1\,]^{T}$$

 0.0396  -0.1127   0.1686  -0.1989   0.1989  -0.1686   0.1127  -0.0396
-0.1127   0.3209  -0.4802   0.5665  -0.5665   0.4802  -0.3209   0.1127
 0.1686  -0.4802   0.7187  -0.8478   0.8478  -0.7187   0.4802  -0.1686
-0.1989   0.5665  -0.8478   1.0000  -1.0000   0.8478  -0.5665   0.1989
 0.1989  -0.5665   0.8478  -1.0000   1.0000  -0.8478   0.5665  -0.1989
-0.1686   0.4802  -0.7187   0.8478  -0.8478   0.7187  -0.4802   0.1686
 0.1127  -0.3209   0.4802  -0.5665   0.5665  -0.4802   0.3209  -0.1127
-0.0396   0.1127  -0.1686   0.1989  -0.1989   0.1686  -0.1127   0.0396

Table 1: Approximation, horizontal and vertical part of the highest detail DCT filter coefficient mask. The 8X8 mask is partitioned into four 4X4 submasks labelled A, H, V and D.
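The construction of the mask in Table 1 can be checked numerically. The sketch below builds the eighth 1D basis vector from eq. (3), forms the 2D filter as an outer product, and extracts the submasks. Two points are assumptions rather than statements from the paper: the tabulated values appear consistent with scaling the mask so that its peak magnitude is 1, and A is taken to be the top-left 4X4 quadrant. The helper names `dct_basis` and `outer` are illustrative only.

```python
import math


def dct_basis(m, n_support):
    """1D DCT basis vector u_m(k), k = 1..N, following eq. (3)."""
    if m == 1:
        return [1.0 / math.sqrt(n_support)] * n_support
    scale = math.sqrt(2.0 / n_support)
    return [scale * math.cos((2 * k - 1) * (m - 1) * math.pi / (2 * n_support))
            for k in range(1, n_support + 1)]


def outer(u, v):
    """2D filter d_mn = u_m u_n' returned as a list of rows."""
    return [[a * b for b in v] for a in u]


N = 8
u8 = dct_basis(8, N)                       # highest frequency basis vector
mask = outer(u8, u8)
peak = max(abs(v) for row in mask for v in row)
mask = [[v / peak for v in row] for row in mask]   # assumed peak normalisation

A = [row[:4] for row in mask[:4]]                  # assumed: top-left quadrant
V = [[-v for v in reversed(row)] for row in A]     # (-1)[AC4 AC3 AC2 AC1]
H = [[-v for v in row] for row in reversed(A)]     # (-1)[AR4 AR3 AR2 AR1]^T

top_right = [row[4:] for row in mask[:4]]
bottom_left = [row[:4] for row in mask[4:]]
```

Under these assumptions, `A[0][0]` rounds to 0.0396 as in the first entry of Table 1, and `V` and `H` coincide with the top-right and bottom-left quadrants of the mask respectively.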


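The energy measure of the dc component of the image convolved with the A part of the lowest DCT mask (leftmost top corner of Fig 5) and the energy measures of a combination of the image convolved with the V and H parts of the 64 DCT masks are taken as rotational invariant DCT features, as given by the feature vector $F^{d} = [\,dc^{d}_{1}\ \ ec^{d}_{1}\ \ ec^{d}_{2}\ \ \ldots\ \ ec^{d}_{64}\,]^{T}$, where

$$dc^{d}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left|x_{An}(i,j)\right| \quad \text{for } n = 1$$

$$ec^{d}_{n} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left\{\left|x_{Hn}(i,j)\right| + \left|x_{Vn}(i,j)\right|\right\} \quad \text{for } n = 1 \text{ to } 64 \tag{4}$$

where $x_{An}$, $x_{Vn}$ and $x_{Hn}$ denote the input image $I$ convolved with the A, V and H parts of the $n$-th DCT mask, as given by eq. 5.

$$x_{An} = I \otimes A, \qquad x_{Hn} = I \otimes H, \qquad x_{Vn} = I \otimes V \tag{5}$$

These are the modified version of conventional DCT features.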

An experiment has been performed to show the efficacy of the above mentioned

proposed rotational invariant features based on DCT. Twelve images belonging to

normal, less advanced and advanced stages of OSF have been taken as test image. Now

each test image has been rotated by angle of 45, 90 and 135 degree. Thereafter

midportion of size 32X32 of unrotated and three rotated versions of each of the test

images are considered as input to the proposed feature extraction method. Sixty five

proposed features have been extracted for four versions of input image for rotation of 0,

45, 90 and 135 degrees. Centralised moments of order 1, 2 and 3 are calculated on

normalized 65 features. Three centralized moments calculated for four versions for each

of twelve input images are shown in Table 2. It can be inferred from the Table 2 that

centralized moments of the proposed features vary in a small amount among four


versions of a particular input image. So the proposed features are rotational invariant

indeed.
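The moment computation used in this check can be sketched in a few lines of Python. The min-max scaling in `normalise` and the feature values are assumptions made for illustration; the normalisation actually applied to the 65 features is not specified in the text.

```python
def central_moment(values, order):
    """k-th central moment: the mean of (x - mean)^k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def normalise(values):
    """Scale features to [0, 1] (an assumed normalisation)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical feature values standing in for one 65-element feature vector.
features = normalise([0.8, 0.1, 0.3, 0.2, 0.6])
moments = [central_moment(features, k) for k in (1, 2, 3)]
```

The first central moment is zero by definition, which is why the first order column of Table 2 is 0 for every image and rotation; the comparison between rotated versions therefore rests on the second and third order moments.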

Input image   Rotation     Centralized   Centralized   Centralized
              in degrees   1st order     2nd order     3rd order
                           moment        moment        moment
Sample 1        0          0             0.020902      0.0153
               45          0             0.020961      0.015347
               90          0             0.020909      0.015305
              135          0             0.020969      0.015354
Sample 2        0          0             0.020891      0.015257
               45          0             0.020913      0.015277
               90          0             0.020885      0.015248
              135          0             0.020913      0.015277
Sample 3        0          0             0.020995      0.015364
               45          0             0.020994      0.015357
               90          0             0.020999      0.015369
              135          0             0.020999      0.015365
Sample 4        0          0             0.020876      0.015272
               45          0             0.020937      0.015322
               90          0             0.020874      0.01527
              135          0             0.020934      0.015318
Sample 5        0          0             0.020963      0.015347
               45          0             0.021008      0.015386
               90          0             0.020964      0.015346
              135          0             0.021009      0.015387
Sample 6        0          0             0.021046      0.015429
               45          0             0.021079      0.015451
               90          0             0.021047      0.015429
              135          0             0.021078      0.01545
Sample 7        0          0             0.021037      0.0154
               45          0             0.02105       0.015413
               90          0             0.021036      0.015399
              135          0             0.021049      0.015412
Sample 8        0          0             0.021079      0.01545
               45          0             0.021099      0.015466
               90          0             0.02108       0.015451
              135          0             0.021101      0.015468
Sample 9        0          0             0.020935      0.015312
               45          0             0.020961      0.015332
               90          0             0.020933      0.01531
              135          0             0.020961      0.015332
Sample 10       0          0             0.02103       0.015401
               45          0             0.021047      0.015412
               90          0             0.021034      0.015406
              135          0             0.02105       0.015416
Sample 11       0          0             0.020889      0.015283
               45          0             0.020951      0.015333
               90          0             0.020892      0.015286
              135          0             0.020944      0.015324
Sample 12       0          0             0.020866      0.015241
               45          0             0.020919      0.01529
               90          0             0.020866      0.01524
              135          0             0.02092       0.015292

Table 2. Centralised moments of order 1, 2 and 3 with four rotated versions of each of the twelve input images of size 32X32

4. Clustering of Feature Vectors

In this proposed approach, the extracted feature vectors have been fed to clustering stage.

Among different clustering techniques, k-means and fuzzy c-means are frequently used. Starting from an initial condition, the k-means algorithm finds a “hard partition” of the given feature vectors based on a criterion that evaluates the goodness of a partition. Hard partitioning assigns each feature vector to only one cluster. This disadvantage is eliminated in the fuzzy c-means clustering methodology.25 The progression of OSF is continuous in nature, and the extracted feature points may partially belong to multiple classes with different degrees of membership. Therefore the rotational invariant features have been fed to fuzzy c-means clustering.
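To make the idea of graded membership concrete, here is a minimal pure-Python sketch of the standard fuzzy c-means updates. It is not the implementation used in this work: the fuzzifier `m = 2`, the fixed iteration count and the choice of evenly spaced sample points as initial centres are all simplifying assumptions, and the two-dimensional data points are hypothetical.

```python
def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Minimal fuzzy c-means on a list of equal-length feature lists.

    Returns (centres, memberships); memberships[j][i] is the degree to
    which point j belongs to cluster i, and each row sums to 1.
    """
    # Assumed initialisation: evenly spaced sample points as centres.
    step = (len(points) - 1) // max(n_clusters - 1, 1)
    centres = [list(points[i * step]) for i in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
                     for c in centres]
            if min(dists) == 0.0:
                # Point sits on a centre: full membership goes there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([r / total for r in row])
        for i in range(n_clusters):
            # Centre update: mean of the points weighted by membership^m.
            weights = [row[i] ** m for row in memberships]
            wsum = sum(weights)
            centres[i] = [sum(w * p[d] for w, p in zip(weights, points)) / wsum
                          for d in range(len(points[0]))]
    return centres, memberships


# Hypothetical 2D feature vectors forming two well separated groups.
data = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
        [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
centres, u = fuzzy_c_means(data, 2)
labels = [row.index(max(row)) for row in u]
```

Each row of `u` holds the memberships of one feature vector across all clusters, in contrast to k-means, where a vector would receive a single label. A hard segmentation map can still be obtained, as in `labels`, by taking the cluster of highest membership.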

5. Results and Discussions

The entire process consists of two goals namely obtaining zonal segmentation area wise

and estimating severity of OSF. But the feature extraction and clustering are backbones to


achieve these goals. In the following table, the best results are shown with rotational

invariant wavelet transform, wavelet packet transform and DCT mask based feature

extraction technique in terms of percentage of misclassification. Percentage of misclassification is given as

$$\%\ \text{of misclassification} = \frac{\text{number of misclassified pixels}}{\text{total number of pixels considered}} \times 100$$

Number of misclassified pixels is the mismatched pixels between obtained segmentation

map and desired common segmentation map.
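The measure can be written as a short Python function. In this sketch, pixels for which the reference map carries no agreed label are marked `None` and skipped; that convention, the function name and the small label maps are assumptions made for illustration. The sketch also assumes that cluster indices have already been matched to the reference labels, since the indices produced by an unsupervised method are arbitrary.

```python
def misclassification_percent(predicted, reference):
    """Percentage of considered pixels whose labels differ.

    Both arguments are label maps given as lists of rows. Reference
    pixels set to None (assumed to mean "no agreed expert label") are
    excluded from the count of considered pixels.
    """
    mismatched = 0
    considered = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if r is None:
                continue
            considered += 1
            if p != r:
                mismatched += 1
    return 100.0 * mismatched / considered


# Hypothetical 2 x 4 label maps: one mismatch among seven considered pixels.
predicted = [[0, 0, 1, 1], [0, 2, 2, 1]]
reference = [[0, 0, 1, 2], [0, 2, None, 1]]
result = misclassification_percent(predicted, reference)
```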

In Table 3, the results of the different feature extraction methods are tabulated. Three

block sizes namely 8X8, 16X16 and 32X32 have been tried as a compromise between

computational burden and classification accuracy. In the first row of Table 3, the feature

extraction techniques namely rotational invariant wavelet transform, rotational invariant

wavelet packet transform and rotational invariant DCT produce 38.43, 31.57 and 34.43 %

of misclassification respectively for three class clustering on can5 image for block size of

8X8. As far as the results are concerned, in can5 and can17 the rotational invariant wavelet transform performs better than the rotational invariant DCT and the rotational invariant wavelet packet transform. The rotational invariant wavelet packet transform performs well in lessadv2. In the rest of the images the rotational invariant DCT performs well compared to the other two feature extraction methods. The highest and lowest misclassification percentages are observed in the rotational invariant DCT mask based method. The choice of block size for best performance varies with the image and with the feature extraction method chosen. Out of the 36 cases given in Table 3, the rotational invariant DCT based feature extraction method is the best in 22 cases, whereas the numbers for the rotational invariant wavelet transform and the rotational invariant wavelet packet transform are 11 and 3 respectively. In Fig. 6, area wise segmented images


by 3 different feature extraction methods are shown for different stages of OSF. In Table 4, rotational invariant features based on wavelet transform, wavelet packet transform and DCT followed by k-means clustering show results similar to those of Table 3, where fuzzy c-means clustering is used instead of k-means clustering. It can therefore be inferred that the features are stable.

The methods discussed above can be used to segment the sections of different stages of

collagen fibers namely normal, less advanced and advanced. The efficacy of rotational

invariant DCT is well demonstrated in fig 7(a)-(d). In these two examples the test image

is created by the combination of normal, less advanced and advanced stages of axial

section of OSF. In these two examples the classification accuracy is 100%. In fact ideal

segmentation is found in 35% and 83% of all valid examples in the form of Fig 7(a). and

Fig. 7(c) respectively. The figures are 40% and 88% respectively for transverse section

of OSF.

The classification of ultrastructural features of OSF in an unsupervised manner is really challenging because, like any other precancerous condition, the different phases of OSF depict mixed features of normal and disease states (pre-malignancy). Accordingly, the less advanced stage of OSF comprises both normal and pre-malignant features with many overlaps and ambiguities. This complexity of the diseased tissue is reflected in the unsupervised area-wise segmentation of different structural features in the less advanced stage of OSF at lower resolution (block size 8X8 pixels), as shown in Table 3. To tackle these ambiguities at the structural level, three different unsupervised feature extraction techniques are used in this study. The data

Performance in terms of percentage of misclassification of rotation invariant feature extraction methods

Image       Block    Wavelet tr.   Wavelet packet   DCT     No. of clusters   Min
name        size     (3 level)     tr. (3 level)    (C5)    as suggested by   (C3,C4,C5)
                     (C3)          (C4)                     medical expert
less adv1   8X8      39.02         38.13            37.12   Three             37.12 (dct)
            16X16    21.57         43.08            39.73   Three             21.57 (w)
            32X32    37.20         40.83            42.85   Three             37.20 (w)
less adv2   8X8      10.84         12.29            11.68   Three             10.84 (w)
            16X16     8.72         10.48            10.26   Three              8.72 (w)
            32X32     7.68          9.39            10.84   Three              7.68 (w)
less adv3   8X8      47.02         35.81            36.80   Three             35.81 (wp)
            16X16    41.92         41.18            37.29   Three             37.29 (dct)
            32X32    40.53         42.08            36.47   Three             36.47 (dct)
less adv4   8X8      15.18         24.86            13.55   Three             13.55 (dct)
            16X16    16.40         19.71            13.64   Three             13.64 (dct)
            32X32    18.51         20.31            13.15   Three             13.15 (dct)
adv1        8X8      32.38         31.07            29.54   Three             29.54 (dct)
            16X16    23.80         29.11            26.92   Three             23.80 (w)
            32X32    24.16         25.91            23.23   Three             23.23 (dct)
adv2        8X8      42.91         40.77            33.42   Three             33.42 (dct)
            16X16    41.27         42.76            32.57   Three             32.57 (dct)
            32X32    41.77         39.01            33.33   Three             33.33 (dct)
adv3        8X8      18.48         19.37            17.35   Three             17.35 (dct)
            16X16    17.23         19.01            14.68   Three             14.68 (dct)
            32X32    16.82         13.92            15.29   Three             13.92 (wp)
adv4        8X8      35.39         31.49            25.20   Three             25.20 (dct)
            16X16    34.90         36.60            24.59   Three             24.59 (dct)
            32X32    31.31         36.16            25.41   Three             25.41 (dct)
adv5        8X8      44.89         47.17            39.22   Two               39.22 (dct)
            16X16    47.89         50.20            38.96   Two               38.96 (dct)
            32X32    52.55         49.75            42.00   Two               42.00 (dct)
adv6        8X8      27.30         29.45            24.06   Two               24.06 (dct)
            16X16    28.44         30.52            26.05   Two               26.05 (dct)
            32X32    31.91         33.67            27.47   Two               27.47 (dct)
Normal1     8X8      16.29         17.98            22.34   Three             16.29 (w)
            16X16    11.77         17.51            22.61   Three             11.77 (w)
            32X32     8.80          9.04            14.28   Three              8.80 (w)
Normal2     8X8      34.71         37.77            36.99   Three             34.71 (w)
            16X16    36.24         35.74            36.26   Three             35.74 (wp)
            32X32    35.24         35.74            37.71   Three             35.24 (w)

Table 3. Percentage of misclassification of feature extraction methods followed by fuzzy cmeans clustering with the different block sizes on images of OSF.

Page 18: Unsupervised Segmentation of Collagen Fiber Distribution ...manirban/journalPub/TR_JC_JBS_2010.pdf · Unsupervised Segmentation of Collagen Fiber Distribution in Different Stages

Performance of rotational invariant feature extraction

Input image   Block size   Wavelet tr.   Wavelet packet tr.   DCT      No. of clusters
adv1          8X8          0.3420        0.2518               0.2949   Three
adv1          16X16        0.2375        0.2470               0.2747   Three
adv1          32X32        0.2464        0.5081               0.2264   Three
less adv1     8X8          0.2975        0.3248               0.3785   Three
less adv1     16X16        0.3022        0.3451               0.3055   Three
less adv1     32X32        0.3029        0.3454               0.4302   Three
less adv2     8X8          0.1067        0.1131               0.1138   Three
less adv2     16X16        0.0925        0.1067               0.1008   Three
less adv2     32X32        0.0961        0.1033               0.1049   Three
less adv3     8X8          0.2957        0.2906               0.3776   Three
less adv3     16X16        0.3236        0.3144               0.3950   Three
less adv3     32X32        0.3712        0.3050               0.3841   Three
less adv4     8X8          0.1997        0.1564               0.1421   Three
less adv4     16X16        0.1963        0.1557               0.1399   Three
less adv4     32X32        0.1755        0.1793               0.1161   Three
adv4          8X8          0.3435        0.3496               0.2490   Three
adv4          16X16        0.2993        0.3524               0.2438   Three
adv4          32X32        0.2714        0.3208               0.2553   Three
adv2          8X8          0.4006        0.4049               0.3402   Three
adv2          16X16        0.3744        0.3707               0.3306   Three
adv2          32X32        0.3798        0.3952               0.3491   Three
adv3          8X8          0.1726        0.1828               0.1752   Three
adv3          16X16        0.1807        0.1588               0.1469   Three
adv3          32X32        0.1798        0.1516               0.1529   Three
adv5          8X8          0.4423        0.4692               0.3840   Two
adv5          16X16        0.4806        0.5058               0.3760   Two
adv5          32X32        0.5227        0.4993               0.4247   Two
adv6          8X8          0.2638        0.2714               0.2314   Two
adv6          16X16        0.2813        0.2935               0.2547   Two
adv6          32X32        0.3216        0.3296               0.2613   Two
Nrm(a)        8X8          0.1686        0.1594               0.2264   Three
Nrm(a)        16X16        0.0815        0.1874               0.2288   Three
Nrm(a)        32X32        0.1118        0.0627               0.1408   Three
Nrm(b)        8X8          0.3403        0.3590               0.3181   Three
Nrm(b)        16X16        0.3233        0.3237               0.3709   Three
Nrm(b)        32X32        0.3027        0.3333               0.3780   Three

Table 4. Percentage of misclassification of the feature extraction methods, each followed by k-means clustering, for different block sizes on images of OSF.
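A misclassification figure of the kind tabulated above requires the arbitrary cluster indices to be matched to the reference regions before errors are counted. The sketch below shows one common way of doing this: it tries every one-to-one mapping and keeps the smallest error. The matching rule, the function name `misclassification_percent` and the toy labels are assumptions for illustration; this section does not state how the authors matched clusters to regions.

```python
from itertools import permutations


def misclassification_percent(true_labels, cluster_labels, n_classes):
    """Percentage of blocks whose cluster disagrees with the reference label.

    Cluster indices are arbitrary, so every one-to-one mapping from cluster
    index to class index is tried and the smallest error is reported.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    best_errors = len(true_labels)
    for mapping in permutations(range(n_classes)):
        errors = sum(
            1 for t, c in zip(true_labels, cluster_labels) if mapping[c] != t
        )
        best_errors = min(best_errors, errors)
    return 100.0 * best_errors / len(true_labels)


# Hypothetical reference labels for ten blocks (0, 1, 2 standing for three
# regions such as axial, transverse and void) and a clustering result that
# uses different indices and mislabels one block.
reference = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
clusters = [2, 2, 1, 0, 0, 0, 1, 1, 1, 1]
error = misclassification_percent(reference, clusters, n_classes=3)
print(f"{error:.1f}% of blocks misclassified")
```

Exhaustive matching is practical here because only two or three clusters are used; with many clusters an assignment algorithm would be preferable.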


shows that the performance of DCT is more consistent than that of the wavelet transform and wavelet packet transform based techniques. From the notch box plots of the misclassification percentage of the different feature extraction techniques at different levels of resolution, the performance of DCT is superior to that of the wavelet transform and wavelet packet transform with respect to the error range, both as a whole [Fig. 8] and at each level of resolution [Figs. 9-11], and the uncertainty of its median value is the smallest. Thus it may be stated that the DCT-based technique may be adopted for unsupervised segmentation of the ultrastructural features of OSF. It is also noted that more critical attention is required in identifying the target features of complex tissue textures, such as those of pre-cancerous conditions, in order to minimize the misclassification error.

Fig. 6. (a) less adv2, (b) results of rotational invariant 3 level wavelet transform, (c) results of rotational invariant 3 level wavelet packet transform, (d) results of rotational invariant DCT mask based features on image blocks of size 8X8, (e) less adv4, (f) results of rotational invariant 3 level wavelet transform, (g) results of rotational invariant 3 level wavelet packet transform, (h) results of rotational invariant DCT mask based features on image blocks of size 8X8, (i) adv3, (j) results of rotational invariant 3 level wavelet transform, (k) results of rotational invariant 3 level wavelet packet transform, (l) results of rotational invariant DCT mask based features on image blocks of size 8X8.
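The quantities read off a notched box plot (median, notch width and overall range) can also be computed directly. The sketch below does so for the 8X8 rows of Table 3 for adv1 to adv6, Normal1 and Normal2 only, so its numbers are illustrative and are not expected to match Figs. 8-11, which cover all images. The notch is taken as median ± 1.57 × IQR / √n, the usual convention for notched box plots; whether the figures used exactly this convention, and the reading of "error range" as maximum minus minimum, are assumptions. The function name `notch_summary` is hypothetical.

```python
import math
import statistics

# 8X8 rows of Table 3 for adv1-adv6, Normal1 and Normal2 (% misclassified).
errors_8x8 = {
    "wavelet": [32.38, 42.91, 18.48, 35.39, 44.89, 27.30, 16.29, 34.71],
    "wavelet packet": [31.07, 40.77, 19.37, 31.49, 47.17, 29.45, 17.98, 37.77],
    "dct": [29.54, 33.42, 17.35, 25.20, 39.22, 24.06, 22.34, 36.99],
}


def notch_summary(values):
    """Median, interquartile range, notch limits and range of a sample.

    Quartiles use Python's default 'exclusive' method, so they may differ
    slightly from those drawn by a particular plotting package.
    """
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    med = statistics.median(values)
    iqr = q3 - q1
    half_notch = 1.57 * iqr / math.sqrt(n)
    return {
        "median": med,
        "iqr": iqr,
        "notch": (med - half_notch, med + half_notch),
        "range": max(values) - min(values),
    }


summaries = {name: notch_summary(v) for name, v in errors_8x8.items()}
for name, s in summaries.items():
    low, high = s["notch"]
    print(f"{name:15s} median={s['median']:.2f} range={s['range']:.2f} "
          f"notch=({low:.2f}, {high:.2f})")
```

For these eight images the DCT column has the smallest range (21.87 percentage points, against 28.60 for the wavelet transform and 29.19 for the wavelet packet transform), which is the kind of comparison the box plots summarise.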

The outcome of an image processing task generally depends on the input image, so the classification accuracy varies a great deal from image to image. There is also overlap between the three regions, namely axial, transverse and void, and the classification accuracy dips further because of the complex boundaries between different regions.

Fig. 6 (continued). (m) Normal1, (n) results of rotational invariant 3 level wavelet transform, (o) results of rotational invariant 3 level wavelet packet transform, (p) results of rotational invariant DCT mask based features on image blocks of size 8X8.


Fig. 7. (a), (c) Two input images consisting of normal, less advanced and advanced stages of axial section of OSF. (b), (d) Segmented maps obtained by rotational invariant DCT.

Fig. 8. Notch box plot of the misclassification percentage of the different feature extraction methods, namely wavelet transform (1), wavelet packet transform (2) and discrete cosine transform (3). [Axes: feature extraction methods (1-3) versus % of misclassification (0-60).]


Fig. 9. Notch box plot of the misclassification percentage of the different feature extraction methods, namely wavelet transform (1), wavelet packet transform (2) and discrete cosine transform (3), for 8X8 resolution. [Axes: feature extraction methods (1-3) versus % of misclassification (0-60).]

Fig. 10. Notch box plot of the misclassification percentage of the different feature extraction methods, namely wavelet transform (1), wavelet packet transform (2) and discrete cosine transform (3), for 16X16 resolution. [Axes: feature extraction methods (1-3) versus % of misclassification (0-60).]


Fig. 11. Notch box plot of the misclassification percentage of the different feature extraction methods, namely wavelet transform (1), wavelet packet transform (2) and discrete cosine transform (3), for 32X32 resolution. [Axes: feature extraction methods (1-3) versus % of misclassification (0-60).]

6. Conclusion

In conclusion, it may be stated that, among the three feature extraction techniques, the overall performance of the proposed DCT-based technique is better at all the levels of resolution tried. Moreover, being an unsupervised segmentation technique, it can be a very important tool in characterizing the different stages of OSF, especially the early stage, which not only shows structural ambiguities owing to the presence of both normal and diseased features but also poses a problem for medical experts making the decision. The region of the transverse section of collagen may be extracted successfully with the help of our novel methodology, a task that is otherwise very labor-intensive for a human analyst. An ANN-based supervised classifier will further be applied to this extracted transverse zone in order to finally detect the stage of the disease.20

References

1. Parkin, D.M., Pisani, P., Ferlay, J., Estimates of the worldwide incidence of 25 major cancers in 1990, Int J Cancer 80: 827–841, 1999.
2. http://www.oralcancerfoundation.org/facts/index.htm.
3. http://www.cancerresearchuk.org/cancerstats/type/oral/
4. Wingo, P.A., Tong, T., Bolden, S., Cancer statistics, CA Cancer J Clin. 45: 8–30, 1995.
5. Aziz, S.R., Oral submucous fibrosis: an unusual disease, J N J Dent Assoc. 68: 17–19, 1997.
6. Chatterjee, J., Mukherjee, A., Mukherjee, K., Dutta, P.K., Chaudhuri, K., Statistical modeling of ultrastructural features of murine dermal collagen under chronic low-dose whole body X-irradiation, FEBS Letters 581: 5034–5042, 2007.
7. Paul, R.R., Mukherjee, A., Dutta, P.K., Banerjee, S., Pal, M., Chatterjee, J. et al., Pathological stage detection for oral precancerous condition using a novel wavelet-neural network-based technique, J. Clin. Pathol. 58: 932–938, 2005.
8. Dayhoff, J.E., Deleo, J.M., Artificial neural network: opening the black box, Cancer 91: 1615–1635, 2001.
9. Kappen, H.J., Neijt, J.P., Advanced ovarian cancer. Neural network analysis to predict treatment outcome, Ann. Oncol. 4 (Suppl. 4): 31–34, 1993.
10. Maclin, P., Dempsey, J., Using an artificial neural network to diagnose hepatic masses, J Med Syst. 16: 215–225, 1992.
11. Ravdin, P.M., Clark, G.M., A practical application of neural network analysis for predicting outcome of individual breast cancer patients, Breast Cancer Res. Treat 22: 285–293, 1992.
12. Wilding, P., Morgan, M.A., Grygotis, A.E., Shoffner, M.A., Rosato, E.F., Application of back propagation neural networks to diagnosis of breast and ovarian cancer, Cancer Lett. 74: 143–153, 1994.
13. Wu, Y., Giger, M.L., Doi, K., Vyborny, C.J., Schmidt, R.A., Metz, C.E., Artificial neural networks in mammography: application to decision making in the diagnosis of breast cancer, Radiology 187: 817, 1993.
14. Porter, R., Canagarajah, N., Robust rotation-invariant texture classification: wavelet, Gabor filter and GMRF based schemes, IEE Proc. Vis. Image Signal Process. 144 (3): 180–188, 1997.
15. Wallace, G.K., Overview of the JPEG still Image Compression standard, SPIE 1244: 220–233, 1990.
16. Le Gall, D.J., The MPEG Video Compression Algorithm: A review, SPIE 1452: 444–457, 1991.
17. Greenshields, I.R., Rosiene, J.A., A fast wavelet-based Karhunen-Loeve transform, Pattern Recognition 31 (7): 839–845, 1998.
18. Ng, I., Tan, T., Kittler, J., On local linear transform and Gabor filter representation of texture, Proc. Int’l Conf. Pattern Recognition, 627–631, 1992.
19. Pindborg, J.J., Sirsat, S.M., Oral submucous fibrosis, Oral Surg. Oral Med. Oral Pathol. 22: 764–779, 1966.
20. Mukherjee, A., Paul, R.R., Chaudhuri, K., Chatterjee, J., Pal, M., Banerjee, P., Mukherjee, K., Banerjee, S., Dutta, P.K., Performance analysis of different wavelet feature vectors in quantification of oral precancerous condition, Oral Oncology 42 (9): 914–928, 2006.
21. Mallat, S.G., Multifrequency channel decompositions of images and wavelet models, IEEE Trans. Acoust. Speech Signal Process. 37 (12): 2091–2110, 1989.
22. Daubechies, I., Orthonormal bases of compactly supported wavelets, Commun. Pure Appl. Math. XLI: 909–996, 1988.
23. Pennebaker, W.B., Mitchell, J.L., JPEG – Still Image Data Compression Standard, International Thomson Publishing, New York, 1993.
24. Unser, M., Local linear transforms for texture measurements, Signal Processing 11: 61–79, 1986.
25. Jain, A.K., Dubes, R.C., Algorithms for clustering data, Prentice Hall, Englewood Cliffs, New Jersey, 1988.