[IEEE 2014 IEEE International Conference on Mechatronics and Automation (ICMA) - Tianjin, China...

[IEEE 2014 IEEE International Conference on Mechatronics and Automation (ICMA) - Tianjin, China (2014.8.3-2014.8.6)] 2014 IEEE International Conference on Mechatronics and Automation

Identification of Fish Species based on Image Processing and Statistical Analysis Research

Lian Li, Jinqi Hong

Tianjin Key Laboratory for Control Theory & Applications in Complicated System

School of Electrical Engineering, Tianjin University of Technology

Tianjin, China

[email protected]

Abstract - As the number of fish varieties grows, classifying fish by type becomes necessary for subsequent processing and marketing. This study classifies fish types using image processing technology. Samples of four fish species were collected with an existing image acquisition device. The images were preprocessed in MATLAB (grayscale conversion, binarization, image enhancement, and contour extraction), and 11 feature parameters of the four species were extracted. Principal component analysis (PCA) was applied to the 11 feature parameters for dimension reduction, and four principal components were retained. SPSS software was then used to establish a Fisher discriminant and Mahalanobis distance model, built on the four principal components, to classify the four kinds of fish. The SPSS simulation and identification results show an average recognition rate of 96.67%, so the method is well suited to fish species identification.

Index Terms - image processing, principal component analysis, Fisher discriminant, Mahalanobis distance, fish species, identification.

I. INTRODUCTION

With the development of automation, the aquaculture industry is in urgent need of intelligent and automated control. Industrial fish farming is becoming much more common at home and abroad. However, for scientific management, processing, and sales during fish culture [11], automatic identification of fish species has become a research hotspot. At present, domestic reports on fish species identification are rare. Existing classification is either mechanical or manual. Manual classification causes eye fatigue and is inefficient, and the workload is huge when large numbers of fish must be classified [4]-[7]. Mechanical classification improves work efficiency [3] but causes enormous damage to the fish [8]. In view of these shortcomings, this paper presents a fish recognition method based on image processing and statistical analysis. It relies mainly on image processing techniques and MATLAB software to process and analyze fish images [1], uses principal component analysis to reduce the dimension of the features [9]-[15], and then establishes a mathematical model based on the Fisher discriminant method and Mahalanobis distance to classify four types of fish and verify the result. The results show that this technology has great economic and application value.

II. IMAGE ACQUISITION AND PREPROCESSING

The images are true color JPG images of four fish, namely chub, crucian, bream, and carp, obtained with a digital camera. Figure 1 shows the crucian and carp. The images were transferred to a computer by cable, and MATLAB software was used to analyze them and gather their basic information. Image preprocessing aims to improve image quality and the accuracy of image information so that the features of the fish can be extracted. The preprocessing consists of grayscale conversion, binarization, image enhancement, and contour extraction, and it prepares for the later extraction of color, texture, and shape features. Figure 2 shows the grayscale images of the carp and crucian.

Fig. 1 Carp and crucian

Fig. 2 Carp and crucian’s grayscale images

Image binarization sets the gray value of each point in the grayscale image to 0 or 255 [4], so that the whole image shows only black and white [2] and the boundary of the fish body contour contrasts better with its surroundings. Therefore, to further highlight the contour features of interest, the grayscale image is binarized, as shown in figure 3.

978-1-4799-3979-4/14/$31.00 ©2014 IEEE

Proceedings of 2014 IEEE International Conference on Mechatronics and Automation

August 3 - 6, Tianjin, China

Fig. 3 Carp's gray image and binary image
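
The grayscale and binarization steps can be sketched in a few lines. The paper performs them in MATLAB and does not state the conversion weights or the threshold, so this Python sketch assumes the common luma weights 0.299, 0.587, 0.114 and a fixed threshold of 128. The tiny image and the function names `to_gray` and `binarize` are illustrative only.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray values."""
    # Assumed weights: the paper does not say which conversion it used.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Set every pixel to 0 (below threshold) or 255 (at or above it)."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray_image]


# Illustrative 2 x 3 image: dark "fish" pixels on a bright background.
image = [
    [(250, 250, 250), (40, 50, 60), (245, 245, 245)],
    [(30, 30, 30), (35, 45, 55), (240, 250, 240)],
]
gray = to_gray(image)
binary = binarize(gray)
print(binary)
```

A fixed threshold is the simplest choice; an automatically selected threshold would be a natural replacement when lighting varies between images.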

Figure 4 shows the enhanced carp image: the image background is white and the fish body becomes entirely black. Enhancement aims to eliminate interference mixed into the image, strengthen the image features, and improve the contrast between target and background. Image enhancement uses an adaptive median filtering technique [13]. Figure 4 also shows the rough outline of the image, which highlights the edge line of the fish and is convenient for the subsequent feature extraction.

Fig.4 Carp’s image enhancement and outline
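
A minimal sketch of enhancement followed by outline extraction is given here. It substitutes a plain 3 x 3 median filter for the adaptive median filter cited above, and it marks as outline every object pixel that has a 4-connected background neighbour. Both choices, the toy mask, and the names `median_filter_3x3` and `contour` are assumptions made for illustration, not the paper's MATLAB procedure.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3 x 3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]          # border pixels are copied unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def contour(mask):
    """Mark object pixels (value 1) that touch the background or the border."""
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if any(not (0 <= ny < h and 0 <= nx < w) or mask[ny][nx] == 0
                   for ny, nx in neighbours):
                edge[y][x] = 1
    return edge


noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],   # the centre 0 is an isolated noise pixel in the object
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
clean = median_filter_3x3(noisy)
outline = contour(clean)
```

On an object this small the filter also rounds off the corners, which is one reason an adaptive window can be preferable on real images.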

III. FEATURE EXTRACTION OF FISH

The quality of the feature parameters directly affects the final recognition result [5]-[6]. This paper analyzes the color, morphological, and texture features of the fish; the color, shape, and texture features are extracted from the color image, the binary image, and the texture image, respectively.

A. Color components extraction

Color is an intrinsic feature of an image. Compared with other visual features [2], color features are insensitive to changes in image size, orientation, and viewing angle, and they are stable and highly robust. Methods for extracting color features include color moments, color histograms, and color correlation methods. Here the color histogram is used to extract color features, and Table 1 lists the mean and variance statistics of the color components of the four fish species.

variety            mean                      variance
            R        G        B        R       G       B
crucian   157.42   167.66   166.82   14.41   15.72   16.28
carp      169.87   160.11   125.55   12.24   12.40   14.33
chub      134.67   130.55   112.90   13.23   13.40   15.23
bream     160.90   156.89   158.32   14.12   15.20   14.33

Table 1 Color components of the four fish species
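
Per-channel statistics of the kind listed in Table 1 can be computed as sketched below. The paper does not say whether a population or a sample variance was used, so the population variance is assumed here; the pixel values and the name `channel_stats` are illustrative.

```python
from statistics import mean, pvariance


def channel_stats(pixels):
    """Return ((mean_R, mean_G, mean_B), (var_R, var_G, var_B))."""
    channels = list(zip(*pixels))   # three tuples: all R, all G, all B values
    means = tuple(mean(c) for c in channels)
    variances = tuple(pvariance(c) for c in channels)   # assumed: population variance
    return means, variances


# Illustrative pixels taken from the fish region of one image.
fish_pixels = [(150, 160, 170), (160, 170, 160), (170, 180, 150)]
means, variances = channel_stats(fish_pixels)
print(means, variances)
```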

Computing a color histogram requires dividing the color space into several small intervals, each of which becomes one bin of the histogram; this process is called color quantization. The color histogram is obtained by counting the pixels whose color falls in each interval. The horizontal axis shows the normalized color values, and the vertical axis shows the number of image pixels in each small color range. Histogram statistics of each color component of the fish are given below for two color space models, RGB and HSV, in figure 5 and figure 6 respectively.

Fig. 5 RGB color component histogram

Fig. 6 HSV color component histogram
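
Color quantization and histogram counting can be sketched as follows. The bin counts (8 for an RGB channel, 6 for hue), the sample pixels, and the name `color_histogram` are assumptions chosen for illustration; the HSV conversion uses Python's standard `colorsys` module rather than MATLAB.

```python
import colorsys


def color_histogram(values, bins=8, max_value=256):
    """Quantize values in [0, max_value) into equal bins and normalize counts."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins / max_value), bins - 1)] += 1
    total = len(values)
    return [c / total for c in counts]


pixels = [(200, 30, 30), (210, 40, 35), (20, 40, 220), (25, 30, 230)]

# Histogram of the red component in RGB space.
red_hist = color_histogram([r for r, _, _ in pixels])

# Histogram of the hue component in HSV space; colorsys expects floats in [0, 1].
hues = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] for r, g, b in pixels]
hue_hist = color_histogram(hues, bins=6, max_value=1.0)
```

Each histogram sums to 1, so images of different sizes can be compared directly.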

B. The shape feature extraction

This paper extracts the area, length, and width of the fish by counting image pixels. Area is a convenient measure of overall size [9]; it depends only on the boundary of the object and not on changes in internal gray level. The area used here is a relative area, expressed as the total number of image pixels belonging to the fish. The real area of the fish can be converted from the pixel area, and using the pixel area does not affect the correlation analysis. The formula is:

$$
\text{area} = \sum_{x=1}^{N} \sum_{y=1}^{M} f(x, y), \qquad
f(x, y) = \begin{cases} 1, & (x, y) \in R \\ 0, & (x, y) \notin R \end{cases}
\tag{1}
$$

Here N × M is the size of the two-dimensional image, R is the target region, and the sum of f(x, y) is the total number of pixels in the region. Table 2 lists the mean and variance of the body length-width ratio of the four fish, obtained by scanning the rough outline in MATLAB. Bream has the smallest length-width ratio, followed by crucian, while carp and chub have larger ratios. The main reason is that the bream body is generally flat and relatively wide, so its length-width ratio is the smallest.

variety   mean   variance
crucian   2.96   0.03
carp      3.53   0.04
chub      3.76   0.02
bream     2.62   0.01

Table 2 Length-width ratio of the four fish varieties
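
The pixel-count area of equation (1) and a length-width ratio can be sketched from a binary mask as shown below. The paper does not describe how length and width were measured from the outline, so this sketch assumes a horizontally oriented fish and uses the bounding box of the mask; the mask and function names are illustrative.

```python
def area(mask):
    """Equation (1): count the pixels with f(x, y) = 1, i.e. inside region R."""
    return sum(sum(row) for row in mask)


def length_width_ratio(mask):
    """Bounding-box length (horizontal extent) divided by width (vertical extent)."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    length = cols[-1] - cols[0] + 1
    width = rows[-1] - rows[0] + 1
    return length / width


mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(area(mask), length_width_ratio(mask))
```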

C. Texture feature extraction

The gray level co-occurrence matrix is a commonly used method that describes texture by studying the spatial correlation of gray levels [1]. This study therefore computes the gray level co-occurrence matrix from the grayscale target image and then derives texture features from it. To better describe the co-occurrence matrix texture of the fish in the experiment, five scalar parameters were selected: energy (ASM), contrast (CON), correlation (COR), entropy (ENT), and homogeneity (IDM). The formulas are as follows:

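$$
\begin{aligned}
ASM &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)^{2}, \\
CON &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-j)^{2}\, p(i,j), \\
COR &= \frac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-\bar{x})(j-\bar{y})\, p(i,j)}{\sigma_x \sigma_y}, \\
ENT &= -\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j) \lg p(i,j), \\
IDM &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \frac{p(i,j)}{1+(i-j)^{2}}
\end{aligned}
\tag{2}
$$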

In the correlation COR,

$$
\begin{aligned}
\bar{x} &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} i\, p(i,j), &
\bar{y} &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} j\, p(i,j), \\
\sigma_x^{2} &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-\bar{x})^{2}\, p(i,j), &
\sigma_y^{2} &= \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (j-\bar{y})^{2}\, p(i,j)
\end{aligned}
\tag{3}
$$

In these formulas, p(i, j) is the normalized gray level co-occurrence matrix, i and j are the gray levels of a pixel pair, and G is the number of gray levels in the image. Table 3 shows that, for the four fish species, energy discriminates strongly and can identify species directly, whereas entropy, correlation, and homogeneity differ less clearly. To distinguish the fish varieties better, a large amount of data must be collected to improve the identification rate.

variety   contrast   correlation   entropy   homogeneity   energy
crucian   28.9517    3.3762        0.3390    3.1269        4.6285
carp       7.8415    3.7982        0.2022    3.4063        3.6022
chub       6.7340    3.5781        0.2322    3.2314        4.1520
bream      6.3903    3.4760        0.3473    3.5341        3.4561

Table 3 Texture features of the four fish species
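
The five texture parameters can be computed from a normalized co-occurrence matrix as sketched below. The paper does not state the pixel offset, the number of gray levels, or any scaling applied to the values in Table 3, so this sketch assumes a horizontal offset of one pixel, four gray levels, and a base-10 logarithm for "lg". The toy image and function names are illustrative.

```python
from math import log10, sqrt


def glcm(gray, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix p(i, j) for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(gray), len(gray[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[gray[y][x]][gray[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """ASM, CON, COR, ENT and IDM of a normalized co-occurrence matrix."""
    g = range(len(p))
    cells = [(i, j, p[i][j]) for i in g for j in g]
    asm = sum(v * v for _, _, v in cells)
    con = sum((i - j) ** 2 * v for i, j, v in cells)
    ent = -sum(v * log10(v) for _, _, v in cells if v > 0)   # skip empty cells
    idm = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mx = sum(i * v for i, _, v in cells)
    my = sum(j * v for _, j, v in cells)
    sx = sqrt(sum((i - mx) ** 2 * v for i, _, v in cells))
    sy = sqrt(sum((j - my) ** 2 * v for _, j, v in cells))
    cor = sum((i - mx) * (j - my) * v for i, j, v in cells) / (sx * sy)
    return {"ASM": asm, "CON": con, "COR": cor, "ENT": ent, "IDM": idm}


gray = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(gray, levels=4))
print(features)
```

In practice the matrix is often averaged over several directions so that the features do not depend on the orientation of the fish.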

IV. FISH SPECIES IDENTIFICATION

A. Principal component analysis

After the feature extraction described above, and following the methods of earlier research, the data dimension is reduced by compression to speed up computation. Principal component analysis is a branch of multivariate statistical analysis. Based on the theory of random vectors, it uses an orthogonal transformation to convert correlated random variables into new, uncorrelated ones, with variance serving as the measure of information content, which reduces the dimension of the new random vector; an appropriate cost function is then constructed for further transformation of the system. This paper uses principal component analysis for data dimension reduction [12] and then combines it with Fisher discriminant analysis, building the discriminant function from the feature parameters after dimension reduction. The improved principal component analysis reduces the 11 extracted feature values to four new variable factors. The main flow chart of principal component analysis is shown in figure 7.

Fig. 7 Principal component analysis flow chart

Figure 8 shows the scree plot. The vertical axis shows the eigenvalues and the horizontal axis the component number [10]. The line flattens from the fourth factor onward, and the first three eigenvalues are clearly larger, so four principal components are chosen. The cumulative contribution rate of the first four principal components reaches 89.73%, which indicates a good fit for classification, so the first four principal components are used to build the discriminant function. The component matrix (initial factor loading matrix) obtained from the principal component analysis by SPSS software simulation is given in table 4.

Fig. 8 Scree plot

Table 4 Component matrix
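
How a cumulative contribution rate drives the choice of components can be sketched as follows. The paper runs PCA in SPSS on 11 features; this standard-library sketch instead uses three illustrative features, finds eigenvalues by power iteration with deflation, and selects components with an assumed threshold of 0.85. The data are chosen so that the covariance matrix is diagonal and the result can be checked by hand.

```python
def covariance_matrix(data):
    """Sample covariance matrix of rows of observations."""
    n, d = len(data), len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(d)]
    centered = [[row[k] - means[k] for k in range(d)] for row in data]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def multiply(m, v):
    return [sum(m_ab * v_b for m_ab, v_b in zip(row, v)) for row in m]


def eigenpairs(matrix, iterations=500):
    """Eigenpairs of a symmetric matrix, largest eigenvalue first,
    found by power iteration with deflation."""
    d = len(matrix)
    m = [row[:] for row in matrix]
    pairs = []
    for _ in range(d):
        v = [1.0 / (k + 1) for k in range(d)]        # fixed starting vector
        for _ in range(iterations):
            w = multiply(m, v)
            norm = sum(x * x for x in w) ** 0.5
            if norm < 1e-12:                          # nothing left to extract
                break
            v = [x / norm for x in w]
        value = sum(a * b for a, b in zip(v, multiply(m, v)))
        pairs.append((value, v))
        # Deflation: subtract the component just found.
        m = [[m[a][b] - value * v[a] * v[b] for b in range(d)] for a in range(d)]
    return pairs


def components_needed(eigenvalues, threshold):
    """Smallest number of components whose cumulative contribution reaches threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, value in enumerate(eigenvalues, start=1):
        running += value
        if running / total >= threshold:
            return k, running / total
    return len(eigenvalues), 1.0


data = [[1, 0, 0], [-1, 0, 0], [0, 2, 0], [0, -2, 0], [0, 0, 3], [0, 0, -3]]
eigenvalues = [value for value, _ in eigenpairs(covariance_matrix(data))]
k, cumulative = components_needed(eigenvalues, threshold=0.85)
print(k, round(cumulative, 4))
```

The eigenvalues here are 3.6, 1.6, and 0.4, so two components reach a cumulative contribution of 5.2/5.6; the same rule applied to the paper's 11 features is what yields four components at 89.73%.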

B. Fisher discriminant

Multivariate discriminant analysis, an important tool in this paper for classifying fish types, is a statistical method for determining which category a sample belongs to. When the classification of the research objects is known, discriminant analysis derives one or a set of discriminant functions from the sample data and specifies a discriminant rule that assigns samples of unknown category while keeping misclassification to a minimum. Fisher's discriminant applies the extremum principle in its derivation. It makes no assumption about the probability distribution and places no special requirement on the distribution type of the populations; as long as second-order moments exist, the discriminant accuracy is relatively high. This paper therefore selects the Fisher criterion to build the fish species identification model. The Fisher discriminant method was put forward in the 1930s [9]. Its principle is to make the within-class variance as small as possible and the gap between class means as large as possible; that is, the discriminant function is established by maximizing the between-class distance and minimizing the within-class dispersion, and the discriminant function is then used for minimum distance classification. For the classification of two or more populations, the Fisher discriminant algorithm can be described as follows. Consider k populations G1, G2, ..., Gk with the same covariance matrix and mean vectors u1, u2, ..., uk, where each sample is described by p variables. For an arbitrary sample X, deciding which population X belongs to uses the linear discriminant function of X:

$$
F(X) = b^{T} X = b_1 x_1 + b_2 x_2 + \cdots + b_p x_p
\tag{4}
$$

where b is the vector of discriminant coefficients, $b = (b_1, b_2, \ldots, b_p)$. After dimension reduction, the four principal components are denoted A, B, C, and D, and the values of b are estimated from the processed data.

b = [coefficient matrix with sixteen numeric entries; the values are illegible in the source] (5)

The eigenvalue λ reflects the overall discriminating ability of a discriminant function, so the discriminating ability of each function can be calculated from the value of its characteristic root. The discriminant functions are as follows:

$$
\begin{aligned}
F_1 &= 0.3426A' + 0.6751B' + 0.5981C' + 0.2556D' \\
F_2 &= -0.0439A' - 0.0337B' + 0.7718C' - 0.6270D'
\end{aligned}
\tag{6}
$$

As the formulas show, the Fisher discriminant itself has no explicit decision rule, so the classification rule can be chosen according to the actual situation. This paper uses distance discrimination to decide which class each sample belongs to.
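
The idea of a linear discriminant function can be illustrated with the two-class, two-variable case, where the coefficients have the closed form b = Sw^-1 (ma - mb), with Sw the pooled within-class scatter matrix. This is a simplified sketch, not the four-class SPSS model of the paper; the sample points and function names are hypothetical.

```python
def mean_vector(samples):
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(2)]


def scatter(samples, m):
    """2 x 2 within-class scatter matrix of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in samples:
        d = [x[0] - m[0], x[1] - m[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def fisher_direction(class_a, class_b):
    """Coefficients b = Sw^-1 (ma - mb) for two classes in two dimensions."""
    ma, mb = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def discriminant(b, x):
    """Equation (4): F(X) = b^T X."""
    return sum(bk * xk for bk, xk in zip(b, x))


class_a = [(4, 2), (5, 3), (6, 2), (5, 1)]
class_b = [(1, 5), (2, 6), (3, 5), (2, 4)]
b = fisher_direction(class_a, class_b)
scores_a = [discriminant(b, x) for x in class_a]
scores_b = [discriminant(b, x) for x in class_b]
print(b, scores_a, scores_b)
```

The scores of the two classes do not overlap along the direction b, but the function alone does not say where to cut, which is why a separate decision rule is still needed.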

C. Mahalanobis distance

The Mahalanobis distance, proposed by the Indian statistician Mahalanobis, is a covariance-based distance between data [9] and an effective way to calculate the similarity of two unknown sample sets. The Mahalanobis distance discriminant method assumes that each population is normal. Its advantages are that the Mahalanobis distance is independent of the units and scale of the variables, it requires only the population characteristics (mean and variance) rather than the type of the population distribution, and it is simple and gives clear conclusions. In contrast to the Euclidean distance, it takes the relations between the features into account and is independent of scale. For a multivariate vector x with mean μ and covariance matrix Σ, the Mahalanobis distance is:


$$
d(x) = \left( (x-\mu)' \, \Sigma^{-1} (x-\mu) \right)^{1/2}
\tag{7}
$$

The distance discriminant rule computes the distance from a sample to each population and assigns the sample to the population at minimum distance, that is, to the nearest class. Discriminant rule: compute the Mahalanobis distance from the sample point to the mean vector of each class [14]. Let D_a be the distance from x to class A, D_b the distance from x to class B, and D_c the distance from x to class C, where each class has its own mean vector and covariance matrix S. The formulas are:

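$$
\begin{aligned}
D_a &= (X - \bar{X}_A)' \, S_A^{-1} (X - \bar{X}_A) \\
D_b &= (X - \bar{X}_B)' \, S_B^{-1} (X - \bar{X}_B) \\
D_c &= (X - \bar{X}_C)' \, S_C^{-1} (X - \bar{X}_C)
\end{aligned}
\tag{8}
$$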
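
The minimum-distance rule can be sketched for two-dimensional data as follows. The class means and covariance matrices are made-up values, and the closed-form 2 x 2 inverse keeps the sketch within the standard library; the paper's model works on four principal components in SPSS.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)' cov^-1 (x - mean) for 2-D data."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [x[0] - mean[0], x[1] - mean[1]]
    return (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
            + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))


def classify(x, classes):
    """Assign x to the class with the smallest Mahalanobis distance to its mean."""
    return min(classes, key=lambda name: mahalanobis_sq(x, *classes[name]))


# Hypothetical classes: (mean vector, covariance matrix).
classes = {
    "A": ([0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]),
    "B": ([6.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]),
}
x = [3.5, 0.0]
label = classify(x, classes)
print(label, mahalanobis_sq(x, *classes["A"]), mahalanobis_sq(x, *classes["B"]))
```

The point is closer to the mean of B in Euclidean terms (2.5 against 3.5), yet it is assigned to A because class A has a larger variance along that axis, which shows how the covariance matrix removes the effect of scale.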

The sample data transformed by the Fisher method are substituted into these formulas, and each sample is assigned to the class whose population is nearest. The final test sample output is shown in table 5:

PCA+Fisher+Mahalanobis distance

variety   sample   identification   accuracy rate
crucian   15       14               93.33%
carp      15       15               100.00%
chub      15       14               93.33%
bream     15       15               100.00%

Table 5 Recognition rate
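
The rates in Table 5 follow directly from the counts. The short check below recomputes them, taking the number of correctly identified carp and bream as 15, which is what their 100.00% rates imply for 15 samples each.

```python
# (samples, correctly identified) per variety, as read from Table 5.
results = {"crucian": (15, 14), "carp": (15, 15), "chub": (15, 14), "bream": (15, 15)}

rates = {name: 100 * correct / total for name, (total, correct) in results.items()}
average = sum(rates.values()) / len(rates)

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"average: {average:.2f}%")
```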

The results show that the combination of principal component analysis, Mahalanobis distance, and the Fisher discriminant method achieves an average discrimination rate of 96.67%. The combined method is highly credible and can be used for discriminant classification of fish species. However, the number of learning samples in the test is limited, so the number of feature values in the standard library is limited and does not capture complete feature information, which reduces the discrimination rate.

V. CONCLUSION

Image processing is used to obtain the basic image information of the fish and to extract color, shape, and texture features. Using statistical analysis, principal component analysis reduces the dimension of the feature data, and the Fisher discriminant and Mahalanobis distance criterion are then used to identify the fish species. The test results show that the method identifies fish categories effectively, with an average recognition rate of 96.67%. The method can identify the four types of fish rapidly and accurately without damaging the fish body. It saves labor and has strong practical application value. This nondestructive identification technology is objective, rapid, and accurate; more importantly, it does not damage the object being identified and works in real time, which improves work efficiency. The test samples in this paper cover only four common inland freshwater fish: chub, bream, carp, and crucian. Other common freshwater fish, such as grass carp, were not studied, so the range of test samples is incomplete. The next step is therefore to study other kinds of freshwater fish and extract their image features to establish a more robust freshwater fish species identification system. In addition, the freshwater fish images were acquired under static conditions. Image processing under dynamic conditions needs further study in order to achieve rapid, online variety identification of freshwater fish and automatic grading by size, and so meet the requirements of practical production.

REFERENCES

[1] Yang Dan, Zhao Haibin, Long Zhe, MATLAB Image Processing Example Explanation. Beijing: Tsinghua University Press, 2013.

[2] Zhang Changjiang, Digital Image Processing and its Application. Beijing: Tsinghua University Press, 2013.

[3] D. J. White, C. Svellingen, "Automated Measurement of Species and Length of Fish by Computer Vision," Fisheries Research, vol. 80, no. 4, pp. 203-210, 2006.

[4] Zhang Zhiqiang, Niu Zhiyou, Zhao Siming, "Grading Method of Freshwater Fish Weight based on Computer Vision," Transactions of CSAE, vol. 27, no. 2, pp. 350-354, 2011.

[5] Ying Yibin, Fu Feng, "Color Transformation Model of Fruit Image in Process of Non-destructive Quality Inspection based on Machine Vision," Transactions of CSAM, vol. 35, no. 1, pp. 85-89, 2004.

[6] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," IJCV, vol. 60, no. 2, pp. 91-110, 2004.

[7] D. R. Fernandez, A. A. Agostinho, and L. M. Bini, "Selection of an Experimental Fish Ladder Located at the Dam of the Itaipu Binacional, Paraná River, Brazil," Brazilian Archives of Biology and Technology, vol. 47, no. 4, pp. 579-586, 2004.

[8] M. Bowen, S. Marques, L. Silva, V. Vono, and H. Godinho, "Comparing on Site Human and Video Counts at Igarapava Fish Ladder, Southeastern Brazil," Neotropical Ichthyology, vol. 4, pp. 291-294, 2006.

[9] Feng Jiayi, "Research on Inspection Technology of Vegetable's Fresh Levels based on Computer Vision," Nanjing: Nanjing Agricultural University, 2012.

[10] Lv Zhentong, Zhang Lingyun, SPSS Statistical Analysis and Application. Beijing: Mechanical Industry Press, 2009.

[11] Frank Storbeck, Berent Daan, "Fish Species Recognition Using Computer Vision," Fisheries Research, vol. 51, no. 1, pp. 11-15, 2001.

[12] Fan Jie, "Numerical Implementation Algorithm of the Principal Component Analysis," Journal of Hebei University of Engineering: Natural Science Edition, vol. 24, no. 4, pp. 103-105, 2007.

[13] Yue Zhenjun, Qiu Wangcheng, Liu Chunlin, "An Adaptive Multiple Target Image Segmentation Method," Chinese Journal of Image and Graphics, vol. 9, no. 6, pp. 674-678, 2004.

[14] T. Sziranyi, J. Zerubia, L. Czuni, "Image Segmentation Using Markov Random Field Model in Fully Parallel Cellular Network Architectures," Real-Time Imaging, vol. 6, no. 6, pp. 195-211, 2000.

[15] Y. Ke and R. Sukthankar, "PCA-SIFT: A More Distinctive Representation for Local Image Descriptors," in IEEE CVPR, pp. 506-513, 2004.

[16] Wang Na, Wang Keru, "Maize Leaf Disease Identification Based on Fisher Discrimination Analysis," Scientia Agricultura Sinica, vol. 11, no. 42, pp. 3836-3842, 2009.

[17] Chai Ali, Li Baoju, "Recognition of Tomato Foliage Disease Based on Computer Vision Technology," Acta Horticulturae Sinica, vol. 9, no. 37, pp. 1423-1430, 2010.