Protein crystallization image analysis ICCBM-2013

Post on 15-Apr-2017


Madhav Sigdel

Computer Science PhD Student

University of Alabama in Huntsville

14th International Conference on the Crystallization of Biological Macromolecules

9/27/2012

Overview

Protein crystallography

Protein crystallization

Crystallization trials

Scoring of crystallization trials

Image acquisition

Image classification

Protein Image Samples

Image 1 Image 2 Image 3

Image 4 Image 5 Image 6

Protein Crystallization Phases (Hampton Research)

1. Clear Drop
2. Phase Separation
3. Regular Granular Precipitate
4. Birefringent Precipitate or Microcrystals
5. Posettes and Spherulites
6. Needle Crystals (1D Growth)
7. Plate Crystals (2D Growth)
8. Single Crystals (3D Growth < 0.2 mm)
9. Single Crystals (3D Growth > 0.2 mm)

General Approach

Apply image processing techniques to extract features
Apply data mining techniques for classification

Image processing
  Region of interest (drop boundary) detection
  Implementation of complex algorithms for edge detection
    Hough transform
    Canny edge detection
  Geometric and texture features

Distributed computing to speed up the process
  Feature extraction is computationally expensive

Related Works

By number of categories:

Binary classification: [Xioqung 2004], [Takahashi 2005], [Ming 2008], [Roy Liu 2008]
  Distinguishes between crystal and non-crystal classes only

Multiclass classification: [Kanako Saitoh 2006], [Christian A 2010]
  Reported accuracy is very low for some classes

A variety of classification methods have been applied

Our Approach

Low-cost, in-house assembled system for image acquisition
Trace fluorescent labeling of protein
Intensity and simple geometric features for image processing
Classification into 3 categories
  Non-crystals
  Likely leads
  Crystals

Image Acquisition System

~30 minutes to collect images from 3-celled 96-well plate (288 images)

Image Categories

Image category   Grouping of Hampton categories

Non-crystals     1. Clear Drop
                 2. Phase Separation
                 3. Regular Granular Precipitate

Likely leads     4. Birefringent Precipitate or Microcrystals
                 *. Unclear bright regions

Crystals         5. Posettes or Spherulites
                 6. Needles (1D Growth)
                 7. Plates (2D Growth)
                 8. Single Crystals (3D Growth < 0.2 mm)
                 9. Single Crystals (3D Growth > 0.2 mm)
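The grouping above can be written down as a small lookup table. This is only an illustrative sketch: the names `HAMPTON_TO_CATEGORY` and `category_for` are hypothetical, and the "unclear bright regions" case has no Hampton score, so it would have to be labeled as a likely lead directly.

```python
# Hypothetical lookup from Hampton score (1-9) to the three categories
# used in this talk. Names are illustrative, not from the original slides.
HAMPTON_TO_CATEGORY = {
    1: "non-crystal",   # Clear Drop
    2: "non-crystal",   # Phase Separation
    3: "non-crystal",   # Regular Granular Precipitate
    4: "likely lead",   # Birefringent Precipitate or Microcrystals
    5: "crystal",       # Posettes or Spherulites
    6: "crystal",       # Needles (1D Growth)
    7: "crystal",       # Plates (2D Growth)
    8: "crystal",       # Single Crystals (3D Growth < 0.2 mm)
    9: "crystal",       # Single Crystals (3D Growth > 0.2 mm)
}


def category_for(score):
    """Return the 3-way category for a Hampton score, or raise ValueError."""
    if score not in HAMPTON_TO_CATEGORY:
        raise ValueError(f"not a Hampton score: {score!r}")
    return HAMPTON_TO_CATEGORY[score]
```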

Non-crystal Images

Clear drops Regular precipitates

Likely Leads

Granular precipitate / Microcrystals

Unclear bright regions

Crystals

Image Preprocessing

Image size reduction
Median filter

Thresholding techniques
  Otsu threshold: select the threshold intensity which maximizes inter-class variance and minimizes intra-class variance
  Dynamic thresholding I: select the 90th percentile intensity of the green component as the threshold
  Dynamic thresholding II: select the maximum intensity of the green component as the threshold
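The three thresholding rules can be sketched in plain Python as below. This is a minimal illustration, not the code used for the talk. It assumes the image has already been reduced to a flat list of 8-bit intensities (for example the green channel), uses a nearest-rank percentile, and treats pixels equal to the maximum as bright for the max-intensity rule; all function names are hypothetical. Because total variance is fixed, maximizing the between-class variance is equivalent to minimizing the within-class variance, so Otsu only needs the former.

```python
def otsu_threshold(pixels, levels=256):
    """Return the intensity t that maximizes between-class variance.

    pixels: flat list of integer intensities in [0, levels).
    Pixels with intensity > t form the bright class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0     # number of pixels with intensity <= t
    sum0 = 0   # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance.
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def percentile_threshold(pixels, q=90):
    """Nearest-rank q-th percentile (q is an integer in 1..100)."""
    ordered = sorted(pixels)
    rank = max(1, -(-q * len(ordered) // 100))  # integer ceiling
    return ordered[rank - 1]


def max_threshold(pixels):
    """Maximum intensity; only pixels equal to it count as bright."""
    return max(pixels)


def binarize(pixels, threshold, inclusive=False):
    """Return a 0/1 list; inclusive=True keeps pixels equal to threshold."""
    if inclusive:
        return [1 if p >= threshold else 0 for p in pixels]
    return [1 if p > threshold else 0 for p in pixels]
```

With the max-intensity rule, `binarize(pixels, max_threshold(pixels), inclusive=True)` is needed, since no pixel is strictly brighter than the maximum.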

Otsu Threshold

Images 1 to 3: original images

Image 4 = Otsu(Image 1), Image 5 = Otsu(Image 2), Image 6 = Otsu(Image 3)

Thresholding Techniques Comparison

Image 1: Original image

Image 2: Otsu thresholding

Image 3: 90th percentile threshold

Image 4: Max green threshold

Intensity Features

Background region in the original image

Image 1: Original image resized (Img1)

Image 2: Thresholded image (Img2)

Image 3: Img1 AND Img2

Image 4: Img1 AND complement of Img2

Intensity features

Threshold intensity
Bright pixel count (n)
Average intensity in bright region
Standard deviation of intensity in bright region
Average intensity in dark region
Standard deviation of intensity in dark region
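The six intensity features listed above can be computed in one pass over the pixels once a threshold is chosen. The sketch below is an assumption-laden illustration: it takes a flat list of intensities, treats pixels strictly above the threshold as the bright region, uses the population standard deviation, and returns 0.0 for an empty region. The function and key names are hypothetical.

```python
from statistics import fmean, pstdev


def intensity_features(pixels, threshold):
    """Return the threshold plus count/mean/std of bright and dark regions."""
    bright = [p for p in pixels if p > threshold]
    dark = [p for p in pixels if p <= threshold]

    def mean_std(values):
        # An empty region has no statistics; 0.0 is an arbitrary fallback.
        if not values:
            return 0.0, 0.0
        return fmean(values), pstdev(values)

    mean_bright, std_bright = mean_std(bright)
    mean_dark, std_dark = mean_std(dark)
    return {
        "threshold": threshold,
        "bright_count": len(bright),
        "mean_bright": mean_bright,
        "std_bright": std_bright,
        "mean_dark": mean_dark,
        "std_dark": std_dark,
    }
```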

Region/Blob Features

Image 1: Original image Image 2 = Binary(Image1) Image 3 = Skeleton(Image2)

Image 4: Showing the connected regions in different colors

Largest Blob (R1) R2 R3 R4

Extracted blobs

Region/Blob Features

Number of blobs

Let R1 denote the largest blob:
  Area(R1)
  Boundary pixel count in R1
  Fullness: number of white pixels in R1 / Area(R1)
  Measure of boundary smoothness of R1
  Variance of boundary smoothness of R1
  Measure of symmetry of R1 along the X- and Y-axes

Let R2, R3, R4 and R5 be the 4 largest blobs excluding R1:
  Average area
  Average fullness
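A few of the blob features can be illustrated with a simple connected-component pass over a binary image. The sketch below is not the authors' implementation and makes several assumptions that the slides do not state: 8-connectivity for blobs, a boundary pixel defined as one with a 4-neighbor outside the blob, and fullness computed against the bounding-box area of the blob. All names are hypothetical.

```python
from collections import deque


def label_blobs(mask):
    """Return blobs (lists of (row, col)) sorted from largest to smallest.

    mask: 2-D list of 0/1 values; 8-connectivity is an assumption.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, blob = deque([(r, c)]), []
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                blob.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            blobs.append(blob)
    blobs.sort(key=len, reverse=True)
    return blobs


def blob_features(blob):
    """Pixel count, boundary pixel count and bounding-box fullness."""
    pixels = set(blob)
    ys = [y for y, _ in blob]
    xs = [x for _, x in blob]
    box_area = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    boundary = sum(
        1 for y, x in blob
        if any((y + dy, x + dx) not in pixels for dy, dx in neighbors)
    )
    return {
        "pixel_count": len(blob),
        "boundary_pixels": boundary,
        "fullness": len(blob) / box_area,  # assumption: box area as Area
    }
```

Here `label_blobs(mask)[0]` plays the role of R1, and the next four entries correspond to R2 through R5 when the image has that many blobs.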

Dataset

Category        No. of images   Percentage
Non-crystals    1514            67.3%
Likely leads    404             18.0%
Crystals        332             14.8%
Total images    2250

Experimental Results

Confusion matrix (rows: actual class; columns: observed class)

Actual class      Non-crystals   Likely leads   Crystals   Actual total   Accuracy
Non-crystals      1467           43             4          1514           96.9%
Likely leads      42             317            45         404            78.5%
Crystals          7              68             257        332            77.4%
Observed total    1516           428            306        2250           90.7%
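The accuracy figures in the confusion matrix follow directly from the counts: per-class accuracy is the diagonal entry divided by its row total, and overall accuracy is the diagonal sum divided by the number of images. A short check, with hypothetical function names:

```python
# Rows: actual class; columns: observed class.
# Order: non-crystals, likely leads, crystals.
CONFUSION = [
    [1467, 43, 4],
    [42, 317, 45],
    [7, 68, 257],
]


def per_class_accuracy(matrix):
    """Fraction of each actual class that was assigned its own label."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Fraction of all images on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def observed_totals(matrix):
    """Column sums: how many images were assigned to each class."""
    return [sum(col) for col in zip(*matrix)]
```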

Classifier: Multilayer Perceptron Neural Network

Testing: 10-fold cross validation
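For reference, 10-fold cross validation splits the dataset into ten folds and tests on each fold in turn while training on the other nine. The sketch below only generates the index splits. It does not shuffle or stratify, since the slides do not say how the folds were formed, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop
```

With the 2250 images in the dataset, each fold holds 225 test images and 2025 training images.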

Other Classification Techniques

Max class ensemble method
  Uses multiple classifiers with different feature combinations
  Assigned class is the maximum predicted class of all the classifiers
  Decreases false negatives but increases false positives

Exhaustive binary classifiers
  Solves the multiclass problem using all possible binary classifiers
  For 3 classes, the number of binary classifiers is 6

Overall accuracy around 82%
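The max class rule can be sketched as follows. The integer encoding (0 for non-crystals, 1 for likely leads, 2 for crystals) is an assumption made here so that "maximum" has a defined order; the slides do not give the encoding. Under that ordering, a single classifier voting "crystal" is enough to assign the crystal label, which is consistent with fewer missed crystals at the cost of more false alarms.

```python
# Assumed ordering: higher index means more crystal-like.
CLASSES = ["non-crystals", "likely leads", "crystals"]


def max_class(predictions):
    """Return the highest class index predicted by any classifier."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return max(predictions)


def max_class_label(predictions):
    """Same as max_class, but returns the class name."""
    return CLASSES[max_class(predictions)]
```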

Future Work

Classify the crystals according to crystal morphology
Track temporal evolution of the crystals
Extract other relevant image features and improve accuracy

Summary

Intensity is shown to be a simple but useful search parameter to identify crystals
Efficient image processing (3 sec/image)
Classification into 3 categories: non-crystals, likely crystals and clear crystals
Accuracy comparable with other systems

Acknowledgement

Coworkers
  Salma Begum
  Marc L Pusey
  Ramazan Aygun

iExpressGenes inc.