IP Numerical Apps

Consider the following image, on 3 bits (values between 0 and 7) and the origin (0,0) underlined:

0 6 0 1 3
5 5 1 6 2
0 7 3 2 6
7 3 4 5 5
5 3 5 2 4

1. Compute the binomial 3x3 filter and the result of the filtered image for the pixel at location (2,2).

2. Compute, for the same location (2,2), the result of filtering the image with the 3x3 median filter.

3. Same question, but for the following weighted median filter:

2 3 2
3 5 3
2 3 2

4. Same question for the following L filter: {1,3,4,6,9,6,4,3,1}

5. For the same location, compute the result of filtering the image with the conditional mean filter having the following parameters: L=3, th=2.5

6. The image from before is rotated by 30 degrees around its center. Write the composed transformation matrix. Compute the value of the rotated image at location (1,2) using bilinear interpolation.

7. Compute the Sobel gradient magnitude and its direction for the central pixel of the above image.

8. Using the k-means algorithm, compute an optimal threshold and calculate the corresponding binary image for the following input image:

6 2 2 1 3
6 6 1 2 2
6 7 3 2 2
7 3 4 5 5
1 3 5 6 7

ANSWERS:

1. The binomial 3x3 filter has the following mask, scaled by 1/16:

1 2 1
2 4 2
1 2 1

Location (2,2) is the center of the image, and the result of filtering is given by the 2D convolution operation. The neighborhood considered is:

5 1 6
7 3 2
3 4 5
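The weighted sum for this mask and neighborhood can be cross-checked with a minimal Python sketch. The variable names are illustrative and not part of the original exercise; rounding half up and clamping to 0-7 are assumptions about how the result is kept on 3 bits.

```python
# 3x3 binomial mask: outer product of [1, 2, 1] with itself.
row = [1, 2, 1]
mask = [[a * b for b in row] for a in row]
norm = sum(sum(r) for r in mask)  # normalisation factor (sum of mask entries)

# 3x3 neighbourhood of pixel (2,2), copied from the text.
neigh = [[5, 1, 6],
         [7, 3, 2],
         [3, 4, 5]]

# Element-wise products summed over the window (the mask is symmetric,
# so flipping it for a true convolution would change nothing).
weighted_sum = sum(mask[i][j] * neigh[i][j] for i in range(3) for j in range(3))
value = weighted_sum / norm

# Assumption: round half up, then clamp to the 3-bit range 0..7.
result = min(7, max(0, int(value + 0.5)))
print(weighted_sum, norm, value, result)
```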

The result is: (1*5+2*1+1*6+2*7+4*3+2*2+1*3+2*4+1*5)/16 = (5+2+6+14+12+4+3+8+5)/16 = 59/16 = 3.69
If the result remains on 3 bits and in the same range, we round to the nearest integer in the 0-7 range, so the final result is 4.

2. The median filter constructs the sequence of values in the specified neighborhood. The sequence is sorted and the result is the median value (the value at the center of the sorted sequence).
The neighborhood:

5 1 6
7 3 2
3 4 5
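The median procedure just described can be sketched in a few lines of Python. The function name `median_filter_value` is illustrative, not from the original, and the sketch assumes an odd number of samples in the window.

```python
def median_filter_value(neigh):
    """Return the median of all values in a square neighbourhood."""
    values = sorted(v for row in neigh for v in row)
    # With an odd number of samples the median is the middle element.
    return values[len(values) // 2]

# 3x3 neighbourhood of pixel (2,2), copied from the text.
neigh = [[5, 1, 6],
         [7, 3, 2],
         [3, 4, 5]]

median = median_filter_value(neigh)
print(median)
```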

Unsorted sequence: 5,1,6,7,3,2,3,4,5
Sorted sequence: 1,2,3,3,4,5,5,6,7
Median value and final result: 4

3. The weighted median filter works the same way as the median filter, with the difference that the weights tell us how many times the corresponding image value is repeated in the sequence.
The neighborhood:

5 1 6
7 3 2
3 4 5
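The repetition rule for the weighted median can be illustrated with a short Python sketch. The function name `weighted_median` is hypothetical; the weights are the mask from question 3, and the sketch assumes positive integer weights with an odd total.

```python
def weighted_median(neigh, weights):
    """Median of the sequence in which each pixel is repeated 'weight' times."""
    seq = []
    for nrow, wrow in zip(neigh, weights):
        for value, weight in zip(nrow, wrow):
            seq.extend([value] * weight)  # repeat the pixel value 'weight' times
    seq.sort()
    return seq[len(seq) // 2]

# 3x3 neighbourhood of pixel (2,2) and the weight mask from question 3.
neigh = [[5, 1, 6],
         [7, 3, 2],
         [3, 4, 5]]
weights = [[2, 3, 2],
           [3, 5, 3],
           [2, 3, 2]]

wmedian = weighted_median(neigh, weights)
print(wmedian)
```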

Unsorted sequence: 5 (2 times), 1 (3 times), 6 (2 times), 7 (3 times), 3 (5 times), 2 (3 times), 3 (2 times), 4 (3 times), 5 (2 times) = 5,5,1,1,1,6,6,7,7,7,3,3,3,3,3,2,2,2,3,3,4,4,4,5,5
Sorted sequence: 1,1,1,2,2,2,3,3,3,3,3,3,3,4,4,4,5,5,5,5,6,6,7,7,7
Median value and final result: 3

4. The L filter constructs the sorted sequence, and the final result is the weighted mean using the given coefficients.
The sorted sequence (as computed for the median filter): 1,2,3,3,4,5,5,6,7
The coefficients: {1,3,4,6,9,6,4,3,1}
The weighted mean: (1*1+3*2+4*3+6*3+9*4+6*5+4*5+3*6+1*7)/(1+3+4+6+9+6+4+3+1) = (1+6+12+18+36+30+20+18+7)/37 = 148/37 = 4

5. The conditional mean filter considers an LxL neighborhood, and the final result is the mean of the pixel values that differ by less than th from the value at the location we are computing the result for. So only those pixels whose absolute difference from the central pixel is smaller than th are included in the average.
In our case the neighborhood is 3x3:

5 1 6
7 3 2
3 4 5
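The selection rule of the conditional mean filter can be sketched in Python as follows. The function name `conditional_mean` is illustrative; the strict inequality follows the wording "smaller than th".

```python
def conditional_mean(neigh, th):
    """Mean of the pixels whose absolute difference to the centre is < th."""
    k = len(neigh) // 2
    center = neigh[k][k]
    # Keep only the values close enough to the central pixel
    # (the centre itself always qualifies, so 'kept' is never empty).
    kept = [v for row in neigh for v in row if abs(v - center) < th]
    return sum(kept) / len(kept), kept

# 3x3 neighbourhood of pixel (2,2) and the threshold from question 5.
neigh = [[5, 1, 6],
         [7, 3, 2],
         [3, 4, 5]]

mean, kept = conditional_mean(neigh, th=2.5)
print(kept, mean, round(mean))
```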

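Knowing that th=2.5, the pixels considered for the mean computation are 5, 1, 3, 2, 3, 4, 5 (all except 6 and 7, which differ from the central value 3 by more than th). The final result is: (5+1+3+2+3+4+5)/7 = 23/7 = 3.29, rounded to 3.
The simple average would be: 36/9 = 4.

6. In order to perform a rotation around the center of the image, we need to move the center to the origin, which is done by a translation; then we perform the rotation, and finally we apply the inverse translation. The composed transformation is: T(dx,dy)*R(theta)*T(-dx,-dy), where T(-dx,-dy) is the translation that moves the center to the origin and T(dx,dy) is its inverse. Since we are talking about the center of the image, dx=dy=2.
In matrix form (homogeneous coordinates, c = cos(theta), s = sin(theta), assuming the standard rotation matrix, which reproduces the location (1.1,2.5) used in the interpolation step) we have:

[xout]   [1 0 2]   [c -s 0]   [1 0 -2]   [xin]
[yout] = [0 1 2] * [s  c 0] * [0 1 -2] * [yin]
[ 1  ]   [0 0 1]   [0  0 1]   [0 0  1]   [ 1 ]

Computing the matrix products for theta = 30 degrees (c = 0.866, s = 0.5) we get:

xout = 0.866*xin - 0.5*yin + 1.268
yout = 0.5*xin + 0.866*yin - 0.732

We know the location in the output image: yout = 2, xout = 1, and we solve the above system for xin and yin:

xin = 2 + 0.866*(xout-2) + 0.5*(yout-2) = 2 - 0.866 = 1.134 ~ 1.1
yin = 2 - 0.5*(xout-2) + 0.866*(yout-2) = 2 + 0.5 = 2.5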
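The composition T(dx,dy)*R(theta)*T(-dx,-dy) and the inverse mapping of the output location can be sketched in plain Python. The helper names are hypothetical, and the rotation matrix [[c, -s], [s, c]] is an assumption about the sign convention; it is the one that yields the input location (1.1, 2.5) quoted in the text.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]

def rotation(theta_deg):
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]  # assumed sign convention

def rotation_about(cx, cy, theta_deg):
    """T(cx,cy) * R(theta) * T(-cx,-cy): rotation around the point (cx, cy)."""
    return matmul(translation(cx, cy),
                  matmul(rotation(theta_deg), translation(-cx, -cy)))

def apply(m, x, y):
    """Apply a homogeneous 3x3 transform to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

forward = rotation_about(2, 2, 30)    # maps input to output coordinates
inverse = rotation_about(2, 2, -30)   # inverse of the forward transform
x_in, y_in = apply(inverse, 1, 2)     # output location (xout=1, yout=2)
print(forward, x_in, y_in)
```

Rotating by -30 degrees around the same center is used here as the inverse, which avoids a general matrix inversion.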

This means that the value of the rotated image at location (1,2) is the same as the value of the original image at location (1.1,2.5). The problem is that these are not integer coordinates, so we use interpolation to estimate a value at the non-integer location.
Bilinear interpolation considers the 4 neighbors to estimate a value at the non-integer location:

f(x,y) = a*b*f([x]+1,[y]+1) + (1-a)*b*f([x],[y]+1) + a*(1-b)*f([x]+1,[y]) + (1-a)*(1-b)*f([x],[y])

where [] is the integer part, a and b are the fractional parts of x and y respectively, and f is the image value.
In our case:

5 1
7 3
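The bilinear formula can be written as a small Python function. The argument names `f00`, `f10`, `f01`, `f11` are illustrative and stand for f([x],[y]), f([x]+1,[y]), f([x],[y]+1) and f([x]+1,[y]+1). The values are read from the 2x2 block above, assuming its top row holds f(1,3)=5 and f(2,3)=1 and its bottom row holds f(1,2)=7 and f(2,2)=3.

```python
def bilinear(f00, f10, f01, f11, a, b):
    """Bilinear interpolation; a and b are the fractional parts of x and y."""
    return (a * b * f11
            + (1 - a) * b * f01
            + a * (1 - b) * f10
            + (1 - a) * (1 - b) * f00)

# Location (1.1, 2.5): integer parts (1, 2), fractional parts a=0.1, b=0.5.
value = bilinear(f00=7, f10=3, f01=5, f11=1, a=0.1, b=0.5)
print(value, round(value))
```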

f(1.1,2.5) = 0.1*0.5*f(2,3) + 0.9*0.5*f(1,3) + 0.1*0.5*f(2,2) + 0.9*0.5*f(1,2) = 0.05*1 + 0.45*5 + 0.05*3 + 0.45*7 = 0.05+2.25+0.15+3.15 = 5.6, rounded up to 6 (if we consider the output to be in the same range as the input).

7. The horizontal (gx) and the vertical (gy) components of the Sobel gradient are obtained by convolving the image with the following two masks.
Mask for gx:

1 0 -1
2 0 -2
1 0 -1

Mask for gy:

1 2 1
0 0 0
-1 -2 -1

The neighborhood considered:

5 1 6
7 3 2
3 4 5
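The two Sobel components for this neighborhood can be computed with a minimal Python sketch. The names are illustrative. The masks are applied as element-wise products without flipping, and `math.atan2` is used for the direction; it agrees with atan(gy/gx) whenever gx is positive.

```python
import math

gx_mask = [[1, 0, -1],
           [2, 0, -2],
           [1, 0, -1]]
gy_mask = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]

# 3x3 neighbourhood of the central pixel, copied from the text.
neigh = [[5, 1, 6],
         [7, 3, 2],
         [3, 4, 5]]

def apply_mask(mask, window):
    """Sum of element-wise products of a 3x3 mask and a 3x3 window."""
    return sum(mask[i][j] * window[i][j] for i in range(3) for j in range(3))

gx = apply_mask(gx_mask, neigh)
gy = apply_mask(gy_mask, neigh)
magnitude = math.hypot(gx, gy)                       # sqrt(gx^2 + gy^2)
direction_deg = math.degrees(math.atan2(gy, gx))     # angle in degrees
print(gx, gy, magnitude, direction_deg)
```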

gx = 1*5+2*7+1*3-1*6-2*2-1*5 = 5+14+3-6-4-5 = 7
gy = 1*5+2*1+1*6-1*3-2*4-1*5 = 5+2+6-3-8-5 = -3
Sobel gradient magnitude: g = sqrt(gx*gx+gy*gy) = sqrt(49+9) = sqrt(58) = 7.6 ~ 8
Direction of the gradient: atan(gy/gx) = atan(-3/7) = ......

8. We start by choosing an arbitrary threshold in the image range 0-7, for example the starting threshold = 3.5 (the middle of the image range).
Original image:

7 2 2 1 3
7 6 1 2 2
6 7 3 2 2
7 3 4 5 7
1 3 7 6 7

We compute the means of the pixels having values bigger (mb) and smaller (ms) than the threshold.
Image pixels bigger than the threshold: 7,7,6,6,7,7,4,5,7,7,6,7 => mb = 76/12 = 6.33
Image pixels smaller than the threshold: 2,2,1,3,1,2,2,3,2,2,3,1,3 => ms = 27/13 = 2.08
The new threshold is the mean of mb and ms: th = (6.33+2.08)/2 = 4.2
We repeat these operations until the new threshold is equal to the old one (convergence condition); that value is the final threshold used to obtain the binary image. So:
th = 4.2
mb = (7+7+6+6+7+7+5+7+7+6+7)/11 = 72/11 = 6.55
ms = (2+2+1+3+1+2+2+3+2+2+3+4+1+3)/14 = 31/14 = 2.21
new th = (6.55+2.21)/2 = 4.38
Next step, using th = 4.38:
mb = (7+7+6+6+7+7+5+7+7+6+7)/11 = 6.55
ms = (2+2+1+3+1+2+2+3+2+2+3+4+1+3)/14 = 31/14 = 2.21
The new th = 4.38 is equal to the old one, so the algorithm stops here and we use 4.38 to obtain the binary image: if the pixel value > th, the pixel -> 1, otherwise the pixel -> 0. The binary image looks like this:

1 0 0 0 0
1 1 0 0 0
1 1 0 0 0
1 0 0 1 1
0 0 1 1 1
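The iteration in answer 8 can be sketched in Python as a two-class k-means on gray levels. The function name `kmeans_threshold`, the iteration cap and the convergence tolerance are assumptions made for illustration; the image is the one printed in this answer.

```python
def kmeans_threshold(pixels, th, max_iter=100, tol=1e-9):
    """Iterate th = (mean of pixels > th + mean of pixels <= th) / 2."""
    for _ in range(max_iter):
        big = [p for p in pixels if p > th]
        small = [p for p in pixels if p <= th]
        if not big or not small:
            return th  # one class is empty: nothing left to balance
        new_th = (sum(big) / len(big) + sum(small) / len(small)) / 2
        if abs(new_th - th) < tol:  # convergence: threshold no longer moves
            return new_th
        th = new_th
    return th

# Image as printed in answer 8.
image = [[7, 2, 2, 1, 3],
         [7, 6, 1, 2, 2],
         [6, 7, 3, 2, 2],
         [7, 3, 4, 5, 7],
         [1, 3, 7, 6, 7]]

pixels = [p for row in image for p in row]
threshold = kmeans_threshold(pixels, th=3.5)
binary = [[1 if p > threshold else 0 for p in row] for row in image]
print(threshold)
for row in binary:
    print(row)
```

Any starting threshold that leaves both classes non-empty can be used; the middle of the range, 3.5, matches the worked answer.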
