Dewarped Minds United. Progress Report from Bozeman: Simulator Estimating eye motion Dewarping...
-
date post
21-Dec-2015 -
Progress Report from Bozeman:
• Simulator
• Estimating eye motion
• Dewarping
• Montaging
• Mosaicing
Simulator
• Simulates translational eye motion, including saccades, and generates the corresponding AOSLO videos
1. Start with a BIG IMAGE
2. Model the raster motion r(t)
3. Model the eye motion x(t)
4. Record from the BIG IMAGE pixel by pixel according to r(t) + x(t)
• Validates Motion Estimation
1. Tracking saccadic eye motions involves more than just estimating translational transforms.
2. Saccadic eye motions induce spurious rotational estimates.
3. Using small patches induces spurious vertical motion and rotation estimates.
• Simulates rotational eye motion and generates videos
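The four simulator steps listed above (BIG IMAGE, raster motion r(t), eye motion x(t), pixel-by-pixel recording at r(t) + x(t)) can be sketched as follows. This is a minimal illustration for translational motion only, not the actual simulator: the function name, the frame size, the `origin` offset, and the nearest-pixel sampling are all assumptions made for the example.

```python
import numpy as np

def simulate_frame(big_image, eye_motion, rows=32, cols=32, origin=(50, 50)):
    """Record one frame from big_image pixel by pixel along r(t) + x(t).

    eye_motion(t) returns the (dy, dx) eye displacement in pixels at time t,
    with t running from 0 to 1 over the frame (hypothetical convention).
    """
    frame = np.zeros((rows, cols), dtype=big_image.dtype)
    n = rows * cols
    for k in range(n):
        t = k / n                          # time at which pixel k is recorded
        r_row, r_col = divmod(k, cols)     # raster position r(t)
        dy, dx = eye_motion(t)             # eye position x(t)
        y = int(round(origin[0] + r_row + dy))
        x = int(round(origin[1] + r_col + dx))
        # clamp to the BIG IMAGE so the sketch never indexes out of bounds
        y = min(max(y, 0), big_image.shape[0] - 1)
        x = min(max(x, 0), big_image.shape[1] - 1)
        frame[r_row, r_col] = big_image[y, x]
    return frame

big = np.arange(200 * 200, dtype=float).reshape(200, 200)
static = simulate_frame(big, lambda t: (0.0, 0.0))      # no eye motion
drifting = simulate_frame(big, lambda t: (0.0, 8.0 * t))  # slow horizontal drift
```

With no eye motion the frame is an undistorted crop of the BIG IMAGE; with drift, later rows are sampled progressively further to the right, which is the warping that the rest of the report tries to undo.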
Validation: Tracking through a pure translational saccade.
1. This suggests that non-translational transforms are needed to accurately track real saccadic eye movements.
2. Saccadic eye motions induce spurious rotational estimates.
3. Using really small patches induces spurious vertical motion and rotation estimates during
periods of pure drift: 4 patches/frame, 16 patches/frame …
What’s the “optimal” patch size so that spurious rotations are not induced?
ANSWER: it depends on the length of the patch wrt frame size
Optimal Patch Size for 480x512 simulated images
Patch Length    Patch Height    Patches/Frame
.4*512          96              5
.6*512          64              7
.8*512          48              10
512             40              12
Rule of Thumb: To prevent spurious rotational estimates, each patch must cover about 8% of the total image
1/num_patches * patchlength/cols ≈ patchheight/rows * patchlength/cols ≈ .08
The rule held for:
• Simulated data with pure drift or drift with high frequency oscillations
• Different quantizations of rotation transform (1/8 to 1/32 degree)
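As a quick arithmetic check, the rule of thumb can be evaluated against the four patch sizes in the table for 480x512 images. The snippet below is only a worked calculation; the variable and function names are chosen for the example.

```python
ROWS, COLS = 480, 512

# (patch length as a fraction of COLS, patch height in rows, patches per frame)
TABLE = [(0.4, 96, 5), (0.6, 64, 7), (0.8, 48, 10), (1.0, 40, 12)]

def patch_coverage(length_frac, patch_height, rows=ROWS):
    """Fraction of the image covered by one patch: height/rows * length/cols."""
    return (patch_height / rows) * length_frac

# Coverage from the patch height, and from the patch count (1/num_patches).
coverages = [patch_coverage(l, h) for l, h, _ in TABLE]
count_based = [l / n for l, _, n in TABLE]

for (l, h, n), by_height, by_count in zip(TABLE, coverages, count_based):
    print(f"length={l:.1f}*{COLS} height={h:3d} patches={n:2d} "
          f"height-based={by_height:.4f} count-based={by_count:.4f}")
```

Both forms of the product stay close to .08 for every row of the table; they differ slightly when the patches do not tile the frame height exactly (for example 7 patches of height 64 cover 448 of 480 rows).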
Even for “optimal” patch size, still get spurious rotations at a saccade
Estimating Eye Motion
• Three Motion Types: transforms are estimated …
1. Differential: between subsequent frames, so the reference frame is continually changing
2. Absolute: from a single, fixed reference frame
3. Compromise: from a dynamically changing reference frame, where a reference frame change occurs due to some criterion, such as a low correlation
• Filtering: remove motion estimates with low correlation (then interpolate)
• Aligning: properly align motion tracks after each reference frame change
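The filtering step above (drop low-correlation estimates, then interpolate) could look like the following for one motion component. This is a sketch under assumptions: the function name is made up for the example, linear interpolation via `np.interp` is one arbitrary choice, and the default threshold of .3 is simply the value quoted elsewhere in this report.

```python
import numpy as np

def filter_and_interpolate(times, motion, corr, min_corr=0.3):
    """Replace motion estimates whose correlation is below min_corr
    by linear interpolation between the surviving estimates."""
    times = np.asarray(times, dtype=float)
    motion = np.asarray(motion, dtype=float)
    corr = np.asarray(corr, dtype=float)
    keep = corr >= min_corr
    if keep.sum() < 2:
        raise ValueError("need at least two reliable estimates to interpolate")
    # np.interp evaluates the piecewise-linear track through the kept samples
    filled = np.interp(times, times[keep], motion[keep])
    return filled, keep

# Toy track: the third estimate is an outlier with a low correlation.
filled, keep = filter_and_interpolate(
    times=[0, 1, 2, 3, 4],
    motion=[0.0, 1.0, 100.0, 3.0, 4.0],
    corr=[0.9, 0.8, 0.1, 0.7, 0.9],
)
```

In the toy track the outlier is replaced by the value halfway between its two reliable neighbours, while the kept estimates pass through unchanged.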
1. Differential Motion
PROS: Yields motion estimates with high image-to-image correlations AND reliably detects saccades
CONS: Not possible to determine absolute motion by “integration” or least squares methods (even when adding penalty methods or an AR1 model for correlations) due to additive error.
LESSON LEARNED: Aligning after each reference frame change contributes some error to all subsequent motion estimates
POSSIBLE SOLUTION: Kalman filtering?
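The additive-error problem with differential motion can be illustrated numerically. The sketch below assumes, purely for illustration, that each frame-to-frame estimate carries independent zero-mean Gaussian error of standard deviation `sigma`; the drift rate, noise level, and trial counts are made-up values. Under that assumption the error of the integrated (cumulatively summed) track is a random walk whose RMS grows roughly like sigma * sqrt(number of frames).

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_frames, sigma = 2000, 400, 0.1   # hypothetical values

# True frame-to-frame motion: a constant drift of 0.5 pixels per frame.
true_steps = np.full(n_frames, 0.5)

# Differential estimates = true step + independent estimation error.
estimated_steps = true_steps + rng.normal(0.0, sigma, size=(n_trials, n_frames))

# "Integrating" the differential estimates gives the absolute track;
# its error is the running sum of all per-step errors so far.
abs_error = np.cumsum(estimated_steps, axis=1) - np.cumsum(true_steps)

# RMS absolute error across trials, as a function of frame index.
rms_error = np.sqrt(np.mean(abs_error ** 2, axis=0))
print(rms_error[99], rms_error[399])
```

Quadrupling the number of integrated frames roughly doubles the RMS error, even though every individual differential estimate is equally accurate.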
2. Absolute Motion from a single reference frame
PROS: No error due to reference frame changes
CONS: Correlations drop as a function of distance from the reference frame
(the second frame shows a filtered motion track, corr<.3,dropping 65% of motion estimates)
3. Compromise: dynamically change the reference frame whenever the correlation drops below some threshold for some proportion of the patches between the current reference frame and the current frame in the video.
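One way to express the reference-change criterion of the compromise method is sketched below. The threshold, the proportion of patches, the function names, and the toy correlation model (correlation falling off linearly with distance from the reference frame) are all assumptions for the example, not values from this report.

```python
import numpy as np

def needs_new_reference(patch_corr, min_corr=0.3, max_bad_fraction=0.5):
    """True when at least max_bad_fraction of the patch correlations
    between the reference frame and the current frame are below min_corr."""
    patch_corr = np.asarray(patch_corr, dtype=float)
    bad_fraction = np.mean(patch_corr < min_corr)
    return bool(bad_fraction >= max_bad_fraction)

def track_references(n_frames, patch_corr_fn):
    """Walk through the video and record which reference frame is used
    for each frame. patch_corr_fn(ref, frame) returns the patch correlations."""
    ref = 0
    refs = []
    for frame in range(n_frames):
        if needs_new_reference(patch_corr_fn(ref, frame)):
            ref = frame            # the current frame becomes the new reference
        refs.append(ref)
    return refs

# Toy model: 4 patches per frame, correlation decays with distance from the reference.
toy_corr = lambda ref, frame: np.full(4, 0.65 - 0.1 * (frame - ref))
refs = track_references(10, toy_corr)
print(refs)
```

The list of reference indices doubles as a cluster label for each frame, which is the grouping the montaging and mosaicing steps rely on.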
Estimating Eye Motion: Filtering and Aligning
(filtering out all corr<.3 dropped about 5% of the motion estimates)
Reference Frame changes are not always necessary … (for sk_v15cropped, the reference frame never changes)
Or changing reference frames is a necessity … such as when estimating frame to frame motion from a video which pans about the
retina (ARDISK) …
Dewarping images
• Interpolate between motion estimates (nearest neighbor, linear, cubic, splines) to get a motion estimate at each pixel of an image
• Guess the motion during which the very first reference frame was recorded (extrapolation
based on an average of the “first set” of motion estimates)
• Interpolate in 2-D to create a dewarped image after each pixel in a frame has been moved according to the corresponding motion
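A simplified version of the dewarping steps above is sketched here. It interpolates the motion estimates linearly to every pixel time, moves each pixel by its motion, and then rebuilds a regular image by averaging the pixels that land in each grid cell. That last step is a nearest-neighbor accumulation standing in for true 2-D interpolation, and the function name, the 0-to-1 time convention, and the sign convention (a pixel recorded at raster position r while the eye is displaced by x belongs at r + x) are assumptions of the example.

```python
import numpy as np

def dewarp(frame, est_times, est_dy, est_dx):
    """Move every pixel of frame according to the motion at its recording time."""
    rows, cols = frame.shape
    k = np.arange(rows * cols)
    t = k / (rows * cols)                   # recording time of each pixel, 0..1
    dy = np.interp(t, est_times, est_dy)    # motion estimate at each pixel
    dx = np.interp(t, est_times, est_dx)
    r, c = np.divmod(k, cols)               # raster position of each pixel
    yy = np.rint(r + dy).astype(int)        # where the pixel belongs
    xx = np.rint(c + dx).astype(int)
    inside = (yy >= 0) & (yy < rows) & (xx >= 0) & (xx < cols)

    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    # np.add.at accumulates correctly when several pixels hit the same cell
    np.add.at(total, (yy[inside], xx[inside]), frame.ravel()[inside])
    np.add.at(count, (yy[inside], xx[inside]), 1)

    out = np.full((rows, cols), np.nan)     # NaN marks cells no pixel reached
    hit = count > 0
    out[hit] = total[hit] / count[hit]
    return out

frame = np.arange(36, dtype=float).reshape(6, 6)
same = dewarp(frame, [0.0, 1.0], [0.0, 0.0], [0.0, 0.0])      # no motion
shifted = dewarp(frame, [0.0, 1.0], [0.0, 0.0], [2.0, 2.0])   # constant offset
```

Cells that receive no pixel are left as NaN so that a later averaging step can ignore them.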
Montaging
1. Estimate the motion for a sequence of frames
2. Cluster the frames according to reference frame
3. Dewarp each frame in a (subset of a) cluster; choose which frames to use, e.g. by staying away from saccades
4. Average the dewarped frames together via 2-D interpolation (Curt’s method, voronoi method)
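The averaging in step 4 can be illustrated with a plain per-pixel mean over the dewarped frames that ignores missing pixels. This is not Curt’s method or the Voronoi method named above, only a minimal stand-in; it assumes the dewarped frames already share one pixel grid and mark unfilled pixels with NaN.

```python
import numpy as np

def montage(dewarped_frames):
    """Average dewarped frames pixel by pixel, skipping NaN (unfilled) pixels."""
    stack = np.stack(dewarped_frames).astype(float)
    valid = ~np.isnan(stack)
    count = valid.sum(axis=0)                       # frames contributing per pixel
    total = np.where(valid, stack, 0.0).sum(axis=0)
    out = np.full(count.shape, np.nan)              # NaN where no frame contributes
    seen = count > 0
    out[seen] = total[seen] / count[seen]
    return out

a = np.array([[1.0, np.nan], [3.0, np.nan]])
b = np.array([[3.0, 4.0], [np.nan, np.nan]])
result = montage([a, b])
```

Averaging many aligned frames is what reduces the noise: pixels seen in several frames are averaged, pixels seen once are copied, and pixels never seen stay empty.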
POINT: a montage is a de-noised retinal image
Mosaicing
1. Estimate the motion for a sequence of frames
2. Cluster the frames according to reference frame
3. Dewarp each frame in a (subset of a) cluster; choose which frames to use, e.g. by staying away from saccades
4. Three ways to build the mosaic, by adding select:
   1. “raw” frames (e.g. a cluster representative)
   2. dewarped frames
   3. montages
5. The current mosaic is the “reference frame” when adding another image to the mosaic
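The bookkeeping for growing a mosaic can be sketched as a canvas that keeps a running sum and a count per pixel. The sketch assumes the placement of each addition (its `top`, `left` offset relative to the current mosaic) has already been estimated and fits inside the canvas; estimating that placement against the mosaic is the hard part and is not shown. The class and its method names are hypothetical.

```python
import numpy as np

class Mosaic:
    """Canvas that averages overlapping additions (raw frames, dewarped
    frames, or montages) placed at known offsets."""

    def __init__(self, rows, cols):
        self.total = np.zeros((rows, cols))
        self.count = np.zeros((rows, cols), dtype=int)

    def add(self, image, top, left):
        """Add image with its upper-left corner at (top, left); NaNs are skipped."""
        h, w = image.shape
        valid = ~np.isnan(image)
        self.total[top:top + h, left:left + w] += np.where(valid, image, 0.0)
        self.count[top:top + h, left:left + w] += valid

    def image(self):
        """Current mosaic; NaN where nothing has been added yet."""
        out = np.full(self.total.shape, np.nan)
        seen = self.count > 0
        out[seen] = self.total[seen] / self.count[seen]
        return out

m = Mosaic(4, 6)
m.add(np.ones((2, 3)), top=0, left=0)
m.add(3 * np.ones((2, 3)), top=1, left=2)   # overlaps the first addition at (1, 2)
img = m.image()
```

Because every later addition is registered against `image()`, one wrongly placed addition contaminates the reference for everything that follows.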
Mosaicing Difficulties:
1. It is problematic to dewarp the additions (wrt the mosaic), especially when only a small part of the mosaic and the addition overlap. Thus, a mosaic appears to lack the detail which the individual montages have.
2. How to choose “select frames” or “select montages”?
3. How does one assure that additions to the mosaic are placed correctly? If an addition is placed incorrectly, all subsequent additions are referenced to an incorrect mosaic.
How to select representative frames or montages?
cluster    mincorr   meancorr  maxcorr   wt
1.0000     0.2786    0.5046    0.6234    33.0000
2.0000     0.3220    0.5281    0.6384    26.0000
3.0000     0.2021    0.3421    0.4792    2.0000
4.0000     0.2245    0.3449    0.5354    20.0000
5.0000     0.1844    0.4736    0.6107    61.0000
6.0000     0.2426    0.4620    0.6123    55.0000
7.0000     0.2075    0.4409    0.5836    30.0000
8.0000     0.2767    0.4409    0.5990    21.0000
9.0000     0.2476    0.4037    0.5224    9.0000
10.0000    0.1728    0.4648    0.6511    46.0000
11.0000    0.1933    0.4272    0.5890    20.0000
12.0000    0.3358    0.4467    0.5993    7.0000
13.0000    0.1227    0.3353    0.5690    5.0000
14.0000    0.2304    0.4442    0.6002    38.0000
15.0000    0.1432    0.3871    0.5722    16.0000
16.0000    0.3027    0.4357    0.5298    9.0000
17.0000    0.1939    0.4308    0.5204    31.0000
18.0000    0.1687    0.4237    0.5688    7.0000
19.0000    0.2083    0.3950    0.5190    10.0000
20.0000    0.1636    0.3695    0.4908    9.0000
21.0000    0.1810    0.3579    0.4689    16.0000
22.0000    0.2630    0.3809    0.4726    12.0000
23.0000    0.1414    0.3733    0.4919    31.0000
24.0000    0.1489    0.3602    0.4862    44.0000
25.0000    0.3223    0.3807    0.4396    10.0000
26.0000    0         0         0         0
27.0000    0.1709    0.3617    0.4677    12.0000
28.0000    0.1753    0.3465    0.4773    24.0000
29.0000    0.1995    0.2995    0.4202    1.0000
30.0000    0.1644    0.3462    0.4697    9.0000
31.0000    0.1953    0.3485    0.5034    14.0000
32.0000    0.0724    0.3729    0.5069    21.0000
33.0000    0.2101    0.4464    0.5524    8.0000
34.0000    0.3594    0.4614    0.5521    9.0000
A mosaic from raw cluster reps and dewarped cluster reps …
CONS: still, incorrectly placed frames (although not as many)
Why are there incorrectly placed images into the mosaic?
(comparing an incorrect mosaic with a correct one)