


Video Magnification in Presence of Large Motions

Mohamed Elgharib¹, Mohamed Hefeeda¹, Frédo Durand², William T. Freeman²

¹Qatar Computing Research Institute. ²MIT CSAIL.

Figure 1 panels: Original, Lagrangian, Eulerian, DVMAG (this paper); spatio-temporal slices plot y against time.

Figure 1: Comparison with the state of the art. The Region of Interest is the dashed yellow region (top left). For each magnification we show the spatio-temporal slice for the green line (top left). For easier comparison, all slices are temporally stabilized. This sequence shows an eye moving along the horizontal direction. Processing the sequence with our DVMAG technique shows that the iris wobbles as the eye moves (spatio-temporal slice). Such wobbling is too small to be observed in the original sequence (top left). The global motion of the eye causes significant blurring artifacts when processed with the Eulerian approach [2]. The sensitivity of the Lagrangian approach [1] to motion errors generates noisy magnification (see dashed blue).

The world is full of small temporal variations that are hard to see with the naked eye. Variations in skin color occur as blood circulates [3], structures sway imperceptibly in the wind [2], and human heads wobble with each heartbeat. While usually too small to notice, such variations can be magnified computationally to reveal a fascinating and meaningful world of small motions [1, 2, 3]. Current video magnification approaches assume that the objects of interest have very small motion. However, many interesting deformations occur within or because of larger motion. For example, our skin deforms subtly when we make large body motions. A toll gate that closes exhibits tiny vibrations in addition to the large rotational motion. And microsaccades are often combined with large-scale eye movements (Fig. 1). Furthermore, cameras or objects might be handheld and may not be perfectly still, and a standard video magnification technique will amplify handshake in addition to the motion of interest. When applied to videos that contain large motions, current magnification techniques produce large artifacts such as haloes or ripples, and the small motion remains hard to see because it is overshadowed by the magnified large motion and its artifacts (Fig. 1).
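The core idea of computational magnification, as in the linear Eulerian approach [3], is to temporally filter each pixel, amplify the filtered band, and add it back. A minimal one-dimensional sketch of that idea (an ideal FFT band-pass stands in for the paper's temporal filters; all names here are ours, not from the paper):

```python
import numpy as np

def magnify_1d(frames, alpha, lo, hi, fps):
    """Linear Eulerian-style magnification of a 1-D scanline over time:
    temporally band-pass each pixel, scale the band by alpha, add it back."""
    T = frames.shape[0]                            # frames: (T, N) array
    freqs = np.fft.rfftfreq(T, d=1.0 / fps)
    band = (freqs >= lo) & (freqs <= hi)           # ideal temporal band-pass
    F = np.fft.rfft(frames, axis=0)
    subtle = np.fft.irfft(F * band[:, None], n=T, axis=0)  # small variation
    return frames + alpha * subtle                 # amplified output

# Tiny demo: a constant scanline plus a barely visible 2 Hz flicker.
fps, T, N = 30, 90, 8
t = np.arange(T) / fps
frames = 0.5 + 0.01 * np.sin(2 * np.pi * 2.0 * t)[:, None] * np.ones((1, N))
out = magnify_1d(frames, alpha=10.0, lo=1.0, hi=3.0, fps=fps)
# The flicker ends up roughly 11x larger (original + 10x amplified band).
print(np.ptp(out[:, 0]) / np.ptp(frames[:, 0]))
```

This sketch is exactly the setting where the single-motion assumption holds; adding a large global shift to `frames` would corrupt the band-pass output, which is the failure mode discussed next.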

In the special case of camera motion, it might be possible to apply video stabilization as a preprocess to remove the undesirable large handshake before magnification. However, this approach does not work for general object motion, and even in the case of camera shake, one has to be extremely careful because any small mistake in video stabilization will be amplified by the magnification step. This problem is especially challenging at the boundary between a moving object, such as an arm, and its background, where multiple motions are present: the large motion, the subtle deformation to be amplified, and the background motion. Current video magnification methods, such as the linear Eulerian [3] and phase-based [2] algorithms, assume that there is locally a single motion. This generates a background-dragging effect around object boundaries. In addition, the Lagrangian approach [1] is sensitive to motion errors, which generates noisy magnifications.

This is an extended abstract. The full paper is available at the Computer Vision Foundation webpage.

Figure 2: A frame from a synthetic sequence with mixed small and large motions (left), and its magnification using different techniques (Reference Frame, Original, Ground-truth, Lagrangian, Youtube, After Effects, Eulerian, DVMAG; slices plot y against time). We zoom on the blue spatio-temporal slice (see left). Our approach, DVMAG, best resembles ground-truth and does not generate blurring artifacts as the other techniques do.

Figure 3: Left: SSIM with ground-truth, per frame, for the sequence of Fig. 2 (the larger the SSIM, the better); DVMAG outperforms all examined techniques (Lagrangian, Youtube, After Effects, Eulerian). Right: SSIM with ground-truth over different amplification factors; our approach handles large amplifications with fewer magnification artifacts than all other techniques.

This paper presents a video magnification technique capable of handling small motions within large ones. Our technique has two main components: (1) warping to discount large motion, and (2) layer-based magnification. Users select a region of interest (ROI) they want to magnify (Fig. 1, dashed yellow). The warping stage removes large motions while preserving small ones, without introducing artifacts that could be magnified. For this, we use feature point tracking and optical flow, with regularized low-order parametric models for the large-scale motion. Our layer-based magnification decomposes an image into a foreground and a background through an opacity matte. We magnify each layer and generate a magnified sequence through matte inversion. We use texture synthesis to fill in image holes revealed by the magnified motion. Finally, we de-warp the magnified sequence back to the original space-time coordinates.
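The layer-based step above can be sketched as matte-based decomposition and recomposition. A toy 1-D sketch under our own minimal names (`magnify` stands in for any per-layer amplification step, and `bg_clean` for the texture-synthesized, hole-filled background):

```python
import numpy as np

def composite(fg, bg, matte):
    """Alpha-composite a foreground layer over a background through an opacity matte."""
    return matte * fg + (1.0 - matte) * bg

def layer_magnify(frame, matte, bg_clean, magnify):
    """Pull the foreground through the matte, magnify the layer and matte
    together, then composite over a clean (hole-filled) background."""
    fg = matte * frame                           # foreground layer via the matte
    fg_mag, matte_mag = magnify(fg), magnify(matte)
    return composite(fg_mag, bg_clean, np.clip(matte_mag, 0.0, 1.0))

# Toy demo: a two-pixel foreground blob shifted by an (already amplified)
# one-pixel displacement; the background it uncovers is pre-filled with 0.5.
frame = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
matte = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
bg_clean = np.full(6, 0.5)
out = layer_magnify(frame, matte, bg_clean, lambda x: np.roll(x, 1))
print(out)  # [0.5 0.5 0.5 1.  1.  0.5]
```

Magnifying the matte together with the foreground is what avoids the background-dragging effect: pixels the foreground vacates are taken from the filled background rather than smeared from the boundary.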

Fig. 2 shows a synthetic sequence that mixes small and large motions. A small vibrating local motion is first added to the white circle (Fig. 2, left), and then a large global motion is added to the entire sequence. We process the sequence using different magnification techniques, and our technique, DVMAG, best resembles ground-truth. This is also reflected numerically in Fig. 3 (left) through SSIM. Fig. 3 (right) examines how different amplification factors are handled by the magnification techniques. Results show that DVMAG handles larger amplifications with fewer errors than all other techniques. In addition, DVMAG has the slowest rate of degradation: for instance, in the range α = 0 to 40, the slopes of the Youtube, Eulerian, and Lagrangian curves are steeper than that of DVMAG.
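The SSIM scores in Fig. 3 can be reproduced with any standard implementation (e.g. scikit-image's `structural_similarity`). As a self-contained reference, a minimal global (single-window) variant of the index, which is a simplification of the usual sliding-window SSIM and not the paper's exact evaluation code:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Global (single-window) SSIM between two images: compares means,
    variances, and covariance with the standard stabilizing constants."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

img = np.random.default_rng(0).random((32, 32))
print(ssim_global(img, img))                            # identical -> ~1.0
print(ssim_global(img, np.clip(img + 0.2, 0, 1)))       # degraded -> below 1
```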

[1] Ce Liu, Antonio Torralba, William T. Freeman, Frédo Durand, and Edward H. Adelson. Motion magnification. SIGGRAPH, 24(3):519–526, 2005.

[2] Neal Wadhwa, Michael Rubinstein, Frédo Durand, and William T. Freeman. Phase-based video motion processing. SIGGRAPH, 32(4), 2013.

[3] Hao-Yu Wu, Michael Rubinstein, Eugene Shih, John Guttag, Frédo Durand, and William T. Freeman. Eulerian video magnification for revealing subtle changes in the world. SIGGRAPH, 31(4):65:1–65:8, 2012.