
TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking

Anonymous ICCV submission
Paper ID 80

In this supplementary material, we demonstrate additional examples collected by TurkerGaze of:

• gaze prediction error during image free viewing

• gaze prediction accuracy evaluated by a 33-point chart

• meanshift fixation detection results

• image saliency for fully annotated images

• center-bias-free saliency map for a panoramic image

• video saliency

[Figure 1 panels: for each of three examples, gaze position X and Y (pixels) plotted against time (ms), 0 to 2500 ms; legend: Eyelink, Ours.]

Figure 1: Gaze prediction error during image free viewing. On the top, the red curve shows the eye position recorded by the Eyelink eye tracker operating at 1000 Hz, and the green squares indicate the predictions of our system with a 30 Hz sample rate. On the bottom is the position error in the x and y directions over time.
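To make the comparison in Figure 1 concrete, the sketch below shows one way to align a dense 1000 Hz reference trace with sparse 30 Hz predictions and compute the signed per-axis error. This is a minimal illustration under our own assumptions about array layout, not the paper's code; the function name `per_axis_error` is ours.

```python
import numpy as np

def per_axis_error(t_ref, xy_ref, t_pred, xy_pred):
    """Signed x/y error of a sparse trace against a dense reference.

    t_ref:   (N,) timestamps in ms of the 1000 Hz Eyelink trace.
    xy_ref:  (N, 2) reference gaze positions in pixels.
    t_pred:  (M,) timestamps in ms of the 30 Hz predictions.
    xy_pred: (M, 2) predicted gaze positions in pixels.
    Returns an (M, 2) array of x/y error at each prediction timestamp.
    """
    # Interpolate the dense reference at the sparse prediction times so
    # the two traces can be compared sample for sample.
    ref_at_pred = np.column_stack([
        np.interp(t_pred, t_ref, xy_ref[:, 0]),
        np.interp(t_pred, t_ref, xy_ref[:, 1]),
    ])
    return xy_pred - ref_at_pred
```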


[Figure 2 panels: three plots over a 1024 x 768 screen (axes: Screen Width, Screen Height); dot scale encodes error in pixels, 0 to 100.]

Figure 2: Gaze prediction accuracy evaluated by a 33-point chart. The radius of each dot indicates the median error of the corresponding testing position. Each plot corresponds to one experiment for one subject.
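The per-point statistic behind Figure 2 can be computed as in the sketch below. This assumes each of the 33 targets has an array of gaze samples recorded while the subject fixated it; the matplotlib scatter call scales dot area as error squared so that dot radius grows linearly with median error.

```python
import numpy as np
import matplotlib.pyplot as plt

def median_errors(targets, gaze_samples):
    """targets: (33, 2) on-screen test positions in pixels.
    gaze_samples: list of 33 arrays, each (k_i, 2), recorded while the
    subject fixated the corresponding target.
    Returns (33,) median Euclidean error in pixels per target."""
    return np.array([
        np.median(np.linalg.norm(g - t, axis=1))
        for t, g in zip(targets, gaze_samples)
    ])

# Dot area grows with error^2, so dot *radius* scales with median error:
# err = median_errors(targets, gaze_samples)
# plt.scatter(targets[:, 0], targets[:, 1], s=err ** 2)
# plt.gca().invert_yaxis()  # screen coordinates: y grows downward
# plt.show()
```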

Figure 3: Meanshift fixation detection results on two Judd images. On the left is the noisy, subsampled gaze data used for testing. On the right are the detected fixations. Red circles indicate ground truth (detected in 240 Hz data using code provided by Judd et al.) and yellow crosses indicate fixations detected in the noisy 30 Hz data using our meanshift approach.
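For readers unfamiliar with mean shift, the following is a minimal, purely spatial variant applied to 2D gaze points. The bandwidth and merge tolerance are illustrative guesses rather than the paper's parameters, and the paper's detector may also use temporal information that this sketch omits.

```python
import numpy as np

def meanshift_fixations(points, bandwidth=40.0, n_iter=20, merge_tol=10.0):
    """Cluster noisy 30 Hz gaze samples into fixation centers.

    points:    (K, 2) gaze positions in pixels.
    bandwidth: flat-kernel radius in pixels (illustrative guess).
    merge_tol: modes closer than this are merged into one fixation.
    Returns an (F, 2) array of detected fixation centers.
    """
    modes = points.astype(float)
    for _ in range(n_iter):
        for i in range(len(modes)):
            # Shift each mode to the mean of the raw samples that fall
            # within the bandwidth (flat-kernel mean shift update).
            near = points[np.linalg.norm(points - modes[i], axis=1) < bandwidth]
            if len(near):
                modes[i] = near.mean(axis=0)
    # Collapse modes that converged to (nearly) the same location.
    centers = []
    for m in modes:
        if all(np.linalg.norm(m - c) > merge_tol for c in centers):
            centers.append(m)
    return np.array(centers)
```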


Figure 4: Examples of center-bias-free saliency maps for panoramic images collected by our system. We uniformly sampled overlapping views from the panorama, projecting each to a regular photo perspective, and collected gaze data for each projection. We then projected the saliency maps back onto the panorama and averaged saliency for overlapping areas. This gives a panoramic saliency map with no photographer bias.
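The back-projection-and-average step could be sketched as follows. Here `view_to_pano` is a hypothetical stand-in for the inverse of the perspective projection used to sample the views; its implementation is assumed rather than given, and `np.add.at` is used so that overlapping pixels accumulate correctly.

```python
import numpy as np

def panorama_saliency(pano_shape, view_saliencies, view_to_pano):
    """Average per-view saliency maps back onto the panorama.

    pano_shape:      (H, W) of the panorama saliency map.
    view_saliencies: list of 2D per-view saliency maps.
    view_to_pano:    hypothetical callable returning, for a view of a
        given shape, integer panorama row/column index arrays (the
        inverse of the perspective projection; not implemented here).
    """
    acc = np.zeros(pano_shape)
    cnt = np.zeros(pano_shape)
    for sal in view_saliencies:
        ys, xs = view_to_pano(sal.shape)
        # np.add.at accumulates correctly even when several view pixels
        # land on the same panorama pixel.
        np.add.at(acc, (ys, xs), sal)
        np.add.at(cnt, (ys, xs), 1)
    return acc / np.maximum(cnt, 1)  # average where views overlap
```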



Figure 5: Examples of the image saliency data obtained by our system. In a free-viewing task, users were shown selected images from the SUN database with fully annotated objects. From the raw eye tracking data, we used the proposed algorithm to estimate fixations and then computed the saliency map. This map can be used to evaluate the saliency of individual objects in the image. The 1st row: the original image; the 2nd row: the raw eye tracking data; the 3rd row: the fixations detected by the meanshift algorithm; the 4th row: the saliency map; the 5th row: the semantic object segments; the 6th row: salient objects.
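Scoring individual objects from the saliency map and the object masks could look like the sketch below. The assumption that label 0 denotes background, and the choice of mean saliency as the per-object score, are ours; the paper does not specify its exact scoring in this supplement.

```python
import numpy as np

def object_saliency(saliency_map, segment_labels):
    """Mean saliency inside each annotated object mask.

    saliency_map:   (H, W) float saliency values.
    segment_labels: (H, W) integer object IDs (label 0 assumed to be
        background, which is skipped).
    Returns {object_id: mean saliency}.
    """
    scores = {}
    for obj in np.unique(segment_labels):
        if obj == 0:
            continue  # assumed background label
        scores[obj] = float(saliency_map[segment_labels == obj].mean())
    return scores
```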
