Online Submission ID: 0000
Exploring Human Sketching Process
Figure 1: We visualize the temporal trend of common sketching practice across multiple human subjects. In each row, we show the average sketching results at 3%, 10%, 25%, 42%, 70%, and 100% completion, respectively, for a particular object category.
Abstract
In this paper, we aim to build an interactive system for visualizing and understanding how humans sketch objects. We implement two key techniques, Video Averaging and Generalized Selection, to allow users to (a) visualize the temporal information of drawing sequences, (b) understand common sketching practice across different human subjects, and (c) identify multiple modes in the data and detect unusual sketching behavior.
1 Introduction
Humans have used sketching to describe visual concepts and tell visual stories from antiquity until today. Recently, Eitz et al. compiled a sketched object dataset of 20,000 unique sketches evenly distributed over 250 object categories [Eitz et al. 2012a], which is the first large-scale study of non-expert sketches of everyday objects. Although the authors ask humans to identify the sketched object category and compare human performance against computational recognition methods, they mainly focus on the object recognition problem rather than the sketching process itself. It is still unclear how humans create their sketches from scratch, and whether common practice exists across multiple human subjects and across different object categories. In this project, we are particularly interested in visualizing the sketching process, since it shows some consistent patterns in stroke order. For example, as shown in Figure 1, if we ask people to draw a potted plant, they will usually first draw a pot, then sketch the stem, then go for leaves and flowers, and finally add more details. For a face, people typically start from the face outline, then sketch facial parts such as eyes, nose, and mouth, and finally sketch ears and hair in different styles. Our goal is to build an interactive system for visualizing and understanding “what do sketching processes look like” rather than “what do sketched objects look like”, which differs from the original paper.
At first sight, sketching sequence data look similar to 1D time-series data (e.g. stock trends). However, the traditional line chart technique can only display a 1D signal at each timestamp, while we are faced with 2D image pixels at each frame, which makes it non-trivial to reveal the underlying regularities in the common intensity patterns across the spatial-temporal domain. We pursue two ideas to address this problem:
• Aggregate images at each frame to depict a global trend using the “Image Averaging” technique [Viegas and Wattenberg 2007]. Instead of showing multiple image sequences on one large display (i.e. playing multiple sketching videos), we simply average the strokes drawn by multiple subjects and show only one average sketching sequence to users, which eases the burden on users’ perception.
• In addition to presenting commonalities, we also need to identify differences and group different sketching processes into several subcategories. This requires techniques that allow users to further filter, select, and slice data based on their observations and questions. We implement an interactive interface that provides a selection operation similar to [Heer et al. 2008] to help users quickly select interesting groups out of the original messy data.
2 Prior Work
Our work is inspired by, and builds on, ideas from a number of different areas:
Sketch-based Interaction: Unlike keyboard typing or mouse clicks, sketching has been used to describe visual concepts and tell visual stories since antiquity. It is thus straightforward for humans to convey semantic meaning to a computer by sketching, which makes sketching a powerful and intuitive tool for many purposes, including (1) Sketch-based Modeling and Design: Igarashi and others built sketching interfaces for computer-aided design and modeling, such as the Teddy system [Igarashi et al. 2007] and FiberMesh [Nealen et al. 2007]. Sketching can be applied not only to editing and creating 3D models, but also to synthesizing realistic images. Recent semi-automatic image compositing systems (like Sketch2Photo [Chen et al. 2009] and Photosketcher [Eitz et al. 2009]) allow users to create novel photos from sketches annotated with text labels. (2) Sketch-based Retrieval: Sketching is also an efficient tool for exploring huge amounts of visual data (e.g. images, videos, 3D models, etc.) since visual data are universally easy
Figure 2: Sampled images from the human-sketched object dataset.
Figure 3: Examples of drawing sequences, with color encoding the temporal order of strokes.
to render but relatively difficult to describe and explain in words. Several sketch-based retrieval systems have been proposed to retrieve images [Eitz et al. 2011] [Cao et al. 2010], paintings [Shrivastava et al. 2011], 3D models [Eitz et al. 2012b], and even complex scenes [Xu et al. 2013]. We believe a better understanding of the human sketching process could help researchers in this field design new sketch-based interfaces that are friendlier to novice users.
Average Image: The average image is a data analysis and visualization technique that appears in contemporary art [Viegas and Wattenberg 2007]. In particular, the simple technique of image averaging has been used extensively, and to great effect, by several well-known visual media artists such as Jason Salavon [Salavon 2004], Jim Campbell [Campbell 2002], and Idris Khan [Khan 2005]. Whereas an individual image gives one view of the visual data, an average image aims to capture the data as a whole. We propose two extensions to this popular technique. First, we build an interactive interface that lets users update the average image in real time, so that the dynamic change of the average image directly and explicitly reflects the user’s data exploration process. Previous average images are typically static images produced manually by artists without interactive design. Second, we extend the average image into the spatial-temporal domain and propose the idea of “Video Averaging”. In
particular, we average all the strokes drawn by multiple human subjects frame by frame, where each frame stands for one stroke. In this way, the user can not only inspect one average image as the summary of the sketched objects, but also capture the dynamic change of average images across the temporal domain.
Drawing Assistance System: Researchers have made great efforts to teach people sketching with various types of guidance. The iCanDraw system [Dixon et al. 2010] and the Drawing Assistant [Iarussi et al. 2013] both display exemplar realistic images and compare users’ sketches with the reference image. ShadowDraw [Lee et al. 2011] pushes this idea further and displays a shadow image underlying the user’s strokes with real-time feedback. [Limpaecher et al. 2013] and [Zitnick 2013] use collected human drawing data to beautify and correct sketches in a data-driven fashion. While both we and these previous systems work on human sketch data, the main goal of this project is to understand, explore, and visualize existing human sketching behaviors rather than to beautify a specific sketch. We try to reveal commonalities and differences across multiple human subjects, although a deeper understanding of human sketch data can also contribute to the development and design of new drawing assistance systems.
3 Data
Recently, Eitz et al. compiled a large-scale sketched object dataset of 20,000 unique sketches evenly distributed over 250 object categories [Eitz et al. 2012a]. Figure 2 shows sampled images from multiple categories. Even within the same category, such as bear, human subjects produce bears with different shapes, poses, and details, which suggests the diversity of the dataset. The authors of the dataset ask Amazon Mechanical Turk workers to draw one sketch at a time given an object category name. They publish 90×250 Human Intelligence Tasks and collect sketches from 1,350 unique workers. The workers draw a total of 351,060 strokes, with each sketch containing a median of 13 strokes. After manual inspection and cleanup, the authors truncate the dataset to contain exactly 80 sketches per category, yielding 20,000 sketches.
The dataset stores not only the final sketched objects, but also the temporal information of each stroke as a Bezier spline in SVG format, which allows us to analyze the sketching process. Figure 3 shows examples of drawing sequences with color encoding the temporal order of strokes. Green strokes were made at the beginning of the sketching process, while red strokes were drawn later. The sketches reveal more interesting patterns and become much more vivid after this simple coloring. For example, we can see that people usually first sketch the overall structure of a piano and then draw the keys one by one, which cannot be inferred from Figure 2.
Since our video averaging and interactive tool require each stroke’s pixel positions, we need to extract this information from the original SVG data. We first split each sketch’s SVG file into separate SVG files, each representing one single stroke. We then convert each stroke from SVG format to bitmap format using the “mogrify” command in the ImageMagick package [ImageMagick 2008], which yields 2.84 GB of stroke data. After compressing the data using matrix sparsification, we save all the stroke data as a 673 MB MATLAB file.
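The splitting step can be illustrated with a minimal Python sketch. It is not the pipeline used for this project (the stroke data was stored in MATLAB format). It assumes that every stroke is stored as one `<path>` element and that document order matches drawing order; the function name `split_strokes` is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def split_strokes(svg_text):
    """Return one standalone SVG document (as a string) per <path> element.

    Assumption: each <path> is one stroke, listed in drawing order.
    """
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    documents = []
    for path in root.iter(f"{{{SVG_NS}}}path"):
        # Copy the root attributes (width, height, viewBox) so that every
        # stroke is rendered on a canvas of the same size.
        single = ET.Element(f"{{{SVG_NS}}}svg", dict(root.attrib))
        single.append(path)
        documents.append(ET.tostring(single, encoding="unicode"))
    return documents
```

Each returned string can be written to its own file and rasterized, for instance with a command along the lines of `mogrify -format png *.svg`. Note that this sketch drops any transform attached to an enclosing group element, so it is only suitable for files where paths carry absolute coordinates.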
4 Approach
4.1 Visualizing Drawing Sequence
An earlier study [Eitz et al. 2012a] mainly focuses on recognizing static sketched objects without taking advantage of temporal information.
Figure 5: Color Encoding for Average Video
Figure 6: Demonstration of Interactive Brush Tool: The first image is the original average image. After brushing several strokes (as shown in the second and third images), we obtain the final result (the fourth image).
However, exploring such time-series data can reveal many interesting patterns in how people draw a specific object. We therefore use an animated approach to let users observe how a drawing changes over time.
As shown in Figure 5, our interface allows users to view the drawing progress at the current timestamp. Users can drag the timeline bar to navigate the entire sketching process and compare sketched results between different frames. Our toolkit can also automatically advance the timestamp and update the result to simulate the original drawing process. In addition, we use color to encode temporal information: green indicates strokes drawn at the beginning, while red indicates strokes produced later.
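As an illustration of this color encoding, the following Python sketch maps a stroke's position in the drawing order to a color between green and red. The linear interpolation in RGB space and the function name `stroke_color` are assumptions made for this example; the text only fixes the two endpoints (green for early strokes, red for late ones).

```python
def stroke_color(index, n_strokes):
    """Return an (R, G, B) color for stroke `index` (0-based) of `n_strokes`.

    Assumption: colors are linearly interpolated from pure green (first
    stroke) to pure red (last stroke).
    """
    # A sketch with a single stroke is treated as "early" (green).
    t = 0.0 if n_strokes <= 1 else index / (n_strokes - 1)
    return (round(255 * t), round(255 * (1 - t)), 0)


# Colors for a sketch with the dataset's median length of 13 strokes.
palette = [stroke_color(i, 13) for i in range(13)]
```

Other color ramps (for example, interpolating hue instead of RGB) would serve the same purpose; the only requirement is that stroke order remains readable from a single image.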
4.2 Video Averaging
Although animation can help us explore time-series data, animating multiple drawing sequences is not an easy task. The simplest approach would be to display them individually in a grid (Figure 3). However, as mentioned in class, humans can only track a limited number of moving objects (typically fewer than 6). Thus, although this kind of visualization can clearly show each sketching sequence, it would be difficult for users to compare different drawings and figure out the general trend of the sketching processes for the same object category.
In order to solve this problem, we extend the image averaging approach [Viegas and Wattenberg 2007] to video averaging by computing, stroke by stroke, the mean image of all the strokes drawn by the 80 human subjects for the same object category. This approach concentrates users’ attention on a small region of the screen. Also, the
Figure 4: Video Averaging Results: We visualize the temporal trend of common sketching practice across multiple human subjects. In each row, we show the average sketching results at 3%, 10%, 25%, 42%, 70%, and 100% completion, respectively, for a particular object category.
aggregated result can reveal general trends in people’s drawings, as shown in Figure 1 and Figure 4. Notice that in Figure 5 we continue to use the color encoding method when averaging videos, so temporal information can be shown in one single average image as well.
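Video averaging can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: each sketch is a list of strokes, each stroke is a list of (row, column) pixel positions, frame k shows the first k strokes of every sketch, and sketches with fewer than k strokes contribute their finished drawing. The function names are hypothetical.

```python
def render_cumulative(strokes, k, height, width):
    """Binary image (nested lists) containing the first k strokes of a sketch."""
    image = [[0] * width for _ in range(height)]
    for stroke in strokes[:k]:
        for row, col in stroke:
            image[row][col] = 1
    return image


def average_video(sketches, height, width):
    """Return one mean image per frame; frame k averages k strokes per sketch.

    Assumption: a sketch that is already finished keeps contributing its
    complete drawing to all later frames.
    """
    n_frames = max(len(strokes) for strokes in sketches)
    frames = []
    for k in range(1, n_frames + 1):
        total = [[0.0] * width for _ in range(height)]
        for strokes in sketches:
            image = render_cumulative(strokes, k, height, width)
            for r in range(height):
                for c in range(width):
                    total[r][c] += image[r][c]
        # Each pixel becomes the fraction of sketches that have drawn on it.
        frames.append([[v / len(sketches) for v in row] for row in total])
    return frames
```

Each pixel of a frame is the fraction of subjects who have covered that pixel so far, so dark regions in the average correspond to parts that most people have already drawn. Indexing frames by completion rate instead of stroke count, as in the figure captions, would only change how k is chosen for each sketch.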
4.3 Generalized Selection
Although video averaging can show the main modes of the sketching process, there are still plenty of noise and outliers in the data. In some cases, the orientations of the sketched objects can also differ greatly from one another (e.g. giraffe). Due to cultural and educational background, humans can also have completely different knowledge and interpretations of the same object category (e.g. western dragon vs. eastern dragon). It would therefore be desirable to filter out noisy data or divide the data into different subcategories.
In order to achieve this goal, we provide an interactive eraser brush for users to remove and shape drawings. Inspired by Generalized Selection [Heer et al. 2008], our brush differs from a traditional eraser brush. Generalized selection uses manipulation techniques that couple declarative selection queries with a query relaxation engine, enabling users to interactively generalize their selections. Our system generalizes the selection in two ways. First, the whole sketch is selected and removed if any part of it is touched by our brush. Second, users can apply our brush at an arbitrary time frame, and the whole drawing is removed if it is touched at that frame. Thus our system removes not only what the brush directly touches, but also all the elements related to it. As demonstrated in Figure 6, after brushing several strokes, we obtain much cleaner average results.
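The two generalization rules can be illustrated with a small Python sketch. It assumes that a sketch counts as touched at a frame when any of the strokes visible at that frame (the first `frame` strokes) overlaps the brushed pixels; the data layout and the names `touched_by_brush` and `apply_brush` are hypothetical.

```python
def touched_by_brush(strokes, frame, brush_pixels):
    """True if any of the first `frame` strokes overlaps the brushed pixels.

    Assumption: the drawing visible at a frame is the union of all strokes
    made up to that frame.
    """
    for stroke in strokes[:frame]:
        if brush_pixels & set(stroke):
            return True
    return False


def apply_brush(sketches, frame, brush_pixels):
    """Split sketch ids into (kept, removed) for one brush action.

    `sketches` maps a sketch id to its list of strokes; each stroke is a
    list of (row, column) pixel positions.
    """
    kept, removed = [], []
    for sketch_id, strokes in sketches.items():
        if touched_by_brush(strokes, frame, brush_pixels):
            # Rule 1: touching any part removes the whole sketch.
            removed.append(sketch_id)
        else:
            kept.append(sketch_id)
    return kept, removed
```

After each brush action the average video would be recomputed from the kept sketches only. The removed group is useful in its own right: averaging it instead shows the drawings that share the brushed region.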
This tool is a powerful way for users to filter out drawings, since they can navigate to different timestamps and remove whatever outliers they see. We describe more concrete applications and findings in the next section.
5 Results and Applications
5.1 Patterns and Trends
By exploring our dataset, we can see some clear patterns and trends in the drawing sequences for several objects in Figure 1 and Figure 4. For example, when people draw bicycles, most draw the front wheel first. After that, the drawing order is more diverse: either the back wheel or the frame is drawn next. Also, for the potted plant example, most people draw the pot first and then draw the plant from stem to leaf.
5.2 Identifying Clusters
By using our brush tool, users can easily shape the averaged sketching results. Instead of simply brushing on noisy regions to clean up the data, one can also discover subcategory structure. For example, in the “giraffe” case (Figure 6 and Figure 7), two different orientations of the head exist. Users can simply brush one head to keep the other cluster. Similar results are shown for the “key” category, where different rotations of keys fall into different clusters (Figure 7).
5.3 Outlier Detection
The brush can also be used for detecting outliers. By brushing a dense region, the common drawings are removed, and those that differ from the common patterns are shown alone. For the hourglass case, when we brush the middle region of the hourglass, the remaining drawings all have a huge bottleneck, which seems impossible for a real hourglass (Figure 8).
Figure 7: Clustering
Figure 8: Outlier Detection
6 Discussion
The above sections only discussed drawings that have some general patterns. In those cases, it is easy to see the patterns in the averaging results. However, there are also cases where such general patterns do not exist and the drawings are very diverse, as shown in the dragon and panda cases (Figure 9).
In the dragon case, we can see that there are two kinds of dragon: one in an eastern style, the other more like a western dragon. Such inconsistency makes the data a bit messy. What is more, even within each subcategory, the shapes of the dragons vary greatly. Thus the averaging result of those drawings looks almost like random noise.
In the panda case, the result is also surprising: most people do not draw a panda at all. Recall that in the original data collection procedure [Eitz et al. 2012a], people are not given any examples, only textual instructions. Thus some people may simply not be familiar with the concept of a panda. Likewise, for the concept of a dragon, different people may have different things in mind.
But those are not the only factors that cause the inconsistency. In our exploration, we found that it is typically more difficult for people to draw animals consistently. One possible explanation is that real-world images of animals are also very diverse, since animals move. Thus people do not have a single static memory of an animal.
Also, we found that objects that are hard to rotate have better consistency. Objects like tomatoes, whose views do not change after rotation, are consistent as well. All these findings suggest that before people begin to draw, they are actually making some choices about how to draw that specific object. Since we provide no examples, they have to think of a concrete object and choose one view of the object to draw. This decision process varies greatly between different concepts.
Figure 9: Inconsistent and Bad Sketches
7 Future Work
Understanding this decision process would be really meaningful, but we currently do not have enough data to conduct a detailed analysis of it. In the future, we plan to collect more data and conduct controlled experiments to analyze how people make those decisions and what factors affect people’s drawings.
References
CAMPBELL, J., 2002. http://jimcampbell.tv/portfolio/still image works/.
CAO, Y., WANG, H., WANG, C., LI, Z., ZHANG, L., AND ZHANG, L. 2010. Mindfinder: interactive sketch-based image search on millions of images. In Proceedings of the international conference on Multimedia, ACM, 1605–1608.
CHEN, T., CHENG, M.-M., TAN, P., SHAMIR, A., AND HU, S.-M. 2009. Sketch2photo: internet image montage. In ACM Transactions on Graphics (SIGGRAPH Asia), vol. 28, 124.
DIXON, D., PRASAD, M., AND HAMMOND, T. 2010. icandraw: using sketch recognition and corrective feedback to assist a user in drawing human faces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 897–906.
EITZ, M., HILDEBRAND, K., BOUBEKEUR, T., AND ALEXA, M. 2009. Photosketch: A sketch based image query and compositing system. In SIGGRAPH 2009: Talks, ACM, 60.
EITZ, M., HILDEBRAND, K., BOUBEKEUR, T., AND ALEXA, M. 2011. Sketch-based image retrieval: Benchmark and bag-of-features descriptors. Visualization and Computer Graphics, IEEE Transactions on 17, 11, 1624–1636.
EITZ, M., HAYS, J., AND ALEXA, M. 2012a. How do humans sketch objects? ACM Transactions on Graphics (TOG) 31, 4, 44.
EITZ, M., RICHTER, R., BOUBEKEUR, T., HILDEBRAND, K., AND ALEXA, M. 2012b. Sketch-based shape retrieval. ACM Transactions on Graphics (TOG) 31, 4, 31.
HEER, J., AGRAWALA, M., AND WILLETT, W. 2008. Generalized selection via interactive query relaxation. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems, ACM, 959–968.
IARUSSI, E., BOUSSEAU, A., TSANDILAS, T., ET AL. 2013. The drawing assistant: automated drawing guidance and feedback from photographs. In ACM Symposium on User Interface Software and Technology (UIST).
IGARASHI, T., MATSUOKA, S., AND TANAKA, H. 2007. Teddy: a sketching interface for 3d freeform design. In ACM SIGGRAPH 2007 courses, ACM, 21.
IMAGEMAGICK, 2008. www.imagemagick.org/script/index.php.
KHAN, I., 2005. www.skny.com/artists/idris-khan/images/.
LEE, Y. J., ZITNICK, C. L., AND COHEN, M. F. 2011. Shadowdraw: real-time user guidance for freehand drawing. In ACM Transactions on Graphics (SIGGRAPH), vol. 30, 27.
LIMPAECHER, A., FELTMAN, N., TREUILLE, A., AND COHEN, M. 2013. Real-time drawing assistance through crowdsourcing. ACM Transactions on Graphics (TOG) 32, 4, 54.
NEALEN, A., IGARASHI, T., SORKINE, O., AND ALEXA, M. 2007. Fibermesh: designing freeform surfaces with 3d curves. ACM Transactions on Graphics (TOG) 26, 3, 41.
SALAVON, J., 2004. www.salavon.com/work/specialmoments/.
SHRIVASTAVA, A., MALISIEWICZ, T., GUPTA, A., AND EFROS, A. A. 2011. Data-driven visual similarity for cross-domain image matching. In ACM Transactions on Graphics (TOG), vol. 30, ACM, 154.
VIEGAS, F. B., AND WATTENBERG, M. 2007. Artistic data visualization: Beyond visual analytics. In Online Communities and Social Computing. Springer, 182–191.
XU, K., CHEN, K., FU, H., SUN, W.-L., AND HU, S.-M. 2013. Sketch2scene: Sketch-based co-retrieval and co-placement of 3d models. ACM Trans. Graph. 32, 4 (July), 123:1–123:15.
ZITNICK, C. L. 2013. Handwriting beautification using token means. ACM Trans. Graph. 32, 4 (July), 53:1–53:8.