Interactive Gesture-based Authentication for Tabletop Devices

Raquel Torres Peralta, Computer Science, University of Arizona, Tucson, AZ, [email protected]
Antons Rebguns, Computer Science, University of Arizona, Tucson, AZ, [email protected]
Ian Fasel, Computer Science, University of Arizona, Tucson, AZ, [email protected]

ABSTRACT
Multi-touch tablets allow users to interact with computers through intuitive, natural gestures and direct manipulation of digital objects. One advantage of these devices is that they can offer a large, collaborative space where several users can work on a task at the same time. However, the lack of privacy in these situations makes standard password-based authentication easily compromised. We therefore propose a new gesture-based authentication system based on each user's unique signature of touch motion. Our technique has two key features. First, at each step in authentication the system prompts the user to make a specific gesture selected to maximize the expected long-term information gain. Hence the order of gestures can be different every time, and the user does not have to memorize a specific pattern. Second, each gesture is integrated using a novel hierarchical probabilistic model, allowing the system to accept or reject a user after a variable number of gestures, often as few as five. This touch-based approach allows the user to authenticate accurately without needing to cover their hand or look over their shoulder. It also allows the system to adapt to the diversity of touch styles in any particular user population. We tested this method on a set of samples collected under real-world conditions in a business office, using a touch tablet that was used on a near-daily basis by users familiar with the device. Despite the lack of sophisticated, high-precision equipment or noise-suppression techniques, our system achieves extremely high user recognition accuracy with relatively few gestures, demonstrating that human touch patterns have a distinctive "signature" that can be used as a powerful biometric measure for user recognition and personalization.

Author Keywords
Multitouch, Multi-touch, Authentication, User Recognition

ACM Classification Keywords
Security, Interaction Beyond User

Technical Report, November 2011.

General Terms
Algorithms, Experimentation, Security

INTRODUCTION
Touch tablets are widely used for many different purposes. A key advantage of these devices is that multiple users can use them simultaneously, for example in a meeting room or emergency strategy center, where all users have visual access to the entire display. In such cases, traditional character-based user authentication does not provide adequate security, since covertly observing the characters another user is typing is relatively easy given the dimensions of the device. To address this issue, we propose a novel gesture-based authentication system which performs user recognition based on shape and speed characteristics of a set of generic gestures drawn by the user. Rather than requiring the user to memorize a fixed ordering of gestures, the system interactively requests each gesture based on a strategy that reduces the expected long-term uncertainty about the user's identity as quickly as possible, conditioned on the gestures it has seen so far.
By requesting the maximally informative gestures, the system often requires the user to enter only a small number of gestures at each authentication attempt before they are accepted (or rejected).

Gesture-based authentication can protect users from the well-known "shoulder-surfing" vulnerability of touch devices. Because authentication is based on a biometric "signature" which the user does not consciously control, similar to a handwriting signature, users maintain relative security even if others can see them making gestures, since an observer would not be able to reproduce the user's natural style of touch. Moreover, our interactive system frees the user from having to memorize a special, distinctive, password-like series of gestures.

To ground the problem in the real world, we collected data and developed our techniques in the context of users in a geology-related software development office, where multi-touch tablets are already used extensively in day-to-day tasks. For each gesture, a multi-class support vector machine (SVM) model based on shape and speed features is trained to discriminate between users (one multi-class model per gesture). The probability distribution over users for each gesture is modeled using a semi-latent Dirichlet allocation model, allowing multiple gestures (including multiple instances of the same gesture) to be combined to make increasingly accurate predictions as more gestures are acquired.

Thus, while average classification accuracy after a single gesture was 31.8%, we consistently achieved 100% after drawing eight to fifteen gestures using the hierarchical approach.

While a fixed ordering of gestures can be effective, it is possible to dramatically reduce the number of gestures required to achieve high accuracy by requesting each next gesture strategically, based on the gestures seen so far. We do so by treating the problem as a partially observable Markov decision process (POMDP), in which the hidden state is the user identity, the actions are requests for specific gestures, and the reward is the information gain after each gesture. Using a batch reinforcement learning method on a set of gesture examples collected from users, we learn a policy for requesting each subsequent gesture to maximize the expected long-term information gain. We find that the InfoMax policy allows users to authenticate in dramatically fewer examples than a fixed, hard-coded policy, often as few as five and never more than eight gestures.

Recognizing the user is different from recognizing the gesture, so in this work we assume that users indeed enter the requested gesture (or at least that the gesture has been previously identified with 100% accuracy). Although we have only tested our method with a small number of users, our ability to achieve 100% accuracy in most cases suggests that most users do indeed leave a signature in the trajectories of each gesture, meaning that touch style can be used as a biometric measure for authentication and other recognition tasks. Although this study focused on multi-touch tablets, these techniques use only the trajectories of (x, y) coordinates of the points in each gesture, so our method could potentially be applied on any touch-sensitive device capable of capturing a 2D gesture, such as graphics tablets and iPads.

RELATED WORK
In the past, there has been interest in recognizing a subject from behavioral patterns or physical attributes in activities other than touch gestures, such as the variations in weight distribution on the floor when walking [7]. For multi-touch systems, Schoning et al. [8] used biometric information (hand and arm dimensions) to identify a user. Multi-touch systems recognize several points where the tablet is being touched at the same time; however, they do not know a priori which point belongs to which user. Some projects, such as the DiamondTouch table [4], have made use of extra devices such as transmitters and receivers to detect the origin (user) of a touch. This solution requires users to remain in their seats, and while it is effective for user recognition, it cannot validate the identity of individuals. Recently, authentication on touch surfaces has been attempted with external devices such as the IR Ring [10] and even mobile phones [6]. Other systems have followed different approaches, e.g., restricting the password elements to digits, so that only a number pad needs to be displayed, occupying a relatively small space that the user can easily cover with one hand [3], or using the user's hand contour as a signature [5].

Figure 1. Conventional character-based passwords are not suitable for collaborative multi-touch tablets, as all users have visual access to the surface at all times. User authentication must not rely on secret combinations of either characters or gestures.

Our approach does not require extra devices, which are not guaranteed to be used by the proper user, and is not exclusive to particular technologies (such as RFID). In [1], InfoMax control was implemented on a robot that can move between objects in the world and perform sound-producing manipulations on them (e.g., grab, lift, move, shake) to identify each object's category. A latent variable mixture model for acoustic similarities was implemented, and InfoMax was used to learn policies that allowed the robot to rapidly reduce uncertainty about the categories of objects in the room. We have adapted this approach to solve the authentication problem presented in this paper.

EXPERIMENTAL METHODOLOGY
A multi-touch tablet is an input device composed of a surface on which data is displayed in such a way that the touch of a user's hand, a stylus, or another object can be detected and localized in real time. The system works in absolute coordinates, where every point on the surface corresponds directly to a pixel on screen. The system used in this paper is composed of a surface, a projector, and a camera below the surface that captures the image of the users' finger touches. The samples were collected at a mining business office where a multi-touch tablet is used daily. The samples were taken using a 36in x 22in multi-touch tablet in horizontal position (as a table). The data was captured using the Touchlib library. Eight participants had three sessions scheduled on different days and times (one participant at a time). Users were asked to reproduce eight different gestures (Figure 2) on the tablet


Figure 2. Vocabulary of gestures.

(with some breaks to avoid fatigue). Participants were asked to draw the samples in two ways:

1. Sequential: The gestures were organized together by type. The participant performed the same gesture a number of times before moving on to the next one.

2. Random: The gestures were performed in an unordered sequence.

Both sets were then mixed to form the training and testing sets.

Our vocabulary contains simple one-stroke gestures, chosen to capture the characteristics of basic movements when authenticating the user.

The samples contain the values captured by the camera below the tablet's surface: x, y, w, h, and a, where x and y denote the (x, y) coordinates, and w, h, and a represent width, height, and area, respectively. In order to extend the potential application of this study beyond FTIR multi-touch tablets, we worked only with the x and y coordinates, which represent 2D trajectories.
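As a concrete illustration of this preprocessing step, the sketch below keeps only the 2D trajectory of a captured sample. This is a minimal sketch, not the authors' code; the input format is assumed from the description above.

```python
import numpy as np

def to_trajectory(sample):
    """Reduce a captured touch sample to its 2D trajectory.

    `sample` is assumed to be a sequence of (x, y, w, h, a) tuples as
    produced by the capture software; width, height, and area are
    discarded so the method remains applicable beyond FTIR tablets.
    """
    return np.asarray([(p[0], p[1]) for p in sample], dtype=float)
```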

THE MAX-MARGIN MODEL
We assume that the gesture itself (i.e., the category of stroke in Figure 2) has been accurately predicted. We trained a multiclass SVM (eight classes, one per user) for each gesture, which could be used to authenticate a user, or to recognize a user without explicitly asking them to state who they are.

In all cases, each SVM model was trained on a combination of the Sequential and Random sample sets from all users. The trajectories included in these tests were all at least five points long.

Shape and speed representation
The gesture representation combines both shape and speed features. Each trajectory was represented as a 15x25 pixel image by interpolating the trajectory (the sample image was resized, maintaining its original shape, and then trimmed to the required size to avoid distortion). A low-dimensional representation of these trajectories was then obtained by taking the first five principal components from Principal Components Analysis (PCA), which captured 99% of the variance of the dataset. The intensity of the pixel at each location in the trajectory is proportional to the speed at that location. Figure 3 shows a representation of a sample gesture before resizing; changes in speed are represented by different colors.
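A minimal sketch of how such a speed-intensity image and its low-dimensional projection might be computed is shown below. The resampling and normalization details are assumptions, since the paper does not specify them.

```python
import numpy as np
from sklearn.decomposition import PCA

H, W = 15, 25  # image size used in the paper

def speed_image(traj):
    """Render a trajectory as an H x W image whose pixel intensity is
    proportional to the local speed (distance between successive points)."""
    xy = np.asarray(traj, dtype=float)
    speeds = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    speeds = np.concatenate([[speeds[0]], speeds])      # pad first point
    # Map coordinates into the image while preserving the aspect ratio.
    mins, span = xy.min(0), np.maximum(xy.max(0) - xy.min(0), 1e-9)
    scale = min((W - 1) / span[0], (H - 1) / span[1])
    cols = np.round((xy[:, 0] - mins[0]) * scale).astype(int)
    rows = np.round((xy[:, 1] - mins[1]) * scale).astype(int)
    img = np.zeros((H, W))
    img[rows, cols] = speeds / max(speeds.max(), 1e-9)  # intensity = speed
    return img.ravel()

# X is assumed: an (n_samples, H*W) matrix of rendered gesture images.
# pca = PCA(n_components=5).fit(X)   # first 5 PCs captured 99% of variance
# features = pca.transform(X)
```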

To test the significance of the two feature types (shape and speed), we compared the accuracy of each against the combination of both. For the shape representation, we processed the samples as described above.

Figure 3. Representation of shape and speed. The trajectory is represented by an image in which the intensity of each pixel represents the speed at that location. In this figure, changes in speed are shown in different colors to make them more noticeable.

For speed, the variable-length sequences were normalized by creating histograms of the distances between points, discretized into six values. The min and max values were computed from training data on a per-user, per-gesture basis. Test samples then went through the same process, using the min and max computed from training data, to obtain the normalized histograms.
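The normalization described above might look as follows; the exact bin placement and the clipping of out-of-range test values are assumptions.

```python
import numpy as np

def speed_histogram(traj, vmin, vmax, n_bins=6):
    """Histogram of point-to-point distances discretized into six bins.

    vmin and vmax are the per-user, per-gesture extrema computed on the
    training data; test samples reuse the training extrema."""
    xy = np.asarray(traj, dtype=float)
    d = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    edges = np.linspace(vmin, vmax, n_bins + 1)
    hist, _ = np.histogram(np.clip(d, vmin, vmax), bins=edges)
    return hist / max(hist.sum(), 1)  # normalize to a distribution
```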

Neither the speed nor the shape representation alone allowed us to accurately predict users in the multi-class case. The information in these representations was not sufficient for the multi-class (user-recognition) task: accuracy was 7.9% for the speed representation and 23.8% for the shape representation. Even though combining both features raised the accuracy to 31.8%, the result is still not satisfactory.

Although these results are not accurate enough for user recognition, we observed that most users tend to be predicted as one of one or two specific users. For instance, user 1 may be predicted as user 1 or 3 for one gesture and as user 1 or 5 for another when represented by speed, or as user 1 or 6 when using the shape representation. Thus, it seems likely that by combining different representations over multiple, different gestures, touch-based user recognition may still be quite feasible, especially if better representations can be found. Given these results, we concluded that user authentication/recognition cannot be accurately achieved using just one sample of a one-stroke gesture.

NAIVE BAYES APPROACH
With 31.8% accuracy for user recognition from a single gesture, it should be possible to combine results over multiple gestures to achieve much higher accuracy. Assuming independence between gestures, user probabilities are simply multiplied across gestures and renormalized, raising the accuracy to 60%. However, we suspect that the independence assumption of a naive Bayes approach is too strong; an approach that includes specific interaction terms may be able to achieve high accuracy. Our naive Bayes results suggest that enough information is present that an extended solution involving multiple, strategically chosen gestures may produce acceptable user-recognition rates.
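For concreteness, the combination amounts to a product of the per-gesture probability vectors followed by renormalization, as in this sketch (the log-space computation is only for numerical stability):

```python
import numpy as np

def naive_bayes_combine(per_gesture_probs):
    """Combine per-gesture probability estimates under the independence
    assumption: multiply across gestures and renormalize.

    `per_gesture_probs` is a list of length-N vectors, one probability
    distribution over the N users for each observed gesture."""
    log_p = np.sum([np.log(np.asarray(p) + 1e-12) for p in per_gesture_probs],
                   axis=0)
    log_p -= log_p.max()   # guard against underflow before exponentiating
    p = np.exp(log_p)
    return p / p.sum()
```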

MULTI-ACTION CATEGORY MODEL
The method above gives us a probability estimate for a sample gesture drawn by a user. In order to combine the results of multiple gestures (including repetitions of the same gesture), we developed a model with an intermediate representation in which each user is represented by a distribution of shape-speed similarities given each gesture. We note that if we assumed each observation to be conditionally independent given the gesture and user, then the class probabilities conditioned on all observed gestures could be computed simply by taking the product of the above probabilities and normalizing. This assumption is too simplistic, however. For instance, if a gesture is repeated by a user, then taking products of the SVM probability estimates would usually result in overly high confidence in one user, even if that user only gets slightly higher probability on each individual trial. Our model therefore deals directly with this lack of independence.

Our model takes a principled approach to modeling the fact that a gesture performed by a given user legitimately yields a distribution over user similarities. Our reasoning is that because the underlying causes of shape and speed involve a complex relationship between the user's attributes (height, left- or right-handedness, etc.), shape and speed similarities are only indirectly related to the user through these (always hidden) properties. It is therefore important to explicitly model the fact that some users look somewhat like others under certain actions/gestures (for instance, user 1 may be similar to user 3 or user 5 when drawing gesture 8, but similar only to user 2 when drawing gesture 1).

We address this by using a generative model in which each gesture-user category specifies a Dirichlet distribution from which a particular distribution of shape/speed similarities (the per-gesture probability estimates described above) is sampled. Thus the probability of generating probabilities φ by drawing gesture a by user i is

p(\phi \mid a, i) = \frac{\Gamma\left(\sum_{j=1}^{N} \alpha_{aij}\right)}{\prod_{j} \Gamma(\alpha_{aij})} \prod_{k=1}^{N} \phi_k^{\alpha_{aik} - 1}    (1)

where α_ai = (α_ai1, ..., α_aiN) are the parameters of a Dirichlet distribution over probabilities for user i under gesture a. This model treats the elements of φ as functionally different from user labels. The posterior probability of a particular user given a set of gestures can now be calculated by taking the product of the probabilities estimated with eq. (1) for each gesture and then normalizing.
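A sketch of this posterior computation is shown below, assuming the Dirichlet parameters α_ai have already been fitted for each gesture-user pair (the fitting procedure is not shown, and the data structures are illustrative):

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_log_pdf(phi, alpha):
    """Log of eq. (1): log p(phi | a, i) for a Dirichlet with parameters alpha."""
    alpha, phi = np.asarray(alpha, float), np.asarray(phi, float)
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + np.sum((alpha - 1.0) * np.log(phi + 1e-12)))

def user_posterior(observations, alphas):
    """Posterior over users given (gesture, phi) pairs.

    `alphas[a][i]` holds the fitted parameters alpha_{ai} for gesture a
    and user i; `observations` is a list of (a, phi) pairs, with phi the
    per-gesture probability vector described in the text."""
    n_users = len(next(iter(alphas.values())))
    log_post = np.zeros(n_users)
    for a, phi in observations:
        for i in range(n_users):
            log_post[i] += dirichlet_log_pdf(phi, alphas[a][i])
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()
```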

LEARNING AN INFOMAX CONTROLLER
Once we have specified a gesture model, we can use reinforcement learning to find a policy for selecting actions. Let q_t be a d-dimensional vector combining the system's current beliefs about the user and its known internal state (described below), and define the set of possible gestures A = {gesture 1, ..., gesture 8}. Then let the function F_θ : Q → A be a deterministic controller with k-dimensional parameter θ which at each time t takes as input a state variable q_t and outputs an action a_t.

Representation
To construct q_t, let p' be a representation of the system's current beliefs about the user drawing the gesture. Then let q_t = (p', c', ψ(t)), where c' is a vector of counters of how often each gesture has been drawn by the current user, and ψ(t) = (ψ_1(t), ψ_2(t), ψ_3(t)) is a vector of radial basis functions of the time t, which allows the learned policy to depend on the number of steps taken in an episode.
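A hypothetical construction of this state vector is sketched below; the RBF centers and width are illustrative choices, not values from the paper.

```python
import numpy as np

def make_state(p, counts, t, centers=(2.0, 6.0, 10.0), width=2.0):
    """Build q_t = (p', c', psi(t)).

    p: current belief distribution over users; counts: how often each
    gesture has been requested so far; psi(t): three radial basis
    functions of the step index t (centers/width are assumptions)."""
    psi = np.exp(-((t - np.asarray(centers)) ** 2) / (2.0 * width ** 2))
    return np.concatenate([np.asarray(p, float),
                           np.asarray(counts, float), psi])
```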

Let an episode (or history) h = (q_1, a_1, ..., q_T, a_T) be a sequence of T state-action pairs induced by using a controller with parameters θ. We can then define the reward at time t of episode h as the (scaled) negative Shannon entropy of the belief distribution, i.e.,

R(q_t \mid h) = \frac{1}{a}\left(\sum_i p_i^{(t)} \log p_i^{(t)} + b\right)    (2)

where p_i^(t) is the system's belief that the current user is user i, based on the experiences in h up to time t. The constants a and b simply scale the reward to [0, 1] and are fixed beforehand.
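Computing the reward of eq. (2) is direct; in this sketch, a_scale and b_shift stand for the fixed constants a and b:

```python
import numpy as np

def reward(p, a_scale, b_shift):
    """Scaled negative Shannon entropy of the belief p, as in eq. (2)."""
    p = np.asarray(p, dtype=float)
    neg_entropy = np.sum(p * np.log(p + 1e-12))  # sum_i p_i log p_i
    return (neg_entropy + b_shift) / a_scale
```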


Policy Learning
In the current setting, the goal of the learning algorithm is to find parameters θ that maximize the expected total reward over episodes of fixed length L, i.e., maximize the objective

\Phi(\theta) = \mathbb{E}_{h \sim p(h \mid \theta)}\left[\sum_{t=1}^{L} R(q_t \mid h)\right].    (3)

Although many optimization algorithms could work in this situation, in this work we learn the parameters from experience using the Policy Gradients with Parameter Exploration (PGPE) algorithm [9], a model-free reinforcement learning algorithm for partially observable Markov decision processes (POMDPs) which performs exploration by sampling in the parameter space of a controller. Rather than computing gradients of the objective function with respect to the controller's parameters, a gradient is estimated over a set of hyperparameters from which the parameters θ of a controller are sampled. For completeness we give a brief description here, but we defer to [9] for details.

Let each θ_a be a d-dimensional vector, so that we can rewrite θ = (θ_1, ..., θ_k) as a set of weight vectors for the function F_θ(q_t) = argmax_a θ_a · q_t; i.e., it computes one linear combination of the inputs q_t per action, then selects the maximum-scoring action.


Figure 4. Pipeline for authentication.

For each learning episode, each parameter of θ is independently sampled from a one-dimensional normal distribution with mean and variance µ_i, σ_i, which we collect together as ρ = (µ_1, ..., µ_d, σ_1, ..., σ_d).
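A minimal sketch of this linear controller and of sampling its parameters from ρ (names are illustrative):

```python
import numpy as np

N_ACTIONS = 8  # one action per gesture in the vocabulary

def controller(theta, q):
    """F_theta(q) = argmax_a (theta_a . q): one linear score per gesture."""
    scores = theta.reshape(N_ACTIONS, -1) @ q
    return int(np.argmax(scores))

def sample_theta(mu, sigma, rng=None):
    """Draw controller parameters component-wise from N(mu_i, sigma_i),
    i.e. from the PGPE hyperparameters rho = (mu, sigma)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(mu, sigma)
```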

PGPE performs a gradient ascent procedure over ρ to optimize policies of the form

p(a_t \mid q_t, \rho) = \int_\theta p(\theta \mid \rho)\, \delta(F_\theta(q_t), a_t)\, d\theta    (4)

Let r(h) = R(q_T | h), and let H be the set of all possible histories. The expected reward is then given by

J(\rho) = \int_\Theta \int_H p(h, \theta \mid \rho)\, r(h)\, dh\, d\theta    (5)

Differentiating with respect to ρ and using the identity ∇_x y(x) = y(x) ∇_x log y(x), we have

\nabla_\rho J(\rho) = \int_\Theta \int_H p(h, \theta \mid \rho)\, \nabla_\rho \log p(h, \theta \mid \rho)\, r(h)\, dh\, d\theta    (6)

Noting that h is conditionally independent of ρ given θ, this can be estimated with a sample of histories, by repeatedly choosing θ from p(θ | ρ) and then running the agent with this policy to generate a history h. Thus, given a set of rollouts (h^(1), ..., h^(N)) generated from sample controllers with parameters (θ^(1), ..., θ^(N)),

\nabla_\rho J(\rho) \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\rho \log p(\theta^{(i)} \mid \rho)\, r(h^{(i)})    (7)

Stochastic gradient ascent can now be performed until a local maximum is found. Sehnke et al. [9] show that with proper bookkeeping, each gradient update can be performed efficiently using just two symmetric samples from the current ρ.
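The sketch below shows one such symmetric-sample update in simplified form; the learning rates are illustrative and the baseline and bookkeeping terms of [9] are omitted, so this approximates rather than reproduces the published update rules.

```python
import numpy as np

def pgpe_step(mu, sigma, run_episode, lr_mu=0.2, lr_sigma=0.1, rng=None):
    """One simplified PGPE update from a symmetric sample pair.

    `run_episode(theta)` is assumed to roll out the controller with
    parameters theta and return the episode reward r(h)."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.normal(0.0, sigma)                 # perturbation around mu
    r_plus, r_minus = run_episode(mu + eps), run_episode(mu - eps)
    mu = mu + lr_mu * ((r_plus - r_minus) / 2.0) * eps
    # Move sigma toward perturbation scales that earned high reward.
    r_mean = (r_plus + r_minus) / 2.0
    sigma = sigma + lr_sigma * r_mean * (eps ** 2 - sigma ** 2) / sigma
    return mu, sigma
```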

USABILITY
Any implementation must be carefully designed to be well received by users. Although our hierarchical approach consists of multiple components (see Figure 4), these are imperceptible to the user, who only has to provide the required information (such as a user ID) and the requested gestures in order to be authenticated (see Figure 5).

Figure 5. Pipeline for authentication from the user's perspective. The user is asked for the gestures required to reduce uncertainty; in this case, the user is recognized after four gestures.

The user only responds to the system, so there is no need to memorize a set of gestures, and the authentication process can be performed freely in front of other users, since there is no way for an observer to reproduce the user's characteristics. However, we could easily set a threshold to limit the number of gestures allowed per access attempt.

RESULTS
We trained the multiclass SVM models using LIBSVM [2]. For the SVM models we used RBF kernels with soft margins, with the kernel parameter and the margin slack penalty determined empirically through cross-validation using a grid search, independently for each model.
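A sketch of this model selection using scikit-learn's SVC, which wraps LIBSVM; the grid values and the data names X_g, y_g (features and user labels for one gesture) are assumptions:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# One multi-class model per gesture: grid-search the RBF kernel width
# (gamma) and the soft-margin penalty (C) by cross-validation.
param_grid = {"C": [1, 10, 100, 1000], "gamma": [1e-3, 1e-2, 1e-1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf", probability=True), param_grid, cv=5)
# search.fit(X_g, y_g)                   # train on one gesture's samples
# probs = search.predict_proba(X_test)   # per-user probability estimates
```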

We found that the user was authenticated correctly 31.8% of the time using just one stroke gesture; however, the full pipeline raises this rate to 100% after 8 steps (see Figure 6). The set of gestures required may differ from user to user, ranging from a minimum of three to a maximum of eight.

To test the efficiency of our approach, we compared the accuracy of our trained policy against a simple hand-coded policy consisting of requesting gestures G1 through G8 repeatedly, in the same order for every user, every time. The efficiency of both policies was measured by the number of gestures required to reach 100% accuracy about the user's authenticity. Figure 6 shows that the trained policy reaches 100% certainty at the eighth step, while the hand-coded policy requires 15 steps.

Our approach outperforms (or ties) the hand-coded policy for every user. Figure 7 shows the joint probabilities per step with the hand-coded policy (gestures 1 through 8 provided repeatedly). The model returns the probability distribution of our belief about the person at each step. In this case, the probability for Person 3 becomes dominant after the user has provided nine gestures (gestures 1, 2, 3, ..., 8, 1). Figure 8 shows the authentication process for Person 3 with the trained policy, which reaches the highest probability after the third gesture; the system determined that the best sequence in this case is (G5, G4, G5, G6).



Figure 6. Accuracy per step, trained policy vs. hand-coded policy. The trained policy outperforms the hand-coded policy: accuracy reaches 100% after the 8th gesture, while the hand-coded policy needs at least 15 gesture samples to reach 100%.


Figure 7. Joint probabilities per step with the hand-coded policy. Current belief distribution about the person after each gesture is provided; the x axis shows which gesture is provided, and the y axis the probability for each user. The distribution identifies Person 3 (P3) after the ninth gesture.


Figure 8. Joint probabilities per step with the trained policy. The distribution identifies Person 3 (P3) after the third gesture.

Our first tests of user authentication using the max-margin model showed the similarities among different users. The final hierarchical results show that those similarities are part of the uniqueness of every person.

CONCLUSIONS
New ways of interaction require new forms of authentication. No matter how collaborative the device, security will always be required. In the multi-touch case, where movements are visible to all members of a team in a room, passwords must not depend on a secret combination of characters or gestures, but on a private signature that cannot be duplicated. In this paper we have proposed a new authentication system which can be used on any touch device capable of capturing a 2D trajectory.

While the shape and speed representations alone provided 23.8% and 7.9% accuracy respectively, combining both raises the rate to 31.8%.

We find that authentication with a single sample, using shape, speed, or both features, is not accurate, since users tend to look alike from the system's point of view. However, those similarities can be useful for discriminating one user from another when several samples from different gestures are provided. The way those samples (gesture types) are requested determines the efficiency of the system. Our model reduces the number of gestures required for accurate authentication by almost 50% compared to a hand-coded policy (eight strategically requested gestures versus 15). This approach also outperforms a naive Bayes approach, which achieves a 60% accuracy rate for recognizing a user with a hand-coded policy.

Having a trained policy significantly reduces the number of gestures required for authentication, which could make users less reluctant to adopt this method; a user can be authenticated 100% of the time on any touch device, with no external devices required.

REFERENCES
1. Rebguns, A., Ford, D., and Fasel, I. InfoMax control for acoustic exploration of objects by a mobile robot. In AAAI Workshop on Lifelong Learning, AAAI (2011).
2. Chang, C.-C., and Lin, C.-J. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
3. Kim, D., Dunphy, P., et al. Multi-touch authentication on tabletops. In CHI 2010: Input, Security and Privacy Policies, CHI '10 (2010).
4. Dietz, P. H., and Leigh, D. DiamondTouch: a multi-user touch technology. In ACM Symposium on User Interface Software and Technology (UIST), ACM Press (2001), 219-226.
5. Schmidt, D., Chong, M. K., and Gellersen, H. HandsDown: hand-contour-based user identification for interactive surfaces. In NordiCHI '10: Proceedings of the 6th Nordic Conference on Human-Computer Interaction: Extending Boundaries, ACM (2010).
6. Schoning, J., Rohs, M., and Kruger, A. Using mobile phones to spontaneously authenticate and interact with multi-touch surfaces. In Workshop on Designing Multi-touch Interaction Techniques for Coupled Private and Public Displays, AVI 2008 (2008).
7. Orr, R., and Abowd, G. The smart floor: a mechanism for natural user identification and tracking. In Conference on Human Factors in Computing Systems, CHI '00 (2000).
8. Schoning, J., Rohs, M., and Kruger, A. Spatial authentication on large interactive multi-touch surfaces. In IEEE Tabletop 2008: Adjunct Proceedings of IEEE Tabletops and Interactive Surfaces, IEEE (2008).
9. Sehnke, F., Osendorfer, C., Ruckstiess, T., Graves, A., Peters, J., and Schmidhuber, J. Parameter-exploring policy gradients. Neural Networks (2009), 551-559.
10. Roth, V., Schmidt, P., and Guldenring, B. The IR Ring: authenticating users' touches on a multi-touch display. In UIST '10: Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology, ACM (2010).


