First Edition Community Release
8/8/2019 First Edition Community Release
Multi-Touch Technologies
NUI Group Authors
1st edition [Community Release]: May 2009
This book is © 2009 NUI Group. All the content on this page is protected by an Attribution-Share Alike License. For more information, see http://creativecommons.org/licenses/by-sa/3.0
Cover design: Justin Riggio
Contact: [email protected]
Contents
Preface
About this book
About the authors

Chapter 1: Hardware Technologies
1.1 Introduction to Hardware
1.2 Introduction to Optical Multi-Touch Technologies
1.2.1 Infrared Light Sources
1.2.2 Infrared Cameras
1.3 Frustrated Total Internal Reflection (FTIR)
1.3.1 FTIR Layers
1.3.2 Compliant Surfaces
1.4 Diffused Illumination (DI)
1.4.1 Front Diffused Illumination
1.4.2 Rear Diffused Illumination
1.5 Laser Light Plane (LLP)
1.6 Diffused Surface Illumination (DSI)
1.7 LED Light Plane (LED-LP)
1.8 Technique Comparison
1.9 Display Techniques
1.9.1 Projectors
1.9.2 LCD Monitors

Chapter 2: Software & Applications
2.1 Introduction to Software Programming
2.2 Tracking
2.2.1 Tracking for Multi-Touch
2.3 Gesture Recognition
2.3.1 GUIs to Gesture
2.3.2 Gesture Widgets
2.3.3 Gesture Recognition
2.3.4 Development Frameworks
2.4 Python
2.4.1 Introduction
2.4.2 Python Multi-touch Modules and Projects
2.4.3 PyMT
2.4.4 PyMT Architecture
2.4.5 OpenGL
2.4.6 Windows, Widgets and Events
2.4.7 Programming for Multi-Touch Input
2.4.8 Implementing the Rotate/Scale/Move Interaction
2.4.9 Drawing with Matrix Transforms
2.4.10 Parameterizing the Problem and Computing the Transformation
2.4.11 Interpret Touch Locations to Compute the Parameters
2.5 ActionScript 3 & Flash
2.5.1 Opening the Lines of Communication
2.6 .NET/C#
2.6.1 Advantages of using .NET
2.6.2 History of .NET and multi-touch
2.6.3 Getting started with .NET and multi-touch
2.7 Frameworks and Libraries
2.7.1 Computer Vision
2.7.2 Gateway Applications
2.7.3 Clients
2.7.4 Simulators

Appendix A: Abbreviations
Appendix B: Building an LLP setup
Appendix C: Building a Rear-DI Setup
Appendix D: Building an Interactive Floor
Appendix E: References
Introduction to Multi-Touch

In this chapter:
About this book
About the authors
About this book
Founded in 2006, the Natural User Interface Group (NUI Group) is an interactive media community researching and creating open source machine sensing techniques to benefit artistic, commercial and educational applications. It is the largest online (~5000+ members) open source community related to the open study of Natural User Interfaces.

NUI Group offers a collaborative environment for developers who are interested in learning and sharing new HCI (Human Computer Interaction) methods and concepts. This may include topics such as: augmented reality, voice/handwriting/gesture recognition, touch computing, computer vision, and information visualization.

This book is a collection of wiki entries and chapters written exclusively for this book, making this publication the first of its kind and unique in its field. All contributors are listed here, and countless others have been a part of the community that has developed and nurtured the NUI Group multi-touch community; to them we are indebted.

We are sure you'll find this book pleasant to read. Should you have any comments, criticisms or section proposals, please feel free to contact us via [email protected]. Our doors are always open, and we are always looking for new people with similar interests and dreams.
About the authors

Alex Teiche is an enthusiast who has been working with Human-Computer Interaction for several years, with a very acute interest in collaborative multi-user interaction. He recently completed his LLP table, and has been contributing to the PyMT project for several months. He is now experimenting with thin multi-touch techniques that can be constructed using equipment easily accessible to the average hobbyist.

Ashish Kumar Rai is an undergraduate student in Electronics Engineering at the Institute of Technology, Banaras Hindu University, India. His major contribution is the development of the simulator QMTSim. Currently his work consists of the development of innovative hardware and software solutions for the advancement of NUI.

Chris Yanc has been a professional interactive designer at an online marketing agency in Cleveland, OH for over 4 years. After constructing his own FTIR multi-touch table, he has been focusing on using Flash and ActionScript 3.0 to develop multi-touch applications from communication with CCV and Touchlib. Chris has been sharing the progress of his multi-touch Flash experiments with the NUI Group community.

Christian Moore is a pioneer in open-source technology and is one of the key evangelists of Natural User Interfaces. As the founder of NUI Group, he has led a community that has developed state-of-the-art human sensing techniques and is working on the creation of open standards to unify the community's efforts into well-formatted documentation and frameworks.

Donovan Solms is a Multimedia student at the University of Pretoria in Gauteng, South Africa. After building the first multi-touch screen in South Africa in the first quarter of 2007, he continued to develop software for multi-touch using .NET 3 and ActionScript 3. He can currently be found building commercial multi-touch surfaces and software at Multimedia Solutions in South Africa.

Görkem Çetin defines himself as an open source evangelist. His current doctorate studies focus on human computer interaction issues of open source software. Çetin has authored 5 books, written numerous articles for technical and sector magazines, and spoken at various conferences.

Justin Riggio is a professional interactive developer and designer who has been working freelance for over 5 years now. He lives and works in the NYC area and can be found at the Jersey shore surfing at times. He has designed and built a 40" screen DI multi-touch table and has set up and tested LLP LCD setups and also projected units. You can find on his drafting table a new plan for a DSI system and an AS3 multi-touch document reader.
Nolan Ramseyer is a student at the University of California, San Diego in the United States studying Bioengineering: Biotechnology, with a background in computers and multimedia. His work in optical-based multi-touch has consumed a large part of his free time, and through his experimentation and learning he has helped others achieve their own projects and goals.

Paul D'Intino is the Technical Training Manager at Pioneer Electronics in Melbourne, Australia. In 2003 he completed a Certificate III in Electrotechnology Entertainment and Servicing, and went on to repair Consumer Electronics equipment, specializing in Plasma TVs. His work in both hardware and software for multi-touch systems grew from his passion for electronics, which he aims to further develop with help from the NUI Group.

Laurence Muller (M.Sc.) is a scientific programmer at the Scientific Visualization and Virtual Reality Research Group (Section Computational Science) at the University of Amsterdam in the Netherlands. In 2008 he completed his thesis research project, titled "Multi-touch displays: design, applications and performance evaluation." Currently he is working on innovative scientific software for multi-touch devices.

Ramsin Khoshabeh is a graduate Ph.D. student at the University of California, San Diego in the Electrical and Computer Engineering department. He works in the Distributed Cognition and Human-Computer Interaction Laboratory in the Cognitive Science department. His research is focused on developing novel tools for the interaction with and analysis of large datasets of videos. As part of his work, he has created MultiMouse, a multi-touch Windows mouse interface, and has built VideoNavigator, a cross-platform multi-touch video navigation interface.

Rishi Bedi is a student in Baltimore, Maryland with an interest in emerging user interaction technologies, programming in several languages, and computer hardware. As an actively contributing NUI Group community member, he hopes to build his own multi-touch devices & applications, and to continue to support the community's various endeavors. Also interested in graphic design, he has designed this book and has worked in various print and electronic mediums.
Mohammad Taha Bintahir is an undergraduate student at the University of Ontario Institute of Technology in Toronto, Canada, studying Electrical Engineering and Business and focusing on microelectronics and programming. He is currently doing personal research for his thesis project involving both electronic- and optical-based multi-touch and multi-modal hardware systems.
Thomas Hansen is a PhD student at the University of Iowa and an open source enthusiast. His research interests include Information Visualization, Computer Graphics, and especially Human Computer Interaction directed at therapeutic applications of novel interfaces such as multi-touch displays. This and his other work on multi-touch interfaces has been supported by the University of Iowa's Mathematical and Physical Sciences Funding Program.

Tim Roth studied Interaction Design at the Zurich University of the Arts. During his studies, he focused on interface design, dynamic data visualization and new methods of human computer interaction. His diploma dealt with the customer and consultant situation in the private banking sector. The final output was a multi-touch surface that helped the consultant explain and visualize the complexities of modern financial planning.

Seth Sandler is a University of California, San Diego graduate with a Bachelor's degree in Interdisciplinary Computing and the Arts, with an emphasis in music. During Seth's final year of study he began researching multi-touch and ultimately produced a musical multi-touch project entitled AudioTouch. Seth is most notably known for his presence on the NUI Group forum, the introduction of the MTmini, and CCV development.
Editors

This book was edited & designed by Görkem Çetin, Rishi Bedi and Seth Sandler.

Copyright

This book is © 2009 NUI Group. All the content on this page is protected by an Attribution-Share Alike License. For more information, see http://creativecommons.org/licenses/by-sa/3.0
Chapter 1: Multi-Touch Technologies

In this chapter:
1.1 Introduction to Hardware
1.2 Introduction to Optical Techniques
1.3 Frustrated Total Internal Reflection (FTIR)
1.4 Diffused Illumination (DI)
1.5 Laser Light Plane (LLP)
1.6 Diffused Surface Illumination (DSI)
1.7 LED Light Plane (LED-LP)
1.8 Technique Comparison
1.9 Display Techniques
1.1 Introduction to Hardware
Multi-touch (or multitouch) denotes a set of interaction techniques that allow computer users to control graphical applications with several fingers. Multi-touch devices consist of a touch screen (e.g., computer display, table, wall) or touchpad, as well as software that recognizes multiple simultaneous touch points, as opposed to the standard touchscreen (e.g. computer touchpad, ATM), which recognizes only one touch point.

The Natural User Interface and its influence on multi-touch gestural interface design has brought key changes in computing hardware design, especially the creation of true multi-touch hardware systems (i.e. support for more than two inputs). The aim of NUI Group is to provide an open platform where hardware and software knowledge can be exchanged freely; through this free exchange of knowledge and information, there has been an increase in hardware development.

On the hardware frontier, NUI Group aims to be an informational resource hub for others interested in prototyping and/or constructing a low cost, high resolution, open-source multi-input hardware system. Through the community's research efforts, there have been improvements to existing multi-touch systems as well as the creation of new techniques that allow for the development of not only multi-touch hardware systems, but also multi-modal devices. At the moment there are five major techniques being refined by the community that allow for the creation of a stable multi-touch hardware system; these include: Jeff Han's pioneering Frustrated Total Internal Reflection (FTIR), Rear Diffused Illumination (Rear DI) such as Microsoft's Surface table, Laser Light Plane (LLP) pioneered in the community by Alex Popovich and also seen in Microsoft's LaserTouch prototype, LED Light Plane (LED-LP) developed within the community by Nima Motamedi, and finally Diffused Surface Illumination (DSI) developed within the community by Tim Roth.

These five techniques utilized by the community work on the principle of computer vision and optics (cameras). While optical sensing makes up the vast majority of techniques in the NUI Group community, there are several other sensing techniques that can be utilized in making natural user interface and multi-touch devices. Some of these sensing approaches include proximity, acoustic, capacitive, resistive, motion, orientation, and pressure sensing. Often, various sensors are combined to form a particular multi-touch sensing technique. In this chapter, we explore some of the mentioned techniques.
1.2 Introduction to Optical Multi-Touch Technologies
Optical or light sensing (camera) based solutions make up a large percentage of multi-touch devices. Their scalability, low cost and ease of setup account for the popularity of optical solutions. Stereo Vision, overhead cameras, Frustrated Total Internal Reflection, Front and Rear Diffused Illumination, Laser Light Plane, and Diffused Surface Illumination are all examples of camera based multi-touch systems.

Each of these techniques consists of an optical sensor (typically a camera), an infrared light source, and visual feedback in the form of projection or LCD. Prior to learning about each particular technique, it is important to understand these three parts that all optical techniques share.
1.2.1 Infrared Light Sources
Infrared (IR in short) is a portion of the light spectrum that lies just beyond what can be seen by the human eye. It is a range of wavelengths longer than visible light, but shorter than microwaves. Near Infrared (NIR) is the lower end of the infrared light spectrum, typically considered to be wavelengths between 700 nm and 1000 nm (nanometers). Most digital camera sensors are also sensitive to at least NIR and are often fitted with a filter to remove that part of the spectrum so they only capture the visible light spectrum. By removing the infrared filter and replacing it with one that removes the visible light instead, a camera that sees only infrared light is created.

In regards to multi-touch, infrared light is mainly used to distinguish between a visual image on the touch surface and the object(s)/finger(s) being tracked. Since most systems have a visual feedback system where an image from a projector, LCD or other display is on the touch surface, it is important that the camera does not see this image when attempting to track objects overlaid on the display. In order to separate the objects being tracked from the visual display, a camera, as explained above, is modified to see only the infrared spectrum of light; this cuts the visual image (visible light spectrum) out of the camera's view, and therefore the camera sees only the infrared light that illuminates the object(s)/finger(s) on the touch surface.
Many brands of acrylic sheet are intentionally designed to reduce their IR transmission above 900 nm to help control heat when used as windows. Some camera sensors have either dramatically greater or dramatically lessened sensitivity to 940 nm, which also has a slightly reduced occurrence in natural sunlight.
IR LEDs are not required, but an IR light source is. In most optical multi-touch techniques (especially LED-LP and FTIR), IR LEDs are used because they are efficient and effective at providing infrared light. On the other hand, Diffused Illumination (DI) does not require IR LEDs per se, but rather some kind of infrared light source, such as an infrared illuminator (which may have LEDs inside). Laser light plane (LLP) uses IR lasers as the IR light source.

Most of the time, LEDs can be bought in the form of single LEDs or LED ribbons:
Single LEDs: Single through-hole LEDs are a cheap and easy way to create LED frames when making FTIR, DSI, and LED-LP MT setups. They require some knowledge of soldering and a little electrical wiring when constructing. Online LED calculators make it easy to figure out how to wire the LEDs up. The most commonly used through-hole IR LED is the OSRAM SFH 485 P. If you are trying to make an LCD FTIR setup, you will need brighter than normal IR LEDs, so these are probably your best bet.

LED ribbons: The easiest solution for making LED frames. Instead of soldering a ton of through-hole LEDs, LED ribbons are FFC cables with surface mount IR LEDs already soldered onto them. They come with an adhesive side that can be stuck into a frame and wrapped around a piece of acrylic in one continuous ribbon. All that is required is to wrap the ribbon, plug it into the power adapter, and you are done. Good quality ribbons can be found at environmentallights.com.

LED emitters: When making Rear DI or Front DI setups, pre-built IR emitters are much easier than soldering a ton of single through-hole LEDs together. For most Rear DI setups, 4 of these are usually all that is needed to completely cover the inside of the box. When buying grouped LEDs, you will have to eliminate the IR hot spot blobs caused by the emitters by bouncing the IR light off the sides and floor of the enclosed box.
Before buying LEDs, it is strongly advised to check the LED's data sheet. Wavelength, angle, and radiant intensity are the most important specifications for all techniques.

Wavelength: 780-940 nm. LEDs in this range are easily seen by most cameras, and visible light filters can easily be found for these wavelengths. The lower the wavelength, the higher the sensitivity, which equates to a higher ability to determine pressure.

Radiant intensity: Minimum of 80 mW. The ideal is the highest radiant intensity you can find, which can be much higher than 80 mW.

Angle for FTIR: Anything less than +/- 48 degrees will not take full advantage of total internal reflection, and anything above +/- 48 degrees will escape the acrylic. In order to ensure there is coverage, going beyond +/- 48 degrees is fine, but anything above +/- 60 degrees is really just a waste, as that portion (60 - 48 = 12 degrees) will escape the acrylic.

Angle for diffused illumination: Wider angle LEDs are generally better. The wider the LED angle, the easier it is to achieve even illumination.
For DI, many setups run into the problem of hotspots. In order to eliminate this, IR light needs to be bounced off the bottom of the box when mounted, to avoid IR hot spots on the screen. Also, a band pass filter for the camera is required. Homemade band pass filters, such as exposed negative film or a piece of a floppy disk, will work, but they'll provide poor results. It is always better to buy a (cheap) band pass filter and fit it into the lens of the camera.
1.2.2 Infrared Cameras
Simple webcams work very well for multi-touch setups, but they need to be modified first. Regular webcams and cameras block out infrared light, letting only visible light in. We need just the opposite. Typically, by opening the camera up, you can simply pop the filter off, but on expensive cameras this filter is usually applied directly to the lens and cannot be removed.

Most cameras will show some infrared light without modification, but much better performance can be achieved if the filter is replaced.
The performance of the multi-touch device depends on the components used. Therefore it is important to carefully select your hardware components. Before buying a camera, it is important to know for what purpose you will be using it. When you are building your first (small) test multi-touch device, the requirements may be lower than when you are building one that is going to be used for demonstration purposes.
Resolution: The resolution of the camera is very important. The higher the resolution, the more pixels are available to detect fingers or objects in the camera image. This is very important for the precision of the touch device. For small multi-touch surfaces a low resolution webcam (320 x 240 pixels) can be sufficient. Larger surfaces require cameras with a resolution of 640 x 480 or higher in order to maintain precision.
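As a rough rule of thumb (an illustrative aside, not from the original text), you can relate camera resolution and surface size to touch precision by computing how much surface each pixel has to cover; the surface widths below are made-up example values.

```python
def mm_per_pixel(surface_width_mm: float, camera_pixels: int) -> float:
    """How many millimetres of touch surface each camera pixel covers
    along one axis (smaller values mean finer touch precision)."""
    return surface_width_mm / camera_pixels

# A 320x240 webcam over a small 400 mm wide surface...
print(round(mm_per_pixel(400, 320), 2))   # -> 1.25 (mm per pixel)
# ...versus a 640x480 camera over a 1000 mm wide surface.
print(round(mm_per_pixel(1000, 640), 2))  # -> 1.56 (mm per pixel)
```

By this measure, the 640x480 camera over the metre-wide surface resolves only slightly more coarsely than the small webcam over the small surface, which is why larger surfaces call for 640 x 480 or better.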
Frame rate: The frame rate is the number of frames a camera can capture within one second. More snapshots means more data on what happened in a specific time step. In order to cope with fast movements and keep the system responsive, a camera with a frame rate of at least 30 frames per second (FPS) is recommended. Higher frame rates provide a smoother and more responsive experience.
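To see why 30 FPS is a sensible floor (an illustrative calculation, not from the original text), note that the interval between frames is simply the reciprocal of the frame rate:

```python
def frame_interval_ms(fps: float) -> float:
    """Time between successive camera frames, in milliseconds."""
    return 1000.0 / fps

print(round(frame_interval_ms(30), 1))  # -> 33.3
print(round(frame_interval_ms(60), 1))  # -> 16.7
```

At 30 FPS a touch can go unseen for up to roughly 33 ms before the next frame arrives; doubling the frame rate halves that gap, which is part of what makes higher frame rates feel more responsive.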
Interface: Basically there are two types of interfaces that can be used to connect a camera device to a computer. Depending on the available budget, one can choose between a consumer grade webcam that uses a USB interface or a professional camera that uses the IEEE 1394 interface (also known as FireWire). An IEEE 1394 device is recommended because it usually has less overhead and lower latency in transferring the camera image to the computer. Again, lower latency results in a more responsive system.
Lens type: Most consumer webcams contain an infrared (IR) filter that prevents IR light from reaching the camera sensor. This is done to prevent image distortion. However, for our purpose, we want to capture and use IR light. On some webcams it is possible to remove the IR filter. This filter is placed behind the lens and often has a red color. If it is not possible to remove the IR filter, the lens has to be replaced with a lens without a coating. Webcams often use an M12 mount. Professional series cameras (IEEE 1394) often come without a lens. Depending on the type, it is usually possible to use an M12, C or CS mount to attach the lens.

Choosing the right lens can be a difficult task; fortunately, many manufacturers provide an online lens calculator. The calculator computes the required focal length based on two input parameters: the distance between the lens and the object (touch surface), and the width or height of the touch surface. Be sure to check whether the calculator chooses a proper lens. Lenses with a low focal length often suffer from severe image distortion (barrel distortion / fish eye), which can complicate the calibration of the touch tracking software.
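The arithmetic behind those online calculators can be sketched with a pinhole-camera approximation; this is not from the book, the sensor width, camera distance and surface width below are hypothetical example values, and real lens selection should still account for distortion.

```python
def focal_length_mm(sensor_mm: float, distance_mm: float,
                    surface_mm: float) -> float:
    """Pinhole approximation of the required focal length:
    f = sensor size * distance / surface size (all along the same axis)."""
    return sensor_mm * distance_mm / surface_mm

# Hypothetical setup: a sensor ~3.6 mm wide, mounted 500 mm
# below a 600 mm wide touch surface.
print(focal_length_mm(3.6, 500, 600))  # -> 3.0
```

A short focal length like this corresponds to a wide-angle lens, which is exactly the regime where barrel distortion becomes a concern.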
Camera sensor & IR bandpass filter: Since FTIR works with IR light, we need to check whether our webcam is able to see IR light. Often user manuals mention the sensor name. Using the sensor name, one can find the data sheet of the camera sensor. The data sheet usually contains a page with a graph similar to the one below; this image belongs to the data sheet of the Sony ICX098BQ CCD sensor [Fig. 1].

Figure 1: Sensitivity of a Sony CCD sensor

The graph shows the spectral sensitivity characteristics. These characteristics show how sensitive the sensor is to light of specific wavelengths. The wavelength of IR light is around 880 nm. Unfortunately, the example image only shows a range between 400 and 700 nm.
Before we can actually use the camera, it is required to add a bandpass filter. When using the (IR sensitive) camera, it will also show all other colors of the spectrum. In order to block this light, one can use a cut-off filter or a bandpass filter. The cut-off filter blocks light below a certain wavelength; the bandpass filter only allows light of a specific wavelength to pass through.

Bandpass filters are usually quite expensive; a cheap solution is to use overexposed developed negatives.
Recommended hardware

When using a USB webcam it is recommended to buy either of the following:

The PlayStation 3 camera (640x480 at 30 FPS). The filter can be removed, and higher frame rates are possible at lower resolutions.

Philips SPC 900NC (640x480 at 30 FPS). The filter cannot be removed; it is required to replace the lens with a different one.

When using IEEE 1394 (FireWire) it is recommended to buy:

Unibrain Fire-i (640x480 at 30 FPS), a cheap low latency camera. Uses the same sensor as the Philips SPC 900NC.

Point Grey Firefly (640x480 at 60 FPS)

FireWire cameras have some benefits over normal USB webcams:

Higher frame rate
Larger capture size
Higher bandwidth
Less driver overhead (due to less compression)
1.3 Frustrated Total Internal Reflection (FTIR)
FTIR is a name used by the multi-touch community to describe an optical multi-touch methodology developed by Jeff Han (Han 2005). The phrase actually refers to the well-known optical phenomenon underlying Han's method. Total Internal Reflection describes a condition present in certain materials when light enters one material from another material with a higher refractive index, at an angle of incidence greater than a specific angle (Gettys, Keller and Skove 1989, p.799). The specific angle at which this occurs depends on the refractive indexes of both materials, and is known as the critical angle, which can be calculated mathematically using Snell's law.
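As a concrete illustration of the critical angle (this sketch is not from the original text; the refractive indices are typical textbook values for acrylic and air), Snell's law gives sin(theta_c) = n2/n1 for light travelling from the denser medium into the rarer one:

```python
import math

def critical_angle_deg(n_dense: float, n_rare: float) -> float:
    """Critical angle for total internal reflection when light travels
    from the denser medium (n_dense) toward the rarer one (n_rare)."""
    return math.degrees(math.asin(n_rare / n_dense))

# Acrylic (PMMA, n ~ 1.49) against air (n ~ 1.0):
print(round(critical_angle_deg(1.49, 1.0), 1))  # -> 42.2
```

Any ray striking the acrylic-air boundary at more than about 42 degrees from the normal stays trapped inside the sheet, which is exactly the behavior the FTIR technique exploits.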
When this happens, no refraction occurs in the material, and the light beam is totally reflected. Han's method uses this to great effect, flooding the inside of a piece of acrylic with infrared light by trapping the light rays within the acrylic using the principle of Total Internal Reflection. When the user comes into contact with the surface, the light rays are said to be frustrated, since they can now pass through into the contact material (usually skin), and the reflection is no longer total at that point. [Fig. 2] This frustrated light is scattered downwards towards an infrared webcam, capable of picking these blobs up and relaying them to tracking software.

Figure 2: FTIR schematic diagram depicting the bare minimum of parts needed for an FTIR setup.
1.3.1 FTIR Layers
This principle is very useful for implementing multi-touch displays, since the light that is frustrated by the user is now able to exit the acrylic in a well defined area under the contact point and becomes clearly visible to the camera below.
Acrylic

According to Han's paper, it is necessary to use acrylic for the screen. The minimum thickness is 6 mm; however, large screens should use 1 cm to prevent the screen from bending.

Before a sheet of acrylic can be used for a multi-touch screen, it needs to be prepared. Because acrylic often gets cut up roughly, it is necessary to polish the sides of the sheet. This is done to improve the illumination from the sides. To polish the sheet, it is recommended to use different kinds of sandpaper. First start with a fine sandpaper to remove most of the scratches; after that, continue with very fine, super fine and even wet sandpaper. To make your sheets shine, you can use Brasso.
Baffle

The baffle is required to hide the light that leaks from the sides of the LEDs. This can be a border of any material (wood/metal).

Diffuser

Without a diffuser, the camera will not only see the touches, but also all objects behind the surface. By using a diffuser, only bright objects (touches) will be visible to the camera. All other noise data will be left out.
Compliant layer

With a basic FTIR setup, the performance mainly depends on how greasy the fingertips of the user are. Wet fingers are able to make better contact with the surface; dry fingers and objects won't be able to frustrate the TIR effect. To overcome this problem, it is recommended to add a compliant layer on top of the surface. Instead of frustrating the total internal reflection by touch, a compliant layer will act as a proxy. The compliant layer can be made out of a silicone material such as ELASTOSIL M 4641. To protect and improve the touch surface, a rear projection material such as Rosco Gray #02105 can be used. With this setup it is no longer required to have a diffuser on the rear side.
1.3.2 Compliant Surfaces
The compliant surface is an overlay placed above the acrylic waveguide in an FTIR based multi-touch system. The compliant surface overlay needs to be made of a material that has a higher refractive index than that of the acrylic waveguide, and one that will couple with the acrylic surface under pressure and set off the FTIR effect, and then uncouple once the pressure is released. Note that compliant surfaces are only needed for FTIR, not for any other method (DI, LLP, DSI). The compliant surface overlay can also be used as a projection screen.

The compliant surface or compliant layer is simply an additional layer between the projection surface and the acrylic. It enhances the finger contact and gives you more robust blobs, particularly when dragging, as your finger will have less adhesion to the surface. In the FTIR technique, the infrared light is emitted into the side of the acrylic waveguide, and the light travels inside the medium (due to total internal reflection, much like a fibre optic cable). When you touch the surface of the acrylic (you frustrate this TIR effect), the light refracts within the medium at the points of contact, creating the blobs (bright luminescent objects). There is much experimentation ongoing in the quest for the perfect compliant layer.

Some materials used to success include Sorta Clear 40 and similar catalyzed silicone rubbers, Lexel, and various brands of RTV silicone. Others have used fabric based solutions like silicone impregnated interfacing and Sulky Solvy.
Te original successul method, still rather popular, is to cast a smooth silicone
surace directly on top o your acrylic and then lay your projection surace on that
ater it cures. Tis requires a material that closely approximates the optical proper-
ties o the acrylic as it will then be a part o the acrylic as ar as your transmitted
IR is concerned, hence the three rubber materials mentioned earlier...they all have
a reractive index that is slightly higher but very close to that o acrylic. Gaining in
popularity is the Cast exture method. inkerman has been leading the pack in
making this a simple process or DIY rather than an involved commercial process.
But essentially, by applying the compliant layer to the underside o the projection
surace, and texturing it, then laying the result on acrylic, you gain several benets.Te compliant surace is no longer a part o the acrylic IR eect so you are no
longer limited to materials with a similar reractive index to that o acrylic, al-
though RV and Lexel remain the most popular choices, edging out catalyzed
silicons here. Since it is largely suspended over the acrylic by the texture, except
where you touch it, you get less attenuation of the refracted IR light, resulting in
brighter blobs.
Fabric-based solutions have a smaller following and a less dramatic proven
success rate, but are without question the easiest to implement if an appropriate
material can be sourced. Basically, they involve lining the edges of your acrylic
with double-sided tape and stretching the fabric over it, then repeating the process
to attach your display surface. Currently the hunt is on for a compliant surface
overlay that works as both a projection surface and a touch surface. Users
have been experimenting with various rubber materials like silicone. Compliant
surfaces, while not a baseline requirement for FTIR, offer the following advantages:

- Protects the expensive acrylic from scratches
- Blocks a lot of light pollution
- Provides consistent results (the effectiveness of bare acrylic touch seems to depend on how sweaty/greasy your hands are)
- Zero visual disparity between the touch surface and the projection surface
- Pressure sensitive
- Seems to react better for dragging movements (at least in my experience)
- Brighter blobs to track, as there is no longer a diffuser between the IR blob light and the camera
Developing a compliant surface for an LCD FTIR setup is difficult, as the surface
must be absolutely clear and distortion-free so as not to obscure the LCD image.
This difficulty is not present in projection-based FTIR setups, as the image
is projected onto the compliant surface itself. To date, no one has successfully
built an LCD FTIR setup with a 100% clear compliant surface. However, several
individuals have had success with LCD FTIR setups with no compliant surface
whatsoever. It appears that the need for a compliant surface largely depends on
how strong the blobs are without one, and on the specific setup.
1.4 Diffused Illumination (DI)
Diffused Illumination (DI) comes in two main forms: Front Diffused
Illumination and Rear Diffused Illumination. Both techniques rely
on the same basic principle - the contrast between the silent (background) image
and the finger that touches the surface.
1.4.1 Front Diffused Illumination
Visible light (often from the ambient surroundings) is shined at the screen from
above the touch surface. A diffuser is placed on top of or beneath the touch
surface. When an object touches the surface, a shadow is created in the position of
the object. The camera senses this shadow.
1.4.2 Rear Diffused Illumination
Depending on the size and configuration of the table, it can be quite challenging to
achieve a uniform distribution of IR light across the surface for rear DI setups.
While certain areas are lit well, and touches there are detected easily, other areas are
darker, requiring the user to press harder for a touch to be detected.
The first approach to solving this problem should be to optimize the hardware
setup, i.e. the positioning of illuminators, changing wall materials, etc. However, if
no further improvement is possible at this level, a software-based approach
might help.
Currently, Touchlib applies any filter with the same intensity to the whole input
image. It makes sense, though, to vary a filter's intensity across different areas
of the surface to compensate for varying light conditions. A gradient-based approach
(http://eis.comp.lancs.ac.uk/~dominik/cms/index.php?pid=24) can be applied,
in which a grayscale map determines, on a per-pixel basis, how strongly
the respective filter is applied to each pixel. This grayscale map
can be created in an additional calibration step.
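The gradient-compensation idea above can be sketched in a few lines. This is an illustrative Python sketch, not Touchlib code; the function and variable names are invented for this example:

```python
def build_touch_mask(frame, weight_map, base_threshold):
    """Threshold a camera frame using a per-pixel weight map.

    `weight_map` holds values in [0, 1], captured once during a
    calibration step: darker (poorly illuminated) regions of the table
    get smaller weights, which lowers the local threshold so weaker
    touches there still register. Frames are plain nested lists here;
    a real tracker would operate on image arrays.
    """
    return [
        [pixel >= base_threshold * weight
         for pixel, weight in zip(frame_row, weight_row)]
        for frame_row, weight_row in zip(frame, weight_map)
    ]
```

With a pixel value of 50 and a base threshold of 80, a well-lit pixel (weight 1.0) is rejected, while the same brightness in a dim corner (weight 0.5, local threshold 40) registers as a touch.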
Infrared light is shined at the screen from below the touch surface. A diffuser is
placed on top of or beneath the touch surface. When an object touches the
surface, it reflects more light than the diffuser or objects in the background; this
extra light is sensed by a camera. [Fig. 3] Depending on the diffuser, this method
can also detect hover and objects placed on the surface. Rear DI, as demonstrated
in the figure below, requires infrared illuminators to function. While these can be
bought pre-fabricated, as discussed in the Infrared Light Source section, they can
also be constructed manually using individual LEDs. Unlike other setups, Rear DI
also needs some sort of diffuser material to diffuse the light, which frequently
doubles as the projection surface.
Figure 3: Rear DI schematic
1.5: Laser Light Plane (LLP)
Infrared light from one or more lasers is shined just above the surface. The laser plane
of light is about 1 mm thick and is positioned right above the surface; when
a finger touches the surface, it breaks the plane and the light hits the fingertip,
which registers as an IR blob. [Fig. 4]
Infrared lasers are an easy and usually inexpensive way to create a multi-touch setup using
the LLP method. Most setups use 2-4 lasers, positioned at the corners of the
touch surface. The laser's power rating (mW, W) is related to its brightness:
the more power, the brighter the IR plane will be.
The common light wavelengths used are 780 nm and 940 nm, as those are the
wavelengths available on the Aixiz.com website, where most people buy their laser
modules. Laser modules need line lenses on them to create a light plane.
The 120-degree line lens is most commonly used, as it reduces the number of
lasers necessary to cover the entire touch surface.
Safety when using lasers of any power is important, so exercise common sense and
be mindful of where the laser beams are traveling.
1.5.1 Laser Safety
Infrared lasers are used to achieve the LLP effect, and these lasers carry some inherent
risk. For most multi-touch setups, 5 mW-25 mW is sufficient. Even lasers of
this grade, however, possess some risk factors with regard to eye damage. Visible-light
lasers, while certainly dangerous if looked into directly, trigger a blink
response that minimizes the damage. Infrared lasers, however, are imperceptible
to the human eye and therefore trigger no blink response, allowing for greater
damage. The lasers fall under this laser safety category [Wikipedia]:

"A Class 3B laser is hazardous if the eye is exposed directly, but diffuse reflections such as from paper or other matte surfaces are not harmful. Continuous lasers in the wavelength range from 315 nm to far infrared are limited to 0.5 W. For pulsed lasers between 400 and 700 nm, the limit is 30 mJ. Other limits apply to other wavelengths and to ultrashort pulsed lasers. Protective eyewear is typically required where direct viewing of a Class 3B laser beam may occur. Class 3B lasers must be equipped with a key switch and a safety interlock."

Figure 4: LLP schematic
As explained in Appendix B, a line lens is used to expand a laser's one-dimensional
beam into a plane. This line lens reduces the intensity of the laser, but risk is still
present. It is imperative that laser safety goggles matching the wavelength of the
lasers used (i.e. 780 nm) are worn during setup of the LLP device. Following initial
setup and alignment, caution should still be exercised when using an LLP setup. No
objects that could refract light oddly should be placed on the setup - for example,
a wine glass's cylindrical bottom would reflect laser light all over
the area haphazardly. A fence should be erected around the border of the setup to
contain stray laser beams, and no laser should ever be looked at directly. Goggles
vary widely in their protection: optical density (OD) is a measure of the protection
provided by a lens against various wavelengths. It is logarithmic, so a lens providing
OD 5 reduces beam power by 100 times more than an OD 3 lens does. There is
merit in setting up the system with low-power visible red lasers for basic alignment,
since they are safer and behave similarly optically. When switching to IR,
detection cards are available that allow observation of low-power IR lasers by
absorbing the light and re-emitting visible light - like an index card used to check a
visible beam, these are a useful way to check the alignment of the IR laser while wearing
the goggles.
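The logarithmic OD scale can be made concrete with a few lines of arithmetic (an illustrative sketch; the function names are invented):

```python
def transmitted_fraction(od):
    """Fraction of beam power a lens of the given optical density lets
    through: each OD unit is another factor-of-ten reduction."""
    return 10.0 ** (-od)

def extra_attenuation(od_high, od_low):
    """How many times more strongly the higher-OD lens attenuates."""
    return transmitted_fraction(od_low) / transmitted_fraction(od_high)
```

For example, extra_attenuation(5, 3) evaluates to 100, matching the OD 5 versus OD 3 comparison above.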
To reiterate the most important safety precautions enumerated above: always use
infrared safety goggles when setting up, and never shine a laser directly into your eye.
If extreme caution is exercised and these rules are not broken, LLP setups are quite
safe. However, violation of any of these safety guidelines could result in serious
injury and permanent damage to your retina.
1.6: Diffused Surface Illumination (DSI)
DSI uses a special acrylic to distribute the IR light evenly across the surface.
Basically, you use your standard FTIR setup with an LED frame
(no compliant silicone surface needed) and just switch to a special
acrylic. [Fig. 5] This acrylic contains small particles that
act like thousands of tiny mirrors. When you shine IR light into the
edges of this material, the light gets redirected and spread across the surface of the
acrylic. The effect is similar to DI, but with even illumination, no hotspots, and the
same setup process as FTIR.
Evonik manufactures several different types of Endlighten, which vary in
thickness and in the amount of particles inside the material. Available thicknesses
range between 6-10 mm, with L, XL and XXL grades for particle amount.
The 6 mm (L) is too flexible for a table setup, but the 10 mm (XXL) works nicely.
Figure 5: DSI Schematic
1.7 LED Light Plane (LED-LP)
LED-LP is set up the same way as an FTIR setup, except that the thick
acrylic that the infrared light travels through is removed and the light
travels over the touch surface. This picture [Fig. 6] shows the layers
that are common in an LED-LP setup.
The infrared LEDs are placed around the touch surface; surrounding all sides is
preferred, to get a more even distribution of light. Similar to LLP, LED-LP
creates a plane of IR light that lies over the touch surface. Since the light
coming from the LEDs is conical instead of a flat laser plane, it will light up
objects placed above the touch surface, not only those touching it. This can be
compensated for by adjusting filter settings in the software (Touchlib/Community Core Vision),
such as the threshold levels, to only pick up objects that are lit up when they are
very close to the touch surface. This is a problem for people starting with this type
of setup and takes some patience. It is also recommended that a bezel (as can be
seen in the picture above) be put over the LEDs to shape the light into more of a
plane.

Figure 6: LED-LP 3D schematic created in Second Life
LED-LP is usually only recommended when working with an LCD screen, as
better methods such as Rear DI exist when using a projector, and those usually
don't work with an LCD screen. Like Rear DI and LLP, the touch surface need not be
thick as in FTIR, but only strong enough to support the forces from working
on the touch surface.
1.7.1 Layers Used
First, a touch surface: a strong, durable, optically clear surface that can take the
pressure of user interaction, usually acrylic or glass. In a projector setup,
the image is stopped and displayed on the projection layer. In an LCD setup,
the diffuser is placed below the LCD screen to evenly distribute the light from
the LCD backlight.
The infrared light for an LED-LP setup comes from infrared LEDs
placed around at least two sides of the acrylic, right above the touch surface.
Typically, the more sides surrounded, the better the setup will perform in IR-prevalent
lighting conditions. Refer to the LED section for more information on IR LEDs.
A computer webcam is placed on the opposite side of the touch surface so that it
can see the blobs. See the camera section for more information on commonly used
cameras.
Figure 7: Picture of LED-LP setup
1.8 Technique Comparison
A frequent question is: "What is the best technique?" Unfortunately,
there is no simple answer. Each technique has its advantages and disadvantages.
The answer is less about what is best and more about what
is best for you - which only you can answer.
FTIR

Advantages:
- An enclosed box is not required
- Blobs have strong contrast
- Allows for varying blob pressure
- With a compliant surface, it can be used with something as small as a pen tip

Disadvantages:
- Setup calls for some type of LED frame (soldering required)
- Requires a compliant surface (silicone rubber) for proper use - no glass surface
- Cannot recognize objects or fiducial markers
Rear DI

Advantages:
- No need for a compliant surface; just a diffuser/projection surface on top/bottom
- Can use any transparent material like glass (not just acrylic)
- No LED frame required
- No soldering (you can buy the IR illuminators ready to go)
- Can track objects, fingers, fiducials, and hovering

Disadvantages:
- Difficult to get even illumination
- Blobs have lower contrast (harder for software to pick up)
- Greater chance of false blobs
- An enclosed box is required
Front DI

Advantages:
- No need for a compliant surface; just a diffuser/projection surface on top/bottom
- Can use any transparent material like glass (not just acrylic)
- No LED frame required
- No soldering (you can buy the IR illuminators ready to go)
- Can track fingers and hovering
- An enclosed box is not required
- Simplest setup

Disadvantages:
- Cannot track objects and fiducials
- Difficult to get even illumination
- Greater chance of false blobs
- Not as reliable (relies heavily on the ambient lighting environment)
LLP

Advantages:
- No compliant surface (silicone) needed
- Can use any transparent material like glass (not just acrylic)
- No LED frame required
- An enclosed box is not required
- Simplest setup
- Could be slightly cheaper than other techniques

Disadvantages:
- Cannot track traditional objects and fiducials
- Not truly pressure sensitive (since light intensity doesn't change with pressure)
- Can cause occlusion if only using 1 or 2 lasers, where light hitting one finger blocks another finger from receiving light
DSI

Advantages:
- No compliant surface (silicone) needed
- Can easily switch back and forth between DI (DSI) and FTIR
- Can detect objects, hovering, and fiducials
- Is pressure sensitive
- No hotspots
- Even finger/object illumination throughout the surface

Disadvantages:
- Endlighten acrylic costs more than regular acrylic (but some of the cost can be made up since no IR illuminators are needed)
- Blobs have lower contrast (harder for software to pick up) than FTIR and LLP
- Possible size restrictions due to the plexiglass's softness
LED-LP

Advantages:
- No compliant surface (silicone) needed
- Can use any transparent material like glass (not just acrylic)
- An enclosed box is not required
- Could be slightly cheaper than other techniques
- Hovering might be detected

Disadvantages:
- Does not track objects or fiducials
- LED frame (soldering) required
- Only narrow-beam LEDs can be used - no ribbons
1.9: Display Techniques

1.9.1 Projectors
The use of a projector is one of the methods used to display visual feedback
on the table. Any video display device (LCDs, OLEDs, etc.)
should work, but projectors tend to be the most versatile and popular
in terms of image size.
There are two main display types of projectors: DLP and LCD.
LCD (Liquid Crystal Display) projectors are made up of a grid of dots that turn on and
off as needed. They use the same technology as a flat-panel monitor
or laptop screen. These displays are very sharp and have very strong color.
LCD projectors can exhibit a screen-door effect, and the color tends to fade after a
few years.
DLP (Digital Light Processing) is a technology that works by means of
thousands of tiny mirrors moving back and forth; the color is created by
a spinning color wheel. This type of projector has a very good contrast ratio
and is very small in physical size. DLP may have a slower response rate.
One thing to consider with projectors is brightness, measured in ANSI lumens.
The higher the number, the brighter the projector and the brighter the setting can
be. In a home theater setting you always want a brighter screen, but in a table setup
it differs: with the projector close to the screen and the screen size being smaller,
the projection can be too bright and give you a hotspot on the screen. This hotspot
effect can be dazzling after looking at the screen for several minutes.
One of the biggest limiting factors of a projector is the throw distance: the
distance needed between the lens of the projector and the screen to get the
right image size. For example, in order to have a screen size of 34 inches, you may
need a distance of 2 feet between your projector and the screen. Check to
see if your projection image can be adjusted to fit the size of the screen you are
using.
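Throw distance follows directly from a projector's throw ratio (distance divided by image width, listed on its datasheet). A small illustrative sketch; the throw ratio of 0.8 used below is an assumed value for a hypothetical short-throw projector, not from any specific model:

```python
def required_throw_distance(image_width, throw_ratio):
    """Lens-to-screen distance needed for a given image width.
    throw_ratio = distance / image width, from the datasheet."""
    return image_width * throw_ratio

def image_width_at(distance, throw_ratio):
    """Image width produced when the lens sits `distance` away."""
    return distance / throw_ratio
```

For instance, required_throw_distance(34, 0.8) gives 27.2: a 34-inch-wide image would need roughly 27 inches of throw with that assumed ratio.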
If you want to make your multi-touch display into a boxed solution, a mirror
can assist in redirecting the projected image. This provides the necessary throw
length between the projector and screen while allowing a more compact configuration.
The aspect ratio is the ratio of image width to image height for the projected
image. For example, a standard television has an aspect ratio of 4:3, while a
widescreen TV has a 16:9 ratio. In most cases a 4:3 ratio is
preferred, but it depends on the use of the interface.
1.9.2 LCD Monitors
While projection-based displays tend to be the most versatile in terms of setup
size, LCD displays are also an option for providing visual feedback on multi-touch
setups. All LCD displays are inherently transparent - the LCD matrix itself has
no opacity. If all of the plastic casing of the display is removed and the IR-blocking
diffuser layers are discarded, the LCD matrix remains, attached to its circuit boards,
controller, and power supply (PSU). When mounted in a multi-touch setup, this
LCD matrix allows infrared light to pass through it while displaying
an image rivaling that of an unmodified LCD monitor or projector.

Naturally, in order for an LCD monitor to produce an image when disassembled,
it must be connected to its circuit boards. These circuit boards are attached to the
matrix via FFC (flat flex cable) ribbons of a certain length. An LCD monitor
is unusable for a multi-touch setup if these FFC ribbons are too short to move
all components of the LCD monitor far enough away from the matrix itself so
as not to inhibit the transparency of infrared light through it. Databases of FFC-compliant
LCD monitors, up to 19", exist on websites such as LumenLabs. LCD
monitors that are not FFC compliant are still potentially usable for multi-touch
setups if the FFC ribbons are extended manually. FFC extensions can be found
in electronics parts stores and can be soldered onto the existing cable to make it the
necessary length.
Chapter 2: Software & Applications

In this chapter:
2.1 Introduction to Software
2.2 Blob Tracking
2.3 Gesture Recognition
2.4 Python
2.5 ActionScript 3
2.6 .NET/C#
2.7 List of Frameworks and Applications
2.1: Introduction to Software Programming
Programming for multi-touch input is much like any other form of coding;
however, there are certain protocols, methods, and standards in the
multi-touch world of programming. Through the work of NUI Group
and other organizations, frameworks have been developed for several
languages, such as ActionScript 3, Python, C, C++, C#, and Java. Multi-touch
programming is two-fold: reading and translating the blob input from the camera
or other input device, and relaying this information through pre-defined protocols
to frameworks that allow the raw blob data to be assembled into gestures that a
high-level language can then use to interact with an application. TUIO (Tangible
User Interface Protocol) has become the industry standard for transmitting blob data,
and the following chapters discuss both aspects of multi-touch software: touch
tracking, as well as the applications operating on top of the tracking frameworks.
2.2: Tracking
Object tracking has been a fundamental research element in the field
of Computer Vision. The task of tracking consists of reliably being
able to re-identify a given object across a series of video frames containing
that object (estimating the states of physical objects over time from
a series of unreliable observations). In general this is a very difficult problem, since
the object must first be detected (quite often in clutter, with occlusion, or under
varying lighting conditions) in all the frames, and then the data must be associated
between frames in order to identify a recognized object.
Countless approaches have been presented as solutions to the problem. The most
common model for the tracking problem is the generative model, which is the
basis of popular solutions such as the Kalman and particle filters.
In most systems, a robust background-subtraction algorithm first needs to preprocess
each frame. This ensures that static background objects in the images are
taken out of consideration. Since illumination changes frequently in video, adaptive
background models, such as Gaussian mixture models, have been developed to
adapt intelligently to non-uniform, dynamic backgrounds.
Once the background has been subtracted out, all that remains are the foreground
objects. These objects are often identified by their centroids, and it is these points
that are tracked from frame to frame. Given an extracted centroid, the tracking
algorithm predicts where that point should be located in the next frame.
For example, to illustrate tracking, a fairly simple model built on a Kalman filter
might be a linear constant-velocity model. A Kalman filter is governed by two
dynamical equations, the state equation and the observation equation. In this
example, the two equations would respectively be:

x_k = x_{k-1} + V*u_k + w_k
y_k = x_k + v_k

The state equation describes the propagation of the state variable, x_k, with process
noise, w_k (often drawn from a zero-mean Normal distribution). If we say
x_{k-1} represents the true position of the object at time k-1, then the model
predicts the position of the object at time k as a linear combination of its position
at time k-1, a velocity term based on constant velocity, V*u_k, and the noise term,
w_k. Now since, according to the model, we do not have direct observations of the
precise object positions, the x_k's, we need the observation equation. The observation
equation describes the actual observed variable, y_k, which in this case we
simply define as a noisy observation of the true position, x_k. The measurement
noise, v_k, is also assumed to be zero-mean Gaussian white noise. Using these two
equations, the Kalman filter recursively iterates a predicted object position given
information about its previous state.
For each state prediction, data association is performed to connect tracking predictions
with object observations. Tracks that have no nearby observation are considered
to have died, and observations that have no nearby tracks are considered to
be new objects to track.
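In one dimension, the predict/update cycle such a filter iterates looks roughly like the sketch below. This is an illustrative example, not code from any tracking framework; the velocity is held fixed for brevity, and the noise variances q and r are arbitrary example values:

```python
def kalman_step(x, v, p, y, q=0.001, r=0.1):
    """One predict/update cycle of a 1-D constant-velocity Kalman filter.

    x, v: previous position estimate and (fixed) velocity
    p:    variance of the position estimate
    y:    noisy observation of the new position
    q, r: process and measurement noise variances
    """
    # Predict: propagate the state equation x_k = x_{k-1} + v + noise.
    x_pred = x + v
    p_pred = p + q
    # Update: blend prediction and observation via the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (y - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

Starting at x = 0 with v = 1, an observation of 1.2 pulls the prediction of 1.0 most of the way toward 1.2, because the measurement noise r is small relative to the state uncertainty.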
2.2.1 Tracking for Multi-Touch
Tracking is very important to multi-touch technology. It is what enables multiple
fingers to perform various actions without interrupting each other. We are also
able to identify gestures because the trajectory of each finger can be traced over
time - impossible without tracking.
Thankfully, today's multi-touch hardware greatly simplifies the task of tracking an
object, so even the simplest Kalman filter becomes unnecessary in practice. In fact,
much of the performance bottleneck of tracking systems tends to come from generating
and maintaining a model of the background. The computational costs of these
systems heavily constrict CPU usage unless clever tricks are employed. However,
with the current infrared (IR) approaches in multi-touch hardware (e.g., FTIR
or DI), an adaptive background model turns out to be overkill. Because
the captured images filter out (non-infrared) light, much of the background
is removed by the hardware. Given these IR images, it is often sufficient to simply
capture a single static background image to remove nearly all of the ambient light.
This background image is then subtracted from all subsequent frames, a threshold
is applied to the resultant frames, and we are left with images containing
blobs of the foreground objects (the fingers or surface widgets we wish to track).
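That pipeline - subtract one static background frame, then threshold - is only a few lines. An illustrative sketch on plain nested lists standing in for grayscale images (a real tracker would use OpenCV image arrays):

```python
def foreground_mask(frame, background, threshold):
    """Static background subtraction followed by thresholding.

    Pixels that are more than `threshold` brighter than the stored
    background frame are kept as foreground (candidate blob pixels);
    everything else is discarded as ambient light.
    """
    return [
        [(pixel - bg_pixel) > threshold
         for pixel, bg_pixel in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]
```

A pixel barely above its stored background value is rejected, while a bright fingertip blob far above it survives the threshold.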
Furthermore, given the domain, the tracking problem is even simpler. Unlike general
tracking solutions, we know that from one frame of video to the next (~33 ms
at the standard refresh rate of 30 Hz), a human finger will travel only a limited distance.
In light of this predictability, we can avoid modeling object dynamics and
merely perform data association by looking for the closest matching object between
two frames with a k-Nearest Neighbors (k-NN) approach. Plain NN exhaustively
compares data points from one set to another to form correspondences
between the closest points, often measured with Euclidean distance. With k-NN,
a data point is compared with the set of k points closest to it, and they vote for the
label assigned to the point. Using this approach, we are able to reliably track blobs
identified in the images for higher-level implementations. Heuristics often need
to be employed to deal with situations where ambiguity occurs - for example, when
one object occludes another.
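The simplest (k = 1) greedy variant of this association can be sketched as follows; real trackers layer k-NN voting and occlusion heuristics on top of something like this, and the names here are illustrative:

```python
import math

def associate(tracks, detections, max_dist):
    """Match previous-frame track positions to current-frame centroids.

    Returns (matches, unmatched): `matches` maps track index ->
    detection index of the closest detection within `max_dist`.
    Tracks with no match are considered dead; detection indices left
    in `unmatched` become new tracks.
    """
    matches = {}
    unmatched = list(range(len(detections)))
    for t_idx, (tx, ty) in enumerate(tracks):
        best, best_dist = None, max_dist
        for d_idx in unmatched:
            dx, dy = detections[d_idx]
            dist = math.hypot(dx - tx, dy - ty)
            if dist <= best_dist:
                best, best_dist = d_idx, dist
        if best is not None:
            matches[t_idx] = best
            unmatched.remove(best)
    return matches, unmatched
```

With tracks at (0, 0) and (10, 10) and detections at (9, 10), (100, 100) and (1, 0), each track pairs with its nearby detection and the distant blob is treated as a new touch.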
Both the Touchlib and CCV frameworks contain examples of such trackers. Using
OpenCV, an open-source Computer Vision library that enables straightforward
manipulation of images and videos, the frameworks track detected blobs in real
time with high confidence. The information pertaining to the tracked blobs (position,
ID, area, etc.) can then be transmitted as events that identify when a new blob
has been detected, when a blob dies, and when a blob has moved. Application
developers then utilize this information by creating outside applications that listen
for these blob events and act upon them.
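On the application side, consuming those events amounts to maintaining a table of live blobs. A minimal sketch of such a listener - the callback names here are invented, as each framework defines its own event interface:

```python
class BlobListener:
    """Tracks currently active blobs from new/moved/died events."""

    def __init__(self):
        self.active = {}  # blob ID -> latest (x, y) position

    def on_blob_new(self, blob_id, x, y):
        # A new touch has been detected by the tracker.
        self.active[blob_id] = (x, y)

    def on_blob_moved(self, blob_id, x, y):
        # An existing touch changed position; update its record.
        self.active[blob_id] = (x, y)

    def on_blob_died(self, blob_id):
        # The touch lifted; forget it.
        self.active.pop(blob_id, None)
```

A gesture layer would then read `active` each frame (or hook these callbacks directly) to compute finger trajectories.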
2.3: Gesture Recognition
"The Challenge of NUI: Simple is hard. Easy is harder. Invisible is hardest." - Jean-Louis Gassée

2.3.1 GUIs to Gestures
The future of Human-Computer Interaction is the Natural User Interface,
which is now blurring its boundary with the present. With
the advancement in the development of cheap yet reliable multi-touch
hardware, it won't be long before we see multi-touch screens not
only in highly erudite labs but also in study rooms, drawing rooms, and maybe
kitchens too.
The mouse and the GUI interface have been among the main reasons for the huge
penetration of computers into society. However, the interaction technique is indirect
and recognition-based. The Natural User Interface with multi-touch screens
is intuitive, contextual, and evocative. The shift from GUI to gesture-based interfaces
will further make computers an integral but unobtrusive part of our lifestyle.
In its broadest sense, the notion of gesture is to embrace all kinds of instances
where an individual engages in movements whose communicative intent is paramount,
manifest, and openly acknowledged [3]. Communication through gestures
has been one of the oldest forms of interaction in human civilization, owing
to various psychological reasons which, however, are beyond the scope of the present
discussion. GUI systems leverage previous experience and familiarity with
the application, whereas the NUI interface leverages human assumptions and
their logical conclusions to present an intuitive and contextual interface based on
gestures. Thus a gesture-based interface is a perfect candidate for social and collaborative
tasks, as well as for applications involving an artistic touch. The interface is
physical, more visible, and offers direct manipulation.
However, only preliminary gestures are being used today, in stand-alone applications
on multi-touch hardware, which gives a lot of scope to its critics. The multi-touch
interface requires a new approach rather than a re-implementation of the GUI
and WIMP methods. The form of the gestures determines whether the
type of interaction is actually multi-touch or single-touch multi-user. We will discuss
the kinds of new gesture widgets required, the development of gesture recognition
modules, and the supporting framework needed to fully leverage the utility of multi-touch
hardware and develop customizable, easy-to-use, complex multi-touch applications.
2.3.2 Gesture Widgets
Multi-touch based NUI setups provide a strong motivation and platform for a
gestural interface, as they are object-based (as opposed to WIMP) and hence remove
the abstraction between the real world and the application. The goal of the interface
should be to realize a direct-manipulation, higher-immersion interface, but
with tolerance for the lower accuracy implied by such an interface. The popular
gestures for scaling, rotating, and translating images with two fingers, commonly
referred to as manipulation gestures, are good examples of natural gestures. [Fig. 1]

New types of gesture widgets [Fig. 2] need to be built to fully implement
the concept of direct manipulation with objects (everything in NUI is an object
with which the user can interact) in an evocative and contextual environment, and
not just to emulate mouse clicks with gestures. Gesture widgets should be designed
with creative thinking and proper user feedback, keeping in mind the context of
the application and the underlying environment. These gesture widgets can then be extended
by the application developer to design complex applications. They should also
support customizable, user-defined gestures.

Figure 1: Examples of gestures in multi-touch applications
Figure 2: Gestural widget examples
2.3.3 Gesture Recognition
The primary goal of gesture recognition research is to create a system which can
identify specific human gestures and use them to convey information or to control
a device. To ensure accurate gesture recognition and an intuitive interface, a
number of constraints are applied to the model. The multi-touch setup should
allow more than one user to interact with it, working on independent (but possibly
collaborative) applications, and should allow multiple such setups to work in
tandem.
Gestures are defined by a starting point within the boundary of one context, an
end point, and the dynamic motion between the start and end points. With multi-
touch input, the system should also be able to recognize the meaning of combinations
of gestures separated in space or time. The gesture recognition procedure can be
categorized into three sequential processes:
Figure 3: Gesture examples - (1) interpreted as 1 gesture, (2) as 3.

Detection of Intention: Gestures should only be interpreted when they are
made within the application window. With the implementation of multi-touch
support in the X Input Extension Protocol and the upcoming XI2, it will be the
job of the X server to relay the touch event sequence to the correct application.
Gesture Segmentation: The same set of gestures in the same application can
map to several meanings depending on the context of the touch events. Thus
the touch events should again be patterned into parts according to the object
of intention. The patterned data are then sent to the gesture recognition
module.
Gesture Classification: The gesture recognition module works on the patterned
data to map it to the correct command. There are various techniques for gesture
recognition which can be used alone or in combination, such as Hidden Markov
Models, Artificial Neural Networks, and Finite State Machines.
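The three stages above can be sketched as a minimal pipeline. This is a simplified illustration with made-up names and data shapes, not the API of any framework discussed here:

```python
# Minimal sketch of the three-stage recognition pipeline described above.
# All names, event fields, and the trivial classification rule are
# illustrative assumptions for this example only.

def detect_intention(events, window):
    """Keep only touch events that fall inside the application window."""
    (x0, y0, x1, y1) = window
    return [e for e in events if x0 <= e["x"] <= x1 and y0 <= e["y"] <= y1]

def segment(events):
    """Group (pattern) events by the object they were aimed at."""
    groups = {}
    for e in events:
        groups.setdefault(e["target"], []).append(e)
    return groups

def classify(group):
    """Map a patterned group of events to a command (trivial rule here)."""
    return "tap" if len(group) == 1 else "drag"

events = [
    {"x": 10, "y": 10, "target": "photo"},
    {"x": 12, "y": 11, "target": "photo"},
    {"x": 500, "y": 500, "target": "desktop"},  # outside the window
]
inside = detect_intention(events, window=(0, 0, 100, 100))
commands = {obj: classify(g) for obj, g in segment(inside).items()}
print(commands)  # {'photo': 'drag'}
```

A real implementation would replace the trivial `classify` rule with one of the statistical techniques discussed next.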
One of the most important techniques is the Hidden Markov Model (HMM).
An HMM is a statistical model in which the distributed initial points work well and
the output distributions are automatically learned by the training process. It is
capable of modeling spatio-temporal time series where the same gesture can differ
in shape and duration. [4]

An artificial neural network (ANN), often just called a neural network (NN), is a
computational model based on biological neural networks [Fig. 4]. It is very flexible
in changing environments. A novel approach can be based on a set of hybrid
recognizers using both HMMs and partially recurrent ANNs. [2]
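As a concrete toy illustration of HMM-based classification (not the cited systems): give each gesture class its own small discrete HMM, quantize strokes into direction symbols, and assign a sequence to the model with the highest likelihood under the forward algorithm. All parameters below are hand-picked assumptions; a real system would learn them from training data.

```python
# Toy discrete-HMM gesture classifier. Observation symbols are quantized
# stroke directions: 0 = "right", 1 = "down". Model parameters are
# invented for illustration, not learned.

def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) via the forward algorithm for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[s] * trans[s][t] for s in range(n)) * emit[t][o]
            for t in range(n)
        ]
    return sum(alpha)

# "swipe_right" mostly emits symbol 0; "swipe_down" mostly emits symbol 1.
models = {
    "swipe_right": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
                    [[0.9, 0.1], [0.8, 0.2]]),
    "swipe_down":  ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]],
                    [[0.1, 0.9], [0.2, 0.8]]),
}

def classify_gesture(obs):
    """Pick the gesture model that best explains the observed strokes."""
    return max(models, key=lambda m: forward_likelihood(obs, *models[m]))

print(classify_gesture([0, 0, 0, 1, 0]))  # swipe_right
print(classify_gesture([1, 1, 0, 1, 1]))  # swipe_down
```

Because the forward algorithm sums over all state paths, the same gesture can vary in duration and still be recognized, which is the property the text highlights.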
The gesture recognition module should also provide functionality for online and
offline recognition, as well as for shot and progressive gestures. A model based on
the present discussion can be pictorially shown as:

Figure 4: Blob detection to gesture recognition framework outline

The separation of the gesture recognition module from the main client architecture
(which can itself be an MVC architecture) helps in using a gesture recognition
module as per requirement; for example, if speech or other inputs later become
available to the gesture recognition module, it can map them to the appropriate
command for the interface controller to update the view and the model.
2.3.4 Development Frameworks
A number of frameworks have been released, and more are being developed, to
help with multi-touch application development by providing an interface for the
management of touch events in an object-oriented fashion. However, the level of
abstraction still stops at device input management, in the form of touch events
sent through the TUIO protocol irrespective of the underlying hardware. The
gesture recognition task, which will realize the true potential of multi-touch
surfaces, is still the job of the client. Often, however, some basic gestures are
already included, in particular those for natural manipulation (e.g. of photos),
but in general these frameworks aren't focused on gestural interfaces; they rather
tend to port the GUI and WIMP canons to a multi-touch environment. [1]
Gesture and gesture recognition modules have gained a lot of momentum with
the rise of the NUI. Some of the important frameworks are:
Sparsh-UI - (http://code.google.com/p/sparsh-ui/)

Sparsh-UI [30], published under the LGPL license, seems to be the first actual
multi-touch gesture recognition framework. It can be connected to a variety of
hardware devices and supports different operating systems, programming languages
and UI frameworks. Touch messages are retrieved from the connected devices
and then processed for gesture recognition. Every visual component of the client
interface can be associated with a specific set of gestures that the framework will
attempt to recognize. New gestures and recognition algorithms can be added to
the default set included in the package.
Gesture Definition Markup Language - (http://nuigroup.com/wiki/Getsure_Recognition/)

GDML is a proposed XML dialect that describes how events on the input surface
are built up to create distinct gestures. Using an XML dialect to describe gestures
allows individual applications to specify their range of supported gestures to the
Gesture Engine. Custom gestures can be supported. This project envisages:

1. Describing a standard library (Gesturelib) of gestures suitable for the majority
of applications.

2. A library of code that supports the defined gestures and generates events to
the application layer.
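The linked wiki page is the authoritative source for GDML. Purely as a hypothetical illustration of the idea (every element and attribute name below is invented, not the actual GDML schema), a gesture description in such an XML dialect might look like:

```xml
<!-- Hypothetical sketch only; not the real GDML schema. -->
<gesture name="pinch">
  <touches min="2" max="2"/>
  <motion type="converging"/>
  <emit event="zoom-out"/>
</gesture>
```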
Grafiti

Grafiti is a C# framework, built on top of the TUIO client, that manages multi-
touch interactions in table-top interfaces. The possible use of tangible objects is
particularly contemplated. It is designed to support third-party modules for
(specialized) gesture recognition algorithms; however, a set of modules for the
recognition of some basic gestures is included in the project.
NUIFrame

NUIFrame is a C++ framework based on the model discussed above (currently
under development). It provides a separate gesture recognition module to which,
besides the touch-event sequence, the client application also provides contextual
information regarding the view of the interface. This ensures that the same
gesture on different objects can result in different operations depending on the
context. It will also support custom gestures based on a user's specification for a
particular command. The set of gesture widgets will also support automatic
debugging using pattern generation, according to the gestures each widget supports.
AME Patterns Library

The AME Patterns library is a new C++ pattern recognition library, currently
focused on real-time gesture recognition. It uses concept-based programming to
express gesture recognition algorithms in a generic fashion. The library has
recently been released under the GNU General Public License as part of AMELiA
(the Arts, Media and Engineering Library Assortment), an open source library
collection. It implements both a traditional hidden Markov model for gesture
recognition and some reduced-parameter models that provide reduced training
requirements (only 1-2 examples) and improved run-time performance while
maintaining good recognition results.
There are also many challenges associated with the accuracy and usefulness of
gesture recognition software. These can be enumerated as follows:

- Noise patterns in the gestures.
- Inconsistency of people in the use of their gestures.
- Finding the common gestures used in a given domain instead of mapping a
difficult-to-remember gesture.
- Tolerance to variability in the reproduction of gestures.
- Tolerance of inaccurate pointing, as compared to a GUI system.
- Distinguishing intentional gestures from unintentional postural adjustments,
known as the gesture saliency problem.
- Gorilla arm: humans aren't designed to hold their arms in front of their faces
making small motions. After more than a very few selections, the arm begins to
feel sore, cramped, and oversized; the operator looks like a gorilla while using
the touch screen and feels like one afterwards. This limits the development and
use of vertically-oriented touch screens as a mainstream input technology, despite
a promising start in the early 1980s. It is not, however, a problem for short-term
usage.
Further improvements to the NUI can be made by augmenting the input to the
gesture recognition module with input from proximity and pressure sensing,
voice, facial gestures, and so on.
2.4 Python

Python is a dynamic object-oriented programming language that can be used for
many kinds of software development. It offers strong support for integration with
other languages and tools, comes with extensive standard libraries, and can be
learned in a few days. Many Python programmers report substantial productivity
gains and feel the language encourages the development of higher-quality, more
maintainable code.
2.4.1 Introduction
This section describes two great technologies: multi-touch user interfaces
and the Python programming language [1]. The focus is specifically directed
towards using Python for developing multi-touch software. The Python module
PyMT [2] is introduced and used to demonstrate some of the paradigms and
challenges involved in writing software for multi-touch input. PyMT is an open
source Python module for rapid prototyping and development of multi-touch
applications.
In many ways, both Python and multi-touch have emerged for a similar reason:
making computers easier to use. Although as a programming language Python
certainly requires advanced technical knowledge of computers, a major goal is to
let developers write code faster and with less hassle. Python is often touted as
being much easier to learn than other programming languages [3]. Many Python
programmers feel strongly that the language lets them focus on problem solving
and reaching their goals rather than dealing with arcane syntax and strict language
properties. With its emphasis on readability and the plethora of modules available
from the open source community, Python tries to make programming as easy and
fun as multi-touch user interfaces make computers natural and intuitive to use.
Although the availability of multi-touch interfaces is growing at an unprecedented
rate, interactions and applications for multi-touch interfaces have yet to reach
their full potential. Especially during this still very experimental phase of
exploring multi-touch interfaces, being able to implement and test interactions
and prototypes quickly is of utmost importance. Python's dynamic nature, rapid
development capabilities, and plethora of available modules make it an ideal
language for quickly prototyping and developing multi-touch interactions and
applications.
2.4.2 Python Multi-touch Modules and Projects
The following is a brief overview of some of the multi-touch related Python
projects and modules available:
PyMT [2]

PyMT is a Python module for developing multi-touch enabled, media-rich
OpenGL applications. It was started as a research project at the University of
Iowa [10], and more recently has been maintained and grown substantially thanks
to a group of very motivated and talented open source developers [2]. It runs on
Linux, OS X, and Windows. PyMT will be discussed in further detail in the
following subsections. PyMT is licensed under the GPL.
touchPy [4]

touchPy is a Python framework for working with the TUIO [8] protocol. It
listens to TUIO input and implements an Observer pattern that developers can
use or subclass. The Observer pattern makes touchPy platform and module
agnostic; for example, it does not care which framework is used to draw to the
screen. There is a series of great tutorials for touchPy written by Alex Teiche [9].
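The Observer pattern mentioned here is what decouples the TUIO listener from whatever renders the screen. A minimal sketch of the idea follows; the names are illustrative, not touchPy's actual API:

```python
# Minimal Observer pattern for touch input: the dispatcher knows nothing
# about its observers beyond the on_touch callback, so any drawing
# framework can subscribe. Names invented for this example.

class TouchDispatcher:
    def __init__(self):
        self._observers = []

    def register(self, observer):
        self._observers.append(observer)

    def dispatch(self, event):
        # In a real system, events would arrive from a TUIO socket.
        for obs in self._observers:
            obs.on_touch(event)

class Logger:
    """One possible observer; a renderer would be another."""
    def __init__(self):
        self.seen = []

    def on_touch(self, event):
        self.seen.append(event)

dispatcher = TouchDispatcher()
log = Logger()
dispatcher.register(log)
dispatcher.dispatch({"id": 1, "x": 0.4, "y": 0.7})
print(log.seen)  # [{'id': 1, 'x': 0.4, 'y': 0.7}]
```

Any number of observers can be registered, which is why such a framework stays agnostic about the drawing layer.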
PointIR [5]

PointIR is a Python-based multi-touch system demoed at PyCon 2008. While it
certainly looks very promising, it seems to have vanished from the (searchable?)
realms of the Internet. Specific license information is unknown.
libAVG [6]

According to its web page, libavg is a high-level media development platform
with a focus on interactive installations [4]. While not strictly a multi-touch
module, it has been used successfully to receive TUIO input and do blob tracking,
acting as a multi-touch capable system. libavg is currently available for Linux and
Mac OS X. It is open source and licensed under the LGPL.
pyTUIO [7]

A Python module for receiving and parsing TUIO input. pyTUIO is licensed
under the MIT license.
2.4.3 PyMT

PyMT is a Python module for developing multi-touch enabled, media-rich
OpenGL applications. The main objective of PyMT is to make developing novel
and custom interfaces as easy and fast as possible. This section introduces PyMT
in more detail, discusses its architecture, and uses it as an example to discuss some
of the challenges involved in writing software for multi-touch user interfaces.
Every (useful) program ever written does two things: take input and provide
output. Without some sort of input, a program would always produce the exact
same output (e.g. "Hello World!"), which would make the program rather useless.
Without output, no one would ever know what the program was doing, or whether
it was doing anything at all. For applications on multi-touch displays, the main
method of input is touch, and graphics drawn on the display are the primary
output. PyMT tries to make dealing with these two forms of input and output as
easy and flexible as possible. For input, PyMT wraps the TUIO protocol [8] into
an event-driven widget framework. For graphical output, PyMT builds on
OpenGL to allow for hardware-accelerated graphics and maximum flexibility in
drawing.
2.4.4 PyMT Architecture

Figure 5 shows the main architecture of PyMT. As already mentioned, it uses
TUIO and OpenGL for input and output. The current version of PyMT relies on
pyglet [11], a cross-platform OpenGL windowing and multimedia library for
Python. Pyglet's multimedia functionality in particular makes dealing with image,
audio and video files very easy: most media files can be loaded with a single line
of Python code (and drawn or played to an OpenGL context with a single line as
well). This subsection briefly discusses the role of OpenGL as a rendering engine,
as well as the event system and widget hierarchy at the heart of PyMT.
Figure 5: PyMT architecture. PyMT builds on top of pyglet [11], which provides
OpenGL windowing and multimedia loading for various file formats. It also
implements a TUIO client that listens for input over the network. PyMT glues
these technologies together and provides an object-oriented library of widgets and
a hierarchical system for layout and event propagation.
2.4.5 OpenGL
Using OpenGL as a drawing backend has both advantages and disadvantages.
While it allows for ultimate performance and flexibility in terms of 2D and 3D
rendering, it is also very low-level; its learning curve can therefore be rather steep,
and it requires a good understanding of the basics of computer graphics.

PyMT tries to counteract the need for advanced OpenGL knowledge by providing
basic drawing functions that act as wrappers to OpenGL. For example, PyMT
includes functions such as drawCircle, drawRectangle, drawLine,
drawTexturedRectangle and many others that require no knowledge of OpenGL.
Using OpenGL as the underlying rendering engine, however, lets more advanced
users take full control over the visual output of their applications at peak
performance. PyMT also provides helper functions and classes to aid advanced
OpenGL programming: creating, for example, a frame buffer object or shaders
from GLSL source can be done in one line. PyMT also handles projection matrices
and layout transformations under the hood, so that widgets can always draw and
act in their local coordinate space without having to deal with outside parameters
and transformations. The main idea is to do things the easy way most of the time,
but to provide easy access to raw OpenGL power when advanced techniques or
custom drawing are required.
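The local-coordinate idea can be illustrated with a small stand-in sketch (not PyMT's actual implementation): a widget stores its position in parent coordinates, and incoming coordinates are converted before hit-testing or drawing, so the widget's own logic never sees outside transformations.

```python
# Illustrative sketch of widget-local coordinates: the parent-space
# offset is subtracted so the widget reasons purely in its own space.
# Names and defaults are made up for the example, not PyMT's API.

class Widget:
    def __init__(self, x=0, y=0, width=100, height=100):
        self.x, self.y = x, y
        self.width, self.height = width, height

    def to_local(self, x, y):
        """Convert parent-space coordinates to this widget's local space."""
        return (x - self.x, y - self.y)

    def collide_point(self, x, y):
        """Hit-test a point given in parent-space coordinates."""
        lx, ly = self.to_local(x, y)
        return 0 <= lx <= self.width and 0 <= ly <= self.height

w = Widget(x=50, y=20)
print(w.to_local(60, 25))       # (10, 5)
print(w.collide_point(60, 25))  # True
print(w.collide_point(10, 10))  # False
```

A full framework would generalize this to nested transformations (translation, rotation, scale) via matrices, which is what handling "projection matrices and layout transformations under the hood" refers to.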
Describing OpenGL itself far exceeds the scope of this document. For general
information about OpenGL see [12][13], the standard reference "The Red
Book" [14], or the classic NeHe OpenGL tutorials on nehe.gamedev.net [15].
2.4.6 Windows, Widgets and Events
Most Graphical User Interface (GUI) toolkits and frameworks have a concept
called the widget. A widget is considered the building block of a common
graphical user interface: an element that is displayed visually and can be
manipulated through interactions. The Wikipedia article on widgets, for example,
says the following [16]:

In computer programming, a widget (or control) is an element of a graphical
user interface (GUI) that displays an information arrangement changeable by the
user, such as a window or a text box. [...] A family of common reusable widgets
has evolved for holding general information based on the PARC research for the
Xerox Alto User Interface. [...] Each type of widget generally is defined as a class
by object-oriented programming (OOP). Therefore, many widgets are derived
from class inheritance.

Wikipedia article: GUI Widget [16]
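The quoted description — widgets as classes related by inheritance — can be made concrete with a minimal sketch. All names here are invented for illustration; this is not the API of PyMT or any other toolkit:

```python
# Minimal widget hierarchy: a base class defines the event interface and
# propagates touches to children; subclasses inherit and specialize it.

class Widget:
    def __init__(self):
        self.children = []

    def add_widget(self, child):
        self.children.append(child)

    def on_touch_down(self, x, y):
        """Propagate the touch to children; return True when handled."""
        return any(c.on_touch_down(x, y) for c in self.children)

class Button(Widget):
    """A widget subclass that reacts to touches inside its bounds."""
    def __init__(self, x, y, w, h):
        super().__init__()
        self.x, self.y, self.w, self.h = x, y, w, h
        self.pressed = False

    def on_touch_down(self, x, y):
        if self.x <= x <= self.x + self.w and self.y <= y <= self.y + self.h:
            self.pressed = True
            return True
        return super().on_touch_down(x, y)

root = Widget()
button = Button(0, 0, 50, 50)
root.add_widget(button)
root.on_touch_down(10, 10)
print(button.pressed)  # True
```

The same inheritance mechanism is what lets a programmer derive custom widgets with novel interaction behavior, which is PyMT's stated focus.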
PyMT uses similar concepts to other GUI toolkits and provides an array of
widgets for use in multi-touch applications as part of the framework. However,
the main focus lies on letting the programmer easily implement custom widgets
and experiment with novel interaction techniques, rather than on providing a
stock of standard widgets. This is motivated primarily by the observations and
assumptions that:

- Only very few widgets and interactions have proven themselves standard
within the natural user interface (NUI).
- The NUI is inherently different from the classic GUI / WIMP (Window, Icon, Menu, Pointing