American Institute of Aeronautics and Astronautics

Formal Evaluation of IMU-based Gesture Recognition for UAS Aircraft Carrier Deck Handling

Amanda K. Lampton, Ph.D.*

Justin R. Gray†

Justin P. Miller‡

Systems Technology, Inc., Hawthorne, CA, 90250

Integrating unmanned aircraft systems into manned operations is a challenging balancing act between making the changes needed to accommodate the unmanned systems and minimizing the impact of those changes on daily operations. This is nowhere more apparent than on the flight deck of a Navy aircraft carrier, where daily operations and mission events are like a carefully choreographed dance that has evolved and been perfected over the last hundred years, a dance in which a breakdown in communication can result in a slowdown of operations at best and catastrophic damage to equipment and/or loss of life at worst. As the Navy moves toward integrating unmanned operations into manned operations, it is particularly important to develop technology that allows the aircraft directors on deck to communicate with unmanned aircraft in as nearly the same manner as they do with manned aircraft. This means using the same gesture-based lexicon with which directors communicate with pilots, without adding more personnel on deck. In response to this need, an effort was made to develop an inertial measurement-based gesture recognition hardware/software solution. This gesture recognition system consists of standard signalman wands modified by embedding an inertial measurement unit in the shaft, together with machine learning-based classification algorithms that use the inertial data as input to establish the communication link between director and unmanned aircraft. The system was evaluated by four current U.S. Navy aircraft directors through a series of evaluation tasks intended to emulate basic carrier deck mission events. Quantitative assessments and director opinions of the system indicated that it enabled communication between the directors and the unmanned aircraft to the extent that the tasks could be accomplished in a timely manner and with little change to how the directors guide the aircraft.

I. Introduction

Due to the extraordinarily noisy environment of an operational aircraft carrier deck, where jet engines, rotorcraft rotors, wind, etc. often produce sound levels of up to 140 dB, communication on the flight deck has evolved into a primarily gestural language employed by Aircraft Directors (ADs) and Navy pilots to communicate with each other. An Unmanned Aircraft System (UAS) controlled by this same gesture vocabulary could integrate smoothly into this already chaotic environment, minimize disruptions to carrier evolutions, avoid the need for specially trained personnel, and minimize transition cost. With the imminent integration of UASs into carrier deck operations, a technological solution is needed to facilitate UAS deck handling and communication between ADs and UASs. Such a solution should perform as well as or better than communication between ADs and Navy pilots, be able to learn new gestures as needed, and require minimal additional devices or changes to standard deck operations.

To meet this need and at the behest of the Office of Naval Research (ONR), a team led by Systems Technology, Inc. (STI) has created the on-Deck Intelligent Aircraft Body Language Observer (DIABLO). DIABLO is a unique hardware/software solution built around signalman wands with an embedded inertial measurement unit (IMU). It uses a machine learning-based gesture recognition algorithm to interpret the signals given by ADs from IMU data sensed within slightly modified standard signalman wands. This paper focuses on the formal evaluations of the DIABLO system by Navy ADs conducted at the NAS Patuxent River Manned Flight Simulator (MFS).

* Senior Research Engineer, AIAA Senior Member.
† Staff Engineer, Analytical, AIAA Member.
‡ Staff Engineer, Analytical, AIAA Member.

Downloaded by David Klyde on January 13, 2018 | http://arc.aiaa.org | DOI: 10.2514/6.2018-0075

2018 AIAA Information Systems-AIAA Infotech @ Aerospace

8–12 January 2018, Kissimmee, Florida

10.2514/6.2018-0075

Copyright © 2018 by Systems Technology, Inc. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.

AIAA SciTech Forum

Recent efforts to integrate UASs into the fleet include Northrop Grumman’s X-47B. The X-47B is an unmanned

combat air vehicle (UCAV) that was designed to test integrating autonomous aircraft into aircraft carrier-based

operations. Two demonstrators were built and tested between 2011 and 2015, performing both land- and carrier-

based demonstrations. The X-47B is a blended wing-body aircraft with a 62.1 ft wingspan and 38.2 ft length. Its

cruising speed is Mach 0.9+ with a service ceiling of 40,000 ft and range of 2,100+ nm. It is classified as semi-

autonomous as there are some operations in which a remote pilot has control. A photo of the X-47B on a carrier

deck is shown in Figure 1.

Figure 1: X-47B Arrested Carrier Landing on the USS George H.W. Bush

Of particular interest to the DIABLO program is the X-47B precision taxi test phase that was conducted on the USS Harry S. Truman beginning on Dec. 9, 2012.¹ The UCAV’s flight control systems are fully autonomous, but ground maneuvers are controlled remotely via an arm-mounted control display unit (CDU, see Figure 2).² The operator of the CDU follows the direction of the aircraft directors who signal to the UCAV just as they would to any other aircraft, thus disrupting deck operations as little as possible. As shown in Figure 3, one of the CDU operators followed the signals of the aircraft director and showed that the X-47B could move with precision around the deck.

Figure 2: Arm-mounted CDU

Figure 3: Operator Using the CDU to Follow Aircraft Director Signals to Taxi the X-47B Aboard the USS Harry S. Truman

Though well received by the crew, this initial solution requires an additional person on the flight deck for each active UAS, which the Navy does not find ideal. Alternatively, there is a significant body of work on recognition of hand gestures using optical sensing. In Ref. 3, Venetsky thoroughly defines the problem of computer vision and gesture recognition on the flight deck. He discusses technical issues and describes the conditions with which to contend when selecting optical sensors and recognition approaches. Major challenges to vision-based gesture recognition include:

• Low light

• Blooming

• Sun glare

• Steam

• Jet exhaust

• Pose

• Occlusion

• Scene Clutter

• Orientation with respect to aircraft nose

• Rotation with respect to aircraft

• Director hand-off

• Temporal resolution

Because DIABLO relies on inertial rather than optical sensing, it is susceptible to none of these.

The remainder of this paper, which discusses the formal evaluation of DIABLO, is organized as follows. The DIABLO system, both software and hardware, is described in Section II. The simulator and simulation environment in which the formal evaluations were conducted are described in Section III. The evaluation tasks in general and the one exemplified herein are discussed in Section IV. The formal evaluation test setup is discussed in Section V. Results are discussed in Section VI, followed by conclusions drawn from the formal evaluations in Section VII.

II. The DIABLO System

The DIABLO system consists of a hardware component and a software component. The prototype hardware component comprises standard signalman wands with a commercial-off-the-shelf IMU embedded in the shaft of each wand. The software component is the gesture recognition algorithm, which is shown in block diagram form in Figure 4 as it relates to and communicates with the wands and aircraft. The software itself comprises the Gesture Classification Algorithm and the State Logic Algorithm.

Figure 4: DIABLO Gesture Recognition Algorithm

A. Gesture Recognition Algorithm

1. Gesture Classification Algorithm

The core of the Gesture Classification Algorithm is a model generated using the Library for Online Learning (LIBOL) algorithms, an open-source library of a family of classical and state-of-the-art online learning algorithms for large-scale machine learning and data mining research.⁴ The classifying algorithms require a feature set upon which to base classification. These feature vectors are extracted from a signal set sent from the DIABLO wands that includes the three linear accelerations along the x-, y-, and z-axes, the three angular accelerations, and the three magnetometer readings.

The inertial data used to train the model was gathered from 22 ADs stationed on the USS Carl Vinson while the

ship was in port in October 2016. Using the DIABLO wands, they each performed five sets of the NATOPS gestures

listed below.

• Move Ahead

• Move Back

• Slow Down

• Turn Left

• Turn Right

• Pass Control

• Slow Down Left Engine

• Slow Down Right Engine

• Shut Down Left Engine

• Shut Down Right Engine

• Spread Wing

• Fold Wing

• Stop

• I Have Command

• Engage Nose Gear

• Disengage Nose Gear

• Launch Bar Up

• Launch Bar Down

• Up Hook

• Down Hook

• Pivot Left/Right

• Brakes On/Off

• Throttle Up

• Throttle Down

• Added Capability

    • Move Ahead: Slowest, Slow, Pace, Fast, Fastest

    • Turn Left/Right: Widest, Wide, Veer, Tight

    • Nose Bump

Using the DIABLO system, as the AD performs a gesture, the sensed inertial data are passed to the gesture classification algorithm. The trained model classifies the data, and the gesture with the highest value is passed on as the ‘recognized’ gesture.
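The selection step described above can be illustrated with a minimal sketch. It assumes the trained model yields one numeric value per candidate gesture; the function name `recognize` and the example values are hypothetical and are not taken from the DIABLO software or from LIBOL.

```python
from typing import Mapping


def recognize(scores: Mapping[str, float]) -> str:
    """Return the gesture label with the highest classifier value.

    `scores` maps gesture names to the value the trained model assigned
    to each candidate (hypothetical interface, for illustration only).
    """
    if not scores:
        raise ValueError("no gesture scores supplied")
    return max(scores, key=scores.get)


# Illustrative values only; not measured DIABLO outputs.
example_scores = {"Move Ahead": 0.71, "Stop": 0.18, "Turn Left": 0.11}
print(recognize(example_scores))  # Move Ahead
```

In this sketch ties are resolved in favor of the first gesture encountered, which is an arbitrary choice made only to keep the example short.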

2. State Logic Algorithm

To improve the accuracy of the classification algorithm, the State Logic Algorithm was implemented within the

greater Gesture Recognition Algorithm. The algorithm effectively limits the gestures that can be recognized in a

given aircraft deck handling state. The aircraft deck handling states are

• Taxi

• Non-Taxi

• Pivot

• Launch

• Precision

• Recovery

• Emergency

• No Aircraft

As an example of this restriction, the gestures that can be recognized in the Taxi state are Move Ahead, Stop, Turn Left/Right, Pass Control, Spread/Fold Wings, and Pivot Left/Right.
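One plausible way to realize this restriction is to discard the classifier values of gestures that are not permitted in the current state before selecting the highest one. The sketch below assumes that approach; the paper does not state whether the actual State Logic Algorithm masks candidates before selection or rejects them afterwards. Only the Taxi gesture set comes from the text (with the paired gestures expanded into individual labels), and the names `ALLOWED_GESTURES` and `recognize_in_state` are hypothetical.

```python
from typing import Mapping, Optional

# Only the Taxi entry is taken from the text; the other deck handling
# states would need their own (unpublished) gesture sets.
ALLOWED_GESTURES = {
    "Taxi": {
        "Move Ahead", "Stop", "Turn Left", "Turn Right", "Pass Control",
        "Spread Wing", "Fold Wing", "Pivot Left", "Pivot Right",
    },
}


def recognize_in_state(scores: Mapping[str, float], state: str) -> Optional[str]:
    """Pick the highest-valued gesture among those allowed in `state`.

    Returns None when no allowed gesture received a value.
    """
    allowed = ALLOWED_GESTURES[state]
    candidates = {g: s for g, s in scores.items() if g in allowed}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Launch Bar Down has the highest raw value but is not a Taxi gesture,
# so the next-best allowed gesture is returned instead.
raw = {"Launch Bar Down": 0.5, "Move Ahead": 0.3, "Stop": 0.2}
print(recognize_in_state(raw, "Taxi"))  # Move Ahead
```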

The final recognized gesture output by the Gesture Recognition Algorithm is sent to the UAS, which interprets

that command into a ground control command.

B. IMU-Embedded Signalman Wands

The prototype IMU-embedded wand is shown in Figure 5 with a standard signalman wand and a Yost Labs 3-Space™ Wireless 2.4GHz DSSS IMU. It is shown disassembled in Figure 5a, assembled with a ghost image of the sensor in Figure 5b, and viewed down the shaft in Figure 5c to show how the IMU is integrated into the wand. The wireless inertial sensors fit in the signalman wands with minimal room for movement. To embed them into the wands, the incandescent light bulb was replaced with an LED, which created room within the wand by eliminating the need for two D batteries. The sensor is then mounted and secured to reduce movement.

a) Wand Components

b) Sensor Location

c) Shaft View

Figure 5: Prototype IMU-Embedded Signalman Wand

To further increase classification accuracy, triggers were added to the signalman wands (Figure 6). This addition

enables additional user input while performing gestures. The current IMUs embedded in the wands have two toggle

buttons on the front face of the sensor casing, which output a 0 when not depressed and either a 1, 2, or 3 when

depressed. To utilize the buttons on the sensors, a mechanical trigger was added to each wand that can be depressed

using the index and middle fingers of each hand.

Figure 6: DIABLO Wand with Trigger

Rather than recollecting training data with the trigger in use, the algorithm treats each trigger as a Boolean input: a trigger value greater than 0 means the wand trigger is being pressed. In practice, this means that when the left trigger is pressed, the algorithm will only classify left directional gestures, thus reducing the number of gestures from which the recognition algorithm can select.
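The trigger logic can be sketched as a filter applied to the candidate gestures. The left-trigger behavior follows the description above; treating the right trigger symmetrically, identifying directional gestures by the words "Left" and "Right" in their names, and leaving the candidates unchanged when neither or both triggers are pressed are assumptions made for illustration. The function names are hypothetical.

```python
from typing import Iterable, List


def trigger_pressed(button_value: int) -> bool:
    """The sensor reports 0 when not depressed and 1, 2, or 3 when depressed."""
    return button_value > 0


def filter_by_trigger(candidates: Iterable[str],
                      left_button: int,
                      right_button: int) -> List[str]:
    """Restrict candidate gestures according to which wand trigger is held."""
    left = trigger_pressed(left_button)
    right = trigger_pressed(right_button)
    if left and not right:
        return [g for g in candidates if "Left" in g]
    if right and not left:  # assumed mirror image of the left-trigger case
        return [g for g in candidates if "Right" in g]
    return list(candidates)  # assumption: no directional restriction


gestures = ["Turn Left", "Turn Right", "Pivot Left", "Stop"]
print(filter_by_trigger(gestures, left_button=1, right_button=0))
# ['Turn Left', 'Pivot Left']
```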

III. Simulator Test of the DIABLO System

A. Simulator Description

Lab 8 of the MFS is the SEOS Prodas Partial Dome simulator. The display system in Lab 8 (see Figure 7)

features an 11-ft diameter real image with 180-deg horizontal, 85-deg up, and 35-deg down field-of-view. Six

Projection Design FL35 LED projectors are used, each having a 2560x1600 pixel resolution, and the display

provides the equivalent of 20/60 visual acuity. The Aechelon pC-Nova image generator is capable of displaying

300k+ polygons updated at 60 Hz with less than 60 ms transport delay.

Figure 7: MFS Lab 8 Simulator

B. Simulation Description

Several additional capabilities have been developed specifically for deck handling Research, Development, Test & Evaluation (RDT&E). First, the MFS’s F/A-18E airframe simulation can be used as a surrogate UAS with the addition of a command-based ground controller developed and integrated with the F-18 E/F airframe. The controller accepts speed and turn commands as well as discrete signals such as Brakes On, Hook Up, Launch Bar Down, etc. The visual environment of the carrier deck and the air vehicle are enhanced to support precise visual positioning for deck handling tasks such as catapult hookup and deck edge parking. Viewpoint slewing supports multiple directors with a single aircraft, and multiple lab stations can be tied together to support multi-director, multi-aircraft scenarios. Environmental factors including weather, visibility, ship motion, variable deck friction, and wind conditions are also available for use in testing. Additionally, a 3-D sound model of the simulated aircraft engine noise was developed to enrich the cueing of vehicle response to controller commands.

Figure 8: View into MFS Lab 8 During a DIABLO Evaluation Run

C. System Configuration and Integration

Integrating DIABLO into MFS Lab 8 required two computers. One ran the DIABLO software and received the

incoming data from the signalman wands. It also ran the interface for MFS while the second computer hosted the

simulation and graphics models. A simplified block diagram of this setup is illustrated in Figure 9. The main

challenges faced when integrating DIABLO included ensuring the DIABLO software executed properly on the MFS

computer and ensuring proper communication from DIABLO to the surface software, which then sends the

commands to the aircraft simulation.

Figure 9: MFS/DIABLO Integration Block Diagram

IV. Evaluation Tasks

A. Overview

The evaluation tasks were designed to assess the performance of DIABLO as an AD taxis a UAS through basic on-deck mission events and path-following tasks. A top-down diagram of the simulated aircraft carrier is shown in Figure 10 with pertinent areas and boundaries labeled.

Figure 10: Top Down Diagram of the Simulated Aircraft Carrier

The tasks evolved over the course of the checkout weeks and the first evaluation week at MFS with feedback from the ADs. The final set of tasks used for formal evaluations includes:

• Finger to Launch

• Retrieval to Finger

• Retrieval to Launch

• Path Following Simple

• Path Following Complex

For brevity, the Finger to Launch task is the focus herein, as it highlights both the maneuverability achievable with DIABLO as the aircraft taxis the length of the deck and the precision control necessary to successfully engage the aircraft in the catapult and ready it for launch.

B. Finger to Launch Task Description

Objectives

• Evaluate aircraft taxiing task performance.

• Characterize director assessment of the DIABLO hardware/software solution for UAS carrier deck

handling via ratings, comments, and/or targeted debrief questionnaires.

Environmental Conditions

• Day

• Clear

• 200 mi visibility

• Calm sea state

Pre-Task Setup

• Launch Bar: Up

• Hook: Up

• Brakes: On

• Power: Idle

• Wings: Folded

• Has Command: Director 1

• Viewpoint: VP 1

Description

From the initial parked position on the finger, direct the aircraft across the landing strip and forward past the

island, avoiding wire mounts and other obstacles. Direct the aircraft toward the elevator just forward of the island,

and near the edge of the deck execute a turn port toward the bow of the ship. Follow the edge of the ship to the next

elevator. Then direct the aircraft to catapult 1. Execute the procedure to prepare the aircraft for a catapult launch up

to and including passing control to the shooter.

While taxiing, pass control to the next director before the aircraft nose passes the current director’s position. The next director will signal I Have Command to accept command of the aircraft. The viewpoint (see Figure 11 and Table 1) will change with the Pass Control/I Have Command exchange. If only one director is working the task, Pass Control alone will trigger the viewpoint change.

Figure 11: Finger to Launch Task Viewpoints

Table 1: Finger to Launch Task Viewpoint Coordinates

No.   X, +fwd (ft)   Y, +stbd (ft)   Yaw* (deg)
1     -348.79        -85.20          165.0
2     -330.09         90.00          300.0
3      -55.33         24.32          130.0
4      159.48         29.69          110.0
5      241.97         62.62          176.1
6      240.34         22.29           80.0

*Yaw denotes the direction of the view with North through the MFS aircraft carrier simulation y-axis (x-axis in

the figures here).

Desired Performance

• <180 sec from finger to jet blast deflector 1

• Successfully prep for launch in 1 attempt

Adequate Performance

• <270 sec from finger to jet blast deflector 1

• Successfully prep for launch in 2-3 attempts
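The desired and adequate thresholds above can be expressed as a small rating helper. This is a sketch only: the function names and the "inadequate" label for runs that meet neither level are assumptions, not terminology from the evaluation.

```python
def rate_taxi_time(seconds: float) -> str:
    """Rate the time from finger to jet blast deflector 1."""
    if seconds < 180:
        return "desired"
    if seconds < 270:
        return "adequate"
    return "inadequate"  # assumed label for runs outside both levels


def rate_launch_prep(attempts: int) -> str:
    """Rate the number of attempts needed to prep for launch."""
    if attempts == 1:
        return "desired"
    if 2 <= attempts <= 3:
        return "adequate"
    return "inadequate"  # assumed label for runs outside both levels


# The exemplar run reported in Section VI took 252 sec from the Finger to JBD 1.
print(rate_taxi_time(252))  # adequate
```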

V. Formal Evaluations

The duration of each evaluation simulator entry was about 1.5 hours; four ADs served as evaluators. Each AD was briefed on the tasks. This briefing included discussion of the purpose and goals of the program, the nuances of the DIABLO hardware and software, and the triggers added to the signalman wands and their use.

The simulator entry consisted of adequate time for familiarization with the system and around 10 formal evaluation runs. During the familiarization period, the carrier and UAS deck simulation were running, and the AD practiced using the wands and performing all of the gestures in the DIABLO lexicon. This included practicing the precision approach to Cat 1, dropping the launch bar into the catapult cradle, engaging the catapult, and throttling up the aircraft in preparation for launch. Once comfortable, the AD performed each of the taxiing tasks two or three times and both of the path following tasks once.

VI. Director Simulation Results

Exemplar results of the pool of ADs are presented and discussed herein. Exemplar performance results for the

Finger to Launch task are presented for AD 2. DIABLO classification results and AD opinion results are presented

for the whole of the formal evaluation.

A. Task Performance Results

The overlay of the Finger to Launch run on the ship deck is shown in Figure 12. Three tracks are shown on this

overlay: 1) Nose Gear, 2) Aircraft CG, and 3) Aircraft Main Gear Midpoint. Tracking these three aircraft points

illustrates how the aircraft is manipulated whilst taxiing to maneuver around the flight deck. For example, the

aircraft is brought straight out from the parked position on the finger between viewpoints 1 and 2 and commanded to

turn sharply near the midpoint of Wire 1 as indicated by the looping of the nose gear tracked position and the sharp

turn of the CG and Main Gear Midpoint positions. The aircraft is taken nearly perpendicular across the wires with

nose and main gear in line. Small corrections are commanded as the aircraft taxis down the length of the deck and

lines up with Cat 1.

The Nose Gear position with respect to time is shown in Figure 13. The dashed magenta line indicates the time

when the Nose Gear crosses the aft edge of JBD 1 near viewpoint 4. These subfigures show that the initial maneuver

from being parked on the Finger to turning to cross the landing wires takes ~35 sec. The aircraft takes 252 sec to

travel from the Finger to JBD 1, and the end game of lining up the aircraft with the catapult and preparing for launch

takes 120 sec.

Figure 12: Finger to Launch – Ship Deck Track

Figure 13: Finger to Launch – Aircraft Nose Gear Location

The time metric of interest for this task, time from Finger to JBD 1, is compared for all ADs in Figure 14. The gray dash-dot line indicates the desired performance level; any run below that line is considered desired performance. The gray dashed line indicates the adequate performance level. The exemplar run described in this section achieved adequate performance. Most of the runs fell within the adequate performance range for this metric, though many of the later runs for ADs 1, 3, and 4 achieved desired performance. This trend suggests that the ADs became more confident in moving the aircraft quickly down the aircraft carrier deck with DIABLO.

[Figure 13 panels: nose gear X (ft) versus Time (sec); nose gear Y (ft) versus Time (sec)]

[Figure 14 plot: Time (sec) versus Aircraft Director; legend: Session 1, Session 2]

Figure 14: Finger to Launch – Aircraft Nose to JBD 1 Task Performance

B. Classification Accuracy

In machine learning, a confusion matrix is used to judge the accuracy and performance of an algorithm. For

DIABLO, the confusion matrix represents the accuracy of the gesture performed versus the predicted output of the

algorithm. The confusion matrix is normally formatted as a table, and in those presented below, the columns

represent the actual gesture performed and the rows represent the gesture output from the algorithm. The average of

the matrix diagonal is the overall system accuracy. Ideally, this value is 100%.
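The layout described above, with columns for the actual gesture and rows for the algorithm output, can be sketched as follows. Each column is normalized by the number of samples of that actual gesture, and the overall accuracy is taken as the mean of the diagonal. The function names and the toy data are hypothetical; the abbreviations "ma" and "st" are used only as example labels.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def confusion_matrix(pairs: Iterable[Tuple[str, str]],
                     labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Build matrix[predicted][actual] in percent of each actual gesture.

    `pairs` holds (actual, predicted) labels, one per observation.
    """
    pairs = list(pairs)
    counts = Counter(pairs)
    totals = Counter(actual for actual, _ in pairs)
    return {
        pred: {
            act: (100.0 * counts[(act, pred)] / totals[act]) if totals[act] else 0.0
            for act in labels
        }
        for pred in labels
    }


def overall_accuracy(matrix: Dict[str, Dict[str, float]], labels: List[str]) -> float:
    """Average of the matrix diagonal, in percent."""
    return sum(matrix[g][g] for g in labels) / len(labels)


# Toy data: 10 "ma" observations (9 recognized), 4 "st" observations (3 recognized).
toy = [("ma", "ma")] * 9 + [("ma", "st")] + [("st", "st")] * 3 + [("st", "ma")]
m = confusion_matrix(toy, ["ma", "st"])
print(m["ma"]["ma"], m["st"]["st"], overall_accuracy(m, ["ma", "st"]))
# 90.0 75.0 82.5
```

Because the diagonal is averaged per gesture rather than per observation, rarely performed gestures weigh as much as frequent ones in the overall figure.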

1. Methodology

Videos of each of the evaluation sessions were recorded. Included in the field of view of the video was a monitor

(see Figure 15, bottom right corner) that displayed the timestamp of the command history, the gesture performed,

and which triggers are currently pressed by the director. These data and the gesture the director was performing were

then transcribed for comparison to the gesture command history recorded from the DIABLO software. The overall

confusion matrix derived from these data represents the total accuracy of the system and is what must be optimized

for a successful DIABLO system.

Figure 15: Screenshot of Information Monitor and Director Performing a Task

2. Gesture Concatenation

When testing the DIABLO system, the full set of gestures in the DIABLO lexicon was not used because of the design of the evaluation tasks. Therefore, only the gestures for which there are data were included in the confusion matrices. The remaining gestures, which included Down Hook, Throttle Down, Engage Nose Gear Steering, and Disengage Nose Gear Steering, were instead tested in a laboratory setting.

Of the set of gestures that were used, some are only differentiated by the use of the wand triggers. For example,

Brakes On and Stop are performed in the same manner save for the addition of pressing both triggers to send the

Brakes On command rather than just Stop. Since there is no distinction between the two gestures aside from the use

of the triggers, Stop and Brakes On were combined in the confusion matrix.

3. Results

Due to the real-time aspect of the DIABLO system, two confusion matrices can be created—one with the system

delay considered and one with the system delay ignored. The confusion matrix without system delay is discussed

first as it yielded higher accuracy across the board as opposed to that when the system delay is included. The reason

for this difference in accuracy is that when the aircraft director begins a gesture, the accuracy calculation begins, but

the recognition algorithm has not yet caught up with the director. That delay in the algorithm reduces the accuracy

even though the gesture was recognized correctly shortly thereafter.
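The effect of the delay on the accuracy calculation can be illustrated with a per-sample comparison of the performed and recognized gestures. This is not the authors' scoring procedure; it is a sketch that assumes both streams are sampled at the same instants and that "ignoring the delay" means not scoring samples between the start of a new gesture and the moment the recognizer catches up, up to a hypothetical `grace` number of samples.

```python
from typing import Sequence


def sample_accuracy(actual: Sequence[str],
                    predicted: Sequence[str],
                    grace: int = 0) -> float:
    """Percent of scored samples where predicted == actual.

    After each change of the actual gesture, up to `grace` mismatching
    samples are skipped instead of being counted as errors.
    grace=0 corresponds to including the delay in the calculation.
    """
    if len(actual) != len(predicted):
        raise ValueError("sequences must have equal length")
    correct = scored = 0
    previous = None
    remaining = 0
    for a, p in zip(actual, predicted):
        if a != previous:      # a new gesture begins
            previous = a
            remaining = grace
        if p == a:
            remaining = 0      # recognizer has caught up
        elif remaining > 0:
            remaining -= 1     # still inside the grace window: not scored
            continue
        scored += 1
        correct += (p == a)
    return 100.0 * correct / scored if scored else 0.0


performed = ["ma"] * 5 + ["st"] * 5
recognized = ["ma"] * 5 + ["ma", "ma", "st", "st", "st"]  # two samples of lag
print(sample_accuracy(performed, recognized, grace=0))  # 80.0
print(sample_accuracy(performed, recognized, grace=3))  # 100.0
```

Capping the grace window keeps a gesture that is never recognized from being excused entirely, since mismatches beyond the window are still counted as errors.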

Discussion

Table 2 lists the abbreviations for the gestures used in the confusion matrices. Table 3 shows the DIABLO

system confusion matrix with the delay ignored.

The overall accuracy of the algorithm as shown in the system confusion matrix is 91.5%. This was calculated

using 48 runs from four aircraft directors. As the aircraft directors completed evaluation tasks, their competency

with the system improved resulting in higher algorithm accuracy.

Continuing to refine the gesture recognition algorithm itself would improve the accuracy further. Possible refinements include improving interpretation of gestures by the vehicle (e.g., ignoring commands that do not make sense), a more efficient custody exchange protocol, and more robust processes within the classification algorithm to better reflect individual classifier accuracies.

Table 2: System Confusion Matrix Gesture Abbreviations

Abbreviation Name

ma Move Ahead

st Stop

bo Brakes Off

pc Pass Control

tl Turn Left

tr Turn Right

pl Pivot Left

pr Pivot Right

uh Landing Hook Up

sw Spread Wings

fw Fold Wings

rlb Raise Launch Bar

llb Lower Launch Bar

tu Throttle Up

Table 3: DIABLO System Confusion Matrix without Delay

ma st bo pc tl tr pl pr uh sw fw rlb llb tu

ma 97% 14% 11% 8% 8% 6% 4% 1% 0% 0% 1% 0% 0% 0%

st 0% 64% 9% 1% 0% 0% 0% 0% 10% 0% 1% 0% 0% 0%

bo 0% 4% 79% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0%

pc 2% 3% 0% 87% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0%

tl 0% 0% 1% 0% 90% 0% 0% 0% 0% 0% 0% 0% 0% 0%

tr 0% 4% 0% 0% 0% 94% 0% 0% 0% 0% 0% 0% 0% 0%

pl 0% 0% 0% 0% 1% 0% 96% 0% 0% 0% 0% 0% 0% 0%

pr 0% 0% 0% 0% 0% 0% 0% 99% 0% 0% 0% 0% 0% 0%

uh 0% 7% 0% 1% 0% 0% 0% 0% 90% 0% 0% 0% 0% 0%

sw 0% 1% 0% 0% 0% 0% 0% 0% 0% 100% 0% 0% 0% 0%

fw 0% 1% 0% 0% 0% 0% 0% 0% 0% 0% 98% 0% 0% 0%

rlb 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 89% 1% 0%

llb 0% 2% 0% 2% 0% 0% 0% 0% 0% 0% 0% 2% 99% 0%

tu 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 9% 0% 100%

Table 4 shows the confusion matrix for the overall system with delay included. The overall accuracy of the algorithm with delay, as shown in the confusion matrix, is 83.3%. This was calculated using the same data sets used above. As previously stated, this confusion matrix does not fully represent the accuracy of the system because of how it is calculated. Since the calculation begins at the start of the gesture being performed, the gestures sent from the algorithm between that start and the moment the new gesture is recognized and classified are counted as misclassifications. The average system delay was ~500 milliseconds. When asked, the aircraft directors stated that the delay was not noticeable and was representative of a piloted aircraft.

Table 4: DIABLO System Confusion Matrix with Delay Included

ma st bo pc tl tr pl pr uh sw fw rlb llb tu

ma 93% 14% 15% 13% 14% 11% 8% 3% 0% 8% 1% 1% 0% 0%

st 0% 58% 10% 3% 1% 0% 2% 0% 12% 5% 1% 1% 2% 0%

bo 0% 7% 72% 1% 0% 0% 0% 0% 2% 0% 0% 0% 0% 0%

pc 4% 5% 0% 79% 0% 0% 0% 0% 1% 0% 0% 0% 0% 0%

tl 1% 1% 1% 1% 82% 0% 0% 0% 0% 0% 0% 0% 0% 0%

tr 1% 4% 0% 0% 0% 87% 1% 0% 0% 0% 0% 0% 0% 0%

pl 0% 0% 0% 0% 2% 0% 89% 0% 0% 0% 0% 0% 0% 0%

pr 0% 0% 0% 0% 0% 1% 0% 97% 0% 0% 0% 0% 0% 0%

uh 0% 7% 0% 1% 0% 0% 0% 0% 86% 0% 0% 0% 1% 0%

sw 0% 1% 1% 0% 0% 0% 0% 0% 0% 71% 0% 0% 0% 0%

fw 0% 1% 1% 0% 1% 0% 0% 0% 0% 0% 82% 0% 0% 0%

rlb 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 0% 75% 2% 0%

llb 0% 3% 0% 2% 0% 0% 0% 0% 0% 0% 0% 15% 95% 0%

tu 0% 0% 0% 0% 0% 0% 0% 0% 0% 16% 16% 7% 0% 100%
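As a consistency check, the overall accuracies quoted in the text can be reproduced by averaging the diagonals of Tables 3 and 4. The diagonal entries below are copied from the tables; because the published entries are rounded to whole percentages, the averages agree with the quoted 91.5% and 83.3% only to within rounding.

```python
# Diagonal entries (percent), in the order ma st bo pc tl tr pl pr uh sw fw rlb llb tu.
table3_diagonal = [97, 64, 79, 87, 90, 94, 96, 99, 90, 100, 98, 89, 99, 100]
table4_diagonal = [93, 58, 72, 79, 82, 87, 89, 97, 86, 71, 82, 75, 95, 100]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(round(mean(table3_diagonal), 2))  # 91.57 (delay ignored)
print(round(mean(table4_diagonal), 2))  # 83.29 (delay included)
```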

Sources of Errors

The gesture with the highest amount of error was the Stop gesture. Stop is available as a command in every aircraft state, which was considered a necessary safety measure but is the likely cause of this high error rate, as it gives Stop a greater chance of being confused with another gesture. While the accuracy is lower than that of the other gestures, it had little impact on the aircraft director’s ability to control the aircraft.

Brakes Off also had a high amount of error. Brakes Off can only be performed when the brakes are on and the

aircraft is in the Non-Taxi State. When coming out of the Non-Taxi State, the aircraft switches to a Taxi State where

Stop and Move Ahead are available for classification. Due to the similar motions between Brakes Off, Stop, and

Move Ahead, it was expected that this would occur since the algorithm was observed to flip between gestures. Since

the aircraft is not moving as the gesture is performed, it does not create any dangerous situations. However, this

misclassification should be resolved. The impact it had on the aircraft director’s ability to control the aircraft was

minimal with the worst-case scenario being the aircraft reverting to a Non-Taxi State. The biggest performance issue

this could create is in fast-paced scenarios where the aircraft needs to be taxiing as soon as possible.

The gesture with the third largest amount of error was Pass Control. The current custody exchange algorithm

implemented in the DIABLO system was designed specifically for passing control and transitioning viewpoints in

the evaluation tasks. This solution is not a suitable candidate for a final product. More realistic methods of

performing the custody exchange have been discussed; however, due to the scope of this program, they were not

implemented and would not have been effective in the evaluation testing simulation environment.

C. AD Opinion Results

The questionnaire was distributed to the participating ADs after the simulator session to obtain opinions about

their experience with the DIABLO system. The AD debrief questionnaire was designed using a 5-point Likert scale

that addressed six distinct areas: 1) Use of IMU-Embedded Signalman Wands During Evaluation Tasks, which

gauged the ease of use of the wands in the simulation environment; 2) Projected Use of IMU-Embedded Signalman

Wands During Carrier Deck Operations, which gauged the projected use of the wands in the carrier deck

environment; 3) Carrier Deck Taxiing Tasks, which assessed the evaluation tasks; 4) Buttons/Triggers Gesture

Modification, which assessed the ADs’ reaction to the addition of the buttons/triggers; 5) Aircraft States for Deck

Operations, which assessed the ADs’ reaction to the breakdown of the deck maneuvering into a series of states each

with a subset of gestures available to the recognition algorithm; and 6) Hardware/Software Solution as a Training

Tool, which assessed the opinion of the ADs regarding the utility of using DIABLO and a suitable simulation

environment as a training tool for both manned and unmanned operations.

The results for the “Use of IMU-Embedded Signalman Wands During Evaluation Tasks” portion of the

questionnaire are shown in Figure 16. These data indicate that the ADs found the use of the wands intuitive and in line with

their training. Maintaining the proper orientation of the wands was easy, and the handedness of the wands did not

pose a problem in the simulation environment. The ADs also reported that the IMU-embedded signalman wands felt similar

to those used while on duty. For the ADs discussed in detail herein, the following additional comment from AD 2

was noted:

“Wands have come along by leaps and bounds when it comes to oversensitivity. The buttons

and designated left and right wands add another step to the process, but nowhere near a deal

breaker.”

Figure 17 presents the results of the “Projected Use of IMU-Embedded Signalman Wands During Carrier Deck

Operations” portion of the questionnaire. All the ADs strongly agree that the use of signalman wands is routine, and

they are split between agree and strongly agree that using the wands for all UAS handling will be intuitive and in

line with their training. The directors believe that maintaining the proper orientation of the wands should not be a

problem on duty on an aircraft carrier deck. The handedness of the wands could pose a problem for the directors on

a carrier deck according to one director; the other three do not think it will be an issue. As for the differences between the

DIABLO wands and standard signalman wands, three directors do not think they will be distracting on a carrier deck

and one is neutral.

Figure 18 presents the results of the “Carrier Deck Taxiing Tasks” portion of the questionnaire. For the Finger to

Launch task and the Retrieval to Finger task, the directors say these tasks are representative of director/aircraft

communication and aircraft response. The directors’ opinions regarding the Retrieval to Launch task, the Simple

Path Tracking task, and the Complex Path Tracking task were less strongly in accord that communication and

response were representative, with one, two, and two directors, respectively, agreeing rather than strongly agreeing. The

following additional comment by AD 1 was noted:

“The complex path was extremely hard. Turns too sharp.” – AD 1

Figure 19 presents the results of the “Buttons/Triggers Gesture Modifications” portion of the questionnaire. The

ADs generally found the use of the buttons/triggers for certain gestures intuitive and felt that it did not significantly

impact their performance, with two agreeing and two strongly agreeing. For the ADs discussed in detail herein, the

following additional comments were noted:

“Pivot turn, on/off brakes were great. Nose wheel bump was touchy due to speed control.” –

AD 1

“Buttons still need getting used to. They just add an extra step to the process.” – AD 2

Figure 20 presents the results of the “Aircraft States for Deck Operations” portion of the questionnaire. The ADs

either agree or strongly agree that dividing the standard carrier deck operations into aircraft taxi states is intuitive

and straightforward and that the gestures available in each allow for unrestricted performance of the evaluation

tasks. In addition, the ADs all strongly agree that those gestures available in each aircraft taxi state correspond to

those generally used for those subtasks during normal carrier deck operations. AD 2 noted an additional comment:

“Gestures were in line with actual carrier deck training. Delays in computer threw me off and

over-correction was common.” – AD 2

Finally, Figure 21 presents the results of the “Hardware/Software Solutions as a Training Tool” portion of the

questionnaire. One AD agreed and the rest strongly agreed that a DIABLO-based carrier deck simulation would be a

valuable training tool for interacting with manned aircraft. All ADs strongly agreed that a DIABLO-based carrier

deck simulation would be a valuable training tool for interacting with UAS. In addition, AD 2 noted:

“The hardware/software is an intuitive training aid and would be a valuable asset when dealing

with manned/unmanned aircraft alike.” – AD 2
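
As a small worked example, the response counts reported above for the two training-tool items can be tallied as follows. The 1 to 5 numeric coding of the Likert labels is a common convention assumed here, not something the questionnaire description specifies, and the order of the responses within each list is arbitrary because the text does not say which AD gave which answer.

```python
from collections import Counter

# Assumed coding of the 5-point Likert scale (1 = lowest, 5 = highest).
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

# Responses of the four ADs as reported in the text; ordering is arbitrary.
responses = {
    "valuable training tool for manned aircraft": [
        "agree", "strongly agree", "strongly agree", "strongly agree",
    ],
    "valuable training tool for UAS": ["strongly agree"] * 4,
}


def summarize(answers):
    """Return (count per label, mean coded score) for one questionnaire item."""
    counts = Counter(answers)
    mean = sum(LIKERT[a] for a in answers) / len(answers)
    return counts, mean


for item, answers in responses.items():
    counts, mean = summarize(answers)
    print(f"{item}: {dict(counts)}, mean = {mean:.2f}")
```

With only four respondents, the per-label counts are more informative than the mean, which is included only as a compact summary of an ordinal scale.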

AD 2 had one final general comment about the system that is worthy of note:

“This training in general seems to have great promise. Bugs such as sensitive controls have

improved greatly since the second and last time I was here and certainly the first. Gets better and

easier to every time I interact with it.” – AD 2

Figure 16: Director Questionnaire – Use of IMU-Embedded Signalman Wands During Evaluation Tasks

Figure 17: Director Questionnaire – Projected Use of IMU-Embedded Signalman Wands During Carrier Deck Operations

Figure 18: Director Questionnaire – Carrier Deck Taxiing Tasks

Figure 19: Director Questionnaire – Buttons/Triggers Gesture Modifications

Figure 20: Director Questionnaire – Aircraft States for Deck Operations

Figure 21: Director Questionnaire – Hardware/Software Solutions as a Training Tool

VII. Conclusions

Over the course of the SBIR program summarized herein, the IMU-based gesture recognition system for UAS

integration into the aircraft carrier deck environment, DIABLO, was developed. The DIABLO hardware and

software were progressed to the point that a first round of formal evaluations was appropriate. From these

evaluations, valuable performance metrics, accuracy matrices, and director opinions were obtained from which to

assess the current state of DIABLO as well as to map out a path for future development.

The evaluation tasks to exercise DIABLO were developed based on mission events common to the flight deck

environment. Such testing had not previously been formally conducted by engineers and ADs. As such, several conclusions

can be drawn from this initial testing, specifically of the Finger to Launch task. The ADs were able to quickly

command the aircraft to taxi the length of the carrier to Cat 1. Aircraft ground max speed may need to be adjusted as

development and testing continue due to the length of the task. After several practice runs, the ADs achieved the

dexterity to prepare the aircraft for launch. The performance metrics appear to be appropriate for this small sample

size. A performance metric for time from crossing JBD 1 to ready for launch procedures should be added in future

testing. Finally, the task and viewpoint changes worked well and were found to be effective for both simulating

common carrier deck mission events and exercising DIABLO in realistic situations.

The DIABLO hardware was generally well-received by the many ADs. They understood that the wands used for

development and evaluation were prototypes that would be improved upon in future iterations. As such, several

conclusions regarding the DIABLO hardware can be drawn. The ruggedness of the wands and sensors must be taken

into account without adding undue weight. The triggers are a viable solution for accuracy improvement. Their use

was simple to learn within the brief time evaluation testing was conducted. The handedness of the wands may be an

issue. Either a software solution to recognize which wand is in which hand or a hardware solution to ensure the

correct orientation may be necessary. Finally, requiring the use of the DIABLO wands during the day to

communicate with UASs on deck should not be an undue burden.

In addition, the gesture recognition software performed well and several more conclusions can be drawn. Gesture

classification was acceptably robust to differences among ADs, given that 91.5% accuracy was achieved with none of the evaluation

ADs contributing data to the training of the classification model within DIABLO. The ~500 ms delay between the

AD beginning to perform a new gesture and DIABLO’s recognition of the gesture was considered within a pilot’s

response time, predictable, and not a hindrance to the performance of the task. Finally, the sources of error were

expected and their impact within the confines of the formal evaluations was kept to a minimum.
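
For readers who want to reproduce this kind of summary from an accuracy (confusion) matrix, the sketch below computes overall accuracy and per-gesture error rates. It assumes accuracy is defined as the fraction of correctly classified samples, which this section does not state explicitly. The counts are made up for illustration and are not the evaluation data; the function names are likewise hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix.

    confusion[i][j] is the number of samples of true gesture i classified as j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix has no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_gesture_error(confusion):
    """Error rate for each true gesture (row), or None for an empty row."""
    rates = []
    for i, row in enumerate(confusion):
        n = sum(row)
        rates.append(None if n == 0 else 1 - row[i] / n)
    return rates


# Made-up counts for three gestures, for illustration only (not the paper's data).
GESTURES = ["Move Ahead", "Stop", "Brakes Off"]
EXAMPLE = [
    [45, 3, 2],   # true Move Ahead
    [4, 44, 2],   # true Stop
    [5, 5, 40],   # true Brakes Off
]

accuracy = overall_accuracy(EXAMPLE)
errors = dict(zip(GESTURES, per_gesture_error(EXAMPLE)))
```

Ranking the per-gesture error rates is one way to identify which gestures contribute most to the overall error, as was done above for Brakes Off and Pass Control.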

VIII. Acknowledgements

The work described herein was conducted as part of an Office of the Secretary of Defense and Office of Naval

Research Phase II SBIR program. The authors acknowledge the support of the U.S. Navy technical point of contact,

CDR Brent Olde. Furthermore, the authors acknowledge the significant contributions of the NAS Patuxent River

Manned Flight Simulator team, with specific recognition of Dr. Stephen Naylor, Elizabeth Knoblauch, and Alan Taylor,

without whom this work would not have been possible. The findings and conclusions in this paper are those of the

authors and do not necessarily represent the views of the funding agency.

IX. References

1. http://www.navy.mil/submit/display.asp?story_id=71011

2. http://www.gizmag.com/x-47b-unmanned-stealth-fighter-uss-truman/25448/

3. L. Venetsky, M. Husni, and M. Yager, "Gesture Recognition for UCAV-N Flight Deck Operations Problem

Definition," Naval Air Systems Command, Lakehurst, NJ, Final Report January 23, 2003.

4. S. C. H. Hoi, J. Wang, and P. Zhao, “LIBOL: A Library for Online Learning Algorithms,” Journal of Machine

Learning Research, Vol. 15, 2014, pp. 495-499, http://libol.stevenhoi.org/.
