
  • Assessing the Quality of a Sword-Fighting Agent

    J. Dehesa, A. Vidler, C. Lutteroth, J. Padget

    July 2, 2019

    University of Bath / Ninja Theory Ltd

  • Sword Fighting in VR

    We want to make engaging sword fighting in VR.

    Like this.

    “Game of Thrones” (HBO, 2013)


  • Sword Fighting in VR

    This is hard for several reasons.

    Complex control [HTC Vive (HTC, 2016)]

    Harder to animate [Unreal Engine 4 (Epic Games, 2014)]


  • Our proposal

    We propose a solution based on machine learning.

    [Diagram: user input + current pose → neural network → next pose]

    The model is trained on motion capture data.
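    To make the data flow concrete, below is a minimal sketch in PyTorch. The plain feed-forward structure and all layer sizes are illustrative assumptions, not the paper's architecture (the evaluation later compares PFNN and GFNN variants):

    import torch
    import torch.nn as nn

    class PosePredictor(nn.Module):
        # Hypothetical sketch: maps (current pose, user input) to the next pose.
        def __init__(self, pose_dim: int, input_dim: int, hidden: int = 256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(pose_dim + input_dim, hidden),
                nn.ELU(),
                nn.Linear(hidden, hidden),
                nn.ELU(),
                nn.Linear(hidden, pose_dim),  # predicted next pose
            )

        def forward(self, pose, user_input):
            # Condition the next pose on both the current pose and the
            # user's controller input.
            return self.net(torch.cat([pose, user_input], dim=-1))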


  • Our proposal

    It looks like this.

    Dehesa et al., 2019


  • The question

    Is it good?


  • Wishlist

    Reliability

    Fidelity: model behaviour resembles the mocap data.

    Generality: works in situations beyond the training data.

    Stability: similar inputs produce similar outputs (a simple probe is sketched below).
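    The stability criterion is directly testable. As an illustration only (this probe is ours, not a method from the slides): perturb the user input slightly and check that the predicted pose barely moves.

    import numpy as np

    def stability_probe(model, pose, user_input, eps=1e-3, n=100):
        # Crude stability check: `model` is any callable mapping
        # (pose, user_input) arrays to a next-pose array.
        base = model(pose, user_input)
        worst = 0.0
        for _ in range(n):
            noisy = user_input + eps * np.random.randn(*user_input.shape)
            worst = max(worst, float(np.linalg.norm(model(pose, noisy) - base)))
        return worst  # small => similar inputs give similar outputs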


  • Wishlist

    Viability

    Performance: must run at 90 fps on a VR-ready PC, i.e. within a budget of roughly 11 ms per frame.

    Data efficiency: requires a reasonable amount of data.

    Training time: fast training facilitates iterative development.


  • Wishlist

    User satisfaction

    Players: the experience is enjoyable, realistic, engaging.

    Designers: overall quality is acceptable; the methodology is usable.


  • Wanted


    ACTIONABLE METRICS

  • Evaluation plan

    Measuring reliability

    Pose error: a custom metric to measure similarity to mocap.

    [Bar chart: pose error, Error (m²), per model: PFNN 6, PFNN 8, GFNN 3×3, GFNN 4×4, GFNN 2×3×3, GFNN 2×4×4]
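    The slide does not define the metric; as a hedged illustration, a mean squared joint-position error would produce values in m² as on the chart. A minimal sketch, assuming poses are arrays of 3D joint positions in metres:

    import numpy as np

    def pose_error(predicted, mocap):
        # Illustrative stand-in for the custom metric. Both arrays have
        # shape (frames, joints, 3); the result is in m^2, matching the
        # chart's y-axis.
        assert predicted.shape == mocap.shape
        sq_dist = np.sum((predicted - mocap) ** 2, axis=-1)  # per joint
        return float(sq_dist.mean())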


  • Evaluation plan

    Measuring viability

    Model complexity: big-O analysis for time and memory.

    Benchmarks: referenced to specific hardware.
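    For the benchmark side, a minimal timing sketch (a hypothetical harness; the 90 fps target from the wishlist implies a budget of about 11.1 ms per frame):

    import time

    FRAME_BUDGET_MS = 1000.0 / 90.0  # 90 fps => ~11.1 ms per frame

    def benchmark_ms(step, n_warmup=100, n_runs=1000):
        # Average wall-clock time per call of `step`, in milliseconds.
        for _ in range(n_warmup):  # warm up caches, lazy allocations
            step()
        start = time.perf_counter()
        for _ in range(n_runs):
            step()
        return 1000.0 * (time.perf_counter() - start) / n_runs

    # Hypothetical usage, timing one forward pass of the pose model:
    # ms = benchmark_ms(lambda: model(pose, user_input))
    # print(f"{ms:.2f} ms per frame (budget {FRAME_BUDGET_MS:.1f} ms)")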


  • Evaluation plan

    Measuring viability

    Data requirements: size of the data set; cost of mocap, pre-processing, and equipment.

    Training time: referenced to specific hardware and experiment settings.


  • Evaluation plan

    Measuring user satisfaction

    User study: measure enjoyment and quality.

    Designer study: measure quality and usability.

    Methods?

  • User study

    Enjoyment questionnaires

    Did the user like the experience?

    Flow: challenge–skill balance. (Sweetser and Wyeth, 2005, “GameFlow”)

    Presence: sense of “being there”. (Witmer and Singer, 1998)

    Immersion: believing the world. (Jennett et al., 2008)

    + Interactive
    + Standard
    − Too broad?
    − Confounding?
    − Baseline?


  • User study

    Side-by-side comparisons

    Does the model look as good as the mocap data?

    + Simple

    + Non-confounding

    + Baseline

    − Non-interactive


  • Designer study

    Usability questionnaires

    Is this methodology useful?

    SUS: very broadly used. (Brooke, 1996)

    SUMI: exhaustive but expensive. (Kirakowski and Corbett, 1993)

    PSSUQ: more task-oriented. (Lewis, 1995)

    + Standard
    − Applicability?
    − Which? Why?
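    SUS scoring is standard enough to show concretely (Brooke, 1996): odd-numbered items score response − 1, even-numbered items score 5 − response, and the sum is scaled by 2.5 to a 0–100 score. The helper name below is ours:

    def sus_score(responses):
        # `responses`: the 10 Likert answers (1-5), items 1-10 in order.
        if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
            raise ValueError("SUS needs 10 responses on a 1-5 scale")
        total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i=0 is item 1 (odd)
                    for i, r in enumerate(responses))
        return 2.5 * total  # 0-100

    # Example: sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]) -> 75.0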


  • Open questions

    • Are our quantitative metrics enough?

    • Which are the best tools for our qualitative studies?

    • What kind of baseline can we compare to?


  • Thank you