
PersOnalized Smart Environments to increase Inclusion of people with DOwn’s syNdrome


Deliverable D6.1

Evaluation Protocols

Call: FP7-ICT-2013-10

Objective: ICT-2013.5.3 ICT for smart and

personalised inclusion

Contractual delivery date: 30.04.2014 (M6)

Actual delivery date: 29.07.2014

Version: v1

Editor: Andreas Braun (FhG)

Contributors: Silvia Rus (FhG)

Reviewers: Juan Carlos Augusto (MU)

Terje Grimstad (Karde)

Dissemination level: Public

Number of pages: 20


Contents

1 Executive Summary
2 Introduction
3 System evaluation in POSEIDON
   3.1 POSEIDON Timeline
   3.2 Best practice of system evaluation
   3.3 Measuring the impact of POSEIDON
   3.4 POSEIDON requirements
4 Technical evaluation of POSEIDON
   4.1 Evaluating requirements
      4.1.1 Requirement adherence
      4.1.2 Risk estimation
   4.2 Evaluation process
      4.2.1 Organization
      4.2.2 Managing potential requirement revision
   4.3 Result Template
5 User experience evaluation of POSEIDON
   5.1 Evaluation of user experience requirements
   5.2 Evaluation process
      5.2.1 Organization
      5.2.2 User experience requirements
6 Conclusion
7 References


1 Executive Summary

The deliverable D6.1 - Evaluation Protocols - is the first deliverable of Work Package 6 - Validation.

The deliverable is linked to task T6.1 - Technical Assessment and has a number of connections to

deliverables of Work Packages 2 and 5. It is based on deliverables D2.1 - Report on requirements and

D2.3 - Report on Design of HW, Interfaces, and Software, which build the foundation for the technical

Work Packages. These requirements are driving development in Work Packages 3, 4 & 5. For this

document we are also considering D5.1 - Development framework that describes the different

components and their specific requirements. Additionally, Section 3.2.1 of Part B of the POSEIDON

DoW lists different expected outcomes and the measures to assess their impact.

The document gives an introduction to the system evaluation, restating the timeline of POSEIDON

and the requirement-driven development process. We show relevant best practice of system

evaluation and how it relates to the methods used for the POSEIDON evaluation.

The process of the technical evaluation is outlined, stating both methods and management layout.

The evaluation will be based on assessing the adherence to requirements and the associated risk of

not meeting them. Additionally, the process of the user evaluation is briefly recapitulated and an

introduction of the management is given. A template for the evaluation including the process to

estimate the mentioned risk concludes this document.


2 Introduction

In the scope of POSEIDON a variety of different technical systems will be developed and integrated

into a common platform. Thus it is crucial to monitor and evaluate the performance of the system,

including, but not limited to, hardware, interfaces, software and integration routines, as well as its

contributions to safety, privacy and ethics. This task is integrated into Work Package 6 - Validation, as

Task 6.1 - Technical Assessment. This task is intended to materialize the mechanisms that are

envisaged to produce a highly reliable and useful product. In this scope, compliance with the

requirements will be double-checked, a link between requirements and testing will be established,

and the validation and pilots, including any design revisions, will be monitored.

This deliverable D6.1 Evaluation Protocols outlines the different steps involved in the testing process.

This will range from analyzing the best practice in this domain and the prerequisites of POSEIDON to

the creation of the evaluation protocols themselves, which will be used in the validation phase to

quantify the performance of the technical systems.

The system evaluation of POSEIDON is driven in part by the project-specific timeline of pilots and

studies. The most important factor is to verify adherence to the previously defined requirements

gathered in Work Package 2. The technical systems that are to be developed in WP3, WP4 and WP5

should be tested according to the process specified in this document. The evaluation process is

strictly based on the defined requirements, which can be further detailed in the scope of the

validation.


3 System evaluation in POSEIDON

3.1 POSEIDON Timeline

The interface strategy strongly depends on the requirements gathered during the requirement

analysis. The interviews with the primary users, the online questionnaires for the secondary and

tertiary users and the first user workshop have all contributed to this process. From these

requirements an interface strategy is extracted, immediately followed by an implementation in the

form of an integrated prototype. In Fig. 1, steps (b) and (d) represent the requirements gathering

phase and the first user workshop, followed by step (f), in which the first interfaces and interactive

technology are set up. The created prototype (g) is evaluated in the second user workshop (h) and the

outcome of the workshops is subsequently analyzed. The feedback is taken into account and the

interface (l) is adjusted, once the interfaces are completely defined. This iterative process for the

interfaces is finished in step (r), when the improved interfaces are set up.

Fig. 1 Project and work package milestones and events

From this timeline we can identify a set of milestones of the technical systems developed in the

scope of POSEIDON. The relevant events determining the points in which the system is tested are the

user workshops and the continuous pilots of integrated prototypes. Therefore, we have the following

short list of events that are directly associated with the system evaluation in WP6.


Table 1 Important events related to evaluation in POSEIDON

Event | Description | Time in project | Planned date

1st User Group Workshop | Marks the end of the requirements gathering phase - requirements refinement | M3 | January 2014

2nd User Group Workshop | Indicates successful implementation of first set of requirements | M10 | September 2014

First Pilot | Testing of integrated prototype over a longer time | M20-21 | from July 2015

Second Pilot | Testing of second integrated prototype over a longer time | M30-31 | from May 2016

Final Workshop | Final check of requirements and post-project planning | M36 | October 2016

From those dates the first user group workshop has a special role, as it marks the end of the

requirement gathering process, at which point adherence to requirements cannot yet be validated.

Instead the primary purpose is to finalize the list of requirements and elucidate new ones according

to the feedback gathered from the users.

The next three events are the most important events in the scope of the process described in this

deliverable. During those events the systems are tested by the users and the evaluators can

determine how well the requirements set for this specific stage have been fulfilled. Afterwards, the

set of requirements to be fulfilled for the next stage can be adapted and refined according to the

results of the testing.

The last event is primarily intended to prepare the potential market launch of the POSEIDON system

and can also act as a final check of adherence for all requirements, in order to verify that the full

system is running with all intended functionality and at the intended level of stability.

The process of evaluation closely follows the requirements gathering process, as specified in

WP2. The next section presents some best practice of system evaluation in general and the necessary

aspects of requirements engineering in particular.

3.2 Best practice of system evaluation

A large body of literature has been written on testing systems for conformity, adherence, robustness

and feature-completeness. The goal is always to assure that a certain level of quality is reached that

has been defined early in the project [1]. POSEIDON is following an approach that is based on

defining and testing a set of requirements that are to be fulfilled at different stages of the project.

Thus, we are using an approach that is known in software development as Requirements Engineering.

Requirements engineering describes the process of formulating, documenting and maintaining

software requirements, and refers to the subfield of Software Engineering concerned with this process [2].

Typically we can distinguish a set of seven different steps in this process, namely [3]:

1. Requirements inception or requirements elicitation - gathering an initial set of requirements from stakeholders and other sources

2. Requirements identification - identifying new requirements

3. Requirements analysis and negotiation - checking requirements and resolving stakeholder

conflicts

4. Requirements specification (Software Requirements Specification) - documenting the

requirements in a requirements document


5. System modeling - deriving models of the system, often using a notation such as the Unified

Modeling Language

6. Requirements validation - checking that the documented requirements and models are

consistent and meet stakeholder needs

7. Requirements management - managing changes to the requirements as the system is

developed and put into use

As we are not solely developing software in POSEIDON, this process has to be adapted to also cater to

the requirements of the different hardware systems that will be developed. These adaptations are

discussed in the following section.

3.3 Measuring the impact of POSEIDON

In the application phase of POSEIDON, the consortium included a number of different considerations

into the initial proposal concerned with measuring the impact of POSEIDON. We would like to briefly

revisit these considerations and put them into context. The impact measurement is based on

analysing a set of expected outcomes with a specific measure of success. They are listed in Table 2.

Five different expected key outcomes are listed together with a proposed measure of success. In the

evaluation phase we will transfer the measures and outcomes into requirements that can be mapped

according to the procedure outlined in the other chapters of this document. We will iteratively revisit

this table and track the progress throughout all project phases.

Table 2 How outcomes are measured in POSEIDON

Expected outcome: Novel accessibility solutions for user groups at risk of exclusion.

Proposed measures of success:

- One novel service/application with many functions that is available for people with DS (primary user group) and other intellectual disabilities.

- Results of testing in the primary user group are positive, so that development/production and marketing of the product will proceed.

- Interest organisations for DS, and also for other persons with intellectual disabilities, are positive about the product and will spread information about it (at conferences etc.) because they think it is useful.

The measurement of these outcomes can be done more precisely during the last trimester of the project as part of the preparations for bringing the product to market.

Expected outcome: Enhanced quality of life for people at risk of exclusion, including people with disabilities, older people and people with low digital literacy and skills.

Proposed measures of success:

- More than 50% of representatives for the target group (including parents, carers, and teachers) find that our product makes persons with DS more independent and autonomous in their daily life.

- The impact of the product will also be observed in daily situations. Mastering of the technology developed will be measured by observations and interviews of representatives (parents, carers, teachers etc.) and interviews of people with DS.

- More than 50% of persons in the target group like to use the product.

These measurements will be done through the pilots [months 10 and 20] and the user group workshops [months 3, 10, 36], which will allow us to trace the variations in response in relation to the evolutions of the system.

Expected outcome: Strengthened possibilities of employment for non-highly specialised professionals.

Proposed measures of success:

- The more independent and autonomous in daily life, the greater the chance of employment for people with DS and other intellectual disabilities. Increased independence within the environment will be evidenced by an increasing response to technical triggers rather than 'being told what to do next', and will result in relationships that depend less on instruction and more on engagement, to better facilitate mutual relationships.

These measurements will be achieved through a combination of information gathered by the kits on their usage and the feedback provided by the users in each pilot.

Expected outcome: Improved competitiveness of European ICT industry in the field of inclusive smart environments and services.

Proposed measures of success:

- The US is ahead of Europe with regard to ICT devices and programmes for people with intellectual disabilities (for example Ablelink Technologies). Our service/application will reduce the gap.

- Our service/application will be adaptable to different countries, cultures and languages. This will be tested in different European countries.

- A measure of acceptance is that relevant organizations for targeted beneficiaries find that the product is good and say so at conferences, meetings etc. to increase the use of the product.

The measurement of these will be performed along the life of the project by contacting those who are interested in the project from any dimension. The findings will be compiled and summarized for the final report.

Expected outcome: Wider availability and effectiveness of developers' tools for creating inclusive smart environments (targeted to SMEs, key mainstream industrialists, open-source developers, and other less technical developers).

Proposed measures of success:

- The aim of POSEIDON is to make some relevant inclusion services and a framework which should enable a wide range of developers to provide services for people with DS. The increased number of inclusion services and interest will be measured by the number of developers participating in POSEIDON social media, the number of companies providing POSEIDON services, and the number of POSEIDON apps/services provided in different countries.

- The methods and architecture developed for the product establish a best practice from which others can learn.

Some of these measurements can only be performed partially and at late stages of the project (after month 30), when the system is fully fledged and we can start transitioning to market deployment. Findings and evidence of these will be described in the final report.

3.4 POSEIDON requirements

The interface strategy is closely related to D2.1 - Report on requirements and to D2.3 - Report on

Design of HW, Interfaces, and Software. The general design principles for interfaces have been

presented in D2.3. Therefore, parts of the following section regarding the design principles are taken

from there, while section 3.2.2 regarding the requirements analysis presents the requirements

applicable for the interface strategy.

In order to assure compatibility between the process of requirements engineering outlined in the

previous section and the development in POSEIDON, the requirements have to be adapted

appropriately. We will briefly describe the adaptation process for each step.

1. The requirements inception is performed according to specifications in the DoW, the initial

requirements gathering phase and the results of the first user group workshop

2. The requirements identification is performed iteratively after each user workshop and the

pilot phases of the integrated system

3. The requirements analysis and negotiation is led by a core group of developers in close

collaboration with the representatives of the user organizations

4. The specification of the requirements adds a number of labels to distinguish functional and

non-functional requirements, identify the target system iteration and the associated system

components


5. The combined software- and hardware-architecture is initially designed at a very high level

and will be detailed according to the requirements of the current prototype iteration

6. The requirements validation is the primary concern of this document and is detailed in the

next section

7. POSEIDON is using a shared Wiki system to manage the requirements and their different

iterations


4 Technical evaluation of POSEIDON

As previously mentioned this section will detail the evaluation process of the technical systems

developed in POSEIDON. It is separated into three distinct parts. At first we will detail how

requirements can be evaluated, including monitoring and risk assessment. The second part outlines

the organizational structure of the evaluation process and the last part gives a set of templates that

will be used to evaluate the requirements.

4.1 Evaluating requirements

In the scope of WP2 the requirements of the POSEIDON system are defined and refined throughout

the project. While some requirements can be easily quantified, non-functional requirements in

particular need additional information that helps to determine how well they meet the criteria set

for them. In this section we will show how to determine the adherence to requirements and, based

on that, calculate a risk level that helps to specify how the requirements should be modified in the

later stages of the development and how severe the expected impact on the development timeline

will be.

4.1.1 Requirement adherence

The first important factor in the requirements evaluation is to determine how well the system

adheres to the specified metric. Here we can distinguish two different types of requirements -

quantitative requirements and non-quantitative requirements. The first category can be easily

measured, since a discrete target value is given that has to be reached in order to fulfil the

requirement. In the following tables, we will use a four-level grading system for the different

components of the requirement evaluation. This grading will simplify the notation of adherence in

the later project stages. The grades used are A, B, C and F, where A indicates the highest grade, i.e.

full adherence, and F indicates failure. Table 3 shows grading, adherence level and description for

quantitative requirements.

Table 3 Quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description

A | Full | Achieving at least 100% of the specified value

B | With Limitations | Achieving between 80% and 100% of the specified value

C | Low | Achieving between 50% and 80% of the specified value

F | Failure | Achieving less than 50% of the specified value
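To make the grading rule concrete, the following minimal sketch (in Python; the helper name grade_quantitative is our own choice and not part of any POSEIDON component) derives the grade of a quantitative requirement from a measured value and its specified target, following the thresholds of Table 3:

    def grade_quantitative(measured: float, target: float) -> str:
        """Grade a quantitative requirement according to the thresholds of Table 3."""
        ratio = measured / target
        if ratio >= 1.0:   # achieving at least 100% of the specified value
            return "A"     # Full
        if ratio >= 0.8:   # between 80% and 100%
            return "B"     # With Limitations
        if ratio >= 0.5:   # between 50% and 80%
            return "C"     # Low
        return "F"         # Failure: less than 50%

    # Hypothetical example: a requirement asks for 10 hours of battery life,
    # the prototype reaches 8.5 hours, which corresponds to grade "B".
    print(grade_quantitative(8.5, 10.0))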


A similar grading can be used for non-quantitative requirements and their adherence. In this case the

level is subjective by definition and should therefore be determined by multiple persons, e.g. using a

majority vote.

Table 4 Non-quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description

A | Full | Full adherence is achieved if the requirement is completely fulfilled in the given evaluation

B | With Limitations | If a requirement is only partially fulfilled, but the deviation is not critical, the level “with limitations” should be given

C | Low | If a requirement is partially fulfilled and the deviation is high, the adherence level is “low”

F | Failure | If a requirement is not fulfilled at all, the level “failure” has to be attributed
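As noted above, the grading of non-quantitative requirements is subjective and should be determined by multiple persons, e.g. using a majority vote. A minimal sketch of such an aggregation (our own illustration, assuming a pessimistic tie-breaking rule that is not prescribed by this document) could look like this:

    from collections import Counter

    GRADE_ORDER = ["A", "B", "C", "F"]  # from best to worst

    def aggregate_grades(grades: list) -> str:
        """Combine the subjective grades of several evaluators by majority vote.

        Ties are broken pessimistically, i.e. the worse grade wins.
        """
        counts = Counter(grades)
        best_count = max(counts.values())
        tied = [g for g, c in counts.items() if c == best_count]
        return max(tied, key=GRADE_ORDER.index)

    print(aggregate_grades(["B", "B", "C"]))  # -> "B"
    print(aggregate_grades(["A", "C"]))       # -> "C" (tie broken towards the worse grade)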

4.1.2 Risk estimation

An important step in estimating the potential impact of not fulfilling a requirement is performing a

risk assessment based on the level of adherence and some other factors. Risk management is a

whole field of study dedicated to identifying, assessing and prioritizing risks [4]. Apart from the risks

specified in the DoW, we are only using a minimal routine to estimate the risk or impact of not

meeting a certain requirement, based on the following factors:

Risk Score = Adherence Level Score * (Criticality Level + Estimated time to fix)

The adherence level score is derived from Table 4, shown in the previous section. The criticality level

can be defined according to the following table.

Table 5 Criticality of non-adherence to requirements

Grade | Criticality Level | Description

A | Not critical | Not fulfilling this specific requirement will not interfere with the overall functionality of the system - mostly suited for optional requirements

B | Potentially critical | If not fulfilled, this requirement has some impact on the system or user experience, but it is considered low

C | Very critical | If the requirement is not fulfilled, the system is expected to behave unexpectedly or not provide the minimum functionality required

F | Fatal | This breaks the system experience and prevents main functions from working

The last component is the estimated time required to fix the requirement so that it adheres to the

specified level. This is important to get an estimate of the resources that will be required to fix the

problem and how they can be mapped onto the remaining project timeline and the development

queue until the next iteration is due. The time required to fix will be quantized similarly to the

previous factors, as shown in the following table. It should be noted that the estimated time should

include testing of the fixed requirement:


Table 6 Score associated to estimated time to fix a certain requirement not met

Grade | Estimated time to fix

A | < 1 hour

B | < 1 day

C | < 1 week

F | > 1 week

In order to calculate the risk score we have to associate the grades that were given with a numeric

value. We are using the simple association shown in the following table:

Table 7 Association between grades and numeric values

Grade | Numeric value

A | 0

B | 1

C | 3

F | 5

Now all components required to calculate the risk score are in place. For example, if there

is a requirement with low adherence level that is not critical and will take less than one day to fix and

test the resulting risk score is:

Risk Score = 3 * (0 + 1) = 3

A second example for a requirement with low adherence level that is very critical and takes a long

time to fix is:

Risk Score = 3 * (3 + 5) = 24

The risk score is considerably higher than before, as the impact of not meeting this requirement on

the project development roadmap can potentially be very high, primarily due to the long time that is

required for a fix. The risk score for a fully met requirement thus is always 0 and there is obviously no

need to note criticality and time to fix.

Finally, we can group the risks into distinct risk levels based on their risk scores, as shown in the

following table:

Table 8 Association between risk levels and risk score

Risk level | Risk score

High | > 15

Medium | 6 to 15

Low | 1 to 5

None | 0

This risk level should drive the discussion about how to manage changes to requirements and the

impact of not meeting a specific requirement on the development schedule specified in the

development roadmap. The risk level should be noted for any requirement that does not achieve full

adherence, or optionally for all requirements, in which case the risk level for fully met requirements is

automatically set to “None”.
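To summarize the risk estimation, the following sketch combines the grade-to-number mapping of Table 7, the risk score formula and the risk level grouping of Table 8; the function names risk_score and risk_level are our own, and the routine simply reproduces the two worked examples above:

    # Numeric values associated with the grades (Table 7)
    GRADE_VALUE = {"A": 0, "B": 1, "C": 3, "F": 5}

    def risk_score(adherence: str, criticality: str, time_to_fix: str) -> int:
        """Risk Score = Adherence Level Score * (Criticality Level + Estimated time to fix)."""
        return GRADE_VALUE[adherence] * (GRADE_VALUE[criticality] + GRADE_VALUE[time_to_fix])

    def risk_level(score: int) -> str:
        """Group a risk score into the risk levels of Table 8."""
        if score == 0:
            return "None"
        if score < 6:
            return "Low"
        if score <= 15:
            return "Medium"
        return "High"

    # First example: low adherence (C), not critical (A), fixable in less than a day (B)
    print(risk_score("C", "A", "B"), risk_level(risk_score("C", "A", "B")))  # 3 Low
    # Second example: low adherence (C), very critical (C), more than a week to fix (F)
    print(risk_score("C", "C", "F"), risk_level(risk_score("C", "C", "F")))  # 24 High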

4.2 Evaluation process

Figure 1 Requirement implementation and evaluation cycle

The evaluation process follows the cycle outlined in Figure 1. Five different phases can be

distinguished:

1. During the piloting & testing phase, the adherence to the different requirements is tested with

regard to the specification. The duration depends on the setup of the pilot/workshop

2. In the analysis phase these results are used to calculate derived information, such as the risk

score. This phase should not last longer than 14 days, in order to guarantee a thorough

review, while not affecting the overall development too much.

3. The consolidation phase is a physical or virtual meeting, in which a consensus on the ranking

of the requirements has to be found. This meeting should also be used to determine if an

adaptation of requirements is needed

4. In the requirement revision phase the adaptations specified in the consolidation phase have

to be detailed and integrated into the development roadmap

5. The development phase will implement the system in order to fulfill the requirements

needed for the next iteration of the prototype


4.2.1 Organization

The organizational structure of the personnel responsible should be kept small, in order to minimize

overhead. We envision a system of three roles operating given the following organizational structure.

Figure 2 Organizational structure of technical evaluation

The different roles will have the following tasks.

- The Technical Evaluation Coordinator will lead the process, query the different stages of the

  process, distribute tasks to the technical evaluation committee and lead the consolidation

  meeting

- The technical evaluation committee members will execute the different tasks related to the

  evaluation process, such as performing the technical evaluation, analyzing the results,

  participating in the consolidation meeting and adapting the requirements

- The requirements advisory board is comprised of several of the social science specialists within

  the consortium who perform the workshops and pilots and can contribute to the process during

  the consolidation meeting and analysis phase, giving input on potential adaptations of

  requirements for the next iteration

4.2.2 Managing potential requirement revision

In POSEIDON we are using a Wiki-based system to track the requirements. The integrated versioning

allows keeping track of the different versions. In Figure 3, we can see a screenshot of the system. The

system is in the process of being set up - a process that will be completed in time before the first

pilot. The wiki follows the structure of requirements as specified in D2.3.

Figure 3 Screenshot of requirement tracking wiki page


The wiki system will be updated by all members of the Technical Evaluation committee, with the

Technical Evaluation Coordinator being responsible for periodically checking for adherence to the

specified standards.

In the future we will investigate different wiki add-ons or other systems that will allow us to create

the result template presented in the following section automatically from a subset of requirements.

4.3 Result Template

The results of the requirement evaluation should be noted in a specific template that allows

recording the information required in the later stages of the process. The following table can be

expanded and printed out with the current list of requirements to be tested in the scope of the next

event. Two example requirements are filled in to show how the table works. They are not related to

any actual evaluation.

Table 9 Template of requirement evaluation sheet

Label | Requirement | Category | Quantitative | Qualitative | Adh Level | Criticality | Est. time-to-fix

Fun5 | Should keep track of user's position when traveling outdoors | Functional | - | - | A | - | -

Fun6a | Should provide basic outdoors navigation services | Functional | - | - | B | F | C
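As an illustration only, the evaluation sheet could also be kept in a machine-readable form, e.g. as a CSV file, which would ease the automatic generation from the requirement wiki envisaged in section 4.2.2; the column names mirror Table 9 and the file name is hypothetical:

    import csv

    FIELDS = ["Label", "Requirement", "Category", "Quantitative", "Qualitative",
              "Adh Level", "Criticality", "Est. time-to-fix"]

    rows = [
        ["Fun5", "Should keep track of user's position when traveling outdoors",
         "Functional", "-", "-", "A", "-", "-"],
        ["Fun6a", "Should provide basic outdoors navigation services",
         "Functional", "-", "-", "B", "F", "C"],
    ]

    # Write the sheet so it can be printed out or filled in during a pilot or workshop
    with open("requirement_evaluation_sheet.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(FIELDS)
        writer.writerows(rows)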


5 User experience evaluation of POSEIDON

As previously mentioned this section will detail the evaluation process of user experience and

usability factors of the POSEIDON systems. It is separated into three distinct parts. At first we will

detail how requirements can be evaluated, including monitoring. The process and organizational

aspects are similar to the ones presented in the previous chapter. Therefore we will refrain from

reiterating most of the information and focus on the novel aspects.

5.1 Evaluation of user experience requirements

The evaluation of user experience and usability is more difficult to express in terms of quantitative

measurements. There are three different levels that we have to consider, in increasing level of

abstraction:

1. Functionality

2. Usability

3. User experience

The functional aspects include all the required aspects for the system to work in the desired way.

This is covered by the technical requirement evaluation presented in the previous section. Usability

extends this scope by also considering aspects such as intuitiveness of use and predictability of the

chosen actions. Finally, user experience covers all aspects of the system including the look and feel of

POSEIDON. This category encompasses aspects, such as “joy of use”, reaction of users to the system,

trust of the users, and various other aspects. In terms of evaluation this often is a subjective matter.

5.2 Evaluation process

Figure 4 User experience testing cycle

The user experience evaluation process follows the cycle outlined in Figure 4. It is similar to the

process for the technical evaluation. Four different phases can be distinguished:


1. During the piloting & testing phase, the adherence to the user experience requirements is tested

with regard to the specification. It happens at the same time as the technical evaluation

2. Analogous to the technical evaluation, the analysis phase uses the results to calculate derived

information, such as the risk score. This phase should not last longer than 14 days.

3. The consolidation phase is analogous to the technical evaluation

4. The requirements revision phase is also similar to the technical evaluation. However, a single

user experience factor may affect various requirements. Accordingly, more time should be

planned in order to thoroughly perform this task

5.2.1 Organization

The organizational structure of the personnel responsible should be kept small, in order to minimize

overhead. We envision a system of three roles operating given the following organizational structure.

Figure 5 Organizational structure of user experience evaluation

The different roles will have the following tasks.

- The User Experience Evaluation Coordinator will lead the process, query the different stages of

  the process, distribute tasks to the user experience evaluation committee and lead the

  consolidation meeting

- The user experience evaluation committee members will execute the different tasks related to

  the evaluation process, such as performing the user experience evaluation, analysing the results,

  participating in the consolidation meeting and adapting the requirements

- The requirements advisory board is comprised of several technical specialists within the

  consortium who perform the workshops and pilots and can contribute to the process during the

  consolidation meeting and analysis phase, giving input on the technical feasibility of system

  adaptations

5.2.2 User experience requirements

In order to perform a risk assessment similar to the previously described requirement evaluation, it is

necessary to specify a set of requirements specifically associated with this factor and to analyse them

similarly to the evaluation of qualitative requirements that was introduced earlier. A practical

approach is to “quantize” the user experience by performing interviews based on Likert scales or by

transferring open interview questions to a Likert-scale scheme after the fact. The latter is required

when interviewing certain user groups that are not suited to filling out Likert questions due to lack of

experience or other factors.

This transfer of open-ended questions to Likert-scale ratings enables us to quantify the results of the

questionnaire, in order to specify adherence levels as specified in the previous sections. However,

this process has to be performed very carefully. The members of the user experience evaluation


committee will perform this transformation according to best practice. Examples can be found in the

write-ups by Hughes and Silver [5] or Jackson and Trochim [6].

The next step is to use this coding or quantification to grade the responses on user experience,

similarly to the process presented in the previous section. Depending on the specific question, we

propose the scheme shown in Table 10.

Table 10 User experience adherence levels, descriptions and associated score

Grade | Adherence Level | Description

A | Full | Matching or exceeding the threshold

B | With Limitations | Scoring between 80% and 100% of the threshold

C | Low | Scoring between 50% and 80% of the threshold

F | Failure | Scoring less than 50% of the threshold

The most important factor to consider is the threshold. This should be defined based on the

importance of a specific item. On a Likert scale from 1 to 10, a typical value could be 7. When

analyzing the responses of the interviews, the result is either a quantified table for Likert-style

questions or coded values for open-ended questions. The coding should be chosen accordingly, so we

can also assume a scale of 1-10 and a threshold of 7. Using this as a basis for grading, the calculations

for the different adherence grades with a threshold of 7 look like this:

Grade | Score

A | >= 7.00

B | 5.60 to 7.00

C | 3.50 to 5.60

F | < 3.50
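A small sketch of this threshold-based grading (assuming a 1-10 scale with a threshold of 7; the helper name grade_user_experience is our own choice) could look as follows:

    def grade_user_experience(score: float, threshold: float = 7.0) -> str:
        """Grade an averaged Likert or coded score against a threshold (Table 10)."""
        if score >= threshold:        # matching or exceeding the threshold
            return "A"
        if score >= 0.8 * threshold:  # between 80% and 100% of the threshold
            return "B"
        if score >= 0.5 * threshold:  # between 50% and 80% of the threshold
            return "C"
        return "F"                    # less than 50% of the threshold

    # Hypothetical example: an average response of 6.1 on a 1-10 scale gives grade "B"
    print(grade_user_experience(6.1))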

Having assigned the adherence grades, we can use the risk assessment of the previous section to

specify risks and estimate their severity and impact on the project. The risk assessment is part of the

analysis phase and should be performed at this step.


6 Conclusion

On the previous pages we have introduced the process and protocols to allow a technical evaluation

of the systems that will be developed in the scope of POSEIDON. We introduced the rationale of the

technical evaluation by restating the POSEIDON timeline and the measures of impact and success

that were introduced during the application phase.

An overview of relevant best practice of system evaluation was given that relates to the

requirement-driven design of the POSEIDON development process. Based on this it was possible to

introduce a method to quantify the technical capabilities of a system given a set of requirements. This

assessment is based on two factors - the adherence to the specified requirements and the risk

associated with not fulfilling a requirement, which can be determined using a risk score calculation.

We outlined the process and persons required to perform this evaluation and briefly introduced the

tools necessary to track the requirements.

Additionally, the document also discusses critical differences between technical evaluation and user

experience evaluation. This includes differences in process, but also evaluation and quantification of

measurements.


7 References

[1] B. Beizer, Software system testing and quality assurance. Van Nostrand Reinhold Co., 1984.

[2] B. Nuseibeh and S. Easterbrook, “Requirements engineering: a roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 35–46.

[3] I. Sommerville and P. Sawyer, Requirements engineering: a good practice guide. John Wiley & Sons, Inc., 1997.

[4] K. Dowd, Beyond value at risk: the new science of risk management, vol. 3. Wiley Chichester, 1998.

[5] G. Hughes and C. Silver, “Quantitative analysis strategies for analysing open-ended survey questions in ATLAS.TI,” 2011. [Online]. Available: http://www.surrey.ac.uk/sociology/research/researchcentres/caqdas/support/analysingsurvey/quantitative_analysis_strategies_for_analysing_openended_survey_questions_in_atlasti.htm.

[6] K. M. Jackson and W. M. K. Trochim, “Concept mapping as an alternative approach for the analysis of open-ended survey responses,” Organ. Res. Methods, vol. 5, no. 4, pp. 307–336, 2002.