Improving the COSMIC Approximate Sizing using the Fuzzy Logic EPCU Model

Francisco Valdés Souto & Alain Abran

École de Technologie Supérieure [email protected] [email protected]

© 2015 Valdés-Souto & Abran

The Sizing Problem

FSM methods work best when the information to be measured is fully known.

Early phases: only non-detailed information is available.

Diagram labels: UC Identification, FP Identification

Approximate size measurement

• Not all documentation is available

• Quality of documentation is poor

• Portfolio decisions

Frank Vogelezang, 1er Congreso Nacional de Medición y Estimación de Software, México 2015

Estimation (Approximation)

Diagram labels (after Steve McConnell, Software Estimation: Demystifying the Black Art):

• Count: COSMIC standard

• Compute: current approximation approaches

• Judge & Compute: new approaches

Related Works on Approximation of Functional Size

Timeline shown on the slide: 1992, 1997, 2003, 2004, 2007, 2011, 2012, 2013

• Henderson et al.: investigated the relationship between FP & KLOC.

• Meli: (A) Early Function Points (EFP), based on IFPUG 4.0; (B) Extended FP (XFP): EFP & 3 correction factors.

• Desharnais et al.: analysed 2 techniques: Function Points Simplified (FPS) & Backfiring.

• Conte et al.: Early & Quick (E&Q) COSMIC; more tests needed to adjust or to confirm it.

• Vogelezang et al.: study of 50 projects to define size bands using the quartile approach.

• Santillo: Analytic Hierarchy Process, for making choices among alternatives.

• Valdés et al.: proposed a solution using the fuzzy logic model from [3-5], referred to as the EPCU model.

• Almakadmeh: a framework to assign scaling factors for identifying the level of granularity of functional requirements specifications. The state of the art on approximate COSMIC FSM was discussed at IWSM/MENSURA 2013.

Comparison of approximation approaches (for each approach: needs local calibration, requirement granularity level, consideration)

• Early & Quick COSMIC approximation

– Needs local calibration: X

– Requirement granularity level: Multilevel Approach (*)

– Consideration: The precision of the method is strongly dependent on the training and capability of the practitioners who use it to understand the categories at higher levels of granularity [38]. This approximation approach combines scaling and classification approaches.

• Quick/Early

– Needs local calibration: X

– Requirement granularity level: Use Cases

– Consideration: The precision is directly proportional to the level of granularity of the analyzed use case model.

• EPCU approximation approach

– Needs local calibration: (not marked)

– Requirement granularity level: Functional Process & Use Cases

– Consideration: Does not require local calibration (less expensive) and is useful when there are no historical data available.

• Average Functional Process

– Needs local calibration: X

– Requirement granularity level: Functional Process

– Consideration: This approximation is valid as long as there is sufficient reason to assume that the sample on which the size of the average functional process is calculated is representative of the software whose functional size is approximated. [38]

• Fixed Size Classification

– Needs local calibration: X

– Requirement granularity level: Functional Process

– Consideration: This approximation is valid as long as there is sufficient reason to assume that the assigned size classification is representative of the software whose functional size is approximated. [38]

• Equal Size Bands approximation

– Needs local calibration: X

– Requirement granularity level: Functional Process

– Consideration: This method is recommended for the approximate sizing of software where the distribution of the functional process sizes is skewed. For business applications, this method has little added value over the average functional process method (1) or the fixed size classification method (2). [38]

• Average Use Case approximation

– Needs local calibration: X

– Requirement granularity level: Use Case

– Consideration: This approximation is valid as long as there is sufficient reason to assume that the assigned size classification of an average use case is representative of the software whose functional size is approximated. [38]

Approximation Techniques Analysis Highlights

Approximate Sizing Approaches are based on 2 main assumptions:

1. Historical data exist for calculating the scaling factor (average, or size bands).

2. The whole set of requirements is described, or at least there is a commitment, defined by the requirements, about the scope of the software to be developed.

Original Equal Size Bands approach in Related Topics

• Historical data set:

– 37 business application development projects, each having a total size greater than 100 CFP (Vogelezang, 2007).

• Quartile values of the Functional Process from this dataset: Small = 4.8 CFP, Medium = 7.7 CFP, Large = 10.7 CFP, and Very Large = 16.4 CFP
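As an illustration of how these band values turn into a size estimate, here is a minimal Python sketch. It assumes the way the bands are applied in the case study later in this deck: each item is classified into a band and the band values are summed. The function name and the dictionary are hypothetical; the classifications are Practitioner 1's 14 use-case labels from the per-use-case table at the end of the deck.

```python
# Band values (CFP) quoted on this slide for the original Equal Size Bands approach.
BAND_SIZE_CFP = {
    "SMALL": 4.8,
    "MEDIUM": 7.7,
    "LARGE": 10.7,
    "VERY LARGE": 16.4,
}


def equal_size_bands_estimate(classifications, band_size=BAND_SIZE_CFP):
    """Sum the band value of every classified item (hypothetical helper)."""
    return sum(band_size[label.upper()] for label in classifications)


# Practitioner 1's classification of use cases 1 to 14 (from the per-use-case table).
practitioner_1 = [
    "MEDIUM", "MEDIUM", "SMALL", "SMALL", "SMALL", "SMALL", "MEDIUM",
    "SMALL", "SMALL", "SMALL", "MEDIUM", "SMALL", "MEDIUM", "SMALL",
]

estimate = equal_size_bands_estimate(practitioner_1)
print(f"Equal Size Bands estimate: {estimate:.1f} CFP")
```

With 5 MEDIUM and 9 SMALL use cases, the sum is 5 × 7.7 + 9 × 4.8 = 81.7 CFP, which agrees with the Equal Size Bands total reported for Practitioner 1.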


Analyzing the Dataset

Table 2. Dataset Characterization

#Projects | Sector | Project Range | Project Average Functional Size | #FP | Size

26 | Banking | 11-2743 | 476 | 1345 | 12375

8 | Government | 64-2364 | 481 | 838 | 3845

6 | Insurance | 84-1311 | 551 | 342 | 3305

7 | Logistics | 193-1164 | 538 | 321 | 3766

Table 3. Q-Size considering four sectors

Quartile | % FP included | Description | Average Value

Q1 Small FPs | 55% | contains FPs in the range up to 6 CFP | 3.7

Q2 Medium FPs | 26% | contains FPs in the range 6-10 CFP | 7.7

Q3 Large FPs | 14% | contains FPs in the range 10-25 CFP | 14.6

Q4 Very Large FPs | 5% | contains FPs of 25 CFP and larger | 44.1
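A small cross-check of the two tables, as a sketch only. It assumes that "Size" in Table 2 is the total size in CFP per sector and that "#FP" is the number of functional processes; the deck does not spell this out, and the function names are hypothetical. It computes the average functional process size two ways and the share of total size held by each band.

```python
# Table 2 rows: (sector, number of functional processes, total size).
# Assumption: "Size" is total CFP per sector, "#FP" counts functional processes.
TABLE_2 = [
    ("Banking", 1345, 12375),
    ("Government", 838, 3845),
    ("Insurance", 342, 3305),
    ("Logistics", 321, 3766),
]

# Table 3 rows: (band, share of functional processes, average size in CFP).
TABLE_3 = [
    ("Small", 0.55, 3.7),
    ("Medium", 0.26, 7.7),
    ("Large", 0.14, 14.6),
    ("Very Large", 0.05, 44.1),
]


def average_from_totals(rows):
    """Average CFP per functional process from sector totals."""
    total_fp = sum(count for _, count, _ in rows)
    total_size = sum(size for _, _, size in rows)
    return total_size / total_fp


def average_from_bands(rows):
    """Average CFP per functional process as a share-weighted mean of band averages."""
    return sum(share * average for _, share, average in rows)


def size_share_per_band(rows):
    """Fraction of the total size contributed by each band."""
    overall = average_from_bands(rows)
    return {band: share * average / overall for band, share, average in rows}


print(f"Average from Table 2 totals: {average_from_totals(TABLE_2):.2f} CFP")
print(f"Average from Table 3 bands:  {average_from_bands(TABLE_3):.2f} CFP")
for band, share in size_share_per_band(TABLE_3).items():
    print(f"{band}: {share:.0%} of total size")
```

Under these assumptions the two averages land close to each other (roughly 8.2 and 8.3 CFP), and each band holds close to a quarter of the total size. That is what one would expect if the quartiles were cut by contribution to total size, although the deck does not state how they were cut.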

Defining new Output Variable domain considering this Dataset

• Quartile values of the Functional Process considering Q-Size (Table 3) from this dataset: Small = 3.7 CFP, Medium = 7.7 CFP, Large = 14.6 CFP, and Very Large = 44.1 CFP

Figure 3. Output Variable Schema

COSMIC Approximate Sizing Using the EPCU Model

What variables influence the size of a Use Case?

• Variable 1: Perception of the size of the Use Case (subjective, experience-based).

• Variable 2: The number of Objects of Interest related to the Use Case (subjective, experience-based).

What is the possible range for the output variable?

• Dataset analysis

Output: Functional Size Estimated for each Use Case (CFP)

Experiment Design

1. Define the EPCU Context for Approximate Sizing

EPCU Model:

1- Define input variables & membership functions

2- Define the inference rules between the input variables & the output variable (Functional Size in CFP)

3- Define output variable & membership functions

2. Selecting a Measurement Reference

3. Knowing the ALFA Software System

ALFA software system / 14 Use Case descriptions

4. Data Collection

Participants were provided only with the ALFA list of Use Cases and assigned values to the 2 input variables, based on their experience.

5. Data Analysis
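The three EPCU Model steps on this slide (input membership functions, inference rules, output variable) can be illustrated with a generic two-input fuzzy inference sketch in Python. This is not the EPCU model itself: the deck does not give its membership functions or rules, so everything below is a labeled assumption, including the 0 to 5 input scale, the three triangular sets per input, the rule table, and the weighted-average defuzzification. It is not expected to reproduce the numbers in the case study tables. Only the output ranges (2 to 16.4 CFP and 2 to 44 CFP) come from the deck.

```python
LEVELS = ("low", "mid", "high")
SCALE_MAX = 5.0  # assumed input scale 0..5 (hypothetical, not from the deck)


def triangular(x, left, peak, right):
    """Triangular membership: 0 at the feet, 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def memberships(x):
    """Degrees of membership of x in three assumed sets covering the input scale."""
    x = min(max(x, 0.0), SCALE_MAX)  # clamp to the assumed scale
    half = SCALE_MAX / 2
    return {
        "low": triangular(x, -half, 0.0, half),
        "mid": triangular(x, 0.0, half, SCALE_MAX),
        "high": triangular(x, half, SCALE_MAX, SCALE_MAX + half),
    }


def estimate_cfp(size_perception, objects_of_interest, out_min, out_max):
    """Illustrative fuzzy estimate of a use case's size in CFP.

    Assumed rule table: the higher the two input levels, the further the rule's
    output sits between out_min and out_max. Rules are combined by a
    firing-strength-weighted average.
    """
    mu_size = memberships(size_perception)
    mu_ooi = memberships(objects_of_interest)
    weighted, total = 0.0, 0.0
    for i, size_level in enumerate(LEVELS):
        for j, ooi_level in enumerate(LEVELS):
            strength = min(mu_size[size_level], mu_ooi[ooi_level])  # fuzzy AND
            position = (i + j) / (2 * (len(LEVELS) - 1))  # 0.0 .. 1.0
            weighted += strength * (out_min + position * (out_max - out_min))
            total += strength
    return weighted / total


# Same inputs, two output domains: the estimate moves with the chosen maximum.
for out_max in (16.4, 44.0):
    value = estimate_cfp(3, 4, out_min=2.0, out_max=out_max)
    print(f"inputs (3, 4), range 2 to {out_max}: {value:.1f} CFP")
```

The point of the sketch is structural: the estimate is a continuous function of the two subjective inputs, and it scales with the minimum and maximum chosen for the output variable.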


The same Case Study in 2012

• Case Study with 8 practitioners who:

– were not familiar with the COSMIC method,

– had no historical data for approximating the FSM using COSMIC,

– did not know the EPCU model,

– did not participate in the definition of the EPCU context.

• The only information they had available:

– A form with the list of Use Cases

– Their own experience with the business process related to the project

• The Case Study was a simulation of the early size estimation step with both approaches (Equal Size Bands & Fuzzy Logic).

Case Study Data & Analysis 2014-2015

Practitioner | Reference Functional Size in CFP | Estimated Functional Size using the ‘Equal Size Bands’ | MRE | Estimated Functional Size using EPCU (range from 2 to 16.4), 2014 | MRE | Estimated Functional Size using EPCU Improved (range from 2 to 44), 2015 | MRE

Practitioner 1 | 250 | 81.7 | 67% | 186.32 | 25% | 430.76 | 72%

Practitioner 2 | 250 | 93.3 | 63% | 132.76 | 47% | 240.74 | 4%

Practitioner 3 | 250 | 84.6 | 66% | 62.19 | 75% | 81.65 | 67%

Practitioner 4 | 250 | 93.3 | 63% | 114.34 | 54% | 190.33 | 24%

Practitioner 5 | 250 | 105.2 | 58% | 168.13 | 33% | 379.86 | 52%

Practitioner 6 | 250 | 81.7 | 67% | 111.26 | 55% | 183.14 | 27%

Practitioner 7 | 250 | 93.5 | 63% | 130.43 | 48% | 240.91 | 4%

Practitioner 8 | 250 | 114 | 54% | 199.82 | 20% | 493.47 | 97%

MMRE | | | 63% | | 45% | | 43%

SDMRE | | | 5% | | 18% | | 34%
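The MRE, MMRE and SDMRE figures in this table can be recomputed from the estimates with a few lines of Python. This is a sketch under the usual definitions: MRE = |reference − estimate| / reference, MMRE = mean of the MREs, SDMRE = standard deviation of the MREs. Using the sample standard deviation (n − 1) is an assumption; it appears to be the variant that matches the reported percentages. The names are hypothetical.

```python
from statistics import mean, stdev

REFERENCE_CFP = 250.0  # reference functional size from the table

# Estimates for Practitioners 1 to 8, copied from the table.
ESTIMATES = {
    "Equal Size Bands": [81.7, 93.3, 84.6, 93.3, 105.2, 81.7, 93.5, 114.0],
    "EPCU (2 to 16.4)": [186.32, 132.76, 62.19, 114.34, 168.13, 111.26, 130.43, 199.82],
    "EPCU Improved (2 to 44)": [430.76, 240.74, 81.65, 190.33, 379.86, 183.14, 240.91, 493.47],
}


def mre(estimate, reference=REFERENCE_CFP):
    """Magnitude of relative error of one estimate."""
    return abs(reference - estimate) / reference


def summarize(estimates, reference=REFERENCE_CFP):
    """Return (MMRE, SDMRE); SDMRE uses the sample standard deviation (assumption)."""
    errors = [mre(value, reference) for value in estimates]
    return mean(errors), stdev(errors)


for approach, values in ESTIMATES.items():
    mmre, sdmre = summarize(values)
    print(f"{approach}: MMRE = {mmre:.0%}, SDMRE = {sdmre:.0%}")
```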


Case Study Data Analysis

Fuzzy Logic EPCU model & the real value (2014)

• MMRE = 45% & SDMRE = 18%.

• Maximum MRE = 75% & Minimum MRE = 20%.

• Always underestimates

Fuzzy Logic EPCU model & the real value (2015)

• MMRE = 43% & SDMRE = 34%.

• Maximum MRE = 97% & Minimum MRE = 4%.

• Equally likely to be above or below the real value

Using a cutoff of about 16.4 CFP (2012, 2014), the approximation underestimates the functional size; using a cutoff of about 44 CFP (2015), the results fall both above and below the real value, as discussed by DeMarco: “An estimation is a prediction that is equally likely to be above or below the actual result.”
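The direction of the error can be checked directly from the per-practitioner estimates in the case study table. The following Python sketch simply counts estimates above and below the 250 CFP reference; the function name is hypothetical.

```python
REFERENCE_CFP = 250.0

# Per-practitioner estimates from the case study table.
EPCU_2014 = [186.32, 132.76, 62.19, 114.34, 168.13, 111.26, 130.43, 199.82]
EPCU_2015 = [430.76, 240.74, 81.65, 190.33, 379.86, 183.14, 240.91, 493.47]


def over_under(estimates, reference=REFERENCE_CFP):
    """Count estimates above and below the reference size."""
    over = sum(1 for value in estimates if value > reference)
    under = sum(1 for value in estimates if value < reference)
    return over, under


for label, values in (("2014, range 2 to 16.4", EPCU_2014),
                      ("2015, range 2 to 44", EPCU_2015)):
    over, under = over_under(values)
    print(f"{label}: {over} above, {under} below the reference")
```

With the 16.4 CFP cutoff all 8 estimates fall below the reference; with the 44 CFP cutoff, 3 fall above and 5 below.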

Exploratory Research Observations

• Fuzzy Logic approach:

– it does not use bands, but rather a continuous range in ℝ, which is represented by a membership function.

– But it is sensitive to min-max values

Large-scale experiments needed with:

• More case studies

• More participants for each case study

• A larger collection of projects with their use cases or functional processes identified, in order to conduct a more in-depth analysis of the EPCU Improved Size Approximation approach

Questions?


Per-use-case data, Practitioners 1 to 4

Column legend:

• UC class: Use case classification (linguistic values)

• UC size: Use case size (value assignment)

• OOI class: Presence (level, not the number) of objects of interest related to the use case, classification (linguistic values)

• OOI value: Presence (level, not the number) of objects of interest related to the use case (value assignment)

• ESB: Estimated Functional Size using the ‘Equal Size Bands’ approach

• EPCU 16.4: Estimated Functional Size using EPCU (to 16.4)

• EPCU 44: Estimated Functional Size using EPCU (to 44)

Practitioner 1

Use case | UC class | UC size | OOI class | OOI value | ESB | EPCU 16.4 | EPCU 44

Use case 1 | MEDIUM | 3 | AVERAGE | 4 | 7.7 | 16.4 | 44.0

Use case 2 | MEDIUM | 3 | FEW | 4 | 7.7 | 16.4 | 44.0

Use case 3 | SMALL | 2 | FEW | 3 | 4.8 | 10.7 | 20.5

Use case 4 | SMALL | 2 | FEW | 4 | 4.8 | 13.8 | 32.2

Use case 5 | SMALL | 2 | FEW | 4 | 4.8 | 13.8 | 32.2

Use case 6 | SMALL | 2 | AVERAGE | 4 | 4.8 | 13.8 | 32.2

Use case 7 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 27.5

Use case 8 | SMALL | 2 | FEW | 4 | 4.8 | 13.8 | 32.2

Use case 9 | SMALL | 1 | AVERAGE | 3 | 4.8 | 7.9 | 10.5

Use case 10 | SMALL | 2 | FEW | 3 | 4.8 | 10.7 | 20.5

Use case 11 | MEDIUM | 4 | AVERAGE | 3 | 7.7 | 12.5 | 26.4

Use case 12 | SMALL | 2 | FEW | 4 | 4.8 | 13.8 | 32.2

Use case 13 | MEDIUM | 3 | AVERAGE | 4 | 7.7 | 16.4 | 44.0

Use case 14 | SMALL | 2 | AVERAGE | 4 | 4.8 | 13.8 | 32.2

Total | | | | | 81.7 | 186.3 | 430.8

Practitioner 2

Use case | UC class | UC size | OOI class | OOI value | ESB | EPCU 16.4 | EPCU 44

Use case 1 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 27.5

Use case 2 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 12.7

Use case 3 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 12.7

Use case 4 | SMALL | 1.5 | FEW | 1 | 4.8 | 2.6 | 2.9

Use case 5 | MEDIUM | 3.2 | AVERAGE | 3.2 | 7.7 | 13.3 | 30.3

Use case 6 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 12.7

Use case 7 | SMALL | 1.5 | FEW | 1.3 | 4.8 | 2.9 | 3.3

Use case 8 | MEDIUM | 3.3 | AVERAGE | 3 | 7.7 | 12.7 | 27.4

Use case 9 | SMALL | 1 | FEW | 1.5 | 4.8 | 2.6 | 2.9

Use case 10 | SMALL | 1.3 | FEW | 1 | 4.8 | 1.8 | 2.0

Use case 11 | MEDIUM | 3 | AVERAGE | 3.5 | 7.7 | 14.6 | 35.7

Use case 12 | SMALL | 1.4 | FEW | 1.4 | 4.8 | 3.2 | 3.9

Use case 13 | MEDIUM | 3 | AVERAGE | 3.2 | 7.7 | 13.4 | 30.4

Use case 14 | MEDIUM | 3.5 | AVERAGE | 3.5 | 7.7 | 14.7 | 36.5

Total | | | | | 93.3 | 132.8 | 240.7

Practitioner 3

Use case | UC class | UC size | OOI class | OOI value | ESB | EPCU 16.4 | EPCU 44

Use case 1 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 2 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 3 | MEDIUM | 2 | FEW | 1 | 7.7 | 4.6 | 5.3

Use case 4 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 5 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 6 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 7 | MEDIUM | 3 | AVERAGE | 2 | 7.7 | 8.4 | 11.6

Use case 8 | SMALL | 2 | FEW | 2 | 4.8 | 6.6 | 8.8

Use case 9 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 10 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 11 | SMALL | 2 | AVERAGE | 2 | 4.8 | 6.6 | 8.8

Use case 12 | SMALL | 1 | FEW | 1 | 4.8 | 1.6 | 2.0

Use case 13 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 14 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Total | | | | | 84.6 | 62.2 | 81.7

Practitioner 4

Use case | UC class | UC size | OOI class | OOI value | ESB | EPCU 16.4 | EPCU 44

Use case 1 | SMALL | 2 | AVERAGE | 2 | 4.8 | 6.6 | 8.8

Use case 2 | MEDIUM | 3 | AVERAGE | 2 | 7.7 | 8.4 | 11.6

Use case 3 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 4 | MEDIUM | 3 | MANY | 3 | 7.7 | 12.7 | 27.5

Use case 5 | SMALL | 2 | AVERAGE | 2 | 4.8 | 6.6 | 8.8

Use case 6 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 27.5

Use case 7 | SMALL | 2 | FEW | 2 | 4.8 | 6.6 | 8.8

Use case 8 | MEDIUM | 3 | AVERAGE | 3 | 7.7 | 12.7 | 27.5

Use case 9 | SMALL | 1 | FEW | 2 | 4.8 | 4.6 | 5.3

Use case 10 | MEDIUM | 2 | AVERAGE | 3 | 7.7 | 10.7 | 20.5

Use case 11 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 12 | SMALL | 2 | FEW | 2 | 4.8 | 6.6 | 8.8

Use case 13 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Use case 14 | MEDIUM | 2 | AVERAGE | 2 | 7.7 | 6.6 | 8.8

Total | | | | | 93.3 | 114.3 | 190.3