
Function point analysis using NESMA: simplifying the sizing without simplifying the size

P. Morrow · F. G. Wilkie · I. R. McChesney

Published online: 24 July 2013
© Springer Science+Business Media New York 2013

Abstract This paper examines the trade-off between the utility of outputs from simplified functional sizing approaches, and the effort required by these sizing approaches, through a pilot study. The goal of this pilot study was to evaluate the quality of sizing output provided by NESMA's simplified size estimation methods, adapt their general principles to enhance their accuracy and extent of relevance, and empirically validate such an adapted approach using commercial software projects. A dataset of 11 projects was sized using this adapted approach, and these results compared with those of the established Indicative, Estimated and Full NESMA method approaches. The performances of these adaptations were evaluated against the NESMA approaches in three ways: (1) effort to perform; (2) the accuracy of the total function counts produced; and (3) the accuracy of the profiles of the function counts for each of the base functional component types. The adapted approach outperformed the Indicative NESMA in terms of sizing accuracy and generally performed as well as the Estimated NESMA across both datasets, and required only ~50 % of the effort incurred by the Estimated NESMA. This adapted approach, applied to varying levels of information presented in commercial requirements documentation, overcame some of the limitations of simplified functional sizing methods by providing more than simply the simplified indication of overall functional size. The provision and refinement of the more detailed function profile enable a greater degree of validation and utility for the size estimate.

P. Morrow (✉) · F. G. Wilkie · I. R. McChesney
School of Computing and Mathematics, University of Ulster, Newtownabbey, Co-Antrim BT37 0QB, UK
e-mail: [email protected]

F. G. Wilkie
e-mail: [email protected]

I. R. McChesney
e-mail: [email protected]


Software Qual J (2014) 22:611–660
DOI 10.1007/s11219-013-9215-1



Keywords Software size estimation · NESMA · Function point analysis · Simplified estimation · Commercial projects

1 Introduction

The topic of software estimation has been at the core of software development concerns since the beginning of software engineering as a discipline. It impacts both the selection of projects that are feasible to pursue and the subsequent successful management of those chosen projects. This paper is concerned with one aspect of estimation, namely software size estimation. The process of estimating the cost and schedule for new software projects represents a significant challenge for software development companies. The basis for such estimates should be an understanding of the size of the software that is to be developed, as evidenced by its role as the fundamental input for cost estimation models such as COCOMO II (Boehm et al. 2000). In this paper, the utility of simplified functional sizing methods is assessed, and results obtained through adapting their use on commercial projects are reported.

Functional sizing techniques have become the most established sizing measures, with ISO standards for IFPUG (ISO 2009), Mark II (ISO 2002), NESMA (ISO 2005), COSMIC (ISO 2011) and FISMA (ISO 2010). This paper is concerned with the NESMA method, which is derived from the function point analysis (FPA) approach to functional sizing originally defined by Albrecht (1979). The standing of function points as a formal sizing measure has come under criticism, e.g. the existence of correlations between their base functional components may invalidate function points as well-formed metrics (Kitchenham and Kansala 1993). The existence of such correlations, however, demonstrates that function points are a 'less than minimal' sizing measure rather than an invalid one. This property of function points may in fact be beneficial in enabling the derivation of functional size from a subset of their components. The suitability of using FPA as a consistently applicable approach has been demonstrated, with variances being observed as averaging approximately 10 % between experienced counters (Kemerer 1993). Their suitability is further supported by the establishment of the aforementioned ISO standards.

Academic literature shows that software cost estimation has been subjected to numerous methods to improve the process, with the development of such methods accounting for 61 % of peer-reviewed published literature in software development cost estimation (Jorgensen and Shepperd 2007). Research into expert judgement or analogy-based estimation has accounted for 25 % of this literature. However, within industry, the trend is to focus on cost/effort estimation, and rely upon more informal, expert judgement-based, estimation practices. These are generally adopted by 70 % or more of organisations (Wydenbach and Paynter 1995; Molokken-Ostvold et al. 2004; Yang et al. 2008). This apparent gap between the focus of academic research and the expectations of industry presents a challenge to be addressed if the potential benefits which the size estimation process may provide for software development are to be realised.

Industry surveys on the use of estimation methods have tended not to directly address the reluctance to adopt model-based estimation methods. Yang et al. (2008), however, specifically examined this issue, within relatively large software development organisations, for the use of cost estimation methods. In this case, model-based cost estimation methods included the use of software size approaches such as FPA metrics. This survey found that the main reasons reported for not using estimation models were the insufficient benefits to justify the additional cost and effort required. This perception within industry


(BFCs) or to subsequently assess the associated complexity of each occurrence. In these cases, the level of detail and the format of the information available are fundamental to selecting an appropriate simplified method.

The other primary motivation for simplifying the size estimation process stems from the limited availability of resources needed to complete the estimate. In these cases, the effort required to develop the estimate and the accuracy of the results obtained are important in determining which method should be used. These motivations are, of course, not mutually exclusive, so each of these factors should be considered if a sizing method is to address the commercial realities of software size estimation.

The main simplified functional sizing issues and methods are presented in the following sub-sections, resulting in a discussion of the identified remaining challenges to be faced in this area of software size estimation. An overview of NESMA functional sizing is then presented, providing an indication of how the different levels of detail in the different NESMA methods are related to each other and the issues affecting their selection.

2.1 Simplified software size estimation issues

In FPA, the functional size is adjusted through consideration of general system characteristics (GSC) from which a value adjustment factor (VAF) is calculated and applied to the unadjusted size. These adjustment factors have been widely criticised both for lacking a sound theoretical basis and for failing to provide sufficient practical benefits, as they do not provide an improved indication of how function points are related to development effort (Lokan 2000). In addition, the ISO standard for functional size measurement (ISO 2007) does not include consideration of the VAF as it is related to technical characteristics rather than functional requirements. Simplified sizing methods may also omit this aspect, thus avoiding a potential source of variance between functional counters.

Compensation for the absence of sufficient detail about a project, or locally developed calibration data, may be achieved through the use of industry project data. Kitchenham et al. (2007) reviewed cost estimation studies that compared cross-company and within-company data. This review concluded that different project characteristics and the size of the development company affected the suitability of each type of dataset. Factors such as small development companies, specialised projects, relatively small projects and homogeneous within-company datasets made cross-company data less suitable. The evidence in this area was not considered to be definitive, in part due to the absence of any consensus on the approach to researching this issue. Some existing simplified sizing methods require the use of either internal or external historical data in developing the estimate. The suitability of these simplified sizing methods may therefore vary according to the characteristics of the project under consideration. Our simplified sizing adaptations address this issue by removing the dependency on either form of historical data.

Abran et al. (2004) used regression models to investigate how the FPA functional profile of a project, i.e. the individual BFC sizes, contributed to the development effort of a project. The correlation with development effort was found to differ for individual BFC types within a dataset, suggesting that the relative size of each BFC within a functional profile would affect the development effort. This study found that, within each dataset used, when the functional profile fell outside of the average functional profile (incorporating 80 % of projects in a dataset), the correlation with development effort differed considerably from the correlation obtained using that entire dataset. In contrast, the projects that fell within the average functional profile for a given dataset produced a similar


components defined by the equivalent full sizing method must be identified by the simplified version. The 'external weightings' column is concerned with the use of data from sources external to the development organisation for determining the complexity of the BFCs. This may take the form of specific weightings derived from cross-company data or the assumption of complexity determined by the developer of the method. The 'local calibration' column is concerned with the requirement for specific weightings for BFCs to be calculated from locally accumulated historical data. The 'multi-level' column identifies where the simplified sizing method can accommodate more detailed requirements information to develop the size estimate.

Simplified functional sizing methods may be broadly distinguished in terms of whether or not all of the BFC types must be identified. Methods which require all of the BFC types simplify the process by deriving the complexity from historical data (Sect. 2.2). The process is simplified further by methods which only require a subset of the BFC types to be identified and extrapolate the overall functional size from this subset (Sect. 2.3). The multi-level sizing approaches provide the advantages of not requiring the same level of requirements detail as full sizing methods, but facilitating its incorporation into the size estimate when it is available (Sect. 2.4).

2.2 Derived complexity sizing methods

Derived complexity sizing methods require the identification of all of the BFC types, but remove the complexity assessment of each occurrence of a BFC. The complexity is derived either from established industry/standards data or from local historical data. It is then applied, either directly or indirectly through weighting constants, to the BFC counts in order to produce the overall functional size for a project.

Table 1 Overview of simplified functional sizing methods

| Sizing approach | BFC requirements | External weightings | Local calibration | Multi-level |
|---|---|---|---|---|
| ISBSG average complexity | All | Yes | No | No |
| Estimated NESMA | All | Yes | No | No |
| Early E (Horgan et al.) | All | No | Yes | No |
| Function points simplified (Bock and Klepper) | All | No | Yes | No |
| Simplified COSMIC | All | No | Yes | No |
| Indicative NESMA | ILF, EIF | Yes | No | No |
| ISBSG extrapolative weightings | One or more | Yes | No | No |
| Internal logical file model (Tichenor) | ILF | No | Yes | No |
| ILF transaction template (Poul Staal Vinje) | ILF | No | No | No |
| Simplified indicative NESMA (Wang et al.) | ILF, EIF | No | No | No |
| Early and quick function point method (Santillo et al.) | Individual and/or aggregated | No | No | Yes |
| KISS (Forselius) | All (subdivided into 28 components) | No | No | Yes |
| Simplified NESMA adaptations (this study) | ILF, EIF | No | No | Yes |

The International Software Benchmarking Standards Group (ISBSG) maintains a public repository of software project data (ISBSG 2009), from which the general average complexity weighting of each BFC type may be derived. These weightings may be applied to the counts for each BFC in order to produce the overall functional size through a method referred to as Average Complexity Estimation (Meli and Santillo 1999). MacDonell and Shepperd (2007) investigated the use of the ISBSG dataset for effort estimation, as part of a wider review on cross-company data studies. The use of within-company data was found to be preferable overall to the use of cross-company data for the dataset analysed. The question of the applicability of complexity weightings obtained from cross-company data for within-company data may therefore limit this approach to a substitute for the lack of necessary information about a project and/or the availability of resources to complete the estimate.

The Estimated NESMA method removes the requirement for assessing the complexity of each BFC by assuming an associated complexity for each BFC type. Data functions are assumed to be of low complexity, and transaction functions are assumed to be of average complexity (NESMA 2004). These assumed complexities were established by NESMA, where the functional sizes produced by this weighting scheme were found to be strongly correlated with those of the Full NESMA method. Van Heeringen et al. (2009), across 42 projects, reported that the Estimated NESMA method demonstrated on average an inaccuracy of only 1.5 % relative to the Full NESMA method. However, the author of that study incorporated the direction of the relative inaccuracy for each project in the calculation, which leads to the positive and negative values cancelling each other out to some degree. When only the magnitude of the relative inaccuracy is considered, the average inaccuracy compared to the Full NESMA method would be 5.73 %. As with our preceding study (Wilkie et al. 2011), the conclusion was that there was no additional value in performing the Full NESMA method. Candido and Sanches (2004) examined the effectiveness of the different levels of the NESMA approach and found that the Estimated NESMA method demonstrated an average error of 18 % relative to the full method. This was improved to an acceptable average error of only 4 % by adapting the assumed complexities to the typical profile of previously completed projects. The limitation of their approach is that it is still applying the associated complexity of each BFC type across every instance of that BFC, which may not be as effective outside of the relatively uniform low complexity profile of their tested applications. While these studies demonstrate the effectiveness of the Estimated NESMA method, the burden of identifying each individual BFC may not be feasible, in terms of either the availability of the necessary data or the expenditure in estimation effort required.
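The difference between averaging signed relative inaccuracies and averaging their magnitudes can be illustrated with a minimal sketch. The project sizes below are hypothetical values chosen for illustration, not data from any of the cited studies, and the function names are our own.

```python
from statistics import mean


def relative_errors(estimates, actuals):
    """Signed relative error of each estimate against its full count."""
    return [(est - act) / act for est, act in zip(estimates, actuals)]


def mean_relative_error(estimates, actuals):
    """Average of the signed errors: over- and under-estimates cancel."""
    return mean(relative_errors(estimates, actuals))


def mean_magnitude_relative_error(estimates, actuals):
    """Average of the absolute errors: the direction is ignored."""
    return mean(abs(err) for err in relative_errors(estimates, actuals))


# Hypothetical functional sizes in function points.
full = [200, 400, 500, 1000]
estimated = [220, 360, 525, 950]

print(f"signed:    {mean_relative_error(estimated, full):.1%}")
print(f"magnitude: {mean_magnitude_relative_error(estimated, full):.1%}")
```

For these illustrative values the signed errors (+10 %, -10 %, +5 %, -5 %) cancel to an average of 0 %, whereas the magnitudes average 7.5 %, which is the kind of gap described above.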

The Early E approach (Horgan et al. 1998) offers a variation on the Estimated NESMA approach by counting raw function points (RFP), i.e. how many BFCs are present. A single weighting constant derived from local historical projects is then used to convert the total RFP to a total function point (FP) value. The method was found to produce sizes within 25 % of the full unadjusted count on 94 % of the estimates. It did perform poorly, however, in predicting effort, so its use would only be recommended for predicting the functional size of a system.
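The conversion from raw function points to a functional size can be sketched as follows. The source does not state how the single weighting constant is derived, so this sketch assumes, purely for illustration, that it is the ratio of total FP to total RFP over the local history; the historical pairs and function names are hypothetical.

```python
def calibrate_weighting_constant(history):
    """Derive one constant from (raw_function_points, full_fp) pairs.

    Assumption: the constant is the ratio of the summed full counts
    to the summed raw counts of the local historical projects.
    """
    total_rfp = sum(rfp for rfp, _ in history)
    total_fp = sum(fp for _, fp in history)
    return total_fp / total_rfp


def estimate_size(raw_function_points, weighting_constant):
    """Convert a count of BFC occurrences into an estimated FP size."""
    return raw_function_points * weighting_constant


# Hypothetical local history: (number of BFCs counted, full FP count).
history = [(40, 200), (75, 390), (120, 610)]
k = calibrate_weighting_constant(history)
size = estimate_size(60, k)
print(f"constant = {k:.2f}, estimated size = {size:.0f} FP")
```

Because a single constant is applied to every BFC, any new project whose mix of BFC types differs from the local history will be sized less reliably.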

The Function Points Simplified (FPS) approach (Bock and Klepper 1992) is similar to the Estimated NESMA approach in that it only requires each individual BFC to be identified, but subsequently uses regression analysis on historical project data to derive a separate weighting for each BFC. This approach was analysed by Desharnais and Abran (2003), who included the EIF component omitted by the original developers, using a dataset of 47 projects to derive the specific weightings and then applying these to another dataset of 42 projects. When the results were evaluated against the detailed function counts, a statistically significant correlation was observed, with the simplified method accurate to within 5 % on average. These findings suggest comparable performance to the Estimated NESMA method, but this approach requires a significant historical dataset to produce these weightings.

For these methods, the data maintained by the system must be identified, with the NESMA approach specifically stating that a detailed data model should be available. Simplified versions of COSMIC, on the other hand, are fundamentally process focused, wherein a project is viewed in terms of functional processes which may be sized by identifying the number of data movements involved. The COSMIC Average Functional Process method requires only that the functional processes be identified, with the total number of processes then multiplied by the average functional process size. The COSMIC Equal Size Bands method utilises average size across distinct size bands, enabling the functional processes to be assigned to a particular size band, but the savings in estimation effort are not as substantial as with the Average Functional Process method. These approaches remove the need for a data model to be present, but unlike the simplified versions of NESMA, the necessary scaling factors must be calibrated locally before the methods may be applied. While van Heeringen et al. (2009) reported relatively high levels of accuracy for these methods, within 2 % of the detailed COSMIC approach, the average inaccuracies calculated were subject to the same issue present with his NESMA data. When the calculation only considers the magnitude of the relative inaccuracy for each project, the average inaccuracies that result are 28.25 % for the Average Functional Process method and 7.71 % for the Equal Size Bands method. The accuracy of these simplified COSMIC approaches may therefore be more in line with the simplified NESMA methods.
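The arithmetic behind the two simplified COSMIC methods can be sketched as below. The average sizes, band names and process counts are hypothetical stand-ins for the locally calibrated scaling factors that these methods require; they are not taken from the COSMIC documentation or the cited study.

```python
def average_functional_process_size(n_processes, average_size):
    """Average Functional Process: process count times one average size."""
    return n_processes * average_size


def equal_size_bands_size(processes_per_band, band_average_size):
    """Equal Size Bands: each band's count times that band's average size."""
    return sum(
        count * band_average_size[band]
        for band, count in processes_per_band.items()
    )


# Hypothetical local calibration (sizes in COSMIC function points).
average_size = 8.0
band_average_size = {"small": 4.0, "medium": 8.0, "large": 14.0, "very large": 25.0}

# Hypothetical new project: 30 functional processes, assigned to bands.
processes_per_band = {"small": 10, "medium": 12, "large": 6, "very large": 2}

afp = average_functional_process_size(30, average_size)
esb = equal_size_bands_size(processes_per_band, band_average_size)
print(afp, esb)
```

With these illustrative figures the two estimates are 240 and 270. The banded estimate needs one extra judgement per process, which is consistent with the smaller effort savings noted above.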

2.3 Extrapolative sizing methods

Extrapolative sizing methods reduce the estimation effort and information requirements of the sizing further, by only requiring the identification of a subset of the BFC types. The overall functional size is then extrapolated from the counts of the required BFC types, either from established industry/standards data or from local historical data.

The Indicative NESMA method only requires the identification of internal logical files (ILF) and external interface files (EIF). For the Indicative NESMA method, the official NESMA counting manual specifies that errors in functional size with this approach can be up to 50 % (NESMA 2004). Van Heeringen et al. (2009) reported that the Indicative NESMA method demonstrated on average an inaccuracy of 16.5 % relative to the Full NESMA method. When the calculation is amended to only incorporate the magnitude of the relative inaccuracy for each project, the average inaccuracy obtained is 28.44 %. Candido and Sanches (2004) also found that the estimation accuracy of the Indicative NESMA method was significantly less than that of the Full NESMA approach, with a reported average error of 48 % relative to the Full NESMA method.

The ISBSG repository of software project data also enables the derivation of the percentage contribution, on average, that each BFC makes towards the total functional size (ISBSG 2009). The identification of only one specific type of BFC can then form the necessary input for the size estimate, with a recommendation for an additional specified contingency size to be added for functionality not apparent at an early stage in the lifecycle. It is the inherent weakness of the existence of correlations between BFCs which, conversely, enables this derivation of a total functional size from a subset of these components. The question of the applicability of industry wide correlations is, again, an issue with this approach. The nature of these correlations was investigated by Lokan (1999), who had previously identified factors affecting the relative contribution of each BFC: programming language, type of project developed, type of organisation and the use of prototyping during development. In this study, subsets of projects from an overall set of 269 projects were used to assess which factors had the most effect on correlations between BFCs. While correlations varied across subsets, they ranged from always being present between EI and ILF, to rarely being present between EIF and any other components. The strongest correlations were found for new development projects, particularly those developed using 4GL languages and application generators. The results support the assertion that estimates may effectively be derived from a restricted set of BFC types, but highlight the limited applicability of such an approach to certain types of project.
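The extrapolation from a single BFC type via its average percentage contribution can be sketched as follows. The share, the counted size and the contingency are hypothetical figures for illustration only; they are not ISBSG values, and the function name is our own.

```python
def extrapolate_total_size(counted_fp, average_share, contingency_fp=0.0):
    """Scale the size of one BFC type up to a total functional size.

    counted_fp     -- function points counted for the single BFC type
    average_share  -- that type's average fraction of total size (0 < share <= 1)
    contingency_fp -- extra size added for functionality not yet apparent
    """
    if not 0.0 < average_share <= 1.0:
        raise ValueError("average_share must be in the interval (0, 1]")
    return counted_fp / average_share + contingency_fp


# Hypothetical inputs: ILFs counted at 84 FP, assumed to contribute 24 % of
# the total on average, with a 35 FP contingency for early-lifecycle gaps.
total = extrapolate_total_size(84, 0.24, contingency_fp=35)
print(round(total))
```

With these figures the extrapolated size is 350 FP before the contingency and 385 FP after it. The estimate is only as good as the assumed share, which is why the variation in BFC correlations across project types matters.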

Tichenor (2008) describes an Internal Logical File Model, developed in 1994, that derives a total function count from the identified ILFs. Through the use of a statistical correlation between unadjusted function points and the number of ILFs, this model has produced estimates within 10 % of the actual total function point count. However, it is recommended that a minimum of about 30 applications be counted before a statistically significant correlation is obtained, and some projects will inevitably be an exception to this correlation.
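A minimal sketch of this style of model is given below, assuming that the statistical correlation takes the form of a simple least-squares line relating the number of ILFs to unadjusted function points. The functional form, the data and the function names are our assumptions rather than the published model, and the four hypothetical applications fall far short of the recommended minimum of about 30.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_ufp(n_ilf, slope, intercept):
    """Estimate unadjusted function points from an ILF count."""
    return slope * n_ilf + intercept


# Hypothetical previously counted applications: ILF count and full UFP count.
ilf_counts = [5, 10, 20, 40]
ufp_counts = [170, 320, 620, 1220]

slope, intercept = fit_line(ilf_counts, ufp_counts)
print(predict_ufp(15, slope, intercept))
```

The illustrative data lie exactly on a line (30 UFP per ILF plus 20), so the prediction for 15 ILFs is 470 UFP; real calibration data would scatter around the fitted line.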

Deriving a functional count by applying a template of transactions to identified ILFs has been suggested by Poul Staal Vinje (cited in Meli and Santillo 1999). The 'application type' template of data inputs, data outputs and data inquiries is applied to each ILF (EIF are counted separately) in the form of an associated functional size for the selected template. Flexibility is provided by allowing the estimator to assign function point values to each function type in the template, and this approach reportedly reduces effort to perform the function count to about 20 % of the full count. However, this approach still depends on how closely the new project fits the selected application type, and at a lower level of granularity, how closely the template fits each ILF.

Wang et al. (2008) proposed a simplified method based on the Indicative NESMA method. A template of expected transactions is assigned to each ILF and EIF according to specific rules, with scope for adjusting these rules to suit each individual project. This approach addresses some of the issues that arose from our use of the Indicative NESMA method and produced results that were significantly closer to the full function count for the studied projects. However, the projects had generally been overestimated by the Indicative NESMA method, which was consistent with an equivalent study of small Web-based applications (Candido and Sanches 2004), but inconsistent with studies including systems with a greater range of types and sizes (van Heeringen et al. 2009; Wilkie et al. 2011). The specified rules do not provide an indication of why they would have led to lower estimates than the conventional method, and the largest project included in the Wang study was only 414 IFPUG function points in size, so the general applicability of this method is unclear. As only the overall counts produced in this study were reported, the results have not been validated at a more detailed level. The specified rules also, at times, seem to go beyond the lightweight NESMA methods in terms of the data detail, which is to be considered in performing the estimate. This leads to the unanswered question of how effective this method is in reducing the amount of effort required.


2.4 Multi-level sizing methods

The Early and Quick Function Point Method 2.0 (Santillo et al. 2005) combines aspects of some of the previous approaches, and refinements since 1997 enable it to be adapted to fit any of the ISO-specified functional sizing methods (FSM). It supports the identification of existing BFCs, referred to as Base Functional Processes (BFP), and applies weightings derived from ISBSG benchmark data. It also defines additional component types that represent aggregations of existing BFCs, e.g. the Typical Functional Process (TFP) consists of the typical operations: Create, Retrieve, Update, Delete, (List), i.e. CRUD(L). The General Functional Process (GFP) allows a subsystem of two or more BFPs to be more generally identified, while the Macro Functional Process (MFP) facilitates the aggregation of two or more GFPs. The basis for aggregation may come from identifying requirements in the new system that correspond to a known aggregation in an existing system. The developers of this method report that in most cases the size estimated is within 10 % of the real size. The reduction in the estimation effort varies according to the degree to which the components have been aggregated into MFPs, but the reported savings are generally between 50 and 90 %. This method provides for considerable flexibility as each part of the system can be estimated at whichever level of detail is feasible (or desirable). This enables the method to be applied when there is limited requirements detail available, but also to take advantage of greater detail as and when it is available. As with any function point approach, the reliability of the results is limited by the ability of the estimator to identify when such component types are present in the documentation. This issue may be more significant in this case due to the additional layered classifications of components that may be identified, although (unspecified) encouraging results are reported even with novice users of this method. While the reported savings in the cost/time to perform this method can be substantial, the quotation of a range of savings does not indicate the average saving. The specification of the method suggests that the estimation effort required would not typically be equivalent to the simplest methods that require identification of only a subset of the BFCs. Theoretically, the method is equivalent to a full functional sizing method when functional components are identified at the most detailed level. Indeed, Gencel and Demirors (2008) reported that their use of this method at the most detailed level of their study required the same amount of estimation effort as the IFPUG FPA method.

The KISS method (Forselius 2006) aims to simplify the software sizing procedure while maintaining general compatibility with any ISO standard FSM. This method involves the counting of 28 different functional components, based on the FISMA method. The rationale behind this approach is that by providing more specific components to be counted, less complicated guidelines are required to identify those components. This, in turn, reduces the amount of training required and the subjectivity involved in performing the method. For example, external outputs are divided into output forms, reports, text messages/emails and monitor screen outputs. As with FISMA, algorithmic complexity is directly considered through counting components such as calculation routines and simulation routines. The KISS Quick level of this method only requires these components to be counted, with an average size allocated by the counter enabling a functional size to be derived by applying multipliers to the average sizes of the counts of each respective component. Multipliers from any of the main standard methods may be used, with a multiplier of 0 applied to any component not required by the chosen method. While the KISS Quick level has produced promising results in student trials, relative to the more detailed KISS Perfect level of this method, the case studies used were small, at approximately 200 FPs or less in size. The reliability of this approach for larger systems has therefore not been established. The feasibility of determining a reliable average size for each component for larger projects may limit the suitability of this approach to those projects with a typical profile. The estimation effort savings achieved by KISS Quick were not reported, but as each component must be counted, savings equivalent to those of the extrapolative sizing methods are unlikely to be achievable. The KISS Perfect level of this method addresses some of the shortcomings of the KISS Quick level, as it allows for a more detailed consideration of the functional size by listing and assessing each individual function. However, at this level, the effort required is approximately the same as the detailed standard methods and may therefore not be suitable for classification as a simplified approach.

2.5 Challenges in simplified software sizing

The main issues prevalent in these existing simplified sizing methods are as follows:

1. The reliance on historical data, or assumptions of typical projects, restricts the accuracy of these methods on new projects that do not conform to the necessary precedents.

2. The extrapolative methods, which provide the greatest savings in estimation effort/cost, generally do not provide the same level of output as the more detailed methods, thereby limiting the utility of these methods to indicating the overall functional size rather than the provision of the full functional profile.

3. The weightings, or assumptions, upon which the methods are based, are applied uniformly across the respective BFC types, limiting the flexibility to cater for significant variations within an individual project.

4. The methods are not scalable, in terms of the level of input detail from requirements that they can accommodate, which limits their ability to benefit from a greater understanding of the required functionality of a project as a project progresses through development lifecycle steps.

The study reported in this paper involves the size estimation of 11 commercial projects, ranging in size from 218 FP to 1,931 FP, using a simplified approach adapted from the NESMA Indicative method. The adaptations made seek to address each of the limitations identified in this review. The Early and Quick Function Point Method is the only one which overcomes each of these issues, as it can essentially replicate the full function counting process where it is feasible. However, the aggregated nature of its base components, while providing the most flexibility of the identified simplified methods, introduces additional components to be identified and therefore represents the most complicated approach. Our focus on adapting the simplified NESMA approaches limits the identification of base components to the existing BFCs, as specified in the NESMA standard, and instead focuses on identifying the profile of their occurrences within a project. The relatively wide variance in the reduction in estimation effort achieved by the Early and Quick Function Point Method suggests that the incorporation of additional input detail into the estimate significantly impacts upon this effort. Our study has focused on providing a more efficient approach by facilitating refinement of the profile at a broader level than the individual BFCs.

2.6 NESMA function point counting method

The NESMA function point method measures the functional size of software in terms of ve BFCs:

Software Qual J (2014) 22:611660 621


Internal Logical Files (ILF) and External Interface Files (EIF)

External Inputs (EI), External Inquiries (EQ) and External Outputs (EO)

These components are referred to as either Data Functions (ILF, EIF) or Transaction Functions (EI, EQ, EO). There are three levels of size estimation provided for within the NESMA approach, as illustrated in Fig. 1. The Full NESMA estimation method requires the identification and complexity classification of each of these components. The complexity of each component is determined by counting the number of Data Element Types (DET) and Record Element Types (RET) either present, in the case of files, or transported across the system boundary, in the case of transactions, with a classification of low, average or high complexity assigned according to specified boundaries of DET and RET totals. In this context, DET refers to the attributes present in a file, while each file is itself comprised of one or more RETs that exist as a logical file within the system. Each individual component is assigned a function point value according to its classification, as shown in Table 2.

The total functional size for a system is then calculated by summing the individual function point values of these components:

Total Functional Size = Sum of ILF + Sum of EIF + Sum of EI + Sum of EQ + Sum of EO
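The summation above can be illustrated with a minimal Python sketch. The weight values below are an assumption: they are the commonly published IFPUG-style function point values and are not taken from Table 2, which remains the authoritative source. The function name `full_size` and the sample component list are hypothetical.

```python
# Assumed weights (commonly published IFPUG-style values), keyed by BFC type
# and complexity classification. Table 2 of the paper is authoritative.
ASSUMED_WEIGHTS = {
    "ILF": {"low": 7, "average": 10, "high": 15},
    "EIF": {"low": 5, "average": 7, "high": 10},
    "EI": {"low": 3, "average": 4, "high": 6},
    "EO": {"low": 4, "average": 5, "high": 7},
    "EQ": {"low": 3, "average": 4, "high": 6},
}


def full_size(components, weights=ASSUMED_WEIGHTS):
    """Return (total FP, FP subtotal per BFC type).

    `components` is an iterable of (bfc_type, complexity) pairs, one pair
    per identified and classified component.
    """
    subtotals = {bfc_type: 0 for bfc_type in weights}
    for bfc_type, complexity in components:
        subtotals[bfc_type] += weights[bfc_type][complexity]
    return sum(subtotals.values()), subtotals


# Hypothetical system: two files and three transactions.
example = [
    ("ILF", "low"),
    ("ILF", "average"),
    ("EI", "average"),
    ("EO", "high"),
    ("EQ", "low"),
]
total, by_type = full_size(example)
print(total, by_type)
```

Keeping the per-type subtotals alongside the total mirrors the point made in the text that the detailed methods provide a functional profile and not only an overall size.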

The Estimated NESMA method removes the requirement of assessing the complexity of each individual component. Instead, as shown in Fig. 1, the Data Functions are assumed to be of low complexity, while the Transaction Functions are assumed to be of average complexity. The total functional size is calculated using the same formula as for the Full NESMA method, and as such, the same subtotals for each BFC are also provided by this Estimated NESMA method. The Indicative NESMA method further simplifies the process by only requiring the identification of the Data Functions, from a data model, and applying predefined weightings to the number of components identified. The weightings to apply depend on whether or not the data model being used is normalised (i.e. 3rd normal form):

Fig. 1 Related levels of NESMA functional sizing


assessing whether existing simplified approaches to functional sizing can be adapted to provide value to the software development process.

Our research aim is:

To examine the business value of simplified functional sizing methods by assessing the trade-off associated with adapting existing simplified methods to better meet the commercial needs of software development organisations. These adaptations are to be evaluated on (i) the relative estimation effort overhead; (ii) the relative accuracy of the overall size estimate; and (iii) the relative accuracy of the provided full functional profile.

Increasing the complexity of the software sizing approach adopted involves a trade-off between the estimation performance achieved and the effort required to develop the size estimate. Figure 2 illustrates how the existing NESMA methods may be viewed in terms of this trade-off. The low effort associated with the Indicative NESMA method comes at the cost of both estimation accuracy and the provision of the full functional profile.

The sizing adaptations to be developed in this research study will examine the extent to which this trade-off can be improved. The overall estimation accuracy, and the accuracy of the full functional profile, will be evaluated as the estimation effort is increased. This will provide an indication of the extent to which the software sizing process can be simplified without significantly sacrificing the accuracy and level of detail produced by the estimate.

Fig. 2 Research aim of sizing adaptations


4 Research and development method

This study was conducted with the assistance of Equiniti-ICS, formerly ICS Computing Ltd, which has been in operation since 1966. The software development work of Equiniti-ICS has primarily been focused on fixed-price contract projects, providing enterprise solutions for the public sector, health trusts, payroll, accounting and financial domains. This has involved the development of both bespoke and framework products, utilising approximately 95 staff directly for this purpose.

4.1 Research methodology

Our research is concerned with adapting the current NESMA approach to enable the functional sizing process to be simplified, while achieving estimation accuracy and a degree of utility comparable with that of the Full NESMA method. Our study is divided into the following three stages: (1) appraisal of NESMA functional sizing methods; (2) development of simplified NESMA adaptations; (3) evaluation of simplified NESMA adaptations.

The first research stage is focused on learning from the application of each of the NESMA methods (Indicative, Estimated, Full), facilitating the identification of potential areas of improvement in the Indicative NESMA method.

The second research stage is focused on developing the adaptations of simplified NESMA. This involved adapting the existing Indicative NESMA method to incorporate aspects of the Estimated and Full NESMA methods without incurring the same degree of required estimation effort. This stage involved developing and refining these adaptations using the dataset of five projects from our previous study (Wilkie et al. 2011).

The third research stage is concerned with testing our adaptations by applying both our adaptations and the existing NESMA methods to the complete dataset of eleven projects. The NESMA sizing results, and associated estimation effort figures, from our previous completed study are therefore included in the results of this research study.

4.1.1 Commercial software projects

The dataset for this study took the form of eleven commercial projects supplied by Equiniti-ICS. The eleven projects were developed during the period 2000–2012 and share the characteristics listed in Table 4.

The development personnel for each project were drawn from a larger team of developers. The data driven nature of these systems supports the application of NESMA functional sizing, with standard approaches adopted for documenting and developing these

Table 4 Characteristics of sample projects

Application type            Data driven applications, e.g. case management system
Requirements methodology    Waterfall-based
Development team            Between 3 and 9 seasoned professional development staff
Development language        .NET/C#, Visual Basic 6
Database type               Relational
Interface type              Graphical User Interface
System architecture         N-tier


burden of maintaining local historical sizing data in order to establish reliable weightings. The degree of flexibility in adapting to different application types is still limited by projecting the same profile across each Data Function.

The effect of these limitations may be illustrated by comparing the number of Transaction Functions predicted by the Indicative NESMA method with the actual number of Transaction Functions specifically identified by the Full NESMA method. In our initial study of five projects (Wilkie et al. 2011), the raw counts of each component type were identified during the completion of the function counting. In this paper, we have extrapolated the number of expected Transaction Functions by applying the assumptions underlying the Indicative NESMA method to the raw counts of the Data Functions. As the normalised weightings (25 and 10) have been used for the Indicative NESMA method, the assumption made is that each ILF will require only one EO. Consequently, each ILF will be expected to have three EI, one EQ and one EO; each EIF will be expected to have one EQ and one EO. The NESMA guidelines treat entities used for constants, decoding, etc., as a special case, referred to as FPA tables. When present in an application, all internally maintained entities are grouped together as one ILF, and all externally maintained entities as one EIF. The assumptions stated by NESMA are that there is one EI, one EO and one EQ for the ILF FPA table, and no transaction functions for the EIF FPA table. The same procedure was applied to the NESMA sizing results obtained from the additional 6 projects used in this paper. Table 5 presents these results, along with the percentage difference in each case. In a study of 42 projects (van Heeringen et al. 2009), the raw counts of each component type identified according to NESMA guidelines were also provided for each project. As the same normalised weightings were used in the van Heeringen study, the same approach outlined above was applied to the larger dataset. As the presence of FPA files was not indicated in the van Heeringen data, all of the Data Functions were treated as normal, but the effect of any such files being present would at most lead to a reduction of two EI, one EO and one EQ from the total number of expected transaction functions for any project. The effect on the overall results would therefore be insignificant in this case. Because 8 of the projects in the dataset contained no transaction functions of some component type, these projects were excluded from this assessment, as no relative inaccuracy could be calculated for these components. Table 5 includes the overall averages obtained from this analysis of the van Heeringen dataset.
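The extrapolation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' tooling: the function names are hypothetical, the FPA flags are hypothetical inputs, and the percentage difference formula, (expected - counted) / counted, is inferred from the values reported in Table 5 and is not stated explicitly in the text.

```python
def expected_transactions(ilf, eif, fpa_ilf=False, fpa_eif=False):
    """Expected EI/EO/EQ counts under the normalised Indicative assumptions.

    `ilf` and `eif` are the counted Data Functions, including any FPA table
    groupings. `fpa_ilf` / `fpa_eif` flag whether one of the counted ILF /
    EIF is such a grouping.
    """
    normal_ilf = ilf - (1 if fpa_ilf else 0)
    normal_eif = eif - (1 if fpa_eif else 0)
    ei = 3 * normal_ilf              # three EI per normal ILF
    eo = normal_ilf + normal_eif     # one EO per normal ILF and EIF
    eq = normal_ilf + normal_eif     # one EQ per normal ILF and EIF
    if fpa_ilf:
        # ILF FPA table: one EI, one EO and one EQ
        ei, eo, eq = ei + 1, eo + 1, eq + 1
    # EIF FPA table: no transaction functions are added
    return {"EI": ei, "EO": eo, "EQ": eq}


def percentage_difference(expected, counted):
    """Relative difference of the expected count against the counted one.

    Raises ZeroDivisionError when `counted` is zero, which corresponds to
    the cases for which no relative inaccuracy can be calculated.
    """
    return 100.0 * (expected - counted) / counted


# Hypothetical inputs: 16 ILF and 3 EIF, each including one FPA grouping.
print(expected_transactions(16, 3, fpa_ilf=True, fpa_eif=True))
print(round(percentage_difference(46, 41), 2))
```

With these hypothetical flags the result happens to coincide with the expected counts listed for Project A in Table 5; whether Project A actually contained FPA tables is not stated in the text.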

The main observations to be made from these results are:

1. The greatest inaccuracy is found with the EQs: in our dataset, the number of expected EQs was, on average, 264.76 % more than the number actually identified during the function count. In only one of the projects did the counted number exceed the expected number. This pattern was significantly more extreme in the van Heeringen dataset, where the average difference exceeded 1,100 % and every project contained fewer EQs than expected.

2. The inaccuracy found with the EOs was significant, but with a different pattern from that found with the EQs. In our dataset, all of the projects contained more EOs than expected by the Indicative NESMA method, with the average difference being 56.57 %. A similar average difference was observed in the van Heeringen dataset, with 31 of the 34 projects included (91 %) conforming to the pattern of exceeding the expected number of EOs.

3. The pattern with the EIs was more varied, particularly with our dataset, which had a modest average difference of 25.66 %; in terms of the overall direction, there was a modest difference of 9.09 % between the expected and the counted number of EIs. The van Heeringen dataset exhibited a clearer pattern, with 71 % of the projects containing fewer EIs than expected and a more significant average difference of 87.35 %.
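The two kinds of average quoted in these observations can be reproduced from the EI columns of Table 5. The sketch below assumes, based on the reported values, that the "overall direction" figure is the mean of the signed percentage differences and that the plain "average difference" is the mean of their absolute values; this interpretation is inferred and not stated in the text.

```python
# Expected and counted EI for Projects A to K, as listed in Table 5.
expected_ei = [46, 85, 151, 58, 79, 25, 67, 61, 52, 160, 157]
counted_ei = [41, 69, 167, 85, 75, 17, 75, 87, 31, 118, 173]

# Signed percentage difference of expected against counted, per project.
diffs = [100.0 * (e - c) / c for e, c in zip(expected_ei, counted_ei)]

# Signed mean: over- and under-estimates cancel each other out.
signed_mean = sum(diffs) / len(diffs)

# Mean of absolute values: the typical size of the error per project.
absolute_mean = sum(abs(d) for d in diffs) / len(diffs)

print(round(signed_mean, 2), round(absolute_mean, 2))  # 9.09 25.66
```

The gap between the two figures shows how errors in opposite directions can partly cancel in the signed average while remaining visible in the absolute one.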


Table 5 Comparison of expected and full counted NESMA profiles

Project  Counted  Counted  Expected  Counted  Percentage  Expected  Counted  Percentage  Expected  Counted  Percentage
         ILF      EIF      EI        EI       difference  EO        EO       difference  EQ        EQ       difference
A        16       3        46        41       12.20       18        60       -70.00      18        14       28.57
B        29       4        85        69       23.19       32        164      -80.49      32        9        255.56
C        51       2        151       167      -9.58       52        151      -65.56      52        51       1.96
D        20       1        58        85       -31.76      20        77       -74.03      20        25       -20.00
E        27       1        79        75       5.33        28        68       -58.82      28        5        460.00
F        9        1        25        17       47.06       10        13       -23.08      10        9        11.11
G        23       1        67        75       -10.67      24        37       -35.14      24        20       20.00
H        21       4        61        87       -29.89      25        82       -69.51      25        15       66.67
I        18       1        52        31       67.74       19        30       -36.67      19        1        1,800.00
J        54       1        160       118      35.59       55        103      -46.60      55        16       243.75
K        53       3        157       173      -9.25       55        146      -62.33      55        38       44.74

Summary of percentage differences                                               EI        EO        EQ
Average difference (overall direction)                                          9.09      -56.57    264.76
Average difference                                                              25.66     56.57     268.40
Average difference (overall direction) (calculated from van Heeringen dataset)  77.50     -28.59    1,114.85
Average difference (calculated from van Heeringen dataset)                      87.35     54.56     1,114.85


The results from our dataset of 11 projects show that the Indicative NESMA method underestimated 9 of these projects, when compared with the Full NESMA method, and the average difference across all 11 projects was 26.30 %. The results in Table 5 indicate that the significant over-assumption of the number of EQs was partially compensated for by the under-assumption of the number of EOs. A more reliable pattern may be observed from the larger van Heeringen dataset. As indicated previously, with the van Heeringen dataset, the Indicative NESMA method was found to have an average difference of 28.44 % compared to the Full NESMA method, which resulted in an average overall overestimation of 16.30 %. Our analysis suggests that, to a large extent, these results were achieved by the significant inaccuracies found with over-assuming the number of EIs and EQs effectively being partially compensated for by under-assuming the number of EOs. As a rough indicator of system size, the Indicative NESMA method seemingly produces satisfactory results, but it would be incapable of providing anything of value beyond this overall size. The van Heeringen study provided no indication of the types of systems analysed, or of the nature of the requirements documentation used to inform the estimates, so the general applicability of these results is uncertain. However, as the projects were drawn from many different organisations and the measurements reviewed by a peer NESMA certified analyst, the general reliability of the original results can be assured.

Our study is concerned with adapting the simplified NESMA approach by increasing the accuracy and degree of utility of the Indicative NESMA method. The achievement of this aim involves the incorporation of aspects of the Estimated and Full NESMA methods without incurring the same degree of required estimation effort. The Estimated NESMA method requires the identification of each instance of a BFC. The estimation effort incurred is therefore greater than with the Indicative NESMA method. The Estimated NESMA method assigns the same complexity classification to each instance of a specific BFC type. The assumption made is that the complexity of each BFC type will, on average, be the same as the assigned complexity classification. This approach does not provide flexibility for refining the estimate in situations where a project's BFC complexity profile differs significantly from the assumption. The Full NESMA method requires the complexity of each instance of a BFC to be assessed, which is often the most time-consuming aspect of the functional sizing. The Estimated and Full NESMA methods have a greater input detail requirement, since each individual instance of a BFC is required to be considered. Requiring only the identification of the typical profiles of BFC occurrences and their associated complexity would reduce the input detail burden of the functional sizing approach. The development of our adaptations of the simplified NESMA approach therefore had the following requirements:

Requirement 1 Remove the dependency on weightings to be applied to the identified Data Functions

Requirement 2 Provide FP output sizes for each BFC type

Requirement 3 Identify the typical profile of occurrences of Transaction Functions, including their complexity, to be associated with the Data Functions

Requirement 4 Allow the complexity of the Data Functions, and the identified profile of Transaction Functions, to be assigned at a finer level

The provision of the complete functional profile enables an assessment of the adaptations to be made against the Full NESMA method. The adaptations are required to facilitate the accommodation of increasing levels of input requirements detail, with an assessment of the estimation performance made at distinct levels.


4.3 Research Stage 2: development of simplified sizing adaptations

In adapting the simplified NESMA approach, it was necessary to maintain a degree of consistency with the application of the standardised functional sizing methods to ensure compatibility, thus achieving the ability to compare and, if necessary, convert the resulting counts into other equivalent functional sizing measures. The main fundamental requirement of the approach adopted in this study is the identification of the ILF and EIF components. Consistency is maintained by requiring that they should be identified according to the existing NESMA guidelines. The main problems identified with the Indicative NESMA method, in the previous section, were the relative inaccuracy compared to more detailed methods, the limited transparency of the method and the reduced information produced by the method. These problems limited the value of the method to the provision of a rough indication of a project's functional size. The extent of the relevance of the method is consequently limited to the earliest stages of the software lifecycle, where requirements documentation is least detailed.

4.3.1 Simplified sizing adaptation levels

This study evaluates the effect on simplified size estimation performance of incorporating an increasingly detailed assessment of the system requirements. Figure 3 provides an overview of the three levels of simplified sizing adaptation which were developed.

Fig. 3 Overview of simplified sizing adaptation levels


The Level I sizing is concerned with using the fundamental assumptions of the Indicative NESMA method to provide the default functional profile of a project (Requirements 1 and 2). Level II sizing is concerned with the coarse-grained refinement of this default profile to obtain the typical profile for the specific project to be sized (Requirement 3). At this level, the transaction functionality is associated with each of the Data Functions equally. Level III sizing involves the fine-grained refinement of the functional profile, enabling the required functionality to be assigned more selectively across relative proportions of the Data Functions (Requirement 4).

To obtain the initial level of adaptation (Level I), we utilised the fundamental assumptions which the Indicative NESMA method is based upon, and explicitly applied them to the Data Functions that have been identified rather than incorporating them into a weighted calculation, as shown in Fig. 4. In this initial level of adaptation, the weightings used, depending upon whether or not the data model is normalised, are replaced with the initial assumed transaction functions. By default, as normalised data models will be used, each ILF has five associated Transaction Functions (3 EI, 1 EO and 1 EQ) and each EIF has two associated Transaction Functions (1 EO and 1 EQ).

The subsequent refined levels of adaptation (Levels II and III) incorporate aspects of NESMA functional sizing beyond those involved in the Indicative NESMA method. Figure 5 illustrates how this approach may be related to the different NESMA methods in terms of the sizing activities performed. In contrast to the Full NESMA method, in which each of the Transaction Functions is identified, the refined levels of adaptation assess the overall profile of these Transaction Functions. The assessment of this profile is concerned with the average transaction functionality associated with each Data Function. The classification of Level II and Level III sizing enables an assessment to be made of the limits of size estimation performance when the same average transaction functionality is assigned equally across each Data Function.

4.3.2 Simplified sizing adaptations

The simplified sizing adaptations require the functional profile of a project to be obtained. The profile of the Transaction Functions, rather than each individual function, must therefore be represented. At Level I, each of the Data Functions has specific associated Transaction Functions, as illustrated in Fig. 4, based on the Indicative NESMA method assumptions. For each ILF, there are three EI functions (Add, Amend and Delete), one EO function and one EQ function. For each EIF, there is one EO function and one EQ function. Each instance of a transaction function type therefore represents a Transaction Class, which will be applied to the Data Functions. This would therefore correspond to seven initial Transaction Classes for the initial size estimate, as shown in Fig. 6.

Fig. 4 Initial adaptation (Level I) from indicative NESMA method

For the purposes of this study, a Transaction Class is described by the following four dimensions:

1. Transaction Function Type [EI or EO or EQ]
2. Transaction Level [ILF/EIF or RET]
3. Transaction Complexity [Low or Average or High]
4. Transaction Proportion (of Transaction Level to which it applies)
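The four dimensions can be captured in a small data structure. The Python sketch below is illustrative only: the class and field names are hypothetical, and the default values (average complexity, proportion of 1) follow the Estimated and Indicative NESMA assumptions described in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransactionClass:
    function_type: str        # "EI", "EO" or "EQ"
    level: str                # "ILF", "EIF" or "RET"
    complexity: str           # "low", "average" or "high"
    proportion: float = 1.0   # share of the level's components it applies to


def initial_transaction_classes():
    """Seven default classes: five per ILF (3 EI, 1 EO, 1 EQ), two per EIF."""
    ilf_classes = [
        TransactionClass(ftype, "ILF", "average")
        for ftype in ("EI", "EI", "EI", "EO", "EQ")
    ]
    eif_classes = [
        TransactionClass(ftype, "EIF", "average") for ftype in ("EO", "EQ")
    ]
    return ilf_classes + eif_classes


classes = initial_transaction_classes()
for number, tc in enumerate(classes, start=1):
    print(number, tc)
```

Making the class immutable reflects the way refinement is described later in the text: an existing Transaction Class is replaced or split into new classes and not edited in place.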

The initial default values for each Transaction Class parameter, illustrated in Fig. 6, are derived from the assumptions of the simplified NESMA methods. The Indicative NESMA method assumptions determine the initial Transaction Function Type, Transaction Level and Transaction Proportion. The Estimated NESMA method assumptions determine the

Fig. 6 Example of initial transaction classes

Fig. 5 Scope of adaptation levels


Transaction Classes may be applied more selectively at a specific proportion of the Data Functions.

The specific guidelines for each level of adaptation are as follows. The following steps are involved in obtaining the initial functional size (Level I) for each of the sample projects:

1. The first step is the identification of the required ILF and EIF for the project being analysed, including the individual RETs of which they are comprised. This process is identical to that of the existing Indicative NESMA method.

2. The second step is to apply the initial default Transaction Classes to the raw counts of ILF and EIF to produce corresponding raw counts for EI, EO and EQ.

3. The third step involves applying the assumptions of the Estimated NESMA method to the raw counts obtained in the previous step, i.e. ILF and EIF have low complexity, while EI, EO and EQ have average complexity.

The second and third steps can essentially be automated, as they apply predefined assumptions to the results obtained from the first step.
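A minimal Python sketch of how the second and third steps might be automated is given below. The function name is hypothetical, and the sketch deliberately ignores the FPA table special case, so it only covers projects in which every Data Function is treated as normal.

```python
# Default complexity per BFC type under the Estimated NESMA assumptions.
DEFAULT_COMPLEXITY = {
    "ILF": "low",
    "EIF": "low",
    "EI": "average",
    "EO": "average",
    "EQ": "average",
}


def level_one_profile(ilf_count, eif_count):
    """Derive the Level I profile from the counted Data Functions.

    Step 2: apply the default Transaction Classes (3 EI, 1 EO, 1 EQ per
    ILF; 1 EO, 1 EQ per EIF) to obtain raw transaction counts.
    Step 3: attach the default complexity to every BFC type.
    """
    raw_counts = {
        "ILF": ilf_count,
        "EIF": eif_count,
        "EI": 3 * ilf_count,
        "EO": ilf_count + eif_count,
        "EQ": ilf_count + eif_count,
    }
    return {
        bfc_type: (count, DEFAULT_COMPLEXITY[bfc_type])
        for bfc_type, count in raw_counts.items()
    }


print(level_one_profile(20, 2))
```

Only the two Data Function counts are needed as input, which is why the effort for Level I is the same as for identifying the Data Functions alone.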

Producing the next set of adaptations (Level II) involves assessing the general profile of Transaction Functions associated with the ILF and EIF according to the Transaction Level and Complexity for the Transaction Classes. For our study, this took the form of the following steps:

1. In the first step, it was verified whether the initial Transaction Classes are typically required for most ILF and EIF.

Fig. 7 Simplified sizing adaptation levels


2. It was then determined whether any additional Transaction Classes are found to be required for most Data Functions, e.g. significant reporting requirements may require an additional Transaction Class of type EO.

3. Through consideration of a detailed data model, listing DETs for each RET, it was determined whether the assumption of low complexity for the Data Functions is generally valid for the project.

4. The complexity of individual Transaction Classes was also assessed by considering the number of RETs and DETs that are generally involved in such functions.

In order to minimise the effort required in these Level II steps, the assessment of complexity did not involve any detailed counting of DETs. As the complexity of a Transaction Function is allocated according to specified boundaries, each comprised of a range of DETs and a range of RETs, it is only necessary to develop an appreciation of the typical complexity boundary for the Transaction Function associated with each Transaction Class.

The final set of adaptations (Level III) used in the study addressed the proportion of ILFs/EIFs (or RETs) at which each Transaction Class should be applied, incorporating the following steps:

1. An assessment was made of the relative proportion of Data Functions, if any, which did not conform to the assumption of low complexity. If necessary, appropriate proportions of the Data Functions were assigned an alternative complexity of average or high as required.

2. An assessment was made of the relative proportion of Data Functions to which each Transaction Class should be applied. This assessment was made at a broad level, e.g. half, quarter, etc., in order to maintain the desired reduction in estimation effort. This step sometimes results in a current Transaction Class being split into a number of separate Transaction Classes distinguished, for example, by a different complexity.

The second step at Level III enables complexity to be assigned at a finer level than with the existing Estimated NESMA method, e.g. instead of all EOs having a complexity of average, there may be one EO Transaction Class with high complexity and one EO Transaction Class with average complexity. Throughout these steps, it is the general characteristics of a Transaction Class that are being assessed, based on the availability of appropriate detail in the requirements documentation.
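The effect of splitting a Transaction Class can be sketched as follows, using the per-class size formula from Sect. 4.3.3 (components multiplied by proportion multiplied by complexity weighting). The scenario is hypothetical: 20 ILFs whose EO class is split evenly between average and high complexity. The average EO weight of 5 is the value used in the worked example of Sect. 4.3.3; the high EO weight of 7 is an assumed, commonly published value and should be checked against Table 2.

```python
def split_class_sizes(component_count, weight_by_complexity, shares):
    """Size each part of a Transaction Class that has been split.

    `shares` maps a complexity label to the proportion of components to
    which that part of the class applies.
    """
    return {
        complexity: component_count * share * weight_by_complexity[complexity]
        for complexity, share in shares.items()
    }


# EO weights: "average" as used in the worked example, "high" assumed.
eo_weights = {"average": 5, "high": 7}

unsplit = split_class_sizes(20, eo_weights, {"average": 1.0})
split = split_class_sizes(20, eo_weights, {"average": 0.5, "high": 0.5})

print(unsplit, sum(unsplit.values()))
print(split, sum(split.values()))
```

In this hypothetical case the split raises the EO contribution from 100 FP to 120 FP while still requiring only a broad judgement (half and half) and no component-by-component classification.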

This adapted approach corresponds to a simplified and considerably less detailed process of identifying Transaction Functions than that involved in the more detailed NESMA methods. Since this consideration of the Transaction Classes may facilitate amending their associated complexity, these adaptations exceed the Estimated NESMA method in this regard, but the broad level at which this is considered ensures that it remains the more simplified approach in terms of the estimation effort required. In practice, the adaptations made to the simplified NESMA approach may be limited by the level of detail contained in the requirements documentation available at the time the size estimate is completed. For example, at the initial stages of a project, there may not be sufficient detail about the DETs present in the Data Functions to facilitate even a broad assessment of the complexity, as required in steps 3 and 4 in Level II. However, assessing the relative proportion of Data Functions affected by each Transaction Class, from step 2 of Level III, may still be feasible at this initial stage of a project.

The increased flexibility offered enables sizing to be adapted to different project profiles, without requiring similarity to typical projects. The increased transparency facilitates comparison with expert-based estimates by providing a clearer indication of the estimated functionality. In practice, the Transaction Functions associated with specific Data Functions would involve more complicated scenarios than catered for by these Transaction Classes. In this regard, the adaptations utilised in this study correspond to refining the average transaction functionality associated with each Data Function.

4.3.3 Example calculation of functional size

To obtain the same output detail as provided by a full function count method, the size of each BFC must be calculated from the Transaction Classes. For this example, assume that after examination of requirements documentation, a project was determined to contain 20 ILFs, comprised of 35 RETs, and 2 EIFs. Figure 6 previously showed the initial Transaction Classes derived automatically for each ILF (1–5) and EIF (6 and 7). In this case, the three EI Transaction Classes correspond to the Add, Amend and Delete functionality, in accordance with the Indicative NESMA method assumptions. The initial total functional count, corresponding to Level I, for the project is calculated using the NESMA complexity weighting values from Table 2.

The functional size of each Transaction Class is calculated using the following formula:

Number of FP = Number of Transaction Level Components × Proportion × Complexity Weighting of Function Type

For Transaction Classes which are applied across all their respective Transaction Level components, the Proportion variable takes the value of 1. The functional size for each Transaction Class would therefore be as follows:

Transaction Class 1 = 20 × 1 × 4 = 80 FP
Transaction Class 2 = 20 × 1 × 4 = 80 FP
Transaction Class 3 = 20 × 1 × 4 = 80 FP
Transaction Class 4 = 20 × 1 × 5 = 100 FP
Transaction Class 5 = 20 × 1 × 4 = 80 FP
Transaction Class 6 = 2 × 1 × 5 = 10 FP
Transaction Class 7 = 2 × 1 × 4 = 8 FP

The ILF and EIF totals are calculated according to the standard NESMA guidelines of multiplying the number counted by the assumed low complexity weighting. The functional count for the remaining BFC types is calculated by adding the size of each of the Transaction Classes for that transaction type. For this example, the size for each BFC type would therefore be as follows:

ILF = 20 × 7 = 140 FP
EIF = 2 × 7 = 14 FP
EI = 80 + 80 + 80 = 240 FP (Transaction Classes 1; 2; 3)
EO = 100 + 10 = 110 FP (Transaction Classes 4; 6)
EQ = 80 + 8 = 88 FP (Transaction Classes 5; 7)
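The Level I figures above can be reproduced with a short Python sketch. All weights in the sketch are copied from the worked example itself (7 for both ILF and EIF, 4 for EI and EQ, 5 for EO) and are not independently derived from Table 2; the variable names are hypothetical.

```python
# Weights as they appear in the worked example (an assumption of this
# sketch; Table 2 of the paper is the authoritative source).
DATA_WEIGHT = {"ILF": 7, "EIF": 7}
AVERAGE_WEIGHT = {"EI": 4, "EO": 5, "EQ": 4}
COUNTS = {"ILF": 20, "EIF": 2}

# (function type, transaction level, proportion) for Transaction Classes 1-7.
CLASSES = [
    ("EI", "ILF", 1),
    ("EI", "ILF", 1),
    ("EI", "ILF", 1),
    ("EO", "ILF", 1),
    ("EQ", "ILF", 1),
    ("EO", "EIF", 1),
    ("EQ", "EIF", 1),
]

# Number of FP = components at the level x proportion x complexity weighting.
class_fp = [
    COUNTS[level] * proportion * AVERAGE_WEIGHT[ftype]
    for ftype, level, proportion in CLASSES
]

# Data Functions are sized directly; transactions are summed per type.
profile = {bfc: COUNTS[bfc] * DATA_WEIGHT[bfc] for bfc in COUNTS}
for (ftype, _, _), fp in zip(CLASSES, class_fp):
    profile[ftype] = profile.get(ftype, 0) + fp

print(class_fp)               # [80, 80, 80, 100, 80, 10, 8]
print(profile)                # {'ILF': 140, 'EIF': 14, 'EI': 240, 'EO': 110, 'EQ': 88}
print(sum(profile.values()))  # 592
```

Because each Transaction Class carries its own proportion and weighting, refining the estimate at Levels II and III only requires editing the `CLASSES` list; the rest of the calculation is unchanged.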

The total functional count for the project would therefore be the sum of these values:


ILF = 20 × 7 = 140 FP
EIF = 2 × 7 = 14 FP
EI = 140 + 210 = 350 FP (Transaction Classes 1; 2)
EO = 100 + 140 + 10 = 250 FP (Transaction Classes 3; 4; 6)
EQ = 40 + 8 = 48 FP (Transaction Classes 5; 7)

The total functional count for the project would now be:

Total Functional Count = 140 + 14 + 350 + 250 + 48 = 802 FP

4.4 Research Stage 3: testing of simplified sizing adaptations

Stage 3 of our research comprised the testing of our simplified sizing adaptations on the complete dataset (Projects A to K) of commercial projects provided by Equiniti-ICS.

4.4.1 Size estimation measurements

Stage 2 of our research used the dataset (Projects A to E) from our previous study (Wilkie et al. 2011). The size estimates for the NESMA methods had therefore been developed in the previous study. The effort required to use the NESMA methods was recorded as the size estimates were completed. The simplified sizing adaptations were developed, and refined, for this current paper and applied to Projects A to E. Due to the process of refining the levels of adaptation, the recorded effort for development of these size estimates using our adaptations on Projects A to E represents judged effort. The effort to complete Level I sizing is identical to that for the Indicative NESMA method, as it automatically derives the EI, EO and EQ values from the output of this method. The effort value for Level I sizing will therefore automatically be the same as that for the Indicative NESMA method for each project.

For the remaining projects in the dataset (Projects F to K), all size estimates were developed during this current study. The levels of adaptation, as used on Projects A to E, were also used to develop size estimates for Projects F to K. In this case, the actual effort required to develop each size estimate was recorded for each project, by updating an effort timesheet at the end of each estimation session. Each of the NESMA methods (Indicative, Estimated and Full) was also used to develop size estimates for these same projects, and the estimation effort was recorded in the same manner.

Fig. 8 Example of refined transaction classes


The previous NESMA method size estimates and all of the size estimates in Research Stage 3 were completed by the same member of the research team. The use of different estimation methods in this way on the same project documentation introduces the possibility of biasing the estimation effort figures recorded. The estimator may be in effect learning about the project with one method, which enables the subsequent method to be completed more quickly. For the initial projects (A to E), as the estimation effort figures for our adaptations were judged, the issue of bias is offset as the learning effect does not impact the judgement. In order to counter this learning effect for the remaining projects (F to K), the order in which the different estimation methods were used was designed to minimise a learning advantage for any one method as follows:

1. The Indicative NESMA method and Level I of our adaptations are identical in terms of only requiring the Data Functions to be identified from the requirements documentation. This task therefore only needs be completed once to provide the results for both approaches. The identification of these Data Functions was therefore completed first for each of the projects.

2. The remaining NESMA methods (Estimated and Full) and the remaining levels of our adaptations (II and III) both require consideration of the Transaction Functions required in a project. The estimation activities differ in that the NESMA methods require each individual Transaction Function to be identified, while our adaptations only require the general profile of the Transaction Functions to be identified. The order of completion of these methods was varied evenly across the projects in order to ensure that each method was given an equal opportunity to benefit from previously acquired knowledge about a project. Any bias would therefore be cancelled out across the projects, enabling the average estimation effort for each method to be compared more reliably.
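One simple way to vary the order of completion evenly is to alternate, from one project to the next, which family of methods is applied first. The sketch below only illustrates that idea: the exact ordering scheme used in the study is not stated here, and the function name `counterbalanced_order` and the alternation rule are assumptions made for illustration.

```python
from collections import Counter

# Projects sized in this stage of the study
PROJECTS = ["F", "G", "H", "I", "J", "K"]

# The two families of methods that both consider Transaction Functions
NESMA = ("Estimated NESMA", "Full NESMA")
ADAPTED = ("Level II", "Level III")


def counterbalanced_order(projects):
    """Alternate which family goes first (assumed scheme, not the paper's)."""
    schedule = {}
    for index, project in enumerate(projects):
        if index % 2 == 0:
            first, second = NESMA, ADAPTED
        else:
            first, second = ADAPTED, NESMA
        schedule[project] = list(first + second)
    return schedule


schedule = counterbalanced_order(PROJECTS)

# Count how often each family is applied first across the projects
first_counts = Counter(
    "NESMA" if order[0] in NESMA else "Adapted" for order in schedule.values()
)

for project, order in schedule.items():
    print(project, "->", ", ".join(order))
print(dict(first_counts))
```

With six projects, each family is applied first on three of them, so neither family systematically benefits from knowledge gained while applying the other.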

In practice, performing functional sizing may require clarification from the system users in order to verify understanding of the requirements documentation. The software sizing in this study involved the use of documentation from completed projects, rendering clarification with the customer as beyond the scope of this research. The estimation effort incurred with each method was therefore not inclusive of this aspect of functional sizing, and the figures reported in Tables 7 and 8 represent the time taken to analyse the requirements documentation and determine the functional size according to each specific method.
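The effort figures in Table 8 are reported together with a percentage reduction relative to the Full NESMA method. A minimal sketch of that calculation follows, using the Project F hours from Table 8 and assuming the reduction is computed as (Full - method) / Full; the function name `reduction_vs_full` is hypothetical.

```python
def reduction_vs_full(method_hours, full_hours):
    """Percentage reduction in effort relative to the Full NESMA effort."""
    return (full_hours - method_hours) / full_hours * 100


# Effort (h) for Project F as reported in Table 8
project_f = {
    "Indicative NESMA": 1.0,
    "Level I": 1.0,
    "Level II": 1.25,
    "Level III": 1.5,
    "Estimated NESMA": 2.5,
    "Full NESMA": 4.5,
}

full = project_f["Full NESMA"]
reductions = {
    method: round(reduction_vs_full(hours, full), 2)
    for method, hours in project_f.items()
}

for method, pct in reductions.items():
    print(f"{method}: {pct:.2f} %")
```

Under this assumption the rounded values agree with the Project F row of Table 8 (for example, 77.78 % for the Indicative NESMA method and 44.44 % for the Estimated NESMA method).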

5 Empirical study results

Table 6 shows the number of Transaction Functions derived using the highest level of adaptations (Level III) and the associated percentage difference when they are compared with the counted number obtained using the Full NESMA method. Tables 7 and 8 show the effort required in using each of the variants of the NESMA standard to complete sizing for each of the commercial projects studied, together with comparable data from the three levels of our own adaptations. Due to the difference in how the effort figures for our adaptations were produced for the initial projects (A to E) and the remaining projects (F to K), the average percentage reductions in effort achieved by each method, compared to the Full NESMA method, are only provided separately for each subset of projects. Table 9 shows the average relative accuracy of the overall functional sizes and project BFC profiles obtained from the NESMA (Indicative and Estimated)


Table 8 Comparable effort performance for the size estimation methods (Projects F to K)

Effort (h) to complete (% reduction relative to Full NESMA)

Project                                     Indicative NESMA  Level I           Level II          Level III         Estimated NESMA   Full NESMA
Project F                                   1 (77.78 %)       1 (77.78 %)       1.25 (72.22 %)    1.5 (66.67 %)     2.5 (44.44 %)     4.5 (0 %)
Project G                                   1 (92.86 %)       1 (92.86 %)       1.5 (89.29 %)     2 (85.71 %)       5.5 (60.71 %)     14 (0 %)
Project H                                   1.5 (86.96 %)     1.5 (86.96 %)     2 (82.61 %)       2.5 (78.26 %)     5.5 (52.17 %)     11.5 (0 %)
Project I                                   1.5 (80.00 %)     1.5 (80.00 %)     1.75 (76.67 %)    2 (73.33 %)       3.5 (53.33 %)     7.5 (0 %)
Project J                                   4 (87.30 %)       4 (87.30 %)       5.5 (82.54 %)     6.5 (79.37 %)     15 (52.38 %)      31.5 (0 %)
Project K                                   5 (87.18 %)       5 (87.18 %)       6 (84.62 %)       7 (82.05 %)       16 (58.97 %)      39 (0 %)
Average effort (h)                          1.8               1.8               2.4               2.9               6.4               13.8
Average % reduction relative to Full NESMA  85.35 %           85.35 %           81.32 %           77.57 %           53.67 %
Mean difference (h) with Full NESMA         -12.0             -12.0             -11.4             -10.9             -7.4
p value (versus Full NESMA)                 0.026             0.026             0.027             0.028             0.030
Mean difference (h) with Estimated NESMA    -4.6              -4.6              -4.0              -3.5
p value (versus Estimated NESMA)            0.023             0.023             0.024             0.027


methods and our adaptations, compared to the results from the Full NESMA method. The functional sizing results for each individual project are provided in the Appendix. For each result table, where appropriate, the relative performance of the estimation methods used has been compared using paired sample t tests (2-tail). The mean difference and the p value are reported for each comparison made. This test determines whether the mean difference for each compared paired sample differs significantly from zero. For

Table 9 Summary of relative sizing inaccuracies of each sizing method compared to full NESMA method (Overall Dataset – Projects A to K)

Estimation type       ILF        EIF        EI         EO         EQ         Overall

Percentage inaccuracy relative to full NESMA method
Indicative NESMA      -          -          -          -          -          26.30 %
Level I               10.43 %    0.00 %     28.03 %    54.29 %    209.19 %   22.19 %
Level II              10.43 %    0.00 %     14.50 %    11.29 %    42.30 %    4.86 %
Level III             3.71 %     0.00 %     4.33 %     4.91 %     14.66 %    2.38 %
Estimated NESMA       10.43 %    0.00 %     6.78 %     5.43 %     11.67 %    4.50 %

Level I–Level II
Mean difference (%)   -          -          13.53 %    43.00 %    166.89 %   17.33 %
p value (2 tail)      -          -          0.066      < 0.001    0.141      0.001

Level I–Level III
Mean difference (%)   6.72 %     -          23.70 %    49.38 %    194.53 %   19.81 %
p value (2 tail)      0.001      -          0.003      < 0.001    0.089      < 0.001

Level II–Level III
Mean difference (%)   6.72 %     -          10.17 %    6.38 %     27.64 %    2.48 %
p value (2 tail)      0.001      -          0.002      0.030      0.014      0.004

Level I–Indicative NESMA
Mean difference (%)   -          -          -          -          -          -4.11 %
p value (2 tail)      -          -          -          -          -          0.104

Level I–Estimated NESMA
Mean difference (%)   -          -          21.25 %    48.86 %    197.52 %   17.69 %
p value (2 tail)      -          -          0.009      < 0.001    0.085      0.001

Level II–Indicative NESMA
Mean difference (%)   -          -          -          -          -          -21.44 %
p value (2 tail)      -          -          -          -          -          0.001

Level II–Estimated NESMA
Mean difference (%)   -          -          7.72 %     5.86 %     30.63 %    0.36 %
p value (2 tail)      -          -          0.031      0.089      0.013      0.688

Level III–Indicative NESMA
Mean difference (%)   -          -          -          -          -          -23.92 %
p value (2 tail)      -          -          -          -          -          < 0.001

Level III–Estimated NESMA
Mean difference (%)   -6.72 %    -          -2.45 %    -0.52 %    2.99 %     -2.12 %
p value (2 tail)      < 0.001    -          0.201      0.733      0.363      0.010


p values less than 0.05, the difference can be considered to be statistically significant (very highly significant for p < 0.01), i.e. one method has significantly outperformed the other.
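A two-tailed paired sample t test of this kind can be sketched with the Python standard library alone. The example below applies it to the Indicative NESMA and Full NESMA effort columns of Table 8. The function names are hypothetical, and the p value is approximated by numerically integrating the Student t density with Simpson's rule; this is a choice made so that the sketch needs no statistics package, and is not necessarily how the reported values were computed.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=20000):
    """Two-tailed p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return 1 - 2 * area           # mass in both tails


def paired_t_test(sample_a, sample_b):
    """Return (mean difference, t statistic, two-tailed p value)."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(n)
    t_stat = mean_diff / std_err
    return mean_diff, t_stat, two_tailed_p(t_stat, n - 1)


# Effort (h) for Projects F to K, taken from Table 8
indicative = [1, 1, 1.5, 1.5, 4, 5]
full = [4.5, 14, 11.5, 7.5, 31.5, 39]

mean_diff, t_stat, p_value = paired_t_test(indicative, full)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

For this pair of columns the sketch gives a p value of roughly 0.026, below the 0.05 threshold described above.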

Consideration of the number of derived Transaction Functions produced by our adaptations provides an indication of how closely the actual number of Transaction Functions, as identified using the NESMA approach, has been approximated (Sect. 5.1). In order to assess the implications of our sizing results in relation to our research aim, it is necessary to consider the estimation effort incurred (Sect. 5.2), the accuracy of the overall functional sizes estimated (Sect. 5.3) and the accuracy of the profile of the BFCs (Sect. 5.4).

5.1 Number of derived transaction functions

Each of the Transaction Classes identified during the sizing is applied to a specific number of Data Functions, thus enabling a specific number of Transaction Functions to be derived from these Transaction Classes. This enables the number of derived Transaction Functions to be compared with the number of actual Transaction Functions identified using the Estimated (or Full) NESMA method as a means of validating our results. The earlier results from Table 5, which showed the percentage difference for the expected number of Transaction Functions, may be considered to be the same as the Level I sizing used in the study. The results presented in Table 6 show the derived Transaction Functions for the Level III sizing, thus enabling the improvements made through increasing the detail to be assessed. The Level I average differences for the overall dataset are included for comparison.
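The derivation and comparison described above can be sketched as follows. All figures are hypothetical and chosen only for illustration, not taken from the study's projects. The sketch also assumes that each Transaction Class yields one Transaction Function per Data Function it is applied to, and that the percentage difference is computed as (derived - actual) / actual; the function names are likewise illustrative.

```python
from collections import Counter

# Hypothetical Transaction Classes:
# (Transaction Function type, number of Data Functions the class applies to)
transaction_classes = [
    ("EI", 8),
    ("EI", 6),
    ("EO", 5),
    ("EQ", 8),
]

# Hypothetical actual counts from an Estimated (or Full) NESMA count
actual_counts = {"EI": 15, "EO": 4, "EQ": 10}


def derive_transaction_functions(classes):
    """Sum, per type, the Data Functions each Transaction Class applies to."""
    derived = Counter()
    for tf_type, n_data_functions in classes:
        derived[tf_type] += n_data_functions
    return derived


def percentage_difference(derived, actual):
    """Signed difference of the derived count relative to the actual count."""
    return (derived - actual) / actual * 100


derived_counts = derive_transaction_functions(transaction_classes)
differences = {
    tf_type: percentage_difference(derived_counts[tf_type], actual)
    for tf_type, actual in actual_counts.items()
}

for tf_type, diff in differences.items():
    print(f"{tf_type}: derived {derived_counts[tf_type]}, "
          f"actual {actual_counts[tf_type]}, difference {diff:+.2f} %")
```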

Each of the types of Transaction Function demonstrated overall improvements, in terms of deriving the number of functions, from Level I to Level III sizing. In terms of the average improvement between these levels, the largest was found with the EQ Transaction Functions, where the average percentage difference for the overall dataset has decreased from 268.40 to 13.67 %. As this component, for some projects, demonstrated the most extreme initial difference, it is intuitive that suitable consideration of the requirements documentation can provide a more accurate indication of the number of EQ Transaction Functions required in those projects. The improvement for this component across the complete dataset was, however, not found to be statistically significant. The dataset demonstrated a slightly lower relative improvement in the average percentage difference for EO Transaction Functions, and the lowest relative improvement was found in the EI Transaction Functions. For both of these components, however, the improvements were found to be statistically significant, reflecting a more uniform improvement across the dataset. In general, the extent to which our adapted approach may approximate the actual values, while maintaining the savings in estimation effort, is more significant for the EO and EI Transaction Functions than for the EQ Transaction Functions.

5.2 Performance of adaptations: estimation effort

The effort figures presented in Table 7, reported previously (Wilkie et al. 2011), for the three variants of NESMA were recorded as the sizing methods were performed. As a result of the refinement of the adaptations made to the NESMA approach as it was performed on the first five sample projects, the effort figures reported for these projects (A to E), with the exception of Level I, reflect a judgement of the time taken to complete the sizing at each level of adaptation on each project. The effort figures in Table 8 for the remaining


the existing NESMA methods, our ability to accommodate additional input detail has demonstrated the potential for extending the applicability of simplified sizing methods to the provision of a project profile comparable to that produced by a full function count. When size estimation must be conducted at an earlier lifecycle stage, the extent to which a size estimate may be refined in this manner is limited by the level of detail available in the requirements documentation at that stage.

In practice, the value of functional sizing extends beyond the scope of software estimation. The sizing process in itself serves as a validation of the requirements documentation, providing an indication of the understanding of the system and identifying potential problems at an earlier stage of the lifecycle. Functional sizing can therefore provide a valuable contribution to Software Quality. The use of the full NESMA method provides the most rigorous assessment of the requirements documentation, therefore providing the most significant contribution to preventing subsequent software defects. Simplified methods which only involve the identification of a subset of the BFCs are limited in this regard. In contrast, our adapted approach provides a middle ground by identifying the general profile of the requested functionality in the form of Transaction Classes. It therefore enables a degree of broader validation of the requirements documentation by identifying where specific types of functionality, represented by a Transaction Class, are present and consequently where expected functionality may be missing or inadequately specified.

7 Conclusions

Supporting, rather than replacing, the existing estimation methods utilised in organisations may represent a more realistic goal for model-based sizing estimation. The use of simplified functional sizing methods has predominantly been focused on either addressing the unavailability of the necessary requirements detail early in the lifecycle, or on providing a low effort indication of the system size. Deriving an overall size from a subset of the BFCs requires significant groundwork to establish reliable correlations for a specific organisation and provides limited transparency an
