Automated Natural Language Analysis of Requirements

Bob Ferguson, SEI
Giuseppe Lami, CNR

Sponsored by the U.S. Department of Defense
© 2005 by Carnegie Mellon University
Pittsburgh, PA 15213-3890


Problem Statement

Requirements and specifications are most often written in natural language. Inspections are an effective tool for defect identification and correction, but --

• Inspections are time consuming and costly.

• Inspections have been shown to identify 25-80% of defects, with a median value of 57% (www.goldpractices.com/practices/fi/index.php).

Can we improve the quality of requirements and specifications and the efficiency of the process by using an automated tool?


Why Natural Language?

It is the most natural thing for us to do.

• Generally understood by all parties (engineer, manager, user, sponsor,…)

Formal languages have been tried but have been successful in only very limited domains.

Use cases and scenarios are incomplete and difficult to partition.

• For example, an industry standard may include a significant number of sentences.

• The engineering staff may need several levels of detail.


What past efforts exist?

NASA (http://satc.gsfc.nasa.gov/metrics/index.html)
• ARM: Automated Requirements Analysis tool

Tom Gilb, Competitive Engineering, 2005
• Describes “Planguage” (www.gilb.com)

Lightfoot, David, Formal Specification Using Z, Palgrave, 1998

TIGER: INCOSE (www.seecforum.unisa.edu.au/SEECTools.html#tiger)

Formalize the language or formalize the analysis?


Automation Assistance

Document Analysis Tools
• Reduce cycle time and effort while producing better results than possible with tedious manual review.
• Early detection and correction of simple but often costly errors allows analysts to focus on more difficult problems.

[Diagram: requirements documents and evaluation criteria feed the inspection or review process, which presents a possibility for automation.]


QuARS

Quality Analysis for Requirements and Specifications

Automated tool that takes natural language input

Goals:
1. Reduce the potential of product defects resulting from problems in language usage.
2. Ease the editing and inspection burden on staff.
3. Facilitate the identification and analysis of similar and conflicting requirements.

Each of these goals contributes to improving the quality of requirements and specifications by facilitating an improved process for defect identification.


Initial Tests

Two companies participated in the initial tests.

Method, Company A:
• Multiple versions of requirements documents that had previously been inspected were put through the QuARS tool.
• Any additional defects were analyzed.
• Defects were traced to find the first occurrence of each defect.

Method, Company B:
• A single requirements document was analyzed and compared to inspection results.


Company A Results – 1

A manufacturing concern utilizes an external, independent consultant for inspection.

Input: 2,396 statements

QuARS results:
• 692 identified defects
• 6 hours effort
• 3 days cycle time

Consultant (human) results:
• 279 identified defects
• 10 business days
• ~$6,000


Company A Results – 2

20% of the sentences (484) were defective at the start of the inspection process. Some sentences had multiple defects.

QuARS processing time was approximately 6 hours
• Includes learning curve
• Includes removing false positives
• Rate equals 799 statements per hour

QuARS identified all non-graphical requirements defects that the human inspector identified.


Company B Results

[Diagram: a requirements document of 574 requirement statements goes through the inspection process; QuARS then analyzes the updated requirements document and reports its results.]

Insurance company that uses a formal inspection process.

• 110 possible defects identified
• 94 confirmed defects, 16 false positives
• 44 separately identifiable sentences
• 8% of statements were defective post-inspection


Benefit of Tool Usage

[Diagram comparing the same inspection flow under three conditions:
• Manual only: consultant inspection, 279 identified defects, 10 business days, ~$6,000.
• Tool only: QuARS, 692 identified defects, 6 hours effort, 3 days cycle time.
• After the inspection process: QuARS still identified 110 possible defects in the updated requirements document.]


Summary of Initial Results

QuARS shows significant promise for:

• Improving the quality of requirements documents

• Reducing cycle time to identify and remove requirements defects.

• Reducing cost and improving the efficiency and effectiveness of inspections.

• Processing other text-based documents, such as test cases and all types of specifications.


Basic Operation of QuARS

• Lexical analysis searches for words and phrases that are potentially defective.
• Syntactical analysis uses sentence structure to search for additional problems.
• Readability analysis.
• A user-defined lexicon can be used to cluster and count requirements of a particular type (e.g., security).
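To make the lexical step concrete, here is a minimal sketch of a lexicon-driven check over a requirements text. It is not QuARS code; the indicator lists, sentence splitting, and report format are illustrative assumptions.

```python
import re

# Illustrative indicator lexicons (assumptions, not the actual QuARS dictionaries).
LEXICONS = {
    "vague": {"clear", "easy", "useful", "adequate", "good", "bad"},
    "optional": {"possibly", "eventually", "if possible", "if needed"},
    "weak": {"may", "should", "can", "could"},
}

def split_sentences(text):
    """Very rough sentence splitter for plain-text requirements."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def lexical_analysis(text):
    """Flag sentences containing indicator terms and compute a defect rate."""
    sentences = split_sentences(text)
    findings = []
    for number, sentence in enumerate(sentences, start=1):
        lowered = sentence.lower()
        for category, terms in LEXICONS.items():
            hits = [t for t in terms if re.search(rf"\b{re.escape(t)}\b", lowered)]
            if hits:
                findings.append((number, category, hits, sentence))
    defective = {n for n, _, _, _ in findings}
    rate = len(defective) / len(sentences) if sentences else 0.0
    return findings, rate

if __name__ == "__main__":
    sample = "The report should be easy to read. The system logs every access."
    results, rate = lexical_analysis(sample)
    for number, category, hits, sentence in results:
        print(f"Sentence {number} is possibly defective ({category}: {', '.join(hits)})")
    print(f"Defect rate: {rate:.0%}")
```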


Definitions

Lexicon: n., dictionary-type listing of words and phrases for a selected purpose.

Lexical: adj., relating to a lexicon.

Syntax: n., the structure of a sentence according to the rules of grammar.

Syntactical: adj., relating to the syntax or grammar of a language.

Semantic: adj., relating to the meaning of text, often based on context and domain. (QuARS does not help with semantic analysis).


QuARS Quality Model

Lexical analysis identifies words and phrases that are:

• Vague

• Subjective

• Implying choice or option

• Readability (Coleman-Liau)

Syntactical analysis identifies

• Weak phrases or verbs

• Multiplicity

• Implicit expressions

• Under-specification
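As a worked example of the Coleman-Liau readability measure listed under lexical analysis above, the sketch below computes the standard index from letter and sentence counts per 100 words; the tokenization and sample text are assumptions, not QuARS internals.

```python
import re

def coleman_liau_index(text):
    """Coleman-Liau index: 0.0588*L - 0.296*S - 15.8, where L is the average
    number of letters per 100 words and S the average number of sentences
    per 100 words."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    letters = sum(len(w) for w in words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    L = letters / len(words) * 100
    S = sentences / len(words) * 100
    return 0.0588 * L - 0.296 * S - 15.8

if __name__ == "__main__":
    requirement = ("The system shall authenticate every user before granting "
                   "access to payroll records.")
    print(f"Coleman-Liau index: {coleman_liau_index(requirement):.1f}")
```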


Analysis Types – 1

Ambiguity: Will the reader have a unique interpretation?

• Vague words and phrases

- clear, easy, useful, adequate, good, bad, etc.

• Subjective

- Similar, having in mind, taking into account, as fast as possible.

- Subjective usage often relies on unstated context. Without detailed familiarity with the organization, the business, and the individual’s motives, such usage has a high probability of being misinterpreted.


Analysis Types – 2

Ambiguity (cont.)
• Implying a choice or option
- Possibly, eventually, if possible, if needed
- How will this choice be determined?
- Will requirements change? Will we need a choice function?

Implicit expressions
• Demonstrative adjectives such as “this” or “that”.
- “This report must have column totals for all dollar amounts.”
• Also words like “above,” “below,” or “next”.
- If the sentence is somehow separated from the antecedent, the sentence will be impossible to understand.


Analysis Types – 3

Weakness

• May, should, can, could

• This is a form of under-specification. It suggests a need for a choice function or future requirement (e.g. TBA).

Under-specification

• Many words require a second noun to make the usage specific.

- For example, “report,” “flow,” “access,” “function”

• These are clearer when qualified:

- “payroll-report,” “control-flow,” “write-access,” “check-distribution-function”
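As a rough illustration of how an under-specification check could work, the sketch below flags bare occurrences of commonly under-specified nouns while accepting hyphenated compounds such as “payroll-report”; the word list and rule are assumptions made for illustration, not how QuARS actually decides.

```python
import re

# Hypothetical list of nouns that usually need a qualifier (not the QuARS dictionary).
UNDERSPECIFIED = {"report", "flow", "access", "function", "interface", "data"}

def underspecification_candidates(sentence):
    """Flag bare occurrences of under-specified nouns. A noun is accepted only
    when it is part of a hyphenated compound (e.g. payroll-report); this is a
    crude stand-in for a real check."""
    flagged = []
    for match in re.finditer(r"\b[\w-]+\b", sentence):
        token = match.group(0)
        if token.lower() in UNDERSPECIFIED and "-" not in token:
            flagged.append(token)
    return flagged

if __name__ == "__main__":
    print(underspecification_candidates(
        "The report is produced by the payroll-report function."))
    # Flags 'report' and 'function'; 'payroll-report' is accepted.
```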


Analysis Types – 4

The multiplicity function identifies sentences with more than one subject, verb, or object.

Example:

• The system will generate the “Payroll-report” and “City-tax-report” monthly.

Multiplicity and implicitness are examples of syntactical analysis.
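Detecting multiplicity properly requires syntactic parsing, which is what QuARS does. Purely as a naive stand-in, the sketch below flags a sentence when “and”/“or” joins two quoted or hyphenated noun-like tokens, as in the example above; it is an assumption-laden heuristic, not QuARS's syntactic analysis.

```python
import re

# Naive stand-in for a parser-based multiplicity check: flag a conjunction
# joining two quoted or hyphenated noun-like tokens. Real syntactic analysis
# would also catch multiple subjects and verbs.
NOUNISH = r'("[^"]+"|\w+(?:-\w+)+)'
PATTERN = re.compile(rf"{NOUNISH}\s+(?:and|or)\s+{NOUNISH}", re.IGNORECASE)

def multiplicity_candidates(sentence):
    """Return the conjoined spans that suggest more than one object."""
    return [m.group(0) for m in PATTERN.finditer(sentence)]

if __name__ == "__main__":
    s = 'The system will generate the "Payroll-report" and "City-tax-report" monthly.'
    print(multiplicity_candidates(s))
    # -> ['"Payroll-report" and "City-tax-report"']
```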


QuARS Output – 1

Subjectivity Analysis
The line number: 88
3.4.2. The Base Rate is always set as $0.02 per $100. This factor may be updated by Actuarial depending on market conditions.

is defective because it contains the wording: depending on

Number of evaluated sentences: 197
Number of defective sentences: 9
Defect rate: 4%


QuARS Output – 2

Implicit Analysis
Line number 92
At a later date this field might be calculated based on actuarial criteria or from Risk Link Accumulation output.

contains an implicit sentence: implicit determiner

Number of evaluated sentences: 197
Number of defective sentences: 6
Defect rate: 3%

This particular statement was flagged twice, because the implicit usage is two-fold: “later” and “this”.


Clustering Function

Called “View Analysis”

Construct a View Dictionary of domain-related words.

Security {authorization, password, authentication, authorize, authenticate, secure-access, accessibility}

The “V” function in QuARS counts the total sentences in each section and the number of sentences in the section that include words from the requested lexicon.

While QuARS does not perform semantic analysis, this type of clustering would facilitate analysis performed by humans.
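A minimal sketch of this kind of view counting, assuming the document has already been split into named sections; the section structure and the security lexicon below are illustrative, not QuARS internals.

```python
import re

# Illustrative security view lexicon, following the slide above.
SECURITY_VIEW = {"authorization", "password", "authentication", "authorize",
                 "authenticate", "secure-access", "accessibility"}

def view_counts(sections, lexicon):
    """For each section, count total sentences and the sentences that contain
    at least one term from the view lexicon."""
    report = {}
    for name, text in sections.items():
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
        hits = sum(1 for s in sentences
                   if any(term in s.lower() for term in lexicon))
        report[name] = (len(sentences), hits)
    return report

if __name__ == "__main__":
    sections = {
        "3.1 Login": "The system shall require a password. Sessions expire after 15 minutes.",
        "3.2 Reports": "The payroll-report is generated monthly.",
    }
    for name, (total, hits) in view_counts(sections, SECURITY_VIEW).items():
        print(f"{name}: {hits} of {total} sentences mention security terms")
```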


Creating a Lexicon


“False Positives”

Two types of false positives are most common.

1. QuARS identifies a possible defect; however, the actual business usage is correct.

Example: “commission” is usually “vague.” “Commission” is specific in the insurance industry, since it is defined as compensation for a sale.

2. There are times when implicit usage is “ok,” provided that the risk of the sentences being split into different parts of the document is handled with care.


Removing False Positives

Fix the lexicon
• The dictionaries can be modified, and words added or deleted.
• Hence “commission” can be removed.

Or flag the offending sentence as acceptable and do not show it in the report.
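The two remedies might look something like the sketch below in a tool's configuration: removing a term from the vague lexicon, and keeping a list of findings a reviewer has accepted so they are hidden from the report. The data structures and identifiers are assumptions, not QuARS's actual format.

```python
# Hypothetical configuration, not QuARS's actual lexicon or report format.
vague_lexicon = {"clear", "easy", "adequate", "commission"}
accepted_findings = set()  # identifiers of findings reviewed and accepted

# Remedy 1: fix the lexicon; "commission" is precise in the insurance domain.
vague_lexicon.discard("commission")

# Remedy 2: flag a specific finding as acceptable so it is hidden in reports.
accepted_findings.add("line-88-depending-on")

def visible_findings(findings):
    """Filter out findings the reviewer has already accepted."""
    return [f for f in findings if f["id"] not in accepted_findings]

if __name__ == "__main__":
    findings = [
        {"id": "line-88-depending-on", "text": "...may be updated by Actuarial depending on market conditions."},
        {"id": "line-92-this", "text": "At a later date this field might be calculated..."},
    ]
    for f in visible_findings(findings):
        print(f["id"], f["text"])
```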


Hiding False Positives



Tool Development Needs

Reporting
• Consolidate defects by sentence instead of by defect type.
- Currently, all sentences having “vague words” are listed together.
• Use the requirement identifier to format the report.

Integration
• Improve the user interface
• Process other document types
• Integrate with tools such as DOORS, CaliberRM

Function
• Possible extensions, such as identification of passive voice and indirect objects.
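For the suggested passive-voice extension, one rough heuristic is to look for a form of “to be” followed by a probable past participle; the regular expression below is an assumption-laden sketch that will miss irregular participles and raise false alarms, not QuARS functionality.

```python
import re

# Rough passive-voice heuristic: a form of "to be", an optional adverb, then a
# word ending in -ed or -en. Real detection would need syntactic parsing.
PASSIVE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)

def passive_candidates(sentence):
    """Return substrings that look like passive constructions."""
    return [m.group(0) for m in PASSIVE.finditer(sentence)]

if __name__ == "__main__":
    print(passive_candidates("The report is generated monthly by the system."))
    # -> ['is generated']
```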


Planning: CNR*

Discussing possibilities for near term availability.

Commercialize the tool.

• 3rd party developer/integrator sought

• Discussions with tool vendors

Continued research and development of the engine.

*Consiglio Nazionale delle Ricerche (Italian National Research Council)


Further Research - 1

Test QuARS in a high-maturity organization

• Use orthogonal defect classification for escaped defects

• What % of requirements defects might be avoided?

Such results are particularly useful. Some high-maturity organizations report that >50% of fielded defects can be traced to defective requirements.


Further Research - 2

Test use of QuARS in an Acquisition setting.

• How much can we improve the RFP process?

• Preparation effort, cycle time, rework

• Bidding analysis

• Support for the bidding process in the PMO


Further Research - 3

Benefits of using QuARS as part of the process

• Simulation of secondary effects

- Does testing improve?

• Can human inspectors really detect semantic errors if the mechanical errors are not present?

Inspectors have a couple of “bandwidth” limitations. In any one session there are two limiting factors: the size of the deliverable that can be processed in 2 hours, and the maximum number of defects that can be discussed in 2 hours.


Further Research – 4

Develop suitable experiments for “clustering” function.

• Consistency checking?

- Can we more easily identify conflicting requirements?

- Have we duplicated a requirement?

• Completeness checking?

- For things like security and privacy, have we covered the bases within the different functional areas?


Proposed Process Model

QuARS could be used as the first inspection, or

Used interactively as requirements are entered.

• Cycle time should be shorter, and

• Fewer defects should escape.

Does a process simulation show us results related to our first slide?

[Diagram of the proposed flow: specification document → QuARS → revised specification → human inspection → revised specification.]


References

Lami, G. “An Automatic Tool for Improving the Quality of Software Requirements,” www.ercim.org/publication/Ercim_News/enw58/lami.html

Lami et al., “An Automatic Quality Evaluation for Natural Language Requirements,” matrix.iei.pi.cnr.it/FMT/WEBPAPER/P11RESFQ01.pdf

Bucchiarone, A. “Quality Analysis of NL Requirements,” www.antoniobucchiarone.it/ricerca_CV/ReqPro_QuARS.pdf


Contact Information

Bob Ferguson
Software Engineering Institute
[email protected]

Giuseppe Lami
Istituto di Scienza e Tecnologie dell'Informazione
[email protected]