Component Mining

Mahdi Cheraghchi-Bashi-Astaneh
cheraghchi@ce.sharif.edu

Outline

What is a component?
Software reuse
What is component retrieval?
Pros and cons of reuse
How to retrieve?
Evaluation

What is a component?

A part of the whole.
“A piece of software small enough to create and maintain, big enough to deploy and support, and with standard interfaces for interoperability” - Jed Harris, President, CI Labs.
Self-contained binary pieces of software, but not complete applications.
Can be combined with other components to produce complete applications, regardless of the languages the components are implemented in or the platforms they run on.
Object-oriented methods are often used for component development and reuse.

Some Examples in Practice

Borland Delphi
Borland C++ Builder
Borland Kylix
OLE / COM / ActiveX
JavaBeans
CORBA

Software Reuse

Software reuse is the process of creating software systems from existing software rather than building software systems from scratch. [Krueger, 1992]
Levels of software reuse: source code, algorithms, architectures, domain models, design, program transformations, documentation, … every possible aspect of a software system.

What is Component Retrieval?

The mere existence of a component library does not automatically entail its reuse.
“Component mining” is the deliberate, organized, and automated process of extracting reusable components from an existing rich software base.
Re-users need support in identifying components that suit their needs. This task is the topic of software component retrieval.
The goal is to develop reusable, adaptable software components rather than large, monolithic applications.

Types of Reuse

Black-box reuse: a client may reuse the retrieved components “as is.”
Component-adaptive grey-box reuse: a client may reuse the retrieved components without meeting any additional conditions, but only after interface-level modifications of the components.
White-box reuse: arbitrary additions and modifications are required.
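
The difference between black-box and grey-box reuse can be illustrated with a small sketch. Everything in it is hypothetical: the `LegacySorter` component, its `sort_descending` method, and the adapter class are invented for illustration only. Black-box reuse calls the component unchanged, while grey-box reuse wraps it in an adapter that changes only its interface.

```python
class LegacySorter:
    """Hypothetical retrieved component; its internals are never modified."""

    def sort_descending(self, items):
        return sorted(items, reverse=True)


class AscendingSorterAdapter:
    """Grey-box reuse: only the interface is adapted, not the implementation."""

    def __init__(self, component):
        self._component = component

    def sort(self, items):
        # Reuse the component's behaviour and reverse its result so that it
        # fits the interface the client expects.
        return list(reversed(self._component.sort_descending(items)))


# Black-box reuse: the component is used "as is".
black_box_result = LegacySorter().sort_descending([2, 3, 1])

# Grey-box reuse: the client talks to the adapted interface.
grey_box_result = AscendingSorterAdapter(LegacySorter()).sort([2, 3, 1])
```

White-box reuse, by contrast, would mean editing the body of `LegacySorter` itself.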

Pros and Cons of Reuse

Advantages:
1. Reduces the time and cost spent on programming.
2. Increases programmers’ productivity.
3. Increases program quality and reliability.
4. Expertise sharing.

Problems:
1. It is hard to find things, especially at large scale.
2. Typically, components are not (easily) modifiable.
3. It is hard to manage a large pool of components.
4. Reuse is only worthwhile if it is easier to locate and modify a reusable component than to write it from scratch.

How to Retrieve?

Component retrieval is in fact a form of information retrieval. Even so, “dedicated” component retrieval algorithms are being developed, since software is more than ordinary text.
Component retrieval is a complex and heuristic process.
It typically needs a well-structured repository of components.
Methods of retrieval:
1. Algorithms based on the meta-data accompanying software components.
2. Algorithms based on the structure of the components.
Exact retrieval versus approximate retrieval.
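
A minimal sketch of the contrast between exact and approximate retrieval follows. The repository contents, the function names, and the choice of Jaccard similarity as the approximate score are all assumptions made for illustration; the slides do not prescribe a particular measure.

```python
def exact_retrieve(repository, query_keywords):
    """Return components whose keyword set contains every query keyword."""
    query = set(query_keywords)
    return [name for name, keywords in repository.items() if query <= keywords]


def approximate_retrieve(repository, query_keywords, threshold=0.5):
    """Return (name, score) pairs ranked by Jaccard similarity to the query."""
    query = set(query_keywords)
    scored = []
    for name, keywords in repository.items():
        union = query | keywords
        score = len(query & keywords) / len(union) if union else 0.0
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical repository: component name -> descriptive keywords.
repository = {
    "QuickSort": {"sort", "array", "in-place"},
    "MergeSort": {"sort", "list", "stable"},
    "HashMap": {"map", "lookup", "hash"},
}

query = ["sort", "stable", "array"]
exact = exact_retrieve(repository, query)
approximate = approximate_retrieve(repository, query, threshold=0.3)
```

With this data no component satisfies the whole query exactly, so exact retrieval returns nothing, while approximate retrieval still proposes the two sorting components as partial matches.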

Retrieval by Meta-Data

By meta-data we mean the documentation accompanying the component.
This method relies on the existence and quality of the documentation and needs some pre-processing.
How to find?
1. Using full-text search on documents and program files: no cost, but inaccurate.
2. By classification of the components, either automatically or manually (depending on the cost and accuracy we need).
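
The two ways of finding components by meta-data can be sketched as follows. This is an illustrative sketch only: the documentation strings, the facet names (`function`, `order`), and both helper functions are hypothetical, and a real system would use a proper text index rather than word overlap.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def full_text_search(docs, query):
    """Rank components by how many query words occur in their documentation."""
    query_words = set(tokenize(query))
    hits = []
    for name, text in docs.items():
        overlap = len(query_words & set(tokenize(text)))
        if overlap:
            hits.append((name, overlap))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def classified_search(classification, **facets):
    """Return components whose assigned facets all match the requested values."""
    return [
        name
        for name, assigned in classification.items()
        if all(assigned.get(facet) == value for facet, value in facets.items())
    ]


# Hypothetical documentation strings and manually assigned facets.
docs = {
    "Stack": "A stack removes the most recently pushed element first.",
    "Queue": "A queue removes the oldest element first.",
    "Logger": "Writes stack traces and messages to a file.",
}
classification = {
    "Stack": {"function": "store", "order": "lifo"},
    "Queue": {"function": "store", "order": "fifo"},
    "Logger": {"function": "report"},
}

text_hits = full_text_search(docs, "stack element")
facet_hits = classified_search(classification, function="store", order="lifo")
```

The full-text query also returns `Logger`, which merely mentions the word "stack"; this is the kind of inaccuracy that classification is meant to avoid, at the cost of assigning the facets first.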

Retrieval by Structure

Depends on the availability of the structure in some form (source code, interface, etc.).
Depends on the availability of computer language processors.
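
As a rough illustration of structure-based retrieval, the sketch below uses Python's standard `ast` module as the language processor and extracts function signatures from source code. The component source and the two helper functions are hypothetical, and querying by parameter count is only one simple structural criterion among many.

```python
import ast


def extract_signatures(source):
    """Map each top-level function name to its list of parameter names."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.args]
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def find_by_arity(signatures, arity):
    """Return the names of functions that take exactly `arity` parameters."""
    return sorted(name for name, params in signatures.items() if len(params) == arity)


# Hypothetical component source code.
component_source = '''
def push(stack, item):
    stack.append(item)

def pop(stack):
    return stack.pop()

def swap(items, i, j):
    items[i], items[j] = items[j], items[i]
'''

signatures = extract_signatures(component_source)
two_argument_functions = find_by_arity(signatures, 2)
```

Without the source (or at least an interface description) and a parser for its language, none of this structural information would be available, which is the dependency the slide points out.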

Some Other Methods

Formal component specification
1. Domain theories: algebraic models, signatures, etc.
2. Interface specifications
3. Interface matching (automated theorem proving, etc.)

Semantic classification
Feature-based methods (What possible features can a component have?)
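
Matching on signatures, one of the specification forms listed above, can be sketched as follows. This is loosely in the spirit of signature matching and is not a reproduction of any published algorithm; the library entries, the type names, and the two match predicates are assumptions made for illustration.

```python
from collections import Counter


def exact_match(query, candidate):
    """Signatures match when argument types and result type are identical."""
    return query == candidate


def reorder_match(query, candidate):
    """Relaxed match: the same argument types in any order, same result type."""
    query_args, query_result = query
    candidate_args, candidate_result = candidate
    return (
        Counter(query_args) == Counter(candidate_args)
        and query_result == candidate_result
    )


def retrieve(library, query, match):
    """Return the names of library components whose signature matches."""
    return sorted(name for name, sig in library.items() if match(query, sig))


# Hypothetical library: name -> (argument types, result type).
library = {
    "repeat": (("str", "int"), "str"),
    "pad_left": (("int", "str"), "str"),
    "length": (("str",), "int"),
}

query = (("str", "int"), "str")
exact_hits = retrieve(library, query, exact_match)
relaxed_hits = retrieve(library, query, reorder_match)
```

The relaxed predicate also returns `pad_left`, whose arguments are merely in a different order. A signature says nothing about behaviour, so such hits are candidates to inspect rather than proven matches.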

Some Other Methods

Deduction-based component retrieval
  It is the only method that retrieves proven matches only.
  Suitable for the development of high-reliability or safety-critical applications, e.g. spacecraft control systems.

Searching and Browsing

Searching: Software developers formulate a query, and the repository system returns components that match the query.
  Problem: Formulating an effective query is a challenging task.

Browsing: Developers determine the relevance of the components currently being displayed in terms of their development task, and traverse the associated links.
  It is an incremental task, and is usually preferred.
  Problem: The software developer may be puzzled.

Context-aware browsing: Infers developers’ tasks by monitoring their interactions with the environment.
  Similar to browsing, but results in a significantly smaller browsing space.
  Uses learning methods to refine itself.
  Problem: It is difficult to “understand” the content.

The Reuse Environment

A component database.
A library management system providing access to the database.
A software component retrieval system (e.g. an ORB) that enables client applications to retrieve components from the library server.
CBSE tools that support the integration of reused components into a new design.

Evaluation Measures

Recall = ratio of the number of relevant components retrieved to the total number of relevant components in the repository.
Precision = ratio of the number of relevant components retrieved to the total number of components retrieved.
Response time.
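
The two ratios above translate directly into code. In this sketch the component names are hypothetical, and returning 0.0 when a denominator is empty is an assumed convention, not something the definitions themselves specify.

```python
def recall(retrieved, relevant):
    """Relevant components retrieved / all relevant components in the repository."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # assumed convention for an empty denominator
    return len(set(retrieved) & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Relevant components retrieved / all components retrieved."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0  # assumed convention for an empty denominator
    return len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical outcome of one query.
retrieved = ["QuickSort", "MergeSort", "HashMap", "Logger"]
relevant = ["QuickSort", "MergeSort", "HeapSort"]

query_recall = recall(retrieved, relevant)        # 2 of the 3 relevant were found
query_precision = precision(retrieved, relevant)  # 2 of the 4 retrieved are relevant
```

Returning more components tends to raise recall and lower precision, so the two measures are usually reported together.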

Summary and Conclusion

Software reuse is a crucial concern in today’s world of complex software products.
The component-based development model plays an important role in software reuse.
The component-based model is useful only when a satisfactory means of retrieval is available.
No definite answer has yet been developed for describing components in unambiguous, classifiable terms.
Component retrieval is a difficult problem, and more work is needed to find an efficient solution.

References

D. Spinellis, K. Raptis, Component Mining: a process and its pattern language, Information and Software Technology 42 (2000) pp 609-617
Hafedh Mili et al, An experiment in software component retrieval, Information and Software Technology 45 (2003) pp 633-649
K. McArthur et al, An evaluation of the impact of component-based architectures on software reusability, Information and Software Technology 44 (2002) pp 351-359
P.A.V. Hall, Architecture-driven component reuse, Information and Software Technology 41 (1999) pp 963-968
I. Crnkovic, M. Larsson, Challenges of component-based development, The Journal of Systems and Software 61 (2002) pp 201-212
Y. Ye, G. Fischer, Context-Aware Browsing of Large Component Repositories, IEEE 16th International Conference on Automated Software Engineering, 2001
A. M. Zaremski, J. M. Wing, Signature Matching, A Key to Reuse
B. Fischer, Deduction-Based Software Component Retrieval (Thesis)