Using the conceptual cohesion of classes for fault prediction in object-oriented systems

Andrian Marcus, Denys Poshyvanyk, R. Ferenc

Research output: Contribution to journal › Article

180 Citations (Scopus)

Abstract

High cohesion is a desirable property of software, as it positively impacts understanding, reuse, and maintenance. Currently proposed measures for cohesion in Object-Oriented (OO) software reflect particular interpretations of cohesion and capture different aspects of cohesion. The paper proposes a new measure for the cohesion of classes in an OO software system, based on the analysis of the unstructured information embedded in the source code, such as comments and identifiers. The measure, named the Conceptual Cohesion of Classes (C3), is inspired from the mechanisms used to measure textual coherence in cognitive psychology and computational linguistics. The paper presents the principles and the technology that stand behind the C3 measure. A large case study on three open source software systems is presented, which compares the new measure with an extensive set of existing metrics and uses them to construct models that predict software faults. The case study shows that the novel measure captures different aspects of class cohesion compared to any of the existing cohesion measures. In addition, combining C3 with existing structural cohesion metrics proves to be a better predictor of faulty classes when compared to different combinations of structural cohesion metrics.
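As a rough illustration of the idea in the abstract, the sketch below scores a class by how similar the text of its methods (comments and identifiers) is, pair by pair. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not give the C3 formula, the keywords point to Latent Semantic Indexing, and this example substitutes plain term-frequency vectors with cosine similarity. The names `tokenize`, `cosine`, and `conceptual_cohesion` are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Extract lowercase word tokens from comments and identifiers."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        # Split camelCase identifiers, e.g. "openFile" -> "open", "file".
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(part.lower() for part in parts)
    return tokens


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def conceptual_cohesion(method_texts):
    """Average pairwise textual similarity of a class's methods.

    Assumption: raw term frequencies stand in for the LSI representation
    named in the paper's keywords; this is an illustrative proxy only.
    """
    vectors = [Counter(tokenize(text)) for text in method_texts]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        # A class with fewer than two methods has no pairs to compare.
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    file_class = ["// open the file\nvoid openFile()",
                  "// close the file\nvoid closeFile()"]
    mixed_class = ["// open the file\nvoid openFile()",
                   "// draw red circle\nvoid drawCircle()"]
    print(conceptual_cohesion(file_class))
    print(conceptual_cohesion(mixed_class))
```

With these toy inputs, the class whose methods share vocabulary scores higher than the class mixing file handling with drawing, which is the kind of signal a text-based cohesion measure is meant to capture.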

Original language: English
Pages (from-to): 287-300
Number of pages: 14
Journal: IEEE Transactions on Software Engineering
Volume: 34
Issue number: 2
DOI: 10.1109/TSE.2007.70768
Publication status: Published - Mar 2008

Fingerprint

Computational linguistics
Open source software

Keywords

  • Fault prediction
  • Fault proneness
  • Information retrieval
  • Latent Semantic Indexing
  • Program comprehension
  • Software cohesion
  • Textual coherence

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Using the conceptual cohesion of classes for fault prediction in object-oriented systems. / Marcus, Andrian; Poshyvanyk, Denys; Ferenc, R.

In: IEEE Transactions on Software Engineering, Vol. 34, No. 2, 03.2008, p. 287-300.

@article{5fa0b747ccec4a1392ca15cc71dbfd8f,
title = "Using the conceptual cohesion of classes for fault prediction in object-oriented systems",
abstract = "High cohesion is a desirable property of software, as it positively impacts understanding, reuse, and maintenance. Currently proposed measures for cohesion in Object-Oriented (OO) software reflect particular interpretations of cohesion and capture different aspects of cohesion. The paper proposes a new measure for the cohesion of classes in an OO software system, based on the analysis of the unstructured information embedded in the source code, such as comments and identifiers. The measure, named the Conceptual Cohesion of Classes (C3), is inspired from the mechanisms used to measure textual coherence in cognitive psychology and computational linguistics. The paper presents the principles and the technology that stand behind the C3 measure. A large case study on three open source software systems is presented, which compares the new measure with an extensive set of existing metrics and uses them to construct models that predict software faults. The case study shows that the novel measure captures different aspects of class cohesion compared to any of the existing cohesion measures. In addition, combining C3 with existing structural cohesion metrics proves to be a better predictor of faulty classes when compared to different combinations of structural cohesion metrics.",
keywords = "Fault prediction, Fault proneness, Information retrieval, Latent Semantic Indexing, Program comprehension, Software cohesion, Textual coherence",
author = "Andrian Marcus and Denys Poshyvanyk and R. Ferenc",
year = "2008",
month = "3",
doi = "10.1109/TSE.2007.70768",
language = "English",
volume = "34",
pages = "287--300",
journal = "IEEE Transactions on Software Engineering",
issn = "0098-5589",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",
}

TY - JOUR
T1 - Using the conceptual cohesion of classes for fault prediction in object-oriented systems
AU - Marcus, Andrian
AU - Poshyvanyk, Denys
AU - Ferenc, R.
PY - 2008/3
Y1 - 2008/3
AB - High cohesion is a desirable property of software, as it positively impacts understanding, reuse, and maintenance. Currently proposed measures for cohesion in Object-Oriented (OO) software reflect particular interpretations of cohesion and capture different aspects of cohesion. The paper proposes a new measure for the cohesion of classes in an OO software system, based on the analysis of the unstructured information embedded in the source code, such as comments and identifiers. The measure, named the Conceptual Cohesion of Classes (C3), is inspired from the mechanisms used to measure textual coherence in cognitive psychology and computational linguistics. The paper presents the principles and the technology that stand behind the C3 measure. A large case study on three open source software systems is presented, which compares the new measure with an extensive set of existing metrics and uses them to construct models that predict software faults. The case study shows that the novel measure captures different aspects of class cohesion compared to any of the existing cohesion measures. In addition, combining C3 with existing structural cohesion metrics proves to be a better predictor of faulty classes when compared to different combinations of structural cohesion metrics.

KW - Fault prediction
KW - Fault proneness
KW - Information retrieval
KW - Latent Semantic Indexing
KW - Program comprehension
KW - Software cohesion
KW - Textual coherence
UR - http://www.scopus.com/inward/record.url?scp=42549092547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=42549092547&partnerID=8YFLogxK
U2 - 10.1109/TSE.2007.70768
DO - 10.1109/TSE.2007.70768
M3 - Article
VL - 34
SP - 287
EP - 300
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
SN - 0098-5589
IS - 2
ER -