Experiments with Coupling and Cohesion Metrics

in a Large System*

Timothy C. Lethbridge and Nicolas Anquetil

School of Information Technology and Engineering

150 Louis Pasteur, University of Ottawa

K1N 6N6 Canada

tcl@site.uottawa.ca, anquetil@csi.uottawa.ca

 

Abstract

We apply coupling and cohesion metrics to various decompositions of a large telecommunications system. Several findings emerge: 1) There is a baseline level of coupling and cohesion that represents the average connectedness of any pair of files in the system. 2) As would be expected, measures of cohesion are greater than this baseline in well-defined subsystems. 3) Interestingly, and contrary to intuition, measures of coupling for such subsystems tend also to be greater than the baseline and to increase as cohesion increases. 4) It is possible to easily calculate upper bounds for coupling and cohesion that no subsystem in a given system can exceed. 5) Measuring quality by subtracting coupling from cohesion, as has been proposed in the literature, gives anomalous results since coupling and cohesion are frequently not on the same scale. We propose improved coupling, cohesion and quality metrics that normalize for the baseline and ceiling levels in a given system. Using our proposed metrics it should be possible to compare different systems, something that the current metrics do not permit.

 

1. Introduction

The general objective of our research is to help software engineers understand very large software systems that are typically poorly documented and organized. Part of our approach to this problem involves automatically extracting subsystems; software engineers will use the extracted subsystems (also known as 'clusters') to guide their exploration of an unfamiliar system. For related research, see Lakhotia (1997) and Müller et al. (1993).

In order to validate our subsystem extraction methods, we wish to evaluate them for quality. We hope to show that the extracted subsystems encapsulate some useful aspect of the system's architecture. One approach to such evaluation is to look at coupling and cohesion. In this paper we discuss the use of coupling and cohesion metrics from the literature. We explain several problems with them, particularly the fact that calculated values do not fall within the same range from system to system.

The next section describes the methods we have been using for calculating cohesion and coupling, and a derived measure of quality that is the difference between cohesion and coupling. The subsequent section describes experiments we have performed using these metrics. The remainder of the paper describes problems with the conventional metrics and proposes improved, normalized metrics.

2. Background

2.1 General method of calculating coupling and cohesion

We use the method for calculating coupling and cohesion described in Patel et al (1992) and Kunz and Black (1995). The only difference is that in our case we are clustering files, whereas Kunz and Black were clustering 'processes'. The basis for this method is a function SimW(X,Y) that takes two files X and Y, computes characteristic vectors XW and YW from them (described below), and returns a similarity value between 0 and 1.

Several different types W of vectors can be used in the similarity function; these are described in the next section.
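The paper does not reproduce the formula for SimW at this point. Since it maps a pair of non-negative count vectors to a value between 0 and 1, a cosine-style similarity is one plausible form; the sketch below assumes exactly that, and the type names and reference counts are invented for illustration:

```python
from math import sqrt

def cosine_sim(x, y):
    """Similarity in [0, 1] for non-negative count vectors x and y.
    Returns 1.0 for proportional vectors, 0.0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

# TR-style characteristic vectors: counts of references to each
# user-defined type within a file (types and counts are illustrative).
types = ["CallRecord", "Trunk", "Route"]
file_x = [4, 0, 2]   # references per type in file X
file_y = [3, 1, 2]   # references per type in file Y

print(cosine_sim(file_x, file_y))
```

Any other measure with the same range and monotonicity (more shared references, higher similarity) would fit the framework equally well.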

The more similar two characteristic vectors are, the more closely related their files are likely to be. We would expect a cohesive subsystem to have all its files closely related. Thus Kunz and Black describe their metric for cohesion as the average similarity of all distinct pairs of files in a cluster P:

CohesionW(P) = (2 / (m(m-1))) * Σ(i<j) SimW(pi, pj)

Here, m is the number of files in the cluster P, and each pi is a member of cluster P.

Coupling is correspondingly defined as the average similarity of all pairs of files in the system, such that one file in each pair is in the cluster P, and the other file is outside the cluster:

CouplingW(P) = (1 / (mn)) * Σ(i=1..m) Σ(j=1..n) SimW(pi, qj)

Here, m is the number of files in cluster P, n is the number of files not in cluster P, and the qj are the files outside P.

A good subsystem is expected to exhibit high cohesion and low coupling. However, because there are two separate metrics, comparison between subsystems is difficult. If one subsystem has higher coupling (worse) and higher cohesion (better) than another, it is not clear which one is better. Kunz and Black therefore defined a single quality metric as simply the difference between the cohesion and the coupling:

QualityW(P) = CohesionW(P) - CouplingW(P)
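As a sketch, all three metrics follow directly from their definitions above, given any pairwise similarity function; the toy similarity values below are invented for illustration:

```python
from itertools import combinations, product

def cohesion(cluster, sim):
    """Average similarity over all distinct pairs of files in the cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

def coupling(cluster, rest, sim):
    """Average similarity over all (inside, outside) pairs of files."""
    return sum(sim(a, b) for a, b in product(cluster, rest)) / (len(cluster) * len(rest))

def quality(cluster, rest, sim):
    """Kunz and Black's quality metric: cohesion minus coupling."""
    return cohesion(cluster, sim) - coupling(cluster, rest, sim)

# Toy similarity table for a five-file system (values are illustrative).
S = {frozenset(p): v for p, v in {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.7,
    ("a", "d"): 0.1, ("a", "e"): 0.2, ("b", "d"): 0.1,
    ("b", "e"): 0.0, ("c", "d"): 0.2, ("c", "e"): 0.1,
    ("d", "e"): 0.6,
}.items()}
sim = lambda x, y: S[frozenset((x, y))]

print(quality(["a", "b", "c"], ["d", "e"], sim))
```

Note that with only these definitions, cohesion and coupling need not sit on comparable scales, which is the anomaly the later sections take up.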

2.2 Methods of calculating the similarity

For the method described above, the similarity measure SimW(X,Y) depends on how the characteristic vectors are constituted. For our experiments, we used three different methods, which we differentiate using the W subscript in the above equations:

TR: A characteristic vector counts the references to particular user-defined types within the file it describes. This is the method used by Kunz and Black (1995).

DR: A vector represents the use of named variables in the two files being compared, as defined in Patel et al (1992).

RC: A vector represents the calling of routines in one file by routines in the other, and vice versa. We introduced this method to see whether different ways of computing cohesion and coupling affect our results.
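To make the variants concrete: each can be seen as counting references from a file to some vocabulary of items (user-defined types for TR, named variables for DR, called routines for RC). A minimal sketch, with an invented fact-base format:

```python
from collections import Counter

# Hypothetical extracted facts: (file, referenced item) pairs, where the
# item is a type name (TR), a variable name (DR), or a called routine (RC).
facts = [
    ("x.c", "CallRecord"), ("x.c", "CallRecord"), ("x.c", "Trunk"),
    ("y.c", "CallRecord"), ("y.c", "Route"),
]

def characteristic_vector(facts, filename, vocabulary):
    """Vector of reference counts for one file over a fixed vocabulary,
    so that vectors for different files are directly comparable."""
    counts = Counter(item for f, item in facts if f == filename)
    return [counts[item] for item in vocabulary]

vocab = sorted({item for _, item in facts})
print(characteristic_vector(facts, "x.c", vocab))  # counts over vocab
```

The DR and RC variants differ in that their vectors are built relative to the specific pair of files being compared, but the counting step is the same in spirit.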

3. Experiments with coupling and cohesion

We performed our experiments on a telecommunications system consisting of about 4500 files. We used the 11 techniques described in table 1 to cluster these files into subsystems; our original objective was to evaluate the effectiveness of these techniques at extracting highly cohesive subsystems that were as loosely coupled with each other as possible. However, in the context of this paper, we became more interested in evaluating the cohesion and coupling metrics themselves.
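The baseline level mentioned in the abstract, the average connectedness of any pair of files in the system, can be computed by applying the cohesion formula to the system as a whole. A sketch, with invented similarity values for a four-file system:

```python
from itertools import combinations

def baseline(all_files, sim):
    """Average similarity over every distinct pair of files in the
    system; subsystem cohesion and coupling are judged against this."""
    pairs = list(combinations(all_files, 2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)

# Invented pairwise similarities for a four-file system.
S = {frozenset(p): v for p, v in {
    ("a", "b"): 0.6, ("a", "c"): 0.2, ("a", "d"): 0.1,
    ("b", "c"): 0.3, ("b", "d"): 0.2, ("c", "d"): 0.4,
}.items()}
sim = lambda x, y: S[frozenset((x, y))]

print(baseline(["a", "b", "c", "d"], sim))
```

For a real 4500-file system this is a large but straightforward computation over roughly ten million pairs.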

 

 

Description of clustering technique, with its technique number
(where 1 means least cohesive and 11 means most, according to figure 1)

Clusters created manually:
  6   Clusters selected by experts (did not include all files)
  3   Clusters recorded in the configuration management system
  4   Similar to 3, but only considering clusters containing at least one file from the experts' partition (6)

Clusters generated automatically...

... using similarity based on data references:
  7   Automatic data-reference clustering
  11  Similar to 7, but only considering clusters containing at least one file from the experts' partition (6)

... using similarity based on routine calls:
  8   Automatic routine-call clustering
  10  Similar to 8, but only considering clusters containing at least one file from the experts' partition (6)

... using similarity based on abbreviations in file names (Anquetil and Lethbridge 1998):
  1   Automatic file name clustering
  5   Similar to 1, but only considering clusters containing at least one file from the experts' partition (6)
  2   Similar to 1, but only considering the first abbreviation in each file name
  9   Similar to 2, but only considering clusters containing at least one file from the experts' partition (6)

Table 1: Eleven clustering techniques used in experiments.