# Main metrics

## Halstead complexity measures

Metrics: length, vocabulary, volume, difficulty, effort, level, bugs, time, intelligentContent, number_operators, number_operands, number_operators_unique, number_operands_unique

$$n1$$ = the number of distinct operators
$$n2$$ = the number of distinct operands
$$N1$$ = the total number of operators
$$N2$$ = the total number of operands

From these numbers, eight measures can be calculated:

• Program vocabulary: $$n = n1 + n2$$
• Program length: $$N = N1 + N2$$
• Calculated program length: $$N' = n1 * log2(n1) + n2 * log2(n2)$$
• Volume: $$V = N * log2(n)$$
• Difficulty: $$D = (n1/2) * (N2/n2)$$
• Effort: $$E = D * V$$
• Time required to program: $$T = E / 18$$ seconds
• Number of delivered bugs: $$B = V / 3000$$
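The formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's implementation: the function name `halstead` and the dictionary keys are hypothetical, and the sketch assumes `n1` and `n2` are both greater than zero (otherwise the logarithms and the division are undefined).

```python
import math


def halstead(n1, n2, N1, N2):
    """Compute the eight Halstead measures from the four base counts.

    n1, n2: distinct operators / operands (assumed > 0)
    N1, N2: total operators / operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    calculated_length = n1 * math.log2(n1) + n2 * math.log2(n2)
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "calculated_length": calculated_length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
        "time": effort / 18,      # seconds
        "bugs": volume / 3000,
    }


# Hypothetical counts, chosen so the logarithms are whole numbers.
measures = halstead(n1=4, n2=4, N1=8, N2=8)
print(measures["volume"], measures["difficulty"], measures["effort"])
```

With these counts the vocabulary is 8 and the length is 16, so the volume is 16 * log2(8) = 48, the difficulty is 4 and the effort is 192.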

## Cyclomatic complexity number and weighted method count

Metrics: wmc, ccn, ccnMethodMax

The cyclomatic complexity ($$CCN$$) is a measure of control structure complexity of a function or procedure.
We can calculate ccn in two ways (we choose the second):

$$CCN = E - N + 2P$$

Where:
$$P$$ = number of disconnected parts of the flow graph (e.g. a calling program and a subroutine)
$$E$$ = number of edges (transfers of control)
$$N$$ = number of nodes (sequential group of statements containing only one transfer of control)

OR

$$CCN$$ = the number of decision points
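As an illustration of the graph-based formula, here is a minimal Python sketch applied to a hypothetical flow graph for a single `if`/`else`. The function name and the node labels are assumptions made for this example only.

```python
def cyclomatic_complexity(edges, nodes, parts=1):
    """Graph form of the metric: E - N + 2P."""
    return len(edges) - len(nodes) + 2 * parts


# Hypothetical flow graph of one if/else: the condition branches to two
# blocks, and both blocks rejoin at the exit node.
nodes = {"cond", "then", "else", "exit"}
edges = {
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
}

ccn = cyclomatic_complexity(edges, nodes)
print(ccn)
```

Here E = 4, N = 4 and P = 1, so the formula gives 4 - 4 + 2 = 2.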

The weighted method count ($$WMC$$) is a count of methods, parameterized by an algorithm that computes the weight of each method.

Given a weight metric $$w$$ and a set of methods $$m$$, it is computed as the sum of the weights of all methods: $$WMC = \sum_{m' \in m} w(m')$$

Possible algorithms are:

• Cyclomatic Complexity
• Lines of Code
• 1 (unweighted WMC)

This visitor provides two metrics: the maximal CCN of all methods of one class (currently stored as ccnMethodMax)
and the WMC using the CCN as the weight metric (currently stored as ccn).
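The following Python sketch shows how a weighted method count can be parameterized by a weight function, using the three weights listed above. The method records, their field names (`ccn`, `loc`) and the variable names are hypothetical; the tool's own data structures are not described here.

```python
def weighted_method_count(methods, weight):
    """Sum the weight of every method of a class."""
    return sum(weight(method) for method in methods)


# Hypothetical per-method measurements for one class.
methods = [
    {"name": "load", "ccn": 4, "loc": 20},
    {"name": "save", "ccn": 3, "loc": 12},
    {"name": "reset", "ccn": 1, "loc": 3},
]

wmc_ccn = weighted_method_count(methods, lambda m: m["ccn"])   # CCN as weight
wmc_loc = weighted_method_count(methods, lambda m: m["loc"])   # lines of code as weight
wmc_unweighted = weighted_method_count(methods, lambda m: 1)   # plain method count

# Maximal CCN over all methods of the class.
ccn_method_max = max((m["ccn"] for m in methods), default=0)

print(wmc_ccn, wmc_loc, wmc_unweighted, ccn_method_max)
```

For this class the CCN-weighted count is 8, the LOC-weighted count is 35, the unweighted count is 3, and the maximal method CCN is 4.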

## Kan's defects

Metrics: kanDefect

$$kanDefect = 0.15 + 0.23 * number of do…while() + 0.22 * number of switch() + 0.07 * number of if()$$
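A minimal Python sketch of this formula follows. The function and parameter names are hypothetical, and the sketch assumes the three statement counts have already been collected.

```python
def kan_defects(do_while_count, switch_count, if_count):
    """Kan's defect estimate from counts of three control structures."""
    return (
        0.15
        + 0.23 * do_while_count
        + 0.22 * switch_count
        + 0.07 * if_count
    )


# Hypothetical counts: 1 do...while, 2 switch, 5 if.
estimate = kan_defects(do_while_count=1, switch_count=2, if_count=5)
print(round(estimate, 2))
```

With these counts the estimate is 0.15 + 0.23 + 0.44 + 0.35, about 1.17. Code with none of these structures scores the constant 0.15.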

## Maintainability Index

Metrics: mi, mIwoC, commentWeight

According to Wikipedia, "Maintainability Index is a software metric which measures how maintainable (easy to support and change) the source code is. The maintainability index is calculated as a factored formula consisting of Lines Of Code, Cyclomatic Complexity and Halstead volume."

• $$mIwoC$$: Maintainability Index without comments
• $$MIcw$$: Maintainability Index comment weight
• $$mi$$: Maintainability Index

$$MIwoc = 171 - 5.2 * ln(aveV) - 0.23 * aveG - 16.2 * ln(aveLOC)$$

$$MIcw = 50 * sin(sqrt(2.4 * perCM))$$

$$mi = MIwoc + MIcw$$

## Lack of cohesion of methods

Metrics: lcom

Cohesion metrics measure how well the methods of a class are related to each other. A cohesive class performs one function while a non-cohesive class performs two or more unrelated functions. A non-cohesive class may need to be restructured into two or more smaller classes.

High cohesion is desirable since it promotes encapsulation. As a drawback, a highly cohesive class has high coupling between the methods of the class, which in turn indicates high testing effort for that class.

Low cohesion indicates inappropriate design and high complexity. It has also been found to indicate a high likelihood of errors. The class should probably be split into two or more smaller classes.

## Card and Agresti metric

Metrics: relativeStructuralComplexity, relativeDataComplexity, relativeSystemComplexity, totalStructuralComplexity, totalDataComplexity, totalSystemComplexity

$$Fan-out$$ = Structural fan-out = Number of other procedures this procedure calls
$$v$$ = number of input/output variables for a procedure

($$SC$$) Structural complexity

$$SC = fan-out^2$$

($$DC$$) Data complexity

$$DC = v / (fan-out + 1)$$

## Length

Metrics: cloc, loc, lloc

• $$loc$$: lines count
• $$cloc$$: lines count without multiline comments
• $$lloc$$: lines count without empty lines

## Methods

Metrics: nbMethodsIncludingGettersSetters, nbMethods, nbMethodsPrivate, nbMethodsPublic, nbMethodsGetter, nbMethodsSetters

## Coupling

Metrics: afferentCoupling, efferentCoupling, instability

• Afferent couplings ($$Ca$$): The number of classes in other packages that depend upon classes within the package is an indicator of the package's responsibility.
• Efferent couplings ($$Ce$$): The number of classes in other packages that the classes in a package depend upon is an indicator of the package's dependence on externalities.
• Instability ($$I$$): The ratio of efferent coupling ($$Ce$$) to total coupling ($$Ce + Ca$$) such that $$I = Ce / (Ce + Ca)$$.
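To make the three coupling values concrete, here is a small Python sketch that derives them from a dependency map. The unit names and the `coupling` function are hypothetical, and the sketch makes two simplifying assumptions: each unit is treated as its own package, and a unit with no couplings at all is given an instability of 0 to avoid dividing by zero.

```python
def coupling(dependencies):
    """Compute Ca, Ce and instability for every unit.

    dependencies maps each unit to the set of units it depends on.
    """
    units = set(dependencies)
    for targets in dependencies.values():
        units |= set(targets)

    result = {}
    for unit in units:
        # Efferent: units this one depends on (self-references ignored).
        ce = len(set(dependencies.get(unit, ())) - {unit})
        # Afferent: other units that depend on this one.
        ca = sum(
            1
            for other, targets in dependencies.items()
            if other != unit and unit in targets
        )
        total = ca + ce
        # Assumption: an isolated unit gets instability 0.
        instability = ce / total if total else 0.0
        result[unit] = {"ca": ca, "ce": ce, "instability": instability}
    return result


# Hypothetical dependency map.
dependencies = {
    "Controller": {"Service", "Logger"},
    "Service": {"Repository", "Logger"},
    "Repository": {"Logger"},
}

metrics = coupling(dependencies)
for name in sorted(metrics):
    print(name, metrics[name])
```

In this example `Controller` depends on others but nothing depends on it, so its instability is 1. `Logger` is only depended upon, so its instability is 0.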

## Depth of inheritance tree

Metrics: depthOfInheritanceTree

Measures the length of inheritance from a class up to the root class.
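As a rough illustration, the depth can be computed by following parent links until the root is reached. In this Python sketch the class names and the `parents` map are hypothetical, and a root class is assumed to have depth 0; whether the tool counts the root as 0 or 1 is not stated here.

```python
def depth_of_inheritance(class_name, parents):
    """Count the inheritance steps from class_name up to its root class.

    parents maps a class name to its parent name, or None for a root.
    """
    depth = 0
    seen = {class_name}
    current = class_name
    while parents.get(current) is not None:
        current = parents[current]
        if current in seen:
            # Guard against a malformed (cyclic) parent map.
            raise ValueError("cyclic inheritance detected")
        seen.add(current)
        depth += 1
    return depth


# Hypothetical hierarchy: AdminUser -> User -> Model (root).
parents = {"AdminUser": "User", "User": "Model", "Model": None}

print(depth_of_inheritance("AdminUser", parents))
print(depth_of_inheritance("Model", parents))
```

Under the depth-0 root assumption, `AdminUser` has depth 2 and `Model` has depth 0.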

## Page rank

Metrics: pageRank

The first version of the PageRank algorithm was developed by Larry Page and Sergey Brin in 1998. The algorithm is used to rank web pages according to their importance.

We applied the PageRank algorithm to the case of relationships between packages and classes.
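The sketch below shows one common iterative formulation of PageRank applied to a dependency graph, where an edge from A to B means "A depends on B". It is not the tool's implementation. The damping factor of 0.85, the fixed iteration count, the uniform redistribution of rank from nodes without outgoing edges, and all names are assumptions made for this example.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Iterative PageRank over a directed graph.

    links maps each node to the list of nodes it points to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes |= set(targets)
    nodes = sorted(nodes)
    count = len(nodes)

    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / count for node in nodes}
        for node in nodes:
            targets = links.get(node, ())
            if targets:
                # Split this node's rank evenly among its targets.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outgoing edges spreads
                # its rank uniformly over all nodes.
                share = damping * rank[node] / count
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical class dependency graph.
links = {
    "Controller": ["Service", "Logger"],
    "Service": ["Repository", "Logger"],
    "Repository": ["Logger"],
}

ranks = page_rank(links)
for name, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.3f}")
```

In this graph `Logger` is depended upon by every other class, so it receives the highest rank, while `Controller`, which nothing depends on, receives the lowest.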