Monday 9 February 2015

A model for software development bug calculation

What I want to discuss here is a small study of why software applications become complex and buggy over time. I am trying to build a simple model of an application's bugginess, so we first need some definitions and a scenario. Let's get started.

First, as we all know, software applications must evolve because their environment, users, and needs evolve. So we have to modify applications or add features to them to keep up with the evolving ecosystem.

There are many metrics for measuring software quality, but for our simple model let's just deal with "bugs per line of code". I have seen averages of around 15 bugs per 1000 lines of code in papers; this is only a number to give a feeling for what we are talking about, and we are going to use symbols instead of numbers. I just want to emphasize that even advanced programmers write code with bugs. We use two types of symbols, one for new lines of code and one for edited lines of code, because the effects of these two on generating bugs are not the same (I will talk about this later):



$N$ : total new lines
$E$ : total edited lines
$\nu$ : percentage or ratio of bugs generated in new lines of code
$\varepsilon$ : percentage or ratio of bugs generated in edited lines of code

A fully connected mesh topology
Look at the picture: we simply suppose that our software application consists of 3 modules. The arrows show the side effects we may get if a change is made in one module. So if we make some changes in $M_1$, for example, they might have side effects on $M_2$ and $M_3$.

The $\psi$ coefficients define the size of the side effect, which we simply take to be linear. So if we change $E_1$ lines of code in $M_1$, we get $E_1 \varepsilon_1$ bugs in $M_1$ itself, $E_1 \psi_{12}$ in $M_2$, and $E_1 \psi_{13}$ in $M_3$.
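As a minimal sketch of this linear side-effect rule, the function below takes the number of edited lines in one module and returns the expected bugs in that module and in each neighbor. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def bugs_from_edit(edited_lines, eps_own, psi_row):
    """Expected bugs caused by editing `edited_lines` lines in one module.

    eps_own : bug ratio for edited lines in the module itself (epsilon_1)
    psi_row : side-effect coefficients toward the other modules,
              e.g. {"M2": psi_12, "M3": psi_13}
    Returns (bugs in the edited module, {other module: side-effect bugs}).
    """
    own = edited_lines * eps_own
    side = {module: edited_lines * psi for module, psi in psi_row.items()}
    return own, side


# Hypothetical numbers: 200 lines edited in M1.
own, side = bugs_from_edit(200, eps_own=0.02, psi_row={"M2": 0.005, "M3": 0.001})
print("bugs in M1:", own)
print("side-effect bugs:", side)
```

Because the model is linear, doubling the edited lines doubles every entry of the result.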

Now suppose that at the moment the software has $B_0$ bugs ($B_0$ can be zero), and that during a development process, cycle, or iteration some new lines of code are added and some existing lines are edited. The following number of new bugs is then added to the software:

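$$(E_1 \varepsilon_1 + E_1 \psi_{12} + E_1 \psi_{13} + \dots) + (E_2 \varepsilon_2 + E_2 \psi_{21} + E_2 \psi_{23} + \dots) + (E_3 \varepsilon_3 + E_3 \psi_{31} + E_3 \psi_{32} + \dots) + \dots$$
$$(N_1 \nu_1 + N_1 \psi_{12} + N_1 \psi_{13} + \dots) + (N_2 \nu_2 + N_2 \psi_{21} + N_2 \psi_{23} + \dots) + (N_3 \nu_3 + N_3 \psi_{31} + N_3 \psi_{32} + \dots) + \dots$$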

These two lines give the number of newly generated bugs after editing $(E_1 + E_2 + E_3 + \dots)$ lines of code and adding $(N_1 + N_2 + N_3 + \dots)$ new lines of code to the application. Let's try to summarize the above. We define $DDB$ as the bugs directly related to the development itself:

$$DDB_1 = (E_1 \varepsilon_1 + E_2 \varepsilon_2 + E_3 \varepsilon_3 + \dots) + (N_1 \nu_1 + N_2 \nu_2 + N_3 \nu_3 + \dots)$$

and $IDB$ as the bugs that appear in the application through development in neighboring modules, so:

$$IDB_1 = (E_1 \psi_{12} + E_1 \psi_{13} + \dots) + (E_2 \psi_{21} + E_2 \psi_{23} + \dots) + \dots + (N_1 \psi_{12} + N_1 \psi_{13} + \dots) + (N_2 \psi_{21} + N_2 \psi_{23} + \dots) + \dots$$

or

$$IDB_1 = (E_1 + N_1)(\psi_{12} + \psi_{13} + \dots) + (E_2 + N_2)(\psi_{21} + \psi_{23} + \dots) + (E_3 + N_3)(\psi_{31} + \psi_{32} + \dots) + \dots$$
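The split into direct and indirect bugs can be sketched in a few lines of Python. The list layout, the function name, and every number below are assumptions made for illustration: `psi[i][j]` stands for the side-effect coefficient from module i+1 to module j+1, and the diagonal is ignored.

```python
def direct_and_indirect_bugs(E, N, eps, nu, psi):
    """Return (DDB, IDB) for one development iteration.

    E, N    : edited and new lines per module
    eps, nu : bug ratios per module for edited and new lines
    psi     : n x n matrix of side-effect coefficients (diagonal unused)
    """
    n = len(E)
    # Bugs landing in the module that was actually changed.
    ddb = sum(E[i] * eps[i] + N[i] * nu[i] for i in range(n))
    # Bugs landing in the other modules, grouped per changed module.
    idb = sum(
        (E[i] + N[i]) * sum(psi[i][j] for j in range(n) if j != i)
        for i in range(n)
    )
    return ddb, idb


# Hypothetical three-module application.
E = [100, 50, 0]
N = [200, 0, 300]
eps = [0.02, 0.01, 0.03]
nu = [0.015, 0.015, 0.015]
psi = [
    [0.0, 0.004, 0.001],
    [0.002, 0.0, 0.003],
    [0.001, 0.002, 0.0],
]

ddb, idb = direct_and_indirect_bugs(E, N, eps, nu, psi)
print("DDB:", ddb, "IDB:", idb)
```

Grouping by the changed module, as the factored form of the equation does, means each module's total churn $(E_i + N_i)$ is multiplied once by the sum of its outgoing coefficients.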

To get a better understanding, let us also assume that all $\psi_{ij}$ are equal to $\psi$. This can also be the average of the $\psi_{ij}$. Note that sometimes two or more people work on the modules, so you either have to deal with the individual $\psi_{ij}$ or use an average value for all of them. So if $n$ is the number of modules in the application, we have:

$$IDB_1 = \psi(n-1)(E_1 + N_1) + \psi(n-1)(E_2 + N_2) + \psi(n-1)(E_3 + N_3) + \dots$$

$$IDB_1 = \psi(n-1)(E_1 + N_1 + E_2 + N_2 + E_3 + N_3 + \dots) = \psi(n-1)\left[\sum E_i + \sum N_i\right] \qquad (1)$$

... and if we also accept that all $\varepsilon_i$ are equal to $\varepsilon$, and likewise all $\nu_i$ are equal to $\nu$, we have:

$$DDB_1 = (E_1 \varepsilon + E_2 \varepsilon + E_3 \varepsilon + \dots) + (N_1 \nu + N_2 \nu + N_3 \nu + \dots) = \varepsilon \sum E_i + \nu \sum N_i \qquad (2)$$

Equations 1 and 2 are just a simple linear model of what happens when we do some development on the software. To translate them into English, let's name $\sum E_i$ the Total Edited Lines at $t_1$, or $TEL_1$, and $\sum N_i$ the Total New Lines at $t_1$, or $TNL_1$. So we have:

$$DDB_1 = \varepsilon \, TEL_1 + \nu \, TNL_1 \qquad (3)$$

$$IDB_1 = \psi(n-1)(TEL_1 + TNL_1) \qquad (4)$$
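To get a feeling for equations (3) and (4), here is a small worked sketch. The function names and all values are hypothetical; $\nu = 0.015$ simply mirrors the "15 bugs per 1000 lines" figure mentioned earlier, while $\varepsilon$ and $\psi$ are made-up numbers.

```python
def ddb(tel, tnl, eps, nu):
    """Equation (3): bugs directly related to the edited and new lines."""
    return eps * tel + nu * tnl


def idb(tel, tnl, psi, n):
    """Equation (4): side-effect bugs in a fully connected mesh of n modules."""
    return psi * (n - 1) * (tel + tnl)


# Hypothetical iteration: 150 edited lines and 500 new lines in total.
tel, tnl = 150, 500
eps, nu, psi = 0.02, 0.015, 0.002

print("DDB:", ddb(tel, tnl, eps, nu))
for n in (3, 10, 30):
    print("n =", n, "IDB:", idb(tel, tnl, psi, n))
```

Under these assumptions the direct bugs do not depend on the number of modules, while the indirect bugs grow in proportion to $n-1$ for the same amount of work.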

So it says:

1: "When you add new lines of code to a project, no matter where they are (even in the simplest parts of the code), you in fact add some bugs to the module you are working on, and these bugs are directly related to the new lines of code."

2: "When you modify some lines of code in a project, no matter where they are (even in the simplest parts of the code), you in fact add some bugs to the module you are working on, and these bugs are directly related to the edited lines of code."

Now look at this one; it is important when the project is maintained by a group of programmers.

3: "The more you develop and work on your own modules, the more bugs you introduce into other people's modules!"

In the next posts we will look more closely at equations 1, 2, 3, and 4 and see how they can help us minimize the number of bugs generated in a piece of software; it mostly depends on the way the modules cooperate with each other.