Open Source Evolution Analysis

Izzat Alsmadi
Kenneth Magel
Department of Computer Science
North Dakota State University
{izzat.alsmadi, kenneth.magel}@ndsu.edu

ABSTRACT

Source code analysis is important for software management. It enables us to recognize strengths and weaknesses in our earlier projects or releases. We developed a source code analysis tool that gathers several metrics from C/C++, C#, or Java source code. In this paper, we use the tool to analyze several open source projects. We study how the selected projects evolve across releases and compare release characteristics within the same project as well as among different projects. Different programming languages and development styles are studied through those open source projects.

General Terms
Source code analysis.

Keywords
Open source, code metrics.

1. INTRODUCTION
The knowledge gathered by software metrics plays an important role in software management. This knowledge can be used to build classification or proxy models that can be applied to future projects or releases. A software metric tool helps us gather the information required to build such models. The gathered metrics need to be compiled in order to form hypotheses and assumptions about the model. As with the rules used in data mining classification models, the more we apply this model, the more we can tune its assumptions and gain confidence in its results.

The paper analyzes the results of some source code charts from selected open source projects. Those projects are studied through a selected number of their releases.

2. RELATED WORK
Godfrey et al. studied the LOC evolution for Linux [18]. Capiluppi et al. suggested that understandability decreases as time passes, and focused on code and module sizes [19]. Stroulia et al. used CVSChecker to study temporal source code activities [20]. Marjanovic proposed a meta model framework for code release history systems [21]. Scacchi studied open source game development practices [22]. Raja et al. explored some important software characteristics that contribute to consistent software quality in Linux [23]. Moreno et al. introduced Jeliot3 as a visual source code evaluation tool [13]. Graver evaluated the object oriented refactoring process and evolution of a compiler [14].

3. SWMETRIC TOOL
SWMetric is a tool we developed to gather metrics at the function and class level. The tool can analyze C/C++, C#, or Java code. It was originally

22nd IEEE International Conference on Software Maintenance (ICSM'06)
0-7695-2354-4/06 $20.00 © 2006

developed as part of a student research project for the Honeywell aviation division.

4. GOALS AND APPROACHES
1. Study the relation between project releases and LOC size variation. We first studied how the LOC of different projects changes with the releases.

In traditional approaches, we should not see many new lines of code produced in the later releases. The diagram, however, shows instability in the amount of new LOC produced: most of the projects tested show an increase in one release and a decrease in the next. This is expected for open source projects, where there is much instability in terms of developers, their abilities, available time, and other resources.

2. Study the LOC efficiency and Declaration Percent of the total code size. Most of the projects studied have a steady LOC efficiency. This efficiency is almost the same for the different projects, and for the most part it does not vary as releases progress. The efficiency percentage for most projects is between 70% and 80%.

Initially, in software automation development scenarios, we should have a low efficiency, where most of the lines are counted in LOC but not in SLOC, as they are automatically generated by a code generation tool. We should then see the efficiency increase with releases, that is, an increase in SLOC as a share of the total LOC.

The Declaration percentage, the ratio of data declarations to the total code size, indicates how much of the total code size is used for declarations, method headers, and global variables.

3. Release vs. LOC/function. Typically, programs should keep functions to a fixed size of no more than 20-30 LOC [9].

Some of the open source projects may have been developed by different individuals. An increase in function sizes over time indicates planning problems that cause functions to expand or inflate. Of the studied projects, two show a relatively fixed LOC/function (~20), which may indicate stable coding and better prediction and management. The diagram shows that a good percentage of projects reached and sustained a fixed LOC/function.

4. MCDC and nesting per function. MCDC and nesting indicate how many decisions and nodes there are in each function. These are important values for software testing.

5. Code distribution. We studied code distribution by dividing code into three parts: comment lines, declaration and global variable lines, and the rest of the source code.

5. CONCLUSION AND FUTURE WORK
We intend to explore more of the available open source code in order to form hypotheses and build software classification models. We will make more detailed studies to compare individual and company coding styles, and the common or different characteristics of coding among the different programming languages.

6. REFERENCES
1. Free Software Foundation, 2006, 15 May 2006. <http://directory.fsf.org/libs/c/>.
2. Open Watcom.org, 02-2006, 15 May 2006. Openwatcom.org/ftp/source/>.
3. Sun Microsystems, 05-2006. <http://java.sun.com/products/archive/>.
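
The ratios discussed under GOALS AND APPROACHES (LOC efficiency, Declaration percentage, LOC/function, and code distribution) can be sketched in a few lines of Python. This is only an illustration, not SWMetric's implementation. The paper does not give exact formulas, so the sketch assumes that LOC efficiency is SLOC as a percentage of total LOC, that the Declaration percentage is declaration lines as a percentage of total LOC, and that LOC/function divides total LOC by the number of functions. The class name, field names, and sample counts are all hypothetical; the counts are invented and are not measurements from any studied project.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCounts:
    """Line counts for one release. Field names are hypothetical, not SWMetric output."""
    total_loc: int          # all lines counted as LOC
    sloc: int               # lines counted as SLOC
    comment_lines: int      # comment lines
    declaration_lines: int  # declarations, method headers, global variables
    functions: int          # number of functions in the release


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * part / whole


def loc_efficiency(c: ReleaseCounts) -> float:
    """Assumed definition: SLOC as a percentage of total LOC."""
    return percent(c.sloc, c.total_loc)


def declaration_percent(c: ReleaseCounts) -> float:
    """Assumed definition: declaration lines as a percentage of total LOC."""
    return percent(c.declaration_lines, c.total_loc)


def loc_per_function(c: ReleaseCounts) -> float:
    """Assumed definition: total LOC divided by the number of functions."""
    if c.functions <= 0:
        raise ValueError("release must contain at least one function")
    return c.total_loc / c.functions


def code_distribution(c: ReleaseCounts) -> dict:
    """Split the code into comments, declarations, and the rest, as percentages."""
    rest = c.total_loc - c.comment_lines - c.declaration_lines
    return {
        "comments": percent(c.comment_lines, c.total_loc),
        "declarations": percent(c.declaration_lines, c.total_loc),
        "rest": percent(rest, c.total_loc),
    }


# Invented counts for illustration only; not data from the paper.
example = ReleaseCounts(total_loc=1000, sloc=750, comment_lines=150,
                        declaration_lines=100, functions=50)
print(loc_efficiency(example), declaration_percent(example),
      loc_per_function(example), code_distribution(example))
```

Computing these values once per release and plotting them against the release number would give charts of the kind the paper describes, under the assumed definitions above.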
