Open Source Evolution Analysis

Izzat Alsmadi
Kenneth Magel
Department of Computer Science
North Dakota State University
{izzat.alsmadi, kenneth.magel}@ndsu.edu

ABSTRACT

Source code analysis is important for software management. It enables us to recognize strengths and weaknesses in our earlier projects or releases. We developed a source code analysis tool that gathers several metrics from C/C++, C# or Java source code. In this paper, we use the tool to analyze several open source projects. We study how the selected projects evolve across releases and compare characteristics between releases of the same project, as well as among different projects. Different programming languages and development styles are studied through those open source projects.

General Terms
Source code analysis.

Keywords
Open source, code metrics.

1. INTRODUCTION

The knowledge gathered by software metrics plays an important role in software management. This knowledge can be used to build classification or proxy models that can be applied to future projects or releases. A software metric tool helps us obtain the information required to build such models. The gathered metrics need to be compiled in order to form hypotheses and assumptions about the model. As with the rules used in data mining classification models, the more we apply this model, the more we can tune its assumptions and gain confidence in its results.

The paper analyzes the results of some source code charts from selected open source projects. Those projects are studied through a selected number of their releases.

2. RELATED WORK

Godfrey et al. studied the evolution of LOC across Linux releases [18]. Capiluppi et al. suggested that understandability decreases as time passes, and focused on code and module sizes [19]. Stroulia et al. used CVSChecker to study temporal source code activities [20]. Marjanovic proposed a meta-model framework for code release history systems [21]. Scacchi studied open source game development practices [22]. Raja et al. explored some important software characteristics that contribute to consistent software quality in Linux [23]. Moreno et al. introduced Jeliot3 as a visual source code evaluation tool [13]. Graver evaluated the object-oriented refactoring process and the evolution of a compiler [14].

3. SWMETRIC TOOL

SWMetric is a tool we developed to gather metrics at the function and class level. The tool can analyze C/C++, C# or Java code. It was originally



22nd IEEE International Conference on Software Maintenance (ICSM'06)
0-7695-2354-4/06 $20.00 © 2006
developed as part of a student research project for the Honeywell aviation division.

4. GOALS AND APPROACHES

1. Study the relation between project releases and LOC size variation. We first studied how the LOC of different projects changes across releases.

In traditional approaches, we should not see many new lines of code produced in the more recent releases. The diagram, however, shows instability in the amount of new LOC produced: most of the projects tested show an increase in one release and a decrease in the next. This is expected for open source projects, where there is considerable instability in terms of developers, their abilities, available time and other resources.

2. Study the LOC efficiency and the Declaration Percent of the total code size. Most of the projects studied have a steady LOC efficiency. This efficiency is almost the same across the different projects, and for the most part it does not vary as releases progress. The efficiency percentage for most projects is between 70% and 80%.

In software automation development scenarios, we should initially see a low efficiency, because most of the lines are counted in LOC but not in SLOC, as they are generated automatically by a code generation tool. The efficiency, that is, the share of SLOC lines in the total LOC lines, should then increase with releases.

The Declaration Percent, the ratio of data declarations to the total code size, indicates how much of the total code size is used for declarations, method headers and global variables.

3. Release vs. LOC/function. Typically, the functions of a program should have a fixed size of no more than 20-30 LOC [9].

Some of the open source projects may have been developed by different individuals. An increase in function sizes over time indicates planning problems that cause functions to expand or inflate. Of the studied projects, two show a relatively fixed LOC/function (~20), which may indicate stable coding and better prediction and management. The diagram shows a good percentage of projects that reached and sustained a fixed LOC/function.

4. MCDC and nesting per function. MCDC and nesting indicate how many decisions and nodes there are in each function. These are important values for software testing.

5. Code distribution. We studied code distribution by dividing the code into three parts: comment lines, declaration and global variable lines, and the rest of the source code.

5. CONCLUSION AND FUTURE WORK

We intend to explore more of the available open source code in order to form hypotheses and build software classification models. We will carry out more detailed studies to compare individual and company coding styles, and the common and differing characteristics of coding among the different programming languages.

6. REFERENCES

1. Free Software Foundation, 2006, 15 May 2006. <http://directory.fsf.org/libs/c/>.
2. Open Watcom.org, 02-2006, 15 May 2006. <Openwatcom.org/ftp/source/>.
3. Sun Microsystems, 05-2006. <http://java.sun.com/products/archive/>.
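
Illustrative sketch. The per-release measures discussed under Goals and Approaches (LOC efficiency, Declaration Percent, LOC/function and code distribution) can be written down as simple ratios. The Python sketch below is only one possible reading of those measures and is not the SWMetric implementation. It assumes that LOC efficiency is SLOC divided by total LOC, that Declaration Percent is declaration lines divided by total LOC, and that LOC/function divides total LOC by the number of functions. The class name, function names and sample counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCounts:
    """Line counts for one release. All fields are hypothetical inputs that a
    metrics tool would be expected to supply."""
    release: str
    loc: int          # total lines, including comments and generated code
    sloc: int         # source lines counted toward LOC efficiency (assumption)
    comment: int      # comment lines
    declaration: int  # declaration, method-header and global-variable lines
    functions: int    # number of functions in the release


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def loc_efficiency(c: ReleaseCounts) -> float:
    """Assumed definition: share of SLOC in the total LOC, in percent."""
    return percent(c.sloc, c.loc)


def declaration_percent(c: ReleaseCounts) -> float:
    """Assumed definition: share of declaration lines in the total LOC."""
    return percent(c.declaration, c.loc)


def loc_per_function(c: ReleaseCounts) -> float:
    """Assumed definition: average number of lines per function."""
    if c.functions <= 0:
        raise ValueError("functions must be positive")
    return c.loc / c.functions


def code_distribution(c: ReleaseCounts) -> dict:
    """Split the code into the three parts named in the text: comment lines,
    declaration lines, and the rest of the source code."""
    rest = c.loc - c.comment - c.declaration
    return {
        "comment": percent(c.comment, c.loc),
        "declaration": percent(c.declaration, c.loc),
        "rest": percent(rest, c.loc),
    }


if __name__ == "__main__":
    # Invented counts for illustration only; not data from the studied projects.
    sample = ReleaseCounts("r1", loc=1000, sloc=750, comment=200,
                           declaration=120, functions=40)
    print(sample.release, loc_efficiency(sample), declaration_percent(sample),
          loc_per_function(sample), code_distribution(sample))
```

Computing these values once per release and plotting them against the release number would give charts of the kind the text refers to as "the diagram". The invented sample falls inside the 70-80% efficiency band and the 20-30 LOC/function guideline mentioned above.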





