
Details - Estimating the Development Cost of Open Source Software


Estimating the Development Cost of OSS

This page explains the calculations behind Estimating the Development Cost of Open Source Software: $387B of “Shovel Ready Code” the Private Sector Can Use to Fuel Growth and Innovation.

We estimated the cost to develop all the OSS code in three steps:

1. Find the OSS projects on the Internet
2. Estimate the software lines of code (SLOC) in the OSS
3. Use the COCOMO estimation model to estimate the effort and cost to develop that code

Find the OSS projects on the Internet

Black Duck has invested millions of dollars over the last five years to find and catalog the OSS on the Internet. This information serves as the foundation for our product offerings and we refer to it as the Black Duck KnowledgeBase. The KnowledgeBase tracks over 200,000 open source projects from over 4,000 different websites.

http://www.blackducksoftware.com/oss

Estimate the software lines of code (SLOC) in the OSS

As part of our process of ingesting OSS from the Internet into our KnowledgeBase, we run a SLOC counter over the available code from every release of every open source project we have. SLOC counters are language-aware: they open each recognized source file and categorize each line as source, comment, or blank, based on language-specific pattern recognition.
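The line categorization described above can be sketched in a few lines of Python. This is only an illustration, not the counter used for the study: the `COMMENT_PREFIXES` table and the `count_lines` function are hypothetical names, and the sketch assumes each language has a single-line comment prefix. Real SLOC counters also handle block comments and string literals.

```python
from collections import Counter

# Assumed single-line comment prefixes per language (illustrative only).
COMMENT_PREFIXES = {"python": "#", "c": "//", "java": "//"}


def count_lines(text: str, language: str) -> Counter:
    """Classify each line of `text` as source, comment, or blank."""
    prefix = COMMENT_PREFIXES[language]
    counts = Counter(source=0, comment=0, blank=0)
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(prefix):
            counts["comment"] += 1
        else:
            # A line with code plus a trailing comment still counts as source.
            counts["source"] += 1
    return counts


sample = "# add two numbers\n\ndef add(a, b):\n    return a + b  # trailing comment\n"
print(count_lines(sample, "python"))
```

Only the `source` count would feed into the SLOC totals; comment and blank lines are tracked but excluded.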

To estimate the total SLOC in the OSS, we selected only the most recent released version of each OSS project and aggregated the SLOC count (excluding comments and blank lines) for each programming language used. We filtered out languages such as HTML, CSS, XML, and XMLSchema, whose large source line counts we considered predominantly data representations rather than program code.
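A minimal sketch of this aggregation step follows. The record layout, the sample data, and the function name `total_sloc_by_language` are all hypothetical; the sketch only assumes that each record carries a project name, a release ordinal, a language, and a source-line count that already excludes comments and blank lines.

```python
from collections import defaultdict

# Languages the text treats as data representations, so they are excluded.
EXCLUDED = {"HTML", "CSS", "XML", "XMLSchema"}

# Hypothetical records: (project, release_order, language, source_lines).
releases = [
    ("alpha", 1, "C", 1200),
    ("alpha", 2, "C", 1500),
    ("alpha", 2, "HTML", 9000),
    ("beta", 1, "Python", 800),
    ("beta", 1, "XML", 4000),
]


def total_sloc_by_language(records):
    """Sum source lines per language over each project's latest release."""
    latest = {}
    for project, release, _, _ in records:
        latest[project] = max(latest.get(project, release), release)

    totals = defaultdict(int)
    for project, release, language, sloc in records:
        if release == latest[project] and language not in EXCLUDED:
            totals[language] += sloc
    return dict(totals)


totals = total_sloc_by_language(releases)
print(totals, sum(totals.values()))
```

In this toy data set, the older release of `alpha` and the HTML and XML counts are dropped before the per-language totals are summed.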

COCOMO Estimation Model

For this study, we used Barry Boehm’s widely accepted COnstructive COst MOdel (COCOMO), an algorithmic software cost estimation model that relates the development effort for a program, in person-months, to its source lines of code (SLOC).

http://en.wikipedia.org/wiki/COCOMO

There are different versions of COCOMO: Basic, Intermediate, and Detailed.

Basic COCOMO is a static, single-valued model that computes software development effort (and cost) as a function of program size expressed in estimated lines of code. COCOMO applies to three classes of software projects:

Organic projects are relatively small, simple software projects in which small teams with good application experience work to a set of less than rigid requirements.
Semi-detached projects are intermediate (in size and complexity) software projects in which teams with mixed experience levels must meet a mix of rigid and less than rigid requirements.
Embedded projects are software projects that must be developed within a set of tight hardware, software, and operational constraints.

The basic COCOMO equations take the form:

$E = a_b \,(\mathrm{KLOC})^{b_b}$
$D = c_b \,(E)^{d_b}$
$P = E / D$

where $E$ is the effort applied in person-months, $D$ is the development time in chronological months, KLOC is the estimated number of delivered lines of code for the project (expressed in thousands), and $P$ is the number of people required. The coefficients $a_b$, $b_b$, $c_b$, and $d_b$ are given in the following table.

Software project    a_b    b_b     c_b    d_b
Organic             2.4    1.05    2.5    0.38
Semi-detached       3.0    1.12    2.5    0.35
Embedded            3.6    1.20    2.5    0.32
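As a minimal sketch, the three equations and the coefficient table can be written as a small Python function. The function name `basic_cocomo` and the 32 KLOC example input are illustrative assumptions and do not come from the study.

```python
# Basic COCOMO coefficients (a_b, b_b, c_b, d_b) per project class.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc: float, project_class: str = "organic"):
    """Return (effort in person-months, duration in months, people)."""
    a, b, c, d = COEFFICIENTS[project_class]
    effort = a * kloc ** b        # E = a_b * KLOC^b_b
    duration = c * effort ** d    # D = c_b * E^d_b
    people = effort / duration    # P = E / D
    return effort, duration, people


# Hypothetical 32 KLOC organic project.
effort, duration, people = basic_cocomo(32.0)
print(f"{effort:.1f} person-months, {duration:.1f} months, {people:.1f} people")
```

Passing `"semi-detached"` or `"embedded"` as the second argument selects the corresponding row of the table.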

For the OSS estimate, we used Basic COCOMO. Of its three project classes (organic, semi-detached, and embedded), we chose the organic class because it yields the most conservative, that is lowest, cost estimate. It describes relatively small, simple software projects in which small teams with good application experience work to a set of less than rigid requirements. A possible refinement would be to segment the 200,000 OSS projects into the three COCOMO classes based on an analysis of the type of code in each project. Because the semi-detached and embedded classes have larger effort coefficients, this would lead to a higher cost estimate.
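To illustrate why the organic class is the conservative choice, the sketch below evaluates the Basic COCOMO effort equation for all three classes at the same size. It uses the 4,932,000 KLOC aggregate reported in the cost table; applying a single class to the whole aggregate is a simplification for comparison only, and the variable names are illustrative.

```python
KLOC = 4_932_000  # aggregate size used in the study, in thousands of lines

# (a_b, b_b) effort coefficients from the Basic COCOMO table.
CLASSES = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}

# Effort in person-months for each class at the same size.
efforts = {name: a * KLOC ** b for name, (a, b) in CLASSES.items()}

for name, effort in efforts.items():
    print(f"{name:>13}: {effort / 12:,.0f} person-years")
```

The organic class gives the smallest effort of the three, so any reclassification of projects into the other classes can only raise the total.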

Knowing the number of person-years of development required and the cost of programmers, we can estimate the cost to develop OSS.

Estimated OSS Development Cost

COCOMO Cost Estimates
a_b                                 2.4
b_b                                 1.05
KLOC                                4,932,000

E (person-months)                   25,579,112
E (person-years)                    2,131,593
Average salary                      $75,662
Overhead (wrap rate)                2.40
Total Estimated Development Cost    $387,073,763,266

Using this approach, we estimate that it would cost $387 billion, in 2008 dollars, to develop the available OSS by traditional proprietary means.

According to the Bureau of Labor Statistics, the average salary for a US programmer in July 2008 was $75,662. As the Linux Foundation pointed out in their analysis, one problem with this estimate is that OSS is developed around the world, so using a US-based salary figure is questionable.

In addition to salary, an overhead factor of 2.4 was applied to capture costs above and beyond salary, including testing, equipment, company operating costs, and additional compensation costs such as benefits. The Linux Foundation, David Wheeler, and others have used the same overhead factor, which is also referred to as the wrap rate.
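The arithmetic behind the cost table can be reproduced with a short Python sketch. The inputs are the figures quoted on this page; the variable names are illustrative, and the result may differ slightly from the published total because of rounding in intermediate steps.

```python
A_B, B_B = 2.4, 1.05       # organic-class effort coefficients
KLOC = 4_932_000           # thousands of source lines of code
AVERAGE_SALARY = 75_662    # US dollars per year (2008)
WRAP_RATE = 2.4            # overhead multiplier applied to salary

effort_months = A_B * KLOC ** B_B   # E in person-months
effort_years = effort_months / 12   # E in person-years
total_cost = effort_years * AVERAGE_SALARY * WRAP_RATE

print(f"Effort: {effort_months:,.0f} person-months ({effort_years:,.0f} person-years)")
print(f"Cost:   ${total_cost:,.0f}")
```

The computed effort and cost should land within a small fraction of a percent of the 25,579,112 person-months and $387 billion reported above.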



