limo economic analysis

Mobile Open Source Economic Analysis LiMo Foundation White Paper August 2009


Executive Summary

It is possible to use quantitative techniques to examine a number of the proposed economic benefits of open source software. The claimed benefits are a reduced cost of acquisition, access to innovation, and a reduced cost of ownership of software technology.

The quantitative techniques we use to conduct our analysis are based on measuring source lines of code (SLOC) in publicly accessible open source project repositories. To aid our analysis, we have developed a command line tool to mine information on open source projects using the ohloh [1] web service.

Based on this analysis, there is a strong case for constructive engagement with

open source communities where the corresponding open source software

components are used within a collaboratively developed, open mobile software

platform such as the LiMo Platform™.

There is additionally a case for mobile software platform providers to consider

using certain strategic open source projects as the basis for development of new

functionality on their roadmap.

There is no proven case within this analysis for converting existing proprietary items already

within a mobile software platform to open source. To conduct a cost-benefit analysis of that

scenario would require examination of more factors than SLOC alone.


[1] http://www.ohloh.net


1. Introduction

The subject of open source is increasingly important in relation to mobile device platforms and in view of this,

it is vital to understand the underlying economic factors driving the use of open source software in a mobile

context. This paper seeks to move beyond opinion-based debate, by identifying the economic case for open

mobile platforms to acknowledge and embrace their use of open source software and to actively contribute

back changes to open source components modified or adapted within their platform.

This white paper attempts to quantify and corroborate the benefits of using open

source software in mobile platforms in relation to key components which

lie below the mobile commodity line. This line, for our purposes, lies approximately

around the UI framework level of a typical mobile software stack. Components

below the line are considered for this analysis to be commodity software. Above

the line lies the domain of differentiation. The approach we use involves applying economic cost-benefit analysis

techniques where applicable in addition to citing relevant authoritative peer-reviewed material. The following areas

of claimed benefit have been analysed in relation to open source mobile software components around or below

the commodity line:

Reduced cost of software acquisition

Access to software innovation

Reduced cost of software ownership

The analysis of this last area involves trying to quantify the cost to a mobile platform provider of failing to

engage with upstream changes.

Moving away from opinion-based

conjecture towards data-based analysis


2. Adopting open source to reduce the cost of software acquisition

2.1 The COCOMO model

The claim that adopting existing open source technology reduces the cost of software acquisition can be measured using the COnstructive COst MOdel [2] (COCOMO), developed in 1981 [3] by Dr Barry Boehm [4], Emeritus Professor of Software Engineering at USC and a leading software engineering academic. COCOMO has since evolved into an industry standard [5] with respect to software cost metrics. The model computes the cost of software development as a function of the total source lines of code (SLOC) of the corresponding components.

COCOMO has been significantly refined since its inception to reflect the intervening changes in software development methodology and techniques, in particular to acknowledge more iterative approaches which better reflect modern development. The latest version of the model, COCOMO II, contains a number of further adjusting factors and according to the USC Center for Systems and Software Engineering:

“This new, improved COCOMO is now ready to assist professional software cost estimators for many years to come” [6].

The approach taken by the basic COCOMO model is twofold. First, a hierarchy of three different cost models (organic, semi-detached and embedded) is introduced, which is designed to take into account the overhead of development depending on the type of project being analysed. Secondly, COCOMO combines the cost model with a suitable annualized engineer cost/productivity figure to yield the equivalent cost of development within a typical software engineering context. These elements combine in a single regression function as follows, where effort is measured in man-months [7]:

\[ \text{Effort Applied} = a \cdot (\text{KLOC})^{b} \quad \text{[man-months]} \]

\[ \text{Development Time} = c \cdot (\text{Effort Applied})^{d} \quad \text{[months]} \]

\[ \text{People required} = \frac{\text{Effort Applied}}{\text{Development Time}} \quad \text{[count]} \]

The COCOMO Model, developed at USC

and based on measurement of SLOC,

is widely used for estimating software costs.

[2] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[3] Barry Boehm. Software engineering economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[4] http://sunset.usc.edu/Research_Group/barry.html
[5] See US Govt Dept of Defense SoftwareTech estimation site: https://www.thedacs.com/databases/url/key/4
[6] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[7] http://www.amazon.com/Mythical-Month-Essays-Software-Engineering/dp/0201835959


The coefficients in this function vary according to the project type thus:

Software project    a      b      c      d
Organic             2.4    1.05   2.5    0.38
Semi-detached       3.0    1.12   2.5    0.35
Embedded            3.6    1.20   2.5    0.32

(source: Software Cost Estimation With Cocomo II)
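The regression function and coefficient table above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used for the paper: the function name `basic_cocomo` and the 100 KLOC input are assumptions chosen for the example.

```python
# Coefficients (a, b, c, d) per project type, copied from the table above.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, project_type="organic"):
    """Return (effort in man-months, schedule in months, average staff)."""
    a, b, c, d = COEFFICIENTS[project_type]
    effort = a * kloc ** b        # Effort Applied = a * (KLOC)^b
    schedule = c * effort ** d    # Development Time = c * (Effort Applied)^d
    staff = effort / schedule     # People required = Effort / Development Time
    return effort, schedule, staff


if __name__ == "__main__":
    # 100 KLOC is an arbitrary illustrative size, not a figure from the paper.
    for kind in COEFFICIENTS:
        effort, months, staff = basic_cocomo(100, kind)
        print(f"{kind:14s} {effort:8.1f} man-months  {months:5.1f} months  {staff:5.1f} people")
```

Because the exponent b is greater than 1 in every mode, estimated effort grows faster than code size, and the embedded mode always yields the largest estimate for a given size.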

More detailed information on the COCOMO coefficients is available elsewhere [8]. For our purposes, COCOMO data

can be viewed as a recognized and respectable starting point to begin an empirical examination of the potential

benefits that open source offers for mobile platform providers in terms of the cost of software acquisition, access

to innovation and cost of software ownership.

2.2 The application of COCOMO to open source software

The applicability of COCOMO models to open source software was introduced in an influential and well-regarded economic analysis, “Why Open Source? Look At the Numbers!”, written by D. Wheeler in 2002 [9] (and updated regularly since), which remains a widely cited [10] paper in relation to the economics of Linux. The Linux Foundation commissioned some research [11] in Oct 2008 updating Wheeler’s work. For the first calculation, they used the basic (i.e. “organic project”) COCOMO model applied to Fedora 9. Their choice of annualized salary figure was justified as follows:

“To calculate the costs for these distributions, a base salary was found for computer programmers

from the US Bureau of Labor Statistics. According to the BLS, the average salary for a US programmer

in July, 2008 was $75,662.08. This was the salary amount used in our SLOC Count run … the

programmer making the average US salary figure of $75,662.08 is actually costing the employer

$97,604.08 in compensation alone. This is just one piece of the total wrap pie.”

We used a loaded cost of $75,000 per engineer per

annum – the same figure used by the Linux Foundation when they updated Wheeler’s work.

[8] http://www.amazon.com/Software-Cost-Estimation-Cocomo-II/dp/0130266922
[9] http://www.dwheeler.com/oss_fs_why.html
[10] For example: http://abstract.cs.washington.edu/wiki/index.php/Open_Source_and_Search
[11] http://www.linuxfoundation.org/publications/estimatinglinux.php


Combining these factors and applying them to the Fedora 9 source base, the research calculated an equivalent

development cost of $10.78 billion for 204.5 million source lines of code (or SLOC) or in other words, $52/SLOC for

its development up to the current state. Table 1 shows the COCOMO figures taken from this paper and how they

were arrived at by using the coefficients for an organic project.

Total Physical Source Lines of Code (SLOC): 204,500,946

Development Effort Estimate, Person-Years (Person-Months): 59389.53 (712674.36)
  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))

Schedule Estimate, Years (Months): 24.64 (295.68)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))

Total Estimated Cost to Develop: $10,784,484,309
  (average salary = $75,662.08/year, overhead = 2.40)

Table 1: SLOC and estimated production values for Fedora 9 (source: Linux Foundation)

For the Fedora 9 Linux kernel itself, the paper acknowledges that the “organic project” COCOMO model is not

appropriate since:

“the Linux kernel code is typically more complex than an “average” application—among other

things—it requires an analysis that goes beyond the basic COCOMO model. A user space application

like Mozilla, for instance, is much easier to code line by line since it’s abstracted at a much higher

level and has to handle far less tasks. A modern and enterprise-class operating system kernel is

asked to do a great number of extremely complex things, all at once.”

The paper moves on to indicate that an adjusted version of the organic project model is used which takes in the

exponent value from the semi-detached project model instead. The result of this is an upwards revision of the

equivalent cost of development of the 2.6.25 Linux kernel of $1.37 billion for 6.772 million SLOC or $202/SLOC

for its development up to the current state. Table 2 shows the corresponding figures from the paper which

details the use of adjusted COCOMO coefficients:

Total Physical Source Lines of Code (SLOC): 6,772,902

Development Effort Estimate, Person-Years (Person-Months): 7557.4 (90688.77)
  (effort model Person-Months = 4.64607 * (KSLOC**1.12))

Schedule Estimate, Years (Months): 15.95 (191.34)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))

Estimated Average Number of Developers (Effort/Schedule): 473.96

Total Estimated Cost to Develop: $1,372,340,206
  (average salary = $75,662.08/year, overhead = 2.40)

Table 2: SLOC and estimated production values for Linux 2.6.25 kernel (source: Linux Foundation)
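As a cross-check, the Table 2 figures can be recomputed directly from the formulas quoted in the table. The sketch below assumes nothing beyond those formulas and the stated salary and overhead; the function name `cocomo_cost` is hypothetical.

```python
def cocomo_cost(sloc, a, b, c, d, salary, overhead):
    """Apply the effort/schedule formulas from Table 2 to a SLOC count.

    Returns (person_months, schedule_months, developers, total_cost).
    """
    ksloc = sloc / 1000.0
    person_months = a * ksloc ** b                 # effort model
    schedule_months = c * person_months ** d       # schedule model
    developers = person_months / schedule_months   # Effort / Schedule
    # Person-years times annual salary times the overhead multiplier.
    total_cost = (person_months / 12.0) * salary * overhead
    return person_months, schedule_months, developers, total_cost


if __name__ == "__main__":
    pm, months, devs, cost = cocomo_cost(
        sloc=6_772_902,
        a=4.64607, b=1.12,   # adjusted effort coefficients from Table 2
        c=2.5, d=0.38,       # basic COCOMO schedule coefficients
        salary=75_662.08, overhead=2.40,
    )
    print(f"person-months: {pm:,.2f}")
    print(f"schedule (months): {months:,.2f}")
    print(f"developers: {devs:,.2f}")
    print(f"cost: ${cost:,.0f}  (${cost / 6_772_902:,.2f} per SLOC)")
```

The results agree with the published table to within rounding of the inputs.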


Another way of arriving at a cost per SLOC figure would be to consider a similar mobile platform development initiative such as that of Symbian OS. In very rough terms using publicly available data, approximately 1000 [12] staff employed over some 13 years from the Psion EPOC days built what is now the modern Symbian OS. The result is of the order of 20 million lines of source according to the Symbian Foundation [13]. At an average loaded cost of $100,000 per person per year, this equates to a development cost of $1300 million, which yields an equivalent figure of roughly $65/SLOC.

Interpolating between the COCOMO figures derived from the Linux Foundation and

our further estimates, but with a slight bias towards the lower one as we are focusing

on acquisition of primarily middleware/user-level code (albeit low-level/commodity)

rather than kernel software, we arrive at an initial cost/SLOC factor of around $50/SLOC for our calculations in

this paper. We believe that this figure can be reasonably applied to other mainstream open source projects

of relevance to a mobile context in order to conduct a first order estimate of the cost of acquisition of their

corresponding components. Before doing so, the next section addresses how to generate accurate

information about component SLOC.
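The three cost-per-SLOC data points discussed in this section can be recomputed from the totals quoted in the text. The snippet below is only a sketch of that arithmetic; the helper name `cost_per_sloc` is an assumption, and the Symbian inputs are the rough round numbers given above.

```python
def cost_per_sloc(total_cost, sloc):
    """Equivalent development cost per source line of code."""
    return total_cost / sloc


# Totals from Table 1 (Fedora 9) and Table 2 (Linux 2.6.25 kernel).
fedora = cost_per_sloc(10_784_484_309, 204_500_946)
kernel = cost_per_sloc(1_372_340_206, 6_772_902)

# Symbian OS: roughly 1000 staff, 13 years, $100,000 per person per year,
# of the order of 20 million lines of source.
symbian = cost_per_sloc(1000 * 13 * 100_000, 20_000_000)

if __name__ == "__main__":
    for name, value in [("Fedora 9", fedora), ("Linux kernel", kernel), ("Symbian OS", symbian)]:
        print(f"{name:12s} ${value:,.2f}/SLOC")
```

The $50/SLOC working figure adopted in the text sits just below the two user-level data points and far below the kernel figure, consistent with the stated bias towards middleware rather than kernel code.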

2.3 ohloh.net open source code analytics web service

The www.ohloh.net service was launched in 2007 with the specific aim of providing accurate and detailed software

metrics on existing open source projects derived from data mining the corresponding open source code bases.

In particular, ohloh yields extensive information about the evolution of SLOC over the duration of a project’s lifetime. It is possible to do this for open source projects because this information is available in the corresponding version control system logs. The ohloh service has compiled metadata on more than 300,000 major open source projects including (among many others) GTK, GStreamer, WebKit and Android. It uses a sophisticated source code parsing engine called ohcount [14] for processing the corresponding source code available in a public repository; svn, cvs and git version control systems are all supported. The ohloh data is available through a comprehensive and well-documented, free-to-use web service API [15] once a visitor signs up for a developer key. Ohloh was recently acquired by SourceForge [16].

All ohloh code metrics are accessible through a RESTful [17] web service API which returns data as XML. In order to support our research for this paper, we developed and have open sourced a command-line, Python-based tool [18] which retrieves a variety of information about a particular open source project through the ohloh web service API, parses that information, and presents it in an auto-generated Excel spreadsheet. The results retrieved from this tool form the core of the analytical data in this paper.

[12] http://www.gillamorstephens.com/content/en/item_details_core.aspx?guid=AD2DB7B8-FE01-4DF8-A75C-492163FE94FD
[13] http://blog.symbian.org/2009/07/28/oscon-impressions/
[14] http://labs.ohloh.net/ohcount
[15] https://www.ohloh.net/api/getting_started
[16] https://www.ohloh.net/announcements/sourceforge_acquires_ohloh
[17] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

SLOC for major open source projects can be obtained from ohloh web services. http://www.ohloh.net

COCOMO gives us an indicative figure of $50/SLOC for user side system code
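To make the parsing step concrete, the sketch below turns an XML payload of monthly code metrics into Python records using the standard library. It is not the pyohloh implementation: the element names in `SAMPLE_XML` are illustrative placeholders rather than the documented ohloh schema, and no network request is made. The two sample rows reuse GTK figures that appear later in Table 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body. The element names are assumptions made for
# this example and should be checked against the real API documentation.
SAMPLE_XML = """<response>
  <status>success</status>
  <result>
    <size_fact><month>2009-06-01</month><code>560721</code><man_months>3341</man_months></size_fact>
    <size_fact><month>2009-07-01</month><code>563135</code><man_months>3361</man_months></size_fact>
  </result>
</response>"""


def parse_size_facts(xml_text):
    """Return one dict per month with the code size and cumulative man-months."""
    root = ET.fromstring(xml_text)
    if root.findtext("status") != "success":
        raise ValueError("web service reported an error")
    rows = []
    for fact in root.iter("size_fact"):
        rows.append({
            "month": fact.findtext("month"),
            "code": int(fact.findtext("code")),
            "man_months": int(fact.findtext("man_months")),
        })
    return rows


if __name__ == "__main__":
    for row in parse_size_facts(SAMPLE_XML):
        print(row)
```

Records in this shape can then be written to a spreadsheet or fed into the cost calculations used in the rest of the paper.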

The remainder of this section examines the code analytics for four important open source projects which are

used within a GNOME based mobile Linux platform such as the LiMo Foundation Platform.

2.4 GTK analysis

GTK (the GIMP Toolkit) is the core application framework used in the LiMo Foundation Platform. It is a mature

project which forms the basis of the GNOME Linux Desktop UI and has had over 700 contributors working on it

over more than a decade. Using the pyohloh script, a graph illustrating the evolution of GTK over the past nine

years to the present day can be generated as shown in Figure 1.

Figure 1: Graph of GTK code history over time (source: www.ohloh.net)

Currently, GTK comprises some 600,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent

engineering cost of $30 million to develop this technology from scratch.

Note the smooth gradient of this graph over the last decade. This is a clear characteristic of community-grown source code. It evolves gradually in line with a spiral [19] or iterative development approach. The dips and bumps noticeable on closer examination of the graph are not analysed in this paper but they would typically reflect refactoring activity [20].

[18] http://sourceforge.net/projects/pyohloh/
[19] http://www.computer.org/portal/cms_docs_computer/computer/homepage/misc/Boehm/r5061.pdf
[20] The history of the Telepathy open source project is a good example of this.

It would cost $30M to develop GTK

from scratch


2.5 WebKit analysis

WebKit is an open source web rendering engine used within the LiMo Foundation Platform. It is a mature project

with 128 contributors working on it over several years. The project was kick-started in mid-2005 by an injection

of code from Apple (who in turn had bootstrapped from the Konqueror KDE Desktop Browser) and has, since

then, evolved through various versions of the Mac OS X Safari browser and other projects such as Google’s

Chrome. Using the pyohloh script, we were able to generate a graph illustrating the evolution of WebKit’s code

base over its lifetime. This graph is displayed in Figure 2.

Figure 2: Graph of WebKit code history over time (source: www.ohloh.net)

Currently, WebKit comprises some 1.78 million SLOC. Using our $50/SLOC factor, this equates

to an equivalent engineering cost of $89 million to develop this technology from scratch.

It would cost $89M to develop WebKit

from scratch


2.6 GStreamer analysis

GStreamer is a media framework for delivering video and audio and is used in the LiMo Foundation Platform. It

is a mature project with over 420 contributors working on it since 2002. The code base has shipped in various

mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh script, we were

able to generate a graph illustrating the evolution of the GStreamer code base over the course of its existence.

This graph is displayed in Figure 3.

Figure 3: Graph of GStreamer code history over time (source: www.ohloh.net)

Currently, GStreamer comprises some 911,000 SLOC. Using our $50/SLOC factor, this equates

to an equivalent engineering cost of $45.5 million to develop this technology from scratch.

It would cost $45M to develop GStreamer

from scratch


2.7 BlueZ analysis

BlueZ is the standard Linux Bluetooth stack which is used as the base Bluetooth stack in the LiMo Foundation

Platform. It is a mature project with 49 contributors working on it since 2002. The code base has shipped in

various mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh

script, we were able to generate a graph illustrating the evolution of the BlueZ code base over the course of its

existence. This graph is displayed in Figure 4.

Figure 4: Graph of BlueZ code history over time (source: www.ohloh.net)

Currently, BlueZ comprises some 105,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent

engineering cost of $5.25 million to develop this technology from scratch.

2.8 Acquisition benefits for a mobile platform provider

As previously indicated, the four open source components analysed in this section (GTK, WebKit, GStreamer and

BlueZ) are used within the LiMo Platform. Using the figures calculated above, the combined cost of engineering

functionalities implemented by these four components alone from scratch comes close to $170 million. Note this

figure does not include the cost of implementing dependencies.
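The combined figure is a straightforward sum of the four per-component estimates. A minimal sketch of that arithmetic, using the SLOC counts and the $50/SLOC factor quoted in this section (the variable names are illustrative only):

```python
COST_PER_SLOC = 50  # working factor adopted in section 2.2, in US dollars

# Approximate SLOC counts quoted in sections 2.4 to 2.7.
COMPONENT_SLOC = {
    "GTK": 600_000,
    "WebKit": 1_780_000,
    "GStreamer": 911_000,
    "BlueZ": 105_000,
}

# Equivalent engineering cost per component and in total.
component_cost = {name: sloc * COST_PER_SLOC for name, sloc in COMPONENT_SLOC.items()}
total_cost = sum(component_cost.values())

if __name__ == "__main__":
    for name, cost in component_cost.items():
        print(f"{name:10s} ${cost / 1e6:6.2f} million")
    print(f"{'Total':10s} ${total_cost / 1e6:6.2f} million")
```

WebKit alone accounts for a little over half of the total, so the combined estimate is most sensitive to the accuracy of its SLOC count.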


3. Adopting open source to enable access to software innovation

The total number of open source projects being undertaken globally at present is huge [21]. However, relatively

few from this vast sea of potential will be both: a) active beyond a single developer and b) of direct interest to

mobile device manufacturers today. Nonetheless, it is important to consider this backdrop

as a source of real innovation because what may appear to be an unimportant project today

may become of great significance in relation to future mobile technology in a relatively

short period of time. A good example is WebKit - it has become the de facto standard web

rendering engine on mobile devices within a few years of its inception. Rather than rejecting promising projects

for being incomplete, significant cost savings may be possible by starting from the corresponding source base

rather than beginning from scratch:

“The companies and individuals, who work on Linux-related projects, build this value profit by sharing the development burden with their peers (and sometimes competitors.) Increasingly it’s becoming clear that shouldering this research and development burden individually, as Microsoft has done, is an expensive approach to building software.” [22]

There are numerous other examples that have evolved to become very important in a mobile context, from

individual components (e.g. BlueZ, OpenObex, D-Bus, Telepathy/Farsight) through to entire open source

platforms (e.g. Android, Maemo).

In this section we will examine the following three projects in greater detail:

Clutter - open source, advanced UI framework being driven by Intel as a core part of their Moblin platform

oFono – open source telephony framework being driven by Nokia and Intel

GeoClue – open source location framework endorsed by GNOME Mobile

These projects have been chosen as purely indicative examples of innovative work that have the potential to be

included as standard components in future mobile Linux devices. All these selected projects address areas of

technology that are either below the mobile commodity line or are in the process of falling

below it. Our analysis will focus on the development momentum behind these projects and

the potential saving to be gained from using the corresponding source code as a starting

point for further development. It is also worth noting that engaging constructively with a

major field of innovation may result in far greater commercial return than the raw offset in engineering cost.

On the other hand, engineering cost is only one consideration in a decision of this nature; cost of technology

evaluation, selection and engineering learning curve are also factors which we do not take into account here.

Innovation flows from unexpected places

The mobile commodity line is shifting upwards

[21] ohloh alone indexes more than 300,000 projects
[22] http://www.linuxfoundation.org/publications/estimatinglinux.php


3.1 Clutter analysis

Clutter is an open source library for creating fast, visually rich and animated user interfaces. It forms the basis

of the advanced UI framework in Intel’s Moblin mobile Linux platform. It is a mature project that was started at

leading UK-based open source development house OpenedHand [23], which has since been acquired by Intel [24].

Various blog posts by ex-OpenedHand staff suggest that significant development is being done around Clutter

within Intel. Using the pyohloh script, we were able to generate a graph illustrating the evolution of Clutter’s

code base over the course of its existence. This graph is displayed in Figure 5.

Figure 5: Graph of Clutter code history over time (source: www.ohloh.net)

The gradient of this graph suggests a project with significant development velocity (~35kSLOC/year), implying that it has not been materially affected by the Intel acquisition. This rate of development constitutes a substantial capital investment on the part of Intel, and Clutter is clearly a project to keep an eye on.

Currently, Clutter comprises some 86,600 SLOC. Using our $50/SLOC factor, this equates to an equivalent

engineering cost of $4.33 million to develop this technology from scratch.

[23] http://www.o-hand.com
[24] http://www.linuxtoday.com/developer/2008082802735NWHWSW

$4.3M invested in Clutter to date


3.2 oFono analysis

The oFono open source project was recently unveiled [25] as a joint collaboration between Intel and Nokia and

has generated significant interest in the mobile industry. The project aims to build a world class open source

telephony stack for mobile Linux devices to be used in Intel’s Moblin platform as well as Nokia’s Maemo platform.

Using the pyohloh script, we were able to generate a graph illustrating the evolution of oFono’s code base over

the course of its existence. This graph is displayed in Figure 6.

Figure 6: Graph of oFono code history over time (source: www.ohloh.net)

The profile of the contribution curve indicates that this project was kick-started by a flurry

of coding and possibly a code contribution. Since its inception, activity has returned to a

more characteristic open source development gradient. One other noteworthy point is that

by examining the contributor data output by our script, we were able to confirm that a key

contributor is Marcel Holtmann, who is also a lead committer to BlueZ. Information relating to top committers is

highlighted in Table 3. Note that we have not refined our tool to examine commit sizes.

Contributor ID     Account Name          Contributor Name      Man months   Commits
1457041885407693   Denkenz               Denis Kenzior         3            176
1457041885371924   Marcel Holtmann       Marcel Holtmann       4            30
1457044032859329   ?                     Andrzej Zaborowski    2            20
1457041885368986   Rémi Denis-Courmont   Rémi Denis-Courmont   2            14
1457041885412800   Akiniemi              Aki Niemi             1            10

Table 3: oFono top contributors by commit (source: pyohloh)

Currently, the oFono project comprises 21,912 SLOC. Using our $50/SLOC factor, this corresponds to an

equivalent engineering cost of $1.1 million to develop this technology from scratch. Clearly, in spite of the

impressive commitment, oFono is at a very early stage at present judging by the evolution of the code base to

date and the small number of continuously active committers.

[25] http://www.unwiredview.com/2009/05/12/oFono-nokia-intel-start-a-new-linux-project-against-android/

$1.1M invested in oFono to date


3.3 GeoClue analysis

The GeoClue open source project delivers a geographic information service via D-Bus to client side applications. The backend information can potentially come from a number of geo-information sources (e.g. GPS or geoIP address). The project has been used to build utilities such as the Clutter libchamplain [26] library and is a technology earmarked for future inclusion in the GNOME Mobile stack. Using the pyohloh script, we were able to generate a graph illustrating the evolution of GeoClue’s code base over the course of its existence. This graph is displayed in Figure 7.

Figure 7: Graph of GeoClue code history over time (source: www.ohloh.net)

Note that from examination of other active open source projects, a plateau in terms of code activity is typically

an indication of a stalled development rather than a sign that the project is finished. It turns

out from looking at the contributor data that there is only one major developer, who does

not appear to be very active. This was a surprise given that GeoClue is a relatively high

profile GNOME project. Nonetheless, it is valuable to learn this information.
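The plateau heuristic described above can be automated once a monthly SLOC series is available. The following is a rough sketch under stated assumptions: the function name `stalled_months`, the six-month window and the 1% tolerance are arbitrary choices for illustration, and the sample series is invented rather than taken from the GeoClue history.

```python
def stalled_months(code_sizes, window=6, tolerance=0.01):
    """Return indices of months whose code size differs by less than
    `tolerance` (as a fraction) from the size `window` months earlier."""
    stalled = []
    for i in range(window, len(code_sizes)):
        baseline = code_sizes[i - window]
        if baseline and abs(code_sizes[i] - baseline) / baseline < tolerance:
            stalled.append(i)
    return stalled


if __name__ == "__main__":
    # Invented monthly SLOC series: early growth followed by a flat tail.
    history = [8000, 9500, 11000, 12000, 12300, 12330,
               12335, 12338, 12338, 12338, 12338, 12338]
    print(stalled_months(history))
```

A flagged month is only a prompt for closer inspection: a short plateau can also follow a large refactoring, so the contributor data should be checked before concluding that a project has stalled.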

Currently, the GeoClue project comprises 12,338 SLOC. Using our $50/SLOC factor, this equates to an equivalent

cost of $0.62 million to develop this technology from scratch.

$0.6M invested in GeoClue to date

[26] http://projects.gnome.org/libchamplain/


4. Adopting open source to reduce cost of software ownership

Peer-reviewed literature exists [27] to support the claim that maintenance costs dominate software total cost of

ownership (TCO) but our aim is to support this claim by looking at commit data derived from actual open source

projects. In this section, we will continue the forensic analysis of the code base of two of the same open source

projects we examined in section 2 using the output of our pyohloh script to obtain further information about

the number of developers working on these projects, their commits over time, the proportion of changes that

constitute maintenance and the corresponding proportion that could be considered as original development.

[27] http://users.jyu.fi/~koskinen/smcosts.htm


4.1 GTK analysis

In relation to the GTK code history graph highlighted in Figure 1, an important milestone of note was the release of GTK v2.12 [28] in Sept 2007. Since that release, as Table 4 illustrates, GTK development has continued. For a platform that forked GTK 2.12 and chose not to update it with upstream changes, this further GTK development can be considered to constitute unleveraged potential. We can quantify the delta to yield an upper bound on the value of those subsequent upstream contributions. Note that there is no easy way to differentiate between maintenance and new features within unleveraged potential; both form part of the ‘forking tax’ the platform provider incurs by ignoring upstream.

Month Code Comments Blanks Commits Man Months Delta Man Months

01-09-2007 502697 96897 110101 12140 2752 22

01-10-2007 503262 96956 110247 12195 2771 19

01-11-2007 504825 97593 110449 12290 2799 28

01-12-2007 543111 103835 118349 12435 2844 45

01-01-2008 543764 104006 118521 12520 2868 24

01-02-2008 544540 104081 118681 12623 2889 21

01-03-2008 532912 101912 116092 12788 2924 35

01-04-2008 533430 101959 116160 12833 2941 17

01-05-2008 535693 102461 116699 12991 2976 35

01-06-2008 529833 102411 115387 13359 3021 45

01-07-2008 540055 103289 117049 13505 3059 38

01-08-2008 541936 103932 117410 13701 3098 39

01-09-2008 543399 104243 117692 13824 3131 33

01-10-2008 544106 104206 117846 13916 3156 25

01-11-2008 545548 104931 118125 13978 3173 17

01-12-2008 549378 106453 118903 14165 3194 21

01-01-2009 553943 107572 120014 14433 3219 25

01-02-2009 553931 107685 120011 14572 3245 26

01-03-2009 554353 107715 120101 14611 3262 17

01-04-2009 555396 107830 120303 14672 3286 24

01-05-2009 558435 108139 120894 14741 3313 27

01-06-2009 560721 108810 121376 14860 3341 28

01-07-2009 563135 109647 121930 14957 3361 20

Table 4: GTK month by month commit details since release 2.12 (source: pyohloh)

We can use the data in Table 4 to quantify this unleveraged potential in two ways. First we can look at the delta

code size and associate an engineering cost to it. Secondly we can look at the time spent in terms of delta man-

months.

[28] http://mail.gnome.org/archives/gtk-devel-list/2007-September/msg00052.html


In terms of delta code size, GTK’s code size has increased from 502697 in Sept 2007 to 563135 in July 2009.

Using our $50/SLOC factor, this equates to an engineering cost of $3.02 million to develop this technology

independently. This figure is likely to be on the low side because GTK was already substantially advanced in Sept

2007, so any work to enhance/modify it would be complex by nature.

In terms of delta man-months, it is worth noting that the figures in the delta man-month column remain at a broadly steady level throughout the period, highlighting the ongoing maintenance burden. The overall man-months spent on the project between GTK 2.12 and the present went from 2752 to 3361, that is 609 man-months. Using the earlier figure of $75,000 per developer per year, this equates to $3.8 million unadjusted and $9.12 million using the COCOMO 2.4 overhead factor. Averaging between the two results gives us a conservative estimate of $6 million of unleveraged potential between GTK 2.12 and GTK candidate 2.18. This finally puts a figure on the price of forking GTK from 2.12 and not synchronising/engaging with upstream development from that point. If a decision to synchronise is made later, there will be an additional re-engineering cost to make this happen.
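The two ways of costing the GTK delta can be recomputed from the first and last rows of Table 4. The sketch below is illustrative only: the function name `unleveraged_potential` is hypothetical, and taking the midpoint of the lowest and highest estimate is just one reading of the averaging step, adopted here as an assumption. Working from unrounded inputs gives an overhead-adjusted figure slightly above the $9.12 million quoted in the text, which multiplies the rounded $3.8 million by 2.4.

```python
COST_PER_SLOC = 50       # dollars per SLOC, from section 2.2
ANNUAL_COST = 75_000     # dollars per developer per year
OVERHEAD = 2.4           # overhead multiplier used in the text


def unleveraged_potential(sloc_start, sloc_end, mm_start, mm_end):
    """Cost the delta between two points on a project's history.

    Returns (by_code, by_effort, by_effort_loaded, midpoint), in dollars.
    """
    by_code = (sloc_end - sloc_start) * COST_PER_SLOC
    by_effort = (mm_end - mm_start) / 12 * ANNUAL_COST
    by_effort_loaded = by_effort * OVERHEAD
    estimates = (by_code, by_effort, by_effort_loaded)
    # Assumption: "averaging" is taken as the midpoint of the lowest and
    # highest of the three estimates.
    midpoint = (min(estimates) + max(estimates)) / 2
    return by_code, by_effort, by_effort_loaded, midpoint


if __name__ == "__main__":
    # First and last rows of Table 4 (Sept 2007 and July 2009).
    results = unleveraged_potential(502_697, 563_135, 2752, 3361)
    for label, value in zip(("code size", "effort", "effort with overhead", "midpoint"), results):
        print(f"{label:22s} ${value / 1e6:.2f} million")
```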

If we were to move a year along the development curve, we reach GTK 2.14, released in Sept 2008 [29]. Using the same approach as above, we go from 3131 to 3361 = 230 man-months of unleveraged potential from that point to the current version of GTK at the time of writing. This corresponds to over a third of the full unleveraged potential between Sept 2007 and now, or $2.3 million.

As a final data point, we can look at the corresponding developer activity graph in Figure 8 which gives us a relatively

constant maintenance load of around 25 resources per year committing to the GTK mainline over the last couple of

years (equating to approx $1.8 million/year). This graph was generated from the delta man-months column in Table 4.

Figure 8: Graph of GTK developer activity over last two years (source: pyohloh)
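The maintenance load can be read off the delta man-months column of Table 4, since a delta of N man-months in one calendar month corresponds to roughly N full-time developers active in that month. The short sketch below computes the mean of that column; the variable names are illustrative, and the computed mean comes out a little above the rounded figure of 25 used in the text.

```python
from statistics import mean

ANNUAL_COST = 75_000  # dollars per developer per year, as used earlier

# Delta man-months column of Table 4, Sept 2007 to July 2009 inclusive.
DELTA_MAN_MONTHS = [22, 19, 28, 45, 24, 21, 35, 17, 35, 45, 38, 39,
                    33, 25, 17, 21, 25, 26, 17, 24, 27, 28, 20]

# Average number of full-time-equivalent developers per month.
average_load = mean(DELTA_MAN_MONTHS)

# Annual cost of sustaining that load at the unadjusted developer cost.
annual_cost = average_load * ANNUAL_COST

if __name__ == "__main__":
    print(f"average load: {average_load:.1f} developers")
    print(f"annual cost:  ${annual_cost / 1e6:.2f} million")
```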

Unleveraged potential cost of $6M for GTK

within a 2 year period

[29] http://mail.gnome.org/archives/gtk-devel-list/2008-September/msg00024.html

4.2 WebKit analysis

The WebKit code history graph highlighted in Figure 2 clearly shows the point in mid-2005 when Apple

announced the open sourcing of WebKit. Since then, the graph has illustrated the characteristic upward curve

of a healthy open source project where code is being continuously evolved, enhanced and added to. In fact,

WebKit is an interesting open source project in that it doesn’t operate fixed releases at all but is available as

a continuously moving svn codeline. Nonetheless, if we were to take a fork at Nov 2007 when HTML5 Media

support was added [30], as Table 5 illustrates, a manufacturer building on the Nov 2007 base

and not updating with subsequent changes potentially missed out on 1786845 – 1016544 =

770,301 delta SLOC worth of unleveraged potential. Using our $50/SLOC factor, this equates

to an equivalent cost of $38.5 million to develop this technology from scratch on top of

the WebKit v1.0 source base. As with our GTK analysis above, we can look at the delta man-

months too and Table 5 shows 7394 - 4109 = 3285 man months of presumed beneficial

evolution of the source base. Using the earlier figure of $75000 per developer per year, this equates to $20.5

million unadjusted and $49.3 million using the COCOMO 2.4 overhead factor. Averaging the SLOC-based and

overhead-adjusted results gives us a conservative estimate of $44 million of unleveraged potential between WebKit at Nov 2007

and the current WebKit head revision.

[30] http://webkit.org/blog/140/html5-media-support/

Unleveraged potential cost of $44M for WebKit within a

2 year period

Month       Code     Comments  Blanks  Commits  Man Months  Delta Man Months
01-11-2007  1016544  322867    254297  28684    4109        134
01-12-2007  1044098  328069    259268  29508    4252        143
01-01-2008  1071166  340421    264312  30372    4390        138
01-02-2008  1457449  386764    302370  31175    4530        140
01-03-2008  1483810  399789    306929  32074    4667        137
01-04-2008  1506161  404098    311594  33040    4806        139
01-05-2008  1527013  407883    316609  33924    4949        143
01-06-2008  1534969  408878    317762  34698    5091        142
01-07-2008  1551059  412792    320853  35410    5235        144
01-08-2008  1566371  414854    324016  36124    5367        132
01-09-2008  1595954  420546    329897  37325    5528        161
01-10-2008  1607483  422141    332694  38364    5696        168
01-11-2008  1623679  426169    336570  39241    5857        161
01-12-2008  1642825  428862    339465  40070    6007        150
01-01-2009  1678058  440574    348052  41194    6200        193
01-02-2009  1691983  444456    351355  42064    6380        180
01-03-2009  1709246  448433    354963  43086    6573        193
01-04-2009  1731266  452352    359429  44230    6784        211
01-05-2009  1747419  455081    362749  45309    6988        204
01-06-2009  1816948  460252    371543  46488    7216        228
01-07-2009  1786845  469754    378470  47106    7394        178

Table 5: WebKit month by month commit details Nov 2007-July 2009 (source: pyohloh)
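
The WebKit figures can be rederived directly from Table 5. The Python sketch below copies three columns from the table, checks that each delta equals the month-on-month change in cumulative man-months, and recomputes the cost estimates from the first and last rows. The names are our own, and the blended figure again assumes an average of the SLOC-based and overhead-adjusted estimates.

```python
# (month, code SLOC, cumulative man-months, delta man-months) copied from Table 5.
TABLE_5 = [
    ("01-11-2007", 1016544, 4109, 134), ("01-12-2007", 1044098, 4252, 143),
    ("01-01-2008", 1071166, 4390, 138), ("01-02-2008", 1457449, 4530, 140),
    ("01-03-2008", 1483810, 4667, 137), ("01-04-2008", 1506161, 4806, 139),
    ("01-05-2008", 1527013, 4949, 143), ("01-06-2008", 1534969, 5091, 142),
    ("01-07-2008", 1551059, 5235, 144), ("01-08-2008", 1566371, 5367, 132),
    ("01-09-2008", 1595954, 5528, 161), ("01-10-2008", 1607483, 5696, 168),
    ("01-11-2008", 1623679, 5857, 161), ("01-12-2008", 1642825, 6007, 150),
    ("01-01-2009", 1678058, 6200, 193), ("01-02-2009", 1691983, 6380, 180),
    ("01-03-2009", 1709246, 6573, 193), ("01-04-2009", 1731266, 6784, 211),
    ("01-05-2009", 1747419, 6988, 204), ("01-06-2009", 1816948, 7216, 228),
    ("01-07-2009", 1786845, 7394, 178),
]

COST_PER_SLOC = 50          # USD per source line of code
COST_PER_DEV_YEAR = 75_000  # USD per developer per year
OVERHEAD_FACTOR = 2.4       # overhead multiplier from the text

# Each delta should equal the change in cumulative man-months since the prior row.
for prev, cur in zip(TABLE_5, TABLE_5[1:]):
    assert cur[2] - prev[2] == cur[3], f"inconsistent delta for {cur[0]}"

first, last = TABLE_5[0], TABLE_5[-1]
delta_sloc = last[1] - first[1]
delta_mm = last[2] - first[2]

sloc_cost = delta_sloc * COST_PER_SLOC
adjusted_cost = delta_mm / 12 * COST_PER_DEV_YEAR * OVERHEAD_FACTOR
# Assumption: the blended figure averages the two estimates above.
blended = (sloc_cost + adjusted_cost) / 2

print(f"delta SLOC: {delta_sloc}, delta man-months: {delta_mm}")
print(f"SLOC-based: ${sloc_cost / 1e6:.1f}M, adjusted: ${adjusted_cost / 1e6:.1f}M, "
      f"blended: ${blended / 1e6:.1f}M")
```

Note that the endpoint calculation depends only on the first and last rows, so the jump in code size at Feb 2008 and the dip at July 2009 affect the result only through the final row.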

The unleveraged potential figures are so huge for WebKit that it is clearly very important for an OEM to have a

maintenance strategy in place up front if they want to include WebKit in their product. This is clearly visible by

looking at the corresponding developer activity graph shown in Figure 9 showing an amortized maintenance

load of approximately 200 engineers per year (equating to approximately $15 million/year). This graph

was generated from the delta man-months column in Table 5. Note that the curve is on an upward gradient

demonstrating that WebKit is gaining developer traction.
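
The maintenance load can be estimated from the delta man-months column of Table 5, since one man-month of effort delivered in one calendar month corresponds to one full-time engineer. The sketch below is a minimal illustration: the six-month window is a hypothetical choice of ours, and the text does not state how its amortized figure was derived.

```python
from statistics import mean

COST_PER_DEV_YEAR = 75_000  # USD per developer per year, as used in the text

# Delta man-months per month, Nov 2007 to July 2009, copied from Table 5.
DELTA_MAN_MONTHS = [
    134, 143, 138, 140, 137, 139, 143, 142, 144, 132, 161,
    168, 161, 150, 193, 180, 193, 211, 204, 228, 178,
]

# Average number of engineers active per month over each window.
full_period = mean(DELTA_MAN_MONTHS)
recent = mean(DELTA_MAN_MONTHS[-6:])  # hypothetical six-month window

for label, engineers in (("full period", full_period), ("last 6 months", recent)):
    yearly_cost = engineers * COST_PER_DEV_YEAR
    print(f"{label}: ~{engineers:.0f} engineers, ~${yearly_cost / 1e6:.1f}M/year")
```

On these numbers the full-period mean is somewhat lower than the most recent months, which sit close to the figure of approximately 200 engineers quoted above. The gap reflects the upward gradient of the curve.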

Figure 9: Graph of WebKit developer activity over last two years (source: pyohloh)

4.3 Maintenance of open source software and community engagement

Using the figures we have uncovered, it is possible to make some quantitatively-backed statements regarding

open source software cost of ownership and the related economic benefits of engaging with corresponding

open source projects and communities. We can now support the following assertions:

Healthy open source projects have a characteristic progressive cost profile in relation to maintenance – in a

sense, they’re never finished but continue evolving ‘upstream’.

The cost of forking and losing connection with upstream development is twofold: i) the

corresponding cost of presumed beneficial unleveraged potential, ii) the further cost of

having to re-engineer modified forked code in the future to accommodate the inevitable

eventual re-sync with upstream. We quantified the former to show that the figures run into

$millions for important components such as GTK, WebKit, GStreamer and BlueZ.

Emergency “deforking” also

incurs cost!

Accommodating upstream development within the context of an open mobile platform is a key way to reduce

the cost of unleveraged potential.

It is important that mobile industry platform providers engage with the open source communities as early

as possible so that platform maintenance strategy is fully aligned with the upstream development agenda of

these communities, which is far more cost efficient than managing the entire maintenance burden in-house.

In practical terms, a strategy of engagement is bilateral. It involves actively working patches back into

community source and trying to influence the direction of the project.

Nevertheless, we have to acknowledge the reluctance on the part of some major mobile industry players to

depend on an unpredictable and intangible community for a key deliverable when mission critical commercial

release is at stake.

We also need to understand that the benefits of community engagement are not immediately visible or linear

– engagement for purposes of strategic alignment with product development is likely to achieve measurable

benefits only over the medium to longer term. It is more about investing in the relationship

to gain future value. In any case, it is not possible to divest entirely of the need for

engineering resource – some engineers will always be required to integrate, test and modify,

but engagement does offer a mechanism for maximal gain from the community through

maintenance and innovation beyond just initial acquisition. This gain is quantifiable and

bounded in upper terms by the unleveraged potential figures. Nokia seem to understand this and have made

some endorsements to this effect:

“The one who invests most has the biggest influence. If a company has a large group of developers,

it will create more and better proposals and those proposals will take the day.” [31]

Nokia’s open source site, http://opensource.nokia.com, is evidence of this in operational practice. There are some

24 showcase open source projects being sponsored and linked from that site including:

S60 WebKit [32]: Port of WebKit to Nokia S60 platform

PyS60 [33]: Python interpreter for Nokia’s S60 platform

Mobile Web Server [34]: Nokia’s port of Apache web server to S60 platform

[31] http://www.mobilemonday.net/news/nokia-finds-value-in-open-source-community

[32] http://opensource.nokia.com/projects/S60browser/index.html

[33] http://opensource.nokia.com/projects/pythonfors60/index.html

[34] http://opensource.nokia.com/projects/mobile-web-server/index.html

TCO control requires upstream

engagement

4.4 Open source maintenance in a commercial mobile context

It is possible to use the ohloh web service to compare the maintenance of commercially driven open source

developments such as oFono, Android and WebKit to more community-sponsored projects such as GTK,

GStreamer and BlueZ. One characteristic difference between them is that commercially developed open source

is often seeded by the injection of large quantities of code into an open repository. This can be clearly seen by

examining Android’s code history shown in Figure 10. Note that the corresponding git repository for this code

history is git://android.git.kernel.org/platform/bionic.git which consists of some 3 million SLOC mainly injected

around two points during the past year.

Figure 10: Graph of Android code history over time (source: www.ohloh.net)

Maintenance beyond the point of injection typically continues to be undertaken mainly by the commercial

entity itself. This was clear from looking at the WebKit contributor details – nearly all the top 25 contributors have

Apple email addresses. It is also interesting to look at the list of Android contributors. The top 8 are highlighted

in Table 8 below. The top committer by far is “The Android Open Source Project” which is a vehicle for an internal

Google engineering team. By doing internet searches, we were able to determine that the remaining individuals

involved also appear to be Google employees and quite probably the key gatekeepers for Google-driven

commits to the project [35].

[35] For example, Jean-Baptiste Queru can be confirmed as a Google employee here: http://www.linkedin.com/in/jbqueru; Raphael Moll likewise here: http://www.linkedin.com/pub/raphaël-moll/0/2b9/2ab; Xavier Ducrohet likewise here: http://www.linkedin.com/pub/xavier-ducrohet/0/265/4b7; and so on…

Contributor ID  Account Name  Contributor Name                 Man months  Commits
41650445438243  ?             The Android Open Source Project  5           323
41650445447227  ?             Jean-Baptiste Queru              7           56
41650445476011  ?             Raphael Moll                     2           36
41650445473806  ?             Xavier Ducrohet                  2           30
41650445473002  ?             Android Code Review              0           28
41650445476007  ?             Dianne Hackborn                  3           28
41650445476012  ?             Eric Fischer                     2           22
41650445476006  ?             Jorg Pleumann                    2           21

Table 8: Android top 8 contributor details since release 3.22 (source: pyohloh)

The data suggests what many in the open source world already know from experience, namely that it is not

easy to ‘dump’ commercially developed software as open source and expect to build a

community around it quickly – the process is likely to take a very long time and requires

significant efforts to align with the interests of external developers who often have different

motivations for getting involved. This should not be taken to mean that all such attempts are

doomed, merely that they face formidable challenges as various commentators [36] have identified.

What the data we have examined suggests is that if one wishes to engage the community to assist in

maintenance, it is likely to be more effective in those cases where the corresponding components were

community-created in the first instance as with GTK, GStreamer and BlueZ and even then, only if roadmap

alignment can be achieved.

[36] http://mobileopportunity.blogspot.com/2009/06/symbian-evolving-toward-open.html

‘Dumping’ is inefficient.

5. Conclusions

Where an open mobile platform is already using key open source projects of critical importance, there is

direct economic value in constructive engagement with the corresponding open source communities. The

rationale is that through such engagement the platform provider can reduce the cost of acquisition of future

innovation as well as reduce the cost of maintenance of that software. The latter requires the platform provider

to work collaboratively with the community to align upstream developments. Good candidates for projects

that fall into this category are GTK, WebKit, GStreamer and BlueZ. Note that this does not mean a full reliance

on the community, which may be untenable in the context of commercial predictability requirements, but a

more blended approach. The practical details of how mobile platform providers and device manufacturers can

effectively engage with existing open source communities and seek to minimize the cost of ownership of open

source software will be the subject of a future LiMo Foundation White Paper.

Where a technology lies below the commodity line and is already in a mobile platform in the form of a

proprietary commercial implementation, open sourcing it is unlikely in the short term to build a significant

community around that code outside of the organizations that built the software in the first place.

Consequently, though it may be viewed as beneficial in terms of industry leadership and reputational value, it

is not necessarily economically beneficial in the short term to open source the technology. The motivation to

do so will be driven by non-economic factors such as a desire to see the technology more widely adopted or

used. In the event that a proprietary technology is open sourced, it is essential that the platform provider has a

practical community-building strategy to follow through on the act.

Where a technology is falling below the commodity line and is not already present to some degree in a

mobile platform, the platform provider should look to adopt relevant open source projects to reduce the cost

of software acquisition and offer opportunities for further scale economies through strategic alignment with

other open source based industry initiatives. Good candidates for projects of this type currently include the

Clutter Advanced UI framework and Telepathy IM Communications framework.

Where a technology lies above the commodity line, open source equivalents are of less strategic value to a

platform provider. This is the area of competitive differentiation for OEMs and operators where their value

propositions reside and where open source software tends in general not to offer a compelling technical

alternative.