
Charles University in Prague

Faculty of Social Sciences

Institute of Economic Studies

BACHELOR THESIS

Patents: Means to Innovation or Strategic Ends?

Author: Martin Stepanek

Supervisor: PhDr. Jiří Schwarz

Academic Year: 2011/2012


Declaration of Authorship

The author hereby declares that he compiled this thesis independently, using only the listed resources and literature. The author also declares that he has not used this thesis to acquire another academic degree.

The author grants to Charles University permission to reproduce and to distribute copies of this thesis document in whole or in part.

Prague, May 15, 2012 Signature

Acknowledgments

I am most thankful to PhDr. Jiří Schwarz, my thesis supervisor, for his relentless support, great ideas, and occasional commendation. The responsibility for all errors is mine.

I would also like to thank PhDr. Jana Votapkova for her consultation on Data Envelopment Analysis.

Abstract

This paper utilizes an extensive dataset of 163,663 US patents granted between 1976 and 2011 to 25 companies within four technological fields (aerospace industry, computer manufacturing, semiconductor industry, and software development) to observe fluctuations in their value and characteristics. I find that certain indicators have changed immensely during the last 36 years, suggesting that newer patents are much less valuable than their predecessors. Further, using Data Envelopment Analysis, I estimate the relative efficiency with which inputs (research and development expenses and the company’s workforce) are transformed into outputs (patent stock and its technological importance), to provide empirical evidence for the recent theories of strategical patent exploitation by large companies. I find that the efficiency varies considerably across industries and also across companies within an industry. There is an overall trend of increasing efficiency in patent production per unit of input, but none in the effectiveness of creating valuable inventions, which seems to depend only on the company itself.

JEL Classification D22, L20, O32, O34

Keywords patent value, intellectual property rights, strategic patents, research and development efficiency

Author’s e-mail [email protected]

Abstrakt (English translation)

This thesis uses an extensive dataset of 163,663 US patents held by 25 companies from four technological fields (aerospace, computer hardware, semiconductors, and software engineering) between 1976 and 2011 to track changes in their value and characteristics. According to my observations, some indicators have changed very markedly over the last 36 years, which suggests that newer patents are considerably less valuable than their predecessors. Further, using Data Envelopment Analysis, I estimate the relative efficiency with which the observed firms transform inputs (research and development expenditures, number of employees) into outputs (the number of patents and their technological value), in order to enrich the recent theory of the strategic misuse of patents by firms. I find that this efficiency differs not only across industries, but also across firms within a given industry. The efficiency of producing patents as such has increased; the efficiency of creating technologically significant inventions, however, depends only on the firm itself.

JEL Classification D22, L20, O32, O34

Keywords patent value, intellectual property, strategic patent, research and development efficiency

Author’s e-mail [email protected]

Length of the thesis 97,682 characters (including spaces)

Contents

List of Tables
List of Figures
Acronyms
Thesis Proposal
1 Introduction
2 Strategical Patents
3 Patent Valuation
3.1 Patent Indicators
3.2 Correlates of Patent Value
3.2.1 Used Variables
4 The Dataset
5 Descriptive Statistics
5.1 Citations
5.2 Family Size
5.3 Renewals
5.4 Patent Trades
5.5 The Other Variables
6 Empirical analysis
6.1 Econometric analysis
6.2 DEA analysis
7 Conclusion
Bibliography
A Forward Citation Distribution
B Data download
C Additional Figures and Tables

List of Tables

3.1 The overview of the used variables and the empirical evidence of their explanatory power regarding patent value.
6.1 Negative binomial regressions.
6.2 Probit regressions.
A.1 The number of forward patent citations at lags (weighted).
C.1 Companies overview.
C.2 Industries overview.
C.3 Used variables.
C.4 Additional regression statistics - negative binomial models.
C.5 Additional regression statistics - probit models.
C.6 Correlation matrix.
C.7 DEA variables (expenditures in $ millions).
C.8 DEA analysis detailed results - software industry, both outcome variables.
C.9 DEA analysis - both outcome variables.
C.10 DEA analysis - patent numbers as the outcome variable.
C.11 DEA analysis - composite rating as the outcome variable.

List of Figures

5.1 The number of patent applications and grants annually (in thousands).
5.2 Patent citations.
5.3 Sample distribution of forward patent citations.
5.4 Forward citations by industry.
5.5 Family size, based on date of application.
5.6 Renewal data, patents granted from 1976 to 1999.
5.7 The average number of traded patents.
5.8 The other patent variables.
6.1 Coefficients and 95% confidence intervals of time dummies, all four regressions. Top two are the negative binomial regressions (with Fcit and Fsize as the dependent variables, respectively), bottom two are the probit regressions (with renewals and trades, respectively).
A.1 Cumulative distribution functions for different time cohorts.
C.1 The delay between the patent application and the following grant (in days).
C.2 Renewal data, patents granted from 2000 to 2003.
C.3 Renewal data, patents granted from 2004 to 2007.
C.4 Weighted forward citations by industry.
C.5 Backward citations by industry.
C.6 Family size by industry.
C.7 Patent trades by industry.

Acronyms

DEA Data Envelopment Analysis

DMU Decision Making Unit

USPTO United States Patent and Trademark Office

EPO European Patent Office

WIPO World Intellectual Property Organization

IPC International Patent Classification

AMD Advanced Micro Devices

AM Applied Materials

CS Citrix Systems

HP Hewlett-Packard

IBM International Business Machines

LTC Linear Technology Corporation

MIP Maxim Integrated Products

NC Nuance Communications

UT United Technologies

Bachelor Thesis Proposal

Author Martin Stepanek

Supervisor PhDr. Jiří Schwarz

Proposed Topic Patents: Means to Innovation or Strategic Ends?

Griliches (1981) provides empirical evidence that a company’s patent stock has a positive effect on its market value, particularly when accounting for the number of references that other patents make to the analyzed patents (Hall et al. 2000). Harhoff et al. (2003) have shown that other aspects of patents, such as patent scope or family size, are correlated with patent value as well.

Those studies only investigate how well patents explain a company’s value when patents are treated as a means of innovation. However, as Macdonald (2004) has shown, a patent can also be understood as a strategic instrument. A strategic patent is important only as a tool to keep the company competitive, not as a means of innovation. In my study, I will provide an empirical analysis from this point of view, using patent value as a measure of the innovativeness of the patent.

Patent value can be understood as the market value of a patent (i.e. the cost a company would bear if it wanted to purchase the patent from another company) or as its technological value (i.e. the innovation it brings). I will focus mainly on the technological value, which has been shown to be correlated with, e.g., the family size of the patent, patent scope, or the number of forward and backward references related to the patent. I will show that the variables mentioned above, expenses on research and development, and other factors can be used to explain changes in the market value of a firm.

Assuming that higher-value patents are the innovative ones and lower-value patents play the strategic role, I will test whether the value of firms’ patents has changed over time. That is, whether strategic patents are indeed a phenomenon of recent years or whether there has been no significant change in the average value of patents. The dataset will be constructed using a web crawler created for this purpose, which will download the required data from the official website of the United States Patent and Trademark Office and related databases with free access (particularly patents.google.com and www.freepatentsonline.com).

Outline

1. Introduction

2. Relationships among used variables

3. Model overview

4. Data Envelopment Analysis and auxiliary regressions

5. Testing hypotheses

6. Conclusion and future research

Core bibliography

1. Griliches, Z. (1981): Market Value, R&D and Patents. Economics Letters. 7(2): pp. 183–187.

2. Hall, B.H., Jaffe, A., Trajtenberg, M. (2000): Market Value and Patent Citations: A First Look. NBER, Cambridge, MA.

3. Harhoff, D., Scherer, F., Vopel, K. (2003): Citations, family size, opposition and the value of patent rights. Research Policy. 32(8): pp. 1343–1363.

4. Macdonald, S. (2004): When means become ends: Considering the impact of patent strategy on innovation. Information Economics and Policy. 16(1): pp. 135–158.

5. Pakes, A. (1985): On patents, R & D, and the stock market rate of return. The Journal of Political Economy. 93(2): pp. 390–409.

6. Pitkethly, R. H. (1997): The Valuation of Patents: A Review of Patent Valuation Methods with Consideration of Option Based Methods and the Potential for Further Research. Judge Institute Working Paper. WP(21/97).

7. Reitzig, M. (2003): What do Patent Indicators Really Measure? A Structural Test of ‘Novelty’ and ‘Inventive Step’ as Determinants of Patent Profitability. LEFIC Center for Law and Economics at the Copenhagen Business School.

8. Reitzig, M. (2004): Improving patent valuations for management purposes – validating new indicators by analyzing application rationales. Research Policy. 33: pp. 939–957.

Author Supervisor

Chapter 1

Introduction

Innovative activity is the process of creating inventions. The inventor has to be rewarded for that kind of activity, so the argument goes, because it generally increases social welfare and, without a possible reward, its level would be suboptimal. State protection of the invention is an indirect form of such a reward, and patenting is one possible means to it. A patent is essentially a way of possessing an invention, giving its owner rights similar to those over a tangible asset. The invention may then not be stolen or abused, and it may yield an income from fees, should someone create a new invention based on the patented idea.

The history of patents goes back to the 15th century, but the idea of rewarding inventors has been around since the time of ancient Greece (Devaiah, undated). There are two extreme points of view on patenting: a positive and a negative one. The positive view assumes that there would be no incentive for innovation without patent protection, as innovation would not bring any reward. Yet there are, e.g., amateur writers who do not seek legal protection for their work and thus show that such an institution is not a necessary condition for creative activity. The negative view highlights the fact that patents only create an incentive for innovative activity; that activity, however, may not result in a valuable invention.

Nowadays, innovation is mostly performed by the research and development (R&D) departments of companies, not by individual inventors. Billions of dollars are spent each year, and many companies hold portfolios containing thousands of patents. Thanks to those, they can restrict their competitors both on the market and in the race for the technological lead. Bertran (2003) argues that firms can nowadays target a given level of technological improvement and reach it in a given time, as they have routinized R&D. In rapidly innovating industries, such as chemicals, drugs, computing equipment, communication equipment, and professional and scientific instruments, R&D is carried out with much higher intensity so that the firm keeps its competitive advantage (Pakes and Griliches 1980). Smaller firms rely more heavily on trade secrets than on patents to protect their ideas (Baldwin 1996), yet they produce more patents per dollar spent (Bound et al. 1982). Many economists have tried to use patents and their characteristics as a measure of innovative activity within firms (see e.g. Schmookler and Brownlee 1962 or Griliches 1990), or to link them to a company’s performance (Griliches 1981, Hall et al. 2000, etc.).

Recently, the strategic use of patents has been discussed by several authors (Yiannaka and Fulton 2006). While some companies admit that they apply for new patents mostly for strategic purposes (Bessen 2004), the overall trend remains unclear. I focus on this behaviour and provide an empirical analysis to support the otherwise purely theoretical literature.

I utilize a large dataset of US patents to show a change in the patent indicators and statistics, as suggested by Jaffe and Lerner (2006). My dataset contains patents granted between 1976 and 2011 (the longest period that can be monitored, owing to the limited content of the patent office databases), whereas other recent studies (see e.g. Sapsalis and de la Potterie 2007 or Gambardella et al. 2008) only observe patents from a short time span; this longer coverage is what the dataset adds to the existing research. Then, combining and further developing methods used by van Pottelsberghe de la Potterie and van Zeebroeck (2011), van Zeebroeck (2011), Sapsalis et al. (2006), Sapsalis and de la Potterie (2007), and other authors, I perform an econometric analysis to show differences between my findings and the prior ones. Finally, I utilize Data Envelopment Analysis, a method developed by Charnes et al. (1978), to measure the relative efficiency with which the observed companies transform inputs (R&D expenditures and the company’s workforce) into outputs (patent stock and its value). I find substantial differences not only among the companies, but also throughout the observed time period. The results correspond to the existing theories of strategical patents.
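To make the idea of relative efficiency concrete, the following is a minimal sketch in Python, not the model estimated in this thesis. It assumes the simplest possible case of one input and one output under constant returns to scale, in which the efficiency score of Charnes et al. (1978) reduces to each unit's output-to-input ratio divided by the best ratio observed. The function name, the firm names, and the figures are hypothetical; with several inputs and outputs, a linear program has to be solved for each company instead.

```python
def ccr_efficiency(inputs, outputs):
    """Relative efficiency for the one-input, one-output case.

    Under constant returns to scale, a unit's score is its productivity
    (output / input) divided by the highest productivity in the sample,
    so the best unit scores 1.0 and all others score below 1.0.
    """
    ratios = {name: outputs[name] / inputs[name] for name in inputs}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical data: R&D spending ($ millions) and patents granted.
rd_spending = {"Firm A": 100.0, "Firm B": 250.0, "Firm C": 400.0}
patents = {"Firm A": 50.0, "Firm B": 100.0, "Firm C": 120.0}

scores = ccr_efficiency(rd_spending, patents)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this made-up sample, Firm A obtains the most patents per unit of R&D spending and therefore defines the frontier; the other firms are scored relative to it, regardless of how many patents they produce in absolute terms.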

This paper is structured as follows: the next chapter reviews the strategical meaning and use of patents. Chapter 3 provides an insight into the history of and different approaches to patent valuation, with a focus on the econometric analysis that I build upon later, and summarizes the patent indicators. Chapter 4 is devoted to the dataset, and its descriptive statistics are presented in Chapter 5. Chapter 6 covers the empirical analysis. Finally, Chapter 7 concludes the work.

Chapter 2

Strategical Patents

Historically, the reason for requesting patent protection was the concern that one’s own inventions might be abused by a third party. Many companies did not even see patenting as necessary, and across whole technological fields it was more profitable to exercise mutual restraint and not infringe each other’s rights than to pay patent application fees. Innovation used to be an essential process in obtaining lead time on the market, resulting in higher earnings for the company. However, this changed dramatically at the beginning of the 20th century, when the US patent offices significantly lowered the required standards of patent applications. Consequently, it became much easier for a company to obtain a patent grant. But, arguably, it also made newly issued patents much less valuable, because even inventions that would never have been granted patent protection before, for their lack of novelty, became easily patentable. The decrease in the singularity of patents led to a decline in patenting activity until 1982, when the Court of Appeals for the Federal Circuit was set up. It was the first court in the USA specialized in patents. It upheld twice as many (up to 89%) lower-court decisions that patents were valid than before, which significantly raised the value of US patents; thus, it again became favourable for companies to apply for them. The impact on the whole patent system was immense: patent grants in the USA increased by 78% between 1983 and 1995 without a rise in research investment (Kortum and Lerner 1999).

Some industries, like pharmaceuticals, have always supported the patent system because of the nature of their products and the impossibility of producing similar products without breaching others’ rights. In these technological fields, a company may become a monopoly with a single invention; its competitors may simply not be able to produce a substitute for its product. However, in other fields, like the semiconductor industry, the pace of technological change has always been much faster than the process of patenting, which resulted in almost no patenting activity during the 20th century. But since 1982, companies have appreciated every employee whose idea could be patented more than ever, and have decided to patent their inventions continually, as they would otherwise have fallen behind the competition. Patents were kept, and firms have pursued applications in all but the most worthless cases, no matter what industry they competed in (Macdonald 2004).

The enormous increase in patent applications was not followed by an adequate enlargement of the patent offices. There were simply too few employees, working without advanced computer technology, to deal with the immense number of applications.¹ This again led to even lower patenting standards, as there was not enough time to examine each request carefully. It has been shown that the worsening of patenting standards resulted in a different management of patent portfolios and an aggressive assertion of patents (Bessen 2004). Patents no longer needed to carry any significant invention in order to be valuable to the company through the competitive advantage they provided. Companies started to aim for patent thickets instead of valuable innovations. A patent thicket refers to complex products stretching over whole patent portfolios. This is in sharp contrast with the one-to-one correspondence between products and patents that was usual before (i.e. creating a product involved the use of only one patent). A thicket is created around a key patent of a company and includes both the patents of the company holding the key patent and those of its competitors. Holding a patent in the thicket around a competitor’s key patents has become one of the new strategies that sprang up. The competitor is then unable to fully utilize its own inventions, because its product would involve a patent held by a third party, most likely a competitor. Moreover, such strategies may even end up preventing companies from selling their products on certain markets.²

¹ Patents must go through several stages of examination, and the examiners may request that the application form be completed if some components are missing or improperly prepared. The length of the whole process has changed substantially over the years, as shown in Figure C.1.

² In 2011, a court in Düsseldorf forbade Samsung to sell its tablet in Germany, upon a request from Apple. More recently, in April 2012, another German court banned Microsoft from selling the Xbox 360, Windows 7, Windows Media Player, and Internet Explorer for infringement of Motorola’s patent rights.

As Hall and Ham (1999, p. 10) put it, “The reasons that patents were important often had little to do with whether patents provide an incentive to conduct R&D or enable the firm to profit from the generation of products on which the invention was based”. The more difficult it becomes to circumvent the protected invention with a new technology, the more valuable the patented invention is for a company willing to block its competitors. Gallini (2002) has shown that the greater the breadth of patent protection (i.e. the more areas the patent is involved in), the harder it is for other companies to break into the market with their own innovations without violating the patent protection and thus breaking the law, and the longer the company can maintain its limited monopoly. Bessen (2004) showed that, under general conditions, firms aiming at cross-licensing (i.e. creating patent thickets) have lower incentives for R&D. Such a firm’s patent portfolio is intertwined with its competitors’; every profitable accomplishment is therefore shared through licensing.³ He also showed that mutually non-aggressive strategies lead to higher social welfare. Other strategies listed in Macdonald (2004) include patenting discoveries that might block the use of similar discoveries in competitors’ products, and patenting in order to build a portfolio with which to negotiate licensing agreements with other companies.

So far, the literature dealing with strategical patents has only provided the theoretical background for why it would be more profitable for a company not to aim at creating valuable inventions, but to be involved in patent thickets instead. One way to observe such behaviour empirically is to analyse changes in patent production. Simple patent counts would not be sufficient, though, as they are highly correlated with R&D expenses and the size of a firm. Therefore, I take advantage of DEA to see the change in the efficiency of production, rather than to analyse the volume of the output. Yet not only the analysis itself is important; the descriptive statistics of my extensive dataset illustrate many of these changes as well.

To accomplish these tasks, one must first have a deeper understanding of patent valuation and characteristics. The next chapter introduces the main concepts.

³ In the dispute between Motorola and Microsoft, Motorola wants 2.25% of the sales price for the use of its inventions (O’Gara 2012). Many large companies nowadays must make agreements with their competitors in order to be able to produce at all (e.g. Apple and Samsung, companies which otherwise sue each other, have an agreement under which Samsung makes semiconductor parts for the iPhone, without which Apple would not be able to make it).

Chapter 3

Patent Valuation

A lot of effort has been put into patent value estimation since the early 1960s, yet the results have been rather unsatisfactory so far. Several different approaches have been suggested, some of which utilize micro- and macroeconomic models (Bertran 2003 or Bessen 2004), whereas others rely on characteristics contained in patent documents (see e.g. Harhoff et al. 1999, Sapsalis et al. 2006, or van Pottelsberghe de la Potterie and van Zeebroeck 2011). According to their findings, the distribution of patent value seems to be extremely skewed, and only a few patents are of significant value. Pitkethly (1997) has divided the patent valuation methods as follows:

Valuation on theoretical basis - Pitkethly describes these methods as either modelling the future life of the patent or evaluating its past. They take only very few patent characteristics into account and base the estimation rather on predictions about the market, the patenting company, and its competitors. Cost models take only historical costs into account, without any allowance for future gains. Market conditions models compare patents to similar traded assets and their prices. This process yields very precise estimates, but is difficult to use, as patents usually do not have any perfect substitutes. Income methods estimate future cash flows. Time and uncertainty methods split the patent life into several phases with different risks and cash-flow probability distributions, and calculate the value of discounted future earnings. Finally, flexibility and changing risk methods (both in discrete and in continuous time) utilize real option pricing.
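As an illustration of the income and the time and uncertainty methods, the sketch below discounts a stream of expected yearly cash flows, each weighted by the probability that the patent still generates income in that year. It is a simplified example under stated assumptions rather than Pitkethly's formulation: the function name, the cash flows, the survival probabilities, and the 10% discount rate are all hypothetical.

```python
def discounted_patent_value(cash_flows, discount_rate, survival_probs=None):
    """Present value of yearly cash flows; the first element is year 1.

    survival_probs[t] is the assumed probability that the patent still
    earns income in year t + 1. If omitted, all cash flows are certain.
    """
    if survival_probs is None:
        survival_probs = [1.0] * len(cash_flows)
    if len(survival_probs) != len(cash_flows):
        raise ValueError("one survival probability is needed per cash flow")

    value = 0.0
    for year, (cash_flow, prob) in enumerate(
        zip(cash_flows, survival_probs), start=1
    ):
        # Expected cash flow of the year, discounted back to today.
        value += prob * cash_flow / (1.0 + discount_rate) ** year
    return value


# Hypothetical two-year income stream discounted at 10% per year.
certain = discounted_patent_value([110.0, 121.0], 0.10)
# Same stream, but the patent survives into year 2 with probability 0.5.
risky = discounted_patent_value([110.0, 121.0], 0.10, [1.0, 0.5])
print(round(certain, 2), round(risky, 2))
```

With these numbers each certain cash flow is worth about 100 today, so halving the probability of the second one lowers the estimate by roughly a quarter. A real option approach would go further and value the owner's right to abandon or renew the patent.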

The theoretical approach estimates the value of a single patent. This is in most cases useful, if not necessary, for a better understanding of innovative activity inside firms. Unfortunately, there is little direct empirical evidence from patent data to support it.

Econometric valuation methods estimate the impact of patent indicators on patent value, taking advantage of the availability of such variables in large volumes. Some of the indicators are directly correlated with observable prices, costs, and quantities sold, or with latent variables such as novelty, inventive step, breadth, and dependence on complementary assets, which have some explanatory power of their own and may be utilized for further research (Reitzig 2004).

The first attempts to evaluate patents using econometric analysis come from Schmookler and Brownlee (1962), who searched for a relationship between the number of patents in a company’s portfolio and total factor productivity. They did not find any significant correlation. The main pioneer in the field of econometric patent valuation was Zvi Griliches. In 1981, he observed a relationship between a firm’s output, employment, and physical and R&D capital (Griliches 1981), followed by the discovery of a significant impact of R&D on a company’s value (Griliches 1984). In the long run, $1 spent on R&D adds $2 to the market value of the firm above and beyond the indirect influence of patents. However, only unanticipated R&D expenditures seem to have a positive effect. Pakes (1985) demonstrated that about 5% of the variance in the stock market value of a firm is caused by events changing both R&D and patent applications.

Despite the importance of these earlier findings, the most crucial change came with the transformation of the data into electronic form. Volumes of data that could not previously be gathered by hand led to discoveries of new relationships among patent characteristics and patent value. Recent works exhibit correlations between patent indicators and the likelihood of litigation (Lanjouw and Schankerman 1997, Lanjouw and Schankerman 2004), renewal decisions (van Pottelsberghe de la Potterie and van Zeebroeck 2011), or, e.g., differences between academic and industrial patents (Sapsalis et al. 2006). Nevertheless, patent value and its extremely skewed distribution, however well it can be observed, is far from being satisfactorily explained. Some authors have argued that the distribution may conform rather well to the log-normal distribution. Scherer (1998) shows that the distribution of returns from innovation may be less skewed than log-normal, but with a long log-linear tail. In other words, there are very few extremely valuable patents, producing revenues many times higher than their actual R&D costs, and an overwhelming number of patents worth almost nothing. As Pitkethly (1997, p. 2) characterizes the issue, “patents are like lotteries in which there are a few prizes and a great many blanks”.
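The lottery-like character of such a distribution can be illustrated with a small simulation. The sketch below draws hypothetical patent values from a log-normal distribution and computes the share of total value held by the most valuable 10% of patents. The parameters (mu = 0, sigma = 2), the sample size, and the seed are arbitrary assumptions chosen for illustration, not estimates from any patent data.

```python
import random
import statistics


def top_share(values, fraction):
    """Share of the total held by the largest `fraction` of the values."""
    ordered = sorted(values, reverse=True)
    count = max(1, int(len(ordered) * fraction))
    return sum(ordered[:count]) / sum(ordered)


# Hypothetical patent values drawn from a log-normal distribution.
rng = random.Random(0)  # fixed seed so that the run is reproducible
values = [rng.lognormvariate(0.0, 2.0) for _ in range(10_000)]

share = top_share(values, 0.10)
mean_value = statistics.mean(values)
median_value = statistics.median(values)

print(f"share of total value in the top 10% of patents: {share:.2f}")
print(f"mean value: {mean_value:.2f}, median value: {median_value:.2f}")
```

Under these assumptions the top decile typically holds well over half of the total value, and the mean lies far above the median, which is the pattern of a few prizes and many blanks described above.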


Following van Pottelsberghe de la Potterie and van Zeebroeck (2011), the generally used model is:

$$\mathit{Value}_i = f(PC_i, OC_i, S_i)$$

where $\mathit{Value}_i$ is the estimated value of patent $i$, $PC_i$ are patent characteristics obtained from the application and grant documents, $OC_i$ are characteristics of the patent’s owner, and $S_i$ are the results of an inventor and owner survey. I will characterize each of those more precisely in the following two sections.

3.1 Patent Indicators

Reitzig (2004) divided patent indicators obtainable from the application and grant documents into three categories:

First generation variables do not require a deeper knowledge of the institutional background of the patent system, and are thus easy to interpret and use. Nevertheless, they do not take depreciation into account. They include patent citations and the family size of the patent.

Patent citations can be either forward or backward, and show the knowledge flows among patents. A backward citation exhibits a relationship between two particular patents, one (the cited patent) being an underlying basis for the other (the citing patent). A patentee who seeks state protection for his idea must first create a list of the patents and non-patent literature upon which he has built the new invention. Once the patent is applied for, it goes through many stages of examination, in which the examiner inspects the correctness of this list and searches for further preceding patents or literature. The grant document then includes the citations from both the examiner and the inventor himself. Reitzig (2003a) argues that backward citations to previously issued patents may indicate market potential, whereas citations to non-patent literature ought to indicate greater technological value. The knowledge flows assumed to be shown by patent citations are supposed to be present only if the citation comes from the inventor, not the examiner (Alcacer and Gittelman 2006). Jaffe et al. (2000) surveyed inventors and found that they were fully aware of less than one-third of the backward citations of their patents. Alcacer and Gittelman (2006) then confirmed the observation and showed that about 63% of backward citations come from examiners, 40% of all patents contain only backward citations from examiners, and for only 8% of patents did the examiner not need to add any citations.


There is a large difference in backward citation counts when comparing

patents under the US (USPTO) and European (EPO) patent offices. European

patents show strikingly low numbers of citations. One possible reason is that

fewer inventions are patented in Europe than in the USA, so the patents have

fewer documents to which to refer. The second possible explanation is the different

approach of the USPTO. The application for patent protection must satisfy 'best-mode

practice'; a full list of patents that could possibly be considered prior

art has to be included in the patent application. In Europe, the majority

of citations come from the examiner, not the applicant (Harhoff et al. 1999).

The large difference in citation counts is one of the reasons why it is difficult

to compare studies analysing the European patents to those analysing the US

patents.

The crucial aspect of backward citations is their immediate availability at

the date of the patent grant. The list of backward citations does not change

throughout the patent's life, and is thus a very reliable value indicator.

Several studies have tested the relationship between backward

citations and patent value, and indeed observed a positive correlation (see e.g.

Narin et al. 1997).

The creation of backward citations naturally implies the existence of their

opposites - forward citations (citations received from other patents). Those

also show the knowledge flows; however, there is a significant difference in the

meaning. Backward citations do not necessarily prove the newly issued patent

to bring any kind of novelty, they only show a relationship to the underlying

invention. A received forward citation, by contrast, shows a direct impact of the patent

on other inventions. A patent that has received no forward citation after a

few years is most likely unimportant, at least as far as technological

improvement is concerned. On the other hand, a patent with numerous citations

from other inventions should be of significant technological value.

Forward citations also have several drawbacks. First, a possible bias may

occur in the estimation if a company cites its own patents. Some authors argue

that such behaviour may demonstrate creation of patent thickets around com-

pany’s invention (Bessen 2008). Second, the most important difficulty arises

from the nature of forward citations; the number of forward citations can grow

over the patent's life. The list of forward citations is always empty at the

beginning. US patents are valid for up to 20 years, and even after expiry

a patent may still receive another forward citation. This

probability is lower in certain technological fields (e.g. semiconductor industry)


as a result of rapid innovation. Uncertain citation counts pose a major difficulty

for any statistical or econometric analysis using patents from different years.

Hall et al. (2000) assumed that the lifetime of a patent is no longer

than 35 years. Bertran (2003) showed that the assumption was reasonable, and moreover

showed that the distribution of received citations barely changes more than 12 years

after the application date. Further, Hall et al. (2000) showed that more than

80% of all citations received by patents occur within the first 17 years after

the grant. Works of Albert et al. (1991), Lanjouw and Schankerman (2001), or

Harhoff et al. (2003) show a positive and very significant correlation between

forward citations and patent value. Although these works exhibit similar

results, their estimates are hard to compare directly, mostly because they build their

observations upon dissimilar datasets. For my study, I have predicted the total

number of forward citations a patent would obtain 31 years after the grant (see

Chapter 5).

The last indicator in the first category is patent’s family size (Lanjouw and

Schankerman 2001, or Harhoff and Reitzig 2004). A company may seek

patent protection for one invention in several countries. Patents related to the

same invention, issued in different states, create a patent family. This variable

should in theory be positively correlated with patent value due to additional

costs connected with the patent application, renewal and possible litigation. A

company should only seek such protection for its most valuable patents (assuming

that its management board has an information advantage over the general

public, which guides it through the decision tree). Harhoff and Reitzig

(2004) confirmed the theory and found a rather strong correlation. Reitzig

(2003b) argues that patent’s family size should be a measure of market size of

the patent. Family size data are available soon after the patent application;

see Chapter 6 for further discussion.

More indicators usable as explanatory variables became available with the

introduction of improved patent databases on the internet. The second gen-

eration indicators include international and local patent classifications.

Patent classifications refer to the scope of a patent, in other words, the

number of different technological areas a patent is involved in. Patent scope

should be correlated with patent value since more valuable inventions would

serve as a foundation for innovation in different technological fields. Never-

theless, the newly developed patent strategies may require patents to have as

wide scope as possible to create even more powerful thickets at the same time.

The effect of greater breadth on the value of a patent is then unclear,

because strategical breadth is not linked to technological value. Indeed, Lerner

(1994), Harhoff and Reitzig (2004), and several other authors observed great

differences in the connection of patent classifications to patent value.

There are several different types of patent classifications. I am most inter-

ested in the US, European and International ones. International classification

(IPC) divides technological fields into eight sections with approximately 70,000

subdivisions. Each subdivision has a symbol consisting of Arabic numerals and

letters of the Latin alphabet. The IPC symbols are allotted by the national or

regional industrial property office that publishes the patent document.4 US and

European classifications are quite similar to IPC. Lerner (1994) argues that IPC

reflects the economic importance of new inventions, whereas US classification

focuses on the technical meaning.

The last category includes third generation indicators, among which Re-

itzig (2003b) puts variables that come from the patent full-text documentation,

such as the number of claims, design of certain text passages in the patent draft,

the number of words describing the state of the art, or the number of indepen-

dent claims.

Patent claims are the very essence of the invention itself. One can learn

how to make and use it from the description, yet only patent claims define the

scope of the legal protection. Every applicant therefore has an interest in drafting

the broadest possible claims to secure the strongest protection as

a reward for his invention. Nevertheless, the patentee must weigh his

claims carefully, as the probability of litigation against the

issued patent increases with the breadth of the claims. It has been shown that the number

of words in the description has no explanatory power, whereas the number of

claims is very significant (Reitzig 2003a). Recently, van Pottelsberghe de la

Potterie and van Zeebroeck (2011) found an important relationship between

application filing strategies and patent value.

3.2 Correlates of Patent Value

Because patent value is an abstract term without a precise definition, it has

to be substituted by its correlate. With the new discoveries, different variables

have been used as they seemed to have a better explanatory power regarding

patent value.

4 http://www.wipo.int/classifications/ipc/en/general/preface.html


The most accurate method of estimating patent value seems to be a

survey of the inventors and their managers, i.e. the people best informed

about their inventions, who also decide about the patent's validity.

Harhoff et al. (1999) and Gambardella et al. (2008) conducted such surveys.

Gambardella et al. sent a questionnaire to inventors, asking: “Suppose that on the

day on which this patent was granted, the applicant had all the information

about the value of the patent that is available today. In case a patent

competitor of the applicant was interested in buying the patent, what would be the

minimum price the applicant should demand?”, and asked them to put the value of

a given patent into one of 10 categories, starting at “less than €50,000” and

ending at “more than €5,000,000”. They obtained very high estimates, with

a mean value of €3,000,000 and a median of €300,000. Harhoff et al. (1999) had very

similar results: 12.9% of patents in their survey were placed in the “more than

DM 5,000,000” category.

Another approach is to look at the record of patent renewal decisions.

Currently, a utility patent in the USA can remain valid for up to 20 years, starting at

the filing date. The applicant has to pay maintenance fees in order to keep the

patent valid. In most of the European countries, these fees are paid annually,

whereas in the USA the fees are to be paid after 3.5 years, 7.5 years and 11.5

years. The fees grow rapidly over time and are different for small and large

firms (the charge is double for large companies). This approach exploits the

asymmetric distribution of information; the patent manager ought to have sufficient

information about the invention to decide whether the possible gains from

having the invention protected are larger than the fee that has to be paid. Paying

the renewal fee then gives him not only the monopoly for the given period, but

also an option to pay another renewal fee when the period expires (Pitkethly

1997). The patentee needs to consider only the current renewal period for the

optimal decision, as the invention becomes less profitable with time due to

increasing fees (Bessen 2008).

The downside of this method is that patents can only be looked upon retro-

spectively and the results may be biased because patents may be renewed only

for strategical purposes, not because of their actual economic or

technological value. Rapid innovation in a whole industry may also lower the value

of a given patent just after the fee has been paid. Arora et al. (2008) argue

that the renewal approach assumes the annual returns from having the patent

in force to decrease monotonically over patent’s life, and that patents may have

earned a lot in the first years even though they were not renewed. Further, the


estimates only show the extra value generated by issuing the patent, not the

value of an invention to the firm if it were not protected by a patent. This

differs from the survey approach, which treats patents as assets, so the asking

price should reflect both the invention value and the patent premium. Hence,

it yields higher estimates than models only estimating the premium, such as

the renewal fee model.

Using the renewal fee approach, Bessen (2008) indeed obtained much lower

estimates of patent value than Gambardella et al. (2008). The mean value was

$78,168 and the median only $7,175. He also found that patents owned by small

companies are renewed less often than those owned by larger ones. He concludes

that the patents of small companies are thus of lower value, but the finding may simply

reflect the propensity of larger companies to renew their patent portfolios,

as the cost is less significant for them than for smaller companies.

The last major approach utilizes patent litigation. The probability a com-

pany would be sued for its invention increases with patent value (see Lanjouw

and Schankerman 2001, Reitzig 2003b, Harhoff and Reitzig 2004), as it is rather

costly for other firms to appeal to the court, thus only the most important (i.e.

valuable) patents should be opposed. The probability of a patent being liti-

gated increases with the number of companies inventing in the same area and

the number of claims of the patent (Lanjouw and Schankerman 1997). Reitzig

(2003b) created a litigation likelihood estimator for his econometric model and

used the patent indicators as explanatory variables to estimate the probability

of litigation. In his study, 11.5% of 16,711 European patents were opposed.

The oppositions were successful in 38% of cases.

Other correlates, like the market value of the firm or Tobin’s Q, have been

proposed; however, those can only be linked to the value of a whole patent

portfolio, not to single patents. Serrano (2005) recently came up with an idea

of connecting patent value to the probability that a patent would be traded

to a different company, arguing that the transfer of intellectual property has

become an important source of technology for firms. He showed that more

valuable patents are indeed more likely to be traded. I further utilize this very

interesting finding in my econometric analysis.

3.2.1 Used Variables

An important distinction must be made here; patent indicators, described in

Section 3.1 (except for forward citations), are contained in the patent document


and depend solely on the application process.5 The value correlates (including

forward citations), on the other hand, are given by personal decisions (e.g. to

renew a patent) and depend only on the importance of the patent (i.e. its

value).6

The preceding literature suggests a number of possible variables that may

stand as patent indicators or as value correlates. I follow and further develop

the method suggested by van Pottelsberghe de la Potterie and van Zeebroeck

(2011) and use variables that have previously been shown to have very signifi-

cant explanatory power regarding patent value, to obtain a composite variable

reflecting it. See Chapter 6 for its description.

Table 3.1: The overview of the used variables and the empirical evidence of their explanatory power regarding patent value.

                             Total  Positive  Negative  Insignificant
Value Correlates
  Forward Citations             34        31         0              3
  Family Size                   22        14         1              7
  Renewals                      15        14         0              1
  Patent Trade                   1         1         0              0
Patent Indicators
  Inventors                      4         1         1              2
  Backward patent citations     21        13         1              7
  Patent Classification         12         6         3              5
  Number of Inventors            5         1         2              2
  Priorities                     2         0         0              2

Source: van Pottelsberghe de la Potterie and van Zeebroeck (2011)

Table 3.1 shows the complete list of the patent value correlates and the

patent indicators in my study, together with the number of distinct prior works

using them in econometric models, their significance, and sign. Three of my

patent value determinants (forward citations, renewal data and family size)

5 All of these are known at the date of the patent grant. They only tell us patent specifications, its breadth and the prior art the patent builds upon, but they cannot tell us much about the patent value without additional information. It is like a knowledge of the colour, engine capacity, and the number of doors of a car. We can see that it has more/less than the other cars, but can hardly say if it is better.

6 Even under strategical behaviour, the assumptions should hold. Continuing the example with a car, these variables are similar to how high the car gets in consumer’s ranking, the decision whether to buy the car, or the decision to later create a new model based on it. These latter variables may then be connected to the former ones (i.e. the decision whether to buy a car may depend on its colour and engine volume).


have repeatedly been shown to be highly and positively correlated with patent

value (see e.g. Bessen 2008 or Reitzig 2003b), whereas patent trades have only

been used once so far. To support the theory that traded patents are more

valuable, I construct a model similar to the one used by Serrano (2005) to

see whether it yields comparable results. Furthermore, I use these four variables for DEA

and provide a broad discussion of their evolution in Chapter 5. Finally, the

patent indicators have given ambiguous results so far, mostly due to different

explained variables they have been used with.7 I will use similar models to

those in the preceding literature to test how my data behave regarding the

value correlates I chose.

7 Again, the colour of the car may be correlated with the purchase decision, but hardly with the decision to further remake the car.

Chapter 4

The Dataset

The unique dataset I have created for my study contains data about 163,663

US patents from 4 technological industries: computer manufacturing, computer

software development, aerospace industry, and semiconductor industry, featur-

ing 25 companies in total. The data were downloaded from publicly accessible

patent office databases using a web scraping program (for more information and

the full list of the observed companies see the Appendix). I have selected these

industries because they are more innovative than others (Griliches 1998).

The on-line database of the NASDAQ Stock Market8 offers a list of listed companies

divided by industry. From those, I have picked only firms with a market

value over $2 billion and a history longer than 10 years. Listed companies are

generally obliged to publicly release their annual reports. Moreover, since 1994,

US companies must post their filings in electronic form. The reports, which provide

additional data, can be found in the Edgar-SEC database.9 The condition

of market value above $2 billion and preferably a long history is required for

meaningful analysis. Smaller companies have patent portfolios of insufficient

size for statistical and econometric relevance, and it would be impossible to see

a shift in the patent strategy of a firm if it had just a short history. Of course,

the analysis for whole industries would be possible even when accounting for

the smaller firms, but the restriction had to be made at some point, since there

are also companies owning patents and not listed on a public stock exchange.

It would be nearly impossible to obtain a complete list of all relevant subjects,

so I made the limitation.

With the list of all suitable companies, I have searched the website of

8 http://www.nasdaq.com
9 http://www.sec.gov/edgar.shtml


USPTO10 for their patents. Ultimately, I had a list of over 190,000 patents

of the firms, including various departments in different states. Unfortunately,

the US patent database (or any other free database) does not offer spreadsheet

or bulk downloads. In fact, they offer no explicit tool to download the data.

The method I used to obtain the dataset is depicted in the Appendix. I was able

to download the following characteristics: patent number, application number,

filing date, date of the grant, patent's assignee, references cited (backward

citations), US classification, International classification, and whether the patent

has been traded. However, because the database is incomplete (missing data), the

final dataset had to be cleaned in order to have complete statistics.

Some of the observed companies also own patents from outside their main

industries. It is practically impossible to determine exactly which industry a

patent belongs to. However, the trend should be common to all the companies

within an industry (i.e. companies from computer manufacturing most likely

also own several patents from software development or the semiconductor industry,

aerospace companies may own computer patents etc.), and these patents ought

to form a small share of the patent portfolio of a given company. I therefore treat all

the patents of one company as if they were from the industry in which I have

classified the company.

Additional data were found on the website of EPO.11 Those include: Euro-

pean classification, priority numbers, citing documents (forward citations), and

family size. Again, the data are only accessible in text format (and cannot be

easily downloaded), which creates certain difficulties in their use. Therefore,

the data had to be refined in several computer programs.

Two other features cannot be obtained from the patent

office databases: patent renewal and litigation data. Litigation data can only

be accessed through a very expensive private database, and are not included

in my work for that reason. Patent renewal data are available through Google

bulk download12 and may be downloaded without restrictions. Ultimately, I

have created a unique and extremely comprehensive dataset, containing patents

with date of the grant ranging from 1976 to 2011, and including characteristics

that have never been combined before. About 30,000 (16%) records had to

be deleted due to missing data13 or because they were patents of unrelated companies with

names similar to those on my list.

In the next section, I present the most interesting observations regarding

my dataset. While the preceding literature has mostly focused on looking for

the links between patent indicators, little attention has been paid to the

evolution of the indicators themselves. I shed some light upon this matter to

contribute to the prior findings.

10 http://www.uspto.gov
11 http://www.epo.org
12 http://www.google.com/googlebooks/uspto.html
13 Some characteristics (usually one, at most two variables per patent) were not present in the database. I was unable to find any rule regarding the missing data, thus I assume it is a random effect, which should not have any impact on my analysis.

Chapter 5

Descriptive Statistics

In order to understand the changes in the patent characteristics better, one

must observe the shift in the patent system as a whole. The growth of patent

applications and issued patents, mentioned in Chapter 2, is depicted in Figure

5.1. It exhibits an immense increase from 99,000 applications and 75,000 patent

grants per year in 1972, to 490,000 applications and 220,000 grants in 2010.

The upward trend is noticeable from 1983 (i.e. after the establishment of the

Court of Appeals in 1982), which corresponds to the findings of Kortum and

Lerner (1999). Not only has the total number of patents been growing; the

growth rate has been increasing as well. This most probably reflects an

increasing propensity to patent among companies.

On August 16, 2011, US patent number 8,000,000 was issued.14 The

enormously high quantity of patents has a large impact on the patent system.

The average number of days between the application date and the grant date

in my dataset rose from 566 in 1976 (median 526) to 1420 in 2011 (median

1323). Figure C.1 shows the rise. Investigating the impact that this

change may have is beyond the scope of this text and deserves further

analysis.
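The grant lag reported above is a simple difference between two dates per patent. As a minimal sketch, assuming each record carries a filing date and a grant date, the mean and median could be computed as follows; the function names and the sample records are hypothetical and are not taken from the dataset.

```python
from datetime import date
from statistics import mean, median


def pendency_days(filed: date, granted: date) -> int:
    """Number of days between the application (filing) date and the grant date."""
    return (granted - filed).days


def pendency_summary(patents):
    """Return (mean, median) grant lag in days for (filing_date, grant_date) pairs."""
    lags = [pendency_days(filed, granted) for filed, granted in patents]
    return mean(lags), median(lags)


# Hypothetical records for one grant year; the values are illustrative only.
sample = [
    (date(1974, 6, 1), date(1976, 1, 6)),
    (date(1974, 9, 15), date(1976, 3, 2)),
    (date(1973, 11, 20), date(1976, 5, 11)),
]
average_lag, median_lag = pendency_summary(sample)
print(f"mean lag: {average_lag:.0f} days, median lag: {median_lag} days")
```

Reporting the median alongside the mean is useful here because a few patents with very long examination periods can pull the mean upwards.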

5.1 Citations

The increasing number of patent grants has a substantial effect on my dataset

too. Arguably (as discussed by e.g. Hall et al. 2000 or van Zeebroeck 2011), the

number of backward citations may grow over time, as there are continuously

more patents within a technological field (and possibly large patent thickets

14 http://www.uspto.gov/news/Millions_of_Patents.jsp


Figure 5.1: The number of patent applications and grants annually (in thousands).

Source: http://www.uspto.gov/web/offices/ac/ido/oeip/taf/us_stat.htm.

around certain essential inventions). Each inventor must then consider more

preceding patents to be cited. Similar logic can be applied to forward citations

as well, but one may hardly ever distinguish between an increase in the number

of patents in a given field because of a common trend, and an increase as a

result of a revolutionary invention.

As a remedy to a possible bias in backward citations due to increasing

number of patents, I have weighted the citation quantities by the total number

of patents from a year before the grant of the patent. I took the estimate of

the total number of patents in 1975 as the base and then added the number of

patent grants each year to obtain the total patent counts.15 The final weights

are then computed as

\[
\mathit{Bcit}_t = \frac{\sum \mathit{citations}_t \cdot \mathit{Base}}{\mathit{Base} + \sum_{i=1}^{t} \mathit{Grants}_i}
\]

Where Bcit is the weighted number of backward citations, Base is the total

15 This approach has not been used before. The estimate of the number of patents granted prior to 1st January 1976 is based on the patent number of the first patent issued in 1976. There are no official statistics of patent counts for given years; therefore, I had to add the number of granted patents each year to my approximation to obtain relevant data. The one-year difference between the patent grant and the total number of patents, which serves as the weight, was chosen by the rule of thumb. I have tried several different years, but the results were quite the same, so I chose the simplest method.


number of patents in 1975 (i.e. one year before the first patent grant in my

dataset), Grants is the number of US patent grants in a given year, and t is the

difference between the grant year of the given patent and 1975. Both Base and

Grants include patents from all technological fields, taken from the website of

USPTO; citations_t are the actual data from my dataset. Figure 5.2 shows both

weighted and non-weighted backward citations from 1976 to 2005.
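The weighting can be illustrated with a short Python sketch. It follows the formula as printed, where the weight for a grant year is Base divided by Base plus the cumulative grants from 1976 up to that year. The function names and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the USPTO figures used in the study.

```python
from itertools import accumulate


def citation_weights(base, grants_by_year, first_year=1976):
    """Map each grant year to Base / (Base + cumulative grants up to that year).

    `grants_by_year` lists annual grant counts starting at `first_year`.
    """
    cumulative = accumulate(grants_by_year)
    return {
        first_year + offset: base / (base + total)
        for offset, total in enumerate(cumulative)
    }


def weighted_backward_citations(citations, grant_year, weights):
    """Scale a raw backward citation count by the weight of its grant year."""
    return citations * weights[grant_year]


# Hypothetical inputs: a stock of 1,000 patents before 1976 and three years of grants.
weights = citation_weights(base=1000, grants_by_year=[100, 150, 250])
for year, weight in weights.items():
    print(year, round(weight, 3))
print(weighted_backward_citations(10, 1977, weights))
```

Because the denominator grows every year, the same raw citation count is scaled down more for later grant years, which is the intended correction for the growing stock of citable patents.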

Figure 5.2: Patent citations.

The number of backward citations (even after weighting) has increased over

time, with a slight decline from 1990 to 1993. The data are based on the grant

dates. Figure C.5 shows weighted numbers of backward patent citations for

each industry. We can see that the trend is similar for all industries; however,

the companies in aerospace industry seem to rely more on prior inventions than

the others. The overall change may have several explanations: the most prob-

able would be that the newer inventions are more complex, and thus require

knowledge flows from many different sources. Yet it may also be because the

inventors prefer to present a more comprehensive list of the prior art, in order

to have the patent granted faster (the examiner must search for fewer patents).

Finally, it may also be thanks to better technology, which allows the examiner

to search for the prior art more successfully. The most heavily citing company by far is

Citrix Systems, with an average of 51.6 backward citations per patent, whereas

Maxim Integrated Products only has 8.5 citations to preceding patents on av-

erage.


Forward citations cannot be analysed without several adjustments because

of their unknown future value. Backward citations pose no problem in this

regard, as mentioned in Chapter 3. Some authors (van Zeebroeck 2011,

Lanjouw and Schankerman 2004, Sapsalis et al. 2006, or Gambardella et al.

2008) propose a comparison of forward citations obtained only during the first

few (observable) years, while others rather focus only on patents from one year,

in order to be able to compare them among themselves (Schneider and Leuven

2007). These methods are unsuitable for my study, as the former requires a

rather large time span to yield reliable results,16 and the latter does not fit my

aim to look for changes in patents from different years.

Hall et al. (2000) suggested constructing a citation distribution to see

differences among industries in the dataset. I have created such a distribution from

a sample of my data; Figure 5.3 contains the results.

Figure 5.3: Sample distribution of forward patent citations.

The distributions fairly closely resemble the previous observations, with a

slightly higher probability of being cited in the early years after the patent grant.

The distribution is very similar for the computer industry, which is the only

category observed in both datasets. The line representing the software

industry deviates from the others due to the characteristics of my observations;

16 One cannot utilize observations from the recent years then - it was not a problem for the other works, because they did not have the ambition to observe the longest possible data period as I do. Further, this approach would be highly misleading because of the changes in forward citation distribution (see Appendix).


the software companies in my dataset started to patent in the late 1980s, and

it is therefore impossible to observe patent citations for as long a time period

as for the other technological fields, explaining the steeper decline and almost

zero probability of being cited past the age of 20.
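A citation lag distribution of this kind can be built from pairs of grant years. The sketch below is only an illustration of the general idea, assuming each forward citation is available as a (cited patent grant year, citing patent grant year) pair; the function name and the sample pairs are hypothetical, and the procedure actually used for Figure 5.3 may differ.

```python
from collections import Counter


def lag_distribution(citation_pairs):
    """Share of forward citations received at each lag (in years).

    `citation_pairs` holds (cited_grant_year, citing_grant_year) tuples;
    pairs with a negative lag are ignored.
    """
    lags = Counter(
        citing - cited for cited, citing in citation_pairs if citing >= cited
    )
    total = sum(lags.values())
    return {lag: count / total for lag, count in sorted(lags.items())}


# Hypothetical citation pairs for one industry; the values are illustrative only.
pairs = [(1990, 1991), (1990, 1991), (1990, 1995), (1992, 1993)]
for lag, share in lag_distribution(pairs).items():
    print(f"lag {lag}: {share:.2f}")
```

Computing one such distribution per industry makes the curves directly comparable, since each sums to one regardless of how many patents the industry holds.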

The divergence for higher lags is again given by the fact that the companies

in software and semiconductor industry started applying for patent protection

later than those in aerospace and computer manufacturing industries. But

the variance in citation gains for early years is significant, and points at some

interesting facts, e.g. that the patents in aerospace industry seem to be relevant

for a much longer period of time than the patents from semiconductor or software

industry (i.e. the same findings as in the case of backward citations). Such

differences were earlier suggested by Jaffe et al. (1993). They are to be expected

given the dissimilar nature of the industries; the rapid innovation in the software and

semiconductor industries means that older patents are less relevant to new

inventions.

Hall et al. (2000) further predict the total number of forward

citations a patent would receive at a given age. Even though their method is

not usable in my study, it inspired me to develop my own approach. It is

fully described in the Appendix; I will only discuss the outcome here. First,

Table A.1 shows the number of forward citations obtained by patents from each

industry and time cohort up to 11 years after the patent grant in a sample from

my dataset (the numbers are weighted by the observed patents in each time

cohort, in order to be directly comparable). The data show a clear overall

increase in the number of forward citations obtained in the early years after

the patent grant. Patents granted between 2001 and 2005 are cited more than

two times as much as those granted between 1996 and 2000 in the first year,

and there is a sharp decline in the citations obtained in the later years.

I have created sample cumulative distribution functions to be able to predict

the total number of forward citations a patent would obtain and to graphically

illustrate the change in the distribution (see Figure A.1). For newer patents,

the function is much steeper, meaning that the patents in these groups are

predicted to obtain many more citations in the early years, but fewer in the later ones.

That corresponds not only to the trend of rapid innovation (the distributions

of patents in semiconductor industry for the last two time cohorts are in fact

very different from the other industries, which would indeed refer to a sub-

stantial evolution in semiconductor industry during last 20 years), but also to

the suggested decline in patent value over time (and possibly the strategical


exploitation). Arguably, if a patent is obtained for strategical purposes, only

the patents applied for shortly after its grant (i.e. those entangled in the patent

thicket) ought to cite it, as the patent most likely does not bring any

significant technological improvement (later patents would not build upon the

invention).
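To make the role of the cumulative distribution function concrete, the following Python sketch shows one generic way such a function can be used for prediction: the citations observed up to a given age are divided by the share of lifetime citations that a reference cohort had received by that age. This is an illustration under stated assumptions, not the procedure from the Appendix; the function names and numbers are hypothetical.

```python
def empirical_cdf(citations_by_age):
    """Cumulative share of a cohort's citations received up to each age."""
    total = sum(citations_by_age)
    shares, running = [], 0
    for count in citations_by_age:
        running += count
        shares.append(running / total)
    return shares


def predict_total(observed, age, cdf):
    """Extrapolate citations observed through `age` (0-based index) to the full horizon."""
    share = cdf[age]
    if share == 0:
        raise ValueError("no citations expected by this age; cannot extrapolate")
    return observed / share


# Hypothetical reference cohort: citations received at ages 0, 1, 2 and 3.
cdf = empirical_cdf([2, 3, 3, 2])
print(cdf)
print(predict_total(observed=4, age=1, cdf=cdf))
```

A steeper function reaches a high cumulative share early, so the same early citation count translates into a smaller predicted total, which matches the interpretation given above for the newer cohorts.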

The functions were then used to predict the total number of forward

citations gained 31 years after the patent grant (i.e. the maximum number of observable years

for the earliest time cohort). The results are shown in Figure 5.2. We can see a

steady increase until 1996, followed by a steep decline until the end of the data.

Because the distribution of forward citations, just as the distribution of patent

value, is extremely skewed, the median may be more reliable for a conclusion.

Clearly, the trend is similar; however, the changes are much more gradual. It

is crucial to mention that the percentage of patents which have not obtained

a single forward citation has greatly increased. In fact, only about 2% of all patents in my dataset issued in 1976 have not received a forward citation, in comparison to 7% in 2002 and an immense 20% in 2005. It seems that not only are the newer patents cited less in general, but about one fifth of them seem not to have any technological value at all.
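The prediction step can be illustrated with a minimal sketch. It assumes that the sample cumulative distribution gives, for each age, the share of a cohort's citations received by that age, and that a patent's citations observed so far are scaled up by the inverse of that share. The function names and the toy numbers are hypothetical and only serve the illustration; the horizon here is 5 years rather than 31.

```python
from statistics import mean, median

def sample_cdf(yearly_citations):
    """Cumulative share of citations received by each age (index 0 = first year)."""
    total = sum(yearly_citations)
    shares, running = [], 0
    for count in yearly_citations:
        running += count
        shares.append(running / total)
    return shares

def predict_total(observed_citations, age, cdf):
    """Scale citations observed up to `age` (in years) to the full horizon of the CDF."""
    share = cdf[age - 1]
    if share == 0:
        return 0.0
    return observed_citations / share

# Hypothetical cohort profile: citations received in each year after the grant.
cdf = sample_cdf([10, 20, 30, 25, 15])

# Hypothetical patents observed for two years with 0, 3, 6 and 30 citations.
predicted = [predict_total(obs, age=2, cdf=cdf) for obs in (0, 3, 6, 30)]
zero_share = sum(1 for p in predicted if p == 0) / len(predicted)

# With a skewed distribution the mean lies far above the median.
print(predicted, mean(predicted), median(predicted), zero_share)
```

A steeper cumulative function means a larger share at early ages, so the same early citation count translates into a lower predicted total.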

To better understand the immense drop in forward citations, I have also

obtained the results for each industry. These are depicted in Figure 5.4.

Figure 5.4: Forward citations by industry.

Figure 5.4 makes it clear that the previously mentioned trend varies heavily across the industries. The patents of companies in the aerospace industry (which have been shown to be more dependent on prior inventions) seem to be very stable over the observed period. This is in sharp contrast with the other industries, particularly the software industry. It seems that the patents in the software industry were of very high technological value at first, but lost much of their value as time went by. That is logical, as the patents from the late 1990s (i.e. those applied for in the mid-1990s) laid the foundations for the whole software industry,17 while the recent inventions are not so valuable and, arguably, mainly strategical.18

To better understand patent citations as a whole, we must also look at

the numbers of patents in each industry (shown in Table C.2). There are fewer patents observed in the aerospace industry than in the software industry, even though software companies started to patent much later. There are also more patents in the computer industry than in all the other industries together. From

these statistics and from the figures shown previously, I may conclude that

the number of backward citations (i.e. the number of inventions that a given

patent is built on) is rather similar for all the industries. On the other hand,

the number of forward citations (i.e. the technological value of a given patent)

varies heavily because the industries vary as well. It is not surprising that the number of forward citations has grown for most of the observed period in the computer industry: not only did the total stock of issued patents grow each year, but the number of newly issued patents also increased each year (i.e. there were more patent grants each year). To better

illustrate this, I use the same methodology as in the case of backward citations,

to weight forward citations as well. Figure C.4 exhibits the results.

We can see that the steady increase within the computer industry is now rather flat, with only one lasting increase between 1986 and 1989. Yet one must keep in mind that there is no strong theory suggesting that forward patent citation counts would be biased by the increasing number of patents, however likely it might be; therefore, I refrain from drawing conclusions regarding the issue.

Software development company Oracle has the most valuable patents regarding forward citations with an average of 34.2, while its competitor, Red Hat, has only 4.6 forward citations per patent on average.19

17 That applies to the patents granted prior to 1990 as well, but those are not shown in Figure 5.4 because of their low counts, which could have biased the results.

18 Indeed, some famous recent patent applications include a specific movement of icons on the screen by Apple, an “upgrade” button for applications by Lodsys, or one-click purchase by Amazon. (Sakmann 2012)

5.2 Family Size

Unlike citations, family size as a variable does not need to be adjusted. EPO

searches other patent offices’ patents by their priority numbers20 and puts together similar patents from all around the world. These include not only the exact same invention applied for in different countries, but also all similar inventions based on one priority number.21
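The grouping of documents into families can be sketched as a connected-components problem: two documents belong to the same family if they are linked, directly or indirectly, through shared priority numbers. The sketch below is only an illustration of that idea with hypothetical document and priority identifiers; it is not the procedure EPO actually runs.

```python
def build_families(priorities_by_document):
    """Group documents that are directly or indirectly linked via priority numbers."""
    parent = {}

    def find(node):
        # Return the representative of the node's group (with path halving).
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every document to each of its priority numbers.
    for document, priorities in priorities_by_document.items():
        find(("doc", document))
        for priority in priorities:
            union(("doc", document), ("prio", priority))

    families = {}
    for document in priorities_by_document:
        families.setdefault(find(("doc", document)), set()).add(document)
    return list(families.values())

# Hypothetical documents and their priority numbers.
documents = {
    "US-1": ["P100"],
    "EP-1": ["P100", "P200"],
    "JP-1": ["P200"],
    "US-2": ["P300"],
}
families = build_families(documents)
family_size = {doc: len(fam) for fam in families for doc in fam}
print(family_size)
```

In this toy example "US-1" and "JP-1" share no priority number, yet both fall into one family of size three because "EP-1" links them indirectly.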

Figure 5.5: Family size, based on date of application.

Figure 5.5 shows the evolution of the average family size within my dataset.

19 The data are the averages for all patents for a given company and a given year.

20 A priority number is assigned to each new invention, and thus identifies the priority claim of its owner. A priority claim may be used by a patent application to claim priority from another previously filed application, in order to take advantage of the filing date of the former one. In other words, it is enough to apply for a patent in one country and then apply for the same patent later (although within one year from the first application) in another country, while benefiting from having applied for it in the first country before. Any other inventor who applied for the same patent between the two applications would not gain the right to his invention, even though he would be first in the national regard. This applies to all countries which are party to the Paris Convention. Such behaviour is desirable for firms, as it delays their expenses for applications in other countries by up to a year without the threat of their competitors applying first.

21 One invention may have more than one priority number. EPO describes a patent family as: “All the documents directly or indirectly linked via a priority document belong to one patent family.” (http://www.epo.org/searching/essentials/patent-families/inpadoc.html)


There is a rather steep decline until 1984, followed by a further steady down-

ward trend over the years. The evolution in the first observed years and its

sudden change interestingly corresponds to the increase in patent applications

after the establishment of the Court of Appeals. It may be so that those who

applied for patents before the establishment also sought patent protection in

more countries. Assuming family size to be truly a correlate of patent value,22 I

may conclude that patents are now much less valuable than they were before. I

have again obtained the results for each industry separately as well (see Figure

C.6). But unlike citations, family size shows no important differences across

the industries.

5.3 Renewals

Patents in the present US patent system must be renewed at the end of a certain period of time to remain valid. This only applies to utility patents, whereas design and plant patents cannot be renewed (those are granted for much longer though). Currently, every patent is valid for 4 years after the grant date, and the maintenance fee must be paid in the last 6 months of its validity in order to renew it (with a possible slight delay, which is then fined). Payment validates the patent for another 4 years, up to 8 years from the grant date. The same procedure may be repeated twice more; the last payment extends the patent’s life up to 20 years from the application date. The fees are substantially higher for further renewals, and are currently at $1,130, $2,850, and $4,730 for payments due at 3.5, 7.5, and 11.5 years, respectively.23 These apply only to large firms; smaller firms’ payments are exactly half of these. None of the companies in my dataset is considered small in this regard. Only patents with an application date after December 12, 1980, are subject to maintenance fees; thus, it makes no sense to include any earlier issued patents in my analysis. Besides that, I can only consider patents granted up to and including 2007, i.e. older than 4 years, to be able to observe patent renewal decisions.
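To put the fee schedule into perspective, the following sketch adds up the maintenance fees quoted above. The function name and the `small_entity` flag are hypothetical; the amounts are the ones cited in the text and may differ from any later USPTO schedule.

```python
# (years after grant when the fee is due, fee in USD for large entities), as quoted in the text
MAINTENANCE_FEES = [(3.5, 1130), (7.5, 2850), (11.5, 4730)]

def cumulative_fee(renewals, small_entity=False):
    """Total maintenance fees paid after `renewals` payments (0 to 3)."""
    if not 0 <= renewals <= len(MAINTENANCE_FEES):
        raise ValueError("renewals must be between 0 and 3")
    total = sum(fee for _, fee in MAINTENANCE_FEES[:renewals])
    # Small entities pay exactly half of the large-entity fees.
    return total / 2 if small_entity else total

for n in range(4):
    print(n, cumulative_fee(n), cumulative_fee(n, small_entity=True))
```

Keeping a patent alive for the full term thus costs a large firm $8,710 in maintenance fees under this schedule, which supports the remark that the fees are unlikely to be binding for the companies in the dataset.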

Statistics of renewals are fairly interesting. More than 90% of patents in my

dataset were renewed at least once. This is similar for all the observed years,

22 The meaning of family size is a little bit different in this case from what was mentioned in Chapter 3. The outcome remains the same though: a larger family size should point at more valuable patents both because the patent has been applied for in other countries and because there are more inventions built upon the underlying idea (i.e. the same reasoning as in the case of forward patent citations).

23http://www.uspto.gov/web/offices/ac/qa/ope/fee092611.htm


with the exception of the years 1987 to 1991. The computer manufacturing industry has much lower values in recent years, falling to 75% in 2004, whereas the other three industries exhibit consistent figures above 90%. We can see that it is very common to extend patent validity at least once. The numbers may be inflated by the size of the observed firms, though: the first maintenance fee may be rather insignificant for companies with yearly revenues exceeding a billion dollars, so only the most useless patents would not be renewed at least once.

Figure 5.6: Renewal data, patents granted from 1976 to 1999.

To make things more interesting, I have divided my dataset, limited by the

specifications of the renewal system, into three categories: from the application

date after December 12, 1980, to the grant date up to and including 1999; with

the date of the grant between 2000 and 2003; and finally, with the date of the

grant between 2004 and 2007 (i.e. patents, which could have been renewed

three times, two times, and once, respectively). It is then possible to compare

patents that could have been extended for the same period of time. Figure

5.6 exhibits how volatile renewal statistics actually are. On average, about the

same number of patents were renewed once and twice (and could have been

renewed three times), and over 40% of patents were renewed for the full term

in the first category. There was a major decline in full renewals, followed by

low values between 1984 and 1991, while the number of patents not renewed

at all increased. Remarkably, there was an increase in patents renewed once at the same time. For the patents granted after 1991, the probability of being renewed for the whole 20 years increased up to 80%, with a slight decline from 1997 to 1999.
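The division into the three categories can be written down as a simple rule. The sketch below follows the cut-off dates given in the text; the function name and the choice to return `None` for excluded patents are my own illustrative assumptions.

```python
from datetime import date

# Only applications filed after this date are subject to maintenance fees.
FEE_REGIME_START = date(1980, 12, 12)

def observable_renewals(application_date, grant_year):
    """Number of renewal decisions that can be observed, or None if the patent is excluded."""
    if application_date <= FEE_REGIME_START or grant_year > 2007:
        return None
    if grant_year <= 1999:
        return 3  # first category: could have been renewed three times
    if grant_year <= 2003:
        return 2  # second category: could have been renewed twice
    return 1      # third category (2004 to 2007): could have been renewed once

print(observable_renewals(date(1985, 3, 1), 1988))
print(observable_renewals(date(2005, 6, 1), 2009))
```

Comparing patents only within the same category ensures that all of them had the same number of opportunities to be renewed.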

The second category (patents granted between 2000 and 2003) is summa-

rized in Figure C.2. It shows a similar trend regarding the number of patents

renewed once or not at all as Figure 5.6. The percentage of patents renewed

twice (i.e. for the longest possible period) is above 70%. This again follows

the decision-making from the first category, looking at patents renewed at least

twice. Patents with the date of the grant between 2004 and 2007 are shown in

Figure C.3. About 90% of all patents were renewed. The number is a little bit

lower than in the previous years due to the decline in the renewals in computer

manufacturing industry. That said, I may again conclude that the overall value of patents (assuming that renewal data are indeed correlated with it) has been rather stable over time, but dropped significantly in the last observed period in the computer manufacturing industry.

5.4 Patent Trades

Following the discussion in Serrano (2005), patents are traded because some companies are more productive in the use of a given patent. The cost of such a transaction, including the cost of implementation, acts as a selection mechanism, i.e. only the more valuable patents should be traded. According

to his results, the probability of a patent being traded decreases from the date

of the grant, with a slight increase just after being renewed. That said, my

dataset, curtailed due to the limitations of the other variables I use, should

provide reliable data even for the recently issued patents.

Arguably, the assumed strategical behaviour of companies regarding their patent portfolios in recent years may have devalued patent trade data as a correlate of patent value, due to a higher probability of trading whole portfolios for strategical reasons rather than to obtain highly valuable patents. Because I only use patent trades in my analysis up to the year 2005 and Figure 5.7 shows no noticeable changes until then, at least in the trade volumes, I assume that if patent trade data have indeed lost their explanatory power, this has happened only in the very recent years, and my analysis should thus not be affected.

Figure 5.7 demonstrates patent trade statistics.24 There is a very significant

24 The USPTO database contains general data about patent ownership and its changes. It was possible not only to see that a patent has changed its owner, but also for what reason


rise and fall between the years 1981 and 1984. Any explanation of this pattern would be speculative, and thus I refrain from offering one. Of higher interest

is the steady increase followed by a decrease, from 1986 to 1994 and 1995 to

2006, respectively. The results follow the other patent value correlates (e.g.

forward patent citations).

Figure 5.7: The average number of traded patents.

I have once more obtained the results separately for each industry; these are depicted in Figure C.7. We can see a rather striking difference in the trade of computer patents in the 1990s, but it seems that it was only a temporary deviation, followed by a steep decrease.

5.5 The Other Variables

To support the indicators correlated with patent value, several other variables

were obtained. Those are pure patent characteristics, which do not rely on

further management of the patent or time passed since the date of the patent

issue; they are all known by the time of the patent grant and do not evolve any

further.

These complexity measures include: the number of inventors per patent application, the number of different International, European and US patent classifications, and backward citations to other patents. Figure 5.8 shows changes in those indicators.

(i.e. direct trades, acquisitions, court decisions etc.) The data I present in this paper contain only the direct patent trades and licence agreements, which should indicate a higher patent value in a similar way to patent trades.

Figure 5.8: The other patent variables.

Clearly, the average number of inventors has grown over the recent decades.

It may indicate a lesser capability of firms to invent, although the companies

in my dataset are very large and it is possible that they simply spend more re-

sources on research, while employing more scientists. Patents of semiconductor

company Applied Materials have the highest average number of inventors per

patent (3.5), whereas software company Red Hat has only 1.1.

US and European patent classifications show no significant development

throughout the observed period of time. International classification, on the

other hand, shows a constant decrease, with a very significant drop in 2006.

This is due to several changes in its measurement.25 The last change was on

January 1st, 2006. Detailed statistics of all obtained variables are shown in

Table C.3.

25http://pesquisa.inpi.gov.br/ipc/guide/en/guide.pdf

Chapter 6

Empirical analysis

In this section, I further utilize my dataset to perform an empirical analysis. In

the first part, several econometric models are employed to discover the relation-

ships among my variables; to show the differences in my results, compared to

the previous literature; to provide additional information regarding the effec-

tiveness of the variables (as shown in Table 3.1); and finally, to discover whether

the explanatory variables are sufficient to explain the year-to-year differences

in the value correlates. Then, I use DEA to measure relative efficiency with

which the observed companies transform certain inputs into patents, in order

to further expound my observations and to provide some empirical evidence

for the theory of strategical patents.

6.1 Econometric analysis

The econometric approach has been developed by e.g. Lanjouw and Schanker-

man (2001), Gambardella et al. (2008), or Schneider and Leuven (2007). I

estimate 4 different models which more or less arise from their work, and dis-

cuss the differences in my results compared to what has been found before.

The first two models explain how much of the variance in patent value is ex-

plained by the indicators contained in the patent document, an approach used

by Bessen (2008). In these models, patent value is substituted by its correlates

(forward citations and family size), which have previously been shown to have

a very strong and significant relationship with it (see Table 3.1), because the

value itself cannot be observed. Due to the nature of forward citations, the

number of observations in the first model is lower than in the model using fam-

ily size as the dependent variable, as I had to restrict the dataset from the right

to work with reliable data. The citation values are the predicted forward cita-

tion counts a patent would obtain 31 years after the grant. At the same time,

I check the importance of time and industrial affiliation through the dummy

variables.

The latter two models build upon the works of Reitzig (2003a), or Lanjouw

and Schankerman (2001), and elucidate the importance of patent characteristics

for the managerial decisions, such as whether to renew or to trade a patent.

This is essentially different from the former two models; the question here is no longer whether a certain characteristic is connected to patent value, but how it can affect the decision tree of a patent’s life. Therefore, I also include forward citations as an explanatory variable in the regressions, to see the significance of a patent’s “performance” for its validity and the probability of a change of owner. Again, I put all the time and industry dummies in the regression.

Furthermore, I estimate all four models twice more; once with a linear and once

with a quadratic time trend, to test whether the other explanatory variables

(backward citations, the number of inventors and priority numbers, patent

classifications, and the industrial dummy variables) are sufficient to explain

the differences in patent value.

Following Sapsalis et al. (2006), I use the negative binomial model to esti-

mate the equations with forward citations (Fcit) and patent family size (Fsize)

as the dependent variables for their skewed nature (see Table C.3 for the detailed statistics). The individual units $y_i$ follow a Poisson regression model (with parameter $\mu_i^*$), with an omitted variable $u_i$, such that $\exp(u_i)$ follows a gamma distribution with mean 1 and variance $\alpha$:

$$y_i \sim \mathrm{Poisson}(\mu_i^*)$$

$$\mu_i^* = \exp(x_i \beta_i + u_i)$$

$$\exp(u_i) \sim \mathrm{Gamma}\left(\frac{1}{\alpha}, \frac{1}{\alpha}\right)$$

where $\beta_i$ is the vector of parameters, $x_i$ is the vector of explanatory variables, and $\alpha$ is the overdispersion parameter. The vector of explanatory variables consists of the number of inventors per patent application (inventors), the number of different International (INTclass), European (EUclass) and US (USclass) patent classifications, and backward citations to other patents (Bcit). Most of the variables are in logarithmic form for easier interpretation of the outcome.


The results are reported together with the robust standard errors26 and the

overdispersion parameter α.
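A short worked derivation may help to see why α is called the overdispersion parameter. It uses only the assumptions stated above, namely that exp(u_i) has mean 1 and variance α; the shorthand μ_i for exp(x_i β_i) is introduced here for convenience, and all moments are conditional on x_i.

```latex
\begin{align*}
\mu_i &:= \exp(x_i \beta_i), \qquad
  y_i \mid u_i \sim \mathrm{Poisson}\bigl(\mu_i e^{u_i}\bigr), \\
E[y_i] &= E\bigl[\mu_i e^{u_i}\bigr] = \mu_i, \\
\mathrm{Var}[y_i] &= E\bigl[\mathrm{Var}(y_i \mid u_i)\bigr]
  + \mathrm{Var}\bigl(E[y_i \mid u_i]\bigr)
  = \mu_i + \alpha \mu_i^2 .
\end{align*}
```

With α = 0 the variance equals the mean and the model collapses to the Poisson regression, whereas any α > 0 lets the variance exceed the mean, which suits heavily skewed counts such as forward citations.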

Table 6.1: Negative binomial regressions.

Negative binomial     F. Citations         Family Size
regression            Coef.       S. E.    Coef.       S. E.

Const.                3.37***     0.08     1.37***     0.06
Aerospace             -0.97***    0.02     -0.11***    0.01
Computers             -0.50***    0.02     -0.21***    0.01
Semiconductors        -0.64***    0.02     -0.14***    0.01
Software              (omitted)            (omitted)
log(Bcit)             0.14***     0.00     0.11***     0.00
log(inventors)        0.17***     0.01     0.08***     0.00
log(USclass)          0.14***     0.01     -0.02***    0.00
log(EUclass)          0.26***     0.01     0.16***     0.00
log(INTclass)1        -0.26***    0.04     -0.07       0.04
log(INTclass)2        -0.20***    0.04     -0.08**     0.03
log(INTclass)3        -0.23***    0.03     -0.14***    0.02
log(INTclass)4        -0.30***    0.03     -0.07**     0.03
log(INTclass)5        -0.29***    0.02     0.08***     0.02
log(INTclass)6        -0.25***    0.01     0.11***     0.01
log(INTclass)7        (omitted)            0.00        0.01
Priorities            -0.08***    0.01     0.36***     0.01

log(α)                0.07        0.01     -0.93       0.01
Wald χ2-test          19461.61             16187.62
Log pseudolikelihood  -326066.79           -348564.23
Number of obs.        84147                161360

* p < 0.10, ** p < 0.05, *** p < 0.01

Table 6.1 contains the results of the first two regressions.27 Both dependent

variables seem to react to the determinants in a similar fashion. Indeed, my

results correspond to the findings made by the preceding literature. Backward

citations are positive and significant in both models. Several differences appear

among patent classifications. European patent classification is positive and

26 As I am not particularly interested in the exact coefficients and I have a large dataset, I rely only on the robust statistics as a remedy for heteroskedasticity (which seems to be present according to the tests I have made) and non-normality of errors in all four regressions. Moreover, the models I apply are much less susceptible to possible biases than the standard OLS models. Hence, I assume that my results are reliable.

27 I have run the likelihood-ratio test of α equal to zero to compare the negative binomial regression model with the Poisson model; the latter proved to be worse because its condition that the conditional variance equals the conditional mean is not satisfied.


significant in both models, whereas US patent classification is positive and

significant in the regression with forward patent citations, but negative in the

model with family size.

International patent classification had to be divided into 7 groups, each one

joined by a dummy variable to distinguish it from the others, because of the

changes in its measurement. The results are volatile yet significant. The first

four periods in both models appear to have a negative impact on the explained

variables, while the latter two are negative in the first regression and positive

in the regression with patent family size. The last period is omitted and in-

significant for the models, respectively; suggesting that the last (and immense

- see Figure 5.8) change in the measurement removed most of its explanatory

power. The sign of International patent classification is in line with the mixed findings of Lerner (1994), Harhoff and Reitzig (2004), and Lanjouw and Schankerman (1997), who observed significant and positive, negative, and insignificant results, respectively.

Contrary to what could have been expected (see e.g. van Pottelsberghe de la Potterie and van Zeebroeck (2011)), the number of patent priority claims recorded in the patent document has a negative and significant effect on the number of forward citations, yet the variable has the expected positive coefficient in the second model. The number of inventors has substantial explanatory power regarding the explained variables in both cases.

The dummy variables associated with different industries are negative and

significant in both regressions. The negative sign, as well as the omission of one industrial dummy, is caused by the inclusion of the intercept in the regression.28 The time dummy variables are not included in Table 6.1. Instead, I plotted

them, together with the 95% confidence intervals, in Figure 6.1. These clarify

the trend in the explained variables, which remains unexplained using the other

explanatory variables.

The top two graphs belong to the model with forward citations and family

size as the dependent variable, respectively. The bottom two graphs show

the time dummies included in the following two regressions. All of them are negative and significant in the second model, whereas in the first model they are negative and significant only in the last observed years (2002-2005), have positive and significant coefficients between 1989 and 1998, and are

28 Every patent belongs to some industry; thus, there is a problem of collinearity if I include both the intercept and all the dummies. I have run an additional regression without the intercept, in which all the signs of the industrial variables were positive, substituting the constant.


Figure 6.1: Coefficients and 95% confidence intervals of time dummies, all four regressions. Top two are the negative binomial regressions (with Fcit and Fsize as the dependent variables, respectively), bottom two are the probit regressions (with renewals and trades, respectively).

insignificant otherwise. This behaviour is directly connected to the evolution

of forward citations as shown in Figure 5.2. Both linear and quadratic time

trends are significant with negative sign in both regressions.

The overall results correspond to the prior findings regarding the sign and

the significance of backward citations, and US and International classifications.

At the same time, they suggest that European patent classification has a positive and significant explanatory power regarding patent value,

while priorities seem ambiguous.

The latter two models use patent renewal decisions and patent trades as the

dependent variables. Because patent renewals can be fully observed for only a

limited period of time within my dataset, I rather employ a model estimating

the probability a patent would be renewed at least once. The other model

estimates probability that a patent would be traded, building upon findings

made by Serrano (2005). Forward patent citations are used in their percentile form.29 Unfortunately, the low number of observations for certain time periods and industries limits the data for the software and aerospace industries in the percentile estimation.

29 The percentile values are computed for each year and each industry separately. The percentile form is used for two reasons: first, it completely standardizes the values of forward citations, which have to be predicted otherwise. This was not possible in the previous two models due to citations being used as the correlate of patent value, which must be allowed to
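The percentile transformation can be sketched as follows. The text does not state which percentile definition was used, so the mid-rank convention below (tied patents share the same percentile) is an assumption, as are the field names and the toy records.

```python
from collections import defaultdict

def percentile_ranks(patents):
    """Percentile of each patent's forward citations within its (year, industry) group."""
    groups = defaultdict(list)
    for patent in patents:
        groups[(patent["year"], patent["industry"])].append(patent["fcit"])

    ranks = {}
    for patent in patents:
        peers = groups[(patent["year"], patent["industry"])]
        below = sum(1 for c in peers if c < patent["fcit"])
        equal = sum(1 for c in peers if c == patent["fcit"])
        # Mid-rank convention: count half of the ties (including the patent itself).
        ranks[patent["id"]] = 100.0 * (below + 0.5 * equal) / len(peers)
    return ranks

# Hypothetical records; "fcit" is the forward citation count.
patents = [
    {"id": "A", "year": 1995, "industry": "software", "fcit": 40},
    {"id": "B", "year": 1995, "industry": "software", "fcit": 10},
    {"id": "C", "year": 1995, "industry": "software", "fcit": 10},
    {"id": "D", "year": 1995, "industry": "software", "fcit": 0},
    {"id": "E", "year": 2004, "industry": "software", "fcit": 10},
]
ranks = percentile_ranks(patents)
print(ranks)
```

Patent "E" illustrates the problem with sparse groups: with a single observation in its year and industry, the percentile carries no information, which is why low counts limit the usable data.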

Table 6.2: Probit regressions.

Probit                Renewal              Traded
regression            Coef.       S. E.    Coef.       S. E.

Const.                3.54***     0.32     -2.22***    0.13
Aerospace             -1.62***    0.11     0.63***     0.04
Computers             -1.61***    0.10     0.73***     0.03
Semiconductors        -1.32***    0.11     0.41***     0.03
Software              (omitted)            (omitted)

Fcit(percentiles)     0.42***     0.02     0.04*       0.02
log(Bcit)             -0.01       0.01     -0.03**     0.01
log(inventors)        -0.02*      0.01     -0.07***    0.01
log(USclass)          -0.07***    0.01     -0.03**     0.01
log(EUclass)          0.10***     0.01     -0.04**     0.01
log(INTclass)1        (omitted)            0.15*       0.07
log(INTclass)2        0.00        0.16     0.01        0.06
log(INTclass)3        -0.07       0.08     0.15**      0.06
log(INTclass)4        -0.23***    0.05     0.02        0.04
log(INTclass)5        -0.06       0.04     0.20***     0.03
log(INTclass)6        0.05**      0.02     0.17***     0.02
Priorities            0.03*       0.01     -0.02**     0.01

Wald χ2-test          2238.86              1960.54
Log pseudolikelihood  -21021.62            -33389.71
Number of obs.        79029                84147

* p < 0.10, ** p < 0.05, *** p < 0.01

The results are shown in Table 6.2. The constant and the coefficients of

the industrial dummy variables act differently in both models. This is not

unexpected if we look at the descriptive statistics of renewal data and patent

trades; over 90% of all patents were renewed at least once, but only about 13%

of all patents were traded. The most important result of the regression with

patent renewal data is definitely the significance and the positive sign of forward

evolve over time. Second, the latter two models help to answer the question “Why was this patent renewed (traded) and the others were not?” The percentile values directly compare the given patent to all patents granted in the same year and industry, which substitutes the comparison that the management would have at the time of the decision. I do not include the family size variable in the regression, as its value also depends only on the decision of the management.


patent citations. It seems that the decision whether to extend patent’s validity

does indeed depend on the value of the patent. The second model exhibits

results very similar to findings of Serrano (2005). Valuable patents, measured

by forward patent citations, are truly more likely to be traded. Together with

the previous outcomes, I may conclude that International classification has lost

nearly all its explanatory power. Further, my dataset yields results similar to

those in the previous literature, and the managerial decisions depend on the

value of the patent, as expected.

The time dummies are mostly insignificant for the former model with renewal data; only the years from 1999 onwards seem to be statistically significant at the 5% level. The opposite is true in the case of the latter model. Most of the year dummies are significant and positively correlated (whereas they are negatively correlated in the first model), suggesting that patent trades were more “fashionable” in certain years, but renewal decisions remained constant. However, the estimations with the time trends included show quite the opposite; both are significant in the first model, and insignificant in the second. The explanation is present in the plots of the time dummies in Figure 6.1. The first model shows a downward trend in the time dummy coefficients, while there are significant ups and downs in the other.

The test statistics for the overall significance of the models show no doubt

that all four models are much better than their alternatives with no predictors

(i.e. the hypothesis H0: β1 = ... = βn = 0 can be rejected even at the 0.1% significance level). Additional statistics to compare models are shown in Tables

C.4 and C.5.

6.2 DEA analysis

If the theory about the strategical use of patents, suggested by e.g. Macdonald (2004), is indeed correct, there should be some visible changes in companies’ performance regarding their patent portfolios. Essentially, a company behaving strategically is supposed to be less efficient in creating technologically valuable patents and, at the same time, more efficient in producing a high number of patents in general. Such a company does not rely on valuable inventions to secure itself a market advantage; instead, it endeavours to create large patent thickets around key patents of the competition. And to create such thickets, it requires a large number of patents, whatever their quality may be.

To search for these changes, I use Data Envelopment Analysis (DEA), a


method introduced by Farrell (1957) and further developed by Charnes et al.

(1978), which has been extensively applied to evaluate performance in manu-

facturing and service operations. DEA measures relative efficiency of a homo-

geneous set of decision making units (DMU) through calculating an efficiency

frontier and then comparing each DMU to it. DEA allows multiple inputs

and outputs in natural units to be used at the same time. The final relative

efficiency score is then computed as:

$$\text{Efficiency} = \frac{\text{Weighted Sum of Outputs}}{\text{Weighted Sum of Inputs}}$$

Essentially, DEA only requires inputs and outputs to be given, not the production function (i.e. the “black box”).

Further, DEA allows for both constant and variable returns to scale, which

is useful for my work, as the companies in my dataset are of dissimilar sizes

and, more importantly, have different propensity to patent; thus, they perform

research and development with a distinct effort. Constant returns to scale

assume that the maximal effectiveness remains the same for production at a

small scale as at a large scale. In other words, a company producing 10 units of

output from 10 units of input is supposed to produce 100 units of output from

100 units of input, if it remains fully effective at both times. Variable returns

to scale assume that a company may be fully effective at both small and large

scale, even though the input-output ratio is different.
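The constant returns to scale case can be illustrated for the simplest setting of one input and one output, where the weights cancel and the efficiency score reduces to each unit's output-input ratio divided by the best ratio observed. This is a deliberately simplified sketch with hypothetical units, not the multi-input, multi-output model used in the analysis.

```python
def crs_efficiency(units):
    """Relative efficiency under constant returns to scale (one input, one output)."""
    best_ratio = max(output / inp for inp, output in units.values())
    return {name: (output / inp) / best_ratio for name, (inp, output) in units.items()}

# Hypothetical DMUs given as (input, output).
units = {
    "small": (10, 10),       # 10 units of output from 10 units of input
    "large": (100, 100),     # same ratio at a larger scale
    "large_low": (100, 60),  # lower ratio at the larger scale
}
scores = crs_efficiency(units)
print(scores)
```

Under constant returns to scale, "large_low" is judged against the ratio achieved by "small" and scores 0.6. Under variable returns to scale it would be compared only with units of similar size, so its score could be higher.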

Needless to say, the results are very different, depending on which method

is used. There are many reasons why it would be reasonable to assume either of

them. Arguably, it should cost the same amount of money and time to create

an invention of a certain technological value, no matter whether the inventors

work for a small or a large company. Moreover, the average number of inventors

per patent is very similar for all companies in my dataset, even though they

are of considerably different sizes. On the other hand, it is rational to expect that a large company may be better at R&D in general; that the marginal product of an additional employee in R&D may be increasing; that a larger patent portfolio may mean lower administrative costs per patent on average; or that larger companies have more productive employees, and thus the returns to scale should be variable.

Another problem is the rather small number of observed DMUs in my analysis. In order to obtain precise estimates of efficiency under the variable returns to scale assumption, there would have to be enough DMUs of similar sizes. Unfortunately, this is not the case in my dataset, so the final results may differ somewhat from those a larger dataset would yield.

Bound et al. (1982) and Griliches (1981) have shown that the elasticity of patenting with respect to R&D employment is close to unity; therefore, constant returns to scale may plausibly be assumed in this situation. Nevertheless, these works are rather old and the situation may have changed dramatically since then. Because the question remains unanswered, I have decided to include both models in my analysis. Furthermore, as the data enter the analysis in logarithmic form (to prevent outliers from distorting the results), the differences in the size of the companies are compressed. Consequently, both methods ought to yield similar results.

Following Talluri (2000), assuming that there are n observed DMUs, each

with m inputs and s outputs, the relative efficiency is computed as:

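\[
\begin{aligned}
\max \quad & \frac{\sum_{k=1}^{s} v_k y_{kp}}{\sum_{j=1}^{m} u_j x_{jp}} \\
\text{s.t.} \quad & \frac{\sum_{k=1}^{s} v_k y_{ki}}{\sum_{j=1}^{m} u_j x_{ji}} \le 1, \quad \forall i, \\
& v_k, u_j \ge 0, \quad \forall k, j,
\end{aligned}
\]

where $k = 1, \dots, s$, $j = 1, \dots, m$, $i = 1, \dots, n$, $y_{ki}$ = amount of output $k$ produced by DMU $i$, $x_{ji}$ = amount of input $j$ processed by DMU $i$, $v_k$ = weight given to output $k$, $u_j$ = weight given to input $j$, and $p$ denotes the DMU under evaluation. This fractional problem may then further be transformed into a linear program: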

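\[
\begin{aligned}
\max \quad & \sum_{k=1}^{s} v_k y_{kp} \\
\text{s.t.} \quad & \sum_{j=1}^{m} u_j x_{jp} = 1, \\
& \sum_{k=1}^{s} v_k y_{ki} - \sum_{j=1}^{m} u_j x_{ji} \le 0, \quad \forall i, \\
& v_k, u_j \ge 0, \quad \forall k, j.
\end{aligned}
\]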
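The step from the fractional problem to the linear program can be made explicit. The following is a sketch of the usual normalization argument, assuming that all inputs are strictly positive so that every denominator is positive; the scaling constant $t$ is introduced here only for illustration and does not appear in the thesis.

```latex
% Rescaling all weights by any t > 0 leaves every efficiency ratio unchanged:
\frac{\sum_{k=1}^{s} (t v_k)\, y_{ki}}{\sum_{j=1}^{m} (t u_j)\, x_{ji}}
  = \frac{\sum_{k=1}^{s} v_k y_{ki}}{\sum_{j=1}^{m} u_j x_{ji}}, \qquad t > 0.

% Hence the weights can be normalized so that the denominator of DMU p equals one,
% and the objective reduces to its numerator:
\sum_{j=1}^{m} u_j x_{jp} = 1
  \quad\Longrightarrow\quad
  \frac{\sum_{k=1}^{s} v_k y_{kp}}{\sum_{j=1}^{m} u_j x_{jp}} = \sum_{k=1}^{s} v_k y_{kp}.

% Multiplying each ratio constraint by its positive denominator makes it linear:
\frac{\sum_{k=1}^{s} v_k y_{ki}}{\sum_{j=1}^{m} u_j x_{ji}} \le 1
  \quad\Longleftrightarrow\quad
  \sum_{k=1}^{s} v_k y_{ki} - \sum_{j=1}^{m} u_j x_{ji} \le 0, \qquad \forall i.
```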


A relative efficiency score means that there is always at least one fully efficient DMU in the sample; however, it says nothing about efficiency in absolute terms. Yet it is sufficient for my analysis, as I am only interested in the changes throughout the observed time period. As said above, the efficiency depends heavily on the assumptions. There are usually more fully efficient units under variable returns to scale than under constant returns to scale, because the efficiency frontier then envelops the data more tightly.

DEA also has several disadvantages, as mentioned in Prochazkova (2010). First of all, the number of observations should be at least three times higher than the number of input and output variables together. Because of the limited number of observations in my data sample,30 I had to be very careful not to use too many variables in the analysis. The second problem concerns the efficiency frontier, which may be misleading if there are outliers present in the dataset. To guard against this, I have run two outlier detection analyses in the MedCalc program on the logarithmically transformed data to detect possible outliers on both the left and the right side. Neither test flagged any outliers at the 5% significance level, so my data appear suitable for the analysis.

Because I am mainly interested in patents as the company’s output, inputs connected with patents had to be chosen. Again, since I required variables that are common to and obtainable for all the companies in my dataset, there were not many options. I have obtained data on the number of employees (unfortunately, the company’s whole workforce rather than only the employees in R&D, because such data were scarce), research and development expenses, and the net profit the company had in a given year. Because net profits may be negative and thus pose a problem for the analysis, I have decided to use only the former two variables as inputs (the data were then adjusted for inflation). I have transformed both of them into natural logarithms because of the large differences among the companies.

30 The financial data of the observed firms are publicly available because the companies are listed on the stock exchange, but the requirement to publish the reports on-line was first established in 1994. Despite searching in various databases and writing to the companies for earlier reports, I have obtained just a few of these. As the analysis requires a sufficient number of observations from the same year, the data are limited by 1994 from the left. At the same time, because the patent data are based on the date of application (as opposed to the date of grant in the previous sections), and given the time difference between patent application and patent grant and the use of forward citations as a DEA output, I have limited the dataset by 2003 from the right.

Following the preceding discussion, I use the number of patents the company has applied for in a given year (but only those which were later granted) and the average composite rating of such patents as the output variables. Again, the number of patents has been transformed into natural logarithms. The composite rating consists of patent characteristics shown to be strongly and positively correlated with patent value. These are forward citations, family size, renewal data and patent trades.31 Because the former two are dispersed count variables, whereas the latter two are binary, I have transformed the first two into natural logarithms so that the final rating depends similarly on all four variables.32 It is then computed as:

\[
\text{Rating} = \ln(1 + \text{Fcit}) + \ln(1 + \text{Fsize}) + \text{Renewed} + \text{Traded}
\]
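As a small illustration, the rating formula can be written as a function. This is a sketch only; the function names and the two example patents are hypothetical, and the averaging helper simply mirrors the statement that the average rating of a company's patents in a given year is used as an output.

```python
import math

def composite_rating(forward_citations, family_size, renewed, traded):
    """Rating = ln(1 + Fcit) + ln(1 + Fsize) + Renewed + Traded."""
    return (math.log1p(forward_citations) + math.log1p(family_size)
            + int(renewed) + int(traded))

def average_rating(patents):
    """Average composite rating over a company's patents in one year."""
    return sum(composite_rating(*p) for p in patents) / len(patents)

# Hypothetical patents: (forward citations, family size, renewed, traded)
example = [(0, 0, False, False), (9, 3, True, False)]
ratings = [composite_rating(*p) for p in example]
mean_rating = average_rating(example)
```

A patent with no citations, no family members, no renewal and no trade receives a rating of zero, while the logarithms keep heavily cited patents from dominating the two binary indicators.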

DEA also allows restrictions to be set on the computed weights (e.g. that the weight of research and development expenditures cannot be more than twice as high as the weight of the number of employees). I tried many different restrictions; however, none of them changed the results significantly. In the end, the final models are used without any weight restrictions. I estimated 6 different models to be able to see the changes from as many points of view as possible. The models differ in the outcome variables used: one model uses only the raw number of patents (in logarithmic form), one only the composite rating, and one both outcome variables. Every model was then estimated under both constant and variable returns to scale.

As I am mainly interested in the changes in efficiency of a given company over time, a simple cross-sectional analysis is not sufficient. Panel data are better suited in this sense, because one can then not only compare one company to another, but also see the change over time.

31 While all of these variables are correlated with patent value, they are almost uncorrelated among themselves (see Table C.6 in the Appendix), meaning that each of them explains a different part of patent value. Therefore, it is favourable to include them all in the analysis.

32 Keep in mind that the exact outcome is not important, as I am not interested in the number itself, but only in its change over the observed period. The previous literature suggests that forward citations are the most reliable correlate of patent value; thus the composite rating still relies on forward citations the most. The descriptive statistics are shown in Table C.7.

The downside of using panel data is a possible bias caused by unobservable technological improvement, especially if the data period is long. A DMU from the beginning of the dataset is then compared to the same DMU at the end of the dataset, which may operate under completely different circumstances. For that reason, rather than using a simple panel data estimation method, I employ DEA Window Analysis, a method developed by Charnes et al. (1985), which enables me to compare each DMU only with the others within a given time period. As explained in Chung et al. (2008), assume that there are l = 1, ..., N companies in the dataset, each observed in all years 1, ..., M.33 The window length (the number of years within which all the DMUs are compared) is given by K. I follow Charnes et al. (1985) and set K = 3, as it seems to yield the most accurate results. The first window is then composed of all the DMUs from years 1, 2, and 3 (1994, 1995, and 1996, respectively), the second from years 2, 3, and 4, and so on; in general, window i covers years j = i, ..., i+K-1, where i = 1, ..., M-K+1.
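The construction of the windows can be sketched in a few lines. The function name is hypothetical; the years 1994 to 2003 and the window length K = 3 follow the text.

```python
def dea_windows(years, window_length):
    """Return the overlapping windows i = 1, ..., M-K+1, each covering K consecutive years."""
    m = len(years)
    return [years[i:i + window_length] for i in range(m - window_length + 1)]

years = list(range(1994, 2004))   # 1994-2003, i.e. M = 10
windows = dea_windows(years, 3)   # K = 3
```

With M = 10 and K = 3 this yields M-K+1 = 8 windows, so every year except the first two and the last two appears in three different windows.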

The estimation is carried out separately for each window; each DMU in a window is then characterized by its efficiency $E_{lij}$ (company $l$, window $i$, year $j$). These are reported in Table C.8. Further, to be able to compare the companies with one another, the average efficiency is computed as:

\[
M_l = \frac{\sum_{i=1}^{M-K+1} \sum_{j=i}^{i+K-1} E_{lij}}{K \times (M - K + 1)}, \quad l = 1, \dots, N
\]
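The averaging over windows can be illustrated as follows. This is a minimal sketch with hypothetical efficiency scores for a single company; the function name and the numbers are not taken from Table C.8.

```python
def average_window_efficiency(window_scores):
    """window_scores[i][j] holds the score of one company in window i
    for the j-th year of that window; returns the average over all entries."""
    k = len(window_scores[0])          # window length K
    n_windows = len(window_scores)     # M - K + 1
    total = sum(sum(row) for row in window_scores)
    return total / (k * n_windows)

# Hypothetical scores for one company with M = 4 years and K = 3, hence 2 windows.
scores = [
    [0.80, 0.90, 1.00],   # window 1: years 1-3
    [0.85, 0.95, 0.70],   # window 2: years 2-4
]
m_l = average_window_efficiency(scores)
```

Note that years covered by several windows enter the average several times, exactly as in the double sum.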

Table C.7 shows descriptive statistics of the variables used. Each observed year of a company’s performance is treated as a unique DMU. Because the detailed results are extensive, I include only sample results from the software industry (both outcome variables included) in Table C.8. Each company is observed over 10 years, and each row shows one window of the DEA analysis. The average efficiency scores for a given year are in bold. We can see that there is a clear downward trend in the case of Adobe, whose performance in patent production seems to worsen. Microsoft, on the other hand, seems rather stable over time. The results are reassuringly similar across different windows, suggesting that the estimates are stable.

The summary results of the DEA analysis are shown in Tables C.9-11.

We can see that the assumption of variable returns to scale produces higher

estimates than constant returns to scale. The efficiency varies significantly

among the companies and the observed years; however, the numbers are very

similar for companies within an industry.

33 I am not able to utilize my full dataset using this method, as several companies are observed for less than 10 years because they went public at a later date. The analysis then only uses 13 out of the total 25 companies.


Companies in the semiconductor industry seem to be highly efficient in general, but the efficiency is lower when only the composite rating is used as the outcome variable. This holds for all industries and suggests that the companies are more efficient in producing patents than in producing valuable patents. Altera is the most efficient company overall, whereas Boeing and the other companies from the aerospace industry seem not to be performing well at all. Interestingly, IBM, which has publicly announced that it is more profitable for it to create patent thickets than to make valuable inventions (Bessen 2004), indeed scores poorly in the model with only the composite rating as the outcome variable, but is nearly peerless in creating patents in general.

All companies apart from Dell and United Technologies improved their efficiency in creating patents with given inputs over the observed period. The results of the third model (composite rating as the outcome variable) are not so clear. The overall trend is negative under the assumption of constant returns to scale, yet it is rather constant or positive assuming variable returns to scale. Unlike the models with raw patent numbers and with both outcome variables (where the results were similar under both assumptions), the model with the composite rating shows certain deviations. This is not surprising, though. Continuing the discussion from the beginning of this section, the model with pure patent counts as the outcome variable (and therefore also the model with both outcome variables) allows both the inputs and the outputs to increase with the size of the company. The composite rating, however, depends more on the propensity to create valuable inventions than on the size of the firm. Thus, the models with both outcome variables and with raw patent numbers exhibit similar results under both assumptions, because it seems that it indeed takes the same amount of work and money to develop a new invention in a small and in a large company. But constant returns to scale seem ill-suited to the model with the composite rating as the outcome variable: a company with ten times higher input volume can hardly be expected to create inventions of ten times higher technological value.

That said, I may conclude that the overall trend in the efficiency of transforming inputs into valuable outputs (based on the estimates under the variable returns to scale assumption) is increasing, although one must keep in mind that these results may be somewhat biased due to the rather small dataset. Yet the trend is not uniform for all companies within an industry. Perhaps the efficiency depends more on the individual behaviour and decision making of the firm than on the industry itself. Indeed, the fact that the companies became more efficient in creating patents in general, yet diverged in the efficiency of creating valuable patents, suggests that the trend of relying heavily on patents is common to all industries. However, the decision whether to aim for valuable patents at the same time was rather individual. Adobe, for example, doubled its efficiency in creating patents, but its efficiency in making valuable patents decreased at the same time.

In summary, the results are as follows: there is indeed a visible and rather strong increase in the efficiency of transforming research and development inputs into patents in general. The trend is unclear in the case of transforming inputs into valuable patents, and seems to depend heavily on the decision making within the company. According to the literature on strategic patenting, the higher efficiency in creating patents in general was to be expected. The efficiency of creating valuable inventions, however, ought to be decreasing, as companies spend more resources on creating less valuable inventions, because the technological value of their patents does not need to be high. The results indicate that not all the companies in my dataset have decided to aim for patent portfolios of lesser technological value, or at least that some of them seem to produce valuable inventions more efficiently than before.

Chapter 7

Conclusion

Patents represent a substantial institution in today’s world, making it possible to possess and treat an intangible asset as if it were a material thing. Importantly, the ownership is then protected by law and the patented idea may not be stolen or abused. The institution was created to support inventors in their innovative effort; however, according to theories that have sprung up recently, the reality may be far from it. Large companies seem to abuse patents by building patent thickets around the key inventions of their competitors in order to maintain their market lead, or at least to gain certain profits from licensing.

Yet the literature has demonstrated this behaviour only theoretically, without broader empirical findings (except for occasional confessions of managers). Therefore, this study attempts to shed more light upon the matter by presenting some remarkable changes in patent characteristics, their value, and the performance of the companies regarding research and development, in order to provide interesting observations and to support further investigation.

One must first understand the meaning of patent value to be able to study its fluctuation. Patent value can be understood either as the technological improvement the patent brings, or as the profit inflow resulting from the use of the patent. These two need not be highly correlated, as a company may earn significant sums of money indirectly, through licensing and other means, even though the patent itself is rather meaningless for the technological field (as suggested by the theory of strategic patenting). I am mainly interested in the technological value of a patent, for a patent ought primarily to protect valuable inventions.

The preceding literature has suggested a number of variables connected to the technological value of patents, particularly forward citations, family size, renewal decisions and patent trades. Each of these correlates has been shown to be significantly and positively correlated with patent value. At the same time, they are nearly uncorrelated with each other, suggesting that each of them explains a different part of patent value, and thus it is preferable to know them all at the same time.

I have obtained an extensive dataset containing over 163,000 US patents from four different industries (computer manufacturing, software development, aerospace and semiconductor industry) with grant dates ranging from 1976 to 2011. The length of the time span allows me to observe the evolution of the variables over a longer period than was possible before. I have made some interesting discoveries: patent value (proxied by forward citations) has decreased immensely for some technological fields (particularly the software industry), while it has remained rather constant for the aerospace industry. The other variables show similar behaviour for all industries, and are also either constant or decreasing.

Then, using econometric analysis, I have investigated the relationships between the variables shown by the preceding literature to be good correlates of patent value (forward citations, family size, renewal data, and patent trades) and the characteristics included in the patent document. My results are similar to the prior findings, suggesting that patent value is significantly connected to these characteristics; however, they are not sufficient to explain the changes over time. Moreover, decisions regarding a patent’s renewal and trade seem to rely heavily on patent value.

Finally, I have estimated the relative efficiency with which the companies in my dataset transform inputs (R&D expenses and total workforce) into outputs (patent stock and its value). I used Data Envelopment Analysis to estimate an efficiency frontier and then compare each company to it. Because the computed efficiency is only relative, the results are meaningful only in comparison to the other companies, not in absolute terms, yet that is sufficient for the analysis. According to the literature on strategic patenting, companies ought to produce more patents to be able to create patent thickets.

And indeed, my results exhibit an upward trend in the efficiency of transforming the inputs into the raw patent stock. Nevertheless, there is no common trend regarding the efficiency of transforming the inputs into valuable outputs. It seems that the companies have become better at producing patents in general, but their propensity to create valuable inventions depends heavily on managerial decisions. From the strategic point of view, high patent counts (and thus higher efficiency in creating patents) are necessary for creating patent thickets.

In other words, I may conclude that several characteristics of strategic behaviour (particularly the decrease in patent value and the higher raw patent output) are clearly observable from the data. Moreover, DEA offers a unique way of observing a company’s behaviour empirically, and may be used to test whether a company’s conduct on the market is also visible in its performance. Yet the evidence overall remains rather unclear and deserves further investigation, to see whether patents remain a useful institution in their current form or should be adjusted in some way.

Bibliography

Albert, M., Avery, D., Narin, F., and McAllister, P. (1991). Direct validation

of citation counts as indicators of industrially important patents. Research

Policy, 20(3):251–259.

Alcacer, J. and Gittelman, M. (2006). Patent citations as a measure of knowl-

edge flows: The influence of examiner citations. The Review of Economics

and Statistics, 88(4):774–779.

Arora, A., Ceccagnoli, M., and Cohen, W. (2008). R&D and the patent pre-

mium. International Journal of Industrial Organization, 26(5):1153–1179.

Baldwin, J. (1996). The use of intellectual property rights by Canadian man-

ufacturing firms: Findings from the innovation survey.

Bertran, F. (2003). Pricing patents through citations. University of Rochester,

mimeo.

Bessen, J. (2004). Patent thickets: Strategic patenting of complex technologies.

Working Papers.

Bessen, J. (2008). The value of US patents by owner and patent characteristics.

Research Policy, 37(5):932–945.

Bound, J., Cummins, C., Griliches, Z., Hall, B., and Jaffe, A. (1982). Who

does R&D and who patents?

Charnes, A., Cooper, W., Golany, B., Seiford, L., and Stutz, J. (1985). Founda-

tions of data envelopment analysis for Pareto-Koopmans efficient empirical

production functions. Journal of Econometrics, 30(1):91–107.

Charnes, A., Cooper, W., and Rhodes, E. (1978). Measuring the efficiency of

decision making units. European journal of operational research, 2(6):429–

444.


Chung, S., Lee, A., Kang, H., and Lai, C. (2008). A DEA window analysis

on the product family mix selection for a semiconductor fabricator. Expert

Systems with Applications, 35(1):379–388.

Devaiah, V. (undated). A history of patent law. http://www.altlawforum.org/intellectual-property/publications/a-history-of-patent-law.

Farrell, M. (1957). The measurement of productive efficiency. Journal of the

Royal Statistical Society. Series A (General), 120(3):253–290.

Gallini, N. (2002). The economics of patents: Lessons from recent US patent

reform. The Journal of Economic Perspectives, 16(2):131–154.

Gambardella, A., Harhoff, D., and Verspagen, B. (2008). The value of European

patents. European Management Review, 5(2):69–84.

Griliches, Z. (1981). Market value, R&D, and patents. Economics letters,

7(2):183–187.

Griliches, Z. (1984). R & D, patents, and productivity. NBER Books.

Griliches, Z. (1990). Patent statistics as economic indicators: A survey. Journal

of Economic Literature, 28(4):1661–1707.

Griliches, Z. (1998). Returns to research and development expenditures in the

private sector. NBER Chapters, pages 49–81.

Hall, B. and Ham, R. (1999). The patent paradox revisited: Determinants

of patenting in the US semiconductor industry, 1980-94. Technical report,

National Bureau of Economic Research.

Hall, B., Jaffe, A., and Trajtenberg, M. (2000). Market value and patent cita-

tions: A first look. Technical report, National bureau of economic research.

Harhoff, D., Narin, F., Scherer, F., and Vopel, K. (1999). Citation frequency

and the value of patented inventions. Review of Economics and statistics,

81(3):511–515.

Harhoff, D. and Reitzig, M. (2004). Determinants of opposition against EPO

patent grants—the case of biotechnology and pharmaceuticals. International

journal of industrial organization, 22(4):443–480.


Harhoff, D., Scherer, F., and Vopel, K. (2003). Citations, family size, opposition

and the value of patent rights. Research Policy, 32(8):1343–1363.

Jaffe, A. and Lerner, J. (2006). Innovation and its discontents.

Jaffe, A., Trajtenberg, M., and Fogarty, M. (2000). The meaning of patent

citations: Report on the NBER/case-western reserve survey of patentees.

NBER Working Papers.

Jaffe, A., Trajtenberg, M., and Henderson, R. (1993). Geographic localiza-

tion of knowledge spillovers as evidenced by patent citations. the Quarterly

journal of Economics, 108(3):577.

Kortum, S. and Lerner, J. (1999). What is behind the recent surge in patenting?

Research policy, 28(1):1–22.

Lanjouw, J. and Schankerman, M. (1997). Stylized facts of patent litigation:

Value, scope and ownership. Technical report, National Bureau of Economic

Research.

Lanjouw, J. and Schankerman, M. (2001). Enforcing intellectual property

rights. Technical report, National Bureau of Economic Research.

Lanjouw, J. and Schankerman, M. (2004). Patent quality and research pro-

ductivity: Measuring innovation with multiple indicators. The Economic

Journal, 114(495):441–465.

Lerner, J. (1994). The importance of patent scope: An empirical analysis. The

RAND Journal of Economics, pages 319–333.

Macdonald, S. (2004). When means become ends: considering the im-

pact of patent strategy on innovation. Information Economics and Policy,

16(1):135–158.

Narin, F., Hamilton, K., and Olivastro, D. (1997). The increasing linkage

between US technology and public science. Research Policy, 26(3):317–330.

O’Gara, M. (2012). US court forbids MMI to use German injunction against

Microsoft. SOA World Magazine.

Pakes, A. (1985). On patents, R&D, and the stock market rate of return.

Journal of Political Economy, 93(2):390–409.


Pakes, A. and Griliches, Z. (1980). Patents and R&D at the firm level: A first

look.

Pitkethly, R. (1997). The valuation of patents: A review of patent valuation

methods with consideration of option based methods and the potential for

further research. Research Papers in Management Studies - University of

Cambridge, Judge Institute of Management Studies.

Prochazkova, J. (2010). Measuring efficiency of hospitals in the Czech Republic.

Master’s thesis, Charles University in Prague.

Reitzig, M. (2003a). What determines patent value?: Insights from the semi-

conductor industry. Research Policy, 32(1):13–26.

Reitzig, M. (2003b). What do patent indicators really measure. A structural test

of novelty and inventive step as determinants of patent profitability, LEFIC

WP, 1.

Reitzig, M. (2004). Improving patent valuations for management purposes–

validating new indicators by analyzing application rationales. Research Pol-

icy, 33(6-7):939–957.

Sakmann, C. (2012). Patentové spory zpomalují inovace. CHIP, (3).

Sapsalis, E. and de la Potterie, B. (2007). The institutional sources of knowledge

and the value of academic patents. Econ. Innov. New Techn., 16(2):139–157.

Sapsalis, E., Van Pottelsberghe De La Potterie, B., and Navon, R. (2006). Aca-

demic versus industry patenting: An in-depth analysis of what determines

patent value. Research Policy, 35(10):1631–1645.

Scherer, F. (1998). The size distribution of profits from innovation. Annales

d’Economie et de Statistique, pages 495–516.

Schmookler, J. and Brownlee, O. (1962). Determinants of inventive activity.

The American Economic Review, 52(2):165–176.

Schneider, C. and Leuven, K. (2007). How important are non-corporate

patents? A comparative analysis using patent citations data. CEBR, Copen-

hagen Business School Working Paper.

Serrano, C. (2005). The market for intellectual property: Evidence from the

transfer of patents. Unpublished "Job Market Paper" available online.


Talluri, S. (2000). Data envelopment analysis: Models and extensions. Decision

Line, 31(3):8–11.

van Pottelsberghe de la Potterie, B. and van Zeebroeck, N. (2011). Filing

strategies and patent value. Economics of Innovation and New Technology,

20(6):539–561.

van Zeebroeck, N. (2011). The puzzle of patent value indicators. Economics of

Innovation and New Technology, 20(1):33–62.

Yiannaka, A. and Fulton, M. (2006). Strategic patent breadth and entry deter-

rence with drastic product innovations. International Journal of Industrial

Organization, 24(1):177–202.

Appendix A

Forward Citation Distribution

My basic dataset does not contain the years in which forward patent citations were received, only their total number at the date of download. I have therefore created a sample of approximately 2,000 patents across all the observed years and industries, and downloaded the month and the year in which each of their forward citations was received.

To obtain the citation distribution shown in Figure 5.3, I have calculated the difference between the grant date of the observed patents and the date on which each citation was received,34 and then counted the number of citations received at a given lag for all patents in a given industry. That left me with the distribution of forward citations in absolute numbers. To further obtain the relative probability of a patent being cited at the age of t years, I have summed all the citation counts for a given industry and a given time cohort, and calculated the probability of being cited at a given lag as

\[
P(c_{nt}) = \frac{\text{citations at lag } t \text{ in industry } n}{\text{citations in industry } n} \tag{A.1}
\]

where t=1..T, T being the number of observed years for a given industry.
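Equation A.1 amounts to a normalized frequency count, which can be sketched as follows. The function name and the list of lags are hypothetical; in the thesis the lags come from the sample of roughly 2,000 patents for one industry.

```python
from collections import Counter

def lag_probabilities(lags):
    """Share of an industry's citations received at each lag t (Equation A.1)."""
    counts = Counter(lags)
    total = len(lags)
    return {t: counts[t] / total for t in sorted(counts)}

# Hypothetical citation lags in years (citing application year minus grant year).
lags = [1, 1, 2, 2, 2, 3, 5, 5]
probs = lag_probabilities(lags)
```

By construction the probabilities over all observed lags sum to one, so they can be accumulated directly into a cumulative distribution.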

A different approach had to be used to obtain the cumulative distribution function. In order to preserve the changes in the distribution for patents granted in different years, I have divided the data into 6 groups, each containing patents granted within 5 years of each other (from 1976 to 2005).35 Because each group then consisted of a different number of patents, I used weights based on the patent counts to get a standardized absolute number of citations received for each industry and time cohort.

34 Because the time it takes a patent application to be granted lengthened over the observed period, I used the citing patent’s application date as the date on which the citation was received, so as not to get biased results. This also explains why so many citations appear within the first few years after the patent grant, even though it takes several years for a patent to be granted.

35 I use only the data up to and including 2005 to obtain relevant results, as patents issued later have not had enough time to receive forward patent citations. The year 2005 was chosen by rule of thumb.

To further develop the cumulative distribution function and approximate the patent counts, I had to make several assumptions. I can only account for three factors affecting the citation count: the year of the patent grant, the number of years the patent was observed for, and the industry it belongs to. Yet there still may be some unobserved effects. I assume that those are random effects which are then implicitly included in the distribution. Further, only the first group of patents could be observed for more than 31 years, and every later one for 5 years less. I assume that the unobserved tail of the distribution is similar to the distribution of the previous group (i.e. the distribution of patent citations obtained between 26 and 31 years after the grant for patents granted in 1981-1985 is the same as the distribution for patents granted between 1976 and 1980).

The cumulative distribution function is then created by adding the percentage of patent citations obtained in year t to the percentage of citations obtained earlier.

\[
F_X(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i) = \sum_{c_i \le t} p(c_{ink}) \tag{A.2}
\]

where $p(c_{ink})$ is the probability of a patent from industry $n$ and time cohort $k$ being cited at time $i$, and $t$ is the lag. Because the increasing total number of forward citations shown in Table A.1 (i.e. the citation inflation described in Chapter 5 or e.g. in Hall et al. 2000) would bias the results,36 I have weighted the counts for each time cohort as follows:

\[
\mathit{cit}_{ntk} = c_{ntk} \left( \prod_{i=1}^{k} \frac{\sum_{j=1}^{T_k} c_{nji}}{\sum_{l=1}^{T_k} c_{nl,i-1}} \right)^{-1}, \quad k = 2, \dots, 10 \tag{A.3}
\]

where $c_{ntk}$ is the number of forward patent citations (weighted by the number of observed patents for the given time cohort) for each industry, year of grant, and lag. $T_k$ is the same for both sums in the numerator and the denominator; it is the number of observed years in the time cohort $i$. These are divided by the product of all weights up to and including the time cohort $k$. In other words, for $k = 2$ the fraction is equal to the sum of all citations gained by the patents in the first time cohort up to and including lag $T_2$, divided by the sum of all citations received by the patents in the second time cohort (that is, again, until the lag $T_2$). For the next time cohort the fraction is computed in the same way, but its result is further multiplied by the result of the fraction from the first case.

36 Because I assume that the unobserved part of the cumulative distribution function is similar to the results from the former time cohort, I must weight the citation counts so that a patent receives the same number of citations after a given time in each cohort, in order to be able to tie the results together.

In the end, the first time cohort remains unchanged, and the second time cohort is adjusted so that the total number of citations gained within $T_2$ years after the patent grant is the same as in the first time cohort. The third is adjusted so that the total number of forward citations received within $T_3$ years after the patent grant is the same as the adjusted number of forward citations in the second time cohort, and so on. Ultimately, I was able to tie the distribution from each earlier time cohort to the later ones and obtain the cumulative distributions for all patents granted from 1976 until 2005. These are depicted in Figure A.1.
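The chained adjustment described above can be illustrated with a minimal sketch. It follows the verbal description (each cohort is rescaled so that its total within its own observed lags matches the adjusted total of the preceding cohort); the function name `chain_weight` and the toy numbers are hypothetical and are not taken from the dataset.

```python
def chain_weight(cohorts):
    """Rescale per-lag citation counts so that consecutive cohorts can be tied.

    cohorts: list of lists, oldest cohort first; each inner list holds the
    citation counts at lags 1, 2, ... and later cohorts are observed for
    fewer lags than earlier ones (an assumption of this sketch).
    """
    adjusted = [[float(c) for c in cohorts[0]]]  # the first cohort is left unchanged
    factor = 1.0
    for previous, current in zip(cohorts, cohorts[1:]):
        observed_lags = len(current)
        # Ratio of the raw totals over the lags observed for the current cohort,
        # multiplied by all earlier ratios (the running product of weights).
        factor *= sum(previous[:observed_lags]) / sum(current[:observed_lags])
        adjusted.append([c * factor for c in current])
    return adjusted


# Hypothetical counts: three cohorts observed for 4, 3, and 2 lags.
toy = [[10, 20, 30, 40], [20, 40, 60], [40, 80]]
tied = chain_weight(toy)
```

In this toy case the second cohort is halved and the third is scaled by one quarter, so that each adjusted cohort has the same total as its adjusted predecessor over the shared lags.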

Figure A.1: Cumulative distribution functions for different time cohorts.

Even though there are visible fractures, the overall results are satisfactory. One must keep in mind that the exact behaviour of the cumulative distribution functions more than 4 years after the tying point is no longer important, because they would only be used to predict forward citations for patents located close to the tying point.

Each patent in my dataset can be matched to its function by its date of grant and industry. The predicted total number of citations a patent would obtain through 31 years is then

$$\text{total citations} = \frac{\text{observed citations}}{F_{nk}(t)}$$
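As a minimal sketch of this prediction step, assume the per-lag citation counts of the matching cohort are available as a list covering the full horizon. The function names and the toy numbers below are illustrative assumptions, not part of the original computation.

```python
from itertools import accumulate


def cumulative_shares(counts_by_lag):
    """Return the cumulative share of citations received up to each lag."""
    running = list(accumulate(counts_by_lag))
    total = running[-1]
    if total <= 0:
        raise ValueError("cohort has no citations")
    return [value / total for value in running]


def predict_total_citations(observed, cdf, lag):
    """Divide observed citations by the share expected by `lag` (1-based)."""
    share = cdf[lag - 1]
    if share <= 0:
        raise ValueError("no citations expected by this lag")
    return observed / share


# Hypothetical cohort with a three-lag horizon: counts at lags 1, 2, 3.
toy_cdf = cumulative_shares([1, 1, 2])
predicted = predict_total_citations(observed=6, cdf=toy_cdf, lag=2)
```

Here half of all citations are expected by lag 2, so a patent with 6 observed citations at that lag is predicted to reach 12 in total.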


Table A.1: The number of forward patent citations at lags (weighted).

1976-1980       1     2     3     4     5     6     7     8     9    10    11
Total          69   467   679   710   663   630   607   615   615   601   560
Aerospace      11    95   135   144   131   131   123   130   127   127   120
Computers      23   141   200   207   199   196   185   190   186   190   174
Semicond.      15   102   175   184   160   102   126   104   127    81    80
Software        0     0     0     0     0     0     0     0     0     0     0

1981-1985
Total          77   440   590   649   716   703   679   627   617   572   563
Aerospace      11    91   122   130   142   142   136   132   119   117   122
Computers      27   133   180   209   224   217   218   196   205   176   168
Semicond.      18    88   111   106   136   135   114   104    96   107   103
Software        0     0     0     0     0     0     0     0     0     0     0




2001-2005
Total        1129  2042  2322  2296  2059  1710
Aerospace     132   229   298   296   262   248
Computers     317   502   559   526   486   411
Semicond.     107   230   230   193   173   137
Software      193   345   409   447   397   318

Appendix B

Data download

As it would be nearly impossible to collect such a large volume of data manually from patent office databases, two other sources are available: commercial on-line databases, which allow users to download bulk files, and web crawlers. Commercial databases offer very fast services, often with advanced search for precise needs; however, they are very expensive, limited in terms of provided variables, or both. Web crawlers allow users to download exactly the data they search for, but require a computer to run on and time to work.

For my study, I have used Easy Web Extract software.37 The program crawls a given website and copies the selected parts (text, numbers, images), based on set HTML objects, into the output file. The problem with this method is the output file itself, as it only contains data as they appear on the website, not in numerical form, which is preferred for later use. Furthermore, if the website is poorly structured (e.g. a large part of the patent document shown on the USPTO website is in plain text, not divided into separate parts or tables), the program cannot download the requested data, simply because it cannot distinguish them from the rest. The only way to obtain the data then is to download the whole unstructured text and extract the right data from it.
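The extraction of fields from unstructured text can be sketched with regular expressions. This is only an illustration under stated assumptions: the sample record, its field labels, and all names in the code are hypothetical and do not reproduce the actual USPTO page layout or the procedure used for the dataset.

```python
import re
from datetime import datetime

# Hypothetical plain-text record; the layout is an assumption for illustration.
SAMPLE_TEXT = """
United States Patent 1,234,567
Date of Patent: January 10, 1996
Appl. No.: 08/123,456
Filed: January 10, 1994
"""

# One pattern per field; each captures the value that follows its label.
FIELD_PATTERNS = {
    "patent_number": re.compile(r"United States Patent\s+([\d,]+)"),
    "granted": re.compile(r"Date of Patent:\s*(\w+ \d{1,2}, \d{4})"),
    "filed": re.compile(r"Filed:\s*(\w+ \d{1,2}, \d{4})"),
}


def extract_fields(raw_text):
    """Return the raw string found for each field, or None if it is missing."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(raw_text)
        found[name] = match.group(1) if match else None
    return found


def to_numeric(fields):
    """Convert the extracted strings into numbers usable in later analysis."""
    granted = datetime.strptime(fields["granted"], "%B %d, %Y").date()
    filed = datetime.strptime(fields["filed"], "%B %d, %Y").date()
    return {
        "patent_number": int(fields["patent_number"].replace(",", "")),
        "grant_delay_days": (granted - filed).days,
    }


record = to_numeric(extract_fields(SAMPLE_TEXT))
```

The second step matters as much as the first: the crawler output holds dates and numbers as displayed text, so they have to be converted before any counting or regression.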

The average download time for one page is roughly 10 seconds; however, this fully depends on the depth of the search. Given the large dataset I decided to obtain and the fact that it had to be downloaded from two different sources (repeatedly, due to new discoveries), the total time I spent downloading was about 5 months.
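A rough lower bound for a single uninterrupted pass can be worked out from the figures above. The sketch assumes one page per patent and uses the 161,360 observations reported in Table C.2; the function name and the one-page-per-patent assumption are illustrative only, and repeated passes, deeper searches, and interruptions all add to this bound.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_days(pages, seconds_per_page=10, passes=1):
    """Days of continuous crawling needed for the given number of pages."""
    return pages * seconds_per_page * passes / SECONDS_PER_DAY


# Assumption: one page per patent, one pass over one source.
single_pass = crawl_days(161360)
# Assumption: the same page count for each of the two sources.
two_sources = crawl_days(161360, passes=2)
```

Under these assumptions one pass takes roughly 19 days of continuous running, and covering both sources once takes about twice as long.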

37 http://webextract.net

Appendix C

Additional Figures and Tables

Figure C.1: The delay between the patent application and the following grant (in days).


Figure C.2: Renewal data, patents granted from 2000 to 2003.

Figure C.3: Renewal data, patents granted from 2004 to 2007.


Figure C.4: Weighted forward citations by industry.

Figure C.5: Backward citations by industry.


Figure C.6: Family size by industry.

Figure C.7: Patent trades by industry.


Table C.1: Companies overview.

Company    Period       Obs.  F. Size  F. Cit.  Renewals  Trades
Adobe      1989-2011     997     2.57     5.97     99.5%    9.7%
AMD        1976-2011    9458     2.62    11.34     91.1%   25.7%
Airbus     1991-2011    1375     5.54     2.63     98.0%    3.5%
Altera     1986-2011    2055     3.06     9.37     98.7%    4.9%
Apple      1978-2011    2986     4.02    12.64     99.8%    3.1%
AM         1976-2011    5505     5.61    11.43     88.9%    2.6%
Autodesk   1993-2011     392     3.54     4.96     96.9%   16.1%
Boeing     1976-2011    7794     2.73     7.38     92.4%    3.9%
CS         1998-2011     178     6.88     9.88    100.0%    1.1%
Dell       1976-2011    2161     1.99    11.12     97.9%    1.6%
Google     2003-2011     815     5.42     5.12    100.0%    4.5%
HP         1992-2011   20166     3.09     8.43     97.2%   13.9%
Intel      1976-2011   20306     3.12     9.47     94.7%    4.0%
IBM        1976-2011   60251     2.86     9.62     81.4%   12.0%
Intuit     1988-2011     221     1.56     3.05     93.8%    0.5%
LTC        2010-2011      48     3.63     1.44         -       -
Logitech   1990-2011     221     5.57    10.57     93.5%   12.7%
MIP        1987-2011     366     2.84     7.96     90.5%    6.8%
Microsoft  1986-2011   16709     4.08     9.81     99.9%    1.5%
NC         2000-2011     235     2.86     4.40     97.4%   69.8%
Oracle     1995-2011     412     2.33    33.46     99.8%    8.3%
Red Hat    2004-2011     200     1.73     0.95    100.0%    1.0%
Symantec   1993-2011     929     3.21     5.82    100.0%   11.2%
Textron    1976-2011    1712     4.97     9.08     85.7%   41.1%
UT         1976-2011    5868     4.37    11.56     94.4%   10.2%


Table C.2: Industries overview.

Industry           Obs.  Renewals  F. Size  F. Cit.  Trades
Aerospace         16749     92.7%     3.76     8.63    9.9%
Computers         85785     86.9%     2.94     9.49   11.9%
Semiconductors    37738     92.9%     3.35    10.20    9.3%
Software          21088     99.8%     3.94     9.43    3.5%
Dataset          161360     90.3%     3.25     9.56   10.0%

Table C.3: Used variables.

Variable    Period       Obs.  Mean  Med.  S. D.  Min   Max   P(x>0)
Renewals    1982-2007  100856   1.7     2   0.91    0     3    90.3%
F. Size     1976-2011  161360   3.3     2   3.83    0    36    98.6%
F. cit.     1976-2005   84820  15.9     8  25.22    0   839    92.6%
Trades      1976-2007  161360   0.1     0   0.34    0     1    13.2%
B. cit.     1976-2011  161360  10.0    16  30.90    1  1050        -
Inventors   1976-2011  161360   2.8     2   1.80    1    60        -
US Class    1976-2011  161360   4.3     4   2.90    1    39        -
Int. Class  1976-2011  161360   2.9     2   2.16    1    31        -
EU Class.   1976-2011  161360   2.4     2   1.87    1    32        -
Priorities  1976-2011  161360   1.3     1   0.87    1   104        -

Table C.4: Additional regression statistics - negative binomial models.

Dependent variable            Predicted Fcit     Family size

Log-Lik Intercept only            -335797.59      -366381.45
D(161308)                          652133.58       697128.52
McFadden’s R²                           0.03            0.05
Maximum Likelihood R²                   0.21            0.20
AIC                                     7.75            4.32
BIC                               -301598.72     -1237000.00

Log-Lik Full Model                -326066.79      -348564.26
LR(49)                              19461.61        35634.39
Prob > LR                               0.00            0.00
McFadden’s Adjusted R²                  0.03            0.05
Cragg & Uhler’s R²                      0.21            0.20
AIC*n                              652225.58       697232.52
BIC’                               -18973.97       -35046.81


Table C.5: Additional regression statistics - probit models.

Dependent variable                Renewals        Trades

Log-Lik Intercept Only           -22628.17     -34558.40
D(78987)                          42043.24      66779.41
McFadden’s R²                         0.07          0.03
Maximum Likelihood R²                 0.04          0.03
McKelvey and Zavoina’s R²             0.21          0.07
Variance of y*                        1.27          1.08
Count R²                              0.92          0.86
AIC                                   0.53          0.80
BIC                             -848738.19    -886930.20

Log-Lik Full Model               -21021.62     -33389.71
LR(37)                             3213.10       2337.39
Prob. > LR                            0.00          0.00
McFadden’s Adj R²                     0.07          0.03
Cragg & Uhler’s R²                    0.09          0.05
Efron’s R²                            0.04          0.03
Variance of error                     1.00          1.00
Adjusted Count R²                     0.00          0.00
AIC*n                             42127.24      66875.41
BIC’                              -2795.83      -1838.41

Table C.6: Correlation matrix.

              Fcit    Fsize    Trade  Renewed
Fcit          1.00
Fsize         0.07     1.00
Trade         0.02     0.00     1.00
Renewed       0.08     0.08     0.08     1.00


Table C.7: DEA variables (expenditures in $ millions).

Variable          Type     Obs.   Mean  Median St. D.   Min     Max
Employees         Input     180  53775   12173  79101   163  319876
R&D Expenditures  Input     180    957     401   1251  3.78    5315
# Patents         Output    180    500     107    917     1    4425
Composite Rating  Output    180   3.69    3.98   1.06     0    5.85


Table C.8: DEA analysis detailed results - software industry, both outcome variables.

Constant Returns to Scale

             1994  1995  1996  1997  1998  1999  2000  2001  2002  2003
Adobe        100%   80%   84%
                    91%   86%   87%
                          90%   88%   81%
                                91%   82%   75%
                                      89%   81%   77%
                                            81%   78%   71%
                                                  79%   73%   66%
                                                        74%   68%   72%
             100%   86%   86%   89%   84%   79%   78%   73%   67%   72%

Microsoft     85%   91%   94%
                    86%   89%   90%
                          89%   90%   91%
                                90%   91%   90%
                                      95%   93%   91%
                                            94%   91%   85%
                                                  92%   85%   87%
                                                        88%   90%   95%
              85%   89%   90%   90%   92%   92%   91%   86%   89%   95%

Variable Returns to Scale

             1994  1995  1996  1997  1998  1999  2000  2001  2002  2003
Adobe        100%   85%   85%
                    97%   88%   87%
                          90%   91%   86%
                                96%   89%   88%
                                      90%   89%   88%
                                            91%   90%   90%
                                                  94%   95%   94%
                                                        95%   94%   93%
             100%   91%   88%   91%   88%   90%   91%   93%   94%   93%

Microsoft     87%   99%  100%
                   100%  100%  100%
                         100%  100%   99%
                               100%   99%   95%
                                     100%   97%   92%
                                           100%   92%   87%
                                                 100%   86%   88%
                                                        88%   90%   97%
              87%   99%  100%  100%  100%   97%   95%   87%   89%   97%


Table C.9: DEA analysis - both outcome variables.

Constant Returns to Scale

             1994  1995  1996  1997  1998  1999  2000  2001  2002  2003
Boeing        61%   68%   66%   59%   61%   64%   67%   73%   79%   81%
Textron       63%   67%   65%   69%   73%   70%   75%   73%   71%   70%
UT            75%   68%   63%   60%   60%   62%   62%   63%   65%   69%
              66%   68%   65%   62%   64%   65%   68%   70%   71%   73%

Apple         84%   87%   86%   83%   82%   82%   76%   79%   77%   80%
Dell          97%   96%   84%   77%   74%   83%   78%   77%   75%   79%
HP            84%   83%   83%   82%   83%   86%   88%   94%   89%   92%
IBM           92%   92%   92%   89%   89%   90%   90%   90%   90%   92%
              89%   89%   86%   83%   82%   85%   83%   85%   83%   86%

AMD           91%   94%   96%  100%   99%   99%  100%   98%   99%   99%
Altera       100%  100%  100%  100%   96%   97%   99%   96%   99%   98%
AM            89%   87%   90%   89%   91%   94%   92%   86%   88%   87%
Intel         90%   89%   89%   86%   87%   89%   89%   90%   93%   95%
              92%   92%   94%   94%   93%   95%   95%   93%   95%   95%

Adobe        100%   86%   86%   89%   84%   79%   78%   73%   67%   72%
Microsoft     85%   89%   90%   90%   92%   92%   91%   86%   89%   95%
              92%   87%   88%   90%   88%   86%   84%   79%   78%   84%

Variable Returns to Scale

              66%   69%   67%   70%   73%   72%   74%   75%   78%   79%

Apple         91%   90%   88%   91%   85%   87%   81%   84%   84%   84%
Dell         100%  100%   88%   86%   78%   89%   79%   83%   84%   84%
HP            89%   90%   89%   93%   92%   90%   93%  100%   94%   96%
IBM          100%  100%  100%  100%  100%  100%  100%  100%   99%  100%
              95%   95%   91%   92%   89%   92%   88%   92%   90%   91%

              97%   97%   97%   95%   95%   97%   97%   95%   96%   96%

Adobe        100%   91%   88%   91%   88%   90%   91%   93%   94%   93%
Microsoft     87%   99%  100%  100%  100%   97%   95%   87%   89%   97%
              94%   95%   94%   96%   94%   93%   93%   90%   91%   95%


Table C.10: DEA analysis - patent numbers as the outcome variable.



              62%   65%   61%   57%   57%   59%   56%   62%   65%   70%

Apple         72%   81%   80%   72%   70%   70%   66%   72%   75%   80%
Dell          97%   96%   82%   77%   71%   81%   72%   76%   72%   79%
HP            81%   79%   81%   79%   79%   84%   87%   94%   89%   92%
IBM           92%   92%   91%   89%   88%   90%   90%   90%   90%   92%
              85%   87%   84%   79%   77%   81%   79%   83%   82%   86%

              88%   88%   90%   92%   91%   90%   88%   89%   92%   94%

Adobe         37%   41%   63%   69%   68%   68%   68%   68%   67%   72%
Microsoft     78%   81%   83%   83%   85%   86%   88%   86%   89%   95%
              57%   61%   73%   76%   77%   77%   78%   77%   78%   84%

Variable Returns to Scale

              66%   67%   64%   60%   61%   63%   66%   73%   78%   79%

Apple         79%   86%   85%   80%   80%   81%   81%   84%   84%   84%
Dell          97%   96%   83%   78%   72%   83%   78%   83%   84%   84%
HP            81%   80%   82%   79%   80%   84%   90%  100%   94%   96%
IBM           94%   97%   99%   99%   98%  100%  100%   99%   99%  100%
              88%   90%   87%   84%   82%   87%   87%   92%   90%   91%

              90%   91%   93%   93%   93%   94%   94%   95%   96%   96%

Adobe         88%   86%   87%   87%   88%   90%   91%   93%   94%   93%
Microsoft     83%   84%   86%   86%   87%   88%   89%   87%   89%   97%
              86%   85%   87%   87%   88%   89%   90%   90%   91%   95%


Table C.11: DEA analysis - composite rating as the outcome variable.



              49%   53%   55%   59%   61%   63%   66%   60%   59%   60%

Apple         68%   65%   68%   81%   81%   82%   70%   67%   59%   40%
Dell          76%   74%   73%   74%   73%   71%   68%   54%   60%   59%
HP            53%   58%   57%   62%   63%   61%   58%   52%   46%   38%
IBM           52%   54%   55%   53%   58%   54%   51%   44%   40%   30%
              62%   63%   63%   68%   69%   67%   62%   54%   51%   42%

              73%   74%   72%   73%   73%   75%   72%   67%   68%   65%

Adobe        100%   84%   82%   88%   84%   78%   69%   59%   40%   32%
Microsoft     61%   72%   74%   78%   78%   74%   64%   59%   49%   26%
              80%   78%   78%   83%   81%   76%   66%   59%   45%   29%

Variable Returns to Scale

              57%   59%   59%   67%   70%   69%   72%   72%   74%   74%

Apple         73%   70%   72%   86%   84%   87%   81%   83%   84%   84%
Dell          78%   79%   76%   78%   76%   76%   74%   80%   84%   84%
HP            58%   64%   61%   67%   68%   63%   64%   65%   64%   64%
IBM           56%   59%   58%   55%   62%   56%   57%   59%   60%   60%
              66%   68%   67%   71%   72%   70%   69%   72%   73%   73%

              77%   78%   77%   76%   78%   79%   78%   80%   82%   83%

Adobe        100%   90%   87%   90%   88%   90%   91%   93%   94%   93%
Microsoft     68%   85%   90%   95%   90%   85%   69%   69%   70%   69%
              84%   88%   88%   92%   89%   88%   80%   81%   82%   81%
