
Spam and beyond: An information-economic analysis of unwanted commercial messages

Robert K. Plice Information and Decision Systems

San Diego State University San Diego, CA 92182-8234

+1-619-594-6857 [email protected]

(corresponding author)

Oleg V. Pavlov Worcester Polytechnic Institute

Nigel Melville

University of Michigan

August 2006


Spam and beyond: An information-economic analysis of unwanted commercial messages

Abstract

The phenomenon of unwanted commercial messages (UCM), including e-mail spam and

emerging forms that target other Internet communications facilities, is analyzed from an informa-

tion-economics perspective. The UCM industry is characterized as a tragedy of the commons.

UCM traffic pays off for its senders when it is noticed and consumed by Internet users, and the

industry is, therefore, dependent on a common-pool resource that is accessed through an infor-

mation asset. An analytical model of the industry is derived and solved computationally, and

two dimensions of information quality held by the senders of UCM traffic are manipulated in the

model. It is shown that such manipulations can moderate over time both the number of UCM

campaigns undertaken and the amount of Internet bandwidth consumed by UCM. Improved fil-

tering reduces the amount of traffic that penetrates an Internet user’s attention space, but actually

increases the amount of Internet bandwidth consumed, and will not lead to the elimination of

UCM. It is also shown that both public and private entities have adopted policies and practices

that have had unintentional informational side effects that have tended to increase, rather than

reduce, spam e-mail traffic. It is concluded that the lessons learned from the case of e-mail spam

can be applied to the development of policies and practices for mitigating newer, emerging forms

of UCM, including versions targeting instant-messaging systems and web logs. The paper adds a

third category of remedies to the literature on commons problems: information-based ap-

proaches.

KEY WORDS AND PHRASES: Internet communications, spam, common-pool re-

sources, tragedy of the commons, economic models, simulations.


1. Introduction

In the last decade, a dramatic increase in unwanted commercial message (UCM) traffic

over the Internet has become a costly problem. For example, a significant portion of worldwide

investment in network infrastructure is consumed by the transmission of e-mail spam; in fact,

most Internet e-mail traffic is spam.1 This remains the case despite vigorous actions by Internet

service providers (ISPs) and e-mail administrators to deploy filters and other technologies to

thwart spammers’ efforts to gain the attention of Internet users. The spammers do not seem to be

giving up: Even though defensive technologies have, in many cases, reduced the amount of spam

users see in their inboxes, the cost of dealing with spam at the server level continues to increase.

Moreover, emerging categories of UCM have arisen in parallel with the diffusion of new Inter-

net-based communications technologies. Splogs (UCM-content web logs) and spim (instant-

message UCM traffic) are two examples. One security firm executive estimates that ten percent

of instant messaging traffic is now UCM content: “It is where e-mail was several years ago” [5].

If e-mail spam is not getting through to end users as readily as it used to, why do spam-

mers send more of it? Can we apply insights gained from the fight against e-mail spam to devise

strategies for dealing with UCM in general? (E-mail spam has elicited both public and private

policy responses, as well as some empirical research [6], so the ability to draw general lessons

from that experience will be especially valuable.) In this paper, we model the economics of the

1 Because the Internet is a decentralized network of networks that cannot be monitored from any one location, there

is considerable variation in the reported data. For example, one source states that of the roughly 31 billion daily e-

mails sent globally, about 12.4 billion (41%) are spam [1]. Another source suggests an even higher proportion, stat-

ing that spam made up over 60% of all e-mails sent during the first six months of 2004 [2]. Still other reports held

that spam’s share of total e-mail rose from between 67% and 78% at the beginning of 2004 to between 75% and

88% at the end of the year [3]. Recent reports hold that the percentage of e-mail that consists of spam was still above

65% at the end of 2005 [4].


UCM industry, and show that these questions can be answered once we understand that it em-

bodies a tragedy of the commons.

The reason Internet users find UCM to be objectionable is that it imposes a cost, both di-

rect (such as the filtering infrastructure to stem the flow) and indirect.2 One indirect cost comes

from information overload. Even though Internet bandwidth is plentiful – so that the deluge of

UCM traffic has not (so far) overwhelmed the network – the individual human-attention resource

is scarce; that is, the capacity of UCM recipients to process increasing volumes of information is

limited and fixed [9]. As the amount of UCM traffic grows, it competes for attention with other,

more-valuable information. Thus, demand for the Internet users’ fixed attention endowment in-

creases while the average value of the information involved decreases.

The potential for information overload resulting even from carefully targeted direct ad-

vertising has been recognized [10]. The Internet, however, as an electronic channel for direct

marketing, is distinguished by marginal costs of transmission that are orders of magnitude less

than traditional channels such as direct mail or telephony. This has led to UCM-based direct-

marketing campaigns that are untargeted,3 because the cost of message delivery to an entire

population can be lower than the cost of determining which subset is most likely to respond.

Thus, the rise of electronic media competing for consumer attention can increase information-

overload effects to the detriment of traditional marketing channels – a case of an electronic mar-

2 A summary of statistics from recent reports includes statements that businesses spent $653M on spam filters in

2003 and that 50% of IT managers in a survey reported anti-spam efforts as their number-one priority. Ferris Re-

search has estimated that the price to organizations worldwide is about $50 billion per year [7]. The FTC has con-

cluded that these efforts have reduced the amount of spam reaching the average user’s inbox [8], but some industry

sources question those findings [4].

3 As we will observe later, the marginal cost of sending spam is low, but not zero. We do not argue that UCM mar-

keters would avoid targeting if they had salient information to use; only that the low cost of originating UCM traffic

induces them to send it to recipients whom they know little or nothing about.


ket having a disruptive effect on a traditional industry [11]. UCM, therefore, poses a threat to

legitimate direct marketing (including legitimate Internet marketing) [12; 13].

Previous research has identified e-mail spam as an instance of a tragedy of the digital

commons [14], due to the classification of the limited attention endowment of UCM recipients as

a common-pool resource. Following the terminology widely used in microeconomic analysis

[15], a good is excludable if a party can be prevented from using it, and rival if actions of one

party can negatively affect the others. A common-pool resource is one that is rival but not ex-

cludable. There is a tendency for common-pool resources to be overused, which is referred to as

a tragedy of the commons (or, commons problem) [16]. This comes about because individuals

enjoy private gains from the use of the resource, but do not pay its full cost. An example of a

common-pool resource is fish in the ocean, and an example of a commons problem is overfishing

that depletes the stock of fish to the joint detriment of the industry members. We will detail in

Section 2 how the economics of the UCM and fishing industries are analogous, in that the send-

ing of too much UCM traffic depletes the common-pool attention resource of the message recipi-

ents, also with a jointly negative industry-wide impact.

The literature describes both public and private solutions to commons problems. Public

solutions can be divided into command-and-control and market mechanisms. Command-and-

control involves some form of regulation. With respect to UCM traffic, regulatory and enforce-

ment efforts have been undertaken by various countries [17], but so far have been largely limited

to the case of e-mail spam. The U.S. Federal Trade Commission (FTC) is under mandate from

Congress via the 2003 CAN-SPAM Act to focus on an enforcement approach. The FTC is exploring

modifications to existing e-mail protocols that will better enable authorities to trace the

origins of messages [18]. There is debate about the effectiveness of such regulatory efforts, but

they are seen as one important tool for combating UCM [3; 8]. Market mechanisms for control-

ling UCM traffic have also been discussed, including electronic stamps [19], attention bonds

[20], and tradable permits [21]. There is some evidence that these approaches may mitigate the e-


mail spam problem [22], but each of them would require significant changes to the existing pro-

tocols for e-mail exchange over the Internet, making the feasibility of their implementation un-

certain.

The literature on private solutions to commons problems focuses on the Coase theorem

[15], which predicts that parties can negotiate their way toward desirable outcomes if property

rights are well defined and transaction costs are small. There is limited opportunity for negotiation

or self-regulation in the UCM industry, however. One reason is that (as we describe in Section

2) the industry players are generally unknown to one another. Another form of a private solution

is filtering: automatically separating UCM traffic from non-UCM messages. Some researchers

have proposed combining aspects of filtering with economic-incentive mechanisms [23]; yet, as

mentioned above, these approaches have shifted the locus of UCM-related costs from client

computers to server computers but have not eliminated the problem [24; 25].

Because the UCM industry relies on a common-pool resource that is accessed through in-

formation assets, rather than physical assets, we develop a model to show that specific informa-

tion-based manipulations can be brought to bear. Thus, we contribute the notion of an informa-

tion-based solution to the literature on commons problems, at the same time that we identify

policies and practices that would impact the efficacy of information held within the UCM indus-

try. These results enrich our understanding of electronic markets for attention in general – and

the UCM issue in particular – and serve to inform policy makers and technology providers as

they seek effective ways of dealing with UCM traffic on the Internet. Indeed, our findings imply

that if the UCM problem is to be successfully mitigated, an understanding of the economics of

the industry on the part of both public and private decision makers is mandatory, not just inter-

esting.

2. Industry setting


The UCM industry operates in a legally and ethically gray area, making it problematic to

use research methods commonly employed in studies of other industries. Surveys cannot be administered

to UCM firms, because they do not make their identities known (that is, they are “essentially

anonymous” [18, p. 8]). Moreover, secondary data is not readily available.4 Given

these limitations, as well as the benefits of developing general models providing insight into

emergent phenomena, we turn to economic modeling as our research methodology. Our model

development proceeds from the observation of a small number of fundamental conditions, each

of which is derived from what we know about UCM and the technical infrastructure that sup-

ports it. These will be sufficient to detail the UCM industry’s dependency on a common-pool

resource, and to describe analytically its characterization as a tragedy of the commons. The ana-

lytic specification will then become the foundation of our economic model.

Crafting a definition of UCM traffic is not straightforward, because one person’s un-

wanted message may be another’s legitimate commercial communication. A typical recipient

probably thinks of UCM traffic as “a message I don’t want.”5 But, if we accept that the origina-

tors of UCM traffic are driven by a profit motive, there must be some recipient who is interested

in receiving that same message – at least to the extent of reading the message and then doing

something that brings about revenue for the sender. The aspect of UCM that makes it undesirable

to users is that the operators do not know who will be interested in the messages and who will

not; therefore, they send the messages indiscriminately.

4 The content of legitimate e-mail and IM traffic is usually private or proprietary, and ISPs and corporate IS departments

are reluctant to make it available to researchers. Other sales-related industries, such as real-estate brokerage,

advertising, and retailing are comparatively transparent. For instance, secondary data can be accessed by NAICS

code from sources such as the U.S. Bureau of Economic Analysis for each of them, but there is no NAICS code for

the UCM or spam industries.

5 A survey of e-mail users uncovered considerable disagreement about what constitutes UCM traffic. Being un-

wanted is a necessary, but not sufficient, condition [26].


In order to focus our attention on such messages, which are likely (but not certain) to be

unwelcome to their recipients, we base our economic model on the following definition. It excludes

such things as targeted e-mail campaigns, where the sender has prior knowledge of the

recipients’ interests, but includes the kinds of messages that are annoying, time-consuming, and

potentially costly for their recipients.

Definition. UCM traffic consists of messages sent over the Internet (a) that urge the re-

cipient to take an action that can lead to revenue for the sender; and (b) that are untargeted; i.e.,

sent without specific knowledge of whether or not the recipient will value them.

Using this definition, along with our understanding of Internet technology, we can infer a

number of fundamental conditions that apply to the UCM industry. We summarize them in the

following six observations.

Observation 1: Exploitable targets. UCM traffic involves messages sent from one com-

puter to another over the Internet, which uses a system of IP addresses, message-transfer proto-

cols, and (possibly) filters designed to detect and eliminate UCM content. We use the term ex-

ploitable message target (EMT) to refer to a 3-tuple consisting of:

1. a target address (owned by an Internet user), which corresponds to a particular individ-

ual toward whose attention space a message can be aimed,

2. a message-delivery protocol, which describes the technical steps required to transmit a

message to the target address, and

3. message-content parameters, which identify limits on message content that will ensure

that the message will not be filtered out prior to delivery.

A valid EMT is one that, when used, will successfully deliver a unit of UCM traffic into the at-

tention space of an Internet user. There are a finite number of such 3-tuples in existence; i.e.,


there are a finite number of valid EMTs.6 An invalid EMT is one that, if used, will not result in a

unit of UCM traffic penetrating an Internet user’s attention space.7
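As an illustration only, the 3-tuple of Observation 1 can be written as a small record. The class name, field names, and the validity rule below are hypothetical choices made for this sketch, not part of the paper's model: an EMT is treated as valid when its target address is still in use and none of its content features is blocked by the recipient's filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EMT:
    """Exploitable message target: the 3-tuple from Observation 1 (field names are illustrative)."""
    target_address: str        # address owned by an Internet user
    delivery_protocol: str     # label for the message-delivery protocol, e.g. "smtp"
    content_params: frozenset  # content features the operator believes will pass the filter


def is_valid(emt, live_addresses, blocked_features):
    """Assumed validity rule: the address is in use and no content feature is filtered."""
    return (emt.target_address in live_addresses
            and emt.content_params.isdisjoint(blocked_features))


live = {"alice@example.org"}
blocked = {"keyword:pills"}
good = EMT("alice@example.org", "smtp", frozenset({"plain-text"}))
stale = EMT("bob@example.org", "smtp", frozenset({"plain-text"}))          # inbox no longer used
filtered = EMT("alice@example.org", "smtp", frozenset({"keyword:pills"}))  # filter was upgraded
print([is_valid(e, live, blocked) for e in (good, stale, filtered)])
```

The second and third records correspond to the two ways a once-valid EMT can become invalid: the address is abandoned, or a filter upgrade defeats the message-content parameters.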

Observation 2: UCM campaigns. We use the term operator to refer to a member of the

UCM industry. We use the term campaign to refer to an operator’s sending of a message to each

EMT on an exploit list. Because each campaign requires the use of Internet communications, and

there is a finite amount of Internet bandwidth in existence, the number of campaigns that can be

conducted during any period of time is finite.

Observation 3: Distribution of EMT ages. EMTs are brought into existence by the admin-

istrators of the communications facilities that are being targeted. Once an EMT exists, it remains

capable of receiving UCM traffic for an indefinite period (until invalidated by some action of the

target-address owner or administrator). Therefore, each EMT has an age, and the age distribution

of the population of EMTs depends on their historical rates of creation and invalidation. The ratio

of the number of EMTs currently of a particular age a to the total number of EMTs in existence

equals the probability that an arbitrary EMT is of age a.
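The ratio described in Observation 3 can be computed directly from a list of EMT ages. The following is a minimal sketch; the function name and the sample ages are made up for illustration.

```python
from collections import Counter


def age_distribution(ages):
    """Map each age a to (number of EMTs of age a) / (total number of EMTs)."""
    counts = Counter(ages)
    total = len(ages)
    return {a: counts[a] / total for a in sorted(counts)}


# Hypothetical population in which younger EMTs outnumber older ones.
ages = [0, 0, 0, 1, 1, 2, 2, 3]
dist = age_distribution(ages)
print(dist)
```

Because the values are shares of a single population, they sum to one and can be read as the probability that an arbitrary EMT has a given age.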

Observation 4: EMT discovery process. By their nature, EMTs are not placed on exploit

lists at the request of target-address owners. Therefore, there are processes of discovery by UCM

operators. Such processes could include harvesting target addresses from locations accessible

over the Internet; randomly generating target addresses and message content; or acquiring known

EMTs from other UCM operators or from entities with which the target-address owner has had

prior communication. Because we only consider EMTs that can actually appear on exploit lists

6 Owners of target addresses often take measures to keep them private [26]. If completely successful, they will not

receive UCM traffic. We exclude any such addresses from the population of interest, because they are not relevant to the

UCM phenomenon.

7 As examples, a phony e-mail address may be posted online and be harvested for use; or, an e-mail inbox’s owner

may stop using the inbox. In the latter case, an EMT that once was valid becomes invalid. Another example is a fil-

ter upgrade that makes a once-valid EMT invalid, because the message-content parameters no longer ensure that a

message will get past filters.


(and, therefore, are subject to discovery), we can infer that there is a positive probability that

some UCM operator will discover a given EMT during any given time period. Thus, the prob-

ability that any particular operator will know about an EMT is a function of time: The older an

EMT, the larger the number of operators who will have discovered it and placed it on their ex-

ploit lists. And, because the number of campaigns is finite, there is some EMT age at which it

will be included on every campaign’s exploit list. The discovery process may be imperfect; that

is, exploit lists may contain invalid EMTs, which have been discovered but, when used, will not

cause a message to penetrate a user’s attention space.
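One functional form consistent with Observation 4 is a discovery probability that rises linearly with EMT age until every campaign's exploit list contains the EMT. The sketch below assumes that linear form; the parameter `saturation_age` and both function names are hypothetical.

```python
def discovery_probability(age, saturation_age=10.0):
    """Assumed probability that a given campaign's exploit list contains an EMT of this age."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return min(1.0, age / saturation_age)


def expected_messages(age, campaigns, saturation_age=10.0):
    """Expected number of messages aimed at one EMT when `campaigns` campaigns are run."""
    return campaigns * discovery_probability(age, saturation_age)


print(discovery_probability(5), discovery_probability(20), expected_messages(5, 100))
```

Any nondecreasing function of age that reaches one at a finite age would satisfy the same restriction; the linear form is simply the easiest to inspect.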

Observation 5: Internet-user attention endowment. An Internet user must internalize, or

consume, a unit of UCM traffic in order to know what action the campaign operator wishes him

to take. The human processing of a message takes time, and each Internet user has a finite

amount of time available [27]. Therefore, increasing the number of messages sent to an EMT

eventually lowers the probability that the recipient will consume any particular one of them.

Moreover, even if the user simply ignores a unit of UCM traffic (e.g., deletes it without reading

it), it still takes a positive increment of time for her to decide it is not of interest. At some volume

of UCM traffic her attention endowment will be fully consumed. After that, the extra time

needed to deal with one additional message must come from reducing the number that she actu-

ally consumes.
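The time-budget reasoning in Observation 5 can be made concrete with a simple accounting sketch. It assumes, purely for illustration, that every received message costs a fixed screening time and that each consumed message costs a further fixed reading time; the parameter names and values are hypothetical and are not taken from the paper.

```python
def messages_consumed(received, endowment=60.0, screen_time=0.5, consume_time=2.0):
    """Messages a user can consume out of `received`, given a fixed time endowment.

    Every message costs `screen_time`; each consumed message costs `consume_time` more.
    """
    time_left = endowment - received * screen_time
    if received <= 0 or time_left <= 0:
        return 0.0
    return min(float(received), time_left / consume_time)


def consumption_probability(received, **kwargs):
    """Probability that any particular received message is consumed."""
    if received <= 0:
        return 1.0
    return messages_consumed(received, **kwargs) / received


for n in (10, 24, 40, 120):
    print(n, messages_consumed(n), round(consumption_probability(n), 3))
```

Under these assumed numbers the endowment is exhausted at 24 messages. Beyond that point each extra message takes screening time away from reading, so the number consumed falls, reaching zero once screening alone uses up the whole endowment.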

Observation 6: UCM industry revenue. Operators have the potential to receive revenue if

the recipient takes the action being urged in the message; that is, operators execute UCM cam-

paigns because they want to make money. Before the action can be taken, the recipient must con-

sume the UCM message. This implies that there must be a positive relationship between the ag-

gregate industry-wide revenue accruing to operators and the total units of UCM traffic that are

consumed by Internet users.

These six observations about the UCM industry depend only on the definition of UCM

traffic and what is known about the operation of the Internet and the limited attention resources


of humans. Although not all UCM operators and target-address owners are alike, these character-

istics are based on features they have in common by virtue of their involvement in sending or

receiving UCM.

The first six items in Table 1 summarize the observations stated above, and give the nota-

tion we use to represent the corresponding effects within the model. For each observation, the

table states the key implication, and gives restrictions arising from the discussion. All but one of

the restrictions are self-explanatory; the exception is the restriction in Observation 5 that there is

some volume of UCM traffic that depletes the Internet user’s attention endowment. The expres-

sion in Table 1 for this restriction is based on Lemma 1, which can be found in the Appendix.

The intuition behind the restriction is that increased volume of UCM traffic eventually consumes

all of the recipient’s available attention. The corresponding model expression requires that if still

more UCM traffic reaches a user after all of his attention is used up, he will have to spend less

time consuming messages.

<< Table 1 >>

Before using the components in Table 1 to construct our model of the UCM industry, it is

instructive to note that we need nothing more than these six observations to establish that eco-

nomic outcomes in the UCM industry can differ from other industries that may appear compara-

ble. To show this, we state the following, which we can prove based on Observations 1-6:

Observation 7: Common-pool resource property. Let m represent the total number of

UCM campaigns conducted by all operators in the UCM industry. Then, aggregate UCM industry

revenue has an interior maximum in m. (The proof is contained in the Appendix.)

The significance of Observation 7 can easily be shown graphically. Figure 1 depicts two

different instantiations of each of the key relationships and restrictions from Table 1, as follows:

Distribution of EMT ages. In Figure 1A, the nonlinear function illustrates the assumption

that younger EMTs outnumber older ones. This might arise from a history of exponential growth

in adoption of an Internet communications protocol over time. The Figure 1B illustration repre-

sents an alternative assumption of constant levels of growth over time.


EMT-discovery process. In Figure 1A, the nonlinear function captures the assumption

that as more UCM operators discover an EMT, the discovery process speeds up. This might arise

from the sharing or selling of exploit lists between operators. The Figure 1B assumption, on the

other hand, is that the probability of an operator using an EMT in a campaign increases linearly

with the age of the EMT.

Internet-user attention endowment. The Figure 1A s-curve functional form assumes that

when there are just a few units of UCM traffic, their recipients will be quite likely to consume

them. As the amount of traffic increases, the probability of a message being consumed drops off

sharply, then plateaus again at a low level that approaches – but never equals – zero. The simpler

Figure 1B assumption is that each additional message incurs a linear reduction in its probability

of being consumed, until the probability reaches zero.

<< Figure 1 >>

Figure 1 contains just two possible variations for these three functional forms, but an in-

finite number of other instantiations could be used. Our purpose in illustrating the two cases is to

emphasize that the restrictions imposed by our six observations are weak: They accommodate a

variety of quite distinct possibilities as to how UCM operators form their exploit lists and how

Internet users allocate their time. But, we have enough to show analytically that the UCM indus-

try is characterized as a tragedy of the commons. The bottom panels in Figure 1A and 1B illus-

trate Observation 7. (These two curves were constructed by applying Note 6d from Table 1.)

Each curve shows the total units of UCM traffic that will be consumed industry-wide, as a func-

tion of the number m of industry-wide UCM campaigns that are conducted. No matter whether

we use the assumptions illustrated in Figure 1A or 1B – or any other set of conforming assump-

tions, for that matter – there is a point at which more UCM campaigns actually cause fewer mes-

sage units to be consumed. And, because of Observation 6, fewer consumed messages means

less revenue for the industry.
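The hump shape described here can be reproduced numerically with very simple ingredients. The sketch below is not the paper's Table 1 specification. It assumes a linear discovery probability, a time-budget rule for attention, and a tiny hypothetical population of ten EMTs; all function names and parameter values are illustrative.

```python
def discovery_probability(age, saturation_age=10.0):
    """Assumed linear discovery: share of campaigns whose exploit list holds an EMT of this age."""
    return min(1.0, age / saturation_age)


def messages_consumed(received, endowment=60.0, screen_time=0.5, consume_time=2.0):
    """Assumed attention rule: screening every message leaves less time to consume any."""
    time_left = endowment - received * screen_time
    if received <= 0 or time_left <= 0:
        return 0.0
    return min(received, time_left / consume_time)


def total_consumed(m, emt_ages):
    """Industry-wide units of UCM consumed when m campaigns are conducted."""
    return sum(messages_consumed(m * discovery_probability(age)) for age in emt_ages)


emt_ages = list(range(1, 11))  # hypothetical population: one EMT of each age 1..10
curve = {m: total_consumed(m, emt_ages) for m in range(0, 1201)}
peak_m = max(curve, key=curve.get)  # boundary between rising and falling consumption
print(peak_m, curve[peak_m])
```

With these assumed parameters, consumption rises up to m = 48 campaigns and declines thereafter, falling to zero once every user's time is absorbed by screening. Changing the functional forms moves the peak but, as long as the restrictions hold, does not remove it.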

This characterization implies that there are features of the UCM industry that may not

evidence themselves in other sales-related industries. The harder UCM operators work at their


businesses (by conducting more campaigns), the fewer “sales” they will make collectively. We

see from Observation 7 that the fishing industry – often used to illustrate the tragedy of the com-

mons – is an apt analog to the UCM industry. In Region I of the bottom panels of Figure 1, more

UCM campaigns are seen to increase industry revenues – just as more fishing boats might at first

lead to more fish being caught. When Region II is reached, however, the stock of Internet-user

attention is depleted to the point where every new industry entrant actually decreases total indus-

try revenue – just as too many fishing boats will deplete the stock of fish, eventually to the point

where zero fish would be caught.

These effects occur because both the UCM and fishing industries depend on common-pool

resources. In the case of UCM, the resource is not physical, like fish; rather, it is a common-pool

attention resource. Access to Internet-user attention does not depend on a physical asset like a

fishing boat, but on an information asset: specifically, an exploit list containing EMTs. Because

of this, we will be able to show that dealing with the UCM problem may require manipulating

the information assets of the UCM operators, just as solving the problem of overfishing may require

manipulating the ability of fishing boats to catch fish. And, there can be many innovative

and subtle ways to influence the efficacy of an information asset.

3. Model setup

Because we have defined the Internet as the transport medium for UCM, we can state the

following:

Observation 8: UCM industry cost. Each UCM operator incurs a positive cost for con-

ducting each campaign, and the more EMTs there are on a campaign’s exploit list, the more it

will cost to conduct the campaign within a given amount of time.

Observation 8 is based on the very nature of the Internet: it is a privately owned packet-

switching network that collects fees for every packet it carries [28]. It is believed that UCM op-

erators are adept at getting someone else to pay for the Internet bandwidth they consume. For

example, they might exploit open relays or proxy servers, or use malicious software to infect and


then hijack target computers [18]. But, although such techniques could allow an operator to con-

duct a UCM campaign at a lower cost than by directly paying a legitimate Internet service pro-

vider,8 UCM operators cannot conduct campaigns for free. This is so because every campaign

requires a set of servers to implement the message-delivery protocol, whether paid for by the op-

erator or exploited illicitly. Every server can send a limited number of messages per unit of time,

depending on its speed and the bandwidth of its Internet connection. To increase the number of

messages that can be transmitted beyond that limit, the UCM operator must gain increased server

capacity – either by acquiring it legitimately or by obtaining it illicitly. Both options have costs.

Gaining legitimate capacity involves paying for computer resources and Internet connectivity.

Gaining illicit capacity involves searching for and successfully invading exploitable targets – or

paying someone else to do so. And, because the useful lives of illicitly acquired assets are short

(as the exploitations are discovered and countermeasures taken), the costs of re-acquiring them

must be borne frequently.
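The capacity argument behind Observation 8 amounts to simple arithmetic, sketched below. The throughput, time window, and per-server cost are invented figures used only to show the mechanism; the per-server cost stands in for either a legitimate hosting fee or the effort of acquiring a machine illicitly.

```python
import math


def servers_needed(list_size, msgs_per_server_hour, window_hours):
    """Servers required to reach every EMT on the exploit list within the time window."""
    return math.ceil(list_size / (msgs_per_server_hour * window_hours))


def campaign_cost(list_size, msgs_per_server_hour=50_000, window_hours=24, cost_per_server=5.0):
    """Assumed campaign cost: a fixed cost per server, however the server is obtained."""
    if list_size <= 0:
        raise ValueError("a campaign needs at least one EMT")
    return servers_needed(list_size, msgs_per_server_hour, window_hours) * cost_per_server


print(campaign_cost(1_000_000), campaign_cost(5_000_000), campaign_cost(1_000_000, window_hours=1))
```

Cost is positive for any campaign and rises in steps as the exploit list grows or the delivery window shrinks, which is all that Observation 8 requires.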

Combining Observation 2 and Observation 8, we can state that there is a finite number of

UCM campaigns that can be executed in a given time interval for any level of UCM industry-

wide spending. Table 1 includes the model notation used to summarize Observation 8. It is then

straightforward to observe the following.

Observation 9: Equilibrium number of campaigns. There is an equilibrium number M of

UCM campaigns such that when m = M, industry-wide revenue equals industry-wide cost.

Observation 9 can be understood intuitively from Figure 1 and Observation 7, and is con-

sistent with standard economic analysis. Because industry revenue has an interior maximum and

industry cost increases in m, there is some number M of campaigns where industry revenue

equals industry cost. If fewer than M campaigns are conducted during a given time interval,

8 Here we are focusing just on cost. We realize that a main motivation for these activities is not to save money, but

to hide the UCM operator’s identity. We also know that responsible ISPs will not knowingly provide accounts to

UCM operators, so operators sometimes pay underground providers inflated fees for access [29].


there will be some operator(s) with positive profits. These operators will have an incentive to

conduct even more campaigns during the next time interval. If more than M campaigns are con-

ducted, some operators will lose money, and they will rationally decide to conduct fewer cam-

paigns. The number of campaigns conducted will remain constant from one time interval to the

next only if the average campaign returns a zero profit; so that if any operators decide to do more

campaigns, they will be offset by other operators deciding to do fewer.9
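The adjustment process described in this paragraph can be sketched with a stand-in revenue curve. The quadratic revenue function, the linear cost function, and all parameter values below are assumptions chosen for tractability, in the spirit of textbook open-access fishery models; they are not the functional forms of the paper's model.

```python
def revenue(m, price=1.0, capacity=1000.0):
    """Assumed hump-shaped industry revenue, maximized at capacity / 2."""
    return price * m * (1.0 - m / capacity)


def cost(m, unit_cost=0.25):
    """Assumed industry cost, linear in the number of campaigns."""
    return unit_cost * m


def equilibrium_campaigns(price=1.0, capacity=1000.0, unit_cost=0.25):
    """Positive m at which revenue(m) equals cost(m) under the assumed forms."""
    return capacity * (1.0 - unit_cost / price)


def simulate_entry(m0, periods=100, speed=200.0):
    """Campaigns expand while the average campaign is profitable and contract otherwise."""
    m = m0
    for _ in range(periods):
        avg_profit = (revenue(m) - cost(m)) / m
        m = m + speed * avg_profit
    return m


M = equilibrium_campaigns()
print(M, simulate_entry(100.0), simulate_entry(900.0))
```

Starting from either too few or too many campaigns, the simulated industry settles at the zero-profit level M. In this example M lies above the revenue-maximizing level of capacity / 2, which is the overuse that characterizes a commons problem.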

Users have a choice when they begin to receive ever-increasing volumes of UCM traffic:

They can expend more effort each day separating units of UCM traffic from legitimate messages

(possibly consuming some of the unwanted messages in the process); or, they can take an action

that invalidates the EMTs associated with the unwanted messages. (Such actions can include

abandoning the existing target address and replacing it with a new one, or strengthening filtering

capabilities so that existing message-content parameters will be invalid.10) Both options are

costly. Although new EMTs can be created with little direct cost (see Observation 3), it requires

time and effort to inform legitimate communicants of a new target address, and there are oppor-

tunity costs associated with missed communications if filters are strengthened. Therefore, we can

state the following.

9 More formally, because of low barriers to entry the UCM industry is competitive, and individual operators are

price takers (i.e., each perceives that per-campaign revenue is not affected by the number of campaigns conducted).

Following the textbook analysis of equilibrium in a competitive industry [30], equilibrium revenue per unit of output

(i.e., per campaign) then equals average total cost, and economic profits are absent. Although there is nothing sur-

prising about a zero-profit equilibrium obtaining in a competitive industry, it is seldom recognized by those who

write about UCM. For example, a widely cited research report is typical in referring to the “lucrative spam industry”

[26].

10 A survey of e-mail users reported percentages who have reacted to large amounts of spam by: obtaining obscure

e-mail addresses that are resistant to dictionary attacks (14%); obtaining separate addresses to use in situations that

might lead to spam (23%); and obtaining addresses that they then avoid giving out at all (73%) [26].

Observation 10: EMT invalidation. A rational Internet user will invalidate an EMT when

the cost of sorting through large volumes of UCM traffic exceeds the expected cost of invalida-

tion and replacement. Because the cost of dealing with UCM traffic will strictly increase with the

volume received, for every Internet user there is some volume of UCM traffic that will lead to

the decision to invalidate an EMT. When an EMT is invalidated, UCM operators may or may not

remove it from their exploit lists.

Table 1 includes the model notation used for Observation 10. Observations 9 and 10,

when considered simultaneously, reveal several ways that UCM operator behavior and Internet-

user behavior interact. The following feedback effects will result from an increase in the number

of UCM campaigns conducted by the industry:

1) There will be more UCM traffic reaching each EMT, giving more opportunities for Internet

users to take the actions being urged in the messages and generate revenue for the operators.

This will increase UCM industry profits, causing still more campaigns to be conducted.

2) There will be higher total costs incurred by UCM operators, which reduces industry profits

and leads to fewer campaigns being conducted.

3) More Internet users will decide to invalidate their EMTs and replace them with substitutes.

They will then receive relatively few units of UCM traffic, but they will have more time to

consume some of it. This will lead to increased revenue for the operators, causing them to

execute even more campaigns.

4) Over time, the new, substitute EMTs will be discovered by more and more UCM operators,

increasing the size of their exploit lists (and, therefore, increasing industry costs). The in-

creased UCM volume will consume the Internet users’ attention endowments, causing them

to consume less of the UCM traffic they receive. Thus, UCM industry revenue will eventu-

ally decrease at the same time industry costs increase, resulting in fewer campaigns.

5) Some or all of the invalidated EMTs might be removed from the operators’ exploit lists.

Smaller exploit lists will reduce the UCM industry’s costs, leading to more campaigns.

Because of these multiple ways in which the behavior of UCM operators and Internet us-

ers interact, the equilibrium number of UCM campaigns (i.e., the zero-profit number M in Ob-

servation 9) will vary over time, and it is not intuitively obvious whether it will increase or de-

crease – that will depend on which feedback effects dominate. Even more complication arises

when we recognize that the population of EMTs is not fixed; i.e., there are new Internet users

creating new EMTs while the other feedback effects are ongoing.

Two alternatives for analyzing this system are: (a) finding a closed-form expression for

the behavior of M as a function of time, or (b) simulating the system’s behavior and observing

the path of M over time. Under either approach, the analysis will require the adoption of spe-

cific functional forms to represent the relationships given in Table 1. Using simulation methods

offers two distinct advantages over seeking a closed-form expression. First, a closed-form solu-

tion will be feasible only if we choose functional forms with helpful properties that lend tracta-

bility to the model, but a simulation can easily be accomplished for any choices of functional

forms that satisfy the restrictions in Table 1. Second, a simulation will allow us to experiment

with a variety of different assumptions, so that we can be confident that the model’s implications

are not driven by how we choose to instantiate the functions.

For these reasons, we have accomplished the analysis as follows. First, we chose instan-

tiations of the functions and model parameters given in Table 1. Next, we simulated the behavior

of the system over time, specifically noting the behavior of M . Then, we experimented with pa-

rameter changes that represent manipulations of the efficacy of the UCM operators’ information

assets, and noted the results. Additionally, we repeated our observations using different instantia-

tions of the functions, to show that our results were driven not by our assumptions, but by the

fundamental economics of an industry dependent on a common-pool resource.

4. Analysis

4.1. Baseline simulation

Of the functions given in Table 1, the distribution of EMT ages requires special treatment

because it changes over time; i.e., it is an endogenous result of the simulated activity. One

reason for this is that feedback effects occur: the number of new EMTs created each time

period depends on the number of UCM campaigns that were conducted during the previous time

period. Another reason is that the total number of Internet users may grow over time, and new

EMTs are created in each time period to accommodate them. Finally, existing EMTs get older as

time goes on. The distribution of EMT ages at any point in the simulation reflects all of this prior

activity.

To capture the growth in the population of Internet users, we have included the constant

term U in the model (see Table 2). We have captured the feedback and EMT-aging effects

analogously to what might be done in a human demographic model, and we have presented the

corresponding model notation in Table 3. A demographic model might quantize data into four

20-year categories: children (age 0-19); young adults (age 20-39); middle-aged (age 40-59) and

senior (age 60 plus), because, for example, there are few significant changes between people of

age 25 and age 30 but large differences from age 15 to 45. These groupings could be called

stages of maturity for humans. The quantities $g(a)$, for $a \in \{0, 1, 2, 3\}$, would represent the numbers

of people at each maturity stage as of some base year. To adjust the quantities for the following

year, one would estimate that 1/20 of the children will become young adults, 1/20 of the

young adults will reach middle age, and so on. That is, one would use a parameter $h = 1/20$, and

apply a difference equation such as $g_{t+1}(1) = g_t(1) + h\,g_t(0) - h\,g_t(1)$. The parameter h represents

the speed at which the population matures. << Table 2 >> << Table 3 >>
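The demographic analogy can be written out for all four stages at once. The sketch below assumes that the same fraction h leaves each of the first three stages per period and that the final stage only accumulates. The function name age_one_period, the newcomers argument and the sample counts are hypothetical and serve only to illustrate the difference equation given above for stage 1.

```python
def age_one_period(g, h, newcomers=0.0):
    """Advance a four-stage maturity distribution by one period.

    g is a list [g(0), g(1), g(2), g(3)] of counts per stage.
    h is the fraction of each non-final stage that matures per period.
    newcomers is the number of entrants added to stage 0 (an assumption).
    """
    if len(g) != 4:
        raise ValueError("expected exactly four maturity stages")
    outflow = [h * g[a] for a in range(3)]  # members leaving stages 0, 1, 2
    return [
        g[0] + newcomers - outflow[0],
        g[1] + outflow[0] - outflow[1],  # matches g_{t+1}(1) = g_t(1) + h g_t(0) - h g_t(1)
        g[2] + outflow[1] - outflow[2],
        g[3] + outflow[2],               # final stage has no outflow in this sketch
    ]


base_year = [100.0, 80.0, 60.0, 40.0]
next_year = age_one_period(base_year, h=1 / 20)
print(next_year)
```

With no newcomers the total population is conserved, which is a quick sanity check on the bookkeeping. A larger h moves members through the stages faster.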

Table 3 details how we have used this scheme to model the maturity of EMTs, and

thereby, to arrive at an expression for the distribution of EMT ages in each time period. By Ob-

servation 4 we know that EMTs eventually reach a state of maturity where they have been dis-

covered by all UCM operators, but we do not want to make a fixed assumption about how fast

the EMT-discovery process unfolds. Rather, we want to study how a change in the speed of

EMT discovery affects key variables. Accordingly, we have adopted a four-stage maturity construct,

which captures the age of an EMT by the parameter $a \in \{0, 1, 2, 3\}$, and the speed at which

EMTs mature by the constant parameter h (see Table 2). We will then be able to experiment

with different values of h to learn how the equilibrium number of UCM campaigns is influenced

by the speed with which UCM operators discover EMTs.

Table 3 also specifies an equation to compute the number of invalid EMTs in each time

period. Following Observation 10, we include a constant parameter s in the model, which can be

set to either one or zero (see Table 2). As shown in Table 3, when $s = 1$ the model simulation

will reflect the assumption that operators do not purge their exploit lists to remove invalidated

EMTs. That means there will be growth over time in the number of invalid EMTs appearing on

exploit lists. Setting $s = 0$ will allow comparison to the opposite case, when operators remove

invalidated EMTs from their exploit lists.
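The role of the switch s can be sketched as follows. The exact equation is the one given in Table 3; the recursion below is only an assumed form in which s multiplies the stock of invalid EMTs carried forward on exploit lists. The function name invalid_on_lists and the sample inflows are hypothetical.

```python
def invalid_on_lists(invalidated_per_period, s):
    """Track invalid EMTs remaining on exploit lists over time.

    invalidated_per_period lists how many EMTs users invalidate in each period.
    s = 1 means operators never purge; s = 0 means invalid EMTs are removed.
    Assumed recursion (illustrative only): I_{t+1} = s * (I_t + x_t).
    """
    if s not in (0, 1):
        raise ValueError("s must be 0 or 1")
    carried = 0.0
    path = []
    for newly_invalid in invalidated_per_period:
        carried = s * (carried + newly_invalid)
        path.append(carried)
    return path


inflows = [5, 3, 2]
unpurged = invalid_on_lists(inflows, s=1)  # invalid entries accumulate
purged = invalid_on_lists(inflows, s=0)    # invalid entries never persist
print(unpurged, purged)
```

In this sketch the unpurged case grows by each period's invalidations, while the purged case stays at zero, which mirrors the contrast between the two settings described above.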

The final difference equation shown in Table 3 specifies the calculation of the equilib-

rium number of campaigns, M , in each time period. This calculation is based on Observation 9.

When overall UCM industry profits are positive, rational operators will conduct more campaigns

(and, more operators may enter the industry). When profits are negative, the number of cam-

paigns will go down over time (but cannot go below zero).
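The adjustment logic described above can be illustrated with a short simulation. The rule actually used is the one specified in Table 3; the sketch below assumes a simple proportional rule, with a hypothetical adjustment speed and a hypothetical profit function, purely to show how campaigns rise while profits are positive, fall while they are negative, and never go below zero.

```python
def next_campaigns(m, profit, speed=0.05):
    """Assumed rule: change campaigns in proportion to industry profit, floored at zero."""
    return max(0.0, m + speed * profit)


def simulate(m0, periods, profit_fn, speed=0.05):
    """Iterate the adjustment rule and return the path of campaign counts."""
    path = [m0]
    for _ in range(periods):
        m = path[-1]
        path.append(next_campaigns(m, profit_fn(m), speed))
    return path


def example_profit(m):
    """Hypothetical industry profit: revenue m * (10 - 0.01 m) minus cost 4 m."""
    return m * (10.0 - 0.01 * m) - 4.0 * m


path = simulate(m0=10.0, periods=200, profit_fn=example_profit)
print(round(path[-1], 6))
```

With these assumed forms the path climbs from its starting value and settles near the zero-profit level of about 600 campaigns. In the full model that level itself moves over time because the EMT population and its age distribution change.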

The difference equations in Table 3, for any choice of the three constants in Table 2, give

us the calculations needed to simulate the dynamic behavior of the system. Table 4 lists various

instantiations used for the static constructs; i.e., those that do not change over time. Both linear

and nonlinear forms are included, following the examples used earlier. << Table 4 >>

As a baseline case, Figure 2 displays the result of running the simulation using linear

forms for each of the static constructs. UCM industry profit ( Π , graphed on the left-hand scale)

begins in positive territory and increases sharply over time as more campaigns ( M , graphed on

the right-hand scale) are conducted. Eventually, the common-pool resource property stated in

Observation 7 begins to dominate, and industry profit decreases over time as users’ attention en-

dowments are depleted. When industry profit reaches zero, the operators stop increasing the

number of campaigns. Then, the main variable of interest, M , reflects that the equilibrium num-

ber of campaigns continues to increase slowly over time. << Figure 2 >>

To verify that substantially identical results obtain regardless of what functional forms

are chosen for the static constructs, Figure 3 describes the results of repeating the simulation

eight times, using the different combinations of linear and nonlinear instantiations listed in Table

4. But for scaling factors, the behavior of the dynamic variable M follows a similar path in each

case. The nonlinear instantiations result in a set of curves that fit within the shaded region at the

top of the figure (the median curve is shown as a solid line). The linear instantiations fit within

the bottom shaded region. Because there is no significant difference arising from alternative as-

sumptions about the static constructs, the remainder of the analysis will be presented by arbitrar-

ily choosing the nonlinear forms of the static constructs, as illustrated in Figure 1A and specified

in Table 4. The authors have separately tested the results for robustness under a variety of differ-

ent assumptions. <<Figure 3>>

The parameter U , representing the number of new EMTs created to meet growth in

numbers of Internet users during each time period, also has an impact on the dynamic behavior

of M . In Figure 4, it is shown that the equilibrium number of campaigns reaches a relatively

stable plateau faster when U is increased, but eventually the different curves arising from

choices of U converge. Accordingly, we will fix the value of U for the remaining presentation

of the analysis, and concentrate on the impact of manipulations of the other two constants in Ta-

ble 2. << Figure 4 >>

4.2. Manipulations of information-asset efficacy

As noted in the discussion following Observation 7, UCM operators depend on a common-pool

resource of Internet-user attention, which they access through their information assets:

the exploit lists used to conduct their campaigns. We now turn to an analysis of how changes to

the model parameters impact the efficacy of these information assets. To relate these parameters

to the efficacy of the operators’ information assets, it is useful to identify two distinct quality

characteristics of UCM exploit lists, which we will designate Q1 and Q2. (Here, information

quality is assessed from the point of view of the UCM operators.)

Q1: EMT validity. An operator’s exploit-list quality is reduced when invalid EMTs are

added to it and when valid EMTs are removed from it.

Q1 information quality stems directly from Observation 6 and Observation 8. By Obser-

vation 6, the revenue accruing to a campaign operator depends on the number of valid EMTs

used, because that determines how many messages will be consumed. By Observation 8, the cost

of executing a UCM campaign depends on the size of the exploit list; that is, a larger exploit list

will give rise to higher campaign costs. Other things equal, an operator would be better off using

an exploit list that contains fewer invalid EMTs and more valid EMTs, because that would re-

duce costs and increase revenue. By further analogy to the fishing industry, a reduction in a fish-

erman’s Q1 information quality would occur if he had no fish-finding sonar and wasted time

trolling in areas without fish.

In Figure 5, two of the simulations displayed were conducted with the parameter setting

$s = 1$. This setting models the case when operators do not have the information needed to purge

their exploit lists of invalid EMTs. Comparison with the simulations where $s = 0$ shows that

fewer UCM campaigns are conducted when the exploit lists are not purged. When Q1 informa-

tion quality is impaired, the shape of the curve representing the number of campaigns changes.

Instead of a concave, increasing function, the number of campaigns follows a convex path, de-

creasing over time. This is particularly significant for the assessment of policies and practices

relating to UCM, as we will discuss in Section 5. We summarize this finding as follows:

Proposition A: Reductions in Q1 exploit-list quality. If operators are unable to purge in-

valid EMTs from their exploit lists, the equilibrium number of campaigns (a) will be smaller than

if purged exploit lists were used, and (b) will decrease, rather than increase, over time. << Fig-

ure 5 >>

Q2: EMT exclusivity. An operator’s exploit-list quality is reduced when some of the valid

EMTs it contains are added to another operator’s exploit list.

Q2 quality refers to an information-externality effect: If an operator adds a valid EMT to

her exploit list, she lowers the quality of all the other operators’ exploit lists that also contain that

EMT. The Q2 information-quality dimension stems directly from Observation 5: An Internet

user is more likely to consume a UCM message if there are fewer other messages competing for

his time. In the fishing industry, a reduction in Q2 quality would occur if a fisherman’s private

knowledge of where the fish are plentiful became public – so that other fishermen would arrive

to deplete the stock.

Q2 quality would be reduced to a minimum if all operators discovered new EMTs instan-

taneously, because, in that case, there would be no private information on any exploit lists. The

slower the speed of EMT discovery, the longer it will take for new EMTs to make their way onto

all of the exploit lists, and (other things equal) the higher industry-average Q2 quality will be.

Figure 5 shows two simulations with a parameter setting $h = 0.1$, corresponding to slow discovery

(high Q2 quality), and two simulations with $h = 1$, corresponding to fast discovery (low Q2 quality).

We can see from Figure 5 that the impact of a reduction in Q2 quality is greater when Q1

quality is also impaired. There is interaction between the two quality dimensions, due to the

feedback effects in the model – just as a fisherman’s private knowledge of where the fish are

plentiful would be all the more valuable if nobody had fish-locating sonar.

We summarize our finding regarding Q2 quality as follows:

Proposition B: Reductions in Q2 exploit-list quality. The faster the speed at which opera-

tors discover new EMTs, the smaller the equilibrium number of campaigns. This effect is greater

when operators are unable to purge invalid EMTs from their exploit lists.

Figure 6 presents results from the same four simulation runs as in Figure 5, but displays

the total number of UCM messages sent, E , rather than the equilibrium number of campaigns,

M . (Note that Figure 6 is drawn on a logarithmic scale.) It is clear that reductions in Q1 and Q2

quality that bring about a lower number of UCM campaigns also bring about a higher volume of

UCM message activity on the Internet. Moreover, the amount of Internet bandwidth consumed

by UCM continues to grow in each of the four scenarios, even when the number of campaigns is

decreasing. << Figure 6 >>

Intuitively, reducing the quality of exploit lists reduces industry-wide profits, leading to

industry exits and fewer campaigns. Also, average exploit-list size increases as Q1 and Q2 de-

crease. (A reduction in Q1 comes about when invalid EMTs are added to an operator’s exploit

list, and a reduction in Q2 occurs when the EMTs known to one operator are added to the exploit

lists of other operators.) Thus, reduced exploit-list quality goes together with an increased num-

ber of targets in each campaign, leading to an increase in the total number of UCM messages

sent industry wide.
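A small worked example makes the tradeoff concrete. It assumes, for illustration only, that total message volume is the number of campaigns multiplied by the average exploit-list size; the function name total_messages and all numbers below are hypothetical and are not taken from the simulations.

```python
def total_messages(campaigns, avg_list_size):
    """Assumed accounting: each campaign sends one message per EMT on its exploit list."""
    return campaigns * avg_list_size


# Hypothetical high-quality lists: more campaigns, each with a shorter list.
high_quality = total_messages(campaigns=600, avg_list_size=100_000)

# Hypothetical low-quality lists: fewer campaigns, each with a much longer list.
low_quality = total_messages(campaigns=400, avg_list_size=300_000)

print(high_quality, low_quality)
```

In this illustration the number of campaigns falls by a third while the total number of messages doubles, because the growth in list size outweighs the drop in campaigns.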

It is clear from Figures 5 and 6 that changes in the Q1 and Q2 dimensions of information

quality have effects that matter. In general, the outcomes illustrated in Figure 5 suggest that

lower information quality for UCM operators would benefit Internet users, by reducing over time

the number of campaigns conducted. Figure 6, on the other hand, reveals a tradeoff when viewed

from a macro perspective: Lower information quality would cause even more Internet bandwidth

to be taken up by UCM traffic. This bears on one of the motivating questions posed in Section 1:

As e-mail filtering has improved, spammers have sent more spam. They do not give up, because

even though they must send many more messages in order to successfully penetrate a user’s at-

tention space, the message that does get through will then have less competition and will be more

likely to be noticed and consumed by the e-mail user. Based on our model, we would not expect

filtering alone to eliminate, or even control, UCM traffic. To craft a more effective response to

rising levels of UCM, we should consider real-world opportunities to manipulate the UCM op-

erators’ information-asset quality.

5. Implications

The findings of Propositions A and B suggest that information can be used as a lever to

shape the behavior of a broad range of UCM markets. Moreover, once the UCM industry is

viewed as a tragedy of the commons, it is evident that there are important – and otherwise over-

looked – information side effects arising from some UCM-related policies and practices that

have been adopted by both private and public entities. These amount to (unintended) manipula-

tions of the efficacy of UCM operators’ information assets. The current focus of UCM mitigation

efforts has been on e-mail spam, so we will seek insights from those policies that can be applied

to other, emerging forms of UCM.

First, we can apply the findings embodied in Proposition A to a policy adopted by private

entities: the way that ISPs handle e-mail communications using the Simple Mail Transfer Protocol

(SMTP) [31]. SMTP involves several handshaking steps. The sending computer first establishes

a connection to the receiving computer using port 25, then passes a series of messages.

One of the messages contains the e-mail address of the destination inbox. Normal practice is for

the receiving computer to drop the port 25 connection if the destination inbox address is invalid.

This is a signal to the sending computer that there is a problem with the address. When this hap-

pens, most sending computers are programmed to route a bounce notification back to the inbox

of the user who initiated the e-mail, so that the address error can be corrected.

ISPs follow this practice because it provides a convenience to the users of their e-mail

services, and also eliminates wasted steps in the SMTP protocol when an address is invalid.

However, there is an important information side effect, the significance of which becomes quan-

tifiable only when the UCM industry has been analyzed from a commons perspective: UCM op-

erators are acknowledged to exploit the usual handshaking sequence to improve the Q1 informa-

tion quality of their exploit lists. Widely distributed mass-verification software processes an ex-

ploit list by connecting to port 25 of each receiving computer and then following through with

the handshaking sequence to determine whether or not the e-mail address is valid.11 When an invalid

address is found, it is automatically purged from the exploit list. UCM operators routinely

use such software as “the first step any spammer takes before sending spam” [32, p. 91].

11 When the authors did a Google query using the phrase “email mass verification software,” the first results page

had links to descriptions of seven such programs for sale or download.

By Proposition A, we conclude that ISPs could significantly reduce the amount of spam

e-mail reaching the average inbox if they altered their SMTP software to eliminate the dropping

of connections in the case of invalid addresses. Instead, the handshaking sequence would pro-

ceed to its normal conclusion, and the erroneously addressed messages would be discarded rather

than stored in an inbox. This change would give rise to the curves in Figure 5 that correspond to

low Q1 information quality, meaning that units of UCM traffic would become an ever-

decreasing, rather than increasing, phenomenon from the standpoint of inbox owners. Such an

outcome would come at a cost. For one thing, erroneously addressed e-mail could no longer be

bounced back to its sender for correction. Further research would be needed to establish whether

the majority of e-mail users would prefer to give up address-error feedback in favor of reduced

spam, but it is probable that many would. Another cost element would arise from the macro-level

increase in the total amount of Internet bandwidth consumed by UCM traffic, as shown by Fig-

ure 6. Clearly, ISPs would bear at least a portion of this burden, in that their SMTP servers

would be called upon to handle more port 25 connections; and some of the protocol sequences

(those involving invalid addresses) would consume more time than previously. Offsetting this

cost would be the potential for long-term reductions in the number of UCM campaigns, and,

therefore, less ISP spending on spam countermeasures.
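The information side effect of the two server policies can be sketched with a toy model. This is not an SMTP implementation and opens no network connections: the set VALID_INBOXES, the function names and the addresses are hypothetical, and the receiving server is reduced to a single yes-or-no signal that the sender can observe.

```python
# Hypothetical set of inboxes that actually exist on the receiving server.
VALID_INBOXES = {"alice@example.com", "bob@example.com"}


def server_signals_acceptance(address, reveal_invalid):
    """Return what the sender observes for one delivery attempt.

    reveal_invalid=True models the current practice described above, where an
    invalid address is signaled to the sender. reveal_invalid=False models the
    proposed change, where the handshake always completes and mail for
    nonexistent inboxes is silently discarded.
    """
    if address in VALID_INBOXES:
        return True
    return not reveal_invalid


def verify_exploit_list(exploit_list, reveal_invalid):
    """What mass-verification software would retain after probing every address."""
    return [a for a in exploit_list if server_signals_acceptance(a, reveal_invalid)]


exploit_list = [
    "alice@example.com",
    "nobody@example.com",
    "bob@example.com",
    "ghost@example.com",
]
purged = verify_exploit_list(exploit_list, reveal_invalid=True)
unpurged = verify_exploit_list(exploit_list, reveal_invalid=False)
print(len(purged), len(unpurged))
```

Under the revealing policy the verifier is left with only the valid addresses, which corresponds to high Q1 quality. Under the silent-discard policy every probe looks the same, so the invalid entries stay on the list and the operator keeps paying to send to them.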

Although the cost and benefit tradeoff from the ISP’s standpoint is complicated, given the

levels of spending by ISPs on e-mail-spam mitigation measures12 – and proposed measures that

would have far more disruptive impacts on users – it is significant that the potential impact of

this straightforward-to-implement information-based step has not previously been described.

12 A study by IDC estimates worldwide spending on anti-spam packages will increase from $300 million in 2003 to

$1.7 billion in 2008 [33].

This is an example of an overlooked information side effect of a practice that was adopted for

unrelated reasons.

Other current policies and practices can be seen to impact the Q2 dimension of informa-

tion quality. As an example, consider the position taken by a public agency, the U.S. Federal

Trade Commission, regarding the establishment of a “do-not-spam” registry [18]. The FTC con-

cluded that such a registry “would fail to reduce the burden of spam and may even increase the

amount of spam received by consumers” (p. 1). The commission’s reasoning was that “the high

value of e-mail addresses would likely make the Registry the National Do Spam Registry” (p.

16). That is, UCM operators would be able to use the registry as a means of discovering valid e-

mail addresses and adding them to their exploit lists. The FTC concluded that such an increase in

the speed at which operators discover inbox addresses would mean inbox owners would receive

more spam than before.

The FTC’s report was prepared based on inputs solicited from a variety of sources, but

there is no mention of an information-economic analysis of the UCM industry. The commis-

sion’s findings might have been expanded and enriched had such an analysis been available, be-

cause it is evident that there would be information side-effects if a do-not-spam registry were

implemented and used as described. Once those side-effects are understood, it is also clear that

the relationship between speed of inbox-address discovery and the amount of spam in an inbox is

not as straightforward as the FTC reported.

By Proposition B, we conclude that the number of UCM campaigns would actually de-

cline if the registry were established and used as an inbox-discovery mechanism by UCM opera-

tors. That is, the rapid discovery by operators of the valid e-mail addresses (represented by parameter

setting $h = 1$ in our model) would erode the Q2 information quality of their exploit lists,

leading to reduced profits and industry exits (as shown in Figure 5). From the FTC’s point of

view this would be a positive, rather than negative, development, and Figure 5 shows that it

would be even more so if the ISPs simultaneously took steps to erode Q1 information quality by

changing their SMTP-related practices: The number of campaigns would then decline monotoni-

cally toward zero.

From the point of view of the individual inbox owner, the impact of the do-not-spam reg-

istry would be ambiguous. Owners of mature inboxes, which already receive a large volume of

UCM traffic, would see an improvement as the number of campaigns decreases; but owners of

newly created, registered inboxes would observe a rapid, rather than gradual, accumulation of

spam. The overall result would be to lower the maximum number of spam messages that any in-

box receives (whether it is registered or not), while reducing the variation in UCM volume be-

tween different registered inboxes. It is difficult to view this as an undesirable outcome, particu-

larly if participation in the registry were voluntary. For owners of mission-critical inboxes (i.e.,

inboxes that would not serve their purpose if kept private), listing the addresses on a do-not-spam

registry would be rational: The registration would marginally reduce the amount of spam re-

ceived while adding the possibility of enforcement and punishment as a deterrent to UCM opera-

tors.13

The above two examples relate to unintended and unrecognized information-quality side-

effects that result from private practices and public policies, specifically as they apply to e-mail

spam. The lessons from these examples suggest that

information-based approaches may be key to devising effective strategies for combating new

categories of UCM. Such policy innovations might be undertaken not as side-effects, but for the

express purpose of manipulating the Q1 or Q2 quality of exploit lists. For example, bogus, inva-

lid EMTs could be intermingled with valid EMTs in a public directory, eroding the Q1 informa-

tion quality of operators who use it to harvest information, and automated IM agents (“bots”)

13 It is also possible that the information side-effects discussed would interact beneficially with enforcement efforts,

since there would be fewer UCM campaigns and each would use a larger exploit list. Enforcement officers could

thus concentrate their efforts on a smaller number of targets, and each target would have a larger exposure on the

Internet. Future research by the authors may address this aspect of the UCM problem.

might be deployed to make it impossible for UCM operators to distinguish between valid and

invalid IM-related EMTs. Private ISPs could also offer financial incentives to their customers to

induce them to list their EMTs in the registry. The cost of such incentives would be offset by a

reduction in the numbers of campaigns, which would come about consequent to the erosion of

Q2 information quality. Our purpose is not to promote specific innovations in this paper, but to

establish that our information-economic model of the UCM industry adds an important new di-

mension for thinking about existing practices – and for identifying new tools to deal creatively

with the UCM problem in its emerging forms.

We recognize the simplifications and omissions inherent in our model (as is true of all

such models), and we do not contend that information-based tools, by themselves, would be sufficient

to eliminate UCM traffic entirely. At the same time, we emphasize what the model does

establish: that there are information side-effects to both private and public decisions; that the existence

and importance of these side effects often go unrecognized; and that successful public

and private decision making in the context of an information-based commons problem requires

analyzing the information-economics of the industry.

6. Conclusion

Our model of the UCM industry is based on a recognition that its economics can be characterized

as a tragedy of the commons. We then showed that – because the underlying common-

pool resource is accessed through an information asset – there are manipulations available that

impact industry outcomes. Such manipulations comprise a third category of remedies for com-

mons problems: information-based approaches. These are distinguished from previously dis-

cussed remedies for commons problems and, in particular, the UCM problem.

An information-based approach to resolving commons problems is neither strictly public nor strictly private. As our examples show, there are actions that can be taken by private entities (e.g., ISPs) that would reduce the number of UCM campaigns. There are also actions that can be taken by public regulatory officials (e.g., the FTC) that would reduce the number. But the model shows that the greatest impact would come from simultaneous public and private action; that is, the effects of the two information-based remedies are complementary.

Another important feature of the information-based approach is that the tools available can be subtle, and they can be unintentionally activated (perhaps in the wrong direction) as side effects of policies that are adopted for other reasons. It is likely, for example, that ISPs have not fully considered that their e-mail message-handling practices amount to a manipulation of the UCM operators’ information quality in a direction that will increase the number of UCM campaigns. Similarly, the FTC appears not to have fully evaluated the information side effects of policy decisions related to a do-not-spam registry.

As new forms of electronic communication have gained significant levels of diffusion (such as text messaging, mobile-phone messaging and dynamic wireless networking), they have also gained noticeable amounts of UCM traffic. Because our model depends only on weak assumptions, it provides insights that can be beneficial to understanding the information economics of these other forms of electronic markets for attention. Additionally, the case of UCM may be viewed as an extreme example of consumer-attention overload of the type that also applies to marketing campaigns using traditional media. Our model of UCM, therefore, may be of benefit to researchers who are interested in the marginal payoff from advertising that is delivered in competition with other claims on a consumer’s attention.

Appendix.

Lemma 1:

Let $q$ be the number of UCM messages sent to an EMT, and let $\phi(q)$ be the probability that the owner will consume any particular message. Then there exist positive constants $\bar{q}$ and $k$ such that

$$q \ge \bar{q} \;\Rightarrow\; \phi(q) \le \frac{k}{q}.$$

Proof: The total number of messages consumed by the user is $n(q) = q\,\phi(q)$. By Observation 5, there exists some positive quantity $\bar{q}$ of UCM messages that depletes the user’s time endowment, such that receiving an additional message will reduce the number that can be consumed; i.e.,

$$q \ge \bar{q} \;\Rightarrow\; \frac{\partial n}{\partial q} \le 0. \tag{1}$$

Taking the derivative of $n(q)$, we can rewrite equation (1) as

$$q \ge \bar{q} \;\Rightarrow\; q\,\phi'(q) \le -\phi(q). \tag{2}$$

Equation (2) contains an inequality based on the ordinary linear first-order differential equation $\phi'(q) = -\frac{1}{q}\,\phi(q)$, with solutions

$$\phi(q) = \frac{k}{q} \tag{3}$$

for arbitrary $k$. Applying the inequality from equation (2) to equation (3) gives the required result. $\blacksquare$
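The last step of the proof can also be read as a direct monotonicity argument. The following is a sketch only, writing $\phi$ for the consumption probability and $\bar{q}$ for the depletion threshold, and assuming that $n(q) = q\,\phi(q)$ is differentiable for $q \ge \bar{q}$; the choice $k := \bar{q}\,\phi(\bar{q})$ is an illustrative assumption rather than a constant named in the text.

```latex
% Sketch: by equation (1), n(q) = q\phi(q) is nonincreasing on [\bar{q}, \infty).
\begin{align*}
q \ge \bar{q}
  &\;\Rightarrow\; n(q) \le n(\bar{q})
     && \text{(since } \partial n / \partial q \le 0 \text{ on } [\bar{q}, q]\text{)} \\
  &\;\Rightarrow\; q\,\phi(q) \le \bar{q}\,\phi(\bar{q}) \\
  &\;\Rightarrow\; \phi(q) \le \frac{k}{q},
     \qquad k := \bar{q}\,\phi(\bar{q}).
\end{align*}
% If \phi(\bar{q}) = 0, then n(q) = 0 for all q \ge \bar{q}, and any k > 0 works.
```

Under this reading, $k$ is simply the number of messages the user consumes at the threshold, which bounds consumption at every higher message volume.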

Proof of Observation 7:

Let $q(a)$ be the number of UCM messages sent to EMTs of age $a$. By Observation 7, for all positive $a$ there exists a positive $\bar{q}_a$ such that

$$q(a) \ge \bar{q}_a \;\Rightarrow\; \frac{\partial n}{\partial q(a)} \le 0.$$

By Observation 4, there exists a positive $A$ such that if $a \ge A$ and $a' \ge A$, then $q(a) = q(a')$; therefore, $\bar{q}_a = \bar{q}_{a'}$. Let $\bar{q}$ be the maximum in the set $\{\bar{q}_0, \bar{q}_1, \ldots, \bar{q}_A\}$, so that for all $a$,

$$q(a) \ge \bar{q} \;\Rightarrow\; \frac{\partial n}{\partial q(a)} \le 0.$$

Then, by Table 1 Note 4b,

$$q(a) \ge \bar{q} \;\Rightarrow\; \frac{\partial n}{\partial m} \le 0,$$

and by Table 1 Notes 6a and 6c,

$$q(a) \ge \bar{q} \;\Rightarrow\; \frac{\partial r}{\partial m} \le 0. \qquad \blacksquare$$


Table 1. Parameterization of model setting

| Observation | Description and restrictions | Model expression | Notes |
|---|---|---|---|
| 1. Population of EMTs | There are a finite number of EMTs in existence that receive UCM. | $\Omega$ | 1a. The finite set of EMTs that can receive UCM is $\Omega$. 1b. There are $\Lambda$ valid EMTs. |
| 2. UCM campaigns | There are a finite number of UCM campaigns. | $m$ | 2. The total number of campaigns conducted by the UCM industry is $m$. |
| 3. Distribution of EMT ages | The probability that an EMT is of a given age is a function of variations in the historical rate of EMT creation and elimination. | $\gamma(a)$ | 3a. The probability that an EMT is of age $a$ is $\gamma(a)$. 3b. By Note 1b, the number of EMTs of age $a$ is, therefore, $g(a) = \gamma(a)\,\Lambda$. |
| 4. EMT discovery process | The probability of an operator knowing about an EMT is a function of EMT age. | $\delta(a)$ | 4a. The probability that an EMT of age $a$ has been discovered for use in a campaign is $\delta(a)$. 4b. By Note 2, the number of messages sent to an EMT of age $a$ is, therefore, $q(a) = m\,\delta(a)$. 4c. By Notes 3b and 4b, the total number of UCM messages sent industry-wide is $E = \sum_{a \ge 0} g(a)\,q(a) + V$, where $V$ is the number of invalid EMTs on exploit lists. |
| | The older the EMT, the greater the number of operators that have discovered it. | $\delta'(\cdot) \ge 0$ | |
| | All EMTs will eventually be discovered by every operator. | $\exists A > 0$ s.t. $a \ge A \Rightarrow \delta(a) = 1$ | |
| | Exploit lists may contain invalid EMTs. | $V$ | |
| 5. Internet user attention endowment | The probability that a user will consume a UCM message is a function of the number of messages sent to the EMT. | $\phi(q(a))$ | 5a. The probability that a message will be consumed if sent to an EMT receiving $q(a)$ messages is $\phi(q(a))$ (see Note 4b). 5b. The number of messages that will be consumed if an EMT receives $q(a)$ messages is, therefore, $n(a) = q(a)\,\phi(q(a))$. |
| | If more UCM messages are sent to an EMT, there is a weakly lower probability of any particular one being consumed. | $\phi'(\cdot) \le 0$ | |
| | The attention endowment of the EMT owner can be depleted. | $\exists\, \bar{q} > 0$ and $k > 0$ s.t. $q(a) \ge \bar{q} \Rightarrow \phi(q(a)) \le k / q(a)$ (see Lemma 1). | |
| 6. UCM industry revenue | UCM industry-wide aggregate revenue is a function of the number of UCM messages consumed. | $r(N)$ | 6a. Total revenue if $N$ UCM messages are consumed is $r(N)$. 6b. Combining Note 3b and Note 5b, the number of consumed messages for inboxes of age $a$ is $n(a) = g(a)\,q(a)\,\phi(q(a))$. 6c. Therefore, $N = \sum_{a \ge 0} n(a)$. 6d. By substitution of Notes 3b, 4b, and 6b into Note 6c, $N = \Lambda \sum_{a \ge 0} \gamma(a)\, m\,\delta(a)\, \phi(m\,\delta(a))$. |
| | If more UCM messages are consumed, the industry’s revenue will increase. | $r'(\cdot) > 0$ | |
| 8. UCM industry cost | Industry-wide aggregate cost is a function of the number of campaigns and the number of messages sent. | $c(m, E)$ | 8. The total cost to the industry of executing $m$ campaigns involving $E$ messages is $c(m, E)$. |
| | The cost increases as more campaigns are executed or more messages sent. | $\partial c / \partial m > 0$, $\partial c / \partial E > 0$ | |
| | There is a maximum number of campaigns that can be conducted for any given industry-wide aggregate cost. | $\forall x, y > 0$, $\exists\, m_x$ s.t. $c(m_x, y) \ge x$ | |
| 10. EMT invalidation and replacement | The rate of EMT invalidation and replacement is a function of the number of campaigns. | $z(m)$ | 10. The probability that the owner of a fully discovered inbox will invalidate and replace it is $z(m)$. |
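As a rough illustration of how Notes 4b, 4c, 6b, and 6c of Table 1 combine, the following Python sketch computes the total number of messages sent, E, and consumed, N, for a given number of campaigns. The function names and the two functional forms `delta_assumed` (discovery probability) and `phi_assumed` (consumption probability) are hypothetical choices made for this example only; they are not presented as the instantiations used for the paper's figures.

```python
from typing import Callable, Sequence, Tuple


def aggregate_messages(
    m: float,
    g: Sequence[float],
    delta: Callable[[int], float],
    phi: Callable[[float], float],
    invalid: float = 0.0,
) -> Tuple[float, float]:
    """Return (E, N) for m campaigns.

    g[a] is the number of valid EMTs of age a; `invalid` plays the role of V.
    """
    sent = invalid
    consumed = 0.0
    for age, count in enumerate(g):
        q = m * delta(age)               # Note 4b: messages sent to one EMT of this age
        sent += count * q                # Note 4c: accumulate g(a) * q(a)
        consumed += count * q * phi(q)   # Note 6b: consumed messages for this age
    return sent, consumed


# ASSUMPTION: illustrative functional forms, not taken from the paper.
def delta_assumed(age: int) -> float:
    """Discovery probability rising with age and capped at 1."""
    return min((age + 1) / 4.0, 1.0)


def phi_assumed(q: float) -> float:
    """Consumption probability falling linearly in q, floored at 0."""
    return max(0.0, 1.0 - q / 500.0)


if __name__ == "__main__":
    E, N = aggregate_messages(
        100.0, [30000.0, 0.0, 0.0, 0.0], delta_assumed, phi_assumed
    )
    print(f"sent={E:.0f} consumed={N:.0f}")
```

Raising `m` far enough drives `phi_assumed` to zero for every age group, so N falls even as E keeps growing; this is the attention-depletion effect described in Observation 5.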

Table 2. Model constants

| Construct | Model expression | Notes |
|---|---|---|
| First-time users of Internet communications facility | $U$ | $U$ is a constant representing the number of new, first-time users who obtain an EMT each time period. |
| Speed at which EMTs mature | $h$ | $h$ is a constant representing the rate that EMTs of age $a$ mature to age $a + 1$ each time period. |
| Exploit-list cleanup capability | $s$ | If $s = 0$, UCM operators remove invalidated EMTs from their exploit lists. If $s = 1$, invalidated EMTs remain on the exploit lists. |

Table 3. Difference equations for dynamic constructs

| Dynamic construct | Model expression | Notes |
|---|---|---|
| Distribution of EMT ages: $\gamma(a)$. Initial values: $g_0(0) = 30{,}000$, $g_0(1) = 0$, $g_0(2) = 0$, $g_0(3) = 0$ | $g_{t+1}(0) = g_t(0) + U + z(M_t)\,g_t(3) - h\,g_t(0)$; $g_{t+1}(1) = g_t(1) + h\,g_t(0) - h\,g_t(1)$; $g_{t+1}(2) = g_t(2) + h\,g_t(1) - h\,g_t(2)$; $g_{t+1}(3) = g_t(3) + h\,g_t(2) - z(M_t)\,g_t(3)$ | By Table 1 Note 3b, $\gamma(a) = g(a)/\Lambda$ and $\sum_{a=0}^{A} g(a) = \Lambda$; therefore, the discrete function $g(a)$ fully determines $\gamma(a)$, where $a \in \{0, 1, 2, 3\}$. By Table 1 Note 10, $z(M_t)\,g_t(3)$ EMTs are invalidated and replaced at time $t + 1$. |
| Invalid EMTs: $V$. Initial value: $V_0 = 0$ | $V_{t+1} = V_t + s\,z(M_t)\,g_t(3)$ | By Table 1 Note 10, $z(M)\,g(3)$ inboxes are invalidated each time period. If $s = 1$, the invalidated EMTs remain on exploit lists. |
| Equilibrium number of campaigns: $M$. Initial value: $M_0 = 100$ | $M_{t+1} = M_t + \text{entries} - \text{exits}$, where $\text{entries} = \begin{cases} \dfrac{\Pi_t}{1000\,M_t}, & \Pi_t > 0 \\ 0, & \text{otherwise} \end{cases}$, $\text{exits} = \begin{cases} \min\!\left(\dfrac{-\Pi_t}{1000\,M_t},\, M_t\right), & \Pi_t < 0 \\ 0, & \text{otherwise} \end{cases}$, and $\Pi_t = r(N) - c(M, E)$ | By Observation 9, per-campaign profits lead to more campaigns, and per-campaign losses lead to fewer campaigns. WLOG, the constant 1000 is an arbitrary scale factor to reduce volatility in the simulation (because no currency units are specified). The min function is used to ensure a non-negative number of campaigns. |
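The difference equations of Table 3 can be iterated directly. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the revenue, cost, and invalidation expressions follow one setting as read from Table 4 (r(N) = 0.01N, c(M, E) = 10M + .0001E, z(M) = min(.00125M, 1)), while `delta_assumed` and `phi_assumed` are hypothetical forms chosen only so the loop can run. The guard against a zero campaign count is also an addition for this sketch.

```python
from typing import List, Tuple

AGES = 4  # EMT ages a in {0, 1, 2, 3}


def delta_assumed(age: int) -> float:
    # ASSUMPTION: discovery probability; hypothetical form for illustration.
    return min((age + 1) / 4.0, 1.0)


def phi_assumed(q: float) -> float:
    # ASSUMPTION: consumption probability; hypothetical form for illustration.
    return max(0.0, 1.0 - q / 500.0)


def simulate(
    periods: int = 200, U: float = 10.0, h: float = 0.1, s: float = 0.0
) -> List[Tuple[float, List[float], float]]:
    """Iterate the difference equations; return (M, g, V) after each period."""
    g = [30000.0, 0.0, 0.0, 0.0]   # initial g_0(a)
    V = 0.0                        # initial V_0
    M = 100.0                      # initial M_0
    path = []
    for _ in range(periods):
        z = min(0.00125 * M, 1.0)                        # z(M)
        q = [M * delta_assumed(a) for a in range(AGES)]  # Table 1 Note 4b
        E = sum(g[a] * q[a] for a in range(AGES)) + V    # Table 1 Note 4c
        N = sum(g[a] * q[a] * phi_assumed(q[a]) for a in range(AGES))  # Notes 6b-6c
        profit = 0.01 * N - (10.0 * M + 0.0001 * E)      # r(N) - c(M, E)
        entries = exits = 0.0
        if M > 0.0:  # guard added for this sketch to avoid dividing by zero
            if profit > 0.0:
                entries = profit / (1000.0 * M)
            elif profit < 0.0:
                exits = min(-profit / (1000.0 * M), M)
        invalidated = z * g[3]
        # The right-hand sides below all use the period-t values of g.
        g = [
            g[0] + U + invalidated - h * g[0],
            g[1] + h * g[0] - h * g[1],
            g[2] + h * g[1] - h * g[2],
            g[3] + h * g[2] - invalidated,
        ]
        V = V + s * invalidated
        M = M + entries - exits
        path.append((M, list(g), V))
    return path


if __name__ == "__main__":
    final_M, final_g, final_V = simulate(periods=200)[-1]
    print(f"M={final_M:.2f} EMTs={sum(final_g):.0f} V={final_V:.0f}")
```

One property worth checking in any such implementation is conservation: the four age equations only move EMTs between age groups, so the valid population grows by exactly U per period regardless of the functional forms chosen.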

Table 4. Instantiations of static constructs

Static construct Model expression Where used

� �1

4a

a��

� (linear form) Figure 2 Figure 3b

Probability of EMT discovery (Table 1 Note 6)

� �1.5

14

aa�� �� ��� �� ���� �

(nonlinear form) Figure 3a Figures 4-6

� � 1500

qq� � � (linear form)

Figure 2 Figure 3b

Probability of consuming mes-sage (Table 1 Note 9) � �

� �5 11

400

4001

400q

qq

q e

�� ��� � �� ����� �

� �

(nonlinear form) Figure 3a Figures 4-6

| Static construct | Model expression | Where used |
|---|---|---|
| UCM industry revenue (Table 1 Note 11) | $r(N) = 0.01N$ | Figure 2; Figure 3a, 3b |
| | $r(N) = 0.16N$ | Figure 3a, 3b |
| | $r(N) = 0.08N$ | Figure 3a, 3b; Figures 4-6 |
| UCM industry cost (Table 1 Note 15) | $c(M, E) = 5M + .0001E$ | Figure 3b |
| | $c(M, E) = 10M + .001E$ | Figure 3b |
| | $c(M, E) = 10M + .0001E$ | Figure 2; Figure 3b |
| | $c(M, E) = 10M + .00001E$ | Figure 3b |
| | $c(M, E) = 20M + .0001E$ | Figure 3b |
| | $c(M, E) = 2250 \ln M + .0001E$ | Figure 3a |
| | $c(M, E) = 4500 \ln M + .001E$ | Figure 3a |
| | $c(M, E) = 4500 \ln M + .0001E$ | Figures 4-6 |
| | $c(M, E) = 4500 \ln M + .00001E$ | Figure 3a |
| | $c(M, E) = 9000 \ln M + .0001E$ | Figure 3a |
| Rate of EMT invalidation (Table 1 Note 16) | $z(M) = \min(.000125M, 1)$ | Figure 3a, 3b |
| | $z(M) = \min(.00125M, 1)$ | Figure 2; Figures 4-6 |
| | $z(M) = \min(.0025M, 1)$ | Figure 3a, 3b |

[Figure 1]

[Figure 2. Parameters: U = 10, h = .1, s = 0]

[Figure 3. Parameters: U = 10, h = .1, s = 0]

[Figure 4. Parameters: h = .1, s = 0; curves for U = 10, U = 1000, and U = 5000]

[Figure 5. Parameters: U = 10; curves for h = .1 and h = 1, with s = 0 and s = 1]

[Figure 6. Parameters: U = 10; curves for h = .1 and h = 1, with s = 0 and s = 1]