An Integrated Trial/Repeat Model for New Product Sales

Peter S. Fader
Bruce G. S. Hardie
Chun-Yao Huang¹

August 2001

facultyresearch.london.edu/docs/01-1001.pdf

¹ Peter S. Fader is Associate Professor of Marketing at the Wharton School, University of Pennsylvania (email: [email protected]; web: www.petefader.com). Bruce G. S. Hardie is Assistant Professor of Marketing, London Business School (email: [email protected]; web: www.brucehardie.com). Chun-Yao Huang is a PhD candidate at London Business School (email: [email protected]). The second author acknowledges the support of the LBS Centre for Marketing.


Abstract

Traditional test-market-based new product forecasting models for consumer packaged goods usually suffer from at least one of three deficiencies: (i) any possible connection between trial and subsequent repeat purchases is generally ignored at the household level (which leads to incorrect parameter estimates and inferences), (ii) consumer preferences for the new product are assumed to be stable over time, and (iii) the effects of marketing activities (e.g., advertising and promotion) are disregarded. We present a parsimonious stochastic model of new product purchasing that addresses all of these issues.

Our primary objective is to be able to provide an accurate forecast of overall new product sales. By creating a tight linkage between the trial and repeat purchase processes, we can leverage the limited amount of observed repeat data that are available in the initial weeks after launch. The integrated model allows managers to carefully diagnose the sub-components of new product sales (such as percent of triers repeating by time t, repeats per repeater, and so on), without requiring separate models for each one.

We formally combine the trial and repeat processes (and accommodate changing consumer preferences over time) by introducing a probabilistic “renewal” process that varies with depth-of-repeat. Specifically, as customers gain more experience with the product we would expect their preferences (and therefore their underlying buying rates) to settle down. As such, a desirable feature of the model is that it can evolve to a stationary repeat buying process as the product moves from being “new” to “established.” These renewal events also allow for the possibility of consumer dropout, thereby letting us capture different attrition patterns for the new product.

We examine two different distributions for interpurchase times: the simple exponential model and the Erlang-2, which allows for more regularity in the time between repeat purchases. We introduce marketing mix effects via a proportional hazards framework at the individual level. Furthermore, beyond a conventional model with constant covariate effects over time, we also develop a specification that lets the coefficients vary with consumer experience (i.e., depth-of-repeat).

Overall, this flexible set of model components gives us a general framework to capture and understand the wide variety of possible purchase patterns that can occur for a new product soon after its launch. This framework includes (and generalizes) many of the models considered by Gupta (1991) as well as other previous models that have addressed some of the specific issues described above. We conduct a detailed empirical analysis using data from IRI’s BehaviorScan service, and show how the model can be used to examine the differential impact on trial and repeat sales that emerges when we remove (or add) a particular promotional event.


1 Introduction

Since the early work of Fourt and Woodlock (1960), a number of marketing scientists have devoted their attention to the development of models designed to generate forecasts of a new product’s sales performance. During the 1960s and 1970s, the primary focus was on the development of test-market models, in which medium-term sales forecasts were projected from consumer panel data collected during the first few months of the test market. Attention then shifted to the development of simulated (also known as pre-test) market models that generate sales forecasts using data from survey research conducted prior to the introduction of the new product.

Given the popularity of simulated test-market services, such as those offered by A.C. Nielsen’s BASES subsidiary, we may be tempted to assume that the test market is a thing of the past. This is far from the truth, as many firms will not commit to the final launch decision purely on the basis of data collected in a simulated test market. Test-market data provide the “hard” numbers about sales patterns as well as promotional response indicators that are desired (or even required) by many brand managers. Furthermore, electronic test market environments such as Scannel (operated by Taylor Nelson Sofres Secodip in France) and BehaviorScan (operated by Information Resources, Inc. (IRI) in the US) provide a level of detail (and managerial control) that could only have been dreamed of back in the 1960s. Coupled with this desire for actual sales numbers is the desire to get sales estimates “as soon as possible” (e.g., Advertising Age 2000). Thus the need for test-market models is still very strong among consumer packaged goods (CPG) manufacturers.

The (test-market) new product sales forecasting models used by the various market research firms are typically minor modifications of those developed 30–40 years ago. Reflecting on the set of models developed during this era, we can immediately identify two shortcomings.

The first shortcoming concerns the fact that the majority of these models do not explicitly incorporate the effects of marketing decision variables. This should come as no surprise since these models were developed at a time when consumer panel data were collected using self-completed paper diaries; data on in-store merchandising activities were non-existent, unless collected via a custom audit. The emergence of the UPC and laser scanners makes such data readily available today, but the models used in practice have not kept pace with these technological improvements.

The second shortcoming is a little more subtle. When modeling new product sales, it is standard practice to separate total sales into trial (i.e., first purchase) and repeat (i.e., subsequent purchases) components. In order to understand the development of repeat sales, it is common to decompose repeat sales into its first repeat, second repeat, third repeat (and so on) components. Within the literature on test-market forecasting models, there is a long tradition of building so-called “depth-of-repeat” models, which combine the output from each of these sub-models to arrive at an overall sales forecast for the new product; see, for example, Eskin (1973), Fourt and Woodlock (1960), Kalwani and Silk (1980), and Massy (1969). As we transition from repeat level $j$ to level $j+1$, the only piece of information we effectively retain about each person is that they made a $j$th repeat purchase. Thus the probability that an individual will make a 3rd repeat purchase 2 weeks after his 2nd repeat purchase is generally assumed to be exactly the same regardless of whether he made his second repeat purchase in week 3 or week 50. Furthermore, information on the timing of this individual’s trial and first repeat purchases would be completely ignored. Such an approach has typically been used to capture the nonstationarity in purchasing rates that we observe during the early phase of a new product’s life. A key problem with this depth-of-repeat approach is that it will result in misleading inferences about buyer behavior, since the model formulation fails to recognize the dependence across multiple purchases within each individual (Gupta and Morrison 1991). For example, Fader and Hardie (1999) show that the parameters of Eskin-type models of repeat sales will imply the existence of nonstationarity in repeat-buying behavior even when the model is applied to data from a purely stationary (simulated) market!

With these two shortcomings in mind, the objective of this paper is to present a stochastic model for the sales of a new CPG product that integrates all of an individual’s purchases of the new product (as opposed to developing separate models for trial, first repeat, etc.) and simultaneously captures the effects of marketing activities and nonstationarity in initial repeat buying behavior at the individual consumer level. The paper proceeds as follows. In the next section we develop our model for the sales of a new CPG product, and we present two extensions to the basic model. This is followed by an empirical analysis in which we examine the fit and forecasting performance of the proposed model, and consider its use in the evaluation of alternative launch scenarios. We conclude with a discussion of several issues that arise from this work and identify several areas worthy of follow-on research.

2 Model Development

Our objective is to develop a model of new product sales that incorporates the effects of marketing mix variables and nonstationarity in buying rates at the individual customer level. The primary motivation for nonstationarity is the notion that customers’ preferences for the new product are evolving; as customers gain more experience with the product we would expect their preferences (and therefore their underlying buying rates) to “settle down.” As such, a desired feature of the model is that it can capture the “evolution” towards a stationary repeat-buying process as the product moves from being “new” to “established.”

Nonstationarity is modeled using a multiple-changepoint process for the customer-level buying process. At each changepoint, there is a renewal of (or change in) the underlying buying rate. A renewal is interpreted as a revision of preferences, which may be due, perhaps, to experience with the product or some other unobservable phenomenon. A renewal can occur after any purchase of the new product, but the probability of occurrence decreases as the customer gains more experience with the new product (i.e., moves through higher depth-of-repeat levels).

Our model for the evolution of new product purchasing is based on the following five assumptions:

i. The probability of an individual ever trying the new product is $\pi_0$.

ii. Let the random variable $T_j$ denote the time (since the launch of the new product) at which a customer makes its $j$th repeat purchase ($j = 0, 1, \ldots, J$). By convention, $j = 0$ corresponds to the trial purchase. The hazard rate function of the with-covariate interpurchase time distribution is of the form

$$h(t \mid t_j) = \lambda e^{\mathbf{x}(t)'\boldsymbol{\beta}} \equiv \lambda A(t)$$

where $\mathbf{x}(t)$ is the vector of marketing covariates at time $t$ and $\boldsymbol{\beta}$ the effects of these covariates. (This corresponds with the assumption of an exponential baseline distribution with covariate effects incorporated using the proportional hazards framework.) Assuming the time-varying covariates remain constant within each unit of time (e.g., week), the survivor function of the with-covariate interpurchase time distribution is

$$S(t \mid t_j; \lambda) = \exp[-\lambda B(t, t_j)] \tag{1}$$

where $B(t_b, t_a) = B(t_b) - B(t_a)$ with

$$B(t) = \delta_{t \ge 1} \sum_{i=1}^{\mathrm{Int}(t)} A(i) + [t - \mathrm{Int}(t)]\, A(\mathrm{Int}(t) + 1)$$

It follows that the pdf of the with-covariate interpurchase time distribution is

$$f(t \mid t_j; \lambda) = \lambda A(\tau) \exp[-\lambda B(t, t_j)] \tag{2}$$

where $\tau = t$ if $t$ is integer and $\mathrm{Int}(t) + 1$ otherwise.

iii. Individual purchase rates, $\lambda$, are distributed across the population according to a gamma distribution with shape parameter $r$ and scale parameter $\alpha$; that is

$$g(\lambda) = \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}$$

iv. Following his $j$th purchase, a customer renews his value of $\lambda$ with probability $\gamma_j$. The depth-of-repeat specific renewal probability is of the form

$$\gamma_j = \begin{cases} \eta & j = 0 \\ 1 - \psi(1 - e^{-\theta j}) & j = 1, 2, \ldots \end{cases} \tag{3}$$

where $\eta, \psi \in [0, 1]$ and $\theta \ge 0$.

v. Upon the occurrence of a renewal, a customer receives a value of $\lambda = 0$ with probability $\phi$. (This is equivalent to a complete rejection of the new product.) With probability $1 - \phi$, the customer draws a new value of $\lambda$, independent of his previous one, from the same gamma distribution of purchase rates described above.
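The timing quantities in assumption (ii) can be sketched in a few lines of Python. This is only an illustrative reading of equations (1) and (2), not the authors' implementation (the paper reports using MATLAB). The helper names `make_A`, `B`, `tau`, `survivor` and `pdf`, and the layout of `x_by_week` as one covariate list per week, are assumptions made for this example.

```python
import math

def make_A(x_by_week, beta):
    """Return A(i) = exp(x(i)' beta) for integer weeks i = 1, 2, ...
    x_by_week[i - 1] holds the (hypothetical) covariate vector for week i."""
    def A(i):
        x = x_by_week[i - 1]
        return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))
    return A

def B(t, A):
    """Cumulative covariate effect B(t): full weeks plus the fractional week."""
    whole = int(math.floor(t))
    total = sum(A(i) for i in range(1, whole + 1))  # empty sum when t < 1
    frac = t - whole
    if frac > 0:
        total += frac * A(whole + 1)
    return total

def tau(t):
    """Week in which time t falls: t itself if integer, else Int(t) + 1."""
    whole = int(math.floor(t))
    return whole if t == whole else whole + 1

def survivor(t, tj, lam, A):
    """Equation (1): S(t | tj; lambda) = exp[-lambda * B(t, tj)]."""
    return math.exp(-lam * (B(t, A) - B(tj, A)))

def pdf(t, tj, lam, A):
    """Equation (2): f(t | tj; lambda) = lambda * A(tau) * exp[-lambda * B(t, tj)]."""
    return lam * A(tau(t)) * survivor(t, tj, lam, A)

# Example: one covariate (say, a promotion dummy) that is on in week 2 only.
A = make_A([[0.0], [1.0], [0.0]], beta=[math.log(2.0)])
print(B(2.5, A), survivor(2.5, 0.5, 1.0, A))
```

With no covariates (every `x` equal to zero) `A(i)` is 1 and `B(t)` reduces to `t`, which recovers the plain exponential timing model.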

The first assumption follows naturally from the established literature on the modeling of first-purchase of a new product. Since the early work of Fourt and Woodlock (1960), modelers have assumed that there is an upper limit on the market penetration level for a new product. The numerical value of this penetration limit can be interpreted as the probability that a randomly chosen individual will eventually try the new product, which we denote by $\pi_0$.

The incorporation of covariate effects in interpurchase time distributions using the proportional hazard specification with a parametric baseline hazard function is well established within the marketing literature. The choice of the exponential for the baseline distribution (assumption ii) represents the simplest case and is consistent with the assumption of Poisson counts that underlies much of the stochastic modeling work within the marketing literature. Similarly, assumption (iii) follows the long tradition of using the gamma distribution to capture heterogeneity in purchase rates (e.g., Morrison and Schmittlein 1988). Note that these two assumptions give us the “exp/gamma, covariates” model examined by Gupta (1991). We will also examine an alternative timing process, the Erlang-2, later in the paper.

The final two assumptions provide a paramorphic, as opposed to strictly behavioral, representation of when (assumption iv) and how (assumption v) preferences for the new product evolve. The logic behind equation (3), the probability that a renewal occurs at depth-of-repeat level $j$, is as follows: we would expect that the probability of a consumer revising his preferences following a purchase would decrease as he gains more experience with the new product (i.e., moves to a higher depth-of-repeat level). Looking closely at equation (3), we note that as $j$ increases, $\gamma_j$ tends to $1 - \psi$. Therefore, if $\psi = 1$, the probability of a renewal tends to zero as a consumer moves to higher depth-of-repeat levels; in other words, the model evolves to a stationary process which would be consistent with the stabilization of consumer preferences. On the other hand, if $\psi < 1$, $\gamma_j > 0 \;\forall j \ge 1$, which means that individual consumer preferences will not stabilize; in other words, there is long-term nonstationarity in the marketplace. (If $\theta \to \infty$, then $\gamma_j$ is independent of $j$ and equals $1 - \psi$ for all $j \ge 1$.) The relationship between $\gamma_j$ and $j$ is illustrated in Figure 1 for three sets of values of $\psi$ and $\theta$.

[Figure 1: Probability of Renewal by Depth-of-Repeat. Horizontal axis: depth-of-repeat level ($j$), 1 to 14. Vertical axis: probability of renewal, 0.0 to 0.8. Three curves: $\psi = 1.0$, $\theta = 0.4$; $\psi = 0.9$, $\theta = 0.7$; $\psi = 0.8$, $\theta \to \infty$.]
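The curves described in Figure 1 can be regenerated directly from equation (3). The sketch below is an illustration only; the function name `renewal_prob` is an assumption, and $\eta$ is given an arbitrary value because it only affects $j = 0$, which the figure does not show.

```python
import math

def renewal_prob(j, eta, psi, theta):
    """Equation (3): gamma_0 = eta; gamma_j = 1 - psi * (1 - exp(-theta * j)) for j >= 1."""
    if j == 0:
        return eta
    # math.exp(-inf) evaluates to 0.0, so theta = float("inf") gives 1 - psi.
    return 1.0 - psi * (1.0 - math.exp(-theta * j))

# The three (psi, theta) pairs named in Figure 1; eta = 0.5 is an arbitrary placeholder.
settings = [(1.0, 0.4), (0.9, 0.7), (0.8, float("inf"))]
for psi, theta in settings:
    curve = [round(renewal_prob(j, 0.5, psi, theta), 3) for j in range(1, 15)]
    print(f"psi={psi}, theta={theta}: {curve}")
```

For the first pair the values decay towards zero, for the second they level off near $1 - \psi = 0.1$, and for the third they are flat at $0.2$, which matches the three regimes discussed in the text.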

Reflecting on assumption (v), the “spike at zero” (receiving a value of $\lambda = 0$ with probability $\phi$) is simply a mechanism by which customers can “drop out” of the market for the new product even after making several purchases of it; drawing a value of zero upon a renewal is viewed as being equivalent to rejecting the new product from future purchase consideration. It follows that the proportion of triers who do not make a repeat purchase is given by $\eta\phi$. For $j > 1$, the proportion of consumers making a $(j-1)$th repeat purchase who will ultimately make a $j$th repeat purchase is given by $1 - \gamma_{j-1}\phi$. For finite $\theta$ and $\psi = 1$, this proportion increases with $j$ to a limit of 1.0, which is consistent with the observations of Eskin (1973) and Kalwani and Silk (1980) concerning the nature of depth-of-repeat curves. Given the central role played by $\gamma_j$ in determining how many people ultimately make a $(j+1)$th repeat purchase, the case of $j = 0$ is treated separately from $j \ge 1$ as it has been observed that the proportion of triers who eventually repeat does not reflect how subsequent repeat purchasing evolves (Eskin 1973).
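As a worked check of the limit stated above, equation (3) can be substituted into the expression $1 - \gamma_{j-1}\phi$. The lines below are a derivation from the definitions already given, not an additional result from the paper.

```latex
\begin{align*}
1 - \gamma_{j-1}\phi
  &= 1 - \phi\left[1 - \psi\left(1 - e^{-\theta (j-1)}\right)\right], \qquad j \ge 2, \\
\lim_{j \to \infty} \left(1 - \gamma_{j-1}\phi\right)
  &= 1 - \phi\,(1 - \psi) \qquad (0 < \theta < \infty), \\
  &= 1 \qquad \text{when } \psi = 1 .
\end{align*}
```

Because $e^{-\theta(j-1)}$ falls as $j$ rises, the proportion is increasing in $j$ whenever $\phi > 0$ and $\psi > 0$, and it reaches 1 in the limit only in the $\psi = 1$ case.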


With probability $1 - \phi$, the consumer draws a new value of $\lambda$ from the underlying gamma distribution; this allows for changes in the consumer’s latent buying rate, which we interpret as a change in his preference for the new product. This principle of independent renewals from a given mixing distribution was first raised in Howard’s “Dynamic Inference” model (Howard 1965). Similar types of renewal processes have been used by Sabavala and Morrison (1981) in their model of media exposure and Fader and Lattin (1993) in their measure of loyalty for scanner data-based choice models. However, these earlier models all utilized fixed (time-invariant) renewal processes, as opposed to the evolutionary process introduced here. (As noted above, equation (3) admits a fixed (time-invariant) renewal process as a special case, i.e., $\theta \to \infty$, $\psi < 1$.) In contrast to a standard changepoint process application (e.g., Henderson and Matthews 1993; Pievatolo and Rotondi 2000; Raftery and Akman 1986), our interest is not in the explicit identification of the changepoint(s) in an observed sequence of variables. Rather we are simply using the changepoint framework to accommodate shifts in the underlying stochastic process which can then be used to forecast future outcomes of the process. Furthermore, for each individual, we restrict the set of possible changepoints to times at which they purchase the new product.

To illustrate and convey the intuition of the proposed model, let us consider the following scenario of a customer who makes three purchases of the new product in the first six weeks of it being on the market: trial at $t_0$, first repeat at $t_1$ and second repeat at $t_2$. Let us assume that if a renewal occurs (i.e., preferences are revised), it is immediately after a purchase. One behavioral “story” consistent with this is to assume that consumption immediately follows purchase, and that preference revisions would immediately follow consumption.

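[Timeline: a time axis from 0 to 6 divided into weeks 1 to 6, with the three purchases (×) marked at $t_0$, $t_1$ and $t_2$, falling in weeks 2, 4 and 6 respectively.]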

Given $t_0, t_1, t_2$, we do not know whether the consumer ever revised his preferences and, if he did, how many times and at which points in time. Let us first assume that the consumer never revised his preferences in $(0, 6]$. By assumptions (i) and (ii), the conditional likelihood function for this consumer is the probability that he eventually tries the new product ($\pi_0$), multiplied by the product of the density and survival functions, that is,

$$\begin{aligned}
L(\pi_0, \lambda, \boldsymbol{\beta}; \text{data}) &= \pi_0\, f(t_0 \mid 0; \lambda)\, f(t_1 \mid t_0; \lambda)\, f(t_2 \mid t_1; \lambda)\, S(6 \mid t_2; \lambda) \\
&= \pi_0 \lambda^3 A(2) A(4) A(6) \exp[-\lambda B(6, 0)]
\end{aligned}$$

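Following the third assumption, the unconditional likelihood function is:

$$\begin{aligned}
L(\pi_0, r, \alpha, \boldsymbol{\beta}; \text{data}) &= \int_0^\infty L(\pi_0, \lambda, \boldsymbol{\beta}; \text{data})\, \frac{\alpha^r \lambda^{r-1} e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda \\
&= \pi_0\, \frac{\Gamma(r+3) A(2) A(4) A(6)}{\Gamma(r)} \left(\frac{1}{\alpha + B(6, 0)}\right)^{3} \left(\frac{\alpha}{\alpha + B(6, 0)}\right)^{r}
\end{aligned} \tag{4}$$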

This same likelihood function would emerge if we were to apply the “exp/gamma, covariates” model from Gupta (1991) to this hypothetical purchase history.
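The step from the conditional likelihood to equation (4) rests on a standard gamma integral. The derivation below spells it out for a generic number of purchases $k$ and a generic integrated covariate term $B$; the symbols $k$ and $B$ are shorthand introduced here for illustration.

```latex
\begin{align*}
\int_0^\infty \lambda^{k} e^{-\lambda B}\,
      \frac{\alpha^{r}\lambda^{r-1}e^{-\alpha\lambda}}{\Gamma(r)}\, d\lambda
  &= \frac{\alpha^{r}}{\Gamma(r)} \int_0^\infty \lambda^{r+k-1} e^{-(\alpha + B)\lambda}\, d\lambda \\
  &= \frac{\alpha^{r}}{\Gamma(r)} \cdot \frac{\Gamma(r+k)}{(\alpha + B)^{r+k}} \\
  &= \frac{\Gamma(r+k)}{\Gamma(r)}
     \left(\frac{1}{\alpha + B}\right)^{k}
     \left(\frac{\alpha}{\alpha + B}\right)^{r}.
\end{align*}
```

Setting $k = 3$ and $B = B(6, 0)$, and multiplying by $\pi_0 A(2)A(4)A(6)$, gives equation (4). Setting $k = 0$ gives the gamma-mixed survivor function $\left(\alpha / (\alpha + B)\right)^{r}$.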

Alternatively, suppose that the consumer revised his preferences following his second (i.e., first repeat) purchase. Let the purchasing rate $\lambda_a$ reflect the consumer’s initial preference for the new product, and $\lambda_b$ reflect the consumer’s revised preference following his first repeat purchase. The conditional likelihood function for this consumer is therefore:

$$\begin{aligned}
L(\pi_0, \lambda_a, \lambda_b, \boldsymbol{\beta}; \text{data}) &= \pi_0\, f(t_0 \mid 0; \lambda_a)\, f(t_1 \mid t_0; \lambda_a)\, f(t_2 \mid t_1; \lambda_b)\, S(6 \mid t_2; \lambda_b) \\
&= \pi_0 \lambda_a^{2} A(2) A(4) \exp[-\lambda_a B(t_1, 0)]\; \lambda_b A(6) \exp[-\lambda_b B(6, t_1)]
\end{aligned}$$

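Following assumption (v), we note that the renewal resulted in a new value of $\lambda$ being drawn from the same underlying gamma distribution, an event which occurs with probability $1 - \phi$. The unconditional likelihood function is therefore:

$$\begin{aligned}
L(\pi_0, r, \alpha, \phi, \boldsymbol{\beta}; \text{data}) &= \int_0^\infty \!\! \int_0^\infty L(\pi_0, \lambda_a, \lambda_b, \boldsymbol{\beta}; \text{data})\, \frac{\alpha^r \lambda_a^{r-1} e^{-\alpha\lambda_a}}{\Gamma(r)}\, (1 - \phi)\, \frac{\alpha^r \lambda_b^{r-1} e^{-\alpha\lambda_b}}{\Gamma(r)}\, d\lambda_a\, d\lambda_b \\
&= \pi_0 (1 - \phi)\, \frac{\Gamma(r+2) A(2) A(4)}{\Gamma(r)} \left(\frac{1}{\alpha + B(t_1, 0)}\right)^{2} \left(\frac{\alpha}{\alpha + B(t_1, 0)}\right)^{r} \\
&\quad \times \frac{\Gamma(r+1) A(6)}{\Gamma(r)} \left(\frac{1}{\alpha + B(6, t_1)}\right) \left(\frac{\alpha}{\alpha + B(6, t_1)}\right)^{r}
\end{aligned} \tag{5}$$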

In general, we cannot tell exactly when (or if) renewals of buying rates take place. For this consumer, the number of renewals could have ranged from zero to three. The set of eight possible renewal patterns is given in Table 1. Equation (4) is the likelihood function corresponding to the renewal pattern in row (i), and the likelihood function corresponding to the renewal pattern in row (iii) is given in equation (5). While we do not know which of the eight patterns corresponds to the consumer, we can write out the unconditional likelihood function associated with each of the possible renewal patterns and compute the consumer’s overall likelihood as the weighted average of the renewal-pattern-specific likelihoods, where the weights are the probabilities of each renewal pattern occurring. (Following assumption (iv), the probability of observing the renewal pattern in row (i) is given by $(1 - \gamma_0)(1 - \gamma_1)(1 - \gamma_2)$, while the probability of observing the renewal pattern in row (iii) is given by $(1 - \gamma_0)\gamma_1(1 - \gamma_2)$.)

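| Pattern | Renewal after Trial | Renewal after 1st Repeat | Renewal after 2nd Repeat | Number of Renewals |
|---|---|---|---|---|
| (i) | | | | 0 |
| (ii) | ✓ | | | 1 |
| (iii) | | ✓ | | 1 |
| (iv) | | | ✓ | 1 |
| (v) | ✓ | ✓ | | 2 |
| (vi) | ✓ | | ✓ | 2 |
| (vii) | | ✓ | ✓ | 2 |
| (viii) | ✓ | ✓ | ✓ | 3 |

Table 1: Feasible Renewal Patterns for Three Purchases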
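The eight patterns of Table 1 and their weights can be enumerated mechanically. The following sketch applies assumption (iv) to every subset of renewal opportunities; the function names and the parameter values in the example are hypothetical and chosen only for illustration.

```python
import math
from itertools import product

def renewal_prob(j, eta, psi, theta):
    """Equation (3): renewal probability at depth-of-repeat level j."""
    return eta if j == 0 else 1.0 - psi * (1.0 - math.exp(-theta * j))

def pattern_probabilities(J, eta, psi, theta):
    """Map each renewal pattern w (tuple of levels after which a renewal occurs)
    to its probability: product of gamma_j for j in w and (1 - gamma_j) otherwise."""
    gammas = [renewal_prob(j, eta, psi, theta) for j in range(J + 1)]
    probs = {}
    for flags in product([False, True], repeat=J + 1):
        w = tuple(j for j, renewed in enumerate(flags) if renewed)
        p = 1.0
        for g, renewed in zip(gammas, flags):
            p *= g if renewed else 1.0 - g
        probs[w] = p
    return probs

# Three purchases (trial, first repeat, second repeat), so J = 2.
probs = pattern_probabilities(J=2, eta=0.3, psi=0.9, theta=0.7)
for w, p in sorted(probs.items(), key=lambda item: (len(item[0]), item[0])):
    print(w, round(p, 4))
```

The empty tuple corresponds to row (i) of Table 1 and `(1,)` to row (iii); the weights sum to one because each renewal opportunity is an independent yes/no event.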

More generally, let $T_h = \{t_0, \ldots, t_j, \ldots, t_J\}$ be the set of times at which household $h$, ($h = 1, \ldots, H$), makes its $K$ purchases of the new product in the period $(0, t_c]$, where $t_c$ is the censoring point that is the end of the calibration period. (Clearly $T_h = \emptyset$ if $K = 0$.) The exact nature of the likelihood function for consumer $h$ depends on whether $K = 0$ or $K > 0$.

If no purchase of the new product is observed (i.e., $K = 0$), this is due to either (i) the household not being in the market for the new product, or (ii) the household has simply not yet had the opportunity or need to make a trial purchase. Therefore, the likelihood function for a household making no purchases is:

$$L(T_h) = (1 - \pi_0) + \pi_0 \left(\frac{\alpha}{\alpha + B(t_c, 0)}\right)^{r} \tag{6}$$


The first term is simply the probability that a household will never try the new product, whereas the second term is the probability that a household will eventually make a trial purchase multiplied by the with-covariate survival function (i.e., the probability that no purchase occurred in $(0, t_c]$) mixed with the gamma distribution.
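The two components of equation (6) can be computed separately, which also lets us ask how likely it is that a non-buyer is a structural never-trier. The sketch below is illustrative: `b_tc` stands for $B(t_c, 0)$ and is passed in as a number, and the posterior calculation in `prob_never_trier` is a simple Bayes-rule by-product that is not stated in the paper.

```python
def no_purchase_likelihood(pi0, r, alpha, b_tc):
    """Equation (6): never-trier mass plus gamma-mixed survival over (0, tc]."""
    return (1.0 - pi0) + pi0 * (alpha / (alpha + b_tc)) ** r

def prob_never_trier(pi0, r, alpha, b_tc):
    """Share of the no-purchase likelihood due to the never-trier term
    (Bayes' rule; an illustrative addition, not a formula from the paper)."""
    return (1.0 - pi0) / no_purchase_likelihood(pi0, r, alpha, b_tc)

# Hypothetical values: with no covariates, B(tc, 0) is simply tc.
print(no_purchase_likelihood(pi0=0.4, r=1.0, alpha=2.0, b_tc=2.0))
print(prob_never_trier(pi0=0.4, r=1.0, alpha=2.0, b_tc=2.0))
```

As the calibration period lengthens, `b_tc` grows, the survival term shrinks, and a household with no purchases is increasingly attributed to the never-trier segment.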

When $K > 0$, the possibility of renewals occurring emerges. For a household making $K$ purchases of the product in the period $(0, t_c]$, there are $K$ renewal opportunities. At each renewal opportunity, a renewal either occurs or it does not; consequently, there are $2^K$ sets of possible renewal points. Let there be $n \le K$ renewals, and let $w = \{w_i\}$, $i = 1, \ldots, n$ be the set of renewal points, where $w_i$ corresponds to the depth-of-repeat level immediately following which a renewal occurs. (For the second example above, $w = \{1\}$.) If a renewal occurs after the trial purchase, we have $w_1 = 0$. As we cannot tell exactly when (or if) renewals of buying rates take place, we first formulate the likelihood function conditional on a given renewal pattern, $w$.

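For the case of no renewals ($n = 0$), we have

$$L(T_h \mid w) = \pi_0 \left\{\prod_{j=0}^{J} A(\tau_j)\right\} \frac{\Gamma(r + J + 1)}{\Gamma(r)} \left(\frac{1}{\alpha + B(t_c, 0)}\right)^{J+1} \left(\frac{\alpha}{\alpha + B(t_c, 0)}\right)^{r} \tag{7}$$

where $\tau_j$ is the time period (e.g., week) in which the $j$th purchase occurred, defined as

$$\tau_j = \begin{cases} t_j & \text{if } t_j \text{ is integer} \\ \mathrm{Int}(t_j) + 1 & \text{otherwise} \end{cases}$$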

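For $n > 0$ renewals, with the last renewal occurring immediately following the last purchase (i.e., $w_n = J$), we have

$$\begin{aligned}
L(T_h \mid w) = {} & \pi_0 (1 - \phi)^{n-1} \left\{\prod_{j=0}^{J} A(\tau_j)\right\} \\
& \times \prod_{i=1}^{n} \left\{\frac{\Gamma(r + w_i - w_{i-1})}{\Gamma(r)} \left(\frac{1}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{w_i - w_{i-1}} \left(\frac{\alpha}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{r}\right\} \\
& \times \left\{\phi + (1 - \phi) \left(\frac{\alpha}{\alpha + B(t_c, t_J)}\right)^{r}\right\}
\end{aligned} \tag{8}$$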


where $w_0 = 0$. This likelihood function can be interpreted in the following manner. As the $n$th renewal occurred immediately following the last purchase, the final bracketed term represents the likelihood that no purchase has occurred since $t_{w_n} = t_J$. This is either because the renewal resulted in the new product being rejected (i.e., a value of $\lambda = 0$ was drawn, with probability $\phi$) or because the consumer has not yet had the opportunity or need to make a repeat purchase in $(t_J, t_c]$ (i.e., the probability of drawing a positive value of $\lambda$, $(1 - \phi)$, multiplied by the “exp/gamma, covariates” survival function for a time period $(t_J, t_c]$). The probability that the first $n - 1$ renewals saw positive values of $\lambda$ being drawn is $(1 - \phi)^{n-1}$. For each of the $n$ intervals during which preferences were stable, the second bracketed term is simply the product of the “with covariates” pdfs, mixed with the gamma distribution.

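Alternatively, if the final renewal occurs some time before the last purchase (i.e., $w_n < J$), we have

$$\begin{aligned}
L(T_h \mid w) = {} & \pi_0 (1 - \phi)^{n} \left\{\prod_{j=0}^{J} A(\tau_j)\right\} \\
& \times \prod_{i=1}^{n} \left\{\frac{\Gamma(r + w_i - w_{i-1})}{\Gamma(r)} \left(\frac{1}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{w_i - w_{i-1}} \left(\frac{\alpha}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{r}\right\} \\
& \times \left\{\frac{\Gamma(r + J - w_n)}{\Gamma(r)} \left(\frac{1}{\alpha + B(t_c, t_{w_n})}\right)^{J - w_n} \left(\frac{\alpha}{\alpha + B(t_c, t_{w_n})}\right)^{r}\right\}
\end{aligned} \tag{9}$$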

As $t_{w_n} < t_J$, we know that all renewals saw positive values of $\lambda$ being drawn, the probability of which is $(1 - \phi)^n$. The second bracketed term is interpreted as above, while the final bracketed term is the likelihood that the last $J - w_n$ purchases occurred in $(t_{w_n}, t_c]$.

Now the probability of a given renewal pattern $w$ is

$$P(w \mid \psi, \theta) = \prod_{j \in w} \gamma_j \prod_{j \in I - w} (1 - \gamma_j) \tag{10}$$

where $I = \{0, 1, \ldots, J\}$. Therefore, for $K > 0$, the likelihood function associated with $T_h$ is simply the weighted average of the renewal-pattern-specific likelihoods, that is,

$$L(T_h) = \sum_{s} L(T_h \mid w_s)\, P(w_s) \tag{11}$$

where the summation is over the possible renewal sets indexed by $s = 1, 2, \ldots, 2^K$. (For $K = 0$, the likelihood function is given by equation (6).) It follows that the overall sample log-likelihood function is:

$$LL = \sum_{h=1}^{H} \ln\left[L(T_h)\right] \tag{12}$$

Equations (6)–(12) define the model as fitted to a given dataset. Maximum likelihood estimates of the model parameters ($\pi_0, r, \alpha, \eta, \psi, \theta, \phi, \boldsymbol{\beta}$) are obtained by maximizing the log-likelihood function given in equation (12) above. Standard numerical optimization methods are employed, using the MATLAB programming language, to obtain the parameter estimates.
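The mixture over renewal patterns in equations (6) to (12) can be sketched compactly for the no-covariate case, where $A(t) = 1$ and $B(t_b, t_a) = t_b - t_a$. This is an illustrative Python sketch, not the authors' MATLAB code. The segment bookkeeping is an assumption of this example: each stretch of constant $\lambda$ contributes one gamma-mixed term, and the first stretch is taken to include the trial purchase, which is what the worked three-purchase examples in equations (4) and (5) imply. All function and parameter names are hypothetical.

```python
import math
from itertools import product

def renewal_prob(j, eta, psi, theta):
    """Equation (3)."""
    return eta if j == 0 else 1.0 - psi * (1.0 - math.exp(-theta * j))

def segment_term(k, b, r, alpha):
    """Gamma-mixed likelihood of k purchases over integrated hazard b:
    Gamma(r+k)/Gamma(r) * (1/(alpha+b))^k * (alpha/(alpha+b))^r."""
    log_val = (math.lgamma(r + k) - math.lgamma(r)
               - k * math.log(alpha + b)
               + r * (math.log(alpha) - math.log(alpha + b)))
    return math.exp(log_val)

def household_likelihood(times, tc, pi0, r, alpha, eta, psi, theta, phi):
    """Likelihood of one household's purchase times (no covariates assumed)."""
    if not times:  # K = 0: equation (6)
        return (1.0 - pi0) + pi0 * segment_term(0, tc, r, alpha)
    last = len(times) - 1
    total = 0.0
    # Enumerate all 2^K renewal patterns, as in equation (11).
    for flags in product([False, True], repeat=len(times)):
        weight, lik = 1.0, pi0
        start, count = 0.0, 0  # start time and purchase count of current segment
        for j, renewed in enumerate(flags):
            g = renewal_prob(j, eta, psi, theta)
            weight *= g if renewed else 1.0 - g   # equation (10)
            count += 1
            if renewed:
                lik *= segment_term(count, times[j] - start, r, alpha)
                start, count = times[j], 0
                if j < last:
                    lik *= 1.0 - phi  # a later purchase implies lambda > 0 was drawn
        tail = segment_term(count, tc - start, r, alpha)
        if flags[last]:
            lik *= phi + (1.0 - phi) * tail  # possible dropout after last purchase
        else:
            lik *= tail
        total += weight * lik
    return total

def log_likelihood(households, tc, **params):
    """Equation (12): sum of log household likelihoods."""
    return sum(math.log(household_likelihood(t, tc, **params)) for t in households)

params = dict(pi0=0.5, r=2.0, alpha=3.0, eta=0.3, psi=0.9, theta=0.7, phi=0.2)
print(log_likelihood([[1.5, 3.2, 5.0], [], [2.0]], 6.0, **params))
```

Because the enumeration grows as $2^K$, this brute-force form is only practical for households with a modest number of purchases; handing `log_likelihood` (negated) to a general-purpose numerical optimizer would mirror the estimation step described above.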

2.1 Properties of the Model

In its most general form, the model requires the estimation of $7 + s$ parameters, where $s$ is the number of marketing covariates. It is a very flexible model that can capture many patterns of buying behavior. Examples of such buying phenomena include:

• “Traditional” stationary buying behavior. If γj = 0 ∀ j, we have a purchasing process in

which the latent purchase rates are stationary. (This is associated with θ → ∞ and ψ = 1.)

When π0 = 1, our model reduces to the “exp/gamma, covariates” model considered by

Gupta (1991). When β = 0, we have the two parameter exponential-gamma model of

stationary repeat buying behavior which is the timing counterpart of the NBD counting

model (Gupta and Morrison 1991). The estimates of r and α would equal those obtained

by fitting the NBD to the data. Relaxing the assumption that π0 = 1 gives us the timing

equivalent of Morrison’s (1969) NBD with “spike at zero” (counting) model where 1− π0

is the size of the structural “never buyers” segment.

• The transition from a “new” to “established” product. If ψ = 1 and θ is finite, then γj → 0

12

Page 15: AnIntegratedTrial/RepeatModel forNewProductSalesfacultyresearch.london.edu/docs/01-1001.pdf · h(t|tj)=λex(t) β ≡ λA(t) where x(t) is the vector of marketing covariates at time

as j increases; that is, the probability of a renewal occurring tends to zero as a consumer

moves to higher depth-of-repeat levels. This means that the initial nonstationary buying

process evolves to a stationary process as the product becomes more established (i.e.,

when most buyers have made a large number of repeat purchases). Therefore the model

is consistent with the notion of nonstationary buying behavior during the early stages

of a new product’s life and stationary buying behavior—as characterized by the NBD

model—once it has become established in the marketplace.

• Long-term nonstationarity in repeat buying. When ψ < 1, the probability of renewal will

always be non-zero which means that the repeat buying process is always nonstationary.

If θ → ∞, γj is a constant 1 − ψ; that is, the probability of renewal is constant across all

depth-of-repeat levels. For finite θ, γj → 1− ψ as j increases; that is, the probability of a

renewal tends to the constant 1−ψ as a consumer moves to higher depth-of-repeat levels.

Such a model can easily capture the “leakage” of repeat buyers phenomenon observed by

East and Hammond (1996). In particular, if φ > 0, or the underlying gamma distribution

has a mode at zero (r ≤ 1), an ongoing low level of renewals will see some consumers

drawing a value of λ = 0 on a given renewal, thereby “dropping out” of the market for the

product of interest. Other researchers (e.g., Schmittlein, Morrison, and Colombo 1987)

have proposed NBD-based models that include a “death” process. However, our model

is far more flexible, allowing for other forms of nonstationarity (e.g., “speeding up” and

“slowing down” of latent purchase rates) beyond a simple “death” process.

2.2 Generating Sales Forecasts

In order to evaluate the tracking performance of the proposed model, or to use the model for

forecasting sales beyond the model calibration period, it is necessary to generate sales numbers

(i.e., counts) from this timing model. We are interested in a number of sales-related measures

for the new product:

i. the cumulative trial sales by time t, T (t),

ii. the cumulative repeat sales by time t, R(t),


iii. the total sales by time t, S(t), which by definition is equal to T (t) +R(t), and

iv. the depth-of-repeat components of repeat sales. Defining Rj(t) as the number of customers

who have made at least j repeat purchases of the new product by time t, we have $R(t) = \sum_{j=1}^{\infty} R_j(t)$.

Our goal is to generate these numbers over the time interval (0, tf ], where tf denotes the end of

the forecast period.

While we have a simple closed-form expression for expected cumulative trial sales,

$$E[T(t)] = H \times \pi_0 \left[1 - \left(\frac{\alpha}{\alpha + B(t)}\right)^{r}\right],$$

it is not possible to write out a closed-form expression for R(t), and consequently S(t). We

therefore propose a simulation-based approach to computing the sales numbers. A complete

step-by-step description of this simulation procedure is contained in Appendix A.
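The closed-form trial expression can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name is hypothetical, and it assumes that without covariates B(t) reduces to elapsed time t and that time is measured in days (neither is stated in this section). The parameter values are the no-covariate exponential estimates reported in Table 2.

```python
def expected_cumulative_trial(H, pi0, r, alpha, B_t):
    """E[T(t)] = H * pi0 * [1 - (alpha / (alpha + B(t)))**r]."""
    return H * pi0 * (1.0 - (alpha / (alpha + B_t)) ** r)


# Assumptions: B(t) = t with no covariates, and t is counted in days,
# so the end of week 26 corresponds to t = 182.
triers_wk26 = expected_cumulative_trial(
    H=2799, pi0=0.159, r=0.574, alpha=46.597, B_t=26 * 7
)
print(round(triers_wk26, 1))
```

Under these assumptions the result lands close to the 267 triers reported for the first 26 weeks in Section 3, which suggests (but does not prove) that the day-based time unit is the right reading.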

2.3 Extensions to the Basic Model

We consider two extensions to the basic model: (i) a relaxation of the assumption that the

interpurchase times are distributed exponentially, and (ii) a recognition of the possibility that

the effects of marketing activities could vary as the consumer gains more experience with the

new product.

As with numerous other stochastic models of buyer behavior, our model is based on the

assumption that individual consumer interpurchase times can be characterized by the expo-

nential distribution. Two potentially troubling characteristics of this distribution are that it is

memoryless (i.e., there is no influence of time since the last purchase) and that the mode of the

distribution is at zero (which means that the next purchase is most likely to occur immediately

after the last one). Consequently a number of researchers have proposed that the Erlang-2 dis-

tribution be used to model interpurchase times, as it allows for a more regular purchase process

(Chatfield and Goodhardt 1973; Herniter 1971; Jeuland, Bass and Wright 1980). We therefore

consider the case of Erlang-2 distributed interpurchase times as an extension to the basic model.


Using Gupta’s (1991) approach to incorporating the effects of time-varying covariates into

the Erlang-2 distribution, the survivor function and pdf of the with-covariate interpurchase time

distribution are given by

$$S(t \mid t_j) = \exp[-\lambda B(t, t_j)]\,[1 + \lambda B(t, t_j)] \tag{13}$$

$$f(t \mid t_j) = \lambda^2 A(\tau) B(t, t_j) \exp[-\lambda B(t, t_j)] \tag{14}$$

Coupled with assumptions (i) and (iii)–(v), we arrive at a new set of renewal-pattern-specific

likelihood functions, which are presented in Appendix B.
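As a quick consistency check on equations (13) and (14), the pdf should equal minus the time derivative of the survivor function. The sketch below verifies this numerically under a simplifying assumption that is not in the paper: the covariate multiplier A is held constant, so that B(t, tj) = A × (t − tj). The function names and all numeric values are hypothetical.

```python
import math


def erlang2_survivor(lam, B):
    """Equation (13): S = exp(-lam * B) * (1 + lam * B)."""
    return math.exp(-lam * B) * (1.0 + lam * B)


def erlang2_pdf(lam, A, B):
    """Equation (14): f = lam^2 * A * B * exp(-lam * B)."""
    return lam ** 2 * A * B * math.exp(-lam * B)


def cumulative_covariate(t, tj, A):
    """Assumed constant-covariate case: B(t, tj) = A * (t - tj)."""
    return A * (t - tj)


# Hypothetical values chosen only for the check.
lam, A, tj, t, h = 0.05, 1.3, 10.0, 24.0, 1e-5

# Central finite difference of S with respect to t.
s_plus = erlang2_survivor(lam, cumulative_covariate(t + h, tj, A))
s_minus = erlang2_survivor(lam, cumulative_covariate(t - h, tj, A))
numeric_pdf = -(s_plus - s_minus) / (2 * h)

analytic_pdf = erlang2_pdf(lam, A, cumulative_covariate(t, tj, A))
print(numeric_pdf, analytic_pdf)
```

The two printed values should agree to many decimal places, which confirms that the λ² and B(t, tj) terms in (14) follow from differentiating (13).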

Our second extension allows the response to marketing activities to vary across depth-of-

repeat levels (i.e., as the consumer gains more experience with the new product). This notion

is motivated by the work of Helsen and Schmittlein (1994), who examined how price sensitivity

varies across depth-of-repeat classes. In theory, we could estimate a separate β vector for trial,

first repeat, second repeat, and so on. However, we would not be able to generate sales forecasts

from such a model, as we would need β vectors for repeat levels not observed during the

model calibration period.

One way to accommodate changing βs in a forecasting setting is to specify a general structure

for the evolution of the coefficients as we move through higher levels of repeat purchasing. We

propose the structure

$$\beta_j = \beta_0 + (\beta_\infty - \beta_0)(1 - e^{-\delta j}) \tag{15}$$

in which the covariate effects evolve from their trial values (β0) to their long-run equilibrium

values (β∞). The speed with which the equilibrium values are reached (as a function of repeat

level j) is determined by the δ parameter.
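To illustrate how equation (15) moves a coefficient from its trial value toward its long-run value, here is a small sketch. The function name is hypothetical; the inputs are the coupon estimates for the exponential model with varying βs in Table 2 (β0 = 3.881, β∞ = 14.852, δ = 0.164).

```python
import math


def beta_at_depth(beta0, beta_inf, delta, j):
    """Equation (15): coefficient at depth-of-repeat level j (j = 0 is trial)."""
    return beta0 + (beta_inf - beta0) * (1.0 - math.exp(-delta * j))


# Coupon coefficient at selected depth-of-repeat levels.
coupon_path = [beta_at_depth(3.881, 14.852, 0.164, j) for j in range(0, 31, 5)]
for j, b in zip(range(0, 31, 5), coupon_path):
    print(j, round(b, 3))
```

At j = 0 the function returns β0 exactly, and larger values of δ shorten the number of repeat levels needed to approach β∞.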

We explore the value of these two extensions in the following empirical analysis.


3 Application

The basic model developed above nests a simpler model in which covariate effects are ignored. A

generalization of the basic model allows for depth-of-repeat-specific βs, as given in equation (15).

At the heart of these three model specifications is the assumption of exponential interpurchase

times. Replacing this with the assumption of Erlang-2 interpurchase times gives us another

three model specifications to consider.

We examine the performance of these six model specifications using test market data for

“Kiwi Bubbles”, a masked name for a shelf-stable juice drink, aimed primarily at children,

which is sold as a multipack with several single-serve containers bundled together. Prior to

national launch, it underwent a year-long test conducted in two of IRI’s BehaviorScan test

markets. We use BehaviorScan panel data, drawn from 2799 panelists in two markets. Using

data for the 267 panelists that tried the new product by the end of week 26, we wish to forecast

the purchasing behavior of the whole panel (i.e., 2799 panelists) to the end of the year (week

52). That is, we fit the six model specifications to the first six months of purchasing data and

generate sales forecasts for the whole year. We have information on the marketing activity

over the 52 weeks the new product was in the test market; this comprises a standard scanner

data measure of promotional activity (i.e., any feature and/or display), along with measures of

advertising and coupon activity. To account for carryover effects, the advertising and coupon

measures are expressed as standard exponentially-smoothed “stock” variables (e.g., Broadbent

1984). The results for the models are reported in Table 2.
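The text does not give the smoothing formula used for the advertising and coupon "stock" variables. As a rough illustration only, the sketch below uses one common form, stock_t = x_t + decay × stock_{t−1}; both this recursion and the decay value of 0.5 are assumptions, and the weekly series is made up.

```python
def exponentially_smoothed_stock(series, decay):
    """Carry over a fraction `decay` of last period's stock (assumed form)."""
    stock = 0.0
    out = []
    for x in series:
        stock = x + decay * stock
        out.append(stock)
    return out


# Hypothetical weekly coupon activity: a single drop in the third week.
coupon_weeks = [0, 0, 1, 0, 0, 0]
print(exponentially_smoothed_stock(coupon_weeks, decay=0.5))
# prints [0.0, 0.0, 1.0, 0.5, 0.25, 0.125]
```

The point of the transformation is that a one-week event keeps a geometrically fading influence on the covariate in the following weeks.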

Looking at the model log-likelihoods, we immediately observe that the fit of the three ex-

ponential model specifications dominates their Erlang-2 counterparts. This result is completely

consistent with recent work on the modeling of trial purchasing for new CPG products, which

finds strong support for the exponential interpurchase time distribution (Fader and Hardie 2001;

Hardie, Fader, and Wisniewski 1998). The dominance of the exponential model specification

is confirmed when we look at the index of year-end forecast accuracy (WK52 Index); in all

three cases, the exponential specification produces more accurate forecasts than its Erlang-2

counterpart. (Note that this does not necessarily follow from the good fit of the exponential

model specification; as Armstrong (2001) observed, a large number of researchers have found

that model fit is a poor way to assess predictive validity.)

| Parameter | Exponential: Without Covariates | Exponential: Covariates (constant βs) | Exponential: Covariates (varying βs) | Erlang-2: Without Covariates | Erlang-2: Covariates (constant βs) | Erlang-2: Covariates (varying βs) |
|---|---|---|---|---|---|---|
| π0 | 0.159 | 1.000 | 0.488 | 0.163 | 0.426 | 0.470 |
| r | 0.574 | 0.061 | 0.159 | 0.416 | 0.119 | 0.115 |
| α | 46.597 | 80.615 | 122.239 | 10.517 | 14.771 | 19.463 |
| η | 0.428 | 0.269 | 0.236 | 0.668 | 0.570 | 0.541 |
| ψ | 0.859 | 0.968 | 0.968 | 0.777 | 0.847 | 0.844 |
| θ | 2.112 | 2.388 | 2.464 | 1.335 | 1.405 | 1.484 |
| φ | 1.000 | 0.000 | 0.938 | 0.587 | 0.000 | 0.000 |
| β(coupon)† | - | 5.208 | 3.881 | - | 3.730 | 1.633 |
| β(advertising)† | - | 0.000 | 0.000 | - | 0.000 | 0.000 |
| β(promotion)† | - | 0.012 | 0.009 | - | 0.009 | 0.013 |
| β∞(coupon) | - | - | 14.852 | - | - | 10.345 |
| β∞(advertising) | - | - | 0.012 | - | - | 0.000 |
| β∞(promotion) | - | - | 0.008 | - | - | 0.003 |
| δ | - | - | 0.164 | - | - | 0.342 |
| LL | −3770.1 | −3726.5 | −3724.0 | −3777.3 | −3744.9 | −3741.2 |
| WK52 Index‡ | 97.6 | 104.2 | 107.7 | 85.6 | 91.0 | 90.9 |

† β0 for the varying βs specification

‡ 100 × expected week 52 cumulative total sales / actual week 52 cumulative total sales

Table 2: Summary of Model Results

Within the set of exponential models, we observe that (i) the inclusion of covariates results

in a significant improvement in calibration-period model fit (p < .001) and (ii) allowing for

depth-of-repeat-specific βs does not result in a significant improvement in calibration-period

model fit (p = .17). Looking at the index of year-end forecast accuracy, we also observe that

the model that allows for depth-of-repeat-specific βs is dominated by the other two model

specifications. This is contrary to the findings of Helsen and Schmittlein (1994). The fact

that the analysis undertaken by Helsen and Schmittlein treated trial, first repeat, and second

repeat as independent processes and failed to control for unobserved heterogeneity means that

we have more confidence in our findings. It is, however, too soon to draw any conclusion as

to whether and how the effects of marketing activities really vary across depth-of-repeat levels.

But the modeling approach developed in this paper is the correct way to explore such effects,

as it overcomes the shortcomings identified in the Helsen and Schmittlein analysis framework.

Reflecting on the parameters of the “exponential with constant covariate effects” model, we

note the model suggests that every panelist is potentially in the market for the new product


(π0=1). While at first glance this is counter-intuitive, it is completely consistent with the existing

literature on the modeling of new product trial, in which it has been found that the estimated

value of the penetration limit parameter (i.e., π0) is typically either 1.0 or not significantly

different from 1.0 (Fader and Hardie 2001). The next two numbers give us the estimates of the

shape and scale parameters (r and α) for the underlying gamma distribution that characterizes

the heterogeneous purchasing rates across the panelists. When a given panelist makes a trial

purchase, there is a 27% chance (η) that he will change his purchase rate. If this does occur,

the panelist does not reject the product (φ = 0); rather, he draws a new purchase rate from the

original gamma distribution. When this panelist does eventually make a first repeat purchase,

we use equation (3) to determine that there is a 12% chance that he will undergo a renewal

immediately after this purchase. This drops to 4% following his second repeat purchase. From

that point on, the renewal probability effectively reaches its asymptotic value of 1− ψ = 3.2%.
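Equation (3) is not reproduced in this part of the paper. The sketch below therefore assumes a functional form, γj = 1 − ψ(1 − e^{−θj}), chosen because it matches the limiting cases described in Section 2.1 (γj → 1 − ψ as j grows, and γj = 1 − ψ for all j when θ → ∞) and reproduces the percentages quoted above. The function name is hypothetical; ψ and θ are the "exponential with constant covariate effects" estimates from Table 2.

```python
import math


def renewal_probability(j, psi, theta):
    """Assumed form of equation (3): gamma_j = 1 - psi * (1 - exp(-theta * j))."""
    return 1.0 - psi * (1.0 - math.exp(-theta * j))


psi, theta = 0.968, 2.388
for j in (1, 2, 3, 10):
    print(j, round(renewal_probability(j, psi, theta), 3))
```

With these estimates the first two values round to 0.12 and 0.04, and later values settle at 1 − ψ = 0.032, in line with the 12%, 4%, and 3.2% figures in the text.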

Finally, we find that the couponing and in-store promotional activities have significant im-

pacts on purchase timing. The zero coefficient for advertising reflects the lower bound placed on

this parameter when maximizing the log-likelihood function; unconstrained, it is negative but

not significantly different from 0.

As a benchmark, we also fit the basic Gupta (1991) “exp/gamma, covariates” model to the

first six months of purchasing data (allowing for the possibility of never-triers); the resulting

six parameter model has a log-likelihood of −3733.0. This represents a significantly worse fit

(p = 0.011) than the above “exponential with constant covariate effects” model specification,

and it substantially overpredicts year-end sales, with a WK52 index of 114. We can therefore

conclude that there is nonstationarity in the repeat buying behavior for the new product, over

and above the temporary changes induced by the marketing activities, that must be explicitly

captured in a model for the sales of a new product.
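The nested-model comparisons in this section read like likelihood-ratio tests, although the text does not name the test; that reading is an assumption. The sketch below applies it to the benchmark comparison, taking the degrees of freedom as 4 (the 7 + 3 = 10 parameters of the constant-covariate model minus the six of the benchmark). The chi-square tail function is written out with the standard library only, and all function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum of half-integer gamma terms.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def likelihood_ratio_test(ll_restricted, ll_general, df):
    """Return the LR statistic 2 * (LL_general - LL_restricted) and its p-value."""
    stat = 2.0 * (ll_general - ll_restricted)
    return stat, chi2_sf(stat, df)


# Benchmark "exp/gamma, covariates" model versus the constant-covariate model.
stat_bench, p_bench = likelihood_ratio_test(-3733.0, -3726.5, df=4)
# No-covariate exponential model versus the constant-covariate model (3 betas).
stat_cov, p_cov = likelihood_ratio_test(-3770.1, -3726.5, df=3)
print(round(stat_bench, 1), round(p_bench, 3))
print(round(stat_cov, 1), p_cov < 0.001)
```

The first comparison gives a statistic of 13.0 and a p-value of about 0.011, in line with the value reported above; the second is consistent with the reported p < .001 for adding covariates.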

The forecasting performance of the “exponential with constant covariate effects” model spec-

ification is illustrated in Figure 2. In addition to a total sales forecast, managers are interested in

the breakdown of total sales into its trial, first repeat, and additional repeat components; see,

for example, Clarke (1984). The model-based predictions provide an accurate tracking of both

the total sales curve as well as its trial and repeat components.


[Plot: cumulative sales per 100 households (0 to 30) by week (0 to 52); actual and predicted curves for trial, first repeat, additional repeat, and total sales]

Figure 2: Predicted Sales

Even though the level of additional repeat sales is relatively low at the end of the calibration

period, it is evident that additional repeat will quickly bypass the other sales components, and

will comprise the lion’s share of total sales in the period following week 52. The ability of our

model to accurately track and forecast this key component is, perhaps, the strongest indicator

of its validity and usefulness.

Two other widely-monitored measures of new product performance are “percent triers re-

peating” and “repeats per repeater” (Clarke 1984; Rangan and Bell 1994). At any point in time

t, percent triers repeating is computed as R1(t)/T (t), while repeats per repeater is computed as

R(t)/R1(t). In Figures 3 and 4 we compare the actual development of these two measures with

the predictions derived from the model, observing that the model-based numbers accurately

track the actual numbers.
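The two tracking measures follow directly from the definitions in Section 2.2. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def repeat_measures(trial, repeat_by_depth):
    """Compute the two tracking measures at a single point in time.

    trial           -- T(t), cumulative number of triers
    repeat_by_depth -- [R_1(t), R_2(t), ...], customers with at least j repeats
    """
    total_repeat = sum(repeat_by_depth)            # R(t) = sum over j of R_j(t)
    r1 = repeat_by_depth[0] if repeat_by_depth else 0
    pct_triers_repeating = 100.0 * r1 / trial if trial else 0.0
    repeats_per_repeater = total_repeat / r1 if r1 else 0.0
    return pct_triers_repeating, repeats_per_repeater


# Hypothetical counts: 200 triers; 80, 40, and 20 customers with at least
# one, two, and three repeat purchases respectively.
print(repeat_measures(200, [80, 40, 20]))
# prints (40.0, 1.75)
```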

[Plot: percent of triers repeating (0 to 50) by week (0 to 52); actual and predicted curves]

Figure 3: Tracking Percent Triers Repeating


[Plot: average number of repeat purchases (0 to 3) by week (0 to 52); actual and predicted curves]

Figure 4: Tracking Repeats Per Repeater

Referring back to Table 2, we observe that the no-covariate model generates the most accurate

forecast, as judged on the basis of the WK52 index. The forecasting performance of this model

is illustrated in Figure 5.

[Plot: cumulative sales per 100 households (0 to 30) by week (0 to 52); actual and predicted curves]

Figure 5: Sales Forecast: No-Covariate Model

Does this mean that marketing-mix variables have no value in a new product sales forecasting

model? Not at all. Fader and Hardie (2001) note that the inclusion of marketing-mix variables

has a big impact on forecasting performance when the model calibration period is relatively short.

(However, with longer calibration periods, the forecasts generated using no-covariate models are

just as accurate as their with-covariate counterparts.) Furthermore, the with-covariate model

can be used to evaluate the impact of incremental changes in the marketing-mix as the marketing

manager seeks to finalize the (national) launch plan for the new product. We now consider such

an application of the model.


One element of the promotional activity for “Kiwi Bubbles” was an FSI coupon distributed

in week 3. In order to determine the impact of this early couponing activity, the marketing

manager would want to know what the sales path would be had this coupon not been distributed.

Alternatively, noting the apparent sales increase in week 3, the marketing manager may consider

repeating such a promotional activity further on in the launch phase of the new product. We

therefore consider two scenarios, the first corresponding to the removal of the coupon dropped

in week 3, the second corresponding to a repeat of this coupon (i.e., same face value and fuse) in

week 20. We generate the sales forecasts under each scenario and compare them to the base case

corresponding to the sales forecast associated with the marketing plan used in the test market.

The predicted total sales paths for these two scenarios are reported in Figure 6, along with

the (predicted) sales path associated with the base case. We observe that under scenario 1, first

year sales are down by 4.4% while under scenario 2, first year sales are up by 2.1%.

[Plot: cumulative sales per 100 households (0 to 30) by week (0 to 52); curves for the base case, Scenario 1, and Scenario 2]

Figure 6: Total Sales for Base Case and Scenarios 1 & 2 (Coupon Deleted/Added)

These overall changes in total sales are decomposed in Figure 7, which reports cumulative

trial and repeat sales under each scenario, indexed against the base case. We observe that year-

end trial sales are down by 2.1% under scenario 1 and up by 1.0% under scenario 2. Year-end

repeat sales are down by 5.7% under scenario 1 and up by 2.8% under scenario 2. These numbers

provide an indication of the permanent loss (or gain) in sales that may be due to the deletion

(addition) of this type of coupon event. Further calculations reveal that 17.6% of the change in

year 1 sales under scenario 1 is due to the reduction in trial sales (alone), whereas 18.0% of the

increase in year 1 sales under scenario 2 is due to the change in trial sales (alone).
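The share of a sales change attributable to trial alone is the change in trial sales divided by the change in total sales. The sketch below shows that arithmetic with made-up unit counts; the function name and the numbers are hypothetical and are not the paper's underlying figures.

```python
def trial_share_of_change(base_trial, base_repeat, scen_trial, scen_repeat):
    """Percent of the change in total sales that comes from trial sales alone."""
    d_trial = scen_trial - base_trial
    d_total = d_trial + (scen_repeat - base_repeat)
    if d_total == 0:
        raise ValueError("total sales are unchanged; the share is undefined")
    return 100.0 * d_trial / d_total


# Hypothetical year-end units: base case versus a scenario with lower sales.
share = trial_share_of_change(
    base_trial=100.0, base_repeat=200.0, scen_trial=98.0, scen_repeat=190.0
)
print(round(share, 1))
```

Because repeat sales make up the larger part of the total in this example, a modest percentage drop in trial accounts for a small share of the overall change.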


[Plot: sales index (base case = 100; axis 60 to 110) by week (0 to 52); trial and repeat curves for Scenario 1 and Scenario 2]

Figure 7: Scenario Trial and Repeat Sales (Cumulative) Indexed to Base Case

These numbers by themselves can be a little misleading as we must consider the

“trickle-through-repeat” effects of the changes in trial. In Figures 8 and 9, we compare the development

of “percent triers repeating” and “repeats per repeater” under each scenario, indexed against the

base case. Under scenario 1 we observe that the percentage of triers who make a repeat purchase

initially drops by almost 15% but has effectively recovered by the end of the year (down by 1.3%);

the initial drop in the number of repeat purchases per repeater was not as great but has not

recovered as much (a difference of 2.5% at year-end). Thus we may conclude that much of the

reduction in repeat sales under scenario 1 is due to the fact that a number of consumers who

would have been induced to try the new product because of the couponing activity delay their

trial purchase and therefore the follow-on repeat purchases are not observed. Under scenario 2,

the extra coupon has minimal impact on trial or first repeat; the primary impact is on the

repeat-buying behavior of those consumers who are already repeat buyers (repeats per repeater

are up by a small 1.3%).

This analysis of the “Kiwi Bubbles” test-market data illustrates the value of the model

developed in this paper, and demonstrates how we can use such a model to help the marketing

manager evaluate incremental changes to the new product launch plan.


[Figure 8 plot not reproduced. x-axis: Week (0 to 52); y-axis: Index (base case = 100); series: Scenario 1, Scenario 2.]

Figure 8: Scenario Impact on Percent Triers Repeating (Indexed to Base Case)

[Figure 9 plot not reproduced. x-axis: Week (0 to 52); y-axis: Index (base case = 100); series: Scenario 1, Scenario 2.]

Figure 9: Scenario Impact on Repeats Per Repeater (Indexed to Base Case)

4 Conclusions

While certain “hot topics” come and go in the field of marketing research, there has always

been a high level of interest (shared by academics and practitioners alike) in the issue of fore-

casting new product sales. At the same time, however, recent years have seen a widening gap

between methodological developments in academia and the state-of-the-art in actual practice.

This paper bridges this gap with a model featuring three important contributions: (i) a fully

integrated model of trial-repeat behavior; (ii) careful consideration of marketing mix (covariate)

effects, including the possibility that the impact of advertising, coupons, and in-store promotional

campaigns might each evolve in a different manner at deeper depths of repeat; (iii)

explicit accommodation of nonstationary repeat buying behavior, which allows chaotic early


behavior to settle down towards a steady-state buying pattern over time.

We examined several variations to our basic model structure, including different individual-

level timing processes (i.e., exponential versus Erlang-2) and covariate schemes (i.e., no covariates

versus constant covariates versus varying with depth-of-repeat). One conclusion that emerged is

the benefit of simplicity—the simplest model (exponential timing with no covariates) proved to

have excellent forecasting capabilities. This is a theme that echoes recent work with trial-only

models (Hardie, Fader, and Wisniewski 1998) as well as repeat-only models (Fader and Hardie

1999). The fact that it continues to hold even when we mix these different types of buying

behaviors is a strong tribute to its robustness and generalizability.

Although covariates are not necessarily required for our model to produce excellent forecasts,

they are still an important (and managerially desirable) component to include in the final

specification. One of the principal reasons for running a test market is to learn about the effectiveness

of these different levers in order to know which ones to use, and when to use them. Although

some marketing mix elements (e.g., end-aisle displays) are aimed primarily at generating new

triers, they also impact repeat sales both directly (i.e., enticing a past buyer to buy again) and

indirectly (since a promotion-induced trier may continue to buy the product in the future). Our

model allows us to capture these different behavioral effects, and can therefore give managers a

correct sense of how well their marketing mix allocations are working.

Beyond the context (i.e., a single new CPG product) discussed so far in the paper, it is worth

discussing other relevant applications/extensions for the general type of methodology presented

here. First, it is important to emphasize that the behavioral “story” behind our model is by

no means limited to the CPG setting. A similar pattern will likely emerge for other types of

products and services (although the specific parameters that characterize the various components

of the model will likely vary from one context to another). Likewise, the model might apply

nicely to new customers who are first encountering an existing product/service. For instance, as

Internet “newbies” first learn about various websites, their behavior over time should conform

to the basic set of assumptions outlined here; this would be a very promising area for future

investigation.

As we run the model over multiple products/services, it will be instructive to look for “meta-


patterns” in the resulting model parameters. Our empirical analysis revealed one particular type

of nonstationary behavior, but it would be useful to catalogue different forms of nonstationarity

(and covariate effects) and begin to associate them with product characteristics or other external

measures. Many firms (e.g., BASES) attempt to build databases covering hundreds or thousands of products

using simple sales summaries to enable early forecasts for new launches. Such a process can be

greatly enhanced by using the parameters from a complete (and behaviorally plausible) model

rather than relying strictly on summary statistics (such as repeats per repeater and the other

measures we discussed earlier). As our field continues to make rapid advances with hierarchical

Bayes methods, this task should become a workable possibility, even for practitioners, in the

near future.

Finally, one issue not addressed here, but sometimes considered in the context of new product

sales, is the role of competition. Our experience with trial-repeat modeling mirrors that of firms

such as BASES, who have found that accurate forecasts rarely require any explicit accommodation

of competitive effects. Nevertheless, it is interesting to think about how new product entry

can affect, and be affected by, existing market structures (see Bronnenberg et al. (2000) for

a recent review of this literature). Beyond the approaches that other researchers have employed

(mostly post hoc econometric models that were not intended for forecasting purposes),

we are intrigued by an extension of our product-specific stochastic model to one that

can deal with sales patterns (and perhaps marketing activities) for rival products. So while we

view our integrated model as offering a reasonably accurate and managerially useful picture of

the trial-repeat process for a given new product, we see it as just one step towards the creation

of a “Holy Grail” model that builds in competition and other category-level phenomena to be

able to anticipate the complete set of market dynamics that surround a new product launch.


Appendix A

Let the non-zero elements of the vector Nh denote the times at which customer h made his

trial, first repeat, etc. purchases (if at all). For a given individual, we simulate the elements

of Nh in the following manner. We start by drawing a uniform random variate to determine

whether the consumer will ever make a purchase of the new product (with probability π0). If

this is the case, a value of λ is drawn from the gamma distribution. Using this value of λ and

the actual values of the covariates, we simulate an interpurchase time from the “exponential with

covariates” interpurchase time distribution. This gives us the consumer’s simulated value of t0,

the time of his trial purchase. If t0 > tf, the consumer is deemed to have made zero purchases

of the new product by time tf and the procedure moves on to the next consumer. If t0 ≤ tf,

we record the time of this purchase (Nh(0) = t0) and then draw a uniform random number to

determine whether the consumer retains his value of λ (with probability 1 − γ0) or whether a

renewal occurs (with probability γ0), in which case a new value of λ is drawn from the underlying

distribution. Another uniform random number is drawn in the process of determining the new

value of λ. With probability φ, a value of λ = 0 is drawn and the consumer is deemed to have

rejected the new product and the procedure moves on to the next consumer. If the new value of

λ is drawn from the gamma distribution (with probability 1 − φ), or no renewal has occurred,

another exponential with covariates interpurchase time is simulated and added to t0 to give

us the consumer’s simulated value of t1, the time of his first repeat purchase. If t1 > tf , the

consumer is deemed to have made only a trial purchase by time tf and the procedure moves on

to the next consumer. If t1 ≤ tf , we record the time of this first repeat purchase (Nh(1) = t1)

and the whole process continues for this consumer until tj > tf or a value of λ = 0 is drawn

when a renewal occurs, at which time the procedure moves to the next consumer.
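The procedure just described can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it assumes the covariate function A(t) = exp(x(t)β) is constant within each week (so `A[w]` applies on week w), that the gamma distribution is parameterized by shape r and rate α, and that the renewal probability after the depth-j purchase is supplied by a hypothetical callable `renewal_prob(j)`. All function and argument names are invented for the example.

```python
import numpy as np


def next_purchase_time(rng, lam, A, t_prev, t_f):
    """Draw the next purchase time after t_prev; return None if it exceeds t_f.

    Inversion method: the purchase occurs when lam times the integral of A
    since t_prev reaches a unit-exponential draw.
    """
    remaining = rng.exponential() / lam  # required integral of A
    t = t_prev
    while t < t_f:
        week = int(t)
        seg_end = min(week + 1.0, t_f)
        capacity = A[week] * (seg_end - t)  # integral of A over rest of week
        if remaining <= capacity:
            return min(t + remaining / A[week], seg_end)
        remaining -= capacity
        t = seg_end
    return None


def simulate_household(rng, pi0, r, alpha, phi, renewal_prob, A, t_f):
    """Return the list of simulated purchase times [t0, t1, ...] up to t_f."""
    times = []
    if rng.random() >= pi0:  # consumer never buys the new product
        return times
    lam = rng.gamma(shape=r, scale=1.0 / alpha)
    t, j = 0.0, 0
    while True:
        t = next_purchase_time(rng, lam, A, t, t_f)
        if t is None:  # next purchase falls after the horizon
            return times
        times.append(t)
        if rng.random() < renewal_prob(j):  # renewal after purchase j
            if rng.random() < phi:  # lambda = 0: product rejected
                return times
            lam = rng.gamma(shape=r, scale=1.0 / alpha)
        j += 1
```

Calling `simulate_household` once per panel member, with a shared `np.random.default_rng(seed)`, yields the collection of purchase-time vectors that the text denotes by Nh.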

Once we have simulated Nh for all individuals, we can compute total sales and its components

in the following manner:


$$
\begin{aligned}
T(t) &= \sum_{h=1}^{H} I\{0 < N_h(0) \le t\} \\
R_j(t) &= \sum_{h=1}^{H} I\{0 < N_h(j) \le t\} \\
R(t) &= \sum_{j=1}^{\infty} R_j(t) \\
S(t) &= T(t) + R(t)
\end{aligned}
$$

where I{·} is an indicator function which equals 1 if the logical condition is true, and 0 otherwise.

We repeat this simulation, say 100 times, and take the average of the run-specific S(t), T (t),

etc. This simulation-based approach will be used in the empirical analysis.
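A minimal sketch of this aggregation step is given below. It assumes each simulated Nh is stored as a plain list of purchase times (index 0 for trial, index j for the depth-j repeat, empty for non-buyers); the function names and data layout are hypothetical choices for illustration.

```python
def sales_components(histories, t):
    """Cumulative T(t), R_j(t), R(t) and S(t) for one simulation run."""
    max_depth = max((len(h) for h in histories), default=0)
    # T(t): households whose trial purchase occurred by time t
    trial = sum(1 for h in histories if len(h) > 0 and 0 < h[0] <= t)
    # R_j(t): households whose depth-j repeat occurred by time t, j = 1, 2, ...
    repeat_by_depth = [
        sum(1 for h in histories if len(h) > j and 0 < h[j] <= t)
        for j in range(1, max_depth)
    ]
    repeat = sum(repeat_by_depth)
    return {"T": trial, "R_j": repeat_by_depth, "R": repeat, "S": trial + repeat}


def average_over_runs(runs, t):
    """Average T(t), R(t) and S(t) across independent simulation runs."""
    results = [sales_components(histories, t) for histories in runs]
    return {
        key: sum(res[key] for res in results) / len(results)
        for key in ("T", "R", "S")
    }
```

Here `runs` would hold, say, 100 lists of household histories, one list per simulation run, and `average_over_runs(runs, t)` returns the run-averaged sales components at time t.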


Appendix B

Assumption (ii) states that the individual consumer interpurchase times follow the exponential

with-covariate distribution with survivor function and pdf given by equations (1) and (2). When

we replace this with the assumption that the individual consumer interpurchase times follow the

Erlang-2 with-covariate distribution with survivor function and pdf given by equations (13) and

(14), we arrive at new expressions for the renewal-pattern-specific likelihood functions:

i. For a household making no purchases in the calibration period (0, tc]:

$$
L(T_h) = (1 - \pi_0) + \pi_0 \left(\frac{\alpha}{\alpha + B(t_c, 0)}\right)^{r} \left[1 + \frac{r\,B(t_c, 0)}{\alpha + B(t_c, 0)}\right] \tag{A1}
$$

ii. For K > 0 purchases with no renewals (n = 0), we have

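$$
\begin{aligned}
L(T_h \mid w) = \pi_0 &\left\{\prod_{j=0}^{J} A(\tau_j)\,B(t_j, t_{j-1})\right\} \\
&\times \left\{\frac{\Gamma(r + 2(J+1))}{\Gamma(r)} \left(\frac{\alpha}{\alpha + B(t_c, 0)}\right)^{r} \left(\frac{1}{\alpha + B(t_c, 0)}\right)^{2(J+1)} \right. \\
&\qquad \left. \times \left[1 + \frac{(r + 2(J+1))\,B(t_c, t_J)}{\alpha + B(t_c, 0)}\right]\right\}
\end{aligned} \tag{A2}
$$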

When π0 = 1, this is the “Erlang-2/gamma, covariates” model considered by Gupta (1991).

iii. For n > 0 renewals, with the last renewal occurring immediately following the last purchase

(i.e., wn = J), we have

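$$
\begin{aligned}
L(T_h \mid w) = \pi_0 (1 - \phi)^{n-1} &\left\{\prod_{j=0}^{J} A(\tau_j)\,B(t_j, t_{j-1})\right\} \\
&\times \prod_{i=1}^{n} \left\{\frac{\Gamma(r + 2(w_i - w_{i-1}))}{\Gamma(r)} \left(\frac{\alpha}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{r} \left(\frac{1}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{2(w_i - w_{i-1})}\right\} \\
&\times \left\{\phi + (1 - \phi) \left(\frac{\alpha}{\alpha + B(t_c, t_J)}\right)^{r} \left[1 + \frac{r\,B(t_c, t_J)}{\alpha + B(t_c, t_J)}\right]\right\}
\end{aligned} \tag{A3}
$$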

iv. For n > 0 renewals, with the last renewal occurring some time before the last purchase

(i.e., wn < J), we have


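$$
\begin{aligned}
L(T_h \mid w) = \pi_0 (1 - \phi)^{n} &\left\{\prod_{j=0}^{J} A(\tau_j)\,B(t_j, t_{j-1})\right\} \\
&\times \prod_{i=1}^{n} \left\{\frac{\Gamma(r + 2(w_i - w_{i-1}))}{\Gamma(r)} \left(\frac{\alpha}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{r} \left(\frac{1}{\alpha + B(t_{w_i}, t_{w_{i-1}})}\right)^{2(w_i - w_{i-1})}\right\} \\
&\times \left\{\frac{\Gamma(r + 2(J - w_n))}{\Gamma(r)} \left(\frac{\alpha}{\alpha + B(t_c, t_{w_n})}\right)^{r} \left(\frac{1}{\alpha + B(t_c, t_{w_n})}\right)^{2(J - w_n)} \right. \\
&\qquad \left. \times \left[1 + \frac{(r + 2(J - w_n))\,B(t_c, t_J)}{\alpha + B(t_c, t_{w_n})}\right]\right\}
\end{aligned} \tag{A4}
$$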
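To see where the bracketed correction terms come from, the following sketches the derivation of (A1). It assumes that equation (13) takes the standard Erlang-2 form, so that the probability of no purchase by tc given λ is [1 + λB] exp(−λB) with B = B(tc, 0), and that λ follows a gamma distribution with shape r and rate α. Both are assumptions about equations stated earlier in the paper.

```latex
\begin{aligned}
E_\lambda\!\left[(1 + \lambda B)\,e^{-\lambda B}\right]
  &= \int_0^\infty (1 + \lambda B)\,e^{-\lambda B}\,
     \frac{\alpha^{r}\lambda^{r-1}e^{-\alpha\lambda}}{\Gamma(r)}\,d\lambda \\
  &= \left(\frac{\alpha}{\alpha + B}\right)^{r}
     + B\,\frac{r}{\alpha + B}\left(\frac{\alpha}{\alpha + B}\right)^{r} \\
  &= \left(\frac{\alpha}{\alpha + B}\right)^{r}
     \left[1 + \frac{r\,B}{\alpha + B}\right]
\end{aligned}
```

Weighting this by π0 and adding the probability 1 − π0 of a consumer who never buys gives (A1). The same gamma integral with additional powers of λ (two per observed purchase) produces the Γ(r + ·)/Γ(r) ratios in (A2) to (A4).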

Equations (A1)–(A4) replace equations (6)–(9), respectively. Consequently, equations (A1)–(A4)

together with (10)–(12) define the model as fitted to a given dataset when we assume the underlying

interpurchase times follow the Erlang-2 distribution.
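As an illustration of how the first two likelihood components might be evaluated numerically, the sketch below codes (A1) and the logarithm of (A2). It is not the authors' implementation: the function and argument names are hypothetical, the caller is assumed to supply the covariate terms A(τj), B(tj, tj−1) and B(tc, tJ) directly, and B is assumed to be additive over adjacent intervals so that B(tc, 0) is the sum of the gap terms plus the final censored term.

```python
import math


def lik_no_purchase(pi0, r, alpha, B_tc):
    """Equation (A1): likelihood for a household with no purchases in (0, tc]."""
    shrink = (alpha / (alpha + B_tc)) ** r
    return (1.0 - pi0) + pi0 * shrink * (1.0 + r * B_tc / (alpha + B_tc))


def loglik_no_renewal(pi0, r, alpha, A_at, B_gap, B_tail):
    """Log of equation (A2): J + 1 purchases and no renewals.

    A_at[j]  = A(tau_j) and B_gap[j] = B(t_j, t_{j-1}) for j = 0, ..., J;
    B_tail   = B(tc, t_J), the censored interval after the last purchase.
    """
    m = 2 * len(B_gap)              # 2(J + 1): two powers of lambda per purchase
    B_total = sum(B_gap) + B_tail   # B(tc, 0), assuming B is additive
    ll = math.log(pi0)
    ll += sum(math.log(a * b) for a, b in zip(A_at, B_gap))
    ll += math.lgamma(r + m) - math.lgamma(r)  # log-gamma avoids overflow
    ll += r * math.log(alpha / (alpha + B_total)) - m * math.log(alpha + B_total)
    ll += math.log1p((r + m) * B_tail / (alpha + B_total))
    return ll
```

Working on the log scale with `math.lgamma` is a design choice for numerical stability when J is large; the renewal cases (A3) and (A4) could be coded in the same style by looping over the renewal segments.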


References

Advertising Age (2000), “Safe at Any Speed?” January 24, 1, 12.

Armstrong, J. Scott (2001), “Evaluating Forecasting Methods,” in J. Scott Armstrong (ed.), Principles of Forecasting: A Handbook for Researchers and Practitioners, Norwell, MA: Kluwer Academic Publishers, 443–472.

Broadbent, Simon (1984), “Modelling with Adstock,” Journal of the Market Research Society, 26 (4), 295–312.

Bronnenberg, Bart J., Vijay Mahajan, and Wilfried R. Vanhonacker (2000), “The Emergence of Market Structure in New Repeat-Purchase Categories: The Interplay of Market Share and Retailer Distribution,” Journal of Marketing Research, 37 (February), 16–31.

Chatfield, C. and G. J. Goodhardt (1973), “A Consumer Purchasing Model with Erlang Inter-Purchase Times,” Journal of the American Statistical Association, 68 (December), 828–835.

Clarke, Darral G. (1984), “G.D. Searle & Co.: Equal Low-Calorie Sweetener (B),” Harvard Business School Case 9-585-011.

East, Robert and Kathy Hammond (1996), “The Erosion of Repeat-Purchase Loyalty,” Marketing Letters, 7 (March), 163–171.

Eskin, Gerald J. (1973), “Dynamic Forecasts of New Product Demand Using a Depth of Repeat Model,” Journal of Marketing Research, 10 (May), 115–129.

Fader, Peter S. and Bruce G. S. Hardie (1999), “Investigating the Properties of the Eskin/Kalwani & Silk Model of Repeat Buying for New Products,” in Lutz Hildebrandt, Dirk Annacker, and Daniel Klapper (eds.), Marketing and Competition in the Information Age, Proceedings of the 28th EMAC Conference, May 11–14, Berlin: Humboldt University.

Fader, Peter S. and Bruce G. S. Hardie (2001), “Forecasting Trial Sales of New Consumer Packaged Goods,” in J. Scott Armstrong (ed.), Principles of Forecasting: A Handbook for Researchers and Practitioners, Norwell, MA: Kluwer Academic Publishers, 613–630.

Fader, Peter S. and James M. Lattin (1993), “Accounting for Heterogeneity and Nonstationarity in a Cross-Sectional Model of Consumer Purchase Behavior,” Marketing Science, 12 (Summer), 304–317.

Fourt, Louis A. and Joseph W. Woodlock (1960), “Early Prediction of Market Success for New Grocery Products,” Journal of Marketing, 25 (October), 31–38.

Gupta, Sunil (1991), “Stochastic Models of Interpurchase Time with Time-Dependent Covariates,” Journal of Marketing Research, 28 (February), 1–15.

Hardie, Bruce G. S., Peter S. Fader, and Michael Wisniewski (1998), “An Empirical Comparison of New Product Trial Forecasting Models,” Journal of Forecasting, 17 (June–July), 209–229.

Helsen, Kristiaan and David Schmittlein (1994), “Understanding Price Effects For New Nondurables: How Price Responsiveness Varies Across Depth-of-Repeat Classes and Types of Consumers,” European Journal of Operational Research, 76 (July 28), 359–374.

Henderson, R. and J. N. S. Matthews (1993), “An Investigation of Changepoints in the Annual Number of Cases of Haemolytic Uraemic Syndrome,” Applied Statistics, 42 (3), 461–471.


Herniter, Jerome (1971), “A Probabilistic Market Model of Purchase Timing and Brand Selection,” Management Science, 18 Part II (December), P102–P113.

Howard, Ronald A. (1965), “Dynamic Inference,” Operations Research, 13 (September–October), 712–733.

Jeuland, Abel P., Frank M. Bass, and Gordon P. Wright (1980), “A Multibrand Stochastic Model Compounding Heterogeneous Erlang Timing and Multinomial Choice Processes,” Operations Research, 28 (March–April), 255–277.

Kalwani, Manohar and Alvin J. Silk (1980), “Structure of Repeat Buying for New Packaged Goods,” Journal of Marketing Research, 17 (August), 316–322.

Massy, William F. (1969), “Forecasting the Demand for New Convenience Products,” Journal of Marketing Research, 6 (November), 405–412.

Morrison, Donald G. (1969), “Conditional Trend Analysis: A Model that Allows for Nonusers,” Journal of Marketing Research, 6 (August), 342–346.

Morrison, Donald G. and David C. Schmittlein (1988), “Generalizing the NBD Model for Customer Purchases: What Are the Implications and Is It Worth the Effort?” Journal of Business and Economic Statistics, 6 (April), 145–159.

Pievatolo, Antonio and Renata Rotondi (2000), “Analysing the Interevent Time Distribution to Identify Seismicity Phases: A Bayesian Nonparametric Approach to the Multiple-Changepoint Problem,” Applied Statistics, 49 (4), 543–562.

Raftery, A. E. and V. E. Akman (1986), “Bayesian Analysis of a Poisson Process with a Change-Point,” Biometrika, 73 (1), 85–89.

Rangan, V. Kasturi and Marie Bell (1994), “Nestle Refrigerated Foods: Contadina Pasta & Pizza (A),” Harvard Business School Case 9-595-035.

Sabavala, Darius J. and Donald G. Morrison (1981), “A Nonstationary Model of Binary Choice Applied to Media Exposure,” Management Science, 27 (June), 637–657.

Schmittlein, David C., Donald G. Morrison, and Richard Colombo (1987), “Counting Your Customers: Who They Are and What Will They Do Next?” Management Science, 33 (January), 1–24.
