SITUATIONS MATTER: UNDERSTANDING HOW ...SITUATIONS MATTER: UNDERSTANDING HOW INDIVIDUAL BROWSING...

Journal of Electronic Commerce Research, VOL 21, NO 2, 2020

13

SITUATIONS MATTER: UNDERSTANDING HOW INDIVIDUAL BROWSING

SITUATION ROUTINENESS IMPACTS ONLINE USERS’ ADVERTISEMENT CLICKS

BEHAVIOR

Xianghua Lu

School of Management, Fudan University

Shanghai 200433, China

[email protected]

Yuxin Chen

NYU Shanghai

1555 Century Avenue, Shanghai 200122, China

[email protected]

ABSTRACT

Targeting the most valuable users and generating advertisement clicks from them is a veritable challenge for

online advertisers. Most of the extant research about behavioral targeted advertising focuses on matching

advertisements with the right people by profiling users’ latent interests. This study posits that users’ individual

browsing situations should also be examined with relation to the advertisement clicks. We introduce a new angle—

browsing situation routineness—to explore the impact of browsing situations on users’ advertisement clicks.

Supported by a unique clickstream data set, our empirical study indicates that consumers are more likely to click the

advertisements when they visit websites with high content routineness (CR), particularly in leisure-oriented websites.

However, time routineness (TR) works in the opposite manner; users tend to click fewer advertisements during their

routine browsing time, compared to non-routine browsing times. We also find that CR and TR have a complementary

joint effect to increase click-through rate. The present study contributes an alternative dimension—targeted advertising

based on individual browsing situation routineness—to research and practices about behavioral targeted advertising.

Keywords: Targeting; individual; Browsing behavior; Routineness; Online advertisement

1. Introduction

Online targeted advertising has been increasingly used to deliver advertisements to users who are most likely to

be influenced by them. Examples of recent innovations in online targeted advertising include behavioral targeting

(BT) (matching advertisements to a user based on her/his historic online behavior data), contextual targeting (matching

advertisements to a user based on the web content that she/he is viewing), and IP-based targeting or geo-targeting

[IAB 2010; Lambrecht & Tucker 2013]. In particular, BT has become the most popular approach to match

advertisements with the right people by using consumers’ collected behavioral data [Yan et al. 2009]. Various BT

approaches have been developed, such as retargeting [Lambrecht & Tuck 2013], browsing frequency-based targeting

[Lu et al. 2016], interests label system-based targeting [Lian & Xu 2018], and life stage mining methods [Bahari et al.

2015], to match advertisements with the right people by exploring users’ latent interests. While these extant BT

algorithms focus more on users’ personal profile, this study posits that users’ individual browsing situations also

matter on BT.

Research indicates that the situation around us is constantly affecting how we think and act [Sommers 2012;

Cialdini 2008], contexts of all types—where we are, whom we are with, and even whether we are in a hurry—change

our cognitive ability, and influence our behavior. Taking online advertisements as an example, the Wall Street Journal

(2016) reported that mobile and social-media advertising start displaying advertisements at a very unusual time slot,

late at night1. The browsing situation at midnight is perceived as a relatively quiet and minimally interruptive personal

time for most users. Would users click more advertisements during this period of inactivity compared to routine times?

In this study, driven by the importance of the browsing situation, we investigated users’ behavioral responses to the

advertisements in different situations. In particular, we were interested in the perspective of browsing situation

1 It is largely due to the increasing number of people staying awake at midnight in the current “always online” age.

mailto:[email protected]

Lu & Chen: Situations Matter

Page 114

routineness, which captures the online users’ familiarity on a specific browsing situation, and examines its relationship

with users’ advertisement click behavior.

Garfinkel (1964: 244) defined routineness at the individual level is the activities by which “persons discover,

create, and sustain standardization”. By definition, routineness means low variability and high predictability [Perrow

1967]. Behavioral data collected by online intermediaries can provide us with a unique empirical context to distinguish

users’ routine browsing situations from their non-routine situations, and enable us explore how the browsing situation

affects users’ click behavior. The authors of this study developed two measurements for individual browsing situation

routineness: browsing content routineness (CR) and time routineness (TR). Unlike the extant behavioral targeting that

pays more attention on what users browsed in the past, CR and TR extend the browsing behavior into where and when

the browsing behavior happens, and provide more perspectives for the behavioral targeting advertising research and

practices. Our empirical results indicate that consumers are more likely to click the advertisements on websites with

high CR, particularly in leisure-oriented websites. However, the TR works the opposite way: users tend to click more

during non-routine times. We also find that TR and CR have a complementary joint effect that increases click-through

rate (CTR).

This study offers several implications for research and practice. First, we provide a new angle (i.e., individual

browsing routineness) for online behavioral targeting research and practice. Browsing situation routineness can

capture the individual browsing context of each user for every advertisement’s exposure, thereby allowing advertisers

to link the right advertisements with the right browsing situation. Advertisers may improve their targeting precision

by combining this new perspective and achieve higher revenue from online advertisements. Second, traditional BT

practices emphasize matching advertisements with users’ latent interests profiles. The authors of this study observed

that high individual browsing situation routineness can improve the CTR without involving any complicated interests

profile computing and matching process. This result demonstrates that browsing situation creates click value for online

advertisements. This can provide the advertisement platforms or content websites with a new pricing method to display

advertisements by analyzing their users’ browsing routineness. Third, most published BT studies have focused on

targeting the right people; the appropriate times to display the advertisements have seldom been considered. In this

study, we fill this gap by investigating the effects of users’ browsing TR on CTR. The findings indicate that

advertisements that reach the audience during non-routine times can have higher performance than those that target

TR.

The paper is organized as follows: In Section 2, we present an overview of literature that is relevant to our

research. Section 3 introduces the research context and the operationalization of browsing situation routineness.

Section 4 presents our empirical model and our analysis of the results, followed by the robustness analysis at Section

5 and a discussion of theoretical and practical implications of this research at Section 6.

2. Literature Review

This study is relevant in two streams of literature: studies about online BT and studies about situational factors.

In addition, we introduce the optimal stimulation level theory to explain why situational factors influence users’

behavior.

2.1. BT

BT refers to approaches where online advertisements are delivered to a user based on their historic online behavior

data. Advertising intermediaries use consumer-tracking technologies, such as cookies, beacons, and video cookies, to

record consumers’ online behavioral information. According to a study by The Wall Street Journal, the 50 most

popular websites in the United States installed more than 3,000 tracking files on a single computer [Angwin 2010].

Data are collected to track most aspects of users’ online behavior; pages visited, browsing trails, searches, and click-

troughs to predict users’ interests.

Yan et al. (2009) examined whether and how BT could help online advertising using real-world advertisement

click-through logs. They segmented users based on their past browsing and search behavior and compared

advertisement responses across segments. Their results suggest that if BT is used, CTRs of online advertisements can

be improved by as much as 670%. Goldfarb and Tucker (2011) noted that consumers’ purchase intent was lower after

viewing online display advertisements in the countries that enforced privacy regulations that restricted advertisers’

ability to collect consumer data used for targeted advertising. These results prove the value of consumer data for BT.

Hoban and Bucklin (2015) conducted an online randomized field experiment and noted that targeted advertising

positively affects visitation to the firm’s website for users in most stages of the purchase funnel. Moriguchi et al.

(2016) later found that targeted advertising is effective in the lower purchase funnel, such as the search and evaluation

stage, but ineffective in the upper funnel, where customers only browse the products.

The aforementioned studies empirically illustrated that overall BT could have a favorable effect on consumers’

actions. However, the effects of BT are heavily dependent on the algorithms used by advertisement platforms. Trusov


Page 115

et al. (2016) pointed out that the collection of consumer information is easy but the extraction of meaningful behavior

patterns from such data remains a largely open question. Most of the current BT methods use visit frequency to proxy

users’ interests or familiarity with a certain area, and attach a relevant label on users [Lu et al. 2016]. For example, if

customer A has browsed automobile-related content 10 times and customer B has only browsed it 3 times, customer

A would be defined as “interested in cars”, whereas customer B may not. Therefore, in the future, customer A is more

likely to be exposed to automobile-related advertisements than customer B.

However, this browsing frequency-based targeting strategy could be biased as it ignores users’ overall browsing

behavior. Taking customer A and customer B as examples again, if customer A’s total browsing is 1,000 pageviews,

then 10 visits to automobile-related pages occupy 1% of customer A’s browsing history. Suppose customer B viewed

100 pages but visited automobile-related pages 3 times. Therefore, automobile-related pages occupy 3% of customer

B’s browsing history. From this perspective, customer B could have higher familiarity or routineness towards

automobile than customer A (3% > 1%). Therefore, ignoring the overall pattern of users’ browsing behavior could

cause misleading results when predicting users’ true interests. In addition, pushing the advertisements that are relevant

with users’ past behavior or interests may heighten users’ awareness of online tracking and privacy violations. Users

tend to have stronger reactions when they view advertisements that are closely related to their past behavior [Lu et al.

2016].

Due to the problems related to frequency-based BT, some increasingly comprehensive targeting algorithms have

been developed. Bleier and Eisenbeiss (2015) proposed a two-dimensional conceptualization of advertisement

targeting: targeting depth defines how closely the advertisement reflects a consumer’s interests, and targeting breadth

determines how completely the banner reflects these interests. The consumers’ click-through intentions with

advertisements targeting depth and breadth were examined. Trusov et al. (2016) developed a correlated topic model-

based algorithm to uncover individual user profiles from online browsing data, which could summarize a consumer’s

interests and preferences efficiently.

Although the aforementioned studies sought to profile users’ latent interests precisely and comprehensively for

the purpose of BT, BT research is still at its nascent stage. This study proposes that aside from users’ inherent interests

or personalities, the individual browsing situation of each user when they are exposed to the advertisements would

also affect advertisement click behavior. Therefore, in this study, we will extend the literature about online

advertisements targeting from a new angle (i.e., the individual browsing situation), and examine how this new

perspective might influence users’ advertisement click behavior.

2.2. The Impact of Situational Factors

Situational factors are described as “factors particular to a time and place of observation…and which have a

demonstrable and systematic effect on current behavior” [Belk 1975]. Situational factors can be classified into five

groups: physical surroundings; social surroundings; temporal perspective; task definition; and antecedent state [Belk

1975]. These factors can temporarily alter users’ preferences, attitudes, or intentions, thereby changing their permanent

perspective into a situational perspective, and therefore, transforming their behaviors [Simon & Usunier 2007]. For

example, a user who is not in a hurry may react quite differently to a service encounter than a user who is feeling

rushed. By including factors specific to particular times and places, advertisers may be able to fully understand the

motivations and behaviors that influence users’ advertisement click behavior.

The importance of situational factors on decision making or human behavior has been explained in numerous

theories, such as the organizational contingency theory [Fiedler 1964] and situational leadership theory [Hersey &

Blanchard 1969], which claims the optimal course of action is contingent upon the internal and external situation. In

research about individual decision making, situational factors were also found to have a constant effect on how we

think and act [Sommers 2012; Cialdini 2008; Li et al 2016]. Some browsing situations even provide greater predictive

power than demographic or personal factors alone [Belk 1975; Stanton & Bonner 1980; Sommers 2012]. For example,

in a study about helping behavior, Darley and Batson (1973) noted that after controlling participants’ dispositional

factors, only 10% of participants in situations with limited time would offer help, whereas 64% of participants would

offer help in situations with unlimited time. In another study about the adoption of self-service technology at retail

stores, Collier et al. (2015) noted four situational factors that affected user behavior: order size, wait-time tolerance,

location convenience, and employee presence. However, in research about online targeted advertising, to the best of

our knowledge, only a few studies have incorporated situational factors.

Contextual targeting may partially take into account the context of advertisements displayed by delivering online

advertisements based on the web content that users are viewing [Goldfarb & Tucker 2011; Zhang & Katona 2012].

The assumption of contextual targeting is that users who browse a web page about automobiles are more likely to be

interested in cars than those browsing a web page about health care. However, this type of contextual targeting entirely

neglects users’ past experiences, because automobile advertisements will be displayed to all users that visit automobile

web pages; such a policy is not sufficient to capture users’ heterogeneity. Different people may perceive and respond


Page 116

to the same situation differently because of their discrepant backgrounds and experiences. Therefore, in this study, we

propose a new angle called the individual browsing situation, which attempts to integrate users’ individual experiences

with their situation, and examines how this individualized browsing situation will affect users’ motivations and

advertisement click behaviors.

Based on the features of online targeted advertising, we operationalize the particular individual browsing situation

from two aspects: browsing CR and browsing TR. Routineness is defined as “repetitive, recognizable patterns of

interdependent actions, carried out by actors” ([Feldman & Pentland 2003]. Routineness is a direct reflection of

experience because it is the learning brought about by the repetition of similar problem-solving actions [Freeman &

Hannana 1983]. By definition, routineness indicates low variability and high predictability [Perrow 1967]. In this

study, we use this concept to proxy users’ familiarity for a particular browsing situation. The routineness value of each

browsing is calculated by the similarity of new browsing situations with users’ past browsing experience.

Nonetheless, routineness is not a static concept. Sacks (1984) described the remarkable lengths to which people

go to believe that even the most calamitous situations (e.g., airplane hijackings) are ordinary. In the context of our

research, if a user always stays awake at midnight, the browsing behavior at midnight is not considered non-routine.

Therefore, each user may have their own individual browsing situational routineness value at a specific place and

time, and this value always changes over time.

2.3. The Optimal Stimulation Level Theory

The optimal stimulation level theory could be used to illustrate the underlying rationale that individual browsing

situational routineness can affect users’ advertisement click behavior. The optimal stimulation level theory postulates

that individual behavior is influenced by the intrinsically motivated desire to reach a specific level of stimulation,

which is termed “optimum stimulation level” [Berlyne 1960]. Consumer research has established that this level of

optimum stimulation varies between individuals [Raju 1980]. When the stimulation derived from the environment is

too low, individuals will attempt to increase stimulation. When the stimulation is too high, individuals will attempt to

decrease stimulation. Psychological pleasantness is highest at the optimal stimulation level, the level of stimulation at

which a person feels most comfortable [Steenkamp & Baumgartner 1992].

Behavior, such as risk-taking, variety-seeking, or curiosity-motivated behavior, aimed at modifying

environmental stimulation to the optimum level of stimulation, is called “exploratory behavior”. In numerous settings,

it has been noted that these general explorative behaviors affect the willingness to choose new or familiar products on

a repurchase occasion, to click repeated advertisements [Baumgartner & Steenkamp 1996], or to seek extra

information [Raju 1980].

In the context of our research, we use this theory to explain how high or low browsing situation routineness affects

users’ advertisement click behavior. We postulate that when users are in highly routine situations (environments)

where they feel that the situation is familiar, the stimulation from this new situation could be low; therefore, users may

like to seek more variety, or become more curious about the information provided by advertisements, and therefore,

be more likely to click the advertisements. Whereas in non-routine situations, users’ environmental stimulation may

already be very high; therefore, users will be less likely to click the advertisements to search extra information. Thus,

we postulated the following hypotheses:

H1: Users are more likely to click advertisements on the high browsing CR web pages than those on the lower

browsing CR web pages.

H2: Users are more likely to click advertisements during the high browsing TR periods than the lower browsing

TR period.

H3: Browsing CR and TR have a positive complementary effect in terms of the increase in click probability for

advertisements.

3. Data

3.1. Research Context

The aim of this study is to explore the impact of browsing situation routineness on users’ click behavior. We

obtained data from one of the largest online advertising intermediaries in China. This company provided three months

[from 1 May 2014 to 31 July 2014] of website clickstream data from 139,000 randomly drawn users. When users

browsed the network websites of this company, relevant advertisement was displayed on the webpage. The contents

of advertisements are various, range from automobile, cosmetics, FMCG et al. Very few optimization algorithm was

adopted during our data collection period, therefore, the same ads would be displayed if users visit the same website

at the same time. The intermediary company collected the information of each advertisement display, includes users’

cookieID, browsing time, browsing web page URL that advertisements were displayed on, advertisement IDs, web

domain, and advertisement click behavior. These users have generated a total of approximately 803,000

advertisements display and click observations.


Page 117

We excluded the website browsing observations occurred in mobile devices (less than 2% of the whole dataset)

because users may have different click proneness when using mobile phone to browse online advertisement. In

addition, to ensure each user has sufficient browsing history, we only keep users who have more than 15 observations

in our dataset, and we eventually have about 6,000 users with total 422,000 observations, which every user had

approximately 72 observations.

A unique feature of this dataset is that the intermediary conducted a keyword frequency count for each web page,

therefore, we have the keywords’ information for each URL. For example, for the web page “http://2014.ifeng.com/”,

the intermediary extracted 10 keywords and their frequencies, (i.e., Germany: 20; Soccer:4; Wechat: 3; Wechat

Moments: 2; Beijing: 2; Columbia: 1; Weibo: 1; Music: 1; Japan: 1; Europe: 1). These keywords allowed us to measure

the CR of each user’s individual browsing situations, which we elaborate in the following section.

3.2. Data Pretreatment

From the 146,516 web pages in the data set, 2,233 unique keywords were extracted. With these keywords, we

were able to aggregate web pages into some groups by calculating the content similarity of each web page. Ideally,

we would directly cluster these web pages with these 2,233 keywords. However, due to numerous missing values, this

146,516 × 2,233 matrix was a very sparse matrix; the clustering results would have been biased if we had conducted

the clustering based on this sparse matrix. Therefore, we conducted two steps of pretreatment before the clustering:

First, in the web page dimension, instead of focusing on individual web page levels, we used the domain

information as the analysis level. The domain information of the browsed URL is usually included in the clickstream

data. In our case, approximately 172 web domains were involved. Web domains include certain channel information;

for example, the website ifeng.com has more than 10 domains, such as news.ifeng.com and sports.ifeng.com. An

analysis at the domain level may help us aggregate the web page into a higher level without losing too much

information.

Second, for the dimensions of keywords, we combined the 2,233 keywords into 352 keyword groups. We intended

to combine keywords with the same meaning but different expressions. For example, terms such as Germany and

German were combined into one keyword. We also combined the names of movie stars as one group. Three MBA

students were hired to perform this job and were asked to group the keywords based on their own understanding.

During the grouping task, for any keywords that were ambiguous or controversial, we used a voting mechanism to

decide the final results. Finally, we had 352 unique keyword groups.

From this new 172 × 352 matrix, we adopted the term frequency–inverse document frequency (TF-IDF) algorithm

to compute the similarity of keywords across each web domain, and the Gaussian mixture model (GMM) generated

39 content groups from these web domains. Appendix 1 presents the algorithm used to cluster the web pages. Table 1

lists the typical web domains that belong to the top six content groups. We label these six groups with a possible tag

for further discussion.

http://2014.ifeng.com/


Page 118

Table 1. Clustering Results of the Top Six Content Groups

Rank of

content

groups

Examples of included domains Possible tag

Group1

tv.67.com；news.67.com；gb.cri.cn；music.67.com；emotion.miss-no1.com；

mz.mop.com；news.eastday.com；big5.chinabroadcast.cn；enjoy.eastday.com

；tt.mop.com；www.gucheng.com；www.xici.net；big5.cri.cn；

roll.eastday.com；baidu.baomihua.com；joy.online.sh.cn；society.eastday.com

；star.67.com；yule.jinti.com

Entertainment

related

Group 2

www.quanjing.com；quanjing.com；sh.eastday.com；www.ddmap.com；

history.eastday.com；sheying.eastday.com；beijing.jinti.com；changsha.ddmap.com

Location

related

Group 3

newcar.xcar.com.cn；auto.163.com；topic.qc188.com；www.qc188.com；

info.xcar.com.cn；iask.qc188.com；config.qc188.com；drive.xcar.com.cn；

www.3quan.com；news.52che.com；club.qc188.com；auto.cri.cn；

review.qc188.com；video.qc188.com；gai.qc188.com；auto1.cri.cn

Auto related

Group 4

health.miss-no1.com；iask.qc188.com；enjoy.eastday.com；pk.qc188.com；

shenzhen.jinti.com；mil.eastday.com；www.ddmap.com；photo.jschina.com.cn

；yule.jinti.com；collection.aili.com

Virtual

community

Group 5

2014.ifeng.com；pinglun.eastday.com；finance.gucheng.com；

www.etnet.com.cn；emotion.mop.com；mil.eastday.com；column.etnet.com.cn

；forex.gucheng.com；world.eastday.com；pic.miss-no1.com；cat.abang.com

Economy and

finance related

Group 6

www.miss-no1.com；beauty.aili.com；lifestyle.aili.com；health.aili.com；

zhuangban.onlylady.com；love.aili.com；celeb.aili.com；fashion.aili.com；

bbs.aili.com；auto.aili.com；cosmetics.aili.com；family.aili.com；art.aili.com

Woman related

It should be noted that each domain does not exclusively belong to one cluster. We computed the probability of

each domain belonging to these 39 content groups, and got a 172 × 39 probability matrix 𝑃(𝐷𝑚 ∈ 𝐶𝑘)(𝑚 =1,2 … 172, 𝑘 = 1,2 … 39) after the GMM calculation. The probability vector for the domain “health.miss-no1.com”

is presented in Table 2 as an illustrative example. Using the probability matrix in the clustering process enabled us to

capture the different content perspectives for a specific domain, and preserve more information than the dummy

variable method2. For the purpose of comparison, we also conducted a robustness analysis of the dummy variables.

Table 2. Probability Vector of the Web Domain

domain V1 V8 V2 V6 V3 V22 V5 V31 V33

health.miss-

no1.com 0.00074 0.03784 0.00003 0.00001 0.00004 0.95763 0.00123 0.00240 0.00009

3.3. The Operationalization of Variables

We used two variables to measure the routineness of users’ browsing situation: CR and TR. CR measures the

degree of content familiarity when users browse a new web page. TR measures the frequency with which users

routinely browse websites during a specific time slot. We also measured the users’ non-routineness behavior by coding

whether it is the first time that users visited this type of content, and whether it is the first time that users browsed the

Internet at this time slot.

We used the data from the first month (2014 May) as the historic information baseline for the calculation of June

and July’s routineness value. The routineness value of May could be biased without the historic data; therefore, we

did not examine the relationship between the routineness value of May with users’ click behavior, only the data from

June and July were used in our empirical analysis.

2 In the case of health.miss-no1.com, we set the dummy v22 = 1 and other 38 content groups as 0 under the dummy variable

coding method.


Page 119

3.3.1. Individual Content Routineness of Browsing Situation

Let 𝐶𝑘 denote the kth content group (𝑘 = 1,2,3, … ,39), with i as the user. Based on the data from May 2014, we

could have 𝜃𝑖𝑘 =∑ 𝑁𝑖𝑑∗𝑃(𝐷∈𝐶𝑘)172

𝑑=1

𝑁𝑖 as the ratio of user i’s browsing frequency at the kth content group among all their

browsing. 𝑁𝑖𝑑 is the number of visits on domain D, and 𝑁𝑖 is the total visit time during May. 𝑃(𝐷 ∈ 𝐶𝑘) is the

weight to adjust the value for each domain on 39 content groups. The matrix 𝜃𝑖𝑘 represents user i’s share of visits

across 39 content groups in May; we used this matrix to proxy users’ baseline familiarity on each content groups.

When user i browses a new web page in June and July, we compute a CR value to reflect how familiar this user

is with this web page’s content. If user i has high familiarity with one specific content group k with high 𝜃𝑖𝑘, and

simultaneously, the web page domain they are visiting has high relevant probability on content group k with higher

𝑃(𝐷 ∈ 𝐶𝑘), then user i should have a high CR value for this browsing situation. We set the CR value for 𝐶𝑅𝑖𝑚 as:

𝐶𝑅𝑖𝑚𝑡 = ∑ 𝑃(𝐷𝑚𝑡 ∈ 𝐶𝑘) ∗ 𝜃𝑖𝑘39𝑘=1 (1)

where m is the domain ID that user i is browsing at time t. For example, if user i’s baseline familiarity on three

content groups (i.e., 𝜃𝑖𝑘) are (0.1, 0.6, 0.4) based on the data from May 2014. This user visits two domains, A and B,

on June 2014, and the relevant probability of A and B on these three content groups are (0.2, 0.4, 0.4) and (0.1, 0.8,

0.1), respectively. With the aforementioned equation, user i’s routineness value on domain A is (0.2, 0.4, 0.4) × (0.1,

0.6, 0.4)’ = 0.43, while their routineness value on domain B is (0.1, 0.8, 0.1) × (0.1, 0.6, 0.4)’ = 0.53. Therefore, user

i has higher CR on domain B than domain A.

After each calculation of routineness value, we also need to update 𝜃𝑖𝑘 to incorporate user i’s new browsing

behavior.

𝜃𝑖𝑘𝑛𝑒𝑤 =

∑ 𝑁𝑖𝑑∗𝑃(𝐷∈𝐶𝑘)172𝑑=1

𝑁𝑖+1 (2)

For the next round of the routineness value calculation of user i, we need replace 𝜃𝑖𝑘 with this updated 𝜃𝑖𝑘𝑛𝑒𝑤

in Equation (1).

With the aforementioned algorithm, we can compute a unique CR value for each user on each new browsing

situation in June and July.

3.3.2. Individual Time routineness of Browsing Situation

The TR of browsing behavior measures the frequency with which a user routinely browses websites during a

specific time slot. In this study, we set each hour as a time slot Hh (h = 1,2….24). We used the data from May to

calculate the baseline TR for each user at each hour, the vector 𝜋𝑖ℎ =𝑁𝑖ℎ

𝑁𝑖 (h = 1,2….24) captures user i’s browsing

situation routineness distribution among all their browsing during 24 hours.

We use Equation (3) to calculate the TR value for June and July’s observation. If user i visits a web page at time

t (belongs to time slot 𝐻𝑡), we retrieve the 𝜋𝑖ℎ vector for each user, and compute the distance between t with h to

proxy the probability of t belonging to time slot h.

𝑇𝑅𝑖𝑡 = ∑ 𝑃(𝑡~ℎ) ∗ 𝜋𝑖ℎ23ℎ=0 (3)

In Equation (3), we set up the probability function 𝑃(𝑡~ℎ) to compute the distance between t and h.

𝑃(𝑡~ℎ) =exp (−β𝑑ℎ)

∑𝑚exp (−β𝑑ℎ) (4)

where h = 0,1,…23, Σm is the summation of m = 0,1,…23, dh is the minimum distance between h + 0.5 and t

with hour as the unit, and β > 0. For example, if t = 0:10, d0 = 0.5 − 1/6 = 1/3 (10 minute equals to 1/6 hour), d1 = 1.5

− 1/6 = 1 1/3, d2 = 2.5 − 1/6 = 2 1/3,… d11 = 11.5-1/6 = 11 1/3, d12 = 11.5 + 1/6 = 11 2/3, d13 = 10.5 + 1/6 = 10

2/3,… d23 = 0.5 + 1/6 = 2/3. With the aforementioned specifications, P (t~h) is a probability measure with Σh P (t~h)

= 1. β is the weight of the distance between t on each h. When β is larger, the higher weights are on closer time slots.

We assume β = 1 to begin with and then check β = 0.5 as robustness check.

After computing the TR value 𝑇𝑅𝑖𝑡, we update 𝜋𝑖ℎ again for the next round of calculation of user i’s TR value.

𝜋𝑖ℎ𝑛𝑒𝑤 = {

𝑁𝑖ℎ

𝑁𝑖+1 𝐻𝑡 ≠ ℎ

𝑁𝑖ℎ+1

𝑁𝑖+1 𝐻𝑡 = ℎ

(5)

3.3.3. Non-Routineness Measurement of Browsing Situation

The aforementioned two variables capture the individual routineness for each browsing situation. For the purpose

of comparison, we also coded two non-routineness variables: hourfirst and webfirst. Sommers (2012) indicated that

being the first instance is a good measurement of an exceptional situation. Users tend to be more cautious when they


Page 120

are in an exceptional situation, and are less likely to click the advertisements to increase their stimulation level. We

used these two variables to capture whether users would have different advertisement click behavior in non-routine

browsing situations. The variable hourfirst indicates whether user i visited this web domain for the first time at time

slot h. The variable webfirst indicates the first time user i visited a web domain from a content group that they have

visited at time t.

3.3.4. Control variables

In addition to our focal variables, CRit, TRit, and two non-routineness variables, we included several groups of

control variables. The first group of variables indicated users’ past browsing frequency on this type of web domain.

Chatterjee et al. (2003) and Braun and Moe (2013) have indicated that consumers might respond differently when

they visit a website for the first time or after several times. Multiple exposures may enhance the impression of

advertisements and lead to a higher click rate. Therefore, we included two variables, ImpCidVisitit and ClkCidVisitit,

into our model as users who repeatedly viewed or clicked on an advertisement might respond differently to those who

occasionally saw it. ImpCidVisitit is the total previous frequency of user i for the web domain they are browsing at

time t. ClkCidVisitit is the total previous click frequency of user i for the web domain they are browsing at time t.

The second group of variables was the visitors’ proneness to clicking on the advertisements. We characterized a

visitor’s inherent disposition to click using the variable adClickProneid(t), which is measured by the total clicks of

user i divided by their total visits in content group d(t) when they are browsing at time t. It is an aggregated variable

that we combine the first month click and browsing behavior of each of the 39 content groups together to reflect users’

click tendency.

We also coded two user-related variables to reflect users’ idiosyncratic characteristics. One variable is Cityid,

whereas the other variable is the browser that the user is using to browse the website. We set Cityid = 1 if a user was

located in a big city such as Beijing or Shanghai; and Cityid = 7 if a user was located in a small village3. According to

a survey, users’ preferred Internet browsers can indicate their age; Internet Explorer users are usually older, whereas

Chrome users are usually younger: (https://www.elie.net/blog/web/survey-internet-explorer-users-are-older-chrome-

seduces-youth). Therefore, we coded browser = 0 to represent Internet Explorer and browser = 1 to represent Chrome

and Firefox, whereas browser = 2 represented Safari and Opera. Table 3 compiles the definitions of the main study

variables.

Table 3. Definitions of the Main Variables

Variable Type Variable Name Definition

Dependent

variable Clickit

If user i clicks the advertisements on the web page that they are

browsing at time t, Clickit = 1, otherwise Clickit = 0

Independent

variables

CRit The CR of user i when they are browsing a web page at time t

TRit The TR of user i when they are browsing a web page at time t

hourfirstit The non-routineness value of user i for time slot 𝐻𝑡

Webfirstit The non-routineness value of user i for the web domain they are

browsing at time t

Control

variables

ImpCidVisitit The browsing frequency of user i for the web domain they are

browsing at time t

ClkCidVisitit The advertisement click frequency of user i for the web domain they

are browsing at time t

𝑎𝑑𝐶𝑙𝑖𝑐𝑘𝑝𝑟𝑜𝑛𝑒𝑖𝑡 The advertisement click proneness of user i for the content group

they are browsing at time t

User

characteristics

Cityidi Cityid = 1 if a user is located in a big city, and Cityid = 7 if the user

is located in a small village

Browersi 0 represents IE, and 1 represents Chrome and Firefox, whereas 2

represents Safari and Opera

3.4. Descriptive Summarization

Table 4 reports the descriptive summary for the main study variables. We observed that, on average, these users

click 7.1% of the advertisements displayed to them. The average CR is 0.62, which suggests that 62% of the browsing

content is similar with those in the user’s browsing history; this finding is consistent with that of research suggesting

users’ browsing behavior maintains a stable pattern (Montgomery, 2000). The average TR of consumers is much

3 The city code information was retrieved from http://j.news.163.com/docs/17/2016031611/BI9GLPKE05248EIC.html

https://www.elie.net/blog/web/survey-internet-explorer-users-are-older-chrome-seduces-youth

https://www.elie.net/blog/web/survey-internet-explorer-users-are-older-chrome-seduces-youth


Page 121

lower, only 13.07% of the browsing time has similar routineness, indicating that an individual’s browsing time varies

significantly each time.

Table 4. Descriptive Summary of the Main Variables

Mean SD Min. Max.

Click 0.0708 0.2565 0 1

CR 0.6204 0.3700 0 1

TR 0.1307 0.0937 0 0.462

Webfirst 0.161 0.367 0 1

Hourfirst 0.337 0.47 0 1

ImpCidVisit 63.2607 150.8246 1 7358

ClkCidVisit 2.9385 10.7428 0 276

adClickProne 0.0330 0.0930 0 1

Browser 0.7723 0.4656 0 2

CityID 3.2878 2.2662 0 7

Table 5. Correlation Matrix of the Main Variables

Click CR TR Hourfir

st

webfirst

t

Imp_ci

dvisit

Clk_cid

visit

adClick

Prone

Browse

r CityID

Click 1

CR 0.0999 1

TR 0.0448 0.0518 1

Hourfirst 0.0717 0.0965 0.3126 1

Webfirst -0.061 -0.5431 0.0303 -0.2915 1

ImpCidVisit 0.152 0.27 -0.0507 0.266 -0.1867 1

ClkCidVisit 0.1443 0.2444 -0.0517 0.236 -0.1811 0.8874 1

adClickProne 0.467 0.0195 0.0393 0.054 -0.0137 0.0707 0.0589 1

Browser -0.023 0.0616 -0.0474 -0.0834 -0.0169 -0.0387 -0.0314 -0.015 1

CityId 0.0014 -0.0958 0.0325 -0.0622 0.0791 -0.1264 -0.1168 -0.0003 -0.0032 1

Figure 1 presents the frequency distribution of browsing time; 8:00 AM and 9:00 AM are the peak times when

users browse the Internet. This finding is consistent with that of Internet traffic analysis reports (Brouwer, 2014)4,

mostly because this period is usually when we prepare to start work.

Figure 1. Browsing Frequency Distribution of Browsing Time

4 http://insights.wired.com/profiles/blogs/linking-browsing-patterns-and-user-conversions#axzz4LcZyBf4l

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 2 0 2 1 2 2 2 3

May June July


Page 122

4. Research Model and Estimation Results

We used a random-utility framework to model a user’s click decision. A user clicked on an advertisement when

their clicking utility exceeded a threshold level. In particular, for user i viewing the web page at time t, we used Clickit

to denote the user’s click binary response on the advertisements that displayed on this web page and uit to denote the

latent value. In the utility equation, in addition to the independent and control variables introduced earlier, we also

controlled the fixed effect of the content group by including the dummy variables ( tCgroup ), because users’

advertisement click tendency is not the same across different content groups. For example, research has indicated that

users are more likely to click advertisements on entertainment websites compared to economy-related websites

(Brouwer, 2014). The fix effect of 24 hour slot (ht) is also controlled to eliminate the impact of time on click

probability. it represents the random error that follows the type-I extreme distribution. The click equation is defined

as:

*0 1 2 3 1 2

3 4 7 t

1[ 0],

Im h

it it

it it it it it it it

it it i t it

Click u

u CR TR CR TR Hourfirst webfirst

pCidVisit ClkCidVisit Clickprone Cgroup

4.1 Main Results

Table 6 reports the main results of our logit regression model. Model 1 is the baseline without considering the

effects of users’ routineness variables. Model 2 is the main model that did not include users’ idiosyncratic variables.

Model 3 is the full model. We observed that the full model increased the hit rate of users’ click behavior from 18.99%

to 19.45%, suggesting that including individual browsing routineness variables can improve the prediction power.

Table 6. Logit Regression Results and Model Comparison

VARIABLES Model 1 Model2 Model 3

Routineness

variables

CR 0.710*** 0.710***

(0.0440) (0.0441)

TR -1.179*** -1.278***

(0.212) (0.212)

CR*TR 2.175*** 2.234***

(0.244) (0.244)

Non-routineness

variables

webfirst -1.353*** -1.080*** -1.078***

(0.0334) (0.0346) (0.0346)

hourfirst 0.241*** 0.261*** 0.273***

(0.0146) (0.0151) (0.0151)

Control

variables

ImpCidVisit 0.000612*** 0.000524*** 0.000583***

(3.46e-05) (3.50e-05) (3.53e-05)

ClkCidVisit 0.0115*** 0.0110*** 0.0112***

(0.000437) (0.000437) (0.000439)

adCliclprone 11.68*** 11.79*** 11.77***

(0.0582) (0.0587) (0.0589)

Users

demographic

variables

Browser dummy Included

Cityid dummy Included

GroupID Included Included Included

Hour fixed

effect

Included Included Included

constant -7.673*** -8.123*** -8.153***

(0.190) (0.193) (0.198)

Observation 801,998 801,998 801,998

Likelihood -140918.12 -140762.02 -140479.7

Hit rate(p=1) 18.99% 19.24% 19.45%

Standard errors in parentheses*** p < 0.01, ** p < 0.05, * p < 0.1

The results of Model 2 and Model 3 indicate a similar pattern. We interpret our results mainly based on Model

3. The results show that the coefficient of browsing CR was significantly positive, which suggests that users are more


Page 123

likely to click the advertisements if they are browsing higher CR websites. More specifically, with one unit increase

in users’ CR, the probability of users click on an advertisement will increase by 67% (= exp (0.710)/(1 + exp (0.710))).

This result echoed by the negative and significant non-routineness content browsing variable (webfirst); users are less

likely to click the advertisements for the web domain they first visited compared to the domains they have visited

before, thereby H1 is supported. This result is consistent with the optimal stimulation level theory that users tend to

be more active to seek new information or click advertisements when they are in an environment where stimulation is

low. However, in a new situation, where the stimulation level is already high, they are less likely to seek extra

information because when users visit a new web domain, they tend to have a very direct goal, and are, therefore, also

less likely to click the advertisements (Lu et al. 2016).

However, H2 was not supported by our empirical results. We observed that the coefficient of browsing TR was

significantly negative, whereas the coefficient of the non-routineness browsing time variable (hourfirst) was

significantly positive, which suggests that users are less likely to click advertisements during the routine browsing

time, but would be more willing to click if they first browse the Internet during this time slot. Therefore, the optimal

stimulation level does not apply to the click behavior of TR. This could be because, unlike higher CR, it can represent

higher familiarity with the browsing environment. The higher TR does not necessarily lead to low stimulation level.

In other words, time may not be one of the environmental factors that trigger explorative behavior to increase or

decrease the user’s stimulation level. The non-routine browsing time may be a newly available time for users. Users

tend to be less occupied, and therefore, more inquisitive about the new information provided by advertisements. This

result supports the practice that advertisers display advertisements late at night (Wall Street Journal, 2016). Because,

for most users, late night is a non-routine browsing time, and is also available as personal time with relatively few

interruptions. During this period, users are more likely to click advertisements to acquire new information.

The interactive term for CR and TR is significantly positive. This result indicates that even though TR negatively

affects users’ advertisement click behavior, it can enhance the effects of CR on click probability, thus H3 is supported.

This result suggests that if users visit a routine website at a routine time, they are more likely to click the

advertisements, compared to the situation where users are visiting a routine website at a non-routine time. To further

support this argument, we use the variable hourfirst as a proxy of non-routine time. In line with H3, the impact of CR

should be negatively moderated by the non-routineness browsing time variable (hourfirst). We replace CR*TR with

the interactive term between CR and hourfirst, and run the Model 3 again. The coefficient of the new interaction term

is -0.417***(S.E.=0.085), which strongly supports H3 again from the reversed perspective.

For the control variables, users’ past visits and click frequency were assumed to have a significantly positive

effect on their click probability, which suggests that the traditional frequency-based BT algorithm also works in our

study. In addition, the click proneness of a user was positively correlated with their click behavior, suggesting that

users were heterogeneous and click proneness could be a good predictor of advertisement clicks.

4.2 The Differential Effect of Browsing Situation on Content Groups

The estimation result indicates that the fixed effect of content groups plays an important role in users’ click

behavior. Therefore, in this section, we further examine how the content browsing situation of websites moderates the

effect of individual routineness value. Because the top six content groups occupied about 80% of the observations in

this study, we focused our analysis on only these six content groups (636,720 observations left).

We ran an interaction model by adding six interaction terms between the dummy groups and CR; the detailed

estimation results are listed in Appendix 2. Figure 2 presents the coefficient magnitude of each group. The baseline is

the coefficient of CR (0.82). The group 2 and 4 had more benefits from the CR than did other groups, whereas CR

had a negative moderating effect on group 3 and 5.


4

Figure 2. Magnitude of CR’s Impact Across Six Content Groups

As indicated in Table 1, we labelled group 2 and 4 as location-related and virtual community-related web domains;

both groups contain leisure-oriented websites. Figure 3 suggests that users with higher CR are more likely to click

advertisements when they are at these two types of websites, compared to other content websites. For group 3

(automobile-related websites) and 5 (economy and finance-related websites), high CR did not help to increase the

advertisement clicks. This could be because, compared to leisure-oriented groups (i.e., 2 and 4), the content of group

3 and 5 is more function -driven, the stimulation level of these websites is higher than that of virtual communities;

therefore, the role of CR is less important. We thereby postulate that the effects of browsing content routineness might

be more salient in the leisure-oriented websites.

5. Robustness Check

5.1. Excluding Users with Lesser Observations from the Analysis

One of the limitations of our data set is that we only used users’ web clickstream data in our focal intermediary

websites network. Users’ Internet browsing behavior that occurs on other websites outside of this network was found

to be lacking. This may affect the precision of the CR or TR value calculation because part of users’ browsing behavior

history is not available for us. In particular, the situational routineness value of users that had very few observations

in our data set could be biased. In this robustness analysis, we excluded these users and conducted the analysis only

with data from users who had more than 15 observations in our data set, which partially corrected the bias caused by

incomplete observations. The first column in Table 7 reports the result of these 422,244 users. We observed that our

results were consistent with that of the whole sample for users who had relatively complete browsing behavior history.

This suggests that our model is applicable to most users, regardless of whether they have complete or incomplete

browsing behavior history.

5.2. Analysis Excluding the Targeted Observations

In our observations, some advertisements had adopted BT techniques to reach users (the total number of targeted

observations is 41,220, about 5% of our observations). As revealed by the focal intermediary, the company mainly

used frequency-based targeting techniques. For example, as requested by an automobile brand advertiser, its

automobile advertisements only targeted users who have at least visited an automobile-related web page once. The

intermediary would search the database for users who reached this threshold, and display auto advertisements to them

no matter what type of web page they are visiting.

For these targeted users, their click behavior could be affected both by the specific targeting techniques and the

browsing situation. To rule out the influence from the current targeting techniques, we excluded these observations

during the robustness test. The new result of the remaining 760,795 observations is listed in column 3 of Table 7. The

pattern of the results remained unchanged and were very similar to our main results.

5.3. Analysis by Setting Beta = 0.5

While computing the TR, by default, we set beta as 1 to indicate the closeness of t to h, the higher weight it is. In

this robustness test, we set beta = 0.5 to test how the results would change. We observed that the TR mean value

0

0.5

1

1.5

2

2.5

1 2 3 4 5 6

Coefficients

Content Groups


Page 125

decreased from 0.13 to 0.098 (0.052). However, the estimation results of the test remained very similar to that of our

model, as indicated in the last column in Table 7.

Table 7. Regression Results for Robustness Checks

VARIABLES Model Exclude

low- visited users

Model Exclude BT

Observations

Model when

beta=0.5

Routineness

variables

CR 0.598*** 0.693*** 0.613***

(0.0546) (0.0456) (0.0529)

TR -0.600** -1.358*** -1.922***

(0.262) (0.219) (0.374)

CR*TR 2.041*** 2.331*** 3.956***

(0.316) (0.251) (0.437)

Non-

routineness

variables

webfirst -0.706*** -1.084*** -1.071***

(0.0486) (0.0358) (0.0345)

hourfirst 0.0958*** 0.274*** 0.283***

(0.0219) (0.0155) (0.0150)

Control

variables

ImpCidVisit 1.06e-05 0.000657*** 0.000588***

(4.10e-05) (3.62e-05) (3.52e-05)

ClkCidVisit 0.0168*** 0.0101*** 0.0112***

(0.000487) (0.000449) (0.000438)

adCliclprone 9.216*** 11.88*** 11.76***

(0.0784) (0.0607) (0.0587)

Users

demographic

variables

Browser dummy Included Included Included

Cityid Dummy Included Included Included

constant -7.403*** -8.414*** -7.871***

(0.258) (0.219) (0.192)

GroupID Included Included Included

Hour fixed effect Included Included Included

Observation 422,244 760,795 801,998

Likelihood -78285.1 -129814.12 -140061.02

Standard errors in parentheses*** p < 0.01, ** p < 0.05, * p < 0.1

6. Conclusion

This study postulates that not only do users’ personal interest profiles matter in BT, but their individual browsing

situations also affect their advertisement click behavior. We developed two measurements to capture individual

browsing situations: browsing CR and TR. Their relationship with users’ click behavior was empirically examined

with a unique dataset. Our results indicate that users are more likely to click the advertisements when they are visiting

high-CR websites, particularly for leisure-oriented websites. However, users tend to click more during non-routine

times than routine time (high-TR). We also observed that TR and CR have a complementary joint effect to increase

CTR.

6.1. Theoretical Implications

The results of this study contribute to the literature about online targeted advertising in several aspects. First, we

developed a new concept, browsing situation routineness, to examine how the situation familiarity would affect users’

click behavior. To the best of our knowledge, most of the current literature on BT focuses on users’ latent interests or

preference profiling. The impact of individual browsing situations on advertisement clicks have not been examined.

This study fills in this gap and provide a new perspective for the behavioral targeting advertising research.

Secondly, we used the optimal stimulation level theory to explain how the situation familiarity would affect users’

behavior. The results indicate that optimal stimulation level theory can only be used to explain the underlying rationale

of CR, but not TR. Users tend to be more inquisitive about the new information when they are less occupied at non-

routine browsing time, which might overshadow the effects of stimulation level. This finding highlights we may treat

the time factors in behavioral targeting in different way. The result also suggests that optimal stimulation theory might

not be applicable for time factors, which enrich this theory with a new perspective.

Thirdly, the complementary joint effect between CR and TR shows that although TR might not be an independent

factor to affect users’ stimulation level, it could enhance the effects of CR on click probability. The interaction effect


Page 126

is consistent with the optimal stimulation level theory that when users are doing routine activities at a non-routine

time, her/his stimulation level is even lower compared to the situation where users doing routine activities at a non-

routine time, thus more likely to click the advertisements. Our study provides an empirical foundation for further

theory exploration about the combined effects of different targeting factors on users’ click behavior.

6.2. Practical Implications

This study also has several practical implications for online advertisers and intermediaries. First, we developed a

methodology to measure users’ individual browsing situations. The algorithm to measure the CR and TR can be

quickly used by other advertisement intermediaries as they are built on the standard clickstream data.

Second, the empirical examination of the relationship between browsing situation routineness and users’ CTR

provides evidence that situation-based targeting can significantly increase CTR. In particular, for the advertisements

displayed on specific websites during a specific period, assigning advertisements to the users with higher CR and

lower TR can directly help intermediaries achieve higher revenue.

Third, we argue that the proposed browsing situation routineness approach may be valuable for numerous

applications beyond BT practices. For example, this new angle may also provide an extra pricing strategy for

advertisements displayed on advertisement platforms or content sites. If the targeted users have high CR values on a

specific website, advertisers on that specific website may need to pay more than they would need to pay on websites

with lower CR values. The situation routineness method can also address the privacy issue of traditional BT because

it is not purely based on users’ browsing history.

6.3. Limitations

This research had some limitations. First, this data set was provided by an advertisement intermediary; only the

Internet browsing data at the network of this intermediary was visible to us. Although we conducted the robustness

test to prove this is not a concern, we believe that if future research can extend this study with a complete browsing

clickstream data for each user, the results can be more convincing. Second, our data set involves more than 200 online

advertisements, and the intermediary did not provide the information about these advertisements to us; therefore, we

cannot control the effects of advertisements on users’ click behavior. Future research with information about the

advertisements may address this issue and enable an increasingly robust analysis. Third, the performance of this

algorithm partially relies on the cluster results of content groups. Due to the diversity of the contents on the websites

used in this study, we ended up with 39 content groups, we only retained the top six content groups to have a reasonable

moderating effect analysis. If online advertisers had predefined smaller numbers of content groups, the analysis may

have been more complete.

Acknowledgement

Authors would like to thank the financial support from The National Natural Science Foundation of China (Grant

Number 71872050, 71972043 and 71531006). We would also like to thank Za Tingting for her contribution on this

paper.

REFERENCES

Angwin, J. “The Web's New Gold Mine: Your Secrets,” July 30, 2010,

https://www.wsj.com/articles/SB10001424052748703940904575395073512989404

Collier, Joel E. and Barnes, Donald C. “Self-service delight: Exploring the hedonic aspects of self-service,” Journal

of Business Research, Vol. 68, No.5:986-993, 2015.

Bahari, T. Femina, and M. S. Elayidom, “An Efficient CRM-Data Mining Framework for the Prediction of Customer

Behaviour,” Procedia Computer Science, Vol. 46, No. Complete:725-731, 2015.

Baumgartner, H. and J. B. E. M. Steenkamp, “Exploratory consumer buying behavior: Conceptualization and

measurement,” International Journal of Research in Marketing, Vol. 13, No. 2:0-137, 1996.

Belk, Russell W., “Situation Variables and Consumer Behavior,” Journal of Consumer Research, Vol. 2, No. 3:157-

164, 1975.

Berlyne, D. E., Conflict, Arousal and Curiosity, New York: McGraw Hill, 1960.

Bleier, Alexander , and M. Eisenbeiss, “The Importance of Trust for Personalized Online Advertising,” Journal of

Retailing, Vol. 91, No. 3:390-409, 2015.

Cialdini B. Influence: Science and Practice, Allyn and Bacon; 5th edition (August 8, 2008)

Darley, John M., and C. D. Batson, “"From Jerusalem to Jericho": A study of situational and dispositional variables

in helping behavior,” Journal of Personality and Social Psychology, Vol. 27, No. 1:100-108, 1973.

Pentland, Feldman Brian T., “Reconceptualizing Organizational Routines as a Source of Flexibility and Change,”

Administrative Science Quarterly, Vol. 48, No. 1:94-118, 2003.


Page 127

Fiedler, F. E., A theory of leadership effectiveness. In L. Berkowitz (Ed.), Advances in experimental social

psychology. New York: Academic Press, 1964.

Freeman J. and M. T. Hannan, “Niche Width and the Dynamics of Organizational Populations,” American Journal of

Sociology, Vol. 88, No. 6:1116-1145, 1983.

Garfinkel, Harold. “Studies of the Routine Grounds of Everyday Activities,” Social Problems, Vol. 11, No. 3: 225-

50, 1964.

Goldfarb, Avi, and Tucker, Catherine E. “Privacy Regulation and Online Advertising,” Management Science, Vol. 57,

No. 1:57-71, 2011.

Hersey, P. and K. H. Blanchard, “An introduction to situational leadership,” Training and Development Journal, Vol.

23:26–34, 1969.

Hoban P. R. and R. E. Bucklin, “Effects of Internet Display Advertising in the Purchase Funnel: Model-Based Insights

from a Randomized Field Experiment,” Journal of Marketing Research, Vol. 52, No. 3:375-393, 2015.

Li H., V. S. Lai, and X. Luo, “Understanding the Role of Social Situations on Continuance Participation Intention in

Online Communities: An Empirical Perspective,” Journal of Electronic Commerce Research, Vol. 17, No. 3,

2016.

IAB, Targeting local markets: an IAB interactive advertising guide. Interactive Advertising Bureau, 2010. Available

at http://www.iab.net/media/file/IAB_Local_Targeting_ Guide_0922_FINAL.pdf. Retrieved on 5 October 2014.

Lambecht, A. and C. Tucker, “When does retargeting work? Information specificity in online advertising,” Journal of

Marketing Research, Vol. 50, No. 5:561-576, 2013.

Lian S. and Y. Xu, “Family Profiles in Retailing: A Novel Algorithm Combining PU Learning and Feature Selection,”

Japan: PACIS, 2010.

Lu X., X. Zhao, and L. Xue, “Is combining contextual and behavioral targeting strategies effective in online

advertising?” ACM Transactions on Management Information Systems, Vol. 7, No. 1:1-20, 2016.

Moriguchi, Tz and G. Xiong, and X. Luo, “Retargeting Ads for Shopping Cart Recovery: Evidence from Online Field

Experiments,” 2016. Available at

SSRN: https://ssrn.com/abstract=2847631 or http://dx.doi.org/10.2139/ssrn.2847631S

Perrow, C., “A framework for the comparative analysis of organizations,” American Sociological Review, Vol. 32,

No. 2:194-208, 1967.

Raju, P. S., “Optimum Stimulation Level: Its Relationship to Personality, Demographics, and Exploratory Behavior,”

Journal of Consumer Research, Vol. 7, No. 3:272-282, 1980

Sacks, H., On doing ‘‘being ordinary.’’ In J. M. Atkinson & J. Heritage (Eds.), Structures of social action: Studies in

conversation analysis (pp. 413–429). Cambridge, UK: Cambridge University Press,1984.

Simon, Francoise , and J. C. Usunier, "Cognitive, demographic, and situational determinants of service customer

preference for personnel-in-contact over self-service technology," International Journal of Research in

Marketing, Vol. 24, No. 2:163-173, 2007.

Sommers S. Situations Matter: Understanding How Context Transforms Your World, Riverhead Books, 2012.

Stanton J.L. and P. G. Bonner,"An Investigation of the Differential Impact of Purchase Situation on Levels of

Consumer Choice Behavior", in NA - Advances in Consumer Research Volume 07, eds. Jerry C. Olson, Ann

Abor, MI : Association for Consumer Research, Pages: 639-643,1980.

Steenkamp, J.B.E.M. and H. Baumgartner, “The Role of Optimum Stimulation Level in Exploratory Consumer

Behavior,” Journal of Consumer Research, Vol. 19, No. 3:434-448, 1992.

Trusov M., L. Ma and Z. Jamal, “Crumbs of the Cookie: User Profiling in Customer-Base Analysis and Behavioral

Targeting,” Marketing Science Special Issue on Big Data, Vol. 35, No. 3:405-426, 2016.

Yan, J., G. Wang, E. Zhang, Y. Jiang and Z. Chen, How much can behavioral targeting help online advertising?, In

Proceedings of WWW 2009 MADRID, available at http://www2009.eprints.org. Retrieved on 5 October 2014.

Zhang, K. and Z. Katona, “Contextual advertising,” Marketing Science, Vol. 31, No. 6:980-994, 2012.

http://www.iab.net/media/file/IAB_Local_Targeting_%20Guide_0922_FINAL.pdf

https://ssrn.com/abstract=2847631

https://dx.doi.org/10.2139/ssrn.2847631

http://www2009.eprints.org/


Page 128

Appendix 1: The Cluster Algorithm

We conducted the clustering in two steps. The first step is to calculate the weight of each keyword among all the

web domains. We adopted the Vector Space Model (VSM) to perform this task. For this model, we used 𝐷𝑖 to

describe each web page.

𝐷𝑖 = (𝑇𝑗 , 𝑊𝑖𝑗 , 𝑗 = 1,2,3, … , 𝑀), 𝑖 = 1,2, … , 𝑁 N web pages, M keywords

𝑇𝑗——Keywords 𝑗

𝑊𝑖𝑗——The weight of keyword 𝑇𝑗at web page 𝐷𝑖 .

TF-IDF was used to compute it. IT-IDF is a widely used method to compute the weight matrix. The keyword TF

used here is the frequency of keywords T at domain D, whereas IDF suggests that if the number of web domains that

include the keyword T is small, then the weight of keywords T at web domain D is higher. Therefore, we differentiated

the web domains. The formula to calculate TF and IDF are as follows:

𝑇𝐹𝑑𝑡 =𝑇ℎ𝑒 𝑓𝑟𝑒𝑞𝑢𝑒𝑛𝑐𝑦 𝑜𝑓 𝐾𝑒𝑦𝑤𝑜𝑟𝑑𝑠 𝑇 𝑖𝑛 𝑤𝑒𝑏 𝑑𝑜𝑚𝑎𝑖𝑛 𝐷

𝑇ℎ𝑒 𝑡𝑜𝑡𝑎𝑙 𝑓𝑟𝑒𝑞𝑢𝑒𝑛𝑐𝑦 𝑜𝑓 𝑎𝑙𝑙 𝑘𝑒𝑦𝑤𝑜𝑟𝑑𝑠 𝑖𝑛 𝑤𝑒𝑏 𝑑𝑜𝑚𝑎𝑖𝑛 𝐷

𝐼𝐷𝐹𝑡 = 𝑙𝑜𝑔𝑇𝑜𝑡𝑎𝑙 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑤𝑒𝑏 𝑑𝑜𝑚𝑎𝑖𝑛 (299）

𝑡ℎ𝑒 𝑤𝑒𝑏 𝑑𝑜𝑚𝑎𝑖𝑛 𝑖𝑛𝑐𝑙𝑢𝑑𝑒𝑠 𝑘𝑒𝑦𝑤𝑜𝑟𝑑𝑠 𝑇

𝑊𝑑𝑡 = 𝑇𝐹𝑖𝑗 × 𝐼𝐷𝐹𝑡

We denoted the web domain as D, 𝐷 = 1,2,3, … ,172, keywords as 𝑇, 𝑇 = 1,2,3, … ,352. Therefore, the weight

of keyword T in web domain D is 𝑊𝑑𝑡. After this step, we had a 172 × 352 matrix to indicate the weight of each

domain across 352 types of keywords. After the 172× 352 matrix, we could conduct the cluster to further aggregate

the 172 domains into fewer types based on the similarity of the keywords.

This study used the GMM. We used the package “sklearn” on Python to conduct a detailed clustering process. To

determine the optimal segment number of clusters, we used the “silhouette_coefficient” to pick the number. The

optimal evaluation criteria is the highest of this coefficient, and the lowest of the standard deviation. Table 8 lists the

comparison when the cluster groups equal 37, 39, 41, and 42; we found that when the group numbers = 39, it had the

largest silhouette coefficient and smallest deviation. Therefore, in this study, we used 39 content groups.

Table 8. Model Comparison

Group

numbers

Cluster

Time

Silhouette

coefficient

SD Min Max

37 5 .2552 .0053572 .247 .26

39 5 .2554 .0036469 .249 .268

41 5 .2408 .0075961 .229 .25

42 5 .2466 .0047223 .239 .251


Page 129

Appendix 2: Table 9. Estimation Results of the Six Content Groups

VARIABLES moderating effect of

CR

moderating effect of

TR

moderating effect of

CR and TR

Routineness

variables

CR 0.820*** 1.079*** 0.917*** (0.163) (0.0491) (0.166)

TR -0.578** -0.0102 0.0169 (0.261) (0.610) (0.615)

CR*TR 1.190*** 0.313 0.387 (0.282) (0.294) (0.298)

Non-routineness

variables

webfirst -0.269*** -0.286*** -0.277***

(0.0228) (0.0227) (0.0228)

hourfirst 0.262*** 0.217*** 0.219***

(0.0467) (0.0469) (0.0469)

Control

variables

ImpCidVisit 0.000827*** 0.000811*** 0.000821*** (4.00e-05) (4.00e-05) (4.00e-05)

ClkCidVisit 0.00871*** 0.00871*** 0.00866*** (0.000463) (0.000463) (0.000463)

adCliclprone 12.01*** 11.97*** 11.99*** (0.0670) (0.0668) (0.0670)

Browsesystem dummy -0.0710*** -0.0715*** -0.0724*** (0.0120) (0.0120) (0.0120)

cityid 0.00786*** 0.00796*** 0.00809*** (0.00254) (0.00254) (0.00254)

Group1 1.146*** 1.220*** 1.188*** (0.102) (0.106) (0.129)

Group 2 -0.339** 0.105 -0.529*** (0.135) (0.109) (0.156)

Group 3 -0.116 -0.522*** -0.204 (0.178) (0.194) (0.222)

Group 4 -2.391*** -3.230*** -3.445*** (0.485) (0.674) (0.735)

Group 5 -0.696*** -0.943*** -0.621*** (0.158) (0.162) (0.191)

Group 6 0.0430 0.460** 0.280 (0.268) (0.191) (0.316)

CRXGroup1 0.116

0.122 (0.164)

(0.165)

CRXGroup 2 0.813***

0.767*** (0.186)

(0.187)

CRXGroup 3 -0.690**

-0.705** (0.287)

(0.290)

CRXGroup 4 1.180

0.714 (0.864)

(0.909)

CRXGroup 5 -0.819***

-0.827*** (0.283)

(0.285)

CRXGroup 6 0.459

0.388 (0.497)

(0.494)

TRXGroup1

-0.293 -0.375 (0.600) (0.607)

TRXGroup 2

2.119*** 2.004*** (0.619) (0.628)

TRXGroup 3 0.341 0.627 (1.003) (0.995)

TRXGroup 4 8.086*** 7.830*** (2.943) (3.035)

TRXGroup 5 -1.035 -0.594 (1.051) (1.047)

TRXGroup 6 -1.680 -1.686 (1.245) (1.252)

Constant -5.386*** -5.472*** -5.417***

(0.132) (0.135) (0.151)

Observations 636,720 636,720 636,720

Hour dummy Included Included Included

Likelihood -130758.73 -130792.48 -130657.83

SITUATIONS MATTER: UNDERSTANDING HOW ...SITUATIONS MATTER: UNDERSTANDING HOW INDIVIDUAL BROWSING...

Documents

Transcript of SITUATIONS MATTER: UNDERSTANDING HOW ...SITUATIONS MATTER: UNDERSTANDING HOW INDIVIDUAL BROWSING...