Experiments on credence goods - uibk.ac.at
Field Experiments on Credence Goods
and Fraud
Prof. Dr. Loukas Balafoutas
Innsbruck Winter School on Credence Goods, Incentives and Behavior
Kühtai, 16.03.2019 – 22.03.2019
Validity of experiments
• Internal validity: Do the data permit correct causal
inference? A matter of experimental design
(treatment choice, parameterization, successful
randomization) and data analysis. Usually not a
problem.
• External validity: Is it possible to generalize
inferences from the experiment to the real world?
• Often a major source of criticism.
Typical concerns about (lab)
experiments in economics
• Lack of realism in the interaction protocol; essential
elements missing in the sterile lab environment
• The subject pools used are special
• Subjects do not take experiments seriously; money at
stake is too small
• Small sample sizes
• Hawthorne effects
For an economist’s defense, see A. Falk & J. Heckman,
“Lab experiments are a major source of knowledge in the
social sciences”, Science (2009)
Field experiments
• One option: Take the lab to the field (e.g., credence goods experiments with medical students or professionals).
• Partly overcomes the criticism of special subject pools, but does not go a long way towards realism.
• Natural field experiments: Today’s topic. Observe the behavior of subjects in their “natural environment”. They do not know they are part of an experiment.
• Combine randomization and realism, but possibly with less control.
• Also, they face challenges in design.
• And they are subject to ethical issues (more on this later).
Field experiments with credence
goods: Schneider (2012)
• Schneider (2012) reports on undercover visits to auto repair
garages with one test vehicle.
• His main aim is to find out if there is mistreatment in the car
repair industry.
• Then he examines whether reputation mitigates these
problems by appearing as either a one-time or repeat-
business customer.
Schneider (2012)
• In total, 91 undercover garage visits with a prearranged set
of defects (loose battery cable, low level of coolant, missing
taillight).
• The mechanic was asked to thoroughly inspect the vehicle,
diagnose its condition, make a repair recommendation, and
provide a price estimate.
• Schneider then compares the observed outcomes with the
repairs that were known ahead of time to be actually
necessary.
• He then declines all recommended repairs.
Implementing reputational concerns
• Random assignment of mechanics to two procedures.
• In one, Schneider appeared as a one-time customer, stating
that he was moving away and having moving boxes visible
in the back of the car.
• In the other, he appeared as a possible repeat customer by
providing a home address near to the garage, and
suggesting that he was seeking a local mechanic for an
ongoing relationship.
• Then he estimated the effects of this repeat-business
procedure on the number of legitimate defects discovered,
the diagnosis fee, the repair recommendation, and the
repair price.
Main results
• Overtreatment was pervasive. In about 30% of cases the
mechanics provided completely unnecessary repairs.
• Overcharging was very limited. Only 2% of total charges are
accounted for by overcharging.
• Undertreatment was frighteningly high. In about 80% of
cases at least one of the defects was missed.
• Is this intentional, or is it due to incompetence? Highlights
the importance of investing effort in diagnostic precision.
• But mechanics working with the author confirm that these
defects should be identifiable to even the least experienced
mechanics.
Main results
• Reputational concerns do not seem to matter much. No
evidence that motorists representing repeat business
receive different repair recommendations, repair prices, or
diagnosis quality versus one-time motorists.
• However, there is a difference in the diagnosis fee: $37.70
for repeat customers, but $59.75 for one-time customers.
• Perhaps because this is a dimension that the customer can
directly observe and evaluate.
Field experiments with taxi drivers
• Literature on labor supply of taxi drivers, also more
recently on prosociality and gender discrimination.
• However, it is interesting to think about the actual
product taxi drivers sell: an expert service!
• Possible types of fraud
▫ overtreatment = taking a longer route (provides no
benefit)
▫ overcharging = charging more than justified by the chosen
route (e.g. night tariff, fictional fares, no taximeter)
▫ undertreatment usually not a problem
Balafoutas, Beck, Kerschbamer and
Sutter (2013)
• Field experiment with taxi drivers in Athens.
• The first experiment was intended to measure whether and
how informational advantages of taxi drivers lead to
exploitation of customers.
Hypotheses
• H1: Information about city
▫ passengers who are not familiar with the city are more
likely to face (more extensive) overtreatment
• H2: Information about fares
▫ passengers who are not familiar with the fares are
more likely to face overcharging
• H3: Income
▫ high-income passengers receive worse service than
low-income passengers
Method – Manipulating familiarity with
city and tariff
• “Local” passenger
▫ Enters taxi and states requested destination
▫ Speaking in Greek
• “Non-local native” passenger
▫ Enters taxi and states requested destination, adding
“do you know this destination, because I am not
familiar with the city”
▫ Speaking in Greek
• “Foreign” passenger
▫ Identical to “non-local” passenger, but
▫ Speaking in English
Method – Manipulating perceived
income
• High income
▫ wearing suit and
carrying briefcase
▫ top-end hotel
• Low income
▫ casual clothes and backpack
▫ low-end accommodation
Method
• The experiment
▫ five experimenters (all male, in their late twenties)
▫ simultaneous observations through triples: three
experimenters with same starting point, same
destination, different roles
▫ randomization over routes, days, and time
• Observations per cell:
              local native   non-local native   foreigner
low income    58             58                 58
high income   58             58                 58
Taxis in Athens
• Facts
– 14,000 taxi drivers in Athens
– one-man companies
– same fare system nationwide
• Incentives
– fare earned from one minute spent on
  • detour: 37 cents
  • traffic jam: 16 cents
  • waiting: 0 cents
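The per-minute figures above can be turned into a quick back-of-the-envelope comparison. This is a minimal sketch, assuming the figures are the fare a driver earns per minute of each activity (in euro cents); the function name and the five-minute example are illustrative and not taken from the study.

```python
# Fare earned per minute of each activity, in euro cents (figures from the slide).
CENTS_PER_MINUTE = {"detour": 37, "traffic jam": 16, "waiting": 0}


def extra_fare_cents(activity: str, minutes: int) -> int:
    """Fare (in cents) earned by spending `minutes` on `activity`."""
    return CENTS_PER_MINUTE[activity] * minutes


if __name__ == "__main__":
    # Hypothetical example: what do five extra minutes yield in each case?
    for activity in CENTS_PER_MINUTE:
        print(f"5 minutes of {activity}: {extra_fare_cents(activity, 5)} cents")
```

Under this reading, a minute of detour pays more than twice as much as a minute in a traffic jam, which is what makes longer routes attractive to the driver.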
Method – GPS Logger
• Keeping track of fraud
▫ GPS logger
  – records exact position every single second
  – allows reconstructing exact route, duration etc.
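One way a per-second position log could be turned into route length and duration is sketched below. The data layout (a time-ordered list of `(timestamp_seconds, latitude, longitude)` tuples) and the function names are assumptions for illustration; the slides do not describe how the authors processed the logger output. Distances use the standard haversine formula with a mean Earth radius.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def route_summary(fixes):
    """Return (distance_km, duration_seconds) for a time-ordered list of
    (timestamp_seconds, lat, lon) fixes. The layout is hypothetical."""
    distance = sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
    )
    duration = fixes[-1][0] - fixes[0][0]
    return distance, duration
```

Summing the distances between consecutive fixes gives the length of the route actually driven, which can then be compared across simultaneous rides.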
Results – Overtreatment (normalized by
shortest route in triple)
            locals   non-locals   foreigners
OT-index    1.029    1.077        1.087
[Figure: OT-index by passenger type (local, non-local native, foreigner)]
Support for H1: local significantly smaller than non-local and
foreigner; no difference between non-local and foreigner.
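The normalization behind the OT-index can be sketched as follows: within each triple of simultaneous rides, every ride's distance is divided by the shortest distance in that triple. The function name and the example distances are hypothetical; only the normalization rule comes from the slide title.

```python
def normalize_within_triple(distances):
    """Divide each ride's distance by the shortest distance in the triple.

    `distances` maps a passenger role to the distance driven (any unit).
    The shortest ride gets index 1.0; all others are >= 1.0.
    """
    shortest = min(distances.values())
    return {role: d / shortest for role, d in distances.items()}


if __name__ == "__main__":
    # Hypothetical triple, distances in km (not data from the study).
    triple = {"local": 10.0, "non-local native": 10.5, "foreigner": 11.0}
    print(normalize_within_triple(triple))
```

Averaging these indices over all triples gives group means such as those in the table; the mean for locals exceeds 1 because the local passenger is not the shortest ride in every triple.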
Results – Duration Index by Origin
of Customer
[Figure: duration index by origin of customer (locals, non-local natives, foreigners)]
Results – Overcharging frequency
               locals   non-locals   foreigners
OC frequency   0.034    0.078        0.224
[Figure: overcharging frequency by passenger type (local, non-local native, foreigner)]
Support for H2: foreigner significantly higher than local and
non-local; no difference between local and non-local.
Results – Total fare paid (normalized
by cheapest fare in triple)
              locals   non-locals   foreigners
price-index   1.038    1.113        1.223
[Figure: price index by passenger type (local, non-local native, foreigner)]
Results – Overtreatment by Income
of Customer
OT-index      locals   non-locals   foreigners   total
low income    1.021    1.066        1.079        1.055
high income   1.037    1.087        1.096        1.073
[Figure: OT-index by passenger type (local, non-local native, foreigner), low-income vs. high-income]
No support for H3: No difference between high-income and
low-income passengers.
Econometric Analysis
                                       (1)             (2)           (3)
                                       Overtreatment   Overcharged   Price
                                       Index           Amount        Index
non-resident (non-local + foreigner)   0.049***        0.123         0.084***
                                       (0.016)         (0.093)       (0.024)
foreign                                0.002           1.499**       0.130**
                                       (0.028)         (0.660)       (0.055)
high income                            0.017           -0.209        -0.024
                                       (0.013)         (0.189)       (0.020)
time of the day                        0.024           -0.140        0.026
                                       (0.018)         (0.359)       (0.038)
Additional controls                    Yes             Yes           Yes
N = 348. *, **, *** denote significance at the 10%, 5%, 1% level respectively. Clustering of standard
errors by set of simultaneous observations. Route and experimenter fixed effects. Tobit regressions,
dependent variable left-censored at 0/1. Marginal effects reported. Standard errors in parentheses.
Summary Balafoutas et al. (2013)
• Overtreatment: more extensive for passengers who are
not familiar with the city (H1)
• Overcharging: more common for foreigners (H2),
presumably due to unfamiliarity with tariffs
• Income: High-income customers are on average slightly
more prone to fraud, but not significantly so (contrary to
H3).
• If they want to, taxi drivers know whom to cheat and how
to cheat!
“First-degree”, or direct moral
hazard and credence goods
• Agent takes hidden action that hurts interests of
uninformed principal
• Example (credence goods): insured patient asks for
more extensive or more expensive treatment
• Higher cost for insurance company
• Organizational context: expense account fraud
• Moral hazard operates through demand side in credence
goods market
“Second-degree”, or indirect moral
hazard and credence goods
• Expert seller anticipates weak incentives of consumer
to exert control
• Increases fraud (overtreatment or overcharging)
• Moral hazard operates through supply side
• Examples: service on corporate cars, physicians’
behavior with fully insured patients
Balafoutas, Kerschbamer and Sutter
(2017)
• We study the possibility of second-degree moral hazard
in a field experiment.
• Notice that the stories are observationally equivalent:
higher coverage leads to higher expenditure.
• But we control behavior on the demand side.
• Also, adverse selection not an issue.
Design
• Same market as in first study: Taxi rides in Athens
• Only one role: non-local natives, to control for perceived
information and to preserve the credence goods nature
of the service
• No income manipulation
• Both genders
Design: Moral hazard
• Exists when the passenger does not pay for ride herself,
i.e., has expenses reimbursed.
• Hence, our central treatment variation: Full Refund vs.
Control.
• Fixed script upon entering the taxi.
• In CTR: script for non-local native (“Do you know where
X is? I am not from Athens”)
• Then: “Can I get a receipt at the end of the ride?”
Design: Moral hazard
• In FR (Full Refund) Treatment: exact same script as in
CTR, adding one short phrase.
• “Can I get a receipt at the end of the ride? My employer
is covering the expenses.”
• This short phrase was the only difference between the
two treatments!
The experiment
• Four assistants, similar in age.
• Took (almost) simultaneous rides in 100 quadruples
(=400 observations).
• Each assistant switching between CTR and FR, but all
four cells in each quadruple.
                   Full Refund   Control
Male passenger     N=100         N=100
Female passenger   N=100         N=100
The experiment
• 11 routes (subset of first experiment, adding one route
from southern Athens to airport).
• Rides spread over 15 days in March 2013 and July
2014.
• 8am to midnight.
• Total duration: 156 h.
• Total distance: 6482 km.
Hypotheses
• Main hypothesis: We should observe more fraud in the
Full Refund treatment.
• Overtreatment or Overcharging?
• No prior regarding gender effects or differential effects of
Full Refund by gender.
• Although, intuitively we thought women might be more
susceptible to fraud (perceived as less likely to engage
in confrontation?)
Results: Overtreatment Index (duration)
[Figure: overtreatment index (duration) for men and women, Full Refund vs. Control]
No significant difference between FR and CTR.
Similar results with distance travelled.
Overcharging
• Overall difference between FR and CTR: 37% vs. 20% (p < 0.01, Fisher’s exact test).
• We have a strong treatment effect on the likelihood of facing overcharging!
• In CTR: women face more fraud than men (26% vs. 13%, p < 0.05)
• Difference disappears under Full Refund: Behavior towards men is more responsive to our treatment manipulation and drives this result
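The Fisher's exact tests quoted above can be reproduced approximately with a few lines of standard-library Python. This is a sketch under a loud assumption: the cell counts are reconstructed from the rounded percentages (37% and 20% of 200 rides; 26% and 13% of 100 rides), so they may differ slightly from the study's raw data. The function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x cases in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Assumed counts (overcharged, not overcharged), rebuilt from percentages.
    print("FR vs. CTR:", fisher_exact_two_sided(74, 126, 40, 160))
    print("Women vs. men in CTR:", fisher_exact_two_sided(26, 74, 13, 87))
```

With these reconstructed counts the first p-value falls below 0.01 and the second below 0.05, in line with the thresholds reported on the slide.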
Overcharging amount
• Moreover, the extent of overcharging is higher in the
moral hazard treatment.
• Unconditional overcharging amounts:
                    CTR    FR     total
male passengers     0.72   1.46   1.09
female passengers   1.10   1.40   1.25
total               0.91   1.43   1.17
Results: Price Index
• Overall difference between FR and CTR: 1.17 vs. 1.09 (p <
0.01, Wilcoxon signed-ranks test based on treatment
averages within quadruple).
• Central result of the paper: Our moral hazard manipulation
leads to higher consumer expenditure.
• Due to more frequent and more extensive overcharging
(given insignificant differences in duration and distance
index).
Overtreatment vs. Overcharging
• If driver anticipates no control from passenger, then
natural to expect stronger differences in the
overcharging dimension (more lucrative).
• Another reason: opportunity costs of time.
• Passenger may not mind the price, but would resent a
longer route.
A similar study in the computer repair
sector – Kerschbamer, Neururer and
Sutter (2016)
• Field experiment in the Austrian market for computer
repair services.
• Goal: Measure the impact of informing the expert provider
that an insurance company will pay the repair bill on the
extent and type of fraud. We like to call this second-
degree moral hazard.
Experimental Design
• Hardware of a computer is manipulated in such a way that
it is no longer possible to boot the computer (details later).
• Computers are handed in for repair at computer shops all
over Austria with a fixed script.
• Two treatments: control treatment (CONTROL) and
insurance treatment (INSURANCE).
• Between-subjects design and treatments are randomly
assigned to repair shops.
Experimental Design
• After the repair we compare the defect to the actual repair
and the bill to detect and quantify overtreatment,
undertreatment and overcharging.
• Hypothesis: Indicating to the expert that the repair cost
will be paid by an insurance company increases the
amount of overtreatment and overcharging.
Experimental Treatments
• Fixed script for computer problem: “When starting my
computer an error message appears and I am not able to
boot the computer. I have no idea what this means and I
would like you to repair it, please.”
• Control treatment: “I need a bill for the repair!”
• Insurance treatment: “I need a bill for the repair because I
have an insurance.”
Implementation Issues
• Test-computers must be in a perfect condition, except for
our manipulation:
– We bought five refurbished laptops; the warranty was not
visible to the computer experts and perfect condition
was assured.
• The value of the test-computers should be high enough
such that repair (instead of replacement) is a plausible
strategy:
– The laptop cost 684 € and the correct price of an
appropriate repair was about 60 - 80 €.
Implementation Issues
• Important that the computer expert is able to diagnose the
problem correctly (otherwise measured misconduct might
be due to incompetence and not fraud):
– We destroy one of the two RAM modules of the
computer. As a result the computer beeps in a specific
way and is no longer able to boot.
– Additionally, the following error message appears on
the screen: “ERROR 1830: Invalid memory
configuration - Power off and install a memory module
to Slot -0 or the lower slot.”
Implementation Issues
• Our IT Department helped us with the manipulation and
they assured us that this error should be diagnosed
correctly by any expert in the field. In addition, it is not
uncommon that RAM modules crash from time to time
and therefore the problem should be well known to
experts.
• Our IT Department estimated 30 minutes for the repair.
Data Set
• Experiment was conducted between March and
September 2013 and all computers were handed in during
regular shop opening hours.
• The shops were randomly selected from the 251 shops
listed in the telephone directory.
• Treatment assignment was also random.
Results – successful repairs
• 58 out of 61 repair shops managed to repair the computer
successfully (remaining three observations are excluded
from further analysis).
• Average time until pick-up was possible:
2.29 days
Results – Repair Price
• Control treatment: 70.17€
• Insurance treatment: 128.68€
• Difference in mean repair price is economically
impressive and statistically highly significant
(p = 0.002, Mann-Whitney test).
Cumulative Frequencies of Repair
Prices
[Figure: relative cumulative frequency (in %) of repair prices (in Euro), CONTROL vs. INSURANCE]
Overtreatment (too much repair)
• Five observations with additional repairs not related to our
manipulation and all of them took place in the insurance
treatment.
– Overtreatment is significantly more frequent in
INSURANCE (17.24% vs. 0%; p = 0.018, Fisher’s
exact test). Average overtreatment costs about 200€.
Overcharging in the Spare Parts
Dimension
• Four cases with overcharging in the spare parts
dimension. Two of them happened in CONTROL and two
in INSURANCE. So, there is no treatment difference with
respect to the frequency of overcharging in the spare
parts dimension.
Overcharging in the Working Time
Dimension
• Out of the 58 repair shops, 30 indicated the working time
on the bill.
– Control treatment: 0.55 hours
– Insurance treatment: 1.02 hours
– p = 0.01, Mann-Whitney test (this holds also when
overprovision observations are excluded, p = 0.046,
Mann-Whitney test).
– The difference accounts for 41 Euro out of the 58 Euro
difference between CONTROL and INSURANCE.
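The arithmetic behind this decomposition can be checked with a short sketch. The mean prices and billed hours are the figures from the slides; the implied hourly rate is a derived illustration and not a number reported in the source. It is also rough, because only 30 of the 58 shops stated the working time on the bill.

```python
# Figures from the slides.
CONTROL_PRICE, INSURANCE_PRICE = 70.17, 128.68  # mean repair price in euro
CONTROL_HOURS, INSURANCE_HOURS = 0.55, 1.02     # mean billed working time in hours
WORKING_TIME_EURO = 41                          # price gap attributed to working time


def decompose():
    """Return (price gap, share due to working time, implied euro per extra hour)."""
    price_gap = INSURANCE_PRICE - CONTROL_PRICE
    hours_gap = INSURANCE_HOURS - CONTROL_HOURS
    share = WORKING_TIME_EURO / price_gap
    implied_rate = WORKING_TIME_EURO / hours_gap  # derived, not from the source
    return price_gap, share, implied_rate


if __name__ == "__main__":
    gap, share, rate = decompose()
    print(f"price gap: {gap:.2f} euro, working-time share: {share:.0%}, "
          f"implied rate: {rate:.0f} euro per hour")
```

On these numbers, billed working time accounts for roughly 70% of the price gap between the two treatments.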
Summary of Kerschbamer et al. (2016)
• Consumer expenditures on computer repairs are higher in
the presence of an indicated insurance coverage.
• Decomposing the treatment effect into different types of
fraud we find that the difference in repair prices is mainly
due to overtreatment (replacing more parts than
necessary) and to overcharging in the working time
dimension (charging for more working time than actually
provided).
A recent experiment on ways to
reduce fraud
"Can fraud be reduced with priming? Evidence
from a real-world market for credence goods"
Christopher Bindra and Graeme Pearce
(work in progress)
Motivation
• Reducing the informational advantage of the seller is one
way that fraudulent behaviour can be mitigated.
• But this isn't always a viable solution
• Evidence from social psychology and behavioural
economics suggests that priming could be used to
positively impact people's behaviour.
• Making honest behaviour salient might encourage sellers
to reduce fraudulent behaviour.
Experiment
• 20 RAs, randomly assigned to groups of four.
• Within a group, all testers took a taxi from a designated taxi stand to a common destination. We refer to this as a quadruple.
– Took taxis in random order with approx. 1 minute between
each journey.
– Each tester carried a GPS tracker to record the route.
• Testers recorded a number of variables: the meter reading, the number of “extra charges” the driver added, whether the driver kept change, and so on.
• The experiment varies the script that the testers spoke.
• We collected data in two waves, in May 2018 (400 rides) and February 2019 (200 rides).
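The random grouping and random departure order described above could be implemented along the following lines. This is a minimal sketch, not the authors' procedure: the function name, the tester labels and the use of a seed are assumptions for illustration.

```python
import random


def assign_groups(testers, group_size=4, seed=None):
    """Shuffle testers and split them into groups of `group_size`.

    The order inside each group can serve as the (random) order in
    which the testers take their taxis.
    """
    rng = random.Random(seed)
    shuffled = list(testers)
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]


if __name__ == "__main__":
    # Hypothetical labels for the 20 research assistants.
    ras = [f"RA{i:02d}" for i in range(1, 21)]
    for number, group in enumerate(assign_groups(ras, seed=2018), start=1):
        print(f"group {number}: {group}")
```

Fixing the seed makes the assignment reproducible, which helps when the randomization has to be documented afterwards.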
Experiment
• Baseline: “I would like to go to x. Do you know where it is?
I’m not from Vienna and I don’t know the way.”
• Positive: “Did you hear about that study where researchers
found that around 80% of taxi drivers were shown to behave
honestly towards passengers, always taking them on the
cheapest route? I read about it on the internet.”
• Negative: “Did you hear about that study where researchers
found that around 20% of taxi drivers were shown to behave
dishonestly towards passengers, taking them on more
expensive routes than necessary? I read about it on the
internet.”
• Uber Priming: “I checked the Uber price online and it
seemed cheap.”
Summary
• Find that “Positive” priming of taxi drivers induces them
to be more dishonest.
• But not negative priming. Your thoughts?
• All priming scripts generally increase consumer
expenditure, even if not significantly.
• Find little evidence that overtreatment is influenced by
the priming manipulations.
Anagol, Cole and Sarkar (2017)
• Field experiment to evaluate quality of advice of
insurance agents in India.
• Undercover consumers asked for life insurance
recommendations.
• Term versus whole life insurance.
• Standardized scripts, auditors were men in their late 30s.
• Audits lasted for about 35 minutes.
Anagol et al. (2017)
• Exogenous treatment variation in:
• Customer’s needs:
– “I want to save and invest money for the future, and I also
want to make sure my wife and children will be taken care
of if I die. I do not have the discipline to save on my own”
(whole life),
– “I am worried that if I die early, my wife and kids will not be
able to live comfortably or meet our financial obligations. I
want to cover that risk at an affordable cost” (term)
• Customer’s beliefs & second opinion:
– “I have heard from [source] that whole [term] insurance is
a really good product for me. Maybe we should explore
that further”
Anagol et al. (2017)
• Overwhelming majority of recommendations are for
whole life insurance.
• This is essentially a form of overtreatment, especially
when the customer needs term.
• Insurance agents respond (even if weakly) to
consumer’s needs and to their beliefs, even when beliefs
are wrong.
Anagol et al. (2017)
• Overall, mentioning that one has received advice from
another agent does not affect the quality of advice.
• However, when this second opinion was a poor
recommendation, there is an increase in the quality of
advice.
• Hence, there is some evidence to suggest benefits of
second opinions.
Field experiments in the healthcare
sector
• Possibly the largest and economically and socially most
important expert service; field evidence of great value.
• However, there are important ethical issues (to be
discussed later), as well as difficulties in
implementation.
• For instance, it is important to ensure sufficient training
for the undercover patients and cover as many
eventualities as possible.
Currie, Lin and Meng (2014)
• Field experiment in Chinese hospitals, on financial
incentives and antibiotics prescription.
• Four similar undercover patients to each of 80 doctors in
16 different hospitals, in 2011-2012.
• All complained of mild flu-like symptoms: “For the last
two days, I've been feeling fatigued. I have been having
a low grade fever, slight dizziness, a sore throat, and a
poor appetite. This morning, the symptoms worsened so
I took my body temperature. It was 37 °C.”
• For such mild symptoms, antibiotics should not be
prescribed unless tests diagnose a bacterial infection.
• Hence, a good setting to study overtreatment.
Currie et al. (2014)
• Patient A: baseline
• Patient B: asked for antibiotics prescription: “Doctor, can
you prescribe some antibiotics for me?”
• Patient C: asked for a prescription and indicated that he
would buy the drugs elsewhere: “Doctor, I can get a
discounted price in a drug store, but I don't know what
medicine to take. Can you write a prescription for me?”
• Patient D: asked for antibiotics prescription and indicated
that he would buy the drugs elsewhere (B plus C).
• All four types visited the same doctor (at least two
months interval between C and D).
Currie et al. (2014)
• Antibiotics prescription frequencies / number of drugs
prescribed:
A (Baseline)                                    55% ** / 2.63 **
B (asked for antibiotics)                       85% ** / 3.24 **
C (asked for prescription + buys elsewhere)     10% ** / 1.79 **
D (asked for antibiotics + buys elsewhere)      14% ** / 1.97 **
• Antibiotics prescriptions driven mainly by financial
incentives, not by consumer demand or physician
ignorance.
Currie et al. (2014)
• Four more treatments in Experiment 2
• B': small gift
• C': patient knowledge. “I learned from the Internet that
simple flu/cold patients should not take antibiotics. Is this
true? Can I not take antibiotics unless they are
necessary?”
• D': buys elsewhere. “Doctor, my sister-in-law works at a
drug store. She can offer me a discount if I buy drugs in
her store. But I don't know what medicine to take, so
could you please write a prescription for me?”
• E': C' plus D'
Currie et al. (2014)
• Antibiotics prescription frequencies / number of drugs
prescribed:
A (Baseline)                                 0.63 / 2.49
B' (gift)                                    0.50 / 2.33
C' (patient knowledge)                       0.43 ** / 2.21
D' (buys elsewhere)                          0.12 ** / 1.88 **
E' (patient knowledge + buys elsewhere)      0.08 ** / 1.62 **
• Antibiotics prescriptions drop as a result of patient
knowledge and lack of financial incentives.
Lu (2014)
• Two possible mechanisms on why insurance can
increase consumer expenditure in healthcare markets:
• Agency hypothesis: doctors increase their income at the
expense of patients (potentially suboptimal treatments).
• Considerate doctor hypothesis: doctors want to improve
patients’ well-being, taking into account both quality of
treatment and ability to pay.
• Observational data cannot distinguish between the two
hypotheses, due to issues such as adverse selection
and patient responses to insurance (“second-degree
moral hazard”).
Lu (2014)
• Field experiment in Chinese hospitals, on the role of
patient insurance and financial incentives.
• Two hypothetical patients: Both male around 65. Patient
1 at risk of heart disease and diabetes (hypertension,
high triglycerides, high blood sugar); Patient 2 at risk of
heart disease (hypertension) and already receiving
treatment.
• Two testers who visit doctors on behalf of the
hypothetical patients – relatives who live in another
region – and ask to buy drugs for them.
• Visits to cardiology and endocrinology departments in 49
hospitals in or near Beijing.
Lu (2014)
Treatment variation:
1. Incentives vs. no incentives:
“{the relative} asked me to buy the medicines here for him”
(incentives)
“{the relative} wants to get a prescription and buy drugs at
his local store”
(no incentives)
2. Insured vs. not insured: tester indicates whether the
relative has government insurance
Lu (2014)
• Physicians write prescriptions that are significantly more
costly for insured than uninsured patients, but only if
physicians receive kickbacks.
• In line with agency hypothesis, but not with considerate
doctor hypothesis.
                                     Insured /    Not insured /   Insured /       Not insured /
                                     incentives   incentives      no incentives   no incentives
Monthly drug expenditure (in yuan)   424.78       298.71          324.50          307.03
                                     (23.54)      (15.84)         (18.95)         (15.44)
Number of prescribed drugs           2.47         2.20            2.18            2.18
                                     (0.10)       (0.08)          (0.07)          (0.06)
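The pattern in the expenditure row can be summarized as a difference-in-differences on the four cell means. The sketch below uses only the means from the table; the function name and dictionary keys are illustrative, and the calculation says nothing about statistical significance, which would require the underlying data.

```python
# Mean monthly drug expenditure in yuan, by (insurance status, incentive status).
EXPENDITURE = {
    ("insured", "incentives"): 424.78,
    ("not insured", "incentives"): 298.71,
    ("insured", "no incentives"): 324.50,
    ("not insured", "no incentives"): 307.03,
}


def insurance_effects(cells):
    """Return the insurance gap with incentives, without incentives,
    and the difference between the two gaps."""
    with_inc = cells[("insured", "incentives")] - cells[("not insured", "incentives")]
    without_inc = cells[("insured", "no incentives")] - cells[("not insured", "no incentives")]
    return with_inc, without_inc, with_inc - without_inc


if __name__ == "__main__":
    gap_inc, gap_no_inc, did = insurance_effects(EXPENDITURE)
    print(f"insurance gap with incentives:    {gap_inc:.2f} yuan")
    print(f"insurance gap without incentives: {gap_no_inc:.2f} yuan")
    print(f"difference-in-differences:        {did:.2f} yuan")
```

The insurance gap is about 126 yuan when the physician has a financial incentive and about 17 yuan when not, which is the contrast the agency hypothesis predicts.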
Das, Holla, Mohpal and Muralidharan
(2016)
• Field experiment in India, comparing physician effort and
treatment between private and public healthcare
providers.
• Fifteen fake patients presented to public and private
clinics in rural India with symptoms of one of three
conditions: chest pain in a 45-year-old male; asthma in a
25-year-old male or female; dysentery in a child (not present).
• Treatment variation:
– Between subjects: Public vs. Private clinic
– Within subjects: visits to doctors who work in a public clinic and
also operate private practice
Das et al. (2016)
Outcome variables:
• Price charged for visit
• Measures of quality of care:
– Indian government guidelines for questions and examinations to
secure differential diagnosis in each case;
– length of consultation;
– correct diagnosis and appropriate treatment and prescriptions,
based on inputs from a panel of doctors, pharmacists, and a
pharmaceutical company
Main findings
• Public vs. private providers: private providers exert more
effort and perform equally well in terms of diagnosis and
treatment quality.
• Within subjects, in clinic vs. in private practice: same
doctors spend more time and are more likely to offer
correct diagnosis when seeing patient in practice.
• No differences with respect to overtreatment.
Gottschalk, Mimra and Waibel (2018)
• Field experiment in market for dental care in Switzerland.
• Aims:
– Measure overtreatment
– Explore the effect of patient information
– Explore the effect of patient socioeconomic status (SES)
– Explore the role of further, non-experimental market and
dentist characteristics
Design
• One test patient visits 180 different dentists in the canton
of Zurich.
• Patient has one minor superficial caries lesion between
two teeth.
• Condition assessed by four independent reference
dentists: patient should receive no treatment (i.e., no
filling).
• Case considered very easy, requires low diagnostic
effort.
Design
• Test patient visits dentists, based on the following
scenario: he had x-ray taken at a dental hygiene
practice, which recommended a check-up at a dentist.
• Hence, test patient shows the same x-ray to all dentists.
• Advantages:
– Easy diagnosis
– Condition did not change over time, identical information to
all dentists
– Guidelines provide clear recommendation (provide no
treatment and re-evaluate one year later)
Treatment variation: patient
information
• Informed condition: The patient indicates to the dentist
– that he has uploaded the x-ray to an internet platform
where dentists offer free advice
– that he has not yet received a reply
• Standard condition: No additional script
Treatment variation: patient SES
• High SES patient:
– high quality suit, expensive watch, car key, expensive
mobile phone
– specified occupation on patient form: translator at a bank
• Low SES patient
– cheap unbranded clothes, backpack, no accessories
– specified occupation: student of translation doing
internship
Outcome variables
• Overtreatment: recommendation that includes at least
one filling.
• Undertreatment not possible, given that correct
recommendation is no treatment.
• Charging dimension: dentists can choose prices within a
certain range.
• They can also charge a diagnosis fee.
Recommendations and overtreatment
(binary)
           Standard patient   Informed patient   Average
Low SES    37.8%              26.7%              33.2%
High SES   20.0%              26.7%              23.3%
Average    28.9%              26.7%              27.8% (50/180)
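The percentages in this table can be cross-checked by rebuilding approximate cell counts. The sketch below rests on an assumption that is not stated on the slides: that the 180 visits were split evenly, 45 per cell. Names are illustrative.

```python
CELL_SIZE = 45  # assumption: 180 visits divided evenly over the four cells

# Overtreatment rates from the table, keyed by (SES, information condition).
RATES = {
    ("low", "standard"): 0.378,
    ("low", "informed"): 0.267,
    ("high", "standard"): 0.200,
    ("high", "informed"): 0.267,
}

# Implied number of overtreatment recommendations per cell.
COUNTS = {cell: round(rate * CELL_SIZE) for cell, rate in RATES.items()}


def margin_percent(position, level):
    """Overtreatment rate (in %) over all cells whose key has `level`
    at `position` (0 = SES, 1 = information condition)."""
    cells = [c for c in COUNTS if c[position] == level]
    return round(100 * sum(COUNTS[c] for c in cells) / (CELL_SIZE * len(cells)), 1)


if __name__ == "__main__":
    print("total cases:", sum(COUNTS.values()))
    for position, level in [(0, "low"), (0, "high"), (1, "standard"), (1, "informed")]:
        print(level, margin_percent(position, level))
```

Under the equal-cell assumption the counts add up to the 50 of 180 cases shown in the table, and the High SES, Standard and Informed averages are reproduced. The Low SES average comes out at 32.2% rather than the 33.2% shown; the slides alone do not reveal whether that reflects unequal cell sizes or a typo.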
Patient information
• Some evidence of information effect among low SES
patients.
• However, all differences insignificant.
• Hence, no effect of patient information on dentists’
recommendations on the whole.
• Contrary to intuition, theory, and previous evidence
• Possible explanations?
Patient SES
• Higher SES leads to less overtreatment.
• Contrary to expectations based on distributional
preferences or price sensitivity.
• Possible explanations put forward in the paper:
– similarity of dentists’ and patients’ SES
– perceived likelihood of repeated interaction
– higher SES as a signal of more or better information
Ethical concerns
Discuss relevant ethical aspects of the field experiments
discussed today, in particular (but not only) in healthcare
markets
Ethical concerns
1. Ethical aspects and externalities
– is it acceptable to encourage fraudulent behavior?
– experts’ wasted time and resources; cost-benefit
analysis
– ethical clearance for experiments (especially in
healthcare industry)
2. Deception (aren’t we lying to experts?)
3. Anonymity must be guaranteed, despite fraudulent
behavior.
4. Ongoing methodological discussion on false positives
(applies to most experimental work)