www.guycarp.com
Pitfalls of Curve Fitting for Large Losses
12 October 2011
Guy Carpenter
Amit Parmar, Michael Cane
Agenda
Introduction
Theoretical analysis
– Data sample size issues
– Model uncertainty
– Parameter error
– Summary
Real-world analysis
– UK Motor market fitting
– Individual clients versus market curve
Summary
Questions
Introduction
Introduction: What is curve fitting used for?
Understanding the historical data and simplifying data sets
Modelling where there are few data points
Understanding potential extremes of the data (via tails)
Reducing sample variation
Introduction: Why is curve fitting important for actuaries?
Stochastic modelling
Benchmarking exercises
Helps alleviate the free-cover problem in experience rating
Exposure rating may not be possible
Fundamental input to the capital model
Advantages to separating the frequency and severity
[Figure: claim points (x) with the free-cover region labelled]
Introduction: What are some of the common pitfalls?
Theoretical analysis
Theoretical analysis
If we sample from:
A known distribution
With known parameters
Is it possible to go wrong?
Let's find out…
[Chart: CDF (0% to 100%) against claim size (£M), 0 to 30]
Theoretical analysis: Our experiment
Sample sizes
− 30, 300 and 3000 ultimate claim data samples
Distribution
− Simple Pareto
Parameters
− Alpha = 1.6
− Lambda = 1,500,000
Reinsurance structure
− Common motor programme:
  £3m xs £2m
  £5m xs £5m
  £15m xs £10m
  Unlimited xs £25m
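The experiment set-up can be sketched in a few lines of Python. This is a minimal sketch, assuming the "Simple Pareto" is the single-parameter Pareto with survival function (Lambda / x)^Alpha for x ≥ Lambda; the function names, the seed and the use of Python's `random` module are illustrative choices, not part of the original analysis (which used MetaRisk Fit).

```python
import random

ALPHA = 1.6            # shape parameter from the slide
LAMBDA = 1_500_000     # threshold / scale parameter from the slide

# Layers from the slide as (attachment, limit); None marks the unlimited layer.
LAYERS = {
    "3M xs 2M": (2_000_000, 3_000_000),
    "5M xs 5M": (5_000_000, 5_000_000),
    "15M xs 10M": (10_000_000, 15_000_000),
    "Unltd xs 25M": (25_000_000, None),
}


def sample_simple_pareto(n, alpha=ALPHA, lam=LAMBDA, rng=None):
    """Draw n claims by inverse transform: X = lam * U**(-1/alpha), U in (0, 1]."""
    rng = rng or random.Random()
    return [lam * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def loss_to_layer(claim, attachment, limit=None):
    """Portion of a single claim that falls in the layer."""
    excess = max(claim - attachment, 0.0)
    return excess if limit is None else min(excess, limit)


rng = random.Random(2011)  # arbitrary seed, for repeatability only
samples = {n: sample_simple_pareto(n, rng=rng) for n in (30, 300, 3000)}

for n, claims in samples.items():
    for name, (attachment, limit) in LAYERS.items():
        avg = sum(loss_to_layer(c, attachment, limit) for c in claims) / n
        print(f"{n:>5} claims  {name:<13} average loss to layer: {avg:,.0f}")
```

Re-running with different seeds gives a feel for how unstable the empirical layer averages are at 30 claims compared with 3000.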
Data sample size issues
Theoretical analysis - Data sample size issues: What are the implications of insufficient data?
Simple Pareto fit by sample size (results obtained using MetaRisk Fit):
Sample size   Alpha   CV
30            1.364   0.193
300           1.648   0.059
3000          1.596   0.019
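The way the fitted Alpha and its CV move with sample size can be illustrated with a maximum-likelihood fit. This is a sketch under stated assumptions rather than the MetaRisk Fit procedure: it assumes the Simple Pareto has survival function (Lambda / x)^Alpha with Lambda known, for which the MLE is n / Σ ln(x_i / Lambda) and the large-sample CV of the estimator is 1 / √n. The function name and seed are hypothetical.

```python
import math
import random


def fit_simple_pareto_alpha(claims, lam):
    """MLE of alpha for survival (lam/x)**alpha with lam known.

    Returns (alpha_hat, cv), where cv = 1/sqrt(n) is the large-sample
    coefficient of variation of the estimator.
    """
    n = len(claims)
    alpha_hat = n / sum(math.log(x / lam) for x in claims)
    return alpha_hat, 1.0 / math.sqrt(n)


rng = random.Random(2011)  # arbitrary seed, for repeatability only
lam, true_alpha = 1_500_000, 1.6
results = {}
for n in (30, 300, 3000):
    # Inverse-transform sample from the assumed Simple Pareto.
    claims = [lam * (1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in range(n)]
    results[n] = fit_simple_pareto_alpha(claims, lam)
    alpha_hat, cv = results[n]
    print(f"n={n:>4}  alpha_hat={alpha_hat:.3f}  approx CV={cv:.3f}")
```

The 1 / √n approximation gives roughly 0.183, 0.058 and 0.018 for the three sample sizes, which is close to the CVs in the table.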
How does the low sample size affect the pricing?
Theoretical analysis - Data sample size issues: Loss cost to the layer
Pricing using Simple Pareto distribution from each data set
[Chart: loss cost to the layer by sample size, on a 70% to 220% axis; assume 3000 claims provides the benchmark price]
Layer       3M XS 2M   5M XS 5M   15M XS 10M   Unltd XS 25M
30 claims   118%       144%       177%         235%
Remaining chart labels (300 and 3000 claims series): 97%, 93%, 93%, 100%
Significantly mis-priced with small data sample
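The effect of the fitted Alpha on the layer loss cost can be worked through in closed form. The sketch below assumes the Simple Pareto survival function (Lambda / x)^Alpha, holds the claim frequency above the threshold fixed, and takes the fitted Alpha values from the sample-size table; the function and variable names are hypothetical, and the slide's exact pricing basis is not stated.

```python
def pareto_layer_cost(alpha, lam, attachment, limit=None):
    """Expected loss per claim to a layer: integral of (lam/x)**alpha over it.

    Requires alpha > 1 and attachment >= lam.
    """
    def excess_mean(d):
        # Integral of the survival function from d to infinity.
        return lam / (alpha - 1.0) * (d / lam) ** (1.0 - alpha)

    if limit is None:
        return excess_mean(attachment)
    return excess_mean(attachment) - excess_mean(attachment + limit)


LAM = 1_500_000
LAYERS = {
    "3M xs 2M": (2_000_000, 3_000_000),
    "5M xs 5M": (5_000_000, 5_000_000),
    "15M xs 10M": (10_000_000, 15_000_000),
    "Unltd xs 25M": (25_000_000, None),
}
FITTED_ALPHA = {30: 1.364, 300: 1.648, 3000: 1.596}  # from the slide's table

# The 3000-claim fit is taken as the benchmark price, as on the slide.
benchmark = {
    name: pareto_layer_cost(FITTED_ALPHA[3000], LAM, att, lim)
    for name, (att, lim) in LAYERS.items()
}
ratios = {
    n: {
        name: pareto_layer_cost(alpha, LAM, att, lim) / benchmark[name]
        for name, (att, lim) in LAYERS.items()
    }
    for n, alpha in FITTED_ALPHA.items()
}

for n, by_layer in ratios.items():
    print(n, {name: f"{r:.0%}" for name, r in by_layer.items()})
```

Under these assumptions the 3M xs 2M ratio for the 30-claim fit comes out near 1.18, in line with the 118% on the slide. The higher layers need not reproduce the slide's figures, since the basis used there is not stated.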
Model uncertainty
Theoretical analysis – Model uncertainty
Suppose we have:
Sufficient data:
− 3000 claim data sample
What can go wrong?
Distribution:
− What are the chances of selecting the correct distribution?
What is the effect on our pricing?
Theoretical analysis – Model uncertainty: Possible severity distributions
Common distributions used to conduct our analysis
MetaRisk Fit – Severity distributions
Simple Pareto Lognormal Pareto T
Extreme Value Limit Generalized Cauchy Inverse Transformed Gamma
Exponential Normal Split Simple Pareto
Inverse Paralogistic Uniform Transformed Gamma
Loglogistic Generalized Extreme Value Inverse Burr
Paralogistic Extremal Pareto Burr
Loggamma Ballasted Pareto Transformed Beta
Gamma Power Generalized Beta
Inverse Weibull Beta Inverse Generalized Beta
Inverse Gaussian Inverse Beta
Inverse Gamma Generalized Pareto
Key: 1-Parameter 2-Parameter 3-Parameter 4-Parameter
Theoretical analysis – Model uncertainty: Chances of getting the wrong distribution with sufficient data
MetaRisk Fit: Simple Pareto is 1 of the 28 distributions
Theoretical analysis – Model uncertainty: Expected loss to the layer
3000 claims; expected loss relative to the Simple Pareto means
Distribution       3M XS 2M   5M XS 5M   15M XS 10M   Unltd XS 25M
Lognormal          107%       106%       89%          70%
Ballasted Pareto   103%       99%        93%          83%
Exponential        142%       89%        12%          0%
Transformed Beta   99%        92%        83%          68%
Gamma              178%       194%       72%          2%
Lognormal: Over-pricing for lower layers; Under-pricing for higher layers
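The direction of the lognormal mis-pricing can be illustrated analytically. The sketch below is not the MetaRisk Fit procedure: it assumes Simple Pareto data with survival function (Lambda / x)^Alpha and an unshifted lognormal whose parameters are the large-sample maximum-likelihood values for such data (mean and standard deviation of ln X). The magnitudes will therefore differ from the slide; only the over/under pattern is the point. All names are hypothetical.

```python
import math
from statistics import NormalDist

ALPHA, LAM = 1.6, 1_500_000
LAYERS = {
    "3M xs 2M": (2_000_000, 3_000_000),
    "5M xs 5M": (5_000_000, 5_000_000),
    "15M xs 10M": (10_000_000, 15_000_000),
    "Unltd xs 25M": (25_000_000, None),
}

# For Simple Pareto data, ln(X) = ln(LAM) + Exponential(rate=ALPHA), so the
# lognormal MLE converges to mean ln(LAM) + 1/ALPHA and std dev 1/ALPHA.
MU = math.log(LAM) + 1.0 / ALPHA
SIGMA = 1.0 / ALPHA
PHI = NormalDist().cdf


def pareto_layer(attachment, limit):
    """Expected loss per claim to the layer under the true Simple Pareto."""
    def excess_mean(d):
        return LAM / (ALPHA - 1.0) * (d / LAM) ** (1.0 - ALPHA)
    if limit is None:
        return excess_mean(attachment)
    return excess_mean(attachment) - excess_mean(attachment + limit)


def lognormal_lev(u):
    """Limited expected value E[min(X, u)] for X ~ Lognormal(MU, SIGMA)."""
    z = (math.log(u) - MU) / SIGMA
    return math.exp(MU + SIGMA ** 2 / 2.0) * PHI(z - SIGMA) + u * (1.0 - PHI(z))


def lognormal_layer(attachment, limit):
    """Expected loss per claim to the layer under the fitted lognormal."""
    if limit is None:
        return math.exp(MU + SIGMA ** 2 / 2.0) - lognormal_lev(attachment)
    return lognormal_lev(attachment + limit) - lognormal_lev(attachment)


ratios = {
    name: lognormal_layer(att, lim) / pareto_layer(att, lim)
    for name, (att, lim) in LAYERS.items()
}
for name, r in ratios.items():
    print(f"{name:<13} lognormal / Pareto = {r:.1%}")
```

In this simplified setting the lognormal overstates the bottom layer and understates the unlimited top layer, because its tail is thinner than the Pareto's.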
Theoretical analysis – Model uncertainty: Standard deviation of loss to the layer
3000 claims; standard deviation relative to the Simple Pareto
Distribution       3M XS 2M   5M XS 5M   15M XS 10M   Unltd XS 25M
Lognormal          102%       101%       92%          83%
Ballasted Pareto   100%       99%        95%          89%
Exponential        104%       83%        22%          1%
Transformed Beta   99%        95%        89%          80%
Gamma              109%       126%       65%          7%
Lognormal also underestimates volatility on the higher layers
Parameter error
Theoretical analysis – Parameter error
Suppose we have:
Sufficient data:
− 3000 claim data sample
Correct distribution:
− Simple Pareto
What can go wrong?
Incorrect parameters:
− Instead of α = 1.6
− We could pick lower or higher values
What is the effect on our pricing?
Theoretical analysis – Parameter error: The funnel of uncertainty
Simple Pareto with different alpha values
[Chart: loss cost by layer (3M XS 2M, 5M XS 5M, 15M XS 10M, Unltd XS 25M) for α = 1.2, 1.4, 1.6, 1.8 and 2.0, on a 0% to 350% axis. Data labels on the chart: 318%, 177%, 32%, 63%; 243%, 153%, 44%, 68%; 133%, 114%, 77%, 87%]
How can we deal with this volatility?
Theoretical analysis – Parameter error: Quantifying parameter error
Parameter error is effectively measuring sample size error
Distortion is accentuated in multi-parameter distributions
Parameter standard deviation and correlation quantify parameter uncertainties
We simulate parameters for each run of the model, e.g., year of simulation
We assume a lognormal distribution for parameter uncertainty
[Figure: MetaRisk Fit extract, Ballasted Pareto, 3000 claims]
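Simulating the parameter for each run of the model can be sketched as follows. This is an illustration under stated assumptions, not the MetaRisk implementation: it uses the single-parameter Simple Pareto (survival (Lambda / x)^Alpha) rather than the Ballasted Pareto in the extract, takes the mean and CV of Alpha from the 3000-claim row of the sample-size table, and so needs no parameter correlation. The names, seed and number of simulated years are hypothetical.

```python
import math
import random

LAM = 1_500_000
ALPHA_MEAN, ALPHA_CV = 1.596, 0.019  # Simple Pareto, 3000-claim fit

# Lognormal with a given mean and CV:
# sigma^2 = ln(1 + CV^2), mu = ln(mean) - sigma^2 / 2
SIGMA = math.sqrt(math.log(1.0 + ALPHA_CV ** 2))
MU = math.log(ALPHA_MEAN) - SIGMA ** 2 / 2.0


def layer_cost(alpha, attachment, limit):
    """Expected loss per claim to a limited layer under a Simple Pareto (alpha > 1)."""
    lower = (attachment / LAM) ** (1.0 - alpha)
    upper = ((attachment + limit) / LAM) ** (1.0 - alpha)
    return LAM / (alpha - 1.0) * (lower - upper)


rng = random.Random(2011)  # arbitrary seed, for repeatability only
N_YEARS = 10_000           # one alpha draw per simulated year
alphas = [rng.lognormvariate(MU, SIGMA) for _ in range(N_YEARS)]
costs = sorted(layer_cost(a, 5_000_000, 5_000_000) for a in alphas)

base = layer_cost(ALPHA_MEAN, 5_000_000, 5_000_000)
p05, p95 = costs[int(0.05 * N_YEARS)], costs[int(0.95 * N_YEARS)]
print(f"5M xs 5M cost relative to base: 5th pct {p05 / base:.0%}, 95th pct {p95 / base:.0%}")
```

For a multi-parameter distribution the same idea would need correlated draws built from the parameter standard deviations and correlations, which this sketch leaves out.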
Theoretical analysis: Summary
Effect on pricing, 5M xs 5M layer
Data sample issues   Distribution uncertainty   Parameter error    Effect on pricing
Insufficient data    Right distribution         Wrong parameters   Mis-priced by 144%
Sufficient data      Wrong distribution         Wrong parameters   Mis-priced by 106%
Sufficient data      Right distribution         Right parameters   No effect
Real-world analysis UK Motor Market
Real-world analysis: Setting the scene
Case Study: UK Motor Market
Benchmarking is particularly important in Europe:
– No industry data collectors such as ISO / NCCI
Homogeneous line of business
We have access to approximately 60% of motor market data in the UK
Unlimited reinsurance coverage
– Not loss limited
– Low deductibles
Compulsory line of business
Real-world analysis: Market data statistics – for developed claims
Market data summary statistics
Number of companies: 21
Analysis threshold: £1,700,000
Total number of claims: 1,750
Average claim number (per client): 83
Minimum claim number: 11
Maximum claim size: £30,235,668
Basis: Report Year
Years selected: 2000 – 2010
Inflation: 7.5% pa
Real-world analysis: Largest claim effect
The largest observed claim has a big influence on the fit
How do we deal with such outliers?
Remove
Ignore
Weighting
Transform
Real-world analysis: What selection criteria to use?
Mathematical tests
Goodness-of-fit tests such as:
1. Natural Log-Likelihood (NLL)
2. Akaike: $\mathrm{NLL} + K + \frac{K(K+1)}{n-K-1}$
3. HQ: $\mathrm{NLL} + K \ln(\ln n)$, for sufficiently large $n$
4. Schwarz: $\mathrm{NLL} + \frac{K \ln n}{2}$
Where: $n$ = number of data points, $K$ = number of parameters
By eye – visual judgement, e.g.:
– CDF
– QQ Graph
– PP Graph
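The penalised criteria listed above can be compared side by side with a small helper. This is a sketch under stated assumptions: NLL is treated as the negative log-likelihood (so lower is better), and the Akaike, HQ and Schwarz statistics are written in their standard forms scaled to the NLL, which may differ in detail from the MetaRisk Fit definitions. The function name and the two example fits are hypothetical.

```python
import math


def information_criteria(nll, k, n):
    """Penalised fit statistics on the negative log-likelihood scale.

    nll: negative log-likelihood at the fitted parameters (assumed convention)
    k:   number of fitted parameters
    n:   number of data points (needs n > k + 1)
    """
    return {
        "NLL": nll,
        "Akaike": nll + k + k * (k + 1) / (n - k - 1),
        "HQ": nll + k * math.log(math.log(n)),
        "Schwarz": nll + 0.5 * k * math.log(n),
    }


# Hypothetical fits to the same 300 claims: (negative log-likelihood, parameters)
fits = {"one-parameter": (4700.0, 1), "four-parameter": (4698.5, 4)}
scores = {name: information_criteria(nll, k, 300) for name, (nll, k) in fits.items()}

for name, s in scores.items():
    print(name, {crit: round(value, 2) for crit, value in s.items()})

# Lower is better: the extra three parameters do not pay for themselves here.
best_by_hq = min(scores, key=lambda name: scores[name]["HQ"])
print("Preferred by HQ:", best_by_hq)
```

In this made-up example the four-parameter fit has the lower NLL, yet all three penalised criteria prefer the one-parameter fit, which is the trade-off between goodness of fit and over-parameterisation.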
Choosing the market curve: Possible criteria
Good fit versus over-parameterisation
– Use an information criterion such as the H-Q test
Higher number of parameters may lead to less predictive power
Parameter CV should be low
Parameters should be significantly different from zero
Interpretability of the model and parameters
Where is the curve going to be used?
Curve-fitting is subjective; it is an art not a science
Real-world analysis: What part of curve to fit to?
[Chart: probability (0% to 100%) against claim value (£m), 0M to 10M; series: Inverse Gaussian, Empirical]
Inverse Gaussian – good fit to the body of the distribution (0 - £10M)
Real-world analysis: What part of curve to fit to?
[Chart: probability (0% to 100%) against claim value (£m), 10M to 30M; series: Inverse Gaussian, Empirical]
However, the fit is heavier in the tail (£10M - £30M)
Generalised Beta
Has a good fit when looking at the CDF graph
Best performing in tests
BUT…
CVs of parameters are too high
Beta value is too low
Selected Distribution: Weibull
Real-world analysis: Effect on the layers
[Chart: loss cost by layer (3M XS 2M, 5M XS 5M, 15M XS 10M, Unltd XS 25M) for Burr, Simple Pareto, Lognormal, Transformed Gamma, Generalised Beta and Weibull, on an 85% to 125% axis. Data labels on the chart: 107%, 124%, 106%, 118%. Residual axis label: Log of losses]
Burr, Lognormal & Transformed Gamma similar to Weibull
Simple Pareto & Generalised Beta: Over-pricing for higher layers
Individual clients’ versus market curve
Real-world analysis - Individual clients’ vs. Market curve: Individual client data statistics
Attribute                Client A      Client B      Client C      Market
Total number of claims   293           52            12            1,515
Analysis threshold       £1,700,000 (all)
Maximum claim size       £29,731,529   £16,415,791   £12,090,704   £30,235,668
Minimum claim size       £1,709,255    £1,736,425    £1,727,721    £1,702,032
Average claim size       £3,955,290    £4,138,872    £4,853,976    £4,209,709
Basis                    Report Year
Years                    2000 - 2010
Real-world analysis: Client empirical vs. best fit
Client A
Parameters
Name    Value   Std Dev   CV
mu      14.28   0.26      0.02
sigma   0.89    0.11      0.12
Correlations
        beta
theta   -0.94
Client B
Parameters
Name    Value   Std Dev   CV
alpha   2.35    0.51      0.22
tau     1.70    0.31      0.18
Correlations
        beta
theta   0.87
Client C
Parameters
Name    Value   Std Dev   CV
alpha   1.07    0.38      0.35
Real-world analysis: Market Curve vs. Clients’ best fit
[Chart: probability (0% to 100%) against claim value, 2M to 12M; series: Market, Market developed, Market Empirical, Client A - Lognormal, Client B - Log Gamma, Client C - Simple Pareto]
Layers             3M XS 2M   5M XS 5M   15M XS 10M   Unltd XS 25M
Market Dev         100%       100%       100%         100%
Market             99%        95%        86%          65%
Market Empirical   98%        100%       66%          27%
Client A           95%        79%        64%          55%
Client B           96%        83%        109%         256%
Client C           102%       153%       378%         1660%
Summary
Summary: Key messages
Parameter Selection
Model Selection
Sample Size
Bad news: Difficult to hide from the pitfalls of curve fitting
– Multiplicative effect
– Implications where curves are most needed
– Model selection has least impact
Good news:
– Benchmarking
– Company data
– Blended approach
‘Ultimately curve-fitting is where science and art meet’
Any Questions?