A walk in random forests

Erwan Scornet (LSTA, Institut Curie)
Supervised by Gérard Biau (LSTA) and Jean-Philippe Vert (Institut Curie)

Séminaire Statistiques - IRMA Strasbourg, October 2015

Background on random forests

Random forests are a class of algorithms used to solve regression and classification problems.

They are often used in applied fields since they handle high-dimensional settings.

They have good predictive power and can outperform state-of-the-art methods.

But theoretical results do not yet fully explain their good accuracy.

1 Construction of random forests

2 Random forests and kernel methods

3 Consistency of Breiman forests

General framework of the presentation

Regression setting

We are given a training set $D_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, where the pairs $(X_i, Y_i) \in [0, 1]^d \times \mathbb{R}$ are i.i.d. and distributed as $(X, Y)$.

We assume that

$$Y = m(X) + \varepsilon,$$

where $\varepsilon \sim \mathcal{N}(0, \sigma^2)$. We want to build an estimate of the regression function $m$ using the random forest algorithm.
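To make the setting concrete, here is a minimal sketch that simulates a training set from this model. The regression function `regression_function`, the uniform design on $[0, 1]^d$ and all numerical values are assumptions chosen only for illustration; the slide does not specify them.

```python
import math
import random

def regression_function(x):
    """Hypothetical regression function m, chosen only for illustration."""
    return math.sin(2 * math.pi * x[0]) + x[1] ** 2

def sample_training_set(n, d, sigma, rng):
    """Draw D_n = [(X_1, Y_1), ..., (X_n, Y_n)] with Y = m(X) + eps.

    Assumption: X is uniform on [0, 1]^d (the slide only states that
    X takes values in [0, 1]^d). eps is Gaussian with std sigma.
    """
    data = []
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        y = regression_function(x) + rng.gauss(0.0, sigma)
        data.append((x, y))
    return data

rng = random.Random(0)
training_set = sample_training_set(n=200, d=3, sigma=0.5, rng=rng)
```

Each element of `training_set` is a pair `(x, y)` with `x` a list of `d` coordinates, which is the representation a tree-building routine would consume.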

How to build a tree?

Breiman's random forests are defined by

1 A splitting rule: minimize the square loss.

2 A stopping rule: leave exactly one point in each cell.

How to perform splits in Breiman's forests?

For a cut direction $j \in \{1, \ldots, d\}$ and a split position $z \in [0, 1]$, the criterion takes the form

$$L_n(j, z) = \frac{1}{N_n(A)} \sum_{i=1}^n \left( Y_i - \bar{Y}_{A_L} 1_{X_i^{(j)} < z} - \bar{Y}_{A_R} 1_{X_i^{(j)} \geq z} \right)^2,$$

where

$A_L = \{x \in A : x^{(j)} < z\}$ and $A_R = \{x \in A : x^{(j)} \geq z\}$;

$\bar{Y}_A$ is the average of the $Y_i$'s belonging to $A$;

$N_n(A)$ is the number of points in $A$.

An example: $j = 1$ and $z = 0.5$.

[Figure: a cell split at $x^{(1)} = 0.5$; the points are labelled with their responses 16.2, 14.8, 17.1, 5.8, 16.2, 7.1, 6.2, 5.7 and 5.5.]

$$L_n(1, 0.5) = \frac{1}{N_n(A)} \sum_{i=1}^n \Big( Y_i - \underbrace{\bar{Y}_{A_L}}_{\text{average on } A_L} 1_{X_i^{(1)} < 0.5} - \underbrace{\bar{Y}_{A_R}}_{\text{average on } A_R} 1_{X_i^{(1)} \geq 0.5} \Big)^2$$
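The split criterion can be sketched in a few lines of Python. This is an illustrative sketch, not the author's implementation: the function names, the 0-based coordinate index, the restriction of the sum to the points already in the cell $A$, and the choice of candidate positions (midpoints between consecutive observed values) are all assumptions. The toy coordinates are made up and are not the ones of the figure.

```python
def split_criterion(points, j, z):
    """L_n(j, z) computed on the points (x, y) of a cell A.

    j is a 0-based coordinate index; each y is compared with the
    average response on its own side of the cut.
    """
    left = [y for x, y in points if x[j] < z]
    right = [y for x, y in points if x[j] >= z]
    mean_left = sum(left) / len(left) if left else 0.0
    mean_right = sum(right) / len(right) if right else 0.0
    total = sum((y - mean_left) ** 2 for y in left)
    total += sum((y - mean_right) ** 2 for y in right)
    return total / len(points)

def best_split(points, directions):
    """Return (criterion, j, z) minimizing L_n over the given directions.

    Candidate positions are midpoints between consecutive observed
    values along each direction (an assumption of this sketch).
    """
    best = None
    for j in directions:
        values = sorted({x[j] for x, _ in points})
        for lo, hi in zip(values, values[1:]):
            z = (lo + hi) / 2
            score = split_criterion(points, j, z)
            if best is None or score < best[0]:
                best = (score, j, z)
    return best

# Hypothetical cell with four points in [0, 1]^2.
toy_cell = [
    ([0.125, 0.25], 16.0),
    ([0.25, 0.75], 14.0),
    ([0.75, 0.375], 6.0),
    ([0.875, 0.625], 4.0),
]
print(best_split(toy_cell, directions=[0, 1]))  # (1.0, 0, 0.5)
```

On this toy cell the responses are well separated along the first coordinate, so the cut at 0.5 in direction 0 gives the smallest criterion.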

Construction of random forests

Randomness in tree construction

Resample the data set via bootstrap.

At each node, preselect a subset of mtry variables eligible for splitting.
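The two sources of randomness can be sketched as follows. The helper names `bootstrap_sample` and `eligible_directions`, the 0-based coordinates and the toy data are assumptions made for this illustration.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) observations from data with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def eligible_directions(d, mtry, rng):
    """Pick mtry distinct coordinates among 0, ..., d - 1 for one node."""
    return sorted(rng.sample(range(d), mtry))

rng = random.Random(1)
data = [([i / 10, 1 - i / 10], float(i)) for i in range(10)]  # toy sample
resampled = bootstrap_sample(data, rng)
directions = eligible_directions(d=2, mtry=1, rng=rng)
```

A fresh bootstrap sample is drawn once per tree, whereas `eligible_directions` would be called again at every node of that tree.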

Literature

Random forests were created by Breiman [2001].

Many extensions have been proposed to

solve ranking problems [Clemencon et al., 2013],
solve survival analysis problems [Ishwaran et al., 2008],
perform quantile estimation [Meinshausen, 2006],
and improve computation time [Geurts et al., 2006].

Many theoretical results focus on simplified versions of random forests, whose construction is independent of the data set [Biau et al., 2008, Ishwaran and Kogalur, 2010, Biau, 2012, Genuer, 2012, Zhu et al., 2012].

Asymptotic normality of random forests has also been studied [Mentch and Hooker, 2014, Wager, 2014].

Random prediction or not?

Tree estimate:

$$m_n(x, \Theta) = \sum_{i=1}^n \frac{1_{X_i \in A_n(x, \Theta)}}{N_n(x, \Theta)} Y_i,$$

where $N_n(x, \Theta)$ is the number of points in the cell $A_n(x, \Theta)$.

M-finite forest estimate:

$$m_{M,n}(x, \Theta_1, \ldots, \Theta_M) = \frac{1}{M} \sum_{m=1}^M m_n(x, \Theta_m).$$

Conditionally on $D_n$, the estimate $m_{M,n}$ depends on $\Theta_1, \ldots, \Theta_M$.

Letting the number of trees grow removes this dependence:

$$m_{M,n}(x, \Theta_1, \ldots, \Theta_M) \xrightarrow[M \to \infty]{} \underbrace{E_\Theta\left[m_n(x, \Theta)\right]}_{m_{\infty,n}(x)}.$$
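The tree and finite forest estimates can be illustrated with a deliberately simplified tree. In this sketch $\Theta$ is a set of random cut points on $[0, 1]$ drawn independently of the data, which is not Breiman's construction; the function names, the empty-cell convention and the toy sample are assumptions.

```python
import bisect
import random

def draw_theta(num_cuts, rng):
    """Theta for a toy 1-d tree: sorted cut points, independent of the data."""
    return sorted(rng.random() for _ in range(num_cuts))

def tree_estimate(x, theta, data):
    """m_n(x, Theta): average of the Y_i whose X_i share the cell of x."""
    cell = bisect.bisect_right(theta, x)
    in_cell = [y for xi, y in data if bisect.bisect_right(theta, xi) == cell]
    # Convention (an assumption): predict 0 when the cell is empty.
    return sum(in_cell) / len(in_cell) if in_cell else 0.0

def forest_estimate(x, thetas, data):
    """m_{M,n}(x, Theta_1, ..., Theta_M): average of the M tree estimates."""
    return sum(tree_estimate(x, th, data) for th in thetas) / len(thetas)

rng = random.Random(0)
data = [(i / 50, (i / 50) ** 2) for i in range(51)]  # noiseless toy sample
thetas = [draw_theta(4, rng) for _ in range(500)]
small_forest = forest_estimate(0.5, thetas[:5], data)   # M = 5
large_forest = forest_estimate(0.5, thetas, data)       # M = 500
```

With the data held fixed, `small_forest` still varies noticeably with the draw of the $\Theta_m$, while `large_forest` is a Monte Carlo approximation of the infinite forest estimate $m_{\infty,n}(0.5)$.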

2 Random forests and kernel methods
Theoretical difficulties for studying random forests

The infinite random forest estimate takes the form

$$m_{\infty,n}(x) = \sum_{i=1}^n Y_i \, E_\Theta\left[\frac{1_{X_i \in A_n(x, \Theta)}}{N_n(x, \Theta)}\right],$$

where $N_n(x, \Theta)$ is the number of points in the cell $A_n(x, \Theta)$.

Two different difficulties:

The number of points in each cell is unknown.

The way the tree depends on the random variable $\Theta$ is unknown.

Kernel based on Random Forests (KeRF)

[Figure: three panels showing the same set of labelled data points.]

Infinite KeRF estimate:

$$m_{\infty,n}(x) = \frac{\sum_{i=1}^n Y_i K_k(x, X_i)}{\sum_{j=1}^n K_k(x, X_j)},$$

where $K_k(x, X_i) = P_\Theta\left[X_i \in A_n(x, \Theta)\right]$.
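The kernel is a connection probability, so it can be approximated by the fraction of trees in which two points share a cell. The sketch below does this for toy one-dimensional trees whose cut points are drawn independently of the data; the function names, the toy trees and the sample are assumptions, and for Breiman's forests the partitions would depend on the data.

```python
import bisect
import random

def connection_kernel(x, z, thetas):
    """Monte Carlo estimate of K(x, z) = P_Theta[z in A_n(x, Theta)]:
    fraction of trees in which x and z fall in the same cell."""
    same = sum(
        bisect.bisect_right(th, x) == bisect.bisect_right(th, z)
        for th in thetas
    )
    return same / len(thetas)

def kerf_estimate(x, data, thetas):
    """Kernel estimate: sum_i Y_i K(x, X_i) / sum_j K(x, X_j)."""
    weights = [connection_kernel(x, xi, thetas) for xi, _ in data]
    total = sum(weights)
    if total == 0:
        return 0.0  # convention (an assumption) when no point is connected
    return sum(w * y for w, (_, y) in zip(weights, data)) / total

rng = random.Random(0)
# Each toy tree is a sorted list of 4 uniform cut points on [0, 1].
thetas = [sorted(rng.random() for _ in range(4)) for _ in range(300)]
data = [(i / 50, (i / 50) ** 2) for i in range(51)]
prediction = kerf_estimate(0.5, data, thetas)
```

Unlike the forest estimate, which averages within each tree first, this estimate pools all connections across trees and normalizes once.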

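Breiman KeRF vs Breiman random forests

[Figure: two panels comparing the two estimates.]

Left panel: $n = 800$, $d = 50$, $Y = X_1^2 + \exp(-X_2^2)$.

Right panel: $n = 600$, $d = 100$, $Y = -\sin(2X_1) + X_2^2 + X_3 - \exp(-X_4) + \mathcal{N}(0, 0.5)$.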

A simple model: the centred forest

[Figure: step-by-step construction of a centred tree. At each step a cell is cut at its centre along a coordinate chosen uniformly at random; in the two-dimensional illustration each coordinate is chosen with probability p = 1/2.]
Centred KeRF vs centred random forests

[Figure: two panels comparing the two estimates.]

Left panel: $n = 800$, $d = 50$, $Y = X_1^2 + \exp(-X_2^2)$.

Right panel: $n = 600$, $d = 100$, $Y = -\sin(2X_1) + X_2^2 + X_3 - \exp(-X_4) + \mathcal{N}(0, 0.5)$.

Uniform KeRF vs uniform random forests

[Figure: two panels comparing the two estimates.]

Left panel: $n = 800$, $d = 50$, $Y = X_1^2 + \exp(-X_2^2)$.

Right panel: $n = 600$, $d = 100$, $Y = -\sin(2X_1) + X_2^2 + X_3 - \exp(-X_4) + \mathcal{N}(0, 0.5)$.

Analyzing KeRF estimates

Infinite KeRF estimate:

$$m_{\infty,n}(x) = \frac{\sum_{i=1}^n Y_i K_k(x, X_i)}{\sum_{j=1}^n K_k(x, X_j)}$$

It is a local averaging estimate and is thus easier to analyze.

One common assumption on kernel estimates is that $K_k(x, z) = K\left(\frac{x - z}{k}\right)$, which is not the case here. Thus, standard methods for kernel estimates cannot be directly adapted to our case.

In general, $K_k(x, X_i)$ has no explicit expression (due to the complexity of the partitioning). But it can be computed for centred and uniform random forests.

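Centred forests

For all $x, z \in [0, 1]^d$,

$$K_k^{cc}(x, z) = \sum_{\substack{k_1, \ldots, k_d \\ \sum_{j=1}^d k_j = k}} \frac{k!}{k_1! \cdots k_d!} \left(\frac{1}{d}\right)^k \prod_{m=1}^d 1_{\lceil 2^{k_m} x_m \rceil = \lceil 2^{k_m} z_m \rceil}.$$

[Figure: representations of $z \mapsto K_k^{cc}((0.5, 0.5), z)$ for $k = 1, 2, 5$.]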
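The centred kernel can be evaluated directly from its formula by enumerating the tuples $(k_1, \ldots, k_d)$ that sum to $k$. The sketch below is a brute-force illustration under that reading of the formula; the function names are assumptions, and the enumeration is only practical for small $k$ and $d$.

```python
import math
from itertools import product

def compositions(k, d):
    """All tuples (k_1, ..., k_d) of non-negative integers summing to k."""
    for head in product(range(k + 1), repeat=d - 1):
        if sum(head) <= k:
            yield head + (k - sum(head),)

def centred_kernel(x, z, k):
    """K_k^cc(x, z) for x, z in [0, 1]^d, by direct enumeration."""
    d = len(x)
    total = 0.0
    for ks in compositions(k, d):
        # x and z stay connected iff they share a dyadic cell of
        # side 2^(-k_m) along every coordinate m.
        same_cell = all(
            math.ceil(2 ** km * xm) == math.ceil(2 ** km * zm)
            for km, xm, zm in zip(ks, x, z)
        )
        if same_cell:
            multinomial = math.factorial(k)
            for km in ks:
                multinomial //= math.factorial(km)
            total += multinomial / d ** k
    return total

print(centred_kernel((0.25, 0.25), (0.75, 0.25), k=1))  # 0.5
```

In the printed case the two points differ only along the first coordinate, so they remain in the same cell exactly when the single cut is made along the second coordinate, which happens with probability 1/2.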

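Uniform forests

For all $z \in [0, 1]^d$,

$$K_k^{uf}(0, z) = \sum_{\substack{k_1, \ldots, k_d \\ \sum_{j=1}^d k_j = k}} \frac{k!}{k_1! \cdots k_d!} \left(\frac{1}{d}\right)^k \prod_{m=1}^d \left( z_m \sum_{j=k_m}^{\infty} \frac{(-\log z_m)^j}{j!} \right).$$

[Figure: representations of $z \mapsto K_k^{uf}\big(0, (z_1 - 0.5, z_2 - 0.5)\big)$ for $k = 1, 2, 5$.]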
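The uniform kernel can also be evaluated numerically. The sketch below avoids the infinite series by using the identity $\sum_{j \geq 0} (-\log z)^j / j! = 1/z$, so that $z \sum_{j \geq k_m} (-\log z)^j / j! = 1 - z \sum_{j < k_m} (-\log z)^j / j!$. The function names are assumptions, and the code requires every $z_m$ to lie in $(0, 1]$.

```python
import math
from itertools import product

def cell_factor(z, km):
    """z * sum_{j >= km} (-log z)^j / j!, via the finite complement."""
    t = -math.log(z)
    partial = sum(t ** j / math.factorial(j) for j in range(km))
    return 1.0 - z * partial

def uniform_kernel_from_origin(z, k):
    """K_k^uf(0, z) for z in (0, 1]^d, by direct enumeration."""
    d = len(z)
    total = 0.0
    for head in product(range(k + 1), repeat=d - 1):
        if sum(head) > k:
            continue
        ks = head + (k - sum(head),)
        multinomial = math.factorial(k)
        for km in ks:
            multinomial //= math.factorial(km)
        factors = math.prod(cell_factor(zm, km) for zm, km in zip(z, ks))
        total += multinomial / d ** k * factors
    return total

print(uniform_kernel_from_origin([0.5], k=1))  # 0.5
```

For $d = 1$ and $k = 1$ the kernel reduces to $1 - z$: the origin and $z$ stay in the same cell exactly when the single uniform cut falls above $z$.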

Rate of consistency of KeRF

Centred KeRF

Assume that $m$ is Lipschitz. Then, provided $2^k/n \to 0$ and $k \to \infty$,

$$E\left[m^{cc}_{\infty,n}(X) - m(X)\right]^2 \leq C_1 n^{-1/(3 + d \log 2)} (\log n)^2.$$

Uniform KeRF

Assume that $m$ is Lipschitz. Then, provided $2^k/n \to 0$ and $k \to \infty$,

$$E\left[m^{uf}_{\infty,n}(X) - m(X)\right]^2 \leq C n^{-1/(3 + 1.5 d \log 2)} (\log n)^2.$$

Minimax rate for Lipschitz functions: $n^{-1/(1 + 0.5d)}$.
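A quick way to compare these three rates is to tabulate the exponents of $n$ for a few dimensions. The sketch below ignores the constants and the $(\log n)^2$ factors and assumes that $\log$ denotes the natural logarithm; the function names are illustrative.

```python
import math

def centred_kerf_exponent(d):
    """Exponent a in the n^(-a) rate of the centred KeRF bound."""
    return 1.0 / (3.0 + d * math.log(2))

def uniform_kerf_exponent(d):
    """Exponent a in the n^(-a) rate of the uniform KeRF bound."""
    return 1.0 / (3.0 + 1.5 * d * math.log(2))

def minimax_exponent(d):
    """Exponent a in the minimax rate n^(-a) for Lipschitz functions."""
    return 1.0 / (1.0 + 0.5 * d)

for d in (1, 5, 20, 100):
    print(
        d,
        round(centred_kerf_exponent(d), 4),
        round(uniform_kerf_exponent(d), 4),
        round(minimax_exponent(d), 4),
    )
```

For every $d \geq 1$ the uniform exponent is smaller than the centred one, which is itself smaller than the minimax exponent, so both upper bounds are slower than the minimax rate.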

Summary of KeRF

Pros

KeRF and random forests are close in terms of accuracy.

KeRF estimates are more amenable to analysis, since they are kernel estimates.

The weight function $K_k$ is related to the shape of the partitions.

Cons

Computing the infinite kernel $K_k$ is time consuming.

Breiman KeRF is difficult to express since the kernel $K$ depends on the data set.

3 Consistency of Breiman forests
Tree consistency

For a tree whose construction is independent of the data, if

1 $\mathrm{diam}(A_n(X)) \to 0$ in probability;

2 $N_n(A_n(X)) \to \infty$ in probability;

then the tree is consistent, that is,

$$\lim_{n \to \infty} E\left|m_n(X) - m(X)\right|^2 = 0.$$

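Consistency of centred random forest

Estimation error [Biau, 2012]

Under proper assumptions on the regression model,

$$E\left[m^{cc}_{\infty,n}(X) - \tilde{m}^{cc}_{\infty,n}(X)\right]^2 \leq C \sigma^2 \frac{2^{k_n}}{n k_n^{1/2}}.$$

Approximation error [Biau, 2012]

Under proper assumptions on the regression model,

$$E\left[\tilde{m}^{cc}_{\infty,n}(X) - m(X)\right]^2 \leq 2 d L^2 \cdot 2^{-0.75 k_n / (d \log 2)} + \|m\|_\infty^2 \, e^{-n/2^{k_n}}.$$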

Consistency of centred random forest

If the forest is fully grown, that is, if $k_n = \lfloor \log_2 n \rfloor$:

Estimation error [Biau, 2012]

Under proper assumptions on the regression model,

$$E\left[m^{cc}_{\infty,n}(X) - \tilde{m}^{cc}_{\infty,n}(X)\right]^2 \leq C \sigma^2 (\log_2 n)^{-1/2}.$$

Approximation error [Biau, 2012]

Under proper assumptions on the regression model,

$$E\left[\tilde{m}^{cc}_{\infty,n}(X) - m(X)\right]^2 \leq 2 d L^2 n^{-0.75 / (d \log 2)} + \|m\|_\infty^2 \times 1.$$
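The substitution $k_n = \log_2 n$ can be checked numerically. The sketch below evaluates the general bounds $C\sigma^2 2^{k_n}/(n k_n^{1/2})$ and $2dL^2 \cdot 2^{-0.75 k_n/(d \log 2)} + \|m\|_\infty^2 e^{-n/2^{k_n}}$ at a power of 2. The constants `C`, `sigma2`, `L` and `sup_m` are set to 1 purely for illustration, and $\log$ is assumed to be the natural logarithm.

```python
import math

def estimation_bound(n, k, sigma2=1.0, C=1.0):
    """C * sigma^2 * 2^k / (n * sqrt(k)), with illustrative constants."""
    return C * sigma2 * 2 ** k / (n * math.sqrt(k))

def approximation_bound(n, k, d, L=1.0, sup_m=1.0):
    """2 d L^2 2^(-0.75 k / (d log 2)) + ||m||_inf^2 exp(-n / 2^k)."""
    first = 2 * d * L ** 2 * 2 ** (-0.75 * k / (d * math.log(2)))
    second = sup_m ** 2 * math.exp(-n / 2 ** k)
    return first + second

n, d = 2 ** 10, 2
k = int(math.log2(n))  # fully grown: k_n = log2(n) = 10
print(estimation_bound(n, k))
print(approximation_bound(n, k, d))
```

With $n = 2^{10}$ the estimation bound equals $10^{-1/2}$, which decays only logarithmically in $n$. In the approximation bound the exponential factor equals $e^{-1}$, so the second term does not vanish as $n$ grows: with fully grown trees these bounds alone do not give consistency.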

Algorithm for Breiman random forest

Randomness for Breiman random forests

Data sampling: bootstrap.

At each cell, randomly select mtry coordinates among $\{1, \ldots, d\}$.

Choose the split by minimizing the CART-split criterion on the cell along the mtry selected coordinates.

Stop when each cell contains exactly one point.

Algorithm for Breiman random forest

Randomness for Breiman random forests

Data sampling: subsampling, that is, choosing $a_n$ points among $n$, with $a_n < n$.

At each cell, randomly select mtry coordinates among $\{1, \ldots, d\}$.

Choose the split by minimizing the CART-split criterion on the cell along the mtry selected coordinates.

Stop when the number of cells is exactly $t_n$.

Assumption (H1)

Additive regression model:

$$Y = \sum_{i=1}^d m_i(X^{(i)}) + \varepsilon,$$

where

$X$ is uniformly distributed on $[0, 1]^d$,

$\varepsilon \sim \mathcal{N}(0, \sigma^2)$ with $\varepsilon$ independent of $X$,

each model component $m_i$ is continuous.

Consistency

Theorem [Scornet et al., 2014]

Assume that (H1) is satisfied. Then, provided $a_n \to \infty$ and $t_n (\log a_n)^9 / a_n \to 0$, random forests are consistent, i.e.,

$$\lim_{n \to \infty} E\left[m_{\infty,n}(X) - m(X)\right]^2 = 0.$$

Remarks

First consistency result for Breiman's original forest.

Consistency of CART.

Sparsity and random forests

Assume that

$$Y = \sum_{i=1}^S m_i(X^{(i)}) + \varepsilon,$$

for some $S < d$.

Denote by $j_{1,n}(X), \ldots, j_{k,n}(X)$ the first $k$ cut directions used to construct the cell containing $X$.

Proposition [Scornet et al., 2014]

Let $k \in \mathbb{N}^\star$ and $\xi > 0$. Under appropriate assumptions, with probability $1 - \xi$, for all $n$ large enough, we have, for all $1 \leq q \leq k$,

$$j_{q,n}(X) \in \{1, \ldots, S\}.$$

G. Biau. Analysis of a random forests model. Journal of Machine Learning Research, 13:1063–1095, 2012.

G. Biau, L. Devroye, and G. Lugosi. Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research, 9:2015–2033, 2008.

L. Breiman. Random forests. Machine Learning, 45:5–32, 2001.

S. Clemencon, M. Depecker, and N. Vayatis. Ranking forests. Journal of Machine Learning Research, 14(1):39–73, 2013.

R. Genuer. Variance reduction in purely random forests. Journal of Nonparametric Statistics, 24:543–562, 2012.

P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Springer Science, March 2006.

H. Ishwaran and U. Kogalur. Consistency of random survival forests. Statistics & Probability Letters, 80:1056–1064, 2010.

H. Ishwaran, U. Kogalur, E. Blackstone, and M. Lauer. Random survival forests. The Annals of Applied Statistics, 2(3):841–860, 2008.

N. Meinshausen. Quantile regression forests. Journal of Machine Learning Research, 7:983–999, 2006.

L. Mentch and G. Hooker. Ensemble trees and CLTs: Statistical inference for supervised learning. arXiv:1404.6473, 2014.

E. Scornet, G. Biau, and J.-P. Vert. Consistency of random forests. arXiv:1405.2881, 2014.

S. Wager. Asymptotic theory for random forests. arXiv:1405.0352, 2014.

R. Zhu, D. Zeng, and M.R. Kosorok. Reinforcement learning trees. 2012.

Thank you for your attention!