
De la Garza Phenomenon

Bikas K Sinha, ISI, Kolkata

RU Workshop : April 18, 2012

Collaborators : N K Mandal & M Pal, Calcutta University

Nomenclature

• Liski-Mandal-Shah-Sinha (2002) : Topics in Optimal Design : Springer-Verlag Monograph
• Pukelsheim (2006) : Optimal Design of Experiments : refers to it as the Property of Admissibility
• Khuri-Mukherjee-Sinha-Ghosh (2006) : Statistical Science : the de la Garza Phenomenon
• Min Yang (2010) : Annals of Statistics : title of the paper, 'On the de la Garza Phenomenon'

Motivating Example : First Course in Regression

• X : -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
• Y : … … … … … … … (responses)
• Fit a linear regression equation of Y on X under the usual model assumptions, etc.
• X transformed to U (see the rescaling sketch below)
• U : -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
• Motivating Question : If we believe in the linear regression model, what good are so many u-values? Why can't we work with exactly two u-values and, that too, possibly with ±1?
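A minimal numeric sketch of the transformation (assuming the U-values come from the usual rescaling of [min X, max X] onto [-1, 1]; the slide does not spell the map out):

```python
import numpy as np

# Rescale X onto [-1, 1]: subtract the midrange, divide by the half-range.
# This is an assumption about how the slide's U-values were obtained.
x = np.array([-3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2])
mid = (x.min() + x.max()) / 2       # midrange = 2.5
half = (x.max() - x.min()) / 2      # half-range = 5.7
u = (x - mid) / half
print(np.round(u, 2))               # [-1.   -0.91 -0.75 -0.4   0.39  0.67  1.  ]
```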

Linear Regression Model

• Mean model Y_x = α + βx with homoscedastic errors
• Given D_N = [(x1, n1); (x2, n2); …; (xk, nk)]; N = ∑ ni
• χ = space of the regressor X = [a, b], a < b; WLOG a ≤ x1 < x2 < … < xk ≤ b, x's all distinct
• For each i, ni ≥ 1 such that ∑ ni = N [given]
• Estimability of α and β ensured iff k ≥ 2
• Fitting of the linear regression model : β^ = b_yx = SP_yx / SS_xx; α^ = ȳ − b x̄
• Inference rests on normality of errors, etc.

Motivating Theory : Undergraduate Level

• X : a ≤ x1 < x2 < … < xk ≤ b [k > 1, all x's distinct]
• Y : y1, y2, y3, …, yk : responses on Y
• Assume linear regression of Y on X : E[Y_x] = α + βx, usual conditions on the errors
• Find the BLUE of the regression coefficient β
• Smart student's thought : pairwise unbiased estimators β^_(i,j) = b_(i,j) = (yi − yj)/(xi − xj), 1 ≤ i < j ≤ k
• So the BLUE can be based on the {b_(i,j)}'s : all C(k,2) pairs. All distinct? Correlated or uncorrelated?
• Basis : b_(1,2), b_(1,3), …, b_(1,k) : each unbiased but jointly correlated estimates (y1 is involved everywhere)

Formation of BLUE

• Work out means, variances and covariances of the estimators and start from there to arrive at the BLUE.
• Define η as the (k−1)×1 column vector of the 'difference estimators', i.e., η = (b_(1,2), b_(1,3), …, b_(1,k))′, so that
• E[η] = β1 and Disp(η) = σ²W, W being a p.d. matrix
• Then the BLUE of β = η′W⁻¹1 / 1′W⁻¹1
• Show that indeed the above simplifies to β^ = b = ∑(yi − ȳ)(xi − x̄) / ∑(xi − x̄)². (A numerical check is sketched below.)
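A small numerical sketch of this claim; the data are made up purely for illustration, and the covariance structure of the b_(1,j)'s is written out explicitly:

```python
import numpy as np

# BLUE of beta from the correlated difference estimators
# b_(1,j) = (y1 - yj)/(x1 - xj), compared against the OLS slope.
rng = np.random.default_rng(0)
x = np.array([-1.0, -0.91, -0.75, -0.40, 0.39, 0.67, 1.0])
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

d = x[0] - x[1:]                           # x1 - xj, j = 2..k
eta = (y[0] - y[1:]) / d                   # the difference estimators
# Cov(b_(1,j), b_(1,l)) = sigma^2/[(x1-xj)(x1-xl)] for j != l, and
# Var(b_(1,j)) = 2 sigma^2/(x1-xj)^2; sigma^2 cancels in the BLUE.
W = 1.0 / np.outer(d, d) + np.diag(1.0 / d**2)
one = np.ones_like(eta)
blue = one @ np.linalg.solve(W, eta) / (one @ np.linalg.solve(W, one))

ols = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean())**2)
print(blue, ols)                           # the two slopes agree
```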

Smarter move…

• V1 = {[y1 − y2]/√2} / {[x1 − x2]/√2}
• V2 = {[y1 + y2 − 2y3]/√6} / {[x1 + x2 − 2x3]/√6}
• …
• V_(n−1) = {[y1 + y2 + … − (n−1)yn]/√[n(n−1)]} / {[x1 + x2 + … − (n−1)xn]/√[n(n−1)]}
• Then these V's are uncorrelated.
• Hence W(V) is a diagonal matrix, etc.
• Derivation of β^ is much easier.
• Claim : same result, novel derivation, use of Helmert's Orthogonal Transformation. (A sketch of the Helmert contrasts follows below.)
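A brief sketch of the Helmert contrasts behind this step; checking HH′ = I confirms that the corresponding contrasts of i.i.d. errors are uncorrelated:

```python
import numpy as np

def helmert_contrasts(n):
    """Rows j = 1..n-1 of the normalized Helmert matrix:
    row j is (1, ..., 1, -j, 0, ..., 0) / sqrt(j*(j+1))."""
    H = np.zeros((n - 1, n))
    for j in range(1, n):
        H[j - 1, :j] = 1.0
        H[j - 1, j] = -j
        H[j - 1] /= np.sqrt(j * (j + 1))
    return H

H = helmert_contrasts(5)
print(np.allclose(H @ H.T, np.eye(4)))   # True: rows are orthonormal, so the
                                         # corresponding contrasts of i.i.d.
                                         # errors are uncorrelated
```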

Motivating Theory : Master Level

• Regression design on X : (x1, n1); (x2, n2); …; (xk, nk) [k > 1]; all x's distinct
• Y : {(y_1j); (y_2j); …; (y_kj)} : altogether N = ∑ ni observations
• Assume linear regression of Y on X : E[Y_x] = α + βx, usual conditions on the errors
• Find the BLUE of the regression coefficient β
• Smart student's thought : pairwise unbiased estimators β^_(i,j) = b_(i,j) = (ȳi − ȳj)/(xi − xj), 1 ≤ i < j ≤ k
• So the BLUE can be based on the {b_(i,j)}'s. How many? Correlated or uncorrelated?
• Basis : b_(1,2), b_(1,3), …, b_(1,k) : each unbiased but jointly correlated estimates (ȳ1 is involved everywhere)

Motivating Theory : Master Level & Beyond…

• Work out means, variances and covariances of the estimators and start from there to arrive at the BLUE.
• Define η as the vector of these 'difference estimators' so that
• E[η] = β1 and Disp(η) = σ²W. Complicated?
• Then the BLUE of β = η′W⁻¹1 / 1′W⁻¹1
• Show that indeed the above simplifies to β^ = b = ∑ ni (ȳi − ȳ)(xi − x̄) / ∑ ni (xi − x̄)², where ȳ is the grand mean of all N observations.

Smarter move…

• V1 = [√n1 ȳ1 − √n2 ȳ2]/[…]
• V2 = [√n1 ȳ1 + √n2 ȳ2 − 2√n3 ȳ3]/[…]
• Etc.
• This time the W-matrix becomes a diagonal matrix.
• Tremendous simplification in the formation of β^.

Turn back to the basic question…

• X : -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
• U : -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
• Motivating Question : If you believe in the linear regression model E[Y_x] = α + βx = δ + γu = E[Y_u], what good are so many u-values? Why can't you work with exactly two u-values and, that too, possibly with ±1?

Fisher Information Matrix

• I(θ; D_N) = X′X = 2×2 matrix with elements [(N, T1); (T1, T2)], where T1 = ∑ ni xi and T2 = ∑ ni xi²
• Here X_(N×2) = [1_(N×1), x], x being the column vector of the xi's with ni repeats
• Averaged Information Matrix per observation : IBAR = (1/N) I(θ) = [(1, μ'1); (μ'1, μ'2)], where μ'1 = ∑ ni xi / N and μ'2 = ∑ ni xi² / N
• I(θ) : p.d. matrix iff k ≥ 2 distinct x's are considered. (A small computational sketch follows below.)
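A minimal sketch computing IBAR for the 7-point U design above (equal weight 1/7 at each point, purely as an illustration):

```python
import numpy as np

def info_bar(x, w):
    """Per-observation information matrix [(1, m1), (m1, m2)] of the design
    {(x_i, w_i)} under the linear model E[Y_x] = alpha + beta*x."""
    x, w = np.asarray(x, float), np.asarray(w, float)
    m1, m2 = np.sum(w * x), np.sum(w * x**2)
    return np.array([[1.0, m1], [m1, m2]])

u = [-1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00]
print(info_bar(u, np.full(7, 1 / 7)))
# mu'1 = -1/7 ~ -0.1429, mu'2 = 4.1516/7 ~ 0.5931
```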

de la Garza Phenomenon [de la Garza, A. (1954) : Annals of Mathematical Statistics]

• Research paper [Annals of Statistics] : 2010
• Research paper [Annals of Statistics] : 2009
• Springer-Verlag Monograph on Optimal Designs : 2002
• Wiley Book on Optimal Designs : 2006
• Continuous flow of papers involving linear and non-linear models – both qualitative and quantitative responses – enormous impact of the de la Garza Phenomenon in optimality studies

Continuous Design Theory

• Context : Linear Regression Model
• Space of regressor : χ = [a, b], a < b
• k ≥ 2 distinct x-values in χ with positive weights w1, w2, …, wk such that ∑ wi = 1
• In applications, we think in terms of N observations, with N wi = Ni observations taken at x = xi, i = 1, 2, …, k
• [Choice of N ensures integral values of the Ni's]
• Version of IBAR = [(1, μ'1); (μ'1, μ'2)], where μ'1 = ∑ wi xi and μ'2 = ∑ wi xi²
• Known as the Information Matrix arising out of a continuous design, in terms of {(xi, wi); i = 1, 2, …, k}

De la Garza Phenomenon : Continuous Design Theory

• Context : Linear Regression Model with Homoscedastic Errors

• Claim 1 : Given any continuous regression design D_(k, x, w) with k support points in χ = [a, b] :
• a ≤ x1 < x2 < … < xk ≤ b, x's all distinct, and with positive weights w1, w2, …, wk [such that ∑ wi = 1], whenever k > 2 we can find exactly 2 points x* and x** with suitable weights p* and p** such that (i) x1 ≤ x* < x** ≤ xk; (ii) p* + p** = 1; and (iii) IBAR based on D*_[(x*, p*); (x**, p**)] is identical to IBAR based on D_(k, x, w). [Information Equivalence]

Proof of Claim 1

• Recall μ'1 = ∑ wi xi [1st moment]
• and μ'2 = ∑ wi xi² [2nd moment]
• Start with IBAR = [(1, μ'1); (μ'1, μ'2)]
• Set IBAR = I*BAR and derive the defining equations :
• p* x* + p** x** = μ'1 ………… (1)
• p* x*² + p** x**² = μ'2 ………… (2)
• Claim : There is an acceptable solution [(x*, p*); (x**, p**)] satisfying (1) and (2).

Proof … contd.

• WLOG : x1 = −1 and xk = +1
• Solution set : Define μ2 = μ'2 − (μ'1)² > 0
• x* = μ'1 ± √(p** μ2 / p*)
• x** = μ'1 ∓ √(p* μ2 / p**)
• Further, for x* < x**, we readily verify
• −1 < x* = μ'1 − √(p** μ2 / p*) and
• x* < x** = μ'1 + √(p* μ2 / p**) < 1
• whenever μ2 / [μ2 + (1 + μ'1)²] < p* < (1 − μ'1)² / [μ2 + (1 − μ'1)²]
• NOTE : Verified LHS < RHS. (A numerical check is sketched below.)
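A short numeric sketch of this construction; the design and the choice of p* below are purely illustrative:

```python
import numpy as np

# Collapse a k-point design on [-1, 1] to an information-equivalent
# 2-point design, following equations (1)-(2) of the proof.
x = np.array([-1.0, -0.4, 0.3, 1.0])           # an arbitrary 4-point design
w = np.array([0.25, 0.25, 0.25, 0.25])
m1, m2 = np.sum(w * x), np.sum(w * x**2)
mu2 = m2 - m1**2                               # central second moment

lo = mu2 / (mu2 + (1 + m1)**2)                 # admissible range for p*
hi = (1 - m1)**2 / (mu2 + (1 - m1)**2)
p1 = 0.5 * (lo + hi)                           # any p* strictly inside works
p2 = 1 - p1
x1 = m1 - np.sqrt(p2 * mu2 / p1)               # x*
x2 = m1 + np.sqrt(p1 * mu2 / p2)               # x**

# check: same first and second moments, and both points inside [-1, 1]
print(np.isclose(p1 * x1 + p2 * x2, m1),
      np.isclose(p1 * x1**2 + p2 * x2**2, m2),
      -1 < x1 < x2 < 1)
```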

Statement of Information Equivalence : Polynomial Regression

Therefore : Guaranteed existence of [(x*, p*); (x**, p**)], −1 < x* < x** < 1, 0 < p* < 1, such that IBAR = IBAR*.
The de la Garza Phenomenon applies to the p-th degree polynomial regression model in terms of Information Equivalence of any k [> p+1]-point supported continuous design with that of a suitably chosen exactly (p+1)-point supported continuous design!

Caratheodory's Theorem

• If p+1 is the number of parameters in a model, one can restrict attention to designs with at most (p+1)(p+2)/2 support points.
• Strength : model specification is most general
• Weakness : for the p-th degree polynomial regression model, de la Garza provides a much better result [p+1 << (p+1)(p+2)/2, in general terms]

Higher Degree Polynomial Regression

• Yes, the de la Garza Phenomenon holds for higher degree polynomial regressions as well; the proof is a marvelous exercise in matrix theory!
• Equate the given p.d. matrix I(D) to I(D*), where
• I(D*) = X*′ W* X*, with X* being a square matrix and W* being a diagonal matrix. The claim is that such X* and W* matrices exist with the minimum number of support points; this is the spirit of the de la Garza Phenomenon in terms of Information Equivalence. Information Dominance came much later.

Back to de la Garza Phenomenon: Exact Design Theory [EDT]

• This aspect has somehow been bypassed in the literature; it is difficult to provide a general theory as to the exact sample size for Information Equivalence to work!
• Motivating Example : Linear regression with 3 points to start with, [−1, 0, 1], so that k = 3 > 2. According to the de la Garza Phenomenon, under continuous design theory, there are weights
• 0 < w(−1), w(0), w(+1) < 1, sum = 1,
• assigned to these points. And then we can find

De la Garza Phenomenon : EDT

one 2-point design, say [(a, p); (b, q)], such that −1 ≤ a < b ≤ 1, 0 < p < 1, and there is Information Equivalence between the two designs! What if we are in an exact design scenario with a given total number of observations N and its decomposition into n(−), n(0) and n(+), assigned to −1, 0 and 1 respectively? Can we now find a solution [(a, n_a); (b, n_b)] satisfying

EDT…

• (i) −1 ≤ a < b ≤ 1;
• (ii) n_a + n_b = N, both being integers;
• (iii) Information Equivalence?
• Do we need a condition on N at all?
• Crucial observation : NOT ALL VALUES OF N ARE AMENABLE TO SUPPORTING EQUIVALENCE OF THE INFORMATION MATRICES; A MINIMUM VALUE IS NEEDED, AND ONLY THEN DOES IT WORK!

EDT : Choice of N

Examples (design : N : remark) :
• (i) −1(1), 0(1), +1(1) : 3 : not possible
• (ii) −1(2), 0(2), +1(2) : 6 : possible
• (iii) −1(1), 0(2), +1(1) : 4 : possible
• (iv) −1(2), 0(1), +1(1) : 4 : not possible
• (v) −1(4), 0(2), +1(2) : 8 : possible
• (vi) −1(1), 0(3), +1(1) : 5 : possible
• (vii) −1(1), 0(2), +1(4) : 7 : possible
• (viii) −1(1), a(1), +1(1) : 3 : not possible
• (ix) −1(2), a(2), +1(2) : 6 : possible iff 3 − 2√3 < a < 2√3 − 3

EDT : General Theory for 3 Points with Point Symmetry

• Consider a general allocation design :
• −1(n(−)), 0(n(0)) and 1(n(+)), where each of n(−), n(0) and n(+) is a positive integer and n(−) + n(0) + n(+) = N ≥ 3.
• Once more, we want to replace the above 3-point, point-symmetric design by a two-point design of the form (x, nx) and (y, ny), so that nx + ny = N and, moreover, Information Equivalence holds. That suggests

EDT

• x nx + y ny = n(+) − n(−) ………… (3)
• x² nx + y² ny = n(+) + n(−) ………… (4)
• Set
• a = nx, b = ny, T1 = n(+) − n(−) and T2 = n(+) + n(−) ………… (5)
• From (3) and (4), in terms of (5), we obtain
• x = [T1/(a+b)] ± √{b[(a+b)T2 − T1²] / [a(a+b)²]}
• y = [T1/(a+b)] ∓ √{a[(a+b)T2 − T1²] / [b(a+b)²]}
• It can be readily verified that (a+b)T2 > T1².

EDT

• Let us choose
• x = [T1/(a+b)] + √{b[(a+b)T2 − T1²] / [a(a+b)²]}
• and
• y = [T1/(a+b)] − √{a[(a+b)T2 − T1²] / [b(a+b)²]}
• so that y < x.
• Note that T1 and T2 are both known. We will now sort out values of nx and ny subject to nx + ny = N so as to satisfy the requirement that
• −1 ≤ y < x ≤ 1.

EDT

• First, note that
• (i) a + b = N
• (ii) the expressions for x and y depend on a and b only through a/b or b/a.
• Set n(−)/N = P-, n(0)/N = Po, n(+)/N = P+
• Conditions : −1 ≤ y AND x ≤ 1
• Equivalent to :
• 1 + T1/(a+b) ≥ √{a[(a+b)T2 − T1²] / [b(a+b)²]}
• AND
• 1 − T1/(a+b) ≥ √{b[(a+b)T2 − T1²] / [a(a+b)²]}

EDT

• Equivalent to :
• [Po(1−Po) + 4(P+)(P-)] / [2(P-) + Po]² ≤ nx/ny ≤ [2(P+) + Po]² / [Po(1−Po) + 4(P+)(P-)]
• Equivalent to :
• L = [Po(1−Po) + 4(P+)(P-)] / {Po(1−Po) + 4(P+)(P-) + [2(P-) + Po]²} ≤ nx/N ≤ [2(P+) + Po]² / {Po(1−Po) + 4(P+)(P-) + [2(P+) + Po]²} = U
• Written alternatively as : N·L ≤ nx ≤ N·U. (A small computational sketch follows below.)
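A brief computational sketch of this recipe; it reproduces, for instance, the symmetric N = 6 case worked out on the following slides (±√(2/3), 3 observations each):

```python
import numpy as np

def two_point_exact(n_minus, n_zero, n_plus):
    """Enumerate exact 2-point designs information-equivalent to the
    3-point design [(-1, n_minus), (0, n_zero), (1, n_plus)]."""
    N = n_minus + n_zero + n_plus
    T1, T2 = n_plus - n_minus, n_plus + n_minus
    P0, Pm, Pp = n_zero / N, n_minus / N, n_plus / N
    c = P0 * (1 - P0) + 4 * Pp * Pm
    L = c / (c + (2 * Pm + P0) ** 2)
    U = (2 * Pp + P0) ** 2 / (c + (2 * Pp + P0) ** 2)
    sols = []
    for nx in range(1, N):                     # candidate integer n_x
        if not (N * L <= nx <= N * U):
            continue
        ny = N - nx
        root = (N * T2 - T1 ** 2) / N ** 2
        x = T1 / N + np.sqrt(ny * root / nx)
        y = T1 / N - np.sqrt(nx * root / ny)
        sols.append((float(round(x, 4)), nx, float(round(y, 4)), ny))
    return sols

print(two_point_exact(2, 2, 2))   # [(0.8165, 3, -0.8165, 3)], i.e. +/- sqrt(2/3)
print(two_point_exact(1, 1, 1))   # []  (N = 3 admits no exact 2-point equivalent)
```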

EDT

• Implication : The choice of N must be such that the interval [N·L, N·U] includes at least one integer, which can then serve as the value of nx. A sufficient condition for this to happen is, of course, that the length of the interval satisfies N(U − L) ≥ 1. Even otherwise, a choice of nx could be ensured.

Note : So far, this [length less than unity] case has been eluding us!!!

EDT

• (i) Po = P+ = P- = 1/3 [point- and mass-symmetric design]
• Here we find L = 2/5 and U = 3/5.
• So, for N = 3, N·L = 6/5 and N·U = 9/5, which do not include any integer. So a 3-point design with point and mass symmetry cannot be replaced by a 2-point design whenever N = 3.
• Again, for N = 6, we have N·L = 12/5, N·U = 18/5, and these include the integer 3. So there is a solution and we have : ±√(2/3), each with 3 observations, as was mentioned before.

EDT

• For N = 9, we have N·L = 18/5 and N·U = 27/5. These include 2 integers : 4 and 5. So we have two solutions :
• [−5/√30, 4]; [4/√30, 5]
• AND
• [−4/√30, 5]; [5/√30, 4].

EDT

• (ii) Po = 2/7, P+ = 4/7 and P- = 1/7, i.e., the initial design has a size which is a multiple of 7, say N = 7k. This design is point-symmetric but mass-asymmetric.
• Explicitly it is : [(−1, k); (0, 2k); (1, 4k)], where k is an integer.
• Note that L and U are independent of k. Computations yield : L = 13/21 [= 39/63] and U = 50/63.
• (a) k = 1 : N = 7; N·L = 13/3 < N·U = 50/9 : one solution
• nx = 5, x = 3/7 + √1040/70;
• ny = 2, y = 3/7 − 5√1040/140

EDT

• (b) k = 2 : N = 14 : three solutions
• nx = 9, x = 3/7 + √520/42; ny = 5, y = 3/7 − 3√2080/140
• nx = 10, x = 3/7 + √1040/70; ny = 4, y = 3/7 − √260/14
• nx = 11, x = 3/7 + √3432/154; ny = 3, y = 3/7 − √3432/42

EDT

• (iii) Po = 3/5, P+ = P- = 1/5, i.e., the initial design has size a multiple of 5, say N = 5k, and explicitly it is :
• [(−1, k); (0, 3k); (1, k)], where k is an integer.
• This is point- and mass-symmetric.
• Note that L and U are independent of k. Computations yield : L = 2/7 and U = 5/7.
• k = 1 : N = 5, 10/7 ≤ nx ≤ 25/7 :
• (nx, ny) = (2, 3) OR (3, 2).
• Solutions : x = 2/√15 and y = −3/√15 with nx = 3 and ny = 2;
• x = 3/√15 and y = −2/√15 with nx = 2 and ny = 3.

EDT

• k = 2 : N = 10, 20/7 ≤ nx ≤ 50/7 : nx = 3, 4, 5, 6, 7.
• Solutions :
• x = 6/√210 and y = −14/√210 for (nx, ny) = (7, 3)
• x = 14/√210 and y = −6/√210 for (nx, ny) = (3, 7)
• x = 4/√60 and y = −6/√60 for (nx, ny) = (6, 4)
• x = 6/√60 and y = −4/√60 for (nx, ny) = (4, 6)
• x = 2/√10 and y = −2/√10 for (nx, ny) = (5, 5).

EDT

• EXAMPLE of a 3-point asymmetric design : N = 3
• Consider an asymmetric design [(−1, 1), (a, 1), (1, 1)] with a ≠ 0. WLOG, we take a > 0.
• Consider Information Equivalence with [(x, 2), (y, 1)].
• Then
• a = 2x + y ………… (6)
• 2 + a² = 2x² + y² ………… (7)
• This yields : x = a/3 ± (1/3)√(a² + 3) and y = a/3 ∓ (2/3)√(a² + 3),
• and for 0 < a < 1, it turns out that
• a/3 − (2/3)√(a² + 3) < −1
• and 1 < a/3 + (2/3)√(a² + 3),
• so in either case y falls outside [−1, 1]. Hence, N = 3 does not work!

EDT

• For N = 6, naturally, equal allocation of 2 at each of the 3 points will yield the same negative result when we opt for [(x, 4), (y, 2)]. It follows that [(x, 5), (y, 1)] also fails to yield any affirmative result.
• For [(x, 3), (y, 3)], we require
• 2a = 3(x + y)
• 4 + 2a² = 3(x² + y²).
• We obtain :
• x, y = a/3 ± (1/3)√(6 + 2a²)

EDT

• Note : For a = 0, this leads to : x, y = ±√(2/3). This was discussed earlier.
• Condition : −1 < x < 1 leads to :
• 0 < a < 2√3 − 3, if a > 0.
• This was stated earlier.

EDT

• More examples…
• [(−1, 1); (0, 2); (1, 1)] is equivalent to [(−1/√2, 2); (1/√2, 2)]
• [(−1, 2); (0, 1); (1, 1)] : impossible
• [(−1, 4); (0, 2); (1, 2)] is equivalent to [(−1/4 − √165/20, 5); (−1/4 + √165/12, 3)]

Turning back to the example…

U : -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
Under linear regression : does there exist a 2-point Information Equivalent design?
Computations yield : N = 7, μ'1 = −1/7 = −0.142857; μ'2 = 4.1516/7.
Alternative choice : −1 < a < 0 < b < 1 for the 7 observations, with 4 observations at a and 3 at b :
4a + 3b = −1 and 4a² + 3b² = 4.1516
a = −0.7982 AND b = 0.7309 : the required solution. (Verified numerically in the sketch below.)
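A quick numeric verification of this exact 2-point design, simply re-solving the two moment equations:

```python
import numpy as np

# Verify the exact 2-point design for the 7-point U data:
# 4 observations at a and 3 at b must reproduce T1 = sum(u), T2 = sum(u^2).
u = np.array([-1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00])
T1, T2 = u.sum(), (u ** 2).sum()                  # -1.0 and 4.1516

# Eliminate b from 4a + 3b = T1 and 4a^2 + 3b^2 = T2:
# (28/3) a^2 - (8 T1/3) a + (T1^2/3 - T2) = 0
A, B, C = 28 / 3, -8 * T1 / 3, T1 ** 2 / 3 - T2
a = (-B - np.sqrt(B ** 2 - 4 * A * C)) / (2 * A)  # the negative root
b = (T1 - 4 * a) / 3
print(round(a, 4), round(b, 4))                   # -0.7982  0.731
```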

Quadratic Regression : Info Equivalence

• Context : Quadratic Regression Model with Homoscedastic Errors
• [Mean model Y_x = α + βx + γx²]
• Claim : Given any continuous regression design D_(k, x, w) with k support points in χ = [a, b] :
• a ≤ x1 < x2 < … < xk ≤ b, x's all distinct, and with positive weights w1, w2, …, wk [such that ∑ wi = 1], whenever k > 3 we can find exactly 3 points x*, x** and x*** with suitable weights p*, p** and p*** such that (i) x1 ≤ x* < x** < x*** ≤ xk; (ii) p* + p** + p*** = 1; and (iii) IBAR based on D*_[(x*, p*); (x**, p**); (x***, p***)] is identical to IBAR based on D_(k, x, w). [Information Equivalence]

Quadratic Regression : EDT

• Problem # 1
• Given D_4 : [(−1, 1); (−a, 1); (a, 1); (1, 1)]
• Can we find [(x, 2); (y, 1); (z, 1)] for Information Equivalence with −1 ≤ x, y, z ≤ 1, all distinct?
• Answer : Impossible!
• Problem # 2
• Given D_6 : [(−1, 1); (−0.5, 2); (0.5, 2); (1, 1)]
• Can we find [(−x, f); (0, 6 − 2f); (x, f)] for Information Equivalence with 0 < x < 1?
• Yes : Unique solution x = √3/2 and f = 2.

More on Quadratic Regression : EDT

• Problem # 3 : What about D_(2k+2) : [(−1, 1); (−0.5, k); (0.5, k); (1, 1)]?
• Solution [(−x, f); (0, 2k+2−2f); (x, f)] for some x and f?
• 'No' for k = 3 to 7
• For k = 8 : f = 6 and x = 1/√2!
• More affirmative cases (see the search sketch below) :
• (i) D_36 : [(−1, 2); (−0.5, 16); (0.5, 16); (1, 2)] = D_36 : [(−1/√2, 12); (0, 12); (1/√2, 12)]
• (ii) D_68 : [(−1, 2); (−0.5, 32); (0.5, 32); (1, 2)] = D_68 : [(−√(2/5), 25); (0, 18); (√(2/5), 25)]
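A small search sketch for such symmetric replacements; it matches the second and fourth moments (odd moments vanish by symmetry) and is purely illustrative:

```python
import numpy as np

def symmetric_replacement(design, tol=1e-9):
    """For a point-symmetric design [(x_i, n_i)] and the quadratic model,
    try to find an exact equivalent [(-x, f), (0, n - 2f), (x, f)] by
    matching S2 = sum n_i x_i^2 and S4 = sum n_i x_i^4 with integer f."""
    pts = np.array([p for p, _ in design], float)
    cnt = np.array([m for _, m in design], float)
    n = int(cnt.sum())
    S2, S4 = np.sum(cnt * pts**2), np.sum(cnt * pts**4)
    x = np.sqrt(S4 / S2)           # from 2 f x^2 = S2 and 2 f x^4 = S4
    f = S2 / (2 * x**2)            # = S2^2 / (2 S4)
    ok = abs(f - round(f)) < tol and 0 < 2 * round(f) <= n and 0 < x <= 1
    return (float(round(x, 4)), int(round(f)), n - 2 * int(round(f))) if ok else None

# Problem # 3 family: D_{2k+2} = [(-1,1), (-0.5,k), (0.5,k), (1,1)]
for k in range(3, 9):
    print(k, symmetric_replacement([(-1, 1), (-0.5, k), (0.5, k), (1, 1)]))
# k = 3..7 -> None;  k = 8 -> (0.7071, 6, 6), i.e. x = 1/sqrt(2), f = 6
```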

Information Domination…

• De la Garza Phenomenon : Information Equivalence
• There is more to it, in terms of Information Domination
• WLOG χ = [−1, 1]
• Claim 2 : Given D* = [(x*, p*); (x**, p**)] with (x*, x**) NOT both equal to (−1, 1), there exists
• 0 < c < 1 so that Dc = [(−1, c); (+1, 1 − c)] produces an Information Matrix I(Dc) which 'dominates' I(D*) in the sense of 'matrix domination'. That is, I(Dc) − I(D*) is nnd.
• In a way, I(Dc) dominates I(D*) in every sense!
• This is the best result one can think of in terms of 'improving' over I(D*)!!

Information Domination….

• Proof of Claim 2 :
• Set 1 − 2c = μ'1 and solve for c = [1 − μ'1]/2.
• Note that (x*, x**) ≠ (−1, 1), so that −1 < μ'1 < 1 and hence 0 < c < 1.
• Next note that μ'2 < 1.
• Therefore, I(Dc) − I(D*) = [(0, 0); (0, 1 − μ'2)], which is nnd. (See the sketch below.)
• Message : Push the points to the boundaries!
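A tiny check of Claim 2 on an arbitrary interior 2-point design; the numbers are illustrative:

```python
import numpy as np

def info2(points, weights):
    """2x2 per-observation information matrix for E[Y_x] = alpha + beta*x."""
    X = np.column_stack([np.ones(len(points)), points])
    return X.T @ np.diag(weights) @ X

pts, wts = np.array([-0.6, 0.8]), np.array([0.3, 0.7])   # an arbitrary D*
m1 = wts @ pts
c = (1 - m1) / 2                         # weight on -1 in the dominating design
diff = info2(np.array([-1.0, 1.0]), np.array([c, 1 - c])) - info2(pts, wts)

print(np.round(diff, 4))                             # [[0, 0], [0, 1 - mu'2]]
print(np.all(np.linalg.eigvalsh(diff) >= -1e-12))    # True: the difference is nnd
```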

Quadratic Regression : Information Dominance

• Context : Quadratic Regression Model with Homoscedastic Errors [Mean model Y_x = α + βx + γx²]
• Claim : Set χ = [−1, 1] WLOG.
• Given any continuous regression design D*_[(x*, p*); (x**, p**); (x***, p***)] with −1 < x* < x** < x*** < 1, there exist proportions p, q and r and a constant c, −1 < c < 1, such that the design D_[(−1, p); (c, r); (+1, q)] provides Information Dominance over the design D*.

Sketch of the Proof…

• I = [(1, μ'1, μ'2); (μ'1, μ'2, μ'3); (μ'2, μ'3, μ'4)]
• I* = likewise, with moments μ*'j
• Equate μ'1, μ'2 and μ'3 to those of I* and solve for p, q, r and c. Then show that μ'4 < μ*'4, i.e., the design supported on {−1, c, +1} has the larger fourth moment, so that the difference of the two information matrices is nnd.
• For details : Pukelsheim's book
• Also : Liski et al. Monograph [2002] : Topics in Optimal Design

Binary Response Models

• Impressive literature on optimality issues
• de la Garza Phenomenon and Information Dominance : recent advances
• Optimal designs for binary data under logistic regression : Mathew & Sinha (2001), Jour. Stat. Plan. & Inf., 93, 295-307

Binary Response Model….

• P[Y_x = 1] = 1/[1 + exp{−(α + βx)}]
• {(xi, ni)}, i = 1, 2, …, k : given data
• Binomial model, log-likelihood, differentiation, etc. : Information Matrix
• Approximate theory : {(xi, pi)} etc., with ∑ pi = 1
• Set ai = α + βxi for each i
• I(α, β) = [(∑ pi exp(−ai)/[1 + exp(−ai)]², ∑ pi xi exp(−ai)/[1 + exp(−ai)]²); (do., ∑ pi xi² exp(−ai)/[1 + exp(−ai)]²)]
• (A computational sketch follows below.)
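A minimal sketch of this information matrix for a given (α, β) and design; the numeric values are illustrative:

```python
import numpy as np

def logistic_info(alpha, beta, x, p):
    """Per-observation Fisher information for the logistic model
    P[Y_x = 1] = 1/(1 + exp(-(alpha + beta*x))) at design {(x_i, p_i)}."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    a = alpha + beta * x
    w = p * np.exp(-a) / (1 + np.exp(-a)) ** 2    # p_i * pi_i * (1 - pi_i)
    X = np.column_stack([np.ones_like(x), x])
    return X.T @ (w[:, None] * X)

print(logistic_info(alpha=0.0, beta=1.0, x=[-1.0, 0.0, 1.0], p=[1/3, 1/3, 1/3]))
```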

Domination in Logistic Regression Model

• Given {(xi, pi)} etc., subject to ∑ pi = 1, and a set of distinct real numbers ai, there exists a real number c satisfying
• (i) ∑ pi xi exp(−ai)/[1 + exp(−ai)]² = c exp(−c)/[1 + exp(−c)]²
• (ii) ∑ pi xi² exp(−ai)/[1 + exp(−ai)]² ≤ c² exp(−c)/[1 + exp(−c)]²
• Remark : ± c does better than the ai's, k > 2.

Non-Linear Models?

• For most non-linear models, the de la Garza Phenomenon holds, and it goes beyond, in the sense of Matrix Domination, known as 'Loewner Domination' : Min Yang [Annals of Statistics, 2010]
• Non-linear models with 3 parameters :
• θ0 + θ1 x/[x + θ2] : E_max
• θ0 + θ1 exp(x/θ2) : Exponential
• θ0 + θ1 log(x + θ2) : Log-linear
• There are designs supported by exactly 3 points (including the two extreme points) which are as good as those supported by more than 3 points, in the sense of Matrix Domination!

Non-Linear Models… More References

• UIC School : strong research group
• Fang & Hedayat (2008) : Annals
• Li & Majumdar (2009) : JSPI
• Stufken & Yang (2009) : Annals
• Others in the UIC group : 2010/2011
• German School…

Here we stop…

• B. K. Sinha
• RU
• April 18, 2012