Estimation Techniques for Dose-response Functions Presented by Bahman Shafii, Ph.D.
Statistical Programs, College of Agricultural and Life Sciences
University of Idaho
Acknowledgments
• Research partially funded by USDA-ARS Hatch Project IDA01412, Idaho Agricultural Experiment Station.
• Collaborators:
• William J. Price, Ph.D., Statistical Programs, University of Idaho.
• Steven Seefeldt, Ph.D., USDA-ARS, University of Alaska Fairbanks.
Introduction
• Dose-response models are common in agricultural research.
• They can encompass many types of problems:
  • Modeling environmental effects due to exposure to chemical or temperature regimes.
  • Estimation of time-dependent responses such as germination, emergence, or hatching (e.g. Shafii and Price 2001; Shafii et al. 2009).
  • Bioassay assessments via calibration curves and quantal estimation (e.g. Shafii and Price 2006).
Estimation
• Curve estimation.
  • Linear or non-linear techniques.
• Estimate other quantities:
  • percentiles.
  • typically: LD50, LC50, EC50, etc.
• Percentile estimation is problematic:
  • inverted solutions.
  • unknown distributions.
  • approximate variances.
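The "inverted solution" for a percentile can be illustrated with a minimal sketch. It assumes a two-parameter logistic curve, p = 1 / (1 + exp(-beta*dose - alpha)); the function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def dose_at_percentile(p, alpha, beta):
    """Invert p = 1 / (1 + exp(-beta*dose - alpha)) to recover the dose."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    logit = math.log(p / (1.0 - p))
    return (logit - alpha) / beta

# Hypothetical parameter estimates (not taken from the presentation).
alpha, beta = -4.0, 1.0
ed50 = dose_at_percentile(0.50, alpha, beta)  # dose giving a 50% response
ed90 = dose_at_percentile(0.90, alpha, beta)  # dose giving a 90% response
```

Because the percentile is a nonlinear function of the estimated parameters, its variance is not available directly, which is why the approximations listed above are needed.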
• The response distribution:
  • Continuous
    • Normal
    • Log Normal
    • Gamma, etc.
  • Discrete - quantal responses
    • Binomial, Multinomial (yes/no)
    • Poisson (count)
• The response form:
  • Typically expressed as a nonlinear curve
    • increasing or decreasing sigmoidal form
    • increasing or decreasing asymptotic form
[Figure: example dose-response curves; horizontal axis: Dose, vertical axis: Response]
Bioassay and Calibration
• Given a dose-response curve and an observed response:
  • What dose generated the response?
  • What is the probability of a dose given an observed response and the calibration curve?
• This problem fits naturally into a Bayesian framework.
[Figure: calibration curve; horizontal axis: Dose, vertical axis: Response; annotations: "Measured Response" and "Unknown Dose"]
• Typical dose-response estimation assumes that the functional form, or tolerance distribution, is known, e.g. a sigmoidal shape.
• In some cases, however, it may be advantageous to relax this assumption and restrict estimation to a family of dose-response forms, for example when:
  • The dose-response population consists of a mixture of subpopulations which cannot be sampled separately.
  • The dose-response series exhibits more complex behavior than a simple sigmoidal shape, e.g. hormesis.
• Objectives
• Outline estimation methods for dose-response models.
  • Modern approaches.
    • Probit - Maximum Likelihood.
    • Generalized non-linear models.
    • Bayesian solutions.
  • Traditional approaches.
    • Probit - Least Squares.
• Objectives
• Demonstrate solutions for calibration of an unknown dose with a binary response assuming:
  • A known dose-response form.
    • Standard MLE estimation.
    • Standard parametric Bayesian estimation.
  • A family of dose-response forms.
    • Nonparametric Bayesian estimation.
Estimation Methods
Traditional Approach
• Probit Analysis - Least Squares
• A linearized least squares estimation (Bliss, 1934; Fisher, 1935; Finney, 1971):
$\text{Probit}_i = \Phi^{-1}(p_{ij}) = \beta_0 + \beta_1 \cdot dose_i + \epsilon_{ij}$ (1)
where $p_{ij} = y_{ij}/N$ and $y_{ij}$ is the number of successes out of $N$ trials in the jth replication of the ith dose. $\beta_0$ and $\beta_1$ are regression parameters and $\epsilon_{ij}$ is a random error; $\epsilon_{ij} \sim N(0, \sigma^2)$.
• Minimize: $SS_{error} = \sum (p_{ij} - \widehat{\text{probit}})^2$
• $\Phi$ is a convenient CDF form or "tolerance distribution", e.g.
  • Normal: $p_{ij} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{dose_i} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx$
  • Logistic: $p_{ij} = 1 / (1 + \exp(-\beta \cdot dose_i - \alpha))$
  • Modified Logistic: $p_{ij} = C + (C - M) / (1 + \exp(-\beta \cdot dose_i - \alpha))$ (e.g. Seefeldt et al. 1995)
  • Gompertz: $p_{ij} = \beta_0 (1 - \exp(\exp(-\beta(dose))))$
  • Exponential: $p_{ij} = \beta_0 \exp(-\beta(dose))$
• SAS: PROC REG.
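As a minimal sketch of the linearized approach in (1), the observed proportions can be moved to the probit scale with the inverse Normal CDF and regressed on dose by ordinary least squares. The presentation uses SAS PROC REG for this step; the Python below is only an illustration, and the function name and data are hypothetical.

```python
from statistics import NormalDist

def probit_least_squares(doses, proportions):
    """Ordinary least squares of probit(p) on dose; returns (b0, b1)."""
    std_normal = NormalDist()
    # Transform each observed proportion (strictly between 0 and 1) to a probit.
    probits = [std_normal.inv_cdf(p) for p in proportions]
    n = len(doses)
    mean_x = sum(doses) / n
    mean_z = sum(probits) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(doses, probits))
    b1 = sxz / sxx
    b0 = mean_z - b1 * mean_x
    return b0, b1

# Hypothetical data: proportions generated exactly from Phi(-2 + 0.5*dose).
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
proportions = [NormalDist().cdf(-2.0 + 0.5 * d) for d in doses]
b0, b1 = probit_least_squares(doses, proportions)
ld50 = -b0 / b1  # dose at which the fitted probit is zero, i.e. p = 0.5
```

Observed proportions of exactly 0 or 1 have no finite probit, which is one practical limitation of the linearized method.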
Modern Approaches
• Probit Analysis - Maximum Likelihood
• The responses, $y_{ij}$, are assumed binomial at each dose i with parameter $\pi_i$. Using the joint likelihood, $L(\pi_i)$:
Maximize: $L(\pi_i) \propto \prod_{ij} (\pi_i)^{y_{ij}} (1 - \pi_i)^{(N - y_{ij})}$ (2)
for data set $y_{ij}$, where $\pi_i = \Phi(\beta_0 + \beta_1 \cdot dose_i)$ and $\beta_0$, $\beta_1$, and $dose_i$ are those given previously.
• The CDF, $\Phi$, is typically defined as a Normal, Logistic, or Gompertz distribution as given above.
• SAS: PROC PROBIT.
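The maximization in (2) can be sketched with Fisher scoring for a Normal tolerance distribution. This is a hedged illustration in Python rather than the PROC PROBIT implementation; the function name, starting values, iteration count, and data are all assumptions made for the example.

```python
from statistics import NormalDist

def probit_mle(doses, successes, trials, iterations=25):
    """Fisher scoring for binomial data with pi_i = Phi(b0 + b1*dose_i)."""
    z = NormalDist()
    b0, b1 = 0.0, 0.0  # assumed starting values
    for _ in range(iterations):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, y, n in zip(doses, successes, trials):
            eta = b0 + b1 * x
            pi = z.cdf(eta)
            dens = z.pdf(eta)
            var = pi * (1.0 - pi)
            r = (y - n * pi) * dens / var  # score contribution
            w = n * dens * dens / var      # expected information weight
            u0 += r
            u1 += r * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

# Hypothetical bioassay: 50 subjects at each of seven doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
trials = [50] * 7
successes = [3, 8, 15, 25, 35, 42, 47]
b0, b1 = probit_mle(doses, successes, trials)
ld50 = -b0 / b1  # the 50th percentile is the ratio of the two estimates
```

The LD50 is recovered as a ratio of estimates, which is the inversion step that requires Fieller-type approximations for interval estimation.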
Probit Analysis
• Limitations:
  • Least squares limited.
    • Linearized solution to a non-linear problem.
  • Even under ML, solution for percentiles approximated.
    • inversion.
    • use of the ratio $\beta_0/\beta_1$ (Fieller, 1944).
  • Appropriate only for proportional data.
  • Assumes the response $\Phi^{-1}(p_{ij}) \sim N(\mu, \sigma^2)$.
  • Interval estimation and comparison of percentile values approximated.
Modern Approaches (cont)
• Nonlinear Regression - Iterative Least Squares
• Directly models the response as:
$y_{ij} = f(dose_i) + \epsilon_{ij}$ (3)
where $y_{ij}$ is an observed continuous response, $f(dose_i)$ may be generalized to any continuous function of dose, and $\epsilon_{ij} \sim N(0, \sigma^2)$.
• Minimize: $SS_{error} = \sum [y_{ij} - f(dose_i)]^2$.
• SAS: PROC NLIN.
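To make the iterative least squares idea in (3) concrete, here is a minimal Gauss-Newton sketch for one possible choice of f, the exponential form b0*exp(-b1*dose). It is an illustration only, not the PROC NLIN algorithm; the function name, the log-linear starting values, and the data are assumptions.

```python
import math

def fit_exponential(doses, y, iterations=20):
    """Gauss-Newton least squares fit of y = b0 * exp(-b1 * dose)."""
    n = len(doses)
    # Starting values from a log-linear regression (requires all y > 0).
    logs = [math.log(v) for v in y]
    mean_x = sum(doses) / n
    mean_l = sum(logs) / n
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(doses, logs))
             / sum((x - mean_x) ** 2 for x in doses))
    b1 = -slope
    b0 = math.exp(mean_l + b1 * mean_x)
    for _ in range(iterations):
        a00 = a01 = a11 = g0 = g1 = 0.0
        for x, obs in zip(doses, y):
            e = math.exp(-b1 * x)
            res = obs - b0 * e   # current residual
            j0 = e               # df/db0
            j1 = -b0 * x * e     # df/db1
            a00 += j0 * j0
            a01 += j0 * j1
            a11 += j1 * j1
            g0 += j0 * res
            g1 += j1 * res
        # Solve the normal equations (J'J) * step = J'residual.
        det = a00 * a11 - a01 * a01
        b0 += (a11 * g0 - a01 * g1) / det
        b1 += (a00 * g1 - a01 * g0) / det
    return b0, b1

# Hypothetical continuous responses declining with dose.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
responses = [7.5, 5.4, 4.2, 3.0, 2.2, 1.7]
b0, b1 = fit_exponential(doses, responses)
```

Each iteration linearizes f around the current estimates, so the quality of the starting values matters; the log-linear start used here is one common choice.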
• Nonlinear Regression - Iterative Least Squares
• Limitations:
  • assumes the data, $y_{ij}$, are continuous; they could be discrete.
  • the response distribution may not be Normal, i.e. the assumption $\epsilon_{ij} \sim N(0, \sigma^2)$ may not hold.
  • standard errors and inference are asymptotic.
  • treatment comparisons are difficult in PROC NLIN; they require:
    • differential sums of squares, or
    • specialized SAS code, e.g. PROC IML.
Modern Approaches (cont)
• Generalized Nonlinear Model - Maximum Likelihood
• Directly models the response as:
$y_{ij} = f(dose_i) + \epsilon_{ij}$
where $y_{ij}$ and $f(dose_i)$ are as defined above.
• Estimation through maximum likelihood where the response distribution may take on many forms:
  Normal: $y_{ij} \sim N(\mu_i, \sigma)$,
  Binomial: $y_{ij} \sim bin(N, \pi_i)$,
  Poisson: $y_{ij} \sim poisson(\lambda_i)$, or
  in general: $y_{ij} \sim f(\theta)$.
• Generalized Nonlinear Model - Maximum Likelihood
• Maximize: $L(\theta) \propto \prod f(y_{ij})$ (4)
• Nonlinear estimation.
• Response distribution not restricted to Normal.
• May also incorporate random components into the model.
• Treatment comparisons easier in SAS.
  • Contrast and estimate statements.
• SAS: PROC NLMIXED.
• Generalized Nonlinear Model - Inference
• Formulate a full dummy variable model encompassing k treatments.
  • The joint likelihood over the k treatments becomes:
$L(\theta_k) \propto \prod_{ijk} f(\theta_k \mid y_{ijk})$ (5)
where $y_{ijk}$ is the jth replication of the ith dose in the kth treatment and $\theta_k$ are the parameters of the kth treatment.
• Comparison of parameter values is then possible through single and multiple degree of freedom contrasts.
• Generalized Nonlinear Model
• Limitations
• percentile solution may still be based on inversion or Fieller’s theorem.
• inferences based on normal theory approximations.
• standard errors and confidence intervals asymptotic.
Modern Approaches (cont)
• Bayesian Estimation - Iterative Numerical Techniques
• Considers the probability of the parameters, $\theta$, given the data $y_{ij}$.
• Using Bayes theorem, estimate:
$p(\theta \mid y_{ij}) = \dfrac{p(y_{ij} \mid \theta)\, p(\theta)}{\int p(y_{ij} \mid \theta)\, p(\theta)\, d\theta}$ (6)
where $p(\theta \mid y_{ij})$ is the posterior distribution of $\theta$ given the data $y_{ij}$, $p(y_{ij} \mid \theta)$ is the likelihood defined above, and $p(\theta)$ is a prior probability distribution for the parameters $\theta$.
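Equation (6) can be illustrated on the simplest possible case: a single binomial proportion evaluated on a grid, where the integral in the denominator is replaced by a Riemann sum. This is a sketch only; the flat prior, the grid size, and the data are assumptions chosen for the example, and realistic dose-response models require the iterative techniques named in the slide title.

```python
def grid_posterior(y, n, grid_size=1001):
    """Posterior density of a binomial proportion on an evenly spaced grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # flat prior on [0, 1], an assumption
    likelihood = [t ** y * (1.0 - t) ** (n - y) for t in grid]
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    step = grid[1] - grid[0]
    # Riemann-sum approximation of the denominator in Bayes theorem.
    evidence = sum(unnormalized) * step
    return grid, [u / evidence for u in unnormalized]

# Hypothetical data: 7 successes out of 20 trials at one dose.
grid, posterior = grid_posterior(y=7, n=20)
posterior_mode = grid[posterior.index(max(posterior))]
```

With a flat prior the posterior mode coincides with the observed proportion y/n; an informative prior would pull it away from that value.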
• Bayesian Estimation - Iterative Numerical Techniques
• Nonlinear estimation.
• Percentiles can be found from the distribution of $\theta$.
• The likelihood is the same as that of the Generalized Nonlinear Model.
  • flexibility in the response distribution.
  • $f(dose_i)$ can be any continuous function of dose.
• Inherently allows updating of the estimation.
• Correct interval estimation (credible intervals).
  • agrees well with GNLM at midrange percentiles.
  • can perform better at extreme percentiles.
• SAS: PROC MCMC.
• Bayesian Estimation - Iterative Numerical Techniques
• Limitations
  • User must specify a prior probability $p(\theta)$.
  • Estimation requires custom programming.
    • SAS: PROC MCMC
    • Specialized software: WinBUGS
  • Computationally intensive solutions.
  • Requires statistical expertise.
• Sample programs and data are available at:
http://www.uidaho.edu/ag/statprog
Calibration Methods
• Tolerance Distribution: Logistic
• The response $y_{ij}/N_i$ at dose i = 1 to k, and replication j = 1 to r, is binomial with the proportion of success given by:
$y_{ij}/N_i = M / (1 + \exp(-\beta(dose_i - \gamma)))$ (7)
where $\beta$ is a rate related parameter and $\gamma$ is the $dose_i$ for which the proportion of success, $y_{ij}/N_i$, is $M/2$. $M$ is the theoretical maximum proportion attainable.
• A convenient generalization of (7) will allow $\gamma$ to represent any dose at which $y_{ij}/N_i = Q$:
$y_{ij}/N_i = M \cdot C / (C + \exp(-\beta(dose_i - \gamma)))$ (8)
where the constant $C = Q/(M - Q)$. Note that, if $Q = M/2$, then $C = 1$ and equation (8) reverts to the standard form given in (7).
Equation (8), therefore, permits an unknown dose at a given response, $Q$, to be estimated through the parameter $\gamma$.
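The role of the constant C in (8) can be checked numerically. The sketch below assumes the Greek symbols lost from the transcript are a rate parameter (called beta here) and a dose parameter (called gamma here); the function names and parameter values are hypothetical.

```python
import math

def response_eq7(dose, M, beta, gamma):
    """Standard logistic form (7): gamma is the dose giving a response of M/2."""
    return M / (1.0 + math.exp(-beta * (dose - gamma)))

def response_eq8(dose, M, beta, gamma, Q):
    """Generalized form (8): gamma is the dose giving a response of Q (0 < Q < M)."""
    C = Q / (M - Q)
    return M * C / (C + math.exp(-beta * (dose - gamma)))

# Hypothetical parameter values for illustration only.
M, beta, gamma = 0.9, 1.2, 4.0
at_gamma = response_eq8(gamma, M, beta, gamma, Q=0.3)     # equals Q at dose = gamma
half_max = response_eq8(5.0, M, beta, gamma, Q=M / 2.0)   # C = 1, so this matches (7)
standard = response_eq7(5.0, M, beta, gamma)
```

At dose = gamma the exponential term equals 1, so (8) reduces to M*C/(C + 1) = Q, which is why gamma can be read directly as the dose associated with the chosen response Q.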
• Maximum Likelihood
• Given the binomial responses, $y_{ij}/N_i$, a joint likelihood may be defined as:
$L(\pi_i \mid y_{ij}/N_i) \propto \prod_{ij} (\pi_i)^{y_{ij}} (1 - \pi_i)^{(N_i - y_{ij})}$ (9)
where the binomial parameter, $\pi_i$, is defined by (8) and the associated parameters, $\theta = [M, \beta, \gamma]$, are estimated through maximization of (9). $N_i$ and $y_{ij}$ are the total number of trials and number of successes, respectively.
• Inferences on $\gamma$ are carried out assuming $\gamma \sim N(\mu_\gamma, \sigma_\gamma)$.
• SAS: PROC NLMIXED
• Bayesian: Parametric
• A Bayesian posterior distribution for $\theta$ is given by:
$pr(\theta \mid y_{ij}/N_i) \propto pr(y_{ij}/N_i \mid \theta) \cdot pr(\theta)$ (10)
where $pr(y_{ij}/N_i \mid \theta)$ is the likelihood shown in (9) and $pr(\theta)$ is a prior distribution for the parameters $\theta = [M, \beta, \gamma]$.
• Estimation of $\theta$ is carried out through numerically intensive techniques such as MCMC (e.g. Price and Shafii 2005).
• Inference on $\gamma$ is obtained through integration of (10) over the parameter space of $M$ and $\beta$.
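The integration over M and the rate parameter can be sketched with a brute-force grid in place of MCMC. This is a hedged illustration under several assumptions: flat priors on bounded ranges, the Q = M/2 form of the curve, symbols named beta and gamma for the rate and dose parameters, and hypothetical data. It is not the procedure used in the presentation.

```python
import math

def log_likelihood(M, beta, gamma, doses, successes, trials):
    """Binomial log-likelihood with pi_i = M / (1 + exp(-beta*(dose_i - gamma)))."""
    total = 0.0
    for d, y, n in zip(doses, successes, trials):
        pi = M / (1.0 + math.exp(-beta * (d - gamma)))
        total += y * math.log(pi) + (n - y) * math.log(1.0 - pi)
    return total

def marginal_posterior_gamma(doses, successes, trials, M_grid, beta_grid, gamma_grid):
    """Marginal posterior of gamma under flat priors, summed over the M and beta grids."""
    table = [[[log_likelihood(M, b, g, doses, successes, trials)
               for b in beta_grid] for M in M_grid] for g in gamma_grid]
    peak = max(v for plane in table for row in plane for v in row)
    # Subtract the peak before exponentiating to avoid numerical underflow.
    weights = [sum(math.exp(v - peak) for row in plane for v in row)
               for plane in table]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: successes out of 50 trials at each of seven doses.
doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
trials = [50] * 7
successes = [1, 4, 10, 22, 35, 41, 44]

# Assumed prior ranges, expressed as evenly spaced grids.
M_grid = [0.60 + 0.01 * i for i in range(40)]      # 0.60 to 0.99
beta_grid = [0.2 + 0.1 * i for i in range(29)]     # 0.2 to 3.0
gamma_grid = [2.0 + 0.05 * i for i in range(81)]   # 2.0 to 6.0

posterior = marginal_posterior_gamma(doses, successes, trials,
                                     M_grid, beta_grid, gamma_grid)
gamma_mode = gamma_grid[posterior.index(max(posterior))]
```

A grid is practical only for a handful of parameters; MCMC replaces the exhaustive sum with samples from the posterior, and the marginal distribution of the dose parameter is then read from the sampled values.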
• Bayesian: Nonparametric
• This methodology was first proposed by Mukhopadhyay (2000) and followed by Kottas et al. (2002).
• The technique considers the dose-response series as a multinomial process with parameters $P = [p_1, p_2, p_3, \ldots, p_k]$.
• Assuming the responses, $y_{ij}/N_i$, are binomial, a likelihood can then be defined as:
$L(P \mid y_{ij}/N_i) \propto \prod_{ij} (p_i)^{y_{ij}} (1 - p_i)^{(N_i - y_{ij})}$ (11)
• If the random segments between true response rates, $p_i$, are distributed as a Dirichlet Process (DP), a joint prior distribution on the $p_i$ may then be defined by:
$pr(P) \propto \prod_i (p_i - p_{i-1})^{(\alpha_i - 1)}$ (12)
where $\alpha_i = \alpha \{ F_0(dose_i) - F_0(dose_{i-1}) \}$, $\alpha$ is a precision parameter, and $F_0$ is a base tolerance distribution.
• The precision parameter, $\alpha$, reflects how closely the final estimation follows the base distribution. Low values indicate less correspondence, while larger values indicate a tighter association.
• The base distribution, $F_0(\cdot)$, defines a family of tolerance distributions.
• A posterior distribution for P can then be defined by combining (11) and (12) as:
$pr(P \mid y_{ij}/N_i) \propto \prod_{ij} (p_i)^{y_{ij}} (1 - p_i)^{(N_i - y_{ij})} \prod_i (p_i - p_{i-1})^{(\alpha_i - 1)}$ (13)
• Estimation of this posterior is again carried out numerically using techniques such as MCMC.
• Inference on an unknown dose, $\gamma$, at a known response $p_0 = y_0/N_0$, is obtained through sampling of the posterior given in (13).
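The unnormalized posterior in (13) is straightforward to evaluate on the log scale, which is the quantity an MCMC sampler would call repeatedly. The sketch below makes several assumptions not stated in the slides: a logistic base distribution with hypothetical location and scale, boundary conventions of a zero response rate below the first dose and a rate of one above the last dose (with the base CDF set to 0 and 1 there), and invented data and function names.

```python
import math

def logistic_cdf(x, loc, scale):
    """Base tolerance distribution F0, assumed logistic for this sketch."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))

def log_posterior(p, doses, successes, trials, precision, loc, scale):
    """Log of the unnormalized posterior: binomial likelihood times Dirichlet prior."""
    # Assumed boundary conventions: response rate 0 before the first dose
    # and 1 beyond the last dose, matched by F0 values of 0 and 1.
    p_ext = [0.0] + list(p) + [1.0]
    f0_ext = [0.0] + [logistic_cdf(d, loc, scale) for d in doses] + [1.0]
    total = 0.0
    for i in range(1, len(p_ext)):
        gap = p_ext[i] - p_ext[i - 1]
        if gap <= 0.0:
            return float("-inf")  # response rates must be strictly increasing
        exponent = precision * (f0_ext[i] - f0_ext[i - 1]) - 1.0
        total += exponent * math.log(gap)
    for p_i, y, n in zip(p, successes, trials):
        total += y * math.log(p_i) + (n - y) * math.log(1.0 - p_i)
    return total

# Hypothetical data and a candidate vector of response rates.
doses = [1.0, 2.0, 3.0, 4.0, 5.0]
trials = [40] * 5
successes = [2, 9, 21, 30, 37]
candidate = [0.05, 0.22, 0.52, 0.76, 0.92]
lp_candidate = log_posterior(candidate, doses, successes, trials,
                             precision=10.0, loc=3.0, scale=1.0)
lp_invalid = log_posterior([0.30, 0.20, 0.52, 0.76, 0.92], doses, successes,
                           trials, precision=10.0, loc=3.0, scale=1.0)
```

The ordering constraint on the response rates is what keeps the fitted curve monotone without fixing its shape, and the precision value controls how strongly the segments are pulled toward the increments of the base distribution.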
Concluding Remarks
• Dose-response models have wide application in agriculture.
• Probit models of estimation are limited in scope.
• Generalized nonlinear and Bayesian models provide the most flexible framework for dose-response estimation.
  • Can use various response distributions.
  • Can use various dose-response models.
  • Can incorporate random model effects.
  • Can be used to compare treatments.
    • GNLM: full dummy variable modeling.
    • Bayesian methods: probability statements.
  • They are useful for quantifying the relative efficacy of treatments.
• Bayesian estimation is preferred when estimating extreme percentiles.
• Generalized nonlinear models sufficient in most situations.
• Methodology proposed here uses a base tolerance distribution.
  • Should be used and interpreted with caution.
  • Standard model assessment techniques still apply.
  • Introduces more uncertainty into the estimation situation.
Concluding Remarks (cont)
• Bioassay is an important part of dose-response analysis.
• Determining an unknown dose can be problematic for some parametric functional forms.
• Dose estimation fits naturally in a Bayesian framework.
• Some dose-response data may not follow typical sigmoidal patterns.
References
Bliss, C. I. 1934. The method of probits. Science 79(2037): 38-39.
Bliss, C. I. 1938. The determination of dosage-mortality curves from small numbers. Quart. J. Pharm. 11: 192-216.
Berkson, J. 1944. Application of the Logistic function to bio-assay. J. Amer. Stat. Assoc. 39: 357-65.
Fieller, E. C. 1944. A fundamental formula in the statistics of biological assay and some applications. Quart. J. Pharm. 17: 117-23.
Finney, D. J. 1971. Probit Analysis. Cambridge University Press, London.
Fisher, R. A. 1935. Appendix to Bliss, C. I.: The case of zero survivors. Ann. Appl. Biol. 22: 164-5.
SAS Inst. Inc. 2004. SAS OnlineDoc, Version 9, Cary, NC.
Seefeldt, S. S., J. E. Jensen, and P. Fuerst. 1995. Log-logistic analysis of herbicide dose-response relationships. Weed Technol. 9: 218-227.
Kottas, A., M. D. Branco, and A. E. Gelfand. 2002. A Nonparametric Bayesian Modeling Approach for Cytogenetic Dosimetry. Biometrics 58: 593-600.
Mukhopadhyay, S. 2000. Bayesian Nonparametric Inference on the Dose Level with Specified Response Rate. Biometrics 56: 220-226.
Price, W. J. and B. Shafii. 2005. Bayesian Analysis of Dose-response Calibration Curves. Proceedings of the Seventeenth Annual Kansas State University Conference on Applied Statistics in Agriculture [CDROM], April 25-27, 2005. Manhattan, Kansas.
Shafii, B. and W. J. Price. 2001. Estimation of cardinal temperatures in germination data analysis. Journal of Agricultural, Biological and Environmental Statistics 6(3): 356-366.
Shafii, B. and W. J. Price. 2006. Bayesian approaches to dose-response calibration models. Abstract: Proceedings of the XXIII International Biometrics Conference [CDROM], July 16-21, 2006. Montreal, Quebec, Canada.
Shafii, B., W. J. Price, D. L. Barney, and O. A. Lopez. 2009. Effects of stratification and cold storage on the seed germination characteristics of cascade huckleberry and oval-leaved bilberry. Acta Hort. 810: 599-608.
Questions / Comments