Statistics Seminar

Adjusting for Bias Induced by Informative Dose Selection Procedures

Many fields, such as acute toxicity studies, Phase I cancer trials, sensory studies, and psychometric testing, use informative dose allocation procedures. In this talk, we explain how such adaptive designs induce bias and, in the context of dose-finding designs, show how to modify frequency data to adjust for this bias.

To provide context, we start the talk with a general discussion of issues in inference following adaptive designs. Then, we assume that the probability of a positive binary response Y is monotone in a stimulus or treatment X, and we consider designs that sequentially select X values for new subjects in a way that concentrates treatments in a certain region of interest under the dose-response curve. We discuss how data analysis at the end of a study is affected by choosing the stimulus value for each subject sequentially according to some informative sampling rule.

Without loss of generality, we call a positive response a toxicity and the stimulus a dose. For simplicity, we restrict this talk to the case of a univariate treatment X and binary Y, and further assume that treatments are limited to a finite set {d1, d2, ..., dM} of M values we call doses. Now suppose n subjects receive treatments that were sequentially selected (according to some rule using data from prior subjects) from the restricted set of M doses. Let Nm and Tm denote the number of subjects receiving treatment dm and the number of toxicities observed on treatment dm, respectively. Define Fm = P{Y = 1|X = dm} = E[Y |X = dm].

It is often said that the distribution of Tm given Nm is Binomial with parameters (Fm, Nm). But taking Nm as fixed is not the same as conditioning on this random variable, and conditioning on informative dose assignments is not the same as conditioning on summary dose frequencies. Indeed, it is easy to show that the observed dose-specific toxicity rate, Tm/Nm, is biased for Fm. From first principles, we obtain

 

E[Tm / Nm] = Fm - Cov[Tm/Nm, Nm] / E[Nm]

 

The observed toxicity rate is biased for Fm because adaptive allocations, by design, induce a correlation between toxicity rates and allocation frequencies.
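
The identity above can be checked numerically. The following is a minimal sketch, not the speaker's code: it assumes a simple up-and-down rule (step down after a toxicity, step up otherwise), a hypothetical dose-toxicity curve F, a start at the lowest dose, and the convention Tm/Nm = 0 when Nm = 0. The function names `simulate_up_and_down` and `bias_summary` are illustrative only.

```python
import random


def simulate_up_and_down(F, n_subjects, rng):
    """Run one trial of a simple up-and-down rule on doses 0..M-1.

    After a toxicity the next subject gets the next lower dose; after a
    non-toxicity the next subject gets the next higher dose (clamped at
    the ends). Returns per-dose allocation counts N and toxicity counts T.
    """
    M = len(F)
    N = [0] * M
    T = [0] * M
    dose = 0  # assumption of this sketch: start at the lowest dose
    for _ in range(n_subjects):
        y = 1 if rng.random() < F[dose] else 0
        N[dose] += 1
        T[dose] += y
        dose = max(dose - 1, 0) if y else min(dose + 1, M - 1)
    return N, T


def bias_summary(F, m, n_subjects=20, n_trials=20000, seed=1):
    """Monte Carlo estimates of the terms in E[Tm/Nm] = Fm - Cov[Tm/Nm, Nm]/E[Nm]."""
    rng = random.Random(seed)
    rates, counts, tox = [], [], []
    for _ in range(n_trials):
        N, T = simulate_up_and_down(F, n_subjects, rng)
        # Convention (assumption): the rate is 0 when dose m was never used.
        rates.append(T[m] / N[m] if N[m] > 0 else 0.0)
        counts.append(N[m])
        tox.append(T[m])

    def mean(xs):
        return sum(xs) / len(xs)

    mean_rate, mean_n, mean_t = mean(rates), mean(counts), mean(tox)
    cov = mean([r * c for r, c in zip(rates, counts)]) - mean_rate * mean_n
    return {
        "mean_rate": mean_rate,           # estimates E[Tm/Nm]
        "pooled_rate": mean_t / mean_n,   # estimates E[Tm]/E[Nm], which equals Fm
        "cov_term": cov / mean_n,         # estimates Cov[Tm/Nm, Nm]/E[Nm]
    }


if __name__ == "__main__":
    F = [0.1, 0.2, 0.3, 0.45, 0.6]  # hypothetical toxicity probabilities
    print(bias_summary(F, m=2))
```

In this sketch `mean_rate` equals `pooled_rate - cov_term` up to rounding, because the sample moments satisfy the same algebra as the expectations, while `pooled_rate` approaches Fm only as the number of simulated trials grows.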

This bias affects inference procedures: isotonic regression methods use dose-specific toxicity rates directly, while standard likelihood-based methods mask the bias by providing first-order linear approximations. We illustrate these biases using isotonic and likelihood-based regression methods in some well-known (small sample size) adaptive methods, including selected up-and-down designs, interval designs, and the continual reassessment method. Then we propose a bias adjustment inspired by Firth (1993).

 

[Nancy Flournoy; University of Missouri – http://web.missouri.edu/flournoyn/]

[flournoyn@missouri.edu  –   https://en.wikipedia.org/wiki/Nancy_Flournoy]

Methods for Preferential Sampling in Geostatistics

 

Preferential sampling in geostatistics refers to the instance in which the process that determines the sampling locations may depend on the spatial process that is being modelled. If ignored, this dependency can result in biased parameter estimates and may affect the resulting spatial prediction. Recent research on correcting for preferential sampling bias has been limited to stationary sampling locations, such as air-quality monitoring sites. We propose a flexible framework for inference on preferentially sampled fields, which can be used to expand preferential sampling methodology to the case in which the preferentially sampled locations are obtained from a process moving in space and time. An example of such data, the preferential sampling of ocean temperature by tagged marine mammals, is presented.
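
The bias mechanism described above can be illustrated with a toy simulation. This is only a sketch under strong assumptions and does not implement the proposed framework: the spatial process is replaced by a hypothetical one-dimensional AR(1) Gaussian series on a grid, locations are drawn with replacement with probability proportional to exp(beta * S(x)), and the quantity examined is the naive sample mean. All names and parameter values are illustrative.

```python
import math
import random


def ar1_field(n, rho, rng):
    """Stationary Gaussian AR(1) series with unit variance (a stand-in for a spatial field)."""
    s = [rng.gauss(0.0, 1.0)]
    sd = math.sqrt(1.0 - rho * rho)
    for _ in range(n - 1):
        s.append(rho * s[-1] + sd * rng.gauss(0.0, 1.0))
    return s


def mean_bias(n_reps=500, n_grid=200, k=30, rho=0.9, beta=1.0, seed=0):
    """Average error of the naive sample mean under preferential and uniform sampling."""
    rng = random.Random(seed)
    pref_total = 0.0
    unif_total = 0.0
    for _ in range(n_reps):
        s = ar1_field(n_grid, rho, rng)
        field_mean = sum(s) / n_grid
        # Preferential design: high-valued locations are more likely to be sampled.
        weights = [math.exp(beta * v) for v in s]
        pref = rng.choices(s, weights=weights, k=k)
        # Non-preferential design: locations chosen uniformly at random.
        unif = rng.choices(s, k=k)
        pref_total += sum(pref) / k - field_mean
        unif_total += sum(unif) / k - field_mean
    return pref_total / n_reps, unif_total / n_reps


if __name__ == "__main__":
    pref_bias, unif_bias = mean_bias()
    print("preferential:", pref_bias, "uniform:", unif_bias)
```

With beta > 0 the preferentially sampled mean overshoots the field average, while the uniformly sampled mean does not show a systematic error; setting beta = 0 removes the dependence between the field and the sampling locations.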

 

 

Modelling under-reported data through INAR-hidden Markov chains - June 22, 2017 at 4pm in ESB 4192

Interest in the analysis of count time series has grown in recent years, and many models have been considered in the literature (Al-Osh and Alzaid, 1987, J Time Series Analysis). The main reason for this popularity is the limited performance of classical time series methods when applied to discrete-valued series. With the introduction of discrete time series techniques, several challenges have emerged, such as unobserved heterogeneity, periodicity, and under-reporting. Much effort has been devoted to introducing seasonality into these models (Morina et al., 2011, Statistics in Medicine) and to coping with unobserved heterogeneity. However, the study of under-reported data is still at an early stage in many fields. Under-reporting is very common in contexts such as epidemiological and biomedical research; it can lead to biased inferences and may invalidate the main assumptions of the classical models. In public health in particular, it is well known that several diseases have traditionally been under-reported (occupation-related diseases, food-exposure diseases, ...).

The model presented in this work considers two discrete time series: the observed series of counts Y_n, which may be under-reported, and the underlying series X_n, which has an INAR(1) structure X_n = alpha*X_{n-1} + W_n, where 0 < alpha < 1 is a fixed parameter and the innovations W_n are Poisson(lambda) distributed. Here * denotes the binomial thinning operator (or binomial subsampling), defined as alpha*X_{n-1} = \sum_{i=1}^{X_{n-1}} Z_i(alpha), where the Z_i are i.i.d. Bernoulli random variables with success probability alpha. Under-reporting enters through the observed process: Y_n equals X_n with probability 1-omega and equals q*X_n with probability omega. Under this definition, the observed Y_n coincides with the underlying X_n, so that the count at time n is not under-reported, with probability 1-omega.

Several applications in public health will be discussed, using real data on incidence and mortality attributable to diseases that are related to occupational and environmental exposures and known toxic agents and that have traditionally been under-reported. Full details of the work can be found in Fernandez-Fontelo et al. (2016, Statistics in Medicine, v 35, pp 4875-4890).
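
The data-generating mechanism described in this abstract can be sketched as a short simulation. This is an illustration under stated assumptions rather than the authors' implementation: q*X_n is read as binomial thinning with probability q (the same operator as for alpha), the chain is started from the Poisson(lambda/(1-alpha)) stationary distribution of a Poisson INAR(1) process, and the parameter values and function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def thin(x, prob, rng):
    """Binomial thinning: the sum of x i.i.d. Bernoulli(prob) variables."""
    return sum(1 for _ in range(x) if rng.random() < prob)


def simulate(n, alpha, lam, omega, q, seed=0):
    """Simulate the hidden INAR(1) series X and the possibly under-reported series Y."""
    rng = random.Random(seed)
    # Assumption: start from the stationary marginal, Poisson(lam / (1 - alpha)).
    x = poisson(lam / (1.0 - alpha), rng)
    X, Y = [], []
    for _ in range(n):
        x = thin(x, alpha, rng) + poisson(lam, rng)  # X_n = alpha*X_{n-1} + W_n
        # With probability omega the count is under-reported (thinned by q).
        y = thin(x, q, rng) if rng.random() < omega else x
        X.append(x)
        Y.append(y)
    return X, Y


if __name__ == "__main__":
    X, Y = simulate(n=50000, alpha=0.5, lam=2.0, omega=0.3, q=0.5)
    print("mean of X:", sum(X) / len(X))  # close to lam / (1 - alpha)
    print("mean of Y:", sum(Y) / len(Y))  # close to mean(X) * (1 - omega * (1 - q))
```

Under these assumptions the observed series never exceeds the hidden one, and its long-run mean is the hidden mean scaled by 1 - omega*(1 - q), which is one way to see how ignoring under-reporting distorts inference.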