Department Seminars 2003
| DATE/PLACE: | Thursday, December 11, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | BRG Seminar |
| TITLE: | Causal Inference: Using Graphical Models in the Absence and Presence of Background Knowledge |
| SPEAKER: | Dr. Rebecca Ayesha Ali, Department of Statistics and Applied Probability, National University of Singapore. |
|
Graphical Markov models are used in many diverse areas which in recent times include Epidemiology and Statistical Genetics. This talk is divided into two sections. The first section will discuss conducting model searches in the absence of background knowledge. I will introduce ancestral graphs: graphs that encode the conditional independence relations holding only among the observed variables of some data-generating process. I will also provide a simple representation of Markov equivalence classes for ancestral graphs. For a given set of data, there are often many competing models (graphs) that could explain the data and automated searches may be unable to distinguish between these models. By characterizing equivalence classes one can begin to search across equivalence classes rather than graphs alone when doing model selection. The second section of my talk will discuss doing a model search in the presence of background knowledge. In particular, I will focus on an example from STD research in which the underlying graph is postulated based on available background knowledge. Causal inference methods (marginal structural Cox models) are then used to account for time-dependent confounding and intermediate variables in this longitudinal cohort data. Keywords: directed acyclic graphs, ancestral graphs, Markov equivalence, marginal structural models. |
|
| DATE/PLACE: | Monday, December 08, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Statistics Seminar |
| TITLE: | Visualizing categorical data with hammock plots |
| SPEAKER: | Dr. Matthias Schonlau, Rand Statistical Consulting Service, RAND, Santa Monica, CA. |
| A Hammock plot is a new graph to visualize categorical data. The graph is a hybrid between Mosaic plots and parallel coordinate plots. It borrows the idea of representing a category by an area from Mosaic plots, and it takes the arrangement of the variables on the plot from parallel coordinate plots. It is also related to the Clustergram (www.schonlau.net/clustergram.html), a graph for visualizing hierarchical and non-hierarchical cluster analyses. Clustergrams and dendrogram graphs have similar objectives; however, dendrograms can only be constructed for hierarchical cluster analyses. I explore the strengths and weaknesses of the proposed graph with several data sets.
More information about Dr Schonlau may be found at www.schonlau.net |
|
| DATE/PLACE: | Thursday, November 27, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | BRG Seminar |
| TITLE: | Generalised Linear Models for Sparsely Correlated Data |
| SPEAKER: | Dr. Thomas Lumley, Dept of Biostatistics, University of Washington. |
| I define 'sparsely correlated' data as data where two randomly chosen small sets of data are likely to be independent (the precise version of this takes too much notation for an abstract). This includes clustered and longitudinal data but also more complicated situations such as incomplete crossed designs. Marginal generalised linear models for sparsely correlated data are mathematically interesting as even the rate of convergence is not immediately obvious. They are also of practical interest as they turn out to be easy to fit, in many cases with standard software, in contrast to generalised linear mixed models. I will present some theoretical and practical applications. | |
| DATE/PLACE: | Tuesday, November 25, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Statistics Seminar |
| TITLE: | What's the deal with concurvity and the generalized additive model? |
| SPEAKER: | Dr. Tim Ramsay, Institute of Population Health, University of Ottawa. |
| In the relatively short time since its popularization in Hastie and Tibshirani's landmark book entitled "Generalized Additive Models", the GAM (generalized additive model) has been widely adopted as a valuable addition to the standard statistical toolbox. The GAM combines the flexible nature of nonparametric smoothers with the highly developed framework of the generalized linear model to provide an extremely useful technique for exploratory data analysis (EDA). Unfortunately, the complexity of some of the assumptions underlying asymptotic results for the GAM has led naive users to mistake this powerful EDA technique for a model from which to make inference. This talk will focus on concurvity, the nonparametric analogue of multicollinearity. Through a comparison between the additive model and linear regression, concurvity will be shown to be a straightforward generalization of multicollinearity. The speaker will argue that concurvity should be thought of as the rule, rather than as the exception, and will demonstrate the effect of concurvity on bias and standard error. The take-home message will be two-fold. First, one should be very careful about using GAMs for inference. Second, the GAM remains unparalleled as a tool for exploring complex relationships between continuous variables. The talk is based on examples, rather than on formal analytic results, and assumes only statistical intuition and a passing familiarity with nonparametric regression. It should easily be accessible to both graduate students and reasonably advanced upper-year undergraduates. | |
| DATE/PLACE: | Tuesday, November 18, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Statistics Seminar |
| TITLE: | Limited and full information estimation and goodness-of-fit testing in 2^n contingency tables |
| SPEAKER: | Dr. Harry Joe, Department of Statistics, UBC. |
| High-dimensional contingency tables tend to be sparse and standard goodness-of-fit statistics such as X^2 cannot be used without pooling of categories. As an improvement on arbitrary pooling, for goodness-of-fit of large 2^n contingency tables, we propose a class of quadratic form statistics based on the residuals of margins or multivariate moments up to order r. Further, the marginal residuals are useful for diagnosing lack of fit of parametric models. These classes of test statistics are asymptotically chi-square and have better small sample properties than X^2 and G^2. We also show that these classes of test statistics have better power than X^2 for some useful multivariate binary models. Related to this class of test statistics is a class of limited information estimators based on low-dimensional margins. We show that these estimators have high efficiency for one commonly used item response model. This is joint work with Albert Maydeu-Olivares, Department of Psychology, University of Barcelona. | |
| DATE/PLACE: | Friday, November 14, 2003, 4:00pm Chan Auditorium at CMMT/BCRI, 950 W 28th Ave |
| TYPE: | Statistics Seminar / BRG Seminar |
| TITLE: | Software Innovation for Computational Biology and Bioinformatics |
| SPEAKER: | Dr. Robert Gentleman, Department of Biostatistics, Harvard School of Public Health. |
| Software is an essential component of solutions to problems in computational biology and bioinformatics (CBB). In this talk I consider some of the roles that software and software development can take in different CBB projects. An emphasis is placed on the rapid development and deployment of reusable components. Some of the lessons learned in the R project (www.r-project.org) and in the Bioconductor Project (www.bioconductor.org) will be used as concrete examples. The two examples are software for the analysis of microarray data and software for the analysis of protein-protein interaction data. The notion of reproducible research, its reliance on software, and some of the many benefits that this approach provides will be presented. | |
| DATE/PLACE: | Tuesday, October 28, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Statistics Seminar |
| TITLE: | On Subset Selection Under Order Restrictions |
| SPEAKER: | Dr. Constance van Eeden, Honorary Professor, Department of Statistics, UBC. |
| Abstract Available | |
| DATE/PLACE: | Thursday, September 4, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Joint Workshop/Biostats Research Group |
| TITLE: | Nonparametric Mixed-Effect Models |
| SPEAKER: | Chong Gu, Dept of Statistics, Purdue University |
|
Mixed-effect models are widely used for the analysis of correlated data such as longitudinal data and repeated measures. In this talk, I will present some recent results on the nonparametric estimation of the fixed effects in such models. Using Henderson's likelihood, the "variance components" can be turned into "mean components," and computation and cross-validation strategies developed for independent data can be used to handle correlated data. The optimality of cross-validation in the setting can be established through asymptotic analysis and simulation studies. Real-data examples are also presented to illustrate potential applications of the methodology. The talk is based on joint work with Ping Ma. |
|
| DATE/PLACE: | Tuesday, August 19, 2003, 4:00pm Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Longitudinal Analyses for Ordinal Expanded Disability Status Scale Scores from a Multiple Sclerosis Clinical Trial |
| SPEAKER: | Lei Han, Department of Statistics, UBC |
| Longitudinal data sets are repeated observations for a group of subjects over time, which are usually comprised of one or more response variables and a corresponding vector of covariates. We focus on the case of a categorical response variable. A simple first-order generalized estimating equations (GEE1) approach for longitudinal categorical data is first discussed. A second-order generalized estimating equations approach (GEE2) and a revised GEE1 approach which avoids the computational burden associated with the GEE2 approach while retaining some of its desirable properties are also described. These GEE approaches are applied to the ordinal EDSS data from the Betaseron clinical trial in relapsing-remitting multiple sclerosis (MS). Our principal objectives are to examine the practicality of these GEE approaches for ordinal outcome measures as collected in typical MS clinical trials, and to identify the presence of any treatment or covariate effects on the EDSS response in the Betaseron clinical trial in relapsing-remitting MS. | |
| DATE/PLACE: | Friday, August 8, 2003, 11:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | R grid Graphics |
| SPEAKER: | Dr. Paul Murrell, Department of Statistics, The University of Auckland |
|
This talk will discuss the differences between the way that a user sees statistical graphics and the way a software developer sees statistical graphics. We will consider reasons for moving users towards the developer's view and describe how the grid graphics add-on package for R allows users to make that transition. |
|
| DATE/PLACE: | Tuesday, July 15, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Longitudinal Analyses for Magnetic Resonance Imaging Outcomes in the PRISMS Multiple Sclerosis Clinical Trial |
| SPEAKER: | Lindsey Turner, Department of Statistics, UBC |
| Longitudinal data sets consist of repeated observations of a response variable on a group of patients over a period of time. Often a corresponding set of covariates is available for each patient. Usually analysis of such data is based on summaries over time. Although this allows the use of simple techniques to assess treatment effects, the use of these univariate summaries does not allow for the examination of patterns over time in the repeated responses. The purpose of this project is to utilize the generalized estimating equations (GEE) approach to the analysis of longitudinal responses in the UBC frequent magnetic resonance imaging (MRI) substudy of the 2-year PRISMS interferon beta-1a trial in relapsing-remitting multiple sclerosis. This method is not only able to identify treatment effects but is also able to describe the nature of these effects over time. | |
| DATE/PLACE: | Tuesday, April 8, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Overview of data, projects and possibilities at the Research Data Centre |
| SPEAKER: | Nicole Fortin, Academic Director, Research Data Centre; James Croal, Analyst, Research Data Centre |
The British Columbia Inter-university Research Data Centre provides access |
|
| DATE/PLACE: | Tuesday, April 1, 2003, 15:45 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Bayes-assisted goodness-of-fit tests |
| SPEAKER: | Richard Lockhart, SFU |
|
I discuss the impact of three principles on the problem of choosing a good goodness-of-fit test. First, when testing statistical hypotheses, alternatives of interest are neither undetectably nor grossly different from the null hypothesis. Second, good tests are designed to be sensitive to alternatives likely to arise in practice. Third, the purpose of limit theorems is to provide good approximate probability calculations of interest to statisticians. I will use Bayesian priors on the alternative hypothesis to construct tests which maximize the expected power for a prior which depends on the sample size. Priors will be presented for which the optimal procedures are (approximately) such goodness-of-fit tests as the Cramér-von Mises or the Anderson-Darling test. |
|
| DATE/PLACE: | Tuesday, March 18, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Fully Modified Estimation of Fractional Cointegration Models |
| SPEAKER: | Chang S. Kim, Department of Economics, UBC |
| Efficient estimation techniques for certain models of fractional cointegration are developed. Such models can capture long run economic equilibrium relationships while allowing for a wider range of mean reverting behavior than standard models of cointegration. It is shown that a fractional version of the Fully-Modified (FM) method suggested in Phillips and Hansen (1990) retains its asymptotic validity as well as some optimality properties in certain models of fractional cointegration. In the course of this development, some new results on the definition and estimation of long run variance matrices for fractional processes are given, these being useful in the construction of the fractional FM (FFM) estimator. Simulations analyzing the finite sample performance of the FFM-estimator are reported and an empirical application to exchange rate dynamics is conducted. | |
| DATE/PLACE: | Tuesday, March 04, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Regime-Switching and the Estimation of Multifractal Processes |
| SPEAKER: | Adlai Fisher, Faculty of Commerce, Department of Finance, UBC |
|
We propose a discrete-time stochastic volatility model in which regime-switching serves three purposes. First, changes in regimes capture low frequency variations, which is their traditional role. Second, they specify intermediate frequency dynamics that are usually assigned to smooth autoregressive processes. Finally, high frequency switches generate substantial outliers. Thus, a single mechanism captures three important features of the data that are typically addressed as distinct phenomena in the literature. Maximum likelihood estimation is developed and shown to perform well in finite sample. We estimate on exchange rate data a version of the process with four parameters and more than a thousand states. The estimated model compares favorably to earlier specifications both in- and out-of-sample. Multifractal forecasts slightly improve on GARCH(1,1) at daily and weekly intervals, and provide considerable gains in accuracy at horizons of 10 to 50 days. Keywords: Forecasting, long memory, Markov regime-switching, maximum likelihood estimation, scaling, stochastic volatility, time deformation, volatility component, Vuong test. |
|
| DATE/PLACE: | Tuesday, February 18, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TITLE: | Extended Voting Measures |
| SPEAKER: | Tim Swartz, Dept of Stats/Actsci, SFU |
| In this talk, I extend a number of voting measures used to quantify voting power. The extension is based on the recognition that individuals sometimes vote in coalitions. This observation gives rise to a statistical model which considers past voting patterns of subsets of eligible voters. The model is then used to obtain estimates of the probabilities of all voting combinations from which empirical measures are calculated. The calculation of the estimated probabilities may involve high-dimensional integrations. An example is given based on past decisions arising from the Supreme Court of Canada. | |
| DATE/PLACE: | Thursday, January 16, 2003, 16:00 Leonard S. Klinck 301 6356 Agricultural Road, UBC |
| TYPE: | Biostats Research Group |
| TITLE: | The effects of modelling pollution levels on the relative risks obtained from time series studies examining the relationship between air pollution and health |
| SPEAKER: | Gavin Shaddick, Dept of Mathematics, University of Bath, UK |
| In conducting time series studies to investigate the relationship between air pollution and a health outcome, for example respiratory mortality, it is important to have a good measure of the level of pollution on any particular day. Often daily measurements are available from a number of monitoring sites across the study area. Each of these monitors may measure different sets of pollutants, there may be periods of missing data, and all of the recorded measurements will be subject to error. Here, a (Bayesian) hierarchical model is used for the analysis of such data; it addresses the issues described and, specifically, allows information from multiple sites on different pollutants to be combined. This allows an estimate of a 'smoothed', or underlying, pollution level for each pollutant at each site to be obtained, incorporating any possible lag structure, along with a measure of uncertainty. These modelled levels of pollution can then be used in time series analyses examining the relationship with health outcome. The measure of uncertainty is particularly useful for accounting for the variation in the pollution level, whether informally, when interpreting the regression coefficients, or more formally via error-in-variables modelling. These methods are applied to levels of a number of pollutants, including PM10, CO, NO and SO2, measured at eight sites in London for the period 1993-96. Associations between the resulting modelled levels of pollution and daily mortality counts in London (from 1993-96) are then examined and compared with those obtained using the original pollution measurements. The sensitivity of relative risks and the width of their confidence intervals are examined with respect to model assumptions, with particular interest in the effect of periods of missing data. This is joint work with Jon Wakefield. | |
