Department Seminars 2003

DATE/PLACE: Thursday, December 11, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: BRG Seminar
TITLE: Causal Inference: Using Graphical Models in the Absence and Presence of Background Knowledge
SPEAKER: Dr. Rebecca Ayesha Ali,
Department of Statistics and Applied Probability,
National University of Singapore

Graphical Markov models are used in many diverse areas which in recent times include Epidemiology and Statistical Genetics. This talk is divided into two sections.

The first section will discuss conducting model searches in the absence of background knowledge. I will introduce ancestral graphs: graphs that encode the conditional independence relations holding only among the observed variables of some data-generating process. I will also provide a simple representation of Markov equivalence classes for ancestral graphs. For a given set of data, there are often many competing models (graphs) that could explain the data and automated searches may be unable to distinguish between these models. By characterizing equivalence classes one can begin to search across equivalence classes rather than graphs alone when doing model selection.

The second section of my talk will discuss doing a model search in the presence of background knowledge. In particular, I will focus on an example from STD research in which the underlying graph is postulated based on available background knowledge. Causal inference methods (marginal structural Cox models) are then used to account for time-dependent confounding and intermediate variables in this longitudinal cohort data.

Keywords: directed acyclic graphs, ancestral graphs, Markov equivalence, marginal structural models.

DATE/PLACE: Monday, December 08, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Statistics Seminar
TITLE: Visualizing categorical data with hammock plots
SPEAKER: Dr. Matthias Schonlau,
Rand Statistical Consulting Service,
RAND, Santa Monica, CA
A Hammock plot is a new graph for visualizing categorical data. The graph is a hybrid between Mosaic plots and parallel coordinate plots. It borrows the idea of representing a category by an area from Mosaic plots, and it takes the arrangement of the variables on the plot from parallel coordinate plots. It is also related to the Clustergram (www.schonlau.net/clustergram.html), a graph for visualizing hierarchical and non-hierarchical cluster analyses. Clustergrams and dendrograms have similar objectives; however, dendrograms can only be constructed for hierarchical cluster analyses. I explore the strengths and weaknesses of the proposed graph with several data sets.

More information about Dr. Schonlau may be found at www.schonlau.net.

DATE/PLACE: Thursday, November 27, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: BRG Seminar
TITLE: Generalised Linear Models for Sparsely Correlated Data
SPEAKER: Dr. Thomas Lumley,
Dept of Biostatistics,
University of Washington
I define "sparsely correlated" data as data where two randomly chosen small sets of data are likely to be independent (the precise version of this takes too much notation for an abstract). This includes clustered and longitudinal data but also more complicated situations such as incomplete crossed designs. Marginal generalised linear models for sparsely correlated data are mathematically interesting as even the rate of convergence is not immediately obvious. They are also of practical interest as they turn out to be easy to fit, in many cases with standard software, in contrast to generalised linear mixed models. I will present some theoretical and practical applications.
DATE/PLACE: Tuesday, November 25, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Statistics Seminar
TITLE: What's the deal with concurvity and the generalized additive model?
SPEAKER: Dr. Tim Ramsay,
Institute of Population Health,
University of Ottawa
In the relatively short time since its popularization in Hastie and Tibshirani's landmark book entitled "Generalized Additive Models", the GAM (generalized additive model) has been widely adopted as a valuable addition to the standard statistical toolbox. The GAM combines the flexible nature of nonparametric smoothers with the highly developed framework of the generalized linear model to provide an extremely useful technique for exploratory data analysis (EDA). Unfortunately, the complexity of some of the assumptions underlying asymptotic results for the GAM has led naive users to mistake this powerful EDA technique for a model from which to make inference. This talk will focus on concurvity, the nonparametric analogue of multicollinearity. Through a comparison between the additive model and linear regression, concurvity will be shown to be a straightforward generalization of multicollinearity. The speaker will argue that concurvity should be thought of as the rule, rather than as the exception, and will demonstrate the effect of concurvity on bias and standard error. The take-home message will be two-fold. First, one should be very careful about using GAMs for inference. Second, the GAM remains unparalleled as a tool for exploring complex relationships between continuous variables. The talk is based on examples, rather than on formal analytic results, and assumes only statistical intuition and a passing familiarity with nonparametric regression. It should be easily accessible to both graduate students and reasonably advanced upper-year undergraduates.
DATE/PLACE: Friday, November 21, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Statistics Seminar / BRG Seminar
TITLE: Some experiences with random effect (generalized) linear models
SPEAKER: Dr. David Brillinger,
Department of Statistics,
U California, Berkeley
A number of applied studies leading to random effect models will be discussed and results presented. The examples are taken from the fields of medicine (red blood cell survival), seismology (attenuation laws), demography (distributions of births and deaths) and risk analysis (wildfires).

References

  1. D.R. Brillinger and H.K. Preisler, "Maximum likelihood estimation in a latent variable problem". Studies in Econometrics, Time Series and Multivariate Statistics. Academic Press, New York (1983), pp. 31-65.
  2. D.R. Brillinger and H.K. Preisler, "Further analysis of the Joyner-Boore attenuation data", Bull. Seismol. Soc. America, Vol. 75 (1985), pp. 611-614.
  3. "The natural variability of vital rates and associated statistics", Biometrics, Vol. 42 (1986), pp. 693-712.
  4. "Spatial-temporal modelling of spatially aggregate birth data", Survey Methodology Journal, Vol. 16 (1990), pp. 255-269.
  5. "Random effect models in the estimation of wildfire risk". In preparation.
DATE/PLACE: Tuesday, November 18, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Statistics Seminar
TITLE: Limited and full information estimation and goodness-of-fit testing in 2^n contingency tables
SPEAKER: Dr. Harry Joe,
Department of Statistics,
UBC
High-dimensional contingency tables tend to be sparse, and standard goodness-of-fit statistics such as X^2 cannot be used without pooling of categories. As an improvement on arbitrary pooling, for goodness-of-fit of large 2^n contingency tables, we propose a class of quadratic form statistics based on the residuals of margins or multivariate moments up to order r. Further, the marginal residuals are useful for diagnosing lack of fit of parametric models. These classes of test statistics are asymptotically chi-square and have better small sample properties than X^2 and G^2. We also show that these classes of test statistics have better power than X^2 for some useful multivariate binary models. Related to this class of test statistics is a class of limited information estimators based on low-dimensional margins. We show that these estimators have high efficiency for one commonly used item response model. This is joint work with Albert Maydeu-Olivares, Department of Psychology, University of Barcelona.
DATE/PLACE: Friday, November 14, 2003, 4:00pm
Chan Auditorium at CMMT/BCRI,
950 W 28th Ave
TYPE: Statistics Seminar / BRG Seminar
TITLE: Software Innovation for Computational Biology and Bioinformatics
SPEAKER: Dr. Robert Gentleman,
Department of Biostatistics,
Harvard School of Public Health
Software is an essential component of solutions to problems in computational biology and bioinformatics (CBB). In this talk I consider some of the roles that software and software development can take in different CBB projects. An emphasis is placed on the rapid development and deployment of reusable components. Some of the lessons learned in the R project (www.r-project.org) and in the Bioconductor Project (www.bioconductor.org) will be used as concrete examples. The two examples are software for the analysis of microarray data and software for the analysis of protein-protein interaction data. The notion of reproducible research, its reliance on software, and some of the many benefits that this approach provides will be presented.
DATE/PLACE: Tuesday, October 28, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Statistics Seminar
TITLE: On Subset Selection Under Order Restrictions
SPEAKER: Dr. Constance van Eeden, Honorary Professor,
Department of Statistics,
UBC
Abstract Available
DATE/PLACE: Thursday, September 4, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Joint Workshop/Biostats Research Group
TITLE: Nonparametric Mixed-Effect Models
SPEAKER: Chong Gu,
Dept of Statistics,
Purdue University

Mixed-effect models are widely used for the analysis of correlated data such as longitudinal data and repeated measures. In this talk, I will present some recent results on the nonparametric estimation of the fixed effects in such models. Using Henderson's likelihood, the "variance components" can be turned into "mean components," and computation and cross-validation strategies developed for independent data can be used to handle correlated data. The optimality of cross-validation in the setting can be established through asymptotic analysis and simulation studies. Real-data examples are also presented to illustrate potential applications of the methodology. The talk is based on joint work with Ping Ma.

DATE/PLACE: Tuesday, August 19, 2003, 4:00pm
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Longitudinal Analyses for Ordinal Expanded Disability Status Scale Scores from a Multiple Sclerosis Clinical Trial
SPEAKER: Lei Han,
Department of Statistics,
UBC
Longitudinal data sets consist of repeated observations on a group of subjects over time, usually comprising one or more response variables and a corresponding vector of covariates. We focus on the case of a categorical response variable. A simple first-order generalized estimating equations (GEE1) approach for longitudinal categorical data is first discussed. A second-order generalized estimating equations approach (GEE2) and a revised GEE1 approach, which avoids the computational burden associated with the GEE2 approach while retaining some of its desirable properties, are also described. These GEE approaches are applied to the ordinal EDSS data from the Betaseron clinical trial in relapsing-remitting multiple sclerosis (MS). Our principal objectives are to examine the practicality of these GEE approaches for ordinal outcome measures as collected in typical MS clinical trials, and to identify the presence of any treatment or covariate effects on the EDSS response in the Betaseron clinical trial in relapsing-remitting MS.

DATE/PLACE: Friday, August 8, 2003, 11:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: R grid Graphics
SPEAKER: Dr. Paul Murrell
Department of Statistics
The University of Auckland

This talk will discuss the differences between the way that a user sees statistical graphics and the way a software developer sees statistical graphics. We will consider reasons for moving users towards the developer's view and describe how the grid graphics add-on package for R allows users to make that transition.


DATE/PLACE: Tuesday, July 15, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Longitudinal Analyses for Magnetic Resonance Imaging Outcomes in the PRISMS Multiple Sclerosis Clinical Trial
SPEAKER: Lindsey Turner
Department of Statistics, UBC
Longitudinal data sets consist of repeated observations of a response variable on a group of patients over a period of time. Often a corresponding set of covariates is available for each patient. Usually analysis of such data is based on summaries over time. Although this allows the use of simple techniques to assess treatment effects, the use of these univariate summaries does not allow for the examination of patterns over time in the repeated responses. The purpose of this project is to utilize the generalized estimating equations (GEE) approach to the analysis of longitudinal responses in the UBC frequent magnetic resonance imaging (MRI) substudy of the 2-year PRISMS interferon beta-1a trial in relapsing-remitting multiple sclerosis. This method is not only able to identify treatment effects but is also able to describe the nature of these effects over time.

DATE/PLACE: Thursday, July 10, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: Modelling the association between mortality and long term exposure to air pollution
SPEAKER: Dr. Gavin Shaddick
Dept of Mathematics
University of Bath, UK

Compared to acute effects of air pollution on health, rather less attention has been given to the investigation of chronic effects of air pollution, i.e. the association between health outcomes and long-term exposures to air pollution, possibly over several years. This is primarily due to the lack of availability of suitable data (including potential confounders), and it is uncertain to what extent, if at all, findings from studies of short-term effects can be extrapolated to longer term (chronic) effects. The majority of the studies of the chronic effects of air pollution have focused only on concurrent exposures: that is, on the associations between health outcome and pollution levels measured in the same, or very recent, years. As such, these studies take no account either of possible latency effects (e.g. due to exposures earlier in life), or of the effects of cumulative exposures over many years. The approach presented here is novel in that it addresses both of these issues at a much higher level of geographical resolution than has been possible before, and additionally allows the effects to vary over calendar time. This is achieved by using health outcomes from distinct time periods, each of which is associated with previous exposure over a number of years. By considering sub-divisions of these exposure periods, the effects of different lags can be examined. This is implemented within a Bayesian framework and applied to ward-level mortality data (respiratory deaths) from Great Britain during the period 1981-96 with associated exposures from ambient concentrations of black smoke and sulphur dioxide from 1966-81 measured in the same wards. The effects of unmeasured confounders are considered, both in terms of possible over-dispersion and, as a number of the wards may be common to more than one of the time periods, in the possibility that the outcomes may not be considered independent.
The effects of socio-economic deprivation, using a census-based index, are also taken into account. The study has shown consistent associations between long-term SO2 exposure and respiratory mortality, with increased risks of similar magnitude to those previously observed in studies in the USA, suggesting that the long-term health risks of exposure to air pollution merit continued attention.


DATE/PLACE: Tuesday, April 8, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Overview of data, projects and possibilities at the Research Data Centre
SPEAKER: Nicole Fortin,
Academic Director, Research Data Centre
James Croal,
Analyst, Research Data Centre
The British Columbia Inter-university Research Data Centre provides access
to Statistics Canada's household surveys for faculty and students from SFU,
UBC and UVic. Researchers from agricultural sciences, business
administration, education, health and social sciences have projects at the
RDC using longitudinal complex-design survey microdata. This presentation
will discuss:
a) how to access Statistics Canada data
b) projects currently in progress at the RDC
c) what role statisticians can have in RDC projects

The Research Data Centre does not financially support any projects. However,
we can assist in connecting statisticians with a diverse range of applied
and policy-relevant research projects. The presentation will provide an
overview of the types of problems which statisticians could be involved with
when collaborating or consulting with an RDC project. This work could involve
correcting for response or sample bias, model development and diagnostics,
weighting of pooled records, and variance estimation within a complex survey
design.

DATE/PLACE: Thursday, April 03, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: Analyses of Longitudinal and Time-to-Event Data in a Randomized Clinical Trial in the Presence of a Lag Time in the Stabilization of Treatment
SPEAKER: Eugenia Yu,
Dept of Statistics,
UBC

Randomized controlled clinical trials (RCTs) are generally considered the best experimental setting for assessing new medical therapies. In medical research, the evaluation of RCTs is often based on two approaches: the commonly recommended intent-to-treat (ITT) approach and the more controversial per-protocol (PP) approach, which respectively attempt to assess the clinical effectiveness and efficacy of a therapy. In the presence of a variable lag time in treatment stabilization following randomization, the baseline is defined at the time of randomization for both treatment groups under the ITT approach, whereas the baseline is shifted to the time of treatment stabilization for each treated individual under the PP approach. In this work, the ITT and the PP analyses are applied to the evaluation of an eye pressure lowering therapy, in data from the Collaborative Normal Tension Glaucoma Study, where the intervention incurred a lag time before the reduced pressure level became stable. The potential problems of bias in parameter estimation and diminishing statistical power in testing the treatment effect under the PP approach are also investigated through some simulation work. While the ITT and the PP approaches fail to account for the delay in treatment stabilization, other approaches that incorporate the lag time information, including the use of a multistate model for survival analysis and a piecewise linear mixed effects (LME) model for longitudinal analysis, are applied for an effectiveness assessment of the therapy. Finally, we consider a baseline-adjustment approach to match the control group to the delayed treated group for an efficacy assessment of the therapy. The different approaches are compared, and recommendations based on their performance in our study and their general applicability are also given.


DATE/PLACE: Tuesday, April 1, 2003, 15:45
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Bayes-assisted goodness-of-fit tests
SPEAKER: Richard Lockhart,
SFU

I discuss the impact of three principles on the problem of choosing a good goodness-of-fit test. First, when testing statistical hypotheses, alternatives of interest are neither undetectably nor grossly different from the null hypothesis. Second, good tests are designed to be sensitive to alternatives likely to arise in practice. Third, the purpose of limit theorems is to provide good approximate probability calculations of interest to statisticians.

I will use Bayesian priors on the alternative hypothesis to construct tests which maximize the expected power for a prior which depends on the sample size. Priors will be presented for which the optimal procedures are (approximately) such goodness-of-fit tests as the Cramér-von Mises or the Anderson-Darling test.


DATE/PLACE: Tuesday, March 25, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Joint Workshop/Biostats Research Group
TITLE: Non-additive Effects in Logistic Regression
SPEAKER: Kazi Azad
Dept of Statistics
UBC

Logistic regression is commonly used in epidemiology to model the relationship between risk factors and presence/absence of a disease. Usually it is difficult to look for interaction structure (many possible pairwise interactions, for instance) to include in the model, so a model which is additive on the logit scale is fitted. If the number of risk factors is relatively large, such an additive relationship may not make good sense. A new logistic regression model is proposed to incorporate non-additive interaction effects. In some scenarios this model might better reflect the relationship between the response variable and the risk factors. The Bayesian approach is followed to fit the model, and a Markov chain Monte Carlo (MCMC) algorithm known as the hybrid algorithm is used to simulate the parameters. We apply the new model to three examples and interpret the parameter estimates. We compare the predictive performance of the new model with that of the stepwise and the ordinary logistic regression models.


DATE/PLACE: Tuesday, March 18, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Fully Modified Estimation of Fractional Cointegration Models
SPEAKER: Chang S. Kim
Department of Economics, UBC
Efficient estimation techniques for certain models of fractional cointegration are developed. Such models can capture long run economic equilibrium relationships while allowing for a wider range of mean reverting behavior than standard models of cointegration. It is shown that a fractional version of the Fully-Modified (FM) method suggested in Phillips and Hansen (1990) retains its asymptotic validity as well as some optimality properties in certain models of fractional cointegration. In the course of this development, some new results on the definition and estimation of long run variance matrices for fractional processes are given, these being useful in the construction of the fractional FM (FFM) estimator. Simulations analyzing the finite sample performance of the FFM-estimator are reported and an empirical application to exchange rate dynamics is conducted.

DATE/PLACE: Rescheduled to March 27, 2003, 16:00 (originally Thursday, March 13, 2003, 16:00)
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: Simplex Mixed Models for Longitudinal Proportional Data
SPEAKER: Zhenguo Qiu
Children's Hospital & UBC

We present the simplex mixed models, a class of generalized linear mixed models (GLMMs) based on the simplex distribution of Barndorff-Nielsen and Jorgensen (1991), which is suitable for modeling longitudinal continuous proportional data. The penalized quasi-likelihood (PQL) estimation developed by Breslow and Clayton (1993) for GLMMs is discussed for the simplex mixed models, and estimation of the variance parameters of the random effects is then derived using the restricted maximum likelihood (REML) method. However, the results of a simulation study show that such approximate inference may seriously underestimate the variance parameters, especially when the dispersion parameter is large. To overcome the bias, we propose a new version of PQL and REML for the simplex mixed models through the high-order multivariate Laplace approximation. The simulation results reveal that the proposed high-order approximate inference greatly reduces the bias of the estimates compared with the original approximate inference. The proposed methods are illustrated by analyzing the eye surgery data.


DATE/PLACE: Tuesday, March 04, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Regime-Switching and the Estimation of Multifractal Processes
SPEAKER: Adlai Fisher
Faculty of Commerce
Department of Finance, UBC

We propose a discrete-time stochastic volatility model in which regime-switching serves three purposes. First, changes in regimes capture low frequency variations, which is their traditional role. Second, they specify intermediate frequency dynamics that are usually assigned to smooth autoregressive processes. Finally, high frequency switches generate substantial outliers. Thus, a single mechanism captures three important features of the data that are typically addressed as distinct phenomena in the literature. Maximum likelihood estimation is developed and shown to perform well in finite samples. We estimate on exchange rate data a version of the process with four parameters and more than a thousand states. The estimated model compares favorably to earlier specifications both in- and out-of-sample. Multifractal forecasts slightly improve on GARCH(1,1) at daily and weekly intervals, and provide considerable gains in accuracy at horizons of 10 to 50 days.

Keywords: Forecasting, long memory, Markov regime-switching, maximum likelihood estimation, scaling, stochastic volatility, time deformation, volatility component, Vuong test.

DATE/PLACE: Thursday, Feb 27, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: INCORPORATING HAPLOTYPE UNCERTAINTY INTO ASSOCIATION STUDIES
SPEAKER: Peter Pare/Kelly Burkett
St Paul's Hospital

There are over 3 million polymorphic sites (SNPs) in the 3 billion base pair human genome. It is this variation in DNA sequence that leads to inter-individual variation, including the variation in susceptibility to the common diseases which contribute the major burden of illness in our society (atherosclerosis, asthma, emphysema, arthritis). The revolution in molecular genetics has made it possible to identify the gene variants responsible for disease susceptibility. It is not possible or necessary to test for all of the 3 million SNPs in the human genome. The likely culprits are in genes rather than between genes, and blocks of SNPs tend to be clustered together in haplotypes (a series of SNPs in series along a chromosome). However, because each person has two chromosomes and the methods of genotyping only give the two bases at each site without a chromosomal assignment, specific haplotypes can only be inferred, not measured. The uncertainty in this haplotype assignment has not previously been incorporated into association studies of genetic haplotypes and disease. Kelly Burkett and Jinko Graham have developed a method to incorporate this uncertainty along with environmental confounders into a single multivariate model. The model and an example of its application will be presented.


DATE/PLACE: Tuesday, February 18, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TITLE: Extended Voting Measures
SPEAKER: Tim Swartz
Dept of Stats/Actsci, SFU
In this talk, I extend a number of voting measures used to quantify voting power. The extension is based on the recognition that individuals sometimes vote in coalitions. This observation gives rise to a statistical model which considers past voting patterns of subsets of eligible voters. The model is then used to obtain estimates of the probabilities of all voting combinations from which empirical measures are calculated. The calculation of the estimated probabilities may involve high-dimensional integrations. An example is given based on past decisions arising from the Supreme Court of Canada.
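
As background for the kind of measure being extended, here is a minimal sketch of one classical voting power measure, the normalized Banzhaf index, for a weighted voting game. It is not the speaker's extended measure: it treats every voting combination as equally likely, whereas the talk replaces that assumption with probabilities estimated from past voting patterns. The function name, weights and quota are illustrative assumptions.

```python
def banzhaf_index(weights, quota):
    """Normalized Banzhaf index for a weighted voting game.

    Illustrative assumption: every coalition is equally likely.
    A coalition wins when its total weight reaches the quota.
    """
    n = len(weights)
    swings = [0] * n
    # Enumerate every coalition as a bitmask over the n voters.
    for mask in range(1 << n):
        total = sum(w for i, w in enumerate(weights) if (mask >> i) & 1)
        if total < quota:
            continue  # losing coalition, nobody is critical
        for i in range(n):
            # Voter i is critical if the coalition loses without them.
            if (mask >> i) & 1 and total - weights[i] < quota:
                swings[i] += 1
    all_swings = sum(swings)
    if all_swings == 0:
        return [0.0] * n  # quota can never be reached
    return [s / all_swings for s in swings]


if __name__ == "__main__":
    # Three voters with weights 2, 1, 1 and a quota of 3.
    print(banzhaf_index([2, 1, 1], 3))
```

The enumeration is exponential in the number of voters, which is fine for a nine-member court but not for large electorates.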

DATE/PLACE: Thursday, Feb 13, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: Outbreak Detection: An algorithm proposed for use at BCCDC.
SPEAKER: Rick White
Dept of Statistics, UBC / BCCDC
and
Dr. Monika Naus
BCCDC

The BC Centre for Disease Control (BCCDC) is responsible for dealing with outbreaks of communicable disease in BC and has been interested in developing an "early warning system" to detect such outbreaks. Over the last year an algorithm has been developed and is being applied weekly to the case counts of several "prototype" diseases. It appears to be quite sensitive compared to the visual scan of case counts employed previously at the BCCDC. The algorithm is based on generalized additive models fitting loess curves to estimate the underlying rate of a disease. An estimated overdispersion parameter is used in conjunction with the rate to compute a probability for each observed count. Alerts can be based on a low probability, a sharp increase in the estimated rate and/or consecutive observations above the rate. Sensitivity and specificity are being evaluated by Monte Carlo simulations.
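
The probability-based alert described above can be sketched roughly as follows. This is not the BCCDC algorithm: a trailing moving average stands in for the loess/GAM rate estimate, the overdispersion parameter is supplied rather than estimated, and only the low-probability rule is shown. All function names, the window length, the overdispersion value and the threshold are illustrative assumptions.

```python
import math


def nb_log_pmf(k, mu, phi):
    """Log pmf of a negative binomial with mean mu and variance phi * mu (phi > 1)."""
    r = mu / (phi - 1.0)  # size parameter
    p = 1.0 / phi         # success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def poisson_log_pmf(k, mu):
    """Log pmf of a Poisson with mean mu, used when there is no overdispersion."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def upper_tail_prob(y, mu, phi):
    """P(Y >= y) for expected count mu and overdispersion phi."""
    if phi > 1.0:
        below = sum(math.exp(nb_log_pmf(k, mu, phi)) for k in range(y))
    else:
        below = sum(math.exp(poisson_log_pmf(k, mu)) for k in range(y))
    return max(0.0, 1.0 - below)


def flag_alerts(counts, window=8, phi=1.5, threshold=0.01):
    """Flag weeks whose count is improbably high given the recent baseline.

    Assumption: the baseline rate is the mean of the previous `window`
    counts, a crude stand-in for a smoothed (loess) rate estimate.
    """
    alerts = []
    for t in range(window, len(counts)):
        mu = max(sum(counts[t - window:t]) / window, 0.1)  # keep log(mu) defined
        prob = upper_tail_prob(counts[t], mu, phi)
        if prob < threshold:
            alerts.append((t, counts[t], prob))
    return alerts


if __name__ == "__main__":
    weekly_counts = [3, 4, 2, 5, 3, 4, 3, 4, 4, 15]  # made-up data
    for week, count, prob in flag_alerts(weekly_counts):
        print(f"week {week}: count {count}, tail probability {prob:.5f}")
```

The other two alert rules in the abstract (a sharp rise in the estimated rate, and consecutive observations above the rate) would need the smoothed rate itself and are left out of this sketch.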


DATE/PLACE: Thursday, January 30, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: Statistical Models, Degeneracy and Inference for Social Networks
SPEAKER: Mark Handcock
Dept of Statistics, University of Washington

The process of formulation and information encapsulated within social networks result in a form of "relational data". Relational data arise in many social science fields and graph models are a natural approach to representing the structure of these relations. This framework has many applications including, for example, the structure of social and communication networks and the behavior of certain epidemics. We consider statistical and stochastic models for such graphs that can be used to represent the structural characteristics of the networks. In our applications, the nodes usually represent people, and the edges represent a specified relationship between the people. A commonly used model formulation was introduced by Frank and Strauss (1986) and derived from developments in spatial statistics (Besag 1974). These models allow for the potentially complex dependencies within relational data structures.

To date, the use of graph models for networks has been limited by three interrelated factors: the complexity of realistic models, paucity of empirically relevant simulation studies, and a poor understanding of the properties of inferential methods. In this talk we discuss solutions to these limitations. We emphasize the importance of likelihood-based inferential procedures and the role of Markov Chain Monte Carlo (MCMC) algorithms for simulation and inference.

A primary ongoing issue is the identification of classes of realistic and parsimonious models. In this regard we show the unsuitability of some commonly promoted Markov model classes because they can result in degenerate probability distributions. We also consider the suitability of, and inference for, classes of "power law" models that have been proposed for certain random graphs.

The ideas are motivated and illustrated by the study of sexual relations networks with the objective of understanding the social determinants of HIV spread.


DATE/PLACE: Tuesday, January 21, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Joint Workshop / Biostats Research Group
TITLE: Hidden Markov models for multiple processes
SPEAKER: Rachel MacKay
Dept of Statistics, UBC

Hidden Markov models (HMMs) are a useful tool for capturing the behaviour of overdispersed, autocorrelated data. These models have been applied to many different problems, including speech recognition, precipitation modelling, and gene finding and profiling.

Typically, HMMs are applied to individual stochastic processes; HMMs for simultaneously modelling multiple processes have not been widely studied. In this context, random effects may be a natural choice for capturing differences between processes. In this talk, we develop the theory required for implementing and interpreting these models in a general setting, using the framework of generalized linear mixed models. We discuss parameter estimation and the properties of the resulting estimators, as well as hypothesis tests for the variance components. We then apply these models to two data sets, one relating to lesion counts in multiple sclerosis patients and the other to faecal coliform counts in sea water.


DATE/PLACE: Thursday, January 16, 2003, 16:00
Leonard S. Klinck 301
6356 Agricultural Road, UBC
TYPE: Biostats Research Group
TITLE: The effects of modelling pollution levels on the relative risks obtained from time series studies examining the relationship between air pollution and health
SPEAKER: Gavin Shaddick
Dept of Mathematics
University of Bath, UK

In conducting time series studies to investigate the relationship between air pollution and a health outcome, for example respiratory mortality, it is important to have a good measure of the level of pollution on any particular day. Often daily measurements are available from a number of monitoring sites across the study area. Each of these monitors may measure different sets of pollutants, there may be periods of missing data, and all of the recorded measurements will be subject to error. Here, a (Bayesian) hierarchical model is used for the analysis of such data, addressing the issues described and, specifically, allowing information from multiple sites on different pollutants to be combined. This allows an estimate of a 'smoothed', or underlying, pollution level for each pollutant at each site to be obtained, incorporating any possible lag structure, along with a measure of uncertainty. These modelled levels of pollution can then be used in time series analyses examining the relationship with health outcome. The measure of uncertainty is particularly useful for accounting for the variation in the pollution level, whether informally, when interpreting the regression coefficients, or more formally via error-in-variables modelling. These methods are applied to levels of a number of pollutants, including PM10, CO, NO and SO2, measured at eight sites in London for the period 1993-96. Associations between the resulting modelled levels of pollution and daily mortality counts in London (from 1993-96) are then examined and compared with those obtained using the original pollution measurements. The sensitivity of relative risks and the width of their confidence intervals are examined with respect to model assumptions, with particular interest in the effect of periods of missing data.

This is joint work with Jon Wakefield.
