Seminar

UBC Statistics Department Colloquium Series: A Debiased Machine Learning Single-Imputation Framework for Item Nonresponse in Surveys

Machine learning methods are increasingly studied and used in National Statistical Offices, in particular to handle item nonresponse, where survey respondents answer some questions but leave others unanswered. In most surveys, item nonresponse affects key study variables, and imputation is routinely used to handle the resulting missing data. Standard parametric imputation methods can support rigorous inference when their modeling assumptions are approximately correct. However, when the imputation model is misspecified, the resulting inferences may be misleading. Machine learning offers a flexible alternative by learning complex relationships between variables from the data, which can reduce the risk of misspecification. At the same time, this flexibility introduces new challenges for survey inference, since modern learning algorithms may converge more slowly than classical parametric models and may not automatically deliver valid uncertainty quantification. In this talk, I will present a survey sampling extension of the double/debiased machine learning framework of Chernozhukov et al. (2018). The proposed approach combines machine learning-based imputation with design-based survey weighting and an orthogonalized estimating strategy, leading to root-$n$ consistent and asymptotically normal estimation of population means under realistic conditions. We also develop a consistent variance estimator, yielding asymptotically valid confidence intervals while allowing the use of a wide range of machine learning algorithms. I will briefly discuss aggregation procedures and conclude with simulation results illustrating the performance of the proposed methodology.
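To make the idea of an orthogonalized, design-weighted estimator concrete, here is a minimal sketch of a generic cross-fitted augmented inverse probability weighting (AIPW) estimate of a population mean under item nonresponse. It is not the speaker's method and may differ from it in important ways; it only illustrates the general double/debiased pattern. All names (`fit_linear`, `fit_logistic`, `debiased_mean`, `simulate_survey`) are hypothetical, the two simple NumPy learners stand in for arbitrary machine learning algorithms, and no variance estimator is included.

```python
import numpy as np


def fit_linear(X, y, w):
    """Weighted least squares; a stand-in for any ML regressor (assumption)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ beta


def fit_logistic(X, r, w, n_iter=25):
    """Weighted logistic regression by Newton's method; a stand-in classifier."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (w * (r - p))
        hess = (Xd * (w * p * (1 - p))[:, None]).T @ Xd
        beta += np.linalg.solve(hess + 1e-8 * np.eye(len(beta)), grad)
    return lambda Xn: 1.0 / (
        1.0 + np.exp(-np.column_stack([np.ones(len(Xn)), Xn]) @ beta)
    )


def debiased_mean(X, y, r, w, n_folds=5, seed=0):
    """Cross-fitted, design-weighted AIPW estimate of a population mean.

    X: covariates (n, p); y: study variable (ignored where r == 0);
    r: 1 if the item was observed, 0 if missing; w: design weights.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds      # balanced random fold labels
    y_filled = np.where(r == 1, y, 0.0)       # keep NaNs out of the arithmetic
    psi = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        obs = train & (r == 1)
        # Nuisance models are fitted on the other folds only (cross-fitting).
        m_hat = fit_linear(X[obs], y_filled[obs], w[obs])
        p_hat = fit_logistic(X[train], r[train], w[train])
        m = m_hat(X[test])
        p = np.clip(p_hat(X[test]), 0.05, 1.0)  # guard against tiny propensities
        # Imputed value plus a propensity-weighted residual correction.
        psi[test] = m + r[test] / p * (y_filled[test] - m)
    return np.sum(w * psi) / np.sum(w)


def simulate_survey(n, seed=0):
    """Toy data with true mean 1 and response depending on a covariate."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(0.5 + X[:, 0])))
    r = (rng.uniform(size=n) < p).astype(int)
    y = np.where(r == 1, y, np.nan)           # item nonresponse
    w = rng.uniform(1.0, 3.0, size=n)         # illustrative design weights
    return X, y, r, w
```

Calling `debiased_mean(*simulate_survey(4000, seed=1))` on the toy data should land near the true mean of 1, whereas the weighted mean of respondents alone is pulled upward because response is more likely when the first covariate is large. The residual correction term is what makes the estimate insensitive, to first order, to errors in either nuisance model.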

This talk is part of the UBC Statistics Colloquium Series, which features broad and accessible seminars throughout the term.

van Eeden seminar: From Diffusion Models to Schrödinger Bridges - When Generative Modeling meets Optimal Transport

Zoom Registration

https://ubc.zoom.us/meeting/register/Z_eCE0H9QqGknxiuC66eBg  

Title

From Diffusion Models to Schrödinger Bridges - When Generative Modeling meets Optimal Transport

Abstract

Denoising diffusion models have revolutionized generative modeling. Conceptually, these methods define a transport mechanism from a noise distribution to a data distribution. Recent advances have extended this framework to define transport maps between arbitrary distributions, significantly expanding the potential for unpaired data translation. However, existing methods often fail to approximate optimal transport maps, which are theoretically known to possess advantageous properties. In this talk, we will show how one can modify current methodologies to compute Schrödinger bridges, an entropy-regularized variant of dynamic optimal transport. We will demonstrate this methodology on a variety of unpaired data translation tasks.
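For readers unfamiliar with the term, the Schrödinger bridge problem mentioned in the abstract is usually stated as below. This is the standard textbook formulation, not necessarily the speaker's notation; the symbols $\mu$, $\nu$, $Q$, and $\varepsilon$ are introduced here purely for illustration.

```latex
% Dynamic form: among path measures P on [0, 1] with prescribed endpoint
% marginals, pick the one closest in KL divergence to a reference process Q.
P^\star = \operatorname*{arg\,min}_{P \,:\, P_0 = \mu,\; P_1 = \nu}
  \mathrm{KL}(P \,\|\, Q)

% Static form when Q is Brownian motion with variance \varepsilon per unit time:
% entropy-regularized optimal transport over couplings \pi of \mu and \nu.
\pi^\star = \operatorname*{arg\,min}_{\pi \in \Pi(\mu, \nu)}
  \int \tfrac{1}{2} \lVert x - y \rVert^2 \, \mathrm{d}\pi(x, y)
  + \varepsilon \, \mathrm{KL}(\pi \,\|\, \mu \otimes \nu)
```

As $\varepsilon \to 0$, the static problem approaches quadratic-cost optimal transport under suitable conditions, which is the sense in which a Schrödinger bridge is an entropy-regularized variant of optimal transport.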

van Eeden speakers

Dr. Arnaud Doucet has been invited to be this year's van Eeden speaker by the graduate students in the Department of Statistics at the University of British Columbia. A van Eeden speaker is a prominent statistician who is chosen each year to give a lecture, supported by the UBC Constance van Eeden Fund (https://www.stat.ubc.ca/constance-van-eeden-fund). The 2025 seminar is additionally sponsored by the Canadian Statistical Sciences Institute (CANSSI), the Pacific Institute for the Mathematical Sciences (PIMS), and the Walter H. Gage Memorial Fund.

Recent and current projects in statistics education

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: The work of the Flexible Learning in Statistics Group ranges from conducting studies of important aspects of statistics education to developing and testing resources for difficult statistics concepts. In this seminar, students will present several recent projects: using student focus groups to assess Shiny apps, developing and testing interactive resources to improve understanding of Bayesian inference, enhancing Stat 251 labs by creating active learning material and introducing pre-lab quizzes, and conducting a study of the impact of exam question wording on the performance of students with English as an Additional Language (EAL). You’ll also hear about StatEngage, the ASDa-led project to guide students through the challenges of consulting.

Careers and collaborations in health research statistics

To join this seminar virtually: Please request Zoom connection details from ea@stat.ubc.ca.

Abstract: This session offers a perspective on what working as a statistics consultant in a contract research organisation for pharmaceutical and biotech companies entails. In addition to an overview of potential career paths, it will cover the critical tasks and responsibilities of a statistician working with real-world data.

Case studies will illustrate the types of statistical methodologies used and the role they play in drug development, regulatory submissions, and health technology assessments. This sets the stage for a discussion of potential research collaborations between UBC students and industry, which would give students the opportunity to advance health research whilst finding out whether a career in health research is of interest.

van Eeden seminar: Ethical AI is More than Loss Functions

Zoom Registration

https://ubc.zoom.us/meeting/register/u5Mpfu-orz4uHtMoS6AcwTE_0VZ_DDEghNdA

(If you have any questions about your registration or the seminar, please contact ea@stat.ubc.ca.)

Title

Ethical AI is More than Loss Functions

Abstract

What constitutes a fair algorithm and the ethical use of data is context-specific. Algorithms are not neutral, and optimization choices will reflect a specific value system and the distribution of power to make these decisions. Data also reflect societal bias, such as structural racism. Ethics and fairness research for health AI spans many fields, including policy, medicine, computer science, sociology, and statistics. Considerations go well beyond loss functions and typical measures of statistical assessment. This talk includes discussion of team construction, who decides the research question, minimum standards for research quality, reproducibility, least publishable units, and community-engaged research. Overarching themes are that centering health equity and developing methodology tailored to specific health questions are critical, given the stakes involved.

van Eeden speakers

Professor Sherri Rose has been invited to be this year's van Eeden speaker by the graduate students in the Department of Statistics at the University of British Columbia. A van Eeden speaker is a prominent statistician who is chosen each year to give a lecture, supported by the UBC Constance van Eeden Fund. The 2024 seminar is additionally sponsored by the Canadian Statistical Sciences Institute (CANSSI), the Pacific Institute for the Mathematical Sciences (PIMS), and the Walter H. Gage Memorial Fund.