Category tree Gaussian process for computer experiments with many-category qualitative factors and application to cooling system design

In computer experiments, Gaussian process (GP) models are widely employed for emulation. However, when both qualitative and quantitative factors are involved, especially when qualitative factors have many categories, GP-based emulation becomes challenging, and existing methods can become unwieldy due to the curse of dimensionality. Motivated by computer experiments for the design of a cooling system, we introduce a new tree-based GP model for emulating computer codes with high-cardinality qualitative factors, referred to as the category tree GP (ctGP). The proposed approach incorporates a tree structure to partition the categories of the qualitative factors, after which GP or mixed-input GP models are fitted to the simulation outputs within the leaf nodes. The splitting rule is designed to reflect the cross-correlations among the categories of the qualitative factors, which a recent theoretical study has identified as a key component for improving prediction accuracy, and a pruning procedure based on cross-validation error is introduced to further ensure strong predictive performance. An application to the design of a cooling system demonstrates that the proposed method not only yields substantial computational gains and accurate predictions, but also offers meaningful insights into the system by uncovering an interpretable tree structure. Furthermore, in this cooling system design problem, the computer code is capable of generating multiple responses in addition to a single objective response; to accommodate this, we extend the ctGP framework to handle multiple responses by introducing an additional categorical variable that indicates which response is associated with each experimental point. Finally, we complete the cooling system design study by addressing the corresponding global optimization problem using Bayesian optimization with ctGP and an expected-improvement-type criterion.

To join this seminar virtually, please request Zoom connection details from ea@stat.ubc.ca. 

Bio: Ray-Bing Chen is a Professor in the Institute of Statistics and Data Science at National Tsing Hua University. He received his Ph.D. in Statistics from the University of California, Los Angeles in 2003. Prof. Chen’s research interests include statistical and machine learning, statistical modeling, computer experiments, and optimal design. His work has been published in leading journals such as the Annals of Applied Statistics, Journal of Computational and Graphical Statistics, Statistics and Computing, Technometrics, Journal of Quality Technology, and Computational Statistics and Data Science. In recognition of his contributions to the field, he became an Elected Member of the International Statistical Institute in 2020.

Event type: Statistics Seminar
Speaker's page: https://sites.google.com/view/ray-bingchenswebsite/home
Location: ESB 4192 / Zoom
Event date: -
Speaker: Ray-Bing Chen, Professor, Institute of Statistics and Data Science, National Tsing Hua University