Simultaneous Outlier Detection and Prediction for Kriging with True Identification
On the afternoon of May 28th, Professor Wang Zhanfeng from the University of Science and Technology of China was invited by the Financial Statistics Research Office to deliver a special lecture titled "Simultaneous Outlier Detection and Prediction for Kriging" in Room 526, Xingjian Building. Professor Wang received his Bachelor's and Ph.D. degrees from the University of Science and Technology of China in 2003 and 2008, respectively. He is currently a doctoral supervisor, primarily engaged in research fields such as biostatistics, functional data analysis, and non-Euclidean data analysis. He currently serves as the Chairman of the Tourism Big Data Branch of the Chinese Association for Applied Statistics and the Vice Chairman of the Geoscience Data Branch of the National Industrial Statistics Teaching Association.
The lecture centered on the core question: "How to achieve robust spatial prediction in the presence of outliers." Professor Wang pointed out that Kriging, as a core tool in geostatistics and computer experiments, can effectively fit complex response surfaces, but its interpolation property makes it highly sensitive to outliers. However, outliers are not always "noise": for example, anomalous regions in medical images may indicate malignant tumors, giving them clinical value. How to detect outliers while maintaining prediction accuracy has long been a challenging problem in the field.
Professor Wang and his collaborators proposed a novel method for simultaneous outlier detection and "robust Kriging." This method introduces a normal-gamma hierarchical prior and automatically identifies the locations and magnitudes of outliers through hierarchical likelihood and maximum a posteriori estimation.
Theoretical analysis proves that under appropriate regularity conditions, the method possesses an oracle property: both the estimation of hyperparameters and outlier detection are consistent, and predictive information remains consistent even in the presence of outliers. The study further reveals that the proposed method has clear interpretations under extreme parameter settings: when the bias parameter is zero, it reduces to classical Kriging; when the parameter tends to infinity, it is equivalent to Kriging prediction after removing all outliers. This property allows a natural smooth transition between using the full dataset and conservative outlier removal. Computationally, the team designed a simple and efficient two-step iterative algorithm that automatically identifies outliers using a thresholding rule, with strictly controllable convergence criteria.
In numerical simulations, the research team used a parsimonious bivariate Matérn model as a testbed, introducing 5%, 10%, and 15% outliers under three experimental designs: grid design, random design, and Latin hypercube design. Compared with several mainstream methods such as classical Kriging, Gaussian process regression, and robust Gaussian process, the proposed SODK method achieved the lowest average root mean square prediction error across all scenarios, with significantly smaller standard deviations. In the NASA airfoil simulation experiment, SODK also performed excellently; further analysis showed that after removing outliers identified by SODK, the prediction accuracy of classical Kriging improved significantly, verifying the method's effectiveness as an "outlier screener."
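
The idea of alternating between a Kriging fit and a thresholding rule, as mentioned in the lecture, can be illustrated with a small sketch. The code below is not the SODK algorithm itself: it does not use the normal-gamma prior or hierarchical likelihood, and it keeps the covariance hyperparameters fixed. It is a generic mean-shift scheme, assuming a model of the form y = f + gamma + noise, where gamma is nonzero only at outliers. The function names, the squared-exponential kernel, the threshold value, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential covariance between 1-D location arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def mean_shift_kriging(x, y, noise_var=0.01, threshold=1.0,
                       max_iter=100, tol=1e-8):
    """Alternate a Kriging-type fit with hard thresholding of residuals.

    Hypothetical helper, not the authors' method. Hyperparameters are
    held fixed; gamma[i] estimates the outlier shift at location x[i].
    """
    n = len(x)
    K = rbf_kernel(x, x)
    # Smoother matrix H = (K + noise_var * I)^{-1} K; it maps outlier-adjusted
    # responses to fitted values at the design points.
    H = np.linalg.solve(K + noise_var * np.eye(n), K)
    gamma = np.zeros(n)
    fitted = H @ y
    for _ in range(max_iter):
        # Step 1: fit to the responses with the current shifts removed.
        fitted = H @ (y - gamma)
        # Step 2: keep only residuals that exceed the threshold as shifts.
        resid = y - fitted
        new_gamma = np.where(np.abs(resid) > threshold, resid, 0.0)
        converged = np.max(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    # fitted is the fit from the last step 1, computed before the final
    # update of gamma; at convergence the two agree to within tol.
    return gamma, fitted


# Toy data (assumed for illustration): a smooth curve with three shifted points.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(40)
outlier_idx = np.array([8, 20, 31])
y[outlier_idx] += 3.0

gamma, fitted = mean_shift_kriging(x, y)
flagged = np.flatnonzero(gamma)
print("flagged indices:", flagged)
print("estimated shifts:", np.round(gamma[flagged], 2))
```

In this sketch, an infinite threshold flags nothing and the fit is the ordinary smoother applied to all data, while a finite threshold effectively replaces flagged responses by their fitted values. The threshold here is chosen by hand; the lecture's method determines outliers through its own prior and estimation procedure.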
In the application to the Swiss Jura soil lead concentration data, SODK successfully identified multiple extreme observation points, complementing the detection results of traditional methods. Comparisons of prediction accuracy showed that SODK consistently outperformed other methods across 100 repeated experiments.
At the end of the lecture, Professor Wang discussed several promising directions for future research: joint modeling of correlated outliers, extensions to other stochastic processes such as the Student-t process, and outlier detection in multivariate output settings. After the lecture, the attending faculty and students engaged in a lively discussion with Professor Wang on topics such as computational details of hierarchical likelihood, threshold selection strategies, and variable selection in high-dimensional spaces.
The lecture was logically clear, rigorous in argumentation, and provided both theoretical depth and practical computational tools, greatly broadening the academic horizons of the audience in the interdisciplinary field of spatial statistics and outlier detection. Finally, the lecture concluded with warm applause.
(By Qibing Gao)