
Parsimonious Structure-Exploiting Deep Neural Network Surrogates for Bayesian Inverse Problems

Omar Ghattas, University of Texas at Austin

In an inverse problem, one seeks to infer unknown parameters or parameter fields from measurements or observations of the state of a natural or engineered system. Such problems are fundamental to many fields of science and engineering: often, the available models possess unknown or uncertain input parameters that must be inferred from experimental or observational data. The Bayesian framework for inverse problems accounts for uncertainty in the inferred parameters stemming from uncertainties in the observational data, the model, and any prior knowledge. Bayesian inverse problems (BIPs) governed by large-scale complex models in high parameter dimensions (such as nonlinear PDEs with uncertain infinite-dimensional parameter fields) quickly become prohibitive, since the forward model must be solved numerous times (as many as millions) to characterize the uncertainty in the parameters.
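For concreteness, the Bayesian formulation can be written out in a standard finite-dimensional form. The notation here is an assumption and does not come from the abstract: m denotes the discretized parameters, d the observed data, F the p2o map, and the observational noise is assumed additive and Gaussian with covariance \Gamma_{\mathrm{noise}}.

```latex
% Assumed data model: additive Gaussian noise on the observables
d = F(m) + e, \qquad e \sim \mathcal{N}(0, \Gamma_{\mathrm{noise}})

% Bayes' rule: the posterior combines the likelihood and the prior
\pi_{\mathrm{post}}(m \mid d) \;\propto\; \pi_{\mathrm{like}}(d \mid m)\, \pi_{\mathrm{prior}}(m)

% Likelihood implied by the Gaussian noise model
\pi_{\mathrm{like}}(d \mid m) \;\propto\;
  \exp\!\Big( -\tfrac{1}{2}\, \big(F(m) - d\big)^{T}\, \Gamma_{\mathrm{noise}}^{-1}\, \big(F(m) - d\big) \Big)
```

Each evaluation of the likelihood requires one evaluation of F(m), that is, one forward model solve, which is why sampling the posterior becomes expensive.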
Efficient evaluation of the parameter-to-observable (p2o) map, defined by solution of the forward model, is the key to making BIPs tractable. Surrogate approximations of p2o maps have the potential to greatly accelerate the solution of BIPs, provided that the p2o map can be accurately approximated using far fewer forward model solves than would be required for solving the BIP using the full p2o map. Unfortunately, constructing such surrogates presents significant challenges when the parameter dimension is high and the forward model is expensive. Deep neural networks (DNNs) have emerged as leading contenders for overcoming these challenges. We demonstrate that black-box application of DNNs to problems with infinite-dimensional parameter fields leads to poor results, particularly in the common situation when training data are limited due to the expense of the model. However, by constructing a network architecture that is adapted to the geometry and intrinsic low-dimensionality of the p2o map as revealed through adjoint PDEs, one can construct a "parsimonious" DNN surrogate with superior approximation properties using only limited training data. Application to an inverse problem in Antarctic ice sheet flow is discussed.
This work is joint with Tom O'Leary-Roseberry, Peng Chen, Umberto Villa, and Nick Alger.
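
The idea of exploiting low-dimensional structure in the p2o map can be illustrated with a small NumPy sketch. This is not the method presented in the talk: the map `toy_p2o`, its Jacobian, and all sizes are hypothetical, the Jacobian is computed in closed form rather than through adjoint PDE solves, and a linear least-squares fit stands in for the neural network to keep the example short. It only shows how derivative information can reveal a low-dimensional input subspace, and how fitting in that subspace compares with a fit in the full parameter space when training data are scarce.

```python
import numpy as np

def toy_p2o(m, A):
    """Hypothetical p2o map: nonlinear, but depends on m only through A @ m."""
    return np.tanh(A @ m)

def toy_jacobian(m, A):
    """Jacobian of toy_p2o at m (in a PDE setting this would need adjoint solves)."""
    z = A @ m
    return (1.0 - np.tanh(z) ** 2)[:, None] * A

def input_basis(jacobians, rank):
    """Dominant eigenvectors of the sample average of J^T J."""
    H = sum(J.T @ J for J in jacobians) / len(jacobians)
    eigvals, eigvecs = np.linalg.eigh(H)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:rank]      # indices of the largest ones
    return eigvecs[:, order]

def relative_error(pred, truth):
    return np.linalg.norm(pred - truth) / np.linalg.norm(truth)

rng = np.random.default_rng(0)
n_param, n_obs, rank, n_train, n_test = 200, 5, 5, 40, 200
A = rng.standard_normal((n_obs, n_param)) / np.sqrt(n_param)

# Limited training data: far fewer samples than parameters.
M_train = rng.standard_normal((n_train, n_param))
Y_train = np.array([toy_p2o(m, A) for m in M_train])
M_test = rng.standard_normal((n_test, n_param))
Y_test = np.array([toy_p2o(m, A) for m in M_test])

# Derivative-informed input subspace and reduced coordinates.
V = input_basis([toy_jacobian(m, A) for m in M_train], rank)
Z_train, Z_test = M_train @ V, M_test @ V

# Stand-in surrogates: linear least squares in full and reduced coordinates.
W_full, *_ = np.linalg.lstsq(M_train, Y_train, rcond=None)
W_reduced, *_ = np.linalg.lstsq(Z_train, Y_train, rcond=None)

err_full = relative_error(M_test @ W_full, Y_test)
err_reduced = relative_error(Z_test @ W_reduced, Y_test)
print(f"full-space fit:    relative test error {err_full:.3f}")
print(f"reduced-space fit: relative test error {err_reduced:.3f}")
```

In this toy setting the full-space fit has 200 unknowns per output but only 40 samples, so it interpolates the training data and generalizes poorly, while the reduced fit has only 5 unknowns per output. Replacing the linear fit with a small network acting on the reduced coordinates would be the natural next step toward the kind of architecture the abstract describes.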