Events

Colloquium

We usually meet (with a few exceptions; please see below) on Fridays at 10:00 am (Eastern Time). If you are interested in joining, please fill out the Registration Form. For questions, please contact Harbir Antil (hantil@gmu.edu).

Upcoming Events

Date, Speaker, Affiliation, Title
Friday, April 28, 2023
Arvind Saibaba, North Carolina State University (Room: Expl. 4106)
Monte Carlo Methods for Estimating the Diagonal of a Real Symmetric Matrix

Abstract:

For real symmetric matrices that are accessible only through matrix-vector products, we present Monte Carlo estimators for computing the diagonal elements. Our probabilistic bounds for normwise absolute and relative errors apply to Monte Carlo estimators based on random Rademacher, sparse Rademacher, and normalized and unnormalized Gaussian vectors. The novel use of matrix concentration inequalities in our proofs represents a systematic model for future analyses. Our bounds mostly do not depend on the matrix dimension and imply that the accuracy of the estimators increases with the diagonal dominance. I will also present LanczosMC, a method for estimating the diagonal of the inverse of a symmetric positive definite matrix that combines a preconditioned Lanczos low-rank approximation of the inverse with a Monte Carlo estimator. The accuracy of the methods is demonstrated by numerical experiments on synthetic test matrices and applications to inverse problems, sensitivity analysis, and network science.
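As a concrete illustration of the matrix-free setting, here is a minimal numpy sketch of a Rademacher-based diagonal estimator in the spirit of the talk (our own simplification, not the speaker's code):

```python
import numpy as np

def mc_diagonal(matvec, n, num_samples=200, rng=None):
    """Estimate diag(A) for a symmetric A accessible only via v -> A @ v,
    using Rademacher probe vectors (a Bekas/Kokiopoulou/Saad-type estimator)."""
    rng = np.random.default_rng(rng)
    num = np.zeros(n)  # accumulates v * (A v)
    den = np.zeros(n)  # accumulates v * v
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        num += v * matvec(v)
        den += v * v
    return num / den

# Quick check on a diagonally dominant symmetric matrix (cf. the bounds above).
n = 100
G = np.random.default_rng(0).standard_normal((n, n))
A = (G + G.T) / 2 + n * np.eye(n)
est = mc_diagonal(lambda v: A @ v, n, num_samples=500)
print(np.max(np.abs(est - np.diag(A)) / np.diag(A)))  # small relative error
```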

Friday, May 05, 2023
Ken Shirakawa (jointly with Applied Math Seminar), Chiba University (Room: Expl. 4106)

Previous Events

Spring 2023

Date, Speaker, Affiliation, Title
Friday, January 27, 2023
Annalisa Quaini (jointly with Applied Math Seminar), University of Houston (Room: Expl. 4106)
Towards the computational design of smart nanocarriers


Abstract:

Membrane fusion is a potentially efficient strategy for the delivery of macromolecular therapeutics into the cell interior. However, existing nanocarriers formulated to induce membrane fusion suffer from a key limitation: the high concentrations of fusogenic lipids needed to cross cellular membrane barriers lead to toxicity in vivo. To overcome this limitation, we are developing in silico models that will explore the use of membrane phase separation to achieve efficient membrane fusion with minimal concentrations of fusion-inducing lipids and therefore reduced toxicity. The models we consider are formulated in terms of partial differential equations posed on evolving surfaces, i.e., the surface of the nanocarrier that undergoes fusion. For the numerical solution, we use a fully Eulerian hybrid (finite difference in time and trace finite element in space) discretization method. The method avoids any triangulation of the surface and uses a surface-independent background mesh to discretize the problem. Thus, our method is capable of handling problems posed on implicitly defined surfaces and surfaces undergoing strong deformations and topological transitions.


Friday, February 03, 2023
No colloquium due to IDIA Symposium

Friday, February 10, 2023
Wolfgang Dahmen (jointly with Applied Math Seminar), University of South Carolina (Room: Expl. 4106)
Accuracy Controlled Data Assimilation for Parabolic Problems


Abstract:

State Estimation or Data Assimilation is about estimating "physical states" of interest from two sources of partial information: data produced by external sensors and a (typically incomplete or uncalibrated) background model, given in terms of a partial differential equation. In this talk we focus on states that ideally satisfy a parabolic equation with known right-hand side but unknown initial values. Additional partial information is given in terms of data that represent the unknown state in a subdomain of the whole space-time cylinder up to a fixed time horizon. Recovering the state from this information is known to be a (mildly) ill-posed problem. Earlier contributions employ mesh-dependent regularizations in a fully discrete setting.

In contrast, we start from a regularized least-squares formulation on an infinite-dimensional level that respects a stable space-time variational formulation of the parabolic problem.

We argue that this has several principal advantages. First, it allows one to disentangle regularization and discretization parameters and identify reasonable "target objects" also in the presence of inconsistent data. More specifically, exploiting the equivalence of errors and residuals in appropriate norms, given by the variational formulation, we derive rigorous computable a posteriori error bounds quantifying the uncertainties of numerical outcomes. Moreover, these quantities provide stopping criteria for an iterative Schur complement solver that is shown to exhibit optimal performance. Finally, it gives rise to estimates for consistency errors and suggests a 'doubly nested' iteration striving for an optimal balance of regularization and discretization.

The theoretical results are illustrated by some numerical tests, including inconsistent data.

(joint work with R. Stevenson and J. Westerdiep)
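Schematically, a regularized least-squares formulation of this kind can be written as follows (notation ours; the talk's precise norms and spaces differ):

```latex
% u: unknown state; B u = f: the parabolic equation in space-time variational
% form; u_delta: observed data on the subdomain omega x (0,T); eps > 0:
% regularization weight on the unknown initial value u(0)
\min_{u}\;\; \tfrac12\,\|u - u_\delta\|_{L_2(\omega\times(0,T))}^2
\;+\; \tfrac12\,\|B u - f\|_{*}^2
\;+\; \tfrac{\varepsilon}{2}\,\|u(0)\|^2
```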

Friday, February 17, 2023
No colloquium
Friday, February 24, 2023
Elizabeth Newman (jointly with Applied Math Seminar), Emory University (Room: Expl. 4106)
How to Train Better: Exploiting the Separability of Deep Neural Networks


Abstract:

Deep neural networks (DNNs) have gained undeniable success as high-dimensional function approximators in countless applications. However, there is a significant hidden cost behind this success: the cost of training DNNs. Typically, the training problem is posed as a stochastic optimization problem with respect to the learnable DNN weights. With millions of weights, a non-convex and non-smooth objective function, and many hyperparameters to tune, solving the training problem well is no easy task.

In this talk, we will make DNN training easier by exploiting the separability of common DNN architectures; that is, the weights of the final layer of the DNN are applied linearly. We will leverage this linearity in two ways. First, we will approximate the stochastic optimization problem deterministically via a sample average approximation. In this setting, we can eliminate the linear weights through variable projection (i.e., partial optimization). Second, in the stochastic optimization setting, we will consider a powerful iterative sampling approach to update the linear weights, which notably incorporates automatic regularization parameter selection methods. Throughout the talk, we will demonstrate the efficacy of these two approaches through numerical examples.
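The variable-projection idea can be sketched in a few lines of numpy (a toy model of our own, not the speaker's implementation): for fixed nonlinear weights, the optimal linear output weights are a closed-form least-squares solve, so the outer optimization runs over the nonlinear weights alone.

```python
import numpy as np

def features(theta, X):
    """Nonlinear part of a tiny one-hidden-layer network: X -> tanh(X @ theta)."""
    return np.tanh(X @ theta)

def projected_loss(theta, X, Y):
    """Variable projection: for fixed nonlinear weights theta, the optimal
    linear output weights W solve a least-squares problem in closed form."""
    Phi = features(theta, X)
    W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)  # inner (linear) solve
    r = Phi @ W - Y
    return 0.5 * np.sum(r**2), W

# Tiny demo: fit y = sin(pi x) with 10 hidden units and a crude outer search.
rng = np.random.default_rng(0)
X = np.linspace(-2, 2, 200).reshape(-1, 1)
Y = np.sin(np.pi * X)
best = (np.inf, None)
for _ in range(300):                            # crude outer optimizer
    theta = rng.standard_normal((1, 10))
    loss, W = projected_loss(theta, X, Y)
    if loss < best[0]:
        best = (loss, (theta, W))
print("best projected loss:", best[0])
```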

Friday, March 03, 2023

Friday, March 10, 2023
Samy Wu Fung (jointly with Applied Math Seminar), Colorado School of Mines (Room: Expl. 4106)
Using Hamilton-Jacobi PDEs in Optimization


Abstract:

First-order optimization algorithms are widely used today. Two standard building blocks in these algorithms are proximal operators (proximals) and gradients. Although gradients can be computed for a wide array of functions, explicit proximal formulas are only known for limited classes of functions. We provide an algorithm, HJ-Prox, for accurately approximating such proximals. This is derived from a collection of relations between proximals, Moreau envelopes, Hamilton-Jacobi (HJ) equations, heat equations, and importance sampling. In particular, HJ-Prox smoothly approximates the Moreau envelope and its gradient. The smoothness can be adjusted to act as a denoiser. Our approach applies even when functions are only accessible by (possibly noisy) blackbox samples. Our approach can also be embedded into a zero-order algorithm with guaranteed convergence to global minima, assuming continuity of the objective function; this is done by leveraging connections between the gradient of the Moreau envelope and the proximal operator. We show HJ-Prox is effective numerically via several examples.

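A rough numpy sketch of the sampling formula behind HJ-Prox as we understand it from the abstract (parameter names are our own): the proximal of f is approximated by a softmax-weighted average of Gaussian samples, an instance of heat-equation smoothing of the Moreau envelope.

```python
import numpy as np

def hj_prox(f, x, t=1.0, delta=0.05, num_samples=100_000, rng=None):
    """Sketch of an HJ-based proximal approximation: prox_{tf}(x) is estimated
    as a softmax-weighted average of Gaussian samples around x, using only
    (possibly blackbox) evaluations of f."""
    rng = np.random.default_rng(rng)
    y = x + np.sqrt(delta * t) * rng.standard_normal((num_samples, x.size))
    fy = f(y)
    w = np.exp(-(fy - fy.min()) / delta)  # stabilized exponential weights
    return (w[:, None] * y).sum(axis=0) / w.sum()

# Sanity check against the exact prox of f = ||.||_1 (soft-thresholding).
x = np.array([1.5, -0.3, 0.05])
approx = hj_prox(lambda Y: np.abs(Y).sum(axis=1), x, t=1.0, delta=0.05)
exact = np.sign(x) * np.maximum(np.abs(x) - 1.0, 0.0)
print(approx, exact)
```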

Friday, March 17, 2023
Spring break (no colloquium)
Friday, March 24, 2023
Daniel Wachsmuth (jointly with Applied Math Seminar), University of Wuerzburg (Room: Expl. 4106)
A topological derivative-based algorithm to solve optimal control problems with L0(Omega) control cost


Abstract:

We consider optimization problems with L0-cost of the controls. Here, we take the support of the control as an independent optimization variable. Topological derivatives of the corresponding value function with respect to variations of the support are derived. These topological derivatives are used in an optimization algorithm, in which topology changes happen at large values of the topological derivative. Convergence results are given.
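A model problem of this type can be written as follows (notation ours):

```latex
% S: control-to-state map of the PDE; y_d: desired state; beta > 0;
% |.|: Lebesgue measure of the support of the control u
\min_{u \in L^2(\Omega)}\;\; \tfrac12\,\|S u - y_d\|_{L^2(\Omega)}^2
\;+\; \beta\,\big|\{x \in \Omega : u(x) \neq 0\}\big|
```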

Friday, March 31, 2023
Dante Kalise (jointly with Applied Math Seminar), Imperial College London (Room: Expl. 4106)
Data-driven schemes for high-dimensional Hamilton-Jacobi-Bellman PDEs

Abstract:

Optimal feedback synthesis for nonlinear dynamics, a fundamental problem in optimal control, is enabled by solving fully nonlinear Hamilton-Jacobi-Bellman type PDEs arising in dynamic programming. While our theoretical understanding of dynamic programming and HJB PDEs has seen a remarkable development over the last decades, the numerical approximation of HJB-based feedback laws has remained largely an open problem due to the curse of dimensionality. More precisely, the associated HJB PDE must be solved over the state space of the dynamics, which is extremely high-dimensional in applications such as distributed parameter systems or agent-based models.

In this talk we will review recent approaches regarding the effective numerical approximation of very high-dimensional HJB PDEs via data-driven schemes in supervised and semi-supervised learning environments. We will discuss the use of representation formulas as synthetic data generators, and different architectures for the value function, such as polynomial approximation, tensor decompositions, and deep neural networks.
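As a toy version of this pipeline (ours, not the speaker's), one can use a representation formula, here the Riccati solution of a linear-quadratic problem via scipy, as a synthetic data generator and fit a polynomial value function by least squares:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR: dx = (Ax + Bu)dt, cost = integral of x'Qx + u'Ru; value V(x) = x'Px.
d = 4
A, B = np.zeros((d, d)), np.eye(d)
Q, R = np.eye(d), np.eye(d)
P = solve_continuous_are(A, B, Q, R)

# "Synthetic data generator": sample states and exact values V(x) = x'Px.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, d))
V = np.einsum("ni,ij,nj->n", X, P, X)

# Fit a quadratic polynomial model by least squares and check recovery.
Phi = np.stack([X[:, i] * X[:, j] for i in range(d) for j in range(d)], axis=1)
coef, *_ = np.linalg.lstsq(Phi, V, rcond=None)
P_fit = coef.reshape(d, d)
print(np.max(np.abs((P_fit + P_fit.T) / 2 - P)))  # approx. 0
```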

Friday, April 07, 2023
Victor M. Zavala (jointly with Applied Math Seminar), University of Wisconsin-Madison
Graph-Structured Optimization: Properties, Algorithms, and Software


Abstract:

We study properties for nonlinear optimization problems whose structures are induced by graphs. These problems arise in many applications such as dynamic optimization, optimal control, stochastic optimization, optimization with partial differential equations, and network optimization. We show that for a given pair of nodes, the sensitivity of the primal-dual solution at one node against a data perturbation at the other node decays exponentially with respect to the distance between these two nodes on the graph. In other words, the solution sensitivity decays exponentially as one moves away from the perturbation point. We discuss how this property provides new and interesting insights on the behavior of complex systems and how it enables the design of new decomposition and approximation algorithms and software. Specifically, we discuss model predictive control formulations with specialized time discretization approaches and overlapping Schwarz decomposition approaches for solving large-scale problems. We also show how to easily implement graph-structured problems in the Julia programming language.

About the speaker:

Victor M. Zavala is the Baldovin-DaPra Professor in the Department of Chemical and Biological Engineering at the University of Wisconsin-Madison and a computational mathematician in the Mathematics and Computer Science Division at Argonne National Laboratory. He holds a B.Sc. degree from Universidad Iberoamericana and a Ph.D. degree from Carnegie Mellon University, both in chemical engineering. He is on the editorial board of the Journal of Process Control, Mathematical Programming Computation, and Computers & Chemical Engineering. He is a recipient of NSF and DOE Early Career awards and of the Presidential Early Career Award for Scientists and Engineers (PECASE). His research interests include statistics, control, and optimization and applications to energy and environmental systems.
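The decay property is easy to observe numerically; here is a small numpy experiment of our own on a chain graph, a tridiagonal quadratic program, where perturbing the data at one node changes the solution by an amount that decays with graph distance:

```python
import numpy as np

# Chain-structured quadratic program: min 0.5 x'Hx - b'x, H tridiagonal SPD,
# so node i is coupled only to its graph neighbors i-1 and i+1.
n = 60
H = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

x = np.linalg.solve(H, b)
b_pert = b.copy()
b_pert[n // 2] += 1.0                 # perturb the data at the middle node
x_pert = np.linalg.solve(H, b_pert)

dx = np.abs(x_pert - x)
dist = np.abs(np.arange(n) - n // 2)  # graph distance to the perturbation
for k in [0, 5, 10, 20]:
    print(k, dx[dist == k].max())     # decays roughly geometrically in k
```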

Friday, April 14, 2023
East Coast Optimization Meeting (ECOM), George Mason University

Friday, April 21, 2023
Paul Manns (jointly with Applied Math Seminar), TU Dortmund University (Room: Expl. 4106)
Recent advances in mixed-integer PDE constrained optimization: regularization, stationarity, and algorithms

Abstract:

We are concerned with PDE-constrained optimization problems that feature integer-valued optimization variables. We show advantages and limits of relaxation-based solution algorithms and different ways of integrating regularization properties into the problem formulation and algorithms in order to overcome these problems. In particular, the introduction of a total variation penalty term to the objective fundamentally changes properties of the problem and allows us to derive stationarity conditions of the optimization problem and trust-region subproblems by means of local variations of the level sets of the integer-valued functions. We are able to show that the iterates of a trust-region algorithm (solving the aforementioned subproblems) converge to stationary points.


Fall 2022

Date, Speaker, Affiliation, Title
Friday, August 26, 2022
Johannes Royset, The Naval Postgraduate School (Room: Expl. 4106)
Rockafellian functions: The most important concept in optimization that you haven’t heard of


Abstract:

Rockafellian functions are central to sensitivity analysis, optimality conditions, algorithmic developments, and duality. They encode an embedding of an actual problem of interest within a family of problems and lead to broad insights and computational possibilities. We introduce Rockafellians and illustrate their application in stochastic optimization, influence maximization, and inventory control. Through Rockafellian relaxation, we are able to explore a decision space broadly and discover solutions that remain hidden for more conservative approaches to decision making under uncertainty such as distributionally robust optimization.
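In symbols (our paraphrase of the standard definition): a Rockafellian for the problem of minimizing f_0 is any function f that recovers f_0 at the zero perturbation,

```latex
f : \mathbb{R}^m \times \mathbb{R}^n \to \overline{\mathbb{R}},
\qquad f(0, x) = f_0(x) \ \ \text{for all } x,
```

and a Rockafellian relaxation then minimizes f(u, x), possibly plus a penalty on the perturbation u, jointly over (u, x).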

Friday, September 02, 2022
Soledad Villar, Johns Hopkins University (Room: Expl. 4106)
Machine learning that obeys physical law


Abstract:

In this talk we will give an overview of the enormous progress in the last few years, by several research groups, in designing machine learning methods that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations, some make use of high-order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), units scalings, and permutations. We show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincare groups, at any dimensionality d. The key observation is that nonlinear O(d)-equivariant (and related-group-equivariant) functions can be universally expressed in terms of a lightweight collection of (dimensionless) scalars: scalar products and scalar contractions of the scalar, vector, and tensor inputs. We complement our theory with numerical examples that show that the scalar-based method is simple, efficient, and scalable, and mention ongoing work on cosmology simulations.
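The key observation is easy to verify numerically; a tiny numpy check of our own: coefficients that depend only on inner-product scalars yield an O(d)-equivariant map.

```python
import numpy as np

def h(v1, v2):
    """O(d)-equivariant map: coefficients depend only on invariant scalars."""
    g11, g12, g22 = v1 @ v1, v1 @ v2, v2 @ v2  # the invariant scalars
    f1 = np.sin(g11) + g12                     # any functions of the scalars
    f2 = np.exp(-g22) * g12
    return f1 * v1 + f2 * v2

rng = np.random.default_rng(0)
d = 5
v1, v2 = rng.standard_normal(d), rng.standard_normal(d)
Qmat, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix

lhs = h(Qmat @ v1, Qmat @ v2)     # rotate inputs, then apply h
rhs = Qmat @ h(v1, v2)            # apply h, then rotate the output
print(np.max(np.abs(lhs - rhs)))  # approx. 0: equivariant by construction
```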

Friday, September 09, 2022
Rayanne Luke (jointly with Applied Math Seminar), NIST/JHU (Room: Expl. 4106)
Optimal multiclass classification and prevalence estimation with applications to SARS-CoV-2 antibody assays


Abstract:

Antibody tests are routinely used to identify past infection, with examples including Lyme disease and, of course, COVID-19. An accurate classification strategy is crucial to interpreting diagnostic test results and includes problems with more than two classes. Classification is further complicated when the relative fraction of the population in each class, or generalized prevalence, is unknown. In this talk, I will present a prevalence estimation method that is independent of classification and an associated classification scheme that minimizes false classifications. This work hinges on constructing probability models for data that are inputs to an optimal-decision theory framework. As an illustration, I will apply the method to antibody data with SARS-CoV-2 naïve, previously infected, and vaccinated classes. This is based on joint work with Paul Patrone and Anthony Kearsley.

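A minimal sketch (ours, with made-up parameters) of the optimal-decision viewpoint: given class-conditional densities f_c and generalized prevalences q_c, assigning each measurement x to argmax_c q_c f_c(x) minimizes the expected number of false classifications.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1D antibody measurement with three classes (naive / previously
# infected / vaccinated), modeled as Gaussians; parameters are illustrative only.
densities = [norm(0.0, 0.5).pdf, norm(2.0, 0.7).pdf, norm(3.5, 0.6).pdf]
prevalence = np.array([0.5, 0.3, 0.2])  # generalized prevalence q_c

def classify(x):
    """Bayes-optimal class: maximize prevalence-weighted likelihood."""
    scores = np.stack([q * f(x) for q, f in zip(prevalence, densities)])
    return np.argmax(scores, axis=0)

x = np.array([-0.2, 1.8, 3.0, 4.1])
print(classify(x))  # e.g. [0 1 2 2]
```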

Thursday, September 15, 2022
Jose A. Iglesias (jointly with Applied Math Seminar), University of Twente, Netherlands (Room: Expl. 4106)
Geometric convergence in regularization of inverse problems


Abstract:

Variational regularization theory of ill-posed inverse problems with known forward models has a long tradition spanning all the way back to the seminal contributions of Tikhonov in the 1940s. It studies questions like consistency as the regularization parameter and noise level converge to zero simultaneously, generally from a functional-analytic point of view.

Often, and in particular when dealing with imaging applications like deblurring or Radon transform inversion for tomography, the regularization energies used in such approaches contain spatial derivatives. As such, they also have rich analytical backgrounds in terms of properties and regularity of minimizers.

In this talk, I will present some recent work bridging these two areas together. In the regime of vanishing noise and regularization parameter, we obtain results of convergence in Hausdorff distance of level sets of minimizers (which can be interpreted as objects to be recovered in an imaging context) and uniform L^infty bounds. These hold not only with total variation regularization, but also when regularizing with the fractional Laplacian.

Based on joint work with Gwenael Mercier, Kristian Bredies, and Otmar Scherzer.


Friday, September 23, 2022
Sergey Dolgov, University of Bath, UK (Room: Expl. 4106)
TTRISK: Tensor Train Decomposition Algorithm for Risk Averse Optimization


Abstract:

This talk develops a new algorithm named TTRISK to solve high-dimensional risk-averse optimization problems governed by differential equations (ODEs and/or PDEs) under uncertainty. As an example, we focus on the so-called Conditional Value at Risk (CVaR), but the approach is equally applicable to other coherent risk measures. Both the full and reduced space formulations are considered. The algorithm is based on low rank tensor approximations of random fields discretized using stochastic collocation. To avoid non-smoothness of the objective function underpinning the CVaR, we propose an adaptive strategy to select the width parameter of the smoothed CVaR to balance the smoothing and tensor approximation errors. Moreover, an unbiased Monte Carlo CVaR estimate can be computed by using the smoothed CVaR as a control variate. To accelerate the computations, we introduce an efficient preconditioner for the KKT system in the full space formulation. The numerical experiments demonstrate that the proposed method enables accurate CVaR optimization constrained by large-scale discretized systems. In particular, the first example consists of an elliptic PDE with random coefficients as constraints. The second example is motivated by a realistic application to devise a lockdown plan for the United Kingdom under COVID-19. The results indicate that the risk-averse framework is feasible with the tensor approximations under tens of random variables.
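The smoothing step can be illustrated in a few lines (our sketch, independent of the tensor machinery): CVaR_beta(X) = min_t { t + E[(X - t)_+]/(1 - beta) }, with the hinge (.)_+ replaced by a softplus of width delta.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)  # sample losses
beta, delta = 0.9, 0.01

def smoothed_cvar_objective(t):
    # softplus(z) = delta * log(1 + exp(z / delta)) approximates max(z, 0)
    softplus = delta * np.logaddexp(0.0, (X - t) / delta)
    return t + softplus.mean() / (1.0 - beta)

res = minimize_scalar(smoothed_cvar_objective)
exact = X[X >= np.quantile(X, beta)].mean()  # empirical (nonsmooth) CVaR
print(res.fun, exact)                        # close for small delta
```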

Friday, September 30, 2022
Robert J. Baraldi, Sandia National Labs (Room: Expl. 4106)
An Inexact Trust-Region Algorithm for Nonsmooth Nonconvex Regularized Problems


Abstract:

Many inverse problems require minimizing the sum of smooth and nonsmooth functions. For example, basis pursuit denoising applications in data science require minimizing a measure of data misfit plus an L1-regularizer. Similar problems arise in the optimal control of partial differential equations (PDEs) when sparsity of the control is desired. For such applications, it is often impossible to compute exact derivatives or function values due to problem size and complexity. We develop a novel inexact trust-region method to minimize the sum of a smooth nonconvex function and a nonsmooth convex function. The trust-region routine permits and systematically controls the use of inexact objective function and derivative evaluations. When using a quadratic Taylor model for the trust-region subproblem, our algorithm is an inexact, matrix-free proximal Newton-type method that permits indefinite Hessians. Using unconstrained and convex constrained trust-region methods as motivation, we describe various methods for efficiently solving the nonsmooth trust-region subproblem. We also prove global convergence of our method in Hilbert space and demonstrate its efficacy on three examples from data science and PDE-constrained optimization.
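For orientation (not the talk's algorithm, which is an inexact proximal Newton trust-region method), the simplest member of this problem family is proximal gradient descent on a smooth misfit plus an L1 term:

```python
import numpy as np

# min_x 0.5 ||Ax - b||^2 + lam ||x||_1  (a basis-pursuit-denoising-type problem)
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 8, replace=False)] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(80)
lam = 0.1

step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = Lipschitz constant of grad
x = np.zeros(200)
for _ in range(500):
    grad = A.T @ (A @ x - b)            # gradient of the smooth part
    z = x - step * grad
    x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox of L1
print(np.nonzero(x)[0], np.nonzero(x_true)[0])  # supports roughly agree
```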

Friday, October 07, 2022
Deepanshu Verma, Emory University
Advances and Challenges in Solving High-dimensional HJB Equations Arising in Optimal Control


Abstract:

We present a neural network approach for approximately solving high-dimensional stochastic as well as deterministic control problems. Our network design and the training problem leverage insights from optimal control theory. We approximate the value function of the control problem using a neural network and use the Pontryagin maximum principle and dynamic programming principle to express the optimal control (and therefore the sampling) in terms of the value function. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the Hamilton-Jacobi-Bellman equations along the sampled trajectories. As a result, we can obtain the value function in the regions of the state space traveled by optimal trajectories and thereby avoid the curse of dimensionality. Importantly, training is unsupervised in that it does not require solutions of the control problem.

Our approach for stochastic control problems reduces to the method of characteristics as the system dynamics become deterministic. In our numerical experiments, we compare our method to existing solvers for a more general class of semi-linear PDEs. Using a two-dimensional toy problem, we demonstrate the importance of the PMP to inform the sampling. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution. Finally, we consider a PDE-based dynamical system to demonstrate the scalability of our approach.

Ryan Vaughn, Dartmouth
Diffusion Maps for Manifolds with Boundary


Abstract:

Diffusion maps and other graph Laplacian based methods provide a useful mesh-free technique for approximating Laplacian eigenfunctions on a closed Riemannian manifold in the absence of a mesh. However, these techniques suffer from several issues when applied to boundary value problems on manifolds with boundary. In this talk, we present some new results concerning the convergence of the Diffusion Maps algorithm on manifolds with boundary. In particular, we prove convergence in bias of the estimator to the Neumann Laplacian. We then show that by supplementing Diffusion Maps with an additional boundary estimator, one is able to numerically solve Dirichlet and Neumann boundary value problems on Riemannian manifolds with boundary.

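A compact numpy sketch (ours) of vanilla diffusion maps on a point cloud: form a Gaussian kernel, apply the standard density normalization, and take eigenvectors of the resulting Markov matrix as coordinates.

```python
import numpy as np

def diffusion_maps(X, eps, k=4):
    """Basic diffusion maps: leading nontrivial eigenpairs of the Markov
    matrix built from a Gaussian kernel on the point cloud X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    K = np.exp(-d2 / eps)
    q = K.sum(1)
    K = K / np.outer(q, q)             # alpha = 1 normalization (removes density)
    P = K / K.sum(1, keepdims=True)    # row-stochastic Markov matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    return vals.real[order][1:k + 1], vecs.real[:, order][:, 1:k + 1]

# Points on a circle: the leading coordinates recover the sin/cos modes.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
X = np.c_[np.cos(t), np.sin(t)]
vals, coords = diffusion_maps(X, eps=0.05)
print(vals)
```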

Friday, October 14, 2022
Andreas Mang, University of Houston (Room: Expl. 4106)
Fast algorithms for nonlinear optimal control of geodesic flows of diffeomorphisms


Abstract:

In this talk, we will discuss optimal control formulations for diffeomorphic registration and efficient algorithms for their solution. Our contributions are in the design of numerical methods and computational kernels that scale on heterogeneous, high-performance computing platforms. Diffeomorphic registration is an infinite-dimensional, nonlinear inverse problem. The inputs are two views of the same object. Given these views, we seek a spatial transformation $y$ that relates points in one view to its corresponding points in the other. We formulate the problem as a constrained optimization problem with dynamical systems as constraints. We introduce a pseudo-time variable $t$ and parameterize the sought-after mapping $y$ by its velocity $v$. Prescribing suitable regularity requirements for $v$ allows us to ensure that $y$ is a diffeomorphism. We will explore different formulations and discuss various numerical solution strategies.

Our solvers are based on state-of-the-art algorithms to enable fast convergence and short runtime. We will showcase results on real and synthetic data to study the rate of convergence, time-to-solution, numerical accuracy, and scalability of our solvers. As a highlight, we will showcase results for a GPU-accelerated implementation termed CLAIRE that allows us to solve clinically relevant 3D image registration problems with 50 million unknowns to high accuracy in under 5 seconds on a single GPU, and scales up to 100s of GPUs.

This is joint work with George Biros, Miriam Schulte, and others.

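Schematically (our notation, not necessarily the talk's), the transport-equation formulation alluded to above reads:

```latex
% m: transported intensities; m_0, m_1: the two views; v: velocity field;
% R: regularity functional ensuring the induced map y is a diffeomorphism
\min_{v}\;\; \tfrac12\,\| m(\cdot,1) - m_1 \|_{L^2}^2 \;+\; R(v)
\quad \text{s.t.} \quad \partial_t m + v\cdot\nabla m = 0,\qquad m(\cdot,0) = m_0
```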

Friday, October 21, 2022

Friday, October 28, 2022
Daniel McKenzie, Colorado School of Mines
Using Fermat metrics in data science


Abstract:

Fermat metrics are a class of data-driven metrics, defined on Euclidean point clouds. They have their origins in First Passage Percolation but are finding increasing use in data analysis. In this talk I will introduce the Fermat metrics and highlight some important theoretical results. Then, I will discuss some of my recent work on the algorithmic aspects of Fermat metrics and conclude with some recent interesting applications to data science.

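A short scipy sketch (ours) of the sample Fermat distance: raise Euclidean edge lengths of a neighborhood graph to a power p > 1 and take shortest paths, so paths through dense regions of the cloud become cheaper.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from scipy.spatial import cKDTree

def fermat_distances(X, p=3.0, k=10):
    """Sample Fermat distance: shortest-path metric with edge weights ||xi-xj||^p
    on a k-nearest-neighbor graph of the point cloud X."""
    tree = cKDTree(X)
    dists, idx = tree.query(X, k=k + 1)  # k nearest neighbors (self is first)
    n = X.shape[0]
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    weights = dists[:, 1:].ravel() ** p  # the Fermat reweighting
    G = csr_matrix((weights, (rows, cols)), shape=(n, n))
    return shortest_path(G, directed=False)  # all-pairs Fermat distances

X = np.random.default_rng(0).standard_normal((300, 2))
D = fermat_distances(X)
print(D.shape, D[0, :5])
```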

Hugo Díaz
Boundary control of time-harmonic eddy current equations


Abstract:

Motivated by various applications, this talk develops the notion of boundary control for Maxwell's equations in the frequency domain. Surface curl is shown to be the appropriate regularization in order for the optimal control problem to be well-posed. Since all underlying variables are assumed to be complex-valued, the standard results on differentiability do not directly apply. Instead, we extend the notion of Wirtinger derivatives to complexified Hilbert spaces. Optimality conditions are rigorously derived and higher-order boundary regularity of the adjoint variable is established. The state and adjoint variables are discretized using higher-order Nédélec finite elements. The finite element space for controls is identified as a space that preserves the structure of the control regularization. Convergence of the fully discrete scheme is established. The theory is validated by numerical experiments, in some cases motivated by realistic applications.

Friday, November 04, 2022
SIAM Washington-Baltimore Sectional Meeting
No regular colloquium. Please register for the sectional (in-person) meeting.

Friday, November 11, 2022
Uday V. Shanbhag, Pennsylvania State University (Room: Expl. 4106)
Probability Maximization via Minkowski Functionals: Convex Representations and Tractable Resolution


Friday, November 18, 2022
Boris Mordukhovich, Wayne State University
Optimal Control of Sweeping Processes with Applications


Abstract:

This talk is devoted to a novel class of optimal control problems governed by the so-called sweeping (or Moreau) processes that are described by discontinuous dissipative differential inclusions. Although such dynamical processes, strongly motivated by applications, appeared in the 1970s, optimal control problems for them have been formulated quite recently and have turned out to be rather complicated from the viewpoint of developing control theory. Their study and applications require advanced tools of variational analysis and generalized differentiation, which will be presented in the lecture. Combining this machinery with the method of discrete approximations leads us to derive new necessary optimality conditions and their applications to practical models in elastoplasticity, traffic equilibria, robotics, etc.
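For reference, Moreau's sweeping process in its standard form (the controlled variants in the talk build on this):

```latex
% C(t): moving convex set; N_{C(t)}: normal cone; the state is "swept" by C(t)
-\dot{x}(t) \in N_{C(t)}\big(x(t)\big) \ \ \text{for a.e. } t \in [0,T],
\qquad x(0) = x_0 \in C(0)
```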

Friday, November 25, 2022
Thanksgiving (no colloquium)
Friday, December 02, 2022
Andrea Bonito, Texas A&M University (Room: Expl. 4106)
Optimal Learning

Abstract:

We discuss the problem of learning an unknown function from given data, i.e., constructing an approximation to the function that predicts its values away from the data. There are numerous settings for this learning problem depending on what additional information is provided about the unknown function, how the accuracy is measured, what is known about the data and data sites, and whether the data observations are polluted by noise.

We provide a mathematical description of optimal performance (the smallest possible recovery error) in the presence of a model class assumption. Under standard model class assumptions, we show that a near optimal recovery can be found by solving a certain discrete over-parameterized optimization problem with a penalty term. Here, near optimal means that the error is no larger than a constant multiple of the optimal error. This explains the advantage of over-parameterization, which is commonly used in modern machine learning.

This is joint work with Peter Binev, Ronald DeVore, and Guergana Petrova.

Spring 2022
Friday, January 28, 2022
Felix Otto, Max-Planck-Institut für Mathematik in den Naturwissenschaften
Regularity theory for optimal transportation and an application to the matching problem

Abstract:

The optimal transportation of one measure into another, leading to the notion of their Wasserstein distance, is a problem in the calculus of variations with a wide range of applications. The subtle regularity theory for the optimal map is traditionally based on the regularity theory for the Monge-Ampère equation, which was revolutionized by Caffarelli, based on comparison principle arguments.

We present a purely variational approach to the regularity theory for optimal transportation, introduced with M. Goldman and refined with M. Huesmann. Following De Giorgi's philosophy for the regularity theory of minimal surfaces, it is based on the approximation of the displacement by a harmonic gradient, through the construction of a variational competitor.

Due to its robustness and low regularity requirements, this approach is tailor-made for studying the popular problem of matching two independent Poisson point processes. For example, it can be used to prove non-existence of a stationary cyclically monotone coupling, which is joint work with M. Huesmann and F. Mattesini.

Friday, February 04, 2022
Barbara Kaltenbacher, University of Klagenfurt
Reduced, all-at-once, and variational formulations of inverse problems and their iterative solution

Abstract:

The conventional way of formulating inverse problems, such as the identification of a (possibly infinite-dimensional) parameter, is via some forward operator, which is the concatenation of the observation operator with the parameter-to-state map for the underlying model.

Recently, all-at-once formulations have been considered as an alternative to this reduced formulation, avoiding the use of a parameter-to-state map, which would sometimes lead to too restrictive conditions. Here the model and the observation are considered simultaneously as one large system with the state and the parameter as unknowns.

A still more general formulation of inverse problems, containing both the reduced and the all-at-once formulation, but also the well-known and highly versatile so-called variational approach (not to be confused with variational regularization) as special cases, is to formulate the inverse problem as a minimization problem (instead of an equation) for the state and parameter. Regularization can be incorporated by imposing constraints and/or adding regularization terms to the objective.

We will consider iterative regularization methods resulting from the application of gradient or Newton type iterations to such minimization based formulations and provide convergence results. In doing so, instead of regularizing the minimization problem and then applying standard iterative optimization methods, we regularize *by* iterating, more precisely by early stopping.
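
As a concrete instance of "regularizing by iterating", the following sketch applies Landweber iteration to a small linear, reduced-formulation problem A x = y and stops early via the discrepancy principle; the operator, noise level, and stopping constant below are illustrative assumptions, not taken from the talk.

import numpy as np

rng = np.random.default_rng(0)
n = 50
A = np.triu(np.ones((n, n))) / n             # smoothing forward operator (cumulative averaging)
x_true = np.sin(np.linspace(0.0, np.pi, n))
delta = 1e-2                                  # assumed componentwise noise level
y = A @ x_true + delta * rng.standard_normal(n)

# Landweber iteration x <- x + w*A^T(y - A x), regularized purely by early
# stopping via the discrepancy principle ||A x - y|| <= tau*delta*sqrt(n).
w = 1.0 / np.linalg.norm(A, 2) ** 2
tau = 1.1
x = np.zeros(n)
for k in range(100_000):
    r = y - A @ x
    if np.linalg.norm(r) <= tau * delta * np.sqrt(n):
        break
    x = x + w * A.T @ r
print("stopped at iteration", k, "relative error:",
      np.linalg.norm(x - x_true) / np.linalg.norm(x_true))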

Friday, February 11, 2022
Vince Lyzinski, University of Maryland
The Importance of Being Correlated: Implications of Dependence in Joint Spectral Inference across Multiple Networks

Abstract:

Spectral inference on multiple networks is a rapidly-developing subfield of graph statistics. Recent work has demonstrated that joint, or simultaneous, spectral embedding of multiple independent networks can deliver more accurate estimation than individual spectral decompositions of those same networks. Such inference procedures typically rely heavily on independence assumptions across the multiple network realizations, and even in this case, little attention has been paid to the induced network correlation in such joint embeddings. Here, we present a generalized omnibus embedding methodology and provide a detailed analysis of this embedding across both independent and correlated networks, the latter of which significantly extends the reach of such procedures. We describe how this omnibus embedding can itself induce correlation, leading us to distinguish between inherent correlation -- the correlation that arises naturally in multisample network data -- and induced correlation, which is an artifice of the joint embedding methodology. We show that the generalized omnibus embedding procedure is flexible and robust, and prove both consistency and a central limit theorem for the embedded points. We examine how induced and inherent correlation can impact inference for network time series data, and we provide network analogues of classical questions such as the effective sample size for more generally correlated data. Further, we show how an appropriately calibrated generalized omnibus embedding can detect changes in real biological networks that previous embedding procedures could not discern, confirming that the effect of inherent and induced correlation can be subtle and transformative, with import in theory and practice.
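
For readers unfamiliar with the omnibus construction, here is a minimal sketch of the classical (unweighted) version for graphs on a shared vertex set; the generalized embedding discussed in the talk replaces the plain pairwise averages in the off-diagonal blocks with weighted combinations, and the random-graph model below is an illustrative assumption.

import numpy as np

def omnibus_embed(adjs, d):
    """Jointly embed m graphs on n shared vertices; row block i holds graph i."""
    m, n = len(adjs), adjs[0].shape[0]
    M = np.empty((m * n, m * n))
    for i in range(m):
        for j in range(m):
            # Diagonal blocks are the graphs; off-diagonal blocks are pairwise averages.
            M[i * n:(i + 1) * n, j * n:(j + 1) * n] = (adjs[i] + adjs[j]) / 2.0
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]             # d leading eigenpairs
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

rng = np.random.default_rng(1)
n = 100
def sample_er(p):                                        # independent Erdos-Renyi graphs
    A = np.triu((rng.random((n, n)) < p), 1).astype(float)
    return A + A.T

X = omnibus_embed([sample_er(0.3), sample_er(0.3)], d=1)
print(X[:n].mean(), X[n:].mean())                        # both near +/- sqrt(0.3)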

Friday, February 18, 2022
Daniel Wachsmuth, Julius Maximilians University of Würzburg
Proximal gradient methods for control problems with non-smooth and non-convex control cost

Abstract:

We investigate the convergence of the proximal gradient method applied to control problems with non-smooth and non-convex control cost. Here, we focus on control cost functionals that promote sparsity, which includes functionals of L^p-type for p in [0, 1). We prove stationarity properties of weak limit points of the method. These properties are weaker than those provided by Pontryagin’s maximum principle and weaker than L-stationarity.
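
A minimal finite-dimensional sketch of the kind of iteration being analyzed, using the L^0 penalty, whose proximal map is hard thresholding (the talk works in function space with L^p, p in [0, 1); the data below are illustrative assumptions).

import numpy as np

def prox_l0(v, lam_tau):
    """Prox of lam*||.||_0 with step tau: hard thresholding at sqrt(2*lam*tau)."""
    out = v.copy()
    out[v ** 2 <= 2.0 * lam_tau] = 0.0
    return out

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
b = A @ x_true

lam, tau = 2.0, 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(100)
for _ in range(500):
    # Proximal gradient step: gradient descent on 0.5*||Ax - b||^2, then prox.
    x = prox_l0(x - tau * A.T @ (A @ x - b), lam * tau)
print(np.nonzero(x)[0])        # recovered support (often exactly [3, 17, 42])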

Friday, February 25, 2022
Prasanna Balaprakash, Argonne National Laboratory
Automated Machine Learning with DeepHyper

Abstract:

Scientific data sets are diverse and often require data-set-specific deep neural network (DNN) models. Nevertheless, designing a high-performing DNN architecture for a given data set is an expert-driven, time-consuming, trial-and-error manual task. To address this, we have developed DeepHyper [1], a software package that uses scalable neural architecture and hyperparameter search to automate the design and development of DNN models for scientific and engineering applications. In this talk, we will focus on two new algorithmic components that we developed recently. The first is DeepHyper/AgEBO [2], which seeks to reduce the overall computation time by combining Aging Evolution (AE) to search over neural architectures with asynchronous Bayesian optimization (BO) to tune the hyperparameters of data-parallel training. The second is DeepHyper/AutoDEUQ [3], an automated approach for generating an ensemble of deep neural networks and using them to estimate aleatoric (data) and epistemic (model) uncertainties.

[1] https://deephyper.readthedocs.io/en/latest/

[2] R. Egele, P. Balaprakash, I. Guyon, V. Vishwanath, F. Xia, R. Stevens, Z. Liu. AgEBO-Tabular: Joint neural architecture and hyperparameter search with autotuned data-parallel training for tabular data. In SC21: International Conference for High Performance Computing, Networking, Storage and Analysis, 2021.

[3] R. Egele, R. Maulik, K. Raghavan, P. Balaprakash, B. Lusch. AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification, (in review), 2021.

Friday, March 04, 2022
Richard J. Braun, University of Delaware
Automated Tear Breakup Detection and Modeling on the Ocular Surface

Abstract:

The tear film is a thin fluid multilayer left on the eye surface after a blink. A good tear film is essential for the health and proper function of the eye, yet millions have a condition called dry eye disease (DED) that inhibits vision and may lead to inflammation and ocular surface damage. However, there is little quantitative data about tear film failure, often called tear breakup (TBU). Currently, it is not possible to directly measure important variables, such as tear osmolarity (saltiness), within areas of TBU. We present a (mostly) automatic method that we have developed to extract data from video of the tear film dyed with fluorescein (for visualization). We have data for 15 healthy subjects comprising 467 instances of TBU. Using parameter identification from fits to appropriate math models, we estimate which mechanisms are most important in TBU and estimate variables such as osmolarity within regions of TBU. Not only is new data obtained, but far more of it, enabling statistical methods to be applied. So far, the methods provide baseline data for TBU in healthy subjects; future work will produce data from DED subjects.

Friday, March 11, 2022
Mark Embree, Virginia Tech
CUR Matrix Factorizations: Algorithms, Analysis, Applications

Abstract:

Interpolatory matrix factorizations provide alternatives to the singular value decomposition (SVD) for obtaining low-rank approximations; this class includes the CUR factorization, where the C and R matrices are subsets of columns and rows of the target matrix. While interpolatory approximations lack the SVD's optimality, their ingredients are easier to interpret than singular vectors: since they are copied from the matrix itself, they inherit the data's key properties (e.g., nonnegative/integer values, sparsity, etc.). We shall provide an overview of these approximate factorizations, show how they can be analyzed using interpolatory projectors, and describe a method for their construction based on the Discrete Empirical Interpolation Method (DEIM). We will then describe use of this factorization for two applications: footstep analysis from building vibrations, and identification of representative survey responses from a text collection.

This talk describes joint work with Dan Sorensen, Pablo Tarazaga, Ed Gitre, and students.
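
A minimal sketch of a DEIM-based CUR factorization in the spirit described above: DEIM applied to the leading singular vectors selects column and row indices, and A is approximated by C U R with C and R copied directly from A (an illustration of the idea, not the speakers' exact algorithm; the test matrix is an assumption).

import numpy as np

def deim(V):
    """Greedy DEIM index selection from an n-by-k basis V."""
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, V.shape[1]):
        c = np.linalg.solve(V[idx, :j], V[idx, j])
        r = V[:, j] - V[:, :j] @ c            # interpolation residual
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

def cur(A, k):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    rows, cols = deim(U[:, :k]), deim(Vt[:k, :].T)
    C, R = A[:, cols], A[rows, :]             # actual columns/rows of A
    Umid = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, Umid, R

rng = np.random.default_rng(3)
Q1, _ = np.linalg.qr(rng.standard_normal((200, 30)))
Q2, _ = np.linalg.qr(rng.standard_normal((150, 30)))
A = Q1 @ np.diag(2.0 ** -np.arange(30)) @ Q2.T   # matrix with decaying spectrum
C, Umid, R = cur(A, 10)
print(np.linalg.norm(A - C @ Umid @ R) / np.linalg.norm(A))   # small relative error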

Friday, March 18, 2022
Spring break (no colloquium)

Friday, March 25, 2022
Steven Rodriguez, Naval Research Lab
Enabling Rapid Meshless Multiphysics Modeling with Hybrid Data-Driven Projection Tree Reduced-Order Modeling

Friday, April 01, 2022
East Coast Optimization Meeting (ECOM), George Mason University

Friday, April 08, 2022
No colloquium

Friday, April 15, 2022
Kaushik Bhattacharya, Caltech
Multi-scale modeling of materials and neural operators

Abstract:

The behavior of materials involves physics at multiple length and time scales: electronic, atomistic, domains, defects, etc. The engineering properties that we observe and exploit in applications are a sum total of all these interactions. Multiscale modeling seeks to understand this complexity with a divide-and-conquer approach. It introduces an ordered hierarchy of scales and postulates that the interaction is pairwise within this hierarchy: the coarser scale controls the finer scale and filters its details. Still, the practical implementation of this approach is computationally challenging. This talk introduces the notion of neural operators as controlled approximations of operators mapping one function space to another and explains how they can be used for multiscale modeling. They lead to extremely high-fidelity models that capture all the details of the small scale but can be implemented directly at the coarse scale in a computationally efficient manner. We demonstrate the ideas with examples drawn from a first-principles study of defects and a crystal-plasticity study of inelastic impact.

About the speaker:

Kaushik Bhattacharya is Howell N. Tyson, Sr., Professor of Mechanics and Professor of Materials Science as well as the Vice-Provost at the California Institute of Technology. He received his B.Tech degree from the Indian Institute of Technology, Madras, India in 1986, his Ph.D from the University of Minnesota in 1991 and his post-doctoral training at the Courant Institute for Mathematical Sciences during 1991-1993. He joined Caltech in 1993. His research concerns the mechanical behavior of materials, and specifically uses theory to guide the development of new materials. He has received the von Kármán Medal of the Society of Industrial and Applied Mathematics (2020), Distinguished Alumni Award of the Indian Institute of Technology, Madras (2019), the Outstanding Achievement Award of the University of Minnesota (2018), the Warner T. Koiter Medal of the American Society of Mechanical Engineering (2015) and the Graduate Student Council Teaching and Mentoring Award at Caltech (2013).

Friday, April 22, 2022
Bethany Lusch, Argonne National Laboratory
Integrating Machine-Learned Surrogate Models with Simulations

Abstract:

Simulations can be computationally expensive, so it can be advantageous to use machine learning to train a surrogate model that is orders of magnitude faster. However, completely data-driven black-box models often have disadvantages such as limited generalizability and the chance of physically-impossible predictions. I will describe our recent work on surrogate modeling for applications such as automotive engines and weather, as well as how we are creating hybrid models by integrating surrogate models back into simulations.

About the speaker:

Dr. Bethany Lusch is an Assistant Computer Scientist in the data science group at the Argonne Leadership Computing Facility at Argonne National Lab. Her research expertise includes developing methods and tools to integrate AI with science, especially for dynamical systems and PDE-based simulations. She holds a PhD and MS in applied mathematics from the University of Washington and a BS in mathematics from the University of Notre Dame.

Friday, April 29, 2022
MathWorks

Fall 2021
Friday, August 27, 2021
Howard Elman, University of Maryland College Park
Surrogate Approximation of the Grad-Shafranov Free Boundary Problem via Stochastic Collocation on Sparse Grids

Abstract:

In magnetic confinement fusion devices, the equilibrium configuration of a plasma is determined by the balance between the hydrostatic pressure in the fluid and the magnetic forces generated by an array of external coils and the plasma itself. The location of the plasma is not known a priori and must be obtained as the solution to a free boundary problem. The partial differential equation that determines the behavior of the combined magnetic field depends on a set of physical parameters (location of the coils, intensity of the electric currents going through them, magnetic permeability, etc.) that are subject to uncertainty and variability. The confinement region is in turn a function of these stochastic parameters as well. In this work, we consider variations on the current intensities running through the external coils as the dominant source of uncertainty. This leads to a parameter space of dimension equal to the number of coils in the reactor. With the aid of a surrogate function built on a sparse grid in parameter space, a Monte Carlo strategy is used to explore the effect that stochasticity in the parameters has on important features of the plasma boundary. The use of the surrogate function reduces the time required for the Monte Carlo simulations by factors that range between 7 and over 30. Joint work with Jiaxing Liang (Applied Mathematics Program, University of Maryland) and Tonatiuh Sánchez-Vizuet (Department of Mathematics, University of Arizona).
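
The surrogate-plus-Monte-Carlo pattern can be sketched in a few lines; the toy below replaces the sparse-grid collocation surrogate of the talk with a one-dimensional Chebyshev interpolant, and the "expensive" model is a stand-in for the free-boundary solve (all names and numbers are assumptions).

import time
import numpy as np

def expensive_qoi(theta):
    """Stand-in for a PDE solve mapping a parameter to a quantity of interest."""
    time.sleep(1e-3)
    return np.sin(3 * theta) + 0.5 * theta ** 2

# Build the surrogate once, on a handful of Chebyshev collocation points in [-1, 1].
k = 12
nodes = np.cos((2 * np.arange(k) + 1) * np.pi / (2 * k))
coeffs = np.polynomial.chebyshev.chebfit(nodes, [expensive_qoi(t) for t in nodes], k - 1)
surrogate = lambda t: np.polynomial.chebyshev.chebval(t, coeffs)

# Monte Carlo on the cheap surrogate instead of the expensive model.
rng = np.random.default_rng(4)
samples = rng.uniform(-1.0, 1.0, 100_000)
print("surrogate MC mean:", surrogate(samples).mean())
# Direct MC at 1 ms per model evaluation would take ~100 s; the surrogate
# needed only k = 12 expensive evaluations.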

Friday, September 03, 2021
Meenakshi Singh, Colorado School of Mines
Investigating quantum speed limits with superconducting qubits

Abstract:

The speed at which quantum entanglement between qubits with short-range interactions can be generated is limited by the Lieb-Robinson bound. Introducing longer-range interactions relaxes this bound, and entanglement can be generated at a faster rate. The speed limit for this has been found analytically only for a two-qubit system under the assumption of negligible single-qubit gate time. We seek to demonstrate this speed limit experimentally using two superconducting transmon qubits. Moreover, we aim to measure the increase in this speed limit induced by introducing additional qubits (coupled with the first two). Since the speed-up grows with the number of entangled qubits, it is expected to increase with system size. This has important implications for large-scale quantum computing.

Bio:

Dr. Singh is an experimental physicist with research focused on quantum thermal effects and quantum computing. She graduated from the Indian Institute of Technology with an M.S. in Physics in 2006 and received a Ph.D. in Physics from the Pennsylvania State University in 2012. Her Ph.D. thesis focused on quantum transport in nanowires. She went on to work at Sandia National Laboratories on quantum computing as a post-doctoral scholar. Since 2017, she has been an Assistant Professor in the Department of Physics at the Colorado School of Mines. At Mines, her research projects include measurements of entanglement propagation and thermal effects in superconducting hybrids. She recently received the NSF CAREER award to pursue research on phonon interactions with spin qubits in silicon quantum dots.

Friday, September 10, 2021
Minh-Binh Tran, Southern Methodist University
On the wave turbulence theory for stochastic and random multidimensional KdV type equations

Abstract:

In this talk, I consider a multidimensional KdV type equation, the Zakharov-Kuznetsov (ZK) equation. I will present a derivation of the 3-wave kinetic equation from both the stochastic ZK equation and the deterministic ZK equation with random initial condition. The equation is given on a hypercubic lattice of size . In the case of the stochastic ZK equation, I will show that the two point correlation function can be asymptotically expressed as the solution of the 3-wave kinetic equation at the kinetic limit under very general assumptions, in which the initial condition is out of equilibrium and the size of the domain is fixed. In the case of the deterministic ZK equation with random initial condition, the kinetic equation can also be derived at the kinetic limit, but under more restrictive assumptions. This is a joint work with Gigliola Staffilani (MIT).

Friday, September 17, 2021

Friday, September 24, 2021
Martin Burger, Friedrich-Alexander Universität Erlangen-Nürnberg
A Bregman Learning Framework for Sparse Neural Networks

Abstract:

This talk will discuss a novel learning framework based on stochastic Bregman iterations. It allows one to train sparse neural networks with an inverse scale space approach, starting from a very sparse network and gradually adding significant parameters. Apart from a baseline algorithm called LinBreg, we will discuss an accelerated version using momentum, and AdaBreg, which is a Bregmanized generalization of the Adam algorithm. Moreover, a statistically grounded sparse parameter initialization strategy, a stochastic convergence analysis of the loss decay, and additional convergence proofs in the convex regime can be derived. The Bregman learning framework can also be applied to Neural Architecture Search, e.g., to unveil autoencoder architectures for denoising or deblurring tasks.
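
A minimal sketch of a linearized Bregman iteration in the simplest setting of sparse linear regression (an illustration of the inverse-scale-space behavior only, not the paper's LinBreg for network parameters; the data below are assumptions): the iterate starts fully sparse, and parameters enter the model gradually as their subgradient variables cross the threshold.

import numpy as np

def shrink(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

rng = np.random.default_rng(5)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[[5, 50, 120]] = [2.0, -1.5, 3.0]
b = A @ x_true

lam = 1.0
tau = 1.0 / np.linalg.norm(A, 2) ** 2
v = np.zeros(200)                     # subgradient (dual) variable
x = np.zeros(200)                     # primal iterate: starts fully sparse
for _ in range(2000):
    v -= tau * A.T @ (A @ x - b)      # gradient step on the loss, taken in v
    x = shrink(v, lam)                # parameters activate once |v_i| > lam
print(np.nonzero(x)[0])               # support grows over the iterations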

Friday, October 01, 2021
Youssef M. Marzouk, MIT
Transport methods for simulation-based inference and data assimilation

Abstract:

Many practical Bayesian inference problems fall into the "likelihood-free" setting, where evaluations of the likelihood function or prior density are unavailable or intractable; instead one can only simulate (i.e., draw samples from) the associated distributions. I will discuss how transportation of measure can help solve such problems, by constructing maps that push prior samples, or samples from a joint parameter-data prior, to the desired conditional distribution. These methods have broad utility for inference in stochastic and generative models, as well as for data assimilation problems motivated by geophysical applications. Key issues in this construction center on: (1) the estimation of transport maps from few samples; and (2) parameterizations of monotone maps. I will discuss developments on both fronts, including some recent efforts in joint dimension reduction for conditional sampling.

As an example, I will present an approach to nonlinear filtering in dynamical systems which uses sparse triangular transport maps to produce robust approximations of the filtering distribution in high dimensions. The approach can be understood as the natural generalization of the ensemble Kalman filter (EnKF) to nonlinear updates, and can reduce the intrinsic bias of the EnKF at a marginal increase in computational cost.

This is joint work with Ricardo Baptista, Alessio Spantini, Olivier Zahm, and Jakob Zech.
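
In one dimension the monotone (triangular) map reduces to a composition of CDFs, T = F_target^{-1}(F_source(x)), which can be estimated from samples alone, in the likelihood-free spirit of the talk. A minimal sketch with assumed toy distributions:

import numpy as np

rng = np.random.default_rng(6)
source = rng.standard_normal(5000)            # samples from a "prior"
target = rng.gamma(shape=2.0, size=5000)      # samples from the desired distribution

def monotone_map(x, src, tgt):
    """Push x through the empirical map F_tgt^{-1}(F_src(x))."""
    u = np.searchsorted(np.sort(src), x) / len(src)   # approximate F_src(x)
    return np.quantile(tgt, np.clip(u, 0.0, 1.0))     # approximate F_tgt^{-1}(u)

pushed = monotone_map(source, source, target)
print(pushed.mean(), target.mean())           # both near 2.0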

Friday, October 08, 2021
Sven Leyffer, Argonne National Laboratory
Mixed-Integer PDE-Constrained Optimization

Abstract:

Many complex applications can be formulated as optimization problems constrained by partial differential equations (PDEs) with integer decision variables. Examples include the design and control of gas networks, disaster recovery, and topology optimization, and are referred to as mixed-integer PDE-constrained optimization problems, or MIPDECOs. We present the problem of designing an electromagnetic cloak as a MIPDECO with integer-valued control inputs that are distributed in the computational domain. We show that the problems can be solved by optimizing only the continuous relaxations of the approximations and then applying a sum-up rounding methodology to obtain integer-valued controls. These controls are shown to converge and exhibit the desired approximation properties under suitable refinements of the discretizations. We also propose a trust-region method that solves a sequence of linear integer programs to tackle more general integer optimal control problems regularized with a total variation penalty. The total variation penalty allows us to prove the existence of minimizers of the integer optimal control problem, and we present efficient computational techniques for solving these problems.
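
Sum-up rounding itself is only a few lines for a single control on a uniform grid: choose the binary value so that the accumulated integral of the rounding error never reaches half a cell. A minimal sketch (the relaxed control profile is an illustrative assumption):

import numpy as np

def sum_up_rounding(w, dt):
    """Round a relaxed cellwise control w in [0,1] to a 0/1 control whose
    accumulated integral tracks the relaxed one to within dt."""
    b = np.zeros_like(w)
    for i in range(len(w)):
        if np.sum(w[: i + 1]) * dt - np.sum(b[:i]) * dt >= dt / 2:
            b[i] = 1.0
    return b

t, dt = np.linspace(0.0, 1.0, 21)[:-1], 0.05
w = 0.5 * (1.0 + np.sin(2 * np.pi * t))       # relaxed (fractional) control
b = sum_up_rounding(w, dt)
print(np.sum(w) * dt, np.sum(b) * dt)          # the two integrals agree to within dt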

Friday, October 15, 2021
Martina Bukac, University of Notre Dame
Adaptive time-stepping methods for fluid-structure interaction problems

Abstract:

In realistic flow problems described by partial differential equations (PDEs), where the dynamics are not known or the variables change rapidly, robust adaptive time-stepping is central to accurately and efficiently predicting the long-term behavior of the solution. This is especially important in coupled flow problems, such as fluid-structure interaction (FSI), which often exhibit complex dynamic behavior. While adaptive spatial mesh refinement techniques are well established and widely used, less attention has been given to adaptive time-stepping methods for PDEs. We will discuss novel, adaptive, partitioned numerical methods for FSI problems with thick and thin structures. The time integration in the proposed methods is based on the refactorized Cauchy one-legged 'theta-like' method, which consists of a backward Euler method, in which the fluid and structure sub-problems are sub-iterated until convergence, followed by a forward Euler method. The bulk of the computation is done by the backward Euler method, as the forward Euler step is equivalent to (and implemented as) a linear extrapolation. We will present numerical analysis of the proposed methods, showing linear convergence of the sub-iterative process and unconditional stability. Time adaptation strategies will be discussed. The properties of the methods, as well as the selection of the parameters used in the adaptive process, will be explored in numerical examples.
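
The time-stepping kernel described above can be sketched for a scalar test ODE u' = lam*u: a backward Euler solve to the intermediate time t_n + theta*dt, then a forward Euler step that is algebraically a linear extrapolation (the FSI sub-iteration of the talk is omitted; lam, dt, and theta below are illustrative assumptions).

import numpy as np

LAM = -2.0                                   # assumed linear test ODE u' = LAM*u

def theta_step(u_n, dt, theta):
    # Backward Euler on [t_n, t_n + theta*dt], solved exactly for the linear
    # test problem (a nonlinear f would need a Newton or fixed-point solve,
    # which is where the partitioned sub-iterations enter).
    u_theta = u_n / (1.0 - theta * dt * LAM)
    # Forward Euler on the remainder of the step == linear extrapolation:
    return (1.0 / theta) * u_theta - ((1.0 - theta) / theta) * u_n

u, dt, theta = 1.0, 0.1, 0.5
for _ in range(10):
    u = theta_step(u, dt, theta)
print(u, np.exp(LAM * 1.0))                  # theta = 1/2 is second order (midpoint)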

Friday, October 22, 2021
Patrick Farrell, University of Oxford
Computing multiple solutions of PDEs with deflation

Abstract:

Computing the distinct solutions u of an equation f(u,λ)=0 as a parameter λ ∈ ℝ is varied is a central task in applied mathematics and engineering. The solutions are captured in a bifurcation diagram, plotting (some functional of) u as a function of λ. In this talk I will present a useful idea, deflation, for this task.

Deflation has three advantages. First, it is capable of computing disconnected bifurcation diagrams; previous algorithms only aimed to compute that part of the bifurcation diagram continuously connected to the initial data. Second, its implementation is very simple: it only requires a minor modification to an existing Newton-based solver. Third, it can scale to very large discretisations if a good preconditioner is available; no auxiliary problems must be solved.

We will present applications to hyperelastic structures, liquid crystals, and Bose-Einstein condensates, and discuss how PDE-constrained optimisation problems may be solved to design systems with certain bifurcation properties.
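
The "minor modification" mentioned above can be shown on a scalar example: multiply the residual by a deflation factor for each known root, then rerun Newton (a toy sketch with an assumed polynomial and a simple choice of deflation exponent and shift, not the speaker's implementation).

f = lambda u: u**3 - u                    # roots: -1, 0, 1
roots = []

def deflated(u):
    """Residual with each known root deflated by the factor 1/(u - r)^2 + 1."""
    val = f(u)
    for r in roots:
        val *= 1.0 / (u - r) ** 2 + 1.0
    return val

for guess in [2.0, 2.0, -2.0]:            # the second run reuses the first guess
    u, h = guess, 1e-7
    for _ in range(100):
        slope = (deflated(u + h) - deflated(u - h)) / (2 * h)
        du = deflated(u) / slope          # Newton step with a numerical derivative
        u -= du
        if abs(du) < 1e-12:
            break
    roots.append(u)

print(roots)                              # approximately [1, 0, -1]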

Friday, October 29, 2021

Friday, November 05, 2021
No colloquium

Friday, November 12, 2021
Barbara Kaltenbacher, Universität Klagenfurt (AAU)
Reduced, all-at-once, and variational formulations of inverse problems and their iterative solution

Abstract:

The conventional way of formulating inverse problems, such as the identification of a (possibly infinite-dimensional) parameter, is via some forward operator, which is the concatenation of the observation operator with the parameter-to-state map for the underlying model.

Recently, all-at-once formulations have been considered as an alternative to this reduced formulation, avoiding the use of a parameter-to-state map, which would sometimes lead to too restrictive conditions. Here the model and the observation are considered simultaneously as one large system with the state and the parameter as unknowns.

A still more general formulation of inverse problems, containing both the reduced and the all-at-once formulation, but also the well-known and highly versatile so-called variational approach (not to be confused with variational regularization) as special cases, is to formulate the inverse problem as a minimization problem (instead of an equation) for the state and parameter. Regularization can be incorporated by imposing constraints and/or adding regularization terms to the objective.

We will consider iterative regularization methods resulting from the application of gradient or Newton type iterations to such minimization based formulations and provide convergence results. In doing so, instead of regularizing the minimization problem and then applying standard iterative optimization methods, we regularize *by* iterating, more precisely by early stopping.

Friday, November 19, 2021
Alfred Hero, University of Michigan
Learning to benchmark

Abstract:

We address the problem of learning an achievable lower bound on classification error from a labeled sample. We establish an optimization framework for this meta-learning problem, which we call benchmark learning. Benchmark learning leads to an accurate data-driven predictor of an achievable lower bound on misclassification error probability without having to construct any classifier and without assuming any parametric model for the data. The resultant predictor can be used to establish whether it is possible to improve the classification performance of any specific classifier. It also yields a stopping rule for sequentially trained classifiers. The talk will cover relevant background, theory, algorithms, and applications of benchmark learning.

Friday, November 26, 2021
no colloquium (Thanksgiving)

Friday, December 03, 2021
Andrew Gillette, LLNL
Delaunay interpolation diagnostics for model assessment

Abstract:

Surrogate models, reduced order models, and trained neural networks are now ubiquitous in scientific applications, yet the metrics used to evaluate their accuracy remain heuristic and application-dependent. We will address the challenge of model assessment from the perspective of function approximation: given only the ability to evaluate a function f : R^d --> R^k on some set of inputs from R^d, what can we conclude about the properties of f itself? Using a scalable Delaunay-based interpolation method, we build a sequence of piecewise linear approximations of f and compute their rate of convergence. The technique is inspired by classical a priori convergence error estimates for finite element methods. Initial results indicate this rate can help identify multi-scale behavior, requisite sampling density, and regions of near-linearity in a model. We will discuss applications of the approach to the iterative selection of parameters for inertial confinement fusion simulations carried out at LLNL.
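
To make the diagnostic concrete, here is a minimal sketch of the idea as I read it, using SciPy's Delaunay-based LinearNDInterpolator rather than the scalable LLNL implementation, with a synthetic smooth f as a stand-in: build piecewise linear interpolants at growing sample sizes and fit the empirical convergence rate.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator  # Delaunay-based

def f(x):  # smooth test function R^2 -> R (illustrative assumption)
    return np.sin(3.0 * x[:, 0]) * np.cos(2.0 * x[:, 1])

rng = np.random.default_rng(1)
X_test = rng.uniform(0.2, 0.8, (500, 2))  # stay inside the convex hull

sizes, errors = [200, 400, 800, 1600], []
for n in sizes:
    X = rng.uniform(0.0, 1.0, (n, 2))
    interp = LinearNDInterpolator(X, f(X))        # triangulates X internally
    errors.append(np.nanmax(np.abs(interp(X_test) - f(X_test))))

# For piecewise linear interpolation of a smooth f in d dimensions the error
# scales like h^2 ~ n^(-2/d); with d = 2 the log-log slope should approach -1.
rate = np.polyfit(np.log(sizes), np.log(errors), 1)[0]
print("observed convergence rate:", rate)
```

A rate far from the smooth-case prediction is the diagnostic signal: it can indicate multi-scale behavior or insufficient sampling density, as described in the abstract.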

About the speaker:

Andrew Gillette is a computational mathematician at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory. He works on a variety of projects involving numerical methods for PDEs, high-dimensional function approximation, computational geometry, and machine learning. Prior to joining CASC in 2019, Andrew was an Associate Professor of Mathematics at the University of Arizona.


Spring 2021
Friday, January 29, 2021
Irene Fonseca, Carnegie Mellon University
Geometric Flows and Phase Transitions in Heterogeneous Media

Abstract:

We present the first unconditional convergence results for an Allen-Cahn type bi-stable reaction-diffusion equation in a periodic medium. Our limiting dynamics are given by an analog, for anisotropic mean curvature flow, of the formulation due to Ken Brakke. As an essential ingredient in the analysis, we obtain an explicit expression for the effective surface tension, which dictates the limiting anisotropic mean curvature.

This is joint work with Rustum Choksi (McGill), Jessica Lin (McGill), and Raghavendra Venkatraman (CMU).
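
For orientation, a standard constant-coefficient Allen-Cahn toy is sketched below; a shrinking circular interface is the textbook manifestation of mean curvature flow. The discretization choices (periodic grid, explicit Euler, all parameter values) are illustrative assumptions, not the heterogeneous setting of the talk.

```python
import numpy as np

# Allen-Cahn: u_t = Lap(u) + (u - u^3) / eps^2; sharp-interface limit is
# mean curvature flow, so a disk of radius R shrinks with speed ~ 1/R.
N, eps, dt, steps = 128, 0.05, 1.0e-5, 3000
x = np.linspace(0.0, 1.0, N, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.where((X - 0.5) ** 2 + (Y - 0.5) ** 2 < 0.3 ** 2, 1.0, -1.0)  # disk

h2 = (1.0 / N) ** 2
for _ in range(steps):
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u) / h2
    u += dt * (lap + (u - u ** 3) / eps ** 2)

print("area fraction of +1 phase:", float(np.mean(u > 0.0)))  # has shrunk
```

The talk's heterogeneous setting replaces the Laplacian by periodic coefficients, and the limiting flow then carries the effective anisotropic surface tension mentioned above.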


Friday, February 05, 2021
Georg Stadler, New York University
Estimation of extreme event probabilities in systems governed by PDEs

Abstract:

We propose methods for the estimation of extreme event probabilities in complex systems governed by PDEs. Our approach is guided by ideas from large deviation theory (LDT) and borrows methods from PDE-constrained optimization. The systems under consideration involve random parameters, and we are interested in quantifying the probability that a scalar function of the system state is at or above a threshold. The proposed methods initially solve an optimization problem over the set of parameters leading to events above the threshold. Based on solutions of this PDE-constrained optimization problem, we propose (1) an importance sampling method and (2) a method that uses curvature information of the extreme event boundary to estimate small probabilities. We illustrate the application of our approach by quantifying the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process and use the one-dimensional shallow water equation to model tsunamis; the PDE-constrained optimization problem arising in this application is thus governed by the shallow water equation. This is joint work with Shanyin Tong and Eric Vanden-Eijnden from NYU.
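
A finite-dimensional caricature of the two-step strategy (my own toy with an invented observable, not the shallow-water setting of the talk): solve the LDT optimization for the most likely point in the extreme set, then center an importance-sampling proposal there.

```python
import numpy as np
from scipy.optimize import minimize

t = 4.0
F = lambda z: z[..., 0] + 0.5 * z[..., 1] ** 2   # toy observable of z ~ N(0, I)

# Step 1: LDT minimizer  z* = argmin ||z||^2 / 2  subject to  F(z) >= t.
res = minimize(lambda z: 0.5 * z @ z, x0=np.array([1.0, 1.0]), method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda z: F(z) - t}])
z_star = res.x

# Step 2: importance sampling with proposal N(z*, I); the weight is the
# density ratio dN(0,I)/dN(z*,I)(z) = exp(-z.z* + ||z*||^2 / 2).
rng = np.random.default_rng(0)
Z = rng.standard_normal((200_000, 2)) + z_star
w = np.exp(-Z @ z_star + 0.5 * z_star @ z_star)
p_hat = float(np.mean(w * (F(Z) >= t)))
print("LDT minimizer:", z_star, " IS estimate of P(F >= t):", p_hat)
```

Plain Monte Carlo would need on the order of 1/P samples to see any extreme events at all; the LDT-informed proposal puts most samples where the events live, which is the essence of method (1) in the abstract.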


Friday, February 12, 2021
Eric Vanden-Eijnden, New York University
Trainability and accuracy of artificial neural networks

Abstract:

The recent success of machine learning suggests that neural networks may be capable of approximating high-dimensional functions with controllably small errors. As a result, they could outperform standard function interpolation methods that have been the workhorses of scientific computing but do not scale well with dimension. In support of this prospect, here I will review what is known about the trainability and accuracy of shallow neural networks, which offer the simplest instance of nonlinear learning in functional spaces that are fundamentally different from classic approximation spaces. The dynamics of training in these spaces can be analyzed using tools from optimal transport and statistical mechanics, which reveal when and how shallow neural networks can overcome the curse of dimensionality. I will also discuss how scientific computing problems in high dimension once thought intractable can be revisited through the lens of these results. Finally, I will discuss open questions, including potential generalizations to deep architectures.

This talk is based on joint work with Grant Rotskoff, Joan Bruna, Zhengdao Chen, and Sammy Jelassi.


Friday, February 19, 2021
Eric Darve, Stanford University
2nd order optimizers for physics-informed learning (Time: 11:15 am EST)

Abstract:

Physics-informed learning is a new class of deep learning algorithms that combine deep neural networks and numerical partial differential equation (PDE) solvers based on physical models. Although very promising, these algorithms require the accurate solution of often ill-conditioned optimization problems in high dimension. First-order optimizers like stochastic gradient descent and Adam have proven very successful for many machine learning applications but typically exhibit weaker performance on physics-informed learning tasks. Instead, second-order methods like BFGS and trust-region methods are much more robust and efficient for these problems. In this talk, we will discuss the performance and requirements of these optimizers for physics-informed learning tasks for different types of PDEs.
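
The conditioning issue can be isolated in a toy problem. Below is a minimal sketch (an ill-conditioned least-squares stand-in for a PINN loss; it assumes nothing about the talk's actual benchmarks): plain gradient descent stalls, while SciPy's L-BFGS solves the problem essentially to machine precision.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.logspace(0, -4, n)) @ U.T     # condition number 1e4
b = A @ rng.standard_normal(n)

loss = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)

# First-order method: gradient descent at the largest stable step size.
x, step = np.zeros(n), 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(1000):
    x -= step * grad(x)
print("gradient descent loss after 1000 steps:", loss(x))

# Quasi-second-order method: limited-memory BFGS.
res = minimize(loss, np.zeros(n), jac=grad, method="L-BFGS-B")
print("L-BFGS loss:", res.fun)
```

The gap between the two final losses grows with the condition number, which is one mechanism behind the weaker performance of first-order optimizers reported above.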


Friday, February 26, 2021
Alexis F. Vasseur, University of Texas at Austin
Stability of discontinuous solutions for inviscid compressible flows

Abstract:

We will discuss recent developments of the theory of a-contraction with shifts to study the stability of discontinuous solutions of systems of equations modeling inviscid compressible flows, like the compressible Euler equation.

In the general setting, the only stability result for multi-D shocks (Majda, 1981) involves very regular perturbations. More recently, the convex integration method showed that they are not stable under wild $L^2$ perturbations. In the one-dimensional configuration, a consequence of the Bressan theory shows that shocks are stable under small BV perturbations (together with a technical condition known as bounded variation on space-like curves).

The theory of a-contraction allows us to extend the Bressan theory to a weak/BV stability result allowing wild perturbations fulfilling only the so-called strong trace property.

Another way to study the stability of inviscid shocks is through the inviscid limit of viscous models. In one dimension, the study of the so-called "artificial" viscosity limit is now well understood. However, progress on the vanishing "physical" viscosity limit (for instance, from compressible Navier-Stokes systems to the compressible Euler equations) has been far slower.

One of the big recent successes of the theory of a-contraction with shifts is the stability of viscous shocks subject to large perturbations. Stability results on the inviscid model are then inherited in the inviscid limit, thanks to the fact that large perturbations, independent of the viscosity, can be considered at the Navier-Stokes level. These stability results hold in the class of wild perturbations of inviscid limits, without any regularity restriction (not even the strong trace property). This shows that the class of inviscid limits of Navier-Stokes equations is better behaved than the class of weak solutions to the inviscid limit problem. A first multi-D result on the stability of contact discontinuities without shear, in the class of inviscid limits of Fourier-Navier-Stokes, shows that the same property holds in some situations even in multi-D.


Friday, March 05, 2021
no colloquium

Friday, March 12, 2021
Xue-Cheng Tai, Hong Kong Baptist University
The Softmax function, Potts model and variational neural networks

Abstract:

In this talk, we present our recent research on using variational models as layers for deep neural networks (DNNs). We use image segmentation as an example; the technique can also be used for high-dimensional data classification. Through this technique, we can integrate many well-known variational models for image segmentation into deep neural networks. The new networks retain the advantages of traditional DNNs, while their outputs inherit many good properties of variational models for image segmentation. We will present techniques to incorporate shape priors into the networks through the variational layers. We will show how to design networks with spatial regularization and volume preservation, as well as networks that guarantee the output shapes for image segmentation are convex or star-shaped. It is numerically verified that these techniques can improve performance when the true shapes satisfy these priors.

The ideas behind these new networks are based on a relationship between the softmax function, the Potts model, and the structure of traditional DNNs. We will explain this in detail, which leads naturally to the newly designed networks.

This talk is based on joint work with Jun Liu, S. Luo and several other collaborators.
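
One precise instance of that relationship (stated here as my reading, with a numerical check rather than a proof): over the probability simplex, the softmax of the per-class scores is the unique minimizer of the entropy-regularized Potts data term.

```python
import numpy as np
from scipy.optimize import minimize

# Claim: softmax(o) = argmin_u  -<o, u> + sum_k u_k log u_k  over the simplex.
o = np.array([1.0, 2.0, 0.5])                      # per-class scores (logits)
softmax = np.exp(o) / np.exp(o).sum()

obj = lambda u: -(o @ u) + np.sum(u * np.log(np.clip(u, 1e-12, None)))
res = minimize(obj, np.full(3, 1.0 / 3.0), method="SLSQP",
               bounds=[(0.0, 1.0)] * 3,
               constraints=[{"type": "eq", "fun": lambda u: u.sum() - 1.0}])

print("softmax:", softmax)
print("argmin :", res.x)   # agrees with softmax up to solver tolerance
```

Adding a spatial regularizer (for example, total variation of u across pixels) to this objective yields exactly the kind of variational segmentation layer the abstract describes, with the plain softmax recovered when the regularization weight is zero.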


Friday, March 19, 2021
Michael Hintermüller, WIAS and Humboldt-Universität zu Berlin
Optimization with learning-informed differential equation constraints and its applications

Abstract:

Inspired by applications in optimal control of semilinear elliptic partial differential equations and physics-integrated imaging, differential equation constrained optimization problems with constituents that are only accessible through data-driven techniques are studied. A particular focus is on the analysis and on numerical methods for problems with machine-learned components. For a rather general context, an error analysis is provided, and particular properties resulting from artificial neural network based approximations are addressed. Moreover, for each of the two inspiring applications analytical details are presented and numerical results are provided.

Joint work with G. Dong and K. Papafitsoros (both Weierstrass Institute Berlin)


Friday, March 26, 2021
Anders Petersson, Lawrence Livermore National Lab
Numerical Optimal Control of Quantum Systems (Time: 11:30 am EST)

Friday, April 02, 2021
East Coast Optimization Meeting (ECOM), George Mason University

Friday, April 09, 2021
Roland Herzog, Chemnitz University of Technology
Total Variation and Total Generalized Variation: From Optimal Control to Geometry Processing

Abstract:

The total variation (TV) semi-norm is popular as a regularizing functional in inverse problems and imaging, favoring piecewise constant functions. As an extension, Bredies, Kunisch and Pock introduced the total generalized variation (TGV), which favors piecewise linear (or higher-order) polynomials. In this presentation, we address discrete TV and TGV models for finite element formulations and their use in optimal control, imaging, and geometry processing applications, along with tailored optimization algorithms.
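
As a one-dimensional taste of the TV part (a sketch with a smoothed TV term and plain gradient descent; the talk concerns finite element discretizations and tailored algorithms, and every parameter value here is invented):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
truth = np.where(np.arange(n) < n // 2, 0.0, 1.0)   # piecewise constant
f = truth + 0.1 * rng.standard_normal(n)            # noisy observation

# Minimize 0.5*||u - f||^2 + alpha * sum_i sqrt((u_{i+1}-u_i)^2 + eps^2).
alpha, eps, step = 0.5, 0.05, 0.02
u = f.copy()
for _ in range(3000):
    d = np.diff(u)
    q = d / np.sqrt(d ** 2 + eps ** 2)              # smoothed sign of jumps
    grad_tv = np.concatenate([[-q[0]], q[:-1] - q[1:], [q[-1]]])
    u -= step * ((u - f) + alpha * grad_tv)

print("RMS error to truth:", float(np.linalg.norm(u - truth) / np.sqrt(n)))
```

The recovered u is flat away from the jump, illustrating why TV favors piecewise constants; TGV replaces the first difference by a higher-order construction and therefore tolerates piecewise linear trends without staircasing.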


Friday, April 16, 2021
Youngsoo Choi, LLNL
Where are we with data-driven surrogate modeling for various physical simulations?

Abstract:

A surrogate model is built to accelerate computationally expensive physical simulations, which is useful in multi-query problems such as inverse problems, uncertainty quantification, design optimization, and optimal control. In this talk, two types of data-driven surrogate modeling techniques will be discussed: the black-box approach, which incorporates only data, and the physics-informed approach, which incorporates the physics information as well as data within the surrogate models. The advantages and disadvantages of each method will be discussed. Furthermore, several recent developments at LLNL of data-driven physics-informed surrogate modeling techniques will be introduced in the context of various physical simulations. For example, the time-windowing reduced order model overcomes the difficulty of the shock propagation phenomenon, achieving a speed-up of O(2~10) with a relative error of less than 1% for relatively small Lagrangian hydrodynamics problems. The space-time reduced order model accelerates large-scale neutron transport simulations by a factor of 7,000 with a relative error of less than 1%. The nonlinear manifold reduced order model shows a perfect marriage between machine learning and physics-informed surrogate modeling and also addresses the challenge posed by advection-dominated physical simulations. Finally, successful applications of these surrogate models in design optimization settings will be presented.
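
For readers new to the area, the projection baseline behind many of these methods is snapshot POD, sketched below for a generic stable linear system (an illustrative assumption; this is not LLNL's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps, dt, r = 400, 200, 0.01, 10

Q = rng.standard_normal((n, n))
A = -Q @ Q.T / n                      # stable full-order model: x' = A x
x = rng.standard_normal(n)

snapshots = []
for _ in range(steps):                # collect training snapshots
    x = x + dt * (A @ x)
    snapshots.append(x.copy())
S = np.array(snapshots).T             # n x steps snapshot matrix

U, s, _ = np.linalg.svd(S, full_matrices=False)
V = U[:, :r]                          # POD basis from leading singular vectors
Ar = V.T @ A @ V                      # Galerkin-projected r x r operator

# Replay the dynamics in full and reduced coordinates, then compare.
x_full, xr = S[:, 0].copy(), V.T @ S[:, 0]
for _ in range(steps):
    x_full = x_full + dt * (A @ x_full)
    xr = xr + dt * (Ar @ xr)
err = np.linalg.norm(V @ xr - x_full) / np.linalg.norm(x_full)
print(f"relative ROM error with r = {r}: {err:.2e}")
```

The LLNL methods above build on this skeleton: time windowing swaps the basis over time intervals, space-time ROMs compress both axes at once, and nonlinear-manifold ROMs replace the linear basis V with a learned decoder.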

About the speaker:

Youngsoo is a computational scientist in CASC under the Computing directorate. His research focuses on developing efficient reduced order models for various physical simulations to be used in multi-query problems, such as inverse problems, design optimization, and uncertainty quantification. He currently leads the data-driven surrogate model development team for various physical simulations. He earned his undergraduate degree in Civil and Environmental Engineering from Cornell University and his PhD in Computational and Mathematical Engineering from Stanford University. He was a postdoc at Sandia National Laboratories and Stanford University prior to joining LLNL in 2017.


Friday, April 23, 2021
Jan S Hesthaven, EPFL
Nonintrusive reduced order models using physics informed neural networks

Abstract:

The development of reduced order models for complex applications, offering the promise for rapid and accurate evaluation of the output of complex models under parameterized variation, remains a very active research area. Applications are found in problems which require many evaluations, sampled over a potentially large parameter space, such as in optimization, control, uncertainty quantification, and in applications where a near real-time response is needed.

However, many challenges remain unresolved to secure the flexibility, robustness, and efficiency needed for general large-scale applications, in particular for nonlinear and/or time-dependent problems.

After giving a brief general introduction to projection based reduced order models, we discuss the use of artificial feedforward neural networks to enable the development of fast and accurate nonintrusive models for complex problems. We demonstrate that this approach offers substantial flexibility and robustness for general nonlinear problems and enables the development of fast reduced order models for complex applications.

In the second part of the talk, we discuss how to use residual based neural networks in which knowledge of the governing equations is built into the network and show that this has advantages both for training and for the overall accuracy of the model.

Time permitting, we finally discuss the use of reduced order models in the context of prediction, i.e. to estimate solutions in regions of the parameter space beyond that of the initial training. With an emphasis on the Mori-Zwanzig formulation for time-dependent problems, we discuss how to accurately account for the effect of the unresolved and truncated scales on the long-term dynamics and show that accounting for these through a memory term significantly improves the predictive accuracy of the reduced order model.


Friday, April 30, 2021
Karl Kunisch, University of Graz
Semiglobal optimal feedback stabilization of autonomous systems via deep neural network approximation

Abstract:

A learning approach to optimal feedback gains for nonlinear continuous-time control systems is proposed and analysed. The goal is to establish a rigorous framework for computing approximations of optimal feedback gains using neural networks. The approach rests on two main ingredients: first, an optimal control formulation involving an ensemble of trajectories with 'control' variables given by the feedback gain functions; second, an approximation of the feedback functions via realizations by neural networks. Based on universal approximation properties, we prove the existence and convergence of optimal stabilizing neural network feedback controllers.

The talk is based on joint work with Daniel Walter.
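
A heavily simplified sketch of the ensemble idea (scalar dynamics, a tiny tanh network in place of deep nets, and a derivative-free optimizer instead of the paper's framework; every number below is an invented toy choice):

```python
import numpy as np
from scipy.optimize import minimize

dt, steps = 0.05, 60
x0s = np.linspace(-2.0, 2.0, 9)          # ensemble of initial conditions

def feedback(theta, x):                  # 1-4-1 tanh network, u = g_theta(x)
    w1, b1, w2 = theta[:4], theta[4:8], theta[8:12]
    return float(w2 @ np.tanh(w1 * x + b1))

def ensemble_cost(theta):
    J = 0.0
    for x in x0s:                        # average quadratic cost over ensemble
        for _ in range(steps):
            u = feedback(theta, x)
            J += dt * (x ** 2 + 0.1 * u ** 2)
            x = x + dt * (x + u)         # open loop x' = x is unstable
    return J / len(x0s)

res = minimize(ensemble_cost, 0.1 * np.ones(12), method="Nelder-Mead",
               options={"maxiter": 4000, "fatol": 1e-9})
print("trained cost:", res.fun, " u(1.0) =", feedback(res.x, 1.0))
```

Even this toy shows the structure of the formulation: the 'control' variable is the feedback law itself, and the ensemble over initial states is what pushes the learned gain toward semiglobal rather than single-trajectory stabilization.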


Friday, May 07, 2021
Robert F. Dejaco, NIST
Resolving the Shock Layer in Fixed-Bed Adsorption with Boundary Layer Theory

Abstract:

In adsorption separations, mixtures flow through a column packed with solid particles. The weakly adsorbing component moves faster than the strongly adsorbing component, causing the exiting mixture to separate relative to the inlet. By exploiting differences in affinity for a solid material, rather than heating and cooling (e.g., conventional distillation), adsorption separations can be very energy efficient. Understanding the so-called “break-through curve” measurement – the outlet fluid concentrations as a function of time – is central to efficient industrial implementation. Mathematical modeling of the associated nonlinear PDE can provide a quantitative connection between the characteristics of the adsorbent material and the break-through curve measurement. We apply boundary layer theory to study breakthrough curve measurements for isothermal single-solute adsorption with plug flow in the limit of fast adsorption compared to convection. Our perturbation theory connects two seemingly unrelated theories, one assuming infinitely fast mass transfer and the other an infinitely long column. The leading order “outer” form of the problem is a conservation law that yields shock waves via the method of characteristics. The discontinuity at the shock can be resolved by rescaling in a moving coordinate system. Analysis of the boundary layer reveals that the associated breakthrough curve has exactly one inflection point, is not necessarily symmetric, and only occurs when the relationship for solute partitioning adopts a certain convexity. A comparison to numerical simulations is presented to support the validity of the approach.


Danielle C. Brager, NIST
Mathematically investigating Retinitis Pigmentosa

Abstract:

Retinitis Pigmentosa (RP) is a collection of clinically and genetically heterogeneous degenerative retinal diseases. Patients with RP experience a loss of night vision that progresses to day-light blindness due to the sequential degeneration of rod and cone photoreceptors. While known genetic mutations associated with RP affect the rods, the degeneration of cones inevitably follows in a manner independent of those genetic mutations. Investigation of this secondary death of cone photoreceptors led to the discovery of the rod-derived cone viability factor (RdCVF), a protein secreted by the rods that preserves the cones by accelerating the flow of glucose into cone cells, stimulating aerobic glycolysis. In this work, we formulate a predator-prey style system of nonlinear ordinary differential equations to mathematically model photoreceptor interactions in the presence of RP while accounting for the new understanding of RdCVF's role in enhancing cone survival. We utilize the mathematical model and subsequent analysis to examine the underlying processes and mechanisms (defined by the model parameters) that affect cone photoreceptor vitality as RP progresses. The physiologically relevant equilibrium points are interpreted as different stages of retinal degeneration. We determine conditions necessary for the local asymptotic stability of these equilibrium points and use the results as criteria needed to remain in a given stage of retinal degeneration. Experimental data are used for parameter estimation. Pathways to blindness are uncovered via bifurcation analysis, which narrows our focus to four of the model equilibria. Using Latin Hypercube Sampling coupled with partial rank correlation coefficients, we perform a sensitivity analysis to determine mechanisms that have a significant effect on the cones at four stages of RP. We derive a non-dimensional form of the mathematical model and perform a numerical bifurcation analysis using MATCONT to explore the existence of stable limit cycles, because a stable limit cycle is a stable mode, other than an equilibrium point, in which the rods and cones coexist. In our analyses, a set of key parameters involved in photoreceptor outer segment shedding, renewal, and nutrient supply were shown to govern the dynamics of the system. Our findings illustrate the benefit of using mathematical models to uncover mechanisms driving the progression of RP and open the possibility of using in silico experiments to test treatment options in the absence of rods.
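
The predator-prey flavor is easy to convey with a deliberately generic two-species toy (all parameter values invented; this is NOT the model of the talk): rods support cones, and past a critical degeneration rate the cone population collapses.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, mu):
    R, C = y
    dR = R * (1.0 - R) - mu * R             # logistic rods, degeneration rate mu
    dC = C * (0.5 * R - 0.2 * C - 0.1)      # cones rely on rod-derived support
    return [dR, dC]

for mu in (0.05, 0.9):                      # mild vs. severe rod degeneration
    sol = solve_ivp(rhs, (0.0, 400.0), [0.9, 0.5], args=(mu,))
    print(f"mu = {mu}: rods -> {sol.y[0, -1]:.2f}, cones -> {sol.y[1, -1]:.2f}")
```

Sweeping mu exhibits exactly the kind of bifurcation-to-extinction structure that the abstract analyzes rigorously: equilibria as disease stages, stability conditions, and sensitivity of the collapse threshold.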


Fall 2020
Friday, August 28, 2020
Enrique Zuazua, University of Erlangen–Nuremberg (FAU)
Turnpike control and deep learning

Abstract:

The turnpike principle asserts that over long time horizons, optimal control strategies are nearly of a steady-state nature.

In this lecture we shall survey some recent results on this topic and present some of its consequences for deep supervised learning.

This lecture will be based in particular on recent joint work with C. Esteve, B. Geshkovski and D. Pighin.

arXiv


Friday, September 04, 2020
Rainald Löhner, George Mason University
Modeling and Simulation of Viral Propagation in the Built Environment

Abstract:

This talk will begin by summarizing the mechanical characteristics of virus contaminants and their transmission via droplets and aerosols. The ordinary and partial differential equations describing the physics of these processes with high fidelity will be presented. We shall also describe the appropriate numerical schemes to solve these problems. We will conclude the talk with several realistic examples of built environments, such as TSA queues and hospital rooms. DOI


Friday, September 11, 2020
Fioralba Cakoni, Rutgers University
Spectral Problems in Inverse Scattering for Inhomogeneous Media

Abstract:

The inverse scattering problem for inhomogeneous media amounts to inverting a locally compact nonlinear operator, thus presenting difficulties in arriving at a solution. Initial efforts to deal with the nonlinear and ill-posed nature of the inverse scattering problem focused on the use of nonlinear optimization methods. Although efficient in many situations, their use suffers from the need for strong a priori information in order to implement such an approach. In addition, recent advances in materials science and nanostructure fabrication have introduced new exotic materials for which full reconstruction of the constitutive parameters from scattering data is challenging or even impossible. In order to circumvent these difficulties, a recent trend in inverse scattering theory has focused on the development of new methods in which the amount of a priori information needed is drastically reduced, at the expense of obtaining only limited information about the scatterers. Such methods come under the general title of the qualitative approach in inverse scattering theory; they yield mathematically justified and computationally simple reconstruction algorithms by investigating properties of the linear scattering operator to decode nonlinear information about the scattering object. In this spirit, a possible approach is to exploit spectral properties of operators associated with scattering phenomena which carry essential information about the media. The identified eigenvalues must satisfy two important properties: 1) they can be determined from the scattering operator, and 2) they are related to geometrical and physical properties of the media in an understandable way.

In this talk we will discuss some old and new eigenvalue problems arising in scattering theory for inhomogeneous media. We will present a two-fold discussion: on one hand relating the eigenvalues to the measurement operator (to address the first property) and on the other hand viewing them as the spectrum of appropriate (possibly non-self-adjoint) partial differential operators (to address the second property). Numerical examples will be presented to show what kind of information these eigenvalues, and more generally the qualitative approach, yield on the unknown inhomogeneity.


Friday, September 18, 2020
Shawn Walker, Louisiana State University
Mathematical Modeling and Numerics for Nematic Liquid Crystals

Abstract:

I start with an overview of nematic liquid crystals (LCs), including their basic physics, applications, and how they are modeled. In particular, I describe different models, such as Oseen-Frank, Landau-de Gennes, and the Ericksen model, as well as their numerical discretization. In addition, I give the advantages and disadvantages of each model. For the rest of the talk, I will focus on Landau-de Gennes (LdG) and Ericksen.

Next, I will highlight parts of the analysis of these models and how it relates to numerical analysis, with specific emphasis on finite element methods (FEMs) to compute energy minimizers; much of this work, which I will review, is joint with various co-authors. I will illustrate the methods we have developed by presenting numerical simulations in two and three dimensions, including non-orientable line fields (LdG model). Finally, I will conclude with some current problems in modeling and simulating LCs and an outlook on future directions.


Friday, September 25, 2020
Carola-Bibiane Schönlieb, University of Cambridge
Multi-tasking inverse problems: more together than alone

Abstract:

Inverse imaging problems in practice constitute a pipeline of tasks that starts with image reconstruction, involves registration, segmentation, and a prediction task at the end. The idea of multi-tasking inverse problems is to make use of the full information in the data in every step of this pipeline by jointly optimising for all tasks. While this is not a new idea in inverse problems, the ability of deep learning to capture complex prior information paired with its computational efficiency renders an all-in-one approach practically possible for the first time.
In this talk we will discuss multi-tasking approaches to inverse problems, and their analytical and numerical challenges. This will include a variational model for joint motion estimation and reconstruction for fast tomographic imaging, joint registration and reconstruction (using a template image as a shape prior in the reconstruction) for limited angle tomography, as well as a variational model for joint image reconstruction and segmentation for MRI. These variational approaches will be put in contrast to a deep learning framework for multi-tasking inverse problems, with examples for joint image reconstruction and segmentation, and joint image reconstruction and classification from tomographic data.


Friday, October 02, 2020
Drew P. Kouri, Sandia National Laboratories
Randomized Sketching for Low-Memory Dynamic Optimization

Abstract:

In this talk, we develop a novel limited-memory method to solve dynamic optimization problems. The memory requirements for such problems often present a major obstacle, particularly for problems with PDE constraints such as optimal flow control, full waveform inversion, and optical tomography. In these problems, PDE constraints uniquely determine the state of a physical system for a given control; the goal is to find the value of the control that minimizes an objective or cost functional. While the control is often low dimensional, the state is typically more expensive to store. To reduce the memory requirements, we employ randomized matrix approximation to compress the state as it is generated. The compressed state is then used to compute approximate gradients and to apply the Hessian to vectors. The approximation error in these quantities is controlled by the target rank of the compressed state. This approximate first- and second-order information can readily be used in any optimization algorithm. As an example, we develop a sketched trust-region method that adaptively learns the target rank using a posteriori error information and provably converges to a stationary point of the original problem. To conclude, we apply our randomized compression to the optimal control of a linear elliptic PDE and the optimal control of fluid flow past a cylinder.
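
The compression ingredient fits in a few lines. Below is a standard streaming randomized range finder in the spirit of Halko, Martinsson, and Tropp (a sketch with stand-in dynamics, not the Sandia implementation): the full trajectory is never stored, only its product with a fixed random test matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n, steps, r = 1000, 300, 20

A = np.diag(np.exp(-np.linspace(0.0, 3.0, n)))   # stand-in state update map
Omega = rng.standard_normal((steps, r + 5))      # fixed sketch, oversampled by 5

Y = np.zeros((n, r + 5))                         # Y = S @ Omega, accumulated
x = rng.standard_normal(n)                       # one state at a time: the
for k in range(steps):                           # n x steps matrix S is never
    x = A @ x                                    # held in memory
    Y += np.outer(x, Omega[k])

Q, _ = np.linalg.qr(Y)                           # orthonormal basis for range(S)
# States are replayed from compressed coordinates Q^T x; check the final one.
err = np.linalg.norm(x - Q @ (Q.T @ x)) / np.linalg.norm(x)
print(f"relative reconstruction error of the final state: {err:.2e}")
```

In the method described above, such compressed states stand in for the exact ones when forming gradients and Hessian-vector products, and the target rank r is the knob that the adaptive trust-region algorithm tunes against a posteriori error estimates.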


Friday, October 09, 2020
Kevin Carlberg, Facebook
Nonlinear model reduction: using machine learning to enable rapid simulation of extreme-scale physics models

Abstract:

Physics-based modeling and simulation has become indispensable across many applications in science and engineering, ranging from autonomous-vehicle control to designing new materials. However, achieving high predictive fidelity necessitates modeling fine spatiotemporal resolution, which can lead to extreme-scale computational models whose simulations consume months on thousands of computing cores. This constitutes a formidable computational barrier: the cost of truly high-fidelity simulations renders them impractical for important time-critical applications (e.g., rapid design, control, real-time simulation) in engineering and science. In this talk, I will present several advances in the field of nonlinear model reduction that leverage machine-learning techniques ranging from convolutional autoencoders to LSTM networks to overcome this barrier. In particular, these methods produce low-dimensional counterparts to high-fidelity models called reduced-order models (ROMs) that exhibit 1) accuracy, 2) low cost, 3) physical-property preservation, 4) guaranteed generalization performance, and 5) error quantification.
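
A minimal PyTorch sketch of the nonlinear-manifold ansatz behind such ROMs: an autoencoder whose bottleneck is the reduced state. Sizes, layers, and data below are illustrative placeholders, not the architectures from the talk:

```python
import torch
import torch.nn as nn

full_dim, latent_dim = 256, 4            # illustrative sizes
torch.manual_seed(0)

class AutoencoderROM(nn.Module):
    """Nonlinear trial manifold: x ~ decode(encode(x)), with the reduced
    state living in a latent_dim-dimensional space."""
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(full_dim, 64), nn.ELU(),
                                    nn.Linear(64, latent_dim))
        self.decode = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(),
                                    nn.Linear(64, full_dim))
    def forward(self, x):
        return self.decode(self.encode(x))

# Stand-in "snapshots" that genuinely live near a 4-dimensional subspace.
snapshots = torch.randn(32, latent_dim) @ torch.randn(latent_dim, full_dim)
model = AutoencoderROM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(500):                      # fit the manifold to the snapshots
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(snapshots), snapshots)
    loss.backward()
    opt.step()
print(loss.item())                        # reconstruction error decreases
```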


Friday, October 16, 2020
Noemi Petra, University of California, Merced
Optimal design of large-scale Bayesian linear inverse problems under reducible model uncertainty: good to know what you don't know

Abstract:

Optimal experimental design (OED) refers to the task of determining an experimental setup such that the measurements are most informative about the underlying parameters. This is particularly important in situations where experiments are costly or time-consuming, and thus only a small number of measurements can be collected. In addition to the parameters estimated by an inverse problem, the governing mathematical models often involve simplifications, approximations, or modeling assumptions, resulting in additional uncertainty. These additional uncertainties must be taken into account in the experimental design process; failing to do so could result in suboptimal designs. In this talk, we consider optimal design of infinite-dimensional Bayesian linear inverse problems governed by uncertain forward models. In particular, we seek experimental designs that minimize the posterior uncertainty in the primary parameters, while accounting for the uncertainty in secondary (nuisance) parameters. We accomplish this by deriving a marginalized A-optimality criterion and developing an efficient computational approach for its optimization. We illustrate our approach for estimating an uncertain time-dependent source in a contaminant transport model with an uncertain initial state as secondary uncertainty. Our results indicate that accounting for additional model uncertainty in the experimental design process is crucial.

References: This presentation is based on the following paper https://arxiv.org/abs/1308.4084 and manuscript https://arxiv.org/abs/2006.11939.
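
To make the A-optimality idea concrete, here is a hedged toy computation for a linear Gaussian inverse problem: the plain (non-marginalized) criterion is the trace of the posterior covariance as a function of sensor weights. All matrices are synthetic; the talk's contribution is the marginalized variant over nuisance parameters, which this sketch does not implement:

```python
import numpy as np

def a_optimality(w, F, prior_cov, noise_var):
    """Trace of the posterior covariance of a linear Gaussian inverse
    problem, as a function of nonnegative sensor weights w."""
    post_precision = F.T @ np.diag(w / noise_var) @ F + np.linalg.inv(prior_cov)
    return np.trace(np.linalg.inv(post_precision))

rng = np.random.default_rng(0)
F = rng.standard_normal((20, 8))       # forward map: 8 parameters, 20 sensors
prior_cov = np.eye(8)
all_sensors = a_optimality(np.ones(20), F, prior_cov, noise_var=0.1)
five_sensors = a_optimality((np.arange(20) < 5).astype(float), F, prior_cov, 0.1)
print(all_sensors < five_sensors)      # more data, smaller posterior trace: True
```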


Friday, October 23, 2020
Boyan Lazarov, Lawrence Livermore National Laboratory
Large-scale topology optimization

Abstract:

Topology optimization has gained the status of being the preferred optimization tool in the mechanical, automotive, and aerospace industries. It has undergone tremendous development since its introduction in 1988, and nowadays, it has spread to many other disciplines such as Acoustics, Optics, and Material Design. The basic idea is to distribute material in a predefined domain by minimizing a selected objective and fulfilling a set of constraints. The procedure consists of repeated system analyses, gradient evaluation steps by adjoint sensitivity analysis, and design updates based on mathematical programming methods. Regularization techniques ensure the existence of a solution.

The result of the topology optimization procedure is a bitmap image of the design. The ability of the method to modify every pixel/voxel results in design freedom unavailable to any alternative approach. However, this freedom comes with the requirement of using the computational power of large parallel machines. Incorporating a model accounting for in-service and manufacturing variations in the optimization process, together with the high contrast between the material phases, further increases the computational cost. Thus, this talk focuses on methods for reducing the computational complexity, ensuring manufacturability of the optimized design, and efficiently handling the high contrast of the material properties. The developments will be demonstrated on airplane wing design, compliant mechanisms, heat sinks, material microstructures for additive manufacturing, and photonic devices.


Friday, October 30, 2020
Martin J. Gander, University of Geneva
Seven Things I would have liked to know when starting to work on Domain Decomposition

Abstract:

It is not easy to start working in a new field of research. I will give a personal overview over seven things I would have liked to know when I started working on domain decomposition (DD) methods:

  1. Seminal contributions to DD not easy to start with
  2. Seminal contributions to DD ideal to start with
  3. DD solvers are obtained by discretizing 2)
  4. There are better transmission conditions than Dirichlet or Neumann
  5. "Optimal" in classical DD means scalable
  6. Coarse space components can do more than provide scalability
  7. DD methods should always be used as preconditioners
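
As a concrete companion to the list, here is a minimal NumPy implementation of the classical alternating Schwarz method for -u'' = 1 on (0,1) with two overlapping subdomains and Dirichlet transmission conditions (the conditions that item 4 suggests improving on). Grid size and overlap are arbitrary choices for the sketch:

```python
import numpy as np

# Alternating Schwarz for -u'' = 1 on (0,1), u(0) = u(1) = 0, with two
# overlapping subdomains (0, x_b) and (x_a, 1) and Dirichlet transmission
# conditions at the interfaces.
n, a_idx, b_idx = 101, 45, 55            # overlap = grid points 45..55
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
f = np.ones(n)
u = np.zeros(n)

def solve_dirichlet(lo, hi, left_val, right_val):
    """Solve -u'' = f on the grid slice [lo, hi] with given boundary data."""
    m = hi - lo - 1
    A = (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    rhs = f[lo + 1:hi].copy()
    rhs[0] += left_val / h**2
    rhs[-1] += right_val / h**2
    return np.linalg.solve(A, rhs)

for it in range(30):
    u[1:b_idx] = solve_dirichlet(0, b_idx, 0.0, u[b_idx])              # left
    u[a_idx + 1:n - 1] = solve_dirichlet(a_idx, n - 1, u[a_idx], 0.0)  # right

print(np.max(np.abs(u - 0.5 * x * (1 - x))))   # ~0: geometric convergence
```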


Friday, November 06, 2020
Siddhartha Mishra, ETH Zürich
Deep Learning and Computations of PDEs

Abstract:

We present recent results on the use of deep learning techniques in the context of computing different aspects of PDEs. The first part of the talk will be on novel supervised learning algorithms for the efficient computation of parametric PDEs, with applications to uncertainty quantification and PDE-constrained optimization. The second part of the talk will focus on a recently proposed class of unsupervised learning algorithms, Physics Informed Neural Networks (PINNs), and we describe their application to computing solutions of the forward problem for high-dimensional PDEs as well as of data assimilation inverse problems for PDEs.
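
A minimal PyTorch sketch of the PINN idea mentioned above, for the toy problem -u'' = pi^2 sin(pi x) on (0,1) with homogeneous Dirichlet data; network size, penalty weight, and training budget are arbitrary choices, not those from the talk:

```python
import torch

# Toy PINN: minimize the PDE residual at random collocation points plus a
# boundary penalty.  Exact solution: u = sin(pi x).
torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for it in range(2000):
    x = torch.rand(128, 1, requires_grad=True)        # collocation points
    u = net(x)
    ux = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    pde_res = (-uxx - torch.pi**2 * torch.sin(torch.pi * x)).pow(2).mean()
    bc_res = net(torch.tensor([[0.0], [1.0]])).pow(2).mean()
    loss = pde_res + 100.0 * bc_res
    opt.zero_grad()
    loss.backward()
    opt.step()

xt = torch.linspace(0, 1, 101).reshape(-1, 1)
err = (net(xt).detach() - torch.sin(torch.pi * xt)).abs().max()
print(err)     # modest after this short training run
```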


Friday, November 13, 2020
Jianfeng Lu, Duke University
Solving Eigenvalue Problems in High Dimension

Abstract:

The leading eigenvalue problem of a differential operator arises in many scientific and engineering applications, in particular quantum many-body problems. Due to the curse of dimensionality, conventional algorithms become impractical because of their huge computational and memory complexity. In this talk, we will discuss some of our recent works on novel approaches for eigenvalue problems in high dimension, using techniques from randomized algorithms, coordinate methods, and deep learning. (Joint work with Jiequn Han, Yingzhou Li, Zhe Wang and Mo Zhou.)
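
For orientation, the baseline that such methods build on is matrix-free access to the operator: below is plain power iteration using only matrix-vector products (a NumPy sketch; the randomized, coordinate, and deep-learning variants discussed in the talk go well beyond this):

```python
import numpy as np

def leading_eig(matvec, n, iters=500, seed=0):
    """Power iteration using only matrix-vector products."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = matvec(v)
        v = w / np.linalg.norm(w)
    return v @ matvec(v), v           # Rayleigh quotient, eigenvector

# The operator is only ever "touched" through its action: here the 1D
# Laplacian stencil tridiag(-1, 2, -1), never formed as a matrix.
mv = lambda v: np.concatenate(([2*v[0] - v[1]],
                               2*v[1:-1] - v[:-2] - v[2:],
                               [2*v[-1] - v[-2]]))
lam, _ = leading_eig(mv, n=1000)
print(lam)    # close to the top of the spectrum, 4
```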


Friday, November 20, 2020
Ramnarayan Krishnamurthy, MathWorks
Hands-On Workshop - Deep Learning in MATLAB

Abstract:

Artificial Intelligence techniques like deep learning are introducing automation to the products we build and the way we do business. These techniques can be used to solve complex problems related to images, signals, text and controls.

In this hands-on workshop, you will write code and use MATLAB Online to:

  1. Train deep neural networks on GPUs in the cloud.
  2. Create deep learning models from scratch for image and signal data.
  3. Explore pretrained models and use transfer learning.
  4. Import and export models from Python frameworks such as Keras and PyTorch.
  5. Automatically generate code for embedded targets.

Follow up: Useful Resources
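
The workshop itself uses MATLAB Online; as a hedged, language-neutral illustration of item 3 from the list above (transfer learning), here is a PyTorch sketch of the freeze-the-backbone, retrain-the-head pattern with stand-in data:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# "Pretrained" backbone stand-in (in practice: a network trained on a
# large dataset); freeze it and retrain only a new task-specific head.
backbone = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
for p in backbone.parameters():
    p.requires_grad = False                   # freeze transferred features

head = nn.Linear(16, 3)                       # new head: 3-class problem
model = nn.Sequential(backbone, head)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

x, y = torch.randn(100, 64), torch.randint(0, 3, (100,))   # toy data
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
print(loss.item())   # only the head's weights have changed
```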


Friday, November 27, 2020
Thanksgiving Break

Friday, December 04, 2020
Rayanne Luke, University of Delaware
Parameter Identification for Tear Film Thinning and Breakup

Abstract:

Millions of Americans experience dry eye syndrome, a condition that decreases quality of vision and causes ocular discomfort. A phenomenon associated with dry eye syndrome is tear film breakup (TBU), or the formation of dry spots on the eye. The dynamics of the tear film can be studied using fluorescence imaging. Many parameters affecting tear film thickness and fluorescent intensity distributions within TBU are difficult to measure directly in vivo. We estimate breakup parameters by fitting computed results from thin film fluid PDE models to experimental fluorescent intensity data gathered from normal subjects’ tear films in vivo. Both evaporation and the Marangoni effect can cause breakup. The PDE models include these mechanisms in combination and separately. The parameters are determined by a nonlinear least squares minimization between computed and experimental fluorescent intensity, and they indicate the relative importance of each mechanism. Optimal values for computed breakup variables that cannot be measured in vivo fall near or within accepted experimental ranges for the general corneal region. Our results are a step towards characterizing the mechanisms that cause a wide range of breakup instances and help medical professionals to better understand tear film function and dry eye syndrome.
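
A hedged toy version of the fitting step: recover parameters of a simple exponential thinning model from noisy synthetic "intensity" data with SciPy's nonlinear least squares. The closed-form model is illustrative only; the talk fits thin-film PDE solutions, not a curve:

```python
import numpy as np
from scipy.optimize import least_squares

# Recover decay parameters from noisy data with the model I(t) = I0 * exp(-v t).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 60)
I0_true, v_true = 1.0, 0.3
data = I0_true * np.exp(-v_true * t) + 0.02 * rng.standard_normal(t.size)

residual = lambda p: p[0] * np.exp(-p[1] * t) - data
fit = least_squares(residual, x0=[0.5, 0.1])
print(fit.x)   # close to [1.0, 0.3]
```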


Stephan Wojtowytsch, Princeton University
Tetrahedral symmetry in the final and penultimate layers of neural network classifiers

Abstract:

A recent empirical study found that the penultimate layer of a well-trained neural network classifier maps training data samples to the vertices of a low-dimensional tetrahedron in a high-dimensional ambient space. We explain this observation from a theoretical perspective in a toy model for deep networks and give complementary examples to show that even the output of a shallow neural network classifier is generally non-uniform over a data class. As deep networks are the composition of a (slightly less) deep network and a shallow network, these examples illustrate how a network would fail to output a uniform classifier over the training samples if the data is mapped to sets with inconvenient geometry in an intermediate layer.
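
The "tetrahedron" here is the regular simplex (a simplex equiangular tight frame) that class means collapse to in the cited empirical study; the following NumPy snippet constructs its K vertices and verifies that all pairwise angles are equal:

```python
import numpy as np

# Vertices of the regular K-simplex (simplex equiangular tight frame):
# unit vectors with all pairwise inner products equal to -1/(K-1).
K = 4                                            # the "tetrahedron" case
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)
G = M.T @ M                                      # Gram matrix of vertices
print(np.allclose(np.diag(G), 1.0))              # unit norm: True
off_diag = G[~np.eye(K, dtype=bool)]
print(np.allclose(off_diag, -1.0 / (K - 1)))     # equiangular: True
```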


Summer 2020
Date, Speaker, Affiliation, Title
Friday, May 22, 2020
Jianghao Wang, MathWorks
Practical Deep Learning in the Classroom

Abstract:

Deep learning is quickly becoming embedded in everyday applications. It’s becoming essential for students to adopt this technology, almost regardless of what their future jobs are. We will highlight some of the mathematics needed to construct and understand deep learning solutions.

About the speaker:

Jianghao Wang is the deep learning academic liaison at MathWorks. In her role, Jianghao supports deep learning research and teaching in academia. Before joining MathWorks, Jianghao obtained her Ph.D. in Statistical Climatology from the University of Southern California and B.S. in Applied Mathematics from Nankai University.


Friday, May 29, 2020
Akwum Onwunta, University of Maryland, College Park
Fast solvers for optimal control problems constrained by PDEs with uncertain inputs

Abstract:

Optimization problems constrained by deterministic steady-state partial differential equations (PDEs) are computationally challenging. This is even more so if the constraints are deterministic unsteady PDEs, since one would then need to solve a system of PDEs coupled globally in time and space, and time-stepping methods quickly reach their limitations due to the enormous demand for storage [5]. Yet more challenging than the aforementioned are problems constrained by unsteady PDEs involving (countably many) parametric or uncertain inputs. A viable solution approach to optimization problems with stochastic constraints employs the spectral stochastic Galerkin finite element method (SGFEM). However, the SGFEM often leads to the so-called curse of dimensionality, in the sense that it results in prohibitively high dimensional linear systems with tensor product structure [1, 2, 4]. Moreover, a typical model for an optimal control problem with stochastic inputs (OCPS) will usually be used for the quantification of the statistics of the system response – a task that could in turn result in additional enormous computational expense.

It is worth pursuing computationally efficient ways to simulate OCPS using SGFEMs since the Galerkin approximation provides a favorable framework for error estimation [3]. In this talk, we consider two prototypical model OCPS and discretize them with SGFEM. We exploit the underlying mathematical structure of the discretized systems at the heart of the optimization routine to derive and analyze low-rank iterative solvers and robust block-diagonal preconditioners for solving the resulting stochastic Galerkin systems. The developed solvers are quite efficient in the reduction of temporal and storage requirements of the high-dimensional linear systems [1, 2]. Finally, we illustrate the effectiveness of our solvers with numerical experiments.

Keywords: Stochastic Galerkin system, iterative methods, PDE-constrained optimization, saddle-point system, low-rank solution, preconditioning, Schur complement.

References:

  1. P. Benner, S. Dolgov, A. Onwunta and M. Stoll, Low-rank solvers for unsteady Stokes-Brinkman optimal control problem with random data, Computer Methods in Applied Mechanics and Engineering, 304, pp. 26 – 54, 2016.
  2. P. Benner, A. Onwunta and M. Stoll, Block-diagonal preconditioning for optimal control problems constrained by PDEs with uncertain inputs, SIAM Journal on Matrix Analysis and Applications, 37 (2), pp. 491 – 518, 2016.
  3. A. Bespalov, C. E. Powell and D. Silvester, Energy norm a posteriori error estimation for parametric operator equations, SIAM Journal on Scientific Computing, 36 (2), pp. A339 – A363, 2013.
  4. E. Rosseel and G. N. Wells, Optimal control with stochastic PDE constraints and uncertain controls, Computer Methods in Applied Mechanics and Engineering, 213-216, pp. 152 – 167, 2012.
  5. M. Stoll and T. Breiten, A low-rank in time approach to PDE-constrained optimization, SIAM Journal on Scientific Computing, 37 (1), pp. B1 – B29, 2015.

About the speaker:

Akwum Onwunta is a postdoctoral research associate at the University of Maryland, College Park (UMCP). Before joining UMCP, he had worked at Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany as a scientific researcher and at Deutsche Bank, Frankfurt, as a Marie Curie research fellow / quantitative risk analyst. He holds a PhD in Mathematics from Otto von Guericke University, Magdeburg, Germany.
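
A hedged sketch of the structure such solvers exploit: a Kronecker-form Galerkin system A = G0 (x) K0 + G1 (x) K1 applied matrix-free inside SciPy's conjugate gradient, so the full (nm)-by-(nm) matrix is never assembled. The matrices are synthetic stand-ins, and the talk's low-rank methods additionally compress the iterates themselves, which this sketch does not:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Kronecker-structured SPD system, applied via (G (x) K) vec(X) = vec(K X G^T).
rng = np.random.default_rng(0)
n, m = 40, 15                                 # spatial dofs x stochastic modes
R0 = rng.standard_normal((n, n))
K0 = np.eye(n) + R0 @ R0.T / n                # SPD mean-field block
R1 = rng.standard_normal((n, n))
K1 = R1 @ R1.T / n                            # PSD fluctuation block
G0, G1 = np.eye(m), np.diag(rng.uniform(0.0, 0.5, m))

def apply_A(xvec):
    X = xvec.reshape(n, m, order="F")         # vec(.) is column-major
    return (K0 @ X @ G0.T + K1 @ X @ G1.T).reshape(-1, order="F")

A = LinearOperator((n * m, n * m), matvec=apply_A)
b = rng.standard_normal(n * m)
x, info = cg(A, b)
print(info, np.linalg.norm(apply_A(x) - b))   # 0 (converged), small residual
```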


Friday, June 05, 2020
Patrick O’Neil, BlackSky
Applications of Deep Learning to Large Scale Remote Sensing

Abstract:

With the proliferation of Earth imaging satellites, the rate at which satellite imagery is acquired has outpaced the ability to manually review the data. Therefore, it is critical to develop systems capable of autonomously monitoring the globe for change. At BlackSky, we use a host of deep learning models, deployed in Amazon Web Services, to process all images downlinked from our Global constellation of imaging satellites. In this talk, we will discuss some of these models and the challenges we face when building remote sensing machine learning models at scale.


Friday, June 12, 2020
Ira B. Schwartz, US Naval Research Laboratory
Fear in Networks: How social adaptation controls epidemic outbreaks

Abstract:

Disease control is of paramount importance in public health, with total eradication as the ultimate goal. Mathematical models of disease spread in populations are an important component in implementing effective vaccination and treatment campaigns. However, human behavior in response to an outbreak of disease has only recently been included in the modeling of epidemics on networks. In this talk, I will review some of the mathematical models and machinery used to describe the underlying dynamics of rare events in finite population disease models, which include human reactions on what are called adaptive networks. I will then present a new model that includes a dynamical systems description of the force of the noise that drives the disease to extinction. Coupling the effective force of noise with vaccination as well as human behavior reveals how to best utilize stochastic disease-controlling resources such as vaccination and treatment programs. Finally, I will also present a general theory to derive the most probable paths to extinction for heterogeneous networks, which leads to a novel optimal control to extinction.

This research has been supported by the Office of Naval Research, the Air Force Office of Scientific Research, and the National Institutes of Health, and was done primarily in collaboration with Jason Hindes, Brandon Lindley, and Leah Shaw.

About the speaker:

Trained and educated as both an applied mathematician (University of Maryland, Ph.D.) and physicist (University of Hartford, B.S.), Dr. Schwartz and his collaborators, postdoctoral fellows and students have impacted a diverse array of applications in the field of nonlinear science. Dr. Schwartz has over 120 refereed publications in areas such as physics, mathematics, biology and chemistry. The main underlying theme in the applications field has been the mathematical and numerical techniques of nonlinear dynamics and chaos, and most recently, nonlinear stochastic analysis and control of cooperative and networked dynamical systems. Dr. Schwartz has been written up several times in Science and Scientific American magazines, has given invited and plenary talks at international applied mathematics, physics, and engineering conferences, and he is one of the founding organizers of the biennial SIAM conference on Dynamical Systems. Several of his discoveries developed in nonlinear science are currently patented, including collaborative robots, synchronized coupled lasers, and chaos tracking and control, for which he was awarded the US Navy Tech Transfer award. Dr. Schwartz is an elected fellow of the American Physical Society and the current vice-chair of the SIAM Dynamical Systems Group.
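
A toy discrete-time version of the adaptive-network mechanism in the abstract (illustrative parameters throughout, with link cutting standing in for the rewiring used in the full models): susceptible nodes sever contacts with infected neighbors, which suppresses the outbreak:

```python
import numpy as np

# Discrete-time SIS epidemic on an adaptive contact network: susceptible
# nodes cut links to infected neighbors with probability w each step.
rng = np.random.default_rng(0)
N, p_link, beta, gamma, w = 200, 0.05, 0.06, 0.05, 0.3
A = np.triu((rng.random((N, N)) < p_link).astype(int), 1)
A = A + A.T                                    # undirected, no self-loops
infected = rng.random(N) < 0.1                 # initial outbreak

for t in range(300):
    pressure = A @ infected                    # number of infected neighbors
    new_inf = (~infected) & (rng.random(N) < 1 - (1 - beta) ** pressure)
    recovered = infected & (rng.random(N) < gamma)
    si = np.outer(~infected, infected) & (A == 1)     # S-I links
    cut = si & (rng.random((N, N)) < w)               # fear-driven cutting
    A[cut] = 0
    A[cut.T] = 0
    infected = (infected | new_inf) & ~recovered

print(infected.mean())   # prevalence after adaptation has pruned S-I links
```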


Friday, June 19, 2020
Thomas M. Surowiec, Philipps-Universität Marburg
Optimization of Elliptic PDEs with Uncertain Inputs: Basic Theory and Numerical Stability

Abstract:

Systems of partial differential equations subject to random parameters provide a natural way of incorporating noisy data or model uncertainty into a mathematical setting. The associated optimal decision-making problems, whose feasible sets are at least partially governed by the solutions of these random PDEs, are infinite dimensional stochastic optimization problems. In order to obtain solutions that are resilient to the underlying uncertainty, a common approach is to use risk measures to model the user’s risk preference. The talk will be split into two main parts: Basic Theory and Numerical Stability.

In the first part, we propose a minimal set of technical assumptions needed to prove existence of solutions and derive optimality conditions. For the second part of the talk, we consider a specific class of stochastic optimization problems motivated by the application to PDE-constrained optimization. In particular, we are interested in finding answers to such questions as: How do the solutions behave in the large-data limit? Can we derive statements on the rate of convergence as the sample-size increases and mesh-size decreases?

After reviewing several notions of probability metrics and their usage in stability analysis of stochastic optimization problems, we present qualitative and quantitative stability results. These results demonstrate the parametric dependence of the optimal values and optimal solutions with respect to changes in the underlying probability measure. These statements provide us with answers to the questions posed above for a class of risk-neutral PDE-constrained problems.
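
For readers unfamiliar with risk measures, a standard computable example is the Conditional Value-at-Risk in its Rockafellar-Uryasev form; the hedged sketch below estimates it from samples in two ways, with a lognormal distribution standing in for a random cost functional:

```python
import numpy as np
from scipy.optimize import minimize

# Sample-average CVaR via the Rockafellar-Uryasev formula
#   CVaR_a(X) = min_t  t + E[(X - t)_+] / (1 - a).
rng = np.random.default_rng(0)
cost = rng.lognormal(mean=0.0, sigma=0.5, size=20000)  # stand-in random cost
alpha = 0.95

def ru(t):
    return float(t[0] + np.mean(np.maximum(cost - t[0], 0.0)) / (1 - alpha))

res = minimize(ru, x0=np.array([1.0]), method="Nelder-Mead")
var = np.quantile(cost, alpha)                 # Value-at-Risk
cvar_tail = cost[cost >= var].mean()           # tail-average estimate
print(res.fun, cvar_tail)                      # the two estimates agree
```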


Friday, June 26, 2020
Mahamadi Warma, George Mason University
Fractional PDEs and their controllability properties: What is so far known and what is still unknown?

Abstract:

In this talk, we are interested in fractional PDEs (elliptic, parabolic and hyperbolic) associated with the fractional Laplace operator. After introducing some real-life phenomena where these problems occur, we shall give a complete overview of the subject. The similarities and the differences of these fractional PDEs with the classical local PDEs will be discussed. Concerning the control theory of fractional PDEs, we will give a complete overview of the topic. More precisely, we will introduce the important results obtained so far and enumerate several related important problems that have not yet been investigated by the mathematics community. The talk will be delivered for a wide audience, avoiding unnecessary technicalities.
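
One common definition, shown here as a hedged numerical sketch, is the spectral fractional Laplacian on (0,1) with Dirichlet conditions: expand in the sine eigenbasis and raise the eigenvalues to the power s. The test verifies the exact action on an eigenfunction:

```python
import numpy as np

# Spectral fractional Laplacian on (0,1) with Dirichlet conditions:
#   (-Delta)^s u = sum_k lambda_k^s u_k sin(k pi x),  lambda_k = (k pi)^2.
def frac_laplacian(u_vals, x, s, n_modes=100):
    k = np.arange(1, n_modes + 1)
    phi = np.sin(np.pi * np.outer(x, k))             # eigenfunctions on grid
    dx = x[1] - x[0]
    coeff = 2.0 * (phi * u_vals[:, None]).sum(axis=0) * dx   # sine coefficients
    return phi @ (((k * np.pi) ** 2) ** s * coeff)

x = np.linspace(0.0, 1.0, 2001)
u = np.sin(2 * np.pi * x)                            # an eigenfunction
out = frac_laplacian(u, x, s=0.5)
print(np.max(np.abs(out - 2 * np.pi * u)))           # ((2 pi)^2)^0.5 = 2 pi; ~0
```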


Friday, July 03, 2020
no colloquium

Friday, July 10, 2020
John Harlim, The Pennsylvania State University
Learning Missing Dynamics through Data
video (password: 7v.#=9%N)

Abstract:

Recent success of machine learning has drawn tremendous interest in applied mathematics and scientific computing. In this talk, I will address the classical closure problem, also known as model error, missing dynamics, or reduced-order modeling in various communities. In particular, I will discuss a general framework to compensate for the model error. The proposed framework reformulates the model error problem into a supervised learning task to approximate very high-dimensional target functions, involving the Mori-Zwanzig representation of the projected dynamical systems. The connection to traditional parametric approaches will be clarified as specifying the appropriate hypothesis space for the target function. Theoretical convergence and numerical demonstrations on modeling problems arising from PDEs will be discussed.
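
A hedged toy instance of the framework: the "known physics" is x' = -x, and the missing term g(x) = sin(3x) is recovered by regressing the observed residual on a polynomial hypothesis space. The talk's Mori-Zwanzig setting also lets the closure depend on the history of the state, which this memoryless sketch omits:

```python
import numpy as np

# Recover the missing term g by regressing (x_{k+1} - x_k)/dt - (-x_k)
# on polynomial features of x_k.
rng = np.random.default_rng(0)
dt, n = 0.01, 20000
x = np.empty(n)
x[0] = 0.5
for k in range(n - 1):                     # generate "truth" data
    x[k + 1] = x[k] + dt * (-x[k] + np.sin(3 * x[k]))

resid = (x[1:] - x[:-1]) / dt - (-x[:-1])  # what the known model misses
feats = np.vander(x[:-1], 6)               # degree-5 polynomial features
coef, *_ = np.linalg.lstsq(feats, resid, rcond=None)

xt = np.linspace(x.min(), x.max(), 5)
print(np.vander(xt, 6) @ coef)             # ~ sin(3 xt) on the visited range
print(np.sin(3 * xt))
```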


Friday, July 17, 2020
Maziar Raissi, University of Colorado Boulder
Hidden Physics Models
video (password: 1P&@+!5v)

Abstract:

A grand challenge with great opportunities is to develop a coherent framework that enables blending conservation laws, physical principles, and/or phenomenological behaviors expressed by differential equations with the vast data sets available in many fields of engineering, science, and technology. At the intersection of probabilistic machine learning, deep learning, and scientific computations, this work is pursuing the overall vision to establish promising new directions for harnessing the long-standing developments of classical methods in applied mathematics and mathematical physics to design learning machines with the ability to operate in complex domains without requiring large quantities of data. To materialize this vision, this work is exploring two complementary directions: (1) designing data-efficient learning machines capable of leveraging the underlying laws of physics, expressed by time dependent and non-linear differential equations, to extract patterns from high-dimensional data generated from experiments, and (2) designing novel numerical algorithms that can seamlessly blend equations and noisy multi-fidelity data, infer latent quantities of interest (e.g., the solution to a differential equation), and naturally quantify uncertainty in computations.


Friday, July 24, 2020
Ratna Khatri, Naval Research Lab
Fractional Deep Neural Network via Constrained Optimization

Abstract:

In this talk, we will introduce a novel algorithmic framework for a deep neural network (DNN) which allows us to incorporate history (or memory) into the network. This DNN, called Fractional-DNN, can be viewed as a time-discretization of a fractional in time nonlinear ordinary differential equation (ODE). The learning problem then is a minimization problem subject to that fractional ODE as constraints. We test our network on datasets for classification problems. The key advantages of the fractional-DNN are a significant improvement to the vanishing gradient issue due to the memory effect, and a better handling of nonsmooth data due to the network's ability to approximate non-smooth functions.
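
A hedged sketch of such a forward pass: applying the standard L1 discretization of a Caputo derivative of order alpha to D^alpha h = sigma(Wh + b) makes each layer update a weighted sum over all previous layer increments, which is exactly the memory effect described above. The weights below are random stand-ins for trained parameters:

```python
import numpy as np
from math import gamma

# L1-scheme "fractional residual" forward pass with memory weights
# b_j = (j+1)^(1-alpha) - j^(1-alpha).
rng = np.random.default_rng(0)
alpha, dt, L, width = 0.7, 0.5, 8, 16
Ws = [0.1 * rng.standard_normal((width, width)) for _ in range(L)]
bs = [0.1 * rng.standard_normal(width) for _ in range(L)]
b_coef = [(j + 1) ** (1 - alpha) - j ** (1 - alpha) for j in range(L)]

h = [rng.standard_normal(width)]               # h[0] = input features
for k in range(1, L + 1):
    f = np.tanh(Ws[k - 1] @ h[k - 1] + bs[k - 1])
    memory = sum(b_coef[j] * (h[k - j] - h[k - j - 1]) for j in range(1, k))
    h.append(h[k - 1] - memory + gamma(2 - alpha) * dt ** alpha * f)

print(h[-1][:4])   # output depends on the whole layer history, not just h[L-1]
```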


Birgul Koc, Virginia Tech
Data-Driven Variational Multiscale Reduced Order Models

Abstract:

We propose a new data-driven reduced order model (ROM) framework that centers around the hierarchical structure of the variational multiscale (VMS) methodology and utilizes data to increase the ROM accuracy at a modest computational cost. The VMS methodology is a natural fit for the hierarchical structure of the ROM basis: In the first step, we use the ROM projection to separate the scales into three categories: (i) resolved large scales, (ii) resolved small scales, and (iii) unresolved scales. In the second step, we explicitly identify the VMS-ROM closure terms, i.e., the terms representing the interactions among the three types of scales. In the third step, instead of ad hoc modeling techniques used in VMS for standard numerical methods (e.g., finite element), we use available data to model the VMS-ROM closure terms. Thus, instead of phenomenological models used in VMS for standard numerical discretizations (e.g., eddy viscosity models), we utilize available data to construct new structural VMS-ROM closure models. Specifically, we build ROM operators (vectors, matrices, and tensors) that are closest to the true ROM closure terms evaluated with the available data. We test the new data-driven VMS-ROM in the numerical simulation of the 1D Burgers equation and the 2D flow past a circular cylinder. The numerical results show that the data-driven VMS-ROM is significantly more accurate than standard ROMs.
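
A minimal data-driven closure in the spirit described above, as a hedged sketch (not the VMS-ROM itself, which separates resolved large and small scales and uses tensor closure terms): build a POD basis from snapshots of u' = Au and fit a linear closure so the ROM velocity matches the projected true velocity:

```python
import numpy as np

# POD-Galerkin ROM of u' = A u with a least-squares linear closure:
#   a' = (Phi^T A Phi) a + C a.
rng = np.random.default_rng(0)
n, r, steps, dt = 100, 6, 400, 0.01
A = -np.eye(n) + 0.4 * rng.standard_normal((n, n)) / np.sqrt(n)

U = np.empty((n, steps))
U[:, 0] = rng.standard_normal(n)
for k in range(steps - 1):
    U[:, k + 1] = U[:, k] + dt * (A @ U[:, k])          # snapshot generation

Phi = np.linalg.svd(U, full_matrices=False)[0][:, :r]   # POD modes
a = Phi.T @ U                                           # reduced coordinates
galerkin = Phi.T @ A @ Phi                              # resolved dynamics
target = Phi.T @ (A @ U) - galerkin @ a                 # what Galerkin misses
C = np.linalg.lstsq(a.T, target.T, rcond=None)[0].T     # fitted closure

rel = np.linalg.norm(target - C @ a) / np.linalg.norm(target)
print(rel)   # relative residual of the fitted closure term
```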


Friday, July 31, 2020
Eric Cyr, Sandia National Laboratories
A Layer-Parallel Approach for Training Deep Neural Networks

Abstract:

Deep neural networks are a powerful machine learning tool with the capacity to “learn” complex nonlinear relationships described by large data sets. Despite their success, training these models remains a challenging and computationally intensive undertaking. In this talk we will present a new layer-parallel training algorithm that exploits a multigrid scheme to accelerate both forward and backward propagation. Introducing a parallel decomposition between layers requires inexact propagation of the neural network. The multigrid method used in this approach stitches these subdomains together with sufficient accuracy to ensure rapid convergence. We demonstrate an order of magnitude wall-clock time speedup over the serial approach, opening a new avenue for parallelism that is complementary to existing approaches. Results for this talk can be found in [1,2]. We will also present related work concerning parallel-in-time optimization algorithms for PDE-constrained optimization.

[1] S. Guenther, L. Ruthotto, J. B. Schroder, E. C. Cyr, N. R. Gauger, Layer-Parallel Training of Deep Residual Neural Networks, SIMODs, Vol. 2 (1), 2020.
[2] E. C. Cyr, S. Guenther, J. B. Schroder, Multilevel Initialization for Layer-Parallel Deep Neural Network Training, arXiv preprint arXiv:1912.08974, 2019.
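
A simplified, hedged illustration of the parallel-in-depth idea via parareal on y' = lambda*y (the papers above use the multilevel MGRIT generalization, not this two-level scheme): a cheap sequential coarse propagator is corrected by fine propagations that are independent across layer blocks within each iteration:

```python
import numpy as np

# Parareal on [0, T] split into N "layer blocks": G is one cheap Euler
# step per block (sequential); F is the exact block propagator, and all
# F evaluations within an iteration are independent, hence parallel.
lam, T, N = -1.0, 2.0, 10
dT = T / N
F = lambda y: y * np.exp(lam * dT)        # fine/exact block propagator
G = lambda y: y * (1 + lam * dT)          # coarse block propagator

y = np.empty(N + 1)
y[0] = 1.0
for k in range(N):                        # initial coarse sweep
    y[k + 1] = G(y[k])

for it in range(4):                       # parareal corrections
    Fy = F(y[:-1])                        # parallelizable fine solves
    Gy = G(y[:-1])
    for k in range(N):                    # cheap sequential correction
        y[k + 1] = G(y[k]) + Fy[k] - Gy[k]

print(abs(y[-1] - np.exp(lam * T)))       # ~0 after a few iterations
```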


Friday, August 07, 2020
Marta D'Elia, Sandia National Laboratories
A unified theoretical and computational nonlocal framework: generalized vector calculus and machine-learned nonlocal models

Abstract:

Nonlocal models provide an improved predictive capability thanks to their ability to capture effects that classical partial differential equations fail to capture. Among these effects are multiscale behavior (e.g., in fracture mechanics) and anomalous behavior such as super- and sub-diffusion. These models have become incredibly popular for a broad range of applications, including mechanics, subsurface flow, turbulence, heat conduction, and image processing. However, their improved accuracy comes at the price of many modeling and numerical challenges.

In this talk I will first address the problem of connecting nonlocal and fractional calculus by developing a unified theoretical framework that enables the identification of a broad class of nonlocal models. Then, I will present two recently developed machine-learning techniques for nonlocal and fractional model learning. These physics-informed, data-driven, tools allow for the reconstruction of model parameters or nonlocal kernels. Several numerical tests in one and two dimensions illustrate our theoretical findings and the robustness and accuracy of our approaches.
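
As a small concrete example of a nonlocal operator (a constant-kernel special case chosen for the sketch, not the learned kernels from the talk): a 1D nonlocal diffusion operator with horizon delta, scaled so that it converges to the local Laplacian u'' as delta shrinks:

```python
import numpy as np

def nonlocal_laplacian(u, h, r):
    """Apply L_delta u(x) = int_{|z|<=delta} (u(x+z) - u(x)) * 3/delta^3 dz
    with horizon delta = r*h (trapezoid rule in the interaction variable;
    a boundary strip of width r is left at zero)."""
    n = len(u)
    delta = r * h
    out = np.zeros(n)
    for j in range(1, r + 1):
        w = 0.5 if j == r else 1.0
        out[r:n-r] += w * (u[r+j:n-r+j] + u[r-j:n-r-j] - 2.0 * u[r:n-r])
    return out * (3.0 / delta**3) * h

x = np.linspace(0.0, 2 * np.pi, 4001)
h = x[1] - x[0]
u = np.sin(x)
for r in (320, 160, 80):                  # horizons delta = r*h, halving
    Lu = nonlocal_laplacian(u, h, r)
    err = np.max(np.abs(Lu[400:-400] + u[400:-400]))   # compare with u'' = -sin
    print(r * h, err)                     # error shrinks like delta^2
```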


Research Interaction and Training Seminars (RITS)

CMAI Summer Schools

Conferences & Workshops