Gerrit Welper
Assistant Professor of Mathematics,
College of Sciences
University of Central Florida
"Approximation and Optimization Theory for Neural Networks"
Wednesday, Sep 18, 2024
Schedule:
- Nespresso & Teatime: 03:00 to 03:30 PM Eastern Time (US and Canada), 417 DSL Commons
- Colloquium: 03:30 to 04:30 PM Eastern Time (US and Canada), 499 DSL Seminar Room
Join via Zoom: Meeting # 942 7359 5552
Abstract:
The error analysis of neural networks is often split into three components: the approximation error describes how well the networks can approximate functions, given infinite data; the estimation error describes the additional error contributions from sampling; and the optimization error describes the contributions from the numerical optimizers.
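As a rough illustration of this splitting (the notation below is not from the abstract and the precise decomposition in the talk may differ), write f for the target function, f_m^* for the best network of width m, f_{m,n}^* for the empirical risk minimizer on n samples, and \hat{f} for the network returned by the optimizer; the triangle inequality then gives

\| f - \hat{f} \| \;\le\; \underbrace{\| f - f_m^* \|}_{\text{approximation}} \;+\; \underbrace{\| f_m^* - f_{m,n}^* \|}_{\text{estimation}} \;+\; \underbrace{\| f_{m,n}^* - \hat{f} \|}_{\text{optimization}}.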
In particular, the latter is best understood in severely over-parametrized regimes, where the width exceeds the number of samples and the networks show an almost linear behavior dominated by the neural tangent kernel (NTK). Many practical networks, however, have width smaller than the number of samples, and we therefore extend the theory to the under-parametrized regime. This requires a more careful consideration of approximation errors and results in a unified theory for all three error components.
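For reference, the "almost linear behavior" refers to the standard first-order expansion of a network f(x; \theta) around its initialization \theta_0 (illustrative notation, not taken from the abstract):

f(x; \theta) \;\approx\; f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta - \theta_0),
\qquad
K_{\mathrm{NTK}}(x, x') \;=\; \big\langle \nabla_\theta f(x; \theta_0),\, \nabla_\theta f(x'; \theta_0) \big\rangle,

so that in the heavily over-parametrized regime training behaves approximately like kernel regression with K_{\mathrm{NTK}}.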
Since the NTK theory cannot adequately capture the nonlinear nature of the networks, we also provide some first steps towards a nonlinear theory by comparing trained simple model networks with known results from nonlinear approximation.