Part of Advances in Neural Information Processing Systems 36 (NeurIPS 2023) Main Conference Track
Jiyoung Park, Ian Pelakh, Stephan Wojtowytsch
We investigate how shallow ReLU networks interpolate between known regions. Our analysis shows that empirical risk minimizers converge to a minimum norm interpolant as the number of data points and the number of parameters tend to infinity, provided that the weight decay regularizer is weighted by a coefficient that vanishes at a precise rate as the network width and the number of data points grow. With and without explicit regularization, we numerically study the implicit bias of common optimization algorithms towards known minimum norm interpolants.
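As a rough illustration of the kind of objective the abstract refers to, the sketch below evaluates an empirical risk plus a weight decay penalty for a shallow ReLU network. It is not the paper's formulation: the squared loss, the choice to leave biases out of the penalty, the factor of 1/2, and all names (`shallow_relu`, `regularized_risk`, `lam`, and the toy data) are assumptions made for this example. The abstract does not state the rate at which the coefficient vanishes, so `lam` is simply passed in by hand.

```python
def relu(z):
    return max(z, 0.0)

def shallow_relu(x, weights, biases, outer):
    """f(x) = sum_i a_i * relu(<w_i, x> + b_i) for a single input vector x."""
    return sum(
        a * relu(sum(wj * xj for wj, xj in zip(w, x)) + b)
        for w, b, a in zip(weights, biases, outer)
    )

def regularized_risk(xs, ys, weights, biases, outer, lam):
    """Mean squared error plus lam times a weight decay penalty.

    Assumption: the penalty covers inner and outer weights but not biases.
    """
    n = len(xs)
    risk = sum(
        (shallow_relu(x, weights, biases, outer) - y) ** 2
        for x, y in zip(xs, ys)
    ) / n
    decay = 0.5 * (
        sum(wj ** 2 for w in weights for wj in w) + sum(a ** 2 for a in outer)
    )
    return risk + lam * decay

# Toy example (hypothetical data): two neurons representing f(x) = |x| in 1D.
W = [[1.0], [-1.0]]   # inner weights, one row per neuron
B = [0.0, 0.0]        # biases
A = [1.0, 1.0]        # outer weights
XS = [[-1.0], [2.0]]  # data points
YS = [1.0, 2.0]       # labels, interpolated exactly by |x|

for lam in (0.1, 0.01, 0.0):
    print(lam, regularized_risk(XS, YS, W, B, A, lam))
```

Because this toy network already interpolates the data, the empirical risk term is zero and the printed value is just `lam` times the penalty. Among interpolating networks, the penalty alone decides which one is preferred, which is the sense in which a vanishing coefficient singles out a minimum norm interpolant.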