Rejection via Learning Density Ratios

Part of Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Main Conference Track

Bibtex Paper

Authors

Alexander Soen, Hisham Husain, Philip Schulz, Vu Nguyen

Abstract

Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions. The predominant approach is to alter the supervised learning pipeline by augmenting typical loss functions, letting model rejection incur a lower loss than an incorrect prediction.Instead, we propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.This can be formalized via the optimization of a loss's risk with a $ \phi$-divergence regularization term.Through this idealized distribution, a rejection decision can be made by utilizing the density ratio between this distribution and the data distribution.We focus on the setting where our $ \phi $-divergences are specified by the family of $ \alpha $-divergence.Our framework is tested empirically over clean and noisy datasets.