Part of Advances in Neural Information Processing Systems 11 (NIPS 1998)
Timothy Brown, Hui Tong, Satinder Singh
This paper examines the application of reinforcement learning to a telecommunications networking problem. The problem requires that revenue be maximized while simultaneously meeting a quality of service constraint that forbids entry into certain states. We present a general solution to this multi-criteria problem that is able to earn significantly higher revenues than alternatives.