The paper studies the online experimental design problem in which there are temporal dependencies between the two control policies/treatments. All reviewers appreciated the novelty of the problem setup and the theoretical analysis. Although the analysis is the main contribution, the paper would be much stronger if it included meaningful experiments on toy problems showcasing the performance of the online MLE-based approach against standard experimental design approaches.