Paper ID: 591
Title: Stochastic Convex Optimization with Multiple Objectives
Reviews

Submitted by Assigned_Reviewer_1

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper proposes an approach to stochastic multi-objective optimization. The main idea is simply stated: optimize a single objective while treating the other objectives as constraints. The authors propose a primal-dual stochastic optimization algorithm to solve the problem and prove that it achieves (for the primal objective) the optimal 1/\sqrt{T} convergence rate.
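For reference, the reduced problem as I read it is (a sketch of the formulation; the \gamma_i are the user-chosen constraint levels):

  \min_{x \in \mathcal{X}} \ \bar{f}_0(x) \quad \text{s.t.} \quad \bar{f}_i(x) \le \gamma_i, \quad i = 1, \dots, m.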

As far as I am concerned, the theory is solid and it does provide good insight into the problem of interest. The authors did not mention how to set the parameters gamma_i, which may be difficult for specific problems. In fact, setting gamma_i is not necessarily easier than setting the mixture weights in the linear scalarization approach. The paper would be more interesting if the gamma_i were not needed in advance.

It is nice to see that the \lambda^0's can be upper bounded in Section 4.2. It would be interesting to discuss the optimal choice of \theta in certain circumstances. The paper is well written and the results (as far as I know) are new. It is nice to see that the stochastic method fits the multi-objective problem well, even though the choice of gamma_i remains a problem.
Q2: Please summarize your review in 1-2 sentences
A good paper overall. Interesting theoretical results with rigorous arguments. It would be better if some empirical evaluation were included.

Submitted by Assigned_Reviewer_4

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper presents an algorithm for multi-objective stochastic convex optimization.
The issue of multiple objectives is handled by choosing one objective to minimize and by
constraining all the others via upper bounds. The stochastic sampling is addressed by averaging
the costs over the available samples. The proposed algorithm is then a primal-dual projected gradient descent.
The authors also show that two naive implementations yield a worse solution error bound than the primal-dual
approach.
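For intuition, here is a minimal sketch of my reading of the primal-dual update (the oracle interface, the constant step size, and the multiplier cap are my own illustrative assumptions, not the paper's exact pseudocode):

import numpy as np

def primal_dual_sgd(oracle, proj, x0, gamma, T, eta, lam_max):
    # Minimal primal-dual stochastic projected (sub)gradient sketch.
    # oracle(x) -> (vals, grads): stochastic estimates of the m+1
    #   objective values and subgradients at x (index 0 is the objective
    #   being minimized, indices 1..m the constrained objectives).
    # proj(x): projection onto the feasible domain.
    # gamma: length-m array of constraint levels gamma_i.
    m = len(gamma)
    x, lam = x0.copy(), np.zeros(m)
    x_sum = np.zeros_like(x0)
    for _ in range(T):
        vals, grads = oracle(x)
        # Primal descent step on the stochastic Lagrangian
        # L(x, lam) = f_0(x) + sum_i lam_i * (f_i(x) - gamma_i)
        g = grads[0] + sum(lam[i] * grads[i + 1] for i in range(m))
        x = proj(x - eta * g)
        # Dual ascent step on the constraint violations; keep the
        # multipliers nonnegative and bounded (the paper bounds lambda).
        lam = np.clip(lam + eta * (np.asarray(vals[1:]) - gamma), 0.0, lam_max)
        x_sum += x
    return x_sum / T  # averaged iterate, standard for a 1/sqrt(T) rate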

Quality
--
The paper seems technically sound. I have checked several steps but not all.
The most serious shortcoming is that there are no experiments at all to
support any of the claims made in the paper.
Overall, this seems an unfinished paper. Although this is mostly a
theoretical paper, I recommend adding experiments to show the advantages
and limitations of the proposed strategy (turning m objectives into
constraints).
For example, how does one choose the bound for each objective function?
What is the best way to pick the objective to be minimized?
Can one draw any connection to Pareto-optimal solutions?

Clarity
--
Overall the paper is clear.
The authors should also include some illustrations alongside the theory.
Typos/unclear
line 183: what is \hat f_i ?
line 215: to fail for -> fail for


Originality
--
The originality of this work is difficult to assess.
It seems that Algorithm 1 and the formal proofs are new,
but I am not an expert in the field and relevant literature
could have escaped me. The idea of moving all but one
objective into the set of constraints seems quite simple
and comes with several limitations.


Significance
--
The use of multiple objectives together with stochastic sampling
appears in a number of applications (some shown at the end of
the Introduction) and is worthy of attention.
However, the resulting algorithm follows from quite a straightforward
idea and has some limitations (optimization is with respect to only
one objective function) that are not fully examined in the paper.
The full significance of the algorithm is not demonstrated with
experiments either. The main contribution is thus the proof of
the error bounds (Theorems 1 and 2).

Q2: Please summarize your review in 1-2 sentences
Multi-objective optimization can be tackled by selecting one cost and
imposing bounds on the other costs. There are no experiments to
validate the theory.

Submitted by Assigned_Reviewer_5

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper considers the setting where we want to optimize one function \bar{f}_0 when there are constraints on other functions \bar{f}_1, \ldots, \bar{f}_m, but our access to both the objective and the constraints is only stochastic. Naive approaches do not work, but a very natural primal-dual algorithm does, achieving a rate of 1/\sqrt{T}.

Overall, I was happy with almost all aspects of the paper; however, the convergence analysis could be made clearer and more intuitive. While it was shown that simple approaches do not work, the algorithm that does work is quite simple and probably one of the first one would think of. This could be a pro (someone has to prove it, and it might not be obvious how) or a con (the result is not that surprising). However, it is a complete work, most of my concerns were addressed, and it is fairly easy to follow, clear, and original.

Minor typos :
In many places the paper writes \lambda^*_i and elsewhere \lambda^i_*. I assumed this was a typo and that both refer to the same object.
Q2: Please summarize your review in 1-2 sentences
I recommend this paper for acceptance, because it is clear and fairly original, of high quality, and of significance to the community, but the surprise factor is low.
Author Feedback

Q1:Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 6000 characters. Note however that reviewers and area chairs are very busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
We are grateful to all the anonymous reviewers for their useful and constructive comments, and to the PC members for handling our paper.

========= Reviewer 1 ==========
Q: Setting gamma_i is still challenging.

A: We agree that \gamma_i needs to be set appropriately, but it is worth emphasizing that setting \gamma_i is relatively easier than setting the combination weights required in the scalarization method. For instance, in maximizing both precision and recall, in Neyman–Pearson classification, and in the finance example discussed at the beginning of the article, this can be done using prior knowledge of the problem domain (e.g., [1] and [2] discuss this for a few applications). Also, as discussed, the proposed primal-dual algorithm guarantees bounds on *ALL* the objectives, while solving a weighted combination of the objectives does not necessarily guarantee any bound on the individual objectives, which is the main goal of the proposed method.

[1] Clayton Scott, Performance Measures for Neyman–Pearson Classification, IEEE-IT, 2007.
[2] Clayton Scott and Robert Nowak, A Neyman–Pearson Approach to Statistical Learning, IEEE-IT, 2005.
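For concreteness, in Neyman–Pearson classification the constraint level has a direct domain interpretation (a sketch; \alpha denotes the user-specified false-positive budget):

  \min_{h} \ R_{-}(h) \quad \text{s.t.} \quad R_{+}(h) \le \alpha,

where R_{+}(h) is the false-positive rate and R_{-}(h) the miss rate of classifier h, so the level \gamma = \alpha comes directly from prior knowledge.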

Q: It is nice to see lambda^0's can be upper bounded in Section 4.2. It would be interesting to discuss the optimal choice of \theta in certain circumstances.

A: Thanks a lot for pointing these facts out. We will resolve these issues and include a discussion of these parameters in the setting of the three problems discussed in the Introduction.

Q: Would be better if including some empirical evaluation

A: Regarding the empirical evaluation of the proposed algorithms, we would like to mention that the application of the algorithm to Neyman–Pearson classification and the finance problem will be included in an extended version. Also, in this paper we only considered simple Lipschitz-continuous objectives; the generalization to objectives under stronger assumptions, such as smoothness and strong convexity, to obtain better convergence rates will be discussed in the extended work as well.

========= Reviewer 2 ==========

Q: Although this is mostly a theoretical paper, I'd recommend to add experiments to show advantages and limitations of the proposed strategy.

A: We agree with the reviewer that experimental results would definitely make this work more complete. We will include empirical studies, along with the detailed algorithm for each specific application and the generalization of the algorithm to smooth and strongly convex functions, in an extended version of this work.

Q: Optimizing only one objective function, with the rest added as constraints, is too simple and of limited significance.

A: Indeed, this strategy is popular in the study of multi-objective optimization. But in this work we showed that, in stochastic optimization, the proposed primal-dual algorithm is able to bound *ALL* the objectives. As we discussed, solving the reduced problem with a standard stochastic optimization algorithm does not work, even under strong assumptions on the stochastic objectives. Hence, our work departs from the previous literature in its treatment of the objectives and its guarantees on the individual objectives.


========= Reviewer 3 ==========
Thanks a lot for pointing out the typo. We will fix it and do our best to make the convergence analysis clearer and more intuitive.