Summary by Open Review 1 year ago
This paper is a direct extension of the work of \cite{journals/corr/1305.2532}. Prior to this work, \cite{journals/corr/1305.2532} proposed an algorithm for learning to predict fixed size lists using either a single greedy policy (SCP) or a list of sequentially applied policies (CONSEQOPT). Given a sub-modular reward function, \cite{journals/corr/1305.2532} provided theoretical guarantees on the regret of the learned policies via a reduction to cost sensitive classification. In practice one can then approximate the cost sensitive classification loss with convex surrogate losses that can be efficiently learned. In this work, the authors extend the framework of \cite{journals/corr/1305.2532} to knapsack problems, wherein each element in the list has an associated length and the sum total lengths may not exceed a fixed budget. (Note that I intentionally use the word "length" to differentiate from the "cost" of cost-sensitive classification in the reduction, which gets a bit confusing in the paper as the authors overload the word "cost" frequently.) They extend the algorithm of \cite{journals/corr/1305.2532} to be sensitive to the length of items added to the list, and extend the analysis of \cite{journals/corr/1305.2532} to provide regret guarantees in this setting. The authors then apply their algorithm to the problem of multi-document summarization, where items correspond to sentences and the budget constraints the total length of the summary. They show an improvement over prior work with their new approach.
The paper is reasonably well written and straightforward to follow, with the exception of containing lots of notation and terminology that is sometimes difficult to keep straight (e.g. the definitions of the words "cost" and "weight", and the use of \ell as both cost sensitive losses and the length (cost? weight?) of each item for the budget.) The work itself is a useful extension to prior work that appears to be experimentally sound.
Pros:
+ Novel extension of prior work to incorporate budgetary constraints
+ Works well experimentally
Cons:
- Overloading of symbols and words leads to confusion
References:
\cite{journals/corr/1305.2532} Ross, Stephane, Zhou, Jiaji, Yue, Yisong, Dey, Debadeepta, and Bagnell, .A. Learning policies for contextual submodular prediction. In ICML, 2013.

more
less