Collaborative Filtering for Implicit Feedback Datasets Collaborative Filtering for Implicit Feedback Datasets
Paper summary This paper is about a recommendation system approach using collaborative filtering (CF) on implicit feedback datasets. The core of it is the minimization problem $$\min_{x_*, y_*} \sum_{u,i} c_{ui} (p_{ui} - x_u^T y_i)^2 + \underbrace{\lambda \left ( \sum_u || x_u ||^2 + \sum_i || y_i ||^2\right )}_{\text{Regularization}}$$ with * $\lambda \in [0, \infty[$ is a hyper parameter which defines how strong the model is regularized * $u$ denoting a user, $u_*$ are all user factors $x_u$ combined * $i$ denoting an item, $y_*$ are all item factors $y_i$ combined * $x_u \in \mathbb{R}^n$ is the latent user factor (embedding); $n$ is another hyper parameter. $n=50$ seems to be a reasonable choice. * $y_i \in \mathbb{R}^n$ is the latent item factor (embedding) * $r_{ui}$ defines the "intensity"; higher values mean user $u$ interacted more with item $i$ * $p_{ui} = \begin{cases}1 & \text{if } r_{ui} >0\\0 &\text{otherwise}\end{cases}$ * $c_{ui} := 1 + \alpha r_{ui}$ where $\alpha \in [0, \infty[$ is a hyper parameter; $\alpha =40$ seems to be reasonable In contrast, the standard matrix factoriation optimization function looks like this ([example]( $$\min_{x_*, y_*} \sum_{(u, i, r_{ui}) \in \mathcal{R}} {(r_{ui} - x_u^T y_i)}^2 + \underbrace{\lambda \left ( \sum_u || x_u ||^2 + \sum_i || y_i ||^2\right )}_{\text{Regularization}}$$ where * $\mathcal{R}$ is the set of all ratings $(u, i, r_{ui})$ - user $u$ has rated item $i$ with value $r_{ui} \in \mathbb{R}$ They use alternating least squares (ALS) to train this model. The prediction then is the dot product between the user factor and all item factors ([source]( allows researchers to publish paper summaries that are voted on and ranked!

Sponsored by: and