Adaptive Subgradient Methods for Online Learning and Stochastic Optimization — Duchi, John C. and Hazan, Elad and Singer, Yoram, 2010

Paper summary by joecohen. This paper introduces Adagrad, an adaptive learning rate method: each parameter's step size is scaled by the inverse square root of the accumulated sum of that parameter's squared gradients. Some sample code from [Stanford CS231n](https://cs231n.github.io/neural-networks-3/#ada) is:
```python
# Assume the gradient dx and parameter vector x, plus hyperparameters
# learning_rate and eps (a small constant, e.g. 1e-8, for numerical stability)
cache += dx**2  # per-parameter running sum of squared gradients
x += - learning_rate * dx / (np.sqrt(cache) + eps)
```
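To make the snippet above concrete, here is a minimal, self-contained sketch of the Adagrad update applied to a simple quadratic objective. The objective, its curvature scales, and the hyperparameter values are illustrative choices, not from the paper; the point is that coordinates with very different gradient magnitudes still receive comparable effective step sizes.

```python
import numpy as np

# Illustrative problem (not from the paper): minimize
# f(x) = 0.5 * (1 * x1^2 + 100 * x2^2), so the gradient is (1*x1, 100*x2).
scales = np.array([1.0, 100.0])
x = np.array([1.0, 1.0])

cache = np.zeros_like(x)      # running sum of squared gradients, per parameter
learning_rate, eps = 0.5, 1e-8

for _ in range(500):
    dx = scales * x                                  # gradient of the quadratic
    cache += dx ** 2                                 # accumulate squared gradients
    x -= learning_rate * dx / (np.sqrt(cache) + eps) # Adagrad update

# Both coordinates shrink toward 0 despite the 100x difference in gradient
# scale, because each coordinate's step is normalized by its own history.
```

Note that on the very first step both coordinates move by exactly `learning_rate`, since `dx / np.sqrt(dx**2)` is ±1 per coordinate; the per-parameter normalization is what distinguishes Adagrad from plain SGD with a decaying step size.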

