
!14375 Add optimizer formula to comments - Adagrad

From: @gerayking
Reviewed-by: @zh_qh, @zhunaipan
Signed-off-by: @zhunaipan
pull/14375/MERGE
mindspore-ci-bot committed 4 years ago
commit 5f5ad74346
1 changed file with 11 additions and 1 deletion
  1. mindspore/nn/optim/ada_grad.py (+11, -1)

mindspore/nn/optim/ada_grad.py

@@ -36,12 +36,22 @@ def _check_param_value(accum, update_slots, prim_name=None):


class Adagrad(Optimizer):
"""
r"""
Implements the Adagrad algorithm with ApplyAdagrad Operator.

Adagrad is an algorithm for online learning and stochastic optimization.
Refer to paper `Efficient Learning using Forward-Backward Splitting
<https://proceedings.neurips.cc/paper/2009/file/621bf66ddb7c962aa0d22ac97d69b793-Paper.pdf>`_.
The updating formulas are as follows,

.. math::
    \begin{array}{ll} \\
        h_{t} = h_{t-1} + g*g \\
        w_{t} = w_{t-1} - lr*\frac{1}{\sqrt{h_{t}}}*g
    \end{array}

:math:`h` represents the cumulative sum of squared gradients, :math:`g` represents `gradients`.
:math:`lr` represents `learning_rate`, :math:`w` represents `params`.

Note:
When separating parameter groups, the weight decay in each group will be applied on the parameters if the
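
For reference, below is a minimal NumPy sketch of the update rule documented above. It is not part of the diff; the `adagrad_step` helper, its default values, and the small `eps` term are illustrative assumptions, not MindSpore API.

import numpy as np

def adagrad_step(w, h, g, lr=0.001, eps=1e-6):
    """One Adagrad step per the docstring formula:
    h_t = h_{t-1} + g*g;  w_t = w_{t-1} - lr * g / sqrt(h_t).
    eps guards against division by zero and is an illustrative
    addition, not part of the formula above.
    """
    h = h + g * g                         # accumulate squared gradients
    w = w - lr * g / (np.sqrt(h) + eps)   # per-parameter scaled step
    return w, h

# Toy usage: minimize f(w) = w**2, whose gradient is 2*w.
w, h = np.array([5.0]), np.zeros(1)
for _ in range(100):
    w, h = adagrad_step(w, h, 2.0 * w, lr=0.5)
print(w)  # w decreases toward 0; the growing history h shrinks the effective step size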

