| @@ -3,7 +3,7 @@ mindspore.nn.probability.bijector.Bijector | |||
| .. py:class:: mindspore.nn.probability.bijector.Bijector(is_constant_jacobian=False, is_injective=True, name=None, dtype=None, param=None) | |||
| Bijector类。 | |||
| Bijector类。Bijector描述了一种随机变量的映射方法,可以通过一个已有的随机变量 :math:`X` 和一个映射函数 :math:`g` 生成一个新的随机变量 :math:`Y = g(X)` 。 | |||
| **参数:** | |||
| @@ -19,23 +19,22 @@ mindspore.nn.probability.bijector.Bijector | |||
| .. note:: | |||
| Bijector的 `dtype` 为None时,输入值必须是float类型,除此之外没有其他强制要求。在初始化过程中,当 `dtype` 为None时,对参数的数据类型没有强制要求。但所有参数都应具有相同的float类型,否则将引发TypeError。具体来说,参数类型跟随输入值的数据类型,即当 `dtype` 为None时,Bijector的参数将被强制转换为与输入值相同的类型。当指定了 `dtype` 时,参数和输入值的 `dtype` 必须相同。当参数类型或输入值类型与 `dtype` 不相同时,将引发TypeError。只能使用mindspore的float数据类型来指定Bijector的 `dtype` 。 | |||
| .. py:method:: cast_param_by_value(value, para) | |||
| 将Bijector的参数para的数据类型转换为与value相同的类型。 | |||
| 将输入中的 `para` 的数据类型转换为与 `value` 相同的类型,一般由Bijector的子类用于基于输入对自身参数进行数据类型转换。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入数据。 | |||
| - **para** (Tensor) - Bijector参数。 | |||
| .. py:method:: construct(name, *args, **kwargs) | |||
| 重写Cell中的 `construct` 。 | |||
| .. note:: | |||
| 支持的函数包括:'forward'、'inverse'、'forward_log_jacobian'、'inverse_log_jacobian'。 | |||
| 支持的函数名称包括:'forward'、'inverse'、'forward_log_jacobian'、'inverse_log_jacobian'。 | |||
| **参数:** | |||
| @@ -45,41 +44,56 @@ mindspore.nn.probability.bijector.Bijector | |||
| .. py:method:: forward(value, *args, **kwargs) | |||
| 正变换:将输入值转换为另一个分布。 | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = g(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入。 | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| - **args** (list) - 函数所需的位置参数列表。 | |||
| - **kwargs** (dict) - 函数所需的关键字参数字典。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value, *args, **kwargs) | |||
| 对正变换导数取对数。 | |||
| 计算正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入。 | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| - **args** (list) - 函数所需的位置参数列表。 | |||
| - **kwargs** (dict) - 函数所需的关键字参数字典。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value, *args, **kwargs) | |||
| 逆变换:将输入值转换回原始分布。 | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = g^{-1}(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入。 | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| - **args** (list) - 函数所需的位置参数列表。 | |||
| - **kwargs** (dict) - 函数所需的关键字参数字典。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value, *args, **kwargs) | |||
| 对逆变换的导数取对数。 | |||
| 计算逆映射导数的对数值,即 :math:`\log(dg^{-1}(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入。 | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| - **args** (list) - 函数所需的位置参数列表。 | |||
| - **kwargs** (dict) - 函数所需的关键字参数字典。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
| @@ -4,7 +4,7 @@ mindspore.nn.probability.bijector.Exp | |||
| .. py:class:: mindspore.nn.probability.bijector.Exp(name='Exp') | |||
| 指数Bijector(Exponential Bijector)。 | |||
| 此Bijector执行如下操作: | |||
| 此Bijector对应的映射函数为: | |||
| .. math:: | |||
| Y = \exp(x). | |||
| @@ -39,3 +39,51 @@ mindspore.nn.probability.bijector.Exp | |||
| >>> print(ans4.shape) | |||
| (3,) | |||
| .. py:method:: forward(value) | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = \exp(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value) | |||
| 计算正映射导数的对数值,即 :math:`\log(d\exp(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value) | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = \log(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value) | |||
| 计算逆映射导数的对数值,即 :math:`\log(d\log(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
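上述四个映射可以用纯 Python 做一个数值示意(以下为假设性的独立草稿,仅演示公式本身,并非 MindSpore 的实现,函数名为说明用途而取):

```python
import math

# Exp Bijector 公式示意(假设性草稿,非 MindSpore 源码)
def exp_forward(x):
    return math.exp(x)            # Y = exp(X)

def exp_inverse(y):
    return math.log(y)            # X = log(Y)

def exp_forward_log_jacobian(x):
    # log(d exp(x)/dx) = log(exp(x)) = x
    return x

def exp_inverse_log_jacobian(y):
    # log(d log(y)/dy) = log(1/y) = -log(y)
    return -math.log(y)
```

据此可验证正映射与逆映射互为反函数:`exp_inverse(exp_forward(x))` 恒等于 `x`。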
| @@ -4,10 +4,10 @@ mindspore.nn.probability.bijector.GumbelCDF | |||
| .. py:class:: mindspore.nn.probability.bijector.GumbelCDF(loc=0.0, scale=1.0, name='GumbelCDF') | |||
| GumbelCDF Bijector。 | |||
| 此Bijector执行如下操作: | |||
| 此Bijector对应的映射函数为: | |||
| .. math:: | |||
| Y = \exp(-\exp(\frac{-(X - loc)}{scale})) | |||
| Y = g(X) = \exp(-\exp(\frac{-(X - loc)}{scale})) | |||
| **参数:** | |||
| @@ -20,11 +20,15 @@ mindspore.nn.probability.bijector.GumbelCDF | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| `scale` 必须大于零。对于 `inverse` 和 `inverse_log_jacobian` ,输入应在(0, 1)范围内。`loc` 和 `scale` 的数据类型必须为float。如果 `loc` 、 `scale` 作为numpy.ndarray或Tensor传入,则它们必须具有相同的数据类型,否则将引发错误。 | |||
| `scale` 中元素必须大于零。对于 `inverse` 和 `inverse_log_jacobian` ,输入应在(0, 1)范围内。`loc` 和 `scale` 中元素的数据类型必须为float。如果 `loc` 、 `scale` 作为numpy.ndarray或Tensor传入,则它们必须具有相同的数据类型,否则将引发错误。 | |||
| **异常:** | |||
| - **TypeError** - `loc` 或 `scale` 的数据类型不为float,或 `loc` 和 `scale` 的数据类型不相同。 | |||
| - **TypeError** - `loc` 或 `scale` 中元素的数据类型不为float,或 `loc` 和 `scale` 中元素的数据类型不相同。 | |||
| **支持平台:** | |||
| ``Ascend`` ``GPU`` | |||
| **样例:** | |||
| @@ -51,3 +55,50 @@ mindspore.nn.probability.bijector.GumbelCDF | |||
| >>> print(ans4.shape) | |||
| (3,) | |||
| .. py:method:: forward(value) | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = g(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value) | |||
| 计算正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value) | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = g^{-1}(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value) | |||
| 计算逆映射导数的对数值,即 :math:`\log(dg^{-1}(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
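GumbelCDF 的正映射与逆映射也可以按上面的公式直接做数值验证(假设性草稿,非 MindSpore 实现;逆映射由 :math:`Y = \exp(-\exp(-(X - loc)/scale))` 反解得到,要求 :math:`0 < Y < 1`):

```python
import math

# GumbelCDF 映射示意(假设性草稿,非 MindSpore 源码)
def gumbel_cdf_forward(x, loc=0.0, scale=1.0):
    # Y = exp(-exp(-(X - loc)/scale))
    return math.exp(-math.exp(-(x - loc) / scale))

def gumbel_cdf_inverse(y, loc=0.0, scale=1.0):
    # X = loc - scale * log(-log(Y)),要求 0 < Y < 1
    return loc - scale * math.log(-math.log(y))
```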
| @@ -3,7 +3,8 @@ mindspore.nn.probability.bijector.Invert | |||
| .. py:class:: mindspore.nn.probability.bijector.Invert(bijector, name='') | |||
| 反转Bijector(Invert Bijector),计算输入Bijector的反函数。 | |||
| 逆映射Bijector(Invert Bijector)。 | |||
| 计算输入Bijector的逆映射。如果正向映射(下面的 `bijector` 输入)对应的映射函数为 :math:`Y = g(X)`,那么对应的逆映射Bijector的映射函数为 :math:`Y = h(X) = g^{-1}(X)` 。 | |||
| **参数:** | |||
| @@ -38,37 +39,57 @@ mindspore.nn.probability.bijector.Invert | |||
| .. py:method:: bijector | |||
| :property: | |||
| 返回基础Bijector。 | |||
| Bijector类,返回基础Bijector。 | |||
| .. py:method:: forward(x) | |||
| 逆变换:将输入值转换回原始分布。 | |||
| 计算基础Bijector的逆映射,即 :math:`Y = h(X) = g^{-1}(X)`。 | |||
| **参数:** | |||
| - **x** (Tensor) - 输入。 | |||
| - **x** (Tensor) - 基础Bijector的输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 基础Bijector的输入随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(x) | |||
| 逆变换导数的对数。 | |||
| 计算基础Bijector的逆映射导数的对数值,即 :math:`\log dg^{-1}(x) / dx`。 | |||
| **参数:** | |||
| - **x** (Tensor) - 输入。 | |||
| - **x** (Tensor) - 基础Bijector的输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 基础Bijector的逆映射导数的对数值。 | |||
| .. py:method:: inverse(y) | |||
| 正变换:将输入值转换为另一个分布。 | |||
| 计算基础Bijector的正映射,即 :math:`Y = g(X)`。 | |||
| **参数:** | |||
| - **y** (Tensor) - 输入。 | |||
| - **y** (Tensor) - 基础Bijector的输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 基础Bijector的输出随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(y) | |||
| 正变换导数的对数。 | |||
| 计算基础Bijector的正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **y** (Tensor) - 输入。 | |||
| - **y** (Tensor) - 基础Bijector的输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 基础Bijector的正映射导数的对数值。 | |||
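Invert 的语义可以用一个简化的纯 Python 类来示意:它持有基础 Bijector,并把正映射与逆映射互换(以下类名与实现均为假设性的说明草稿,非 MindSpore 源码):

```python
import math

# 基础 Bijector 的极简示意:Y = exp(X)
class SimpleExp:
    def forward(self, x):
        return math.exp(x)
    def inverse(self, y):
        return math.log(y)

# Invert 语义示意:交换基础 Bijector 的正映射与逆映射
class SimpleInvert:
    def __init__(self, bijector):
        self.bijector = bijector
    def forward(self, x):
        # Invert 的正映射即基础 Bijector 的逆映射
        return self.bijector.inverse(x)
    def inverse(self, y):
        # Invert 的逆映射即基础 Bijector 的正映射
        return self.bijector.forward(y)
```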
| @@ -4,7 +4,7 @@ mindspore.nn.probability.bijector.PowerTransform | |||
| .. py:class:: mindspore.nn.probability.bijector.PowerTransform(power=0., name='PowerTransform') | |||
| 乘方Bijector(Power Bijector)。 | |||
| 此Bijector执行如下操作: | |||
| 此Bijector对应的映射函数为: | |||
| .. math:: | |||
| Y = g(X) = (1 + X * c)^{1 / c}, X >= -1 / c | |||
| @@ -13,7 +13,7 @@ mindspore.nn.probability.bijector.PowerTransform | |||
| Power Bijector将输入从 `[-1/c, inf]` 映射到 `[0, inf]` 。 | |||
| 当 `c=0` 时,此Bijector等于Exp Bijector。 | |||
| 当 `c=0` 时,此Bijector等于 :class:`mindspore.nn.probability.bijector.Exp` Bijector。 | |||
| **参数:** | |||
| @@ -25,12 +25,12 @@ mindspore.nn.probability.bijector.PowerTransform | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| `power` 的数据类型必须为float。 | |||
| `power` 中元素的数据类型必须为float。 | |||
| **异常:** | |||
| - **ValueError** - `power` 小于0或静态未知。 | |||
| - **TypeError** - `power` 的数据类型不是float。 | |||
| - **ValueError** - `power` 中元素小于0或静态未知。 | |||
| - **TypeError** - `power` 中元素的数据类型不是float。 | |||
| **样例:** | |||
| @@ -54,3 +54,50 @@ mindspore.nn.probability.bijector.PowerTransform | |||
| >>> print(ans4.shape) | |||
| (3,) | |||
| .. py:method:: forward(value) | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = g(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value) | |||
| 计算正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value) | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = g^{-1}(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value) | |||
| 计算逆映射导数的对数值,即 :math:`\log(dg^{-1}(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
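PowerTransform 的映射 :math:`Y = (1 + cX)^{1/c}` 及其反解 :math:`X = (Y^c - 1)/c` 可以按如下纯 Python 草稿做数值验证(假设性示意,非 MindSpore 实现;`c = 0` 时按文档说明退化为 Exp):

```python
import math

# PowerTransform 映射示意(假设性草稿):Y = (1 + c*X)^(1/c)
def power_forward(x, c):
    if c == 0:
        return math.exp(x)           # c = 0 时退化为 exp
    return (1.0 + c * x) ** (1.0 / c)

def power_inverse(y, c):
    if c == 0:
        return math.log(y)
    return (y ** c - 1.0) / c
```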
| @@ -4,7 +4,7 @@ mindspore.nn.probability.bijector.ScalarAffine | |||
| .. py:class:: mindspore.nn.probability.bijector.ScalarAffine(scale=1.0, shift=0.0, name='ScalarAffine') | |||
| 标量仿射Bijector(Scalar Affine Bijector)。 | |||
| 此Bijector执行如下操作: | |||
| 此Bijector对应的映射函数为: | |||
| .. math:: | |||
| Y = a * X + b | |||
| @@ -22,11 +22,11 @@ mindspore.nn.probability.bijector.ScalarAffine | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| `shift` 和 `scale` 的数据类型必须为float。如果 `shift` 、 `scale` 作为numpy.ndarray或Tensor传入,则它们必须具有相同的数据类型,否则将引发错误。 | |||
| `shift` 和 `scale` 中元素的数据类型必须为float。如果 `shift` 、 `scale` 作为numpy.ndarray或Tensor传入,则它们必须具有相同的数据类型,否则将引发错误。 | |||
| **异常:** | |||
| - **TypeError** - `shift` 或 `scale` 的数据类型不为float,或 `shift` 和 `scale` 的数据类型不相同。 | |||
| - **TypeError** - `shift` 或 `scale` 中元素的数据类型不为float,或 `shift` 和 `scale` 中元素的数据类型不相同。 | |||
| **样例:** | |||
| @@ -50,3 +50,50 @@ mindspore.nn.probability.bijector.ScalarAffine | |||
| >>> print(ans4.shape) | |||
| () | |||
| .. py:method:: forward(value) | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = g(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value) | |||
| 计算正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value) | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = g^{-1}(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value) | |||
| 计算逆映射导数的对数值,即 :math:`\log(dg^{-1}(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
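仿射映射 :math:`Y = aX + b` 的各映射都有闭式解,其对数雅可比为常数 :math:`\log|a|`。以下为假设性的纯 Python 示意(非 MindSpore 实现):

```python
import math

# ScalarAffine 映射示意(假设性草稿):Y = a*X + b
def affine_forward(x, scale=1.0, shift=0.0):
    return scale * x + shift

def affine_inverse(y, scale=1.0, shift=0.0):
    return (y - shift) / scale

def affine_forward_log_jacobian(scale=1.0):
    # dg/dx = a 为常数,故对数雅可比为 log|a|
    return math.log(abs(scale))
```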
| @@ -4,10 +4,10 @@ mindspore.nn.probability.bijector.Softplus | |||
| .. py:class:: mindspore.nn.probability.bijector.Softplus(sharpness=1.0, name='Softplus') | |||
| Softplus Bijector。 | |||
| 此Bijector执行如下操作: | |||
| 此Bijector对应的映射函数为: | |||
| .. math:: | |||
| Y = \frac{\log(1 + e ^ {kX})}{k} | |||
| Y = g(X) = \frac{\log(1 + e^{kX})}{k} | |||
| 其中 :math:`k` 是锐度因子。 | |||
| @@ -21,11 +21,11 @@ mindspore.nn.probability.bijector.Softplus | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| `sharpness` 的数据类型必须为float。 | |||
| `sharpness` 中元素的数据类型必须为float。 | |||
| **异常:** | |||
| - **TypeError** - sharpness的数据类型不为float。 | |||
| - **TypeError** - sharpness中元素的数据类型不为float。 | |||
| **样例:** | |||
| @@ -51,3 +51,50 @@ mindspore.nn.probability.bijector.Softplus | |||
| >>> print(ans4.shape) | |||
| (3,) | |||
| .. py:method:: forward(value) | |||
| 正映射,计算输入随机变量 :math:`X = value` 经过映射后的值 :math:`Y = g(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输出随机变量的值。 | |||
| .. py:method:: forward_log_jacobian(value) | |||
| 计算正映射导数的对数值,即 :math:`\log(dg(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输入随机变量的值。 | |||
| **返回:** | |||
| Tensor, 正映射导数的对数值。 | |||
| .. py:method:: inverse(value) | |||
| 逆映射,计算输出随机变量 :math:`Y = value` 时对应的输入随机变量的值 :math:`X = g^{-1}(value)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 输入随机变量的值。 | |||
| .. py:method:: inverse_log_jacobian(value) | |||
| 计算逆映射导数的对数值,即 :math:`\log(dg^{-1}(x) / dx)`。 | |||
| **参数:** | |||
| - **value** (Tensor) - 输出随机变量的值。 | |||
| **返回:** | |||
| Tensor, 逆映射导数的对数值。 | |||
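Softplus 映射 :math:`Y = \log(1 + e^{kX})/k` 的反解为 :math:`X = \log(e^{kY} - 1)/k`(要求 :math:`Y > 0`),可按如下假设性纯 Python 草稿验证(非 MindSpore 实现):

```python
import math

# Softplus 映射示意(假设性草稿):Y = log(1 + exp(k*X)) / k
def softplus_forward(x, k=1.0):
    return math.log(1.0 + math.exp(k * x)) / k

def softplus_inverse(y, k=1.0):
    # X = log(exp(k*Y) - 1) / k,要求 Y > 0
    return math.log(math.exp(k * y) - 1.0) / k
```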
| @@ -4,6 +4,7 @@ mindspore.nn.probability.distribution.Bernoulli | |||
| .. py:class:: mindspore.nn.probability.distribution.Bernoulli(probs=None, seed=None, dtype=mstype.int32, name='Bernoulli') | |||
| 伯努利分布(Bernoulli Distribution)。 | |||
| 离散随机分布,取值范围为 :math:`\{0, 1\}` ,概率质量函数为 :math:`P(X = 1) = p, P(X = 0) = 1-p`。 | |||
| **参数:** | |||
| @@ -17,7 +18,12 @@ mindspore.nn.probability.distribution.Bernoulli | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| `probs` 必须是合适的概率(0<p<1)。 | |||
| `probs` 中元素必须是合适的概率(0<p<1)。 | |||
| **异常:** | |||
| - **ValueError** - `probs` 中元素小于0或大于1。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| @@ -104,4 +110,8 @@ mindspore.nn.probability.distribution.Bernoulli | |||
| .. py:method:: probs | |||
| 返回结果为1的概率。 | |||
| **返回:** | |||
| Tensor, 结果为1的概率。 | |||
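按 `probs` 属性的语义(`probs` 为结果取 1 的概率),伯努利分布的概率质量函数、期望与方差可以用纯 Python 示意如下(假设性草稿,非 MindSpore 实现):

```python
# 伯努利分布示意(假设性草稿):P(X=1) = p,P(X=0) = 1-p
def bernoulli_pmf(x, p):
    if x not in (0, 1):
        return 0.0
    return p if x == 1 else 1.0 - p

def bernoulli_mean(p):
    return p

def bernoulli_var(p):
    return p * (1.0 - p)
```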
| @@ -3,12 +3,18 @@ mindspore.nn.probability.distribution.Beta | |||
| .. py:class:: mindspore.nn.probability.distribution.Beta(concentration1=None, concentration0=None, seed=None, dtype=mstype.float32, name='Beta') | |||
| 贝塔分布(Beta Distribution)。 | |||
| Beta 分布(Beta Distribution)。 | |||
| 连续随机分布,取值范围为 :math:`[0, 1]` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, \alpha, \beta) = x^{\alpha - 1} (1-x)^{\beta - 1} / B(\alpha, \beta). | |||
| 其中 :math:`B` 为 Beta 函数。 | |||
| **参数:** | |||
| - **concentration1** (list, numpy.ndarray, Tensor) - 贝塔分布的alpha。 | |||
| - **concentration0** (list, numpy.ndarray, Tensor) - 贝塔分布的beta。 | |||
| - **concentration1** (int, float, list, numpy.ndarray, Tensor) - Beta 分布的alpha。 | |||
| - **concentration0** (int, float, list, numpy.ndarray, Tensor) - Beta 分布的beta。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 采样结果的数据类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Beta'。 | |||
| @@ -18,8 +24,13 @@ mindspore.nn.probability.distribution.Beta | |||
| ``Ascend`` | |||
| .. note:: | |||
| - `concentration1` 和 `concentration0` 必须大于零。 | |||
| - `dtype` 必须是float,因为贝塔分布是连续的。 | |||
| - `concentration1` 和 `concentration0` 中元素必须大于零。 | |||
| - `dtype` 必须是float,因为 Beta 分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `concentration1` 或者 `concentration0` 中元素小于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| @@ -28,7 +39,7 @@ mindspore.nn.probability.distribution.Beta | |||
| >>> import mindspore.nn.probability.distribution as msd | |||
| >>> from mindspore import Tensor | |||
| >>> | |||
| >>> # 初始化concentration1为3.0和concentration0为4.0的贝塔分布。 | |||
| >>> # 初始化concentration1为3.0和concentration0为4.0的 Beta 分布。 | |||
| >>> b1 = msd.Beta([3.0], [4.0], dtype=mindspore.float32) | |||
| >>> | |||
| >>> # Beta分布可以在没有参数的情况下初始化。 | |||
| @@ -113,10 +124,18 @@ mindspore.nn.probability.distribution.Beta | |||
| .. py:method:: concentration0 | |||
| :property: | |||
| 返回concentration0(也称为贝塔分布的beta)。 | |||
| 返回concentration0(也称为 Beta 分布的beta)。 | |||
| **返回:** | |||
| Tensor, concentration0 的值。 | |||
| .. py:method:: concentration1 | |||
| :property: | |||
| 返回concentration1(也称为贝塔分布的alpha)。 | |||
| 返回concentration1(也称为 Beta 分布的alpha)。 | |||
| **返回:** | |||
| Tensor, concentration1 的值。 | |||
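按标准 Beta 密度 :math:`f(x) = x^{\alpha-1}(1-x)^{\beta-1}/B(\alpha,\beta)`(其中 :math:`B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)`),可以用标准库 `math.gamma` 做一个假设性的数值示意(非 MindSpore 实现):

```python
import math

# Beta 分布概率密度函数示意(假设性草稿)
def beta_pdf(x, alpha, beta):
    # B(α, β) = Γ(α)Γ(β) / Γ(α+β)
    b = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1.0) * (1.0 - x) ** (beta - 1.0) / b
```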
| @@ -4,6 +4,7 @@ mindspore.nn.probability.distribution.Categorical | |||
| .. py:class:: mindspore.nn.probability.distribution.Categorical(probs=None, seed=None, dtype=mstype.float32, name='Categorical') | |||
| 分类分布。 | |||
| 离散随机分布,取值范围为 :math:`\{1, 2, ..., k\}` ,概率质量函数为 :math:`P(X = i) = p_i, i = 1, ..., k`。 | |||
| **参数:** | |||
| @@ -19,6 +20,10 @@ mindspore.nn.probability.distribution.Categorical | |||
| .. note:: | |||
| `probs` 的秩必须至少为1,值是合适的概率,并且总和为1。 | |||
| **异常:** | |||
| - **ValueError** - `probs` 的秩为0或者其中所有元素的和不等于1。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -81,8 +86,12 @@ mindspore.nn.probability.distribution.Categorical | |||
| >>> ans = ca2.kl_loss('Categorical', probs_b, probs_a) | |||
| >>> print(ans.shape) | |||
| () | |||
| .. py:method:: probs | |||
| 返回事件概率。 | |||
| 返回事件发生的概率。 | |||
| **返回:** | |||
| Tensor, 事件发生的概率。 | |||
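分类分布的概率质量函数 :math:`P(X = i) = p_i` 可以用纯 Python 示意如下(假设性草稿,非 MindSpore 实现;此处为便于索引采用从 0 开始的列表下标表示类别):

```python
# 分类分布概率质量函数示意(假设性草稿):P(X = i) = p_i
def categorical_pmf(i, probs):
    if 0 <= i < len(probs):
        return probs[i]
    return 0.0
```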
| @@ -4,6 +4,12 @@ mindspore.nn.probability.distribution.Cauchy | |||
| .. py:class:: mindspore.nn.probability.distribution.Cauchy(loc=None, scale=None, seed=None, dtype=mstype.float32, name='Cauchy') | |||
| 柯西分布(Cauchy distribution)。 | |||
| 连续随机分布,取值范围为所有实数,概率密度函数为 | |||
| .. math:: | |||
| f(x, a, b) = 1 / (\pi b (1 + ((x - a)/b)^2)). | |||
| 其中 :math:`a, b` 分别为柯西分布的位置参数和比例参数。 | |||
| **参数:** | |||
| @@ -18,10 +24,16 @@ mindspore.nn.probability.distribution.Cauchy | |||
| ``Ascend`` | |||
| .. note:: | |||
| - `scale` 必须大于零。 | |||
| - `scale` 中的元素必须大于零。 | |||
| - `dtype` 必须是float,因为柯西分布是连续的。 | |||
| - GPU后端不支持柯西分布。 | |||
| **异常:** | |||
| - **ValueError** - `scale` 中元素小于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -107,14 +119,21 @@ mindspore.nn.probability.distribution.Cauchy | |||
| >>> ans = cauchy2.sample((2,3), loc_a, scale_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 3) | |||
| .. py:method:: loc | |||
| :property: | |||
| 返回分布位置。 | |||
| **返回:** | |||
| Tensor, 分布的位置值。 | |||
| .. py:method:: scale | |||
| :property: | |||
| 返回分布比例。 | |||
| **返回:** | |||
| Tensor, 分布的比例值。 | |||
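按标准柯西密度 :math:`f(x) = 1/(\pi b(1 + ((x-a)/b)^2))`,峰值位于 :math:`x = a` 处、取值 :math:`1/(\pi b)`,可用纯 Python 验证(假设性草稿,非 MindSpore 实现):

```python
import math

# 柯西分布概率密度函数示意(假设性草稿)
def cauchy_pdf(x, loc=0.0, scale=1.0):
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))
```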
| @@ -18,20 +18,24 @@ mindspore.nn.probability.distribution.Distribution | |||
| .. note:: | |||
| 派生类必须重写 `_mean` 、 `_prob` 和 `_log_prob` 等操作。必填参数必须通过 `args` 或 `kwargs` 传入,如 `_prob` 的 `value` 。 | |||
| .. py:method:: cdf(value, *args, **kwargs) | |||
| 在给定值下评估累积分布函数(Cumulative Distribution Function, CDF)。 | |||
| 在给定值下计算累积分布函数(Cumulative Distribution Function, CDF)。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 累积分布函数的值。 | |||
| .. py:method:: construct(name, *args, **kwargs) | |||
| 重写Cell中的 `construct` 。 | |||
| @@ -45,137 +49,185 @@ mindspore.nn.probability.distribution.Distribution | |||
| - **name** (str) - 函数名称。 | |||
| - **args** (list) - 函数所需的位置参数列表。 | |||
| - **kwargs** (dict) - 函数所需的关键字参数字典。 | |||
| **返回:** | |||
| Tensor, name对应函数的值。 | |||
| .. py:method:: cross_entropy(dist, *args, **kwargs) | |||
| 评估分布a和b之间的交叉熵。 | |||
| 计算分布a和b之间的交叉熵。 | |||
| **参数:** | |||
| - **dist** (str) - 分布的类型。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| Distribution b的 `dist_spec_args` 必须通过 `args` 或 `kwargs` 传递给函数。 传入Distribution a的 `dist_spec_args` 是可选的。 | |||
| **返回:** | |||
| Tensor, 交叉熵的值。 | |||
| .. py:method:: entropy(*args, **kwargs) | |||
| 计算熵。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 熵的值。 | |||
| .. py:method:: get_dist_args(*args, **kwargs) | |||
| 检查默认参数的可用性和有效性。 | |||
| 返回分布的参数列表。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 传递给子类的参数的顺序应该与通过 `_add_parameter` 初始化默认参数的顺序相同。 | |||
| **返回:** | |||
| list[Tensor], 参数列表。 | |||
| .. py:method:: get_dist_type() | |||
| 返回分布类型。 | |||
| **返回:** | |||
| string, 分布类型名字。 | |||
| .. py:method:: kl_loss(dist, *args, **kwargs) | |||
| 评估KL散度,即KL(a||b)。 | |||
| 计算KL散度,即KL(a||b)。 | |||
| **参数:** | |||
| - **dist** (str) - 分布的类型。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| Distribution b的 `dist_spec_args` 必须通过 `args` 或 `kwargs` 传递给函数。 传入Distribution a的 `dist_spec_args` 是可选的。 | |||
| **返回:** | |||
| Tensor, KL散度。 | |||
| .. py:method:: log_cdf(value, *args, **kwargs) | |||
| 计算给定值对于的cdf的对数。 | |||
| 计算给定值对应的累积分布函数的对数。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 累积分布函数的对数。 | |||
| .. py:method:: log_prob(value, *args, **kwargs) | |||
| 计算给定值对应的概率的对数(pdf或pmf)。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率的对数。 | |||
| .. py:method:: log_survival(value, *args, **kwargs) | |||
| 计算给定值对应的剩余函数的对数。 | |||
| 计算给定值对应的生存函数的对数。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 生存函数的对数。 | |||
| .. py:method:: mean(*args, **kwargs) | |||
| 评估平均值。 | |||
| 计算期望。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率分布的期望。 | |||
| .. py:method:: mode(*args, **kwargs) | |||
| 评估模式。 | |||
| 计算众数。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率分布的众数。 | |||
| .. py:method:: prob(value, *args, **kwargs) | |||
| 评估给定值下的概率(Probability Density Function或Probability Mass Function)。 | |||
| 计算给定值下的概率。对于离散分布是计算概率质量函数(Probability Mass Function),而对于连续分布是计算概率密度函数(Probability Density Function)。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率值。 | |||
| .. py:method:: sample(*args, **kwargs) | |||
| 采样函数。 | |||
| @@ -183,46 +235,62 @@ mindspore.nn.probability.distribution.Distribution | |||
| **参数:** | |||
| - **shape** (tuple) - 样本的shape。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 根据概率分布采样的样本。 | |||
| .. py:method:: sd(*args, **kwargs) | |||
| 标准差评估。 | |||
| 计算标准差。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率分布的标准差。 | |||
| .. py:method:: survival_function(value, *args, **kwargs) | |||
| 计算给定值对应的剩余函数。 | |||
| 计算给定值对应的生存函数。 | |||
| **参数:** | |||
| - **value** (Tensor) - 要评估的值。 | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **value** (Tensor) - 要计算的值。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 生存函数的值。 | |||
| .. py:method:: var(*args, **kwargs) | |||
| 评估方差。 | |||
| 计算方差。 | |||
| **参数:** | |||
| - **args** (list) - 传递给子类的位置参数列表。 | |||
| - **kwargs** (dict) - 传递给子类的关键字参数字典。 | |||
| - **args** (list) - 位置参数列表,具体需要的参数根据子类的实现确定。 | |||
| - **kwargs** (dict) - 关键字参数字典,具体需要的参数根据子类的实现确定。 | |||
| .. note:: | |||
| 可以通过 `args` 或 `kwargs` 传递其 `dist_spec_args` 来选择性地将Distribution传递给函数。 | |||
| **返回:** | |||
| Tensor, 概率分布的方差。 | |||
| @@ -3,11 +3,17 @@ mindspore.nn.probability.distribution.Exponential | |||
| .. py:class:: mindspore.nn.probability.distribution.Exponential(rate=None, seed=None, dtype=mstype.float32, name='Exponential') | |||
| 示例类:指数分布(Exponential Distribution)。 | |||
| 指数分布(Exponential Distribution)。 | |||
| 连续随机分布,取值范围为 :math:`[0, \infty)` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, \lambda) = \lambda \exp(-\lambda x). | |||
| 其中 :math:`\lambda` 为指数分布的率参数。 | |||
| **参数:** | |||
| - **rate** (float, list, numpy.ndarray, Tensor) - 逆指数。默认值:None。 | |||
| - **rate** (int, float, list, numpy.ndarray, Tensor) - 率参数。默认值:None。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 事件样例的类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Exponential'。 | |||
| @@ -17,9 +23,14 @@ mindspore.nn.probability.distribution.Exponential | |||
| ``Ascend`` ``GPU`` | |||
| .. note:: | |||
| - `rate` 必须大于0。 | |||
| - `rate` 中的元素必须大于0。 | |||
| - `dtype` 必须是float,因为指数分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `rate` 中元素小于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -98,9 +109,13 @@ mindspore.nn.probability.distribution.Exponential | |||
| >>> ans = e2.sample((2,3), rate_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 1) | |||
| .. py:method:: rate | |||
| :property: | |||
| 返回 `rate` 。 | |||
| **返回:** | |||
| Tensor, rate 的值。 | |||
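按密度 :math:`f(x, \lambda) = \lambda \exp(-\lambda x)`(:math:`x \ge 0`),可以用纯 Python 做一个假设性的数值示意(非 MindSpore 实现):

```python
import math

# 指数分布概率密度函数示意(假设性草稿):f(x) = λ exp(-λx),x ≥ 0
def exponential_pdf(x, rate):
    if x < 0:
        return 0.0
    return rate * math.exp(-rate * x)
```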
| @@ -4,11 +4,17 @@ mindspore.nn.probability.distribution.Gamma | |||
| .. py:class:: mindspore.nn.probability.distribution.Gamma(concentration=None, rate=None, seed=None, dtype=mstype.float3, name='Gamma') | |||
| 伽马分布(Gamma distribution)。 | |||
| 连续随机分布,取值范围为 :math:`(0, \infty)` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, \alpha, \beta) = \beta^\alpha / \Gamma(\alpha) x^{\alpha - 1} \exp(-\beta x). | |||
| 其中 :math:`\Gamma` 为 Gamma 函数。 | |||
| **参数:** | |||
| - **concentration** (list, numpy.ndarray, Tensor) - 浓度,也被称为伽马分布的alpha。默认值:None。 | |||
| - **rate** (list, numpy.ndarray, Tensor) - 逆尺度参数,也被称为伽马分布的beta。默认值:None。 | |||
| - **concentration** (int, float, list, numpy.ndarray, Tensor) - 浓度,也被称为伽马分布的alpha。默认值:None。 | |||
| - **rate** (int, float, list, numpy.ndarray, Tensor) - 逆尺度参数,也被称为伽马分布的beta。默认值:None。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 事件样例的类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Gamma'。 | |||
| @@ -18,9 +24,14 @@ mindspore.nn.probability.distribution.Gamma | |||
| ``Ascend`` | |||
| .. note:: | |||
| - `concentration` 和 `rate` 必须大于零。 | |||
| - `concentration` 和 `rate` 中的元素必须大于零。 | |||
| - `dtype` 必须是float,因为伽马分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `concentration` 或者 `rate` 中元素小于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -107,14 +118,22 @@ mindspore.nn.probability.distribution.Gamma | |||
| >>> ans = g2.sample((2,3), concentration_a, rate_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 3) | |||
| .. py:method:: concentration | |||
| :property: | |||
| 返回分布的浓度(也称为伽马分布的alpha)。 | |||
| **返回:** | |||
| Tensor, concentration 的值。 | |||
| .. py:method:: rate | |||
| :property: | |||
| 返回分布的逆尺度(也称为伽马分布的beta)。 | |||
| **返回:** | |||
| Tensor, rate 的值。 | |||
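按密度 :math:`f(x, \alpha, \beta) = \beta^\alpha / \Gamma(\alpha) \, x^{\alpha-1} \exp(-\beta x)`,可以用标准库 `math.gamma` 做一个假设性的数值示意(非 MindSpore 实现);:math:`\alpha = 1` 时它退化为率参数为 :math:`\beta` 的指数分布:

```python
import math

# 伽马分布概率密度函数示意(假设性草稿)
def gamma_pdf(x, concentration, rate):
    # f(x) = β^α / Γ(α) * x^(α-1) * exp(-βx)
    return (rate ** concentration / math.gamma(concentration)
            * x ** (concentration - 1.0) * math.exp(-rate * x))
```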
| @@ -6,10 +6,11 @@ mindspore.nn.probability.distribution.Geometric | |||
| 几何分布(Geometric Distribution)。 | |||
| 它代表在第一次成功之前有k次失败,即在第一次成功实现时,总共有k+1个伯努利试验。 | |||
| 离散随机分布,取值范围为自然数集,概率质量函数为 :math:`P(X = i) = p(1-p)^{i}, i = 0, 1, 2, ...`。 | |||
| **参数:** | |||
| - **probs** (float, list, numpy.ndarray, Tensor) - 成功的概率。默认值:None。 | |||
| - **probs** (int, float, list, numpy.ndarray, Tensor) - 成功的概率。默认值:None。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 事件样例的类型。默认值:mindspore.int32. | |||
| - **name** (str) - 分布的名称。默认值:'Geometric'。 | |||
| @@ -21,6 +22,11 @@ mindspore.nn.probability.distribution.Geometric | |||
| .. note:: | |||
| `probs` 必须是合适的概率(0<p<1)。 | |||
| **异常:** | |||
| - **ValueError** - `probs` 中元素小于0或者大于1。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -101,9 +107,13 @@ mindspore.nn.probability.distribution.Geometric | |||
| >>> ans = g2.sample((2,3), probs_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 1) | |||
| .. py:method:: probs | |||
| :property: | |||
| 返回伯努利试验成功的概率。 | |||
| **返回:** | |||
| Tensor, 伯努利试验成功的概率值。 | |||
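按“第一次成功之前的失败次数”这一定义,几何分布的概率质量函数可以用纯 Python 示意如下(假设性草稿,非 MindSpore 实现):

```python
# 几何分布概率质量函数示意(假设性草稿):
# 按“第一次成功之前的失败次数”定义,P(X = k) = p(1-p)^k,k = 0, 1, 2, ...
def geometric_pmf(k, p):
    if k < 0:
        return 0.0
    return p * (1.0 - p) ** k
```

可顺便验证各项概率之和趋于 1。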
| @@ -3,12 +3,18 @@ mindspore.nn.probability.distribution.Gumbel | |||
| .. py:class:: mindspore.nn.probability.distribution.Gumbel(loc, scale, seed=0, dtype=mstype.float32, name='Gumbel') | |||
| 耿贝尔分布(Gumbel distribution)。 | |||
| Gumbel分布(Gumbel distribution)。 | |||
| 连续随机分布,取值范围为所有实数,概率密度函数为 | |||
| .. math:: | |||
| f(x, a, b) = 1 / b \exp(-((x - a) / b + \exp(-(x - a) / b))). | |||
| 其中 :math:`a, b` 分别为Gumbel分布的位置参数和比例参数。 | |||
| **参数:** | |||
| - **loc** (float, list, numpy.ndarray, Tensor) - 耿贝尔分布的位置。 | |||
| - **scale** (float, list, numpy.ndarray, Tensor) - 耿贝尔分布的尺度。 | |||
| - **loc** (int, float, list, numpy.ndarray, Tensor) - Gumbel分布的位置。 | |||
| - **scale** (int, float, list, numpy.ndarray, Tensor) - Gumbel分布的尺度。 | |||
| - **seed** (int) - 采样时使用的种子。默认值:0。 | |||
| - **dtype** (mindspore.dtype) - 分布类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Gumbel'。 | |||
| @@ -19,9 +25,14 @@ mindspore.nn.probability.distribution.Gumbel | |||
| .. note:: | |||
| - `scale` 必须大于零。 | |||
| - `dtype` 必须是浮点类型,因为耿贝尔分布是连续的。 | |||
| - `dtype` 必须是浮点类型,因为Gumbel分布是连续的。 | |||
| - GPU后端不支持 `kl_loss` 和 `cross_entropy` 。 | |||
| **异常:** | |||
| - **ValueError** - `scale` 中元素小于等于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -38,14 +49,21 @@ mindspore.nn.probability.distribution.Gumbel | |||
| >>> value = np.array([1.0, 2.0]).astype(np.float32) | |||
| >>> pdf = Prob() | |||
| >>> output = pdf(Tensor(value, dtype=mindspore.float32)) | |||
| .. py:method:: loc | |||
| :property: | |||
| 返回分布位置。 | |||
| **返回:** | |||
| Tensor, 分布的位置值。 | |||
| .. py:method:: scale | |||
| :property: | |||
| 返回分布尺度。 | |||
| 返回分布比例。 | |||
| **返回:** | |||
| Tensor, 分布的比例值。 | |||
| @@ -4,7 +4,13 @@ mindspore.nn.probability.distribution.LogNormal | |||
| .. py:class:: mindspore.nn.probability.distribution.LogNormal(loc=None, scale=None, seed=0, dtype=mstype.float32, name='LogNormal') | |||
| 对数正态分布(LogNormal distribution)。 | |||
| 对数正态分布是随机变量的连续概率分布,变量的对数为正态分布。它被构造为正态分布的指数变换。 | |||
| 连续随机分布,取值范围为 :math:`(0, \inf)` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, a, b) = 1 / (xb\sqrt{2\pi}) \exp(-(\ln(x) - a)^2 / (2b^2)). | |||
| 其中 :math:`a, b` 分别为基础正态分布的平均值和标准差。 | |||
| 服从对数正态分布的随机变量的对数服从正态分布。它被构造为正态分布的指数变换。 | |||
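对数正态密度与基础正态密度的关系为 :math:`f_Y(x) = f_X(\ln x) / x` ,可用Python标准库 statistics.NormalDist 核对(示意,参数取值为假设):

```python
import math
from statistics import NormalDist

# 对数正态密度:f(x) = exp(-(ln(x) - a)^2 / (2 b^2)) / (x * b * sqrt(2 pi))
def lognormal_pdf(x, a, b):
    coef = x * b * math.sqrt(2.0 * math.pi)
    return math.exp(-(math.log(x) - a) ** 2 / (2.0 * b * b)) / coef

a, b = 0.5, 1.2
x = 2.0
# 基础正态分布在 ln(x) 处的密度除以 x,应与上式一致
via_normal = NormalDist(a, b).pdf(math.log(x)) / x
```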
| **参数:** | |||
| @@ -22,6 +28,11 @@ mindspore.nn.probability.distribution.LogNormal | |||
| - `scale` 必须大于零。 | |||
| - `dtype` 必须是float,因为对数正态分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `scale` 中元素小于等于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import numpy as np | |||
| @@ -39,13 +50,21 @@ mindspore.nn.probability.distribution.LogNormal | |||
| >>> output = pdf(Tensor([1.0, 2.0], dtype=mindspore.float32)) | |||
| >>> print(output.shape) | |||
| (2, 2) | |||
| .. py:method:: loc | |||
| :property: | |||
| 返回分布的均值。 | |||
| 返回分布位置。 | |||
| **返回:** | |||
| Tensor, 分布的位置值。 | |||
| .. py:method:: scale | |||
| :property: | |||
| 返回分布的标准差。 | |||
| 返回分布比例。 | |||
| **返回:** | |||
| Tensor, 分布的比例值。 | |||
| @@ -3,12 +3,18 @@ mindspore.nn.probability.distribution.Logistic | |||
| .. py:class:: mindspore.nn.probability.distribution.Logistic(loc=None, scale=None, seed=None, dtype=mstype.float32, name='Logistic') | |||
| 逻辑斯谛分布(Logistic distribution)。 | |||
| Logistic分布(Logistic distribution)。 | |||
| 连续随机分布,取值范围为 :math:`(-\inf, \inf)` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, a, b) = \exp(-(x - a) / b) / (b (1 + \exp(-(x - a) / b))^2). | |||
| 其中 :math:`a, b` 分别为Logistic分布的位置参数和比例参数。 | |||
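Logistic密度与其累积分布函数满足 :math:`f = F(1 - F) / b` ,可用纯Python核对(示意,参数取值为假设):

```python
import math

# Logistic 概率密度:f(x) = exp(-z) / (b * (1 + exp(-z))^2),z = (x - a) / b
def logistic_pdf(x, a, b):
    z = (x - a) / b
    e = math.exp(-z)
    return e / (b * (1.0 + e) ** 2)

# Logistic 累积分布函数:F(x) = 1 / (1 + exp(-z))
def logistic_cdf(x, a, b):
    return 1.0 / (1.0 + math.exp(-(x - a) / b))

a, b = 3.0, 4.0
x = 1.5
cdf = logistic_cdf(x, a, b)
# 密度与分布函数的关系:f = F * (1 - F) / b
relation = cdf * (1.0 - cdf) / b
```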
| **参数:** | |||
| - **loc** (int, float, list, numpy.ndarray, Tensor) - 逻辑斯谛分布的位置。默认值:None。 | |||
| - **scale** (int, float, list, numpy.ndarray, Tensor) - 逻辑斯谛分布的尺度。默认值:None。 | |||
| - **loc** (int, float, list, numpy.ndarray, Tensor) - Logistic分布的位置。默认值:None。 | |||
| - **scale** (int, float, list, numpy.ndarray, Tensor) - Logistic分布的尺度。默认值:None。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 事件样例的类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Logistic'。 | |||
| @@ -19,7 +25,12 @@ mindspore.nn.probability.distribution.Logistic | |||
| .. note:: | |||
| - `scale` 必须大于零。 | |||
| - `dtype` 必须是float,因为逻辑斯谛分布是连续的。 | |||
| - `dtype` 必须是float,因为Logistic分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `scale` 中元素小于等于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| @@ -27,9 +38,9 @@ mindspore.nn.probability.distribution.Logistic | |||
| >>> import mindspore.nn as nn | |||
| >>> import mindspore.nn.probability.distribution as msd | |||
| >>> from mindspore import Tensor | |||
| >>> # 初始化loc为3.0和scale为4.0的逻辑斯谛分布。 | |||
| >>> # 初始化loc为3.0和scale为4.0的Logistic分布。 | |||
| >>> l1 = msd.Logistic(3.0, 4.0, dtype=mindspore.float32) | |||
| >>> # 可以在没有参数的情况下初始化逻辑斯谛分布。 | |||
| >>> # 可以在没有参数的情况下初始化Logistic分布。 | |||
| >>> # 在这种情况下,`loc`和`scale`必须通过参数传入。 | |||
| >>> l2 = msd.Logistic(dtype=mindspore.float32) | |||
| >>> | |||
| @@ -90,14 +101,22 @@ mindspore.nn.probability.distribution.Logistic | |||
| >>> ans = l1.sample((2,3), loc_a, scale_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 3) | |||
| .. py:method:: loc | |||
| :property: | |||
| 返回分布位置。 | |||
| **返回:** | |||
| Tensor, 分布的位置值。 | |||
| .. py:method:: scale | |||
| :property: | |||
| 返回分布尺度。 | |||
| 返回分布比例。 | |||
| **返回:** | |||
| Tensor, 分布的比例值。 | |||
| @@ -4,6 +4,12 @@ mindspore.nn.probability.distribution.Normal | |||
| .. py:class:: mindspore.nn.probability.distribution.Normal(mean=None, sd=None, seed=None, dtype=mstype.float32, name='Normal') | |||
| 正态分布(Normal distribution)。 | |||
| 连续随机分布,取值范围为 :math:`(-\inf, \inf)` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, \mu, \sigma) = 1 / (\sigma\sqrt{2\pi}) \exp(-(x - \mu)^2 / (2\sigma^2)). | |||
| 其中 :math:`\mu, \sigma` 分别为正态分布的期望与标准差。 | |||
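正态密度公式可直接与Python标准库 statistics.NormalDist 对照(示意,参数取值为假设):

```python
import math
from statistics import NormalDist

# 正态概率密度:f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))
def normal_pdf(x, mu, sigma):
    coef = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma * sigma)) / coef

mu, sigma = 3.0, 4.0
x = 1.0
# 标准库给出的参考值
ref = NormalDist(mu, sigma).pdf(x)
```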
| **参数:** | |||
| @@ -21,6 +27,11 @@ mindspore.nn.probability.distribution.Normal | |||
| - `sd` 必须大于零。 | |||
| - `dtype` 必须是float,因为正态分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `sd` 中元素小于等于0。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -4,10 +4,16 @@ mindspore.nn.probability.distribution.Poisson | |||
| .. py:class:: mindspore.nn.probability.distribution.Poisson(rate=None, seed=None, dtype=mstype.float32, name='Poisson') | |||
| 泊松分布(Poisson Distribution)。 | |||
| 离散随机分布,取值范围为自然数集,概率质量函数为 | |||
| .. math:: | |||
| P(X = k) = \lambda^k \exp(-\lambda) / k!, k = 0, 1, 2, ... | |||
| 其中 :math:`\lambda` 为率参数(rate)。 | |||
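泊松概率质量函数可用纯Python做数值验证(示意;k 自 0 起计数,rate 取值为假设):

```python
import math

# 泊松概率质量:P(X = k) = rate^k * exp(-rate) / k!, k = 0, 1, 2, ...
def poisson_pmf(k, rate):
    return rate ** k * math.exp(-rate) / math.factorial(k)

rate = 2.5
# 概率和应趋近于 1,期望应趋近于 rate
total = sum(poisson_pmf(k, rate) for k in range(100))
mean = sum(k * poisson_pmf(k, rate) for k in range(100))
```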
| **参数:** | |||
| - **rate** (list, numpy.ndarray, Tensor) - 泊松分布的率参数。默认值:None。 | |||
| - **rate** (int, float, list, numpy.ndarray, Tensor) - 泊松分布的率参数。默认值:None。 | |||
| - **seed** (int) - 采样时使用的种子。如果为None,则使用全局种子。默认值:None。 | |||
| - **dtype** (mindspore.dtype) - 事件样例的类型。默认值:mindspore.float32。 | |||
| - **name** (str) - 分布的名称。默认值:'Poisson'。 | |||
| @@ -19,6 +25,11 @@ mindspore.nn.probability.distribution.Poisson | |||
| .. note:: | |||
| `rate` 必须大于0。 | |||
| **异常:** | |||
| - **ValueError** - `rate` 中元素小于等于0。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -83,9 +94,13 @@ mindspore.nn.probability.distribution.Poisson | |||
| >>> ans = p2.sample((2,3), rate_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 1) | |||
| .. py:method:: rate | |||
| :property: | |||
| 返回分布的 `rate` 参数。 | |||
| **返回:** | |||
| Tensor, rate 参数的值。 | |||
| @@ -4,7 +4,8 @@ mindspore.nn.probability.distribution.TransformedDistribution | |||
| .. py:class:: mindspore.nn.probability.distribution.TransformedDistribution(bijector, distribution, seed=None, name='transformed_distribution') | |||
| 转换分布(Transformed Distribution)。 | |||
| 该类包含一个Bijector和一个分布,并通过Bijector定义的操作将原始分布转换为新分布。 | |||
| 该类包含一个Bijector和一个分布,并通过Bijector定义的操作将原始分布转换为新分布。如果原始分布的随机变量为 :math:`X` ,Bijector的映射函数为 :math:`g` ,那么对应的转换分布为 :math:`Y = g(X)` 。 | |||
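当映射 :math:`g` 单调递增时,转换分布的累积分布函数满足 :math:`F_Y(y) = F_X(g^{-1}(y))` 。下面以 exp 映射把标准正态分布变换为对数正态分布,用Python标准库做示意(参数取值为假设):

```python
import math
from statistics import NormalDist

# 原始分布 X ~ Normal(0, 1),映射 g(x) = exp(x),则 Y = g(X) 为对数正态分布
base = NormalDist(0.0, 1.0)

def transformed_cdf(y):
    # F_Y(y) = F_X(g^{-1}(y)) = F_X(ln(y))
    return base.cdf(math.log(y))
```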
| **参数:** | |||
| @@ -20,6 +21,10 @@ mindspore.nn.probability.distribution.TransformedDistribution | |||
| .. note:: | |||
| 用于初始化原始分布的参数不能为None。例如,由于未指定 `mean` 和 `sd` ,因此无法使用mynormal = msd.Normal(dtype=mindspore.float32)初始化TransformedDistribution。 | |||
| **异常:** | |||
| - **ValueError** - `bijector` 不是Bijector类,或者 `distribution` 不是Distribution类。 | |||
| **样例:** | |||
| >>> import numpy as np | |||
| @@ -48,4 +53,4 @@ mindspore.nn.probability.distribution.TransformedDistribution | |||
| >>> cdf, sample = net(tx) | |||
| >>> print(sample.shape) | |||
| (2, 3) | |||
| @@ -3,7 +3,13 @@ mindspore.nn.probability.distribution.Uniform | |||
| .. py:class:: mindspore.nn.probability.distribution.Uniform(low=None, high=None, seed=None, dtype=mstype.float32, name='Uniform') | |||
| 示例类:均匀分布(Uniform Distribution)。 | |||
| 均匀分布(Uniform Distribution)。 | |||
| 连续随机分布,取值范围为 :math:`[a, b]` ,概率密度函数为 | |||
| .. math:: | |||
| f(x, a, b) = 1 / (b - a). | |||
| 其中 :math:`a, b` 分别为均匀分布的下界和上界。 | |||
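均匀分布的密度为常数 :math:`1/(b-a)` ,其归一化与期望可用中点法数值积分示意(参数取值为假设):

```python
# 均匀分布密度:f(x) = 1 / (b - a),x 属于 [a, b]
a, b = 1.0, 3.0
density = 1.0 / (b - a)

n = 100000
step = (b - a) / n
# 总概率应为 1,期望应为 (a + b) / 2
total = density * step * n
mean = sum((a + (k + 0.5) * step) * density * step for k in range(n))
```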
| **参数:** | |||
| @@ -21,6 +27,13 @@ mindspore.nn.probability.distribution.Uniform | |||
| - `low` 必须小于 `high` 。 | |||
| - `dtype` 必须是float类型,因为均匀分布是连续的。 | |||
| **异常:** | |||
| - **ValueError** - `low` 大于等于 `high` 。 | |||
| - **TypeError** - `dtype` 不是float的子类。 | |||
| **样例:** | |||
| >>> import mindspore | |||
| @@ -108,14 +121,21 @@ mindspore.nn.probability.distribution.Uniform | |||
| >>> ans = u2.sample((2,3), low_a, high_a) | |||
| >>> print(ans.shape) | |||
| (2, 3, 2) | |||
| .. py:method:: high | |||
| :property: | |||
| 返回分布的上限。 | |||
| **返回:** | |||
| Tensor, 分布的上限值。 | |||
| .. py:method:: low | |||
| :property: | |||
| 返回分布的下限。 | |||
| **返回:** | |||
| Tensor, 分布的下限值。 | |||
| @@ -26,7 +26,10 @@ from ..distribution import TransformedDistribution | |||
| class Bijector(Cell): | |||
| """ | |||
| Bijecotr class. | |||
| Bijector class. A bijector performs a mapping from one distribution to another via some function. | |||
| If X is a random variable following the original distribution, | |||
| and g(x) is the mapping function, | |||
| then Y = g(X) is the random variable following the transformed distribution. | |||
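As a minimal pure-Python sketch of this contract (illustrative only, not the MindSpore API), consider an Exp bijector with Y = exp(X):

```python
import math

# Minimal Exp bijector sketch: forward/inverse are exact inverses, and
# forward_log_jacobian(x) = log|d exp(x)/dx| = x.
class ExpBijectorSketch:
    def forward(self, x):
        return math.exp(x)

    def inverse(self, y):
        return math.log(y)

    def forward_log_jacobian(self, x):
        return x  # log(exp(x)) == x

    def inverse_log_jacobian(self, y):
        return -math.log(y)  # log|d log(y)/dy| == -log(y)

bij = ExpBijectorSketch()
x = 1.5
# inverse undoes forward
roundtrip = bij.inverse(bij.forward(x))
```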
| Args: | |||
| is_constant_jacobian (bool): Whether the Bijector has constant derivative. Default: False. | |||
| @@ -234,24 +237,56 @@ class Bijector(Cell): | |||
| def forward(self, value, *args, **kwargs): | |||
| """ | |||
| Forward transformation: transform the input value to another distribution. | |||
| Args: | |||
| value (Tensor): the value of the input variables. | |||
| *args (list): the list of positional arguments forwarded to subclasses. | |||
| **kwargs (dict): the dictionary of keyword arguments forwarded to subclasses. | |||
| Output: | |||
| Tensor, the value of the transformed random variable. | |||
| """ | |||
| return self._forward(value, *args, **kwargs) | |||
| def inverse(self, value, *args, **kwargs): | |||
| """ | |||
| Inverse transformation: transform the input value back to the original distribution. | |||
| Args: | |||
| value (Tensor): the value of the transformed variables. | |||
| *args (list): the list of positional arguments forwarded to subclasses. | |||
| **kwargs (dict): the dictionary of keyword arguments forwarded to subclasses. | |||
| Output: | |||
| Tensor, the value of the input random variable. | |||
| """ | |||
| return self._inverse(value, *args, **kwargs) | |||
| def forward_log_jacobian(self, value, *args, **kwargs): | |||
| """ | |||
| Logarithm of the derivative of the forward transformation. | |||
| Args: | |||
| value (Tensor): the value of the input variables. | |||
| *args (list): the list of positional arguments forwarded to subclasses. | |||
| **kwargs (dict): the dictionary of keyword arguments forwarded to subclasses. | |||
| Output: | |||
| Tensor, the value of logarithm of the derivative of the forward transformation. | |||
| """ | |||
| return self._forward_log_jacobian(value, *args, **kwargs) | |||
| def inverse_log_jacobian(self, value, *args, **kwargs): | |||
| """ | |||
| Logarithm of the derivative of the inverse transformation. | |||
| Args: | |||
| value (Tensor): the value of the transformed variables. | |||
| *args (list): the list of positional arguments forwarded to subclasses. | |||
| **kwargs (dict): the dictionary of keyword arguments forwarded to subclasses. | |||
| Output: | |||
| Tensor, the value of logarithm of the derivative of the inverse transformation. | |||
| """ | |||
| return self._inverse_log_jacobian(value, *args, **kwargs) | |||
| @@ -27,6 +27,18 @@ class Exp(PowerTransform): | |||
| Args: | |||
| name (str): The name of the Bijector. Default: 'Exp'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -32,6 +32,19 @@ class GumbelCDF(Bijector): | |||
| scale (float, list, numpy.ndarray, Tensor): The scale. Default: 1.0. | |||
| name (str): The name of the Bijector. Default: 'GumbelCDF'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor, | |||
| with a shape that can be broadcasted to that of `loc` and `scale`. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -44,7 +57,7 @@ class GumbelCDF(Bijector): | |||
| Raises: | |||
| TypeError: When the dtype of `loc` or `scale` is not float, | |||
| and when the dtype of `loc` and `scale` is not same. | |||
| or when the dtypes of `loc` and `scale` are not the same. | |||
| Examples: | |||
| >>> import mindspore | |||
| @@ -86,18 +99,28 @@ class GumbelCDF(Bijector): | |||
| self._scale = self._add_parameter(scale, 'scale') | |||
| check_greater_zero(self._scale, "scale") | |||
| self.cast = P.Cast() | |||
| self.exp = exp_generic | |||
| self.log = log_generic | |||
| @property | |||
| def loc(self): | |||
| """ | |||
| Return the loc parameter of the bijector. | |||
| Output: | |||
| Tensor, the loc parameter of the bijector. | |||
| """ | |||
| return self._loc | |||
| @property | |||
| def scale(self): | |||
| """ | |||
| Return the scale parameter of the bijector. | |||
| Output: | |||
| Tensor, the scale parameter of the bijector. | |||
| """ | |||
| return self._scale | |||
| def extend_repr(self): | |||
| @@ -26,6 +26,18 @@ class Invert(Bijector): | |||
| name (str): The name of the Bijector. Default: "". When name is set to "", it is actually | |||
| 'Invert' + bijector.name. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -73,36 +85,52 @@ class Invert(Bijector): | |||
| def inverse(self, y): | |||
| """ | |||
| Forward transformation: transform the input value to another distribution. | |||
| Perform the inverse transformation of the inverse bijector, | |||
| namely the forward transformation of the underlying bijector. | |||
| Args: | |||
| y (Tensor): Tensor of any shape. | |||
| y (Tensor): the value of the transformed random variable. | |||
| Output: | |||
| Tensor, the value of the input random variable. | |||
| """ | |||
| return self.bijector("forward", y) | |||
| def forward(self, x): | |||
| """ | |||
| Inverse transformation: transform the input value back to the original distribution. | |||
| Perform the forward transformation of the inverse bijector, | |||
| namely the inverse transformation of the underlying bijector. | |||
| Args: | |||
| x (Tensor): Tensor of any shape. | |||
| x (Tensor): the value of the input random variable. | |||
| Output: | |||
| Tensor, the value of the transformed random variable. | |||
| """ | |||
| return self.bijector("inverse", x) | |||
| def inverse_log_jacobian(self, y): | |||
| """ | |||
| Logarithm of the derivative of the forward transformation. | |||
| Logarithm of the derivative of the inverse transformation of the inverse bijector, | |||
| namely logarithm of the derivative of the forward transformation of the underlying bijector. | |||
| Args: | |||
| y (Tensor): Tensor of any shape. | |||
| y (Tensor): the value of the transformed random variable. | |||
| Output: | |||
| Tensor, logarithm of the derivative of the inverse transformation of the inverse bijector. | |||
| """ | |||
| return self.bijector("forward_log_jacobian", y) | |||
| def forward_log_jacobian(self, x): | |||
| """ | |||
| Logarithm of the derivative of the inverse transformation. | |||
| Logarithm of the derivative of the forward transformation of the inverse bijector, | |||
| namely logarithm of the derivative of the inverse transformation of the underlying bijector. | |||
| Args: | |||
| x (Tensor): Tensor of any shape. | |||
| x (Tensor): the value of the input random variable. | |||
| Output: | |||
| Tensor, logarithm of the derivative of the forward transformation of the inverse bijector. | |||
| """ | |||
| return self.bijector("inverse_log_jacobian", x) | |||
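The net effect of Invert is to delegate each method to the opposite method of the wrapped bijector; a pure-Python sketch (illustrative only, not the MindSpore API):

```python
import math

# Underlying bijector, here exp, with the standard 4-method contract.
class ExpSketch:
    def forward(self, x):
        return math.exp(x)

    def inverse(self, y):
        return math.log(y)

    def forward_log_jacobian(self, x):
        return x

    def inverse_log_jacobian(self, y):
        return -math.log(y)

# Invert swaps forward/inverse and the two log-jacobians.
class InvertSketch:
    def __init__(self, bijector):
        self.bijector = bijector

    def forward(self, x):
        return self.bijector.inverse(x)

    def inverse(self, y):
        return self.bijector.forward(y)

    def forward_log_jacobian(self, x):
        return self.bijector.inverse_log_jacobian(x)

    def inverse_log_jacobian(self, y):
        return self.bijector.forward_log_jacobian(y)

log_bij = InvertSketch(ExpSketch())  # behaves like a Log bijector
```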
| @@ -37,6 +37,19 @@ class PowerTransform(Bijector): | |||
| power (float, list, numpy.ndarray, Tensor): The scale factor. Default: 0. | |||
| name (str): The name of the bijector. Default: 'PowerTransform'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor, | |||
| with a shape that can be broadcasted to that of `power`. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -92,6 +105,12 @@ class PowerTransform(Bijector): | |||
| @property | |||
| def power(self): | |||
| """ | |||
| Return the power parameter of the bijector. | |||
| Output: | |||
| Tensor, the power parameter of the bijector. | |||
| """ | |||
| return self._power | |||
| def extend_repr(self): | |||
| @@ -110,7 +129,8 @@ class PowerTransform(Bijector): | |||
| power_local = self.cast_param_by_value(x, self.power) | |||
| # broad cast the value of x and power | |||
| ones = self.fill(self.dtypeop(power_local), self.shape(x + power_local), 1.) | |||
| ones = self.fill(self.dtypeop(power_local), | |||
| self.shape(x + power_local), 1.) | |||
| power_local = power_local * ones | |||
| x = x * ones | |||
| safe_power = self.select_base(self.equal_base(power_local, 0.), | |||
| @@ -130,7 +150,8 @@ class PowerTransform(Bijector): | |||
| power_local = self.cast_param_by_value(y, self.power) | |||
| # broad cast the value of x and power | |||
| ones = self.fill(self.dtypeop(power_local), self.shape(y + power_local), 1.) | |||
| ones = self.fill(self.dtypeop(power_local), | |||
| self.shape(y + power_local), 1.) | |||
| power_local = power_local * ones | |||
| y = y * ones | |||
| safe_power = self.select_base(self.equal_base(power_local, 0.), | |||
| @@ -159,7 +180,8 @@ class PowerTransform(Bijector): | |||
| power_local = self.cast_param_by_value(x, self.power) | |||
| # broad cast the value of x and power | |||
| ones = self.fill(self.dtypeop(power_local), self.shape(x + power_local), 1.) | |||
| ones = self.fill(self.dtypeop(power_local), | |||
| self.shape(x + power_local), 1.) | |||
| power_local = power_local * ones | |||
| x = x * ones | |||
| @@ -33,6 +33,19 @@ class ScalarAffine(Bijector): | |||
| shift (float, list, numpy.ndarray, Tensor): The shift factor. Default: 0.0. | |||
| name (str): The name of the bijector. Default: 'ScalarAffine'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor, | |||
| with a shape that can be broadcasted to that of `shift` and `scale`. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -94,10 +107,22 @@ class ScalarAffine(Bijector): | |||
| @property | |||
| def scale(self): | |||
| """ | |||
| Return the scale parameter of the bijector. | |||
| Output: | |||
| Tensor, the scale parameter of the bijector. | |||
| """ | |||
| return self._scale | |||
| @property | |||
| def shift(self): | |||
| """ | |||
| Return the shift parameter of the bijector. | |||
| Output: | |||
| Tensor, the shift parameter of the bijector. | |||
| """ | |||
| return self._shift | |||
| def extend_repr(self): | |||
| @@ -34,6 +34,19 @@ class Softplus(Bijector): | |||
| sharpness (float, list, numpy.ndarray, Tensor): The scale factor. Default: 1.0. | |||
| name (str): The name of the Bijector. Default: 'Softplus'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - **forward** | |||
| - **inverse** | |||
| - **forward_log_jacobian** | |||
| - **inverse_log_jacobian** | |||
| It should be noticed that the input should always be a tensor, | |||
| with a shape that can be broadcasted to that of `sharpness`. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.bijector.Bijector`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -127,6 +140,12 @@ class Softplus(Bijector): | |||
| @property | |||
| def sharpness(self): | |||
| """ | |||
| Return the sharpness parameter of the bijector. | |||
| Output: | |||
| Tensor, the sharpness parameter of the bijector. | |||
| """ | |||
| return self._sharpness | |||
| def extend_repr(self): | |||
| @@ -25,6 +25,8 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Bernoulli(Distribution): | |||
| """ | |||
| Bernoulli Distribution. | |||
| A Bernoulli Distribution is a discrete distribution with the range {0, 1} | |||
| and the probability mass function as :math:`P(X = 1) = p, P(X = 0) = 1-p`. | |||
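A quick pure-Python check of the Bernoulli mass function, assuming `probs` denotes P(X = 1) (values are illustrative):

```python
# Bernoulli pmf with P(X = 1) = p and P(X = 0) = 1 - p
def bernoulli_pmf(x, p):
    return p if x == 1 else 1.0 - p

p = 0.7
# probabilities over {0, 1} sum to 1
total = bernoulli_pmf(0, p) + bernoulli_pmf(1, p)
# mean is p, variance is p * (1 - p)
mean = 1 * bernoulli_pmf(1, p)
var = (0 - mean) ** 2 * bernoulli_pmf(0, p) + (1 - mean) ** 2 * bernoulli_pmf(1, p)
```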
| Args: | |||
| probs (float, list, numpy.ndarray, Tensor): The probability of that the outcome is 1. Default: None. | |||
| @@ -32,6 +34,18 @@ class Bernoulli(Distribution): | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.int32. | |||
| name (str): The name of the distribution. Default: 'Bernoulli'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noticed that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -39,6 +53,10 @@ class Bernoulli(Distribution): | |||
| `probs` must be a proper probability (0 < p < 1). | |||
| `dist_spec_args` is `probs`. | |||
| Raises: | |||
| ValueError: When p <= 0 or p >= 1. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -91,7 +109,7 @@ class Bernoulli(Distribution): | |||
| >>> # dist (str): the name of the distribution. Only 'Bernoulli' is supported. | |||
| >>> # probs1_b (Tensor): the probability of success of distribution b. | |||
| >>> # probs1_a (Tensor): the probability of success of distribution a. Default: self.probs. | |||
| >>> # Examples of kl_loss. `cross_entropy` is similar. | |||
| >>> # Examples of `kl_loss`. `cross_entropy` is similar. | |||
| >>> ans = b1.kl_loss('Bernoulli', probs_b) | |||
| >>> print(ans.shape) | |||
| (3,) | |||
| @@ -165,6 +183,9 @@ class Bernoulli(Distribution): | |||
| """ | |||
| Return the probability that the outcome is 1 | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the probs of the distribution. | |||
| """ | |||
| return self._probs | |||
| @@ -25,18 +25,36 @@ from ._utils.custom_ops import log_generic | |||
| class Beta(Distribution): | |||
| """ | |||
| r""" | |||
| Beta distribution. | |||
| A Beta distribution is a continuous distribution with the range :math:`[0, 1]` and the probability density function: | |||
| .. math:: | |||
| f(x, \alpha, \beta) = x^{\alpha - 1} (1-x)^{\beta - 1} / B(\alpha, \beta), | |||
| where :math:`B` is the Beta function. | |||
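The standard Beta density (with exponents :math:`\alpha - 1` and :math:`\beta - 1`) can be checked numerically via the standard library's `math.gamma` (illustrative parameters):

```python
import math

# Beta function via gamma: B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
def beta_fn(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

# Beta density: f(x) = x^(a-1) * (1 - x)^(b-1) / B(a, b)
def beta_pdf(x, a, b):
    return x ** (a - 1) * (1.0 - x) ** (b - 1) / beta_fn(a, b)

a, b = 2.0, 3.0
n = 20000
step = 1.0 / n
# midpoint-rule integral over (0, 1) should be close to 1
total = sum(beta_pdf((k + 0.5) * step, a, b) * step for k in range(n))
```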
| Args: | |||
| concentration1 (list, numpy.ndarray, Tensor): The concentration1, | |||
| also know as alpha of the Beta distribution. | |||
| concentration0 (list, numpy.ndarray, Tensor): The concentration0, also know as | |||
| beta of the Beta distribution. | |||
| concentration1 (int, float, list, numpy.ndarray, Tensor): The concentration1, | |||
| also known as alpha of the Beta distribution. Default: None. | |||
| concentration0 (int, float, list, numpy.ndarray, Tensor): The concentration0, also known as | |||
| beta of the Beta distribution. Default: None. | |||
| seed (int): The seed used in sampling. The global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Beta'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob` and `log_prob` | |||
| - `mean`, `sd`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noticed that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` | |||
| @@ -45,6 +63,10 @@ class Beta(Distribution): | |||
| `dist_spec_args` are `concentration1` and `concentration0`. | |||
| `dtype` must be a float type because Beta distributions are continuous. | |||
| Raises: | |||
| ValueError: When concentration1 <= 0 or concentration0 <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -145,10 +167,12 @@ class Beta(Distribution): | |||
| Constructor of Beta. | |||
| """ | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'concentration1': concentration1, 'concentration0': concentration0} | |||
| param['param_dict'] = { | |||
| 'concentration1': concentration1, 'concentration0': concentration0} | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| # As some operators can't accept scalar input, check the type here | |||
| if isinstance(concentration0, float): | |||
| @@ -158,8 +182,10 @@ class Beta(Distribution): | |||
| super(Beta, self).__init__(seed, dtype, name, param) | |||
| self._concentration1 = self._add_parameter(concentration1, 'concentration1') | |||
| self._concentration0 = self._add_parameter(concentration0, 'concentration0') | |||
| self._concentration1 = self._add_parameter( | |||
| concentration1, 'concentration1') | |||
| self._concentration0 = self._add_parameter( | |||
| concentration0, 'concentration0') | |||
| if self._concentration1 is not None: | |||
| check_greater_zero(self._concentration1, "concentration1") | |||
| if self._concentration0 is not None: | |||
| @@ -183,7 +209,8 @@ class Beta(Distribution): | |||
| def extend_repr(self): | |||
| """Display instance object as string.""" | |||
| if self.is_scalar_batch: | |||
| s = 'concentration1 = {}, concentration0 = {}'.format(self._concentration1, self._concentration0) | |||
| s = 'concentration1 = {}, concentration0 = {}'.format( | |||
| self._concentration1, self._concentration0) | |||
| else: | |||
| s = 'batch_shape = {}'.format(self._broadcast_shape) | |||
| return s | |||
| @@ -193,6 +220,9 @@ class Beta(Distribution): | |||
| """ | |||
| Return the concentration1, also known as the alpha of the Beta distribution, | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the concentration1 parameter of the distribution. | |||
| """ | |||
| return self._concentration1 | |||
| @@ -201,6 +231,9 @@ class Beta(Distribution): | |||
| """ | |||
| Return the concentration0, also known as the beta of the Beta distribution, | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the concentration0 parameter of the distribution. | |||
| """ | |||
| return self._concentration0 | |||
| @@ -222,14 +255,16 @@ class Beta(Distribution): | |||
| """ | |||
| The mean of the distribution. | |||
| """ | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| return concentration1 / (concentration1 + concentration0) | |||
| def _var(self, concentration1=None, concentration0=None): | |||
| """ | |||
| The variance of the distribution. | |||
| """ | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| total_concentration = concentration1 + concentration0 | |||
| return concentration1 * concentration0 / (self.pow(total_concentration, 2) * (total_concentration + 1.)) | |||
| @@ -237,7 +272,8 @@ class Beta(Distribution): | |||
| """ | |||
| The mode of the distribution. | |||
| """ | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| comp1 = self.greater(concentration1, 1.) | |||
| comp2 = self.greater(concentration0, 1.) | |||
| cond = self.logicaland(comp1, comp2) | |||
| @@ -254,12 +290,13 @@ class Beta(Distribution): | |||
| H(X) = \log(B(\alpha, \beta)) - (\alpha - 1) * \digamma(\alpha) | |||
| - (\beta - 1) * \digamma(\beta) + (\alpha + \beta - 2) * \digamma(\alpha + \beta) | |||
| """ | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| total_concentration = concentration1 + concentration0 | |||
| return self.lbeta(concentration1, concentration0) \ | |||
| - (concentration1 - 1.) * self.digamma(concentration1) \ | |||
| - (concentration0 - 1.) * self.digamma(concentration0) \ | |||
| + (total_concentration - 2.) * self.digamma(total_concentration) | |||
| - (concentration1 - 1.) * self.digamma(concentration1) \ | |||
| - (concentration0 - 1.) * self.digamma(concentration0) \ | |||
| + (total_concentration - 2.) * self.digamma(total_concentration) | |||
| def _cross_entropy(self, dist, concentration1_b, concentration0_b, concentration1_a=None, concentration0_a=None): | |||
| r""" | |||
| @@ -274,7 +311,8 @@ class Beta(Distribution): | |||
| """ | |||
| check_distribution_name(dist, 'Beta') | |||
| return self._entropy(concentration1_a, concentration0_a) \ | |||
| + self._kl_loss(dist, concentration1_b, concentration0_b, concentration1_a, concentration0_a) | |||
| + self._kl_loss(dist, concentration1_b, concentration0_b, | |||
| concentration1_a, concentration0_a) | |||
| def _log_prob(self, value, concentration1=None, concentration0=None): | |||
| r""" | |||
| @@ -290,9 +328,10 @@ class Beta(Distribution): | |||
| """ | |||
| value = self._check_value(value, 'value') | |||
| value = self.cast(value, self.dtype) | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| log_unnormalized_prob = (concentration1 - 1.) * self.log(value) \ | |||
| + (concentration0 - 1.) * self.log1p(self.neg(value)) | |||
| + (concentration0 - 1.) * self.log1p(self.neg(value)) | |||
| return log_unnormalized_prob - self.lbeta(concentration1, concentration0) | |||
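The log-probability computation follows the usual split into an unnormalized part and the log-normalizer `lbeta`. A minimal pure-Python sketch of the same arithmetic (the function name is illustrative, not a MindSpore API):

```python
import math

def beta_log_prob(value, c1, c0):
    # mirrors _log_prob: (a-1)ln(x) + (b-1)ln(1-x) - ln B(a, b)
    log_unnormalized = (c1 - 1.0) * math.log(value) \
        + (c0 - 1.0) * math.log1p(-value)
    lbeta = math.lgamma(c1) + math.lgamma(c0) - math.lgamma(c1 + c0)
    return log_unnormalized - lbeta
```

For Beta(2, 2) the pdf at 0.5 is 6 * 0.5 * 0.5 = 1.5, so the log-probability should be ln(1.5).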
| def _kl_loss(self, dist, concentration1_b, concentration0_b, concentration1_a=None, concentration0_a=None): | |||
| @@ -313,19 +352,23 @@ class Beta(Distribution): | |||
| + \digamma(\alpha_{a} + \beta_{a}) * (\alpha_{b} + \beta_{b} - \alpha_{a} - \beta_{a}) | |||
| """ | |||
| check_distribution_name(dist, 'Beta') | |||
| concentration1_b = self._check_value(concentration1_b, 'concentration1_b') | |||
| concentration0_b = self._check_value(concentration0_b, 'concentration0_b') | |||
| concentration1_b = self._check_value( | |||
| concentration1_b, 'concentration1_b') | |||
| concentration0_b = self._check_value( | |||
| concentration0_b, 'concentration0_b') | |||
| concentration1_b = self.cast(concentration1_b, self.parameter_type) | |||
| concentration0_b = self.cast(concentration0_b, self.parameter_type) | |||
| concentration1_a, concentration0_a = self._check_param_type(concentration1_a, concentration0_a) | |||
| concentration1_a, concentration0_a = self._check_param_type( | |||
| concentration1_a, concentration0_a) | |||
| total_concentration_a = concentration1_a + concentration0_a | |||
| total_concentration_b = concentration1_b + concentration0_b | |||
| log_normalization_a = self.lbeta(concentration1_a, concentration0_a) | |||
| log_normalization_b = self.lbeta(concentration1_b, concentration0_b) | |||
| return (log_normalization_b - log_normalization_a) \ | |||
| - (self.digamma(concentration1_a) * (concentration1_b - concentration1_a)) \ | |||
| - (self.digamma(concentration0_a) * (concentration0_b - concentration0_a)) \ | |||
| + (self.digamma(total_concentration_a) * (total_concentration_b - total_concentration_a)) | |||
| - (self.digamma(concentration1_a) * (concentration1_b - concentration1_a)) \ | |||
| - (self.digamma(concentration0_a) * (concentration0_b - concentration0_a)) \ | |||
| + (self.digamma(total_concentration_a) * | |||
| (total_concentration_b - total_concentration_a)) | |||
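The KL divergence between two Beta distributions used above can be exercised term by term in plain Python (helper names are illustrative; the digamma is a numerical stand-in):

```python
import math

def digamma(x, h=1e-6):
    # numerical digamma via central difference of lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def lbeta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_kl(c1_a, c0_a, c1_b, c0_b):
    # KL(Beta(c1_a, c0_a) || Beta(c1_b, c0_b)), same terms as _kl_loss
    return (lbeta(c1_b, c0_b) - lbeta(c1_a, c0_a)
            - digamma(c1_a) * (c1_b - c1_a)
            - digamma(c0_a) * (c0_b - c0_a)
            + digamma(c1_a + c0_a) * ((c1_b + c0_b) - (c1_a + c0_a)))
```

The KL of a distribution with itself is zero, and it is non-negative otherwise.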
| def _sample(self, shape=(), concentration1=None, concentration0=None): | |||
| """ | |||
| @@ -340,7 +383,8 @@ class Beta(Distribution): | |||
| Tensor, with the shape being shape + batch_shape. | |||
| """ | |||
| shape = self.checktuple(shape, 'shape') | |||
| concentration1, concentration0 = self._check_param_type(concentration1, concentration0) | |||
| concentration1, concentration0 = self._check_param_type( | |||
| concentration1, concentration0) | |||
| batch_shape = self.shape(concentration1 + concentration0) | |||
| origin_shape = shape + batch_shape | |||
| if origin_shape == (): | |||
| @@ -348,8 +392,10 @@ class Beta(Distribution): | |||
| else: | |||
| sample_shape = origin_shape | |||
| ones = self.fill(self.dtype, sample_shape, 1.0) | |||
| sample_gamma1 = C.gamma(sample_shape, alpha=concentration1, beta=ones, seed=self.seed) | |||
| sample_gamma2 = C.gamma(sample_shape, alpha=concentration0, beta=ones, seed=self.seed) | |||
| sample_gamma1 = C.gamma( | |||
| sample_shape, alpha=concentration1, beta=ones, seed=self.seed) | |||
| sample_gamma2 = C.gamma( | |||
| sample_shape, alpha=concentration0, beta=ones, seed=self.seed) | |||
| sample_beta = sample_gamma1 / (sample_gamma1 + sample_gamma2) | |||
| value = self.cast(sample_beta, self.dtype) | |||
| if origin_shape == (): | |||
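`_sample` uses the classic gamma-ratio construction: if X ~ Gamma(c1, 1) and Y ~ Gamma(c0, 1) independently, then X / (X + Y) ~ Beta(c1, c0). A stdlib-only sketch of the same trick (not the MindSpore `C.gamma` op):

```python
import random

def sample_beta(c1, c0, n=10000, seed=0):
    # Beta(c1, c0) via the gamma ratio X / (X + Y), as in _sample above
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        g1 = rng.gammavariate(c1, 1.0)
        g2 = rng.gammavariate(c0, 1.0)
        samples.append(g1 / (g1 + g2))
    return samples
```

The sample mean should be close to the Beta mean c1 / (c1 + c0).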
| @@ -29,20 +29,37 @@ from ._utils.custom_ops import exp_generic, log_generic, broadcast_to | |||
| class Categorical(Distribution): | |||
| """ | |||
| Create a categorical distribution parameterized by event probabilities. | |||
| Categorical distribution. | |||
| A Categorical Distribution is a discrete distribution with the range {1, 2, ..., k} | |||
| and the probability mass function as :math:`P(X = i) = p_i, i = 1, ..., k`. | |||
| Args: | |||
| probs (Tensor, list, numpy.ndarray): Event probabilities. | |||
| probs (Tensor, list, numpy.ndarray): Event probabilities. Default: None. | |||
| seed (int): The global seed is used in sampling. Global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.int32. | |||
| name (str): The name of the distribution. Default: Categorical. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noted that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution` and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| Note: | |||
| `probs` must have rank at least 1, and its values must be proper probabilities that sum to 1. | |||
| Raises: | |||
| ValueError: When the sum of all elements in `probs` is not 1. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -95,7 +112,7 @@ class Categorical(Distribution): | |||
| >>> # dist (str): the name of the distribution. Only 'Categorical' is supported. | |||
| >>> # probs_b (Tensor): event probabilities of distribution b. | |||
| >>> # probs (Tensor): event probabilities of distribution a. Default: self.probs. | |||
| >>> # Examples of kl_loss. `cross_entropy` is similar. | |||
| >>> # Examples of `kl_loss`; `cross_entropy` is similar. | |||
| >>> ans = ca1.kl_loss('Categorical', probs_b) | |||
| >>> print(ans.shape) | |||
| () | |||
| @@ -172,6 +189,9 @@ class Categorical(Distribution): | |||
| def probs(self): | |||
| """ | |||
| Return the probability after casting to dtype. | |||
| Output: | |||
| Tensor, the probs of the distribution. | |||
| """ | |||
| return self._probs | |||
| @@ -24,8 +24,15 @@ from ._utils.custom_ops import exp_generic, log_generic, log1p_generic | |||
| class Cauchy(Distribution): | |||
| """ | |||
| r""" | |||
| Cauchy distribution. | |||
| A Cauchy distribution is a continuous distribution with the range :math:`(-\infty, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, a, b) = 1 / (\pi b (1 + ((x - a)/b)^2)), | |||
| where :math:`a` and :math:`b` are the location and scale parameters respectively. | |||
| Args: | |||
| loc (int, float, list, numpy.ndarray, Tensor): The location of the Cauchy distribution. | |||
| @@ -34,6 +41,18 @@ class Cauchy(Distribution): | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Cauchy'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mode` and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noted that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution` and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` | |||
| @@ -43,6 +62,10 @@ class Cauchy(Distribution): | |||
| `dtype` must be a float type because Cauchy distributions are continuous. | |||
| Cauchy distribution is not supported on GPU backend. | |||
| Raises: | |||
| ValueError: When scale <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -144,7 +167,8 @@ class Cauchy(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'loc': loc, 'scale': scale} | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| super(Cauchy, self).__init__(seed, dtype, name, param) | |||
| self._loc = self._add_parameter(loc, 'loc') | |||
| @@ -171,11 +195,11 @@ class Cauchy(Distribution): | |||
| self.entropy_const = np.log(4 * np.pi) | |||
| def extend_repr(self): | |||
| """Display instance object as string.""" | |||
| if self.is_scalar_batch: | |||
| str_info = 'location = {}, scale = {}'.format(self._loc, self._scale) | |||
| str_info = 'location = {}, scale = {}'.format( | |||
| self._loc, self._scale) | |||
| else: | |||
| str_info = 'batch_shape = {}'.format(self._broadcast_shape) | |||
| return str_info | |||
| @@ -184,6 +208,9 @@ class Cauchy(Distribution): | |||
| def loc(self): | |||
| """ | |||
| Return the location of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the loc parameter of the distribution. | |||
| """ | |||
| return self._loc | |||
| @@ -191,6 +218,9 @@ class Cauchy(Distribution): | |||
| def scale(self): | |||
| """ | |||
| Return the scale of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the scale parameter of the distribution. | |||
| """ | |||
| return self._scale | |||
| @@ -320,7 +350,7 @@ class Cauchy(Distribution): | |||
| sum_square = self.sq(scale_a + scale_b) | |||
| square_diff = self.sq(loc_a - loc_b) | |||
| return self.log(sum_square + square_diff) - \ | |||
| self.log(self.const(4.0)) - self.log(scale_a) - self.log(scale_b) | |||
| self.log(self.const(4.0)) - self.log(scale_a) - self.log(scale_b) | |||
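The expression above is the closed-form KL between two Cauchy distributions, ln(((s_a + s_b)^2 + (loc_a - loc_b)^2) / (4 s_a s_b)). A plain-Python sketch of the same formula (the function name is illustrative):

```python
import math

def cauchy_kl(loc_a, scale_a, loc_b, scale_b):
    # KL(Cauchy(loc_a, scale_a) || Cauchy(loc_b, scale_b)),
    # the same closed form as _kl_loss above
    return (math.log((scale_a + scale_b) ** 2 + (loc_a - loc_b) ** 2)
            - math.log(4.0) - math.log(scale_a) - math.log(scale_b))
```

Identical distributions give KL = 0; shifting the location by 2 at unit scale gives ln(8/4) = ln 2.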
| def _cross_entropy(self, dist, loc_b, scale_b, loc_a=None, scale_a=None): | |||
| r""" | |||
| @@ -437,7 +437,7 @@ class Distribution(Cell): | |||
| def cdf(self, value, *args, **kwargs): | |||
| """ | |||
| Evaluate the cdf at given value. | |||
| Evaluate the cumulative distribution function (cdf) at the given value. | |||
| Args: | |||
| value (Tensor): value to be evaluated. | |||
| @@ -447,6 +447,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its dist_spec_args through | |||
| `args` or `kwargs`. | |||
| Output: | |||
| Tensor, the cdf of the distribution. | |||
| """ | |||
| return self._call_cdf(value, *args, **kwargs) | |||
| @@ -479,7 +482,7 @@ class Distribution(Cell): | |||
| def log_cdf(self, value, *args, **kwargs): | |||
| """ | |||
| Evaluate the log cdf at given value. | |||
| Evaluate the log of the cumulative distribution function (cdf) at the given value. | |||
| Args: | |||
| value (Tensor): value to be evaluated. | |||
| @@ -489,6 +492,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its dist_spec_args through | |||
| `args` or `kwargs`. | |||
| Output: | |||
| Tensor, the log cdf of the distribution. | |||
| """ | |||
| return self._call_log_cdf(value, *args, **kwargs) | |||
| @@ -513,6 +519,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its dist_spec_args through | |||
| `args` or `kwargs`. | |||
| Output: | |||
| Tensor, the survival function of the distribution. | |||
| """ | |||
| return self._call_survival(value, *args, **kwargs) | |||
| @@ -546,6 +555,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its dist_spec_args through | |||
| `args` or `kwargs`. | |||
| Output: | |||
| Tensor, the log survival function of the distribution. | |||
| """ | |||
| return self._call_log_survival(value, *args, **kwargs) | |||
| @@ -573,6 +585,9 @@ class Distribution(Cell): | |||
| Note: | |||
| dist_spec_args of distribution b must be passed to the function through `args` or `kwargs`. | |||
| Passing in dist_spec_args of distribution a is optional. | |||
| Output: | |||
| Tensor, the kl loss function of the distribution. | |||
| """ | |||
| return self._kl_loss(dist, *args, **kwargs) | |||
| @@ -590,6 +605,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the mean of the distribution. | |||
| """ | |||
| return self._mean(*args, **kwargs) | |||
| @@ -607,6 +625,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the mode of the distribution. | |||
| """ | |||
| return self._mode(*args, **kwargs) | |||
| @@ -616,11 +637,14 @@ class Distribution(Cell): | |||
| Args: | |||
| *args (list): the list of positional arguments forwarded to subclasses. | |||
| **kwargs (dict: the dictionary of keyword arguments forwarded to subclasses. | |||
| **kwargs (dict): the dictionary of keyword arguments forwarded to subclasses. | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the standard deviation of the distribution. | |||
| """ | |||
| return self._call_sd(*args, **kwargs) | |||
| @@ -635,6 +659,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the variance of the distribution. | |||
| """ | |||
| return self._call_var(*args, **kwargs) | |||
| @@ -670,6 +697,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the entropy of the distribution. | |||
| """ | |||
| return self._entropy(*args, **kwargs) | |||
| @@ -685,6 +715,9 @@ class Distribution(Cell): | |||
| Note: | |||
| dist_spec_args of distribution b must be passed to the function through `args` or `kwargs`. | |||
| Passing in dist_spec_args of distribution a is optional. | |||
| Output: | |||
| Tensor, the cross_entropy of two distributions. | |||
| """ | |||
| return self._call_cross_entropy(dist, *args, **kwargs) | |||
| @@ -712,6 +745,9 @@ class Distribution(Cell): | |||
| Note: | |||
| A distribution can be optionally passed to the function by passing its *dist_spec_args* through | |||
| *args* or *kwargs*. | |||
| Output: | |||
| Tensor, the sample generated from the distribution. | |||
| """ | |||
| return self._sample(*args, **kwargs) | |||
| @@ -24,15 +24,34 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Exponential(Distribution): | |||
| """ | |||
| Example class: Exponential Distribution. | |||
| r""" | |||
| Exponential Distribution. | |||
| An Exponential distribution is a continuous distribution with the range :math:`[0, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, \lambda) = \lambda \exp(-\lambda x), | |||
| where :math:`\lambda` is the rate of the distribution. | |||
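The pdf above is straightforward to transcribe; a minimal stdlib sketch (the function name is illustrative, not a MindSpore API):

```python
import math

def exponential_pdf(x, rate):
    # f(x; lam) = lam * exp(-lam * x) for x >= 0, else 0
    return rate * math.exp(-rate * x) if x >= 0 else 0.0
```

At x = 0 the density equals the rate itself.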
| Args: | |||
| rate (float, list, numpy.ndarray, Tensor): The inverse scale. Default: None. | |||
| rate (int, float, list, numpy.ndarray, Tensor): The inverse scale. Default: None. | |||
| seed (int): The seed used in sampling. The global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Exponential'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noted that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution` and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -41,6 +60,10 @@ class Exponential(Distribution): | |||
| `dist_spec_args` is `rate`. | |||
| `dtype` must be a float type because Exponential distributions are continuous. | |||
| Raises: | |||
| ValueError: When rate <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -133,7 +156,8 @@ class Exponential(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'rate': rate} | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| super(Exponential, self).__init__(seed, dtype, name, param) | |||
| self._rate = self._add_parameter(rate, 'rate') | |||
| @@ -167,6 +191,9 @@ class Exponential(Distribution): | |||
| def rate(self): | |||
| """ | |||
| Return `rate` of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the rate parameter of the distribution. | |||
| """ | |||
| return self._rate | |||
| @@ -25,18 +25,38 @@ from ._utils.custom_ops import log_generic | |||
| class Gamma(Distribution): | |||
| """ | |||
| r""" | |||
| Gamma distribution. | |||
| A Gamma distribution is a continuous distribution with the range :math:`(0, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, \alpha, \beta) = \beta^\alpha / \Gamma(\alpha) x^{\alpha - 1} \exp(-\beta x), | |||
| where :math:`\Gamma` is the Gamma function, | |||
| and :math:`\alpha, \beta` are the concentration and the rate of the distribution respectively. | |||
| Args: | |||
| concentration (list, numpy.ndarray, Tensor): The concentration, | |||
| concentration (int, float, list, numpy.ndarray, Tensor): The concentration, | |||
| also known as alpha of the Gamma distribution. Default: None. | |||
| rate (list, numpy.ndarray, Tensor): The rate, also know as | |||
| rate (int, float, list, numpy.ndarray, Tensor): The rate, also known as | |||
| beta of the Gamma distribution. Default: None. | |||
| seed (int): The seed used in sampling. The global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Gamma'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noted that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution` and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` | |||
| @@ -45,6 +65,10 @@ class Gamma(Distribution): | |||
| `dist_spec_args` are `concentration` and `rate`. | |||
| `dtype` must be a float type because Gamma distributions are continuous. | |||
| Raises: | |||
| ValueError: When concentration <= 0 or rate <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -147,7 +171,8 @@ class Gamma(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'concentration': concentration, 'rate': rate} | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| # As some operators can't accept scalar input, check the type here | |||
| if isinstance(concentration, (int, float)): | |||
| @@ -157,7 +182,8 @@ class Gamma(Distribution): | |||
| super(Gamma, self).__init__(seed, dtype, name, param) | |||
| self._concentration = self._add_parameter(concentration, 'concentration') | |||
| self._concentration = self._add_parameter( | |||
| concentration, 'concentration') | |||
| self._rate = self._add_parameter(rate, 'rate') | |||
| if self._concentration is not None: | |||
| check_greater_zero(self._concentration, "concentration") | |||
| @@ -182,7 +208,8 @@ class Gamma(Distribution): | |||
| def extend_repr(self): | |||
| """Display instance object as string.""" | |||
| if self.is_scalar_batch: | |||
| s = 'concentration = {}, rate = {}'.format(self._concentration, self._rate) | |||
| s = 'concentration = {}, rate = {}'.format( | |||
| self._concentration, self._rate) | |||
| else: | |||
| s = 'batch_shape = {}'.format(self._broadcast_shape) | |||
| return s | |||
| @@ -192,6 +219,9 @@ class Gamma(Distribution): | |||
| """ | |||
| Return the concentration, also known as the alpha of the Gamma distribution, | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the concentration parameter of the distribution. | |||
| """ | |||
| return self._concentration | |||
| @@ -200,6 +230,9 @@ class Gamma(Distribution): | |||
| """ | |||
| Return the rate, also known as the beta of the Gamma distribution, | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the rate parameter of the distribution. | |||
| """ | |||
| return self._rate | |||
| @@ -244,7 +277,8 @@ class Gamma(Distribution): | |||
| """ | |||
| concentration, rate = self._check_param_type(concentration, rate) | |||
| mode = (concentration - 1.) / rate | |||
| nan = self.fill(self.dtypeop(concentration), self.shape(concentration), np.nan) | |||
| nan = self.fill(self.dtypeop(concentration), | |||
| self.shape(concentration), np.nan) | |||
| comp = self.greater(concentration, 1.) | |||
| return self.select(comp, mode, nan) | |||
| @@ -257,7 +291,7 @@ class Gamma(Distribution): | |||
| """ | |||
| concentration, rate = self._check_param_type(concentration, rate) | |||
| return concentration - self.log(rate) + self.lgamma(concentration) \ | |||
| + (1. - concentration) * self.digamma(concentration) | |||
| + (1. - concentration) * self.digamma(concentration) | |||
| def _cross_entropy(self, dist, concentration_b, rate_b, concentration_a=None, rate_a=None): | |||
| r""" | |||
| @@ -272,7 +306,8 @@ class Gamma(Distribution): | |||
| """ | |||
| check_distribution_name(dist, 'Gamma') | |||
| return self._entropy(concentration_a, rate_a) +\ | |||
| self._kl_loss(dist, concentration_b, rate_b, concentration_a, rate_a) | |||
| self._kl_loss(dist, concentration_b, rate_b, | |||
| concentration_a, rate_a) | |||
| def _log_prob(self, value, concentration=None, rate=None): | |||
| r""" | |||
| @@ -289,8 +324,10 @@ class Gamma(Distribution): | |||
| value = self._check_value(value, 'value') | |||
| value = self.cast(value, self.dtype) | |||
| concentration, rate = self._check_param_type(concentration, rate) | |||
| unnormalized_log_prob = (concentration - 1.) * self.log(value) - rate * value | |||
| log_normalization = self.lgamma(concentration) - concentration * self.log(rate) | |||
| unnormalized_log_prob = (concentration - 1.) * \ | |||
| self.log(value) - rate * value | |||
| log_normalization = self.lgamma( | |||
| concentration) - concentration * self.log(rate) | |||
| return unnormalized_log_prob - log_normalization | |||
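As with Beta, the Gamma log-probability splits into an unnormalized part and a log-normalizer. A plain-Python sketch of the same arithmetic (the function name is illustrative):

```python
import math

def gamma_log_prob(value, concentration, rate):
    # mirrors _log_prob:
    # (alpha - 1) ln(x) - beta x  minus  (ln G(alpha) - alpha ln(beta))
    unnormalized = (concentration - 1.0) * math.log(value) - rate * value
    log_normalization = math.lgamma(concentration) \
        - concentration * math.log(rate)
    return unnormalized - log_normalization
```

Gamma(1, lam) reduces to Exponential(lam), so the log-pdf at 0.5 with rate 2 should be ln(2) - 1.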
| def _cdf(self, value, concentration=None, rate=None): | |||
| @@ -332,11 +369,12 @@ class Gamma(Distribution): | |||
| rate_b = self._check_value(rate_b, 'rate_b') | |||
| concentration_b = self.cast(concentration_b, self.parameter_type) | |||
| rate_b = self.cast(rate_b, self.parameter_type) | |||
| concentration_a, rate_a = self._check_param_type(concentration_a, rate_a) | |||
| concentration_a, rate_a = self._check_param_type( | |||
| concentration_a, rate_a) | |||
| return (concentration_a - concentration_b) * self.digamma(concentration_a) \ | |||
| + self.lgamma(concentration_b) - self.lgamma(concentration_a) \ | |||
| + concentration_b * self.log(rate_a) - concentration_b * self.log(rate_b) \ | |||
| + concentration_a * (rate_b / rate_a - 1.) | |||
| + self.lgamma(concentration_b) - self.lgamma(concentration_a) \ | |||
| + concentration_b * self.log(rate_a) - concentration_b * self.log(rate_b) \ | |||
| + concentration_a * (rate_b / rate_a - 1.) | |||
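The Gamma-to-Gamma KL above can also be checked term by term in plain Python (the numerical `digamma` is a stand-in for `self.digamma`):

```python
import math

def digamma(x, h=1e-6):
    # numerical digamma via central difference of lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def gamma_kl(conc_a, rate_a, conc_b, rate_b):
    # KL(Gamma(conc_a, rate_a) || Gamma(conc_b, rate_b)),
    # same terms as _kl_loss above
    return ((conc_a - conc_b) * digamma(conc_a)
            + math.lgamma(conc_b) - math.lgamma(conc_a)
            + conc_b * math.log(rate_a) - conc_b * math.log(rate_b)
            + conc_a * (rate_b / rate_a - 1.0))
```

Again, KL of a distribution with itself vanishes and is positive otherwise.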
| def _sample(self, shape=(), concentration=None, rate=None): | |||
| """ | |||
| @@ -26,16 +26,29 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Geometric(Distribution): | |||
| """ | |||
| Geometric Distribution. | |||
| A Geometric Distribution is a discrete distribution with the range as the non-negative integers, | |||
| and the probability mass function as :math:`P(X = k) = p(1-p)^{k}, k = 0, 1, 2, ...`. | |||
| It represents that there are k failures before the first success, namely that there are in total k+1 Bernoulli | |||
| trials when the first success is achieved. | |||
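Using the k-failures-before-first-success convention described above, the pmf is a one-liner (the function name is illustrative):

```python
def geometric_pmf(k, p):
    # P(X = k) = p * (1 - p)**k  -- k failures before the first success
    return p * (1.0 - p) ** k
```

The probabilities sum to 1 over the non-negative integers, and the mean is (1 - p) / p.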
| Args: | |||
| probs (float, list, numpy.ndarray, Tensor): The probability of success. Default: None. | |||
| probs (int, float, list, numpy.ndarray, Tensor): The probability of success. Default: None. | |||
| seed (int): The seed used in sampling. Global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.int32. | |||
| name (str): The name of the distribution. Default: 'Geometric'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| It should be noted that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution` and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -43,6 +56,9 @@ class Geometric(Distribution): | |||
| `probs` must be a proper probability (0 < p < 1). | |||
| `dist_spec_args` is `probs`. | |||
| Raises: | |||
| ValueError: When p <= 0 or p >= 1. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -138,7 +154,8 @@ class Geometric(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'probs': probs} | |||
| valid_dtype = mstype.int_type + mstype.uint_type + mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| super(Geometric, self).__init__(seed, dtype, name, param) | |||
| self._probs = self._add_parameter(probs, 'probs') | |||
| @@ -176,6 +193,9 @@ class Geometric(Distribution): | |||
| def probs(self): | |||
| """ | |||
| Return the probability of success of the Bernoulli trial, after casting to dtype. | |||
| Output: | |||
| Tensor, the probs parameter of the distribution. | |||
| """ | |||
| return self._probs | |||
| @@ -26,16 +26,31 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Gumbel(TransformedDistribution): | |||
| """ | |||
| r""" | |||
| Gumbel distribution. | |||
| A Gumbel distribution is a continuous distribution with the range :math:`(-\infty, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, a, b) = 1 / b \exp(-((x - a) / b + \exp(-(x - a) / b))), | |||
| where :math:`a` and :math:`b` are the location and scale parameters respectively. | |||
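The standard Gumbel density, f(x; a, b) = (1/b) exp(-(z + e^{-z})) with z = (x - a) / b, can be sketched directly (the function name is illustrative):

```python
import math

def gumbel_pdf(x, loc, scale):
    # f(x; a, b) = (1/b) * exp(-(z + exp(-z))), z = (x - a) / b
    z = (x - loc) / scale
    return math.exp(-(z + math.exp(-z))) / scale
```

The density peaks at the location parameter, where it equals e^{-1} / scale.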
| Args: | |||
| loc (float, list, numpy.ndarray, Tensor): The location of Gumbel distribution. | |||
| scale (float, list, numpy.ndarray, Tensor): The scale of Gumbel distribution. | |||
| loc (int, float, list, numpy.ndarray, Tensor): The location of Gumbel distribution. Default: None. | |||
| scale (int, float, list, numpy.ndarray, Tensor): The scale of Gumbel distribution. Default: None. | |||
| seed (int): the seed used in sampling. The global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): type of the distribution. Default: mstype.float32. | |||
| name (str): the name of the distribution. Default: 'Gumbel'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -45,6 +60,10 @@ class Gumbel(TransformedDistribution): | |||
| `dtype` must be a float type because Gumbel distributions are continuous. | |||
| `kl_loss` and `cross_entropy` are not supported on GPU backend. | |||
| Raises: | |||
| ValueError: When scale <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -72,7 +91,8 @@ class Gumbel(TransformedDistribution): | |||
| Constructor of Gumbel distribution. | |||
| """ | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| gumbel_cdf = msb.GumbelCDF(loc, scale) | |||
| super(Gumbel, self).__init__( | |||
| distribution=msd.Uniform(0.0, 1.0, dtype=dtype), | |||
| @@ -101,6 +121,9 @@ class Gumbel(TransformedDistribution): | |||
| def loc(self): | |||
| """ | |||
| Return the location of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the loc parameter of the distribution. | |||
| """ | |||
| return self._loc | |||
| @@ -108,6 +131,9 @@ class Gumbel(TransformedDistribution): | |||
| def scale(self): | |||
| """ | |||
| Return the scale of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the scale parameter of the distribution. | |||
| """ | |||
| return self._scale | |||
| @@ -155,7 +181,8 @@ class Gumbel(TransformedDistribution): | |||
| .. math:: | |||
| STD(X) = \frac{\pi}{\sqrt{6}} * scale | |||
| """ | |||
| scale = self.scale * self.fill(self.parameter_type, self.broadcast_shape, 1.0) | |||
| scale = self.scale * \ | |||
| self.fill(self.parameter_type, self.broadcast_shape, 1.0) | |||
| return scale * np.pi / self.sqrt(self.const(6.)) | |||
| def _entropy(self): | |||
| @@ -165,7 +192,8 @@ class Gumbel(TransformedDistribution): | |||
| .. math:: | |||
| H(X) = 1. + \log(scale) + Euler-Mascheroni_constant | |||
| """ | |||
| scale = self.scale * self.fill(self.parameter_type, self.broadcast_shape, 1.0) | |||
| scale = self.scale * \ | |||
| self.fill(self.parameter_type, self.broadcast_shape, 1.0) | |||
| return 1. + self.log(scale) + np.euler_gamma | |||
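The two scalar formulas above (standard deviation and entropy of a Gumbel distribution) reduce to one-liners; a hedged sketch with illustrative names, using the Euler-Mascheroni constant directly instead of `np.euler_gamma`:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_sd(scale):
    # STD(X) = pi / sqrt(6) * scale, as in _sd above
    return math.pi / math.sqrt(6.0) * scale

def gumbel_entropy(scale):
    # H(X) = 1 + ln(scale) + gamma, as in _entropy above
    return 1.0 + math.log(scale) + EULER_GAMMA
```

Note that the location parameter does not enter either quantity.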
| def _log_prob(self, value): | |||
| @@ -219,8 +247,9 @@ class Gumbel(TransformedDistribution): | |||
| loc_b = self.cast(loc_b, self.parameter_type) | |||
| scale_b = self.cast(scale_b, self.parameter_type) | |||
| return self.log(scale_b / self.scale) +\ | |||
| np.euler_gamma * (self.scale / scale_b - 1.) + (self.loc - loc_b) / scale_b +\ | |||
| self.expm1((loc_b - self.loc) / scale_b + self.lgamma(self.scale / scale_b + 1.)) | |||
| np.euler_gamma * (self.scale / scale_b - 1.) + (self.loc - loc_b) / scale_b +\ | |||
| self.expm1((loc_b - self.loc) / scale_b + | |||
| self.lgamma(self.scale / scale_b + 1.)) | |||
| def _sample(self, shape=()): | |||
| shape = self.checktuple(shape, 'shape') | |||
| @@ -23,10 +23,18 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class LogNormal(msd.TransformedDistribution): | |||
| """ | |||
| r""" | |||
| LogNormal distribution. | |||
| A log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose | |||
| logarithm is normally distributed. It is constructed as the exponential transformation of a Normal distribution. | |||
| logarithm is normally distributed. | |||
| The log-normal distribution has the range :math:`(0, \infty)` with the pdf as | |||
| .. math:: | |||
| f(x, \mu, \sigma) = 1 / (x\sigma\sqrt{2\pi}) \exp(-(\ln(x) - \mu)^2 / (2\sigma^2)), | |||
| where :math:`\mu, \sigma` are the mean and | |||
| the standard deviation of the underlying normal distribution respectively. | |||
| It is constructed as the exponential transformation of a Normal distribution. | |||
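That one-line construction can be spelled out in plain Python (an illustrative sketch, independent of the MindSpore classes): the log-normal density is the normal density evaluated at ln(x), scaled by the 1/x Jacobian of the exp transform.

```python
import math

def normal_pdf(x, mu, sigma):
    return (math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def lognormal_pdf(x, mu, sigma):
    # f(x, mu, sigma) = 1 / (x sigma sqrt(2 pi)) exp(-(ln x - mu)^2 / (2 sigma^2)),
    # i.e. the normal density of ln(x) times the Jacobian factor 1/x.
    return normal_pdf(math.log(x), mu, sigma) / x

mu, sigma, x = 0.3, 0.8, 2.0
direct = (math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2))
          / (x * sigma * math.sqrt(2 * math.pi)))
assert abs(lognormal_pdf(x, mu, sigma) - direct) < 1e-12
```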
| Args: | |||
| loc (int, float, list, numpy.ndarray, Tensor): The mean of the underlying Normal distribution. Default: None. | |||
| @@ -36,6 +44,19 @@ class LogNormal(msd.TransformedDistribution): | |||
| dtype (mindspore.dtype): type of the distribution. Default: mstype.float32. | |||
| name (str): the name of the distribution. Default: 'LogNormal'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -44,6 +65,10 @@ class LogNormal(msd.TransformedDistribution): | |||
| `dist_spec_args` are `loc` and `scale`. | |||
| `dtype` must be a float type because LogNormal distributions are continuous. | |||
| Raises: | |||
| ValueError: When scale <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import numpy as np | |||
| >>> import mindspore | |||
| @@ -82,7 +107,7 @@ class LogNormal(msd.TransformedDistribution): | |||
| self.log_2pi = np.log(2 * np.pi) | |||
| #ops needed for the class | |||
| # ops needed for the class | |||
| self.dtypeop = P.DType() | |||
| self.exp = exp_generic | |||
| self.expm1 = P.Expm1() | |||
| @@ -103,6 +128,9 @@ class LogNormal(msd.TransformedDistribution): | |||
| """ | |||
| Distribution parameter for the pre-transformed mean | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the loc parameter of the distribution. | |||
| """ | |||
| return self._loc | |||
| @@ -111,6 +139,9 @@ class LogNormal(msd.TransformedDistribution): | |||
| """ | |||
| Distribution parameter for the pre-transformed standard deviation | |||
| after casting to dtype. | |||
| Output: | |||
| Tensor, the scale parameter of the distribution. | |||
| """ | |||
| return self._scale | |||
| @@ -24,16 +24,35 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Logistic(Distribution): | |||
| """ | |||
| r""" | |||
| Logistic distribution. | |||
| A Logistic distribution is a continuous distribution with the range :math:`(-\infty, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, a, b) = \exp(-(x - a) / b) / (b (1 + \exp(-(x - a) / b))^2), | |||
| where a and b are the loc and scale parameters respectively. | |||
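For reference, the standard logistic density is the derivative of the sigmoid cdf F(x) = 1 / (1 + exp(-(x - a) / b)); a quick plain-Python check (illustrative sketch, not the class's ops) compares it against a numerical derivative:

```python
import math

def logistic_cdf(x, a, b):
    # F(x) = 1 / (1 + exp(-(x - a) / b)), the sigmoid
    return 1.0 / (1.0 + math.exp(-(x - a) / b))

def logistic_pdf(x, a, b):
    # f(x) = exp(-(x - a) / b) / (b * (1 + exp(-(x - a) / b))^2) = F'(x)
    z = math.exp(-(x - a) / b)
    return z / (b * (1.0 + z) ** 2)

a, b, x, h = 1.0, 2.0, 0.7, 1e-6
numeric = (logistic_cdf(x + h, a, b) - logistic_cdf(x - h, a, b)) / (2 * h)
assert abs(numeric - logistic_pdf(x, a, b)) < 1e-8
```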
| Args: | |||
| loc (int, float, list, numpy.ndarray, Tensor): The location of the Logistic distribution. Default: None. | |||
| scale (int, float, list, numpy.ndarray, Tensor): The scale of the Logistic distribution. Default: None. | |||
| loc (float, list, numpy.ndarray, Tensor): The location of the Logistic distribution. Default: None. | |||
| scale (float, list, numpy.ndarray, Tensor): The scale of the Logistic distribution. Default: None. | |||
| seed (int): The seed used in sampling. The global seed is used if it is None. Default: None. | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Logistic'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -42,6 +61,10 @@ class Logistic(Distribution): | |||
| `dist_spec_args` are `loc` and `scale`. | |||
| `dtype` must be a float type because Logistic distributions are continuous. | |||
| Raises: | |||
| ValueError: When scale <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -127,7 +150,8 @@ class Logistic(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'loc': loc, 'scale': scale} | |||
| valid_dtype = mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| super(Logistic, self).__init__(seed, dtype, name, param) | |||
| self._loc = self._add_parameter(loc, 'loc') | |||
| @@ -184,6 +208,9 @@ class Logistic(Distribution): | |||
| def loc(self): | |||
| """ | |||
| Return the location of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the loc parameter of the distribution. | |||
| """ | |||
| return self._loc | |||
| @@ -191,6 +218,9 @@ class Logistic(Distribution): | |||
| def scale(self): | |||
| """ | |||
| Return the scale of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the scale parameter of the distribution. | |||
| """ | |||
| return self._scale | |||
| @@ -25,8 +25,16 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Normal(Distribution): | |||
| """ | |||
| r""" | |||
| Normal distribution. | |||
| A Normal distribution is a continuous distribution with the range :math:`(-\infty, \infty)` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, \mu, \sigma) = 1 / (\sigma\sqrt{2\pi}) \exp(-(x - \mu)^2 / (2\sigma^2)), | |||
| where :math:`\mu, \sigma` are the mean and | |||
| the standard deviation of the normal distribution respectively. | |||
| Args: | |||
| mean (int, float, list, numpy.ndarray, Tensor): The mean of the Normal distribution. Default: None. | |||
| @@ -35,6 +43,18 @@ class Normal(Distribution): | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Normal'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -43,6 +63,10 @@ class Normal(Distribution): | |||
| `dist_spec_args` are `mean` and `sd`. | |||
| `dtype` must be a float type because Normal distributions are continuous. | |||
| Raises: | |||
| ValueError: When sd <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -25,8 +25,11 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Poisson(Distribution): | |||
| """ | |||
| r""" | |||
| Poisson Distribution. | |||
| A Poisson Distribution is a discrete distribution whose range is the non-negative integers, | |||
| and the probability mass function is :math:`P(X = k) = \lambda^k \exp(-\lambda) / k!, k = 0, 1, 2, ...`, | |||
| where :math:`\lambda` is the rate of the distribution. | |||
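A quick plain-Python check of the pmf (illustrative, independent of the class): the probabilities sum to one, and the mean equals the rate.

```python
import math

def poisson_pmf(k, rate):
    # P(X = k) = rate^k * exp(-rate) / k!,  k = 0, 1, 2, ...
    return rate ** k * math.exp(-rate) / math.factorial(k)

rate = 3.5
# Truncating the support at k = 60 leaves a negligible tail for this rate.
probs = [poisson_pmf(k, rate) for k in range(60)]
assert abs(sum(probs) - 1.0) < 1e-12
# The mean (and variance) of a Poisson distribution equal the rate.
mean = sum(k * p for k, p in enumerate(probs))
assert abs(mean - rate) < 1e-9
```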
| Args: | |||
| rate (list, numpy.ndarray, Tensor): The rate of the Poisson distribution. Default: None. | |||
| @@ -34,6 +37,18 @@ class Poisson(Distribution): | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Poisson'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `mode`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` | |||
| @@ -41,6 +56,10 @@ class Poisson(Distribution): | |||
| `rate` must be strictly greater than 0. | |||
| `dist_spec_args` is `rate`. | |||
| Raises: | |||
| ValueError: When rate <= 0. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -119,7 +138,8 @@ class Poisson(Distribution): | |||
| param = dict(locals()) | |||
| param['param_dict'] = {'rate': rate} | |||
| valid_dtype = mstype.int_type + mstype.uint_type + mstype.float_type | |||
| Validator.check_type_name("dtype", dtype, valid_dtype, type(self).__name__) | |||
| Validator.check_type_name( | |||
| "dtype", dtype, valid_dtype, type(self).__name__) | |||
| # As some operators can't accept scalar input, check the type here | |||
| if isinstance(rate, (int, float)): | |||
| @@ -151,6 +171,9 @@ class Poisson(Distribution): | |||
| def rate(self): | |||
| """ | |||
| Return `rate` of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the rate parameter of the distribution. | |||
| """ | |||
| return self._rate | |||
| @@ -28,18 +28,36 @@ class TransformedDistribution(Distribution): | |||
| Transformed Distribution. | |||
| This class contains a bijector and a distribution and transforms the original distribution | |||
| to a new distribution through the operation defined by the bijector. | |||
| If X is a random variable following the underlying distribution, | |||
| and g(x) is a function represented by the bijector, | |||
| then Y = g(X) is a random variable following the transformed distribution. | |||
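That change of variables can be sketched in a few lines of plain Python (an illustrative example assuming an exp bijector over a Normal base, not the MindSpore API): log p_Y(y) = log p_X(g^{-1}(y)) + log |(g^{-1})'(y)|, which for g = exp reproduces the log-normal log-density.

```python
import math

def normal_log_prob(x, mu, sigma):
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma * math.sqrt(2 * math.pi)))

def exp_transformed_log_prob(y, mu, sigma):
    # log p_Y(y) = log p_X(g^{-1}(y)) + log |(g^{-1})'(y)| for g = exp,
    # so g^{-1}(y) = ln(y) and the log-Jacobian term is -ln(y).
    return normal_log_prob(math.log(y), mu, sigma) - math.log(y)

def lognormal_log_prob(y, mu, sigma):
    return (-(math.log(y) - mu) ** 2 / (2 * sigma ** 2)
            - math.log(y * sigma * math.sqrt(2 * math.pi)))

mu, sigma, y = 0.2, 1.1, 3.0
assert abs(exp_transformed_log_prob(y, mu, sigma)
           - lognormal_log_prob(y, mu, sigma)) < 1e-12
```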
| Args: | |||
| bijector (Bijector): The transformation to perform. | |||
| distribution (Distribution): The original distribution. Must has a float dtype. | |||
| seed (int): The seed is used in sampling. The global seed is used if it is None. Default:None. | |||
| distribution (Distribution): The original distribution. Must have a float dtype. | |||
| seed (int): The seed used in sampling. The global seed is used if it is None. Default: None. | |||
| If this seed is given when a TransformedDistribution object is initialized, the object's sampling function | |||
| will use this seed; otherwise, the underlying distribution's seed will be used. | |||
| name (str): The name of the transformed distribution. Default: 'transformed_distribution'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| Raises: | |||
| TypeError: When the input `bijector` is not a Bijector instance. | |||
| TypeError: When the input `distribution` is not a Distribution instance. | |||
| Note: | |||
| The arguments used to initialize the original distribution cannot be None. | |||
| For example, mynormal = msd.Normal(dtype=mindspore.float32) cannot be used to initialize a | |||
| @@ -93,8 +111,10 @@ class TransformedDistribution(Distribution): | |||
| [nn.probability.bijector.Bijector], type(self).__name__) | |||
| validator.check_value_type('distribution', distribution, | |||
| [Distribution], type(self).__name__) | |||
| validator.check_type_name("dtype", distribution.dtype, mstype.float_type, type(self).__name__) | |||
| super(TransformedDistribution, self).__init__(seed, distribution.dtype, name, param) | |||
| validator.check_type_name( | |||
| "dtype", distribution.dtype, mstype.float_type, type(self).__name__) | |||
| super(TransformedDistribution, self).__init__( | |||
| seed, distribution.dtype, name, param) | |||
| self._bijector = bijector | |||
| self._distribution = distribution | |||
| @@ -121,21 +141,44 @@ class TransformedDistribution(Distribution): | |||
| # broadcast bijector batch_shape and distribution batch_shape | |||
| self._broadcast_shape = self._broadcast_bijector_dist() | |||
| @property | |||
| def bijector(self): | |||
| """ | |||
| Return the bijector of the transformed distribution. | |||
| Output: | |||
| Bijector, the bijector of the transformed distribution. | |||
| """ | |||
| return self._bijector | |||
| @property | |||
| def distribution(self): | |||
| """ | |||
| Return the underlying distribution of the transformed distribution. | |||
| Output: | |||
| Distribution, the underlying distribution of the transformed distribution. | |||
| """ | |||
| return self._distribution | |||
| @property | |||
| def dtype(self): | |||
| """ | |||
| Return the dtype of the transformed distribution. | |||
| Output: | |||
| mindspore.dtype, the dtype of the transformed distribution. | |||
| """ | |||
| return self._dtype | |||
| @property | |||
| def is_linear_transformation(self): | |||
| """ | |||
| Return whether the transformation is linear. | |||
| Output: | |||
| Bool, true if the transformation is linear, and false otherwise. | |||
| """ | |||
| return self._is_linear_transformation | |||
| def _broadcast_bijector_dist(self): | |||
| @@ -144,7 +187,8 @@ class TransformedDistribution(Distribution): | |||
| """ | |||
| if self.batch_shape is None or self.bijector.batch_shape is None: | |||
| return None | |||
| bijector_shape_tensor = self.fill_base(self.dtype, self.bijector.batch_shape, 0.0) | |||
| bijector_shape_tensor = self.fill_base( | |||
| self.dtype, self.bijector.batch_shape, 0.0) | |||
| dist_shape_tensor = self.fill_base(self.dtype, self.batch_shape, 0.0) | |||
| return (bijector_shape_tensor + dist_shape_tensor).shape | |||
| @@ -174,12 +218,14 @@ class TransformedDistribution(Distribution): | |||
| \log(Py(a)) = \log(Px(g^{-1}(a))) + \log((g^{-1})'(a)) | |||
| """ | |||
| inverse_value = self.bijector("inverse", value) | |||
| unadjust_prob = self.distribution("log_prob", inverse_value, *args, **kwargs) | |||
| unadjust_prob = self.distribution( | |||
| "log_prob", inverse_value, *args, **kwargs) | |||
| log_jacobian = self.bijector("inverse_log_jacobian", value) | |||
| isneginf = self.equal_base(unadjust_prob, -np.inf) | |||
| isnan = self.equal_base(unadjust_prob + log_jacobian, np.nan) | |||
| return self.select_base(isneginf, | |||
| self.select_base(isnan, unadjust_prob + log_jacobian, unadjust_prob), | |||
| self.select_base( | |||
| isnan, unadjust_prob + log_jacobian, unadjust_prob), | |||
| unadjust_prob + log_jacobian) | |||
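The nested selects above guard against `-inf + inf = nan`; a plain-Python restatement of the same logic (illustrative sketch, not the operator-based implementation):

```python
import math

def safe_log_prob(unadjust_prob, log_jacobian):
    # If the underlying log_prob is -inf, keep -inf rather than letting a
    # +inf Jacobian turn it into nan; a genuine nan sum is still propagated.
    total = unadjust_prob + log_jacobian
    if unadjust_prob == -math.inf:
        return total if math.isnan(total) else unadjust_prob
    return total

assert safe_log_prob(-2.0, 0.5) == -1.5
assert safe_log_prob(-math.inf, 0.5) == -math.inf
assert math.isnan(safe_log_prob(-math.inf, math.inf))
```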
| def _prob(self, value, *args, **kwargs): | |||
| @@ -24,8 +24,15 @@ from ._utils.custom_ops import exp_generic, log_generic | |||
| class Uniform(Distribution): | |||
| """ | |||
| Example class: Uniform Distribution. | |||
| r""" | |||
| Uniform Distribution. | |||
| A Uniform distribution is a continuous distribution with the range :math:`[a, b]` | |||
| and the probability density function: | |||
| .. math:: | |||
| f(x, a, b) = 1 / (b - a), \ a \le x \le b, | |||
| where a and b are the lower and upper bound respectively. | |||
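For reference, the uniform density is constant at 1 / (b - a) on [a, b] and zero elsewhere; a minimal plain-Python check (illustrative only):

```python
def uniform_pdf(x, low, high):
    # f(x; a, b) = 1 / (b - a) for a <= x <= b, and 0 elsewhere
    return 1.0 / (high - low) if low <= x <= high else 0.0

low, high = 1.0, 4.0
assert uniform_pdf(2.0, low, high) == 1.0 / 3.0
assert uniform_pdf(5.0, low, high) == 0.0
# Total mass is width * height = (b - a) * 1/(b - a) = 1.
assert abs((high - low) * uniform_pdf(2.0, low, high) - 1.0) < 1e-12
```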
| Args: | |||
| low (int, float, list, numpy.ndarray, Tensor): The lower bound of the distribution. Default: None. | |||
| @@ -34,6 +41,18 @@ class Uniform(Distribution): | |||
| dtype (mindspore.dtype): The type of the event samples. Default: mstype.float32. | |||
| name (str): The name of the distribution. Default: 'Uniform'. | |||
| Inputs and Outputs of APIs: | |||
| The accessible APIs are defined in the base class, including: | |||
| - `prob`, `log_prob`, `cdf`, `log_cdf`, `survival_function`, and `log_survival` | |||
| - `mean`, `sd`, `var`, and `entropy` | |||
| - `kl_loss` and `cross_entropy` | |||
| - `sample` | |||
| Note that the input should always be a tensor. | |||
| For more details of all APIs, including the inputs and outputs, | |||
| please refer to :class:`mindspore.nn.probability.distribution.Distribution`, and the examples below. | |||
| Supported Platforms: | |||
| ``Ascend`` ``GPU`` | |||
| @@ -42,6 +61,11 @@ class Uniform(Distribution): | |||
| `dist_spec_args` are `high` and `low`. | |||
| `dtype` must be float type because Uniform distributions are continuous. | |||
| Raises: | |||
| ValueError: When high <= low. | |||
| TypeError: When the input `dtype` is not a subclass of float. | |||
| Examples: | |||
| >>> import mindspore | |||
| >>> import mindspore.nn as nn | |||
| @@ -181,13 +205,19 @@ class Uniform(Distribution): | |||
| def low(self): | |||
| """ | |||
| Return the lower bound of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the lower bound of the distribution. | |||
| """ | |||
| return self._low | |||
| @property | |||
| def high(self): | |||
| """ | |||
| Return the upper bound of the distribution after casting to dtype.. | |||
| Return the upper bound of the distribution after casting to dtype. | |||
| Output: | |||
| Tensor, the upper bound of the distribution. | |||
| """ | |||
| return self._high | |||