@@ -1,7 +1,9 @@

============================================================
- [What Is MindSpore?](#what-is-mindspore)
[View the Chinese version (查看中文)](./README_CN.md)
- [What Is MindSpore](#what-is-mindspore)
    - [Automatic Differentiation](#automatic-differentiation)
    - [Automatic Parallel](#automatic-parallel)
- [Installation](#installation)
@@ -0,0 +1,220 @@

============================================================

[View English](./README.md)

- [What Is MindSpore](#what-is-mindspore)
    - [Automatic Differentiation](#automatic-differentiation)
    - [Automatic Parallel](#automatic-parallel)
- [Installation](#installation)
    - [Binaries](#binaries)
    - [From Source](#from-source)
    - [Docker Image](#docker-image)
- [Quickstart](#quickstart)
- [Docs](#docs)
- [Community](#community)
    - [Governance](#governance)
    - [Communication](#communication)
- [Contributing](#contributing)
- [Release Notes](#release-notes)
- [License](#license)

## What Is MindSpore

MindSpore is a new open-source deep learning training/inference framework for device, edge, and cloud scenarios. MindSpore provides friendly design and efficient execution, aiming to improve the development experience of data scientists and algorithm engineers, and offers native support for the Ascend AI processor along with software-hardware co-optimization. At the same time, MindSpore, as a global AI open-source community, aims to further develop and enrich the AI software/hardware application ecosystem.

<img src="docs/MindSpore-architecture.png" alt="MindSpore Architecture" width="600"/>

For more details, please check out our [Architecture Guide](https://www.mindspore.cn/docs/zh-CN/master/architecture.html).

### Automatic Differentiation

There are three automatic differentiation techniques in the mainstream deep learning frameworks:

- **Conversion based on a static computational graph**: converts the network into a static dataflow graph at compile time, then applies the chain rule to the dataflow graph to implement automatic differentiation.
- **Conversion based on a dynamic computational graph**: records the operation trajectory of the network during forward execution through operator overloading, then applies the chain rule to the dynamically generated dataflow graph to implement automatic differentiation.
- **Conversion based on source code**: this technique evolved from functional programming frameworks and performs automatic differentiation transformation on the intermediate representation (the form a program takes during compilation) in the manner of just-in-time (JIT) compilation, supporting complex control-flow scenarios, higher-order functions, and closures.

TensorFlow adopted static computational graphs in its early days, whereas PyTorch uses dynamic computational graphs. Static graphs can leverage static compilation technology to optimize network performance, but building or debugging a network is very complicated. Dynamic graphs are very convenient to use, but it is difficult to push performance optimization to the limit.

MindSpore takes a different path: automatic differentiation based on source-code transformation. On the one hand, it supports automatic differentiation of automatic control flow, so building models is as convenient as with PyTorch. On the other hand, MindSpore can perform static compilation optimization on neural networks to achieve better performance.

<img src="docs/Automatic-differentiation.png" alt="Automatic Differentiation" width="600"/>

The implementation of MindSpore automatic differentiation can be understood as the symbolic differentiation of the program itself. Because MindSpore IR is a functional intermediate representation, it has an intuitive correspondence with composite functions in basic algebra: the formula of a composite function is composed of arbitrary differentiable basic functions, and each primitive operation in MindSpore IR corresponds to a basic function in basic algebra, which allows more complex control flow to be built.
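To make the source-transformation approach concrete, the sketch below differentiates a tiny network. It is a minimal illustration only: it assumes the 0.6-era composite API (`mindspore.ops.composite.GradOperation`, which at that time took a name string as its first argument) and a CPU context, so treat it as a sketch rather than canonical usage.

```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import composite as C

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

class Square(nn.Cell):
    def construct(self, x):
        return x * x

class GradNet(nn.Cell):
    def __init__(self, net):
        super(GradNet, self).__init__()
        self.net = net
        self.grad = C.GradOperation('grad')  # assumed 0.6-era signature

    def construct(self, x):
        # The gradient function is derived from the source of `construct`
        # by the IR-level transformation described above.
        return self.grad(self.net)(x)

x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
print(GradNet(Square())(x))  # dy/dx = 2x -> [2. 4. 6.]
```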
### Automatic Parallel

The goal of MindSpore automatic parallelism is to build a training method that combines data parallelism, model parallelism, and hybrid parallelism. It can automatically select the model-splitting strategy with the lowest cost to achieve automatic distributed parallel training.

<img src="docs/Automatic-parallel.png" alt="Automatic Parallel" width="600"/>

At present, MindSpore uses a fine-grained parallel strategy of operator splitting: each operator in the graph is split into clusters to complete the parallel operation. The splitting strategy in this process can be very complicated, but as a Python developer you do not need to care about the underlying implementation, as long as the top-level API computation is valid.
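As a rough illustration of how little the user-facing script changes, the following sketch enables the strategy-search mode through the context API. `set_auto_parallel_context` is the real entry point; the device count and the distributed launch environment are assumptions here.

```python
from mindspore import context
from mindspore.communication.management import init

# Assumes a multi-device job launched with the matching environment variables.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")
init()  # initialize the communication backend
context.set_auto_parallel_context(parallel_mode="auto_parallel", device_num=8)
# The single-device network definition can be reused unchanged from here on;
# the framework searches for a low-cost operator-splitting strategy.
```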
## Installation

### Binaries

MindSpore offers build options across multiple backends:

| Hardware Platform | Operating System | Status |
| :------------ | :-------------- | :--- |
| Ascend 910 | Ubuntu-x86 | ✔️ |
| | EulerOS-x86 | ✔️ |
| | EulerOS-aarch64 | ✔️ |
| GPU CUDA 10.1 | Ubuntu-x86 | ✔️ |
| CPU | Ubuntu-x86 | ✔️ |
| | Windows-x86 | ✔️ |

To install with the `pip` command, take the `CPU` and `Ubuntu-x86` build as an example:

1. Download the whl package from the [MindSpore download page](https://www.mindspore.cn/versions) and install it.

```
pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/0.6.0-beta/MindSpore/cpu/ubuntu_x86/mindspore-0.6.0-cp37-cp37m-linux_x86_64.whl
```

2. Run the following command to verify the installation.

```python
import numpy as np
import mindspore.context as context
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.ops import operations as P

context.set_context(mode=context.GRAPH_MODE, device_target="CPU")

class Mul(nn.Cell):
    def __init__(self):
        super(Mul, self).__init__()
        self.mul = P.Mul()

    def construct(self, x, y):
        return self.mul(x, y)

x = Tensor(np.array([1.0, 2.0, 3.0]).astype(np.float32))
y = Tensor(np.array([4.0, 5.0, 6.0]).astype(np.float32))

mul = Mul()
print(mul(x, y))
```
```
[ 4. 10. 18.]
```

### From Source

[Install MindSpore](https://www.mindspore.cn/install).

### Docker Image

MindSpore Docker images are hosted on [Docker Hub](https://hub.docker.com/r/mindspore).
The current support for containerized build options is as follows:

| Hardware Platform | Docker Image Repository | Tag | Description |
| :----- | :------------------------ | :----------------------- | :--------------------------------------- |
| CPU | `mindspore/mindspore-cpu` | `x.y.z` | A production environment with the MindSpore `x.y.z` CPU release pre-installed. |
| | | `devel` | A development environment for building MindSpore from source (`CPU` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | A runtime environment in which to install a MindSpore binary package (`CPU` backend). |
| GPU | `mindspore/mindspore-gpu` | `x.y.z` | A production environment with the MindSpore `x.y.z` GPU release pre-installed. |
| | | `devel` | A development environment for building MindSpore from source (`GPU CUDA10.1` backend). For installation details, see https://www.mindspore.cn/install. |
| | | `runtime` | A runtime environment in which to install a MindSpore binary package (`GPU CUDA10.1` backend). |
| Ascend | <center>—</center> | <center>—</center> | Coming soon. |

> **NOTICE:** Building the GPU `devel` Docker image from source and then installing the whl package directly is not recommended. We strongly recommend transferring and installing the whl package inside the GPU `runtime` Docker image.

* CPU

For the `CPU` backend, you can pull and run the latest stable image directly with the following commands:

```
docker pull mindspore/mindspore-cpu:0.6.0-beta
docker run -it mindspore/mindspore-cpu:0.6.0-beta /bin/bash
```

* GPU

For the `GPU` backend, make sure `nvidia-container-toolkit` has been installed in advance. The following is an installation guide for `Ubuntu` users:

```
DISTRIBUTION=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$DISTRIBUTION/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2
sudo systemctl restart docker
```

Pull and run the latest stable image with the following commands:

```
docker pull mindspore/mindspore-gpu:0.6.0-beta
docker run -it --runtime=nvidia --privileged=true mindspore/mindspore-gpu:0.6.0-beta /bin/bash
```

To test whether the Docker image works, run the following Python code and check the output:

```python
import numpy as np
import mindspore.context as context
from mindspore import Tensor
from mindspore.ops import functional as F

context.set_context(device_target="GPU")

x = Tensor(np.ones([1, 3, 3, 4]).astype(np.float32))
y = Tensor(np.ones([1, 3, 3, 4]).astype(np.float32))
print(F.tensor_add(x, y))
```
```
[[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]],
[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]],
[[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.],
[ 2. 2. 2. 2.]]]
```

If you want to learn more about the build process of MindSpore Docker images, check the [docker](docker/README.md) repo for details.

## Quickstart

See the [Quick Start](https://www.mindspore.cn/tutorial/zh-CN/master/quick_start/quick_start.html) to implement image classification.

## Docs

For more details about installation guides, tutorials, and APIs, see the [User Documentation](https://gitee.com/mindspore/docs).

## Community

### Governance

See how MindSpore implements [Open Governance](https://gitee.com/mindspore/community/blob/master/governance.md).

### Communication

- [MindSpore Slack](https://join.slack.com/t/mindspore/shared_invite/zt-dgk65rli-3ex4xvS4wHX7UDmsQmfu8w) - a communication platform for developers.
- `#mindspore` IRC channel (meeting records only)
- Video conferencing: TBD
- Mailing list: <https://mailweb.mindspore.cn/postorius/lists>

## Contributing

Contributions are welcome. For more details, see our [Contributor Wiki](CONTRIBUTING.md).

## Release Notes

See [RELEASE](RELEASE.md) for the release notes.

## License

[Apache License 2.0](LICENSE)
@@ -150,7 +150,7 @@ TensorPtr TensorPy::MakeTensor(const py::array &input, const TypePtr &type_ptr)
  // Get tensor shape.
  std::vector<int> shape(buf.shape.begin(), buf.shape.end());
  if (data_type == buf_type) {
    // Use memory copy if input data type is same as the required type.
    // Use memory copy if input data type is the same as the required type.
    return std::make_shared<Tensor>(data_type, shape, buf.ptr, buf.size * buf.itemsize);
  }
  // Create tensor with data type converted.
@@ -546,9 +546,11 @@ def set_context(**kwargs):
Note:
Attribute name is required for setting attributes.
The mode is not recommended to be changed after the net is initialized because the implementations of some
operations are different in graph mode and pynative mode. Default: PYNATIVE_MODE.
Args:
mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1). Default: PYNATIVE_MODE.
mode (int): Running in GRAPH_MODE(0) or PYNATIVE_MODE(1).
device_target (str): The target device to run, support "Ascend", "GPU", "CPU". Default: "Ascend".
device_id (int): ID of the target device, the value must be in [0, device_num_per_host-1],
while device_num_per_host should be no more than 4096. Default: 0.
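For reference, a typical call looks like the sketch below; the exact attribute set varies by version, so this only exercises the parameters documented above.

```python
import mindspore.context as context

# Graph mode on the first Ascend device. Set the mode once, up front, since
# some operations are implemented differently in graph and pynative mode.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=0)
```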
@@ -148,7 +148,7 @@ class Cell:
def update_cell_type(self, cell_type):
"""
Update the current cell type mainly identify if quantization aware training network.
The current cell type is updated when a quantization aware training network is encountered.
After being invoked, it can set the cell type to 'cell_type'.
"""
@@ -934,7 +934,7 @@ class GraphKernel(Cell):
Base class for GraphKernel.
A `GraphKernel` is a composite of basic primitives and can be compiled into a fused kernel automatically when
context.set_context(enable_graph_kernel=True).
enable_graph_kernel in context is set to True.
Examples:
>>> class Relu(GraphKernel):
@@ -661,7 +661,7 @@ class LogSoftmax(GraphKernel):
Log Softmax activation function.
Applies the Log Softmax function to the input tensor on the specified axis.
Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i`
Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i`,
the Log Softmax function is shown as follows:
.. math::
@@ -987,10 +987,10 @@ class LayerNorm(Cell):
Applies Layer Normalization over a mini-batch of inputs.
Layer normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described
normalization on a mini-batch of inputs for each single training case as described
in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
normalization, layer normalization performs exactly the same computation at training and
testing times. It can be described using the following formula. It is applied across all channels
testing time. It can be described using the following formula. It is applied across all channels
and pixels but only one batch size.
.. math::
@@ -1139,9 +1139,9 @@ class LambNextMV(GraphKernel):
Outputs:
Tuple of 2 Tensors.
- **add3** (Tensor) - The shape is same as the shape after broadcasting, and the data type is
- **add3** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
the one with high precision or high digits among the inputs.
- **realdiv4** (Tensor) - The shape is same as the shape after broadcasting, and the data type is
- **realdiv4** (Tensor) - The shape is the same as the shape after broadcasting, and the data type is
the one with high precision or high digits among the inputs.
Examples:
@@ -55,7 +55,7 @@ class Softmax(Cell):
.. math::
\text{softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_{j=0}^{n-1}\exp(x_j)},
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Args:
axis (Union[int, tuple[int]]): The axis to apply Softmax operation, -1 means the last dimension. Default: -1.
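A minimal usage sketch of this cell, mirroring the style of the other docstring examples (the input values are arbitrary):

```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

softmax = nn.Softmax()  # axis=-1 by default
input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16)
output = softmax(input_x)  # entries along the last axis sum to 1
```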
@@ -87,11 +87,11 @@ class LogSoftmax(Cell):
Applies the LogSoftmax function to an n-dimensional input tensor.
The input is transformed with Softmax function and then with log function to lie in range[-inf,0).
The input is transformed by the Softmax function and then by the log function to lie in range [-inf, 0).
Logsoftmax is defined as:
:math:`\text{logsoftmax}(x_i) = \log \left(\frac{\exp(x_i)}{\sum_{j=0}^{n-1} \exp(x_j)}\right)`,
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Args:
axis (int): The axis to apply LogSoftmax operation, -1 means the last dimension. Default: -1.
@@ -123,7 +123,7 @@ class ELU(Cell):
Exponential Linear Unit activation function.
Applies the exponential linear unit function element-wise.
The activation function defined as:
The activation function is defined as:
.. math::
E_{i} =
@@ -162,7 +162,7 @@ class ReLU(Cell):
Applies the rectified linear unit function element-wise. It returns
element-wise :math:`\max(0, x)`; specifically, the neurons with negative outputs
will suppressed and the active neurons will stay the same.
will be suppressed and the active neurons will stay the same.
Inputs:
- **input_data** (Tensor) - The input of ReLU.
@@ -197,7 +197,7 @@ class ReLU6(Cell):
- **input_data** (Tensor) - The input of ReLU6.
Outputs:
Tensor, which has the same type with `input_data`.
Tensor, which has the same type as `input_data`.
Examples:
>>> input_x = Tensor(np.array([-1, -2, 0, 2, 1]), mindspore.float16)
@@ -234,7 +234,7 @@ class LeakyReLU(Cell):
- **input_x** (Tensor) - The input of LeakyReLU.
Outputs:
Tensor, has the same type and shape with the `input_x`.
Tensor, has the same type and shape as `input_x`.
Examples:
>>> input_x = Tensor(np.array([[-1.0, 4.0, -8.0], [2.0, -5.0, 9.0]]), mindspore.float32)
@@ -365,7 +365,7 @@ class PReLU(Cell):
PReLU is defined as: :math:`prelu(x_i)= \max(0, x_i) + w * \min(0, x_i)`, where :math:`x_i`
is an element of a channel of the input.
Here :math:`w` is an learnable parameter with default initial value 0.25.
Here :math:`w` is a learnable parameter with a default initial value of 0.25.
Parameter :math:`w` has the dimensionality of the argument channel. If called without argument
channel, a single parameter :math:`w` will be shared across all channels.
@@ -413,7 +413,7 @@ class PReLU(Cell):
class HSwish(Cell):
r"""
rHard swish activation function.
Hard swish activation function.
Applies hswish-type activation element-wise. The input is a Tensor with any valid shape.
@@ -422,7 +422,7 @@ class HSwish(Cell):
.. math::
\text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6},
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs:
- **input_data** (Tensor) - The input of HSwish.
@@ -456,7 +456,7 @@ class HSigmoid(Cell):
.. math::
\text{hsigmoid}(x_{i}) = \max(0, \min(1, \frac{x_{i} + 3}{6})),
where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor.
where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor.
Inputs:
- **input_data** (Tensor) - The input of HSigmoid.
@@ -65,7 +65,7 @@ class Dropout(Cell):
dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
Raises:
ValueError: If keep_prob is not in range (0, 1).
ValueError: If `keep_prob` is not in range (0, 1).
Inputs:
- **input** (Tensor) - An N-D Tensor.
@@ -373,8 +373,8 @@ class OneHot(Cell):
axis is created at dimension `axis`.
Args:
axis (int): Features x depth if axis == -1, depth x features
if axis == 0. Default: -1.
axis (int): Features x depth if axis is -1, depth x features
if axis is 0. Default: -1.
depth (int): A scalar defining the depth of the one hot dimension. Default: 1.
on_value (float): A scalar defining the value to fill in output[i][j]
when indices[j] = i. Default: 1.0.
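To illustrate the axis semantics described above, here is a small sketch (the index values are arbitrary):

```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

net = nn.OneHot(depth=4, axis=-1)  # features x depth when axis is -1
indices = Tensor(np.array([[1, 3], [0, 2]]), mindspore.int32)
output = net(indices)  # shape (2, 2, 4), filled with on_value/off_value
```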
@@ -492,18 +492,18 @@ class Unfold(Cell):
The input tensor must be a 4-D tensor and the data format is NCHW.
Args:
ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or list of int,
ksizes (Union[tuple[int], list[int]]): The size of the sliding window, should be a tuple or a list of integers,
and the format is [1, ksize_row, ksize_col, 1].
strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches,
should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1].
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dim
pixel positions, should be a tuple or list of int, and the format is [1, rate_row, rate_col, 1].
rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension
pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1].
padding (str): The type of padding algorithm, a string whose value is "same" or "valid",
not case sensitive. Default: "valid".
- same: Means that the patch can take the part beyond the original image, and this part is filled with 0.
- valid: Means that the patch area taken must be completely contained in the original image.
- valid: Means that the taken patch area must be completely contained in the original image.
Inputs:
- **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_depth, in_row, in_col] and
@@ -511,7 +511,7 @@ class Unfold(Cell):
Outputs:
Tensor, a 4-D tensor whose data type is the same as `input_x`,
and the shape is [out_batch, out_depth, out_row, out_col], the out_batch is same as the in_batch.
and the shape is [out_batch, out_depth, out_row, out_col], where out_batch is the same as in_batch.
Examples:
>>> net = Unfold(ksizes=[1, 2, 2, 1], strides=[1, 1, 1, 1], rates=[1, 1, 1, 1])
@@ -556,11 +556,11 @@ class MatrixDiag(Cell):
Returns a batched diagonal tensor with given batched diagonal values.
Inputs:
- **x** (Tensor) - The diagonal values. It can be of the following data types:
float32, float16, int32, int8, uint8.
- **x** (Tensor) - The diagonal values. It can be one of the following data types:
float32, float16, int32, int8, and uint8.
Outputs:
Tensor, same type as input `x`. The shape should be x.shape + (x.shape[-1], ).
Tensor, has the same type as input `x`. The shape should be x.shape + (x.shape[-1], ).
Examples:
>>> x = Tensor(np.array([1, -1]), mstype.float32)
@@ -587,11 +587,11 @@ class MatrixDiagPart(Cell):
Returns the batched diagonal part of a batched tensor.
Inputs:
- **x** (Tensor) - The batched tensor. It can be of the following data types:
float32, float16, int32, int8, uint8.
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, and uint8.
Outputs:
Tensor, same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])].
Tensor, has the same type as input `x`. The shape should be x.shape[:-2] + [min(x.shape[-2:])].
Examples:
>>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)
@@ -617,12 +617,12 @@ class MatrixSetDiag(Cell):
Modify the batched diagonal part of a batched tensor.
Inputs:
- **x** (Tensor) - The batched tensor. It can be of the following data types:
float32, float16, int32, int8, uint8.
- **x** (Tensor) - The batched tensor. It can be one of the following data types:
float32, float16, int32, int8, and uint8.
- **diagonal** (Tensor) - The diagonal values.
Outputs:
Tensor, same type as input `x`. The shape same as `x`.
Tensor, has the same type and shape as input `x`.
Examples:
>>> x = Tensor([[[-1, 0], [0, 1]], [[-1, 0], [0, 1]], [[-1, 0], [0, 1]]], mindspore.float32)
@@ -72,7 +72,7 @@ class SequentialCell(Cell):
args (list, OrderedDict): List of subclasses of Cell.
Raises:
TypeError: If arg is not of type list or OrderedDict.
TypeError: If the type of the argument is not list or OrderedDict.
Inputs:
- **input** (Tensor) - Tensor with shape according to the first Cell in the sequence.
@@ -131,7 +131,7 @@ class Conv2d(_Conv):
Args:
in_channels (int): The number of input channels :math:`C_{in}`.
out_channels (int): The number of output channels :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height
kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
@@ -147,7 +147,7 @@ class Conv2d(_Conv):
last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0.
- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned
- valid: Adopts the way of discarding. The largest possible height and width of the output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0.
@@ -158,7 +158,7 @@ class Conv2d(_Conv):
the padding of top, bottom, left and right is the same, equal to padding. If `padding` is a tuple
with four integers, the padding of top, bottom, left and right will be equal to padding[0],
padding[1], padding[2], and padding[3] accordingly. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate
dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value should
be greater than or equal to 1 and bounded by the height and width of the
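A short usage sketch for the arguments above (shapes chosen arbitrarily):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

net = nn.Conv2d(120, 240, 4, has_bias=False, weight_init='normal')
input = Tensor(np.ones([1, 120, 1024, 640]).astype(np.float32))
output = net(input)  # NCHW output (1, 240, 1024, 640) with the default 'same' padding
```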
@@ -451,7 +451,7 @@ class Conv2dTranspose(_Conv):
Args:
in_channels (int): The number of channels in the input space.
out_channels (int): The number of channels in the output space.
kernel_size (Union[int, tuple]): int or tuple with 2 integers, which specifies the height
kernel_size (Union[int, tuple]): int or a tuple of 2 integers, which specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
@@ -825,7 +825,7 @@ class DepthwiseConv2d(Cell):
Args:
in_channels (int): The number of input channels :math:`C_{in}`.
out_channels (int): The number of output channels :math:`C_{out}`.
kernel_size (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the height
kernel_size (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both the height and the width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
@@ -841,7 +841,7 @@ class DepthwiseConv2d(Cell):
last extra padding will be done from the bottom and the right side. If this mode is set, `padding`
must be 0.
- valid: Adopts the way of discarding. The possibly largest height and width of output will be returned
- valid: Adopts the way of discarding. The largest possible height and width of the output will be returned
without padding. Extra pixels will be discarded. If this mode is set, `padding`
must be 0.
@@ -849,16 +849,16 @@ class DepthwiseConv2d(Cell):
Tensor borders. `padding` should be greater than or equal to 0.
padding (int): Implicit paddings on both sides of the input. Default: 0.
dilation (Union[int, tuple[int]]): The data type is int or tuple with 2 integers. Specifies the dilation rate
dilation (Union[int, tuple[int]]): The data type is int or a tuple of 2 integers. Specifies the dilation rate
to use for dilated convolution. If set to be :math:`k > 1`, there will
be :math:`k - 1` pixels skipped for each sampling location. Its value should
be greater or equal to 1 and bounded by the height and width of the
be greater than or equal to 1 and bounded by the height and width of the
input. Default: 1.
group (int): Split filter into groups, `in_channels` and `out_channels` should be
divisible by the number of groups. Default: 1.
has_bias (bool): Specifies whether the layer uses a bias vector. Default: False.
weight_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the convolution kernel.
It can be a Tensor, a string, an Initializer or a numbers.Number. When a string is specified,
It can be a Tensor, a string, an Initializer or a number. When a string is specified,
values from 'TruncatedNormal', 'Normal', 'Uniform', 'HeUniform' and 'XavierUniform' distributions as well
as constant 'One' and 'Zero' distributions are possible. Aliases 'xavier_uniform', 'he_uniform', 'ones'
and 'zeros' are acceptable. Uppercase and lowercase are both acceptable. Refer to the values of
@@ -36,7 +36,7 @@ class Embedding(Cell):
the corresponding word embeddings.
Note:
When 'use_one_hot' is set to True, the input should be of type mindspore.int32.
When 'use_one_hot' is set to True, the type of the input should be mindspore.int32.
Args:
vocab_size (int): Size of the dictionary of embeddings.
@@ -48,9 +48,9 @@ class Embedding(Cell):
dtype (:class:`mindspore.dtype`): Data type of input. Default: mindspore.float32.
Inputs:
- **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The element of
the Tensor should be integer and not larger than vocab_size. else the corresponding embedding vector is zero
if larger than vocab_size.
- **input** (Tensor) - Tensor of shape :math:`(\text{batch_size}, \text{input_length})`. The elements of
the Tensor should be integers and not larger than vocab_size. Otherwise, the corresponding embedding vector will
be zero.
Outputs:
Tensor of shape :math:`(\text{batch_size}, \text{input_length}, \text{embedding_size})`.
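A brief sketch of the shapes described above (vocabulary and embedding sizes are arbitrary):

```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

net = nn.Embedding(20000, 768, True)  # use_one_hot=True, so the input is int32
input_data = Tensor(np.ones([8, 128]), mindspore.int32)
output = net(input_data)  # shape (8, 128, 768)
```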
@@ -253,7 +253,7 @@ class MSSSIM(Cell):
Args:
max_val (Union[int, float]): The dynamic range of the pixel values (255 for 8-bit grayscale images).
Default: 1.0.
power_factors (Union[tuple, list]): Iterable of weights for each of the scales.
power_factors (Union[tuple, list]): Iterable of weights for each scale.
Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333). Default values obtained by Wang et al.
filter_size (int): The size of the Gaussian filter. Default: 11.
filter_sigma (float): The standard deviation of Gaussian kernel. Default: 1.5.
@@ -35,7 +35,7 @@ class LSTM(Cell):
Applies an LSTM to the input.
There are two pipelines connecting two consecutive cells in an LSTM model; one is the cell state pipeline
and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
and the other is the hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
Given an input :math:`x_t` at time :math:`t`, a hidden state :math:`h_{t-1}` and a cell
state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
time :math:`t` are computed using a gating mechanism. Input gate :math:`i_t` is designed to protect the cell
@@ -68,18 +68,17 @@ class LSTM(Cell):
input_size (int): Number of features of input.
hidden_size (int): Number of features of hidden layer.
num_layers (int): Number of layers of stacked LSTM. Default: 1.
has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True.
has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
LSTM layer except the last layer. Default: 0. The range of dropout is [0.0, 1.0].
bidirectional (bool): Specifies whether this is a bidirectional LSTM. If set True,
number of directions will be 2 otherwise number of directions is 1. Default: False.
bidirectional (bool): Specifies whether it is a bidirectional LSTM. Default: False.
Inputs:
- **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`).
- **hx** (tuple) - A tuple of two Tensors (h_0, c_0) both of data type mindspore.float32 or
mindspore.float16 and shape (num_directions * `num_layers`, batch_size, `hidden_size`).
Data type of `hx` should be the same of `input`.
Data type of `hx` should be the same as `input`.
Outputs:
Tuple, a tuple contains (`output`, (`h_n`, `c_n`)).
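A minimal sketch of the input/output contract described above (sizes are arbitrary; `batch_first=True`, so the input is (batch, seq_len, input_size)):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

net = nn.LSTM(10, 16, 2, has_bias=True, batch_first=True, bidirectional=False)
input = Tensor(np.ones([3, 5, 10]).astype(np.float32))
# (num_directions * num_layers, batch_size, hidden_size)
h0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
c0 = Tensor(np.ones([1 * 2, 3, 16]).astype(np.float32))
output, (hn, cn) = net(input, (h0, c0))
```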
@@ -205,7 +204,7 @@ class LSTMCell(Cell):
Applies an LSTM layer to the input.
There are two pipelines connecting two consecutive cells in an LSTM model; one is the cell state pipeline
and another is hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
and the other is the hidden state pipeline. Denote two consecutive time nodes as :math:`t-1` and :math:`t`.
Given an input :math:`x_t` at time :math:`t`, a hidden state :math:`h_{t-1}` and a cell
state :math:`c_{t-1}` of the layer at time :math:`{t-1}`, the cell state and hidden state at
time :math:`t` are computed using a gating mechanism. Input gate :math:`i_t` is designed to protect the cell
@@ -238,7 +237,7 @@ class LSTMCell(Cell):
input_size (int): Number of features of input.
hidden_size (int): Number of features of hidden layer.
layer_index (int): Index of the current layer in the stacked LSTM. Default: 0.
has_bias (bool): Specifies whether has bias `b_ih` and `b_hh`. Default: True.
has_bias (bool): Whether the cell has bias `b_ih` and `b_hh`. Default: True.
batch_first (bool): Specifies whether the first dimension of input is batch_size. Default: False.
dropout (float, int): If not 0, append `Dropout` layer on the outputs of each
LSTM layer except the last layer. Default: 0. The range of dropout is [0.0, 1.0].
@@ -243,6 +243,10 @@ class BatchNorm1d(_BatchNorm):
.. math::
y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
Note:
The implementation of BatchNorm is different in graph mode and pynative mode; therefore, the mode is not
recommended to be changed after the net is initialized.
Args:
num_features (int): `C` from an expected input of size (N, C).
eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
@@ -319,6 +323,10 @@ class BatchNorm2d(_BatchNorm):
.. math::
y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
Note:
The implementation of BatchNorm is different in graph mode and pynative mode; therefore, the mode cannot be
changed after the net is initialized.
Args:
num_features (int): `C` from an expected input of size (N, C, H, W).
eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
@@ -384,8 +392,8 @@ class GlobalBatchNorm(_BatchNorm):
r"""
Global normalization layer over an N-dimensional input.
Global Normalization is cross device synchronized batch normalization. Batch Normalization implementation
only normalize the data within each device. Global normalization will normalize the input within the group.
Global Normalization is cross-device synchronized batch normalization. The implementation of Batch Normalization
only normalizes the data within each device. Global normalization will normalize the input within the group.
It has been described in the paper `Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167>`_. It rescales and recenters the
feature using a mini-batch of data and the learned parameters which can be described in the following formula.
@@ -467,10 +475,10 @@ class LayerNorm(Cell):
Applies Layer Normalization over a mini-batch of inputs.
Layer normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described
normalization on a mini-batch of inputs for each single training case as described
in the paper `Layer Normalization <https://arxiv.org/pdf/1607.06450.pdf>`_. Unlike batch
normalization, layer normalization performs exactly the same computation at training and
testing times. It can be described using the following formula. It is applied across all channels
testing time. It can be described using the following formula. It is applied across all channels
and pixels but only one batch size.
.. math::
@@ -545,7 +553,7 @@ class GroupNorm(Cell):
Group Normalization over a mini-batch of inputs.
Group normalization is widely used in recurrent neural networks. It applies
normalization over a mini-batch of inputs for each single training case as described
normalization on a mini-batch of inputs for each single training case as described
in the paper `Group Normalization <https://arxiv.org/pdf/1803.08494.pdf>`_. Group normalization
divides the channels into groups and computes within each group the mean and variance for normalization,
and it performs very stably over a wide range of batch sizes. It can be described using the following formula.
@@ -557,7 +565,7 @@ class GroupNorm(Cell):
num_groups (int): The number of groups to be divided along the channel dimension.
num_channels (int): The number of channels per group.
eps (float): A value added to the denominator for numerical stability. Default: 1e-5.
affine (bool): A bool value, this layer will has learnable affine parameters when set to true. Default: True.
affine (bool): A bool value, this layer will have learnable affine parameters when set to true. Default: True.
gamma_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the gamma weight.
The values of str refer to the function `initializer` including 'zeros', 'ones', 'xavier_uniform',
'he_uniform', etc. Default: 'ones'.
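A minimal usage sketch of the group semantics above (2 groups over 2 channels; shapes are arbitrary):

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

group_norm_op = nn.GroupNorm(2, 2)
x = Tensor(np.ones([1, 2, 4, 4], np.float32))
output = group_norm_op(x)  # same shape as x, normalized within each group
```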
@@ -61,7 +61,7 @@ class Conv2dBnAct(Cell):
Args:
in_channels (int): The number of input channels :math:`C_{in}`.
out_channels (int): The number of output channels :math:`C_{out}`.
kernel_size (Union[int, tuple]): The data type is int or tuple with 2 integers. Specifies the height
kernel_size (Union[int, tuple]): The data type is int or a tuple of 2 integers. Specifies the height
and width of the 2D convolution window. Single int means the value is for both height and width of
the kernel. A tuple of 2 ints means the first value is for the height and the other is for the
width of the kernel.
@@ -292,19 +292,19 @@ class BatchNormFoldCell(Cell):
class FakeQuantWithMinMax(Cell):
r"""
Quantization aware op. This OP provide Fake quantization observer function on data with min and max.
Quantization aware op. This OP provides the fake quantization observer function on data with min and max.
Args:
min_init (int, float): The dimension of channel or 1 (layer). Default: -6.
max_init (int, float): The dimension of channel or 1 (layer). Default: 6.
ema (bool): Exponential Moving Average algorithm update min and max. Default: False.
ema (bool): Whether the Exponential Moving Average algorithm updates min and max. Default: False.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
channel_axis (int): Quantization by channel axis. Default: 1.
num_channels (int): Declares the min and max channel size. Default: 1.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs:
@@ -431,7 +431,7 @@ class Conv2dBnFoldQuant(Cell):
variance vector. Default: 'ones'.
fake (bool): Whether Conv2dBnFoldQuant Cell adds FakeQuantWithMinMax op. Default: True.
per_channel (bool): FakeQuantWithMinMax parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): The quantization delay parameters according to the global step. Default: 0.
@@ -614,7 +614,7 @@ class Conv2dBnWithoutFoldQuant(Cell):
Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
per_channel (bool): FakeQuantWithMinMax parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -736,7 +736,7 @@ class Conv2dQuant(Cell):
Default: 'normal'.
bias_init (Union[Tensor, str, Initializer, numbers.Number]): Initializer for the bias vector. Default: 'zeros'.
per_channel (bool): FakeQuantWithMinMax parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -845,7 +845,7 @@ class DenseQuant(Cell):
has_bias (bool): Specifies whether the layer uses a bias vector. Default: True.
activation (str): The regularization function applied to the output of the layer, e.g. 'relu'. Default: None.
per_channel (bool): FakeQuantWithMinMax parameters. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -947,15 +947,14 @@ class ActQuant(_QuantActivation):
r"""
Quantization aware training activation function.
Add Fake Quant OP after activation. Not Recommand to used these cell for Fake Quant Op
Will climp the max range of the activation and the relu6 do the same operation.
This part is a more detailed overview of ReLU6 op.
Add the fake quant op to the end of activation op, by which the output of activation op will be truncated.
Please check `FakeQuantWithMinMax` for more details.
Args:
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global steps. Default: 0.
@@ -1010,7 +1009,7 @@ class LeakyReLUQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -1080,9 +1079,9 @@ class HSwishQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs:
@@ -1149,9 +1148,9 @@ class HSigmoidQuant(_QuantActivation):
activation (Cell): Activation cell class.
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
Inputs:
@@ -1217,7 +1216,7 @@ class TensorAddQuant(Cell):
Args:
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -1269,7 +1268,7 @@ class MulQuant(Cell):
Args:
ema_decay (float): Exponential Moving Average algorithm parameter. Default: 0.999.
per_channel (bool): Quantization granularity based on layer or on channel. Default: False.
num_bits (int): The quantization number bit, support 4 and 8bit. Default: 8.
num_bits (int): The bit number of quantization, supporting 4 and 8 bits. Default: 8.
symmetric (bool): The quantization algorithm is symmetric or not. Default: False.
narrow_range (bool): The quantization algorithm uses narrow range or not. Default: False.
quant_delay (int): Quantization delay parameters according to the global step. Default: 0.
@@ -80,7 +80,7 @@ class L1Loss(_Loss):
When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
Args:
reduction (str): Type of reduction to apply to loss. The optional values are "mean", "sum", "none".
reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
Default: "mean".
Inputs:
@@ -107,7 +107,7 @@ class L1Loss(_Loss):
class MSELoss(_Loss):
r"""
MSELoss create a criterion to measures the mean squared error (squared L2-norm) between :math:`x` and :math:`y`
MSELoss creates a criterion to measure the mean squared error (squared L2-norm) between :math:`x` and :math:`y`
element-wise, where :math:`x` is the input and :math:`y` is the target.
For simplicity, let :math:`x` and :math:`y` be 1-dimensional Tensors with length :math:`N`,
@@ -120,7 +120,7 @@ class MSELoss(_Loss):
When argument reduction is 'sum', the sum of :math:`L(x, y)` will be returned. :math:`N` is the batch size.
Args:
reduction (str): Type of reduction to apply to loss. The optional values are "mean", "sum", "none".
reduction (str): Type of reduction to be applied to loss. The optional values are "mean", "sum", and "none".
Default: "mean".
Inputs:
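A minimal sketch of the reduction behavior described above (values chosen so the mean of squared errors is 1/3):

```python
import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor

loss = nn.MSELoss()  # reduction="mean" by default
input_data = Tensor(np.array([1, 2, 3]), mindspore.float32)
target_data = Tensor(np.array([1, 2, 2]), mindspore.float32)
output = loss(input_data, target_data)  # (0 + 0 + 1) / 3
```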
@@ -210,14 +210,14 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
Note:
While the target classes are mutually exclusive, i.e., only one class is positive in the target, the predicted
probabilities need not be exclusive. All that is required is that the predicted probability distribution
probabilities need not be exclusive. It is only required that the predicted probability distribution
of entry is a valid one.
Args:
is_grad (bool): Specifies whether to calculate grad only. Default: True.
sparse (bool): Specifies whether labels use sparse format or not. Default: False.
reduction (Union[str, None]): Type of reduction to apply to loss. Support 'sum' or 'mean' If None,
do not reduction. Default: None.
reduction (Union[str, None]): Type of reduction to be applied to loss. Support 'sum' and 'mean'. If None,
do not perform reduction. Default: None.
smooth_factor (float): Label smoothing factor. It is an optional input which should be in range [0, 1].
Default: 0.
num_classes (int): The number of classes in the task. It is an optional input. Default: 2.
@@ -225,7 +225,7 @@ class SoftmaxCrossEntropyWithLogits(_Loss):
Inputs:
- **logits** (Tensor) - Tensor of shape (N, C).
- **labels** (Tensor) - Tensor of shape (N, ). If `sparse` is True, the type of
`labels` is mindspore.int32. If `sparse` is False, the type of `labels` is same as the type of `logits`.
`labels` is mindspore.int32. If `sparse` is False, the type of `labels` is the same as the type of `logits`.
Outputs:
Tensor, a tensor of the same shape as logits with the component-wise
@@ -282,8 +282,8 @@ class SoftmaxCrossEntropyExpand(Cell):
where :math:`x_i` is a 1D score Tensor and :math:`t_i` is the target class.
Note:
When argument sparse is set to True, the format of label is the index
range from :math:`0` to :math:`C - 1` instead of one-hot vectors.
When argument sparse is set to True, the format of the label is the index
ranging from :math:`0` to :math:`C - 1` instead of one-hot vectors.
Args:
sparse (bool): Specifies whether labels use sparse format or not. Default: False.
@@ -69,7 +69,7 @@ def names():
def get_metric_fn(name, *args, **kwargs):
"""
Gets the metric method base on the input name.
Gets the metric method based on the input name.
Args:
name (str): The name of the metric method. Refer to the '__factory__'
@@ -82,7 +82,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def clear(self):
"""
A interface describes the behavior of clearing the internal evaluation result.
An interface describing the behavior of clearing the internal evaluation result.
Note:
All subclasses should override this interface.
@@ -92,7 +92,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def eval(self):
"""
A interface describes the behavior of computing the evaluation result.
An interface describing the behavior of computing the evaluation result.
Note:
All subclasses should override this interface.
@@ -102,7 +102,7 @@ class Metric(metaclass=ABCMeta):
@abstractmethod
def update(self, *inputs):
"""
A interface describes the behavior of updating the internal evaluation result.
An interface describing the behavior of updating the internal evaluation result.
Note:
All subclasses should override this interface.
@@ -36,8 +36,8 @@ def _update_run_op(beta1, beta2, eps, lr, weight_decay, param, m, v, gradient, d
Update parameters.
Args:
beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0).
beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0).
beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0).
eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0.
lr (Tensor): Learning rate.
weight_decay (Number): Weight decay. Should be equal to or greater than 0.
@@ -180,12 +180,12 @@ class Adam(Optimizer):
the order will be followed in the optimizer. There are no other keys in the `dict`, and the parameters
in 'order_params' should be in one of the group parameters.
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate.
When the learning_rate is a Iterable or a Tensor with dimension of 1, use the dynamic learning rate, then
learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate.
When the learning_rate is an Iterable or a one-dimensional Tensor, use the dynamic learning rate; then
the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule,
use dynamic learning rate, the i-th learning rate will be calculated during the process of training
according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with
dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be
according to the formula of LearningRateSchedule. When the learning_rate is a float or a zero-dimensional
Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be
equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float.
Default: 1e-3.
beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0).
@@ -195,11 +195,11 @@ class Adam(Optimizer):
eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default:
1e-8.
use_locking (bool): Whether to enable a lock to protect updating variable tensors.
If True, updating of the var, m, and v tensors will be protected by a lock.
If False, the result is unpredictable. Default: False.
If true, updates of the var, m, and v tensors will be protected by a lock.
If false, the result is unpredictable. Default: False.
use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients.
If True, update the gradients using NAG.
If False, update the gradients without using NAG. Default: False.
If true, update the gradients using NAG.
If false, update the gradients without using NAG. Default: False.
weight_decay (float): Weight decay (L2 penalty). It should be equal to or greater than 0. Default: 0.0.
loss_scale (float): A floating point value for the loss scale. Should be greater than 0. Default: 1.0.
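A minimal construction sketch for the arguments above (the network is a stand-in; hyperparameters are the documented defaults):

```python
import mindspore.nn as nn

net = nn.Dense(3, 4)  # stand-in for any user-defined network (Cell)
optim = nn.Adam(params=net.trainable_params(), learning_rate=1e-3,
                beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.0)
# The optimizer is then typically passed to mindspore.train.Model together
# with the network and a loss function.
```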
| @@ -304,12 +304,12 @@ class AdamWeightDecay(Optimizer): | |||
| the order will be followed in the optimizer. There are no other keys in the `dict` and the parameters | |||
| which in the 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use the dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a Tensor in a 1D dimension, use the dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| Default: 1e-3. | |||
| beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9. | |||
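The group-parameter `dict` form described above can be sketched as follows; the split between decay and no-decay parameters is an assumption made for illustration:

```python
import mindspore.nn as nn

net = nn.Dense(3, 2)  # stand-in network
all_params = net.trainable_params()
decay_params = [p for p in all_params if 'bias' not in p.name]
other_params = [p for p in all_params if 'bias' in p.name]

group_params = [{'params': decay_params, 'weight_decay': 0.01},
                {'params': other_params},
                {'order_params': all_params}]  # fixes the update order
opt = nn.AdamWeightDecay(group_params, learning_rate=1e-3)
```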
| @@ -114,12 +114,12 @@ class FTRL(Optimizer): | |||
| than or equal to zero. Use fixed learning rate if lr_power is zero. Default: -0.5. | |||
| l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0. | |||
| l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0. | |||
| use_locking (bool): If True use locks for update operation. Default: False. | |||
| use_locking (bool): If True, use locks for the update operation. Default: False. | |||
| loss_scale (float): Value for the loss scale. It should be equal to or greater than 1.0. Default: 1.0. | |||
| weight_decay (float): Weight decay value to multiply weight, must be zero or positive value. Default: 0.0. | |||
| Inputs: | |||
| - **grads** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is as same as the `params` | |||
| - **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params` | |||
| in the optimizer. | |||
| Outputs: | |||
| @@ -39,8 +39,8 @@ def _update_run_op(beta1, beta2, eps, global_step, lr, weight_decay, param, m, v | |||
| Update parameters. | |||
| Args: | |||
| beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0). | |||
| beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0). | |||
| beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0). | |||
| beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0). | |||
| eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0. | |||
| lr (Tensor): Learning rate. | |||
| weight_decay (Number): Weight decay. Should be equal to or greater than 0. | |||
| @@ -122,8 +122,8 @@ def _update_run_op_graph_kernel(beta1, beta2, eps, global_step, lr, weight_decay | |||
| Update parameters. | |||
| Args: | |||
| beta1 (Tensor): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0). | |||
| beta2 (Tensor): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0). | |||
| beta1 (Tensor): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0). | |||
| beta2 (Tensor): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0). | |||
| eps (Tensor): Term added to the denominator to improve numerical stability. Should be greater than 0. | |||
| lr (Tensor): Learning rate. | |||
| weight_decay (Number): Weight decay. Should be equal to or greater than 0. | |||
| @@ -184,7 +184,7 @@ def _check_param_value(beta1, beta2, eps, prim_name): | |||
| class Lamb(Optimizer): | |||
| """ | |||
| Lamb Dynamic LR. | |||
| Lamb Dynamic Learning Rate. | |||
| LAMB is an optimization algorithm employing a layerwise adaptive large batch | |||
| optimization technique. Refer to the paper `LARGE BATCH OPTIMIZATION FOR DEEP LEARNING: TRAINING BERT IN 76 | |||
| @@ -214,16 +214,16 @@ class Lamb(Optimizer): | |||
| the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| beta1 (float): The exponential decay rate for the 1st moment estimates. Default: 0.9. | |||
| beta1 (float): The exponential decay rate for the 1st moment estimations. Default: 0.9. | |||
| Should be in range (0.0, 1.0). | |||
| beta2 (float): The exponential decay rate for the 2nd moment estimates. Default: 0.999. | |||
| beta2 (float): The exponential decay rate for the 2nd moment estimations. Default: 0.999. | |||
| Should be in range (0.0, 1.0). | |||
| eps (float): Term added to the denominator to improve numerical stability. Default: 1e-6. | |||
| Should be greater than 0. | |||
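For the `LearningRateSchedule` case, a hedged sketch assuming the built-in `ExponentialDecayLR` schedule class is available in this version; the i-th learning rate is computed from the schedule's formula rather than read from a list:

```python
import mindspore.nn as nn

net = nn.Dense(3, 2)  # stand-in network
# lr_i = 0.1 * 0.9^(i / 100), computed on the fly at each step.
schedule = nn.ExponentialDecayLR(learning_rate=0.1, decay_rate=0.9, decay_steps=100)
opt = nn.Lamb(net.trainable_params(), learning_rate=schedule)
```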
| @@ -58,12 +58,12 @@ class LARS(Optimizer): | |||
| epsilon (float): Term added to the denominator to improve numerical stability. Default: 1e-05. | |||
| coefficient (float): Trust coefficient for calculating the local learning rate. Default: 0.001. | |||
| use_clip (bool): Whether to use clip operation for calculating the local learning rate. Default: False. | |||
| lars_filter (Function): A function to determine whether apply lars algorithm. Default: | |||
| lars_filter (Function): A function to determine whether to apply the LARS algorithm. Default: | |||
| lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name. | |||
| Inputs: | |||
| - **gradients** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is | |||
| as same as the `params` in optimizer. | |||
| - **gradients** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the | |||
| same as the `params` in the optimizer. | |||
| Outputs: | |||
| Union[Tensor[bool], tuple[Parameter]], it depends on the output of `optimizer`. | |||
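Since LARS wraps another optimizer, a minimal usage sketch (the filter simply mirrors the default quoted above):

```python
import mindspore.nn as nn

net = nn.Dense(3, 2)  # stand-in network
sgd = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)
opt = nn.LARS(sgd, epsilon=1e-05, coefficient=0.001,
              lars_filter=lambda x: 'LayerNorm' not in x.name and 'bias' not in x.name)
```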
| @@ -127,26 +127,26 @@ class LazyAdam(Optimizer): | |||
| the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| Default: 1e-3. | |||
| beta1 (float): The exponential decay rate for the 1st moment estimates. Should be in range (0.0, 1.0). Default: | |||
| 0.9. | |||
| beta2 (float): The exponential decay rate for the 2nd moment estimates. Should be in range (0.0, 1.0). Default: | |||
| 0.999. | |||
| beta1 (float): The exponential decay rate for the 1st moment estimations. Should be in range (0.0, 1.0). | |||
| Default: 0.9. | |||
| beta2 (float): The exponential decay rate for the 2nd moment estimations. Should be in range (0.0, 1.0). | |||
| Default: 0.999. | |||
| eps (float): Term added to the denominator to improve numerical stability. Should be greater than 0. Default: | |||
| 1e-8. | |||
| use_locking (bool): Whether to enable a lock to protect updating variable tensors. | |||
| If True, updating of the var, m, and v tensors will be protected by a lock. | |||
| If False, the result is unpredictable. Default: False. | |||
| If true, updates of the var, m, and v tensors will be protected by a lock. | |||
| If false, the result is unpredictable. Default: False. | |||
| use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. | |||
| If True, updates the gradients using NAG. | |||
| If False, updates the gradients without using NAG. Default: False. | |||
| If true, update the gradients using NAG. | |||
| If false, update the gradients without using NAG. Default: False. | |||
| weight_decay (float): Weight decay (L2 penalty). Default: 0.0. | |||
| loss_scale (float): A floating point value for the loss scale. Should be equal to or greater than 1. Default: | |||
| 1.0. | |||
| @@ -83,12 +83,12 @@ class Momentum(Optimizer): | |||
| the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| momentum (float): Hyperparameter of type float, means momentum for the moving average. | |||
| It should be at least 0.0. | |||
| @@ -40,8 +40,6 @@ class Optimizer(Cell): | |||
| """ | |||
| Base class for all optimizers. | |||
| This class defines the API to add Ops to train a model. | |||
| Note: | |||
| This class defines the API to add Ops to train a model. Never use | |||
| this class directly, but instead instantiate one of its subclasses. | |||
| @@ -55,12 +53,12 @@ class Optimizer(Cell): | |||
| To improve parameter groups performance, the customized order of parameters can be supported. | |||
| Args: | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning | |||
| rate. When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning | |||
| rate. When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| parameters (Union[list[Parameter], list[dict]]): When the `parameters` is a list of `Parameter` which will be | |||
| updated, the element in `parameters` should be class `Parameter`. When the `parameters` is a list of `dict`, | |||
| @@ -84,8 +82,8 @@ class Optimizer(Cell): | |||
| type of `loss_scale` input is int, it will be converted to float. Default: 1.0. | |||
| Raises: | |||
| ValueError: If the learning_rate is a Tensor, but the dims of tensor is greater than 1. | |||
| TypeError: If the learning_rate is not any of the three types: float, Tensor, Iterable. | |||
| ValueError: If the learning_rate is a Tensor, but the dimension of tensor is greater than 1. | |||
| TypeError: If the learning_rate is not one of the three types: float, Tensor, or Iterable. | |||
| """ | |||
| def __init__(self, learning_rate, parameters, weight_decay=0.0, loss_scale=1.0): | |||
| @@ -179,7 +177,7 @@ class Optimizer(Cell): | |||
| An approach to reduce the overfitting of a deep learning neural network model. | |||
| Args: | |||
| gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape with | |||
| gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as | |||
| `self.parameters`. | |||
| Returns: | |||
| @@ -204,7 +202,7 @@ class Optimizer(Cell): | |||
| network. | |||
| Args: | |||
| gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape with | |||
| gradients (tuple[Tensor]): The gradients of `self.parameters`, and have the same shape as | |||
| `self.parameters`. | |||
| Returns: | |||
| @@ -87,22 +87,22 @@ class ProximalAdagrad(Optimizer): | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| accum (float): The starting value for accumulators, must be zero or positive values. Default: 0.1. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| Default: 0.001. | |||
| l1 (float): l1 regularization strength, must be greater than or equal to zero. Default: 0.0. | |||
| l2 (float): l2 regularization strength, must be greater than or equal to zero. Default: 0.0. | |||
| use_locking (bool): If True use locks for update operation. Default: False. | |||
| use_locking (bool): If True, use locks for the update operation. Default: False. | |||
| loss_scale (float): Value for the loss scale. It should be greater than 0.0. Default: 1.0. | |||
| weight_decay (float): Weight decay value to multiply weight, must be zero or positive value. Default: 0.0. | |||
| Inputs: | |||
| - **grads** (tuple[Tensor]) - The gradients of `params` in optimizer, the shape is as same as the `params` | |||
| - **grads** (tuple[Tensor]) - The gradients of `params` in the optimizer, the shape is the same as the `params` | |||
| in the optimizer. | |||
| Outputs: | |||
| @@ -106,12 +106,12 @@ class RMSProp(Optimizer): | |||
| the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| Default: 0.1. | |||
| decay (float): Decay rate. Should be equal to or greater than 0. Default: 0.9. | |||
| @@ -78,12 +78,12 @@ class SGD(Optimizer): | |||
| the order will be followed in optimizer. There are no other keys in the `dict` and the parameters which | |||
| in the value of 'order_params' should be in one of group parameters. | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or graph for the learning rate. | |||
| When the learning_rate is a Iterable or a Tensor with dimension of 1, use dynamic learning rate, then | |||
| learning_rate (Union[float, Tensor, Iterable, LearningRateSchedule]): A value or a graph for the learning rate. | |||
| When the learning_rate is an Iterable or a 1-D Tensor, use dynamic learning rate, then | |||
| the i-th step will take the i-th value as the learning rate. When the learning_rate is LearningRateSchedule, | |||
| use dynamic learning rate, the i-th learning rate will be calculated during the process of training | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a Tensor with | |||
| dimension of 0, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| according to the formula of LearningRateSchedule. When the learning_rate is a float or a 0-D | |||
| Tensor, use fixed learning rate. Other cases are not supported. The float learning rate should be | |||
| equal to or greater than 0. If the type of `learning_rate` is int, it will be converted to float. | |||
| Default: 0.1. | |||
| momentum (float): A floating point value for the momentum. It should be at least 0.0. Default: 0.0. | |||
| @@ -138,9 +138,9 @@ class TrainOneStepCell(Cell): | |||
| r""" | |||
| Network training package class. | |||
| Wraps the network with an optimizer. The resulting Cell be trained with input *inputs. | |||
| Backward graph will be created in the construct function to do parameter updating. Different | |||
| parallel modes are available to run the training. | |||
| Wraps the network with an optimizer. The resulting Cell is trained with input *inputs. | |||
| The backward graph will be created in the construct function to update the parameters. Different | |||
| parallel modes are available for training. | |||
| Args: | |||
| network (Cell): The training network. | |||
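Putting the wrapping together, a hedged sketch of a complete training step; the loss cell, shapes, and data are stand-ins:

```python
import numpy as np
import mindspore.nn as nn
from mindspore import Tensor

net = nn.Dense(3, 2)                                  # stand-in network
loss_fn = nn.SoftmaxCrossEntropyWithLogits()          # stand-in loss
opt = nn.Momentum(net.trainable_params(), learning_rate=0.1, momentum=0.9)

net_with_loss = nn.WithLossCell(net, loss_fn)
train_net = nn.TrainOneStepCell(net_with_loss, opt)   # backward graph built here
train_net.set_train()

data = Tensor(np.random.rand(4, 3).astype(np.float32))
label = Tensor(np.eye(2)[[0, 1, 0, 1]].astype(np.float32))  # one-hot labels
loss = train_net(data, label)                         # one optimization step
```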
| @@ -231,14 +231,14 @@ class DataWrapper(Cell): | |||
| class GetNextSingleOp(Cell): | |||
| """ | |||
| Cell to run get next operation. | |||
| Cell to run the GetNext operation and fetch data from the dataset queue. | |||
| Args: | |||
| dataset_types (list[:class:`mindspore.dtype`]): The types of dataset. | |||
| dataset_shapes (list[tuple[int]]): The shapes of dataset. | |||
| queue_name (str): Queue name to fetch the data. | |||
| Detailed information, please refer to `ops.operations.GetNext`. | |||
| For detailed information, refer to `ops.operations.GetNext`. | |||
| """ | |||
| def __init__(self, dataset_types, dataset_shapes, queue_name): | |||
| @@ -360,7 +360,7 @@ class ParameterUpdate(Cell): | |||
| param (Parameter): The parameter to be updated manually. | |||
| Raises: | |||
| KeyError: If parameter with the specified name do not exist. | |||
| KeyError: If the parameter with the specified name does not exist. | |||
| Examples: | |||
| >>> network = Net() | |||
| @@ -329,7 +329,7 @@ class DistributedGradReducer(Cell): | |||
| def construct(self, grads): | |||
| """ | |||
| In some circumstances, the data precision of grads could be mixed with float16 and float32. Thus, the | |||
| Under certain circumstances, the data precision of grads could be mixed with float16 and float32. Thus, the | |||
| result of AllReduce is unreliable. To solve the problem, grads should be cast to float32 before AllReduce, | |||
| and cast back after the operation. | |||
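A per-gradient sketch of that workaround; illustrative only, since the real reducer also fuses and groups the communication:

```python
import mindspore.common.dtype as mstype
from mindspore.ops import functional as F
from mindspore.ops import operations as P

all_reduce = P.AllReduce()

def reduce_grad(grad):
    # AllReduce in float32, then cast back to the gradient's original dtype.
    origin_dtype = F.dtype(grad)
    reduced = all_reduce(F.cast(grad, mstype.float32))
    return F.cast(reduced, origin_dtype)
```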
| @@ -54,8 +54,8 @@ class DynamicLossScaleUpdateCell(Cell): | |||
| Dynamic Loss scale update cell. | |||
| For loss scaling training, the initial loss scaling value will be set to be `loss_scale_value`. | |||
| In every training step, the loss scaling value will be updated by loss scaling value/`scale_factor` | |||
| when there is overflow. And it will be increased by loss scaling value * `scale_factor` if there is no | |||
| In each training step, the loss scaling value will be updated by loss scaling value/`scale_factor` | |||
| when there is an overflow. And it will be increased by loss scaling value * `scale_factor` if there is no | |||
| overflow for `scale_window` consecutive steps. This cell is used for Graph mode training, in which all | |||
| logic will be executed on the device side (another training mode is the normal, non-sink, mode in which | |||
| some logic will be executed on the host). | |||
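The update rule itself is simple; a plain-Python sketch of the logic described above, with state handling simplified for illustration:

```python
def update_loss_scale(scale, overflow, good_steps,
                      scale_factor=2.0, scale_window=1000):
    """Return the new (scale, good_steps) pair for one training step."""
    if overflow:
        return max(scale / scale_factor, 1.0), 0   # shrink on overflow
    good_steps += 1
    if good_steps >= scale_window:
        return scale * scale_factor, 0             # grow after a clean window
    return scale, good_steps
```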
| @@ -133,7 +133,7 @@ class FixedLossScaleUpdateCell(Cell): | |||
| """ | |||
| Static scale update cell, the loss scaling value will not be updated. | |||
| For usage please refer to `DynamicLossScaleUpdateCell`. | |||
| For usage, refer to `DynamicLossScaleUpdateCell`. | |||
| Args: | |||
| loss_scale_value (float): Init loss scale. | |||
| @@ -57,7 +57,7 @@ class _TupleGetItemTensor(base.TupleGetItemTensor_): | |||
| data (tuple): A tuple of items. | |||
| index (Tensor): The index in tensor. | |||
| Outputs: | |||
| Type, is same as the element type of data. | |||
| Type, is the same as the element type of data. | |||
| """ | |||
| def __init__(self, name): | |||
| @@ -81,7 +81,7 @@ def _tuple_getitem_by_number(data, number_index): | |||
| number_index (Number): Index in scalar. | |||
| Outputs: | |||
| Type, is same as the element type of data. | |||
| Type, is the same as the element type of data. | |||
| """ | |||
| return F.tuple_getitem(data, number_index) | |||
| @@ -96,7 +96,7 @@ def _tuple_getitem_by_slice(data, slice_index): | |||
| slice_index (Slice): Index in slice. | |||
| Outputs: | |||
| Tuple, element type is same as the element type of data. | |||
| Tuple, element type is the same as the element type of data. | |||
| """ | |||
| return _tuple_slice(data, slice_index) | |||
| @@ -111,7 +111,7 @@ def _tuple_getitem_by_tensor(data, tensor_index): | |||
| tensor_index (Tensor): Index to select item. | |||
| Outputs: | |||
| Type, is same as the element type of data. | |||
| Type, is the same as the element type of data. | |||
| """ | |||
| return _tuple_get_item_tensor(data, tensor_index) | |||
| @@ -126,7 +126,7 @@ def _list_getitem_by_number(data, number_index): | |||
| number_index (Number): Index in scalar. | |||
| Outputs: | |||
| Type is same as the element type of data. | |||
| Type is the same as the element type of data. | |||
| """ | |||
| return F.list_getitem(data, number_index) | |||
| @@ -186,7 +186,7 @@ def _tensor_getitem_by_slice(data, slice_index): | |||
| slice_index (Slice): Index in slice. | |||
| Outputs: | |||
| Tensor, element type is same as the element type of data. | |||
| Tensor, element type is the same as the element type of data. | |||
| """ | |||
| return compile_utils.tensor_index_by_slice(data, slice_index) | |||
| @@ -201,7 +201,7 @@ def _tensor_getitem_by_tensor(data, tensor_index): | |||
| tensor_index (Tensor): An index expressed by tensor. | |||
| Outputs: | |||
| Tensor, element type is same as the element type of data. | |||
| Tensor, element type is the same as the element type of data. | |||
| """ | |||
| return compile_utils.tensor_index_by_tensor(data, tensor_index) | |||
| @@ -216,7 +216,7 @@ def _tensor_getitem_by_tuple(data, tuple_index): | |||
| tuple_index (tuple): Index in tuple. | |||
| Outputs: | |||
| Tensor, element type is same as the element type of data. | |||
| Tensor, element type is the same as the element type of data. | |||
| """ | |||
| return compile_utils.tensor_index_by_tuple(data, tuple_index) | |||
| @@ -32,7 +32,7 @@ def _list_setitem_with_string(data, number_index, value): | |||
| number_index (Number): Index of data. | |||
| Outputs: | |||
| list, type is same as the element type of data. | |||
| list, type is the same as the element type of data. | |||
| """ | |||
| return F.list_setitem(data, number_index, value) | |||
| @@ -48,7 +48,7 @@ def _list_setitem_with_number(data, number_index, value): | |||
| value (Number): Value given. | |||
| Outputs: | |||
| list, type is same as the element type of data. | |||
| list, type is the same as the element type of data. | |||
| """ | |||
| return F.list_setitem(data, number_index, value) | |||
| @@ -64,7 +64,7 @@ def _list_setitem_with_Tensor(data, number_index, value): | |||
| value (Tensor): Value given. | |||
| Outputs: | |||
| list, type is same as the element type of data. | |||
| list, type is the same as the element type of data. | |||
| """ | |||
| return F.list_setitem(data, number_index, value) | |||
| @@ -80,7 +80,7 @@ def _list_setitem_with_List(data, number_index, value): | |||
| value (list): Value given. | |||
| Outputs: | |||
| list, type is same as the element type of data. | |||
| list, type is the same as the element type of data. | |||
| """ | |||
| return F.list_setitem(data, number_index, value) | |||
| @@ -96,7 +96,7 @@ def _list_setitem_with_Tuple(data, number_index, value): | |||
| value (list): Value given. | |||
| Outputs: | |||
| list, type is same as the element type of data. | |||
| list, type is the same as the element type of data. | |||
| """ | |||
| return F.list_setitem(data, number_index, value) | |||
| @@ -158,18 +158,18 @@ class ExtractImagePatches(PrimitiveWithInfer): | |||
| The input tensor must be a 4-D tensor and the data format is NHWC. | |||
| Args: | |||
| ksizes (Union[tuple[int], list[int]]): The size of sliding window, should be a tuple or list of int, | |||
| ksizes (Union[tuple[int], list[int]]): The size of the sliding window, should be a tuple or a list of integers, | |||
| and the format is [1, ksize_row, ksize_col, 1]. | |||
| strides (Union[tuple[int], list[int]]): Distance between the centers of the two consecutive patches, | |||
| should be a tuple or list of int, and the format is [1, stride_row, stride_col, 1]. | |||
| rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dim | |||
| pixel positions, should be a tuple or list of int, and the format is [1, rate_row, rate_col, 1]. | |||
| rates (Union[tuple[int], list[int]]): In each extracted patch, the gap between the corresponding dimension | |||
| pixel positions, should be a tuple or a list of integers, and the format is [1, rate_row, rate_col, 1]. | |||
| padding (str): The type of padding algorithm, is a string whose value is "same" or "valid", | |||
| not case sensitive. Default: "valid". | |||
| - same: Means that the patch can take the part beyond the original image, and this part is filled with 0. | |||
| - valid: Means that the patch area taken must be completely contained in the original image. | |||
| - valid: Means that the taken patch area must be completely contained within the original image. | |||
| Inputs: | |||
| - **input_x** (Tensor) - A 4-D tensor whose shape is [in_batch, in_row, in_col, in_depth] and | |||
| @@ -177,7 +177,7 @@ class ExtractImagePatches(PrimitiveWithInfer): | |||
| Outputs: | |||
| Tensor, a 4-D tensor whose data type is same as 'input_x', | |||
| and the shape is [out_batch, out_row, out_col, out_depth], the out_batch is same as the in_batch. | |||
| and the shape is [out_batch, out_row, out_col, out_depth], where out_batch is the same as the in_batch. | |||
| """ | |||
| @prim_attr_register | |||
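A hedged usage sketch for `ExtractImagePatches` with an NHWC input; the parameter formats follow the Args list above:

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

extract = P.ExtractImagePatches(ksizes=[1, 2, 2, 1],
                                strides=[1, 2, 2, 1],
                                rates=[1, 1, 1, 1],
                                padding="valid")
image = Tensor(np.arange(16).reshape(1, 4, 4, 1).astype(np.float32))
patches = extract(image)  # 2x2 patches -> out_depth = 2 * 2 * 1 = 4
```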
| @@ -436,8 +436,8 @@ class MatrixDiag(PrimitiveWithInfer): | |||
| Returns a batched diagonal tensor with a given batched diagonal values. | |||
| Inputs: | |||
| - **x** (Tensor) - A tensor which to be element-wise multi by `assist`. It can be of the following data types: | |||
| float32, float16, int32, int8, uint8. | |||
| - **x** (Tensor) - A tensor to be multiplied element-wise by `assist`. It can be one of the following data | |||
| types: float32, float16, int32, int8, and uint8. | |||
| - **assist** (Tensor) - An eye tensor of the same type as `x`. Its rank must be greater than or equal to 2 and | |||
| its last dimension must be equal to the second-to-last dimension. | |||
| @@ -490,7 +490,7 @@ class MatrixDiagPart(PrimitiveWithInfer): | |||
| Returns the batched diagonal part of a batched tensor. | |||
| Inputs: | |||
| - **x** (Tensor) - The batched tensor. It can be of the following data types: | |||
| - **x** (Tensor) - The batched tensor. It can be one of the following data types: | |||
| float32, float16, int32, int8, uint8. | |||
| - **assist** (Tensor) - An eye tensor of the same type as `x`, with the same shape as `x`. | |||
| @@ -531,7 +531,7 @@ class MatrixSetDiag(PrimitiveWithInfer): | |||
| Modify the batched diagonal part of a batched tensor. | |||
| Inputs: | |||
| - **x** (Tensor) - The batched tensor. It can be of the following data types: | |||
| - **x** (Tensor) - The batched tensor. It can be one of the following data types: | |||
| float32, float16, int32, int8, uint8. | |||
| - **assist** (Tensor) - A eye tensor of the same type as `x`. With shape same as `x`. | |||
| - **diagonal** (Tensor) - The diagonal values. | |||
| @@ -178,8 +178,8 @@ class FakeQuantPerLayer(PrimitiveWithInfer): | |||
| quant_delay (int): Quantization delay parameter. The simulated quantization aware function is not | |||
| applied for the first `quant_delay` training steps, and is applied afterwards. Default: 0. | |||
| symmetric (bool): Quantization algorithm use symmetric or not. Default: False. | |||
| narrow_range (bool): Quantization algorithm use narrow range or not. Default: False. | |||
| symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False. | |||
| narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False. | |||
| training (bool): Training the network or not. Default: True. | |||
| Inputs: | |||
| @@ -318,8 +318,8 @@ class FakeQuantPerChannel(PrimitiveWithInfer): | |||
| quant_delay (int): Quantization delay parameter. The weight data is not updated to simulate the | |||
| quantize operation for the first `quant_delay` training steps; the simulated quantize operation | |||
| begins afterwards. Default: 0. | |||
| symmetric (bool): Quantization algorithm use symmetric or not. Default: False. | |||
| narrow_range (bool): Quantization algorithm use narrow range or not. Default: False. | |||
| symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False. | |||
| narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False. | |||
| training (bool): Training the network or not. Default: True. | |||
| channel_axis (int): Quantization by channel axis. Ascend backend only supports 0 or 1. Default: 1. | |||
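What `symmetric` and `narrow_range` control can be sketched in plain Python; this is a simplified model of simulated quantization, not the actual kernel:

```python
def quant_range(num_bits=8, narrow_range=False):
    # narrow_range drops the lowest code so the integer range is symmetric.
    qmin = 1 if narrow_range else 0
    qmax = (1 << num_bits) - 1
    return qmin, qmax

def adjust_min_max(x_min, x_max, symmetric=False):
    # symmetric mirrors the float range around zero before computing the scale.
    if symmetric:
        bound = max(abs(x_min), abs(x_max))
        return -bound, bound
    return min(x_min, 0.0), max(x_max, 0.0)
```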
| @@ -3359,7 +3359,7 @@ class InplaceUpdate(PrimitiveWithInfer): | |||
| indices (Union[int, tuple]): Indices into the left-most dimension of `x`. | |||
| Inputs: | |||
| - **x** (Tensor) - A tensor which to be inplace updated. It can be of the following data types: | |||
| - **x** (Tensor) - A tensor to be updated inplace. It can be one of the following data types: | |||
| float32, float16, int32. | |||
| - **v** (Tensor) - A tensor of the same type as `x`. Same dimension size as `x` except | |||
| the first dimension, which must be the same as the size of `indices`. | |||
| @@ -3474,7 +3474,7 @@ class TransShape(PrimitiveWithInfer): | |||
| - **out_shape** (tuple[int]) - The shape of output data. | |||
| Outputs: | |||
| Tensor, a tensor whose data type is same as 'input_x', and the shape is same as the `out_shape`. | |||
| Tensor, a tensor whose data type is the same as 'input_x', and the shape is the same as the `out_shape`. | |||
| """ | |||
| @prim_attr_register | |||
| def __init__(self): | |||
| @@ -31,7 +31,7 @@ class ScalarCast(PrimitiveWithInfer): | |||
| - **input_y** (mindspore.dtype) - The type to cast to. Only constant value is allowed. | |||
| Outputs: | |||
| Scalar. The type is same as the python type corresponding to `input_y`. | |||
| Scalar. The type is the same as the Python type corresponding to `input_y`. | |||
| Examples: | |||
| >>> scalar_cast = P.ScalarCast() | |||
| @@ -132,7 +132,7 @@ class TensorAdd(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
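A minimal broadcasting sketch for these binary math ops, with shapes (3, 1) and (1, 3) broadcasting to (3, 3); values are illustrative:

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

add = P.TensorAdd()
x = Tensor(np.ones((3, 1)).astype(np.float32))
y = Tensor(np.ones((1, 3)).astype(np.float32))
out = add(x, y)  # shape (3, 3), every element 2.0
```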
| @@ -1067,7 +1067,7 @@ class Sub(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1105,7 +1105,7 @@ class Mul(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1144,7 +1144,7 @@ class SquaredDifference(_MathBinaryOp): | |||
| float16, float32, int32 or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1333,7 +1333,7 @@ class Pow(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1618,7 +1618,7 @@ class Minimum(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1656,7 +1656,7 @@ class Maximum(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1694,7 +1694,7 @@ class RealDiv(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1733,7 +1733,7 @@ class Div(_MathBinaryOp): | |||
| is a number or a bool, the second input should be a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Raises: | |||
| @@ -1772,7 +1772,7 @@ class DivNoNan(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Raises: | |||
| @@ -1814,7 +1814,7 @@ class FloorDiv(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1844,7 +1844,7 @@ class TruncateDiv(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1873,7 +1873,7 @@ class TruncateMod(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -1900,7 +1900,7 @@ class Mod(_MathBinaryOp): | |||
| the second input should be a tensor whose data type is number. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Raises: | |||
| @@ -1967,7 +1967,7 @@ class FloorMod(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -2025,7 +2025,7 @@ class Xdivy(_MathBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is float16, float32 or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -2059,7 +2059,7 @@ class Xlogy(_MathBinaryOp): | |||
| The value must be positive. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, | |||
| Tensor, the shape is the same as the shape after broadcasting, | |||
| and the data type is the one with high precision or high digits among the two inputs. | |||
| Examples: | |||
| @@ -2219,7 +2219,7 @@ class Equal(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32) | |||
| @@ -2250,7 +2250,7 @@ class ApproximateEqual(_LogicBinaryOp): | |||
| - **x2** (Tensor) - A tensor of the same type and shape as 'x1'. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape of 'x1', and the data type is bool. | |||
| Tensor, the shape is the same as the shape of 'x1', and the data type is bool. | |||
| Examples: | |||
| >>> x1 = Tensor(np.array([1, 2, 3]), mindspore.float32) | |||
| @@ -2328,7 +2328,7 @@ class NotEqual(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.float32) | |||
| @@ -2364,7 +2364,7 @@ class Greater(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32) | |||
| @@ -2399,7 +2399,7 @@ class GreaterEqual(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32) | |||
| @@ -2434,7 +2434,7 @@ class Less(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32) | |||
| @@ -2469,7 +2469,7 @@ class LessEqual(_LogicBinaryOp): | |||
| a bool when the first input is a tensor or a tensor whose data type is number or bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([1, 2, 3]), mindspore.int32) | |||
| @@ -2495,7 +2495,7 @@ class LogicalNot(PrimitiveWithInfer): | |||
| - **input_x** (Tensor) - The input tensor whose dtype is bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the `input_x`, and the dtype is bool. | |||
| Tensor, the shape is the same as the `input_x`, and the dtype is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_) | |||
| @@ -2533,7 +2533,7 @@ class LogicalAnd(_LogicBinaryOp): | |||
| a tensor whose data type is bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting, and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_) | |||
| @@ -2563,7 +2563,7 @@ class LogicalOr(_LogicBinaryOp): | |||
| a tensor whose data type is bool. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is bool. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is bool. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([True, False, True]), mindspore.bool_) | |||
| @@ -3182,7 +3182,7 @@ class Atan2(_MathBinaryOp): | |||
| - **input_y** (Tensor) - The input tensor. | |||
| Outputs: | |||
| Tensor, the shape is same as the shape after broadcasting,and the data type is same as `input_x`. | |||
| Tensor, the shape is the same as the shape after broadcasting, and the data type is the same as `input_x`. | |||
| Examples: | |||
| >>> input_x = Tensor(np.array([[0, 1]]), mindspore.float32) | |||
| @@ -100,7 +100,7 @@ class Softmax(PrimitiveWithInfer): | |||
| Softmax operation. | |||
| Applies the Softmax operation to the input tensor on the specified axis. | |||
| Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i` | |||
| Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i` | |||
| the Softmax function is shown as follows: | |||
| .. math:: | |||
| @@ -151,7 +151,7 @@ class LogSoftmax(PrimitiveWithInfer): | |||
| Log Softmax activation function. | |||
| Applies the Log Softmax function to the input tensor on the specified axis. | |||
| Suppose a slice along the given aixs :math:`x` then for each element :math:`x_i` | |||
| Suppose a slice in the given axis :math:`x`, then for each element :math:`x_i` | |||
| the Log Softmax function is shown as follows: | |||
| .. math:: | |||
| @@ -429,7 +429,7 @@ class HSwish(PrimitiveWithInfer): | |||
| .. math:: | |||
| \text{hswish}(x_{i}) = x_{i} * \frac{ReLU6(x_{i} + 3)}{6}, | |||
| where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. | |||
| where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor. | |||
| Inputs: | |||
| - **input_data** (Tensor) - The input of HSwish, data type should be float16 or float32. | |||
| @@ -502,7 +502,7 @@ class HSigmoid(PrimitiveWithInfer): | |||
| .. math:: | |||
| \text{hsigmoid}(x_{i}) = max(0, min(1, \frac{x_{i} + 3}{6})), | |||
| where :math:`x_{i}` is the :math:`i`-th slice along the given dim of the input Tensor. | |||
| where :math:`x_{i}` is the :math:`i`-th slice in the given dimension of the input Tensor. | |||
| Inputs: | |||
| - **input_data** (Tensor) - The input of HSigmoid, data type should be float16 or float32. | |||
| @@ -2234,7 +2234,7 @@ class DropoutDoMask(PrimitiveWithInfer): | |||
| shape of `input_x` must be the same as the value of `DropoutGenMask`'s input `shape`. If a wrong `mask` is | |||
| input, the output of `DropoutDoMask` is unpredictable. | |||
| - **keep_prob** (Tensor) - The keep rate, between 0 and 1, e.g. keep_prob = 0.9, | |||
| means dropping out 10% of input units. The value of `keep_prob` is same as the input `keep_prob` of | |||
| means dropping out 10% of input units. The value of `keep_prob` is the same as the input `keep_prob` of | |||
| `DropoutGenMask`. | |||
| Outputs: | |||
| @@ -2674,9 +2674,9 @@ class Pad(PrimitiveWithInfer): | |||
| Args: | |||
| paddings (tuple): The shape of parameter `paddings` is (N, 2). N is the rank of input data. All elements of | |||
| paddings are int type. For `D` th dimension of input, paddings[D, 0] indicates how many sizes to be | |||
| extended ahead of the `D` th dimension of the input tensor, and paddings[D, 1] indicates how many sizes to | |||
| be extended behind of the `D` th dimension of the input tensor. | |||
| paddings are int type. For the input in the `D` th dimension, paddings[D, 0] indicates how many sizes to be | |||
| extended ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many sizes to | |||
| be extended behind the input tensor in the `D` th dimension. | |||
| Inputs: | |||
| - **input_x** (Tensor) - The input tensor. | |||
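A hedged usage sketch for `Pad` on a 2-D input; `paddings` carries one (before, after) pair per dimension:

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

# Pad 1 row before / 2 rows after, and 2 columns before / 1 column after.
pad = P.Pad(paddings=((1, 2), (2, 1)))
x = Tensor(np.ones((2, 3)).astype(np.float32))
out = pad(x)  # shape (2 + 1 + 2, 3 + 2 + 1) = (5, 6), padded with zeros
```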
| @@ -2733,9 +2733,9 @@ class MirrorPad(PrimitiveWithInfer): | |||
| - **input_x** (Tensor) - The input tensor. | |||
| - **paddings** (Tensor) - The paddings tensor. The value of `paddings` is a matrix (list), | |||
| and its shape is (N, 2). N is the rank of input data. All elements of paddings | |||
| are int type. For `D` th dimension of input, paddings[D, 0] indicates how many sizes to be | |||
| extended ahead of the `D` th dimension of the input tensor, and paddings[D, 1] indicates | |||
| how many sizes to be extended behind of the `D` th dimension of the input tensor. | |||
| are int type. For the input in the `D` th dimension, paddings[D, 0] indicates how many sizes to be | |||
| extended ahead of the input tensor in the `D` th dimension, and paddings[D, 1] indicates how many sizes to | |||
| be extended behind the input tensor in the `D` th dimension. | |||
| Outputs: | |||
| Tensor, the tensor after padding. | |||
| @@ -2880,11 +2880,11 @@ class Adam(PrimitiveWithInfer): | |||
| Args: | |||
| use_locking (bool): Whether to enable a lock to protect updating variable tensors. | |||
| If True, updating of the var, m, and v tensors will be protected by a lock. | |||
| If False, the result is unpredictable. Default: False. | |||
| If true, updates of the var, m, and v tensors will be protected by a lock. | |||
| If false, the result is unpredictable. Default: False. | |||
| use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. | |||
| If True, updates the gradients using NAG. | |||
| If False, updates the gradients without using NAG. Default: False. | |||
| If true, update the gradients using NAG. | |||
| If false, update the gradients without using NAG. Default: False. | |||
| Inputs: | |||
| - **var** (Tensor) - Weights to be updated. | |||
| @@ -2894,8 +2894,8 @@ class Adam(PrimitiveWithInfer): | |||
| - **beta1_power** (float) - :math:`beta_1^t` in the updating formula. | |||
| - **beta2_power** (float) - :math:`beta_2^t` in the updating formula. | |||
| - **lr** (float) - :math:`l` in the updating formula. | |||
| - **beta1** (float) - The exponential decay rate for the 1st moment estimates. | |||
| - **beta2** (float) - The exponential decay rate for the 2nd moment estimates. | |||
| - **beta1** (float) - The exponential decay rate for the 1st moment estimations. | |||
| - **beta2** (float) - The exponential decay rate for the 2nd moment estimations. | |||
| - **epsilon** (float) - Term added to the denominator to improve numerical stability. | |||
| - **gradient** (Tensor) - Gradients. Has the same type as `var`. | |||
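For reference, these inputs parameterize the standard Adam update; the formulas below are reconstructed from the published algorithm rather than quoted from this file, so treat them as a sketch:

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t \\
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \\
l_t = lr \cdot \sqrt{1 - \beta_2^t} / (1 - \beta_1^t) \\
var_t = var_{t-1} - l_t \cdot m_t / (\sqrt{v_t} + \epsilon)
```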
| @@ -2974,11 +2974,11 @@ class FusedSparseAdam(PrimitiveWithInfer): | |||
| Args: | |||
| use_locking (bool): Whether to enable a lock to protect updating variable tensors. | |||
| If True, updating of the var, m, and v tensors will be protected by a lock. | |||
| If False, the result is unpredictable. Default: False. | |||
| If true, updates of the var, m, and v tensors will be protected by a lock. | |||
| If false, the result is unpredictable. Default: False. | |||
| use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. | |||
| If True, updates the gradients using NAG. | |||
| If False, updates the gradients without using NAG. Default: False. | |||
| If true, update the gradients using NAG. | |||
| If false, update the gradients without using NAG. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Parameters to be updated. With float32 data type. | |||
| @@ -2989,8 +2989,8 @@ class FusedSparseAdam(PrimitiveWithInfer): | |||
| - **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type. | |||
| - **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type. | |||
| - **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type. | |||
| - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimates. With float32 data type. | |||
| - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimates. With float32 data type. | |||
| - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type. | |||
| - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type. | |||
| - **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type. | |||
| - **gradient** (Tensor) - Gradient value. With float32 data type. | |||
| - **indices** (Tensor) - Gradient indices. With int32 data type. | |||
| @@ -3108,11 +3108,11 @@ class FusedSparseLazyAdam(PrimitiveWithInfer): | |||
| Args: | |||
| use_locking (bool): Whether to enable a lock to protect updating variable tensors. | |||
| If True, updating of the var, m, and v tensors will be protected by a lock. | |||
| If False, the result is unpredictable. Default: False. | |||
| If true, updates of the var, m, and v tensors will be protected by a lock. | |||
| If false, the result is unpredictable. Default: False. | |||
| use_nesterov (bool): Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. | |||
| If True, updates the gradients using NAG. | |||
| If False, updates the gradients without using NAG. Default: False. | |||
| If true, update the gradients using NAG. | |||
| If false, update the gradients without using NAG. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Parameters to be updated. With float32 data type. | |||
| @@ -3123,8 +3123,8 @@ class FusedSparseLazyAdam(PrimitiveWithInfer): | |||
| - **beta1_power** (Tensor) - :math:`beta_1^t` in the updating formula. With float32 data type. | |||
| - **beta2_power** (Tensor) - :math:`beta_2^t` in the updating formula. With float32 data type. | |||
| - **lr** (Tensor) - :math:`l` in the updating formula. With float32 data type. | |||
| - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimates. With float32 data type. | |||
| - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimates. With float32 data type. | |||
| - **beta1** (Tensor) - The exponential decay rate for the 1st moment estimations. With float32 data type. | |||
| - **beta2** (Tensor) - The exponential decay rate for the 2nd moment estimations. With float32 data type. | |||
| - **epsilon** (Tensor) - Term added to the denominator to improve numerical stability. With float32 data type. | |||
| - **gradient** (Tensor) - Gradient value. With float32 data type. | |||
| - **indices** (Tensor) - Gradient indices. With int32 data type. | |||
| @@ -3227,7 +3227,7 @@ class FusedSparseFtrl(PrimitiveWithInfer): | |||
| l2 (float): l2 regularization strength, must be greater than or equal to zero. | |||
| lr_power (float): Learning rate power controls how the learning rate decreases during training, | |||
| must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero. | |||
| use_locking (bool): Use locks for update operation if True . Default: False. | |||
| use_locking (bool): Use locks for the updating operation if true. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - The variable to be updated. The data type must be float32. | |||
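| The Args above translate into a constructor call along these lines. The values are illustrative, and the `l1` parameter is an assumption by symmetry with `l2` and the `lr * l1` term in the FTRL formula:

```python
from mindspore.ops import operations as P

# lr_power must be <= 0; a value of 0 selects a fixed learning rate
sparse_ftrl = P.FusedSparseFtrl(lr=0.01, l1=0.0, l2=0.0, lr_power=-0.5, use_locking=False)
```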
| @@ -3320,7 +3320,7 @@ class FusedSparseProximalAdagrad(PrimitiveWithInfer): | |||
| var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0) | |||
| Args: | |||
| use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False. | |||
| use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Variable tensor to be updated. The data type must be float32. | |||
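| The displayed formula is only the final step of the proximal update. For context, a sketch of the preceding steps that produce `prox_v`, reconstructed from the standard proximal Adagrad algorithm rather than quoted from the docstring:

```latex
accum \leftarrow accum + grad^2
\text{prox\_v} = var - lr \cdot grad / \sqrt{accum}
var = \frac{\operatorname{sign}(\text{prox\_v})}{1 + lr \cdot l2} \cdot \max(|\text{prox\_v}| - lr \cdot l1, 0)
```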
| @@ -3415,7 +3415,7 @@ class KLDivLoss(PrimitiveWithInfer): | |||
| \end{cases} | |||
| Args: | |||
| reduction (str): Specifies the reduction to apply to the output. | |||
| reduction (str): Specifies the reduction to be applied to the output. | |||
| Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'. | |||
| Inputs: | |||
| @@ -3487,7 +3487,7 @@ class BinaryCrossEntropy(PrimitiveWithInfer): | |||
| \end{cases} | |||
| Args: | |||
| reduction (str): Specifies the reduction to apply to the output. | |||
| reduction (str): Specifies the reduction to be applied to the output. | |||
| Its value should be one of 'none', 'mean', 'sum'. Default: 'mean'. | |||
| Inputs: | |||
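| Both loss primitives share this reduction contract. A minimal sketch of how the choice changes the output, assuming BinaryCrossEntropy takes (probabilities, labels, weight) as its inputs (the Inputs list is not shown in this hunk) and direct PyNative-style invocation; in graph mode, wrap the op in a Cell as in the earlier examples:

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

# 'none' keeps the per-element losses; 'mean' and 'sum' reduce them to a scalar
bce_none = P.BinaryCrossEntropy(reduction='none')
bce_mean = P.BinaryCrossEntropy(reduction='mean')

probs = Tensor(np.array([0.2, 0.7, 0.9]).astype(np.float32))
labels = Tensor(np.array([0.0, 1.0, 1.0]).astype(np.float32))
weight = Tensor(np.ones(3).astype(np.float32))

per_element = bce_none(probs, labels, weight)  # shape (3,)
averaged = bce_mean(probs, labels, weight)     # scalar
```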
| @@ -3575,9 +3575,9 @@ class ApplyAdaMax(PrimitiveWithInfer): | |||
| With float32 or float16 data type. | |||
| - **lr** (Union[Number, Tensor]) - Learning rate, :math:`l` in the updating formula, should be scalar. | |||
| With float32 or float16 data type. | |||
| - **beta1** (Union[Number, Tensor]) - The exponential decay rate for the 1st moment estimates, | |||
| - **beta1** (Union[Number, Tensor]) - The exponential decay rate for the 1st moment estimations, | |||
| should be scalar. With float32 or float16 data type. | |||
| - **beta2** (Union[Number, Tensor]) - The exponential decay rate for the 2nd moment estimates, | |||
| - **beta2** (Union[Number, Tensor]) - The exponential decay rate for the 2nd moment estimations, | |||
| should be scalar. With float32 or float16 data type. | |||
| - **epsilon** (Union[Number, Tensor]) - A small value added for numerical stability, should be scalar. | |||
| With float32 or float16 data type. | |||
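| These parameters correspond to the standard AdaMax variant of Adam, in which the second-moment estimate is replaced by an infinity-norm accumulator. A sketch of the update, with `t` implied by `beta1_power`:

```latex
m \leftarrow \beta_1 \cdot m + (1 - \beta_1) \cdot g
v \leftarrow \max(\beta_2 \cdot v, |g|)
var \leftarrow var - \frac{l}{1 - \beta_1^t} \cdot \frac{m}{v + \epsilon}
```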
| @@ -3939,7 +3939,7 @@ class SparseApplyAdagrad(PrimitiveWithInfer): | |||
| Args: | |||
| lr (float): Learning rate. | |||
| update_slots (bool): If `True`, `accum` will be updated. Default: True. | |||
| use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False. | |||
| use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Variable to be updated. The data type must be float16 or float32. | |||
| @@ -4099,7 +4099,7 @@ class ApplyProximalAdagrad(PrimitiveWithInfer): | |||
| var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0) | |||
| Args: | |||
| use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False. | |||
| use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Variable to be updated. The data type should be float16 or float32. | |||
| @@ -4195,7 +4195,7 @@ class SparseApplyProximalAdagrad(PrimitiveWithInfer): | |||
| var = \frac{sign(\text{prox_v})}{1 + lr * l2} * \max(\left| \text{prox_v} \right| - lr * l1, 0) | |||
| Args: | |||
| use_locking (bool): If True, updating of the var and accum tensors will be protected. Default: False. | |||
| use_locking (bool): If true, updates of the var and accum tensors will be protected. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - Variable tensor to be updated. The data type must be float16 or float32. | |||
| @@ -4697,7 +4697,7 @@ class ApplyFtrl(PrimitiveWithInfer): | |||
| Update relevant entries according to the FTRL scheme. | |||
| Args: | |||
| use_locking (bool): Use locks for update operation if True . Default: False. | |||
| use_locking (bool): Use locks for the updating operation if true. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - The variable to be updated. The data type should be float16 or float32. | |||
| @@ -4788,7 +4788,7 @@ class SparseApplyFtrl(PrimitiveWithInfer): | |||
| l2 (float): l2 regularization strength, must be greater than or equal to zero. | |||
| lr_power (float): Learning rate power controls how the learning rate decreases during training, | |||
| must be less than or equal to zero. Use fixed learning rate if `lr_power` is zero. | |||
| use_locking (bool): Use locks for update operation if True . Default: False. | |||
| use_locking (bool): Use locks for the updating operation if true. Default: False. | |||
| Inputs: | |||
| - **var** (Parameter) - The variable to be updated. The data type must be float16 or float32. | |||
| @@ -4967,8 +4967,8 @@ class ConfusionMulGrad(PrimitiveWithInfer): | |||
| axis (Union[int, tuple[int], list[int]]): The dimensions to reduce. | |||
| Default: (), reduce all dimensions. Only constant value is allowed. | |||
| keep_dims (bool): | |||
| - If True, keep these reduced dimensions and the length is 1. | |||
| - If False, don't keep these dimensions. Default:False. | |||
| - If true, keep these reduced dimensions with length 1. | |||
| - If false, don't keep these dimensions. Default: False. | |||
| Inputs: | |||
| - **input_0** (Tensor) - The input Tensor. | |||
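| `keep_dims` follows the usual reduction convention; the effect on shapes, for illustration:

```python
# keep_dims with input shape (2, 3) and axis=(1,):
#   keep_dims=True  -> reduced output has shape (2, 1)
#   keep_dims=False -> reduced output has shape (2,)
```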
| @@ -5094,9 +5094,9 @@ class CTCLoss(PrimitiveWithInfer): | |||
| Calculates the CTC(Connectionist Temporal Classification) loss. Also calculates the gradient. | |||
| Args: | |||
| preprocess_collapse_repeated (bool): If True, repeated labels are collapsed prior to the CTC calculation. | |||
| preprocess_collapse_repeated (bool): If true, repeated labels are collapsed prior to the CTC calculation. | |||
| Default: False. | |||
| ctc_merge_repeated (bool): If False, during CTC calculation, repeated non-blank labels will not be merged | |||
| ctc_merge_repeated (bool): If false, during CTC calculation, repeated non-blank labels will not be merged | |||
| and are interpreted as individual labels. This is a simplified version of CTC. | |||
| Default: True. | |||
| ignore_longer_outputs_than_inputs (bool): If True, sequences with longer outputs than inputs will be ignored. | |||
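| A minimal invocation sketch. Since this hunk shows only the Args, the input layout (activations as (max_time, batch_size, num_classes)) and the SparseTensor-style (indices, values) label pair, including their dtypes, are assumptions based on the common CTC interface:

```python
import numpy as np
from mindspore import Tensor
from mindspore.ops import operations as P

ctc_loss = P.CTCLoss()  # defaults as listed above

inputs = Tensor(np.random.rand(2, 2, 3).astype(np.float32))          # (max_time, batch, classes)
labels_indices = Tensor(np.array([[0, 0], [1, 0]]).astype(np.int64))  # (batch_idx positions)
labels_values = Tensor(np.array([2, 2]).astype(np.int32))             # label ids
sequence_length = Tensor(np.array([2, 2]).astype(np.int32))           # per-sample lengths

loss, gradient = ctc_loss(inputs, labels_indices, labels_values, sequence_length)
```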
| @@ -5192,7 +5192,7 @@ class BasicLSTMCell(PrimitiveWithInfer): | |||
| keep_prob (float): If not 1.0, append `Dropout` layer on the outputs of each | |||
| LSTM layer except the last layer. Default: 1.0. The range of dropout is [0.0, 1.0]. | |||
| forget_bias (float): Add forget bias to forget gate biases in order to decrease the former scale. Default: 1.0. | |||
| state_is_tuple (bool): If True, state is tensor tuple, containing h and c; If False, one tensor, | |||
| state_is_tuple (bool): If true, the state is a tensor tuple containing h and c; if false, it is one | |||
| tensor that needs to be split first. Default: True. | |||
| activation (str): Activation function. Default: "tanh". | |||
| @@ -496,12 +496,11 @@ def convert_quant_network(network, | |||
| per_channel (bool, list or tuple): Quantization granularity based on layer or on channel. If `True`, | |||
| quantization is applied per channel; otherwise per layer. The first element represents weights | |||
| and the second element represents data flow. Default: (False, False) | |||
| symmetric (bool, list or tuple): Quantization algorithm use symmetric or not. If `True` then base on | |||
| symmetric (bool, list or tuple): Whether the quantization algorithm is symmetric or not. If `True`, | |||
| symmetric quantization is used; otherwise asymmetric. The first element represents weights and the | |||
| second element represents data flow. Default: (False, False) | |||
| narrow_range (bool, list or tuple): Quantization algorithm use narrow range or not. If `True` then base | |||
| on narrow range otherwise base on off narrow range. The first element represent weights and | |||
| second element represent data flow. Default: (False, False) | |||
| narrow_range (bool, list or tuple): Whether the quantization algorithm uses narrow range or not. | |||
| The first element represents weights and the second element represents data flow. Default: (False, False) | |||
| Returns: | |||
| Cell, the network converted to a quantization aware training network. | |||
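| Putting the per-weight/per-data-flow tuples together, a hedged call sketch; the import path is an assumption and may differ by version:

```python
# hypothetical import path; adjust to wherever convert_quant_network lives in your version
from mindspore.train.quant import quant

quant_net = quant.convert_quant_network(
    net,                         # an fp32 network (nn.Cell) to convert
    per_channel=(True, False),   # per-channel for weights, per-layer for data flow
    symmetric=(False, False),
    narrow_range=(False, False),
)
```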
| @@ -31,8 +31,8 @@ def cal_quantization_params(input_min, | |||
| input_max (numpy.ndarray): The dimension of channel or 1. | |||
| data_type (numpy type): Can be numpy int8 or numpy uint8. | |||
| num_bits (int): Number of quantization bits; supports 4 and 8 bits. Default: 8. | |||
| symmetric (bool): Quantization algorithm use symmetric or not. Default: False. | |||
| narrow_range (bool): Quantization algorithm use narrow range or not. Default: False. | |||
| symmetric (bool): Whether the quantization algorithm is symmetric or not. Default: False. | |||
| narrow_range (bool): Whether the quantization algorithm uses narrow range or not. Default: False. | |||
| Returns: | |||
| scale (numpy.ndarray): Quantization scale parameter. | |||
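| For intuition, an illustrative re-derivation of scale and zero point under one common convention. This is a sketch of the math the parameters describe, not the library's actual implementation:

```python
import numpy as np

def quant_params_sketch(input_min, input_max, num_bits=8, symmetric=False, narrow_range=False):
    """Map the real range [input_min, input_max] onto the integer range [qmin, qmax]."""
    qmin = 1 if narrow_range else 0
    qmax = (1 << num_bits) - 1
    if symmetric:
        # symmetric: the range is centered on zero, so the zero point is fixed at 0
        bound = np.maximum(np.abs(input_min), np.abs(input_max))
        scale = 2.0 * bound / (qmax - qmin)
        zero_point = np.zeros_like(scale)
    else:
        # asymmetric: [min, max] maps directly onto [qmin, qmax]
        scale = (input_max - input_min) / (qmax - qmin)
        zero_point = np.round(qmin - input_min / scale)
    # a real implementation would also guard against scale == 0 when min == max
    return scale, zero_point
```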
| @@ -34,7 +34,7 @@ pkg_dir = os.path.join(pwd, 'build/package') | |||
| def _read_file(filename): | |||
| with open(os.path.join(pwd, filename)) as f: | |||
| with open(os.path.join(pwd, filename), encoding='UTF-8') as f: | |||
| return f.read() | |||
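| For context on this final hunk: passing `encoding='UTF-8'` pins the file encoding instead of relying on the platform's locale default, which can make `_read_file` fail on non-ASCII content (such as the Chinese README) under a C or GBK locale.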