lstm.py

# Copyright 2020 Huawei Technologies Co., Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ============================================================================
"""lstm"""
import math

import numpy as np

from mindspore._checkparam import Validator as validator
from mindspore.common.initializer import initializer
from mindspore.common.parameter import Parameter
from mindspore.common.tensor import Tensor
from mindspore.nn.cell import Cell
from mindspore.ops import operations as P

__all__ = ['LSTM', 'LSTMCell']


class LSTM(Cell):
    r"""
    LSTM (Long Short-Term Memory) layer.

    Applies an LSTM layer to the input.

    There are two pipelines connecting two consecutive cells in an LSTM model: one is the cell state pipeline
    and the other is the hidden state pipeline. Denote two consecutive time steps as :math:`t-1` and :math:`t`.
    Given an input :math:`x_t` at time :math:`t`, a hidden state :math:`h_{t-1}` and a cell
    state :math:`c_{t-1}` of the layer at time :math:`t-1`, the cell state and hidden state at
    time :math:`t` are computed using a gating mechanism. The input gate :math:`i_t` is designed to protect
    the cell from perturbation by irrelevant inputs. The forget gate :math:`f_t` affords protection of the
    cell by forgetting some information from the past, which is stored in :math:`h_{t-1}`. The output gate
    :math:`o_t` protects other units from perturbation by currently irrelevant memory contents. The candidate
    cell state :math:`\tilde{c}_t` is calculated from the current input, and the input gate is applied to it.
    Finally, the current cell state :math:`c_{t}` and hidden state :math:`h_{t}` are computed from the
    calculated gates and cell states. The complete formulation is as follows.

    .. math::
        \begin{array}{ll} \\
            i_t = \sigma(W_{ix} x_t + b_{ix} + W_{ih} h_{(t-1)} + b_{ih}) \\
            f_t = \sigma(W_{fx} x_t + b_{fx} + W_{fh} h_{(t-1)} + b_{fh}) \\
            \tilde{c}_t = \tanh(W_{cx} x_t + b_{cx} + W_{ch} h_{(t-1)} + b_{ch}) \\
            o_t = \sigma(W_{ox} x_t + b_{ox} + W_{oh} h_{(t-1)} + b_{oh}) \\
            c_t = f_t * c_{(t-1)} + i_t * \tilde{c}_t \\
            h_t = o_t * \tanh(c_t) \\
        \end{array}

    Here :math:`\sigma` is the sigmoid function, and :math:`*` is the Hadamard product. :math:`W, b`
    are learnable weights between the output and the input in the formula. For instance,
    :math:`W_{ix}, b_{ix}` are the weight and bias used to transform the input :math:`x` into :math:`i`.
    Details can be found in the papers `LONG SHORT-TERM MEMORY
    <https://www.bioinf.jku.at/publications/older/2604.pdf>`_ and
    `Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling
    <https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/43905.pdf>`_.

    Args:
        input_size (int): Number of features of the input.
        hidden_size (int): Number of features of the hidden layer.
        num_layers (int): Number of layers of stacked LSTM. Default: 1.
        has_bias (bool): Whether the cell has biases `b_ih` and `b_hh`. Default: True.
        batch_first (bool): Specifies whether the first dimension of the input is batch_size. Default: False.
        dropout (float, int): If not 0, appends a `Dropout` layer on the outputs of each
            LSTM layer except the last layer. Default: 0. The range of dropout is [0.0, 1.0].
        bidirectional (bool): Specifies whether it is a bidirectional LSTM. Default: False.

    Inputs:
        - **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`).
        - **hx** (tuple) - A tuple of two Tensors (h_0, c_0), both of data type mindspore.float32 or
          mindspore.float16 and shape (num_directions * `num_layers`, batch_size, `hidden_size`).
          The data type of `hx` must be the same as that of `input`.

    Outputs:
        Tuple, a tuple containing (`output`, (`h_n`, `c_n`)).

        - **output** (Tensor) - Tensor of shape (seq_len, batch_size, num_directions * `hidden_size`).
        - **hx_n** (tuple) - A tuple of two Tensors (h_n, c_n), both of shape
          (num_directions * `num_layers`, batch_size, `hidden_size`).

    Examples:
        >>> net = nn.LSTM(10, 12, 2, has_bias=True, batch_first=True, bidirectional=False)
        >>> input = Tensor(np.ones([3, 5, 10]).astype(np.float32))
        >>> h0 = Tensor(np.ones([1 * 2, 3, 12]).astype(np.float32))
        >>> c0 = Tensor(np.ones([1 * 2, 3, 12]).astype(np.float32))
        >>> output, (hn, cn) = net(input, (h0, c0))
    """
    def __init__(self,
                 input_size,
                 hidden_size,
                 num_layers=1,
                 has_bias=True,
                 batch_first=False,
                 dropout=0,
                 bidirectional=False):
        super(LSTM, self).__init__()
        validator.check_value_type("batch_first", batch_first, [bool], self.cls_name)
        validator.check_positive_int(hidden_size, "hidden_size", self.cls_name)
        validator.check_positive_int(num_layers, "num_layers", self.cls_name)
        self.batch_first = batch_first
        self.transpose = P.Transpose()
        self.lstm = P.LSTM(input_size=input_size,
                           hidden_size=hidden_size,
                           num_layers=num_layers,
                           has_bias=has_bias,
                           bidirectional=bidirectional,
                           dropout=float(dropout))
        weight_size = 0
        gate_size = 4 * hidden_size
        num_directions = 2 if bidirectional else 1
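        # The fused P.LSTM primitive consumes a single flat weight tensor. Per layer and
        # direction it holds the input-hidden weights for the 4 gates (gate_size *
        # input_layer_size), the hidden-hidden weights (gate_size * hidden_size), and,
        # when has_bias is set, the bias vectors b_ih and b_hh (2 * gate_size); the loop
        # below sums these contributions.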
        for layer in range(num_layers):
            input_layer_size = input_size if layer == 0 else hidden_size * num_directions
            increment_size = gate_size * input_layer_size
            increment_size += gate_size * hidden_size
            if has_bias:
                increment_size += 2 * gate_size
            weight_size += increment_size * num_directions
        stdv = 1 / math.sqrt(hidden_size)
        w_np = np.random.uniform(-stdv, stdv, (weight_size, 1, 1)).astype(np.float32)
        self.weight = Parameter(initializer(Tensor(w_np), [weight_size, 1, 1]), name='weight')
    def construct(self, x, hx):
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        h, c = hx
        x, h, c, _, _ = self.lstm(x, h, c, self.weight)
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        return x, (h, c)
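

# A minimal NumPy sketch of one LSTM time step, following the equations in the
# docstring above; it is illustrative only and not part of the MindSpore API.
# The name _lstm_step_reference and the per-gate weight arguments are assumptions
# for exposition: the fused P.LSTM primitive consumes one flat weight tensor instead.
def _lstm_step_reference(x_t, h_prev, c_prev, w_x, w_h, b):
    """Compute (h_t, c_t) for one time step on plain NumPy arrays.

    w_x has shape (4 * hidden_size, input_size), w_h has shape
    (4 * hidden_size, hidden_size), and b has shape (4 * hidden_size,),
    with the gates stacked in the order i, f, c-tilde, o.
    """
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    gates = w_x @ x_t + w_h @ h_prev + b
    i, f, g, o = np.split(gates, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell state \tilde{c}_t
    c_t = f * c_prev + i * g                      # c_t = f_t * c_{t-1} + i_t * \tilde{c}_t
    h_t = o * np.tanh(c_t)                        # h_t = o_t * tanh(c_t)
    return h_t, c_t
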
class LSTMCell(Cell):
    r"""
    LSTM (Long Short-Term Memory) layer.

    Applies an LSTM layer to the input.

    There are two pipelines connecting two consecutive cells in an LSTM model: one is the cell state pipeline
    and the other is the hidden state pipeline. Denote two consecutive time steps as :math:`t-1` and :math:`t`.
    Given an input :math:`x_t` at time :math:`t`, a hidden state :math:`h_{t-1}` and a cell
    state :math:`c_{t-1}` of the layer at time :math:`t-1`, the cell state and hidden state at
    time :math:`t` are computed using a gating mechanism. The input gate :math:`i_t` is designed to protect
    the cell from perturbation by irrelevant inputs. The forget gate :math:`f_t` affords protection of the
    cell by forgetting some information from the past, which is stored in :math:`h_{t-1}`. The output gate
    :math:`o_t` protects other units from perturbation by currently irrelevant memory contents. The candidate
    cell state :math:`\tilde{c}_t` is calculated from the current input, and the input gate is applied to it.
    Finally, the current cell state :math:`c_{t}` and hidden state :math:`h_{t}` are computed from the
    calculated gates and cell states. The complete formulation is as follows.

    .. math::
        \begin{array}{ll} \\
            i_t = \sigma(W_{ix} x_t + b_{ix} + W_{ih} h_{(t-1)} + b_{ih}) \\
            f_t = \sigma(W_{fx} x_t + b_{fx} + W_{fh} h_{(t-1)} + b_{fh}) \\
            \tilde{c}_t = \tanh(W_{cx} x_t + b_{cx} + W_{ch} h_{(t-1)} + b_{ch}) \\
            o_t = \sigma(W_{ox} x_t + b_{ox} + W_{oh} h_{(t-1)} + b_{oh}) \\
            c_t = f_t * c_{(t-1)} + i_t * \tilde{c}_t \\
            h_t = o_t * \tanh(c_t) \\
        \end{array}

    Here :math:`\sigma` is the sigmoid function, and :math:`*` is the Hadamard product. :math:`W, b`
    are learnable weights between the output and the input in the formula. For instance,
    :math:`W_{ix}, b_{ix}` are the weight and bias used to transform the input :math:`x` into :math:`i`.
    Details can be found in the papers `LONG SHORT-TERM MEMORY
    <https://www.bioinf.jku.at/publications/older/2604.pdf>`_ and
    `Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling
    <https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/43905.pdf>`_.

    LSTMCell is a single-layer RNN; you can achieve a multi-layer RNN by stacking LSTMCell.

    Args:
        input_size (int): Number of features of the input.
        hidden_size (int): Number of features of the hidden layer.
        has_bias (bool): Whether the cell has biases `b_ih` and `b_hh`. Default: True.
        batch_first (bool): Specifies whether the first dimension of the input is batch_size. Default: False.
        dropout (float, int): If not 0, appends a `Dropout` layer on the outputs of each
            LSTM layer except the last layer. Default: 0. The range of dropout is [0.0, 1.0].
        bidirectional (bool): Specifies whether this is a bidirectional LSTM. If set to True,
            the number of directions is 2; otherwise it is 1. Default: False.

    Inputs:
        - **input** (Tensor) - Tensor of shape (seq_len, batch_size, `input_size`).
        - **h** (Tensor) - Tensor of data type mindspore.float32 or
          mindspore.float16 and shape (num_directions, batch_size, `hidden_size`).
        - **c** (Tensor) - Tensor of data type mindspore.float32 or
          mindspore.float16 and shape (num_directions, batch_size, `hidden_size`).
          The data type of `h` and `c` must be the same as that of `input`.
        - **w** (Tensor) - Tensor of data type mindspore.float32 or
          mindspore.float16 and shape (`weight_size`, 1, 1).
          The value of `weight_size` depends on `input_size`, `hidden_size` and `bidirectional`.

    Outputs:
        A tuple of five Tensors: (`output`, `h_n`, `c_n`, `reserve`, `state`).

        - **output** (Tensor) - Tensor of shape (seq_len, batch_size, num_directions * `hidden_size`).
        - **h_n** (Tensor) - Tensor of shape (num_directions, batch_size, `hidden_size`).
        - **c_n** (Tensor) - Tensor of shape (num_directions, batch_size, `hidden_size`).
        - **reserve** (Tensor) - Reserved for the backend.
        - **state** (Tensor) - Reserved for the backend.

    Examples:
        >>> net = nn.LSTMCell(10, 12, has_bias=True, batch_first=True, bidirectional=False)
        >>> input = Tensor(np.ones([3, 5, 10]).astype(np.float32))
        >>> h = Tensor(np.ones([1, 3, 12]).astype(np.float32))
        >>> c = Tensor(np.ones([1, 3, 12]).astype(np.float32))
        >>> w = Tensor(np.ones([1152, 1, 1]).astype(np.float32))
        >>> output, h, c, _, _ = net(input, h, c, w)
    """
    def __init__(self,
                 input_size,
                 hidden_size,
                 has_bias=True,
                 batch_first=False,
                 dropout=0,
                 bidirectional=False):
        super(LSTMCell, self).__init__()
        self.batch_first = validator.check_value_type("batch_first", batch_first, [bool], self.cls_name)
        self.transpose = P.Transpose()
        self.lstm = P.LSTM(input_size=input_size,
                           hidden_size=hidden_size,
                           num_layers=1,
                           has_bias=has_bias,
                           bidirectional=bidirectional,
                           dropout=float(dropout))
    def construct(self, x, h, c, w):
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        # P.LSTM also returns two backend-reserved outputs, which are passed through
        # to the caller rather than discarded (the original `_, _` unpacking returned
        # the same variable twice).
        x, h, c, reserve, state = self.lstm(x, h, c, w)
        if self.batch_first:
            x = self.transpose(x, (1, 0, 2))
        return x, h, c, reserve, state
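

# A small helper, illustrative only (the name _lstm_weight_size is an assumption,
# not part of the public API), that reproduces the flat weight-size computation from
# LSTM.__init__ above. For the LSTMCell example (input_size=10, hidden_size=12,
# unidirectional, with bias) it gives 4 * 12 * (10 + 12 + 2) = 1152, matching the
# shape of `w` in the docstring.
def _lstm_weight_size(input_size, hidden_size, num_layers=1, has_bias=True, bidirectional=False):
    gate_size = 4 * hidden_size
    num_directions = 2 if bidirectional else 1
    weight_size = 0
    for layer in range(num_layers):
        input_layer_size = input_size if layer == 0 else hidden_size * num_directions
        increment_size = gate_size * (input_layer_size + hidden_size)  # W_ih and W_hh blocks
        if has_bias:
            increment_size += 2 * gate_size                            # b_ih and b_hh
        weight_size += increment_size * num_directions
    return weight_size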