
session_basic.h 15 kB

Commit history:
adapte to remove inline
merge me commit for remove inline
deal witch multiple cases of switch in ConstructKernelGraph
deal with switch and call cases in ConstructKernelGraph
fix bug and rebase master ConstructKernelGraph adapte to remove inline
fix InsertMultipleAssignToGraph bug
add graph input to new graph which is created for switch input
replace CreateNewParameterFromCNode to NewParameter in order to set new parameter's abstract and kernel_info
avoids create a new switch repeatedly when the cnode is a call
switch without real input null pointer check
update frontend code
Revert "update frontend code" This reverts commit ce1f600d1e9b4b47d9b81122f981bbbe505dd250.
update frontend code PR_2948
fix bug of CheckLabalIndex
handle switch_layer in ConstructKernelGraph
add attr for assign node to avoid erasing by cse pass
cherry-pick ms commit[59b35f690ddcc94ff35a4f4eaf3816121b32235b]:temporary avoid list getitem problem
rebase master
Revert "cherry-pick ms commit[59b35f690ddcc94ff35a4f4eaf3816121b32235b]:temporary avoid list getitem problem" This reverts commit 74c258f94260ca0769a1ef69c6ef8e831c301dbf.
Revert "handle switch_layer in ConstructKernelGraph" This reverts commit cb5367f02d69facbca8d39e9234c501608aee27f.
Revert "update frontend code PR_2948" This reverts commit 234ac583400a96a8ddd641f7a722e1ccd5e056c6.
Revert "merge me commit for remove inline" This reverts commit 55c0ebd42b6699c7686f5ce585e745f87dd42280.
fix diff after rebase master
doing remove inline in me
overwrite FindNodePrimitive
Revert "doing remove inline in me" This reverts commit b42e893125bc624d323e855ac6ae615333c06e65.
/**
 * Copyright 2019-2020 Huawei Technologies Co., Ltd
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
#ifndef MINDSPORE_CCSRC_BACKEND_SESSION_SESSION_BASIC_H
#define MINDSPORE_CCSRC_BACKEND_SESSION_SESSION_BASIC_H

#include <vector>
#include <string>
#include <unordered_map>
#include <utility>
#include <memory>
#include <map>
#include <set>
#include "backend/session/session_context.h"
#include "backend/session/kernel_graph.h"
#include "backend/session/anf_runtime_algorithm.h"
#include "ir/anf.h"
#include "ir/tensor.h"
#include "utils/any.h"
#include "utils/contract.h"
#include "runtime/device/kernel_info.h"
#include "utils/ms_context.h"
#include "runtime/device/bucket.h"
#if !defined(_WIN32) && !defined(_WIN64)
#include "debug/debugger/debugger.h"
#endif

namespace mindspore {
using GraphId = uint32_t;
using GraphInfo = std::string;
namespace session {
using CallBackFunc = uint32_t (*)(uint32_t graph_id,
                                  const std::map<std::string, mindspore::tensor::TensorPtr> &params_list);
using AnyList = std::vector<Any>;
using AnyListPtr = std::shared_ptr<AnyList>;

struct OpRunInfo {
  std::string op_name;
  PrimitivePtr primitive;
  AbstractBasePtr abstract;
  bool is_dynamic_shape = false;
  bool is_auto_mixed_precision = false;
  std::string next_op_name = "";
#if defined(__APPLE__)
  int next_input_index = 0;
#else
  size_t next_input_index = 0;
#endif
};

struct InputTensorInfo {
  std::vector<tensor::TensorPtr> input_tensors;
  std::vector<int64_t> input_tensors_mask;
  std::set<KernelWithIndex> input_kernel;
};

struct OutputTensorInfo {
  tensor::TensorPtr output_stub_tensor;
  bool is_weight;
};

using OpRunInfoPtr = std::shared_ptr<OpRunInfo>;
using KernelMapTensor = std::map<session::KernelWithIndex, BaseRef, session::KernelWithIndexCmp>;

class Executor;

class SessionBasic : public std::enable_shared_from_this<SessionBasic> {
 public:
  SessionBasic() : context_(nullptr), summary_callback_(nullptr), device_id_(0) {
#if !defined(_WIN32) && !defined(_WIN64)
    debugger_ = nullptr;
#endif
  }

  virtual void Init(uint32_t device_id) { device_id_ = device_id; }
  void InitExecutor(const std::string &device_name, uint32_t device_id);
  virtual void SyncStream() {}
  virtual ~SessionBasic() { summary_callback_ = nullptr; }

  GraphId CompileGraph(const GraphSegmentPtr &segment, const AnfNodePtrList &outputs);
  GraphId CompileGraph(NotNull<FuncGraphPtr> func_graph);
  void BuildGraph(GraphId graphId);
  void RunGraph(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs, VectorRef *outputs);
  void RunGraphAsync(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs, VectorRef *outputs);
  void RunOp(OpRunInfo *, const GraphInfo &, std::vector<tensor::TensorPtr> *input_tensors, VectorRef *outputs,
             const std::vector<int64_t> &tensors_mask);
  void RunOpsInGraph(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs, VectorRef *outputs);
  virtual void RegisterSummaryCallBackFunc(const CallBackFunc &callback);
  bool CreateCNodeOfKernelGraph(const AnfNodePtr &node, KernelGraph *graph);
  std::shared_ptr<KernelGraph> ConstructKernelGraph(const AnfNodePtrList &lst, const AnfNodePtrList &outputs);
  std::shared_ptr<KernelGraph> ConstructKernelGraph(const FuncGraphPtr &func_graph,
                                                    std::vector<KernelGraphPtr> *all_out_graph);
  CNodePtr CreateNewCNode(const CNodePtr &cnode, KernelGraph *graph,
                          std::unordered_map<AnfNodePtr, AnfNodePtr> *other_graph_cnode);
  CNodePtr CreateNewCNode(const CNodePtr &cnode, KernelGraph *graph);
  // Get the graph id in child graphs by the ME front-end anf node pointer.
  virtual GraphId GetGraphIdByNode(const AnfNodePtr &) const;
  virtual GraphId GetFinalRunGraph() const { return kInvalidGraphId; }
  void AssignParamKey(const KernelGraphPtr &kernel_graph);
  void InitPSParamAndOptim(const KernelGraphPtr &kernel_graph, const std::vector<tensor::TensorPtr> &inputs_const);
  bool IsGetNextGraph(const std::shared_ptr<KernelGraph> &kernel_graph, std::string *channel_name);
  virtual bool CheckModelInputs(uint32_t graph_id, const std::vector<tensor::TensorPtr> &inputs,
                                std::string *error_msg) const {
    return true;
  }
  void GetModelInputsInfo(uint32_t graph_id, std::vector<tensor::TensorPtr> *inputs,
                          std::vector<std::string> *inputs_name) const;
  void GetModelOutputsInfo(uint32_t graph_id, std::vector<tensor::TensorPtr> *outputs,
                           std::vector<std::string> *outputs_name) const;
  std::vector<tensor::TensorPtr> GetInputNeedLockTensors(const GraphId &graph_id,
                                                         const std::vector<tensor::TensorPtr> &inputs);
  // Get a graph by graph id; returns nullptr if the graph does not exist.
  KernelGraphPtr GetGraph(GraphId graph_id) const;
  void ClearGraph();
  // Create a single-op run graph.
  std::shared_ptr<KernelGraph> ConstructSingleOpGraph(const OpRunInfo &op_run_info,
                                                      const std::vector<tensor::TensorPtr> &input_tensors,
                                                      const std::vector<int64_t> &tensors_mask, bool is_ascend = false);
  void EraseValueNodeTensor(const std::vector<int64_t> &tensors_mask, std::vector<tensor::TensorPtr> *input_tensors);
  void RunOpRemoveNopNode(const KernelGraphPtr &kernel_graph) const;
  void RunOpHideNopNode(const KernelGraphPtr &kernel_graph) const;
#ifdef ENABLE_DEBUGGER
  // Set the debugger.
  void SetDebugger() {
    debugger_ = Debugger::GetInstance();
    auto ms_context = MsContext::GetInstance();
    MS_EXCEPTION_IF_NULL(ms_context);
    debugger_->Init(device_id_, ms_context->get_param<std::string>(MS_CTX_DEVICE_TARGET));
  }
#endif

 private:
  CNodePtr CreateSwitchInput(const CNodePtr &cnode, const AnfNodePtr &node_input, KernelGraph *graph);
  std::vector<AnfNodePtr> CreateSwitchOrPartialNode(const CNodePtr &cnode, KernelGraph *graph);
  std::vector<AnfNodePtr> CreateValueNode(const CNodePtr &cnode, KernelGraph *graph);
  void CreateCNodeInputs(const CNodePtr &cnode, KernelGraph *graph, std::vector<AnfNodePtr> *cnode_inputs);
  std::vector<AnfNodePtr> CreateCallSwitchInputs(const CNodePtr &cnode, KernelGraph *graph);
  void GetCNodeInfo(const CNodePtr &cnode, std::vector<AnfNodePtr> *cnode_inputs) const;
  void GetNewCNodeInputs(const CNodePtr &cnode, KernelGraph *graph, std::vector<AnfNodePtr> *cnode_inputs,
                         std::unordered_map<AnfNodePtr, AnfNodePtr> *other_graph_cnode);
  std::vector<AnfNodePtr> CreateCallSwitchLayerInputs(const CNodePtr &cnode, KernelGraph *graph);
  void ProcessNodeRetFunc(const CNodePtr &cnode, KernelGraph *graph, const std::vector<AnfNodePtr> &real_inputs);
  void HandleInternalOutput(const AnfNodePtr &input_front_node, const AnfNodePtr &backend_node,
                            const FuncGraphManagerPtr &front_func_graph_manager,
                            const std::shared_ptr<KernelGraph> &backend_graph);
  std::string AddPartialParametersMap(const FuncGraphManagerPtr &front_func_graph_manager,
                                      const AnfNodePtr &partial_node);

 protected:
  friend class Executor;
  friend class CompileNodesTask;
  friend class CompileGraphTask;
  friend class BuildGraphTask;
  friend class RunGraphTask;
  friend class RunOpTask;
  friend class RunOpsInGraphTask;
  virtual bool IsSupportSummary() { return true; }
  virtual void CreateOutputTensors(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &input_tensors,
                                   VectorRef *outputs,
                                   std::map<tensor::TensorPtr, session::KernelWithIndex> *tensor_to_node);
  virtual void UpdateOutputTensors(const VectorRef *outputs,
                                   const std::map<tensor::TensorPtr, session::KernelWithIndex> &tensor_to_node);
  virtual void UnifyMindIR(const KernelGraphPtr &graph) {}
  virtual void FinalOptimize(const KernelGraphPtr &graph) const;
  virtual GraphId CompileGraphImpl(const AnfNodePtrList &lst, const AnfNodePtrList &outputs) { return 0; }
  virtual GraphId CompileGraphImpl(NotNull<FuncGraphPtr> func_graph) { return kInvalidGraphId; }
  virtual void BuildGraphImpl(GraphId) {}
  virtual void PreExecuteGraph(const std::shared_ptr<KernelGraph> &kernel_graph,
                               const std::vector<tensor::TensorPtr> &inputs, VectorRef *const outputs) {}
  virtual void PostExecuteGraph(const std::shared_ptr<KernelGraph> &kernel_graph,
                                const std::vector<tensor::TensorPtr> &inputs, VectorRef *const outputs) {}
  virtual void ExecuteGraph(const std::shared_ptr<KernelGraph> &kernel_graph) {}
  void RunGraphImpl(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs, VectorRef *outputs);
  virtual void BuildOpImpl(const OpRunInfo &op_run_info, const GraphInfo &graph_info,
                           const std::vector<tensor::TensorPtr> &input_tensors,
                           const std::vector<int64_t> &tensors_mask) {}
  virtual void RunOpImpl(const GraphInfo &graph_info, OpRunInfo *op_run_info,
                         std::vector<tensor::TensorPtr> *input_tensors, VectorRef *outputs,
                         const std::vector<int64_t> &tensors_mask) {}
  void RunOpsInGraphImpl(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs, VectorRef *outputs);
  virtual void BuildOpsInGraph(const GraphId &graph_id, const std::map<AnfNodePtr, size_t> &parameter_index,
                               const std::vector<tensor::TensorPtr> &graph_inputs,
                               const std::map<KernelWithIndex, size_t> &cnode_refcount) {}
  virtual void SetSummaryNodes(KernelGraph *graph);
  void LoadInputs(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &inputs_const) {
    auto kernel_graph = GetGraph(graph_id);
    MS_EXCEPTION_IF_NULL(kernel_graph);
    if (!kernel_graph->executable()) {
      return;
    }
    MS_LOG(INFO) << "Load inputs";
    LoadInputData(kernel_graph, inputs_const);
  }
  virtual void LoadInputData(const std::shared_ptr<KernelGraph> &kernel_graph,
                             const std::vector<tensor::TensorPtr> &inputs_const) const {}
  void UpdateOutputs(const std::shared_ptr<KernelGraph> &kernel_graph, VectorRef *const outputs,
                     const std::vector<tensor::TensorPtr> &input_tensors) const;
  void UpdateOutputAbstract(const std::shared_ptr<KernelGraph> &kernel_graph, OpRunInfo *op_run_info) const;
  void Summary(KernelGraph *graph);
  // Create the graph output for RunOp.
  void CreateOutputNode(const CNodePtr &cnode, const std::shared_ptr<KernelGraph> &graph);
  CNodePtr ConstructOutput(const AnfNodePtrList &outputs, const std::shared_ptr<KernelGraph> &graph);
  // Generate the graph info for a single-op graph.
  GraphInfo GetSingleOpGraphInfo(const CNodePtr &kernel, const std::vector<tensor::TensorPtr> &input_tensors);
  void GetSingleOpRunInfo(const CNodePtr cnode, OpRunInfo *run_info);
  tensor::TensorPtr GetValueNodeOutputTensor(const AnfNodePtr &node, size_t output_index);
  tensor::TensorPtr GetParameterOutputTensor(const AnfNodePtr &node,
                                             const std::map<AnfNodePtr, size_t> &parameter_index,
                                             const std::vector<tensor::TensorPtr> &graph_inputs);
  tensor::TensorPtr GetCNodeOutputTensor(const KernelWithIndex &kernel_with_index,
                                         const std::map<KernelWithIndex, tensor::TensorPtr> &op_output);
  void GetOpInputTensors(const CNodePtr &cnode, const std::map<KernelWithIndex, tensor::TensorPtr> &op_output,
                         const std::map<AnfNodePtr, size_t> &parameter_index,
                         const std::vector<tensor::TensorPtr> &graph_inputs, InputTensorInfo *input_tensor_info);
  // Create a new kernel graph and update the graph sum.
  KernelGraphPtr NewKernelGraph();
  AnfNodePtr CreateParameterFromTuple(const AnfNodePtr &node, KernelGraph *graph);
  virtual ParameterPtr CreateNewParameterFromParameter(const AnfNodePtr &anf, KernelGraph *graph);
  ValueNodePtr CreateValueNodeKernelGraph(const AnfNodePtr &anf, KernelGraph *graph);
  ParameterPtr CreateNewParameter(const AnfNodePtr &anf, KernelGraph *graph);
  AnfNodePtr CreateNewParameterFromCNode(const AnfNodePtr &anf, KernelGraph *graph);
  void AddParameterToGraphInputs(const std::vector<AnfNodePtr> &parameters, KernelGraph *graph);
  void InitInternalOutputParameter(const AnfNodePtr &out_node, const AnfNodePtr &parameter);
  AnfNodePtr FindPullNode(const AnfNodePtr &push_node, const std::vector<AnfNodePtr> &node_list);
  void UpdateGraphDynamicShapeAttr(const NotNull<KernelGraphPtr> &root_graph);
  void UpdateAllGraphDynamicShapeAttr(const std::vector<KernelGraphPtr> &all_graphs);
  virtual std::shared_ptr<device::Bucket> CreateBucket(uint32_t bucket_id, uint32_t bucket_size) { return nullptr; }
  void InitAllBucket(const KernelGraphPtr &graph);
  void AddGradAddrToBucket(const GraphId &graph_id, const std::vector<tensor::TensorPtr> &grad_tensor);
  void ClearAllBucket(const GraphId &graph_id);
  std::vector<uint32_t> GetAllReduceSplitIndex();
  virtual std::string GetCommWorldGroup() { return std::string(); }
  void DumpGraph(const std::shared_ptr<KernelGraph> &kernel_graph);
#if (ENABLE_CPU && !_WIN32)
  void CheckPSModeConsistence(const KernelGraphPtr &kernel_graph) const;
  void GetBatchElements(const AnfNodePtr &kernel_node) const;
  void InitPsWorker(const KernelGraphPtr &kernel_graph);
#endif

  std::map<uint32_t, std::vector<std::shared_ptr<device::Bucket>>> bucket_map_;
  std::map<uint32_t, uint32_t> free_bucket_id_map_;
  std::unordered_map<GraphId, std::shared_ptr<KernelGraph>> graphs_;
  std::unordered_map<GraphInfo, std::shared_ptr<KernelGraph>> run_op_graphs_;
  std::unordered_map<FuncGraph *, KernelGraphPtr> front_backend_graph_map_;
  std::unordered_map<AnfNodePtr, AnfNodePtr> partial_parameters_map_;
  std::unordered_map<AnfNodePtr, std::string> partial_target_map_;
  std::shared_ptr<Context> context_;
  CallBackFunc summary_callback_;
  static GraphId graph_sum_;
  uint32_t device_id_;
  std::shared_ptr<Executor> executor_;
#if !defined(_WIN32) && !defined(_WIN64)
  std::shared_ptr<Debugger> debugger_;
#endif
#if (ENABLE_CPU && !_WIN32)
  bool initialized_ps_cache_{false};
#endif
};

using SessionPtr = std::shared_ptr<session::SessionBasic>;
using NamedSummaryOutputs = std::map<std::string, std::pair<AnfNodePtr, int>>;
}  // namespace session

void DumpGraphExeOrder(const std::string &file_name, const std::string &target_dir,
                       const std::vector<CNodePtr> &execution_order);
}  // namespace mindspore
#endif  // MINDSPORE_CCSRC_BACKEND_SESSION_SESSION_BASIC_H