In the last commit (015d7354c7), I deleted the check on whether different ValueNode have
same tensor value, but forgot the situation that several nodes use the same ValueNode,
in this case, the function will create several parameter for the same ValueNode, but all
ValueNode is replaced with the first parameter, and the remaining parameters are not used.
This will result in a "parameter has no user" error.
Use a std::set for the ValueNodes can resolve this problem.
Robin-hood-hashing (https://github.com/martinus/robin-hood-hashing)
is considered faster then std::unordered_map/set,
so we use it to improve mindspore performance.
1. robin_hood head file in `third_party/robin_hood/include`;
2. In `utils/hash_map.h` and `utils/hash_set.h`, we define:
- mindspore::HashMap as an alias of robin_hood::unordered_map;
- mindspore::HashSet as an alias of robin_hood::unordered_set;
3. Replace:
- `#include <unordered_map>` --> `#include "utils/hash_map.h"`;
- `#include <unordered_set>` --> `#include "utils/hash_set.h"`;
- `std::unordered_map` --> `mindspore::HashMap`;
- `std::unordered_set` --> `mindspore::HashSet`;
- `map.insert(std::pair(key, value))` --> `map.emplace(key, value)`;
- `[] (const std::pair<K, V> &p) {..} ` --> `[] (const auto &p) {..} `;
4. Fix issues found by switch to robin_hood:
- AnfNodeConfig hash and equal;
- Fix a bug in `Slice::operator==()`;
- Fix a bug in `CNode::HasPrimalAttr()`;
- Fix map.erase() usage bugs: `map.erase(iter++)` --> `iter = map.erase(iter)`;
- Fix some iterator invalidated problem;
5. Some std::unordered_map/set can not replace by robin_hood:
- As parameter of functions that exposed to python by pybind11;
- Use bad hash that cause robin_hood::map over_flow, such as AbstractBasePtrListHasher;
6. Update cpp unit tests;
7. Add build option '-F' to enable robin_hood, default on.
Changed the callback function GetProcessorFromContext to GetTargetFromContext,
so that we can use it to filter the clusterable op list, added a GetProcessorByTarget into AkgKernelJsonGenerator.
Moved the function IsKeepBasicNode, GetValidOps, OpListFilter from graph_kernel_helper
to graph_kernel_utils. combined the GetValidOps and OpListFilter.
Decoupled the pass getitem_tuple from "optimizer/common/helper.h", by deleting the checking
of input size. cnode->input(i) also checks the input index.
1. move functions from graph_kernel_helper.cc to graph_builder.cc:
the EliminateMakeTuple, implemented with SpreadTuples.
the ConvertNonscalarTensorToParameter, remove checking the equal tensor.
the IsTupleOutput (original IsMakeTupleOut), use recursion.
the CreateNewFuseCNode, remove the "output" argument; call SetNewKernelInfo in it.
the ReplaceNewFuseCNode,
the BuildSingleGraphFromNodes (original MixedNodesTransToGraph)
the ReplaceNodesWithGraphKernelNode (original FuseNodesToSubGraph)
2. create graph_kernel_utils.cc.
the ExtractGraphKernelName and SpreadTuples was moved to the file.
3. add SetNewKernelInfo to the callback functions.