The basic idea is to exploit data dependencies to control the execution order
of side-effect operations while keeping the semantics of ANF unchanged.
The ControlDepend primitive is removed and two new primitives are added:
1. UpdateState:
```
a = Assign(para, value)
```
becomes:
```
a = Assign(para, value, u)
u' = UpdateState(u, a)
```
2. Load:
```
x = Add(para, value)
```
becomes:
```
p = Load(para, u)
x = Add(p, value)
u' = UpdateState(u, p)
```
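To illustrate how the threaded state serializes side effects, here is a sketch in the same notation (the variable names `v1`, `v2`, `a`, `b`, `p` are chosen for illustration only). A sequence with two writes and a read in between:
```
a = Assign(para, v1)
x = Add(para, v2)
b = Assign(para, x)
```
would be rewritten as:
```
a = Assign(para, v1, u)
u' = UpdateState(u, a)
p = Load(para, u')
u'' = UpdateState(u', p)
x = Add(p, v2)
b = Assign(para, x, u'')
u''' = UpdateState(u'', b)
```
Each operation consumes the latest state and produces the next, so the chain u → u' → u'' → u''' fixes the order write → read → write through ordinary data dependencies, with no ControlDepend edges.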
align fuse_ops_fusion
align composite_ops_fusion
unify ops table
Init new_code's kernel_info with orig_node's kernel_info in function NewCNodeWithInfo
enable running BERT
add pass tensor_promotion
add macro for bias_add and bias_add_grad in expander pass
exclude unused attrs in primitive compare for GraphKernelCSE
exclude fusion_type in kernelinfo cmp for cse in graphkernel
check processor
remove graph kernel pass before select kernel
recover run_standalone_pretrain_ascend.sh
remove is_before_kernel_select
move add_atomic_clean from pass directory to graph_kernel directory
update fused-op list in the Ascend backend
1. Update akg submodule.
2. Refactor akg_kernel_build, akg_ascend_kernel_build, akg_gpu_kernel_build.
3. Add akg_kernel_json_decoder to support converting kernel_json to AnfNode.
4. Add GraphKernel Cost Model. (mindspore/_extends/graph_kernel)
5. Add some GraphKernel passes to GpuSession, move these passes to backend/optimizer/graph_kernel.
6. Add global id for ir files.
7. Fix bug in ConstInputToAttr.