mindspore2022

Commit Graph

Author	SHA1	Message	Date
zengzitao	28ab0a963a	fix omp num_threads by using get_max_threads	4 years ago
i-robot	cdb618984f	!26832 Support ValueNode inputs json generation in CollectFusedJsonWithSingleKernel Merge pull request !26832 from zichun_ye/akg_json_build	4 years ago
zengzitao	62458b5636	adapt graph kernel for cpu	4 years ago
Zichun Ye	6e4c5b8e49	support value_node inputs in CollectFusedJsonWithSingleKernel	4 years ago
dayschan	2038295a25	Decouple GraphKernelCluster from ME backend. Changed the callback function GetProcessorFromContext to GetTargetFromContext so that it can be used to filter the clusterable op list, and added GetProcessorByTarget to AkgKernelJsonGenerator. Moved the functions IsKeepBasicNode, GetValidOps, and OpListFilter from graph_kernel_helper to graph_kernel_utils, and combined GetValidOps and OpListFilter. Decoupled the pass getitem_tuple from "optimizer/common/helper.h" by deleting the check of the input size; cnode->input(i) also checks the input index.	4 years ago
Zichun Ye	996e7c39b3	support generating json for ops with multi outputs in akg; fix typo from code check; fix namespace error; add CollectFusedJsonWithSingleKernel to generate json for custom op; fix typo; fix typo; drop changes in CreateOutputsJson	4 years ago
dayschan	7cc4e170cc	decouple akg_kernel_json_generator	4 years ago
looop5	58e27d87bc	add Custom, custom_op_info_register, CustomRegOp to __init__	4 years ago
dayschan	cbb84ff580	Move IsRealKernel and IsRealCNodeKernel from AnfAlgo to AnfUtils. The functions IsOneOfPrimitive and IsOneOfPrimitiveCNode are useful, so we can move them into anf.cc.	4 years ago
dayschan	da08e33af8	Add GraphKernelCallback functions, and call them in AkgKernelJsonGenerator. Only the GET functions are implemented now. Remove the calls to AnfAlgo's GET functions for node info from AkgKernelJsonGenerator. Also, fix a bug in pass reorder_ops, which set attrs for the same prim::Cast primitive in different CNodes.	4 years ago
i-robot	166a54fef5	!25475 Decouple AkgKernelJsonGenerator from MS backend (part 1) Merge pull request !25475 from DeshiChen/1025_genjson	4 years ago
Yang Jiao	cacadae241	fix static check	4 years ago
dayschan	05f0bd950f	Decouple AkgKernelJsonGenerator from MS backend (step 1) * move GetInputTensorValue from common_utils to json_generator * get dtype size by `Number.nbits()` instead of `GetDtypeNbyte` map. * manually get attr from anfnode, instead of `AnfAlgo::GetNodeAttr` * replace `AnfAlgo::GetCNodePrimitive` with `GetCNodePrimitive` in anf.cc * it's not used to judge `AnfAlgo::IsRealKernel` in inner function. cleancode jobs: * remove the `Clean` function from AkgKernelJsonGenerator * delete the json key "id", to delete the mutex in AkgKernelJsonGenerator	4 years ago
dayschan	f3e8923909	Add akg_kernel_json_generator to namespace mindspore::graphkernel	4 years ago
dayschan	6a26d7f6d9	Move TypeId2String from kernel_compiler/ to ir/dtype_extends.cc. Renamed the function to "TypeIdToString", and use the Type::ToString() function instead of the TypeId-String map. Changed DtypeToTypeId together with it; the original StringToType can be used. Added a new interface StringToTypeId.	4 years ago
i-robot	aa63062595	!25082 Set akg kernel attrs in backend pass Merge pull request !25082 from DeshiChen/1015_set_kernel_attr	4 years ago
lingyunli63	e4173e1a1c	akg cache for gpu/cce/cpu	4 years ago
dayschan	6600c0c474	Set akg kernel attrs in backend pass. It's unreasonable to change the node when generating kernel json; instead, the attrs should be set in a pass. Most of the operators in the original akg_kernel_attrs_process are no longer used, so we deleted them, leaving only "Cast" and "MatMul/BatchMatMul".	4 years ago
i-robot	8e23867c30	!23984 Custom operator supports tbe dsl Merge pull request !23984 from looop5/custom_commit	4 years ago
looop5	46789f260a	Custom operator supports tbe dsl	4 years ago
ckey_Dou	d91ff90d96	fix cleancode error	4 years ago
Yang Jiao	2d83f0e9ef	fix static-check	4 years ago
dayschan	7502345c8f	Add GraphKernelFlags into namespace mindspore::graphkernel	4 years ago
dayschan	32ecd8ee79	GraphKernel supports CPU. Only Linux systems are supported now. Change the default value of `ENABLE_AKG` to off, controlled by option `-K`. `ENABLE_AKG` is automatically enabled when `ENABLE_GPU` or `ENABLE_D` is on. From now on, `ENABLE_AKG` controls the compilation of graphkernel and akg code. Fix the usage description for option "-K"; it should be "[-K on|off]". LLVM is required by akg for cpu kernels, so AKG for cpu is disabled by default for now.	4 years ago
ckey_Dou	80e6ea96f6	fix the bug of overriding of compute_capability_	4 years ago
ckey_Dou	ea9f8ac164	fix pclint-plus and codedex	4 years ago
zengzitao	7d6c6b17bb	fix master warnings	4 years ago
baihuawei	6eec288c39	opt ascend single op mode runtime code	4 years ago
ms_yan	36a8886ca2	Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset". This reverts commit `b077aa1cab`. Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset". This reverts commit `4e6f7dc97d`. delete pass_registry_test.cc; comment hiai_nlu_model_multi.pb related line	4 years ago
djc	b077aa1cab	[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset	4 years ago
djc	4e6f7dc97d	[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset	4 years ago
dayschan	df7dd0fcf3	restrain the INFO log in graphkernel	4 years ago
looop5	5e0bc410b3	fix workspace address type	4 years ago
i-robot	20167a8012	!21412 Clean the unused code and enhance the release strategy for shared memory Merge pull request !21412 from chengbin/master	4 years ago
i-robot	a76e683dc0	!21653 GraphKernel CleanCode For master Merge pull request !21653 from ZengZitao/gk_fix_waring_master	4 years ago
zengzitao	b8095efcab	gk fix warnings in master	4 years ago
ckey_Dou	228f03864e	1. enhance the deletion of shared memory 2. delete unused function	4 years ago
lby	a5029f061c	ascend kernel build refactor	4 years ago
i-robot	19362a482b	!21364 BUGFIX: using the kernel_name to generate hash_id Merge pull request !21364 from chengbin/bug_fix	4 years ago
i-robot	6edcee8720	!20230 set workspace info Merge pull request !20230 from looop5/workspace	4 years ago
ckey_Dou	97e3951bcd	using the kernel_name to generate the hash_id	4 years ago
looop5	2fcf970f69	deal with workspace deal with workspace on cuda	4 years ago
ckey_Dou	d293c5eb26	using kernel pool to share the compiling results when running on multi cards	4 years ago
dayschan	137608b518	Add LiteGraph for graphkernel. Add a subdirectory "model" in "backend/optimizer/graph_kernel" for litegraph. Implement two interfaces, "AnfGraph2LiteGraph" and "LiteGraph2AnfGraph". The litegraph will be the base data structure when we migrate the GraphKernel code from python ("mindspore/_extends/graph_kernel") to c++.	4 years ago
dayschan	3ab53dd26d	Send compilation attrs to akg. 1. Add a new message type "AKG/ATTR" in AkgKernelBuilder; the attrs are sent before the kernel infos. 2. Send "online_tuning" attribute when the flag is not zero, but error occurs in the latest akg submodule. 3. Send "repository_path" attribute when the flag is not empty. 4. Add a new value "compute_capability" into kernel info when the processor is GPU.	4 years ago
caifubi	2aa66cbc27	Add uniqueName for data dump	4 years ago
i-robot	6bd66b3d70	!18729 Add register limit constraint Merge pull request !18729 from lishanni/for_regis_limit	4 years ago
lishanni513	8c98146c76	Add register limit constraint	4 years ago
lingyunli63	a995bea507	recompute_fuse	4 years ago
dayschan	2ac8c65327	Add GraphKernelPassManager to manage the passes of GraphKernel. Refactor the original "PassManager" class, and derive the "GraphKernelPassManager" from it. GraphKernel's ir files are dumped into a new sub-directory "graph_kernel" in the original "verbose_ir_files". All GraphKernel's passes are divided into 3 levels, controlled by the flag "opt_level" by default: when the opt_level is greater than or equal to the pass's level, the pass will run. The default "opt_level" is 2 when GraphKernel is enabled. Levels: 1. Basic features, like cluster, splitter, and some preprocess, postprocess. 2. All stable features, mainly the optimization passes. 3. Experimental features, like stitch-fusion, parallel-fusion. The two flags "enable_pass" and "disable_pass" are available in this commit. Users can manually enable passes that are disabled by "opt_level", or disable enabled passes, by specifying each pass in this format: "stage_id.pass_id" or "stage_name.pass_name"; multiple passes are separated by commas (","). The stage/pass index and stage/pass name can be found in the ir filename, e.g. "--enable_pass=cluster.graph_kernel_expander,1.1,1.2". Others: 1. The pass "tensor_promotion" is not useful; remove it. 2. Put the pass "InsertPadOps" before "ArithmeticSimplify".	4 years ago
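
The pass-selection rule described in commit 2ac8c65327 can be sketched roughly as follows. This is an illustrative Python model, not the MindSpore C++ implementation: the `PassInfo` type, the `parse_pass_list` and `should_run` helpers, the numeric stage/pass ids, and the stage and pass named "opt" and "stitch_fusion" are all hypothetical. The commit message also does not say which flag wins when a pass is listed in both "enable_pass" and "disable_pass"; the sketch assumes "disable_pass" takes precedence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassInfo:
    """Hypothetical description of one pass inside one stage."""
    stage_id: int
    stage_name: str
    pass_id: int
    pass_name: str
    level: int  # 1 = basic, 2 = stable, 3 = experimental


def parse_pass_list(flag_value):
    """Split a comma-separated flag value into a set of selectors."""
    return {item.strip() for item in flag_value.split(",") if item.strip()}


def should_run(p, opt_level, enable_pass=frozenset(), disable_pass=frozenset()):
    """Decide whether pass `p` runs under the given flags."""
    # A pass can be addressed either by index or by name.
    selectors = {
        f"{p.stage_id}.{p.pass_id}",
        f"{p.stage_name}.{p.pass_name}",
    }
    if selectors & set(disable_pass):  # assumption: disable wins over enable
        return False
    if selectors & set(enable_pass):
        return True
    # Default rule from the commit message: run when opt_level >= pass level.
    return opt_level >= p.level


# Hypothetical passes; ids and levels are made up for illustration.
expander = PassInfo(1, "cluster", 1, "graph_kernel_expander", 1)
stitch = PassInfo(2, "opt", 0, "stitch_fusion", 3)
```

With the default `opt_level` of 2, `should_run(expander, 2)` is true and `should_run(stitch, 2)` is false; passing `enable_pass=parse_pass_list("2.0")` turns the experimental pass on without raising the level for every other pass.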

103 Commits (a64920c46802a3030bfdf8754870ffb1e5c6ba78)