yuximiao
e99c0a48e6
Support starting the profiler in the middle of training.
4 years ago
casgj
b15c09db6d
Fix file name and field type changes generated by HCCL.
4 years ago
i-robot
617485fa0b
!25838 Average the operator time data and exclude the first step
Merge pull request !25838 from zangqx/profiling_gpu_permission
4 years ago
yelihua
cfa8c7a0e8
update OWNERS
4 years ago
臧庆香
235f655325
Get the average and remove the first step
4 years ago
casgj
5af0365f73
clean code for profiler.
4 years ago
huangbingjian
d7e97dd74a
change EXCEPTION level to CRITICAL level
4 years ago
casgj
383746e323
Fix the error where the operation name is null in aicore files.
4 years ago
gaojing
e4b5d77b8e
fix the wrong value of average flops.
4 years ago
臧庆香
66e3775493
timeline description
4 years ago
casgj
7c9b45f373
Fix the error where the generated memory-related files are missing in the profiler.
4 years ago
i-robot
bc521674cc
!23473 modify the master build alarm
Merge pull request !23473 from zangqx/profiling_gpu_permission
4 years ago
i-robot
75e7a5ebdc
!23482 Deal with the case where timeline 8001, which shows all communication operators, lacks part of the communication operator time in the profiler.
Merge pull request !23482 from zangqx/master_gaojing2
4 years ago
臧庆香
ec03a76c66
modify the master build alarm
4 years ago
zangqx
09a0392540
Deal with the case where the timeline operators lack part of the communication operator in the profiler.
4 years ago
yanghaitao1
177f3f75bf
remove profiler if compiled with -s on
4 years ago
i-robot
5fa582fef3
!23159 adjust the process node to which the HostCpuOps belongs
Merge pull request !23159 from zangqx/profiling_gpu_permission
4 years ago
臧庆香
c4731c3efa
Adjust the process node to which the operator belongs
4 years ago
i-robot
b1be8dfd31
!22851 MD Profiling: Update Connector Init to Remove any existing file
Merge pull request !22851 from cathwong/ckw_mon_seq_pipelines_fix
4 years ago
i-robot
9886c07c1c
!22459 Transform device_id to rank_id for cpu_profiler
Merge pull request !22459 from 张毅辉/Device_id_to_rank_id
4 years ago
zhangyihui
6a36171a0e
transform device_id to rank_id for cpu_profiler
4 years ago
臧庆香
c28bc7ccba
Fix error when parsing multiple st_track_data
4 years ago
Cathy Wong
bc85c606b8
MD Profiling: Update Connector Init to Remove any existing file
to fix sequential pipeline scenario.
4 years ago
i-robot
5f9e9d96ec
!22542 Bugfix: scope-level flops data cannot be displayed
Merge pull request !22542 from gzhcv/CommunicationOpNotOverlapped
4 years ago
gzhcv
edb1b4798e
Bugfix: scope-level flops data cannot be displayed
4 years ago
gaojing
fa02606348
step train modified
4 years ago
zhangyihui
3e5cb3b506
fix bugs for device_id_to_rank_id
4 years ago
i-robot
acee9b24bc
!22284 Change Op name in hccl to Op name in step trace
Merge pull request !22284 from 张毅辉/op_name_of_hccl_to_op_name_of_step_trace
4 years ago
i-robot
6301361570
!22249 Analysis of overlapping time of communication operator and computation operator
Merge pull request !22249 from gzhcv/CommunicationOpNotOverlapped
4 years ago
zhangyihui
dab750d1a5
Mapping op_name of hccl to op_name of step trace
4 years ago
zhangyihui
3d19949eb4
device_id to rank_id
4 years ago
i-robot
ad4b85e125
!21929 Fix code check.
Merge pull request !21929 from yuximiao/fix_static
4 years ago
gzhcv
2c99884d83
Add cluster bottleneck analysis feature
5 years ago
i-robot
db44b88e1e
!22094 MD Profiling: For mismatched op info between files, skip bottleneck analysis
Merge pull request !22094 from cathwong/ckw_mon_py_analyze_fixes3
4 years ago
ms_yan
36a8886ca2
Revert "[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset"
This reverts commit b077aa1cab.
Revert "[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset"
This reverts commit 4e6f7dc97d.
delete pass_registry_test.cc
comment hiai_nlu_model_multi.pb related line
4 years ago
djc
b077aa1cab
[feat] [assistant] [I3T96T] add new Dataset operator CMUARCTICDataset
4 years ago
djc
4e6f7dc97d
[feat] [assistant] [I3T96X] add new Dataset operator LibriSpeechDataset
4 years ago
Cathy Wong
85931175bc
MD Profiling: For mismatched op info between files, skip bottleneck analysis.
Add to summary output: per_pipeline_time and per_push_queue_time.
Enhance UT
4 years ago
yanghaitao1
8fc11cb676
Adapt to the deletion of libms_profiler_fwk.a
4 years ago
yuximiao
fd44b00f0a
fix code check
4 years ago
Cathy Wong
a2cbd4b5fa
MD Profiling Analyze: Search for device trace file
MinddataProfilingAnalyzer() - remove device_target input parameter
4 years ago
yanghaitao1
c1428a8af1
fix aicpu parser error
4 years ago
gaojing
310841bd51
profiler cleancode
4 years ago
i-robot
ce00445f56
!20968 Add OWNERS to profiler/parser
Merge pull request !20968 from cathwong/code_docs_ckw_owners_profiler
4 years ago
Cathy Wong
ad3a38b125
Add OWNERS to profiler/parser
4 years ago
i-robot
764b228af8
!20652 Add scope-level flops in ascend profiler
Merge pull request !20652 from gzhcv/Flops
4 years ago
i-robot
14f4c0403e
!20904 Communication operator granularity display
Merge pull request !20904 from 张毅辉/cluster_profiler_for_slow_net
4 years ago
gzhcv
96dfea1e6f
Add scope-level flops in ascend profiler
4 years ago
i-robot
aede52e9d4
!20122 CI code warning fixes
Merge pull request !20122 from cathwong/ckw_ci_q3_lint2_analyzer
4 years ago
Cathy Wong
308e275a37
CI code warning fixes. Update file permissions for CSV file.
4 years ago