============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/dyn_shape_dev, inifile: /home/jenkins/sault/virtual_test/virtualenv_004/sault/config/pytest.ini
plugins: anyio-3.7.1, xdist-1.32.0, forked-1.1.3
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:16.550.391 [trace_attr.c:105](tid:15744) platform is 1.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:16.550.562 [trace_recorder.c:114](tid:15744) use root path: /home/jenkins/ascend/atrace
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:16.550.592 [trace_signal.c:133](tid:15744) register signal handler for signo 2 succeed.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:16.550.603 [trace_signal.c:133](tid:15744) register signal handler for signo 15 succeed.
[INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:16.970.979 [runtime.cc:1159] 15744 GetAicoreNumByLevel: workingDev_=0
[INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:16.971.061 [runtime.cc:4719] 15744 GetVisibleDevices: ASCEND_RT_VISIBLE_DEVICES param was not set
collected 2 items
test_relu6.py
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.309.025 [process_mode_manager.cpp:109][OpenProcess][tid:15744] [ProcessModeManager] enter into open process deviceId[3] rankSize[0]
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.016 [process_mode_manager.cpp:379][InitTsdClient][tid:15744] [TsdClient] deviceId[3] begin to init hdc client
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.192 [version_verify.cpp:34][SetVersionInfo][tid:15744] VersionVerify: send client version to server
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.227 [version_verify.cpp:50][SetVersionInfo][tid:15744] send feature_info:{msg_type:35, features:{check before send aicpu package,}}
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.239 [version_verify.cpp:50][SetVersionInfo][tid:15744] send feature_info:{msg_type:37, features:{check before send open qs message,}}
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.515 [version_verify.cpp:66][PeerVersionCheck][tid:15744] VersionVerify: Check client version info, server[1230], client[1230]
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.531 [version_verify.cpp:87][ParseVersionInfo][tid:15744] VersionVerify: pass client version info success
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.540 [hdc_client.cpp:276][CheckHdcConnection][tid:15744] Service[2] create hdc success
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.556 [version_verify.cpp:120][SpecialFeatureCheck][tid:15744] VersionVerify: new type[35], supported
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.600 [process_mode_manager.cpp:748][GetDeviceCheckCode][tid:15744] [TsdClient][deviceId=3] [sessionId=1] wait package info respond
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.708 [process_mode_manager.cpp:379][InitTsdClient][tid:15744] [TsdClient] deviceId[3] begin to init hdc client
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.806 [version_verify.cpp:34][SetVersionInfo][tid:15744] VersionVerify: send client version to server
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.817 [version_verify.cpp:50][SetVersionInfo][tid:15744] send feature_info:{msg_type:35, features:{check before send aicpu package,}}
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.827 [version_verify.cpp:50][SetVersionInfo][tid:15744] send feature_info:{msg_type:37, features:{check before send open qs message,}}
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.957 [version_verify.cpp:66][PeerVersionCheck][tid:15744] VersionVerify: Check client version info, server[1230], client[1230]
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.970 [version_verify.cpp:87][ParseVersionInfo][tid:15744] VersionVerify: pass client version info success
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.979 [hdc_client.cpp:276][CheckHdcConnection][tid:15744] Service[2] create hdc success
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.312.990 [process_mode_manager.cpp:426][ConstructOpenMsg][tid:15744] [TsdClient] tsd get process sign successfully, procpid[15744] signSize[48]
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.313.015 [version_verify.cpp:112][SpecialFeatureCheck][tid:15744] VersionVerify: previous type[6], supported
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.313.038 [process_mode_manager.cpp:126][OpenProcess][tid:15744] [ProcessModeManager] deviceId[3] sessionId[1] rankSize[0], wait sub process start respond
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.512.016 [stub_process_mode_nowin.cpp:63][ProcessQueueForMdc][tid:15744] [TsdClient] it is unnecessary of current mode[0] chiptype[1] to grant queue auth to aicpusd
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.512.064 [stub_process_mode_nowin.cpp:101][OpenInHost][tid:15744] enter into OpenInHost deviceid[3]
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.512.077 [stub_process_mode_nowin.cpp:105][OpenInHost][tid:15744] host cpu not support
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.512.085 [process_mode_manager.cpp:156][OpenProcess][tid:15744] [TsdClient][deviceId=3] [sessionId=1] start hccp and computer process success
[INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:21.514.805 [device.cc:340] 15744 Init: isDoubledie:0, topologytype:0
[INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:21.531.746 [npu_driver.cc:5428] 17216 GetDeviceStatus: GetDeviceStatus status=1.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:21.531.771 [atrace_api.c:28](tid:15744) AtraceCreate start
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:21.531.886 [trace_rb_log.c:84](tid:15744) [RUNTIME_ATRACE_DEV3_TS0] create ring buffer success, buffer size : 131152.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:21.531.903 [atrace_api.c:32](tid:15744) AtraceCreate end
[INFO] TDT(15744,python3.7):2024-01-11-05:27:21.531.917 [client_manager.cpp:157][SetProfilingCallback][tid:15744] [TsdClient] set profiling callback success
[TRACE] GE(15744,python3.7):2024-01-11-05:27:21.681.315 [status:INIT] [ge_api.cc:144]15744 GEInitializeImpl:GEInitialize start
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:21.904.609 [msprofiler_impl.cpp:156] >>> (tid:15744) ProfNotifySetDevice called, is open: 1, devId: 3
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:21.904.776 [platform.cpp:38] >>> (tid:15744) Profiling platform version: 1.0.
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:21.904.794 [ai_drv_dev_api.cpp:384] >>> (tid:15744) Succeeded to DrvGetApiVersion version: 0x72313
[TRACE] GE(15744,python3.7):2024-01-11-05:27:21.957.732 [status:RUNNING] [ge_api.cc:211]15744 GEInitializeImpl:Initializing environment
[INFO] GE(15744,python3.7):2024-01-11-05:27:21.957.779 [gelib.cc:98][EVENT]15744 Initialize:[GEPERFTRACE] GE Init Start
[INFO] GE(15744,python3.7):2024-01-11-05:27:21.958.053 [gelib.cc:307][EVENT]15744 SystemInitialize:Online infer init GELib success, device id :3
[INFO] DVPP(15744,python3.7):2024-01-11-05:27:22.341.250 [dvpp_engine.cc:41][ENGINE][Initialize:41][tid:15744]dvpp engine do not support
[INFO] TUNE(15744,python3.7):2024-01-11-05:27:22.345.919 [cann_kb_pyfunc_mgr.cpp:72][CANNKB][Tid:15744]"CannKbPyfuncMgr: Enter PyObjectInit, reference_ is 0!"
[INFO] TUNE(15744,python3.7):2024-01-11-05:27:22.345.957 [handle_manager.cpp:115][CANNKB][Tid:15744]"Start to run init functions to load dynamic python lib!"
[INFO] TUNE(15744,python3.7):2024-01-11-05:27:22.346.019 [handle_manager.cpp:407][CANNKB][Tid:15744]"Init functions of loading dynamic python lib end!"
[INFO] TUNE(15744,python3.7):2024-01-11-05:27:22.346.030 [cann_kb_pyfunc_mgr.cpp:24][CANNKB][Tid:15744]"CANN_KB_Py has already been initialized."
[INFO] TUNE(15744,python3.7):2024-01-11-05:27:22.346.254 [cann_kb_pyfunc_mgr.cpp:117][CANNKB][Tid:15744]"CannKbPyfuncMgr: Run PyObjectInit successfully!"
[INFO] HCCL(15744,python3.7):2024-01-11-05:27:34.298.785 [plugin_manager.cc:42][15744]hcom running normal mode.
[INFO] DVPP(15744,python3.7):2024-01-11-05:27:34.299.417 [dvpp_engine.cc:92][ENGINE][GetOpsKernelInfoStores:92][tid:15744]dvpp ops kernel info store do not support
[INFO] DVPP(15744,python3.7):2024-01-11-05:27:34.299.583 [dvpp_engine.cc:69][ENGINE][GetGraphOptimizerObjs:69][tid:15744]dvpp graph optimizer do not support
[INFO] DVPP(15744,python3.7):2024-01-11-05:27:34.827.862 [dvpp_ops_kernel_builder.cc:48][ENGINE][Initialize:48][tid:15744]dvpp ops kernel builder do not support
[INFO] GE(15744,python3.7):2024-01-11-05:27:34.836.760 [gelib.cc:169][EVENT]15744 Initialize:[GEPERFTRACE] The time cost of GELib::Initialize is [12878933] micro second.
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.921.838 [status:STOP] [ge_api.cc:255]15744 GEInitializeImpl:GEInitialize finished
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.921.991 [status:INIT] [ge_api.cc:398]15744 Session:Start to construct session.
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.922.010 [status:RUNNING] [ge_api.cc:408]15744 Session:Creating session
[INFO] GE(15744,python3.7):2024-01-11-05:27:34.922.420 [graph_var_manager.cc:1445][EVENT]15744 SetMemoryMallocSize:Total memory size is 34359738368
[INFO] GE(15744,python3.7):2024-01-11-05:27:34.922.435 [graph_var_manager.cc:1424][EVENT]15744 SetAllMemoryMaxValue:The graph_mem_max_size is 27917287424 and the var_mem_max_size is 5368709120
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:34.922.730 [msprofiler_impl.cpp:156] >>> (tid:15744) ProfNotifySetDevice called, is open: 1, devId: 3
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.923.548 [status:RUNNING] [ge_api.cc:411]15744 Session:Session id is 0
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.923.569 [status:STOP] [ge_api.cc:420]15744 Session:Session Constructor finished
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:34.933.210 [platform.cpp:38] >>> (tid:15744) Profiling platform version: 1.0.
[INFO] PROFILING(15744,python3.7):2024-01-11-05:27:34.933.241 [ai_drv_dev_api.cpp:384] >>> (tid:15744) Succeeded to DrvGetApiVersion version: 0x72313
[TRACE] GE(15744,python3.7):2024-01-11-05:27:34.933.412 [status:INIT] [ge_api.cc:144]15744 GEInitializeImpl:GEInitialize start
TotalTime = 0.385714, [20]
[parse]: 0.220792
[symbol_resolve]: 0.02858, [1]
    [Cycle 1]: 0.0285007, [1]
        [resolve]: 0.028476
[combine_like_graphs]: 1.06e-06
[graph_reusing]: 3.02e-06
[meta_unpack_prepare]: 0.00015925
[pre_cconv]: 4.18e-06
[abstract_specialize]: 0.00482414
[pack_expand]: 1.548e-05
[auto_monad]: 0.00011353
[inline]: 1.72e-06
[pre_auto_parallel]: 1.999e-05
[pipeline_split]: 2.8e-06
[optimize]: 0.124593, [35]
    [py_interpret_to_execute]: 3.79e-06
    [rewriter_before_opt_a]: 0.0001685
    [opt_a]: 0.12333, [4]
        [Cycle 1]: 0.0848061, [30]
            [expand_dump_flag]: 4.35e-06
            [switch_simplify]: 2.607e-05
            [a_1]: 0.00043102
            [recompute_prepare]: 9.2e-06
            [updatestate_depend_eliminate]: 1.098e-05
            [updatestate_assign_eliminate]: 7e-06
            [updatestate_loads_eliminate]: 6.66e-06
            [parameter_eliminate]: 4.95e-06
            [a_2]: 9.201e-05
            [accelerated_algorithm]: 8.82e-06
            [pynative_shard]: 2.19999e-06
            [auto_parallel]: 3.23e-06
            [parallel]: 1.879e-05
            [merge_comm]: 9.87e-06
            [allreduce_fusion]: 1.92e-06
            [virtual_dataset]: 5.29e-06
            [get_grad_eliminate_]: 4.50001e-06
            [virtual_output]: 4.23e-06
            [merge_forward]: 8.97e-06
            [cell_reuse_recompute_pass]: 9.5e-07
            [cell_reuse_handle_not_recompute_node_pass]: 1.268e-05
            [meta_fg_expand]: 0.0268247, [1]
                [Cycle 1]: 0.00055306, [1]
                    [resolve]: 0.0005298
            [after_resolve]: 2.269e-05
            [a_after_grad]: 3.983e-05
            [renormalize]: 0.0566447
            [real_op_eliminate]: 2.805e-05
            [auto_monad_grad]: 3.42e-05
            [auto_monad_eliminator]: 5.054e-05
            [cse]: 0.00012947
            [a_3]: 0.00018026
        [Cycle 2]: 0.028835, [30]
            [expand_dump_flag]: 2.49e-06
            [switch_simplify]: 6.679e-05
            [a_1]: 0.00046
            [recompute_prepare]: 1.107e-05
            [updatestate_depend_eliminate]: 1.254e-05
            [updatestate_assign_eliminate]: 8.74e-06
            [updatestate_loads_eliminate]: 8.66e-06
            [parameter_eliminate]: 4e-06
            [a_2]: 0.0001259
            [accelerated_algorithm]: 1.31e-05
            [pynative_shard]: 1.88e-06
            [auto_parallel]: 7.8e-06
            [parallel]: 5.63999e-06
            [merge_comm]: 3.28e-06
            [allreduce_fusion]: 1.44e-06
            [virtual_dataset]: 7.05e-06
            [get_grad_eliminate_]: 6.38e-06
            [virtual_output]: 6.06e-06
            [merge_forward]: 1.045e-05
            [cell_reuse_recompute_pass]: 8.29998e-07
            [cell_reuse_handle_not_recompute_node_pass]: 1.515e-05
            [meta_fg_expand]: 0.00673061, [3]
                [Cycle 1]: 0.000351, [1]
                    [resolve]: 0.00033136
                [Cycle 1]: 0.0004471, [1]
                    [resolve]: 0.00042734
                [Cycle 1]: 0.00034175, [1]
                    [resolve]: 0.00032202
            [after_resolve]: 3.176e-05
            [a_after_grad]: 5.346e-05
            [renormalize]: 0.0206192
            [real_op_eliminate]: 3.084e-05
            [auto_monad_grad]: 3.491e-05
            [auto_monad_eliminator]: 5.507e-05
            [cse]: 0.00011961
            [a_3]: 0.00021769
        [Cycle 3]: 0.00267604, [30]
            [expand_dump_flag]: 2.48e-06
            [switch_simplify]: 6.786e-05
            [a_1]: 0.0005707
            [recompute_prepare]: 1.18e-05
            [updatestate_depend_eliminate]: 1.397e-05
            [updatestate_assign_eliminate]: 1.079e-05
            [updatestate_loads_eliminate]: 1.08e-05
            [parameter_eliminate]: 3.9e-06
            [a_2]: 0.00016022
            [accelerated_algorithm]: 1.585e-05
            [pynative_shard]: 1.25e-06
            [auto_parallel]: 3.72e-06
            [parallel]: 3.81e-06
            [merge_comm]: 2.66e-06
            [allreduce_fusion]: 1.68e-06
            [virtual_dataset]: 9.12001e-06
            [get_grad_eliminate_]: 8.33e-06
            [virtual_output]: 7.89999e-06
            [merge_forward]: 1.199e-05
            [cell_reuse_recompute_pass]: 5.4e-07
            [cell_reuse_handle_not_recompute_node_pass]: 1.945e-05
            [meta_fg_expand]: 2.883e-05
            [after_resolve]: 1.153e-05
            [a_after_grad]: 1.37e-05
            [renormalize]: 0.00130798
            [real_op_eliminate]: 1.321e-05
            [auto_monad_grad]: 5.15e-06
            [auto_monad_eliminator]: 2.326e-05
            [cse]: 0.00010345
            [a_3]: 7.493e-05
        [Cycle 4]: 0.0007543, [30]
            [expand_dump_flag]: 1.22e-06
            [switch_simplify]: 9e-06
            [a_1]: 0.00015501
            [recompute_prepare]: 1.035e-05
            [updatestate_depend_eliminate]: 1.371e-05
            [updatestate_assign_eliminate]: 1.112e-05
            [updatestate_loads_eliminate]: 1.075e-05
            [parameter_eliminate]: 2.2e-06
            [a_2]: 0.0001572
            [accelerated_algorithm]: 1.558e-05
            [pynative_shard]: 1.40999e-06
            [auto_parallel]: 3.3e-06
            [parallel]: 3.81e-06
            [merge_comm]: 2.25e-06
            [allreduce_fusion]: 1.62e-06
            [virtual_dataset]: 8.81e-06
            [get_grad_eliminate_]: 8e-06
            [virtual_output]: 7.61e-06
            [merge_forward]: 1.239e-05
            [cell_reuse_recompute_pass]: 3.89999e-07
            [cell_reuse_handle_not_recompute_node_pass]: 1.906e-05
            [meta_fg_expand]: 8.98e-06
            [after_resolve]: 1.115e-05
            [a_after_grad]: 1.388e-05
            [renormalize]: 1.20002e-07
            [real_op_eliminate]: 7.72e-06
            [auto_monad_grad]: 2.16001e-06
            [auto_monad_eliminator]: 2.037e-05
            [cse]: 4.93e-05
            [a_3]: 6.725e-05
    [py_interpret_to_execute_after_opt_a]: 4.50001e-06
    [slice_cell_reuse_recomputed_activation]: 2.61e-06
    [rewriter_after_opt_a]: 7.93e-05
    [convert_after_rewriter]: 1.903e-05
    [order_py_execute_after_rewriter]: 1.308e-05
    [opt_b]: 0.00059062, [2]
        [Cycle 1]: 0.00050204, [7]
            [b_1]: 0.00044594
            [b_2]: 3.30999e-06
            [updatestate_depend_eliminate]: 3.4e-06
            [updatestate_assign_eliminate]: 2.46e-06
            [updatestate_loads_eliminate]: 2.22e-06
            [renormalize]: 3.00002e-07
            [cse]: 1.03e-05
        [Cycle 2]: 7.79e-05, [7]
            [b_1]: 3.763e-05
            [b_2]: 2.02e-06
            [updatestate_depend_eliminate]: 2.21e-06
            [updatestate_assign_eliminate]: 1.95e-06
            [updatestate_loads_eliminate]: 1.89e-06
            [renormalize]: 7.0002e-08
            [cse]: 6.17e-06
    [cconv]: 2.264e-05
    [opt_after_cconv]: 5.21e-05, [1]
        [Cycle 1]: 4.725e-05, [7]
            [c_1]: 5.04e-06
            [parameter_eliminate]: 1.98e-06
            [updatestate_depend_eliminate]: 2.34e-06
            [updatestate_assign_eliminate]: 1.98e-06
            [updatestate_loads_eliminate]: 1.77e-06
            [cse]: 6.3e-06
            [renormalize]: 2.80001e-07
    [remove_dup_value]: 9.92e-06
    [tuple_transform]: 3.716e-05, [1]
        [Cycle 1]: 3.269e-05, [3]
            [d_1]: 1.431e-05
            [d_2]: 5.99999e-06
            [renormalize]: 1.8e-07
    [add_cache_embedding]: 1.189e-05
    [add_recomputation]: 4.946e-05
    [cse_after_recomputation]: 1.687e-05, [1]
        [Cycle 1]: 1.252e-05, [1]
            [cse]: 7.77e-06
    [environ_conv]: 1.772e-05
    [label_micro_interleaved_index]: 2.66e-06
    [label_fine_grained_interleaved_index]: 2.55e-06
    [assign_add_opt]: 1.70001e-06
    [slice_recompute_activation]: 2.63e-06
    [micro_interleaved_order_control]: 1.79e-06
    [full_micro_interleaved_order_control]: 1.86e-06
    [comp_comm_scheduling]: 2.29e-06
    [reorder_send_recv_between_fp_bp]: 2.31e-06
    [comm_op_add_attrs]: 1.13e-06
    [add_comm_op_reuse_tag]: 9.79999e-07
    [overlap_opt_shard_in_pipeline]: 1.27e-06
    [grouped_pairwise_exchange_alltoall]: 1.46e-06
    [overlap_recompute_and_grad_model_parallel]: 2.34e-06
    [overlap_grad_matmul_and_grad_allreduce]: 7.7e-07
    [split_matmul_comm_elemetwise]: 2.77e-06
    [split_layernorm_comm]: 1.83e-06
    [process_send_recv_for_ge]: 2.66e-06
    [handle_group_info]: 1.02e-06
[auto_monad_reorder]: 2.176e-05
[get_jit_bprop_graph]: 5.99997e-07
[eliminate_special_op_node]: 0.00052681
[validate]: 4.772e-05
[distribtued_split]: 1.32999e-06
[task_emit]: 0.00575607
[execute]: 8.09e-06
Sums
    parse : 0.220792s : 63.76%
    symbol_resolve.resolve : 0.028476s : 8.22%
    combine_like_graphs : 0.000001s : 0.00%
    graph_reusing : 0.000003s : 0.00%
    meta_unpack_prepare : 0.000159s : 0.05%
    pre_cconv : 0.000004s : 0.00%
    abstract_specialize : 0.004824s : 1.39%
    pack_expand : 0.000015s : 0.00%
    auto_monad : 0.000114s : 0.03%
    inline : 0.000002s : 0.00%
    pre_auto_parallel : 0.000020s : 0.01%
    pipeline_split : 0.000003s : 0.00%
    optimize.py_interpret_to_execute : 0.000004s : 0.00%
    optimize.rewriter_before_opt_a : 0.000168s : 0.05%
    optimize.opt_a.expand_dump_flag : 0.000011s : 0.00%
    optimize.opt_a.switch_simplify : 0.000170s : 0.05%
    optimize.opt_a.a_1 : 0.001617s : 0.47%
    optimize.opt_a.recompute_prepare : 0.000042s : 0.01%
    optimize.opt_a.updatestate_depend_eliminate : 0.000051s : 0.01%
    optimize.opt_a.updatestate_assign_eliminate : 0.000038s : 0.01%
    optimize.opt_a.updatestate_loads_eliminate : 0.000037s : 0.01%
    optimize.opt_a.parameter_eliminate : 0.000015s : 0.00%
    optimize.opt_a.a_2 : 0.000535s : 0.15%
    optimize.opt_a.accelerated_algorithm : 0.000053s : 0.02%
    optimize.opt_a.pynative_shard : 0.000007s : 0.00%
    optimize.opt_a.auto_parallel : 0.000018s : 0.01%
    optimize.opt_a.parallel : 0.000032s : 0.01%
    optimize.opt_a.merge_comm : 0.000018s : 0.01%
    optimize.opt_a.allreduce_fusion : 0.000007s : 0.00%
    optimize.opt_a.virtual_dataset : 0.000030s : 0.01%
    optimize.opt_a.get_grad_eliminate_ : 0.000027s : 0.01%
    optimize.opt_a.virtual_output : 0.000026s : 0.01%
    optimize.opt_a.merge_forward : 0.000044s : 0.01%
    optimize.opt_a.cell_reuse_recompute_pass : 0.000003s : 0.00%
    optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000066s : 0.02%
    optimize.opt_a.meta_fg_expand : 0.000038s : 0.01%
    optimize.opt_a.meta_fg_expand.resolve : 0.001611s : 0.47%
    optimize.opt_a.after_resolve : 0.000077s : 0.02%
    optimize.opt_a.a_after_grad : 0.000121s : 0.03%
    optimize.opt_a.renormalize : 0.078572s : 22.69%
    optimize.opt_a.real_op_eliminate : 0.000080s : 0.02%
    optimize.opt_a.auto_monad_grad : 0.000076s : 0.02%
    optimize.opt_a.auto_monad_eliminator : 0.000149s : 0.04%
    optimize.opt_a.cse : 0.000402s : 0.12%
    optimize.opt_a.a_3 : 0.000540s : 0.16%
    optimize.py_interpret_to_execute_after_opt_a : 0.000005s : 0.00%
    optimize.slice_cell_reuse_recomputed_activation : 0.000003s : 0.00%
    optimize.rewriter_after_opt_a : 0.000079s : 0.02%
    optimize.convert_after_rewriter : 0.000019s : 0.01%
    optimize.order_py_execute_after_rewriter : 0.000013s : 0.00%
    optimize.opt_b.b_1 : 0.000484s : 0.14%
    optimize.opt_b.b_2 : 0.000005s : 0.00%
    optimize.opt_b.updatestate_depend_eliminate : 0.000006s : 0.00%
    optimize.opt_b.updatestate_assign_eliminate : 0.000004s : 0.00%
    optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.00%
    optimize.opt_b.renormalize : 0.000000s : 0.00%
    optimize.opt_b.cse : 0.000016s : 0.00%
    optimize.cconv : 0.000023s : 0.01%
    optimize.opt_after_cconv.c_1 : 0.000005s : 0.00%
    optimize.opt_after_cconv.parameter_eliminate : 0.000002s : 0.00%
    optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00%
    optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00%
    optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00%
    optimize.opt_after_cconv.cse : 0.000006s : 0.00%
    optimize.opt_after_cconv.renormalize : 0.000000s : 0.00%
    optimize.remove_dup_value : 0.000010s : 0.00%
    optimize.tuple_transform.d_1 : 0.000014s : 0.00%
    optimize.tuple_transform.d_2 : 0.000006s : 0.00%
    optimize.tuple_transform.renormalize : 0.000000s : 0.00%
    optimize.add_cache_embedding : 0.000012s : 0.00%
    optimize.add_recomputation : 0.000049s : 0.01%
    optimize.cse_after_recomputation.cse : 0.000008s : 0.00%
    optimize.environ_conv : 0.000018s : 0.01%
    optimize.label_micro_interleaved_index : 0.000003s : 0.00%
    optimize.label_fine_grained_interleaved_index : 0.000003s : 0.00%
    optimize.assign_add_opt : 0.000002s : 0.00%
    optimize.slice_recompute_activation : 0.000003s : 0.00%
    optimize.micro_interleaved_order_control : 0.000002s : 0.00%
    optimize.full_micro_interleaved_order_control : 0.000002s : 0.00%
    optimize.comp_comm_scheduling : 0.000002s : 0.00%
    optimize.reorder_send_recv_between_fp_bp : 0.000002s : 0.00%
    optimize.comm_op_add_attrs : 0.000001s : 0.00%
    optimize.add_comm_op_reuse_tag : 0.000001s : 0.00%
    optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00%
    optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00%
    optimize.overlap_recompute_and_grad_model_parallel : 0.000002s : 0.00%
    optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.00%
    optimize.split_matmul_comm_elemetwise : 0.000003s : 0.00%
    optimize.split_layernorm_comm : 0.000002s : 0.00%
    optimize.process_send_recv_for_ge : 0.000003s : 0.00%
    optimize.handle_group_info : 0.000001s : 0.00%
    auto_monad_reorder : 0.000022s : 0.01%
    get_jit_bprop_graph : 0.000001s : 0.00%
    eliminate_special_op_node : 0.000527s : 0.15%
    validate : 0.000048s : 0.01%
    distribtued_split : 0.000001s : 0.00%
    task_emit : 0.005756s : 1.66%
    execute : 0.000008s : 0.00%
Time group info:
------[substitution.] 0.030340 387
    0.01% : 0.000003s : 5: substitution.float_depend_g_call
    0.04% : 0.000011s : 14: substitution.float_tuple_getitem_switch
    96.61% : 0.029312s : 25: substitution.getattr_setattr_resolve
    0.02% : 0.000005s : 3: substitution.graph_param_transform
    0.01% : 0.000003s : 3: substitution.incorporate_call
    0.04% : 0.000011s : 3: substitution.incorporate_call_switch
    2.10% : 0.000638s : 59: substitution.inline
    0.03% : 0.000008s : 14: substitution.less_batch_normalization
    0.15% : 0.000044s : 23: substitution.meta_unpack_prepare
    0.04% : 0.000012s : 11: substitution.minmaximum_grad
    0.01% : 0.000004s : 5: substitution.partial_eliminate
    0.00% : 0.000001s : 3: substitution.partial_unused_args_eliminate
    0.02% : 0.000007s : 47: substitution.remove_not_recompute_node
    0.18% : 0.000055s : 38: substitution.replace_applicator
    0.03% : 0.000008s : 20: substitution.replace_old_param
    0.01% : 0.000003s : 2: substitution.reset_defer_inline
    0.02% : 0.000007s : 8: substitution.set_cell_output_no_recompute
    0.03% : 0.000008s : 5: substitution.specialize_transform
    0.03% : 0.000009s : 4: substitution.switch_simplify
    0.04% : 0.000012s : 2: substitution.transpose_eliminate
    0.14% : 0.000042s : 15: substitution.tuple_list_convert_item_index_to_positive
    0.06% : 0.000017s : 15: substitution.tuple_list_get_item_const_eliminator
    0.08% : 0.000023s : 15: substitution.tuple_list_get_item_depend_reorder
    0.24% : 0.000072s : 33: substitution.tuple_list_get_item_eliminator
    0.08% : 0.000023s : 15: substitution.tuple_list_get_set_item_eliminator
------[renormalize.] 0.078555 6
    95.19% : 0.074776s : 3: renormalize.infer
    4.81% : 0.003779s : 3: renormalize.specialize
------[replace.] 0.000794 68
    48.02% : 0.000381s : 23: replace.getattr_setattr_resolve
    29.13% : 0.000231s : 31: replace.inline
    6.72% : 0.000053s : 2: replace.meta_unpack_prepare
    8.07% : 0.000064s : 4: replace.switch_simplify
    1.46% : 0.000012s : 2: replace.transpose_eliminate
    6.60% : 0.000052s : 6: replace.tuple_list_get_item_eliminator
------[match.] 0.029874 68
    97.83% : 0.029226s : 23: match.getattr_setattr_resolve
    1.92% : 0.000574s : 31: match.inline
    0.11% : 0.000032s : 2: match.meta_unpack_prepare
    0.03% : 0.000009s : 4: match.switch_simplify
    0.04% : 0.000012s : 2: match.transpose_eliminate
    0.07% : 0.000021s : 6: match.tuple_list_get_item_eliminator
------[func_graph_cloner_run.] 0.004331 69
    67.54% : 0.002925s : 28: func_graph_cloner_run.FuncGraphClonerGraph
    32.46% : 0.001406s : 41: func_graph_cloner_run.FuncGraphSpecializer
------[meta_graph.] 0.000000 0
------[manager.] 0.000000 0
------[pynative] 0.000000 0
------[others.] 0.033882 255
    3.22% : 0.001091s : 104: opt.transform.opt_a
    1.33% : 0.000450s : 92: opt.transform.opt_b
    88.44% : 0.029964s : 10: opt.transform.opt_resolve
    0.40% : 0.000135s : 1: opt.transforms.meta_unpack_prepare
    6.51% : 0.002207s : 40: opt.transforms.opt_a
    0.01% : 0.000004s : 1: opt.transforms.opt_after_cconv
    0.01% : 0.000004s : 2: opt.transforms.opt_b
    0.05% : 0.000019s : 2: opt.transforms.opt_trans_graph
    0.03% : 0.000009s : 3: opt.transforms.special_op_eliminate
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.433.754 [scalable_config.cc:55][EVENT]21144 ScalableConfig:device total max size: 34359738368, page_mem_size_total_thresold: 32641751449, uncacheable_size_threshold: 17179869184
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.514.589 [graph_var_manager.cc:1424][EVENT]21144 SetAllMemoryMaxValue:The graph_mem_max_size is 27917287424 and the var_mem_max_size is 5368709120
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.514.704 [graph_manager.cc:1248][EVENT]21144 PreRun:PreRun start: graph node size 3, session id 1, graph id 0, graph name online.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:35.515.632 [atrace_api.c:28](tid:21144) AtraceCreate start
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:35.515.705 [trace_rb_log.c:84](tid:21144) [RUNTIME_ATRACE_DEV64_TS0] create ring buffer success, buffer size : 131152.
[INFO] ATRACE(15744,python3.7):2024-01-11-05:27:35.515.719 [atrace_api.c:32](tid:21144) AtraceCreate end
[INFO] TDT(15744,python3.7):2024-01-11-05:27:35.515.749 [client_manager.cpp:157][SetProfilingCallback][tid:21144] [TsdClient] set profiling callback success
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.516.838 [parallel_partitioner.cc:165][EVENT]21144 DoPipelinePartition:[GEPERFTRACE] The time cost of OptimizeSubgraph::PipelinePartition is [29] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.516.895 [parallel_partitioner.cc:178][EVENT]21144 DoFlowGraphPartition:[GEPERFTRACE] The time cost of OptimizeSubgraph::FlowGraphPartition is [14] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.516.965 [graph_prepare.cc:1378][EVENT]21144 Init:[GEPERFTRACE] The time cost of FileConstantUtils::ConvertFileConstToConst is [9] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.663 [graph_manager.cc:1050][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareInit is [734] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.697 [graph_manager.cc:1052][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.HandleSummaryOp is [9] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.853 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ForToWhilePass is [1] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.885 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::SavePass is [4] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.952 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::NetOutputPass is [54] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.517.967 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::DataPass is [1] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.518.072 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of CreateSubGraphWithScopePass is [27] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.518.086 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of SubgraphMultiDimsClonePass is [1] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.518.110 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of MultiBatchClonePass is [13] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.518.229 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SplitVariableIntoSubgraphPass is [2] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.518.252 [graph_manager.cc:1054][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.NormalizeGraph is [540] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.526.029 [graph_manager.cc:1055][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeGraphInit is [7742] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.070 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of AssertPass is [1] micro second, call num is [6]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.101 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [3] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.113 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of MergePass is [4] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.123 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of InferShapePass is [303] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.132 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [15] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.140 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SplitShapeNPass is [1] micro second, call num is [6]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.149 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [21] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.157 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [18] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.527.165 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of InferValuePass is [5] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.255 [graph_manager.cc:1056][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraphForQuantize is [3185] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.326 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of CondRemovePass is [8] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.345 [graph_prepare.cc:1982][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::ProcessBeforeInfershape is [54] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.696 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of AssertPass is [3] micro second, call num is [6]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.720 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [0] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.732 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of MergePass is [1] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.741 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of InferShapePass is [180] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.750 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [7] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.759 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SplitShapeNPass is [3] micro second, call num is [6]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.767 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [5] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.775 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [8] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.784 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of InferValuePass is [2] micro second, call num is [3]
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.820 [graph_prepare.cc:1983][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::FormatAndShapeProcess is [460] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.845 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PreRun::MarkForceUnknownForCondPass is [5] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.857 [graph_prepare.cc:1984][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::CtrlFlowPreProcess is [22] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.871 [graph_prepare.cc:1985][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::multibatch::GetDynamicOutputShape is [4] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.895 [graph_prepare.cc:1986][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::InsertAippOpUtil::Instance().UpdateDataNodeByAipp is [11] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.907 [graph_prepare.cc:1987][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::SaveOriginalGraphToOmModel is [1] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.926 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareOptimize::ShapeOperateOpRemovePass is [5] micro second.
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.938 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareOptimize::ReplaceTransShapePass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.529.952 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareOptimize::MarkAgnosticPass is [5] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.034 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of EnterPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.046 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of CondPass is [4] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.055 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrintOpPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.064 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of NoUseReshapeRemovePass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.072 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DropOutPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.081 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of AssertPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.089 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of TransposeRemovePass is [0] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.097 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of UnusedConstPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.106 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of StopGradientPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.114 [base_pass.cc:339][EVENT]21144 
Run:[GEPERFTRACE] The time cost of PreventGradientPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.122 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of PlaceholderWithDefaultPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.130 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SnapshotPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.138 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of GuaranteeConstPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.146 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of VarIsInitializedOpPass is [6] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.162 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ParallelConcatStartOpPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.171 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of IdentityPass is [3] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.194 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareOptimize::PrunePass is [11] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.207 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareOptimize::HcclMemcpyPass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.237 [graph_prepare.cc:1988][EVENT]21144 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::PrepareOptimize is [319] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.530.250 [graph_manager.cc:1065][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareDynShape is [959] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.543.495 [graph_manager.cc:1077][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraph is [13224] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.543.575 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PrepareRunningFormatRefiner::VariablePrepareOpPass is [7] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.543.626 [graph_manager.cc:1080][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareRunningFormatRefiner is [86] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.493 [graph_manager.cc:1081][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraphJudgeInsert is [3850] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.538 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of SubexpressionMigrationPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.552 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of UnusedArgsCleanPass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.564 [graph_manager.cc:1082][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::SubexpressionMigration is [34] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.596 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::MergeInputMemcpyPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.612 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::SwitchDataEdgesBypass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.626 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::ConstantFuseSamePass is [4] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.660 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::CSEBeforeFuseDataNodesWithCommonInputPass is [23] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.674 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::FuseDataNodesWithCommonInputPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.688 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::CommonSubexpressionEliminationPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.702 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::PermutePass is [3] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.753 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::SameTransdataBreadthFusionPass is [42] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.784 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::VariableOpPass is [8] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.807 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::TransOpWithoutReshapeFusionPass is [13] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.854 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::TransOpBreadthFusionPass is [37] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.873 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::DataFlowPreparePass is [7] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.887 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::MergeUnknownShapeNPass is [2] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.547.896 [graph_manager.cc:2700][EVENT]21144 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_1 is [306] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.016 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of EnterPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.029 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of AddNPass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.039 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.047 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of SwitchLogicRemovePass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.056 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of MergePass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.064 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of CastRemovePass is [10] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.072 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of TransposeTransDataPass is [3] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.081 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ReshapeRemovePass is [4] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.089 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of TransOpSymmetryEliminationPass is [3] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.097 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of TransOpNearbyAllreduceFusionPass is [1] micro second, call num is [3] [INFO] 
GE(15744,python3.7):2024-01-11-05:27:35.548.116 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [7] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.124 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [5] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.132 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [10] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.141 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DimensionAdjustPass is [4] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.149 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of UselessControlOutRemovePass is [3] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.159 [graph_manager.cc:2741][EVENT]21144 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_2 is [243] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.176 [graph_manager.cc:2752][EVENT]21144 OptimizeStage1:[GEPERFTRACE] The time cost of extern constant folding is [0] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.201 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::Migration is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.213 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ArgsClean is [1] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.231 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::PrunePass is [8] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.248 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::NextIterationPass is [5] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.259 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ControlTriggerPass is [3] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.270 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::MergeToStreamMergePass is [3] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.297 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::SwitchToStreamSwitchPass is [17] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.312 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::AttachStreamLabelPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.326 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::MultiBatchPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.337 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::SubgraphMultiDimsPass is [1] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.350 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::IteratorOpPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.360 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::VariableRefUselessControlOutDeletePass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.379 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ReshapeRecoveryPass is [9] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.392 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::RemoveSameConstPass is [4] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.402 [graph_manager.cc:2810][EVENT]21144 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_3 is [206] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.429 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of IdentityPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.441 [graph_manager.cc:2821][EVENT]21144 OptimizeStage1:[GEPERFTRACE] The time cost of GraphPrepare::node_pass is [30] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.472 [graph_manager.cc:1087][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1 is [889] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.619 [graph_manager.cc:1088][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeAfterStage1 is [133] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.659 [graph_manager.cc:1089][EVENT]21144 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::GraphUtilsEx::InferShapeInNeed is [19] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.677 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of PreRun::CtrlEdgeTransferPass is [1] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.692 [graph_manager.cc:1097][EVENT]21144 PreRunOptimizeOriginalGraph:PreRun:PreRunOptimizeOriginalGraph success. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.548.723 [graph_manager.cc:3325][EVENT]21144 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::StagePartition is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.137 [engine_place.cc:144][EVENT]21144 Run:The time cost of AIcoreEngine::CheckSupported is [284] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.165 [engine_place.cc:144][EVENT]21144 Run:The time cost of DNN_VM_GE_LOCAL_OP_STORE::CheckSupported is [9] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.174 [engine_place.cc:144][EVENT]21144 Run:The time cost of DNN_VM_RTS_OP_STORE::CheckSupported is [10] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.272 [graph_manager.cc:3351][EVENT]21144 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::GraphPartitionDynamicShape is [535] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.290 [graph_manager.cc:3364][EVENT]21144 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::SubgraphPartitionAndOptimization::CompositeEngine is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.372 [engine_partitioner.cc:1139][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionInitialize is [16] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.390 [engine_partitioner.cc:1142][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionMarkClusters is [5] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.532 [engine_partitioner.cc:1148][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSplitSubGraphs is [133] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.574 [engine_partitioner.cc:1155][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSortSubGraphs is [29] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.631 [engine_partitioner.cc:1164][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionAddPartitionsToGraphNode is [45] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.666 [graph_manager.cc:3405][EVENT]21144 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::Partition1 is [361] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.549.686 [graph_manager.cc:3412][EVENT]21144 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubgraphPreProc is [8] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.561.853 [graph_manager.cc:3422][EVENT]21144 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubGraph is [12152] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.561.896 [graph_manager.cc:3428][EVENT]21144 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubgraphPostProc is [10] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.032 [graph_manager.cc:3467][EVENT]21144 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::MergeSubGraph is [114] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.052 [graph_manager.cc:3377][EVENT]21144 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::SubgraphPartitionAndOptimization::AtomicEngine is [12748] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.069 [graph_manager.cc:1106][EVENT]21144 PreRunOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeSubgraph is [13353] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.082 [graph_manager.cc:1115][EVENT]21144 PreRunOptimizeSubGraph:PreRun:PreRunOptimizeSubGraph success. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.115 [graph_manager.cc:1130][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.ReplacePrecompiledNodeWithOmGraph is [5] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.159 [graph_manager.cc:1131][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeWholeGraph is [20] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.191 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::LinkGenMaskNodesPass is [11] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.209 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::HcclContinuousMemcpyPass is [5] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.219 [graph_manager.cc:2837][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses is [43] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.302 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [14] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.315 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of ReshapeRemovePass is [2] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.324 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of CondRemovePass is [4] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.333 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of BitcastPass is [1] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.341 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of AssignRemovePass is [6] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.350 [base_pass.cc:339][EVENT]21144 Run:[GEPERFTRACE] The time cost of DimensionAdjustPass is [4] micro second, call num is [3] [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.360 [graph_manager.cc:2864][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of 
OptimizeStage2::MergedGraphNameToPasses is [123] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.381 [graph_manager.cc:2872][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::RemoveIsolatedConst is [12] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.401 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::MultiBatchPass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.417 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::RefIdentityDeleteOpPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.433 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::VariableRefDeleteOpPass is [7] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.448 [compile_nodes_pass.cc:88][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::CompileNodesPass is [3] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.459 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::CompileNodesPass is [16] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.469 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::SwapSpacePass is [2] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.565 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::InputOutputConnectionIdentifyPass is [85] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.599 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::AtomicAddrCleanPass is [21] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.613 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::EndOfSequenceAddControlPass is [3] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.634 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::SubgraphPass is [5] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.648 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::AttachStreamLabelPass is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.658 [graph_manager.cc:2927][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize is [260] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.670 [graph_manager.cc:2937][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of ModelBuilder::AssignFunctionalLabels is [4] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.685 [graph_manager.cc:2943][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of MemcpyAddrAsyncPass::Run. is [7] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.562.696 [graph_manager.cc:2950][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of BufferPoolMemoryPass::Run. is [1] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.460 [graph_manager.cc:2958][EVENT]21144 OptimizeStage2:[GEPERFTRACE] The time cost of ParallelGroupPass::Run. is [43] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.514 [graph_manager.cc:1132][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage2 is [11340] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.603 [graph_manager.cc:1135][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::GetCompilerStages(graph_node->GetGraphId()).optimizer.OptimizeGraphBeforeBuild is [70] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.658 [graph_manager.cc:2975][EVENT]21144 MemConflictProc:[GEPERFTRACE] The time cost of HandleMemoryRWConflict is [36] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.702 [graph_manager.cc:2981][EVENT]21144 MemConflictProc:[GEPERFTRACE] The time cost of MemLayoutConflictOptimizer::Run. is [29] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.720 [pass_manager.cc:82][EVENT]21144 Run:[GEPERFTRACE] The time cost of OptimizeStage2::SetFftsPlusAttrPass is [1] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.731 [graph_manager.cc:2986][EVENT]21144 MemConflictProc:[GEPERFTRACE] The time cost of SetFftsPlusAttrPass::last_passes.Run is [15] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.741 [graph_manager.cc:1136][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::MemConflictProc is [119] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.871 [graph_manager.cc:3555][EVENT]21144 Build:[GEPERFTRACE] The time cost of GraphManager::RecoverIrDefinitionAndModifyAippData is [93] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.573.989 [engine_partitioner.cc:1139][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionInitialize is [18] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.007 [engine_partitioner.cc:1142][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionMarkClusters is [4] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.123 [engine_partitioner.cc:1148][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSplitSubGraphs is [106] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.156 [engine_partitioner.cc:1155][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSortSubGraphs is [19] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.199 [engine_partitioner.cc:1164][EVENT]21144 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionAddPartitionsToGraphNode is [32] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.233 [graph_builder.cc:865][EVENT]21144 SecondPartition:[GEPERFTRACE] The time cost of EnginePartitioner::Partition2 is [282] micro second. [INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:35.574.736 [logger.cc:1071] 21144 ModelBindStream: model_id=832, stream_id=1089, flag=0. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.779 [task_generator.cc:804][EVENT]21144 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::SetStreamCtx is [184] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.574.854 [task_generator.cc:805][EVENT]21144 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::PrepareForGenerateTask is [61] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.575.682 [task_generator.cc:814][EVENT]21144 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::DoGenerateTask is [810] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.575.699 [task_generator.cc:954][EVENT]21144 GetTaskInfo:[GEPERFTRACE] The time cost of TaskGenerator::GenerateTask is [1104] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.575.775 [task_generator.cc:967][EVENT]21144 GetTaskInfo:[GEPERFTRACE] The time cost of TaskGenerator::AddModelTaskToModel is [47] micro second. 
[INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:35.575.794 [logger.cc:1084] 21144 ModelUnbindStream: model_id=832, stream_id=1089, [INFO] GE(15744,python3.7):2024-01-11-05:27:35.575.993 [graph_manager.cc:1152][EVENT]21144 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::Build is [2226] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.576.014 [graph_manager.cc:1164][EVENT]21144 PreRunAfterOptimizeSubGraph:PreRun:PreRunAfterOptimizeSubGraph success. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.576.058 [graph_manager.cc:1271][EVENT]21144 PreRun:[GEPERFTRACE] The time cost of FlowModelBuild is [59361] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.576.070 [graph_manager.cc:1272][EVENT]21144 PreRun:[GEPERFTRACE] GE PreRun End [INFO] ATRACE(15744,python3.7):2024-01-11-05:27:35.576.396 [atrace_api.c:93](tid:21144) AtraceDestroy start [INFO] ATRACE(15744,python3.7):2024-01-11-05:27:35.576.420 [atrace_api.c:95](tid:21144) AtraceDestroy end [INFO] GE(15744,python3.7):2024-01-11-05:27:35.581.639 [graph_converter.cc:838][EVENT]21144 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::CreateMainNode is [1556] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.581.827 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of ZeroCopy is [136] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.341 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CEM is [488] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.552 [copy_flow_launch_fuse.cc:395][EVENT]21144 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [183] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.576 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [209] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.822 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [233] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.852 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [9] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.582.888 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of ZeroCopy is [23] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.082 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CEM is [180] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.165 [copy_flow_launch_fuse.cc:395][EVENT]21144 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [64] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.180 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [79] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.233 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [20] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.245 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [0] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.271 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of ZeroCopy is [16] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.343 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CEM is [61] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.409 [copy_flow_launch_fuse.cc:395][EVENT]21144 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [54] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.422 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [67] micro second. 
[INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.447 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [17] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.458 [base_optimizer.cc:70][EVENT]21144 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [0] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.471 [graph_converter.cc:849][EVENT]21144 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::RunAllPass is [1787] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.583.682 [graph_converter.cc:853][EVENT]21144 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::TopologicalSorting is [201] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.584.441 [graph_converter.cc:857][EVENT]21144 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::AppendGraphLevelData is [743] micro second. [INFO] GE(15744,python3.7):2024-01-11-05:27:35.584.598 [graph_converter.cc:862][EVENT]21144 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::CalculatePriority is [127] micro second. 
TotalTime = 0.025205, [20] [parse]: 0.00273351 [symbol_resolve]: 0.0109578, [1] [Cycle 1]: 0.0108989, [1] [resolve]: 0.0108772 [combine_like_graphs]: 1.26e-06 [graph_reusing]: 3.38e-06 [meta_unpack_prepare]: 5.091e-05 [pre_cconv]: 6.30003e-07 [abstract_specialize]: 0.00218602 [pack_expand]: 1.151e-05 [auto_monad]: 5.127e-05 [inline]: 1.48e-06 [pre_auto_parallel]: 1.128e-05 [pipeline_split]: 2.86e-06 [optimize]: 0.00394906, [35] [py_interpret_to_execute]: 4.25999e-06 [rewriter_before_opt_a]: 3.548e-05 [opt_a]: 0.00345149, [2] [Cycle 1]: 0.00086547, [30] [expand_dump_flag]: 3.67e-06 [switch_simplify]: 1.309e-05 [a_1]: 0.00020282 [recompute_prepare]: 2.31e-06 [updatestate_depend_eliminate]: 6.32e-06 [updatestate_assign_eliminate]: 3.66e-06 [updatestate_loads_eliminate]: 3.11e-06 [parameter_eliminate]: 3.55e-06 [a_2]: 2.927e-05 [accelerated_algorithm]: 6.75e-06 [pynative_shard]: 1.75e-06 [auto_parallel]: 3.31e-06 [parallel]: 9.88e-06 [merge_comm]: 3.39e-06 [allreduce_fusion]: 1.86e-06 [virtual_dataset]: 2.73e-06 [get_grad_eliminate_]: 1.99e-06 [virtual_output]: 1.7e-06 [merge_forward]: 5.22e-06 [cell_reuse_recompute_pass]: 8.39995e-07 [cell_reuse_handle_not_recompute_node_pass]: 5.88001e-06 [meta_fg_expand]: 3.6e-06 [after_resolve]: 5.17999e-06 [a_after_grad]: 2.6e-06 [renormalize]: 0.00033855 [real_op_eliminate]: 4.74e-06 [auto_monad_grad]: 3.98e-06 [auto_monad_eliminator]: 1.059e-05 [cse]: 2.542e-05 [a_3]: 1.568e-05 [Cycle 2]: 0.00022901, [30] [expand_dump_flag]: 1.05e-06 [switch_simplify]: 2.09e-06 [a_1]: 1.583e-05 [recompute_prepare]: 1.6e-06 [updatestate_depend_eliminate]: 2.97e-06 [updatestate_assign_eliminate]: 2.44e-06 [updatestate_loads_eliminate]: 2.37999e-06 [parameter_eliminate]: 9.00007e-07 [a_2]: 2.724e-05 [accelerated_algorithm]: 5.58e-06 [pynative_shard]: 9.20001e-07 [auto_parallel]: 3.19e-06 [parallel]: 3.50999e-06 [merge_comm]: 1.6e-06 [allreduce_fusion]: 1.11e-06 [virtual_dataset]: 2.31e-06 [get_grad_eliminate_]: 1.79e-06 [virtual_output]: 1.66e-06 
[merge_forward]: 3.04e-06 [cell_reuse_recompute_pass]: 3.80001e-07 [cell_reuse_handle_not_recompute_node_pass]: 4.43e-06 [meta_fg_expand]: 1.86e-06 [after_resolve]: 3.25e-06 [a_after_grad]: 2.19999e-06 [renormalize]: 7.0002e-08 [real_op_eliminate]: 1.74e-06 [auto_monad_grad]: 9.79999e-07 [auto_monad_eliminator]: 3.91e-06 [cse]: 8.66e-06 [a_3]: 1.335e-05 [py_interpret_to_execute_after_opt_a]: 3.53999e-06 [slice_cell_reuse_recomputed_activation]: 2.58e-06 [rewriter_after_opt_a]: 2.001e-05 [convert_after_rewriter]: 5.98e-06 [order_py_execute_after_rewriter]: 4.57e-06 [opt_b]: 8.75e-05, [1] [Cycle 1]: 8.276e-05, [7] [b_1]: 3.986e-05 [b_2]: 2.89e-06 [updatestate_depend_eliminate]: 2.4e-06 [updatestate_assign_eliminate]: 2.19001e-06 [updatestate_loads_eliminate]: 2.17e-06 [renormalize]: 3.09999e-07 [cse]: 6.62e-06 [cconv]: 2.409e-05 [opt_after_cconv]: 4.928e-05, [1] [Cycle 1]: 4.545e-05, [7] [c_1]: 5.03e-06 [parameter_eliminate]: 8.59996e-07 [updatestate_depend_eliminate]: 2.24e-06 [updatestate_assign_eliminate]: 1.92e-06 [updatestate_loads_eliminate]: 1.9e-06 [cse]: 6.2e-06 [renormalize]: 1.69995e-07 [remove_dup_value]: 1.105e-05 [tuple_transform]: 3.434e-05, [1] [Cycle 1]: 3.082e-05, [3] [d_1]: 1.309e-05 [d_2]: 5.94e-06 [renormalize]: 1.40004e-07 [add_cache_embedding]: 1.156e-05 [add_recomputation]: 4.31e-05 [cse_after_recomputation]: 1.558e-05, [1] [Cycle 1]: 1.094e-05, [1] [cse]: 6.86e-06 [environ_conv]: 5.32001e-06 [label_micro_interleaved_index]: 2.24e-06 [label_fine_grained_interleaved_index]: 2.91e-06 [assign_add_opt]: 1.66e-06 [slice_recompute_activation]: 2.47001e-06 [micro_interleaved_order_control]: 1.87e-06 [full_micro_interleaved_order_control]: 2.05e-06 [comp_comm_scheduling]: 2.06001e-06 [reorder_send_recv_between_fp_bp]: 2.82e-06 [comm_op_add_attrs]: 1.44e-06 [add_comm_op_reuse_tag]: 1.05e-06 [overlap_opt_shard_in_pipeline]: 1.26e-06 [grouped_pairwise_exchange_alltoall]: 1.73e-06 [overlap_recompute_and_grad_model_parallel]: 1.9e-06 
[overlap_grad_matmul_and_grad_allreduce]: 7.89994e-07 [split_matmul_comm_elemetwise]: 2.85e-06 [split_layernorm_comm]: 2.08e-06 [process_send_recv_for_ge]: 9e-07 [handle_group_info]: 1.42999e-06 [auto_monad_reorder]: 1.589e-05 [get_jit_bprop_graph]: 4.39999e-07 [eliminate_special_op_node]: 0.00045975 [validate]: 2.438e-05 [distribtued_split]: 1.59e-06 [task_emit]: 0.00452902 [execute]: 8.09001e-06 Sums parse : 0.002734s : 12.36% symbol_resolve.resolve : 0.010877s : 49.17% combine_like_graphs : 0.000001s : 0.01% graph_reusing : 0.000003s : 0.02% meta_unpack_prepare : 0.000051s : 0.23% pre_cconv : 0.000001s : 0.00% abstract_specialize : 0.002186s : 9.88% pack_expand : 0.000012s : 0.05% auto_monad : 0.000051s : 0.23% inline : 0.000001s : 0.01% pre_auto_parallel : 0.000011s : 0.05% pipeline_split : 0.000003s : 0.01% optimize.py_interpret_to_execute : 0.000004s : 0.02% optimize.rewriter_before_opt_a : 0.000035s : 0.16% optimize.opt_a.expand_dump_flag : 0.000005s : 0.02% optimize.opt_a.switch_simplify : 0.000015s : 0.07% optimize.opt_a.a_1 : 0.000219s : 0.99% optimize.opt_a.recompute_prepare : 0.000004s : 0.02% optimize.opt_a.updatestate_depend_eliminate : 0.000009s : 0.04% optimize.opt_a.updatestate_assign_eliminate : 0.000006s : 0.03% optimize.opt_a.updatestate_loads_eliminate : 0.000005s : 0.02% optimize.opt_a.parameter_eliminate : 0.000004s : 0.02% optimize.opt_a.a_2 : 0.000057s : 0.26% optimize.opt_a.accelerated_algorithm : 0.000012s : 0.06% optimize.opt_a.pynative_shard : 0.000003s : 0.01% optimize.opt_a.auto_parallel : 0.000006s : 0.03% optimize.opt_a.parallel : 0.000013s : 0.06% optimize.opt_a.merge_comm : 0.000005s : 0.02% optimize.opt_a.allreduce_fusion : 0.000003s : 0.01% optimize.opt_a.virtual_dataset : 0.000005s : 0.02% optimize.opt_a.get_grad_eliminate_ : 0.000004s : 0.02% optimize.opt_a.virtual_output : 0.000003s : 0.02% optimize.opt_a.merge_forward : 0.000008s : 0.04% optimize.opt_a.cell_reuse_recompute_pass : 0.000001s : 0.01% 
optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000010s : 0.05% optimize.opt_a.meta_fg_expand : 0.000005s : 0.02% optimize.opt_a.after_resolve : 0.000008s : 0.04% optimize.opt_a.a_after_grad : 0.000005s : 0.02% optimize.opt_a.renormalize : 0.000339s : 1.53% optimize.opt_a.real_op_eliminate : 0.000006s : 0.03% optimize.opt_a.auto_monad_grad : 0.000005s : 0.02% optimize.opt_a.auto_monad_eliminator : 0.000015s : 0.07% optimize.opt_a.cse : 0.000034s : 0.15% optimize.opt_a.a_3 : 0.000029s : 0.13% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.02% optimize.slice_cell_reuse_recomputed_activation : 0.000003s : 0.01% optimize.rewriter_after_opt_a : 0.000020s : 0.09% optimize.convert_after_rewriter : 0.000006s : 0.03% optimize.order_py_execute_after_rewriter : 0.000005s : 0.02% optimize.opt_b.b_1 : 0.000040s : 0.18% optimize.opt_b.b_2 : 0.000003s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000007s : 0.03% optimize.cconv : 0.000024s : 0.11% optimize.opt_after_cconv.c_1 : 0.000005s : 0.02% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.cse : 0.000006s : 0.03% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000011s : 0.05% optimize.tuple_transform.d_1 : 0.000013s : 0.06% optimize.tuple_transform.d_2 : 0.000006s : 0.03% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000012s : 0.05% optimize.add_recomputation : 0.000043s : 0.19% optimize.cse_after_recomputation.cse : 0.000007s : 0.03% 
optimize.environ_conv : 0.000005s : 0.02% optimize.label_micro_interleaved_index : 0.000002s : 0.01% optimize.label_fine_grained_interleaved_index : 0.000003s : 0.01% optimize.assign_add_opt : 0.000002s : 0.01% optimize.slice_recompute_activation : 0.000002s : 0.01% optimize.micro_interleaved_order_control : 0.000002s : 0.01% optimize.full_micro_interleaved_order_control : 0.000002s : 0.01% optimize.comp_comm_scheduling : 0.000002s : 0.01% optimize.reorder_send_recv_between_fp_bp : 0.000003s : 0.01% optimize.comm_op_add_attrs : 0.000001s : 0.01% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.01% optimize.grouped_pairwise_exchange_alltoall : 0.000002s : 0.01% optimize.overlap_recompute_and_grad_model_parallel : 0.000002s : 0.01% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000003s : 0.01% optimize.split_layernorm_comm : 0.000002s : 0.01% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.01% auto_monad_reorder : 0.000016s : 0.07% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000460s : 2.08% validate : 0.000024s : 0.11% distribtued_split : 0.000002s : 0.01% task_emit : 0.004529s : 20.47% execute : 0.000008s : 0.04% Time group info: ------[substitution.] 0.010771 39 98.80% : 0.010642s : 8: substitution.getattr_setattr_resolve 0.04% : 0.000005s : 3: substitution.graph_param_transform 0.89% : 0.000096s : 3: substitution.inline 0.03% : 0.000003s : 2: substitution.less_batch_normalization 0.09% : 0.000010s : 13: substitution.meta_unpack_prepare 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.01% : 0.000002s : 4: substitution.remove_not_recompute_node 0.02% : 0.000003s : 2: substitution.replace_old_param 0.08% : 0.000009s : 1: substitution.tuple_list_get_item_eliminator ------[renormalize.] 
0.000332 2 61.25% : 0.000203s : 1: renormalize.infer 38.75% : 0.000129s : 1: renormalize.specialize ------[replace.] 0.000168 10 77.35% : 0.000130s : 6: replace.getattr_setattr_resolve 17.71% : 0.000030s : 3: replace.inline 4.94% : 0.000008s : 1: replace.tuple_list_get_item_eliminator ------[match.] 0.010684 10 99.02% : 0.010578s : 6: match.getattr_setattr_resolve 0.90% : 0.000096s : 3: match.inline 0.08% : 0.000009s : 1: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.000485 10 70.28% : 0.000341s : 5: func_graph_cloner_run.FuncGraphClonerGraph 29.72% : 0.000144s : 5: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.011304 105 0.66% : 0.000075s : 52: opt.transform.opt_a 0.28% : 0.000031s : 23: opt.transform.opt_b 96.15% : 0.010870s : 2: opt.transform.opt_resolve 0.26% : 0.000029s : 1: opt.transforms.meta_unpack_prepare 2.37% : 0.000268s : 20: opt.transforms.opt_a 0.03% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000002s : 1: opt.transforms.opt_b 0.15% : 0.000017s : 2: opt.transforms.opt_trans_graph 0.07% : 0.000008s : 3: opt.transforms.special_op_eliminate TotalTime = 0.0870279, [20] [parse]: 0.00127508 [symbol_resolve]: 0.0123574, [1] [Cycle 1]: 0.0123031, [1] [resolve]: 0.0122211 [combine_like_graphs]: 9.5e-07 [graph_reusing]: 2.88e-06 [meta_unpack_prepare]: 0.00013437 [pre_cconv]: 6.90001e-07 [abstract_specialize]: 0.00390929 [pack_expand]: 1.432e-05 [auto_monad]: 7.62e-05 [inline]: 1.15e-06 [pre_auto_parallel]: 7.73e-06 [pipeline_split]: 1.75e-06 [optimize]: 0.0637136, [35] [py_interpret_to_execute]: 4.32e-06 [rewriter_before_opt_a]: 0.00016541 [opt_a]: 0.0624915, [4] [Cycle 1]: 0.0306361, [30] [expand_dump_flag]: 3.15e-06 [switch_simplify]: 2.372e-05 [a_1]: 0.00041174 [recompute_prepare]: 8.98e-06 [updatestate_depend_eliminate]: 9.52e-06 [updatestate_assign_eliminate]: 6.45001e-06 [updatestate_loads_eliminate]: 6.88e-06 
[parameter_eliminate]: 4.91e-06 [a_2]: 8.058e-05 [accelerated_algorithm]: 7.76e-06 [pynative_shard]: 1.15e-06 [auto_parallel]: 3.18e-06 [parallel]: 6.42e-06 [merge_comm]: 2.57e-06 [allreduce_fusion]: 1.62e-06 [virtual_dataset]: 5.02e-06 [get_grad_eliminate_]: 4.5e-06 [virtual_output]: 3.91999e-06 [merge_forward]: 7.88e-06 [cell_reuse_recompute_pass]: 5.49997e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.156e-05 [meta_fg_expand]: 0.00184526, [1] [Cycle 1]: 0.00044458, [1] [resolve]: 0.00042387 [after_resolve]: 2.021e-05 [a_after_grad]: 3.824e-05 [renormalize]: 0.0275514 [real_op_eliminate]: 2.508e-05 [auto_monad_grad]: 3.331e-05 [auto_monad_eliminator]: 4.939e-05 [cse]: 0.00011453 [a_3]: 0.00016831 [Cycle 2]: 0.0261401, [30] [expand_dump_flag]: 2.72e-06 [switch_simplify]: 6.505e-05 [a_1]: 0.00045192 [recompute_prepare]: 9.91e-06 [updatestate_depend_eliminate]: 1.306e-05 [updatestate_assign_eliminate]: 8.98e-06 [updatestate_loads_eliminate]: 8.58e-06 [parameter_eliminate]: 3.81999e-06 [a_2]: 0.00012194 [accelerated_algorithm]: 1.263e-05 [pynative_shard]: 1.6e-06 [auto_parallel]: 6.46e-06 [parallel]: 5.71e-06 [merge_comm]: 2.89e-06 [allreduce_fusion]: 1.49e-06 [virtual_dataset]: 6.96e-06 [get_grad_eliminate_]: 6.11e-06 [virtual_output]: 5.73e-06 [merge_forward]: 1.026e-05 [cell_reuse_recompute_pass]: 5.9e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.477e-05 [meta_fg_expand]: 0.00484527, [3] [Cycle 1]: 0.00037912, [1] [resolve]: 0.00036053 [Cycle 1]: 0.00042836, [1] [resolve]: 0.00041042 [Cycle 1]: 0.00032635, [1] [resolve]: 0.0003083 [after_resolve]: 3.1e-05 [a_after_grad]: 5.187e-05 [renormalize]: 0.0198129 [real_op_eliminate]: 3.028e-05 [auto_monad_grad]: 3.879e-05 [auto_monad_eliminator]: 6.051e-05 [cse]: 0.00013422 [a_3]: 0.00020863 [Cycle 3]: 0.00267443, [30] [expand_dump_flag]: 3.59e-06 [switch_simplify]: 6.701e-05 [a_1]: 0.00062772 [recompute_prepare]: 1.171e-05 [updatestate_depend_eliminate]: 1.406e-05 [updatestate_assign_eliminate]: 1.121e-05 
[updatestate_loads_eliminate]: 1.064e-05 [parameter_eliminate]: 4.06001e-06 [a_2]: 0.00015375 [accelerated_algorithm]: 1.55e-05 [pynative_shard]: 1.5e-06 [auto_parallel]: 4.88e-06 [parallel]: 4.07e-06 [merge_comm]: 2.64e-06 [allreduce_fusion]: 1.72e-06 [virtual_dataset]: 8.49e-06 [get_grad_eliminate_]: 8.01e-06 [virtual_output]: 7.33e-06 [merge_forward]: 1.212e-05 [cell_reuse_recompute_pass]: 4.60001e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.962e-05 [meta_fg_expand]: 2.832e-05 [after_resolve]: 1.081e-05 [a_after_grad]: 1.392e-05 [renormalize]: 0.00127328 [real_op_eliminate]: 1.305e-05 [auto_monad_grad]: 5.64e-06 [auto_monad_eliminator]: 2.38e-05 [cse]: 9.313e-05 [a_3]: 7.073e-05 [Cycle 4]: 0.00073599, [30] [expand_dump_flag]: 1.5e-06 [switch_simplify]: 8.9e-06 [a_1]: 0.00015602 [recompute_prepare]: 1.007e-05 [updatestate_depend_eliminate]: 1.396e-05 [updatestate_assign_eliminate]: 1.068e-05 [updatestate_loads_eliminate]: 1.026e-05 [parameter_eliminate]: 1.94e-06 [a_2]: 0.00014983 [accelerated_algorithm]: 1.473e-05 [pynative_shard]: 1.27e-06 [auto_parallel]: 3.61e-06 [parallel]: 3.75e-06 [merge_comm]: 2.35e-06 [allreduce_fusion]: 1.48e-06 [virtual_dataset]: 8.09e-06 [get_grad_eliminate_]: 7.59e-06 [virtual_output]: 7.34e-06 [merge_forward]: 1.175e-05 [cell_reuse_recompute_pass]: 3.80001e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.879e-05 [meta_fg_expand]: 8.84e-06 [after_resolve]: 1.046e-05 [a_after_grad]: 1.342e-05 [renormalize]: 7.99992e-08 [real_op_eliminate]: 7.53e-06 [auto_monad_grad]: 2.35e-06 [auto_monad_eliminator]: 2.04e-05 [cse]: 4.87e-05 [a_3]: 6.377e-05 [py_interpret_to_execute_after_opt_a]: 4.47e-06 [slice_cell_reuse_recomputed_activation]: 2.4e-06 [rewriter_after_opt_a]: 6.732e-05 [convert_after_rewriter]: 1.764e-05 [order_py_execute_after_rewriter]: 1.282e-05 [opt_b]: 0.00057582, [2] [Cycle 1]: 0.00048975, [7] [b_1]: 0.00043451 [b_2]: 3.42001e-06 [updatestate_depend_eliminate]: 3.63e-06 [updatestate_assign_eliminate]: 2.45e-06 
[updatestate_loads_eliminate]: 2.19e-06 [renormalize]: 3.89999e-07 [cse]: 1.014e-05 [Cycle 2]: 7.657e-05, [7] [b_1]: 3.654e-05 [b_2]: 2.08e-06 [updatestate_depend_eliminate]: 2.26e-06 [updatestate_assign_eliminate]: 1.92e-06 [updatestate_loads_eliminate]: 1.91e-06 [renormalize]: 7.99992e-08 [cse]: 5.94999e-06 [cconv]: 2.33e-05 [opt_after_cconv]: 8.661e-05, [1] [Cycle 1]: 8.203e-05, [7] [c_1]: 4.98e-06 [parameter_eliminate]: 2.26e-06 [updatestate_depend_eliminate]: 2.21e-06 [updatestate_assign_eliminate]: 2.02e-06 [updatestate_loads_eliminate]: 1.9e-06 [cse]: 6.40999e-06 [renormalize]: 3.19997e-07 [remove_dup_value]: 9.65e-06 [tuple_transform]: 3.66e-05, [1] [Cycle 1]: 3.266e-05, [3] [d_1]: 1.468e-05 [d_2]: 6.1e-06 [renormalize]: 1.70003e-07 [add_cache_embedding]: 1.159e-05 [add_recomputation]: 4.06e-05 [cse_after_recomputation]: 1.598e-05, [1] [Cycle 1]: 1.178e-05, [1] [cse]: 7.37e-06 [environ_conv]: 6.2e-06 [label_micro_interleaved_index]: 2.53e-06 [label_fine_grained_interleaved_index]: 2.58e-06 [assign_add_opt]: 1.53e-06 [slice_recompute_activation]: 2.36e-06 [micro_interleaved_order_control]: 1.77e-06 [full_micro_interleaved_order_control]: 1.73999e-06 [comp_comm_scheduling]: 2.35e-06 [reorder_send_recv_between_fp_bp]: 2.37e-06 [comm_op_add_attrs]: 1.12e-06 [add_comm_op_reuse_tag]: 1.01e-06 [overlap_opt_shard_in_pipeline]: 1.26e-06 [grouped_pairwise_exchange_alltoall]: 1.26e-06 [overlap_recompute_and_grad_model_parallel]: 1.7e-06 [overlap_grad_matmul_and_grad_allreduce]: 5.9e-07 [split_matmul_comm_elemetwise]: 2.34e-06 [split_layernorm_comm]: 2.3e-06 [process_send_recv_for_ge]: 9.70002e-07 [handle_group_info]: 1.45e-06 [auto_monad_reorder]: 1.486e-05 [get_jit_bprop_graph]: 4.19997e-07 [eliminate_special_op_node]: 0.00077396 [validate]: 2.765e-05 [distribtued_split]: 1.3e-06 [task_emit]: 0.00451959 [execute]: 8.05e-06 Sums parse : 0.001275s : 1.63% symbol_resolve.resolve : 0.012221s : 15.60% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000003s : 
0.00% meta_unpack_prepare : 0.000134s : 0.17% pre_cconv : 0.000001s : 0.00% abstract_specialize : 0.003909s : 4.99% pack_expand : 0.000014s : 0.02% auto_monad : 0.000076s : 0.10% inline : 0.000001s : 0.00% pre_auto_parallel : 0.000008s : 0.01% pipeline_split : 0.000002s : 0.00% optimize.py_interpret_to_execute : 0.000004s : 0.01% optimize.rewriter_before_opt_a : 0.000165s : 0.21% optimize.opt_a.expand_dump_flag : 0.000011s : 0.01% optimize.opt_a.switch_simplify : 0.000165s : 0.21% optimize.opt_a.a_1 : 0.001647s : 2.10% optimize.opt_a.recompute_prepare : 0.000041s : 0.05% optimize.opt_a.updatestate_depend_eliminate : 0.000051s : 0.06% optimize.opt_a.updatestate_assign_eliminate : 0.000037s : 0.05% optimize.opt_a.updatestate_loads_eliminate : 0.000036s : 0.05% optimize.opt_a.parameter_eliminate : 0.000015s : 0.02% optimize.opt_a.a_2 : 0.000506s : 0.65% optimize.opt_a.accelerated_algorithm : 0.000051s : 0.06% optimize.opt_a.pynative_shard : 0.000006s : 0.01% optimize.opt_a.auto_parallel : 0.000018s : 0.02% optimize.opt_a.parallel : 0.000020s : 0.03% optimize.opt_a.merge_comm : 0.000010s : 0.01% optimize.opt_a.allreduce_fusion : 0.000006s : 0.01% optimize.opt_a.virtual_dataset : 0.000029s : 0.04% optimize.opt_a.get_grad_eliminate_ : 0.000026s : 0.03% optimize.opt_a.virtual_output : 0.000024s : 0.03% optimize.opt_a.merge_forward : 0.000042s : 0.05% optimize.opt_a.cell_reuse_recompute_pass : 0.000002s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000065s : 0.08% optimize.opt_a.meta_fg_expand : 0.000037s : 0.05% optimize.opt_a.meta_fg_expand.resolve : 0.001503s : 1.92% optimize.opt_a.after_resolve : 0.000072s : 0.09% optimize.opt_a.a_after_grad : 0.000117s : 0.15% optimize.opt_a.renormalize : 0.048638s : 62.09% optimize.opt_a.real_op_eliminate : 0.000076s : 0.10% optimize.opt_a.auto_monad_grad : 0.000080s : 0.10% optimize.opt_a.auto_monad_eliminator : 0.000154s : 0.20% optimize.opt_a.cse : 0.000391s : 0.50% optimize.opt_a.a_3 : 0.000511s : 0.65% 
optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.01% optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.00% optimize.rewriter_after_opt_a : 0.000067s : 0.09% optimize.convert_after_rewriter : 0.000018s : 0.02% optimize.order_py_execute_after_rewriter : 0.000013s : 0.02% optimize.opt_b.b_1 : 0.000471s : 0.60% optimize.opt_b.b_2 : 0.000006s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000006s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000004s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000016s : 0.02% optimize.cconv : 0.000023s : 0.03% optimize.opt_after_cconv.c_1 : 0.000005s : 0.01% optimize.opt_after_cconv.parameter_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.cse : 0.000006s : 0.01% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000010s : 0.01% optimize.tuple_transform.d_1 : 0.000015s : 0.02% optimize.tuple_transform.d_2 : 0.000006s : 0.01% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000012s : 0.01% optimize.add_recomputation : 0.000041s : 0.05% optimize.cse_after_recomputation.cse : 0.000007s : 0.01% optimize.environ_conv : 0.000006s : 0.01% optimize.label_micro_interleaved_index : 0.000003s : 0.00% optimize.label_fine_grained_interleaved_index : 0.000003s : 0.00% optimize.assign_add_opt : 0.000002s : 0.00% optimize.slice_recompute_activation : 0.000002s : 0.00% optimize.micro_interleaved_order_control : 0.000002s : 0.00% optimize.full_micro_interleaved_order_control : 0.000002s : 0.00% optimize.comp_comm_scheduling : 0.000002s : 0.00% optimize.reorder_send_recv_between_fp_bp : 0.000002s : 0.00% 
optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000002s : 0.00% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.00% optimize.split_layernorm_comm : 0.000002s : 0.00% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000015s : 0.02% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000774s : 0.99% validate : 0.000028s : 0.04% distribtued_split : 0.000001s : 0.00% task_emit : 0.004520s : 5.77% execute : 0.000008s : 0.01% Time group info: ------[substitution.] 0.014010 387 0.02% : 0.000003s : 5: substitution.float_depend_g_call 0.08% : 0.000011s : 14: substitution.float_tuple_getitem_switch 92.99% : 0.013029s : 25: substitution.getattr_setattr_resolve 0.03% : 0.000005s : 3: substitution.graph_param_transform 0.03% : 0.000004s : 3: substitution.incorporate_call 0.01% : 0.000002s : 3: substitution.incorporate_call_switch 4.42% : 0.000619s : 59: substitution.inline 0.05% : 0.000007s : 14: substitution.less_batch_normalization 0.21% : 0.000029s : 23: substitution.meta_unpack_prepare 0.09% : 0.000012s : 11: substitution.minmaximum_grad 0.03% : 0.000004s : 5: substitution.partial_eliminate 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.04% : 0.000006s : 47: substitution.remove_not_recompute_node 0.39% : 0.000055s : 38: substitution.replace_applicator 0.05% : 0.000007s : 20: substitution.replace_old_param 0.02% : 0.000003s : 2: substitution.reset_defer_inline 0.05% : 0.000007s : 8: substitution.set_cell_output_no_recompute 0.05% : 0.000007s : 5: substitution.specialize_transform 0.06% : 0.000009s : 4: substitution.switch_simplify 0.09% : 0.000013s : 2: 
substitution.transpose_eliminate 0.30% : 0.000041s : 15: substitution.tuple_list_convert_item_index_to_positive 0.12% : 0.000017s : 15: substitution.tuple_list_get_item_const_eliminator 0.17% : 0.000023s : 15: substitution.tuple_list_get_item_depend_reorder 0.51% : 0.000072s : 33: substitution.tuple_list_get_item_eliminator 0.17% : 0.000024s : 15: substitution.tuple_list_get_set_item_eliminator ------[renormalize.] 0.048622 6 92.60% : 0.045026s : 3: renormalize.infer 7.40% : 0.003597s : 3: renormalize.specialize ------[replace.] 0.000755 68 46.32% : 0.000350s : 23: replace.getattr_setattr_resolve 30.00% : 0.000226s : 31: replace.inline 7.00% : 0.000053s : 2: replace.meta_unpack_prepare 8.37% : 0.000063s : 4: replace.switch_simplify 1.55% : 0.000012s : 2: replace.transpose_eliminate 6.76% : 0.000051s : 6: replace.tuple_list_get_item_eliminator ------[match.] 0.013597 68 95.46% : 0.012981s : 23: match.getattr_setattr_resolve 4.10% : 0.000557s : 31: match.inline 0.13% : 0.000018s : 2: match.meta_unpack_prepare 0.07% : 0.000009s : 4: match.switch_simplify 0.10% : 0.000013s : 2: match.transpose_eliminate 0.14% : 0.000019s : 6: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.004102 69 67.26% : 0.002759s : 28: func_graph_cloner_run.FuncGraphClonerGraph 32.74% : 0.001343s : 41: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 
0.017459 255 5.91% : 0.001032s : 104: opt.transform.opt_a 2.51% : 0.000438s : 92: opt.transform.opt_b 78.04% : 0.013624s : 10: opt.transform.opt_resolve 0.67% : 0.000116s : 1: opt.transforms.meta_unpack_prepare 12.68% : 0.002213s : 40: opt.transforms.opt_a 0.02% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000004s : 2: opt.transforms.opt_b 0.11% : 0.000019s : 2: opt.transforms.opt_trans_graph 0.05% : 0.000009s : 3: opt.transforms.special_op_eliminate TotalTime = 0.0239554, [20] [parse]: 0.00138063 [symbol_resolve]: 0.0111858, [1] [Cycle 1]: 0.0111235, [1] [resolve]: 0.0111027 [combine_like_graphs]: 1.18e-06 [graph_reusing]: 3.24001e-06 [meta_unpack_prepare]: 4.926e-05 [pre_cconv]: 6.69999e-07 [abstract_specialize]: 0.00216528 [pack_expand]: 1.172e-05 [auto_monad]: 5.321e-05 [inline]: 1.62e-06 [pre_auto_parallel]: 1.021e-05 [pipeline_split]: 3.58e-06 [optimize]: 0.0039295, [35] [py_interpret_to_execute]: 4.12e-06 [rewriter_before_opt_a]: 3.493e-05 [opt_a]: 0.00343468, [2] [Cycle 1]: 0.00083957, [30] [expand_dump_flag]: 3.78e-06 [switch_simplify]: 1.308e-05 [a_1]: 0.00020216 [recompute_prepare]: 2.49e-06 [updatestate_depend_eliminate]: 6.4e-06 [updatestate_assign_eliminate]: 3.48e-06 [updatestate_loads_eliminate]: 3.26e-06 [parameter_eliminate]: 3.55e-06 [a_2]: 2.983e-05 [accelerated_algorithm]: 7.13e-06 [pynative_shard]: 1.75001e-06 [auto_parallel]: 3.43e-06 [parallel]: 9.25e-06 [merge_comm]: 3.78e-06 [allreduce_fusion]: 1.79e-06 [virtual_dataset]: 2.66e-06 [get_grad_eliminate_]: 2.07e-06 [virtual_output]: 1.76e-06 [merge_forward]: 5.1e-06 [cell_reuse_recompute_pass]: 8.49999e-07 [cell_reuse_handle_not_recompute_node_pass]: 5.73e-06 [meta_fg_expand]: 3.42001e-06 [after_resolve]: 5.03e-06 [a_after_grad]: 2.38e-06 [renormalize]: 0.00031049 [real_op_eliminate]: 4.99999e-06 [auto_monad_grad]: 4.59e-06 [auto_monad_eliminator]: 1.127e-05 [cse]: 2.497e-05 [a_3]: 1.585e-05 [Cycle 2]: 0.00023034, [30] [expand_dump_flag]: 1.02e-06 [switch_simplify]: 2.21e-06 
[a_1]: 1.652e-05 [recompute_prepare]: 1.77e-06 [updatestate_depend_eliminate]: 2.9e-06 [updatestate_assign_eliminate]: 2.5e-06 [updatestate_loads_eliminate]: 2.31e-06 [parameter_eliminate]: 1.05e-06 [a_2]: 2.67e-05 [accelerated_algorithm]: 5.89e-06 [pynative_shard]: 1.01e-06 [auto_parallel]: 3.37001e-06 [parallel]: 3.28e-06 [merge_comm]: 1.58e-06 [allreduce_fusion]: 1.24001e-06 [virtual_dataset]: 2.2e-06 [get_grad_eliminate_]: 1.86e-06 [virtual_output]: 1.7e-06 [merge_forward]: 3.01e-06 [cell_reuse_recompute_pass]: 3.50003e-07 [cell_reuse_handle_not_recompute_node_pass]: 4.76999e-06 [meta_fg_expand]: 1.88e-06 [after_resolve]: 3.19e-06 [a_after_grad]: 2.53e-06 [renormalize]: 7.99992e-08 [real_op_eliminate]: 1.77e-06 [auto_monad_grad]: 1.01e-06 [auto_monad_eliminator]: 4.07e-06 [cse]: 8.14e-06 [a_3]: 1.306e-05 [py_interpret_to_execute_after_opt_a]: 3.29001e-06 [slice_cell_reuse_recomputed_activation]: 2.48e-06 [rewriter_after_opt_a]: 1.948e-05 [convert_after_rewriter]: 6.13e-06 [order_py_execute_after_rewriter]: 4.65e-06 [opt_b]: 8.966e-05, [1] [Cycle 1]: 8.502e-05, [7] [b_1]: 3.98e-05 [b_2]: 2.94e-06 [updatestate_depend_eliminate]: 2.35e-06 [updatestate_assign_eliminate]: 2.36e-06 [updatestate_loads_eliminate]: 2.22999e-06 [renormalize]: 3.20004e-07 [cse]: 7.21e-06 [cconv]: 2.423e-05 [opt_after_cconv]: 4.915e-05, [1] [Cycle 1]: 4.519e-05, [7] [c_1]: 5.03e-06 [parameter_eliminate]: 6.19999e-07 [updatestate_depend_eliminate]: 2.28e-06 [updatestate_assign_eliminate]: 1.99e-06 [updatestate_loads_eliminate]: 1.86e-06 [cse]: 6.73e-06 [renormalize]: 1.59998e-07 [remove_dup_value]: 1.122e-05 [tuple_transform]: 3.447e-05, [1] [Cycle 1]: 3.095e-05, [3] [d_1]: 1.299e-05 [d_2]: 6.04e-06 [renormalize]: 1.70003e-07 [add_cache_embedding]: 1.184e-05 [add_recomputation]: 4.107e-05 [cse_after_recomputation]: 1.558e-05, [1] [Cycle 1]: 1.132e-05, [1] [cse]: 7.13e-06 [environ_conv]: 5.71e-06 [label_micro_interleaved_index]: 2.18e-06 [label_fine_grained_interleaved_index]: 2.55999e-06 
[assign_add_opt]: 1.56e-06 [slice_recompute_activation]: 2.48e-06 [micro_interleaved_order_control]: 1.78e-06 [full_micro_interleaved_order_control]: 1.99e-06 [comp_comm_scheduling]: 2.15e-06 [reorder_send_recv_between_fp_bp]: 2.32e-06 [comm_op_add_attrs]: 1.1e-06 [add_comm_op_reuse_tag]: 9.59997e-07 [overlap_opt_shard_in_pipeline]: 1.21e-06 [grouped_pairwise_exchange_alltoall]: 1.5e-06 [overlap_recompute_and_grad_model_parallel]: 1.9e-06 [overlap_grad_matmul_and_grad_allreduce]: 1.18e-06 [split_matmul_comm_elemetwise]: 2.59e-06 [split_layernorm_comm]: 1.76e-06 [process_send_recv_for_ge]: 9.29998e-07 [handle_group_info]: 1.15e-06 [auto_monad_reorder]: 1.494e-05 [get_jit_bprop_graph]: 4.20005e-07 [eliminate_special_op_node]: 0.00046048 [validate]: 2.449e-05 [distribtued_split]: 1.31e-06 [task_emit]: 0.00445131 [execute]: 7.57e-06 Sums parse : 0.001381s : 6.62% symbol_resolve.resolve : 0.011103s : 53.21% combine_like_graphs : 0.000001s : 0.01% graph_reusing : 0.000003s : 0.02% meta_unpack_prepare : 0.000049s : 0.24% pre_cconv : 0.000001s : 0.00% abstract_specialize : 0.002165s : 10.38% pack_expand : 0.000012s : 0.06% auto_monad : 0.000053s : 0.26% inline : 0.000002s : 0.01% pre_auto_parallel : 0.000010s : 0.05% pipeline_split : 0.000004s : 0.02% optimize.py_interpret_to_execute : 0.000004s : 0.02% optimize.rewriter_before_opt_a : 0.000035s : 0.17% optimize.opt_a.expand_dump_flag : 0.000005s : 0.02% optimize.opt_a.switch_simplify : 0.000015s : 0.07% optimize.opt_a.a_1 : 0.000219s : 1.05% optimize.opt_a.recompute_prepare : 0.000004s : 0.02% optimize.opt_a.updatestate_depend_eliminate : 0.000009s : 0.04% optimize.opt_a.updatestate_assign_eliminate : 0.000006s : 0.03% optimize.opt_a.updatestate_loads_eliminate : 0.000006s : 0.03% optimize.opt_a.parameter_eliminate : 0.000005s : 0.02% optimize.opt_a.a_2 : 0.000057s : 0.27% optimize.opt_a.accelerated_algorithm : 0.000013s : 0.06% optimize.opt_a.pynative_shard : 0.000003s : 0.01% optimize.opt_a.auto_parallel : 0.000007s : 
0.03% optimize.opt_a.parallel : 0.000013s : 0.06% optimize.opt_a.merge_comm : 0.000005s : 0.03% optimize.opt_a.allreduce_fusion : 0.000003s : 0.01% optimize.opt_a.virtual_dataset : 0.000005s : 0.02% optimize.opt_a.get_grad_eliminate_ : 0.000004s : 0.02% optimize.opt_a.virtual_output : 0.000003s : 0.02% optimize.opt_a.merge_forward : 0.000008s : 0.04% optimize.opt_a.cell_reuse_recompute_pass : 0.000001s : 0.01% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000010s : 0.05% optimize.opt_a.meta_fg_expand : 0.000005s : 0.03% optimize.opt_a.after_resolve : 0.000008s : 0.04% optimize.opt_a.a_after_grad : 0.000005s : 0.02% optimize.opt_a.renormalize : 0.000311s : 1.49% optimize.opt_a.real_op_eliminate : 0.000007s : 0.03% optimize.opt_a.auto_monad_grad : 0.000006s : 0.03% optimize.opt_a.auto_monad_eliminator : 0.000015s : 0.07% optimize.opt_a.cse : 0.000033s : 0.16% optimize.opt_a.a_3 : 0.000029s : 0.14% optimize.py_interpret_to_execute_after_opt_a : 0.000003s : 0.02% optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.01% optimize.rewriter_after_opt_a : 0.000019s : 0.09% optimize.convert_after_rewriter : 0.000006s : 0.03% optimize.order_py_execute_after_rewriter : 0.000005s : 0.02% optimize.opt_b.b_1 : 0.000040s : 0.19% optimize.opt_b.b_2 : 0.000003s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000007s : 0.03% optimize.cconv : 0.000024s : 0.12% optimize.opt_after_cconv.c_1 : 0.000005s : 0.02% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.cse : 0.000007s : 0.03% 
optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000011s : 0.05% optimize.tuple_transform.d_1 : 0.000013s : 0.06% optimize.tuple_transform.d_2 : 0.000006s : 0.03% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000012s : 0.06% optimize.add_recomputation : 0.000041s : 0.20% optimize.cse_after_recomputation.cse : 0.000007s : 0.03% optimize.environ_conv : 0.000006s : 0.03% optimize.label_micro_interleaved_index : 0.000002s : 0.01% optimize.label_fine_grained_interleaved_index : 0.000003s : 0.01% optimize.assign_add_opt : 0.000002s : 0.01% optimize.slice_recompute_activation : 0.000002s : 0.01% optimize.micro_interleaved_order_control : 0.000002s : 0.01% optimize.full_micro_interleaved_order_control : 0.000002s : 0.01% optimize.comp_comm_scheduling : 0.000002s : 0.01% optimize.reorder_send_recv_between_fp_bp : 0.000002s : 0.01% optimize.comm_op_add_attrs : 0.000001s : 0.01% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.01% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.01% optimize.overlap_recompute_and_grad_model_parallel : 0.000002s : 0.01% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.01% optimize.split_matmul_comm_elemetwise : 0.000003s : 0.01% optimize.split_layernorm_comm : 0.000002s : 0.01% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.01% auto_monad_reorder : 0.000015s : 0.07% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000460s : 2.21% validate : 0.000024s : 0.12% distribtued_split : 0.000001s : 0.01% task_emit : 0.004451s : 21.33% execute : 0.000008s : 0.04% Time group info: ------[substitution.] 
0.010989 39 98.85% : 0.010862s : 8: substitution.getattr_setattr_resolve 0.04% : 0.000005s : 3: substitution.graph_param_transform 0.86% : 0.000095s : 3: substitution.inline 0.03% : 0.000003s : 2: substitution.less_batch_normalization 0.09% : 0.000010s : 13: substitution.meta_unpack_prepare 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.01% : 0.000001s : 4: substitution.remove_not_recompute_node 0.02% : 0.000002s : 2: substitution.replace_old_param 0.08% : 0.000009s : 1: substitution.tuple_list_get_item_eliminator ------[renormalize.] 0.000304 2 57.91% : 0.000176s : 1: renormalize.infer 42.09% : 0.000128s : 1: renormalize.specialize ------[replace.] 0.000171 10 77.87% : 0.000133s : 6: replace.getattr_setattr_resolve 17.13% : 0.000029s : 3: replace.inline 5.01% : 0.000009s : 1: replace.tuple_list_get_item_eliminator ------[match.] 0.010901 10 99.05% : 0.010798s : 6: match.getattr_setattr_resolve 0.87% : 0.000095s : 3: match.inline 0.08% : 0.000009s : 1: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.000496 10 70.38% : 0.000349s : 5: func_graph_cloner_run.FuncGraphClonerGraph 29.62% : 0.000147s : 5: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.011531 105 0.65% : 0.000075s : 52: opt.transform.opt_a 0.27% : 0.000031s : 23: opt.transform.opt_b 96.22% : 0.011095s : 2: opt.transform.opt_resolve 0.24% : 0.000028s : 1: opt.transforms.meta_unpack_prepare 2.33% : 0.000269s : 20: opt.transforms.opt_a 0.03% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000002s : 1: opt.transforms.opt_b 0.15% : 0.000017s : 2: opt.transforms.opt_trans_graph 0.07% : 0.000008s : 3: opt.transforms.special_op_eliminate . 
TotalTime = 0.0865174, [20] [parse]: 0.0012792 [symbol_resolve]: 0.0123618, [1] [Cycle 1]: 0.0123066, [1] [resolve]: 0.0122878 [combine_like_graphs]: 5.09994e-07 [graph_reusing]: 2.22e-06 [meta_unpack_prepare]: 0.00016541 [pre_cconv]: 4.49996e-07 [abstract_specialize]: 0.0038762 [pack_expand]: 1.309e-05 [auto_monad]: 7.225e-05 [inline]: 1e-06 [pre_auto_parallel]: 7.12e-06 [pipeline_split]: 1.4e-06 [optimize]: 0.0655885, [35] [py_interpret_to_execute]: 3.8e-06 [rewriter_before_opt_a]: 0.00016104 [opt_a]: 0.064443, [4] [Cycle 1]: 0.0310492, [30] [expand_dump_flag]: 3.15e-06 [switch_simplify]: 2.506e-05 [a_1]: 0.00073096 [recompute_prepare]: 8.12999e-06 [updatestate_depend_eliminate]: 1.003e-05 [updatestate_assign_eliminate]: 6.62e-06 [updatestate_loads_eliminate]: 6.26e-06 [parameter_eliminate]: 4.79e-06 [a_2]: 7.545e-05 [accelerated_algorithm]: 7.95e-06 [pynative_shard]: 1.04e-06 [auto_parallel]: 3.12e-06 [parallel]: 5.87e-06 [merge_comm]: 2.60001e-06 [allreduce_fusion]: 1.77e-06 [virtual_dataset]: 5.59e-06 [get_grad_eliminate_]: 4.63999e-06 [virtual_output]: 4.49e-06 [merge_forward]: 7.71e-06 [cell_reuse_recompute_pass]: 5.30003e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.14e-05 [meta_fg_expand]: 0.00180784, [1] [Cycle 1]: 0.00042829, [1] [resolve]: 0.00041009 [after_resolve]: 2.004e-05 [a_after_grad]: 3.948e-05 [renormalize]: 0.0277049 [real_op_eliminate]: 2.592e-05 [auto_monad_grad]: 3.151e-05 [auto_monad_eliminator]: 4.897e-05 [cse]: 0.00010485 [a_3]: 0.00016779 [Cycle 2]: 0.0268834, [30] [expand_dump_flag]: 2.37e-06 [switch_simplify]: 8.122e-05 [a_1]: 0.00091841 [recompute_prepare]: 8.78e-06 [updatestate_depend_eliminate]: 1.148e-05 [updatestate_assign_eliminate]: 8.54e-06 [updatestate_loads_eliminate]: 8.44e-06 [parameter_eliminate]: 3.32e-06 [a_2]: 0.00011589 [accelerated_algorithm]: 1.092e-05 [pynative_shard]: 1.03001e-06 [auto_parallel]: 3.96e-06 [parallel]: 3.95e-06 [merge_comm]: 2.34001e-06 [allreduce_fusion]: 1.55e-06 [virtual_dataset]: 6.93e-06 
[get_grad_eliminate_]: 6.56e-06 [virtual_output]: 6.06e-06 [merge_forward]: 9.64999e-06 [cell_reuse_recompute_pass]: 4.19997e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.451e-05 [meta_fg_expand]: 0.00477244, [3] [Cycle 1]: 0.00033261, [1] [resolve]: 0.00031443 [Cycle 1]: 0.00042351, [1] [resolve]: 0.00040542 [Cycle 1]: 0.000325, [1] [resolve]: 0.00030733 [after_resolve]: 3.091e-05 [a_after_grad]: 6.213e-05 [renormalize]: 0.0201855 [real_op_eliminate]: 3.081e-05 [auto_monad_grad]: 3.367e-05 [auto_monad_eliminator]: 5.611e-05 [cse]: 0.00011905 [a_3]: 0.00020507 [Cycle 3]: 0.00322016, [30] [expand_dump_flag]: 2.51e-06 [switch_simplify]: 8.524e-05 [a_1]: 0.00118947 [recompute_prepare]: 1.046e-05 [updatestate_depend_eliminate]: 1.408e-05 [updatestate_assign_eliminate]: 1.083e-05 [updatestate_loads_eliminate]: 1.034e-05 [parameter_eliminate]: 3.66e-06 [a_2]: 0.00015136 [accelerated_algorithm]: 1.498e-05 [pynative_shard]: 1.06e-06 [auto_parallel]: 3.37e-06 [parallel]: 3.59e-06 [merge_comm]: 2.82e-06 [allreduce_fusion]: 1.87e-06 [virtual_dataset]: 8.6e-06 [get_grad_eliminate_]: 8.55e-06 [virtual_output]: 7.93e-06 [merge_forward]: 1.141e-05 [cell_reuse_recompute_pass]: 4.1e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.874e-05 [meta_fg_expand]: 2.788e-05 [after_resolve]: 1.132e-05 [a_after_grad]: 1.917e-05 [renormalize]: 0.00124327 [real_op_eliminate]: 1.408e-05 [auto_monad_grad]: 5.27e-06 [auto_monad_eliminator]: 2.384e-05 [cse]: 9.386e-05 [a_3]: 7.104e-05 [Cycle 4]: 0.00100834, [30] [expand_dump_flag]: 1.21e-06 [switch_simplify]: 8.53e-06 [a_1]: 0.00041649 [recompute_prepare]: 1.012e-05 [updatestate_depend_eliminate]: 1.361e-05 [updatestate_assign_eliminate]: 1.08e-05 [updatestate_loads_eliminate]: 1.057e-05 [parameter_eliminate]: 1.86e-06 [a_2]: 0.00015024 [accelerated_algorithm]: 1.513e-05 [pynative_shard]: 1.22e-06 [auto_parallel]: 3.29001e-06 [parallel]: 3.71e-06 [merge_comm]: 2.29001e-06 [allreduce_fusion]: 1.69e-06 [virtual_dataset]: 8.74e-06 
[get_grad_eliminate_]: 7.98e-06 [virtual_output]: 7.88e-06 [merge_forward]: 1.184e-05 [cell_reuse_recompute_pass]: 3.80001e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.865e-05 [meta_fg_expand]: 9.22e-06 [after_resolve]: 1.099e-05 [a_after_grad]: 1.952e-05 [renormalize]: 5.99975e-08 [real_op_eliminate]: 7.96e-06 [auto_monad_grad]: 2.47e-06 [auto_monad_eliminator]: 2.088e-05 [cse]: 4.897e-05 [a_3]: 6.465e-05 [py_interpret_to_execute_after_opt_a]: 3.75e-06 [slice_cell_reuse_recomputed_activation]: 1.81e-06 [rewriter_after_opt_a]: 6.149e-05 [convert_after_rewriter]: 1.636e-05 [order_py_execute_after_rewriter]: 1.228e-05 [opt_b]: 0.00057435, [2] [Cycle 1]: 0.00048865, [7] [b_1]: 0.00043425 [b_2]: 2.93e-06 [updatestate_depend_eliminate]: 3.63e-06 [updatestate_assign_eliminate]: 2.55999e-06 [updatestate_loads_eliminate]: 2.28e-06 [renormalize]: 3.80001e-07 [cse]: 9.89e-06 [Cycle 2]: 7.605e-05, [7] [b_1]: 3.59e-05 [b_2]: 2.15e-06 [updatestate_depend_eliminate]: 2.24e-06 [updatestate_assign_eliminate]: 1.99e-06 [updatestate_loads_eliminate]: 1.91999e-06 [renormalize]: 6.99947e-08 [cse]: 6.29e-06 [cconv]: 1.7e-05 [opt_after_cconv]: 5.903e-05, [1] [Cycle 1]: 5.445e-05, [7] [c_1]: 1.305e-05 [parameter_eliminate]: 1.99e-06 [updatestate_depend_eliminate]: 2.44e-06 [updatestate_assign_eliminate]: 2.03e-06 [updatestate_loads_eliminate]: 1.87e-06 [cse]: 6.2e-06 [renormalize]: 2.89998e-07 [remove_dup_value]: 7.38e-06 [tuple_transform]: 4.152e-05, [1] [Cycle 1]: 3.782e-05, [3] [d_1]: 2.035e-05 [d_2]: 5.33e-06 [renormalize]: 1.50001e-07 [add_cache_embedding]: 7.86e-06 [add_recomputation]: 3.131e-05 [cse_after_recomputation]: 1.623e-05, [1] [Cycle 1]: 1.206e-05, [1] [cse]: 7.61e-06 [environ_conv]: 4.73e-06 [label_micro_interleaved_index]: 1.88e-06 [label_fine_grained_interleaved_index]: 1.22e-06 [assign_add_opt]: 1.23e-06 [slice_recompute_activation]: 1.31e-06 [micro_interleaved_order_control]: 1.1e-06 [full_micro_interleaved_order_control]: 9.79999e-07 [comp_comm_scheduling]: 
1.18e-06 [reorder_send_recv_between_fp_bp]: 1.01999e-06 [comm_op_add_attrs]: 6.59995e-07 [add_comm_op_reuse_tag]: 5.80003e-07 [overlap_opt_shard_in_pipeline]: 7.50006e-07 [grouped_pairwise_exchange_alltoall]: 5.99997e-07 [overlap_recompute_and_grad_model_parallel]: 1.16e-06 [overlap_grad_matmul_and_grad_allreduce]: 3.89999e-07 [split_matmul_comm_elemetwise]: 1.6e-06 [split_layernorm_comm]: 1.42999e-06 [process_send_recv_for_ge]: 5.10001e-07 [handle_group_info]: 4.79995e-07 [auto_monad_reorder]: 1.242e-05 [get_jit_bprop_graph]: 3.44e-06 [eliminate_special_op_node]: 0.00046713 [validate]: 2.439e-05 [distribtued_split]: 1.13e-06 [task_emit]: 0.00237444 [execute]: 4.24e-06 Sums parse : 0.001279s : 1.64% symbol_resolve.resolve : 0.012288s : 15.77% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000002s : 0.00% meta_unpack_prepare : 0.000165s : 0.21% pre_cconv : 0.000000s : 0.00% abstract_specialize : 0.003876s : 4.97% pack_expand : 0.000013s : 0.02% auto_monad : 0.000072s : 0.09% inline : 0.000001s : 0.00% pre_auto_parallel : 0.000007s : 0.01% pipeline_split : 0.000001s : 0.00% optimize.py_interpret_to_execute : 0.000004s : 0.00% optimize.rewriter_before_opt_a : 0.000161s : 0.21% optimize.opt_a.expand_dump_flag : 0.000009s : 0.01% optimize.opt_a.switch_simplify : 0.000200s : 0.26% optimize.opt_a.a_1 : 0.003255s : 4.18% optimize.opt_a.recompute_prepare : 0.000037s : 0.05% optimize.opt_a.updatestate_depend_eliminate : 0.000049s : 0.06% optimize.opt_a.updatestate_assign_eliminate : 0.000037s : 0.05% optimize.opt_a.updatestate_loads_eliminate : 0.000036s : 0.05% optimize.opt_a.parameter_eliminate : 0.000014s : 0.02% optimize.opt_a.a_2 : 0.000493s : 0.63% optimize.opt_a.accelerated_algorithm : 0.000049s : 0.06% optimize.opt_a.pynative_shard : 0.000004s : 0.01% optimize.opt_a.auto_parallel : 0.000014s : 0.02% optimize.opt_a.parallel : 0.000017s : 0.02% optimize.opt_a.merge_comm : 0.000010s : 0.01% optimize.opt_a.allreduce_fusion : 0.000007s : 0.01% 
optimize.opt_a.virtual_dataset : 0.000030s : 0.04% optimize.opt_a.get_grad_eliminate_ : 0.000028s : 0.04% optimize.opt_a.virtual_output : 0.000026s : 0.03% optimize.opt_a.merge_forward : 0.000041s : 0.05% optimize.opt_a.cell_reuse_recompute_pass : 0.000002s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000063s : 0.08% optimize.opt_a.meta_fg_expand : 0.000037s : 0.05% optimize.opt_a.meta_fg_expand.resolve : 0.001437s : 1.84% optimize.opt_a.after_resolve : 0.000073s : 0.09% optimize.opt_a.a_after_grad : 0.000140s : 0.18% optimize.opt_a.renormalize : 0.049134s : 63.05% optimize.opt_a.real_op_eliminate : 0.000079s : 0.10% optimize.opt_a.auto_monad_grad : 0.000073s : 0.09% optimize.opt_a.auto_monad_eliminator : 0.000150s : 0.19% optimize.opt_a.cse : 0.000367s : 0.47% optimize.opt_a.a_3 : 0.000509s : 0.65% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.00% optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.00% optimize.rewriter_after_opt_a : 0.000061s : 0.08% optimize.convert_after_rewriter : 0.000016s : 0.02% optimize.order_py_execute_after_rewriter : 0.000012s : 0.02% optimize.opt_b.b_1 : 0.000470s : 0.60% optimize.opt_b.b_2 : 0.000005s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000006s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000005s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000016s : 0.02% optimize.cconv : 0.000017s : 0.02% optimize.opt_after_cconv.c_1 : 0.000013s : 0.02% optimize.opt_after_cconv.parameter_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.cse : 0.000006s : 0.01% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000007s 
: 0.01% optimize.tuple_transform.d_1 : 0.000020s : 0.03% optimize.tuple_transform.d_2 : 0.000005s : 0.01% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000008s : 0.01% optimize.add_recomputation : 0.000031s : 0.04% optimize.cse_after_recomputation.cse : 0.000008s : 0.01% optimize.environ_conv : 0.000005s : 0.01% optimize.label_micro_interleaved_index : 0.000002s : 0.00% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.00% optimize.assign_add_opt : 0.000001s : 0.00% optimize.slice_recompute_activation : 0.000001s : 0.00% optimize.micro_interleaved_order_control : 0.000001s : 0.00% optimize.full_micro_interleaved_order_control : 0.000001s : 0.00% optimize.comp_comm_scheduling : 0.000001s : 0.00% optimize.reorder_send_recv_between_fp_bp : 0.000001s : 0.00% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.00% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000000s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.00% optimize.split_layernorm_comm : 0.000001s : 0.00% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000000s : 0.00% auto_monad_reorder : 0.000012s : 0.02% get_jit_bprop_graph : 0.000003s : 0.00% eliminate_special_op_node : 0.000467s : 0.60% validate : 0.000024s : 0.03% distribtued_split : 0.000001s : 0.00% task_emit : 0.002374s : 3.05% execute : 0.000004s : 0.01% Time group info: ------[substitution.] 
0.014015 450 0.03% : 0.000004s : 6: substitution.float_depend_g_call 0.08% : 0.000011s : 14: substitution.float_tuple_getitem_switch 92.99% : 0.013033s : 25: substitution.getattr_setattr_resolve 0.02% : 0.000003s : 3: substitution.graph_param_transform 0.02% : 0.000003s : 3: substitution.incorporate_call 0.01% : 0.000002s : 3: substitution.incorporate_call_switch 4.27% : 0.000598s : 65: substitution.inline 0.05% : 0.000006s : 14: substitution.less_batch_normalization 0.27% : 0.000038s : 42: substitution.meta_unpack_prepare 0.11% : 0.000015s : 16: substitution.minmaximum_grad 0.03% : 0.000004s : 6: substitution.partial_eliminate 0.00% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.04% : 0.000006s : 47: substitution.remove_not_recompute_node 0.41% : 0.000057s : 44: substitution.replace_applicator 0.05% : 0.000006s : 20: substitution.replace_old_param 0.02% : 0.000003s : 2: substitution.reset_defer_inline 0.04% : 0.000005s : 8: substitution.set_cell_output_no_recompute 0.05% : 0.000007s : 5: substitution.specialize_transform 0.06% : 0.000009s : 4: substitution.switch_simplify 0.06% : 0.000008s : 2: substitution.transpose_eliminate 0.33% : 0.000047s : 20: substitution.tuple_list_convert_item_index_to_positive 0.15% : 0.000021s : 20: substitution.tuple_list_get_item_const_eliminator 0.21% : 0.000029s : 20: substitution.tuple_list_get_item_depend_reorder 0.51% : 0.000072s : 38: substitution.tuple_list_get_item_eliminator 0.20% : 0.000028s : 20: substitution.tuple_list_get_set_item_eliminator ------[renormalize.] 0.049121 6 92.80% : 0.045582s : 3: renormalize.infer 7.20% : 0.003538s : 3: renormalize.specialize ------[replace.] 0.000746 68 46.56% : 0.000347s : 23: replace.getattr_setattr_resolve 29.68% : 0.000221s : 31: replace.inline 7.01% : 0.000052s : 2: replace.meta_unpack_prepare 8.19% : 0.000061s : 4: replace.switch_simplify 1.89% : 0.000014s : 2: replace.transpose_eliminate 6.66% : 0.000050s : 6: replace.tuple_list_get_item_eliminator ------[match.] 
0.013572 68 95.68% : 0.012986s : 23: match.getattr_setattr_resolve 3.94% : 0.000535s : 31: match.inline 0.14% : 0.000019s : 2: match.meta_unpack_prepare 0.06% : 0.000009s : 4: match.switch_simplify 0.06% : 0.000008s : 2: match.transpose_eliminate 0.12% : 0.000016s : 6: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.004017 69 67.93% : 0.002728s : 28: func_graph_cloner_run.FuncGraphClonerGraph 32.07% : 0.001288s : 41: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.019045 585 0.77% : 0.000147s : 2: opt.transform.meta_unpack_prepare 25.17% : 0.004793s : 461: opt.transform.opt_a 0.05% : 0.000009s : 7: opt.transform.opt_after_cconv 2.31% : 0.000441s : 94: opt.transform.opt_b 71.54% : 0.013625s : 10: opt.transform.opt_resolve 0.11% : 0.000021s : 8: opt.transform.opt_trans_graph 0.05% : 0.000009s : 3: opt.transform.special_op_eliminate . ============================== 2 passed in 21.91s ============================== [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.077.839 [status:INIT] [ge_api.cc:463]15744 ~Session:Start to destruct session. [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.077.890 [status:RUNNING] [ge_api.cc:475]15744 ~Session:Session id is 0 [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.077.901 [status:RUNNING] [ge_api.cc:476]15744 ~Session:Destroying session [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.078.795 [status:STOP] [ge_api.cc:491]15744 ~Session:Session Destructor finished [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.078.824 [status:INIT] [ge_api.cc:301]15744 GEFinalize:GEFinalize start [INFO] GE(15744,python3.7):2024-01-11-05:27:38.078.885 [execution_runtime.cc:80][EVENT]15744 FinalizeExecutionRuntime:Execution runtime finalize begin. [INFO] GE(15744,python3.7):2024-01-11-05:27:38.078.905 [execution_runtime.cc:92][EVENT]15744 FinalizeExecutionRuntime:Execution runtime finalized. 
[TRACE] GE(15744,python3.7):2024-01-11-05:27:38.078.934 [status:RUNNING] [ge_api.cc:313]15744 GEFinalize:Finalizing environment [INFO] TUNE(15744,python3.7):2024-01-11-05:27:38.366.291 [cann_kb_pyfunc_mgr.cpp:127][CANNKB][Tid:15744]"CannKbPyfuncMgr: enter PyObjectDeinit function, reference_[1]" [INFO] TUNE(15744,python3.7):2024-01-11-05:27:38.366.340 [cann_kb_pyfunc_mgr.cpp:138][CANNKB][Tid:15744]"CannKbPyfuncMgr: PyObjectDeinit function end successfully!" [INFO] GE(15744,python3.7):2024-01-11-05:27:38.367.782 [gelib.cc:324][EVENT]15744 SystemFinalize:Online infer finalize GELib success. [TRACE] GE(15744,python3.7):2024-01-11-05:27:38.646.363 [status:STOP] [ge_api.cc:341]15744 GEFinalize:GEFinalize finished [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.878.778 [process_mode_manager.cpp:184][Close][tid:15744] [TsdClient] Close [deviceId=3][sessionId=1] hccp and computer enter [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.878.838 [version_verify.cpp:112][SpecialFeatureCheck][tid:15744] VersionVerify: previous type[7], supported [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.878.889 [process_mode_manager.cpp:192][Close][tid:15744] [TsdClient][deviceId=3] [sessionId=1] wait hccp and computer process close respond [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.900.478 [process_mode_manager.cpp:197][Close][tid:15744] [TsdClient][logicDeviceId_=3]has recv close hccp and computer process respond [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.900.522 [stub_process_mode_nowin.cpp:151][CloseInHost][tid:15744] enter into CloseInHost deviceid[3] [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.900.534 [stub_process_mode_nowin.cpp:154][CloseInHost][tid:15744] host cpu not support [INFO] TDT(15744,python3.7):2024-01-11-05:27:38.900.584 [process_mode_manager.cpp:208][Close][tid:15744] [TsdClient][deviceId=3] [sessionId=1] close hccp and computer process success [INFO] ATRACE(15744,python3.7):2024-01-11-05:27:38.900.599 [atrace_api.c:93](tid:15744) AtraceDestroy start [INFO] 
ATRACE(15744,python3.7):2024-01-11-05:27:38.900.619 [atrace_api.c:95](tid:15744) AtraceDestroy end [INFO] PROFILING(15744,python3.7):2024-01-11-05:27:38.900.644 [msprofiler_impl.cpp:156] >>> (tid:15744) ProfNotifySetDevice called, is open: 0, devId: 3 [INFO] RUNTIME(15744,python3.7):2024-01-11-05:27:40.529.751 [runtime.cc:1737] 15744 ~Runtime: deconstruct runtime.