============================= test session starts ==============================
platform linux -- Python 3.7.5, pytest-5.4.3, py-1.8.1, pluggy-0.13.1
rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st/dyn_shape_dev, inifile: /home/jenkins/sault/virtual_test/virtualenv_004/sault/config/pytest.ini
plugins: anyio-3.7.1, xdist-1.32.0, forked-1.1.3
[INFO] ATRACE(38167,python3.7):2024-01-11-05:44:47.456.284 [trace_attr.c:105](tid:38167) platform is 1.
[INFO] ATRACE(38167,python3.7):2024-01-11-05:44:47.456.468 [trace_recorder.c:114](tid:38167) use root path: /home/jenkins/ascend/atrace
[INFO] ATRACE(38167,python3.7):2024-01-11-05:44:47.456.492 [trace_signal.c:133](tid:38167) register signal handler for signo 2 succeed.
[INFO] ATRACE(38167,python3.7):2024-01-11-05:44:47.456.503 [trace_signal.c:133](tid:38167) register signal handler for signo 15 succeed.
[INFO] RUNTIME(38167,python3.7):2024-01-11-05:44:47.841.277 [runtime.cc:1159] 38167 GetAicoreNumByLevel: workingDev_=0
[INFO] RUNTIME(38167,python3.7):2024-01-11-05:44:47.841.331 [runtime.cc:4719] 38167 GetVisibleDevices: ASCEND_RT_VISIBLE_DEVICES param was not set
collected 2 items
test_sinh.py
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.184.960 [process_mode_manager.cpp:109][OpenProcess][tid:38167] [ProcessModeManager] enter into open process deviceId[3] rankSize[0]
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.135 [process_mode_manager.cpp:379][InitTsdClient][tid:38167] [TsdClient] deviceId[3] begin to init hdc client
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.295 [version_verify.cpp:34][SetVersionInfo][tid:38167] VersionVerify: send client version to server
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.324 [version_verify.cpp:50][SetVersionInfo][tid:38167] send feature_info:{msg_type:35, features:{check before send aicpu package,}}
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.336 [version_verify.cpp:50][SetVersionInfo][tid:38167] send feature_info:{msg_type:37, 
features:{check before send open qs message,}} [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.656 [version_verify.cpp:66][PeerVersionCheck][tid:38167] VersionVerify: Check client version info, server[1230], client[1230] [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.670 [version_verify.cpp:87][ParseVersionInfo][tid:38167] VersionVerify: pass client version info success [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.679 [hdc_client.cpp:276][CheckHdcConnection][tid:38167] Service[2] create hdc success [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.692 [version_verify.cpp:120][SpecialFeatureCheck][tid:38167] VersionVerify: new type[35], supported [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.748 [process_mode_manager.cpp:748][GetDeviceCheckCode][tid:38167] [TsdClient][deviceId=3] [sessionId=1] wait package info respond [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.884 [process_mode_manager.cpp:379][InitTsdClient][tid:38167] [TsdClient] deviceId[3] begin to init hdc client [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.970 [version_verify.cpp:34][SetVersionInfo][tid:38167] VersionVerify: send client version to server [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.981 [version_verify.cpp:50][SetVersionInfo][tid:38167] send feature_info:{msg_type:35, features:{check before send aicpu package,}} [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.190.991 [version_verify.cpp:50][SetVersionInfo][tid:38167] send feature_info:{msg_type:37, features:{check before send open qs message,}} [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.118 [version_verify.cpp:66][PeerVersionCheck][tid:38167] VersionVerify: Check client version info, server[1230], client[1230] [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.130 [version_verify.cpp:87][ParseVersionInfo][tid:38167] VersionVerify: pass client version info success [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.138 [hdc_client.cpp:276][CheckHdcConnection][tid:38167] Service[2] create 
hdc success [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.148 [process_mode_manager.cpp:426][ConstructOpenMsg][tid:38167] [TsdClient] tsd get process sign successfully, procpid[38167] signSize[48] [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.173 [version_verify.cpp:112][SpecialFeatureCheck][tid:38167] VersionVerify: previous type[6], supported [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.191.190 [process_mode_manager.cpp:126][OpenProcess][tid:38167] [ProcessModeManager] deviceId[3] sessionId[1] rankSize[0], wait sub process start respond [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.395.779 [stub_process_mode_nowin.cpp:63][ProcessQueueForMdc][tid:38167] [TsdClient] it is unnecessary of current mode[0] chiptype[1] to grant queue auth to aicpusd [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.395.817 [stub_process_mode_nowin.cpp:101][OpenInHost][tid:38167] enter into OpenInHost deviceid[3] [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.395.827 [stub_process_mode_nowin.cpp:105][OpenInHost][tid:38167] host cpu not support [INFO] TDT(38167,python3.7):2024-01-11-05:44:52.395.835 [process_mode_manager.cpp:156][OpenProcess][tid:38167] [TsdClient][deviceId=3] [sessionId=1] start hccp and computer process success [INFO] RUNTIME(38167,python3.7):2024-01-11-05:44:52.398.515 [device.cc:340] 38167 Init: isDoubledie:0, topologytype:0 [INFO] RUNTIME(38167,python3.7):2024-01-11-05:44:52.418.411 [npu_driver.cc:5428] 40027 GetDeviceStatus: GetDeviceStatus status=1. [INFO] ATRACE(38167,python3.7):2024-01-11-05:44:52.418.467 [atrace_api.c:28](tid:38167) AtraceCreate start [INFO] ATRACE(38167,python3.7):2024-01-11-05:44:52.418.558 [trace_rb_log.c:84](tid:38167) [RUNTIME_ATRACE_DEV3_TS0] create ring buffer success, buffer size : 131152. 
[INFO] ATRACE(38167,python3.7):2024-01-11-05:44:52.418.571 [atrace_api.c:32](tid:38167) AtraceCreate end
[INFO] TDT(38167,python3.7):2024-01-11-05:44:52.418.586 [client_manager.cpp:157][SetProfilingCallback][tid:38167] [TsdClient] set profiling callback success
[TRACE] GE(38167,python3.7):2024-01-11-05:44:52.570.199 [status:INIT] [ge_api.cc:144]38167 GEInitializeImpl:GEInitialize start
[INFO] PROFILING(38167,python3.7):2024-01-11-05:44:52.781.200 [msprofiler_impl.cpp:156] >>> (tid:38167) ProfNotifySetDevice called, is open: 1, devId: 3
[INFO] PROFILING(38167,python3.7):2024-01-11-05:44:52.781.307 [platform.cpp:38] >>> (tid:38167) Profiling platform version: 1.0.
[INFO] PROFILING(38167,python3.7):2024-01-11-05:44:52.781.321 [ai_drv_dev_api.cpp:384] >>> (tid:38167) Succeeded to DrvGetApiVersion version: 0x72313
[TRACE] GE(38167,python3.7):2024-01-11-05:44:52.828.867 [status:RUNNING] [ge_api.cc:211]38167 GEInitializeImpl:Initializing environment
[INFO] GE(38167,python3.7):2024-01-11-05:44:52.828.926 [gelib.cc:98][EVENT]38167 Initialize:[GEPERFTRACE] GE Init Start
[INFO] GE(38167,python3.7):2024-01-11-05:44:52.829.180 [gelib.cc:307][EVENT]38167 SystemInitialize:Online infer init GELib success, device id :3
[INFO] DVPP(38167,python3.7):2024-01-11-05:44:53.185.005 [dvpp_engine.cc:41][ENGINE][Initialize:41][tid:38167]dvpp engine do not support
[INFO] TUNE(38167,python3.7):2024-01-11-05:44:53.188.133 [cann_kb_pyfunc_mgr.cpp:72][CANNKB][Tid:38167]"CannKbPyfuncMgr: Enter PyObjectInit, reference_ is 0!"
[INFO] TUNE(38167,python3.7):2024-01-11-05:44:53.188.183 [handle_manager.cpp:115][CANNKB][Tid:38167]"Start to run init functions to load dynamic python lib!"
[INFO] TUNE(38167,python3.7):2024-01-11-05:44:53.188.236 [handle_manager.cpp:407][CANNKB][Tid:38167]"Init functions of loading dynamic python lib end!"
[INFO] TUNE(38167,python3.7):2024-01-11-05:44:53.188.247 [cann_kb_pyfunc_mgr.cpp:24][CANNKB][Tid:38167]"CANN_KB_Py has already been initialized."
[INFO] TUNE(38167,python3.7):2024-01-11-05:44:53.188.314 [cann_kb_pyfunc_mgr.cpp:117][CANNKB][Tid:38167]"CannKbPyfuncMgr: Run PyObjectInit successfully!"
[INFO] HCCL(38167,python3.7):2024-01-11-05:45:05.201.894 [plugin_manager.cc:42][38167]hcom running normal mode.
[INFO] DVPP(38167,python3.7):2024-01-11-05:45:05.202.372 [dvpp_engine.cc:92][ENGINE][GetOpsKernelInfoStores:92][tid:38167]dvpp ops kernel info store do not support
[INFO] DVPP(38167,python3.7):2024-01-11-05:45:05.202.500 [dvpp_engine.cc:69][ENGINE][GetGraphOptimizerObjs:69][tid:38167]dvpp graph optimizer do not support
[INFO] DVPP(38167,python3.7):2024-01-11-05:45:05.709.136 [dvpp_ops_kernel_builder.cc:48][ENGINE][Initialize:48][tid:38167]dvpp ops kernel builder do not support
[INFO] GE(38167,python3.7):2024-01-11-05:45:05.716.850 [gelib.cc:169][EVENT]38167 Initialize:[GEPERFTRACE] The time cost of GELib::Initialize is [12887876] micro second.
[TRACE] GE(38167,python3.7):2024-01-11-05:45:05.800.785 [status:STOP] [ge_api.cc:255]38167 GEInitializeImpl:GEInitialize finished
[TRACE] GE(38167,python3.7):2024-01-11-05:45:05.800.913 [status:INIT] [ge_api.cc:398]38167 Session:Start to construct session.
[TRACE] GE(38167,python3.7):2024-01-11-05:45:05.800.928 [status:RUNNING] [ge_api.cc:408]38167 Session:Creating session [INFO] GE(38167,python3.7):2024-01-11-05:45:05.801.347 [graph_var_manager.cc:1445][EVENT]38167 SetMemoryMallocSize:Total memory size is 34359738368 [INFO] GE(38167,python3.7):2024-01-11-05:45:05.801.362 [graph_var_manager.cc:1424][EVENT]38167 SetAllMemoryMaxValue:The graph_mem_max_size is 27917287424 and the var_mem_max_size is 5368709120 [INFO] PROFILING(38167,python3.7):2024-01-11-05:45:05.801.629 [msprofiler_impl.cpp:156] >>> (tid:38167) ProfNotifySetDevice called, is open: 1, devId: 3 [TRACE] GE(38167,python3.7):2024-01-11-05:45:05.802.517 [status:RUNNING] [ge_api.cc:411]38167 Session:Session id is 0 [TRACE] GE(38167,python3.7):2024-01-11-05:45:05.802.534 [status:STOP] [ge_api.cc:420]38167 Session:Session Constructor finished [INFO] PROFILING(38167,python3.7):2024-01-11-05:45:05.812.052 [platform.cpp:38] >>> (tid:38167) Profiling platform version: 1.0. [INFO] PROFILING(38167,python3.7):2024-01-11-05:45:05.812.076 [ai_drv_dev_api.cpp:384] >>> (tid:38167) Succeeded to DrvGetApiVersion version: 0x72313 [TRACE] GE(38167,python3.7):2024-01-11-05:45:05.812.274 [status:INIT] [ge_api.cc:144]38167 GEInitializeImpl:GEInitialize start TotalTime = 0.390013, [20] [parse]: 0.231952 [symbol_resolve]: 0.028294, [1] [Cycle 1]: 0.028226, [1] [resolve]: 0.0281979 [combine_like_graphs]: 7.9e-06 [graph_reusing]: 2.34e-06 [meta_unpack_prepare]: 0.00014406 [pre_cconv]: 4.4e-06 [abstract_specialize]: 0.00428111 [pack_expand]: 1.305e-05 [auto_monad]: 9.991e-05 [inline]: 1.29e-06 [pre_auto_parallel]: 1.601e-05 [pipeline_split]: 2.17999e-06 [optimize]: 0.119499, [35] [py_interpret_to_execute]: 3.17e-06 [rewriter_before_opt_a]: 0.00018078 [opt_a]: 0.118268, [4] [Cycle 1]: 0.0812169, [30] [expand_dump_flag]: 3.45e-06 [switch_simplify]: 2.583e-05 [a_1]: 0.00041036 [recompute_prepare]: 9.96e-06 [updatestate_depend_eliminate]: 9.64e-06 [updatestate_assign_eliminate]: 7.25e-06 
[updatestate_loads_eliminate]: 6.37e-06 [parameter_eliminate]: 4.83e-06 [a_2]: 8.168e-05 [accelerated_algorithm]: 6e-06 [pynative_shard]: 1.11e-06 [auto_parallel]: 3.4e-06 [parallel]: 1.222e-05 [merge_comm]: 7.38999e-06 [allreduce_fusion]: 2.42e-06 [virtual_dataset]: 5.30999e-06 [get_grad_eliminate_]: 4.79e-06 [virtual_output]: 4.15e-06 [merge_forward]: 7.69e-06 [cell_reuse_recompute_pass]: 7.99999e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.371e-05 [meta_fg_expand]: 0.0265731, [1] [Cycle 1]: 0.00053718, [1] [resolve]: 0.00051289 [after_resolve]: 2.329e-05 [a_after_grad]: 4.05e-05 [renormalize]: 0.053378 [real_op_eliminate]: 2.716e-05 [auto_monad_grad]: 3.374e-05 [auto_monad_eliminator]: 4.751e-05 [cse]: 0.00010611 [a_3]: 0.00017198 [Cycle 2]: 0.0273816, [30] [expand_dump_flag]: 3.60001e-06 [switch_simplify]: 6.271e-05 [a_1]: 0.00043103 [recompute_prepare]: 1.079e-05 [updatestate_depend_eliminate]: 1.131e-05 [updatestate_assign_eliminate]: 8.9e-06 [updatestate_loads_eliminate]: 8.36e-06 [parameter_eliminate]: 3.76e-06 [a_2]: 0.00012469 [accelerated_algorithm]: 1.242e-05 [pynative_shard]: 1.32e-06 [auto_parallel]: 5.47001e-06 [parallel]: 5.18e-06 [merge_comm]: 2.9e-06 [allreduce_fusion]: 1.83e-06 [virtual_dataset]: 7.55e-06 [get_grad_eliminate_]: 5.99e-06 [virtual_output]: 5.92e-06 [merge_forward]: 1.04e-05 [cell_reuse_recompute_pass]: 1.09e-06 [cell_reuse_handle_not_recompute_node_pass]: 1.775e-05 [meta_fg_expand]: 0.00648522, [3] [Cycle 1]: 0.00032494, [1] [resolve]: 0.00030669 [Cycle 1]: 0.00044846, [1] [resolve]: 0.00043039 [Cycle 1]: 0.00031294, [1] [resolve]: 0.00029518 [after_resolve]: 3.217e-05 [a_after_grad]: 5.381e-05 [renormalize]: 0.0194407 [real_op_eliminate]: 2.807e-05 [auto_monad_grad]: 3.579e-05 [auto_monad_eliminator]: 5.505e-05 [cse]: 0.00012439 [a_3]: 0.0002111 [Cycle 3]: 0.00262398, [30] [expand_dump_flag]: 2.76e-06 [switch_simplify]: 6.162e-05 [a_1]: 0.0005526 [recompute_prepare]: 1.277e-05 [updatestate_depend_eliminate]: 1.327e-05 
[updatestate_assign_eliminate]: 1.038e-05 [updatestate_loads_eliminate]: 9.67e-06 [parameter_eliminate]: 3.31e-06 [a_2]: 0.0001604 [accelerated_algorithm]: 1.637e-05 [pynative_shard]: 1.41e-06 [auto_parallel]: 4.57e-06 [parallel]: 4.52e-06 [merge_comm]: 3.14e-06 [allreduce_fusion]: 2.2e-06 [virtual_dataset]: 8.76e-06 [get_grad_eliminate_]: 8.12e-06 [virtual_output]: 7.71e-06 [merge_forward]: 1.218e-05 [cell_reuse_recompute_pass]: 5.50004e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.218e-05 [meta_fg_expand]: 2.915e-05 [after_resolve]: 1.221e-05 [a_after_grad]: 1.561e-05 [renormalize]: 0.00129315 [real_op_eliminate]: 1.214e-05 [auto_monad_grad]: 4.37e-06 [auto_monad_eliminator]: 2.276e-05 [cse]: 8.311e-05 [a_3]: 7.247e-05 [Cycle 4]: 0.00076912, [30] [expand_dump_flag]: 1.2e-06 [switch_simplify]: 8.89001e-06 [a_1]: 0.00016139 [recompute_prepare]: 1.098e-05 [updatestate_depend_eliminate]: 1.377e-05 [updatestate_assign_eliminate]: 1.101e-05 [updatestate_loads_eliminate]: 1.002e-05 [parameter_eliminate]: 1.55999e-06 [a_2]: 0.00015846 [accelerated_algorithm]: 1.608e-05 [pynative_shard]: 1.26e-06 [auto_parallel]: 3.17e-06 [parallel]: 3.58e-06 [merge_comm]: 2.49e-06 [allreduce_fusion]: 2.08e-06 [virtual_dataset]: 8.78e-06 [get_grad_eliminate_]: 7.78e-06 [virtual_output]: 7.27e-06 [merge_forward]: 1.171e-05 [cell_reuse_recompute_pass]: 3.69997e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.178e-05 [meta_fg_expand]: 8.59e-06 [after_resolve]: 1.164e-05 [a_after_grad]: 1.539e-05 [renormalize]: 7.0002e-08 [real_op_eliminate]: 7.94e-06 [auto_monad_grad]: 1.94e-06 [auto_monad_eliminator]: 2.036e-05 [cse]: 4.437e-05 [a_3]: 6.601e-05 [py_interpret_to_execute_after_opt_a]: 3.93e-06 [slice_cell_reuse_recomputed_activation]: 1.54e-06 [rewriter_after_opt_a]: 7.313e-05 [convert_after_rewriter]: 1.61e-05 [order_py_execute_after_rewriter]: 1.085e-05 [opt_b]: 0.00058376, [2] [Cycle 1]: 0.00049176, [7] [b_1]: 0.0004346 [b_2]: 3.86e-06 [updatestate_depend_eliminate]: 3.02e-06 
[updatestate_assign_eliminate]: 2.63e-06 [updatestate_loads_eliminate]: 2.05e-06 [renormalize]: 2.80001e-07 [cse]: 9.12e-06 [Cycle 2]: 8.29e-05, [7] [b_1]: 3.968e-05 [b_2]: 2.42e-06 [updatestate_depend_eliminate]: 2.16e-06 [updatestate_assign_eliminate]: 2.01e-06 [updatestate_loads_eliminate]: 1.82e-06 [renormalize]: 7.99992e-08 [cse]: 6.31e-06 [cconv]: 1.456e-05 [opt_after_cconv]: 5.581e-05, [1] [Cycle 1]: 5.142e-05, [7] [c_1]: 5.87e-06 [parameter_eliminate]: 1.78999e-06 [updatestate_depend_eliminate]: 2.48e-06 [updatestate_assign_eliminate]: 2.05e-06 [updatestate_loads_eliminate]: 1.81e-06 [cse]: 7.16e-06 [renormalize]: 1.90004e-07 [remove_dup_value]: 6.98e-06 [tuple_transform]: 3.807e-05, [1] [Cycle 1]: 3.426e-05, [3] [d_1]: 1.497e-05 [d_2]: 6.35e-06 [renormalize]: 1.60006e-07 [add_cache_embedding]: 8.04001e-06 [add_recomputation]: 3.61e-05 [cse_after_recomputation]: 1.593e-05, [1] [Cycle 1]: 1.2e-05, [1] [cse]: 7.41e-06 [environ_conv]: 1.659e-05 [label_micro_interleaved_index]: 1.99999e-06 [label_fine_grained_interleaved_index]: 1.37999e-06 [assign_add_opt]: 2.41e-06 [slice_recompute_activation]: 1.39e-06 [micro_interleaved_order_control]: 1.09e-06 [full_micro_interleaved_order_control]: 1.21e-06 [comp_comm_scheduling]: 1.4e-06 [reorder_send_recv_between_fp_bp]: 1.71e-06 [comm_op_add_attrs]: 6.69999e-07 [add_comm_op_reuse_tag]: 5.60001e-07 [overlap_opt_shard_in_pipeline]: 6.49998e-07 [grouped_pairwise_exchange_alltoall]: 7.79997e-07 [overlap_recompute_and_grad_model_parallel]: 1.49e-06 [overlap_grad_matmul_and_grad_allreduce]: 4.89999e-07 [split_matmul_comm_elemetwise]: 1.89e-06 [split_layernorm_comm]: 1.15e-06 [process_send_recv_for_ge]: 1.57001e-06 [handle_group_info]: 5.69999e-07 [auto_monad_reorder]: 1.742e-05 [get_jit_bprop_graph]: 2.90005e-07 [eliminate_special_op_node]: 0.00049049 [validate]: 4.378e-05 [distribtued_split]: 8.70001e-07 [task_emit]: 0.00491699 [execute]: 6.77e-06 Sums parse : 0.231952s : 66.09% symbol_resolve.resolve : 0.028198s : 8.03% 
combine_like_graphs : 0.000008s : 0.00% graph_reusing : 0.000002s : 0.00% meta_unpack_prepare : 0.000144s : 0.04% pre_cconv : 0.000004s : 0.00% abstract_specialize : 0.004281s : 1.22% pack_expand : 0.000013s : 0.00% auto_monad : 0.000100s : 0.03% inline : 0.000001s : 0.00% pre_auto_parallel : 0.000016s : 0.00% pipeline_split : 0.000002s : 0.00% optimize.py_interpret_to_execute : 0.000003s : 0.00% optimize.rewriter_before_opt_a : 0.000181s : 0.05% optimize.opt_a.expand_dump_flag : 0.000011s : 0.00% optimize.opt_a.switch_simplify : 0.000159s : 0.05% optimize.opt_a.a_1 : 0.001555s : 0.44% optimize.opt_a.recompute_prepare : 0.000044s : 0.01% optimize.opt_a.updatestate_depend_eliminate : 0.000048s : 0.01% optimize.opt_a.updatestate_assign_eliminate : 0.000038s : 0.01% optimize.opt_a.updatestate_loads_eliminate : 0.000034s : 0.01% optimize.opt_a.parameter_eliminate : 0.000013s : 0.00% optimize.opt_a.a_2 : 0.000525s : 0.15% optimize.opt_a.accelerated_algorithm : 0.000051s : 0.01% optimize.opt_a.pynative_shard : 0.000005s : 0.00% optimize.opt_a.auto_parallel : 0.000017s : 0.00% optimize.opt_a.parallel : 0.000025s : 0.01% optimize.opt_a.merge_comm : 0.000016s : 0.00% optimize.opt_a.allreduce_fusion : 0.000009s : 0.00% optimize.opt_a.virtual_dataset : 0.000030s : 0.01% optimize.opt_a.get_grad_eliminate_ : 0.000027s : 0.01% optimize.opt_a.virtual_output : 0.000025s : 0.01% optimize.opt_a.merge_forward : 0.000042s : 0.01% optimize.opt_a.cell_reuse_recompute_pass : 0.000003s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000075s : 0.02% optimize.opt_a.meta_fg_expand : 0.000038s : 0.01% optimize.opt_a.meta_fg_expand.resolve : 0.001545s : 0.44% optimize.opt_a.after_resolve : 0.000079s : 0.02% optimize.opt_a.a_after_grad : 0.000125s : 0.04% optimize.opt_a.renormalize : 0.074112s : 21.12% optimize.opt_a.real_op_eliminate : 0.000075s : 0.02% optimize.opt_a.auto_monad_grad : 0.000076s : 0.02% optimize.opt_a.auto_monad_eliminator : 0.000146s : 0.04% 
optimize.opt_a.cse : 0.000358s : 0.10% optimize.opt_a.a_3 : 0.000522s : 0.15% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.00% optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.00% optimize.rewriter_after_opt_a : 0.000073s : 0.02% optimize.convert_after_rewriter : 0.000016s : 0.00% optimize.order_py_execute_after_rewriter : 0.000011s : 0.00% optimize.opt_b.b_1 : 0.000474s : 0.14% optimize.opt_b.b_2 : 0.000006s : 0.00% optimize.opt_b.updatestate_depend_eliminate : 0.000005s : 0.00% optimize.opt_b.updatestate_assign_eliminate : 0.000005s : 0.00% optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.00% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000015s : 0.00% optimize.cconv : 0.000015s : 0.00% optimize.opt_after_cconv.c_1 : 0.000006s : 0.00% optimize.opt_after_cconv.parameter_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.cse : 0.000007s : 0.00% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000007s : 0.00% optimize.tuple_transform.d_1 : 0.000015s : 0.00% optimize.tuple_transform.d_2 : 0.000006s : 0.00% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000008s : 0.00% optimize.add_recomputation : 0.000036s : 0.01% optimize.cse_after_recomputation.cse : 0.000007s : 0.00% optimize.environ_conv : 0.000017s : 0.00% optimize.label_micro_interleaved_index : 0.000002s : 0.00% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.00% optimize.assign_add_opt : 0.000002s : 0.00% optimize.slice_recompute_activation : 0.000001s : 0.00% optimize.micro_interleaved_order_control : 0.000001s : 0.00% optimize.full_micro_interleaved_order_control : 0.000001s : 0.00% optimize.comp_comm_scheduling : 0.000001s : 
0.00% optimize.reorder_send_recv_between_fp_bp : 0.000002s : 0.00% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.00% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000000s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.00% optimize.split_layernorm_comm : 0.000001s : 0.00% optimize.process_send_recv_for_ge : 0.000002s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000017s : 0.00% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000490s : 0.14% validate : 0.000044s : 0.01% distribtued_split : 0.000001s : 0.00% task_emit : 0.004917s : 1.40% execute : 0.000007s : 0.00% Time group info: ------[substitution.] 0.030028 383 0.01% : 0.000003s : 5: substitution.float_depend_g_call 0.03% : 0.000010s : 14: substitution.float_tuple_getitem_switch 96.86% : 0.029087s : 25: substitution.getattr_setattr_resolve 0.01% : 0.000004s : 3: substitution.graph_param_transform 0.01% : 0.000002s : 3: substitution.incorporate_call 0.00% : 0.000001s : 3: substitution.incorporate_call_switch 1.93% : 0.000581s : 59: substitution.inline 0.02% : 0.000005s : 10: substitution.less_batch_normalization 0.14% : 0.000042s : 23: substitution.meta_unpack_prepare 0.05% : 0.000015s : 11: substitution.minmaximum_grad 0.03% : 0.000009s : 5: substitution.partial_eliminate 0.00% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.02% : 0.000007s : 47: substitution.remove_not_recompute_node 0.19% : 0.000057s : 38: substitution.replace_applicator 0.02% : 0.000007s : 20: substitution.replace_old_param 0.01% : 0.000003s : 2: substitution.reset_defer_inline 0.02% : 0.000007s : 8: substitution.set_cell_output_no_recompute 0.02% : 0.000007s : 5: substitution.specialize_transform 0.02% : 0.000007s : 
4: substitution.switch_simplify 0.03% : 0.000010s : 2: substitution.transpose_eliminate 0.13% : 0.000039s : 15: substitution.tuple_list_convert_item_index_to_positive 0.05% : 0.000015s : 15: substitution.tuple_list_get_item_const_eliminator 0.07% : 0.000022s : 15: substitution.tuple_list_get_item_depend_reorder 0.22% : 0.000067s : 33: substitution.tuple_list_get_item_eliminator 0.07% : 0.000021s : 15: substitution.tuple_list_get_set_item_eliminator ------[renormalize.] 0.074096 6 95.08% : 0.070450s : 3: renormalize.infer 4.92% : 0.003646s : 3: renormalize.specialize ------[replace.] 0.000647 68 47.50% : 0.000307s : 23: replace.getattr_setattr_resolve 29.62% : 0.000192s : 31: replace.inline 6.58% : 0.000043s : 2: replace.meta_unpack_prepare 8.24% : 0.000053s : 4: replace.switch_simplify 1.46% : 0.000009s : 2: replace.transpose_eliminate 6.60% : 0.000043s : 6: replace.tuple_list_get_item_eliminator ------[match.] 0.029587 68 98.05% : 0.029009s : 23: match.getattr_setattr_resolve 1.73% : 0.000513s : 31: match.inline 0.10% : 0.000030s : 2: match.meta_unpack_prepare 0.02% : 0.000007s : 4: match.switch_simplify 0.03% : 0.000010s : 2: match.transpose_eliminate 0.06% : 0.000018s : 6: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.003758 69 66.41% : 0.002496s : 28: func_graph_cloner_run.FuncGraphClonerGraph 33.59% : 0.001262s : 41: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 
0.033443 255 3.19% : 0.001068s : 104: opt.transform.opt_a 1.31% : 0.000437s : 92: opt.transform.opt_b 88.64% : 0.029643s : 10: opt.transform.opt_resolve 0.37% : 0.000124s : 1: opt.transforms.meta_unpack_prepare 6.38% : 0.002134s : 40: opt.transforms.opt_a 0.01% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.01% : 0.000004s : 2: opt.transforms.opt_b 0.06% : 0.000019s : 2: opt.transforms.opt_trans_graph 0.02% : 0.000008s : 3: opt.transforms.special_op_eliminate [INFO] GE(38167,python3.7):2024-01-11-05:45:06.347.965 [scalable_config.cc:55][EVENT]42668 ScalableConfig:device total max size: 34359738368, page_mem_size_total_thresold: 32641751449, uncacheable_size_threshold: 17179869184 [INFO] GE(38167,python3.7):2024-01-11-05:45:06.432.546 [graph_var_manager.cc:1424][EVENT]42668 SetAllMemoryMaxValue:The graph_mem_max_size is 27917287424 and the var_mem_max_size is 5368709120 [INFO] GE(38167,python3.7):2024-01-11-05:45:06.432.623 [graph_manager.cc:1248][EVENT]42668 PreRun:PreRun start: graph node size 3, session id 1, graph id 0, graph name online. [INFO] ATRACE(38167,python3.7):2024-01-11-05:45:06.433.489 [atrace_api.c:28](tid:42668) AtraceCreate start [INFO] ATRACE(38167,python3.7):2024-01-11-05:45:06.433.561 [trace_rb_log.c:84](tid:42668) [RUNTIME_ATRACE_DEV64_TS0] create ring buffer success, buffer size : 131152. [INFO] ATRACE(38167,python3.7):2024-01-11-05:45:06.433.574 [atrace_api.c:32](tid:42668) AtraceCreate end [INFO] TDT(38167,python3.7):2024-01-11-05:45:06.433.598 [client_manager.cpp:157][SetProfilingCallback][tid:42668] [TsdClient] set profiling callback success [INFO] GE(38167,python3.7):2024-01-11-05:45:06.434.490 [parallel_partitioner.cc:165][EVENT]42668 DoPipelinePartition:[GEPERFTRACE] The time cost of OptimizeSubgraph::PipelinePartition is [18] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.434.527 [parallel_partitioner.cc:178][EVENT]42668 DoFlowGraphPartition:[GEPERFTRACE] The time cost of OptimizeSubgraph::FlowGraphPartition is [14] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.434.571 [graph_prepare.cc:1378][EVENT]42668 Init:[GEPERFTRACE] The time cost of FileConstantUtils::ConvertFileConstToConst is [2] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.233 [graph_manager.cc:1050][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareInit is [676] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.257 [graph_manager.cc:1052][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.HandleSummaryOp is [5] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.368 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ForToWhilePass is [2] micro second, call num is [3]
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.392 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::SavePass is [2] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.449 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::NetOutputPass is [45] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.462 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of ProcessNetOutput::DataPass is [1] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.541 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of CreateSubGraphWithScopePass is [14] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.555 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of SubgraphMultiDimsClonePass is [1] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.571 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of MultiBatchClonePass is [6] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.652 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SplitVariableIntoSubgraphPass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.435.671 [graph_manager.cc:1054][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.NormalizeGraph is [402] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.442.742 [graph_manager.cc:1055][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeGraphInit is [7045] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.594 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of AssertPass is [2] micro second, call num is [6] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.617 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.628 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of MergePass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.638 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of InferShapePass is [240] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.647 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [13] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.655 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SplitShapeNPass is [2] micro second, call num is [6] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.663 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [21] 
micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.672 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [20] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.443.680 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of InferValuePass is [5] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.253 [graph_manager.cc:1056][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraphForQuantize is [2478] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.312 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of CondRemovePass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.328 [graph_prepare.cc:1982][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::ProcessBeforeInfershape is [45] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.651 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of AssertPass is [1] micro second, call num is [6] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.671 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.681 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of MergePass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.690 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of InferShapePass is [173] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.698 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [8] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.706 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of 
SplitShapeNPass is [1] micro second, call num is [6] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.715 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [6] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.723 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [7] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.731 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of InferValuePass is [3] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.765 [graph_prepare.cc:1983][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::FormatAndShapeProcess is [423] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.788 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PreRun::MarkForceUnknownForCondPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.801 [graph_prepare.cc:1984][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::CtrlFlowPreProcess is [21] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.814 [graph_prepare.cc:1985][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::multibatch::GetDynamicOutputShape is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.840 [graph_prepare.cc:1986][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::InsertAippOpUtil::Instance().UpdateDataNodeByAipp is [15] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.853 [graph_prepare.cc:1987][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::SaveOriginalGraphToOmModel is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.868 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareOptimize::ShapeOperateOpRemovePass is [4] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.880 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareOptimize::ReplaceTransShapePass is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.893 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareOptimize::MarkAgnosticPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.960 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of EnterPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.972 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of CondPass is [3] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.981 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrintOpPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.445.991 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of NoUseReshapeRemovePass is [3] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.000 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DropOutPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.010 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of AssertPass is [3] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.021 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of TransposeRemovePass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.030 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of UnusedConstPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.041 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of StopGradientPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.050 [base_pass.cc:339][EVENT]42668 
Run:[GEPERFTRACE] The time cost of PreventGradientPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.058 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of PlaceholderWithDefaultPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.066 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SnapshotPass is [0] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.074 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of GuaranteeConstPass is [0] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.092 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of VarIsInitializedOpPass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.104 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ParallelConcatStartOpPass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.114 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of IdentityPass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.137 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareOptimize::PrunePass is [8] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.150 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareOptimize::HcclMemcpyPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.180 [graph_prepare.cc:1988][EVENT]42668 PrepareDynShape:[GEPERFTRACE] The time cost of Prepare::PrepareOptimize is [318] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.446.193 [graph_manager.cc:1065][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareDynShape is [911] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.458.416 [graph_manager.cc:1077][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraph is [12200] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.458.488 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PrepareRunningFormatRefiner::VariablePrepareOpPass is [5] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.458.530 [graph_manager.cc:1080][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.preparer.PrepareRunningFormatRefiner is [74] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.784 [graph_manager.cc:1081][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeOriginalGraphJudgeInsert is [3237] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.826 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of SubexpressionMigrationPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.839 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of UnusedArgsCleanPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.852 [graph_manager.cc:1082][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::SubexpressionMigration is [34] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.879 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::MergeInputMemcpyPass is [4] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.893 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::SwitchDataEdgesBypass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.906 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::ConstantFuseSamePass is [3] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.937 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::CSEBeforeFuseDataNodesWithCommonInputPass is [21] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.951 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::FuseDataNodesWithCommonInputPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.966 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::CommonSubexpressionEliminationPass is [4] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.461.979 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::PermutePass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.017 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::SameTransdataBreadthFusionPass is [27] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.047 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::VariableOpPass is [7] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.065 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::TransOpWithoutReshapeFusionPass is [7] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.111 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::TransOpBreadthFusionPass is [36] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.128 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::DataFlowPreparePass is [6] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.141 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_1::MergeUnknownShapeNPass is [1] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.150 [graph_manager.cc:2700][EVENT]42668 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_1 is [276] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.268 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of EnterPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.281 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of AddNPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.291 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SwitchDeadBranchElimination is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.299 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of SwitchLogicRemovePass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.308 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of MergePass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.316 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of CastRemovePass is [7] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.324 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of TransposeTransDataPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.332 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ReshapeRemovePass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.340 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of TransOpSymmetryEliminationPass is [5] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.349 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of TransOpNearbyAllreduceFusionPass is [1] micro second, call num is [3] [INFO] 
GE(38167,python3.7):2024-01-11-05:45:06.462.357 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ReplaceWithEmptyConstPass is [10] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.365 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DimensionComputePass is [5] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.373 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [11] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.381 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DimensionAdjustPass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.389 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of UselessControlOutRemovePass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.399 [graph_manager.cc:2741][EVENT]42668 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_2 is [234] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.420 [graph_manager.cc:2752][EVENT]42668 OptimizeStage1:[GEPERFTRACE] The time cost of extern constant folding is [0] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.441 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::Migration is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.454 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ArgsClean is [0] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.472 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::PrunePass is [8] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.486 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::NextIterationPass is [3] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.498 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ControlTriggerPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.510 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::MergeToStreamMergePass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.530 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::SwitchToStreamSwitchPass is [10] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.544 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::AttachStreamLabelPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.557 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::MultiBatchPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.567 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::SubgraphMultiDimsPass is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.580 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::IteratorOpPass is [4] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.591 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::VariableRefUselessControlOutDeletePass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.608 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::ReshapeRecoveryPass is [6] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.620 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage1_3::RemoveSameConstPass is [2] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.628 [graph_manager.cc:2810][EVENT]42668 OptimizeStage1:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1_3 is [192] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.655 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of IdentityPass is [2] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.668 [graph_manager.cc:2821][EVENT]42668 OptimizeStage1:[GEPERFTRACE] The time cost of GraphPrepare::node_pass is [32] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.692 [graph_manager.cc:1087][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage1 is [823] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.814 [graph_manager.cc:1088][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeAfterStage1 is [107] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.847 [graph_manager.cc:1089][EVENT]42668 PreRunOptimizeOriginalGraph:[GEPERFTRACE] The time cost of GraphManager::GraphUtilsEx::InferShapeInNeed is [14] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.863 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of PreRun::CtrlEdgeTransferPass is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.883 [graph_manager.cc:1097][EVENT]42668 PreRunOptimizeOriginalGraph:PreRun:PreRunOptimizeOriginalGraph success. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.462.903 [graph_manager.cc:3325][EVENT]42668 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::StagePartition is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.252 [engine_place.cc:144][EVENT]42668 Run:The time cost of AIcoreEngine::CheckSupported is [242] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.274 [engine_place.cc:144][EVENT]42668 Run:The time cost of DNN_VM_GE_LOCAL_OP_STORE::CheckSupported is [6] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.283 [engine_place.cc:144][EVENT]42668 Run:The time cost of DNN_VM_RTS_OP_STORE::CheckSupported is [7] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.352 [graph_manager.cc:3351][EVENT]42668 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::GraphPartitionDynamicShape is [436] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.368 [graph_manager.cc:3364][EVENT]42668 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::SubgraphPartitionAndOptimization::CompositeEngine is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.427 [engine_partitioner.cc:1139][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionInitialize is [15] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.442 [engine_partitioner.cc:1142][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionMarkClusters is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.563 [engine_partitioner.cc:1148][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSplitSubGraphs is [112] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.596 [engine_partitioner.cc:1155][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSortSubGraphs is [20] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.643 [engine_partitioner.cc:1164][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionAddPartitionsToGraphNode is [35] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.668 [graph_manager.cc:3405][EVENT]42668 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::Partition1 is [288] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.463.686 [graph_manager.cc:3412][EVENT]42668 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubgraphPreProc is [7] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.151 [graph_manager.cc:3422][EVENT]42668 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubGraph is [11453] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.184 [graph_manager.cc:3428][EVENT]42668 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::SetSubgraphPostProc is [8] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.289 [graph_manager.cc:3467][EVENT]42668 SubgraphPartitionAndOptimization:[GEPERFTRACE] The time cost of OptimizeSubgraph::MergeSubGraph is [84] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.306 [graph_manager.cc:3377][EVENT]42668 OptimizeSubgraph:[GEPERFTRACE] The time cost of OptimizeSubgraph::SubgraphPartitionAndOptimization::AtomicEngine is [11926] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.322 [graph_manager.cc:1106][EVENT]42668 PreRunOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeSubgraph is [12424] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.335 [graph_manager.cc:1115][EVENT]42668 PreRunOptimizeSubGraph:PreRun:PreRunOptimizeSubGraph success. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.355 [graph_manager.cc:1130][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.ReplacePrecompiledNodeWithOmGraph is [4] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.397 [graph_manager.cc:1131][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::stages.optimizer.OptimizeWholeGraph is [18] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.425 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::LinkGenMaskNodesPass is [9] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.442 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::HcclContinuousMemcpyPass is [4] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.452 [graph_manager.cc:2837][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses is [39] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.524 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ConstantFoldingPass is [12] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.537 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of ReshapeRemovePass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.546 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of CondRemovePass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.554 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of BitcastPass is [1] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.563 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of AssignRemovePass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.571 [base_pass.cc:339][EVENT]42668 Run:[GEPERFTRACE] The time cost of DimensionAdjustPass is [4] micro second, call num is [3] [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.580 [graph_manager.cc:2864][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of 
OptimizeStage2::MergedGraphNameToPasses is [113] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.591 [graph_manager.cc:2872][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::RemoveIsolatedConst is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.607 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::MultiBatchPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.620 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::RefIdentityDeleteOpPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.635 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::VariableRefDeleteOpPass is [4] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.648 [compile_nodes_pass.cc:88][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::CompileNodesPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.660 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::CompileNodesPass is [14] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.670 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::SwapSpacePass is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.740 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::InputOutputConnectionIdentifyPass is [59] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.769 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::AtomicAddrCleanPass is [16] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.782 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::AfterMergePasses::EndOfSequenceAddControlPass is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.800 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::SubgraphPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.813 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize::AttachStreamLabelPass is [3] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.823 [graph_manager.cc:2927][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of OptimizeStage2::ControlAttrOptimize is [219] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.841 [graph_manager.cc:2937][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of ModelBuilder::AssignFunctionalLabels is [9] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.857 [graph_manager.cc:2943][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of MemcpyAddrAsyncPass::Run. is [6] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.475.868 [graph_manager.cc:2950][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of BufferPoolMemoryPass::Run. is [2] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.585 [graph_manager.cc:2958][EVENT]42668 OptimizeStage2:[GEPERFTRACE] The time cost of ParallelGroupPass::Run. is [42] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.638 [graph_manager.cc:1132][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::OptimizeStage2 is [9226] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.728 [graph_manager.cc:1135][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::GetCompilerStages(graph_node->GetGraphId()).optimizer.OptimizeGraphBeforeBuild is [74] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.775 [graph_manager.cc:2975][EVENT]42668 MemConflictProc:[GEPERFTRACE] The time cost of HandleMemoryRWConflict is [29] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.814 [graph_manager.cc:2981][EVENT]42668 MemConflictProc:[GEPERFTRACE] The time cost of MemLayoutConflictOptimizer::Run. is [26] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.830 [pass_manager.cc:82][EVENT]42668 Run:[GEPERFTRACE] The time cost of OptimizeStage2::SetFftsPlusAttrPass is [1] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.841 [graph_manager.cc:2986][EVENT]42668 MemConflictProc:[GEPERFTRACE] The time cost of SetFftsPlusAttrPass::last_passes.Run is [16] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.851 [graph_manager.cc:1136][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::MemConflictProc is [105] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.484.979 [graph_manager.cc:3555][EVENT]42668 Build:[GEPERFTRACE] The time cost of GraphManager::RecoverIrDefinitionAndModifyAippData is [96] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.073 [engine_partitioner.cc:1139][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionInitialize is [16] micro second. [INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.089 [engine_partitioner.cc:1142][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionMarkClusters is [3] micro second. 
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.199 [engine_partitioner.cc:1148][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSplitSubGraphs is [99] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.234 [engine_partitioner.cc:1155][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionSortSubGraphs is [21] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.277 [engine_partitioner.cc:1164][EVENT]42668 PartitionSubGraph:[GEPERFTRACE] The time cost of EnginePartitioner::PartitionAddPartitionsToGraphNode is [31] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.311 [graph_builder.cc:865][EVENT]42668 SecondPartition:[GEPERFTRACE] The time cost of EnginePartitioner::Partition2 is [274] micro second.
[INFO] RUNTIME(38167,python3.7):2024-01-11-05:45:06.485.761 [logger.cc:1071] 42668 ModelBindStream: model_id=832, stream_id=1089, flag=0.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.803 [task_generator.cc:804][EVENT]42668 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::SetStreamCtx is [183] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.485.865 [task_generator.cc:805][EVENT]42668 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::PrepareForGenerateTask is [50] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.643 [task_generator.cc:814][EVENT]42668 GenerateTask:[GEPERFTRACE] The time cost of TaskGenerator::DoGenerateTask is [762] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.657 [task_generator.cc:954][EVENT]42668 GetTaskInfo:[GEPERFTRACE] The time cost of TaskGenerator::GenerateTask is [1039] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.703 [task_generator.cc:967][EVENT]42668 GetTaskInfo:[GEPERFTRACE] The time cost of TaskGenerator::AddModelTaskToModel is [23] micro second.
[INFO] RUNTIME(38167,python3.7):2024-01-11-05:45:06.486.722 [logger.cc:1084] 42668 ModelUnbindStream: model_id=832, stream_id=1089,
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.877 [graph_manager.cc:1152][EVENT]42668 PreRunAfterOptimizeSubGraph:[GEPERFTRACE] The time cost of GraphManager::Build is [2003] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.896 [graph_manager.cc:1164][EVENT]42668 PreRunAfterOptimizeSubGraph:PreRun:PreRunAfterOptimizeSubGraph success.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.926 [graph_manager.cc:1271][EVENT]42668 PreRun:[GEPERFTRACE] The time cost of FlowModelBuild is [52534] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.486.937 [graph_manager.cc:1272][EVENT]42668 PreRun:[GEPERFTRACE] GE PreRun End
[INFO] ATRACE(38167,python3.7):2024-01-11-05:45:06.487.242 [atrace_api.c:93](tid:42668) AtraceDestroy start
[INFO] ATRACE(38167,python3.7):2024-01-11-05:45:06.487.264 [atrace_api.c:95](tid:42668) AtraceDestroy end
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.491.833 [graph_converter.cc:838][EVENT]42668 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::CreateMainNode is [1296] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.491.971 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of ZeroCopy is [99] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.440 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CEM is [450] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.613 [copy_flow_launch_fuse.cc:395][EVENT]42668 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [152] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.631 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [171] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.845 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [202] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.870 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [8] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.492.904 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of ZeroCopy is [23] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.080 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CEM is [163] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.158 [copy_flow_launch_fuse.cc:395][EVENT]42668 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [62] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.171 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [74] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.209 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [19] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.220 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [0] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.246 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of ZeroCopy is [16] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.313 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CEM is [58] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.378 [copy_flow_launch_fuse.cc:395][EVENT]42668 Run:[GEPERFTRACE] The time cost of Pass::CopyFlowLaunchFuse is [53] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.390 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of CopyFlowLaunch is [66] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.415 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of TrustOutTensor is [17] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.425 [base_optimizer.cc:70][EVENT]42668 Run:[GEPERFTRACE] The time cost of AicpuFuseHostInputs is [0] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.436 [graph_converter.cc:849][EVENT]42668 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::RunAllPass is [1568] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.493.642 [graph_converter.cc:853][EVENT]42668 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::TopologicalSorting is [197] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.494.260 [graph_converter.cc:857][EVENT]42668 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::AppendGraphLevelData is [605] micro second.
[INFO] GE(38167,python3.7):2024-01-11-05:45:06.494.392 [graph_converter.cc:862][EVENT]42668 ConvertComputeGraphToExecuteGraph:[GEPERFTRACE] The time cost of ConvertComputeGraphToExecuteGraph::CalculatePriority is [112] micro second.
TotalTime = 0.0236031, [20] [parse]: 0.00270258 [symbol_resolve]: 0.0106751, [1] [Cycle 1]: 0.0106292, [1] [resolve]: 0.0106106 [combine_like_graphs]: 6.69999e-07 [graph_reusing]: 2.65e-06 [meta_unpack_prepare]: 4.608e-05 [pre_cconv]: 4.1e-07 [abstract_specialize]: 0.00185195 [pack_expand]: 8.75e-06 [auto_monad]: 3.399e-05 [inline]: 1.65001e-06 [pre_auto_parallel]: 8.38e-06 [pipeline_split]: 2.12e-06 [optimize]: 0.00385758, [35] [py_interpret_to_execute]: 3.89e-06 [rewriter_before_opt_a]: 3.605e-05 [opt_a]: 0.00339458, [2] [Cycle 1]: 0.00077867, [30] [expand_dump_flag]: 2.71e-06 [switch_simplify]: 1.271e-05 [a_1]: 0.00017621 [recompute_prepare]: 2.56e-06 [updatestate_depend_eliminate]: 5.01e-06 [updatestate_assign_eliminate]: 3.17e-06 [updatestate_loads_eliminate]: 2.6e-06 [parameter_eliminate]: 2.75e-06 [a_2]: 3.222e-05 [accelerated_algorithm]: 2.86e-06 [pynative_shard]: 1.42e-06 [auto_parallel]: 3.01e-06 [parallel]: 6.86001e-06 [merge_comm]: 3.23e-06 [allreduce_fusion]: 2.49e-06 [virtual_dataset]: 2.53e-06 [get_grad_eliminate_]: 2.24e-06 [virtual_output]: 1.88e-06 [merge_forward]: 3.76e-06 [cell_reuse_recompute_pass]: 8.2e-07 [cell_reuse_handle_not_recompute_node_pass]: 6.61999e-06 [meta_fg_expand]: 2.53999e-06 [after_resolve]: 4.89e-06 [a_after_grad]: 2.74e-06 [renormalize]: 0.00029433 [real_op_eliminate]: 4.25e-06 [auto_monad_grad]: 2.66e-06 [auto_monad_eliminator]: 7.14e-06 [cse]: 1.706e-05 [a_3]: 1.688e-05 [Cycle 2]: 0.00024367, [30] [expand_dump_flag]: 8.80005e-07 [switch_simplify]: 2.29001e-06 [a_1]: 1.63e-05 [recompute_prepare]: 1.89e-06 [updatestate_depend_eliminate]: 2.93e-06 [updatestate_assign_eliminate]: 2.3e-06 [updatestate_loads_eliminate]: 2.1e-06 [parameter_eliminate]: 7.49998e-07 [a_2]: 2.933e-05 [accelerated_algorithm]: 2.59e-06 [pynative_shard]: 1.06e-06 [auto_parallel]: 2.95e-06 [parallel]: 2.92e-06 [merge_comm]: 1.91e-06 [allreduce_fusion]: 1.53e-06 [virtual_dataset]: 2.34001e-06 [get_grad_eliminate_]: 1.93001e-06 [virtual_output]: 1.77e-06 
[merge_forward]: 2.79e-06 [cell_reuse_recompute_pass]: 3.69997e-07 [cell_reuse_handle_not_recompute_node_pass]: 6.18e-06 [meta_fg_expand]: 2.22e-06 [after_resolve]: 3.69e-06 [a_after_grad]: 2.36e-06 [renormalize]: 6.99947e-08 [real_op_eliminate]: 2e-06 [auto_monad_grad]: 8.60004e-07 [auto_monad_eliminator]: 3.83e-06 [cse]: 8.46e-06 [a_3]: 1.448e-05 [py_interpret_to_execute_after_opt_a]: 3.91001e-06 [slice_cell_reuse_recomputed_activation]: 1.4e-06 [rewriter_after_opt_a]: 1.816e-05 [convert_after_rewriter]: 4.33e-06 [order_py_execute_after_rewriter]: 3.57e-06 [opt_b]: 9.434e-05, [1] [Cycle 1]: 8.966e-05, [7] [b_1]: 4.246e-05 [b_2]: 3.43e-06 [updatestate_depend_eliminate]: 2.18e-06 [updatestate_assign_eliminate]: 2.3e-06 [updatestate_loads_eliminate]: 2.1e-06 [renormalize]: 2.39997e-07 [cse]: 7.66e-06 [cconv]: 1.407e-05 [opt_after_cconv]: 5.186e-05, [1] [Cycle 1]: 4.792e-05, [7] [c_1]: 5.18e-06 [parameter_eliminate]: 6.49998e-07 [updatestate_depend_eliminate]: 2.15e-06 [updatestate_assign_eliminate]: 2.01e-06 [updatestate_loads_eliminate]: 2.08e-06 [cse]: 6.62e-06 [renormalize]: 2.00002e-07 [remove_dup_value]: 7.14e-06 [tuple_transform]: 3.471e-05, [1] [Cycle 1]: 3.119e-05, [3] [d_1]: 1.249e-05 [d_2]: 5.85e-06 [renormalize]: 1.8e-07 [add_cache_embedding]: 8.21e-06 [add_recomputation]: 2.947e-05 [cse_after_recomputation]: 1.556e-05, [1] [Cycle 1]: 1.149e-05, [1] [cse]: 6.77e-06 [environ_conv]: 3.81e-06 [label_micro_interleaved_index]: 1.63e-06 [label_fine_grained_interleaved_index]: 1.36e-06 [assign_add_opt]: 1.1e-06 [slice_recompute_activation]: 1.3e-06 [micro_interleaved_order_control]: 1.11001e-06 [full_micro_interleaved_order_control]: 1.11e-06 [comp_comm_scheduling]: 2.05e-06 [reorder_send_recv_between_fp_bp]: 1.28e-06 [comm_op_add_attrs]: 9.90003e-07 [add_comm_op_reuse_tag]: 5.9e-07 [overlap_opt_shard_in_pipeline]: 7.2e-07 [grouped_pairwise_exchange_alltoall]: 7.49998e-07 [overlap_recompute_and_grad_model_parallel]: 1.16e-06 
[overlap_grad_matmul_and_grad_allreduce]: 5.10001e-07 [split_matmul_comm_elemetwise]: 1.64e-06 [split_layernorm_comm]: 1.12e-06 [process_send_recv_for_ge]: 5.60001e-07 [handle_group_info]: 5.70006e-07 [auto_monad_reorder]: 1.111e-05 [get_jit_bprop_graph]: 3.00002e-07 [eliminate_special_op_node]: 0.00044788 [validate]: 1.907e-05 [distribtued_split]: 1.15e-06 [task_emit]: 0.00374096 [execute]: 7.5e-06 Sums parse : 0.002703s : 13.18% symbol_resolve.resolve : 0.010611s : 51.73% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000003s : 0.01% meta_unpack_prepare : 0.000046s : 0.22% pre_cconv : 0.000000s : 0.00% abstract_specialize : 0.001852s : 9.03% pack_expand : 0.000009s : 0.04% auto_monad : 0.000034s : 0.17% inline : 0.000002s : 0.01% pre_auto_parallel : 0.000008s : 0.04% pipeline_split : 0.000002s : 0.01% optimize.py_interpret_to_execute : 0.000004s : 0.02% optimize.rewriter_before_opt_a : 0.000036s : 0.18% optimize.opt_a.expand_dump_flag : 0.000004s : 0.02% optimize.opt_a.switch_simplify : 0.000015s : 0.07% optimize.opt_a.a_1 : 0.000193s : 0.94% optimize.opt_a.recompute_prepare : 0.000004s : 0.02% optimize.opt_a.updatestate_depend_eliminate : 0.000008s : 0.04% optimize.opt_a.updatestate_assign_eliminate : 0.000005s : 0.03% optimize.opt_a.updatestate_loads_eliminate : 0.000005s : 0.02% optimize.opt_a.parameter_eliminate : 0.000003s : 0.02% optimize.opt_a.a_2 : 0.000062s : 0.30% optimize.opt_a.accelerated_algorithm : 0.000005s : 0.03% optimize.opt_a.pynative_shard : 0.000002s : 0.01% optimize.opt_a.auto_parallel : 0.000006s : 0.03% optimize.opt_a.parallel : 0.000010s : 0.05% optimize.opt_a.merge_comm : 0.000005s : 0.03% optimize.opt_a.allreduce_fusion : 0.000004s : 0.02% optimize.opt_a.virtual_dataset : 0.000005s : 0.02% optimize.opt_a.get_grad_eliminate_ : 0.000004s : 0.02% optimize.opt_a.virtual_output : 0.000004s : 0.02% optimize.opt_a.merge_forward : 0.000007s : 0.03% optimize.opt_a.cell_reuse_recompute_pass : 0.000001s : 0.01% 
optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000013s : 0.06% optimize.opt_a.meta_fg_expand : 0.000005s : 0.02% optimize.opt_a.after_resolve : 0.000009s : 0.04% optimize.opt_a.a_after_grad : 0.000005s : 0.02% optimize.opt_a.renormalize : 0.000294s : 1.44% optimize.opt_a.real_op_eliminate : 0.000006s : 0.03% optimize.opt_a.auto_monad_grad : 0.000004s : 0.02% optimize.opt_a.auto_monad_eliminator : 0.000011s : 0.05% optimize.opt_a.cse : 0.000026s : 0.12% optimize.opt_a.a_3 : 0.000031s : 0.15% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.02% optimize.slice_cell_reuse_recomputed_activation : 0.000001s : 0.01% optimize.rewriter_after_opt_a : 0.000018s : 0.09% optimize.convert_after_rewriter : 0.000004s : 0.02% optimize.order_py_execute_after_rewriter : 0.000004s : 0.02% optimize.opt_b.b_1 : 0.000042s : 0.21% optimize.opt_b.b_2 : 0.000003s : 0.02% optimize.opt_b.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000008s : 0.04% optimize.cconv : 0.000014s : 0.07% optimize.opt_after_cconv.c_1 : 0.000005s : 0.03% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.cse : 0.000007s : 0.03% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000007s : 0.03% optimize.tuple_transform.d_1 : 0.000012s : 0.06% optimize.tuple_transform.d_2 : 0.000006s : 0.03% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000008s : 0.04% optimize.add_recomputation : 0.000029s : 0.14% optimize.cse_after_recomputation.cse : 0.000007s : 0.03% 
optimize.environ_conv : 0.000004s : 0.02% optimize.label_micro_interleaved_index : 0.000002s : 0.01% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.01% optimize.assign_add_opt : 0.000001s : 0.01% optimize.slice_recompute_activation : 0.000001s : 0.01% optimize.micro_interleaved_order_control : 0.000001s : 0.01% optimize.full_micro_interleaved_order_control : 0.000001s : 0.01% optimize.comp_comm_scheduling : 0.000002s : 0.01% optimize.reorder_send_recv_between_fp_bp : 0.000001s : 0.01% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.01% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.01% optimize.split_layernorm_comm : 0.000001s : 0.01% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000011s : 0.05% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000448s : 2.18% validate : 0.000019s : 0.09% distribtued_split : 0.000001s : 0.01% task_emit : 0.003741s : 18.24% execute : 0.000007s : 0.04% Time group info: ------[substitution.] 0.010524 37 99.05% : 0.010424s : 8: substitution.getattr_setattr_resolve 0.03% : 0.000003s : 3: substitution.graph_param_transform 0.75% : 0.000079s : 3: substitution.inline 0.08% : 0.000009s : 13: substitution.meta_unpack_prepare 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.01% : 0.000001s : 4: substitution.remove_not_recompute_node 0.01% : 0.000001s : 2: substitution.replace_old_param 0.05% : 0.000005s : 1: substitution.tuple_list_get_item_eliminator ------[renormalize.] 0.000289 2 58.79% : 0.000170s : 1: renormalize.infer 41.21% : 0.000119s : 1: renormalize.specialize ------[replace.] 
0.000126 10 75.68% : 0.000095s : 6: replace.getattr_setattr_resolve 18.67% : 0.000023s : 3: replace.inline 5.64% : 0.000007s : 1: replace.tuple_list_get_item_eliminator ------[match.] 0.010449 10 99.19% : 0.010364s : 6: match.getattr_setattr_resolve 0.75% : 0.000079s : 3: match.inline 0.05% : 0.000005s : 1: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.000400 10 67.61% : 0.000270s : 5: func_graph_cloner_run.FuncGraphClonerGraph 32.39% : 0.000130s : 5: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.011015 105 0.74% : 0.000082s : 52: opt.transform.opt_a 0.30% : 0.000033s : 23: opt.transform.opt_b 96.28% : 0.010605s : 2: opt.transform.opt_resolve 0.27% : 0.000030s : 1: opt.transforms.meta_unpack_prepare 2.13% : 0.000235s : 20: opt.transforms.opt_a 0.04% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000002s : 1: opt.transforms.opt_b 0.15% : 0.000017s : 2: opt.transforms.opt_trans_graph 0.07% : 0.000008s : 3: opt.transforms.special_op_eliminate TotalTime = 0.0842057, [20] [parse]: 0.00133283 [symbol_resolve]: 0.0123801, [1] [Cycle 1]: 0.0123195, [1] [resolve]: 0.0123012 [combine_like_graphs]: 7.2e-07 [graph_reusing]: 2.63e-06 [meta_unpack_prepare]: 0.00012944 [pre_cconv]: 5.4e-07 [abstract_specialize]: 0.0037285 [pack_expand]: 1.306e-05 [auto_monad]: 6.935e-05 [inline]: 1.31e-06 [pre_auto_parallel]: 8.64e-06 [pipeline_split]: 2.1e-06 [optimize]: 0.0618677, [35] [py_interpret_to_execute]: 4.15e-06 [rewriter_before_opt_a]: 0.00017622 [opt_a]: 0.0607018, [4] [Cycle 1]: 0.0294577, [30] [expand_dump_flag]: 3.57e-06 [switch_simplify]: 2.345e-05 [a_1]: 0.0003986 [recompute_prepare]: 9.73e-06 [updatestate_depend_eliminate]: 9.41999e-06 [updatestate_assign_eliminate]: 6.5e-06 [updatestate_loads_eliminate]: 5.91e-06 [parameter_eliminate]: 4.07e-06 [a_2]: 8.127e-05 [accelerated_algorithm]: 5.69e-06 [pynative_shard]: 1.27e-06 [auto_parallel]: 
2.93e-06 [parallel]: 7.01e-06 [merge_comm]: 3.19e-06 [allreduce_fusion]: 2.02e-06 [virtual_dataset]: 5.3e-06 [get_grad_eliminate_]: 4.65001e-06 [virtual_output]: 4.22e-06 [merge_forward]: 7.36e-06 [cell_reuse_recompute_pass]: 5.69999e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.368e-05 [meta_fg_expand]: 0.00172001, [1] [Cycle 1]: 0.0004103, [1] [resolve]: 0.0003932 [after_resolve]: 2.03e-05 [a_after_grad]: 3.352e-05 [renormalize]: 0.0264768 [real_op_eliminate]: 2.47e-05 [auto_monad_grad]: 3.168e-05 [auto_monad_eliminator]: 4.77e-05 [cse]: 0.00010702 [a_3]: 0.00017353 [Cycle 2]: 0.0255175, [30] [expand_dump_flag]: 3.29e-06 [switch_simplify]: 6.058e-05 [a_1]: 0.00043026 [recompute_prepare]: 1.085e-05 [updatestate_depend_eliminate]: 1.176e-05 [updatestate_assign_eliminate]: 8.50001e-06 [updatestate_loads_eliminate]: 8.39e-06 [parameter_eliminate]: 3.49e-06 [a_2]: 0.00012364 [accelerated_algorithm]: 1.217e-05 [pynative_shard]: 1.24e-06 [auto_parallel]: 4.55e-06 [parallel]: 4.86e-06 [merge_comm]: 2.72e-06 [allreduce_fusion]: 1.87e-06 [virtual_dataset]: 7.39e-06 [get_grad_eliminate_]: 6.12e-06 [virtual_output]: 5.93e-06 [merge_forward]: 9.67e-06 [cell_reuse_recompute_pass]: 4.89999e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.727e-05 [meta_fg_expand]: 0.00470414, [3] [Cycle 1]: 0.00032336, [1] [resolve]: 0.00030578 [Cycle 1]: 0.00043571, [1] [resolve]: 0.00041734 [Cycle 1]: 0.00031422, [1] [resolve]: 0.00029705 [after_resolve]: 3.198e-05 [a_after_grad]: 5.367e-05 [renormalize]: 0.0193695 [real_op_eliminate]: 2.777e-05 [auto_monad_grad]: 3.578e-05 [auto_monad_eliminator]: 5.541e-05 [cse]: 0.00012155 [a_3]: 0.00021032 [Cycle 3]: 0.00262597, [30] [expand_dump_flag]: 2.95e-06 [switch_simplify]: 6.052e-05 [a_1]: 0.00055301 [recompute_prepare]: 1.305e-05 [updatestate_depend_eliminate]: 1.302e-05 [updatestate_assign_eliminate]: 1.07e-05 [updatestate_loads_eliminate]: 1.004e-05 [parameter_eliminate]: 3.52e-06 [a_2]: 0.00016025 [accelerated_algorithm]: 1.623e-05 
[pynative_shard]: 1.17e-06 [auto_parallel]: 4.19e-06 [parallel]: 4.29e-06 [merge_comm]: 3.37e-06 [allreduce_fusion]: 2.15e-06 [virtual_dataset]: 8.57e-06 [get_grad_eliminate_]: 7.95e-06 [virtual_output]: 7.54e-06 [merge_forward]: 1.192e-05 [cell_reuse_recompute_pass]: 4.19997e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.208e-05 [meta_fg_expand]: 2.896e-05 [after_resolve]: 1.266e-05 [a_after_grad]: 1.535e-05 [renormalize]: 0.00130154 [real_op_eliminate]: 1.228e-05 [auto_monad_grad]: 4.45999e-06 [auto_monad_eliminator]: 2.227e-05 [cse]: 8.158e-05 [a_3]: 7.303e-05 [Cycle 4]: 0.00076779, [30] [expand_dump_flag]: 1.19e-06 [switch_simplify]: 8.8e-06 [a_1]: 0.00015881 [recompute_prepare]: 1.109e-05 [updatestate_depend_eliminate]: 1.339e-05 [updatestate_assign_eliminate]: 1.024e-05 [updatestate_loads_eliminate]: 9.76001e-06 [parameter_eliminate]: 1.68e-06 [a_2]: 0.00016026 [accelerated_algorithm]: 1.585e-05 [pynative_shard]: 1.37e-06 [auto_parallel]: 2.91e-06 [parallel]: 3.61e-06 [merge_comm]: 2.61e-06 [allreduce_fusion]: 2.06001e-06 [virtual_dataset]: 8.72e-06 [get_grad_eliminate_]: 7.68e-06 [virtual_output]: 7.47e-06 [merge_forward]: 1.167e-05 [cell_reuse_recompute_pass]: 2.90005e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.122e-05 [meta_fg_expand]: 8.22e-06 [after_resolve]: 1.181e-05 [a_after_grad]: 1.532e-05 [renormalize]: 6.00048e-08 [real_op_eliminate]: 8.26e-06 [auto_monad_grad]: 1.82e-06 [auto_monad_eliminator]: 2.026e-05 [cse]: 4.598e-05 [a_3]: 6.689e-05 [py_interpret_to_execute_after_opt_a]: 3.86999e-06 [slice_cell_reuse_recomputed_activation]: 1.59e-06 [rewriter_after_opt_a]: 6.843e-05 [convert_after_rewriter]: 1.555e-05 [order_py_execute_after_rewriter]: 1.088e-05 [opt_b]: 0.00058096, [2] [Cycle 1]: 0.00048917, [7] [b_1]: 0.00043348 [b_2]: 3.54e-06 [updatestate_depend_eliminate]: 3.4e-06 [updatestate_assign_eliminate]: 2.55999e-06 [updatestate_loads_eliminate]: 2.04999e-06 [renormalize]: 3.50003e-07 [cse]: 8.81001e-06 [Cycle 2]: 8.244e-05, [7] 
[b_1]: 3.964e-05 [b_2]: 2.28e-06 [updatestate_depend_eliminate]: 2.07e-06 [updatestate_assign_eliminate]: 1.93e-06 [updatestate_loads_eliminate]: 1.74e-06 [renormalize]: 6.99947e-08 [cse]: 6.36e-06 [cconv]: 1.367e-05 [opt_after_cconv]: 5.307e-05, [1] [Cycle 1]: 4.881e-05, [7] [c_1]: 5.04e-06 [parameter_eliminate]: 1.47e-06 [updatestate_depend_eliminate]: 2.46e-06 [updatestate_assign_eliminate]: 2.01001e-06 [updatestate_loads_eliminate]: 1.8e-06 [cse]: 6.28e-06 [renormalize]: 2.20003e-07 [remove_dup_value]: 6.72e-06 [tuple_transform]: 3.665e-05, [1] [Cycle 1]: 3.299e-05, [3] [d_1]: 1.357e-05 [d_2]: 6.19e-06 [renormalize]: 1.59998e-07 [add_cache_embedding]: 7.97999e-06 [add_recomputation]: 2.897e-05 [cse_after_recomputation]: 1.667e-05, [1] [Cycle 1]: 1.252e-05, [1] [cse]: 7.75e-06 [environ_conv]: 5.01e-06 [label_micro_interleaved_index]: 1.59e-06 [label_fine_grained_interleaved_index]: 1.3e-06 [assign_add_opt]: 1.09999e-06 [slice_recompute_activation]: 1.3e-06 [micro_interleaved_order_control]: 1.26e-06 [full_micro_interleaved_order_control]: 1.1e-06 [comp_comm_scheduling]: 1.29e-06 [reorder_send_recv_between_fp_bp]: 1.39001e-06 [comm_op_add_attrs]: 7.99999e-07 [add_comm_op_reuse_tag]: 5.9e-07 [overlap_opt_shard_in_pipeline]: 8.60004e-07 [grouped_pairwise_exchange_alltoall]: 9.89996e-07 [overlap_recompute_and_grad_model_parallel]: 1.11e-06 [overlap_grad_matmul_and_grad_allreduce]: 5.80003e-07 [split_matmul_comm_elemetwise]: 1.51e-06 [split_layernorm_comm]: 1.16e-06 [process_send_recv_for_ge]: 5.60001e-07 [handle_group_info]: 5.69999e-07 [auto_monad_reorder]: 1.244e-05 [get_jit_bprop_graph]: 2.90005e-07 [eliminate_special_op_node]: 0.00052034 [validate]: 2.281e-05 [distribtued_split]: 8.70001e-07 [task_emit]: 0.00392548 [execute]: 6.56e-06 Sums parse : 0.001333s : 1.76% symbol_resolve.resolve : 0.012301s : 16.26% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000003s : 0.00% meta_unpack_prepare : 0.000129s : 0.17% pre_cconv : 0.000001s : 0.00% 
abstract_specialize : 0.003728s : 4.93% pack_expand : 0.000013s : 0.02% auto_monad : 0.000069s : 0.09% inline : 0.000001s : 0.00% pre_auto_parallel : 0.000009s : 0.01% pipeline_split : 0.000002s : 0.00% optimize.py_interpret_to_execute : 0.000004s : 0.01% optimize.rewriter_before_opt_a : 0.000176s : 0.23% optimize.opt_a.expand_dump_flag : 0.000011s : 0.01% optimize.opt_a.switch_simplify : 0.000153s : 0.20% optimize.opt_a.a_1 : 0.001541s : 2.04% optimize.opt_a.recompute_prepare : 0.000045s : 0.06% optimize.opt_a.updatestate_depend_eliminate : 0.000048s : 0.06% optimize.opt_a.updatestate_assign_eliminate : 0.000036s : 0.05% optimize.opt_a.updatestate_loads_eliminate : 0.000034s : 0.05% optimize.opt_a.parameter_eliminate : 0.000013s : 0.02% optimize.opt_a.a_2 : 0.000525s : 0.69% optimize.opt_a.accelerated_algorithm : 0.000050s : 0.07% optimize.opt_a.pynative_shard : 0.000005s : 0.01% optimize.opt_a.auto_parallel : 0.000015s : 0.02% optimize.opt_a.parallel : 0.000020s : 0.03% optimize.opt_a.merge_comm : 0.000012s : 0.02% optimize.opt_a.allreduce_fusion : 0.000008s : 0.01% optimize.opt_a.virtual_dataset : 0.000030s : 0.04% optimize.opt_a.get_grad_eliminate_ : 0.000026s : 0.03% optimize.opt_a.virtual_output : 0.000025s : 0.03% optimize.opt_a.merge_forward : 0.000041s : 0.05% optimize.opt_a.cell_reuse_recompute_pass : 0.000002s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000074s : 0.10% optimize.opt_a.meta_fg_expand : 0.000037s : 0.05% optimize.opt_a.meta_fg_expand.resolve : 0.001413s : 1.87% optimize.opt_a.after_resolve : 0.000077s : 0.10% optimize.opt_a.a_after_grad : 0.000118s : 0.16% optimize.opt_a.renormalize : 0.047148s : 62.30% optimize.opt_a.real_op_eliminate : 0.000073s : 0.10% optimize.opt_a.auto_monad_grad : 0.000074s : 0.10% optimize.opt_a.auto_monad_eliminator : 0.000146s : 0.19% optimize.opt_a.cse : 0.000356s : 0.47% optimize.opt_a.a_3 : 0.000524s : 0.69% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.01% 
optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.00% optimize.rewriter_after_opt_a : 0.000068s : 0.09% optimize.convert_after_rewriter : 0.000016s : 0.02% optimize.order_py_execute_after_rewriter : 0.000011s : 0.01% optimize.opt_b.b_1 : 0.000473s : 0.63% optimize.opt_b.b_2 : 0.000006s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000005s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000004s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000015s : 0.02% optimize.cconv : 0.000014s : 0.02% optimize.opt_after_cconv.c_1 : 0.000005s : 0.01% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.cse : 0.000006s : 0.01% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000007s : 0.01% optimize.tuple_transform.d_1 : 0.000014s : 0.02% optimize.tuple_transform.d_2 : 0.000006s : 0.01% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000008s : 0.01% optimize.add_recomputation : 0.000029s : 0.04% optimize.cse_after_recomputation.cse : 0.000008s : 0.01% optimize.environ_conv : 0.000005s : 0.01% optimize.label_micro_interleaved_index : 0.000002s : 0.00% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.00% optimize.assign_add_opt : 0.000001s : 0.00% optimize.slice_recompute_activation : 0.000001s : 0.00% optimize.micro_interleaved_order_control : 0.000001s : 0.00% optimize.full_micro_interleaved_order_control : 0.000001s : 0.00% optimize.comp_comm_scheduling : 0.000001s : 0.00% optimize.reorder_send_recv_between_fp_bp : 0.000001s : 0.00% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag 
: 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.00% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000001s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.00% optimize.split_layernorm_comm : 0.000001s : 0.00% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000012s : 0.02% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000520s : 0.69% validate : 0.000023s : 0.03% distribtued_split : 0.000001s : 0.00% task_emit : 0.003925s : 5.19% execute : 0.000007s : 0.01% Time group info: ------[substitution.] 0.014006 383 0.02% : 0.000003s : 5: substitution.float_depend_g_call 0.07% : 0.000010s : 14: substitution.float_tuple_getitem_switch 93.55% : 0.013103s : 25: substitution.getattr_setattr_resolve 0.02% : 0.000003s : 3: substitution.graph_param_transform 0.02% : 0.000002s : 3: substitution.incorporate_call 0.01% : 0.000002s : 3: substitution.incorporate_call_switch 4.05% : 0.000568s : 59: substitution.inline 0.04% : 0.000006s : 10: substitution.less_batch_normalization 0.20% : 0.000029s : 23: substitution.meta_unpack_prepare 0.08% : 0.000011s : 11: substitution.minmaximum_grad 0.02% : 0.000003s : 5: substitution.partial_eliminate 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.04% : 0.000006s : 47: substitution.remove_not_recompute_node 0.42% : 0.000058s : 38: substitution.replace_applicator 0.05% : 0.000007s : 20: substitution.replace_old_param 0.02% : 0.000003s : 2: substitution.reset_defer_inline 0.05% : 0.000007s : 8: substitution.set_cell_output_no_recompute 0.05% : 0.000007s : 5: substitution.specialize_transform 0.05% : 0.000007s : 4: substitution.switch_simplify 0.07% : 0.000010s : 2: substitution.transpose_eliminate 0.27% : 0.000038s : 15: 
substitution.tuple_list_convert_item_index_to_positive 0.11% : 0.000015s : 15: substitution.tuple_list_get_item_const_eliminator 0.15% : 0.000021s : 15: substitution.tuple_list_get_item_depend_reorder 0.47% : 0.000066s : 33: substitution.tuple_list_get_item_eliminator 0.14% : 0.000020s : 15: substitution.tuple_list_get_set_item_eliminator ------[renormalize.] 0.047134 6 92.50% : 0.043602s : 3: renormalize.infer 7.50% : 0.003533s : 3: renormalize.specialize ------[replace.] 0.000626 68 45.44% : 0.000284s : 23: replace.getattr_setattr_resolve 31.14% : 0.000195s : 31: replace.inline 6.94% : 0.000043s : 2: replace.meta_unpack_prepare 8.10% : 0.000051s : 4: replace.switch_simplify 1.54% : 0.000010s : 2: replace.transpose_eliminate 6.84% : 0.000043s : 6: replace.tuple_list_get_item_eliminator ------[match.] 0.013613 68 95.90% : 0.013055s : 23: match.getattr_setattr_resolve 3.71% : 0.000505s : 31: match.inline 0.12% : 0.000016s : 2: match.meta_unpack_prepare 0.05% : 0.000007s : 4: match.switch_simplify 0.08% : 0.000010s : 2: match.transpose_eliminate 0.13% : 0.000018s : 6: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.003595 69 66.80% : 0.002402s : 28: func_graph_cloner_run.FuncGraphClonerGraph 33.20% : 0.001194s : 41: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 
0.017383 255 6.15% : 0.001069s : 104: opt.transform.opt_a 2.51% : 0.000436s : 92: opt.transform.opt_b 78.41% : 0.013630s : 10: opt.transform.opt_resolve 0.65% : 0.000113s : 1: opt.transforms.meta_unpack_prepare 12.09% : 0.002101s : 40: opt.transforms.opt_a 0.02% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000004s : 2: opt.transforms.opt_b 0.10% : 0.000018s : 2: opt.transforms.opt_trans_graph 0.05% : 0.000009s : 3: opt.transforms.special_op_eliminate TotalTime = 0.0222354, [20] [parse]: 0.0012957 [symbol_resolve]: 0.0107239, [1] [Cycle 1]: 0.0106767, [1] [resolve]: 0.0106576 [combine_like_graphs]: 8.79998e-07 [graph_reusing]: 2.43e-06 [meta_unpack_prepare]: 4.644e-05 [pre_cconv]: 4.49996e-07 [abstract_specialize]: 0.00186042 [pack_expand]: 8.49e-06 [auto_monad]: 3.379e-05 [inline]: 1.56e-06 [pre_auto_parallel]: 7.44e-06 [pipeline_split]: 2.03e-06 [optimize]: 0.00381392, [35] [py_interpret_to_execute]: 4.11e-06 [rewriter_before_opt_a]: 3.514e-05 [opt_a]: 0.00334663, [2] [Cycle 1]: 0.00077607, [30] [expand_dump_flag]: 2.42001e-06 [switch_simplify]: 1.219e-05 [a_1]: 0.00017528 [recompute_prepare]: 2.82e-06 [updatestate_depend_eliminate]: 5.28e-06 [updatestate_assign_eliminate]: 3.07e-06 [updatestate_loads_eliminate]: 2.68e-06 [parameter_eliminate]: 2.38e-06 [a_2]: 3.155e-05 [accelerated_algorithm]: 2.81e-06 [pynative_shard]: 1.22e-06 [auto_parallel]: 3.17e-06 [parallel]: 6.53e-06 [merge_comm]: 3.24e-06 [allreduce_fusion]: 1.86e-06 [virtual_dataset]: 2.62e-06 [get_grad_eliminate_]: 2.35e-06 [virtual_output]: 1.87e-06 [merge_forward]: 3.47e-06 [cell_reuse_recompute_pass]: 5.10001e-07 [cell_reuse_handle_not_recompute_node_pass]: 6.67e-06 [meta_fg_expand]: 2.85e-06 [after_resolve]: 4.57e-06 [a_after_grad]: 2.9e-06 [renormalize]: 0.00029589 [real_op_eliminate]: 4.04e-06 [auto_monad_grad]: 3.2e-06 [auto_monad_eliminator]: 7.57e-06 [cse]: 1.594e-05 [a_3]: 1.688e-05 [Cycle 2]: 0.00024087, [30] [expand_dump_flag]: 9.29998e-07 [switch_simplify]: 2.39e-06 [a_1]: 
1.769e-05 [recompute_prepare]: 1.85e-06 [updatestate_depend_eliminate]: 2.89e-06 [updatestate_assign_eliminate]: 2.16e-06 [updatestate_loads_eliminate]: 2.15e-06 [parameter_eliminate]: 7.80004e-07 [a_2]: 2.94e-05 [accelerated_algorithm]: 2.64e-06 [pynative_shard]: 8.89995e-07 [auto_parallel]: 2.73e-06 [parallel]: 2.99e-06 [merge_comm]: 1.96e-06 [allreduce_fusion]: 1.5e-06 [virtual_dataset]: 2.28e-06 [get_grad_eliminate_]: 1.96999e-06 [virtual_output]: 1.84e-06 [merge_forward]: 2.59e-06 [cell_reuse_recompute_pass]: 3.00002e-07 [cell_reuse_handle_not_recompute_node_pass]: 6.04001e-06 [meta_fg_expand]: 1.86e-06 [after_resolve]: 3.73001e-06 [a_after_grad]: 2.52e-06 [renormalize]: 7.99992e-08 [real_op_eliminate]: 2.01e-06 [auto_monad_grad]: 9e-07 [auto_monad_eliminator]: 3.76999e-06 [cse]: 7.44e-06 [a_3]: 1.4e-05 [py_interpret_to_execute_after_opt_a]: 3.52e-06 [slice_cell_reuse_recomputed_activation]: 1.49e-06 [rewriter_after_opt_a]: 1.716e-05 [convert_after_rewriter]: 4.09e-06 [order_py_execute_after_rewriter]: 4.68e-06 [opt_b]: 9.566e-05, [1] [Cycle 1]: 9.084e-05, [7] [b_1]: 4.331e-05 [b_2]: 3.39e-06 [updatestate_depend_eliminate]: 2.4e-06 [updatestate_assign_eliminate]: 2.33e-06 [updatestate_loads_eliminate]: 2.13e-06 [renormalize]: 2.59999e-07 [cse]: 7.15e-06 [cconv]: 1.52e-05 [opt_after_cconv]: 5.299e-05, [1] [Cycle 1]: 4.905e-05, [7] [c_1]: 5.4e-06 [parameter_eliminate]: 6.19999e-07 [updatestate_depend_eliminate]: 2.33e-06 [updatestate_assign_eliminate]: 1.9e-06 [updatestate_loads_eliminate]: 1.78e-06 [cse]: 6.92e-06 [renormalize]: 2.20003e-07 [remove_dup_value]: 6.51e-06 [tuple_transform]: 3.514e-05, [1] [Cycle 1]: 3.15e-05, [3] [d_1]: 1.246e-05 [d_2]: 6.17e-06 [renormalize]: 1.49994e-07 [add_cache_embedding]: 7.7e-06 [add_recomputation]: 3.122e-05 [cse_after_recomputation]: 1.64e-05, [1] [Cycle 1]: 1.222e-05, [1] [cse]: 7.33e-06 [environ_conv]: 3.83e-06 [label_micro_interleaved_index]: 1.41e-06 [label_fine_grained_interleaved_index]: 1.26e-06 [assign_add_opt]: 
1.45e-06 [slice_recompute_activation]: 1.81e-06 [micro_interleaved_order_control]: 1.31e-06 [full_micro_interleaved_order_control]: 1.21e-06 [comp_comm_scheduling]: 1.18e-06 [reorder_send_recv_between_fp_bp]: 1.3e-06 [comm_op_add_attrs]: 6.19999e-07 [add_comm_op_reuse_tag]: 5.79996e-07 [overlap_opt_shard_in_pipeline]: 5.99997e-07 [grouped_pairwise_exchange_alltoall]: 7.49998e-07 [overlap_recompute_and_grad_model_parallel]: 1.26e-06 [overlap_grad_matmul_and_grad_allreduce]: 4.80002e-07 [split_matmul_comm_elemetwise]: 1.60999e-06 [split_layernorm_comm]: 1.16e-06 [process_send_recv_for_ge]: 5.9e-07 [handle_group_info]: 5.99997e-07 [auto_monad_reorder]: 1.058e-05 [get_jit_bprop_graph]: 2.89998e-07 [eliminate_special_op_node]: 0.00047653 [validate]: 1.783e-05 [distribtued_split]: 7.90002e-07 [task_emit]: 0.0037429 [execute]: 6.61e-06 Sums parse : 0.001296s : 6.75% symbol_resolve.resolve : 0.010658s : 55.55% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000002s : 0.01% meta_unpack_prepare : 0.000046s : 0.24% pre_cconv : 0.000000s : 0.00% abstract_specialize : 0.001860s : 9.70% pack_expand : 0.000008s : 0.04% auto_monad : 0.000034s : 0.18% inline : 0.000002s : 0.01% pre_auto_parallel : 0.000007s : 0.04% pipeline_split : 0.000002s : 0.01% optimize.py_interpret_to_execute : 0.000004s : 0.02% optimize.rewriter_before_opt_a : 0.000035s : 0.18% optimize.opt_a.expand_dump_flag : 0.000003s : 0.02% optimize.opt_a.switch_simplify : 0.000015s : 0.08% optimize.opt_a.a_1 : 0.000193s : 1.01% optimize.opt_a.recompute_prepare : 0.000005s : 0.02% optimize.opt_a.updatestate_depend_eliminate : 0.000008s : 0.04% optimize.opt_a.updatestate_assign_eliminate : 0.000005s : 0.03% optimize.opt_a.updatestate_loads_eliminate : 0.000005s : 0.03% optimize.opt_a.parameter_eliminate : 0.000003s : 0.02% optimize.opt_a.a_2 : 0.000061s : 0.32% optimize.opt_a.accelerated_algorithm : 0.000005s : 0.03% optimize.opt_a.pynative_shard : 0.000002s : 0.01% optimize.opt_a.auto_parallel : 0.000006s : 
0.03% optimize.opt_a.parallel : 0.000010s : 0.05% optimize.opt_a.merge_comm : 0.000005s : 0.03% optimize.opt_a.allreduce_fusion : 0.000003s : 0.02% optimize.opt_a.virtual_dataset : 0.000005s : 0.03% optimize.opt_a.get_grad_eliminate_ : 0.000004s : 0.02% optimize.opt_a.virtual_output : 0.000004s : 0.02% optimize.opt_a.merge_forward : 0.000006s : 0.03% optimize.opt_a.cell_reuse_recompute_pass : 0.000001s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000013s : 0.07% optimize.opt_a.meta_fg_expand : 0.000005s : 0.02% optimize.opt_a.after_resolve : 0.000008s : 0.04% optimize.opt_a.a_after_grad : 0.000005s : 0.03% optimize.opt_a.renormalize : 0.000296s : 1.54% optimize.opt_a.real_op_eliminate : 0.000006s : 0.03% optimize.opt_a.auto_monad_grad : 0.000004s : 0.02% optimize.opt_a.auto_monad_eliminator : 0.000011s : 0.06% optimize.opt_a.cse : 0.000023s : 0.12% optimize.opt_a.a_3 : 0.000031s : 0.16% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.02% optimize.slice_cell_reuse_recomputed_activation : 0.000001s : 0.01% optimize.rewriter_after_opt_a : 0.000017s : 0.09% optimize.convert_after_rewriter : 0.000004s : 0.02% optimize.order_py_execute_after_rewriter : 0.000005s : 0.02% optimize.opt_b.b_1 : 0.000043s : 0.23% optimize.opt_b.b_2 : 0.000003s : 0.02% optimize.opt_b.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000007s : 0.04% optimize.cconv : 0.000015s : 0.08% optimize.opt_after_cconv.c_1 : 0.000005s : 0.03% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.01% optimize.opt_after_cconv.cse : 0.000007s : 0.04% 
optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000007s : 0.03% optimize.tuple_transform.d_1 : 0.000012s : 0.06% optimize.tuple_transform.d_2 : 0.000006s : 0.03% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000008s : 0.04% optimize.add_recomputation : 0.000031s : 0.16% optimize.cse_after_recomputation.cse : 0.000007s : 0.04% optimize.environ_conv : 0.000004s : 0.02% optimize.label_micro_interleaved_index : 0.000001s : 0.01% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.01% optimize.assign_add_opt : 0.000001s : 0.01% optimize.slice_recompute_activation : 0.000002s : 0.01% optimize.micro_interleaved_order_control : 0.000001s : 0.01% optimize.full_micro_interleaved_order_control : 0.000001s : 0.01% optimize.comp_comm_scheduling : 0.000001s : 0.01% optimize.reorder_send_recv_between_fp_bp : 0.000001s : 0.01% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.01% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000000s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.01% optimize.split_layernorm_comm : 0.000001s : 0.01% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000011s : 0.06% get_jit_bprop_graph : 0.000000s : 0.00% eliminate_special_op_node : 0.000477s : 2.48% validate : 0.000018s : 0.09% distribtued_split : 0.000001s : 0.00% task_emit : 0.003743s : 19.51% execute : 0.000007s : 0.03% Time group info: ------[substitution.] 
0.010567 37 99.07% : 0.010469s : 8: substitution.getattr_setattr_resolve 0.03% : 0.000003s : 3: substitution.graph_param_transform 0.73% : 0.000077s : 3: substitution.inline 0.08% : 0.000008s : 13: substitution.meta_unpack_prepare 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.01% : 0.000001s : 4: substitution.remove_not_recompute_node 0.02% : 0.000002s : 2: substitution.replace_old_param 0.06% : 0.000006s : 1: substitution.tuple_list_get_item_eliminator ------[renormalize.] 0.000291 2 58.03% : 0.000169s : 1: renormalize.infer 41.97% : 0.000122s : 1: renormalize.specialize ------[replace.] 0.000126 10 76.02% : 0.000096s : 6: replace.getattr_setattr_resolve 18.32% : 0.000023s : 3: replace.inline 5.66% : 0.000007s : 1: replace.tuple_list_get_item_eliminator ------[match.] 0.010492 10 99.21% : 0.010409s : 6: match.getattr_setattr_resolve 0.74% : 0.000077s : 3: match.inline 0.06% : 0.000006s : 1: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.000408 10 68.65% : 0.000280s : 5: func_graph_cloner_run.FuncGraphClonerGraph 31.35% : 0.000128s : 5: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.011061 105 0.73% : 0.000080s : 52: opt.transform.opt_a 0.30% : 0.000033s : 23: opt.transform.opt_b 96.31% : 0.010653s : 2: opt.transform.opt_resolve 0.26% : 0.000029s : 1: opt.transforms.meta_unpack_prepare 2.13% : 0.000235s : 20: opt.transforms.opt_a 0.03% : 0.000004s : 1: opt.transforms.opt_after_cconv 0.02% : 0.000002s : 1: opt.transforms.opt_b 0.15% : 0.000017s : 2: opt.transforms.opt_trans_graph 0.07% : 0.000008s : 3: opt.transforms.special_op_eliminate . 
TotalTime = 0.0843952, [20] [parse]: 0.00137619 [symbol_resolve]: 0.0123066, [1] [Cycle 1]: 0.0122448, [1] [resolve]: 0.0122271 [combine_like_graphs]: 1.04e-06 [graph_reusing]: 2.69e-06 [meta_unpack_prepare]: 0.00016331 [pre_cconv]: 4.99997e-07 [abstract_specialize]: 0.0037085 [pack_expand]: 1.269e-05 [auto_monad]: 6.906e-05 [inline]: 1.38e-06 [pre_auto_parallel]: 7.27e-06 [pipeline_split]: 2.24e-06 [optimize]: 0.0635752, [35] [py_interpret_to_execute]: 4.05e-06 [rewriter_before_opt_a]: 0.0001751 [opt_a]: 0.0623736, [4] [Cycle 1]: 0.0299607, [30] [expand_dump_flag]: 3.38e-06 [switch_simplify]: 2.405e-05 [a_1]: 0.00072211 [recompute_prepare]: 8.63e-06 [updatestate_depend_eliminate]: 9.76e-06 [updatestate_assign_eliminate]: 6.67e-06 [updatestate_loads_eliminate]: 6.16e-06 [parameter_eliminate]: 3.73e-06 [a_2]: 0.0001149 [accelerated_algorithm]: 5.77e-06 [pynative_shard]: 1.25e-06 [auto_parallel]: 3.11e-06 [parallel]: 6.37001e-06 [merge_comm]: 3.11e-06 [allreduce_fusion]: 2.07e-06 [virtual_dataset]: 5.65001e-06 [get_grad_eliminate_]: 4.89e-06 [virtual_output]: 4.55e-06 [merge_forward]: 8.3e-06 [cell_reuse_recompute_pass]: 7.2e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.369e-05 [meta_fg_expand]: 0.00170551, [1] [Cycle 1]: 0.00041132, [1] [resolve]: 0.00039418 [after_resolve]: 2.096e-05 [a_after_grad]: 3.94e-05 [renormalize]: 0.026671 [real_op_eliminate]: 2.735e-05 [auto_monad_grad]: 3.192e-05 [auto_monad_eliminator]: 4.785e-05 [cse]: 0.00010607 [a_3]: 0.00017015 [Cycle 2]: 0.0258074, [30] [expand_dump_flag]: 2.98e-06 [switch_simplify]: 7.367e-05 [a_1]: 0.00091363 [recompute_prepare]: 9.82e-06 [updatestate_depend_eliminate]: 1.175e-05 [updatestate_assign_eliminate]: 8.6e-06 [updatestate_loads_eliminate]: 8.1e-06 [parameter_eliminate]: 3.46e-06 [a_2]: 0.00011904 [accelerated_algorithm]: 1.175e-05 [pynative_shard]: 1.18e-06 [auto_parallel]: 4.97e-06 [parallel]: 5.4e-06 [merge_comm]: 3.09e-06 [allreduce_fusion]: 1.75e-06 [virtual_dataset]: 7.48e-06 
[get_grad_eliminate_]: 6.45e-06 [virtual_output]: 5.94e-06 [merge_forward]: 1.013e-05 [cell_reuse_recompute_pass]: 4.49996e-07 [cell_reuse_handle_not_recompute_node_pass]: 1.676e-05 [meta_fg_expand]: 0.00470489, [3] [Cycle 1]: 0.00035091, [1] [resolve]: 0.00033391 [Cycle 1]: 0.00041538, [1] [resolve]: 0.0003984 [Cycle 1]: 0.00031087, [1] [resolve]: 0.00029359 [after_resolve]: 3.377e-05 [a_after_grad]: 6.588e-05 [renormalize]: 0.0191601 [real_op_eliminate]: 3.009e-05 [auto_monad_grad]: 3.476e-05 [auto_monad_eliminator]: 5.482e-05 [cse]: 0.00011878 [a_3]: 0.00020683 [Cycle 3]: 0.00325359, [30] [expand_dump_flag]: 2.82e-06 [switch_simplify]: 7.961e-05 [a_1]: 0.00120967 [recompute_prepare]: 1.172e-05 [updatestate_depend_eliminate]: 1.309e-05 [updatestate_assign_eliminate]: 1.045e-05 [updatestate_loads_eliminate]: 1.009e-05 [parameter_eliminate]: 3.35e-06 [a_2]: 0.00015614 [accelerated_algorithm]: 1.588e-05 [pynative_shard]: 1.08001e-06 [auto_parallel]: 4.12e-06 [parallel]: 4.29001e-06 [merge_comm]: 3.2e-06 [allreduce_fusion]: 2.13e-06 [virtual_dataset]: 8.81e-06 [get_grad_eliminate_]: 8.51e-06 [virtual_output]: 7.9e-06 [merge_forward]: 1.121e-05 [cell_reuse_recompute_pass]: 5.60001e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.168e-05 [meta_fg_expand]: 2.885e-05 [after_resolve]: 1.278e-05 [a_after_grad]: 2.134e-05 [renormalize]: 0.00125607 [real_op_eliminate]: 1.283e-05 [auto_monad_grad]: 4.60001e-06 [auto_monad_eliminator]: 2.219e-05 [cse]: 8.147e-05 [a_3]: 7.084e-05 [Cycle 4]: 0.00101594, [30] [expand_dump_flag]: 1.18e-06 [switch_simplify]: 8.63e-06 [a_1]: 0.00040866 [recompute_prepare]: 1.083e-05 [updatestate_depend_eliminate]: 1.32e-05 [updatestate_assign_eliminate]: 1.036e-05 [updatestate_loads_eliminate]: 9.36e-06 [parameter_eliminate]: 1.65e-06 [a_2]: 0.00015611 [accelerated_algorithm]: 1.546e-05 [pynative_shard]: 1.27e-06 [auto_parallel]: 3.13e-06 [parallel]: 3.63e-06 [merge_comm]: 2.5e-06 [allreduce_fusion]: 2.11e-06 [virtual_dataset]: 8.99e-06 
[get_grad_eliminate_]: 8.24e-06 [virtual_output]: 8.01e-06 [merge_forward]: 1.218e-05 [cell_reuse_recompute_pass]: 3.30001e-07 [cell_reuse_handle_not_recompute_node_pass]: 2.137e-05 [meta_fg_expand]: 8.35e-06 [after_resolve]: 1.197e-05 [a_after_grad]: 2.103e-05 [renormalize]: 7.0002e-08 [real_op_eliminate]: 7.97e-06 [auto_monad_grad]: 1.76e-06 [auto_monad_eliminator]: 1.984e-05 [cse]: 4.498e-05 [a_3]: 6.525e-05 [py_interpret_to_execute_after_opt_a]: 3.59e-06 [slice_cell_reuse_recomputed_activation]: 1.59e-06 [rewriter_after_opt_a]: 6.934e-05 [convert_after_rewriter]: 1.604e-05 [order_py_execute_after_rewriter]: 1.106e-05 [opt_b]: 0.00057458, [2] [Cycle 1]: 0.00048446, [7] [b_1]: 0.00043005 [b_2]: 3.08e-06 [updatestate_depend_eliminate]: 3.19e-06 [updatestate_assign_eliminate]: 2.4e-06 [updatestate_loads_eliminate]: 1.99e-06 [renormalize]: 2.99995e-07 [cse]: 8.76e-06 [Cycle 2]: 8.101e-05, [7] [b_1]: 3.931e-05 [b_2]: 2.41e-06 [updatestate_depend_eliminate]: 2.08e-06 [updatestate_assign_eliminate]: 1.82e-06 [updatestate_loads_eliminate]: 1.73e-06 [renormalize]: 7.99992e-08 [cse]: 6.08e-06 [cconv]: 1.352e-05 [opt_after_cconv]: 6.049e-05, [1] [Cycle 1]: 5.625e-05, [7] [c_1]: 1.385e-05 [parameter_eliminate]: 1.45e-06 [updatestate_depend_eliminate]: 2.35e-06 [updatestate_assign_eliminate]: 1.91e-06 [updatestate_loads_eliminate]: 1.83001e-06 [cse]: 5.99e-06 [renormalize]: 2.99995e-07 [remove_dup_value]: 6.5e-06 [tuple_transform]: 4.484e-05, [1] [Cycle 1]: 4.122e-05, [3] [d_1]: 2.212e-05 [d_2]: 6.15999e-06 [renormalize]: 1.89997e-07 [add_cache_embedding]: 8.58e-06 [add_recomputation]: 2.971e-05 [cse_after_recomputation]: 1.706e-05, [1] [Cycle 1]: 1.246e-05, [1] [cse]: 7.35e-06 [environ_conv]: 4.51e-06 [label_micro_interleaved_index]: 2.1e-06 [label_fine_grained_interleaved_index]: 1.39e-06 [assign_add_opt]: 1.58e-06 [slice_recompute_activation]: 1.54e-06 [micro_interleaved_order_control]: 1.16e-06 [full_micro_interleaved_order_control]: 1.52e-06 [comp_comm_scheduling]: 
1.36e-06 [reorder_send_recv_between_fp_bp]: 1.59e-06 [comm_op_add_attrs]: 7.99999e-07 [add_comm_op_reuse_tag]: 6.19999e-07 [overlap_opt_shard_in_pipeline]: 6.40001e-07 [grouped_pairwise_exchange_alltoall]: 7.09995e-07 [overlap_recompute_and_grad_model_parallel]: 1.09e-06 [overlap_grad_matmul_and_grad_allreduce]: 4.89999e-07 [split_matmul_comm_elemetwise]: 1.76e-06 [split_layernorm_comm]: 1.16e-06 [process_send_recv_for_ge]: 6.00005e-07 [handle_group_info]: 5.9e-07 [auto_monad_reorder]: 1.192e-05 [get_jit_bprop_graph]: 2.73e-06 [eliminate_special_op_node]: 0.00057595 [validate]: 2.211e-05 [distribtued_split]: 9.20001e-07 [task_emit]: 0.00237127 [execute]: 5e-06 Sums parse : 0.001376s : 1.81% symbol_resolve.resolve : 0.012227s : 16.11% combine_like_graphs : 0.000001s : 0.00% graph_reusing : 0.000003s : 0.00% meta_unpack_prepare : 0.000163s : 0.22% pre_cconv : 0.000000s : 0.00% abstract_specialize : 0.003709s : 4.89% pack_expand : 0.000013s : 0.02% auto_monad : 0.000069s : 0.09% inline : 0.000001s : 0.00% pre_auto_parallel : 0.000007s : 0.01% pipeline_split : 0.000002s : 0.00% optimize.py_interpret_to_execute : 0.000004s : 0.01% optimize.rewriter_before_opt_a : 0.000175s : 0.23% optimize.opt_a.expand_dump_flag : 0.000010s : 0.01% optimize.opt_a.switch_simplify : 0.000186s : 0.24% optimize.opt_a.a_1 : 0.003254s : 4.29% optimize.opt_a.recompute_prepare : 0.000041s : 0.05% optimize.opt_a.updatestate_depend_eliminate : 0.000048s : 0.06% optimize.opt_a.updatestate_assign_eliminate : 0.000036s : 0.05% optimize.opt_a.updatestate_loads_eliminate : 0.000034s : 0.04% optimize.opt_a.parameter_eliminate : 0.000012s : 0.02% optimize.opt_a.a_2 : 0.000546s : 0.72% optimize.opt_a.accelerated_algorithm : 0.000049s : 0.06% optimize.opt_a.pynative_shard : 0.000005s : 0.01% optimize.opt_a.auto_parallel : 0.000015s : 0.02% optimize.opt_a.parallel : 0.000020s : 0.03% optimize.opt_a.merge_comm : 0.000012s : 0.02% optimize.opt_a.allreduce_fusion : 0.000008s : 0.01% 
optimize.opt_a.virtual_dataset : 0.000031s : 0.04% optimize.opt_a.get_grad_eliminate_ : 0.000028s : 0.04% optimize.opt_a.virtual_output : 0.000026s : 0.03% optimize.opt_a.merge_forward : 0.000042s : 0.06% optimize.opt_a.cell_reuse_recompute_pass : 0.000002s : 0.00% optimize.opt_a.cell_reuse_handle_not_recompute_node_pass : 0.000073s : 0.10% optimize.opt_a.meta_fg_expand : 0.000037s : 0.05% optimize.opt_a.meta_fg_expand.resolve : 0.001420s : 1.87% optimize.opt_a.after_resolve : 0.000079s : 0.10% optimize.opt_a.a_after_grad : 0.000148s : 0.19% optimize.opt_a.renormalize : 0.047087s : 62.04% optimize.opt_a.real_op_eliminate : 0.000078s : 0.10% optimize.opt_a.auto_monad_grad : 0.000073s : 0.10% optimize.opt_a.auto_monad_eliminator : 0.000145s : 0.19% optimize.opt_a.cse : 0.000351s : 0.46% optimize.opt_a.a_3 : 0.000513s : 0.68% optimize.py_interpret_to_execute_after_opt_a : 0.000004s : 0.00% optimize.slice_cell_reuse_recomputed_activation : 0.000002s : 0.00% optimize.rewriter_after_opt_a : 0.000069s : 0.09% optimize.convert_after_rewriter : 0.000016s : 0.02% optimize.order_py_execute_after_rewriter : 0.000011s : 0.01% optimize.opt_b.b_1 : 0.000469s : 0.62% optimize.opt_b.b_2 : 0.000005s : 0.01% optimize.opt_b.updatestate_depend_eliminate : 0.000005s : 0.01% optimize.opt_b.updatestate_assign_eliminate : 0.000004s : 0.01% optimize.opt_b.updatestate_loads_eliminate : 0.000004s : 0.00% optimize.opt_b.renormalize : 0.000000s : 0.00% optimize.opt_b.cse : 0.000015s : 0.02% optimize.cconv : 0.000014s : 0.02% optimize.opt_after_cconv.c_1 : 0.000014s : 0.02% optimize.opt_after_cconv.parameter_eliminate : 0.000001s : 0.00% optimize.opt_after_cconv.updatestate_depend_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_assign_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.updatestate_loads_eliminate : 0.000002s : 0.00% optimize.opt_after_cconv.cse : 0.000006s : 0.01% optimize.opt_after_cconv.renormalize : 0.000000s : 0.00% optimize.remove_dup_value : 0.000006s 
: 0.01% optimize.tuple_transform.d_1 : 0.000022s : 0.03% optimize.tuple_transform.d_2 : 0.000006s : 0.01% optimize.tuple_transform.renormalize : 0.000000s : 0.00% optimize.add_cache_embedding : 0.000009s : 0.01% optimize.add_recomputation : 0.000030s : 0.04% optimize.cse_after_recomputation.cse : 0.000007s : 0.01% optimize.environ_conv : 0.000005s : 0.01% optimize.label_micro_interleaved_index : 0.000002s : 0.00% optimize.label_fine_grained_interleaved_index : 0.000001s : 0.00% optimize.assign_add_opt : 0.000002s : 0.00% optimize.slice_recompute_activation : 0.000002s : 0.00% optimize.micro_interleaved_order_control : 0.000001s : 0.00% optimize.full_micro_interleaved_order_control : 0.000002s : 0.00% optimize.comp_comm_scheduling : 0.000001s : 0.00% optimize.reorder_send_recv_between_fp_bp : 0.000002s : 0.00% optimize.comm_op_add_attrs : 0.000001s : 0.00% optimize.add_comm_op_reuse_tag : 0.000001s : 0.00% optimize.overlap_opt_shard_in_pipeline : 0.000001s : 0.00% optimize.grouped_pairwise_exchange_alltoall : 0.000001s : 0.00% optimize.overlap_recompute_and_grad_model_parallel : 0.000001s : 0.00% optimize.overlap_grad_matmul_and_grad_allreduce : 0.000000s : 0.00% optimize.split_matmul_comm_elemetwise : 0.000002s : 0.00% optimize.split_layernorm_comm : 0.000001s : 0.00% optimize.process_send_recv_for_ge : 0.000001s : 0.00% optimize.handle_group_info : 0.000001s : 0.00% auto_monad_reorder : 0.000012s : 0.02% get_jit_bprop_graph : 0.000003s : 0.00% eliminate_special_op_node : 0.000576s : 0.76% validate : 0.000022s : 0.03% distribtued_split : 0.000001s : 0.00% task_emit : 0.002371s : 3.12% execute : 0.000005s : 0.01% Time group info: ------[substitution.] 
0.013964 446 0.02% : 0.000003s : 6: substitution.float_depend_g_call 0.07% : 0.000009s : 14: substitution.float_tuple_getitem_switch 93.36% : 0.013037s : 25: substitution.getattr_setattr_resolve 0.02% : 0.000003s : 3: substitution.graph_param_transform 0.02% : 0.000002s : 3: substitution.incorporate_call 0.02% : 0.000002s : 3: substitution.incorporate_call_switch 4.00% : 0.000558s : 65: substitution.inline 0.04% : 0.000006s : 10: substitution.less_batch_normalization 0.26% : 0.000036s : 42: substitution.meta_unpack_prepare 0.10% : 0.000014s : 16: substitution.minmaximum_grad 0.02% : 0.000003s : 6: substitution.partial_eliminate 0.01% : 0.000001s : 3: substitution.partial_unused_args_eliminate 0.04% : 0.000006s : 47: substitution.remove_not_recompute_node 0.44% : 0.000061s : 44: substitution.replace_applicator 0.05% : 0.000007s : 20: substitution.replace_old_param 0.02% : 0.000003s : 2: substitution.reset_defer_inline 0.04% : 0.000006s : 8: substitution.set_cell_output_no_recompute 0.05% : 0.000007s : 5: substitution.specialize_transform 0.05% : 0.000007s : 4: substitution.switch_simplify 0.06% : 0.000008s : 2: substitution.transpose_eliminate 0.30% : 0.000043s : 20: substitution.tuple_list_convert_item_index_to_positive 0.14% : 0.000019s : 20: substitution.tuple_list_get_item_const_eliminator 0.18% : 0.000025s : 20: substitution.tuple_list_get_item_depend_reorder 0.51% : 0.000071s : 38: substitution.tuple_list_get_item_eliminator 0.18% : 0.000026s : 20: substitution.tuple_list_get_set_item_eliminator ------[renormalize.] 0.047073 6 92.56% : 0.043573s : 3: renormalize.infer 7.44% : 0.003500s : 3: renormalize.specialize ------[replace.] 0.000619 68 45.64% : 0.000283s : 23: replace.getattr_setattr_resolve 30.45% : 0.000189s : 31: replace.inline 7.23% : 0.000045s : 2: replace.meta_unpack_prepare 7.99% : 0.000050s : 4: replace.switch_simplify 1.98% : 0.000012s : 2: replace.transpose_eliminate 6.71% : 0.000042s : 6: replace.tuple_list_get_item_eliminator ------[match.] 
0.013527 68 96.01% : 0.012988s : 23: match.getattr_setattr_resolve 3.64% : 0.000492s : 31: match.inline 0.12% : 0.000016s : 2: match.meta_unpack_prepare 0.05% : 0.000007s : 4: match.switch_simplify 0.06% : 0.000008s : 2: match.transpose_eliminate 0.12% : 0.000017s : 6: match.tuple_list_get_item_eliminator ------[func_graph_cloner_run.] 0.003587 69 66.76% : 0.002395s : 28: func_graph_cloner_run.FuncGraphClonerGraph 33.24% : 0.001192s : 41: func_graph_cloner_run.FuncGraphSpecializer ------[meta_graph.] 0.000000 0 ------[manager.] 0.000000 0 ------[pynative] 0.000000 0 ------[others.] 0.019025 585 0.76% : 0.000146s : 2: opt.transform.meta_unpack_prepare 25.43% : 0.004838s : 461: opt.transform.opt_a 0.05% : 0.000010s : 7: opt.transform.opt_after_cconv 2.29% : 0.000435s : 94: opt.transform.opt_b 71.30% : 0.013565s : 10: opt.transform.opt_resolve 0.12% : 0.000024s : 8: opt.transform.opt_trans_graph 0.04% : 0.000009s : 3: opt.transform.special_op_eliminate . ============================== 2 passed in 21.08s ============================== [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.327.727 [status:INIT] [ge_api.cc:463]38167 ~Session:Start to destruct session. [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.327.777 [status:RUNNING] [ge_api.cc:475]38167 ~Session:Session id is 0 [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.327.788 [status:RUNNING] [ge_api.cc:476]38167 ~Session:Destroying session [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.328.709 [status:STOP] [ge_api.cc:491]38167 ~Session:Session Destructor finished [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.328.734 [status:INIT] [ge_api.cc:301]38167 GEFinalize:GEFinalize start [INFO] GE(38167,python3.7):2024-01-11-05:45:10.328.785 [execution_runtime.cc:80][EVENT]38167 FinalizeExecutionRuntime:Execution runtime finalize begin. [INFO] GE(38167,python3.7):2024-01-11-05:45:10.328.801 [execution_runtime.cc:92][EVENT]38167 FinalizeExecutionRuntime:Execution runtime finalized. 
[TRACE] GE(38167,python3.7):2024-01-11-05:45:10.328.823 [status:RUNNING] [ge_api.cc:313]38167 GEFinalize:Finalizing environment [INFO] TUNE(38167,python3.7):2024-01-11-05:45:10.614.496 [cann_kb_pyfunc_mgr.cpp:127][CANNKB][Tid:38167]"CannKbPyfuncMgr: enter PyObjectDeinit function, reference_[1]" [INFO] TUNE(38167,python3.7):2024-01-11-05:45:10.614.535 [cann_kb_pyfunc_mgr.cpp:138][CANNKB][Tid:38167]"CannKbPyfuncMgr: PyObjectDeinit function end successfully!" [INFO] GE(38167,python3.7):2024-01-11-05:45:10.615.695 [gelib.cc:324][EVENT]38167 SystemFinalize:Online infer finalize GELib success. [TRACE] GE(38167,python3.7):2024-01-11-05:45:10.805.932 [status:STOP] [ge_api.cc:341]38167 GEFinalize:GEFinalize finished [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.138.246 [process_mode_manager.cpp:184][Close][tid:38167] [TsdClient] Close [deviceId=3][sessionId=1] hccp and computer enter [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.138.275 [version_verify.cpp:112][SpecialFeatureCheck][tid:38167] VersionVerify: previous type[7], supported [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.138.305 [process_mode_manager.cpp:192][Close][tid:38167] [TsdClient][deviceId=3] [sessionId=1] wait hccp and computer process close respond [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.159.818 [process_mode_manager.cpp:197][Close][tid:38167] [TsdClient][logicDeviceId_=3]has recv close hccp and computer process respond [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.159.832 [stub_process_mode_nowin.cpp:151][CloseInHost][tid:38167] enter into CloseInHost deviceid[3] [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.159.842 [stub_process_mode_nowin.cpp:154][CloseInHost][tid:38167] host cpu not support [INFO] TDT(38167,python3.7):2024-01-11-05:45:11.159.876 [process_mode_manager.cpp:208][Close][tid:38167] [TsdClient][deviceId=3] [sessionId=1] close hccp and computer process success [INFO] ATRACE(38167,python3.7):2024-01-11-05:45:11.159.888 [atrace_api.c:93](tid:38167) AtraceDestroy start [INFO] 
ATRACE(38167,python3.7):2024-01-11-05:45:11.159.903 [atrace_api.c:95](tid:38167) AtraceDestroy end [INFO] PROFILING(38167,python3.7):2024-01-11-05:45:11.159.923 [msprofiler_impl.cpp:156] >>> (tid:38167) ProfNotifySetDevice called, is open: 0, devId: 3 [INFO] RUNTIME(38167,python3.7):2024-01-11-05:45:12.683.339 [runtime.cc:1737] 38167 ~Runtime: deconstruct runtime.