
YOLO Pose model errors in edge.to_executorch #7214

Open
agunapal opened this issue Dec 5, 2024 · 2 comments
Labels
actionable — Items in the backlog waiting for an appropriate impl/fix
bug — Something isn't working
module: exir — Issues related to Export IR
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments


agunapal commented Dec 5, 2024

🐛 Describe the bug

I get an error when trying to generate the .pte file for the YOLO Pose model. The same issue occurs with the YOLO object detection model.

Setup

pip install ultralytics

Python code to run

import torch

from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import ExecutorchBackendConfig, to_edge_transform_and_lower
from executorch.extension.export_util.utils import save_pte_program
from torch.export import export
from ultralytics import YOLO


pose_model = YOLO("yolo11n-pose.pt")  # Load model
pose_model.model.eval()

inputs = torch.rand((1, 3, 640, 640))

ep: torch.export.ExportedProgram = export(
    pose_model.model, args=(inputs,), strict=False
)

with torch.no_grad():
    edge = to_edge_transform_and_lower(
        ep,
        partitioner=[XnnpackPartitioner()],
    )


exec_prog = edge.to_executorch(
    config=ExecutorchBackendConfig(extract_delegate_segments=False)
)

model_name = "yolo_pose_xnnpack_fp32"
save_pte_program(exec_prog, model_name, "./")
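Before calling to_executorch, it can help to confirm that the placeholders the emitter later chokes on are the unused BatchNorm bookkeeping buffers (the *_num_batches_tracked entries that dominate the graph dump in the error below). A minimal, framework-free sketch of that name filter — the real check would iterate ep.graph.nodes and inspect each node's users:

```python
def bn_counter_placeholders(placeholder_names):
    """Filter placeholder names that are BatchNorm `num_batches_tracked` buffers."""
    return [name for name in placeholder_names
            if name.endswith("num_batches_tracked")]

# Stand-in for the placeholder targets seen in the exported graph:
names = [
    "b_model_0_bn_num_batches_tracked",
    "c_model_23_lifted_tensor_0",
    "x",
]
print(bn_counter_placeholders(names))  # -> ['b_model_0_bn_num_batches_tracked']
```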

Error

/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py:378: UserWarning: Accessing the data pointer of FakeTensor is deprecated and will error in PyTorch 2.5. This is almost definitely a bug in your code and will cause undefined behavior with subsystems like torch.compile. Please wrap calls to tensor.data_ptr() in an opaque custom op; If all else fails, you can guard accesses to tensor.data_ptr() on isinstance(tensor, FakeTensor). (Triggered internally at ../c10/core/StorageImpl.cpp:34.)
  typing.cast(torch.UntypedStorage, spec.storage).data_ptr(),
Traceback (most recent call last):
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py", line 1445, in run_node
    ret = super().run_node(n)
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/interpreter.py", line 203, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py", line 1583, in placeholder
    self._tensor_spec_to_evalue(spec)
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py", line 380, in _tensor_spec_to_evalue
    ).contents
ValueError: NULL pointer access

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/agunapal/export_games/pose/pose_estimation.py", line 34, in <module>
    exec_prog = edge.to_executorch(
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/program/_program.py", line 1357, in to_executorch
    return ExecutorchProgramManager(
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/program/_program.py", line 1404, in __init__
    self._emitter_output: EmitterOutput = emit_program(
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emit_program.py", line 167, in emit_program
    emitter.run()
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py", line 1435, in run
    super().run(*args, initial_env, enable_io_processing=False)
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/interpreter.py", line 146, in run
    self.env[node] = self.run_node(node)
  File "/home/agunapal/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py", line 1450, in run_node
    raise InternalError(
executorch.exir.error.InternalError: Failed with error: NULL pointer access
Here is the node in the graph module:
graph():
    %b_model_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_0_bn_num_batches_tracked]
    %b_model_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_1_bn_num_batches_tracked]
    %b_model_2_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_2_cv1_bn_num_batches_tracked]
    %b_model_2_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_2_cv2_bn_num_batches_tracked]
    %b_model_2_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_2_m_0_cv1_bn_num_batches_tracked]
    %b_model_2_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_2_m_0_cv2_bn_num_batches_tracked]
    %b_model_3_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_3_bn_num_batches_tracked]
    %b_model_4_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_4_cv1_bn_num_batches_tracked]
    %b_model_4_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_4_cv2_bn_num_batches_tracked]
    %b_model_4_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_4_m_0_cv1_bn_num_batches_tracked]
    %b_model_4_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_4_m_0_cv2_bn_num_batches_tracked]
    %b_model_5_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_5_bn_num_batches_tracked]
    %b_model_6_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_cv1_bn_num_batches_tracked]
    %b_model_6_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_cv2_bn_num_batches_tracked]
    %b_model_6_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_cv1_bn_num_batches_tracked]
    %b_model_6_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_cv2_bn_num_batches_tracked]
    %b_model_6_m_0_cv3_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_cv3_bn_num_batches_tracked]
    %b_model_6_m_0_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_m_0_cv1_bn_num_batches_tracked]
    %b_model_6_m_0_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_m_0_cv2_bn_num_batches_tracked]
    %b_model_6_m_0_m_1_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_m_1_cv1_bn_num_batches_tracked]
    %b_model_6_m_0_m_1_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_6_m_0_m_1_cv2_bn_num_batches_tracked]
    %b_model_7_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_7_bn_num_batches_tracked]
    %b_model_8_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_cv1_bn_num_batches_tracked]
    %b_model_8_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_cv2_bn_num_batches_tracked]
    %b_model_8_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_cv1_bn_num_batches_tracked]
    %b_model_8_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_cv2_bn_num_batches_tracked]
    %b_model_8_m_0_cv3_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_cv3_bn_num_batches_tracked]
    %b_model_8_m_0_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_m_0_cv1_bn_num_batches_tracked]
    %b_model_8_m_0_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_m_0_cv2_bn_num_batches_tracked]
    %b_model_8_m_0_m_1_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_m_1_cv1_bn_num_batches_tracked]
    %b_model_8_m_0_m_1_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_8_m_0_m_1_cv2_bn_num_batches_tracked]
    %b_model_9_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_9_cv1_bn_num_batches_tracked]
    %b_model_9_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_9_cv2_bn_num_batches_tracked]
    %b_model_10_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_cv1_bn_num_batches_tracked]
    %b_model_10_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_cv2_bn_num_batches_tracked]
    %b_model_10_m_0_attn_qkv_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_m_0_attn_qkv_bn_num_batches_tracked]
    %b_model_10_m_0_attn_proj_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_m_0_attn_proj_bn_num_batches_tracked]
    %b_model_10_m_0_attn_pe_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_m_0_attn_pe_bn_num_batches_tracked]
    %b_model_10_m_0_ffn_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_m_0_ffn_0_bn_num_batches_tracked]
    %b_model_10_m_0_ffn_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_10_m_0_ffn_1_bn_num_batches_tracked]
    %b_model_13_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_13_cv1_bn_num_batches_tracked]
    %b_model_13_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_13_cv2_bn_num_batches_tracked]
    %b_model_13_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_13_m_0_cv1_bn_num_batches_tracked]
    %b_model_13_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_13_m_0_cv2_bn_num_batches_tracked]
    %b_model_16_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_16_cv1_bn_num_batches_tracked]
    %b_model_16_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_16_cv2_bn_num_batches_tracked]
    %b_model_16_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_16_m_0_cv1_bn_num_batches_tracked]
    %b_model_16_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_16_m_0_cv2_bn_num_batches_tracked]
    %b_model_17_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_17_bn_num_batches_tracked]
    %b_model_19_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_19_cv1_bn_num_batches_tracked]
    %b_model_19_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_19_cv2_bn_num_batches_tracked]
    %b_model_19_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_19_m_0_cv1_bn_num_batches_tracked]
    %b_model_19_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_19_m_0_cv2_bn_num_batches_tracked]
    %b_model_20_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_20_bn_num_batches_tracked]
    %b_model_22_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_cv1_bn_num_batches_tracked]
    %b_model_22_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_cv2_bn_num_batches_tracked]
    %b_model_22_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_cv1_bn_num_batches_tracked]
    %b_model_22_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_cv2_bn_num_batches_tracked]
    %b_model_22_m_0_cv3_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_cv3_bn_num_batches_tracked]
    %b_model_22_m_0_m_0_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_m_0_cv1_bn_num_batches_tracked]
    %b_model_22_m_0_m_0_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_m_0_cv2_bn_num_batches_tracked]
    %b_model_22_m_0_m_1_cv1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_m_1_cv1_bn_num_batches_tracked]
    %b_model_22_m_0_m_1_cv2_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_22_m_0_m_1_cv2_bn_num_batches_tracked]
    %b_model_23_cv2_0_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_0_0_bn_num_batches_tracked]
    %b_model_23_cv2_0_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_0_1_bn_num_batches_tracked]
    %b_model_23_cv2_1_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_1_0_bn_num_batches_tracked]
    %b_model_23_cv2_1_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_1_1_bn_num_batches_tracked]
    %b_model_23_cv2_2_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_2_0_bn_num_batches_tracked]
    %b_model_23_cv2_2_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv2_2_1_bn_num_batches_tracked]
    %b_model_23_cv3_0_0_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_0_0_0_bn_num_batches_tracked]
    %b_model_23_cv3_0_0_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_0_0_1_bn_num_batches_tracked]
    %b_model_23_cv3_0_1_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_0_1_0_bn_num_batches_tracked]
    %b_model_23_cv3_0_1_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_0_1_1_bn_num_batches_tracked]
    %b_model_23_cv3_1_0_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_1_0_0_bn_num_batches_tracked]
    %b_model_23_cv3_1_0_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_1_0_1_bn_num_batches_tracked]
    %b_model_23_cv3_1_1_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_1_1_0_bn_num_batches_tracked]
    %b_model_23_cv3_1_1_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_1_1_1_bn_num_batches_tracked]
    %b_model_23_cv3_2_0_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_2_0_0_bn_num_batches_tracked]
    %b_model_23_cv3_2_0_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_2_0_1_bn_num_batches_tracked]
    %b_model_23_cv3_2_1_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_2_1_0_bn_num_batches_tracked]
    %b_model_23_cv3_2_1_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv3_2_1_1_bn_num_batches_tracked]
    %b_model_23_cv4_0_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_0_0_bn_num_batches_tracked]
    %b_model_23_cv4_0_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_0_1_bn_num_batches_tracked]
    %b_model_23_cv4_1_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_1_0_bn_num_batches_tracked]
    %b_model_23_cv4_1_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_1_1_bn_num_batches_tracked]
    %b_model_23_cv4_2_0_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_2_0_bn_num_batches_tracked]
    %b_model_23_cv4_2_1_bn_num_batches_tracked : [num_users=0] = placeholder[target=b_model_23_cv4_2_1_bn_num_batches_tracked]
--> %c_model_23_lifted_tensor_0 : [num_users=3] = placeholder[target=c_model_23_lifted_tensor_0]
    %c_model_23_lifted_tensor_1 : [num_users=3] = placeholder[target=c_model_23_lifted_tensor_1]
    %_lifted_tensor_constant261 : [num_users=1] = placeholder[target=_lifted_tensor_constant261]
    %_lifted_tensor_constant262 : [num_users=1] = placeholder[target=_lifted_tensor_constant262]
    %_lifted_tensor_constant263 : [num_users=1] = placeholder[target=_lifted_tensor_constant263]
    %_lifted_tensor_constant264 : [num_users=1] = placeholder[target=_lifted_tensor_constant264]
    %_lifted_tensor_constant265 : [num_users=1] = placeholder[target=_lifted_tensor_constant265]
    %_lifted_tensor_constant266 : [num_users=1] = placeholder[target=_lifted_tensor_constant266]
    %_lifted_tensor_constant267 : [num_users=1] = placeholder[target=_lifted_tensor_constant267]
    %_lifted_tensor_constant268 : [num_users=1] = placeholder[target=_lifted_tensor_constant268]
    %_lifted_tensor_constant269 : [num_users=1] = placeholder[target=_lifted_tensor_constant269]
    %_lifted_tensor_constant270 : [num_users=1] = placeholder[target=_lifted_tensor_constant270]
    %_lifted_tensor_constant271 : [num_users=1] = placeholder[target=_lifted_tensor_constant271]
    %_lifted_tensor_constant272 : [num_users=1] = placeholder[target=_lifted_tensor_constant272]
    %_lifted_tensor_constant273 : [num_users=1] = placeholder[target=_lifted_tensor_constant273]
    %_lifted_tensor_constant274 : [num_users=1] = placeholder[target=_lifted_tensor_constant274]
    %x : [num_users=1] = placeholder[target=x]
    %alloc : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((40,), torch.float32),), kwargs = {})
    %aten_arange_start_step : [num_users=1] = call_function[target=torch.ops.aten.arange.start_out](args = (0, 40), kwargs = {out: %alloc})
    %alloc_1 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((40,), torch.float32),), kwargs = {})
    %aten_arange_start_step_1 : [num_users=1] = call_function[target=torch.ops.aten.arange.start_out](args = (0, 40), kwargs = {out: %alloc_1})
    %alloc_2 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((80,), torch.float32),), kwargs = {})
    %aten_arange_start_step_2 : [num_users=1] = call_function[target=torch.ops.aten.arange.start_out](args = (0, 80), kwargs = {out: %alloc_2})
    %alloc_3 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((80,), torch.float32),), kwargs = {})
    %aten_arange_start_step_3 : [num_users=1] = call_function[target=torch.ops.aten.arange.start_out](args = (0, 80), kwargs = {out: %alloc_3})
    %alloc_4 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 8400), torch.float32),), kwargs = {})
    %aten_unsqueeze_copy_default : [num_users=1] = call_function[target=torch.ops.aten.unsqueeze_copy.out](args = (%c_model_23_lifted_tensor_0, 0), kwargs = {out: %alloc_4})
    %alloc_5 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((8400,), torch.float32),), kwargs = {})
    %aten_select_copy_int : [num_users=1] = call_function[target=torch.ops.aten.select_copy.int_out](args = (%c_model_23_lifted_tensor_0, 0, 0), kwargs = {out: %alloc_5})
    %alloc_6 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((8400,), torch.float32),), kwargs = {})
    %aten_select_copy_int_1 : [num_users=1] = call_function[target=torch.ops.aten.select_copy.int_out](args = (%c_model_23_lifted_tensor_0, 0, 1), kwargs = {out: %alloc_6})
    %alloc_7 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((), torch.float32),), kwargs = {})
    %aten__to_copy_default : [num_users=1] = call_function[target=torch.ops.aten._to_copy.out](args = (%_lifted_tensor_constant270,), kwargs = {out: %alloc_7})
    %lowered_module_0 : [num_users=1] = get_attr[target=lowered_module_0]
    %executorch_call_delegate : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_0, %x), kwargs = {})
    %getitem : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate, 0), kwargs = {})
    %alloc_8 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 16, 160, 160), torch.float32), ((1, 16, 160, 160), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem, [16, 16], 1), kwargs = {out: %alloc_8})
    %alloc_9 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 16, 160, 160), torch.float32), ((1, 16, 160, 160), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_1 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem, [16, 16], 1), kwargs = {out: %alloc_9})
    %getitem_1 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_8, 1), kwargs = {})
    %getitem_2 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_9, 0), kwargs = {})
    %lowered_module_1 : [num_users=1] = get_attr[target=lowered_module_1]
    %executorch_call_delegate_1 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_1, %getitem_1, %getitem_2), kwargs = {})
    %getitem_3 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_1, 0), kwargs = {})
    %alloc_10 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 32, 80, 80), torch.float32), ((1, 32, 80, 80), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_2 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_3, [32, 32], 1), kwargs = {out: %alloc_10})
    %alloc_11 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 32, 80, 80), torch.float32), ((1, 32, 80, 80), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_3 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_3, [32, 32], 1), kwargs = {out: %alloc_11})
    %getitem_4 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_10, 1), kwargs = {})
    %getitem_5 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_11, 0), kwargs = {})
    %lowered_module_2 : [num_users=1] = get_attr[target=lowered_module_2]
    %executorch_call_delegate_2 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_2, %getitem_4, %getitem_5), kwargs = {})
    %getitem_6 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_2, 0), kwargs = {})
    %getitem_7 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_2, 1), kwargs = {})
    %alloc_12 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_4 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_7, [64, 64], 1), kwargs = {out: %alloc_12})
    %alloc_13 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_5 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_7, [64, 64], 1), kwargs = {out: %alloc_13})
    %getitem_8 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_12, 1), kwargs = {})
    %getitem_9 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_13, 0), kwargs = {})
    %lowered_module_3 : [num_users=1] = get_attr[target=lowered_module_3]
    %executorch_call_delegate_3 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_3, %getitem_8, %getitem_9), kwargs = {})
    %getitem_10 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_3, 0), kwargs = {})
    %getitem_11 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_3, 1), kwargs = {})
    %alloc_14 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_6 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_11, [128, 128], 1), kwargs = {out: %alloc_14})
    %alloc_15 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_7 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_11, [128, 128], 1), kwargs = {out: %alloc_15})
    %getitem_12 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_14, 1), kwargs = {})
    %getitem_13 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_15, 0), kwargs = {})
    %lowered_module_4 : [num_users=1] = get_attr[target=lowered_module_4]
    %executorch_call_delegate_4 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_4, %getitem_12, %getitem_13), kwargs = {})
    %getitem_14 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_4, 0), kwargs = {})
    %alloc_16 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_8 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_14, [128, 128], 1), kwargs = {out: %alloc_16})
    %alloc_17 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_9 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_14, [128, 128], 1), kwargs = {out: %alloc_17})
    %getitem_15 : [num_users=2] = call_function[target=operator.getitem](args = (%alloc_16, 1), kwargs = {})
    %getitem_16 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_17, 0), kwargs = {})
    %lowered_module_5 : [num_users=1] = get_attr[target=lowered_module_5]
    %executorch_call_delegate_5 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_5, %getitem_15), kwargs = {})
    %getitem_17 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_5, 0), kwargs = {})
    %aten_view_copy_default : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_17, [1, 2, 128, 400]), kwargs = {})
    %alloc_18 : [num_users=4] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 2, 32, 400), torch.float32), ((1, 2, 32, 400), torch.float32), ((1, 2, 64, 400), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_10 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%aten_view_copy_default, [32, 32, 64], 2), kwargs = {out: %alloc_18})
    %getitem_18 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_18, 0), kwargs = {})
    %getitem_19 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_18, 1), kwargs = {})
    %getitem_20 : [num_users=2] = call_function[target=operator.getitem](args = (%alloc_18, 2), kwargs = {})
    %lowered_module_6 : [num_users=1] = get_attr[target=lowered_module_6]
    %executorch_call_delegate_6 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_6, %getitem_18), kwargs = {})
    %alloc_19 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 32, 400), torch.float32),), kwargs = {})
    %aten_expand_copy_default : [num_users=1] = call_function[target=torch.ops.aten.expand_copy.out](args = (%getitem_19, [1, 2, 32, 400]), kwargs = {out: %alloc_19})
    %alloc_20 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 64, 400), torch.float32),), kwargs = {})
    %aten_expand_copy_default_1 : [num_users=1] = call_function[target=torch.ops.aten.expand_copy.out](args = (%getitem_20, [1, 2, 64, 400]), kwargs = {out: %alloc_20})
    %alloc_21 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 64, 400), torch.float32),), kwargs = {})
    %aten_clone_default : [num_users=1] = call_function[target=torch.ops.aten.clone.out](args = (%getitem_20,), kwargs = {memory_format: torch.contiguous_format, out: %alloc_21})
    %getitem_21 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_6, 0), kwargs = {})
    %aten_view_copy_default_1 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%aten_expand_copy_default, [2, 32, 400]), kwargs = {})
    %aten_view_copy_default_2 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%aten_expand_copy_default_1, [2, 64, 400]), kwargs = {})
    %aten_view_copy_default_3 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%aten_clone_default, [1, 128, 20, 20]), kwargs = {})
    %alloc_22 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 400, 32), torch.float32),), kwargs = {})
    %aten_expand_copy_default_2 : [num_users=1] = call_function[target=torch.ops.aten.expand_copy.out](args = (%getitem_21, [1, 2, 400, 32]), kwargs = {out: %alloc_22})
    %aten_view_copy_default_4 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%aten_expand_copy_default_2, [2, 400, 32]), kwargs = {})
    %lowered_module_7 : [num_users=1] = get_attr[target=lowered_module_7]
    %executorch_call_delegate_7 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_7, %aten_view_copy_default_4, %aten_view_copy_default_1), kwargs = {})
    %getitem_22 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_7, 0), kwargs = {})
    %aten_view_copy_default_5 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_22, [1, 2, 400, 400]), kwargs = {})
    %lowered_module_8 : [num_users=1] = get_attr[target=lowered_module_8]
    %executorch_call_delegate_8 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_8, %_lifted_tensor_constant261, %aten_view_copy_default_5), kwargs = {})
    %getitem_23 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_8, 0), kwargs = {})
    %alloc_23 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 2, 400, 400), torch.float32),), kwargs = {})
    %aten_expand_copy_default_3 : [num_users=1] = call_function[target=torch.ops.aten.expand_copy.out](args = (%getitem_23, [1, 2, 400, 400]), kwargs = {out: %alloc_23})
    %aten_view_copy_default_6 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%aten_expand_copy_default_3, [2, 400, 400]), kwargs = {})
    %lowered_module_9 : [num_users=1] = get_attr[target=lowered_module_9]
    %executorch_call_delegate_9 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_9, %aten_view_copy_default_2, %aten_view_copy_default_6), kwargs = {})
    %getitem_24 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_9, 0), kwargs = {})
    %aten_view_copy_default_8 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_24, [1, 128, 20, 20]), kwargs = {})
    %lowered_module_10 : [num_users=1] = get_attr[target=lowered_module_10]
    %executorch_call_delegate_10 : [num_users=3] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_10, %_lifted_tensor_constant262, %_lifted_tensor_constant264, %_lifted_tensor_constant263, %_lifted_tensor_constant265, %aten_arange_start_step, %aten_arange_start_step_1, %aten_view_copy_default_3, %aten_view_copy_default_8, %getitem_15, %getitem_16), kwargs = {})
    %getitem_25 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_10, 0), kwargs = {})
    %getitem_26 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_10, 1), kwargs = {})
    %getitem_27 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_10, 2), kwargs = {})
    %alloc_24 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((40,), torch.int64),), kwargs = {})
    %aten__to_copy_default_1 : [num_users=1] = call_function[target=torch.ops.aten._to_copy.out](args = (%getitem_25,), kwargs = {out: %alloc_24})
    %alloc_25 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((40,), torch.int64),), kwargs = {})
    %aten__to_copy_default_2 : [num_users=1] = call_function[target=torch.ops.aten._to_copy.out](args = (%getitem_26,), kwargs = {out: %alloc_25})
    %alloc_26 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((40, 1), torch.int64),), kwargs = {})
    %aten_unsqueeze_copy_default_1 : [num_users=1] = call_function[target=torch.ops.aten.unsqueeze_copy.out](args = (%aten__to_copy_default_1, -1), kwargs = {out: %alloc_26})
    %alloc_27 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 256, 40, 40), torch.float32),), kwargs = {})
    %aten_index_tensor : [num_users=1] = call_function[target=torch.ops.aten.index.Tensor_out](args = (%getitem_27, [None, None, %aten_unsqueeze_copy_default_1, %aten__to_copy_default_2]), kwargs = {out: %alloc_27})
    %lowered_module_11 : [num_users=1] = get_attr[target=lowered_module_11]
    %executorch_call_delegate_11 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_11, %aten_index_tensor, %getitem_10), kwargs = {})
    %getitem_28 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_11, 0), kwargs = {})
    %alloc_28 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_11 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_28, [64, 64], 1), kwargs = {out: %alloc_28})
    %alloc_29 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_12 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_28, [64, 64], 1), kwargs = {out: %alloc_29})
    %getitem_29 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_28, 1), kwargs = {})
    %getitem_30 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_29, 0), kwargs = {})
    %lowered_module_12 : [num_users=1] = get_attr[target=lowered_module_12]
    %executorch_call_delegate_12 : [num_users=3] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_12, %_lifted_tensor_constant266, %_lifted_tensor_constant268, %_lifted_tensor_constant267, %_lifted_tensor_constant269, %aten_arange_start_step_2, %aten_arange_start_step_3, %getitem_29, %getitem_30), kwargs = {})
    %getitem_31 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_12, 0), kwargs = {})
    %getitem_32 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_12, 1), kwargs = {})
    %getitem_33 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_12, 2), kwargs = {})
    %alloc_30 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((80,), torch.int64),), kwargs = {})
    %aten__to_copy_default_3 : [num_users=1] = call_function[target=torch.ops.aten._to_copy.out](args = (%getitem_31,), kwargs = {out: %alloc_30})
    %alloc_31 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((80,), torch.int64),), kwargs = {})
    %aten__to_copy_default_4 : [num_users=1] = call_function[target=torch.ops.aten._to_copy.out](args = (%getitem_32,), kwargs = {out: %alloc_31})
    %alloc_32 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((80, 1), torch.int64),), kwargs = {})
    %aten_unsqueeze_copy_default_2 : [num_users=1] = call_function[target=torch.ops.aten.unsqueeze_copy.out](args = (%aten__to_copy_default_3, -1), kwargs = {out: %alloc_32})
    %alloc_33 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 128, 80, 80), torch.float32),), kwargs = {})
    %aten_index_tensor_1 : [num_users=1] = call_function[target=torch.ops.aten.index.Tensor_out](args = (%getitem_33, [None, None, %aten_unsqueeze_copy_default_2, %aten__to_copy_default_4]), kwargs = {out: %alloc_33})
    %lowered_module_13 : [num_users=1] = get_attr[target=lowered_module_13]
    %executorch_call_delegate_13 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_13, %aten_index_tensor_1, %getitem_6), kwargs = {})
    %getitem_34 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_13, 0), kwargs = {})
    %alloc_34 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 32, 80, 80), torch.float32), ((1, 32, 80, 80), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_13 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_34, [32, 32], 1), kwargs = {out: %alloc_34})
    %alloc_35 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 32, 80, 80), torch.float32), ((1, 32, 80, 80), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_14 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_34, [32, 32], 1), kwargs = {out: %alloc_35})
    %getitem_35 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_34, 1), kwargs = {})
    %getitem_36 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_35, 0), kwargs = {})
    %lowered_module_14 : [num_users=1] = get_attr[target=lowered_module_14]
    %executorch_call_delegate_14 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_14, %getitem_35, %getitem_36, %getitem_33), kwargs = {})
    %getitem_37 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_14, 0), kwargs = {})
    %getitem_38 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_14, 1), kwargs = {})
    %alloc_36 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_15 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_38, [64, 64], 1), kwargs = {out: %alloc_36})
    %alloc_37 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 40, 40), torch.float32), ((1, 64, 40, 40), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_16 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_38, [64, 64], 1), kwargs = {out: %alloc_37})
    %getitem_39 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_36, 1), kwargs = {})
    %getitem_40 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_37, 0), kwargs = {})
    %lowered_module_15 : [num_users=1] = get_attr[target=lowered_module_15]
    %executorch_call_delegate_15 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_15, %getitem_39, %getitem_40, %getitem_27), kwargs = {})
    %getitem_41 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_15, 0), kwargs = {})
    %getitem_42 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_15, 1), kwargs = {})
    %alloc_38 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_17 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_42, [128, 128], 1), kwargs = {out: %alloc_38})
    %alloc_39 : [num_users=2] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 128, 20, 20), torch.float32), ((1, 128, 20, 20), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_18 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_42, [128, 128], 1), kwargs = {out: %alloc_39})
    %getitem_43 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_38, 1), kwargs = {})
    %getitem_44 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_39, 0), kwargs = {})
    %lowered_module_16 : [num_users=1] = get_attr[target=lowered_module_16]
    %executorch_call_delegate_16 : [num_users=4] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_16, %getitem_43, %getitem_37, %getitem_41, %getitem_44), kwargs = {})
    %getitem_45 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_16, 0), kwargs = {})
    %getitem_46 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_16, 1), kwargs = {})
    %getitem_47 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_16, 2), kwargs = {})
    %getitem_48 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_16, 3), kwargs = {})
    %aten_view_copy_default_9 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_45, [1, 51, -1]), kwargs = {})
    %aten_view_copy_default_10 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_46, [1, 51, -1]), kwargs = {})
    %aten_view_copy_default_11 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_48, [1, 51, -1]), kwargs = {})
    %lowered_module_17 : [num_users=1] = get_attr[target=lowered_module_17]
    %executorch_call_delegate_17 : [num_users=4] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_17, %aten_view_copy_default_9, %aten_view_copy_default_10, %aten_view_copy_default_11, %getitem_37, %getitem_41, %getitem_47), kwargs = {})
    %getitem_49 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_17, 0), kwargs = {})
    %getitem_50 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_17, 1), kwargs = {})
    %getitem_51 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_17, 2), kwargs = {})
    %getitem_52 : [num_users=2] = call_function[target=operator.getitem](args = (%executorch_call_delegate_17, 3), kwargs = {})
    %alloc_40 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_clone_default_1 : [num_users=4] = call_function[target=torch.ops.aten.clone.out](args = (%getitem_49,), kwargs = {out: %alloc_40})
    %aten_view_copy_default_12 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_50, [1, 65, -1]), kwargs = {})
    %aten_view_copy_default_13 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_51, [1, 65, -1]), kwargs = {})
    %aten_view_copy_default_14 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_52, [1, 65, -1]), kwargs = {})
    %alloc_41 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_clone_default_1, 1, 2, 9223372036854775807, 3), kwargs = {out: %alloc_41})
    %alloc_42 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor_1 : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_clone_default_1, 1, 2, 9223372036854775807, 3), kwargs = {out: %alloc_42})
    %lowered_module_18 : [num_users=1] = get_attr[target=lowered_module_18]
    %executorch_call_delegate_18 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_18, %aten_slice_copy_tensor, %aten_view_copy_default_12, %aten_view_copy_default_13, %aten_view_copy_default_14), kwargs = {})
    %getitem_53 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_18, 0), kwargs = {})
    %getitem_54 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_18, 1), kwargs = {})
    %alloc_43 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_copy_default : [num_users=1] = call_function[target=torch.ops.aten.copy.out](args = (%aten_slice_copy_tensor_1, %getitem_53), kwargs = {out: %alloc_43})
    %alloc_44 : [num_users=3] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 64, 8400), torch.float32), ((1, 1, 8400), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_19 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%getitem_54, [64, 1], 1), kwargs = {out: %alloc_44})
    %alloc_45 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default : [num_users=1] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_clone_default_1, %aten_copy_default, 1, 2, 9223372036854775807, 3), kwargs = {out: %alloc_45})
    %getitem_55 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_44, 0), kwargs = {})
    %getitem_56 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_44, 1), kwargs = {})
    %alloc_46 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default_1 : [num_users=4] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_clone_default_1, %aten_slice_scatter_default, 0, 0, 9223372036854775807), kwargs = {out: %alloc_46})
    %aten_view_copy_default_15 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_55, [1, 4, 16, 8400]), kwargs = {})
    %alloc_47 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor_2 : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_slice_scatter_default_1, 1, 0, 9223372036854775807, 3), kwargs = {out: %alloc_47})
    %alloc_48 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor_3 : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_slice_scatter_default_1, 1, 0, 9223372036854775807, 3), kwargs = {out: %alloc_48})
    %lowered_module_19 : [num_users=1] = get_attr[target=lowered_module_19]
    %executorch_call_delegate_19 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_19, %_lifted_tensor_constant272, %_lifted_tensor_constant271, %aten_select_copy_int, %aten_slice_copy_tensor_2, %aten_view_copy_default_15, %c_model_23_lifted_tensor_1), kwargs = {})
    %getitem_57 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_19, 0), kwargs = {})
    %getitem_58 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_19, 1), kwargs = {})
    %alloc_49 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 16, 4, 8400), torch.float32),), kwargs = {})
    %aten__softmax_default : [num_users=1] = call_function[target=torch.ops.aten._softmax.out](args = (%getitem_57, 1, False), kwargs = {out: %alloc_49})
    %alloc_50 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_copy_default_1 : [num_users=1] = call_function[target=torch.ops.aten.copy.out](args = (%aten_slice_copy_tensor_3, %getitem_58), kwargs = {out: %alloc_50})
    %alloc_51 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default_2 : [num_users=1] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_slice_scatter_default_1, %aten_copy_default_1, 1, 0, 9223372036854775807, 3), kwargs = {out: %alloc_51})
    %alloc_52 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default_3 : [num_users=4] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_slice_scatter_default_1, %aten_slice_scatter_default_2, 0, 0, 9223372036854775807), kwargs = {out: %alloc_52})
    %alloc_53 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor_4 : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_slice_scatter_default_3, 1, 1, 9223372036854775807, 3), kwargs = {out: %alloc_53})
    %alloc_54 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_slice_copy_tensor_5 : [num_users=1] = call_function[target=torch.ops.aten.slice_copy.Tensor_out](args = (%aten_slice_scatter_default_3, 1, 1, 9223372036854775807, 3), kwargs = {out: %alloc_54})
    %lowered_module_20 : [num_users=1] = get_attr[target=lowered_module_20]
    %executorch_call_delegate_20 : [num_users=2] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_20, %_lifted_tensor_constant274, %_lifted_tensor_constant273, %aten_select_copy_int_1, %aten_slice_copy_tensor_4, %aten__softmax_default, %c_model_23_lifted_tensor_1), kwargs = {})
    %getitem_59 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_20, 0), kwargs = {})
    %getitem_60 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_20, 1), kwargs = {})
    %aten_view_copy_default_16 : [num_users=1] = call_function[target=executorch.exir.memory.view](args = (%getitem_59, [1, 4, 8400]), kwargs = {})
    %alloc_55 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 17, 8400), torch.float32),), kwargs = {})
    %aten_copy_default_2 : [num_users=1] = call_function[target=torch.ops.aten.copy.out](args = (%aten_slice_copy_tensor_5, %getitem_60), kwargs = {out: %alloc_55})
    %alloc_56 : [num_users=3] = call_function[target=executorch.exir.memory.alloc](args = ([((1, 2, 8400), torch.float32), ((1, 2, 8400), torch.float32)],), kwargs = {})
    %aten_split_with_sizes_copy_default_20 : [num_users=0] = call_function[target=torch.ops.aten.split_with_sizes_copy.out](args = (%aten_view_copy_default_16, [2, 2], 1), kwargs = {out: %alloc_56})
    %alloc_57 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default_4 : [num_users=1] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_slice_scatter_default_3, %aten_copy_default_2, 1, 1, 9223372036854775807, 3), kwargs = {out: %alloc_57})
    %getitem_61 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_56, 0), kwargs = {})
    %getitem_62 : [num_users=1] = call_function[target=operator.getitem](args = (%alloc_56, 1), kwargs = {})
    %alloc_58 : [num_users=1] = call_function[target=executorch.exir.memory.alloc](args = (((1, 51, 8400), torch.float32),), kwargs = {})
    %aten_slice_scatter_default_5 : [num_users=1] = call_function[target=torch.ops.aten.slice_scatter.out](args = (%aten_slice_scatter_default_3, %aten_slice_scatter_default_4, 0, 0, 9223372036854775807), kwargs = {out: %alloc_58})
    %lowered_module_21 : [num_users=1] = get_attr[target=lowered_module_21]
    %executorch_call_delegate_21 : [num_users=1] = call_function[target=torch.ops.higher_order.executorch_call_delegate](args = (%lowered_module_21, %aten_unsqueeze_copy_default, %getitem_61, %getitem_62, %getitem_56, %aten__to_copy_default, %c_model_23_lifted_tensor_1, %aten_slice_scatter_default_5), kwargs = {})
    %getitem_63 : [num_users=1] = call_function[target=operator.getitem](args = (%executorch_call_delegate_21, 0), kwargs = {})
    return (getitem_63, getitem_50, getitem_51, getitem_52, getitem_49)
This node c_model_23_lifted_tensor_0 has metadata of:


While executing %c_model_23_lifted_tensor_0 : [num_users=3] = placeholder[target=c_model_23_lifted_tensor_0]
Original traceback:
None

Versions

Collecting environment information...
PyTorch version: 2.5.0+cpu
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: CentOS Stream 9 (x86_64)
GCC version: (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2)
Clang version: Could not collect
CMake version: version 3.31.1
Libc version: glibc-2.34

Python version: 3.10.0 (default, Mar  3 2022, 09:58:08) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.12.0-0_fbk16_zion_7661_geb00762ce6d2-x86_64-with-glibc2.34
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: 
GPU 0: NVIDIA PG509-210
GPU 1: NVIDIA PG509-210
GPU 2: NVIDIA PG509-210
GPU 3: NVIDIA PG509-210
GPU 4: NVIDIA PG509-210
GPU 5: NVIDIA PG509-210
GPU 6: NVIDIA PG509-210
GPU 7: NVIDIA PG509-210

Nvidia driver version: 525.105.17
cuDNN version: Probably one of the following:
/usr/lib64/libcudnn.so.8.8.0
/usr/lib64/libcudnn.so.9.1.0
/usr/lib64/libcudnn_adv.so.9.1.0
/usr/lib64/libcudnn_adv_infer.so.8.8.0
/usr/lib64/libcudnn_adv_train.so.8.8.0
/usr/lib64/libcudnn_cnn.so.9.1.0
/usr/lib64/libcudnn_cnn_infer.so.8.8.0
/usr/lib64/libcudnn_cnn_train.so.8.8.0
/usr/lib64/libcudnn_engines_precompiled.so.9.1.0
/usr/lib64/libcudnn_engines_runtime_compiled.so.9.1.0
/usr/lib64/libcudnn_graph.so.9.1.0
/usr/lib64/libcudnn_heuristic.so.9.1.0
/usr/lib64/libcudnn_ops.so.9.1.0
/usr/lib64/libcudnn_ops_infer.so.8.8.0
/usr/lib64/libcudnn_ops_train.so.8.8.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Address sizes:                   46 bits physical, 48 bits virtual
Byte Order:                      Little Endian
CPU(s):                          192
On-line CPU(s) list:             0-191
Vendor ID:                       GenuineIntel
Model name:                      Intel(R) Xeon(R) Platinum 8339HC CPU @ 1.80GHz
CPU family:                      6
Model:                           85
Thread(s) per core:              2
Core(s) per socket:              24
Socket(s):                       4
Stepping:                        11
Frequency boost:                 enabled
CPU(s) scaling MHz:              100%
CPU max MHz:                     1801.0000
CPU min MHz:                     800.0000
BogoMIPS:                        3600.00
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
Virtualization:                  VT-x
L1d cache:                       3 MiB (96 instances)
L1i cache:                       3 MiB (96 instances)
L2 cache:                        96 MiB (96 instances)
L3 cache:                        132 MiB (4 instances)
NUMA node(s):                    4
NUMA node0 CPU(s):               0-23,96-119
NUMA node1 CPU(s):               24-47,120-143
NUMA node2 CPU(s):               48-71,144-167
NUMA node3 CPU(s):               72-95,168-191
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected

Versions of relevant libraries:
[pip3] executorch==0.4.0a0+6a085ff
[pip3] numpy==2.1.3
[pip3] torch==2.5.0+cpu
[pip3] torchaudio==2.5.0+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0+cpu
[conda] executorch                0.4.0a0+6a085ff          pypi_0    pypi
[conda] numpy                     2.1.3                    pypi_0    pypi
[conda] torch                     2.5.0+cpu                pypi_0    pypi
[conda] torchaudio                2.5.0+cpu                pypi_0    pypi
[conda] torchsr                   1.0.4                    pypi_0    pypi
[conda] torchvision               0.20.0+cpu               pypi_0    pypi


@dbort dbort added bug Something isn't working module: exir Issues related to Export IR triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module actionable Items in the backlog waiting for an appropriate impl/fix labels Dec 6, 2024
@guangy10
Contributor

guangy10 commented Dec 6, 2024

@agunapal Can you try with `strict=True` for export? I'm curious whether it runs into any issue. Maybe the problem is that some Python logic is "unsafely" traced out (unless you have verified that the code doesn't affect the model's logic), leading to the weird NULL-pointer issue during emitting downstream.

@angelayi
Contributor

angelayi commented Dec 7, 2024

Can you also print the exported program from torch.export?

print(ep)
print(ep.constants)
