[BUG] Detection output on mono camera looks not to scale #604

Open
MRo47 opened this issue Sep 26, 2024 · 5 comments
Labels
bug Something isn't working

Comments


MRo47 commented Sep 26, 2024

Version: 1 commit behind iron, commit hash cf5d2aaee9117298ea1632c98ffd36a5d7d535ac

Issue

  • Trying to run a Depth pipeline on an OAK-D Pro W camera.
  • What I want is depth output with a YOLO detection network running on the left camera.
  • The detections look like they are not to scale. Is there a parameter I need to set to scale them?

Steps to reproduce

  1. Set the config (camera.yaml) as below:
/**:
  ros__parameters:
    camera:
      i_enable_imu: true
      i_enable_ir: true
      i_nn_type: none
      i_pipeline_type: Depth
    left:
      i_publish_topic: true
      i_enable_nn: true
      i_disable_node: false
      i_resolution: '720P'
    left_nn:
      i_board_socket_id: 1
      i_nn_config_path: depthai_ros_driver/yolo
  2. Modify the launch file to add the depthai_filters::Detection2DOverlay node and detection_labels; these have to be set up for the overlay as well, since the defaults are the MobileNet SSD labels.
detection_labels = [
        "person",
        "bicycle",
        "car",
        "motorbike",
        "aeroplane",
        "bus",
        "train",
        "truck",
        "boat",
        "traffic light",
        "fire hydrant",
        "stop sign",
        "parking meter",
        "bench",
        "bird",
        "cat",
        "dog",
        "horse",
        "sheep",
        "cow",
        "elephant",
        "bear",
        "zebra",
        "giraffe",
        "backpack",
        "umbrella",
        "handbag",
        "tie",
        "suitcase",
        "frisbee",
        "skis",
        "snowboard",
        "sports ball",
        "kite",
        "baseball bat",
        "baseball glove",
        "skateboard",
        "surfboard",
        "tennis racket",
        "bottle",
        "wine glass",
        "cup",
        "fork",
        "knife",
        "spoon",
        "bowl",
        "banana",
        "apple",
        "sandwich",
        "orange",
        "broccoli",
        "carrot",
        "hot dog",
        "pizza",
        "donut",
        "cake",
        "chair",
        "sofa",
        "pottedplant",
        "bed",
        "diningtable",
        "toilet",
        "tvmonitor",
        "laptop",
        "mouse",
        "remote",
        "keyboard",
        "cell phone",
        "microwave",
        "oven",
        "toaster",
        "sink",
        "refrigerator",
        "book",
        "clock",
        "vase",
        "scissors",
        "teddy bear",
        "hair drier",
        "toothbrush",
    ]

    detection_viz_node = ComposableNode(
        package="depthai_filters",
        plugin="depthai_filters::Detection2DOverlay",
        parameters=[
            {"label_map": detection_labels},
        ],
        remappings=[
            ("rgb/preview/image_raw", "/oak/left/image_raw"),
            ("nn/detections", "/oak/left_nn/detections"),
        ],
    )
    
    ...
    
    ComposableNodeContainer(
            name=f"{name}_container",
            namespace=namespace,
            package="rclcpp_components",
            executable="component_container",
            composable_node_descriptions=[
                ComposableNode(
                    package="depthai_ros_driver",
                    plugin="depthai_ros_driver::Camera",
                    name=name,
                    namespace=namespace,
                    parameters=[
                        params_file,
                        tf_params,
                        parameter_overrides,
                        {"left_nn.i_label_map": detection_labels}
                    ],
                ),
                detection_viz_node,
            ],
            arguments=["--ros-args", "--log-level", log_level],
            prefix=[launch_prefix],
            output="both",
        ),
  3. Build and launch:
ros2 launch depthai_ros_driver camera.launch.py

Below is how the overlay looks. The person doesn't look like this on camera, trust me, and we don't have ghosts ;)

[Screenshot from 2024-09-26 14-59-25]

MRo47 added the bug label on Sep 26, 2024

MRo47 commented Sep 26, 2024

scaling_issue.zip

Complete launch file and yaml config

Serafadam (Collaborator) commented:

Hi, thanks for the report. Indeed there is a minor bug here; a fix is on the way, but just to recap:

  • Images fed to the NN node need to be resized to fit the NN input, in this case 416x416. By default this is done automatically by an internal image manip node: the input image is "squeezed" to preserve the FOV of the original image. More information on that can be found here.
  • Since the image is squeezed, the resulting bounding boxes are also "squeezed" to fit this format.
  • You can verify that by using the passthrough output from the NN node, which can be enabled with left_nn.i_enable_passthrough: true. When you pass that image to the detection overlay, you should see the bounding boxes aligned to the squeezed image.
  • The Detection2DOverlay filter is missing the desqueeze parameter that would apply the correct transformation. This will be fixed to mimic the behavior in spatial_bb. It is a small modification, so if you have built depthai-ros from source you can make the change yourself, or wait for the next release.
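The desqueeze transformation described above can be sketched as below. This is a minimal illustration of the idea, assuming detections arrive in the pixel frame of the squeezed NN input (416x416 here); it is not the driver's actual implementation.

```python
def desqueeze_bbox(bbox, nn_size, img_size):
    """Map a bounding box from the squeezed NN-input frame back to the
    original image frame by rescaling each axis independently.

    bbox:     (x_min, y_min, x_max, y_max) in NN-input pixels
    nn_size:  (width, height) of the NN input, e.g. (416, 416)
    img_size: (width, height) of the original image, e.g. (1280, 720)
    """
    sx = img_size[0] / nn_size[0]  # horizontal desqueeze ratio
    sy = img_size[1] / nn_size[1]  # vertical desqueeze ratio
    x_min, y_min, x_max, y_max = bbox
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)

# A box covering the central half of the 416x416 squeezed frame maps to
# the central half of the 1280x720 left image, roughly (320, 180, 960, 540).
print(desqueeze_bbox((104, 104, 312, 312), (416, 416), (1280, 720)))
```

Drawing NN-frame boxes onto the full-resolution image without this rescaling is exactly what produces the stretched-looking overlay in the screenshot.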


MRo47 commented Sep 27, 2024

@Serafadam Thank you for the pointers above.

Yup, my guess was also that it was the scaling done by the manip node before input to the NN, and I can confirm that after setting left_nn.i_enable_passthrough: true and overlaying on the passthrough topic, the output is correctly aligned.
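For reference, the verification setup can be sketched as the following launch-file fragment. The i_enable_passthrough parameter comes from this thread; the passthrough topic name is an assumption and may differ, so check ros2 topic list for the actual name:

```python
# Fragment of the launch file; ComposableNode comes from launch_ros.descriptions.
# After setting left_nn.i_enable_passthrough: true in camera.yaml, point the
# overlay at the passthrough image instead of the raw left image.
detection_labels = ["person"]  # truncated; use the full list from the launch file

detection_viz_node = ComposableNode(
    package="depthai_filters",
    plugin="depthai_filters::Detection2DOverlay",
    parameters=[{"label_map": detection_labels}],
    remappings=[
        # NOTE: this passthrough topic name is an assumption, not from the thread
        ("rgb/preview/image_raw", "/oak/left_nn/passthrough/image_raw"),
        ("nn/detections", "/oak/left_nn/detections"),
    ],
)
```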

From a design perspective I have 3 questions:

  1. Shouldn't the desqueeze be performed in the Depth node directly? My reasoning is that the user fed the model an input of size WxH, so they should expect the detections to come back in that same WxH frame, right?
  2. Would the above complicate the implementation somehow?
  3. To perform the desqueeze in the Detection2DOverlay filter node you will need information on the "squeeze ratios", i.e. input_image_size : NN_input_size. Are these captured somewhere in a topic, or would this require passing the NN configs to the Detection2DOverlay filter?

Serafadam (Collaborator) commented:

An implementation of this desqueeze is available on this branch to test out.
A small modification to the DetectionOverlay node in the launch file is needed for it:

remappings=[
    ("rgb/preview/image_raw", "/oak/left/image_raw"),
    ("rgb/preview/camera_info", "/oak/left_nn/camera_info"),
    ("nn/detections", "/oak/left_nn/detections"),
],

Shouldn't the desqueeze be performed in the Depth node directly? My reasoning is that the user fed the model an input of size WxH, so they should expect the detections to come back in that same WxH frame, right?

At the moment this is not implemented in the API; squeezing is the default behavior in both the C++ and Python APIs, as described in the documentation. This might change in the future.

To perform the desqueeze in the Detection2DOverlay filter node you will need information on the "squeeze ratios", i.e. input_image_size : NN_input_size. Are these captured somewhere in a topic, or would this require passing the NN configs to the Detection2DOverlay filter?

Currently it is done based on the camera_info topic taken from the passthrough image; unfortunately the ROS vision_msgs messages don't carry this information.
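That answer can be sketched as follows: the CameraInfo published alongside the passthrough image carries the NN input dimensions, so per-axis scale factors can be derived against the image the overlay draws on. This is a minimal illustration of the approach, not the filter's actual code; only the width/height fields of sensor_msgs/CameraInfo are mimicked here by a stand-in class.

```python
from dataclasses import dataclass

@dataclass
class CameraInfo:
    """Stand-in for sensor_msgs/msg/CameraInfo; only width/height are used."""
    width: int
    height: int

def squeeze_ratios(nn_info, image_info):
    """Per-axis factors mapping NN-frame pixel coordinates to the drawn image."""
    return (image_info.width / nn_info.width,
            image_info.height / nn_info.height)

# CameraInfo from the NN passthrough (416x416) vs. the 720P left image:
sx, sy = squeeze_ratios(CameraInfo(416, 416), CameraInfo(1280, 720))
# Multiply detection x coordinates by sx and y coordinates by sy to desqueeze.
print(sx, sy)
```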


MRo47 commented Sep 30, 2024

Hey @Serafadam

I was able to put together a workaround for this in #606.
