YOLOv8 object detection: image input size and preprocessing #6994

moruofan11 · 2023-12-14T14:29:41Z

moruofan11
Dec 14, 2023

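With a trained YOLOv8 model, I can run the terminal command yolo task=detect mode=predict model=best.pt imgsz=640 source=0 show=True to open the camera and run object detection on every frame of the video stream. In this case the input layer of my trained model takes a 640x640 three-channel image.
However, suppose I change imgsz in the terminal command to another size such as 1280, with my camera set to 1280x720 pixels. The image is then preprocessed before being fed to the model. How does YOLOv8 do this preprocessing? (Normally the preprocessed image that YOLOv8 passes to the input layer is not distorted.) (In my tests, raising the input image resolution makes the model's predictions more accurate.)
If I want to use this 640x640-input model with OpenCV but still feed it higher-resolution images, how can I reproduce YOLOv8's preprocessing in OpenCV?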

pderrenger · 2024-02-06T13:19:55Z

pderrenger
Feb 6, 2024
Maintainer

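@moruofan11 When you predict with a different image size (for example 1280), YOLOv8 automatically preprocesses the input image to fit the model. This usually involves resizing and padding, so the image keeps its original aspect ratio and is not distorted.

To preprocess with OpenCV, you can follow these steps to mimic YOLOv8's preprocessing:

Keeping the aspect ratio, scale the image so that it fits inside the model's input size (for example 640x640); the longer side ends up matching the target size.
Pad the resized image to reach the required input size. The code below splits the padding evenly between the two opposite sides.
Make sure the padding value (usually gray, i.e. (114, 114, 114)) matches the one used during training.

Here is a simple OpenCV code example showing how to do this preprocessing: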

import cv2

def letterbox(img, new_shape=(640, 640), color=(114, 114, 114)):
    shape = img.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border

    return img

# Load image
img = cv2.imread('path/to/your/image.jpg')

# Preprocess image
img_preprocessed = letterbox(img)

# Now you can pass `img_preprocessed` to your model for prediction

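This code resizes and pads the image to the specified size while keeping the original aspect ratio, which avoids distortion. In practice, you may need to adapt this function to your specific needs.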
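The letterbox function above returns an image that is still in OpenCV's layout: height x width x channels, BGR order, 8-bit values. The sketch below shows one way to do the remaining conversion, assuming the model expects a float32 tensor of shape (1, 3, H, W) in RGB order with values scaled to the range 0 to 1. The helper name to_blob is hypothetical, and an exported model (for example ONNX) should be checked against its own input specification.

```python
import numpy as np

def to_blob(img_bgr):
    """Convert a letterboxed HxWx3 BGR uint8 image to a 1x3xHxW float32 array in [0, 1].

    Assumption: the model wants RGB, channels-first input scaled by 1/255.
    """
    img_rgb = img_bgr[..., ::-1]                      # BGR -> RGB
    chw = img_rgb.transpose(2, 0, 1)                  # HWC -> CHW
    blob = np.ascontiguousarray(chw, dtype=np.float32) / 255.0  # scale to [0, 1]
    return blob[np.newaxis]                           # add the batch dimension

# Synthetic gray image standing in for the output of letterbox()
dummy = np.full((640, 640, 3), 114, dtype=np.uint8)
blob = to_blob(dummy)
print(blob.shape, blob.dtype)
```

In a real pipeline the input would be img_preprocessed from the example above instead of the synthetic array.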

2 replies

StanleyYake Sep 20, 2024

I found another version: https://github.com/WongKinYiu/yolov9/blob/5b1ea9a8b3f0ffe4fe0e203ec6232d788bb3fcff/utils/panoptic/augmentations.py#L126

glenn-jocher Sep 20, 2024
Maintainer

@StanleyYake thank you for sharing that version. If you have any questions about the Ultralytics implementation, feel free to ask!

Han6260 · 2024-04-10T06:44:54Z

Han6260
Apr 10, 2024

Does the model not normalize the images during the detection and training process?

8 replies

glenn-jocher Apr 10, 2024
Maintainer

You're welcome! If you have any more questions or need further assistance, feel free to reach out. Happy coding! 😊

caibird9996 Jun 30, 2024

Hello, I would like to ask about the feature maps produced from different images. Are their feature map sizes different? For example, if two different images are preprocessed with the same YOLOv8 model, will their largest feature maps have different sizes? @glenn-jocher

glenn-jocher Jun 30, 2024
Maintainer

Hi @caibird9996,

Thank you for your question! The feature map sizes in YOLO models, including YOLOv8, are determined by the architecture and the input image size. Typically, YOLO models use a series of downsampling operations (like convolutions with stride and pooling) which reduce the spatial dimensions of the input image to create feature maps at different scales.

For a 320x320 input image, you would expect the feature maps to be downsampled by factors of 8, 16, and 32, resulting in feature maps of sizes 40x40, 20x20, and 10x10 respectively. If you're seeing a maximum feature map size of 32, it suggests there might be a discrepancy in the downsampling process or the model configuration.

Here are a few things to check:

Model Configuration: Ensure that the model configuration (e.g., the YAML file) specifies the correct input size and architecture details.
Preprocessing: Verify that the input images are being resized correctly to 320x320 before being fed into the model.
Model Layers: Check the layers of your model to ensure the downsampling operations are correctly implemented.

If the issue persists, please provide a reproducible example so we can investigate further. You can find guidance on creating a minimum reproducible example here.

Additionally, make sure you are using the latest version of the Ultralytics YOLO package to benefit from the latest fixes and improvements.

Feel free to share more details or any specific code snippets if you need further assistance. 😊
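The stride arithmetic in this reply can be checked with a few lines of Python. This is only a sketch: it assumes the three detection scales use strides 8, 16, and 32 as stated above and that the input height and width are multiples of 32. The function name feature_map_sizes is hypothetical and nothing here calls the Ultralytics API.

```python
def feature_map_sizes(height, width, strides=(8, 16, 32)):
    """Return the (h, w) of each feature map for a given input size."""
    return [(height // s, width // s) for s in strides]

print(feature_map_sizes(320, 320))  # [(40, 40), (20, 20), (10, 10)]
print(feature_map_sizes(256, 320))  # [(32, 40), (16, 20), (8, 10)]
```

A feature map side of 32 cells at stride 8 corresponds to an input side of 256 pixels rather than 320, so it is worth checking whether the tensor that reaches the model is really 320x320 or a padded rectangle such as 256x320.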

caibird9996 Jul 1, 2024

Hello, I want to use the feature map x in the Detect layer. For the three feature maps of a 320*320 input image, I find that their channel count, width, and height are sometimes different. With the model parameters set to imgsz 320 and batch 4, why does the channel count of the feature map sometimes become larger or smaller, and why do the width and height sometimes change from 40, 20, and 10 to 32, 16, and 8? @glenn-jocher

glenn-jocher Jul 2, 2024
Maintainer

Hello @caibird9996,

Thank you for your question! The behavior you're observing with the feature map sizes and channels can be influenced by several factors, including the model architecture, the specific layers used, and any dynamic operations within the network.

Here are a few points to consider:

Model Architecture: Ensure that the model architecture is consistent and that there are no dynamic changes in the layers that could affect the feature map sizes. For example, certain layers like pooling or strided convolutions can alter the dimensions of the feature maps.
Input Size: Verify that the input size is consistently set to 320x320. Any deviation in the input size can lead to different feature map dimensions. You can enforce this by explicitly resizing your input images before feeding them into the model.
Batch Size: The batch size should not affect the spatial dimensions of the feature maps, but it's good to ensure that the batch processing is consistent and does not introduce any unexpected behavior.
Dynamic Layers: Some layers or operations might dynamically adjust based on the input. Ensure that all layers are configured to maintain consistent output dimensions.

To help us better understand and address the issue, could you please provide a minimum reproducible example? This will allow us to replicate the behavior and investigate further. You can find guidance on creating a reproducible example here.

Additionally, please ensure you are using the latest version of the Ultralytics YOLO package, as updates often include important fixes and improvements.

If you have any specific code snippets or additional details, feel free to share them. We're here to help! 😊

pdw2002 · 2024-08-09T02:32:12Z

pdw2002
Aug 9, 2024

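After training on my own dataset, I ran inference with the input image size set to (2560, 2560), but the image size printed to the console was (1, 3, 1472, 2560). Is this also related to what was said above, that YOLOv8 resizes and pads the input at inference time to keep the image's original aspect ratio?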

3 replies

glenn-jocher Aug 9, 2024
Maintainer

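@pdw2002 Yes, during inference YOLOv8 resizes and pads the input image to keep its original aspect ratio. This preprocessing keeps the image from being distorted while fitting it to the model's input size. The final size you see, (1, 3, 1472, 2560), is the result of that resizing and padding. I hope this answers your question.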
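A height of 1472 rather than 2560 is consistent with padding only up to the next multiple of the model stride instead of to a full square. The sketch below reproduces that arithmetic in plain Python under stated assumptions: a maximum stride of 32, a scale ratio chosen the same way as in the letterbox function earlier in this thread, and a hypothetical 1280x730 source image (the actual image size is not given in the question). The function name padded_shape is illustrative, not an Ultralytics API.

```python
def padded_shape(height, width, imgsz, stride=32):
    """Scale to fit inside imgsz x imgsz, then pad each side only to a multiple of stride."""
    r = min(imgsz / height, imgsz / width)       # same ratio as letterbox()
    new_h, new_w = round(height * r), round(width * r)
    pad_h = (imgsz - new_h) % stride             # minimal padding, not a full square
    pad_w = (imgsz - new_w) % stride
    return new_h + pad_h, new_w + pad_w

print(padded_shape(730, 1280, 2560))  # (1472, 2560), hypothetical source image
print(padded_shape(720, 1280, 1280))  # (736, 1280)
print(padded_shape(720, 1280, 640))   # (384, 640)
```

Under the same assumptions, a 1280x720 camera frame at imgsz=1280 would reach the model as a 736x1280 tensor, not 1280x1280.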

djdll Sep 7, 2024

@glenn-jocher Hi, I was wondering where I could save or find preprocessed and post-preprocessed images, in both inference and training. I noticed that YOLO classification does not use the padding method, but I am not sure whether it resizes or crops directly.

glenn-jocher Sep 7, 2024
Maintainer

You can find preprocessed images in the runs directory during training or inference. YOLOv8 typically resizes images without padding for classification tasks.

dhyuk54 · 2024-10-26T03:22:55Z

dhyuk54
Oct 26, 2024

@pderrenger

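Thank you for the answer. I have one more question. When using YOLOv5 or a later version with a custom dataset whose industrial images are 1280x960, does YOLO automatically process or augment the images before feeding them to the model? Or are they converted to the size YOLO specifies, for example 640x640, before being fed to the model? And if my images are fairly large, do I need to add more layers to the base network?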

0 replies
