README and Docs updates with A100 TensorRT times (#270)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-11 21:54:41 +01:00
parent 216cf2ddb6
commit e18ae9d8e1
10 changed files with 250 additions and 241 deletions
--- a/README.md
+++ b/README.md
@ -121,13 +121,13 @@ Ultralytics [release](https://github.com/ultralytics/ultralytics/releases) on fi

 <details open><summary>Detection</summary>

-| Model                                                                                | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU<br>(ms) | Speed<br><sup>T4 GPU<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
-| ------------------------------------------------------------------------------------ | --------------------- | -------------------- | ------------------------- | ---------------------------- | ------------------ | ----------------- |
-| [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt) | 640                   | 37.3                 | -                         | -                            | 3.2                | 8.7               |
-| [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt) | 640                   | 44.9                 | -                         | -                            | 11.2               | 28.6              |
-| [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m.pt) | 640                   | 50.2                 | -                         | -                            | 25.9               | 78.9              |
-| [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt) | 640                   | 52.9                 | -                         | -                            | 43.7               | 165.2             |
-| [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x.pt) | 640                   | 53.9                 | -                         | -                            | 68.2               | 257.8             |
+| Model                                                                                | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
+| ------------------------------------------------------------------------------------ | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
+| [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt) | 640                   | 37.3                 | -                              | 0.99                                | 3.2                | 8.7               |
+| [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt) | 640                   | 44.9                 | -                              | 1.20                                | 11.2               | 28.6              |
+| [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m.pt) | 640                   | 50.2                 | -                              | 1.83                                | 25.9               | 78.9              |
+| [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt) | 640                   | 52.9                 | -                              | 2.39                                | 43.7               | 165.2             |
+| [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x.pt) | 640                   | 53.9                 | -                              | 3.53                                | 68.2               | 257.8             |

 - **mAP<sup>val</sup>** values are for single-model single-scale on [COCO val2017](http://cocodataset.org) dataset.
  <br>Reproduce by `yolo mode=val task=detect data=coco.yaml device=0`
@ -138,8 +138,8 @@ Ultralytics [release](https://github.com/ultralytics/ultralytics/releases) on fi

 <details><summary>Segmentation</summary>

-| Model                                                                                    | size<br><sup>(pixels) | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | Speed<br><sup>CPU<br>(ms) | Speed<br><sup>T4 GPU<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
-| ---------------------------------------------------------------------------------------- | --------------------- | -------------------- | --------------------- | ------------------------- | ---------------------------- | ------------------ | ----------------- |
+| Model                                                                                    | size<br><sup>(pixels) | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
+| ---------------------------------------------------------------------------------------- | --------------------- | -------------------- | --------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
 | [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-seg.pt) | 640                   | 36.7                 | 30.5                  | -                              | -                                   | 3.4                | 12.6              |
 | [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-seg.pt) | 640                   | 44.6                 | 36.8                  | -                              | -                                   | 11.8               | 42.6              |
 | [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-seg.pt) | 640                   | 49.9                 | 40.8                  | -                              | -                                   | 27.3               | 110.2             |
@ -155,8 +155,8 @@ Ultralytics [release](https://github.com/ultralytics/ultralytics/releases) on fi

 <details><summary>Classification</summary>

-| Model                                                                                    | size<br><sup>(pixels) | acc<br><sup>top1 | acc<br><sup>top5 | Speed<br><sup>CPU<br>(ms) | Speed<br><sup>T4 GPU<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
-| ---------------------------------------------------------------------------------------- | --------------------- | ---------------- | ---------------- | ------------------------- | ---------------------------- | ------------------ | ------------------------ |
+| Model                                                                                    | size<br><sup>(pixels) | acc<br><sup>top1 | acc<br><sup>top5 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
+| ---------------------------------------------------------------------------------------- | --------------------- | ---------------- | ---------------- | ------------------------------ | ----------------------------------- | ------------------ | ------------------------ |
 | [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-cls.pt) | 224                   | 66.6             | 87.0             | -                              | -                                   | 2.7                | 4.3                      |
 | [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-cls.pt) | 224                   | 72.3             | 91.1             | -                              | -                                   | 6.4                | 13.5                     |
 | [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-cls.pt) | 224                   | 76.4             | 93.2             | -                              | -                                   | 17.0               | 42.7                     |
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@ -115,13 +115,13 @@ success = YOLO("yolov8n.pt").export(format="onnx")  # 将模型导出为 ONNX

 <details open><summary>目标检测</summary>

-| 模型                                                                                        | 尺寸<br><sup>（像素） | mAP<sup>val<br>50-95 | 推理速度<br><sup>CPU<br>(ms) | 推理速度<br><sup>T4 GPU<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) |
-| ----------------------------------------------------------------------------------------- | --------------- | -------------------- | ------------------------ | --------------------------- | --------------- | ----------------- |
-| [YOLOv8n](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8n.pt) | 640             | 37.3                 | -                        | -                           | 3.2             | 8.7               |
-| [YOLOv8s](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8s.pt) | 640             | 44.9                 | -                        | -                           | 11.2            | 28.6              |
-| [YOLOv8m](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8m.pt) | 640             | 50.2                 | -                        | -                           | 25.9            | 78.9              |
-| [YOLOv8l](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8l.pt) | 640             | 52.9                 | -                        | -                           | 43.7            | 165.2             |
-| [YOLOv8x](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8x.pt) | 640             | 53.9                 | -                        | -                           | 68.2            | 257.8             |
+| 模型                                                                                   | 尺寸<br><sup>（像素） | mAP<sup>val<br>50-95 | 推理速度<br><sup>CPU ONNX<br>(ms) | 推理速度<br><sup>A100 TensorRT<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) |
+| ------------------------------------------------------------------------------------ | --------------- | -------------------- | ----------------------------- | ---------------------------------- | --------------- | ----------------- |
+| [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt) | 640             | 37.3                 | -                             | 0.99                               | 3.2             | 8.7               |
+| [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt) | 640             | 44.9                 | -                             | 1.20                               | 11.2            | 28.6              |
+| [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m.pt) | 640             | 50.2                 | -                             | 1.83                               | 25.9            | 78.9              |
+| [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt) | 640             | 52.9                 | -                             | 2.39                               | 43.7            | 165.2             |
+| [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x.pt) | 640             | 53.9                 | -                             | 3.53                               | 68.2            | 257.8             |

 - **mAP<sup>val</sup>** 结果都在 [COCO val2017](http://cocodataset.org) 数据集上，使用单模型单尺度测试得到。
  <br>复现命令 `yolo mode=val task=detect data=coco.yaml device=0`
@ -132,8 +132,8 @@ success = YOLO("yolov8n.pt").export(format="onnx")  # 将模型导出为 ONNX

 <details><summary>实例分割</summary>

-| 模型                                                                                            | 尺寸<br><sup>（像素） | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | 推理速度<br><sup>CPU<br>(ms) | 推理速度<br><sup>T4 GPU<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) |
-| --------------------------------------------------------------------------------------------- | --------------- | -------------------- | --------------------- | ------------------------ | --------------------------- | --------------- | ----------------- |
+| 模型                                                                                            | 尺寸<br><sup>（像素） | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | 推理速度<br><sup>CPU ONNX<br>(ms) | 推理速度<br><sup>A100 TensorRT<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) |
+| --------------------------------------------------------------------------------------------- | --------------- | -------------------- | --------------------- | ----------------------------- | ---------------------------------- | --------------- | ----------------- |
 | [YOLOv8n](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8n-seg.pt) | 640             | 36.7                 | 30.5                  | -                             | -                                  | 3.4             | 12.6              |
 | [YOLOv8s](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8s-seg.pt) | 640             | 44.6                 | 36.8                  | -                             | -                                  | 11.8            | 42.6              |
 | [YOLOv8m](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8m-seg.pt) | 640             | 49.9                 | 40.8                  | -                             | -                                  | 27.3            | 110.2             |
@ -149,8 +149,8 @@ success = YOLO("yolov8n.pt").export(format="onnx")  # 将模型导出为 ONNX

 <details><summary>分类</summary>

-| 模型                                                                                            | 尺寸<br><sup>（像素） | acc<br><sup>top1 | acc<br><sup>top5 | 推理速度<br><sup>CPU<br>(ms) | 推理速度<br><sup>T4 GPU<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
-| --------------------------------------------------------------------------------------------- | --------------- | ---------------- | ---------------- | ------------------------ | --------------------------- | --------------- | ------------------------ |
+| 模型                                                                                            | 尺寸<br><sup>（像素） | acc<br><sup>top1 | acc<br><sup>top5 | 推理速度<br><sup>CPU ONNX<br>(ms) | 推理速度<br><sup>A100 TensorRT<br>(ms) | 参数量<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
+| --------------------------------------------------------------------------------------------- | --------------- | ---------------- | ---------------- | ----------------------------- | ---------------------------------- | --------------- | ------------------------ |
 | [YOLOv8n](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8n-cls.pt) | 224             | 66.6             | 87.0             | -                             | -                                  | 2.7             | 4.3                      |
 | [YOLOv8s](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8s-cls.pt) | 224             | 72.3             | 91.1             | -                             | -                                  | 6.4             | 13.5                     |
 | [YOLOv8m](https://github.com/ultralytics/ultralytics/releases/download/v8.0.0/yolov8m-cls.pt) | 224             | 76.4             | 93.2             | -                             | -                                  | 17.0            | 42.7                     |
--- a/docs/assets/favicon.ico
+++ b/docs/assets/favicon.ico
--- a/docs/engine.md
+++ b/docs/engine.md
@ -5,7 +5,7 @@ BaseTrainer contains the generic boilerplate training routine. It can be customi

 * `get_model(cfg, weights)` - The function that builds the model to be trained
 * `get_dataloder()` - The function that builds the dataloader
-More details and source code can be found in [`BaseTrainer` Reference](../reference/base_trainer.md)
+More details and source code can be found in [`BaseTrainer` Reference](reference/base_trainer.md)

 ## DetectionTrainer
 Here's how you can use the YOLOv8 `DetectionTrainer` and customize it.
--- a/docs/index.md
+++ b/docs/index.md
@ -1,6 +1,10 @@
 <div align="center">
  <a href="https://github.com/ultralytics/ultralytics" target="_blank">
    <img width="1024" src="https://raw.githubusercontent.com/ultralytics/assets/main/yolov8/banner-yolov8.png"></a>
+<br>
+    <a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg" alt="Ultralytics CI"></a>
+    <a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv8 Citation"></a>
+    <a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
    <br>
    <a href="https://console.paperspace.com/github/ultralytics/ultralytics"><img src="https://assets.paperspace.io/img/gradient-badge.svg" alt="Run on Gradient"/></a>
    <a href="https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
--- a/mkdocs.yml
+++ b/mkdocs.yml
@ -1,25 +1,14 @@
 site_name: Ultralytics Docs
 repo_url: https://github.com/ultralytics/ultralytics
-repo_name: Ultralytics
+edit_uri: https://github.com/ultralytics/ultralytics/tree/main/docs
+repo_name: ultralytics/ultralytics

 theme:
  name: "material"
  logo: https://github.com/ultralytics/assets/raw/main/logo/Ultralytics-logomark-white.png
-  icon:
-    repo: fontawesome/brands/github
-    admonition:
-      note: octicons/tag-16
-      abstract: octicons/checklist-16
-      info: octicons/info-16
-      tip: octicons/squirrel-16
-      success: octicons/check-16
-      question: octicons/question-16
-      warning: octicons/alert-16
-      failure: octicons/x-circle-16
-      danger: octicons/zap-16
-      bug: octicons/bug-16
-      example: octicons/beaker-16
-      quote: octicons/quote-16
+  favicon: assets/favicon.ico
+  font:
+    text: Roboto

  palette:
    # Palette toggle for light mode
@ -34,12 +23,16 @@ theme:
        icon: material/brightness-4
        name: Switch to light mode
  features:
+    - content.action.edit
    - content.code.annotate
    - content.tooltips
    - search.highlight
    - search.share
    - search.suggest
    - toc.follow
+    - navigation.top
+    - navigation.expand
+    - navigation.footer

 extra_css:
  - stylesheets/style.css
@ -72,8 +65,10 @@ markdown_extensions:
  - pymdownx.keys
  - pymdownx.mark
  - pymdownx.tilde
+
 plugins:
  - mkdocstrings
+  - search

 # Primary navigation
 nav:
--- a/ultralytics/nn/autobackend.py
+++ b/ultralytics/nn/autobackend.py
@ -22,32 +22,31 @@ class AutoBackend(nn.Module):

    def __init__(self, weights='yolov8n.pt', device=torch.device('cpu'), dnn=False, data=None, fp16=False, fuse=True):
        """
-        Ultralytics YOLO MultiBackend class for python inference on various backends
+        MultiBackend class for Python inference on various platforms using Ultralytics YOLO.

        Args:
-          weights: the path to the weights file. Defaults to yolov8n.pt
-          device: The device to run the model on.
-          dnn: If you want to use OpenCV's DNN module to run the inference, set this to True. Defaults to
-        False
-          data: a dictionary containing the following keys:
-          fp16: If true, will use half precision. Defaults to False
-          fuse: whether to fuse the model or not. Defaults to True
+            weights (str): The path to the weights file. Default: 'yolov8n.pt'
+            device (torch.device): The device to run the model on.
+            dnn (bool): Use OpenCV's DNN module for inference if True, defaults to False.
+            data (dict): Additional data, optional
+            fp16 (bool): If True, use half precision. Default: False
+            fuse (bool): Whether to fuse the model or not. Default: True

-        Supported format and their usage:
-            | Platform              | weights          |
-            |-----------------------|------------------|
-            | PyTorch               | *.pt             |
-            | TorchScript           | *.torchscript    |
-            | ONNX Runtime          | *.onnx           |
-            | ONNX OpenCV DNN       | *.onnx --dnn     |
-            | OpenVINO              | *.xml            |
-            | CoreML                | *.mlmodel        |
-            | TensorRT              | *.engine         |
-            | TensorFlow SavedModel | *_saved_model    |
-            | TensorFlow GraphDef   | *.pb             |
-            | TensorFlow Lite       | *.tflite         |
-            | TensorFlow Edge TPU   | *_edgetpu.tflite |
-            | PaddlePaddle          | *_paddle_model   |
+        Supported formats and their usage:
+            Platform              | Weights Format
+            -----------------------|------------------
+            PyTorch               | *.pt
+            TorchScript           | *.torchscript
+            ONNX Runtime          | *.onnx
+            ONNX OpenCV DNN       | *.onnx --dnn
+            OpenVINO              | *.xml
+            CoreML                | *.mlmodel
+            TensorRT              | *.engine
+            TensorFlow SavedModel | *_saved_model
+            TensorFlow GraphDef   | *.pb
+            TensorFlow Lite       | *.tflite
+            TensorFlow Edge TPU   | *_edgetpu.tflite
+            PaddlePaddle          | *_paddle_model
        """
        super().__init__()
        w = str(weights[0] if isinstance(weights, list) else weights)
@ -234,15 +233,16 @@ class AutoBackend(nn.Module):

    def forward(self, im, augment=False, visualize=False):
        """
-        Runs inference on the given model
+        Runs inference on the YOLOv8 MultiBackend model.

        Args:
-          im: the image tensor
-          augment: whether to augment the image. Defaults to False
-          visualize: if True, then the network will output the feature maps of the last convolutional layer.
-        Defaults to False
+            im (torch.Tensor): The image tensor to perform inference on.
+            augment (bool): whether to perform data augmentation during inference, defaults to False
+            visualize (bool): whether to visualize the output predictions, defaults to False
+
+        Returns:
+            (tuple): Tuple containing the raw output tensor, and the processed output for visualization (if visualize=True)
        """
-        # YOLOv5 MultiBackend inference
        b, ch, h, w = im.shape  # batch, channel, height, width
        if self.fp16 and im.dtype != torch.float16:
            im = im.half()  # to FP16
@ -325,19 +325,25 @@ class AutoBackend(nn.Module):

    def from_numpy(self, x):
        """
-        `from_numpy` converts a numpy array to a tensor
+         Convert a numpy array to a tensor.

         Args:
-          x: the numpy array to convert
+             x (numpy.ndarray): The array to be converted.
+
+         Returns:
+             (torch.Tensor): The converted tensor
         """
        return torch.from_numpy(x).to(self.device) if isinstance(x, np.ndarray) else x

    def warmup(self, imgsz=(1, 3, 640, 640)):
        """
-        Warmup model by running inference once
+        Warm up the model by running one forward pass with a dummy input.

        Args:
-          imgsz: the size of the image you want to run inference on.
+            imgsz (tuple): The shape of the dummy input tensor in the format (batch_size, channels, height, width)
+
+        Returns:
+            (None): This method runs the forward pass and does not return any value
        """
        warmup_types = self.pt, self.jit, self.onnx, self.engine, self.saved_model, self.pb, self.triton, self.nn_module
        if any(warmup_types) and (self.device.type != 'cpu' or self.triton):
--- a/ultralytics/nn/tasks.py
+++ b/ultralytics/nn/tasks.py
@ -17,35 +17,36 @@ from ultralytics.yolo.utils.torch_utils import (fuse_conv_and_bn, initialize_wei


 class BaseModel(nn.Module):
-    '''
-     The BaseModel class is a base class for all the models in the Ultralytics YOLO family.
-    '''
+    """
+    The BaseModel class serves as a base class for all the models in the Ultralytics YOLO family.
+    """

    def forward(self, x, profile=False, visualize=False):
        """
-        > `forward` is a wrapper for `_forward_once` that runs the model on a single scale
+        Forward pass of the model on a single scale.
+        Wrapper for `_forward_once` method.

        Args:
-          x: the input image
-          profile: whether to profile the model. Defaults to False
-          visualize: if True, will return the intermediate feature maps. Defaults to False
+            x (torch.Tensor): The input image tensor
+            profile (bool): Whether to profile the model, defaults to False
+            visualize (bool): Whether to return the intermediate feature maps, defaults to False

        Returns:
-          The output of the network.
+            (torch.Tensor): The output of the network.
        """
        return self._forward_once(x, profile, visualize)

    def _forward_once(self, x, profile=False, visualize=False):
        """
-        > Forward pass of the network
+        Perform a forward pass through the network.

        Args:
-          x: input to the model
-          profile: if True, the time taken for each layer will be printed. Defaults to False
-          visualize: If True, it will save the feature maps of the model. Defaults to False
+            x (torch.Tensor): The input tensor to the model
+            profile (bool): Print the computation time of each layer if True, defaults to False.
+            visualize (bool): Save the feature maps of the model if True, defaults to False

        Returns:
-          The last layer of the model.
+            (torch.Tensor): The last output of the model.
        """
        y, dt = [], []  # outputs
        for m in self.model:
@ -62,13 +63,15 @@ class BaseModel(nn.Module):

    def _profile_one_layer(self, m, x, dt):
        """
-        It takes a model, an input, and a list of times, and it profiles the model on the input, appending
-        the time to the list
+        Profile the computation time and FLOPs of a single layer of the model on a given input. Appends the results to the provided list.

        Args:
-          m: the model
-          x: the input image
-          dt: list of time taken for each layer
+            m (nn.Module): The layer to be profiled.
+            x (torch.Tensor): The input data to the layer.
+            dt (list): A list to store the computation time of the layer.
+
+        Returns:
+            None
        """
        c = m == self.model[-1]  # is final layer, copy input as inplace fix
        o = thop.profile(m, inputs=(x.copy() if c else x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPs
@ -84,10 +87,10 @@ class BaseModel(nn.Module):

    def fuse(self):
        """
-        > It takes a model and fuses the Conv2d() and BatchNorm2d() layers into a single layer
+        Fuse the `Conv2d()` and `BatchNorm2d()` layers of the model into a single layer, in order to improve the computation efficiency.

        Returns:
-          The model is being returned.
+            (nn.Module): The fused model is returned.
        """
        LOGGER.info('Fusing layers... ')
        for m in self.model.modules():
@ -103,8 +106,8 @@ class BaseModel(nn.Module):
        Prints model information

        Args:
-          verbose: if True, prints out the model information. Defaults to False
-          imgsz: the size of the image that the model will be trained on. Defaults to 640
+          verbose (bool): if True, prints out the model information. Defaults to False
+          imgsz (int): the size of the image that the model will be trained on. Defaults to 640
        """
        model_info(self, verbose, imgsz)

@ -129,10 +132,10 @@ class BaseModel(nn.Module):

    def load(self, weights):
        """
-        > This function loads the weights of the model from a file
+        This function loads the weights of the model from a file

        Args:
-          weights: The weights to load into the model.
+          weights (str): The weights to load into the model.
        """
        # Force all tasks to implement this function
        raise NotImplementedError("This function needs to be implemented by derived classes!")
--- a/ultralytics/yolo/engine/trainer.py
+++ b/ultralytics/yolo/engine/trainer.py
@ -84,6 +84,7 @@ class BaseTrainer:
        if overrides is None:
            overrides = {}
        self.args = get_config(config, overrides)
+        self.device = utils.torch_utils.select_device(self.args.device, self.args.batch)
        self.check_resume()
        self.console = LOGGER
        self.validator = None
@ -113,7 +114,6 @@ class BaseTrainer:
            print_args(dict(self.args))

        # Device
-        self.device = utils.torch_utils.select_device(self.args.device, self.batch_size)
        self.amp = self.device.type != 'cpu'
        self.scaler = amp.GradScaler(enabled=self.amp)
        if self.device.type == 'cpu':
@ -164,7 +164,15 @@ class BaseTrainer:
            callback(self)

    def train(self):
+        # Allow device='', device=None on Multi-GPU systems to default to device=0
+        if isinstance(self.args.device, int) or self.args.device:  # i.e. device=0 or device=[0,1,2,3]
            world_size = torch.cuda.device_count()
+        elif torch.cuda.is_available():  # i.e. device=None or device=''
+            world_size = 1  # default to device 0
+        else:  # i.e. device='cpu' or 'mps'
+            world_size = 0
+
+        # Run subprocess if DDP training, else train normally
        if world_size > 1 and "LOCAL_RANK" not in os.environ:
            command = generate_ddp_command(world_size, self)
            try:
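The `world_size` selection added to `train()` above can be traced with a small stand-alone sketch. This is only an illustration under stated assumptions: `resolve_world_size` is a hypothetical helper that is not part of this commit, and its `cuda_available` and `device_count` arguments stand in for `torch.cuda.is_available()` and `torch.cuda.device_count()` so the branch order can be exercised without PyTorch or a GPU.

```python
def resolve_world_size(device, cuda_available, device_count):
    """Mirror the branch order of the world_size selection shown in the diff.

    Hypothetical helper: cuda_available and device_count replace the
    torch.cuda.is_available() and torch.cuda.device_count() calls.
    """
    if isinstance(device, int) or device:  # device=0, [0, 1, 2, 3], or any non-empty string
        return device_count
    elif cuda_available:  # device=None or device=''
        return 1  # default to device 0
    else:  # device unset and CUDA not available
        return 0


# Simulated machine with 4 CUDA devices
print(resolve_world_size(0, True, 4))      # explicit device index
print(resolve_world_size(None, True, 4))   # unset device, CUDA present
print(resolve_world_size('', False, 0))    # unset device, no CUDA
print(resolve_world_size('cpu', True, 4))  # non-empty string takes the first branch
```

In this sketch a non-empty string such as `'cpu'` is truthy and therefore takes the first branch, so the final `else` appears to be reached only when the device is unset and CUDA is unavailable. Whether that matches the intent of the `# i.e. device='cpu' or 'mps'` comment is worth checking against the real trainer.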
--- a/ultralytics/yolo/utils/ops.py
+++ b/ultralytics/yolo/utils/ops.py
@ -1,5 +1,3 @@
-# Ultralytics YOLO 🚀, GPL-3.0 license
-
 import contextlib
 import math
 import re
@ -50,15 +48,15 @@ def coco80_to_coco91_class():  # converts 80-index (val2014) to 91-index (paper)

 def segment2box(segment, width=640, height=640):
    """
-    > Convert 1 segment label to 1 box label, applying inside-image constraint, i.e. (xy1, xy2, ...) to
+    Convert 1 segment label to 1 box label, applying inside-image constraint, i.e. (xy1, xy2, ...) to
    (xyxy)
    Args:
-      segment: the segment label
-      width: the width of the image. Defaults to 640
-      height: The height of the image. Defaults to 640
+      segment (torch.Tensor): the segment label
+      width (int): the width of the image. Defaults to 640
+      height (int): The height of the image. Defaults to 640

    Returns:
-      the minimum and maximum x and y values of the segment.
+      (np.ndarray): the minimum and maximum x and y values of the segment.
    """
    # Convert 1 segment label to 1 box label, applying inside-image constraint, i.e. (xy1, xy2, ...) to (xyxy)
    x, y = segment.T  # segment xy
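The `segment2box` docstring above describes reducing a polygon to an xyxy box while keeping only points inside the image. The hunk stops before the body, so the following is a minimal pure-Python sketch of that idea rather than the repository's implementation: `segment_to_box` is a hypothetical name, it takes a list of `(x, y)` tuples instead of an array, and the all-zero fallback for an empty result is an assumption.

```python
def segment_to_box(points, width=640, height=640):
    """Return (x_min, y_min, x_max, y_max) over the points that lie inside the image."""
    # Inside-image constraint: drop points outside [0, width] x [0, height]
    inside = [(x, y) for x, y in points if 0 <= x <= width and 0 <= y <= height]
    if not inside:  # assumed fallback when no point survives the filter
        return (0, 0, 0, 0)
    xs = [x for x, _ in inside]
    ys = [y for _, y in inside]
    return (min(xs), min(ys), max(xs), max(ys))


# Two of the four points fall outside a 640x640 image and are ignored
print(segment_to_box([(10, 20), (700, 30), (50, 400), (-5, 100)]))
```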
@ -69,17 +67,16 @@ def segment2box(segment, width=640, height=640):

 def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None):
    """
-    > Rescale boxes (xyxy) from img1_shape to img0_shape
+    Rescales bounding boxes (in the format of xyxy) from the shape of the image they were originally specified in (img1_shape) to the shape of a different image (img0_shape).
    Args:
-      img1_shape: The shape of the image that the bounding boxes are for.
-      boxes: the bounding boxes of the objects in the image
-      img0_shape: the shape of the original image
-      ratio_pad: a tuple of (ratio, pad)
+      img1_shape (tuple): The shape of the image that the bounding boxes are for, in the format of (height, width).
+      boxes (torch.tensor): the bounding boxes of the objects in the image, in the format of (x1, y1, x2, y2)
+      img0_shape (tuple): the shape of the target image, in the format of (height, width).
+      ratio_pad (tuple): a tuple of (ratio, pad) for scaling the boxes. If not provided, the ratio and pad will be calculated based on the size difference between the two images.

    Returns:
-      The boxes are being returned.
+      boxes (torch.tensor): The scaled bounding boxes, in the format of (x1, y1, x2, y2)
    """
-    #
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain  = old / new
        pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding
@ -113,7 +110,7 @@ def non_max_suppression(
        nm=0,  # number of masks
 ):
    """
-    > Perform non-maximum suppression (NMS) on a set of boxes, with support for masks and multiple labels per box.
+    Perform non-maximum suppression (NMS) on a set of boxes, with support for masks and multiple labels per box.

    Arguments:
        prediction (torch.Tensor): A tensor of shape (batch_size, num_boxes, num_classes + 4 + num_masks)
@ -134,7 +131,7 @@ def non_max_suppression(
        nm (int): The number of masks output by the model.

    Returns:
-        List[torch.Tensor]: A list of length batch_size, where each element is a tensor of
+        (List[torch.Tensor]): A list of length batch_size, where each element is a tensor of
            shape (num_boxes, 6 + num_masks) containing the kept boxes, with columns
            (x1, y1, x2, y2, confidence, class, mask1, mask2, ...).
    """
@ -231,12 +228,12 @@ def non_max_suppression(

 def clip_boxes(boxes, shape):
    """
-    > It takes a list of bounding boxes and a shape (height, width) and clips the bounding boxes to the
+    It takes a list of bounding boxes and a shape (height, width) and clips the bounding boxes to the
    shape

    Args:
-      boxes: the bounding boxes to clip
-      shape: the shape of the image
+      boxes (torch.tensor): the bounding boxes to clip
+      shape (tuple): the shape of the image
    """
    if isinstance(boxes, torch.Tensor):  # faster individually
        boxes[..., 0].clamp_(0, shape[1])  # x1
@ -262,16 +259,16 @@ def clip_coords(boxes, shape):

 def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
    """
-    > It takes a mask, and resizes it to the original image size
+    Takes a mask, and resizes it to the original image size

    Args:
-      im1_shape: model input shape, [h, w]
-      masks: [h, w, num]
-      im0_shape: the original image shape
-      ratio_pad: the ratio of the padding to the original image.
+      im1_shape (tuple): model input shape, [h, w]
+      masks (torch.tensor): [h, w, num]
+      im0_shape (tuple): the original image shape
+      ratio_pad (tuple): the ratio of the padding to the original image.

    Returns:
-      The masks are being returned.
+      masks (torch.Tensor): The masks resized to the original image size.
    """
    # Rescale coordinates (xyxy) from im1_shape to im0_shape
    if ratio_pad is None:  # calculate from im0_shape
@ -297,14 +294,12 @@ def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):

 def xyxy2xywh(x):
    """
-    > It takes a list of bounding boxes, and converts them from the format [x1, y1, x2, y2] to [x, y, w,
-    h]  where xy1=top-left, xy2=bottom-right
+    Convert bounding box coordinates from (x1, y1, x2, y2) format to (x, y, width, height) format, where (x, y) is the box center.

    Args:
-      x: the input tensor
-
+        x (numpy.ndarray) or (torch.Tensor): The input tensor containing the bounding box coordinates in (x1, y1, x2, y2) format.
    Returns:
-      the center of the box, the width and the height of the box.
+        y (numpy.ndarray) or (torch.Tensor): The bounding box coordinates in (x, y, width, height) format.
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[..., 0] = (x[..., 0] + x[..., 2]) / 2  # x center
@ -316,13 +311,12 @@ def xyxy2xywh(x):

 def xywh2xyxy(x):
    """
-    > It converts the bounding box from x,y,w,h to x1,y1,x2,y2 where xy1=top-left, xy2=bottom-right
+    Convert bounding box coordinates from (x, y, width, height) format to (x1, y1, x2, y2) format where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.

    Args:
-      x: the input tensor
-
+        x (numpy.ndarray) or (torch.Tensor): The input tensor containing the bounding box coordinates in (x, y, width, height) format.
    Returns:
-      the top left and bottom right coordinates of the bounding box.
+        y (numpy.ndarray) or (torch.Tensor): The bounding box coordinates in (x1, y1, x2, y2) format.
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # top left x
@ -334,17 +328,16 @@ def xywh2xyxy(x):

 def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):
    """
-    > It converts the normalized coordinates to the actual coordinates [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
+    Convert normalized bounding box coordinates to pixel coordinates.

    Args:
-      x: the bounding box coordinates
-      w: width of the image. Defaults to 640
-      h: height of the image. Defaults to 640
-      padw: padding width. Defaults to 0
-      padh: height of the padding. Defaults to 0
-
+        x (numpy.ndarray) or (torch.Tensor): The normalized bounding box coordinates in (x, y, width, height) format.
+        w (int): Width of the image. Defaults to 640
+        h (int): Height of the image. Defaults to 640
+        padw (int): Padding width. Defaults to 0
+        padh (int): Padding height. Defaults to 0
    Returns:
-      the xyxy coordinates of the bounding box.
+        y (numpy.ndarray) or (torch.Tensor): The coordinates of the bounding box in the format [x1, y1, x2, y2] where x1,y1 is the top-left corner, x2,y2 is the bottom-right corner of the bounding box.
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[..., 0] = w * (x[..., 0] - x[..., 2] / 2) + padw  # top left x
@ -356,18 +349,16 @@ def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0):

 def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):
    """
-    > It takes in a list of bounding boxes, and returns a list of bounding boxes, but with the x and y
-    coordinates normalized to the width and height of the image
+    Convert bounding box coordinates from (x1, y1, x2, y2) format to normalized (x, y, width, height) format, where x, y, width and height are normalized to the image width and height.

    Args:
-      x: the bounding box coordinates
-      w: width of the image. Defaults to 640
-      h: height of the image. Defaults to 640
-      clip: If True, the boxes will be clipped to the image boundaries. Defaults to False
-      eps: the minimum value of the box's width and height.
-
+        x (numpy.ndarray) or (torch.Tensor): The input tensor containing the bounding box coordinates in (x1, y1, x2, y2) format.
+        w (int): The width of the image. Defaults to 640
+        h (int): The height of the image. Defaults to 640
+        clip (bool): If True, the boxes will be clipped to the image boundaries. Defaults to False
+        eps (float): The minimum value of the box's width and height. Defaults to 0.0
    Returns:
-      the xywhn format of the bounding boxes.
+        y (numpy.ndarray) or (torch.Tensor): The bounding box coordinates in normalized (x, y, width, height) format.
    """
    if clip:
        clip_boxes(x, (h - eps, w - eps))  # warning: inplace clip
@ -381,17 +372,16 @@ def xyxy2xywhn(x, w=640, h=640, clip=False, eps=0.0):

 def xyn2xy(x, w=640, h=640, padw=0, padh=0):
    """
-    > It converts normalized segments into pixel segments of shape (n,2)
+    Convert normalized coordinates to pixel coordinates of shape (n,2)

    Args:
-      x: the normalized coordinates of the bounding box
-      w: width of the image. Defaults to 640
-      h: height of the image. Defaults to 640
-      padw: padding width. Defaults to 0
-      padh: padding height. Defaults to 0
-
+        x (numpy.ndarray) or (torch.Tensor): The input tensor of normalized segment coordinates, shape (n,2)
+        w (int): The width of the image. Defaults to 640
+        h (int): The height of the image. Defaults to 640
+        padw (int): The width of the padding. Defaults to 0
+        padh (int): The height of the padding. Defaults to 0
    Returns:
-      the x and y coordinates of the top left corner of the bounding box.
+        y (numpy.ndarray) or (torch.Tensor): The x and y pixel coordinates of the segment points, shape (n,2)
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[..., 0] = w * x[..., 0] + padw  # top left x
@ -401,13 +391,12 @@ def xyn2xy(x, w=640, h=640, padw=0, padh=0):

 def xywh2ltwh(x):
    """
-    > It converts the bounding box from [x, y, w, h] to [x1, y1, w, h] where xy1=top-left
+    Convert the bounding box format from [x, y, w, h] to [x1, y1, w, h], where x1, y1 are the top-left coordinates.

    Args:
-      x: the x coordinate of the center of the bounding box
-
+        x (numpy.ndarray) or (torch.Tensor): The input tensor with the bounding box coordinates in the xywh format
    Returns:
-      the top left x and y coordinates of the bounding box.
+        y (numpy.ndarray) or (torch.Tensor): The bounding box coordinates in the [x1, y1, w, h] format
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
@ -417,13 +406,12 @@ def xywh2ltwh(x):

 def xyxy2ltwh(x):
    """
-    > Convert nx4 boxes from [x1, y1, x2, y2] to [x1, y1, w, h] where xy1=top-left, xy2=bottom-right
+    Convert nx4 bounding boxes from [x1, y1, x2, y2] to [x1, y1, w, h], where xy1=top-left, xy2=bottom-right

    Args:
-      x: the input tensor
-
+      x (numpy.ndarray) or (torch.Tensor): The input tensor with the bounding box coordinates in the xyxy format
    Returns:
-      the xyxy2ltwh function.
+      y (numpy.ndarray) or (torch.Tensor): The bounding box coordinates in the [x1, y1, w, h] format.
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 2] = x[:, 2] - x[:, 0]  # width
@ -433,10 +421,10 @@ def xyxy2ltwh(x):

 def ltwh2xywh(x):
    """
-    > Convert nx4 boxes from [x1, y1, w, h] to [x, y, w, h] where xy1=top-left, xy=center
+    Convert nx4 boxes from [x1, y1, w, h] to [x, y, w, h] where xy1=top-left, xy=center

    Args:
-      x: the input tensor
+      x (torch.tensor): the input tensor
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = x[:, 0] + x[:, 2] / 2  # center x
@ -446,14 +434,13 @@ def ltwh2xywh(x):

 def ltwh2xyxy(x):
    """
-    > It converts the bounding box from [x1, y1, w, h] to [x1, y1, x2, y2] where xy1=top-left,
-    xy2=bottom-right
+    It converts the bounding box from [x1, y1, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right

    Args:
-      x: the input image
+      x (numpy.ndarray) or (torch.Tensor): the input bounding boxes in [x1, y1, w, h] format

    Returns:
-      the xyxy coordinates of the bounding boxes.
+      y (numpy.ndarray) or (torch.Tensor): the xyxy coordinates of the bounding boxes.
    """
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 2] = x[:, 2] + x[:, 0]  # width
@ -463,14 +450,13 @@ def ltwh2xyxy(x):

 def segments2boxes(segments):
    """
-    > It converts segment labels to box labels, i.e. (cls, xy1, xy2, ...) to (cls, xywh)
+    It converts segment labels to box labels, i.e. (cls, xy1, xy2, ...) to (cls, xywh)

    Args:
-      segments: list of segments, each segment is a list of points, each point is a list of x, y
-    coordinates
+      segments (list): list of segments, each segment is a list of points, each point is a list of x, y coordinates

    Returns:
-      the xywh coordinates of the bounding boxes.
+      (np.ndarray): the xywh coordinates of the bounding boxes.
    """
    boxes = []
    for s in segments:
@ -481,15 +467,14 @@ def segments2boxes(segments):

 def resample_segments(segments, n=1000):
    """
-    > It takes a list of segments (n,2) and returns a list of segments (n,2) where each segment has been
-    up-sampled to n points
+    It takes a list of segments (n,2) and returns a list of segments (n,2) where each segment has been up-sampled to n points

    Args:
-      segments: a list of (n,2) arrays, where n is the number of points in the segment.
-      n: number of points to resample the segment to. Defaults to 1000
+      segments (list): a list of (n,2) arrays, where n is the number of points in the segment.
+      n (int): number of points to resample the segment to. Defaults to 1000

    Returns:
-      the resampled segments.
+      segments (list): the resampled segments.
    """
    for i, s in enumerate(segments):
        s = np.concatenate((s, s[0:1, :]), axis=0)
@ -501,14 +486,14 @@ def resample_segments(segments, n=1000):

 def crop_mask(masks, boxes):
    """
-    > It takes a mask and a bounding box, and returns a mask that is cropped to the bounding box
+    It takes a mask and a bounding box, and returns a mask that is cropped to the bounding box

    Args:
-      masks: [h, w, n] tensor of masks
-      boxes: [n, 4] tensor of bbox coords in relative point form
+      masks (torch.Tensor): [n, h, w] tensor of masks
+      boxes (torch.tensor): [n, 4] tensor of bbox coordinates in relative point form

    Returns:
-      The masks are being cropped to the bounding box.
+      (torch.Tensor): The masks cropped to the bounding boxes.
    """
    n, h, w = masks.shape
    x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1)  # x1 shape(1,1,n)
@ -520,17 +505,17 @@ def crop_mask(masks, boxes):

 def process_mask_upsample(protos, masks_in, bboxes, shape):
    """
-    > It takes the output of the mask head, and applies the mask to the bounding boxes. This produces masks of higher
+    It takes the output of the mask head, and applies the mask to the bounding boxes. This produces masks of higher
    quality but is slower.

    Args:
-      protos: [mask_dim, mask_h, mask_w]
-      masks_in: [n, mask_dim], n is number of masks after nms
-      bboxes: [n, 4], n is number of masks after nms
-      shape: the size of the input image
+      protos (torch.tensor): [mask_dim, mask_h, mask_w]
+      masks_in (torch.tensor): [n, mask_dim], n is number of masks after nms
+      bboxes (torch.tensor): [n, 4], n is number of masks after nms
+      shape (tuple): the size of the input image (h,w)

    Returns:
-      mask
+      (torch.tensor): The upsampled masks.
    """
    c, mh, mw = protos.shape  # CHW
    masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw)
@ -541,17 +526,17 @@ def process_mask_upsample(protos, masks_in, bboxes, shape):

 def process_mask(protos, masks_in, bboxes, shape, upsample=False):
    """
-    > It takes the output of the mask head, and applies the mask to the bounding boxes. This is faster but produces
+    It takes the output of the mask head, and applies the mask to the bounding boxes. This is faster but produces
    downsampled quality of mask

    Args:
-      protos: [mask_dim, mask_h, mask_w]
-      masks_in: [n, mask_dim], n is number of masks after nms
-      bboxes: [n, 4], n is number of masks after nms
-      shape: the size of the input image
+      protos (torch.tensor): [mask_dim, mask_h, mask_w]
+      masks_in (torch.tensor): [n, mask_dim], n is number of masks after nms
+      bboxes (torch.tensor): [n, 4], n is number of masks after nms
+      shape (tuple): the size of the input image (h,w)

    Returns:
-      mask
+      (torch.tensor): The processed masks.
    """

    c, mh, mw = protos.shape  # CHW
@ -572,16 +557,16 @@ def process_mask(protos, masks_in, bboxes, shape, upsample=False):

 def process_mask_native(protos, masks_in, bboxes, shape):
    """
-    > It takes the output of the mask head, and crops it after upsampling to the bounding boxes.
+    It takes the output of the mask head, and crops it after upsampling to the bounding boxes.

    Args:
-      protos: [mask_dim, mask_h, mask_w]
-      masks_in: [n, mask_dim], n is number of masks after nms
-      bboxes: [n, 4], n is number of masks after nms
-      shape: input_image_size, (h, w)
+      protos (torch.tensor): [mask_dim, mask_h, mask_w]
+      masks_in (torch.tensor): [n, mask_dim], n is number of masks after nms
+      bboxes (torch.tensor): [n, 4], n is number of masks after nms
+      shape (tuple): the size of the input image (h,w)

    Returns:
-      masks: [h, w, n]
+      masks (torch.tensor): The returned masks with dimensions [h, w, n]
    """
    c, mh, mw = protos.shape  # CHW
    masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw)
@ -598,17 +583,17 @@ def process_mask_native(protos, masks_in, bboxes, shape):

 def scale_segments(img1_shape, segments, img0_shape, ratio_pad=None, normalize=False):
    """
-    > Rescale segment coords (xyxy) from img1_shape to img0_shape
+    Rescale segment coordinates (xyxy) from img1_shape to img0_shape

    Args:
-      img1_shape: The shape of the image that the segments are from.
-      segments: the segments to be scaled
-      img0_shape: the shape of the image that the segmentation is being applied to
-      ratio_pad: the ratio of the image size to the padded image size.
-      normalize: If True, the coordinates will be normalized to the range [0, 1]. Defaults to False
+      img1_shape (tuple): The shape of the image that the segments are from.
+      segments (torch.tensor): the segments to be scaled
+      img0_shape (tuple): the shape of the image that the segmentation is being applied to
+      ratio_pad (tuple): the ratio of the image size to the padded image size.
+      normalize (bool): If True, the coordinates will be normalized to the range [0, 1]. Defaults to False

    Returns:
-      the segmented image.
+      segments (torch.Tensor): the rescaled segment coordinates.
    """
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain  = old / new
@ -629,11 +614,11 @@ def scale_segments(img1_shape, segments, img0_shape, ratio_pad=None, normalize=F

 def masks2segments(masks, strategy='largest'):
    """
-    > It takes a list of masks(n,h,w) and returns a list of segments(n,xy)
+    It takes a list of masks(n,h,w) and returns a list of segments(n,xy)

    Args:
-      masks: the output of the model, which is a tensor of shape (batch_size, 160, 160)
-      strategy: 'concat' or 'largest'. Defaults to largest
+      masks (torch.tensor): the output of the model, which is a tensor of shape (batch_size, 160, 160)
+      strategy (str): 'concat' or 'largest'. Defaults to largest

    Returns:
      segments (List): list of segment masks
@ -654,12 +639,12 @@ def masks2segments(masks, strategy='largest'):

 def clip_segments(segments, shape):
    """
-    > It takes a list of line segments (x1,y1,x2,y2) and clips them to the image shape (height, width)
+    It takes a list of line segments (x1,y1,x2,y2) and clips them to the image shape (height, width)

    Args:
-      segments: a list of segments, each segment is a list of points, each point is a list of x,y
+      segments (list): a list of segments, each segment is a list of points, each point is a list of x,y
    coordinates
-      shape: the shape of the image
+      shape (tuple): the shape of the image
    """
    if isinstance(segments, torch.Tensor):  # faster individually
        segments[:, 0].clamp_(0, shape[1])  # x
@ -670,5 +655,13 @@ def clip_segments(segments, shape):


 def clean_str(s):
-    # Cleans a string by replacing special characters with underscore _
+    """
+    Cleans a string by replacing special characters with underscore _
+
+    Args:
+      s (str): a string needing special characters replaced
+
+    Returns:
+      (str): a string with special characters replaced by an underscore _
+    """
    return re.sub(pattern="[|@#!¡·$€%&()=?¿^*;:,¨´><+]", repl="_", string=s)
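
The box-format docstrings updated in this diff (`xyxy2xywh`, `xywh2xyxy`) describe a conversion between corner format (x1, y1, x2, y2) and center format (x, y, width, height). As a rough illustration of that convention, here is a minimal NumPy-only sketch. The helper names `xyxy_to_xywh` and `xywh_to_xyxy` are hypothetical stand-ins written for this example, not the functions in `ops.py`, and the sketch assumes (x, y) is the box center, as the `# x center` comment in the diff suggests.

```python
import numpy as np


def xyxy_to_xywh(boxes):
    """Hypothetical helper: corner format (x1, y1, x2, y2) -> center format (x, y, w, h)."""
    boxes = np.asarray(boxes, dtype=float)
    out = np.empty_like(boxes)
    out[..., 0] = (boxes[..., 0] + boxes[..., 2]) / 2  # x center
    out[..., 1] = (boxes[..., 1] + boxes[..., 3]) / 2  # y center
    out[..., 2] = boxes[..., 2] - boxes[..., 0]  # width
    out[..., 3] = boxes[..., 3] - boxes[..., 1]  # height
    return out


def xywh_to_xyxy(boxes):
    """Hypothetical helper: center format (x, y, w, h) -> corner format (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=float)
    out = np.empty_like(boxes)
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2  # top-left x
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2  # top-left y
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2  # bottom-right x
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2  # bottom-right y
    return out


# One box with corners (10, 20) and (50, 80)
boxes_xyxy = np.array([[10.0, 20.0, 50.0, 80.0]])
boxes_xywh = xyxy_to_xywh(boxes_xyxy)  # center (30, 50), size 40 x 60
round_trip = xywh_to_xyxy(boxes_xywh)  # back to the original corners
```

Both helpers return a new array instead of modifying the input, which matches the copy-then-assign pattern (`x.clone()` / `np.copy(x)`) visible in the diff.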