ultralytics 8.0.122 Fix torch.Tensor inference (#3363)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: krzysztof.gonia <4281421+kgonia@users.noreply.github.com>
2023-06-25 01:36:07 +02:00
parent 51d8cfa9c3
commit 682c9ef70f
16 changed files with 471 additions and 154 deletions
--- a/docs/modes/predict.md
+++ b/docs/modes/predict.md
@ -1,6 +1,6 @@
 ---
 comments: true
-description: Get started with YOLOv8 Predict mode and input sources. Accepts various input sources such as images, videos, and directories.
+description: Get started with YOLOv8 Predict mode and input sources. Accepts various input sources such as images, videos, and directories.
 keywords: YOLOv8, predict mode, generator, streaming mode, input sources, video formats, arguments customization
 ---

@ -12,60 +12,279 @@ passing `stream=True` in the predictor's call method.

 !!! example "Predict"

-    === "Return a list with `Stream=False`"
+    === "Return a list with `stream=False`"
        ```python
-        inputs = [img, img]  # list of numpy arrays
-        results = model(inputs)  # list of Results objects
+        from ultralytics import YOLO
+
+        # Load a model
+        model = YOLO('yolov8n.pt')  # pretrained YOLOv8n model
+
+        # Run batched inference on a list of images
+        results = model(['im1.jpg', 'im2.jpg'])  # return a list of Results objects
        
+        # Process results list
        for result in results:
            boxes = result.boxes  # Boxes object for bbox outputs
            masks = result.masks  # Masks object for segmentation masks outputs
+            keypoints = result.keypoints  # Keypoints object for pose outputs
            probs = result.probs  # Class probabilities for classification outputs
        ```

-    === "Return a generator with `Stream=True`"
+    === "Return a generator with `stream=True`"
        ```python
-        inputs = [img, img]  # list of numpy arrays
-        results = model(inputs, stream=True)  # generator of Results objects
+        from ultralytics import YOLO
+
+        # Load a model
+        model = YOLO('yolov8n.pt')  # pretrained YOLOv8n model
+
+        # Run batched inference on a list of images
+        results = model(['im1.jpg', 'im2.jpg'], stream=True)  # return a generator of Results objects
        
+        # Process results generator
        for result in results:
            boxes = result.boxes  # Boxes object for bbox outputs
            masks = result.masks  # Masks object for segmentation masks outputs
+            keypoints = result.keypoints  # Keypoints object for pose outputs
            probs = result.probs  # Class probabilities for classification outputs
        ```

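The difference between the two tabs above comes down to ordinary Python list-versus-generator behavior. The sketch below is not Ultralytics code: `fake_infer` and `predict` are hypothetical stand-ins, used only to illustrate why a generator keeps one result alive at a time while a list holds them all.

```python
from typing import Iterable, Iterator, List, Union


def fake_infer(frame: str) -> dict:
    """Hypothetical stand-in for running a model on a single frame."""
    return {"source": frame, "n_chars": len(frame)}


def _stream(frames: Iterable[str]) -> Iterator[dict]:
    """Yield one result at a time; nothing runs until the caller iterates."""
    for frame in frames:
        yield fake_infer(frame)


def predict(frames: Iterable[str], stream: bool = False) -> Union[List[dict], Iterator[dict]]:
    """Hypothetical wrapper that mimics the list/generator switch."""
    results = _stream(frames)
    return results if stream else list(results)


frames = ["im1.jpg", "im2.jpg", "im3.jpg"]

eager = predict(frames)              # list: all three results are held in memory
lazy = predict(frames, stream=True)  # generator: no frame has been processed yet

first = next(lazy)                   # processes only the first frame
remaining = sum(1 for _ in lazy)     # consumes the rest, one result at a time
```

One consequence of this design is that a generator can be iterated only once; if the results need to be revisited, they have to be collected into a list, which gives up the memory saving.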
+## Inference Sources
+
+YOLOv8 can run inference on the input sources listed in the table below, including static images, video files, live streams, and in-memory data formats. Sources marked ✅ can be used in streaming mode with the argument `stream=True`. Streaming mode is beneficial for videos and live streams because it returns a generator of results instead of accumulating the results for every frame in memory.
+
 !!! tip "Tip"

-    Streaming mode with `stream=True` should be used for long videos or large predict sources, otherwise results will accumuate in memory and will eventually cause out-of-memory errors. 
+    Use `stream=True` for processing long videos or large datasets to efficiently manage memory. When `stream=False`, the results for all frames or data points are stored in memory, which can quickly add up and cause out-of-memory errors for large inputs. In contrast, `stream=True` utilizes a generator, which only keeps the results of the current frame or data point in memory, significantly reducing memory consumption and preventing out-of-memory issues.

-## Sources
+| Source      | Argument                                   | Type                                  | Notes                                                                      |
+|-------------|--------------------------------------------|---------------------------------------|----------------------------------------------------------------------------|
+| image       | `'image.jpg'`                              | `str` or `Path`                       | Single image file.                                                         |
+| URL         | `'https://ultralytics.com/images/bus.jpg'` | `str`                                 | URL to an image.                                                           |
+| screenshot  | `'screen'`                                 | `str`                                 | Capture a screenshot.                                                      |
+| PIL         | `Image.open('im.jpg')`                     | `PIL.Image`                           | HWC format with RGB channels.                                              |
+| OpenCV      | `cv2.imread('im.jpg')`                     | `np.ndarray` of `uint8 (0-255)`       | HWC format with BGR channels.                                              |
+| numpy       | `np.zeros((640,1280,3), dtype=np.uint8)`   | `np.ndarray` of `uint8 (0-255)`       | HWC format with BGR channels.                                              |
+| torch       | `torch.zeros(16,3,320,640)`                | `torch.Tensor` of `float32 (0.0-1.0)` | BCHW format with RGB channels.                                             |
+| CSV         | `'sources.csv'`                            | `str` or `Path`                       | CSV file containing paths to images, videos, or directories.               |
+| video ✅     | `'video.mp4'`                              | `str` or `Path`                       | Video file in formats like MP4, AVI, etc.                                  |
+| directory ✅ | `'path/'`                                  | `str` or `Path`                       | Path to a directory containing images or videos.                           |
+| glob ✅      | `'path/*.jpg'`                             | `str`                                 | Glob pattern to match multiple files. Use the `*` character as a wildcard. |
+| YouTube ✅   | `'https://youtu.be/Zgi9g1ksQHc'`           | `str`                                 | URL to a YouTube video.                                                    |
+| stream ✅    | `'rtsp://example.com/media.mp4'`           | `str`                                 | URL for streaming protocols such as RTSP, RTMP, or an IP address.          |

-YOLOv8 can accept various input sources, as shown in the table below. This includes images, URLs, PIL images, OpenCV,
-numpy arrays, torch tensors, CSV files, videos, directories, globs, YouTube videos, and streams. The table indicates
-whether each source can be used in streaming mode with `stream=True` ✅ and an example argument for each source.
+Below are code examples for using each source type:

-| source      | model(arg)                                 | type           | notes            |
-|-------------|--------------------------------------------|----------------|------------------|
-| image       | `'im.jpg'`                                 | `str`, `Path`  |                  |
-| URL         | `'https://ultralytics.com/images/bus.jpg'` | `str`          |                  |
-| screenshot  | `'screen'`                                 | `str`          |                  |
-| PIL         | `Image.open('im.jpg')`                     | `PIL.Image`    | HWC, RGB         |
-| OpenCV      | `cv2.imread('im.jpg')`                     | `np.ndarray`   | HWC, BGR         |
-| numpy       | `np.zeros((640,1280,3))`                   | `np.ndarray`   | HWC              |
-| torch       | `torch.zeros(16,3,320,640)`                | `torch.Tensor` | BCHW, RGB        |
-| CSV         | `'sources.csv'`                            | `str`, `Path`  | RTSP, RTMP, HTTP |         
-| video ✅     | `'vid.mp4'`                                | `str`, `Path`  |                  |
-| directory ✅ | `'path/'`                                  | `str`, `Path`  |                  |
-| glob ✅      | `'path/*.jpg'`                             | `str`          | Use `*` operator |
-| YouTube ✅   | `'https://youtu.be/Zgi9g1ksQHc'`           | `str`          |                  |
-| stream ✅    | `'rtsp://example.com/media.mp4'`           | `str`          | RTSP, RTMP, HTTP |
+!!! example "Prediction sources"

-## Arguments
+    === "image"
+        Run inference on an image file. 
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define path to the image file
+        source = 'path/to/image.jpg'
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "screenshot"
+        Run inference on the current screen content as a screenshot.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define current screenshot as source
+        source = 'screen'
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "URL"
+        Run inference on an image or video hosted remotely via URL.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define remote image or video URL
+        source = 'https://ultralytics.com/images/bus.jpg'
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "PIL"
+        Run inference on an image opened with Python Imaging Library (PIL).
+        ```python
+        from PIL import Image
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Open an image using PIL
+        source = Image.open('path/to/image.jpg')
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "OpenCV"
+        Run inference on an image read with OpenCV.
+        ```python
+        import cv2
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Read an image using OpenCV
+        source = cv2.imread('path/to/image.jpg')
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "numpy"
+        Run inference on an image represented as a numpy array.
+        ```python
+        import numpy as np
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
+        source = np.random.randint(low=0, high=256, size=(640, 640, 3), dtype='uint8')  # `high` is exclusive
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "torch"
+        Run inference on an image represented as a PyTorch tensor.
+        ```python
+        import torch
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
+        source = torch.rand(1, 3, 640, 640, dtype=torch.float32)
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "CSV"
+        Run inference on a collection of images, URLs, videos and directories listed in a CSV file.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define a path to a CSV file with images, URLs, videos and directories
+        source = 'path/to/file.csv'
+    
+        # Run inference on the source
+        results = model(source)  # list of Results objects
+        ```
+    
+    === "video"
+        Run inference on a video file. By using `stream=True`, you can create a generator of Results objects to reduce memory usage.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define path to video file
+        source = 'path/to/video.mp4'
+    
+        # Run inference on the source
+        results = model(source, stream=True)  # generator of Results objects
+        ```
+    
+    === "directory"
+        Run inference on all images and videos in a directory. To also capture images and videos in subdirectories, use a glob pattern, e.g. `path/to/dir/**/*`.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define path to directory containing images and videos for inference
+        source = 'path/to/dir'
+    
+        # Run inference on the source
+        results = model(source, stream=True)  # generator of Results objects
+        ```
+    
+    === "glob"
+        Run inference on all images and videos that match a glob expression with `*` characters.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define a glob search for all JPG files in a directory
+        source = 'path/to/dir/*.jpg'
+    
+        # OR define a recursive glob search for all JPG files including subdirectories
+        source = 'path/to/dir/**/*.jpg'
+    
+        # Run inference on the source
+        results = model(source, stream=True)  # generator of Results objects
+        ```
+    
+    === "YouTube"
+        Run inference on a YouTube video. By using `stream=True`, you can create a generator of Results objects to reduce memory usage for long videos.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define source as YouTube video URL
+        source = 'https://youtu.be/Zgi9g1ksQHc'
+    
+        # Run inference on the source
+        results = model(source, stream=True)  # generator of Results objects
+        ```
+    
+    === "stream"
+        Run inference on remote streaming sources using the RTSP or RTMP protocols, or an IP address.
+        ```python
+        from ultralytics import YOLO
+    
+        # Load a pretrained YOLOv8n model
+        model = YOLO('yolov8n.pt')
+    
+        # Define source as RTSP, RTMP or IP streaming address
+        source = 'rtsp://example.com/media.mp4'
+    
+        # Run inference on the source
+        results = model(source, stream=True)  # generator of Results objects
+        ```
+
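The directory and glob tabs above distinguish a flat pattern from a recursive one. The following sketch uses only Python's standard `glob` module on a throwaway directory tree to show which files each pattern style matches. It assumes that `source` patterns follow the same `*` and `**` semantics as `glob.glob(..., recursive=True)`, which this diff does not state explicitly; the file names are made up for illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: two files at the top level, one in a subdirectory
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.jpg", "b.png", os.path.join("sub", "c.jpg")):
        open(os.path.join(root, rel), "w").close()

    # Flat pattern: JPG files directly inside the directory only
    flat = sorted(glob.glob(os.path.join(root, "*.jpg")))

    # Recursive pattern: `**` matches zero or more nested directories
    recursive = sorted(glob.glob(os.path.join(root, "**", "*.jpg"), recursive=True))

    # Store paths relative to the temporary root for readability
    flat_names = [os.path.relpath(p, root) for p in flat]
    recursive_names = [os.path.relpath(p, root) for p in recursive]

print(flat_names)
print(recursive_names)
```

Under that assumption, the flat pattern skips `sub/c.jpg`, while the recursive pattern picks up JPG files at every depth, including the top level.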
+## Inference Arguments

 `model.predict` accepts multiple arguments that control the prediction operation. These arguments can be passed directly to `model.predict`:
 !!! example

-    ```
+    ```python
    model.predict(source, save=True, imgsz=320, conf=0.5)
    ```

@ -97,12 +316,12 @@ All supported arguments:

 ## Image and Video Formats

-YOLOv8 supports various image and video formats, as specified
-in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). See the
-tables below for the valid suffixes and example predict commands.
+YOLOv8 supports various image and video formats, as specified in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). See the tables below for the valid suffixes and example predict commands.

 ### Image Suffixes

+The table below lists the valid Ultralytics image formats.
+
 | Image Suffixes | Example Predict Command          | Reference                                                                     |
 |----------------|----------------------------------|-------------------------------------------------------------------------------|
 | .bmp           | `yolo predict source=image.bmp`  | [Microsoft BMP File Format](https://en.wikipedia.org/wiki/BMP_file_format)    |
@ -118,6 +337,8 @@ tables below for the valid suffixes and example predict commands.

 ### Video Suffixes

+The table below lists the valid Ultralytics video formats.
+
 | Video Suffixes | Example Predict Command          | Reference                                                                        |
 |----------------|----------------------------------|----------------------------------------------------------------------------------|
 | .asf           | `yolo predict source=video.asf`  | [Advanced Systems Format](https://en.wikipedia.org/wiki/Advanced_Systems_Format) |