Add YOLOv5 tutorials to docs.ultralytics.com (#1657)

Co-authored-by: ayush chaurasia <ayush.chaurarsia@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sergiu Waxmann <47978446+sergiuwaxmann@users.noreply.github.com>
2023-03-29 22:56:52 +02:00
parent ec10002a4a
commit ccb6419835
29 changed files with 3656 additions and 278 deletions
--- a/docs/modes/index.md
+++ b/docs/modes/index.md
@ -1,4 +1,4 @@
-# YOLOv8 Modes
+# Ultralytics YOLOv8 Modes

 <img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">

--- a/docs/modes/predict.md
+++ b/docs/modes/predict.md
@ -1,63 +1,66 @@
 <img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">

-Inference or prediction of a task returns a list of `Results` objects. Alternatively, in the streaming mode, it returns
-a generator of `Results` objects which is memory efficient. Streaming mode can be enabled by passing `stream=True` in
-predictor's call method.
+YOLOv8 **predict mode** can generate predictions for various tasks, returning either a list of `Results` objects or a
+memory-efficient generator of `Results` objects when using the streaming mode. Enable streaming mode by
+passing `stream=True` in the predictor's call method.

 !!! example "Predict"

-    === "Return a List"
+    === "Return a list with `stream=False`"
+        ```python
+        inputs = [img, img]  # list of numpy arrays
+        results = model(inputs)  # list of Results objects
+        
+        for result in results:
+            boxes = result.boxes  # Boxes object for bbox outputs
+            masks = result.masks  # Masks object for segmentation masks outputs
+            probs = result.probs  # Class probabilities for classification outputs
+        ```

-    ```python
-    inputs = [img, img]  # list of np arrays
-    results = model(inputs)  # List of Results objects
-    
-    for result in results:
-        boxes = result.boxes  # Boxes object for bbox outputs
-        masks = result.masks  # Masks object for segmenation masks outputs
-        probs = result.probs  # Class probabilities for classification outputs
-    ```
-    
-    === "Return a Generator"
+    === "Return a generator with `stream=True`"
+        ```python
+        inputs = [img, img]  # list of numpy arrays
+        results = model(inputs, stream=True)  # generator of Results objects
+        
+        for result in results:
+            boxes = result.boxes  # Boxes object for bbox outputs
+            masks = result.masks  # Masks object for segmentation masks outputs
+            probs = result.probs  # Class probabilities for classification outputs
+        ```

-    ```python
-    inputs = [img, img]  # list of numpy arrays
-    results = model(inputs, stream=True)  # generator of Results objects
-    
-    for r in results:
-        boxes = r.boxes  # Boxes object for bbox outputs
-        masks = r.masks  # Masks object for segmenation masks outputs
-        probs = r.probs  # Class probabilities for classification outputs
-    ```
+!!! tip "Tip"
+
+    Use streaming mode with `stream=True` for long videos or large predict sources; otherwise results will accumulate in memory and can eventually cause out-of-memory errors.

 ## Sources

-YOLOv8 can run inference on a variety of sources. The table below lists the various sources that can be used as input
-for YOLOv8, along with the required format and notes. Sources include images, URLs, PIL images, OpenCV, numpy arrays,
-torch tensors, CSV files, videos, directories, globs, YouTube videos, and streams. The table also indicates whether each
-source can be used as a stream and the model argument required for that source.
+YOLOv8 can accept various input sources, as shown in the table below. These include images, URLs, PIL images, OpenCV
+arrays, numpy arrays, torch tensors, CSV files, videos, directories, globs, YouTube videos, and streams. Sources marked
+with ✅ can be used in streaming mode with `stream=True`, and the table gives an example argument for each source.

-| source     | stream  | model(arg)                                 | type           | notes            |
-|------------|---------|--------------------------------------------|----------------|------------------|
-| image      |         | `'im.jpg'`                                 | `str`, `Path`  |                  |
-| URL        |         | `'https://ultralytics.com/images/bus.jpg'` | `str`          |                  |
-| screenshot |         | `'screen'`                                 | `str`          |                  |
-| PIL        |         | `Image.open('im.jpg')`                     | `PIL.Image`    | HWC, RGB         |
-| OpenCV     |         | `cv2.imread('im.jpg')[:,:,::-1]`           | `np.ndarray`   | HWC, BGR to RGB  |
-| numpy      |         | `np.zeros((640,1280,3))`                   | `np.ndarray`   | HWC              |
-| torch      |         | `torch.zeros(16,3,320,640)`                | `torch.Tensor` | BCHW, RGB        |
-| CSV        |         | `'sources.csv'`                            | `str`, `Path`  | RTSP, RTMP, HTTP |         
-| video      | &check; | `'vid.mp4'`                                | `str`, `Path`  |                  |
-| directory  | &check; | `'path/'`                                  | `str`, `Path`  |                  |
-| glob       | &check; | `'path/*.jpg'`                             | `str`          | Use `*` operator |
-| YouTube    | &check; | `'https://youtu.be/Zgi9g1ksQHc'`           | `str`          |                  |
-| stream     | &check; | `'rtsp://example.com/media.mp4'`           | `str`          | RTSP, RTMP, HTTP |
+| source      | model(arg)                                 | type           | notes            |
+|-------------|--------------------------------------------|----------------|------------------|
+| image       | `'im.jpg'`                                 | `str`, `Path`  |                  |
+| URL         | `'https://ultralytics.com/images/bus.jpg'` | `str`          |                  |
+| screenshot  | `'screen'`                                 | `str`          |                  |
+| PIL         | `Image.open('im.jpg')`                     | `PIL.Image`    | HWC, RGB         |
+| OpenCV      | `cv2.imread('im.jpg')[:,:,::-1]`           | `np.ndarray`   | HWC, BGR to RGB  |
+| numpy       | `np.zeros((640,1280,3))`                   | `np.ndarray`   | HWC              |
+| torch       | `torch.zeros(16,3,320,640)`                | `torch.Tensor` | BCHW, RGB        |
+| CSV         | `'sources.csv'`                            | `str`, `Path`  | RTSP, RTMP, HTTP |         
+| video ✅     | `'vid.mp4'`                                | `str`, `Path`  |                  |
+| directory ✅ | `'path/'`                                  | `str`, `Path`  |                  |
+| glob ✅      | `'path/*.jpg'`                             | `str`          | Use `*` operator |
+| YouTube ✅   | `'https://youtu.be/Zgi9g1ksQHc'`           | `str`          |                  |
+| stream ✅    | `'rtsp://example.com/media.mp4'`           | `str`          | RTSP, RTMP, HTTP |

-## Image Formats
+## Image and Video Formats

-For images, YOLOv8 supports a variety of image formats defined
-in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). The
-following suffixes are valid for images:
+YOLOv8 supports various image and video formats, as specified
+in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). See the
+tables below for the valid suffixes and example predict commands.
+
+### Image Suffixes

 | Image Suffixes | Example Predict Command          | Reference                                                                     |
 |----------------|----------------------------------|-------------------------------------------------------------------------------|
@ -72,11 +75,7 @@ following suffixes are valid for images:
 | .webp          | `yolo predict source=image.webp` | [WebP](https://en.wikipedia.org/wiki/WebP)                                    |
 | .pfm           | `yolo predict source=image.pfm`  | [Portable FloatMap](https://en.wikipedia.org/wiki/Netpbm#File_formats)        |

-## Video Formats
-
-For videos, YOLOv8 also supports a variety of video formats defined
-in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). The
-following suffixes are valid for videos:
+### Video Suffixes

 | Video Suffixes | Example Predict Command          | Reference                                                                        |
 |----------------|----------------------------------|----------------------------------------------------------------------------------|
@ -95,68 +94,78 @@ following suffixes are valid for videos:

 ## Working with Results

-Results object consists of these component objects:
+The `Results` object contains the following components:

- `Results.boxes`: `Boxes` object with properties and methods for manipulating bboxes
- `Results.masks`: `Masks` object used to index masks or to get segment coordinates.
- `Results.probs`: `torch.Tensor` containing the class probabilities/logits.
- `Results.orig_img`: Original image loaded in memory.
- `Results.path`: `Path` containing the path to input image
+- `Results.boxes`: `Boxes` object with properties and methods for manipulating bounding boxes
+- `Results.masks`: `Masks` object for indexing masks or getting segment coordinates
+- `Results.probs`: `torch.Tensor` containing class probabilities or logits
+- `Results.orig_img`: Original image loaded in memory
+- `Results.path`: `Path` containing the path to the input image

-Each result is composed of torch.Tensor by default, in which you can easily use following functionality:
+Each result is stored as `torch.Tensor` data by default, so it can be moved between devices or converted to numpy:

-```python
-results = results.cuda()
-results = results.cpu()
-results = results.to("cpu")
-results = results.numpy()
-```
+!!! example "Results"
+
+    ```python
+    results = results.cuda()
+    results = results.cpu()
+    results = results.to('cpu')
+    results = results.numpy()
+    ```

 ### Boxes

-`Boxes` object can be used index, manipulate and convert bboxes to different formats. The box format conversion
-operations are cached, which means they're only calculated once per object and those values are reused for future calls.
+The `Boxes` object can be used to index, manipulate, and convert bounding boxes to different formats. Box format
+conversion operations are cached, meaning they're only calculated once per object, and those values are reused for future calls.

- Indexing a `Boxes` objects returns a `Boxes` object
+- Indexing a `Boxes` object returns a `Boxes` object:

-```python
-results = model(inputs)
-boxes = results[0].boxes
-box = boxes[0]  # returns one box
-box.xyxy 
-```
+!!! example "Boxes"
+
+    ```python
+    results = model(img)
+    boxes = results[0].boxes
+    box = boxes[0]  # returns one box
+    box.xyxy
+    ```

 - Properties and conversions

-```python
-boxes.xyxy  # box with xyxy format, (N, 4)
-boxes.xywh  # box with xywh format, (N, 4)
-boxes.xyxyn  # box with xyxy format but normalized, (N, 4)
-boxes.xywhn  # box with xywh format but normalized, (N, 4)
-boxes.conf  # confidence score, (N, 1)
-boxes.cls  # cls, (N, 1)
-boxes.data  # raw bboxes tensor, (N, 6) or boxes.boxes .
-```
+!!! example "Boxes Properties"
+
+    ```python
+    boxes.xyxy  # box with xyxy format, (N, 4)
+    boxes.xywh  # box with xywh format, (N, 4)
+    boxes.xyxyn  # box with xyxy format but normalized, (N, 4)
+    boxes.xywhn  # box with xywh format but normalized, (N, 4)
+    boxes.conf  # confidence score, (N, 1)
+    boxes.cls  # cls, (N, 1)
+    boxes.data  # raw bboxes tensor, (N, 6) or boxes.boxes
+    ```

 ### Masks

 The `Masks` object can be used to index, manipulate, and convert masks to segments. The segment conversion operation is cached.

-```python
-results = model(inputs)
-masks = results[0].masks  # Masks object
-masks.segments  # bounding coordinates of masks, List[segment] * N
-masks.data  # raw masks tensor, (N, H, W) or masks.masks 
-```
+!!! example "Masks"
+
+    ```python
+    results = model(inputs)
+    masks = results[0].masks  # Masks object
+    masks.segments  # bounding coordinates of masks, List[segment] * N
+    masks.data  # raw masks tensor, (N, H, W) or masks.masks 
+    ```

 ### probs

 `probs` attribute of `Results` class is a `Tensor` containing class probabilities of a classification operation.

-```python
-results = model(inputs)
-results[0].probs  # cls prob, (num_class, )
-```
+!!! example "Probs"
+
+    ```python
+    results = model(inputs)
+    results[0].probs  # cls prob, (num_class, )
+    ```

 Class reference documentation for `Results` module and its components can be found [here](../reference/results.md)

@ -165,16 +174,14 @@ Class reference documentation for `Results` module and its components can be fou
 You can use the `plot()` function of a `Results` object to plot results on an image object. It plots all components (boxes,
 masks, classification logits, etc.) found in the results object.

-```python
-res = model(img)
-res_plotted = res[0].plot()
-cv2.imshow("result", res_plotted)
-```
+!!! example "Plotting"

-!!! example "`plot()` arguments"
+    ```python
+    res = model(img)
+    res_plotted = res[0].plot()
+    cv2.imshow("result", res_plotted)
+    ```

-    `show_conf (bool)`: Show confidence
-
-    `line_width (Float)`: The line width of boxes. Automatically scaled to img size if not provided
-
-    `font_size (Float)`: The font size of . Automatically scaled to img size if not provided
+- `show_conf (bool)`: Show confidence scores
+- `line_width (float)`: The line width of boxes. Automatically scaled to the image size if not provided
+- `font_size (float)`: The font size of the text. Automatically scaled to the image size if not provided
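
Note on the streaming tip added in `docs/modes/predict.md`: the difference between the list and generator return types can be illustrated without the model at all. The sketch below is not Ultralytics code; `predict_all` and `predict_stream` are hypothetical stand-ins for `model(source)` and `model(source, stream=True)`, and the string results stand in for `Results` objects. It only shows why a generator keeps one result alive at a time while a list holds them all.

```python
from typing import Iterator, List


def predict_all(frames: List[str]) -> List[str]:
    """Hypothetical stand-in for model(source): builds every result up front."""
    return [f"result for {frame}" for frame in frames]


def predict_stream(frames: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for model(source, stream=True): yields one result at a time."""
    for frame in frames:
        yield f"result for {frame}"


frames = [f"frame{i}" for i in range(3)]

listed = predict_all(frames)       # all results are held in memory at once
streamed = predict_stream(frames)  # nothing is computed until iteration starts

first = next(streamed)             # only this one result has been produced so far
remaining = list(streamed)         # consuming the rest exhausts the generator
```

A generator can be iterated only once, so code that needs to revisit results (for example `results[0]`) should use the list form or collect the needed values inside the loop.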