ultralytics 8.0.122 Fix torch.Tensor inference (#3363)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: krzysztof.gonia <4281421+kgonia@users.noreply.github.com>
2023-06-25 01:36:07 +02:00
parent 51d8cfa9c3
commit 682c9ef70f
16 changed files with 471 additions and 154 deletions
--- a/docs/datasets/classify/index.md
+++ b/docs/datasets/classify/index.md
@ -102,4 +102,19 @@ In this example, the `train` directory contains subdirectories for each class in

 ## Supported Datasets

-TODO
+Ultralytics supports the following datasets with automatic download:
+
+* [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
+* [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
+* [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
+* [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
+* [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
+* [ImageNet](imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
+* [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
+* [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
+* [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
+* [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
+
+### Adding your own dataset
+
+If you have your own dataset and would like to use it for training classification models with Ultralytics, ensure that it follows the format specified above under "Dataset format" and then point your `data` argument to the dataset directory.
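As a rough illustration of the note on adding your own classification dataset in `docs/datasets/classify/index.md`, the sketch below counts image files per class folder before the directory is passed as `data`. It assumes the `root/train/<class>/` layout implied by the "Dataset format" description; the helper name `summarize_split` and the extension list are hypothetical and not part of Ultralytics.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def summarize_split(root, split="train"):
    """Return {class_name: image_count} for root/<split>/<class_name>/ folders."""
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"Missing split directory: {split_dir}")
    counts = {}
    # Each immediate subdirectory of the split is treated as one class.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

A class that reports zero images, or a missing split directory, is usually a sign that the layout does not match what the training code expects.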
--- a/docs/datasets/detect/index.md
+++ b/docs/datasets/detect/index.md
@ -1,81 +1,53 @@
 ---
 comments: true
-description: Learn about supported dataset formats for training YOLO detection models, including Ultralytics YOLO and COCO, in this Object Detection Datasets Overview.
-keywords: object detection, datasets, formats, Ultralytics YOLO, label format, dataset file format, dataset definition, YOLO dataset, model configuration
+description: Explore supported dataset formats for training YOLO detection models, including Ultralytics YOLO and COCO. This guide covers various dataset formats and their specific configurations for effective object detection training.
+keywords: object detection, datasets, formats, Ultralytics YOLO, COCO, label format, dataset file format, dataset definition, YOLO dataset, model configuration
 ---

 # Object Detection Datasets Overview

+Training a robust and accurate object detection model requires a comprehensive dataset. This guide introduces various formats of datasets that are compatible with the Ultralytics YOLO model and provides insights into their structure, usage, and how to convert between different formats.
+
 ## Supported Dataset Formats

 ### Ultralytics YOLO format

-** Label Format **
-
-The dataset format used for training YOLO detection models is as follows:
-
-1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
-2. One row per object: Each row in the text file corresponds to one object instance in the image.
-3. Object information per row: Each row contains the following information about the object instance:
-    - Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
-    - Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
-    - Object width and height: The width and height of the object, normalized to be between 0 and 1.
-
-The format for a single row in the detection dataset file is as follows:
-
-```
-<object-class> <x> <y> <width> <height>
-```
-
-Here is an example of the YOLO dataset format for a single image with two object instances:
-
-```
-0 0.5 0.4 0.3 0.6
-1 0.3 0.7 0.4 0.2
-```
-
-In this example, the first object is of class 0 (person), with its center at (0.5, 0.4), width of 0.3, and height of 0.6. The second object is of class 1 (car), with its center at (0.3, 0.7), width of 0.4, and height of 0.2.
-
-** Dataset file format **
-
-The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Detection Models. Here is an example of the YAML format used for defining a detection dataset:
+The Ultralytics YOLO format is a dataset configuration format that allows you to define the dataset root directory, the relative paths to training/validation/testing image directories or *.txt files containing image paths, and a dictionary of class names. Here is an example:

 ```yaml
-train: <path-to-training-images>
-val: <path-to-validation-images>
+# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
+path: ../datasets/coco128  # dataset root dir
+train: images/train2017  # train images (relative to 'path') 128 images
+val: images/train2017  # val images (relative to 'path') 128 images
+test:  # test images (optional)

-nc: <number-of-classes>
-names: [<class-1>, <class-2>, ..., <class-n>]
-```
-
-The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
-
-The `nc` field specifies the number of object classes in the dataset.
-
-The `names` field is a list of the names of the object classes. The order of the names should match the order of the object class indices in the YOLO dataset files.
-
-NOTE: Either `nc` or `names` must be defined. Defining both are not mandatory
-
-Alternatively, you can directly define class names like this:
-
-```yaml
+# Classes (80 COCO classes)
 names:
  0: person
  1: bicycle
+  2: car
+  ...
+  77: teddy bear
+  78: hair drier
+  79: toothbrush
 ```

-** Example **
+Labels for this format should be exported to YOLO format with one `*.txt` file per image. If there are no objects in an image, no `*.txt` file is required. The `*.txt` file should be formatted with one row per object in `class x_center y_center width height` format. Box coordinates must be in **normalized xywh** format (from 0 - 1). If your boxes are in pixels, you should divide `x_center` and `width` by image width, and `y_center` and `height` by image height. Class numbers should be zero-indexed (start with 0).

-```yaml
-train: data/train/
-val: data/val/
+<p align="center"><img width="750" src="https://user-images.githubusercontent.com/26833433/91506361-c7965000-e886-11ea-8291-c72b98c25eec.jpg"></p>

-nc: 2
-names: ['person', 'car']
-```
+The label file corresponding to the above image contains 2 persons (class `0`) and a tie (class `27`):
+
+<p align="center"><img width="428" src="https://user-images.githubusercontent.com/26833433/112467037-d2568c00-8d66-11eb-8796-55402ac0d62f.png"></p>
+
+When using the Ultralytics YOLO format, organize your training and validation images and labels as shown in the example below.
+
+<p align="center"><img width="700" src="https://user-images.githubusercontent.com/26833433/134436012-65111ad1-9541-4853-81a6-f19a3468b75f.png"></p>

 ## Usage

+Here's how you can use these formats to train your model:
+
 !!! example ""

    === "Python"
@ -98,14 +70,34 @@ names: ['person', 'car']

 ## Supported Datasets

-TODO
+Here is a list of the supported datasets and a brief description for each:

-## Port or Convert label formats
+- [**Argoverse**](./argoverse.md): A collection of sensor data collected from autonomous vehicles. It contains 3D tracking annotations for car objects.
+- [**COCO**](./coco.md): Common Objects in Context (COCO) is a large-scale object detection, segmentation, and captioning dataset with 80 object categories.
+- [**COCO8**](./coco8.md): A smaller subset of the COCO dataset, COCO8 is more lightweight and faster to train.
+- [**GlobalWheat2020**](./globalwheat2020.md): A dataset containing images of wheat heads for the Global Wheat Challenge 2020.
+- [**Objects365**](./objects365.md): A large-scale object detection dataset with 365 object categories and 600k images, aimed at advancing object detection research.
+- [**SKU-110K**](./sku-110k.md): A dataset containing images of densely packed retail products, intended for retail environment object detection.
+- [**VisDrone**](./visdrone.md): A dataset focusing on drone-based images, containing various object categories like cars, pedestrians, and cyclists.
+- [**VOC**](./voc.md): PASCAL VOC is a popular object detection dataset with 20 object categories including vehicles, animals, and furniture.
+- [**xView**](./xview.md): A dataset containing high-resolution satellite imagery, designed for the detection of various object classes in overhead views.

-### COCO dataset format to YOLO format
+### Adding your own dataset
+
+If you have your own dataset and would like to use it for training detection models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
+
+## Port or Convert Label Formats
+
+### COCO Dataset Format to YOLO Format
+
+You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:

 ```python
 from ultralytics.yolo.data.converter import convert_coco

 convert_coco(labels_dir='../coco/annotations/')
-```
+```
+
+This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
+
+Remember to double-check if the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful object detection models.
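The pixel-to-normalized conversion described for the Ultralytics YOLO label format in `docs/datasets/detect/index.md` can be sketched as follows. This is a minimal example, assuming the input box is given as pixel corner coordinates `(x_min, y_min, x_max, y_max)`; the function name `to_yolo_label` is hypothetical and is not an Ultralytics API.

```python
def to_yolo_label(class_id, box_xyxy, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label row."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image width and height must be positive")
    x_min, y_min, x_max, y_max = box_xyxy
    # Box center and size in pixels.
    x_center = (x_min + x_max) / 2
    y_center = (y_min + y_max) / 2
    width = x_max - x_min
    height = y_max - y_min
    # Normalize x values by image width and y values by image height.
    values = (x_center / img_w, y_center / img_h, width / img_w, height / img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


print(to_yolo_label(0, (100, 120, 300, 360), img_w=640, img_h=480))
# prints "0 0.312500 0.500000 0.312500 0.500000"
```

Each returned string is one row of the per-image `*.txt` label file, in `class x_center y_center width height` order.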
--- a/docs/datasets/detect/sku-110k.md
+++ b/docs/datasets/detect/sku-110k.md
@ -61,6 +61,7 @@ To train a YOLOv8n model on the SKU-110K dataset for 100 epochs with an image si
        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=SKU-110K.yaml model=yolov8n.pt epochs=100 imgsz=640
+        ```

 ## Sample Data and Annotations

--- a/docs/datasets/detect/visdrone.md
+++ b/docs/datasets/detect/visdrone.md
@ -10,22 +10,6 @@ The [VisDrone Dataset](https://github.com/VisDrone/VisDrone-Dataset) is a large-

 VisDrone is composed of 288 video clips with 261,908 frames and 10,209 static images, captured by various drone-mounted cameras. The dataset covers a wide range of aspects, including location (14 different cities across China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using various drone platforms under different scenarios and weather and lighting conditions. These frames are manually annotated with over 2.6 million bounding boxes of targets such as pedestrians, cars, bicycles, and tricycles. Attributes like scene visibility, object class, and occlusion are also provided for better data utilization.

-## Citation
-
-If you use the VisDrone dataset in your research or development work, please cite the following paper:
-
-```bibtex
-@ARTICLE{9573394,
-  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
-  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
-  title={Detection and Tracking Meet Drones Challenge}, 
-  year={2021},
-  volume={},
-  number={},
-  pages={1-1},
-  doi={10.1109/TPAMI.2021.3119563}}
-```
-
 ## Dataset Structure

 The VisDrone dataset is organized into five main subsets, each focusing on a specific task:
--- a/docs/datasets/pose/index.md
+++ b/docs/datasets/pose/index.md
@ -111,14 +111,40 @@ flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

 ## Supported Datasets

-TODO
+This section outlines the datasets that are compatible with Ultralytics YOLO format and can be used for training pose estimation models:

-## Port or Convert label formats
+### COCO-Pose

-### COCO dataset format to YOLO format
+- **Description**: COCO-Pose is a large-scale object detection, segmentation, and pose estimation dataset. It is a subset of the popular COCO dataset and focuses on human pose estimation. COCO-Pose includes multiple keypoints for each human instance.
+- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
+- **Number of Classes**: 1 (Human).
+- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
+- **Usage**: Suitable for training human pose estimation models.
+- **Additional Notes**: The dataset is rich and diverse, containing over 200k labeled images.
+- [Read more about COCO-Pose](./coco.md)
+
+### COCO8-Pose
+
+- **Description**: [Ultralytics](https://ultralytics.com) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation.
+- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
+- **Number of Classes**: 1 (Human).
+- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
+- **Usage**: Suitable for testing and debugging pose estimation models, or for experimenting with new pose estimation approaches.
+- **Additional Notes**: COCO8-Pose is ideal for sanity checks and CI checks.
+- [Read more about COCO8-Pose](./coco8-pose.md)
+
+### Adding your own dataset
+
+If you have your own dataset and would like to use it for training pose estimation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
+
+### Conversion Tool
+
+Ultralytics provides a convenient conversion tool to convert labels from the popular COCO dataset format to YOLO format:

 ```python
 from ultralytics.yolo.data.converter import convert_coco

 convert_coco(labels_dir='../coco/annotations/', use_keypoints=True)
 ```
+
+This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format. The `use_keypoints` parameter specifies whether to include keypoints (for pose estimation) in the converted labels.
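To make the keypoint handling in `docs/datasets/pose/index.md` more concrete, the sketch below normalizes a COCO-style keypoint list, which stores each keypoint as an `(x, y, visibility)` triplet in pixels. It is an illustrative example only, not the implementation of `convert_coco`; the function name `normalize_coco_keypoints` is hypothetical.

```python
def normalize_coco_keypoints(keypoints, img_w, img_h):
    """Normalize a flat COCO keypoint list [x1, y1, v1, x2, y2, v2, ...]."""
    if len(keypoints) % 3 != 0:
        raise ValueError("expected (x, y, visibility) triplets")
    normalized = []
    for i in range(0, len(keypoints), 3):
        x, y, v = keypoints[i:i + 3]
        # Scale coordinates to the 0-1 range; keep the visibility flag as-is.
        normalized.extend([x / img_w, y / img_h, v])
    return normalized
```

For COCO-Pose, the input would hold 17 triplets (51 values) per person instance.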
--- a/docs/datasets/segment/index.md
+++ b/docs/datasets/segment/index.md
@ -46,7 +46,7 @@ train: <path-to-training-images>
 val: <path-to-validation-images>

 nc: <number-of-classes>
-names: [ <class-1>, <class-2>, ..., <class-n> ]
+names: [<class-1>, <class-2>, ..., <class-n>]

 ```

@ -73,7 +73,7 @@ train: data/train/
 val: data/val/

 nc: 2
-names: [ 'person', 'car' ]
+names: ['person', 'car']
 ```

 ## Usage
@ -100,9 +100,18 @@ names: [ 'person', 'car' ]

 ## Supported Datasets

-## Port or Convert label formats
+* [COCO](coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
+* [COCO8-seg](coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.

-### COCO dataset format to YOLO format
+### Adding your own dataset
+
+If you have your own dataset and would like to use it for training segmentation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
+
+## Port or Convert Label Formats
+
+### COCO Dataset Format to YOLO Format
+
+You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:

 ```python
 from ultralytics.yolo.data.converter import convert_coco
@ -110,6 +119,10 @@ from ultralytics.yolo.data.converter import convert_coco
 convert_coco(labels_dir='../coco/annotations/', use_segments=True)
 ```

+This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
+
+Remember to double-check that the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful segmentation models.
+
 ## Auto-Annotation

 Auto-annotation is an essential feature that allows you to generate a segmentation dataset using a pre-trained detection model. It enables you to quickly and accurately annotate a large number of images without the need for manual labeling, saving time and effort.