`ultralytics 8.0.94` HUBDatasetStats() Segment and Pose support (#2450)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: JF Chen <k-2feng@hotmail.com>
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
Co-authored-by: Laughing-q <1185102784@qq.com>
single_channel
Glenn Jocher 2 years ago committed by GitHub
parent af49a85cf3
commit e21428ca4e

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,101 @@
---
comments: true
---
# Image Classification Datasets Overview
## Dataset format
The folder structure for classification datasets in torchvision typically follows a standard format:
```
root/
|-- class1/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- class2/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- class3/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- ...
```
In this folder structure, the `root` directory contains one subdirectory for each class in the dataset. Each subdirectory is named after the corresponding class and contains all the images for that class. Each image file is named uniquely and is typically in a common image file format such as JPEG or PNG.
**Example**
For example, in the CIFAR10 dataset, the folder structure would look like this:
```
cifar-10/
|
|-- train/
| |-- airplane/
| | |-- 10008_airplane.png
| | |-- 10009_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 1000_automobile.png
| | |-- 1001_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 10014_bird.png
| | |-- 10015_bird.png
| | |-- ...
| |
| |-- ...
|
|-- test/
| |-- airplane/
| | |-- 10_airplane.png
| | |-- 11_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 100_automobile.png
| | |-- 101_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 1000_bird.png
| | |-- 1001_bird.png
| | |-- ...
| |
| |-- ...
```
In this example, the `train` directory contains subdirectories for each class in the dataset, and each class subdirectory contains all the images for that class. The `test` directory has a similar structure. The `root` directory also contains other files that are part of the CIFAR10 dataset.
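To illustrate how this layout is interpreted, here is a minimal sketch using torchvision's `ImageFolder`, which infers one class per subdirectory (the `cifar-10/train` path is only an example and assumes the structure shown above):
```python
from torchvision import datasets

# ImageFolder discovers one class per subdirectory and indexes them alphabetically
train_set = datasets.ImageFolder('cifar-10/train')
print(train_set.classes)       # ['airplane', 'automobile', 'bird', ...]
print(train_set.class_to_idx)  # {'airplane': 0, 'automobile': 1, ...}
```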
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
model.train(data='path/to/dataset', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=path/to/dataset model=yolov8n-cls.pt epochs=100 imgsz=640
```
## Supported Datasets
TODO

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,106 @@
---
comments: true
---
# Object Detection Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
**Label Format**
The dataset format used for training YOLO detection models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
- Object width and height: The width and height of the object, normalized to be between 0 and 1.
The format for a single row in the detection dataset file is as follows:
```
<object-class> <x> <y> <width> <height>
```
Here is an example of the YOLO dataset format for a single image with two object instances:
```
0 0.5 0.4 0.3 0.6
1 0.3 0.7 0.4 0.2
```
In this example, the first object is of class 0 (person), with its center at (0.5, 0.4), width of 0.3, and height of 0.6. The second object is of class 1 (car), with its center at (0.3, 0.7), width of 0.4, and height of 0.2.
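As a quick sanity check, the sketch below (an illustration, not part of the Ultralytics API) converts one such normalized row back into pixel coordinates:
```python
def yolo_to_xyxy(line, img_w, img_h):
    """Convert one normalized 'class x y w h' label row to pixel xyxy coordinates."""
    cls, x, y, w, h = (float(v) for v in line.split())
    return int(cls), (x - w / 2) * img_w, (y - h / 2) * img_h, (x + w / 2) * img_w, (y + h / 2) * img_h

print(yolo_to_xyxy('0 0.5 0.4 0.3 0.6', img_w=640, img_h=480))  # (0, 224.0, 48.0, 416.0, 336.0)
```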
**Dataset file format**
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Detection Models. Here is an example of the YAML format used for defining a detection dataset:
```yaml
train: <path-to-training-images>
val: <path-to-validation-images>
nc: <number-of-classes>
names: [<class-1>, <class-2>, ..., <class-n>]
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
The `nc` field specifies the number of object classes in the dataset.
The `names` field is a list of the names of the object classes. The order of the names should match the order of the object class indices in the YOLO dataset files.
NOTE: Either `nc` or `names` must be defined; defining both is not required.
Alternatively, you can directly define class names like this:
```yaml
names:
0: person
1: bicycle
```
**Example**
```yaml
train: data/train/
val: data/val/
nc: 2
names: ['person', 'car']
```
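For a quick check of such a file, here is a minimal sketch that loads the YAML with PyYAML and verifies that `nc` agrees with `names` (the `data.yaml` filename is only an example):
```python
import yaml

with open('data.yaml') as f:  # example path to a dataset YAML like the one above
    data = yaml.safe_load(f)

names = data['names']
names = dict(enumerate(names)) if isinstance(names, list) else names  # list or {index: name} mapping
assert 'nc' not in data or data['nc'] == len(names), 'nc does not match the number of names'
print(names)  # {0: 'person', 1: 'car'}
```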
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
model.train(data='coco128.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Supported Datasets
TODO
## Port or Convert label formats
### COCO dataset format to YOLO format
```python
from ultralytics.yolo.data.converter import convert_coco

# Convert COCO-format JSON annotations into YOLO-format *.txt labels
convert_coco(labels_dir='../coco/annotations/')
```

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,57 @@
---
comments: true
---
# Datasets Overview
Ultralytics provides support for various datasets to facilitate computer vision tasks such as detection, instance segmentation, pose estimation, classification, and multi-object tracking. Below is a list of the main Ultralytics datasets, followed by a summary of each computer vision task and the respective datasets.
## [Detection Datasets](detect/index.md)
Bounding box object detection is a computer vision technique that involves detecting and localizing objects in an image by drawing a bounding box around each object.
* [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations.
* [COCO](detect/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning with over 200K labeled images.
* [COCO8](detect/coco8.md): Contains the first 4 images from COCO train and COCO val, suitable for quick tests.
* [Global Wheat 2020](detect/globalwheat2020.md): A dataset of wheat head images collected from around the world for object detection and localization tasks.
* [Objects365](detect/objects365.md): A high-quality, large-scale dataset for object detection with 365 object categories and over 600K annotated images.
* [SKU-110K](detect/sku-110k.md): A dataset featuring dense object detection in retail environments with over 11K images and 1.7 million bounding boxes.
* [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
* [VOC](detect/voc.md): The Pascal Visual Object Classes (VOC) dataset for object detection and segmentation with 20 object classes and over 11K images.
* [xView](detect/xview.md): A dataset for object detection in overhead imagery with 60 object categories and over 1 million annotated objects.
## [Instance Segmentation Datasets](segment/index.md)
Instance segmentation is a computer vision technique that involves identifying and localizing objects in an image at the pixel level.
* [COCO](segment/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
* [COCO8-seg](segment/coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
## [Pose Estimation](pose/index.md)
Pose estimation is a technique used to determine the pose of the object relative to the camera or the world coordinate system.
* [COCO](pose/coco.md): A large-scale dataset with human pose annotations designed for pose estimation tasks.
* [COCO8-pose](pose/coco8-pose.md): A smaller dataset for pose estimation tasks, containing a subset of 8 COCO images with human pose annotations.
## [Classification](classify/index.md)
Image classification is a computer vision task that involves categorizing an image into one or more predefined classes or categories based on its visual content.
* [Caltech 101](classify/caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
* [Caltech 256](classify/caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
* [CIFAR-10](classify/cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
* [CIFAR-100](classify/cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
* [Fashion-MNIST](classify/fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
* [ImageNet](classify/imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
* [ImageNet-10](classify/imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
* [Imagenette](classify/imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
* [Imagewoof](classify/imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
* [MNIST](classify/mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
## [Multi-Object Tracking](track/index.md)
Multi-object tracking is a computer vision technique that involves detecting and tracking multiple objects over time in a video sequence.
* [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations for multi-object tracking tasks.
* [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,120 @@
---
comments: true
---
# Pose Estimation Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
**Label Format**
The dataset format used for training YOLO pose models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
- Object width and height: The width and height of the object, normalized to be between 0 and 1.
- Object keypoint coordinates: The keypoints of the object, normalized to be between 0 and 1.
Here is an example of the label format for the pose estimation task:
Format with Dim = 2
```
<class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>
```
Format with Dim = 3
```
<class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility>
```
In this format, `<class-index>` is the index of the class for the object, `<x> <y> <width> <height>` are the coordinates of the bounding box, and `<px1> <py1> <px2> <py2> ... <pxn> <pyn>` are the normalized coordinates of the keypoints. The coordinates are separated by spaces.
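As an illustration (not part of the Ultralytics API), a single Dim = 3 row can be split into its class, box, and keypoint array like this, assuming the COCO-style `kpt_shape: [17, 3]` shown in the example further below:
```python
import numpy as np

def parse_pose_row(line, kpt_shape=(17, 3)):
    """Split one pose label row into class index, normalized bbox and keypoint array."""
    values = [float(v) for v in line.split()]
    cls, bbox = int(values[0]), values[1:5]         # class, [x, y, w, h]
    kpts = np.array(values[5:]).reshape(kpt_shape)  # (num_kpts, dim) keypoints
    return cls, bbox, kpts
```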
**Dataset file format**
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Pose Models. Here is an example of the YAML format used for defining a pose dataset:
```yaml
train: <path-to-training-images>
val: <path-to-validation-images>
nc: <number-of-classes>
names: [<class-1>, <class-2>, ..., <class-n>]
# Keypoints
kpt_shape: [num_kpts, dim] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [n1, n2 ... , n(num_kpts)]
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
The `nc` field specifies the number of object classes in the dataset.
The `names` field is a list of the names of the object classes. The order of the names should match the order of the object class indices in the YOLO dataset files.
NOTE: Either `nc` or `names` must be defined; defining both is not required.
Alternatively, you can directly define class names like this:
```yaml
names:
0: person
1: bicycle
```
(Optional) If the keypoints are symmetric (for example, the left and right sides of a human body or face), a `flip_idx` is required so that labels remain correct when images are flipped horizontally.
For example, suppose there are five keypoints for facial landmarks: [left eye, right eye, nose, left corner of mouth, right corner of mouth] with original indices [0, 1, 2, 3, 4]. The corresponding `flip_idx` is [1, 0, 2, 4, 3]: the left-right pairs (0-1 and 3-4) are swapped, while symmetric points such as the nose keep their index.
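In practice this index is used during horizontal-flip augmentation: the keypoints are reordered by `flip_idx` and their x coordinates mirrored. A rough sketch of the idea (not the actual Ultralytics augmentation code):
```python
import numpy as np

flip_idx = [1, 0, 2, 4, 3]           # from the five-point facial-landmark example above
kpts = np.random.rand(5, 3)          # placeholder keypoints: normalized x, y, visibility

flipped = kpts[flip_idx].copy()      # swap left/right keypoints
flipped[:, 0] = 1.0 - flipped[:, 0]  # mirror the normalized x coordinate
```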
**Example**
```yaml
train: data/train/
val: data/val/
nc: 2
names: ['person', 'car']
# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
```
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
# Train the model
model.train(data='coco128-pose.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo pose train data=coco128-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
```
## Supported Datasets
TODO
## Port or Convert label formats
### COCO dataset format to YOLO format
```python
from ultralytics.yolo.data.converter import convert_coco

# Convert COCO keypoint annotations into YOLO-format pose labels
convert_coco(labels_dir='../coco/annotations/', use_keypoints=True)
```

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,106 @@
---
comments: true
---
# Instance Segmentation Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
**Label Format**
The dataset format used for training YOLO segmentation models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object segment coordinates: The x, y coordinates of the points that outline the object's segmentation mask, normalized to be between 0 and 1.
The format for a single row in the segmentation dataset file is as follows:
```
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
```
In this format, `<class-index>` is the index of the class for the object, and `<x1> <y1> <x2> <y2> ... <xn> <yn>` are the normalized x, y coordinates of the points outlining the object's segmentation mask. The coordinates are separated by spaces.
Here is an example of the YOLO dataset format for a single image with two object instances:
```
0 0.6812 0.48541 0.67 0.4875 0.67656 0.487 0.675 0.489 0.66
1 0.5046 0.0 0.5015 0.004 0.4984 0.00416 0.4937 0.010 0.492 0.0104
```
Note: Rows can have different lengths, since each object's segmentation mask may contain a different number of points.
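Since the polygon fully determines the object's bounding box, it can be derived on the fly; a small sketch (an illustration, not part of the Ultralytics API):
```python
def polygon_to_bbox(line):
    """Derive a normalized xywh box from one 'class x1 y1 x2 y2 ...' polygon row."""
    values = [float(v) for v in line.split()]
    cls, points = int(values[0]), values[1:]
    xs, ys = points[0::2], points[1::2]
    x1, y1, x2, y2 = min(xs), min(ys), max(xs), max(ys)
    return cls, [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]  # class, [x, y, w, h]
```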
**Dataset file format**
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Segmentation Models. Here is an example of the YAML format used for defining a segmentation dataset:
```yaml
train: <path-to-training-images>
val: <path-to-validation-images>
nc: <number-of-classes>
names: [<class-1>, <class-2>, ..., <class-n>]
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
The `nc` field specifies the number of object classes in the dataset.
The `names` field is a list of the names of the object classes. The order of the names should match the order of the object class indices in the YOLO dataset files.
NOTE: Either `nc` or `names` must be defined; defining both is not required.
Alternatively, you can directly define class names like this:
```yaml
names:
0: person
1: bicycle
```
**Example**
```yaml
train: data/train/
val: data/val/
nc: 2
names: ['person', 'car']
```
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
# Train the model
model.train(data='coco128-seg.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo segment train data=coco128-seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
```
## Supported Datasets
## Port or Convert label formats
### COCO dataset format to YOLO format
```python
from ultralytics.yolo.data.converter import convert_coco

# Convert COCO segmentation annotations into YOLO-format segment labels
convert_coco(labels_dir='../coco/annotations/', use_segments=True)
```

@ -0,0 +1,29 @@
---
comments: true
---
# Multi-object Tracking Datasets Overview
## Dataset Format (Coming Soon)
Multi-object tracking does not require standalone training; it works directly with pretrained detection, segmentation, or pose models.
Support for training trackers on their own is coming soon.
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
model = YOLO('yolov8n.pt')
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", conf=0.3, iou=0.5, show=True)
```
=== "CLI"
```bash
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" conf=0.3 iou=0.5 show
```

@ -2,6 +2,63 @@
comments: true comments: true
--- ---
# 🚧 Page Under Construction ⚒ # Ultralytics Android App: Real-time Object Detection with YOLO Models
This page is currently under construction! 👷Please check back later for updates. 😃🔜 The Ultralytics Android App is a powerful tool that allows you to run YOLO models directly on your Android device for real-time object detection. This app utilizes TensorFlow Lite for model optimization and various hardware delegates for acceleration, enabling fast and efficient object detection.
## Quantization and Acceleration
To achieve real-time performance on your Android device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
### FP16 Quantization
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
### INT8 Quantization
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in mean average precision (mAP) due to the lower numerical precision.
!!! tip "mAP Reduction in INT8 Models"
The reduced numerical precision in INT8 models can lead to some loss of information during the quantization process, which may result in a slight decrease in mAP. However, this trade-off is often acceptable considering the substantial performance gains offered by INT8 quantization.
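For reference, quantized TensorFlow Lite models can be produced with the standard Ultralytics export API; a minimal sketch, assuming the `half` and `int8` export arguments select FP16 and INT8 precision respectively:
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
model.export(format='tflite', half=True)  # FP16-quantized TFLite model
model.export(format='tflite', int8=True)  # INT8-quantized TFLite model
```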
## Delegates and Performance Variability
Different delegates are available on Android devices to accelerate model inference. These delegates include CPU, [GPU](https://www.tensorflow.org/lite/android/delegates/gpu), [Hexagon](https://www.tensorflow.org/lite/android/delegates/hexagon) and [NNAPI](https://www.tensorflow.org/lite/android/delegates/nnapi). The performance of these delegates varies depending on the device's hardware vendor, product line, and specific chipsets used in the device.
1. **CPU**: The default option, with reasonable performance on most devices.
2. **GPU**: Utilizes the device's GPU for faster inference. It can provide a significant performance boost on devices with powerful GPUs.
3. **Hexagon**: Leverages Qualcomm's Hexagon DSP for faster and more efficient processing. This option is available on devices with Qualcomm Snapdragon processors.
4. **NNAPI**: The Android Neural Networks API (NNAPI) serves as an abstraction layer for running ML models on Android devices. NNAPI can utilize various hardware accelerators, such as CPU, GPU, and dedicated AI chips (e.g., Google's Edge TPU, or the Pixel Neural Core).
Here's a table showing the primary vendors, their product lines, popular devices, and supported delegates:
| Vendor | Product Lines | Popular Devices | Delegates Supported |
|-----------------------------------------|---------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|
| [Qualcomm](https://www.qualcomm.com/) | [Snapdragon (e.g., 800 series)](https://www.qualcomm.com/snapdragon) | [Samsung Galaxy S21](https://www.samsung.com/global/galaxy/galaxy-s21-5g/), [OnePlus 9](https://www.oneplus.com/9), [Google Pixel 6](https://store.google.com/product/pixel_6) | CPU, GPU, Hexagon, NNAPI |
| [Samsung](https://www.samsung.com/) | [Exynos (e.g., Exynos 2100)](https://www.samsung.com/semiconductor/minisite/exynos/) | [Samsung Galaxy S21 (Global version)](https://www.samsung.com/global/galaxy/galaxy-s21-5g/) | CPU, GPU, NNAPI |
| [MediaTek](https://www.mediatek.com/) | [Dimensity (e.g., Dimensity 1200)](https://www.mediatek.com/products/smartphones) | [Realme GT](https://www.realme.com/global/realme-gt), [Xiaomi Redmi Note](https://www.mi.com/en/phone/redmi/note-list) | CPU, GPU, NNAPI |
| [HiSilicon](https://www.hisilicon.com/) | [Kirin (e.g., Kirin 990)](https://www.hisilicon.com/en/products/Kirin) | [Huawei P40 Pro](https://consumer.huawei.com/en/phones/p40-pro/), [Huawei Mate 30 Pro](https://consumer.huawei.com/en/phones/mate30-pro/) | CPU, GPU, NNAPI |
| [NVIDIA](https://www.nvidia.com/) | [Tegra (e.g., Tegra X1)](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/) | [NVIDIA Shield TV](https://www.nvidia.com/en-us/shield/shield-tv/), [Nintendo Switch](https://www.nintendo.com/switch/) | CPU, GPU, NNAPI |
Please note that the list of devices mentioned is not exhaustive and may vary depending on the specific chipsets and device models. Always test your models on your target devices to ensure compatibility and optimal performance.
Keep in mind that the choice of delegate can affect performance and model compatibility. For example, some models may not work with certain delegates, or a delegate may not be available on a specific device. As such, it's essential to test your model and the chosen delegate on your target devices for the best results.
## Getting Started with the Ultralytics Android App
To get started with the Ultralytics Android App, follow these steps:
1. Download the Ultralytics App from the [Google Play Store](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app).
2. Launch the app on your Android device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
4. Grant the app permission to access your device's camera.
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
With the Ultralytics Android App, you now have the power of real-time object detection using YOLO models right at your fingertips. Enjoy exploring the app's features and optimizing its settings to suit your specific use cases.

@ -2,6 +2,54 @@
comments: true comments: true
--- ---
# 🚧 Page Under Construction ⚒ # Ultralytics iOS App: Real-time Object Detection with YOLO Models
This page is currently under construction! 👷Please check back later for updates. 😃🔜 The Ultralytics iOS App is a powerful tool that allows you to run YOLO models directly on your iPhone or iPad for real-time object detection. This app utilizes the Apple Neural Engine and Core ML for model optimization and acceleration, enabling fast and efficient object detection.
## Quantization and Acceleration
To achieve real-time performance on your iOS device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
### FP16 Quantization
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
### INT8 Quantization
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in accuracy.
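For reference, quantized Core ML models can be produced with the standard Ultralytics export API; a minimal sketch, assuming the `half` and `int8` export arguments apply FP16 and INT8 weight quantization for this format:
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
model.export(format='coreml', half=True)  # FP16-quantized Core ML model
model.export(format='coreml', int8=True)  # INT8-quantized Core ML model
```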
## Apple Neural Engine
The Apple Neural Engine (ANE) is a dedicated hardware component integrated into Apple's A-series and M-series chips. It's designed to accelerate machine learning tasks, particularly for neural networks, allowing for faster and more efficient execution of your YOLO models.
By combining quantized YOLO models with the Apple Neural Engine, the Ultralytics iOS App achieves real-time object detection on your iOS device without compromising on accuracy or performance.
| Release Year | iPhone Name | Chipset Name | Node Size | ANE TOPs |
|--------------|------------------------------------------------------|-------------------------------------------------------|-----------|----------|
| 2017 | [iPhone X](https://en.wikipedia.org/wiki/IPhone_X) | [A11 Bionic](https://en.wikipedia.org/wiki/Apple_A11) | 10 nm | 0.6 |
| 2018 | [iPhone XS](https://en.wikipedia.org/wiki/IPhone_XS) | [A12 Bionic](https://en.wikipedia.org/wiki/Apple_A12) | 7 nm | 5 |
| 2019 | [iPhone 11](https://en.wikipedia.org/wiki/IPhone_11) | [A13 Bionic](https://en.wikipedia.org/wiki/Apple_A13) | 7 nm | 6 |
| 2020 | [iPhone 12](https://en.wikipedia.org/wiki/IPhone_12) | [A14 Bionic](https://en.wikipedia.org/wiki/Apple_A14) | 5 nm | 11 |
| 2021 | [iPhone 13](https://en.wikipedia.org/wiki/IPhone_13) | [A15 Bionic](https://en.wikipedia.org/wiki/Apple_A15) | 5 nm | 15.8 |
| 2022 | [iPhone 14](https://en.wikipedia.org/wiki/IPhone_14) | [A16 Bionic](https://en.wikipedia.org/wiki/Apple_A16) | 4 nm | 17.0 |
Please note that this list only includes iPhone models from 2017 onwards, and the ANE TOPs values are approximate.
## Getting Started with the Ultralytics iOS App
To get started with the Ultralytics iOS App, follow these steps:
1. Download the Ultralytics App from the [App Store](https://apps.apple.com/xk/app/ultralytics/id1583935240).
2. Launch the app on your iOS device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
4. Grant the app permission to access your device's camera.
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
With the Ultralytics iOS App, you can now leverage the power of YOLO models for real-time object detection on your iPhone or iPad, powered by the Apple Neural Engine and optimized with FP16 or INT8 quantization.

@ -4,26 +4,24 @@ comments: true
# HUB Datasets # HUB Datasets
## Upload a Dataset ## 1. Upload a Dataset
Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets, they use the same structure and the same label formats to keep Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets, they use the same structure and the same label formats to keep
everything simple. everything simple.
When you upload a dataset to Ultralytics HUB, make sure to **place your dataset YAML inside the dataset root directory** When you upload a dataset to Ultralytics HUB, make sure to **place your dataset YAML inside the dataset root directory**
as in the example shown below, and then zip for upload to https://hub.ultralytics.com/. Your **dataset YAML, directory as in the example shown below, and then zip for upload to [https://hub.ultralytics.com](https://hub.ultralytics.com/). Your **dataset YAML, directory
and zip** should all share the same name. For example, if your dataset is called 'coco6' as in our and zip** should all share the same name. For example, if your dataset is called 'coco8' as in our
example [ultralytics/hub/coco6.zip](https://github.com/ultralytics/hub/blob/master/coco6.zip), then you should have a example [ultralytics/hub/example_datasets/coco8.zip](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip), then you should have a `coco8.yaml` inside your `coco8/` directory, which should zip to create `coco8.zip` for upload:
coco6.yaml inside your coco6/ directory, which should zip to create coco6.zip for upload:
```bash ```bash
zip -r coco6.zip coco6 zip -r coco8.zip coco8
``` ```
The example [coco6.zip](https://github.com/ultralytics/hub/blob/master/coco6.zip) dataset in this repository can be The [example_datasets/coco8.zip](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip) dataset in this repository can be downloaded and unzipped to see exactly how to structure your custom dataset.
downloaded and unzipped to see exactly how to structure your custom dataset.
<p align="center"> <p align="center">
<img width="80%" src="https://user-images.githubusercontent.com/26833433/201424843-20fa081b-ad4b-4d6c-a095-e810775908d8.png" title="COCO6" /> <img width="80%" src="https://user-images.githubusercontent.com/26833433/201424843-20fa081b-ad4b-4d6c-a095-e810775908d8.png" title="COCO8" />
</p> </p>
The dataset YAML is the same standard YOLOv5 and YOLOv8 YAML format. See The dataset YAML is the same standard YOLOv5 and YOLOv8 YAML format. See

@ -36,6 +36,6 @@ We hope that the resources here will help you get the most out of HUB. Please br
- [**Models: Training and Exporting**](./models.md). Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment. - [**Models: Training and Exporting**](./models.md). Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
- [**Integrations: Options**](./integrations.md). Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle. - [**Integrations: Options**](./integrations.md). Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
- [**Ultralytics HUB App**](./app/index.md). Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device. - [**Ultralytics HUB App**](./app/index.md). Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
- [**iOS**](./app/ios.md) * [**iOS**](./app/ios.md). Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
- [**Android**](./app/android.md) * [**Android**](./app/android.md). Explore TFLite acceleration on mobile devices.
- [**Inference API**](./inference_api.md). Understand how to use the Inference API for running your trained models in the cloud to generate predictions. - [**Inference API**](./inference_api.md). Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

@ -50,7 +50,7 @@ In this example, replace `API_KEY` with your actual API key, `MODEL_ID` with the
You can use the YOLO Inference API with the command-line interface (CLI) by utilizing the `curl` command. Replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `image.jpg` with the path to the image you want to analyze: You can use the YOLO Inference API with the command-line interface (CLI) by utilizing the `curl` command. Replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `image.jpg` with the path to the image you want to analyze:
```commandline ```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \ curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \ -H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \ -F "image=@/path/to/image.jpg" \
@ -89,12 +89,12 @@ In this example, the `data` dictionary contains the query arguments `size`, `con
This will send the query parameters along with the file in the POST request. See the table below for a full list of available inference arguments. This will send the query parameters along with the file in the POST request. See the table below for a full list of available inference arguments.
| Argument | Default | Type | Notes | | Inference Argument | Default | Type | Notes |
|--------------|---------|---------|-----------------------------------------| |--------------------|---------|---------|------------------------------------------------|
| `size` | `640` | `int` | allowable range is `32` - `1280` pixels | | `size` | `640` | `int` | valid range is `32` - `1280` pixels |
| `confidence` | `0.25` | `float` | allowable range is `0.01` - `1.0` | | `confidence` | `0.25` | `float` | valid range is `0.01` - `1.0` |
| `iou` | `0.45` | `float` | allowable range is `0.0` - `0.95` | | `iou` | `0.45` | `float` | valid range is `0.0` - `0.95` |
| `url` | `''` | `str` | | | `url` | `''` | `str` | optional image URL if not image file is passed |
| `normalize` | `False` | `bool` | | | `normalize` | `False` | `bool` | |
## Return JSON format ## Return JSON format
@ -124,7 +124,7 @@ YOLO detection models, such as `yolov8n.pt`, can return JSON responses from loca
``` ```
=== "CLI API" === "CLI API"
```commandline ```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \ curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \ -H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \ -F "image=@/path/to/image.jpg" \
@ -218,7 +218,7 @@ YOLO segmentation models, such as `yolov8n-seg.pt`, can return JSON responses fr
``` ```
=== "CLI API" === "CLI API"
```commandline ```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \ curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \ -H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \ -F "image=@/path/to/image.jpg" \
@ -356,7 +356,7 @@ YOLO pose models, such as `yolov8n-pose.pt`, can return JSON responses from loca
``` ```
=== "CLI API" === "CLI API"
```commandline ```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \ curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \ -H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \ -F "image=@/path/to/image.jpg" \

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -28,7 +28,7 @@ Creating a custom model to detect your objects is an iterative process of collec
YOLOv5 models must be trained on labelled data in order to learn classes of objects in that data. There are two options for creating your dataset before you start training: YOLOv5 models must be trained on labelled data in order to learn classes of objects in that data. There are two options for creating your dataset before you start training:
<details markdown> <details open markdown>
<summary>Use <a href="https://roboflow.com/?ref=ultralytics">Roboflow</a> to create your dataset in YOLO format</summary> <summary>Use <a href="https://roboflow.com/?ref=ultralytics">Roboflow</a> to create your dataset in YOLO format</summary>
### 1.1 Collect Images ### 1.1 Collect Images

@ -4,7 +4,7 @@ This example demonstrates how to perform inference using YOLOv8 and YOLOv5 model
## Usage ## Usage
```commandline ```bash
git clone ultralytics git clone ultralytics
cd ultralytics cd ultralytics
pip install . pip install .

@ -161,6 +161,41 @@ nav:
- YOLOv5: models/yolov5.md - YOLOv5: models/yolov5.md
- YOLOv8: models/yolov8.md - YOLOv8: models/yolov8.md
- Segment Anything Model (SAM): models/sam.md - Segment Anything Model (SAM): models/sam.md
- Datasets:
- datasets/index.md
- Detection:
- datasets/detect/index.md
- Argoverse: datasets/detect/argoverse.md
- COCO: datasets/detect/coco.md
- COCO8: datasets/detect/coco8.md
- GlobalWheat2020: datasets/detect/globalwheat2020.md
- Objects365: datasets/detect/objects365.md
- SKU-110K: datasets/detect/sku-110k.md
- VisDrone: datasets/detect/visdrone.md
- VOC: datasets/detect/voc.md
- xView: datasets/detect/xview.md
- Segmentation:
- datasets/segment/index.md
- COCO: datasets/segment/coco.md
- COCO8-seg: datasets/segment/coco8-seg.md
- Pose:
- datasets/pose/index.md
- COCO: datasets/pose/coco.md
- COCO8-pose: datasets/pose/coco8-pose.md
- Classification:
- datasets/classify/index.md
- Caltech 101: datasets/classify/caltech101.md
- Caltech 256: datasets/classify/caltech256.md
- CIFAR-10: datasets/classify/cifar10.md
- CIFAR-100: datasets/classify/cifar100.md
- Fashion-MNIST: datasets/classify/fashion-mnist.md
- ImageNet: datasets/classify/imagenet.md
- ImageNet-10: datasets/classify/imagenet10.md
- Imagenette: datasets/classify/imagenette.md
- Imagewoof: datasets/classify/imagewoof.md
- MNIST: datasets/classify/mnist.md
- Multi-Object Tracking:
- datasets/track/index.md
- Usage: - Usage:
- CLI: usage/cli.md - CLI: usage/cli.md
- Python: usage/python.md - Python: usage/python.md
@ -327,6 +362,7 @@ plugins:
reference/nn.md: reference/nn/modules.md reference/nn.md: reference/nn/modules.md
reference/ops.md: reference/yolo/utils/ops.md reference/ops.md: reference/yolo/utils/ops.md
reference/results.md: reference/yolo/engine/results.md reference/results.md: reference/yolo/engine/results.md
reference/base_val.md: index.md
tasks/classification.md: tasks/classify.md tasks/classification.md: tasks/classify.md
tasks/detection.md: tasks/detect.md tasks/detection.md: tasks/detect.md
tasks/segmentation.md: tasks/segment.md tasks/segmentation.md: tasks/segment.md
@ -362,6 +398,7 @@ plugins:
yolov5/train_custom_data.md: yolov5/tutorials/train_custom_data.md yolov5/train_custom_data.md: yolov5/tutorials/train_custom_data.md
yolov5/architecture.md: yolov5/tutorials/architecture_description.md yolov5/architecture.md: yolov5/tutorials/architecture_description.md
yolov5/export.md: yolov5/tutorials/model_export.md yolov5/export.md: yolov5/tutorials/model_export.md
yolov5/yolov5_quickstart_tutorial.md: yolov5/quickstart_tutorial.md
yolov5/tips_for_best_training_results.md: yolov5/tutorials/tips_for_best_training_results.md yolov5/tips_for_best_training_results.md: yolov5/tutorials/tips_for_best_training_results.md
yolov5/tutorials/yolov5_neural_magic_tutorial.md: yolov5/tutorials/neural_magic_pruning_quantization.md yolov5/tutorials/yolov5_neural_magic_tutorial.md: yolov5/tutorials/neural_magic_pruning_quantization.md
yolov5/tutorials/model_ensembling_tutorial.md: yolov5/tutorials/model_ensembling.md yolov5/tutorials/model_ensembling_tutorial.md: yolov5/tutorials/model_ensembling.md
@ -372,4 +409,17 @@ plugins:
yolov5/tutorials/model_export_tutorial.md: yolov5/tutorials/model_export.md yolov5/tutorials/model_export_tutorial.md: yolov5/tutorials/model_export.md
yolov5/tutorials/jetson_nano_tutorial.md: yolov5/tutorials/running_on_jetson_nano.md yolov5/tutorials/jetson_nano_tutorial.md: yolov5/tutorials/running_on_jetson_nano.md
yolov5/tutorials/yolov5_model_ensembling_tutorial.md: yolov5/tutorials/model_ensembling.md yolov5/tutorials/yolov5_model_ensembling_tutorial.md: yolov5/tutorials/model_ensembling.md
reference/base_val.md: index.md yolov5/tutorials/roboflow_integration.md: yolov5/tutorials/roboflow_datasets_integration.md
yolov5/tutorials/pruning_and_sparsity_tutorial.md: yolov5/tutorials/model_pruning_and_sparsity.md
yolov5/tutorials/yolov5_transfer_learning_with_frozen_layers_tutorial.md: yolov5/tutorials/transfer_learning_with_frozen_layers.md
yolov5/tutorials/transfer_learning_with_frozen_layers_tutorial.md: yolov5/tutorials/transfer_learning_with_frozen_layers.md
yolov5/tutorials/yolov5_model_export_tutorial.md: yolov5/tutorials/model_export.md
yolov5/tutorials/neural_magic_tutorial.md: yolov5/tutorials/neural_magic_pruning_quantization.md
yolov5/tutorials/yolov5_clearml_integration_tutorial.md: yolov5/tutorials/clearml_logging_integration.md
yolov5/tutorials/yolov5_train_custom_data.md: yolov5/tutorials/train_custom_data.md
yolov5/tutorials/comet_integration_tutorial.md: yolov5/tutorials/comet_logging_integration.md
yolov5/tutorials/yolov5_pruning_and_sparsity_tutorial.md: yolov5/tutorials/model_pruning_and_sparsity.md
yolov5/tutorials/yolov5_jetson_nano_tutorial.md: yolov5/tutorials/running_on_jetson_nano.md
yolov5/environments/yolov5_amazon_web_services_quickstart_tutorial.md: yolov5/environments/aws_quickstart_tutorial.md
yolov5/environments/yolov5_google_cloud_platform_quickstart_tutorial.md: yolov5/environments/google_cloud_quickstart_tutorial.md
yolov5/environments/yolov5_docker_image_quickstart_tutorial.md: yolov5/environments/docker_image_quickstart_tutorial.md

@ -1,6 +1,6 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license # Ultralytics YOLO 🚀, AGPL-3.0 license
__version__ = '8.0.93' __version__ = '8.0.94'
from ultralytics.hub import start from ultralytics.hub import start
from ultralytics.vit.sam import SAM from ultralytics.vit.sam import SAM

@ -24,7 +24,7 @@ TASK2MODEL = {
'detect': 'yolov8n.pt', 'detect': 'yolov8n.pt',
'segment': 'yolov8n-seg.pt', 'segment': 'yolov8n-seg.pt',
'classify': 'yolov8n-cls.pt', 'classify': 'yolov8n-cls.pt',
'pose': 'yolov8n-pose.yaml'} 'pose': 'yolov8n-pose.pt'}
CLI_HELP_MSG = \ CLI_HELP_MSG = \
f""" f"""

@ -15,7 +15,7 @@ import psutil
from torch.utils.data import Dataset from torch.utils.data import Dataset
from tqdm import tqdm from tqdm import tqdm
from ..utils import LOCAL_RANK, LOGGER, NUM_THREADS, TQDM_BAR_FORMAT from ..utils import DEFAULT_CFG, LOCAL_RANK, LOGGER, NUM_THREADS, TQDM_BAR_FORMAT
from .utils import HELP_URL, IMG_FORMATS from .utils import HELP_URL, IMG_FORMATS
@ -51,7 +51,7 @@ class BaseDataset(Dataset):
imgsz=640, imgsz=640,
cache=False, cache=False,
augment=True, augment=True,
hyp=None, hyp=DEFAULT_CFG,
prefix='', prefix='',
rect=False, rect=False,
batch_size=None, batch_size=None,

@ -71,7 +71,7 @@ def seed_worker(worker_id): # noqa
def build_yolo_dataset(cfg, img_path, batch, data_info, mode='train', rect=False, stride=32): def build_yolo_dataset(cfg, img_path, batch, data_info, mode='train', rect=False, stride=32):
"""Build YOLO Dataset""" """Build YOLO Dataset"""
dataset = YOLODataset( return YOLODataset(
img_path=img_path, img_path=img_path,
imgsz=cfg.imgsz, imgsz=cfg.imgsz,
batch_size=batch, batch_size=batch,
@ -87,7 +87,6 @@ def build_yolo_dataset(cfg, img_path, batch, data_info, mode='train', rect=False
use_keypoints=cfg.task == 'pose', use_keypoints=cfg.task == 'pose',
classes=cfg.classes, classes=cfg.classes,
data=data_info) data=data_info)
return dataset
def build_dataloader(dataset, batch, workers, shuffle=True, rank=-1): def build_dataloader(dataset, batch, workers, shuffle=True, rank=-1):

@ -209,7 +209,7 @@ class ClassificationDataset(torchvision.datasets.ImageFolder):
album_transform: Albumentations transforms, used if installed album_transform: Albumentations transforms, used if installed
""" """
def __init__(self, root, augment, imgsz, cache=False): def __init__(self, root, augment=False, imgsz=224, cache=False):
"""Initialize YOLO object with root, image size, augmentations, and cache settings""" """Initialize YOLO object with root, image size, augmentations, and cache settings"""
super().__init__(root=root) super().__init__(root=root)
self.torch_transforms = classify_transforms(imgsz) self.torch_transforms = classify_transforms(imgsz)

@ -310,17 +310,19 @@ class HUBDatasetStats():
Arguments Arguments
path: Path to data.yaml or data.zip (with data.yaml inside data.zip) path: Path to data.yaml or data.zip (with data.yaml inside data.zip)
task: Dataset task. Options are 'detect', 'segment', 'pose', 'classify'.
autodownload: Attempt to download dataset if not found locally autodownload: Attempt to download dataset if not found locally
Usage Usage
from ultralytics.yolo.data.utils import HUBDatasetStats from ultralytics.yolo.data.utils import HUBDatasetStats
stats = HUBDatasetStats('coco128.yaml', autodownload=True) # usage 1 stats = HUBDatasetStats('/Users/glennjocher/Downloads/coco8.zip', task='detect') # detect dataset
stats = HUBDatasetStats('/Users/glennjocher/Downloads/coco6.zip') # usage 2 stats = HUBDatasetStats('/Users/glennjocher/Downloads/coco8-seg.zip', task='segment') # segment dataset
stats = HUBDatasetStats('/Users/glennjocher/Downloads/coco8-pose.zip', task='pose') # pose dataset
stats.get_json(save=False) stats.get_json(save=False)
stats.process_images() stats.process_images()
""" """
def __init__(self, path='coco128.yaml', autodownload=False): def __init__(self, path='coco128.yaml', task='detect', autodownload=False):
"""Initialize class.""" """Initialize class."""
zipped, data_dir, yaml_path = self._unzip(Path(path)) zipped, data_dir, yaml_path = self._unzip(Path(path))
try: try:
@ -336,6 +338,7 @@ class HUBDatasetStats():
self.im_dir.mkdir(parents=True, exist_ok=True) # makes /images self.im_dir.mkdir(parents=True, exist_ok=True) # makes /images
self.stats = {'nc': len(data['names']), 'names': list(data['names'].values())} # statistics dictionary self.stats = {'nc': len(data['names']), 'names': list(data['names'].values())} # statistics dictionary
self.data = data self.data = data
self.task = task # detect, segment, pose, classify
@staticmethod @staticmethod
def _find_yaml(dir): def _find_yaml(dir):
@ -352,11 +355,10 @@ class HUBDatasetStats():
"""Unzip data.zip.""" """Unzip data.zip."""
if not str(path).endswith('.zip'): # path is data.yaml if not str(path).endswith('.zip'): # path is data.yaml
return False, None, path return False, None, path
assert Path(path).is_file(), f'Error unzipping {path}, file not found' unzip_dir = unzip_file(path, path=path.parent)
unzip_file(path, path=path.parent) assert unzip_dir.is_dir(), f'Error unzipping {path}, {unzip_dir} not found. ' \
dir = path.with_suffix('') # dataset directory == zip name f'path/to/abc.zip MUST unzip to path/to/abc/'
assert dir.is_dir(), f'Error unzipping {path}, {dir} not found. path/to/abc.zip MUST unzip to path/to/abc/' return True, str(unzip_dir), self._find_yaml(unzip_dir) # zipped, data_dir, yaml_path
return True, str(dir), self._find_yaml(dir) # zipped, data_dir, yaml_path
def _hub_ops(self, f): def _hub_ops(self, f):
"""Saves a compressed image for HUB previews.""" """Saves a compressed image for HUB previews."""
@ -364,20 +366,33 @@ class HUBDatasetStats():
def get_json(self, save=False, verbose=False): def get_json(self, save=False, verbose=False):
"""Return dataset JSON for Ultralytics HUB.""" """Return dataset JSON for Ultralytics HUB."""
# from ultralytics.yolo.data import YOLODataset from ultralytics.yolo.data import YOLODataset # ClassificationDataset
from ultralytics.yolo.data.dataloaders.v5loader import LoadImagesAndLabels
def _round(labels): def _round(labels):
"""Update labels to integer class and 6 decimal place floats.""" """Update labels to integer class and 4 decimal place floats."""
return [[int(c), *(round(x, 4) for x in points)] for c, *points in labels] if self.task == 'detect':
coordinates = labels['bboxes']
elif self.task == 'segment':
coordinates = [x.flatten() for x in labels['segments']]
elif self.task == 'pose':
n = labels['keypoints'].shape[0]
coordinates = np.concatenate((labels['bboxes'], labels['keypoints'].reshape(n, -1)), 1)
else:
raise ValueError('Undefined dataset task.')
zipped = zip(labels['cls'], coordinates)
return [[int(c), *(round(float(x), 4) for x in points)] for c, points in zipped]
for split in 'train', 'val', 'test': for split in 'train', 'val', 'test':
if self.data.get(split) is None: if self.data.get(split) is None:
self.stats[split] = None # i.e. no test set self.stats[split] = None # i.e. no test set
continue continue
dataset = LoadImagesAndLabels(self.data[split]) # load dataset
dataset = YOLODataset(img_path=self.data[split],
data=self.data,
use_segments=self.task == 'segment',
use_keypoints=self.task == 'pose')
x = np.array([ x = np.array([
np.bincount(label[:, 0].astype(int), minlength=self.data['nc']) np.bincount(label['cls'].astype(int).flatten(), minlength=self.data['nc'])
for label in tqdm(dataset.labels, total=len(dataset), desc='Statistics')]) # shape(128x80) for label in tqdm(dataset.labels, total=len(dataset), desc='Statistics')]) # shape(128x80)
self.stats[split] = { self.stats[split] = {
'instance_stats': { 'instance_stats': {
@ -388,7 +403,7 @@ class HUBDatasetStats():
'unlabelled': int(np.all(x == 0, 1).sum()), 'unlabelled': int(np.all(x == 0, 1).sum()),
'per_class': (x > 0).sum(0).tolist()}, 'per_class': (x > 0).sum(0).tolist()},
'labels': [{ 'labels': [{
str(Path(k).name): _round(v.tolist())} for k, v in zip(dataset.im_files, dataset.labels)]} Path(k).name: _round(v)} for k, v in zip(dataset.im_files, dataset.labels)]}
# Save, print and return # Save, print and return
if save: if save:
@ -402,13 +417,12 @@ class HUBDatasetStats():
def process_images(self): def process_images(self):
"""Compress images for Ultralytics HUB.""" """Compress images for Ultralytics HUB."""
# from ultralytics.yolo.data import YOLODataset from ultralytics.yolo.data import YOLODataset # ClassificationDataset
from ultralytics.yolo.data.dataloaders.v5loader import LoadImagesAndLabels
for split in 'train', 'val', 'test': for split in 'train', 'val', 'test':
if self.data.get(split) is None: if self.data.get(split) is None:
continue continue
dataset = LoadImagesAndLabels(self.data[split]) # load dataset dataset = YOLODataset(img_path=self.data[split], data=self.data)
with ThreadPool(NUM_THREADS) as pool: with ThreadPool(NUM_THREADS) as pool:
for _ in tqdm(pool.imap(self._hub_ops, dataset.im_files), total=len(dataset), desc=f'{split} images'): for _ in tqdm(pool.imap(self._hub_ops, dataset.im_files), total=len(dataset), desc=f'{split} images'):
pass pass

@ -37,26 +37,39 @@ def is_url(url, check=True):
def unzip_file(file, path=None, exclude=('.DS_Store', '__MACOSX')): def unzip_file(file, path=None, exclude=('.DS_Store', '__MACOSX')):
""" """
Unzip a *.zip file to path/, excluding files containing strings in exclude list Unzips a *.zip file to the specified path, excluding files containing strings in the exclude list.
Replaces: ZipFile(file).extractall(path=path)
If the zipfile does not contain a single top-level directory, the function will create a new
directory with the same name as the zipfile (without the extension) to extract its contents.
If a path is not provided, the function will use the parent directory of the zipfile as the default path.
Args:
file (str): The path to the zipfile to be extracted.
path (str, optional): The path to extract the zipfile to. Defaults to None.
exclude (tuple, optional): A tuple of filename strings to be excluded. Defaults to ('.DS_Store', '__MACOSX').
Raises:
BadZipFile: If the provided file does not exist or is not a valid zipfile.
Returns:
(Path): The path to the directory where the zipfile was extracted.
""" """
if not (Path(file).exists() and is_zipfile(file)): if not (Path(file).exists() and is_zipfile(file)):
raise BadZipFile(f"File '{file}' does not exist or is a bad zip file.") raise BadZipFile(f"File '{file}' does not exist or is a bad zip file.")
if path is None: if path is None:
path = Path(file).parent # default path path = Path(file).parent # default path
with ZipFile(file) as zipObj: with ZipFile(file) as zipObj:
for i, f in enumerate(zipObj.namelist()): # list all archived filenames in the zip file_list = [f for f in zipObj.namelist() if all(x not in f for x in exclude)]
# If zip does not expand into a directory create a new directory to expand into top_level_dirs = {Path(f).parts[0] for f in file_list}
if i == 0:
info = zipObj.getinfo(f) if len(top_level_dirs) > 1 or not file_list[0].endswith('/'):
if info.file_size > 0 or not info.filename.endswith('/'): # element is a file and not a directory
path = Path(path) / Path(file).stem # define new unzip directory path = Path(path) / Path(file).stem # define new unzip directory
unzip_dir = path
else: for f in file_list:
unzip_dir = f
if all(x not in f for x in exclude):
zipObj.extract(f, path=path) zipObj.extract(f, path=path)
return unzip_dir # return unzip dir
return path # return unzip dir
def check_disk_space(url='https://ultralytics.com/assets/coco128.zip', sf=1.5, hard=True): def check_disk_space(url='https://ultralytics.com/assets/coco128.zip', sf=1.5, hard=True):

@ -318,7 +318,7 @@ class ConfusionMatrix:
nc, nn = self.nc, len(names) # number of classes, names nc, nn = self.nc, len(names) # number of classes, names
sn.set(font_scale=1.0 if nc < 50 else 0.8) # for label size sn.set(font_scale=1.0 if nc < 50 else 0.8) # for label size
labels = (0 < nn < 99) and (nn == nc) # apply names to ticklabels labels = (0 < nn < 99) and (nn == nc) # apply names to ticklabels
ticklabels = (names + ['background']) if labels else 'auto' ticklabels = (list(names) + ['background']) if labels else 'auto'
with warnings.catch_warnings(): with warnings.catch_warnings():
warnings.simplefilter('ignore') # suppress empty matrix RuntimeWarning: All-NaN slice encountered warnings.simplefilter('ignore') # suppress empty matrix RuntimeWarning: All-NaN slice encountered
sn.heatmap(array, sn.heatmap(array,
@ -332,10 +332,11 @@ class ConfusionMatrix:
vmin=0.0, vmin=0.0,
xticklabels=ticklabels, xticklabels=ticklabels,
yticklabels=ticklabels).set_facecolor((1, 1, 1)) yticklabels=ticklabels).set_facecolor((1, 1, 1))
title = 'Confusion Matrix' + ' Normalized' * normalize
ax.set_xlabel('True') ax.set_xlabel('True')
ax.set_ylabel('Predicted') ax.set_ylabel('Predicted')
ax.set_title('Confusion Matrix') ax.set_title(title)
fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250) fig.savefig(Path(save_dir) / f'{title.lower().replace(" ", "_")}.png', dpi=250)
plt.close(fig) plt.close(fig)
def print(self): def print(self):

@ -38,5 +38,5 @@ default_space = {
task_metric_map = { task_metric_map = {
'detect': 'metrics/mAP50-95(B)', 'detect': 'metrics/mAP50-95(B)',
'segment': 'metrics/mAP50-95(M)', 'segment': 'metrics/mAP50-95(M)',
'classify': 'top1_acc', 'classify': 'metrics/accuracy_top1',
'pose': None} 'pose': 'metrics/mAP50-95(P)'}

@ -72,9 +72,8 @@ class ClassificationTrainer(BaseTrainer):
return # dont return ckpt. Classification doesn't support resume return # dont return ckpt. Classification doesn't support resume
def build_dataset(self, img_path, mode='train'): def build_dataset(self, img_path, mode='train', batch=None):
dataset = ClassificationDataset(root=img_path, imgsz=self.args.imgsz, augment=mode == 'train') return ClassificationDataset(root=img_path, imgsz=self.args.imgsz, augment=mode == 'train')
return dataset
def get_dataloader(self, dataset_path, batch_size=16, rank=0, mode='train'): def get_dataloader(self, dataset_path, batch_size=16, rank=0, mode='train'):
"""Returns PyTorch DataLoader with transforms to preprocess images for inference.""" """Returns PyTorch DataLoader with transforms to preprocess images for inference."""

@ -46,7 +46,8 @@ class ClassificationValidator(BaseValidator):
"""Finalizes metrics of the model such as confusion_matrix and speed.""" """Finalizes metrics of the model such as confusion_matrix and speed."""
self.confusion_matrix.process_cls_preds(self.pred, self.targets) self.confusion_matrix.process_cls_preds(self.pred, self.targets)
if self.args.plots: if self.args.plots:
self.confusion_matrix.plot(save_dir=self.save_dir, names=list(self.names.values())) for normalize in True, False:
self.confusion_matrix.plot(save_dir=self.save_dir, names=self.names.values(), normalize=normalize)
self.metrics.speed = self.speed self.metrics.speed = self.speed
self.metrics.confusion_matrix = self.confusion_matrix self.metrics.confusion_matrix = self.confusion_matrix

@ -32,7 +32,7 @@ class DetectionTrainer(BaseTrainer):
gs = max(int(de_parallel(self.model).stride.max() if self.model else 0), 32) gs = max(int(de_parallel(self.model).stride.max() if self.model else 0), 32)
return build_yolo_dataset(self.args, img_path, batch, self.data, mode=mode, rect=mode == 'val', stride=gs) return build_yolo_dataset(self.args, img_path, batch, self.data, mode=mode, rect=mode == 'val', stride=gs)
def get_dataloader(self, dataset_path, batch_size, rank=0, mode='train'): def get_dataloader(self, dataset_path, batch_size=16, rank=0, mode='train'):
"""TODO: manage splits differently.""" """TODO: manage splits differently."""
# Calculate stride - check if model is initialized # Calculate stride - check if model is initialized
if self.args.v5loader: if self.args.v5loader:
@ -62,8 +62,7 @@ class DetectionTrainer(BaseTrainer):
LOGGER.warning("WARNING ⚠️ 'rect=True' is incompatible with DataLoader shuffle, setting shuffle=False") LOGGER.warning("WARNING ⚠️ 'rect=True' is incompatible with DataLoader shuffle, setting shuffle=False")
shuffle = False shuffle = False
workers = self.args.workers if mode == 'train' else self.args.workers * 2 workers = self.args.workers if mode == 'train' else self.args.workers * 2
dataloader = build_dataloader(dataset, batch_size, workers, shuffle, rank) return build_dataloader(dataset, batch_size, workers, shuffle, rank) # return dataloader
return dataloader
def preprocess_batch(self, batch): def preprocess_batch(self, batch):
"""Preprocesses a batch of images by scaling and converting to float.""" """Preprocesses a batch of images by scaling and converting to float."""

@ -144,7 +144,8 @@ class DetectionValidator(BaseValidator):
LOGGER.info(pf % (self.names[c], self.seen, self.nt_per_class[c], *self.metrics.class_result(i))) LOGGER.info(pf % (self.names[c], self.seen, self.nt_per_class[c], *self.metrics.class_result(i)))
if self.args.plots: if self.args.plots:
self.confusion_matrix.plot(save_dir=self.save_dir, names=list(self.names.values())) for normalize in True, False:
self.confusion_matrix.plot(save_dir=self.save_dir, names=self.names.values(), normalize=normalize)
def _process_batch(self, detections, labels): def _process_batch(self, detections, labels):
""" """
