If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our [Tips for Best Training Results](https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/).
Join the vibrant [Ultralytics Discord](https://discord.gg/YVsATxj6wr) 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Join the vibrant [Ultralytics Discord](https://ultralytics.com/discord) 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
@ -155,15 +155,19 @@ Additionally, you can try FastSAM through a [Colab demo](https://colab.research.
We would like to acknowledge the FastSAM authors for their significant contributions in the field of real-time instance segmentation:
```bibtex
@misc{zhao2023fast,
!!! note ""
=== "BibTeX"
```bibtex
@misc{zhao2023fast,
title={Fast Segment Anything},
author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
year={2023},
eprint={2306.12156},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
}
```
The original FastSAM paper can be found on [arXiv](https://arxiv.org/abs/2306.12156). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/CASIA-IVA-Lab/FastSAM). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
@ -17,32 +17,51 @@ In this documentation, we provide information on four major models:
5. [YOLOv7](./yolov7.md): Updated YOLO models released in 2022 by the authors of YOLOv4.
6. [YOLOv8](./yolov8.md): The latest version of the YOLO family, featuring enhanced capabilities such as instance segmentation, pose/keypoints estimation, and classification.
7. [Segment Anything Model (SAM)](./sam.md): Meta's Segment Anything Model (SAM).
7. [Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md): MobileSAM for mobile applications by Kyung Hee University.
8. [Fast Segment Anything Model (FastSAM)](./fast-sam.md): FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences.
8. [Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md): MobileSAM for mobile applications by Kyung Hee University.
9. [Fast Segment Anything Model (FastSAM)](./fast-sam.md): FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences.
You can use many of these models directly in the Command Line Interface (CLI) or in a Python environment. Below are examples of how to use the models with CLI and Python:
## CLI Example
## Usage
Use the `model` argument to pass a model YAML such as `model=yolov8n.yaml` or a pretrained *.pt file such as `model=yolov8n.pt`
You can use RT-DETR for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use RT-DETR models for training and inference:
This example provides simple inference code for YOLO, SAM and RTDETR models. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using models with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
PyTorch pretrained models as well as model YAML files can also be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in python:
=== "Python"
```python
from ultralytics import YOLO
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in python:
model = YOLO("yolov8n.pt") # load a pretrained YOLOv8n model
```python
from ultralytics import YOLO
model.info() # display model information
model.train(data="coco128.yaml", epochs=100) # train the model
```
# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
@ -12,7 +12,7 @@ The MobileSAM paper is now available on [arXiv](https://arxiv.org/pdf/2306.14289
A demonstration of MobileSAM running on a CPU can be accessed at this [demo link](https://huggingface.co/spaces/dhkim2810/MobileSAM). The performance on a Mac i5 CPU takes approximately 3 seconds. On the Hugging Face demo, the interface and lower-performance CPUs contribute to a slower response, but it continues to function effectively.
MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [SegmentAnythingin3D](https://github.com/Jumpat/SegmentAnythingin3D).
MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [SegmentAnythingin3D](https://github.com/Jumpat/SegmentAnythingin3D).
MobileSAM is trained on a single GPU with a 100k dataset (1% of the original images) in less than a day. The code for this training will be made available in the future.
@ -15,7 +15,7 @@ Real-Time Detection Transformer (RT-DETR), developed by Baidu, is a cutting-edge
### Key Features
- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection.
- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multiscale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection.
- **IoU-aware Query Selection:** Baidu's RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene, enhancing the detection accuracy.
- **Adaptable Inference Speed:** Baidu's RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.
@ -28,16 +28,39 @@ The Ultralytics Python API provides pre-trained PaddlePaddle RT-DETR models with
## Usage
### Python API
You can use RT-DETR for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use RT-DETR models for training and inference:
```python
from ultralytics import RTDETR
!!! example ""
model = RTDETR("rtdetr-l.pt")
model.info() # display model information
model.train(data="coco8.yaml") # train
model.predict("path/to/image.jpg") # predict
```
This example provides simple inference code for RT-DETR. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using RT-DETR with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
```python
from ultralytics import RTDETR
# Load a COCO-pretrained RT-DETR-l model
model = RTDETR('rtdetr-l.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
If you use Baidu's RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069):
```bibtex
@misc{lv2023detrs,
!!! note ""
=== "BibTeX"
```bibtex
@misc{lv2023detrs,
title={DETRs Beat YOLOs on Real-time Object Detection},
author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
year={2023},
eprint={2304.08069},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
}
```
We would like to acknowledge Baidu and the [PaddlePaddle](https://github.com/PaddlePaddle/PaddleDetection) team for creating and maintaining this valuable resource for the computer vision community. Their contribution to the field with the development of the Vision Transformers-based real-time object detector, RT-DETR, is greatly appreciated.
| [FastSAM-s](fast-sam.md) with YOLOv8 backbone | 23.7 MB | 11.8 M | 115 ms/im |
@ -205,16 +206,20 @@ Auto-annotation with pre-trained models can dramatically cut down the time and e
If you find SAM useful in your research or development work, please consider citing our paper:
```bibtex
@misc{kirillov2023segment,
!!! note ""
=== "BibTeX"
```bibtex
@misc{kirillov2023segment,
title={Segment Anything},
author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
year={2023},
eprint={2304.02643},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
}
```
We would like to express our gratitude to Meta AI for creating and maintaining this valuable resource for the computer vision community.
@ -36,35 +36,49 @@ Each model variant is designed to offer a balance between Mean Average Precision
## Usage
### Python API
Ultralytics has made YOLO-NAS models easy to integrate into your Python applications via our `ultralytics` python package. The package provides a user-friendly Python API to streamline the process.
The YOLO-NAS models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API to streamline the process.
The following examples show how to use YOLO-NAS models with the `ultralytics` package for inference and validation:
#### Predict Usage
### Inference and Validation Examples
To perform object detection on an image, use the `predict` method as shown below:
In this example we validate YOLO-NAS-s on the COCO8 dataset.
This example provides simple inference and validation code for YOLO-NAS. For handling inference results see [Predict](../modes/predict.md) mode. For using YOLO-NAS with additional modes see [Val](../modes/val.md) and [Export](../modes/export.md). YOLO-NAS on the `ultralytics` package does not support training.
This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image.
=== "Python"
#### Val Usage
PyTorch pretrained `*.pt` models files can be passed to the `NAS()` class to create a model instance in python:
Validation of the model on a dataset can be done as follows:
```python
from ultralytics import NAS
```python
from ultralytics import NAS
# Load a COCO-pretrained YOLO-NAS-s model
model = NAS('yolo_nas_s.pt')
model = NAS('yolo_nas_s')
results = model.val(data='coco8.yaml)
```
# Display model information (optional)
model.info()
In this example, the model is validated against the dataset specified in the 'coco8.yaml' file.
# Validate the model on the COCO8 example dataset
results model.val(data='coco8.yaml')
# Run inference with the YOLO-NAS-s model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLO-NAS-s model and validate it's performance on the COCO8 example dataset
yolo val model=yolo_nas_s.pt data=coco8.yaml
# Load a COCO-pretrained YOLO-NAS-s model and run inference on the 'bus.jpg' image
@ -88,12 +102,16 @@ The YOLO-NAS models support both inference and validation modes, allowing you to
Harness the power of the YOLO-NAS models to drive your object detection tasks to new heights of performance and speed.
## Acknowledgements and Citations
## Citations and Acknowledgements
If you employ YOLO-NAS in your research or development work, please cite SuperGradients:
```bibtex
@misc{supergradients,
!!! note ""
=== "BibTeX"
```bibtex
@misc{supergradients,
doi = {10.5281/ZENODO.7789328},
url = {https://zenodo.org/record/7789328},
author = {Aharon, Shay and {Louis-Dupont} and {Ofri Masad} and Yurkova, Kate and {Lotem Fridman} and {Lkdci} and Khvedchenya, Eugene and Rubin, Ran and Bagrov, Natan and Tymchenko, Borys and Keren, Tomer and Zhilko, Alexander and {Eran-Deci}},
@ -101,8 +119,8 @@ If you employ YOLO-NAS in your research or development work, please cite SuperGr
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
}
```
}
```
We express our gratitude to Deci AI's [SuperGradients](https://github.com/Deci-AI/super-gradients/) team for their efforts in creating and maintaining this valuable resource for the computer vision community. We believe YOLO-NAS, with its innovative architecture and superior object detection capabilities, will become a critical tool for developers and researchers alike.
description: Get an overview of YOLOv3, YOLOv3-Ultralytics and YOLOv3u. Learn about their key features, usage, and supported tasks for object detection.
You can use these models for object detection tasks using the Ultralytics YOLOv3 repository. The following is a sample code snippet showing how to use the YOLOv3u model for inference:
You can use YOLOv3 for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv3 model for inference:
```python
from ultralytics import YOLO
!!! example ""
# Load the model
model = YOLO('yolov3.pt') # load a pretrained model
This example provides simple inference code for YOLOv3. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv3 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
# Perform inference
results = model('image.jpg')
=== "Python"
# Print the results
results.print()
```
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
## Citations and Acknowledgments
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv3n model
model = YOLO('yolov3n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
@ -53,15 +53,19 @@ YOLOv4 is a powerful and efficient object detection model that strikes a balance
We would like to acknowledge the YOLOv4 authors for their significant contributions in the field of real-time object detection:
```bibtex
@misc{bochkovskiy2020yolov4,
!!! note ""
=== "BibTeX"
```bibtex
@misc{bochkovskiy2020yolov4,
title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
year={2020},
eprint={2004.10934},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
}
```
The original YOLOv4 paper can be found on [arXiv](https://arxiv.org/pdf/2004.10934.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/AlexeyAB/darknet). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
YOLOv5u is an enhanced version of the [YOLOv5](https://github.com/ultralytics/yolov5) object detection model from Ultralytics. This iteration incorporates the anchor-free, objectness-free split head that is featured in the [YOLOv8](./yolov8.md) models. Although it maintains the same backbone and neck architecture as YOLOv5, YOLOv5u provides an improved accuracy-speed tradeoff for object detection tasks, making it a robust choice for numerous applications.
YOLOv5u represents an advancement in object detection methodologies. Originating from the foundational architecture of the [YOLOv5](https://github.com/ultralytics/yolov5) model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the [YOLOv8](./yolov8.md) models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications.
- **Anchor-free Split Ultralytics Head:**YOLOv5u replaces the conventional anchor-based detection head with an anchor-free split Ultralytics head, boosting performance in object detection tasks.
- **Anchor-free Split Ultralytics Head:**Traditional object detection models rely on predefined anchor boxes to predict object locations. However, YOLOv5u modernizes this approach. By adopting an anchor-free split Ultralytics head, it ensures a more flexible and adaptive detection mechanism, consequently enhancing the performance in diverse scenarios.
- **Optimized Accuracy-Speed Tradeoff:**By delivering a better balance between accuracy and speed, YOLOv5u is suitable for a diverse range of real-time applications, from autonomous driving to video surveillance.
- **Optimized Accuracy-Speed Tradeoff:**Speed and accuracy often pull in opposite directions. But YOLOv5u challenges this tradeoff. It offers a calibrated balance, ensuring real-time detections without compromising on accuracy. This feature is particularly invaluable for applications that demand swift responses, such as autonomous vehicles, robotics, and real-time video analytics.
- **Variety of Pre-trained Models:**YOLOv5u includes numerous pre-trained models for tasks like Inference, Validation, and Training, providing the flexibility to tackle various object detection challenges.
- **Variety of Pre-trained Models:**Understanding that different tasks require different toolsets, YOLOv5u provides a plethora of pre-trained models. Whether you're focusing on Inference, Validation, or Training, there's a tailor-made model awaiting you. This variety ensures you're not just using a one-size-fits-all solution, but a model specifically fine-tuned for your unique challenge.
## Supported Tasks
@ -38,43 +38,69 @@ YOLOv5u is an enhanced version of the [YOLOv5](https://github.com/ultralytics/yo
You can use YOLOv5u for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv5u model for inference:
```python
from ultralytics import YOLO
!!! example ""
# Load the model
model = YOLO('yolov5n.pt') # load a pretrained model
This example provides simple inference code for YOLOv5. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv5 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
# Perform inference
results = model('image.jpg')
=== "Python"
# Print the results
results.print()
```
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
## Citations and Acknowledgments
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv5n model
model = YOLO('yolov5n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
@ -17,7 +17,7 @@ structure of a BiC module. (c) A SimCSPSPPF block. ([source](https://arxiv.org/p
### Key Features
- **Bi-directional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation.
- **Bidirectional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation.
- **Anchor-Aided Training (AAT) Strategy:** This model proposes AAT to enjoy the benefits of both anchor-based and anchor-free paradigms without compromising inference efficiency.
- **Enhanced Backbone and Neck Design:** By deepening YOLOv6 to include another stage in the backbone and neck, this model achieves state-of-the-art performance on the COCO dataset at high-resolution input.
- **Self-Distillation Strategy:** A new self-distillation strategy is implemented to boost the performance of smaller models of YOLOv6, enhancing the auxiliary regression branch during training and removing it at inference to avoid a marked speed decline.
@ -36,15 +36,43 @@ YOLOv6 also provides quantized models for different precisions and models optimi
## Usage
### Python API
You can use YOLOv6 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv6 models for training:
```python
from ultralytics import YOLO
!!! example ""
model = YOLO("yolov6n.yaml") # build new model from scratch
model.info() # display model information
model.predict("path/to/image.jpg") # predict
```
This example provides simple training code for YOLOv6. For more options including training settings see [Train](../modes/train.md) mode. For using YOLOv6 with additional modes see [Predict](../modes/predict.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
```python
from ultralytics import YOLO
# Build a YOLOv6n model from scratch
model = YOLO('yolov6n.yaml')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
We would like to acknowledge the authors for their significant contributions in the field of real-time object detection:
```bibtex
@misc{li2023yolov6,
!!! note ""
=== "BibTeX"
```bibtex
@misc{li2023yolov6,
title={YOLOv6 v3.0: A Full-Scale Reloading},
author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu},
year={2023},
eprint={2301.05586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
}
```
The original YOLOv6 paper can be found on [arXiv](https://arxiv.org/abs/2301.05586). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/meituan/YOLOv6). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
@ -49,13 +49,17 @@ We regret any inconvenience this may cause and will strive to update this docume
We would like to acknowledge the YOLOv7 authors for their significant contributions in the field of real-time object detection:
```bibtex
@article{wang2022yolov7,
!!! note ""
=== "BibTeX"
```bibtex
@article{wang2022yolov7,
title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
journal={arXiv preprint arXiv:2207.02696},
year={2022}
}
```
}
```
The original YOLOv7 paper can be found on [arXiv](https://arxiv.org/pdf/2207.02696.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/WongKinYiu/yolov7). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
@ -83,25 +83,52 @@ YOLOv8 is the latest iteration in the YOLO series of real-time object detectors,
You can use YOLOv8 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv8 models for inference:
```python
from ultralytics import YOLO
!!! example ""
# Load the model
model = YOLO('yolov8n.pt') # load a pretrained model
This example provides simple inference code for YOLOv8. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv8 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
# Perform inference
results = model('image.jpg')
=== "Python"
# Print the results
results.print()
```
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
## Citation
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
Please note that the DOI is pending and will be added to the citation once it is available. The usage of the software is in accordance with the AGPL-3.0 license.