diff --git a/.github/workflows/greetings.yml b/.github/workflows/greetings.yml index ebaf609..d851032 100644 --- a/.github/workflows/greetings.yml +++ b/.github/workflows/greetings.yml @@ -32,7 +32,7 @@ jobs: If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our [Tips for Best Training Results](https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/). - Join the vibrant [Ultralytics Discord](https://discord.gg/YVsATxj6wr) 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users. + Join the vibrant [Ultralytics Discord](https://ultralytics.com/discord) 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users. ## Install diff --git a/docs/models/fast-sam.md b/docs/models/fast-sam.md index aaa8813..4e022ac 100644 --- a/docs/models/fast-sam.md +++ b/docs/models/fast-sam.md @@ -155,15 +155,19 @@ Additionally, you can try FastSAM through a [Colab demo](https://colab.research. We would like to acknowledge the FastSAM authors for their significant contributions in the field of real-time instance segmentation: -```bibtex -@misc{zhao2023fast, - title={Fast Segment Anything}, - author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang}, - year={2023}, - eprint={2306.12156}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{zhao2023fast, + title={Fast Segment Anything}, + author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang}, + year={2023}, + eprint={2306.12156}, + archivePrefix={arXiv}, + primaryClass={cs.CV} + } + ``` The original FastSAM paper can be found on [arXiv](https://arxiv.org/abs/2306.12156). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/CASIA-IVA-Lab/FastSAM). We appreciate their efforts in advancing the field and making their work accessible to the broader community. diff --git a/docs/models/index.md b/docs/models/index.md index e683790..389e5e6 100644 --- a/docs/models/index.md +++ b/docs/models/index.md @@ -17,32 +17,51 @@ In this documentation, we provide information on four major models: 5. [YOLOv7](./yolov7.md): Updated YOLO models released in 2022 by the authors of YOLOv4. 6. [YOLOv8](./yolov8.md): The latest version of the YOLO family, featuring enhanced capabilities such as instance segmentation, pose/keypoints estimation, and classification. 7. [Segment Anything Model (SAM)](./sam.md): Meta's Segment Anything Model (SAM). -7. [Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md): MobileSAM for mobile applications by Kyung Hee University. -8. [Fast Segment Anything Model (FastSAM)](./fast-sam.md): FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences. -9. [YOLO-NAS](./yolo-nas.md): YOLO Neural Architecture Search (NAS) Models. -10. [Realtime Detection Transformers (RT-DETR)](./rtdetr.md): Baidu's PaddlePaddle Realtime Detection Transformer (RT-DETR) models. +8. [Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md): MobileSAM for mobile applications by Kyung Hee University. +9. 
[Fast Segment Anything Model (FastSAM)](./fast-sam.md): FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences. +10. [YOLO-NAS](./yolo-nas.md): YOLO Neural Architecture Search (NAS) Models. +11. [Realtime Detection Transformers (RT-DETR)](./rtdetr.md): Baidu's PaddlePaddle Realtime Detection Transformer (RT-DETR) models. You can use many of these models directly in the Command Line Interface (CLI) or in a Python environment. Below are examples of how to use the models with CLI and Python: -## CLI Example +## Usage -Use the `model` argument to pass a model YAML such as `model=yolov8n.yaml` or a pretrained *.pt file such as `model=yolov8n.pt` +You can use many of these models for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use these models for training and inference: -```bash -yolo task=detect mode=train model=yolov8n.pt data=coco128.yaml epochs=100 -``` +!!! example "" -## Python Example + This example provides simple inference code for YOLO, SAM and RTDETR models. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using models with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md). -PyTorch pretrained models as well as model YAML files can also be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in python: + === "Python" -```python -from ultralytics import YOLO + PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in python: -model = YOLO("yolov8n.pt") # load a pretrained YOLOv8n model + ```python + from ultralytics import YOLO -model.info() # display model information -model.train(data="coco128.yaml", epochs=100) # train the model -``` + # Load a COCO-pretrained YOLOv8n model + model = YOLO('yolov8n.pt') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the YOLOv8n model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs + yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640 + + # Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image + yolo predict model=yolov8n.pt source=path/to/bus.jpg + ``` For more details on each model, their supported tasks, modes, and performance, please visit their respective documentation pages linked above. diff --git a/docs/models/mobile-sam.md b/docs/models/mobile-sam.md index 66d2391..7c49d4f 100644 --- a/docs/models/mobile-sam.md +++ b/docs/models/mobile-sam.md @@ -12,7 +12,7 @@ The MobileSAM paper is now available on [arXiv](https://arxiv.org/pdf/2306.14289 A demonstration of MobileSAM running on a CPU can be accessed at this [demo link](https://huggingface.co/spaces/dhkim2810/MobileSAM). The performance on a Mac i5 CPU takes approximately 3 seconds. On the Hugging Face demo, the interface and lower-performance CPUs contribute to a slower response, but it continues to function effectively.
-MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [SegmentAnythingin3D](https://github.com/Jumpat/SegmentAnythingin3D). +MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [Segment Anything in 3D](https://github.com/Jumpat/SegmentAnythingin3D). MobileSAM is trained on a single GPU with a 100k dataset (1% of the original images) in less than a day. The code for this training will be made available in the future. @@ -85,15 +85,19 @@ model.predict('ultralytics/assets/zidane.jpg', bboxes=[439, 437, 524, 709]) We have implemented `MobileSAM` and `SAM` using the same API. For more usage information, please see the [SAM page](./sam.md). -### Citing MobileSAM +## Citations and Acknowledgements If you find MobileSAM useful in your research or development work, please consider citing our paper: -```bibtex -@article{mobile_sam, - title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications}, - author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon}, - journal={arXiv preprint arXiv:2306.14289}, - year={2023} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @article{mobile_sam, + title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications}, + author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon}, + journal={arXiv preprint arXiv:2306.14289}, + year={2023} + } + ``` diff --git a/docs/models/rtdetr.md b/docs/models/rtdetr.md index 6c8b642..256f8c4 100644 --- a/docs/models/rtdetr.md +++ b/docs/models/rtdetr.md @@ -15,7 +15,7 @@ Real-Time Detection Transformer (RT-DETR), developed by Baidu, is a cutting-edge ### Key Features -- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection. +- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multiscale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection. - **IoU-aware Query Selection:** Baidu's RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene, enhancing the detection accuracy. - **Adaptable Inference Speed:** Baidu's RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios. @@ -28,16 +28,39 @@ The Ultralytics Python API provides pre-trained PaddlePaddle RT-DETR models with ## Usage -### Python API +You can use RT-DETR for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use RT-DETR models for training and inference: -```python -from ultralytics import RTDETR +!!! 
example "" -model = RTDETR("rtdetr-l.pt") -model.info() # display model information -model.train(data="coco8.yaml") # train -model.predict("path/to/image.jpg") # predict -``` + This example provides simple inference code for RT-DETR. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using RT-DETR with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md). + + === "Python" + + ```python + from ultralytics import RTDETR + + # Load a COCO-pretrained RT-DETR-l model + model = RTDETR('rtdetr-l.pt') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the RT-DETR-l model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + ```bash + # Load a COCO-pretrained RT-DETR-l model and train it on the COCO8 example dataset for 100 epochs + yolo train model=rtdetr-l.pt data=coco8.yaml epochs=100 imgsz=640 + + # Load a COCO-pretrained RT-DETR-l model and run inference on the 'bus.jpg' image + yolo predict model=rtdetr-l.pt source=path/to/bus.jpg + ``` ### Supported Tasks @@ -54,20 +77,24 @@ model.predict("path/to/image.jpg") # predict | Validation | :heavy_check_mark: | | Training | :heavy_check_mark: | -# Citations and Acknowledgements +## Citations and Acknowledgements If you use Baidu's RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069): -```bibtex -@misc{lv2023detrs, - title={DETRs Beat YOLOs on Real-time Object Detection}, - author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu}, - year={2023}, - eprint={2304.08069}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{lv2023detrs, + title={DETRs Beat YOLOs on Real-time Object Detection}, + author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu}, + year={2023}, + eprint={2304.08069}, + archivePrefix={arXiv}, + primaryClass={cs.CV} + } + ``` We would like to acknowledge Baidu and the [PaddlePaddle](https://github.com/PaddlePaddle/PaddleDetection) team for creating and maintaining this valuable resource for the computer vision community. Their contribution to the field with the development of the Vision Transformers-based real-time object detector, RT-DETR, is greatly appreciated.
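The supported-modes table above lists Validation alongside Training and Inference, and the example note links to Export mode, but the snippet only demonstrates training and prediction. As an illustrative sketch only, assuming RT-DETR follows the same `val()` and `export()` API as other `ultralytics` models, validation and ONNX export might look like this:

```python
from ultralytics import RTDETR

# Load the same COCO-pretrained RT-DETR-l checkpoint used in the example above
model = RTDETR('rtdetr-l.pt')

# Validate on the COCO8 example dataset; the returned object holds detection metrics
metrics = model.val(data='coco8.yaml', imgsz=640)
print(metrics.box.map)  # mAP50-95, assuming the standard detection metrics structure

# Export the model, e.g. to ONNX (assumes RT-DETR supports the standard export formats)
onnx_path = model.export(format='onnx')
```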
diff --git a/docs/models/sam.md b/docs/models/sam.md index cd2d168..2aad7d8 100644 --- a/docs/models/sam.md +++ b/docs/models/sam.md @@ -72,6 +72,7 @@ The Segment Anything Model can be employed for a multitude of downstream tasks t # Run inference model('path/to/image.jpg') ``` + === "CLI" ```bash @@ -99,6 +100,7 @@ The Segment Anything Model can be employed for a multitude of downstream tasks t predictor.set_image(cv2.imread("ultralytics/assets/zidane.jpg")) # set with np.ndarray results = predictor(bboxes=[439, 437, 524, 709]) results = predictor(points=[900, 370], labels=[1]) + # Reset image predictor.reset_image() ``` @@ -114,9 +116,8 @@ The Segment Anything Model can be employed for a multitude of downstream tasks t overrides = dict(conf=0.25, task='segment', mode='predict', imgsz=1024, model="mobile_sam.pt") predictor = SAMPredictor(overrides=overrides) - # segment with additional args + # Segment with additional args results = predictor(source="ultralytics/assets/zidane.jpg", crop_n_layers=1, points_stride=64) - ``` - More additional args for `Segment everything` see [`Predictor/generate` Reference](../reference/models/sam/predict.md). @@ -140,11 +141,11 @@ The Segment Anything Model can be employed for a multitude of downstream tasks t Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics smallest segmentation model, [YOLOv8n-seg](../tasks/segment.md): -| Model | Size | Parameters | Speed (CPU) | -|------------------------------------------------|----------------------------|------------------------|-------------------------| -| Meta's SAM-b | 358 MB | 94.7 M | 51096 ms/im | -| [MobileSAM](mobile-sam.md) | 40.7 MB | 10.1 M | 46122 ms/im | -| [FastSAM-s](fast-sam.md) with YOLOv8 backbone | 23.7 MB | 11.8 M | 115 ms/im | +| Model | Size | Parameters | Speed (CPU) | +|------------------------------------------------|----------------------------|------------------------|----------------------------| +| Meta's SAM-b | 358 MB | 94.7 M | 51096 ms/im | +| [MobileSAM](mobile-sam.md) | 40.7 MB | 10.1 M | 46122 ms/im | +| [FastSAM-s](fast-sam.md) with YOLOv8 backbone | 23.7 MB | 11.8 M | 115 ms/im | | Ultralytics [YOLOv8n-seg](../tasks/segment.md) | **6.7 MB** (53.4x smaller) | **3.4 M** (27.9x less) | **59 ms/im** (866x faster) | This comparison shows the order-of-magnitude differences in the model sizes and speeds between models. Whereas SAM presents unique capabilities for automatic segmenting, it is not a direct competitor to YOLOv8 segment models, which are smaller, faster and more efficient. @@ -205,16 +206,20 @@ Auto-annotation with pre-trained models can dramatically cut down the time and e If you find SAM useful in your research or development work, please consider citing our paper: -```bibtex -@misc{kirillov2023segment, - title={Segment Anything}, - author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick}, - year={2023}, - eprint={2304.02643}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{kirillov2023segment, + title={Segment Anything}, + author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. 
Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick}, + year={2023}, + eprint={2304.02643}, + archivePrefix={arXiv}, + primaryClass={cs.CV} + } + ``` We would like to express our gratitude to Meta AI for creating and maintaining this valuable resource for the computer vision community. diff --git a/docs/models/yolo-nas.md b/docs/models/yolo-nas.md index 2dd834d..4137ac0 100644 --- a/docs/models/yolo-nas.md +++ b/docs/models/yolo-nas.md @@ -36,35 +36,49 @@ Each model variant is designed to offer a balance between Mean Average Precision ## Usage -### Python API +Ultralytics has made YOLO-NAS models easy to integrate into your Python applications via our `ultralytics` python package. The package provides a user-friendly Python API to streamline the process. -The YOLO-NAS models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API to streamline the process. +The following examples show how to use YOLO-NAS models with the `ultralytics` package for inference and validation: -#### Predict Usage +### Inference and Validation Examples -To perform object detection on an image, use the `predict` method as shown below: +In this example, we validate YOLO-NAS-s on the COCO8 dataset. -```python -from ultralytics import NAS +!!! example "" -model = NAS('yolo_nas_s') -results = model.predict('ultralytics/assets/bus.jpg') -``` + This example provides simple inference and validation code for YOLO-NAS. For handling inference results see [Predict](../modes/predict.md) mode. For using YOLO-NAS with additional modes see [Val](../modes/val.md) and [Export](../modes/export.md). The `ultralytics` package does not support training for YOLO-NAS models. -This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image. + === "Python" -#### Val Usage + PyTorch pretrained `*.pt` model files can be passed to the `NAS()` class to create a model instance in python: -Validation of the model on a dataset can be done as follows: + ```python + from ultralytics import NAS -```python -from ultralytics import NAS + # Load a COCO-pretrained YOLO-NAS-s model + model = NAS('yolo_nas_s.pt') -model = NAS('yolo_nas_s') -results = model.val(data='coco8.yaml) -``` + # Display model information (optional) + model.info() -In this example, the model is validated against the dataset specified in the 'coco8.yaml' file. + # Validate the model on the COCO8 example dataset + results = model.val(data='coco8.yaml') + + # Run inference with the YOLO-NAS-s model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Load a COCO-pretrained YOLO-NAS-s model and validate its performance on the COCO8 example dataset + yolo val model=yolo_nas_s.pt data=coco8.yaml + + # Load a COCO-pretrained YOLO-NAS-s model and run inference on the 'bus.jpg' image + yolo predict model=yolo_nas_s.pt source=path/to/bus.jpg + ``` ### Supported Tasks @@ -88,21 +102,25 @@ The YOLO-NAS models support both inference and validation modes, allowing you to Harness the power of the YOLO-NAS models to drive your object detection tasks to new heights of performance and speed.
-## Acknowledgements and Citations +## Citations and Acknowledgements If you employ YOLO-NAS in your research or development work, please cite SuperGradients: -```bibtex -@misc{supergradients, - doi = {10.5281/ZENODO.7789328}, - url = {https://zenodo.org/record/7789328}, - author = {Aharon, Shay and {Louis-Dupont} and {Ofri Masad} and Yurkova, Kate and {Lotem Fridman} and {Lkdci} and Khvedchenya, Eugene and Rubin, Ran and Bagrov, Natan and Tymchenko, Borys and Keren, Tomer and Zhilko, Alexander and {Eran-Deci}}, - title = {Super-Gradients}, - publisher = {GitHub}, - journal = {GitHub repository}, - year = {2021}, -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{supergradients, + doi = {10.5281/ZENODO.7789328}, + url = {https://zenodo.org/record/7789328}, + author = {Aharon, Shay and {Louis-Dupont} and {Ofri Masad} and Yurkova, Kate and {Lotem Fridman} and {Lkdci} and Khvedchenya, Eugene and Rubin, Ran and Bagrov, Natan and Tymchenko, Borys and Keren, Tomer and Zhilko, Alexander and {Eran-Deci}}, + title = {Super-Gradients}, + publisher = {GitHub}, + journal = {GitHub repository}, + year = {2021}, + } + ``` We express our gratitude to Deci AI's [SuperGradients](https://github.com/Deci-AI/super-gradients/) team for their efforts in creating and maintaining this valuable resource for the computer vision community. We believe YOLO-NAS, with its innovative architecture and superior object detection capabilities, will become a critical tool for developers and researchers alike. diff --git a/docs/models/yolov3.md b/docs/models/yolov3.md index 703829d..efe2160 100644 --- a/docs/models/yolov3.md +++ b/docs/models/yolov3.md @@ -1,7 +1,7 @@ --- comments: true description: Get an overview of YOLOv3, YOLOv3-Ultralytics and YOLOv3u. Learn about their key features, usage, and supported tasks for object detection. -keywords: YOLOv3, YOLOv3-Ultralytics, YOLOv3u, Object Detection, Inferencing, Training, Ultralytics +keywords: YOLOv3, YOLOv3-Ultralytics, YOLOv3u, Object Detection, Inference, Training, Ultralytics --- # YOLOv3, YOLOv3-Ultralytics, and YOLOv3u @@ -49,32 +49,59 @@ TODO ## Usage -You can use these models for object detection tasks using the Ultralytics YOLOv3 repository. The following is a sample code snippet showing how to use the YOLOv3u model for inference: +You can use YOLOv3 for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv3 model for inference: -```python -from ultralytics import YOLO +!!! example "" -# Load the model -model = YOLO('yolov3.pt') # load a pretrained model + This example provides simple inference code for YOLOv3. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv3 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md). 
-# Perform inference -results = model('image.jpg') + === "Python" -# Print the results -results.print() -``` + PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python: -## Citations and Acknowledgments + ```python + from ultralytics import YOLO + + # Load a COCO-pretrained YOLOv3n model + model = YOLO('yolov3n.pt') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the YOLOv3n model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs + yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640 + + # Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image + yolo predict model=yolov3n.pt source=path/to/bus.jpg + ``` + +## Citations and Acknowledgements If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository: -```bibtex -@article{redmon2018yolov3, - title={YOLOv3: An Incremental Improvement}, - author={Redmon, Joseph and Farhadi, Ali}, - journal={arXiv preprint arXiv:1804.02767}, - year={2018} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @article{redmon2018yolov3, + title={YOLOv3: An Incremental Improvement}, + author={Redmon, Joseph and Farhadi, Ali}, + journal={arXiv preprint arXiv:1804.02767}, + year={2018} + } + ``` Thank you to Joseph Redmon and Ali Farhadi for developing the original YOLOv3. diff --git a/docs/models/yolov4.md b/docs/models/yolov4.md index 60d4527..1af7ccd 100644 --- a/docs/models/yolov4.md +++ b/docs/models/yolov4.md @@ -53,15 +53,19 @@ YOLOv4 is a powerful and efficient object detection model that strikes a balance We would like to acknowledge the YOLOv4 authors for their significant contributions in the field of real-time object detection: -```bibtex -@misc{bochkovskiy2020yolov4, - title={YOLOv4: Optimal Speed and Accuracy of Object Detection}, - author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao}, - year={2020}, - eprint={2004.10934}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{bochkovskiy2020yolov4, + title={YOLOv4: Optimal Speed and Accuracy of Object Detection}, + author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao}, + year={2020}, + eprint={2004.10934}, + archivePrefix={arXiv}, + primaryClass={cs.CV} + } + ``` The original YOLOv4 paper can be found on [arXiv](https://arxiv.org/pdf/2004.10934.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/AlexeyAB/darknet). We appreciate their efforts in advancing the field and making their work accessible to the broader community. diff --git a/docs/models/yolov5.md b/docs/models/yolov5.md index 5ade9f4..feb86db 100644 --- a/docs/models/yolov5.md +++ b/docs/models/yolov5.md @@ -8,17 +8,17 @@ keywords: YOLOv5u, object detection, pre-trained models, Ultralytics, Inference, ## Overview -YOLOv5u is an enhanced version of the [YOLOv5](https://github.com/ultralytics/yolov5) object detection model from Ultralytics.
This iteration incorporates the anchor-free, objectness-free split head that is featured in the [YOLOv8](./yolov8.md) models. Although it maintains the same backbone and neck architecture as YOLOv5, YOLOv5u provides an improved accuracy-speed tradeoff for object detection tasks, making it a robust choice for numerous applications. +YOLOv5u represents an advancement in object detection methodologies. Originating from the foundational architecture of the [YOLOv5](https://github.com/ultralytics/yolov5) model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the [YOLOv8](./yolov8.md) models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications. ![Ultralytics YOLOv5](https://raw.githubusercontent.com/ultralytics/assets/main/yolov5/v70/splash.png) ## Key Features -- **Anchor-free Split Ultralytics Head:** YOLOv5u replaces the conventional anchor-based detection head with an anchor-free split Ultralytics head, boosting performance in object detection tasks. +- **Anchor-free Split Ultralytics Head:** Traditional object detection models rely on predefined anchor boxes to predict object locations. However, YOLOv5u modernizes this approach. By adopting an anchor-free split Ultralytics head, it ensures a more flexible and adaptive detection mechanism, consequently enhancing the performance in diverse scenarios. -- **Optimized Accuracy-Speed Tradeoff:** By delivering a better balance between accuracy and speed, YOLOv5u is suitable for a diverse range of real-time applications, from autonomous driving to video surveillance. +- **Optimized Accuracy-Speed Tradeoff:** Speed and accuracy often pull in opposite directions. But YOLOv5u challenges this tradeoff. It offers a calibrated balance, ensuring real-time detections without compromising on accuracy. This feature is particularly invaluable for applications that demand swift responses, such as autonomous vehicles, robotics, and real-time video analytics. -- **Variety of Pre-trained Models:** YOLOv5u includes numerous pre-trained models for tasks like Inference, Validation, and Training, providing the flexibility to tackle various object detection challenges. +- **Variety of Pre-trained Models:** Understanding that different tasks require different toolsets, YOLOv5u provides a plethora of pre-trained models. Whether you're focusing on Inference, Validation, or Training, there's a tailor-made model awaiting you. This variety ensures you're not just using a one-size-fits-all solution, but a model specifically fine-tuned for your unique challenge. ## Supported Tasks @@ -38,52 +38,78 @@ YOLOv5u is an enhanced version of the [YOLOv5](https://github.com/ultralytics/yo === "Detection" - | Model | size
(pixels) | mAPval
50-95 | Speed
CPU ONNX
(ms) | Speed
A100 TensorRT
(ms) | params
(M) | FLOPs
(B) | - | ---------------------------------------------------------------------------------------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- | - | [YOLOv5nu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5nu.pt) | 640 | 34.3 | 73.6 | 1.06 | 2.6 | 7.7 | - | [YOLOv5su](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5su.pt) | 640 | 43.0 | 120.7 | 1.27 | 9.1 | 24.0 | - | [YOLOv5mu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5mu.pt) | 640 | 49.0 | 233.9 | 1.86 | 25.1 | 64.2 | - | [YOLOv5lu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5lu.pt) | 640 | 52.2 | 408.4 | 2.50 | 53.2 | 135.0 | - | [YOLOv5xu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5xu.pt) | 640 | 53.2 | 763.2 | 3.81 | 97.2 | 246.4 | - | | | | | | | | - | [YOLOv5n6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5n6u.pt) | 1280 | 42.1 | 211.0 | 1.83 | 4.3 | 7.8 | - | [YOLOv5s6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5s6u.pt) | 1280 | 48.6 | 422.6 | 2.34 | 15.3 | 24.6 | - | [YOLOv5m6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5m6u.pt) | 1280 | 53.6 | 810.9 | 4.36 | 41.2 | 65.7 | - | [YOLOv5l6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5l6u.pt) | 1280 | 55.7 | 1470.9 | 5.47 | 86.1 | 137.4 | - | [YOLOv5x6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5x6u.pt) | 1280 | 56.8 | 2436.5 | 8.98 | 155.4 | 250.7 | + | Model | YAML | size
(pixels) | mAPval
50-95 | Speed
CPU ONNX
(ms) | Speed
A100 TensorRT
(ms) | params
(M) | FLOPs
(B) | + |---------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-----------------------|----------------------|--------------------------------|-------------------------------------|--------------------|-------------------| + | [yolov5nu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5nu.pt) | [yolov5n.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 34.3 | 73.6 | 1.06 | 2.6 | 7.7 | + | [yolov5su.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5su.pt) | [yolov5s.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 43.0 | 120.7 | 1.27 | 9.1 | 24.0 | + | [yolov5mu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5mu.pt) | [yolov5m.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 49.0 | 233.9 | 1.86 | 25.1 | 64.2 | + | [yolov5lu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5lu.pt) | [yolov5l.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 52.2 | 408.4 | 2.50 | 53.2 | 135.0 | + | [yolov5xu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5xu.pt) | [yolov5x.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 53.2 | 763.2 | 3.81 | 97.2 | 246.4 | + | | | | | | | | | + | [yolov5n6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5n6u.pt) | [yolov5n6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 42.1 | 211.0 | 1.83 | 4.3 | 7.8 | + | [yolov5s6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5s6u.pt) | [yolov5s6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 48.6 | 422.6 | 2.34 | 15.3 | 24.6 | + | [yolov5m6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5m6u.pt) | [yolov5m6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 53.6 | 810.9 | 4.36 | 41.2 | 65.7 | + | [yolov5l6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5l6u.pt) | [yolov5l6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 55.7 | 1470.9 | 5.47 | 86.1 | 137.4 | + | [yolov5x6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5x6u.pt) | [yolov5x6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 56.8 | 2436.5 | 8.98 | 155.4 | 250.7 | ## Usage You can use YOLOv5u for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv5u model for inference: -```python -from ultralytics import YOLO +!!! example "" -# Load the model -model = YOLO('yolov5n.pt') # load a pretrained model + This example provides simple inference code for YOLOv5. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv5 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md). 
-# Perform inference -results = model('image.jpg') + === "Python" -# Print the results -results.print() -``` + PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python: -## Citations and Acknowledgments + ```python + from ultralytics import YOLO + + # Load a COCO-pretrained YOLOv5n model + model = YOLO('yolov5n.pt') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the YOLOv5n model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs + yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640 + + # Load a COCO-pretrained YOLOv5n model and run inference on the 'bus.jpg' image + yolo predict model=yolov5n.pt source=path/to/bus.jpg + ``` + +## Citations and Acknowledgements If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows: -```bibtex -@software{yolov5, - title = {Ultralytics YOLOv5}, - author = {Glenn Jocher}, - year = {2020}, - version = {7.0}, - license = {AGPL-3.0}, - url = {https://github.com/ultralytics/yolov5}, - doi = {10.5281/zenodo.3908559}, - orcid = {0000-0001-5950-6979} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @software{yolov5, + title = {Ultralytics YOLOv5}, + author = {Glenn Jocher}, + year = {2020}, + version = {7.0}, + license = {AGPL-3.0}, + url = {https://github.com/ultralytics/yolov5}, + doi = {10.5281/zenodo.3908559}, + orcid = {0000-0001-5950-6979} + } + ``` Special thanks to Glenn Jocher and the Ultralytics team for their work on developing and maintaining the YOLOv5 and YOLOv5u models. diff --git a/docs/models/yolov6.md b/docs/models/yolov6.md index 2f13a80..a921612 100644 --- a/docs/models/yolov6.md +++ b/docs/models/yolov6.md @@ -17,7 +17,7 @@ structure of a BiC module. (c) A SimCSPSPPF block. ([source](https://arxiv.org/p ### Key Features -- **Bi-directional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation. +- **Bidirectional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation. - **Anchor-Aided Training (AAT) Strategy:** This model proposes AAT to enjoy the benefits of both anchor-based and anchor-free paradigms without compromising inference efficiency. - **Enhanced Backbone and Neck Design:** By deepening YOLOv6 to include another stage in the backbone and neck, this model achieves state-of-the-art performance on the COCO dataset at high-resolution input. - **Self-Distillation Strategy:** A new self-distillation strategy is implemented to boost the performance of smaller models of YOLOv6, enhancing the auxiliary regression branch during training and removing it at inference to avoid a marked speed decline. @@ -36,15 +36,43 @@ YOLOv6 also provides quantized models for different precisions and models optimi ## Usage -### Python API +You can use YOLOv6 for object detection tasks using the Ultralytics pip package.
The following is a sample code snippet showing how to use YOLOv6 models for training: -```python -from ultralytics import YOLO +!!! example "" -model = YOLO("yolov6n.yaml") # build new model from scratch -model.info() # display model information -model.predict("path/to/image.jpg") # predict -``` + This example provides simple training code for YOLOv6. For more options including training settings see [Train](../modes/train.md) mode. For using YOLOv6 with additional modes see [Predict](../modes/predict.md), [Val](../modes/val.md) and [Export](../modes/export.md). + + === "Python" + + PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python: + + ```python + from ultralytics import YOLO + + # Build a YOLOv6n model from scratch + model = YOLO('yolov6n.yaml') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the YOLOv6n model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Build a YOLOv6n model from scratch and train it on the COCO8 example dataset for 100 epochs + yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640 + + # Build a YOLOv6n model from scratch and run inference on the 'bus.jpg' image + yolo predict model=yolov6n.yaml source=path/to/bus.jpg + ``` ### Supported Tasks @@ -68,15 +96,19 @@ model.predict("path/to/image.jpg") # predict We would like to acknowledge the authors for their significant contributions in the field of real-time object detection: -```bibtex -@misc{li2023yolov6, - title={YOLOv6 v3.0: A Full-Scale Reloading}, - author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu}, - year={2023}, - eprint={2301.05586}, - archivePrefix={arXiv}, - primaryClass={cs.CV} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @misc{li2023yolov6, + title={YOLOv6 v3.0: A Full-Scale Reloading}, + author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu}, + year={2023}, + eprint={2301.05586}, + archivePrefix={arXiv}, + primaryClass={cs.CV} + } + ``` The original YOLOv6 paper can be found on [arXiv](https://arxiv.org/abs/2301.05586). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/meituan/YOLOv6). We appreciate their efforts in advancing the field and making their work accessible to the broader community. diff --git a/docs/models/yolov7.md b/docs/models/yolov7.md index 1350bbb..4f4a035 100644 --- a/docs/models/yolov7.md +++ b/docs/models/yolov7.md @@ -49,13 +49,17 @@ We regret any inconvenience this may cause and will strive to update this docume We would like to acknowledge the YOLOv7 authors for their significant contributions in the field of real-time object detection: -```bibtex -@article{wang2022yolov7, - title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors}, - author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark}, - journal={arXiv preprint arXiv:2207.02696}, - year={2022} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @article{wang2022yolov7, + title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors}, + author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark}, + journal={arXiv preprint arXiv:2207.02696}, + year={2022} + } + ``` The original YOLOv7 paper can be found on [arXiv](https://arxiv.org/pdf/2207.02696.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/WongKinYiu/yolov7). We appreciate their efforts in advancing the field and making their work accessible to the broader community. diff --git a/docs/models/yolov8.md b/docs/models/yolov8.md index 882fc4c..ca9752d 100644 --- a/docs/models/yolov8.md +++ b/docs/models/yolov8.md @@ -83,33 +83,60 @@ YOLOv8 is the latest iteration in the YOLO series of real-time object detectors, You can use YOLOv8 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv8 models for inference: -```python -from ultralytics import YOLO +!!! example "" -# Load the model -model = YOLO('yolov8n.pt') # load a pretrained model + This example provides simple inference code for YOLOv8. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv8 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md). -# Perform inference -results = model('image.jpg') + === "Python" -# Print the results -results.print() -``` + PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python: -## Citation + ```python + from ultralytics import YOLO + + # Load a COCO-pretrained YOLOv8n model + model = YOLO('yolov8n.pt') + + # Display model information (optional) + model.info() + + # Train the model on the COCO8 example dataset for 100 epochs + results = model.train(data='coco8.yaml', epochs=100, imgsz=640) + + # Run inference with the YOLOv8n model on the 'bus.jpg' image + results = model('path/to/bus.jpg') + ``` + + === "CLI" + + CLI commands are available to directly run the models: + + ```bash + # Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs + yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640 + + # Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image + yolo predict model=yolov8n.pt source=path/to/bus.jpg + ``` + +## Citations and Acknowledgements If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format: -```bibtex -@software{yolov8_ultralytics, - author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu}, - title = {Ultralytics YOLOv8}, - version = {8.0.0}, - year = {2023}, - url = {https://github.com/ultralytics/ultralytics}, - orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069}, - license = {AGPL-3.0} -} -``` +!!! note "" + + === "BibTeX" + + ```bibtex + @software{yolov8_ultralytics, + author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu}, + title = {Ultralytics YOLOv8}, + version = {8.0.0}, + year = {2023}, + url = {https://github.com/ultralytics/ultralytics}, + orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069}, + license = {AGPL-3.0} + } + ``` Please note that the DOI is pending and will be added to the citation once it is available.
The usage of the software is in accordance with the AGPL-3.0 license. diff --git a/docs/modes/predict.md b/docs/modes/predict.md index d7a62d6..881fd17 100644 --- a/docs/modes/predict.md +++ b/docs/modes/predict.md @@ -576,7 +576,7 @@ You can use the `plot()` method of a `Result` objects to visualize predictions. im.save('results.jpg') # save image ``` - The `plot()` method has the following arguments available: + The `plot()` method supports the following arguments: | Argument | Type | Description | Default | |--------------|-----------------|--------------------------------------------------------------------------------|---------------| diff --git a/ultralytics/engine/results.py b/ultralytics/engine/results.py index 19072aa..c387dc5 100644 --- a/ultralytics/engine/results.py +++ b/ultralytics/engine/results.py @@ -209,7 +209,7 @@ class Results(SimpleClass): results = model('bus.jpg') # results list for r in results: im_array = r.plot() # plot a BGR numpy array of predictions - im = Image.fromarray(im[..., ::-1]) # RGB PIL image + im = Image.fromarray(im_array[..., ::-1]) # RGB PIL image im.show() # show image im.save('results.jpg') # save image ```
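The predict.md hunk above references the argument table for the `plot()` method. As a brief illustrative sketch of passing a few of those arguments, where the names `conf`, `line_width` and `labels` are assumed from the `Results` API and should be checked against the table, customized plotting might look like this:

```python
from PIL import Image

from ultralytics import YOLO

# Run prediction as in the docstring above
model = YOLO('yolov8n.pt')
results = model('bus.jpg')  # list of Results objects

for r in results:
    # Hide confidence scores, use thicker boxes, keep class labels (assumed argument names)
    im_array = r.plot(conf=False, line_width=2, labels=True)  # BGR numpy array of predictions
    im = Image.fromarray(im_array[..., ::-1])  # convert BGR to RGB for PIL
    im.save('results_custom.jpg')  # save annotated image
```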