YOLOv8-16bit/docs/models/rtdetr.md

---
comments: true
description: Explore RT-DETR, a high-performance real-time object detector. Learn how to use pre-trained models with Ultralytics Python API for various tasks.
---

# RT-DETR

## Overview

Real-Time Detection Transformer (RT-DETR) is an end-to-end object detector that provides real-time performance while maintaining high accuracy. It efficiently processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion, and supports flexible adjustment of inference speed using different decoder layers without retraining. RT-DETR outperforms many real-time object detectors on accelerated backends like CUDA with TensorRT.

### Key Features

- **Efficient Hybrid Encoder:** RT-DETR uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This design reduces computational costs and allows for real-time object detection.
- **IoU-aware Query Selection:** RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene.
- **Adaptable Inference Speed:** RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.

## Pre-trained Models

Ultralytics RT-DETR provides several pre-trained models with different scales:

- RT-DETR-L: 53.0% AP on COCO val2017, 114 FPS on T4 GPU
- RT-DETR-X: 54.8% AP on COCO val2017, 74 FPS on T4 GPU

## Usage

### Python API

```python
from ultralytics import RTDETR

model = RTDETR("rtdetr-l.pt")
model.info()  # display model information
model.predict("path/to/image.jpg")  # predict
```

### Supported Tasks

| Model Type          | Pre-trained Weights | Tasks Supported  |
|---------------------|---------------------|------------------|
| RT-DETR Large       | `rtdetr-l.pt`       | Object Detection |
| RT-DETR Extra-Large | `rtdetr-x.pt`       | Object Detection |

### Supported Modes

| Mode       | Supported          |
|------------|--------------------|
| Inference  | :heavy_check_mark: |
| Validation | :heavy_check_mark: |
| Training   | :x: (Coming soon)  |

# Citations and Acknowledgements

If you use RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069):

```bibtex
@misc{lv2023detrs,
      title={DETRs Beat YOLOs on Real-time Object Detection},
      author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
      year={2023},
      eprint={2304.08069},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

We would like to acknowledge Baidu's [PaddlePaddle]((https://github.com/PaddlePaddle/PaddleDetection)) team for creating and maintaining this valuable resource for the computer vision community.
`ultralytics 8.0.99` HUB resume fix and Docs updates (#2567) Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com> Co-authored-by: Yonghye Kwon <developer.0hye@gmail.com> 2 years ago			`---`
			`comments: true`
			`description: Explore RT-DETR, a high-performance real-time object detector. Learn how to use pre-trained models with Ultralytics Python API for various tasks.`
			`---`

			`# RT-DETR`

			`## Overview`

			`Real-Time Detection Transformer (RT-DETR) is an end-to-end object detector that provides real-time performance while maintaining high accuracy. It efficiently processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion, and supports flexible adjustment of inference speed using different decoder layers without retraining. RT-DETR outperforms many real-time object detectors on accelerated backends like CUDA with TensorRT.`

			`### Key Features`

			`- Efficient Hybrid Encoder: RT-DETR uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This design reduces computational costs and allows for real-time object detection.`
			`- IoU-aware Query Selection: RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene.`
			`- Adaptable Inference Speed: RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.`

			`## Pre-trained Models`

			`Ultralytics RT-DETR provides several pre-trained models with different scales:`

			`- RT-DETR-L: 53.0% AP on COCO val2017, 114 FPS on T4 GPU`
			`- RT-DETR-X: 54.8% AP on COCO val2017, 74 FPS on T4 GPU`

			`## Usage`

			`### Python API`

			```python
			`from ultralytics import RTDETR`

			`model = RTDETR("rtdetr-l.pt")`
			`model.info() # display model information`
			`model.predict("path/to/image.jpg") # predict`
			```

			`### Supported Tasks`

			`\| Model Type \| Pre-trained Weights \| Tasks Supported \|`
			`\|---------------------\|---------------------\|------------------\|`
			\| RT-DETR Large \| `rtdetr-l.pt` \| Object Detection \|
			\| RT-DETR Extra-Large \| `rtdetr-x.pt` \| Object Detection \|

			`### Supported Modes`

			`\| Mode \| Supported \|`
			`\|------------\|--------------------\|`
			`\| Inference \| :heavy_check_mark: \|`
			`\| Validation \| :heavy_check_mark: \|`
			`\| Training \| :x: (Coming soon) \|`

`ultralytics 8.0.100` add Mosaic9() augmentation (#2605) Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com> Co-authored-by: Tommy in Tongji <36354458+TommyZihao@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: BIGBOSS-FOX <47949596+BIGBOSS-FOX@users.noreply.github.com> Co-authored-by: xbkaishui <xxkaishui@gmail.com> 2 years ago			`# Citations and Acknowledgements`

			`If you use RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069):`

			```bibtex
			`@misc{lv2023detrs,`
			`title={DETRs Beat YOLOs on Real-time Object Detection},`
			`author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},`
			`year={2023},`
			`eprint={2304.08069},`
			`archivePrefix={arXiv},`
			`primaryClass={cs.CV}`
			`}`
			```

			`We would like to acknowledge Baidu's [PaddlePaddle]((https://github.com/PaddlePaddle/PaddleDetection)) team for creating and maintaining this valuable resource for the computer vision community.`