2.9 KiB

Raw Blame History

comments	description
true	Explore RT-DETR, a high-performance real-time object detector. Learn how to use pre-trained models with Ultralytics Python API for various tasks.

RT-DETR

Overview

Real-Time Detection Transformer (RT-DETR) is an end-to-end object detector that provides real-time performance while maintaining high accuracy. It efficiently processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion, and supports flexible adjustment of inference speed using different decoder layers without retraining. RT-DETR outperforms many real-time object detectors on accelerated backends like CUDA with TensorRT.

Key Features

Efficient Hybrid Encoder: RT-DETR uses an efficient hybrid encoder that processes multi-scale features by decoupling intra-scale interaction and cross-scale fusion. This design reduces computational costs and allows for real-time object detection.
IoU-aware Query Selection: RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene.
Adaptable Inference Speed: RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.

Pre-trained Models

Ultralytics RT-DETR provides several pre-trained models with different scales:

RT-DETR-L: 53.0% AP on COCO val2017, 114 FPS on T4 GPU
RT-DETR-X: 54.8% AP on COCO val2017, 74 FPS on T4 GPU

Usage

Python API

from ultralytics import RTDETR

model = RTDETR("rtdetr-l.pt")
model.info()  # display model information
model.predict("path/to/image.jpg")  # predict

Supported Tasks

Model Type	Pre-trained Weights	Tasks Supported
RT-DETR Large	`rtdetr-l.pt`	Object Detection
RT-DETR Extra-Large	`rtdetr-x.pt`	Object Detection

Supported Modes

Mode	Supported
Inference	✔️
Validation	✔️
Training	❌ (Coming soon)

Citations and Acknowledgements

If you use RT-DETR in your research or development work, please cite the original paper:

@misc{lv2023detrs,
      title={DETRs Beat YOLOs on Real-time Object Detection},
      author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
      year={2023},
      eprint={2304.08069},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

We would like to acknowledge Baidu's PaddlePaddle team for creating and maintaining this valuable resource for the computer vision community.

2.9 KiB Raw Blame History