From b239246452b9321e17287af8e92a9d449fa2216b Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Wed, 12 Jul 2023 00:13:29 +0200 Subject: [PATCH] Update SAM docs page (#3672) --- docs/models/sam.md | 58 ++++++++++++++++--- docs/reference/vit/rtdetr/model.md | 2 +- docs/reference/vit/rtdetr/predict.md | 2 +- docs/reference/vit/rtdetr/train.md | 2 +- docs/reference/vit/rtdetr/val.md | 2 +- docs/reference/vit/sam/amg.md | 2 +- docs/reference/vit/sam/autosize.md | 2 +- docs/reference/vit/sam/build.md | 2 +- docs/reference/vit/sam/model.md | 2 +- docs/reference/vit/sam/modules/decoders.md | 5 ++ docs/reference/vit/sam/modules/encoders.md | 2 +- .../vit/sam/modules/mask_generator.md | 2 +- .../vit/sam/modules/prompt_predictor.md | 2 +- docs/reference/vit/sam/modules/sam.md | 2 +- docs/reference/vit/sam/modules/transformer.md | 2 +- docs/reference/vit/sam/predict.md | 2 +- docs/reference/vit/utils/loss.md | 2 +- docs/reference/vit/utils/ops.md | 2 +- docs/reference/yolo/utils/downloads.md | 5 ++ 19 files changed, 77 insertions(+), 23 deletions(-) diff --git a/docs/models/sam.md b/docs/models/sam.md index 8dd1e35..f4cb63b 100644 --- a/docs/models/sam.md +++ b/docs/models/sam.md @@ -30,13 +30,30 @@ For an in-depth look at the Segment Anything Model and the SA-1B dataset, please The Segment Anything Model can be employed for a multitude of downstream tasks that go beyond its training data. This includes edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. With prompt engineering, SAM can swiftly adapt to new tasks and data distributions in a zero-shot manner, establishing it as a versatile and potent tool for all your image segmentation needs. -```python -from ultralytics import SAM - -model = SAM('sam_b.pt') -model.info() # display model information -model.predict('path/to/image.jpg') # predict -``` +!!! example "SAM prediction example" + + Device is determined automatically. 
If a GPU is available, it will be used; otherwise, inference will run on the CPU. + + === "Python" + + ```python + from ultralytics import SAM + + # Load a model + model = SAM('sam_b.pt') + + # Display model information (optional) + model.info() + + # Run inference with the model + model('path/to/image.jpg') + ``` + === "CLI" + + ```bash + # Run inference with a SAM model + yolo predict model=sam_b.pt source=path/to/image.jpg + ``` ## Available Models and Supported Tasks @@ -53,6 +70,33 @@ model.predict('path/to/image.jpg') # predict | Validation | :x: | | Training | :x: | +## SAM comparison vs YOLOv8 + +Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics' smallest segmentation model, [YOLOv8n-seg](../tasks/segment): + +| Model | Size | Parameters | Speed (CPU) | +|---------------------------------------------|----------------------------|------------------------|-------------------------| +| Meta's SAM-b | 358 MB | 94.7 M | 51096 ms | +| Ultralytics [YOLOv8n-seg](../tasks/segment) | **6.7 MB** (53.4x smaller) | **3.4 M** (27.9x fewer) | **59 ms** (866x faster) | + +This comparison shows order-of-magnitude differences in model size and speed. While SAM offers unique capabilities for automatic segmentation, it is not a direct competitor to YOLOv8 segmentation models, which are smaller, faster and more efficient because they are dedicated to more targeted use cases. + +To reproduce this test: + +```python +from ultralytics import SAM, YOLO + +# Profile SAM-b +model = SAM('sam_b.pt') +model.info() +model('ultralytics/assets') + +# Profile YOLOv8n-seg +model = YOLO('yolov8n-seg.pt') +model.info() +model('ultralytics/assets') +``` + ## Auto-Annotation: A Quick Path to Segmentation Datasets Auto-annotation is a key feature of SAM, allowing users to generate a [segmentation dataset](https://docs.ultralytics.com/datasets/segment) using a pre-trained detection model. 
This feature enables rapid and accurate annotation of a large number of images, bypassing the need for time-consuming manual labeling. diff --git a/docs/reference/vit/rtdetr/model.md b/docs/reference/vit/rtdetr/model.md index c979186..f444608 100644 --- a/docs/reference/vit/rtdetr/model.md +++ b/docs/reference/vit/rtdetr/model.md @@ -6,4 +6,4 @@ keywords: RTDETR, Ultralytics, YOLO, object detection, speed, accuracy, implemen ## RTDETR --- ### ::: ultralytics.vit.rtdetr.model.RTDETR -

\ No newline at end of file +

diff --git a/docs/reference/vit/rtdetr/predict.md b/docs/reference/vit/rtdetr/predict.md index c5b5420..032c2da 100644 --- a/docs/reference/vit/rtdetr/predict.md +++ b/docs/reference/vit/rtdetr/predict.md @@ -6,4 +6,4 @@ keywords: RTDETRPredictor, object detection, vision transformer, Ultralytics YOL ## RTDETRPredictor --- ### ::: ultralytics.vit.rtdetr.predict.RTDETRPredictor -

\ No newline at end of file +

diff --git a/docs/reference/vit/rtdetr/train.md b/docs/reference/vit/rtdetr/train.md index b7bb384..03f33f7 100644 --- a/docs/reference/vit/rtdetr/train.md +++ b/docs/reference/vit/rtdetr/train.md @@ -11,4 +11,4 @@ keywords: RTDETRTrainer, Ultralytics YOLO Docs, object detection, VIT-based RTDE ## train --- ### ::: ultralytics.vit.rtdetr.train.train -

\ No newline at end of file +

diff --git a/docs/reference/vit/rtdetr/val.md b/docs/reference/vit/rtdetr/val.md index 43c1898..32359b3 100644 --- a/docs/reference/vit/rtdetr/val.md +++ b/docs/reference/vit/rtdetr/val.md @@ -11,4 +11,4 @@ keywords: RTDETRDataset, RTDETRValidator, data validation, documentation ## RTDETRValidator --- ### ::: ultralytics.vit.rtdetr.val.RTDETRValidator -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/amg.md b/docs/reference/vit/sam/amg.md index 82c66e8..a5b5e4f 100644 --- a/docs/reference/vit/sam/amg.md +++ b/docs/reference/vit/sam/amg.md @@ -86,4 +86,4 @@ keywords: Ultralytics, SAM, MaskData, mask_to_rle_pytorch, area_from_rle, genera ## batched_mask_to_box --- ### ::: ultralytics.vit.sam.amg.batched_mask_to_box -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/autosize.md b/docs/reference/vit/sam/autosize.md index cbb0ca7..ca84d37 100644 --- a/docs/reference/vit/sam/autosize.md +++ b/docs/reference/vit/sam/autosize.md @@ -6,4 +6,4 @@ keywords: ResizeLongestSide, Ultralytics YOLO, automatic image resizing, image r ## ResizeLongestSide --- ### ::: ultralytics.vit.sam.autosize.ResizeLongestSide -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/build.md b/docs/reference/vit/sam/build.md index faa26ee..6c39621 100644 --- a/docs/reference/vit/sam/build.md +++ b/docs/reference/vit/sam/build.md @@ -26,4 +26,4 @@ keywords: SAM, VIT, computer vision models, build SAM models, build VIT models, ## build_sam --- ### ::: ultralytics.vit.sam.build.build_sam -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/model.md b/docs/reference/vit/sam/model.md index 7d924d4..4149847 100644 --- a/docs/reference/vit/sam/model.md +++ b/docs/reference/vit/sam/model.md @@ -6,4 +6,4 @@ keywords: Ultralytics, VIT, SAM, object detection, computer vision, deep learnin ## SAM --- ### ::: ultralytics.vit.sam.model.SAM -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/modules/decoders.md b/docs/reference/vit/sam/modules/decoders.md index e89ca9d..940d720 100644 --- a/docs/reference/vit/sam/modules/decoders.md +++ b/docs/reference/vit/sam/modules/decoders.md @@ -1,3 +1,8 @@ +--- +description: Learn about Ultralytics YOLO's MaskDecoder, Transformer architecture, MLP, mask prediction, and quality prediction. +keywords: Ultralytics YOLO, MaskDecoder, Transformer architecture, mask prediction, image embeddings, prompt embeddings, multi-mask output, MLP, mask quality prediction +--- + ## MaskDecoder --- ### ::: ultralytics.vit.sam.modules.decoders.MaskDecoder diff --git a/docs/reference/vit/sam/modules/encoders.md b/docs/reference/vit/sam/modules/encoders.md index 8c338bc..bd5760a 100644 --- a/docs/reference/vit/sam/modules/encoders.md +++ b/docs/reference/vit/sam/modules/encoders.md @@ -51,4 +51,4 @@ keywords: Ultralytics YOLO, ViT Encoder, Position Embeddings, Attention, Window ## add_decomposed_rel_pos --- ### ::: ultralytics.vit.sam.modules.encoders.add_decomposed_rel_pos -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/modules/mask_generator.md b/docs/reference/vit/sam/modules/mask_generator.md index e2e1251..beec1d3 100644 --- a/docs/reference/vit/sam/modules/mask_generator.md +++ b/docs/reference/vit/sam/modules/mask_generator.md @@ -6,4 +6,4 @@ keywords: SamAutomaticMaskGenerator, Ultralytics YOLO, automatic mask generator, ## SamAutomaticMaskGenerator --- ### ::: ultralytics.vit.sam.modules.mask_generator.SamAutomaticMaskGenerator -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/modules/prompt_predictor.md b/docs/reference/vit/sam/modules/prompt_predictor.md index f7e3b37..00de169 100644 --- a/docs/reference/vit/sam/modules/prompt_predictor.md +++ b/docs/reference/vit/sam/modules/prompt_predictor.md @@ -6,4 +6,4 @@ keywords: PromptPredictor, Ultralytics, YOLO, VIT SAM, image captioning, deep le ## PromptPredictor --- ### ::: ultralytics.vit.sam.modules.prompt_predictor.PromptPredictor -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/modules/sam.md b/docs/reference/vit/sam/modules/sam.md index acd467b..7ead8cb 100644 --- a/docs/reference/vit/sam/modules/sam.md +++ b/docs/reference/vit/sam/modules/sam.md @@ -6,4 +6,4 @@ keywords: Ultralytics VIT, Sam module, PyTorch vision library, image classificat ## Sam --- ### ::: ultralytics.vit.sam.modules.sam.Sam -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/modules/transformer.md b/docs/reference/vit/sam/modules/transformer.md index 994b984..e0d8eeb 100644 --- a/docs/reference/vit/sam/modules/transformer.md +++ b/docs/reference/vit/sam/modules/transformer.md @@ -16,4 +16,4 @@ keywords: Ultralytics YOLO, Attention module, TwoWayTransformer module, Object D ## Attention --- ### ::: ultralytics.vit.sam.modules.transformer.Attention -

\ No newline at end of file +

diff --git a/docs/reference/vit/sam/predict.md b/docs/reference/vit/sam/predict.md index 836d91e..3547951 100644 --- a/docs/reference/vit/sam/predict.md +++ b/docs/reference/vit/sam/predict.md @@ -6,4 +6,4 @@ keywords: Ultralytics, VIT SAM Predictor, object detection, YOLO ## Predictor --- ### ::: ultralytics.vit.sam.predict.Predictor -

\ No newline at end of file +

diff --git a/docs/reference/vit/utils/loss.md b/docs/reference/vit/utils/loss.md index 3eb366e..cd45d5f 100644 --- a/docs/reference/vit/utils/loss.md +++ b/docs/reference/vit/utils/loss.md @@ -11,4 +11,4 @@ keywords: DETRLoss, RTDETRDetectionLoss, Ultralytics, object detection, image cl ## RTDETRDetectionLoss --- ### ::: ultralytics.vit.utils.loss.RTDETRDetectionLoss -

\ No newline at end of file +

diff --git a/docs/reference/vit/utils/ops.md b/docs/reference/vit/utils/ops.md index f4b7c81..e4660f0 100644 --- a/docs/reference/vit/utils/ops.md +++ b/docs/reference/vit/utils/ops.md @@ -16,4 +16,4 @@ keywords: Ultralytics, YOLO, object detection, HungarianMatcher, inverse_sigmoid ## inverse_sigmoid --- ### ::: ultralytics.vit.utils.ops.inverse_sigmoid -

\ No newline at end of file +

diff --git a/docs/reference/yolo/utils/downloads.md b/docs/reference/yolo/utils/downloads.md index dd07646..3e06f8f 100644 --- a/docs/reference/yolo/utils/downloads.md +++ b/docs/reference/yolo/utils/downloads.md @@ -23,6 +23,11 @@ keywords: Ultralytics YOLO, downloads, trained models, datasets, weights, deep l ### ::: ultralytics.yolo.utils.downloads.safe_download

+## get_github_assets +--- +### ::: ultralytics.yolo.utils.downloads.get_github_assets +

+ ## attempt_download_asset --- ### ::: ultralytics.yolo.utils.downloads.attempt_download_asset