adirik / owlvit-base-patch32

Zero-shot / open vocabulary object detection

replicate.com
Total runs: 23.6K
24-hour runs: 0
7-day runs: 0
30-day runs: 0
Model's Last Updated: October 13 2023

Readme
OWL-ViT

OWL-ViT uses CLIP and Vision Transformer backbones to enable open-vocabulary object detection. See the paper, the original repository, and the Hugging Face implementation for details.

Using the API

You can use OWL-ViT to query an image with text descriptions of any object. Upload an image and enter comma-separated text descriptions of the objects you want to find. You can also use the score threshold slider to filter out low-probability predictions.
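As a rough sketch of the same workflow from Python, the snippet below assembles an input payload and hands it to the `replicate` client. The input keys (`image`, `query`, `score_threshold`), the default threshold, and the `<version-id>` placeholder are assumptions for illustration and are not taken from this page; check the model's API schema on replicate.com for the real names.

```python
def build_input(image_url, queries, score_threshold=0.1):
    """Assemble a prediction payload; the key names are assumed, not confirmed."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0 and 1")
    # Trim whitespace and drop empty entries before joining with commas.
    cleaned = [q.strip() for q in queries if q.strip()]
    if not cleaned:
        raise ValueError("at least one text query is required")
    return {
        "image": image_url,
        "query": ",".join(cleaned),
        "score_threshold": score_threshold,
    }


def run_detection(payload, version="<version-id>"):
    """Call the hosted model; needs `pip install replicate` and REPLICATE_API_TOKEN set."""
    import replicate  # imported lazily so build_input works without the client installed

    # The version placeholder must be replaced with a real version ID from the model page.
    return replicate.run(f"adirik/owlvit-base-patch32:{version}", input=payload)
```

Keeping payload construction separate from the network call makes it easy to validate queries and thresholds locally before spending a paid run.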

OWL-ViT is trained on text templates, so you can get better predictions by phrasing queries with the templates used to train the original model, for example “photo of a star-spangled banner” or “image of a shoe”. Refer to the CLIP paper for the full list of text templates used to augment the training data.
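One way to apply this advice is to wrap plain object names in a template before sending them. The sketch below uses only the two templates quoted above; the helper name `to_template_queries` is hypothetical, and the full template list lives in the CLIP paper, not here.

```python
# The two templates quoted in the text; the full list is in the CLIP paper.
TEMPLATES = ("photo of a {}", "image of a {}")


def to_template_queries(labels, template=TEMPLATES[0]):
    """Wrap each non-empty label in a training-style text template."""
    return [template.format(label.strip()) for label in labels if label.strip()]


# Build the comma-separated query string expected by the text input.
query = ",".join(to_template_queries(["shoe", "star-spangled banner"]))
```

Whether a given template improves results for a particular image is an empirical question, so it is worth comparing templated and plain queries on a few samples.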

References
@article{minderer2022simple,
  title={Simple Open-Vocabulary Object Detection with Vision Transformers},
  author={Matthias Minderer and Alexey Gritsenko and Austin Stone and Maxim Neumann and Dirk Weissenborn and Alexey Dosovitskiy and Aravindh Mahendran and Anurag Arnab and Mostafa Dehghani and Zhuoran Shen and Xiao Wang and Xiaohua Zhai and Thomas Kipf and Neil Houlsby},
  journal={ECCV},
  year={2022},
}

Pricing of owlvit-base-patch32 replicate.com

Run time and cost

This model costs approximately $0.0072 per run on Replicate, or about 138 runs per $1, though this varies depending on your inputs. It is also open source, and you can run it on your own computer with Docker.
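The arithmetic behind these figures can be checked with a short sketch. It treats the quoted $0.0072 as a fixed per-run price, which is a simplification since the actual cost varies with inputs; the function names are illustrative.

```python
import math

COST_PER_RUN_USD = 0.0072  # approximate figure quoted above; real cost varies per run


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar at a fixed per-run price."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs, cost_per_run=COST_PER_RUN_USD):
    """Rough total in USD for a batch of runs, rounded to four decimals."""
    return round(num_runs * cost_per_run, 4)
```

Under this assumption, `runs_per_dollar()` reproduces the 138 runs per dollar stated above, and a batch of 1,000 runs would come to roughly $7.20.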

This model runs on Nvidia T4 GPU hardware. Predictions typically complete within 33 seconds, although the predict time varies significantly with the inputs.

More Information About owlvit-base-patch32 replicate.com Model

The owlvit-base-patch32 license is available here:

https://github.com/google-research/scenic/blob/main/LICENSE

owlvit-base-patch32 replicate.com

The adirik/owlvit-base-patch32 model is hosted on replicate.com, where it provides zero-shot, open-vocabulary object detection that can be used immediately. replicate.com offers a free trial of the model as well as paid use. The model can be called through the API from Node.js, Python, or plain HTTP.
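For the plain HTTP route, a minimal sketch using only the Python standard library is shown below. It assumes Replicate's predictions endpoint at `https://api.replicate.com/v1/predictions` with a bearer token; the version ID and the input keys passed in are placeholders that must come from the model page, and the helper name is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for creating predictions over HTTP.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input):
    """Build (but do not send) a POST request that would create a prediction."""
    body = json.dumps({"version": version, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately keeps the token, version, and payload easy to inspect before any network call is made.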

owlvit-base-patch32 replicate.com Url

https://replicate.com/adirik/owlvit-base-patch32

owlvit-base-patch32 install

owlvit-base-patch32 is an open source model, and anyone can find it on GitHub and install it for free. Alternatively, the hosted version on replicate.com can be used directly for debugging and trial, including through the API, without a local installation.

Other models from adirik on replicate.com

Run growth is 0 and growth rate is 0.00% for every model listed.

Detect everything with language! (Total runs: 4.2M; Updated: October 23 2023)
Realistic interior design with text and image inputs (Total runs: 663.5K; Updated: April 06 2024)
Photorealism with RealVisXL V3.0 Turbo based on SDXL (Total runs: 190.7K; Updated: January 19 2024)
Generates speech from text (Total runs: 130.4K; Updated: February 01 2024)
Flux lora, use "CNSTLL" to trigger (Total runs: 74.8K; Updated: August 24 2024)
Photorealism with RealVisXL V4.0 (Total runs: 46.5K; Updated: February 20 2024)
Monocular depth estimation (Total runs: 8.0K; Updated: December 15 2023)
Lightweight multimodal model for visual question answering, reasoning and captioning (Total runs: 7.8K; Updated: February 26 2024)
Generate videos from text prompts with Kandinsky-2.2 (Total runs: 7.3K; Updated: October 18 2023)
Zero-shot speech synthesizer for text-to-speech and voice conversion (Total runs: 4.4K; Updated: December 15 2023)
Kosmos-G: Generating Images in Context with Multimodal Large Language Models (Total runs: 4.3K; Updated: December 07 2023)
Text-guided image generation and editing (Total runs: 3.9K; Updated: March 08 2024)
Photorealism with Realistic Vision v6.0 (Total runs: 3.6K; Updated: March 11 2024)
Editable image generation with MasaCtrl-SDXL (Total runs: 3.4K; Updated: December 07 2023)
Generates 3D assets from images (Total runs: 2.9K; Updated: February 28 2024)
[Non-commercial] A multi-image visual language model (Total runs: 2.6K; Updated: March 13 2024)
[Non-commercial] A multi-image visual language model (Total runs: 2.2K; Updated: March 13 2024)
PyTorch version of Lightweight OpenPose as introduced in "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" (Total runs: 1.6K; Updated: September 09 2023)
Detects objects in an image (Total runs: 1.5K; Updated: November 16 2023)
Image-Prompt Multi-view Diffusion for 3D Generation (Total runs: 1.5K; Updated: January 19 2024)
Generate texture for your mesh with text prompts (Total runs: 1.2K; Updated: November 28 2023)
Multilingual speech translation that preserves original vocal style and prosody (Total runs: 1.2K; Updated: March 12 2024)
Generate 3D assets using text descriptions (Total runs: 1000; Updated: November 16 2023)
Text-Guided Image Generation and Manipulation (Total runs: 824; Updated: February 01 2022)
Base version of Mamba 2.8B, a 2.8 billion parameter state space language model (Total runs: 810; Updated: February 06 2024)
Mamba 2.8B state space language model fine tuned for chat (Total runs: 791; Updated: February 16 2024)
LEdits++ for image editing (Total runs: 760; Updated: March 27 2024)
E5-mistral-7b-instruct language embedding model (Total runs: 629; Updated: February 23 2024)
Multilingual E5-large language embedding model (Total runs: 536; Updated: February 26 2024)
Inst-Inpaint: Instructing to Remove Objects with Diffusion Models (Total runs: 521; Updated: October 04 2023)
[Non-commercial] Generate texture for 3D assets using text descriptions (Total runs: 268; Updated: March 21 2024)
Fast text-to-3D Gaussian generation by bridging 2D and 3D diffusion models (Total runs: 245; Updated: March 04 2024)
Image editing with Prompt-to-Prompt for SDXL (Total runs: 237; Updated: March 16 2024)
Base version of Mamba 130M, a 130 million parameter state space language model (Total runs: 140; Updated: February 06 2024)
Generate panoramic images with text prompts (Total runs: 118; Updated: January 30 2024)
Generating object-level shape variations with Stable Diffusion (Total runs: 82; Updated: December 11 2023)
Performs speaker identity verification (Total runs: 76; Updated: November 21 2023)
Base version of Mamba 1.4B, a 1.4 billion parameter state space language model (Total runs: 72; Updated: February 06 2024)
Base version of Mamba 2.8B Slim Pyjama, a 2.8 billion parameter state space language model (Total runs: 71; Updated: February 06 2024)
Multilingual E5-small language embedding model (Total runs: 49; Updated: February 26 2024)
Base version of Mamba 370M, a 370 million parameter state space language model (Total runs: 48; Updated: February 06 2024)
Base version of Mamba 790M, a 790 million parameter state space language model (Total runs: 47; Updated: February 06 2024)
Multilingual E5-large language embedding model (Total runs: 20; Updated: February 26 2024)