colidefics (vidore/colidefics on huggingface.co)

Introduction of colidefics

Model Details of colidefics

ColPali: Visual Retriever based on PaliGemma-3B with ColBERT strategy

Idefics2 version

ColIdefics is a model built on a novel architecture and training strategy that uses Vision Language Models (VLMs) to efficiently index documents from their visual features. It is an Idefics2 extension that generates ColBERT-style multi-vector representations of text and images. It was introduced in the paper ColPali: Efficient Document Retrieval with Vision Language Models and first released in this repository.

Model Description

This model is built iteratively, starting from an off-the-shelf SigLIP model. We fine-tuned it to create BiSigLIP, then fed the patch embeddings output by SigLIP to an LLM, PaliGemma-3B, to create BiPali.

One benefit of passing image patch embeddings through a language model is that they are natively mapped to a latent space similar to that of the textual input (query). This makes it possible to apply the ColBERT strategy, which computes interactions between text tokens and image patches and yields a step-change improvement in performance compared to BiPali.
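The late-interaction scoring behind the ColBERT strategy can be sketched as follows: for each query token embedding, keep the maximum dot product over all image-patch embeddings of a page, then sum these maxima over the query tokens. This is a minimal pure-Python sketch on toy 2-dimensional vectors. The names `dot` and `maxsim_score` and all numeric values are illustrative assumptions and are not part of colpali_engine.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], page_patches: List[Vector]) -> float:
    """Late-interaction score: for each query token, keep its best-matching
    patch similarity, then sum those maxima over all query tokens."""
    return sum(max(dot(q, p) for p in page_patches) for q in query_tokens)


# Toy example (hypothetical values): 2 query tokens, 3 patches per page.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8], [0.0, 0.0]]
page_b = [[0.5, 0.5], [0.4, 0.1], [0.1, 0.3]]

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
print(score_a > score_b)  # page_a matches each query token more closely
```

Because every query token is matched against every patch independently, each page keeps one embedding per patch rather than a single pooled vector, which is what "multi-vector" refers to in this card.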

Model Training

Dataset

Our training dataset of 127,460 query-page pairs comprises the train sets of openly available academic datasets (63%) and a synthetic dataset made up of pages from web-crawled PDF documents, augmented with VLM-generated (Claude-3 Sonnet) pseudo-questions (37%). Our training set is fully English by design, enabling us to study zero-shot generalization to non-English languages. We explicitly verify that no multi-page PDF document is used both in ViDoRe and in the train set, to prevent evaluation contamination. A validation set is created with 2% of the samples to tune hyperparameters.

Note: Multilingual data is present in the pretraining corpus of the language model (Gemma-2B) and potentially occurs during PaliGemma-3B's multimodal training.
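The approximate subset sizes implied by the figures in this section can be worked out as below. This is a rough sketch. The percentages in the text are rounded, so the counts are estimates, and it assumes the 2% validation split is taken out of the 127,460 pairs, which the text does not state explicitly.

```python
TOTAL_PAIRS = 127_460      # query-page pairs, from the text
ACADEMIC_SHARE = 0.63      # rounded share reported in the text
SYNTHETIC_SHARE = 0.37     # rounded share reported in the text
VALIDATION_SHARE = 0.02    # held out for hyperparameter tuning

# Estimated sizes of the two sources (rounded to whole pairs).
academic_pairs = round(TOTAL_PAIRS * ACADEMIC_SHARE)
synthetic_pairs = round(TOTAL_PAIRS * SYNTHETIC_SHARE)

# Assumption: the validation set is carved out of the full dataset.
validation_pairs = round(TOTAL_PAIRS * VALIDATION_SHARE)
training_pairs = TOTAL_PAIRS - validation_pairs

print(academic_pairs, synthetic_pairs, validation_pairs, training_pairs)
```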

Parameters

All models are trained for 1 epoch on the train set. Unless specified otherwise, we train models in bfloat16 format, use low-rank adapters (LoRA) with alpha=32 and r=32 on the transformer layers from the language model, as well as the final randomly initialized projection layer, and use a paged_adamw_8bit optimizer. We train on an 8-GPU setup with data parallelism, a learning rate of 5e-5 with linear decay and 2.5% warmup steps, and a batch size of 32.
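The learning-rate schedule described above (peak of 5e-5, 2.5% warmup steps, linear decay) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' training code. The function name `linear_warmup_decay_lr`, the decay-to-zero endpoint, and the total of 4,000 steps are hypothetical, and only the peak learning rate and the warmup fraction come from the text.

```python
def linear_warmup_decay_lr(
    step: int,
    total_steps: int,
    peak_lr: float = 5e-5,
    warmup_fraction: float = 0.025,
) -> float:
    """Learning rate at `step` (0-indexed): linear ramp up to `peak_lr`
    during warmup, then linear decay to zero at `total_steps`."""
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length: 4,000 optimizer steps (not stated in the text).
total_steps = 4_000
lrs = [linear_warmup_decay_lr(s, total_steps) for s in range(total_steps + 1)]

print(lrs[0], lrs[100], lrs[total_steps])
```

With these assumed numbers, warmup covers the first 100 steps, the rate peaks at step 100, and it reaches zero at the final step.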

Usage

import torch
import typer
from torch.utils.data import DataLoader
from tqdm import tqdm
from transformers import AutoProcessor
from PIL import Image

from colpali_engine.models.idefics_colbert_architecture import ColIdefics
from colpali_engine.trainer.retrieval_evaluator import CustomEvaluator
from colpali_engine.utils.colidefics_processing_utils import process_images, process_queries
from colpali_engine.utils.image_from_page_utils import load_from_dataset


def main() -> None:
    """Example script to run inference with ColIdefics"""

    # Load model
    model_name = "vidore/colidefics"
    model = ColIdefics.from_pretrained("HuggingFaceM4/idefics2-8b", torch_dtype=torch.bfloat16, device_map="cuda").eval()
    model.load_adapter(model_name)
    processor = AutoProcessor.from_pretrained(model_name)

    # select images -> load_from_pdf(<pdf_path>),  load_from_image_urls(["<url_1>"]), load_from_dataset(<path>)
    images = load_from_dataset("vidore/docvqa_test_subsampled")
    queries = ["From which university does James V. Fiorca come ?", "Who is the japanese prime minister?"]

    # run inference - docs
    dataloader = DataLoader(
        images,
        batch_size=4,
        shuffle=False,
        collate_fn=lambda x: process_images(processor, x),
    )
    ds = []
    for batch_doc in tqdm(dataloader):
        with torch.no_grad():
            batch_doc = {k: v.to(model.device) for k, v in batch_doc.items()}
            embeddings_doc = model(**batch_doc)
        ds.extend(list(torch.unbind(embeddings_doc.to("cpu"))))

    # run inference - queries
    dataloader = DataLoader(
        queries,
        batch_size=4,
        shuffle=False,
        collate_fn=lambda x: process_queries(processor, x, Image.new("RGB", (448, 448), (255, 255, 255))),
    )

    qs = []
    for batch_query in dataloader:
        with torch.no_grad():
            batch_query = {k: v.to(model.device) for k, v in batch_query.items()}
            embeddings_query = model(**batch_query)
        qs.extend(list(torch.unbind(embeddings_query.to("cpu"))))

    # run evaluation
    retriever_evaluator = CustomEvaluator(is_multi_vector=True)
    scores = retriever_evaluator.evaluate(qs, ds)
    print(scores.argmax(axis=1))


if __name__ == "__main__":
    typer.run(main)
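The script above prints only the best-matching page per query through `scores.argmax(axis=1)`. A top-k ranking can be derived from the same kind of score matrix, as sketched below in plain Python. It assumes one row per query and one column per page, which is how the argmax over axis 1 reads. The helper `top_k_pages` and the score values are hypothetical.

```python
from typing import List


def top_k_pages(scores: List[List[float]], k: int = 3) -> List[List[int]]:
    """For each query (row), return the indices of the k highest-scoring
    pages (columns), best first."""
    ranked = []
    for row in scores:
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        ranked.append(order[:k])
    return ranked


# Hypothetical score matrix: 2 queries x 4 pages.
scores = [
    [12.1, 15.4, 9.8, 14.0],
    [7.3, 6.9, 11.2, 8.5],
]
top2 = top_k_pages(scores, k=2)
best = [row[0] for row in top2]  # same result as an argmax over each row
print(top2, best)
```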

Limitations

Focus: The model primarily focuses on PDF-type documents and high-resource languages, potentially limiting its generalization to other document types or less represented languages.
Support: The model relies on multi-vector retrieval derived from the ColBERT late interaction mechanism, which may require engineering effort to adapt to widely used vector retrieval frameworks that lack native multi-vector support.

License

The base model behind ColIdefics (Idefics2) is under the MIT license. The adapters attached to the model are under the MIT license.

Contact

Manuel Faysse: [email protected]
Hugues Sibille: [email protected]
Tony Wu: [email protected]

Citation

If you use any datasets or models from this organization in your research, please cite the original dataset as follows:

@misc{faysse2024colpaliefficientdocumentretrieval,
  title={ColPali: Efficient Document Retrieval with Vision Language Models}, 
  author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
  year={2024},
  eprint={2407.01449},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2407.01449}, 
}

More Information About colidefics

License details (MIT):

https://choosealicense.com/licenses/mit

colidefics huggingface.co Url

https://huggingface.co/vidore/colidefics

Other models from vidore on huggingface.co

Model | Total runs | Run growth | Growth rate | Updated
vidore/colqwen2-v1.0-hf | 96.5K | -4.6K | -4.73% | June 02 2025
vidore/colqwen2.5-v0.2 | 73.4K | 35.8K | 48.77% | June 16 2025
vidore/colpali-v1.3 | 50.2K | -20.4K | -40.04% | March 14 2025
vidore/colqwen2-v1.0 | 46.6K | 2.3K | 5.02% | June 06 2025
vidore/colpali-v1.2 | 36.9K | -64.9K | -175.92% | March 14 2025
vidore/colsmolvlm-alpha | 35.1K | 0 | 0.00% | February 06 2025
vidore/colqwen-omni-v0.1 | 25.8K | 22.5K | 87.37% | July 17 2025
vidore/colqwen2-v0.1 | 12.3K | -400 | -3.24% | March 22 2025
vidore/colpali | 5.5K | -212 | -3.85% | November 25 2025
vidore/colpali-v1.2-hf | 2.9K | -44 | -1.53% | April 16 2025
vidore/colSmol-256M | 1.5K | -1.1K | -73.23% | March 14 2025
vidore/colSmol-500M | 1.5K | -309 | -20.40% | March 14 2025
vidore/colpali-v1.3-hf | 1.2K | -808 | -66.83% | April 16 2025
vidore/colpaligemma2-3b-pt-448-base | 109 | -29 | -26.61% | April 14 2025
vidore/colpali-v1.1 | 92 | -251 | -272.83% | March 14 2025
vidore/colpali-v1.2-hf-deprecated | 67 | 0 | 0.00% | October 31 2024
vidore/colsmolvlm-v0.1 | 25 | -21 | -84.00% | March 14 2025
vidore/biqwen2-v0.1 | 24 | 6 | 25.00% | April 08 2025
vidore/colSmol-500M-base | 8 | 0 | 0.00% | January 23 2025
vidore/colpali2-3b-pt-448 | 4 | 2 | 50.00% | December 18 2024
vidore/colSmol-256M-base | 3 | 0 | 0.00% | January 23 2025
vidore/colpali-duo-base | 2 | 2 | 100.00% | October 16 2024
vidore/ColSmolVLM-256M-Base | 2 | 0 | 0.00% | January 23 2025
vidore/colpali-hard-v1.1 | 1 | -1 | -100.00% | August 26 2024
vidore/colpali-3b-pt-448 | 1 | -3 | -300.00% | August 26 2024
vidore/colpali-v1.2-merged-state_dict | 0 | 0 | 0.00% | October 31 2024
vidore/bisiglip | 0 | 0 | 0.00% | September 09 2024
vidore/colqwen2-base | 0 | 0 | 0.00% | June 06 2025
vidore/colpaligemma-3b-pt-448-base | 0 | 0 | 0.00% | October 19 2024
vidore/baseline-results | 0 | 0 | 0.00% | March 17 2025
vidore/colpali-v1.2-merged | 0 | 0 | 0.00% | February 06 2025
vidore/ColSmolVLM-base | 0 | 0 | 0.00% | November 27 2024
vidore/colpali-v1.3-merged | 0 | 0 | 0.00% | August 04 2025
vidore/bipali-hard-v1.1 | 0 | 0 | 0.00% | August 24 2024
vidore/submission_baselines_without_tabfquad | 0 | 0 | 0.00% | June 25 2024
vidore/colqwen2-v0.1-merged | 0 | 0 | 0.00% | March 09 2025
vidore/bipali | 0 | 0 | 0.00% | July 15 2025
vidore/ColSmolVLM-Instruct-256M-base | 0 | 0 | 0.00% | February 07 2025
vidore/colqwen2.5-base | 0 | 0 | 0.00% | June 06 2025
vidore/colqwen2-v1.0-merged | 0 | 0 | 0.00% | April 16 2025
vidore/debug_new_vidore_result_format_2 | 0 | 0 | 0.00% | December 02 2024
vidore/ColSmolVLM-Instruct-500M-base | 0 | 0 | 0.00% | February 07 2025
vidore/colpaligemma-3b-mix-448-base | 0 | 0 | 0.00% | August 23 2024
vidore/cohere-embed-english-v3 | 0 | 0 | 0.00% | November 07 2024
