dmusingu / lapvqa-diffvqa

Model's Last Updated: June 06 2026
visual-question-answering

Model Details of lapvqa-diffvqa

LAPVQA — Differential VQA (Frozen Off-the-shelf Encoders)

Part of the LAPVQA collection.

Description

Task heads for Differential VQA: given a prior (reference) and a current chest X-ray, answer questions about the radiological changes between them. The heads are trained on MIMIC-Diff-VQA on top of five frozen, off-the-shelf encoders, one head per encoder. Each .pt file is a plain state dict of DiffVQAHead.

Architecture — DiffVQAHead
vis_proj   : Linear(vis_dim → 512)   # shared for both images
frame_emb  : Embedding(2, 512)       # 0=reference, 1=current
memory     : [ref_proj + frame_emb(0) ; curr_proj + frame_emb(1)]  → [B, 2N, 512]
tok_emb    : Embedding(50257, 512)
pos_emb    : Embedding(200, 512)
decoder    : 6 × TransformerDecoderLayer (pre-norm)
lm_head    : Linear(512 → 50257, bias=False)
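To make the shapes in the listing above concrete, the following sketch computes the memory shape and the parameter counts of the non-decoder layers from the dimensions given there. It is bookkeeping rather than model code: the helper names are hypothetical, `vis_proj` is assumed to carry a bias (the `Linear` default), `tok_emb` and `lm_head` are assumed not to share weights, and the decoder stack is left out because its head count and feed-forward width are not stated. The patch count of 256 is only an illustrative value.

```python
D_MODEL = 512        # hidden width used throughout the head
VOCAB_SIZE = 50257   # tok_emb / lm_head vocabulary size
MAX_POSITIONS = 200  # rows in pos_emb


def memory_shape(batch_size: int, n_patches: int) -> tuple:
    """Shape of the decoder memory: reference and current tokens concatenated."""
    return (batch_size, 2 * n_patches, D_MODEL)


def non_decoder_param_counts(vis_dim: int) -> dict:
    """Parameter counts for every listed layer except the decoder stack."""
    return {
        "vis_proj": vis_dim * D_MODEL + D_MODEL,  # weight + bias (bias is an assumption)
        "frame_emb": 2 * D_MODEL,
        "tok_emb": VOCAB_SIZE * D_MODEL,
        "pos_emb": MAX_POSITIONS * D_MODEL,
        "lm_head": D_MODEL * VOCAB_SIZE,          # bias=False, so weight only
    }


counts = non_decoder_param_counts(vis_dim=768)    # 768 is the CoCa width
print(memory_shape(batch_size=4, n_patches=256))  # (4, 512, 512)
print(counts["tok_emb"])                          # 25731584
print(sum(counts.values()))                       # 51960320
```

Among these layers the two vocabulary-sized matrices dominate, and the choice of encoder only changes the size of `vis_proj`.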
| File | Encoder | vis_dim |
|---|---|---|
| clip-vit-l14_best.pt | CLIP ViT-L/14 | 1024 |
| coca_best.pt | CoCa | 768 |
| florence2_best.pt | Florence-2 | 1024 |
| siglip_best.pt | SigLIP | 1152 |
| owlv2_best.pt | OWLv2 | 1024 |
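Because `vis_dim` has to match the checkpoint being loaded, it can help to keep the table in code and look the width up by file name. The sketch below does only that; `CHECKPOINTS` and `vis_dim_for` are hypothetical names and not part of the `lapvqa` package.

```python
from pathlib import Path

# File name -> (encoder name, vis_dim), copied from the table above.
CHECKPOINTS = {
    "clip-vit-l14_best.pt": ("CLIP ViT-L/14", 1024),
    "coca_best.pt": ("CoCa", 768),
    "florence2_best.pt": ("Florence-2", 1024),
    "siglip_best.pt": ("SigLIP", 1152),
    "owlv2_best.pt": ("OWLv2", 1024),
}


def vis_dim_for(checkpoint_path: str) -> int:
    """Return the vis_dim to pass to DiffVQAHead for a given checkpoint file."""
    name = Path(checkpoint_path).name
    if name not in CHECKPOINTS:
        raise KeyError(f"unknown checkpoint {name!r}; expected one of {sorted(CHECKPOINTS)}")
    return CHECKPOINTS[name][1]


print(vis_dim_for("weights/siglip_best.pt"))  # 1152
```

Failing loudly on an unknown file name is deliberate: a mismatched `vis_dim` would otherwise only surface later as a shape error when the state dict is loaded.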
Results (test set)
| Encoder | BLEU-1 | BLEU-4 | ROUGE-1 | RadGraph-s |
|---|---|---|---|---|
| CLIP ViT-L/14 | 0.184 | 0.128 | 0.336 | 0.322 |
| CoCa | 0.196 | 0.138 | 0.320 | 0.317 |
| Florence-2 | 0.191 | 0.138 | 0.319 | 0.318 |
| SigLIP | 0.186 | 0.131 | 0.322 | 0.313 |
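The scores are close, so it can be easier to read the table by asking which encoder leads on each metric. The short sketch below does that with the numbers copied from the table; the names `RESULTS` and `best_encoders` are illustrative, and ties are reported rather than broken.

```python
# Test-set scores copied from the table above.
METRICS = ("BLEU-1", "BLEU-4", "ROUGE-1", "RadGraph-s")
RESULTS = {
    "CLIP ViT-L/14": (0.184, 0.128, 0.336, 0.322),
    "CoCa": (0.196, 0.138, 0.320, 0.317),
    "Florence-2": (0.191, 0.138, 0.319, 0.318),
    "SigLIP": (0.186, 0.131, 0.322, 0.313),
}


def best_encoders(metric: str) -> list:
    """Return every encoder that attains the top score for a metric (ties kept)."""
    col = METRICS.index(metric)
    top = max(scores[col] for scores in RESULTS.values())
    return [name for name, scores in RESULTS.items() if scores[col] == top]


for metric in METRICS:
    print(metric, best_encoders(metric))
```

On these numbers no encoder leads on every metric: CoCa tops BLEU-1 and ties with Florence-2 on BLEU-4, while CLIP ViT-L/14 tops ROUGE-1 and RadGraph-s.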
Loading
import torch
import tiktoken
from lapvqa.diffvqa.model import DiffVQAHead

# Each checkpoint is a plain state dict, so build the head first and load into it.
ckpt = torch.load("coca_best.pt", map_location="cpu")
head = DiffVQAHead(vis_dim=768)   # set vis_dim to match the encoder (see the table above)
head.load_state_dict(ckpt)
head.eval()

# GPT-2 BPE tokenizer; its end-of-text token serves as both BOS and EOS.
enc = tiktoken.get_encoding("gpt2")
bos_id = eos_id = enc.eot_token

# Inputs supplied by the caller:
#   curr_vis, ref_vis: [B, N, vis_dim] patch tokens from the frozen encoder
#   question_ids:      [B, Q] question token ids
with torch.no_grad():
    answers = head.generate(
        curr_vis=curr_vis,
        ref_vis=ref_vis,
        prompt_ids=question_ids,
        bos_id=bos_id,
        eos_id=eos_id,
        max_new_tokens=128,
    )
decoded = [enc.decode(ids) for ids in answers]
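The roles of `bos_id`, `eos_id`, and `max_new_tokens` in the call above can be illustrated with a generic greedy decoding loop. This is a sketch of the general technique, not the implementation of `DiffVQAHead.generate`, whose decoding strategy and prompt layout are not documented here. It assumes that the question tokens are followed by a BOS token and that generation stops at the first EOS. `greedy_decode` and `toy_next_token` are hypothetical names, and the toy function stands in for a real forward pass.

```python
from typing import Callable, List, Sequence

EOT = 50256  # GPT-2 end-of-text id, used as both BOS and EOS as in the snippet above


def greedy_decode(
    next_token: Callable[[Sequence[int]], int],
    prompt_ids: Sequence[int],
    bos_id: int,
    eos_id: int,
    max_new_tokens: int,
) -> List[int]:
    """Generate tokens one at a time until EOS or the length cap is reached."""
    context = list(prompt_ids) + [bos_id]  # assumed layout: question, then BOS
    generated: List[int] = []
    for _ in range(max_new_tokens):
        token = next_token(context)
        # Only newly produced tokens are compared with EOS, so the leading
        # BOS does not end generation even though bos_id == eos_id.
        if token == eos_id:
            break
        generated.append(token)
        context.append(token)
    return generated


def toy_next_token(context: Sequence[int]) -> int:
    """Stand-in for a model: emits 1, 2, 3 after BOS, then EOS."""
    last = context[-1]
    if last == EOT:
        return 1
    return last + 1 if last < 3 else EOT


answer = greedy_decode(toy_next_token, [10, 11], EOT, EOT, max_new_tokens=128)
print(answer)  # [1, 2, 3]
```

If the returned ids can still contain the end-of-text token, trimming at its first occurrence before calling `enc.decode` keeps it out of the decoded answer.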

License

Apache-2.0: https://choosealicense.com/licenses/apache-2.0

Model page

https://huggingface.co/dmusingu/lapvqa-diffvqa
