TRIBE v2 is a deep multimodal brain encoding model that predicts fMRI brain responses to naturalistic stimuli (video, audio, text). It combines state-of-the-art feature extractors — LLaMA 3.2 (text), V-JEPA2 (video), and Wav2Vec-BERT (audio) — into a unified Transformer architecture that maps multimodal representations onto the cortical surface.
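As a rough intuition for that mapping (not the actual implementation): per-timestep features from each modality are projected into a shared space, fused, and read out to cortical vertices. Below is a toy numpy sketch with random weights and illustrative dimensions; the real model interposes Transformer layers over time, and none of these variable names come from the codebase.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not the real model's).
n_timesteps, d_text, d_video, d_audio = 50, 3072, 1024, 1024
d_model, n_vertices = 256, 20484  # fsaverage5 has 20484 vertices total

# Stand-ins for per-timestep features from the three extractors.
text = rng.standard_normal((n_timesteps, d_text))
video = rng.standard_normal((n_timesteps, d_video))
audio = rng.standard_normal((n_timesteps, d_audio))

# Project each modality into a shared space and sum (fusion).
W_t = rng.standard_normal((d_text, d_model)) * 0.02
W_v = rng.standard_normal((d_video, d_model)) * 0.02
W_a = rng.standard_normal((d_audio, d_model)) * 0.02
fused = text @ W_t + video @ W_v + audio @ W_a  # (n_timesteps, d_model)

# The real model applies Transformer layers here; we go straight to readout.
W_out = rng.standard_normal((d_model, n_vertices)) * 0.02
preds = fused @ W_out
print(preds.shape)  # (50, 20484)
```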
## Quick start
Load a pretrained model from HuggingFace and predict brain responses to a video:
```python
from tribev2 import TribeModel

model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
df = model.get_events_dataframe(video_path="path/to/video.mp4")
preds, segments = model.predict(events=df)
print(preds.shape)  # (n_timesteps, n_vertices)
```
Predictions are for the "average" subject (see paper for details) and live on the `fsaverage5` cortical mesh (~20k vertices). You can also pass `text_path` or `audio_path` to `model.get_events_dataframe` — text is automatically converted to speech and transcribed to obtain word-level timings.
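Once you have predictions on the mesh, encoding performance is typically scored as a per-vertex Pearson correlation between predicted and measured time series. The following is a minimal, self-contained sketch using synthetic arrays in place of real model outputs; the function name and shapes are illustrative, not part of the tribev2 API.

```python
import numpy as np

def per_vertex_pearson(preds, fmri):
    """Pearson r between predicted and measured time series, per vertex.

    preds, fmri: arrays of shape (n_timesteps, n_vertices).
    """
    p = preds - preds.mean(axis=0)
    f = fmri - fmri.mean(axis=0)
    num = (p * f).sum(axis=0)
    den = np.sqrt((p ** 2).sum(axis=0) * (f ** 2).sum(axis=0))
    return num / den

# Synthetic stand-in data: 100 TRs, 5 vertices.
rng = np.random.default_rng(0)
fmri = rng.standard_normal((100, 5))
preds = fmri + 0.5 * rng.standard_normal((100, 5))  # noisy "predictions"
r = per_vertex_pearson(preds, fmri)
print(r.shape)  # (5,)
```

In practice you would average such scores within regions of interest or plot them on the surface rather than inspect raw per-vertex values.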
```
tribev2/
├── main.py             # Experiment pipeline: Data, TribeExperiment
├── model.py            # FmriEncoder: Transformer-based multimodal→fMRI model
├── pl_module.py        # PyTorch Lightning training module
├── demo_utils.py       # TribeModel and helpers for inference from text/audio/video
├── eventstransforms.py # Custom event transforms (word extraction, chunking, …)
├── utils.py            # Multi-study loading, splitting, subject weighting
├── utils_fmri.py       # Surface projection (MNI / fsaverage) and ROI analysis
├── grids/
│   ├── defaults.py     # Full default experiment configuration
│   └── test_run.py     # Quick local test entry point
├── plotting/           # Brain visualization (PyVista & Nilearn backends)
└── studies/            # Dataset definitions (Algonauts2025, Lahner2024, …)
```
## Contributing to open science
If you use this software, please share your results with the broader research community using the following citation:
```bibtex
@article{dAscoli2026TribeV2,
  title={A foundation model of vision, audition, and language for in-silico neuroscience},
  author={d'Ascoli, St{\'e}phane and Rapin, J{\'e}r{\'e}my and Benchetrit, Yohann and Brookes, Teon and Begany, Katelyn and Raugel, Jos{\'e}phine and Banville, Hubert and King, Jean-R{\'e}mi},
  year={2026}
}
```
## License
This project is licensed under CC-BY-NC-4.0. See `LICENSE` for details.