vaishali / multitabqa-base-sql

Model's Last Updated: February 21 2024
table-question-answering

MultiTabQA (base-sized model)

MultiTabQA was proposed in "MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering" by Vaishali Pal, Andrew Yates, Evangelos Kanoulas, and Maarten de Rijke. The original repo can be found here.

Model description

MultiTabQA is a table question answering (tableQA) model that generates an answer table from multiple input tables. It can handle multi-table operators such as UNION, INTERSECT, EXCEPT, and JOIN.
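
To make these operators concrete, the following sketch runs UNION, INTERSECT, EXCEPT, and JOIN over two toy tables with Python's built-in sqlite3 module. The tables and rows are invented for illustration, and the code does not call MultiTabQA; it only shows the kind of tabular result such queries produce, which is what the model is meant to generate.

```python
import sqlite3

# Toy tables invented for illustration; they are not taken from any dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (department_id INTEGER, name TEXT);
    CREATE TABLE management (department_id INTEGER, head_id INTEGER);
    INSERT INTO department VALUES (1, 'State'), (2, 'Treasury'), (3, 'Defense');
    INSERT INTO management VALUES (2, 5), (3, 4), (4, 6);
""")


def run(sql):
    """Execute a query and return (column names, rows)."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], cur.fetchall()


queries = {
    "UNION": "SELECT department_id FROM department "
             "UNION SELECT department_id FROM management ORDER BY department_id",
    "INTERSECT": "SELECT department_id FROM department "
                 "INTERSECT SELECT department_id FROM management ORDER BY department_id",
    "EXCEPT": "SELECT department_id FROM department "
              "EXCEPT SELECT department_id FROM management ORDER BY department_id",
    "JOIN": "SELECT d.name, m.head_id FROM department AS d "
            "JOIN management AS m ON d.department_id = m.department_id "
            "ORDER BY d.department_id",
}

# Each result is itself a small table: a list of column names plus a list of rows.
results = {name: run(sql) for name, sql in queries.items()}
for name, (columns, rows) in results.items():
    print(name, columns, rows)
```

Every query here reads two tables and returns a table rather than a single cell, which is the setting the model description refers to.
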

MultiTabQA is based on the TAPEX (BART) architecture, which pairs a bidirectional (BERT-like) encoder with an autoregressive (GPT-like) decoder.

Intended Uses

This pre-trained model can be used to answer SQL queries over multiple input tables by generating the result table.

How to Use

Here is how to use this model with the transformers library:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import pandas as pd

tokenizer = AutoTokenizer.from_pretrained("vaishali/multitabqa-base-sql")
model = AutoModelForSeq2SeqLM.from_pretrained("vaishali/multitabqa-base-sql")

query = "select count(*) from department where department_id not in (select department_id from management)"
table_names = ['department', 'management']
tables=[{"columns":["Department_ID","Name","Creation","Ranking","Budget_in_Billions","Num_Employees"],
                  "index":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14],
                  "data":[
                          [1,"State","1789",1,9.96,30266.0],
                          [2,"Treasury","1789",2,11.1,115897.0],
                          [3,"Defense","1947",3,439.3,3000000.0],
                          [4,"Justice","1870",4,23.4,112557.0],
                          [5,"Interior","1849",5,10.7,71436.0],
                          [6,"Agriculture","1889",6,77.6,109832.0],
                          [7,"Commerce","1903",7,6.2,36000.0],
                          [8,"Labor","1913",8,59.7,17347.0],
                          [9,"Health and Human Services","1953",9,543.2,67000.0],
                          [10,"Housing and Urban Development","1965",10,46.2,10600.0],
                          [11,"Transportation","1966",11,58.0,58622.0],
                          [12,"Energy","1977",12,21.5,116100.0],
                          [13,"Education","1979",13,62.8,4487.0],
                          [14,"Veterans Affairs","1989",14,73.2,235000.0],
                          [15,"Homeland Security","2002",15,44.6,208000.0]
                        ]
                  },
                  {"columns":["department_ID","head_ID","temporary_acting"],
                    "index":[0,1,2,3,4],
                    "data":[
                            [2,5,"Yes"],
                            [15,4,"Yes"],
                            [2,6,"Yes"],
                            [7,3,"No"],
                            [11,10,"No"]
                          ]
                  }]

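# Build one DataFrame per table from its "split"-style dict (columns / index / data).
# pd.read_json expects a JSON string or file, not a dict, so construct the frames directly.
input_tables = [pd.DataFrame(t["data"], columns=t["columns"], index=t["index"]) for t in tables]

# Flatten the model input in the format:
#   query + " " + "<table_name> : " + table_name1 + " " + flattened_table1 + " <table_name> : " + table_name2 + " " + flattened_table2 + ...
# Pseudocode (linearize_table is a placeholder, not defined in this snippet):
# flattened_input = query + " " + " ".join(f"<table_name> : {table_name} {linearize_table(table)}" for table_name, table in zip(table_names, input_tables))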
model_input_string = """select count(*) from department where department_id not in (select department_id from management) <table_name> : department col : Department_ID | Name | Creation | Ranking | Budget_in_Billions | Num_Employees row 1 : 1 | State | 1789 | 1 | 9.96 | 30266 row 2 : 2 | Treasury | 1789 | 2 | 11.1 | 115897 row 3 : 3 | Defense | 1947 | 3 | 439.3 | 3000000 row 4 : 4 | Justice | 1870 | 4 | 23.4 | 112557 row 5 : 5 | Interior | 1849 | 5 | 10.7 | 71436 row 6 : 6 | Agriculture | 1889 | 6 | 77.6 | 109832 row 7 : 7 | Commerce | 1903 | 7 | 6.2 | 36000 row 8 : 8 | Labor | 1913 | 8 | 59.7 | 17347 row 9 : 9 | Health and Human Services | 1953 | 9 | 543.2 | 67000 row 10 : 10 | Housing and Urban Development | 1965 | 10 | 46.2 | 10600 row 11 : 11 | Transportation | 1966 | 11 | 58.0 | 58622 row 12 : 12 | Energy | 1977 | 12 | 21.5 | 116100 row 13 : 13 | Education | 1979 | 13 | 62.8 | 4487 row 14 : 14 | Veterans Affairs | 1989 | 14 | 73.2 | 235000 row 15 : 15 | Homeland Security | 2002 | 15 | 44.6 | 208000 <table_name> : management col : department_ID | head_ID | temporary_acting row 1 : 2 | 5 | Yes row 2 : 15 | 4 | Yes row 3 : 2 | 6 | Yes row 4 : 7 | 3 | No row 5 : 11 | 10 | No"""
inputs = tokenizer(model_input_string, return_tensors="pt")

outputs = model.generate(**inputs)

print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# 'col : count(*) row 1 : 11'
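
The flattened input above can also be produced programmatically. The sketch below is one possible linearize_table, inferred only from the format of model_input_string in this example; it is an assumption, not the authors' implementation. The names build_model_input and parse_table are hypothetical helpers: the first assembles the full input string, the second turns a generated string such as 'col : count(*) row 1 : 11' back into a DataFrame.

```python
import re

import pandas as pd


def linearize_table(table: pd.DataFrame) -> str:
    """Flatten a DataFrame as 'col : c1 | c2 row 1 : v1 | v2 row 2 : ...'."""
    header = "col : " + " | ".join(str(c) for c in table.columns)
    rows = [
        f"row {i} : " + " | ".join(str(v) for v in row)
        for i, row in enumerate(table.itertuples(index=False), start=1)
    ]
    return " ".join([header] + rows)


def build_model_input(query, table_names, tables) -> str:
    """Join the query and every '<table_name> : name flattened_table' segment."""
    parts = [
        f"<table_name> : {name} {linearize_table(table)}"
        for name, table in zip(table_names, tables)
    ]
    return query + " " + " ".join(parts)


def parse_table(text: str) -> pd.DataFrame:
    """Parse 'col : a | b row 1 : x | y ...' into a DataFrame of strings.

    Assumes cell values contain neither '|' nor a ' row N : ' marker.
    """
    header, *rows = re.split(r"\s*row \d+ : ", text.strip())
    columns = [c.strip() for c in re.sub(r"^col\s*:\s*", "", header).split("|")]
    data = [[v.strip() for v in row.split("|")] for row in rows]
    return pd.DataFrame(data, columns=columns)


# Small check on the first two rows of the management table from the example.
management = pd.DataFrame(
    [[2, 5, "Yes"], [15, 4, "Yes"]],
    columns=["department_ID", "head_ID", "temporary_acting"],
)
print(build_model_input("select count(*) from management", ["management"], [management]))
print(parse_table("col : count(*) row 1 : 11"))
```

Values are rendered with str(), so a float column holding 30266.0 would appear as "30266.0" rather than "30266" as in the string above; cast such columns to int first if the exact format matters. parse_table raises an error when a row has a different number of cells than the header, which can happen if the generated output is malformed.
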
How to Fine-tune

The fine-tuning script can be found here.

BibTeX entry and citation info
@inproceedings{pal-etal-2023-multitabqa,
    title = "{M}ulti{T}ab{QA}: Generating Tabular Answers for Multi-Table Question Answering",
    author = "Pal, Vaishali  and
      Yates, Andrew  and
      Kanoulas, Evangelos  and
      de Rijke, Maarten",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.348",
    doi = "10.18653/v1/2023.acl-long.348",
    pages = "6322--6334",
    abstract = "Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often over multiple tables in a relational database or web page. Single table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested queries. Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities of tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. Our model, MultiTabQA, not only answers questions over multiple tables, but also generalizes to generate tabular answers. To enable effective training, we build a pre-training dataset comprising of 132,645 SQL queries and tabular answers. Further, we evaluate the generated tables by introducing table-specific metrics of varying strictness assessing various levels of granularity of the table structure. MultiTabQA outperforms state-of-the-art single table QA models adapted to a multi-table QA setting by finetuning on three datasets: Spider, Atis and GeoQuery.",
}

License

MIT (https://choosealicense.com/licenses/mit)

Model URL

https://huggingface.co/vaishali/multitabqa-base-sql
