michaelfeil / ct2fast-gte-base

huggingface.co
Total runs: 6
24-hour runs: 0
7-day runs: -1
30-day runs: -15
Model's Last Updated: October 13 2023
sentence-similarity

Introduction of ct2fast-gte-base

Model Details of ct2fast-gte-base

# Fast-Inference with Ctranslate2

Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on CPU or GPU.

quantized version of thenlper/gte-base

pip install hf-hub-ctranslate2>=2.12.0 ctranslate2>=3.17.1
# from transformers import AutoTokenizer
model_name = "michaelfeil/ct2fast-gte-base"
model_name_orig="thenlper/gte-base"

from hf_hub_ctranslate2 import EncoderCT2fromHfHub
model = EncoderCT2fromHfHub(
        # load in int8 on CUDA
        model_name_or_path=model_name,
        device="cuda",
        compute_type="int8_float16"
)
outputs = model.generate(
    text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    max_length=64,
) # perform downstream tasks on outputs
outputs["pooler_output"]
outputs["last_hidden_state"]
outputs["attention_mask"]

# alternative, use SentenceTransformer Mix-In
# for end-to-end Sentence embeddings generation
# (not pulling from this CT2fast-HF repo)

from hf_hub_ctranslate2 import CT2SentenceTransformer
model = CT2SentenceTransformer(
    model_name_orig, compute_type="int8_float16", device="cuda"
)
embeddings = model.encode(
    ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
)
print(embeddings.shape, embeddings)
scores = (embeddings @ embeddings.T) * 100

# Hint: you can also host this code via REST API and
# via github.com/michaelfeil/infinity  

Checkpoint compatible to ctranslate2>=3.17.1 and hf-hub-ctranslate2>=2.12.0

  • compute_type=int8_float16 for device="cuda"
  • compute_type=int8 for device="cpu"

Converted on 2023-10-13 using

LLama-2 -> removed <pad> token.

Licence and other remarks:

This is just a quantized version. Licence conditions are intended to be idential to original huggingface repo.

Original description

gte-base

General Text Embeddings (GTE) model. Towards General Text Embeddings with Multi-stage Contrastive Learning

The GTE models are trained by Alibaba DAMO Academy. They are mainly based on the BERT framework and currently offer three different sizes of models, including GTE-large , GTE-base , and GTE-small . The GTE models are trained on a large-scale corpus of relevance text pairs, covering a wide range of domains and scenarios. This enables the GTE models to be applied to various downstream tasks of text embeddings, including information retrieval , semantic textual similarity , text reranking , etc.

Metrics

We compared the performance of the GTE models with other popular text embedding models on the MTEB benchmark. For more detailed comparison results, please refer to the MTEB leaderboard .

Model Name Model Size (GB) Dimension Sequence Length Average (56) Clustering (11) Pair Classification (3) Reranking (4) Retrieval (15) STS (10) Summarization (1) Classification (12)
gte-large 0.67 1024 512 63.13 46.84 85.00 59.13 52.22 83.35 31.66 73.33
gte-base 0.22 768 512 62.39 46.2 84.57 58.61 51.14 82.3 31.17 73.01
e5-large-v2 1.34 1024 512 62.25 44.49 86.03 56.61 50.56 82.05 30.19 75.24
e5-base-v2 0.44 768 512 61.5 43.80 85.73 55.91 50.29 81.05 30.28 73.84
gte-small 0.07 384 512 61.36 44.89 83.54 57.7 49.46 82.07 30.42 72.31
text-embedding-ada-002 - 1536 8192 60.99 45.9 84.89 56.32 49.25 80.97 30.8 70.93
e5-small-v2 0.13 384 512 59.93 39.92 84.67 54.32 49.04 80.39 31.16 72.94
sentence-t5-xxl 9.73 768 512 59.51 43.72 85.06 56.42 42.24 82.63 30.08 73.42
all-mpnet-base-v2 0.44 768 514 57.78 43.69 83.04 59.36 43.81 80.28 27.49 65.07
sgpt-bloom-7b1-msmarco 28.27 4096 2048 57.59 38.93 81.9 55.65 48.22 77.74 33.6 66.19
all-MiniLM-L12-v2 0.13 384 512 56.53 41.81 82.41 58.44 42.69 79.8 27.9 63.21
all-MiniLM-L6-v2 0.09 384 512 56.26 42.35 82.37 58.04 41.95 78.9 30.81 63.05
contriever-base-msmarco 0.44 768 512 56.00 41.1 82.54 53.14 41.88 76.51 30.36 66.68
sentence-t5-base 0.22 768 512 55.27 40.21 85.18 53.09 33.63 81.14 31.39 69.81
Usage

Code example

import torch.nn.functional as F
from torch import Tensor
from transformers import AutoTokenizer, AutoModel

def average_pool(last_hidden_states: Tensor,
                 attention_mask: Tensor) -> Tensor:
    last_hidden = last_hidden_states.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]

input_texts = [
    "what is the capital of China?",
    "how to implement quick sort in python?",
    "Beijing",
    "sorting algorithms"
]

tokenizer = AutoTokenizer.from_pretrained("thenlper/gte-base")
model = AutoModel.from_pretrained("thenlper/gte-base")

# Tokenize the input texts
batch_dict = tokenizer(input_texts, max_length=512, padding=True, truncation=True, return_tensors='pt')

outputs = model(**batch_dict)
embeddings = average_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# (Optionally) normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
scores = (embeddings[:1] @ embeddings[1:].T) * 100
print(scores.tolist())

Use with sentence-transformers:

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

sentences = ['That is a happy person', 'That is a very happy person']

model = SentenceTransformer('thenlper/gte-base')
embeddings = model.encode(sentences)
print(cos_sim(embeddings[0], embeddings[1]))
Limitation

This model exclusively caters to English texts, and any lengthy texts will be truncated to a maximum of 512 tokens.

Citation

If you find our paper or models helpful, please consider citing them as follows:

@misc{li2023general,
      title={Towards General Text Embeddings with Multi-stage Contrastive Learning}, 
      author={Zehan Li and Xin Zhang and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang},
      year={2023},
      eprint={2308.03281},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Runs of michaelfeil ct2fast-gte-base on huggingface.co

6
Total runs
0
24-hour runs
-1
3-day runs
-1
7-day runs
-15
30-day runs

More Information About ct2fast-gte-base huggingface.co Model

More ct2fast-gte-base license Visit here:

https://choosealicense.com/licenses/mit

ct2fast-gte-base huggingface.co

ct2fast-gte-base huggingface.co is an AI model on huggingface.co that provides ct2fast-gte-base's model effect (), which can be used instantly with this michaelfeil ct2fast-gte-base model. huggingface.co supports a free trial of the ct2fast-gte-base model, and also provides paid use of the ct2fast-gte-base. Support call ct2fast-gte-base model through api, including Node.js, Python, http.

michaelfeil ct2fast-gte-base online free

ct2fast-gte-base huggingface.co is an online trial and call api platform, which integrates ct2fast-gte-base's modeling effects, including api services, and provides a free online trial of ct2fast-gte-base, you can try ct2fast-gte-base online for free by clicking the link below.

michaelfeil ct2fast-gte-base online free url in huggingface.co:

https://huggingface.co/michaelfeil/ct2fast-gte-base

ct2fast-gte-base install

ct2fast-gte-base is an open source model from GitHub that offers a free installation service, and any user can find ct2fast-gte-base on GitHub to install. At the same time, huggingface.co provides the effect of ct2fast-gte-base install, users can directly use ct2fast-gte-base installed effect in huggingface.co for debugging and trial. It also supports api for free installation.

ct2fast-gte-base install url in huggingface.co:

https://huggingface.co/michaelfeil/ct2fast-gte-base

Url of ct2fast-gte-base

ct2fast-gte-base huggingface.co Url

Provider of ct2fast-gte-base huggingface.co

michaelfeil
ORGANIZATIONS

Other API from michaelfeil