togethercomputer / m2-bert-80M-2k

huggingface.co
Total runs: 16
24-hour runs: 0
7-day runs: -4
30-day runs: 1
Last updated: January 10, 2024
Task: fill-mask

Model Details

Monarch Mixer-BERT

An 80M-parameter checkpoint of M2-BERT, pretrained with sequence length 2048. This is a BERT-style model that has not been fine-tuned; we recommend fine-tuning it for your specific use case before deployment.

Check out the paper Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture and our blog post on retrieval for more on how we trained this model for long sequences.

This model was trained by Jon Saad-Falcon, Dan Fu, and Simran Arora.

Check out our GitHub for instructions on how to download and fine-tune it!

How to use

You can load this model using the Hugging Face AutoModelForMaskedLM class:

from transformers import AutoModelForMaskedLM
model = AutoModelForMaskedLM.from_pretrained(
  "togethercomputer/m2-bert-80M-2k",
  trust_remote_code=True
)

You should expect to see a large error message about unused parameters for FlashFFTConv. If you'd like to load the model with FlashFFTConv, you can check out our GitHub.
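Once the model is loaded, filling in a masked token comes down to taking an argmax over the MLM logits at the masked position. Here is a minimal sketch of that decoding step in plain Python (so it runs without downloading the checkpoint); the tokenizer call and prompt in the trailing comments are illustrative assumptions, not taken from this card:

```python
# Sketch of masked-token decoding, assuming the usual Hugging Face MLM
# interface where the model returns logits of shape (seq_len, vocab_size).

def top_mask_prediction(logits, mask_position):
    """Return the vocabulary id with the highest score at the masked position."""
    row = logits[mask_position]
    return max(range(len(row)), key=row.__getitem__)

# Toy logits: a 4-token sequence over a 10-word vocabulary, position 2 masked.
# Vocabulary id 7 is given the clearly highest score at that position.
logits = [[0.0] * 10 for _ in range(4)]
logits[2][7] = 5.0

predicted_id = top_mask_prediction(logits, 2)  # -> 7

# With the real checkpoint, the same step would look roughly like
# (tokenizer usage is an assumption):
#   tok = AutoTokenizer.from_pretrained("togethercomputer/m2-bert-80M-2k")
#   out = model(**tok("The capital of France is [MASK].", return_tensors="pt"))
#   predicted_id = top_mask_prediction(out.logits[0].tolist(), mask_pos)
```

The predicted id would then be mapped back to a token string with the tokenizer's vocabulary.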

Acknowledgments

Alycia Lee helped with AutoModel support.

Citation

If you use this model, or otherwise find our work valuable, you can cite us as follows:

@inproceedings{fu2023monarch,
  title={Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture},
  author={Fu, Daniel Y and Arora, Simran and Grogan, Jessica and Johnson, Isys and Eyuboglu, Sabri and Thomas, Armin W and Spector, Benjamin and Poli, Michael and Rudra, Atri and R{\'e}, Christopher},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}


License

Apache 2.0: https://choosealicense.com/licenses/apache-2.0

Model page: https://huggingface.co/togethercomputer/m2-bert-80M-2k