McmanusChen / MCL-base

Last updated: June 3, 2023


MCL base model (cased)

A model pretrained on English using a Multi-perspective Course Learning (MCL) objective. It was introduced in this paper. This model is cased: it makes a difference between english and English.

Model description

MCL-base is an ELECTRA-style transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on raw text only, with no human labeling of any kind (which is why it can use lots of publicly available data), using an automatic process to generate inputs and labels from the text. More precisely, it was pretrained with three self-supervision courses and two self-correction courses under an encoder-decoder framework:

  • Self-supervision Courses: including Replaced Token Detection (RTD), Swapped Token Detection (STD) and Inserted Token Detection (ITD). For RTD, taking a sentence, the model randomly masks 15% of the words in the input, runs the entire masked sentence through the encoder, and has to predict the masked words. This differs from traditional BERT models, which pre-train only the encoder; it allows the decoder to further discriminate the output sentence from the encoder.
  • Self-correction Courses: Under the above self-supervision courses, a competition mechanism between $G$ and $D$ takes shape. Facing the same piece of data, $G$ tries to reform the sequence in many ways, while $D$ aims to uncover all the manipulations introduced upstream. However, the shared embedding layer of these two encoders is their only bridge of communication, which is apparently insufficient. To strengthen the link between the two components, and to provide more supervisory information during pre-training, we conduct an intimate dissection of the relationship between $G$ and $D$.
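The RTD labeling step described above can be sketched in a few lines of plain Python. This is a toy stand-in: the random token sampling below substitutes for the real generator $G$, and `vocab` is an illustrative word list, not the model's SentencePiece vocabulary.

```python
import random

def make_rtd_example(tokens, vocab, mask_rate=0.15, seed=0):
    """Toy sketch of Replaced Token Detection (RTD).

    Roughly 15% of positions are masked and refilled with a sampled
    token (an illustrative substitute for the generator $G$); the
    discriminator's target is 1 wherever the filled token differs
    from the original. If the sample happens to equal the original
    token, the position keeps label 0, as in ELECTRA-style training.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            new = rng.choice(vocab)          # generator's sample
            corrupted.append(new)
            labels.append(int(new != tok))   # 1 = replaced
        else:
            corrupted.append(tok)
            labels.append(0)                 # 0 = original
    return corrupted, labels

corrupted, labels = make_rtd_example(
    ["the", "chef", "cooked", "the", "meal"],
    vocab=["the", "chef", "ate", "meal", "dog"],
)
```

The discriminator is then trained to recover `labels` from `corrupted`; STD and ITD follow the same pattern with swapped and inserted tokens respectively.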

This way, the model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard classifier using the features produced by the MCL model as inputs.
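As a sketch of that recipe, the snippet below freezes a stand-in "encoder" (a fixed random projection, not the real MCL model) and fits a simple linear head on its pooled features. All names, shapes, and the toy dataset are illustrative; only the hidden size of 768 matches the real $D$.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCab = 50
VOCAB, HIDDEN = 50, 768                    # 768 matches $D$'s hidden size
W_enc = rng.normal(size=(VOCAB, HIDDEN))   # frozen stand-in "encoder"

def encode(token_ids):
    """Mean-pool the frozen encoder's outputs into one sentence feature."""
    return W_enc[token_ids].mean(axis=0)

# Toy labeled dataset: the two classes draw from disjoint token ranges.
pos = [rng.integers(0, 25, size=8) for _ in range(40)]
neg = [rng.integers(25, 50, size=8) for _ in range(40)]
X = np.stack([encode(t) for t in pos + neg])
y = np.array([1] * 40 + [0] * 40)

# "Standard classifier on the features": a least-squares linear head
# trained on the frozen features only; the encoder is never updated.
w = np.linalg.pinv(X) @ (2 * y - 1)
train_acc = float(((X @ w > 0) == y).mean())
```

In practice one would load the released checkpoint (e.g. via the Hugging Face `transformers` library) and feed its hidden states to the classifier in the same frozen-feature fashion.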

Pretraining

We implement the experiments in two settings: \textit{base} and \textit{tiny}. \textit{Base} is the standard training configuration of BERT$_\text{Base}$. The model is pre-trained on English Wikipedia and BookCorpus, containing 16 GB of text with 256 million samples. We set the maximum length of the input sequence to 512, and the learning rate is 5e-4. Training lasts 125K steps with a batch size of 2048. We use the same corpus as CoCo-LM and a 64K cased SentencePiece vocabulary. \textit{Tiny} conducts the ablation experiments on the same corpora with the same configuration as the \textit{base} setting, except that the batch size is 512.
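The hyperparameters above can be collected into a config sketch; the key names below are our own shorthand, not the authors' actual training schema.

```python
# Pretraining hyperparameters from the paragraph above, as an
# illustrative config dict (key names are ours, not the authors').
base_config = {
    "corpora": ["English Wikipedia", "BookCorpus"],  # 16 GB, 256M samples
    "vocab": "64K cased SentencePiece",
    "max_seq_length": 512,
    "learning_rate": 5e-4,
    "train_steps": 125_000,
    "batch_size": 2048,
}

# *tiny* keeps everything from *base* except the batch size.
tiny_config = {**base_config, "batch_size": 512}
```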

Model Architecture

The layout of our model architecture is the same as CoCo-LM in both the \textit{base} and \textit{tiny} settings. $D$ consists of a 12-layer Transformer with 768 hidden size, plus T5 relative position encoding. $G$ is a shallow 4-layer Transformer with the same hidden size and position encoding. After pre-training, we discard $G$ and use $D$ in the same way as BERT, with a classification layer for downstream tasks.
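The two stacks described above can be summarized in a small spec sketch; the class and field names are illustrative, not part of any released API.

```python
from dataclasses import dataclass

@dataclass
class EncoderSpec:
    """Illustrative summary of one Transformer stack (names are ours)."""
    num_layers: int
    hidden_size: int = 768
    position_encoding: str = "T5 relative"

# $D$: the 12-layer discriminator kept after pre-training; used like
# BERT, with a task-specific classification layer for fine-tuning.
discriminator = EncoderSpec(num_layers=12)

# $G$: the shallow 4-layer generator, discarded once pre-training ends.
# It shares the hidden size and position encoding with $D$.
generator = EncoderSpec(num_layers=4)
```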

Evaluation results

When fine-tuned on downstream tasks, this model achieves the following results:

Glue test results:

| Task | MNLI-(m/mm) | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | Average |
|------|-------------|-----|------|-------|------|-------|------|-----|---------|
| Score | 88.5/88.5 | 92.2 | 93.4 | 94.1 | 70.8 | 91.3 | 91.6 | 84.0 | 88.3 |

SQuAD 2.0 test results:

| Metric | EM | F1 |
|--------|------|------|
| Score | 82.9 | 85.9 |

BibTeX entry and citation info
@article{DBLP:journals/corr/abs-2305-03981,
  author       = {Beiduo Chen and
                  Shaohan Huang and
                  Zihan Zhang and
                  Wu Guo and
                  Zhenhua Ling and
                  Haizhen Huang and
                  Furu Wei and
                  Weiwei Deng and
                  Qi Zhang},
  title        = {Pre-training Language Model as a Multi-perspective Course Learner},
  journal      = {CoRR},
  volume       = {abs/2305.03981},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2305.03981},
  doi          = {10.48550/arXiv.2305.03981},
  eprinttype   = {arXiv},
  eprint       = {2305.03981},
  timestamp    = {Thu, 11 May 2023 15:54:24 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2305-03981.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

License: Apache 2.0 (https://choosealicense.com/licenses/apache-2.0)

Model page: https://huggingface.co/McmanusChen/MCL-base