Melayu BERT is a masked language model based on BERT. It was trained on the OSCAR dataset, specifically the unshuffled_original_ms subset. The model was initialized from the English BERT model and fine-tuned on this Malay dataset. It achieved a perplexity of 9.46 on a validation split of 20% of the data. Many of the techniques used are based on a Hugging Face tutorial notebook written by Sylvain Gugger and a fine-tuning tutorial notebook written by Pierre Guillou. The model is available for both PyTorch and TensorFlow.
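For readers unfamiliar with the metric, the sketch below shows how perplexity is commonly derived from a mean cross-entropy loss. It assumes the reported 9.46 was computed as the exponential of the evaluation loss, as is usual in Hugging Face language-modeling tutorials; the model card does not state this explicitly, and the function name `perplexity` is chosen here for illustration.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity as the exponential of a mean cross-entropy loss (natural log)."""
    return math.exp(mean_cross_entropy)


# Assumption: the reported perplexity is exp(eval_loss).
# Under that assumption, 9.46 would correspond to an evaluation loss of ln(9.46).
implied_loss = math.log(9.46)
print(f"implied evaluation loss: {implied_loss:.3f}")
```

Under this assumption, lower loss maps directly to lower perplexity, so the two numbers carry the same information.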
Model
The model was trained for 3 epochs with a learning rate of 2e-3. The training loss at each logged step is shown below.
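| Step | Training loss |
|------|---------------|
| 500  | 5.051300      |
| 1000 | 3.701700      |
| 1500 | 3.288600      |
| 2000 | 3.024000      |
| 2500 | 2.833500      |
| 3000 | 2.741600      |
| 3500 | 2.637900      |
| 4000 | 2.547900      |
| 4500 | 2.451500      |
| 5000 | 2.409600      |
| 5500 | 2.388300      |
| 6000 | 2.351600      |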
How to Use
As Masked Language Model
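from transformers import pipeline

pretrained_name = "StevenLimcorn/MelayuBERT"

# Build a fill-mask pipeline that uses the same checkpoint for model and tokenizer.
fill_mask = pipeline(
    "fill-mask",
    model=pretrained_name,
    tokenizer=pretrained_name
)

# Predict candidate tokens for the [MASK] position.
fill_mask("Saya [MASK] makan nasi hari ini.")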
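The fill-mask pipeline returns a list of dictionaries, each with `score`, `token`, `token_str`, and `sequence` keys. A minimal sketch of how that output might be summarized follows. The helper name `summarize_predictions` is hypothetical, and the sample entries are made-up placeholders used only to exercise the function; they are not output from MelayuBERT.

```python
def summarize_predictions(predictions, top_n=3):
    """Return 'token: score' strings for the highest-scoring predictions."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [f'{p["token_str"]}: {p["score"]:.3f}' for p in ranked[:top_n]]


# Placeholder data in the pipeline's output shape (NOT real model output).
sample = [
    {"score": 0.25, "token": 0, "token_str": "mahu",
     "sequence": "Saya mahu makan nasi hari ini."},
    {"score": 0.50, "token": 1, "token_str": "suka",
     "sequence": "Saya suka makan nasi hari ini."},
]

lines = summarize_predictions(sample)
print("\n".join(lines))
```

In practice the list returned by `fill_mask(...)` could be passed to the helper in place of `sample`.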
Import Tokenizer and Model
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("StevenLimcorn/MelayuBERT")
model = AutoModelForMaskedLM.from_pretrained("StevenLimcorn/MelayuBERT")
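When the model is called directly rather than through the pipeline, its output at the masked position is a vector of logits over the vocabulary. The sketch below shows, under stated assumptions, how such logits could be turned into ranked token probabilities with a softmax. It uses only the standard library and a toy four-word vocabulary invented for illustration; `top_k_from_logits` is a hypothetical helper, not part of Transformers.

```python
import math


def top_k_from_logits(logits, id_to_token, k=5):
    """Return the k most probable (token, probability) pairs from raw logits."""
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank vocabulary ids by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(id_to_token[i], probs[i]) for i in ranked]


# Toy vocabulary and logits, invented purely for illustration.
toy_vocab = ["saya", "makan", "nasi", "suka"]
toy_logits = [0.5, 2.0, 1.0, 3.0]

top2 = top_k_from_logits(toy_logits, toy_vocab, k=2)
print(top2)
```

With the real model, the logits would come from `model(**inputs).logits` at the index of the `[MASK]` token, and `tokenizer.convert_ids_to_tokens` would map the ranked ids back to tokens.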