Omartificial-Intelligence-Space / SA-BERT-Classifier

huggingface.co
Model's Last Updated: May 13 2025
text-classification

SA-BERT-Classifier: Saudi Dialect Classifier

Model Description

SA-BERT-Classifier is a binary classifier that distinguishes Saudi from non-Saudi Arabic dialects. It is built on top of SA-BERT-V1 embeddings and reports 0.9821 accuracy on its test set (see Performance Metrics); it is intended to identify Saudi dialectal expressions across a range of domains and contexts.

Intended Use

This model is designed for:

  • Dialect identification in Arabic text
  • Content filtering for region-specific applications
  • Improving NLP pipelines for Saudi audience targeting
  • Research on dialectal variations in Arabic
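
Performance Metrics

The model achieves the following performance on the held-out test set (the 20% stratified split listed under Training Parameters):

| Metric    | Score  |
|-----------|--------|
| Accuracy  | 0.9821 |
| Precision | 0.9745 |
| Recall    | 0.9890 |
| F1 Score  | 0.9817 |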
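
The reported F1 score can be cross-checked against the precision and recall figures, since F1 is their harmonic mean. The snippet below is a minimal sketch of that arithmetic; the function name `f1_from` is illustrative and not part of the model's code.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the reported metrics.
precision, recall = 0.9745, 0.9890
f1 = f1_from(precision, recall)
print(f"F1 = {f1:.4f}")  # F1 = 0.9817, matching the reported score
```
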

Usage

Using the Hugging Face Transformers Pipeline

import os
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TextClassificationPipeline,
)

# Configuration
MODEL_ID = "Omartificial-Intelligence-Space/SA-BERT-Classifier"
# Optional access token read from the environment; None means anonymous access.
HF_TOKEN = os.getenv("HUGGINGFACE_HUB_TOKEN")
DEVICE = 0 if torch.cuda.is_available() else -1

# Load tokenizer & model (`token` replaces the deprecated `use_auth_token`)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=HF_TOKEN)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, token=HF_TOKEN)

# Build the pipeline; it moves the model to DEVICE.
# `top_k=None` returns the score of every label
# (it replaces the deprecated `return_all_scores=True`).
classifier = TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    device=DEVICE,
    top_k=None,
)

# Example
text = "السلام عليكم ورحمة الله كيف حالك اليوم؟"
results = classifier(text)
# Some transformers versions wrap a single input's scores in an outer list.
if results and isinstance(results[0], list):
    results = results[0]

# Format results (assumes the default label names LABEL_0 / LABEL_1,
# where index 0 is Non-Saudi and index 1 is Saudi)
scores = {int(item["label"].split("_")[-1]): item["score"] for item in results}
p_non_saudi = scores.get(0, 0.0)
p_saudi = scores.get(1, 0.0)
prediction = "Saudi" if p_saudi > p_non_saudi else "Non-Saudi"

print(f"Text: {text}")
print(f"P(Non-Saudi): {p_non_saudi:.4f}")
print(f"P(Saudi): {p_saudi:.4f}")
print(f"Prediction: {prediction}")
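
When classifying many inputs, the result-formatting step can be wrapped in a small helper. This is a sketch rather than part of the model's API: the name `to_prediction` is hypothetical, and it assumes the default `LABEL_0` / `LABEL_1` label names with index 1 meaning Saudi, as the usage code does. The scores in the example are made-up numbers for illustration, not model output.

```python
from typing import Dict, List, Tuple


def to_prediction(results: List[Dict[str, object]]) -> Tuple[str, float, float]:
    """Map one per-label score list to (prediction, P(Non-Saudi), P(Saudi))."""
    scores = {
        int(str(item["label"]).rsplit("_", 1)[-1]): float(item["score"])
        for item in results
    }
    p_non_saudi = scores.get(0, 0.0)
    p_saudi = scores.get(1, 0.0)
    label = "Saudi" if p_saudi > p_non_saudi else "Non-Saudi"
    return label, p_non_saudi, p_saudi


# Illustrative scores only (not produced by the model).
fake_results = [
    {"label": "LABEL_0", "score": 0.25},
    {"label": "LABEL_1", "score": 0.75},
]
print(to_prediction(fake_results))  # ('Saudi', 0.25, 0.75)
```
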

Training Parameters

  • Embedding model: Omartificial-Intelligence-Space/SA-BERT-V1
  • Max sequence length: 256
  • Classifier: Logistic Regression with balanced class weights
  • Training split: 80% train, 20% test (stratified)
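
The model card does not include the training code, so the following is only a sketch of what "balanced class weights" and a "stratified 80/20 split" mean, written in plain Python. The weight formula `n_samples / (n_classes * class_count)` is the one scikit-learn documents for `class_weight="balanced"`; whether the author used scikit-learn is an assumption. The function names and the toy label distribution are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling test_fraction from each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy, made-up label distribution: 30 Saudi (1) vs 70 non-Saudi (0).
labels = [1] * 30 + [0] * 70
weights = balanced_class_weights(labels)
train_idx, test_idx = stratified_split(labels)
print(weights)                         # the rarer class gets the larger weight
print(len(train_idx), len(test_idx))   # 80 20
```

Because the split is drawn per class, the test set keeps the same 30/70 class ratio as the full data, which is what makes the reported test metrics comparable to the training distribution.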

Example Results

Here are some example predictions from our test set:

| Sample Text | P(Non-Saudi) | P(Saudi) | Predicted |
|---|---|---|---|
| الإسلام دين رحمة وتسامح، مو تعصب ولا قسوة. | 0.0000 | 1.0000 | Saudi |
| مهرجان الملك عبدالعزيز للإبل له قيمة ثقافية واقتصادية كبيرة. | 0.0000 | 1.0000 | Saudi |
| قبل تبدأ بأي بزنس، لازم تسوي دراسة جدوى كويسة. | 0.0000 | 1.0000 | Saudi |
| هل الطريق إلى المدينة الأخرى سالك؟ وهل توجد تحويلات؟ | 0.9998 | 0.0002 | Non-Saudi |
| هل المطعم مفتوح الآن لتناول الغداء؟ وكم وقت الانتظار تقريباً؟ | 0.9999 | 0.0001 | Non-Saudi |
| تحب سياحة البر؟ عندك أماكن كثيرة بالجنوب والوسط. | 0.9993 | 0.0007 | Non-Saudi |
| صبحك الله بالخير والعافية يالغالي، عسى يومك كله خير وسعادة. | 0.0000 | 1.0000 | Saudi |

Analysis

The classifier demonstrates several noteworthy characteristics:

  1. High-confidence predictions: The model often predicts with very high confidence (probabilities near 0.0 or 1.0)
  2. Dialectal markers: Expressions like "مو" (not), "وش" (what), "عشان" (because) are strong Saudi dialect indicators
  3. MSA (Modern Standard Arabic) sensitivity: Formal, MSA-heavy sentences tend to be classified as non-Saudi, regardless of content
  4. Lexical features: Saudi-specific vocabulary (e.g., references to places like "جازان", "العلا") increases the probability of a Saudi classification
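
To inspect why a given input may have been classified as Saudi, it can help to check it for the dialectal markers listed above. The snippet is a simple whole-token lookup for illustration only; it is not how the classifier works internally, and the marker set (taken from the three examples in the analysis) and the name `find_markers` are assumptions.

```python
import string

# Markers quoted in the analysis; an illustrative, non-exhaustive set.
SAUDI_MARKERS = {"مو", "وش", "عشان"}
# ASCII punctuation plus common Arabic punctuation marks.
PUNCTUATION = string.punctuation + "،؟؛"


def find_markers(text: str) -> list:
    """Return the marker words that occur as whole tokens in `text`."""
    tokens = [token.strip(PUNCTUATION) for token in text.split()]
    return [token for token in tokens if token in SAUDI_MARKERS]


print(find_markers("الإسلام دين رحمة وتسامح، مو تعصب ولا قسوة."))  # ['مو']
print(find_markers("هل المطعم مفتوح الآن لتناول الغداء؟"))          # []
```
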

Limitations

  • The model may perform less effectively on mixed-dialect text or code-switching between MSA and dialect
  • Very short text with limited dialectal markers may yield less reliable results
  • Performance may vary for specialized domains not well-represented in the training data
  • The binary classification (Saudi/non-Saudi) does not distinguish between specific non-Saudi dialects
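
One way to act on the first two limitations is to route doubtful predictions to manual review. The sketch below flags inputs that are very short or whose two class probabilities are close. The function name `needs_review` and both thresholds are hypothetical defaults chosen for illustration, not values recommended by the model author, and the probabilities 0.97 and 0.55 in the example are made up.

```python
def needs_review(text: str, p_saudi: float,
                 min_tokens: int = 4, min_margin: float = 0.2) -> bool:
    """Flag a prediction when the input is very short or the classes are close."""
    too_short = len(text.split()) < min_tokens
    # Gap between P(Saudi) and P(Non-Saudi) for a binary classifier.
    margin = abs(p_saudi - (1.0 - p_saudi))
    return too_short or margin < min_margin


print(needs_review("وش", 0.97))                                          # True: one token
print(needs_review("قبل تبدأ بأي بزنس، لازم تسوي دراسة جدوى كويسة.", 1.0))  # False
print(needs_review("هل المطعم مفتوح الآن لتناول الغداء؟", 0.55))           # True: close call
```
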

Citation

If you use this model in your research or applications, please cite:

@misc{nacar2025marbertv2saclassifier,
  title={SA-BERT-Classifier: Saudi Dialect Classifier},
  author={Nacar, Omer},
  year={2025},
  publisher={Omartificial-Intelligence-Space},
  howpublished={\url{https://huggingface.co/Omartificial-Intelligence-Space/SA-BERT-Classifier}},
}

License

Apache 2.0: https://choosealicense.com/licenses/apache-2.0

Model page

https://huggingface.co/Omartificial-Intelligence-Space/SA-BERT-Classifier
