DistilBERT Tone Classification Model
This model is a fine-tuned distilbert-base-uncased that classifies the tone of a text into eight categories relevant to community and mentorship transcripts.
📌 Labels
uplifting
thoughtful
practical
reflective
motivational
informative
neutral
negative
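Code that works with raw logits rather than the pipeline needs a mapping between class ids and these label names. A minimal sketch, assuming the ids follow the order listed above; the authoritative mapping is whatever the model's config stores under id2label, and it may be ordered differently:

```python
# ASSUMPTION: class ids follow the order of the label list above.
# The real mapping is stored in the model config (id2label / label2id).
LABELS = [
    "uplifting",
    "thoughtful",
    "practical",
    "reflective",
    "motivational",
    "informative",
    "neutral",
    "negative",
]

# Integer id -> label name, and the reverse lookup.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}
```

Check the assumed order against the loaded model's config before relying on it.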
📊 Dataset
The model is trained on the tone-dataset, a dataset containing 1000+ labeled examples created for the MyVillageProject tone classification task.
Data includes first-person and third-person statements, anecdotes, factual notes, and reflective entries.
🚀 Training
Base model: distilbert-base-uncased
Optimizer: AdamW (lr=5e-5)
Batch size: 16
Epochs: 8
Loss: CrossEntropy
Metrics: Accuracy + Weighted F1
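The two reported metrics can be reproduced without extra dependencies. The sketch below is not the author's evaluation code (Trainer-based runs usually compute these in a compute_metrics callback using scikit-learn or evaluate); the function names are illustrative. Weighted F1 is the per-class F1 averaged with each class weighted by its share of the true labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        n_pred = sum(p == label for p in y_pred)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score
```

Because the weights follow class frequency, weighted F1 stays close to accuracy when classes are roughly balanced, which matches the near-identical columns reported in the validation metrics.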
📈 Validation Metrics
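| Epoch | Training Loss | Validation Loss | Accuracy | F1 (weighted) |
|-------|---------------|-----------------|----------|---------------|
| 1 | No log | 0.484719 | 0.894161 | 0.895220 |
| 2 | No log | 0.264668 | 0.923358 | 0.923200 |
| 3 | No log | 0.243101 | 0.930657 | 0.930599 |
| 4 | No log | 0.302434 | 0.916058 | 0.918166 |
| 5 | No log | 0.305320 | 0.923358 | 0.923836 |
| 6 | No log | 0.294621 | 0.916058 | 0.916176 |
| 7 | No log | 0.303021 | 0.919708 | 0.919583 |
| 8 | 0.215900 | 0.298230 | 0.916058 | 0.915722 |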
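Validation loss bottoms out early and rises afterwards, so it helps to know which epoch scored best. The sketch below copies the numbers from the table above and picks the best epoch by validation loss and by F1. Whether the published checkpoint comes from that epoch or from the final one is not stated here.

```python
# (epoch, validation_loss, accuracy, weighted_f1), copied from the table above
history = [
    (1, 0.484719, 0.894161, 0.895220),
    (2, 0.264668, 0.923358, 0.923200),
    (3, 0.243101, 0.930657, 0.930599),
    (4, 0.302434, 0.916058, 0.918166),
    (5, 0.305320, 0.923358, 0.923836),
    (6, 0.294621, 0.916058, 0.916176),
    (7, 0.303021, 0.919708, 0.919583),
    (8, 0.298230, 0.916058, 0.915722),
]

# Lowest validation loss and highest weighted F1.
best_by_loss = min(history, key=lambda row: row[1])
best_by_f1 = max(history, key=lambda row: row[3])

print(best_by_loss[0], best_by_f1[0])  # prints: 3 3
```

Both criteria agree on epoch 3, which suggests the later epochs mostly overfit the training set.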
Final Training Summary:
TrainOutput(global_step=552, training_loss=0.1959800598198089,
metrics={
'train_runtime': 39.2397,
'train_samples_per_second': 223.244,
'train_steps_per_second': 14.067,
'total_flos': 290134644572160.0,
'train_loss': 0.1959800598198089,
'epoch': 8.0
})
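The summary figures can be cross-checked against the stated hyperparameters. The arithmetic below assumes a single device and no gradient accumulation, neither of which is stated in this card, so treat the derived training-set size as an estimate.

```python
import math

# Values taken from the training summary and hyperparameters above.
global_step = 552
epochs = 8
batch_size = 16
train_runtime = 39.2397          # seconds
samples_per_second = 223.244

# Optimizer steps per epoch.
steps_per_epoch = global_step // epochs

# Samples processed per epoch, i.e. the approximate training-set size.
samples_per_epoch = round(samples_per_second * train_runtime / epochs)

# ASSUMPTION: one device, no gradient accumulation.
expected_steps = math.ceil(samples_per_epoch / batch_size)

print(steps_per_epoch, samples_per_epoch, expected_steps)  # prints: 69 1095 69
```

Roughly 1,095 training examples per epoch is consistent with the 1000+ examples described in the dataset section.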
💻 Usage
from transformers import pipeline
classifier = pipeline("text-classification", model="Dc-4nderson/tone-distilbert")
text = "Ronnie mentioned the turnout was twice what they expected, and it felt like a victory."
print(classifier(text))
Output (each prediction also carries a score field, omitted here):
[{'label': 'uplifting'}]
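When one label is not enough, the pipeline can return the score for every class by passing top_k=None. The sketch below wraps that call and adds a small helper that returns the top label only if it clears a confidence threshold. The helper names and the 0.5 threshold are illustrative assumptions, not part of the model.

```python
def classify_with_scores(text, model_id="Dc-4nderson/tone-distilbert"):
    """Return every label with its score for one text.

    Requires the transformers package and downloads the model on first use.
    """
    from transformers import pipeline  # imported lazily on purpose

    classifier = pipeline("text-classification", model=model_id)
    return classifier(text, top_k=None)


def top_label(results, min_score=0.5):
    """Pick the highest-scoring label, or None if it is below min_score.

    min_score is a hypothetical threshold; tune it for your own data.
    """
    # Some transformers versions wrap single-text results in an extra list.
    if results and isinstance(results[0], list):
        results = results[0]
    best = max(results, key=lambda item: item["score"])
    return best["label"] if best["score"] >= min_score else None
```

Calling top_label(classify_with_scores(text)) then yields either a label name or None for low-confidence inputs.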
🔖 License
Apache-2.0
👥 Maintainer
Dequan Anderson (Dc-4nderson)