This network implements a bi-directional Transformer-based encoder as described in "RoBERTa: A Robustly Optimized BERT Pretraining Approach" (https://arxiv.org/abs/1907.11692).
It includes the embedding lookups and transformer layers, but does not
include the masked language model head used during pretraining.
The default constructor gives a fully customizable, randomly initialized
RoBERTa encoder with any number of layers, heads, and embedding
dimensions. To load preset architectures and weights, use the `from_preset()` constructor.
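For example, a minimal sketch of loading a preset (assuming the encoder is exposed as `keras_hub.models.RobertaBackbone`, following the KerasHub naming convention; the `"roberta_base_en"` preset name appears in the example below):

```python
import keras_hub

# Preset architecture and pretrained weights in one call.
encoder = keras_hub.models.RobertaBackbone.from_preset("roberta_base_en")
```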
Disclaimer: Pre-trained models are provided on an "as is" basis, without
warranties or conditions of any kind. The underlying model is provided by a
third party and subject to a separate license, available here.
Arguments
`vocabulary_size`: int. The size of the token vocabulary.
`num_layers`: int. The number of transformer layers.
`num_heads`: int. The number of attention heads for each transformer. The hidden size must be divisible by the number of attention heads.
`hidden_dim`: int. The size of the transformer encoding layer.
`intermediate_dim`: int. The output dimension of the first Dense layer in a two-layer feedforward network for each transformer.
`dropout`: float. Dropout probability for the Transformer encoder.
`max_sequence_length`: int. The maximum sequence length this encoder can consume. The sequence length of the input must not exceed `max_sequence_length` (defaults to 512). This determines the variable shape for positional embeddings.
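To make the arguments concrete, here is a sketch that builds a small, randomly initialized encoder and runs a forward pass (the `token_ids`/`padding_mask` dictionary input follows the usual KerasHub backbone convention; the dimension values below are arbitrary):

```python
import numpy as np
import keras_hub

# One toy sequence of length 12; the last six positions are padding.
input_data = {
    "token_ids": np.ones(shape=(1, 12), dtype="int32"),
    "padding_mask": np.array([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]]),
}

# Randomly initialized encoder with a small custom configuration.
model = keras_hub.models.RobertaBackbone(
    vocabulary_size=50265,
    num_layers=4,
    num_heads=4,
    hidden_dim=256,
    intermediate_dim=512,
    max_sequence_length=128,
)
# Sequence output, shape (batch, sequence_length, hidden_dim) == (1, 12, 256).
output = model(input_data)
```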
Example Usage
```python
import keras
import keras_hub
import numpy as np

# Raw string data.
features = ["The quick brown fox jumped.", "I forgot my homework."]
labels = [0, 3]

# Pretrained classifier.
classifier = keras_hub.models.RobertaClassifier.from_preset(
    "roberta_base_en",
    num_classes=4,
)
classifier.fit(x=features, y=labels, batch_size=2)
classifier.predict(x=features, batch_size=2)

# Re-compile (e.g., with a new learning rate).
classifier.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(5e-5),
    jit_compile=True,
)
# Access backbone programmatically (e.g., to change `trainable`).
classifier.backbone.trainable = False
# Fit again.
classifier.fit(x=features, y=labels, batch_size=2)
```
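Since this page documents the encoder rather than the classification head, a short sketch of pulling features straight from the classifier's backbone (the `preprocessor` attribute and the dictionary input format are assumed to follow KerasHub task-model conventions):

```python
# Reuse the classifier built above as a feature extractor.
token_batch = classifier.preprocessor(["The quick brown fox jumped."])
# Raw encoder output, shape (batch, sequence_length, hidden_dim).
sequence_output = classifier.backbone(token_batch)
```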