prepare_inputs(words, tokenizer, truncate=False)
: Prepares inputs for a single list of words, returning tensors without a batch dimension.
predict_labels(input_ids, attention_mask, word_mask)
: Low-level prediction from prepared tensors with a batch dimension.
predict_labels_from_text(sentences, tokenizer, truncate=False)
: Returns structured predictions as (category, [attributes]) tuples from word lists. This format is more readable and better suited to some downstream applications.
predict_ifd_labels_from_text(sentences, tokenizer, truncate=False)
: Returns predictions in IFD (Icelandic Frequency Dictionary) format from word lists. Use this for evaluation against MIM-GOLD datasets or when you need compatibility with traditional Icelandic POS taggers.
All methods accept pre-tokenized word lists rather than raw sentences for better control over tokenization.
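For example, a raw sentence can be split into the expected word-list format with a simple whitespace-and-punctuation tokenizer. This is a minimal sketch, not part of the model's API; the regex below is a deliberately naive illustration, and real text may need a proper Icelandic tokenizer:

```python
import re

def simple_word_tokenize(text: str) -> list[str]:
    # Match runs of word characters, or single punctuation marks.
    # Punctuation becomes its own token, matching the pre-tokenized
    # format used in the examples below (e.g. a standalone ".").
    return re.findall(r"\w+|[^\w\s]", text)

words = simple_word_tokenize("Ég veit að þú kemur í kvöld til mín.")
print(words)
```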
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("mideind/IceBERT-PoS", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("mideind/IceBERT-PoS")
# Example sentence
sentence = "Ég veit að þú kemur í kvöld til mín ."
sentence_words = sentence.split()
# Get predictions in (category, [attributes]) format
result = model.predict_labels_from_text([sentence_words], tokenizer)
expected = [
[
("fp", ["1", "sing", "nom"]),
("sf", ["sing", "act", "1", "pres"]),
("c", []),
("fp", ["2", "sing", "nom"]),
("sf", ["sing", "act", "2", "pres"]),
("af", []),
("n", ["neut", "sing", "acc"]),
("af", []),
("fp", ["1", "sing", "gen"]),
("pl", []),
]
]
assert result == expected, f"Expected {expected}, but got {result}"
print("Test passed successfully!")
# Get predictions in IFD format (for MIM-GOLD evaluation)
ifd_result = model.predict_ifd_labels_from_text([sentence_words], tokenizer)
ifd_expected = [
["fp1en", "sfg1en", "c", "fp2en", "sfg2en", "af", "nheo", "af", "fp1ee", "pl"]
]
assert ifd_result == ifd_expected, f"Expected {ifd_expected}, but got {ifd_result}"
print("IFD conversion test passed successfully!")
# Alternative: use prepare_inputs for single sentence prediction
input_ids, attention_mask, word_mask = model.prepare_inputs(sentence_words, tokenizer)
single_result = model.predict_labels(input_ids.unsqueeze(0), attention_mask.unsqueeze(0), word_mask.unsqueeze(0))
assert single_result == expected, f"Expected {expected}, but got {single_result}"
print("Single sentence prediction test passed successfully!")
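The per-word predictions line up one-to-one with the input words, so the two can be zipped together for inspection. A small sketch using the expected output from the example above:

```python
# Pair each input word with its predicted (category, attributes) tuple.
sentence_words = "Ég veit að þú kemur í kvöld til mín .".split()
predictions = [
    ("fp", ["1", "sing", "nom"]),
    ("sf", ["sing", "act", "1", "pres"]),
    ("c", []),
    ("fp", ["2", "sing", "nom"]),
    ("sf", ["sing", "act", "2", "pres"]),
    ("af", []),
    ("n", ["neut", "sing", "acc"]),
    ("af", []),
    ("fp", ["1", "sing", "gen"]),
    ("pl", []),
]
for word, (category, attributes) in zip(sentence_words, predictions):
    # Print one word per line with its category and joined attributes.
    print(f"{word}\t{category}\t{'-'.join(attributes) or '-'}")
```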
Handling Long Sequences with Truncation
By default,
truncate=False
, so input is never silently truncated, a failure mode that is hard to debug. Very long sequences will instead raise an error:
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("mideind/IceBERT-PoS", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("mideind/IceBERT-PoS")
# Create a very long sentence that exceeds model limits
words = ["Þetta", "er", "mjög", "löng", "setning"] * 200  # Very long sentence
print(f"Input length: {len(words)} words")
# This will crash due to sequence length exceeding model limits
try:
result = model.predict_labels_from_text([words], tokenizer, truncate=False)
print("This shouldn't print - sequence was too long!")
except Exception as e:
print(f"Error as expected: {type(e).__name__}")
# Use truncate=True for long sequences
result_truncated = model.predict_labels_from_text([words], tokenizer, truncate=True)
print(f"Truncated result length: {len(result_truncated[0])} tokens")
print("Warning: Output length differs from input length due to truncation!")
# When using truncation, you must handle the length mismatch carefully
# The output will have fewer predictions than input words
assert len(result_truncated[0]) < len(words), "Truncation should reduce length"
print("Truncation example completed successfully!")
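An alternative to truncation is to split a long word list into smaller chunks, tag each chunk separately, and concatenate the results. A minimal sketch under stated assumptions: the chunk size of 100 words is illustrative only, since the real limit depends on the subword tokenization, so a conservative value should be chosen. Note also that chunking discards cross-chunk context, which can affect tags near chunk boundaries:

```python
def chunk_words(words: list[str], max_words: int = 100) -> list[list[str]]:
    """Split a word list into consecutive chunks of at most max_words words."""
    return [words[i:i + max_words] for i in range(0, len(words), max_words)]

words = ["Þetta", "er", "mjög", "löng", "setning"] * 200
chunks = chunk_words(words)
print(f"{len(words)} words split into {len(chunks)} chunks")

# Each chunk is short enough to tag with truncate=False, e.g.:
# results = model.predict_labels_from_text(chunks, tokenizer)
# predictions = [label for chunk_result in results for label in chunk_result]

# Unlike truncation, no words are dropped:
assert sum(len(c) for c in chunks) == len(words)
```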