obi / deid_bert_i2b2

huggingface.co
Total runs: 3.6K
24-hour runs: -8
7-day runs: 10
30-day runs: 1.6K
Model's Last Updated: August 22 2022
token-classification

Introduction of deid_bert_i2b2

Model Details of deid_bert_i2b2

Model Description

  • A ClinicalBERT [Alsentzer et al., 2019] model fine-tuned for de-identification of medical notes.
  • Sequence labeling (token classification): the model was trained to predict protected health information (PHI/PII) entities (spans). The list of protected health information categories is defined by HIPAA.
  • Each token is classified either as non-PHI or as one of the 11 PHI types. Token predictions are aggregated into spans using BILOU tagging.
  • The PHI labels used for training, along with other details, are described in the Annotation Guidelines.
  • More details on how to use this model, the data format, and other useful information are available in the GitHub repo: Robust DeID.
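
The aggregation of token predictions into spans with BILOU tagging can be illustrated with a minimal sketch. The tag strings (for example "B-PATIENT") and the function name bilou_to_spans are assumptions made for illustration; the Robust DeID repo may format tags and handle malformed sequences differently.

```python
from typing import List, Tuple


def bilou_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert token-level BILOU tags into (start, end_exclusive, label) spans.

    Hypothetical helper: malformed sequences (e.g. a B- tag with no
    matching L- tag) are dropped rather than repaired.
    """
    spans = []
    start, label = None, None  # currently open multi-token span, if any
    for i, tag in enumerate(tags):
        prefix, _, tag_label = tag.partition("-")
        if prefix == "U":
            # Unit: a single-token span
            spans.append((i, i + 1, tag_label))
            start, label = None, None
        elif prefix == "B":
            # Begin: open a new multi-token span
            start, label = i, tag_label
        elif prefix == "I" and start is not None and tag_label == label:
            # Inside: the open span continues
            continue
        elif prefix == "L" and start is not None and tag_label == label:
            # Last: close the open span
            spans.append((start, i + 1, label))
            start, label = None, None
        else:
            # "O" (outside) or an inconsistent tag: discard any open span
            start, label = None, None
    return spans


tags = ["O", "B-PATIENT", "L-PATIENT", "O", "U-DATE", "O"]
spans = bilou_to_spans(tags)
```

Here the two PATIENT tokens are merged into one span and the single DATE token becomes its own span; non-PHI tokens ("O") produce no span.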

How to use

  • A demo of the model (using its predictions to de-identify a medical note) is available on this space: Medical-Note-Deidentification.
  • Steps for running a forward pass with this model can be found here: Forward Pass
  • In brief, the steps are:
    • Sentencize and tokenize the dataset (the model aggregates the sentences back to the note level).
    • Use the predict function of this model to gather the predictions (i.e., one prediction per token).
    • Optionally, use the model predictions to remove PHI from the original note/text.
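
The last step, removing PHI from the original text, can be sketched as follows. This is an illustration only: it assumes the predictions have already been converted to character-offset spans, and the redact function, the <<LABEL>> placeholder format, and the sample note are all made up for this example rather than taken from the Robust DeID code.

```python
from typing import List, Tuple


def redact(text: str, spans: List[Tuple[int, int, str]]) -> str:
    """Replace each (start, end_exclusive, label) character span with <<LABEL>>."""
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        out.append(text[cursor:start])  # keep the non-PHI text before the span
        out.append(f"<<{label}>>")      # substitute a placeholder for the PHI
        cursor = end
    out.append(text[cursor:])           # keep the remaining tail of the note
    return "".join(out)


# Synthetic note; offsets are computed from the text instead of hard-coded.
note = "Seen by Dr. Smith on 03/04/2012."
s = note.index("Smith")
d = note.index("03/04/2012")
phi_spans = [(s, s + len("Smith"), "STAFF"), (d, d + len("03/04/2012"), "DATE")]
redacted = redact(note, phi_spans)
```

Replacing spans with a label placeholder keeps the note readable; other strategies, such as substituting realistic surrogate values, are equally possible.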

Dataset

I2B2 dataset: train set (790 notes) and test set (514 notes)

PHI LABEL   TRAIN COUNT   TRAIN %   TEST COUNT   TEST %
DATE               7502     43.69         4980    44.14
STAFF              3149     18.34         2004    17.76
HOSP               1437      8.37          875     7.76
AGE                1233      7.18          764     6.77
LOC                1206      7.02          856     7.59
PATIENT            1316      7.66          879     7.79
PHONE               317      1.85          217     1.92
ID                  881      5.13          625     5.54
PATORG              124      0.72           82     0.73
EMAIL                 4      0.02            1     0.01
OTHERPHI              2      0.01            0        0
TOTAL             17171       100        11283      100
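
As a quick check on how the percentage columns relate to the counts, the train-set shares can be recomputed from the counts above. The variable names are illustrative only.

```python
# Train-set PHI label counts, copied from the table above.
train_counts = {
    "DATE": 7502, "STAFF": 3149, "HOSP": 1437, "AGE": 1233,
    "LOC": 1206, "PATIENT": 1316, "PHONE": 317, "ID": 881,
    "PATORG": 124, "EMAIL": 4, "OTHERPHI": 2,
}

total = sum(train_counts.values())

# Share of each label among all PHI spans in the train set, in percent.
train_pct = {label: 100 * n / total for label, n in train_counts.items()}

for label, pct in train_pct.items():
    print(f"{label:<10}{pct:6.2f}")
```

The distribution is heavily skewed: DATE and STAFF together account for more than 60% of the train-set spans, while EMAIL and OTHERPHI have only a handful of examples.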

Training procedure

  • Steps on how this model was trained can be found here: Training. The "model_name_or_path" was set to: "emilyalsentzer/Bio_ClinicalBERT".

    • The dataset was sentencized with the en_core_sci_sm sentencizer from spacy.
    • The dataset was then tokenized with a custom tokenizer built on top of the en_core_sci_sm tokenizer from spacy.
    • For each sentence, we added 32 tokens on the left (from the previous sentences) and 32 tokens on the right (from the following sentences).
    • The added tokens are not used for learning (i.e., the loss is not computed on them); they serve only as additional context.
    • Each sequence contained a maximum of 128 tokens (including the added context tokens). Longer sequences were split.
    • The sentencized and tokenized dataset with the token level labels based on the BILOU notation was used to train the model.
    • The model is fine-tuned from the pre-trained ClinicalBERT model ("emilyalsentzer/Bio_ClinicalBERT").
  • Training details:

    • Input sequence length: 128
    • Batch size: 32
    • Optimizer: AdamW
    • Learning rate: 4e-5
    • Dropout: 0.1
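
The context-window step in the training procedure (extra tokens on each side of a sentence that do not contribute to the loss) can be sketched as below. This is a simplified illustration, not the Robust DeID implementation: add_context and its parameters are hypothetical names, labels are shown as integer ids, and the splitting of sequences that exceed the maximum length is omitted. The value -100 is the default ignore_index of PyTorch's CrossEntropyLoss, used here as an assumed way to exclude context tokens from the loss.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def add_context(
    sentences: List[List[str]],
    labels: List[List[int]],
    idx: int,
    context: int = 32,
) -> Tuple[List[str], List[int]]:
    """Return tokens and labels for sentence `idx` plus left/right context.

    Context tokens are labelled IGNORE_INDEX so no loss is computed on them.
    """
    before = [tok for sent in sentences[:idx] for tok in sent]
    after = [tok for sent in sentences[idx + 1:] for tok in sent]
    left = before[-context:] if context > 0 else []  # last `context` tokens
    right = after[:context]                          # first `context` tokens
    tokens = left + sentences[idx] + right
    token_labels = (
        [IGNORE_INDEX] * len(left) + labels[idx] + [IGNORE_INDEX] * len(right)
    )
    return tokens, token_labels


# Toy note with three sentences; label ids are arbitrary (0 = non-PHI).
sentences = [
    ["Patient", "seen", "today", "."],
    ["Dr.", "Smith", "examined", "her", "."],
    ["Follow", "up", "in", "May", "."],
]
labels = [[0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 0, 2, 0]]

# A context of 2 tokens is used here only to keep the example small.
tokens, token_labels = add_context(sentences, labels, idx=1, context=2)
```

Only the five tokens of the middle sentence keep their real labels; the two context tokens on each side are masked, so the model sees them as input but is not trained to label them in this sequence.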

Results

Questions?

Post a GitHub issue on the repo: Robust DeID.


License

MIT: https://choosealicense.com/licenses/mit

Model URL

https://huggingface.co/obi/deid_bert_i2b2

Provider

obi (organization)
