NLP - BERT
Bidirectional Encoder Representations from Transformers (BERT) encodes natural language using the words adjacent to each token to enhance contextual understanding. The provided implementation allows building on top of an existing BERT model:
- Model: `distilbert-base-uncased`. The model is trained in a federated learning structure: local models are trained per client, then the models are averaged (see the loading and averaging sketch after this list).
- Tokenizer: `DistilBertTokenizerFast`
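The following is a minimal sketch, assuming the Hugging Face `transformers` and `torch` packages, of how the model and tokenizer named above can be loaded and how per-client weights could be averaged in a FedAvg style. The `average_state_dicts` helper is illustrative only and is not part of the provided SDK.

```python
# Illustrative only: load the model/tokenizer named above and average client weights.
import torch
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def average_state_dicts(state_dicts):
    """FedAvg-style averaging of locally trained client weights (hypothetical helper)."""
    averaged = {}
    for key, reference in state_dicts[0].items():
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        averaged[key] = stacked.mean(dim=0).to(reference.dtype)
    return averaged

# Combine client models (identical here, for illustration) into one global model.
client_state_dicts = [model.state_dict(), model.state_dict()]
model.load_state_dict(average_state_dicts(client_state_dicts))
```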
Operation
- When using `add_agreement()` to forge an agreement on the trained model, use `Operation.EXECUTE` for the `operation` parameter.
- When using `add_agreement()` to allow a counterparty to use your dataset for model training, or using `create_job()` to train your model, use `Operation.BERT_SEQ_CLF_TRAIN` (see the usage sketch after this list).
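A hedged usage sketch follows. Only `add_agreement()`, `create_job()`, the `operation` parameter, and the `Operation` values are taken from this page; the import path is a placeholder and the calls will likely require additional arguments in the real SDK.

```python
# Hypothetical usage sketch; the import path below is a placeholder.
from sdk import Operation, add_agreement, create_job

# Allow a counterparty to train a BERT sequence classifier on your dataset.
add_agreement(operation=Operation.BERT_SEQ_CLF_TRAIN)

# Forge an agreement on the trained model so it can be executed.
add_agreement(operation=Operation.EXECUTE)

# Start the federated training job itself.
create_job(operation=Operation.BERT_SEQ_CLF_TRAIN)
```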
Parameters
- `optimizer_meta: OptimizerParams = OptimizerParams()`
- `epochs: int = 1`
- `batchsize: int = 32`
- `model_path: str = ""`
- `dataset_path: str = ""`
- `send_model: bool = False`
- `test_size: float = 0.0`
- `model_output: str = "binary"` (one of "binary", "multiclass", or "regression")
- `binary_metric: str = "accuracy"` (alternatives: f1_score, roc_auc_score, precision_recall_curve)
- `n_classes: int = None` (currently used for object detection; should not include the background class)
- `num_clients: int = 0`
- `client_index: int = 0`
- `federated_rounds: int = 1`
- `num_labels: int = 2`
- `data_column: str = ""`
- `target_column: str = ""`
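As an example of how these parameters might be populated for a binary text-classification job, here is a sketch using a plain dict; the dict itself is an assumption, and the real SDK may expect a dedicated parameters object with these field names.

```python
# Illustrative parameter values; the plain dict is an assumption about the SDK.
bert_seq_clf_params = {
    "epochs": 3,
    "batchsize": 32,
    "model_path": "distilbert-base-uncased",
    "dataset_path": "/path/to/reviews.csv",  # placeholder dataset location
    "test_size": 0.2,                        # hold out 20% of rows for evaluation
    "model_output": "binary",
    "binary_metric": "f1_score",
    "num_clients": 2,
    "client_index": 0,
    "federated_rounds": 5,
    "num_labels": 2,
    "data_column": "text",                   # column holding the input sentences
    "target_column": "label",                # column holding the class labels
}
```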
Limitations
- BERT is not supported in SMPC (secure multi-party computation).