Decision Tree
Securely and privately train a Decision Tree model over vertically-partitioned datasets, and use federated-security inferencing on the trained Decision Tree model.
Operation
When using add_agreement() to forge an agreement on a trained model, use the positioned asset’s UUID for the operation parameter.
When using add_agreement() to allow a counterparty to use your dataset for model training, or using create_job() to train a Decision Tree, use the appropriate operation parameter below.
PSI Vertical Decision Tree
Use Operation.PSI_VERTICAL_DECISION_TREE_TRAIN to identify an overlap of matching records across datasets, and then train a Decision Tree classification or regression model on the vertically-partitioned intersection.
Parameters
When running the training protocol explicitly (PSI_VERTICAL_DECISION_TREE_TRAIN) using create_job():
decision_tree: Dict{ "regression": bool, "max_depth": int }
- Set regression to False for classification, or True for regression.
- Setting max_depth to 3 or more increases execution time sharply.
psi: Dict{ "match_column": Union[str, List[str]] }
- Name of the column to match. If the name is not the same in all datasets, provide a list of the matching column names, one for each tables dataset, in order.
- If a single field name is provided, each dataset must use that same name for its match_column, e.g. “ID”.
target_column: str
- The name of the target column for the training.
- If multiple target columns are found with the same name, an exception will be thrown.
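As a minimal sketch of the training parameters above, the following plain-Python helper assembles the dictionary and checks it against the rules in this section. The helper name build_train_params, the n_datasets argument, the column names, and the checks themselves are illustrative assumptions and are not part of the SDK. How the resulting dictionary is handed to create_job() is not shown here.

```python
from typing import Dict, List, Union


def build_train_params(
    regression: bool,
    max_depth: int,
    match_column: Union[str, List[str]],
    target_column: str,
    n_datasets: int,
) -> Dict[str, object]:
    """Assemble the training parameters described above (hypothetical helper)."""
    # Assumption: a tree needs a depth of at least 1.
    if max_depth < 1:
        raise ValueError("max_depth must be a positive integer")
    # A list of match columns must name one column per dataset, in order.
    if isinstance(match_column, list) and len(match_column) != n_datasets:
        raise ValueError("match_column list needs one name per dataset, in order")
    if not target_column:
        raise ValueError("target_column must name the column to predict")
    return {
        "decision_tree": {"regression": regression, "max_depth": max_depth},
        "psi": {"match_column": match_column},
        "target_column": target_column,
    }


params = build_train_params(
    regression=False,                    # False selects classification
    max_depth=2,                         # depths of 3 or more run much slower
    match_column=["ID", "customer_id"],  # hypothetical names, one per dataset
    target_column="label",               # hypothetical target column
    n_datasets=2,
)
print(params)
```

Validating locally before submitting a job is a design choice in this sketch: it surfaces a mismatched match_column list before any protocol work starts.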
Inference parameters
psi: Dict{ "match_column": List[str] = ["id0", "id1"] }
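The inference parameter is typed as a list of column names. The sketch below assumes, based on the default ["id0", "id1"], that the list holds one match column per dataset, and expands a single shared name into such a list. The helper name inference_psi_params and the n_datasets argument are hypothetical and are not part of the SDK.

```python
from typing import Dict, List, Union


def inference_psi_params(
    match_column: Union[str, List[str]], n_datasets: int
) -> Dict[str, Dict[str, List[str]]]:
    """Build the psi inference parameters (hypothetical helper)."""
    # Assumption: one match column name is expected per dataset, in order.
    if isinstance(match_column, str):
        names = [match_column] * n_datasets
    else:
        names = list(match_column)
    if len(names) != n_datasets:
        raise ValueError("need exactly one match column name per dataset")
    return {"psi": {"match_column": names}}


print(inference_psi_params("ID", 2))
print(inference_psi_params(["id0", "id1"], 2))
```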
Limitations
- Supported for up to 100,000 samples.
- The owned dataset must be supplied as the first (or left-side) dataset asset.
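Both limitations can be checked before a job is submitted. The sketch below is one way to do that in plain Python. The function names, the placeholder asset identifiers, and the assumption that the limit applies to the sample count you pass in are illustrative and do not come from the SDK.

```python
from typing import List

MAX_SAMPLES = 100_000  # limit stated in this section


def order_assets(owned_asset: str, counterparty_assets: List[str]) -> List[str]:
    """Return the dataset list with the owned asset first (left-most)."""
    return [owned_asset, *counterparty_assets]


def check_sample_limit(n_samples: int) -> None:
    """Raise if the sample count exceeds the documented limit."""
    if n_samples > MAX_SAMPLES:
        raise ValueError(
            f"{n_samples} samples exceeds the supported {MAX_SAMPLES}"
        )


# Placeholder identifiers, used only to show the ordering.
datasets = order_assets("my-asset-uuid", ["partner-asset-uuid"])
check_sample_limit(80_000)
print(datasets)
```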