Module tripleblind.model_asset
Specialized Asset representing trained models, such as a neural network.
The ModelAsset wraps a generic asset, allowing the complexity of creating jobs to be completely hidden. Common operations can happen with just a few lines of code.
For example:
import tripleblind as tb
# Use a trained model to privately analyze a patient xray
model = tb.ModelAsset("diagnose_disease_model")
result = model.infer(data="xray.jpg")
print(result.table.dataframe)
Classes
class ModelAsset (uuid: UUID)
-
Points to a dataset or an algorithm indexed on the TripleBlind Router.
Static methods
def cast(asset: Asset) -> ModelAsset
-
Convert a generic Asset into a ModelAsset
This should only be used on an asset known to be a model; no validation occurs during the cast.
Args
asset
:Asset
- A generic Asset
Returns
ModelAsset
- A ModelAsset object
def find(search: Optional[Union[str, re.Pattern]], namespace: Optional[UUID] = None, owned: Optional[bool] = False, owned_by: Optional[int] = None, session: Optional[Session] = None, exact_match: Optional[bool] = True) -> ModelAsset
-
Search the Router index for an asset matching the given search
Args
search
:str or re.Pattern
, optional- Either an asset ID or a search pattern applied to asset names and descriptions. A plain string matches the entire string when exact_match is True and a substring otherwise; a regular expression can be passed for complex searches.
namespace
:UUID
, optional- The UUID of the user to which this asset belongs. None indicates any user, NAMESPACE_DEFAULT_USER indicates the current API user.
owned
:bool
, optional- Only return owned assets (either personally or by the current user's team)
owned_by
:int
, optional- Only return assets owned by the given team ID.
session
:Session
, optional- A connection session. If not specified, the default session is used.
exact_match
:bool
, optional- When the 'search' is a string, setting this to True will perform an exact match. Ignored for regex patterns, defaults to True.
Raises
TripleblindAssetError
- Raised when multiple assets match the search.
Returns
ModelAsset
- A single asset, or None if no match found
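The matching rules described for search and exact_match can be sketched in plain Python. This is only an illustration under stated assumptions: find_one and the names list are hypothetical stand-ins for the Router index, only names (not descriptions) are searched, matching is assumed to be case-sensitive, and LookupError stands in for TripleblindAssetError.

```python
import re
from typing import List, Optional, Union


def find_one(
    names: List[str],
    search: Union[str, re.Pattern],
    exact_match: bool = True,
) -> Optional[str]:
    """Hypothetical stand-in for the Router lookup: return the single matching name."""
    if isinstance(search, re.Pattern):
        # exact_match is ignored for compiled regular expressions
        hits = [n for n in names if search.search(n)]
    elif exact_match:
        hits = [n for n in names if n == search]
    else:
        hits = [n for n in names if search in n]
    if len(hits) > 1:
        # The documented method raises TripleblindAssetError in this case
        raise LookupError(f"{len(hits)} assets match {search!r}")
    return hits[0] if hits else None


# Hypothetical asset names, for illustration only
names = ["diagnose_disease_model", "diagnose_disease_model_v2", "fraud_model"]

print(find_one(names, "diagnose_disease_model"))      # exact string match
print(find_one(names, "fraud", exact_match=False))    # substring match
print(find_one(names, re.compile(r"_v\d+$")))         # regular expression
print(find_one(names, "missing"))                     # no match gives None
```

A substring search for "diagnose" in this list would raise, because two names match. This mirrors the documented behavior of returning a single asset or failing when the search is ambiguous.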
Methods
def infer(self, data: Union[Asset, TableAsset, str, Path, List[Asset], List[TableAsset], List[str], List[Path]], preprocessor: Optional[Union[TabularPreprocessor, List[TabularPreprocessor], TabularPreprocessorBuilder, List[TabularPreprocessorBuilder], ImagePreprocessorBuilder, List[ImagePreprocessorBuilder], NumpyInputPreprocessor, List[NumpyInputPreprocessor], NumpyInputPreprocessorBuilder, List[NumpyInputPreprocessorBuilder]]] = None, params: Optional[Dict] = None, job_name: Optional[str] = None, silent: Optional[bool] = False, session: Optional[Session] = None, stream_output: bool = False, identifier_columns: Optional[Union[List[str], str]] = None) -> JobResult | StatusOutputStream
-
Perform an inference using a model
NOTE: For inferences which produce textual output, such as a classifier, the result can be easily accessed via code like this:
r = model.infer("data.csv")
print(r.table)
Or the r.table.dataframe can be used as a standard Pandas dataframe.
Args
data
:Asset or str
- The data to infer against. Can be an Asset, a path to a file, or a list of either.
preprocessor
:Preprocessor
- A preprocessor to apply to the data. If not defined, the dataset is used directly.
params
:dict
- Dictionary of unique parameters for the model. Typically, this is not needed.
job_name
:str
, optional- Reference name for the job which performs this task.
silent
:bool
, optional- Suppress status messages during execution? Default is to show messages.
session
:Session
, optional- A connection session. If not specified, the default session is used.
stream_output
:bool
, optional- Whether to start the job and return a StatusOutputStream, or wait for job completion and return a JobResult (the default).
identifier_columns
:str, List[str]
, optional- Column or columns which will be returned alongside results. Default is None.
Raises
TripleblindAPIError
- Inference failed
Returns
When stream_output is set to False (the default), a JobResult is returned once the job completes. If successful, the inference output is found at result.asset and/or result.table.
If stream_output is set to True, a StatusOutputStream object is immediately returned and can be used as a Generator that outputs the status messages produced while the job is running.
def psi_infer(self, data: Union[Asset, List[Asset], TableAsset, List[TableAsset]], match_column: Union[str, List[str]], regression_type: Optional[RegressionType] = None, preprocessor: Optional[Union[TabularPreprocessor, List[TabularPreprocessor], TabularPreprocessorBuilder, List[TabularPreprocessorBuilder]]] = None, params: Optional[Dict] = None, job_name: Optional[str] = None, silent: Optional[bool] = False, session: Optional[Session] = None, stream_output: bool = False) -> JobResult | StatusOutputStream
-
Perform an inference using a model on distributed data matched with PSI
NOTE: For inferences which produce textual output, such as a classifier, the result can be easily accessed via code like this:
r = model.psi_infer(data, match_column="id")  # match_column is required; "id" is a placeholder column name
print(r.table)
Or the r.table.dataframe can be used as a standard Pandas dataframe.
Args
data
:Asset or TableAsset
- The data to infer against. Can be a single asset or a list of assets.
match_column
:Union[str, List[str]]
- Name of the column to match. If not the same in all datasets, a list of the matching column names, starting with the initiator asset and then listing a name in each dataset.
regression_type
:RegressionType
- The type of regression to be performed. If populated, a regression inference is performed. One of tb.RegressionType.LINEAR or tb.RegressionType.LOGISTIC.
preprocessor
:Union[TabularPreprocessor, List[TabularPreprocessor], TabularPreprocessorBuilder, List[TabularPreprocessorBuilder]]
, optional- A preprocessor to apply to the data. If not defined, the dataset is used directly.
params
:dict
- Dictionary of unique parameters for the model. Typically, this is not needed.
job_name
:str
, optional- Reference name for the job which performs this task.
silent
:bool
, optional- Suppress status messages during execution? Default is to show messages.
session
:Session
, optional- A connection session. If not specified, the default session is used.
stream_output
:bool
, optional- Whether to start the job and return a StatusOutputStream, or wait for job completion and return a JobResult (the default).
Returns
When stream_output is set to False (the default), a JobResult is returned once the job completes. If successful, the inference output is found at result.asset and/or result.table.
If stream_output is set to True, a StatusOutputStream object is immediately returned and can be used as a Generator that outputs the status messages produced while the job is running.
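The two accepted forms of match_column (a single name shared by all datasets, or one name per dataset listed with the initiator asset first) can be illustrated with a small helper. This is a sketch, not part of the tripleblind API: columns_per_asset and asset_count are hypothetical names, and the column names are made up.

```python
from typing import List, Union


def columns_per_asset(match_column: Union[str, List[str]], asset_count: int) -> List[str]:
    """Hypothetical helper: expand match_column into one column name per asset."""
    if isinstance(match_column, str):
        # The same column name is used in every dataset
        return [match_column] * asset_count
    if len(match_column) != asset_count:
        raise ValueError(
            f"expected {asset_count} column names, got {len(match_column)}"
        )
    # Names are ordered: initiator asset first, then each remaining dataset
    return list(match_column)


# One shared name, then one name per dataset (illustrative column names)
print(columns_per_asset("patient_id", 3))
print(columns_per_asset(["patient_id", "pid", "PatientID"], 3))
```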
class ModelTrainerAsset (uuid: UUID)
-
Points to a dataset or an algorithm indexed on the TripleBlind Router.
Static methods
def cast(asset: Asset) -> ModelTrainerAsset
-
Convert a generic Asset into a ModelTrainerAsset
This should only be used on an asset known to be a model; no validation occurs during the cast.
Args
asset
:Asset
- A generic Asset
Returns
ModelTrainerAsset
- A ModelTrainerAsset object
Methods
def train(self, data: Optional[Union[Asset, str, Path, Package, List[Asset], List[str], List[Path], List[Package]]], data_type: str = 'table', epochs: int = 1, model_output: str = None, data_shape: Optional[List[int]] = None, batch_size: Optional[int] = None, test_size: Optional[float] = None, preprocessor: Union[TabularPreprocessor, List[TabularPreprocessor], TabularPreprocessorBuilder, List[TabularPreprocessorBuilder], ImagePreprocessorBuilder, List[ImagePreprocessorBuilder], NumpyInputPreprocessor, List[NumpyInputPreprocessor], NumpyInputPreprocessorBuilder, List[NumpyInputPreprocessorBuilder]] = None, loss_name: str = None, loss_params: Optional[Dict] = None, optimizer_name: str = None, optimizer_params: Optional[Dict] = None, lr_scheduler_name: Optional[str] = None, lr_scheduler_params: Optional[Dict] = None, params: Optional[Dict] = None, delete_trainer: Optional[bool] = False, job_name: Optional[str] = None, silent: Optional[bool] = False, session: Optional[Session] = None, stream_output: bool = False) -> JobResult | StatusOutputStream
-
Train this model using the data and parameters specified
Args
data
:Asset, str, Path, Package or list of same
, optional- One or more datasets to use for training. Datasets can be specified as Assets or as a filename. When a filename is given, it must be a path to valid data; it is automatically converted to a temporary Asset, which is deleted when the Job completes.
data_type
:str
- The type of the training data. Valid values are "table", "image", and "numpy".
epochs
:int
, optional- Number of passes to make through the training data.
model_output
:str
- The type of result generated by the model. Valid values are "regression", "multiclass", and "binary".
data_shape
:List[int]
, optional- Description of the training data, depending on the data_type:
table - number of columns of data, e.g. [cols]
image - image dimensions, e.g. [width, height, bytes-per-pixel]
numpy - not used
batch_size
:int
, optional- Number of data samples to pass at one time during training.
test_size
:float
, optional- A percentage of the data to be reserved for accuracy testing and reporting with each epoch.
preprocessor
:Preprocessor or List[Preprocessor]
, optional- A single preprocessor to apply to all data, or a list of preprocessors to apply to each dataset. If a list of preprocessors is given, the count must match the number of datasets.
loss_name
:str
, optional- A loss function name, consistent with PyTorch. See https://pytorch.org/docs/stable/nn.html#loss-functions
loss_params
:dict
, optional- Dictionary of parameters appropriate for the loss function.
optimizer_name
:str
, optional- An optimizer function name, consistent with PyTorch. See https://pytorch.org/docs/stable/optim.html
optimizer_params
:dict
, optional- Dictionary of parameters appropriate for the optimizer_name.
lr_scheduler_name
:str
, optional- A learning rate scheduler function name, either "CyclicLR" or "CyclicCosineDecayLR". Default is to use a constant learning rate.
lr_scheduler_params
:dict
, optional- Dictionary of parameters appropriate for the lr_scheduler_name. Legal values depend on the lr_scheduler_name.
For "CyclicLR":
{
    'step_size': 10,       # Number of epochs over which the cycle is completed.
    'base_lr': 0.0001,     # Starting rate, lower boundary in the cycle.
    'max_lr': 0.01,        # Upper boundary in the cycle.
    'mode': 'triangular',  # or "triangular2", or "exp_range"
    'gamma': 0.99,         # Multiplicative factor of decay of learning rate at the end of each cycle, default=0.99
}
For "CyclicCosineDecayLR":
{
    "init_decay_epochs": 10,             # Number of initial decay epochs.
    "min_decay_lr": 0.0001,              # Learning rate at the end of decay.
    "restart_interval": 3,               # Restart interval for fixed cycles, or None to disable cycles.
    "restart_interval_multiplier": 1.5,  # Multiplication coefficient for geometrically increasing cycles.
    "restart_lr": 0.01,                  # Learning rate when cycle restarts.
    "warmup_epochs": None,               # Number of warmup epochs, default is None.
    "warmup_start_lr": None,             # Learning rate at the beginning of warmup; unused when warmup_epochs is None.
}
params
:dict
, optional- Additional custom parameters.
delete_trainer
:bool
, optional- Set to True to delete the training model after training completes. Ignored if stream_output is set to True.
job_name
:str
, optional- Reference name for the job which performs this task. Default is "Model training - ".
silent
:bool
, optional- Suppress status messages during execution? Default is to show messages.
session
:Session
, optional- A connection session. If not specified, the default session is used.
stream_output
:bool
, optional- Whether to start the job and return a StatusOutputStream, or wait for job completion and return a JobResult (the default).
Raises
TripleblindTrainingError
- Model training failed
Returns
When stream_output is set to False (the default), a JobResult is returned once the job completes. If successful, the inference output is found at result.asset and/or result.table.
If stream_output is set to True, a StatusOutputStream object is immediately returned and can be used as a Generator that outputs the status messages produced while the job is running.
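The rule for the preprocessor argument (one preprocessor applied to all data, or a list whose length matches the number of datasets) can be sketched as follows. The helper pair_preprocessors is hypothetical and not part of the tripleblind API, and plain strings stand in for real datasets and preprocessors.

```python
from typing import Any, List, Sequence, Tuple, Union


def pair_preprocessors(
    datasets: Sequence[Any],
    preprocessor: Union[Any, List[Any], None],
) -> List[Tuple[Any, Any]]:
    """Hypothetical helper: pair every dataset with the preprocessor that applies to it."""
    if isinstance(preprocessor, list):
        # A list must supply exactly one preprocessor per dataset
        if len(preprocessor) != len(datasets):
            raise ValueError(
                f"{len(preprocessor)} preprocessors given for {len(datasets)} datasets"
            )
        return list(zip(datasets, preprocessor))
    # A single preprocessor (or None) is shared by all datasets
    return [(d, preprocessor) for d in datasets]


# Strings stand in for real assets and preprocessor objects
print(pair_preprocessors(["hospital_a.csv", "hospital_b.csv"], "shared_pre"))
print(pair_preprocessors(["hospital_a.csv", "hospital_b.csv"], ["pre_a", "pre_b"]))
```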