Analytics

The following analytic use cases will help you understand how to use TripleBlind to conduct privacy-preserving analytics on 3rd-party data.

Use Case #1: Blind Join (SDK)

Using the TripleBlind SDK, join two or more tabular datasets on the intersection of a given column and return any columns for the intersection subset. The personas represented in this use case are the Data Scientist or Data Analyst (User) and the Dataset Owner (Owner).
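
To make the shape of the result concrete, the sketch below performs an ordinary, non-private join in plain Python. It is only an analogy for what a join on the intersection of a column returns (rows whose key values appear in both tables, restricted to chosen columns); it says nothing about how TripleBlind computes the result privately. The function name `plain_join` and the sample rows are illustrative assumptions.

```python
def plain_join(left, right, left_key, right_key, return_columns):
    """Non-private analog of a join on the intersection of two key columns.

    `left` and `right` are lists of dicts (one dict per row). Rows of
    `right` whose key value also appears in `left` are kept, and only
    `return_columns` are returned for them.
    """
    left_keys = {row[left_key] for row in left}
    return [
        {col: row[col] for col in return_columns}
        for row in right
        if row[right_key] in left_keys
    ]


# Illustrative sample data, not taken from any TripleBlind dataset.
transactions = [
    {"address": "1 Main St", "amount": 20},
    {"address": "9 Oak Ave", "amount": 35},
]
customers = [
    {"customer_address": "1 Main St", "segment": "A"},
    {"customer_address": "4 Elm Rd", "segment": "B"},
]

matches = plain_join(
    transactions, customers, "address", "customer_address", ["customer_address"]
)
print(matches)  # [{'customer_address': '1 Main St'}]
```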

Workflow

The following workflow is used to perform this analysis using TripleBlind.

  1. Initialize a TripleBlind session
  2. Register new assets or locate existing assets
  3. Explore assets
  4. Run an analysis process and get results

Steps

To execute this use case, follow these steps in your Python IDE:

1. The User authenticates with the TripleBlind Router and starts a Session.

import tripleblind as tb
tb.initialize(api_token=user1_token)

ℹ️ The call to tb.initialize is unnecessary if the User token is set up in the User’s tripleblind.yaml file.

2. The User registers a dataset as a new asset.

asset0 = tb.Asset.position(
    file_handle=data_dir / "store_transactions.csv",
    name=f"Shop Transaction-{run_id}",
    desc="Fictional retail transaction data.",
    is_discoverable=True,
    cost=1,
)
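
The registration call refers to `data_dir` and `run_id`, which are not defined in this walkthrough. The sketch below shows one possible way to define them, assuming the CSV file lives in a local folder named `example_data` (a hypothetical path) and that `run_id` only needs to keep asset names distinct between runs.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical location of the CSV file; adjust to your own layout.
data_dir = Path("example_data")

# A timestamp suffix keeps asset names distinct across repeated runs.
run_id = datetime.now().strftime("%Y%m%d-%H%M%S")

asset_name = f"Shop Transaction-{run_id}"
print(data_dir / "store_transactions.csv")
print(asset_name)
```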

Alternatively, the User searches for an existing dataset by name or UUID.

asset0 = tb.TableAsset.find("Shop Transaction")
# or
asset0 = tb.TableAsset.find("673b8bd1-e758-4d56-b6c1-4e1ff946f1c7")
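
The search string can be either a name or a UUID. When that string comes from user input or a configuration file, it can help to know which form was supplied. The helper below is a hypothetical convenience, not part of the TripleBlind SDK, and relies only on Python's standard `uuid` module.

```python
import uuid


def looks_like_uuid(text):
    """Return True if the string `text` parses as a UUID, False otherwise."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


print(looks_like_uuid("673b8bd1-e758-4d56-b6c1-4e1ff946f1c7"))  # True
print(looks_like_uuid("Shop Transaction"))  # False
```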

3. Optionally, the User explores a synthetic data view and profile of a registered asset.

4. The Owner adds an agreement to their asset.

asset0.add_agreement(with_org=1, operation=tb.Operation.BLIND_JOIN)

Alternatively, the Owner authorizes the access request manually on the Access Requests page in the web interface.

5. The User performs a Blind Join on the datasets and consumes the results.

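# asset1 is the other dataset in the join; it is not defined above and
# must be registered or located first.
result = asset0.blind_join(
    intersect_with=asset1,
    match_column=["address", "customer_address"],
    return_columns=["customer_address"],
)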
print(result.dataframe)
result.download("result.zip", overwrite=True)
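
The last line saves the result as `result.zip`. The layout of that archive is not described here, so the sketch below only lists whatever files an archive contains, using Python's standard `zipfile` module. The helper name `list_archive` is illustrative, and the stand-in archive it builds exists only so the example runs on its own.

```python
import tempfile
import zipfile
from pathlib import Path


def list_archive(path):
    """Return the names of the files stored in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Build a small stand-in archive so the example is self-contained.
# With a real download, pass the path to "result.zip" instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("result.csv", "customer_address\n1 Main St\n")
    names = list_archive(demo_zip)

print(names)  # ['result.csv']
```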

Use Case #2: Blind Join (web interface)

Using the TripleBlind web interface, join two or more tabular datasets on the intersection of a given column and return columns for the intersection subset. Note that the owner of any 3rd-party dataset must approve the operation, and only unmasked fields can be returned from 3rd-party datasets. The personas represented in this use case are the Data Scientist, Data Analyst, or Business User (User) and the Dataset Owner (Owner).

Workflow

The following workflow is used to perform this analysis using TripleBlind.

  1. Log in to the TripleBlind web interface
  2. Register new assets or locate existing assets
  3. Explore assets
  4. Run an analysis process and get results

Steps

To execute this use case, follow these steps:

1. The User logs in to the TripleBlind web interface and selects Create New Process.

The User selects the Blind Join process and then selects Continue.

2. The User adds datasets as new assets to be used in the operation, or searches for existing asset(s), and then selects Continue.

3. The User specifies columns to intersect and the match “fuzziness.”

The User specifies column(s) to return from each dataset and selects Continue.

ℹ️ Only unmasked columns from 3rd-party datasets are available for return. If the desired columns are not visible, ask the Owner to unmask them.
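
Match “fuzziness” governs how closely two values must agree to count as a match. How TripleBlind scores similarity is not specified here; purely as an illustration of the idea, the sketch below uses `difflib.SequenceMatcher` from Python's standard library, with a hypothetical `threshold` parameter standing in for the fuzziness setting.

```python
from difflib import SequenceMatcher


def is_fuzzy_match(a, b, threshold=0.8):
    """Return True if the similarity ratio of `a` and `b` meets `threshold`.

    The comparison ignores case and surrounding whitespace. A threshold of
    1.0 requires an exact match after that normalization; lower values
    tolerate more differences.
    """
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


print(is_fuzzy_match("1 Main St", "1 main st "))      # True
print(is_fuzzy_match("1 Main St", "42 Harbor Blvd"))  # False
```

Lowering `threshold` admits more near-matches at the cost of more false positives, which is the trade-off a fuzziness setting typically exposes.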

4. The User confirms the Job details and initiates the process by selecting Run.

5. The Owner grants authorization.

6. The User monitors the process and downloads the result when it completes.

Wed May 15 2024 03:01:25 GMT-0400 (Eastern Daylight Time)