Google Cloud Storage
This guide describes how to connect Aporia to a Google Cloud Storage (GCS) data source to monitor your ML model in production.
We will assume that your model inputs, outputs, and optionally delayed actuals are stored in a file in GCS. Currently, the following file formats are supported:
parquet
json
This data source may also be used to connect to your model's training dataset to be used as a baseline for model monitoring.
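Before pointing Aporia at a bucket, it can help to confirm that your files are in one of the supported formats listed above. The following is a minimal sketch that infers the format from the file extension of a `gs://` URI. The function name `gcs_file_format` and the example paths are hypothetical, and judging the format by extension is an assumption of this sketch, not a description of how Aporia itself detects formats.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# The two file formats this guide lists as supported.
SUPPORTED_FORMATS = {"parquet", "json"}


def gcs_file_format(uri: str) -> Optional[str]:
    """Return "parquet" or "json" based on the URI's extension, else None.

    Assumption: the file extension reflects the actual file format.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a GCS URI: {uri!r}")
    # Take the extension of the object path, without the dot, lowercased.
    suffix = PurePosixPath(parsed.path).suffix.lstrip(".").lower()
    return suffix if suffix in SUPPORTED_FORMATS else None
```

For example, `gcs_file_format("gs://my-bucket/models/churn/serving.parquet")` returns `"parquet"`, while a `.csv` object returns `None`.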
To give Aporia access to GCS, you'll need to grant your Aporia Dataproc worker service account the necessary permissions on your buckets.
Go to the Cloud Storage buckets page.
Select the buckets where your data is stored.
Click the Permissions button.
On the Permissions tab, click on the Add Principal button.
On the Grant access page, do the following:
Add the Aporia Dataproc Worker Service Account as a principal.
Assign the Storage Object Viewer role.
Click Save.
Aporia now has the read permission it needs to connect to the GCS buckets you granted it access to.
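If you prefer the command line to the console, the same grant can likely be made with the `gcloud storage buckets add-iam-policy-binding` command. The sketch below only builds that command so you can review it before running it. The helper name `grant_viewer_command`, the bucket name, and the service account email are placeholders for illustration; substitute the actual Aporia Dataproc worker service account for your deployment. `roles/storage.objectViewer` is the role ID for Storage Object Viewer.

```python
import shlex
from typing import List

# Role ID of the "Storage Object Viewer" role assigned in the steps above.
OBJECT_VIEWER_ROLE = "roles/storage.objectViewer"


def grant_viewer_command(bucket: str, service_account_email: str) -> List[str]:
    """Build the gcloud argv that grants read access on one bucket.

    Both arguments are supplied by the caller; nothing is executed here.
    """
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding",
        f"gs://{bucket}",
        f"--member=serviceAccount:{service_account_email}",
        f"--role={OBJECT_VIEWER_ROLE}",
    ]


if __name__ == "__main__":
    # Hypothetical bucket and service account, for illustration only.
    cmd = grant_viewer_command(
        "my-model-data",
        "aporia-dataproc-worker@my-project.iam.gserviceaccount.com",
    )
    print(shlex.join(cmd))  # review the command, then run it in your shell
```

Run the helper once per bucket that holds your model data. Building the command as a list rather than a string keeps each argument separate, so it can also be passed directly to `subprocess.run` without shell quoting issues.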
Go to the Aporia platform and log in to your account.
Go to the Integrations page and click on the Data Connectors tab.
Scroll to the Connect New Data Source section.
Click Connect on the GCS card and follow the instructions.
Bravo! 👏 Now you can use the data source you've created across all your models in Aporia.