Binary Classification

Binary classification models predict a binary outcome (one of two possible classes). In Aporia, these models are represented by the binary model type.

Examples of binary classification problems:

  • Will the customer buy or not_buy this product?

  • Is this email spam or not_spam?

  • Is this review written by a customer or a robot?

Frequently, binary models output not only a yes/no answer, but also a probability.
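As a minimal sketch of how the two outputs relate, the snippet below turns a probability into a yes/no decision using a cutoff. The function name to_decision and the default threshold of 0.5 are assumptions made for illustration; they are not part of Aporia.

```python
def to_decision(proba: float, threshold: float = 0.5) -> bool:
    """Return True when the probability reaches the (assumed) threshold."""
    if not 0.0 <= proba <= 1.0:
        raise ValueError("proba must be between 0 and 1")
    return proba >= threshold


# Hypothetical model scores, for illustration only.
scores = [0.12, 0.5, 0.93]
decisions = [to_decision(p) for p in scores]
print(decisions)  # [False, True, True]
```

A model that stores both values would keep the score as the probability and the thresholded result as the decision.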

Example: Boolean Decision without Probability

If you have a model with a yes/no decision but without a probability value, then your database may look like the following:
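For illustration only, a table for such a model might be shaped like the rows below. The id, timestamp, decision, and label columns match the names used in the calls that follow; the feature columns age and country, and every value, are made up.

```python
from datetime import datetime

# Hypothetical rows: "age" and "country" stand in for your own feature columns.
rows = [
    {"id": "1", "timestamp": datetime(2024, 1, 1, 12, 0), "age": 31,
     "country": "US", "decision": True, "label": True},
    {"id": "2", "timestamp": datetime(2024, 1, 1, 12, 5), "age": 47,
     "country": "DE", "decision": False, "label": True},
    {"id": "3", "timestamp": datetime(2024, 1, 1, 12, 9), "age": 25,
     "country": "FR", "decision": False, "label": False},
]

# Print the rows as a simple tab-separated table.
columns = list(rows[0].keys())
print("\t".join(columns))
for row in rows:
    print("\t".join(str(row[col]) for col in columns))
```

Here decision holds the model's boolean prediction and label holds the actual outcome it is compared against.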

To monitor this model, create a new model version with a schema that includes a boolean prediction:

import aporia

apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="binary",
  features={
     ...
  },
  predictions={
    "decision": "boolean",
  },
)

To connect this model to Aporia from your data source, call the connect_serving(...) API:

apr_model.connect_serving(
  data_source=my_data_source,

  id_column="id",
  timestamp_column="timestamp",

  # Map the "label" column as the label for the "decision" prediction. 
  labels={
    # Prediction name -> Column name
    "decision": "label"
  }
)

Check out the Data Sources section for further reading on the available data sources and how to connect to each one of them.

Example: Boolean Decision with Probability

If you have a model with a yes/no decision and a probability / confidence value for it, then your database may look like the following:

To monitor this model, we recommend creating a new model version with a schema that includes the final decision as a boolean field and the probability as a numeric field:

apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="binary",
  features={
     ...
  },
  predictions={
    "decision": "boolean",
    "proba": "numeric",
  },
)

To connect the model to Aporia from a data source, call the connect_serving(...) API:

apr_model.connect_serving(
  data_source=my_data_source,
    
  id_column="id",
  timestamp_column="timestamp",

  # Map the "label" column as the label for "decision" and "proba". 
  labels={
    # Prediction name -> Column name
    "decision": "label",
    "proba": "label",
  }
)

Check out the Data Sources section for further reading on the available data sources and how to connect to each one of them.

Example: Probability Only

When there is no threshold for your boolean prediction and the final business result is the probability itself, omit the decision field from the example in the previous section and include only the proba field in your predictions.
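The difference between the three cases comes down to which keys appear in the predictions schema. The sketch below builds that dictionary with a hypothetical helper, build_predictions_schema, which is not part of the Aporia SDK; the field names and type strings are the ones used in the earlier examples.

```python
def build_predictions_schema(include_decision: bool, include_proba: bool) -> dict:
    """Build a predictions schema dict (hypothetical helper, not an Aporia API)."""
    schema = {}
    if include_decision:
        schema["decision"] = "boolean"
    if include_proba:
        schema["proba"] = "numeric"
    if not schema:
        raise ValueError("at least one prediction field is required")
    return schema


# Probability only: the decision field is simply left out.
proba_only = build_predictions_schema(include_decision=False, include_proba=True)
print(proba_only)  # {'proba': 'numeric'}
```

The resulting dictionary would be passed as the predictions argument of create_model_version, in the same way as in the previous examples.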

Don't want to connect to a database?

Don't worry - you can log your predictions directly to Aporia.
