Aporia Documentation

Binary Classification

Binary classification models predict a binary outcome (one of two possible classes). In Aporia, these models are represented by the binary model type.

Examples of binary classification problems:

  • Will the customer buy this product or not_buy this product?

  • Is this email spam or not_spam?

  • Is this review written by a customer or a robot?

Frequently, binary models output not only a yes/no answer, but also a probability.
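As a minimal sketch of how a probability relates to a yes/no answer, the snippet below turns a score into a decision with a cutoff. The `decide` helper and the 0.5 threshold are assumptions for illustration only; they are not part of the Aporia SDK, and your model may use a different rule.

```python
def decide(proba: float, threshold: float = 0.5) -> bool:
    """Return True when the probability is strictly above the threshold.

    The 0.5 default is an illustrative assumption, not an Aporia setting.
    """
    return proba > threshold


scores = [0.8, 0.5, 0.1]
decisions = [decide(p) for p in scores]
print(decisions)  # [True, False, False]
```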

Example: Boolean Decision without Probability

If you have a model with a yes/no decision but without a probability value, then your database may look like the following:

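| id | feature1 (numeric) | feature2 (boolean) | decision (boolean) | label (boolean) | timestamp (datetime) |
|----|--------------------|--------------------|--------------------|-----------------|----------------------|
| 1  | 13.5               | True               | True               | True            | 2014-10-19 10:23:54  |
| 2  | -8                 | False              | False              | True            | 2014-10-19 10:24:24  |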

To monitor this model, we will create a new model version with a schema that includes a boolean prediction:

apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="binary",
  features={
     ...
  },
  predictions={
    "decision": "boolean",
  },
)

To connect this model to Aporia from your data source, call the connect_serving(...) API:

apr_model.connect_serving(
  data_source=my_data_source,

  id_column="id",
  timestamp_column="timestamp",

  # Map the "label" column as the label for the "decision" prediction. 
  labels={
    # Prediction name -> Column name
    "decision": "label"
  }
)
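Mapping the label column to the decision prediction is what allows predictions to be compared with actual outcomes. The sketch below illustrates that comparison in plain Python using the two example rows from the table; it is not the Aporia SDK, and it makes no claim about how Aporia computes its metrics internally.

```python
# The two example rows, reduced to the columns needed for the comparison.
rows = [
    {"id": 1, "decision": True, "label": True},
    {"id": 2, "decision": False, "label": True},
]

# Accuracy: the fraction of rows where the decision matches the label.
correct = sum(row["decision"] == row["label"] for row in rows)
accuracy = correct / len(rows)
print(accuracy)  # 0.5
```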

Example: Boolean Decision with Probability

If you have a model with a yes/no decision and a probability / confidence value for it, then your database may look like the following:

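| id | feature1 (numeric) | feature2 (boolean) | proba (numeric) | decision (boolean) | label (boolean) | timestamp (datetime) |
|----|--------------------|--------------------|-----------------|--------------------|-----------------|----------------------|
| 1  | 13.5               | True               | 0.8             | True               | True            | 2014-10-19 10:23:54  |
| 2  | -8                 | False              | 0.5             | False              | True            | 2014-10-19 10:24:24  |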

To monitor this model, it's recommended to create a new model version with a schema that includes the final decision as a boolean field and the probability as a numeric field:

apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="binary",
  features={
     ...
  },
  predictions={
    "decision": "boolean",
    "proba": "numeric",
  },
)

To connect the model to Aporia from a data source, call the connect_serving(...) API:

apr_model.connect_serving(
  data_source=my_data_source,
    
  id_column="id",
  timestamp_column="timestamp",

  # Map the "label" column as the label for "decision" and "proba". 
  labels={
    # Prediction name -> Name of the column that holds its label
    "decision": "label",
    "proba": "label",
  }
)
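Both predictions point at the same label column because the true outcome can be used to evaluate either one. As an illustration of evaluating the probability against the label, the sketch below computes a Brier score (mean squared difference between the probability and the 0/1 outcome) for the two example rows. The choice of metric is an assumption made for this example; this is plain Python, not the Aporia SDK.

```python
# The two example rows, reduced to the probability and the true label.
rows = [
    {"id": 1, "proba": 0.8, "label": True},
    {"id": 2, "proba": 0.5, "label": True},
]

# Brier score: mean of (probability - outcome)^2, with outcome as 0 or 1.
squared_errors = [(row["proba"] - int(row["label"])) ** 2 for row in rows]
brier_score = sum(squared_errors) / len(rows)
print(round(brier_score, 3))  # 0.145
```

Lower values indicate probabilities that sit closer to the observed outcomes.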

Example: Probability Only

In cases where there is no threshold for your boolean prediction, and the final business result is actually a probability, you may simply omit the decision field from the examples in the previous section and include only the proba field for your prediction.
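A small sketch of what omitting the decision field could look like, starting from the schema and label mapping of the previous section. The `drop_decision` helper is hypothetical and exists only for this illustration; the dictionaries here are plain Python values, not SDK objects.

```python
# Schema and label mapping from the "Boolean Decision with Probability" example.
predictions = {"decision": "boolean", "proba": "numeric"}
labels = {"decision": "label", "proba": "label"}


def drop_decision(mapping: dict) -> dict:
    """Hypothetical helper: return a copy of the mapping without "decision"."""
    return {name: value for name, value in mapping.items() if name != "decision"}


proba_only_predictions = drop_decision(predictions)
proba_only_labels = drop_decision(labels)
print(proba_only_predictions)  # {'proba': 'numeric'}
print(proba_only_labels)       # {'proba': 'label'}
```

The resulting dictionaries would then be passed as the predictions argument of create_model_version(...) and the labels argument of connect_serving(...), in the same way as in the earlier examples.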

Don't want to connect to a database? Don't worry - you can log your predictions directly to Aporia.

Check out the Data Sources section for further reading on the available data sources and how to connect to each one of them.