Binary Classification
Binary classification models predict a binary outcome (one of two possible classes). In Aporia, these models are represented by the binary model type.
Examples of binary classification problems:
- Will the customer `buy` this product or `not_buy` this product?
- Is this email `spam` or `not_spam`?
- Is this review written by a `customer` or a `robot`?
Frequently, binary models output not only a yes/no answer, but also a probability.
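As a rough illustration of how the two outputs relate, the sketch below turns a probability into a yes/no decision. The function name `decide`, the default threshold of 0.5, and the strict comparison are assumptions made for this example; they are not part of Aporia's API.

```python
def decide(proba: float, threshold: float = 0.5) -> bool:
    """Turn a predicted probability into a yes/no decision (illustrative only)."""
    if not 0.0 <= proba <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {proba}")
    # Assumption: only values strictly above the threshold count as "yes".
    return proba > threshold


high = decide(0.8)        # above the threshold
borderline = decide(0.5)  # exactly at the threshold, so not a "yes"
```

A model may apply a different threshold or comparison; the point is only that the decision is derived from the probability.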
If you have a model with a yes/no decision but without a probability value, then your database may look like the following:
id | feature1 (numeric) | feature2 (boolean) | decision (boolean) | label (boolean) | timestamp (datetime) |
---|---|---|---|---|---|
1 | 13.5 | True | True | True | 2014-10-19 10:23:54 |
2 | -8 | False | False | True | 2014-10-19 10:24:24 |
To monitor this model, we will create a new model version with a schema that includes a `boolean` prediction:

    import aporia

    apr_model = aporia.create_model_version(
        model_id="<MODEL_ID>",
        model_version="v1",
        model_type="binary",
        features={
            ...
        },
        predictions={
            "decision": "boolean",
        },
    )
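The schema dictionaries map field names to type strings. The sketch below derives such dictionaries from one sample row of the table above. The helpers `infer_field_type` and `build_schema` are hypothetical and not part of the Aporia SDK, and it is an assumption that features accept the same `"boolean"` and `"numeric"` type strings shown for predictions.

```python
def infer_field_type(value):
    """Map a Python value to a schema type string (hypothetical helper)."""
    # bool must be checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    raise TypeError(f"no schema type assumed for {type(value).__name__}")


def build_schema(row, columns):
    """Build a {field name: type string} dict for the selected columns."""
    return {name: infer_field_type(row[name]) for name in columns}


# First row of the example table (id and timestamp omitted).
sample_row = {"feature1": 13.5, "feature2": True, "decision": True, "label": True}

features = build_schema(sample_row, ["feature1", "feature2"])
predictions = build_schema(sample_row, ["decision"])
```

Inferring types from a single row is fragile, so in practice the schema would more likely be written out by hand as in the example above.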
To connect this model to Aporia from your data source, call the `connect_serving(...)` API:

    apr_model.connect_serving(
        data_source=my_data_source,
        id_column="id",
        timestamp_column="timestamp",
        # Map the "label" column as the label for the "decision" prediction.
        labels={
            # Prediction name -> Column name
            "decision": "label"
        }
    )
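Because the `labels` argument maps prediction names to column names, a typo on either side is easy to make. The following sketch checks a mapping before it is passed on. `check_label_mapping` is a hypothetical helper written for this example and is not part of the Aporia SDK.

```python
def check_label_mapping(labels, predictions, columns):
    """Return a list of problems in a prediction -> label column mapping."""
    problems = []
    for prediction_name, column_name in labels.items():
        if prediction_name not in predictions:
            problems.append(f"unknown prediction: {prediction_name}")
        if column_name not in columns:
            problems.append(f"unknown column: {column_name}")
    return problems


# Column names and prediction schema taken from the example above.
columns = ["id", "feature1", "feature2", "decision", "label", "timestamp"]
predictions = {"decision": "boolean"}

ok = check_label_mapping({"decision": "label"}, predictions, columns)
bad = check_label_mapping({"decison": "labels"}, predictions, columns)
```

Here `ok` is an empty list, while `bad` reports both the misspelled prediction name and the misspelled column name.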
Check out the Data Sources section for further reading on the available data sources and how to connect to each one of them.
If you have a model with a yes/no decision and a probability / confidence value for it, then your database may look like the following:
id | feature1 (numeric) | feature2 (boolean) | proba (numeric) | decision (boolean) | label (boolean) | timestamp (datetime) |
---|---|---|---|---|---|---|
1 | 13.5 | True | 0.8 | True | True | 2014-10-19 10:23:54 |
2 | -8 | False | 0.5 | False | True | 2014-10-19 10:24:24 |
To monitor this model, it's recommended to create a new model version with a schema that includes the final decision as a `boolean` field, and the probability as a `numeric` field:

    import aporia

    apr_model = aporia.create_model_version(
        model_id="<MODEL_ID>",
        model_version="v1",
        model_type="binary",
        features={
            ...
        },
        predictions={
            "decision": "boolean",
            "proba": "numeric",
        },
    )
To connect the model to Aporia from a data source, call the `connect_serving(...)` API:

    apr_model.connect_serving(
        data_source=my_data_source,
        id_column="id",
        timestamp_column="timestamp",
        # Map the "label" column as the label for "decision" and "proba".
        labels={
            # Prediction name -> Column name
            "decision": "label",
            "proba": "label",
        }
    )
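Both predictions can share one label column because each is compared against the same ground truth in a different way. The sketch below uses the two sample rows from the table above. The choice of accuracy and Brier score is an assumption made for illustration, not a statement about which metrics Aporia computes.

```python
# The two sample rows from the table above (other columns omitted).
rows = [
    {"proba": 0.8, "decision": True, "label": True},
    {"proba": 0.5, "decision": False, "label": True},
]


def accuracy(rows):
    """Fraction of rows where the boolean decision equals the label."""
    hits = sum(1 for row in rows if row["decision"] == row["label"])
    return hits / len(rows)


def brier_score(rows):
    """Mean squared difference between the probability and the 0/1 label."""
    errors = [(row["proba"] - float(row["label"])) ** 2 for row in rows]
    return sum(errors) / len(errors)


decision_accuracy = accuracy(rows)  # one of the two decisions matches the label
proba_brier = brier_score(rows)     # lower is better; 0 would be a perfect score
```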
Check out the Data Sources section for further reading on the available data sources and how to connect to each one of them.
In cases where there is no threshold for your boolean prediction, and the final business result is actually a probability, you may simply omit the `decision` field from the examples in the previous section and include only the `proba` field for your prediction.
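For the probability-only case, the prediction schema and label mapping from the previous section shrink to a single entry each. The sketch below derives them by dropping `decision`. `drop_prediction` is a hypothetical helper used only to make the change explicit; the dictionaries could just as well be written by hand.

```python
# Schema and label mapping from the previous section.
full_predictions = {"decision": "boolean", "proba": "numeric"}
full_labels = {"decision": "label", "proba": "label"}


def drop_prediction(mapping, name):
    """Return a copy of the mapping without the given prediction name."""
    return {key: value for key, value in mapping.items() if key != name}


# Only the probability remains as a prediction, still labeled by "label".
predictions = drop_prediction(full_predictions, "decision")
labels = drop_prediction(full_labels, "decision")
```

The resulting `predictions` and `labels` dictionaries would then be passed to `create_model_version(...)` and `connect_serving(...)` in the same way as before.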