In Aporia, custom metrics are defined using a syntax similar to Python's.
A custom metric expression is built from three building blocks:
Constants - a numeric value (e.g. 2, 0.5, ...)
Functions - any function from the builtin function collection listed below (e.g. sum, count, ...). All of these functions return a numeric value.
Binary operations - +, -, *, /, **. Operands can be either constants or function calls.
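Because the syntax is similar to Python's, the three building blocks can be illustrated with Python's own `ast` module. The sketch below is only an illustration of how the building blocks map onto expression syntax; it is not Aporia's parser, and the `classify` helper is a hypothetical name introduced here.

```python
import ast

# Operators listed above: +, -, *, /, **
ALLOWED_OPERATORS = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow)


def classify(expression: str) -> str:
    """Return which building block sits at the top level of the expression.

    Illustration only: this uses Python's parser, not Aporia's.
    """
    node = ast.parse(expression, mode="eval").body
    if isinstance(node, ast.Constant):
        is_number = isinstance(node.value, (int, float))
        if is_number and not isinstance(node.value, bool):
            return "constant"
    if isinstance(node, ast.Call):
        return "function"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ALLOWED_OPERATORS):
        return "binary operation"
    raise ValueError(f"not a supported building block: {expression!r}")


print(classify("0.5"))
print(classify("sum(features.Annual_Premium)"))
print(classify("sum(features.Vintage) / count() * 2"))
```

The last expression is classified as a binary operation whose operands are themselves function calls and a constant.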
Builtin Functions
Before we dive into each of the supported functions, there are two general concepts you should be familiar with that apply to all functions: field expressions and data segment filters.
Field Expressions
A field expression can be described in the following format:
<field_category>.<field_name>[<segment filter>]
Field category is one of the following: features / raw_inputs / predictions / actuals. Note that you can only use categories that you defined in your schema when creating your model version. Also keep in mind that the predictions and actuals categories have the same field names.
The segment filter is optional. For more information about filters, see the Data Segment Filters section below.
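To make the format concrete, here is a minimal sketch that assembles a field expression string from its parts. The `field_expression` helper is hypothetical (it is not part of Aporia); it only mirrors the `<field_category>.<field_name>[<segment filter>]` format described above.

```python
from typing import Optional

# Field categories listed in the text above.
FIELD_CATEGORIES = {"features", "raw_inputs", "predictions", "actuals"}


def field_expression(category: str, name: str, segment_filter: Optional[str] = None) -> str:
    """Build '<field_category>.<field_name>[<segment filter>]' as a string."""
    if category not in FIELD_CATEGORIES:
        raise ValueError(f"unknown field category: {category!r}")
    expression = f"{category}.{name}"
    # The segment filter is optional; brackets are added only when it is given.
    if segment_filter is not None:
        expression += f"[{segment_filter}]"
    return expression


print(field_expression("features", "Annual_Premium"))
print(field_expression("features", "Annual_Premium", "features.Driving_License == True"))
```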
Data Segment Filters
Data segment filters are boolean expressions that restrict the field a function operates on to a specific data segment.
Each boolean condition in a segment filter is a comparison between a field and a constant value. For example:
[features.Driving_License == True] // will filter out records in which Driving_License != True
[raw_inputs.Age <= 35] // will only include records in which Age <= 35
Conditions can be combined using and / or and all fields can be checked for missing values using is None / is not None.
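The effect of a segment filter can be pictured as keeping only the records for which the boolean expression holds. The sketch below emulates this in plain Python on made-up records. The `segment` helper is hypothetical, and treating a missing value as "does not match" in a comparison is an assumption, not documented behavior.

```python
# Toy records; field names follow the examples above, values are made up.
records = [
    {"Age": 29, "Driving_License": True, "Region_Code": 28},
    {"Age": 41, "Driving_License": True, "Region_Code": 28},
    {"Age": 33, "Driving_License": False, "Region_Code": 11},
    {"Age": None, "Driving_License": True, "Region_Code": 28},
]


def segment(rows, condition):
    """Keep only the rows for which the boolean condition holds."""
    return [row for row in rows if condition(row)]


# [raw_inputs.Age <= 35]
# Assumption: a record with a missing Age does not match the comparison.
young = segment(records, lambda r: r["Age"] is not None and r["Age"] <= 35)

# [raw_inputs.Age is not None and raw_inputs.Region_Code == 28]
known_age_in_region = segment(
    records, lambda r: r["Age"] is not None and r["Region_Code"] == 28
)

# [features.Driving_License == True or raw_inputs.Age is None]
licensed_or_unknown = segment(
    records, lambda r: r["Driving_License"] == True or r["Age"] is None
)

print(len(young), len(known_age_in_region), len(licensed_or_unknown))  # 2 2 3
```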
The following table describes the supported combinations:
| Type / Operation | == | != | < | > | >= | <= |
| --- | --- | --- | --- | --- | --- | --- |
| Boolean | True/False | True/False | ✖️ | ✖️ | ✖️ | ✖️ |
| Categorical | numeric constants | numeric constants | ✖️ | ✖️ | ✖️ | ✖️ |
| String | numeric constants | numeric constants | ✖️ | ✖️ | ✖️ | ✖️ |
| Numeric | numeric constants | numeric constants | numeric constants | numeric constants | numeric constants | numeric constants |
Each table cell indicates the type of constant the field can be compared to.
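The supported combinations can also be written down as a simple lookup, which is handy for checking a condition before using it in a filter. This is a sketch transcribed from the table above; the `is_supported` helper and the lowercase type names are assumptions made for illustration.

```python
# Comparison operators per field type, transcribed from the table above.
EQUALITY = {"==", "!="}
ORDERING = {"<", ">", ">=", "<="}

SUPPORTED_OPERATORS = {
    "boolean": EQUALITY,
    "categorical": EQUALITY,
    "string": EQUALITY,
    "numeric": EQUALITY | ORDERING,
}


def is_supported(field_type: str, operator: str) -> bool:
    """Return True if the operator may be used with a field of this type."""
    return operator in SUPPORTED_OPERATORS.get(field_type.lower(), set())


print(is_supported("Numeric", "<="))
print(is_supported("Boolean", "<"))
```

Only numeric fields support the ordering operators; every other type is limited to == and !=.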
Examples
// Average annual premium of those with a driving license
sum(features.Annual_Premium[features.Driving_License == True]) / prediction_count(features.Driving_License == True)
// Three times the number of predictions for those who are 35 or younger and live in CA
prediction_count(raw_inputs.Age <= 35 and raw_inputs.Region_Code == 28) * 3
prediction_count(features.Age > 27) / (sum(features.Annual_Premium) + sum(features.Vintage))
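To see how function results combine through binary operations, the last two examples can be emulated in plain Python on a few made-up records. The `prediction_count` and `total` helpers below are hypothetical stand-ins for the builtin functions, and the assumption is that a filtered count returns the number of records matching the filter.

```python
# Toy records; field names follow the examples above, values are made up.
records = [
    {"Age": 30, "Region_Code": 28, "Annual_Premium": 1000.0, "Vintage": 100},
    {"Age": 50, "Region_Code": 28, "Annual_Premium": 3000.0, "Vintage": 200},
    {"Age": 25, "Region_Code": 3, "Annual_Premium": 2000.0, "Vintage": 100},
]


def prediction_count(rows, condition=lambda row: True):
    """Hypothetical stand-in: number of rows that satisfy the filter."""
    return sum(1 for row in rows if condition(row))


def total(rows, field):
    """Hypothetical stand-in for sum(<field expression>)."""
    return sum(row[field] for row in rows)


# prediction_count(raw_inputs.Age <= 35 and raw_inputs.Region_Code == 28) * 3
second_example = prediction_count(
    records, lambda r: r["Age"] <= 35 and r["Region_Code"] == 28
) * 3

# prediction_count(features.Age > 27) / (sum(features.Annual_Premium) + sum(features.Vintage))
third_example = prediction_count(records, lambda r: r["Age"] > 27) / (
    total(records, "Annual_Premium") + total(records, "Vintage")
)

print(second_example)  # 3
print(third_example)   # 0.0003125
```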
Supported functions
accuracy
Parameters
prediction: prediction field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
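The role of the threshold parameter can be sketched as follows: each predicted probability is turned into a positive or negative class, and the result is compared with the label. This is a minimal illustration, not Aporia's implementation. Whether a probability exactly equal to the threshold counts as positive is an assumption here, and `accuracy_at_threshold` is a hypothetical name.

```python
def accuracy_at_threshold(probabilities, labels, threshold=0.5):
    """Share of records whose thresholded prediction equals the label."""
    # Assumption: a probability at or above the threshold counts as positive.
    predicted = [p >= threshold for p in probabilities]
    correct = sum(1 for y_hat, y in zip(predicted, labels) if y_hat == bool(y))
    return correct / len(labels)


probabilities = [0.9, 0.4, 0.65, 0.2]
labels = [1, 0, 0, 1]

print(accuracy_at_threshold(probabilities, labels, threshold=0.5))  # 0.5
print(accuracy_at_threshold(probabilities, labels, threshold=0.7))  # 0.75
```

Raising the threshold from 0.5 to 0.7 flips the third record from positive to negative, which in this toy data happens to match its label.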
actuals_count
Parameters
No parameters needed. Filters cannot be applied to this metric.
actuals_ratio
Parameters
No parameters needed. Filters cannot be applied to this metric.
auc_roc
Parameters
prediction: prediction probability field
label: label field
filter: the filter we want to apply on the records before calculating the metric
count
Parameters
No parameters needed. Filters cannot be applied to this metric.
f1_score
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
average: the average strategy (micro / macro / weighted)
top_k: consider only top-k items.
filter: the filter we want to apply on the records before calculating the metric
fn_count
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
fp_count
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
fp_rate
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
logloss
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
mae
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
mape
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
max
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
mean
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
median
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
min
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
miss_rate
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
missing_count
Parameters
field: the field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
missing_ratio
Parameters
field: the field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
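As a rough picture of what missing_count and missing_ratio measure, the sketch below counts missing values in a list of field values. It assumes that a missing value is represented as None and that the ratio is taken over all records that remain after filtering; neither detail is confirmed by this page.

```python
def missing_count(values):
    """Number of values that are missing (represented here as None)."""
    return sum(1 for v in values if v is None)


def missing_ratio(values):
    """Share of missing values; returns 0.0 for an empty list (an assumption)."""
    return missing_count(values) / len(values) if values else 0.0


ages = [29, None, 41, None, 33]
print(missing_count(ages))  # 2
print(missing_ratio(ages))  # 0.4
```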
mse
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
ndcg
Parameters
prediction: prediction field
label: label field
rank: the rank position
filter: the filter we want to apply on the records before calculating the metric
not_missing_count
Parameters
field: the field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
precision_score
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
average: the average strategy (micro / macro / weighted)
top_k: consider only top-k items.
filter: the filter we want to apply on the records before calculating the metric
recall_score
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
average: the average strategy (micro / macro / weighted)
top_k: consider only top-k items.
filter: the filter we want to apply on the records before calculating the metric
rmse
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
specificity
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
std
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
sum
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
tn_count
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
tp_count
Parameters
prediction: prediction probability field
label: label field
threshold: numeric. Probability threshold according to which we decide whether a class is positive
filter: the filter we want to apply on the records before calculating the metric
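The count functions (tp_count, fp_count, tn_count, fn_count) and the rates derived from them (fp_rate, miss_rate, specificity) all follow from the same thresholded comparison. The sketch below uses the standard textbook definitions on made-up data. It is not Aporia's implementation, `confusion_counts` is a hypothetical helper, and counting a probability equal to the threshold as positive is an assumption.

```python
def confusion_counts(probabilities, labels, threshold=0.5):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, labels):
        # Assumption: a probability at or above the threshold counts as positive.
        predicted_positive = p >= threshold
        actual_positive = bool(y)
        if predicted_positive and actual_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actual_positive:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


counts = confusion_counts([0.9, 0.8, 0.3, 0.1, 0.6], [1, 0, 1, 0, 0])

# Standard definitions of the derived rates.
fp_rate = counts["fp"] / (counts["fp"] + counts["tn"])      # false positive rate
miss_rate = counts["fn"] / (counts["fn"] + counts["tp"])    # false negative rate
specificity = counts["tn"] / (counts["tn"] + counts["fp"])  # true negative rate

print(counts)  # {'tp': 1, 'fp': 2, 'tn': 1, 'fn': 1}
print(fp_rate, miss_rate, specificity)
```

Note that fp_rate and specificity always sum to 1, since both are computed over the actual negatives.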
unique_count
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
value_count
Parameters
field: the field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
value: the value we want to count
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
value_percentage
Parameters
field: the field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
value: the value we want to count
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
variance
Parameters
field: numeric or dict. The field for which the metric will be computed. Can be of any category (feature / raw_input / prediction / actual)
filter: the filter we want to apply on the records before calculating the metric
keys: keys to filter in when field type is dict.
wape
Parameters
prediction: prediction field
label: label field
filter: the filter we want to apply on the records before calculating the metric
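For reference, the regression error functions (mae, mse, rmse, mape, wape) can be sketched with their commonly used definitions. These are the standard formulas, shown here as an assumption about what each function computes. In particular, mape and wape are returned as fractions rather than percentages, which may differ from how Aporia reports them.

```python
import math


def mae(predictions, labels):
    """Mean absolute error."""
    return sum(abs(y - p) for p, y in zip(predictions, labels)) / len(labels)


def mse(predictions, labels):
    """Mean squared error."""
    return sum((y - p) ** 2 for p, y in zip(predictions, labels)) / len(labels)


def rmse(predictions, labels):
    """Root mean squared error."""
    return math.sqrt(mse(predictions, labels))


def mape(predictions, labels):
    """Mean absolute percentage error, as a fraction; labels must be non-zero."""
    return sum(abs(y - p) / abs(y) for p, y in zip(predictions, labels)) / len(labels)


def wape(predictions, labels):
    """Weighted absolute percentage error: total absolute error over total |label|."""
    return sum(abs(y - p) for p, y in zip(predictions, labels)) / sum(abs(y) for y in labels)


predictions = [110.0, 190.0, 360.0]
labels = [100.0, 200.0, 400.0]

print(mae(predictions, labels))  # 20.0
print(mse(predictions, labels))  # 600.0
print(rmse(predictions, labels))
print(mape(predictions, labels))
print(wape(predictions, labels))
```

On this toy data mape and wape differ slightly, because wape weights each error by the size of its label instead of averaging per-record ratios.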