Custom Metric Syntax

In Aporia, custom metrics are defined using a syntax similar to Python's.
A custom metric expression is built from three building blocks:
  • Constants - a numeric value (e.g. 2, 0.5, ...)
  • Functions - any of the builtin functions listed below (e.g. sum, count, ...). Every function returns a numeric value.
  • Binary operations - +, -, *, /, **. Each operand can be either a constant or a function call.
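Because the syntax resembles Python's, the three building blocks can be illustrated with Python's own `ast` module. The sketch below is not Aporia's parser. It only checks that an expression is made of numeric constants, function calls, and the five listed binary operators. The helper name `is_valid_metric` is hypothetical, and the restriction to keyword arguments with constant values is an assumption drawn from the examples on this page.

```python
import ast

# The binary operators listed above: +, -, *, /, **
ALLOWED_OPS = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow)


def _check(node: ast.AST) -> bool:
    """Recursively accept only constants, function calls, and binary operations."""
    if isinstance(node, ast.Constant):
        # Numeric constants only (bool is a subclass of int, so exclude it).
        return isinstance(node.value, (int, float)) and not isinstance(node.value, bool)
    if isinstance(node, ast.Call):
        # Assumption: functions are called by name with keyword arguments
        # whose values are constants, e.g. sum(column="annual_premium").
        return (
            isinstance(node.func, ast.Name)
            and not node.args
            and all(
                kw.arg is not None and isinstance(kw.value, ast.Constant)
                for kw in node.keywords
            )
        )
    if isinstance(node, ast.BinOp):
        return (
            isinstance(node.op, ALLOWED_OPS)
            and _check(node.left)
            and _check(node.right)
        )
    return False


def is_valid_metric(expression: str) -> bool:
    """Hypothetical helper: True if the expression uses only the three building blocks."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False
    return _check(tree.body)
```

This sketch does not check function names against the list of supported functions, and it does not validate parameter names or values.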

Builtin Functions

Before we dive into each of the supported functions, let's take a look at a few examples of custom metric definitions.
// Average annual premium of those with a driving license
sum(column="annual_premium") / count()
// Mean predicted probability
mean(column="proba")
// Model revenue
5 * tp_count(column="will_buy_insurance") - 2 * fp_count(column="will_buy_insurance")
// nDCG@4 per step
ndcg_at_k(column="p_views", k=4)
ndcg_at_k(column="p_add_to_cart", k=4)
ndcg_at_k(column="p_purchases", k=4)
// Accuracy using a custom threshold
accuracy(column="proba", type="numeric", threshold=0.2)

Filters within functions

In Aporia you can always apply a segment to a metric as a whole, but sometimes that is not enough: a metric may need to pass a different subset of the data to one specific function.
Aporia supports this through an additional function argument called "filter".
The "filter" argument restricts the data passed in the "column" argument, using the custom segment syntax.
For example:
// Ratio of the annual premium of people above 70 out of the total premium
sum(column="annual_premium", filter="age > 70") / sum(column="annual_premium")
You can still apply any of your segments to a metric that contains filters: behind the scenes, Aporia intersects the filter defined inside the function with every segment applied to the metric. The resulting segments are treated like any regular segment.
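The intersection behaviour can be pictured with a small Python sketch over in-memory rows. This is an illustration only, not how Aporia evaluates metrics. The rows, the `gender` field, and the helper `column_sum` are hypothetical.

```python
# Hypothetical sample data; only "age" and "annual_premium" appear in the example above.
rows = [
    {"age": 75, "annual_premium": 300.0, "gender": "F"},
    {"age": 40, "annual_premium": 500.0, "gender": "M"},
    {"age": 82, "annual_premium": 200.0, "gender": "M"},
]


def column_sum(rows, column, row_filter=None, segment=None):
    """Sum `column` over rows that pass both the function filter and the metric-wide segment."""
    total = 0.0
    for row in rows:
        if row_filter is not None and not row_filter(row):
            continue  # excluded by the filter given to this function
        if segment is not None and not segment(row):
            continue  # excluded by the segment applied to the whole metric
        total += row[column]
    return total


def over_70(row):
    return row["age"] > 70


def male(row):
    return row["gender"] == "M"


# sum(column="annual_premium", filter="age > 70") / sum(column="annual_premium")
ratio = column_sum(rows, "annual_premium", row_filter=over_70) / column_sum(rows, "annual_premium")

# The same metric with a segment applied to the whole metric: the numerator
# sees the intersection (age > 70 AND male), the denominator only the segment.
ratio_male = (
    column_sum(rows, "annual_premium", row_filter=over_70, segment=male)
    / column_sum(rows, "annual_premium", segment=male)
)
```

With this sample data, `ratio` is 500 / 1000 and `ratio_male` is 200 / 700.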

Supported functions

count
Parameters
No parameters needed. The function counts the total number of unique IDs.
missing_count
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
missing_ratio
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
max
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
min
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
mean
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
sum
Parameters
  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual)
absolute_error_sum
Parameters
  • column: the name of the prediction field on which we want to apply the function
absolute_sum
Parameters
  • column: the name of the prediction field on which we want to apply the function
mae
Parameters
  • column: the name of the prediction field on which we want to apply the function
mse
Parameters
  • column: the name of the prediction field on which we want to apply the function
rmse
Parameters
  • column: the name of the prediction field on which we want to apply the function
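For reference, the error functions above can be read against the textbook definitions sketched below in Python. This is not Aporia's implementation. It assumes each prediction is paired with its actual value, and the reading of absolute_error_sum as the sum of absolute differences is inferred from the function name only.

```python
import math


def absolute_error_sum(predictions, actuals):
    """Sum of |prediction - actual| over all pairs (assumed meaning, inferred from the name)."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals))


def mae(predictions, actuals):
    """Mean absolute error."""
    return absolute_error_sum(predictions, actuals) / len(predictions)


def mse(predictions, actuals):
    """Mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(predictions)


def rmse(predictions, actuals):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(predictions, actuals))


# Hypothetical sample values
predictions = [3.0, 5.0, 2.0]
actuals = [1.0, 5.0, 4.0]
```

For these sample values the absolute errors are 2, 0 and 2, so the absolute error sum is 4, the MAE is 4/3 and the MSE is 8/3.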
tp_count
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric" or "boolean".
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
fp_count
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric" or "boolean".
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
tn_count
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric" or "boolean".
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
fn_count
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric" or "boolean".
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
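To make the role of the threshold concrete, here is a minimal Python sketch of the four confusion-matrix counts for a numeric prediction. It is an illustration, not Aporia's implementation. In particular, treating a score equal to the threshold as positive (`>=`) is an assumption, and the helper name `confusion_counts` is hypothetical.

```python
def confusion_counts(scores, actuals, threshold):
    """Count TP / FP / TN / FN for numeric scores against boolean actuals."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, actual in zip(scores, actuals):
        # Assumption: a score at or above the threshold counts as positive.
        predicted_positive = score >= threshold
        if predicted_positive and actual:
            counts["tp"] += 1
        elif predicted_positive and not actual:
            counts["fp"] += 1
        elif not predicted_positive and not actual:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical sample data
scores = [0.9, 0.3, 0.6, 0.1]
actuals = [True, False, False, True]
counts = confusion_counts(scores, actuals, threshold=0.5)

# Same shape as the "Model revenue" example: 5 * tp_count(...) - 2 * fp_count(...)
revenue = 5 * counts["tp"] - 2 * counts["fp"]
```

Lowering the threshold moves rows from the negative side (TN, FN) to the positive side (FP, TP), which is why the same column can yield different counts under different thresholds.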
accuracy
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric", "boolean" or "categorical"
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
  • method: will define the average strategy to use. Can be: "macro", "micro" or "weighted". Required for categorical predictions
precision
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric", "boolean" or "categorical"
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
  • method: will define the average strategy to use. Can be: "macro", "micro" or "weighted". Required for categorical predictions.
recall
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric", "boolean" or "categorical"
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
  • method: will define the average strategy to use. Can be: "macro", "micro" or "weighted". Required for categorical predictions.
f1
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "numeric", "boolean" or "categorical"
  • threshold: the probability threshold used to decide whether a prediction is positive. Required for numeric predictions
  • method: will define the average strategy to use. Can be: "macro", "micro" or "weighted". Required for categorical predictions.
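The three averaging strategies named by the "method" parameter can be sketched in Python using their conventional definitions, shown here for precision. Whether Aporia computes them exactly this way is not stated on this page, so treat this as an assumption. The helper name `precision_by_method` is hypothetical.

```python
from collections import Counter


def precision_by_method(predictions, actuals, method):
    """Multi-class precision under the conventional macro / micro / weighted averaging."""
    classes = sorted(set(predictions) | set(actuals))
    tp, fp = Counter(), Counter()
    support = Counter(actuals)  # number of actual occurrences per class
    for predicted, actual in zip(predictions, actuals):
        if predicted == actual:
            tp[predicted] += 1
        else:
            fp[predicted] += 1

    def per_class(cls):
        denominator = tp[cls] + fp[cls]
        return tp[cls] / denominator if denominator else 0.0

    if method == "micro":
        # Pool all classes together before dividing.
        total_tp = sum(tp.values())
        total = total_tp + sum(fp.values())
        return total_tp / total if total else 0.0
    if method == "macro":
        # Unweighted mean of the per-class precisions.
        return sum(per_class(c) for c in classes) / len(classes)
    if method == "weighted":
        # Mean of the per-class precisions, weighted by class support.
        return sum(per_class(c) * support[c] for c in classes) / len(actuals)
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical sample data
predictions = ["a", "a", "b", "c", "a"]
actuals = ["a", "b", "b", "c", "c"]
```

For this sample the per-class precisions are 1/3 for "a" and 1 for both "b" and "c", giving 7/9 (macro), 3/5 (micro) and 13/15 (weighted). The three methods diverge most when classes are imbalanced.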
fp_count_per_class
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "categorical" or "array"
  • class_name: the class on which we want to calculate the function
fn_count_per_class
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "categorical" or "array"
  • class_name: the class on which we want to calculate the function
tp_count_per_class
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "categorical" or "array"
  • class_name: the class on which we want to calculate the function
tn_count_per_class
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • type: the data type of the prediction field we chose. Can be: "categorical" or "array"
  • class_name: the class on which we want to calculate the function
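One common reading of the per-class counts is one-vs-rest: the chosen class is treated as positive and every other class as negative. The Python sketch below illustrates that reading for categorical predictions only. It is an assumption rather than a description of Aporia's implementation, it does not cover the "array" type, and the helper name `per_class_counts` is hypothetical.

```python
def per_class_counts(predictions, actuals, class_name):
    """One-vs-rest confusion counts: `class_name` is positive, all other classes are negative."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for predicted, actual in zip(predictions, actuals):
        predicted_positive = predicted == class_name
        actual_positive = actual == class_name
        if predicted_positive and actual_positive:
            counts["tp"] += 1
        elif predicted_positive:
            counts["fp"] += 1
        elif actual_positive:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Hypothetical sample data
predictions = ["a", "a", "b", "c", "a"]
actuals = ["a", "b", "b", "c", "c"]
counts_a = per_class_counts(predictions, actuals, "a")
counts_b = per_class_counts(predictions, actuals, "b")
```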
accuracy_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered.
precision_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered.
recall_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered.
ndcg_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered
map_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered
mrr_at_k
Parameters
  • column: the name of the prediction field on which we want to apply the function
  • k: an integer between 1 and 12. Only the top-k items will be considered
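To show what "only the top-k items will be considered" means in practice, here is a Python sketch of two of the ranking metrics, nDCG@k and MRR@k, for a single ranked list. It uses the common linear-gain form of DCG. Other variants exist, and this page does not say which one Aporia uses. The input format (a list of relevance scores in ranked order) and the helper names are assumptions.

```python
import math


def _validate_k(k):
    # The documented range for k is 1 to 12.
    if not isinstance(k, int) or not 1 <= k <= 12:
        raise ValueError("k must be an integer between 1 and 12")


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    _validate_k(k)
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k_sketch(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mrr_at_k_sketch(relevances, k):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    _validate_k(k)
    for rank, rel in enumerate(relevances[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0
```

Items ranked below position k contribute nothing to either function, so a relevant item at rank 5 is ignored when k=4.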