Custom Metric Syntax

In Aporia, custom metrics are defined using a syntax similar to Python's.

A custom metric expression is built from three building blocks:

  • Constants - a numeric value (e.g. 2, 0.5, ...)

  • Functions - any function from the builtin collection listed below (e.g. sum, count, ...). All of these functions return a numeric value.

  • Binary operations - +, -, *, /, **. Operands can be either constants or function calls.
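
The three building blocks can be read as a small grammar. The sketch below is not Aporia's parser; it is a hypothetical checker (the names is_valid_metric_expression and ALLOWED_OPERATORS are made up for illustration) that uses Python's own ast module to test whether an expression consists only of numeric constants, keyword-argument function calls, and the five binary operators. It reads the rules strictly, so anything else, such as a bare variable name, is rejected.

```python
import ast

# Hypothetical checker, NOT Aporia's parser: it only mirrors the three building blocks.
ALLOWED_OPERATORS = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow)


def _is_numeric_constant(node: ast.AST) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    )


def _is_valid_node(node: ast.AST) -> bool:
    if _is_numeric_constant(node):
        return True
    if isinstance(node, ast.BinOp):
        return (
            isinstance(node.op, ALLOWED_OPERATORS)
            and _is_valid_node(node.left)
            and _is_valid_node(node.right)
        )
    if isinstance(node, ast.Call):
        # Assumption: calls use keyword arguments with literal values, e.g. sum(column="x").
        return (
            isinstance(node.func, ast.Name)
            and not node.args
            and all(isinstance(kw.value, ast.Constant) for kw in node.keywords)
        )
    return False


def is_valid_metric_expression(expression: str) -> bool:
    """Return True if the expression uses only constants, function calls and binary operations."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False
    return _is_valid_node(tree.body)
```

For instance, sum(column="annual_premium") / count() passes this check, while an expression using the % operator does not.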

Builtin Functions

Before we dive into each of the supported functions, let's take a look at a few examples of custom metric definitions.

// Average annual premium of those with a driving license
sum(column="annual_premium") / count()

// Mean predicted probability
mean(column="proba")

// Model revenue
5 * tp_count(column="will_buy_insurance") - 2 * fp_count(column="will_buy_insurance")

// nDCG@4 per step
ndcg_at_k(column="p_views", k=4)
ndcg_at_k(column="p_add_to_cart", k=4)
ndcg_at_k(column="p_purchases", k=4)

// accuracy using custom threshold
accuracy(column="proba", type="numeric", threshold=0.2)

// R-squared - Expanding brackets to use available aggregations
rss = squared_error_sum(column="prediction")
tss = squared_sum(column="actual") - 2*mean(column="actual")*sum(column="actual") + column_count(column="actual")*(mean(column="actual")**2)
1 - rss/tss
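
The R-squared example relies on expanding the total sum of squares, sum((a - m)²) = sum(a²) - 2·m·sum(a) + n·m², so that it can be written with aggregations only. The Python sketch below checks that identity on a small made-up dataset; the helper names are illustrative, and each aggregation is annotated with the Aporia function it stands in for.

```python
def r_squared_from_aggregations(predictions, actuals):
    """R-squared computed only from aggregations, mirroring the expression above."""
    n = len(actuals)                            # column_count(column="actual")
    total = sum(actuals)                        # sum(column="actual")
    mean = total / n                            # mean(column="actual")
    squared_sum = sum(a * a for a in actuals)   # squared_sum(column="actual")
    # squared_error_sum(column="prediction")
    rss = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
    tss = squared_sum - 2 * mean * total + n * mean ** 2
    return 1 - rss / tss


def r_squared_direct(predictions, actuals):
    """Textbook R-squared, used here only to check the expansion."""
    mean = sum(actuals) / len(actuals)
    rss = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
    tss = sum((a - mean) ** 2 for a in actuals)
    return 1 - rss / tss


# Made-up sample data for illustration.
predictions = [2.5, 0.0, 2.0, 8.0]
actuals = [3.0, -0.5, 2.0, 7.0]
expanded = r_squared_from_aggregations(predictions, actuals)
direct = r_squared_direct(predictions, actuals)
```

Both functions agree up to floating-point rounding, which is why the expanded form is a safe substitute when only aggregations are available.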

Filters within functions

In Aporia you can always apply a segment to a metric as a whole, but sometimes that is not enough. Often only one specific function inside the metric should receive a segment of the data.

Aporia supports these cases through an additional function argument called "filter".

The "filter" argument applies any filtering, written in the custom segment syntax, to the data passed in the "column" argument.

For example:

So that you can still apply any of your segments to the metric as a whole, a filter set inside a metric is intersected, behind the scenes, with all of your existing filters. The resulting segments are counted like any regular segment.
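
The intersection behaviour can be pictured with plain Python predicates. This sketch does not use Aporia's custom segment syntax: the rows, the field names has_license and region, and the function premium_per_row are all hypothetical. It only illustrates the idea that a function-level filter narrows the rows seen by one function, while the metric-level segment applies to every function in the expression.

```python
# Hypothetical rows; field names are made up for illustration.
rows = [
    {"annual_premium": 100.0, "has_license": True, "region": "north"},
    {"annual_premium": 200.0, "has_license": False, "region": "north"},
    {"annual_premium": 300.0, "has_license": True, "region": "south"},
    {"annual_premium": 500.0, "has_license": True, "region": "north"},
]


def premium_per_row(rows, segment):
    """Sketch of: sum(column="annual_premium", filter=<has_license>) / count()."""
    # The metric-level segment applies to every function in the expression.
    in_segment = [row for row in rows if segment(row)]
    # The function-level filter is intersected with the segment, for sum() only.
    filtered = [row for row in in_segment if row["has_license"]]
    return sum(row["annual_premium"] for row in filtered) / len(in_segment)


def north_only(row):
    return row["region"] == "north"


result = premium_per_row(rows, north_only)
```

Here the sum only sees licensed drivers in the north segment (100 + 500), while count() sees all three north rows.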

Supported functions

Numerical Measures

absolute_sum

Returns the sum of absolute values for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

count

Returns the total number of rows.

Parameters

No parameters needed.

column_count

Returns the number of rows with non-null values for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be any field.

max

Returns the maximum value for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

max_length

Returns the maximum length for the given column (items for arrays/embeddings, characters for text).

Parameters

  • column: the name of the field on which we want to apply the function. Can be a text/array/numeric array/embedding field of any group (feature / raw_input / prediction / actual).

median

Returns the median value for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

mean

Returns the average value for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

mean_length

Returns the average length for the given column (items for arrays/embeddings, characters for text).

Parameters

  • column: the name of the field on which we want to apply the function. Can be a text/array/numeric array/embedding field of any group (feature / raw_input / prediction / actual).

min

Returns the minimum value for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

min_length

Returns the minimum length for the given column (items for arrays/embeddings, characters for text).

Parameters

  • column: the name of the field on which we want to apply the function. Can be a text/array/numeric array/embedding field of any group (feature / raw_input / prediction / actual).

missing_count

Returns the number of rows with null values for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be any field.

missing_ratio

Returns the percentage of rows with null values for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be any field.
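
The row-counting functions are closely related: for any column, count() should equal column_count plus missing_count. The sketch below shows that relationship on a made-up column where None stands in for a null value. These are plain Python stand-ins, not Aporia's implementation, and returning missing_ratio as a fraction between 0 and 1 (rather than a 0 to 100 percentage) is an assumption.

```python
# Made-up column; None represents a null value.
values = [3.5, None, 1.0, None, 2.5]


def count(column):
    """Total number of rows, null or not."""
    return len(column)


def column_count(column):
    """Number of rows with a non-null value."""
    return sum(value is not None for value in column)


def missing_count(column):
    """Number of rows with a null value."""
    return sum(value is None for value in column)


def missing_ratio(column):
    """Share of null rows; assumed here to be a fraction in [0, 1]."""
    return missing_count(column) / count(column)
```

This distinction matters in expressions such as sum(column="x") / count(), where rows with a null x still contribute to the denominator.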

sum

Returns the sum for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

squared_sum

Returns the sum of squared values for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

squared_deviation_sum

Returns the sum of squared deviations from the mean for the given column.

For a column x whose mean over all samples is m, this equals the sum of (x-m)².

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

value_count

Returns the number of entries where the given column is equal to the given value.

For example, value_count(column="bool", value=True) returns the number of entries where bool is True.

Parameters

  • column: the name of the field on which we want to apply the function. Can be any boolean/categorical field.

  • value: The value of the field to look for.

variance

Returns the variance for the given column.

Parameters

  • column: the name of the field on which we want to apply the function. Can be a numeric field of any group (feature / raw_input / prediction / actual).

Regression Metrics

absolute_error_sum

Returns the sum of absolute errors for the given prediction.

For a prediction P and actual A, returns the sum of |P-A|.

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a numeric actual mapped to it.

mae

Calculates MAE for the given prediction.

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a numeric actual mapped to it.

mse

Calculates MSE for the given prediction.

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a numeric actual mapped to it.

rmse

Calculates RMSE for the given prediction.

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a numeric actual mapped to it.

squared_error_sum

Returns the sum of squared errors for the given prediction.

For a prediction P and actual A, returns the sum of (P-A)².

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a numeric actual mapped to it.
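
The regression functions follow the standard textbook definitions and are tied together by simple relations: MAE is absolute_error_sum divided by the number of rows, MSE is squared_error_sum divided by the number of rows, and RMSE is the square root of MSE. The Python sketch below spells this out on made-up data. The function names mirror the Aporia ones for readability only, and null handling is left out as an unknown.

```python
import math


def absolute_error_sum(predictions, actuals):
    """Sum of |P - A| over all rows."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals))


def squared_error_sum(predictions, actuals):
    """Sum of (P - A)^2 over all rows."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals))


def mae(predictions, actuals):
    return absolute_error_sum(predictions, actuals) / len(actuals)


def mse(predictions, actuals):
    return squared_error_sum(predictions, actuals) / len(actuals)


def rmse(predictions, actuals):
    return math.sqrt(mse(predictions, actuals))


# Made-up sample data for illustration.
predictions = [2.5, 0.0, 2.0, 8.0]
actuals = [3.0, -0.5, 2.0, 7.0]
```

On this sample the absolute errors are 0.5, 0.5, 0 and 1, so the absolute error sum is 2.0 and the MAE is 0.5.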

Binary Classification Metrics

accuracy

Calculates accuracy for the given prediction.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

  • method: the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for categorical predictions.

auc_roc

Calculates AUC ROC for the given prediction.

Parameters

  • column: the name of the numeric prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

fn_count

Returns the number of False-Negative results.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

fp_count

Returns the number of False-Positive results.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

f1

Calculates the F1 score for the given prediction.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

  • method: the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for categorical predictions.

precision

Calculates precision for the given prediction.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

  • method: the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for categorical predictions.

recall

Calculates recall for the given prediction.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

  • method: the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for categorical predictions.

tn_count

Returns the number of True-Negative results.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.

tp_count

Returns the number of True-Positive results.

Parameters

  • column: the name of the numeric/boolean prediction field on which we want to apply the function. Must have a boolean actual mapped to it.

  • threshold: probability threshold according to which we decide if a class is positive. Required for numeric predictions.
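
To make the role of the threshold concrete, the sketch below derives the four confusion-matrix counts from numeric predictions and boolean actuals, then builds accuracy, precision, recall and F1 from them using the usual definitions. It is a plain Python illustration on made-up data, not Aporia's implementation. In particular, treating a probability equal to the threshold as positive (>=) is an assumption.

```python
def confusion_counts(probabilities, actuals, threshold):
    """Return tp/fp/tn/fn counts for numeric predictions and boolean actuals."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for probability, actual in zip(probabilities, actuals):
        predicted = probability >= threshold  # assumption: ties count as positive
        if predicted and actual:
            counts["tp"] += 1
        elif predicted and not actual:
            counts["fp"] += 1
        elif not predicted and not actual:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(c):
    return (c["tp"] + c["tn"]) / (c["tp"] + c["fp"] + c["tn"] + c["fn"])


def precision(c):
    return c["tp"] / (c["tp"] + c["fp"])


def recall(c):
    return c["tp"] / (c["tp"] + c["fn"])


def f1(c):
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


# Made-up sample data for illustration.
probabilities = [0.9, 0.3, 0.6, 0.1, 0.8]
actuals = [True, True, False, False, True]
at_half = confusion_counts(probabilities, actuals, threshold=0.5)
at_low = confusion_counts(probabilities, actuals, threshold=0.2)

# Same shape as the "model revenue" example: 5 * tp_count - 2 * fp_count.
revenue = 5 * at_half["tp"] - 2 * at_half["fp"]
```

Lowering the threshold from 0.5 to 0.2 turns the one false negative into a true positive on this sample, which is the kind of effect a custom threshold is meant to control.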

Multiclass Classification Metrics

accuracy_per_class

Calculates accuracy for the given prediction per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

fn_count_per_class

Returns the number of False-Negative results per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

fp_count_per_class

Returns the number of False-Positive results per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

f1_per_class

Calculates f1-score for the given prediction per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

precision_per_class

Calculates precision for the given prediction per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

recall_per_class

Calculates recall for the given prediction per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

tn_count_per_class

Returns the number of True-Negative results per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.

tp_count_per_class

Returns the number of True-Positive results per the specified category class.

Parameters

  • column: the name of the categorical prediction field on which we want to apply the function. Must have a categorical actual mapped to it.

  • class_name: the class on which we want to calculate the function.
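
A common way to read the per-class functions is one-vs-rest: the class given in class_name is treated as positive and every other class as negative. Whether Aporia computes them exactly this way is an assumption. The sketch below shows the one-vs-rest reading on made-up labels, and the helper name per_class_counts is hypothetical.

```python
def per_class_counts(predictions, actuals, class_name):
    """One-vs-rest confusion counts, treating class_name as the positive class."""
    pairs = list(zip(predictions, actuals))
    return {
        "tp": sum(p == class_name and a == class_name for p, a in pairs),
        "fp": sum(p == class_name and a != class_name for p, a in pairs),
        "fn": sum(p != class_name and a == class_name for p, a in pairs),
        "tn": sum(p != class_name and a != class_name for p, a in pairs),
    }


# Made-up sample labels for illustration.
predictions = ["cat", "dog", "cat", "bird", "dog"]
actuals = ["cat", "cat", "dog", "bird", "dog"]

cat = per_class_counts(predictions, actuals, class_name="cat")
cat_precision = cat["tp"] / (cat["tp"] + cat["fp"])
cat_recall = cat["tp"] / (cat["tp"] + cat["fn"])
cat_accuracy = (cat["tp"] + cat["tn"]) / len(actuals)
```

Under this reading, a row predicted as "dog" with actual "bird" counts as a true negative for the class "cat", even though the prediction itself is wrong.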

Ranking Metrics

accuracy_at_k

Calculates Accuracy for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.

map_at_k

Calculates MAP (Mean-Average-Precision) for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.

mrr_at_k

Calculates MRR (Mean-Reciprocal-Rank) for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.

ndcg_at_k

Calculates NDCG for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.

precision_at_k

Calculates Precision for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.

recall_at_k

Calculates Recall for the given prediction on the top K items.

Parameters

  • column: the name of the array prediction field on which we want to apply the function. Must have an array actual mapped to it. If using candidate-level ranking, can be a boolean prediction with a mapped boolean actual.

  • k: an integer between 1 and 12. Only the top-k items will be considered.
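
The sketch below illustrates what "only the top-k items will be considered" means for a single row. It assumes the prediction is a ranked list of item ids and the actual is the set of relevant items, and it uses common textbook definitions with binary relevance. How Aporia interprets the arrays and aggregates across rows is not stated here, so treat all of this as an assumption. The function names carry a _for_row suffix to make clear they are not the Aporia functions.

```python
import math


def precision_at_k_for_row(ranked, relevant, k):
    """Share of the top-k ranked items that are relevant."""
    return sum(item in relevant for item in ranked[:k]) / k


def reciprocal_rank_at_k_for_row(ranked, relevant, k):
    """1 / position of the first relevant item in the top k, or 0 if there is none."""
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k_for_row(ranked, relevant, k):
    """NDCG with binary relevance and a log2 position discount."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Made-up sample row for illustration.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"b", "d"}
```

With k=4, two of the top four items are relevant, the first relevant item sits at position 2, and the NDCG is below 1 because the relevant items are not ranked first.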
