# Custom Metric Syntax

In Aporia, custom metrics are defined using a syntax similar to Python's.

There are three building blocks that can be combined to create a custom metric expression:

- **Constants** - a numeric value (e.g. `2`, `0.5`, ...).
- **Functions** - any function from the builtin function collection listed below (e.g. `sum`, `count`, ...). All of these functions return a numeric value.
- **Binary operations** - `+`, `-`, `*`, `/`, `**`. Operands can be either constants or function calls.
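Because the syntax resembles Python's, one way to picture these building blocks is to parse an expression with Python's own `ast` module and check that it contains only numeric constants, function calls, and the five binary operators. This is a minimal sketch for illustration: `uses_only_building_blocks` is a hypothetical helper, not part of Aporia, and Aporia's actual parser may accept or reject expressions differently.

```python
import ast

# The five binary operators listed above: +, -, *, /, **
ALLOWED_OPERATORS = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow)


def uses_only_building_blocks(expression: str) -> bool:
    """Return True if a single expression is built only from numeric
    constants, function calls, and the five binary operators.

    Hypothetical helper for illustration; it is not part of Aporia.
    """
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False

    def check(node: ast.AST) -> bool:
        if isinstance(node, ast.Constant):
            # bool is a subclass of int in Python, so exclude it explicitly
            return isinstance(node.value, (int, float)) and not isinstance(node.value, bool)
        if isinstance(node, ast.BinOp):
            return (
                isinstance(node.op, ALLOWED_OPERATORS)
                and check(node.left)
                and check(node.right)
            )
        if isinstance(node, ast.Call):
            # e.g. sum(column="x"); keyword argument values are not inspected here
            return isinstance(node.func, ast.Name) and not node.args
        return False

    return check(tree.body)


ok = uses_only_building_blocks('sum(column="annual_premium") / count()')
rejected = uses_only_building_blocks('mean(column="proba") % 2')
```

Here `ok` is `True` and `rejected` is `False`, because `%` is not one of the listed operators. The sketch handles a single expression only.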

## Builtin Functions

Before we dive into each of the supported functions, let's take a look at a few examples of custom metric definitions.

```
// Average annual premium of those with a driving license
sum(column="annual_premium") / count()
// Mean predicted probability
mean(column="proba")
// Model revenue
5 * tp_count(column="will_buy_insurance") - 2 * fp_count(column="will_buy_insurance")
// nDCG@4 per step
ndcg_at_k(column="p_views", k=4)
ndcg_at_k(column="p_add_to_cart", k=4)
ndcg_at_k(column="p_purchases", k=4)
// accuracy using custom threshold
accuracy(column="proba", type="numeric", threshold=0.2)
// R-squared - Expanding brackets to use available aggregations
rss = squared_error_sum(column="prediction")
tss = squared_sum(column="actual") - 2*mean(column="actual")*sum(column="actual") + column_count(column="actual")*(mean(column="actual")**2)
1 - rss/tss
```
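The R-squared example relies on the identity Σ(a - m)² = Σa² - 2·m·Σa + n·m², where m is the mean of the actuals. The following plain-Python sketch checks that identity on made-up data. The lists and variable names are illustrative assumptions, and each line only mimics what the corresponding aggregation is described as returning.

```python
# Made-up data; each variable mimics one aggregation from the example above.
actual = [3.0, 5.0, 7.0, 9.0]
prediction = [2.5, 5.5, 6.0, 9.5]

n = len(actual)                            # column_count(column="actual")
total = sum(actual)                        # sum(column="actual")
mean = total / n                           # mean(column="actual")
squared_sum = sum(a * a for a in actual)   # squared_sum(column="actual")

# squared_error_sum(column="prediction"): sum of (P - A)^2
rss = sum((p - a) ** 2 for p, a in zip(prediction, actual))

# Expanded form, as written in the metric definition
tss_expanded = squared_sum - 2 * mean * total + n * mean ** 2

# Direct form: sum of squared deviations from the mean
tss_direct = sum((a - mean) ** 2 for a in actual)

r_squared = 1 - rss / tss_expanded
```

Both forms of the total sum of squares give 20 for this data, and `r_squared` evaluates to 0.9125.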

### Filters within functions

Aporia always lets you apply a segment to a metric as a whole, but sometimes that is not enough: often a specific function within the metric needs to operate on only a segment of the data.

Aporia supports these cases through an additional function argument called **filter**.

The **filter** argument applies a filter, written in the custom segment syntax, to the data passed in the **column** argument.

**For example:**

```
// Ratio of the annual premium of people above 70 out of the total premium
sum(column="annual_premium", filter="age > 70") / sum(column="annual_premium")
```

You can still apply any of your segments to the metric as a whole. Behind the scenes, a filter set within a metric is intersected with all of your existing filters, and the resulting segments are treated like any regular segment.
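The intersection behavior can be pictured with a small plain-Python sketch. The rows, the `region` field, and the `column_sum` helper are all assumptions made for illustration; this is not how Aporia evaluates filters internally.

```python
rows = [
    {"age": 75, "region": "north", "annual_premium": 1200.0},
    {"age": 40, "region": "north", "annual_premium": 800.0},
    {"age": 82, "region": "south", "annual_premium": 1500.0},
    {"age": 30, "region": "south", "annual_premium": 500.0},
]


def column_sum(rows, column, row_filter=None, segment=None):
    """Sum a column over the rows that pass both the per-function filter
    and the metric-wide segment, i.e. their intersection."""
    total = 0.0
    for row in rows:
        if segment is not None and not segment(row):
            continue
        if row_filter is not None and not row_filter(row):
            continue
        total += row[column]
    return total


def over_70(row):
    return row["age"] > 70


def north(row):
    return row["region"] == "north"


# sum(column="annual_premium", filter="age > 70") / sum(column="annual_premium")
ratio_all = (
    column_sum(rows, "annual_premium", row_filter=over_70)
    / column_sum(rows, "annual_premium")
)

# The same metric evaluated on a hypothetical "north" segment: the numerator
# sees only rows that are both in the segment and older than 70.
ratio_north = (
    column_sum(rows, "annual_premium", row_filter=over_70, segment=north)
    / column_sum(rows, "annual_premium", segment=north)
)
```

With this data `ratio_all` is 0.675 and `ratio_north` is 0.6.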

### Supported functions

#### Numerical Measures

## absolute_sum

Returns the sum of absolutes for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## column_count

Returns the number of rows with non-null values for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be any field.

## max

Returns the maximum value for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## max_length

Returns the maximum length for the given column (items for arrays/embeddings, characters for text).

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a **text/array/numeric array/embedding** field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## median

Returns the median value for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## mean

Returns the average value for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## mean_length

Returns the average length for the given column (items for arrays/embeddings, characters for text).

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a **text/array/numeric array/embedding** field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## min

Returns the minimum value for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## min_length

Returns the minimum length for the given column (items for arrays/embeddings, characters for text).

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a **text/array/numeric array/embedding** field of any group (`feature`/`raw_input`/`prediction`/`actual`).
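As a rough illustration of the three length functions (`max_length`, `mean_length`, `min_length`), the sketch below measures characters for text and items for arrays using Python's `len`. The sample data and the `lengths` helper are made up, and skipping null values is an assumption rather than documented behavior.

```python
from statistics import mean


def lengths(values):
    """Length of each non-null value: characters for text, items for arrays.
    Skipping nulls is an assumption made for this sketch."""
    return [len(v) for v in values if v is not None]


reviews = ["great", "ok", None, "terrible"]        # text column
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]    # embedding column

text_lengths = lengths(reviews)

max_length = max(text_lengths)     # longest text, in characters
min_length = min(text_lengths)     # shortest text, in characters
mean_length = mean(text_lengths)   # average text length

embedding_max_length = max(lengths(embeddings))    # items per embedding
```

For the text column the lengths are 5, 2 and 8, so the maximum is 8, the minimum is 2 and the mean is 5. Each embedding has 3 items.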

## missing_count

Returns the number of rows with null values for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be any field.

## missing_ratio

Returns the percentage of rows with null values for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be any field.
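The relationship between `column_count`, `missing_count` and `missing_ratio` can be sketched in plain Python, with `None` standing in for a null value. The data is made up. Note that this sketch returns the ratio as a fraction between 0 and 1; whether Aporia reports it on a 0-1 or a 0-100 scale is not stated here.

```python
ages = [34, None, 51, None, 29]

# column_count: rows with a non-null value
column_count = sum(1 for v in ages if v is not None)

# missing_count: rows with a null value
missing_count = sum(1 for v in ages if v is None)

# missing_ratio: share of null rows, expressed here as a fraction (assumption)
missing_ratio = missing_count / len(ages)
```

Here `column_count` is 3, `missing_count` is 2, and `missing_ratio` is 0.4.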

## sum

Returns the sum for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## squared_sum

Returns the sum of squared values for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## squared_deviation_sum

Returns the sum of squared deviations from the mean for the given column.

For a column **x** whose mean over all samples is **m**, this equals the sum of (x-m)².

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).

## value_count

Returns the number of entries where the given column is equal to the given value.

For example, `value_count(column="bool", value=True)` returns the number of entries where `bool` is `True`.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be any **boolean/categorical** field.

**value**: the value of the field to look for.

## variance

Returns the variance for the given column.

**Parameters**

**column**: the name of the field on which we want to apply the function. Can be a numeric field of any group (`feature`/`raw_input`/`prediction`/`actual`).
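To show how `squared_sum`, `squared_deviation_sum` and `variance` relate, here is a short plain-Python sketch on made-up values. It assumes a population variance (dividing by n); whether Aporia divides by n or by n - 1 is not stated, so treat that choice as an assumption.

```python
from statistics import pvariance

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)
m = sum(values) / n

# squared_sum: sum of x^2
squared_sum = sum(v * v for v in values)

# squared_deviation_sum: sum of (x - m)^2
squared_deviation_sum = sum((v - m) ** 2 for v in values)

# Population variance (assumption): squared deviations divided by n
population_variance = squared_deviation_sum / n
```

For this data `squared_deviation_sum` is 32 and the population variance is 4, which matches `statistics.pvariance(values)`. The same 32 can be reached from the other aggregations as `squared_sum - n * m**2`, which is 232 - 200.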

#### Regression Metrics

## absolute_error_sum

Returns the sum of absolute errors for the given prediction.

For a prediction P and actual A, returns the sum of |P-A|.

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **numeric actual** mapped to it.

## mae

Calculates MAE (mean absolute error) for the given prediction.

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **numeric actual** mapped to it.

## mse

Calculates MSE (mean squared error) for the given prediction.

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **numeric actual** mapped to it.

## rmse

Calculates RMSE (root mean squared error) for the given prediction.

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **numeric actual** mapped to it.

## squared_error_sum

Returns the sum of squared errors for the given prediction.

For a prediction P and actual A, returns the sum of (P-A)².

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **numeric actual** mapped to it.
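The regression functions in this group are closely related: the two error sums are the numerators of MAE and MSE, and RMSE is the square root of MSE. The sketch below uses the standard textbook definitions on made-up data. It illustrates those definitions and is not a statement of how Aporia computes them.

```python
import math

prediction = [2.5, 0.0, 2.0, 8.0]
actual = [3.0, -0.5, 2.0, 7.0]
n = len(actual)

# absolute_error_sum: sum of |P - A|
absolute_error_sum = sum(abs(p - a) for p, a in zip(prediction, actual))

# squared_error_sum: sum of (P - A)^2
squared_error_sum = sum((p - a) ** 2 for p, a in zip(prediction, actual))

mae = absolute_error_sum / n   # mean absolute error
mse = squared_error_sum / n    # mean squared error
rmse = math.sqrt(mse)          # root mean squared error
```

With these values the error sums are 2.0 and 1.5, giving an MAE of 0.5 and an MSE of 0.375.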

#### Binary Classification Metrics

## accuracy

Calculates accuracy for the given prediction.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

**method**: defines the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for **categorical predictions**.

## auc_roc

Calculates AUC ROC for the given prediction.

**Parameters**

**column**: the name of the **numeric prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

## fn_count

Returns the number of False-Negative results.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

## fp_count

Returns the number of False-Positive results.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

## f1

Calculates f1-score for the given prediction.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

**method**: defines the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for **categorical predictions**.

## precision

Calculates precision for the given prediction.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

**method**: defines the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for **categorical predictions**.

## recall

Calculates recall for the given prediction.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

**method**: defines the averaging strategy to use. Can be "macro", "micro" or "weighted". Required for **categorical predictions**.

## tn_count

Returns the number of True-Negative results.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.

## tp_count

Returns the number of True-Positive results.

**Parameters**

**column**: the name of the **numeric/boolean prediction** field on which we want to apply the function. Must have a **boolean actual** mapped to it.

**threshold**: probability threshold according to which we decide if a class is positive. Required for **numeric predictions**.
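To make the role of **threshold** concrete for the binary classification functions, the sketch below thresholds numeric scores and derives the four counts, then builds accuracy, precision, recall and F1 from them using the standard definitions. The data and the `confusion_counts` helper are made up, and treating a score greater than or equal to the threshold as positive is an assumption: the source does not say whether the comparison is inclusive.

```python
def confusion_counts(scores, actuals, threshold):
    """Count TP, FP, TN, FN after thresholding numeric scores.
    Assumption: score >= threshold is treated as a positive prediction."""
    tp = fp = tn = fn = 0
    for score, actual in zip(scores, actuals):
        predicted = score >= threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and not actual:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


proba = [0.9, 0.15, 0.3, 0.05, 0.6, 0.25]
will_buy = [True, False, True, False, False, True]

tp, fp, tn, fn = confusion_counts(proba, will_buy, threshold=0.2)

accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Same shape as the "model revenue" example: 5 * tp_count - 2 * fp_count
revenue = 5 * tp - 2 * fp
```

With a threshold of 0.2 this data yields 3 true positives, 1 false positive, 2 true negatives and no false negatives, so `revenue` is 13.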

#### Multiclass Classification Metrics

## accuracy_per_class

Calculates accuracy for the given prediction per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## fn_count_per_class

Returns the number of False-Negative results per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## fp_count_per_class

Returns the number of False-Positive results per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## f1_per_class

Calculates f1-score for the given prediction per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## precision_per_class

Calculates precision for the given prediction per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## recall_per_class

Calculates recall for the given prediction per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## tn_count_per_class

Returns the number of True-Negative results per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.

## tp_count_per_class

Returns the number of True-Positive results per the specified category class.

**Parameters**

**column**: the name of the **categorical prediction** field on which we want to apply the function. Must have a **categorical actual** mapped to it.

**class_name**: the class on which we want to calculate the function.
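The per-class functions can be read as a one-vs-rest view of a multiclass problem: the chosen **class_name** is treated as the positive class and every other class as negative. The sketch below illustrates that reading on made-up labels. The `per_class_counts` helper is hypothetical, and defining per-class accuracy as (TP + TN) / total is an assumption rather than something the source specifies.

```python
def per_class_counts(predictions, actuals, class_name):
    """One-vs-rest confusion counts with class_name as the positive class."""
    tp = fp = tn = fn = 0
    for predicted, actual in zip(predictions, actuals):
        predicted_positive = predicted == class_name
        actual_positive = actual == class_name
        if predicted_positive and actual_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actual_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


predictions = ["cat", "dog", "cat", "bird", "dog", "cat"]
actuals = ["cat", "cat", "cat", "bird", "dog", "dog"]

tp, fp, tn, fn = per_class_counts(predictions, actuals, class_name="cat")

precision_cat = tp / (tp + fp)
recall_cat = tp / (tp + fn)
f1_cat = 2 * precision_cat * recall_cat / (precision_cat + recall_cat)
accuracy_cat = (tp + tn) / len(actuals)   # assumed definition
```

For the class `"cat"` this gives 2 true positives, 1 false positive, 2 true negatives and 1 false negative, so precision, recall and F1 all equal 2/3.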

#### Ranking Metrics

## accuracy_at_k

Calculates Accuracy for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.

## map_at_k

Calculates MAP (Mean-Average-Precision) for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.

## mrr_at_k

Calculates MRR (Mean-Reciprocal-Rank) for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.

## ndcg_at_k

Calculates NDCG (Normalized Discounted Cumulative Gain) for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.

## precision_at_k

Calculates Precision for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.

## recall_at_k

Calculates Recall for the given prediction on the top K items.

**Parameters**

**column**: the name of the **array prediction** field on which we want to apply the function. Must have an **array actual** mapped to it. If using candidate-level ranking, can be a **boolean prediction** with a mapped **boolean actual**.

**k**: an integer between 1 and 12. Only the top-k items will be considered.
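For intuition about how **k** restricts the ranking functions, the sketch below computes common textbook versions of precision, recall, reciprocal rank and NDCG at k for a single ranked list with binary relevance. Ranking metrics have several variants, for example dividing precision by k versus by the number of returned items, so these definitions are assumptions and may differ from Aporia's exact formulas. MRR and MAP would additionally average the per-list values over all rows.

```python
import math


def precision_at_k(predicted, relevant, k):
    """Fraction of the top-k predicted items that are relevant."""
    return sum(1 for item in predicted[:k] if item in relevant) / k


def recall_at_k(predicted, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    return sum(1 for item in predicted[:k] if item in relevant) / len(relevant)


def reciprocal_rank_at_k(predicted, relevant, k):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(predicted, relevant, k):
    """NDCG with binary relevance and a log2(rank + 1) discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(predicted[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


predicted = ["a", "b", "c", "d", "e"]   # ranked prediction array
relevant = {"b", "d", "f"}              # items in the actual array

p_at_4 = precision_at_k(predicted, relevant, k=4)
r_at_4 = recall_at_k(predicted, relevant, k=4)
rr_at_4 = reciprocal_rank_at_k(predicted, relevant, k=4)
ndcg_at_4 = ndcg_at_k(predicted, relevant, k=4)
```

In this example the relevant items `b` and `d` sit at ranks 2 and 4, so precision at 4 is 0.5, recall at 4 is 2/3, and the reciprocal rank is 0.5. A list that places all relevant items first would score an NDCG of 1.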
