# Overview

**Aporia monitors your models by connecting *directly* to your data.** If you don't store your predictions yet, see our guide on [Storing Your Predictions](/v1/storing-your-predictions/overview.md) (recommended), or just [log them directly to Aporia](/v1/storing-your-predictions/logging-to-aporia-directly.md).

Aporia currently supports the following data sources:

* Amazon S3
* BigQuery
* Redshift
* Athena
* Snowflake
* PostgreSQL
* Delta Lake
* Glue Data Catalog

{% hint style="info" %}
If your storage or database is not listed here, please contact your Aporia account manager for further assistance.
{% endhint %}

### Configure Data Source

Connecting to a data source begins with configuring its connection details. For example, to connect to a PostgreSQL database, we can create the following data source object:

```python
data_source = PostgresJDBCDataSource(
  url="jdbc:postgresql://<POSTGRES_HOSTNAME>/<DBNAME>",
  query="SELECT * FROM model_predictions",
  user="<DB_USER>",
  password="<DB_PASSWORD>"
)
```

Please refer to the documentation page of the relevant data source for a complete list of supported parameters and configuration options.
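
Hard-coding credentials in the data source definition makes them easy to leak into version control. As a minimal sketch, the connection details could instead be read from environment variables and assembled into the JDBC URL. The helper name `postgres_jdbc_settings` and the environment variable names (`PG_HOST`, `PG_DBNAME`, `PG_USER`, `PG_PASSWORD`) are assumptions made for illustration and are not part of the Aporia SDK.

```python
import os

# Hypothetical variable names; adapt them to your own deployment.
REQUIRED_VARIABLES = ("PG_HOST", "PG_DBNAME", "PG_USER", "PG_PASSWORD")


def postgres_jdbc_settings(env=None):
    """Build JDBC connection settings from environment variables.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "url": f"jdbc:postgresql://{env['PG_HOST']}/{env['PG_DBNAME']}",
        "user": env["PG_USER"],
        "password": env["PG_PASSWORD"],
    }
```

The resulting dictionary uses the same keyword names as the example above, so it could be unpacked into the data source, for instance `PostgresJDBCDataSource(query="SELECT * FROM model_predictions", **postgres_jdbc_settings())`.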

### Connect Serving Data

After creating a data source, we can create a model version and connect it to the data source. For example:

```python
apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="binary",

  raw_inputs={
    "raw_text": "text",
  },

  features={
    "amount": "numeric",
    "owner": "string",
    "is_new": "boolean",
    "embeddings": {"type": "tensor", "dimensions": [768]},
  },

  predictions={
    "will_buy_insurance": "boolean",
    "proba": "numeric",
  },
)

apr_model.connect_serving(
  data_source=data_source,

  # Names of the prediction ID and prediction timestamp columns
  id_column="prediction_id",
  timestamp_column="prediction_timestamp",
)
```

By default, each raw input, feature, and prediction is mapped to the column with the same name in the data source query (here, the PostgreSQL query).

As part of the `connect_serving` API, you must also specify the following two columns:

* `id_column` - A unique ID that identifies each prediction.
* `timestamp_column` - The time at which each prediction occurred.
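
Because both columns are required, it can help to sanity-check a sample of rows before connecting. The following is a minimal sketch, independent of the Aporia SDK, that checks IDs are present and unique and that timestamps can be parsed. The function name `check_id_and_timestamp` is hypothetical, and the sketch assumes timestamps arrive either as `datetime` objects or as ISO 8601 strings.

```python
from datetime import datetime


def check_id_and_timestamp(rows, id_column, timestamp_column):
    """Return a list of human-readable problems found in the sample rows."""
    problems = []
    seen_ids = set()
    for index, row in enumerate(rows):
        # The ID must be present and must not repeat within the sample.
        row_id = row.get(id_column)
        if row_id is None:
            problems.append(f"row {index}: missing {id_column}")
        elif row_id in seen_ids:
            problems.append(f"row {index}: duplicate {id_column} {row_id!r}")
        else:
            seen_ids.add(row_id)

        # The timestamp must be a datetime or an ISO 8601 string.
        timestamp = row.get(timestamp_column)
        if isinstance(timestamp, datetime):
            continue
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            problems.append(f"row {index}: invalid {timestamp_column} {timestamp!r}")
    return problems
```

An empty list means the sample passed both checks; any entries point at the rows worth inspecting before calling `connect_serving`.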

### Integrating Delayed Actuals

To integrate actuals (ground truth), use the `labels` argument of the `connect_serving` API. It maps each prediction to the column that holds its actual value.

For example, let's assume we have two columns: `will_buy_insurance` (the model prediction) and `did_buy_insurance` (the ground truth). To integrate them with Aporia:

```python
apr_model = aporia.create_model_version(
  ...
  predictions={
    "will_buy_insurance": "boolean"
  }
)

apr_model.connect_serving(
  data_source=data_source,

  id_column="prediction_id",
  timestamp_column="prediction_timestamp",

  labels={
    # Prediction name -> name of the column holding its actual value
    "will_buy_insurance": "did_buy_insurance"
  }
)
```

The ground truth can be `NULL` until the actual value becomes available, and that's okay.
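
To illustrate why a `NULL` ground truth is harmless, the sketch below computes accuracy only over rows whose actual value has already arrived. It is a plain-Python illustration of the idea, not Aporia's implementation; the function name `accuracy_over_labeled_rows` and the sample rows are hypothetical.

```python
def accuracy_over_labeled_rows(rows, prediction_column, label_column):
    """Compute accuracy, skipping rows whose label is still None (SQL NULL)."""
    labeled = [row for row in rows if row.get(label_column) is not None]
    if not labeled:
        return None  # No actuals have arrived yet, so accuracy is undefined
    correct = sum(
        1 for row in labeled if row[prediction_column] == row[label_column]
    )
    return correct / len(labeled)


# Hypothetical sample: the third prediction has no actual value yet.
rows = [
    {"will_buy_insurance": True, "did_buy_insurance": True},
    {"will_buy_insurance": True, "did_buy_insurance": False},
    {"will_buy_insurance": False, "did_buy_insurance": None},
]
accuracy = accuracy_over_labeled_rows(
    rows, "will_buy_insurance", "did_buy_insurance"
)
```

Here only the first two rows contribute to the metric; the third is picked up once its `did_buy_insurance` value is filled in.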

### Connecting Training / Test Sets

To connect your model version to training or test sets, you can use the `connect_training` and `connect_testing` APIs.

For example:

```python
# Training set
apr_model.connect_training(
  data_source=training_set_data_source,
  id_column="id",
  timestamp_column="timestamp",
)

# Test set
apr_model.connect_testing(
  data_source=test_set_data_source,
  id_column="id",
  timestamp_column="timestamp",
)
```

### Advanced Mapping

Any column that has the same name as a raw input, feature, or prediction in the model schema is mapped to the corresponding raw input, feature, or prediction.

However, you can override this mapping using the `raw_inputs`, `features`, `predictions`, and `labels` arguments to the `connect_serving` / `connect_training` / `connect_testing` APIs. Example:

```python
apr_model.connect_serving(
  data_source=aporia.GlueDataSource(
    database="datalake",
    query="""
      SELECT
        my_id,
        full_name,
        age,
        my_gender_col,
        decision,
        was_decision_correct,
        occurred_at
      FROM predictions
    """,
  ),

  id_column="my_id",
  timestamp_column="occurred_at",
  raw_inputs={
    "fullname": "full_name",
  },
  features={
    "age": "age",
    "gender": "my_gender_col",
  },
  predictions={
    "will_buy_insurance": "decision",
  },
  labels={
    "will_buy_insurance": "was_decision_correct"
  }
)
```
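
The mapping rule described in this section can be summarized as: use the override when one is given, otherwise fall back to the column with the same name as the field. The following sketch illustrates that rule in plain Python. `resolve_column_mapping` is a hypothetical helper, not an Aporia API, and rejecting overrides for unknown fields is a choice made for this sketch only.

```python
def resolve_column_mapping(schema_fields, overrides=None):
    """Map each schema field to a column: the override if present, else the same name."""
    overrides = overrides or {}
    # Assumption of this sketch: an override for a field outside the schema is an error.
    unknown = set(overrides) - set(schema_fields)
    if unknown:
        raise ValueError(f"Overrides for fields not in the schema: {sorted(unknown)}")
    return {field: overrides.get(field, field) for field in schema_fields}


# "age" keeps its default mapping; "gender" is read from "my_gender_col".
feature_columns = resolve_column_mapping(
    schema_fields=["age", "gender"],
    overrides={"gender": "my_gender_col"},
)
```

With this rule, entries such as `"age": "age"` are optional: only fields whose column name differs from the schema name need an explicit override.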


