Kubeflow / KServe
If you are using Kubeflow or KServe for model serving, you can store the predictions of your models using InferenceDB.
InferenceDB is an open-source, cloud-native tool built on Kafka that connects to KServe and streams predictions to a data lake.
WARNING: InferenceDB is still experimental!
InferenceDB is an open-source project developed by Aporia. It is still experimental, and not yet ready for production!
This guide will explain how to deploy a simple scikit-learn model using KServe, and log its inferences to a Parquet file in S3.
Requirements
Knative Eventing - with the Kafka broker
Kafka - with Schema Registry, Kafka Connect, and the Confluent S3 Sink connector plugin
To get started as quickly as possible, see the environment preparation tutorial, which shows how to set up a full environment in minutes.
Step 1: Kafka Broker
First, we will need a Kafka broker to collect all KServe inference requests and responses:
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: sklearn-iris-broker
  namespace: default
  annotations:
    eventing.knative.dev/broker.class: Kafka
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: inferencedb-kafka-broker-config
    namespace: knative-eventing
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: inferencedb-kafka-broker-config
  namespace: knative-eventing
data:
  # Number of topic partitions
  default.topic.partitions: "8"
  # Replication factor of topic messages
  default.topic.replication.factor: "1"
  # A comma-separated list of bootstrap servers (they can be inside or outside the Kubernetes cluster)
  bootstrap.servers: "kafka-cp-kafka.default.svc.cluster.local:9092"

Step 2: InferenceService
Next, we will serve a simple sklearn model using KServe:
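The manifest below is a minimal sketch based on KServe's standard scikit-learn Iris example; the storageUri points to KServe's public example model, and the logger URL is assumed to be the address of the Kafka broker created in Step 1, so adjust both to your environment.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: default
spec:
  predictor:
    # Send a copy of every inference request and response to the broker from Step 1
    logger:
      mode: all
      # Assumed broker address - use the value reported by:
      #   kubectl get broker sklearn-iris-broker -o jsonpath='{.status.address.url}'
      url: http://kafka-broker-ingress.knative-eventing.svc.cluster.local/default/sklearn-iris-broker
    model:
      modelFormat:
        name: sklearn
      # Public example model from the KServe docs - replace with your own model
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model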
Note the logger section - you can read more about it in the KServe documentation.
Step 3: InferenceLogger
Finally, we can log the predictions of our new model using InferenceDB:
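As an illustration only, an InferenceLogger resource along these lines tells InferenceDB which inference events to consume and where to write them; the field names, topic, and S3 bucket (my-bucket) below are assumptions, so consult the InferenceDB documentation for the authoritative schema.

apiVersion: inferencedb.aporia.com/v1alpha1
kind: InferenceLogger
metadata:
  name: sklearn-iris
  namespace: default
spec:
  # Assumed topic name created by the Knative Kafka broker from Step 1
  topic: knative-broker-default-sklearn-iris-broker
  events:
    type: kserve
    config: {}
  destination:
    type: confluent-s3
    config:
      # Placeholder bucket and prefix - replace with your own destination
      url: s3://my-bucket/inferencedb
      format: parquet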
Step 4: Send requests
First, we will need to port-forward the Istio service so we can access it from our local machine:
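Assuming KServe is installed with the default Istio ingress gateway in the istio-system namespace, the port-forward can look like this:

kubectl port-forward -n istio-system svc/istio-ingressgateway 8080:80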
Prepare a payload in a file called iris-input.json:
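For example, the two-sample input from the KServe Iris tutorial:

{
  "instances": [
    [6.8, 2.8, 4.8, 1.4],
    [6.0, 3.4, 4.5, 1.6]
  ]
}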
And finally, you can send some inference requests:
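For example, using curl against the port-forwarded gateway (the hostname is taken from the InferenceService status, and the model name sklearn-iris matches the manifest above):

# Resolve the hostname KServe assigned to the InferenceService
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

# Send a prediction request through the port-forwarded Istio gateway
curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./iris-input.json \
  http://localhost:8080/v1/models/sklearn-iris:predict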
Step 5: Success!
If everything was configured correctly, these predictions should have been logged to a Parquet file in S3.
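You can verify this by listing the destination prefix; the bucket and path below are the placeholders used in the InferenceLogger example above, so substitute your own:

aws s3 ls s3://my-bucket/inferencedb/ --recursive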