Example: Question Answering

Question answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document.

>>> from transformers import pipeline

>>> qa_model = pipeline("question-answering")

This downloads a default pretrained model and tokenizer for Question Answering. Now you can use qa_model on your target question and context:

qa_model(
    question="Where are the best cookies?",
    context="The best cookies are in Aporia's office."
)

# ==> {'score': 0.8362494111061096,
#      'start': 24,
#      'end': 39,
#      'answer': "Aporia's office"}
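
The start and end values are character offsets of the answer within the context string.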

Extract Embeddings

To extract embeddings from the model, we'll first need to do two things:

  1. Pass output_hidden_states=True to our model params.

  2. When we call pipeline(...), it does a lot of things for us - preprocessing, inference, and postprocessing. We'll need to break these steps apart, so we can step into the middle and grab the embeddings 😉

In other words:

qa_model = pipeline("question-answering", model_kwargs={"output_hidden_states": True})

# Preprocess
model_inputs = next(qa_model.preprocess(QuestionAnsweringPipeline.create_sample(
    question="Where are the best cookies?", 
    context="The best cookies are in Aporia's office."
)))

# Inference
model_output = qa_model.model(input_ids=model_inputs["input_ids"])

# Postprocessing
start, end = model_output[:2]
qa_model.postprocess([{"start": start, "end": end, **model_inputs}])
  # ==> {'score': 0.8362494111061096, 'start': 24, 'end': 39, 'answer': "Aporia's office"}

And finally, to extract embeddings for this prediction:

embeddings = torch.mean(model_output.hidden_states[-1], dim=1).squeeze()
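
This mean-pools the token embeddings from the last hidden layer into a single vector per prediction (768-dimensional for the default question-answering model).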

Storing your Predictions

The next step would be to store your predictions in a data store, including the embeddings themselves. For more information on storing your predictions, please check out the Storing Your Predictions section.

For example, you could use a Parquet file on S3 or a Postgres table with columns for the prediction ID, timestamp, raw inputs (question and context), embeddings, and predictions (answer and score).
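
As a minimal sketch (pandas, a placeholder S3 bucket path, and the embeddings tensor from the previous step are assumed), a single prediction row could be written to Parquet like this:

import uuid
import datetime

import pandas as pd

predictions_df = pd.DataFrame([{
    "id": str(uuid.uuid4()),
    "timestamp": datetime.datetime.utcnow(),
    "question": "Where are the best cookies?",
    "context": "The best cookies are in Aporia's office.",
    "embeddings": embeddings.tolist(),  # 768-dim vector from the previous step
    "answer": "Aporia's office",
    "score": 0.8362494111061096,
}])

# Writing to S3 requires the s3fs package; the bucket path is a placeholder
predictions_df.to_parquet("s3://my-bucket/predictions.parquet")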

Integrate with Aporia

Now let’s add some monitoring to this model 🚀 To monitor this model in Aporia, the first step is to create a model version:

apr_model = aporia.create_model_version(
  model_id="<MODEL_ID>",
  model_version="v1",
  model_type="multiclass"
  raw_inputs={
    "question": "text",
    "context": "text"
  },
  features={
     "embeddings": {"type": "tensor", "dimensions": [768]}
  },
  predictions={
    "answer": "string",
    "score": "numeric"
  },
)

Next, we can log predictions directly to Aporia:

import uuid

qa_model = pipeline(
    task="question-answering",
    model_kwargs={"output_hidden_states": True}
)


def predict(question: str, context: str):
    # Preprocess
    model_inputs = next(qa_model.preprocess(QuestionAnsweringPipeline.create_sample(
        question=question,
        context=context
    )))

    # Inference
    model_output = qa_model.model(input_ids=model_inputs["input_ids"])

    # Extract embeddings from the last hidden layer
    embeddings = torch.mean(model_output.hidden_states[-1], dim=1).squeeze()

    # Postprocessing
    start, end = model_output[:2]
    result = qa_model.postprocess([{"start": start, "end": end, **model_inputs}])
    
    # Log prediction to Aporia
    apr_model.log_prediction(
        id=str(uuid.uuid4()),
        raw_inputs={
            "question": question,
            "context": context
        },
        features={
            "embeddings": embeddings
        },
        predictions={
            "answer": result["answer"],
            "score": result["score"]
        }
    )
    
    return result
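
For example, calling the function runs the model and logs the prediction to Aporia:

predict(
    question="Where are the best cookies?",
    context="The best cookies are in Aporia's office."
)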

Alternatively, connect Aporia to a data source. For more information, see Data Sources - Overview:

apr_model.connect_serving(
    data_source=<DATA_SOURCE>,
    id_column="id",
    timestamp_column="timestamp"
)

Your model should now be integrated with Aporia! 🎉
