Question answering models can retrieve the answer to a question from a given text, which is useful for searching for an answer in a document.
Throughout the guide, we will use a simple question answering model based on 🤗 Hugging Face:
>>> from transformers import pipeline
>>> qa_model = pipeline("question-answering")
This downloads a default pretrained model and tokenizer for question answering. You can now call qa_model with your target question and context:
qa_model(
    question="Where are the best cookies?",
    context="The best cookies are in Aporia's office."
)
# ==> {'score': 0.8362494111061096,
# 'start': 24,
# 'end': 39,
# 'answer': "Aporia's office"}
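The start and end values in the result are character offsets into the context string, so the answer can be recovered by slicing the context. A quick check in plain Python, using the values copied from the output above (no model is needed to run it):

```python
context = "The best cookies are in Aporia's office."

# Values copied from the pipeline output shown above.
prediction = {"start": 24, "end": 39, "answer": "Aporia's office"}

# Slice the context with the character offsets returned by the pipeline.
answer_span = context[prediction["start"]:prediction["end"]]
print(answer_span)  # Aporia's office
```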
Extract Embeddings
To extract embeddings from the model, we'll first need to do two things:
Pass output_hidden_states=True to the model through model_kwargs.
Calling pipeline(...) does a lot for us: preprocessing, inference, and postprocessing. We'll need to split these steps apart so we can step in after inference and read the embeddings.
In other words:
from transformers import QuestionAnsweringPipeline, pipeline

# Ask the underlying model to return the hidden states of every layer
qa_model = pipeline("question-answering", model_kwargs={"output_hidden_states": True})

# Preprocess: preprocess() yields tokenized features; take the first one
model_inputs = next(qa_model.preprocess(QuestionAnsweringPipeline.create_sample(
    question="Where are the best cookies?",
    context="The best cookies are in Aporia's office."
)))

# Inference
model_output = qa_model.model(input_ids=model_inputs["input_ids"])

# Postprocessing: the first two outputs are the start and end logits
start, end = model_output[:2]
qa_model.postprocess([{"start": start, "end": end, **model_inputs}])
# ==> {'score': 0.8362494111061096, 'start': 24, 'end': 39, 'answer': "Aporia's office"}
And finally, to extract embeddings for this prediction:
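One common way to turn hidden states into a single embedding vector is to average the last layer over the token dimension. The sketch below assumes PyTorch and assumes that model_output.hidden_states is a tuple of per-layer tensors shaped (batch, tokens, hidden_size), which is what output_hidden_states=True is expected to produce. The helper name mean_pool_last_layer is hypothetical, and mean pooling is only one of several reasonable choices. A stand-in tuple is used here so the snippet runs without downloading a model:

```python
import torch


def mean_pool_last_layer(hidden_states):
    """Average the last hidden layer over the token axis.

    hidden_states: tuple of tensors, each shaped (batch, tokens, hidden_size).
    Returns a tensor shaped (batch, hidden_size).
    """
    last_layer = hidden_states[-1]
    return last_layer.mean(dim=1)


# Stand-in for model_output.hidden_states (an assumption for illustration):
# 3 layers, 1 sample, 4 tokens, 8 hidden dimensions.
fake_hidden_states = tuple(torch.ones(1, 4, 8) * layer for layer in range(3))

# Drop the batch axis to get one vector for the single prediction.
embeddings = mean_pool_last_layer(fake_hidden_states).squeeze(0)
print(embeddings.shape)  # torch.Size([8])
```

With the real model output, the same helper would be called as mean_pool_last_layer(model_output.hidden_states).squeeze(0), and .detach().tolist() would turn the result into a plain list that is easy to store.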
The next step would be to store your predictions in a data store, including the embeddings themselves. For more information on storing your predictions, please check out the Storing Your Predictions section.
For example, you could use a Parquet file on S3 or a Postgres table that looks like this:
| id | question | context | embeddings | answer | score | timestamp |
|----|----------|---------|------------|--------|-------|-----------|
| 1 | Where are the best cookies? | The best cookies are in... | [0.77, 0.87, 0.94, ...] | Aporia's Office | 0.982 | 2021-11-20 13:41:00 |
| 2 | Where is the best hummus? | The best hummus is in... | [0.97, 0.82, 0.13, ...] | Another Place | 0.881 | 2021-11-20 13:45:00 |
| 3 | Where is the best burger? | The best burger is in... | [0.14, 0.55, 0.66, ...] | Blablabla | 0.925 | 2021-11-20 13:49:00 |
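As a rough illustration of the table layout above, the sketch below writes one prediction to a table with the same columns. It uses Python's built-in sqlite3 with an in-memory database as a stand-in for Postgres, and stores the embedding as a JSON string; both are assumptions made to keep the example self-contained. The store_prediction helper and the three-element embedding are hypothetical placeholders.

```python
import json
import sqlite3
from datetime import datetime, timezone

# In-memory SQLite database used as a stand-in for a Postgres table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE predictions (
        id INTEGER PRIMARY KEY,
        question TEXT,
        context TEXT,
        embeddings TEXT,  -- JSON-encoded list of floats
        answer TEXT,
        score REAL,
        timestamp TEXT
    )
    """
)


def store_prediction(conn, question, context, embeddings, answer, score):
    """Insert one prediction row and return its id (hypothetical helper)."""
    timestamp = datetime.now(timezone.utc).isoformat(sep=" ", timespec="seconds")
    cursor = conn.execute(
        "INSERT INTO predictions "
        "(question, context, embeddings, answer, score, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (question, context, json.dumps(embeddings), answer, score, timestamp),
    )
    conn.commit()
    return cursor.lastrowid


row_id = store_prediction(
    conn,
    question="Where are the best cookies?",
    context="The best cookies are in Aporia's office.",
    embeddings=[0.77, 0.87, 0.94],  # placeholder values, not real model output
    answer="Aporia's office",
    score=0.8362494111061096,
)
stored = conn.execute(
    "SELECT question, embeddings, answer FROM predictions WHERE id = ?", (row_id,)
).fetchone()
```

A real embedding has hundreds of dimensions, so a native array or vector column type may be a better fit than JSON text in a production database.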
Integrate to Aporia
Now let's add some monitoring to this model. To monitor this model in Aporia, the first step is to create a model version: