BERT Answer Similarity

Definitions

BERT Answer Similarity measures the semantic similarity between the Generated Answer and the Ground Truth Answers.

This metric leverages the BERT model to calculate semantic similarity.
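Conceptually, the generated answer and each ground truth answer are embedded with a BERT-style encoder and then compared. Below is a minimal illustrative sketch of that idea using the sentence-transformers package and cosine similarity; the model name and scoring function are assumptions for illustration, not necessarily what BertAnswerSimilarity uses internally.

# Illustrative sketch only: embed both answers with a BERT-style encoder
# and compare them with cosine similarity. The actual BertAnswerSimilarity
# implementation may use a different model and scoring function.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model, for illustration

def bert_style_similarity(answer: str, ground_truth: str) -> float:
    embeddings = model.encode([answer, ground_truth], convert_to_tensor=True)
    return util.cos_sim(embeddings[0], embeddings[1]).item()

print(bert_style_similarity(
    "Shakespeare wrote 'Romeo and Juliet'",
    "William Shakespeare wrote 'Romeo and Juliet'",
))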


Example Usage

Required data items: answer, ground_truth_answers

from continuous_eval.metrics.generation.text import BertAnswerSimilarity

datum = {
    "question": "Who wrote 'Romeo and Juliet'?",
    "retrieved_context": ["William Shakespeare is the author of 'Romeo and Juliet'."],
    "ground_truth_context": ["William Shakespeare is the author of 'Romeo and Juliet'."],
    "answer": "Shakespeare wrote 'Romeo and Juliet'",
    "ground_truth_answers": [
        "William Shakespeare wrote 'Romeo and Juliet'",
        "William Shakespeare",
        "Shakespeare",
        "Shakespeare is the author of 'Romeo and Juliet'",
    ],
}

metric = BertAnswerSimilarity()
print(metric(**datum))

Example Output

The metric outputs the maximum BERT similarity score across the items in ground_truth_answers.

{
    'bert_answer_similarity': 0.9274404048919678
}
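
Because several ground truth answers are provided, only the best match is reported. The following sketch shows that max aggregation explicitly, reusing the illustrative bert_style_similarity helper and the datum dictionary defined above; it is an approximation of the behavior, not the library's internal code.

# Illustrative only: score the answer against every ground truth and keep the best match.
scores = [
    bert_style_similarity(datum["answer"], gt)
    for gt in datum["ground_truth_answers"]
]
print({"bert_answer_similarity": max(scores)})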