
LLM-based Answer Relevance

Definition

LLM-based Answer Relevance outputs a score between 0.0 and 1.0 assessing how relevant the generated answer is to the input question. It does not require reference ground-truth answers; an LLM judge scores the answer against the question alone.

Scoring rubric in LLM Prompt:

  • 0.0 means that the answer is completely irrelevant to the question.
  • 0.5 means that the answer is partially relevant to the question or only partially answers it.
  • 1.0 means that the answer is relevant to the question and completely answers the question.
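The rubric above can be sketched as a judge prompt plus a score parser. This is an illustrative sketch only, not the library's internals; `build_relevance_prompt` and `parse_relevance_score` are hypothetical names.

```python
# Hypothetical sketch of an LLM-judged relevance metric: build a prompt
# embedding the rubric, send it to a judge model (omitted here), and snap
# the judge's numeric reply to the nearest rubric value.

RUBRIC = (
    "Score the answer's relevance to the question:\n"
    "0.0 - completely irrelevant to the question.\n"
    "0.5 - partially relevant or only partially answers the question.\n"
    "1.0 - relevant and completely answers the question.\n"
)

def build_relevance_prompt(question: str, answer: str) -> str:
    """Assemble the judge prompt from the rubric and the two data items."""
    return (
        f"{RUBRIC}\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with a score (0.0, 0.5, or 1.0) and a short reasoning."
    )

def parse_relevance_score(raw: str) -> float:
    """Snap the judge's raw numeric reply to the nearest rubric value."""
    value = float(raw.strip())
    return min((0.0, 0.5, 1.0), key=lambda v: abs(v - value))

prompt = build_relevance_prompt(
    "Who wrote 'Romeo and Juliet'?",
    "Shakespeare wrote 'Romeo and Juliet'",
)
# A judge reply of "1.0" maps to the top rubric value:
print(parse_relevance_score("1.0"))
```

Snapping to the nearest rubric value keeps the output within the discrete scale even when the judge model returns an off-rubric number such as 0.7.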

Example Usage

Required data items: question, answer

# `client` is assumed to be an initialized SDK client
res = client.metrics.compute(
    Metric.LLMBasedAnswerRelevance,
    args={
        "question": "Who wrote 'Romeo and Juliet'?",
        "answer": "Shakespeare wrote 'Romeo and Juliet'",
    },
)
print(res)
print(res)

Sample Output

{
    'LLM_based_answer_relevance': 1.0,
    'LLM_based_answer_relevance_reasoning': "The answer is relevant to the question and completely answers the question by correctly identifying Shakespeare as the author of 'Romeo and Juliet'."
}