LLM-based Context Coverage


Context Coverage measures how completely the retrieved contexts cover the information needed to generate the ground truth answer.

This metric requires the LLM evaluator to output correct and fairly complex JSON. If the JSON cannot be parsed, the metric returns a score of -1.0.
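Because -1.0 marks a parsing failure rather than a real coverage value, it is usually worth excluding it before averaging scores over a dataset. The following is a minimal sketch of that idea; `mean_coverage` is a hypothetical helper written for illustration and is not part of continuous_eval.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def mean_coverage(scores: Sequence[float]) -> Tuple[Optional[float], int]:
    """Hypothetical helper: average coverage scores, skipping the -1.0 sentinel.

    Returns (mean of valid scores or None if there are none, number of failures).
    """
    # Valid coverage scores lie in [0, 1]; a negative score signals a parse failure.
    valid = [s for s in scores if s >= 0.0]
    failed = len(scores) - len(valid)
    return (mean(valid) if valid else None), failed


average, failures = mean_coverage([0.5, -1.0, 1.0])
print(average, failures)
```

Reporting the failure count alongside the mean makes it visible when the evaluator model often produces unparseable output.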

Example Usage

Required data items: question, retrieved_context, ground_truth_answers

from continuous_eval.metrics.retrieval import LLMBasedContextCoverage
from continuous_eval.llm_factory import LLMFactory

# One evaluation sample: the question, the retrieved chunks, and the reference answer(s)
datum = {
    "question": "What is the largest and second city in France?",
    "retrieved_context": [
        "Lyon is a major city in France.",
        "Paris is the capital of France and also the largest city in the country.",
    ],
    "ground_truth_answers": ["Paris is the largest city in France and Marseille is the second largest."],
}

# The LLM used as evaluator is passed in explicitly
metric = LLMBasedContextCoverage(LLMFactory("gpt-4-1106-preview"))

Sample Output

{
    'LLM_based_context_coverage': 0.5,
    "classification": [
        {
            "statement_1": "Paris is the largest city in France.",
            "reason": "This is directly stated in the context.",
            "Attributed": 1
        },
        {
            "statement_2": "Marseille is the second largest city in France.",
            "reason": "This information is not provided in the context, which only mentions Paris and Lyon.",
            "Attributed": 0
        }
    ]
}
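The score of 0.5 in this sample is consistent with the fraction of ground truth statements marked as attributed: one of two. The sketch below reproduces that arithmetic under the assumption that the score is derived this way; `coverage_from_classification` is a hypothetical helper, not the library's own implementation.

```python
from typing import Dict, List


def coverage_from_classification(classification: List[Dict]) -> float:
    """Hypothetical helper: share of statements with "Attributed" equal to 1."""
    flags = [int(entry["Attributed"]) for entry in classification]
    if not flags:
        raise ValueError("classification must contain at least one statement")
    return sum(flags) / len(flags)


classification = [
    {"statement_1": "Paris is the largest city in France.", "Attributed": 1},
    {"statement_2": "Marseille is the second largest city in France.", "Attributed": 0},
]
print(coverage_from_classification(classification))  # 0.5
```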