Extended Context Precision & Recall
Definitions
This metric is similar to Context Precision & Recall, but it adds matching strategies for determining relevance.
Matching Strategy
The ground truth contexts can be defined differently from the exact chunks retrieved. For example, a ground truth context can be a single sentence that contains the information, while the retrieved contexts are uniform 512-token chunks. The following matching strategies determine relevance:
| Match Type | Retrieved Component | Considered relevant if |
|---|---|---|
| Rouge Chunk Match | Chunk | It matches a ground truth context chunk with ROUGE-L Recall > 0.7. |
| Rouge Sentence Match | Sentence | It matches a ground truth context sentence with ROUGE-L Recall > 0.8. |
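The relevance test above can be sketched with a plain longest-common-subsequence implementation of ROUGE-L recall. This is an illustrative approximation, not the library's internal code; the tokenizer and the helper names (`rouge_l_recall`, `is_relevant_chunk`) are assumptions for the sketch.

```python
import re

def _tokenize(text):
    # Simple word tokenizer: lowercase, drop punctuation (an assumption;
    # the actual metric may tokenize differently).
    return re.findall(r"\w+", text.lower())

def _lcs_len(a, b):
    # Classic dynamic-programming longest common subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l_recall(retrieved, ground_truth):
    # ROUGE-L recall = LCS length / number of tokens in the ground truth text.
    ref = _tokenize(ground_truth)
    cand = _tokenize(retrieved)
    return _lcs_len(cand, ref) / len(ref) if ref else 0.0

def is_relevant_chunk(chunk, gt_chunks, threshold=0.7):
    # A retrieved chunk counts as relevant if its ROUGE-L recall against
    # any ground truth chunk exceeds the threshold (0.7 for chunks).
    return any(rouge_l_recall(chunk, gt) > threshold for gt in gt_chunks)
```

With this sketch, a retrieved chunk that fully contains a ground truth sentence scores recall 1.0 and is relevant, while an unrelated chunk falls below the threshold.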
Example Usage
Required data items: `retrieved_context`, `ground_truth_context`
```python
res = client.metrics.compute(
    Metric.PrecisionRecallF1Ext,
    args={
        "retrieved_context": [
            "Paris is the capital of France and also the largest city in the country.",
            "Lyon is a major city in France.",
        ],
        "ground_truth_context": ["Paris is the capital of France."],
    },
)
print(res)
```
Example Output
```python
{
    'sentence_precision': 1.0,
    'sentence_recall': 1.0,
    'sentence_f1': 1.0,
    'chunk_precision': 1.0,
    'chunk_recall': 1.0,
    'chunk_f1': 1.0
}
```