Rank-Aware Metrics
Definitions
Rank-aware metrics take into account the order in which contexts are retrieved.
Average Precision (AP) measures whether all relevant chunks are retrieved and rewards ranking them earlier, by averaging precision at each relevant chunk's rank. The mean of AP across a dataset is frequently referred to as MAP.
Reciprocal Rank (RR) measures how early the first relevant chunk appears in your retrieval: it is the inverse of that chunk's rank. The mean of RR across a dataset is frequently referred to as MRR.
Normalized Discounted Cumulative Gain (NDCG) accounts for cases where your relevance labels are non-binary (graded) by discounting each chunk's gain by its rank.
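For intuition, the three metrics can be sketched for a list of binary relevance labels (1 = relevant, 0 = not relevant). This is an illustrative implementation, not the library's internal code:

```python
import math

def average_precision(relevance):
    # Average of precision@k taken at each rank k where a relevant chunk appears.
    hits = 0
    precisions = []
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0

def reciprocal_rank(relevance):
    # Inverse rank of the first relevant chunk; 0.0 if none was retrieved.
    for k, rel in enumerate(relevance, start=1):
        if rel:
            return 1 / k
    return 0.0

def ndcg(relevance):
    # DCG discounts each gain by log2(rank + 1); normalize by the ideal ordering.
    dcg = sum(rel / math.log2(k + 1) for k, rel in enumerate(relevance, start=1))
    ideal = sorted(relevance, reverse=True)
    idcg = sum(rel / math.log2(k + 1) for k, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

In the example below, only the second retrieved chunk is relevant (relevance = [0, 1]), so AP = RR = 0.5 and NDCG = 1/log2(3) ≈ 0.631.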
Tip: Focus on MRR if a single chunk typically contains all the information needed to answer a question. Focus on MAP if multiple chunks need to be synthesized to answer a question.
Example Usage
Required data items: retrieved_context, ground_truth_context
res = client.metrics.compute(
    Metric.RankedRetrievalMetrics,
    args={
        "retrieved_context": [
            "Lyon is a major city in France.",
            "Paris is the capital of France and also the largest city in the country.",
        ],
        "ground_truth_context": ["Paris is the capital of France."],
    },
)
print(res)
Example Output
{
    'average_precision': 0.5,
    'reciprocal_rank': 0.5,
    'ndcg': 0.6309297535714574
}
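These values follow directly from the single relevant chunk appearing at rank 2: AP and RR are both 1/2, and NDCG is 1/log2(3) since the ideal ordering would place the relevant chunk first (IDCG = 1). A quick arithmetic check, assuming binary relevance:

```python
import math

rank = 2  # the one relevant chunk is retrieved second

ap = 1 / rank                    # precision@2 = 1/2, the only term averaged
rr = 1 / rank                    # inverse rank of the first relevant chunk
ndcg = (1 / math.log2(rank + 1)) / (1 / math.log2(2))  # DCG / IDCG

print(ap, rr, ndcg)  # 0.5 0.5 0.6309297535714574
```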