Custom Metrics

Create your own metric

To define your own metric, extend the Metric class and implement the __call__ method.

Two further methods are optional: batch, for cases where batch processing can be optimized, and aggregate, which combines metric results over multiple samples.

See the Metric Folder for examples of how various types of metrics are implemented.

Example

from continuous_eval.metrics.base import Metric

class CustomMetric(Metric):
    def __init__(self) -> None:
        super().__init__()

    def __call__(self, input_from_dataset, **kwargs):
        # Implement the metric calculation here
        score = ...

        return {"custom_metric_score": score}
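
The optional batch and aggregate methods mentioned earlier can be sketched as follows. This is a minimal illustration, not the library's implementation: the Metric class here is a local stand-in so the snippet runs without continuous_eval installed, the batch and aggregate signatures are assumptions, and AnswerLengthRatio, its field names (answer, reference), and its score keys are hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List


class Metric:
    """Local stand-in for continuous_eval.metrics.base.Metric (assumed interface)."""

    def __call__(self, **kwargs: Any) -> Dict[str, Any]:
        raise NotImplementedError

    def batch(self, dataset: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Assumed default: score each sample independently.
        return [self(**sample) for sample in dataset]

    def aggregate(self, results: List[Dict[str, Any]]) -> Dict[str, Any]:
        raise NotImplementedError


class AnswerLengthRatio(Metric):
    """Hypothetical metric: answer length divided by reference length, in words."""

    def __call__(self, answer: str, reference: str, **kwargs: Any) -> Dict[str, Any]:
        ref_len = len(reference.split())
        ratio = len(answer.split()) / ref_len if ref_len else 0.0
        return {"answer_length_ratio": ratio}

    def batch(self, dataset: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # A real metric might vectorize the work or send one batched request here;
        # this sketch simply loops over the samples.
        return [self(**sample) for sample in dataset]

    def aggregate(self, results: List[Dict[str, Any]]) -> Dict[str, Any]:
        # Combine the per-sample scores into a single summary value.
        return {"answer_length_ratio_mean": mean(r["answer_length_ratio"] for r in results)}


dataset = [
    {"answer": "Paris is the capital", "reference": "Paris"},
    {"answer": "42", "reference": "The answer is 42"},
]
metric = AnswerLengthRatio()
per_sample = metric.batch(dataset)      # one result dictionary per sample
summary = metric.aggregate(per_sample)  # one dictionary for the whole dataset
```

Keeping __call__ focused on a single sample means batch and aggregate can be added later without changing how individual scores are computed.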

LLM-based Custom Metric

To create custom metrics that use an LLM as a judge against your own criteria, see LLM-based Custom Metric.

Add additional LLM Interface

To use a different LLM endpoint, augment the LLMFactory with the model interface and its parameters. See LLMFactory for details.