Skip to main content

Runtime monitor (online evaluation)

The Runtime Monitor feature allows you to evaluate results on production data in real-time. Reference-free metrics should be used in runtime monitors. When you evaluate results on the fly, you won't need reference outputs in datasets to compare against.

Synchronous evaluation

If you need real-time evaluation as part of your system, you can use Relari's synchronous metrics to compute results locally. For example, you can detect hallucinations during runtime with the following code:

note

Synchronous metrics results won't be saved in the Relari cloud and you will need to save the results in your logging system if you want to track the performance this way.

from relari import RelariClient
from relari import Metric

client = RelariClient()

res = client.metrics.compute(
Metric.LLMBasedFaithfulness,
args={
"question": [
"What's SaaS?"
],
"retrieved_context": [
"Software as a service (SaaS) is a way of delivering applications remotely over the internet instead of locally on machines (known as “on-premise” software)."
],
"answer": "SaaS is a way of delivering applications remotely over the internet instead of locally on machines."
},
)

Asynchronous evaluation

For non-intrusive performance monitoring, you can run evaluations asynchronously. This allows you to monitor your system without affecting runtime. You can perform asynchronous evaluations on sampled data points or in batches, as shown below:

import os
from relari import RelariClient
from relari import Metric

client = RelariClient()

sample_batch_data = [
{
"question": "What's SaaS?",
"retrieved_context": [
"Software as a service (SaaS) is a way of delivering applications remotely over the internet instead of locally on machines (known as “on-premise” software)."
],
"answer": "SaaS is a way of delivering applications remotely over the internet instead of locally on machines."
},
{
"question": "What is Python's asyncio module used for?",
"retrieved_context": [
"The asyncio module in Python is used for writing single-threaded concurrent code using coroutines, multiplexing I/O access over sockets and other resources, running network clients and servers, and other related primitives."
],
"answer": "Python's asyncio module provides a framework for writing concurrent code using async/await syntax."
},
]

eval_id = client.evaluations.submit(
project_id=PROJECT_ID,
name=None,
metadata=dict(),
pipeline=[Metric.LLMBasedFaithfulness, Metric.LLMBasedRelevance],
data=sample_batch_data,
)

print(f"Evaluation submitted with ID: {eval_id}")
tip

Sampling We recommend sample production data for regularly monitoring production performance. Adjust the sampling rate based on the volume of data and the resources available.

Monitoring and Analytics

You can pipe asynchronous evaluation results into your monitoring and analytics infrastructure or use the Relari dashboard to track your application's performance.