Pipeline Overview
Definition
To evaluate your custom AI application pipeline, you first need to define it using the Pipeline classes.
A pipeline is a sequence of steps that transform data from one format to another. In a typical AI application, it usually starts with a user instruction, then goes through a series of steps (Module) to return an answer.
The basic component of a pipeline is a Module.
A module is a named component with specific inputs and outputs.
It can be a simple function or a complex model that takes some input and returns some output.
To define a module you need to specify the following:
name: a unique name for the module
input: the input of the module; it can be a dataset field (DatasetField, see dataset page), another module, or nothing (None)
output: the output type (e.g., str, List[str], Dict[str, str], etc.)
description: an optional string describing the field
eval: an optional list of metrics (see next page)
tests: an optional list of tests (see next page)
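The fields above can be pictured with a minimal sketch. The Module and DatasetField classes below are hypothetical stand-ins written for illustration only; they mirror the listed fields but are not the library's actual classes, whose import path, defaults, and behavior may differ.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(frozen=True)
class DatasetField:
    """Hypothetical stand-in for a reference to one field of a dataset."""
    name: str


@dataclass
class Module:
    """Hypothetical stand-in that mirrors the fields listed above."""
    name: str                          # unique name for the module
    input: Any                         # DatasetField, another Module, or None
    output: Any                        # output type, e.g. str or List[str]
    description: Optional[str] = None  # optional free-text description
    eval: List[Any] = field(default_factory=list)   # optional metrics
    tests: List[Any] = field(default_factory=list)  # optional tests


# A module that reads the (assumed) "question" dataset field
# and returns a list of document strings.
question = DatasetField("question")
retriever = Module(
    name="retriever",
    input=question,
    output=List[str],
    description="Fetches documents relevant to the question",
)
```

Only name, input, and output are required in this sketch; the remaining fields default to empty, matching the "optional" wording in the list.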
Through the Pipeline class, you can define a sequence of modules that represent your application pipeline.
Example
Consider the following pipeline example:
This Retrieval-Augmented Generation (RAG) pipeline consists of three simple modules: a Retriever that fetches the relevant documents, a Reranker that reorders and filters the documents, and a Generator that uses an LLM to generate a response based on the information in the documents.
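As a rough sketch of how such a three-module pipeline could be wired together, the example below chains a retriever, a reranker, and a generator. The Module and Pipeline classes here are simplified, hypothetical stand-ins (not the library's real classes), the module names and output types are illustrative, and the plain string "question" is an assumed placeholder for a dataset field.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(eq=False)
class Module:
    """Hypothetical stand-in: a named step with an input and an output type."""
    name: str
    input: Any   # dataset field placeholder, another Module, or None
    output: Any  # output type of the step


class Pipeline:
    """Hypothetical stand-in: an ordered sequence of modules."""

    def __init__(self, modules: List[Module]):
        names = set()
        seen_ids = set()
        for module in modules:
            # Module names must be unique within a pipeline.
            if module.name in names:
                raise ValueError(f"duplicate module name: {module.name}")
            # A module may only consume a module defined earlier in the sequence.
            if isinstance(module.input, Module) and id(module.input) not in seen_ids:
                raise ValueError(f"{module.name} depends on a module not yet defined")
            names.add(module.name)
            seen_ids.add(id(module))
        self.modules = list(modules)

    def order(self) -> List[str]:
        """Return the module names in execution order."""
        return [module.name for module in self.modules]


# "question" stands in for a dataset field holding the user instruction.
retriever = Module(name="retriever", input="question", output=List[str])
reranker = Module(name="reranker", input=retriever, output=List[str])
generator = Module(name="generator", input=reranker, output=str)

pipeline = Pipeline([retriever, reranker, generator])
print(" -> ".join(pipeline.order()))
```

Each module takes the previous module as its input, so the data flows from the user question through retrieval and reranking to the generated answer.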