Example Datasets

Datasets Available

Below are the example datasets you can use.

| Dataset | Description | Data format |
| --- | --- | --- |
| correctness | 1,200 examples, created from InstructQA | `Dataset` |
| retrieval | 300 examples, created from HotpotQA | `Dataset` |
| faithfulness | 544 examples, created from InstructQA | `Dataset` |
| graham_essays/small/txt | 10 Paul Graham essays, created from graham-essays | Zip of txt |
| graham_essays/small/chromadb | OpenAI Embeddings of 395 chunks from 10 Paul Graham essays | Zip of embeddings (in ChromaDB format) |

Download Datasets

The example datasets can be downloaded with the `example_data_downloader` helper function.

from pathlib import Path  # needed for the destination path used below

from continuous_eval.data_downloader import example_data_downloader
# Download a dataset for evaluation
dataset = example_data_downloader("retrieval")
# Download embeddings for dataset generation
db = example_data_downloader("graham_essays/small/chromadb", Path("temp"), force_download=False)
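The two calls above differ by data format: the `Dataset` entries are fetched by name alone, while the zip entries also take a destination path. A minimal sketch of a wrapper that checks the name against the table and forwards the matching call shape might look like the following. `EXAMPLE_DATASETS` and `fetch_example` are hypothetical names introduced here for illustration and are not part of continuous_eval; the downloader is passed in as an argument, so the sketch assumes nothing about the library beyond the two call shapes shown above.

```python
from pathlib import Path
from typing import Any, Callable

# Dataset names and data formats copied from the table in this section.
# This mapping is illustrative only; it is not provided by the library.
EXAMPLE_DATASETS = {
    "correctness": "Dataset",
    "retrieval": "Dataset",
    "faithfulness": "Dataset",
    "graham_essays/small/txt": "zip",
    "graham_essays/small/chromadb": "zip",
}


def fetch_example(
    name: str,
    downloader: Callable[..., Any],
    destination: Path = Path("temp"),
) -> Any:
    """Validate the dataset name, then forward to the given downloader.

    Hypothetical helper: it mirrors the two call shapes shown above and
    makes no claim about what the downloader returns.
    """
    if name not in EXAMPLE_DATASETS:
        known = ", ".join(sorted(EXAMPLE_DATASETS))
        raise ValueError(f"Unknown example dataset {name!r}; expected one of: {known}")

    if EXAMPLE_DATASETS[name] == "Dataset":
        # Evaluation datasets are requested by name only.
        return downloader(name)

    # Zip-based entries are requested with a destination path, without
    # forcing a re-download.
    return downloader(name, destination, force_download=False)
```

With the library installed, `fetch_example("retrieval", example_data_downloader)` would forward to the real helper, and a misspelled dataset name fails early with a message listing the names from the table.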