Colaberry·Library

Evaluation Datasets

Datasets used to score agent / prompt quality.

1 vetted

🔍 Tags across Evaluation Datasets

Order = frequency (matches) · Size = recency (newer → bigger) · Color = average rating · Tilt = vetted ratio (upright → vetted)
Filtering by tag: colaberry
Source: execution/ops_platform/evaluation.py
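The tag legend above names four per-tag quantities: frequency, recency, average rating, and vetted ratio. A minimal sketch of how such statistics could be computed is shown below. The `Item` record, its field names, and the `tag_stats` function are hypothetical and are not taken from `evaluation.py`; the real inventory schema may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    """Hypothetical inventory entry; field names are assumptions."""
    tags: list[str]
    added: date      # when the dataset was added (drives recency)
    rating: float    # reviewer rating (drives color)
    vetted: bool     # whether the entry is vetted (drives tilt)


def tag_stats(items):
    """Aggregate per-tag values for the four legend dimensions."""
    acc = {}
    for item in items:
        for tag in item.tags:
            s = acc.setdefault(
                tag,
                {"count": 0, "newest": item.added, "rating_sum": 0.0, "vetted": 0},
            )
            s["count"] += 1
            s["newest"] = max(s["newest"], item.added)
            s["rating_sum"] += item.rating
            s["vetted"] += int(item.vetted)

    rows = [
        {
            "tag": tag,
            "frequency": s["count"],                      # order
            "newest": s["newest"],                        # size
            "avg_rating": s["rating_sum"] / s["count"],   # color
            "vetted_ratio": s["vetted"] / s["count"],     # tilt
        }
        for tag, s in acc.items()
    ]
    # Most frequent tags first; ties broken alphabetically.
    rows.sort(key=lambda r: (-r["frequency"], r["tag"]))
    return rows
```

Mapping these numbers to an actual font size, color scale, or rotation angle is a rendering choice that the legend does not specify, so the sketch stops at the raw statistics.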