@hackage / shikumi-eval

Typed evaluation framework for shikumi LM programs (EP-8)

Latest 0.1.0.0

Metadata

Last updated 13 Jun 2026, by shinzui
License BSD-3-Clause
Maintained by: nadeem@gmail.com
Lottery factor: 1

Installation

Add this line in your cabal file

Readme

The evaluation framework for shikumi. It provides the owned data model (Example, Prediction, Dataset, Metric, Score, Report; MasterPlan integration point #5), built-in pure and LM-backed metrics, and an evaluate runner that scores a Shikumi.Program.Program over a typed dataset with bounded parallelism and per-example error boundaries. It also provides golden testing, which pins a program's behaviour deterministically under a mock or replayed LM.
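To make the runner's shape concrete, here is a minimal sketch of what "bounded parallelism and per-example error boundaries" could look like, using only base. It is not the shikumi-eval API: every type and function below (Example, Prediction, Score, Metric, Program, Report, exactMatch, evaluateProgram) is a hypothetical stand-in, and Program is simplified to a plain function rather than the real Shikumi.Program.Program.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (SomeException, bracket_, evaluate, try)

-- Hypothetical stand-ins; the real shikumi-eval types may differ.
data Example i o = Example { exInput :: i, exExpected :: o }
newtype Prediction o = Prediction { predOutput :: o }
newtype Score = Score Double deriving (Eq, Show)

-- A pure metric compares one example with one prediction.
type Metric i o = Example i o -> Prediction o -> Score

-- Assumption: a program is modelled as a plain effectful function.
type Program i o = i -> IO o

-- One entry per example (Left = that example failed), plus the mean
-- over the examples that were scored successfully.
data Report = Report
  { reportScores :: [Either String Score]
  , reportMean   :: Double
  } deriving (Eq, Show)

-- A built-in-style pure metric: 1 on exact match, 0 otherwise.
exactMatch :: Eq o => Metric i o
exactMatch ex p = Score (if exExpected ex == predOutput p then 1 else 0)

-- Score every example, running at most maxParallel programs at once.
-- An exception in one example is recorded and does not abort the run.
evaluateProgram
  :: Int -> Metric i o -> Program i o -> [Example i o] -> IO Report
evaluateProgram maxParallel metric program dataset = do
  sem     <- newQSem maxParallel
  slots   <- mapM (spawn sem) dataset   -- one result slot per example
  results <- mapM takeMVar slots        -- collected in dataset order
  let ok   = [s | Right (Score s) <- results]
      mean = if null ok then 0 else sum ok / fromIntegral (length ok)
  pure (Report results mean)
  where
    spawn sem ex = do
      slot <- newEmptyMVar
      _ <- forkIO $ do
        -- The semaphore bounds concurrency; try is the error boundary.
        r <- try (bracket_ (waitQSem sem) (signalQSem sem) (scoreOne ex))
               :: IO (Either SomeException Score)
        putMVar slot (either (Left . show) Right r)
      pure slot
    scoreOne ex = do
      out <- program (exInput ex)
      -- Force the score here so errors in pure code are caught too.
      evaluate (metric ex (Prediction out))
```

With a program that doubles its input but throws on 3, evaluating the examples (1, 2), (2, 5), (3, 6) with exactMatch yields one match, one miss, and one recorded failure, for a mean of 0.5 over the two scored examples. Excluding failed examples from the mean is a choice made for this sketch; a real runner might count them as zero or report them separately.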
