Following up on the Haskell benchmark, I ran the same
orchestration setup on the same Advent of Code (AoC) 2025 Days 1–5 puzzles, this time requiring solutions in
OCaml. The methodology is identical: each model gets an isolated directory and a puzzle
description, and must write its final answer to ANSWER.txt. A wrong or missing answer means ejection.
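
To make the pass/eject rule concrete, here is a minimal sketch of how such a check could look. It is not the actual harness code: the function name `survives`, the Python implementation, and the decision to ignore surrounding whitespace when comparing answers are all assumptions made for illustration.

```python
from pathlib import Path


def survives(workdir: Path, expected: str) -> bool:
    """Return True if the model stays in, False if it is ejected.

    Hypothetical helper: `workdir` is the model's isolated directory and
    `expected` is the known-correct puzzle answer.
    """
    answer_file = workdir / "ANSWER.txt"

    # No answer file at all counts as "no answer" and means ejection.
    if not answer_file.is_file():
        return False

    # Assumption: leading/trailing whitespace (e.g. a final newline)
    # is ignored when comparing the submitted answer.
    submitted = answer_file.read_text(encoding="utf-8").strip()

    # An empty file is also treated as "no answer".
    if not submitted:
        return False

    return submitted == expected.strip()
```

A runner would call `survives(model_dir, correct_answer)` once per model per puzzle and drop any model for which it returns `False`.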