Crucible

Multi-agent evolutionary trading on prediction markets.

The idea

Most quant strategies live or die by a single hypothesis the team believes in. Crucible inverts that: it runs a population of trading agents in parallel, each with a different prompt, model, and decision process, and lets the markets decide which ones survive. Strategies that produce strong risk-adjusted returns are reproduced and mutated; those that don't are culled. Defining the fitness function is the hard open problem.
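One generation of the select, reproduce, mutate, cull loop described above might look like the sketch below. Everything in it is an assumption made for illustration: the `Agent` fields, the prompt-appending `mutate`, the `survive_fraction` parameter, and above all the Sharpe-style `fitness`, which is only a placeholder for the fitness function that is still an open problem here.

```python
import random
import statistics
from dataclasses import dataclass, field


@dataclass
class Agent:
    prompt: str
    model: str
    returns: list[float] = field(default_factory=list)  # per-trade returns


def fitness(agent: Agent) -> float:
    """Placeholder fitness: mean return divided by its standard deviation."""
    if len(agent.returns) < 2:
        return float("-inf")  # too little history to score
    spread = statistics.stdev(agent.returns)
    if spread == 0:
        return float("-inf")  # assumption: a flat history is treated as unscorable
    return statistics.mean(agent.returns) / spread


def mutate(parent: Agent, rng: random.Random) -> Agent:
    """Hypothetical mutation: append one randomly chosen instruction to the prompt."""
    tweaks = [
        "Weigh base rates more heavily.",
        "Avoid thin markets.",
        "Size positions smaller.",
    ]
    # The child starts with an empty return history.
    return Agent(prompt=f"{parent.prompt} {rng.choice(tweaks)}", model=parent.model)


def next_generation(population, survive_fraction=0.5, rng=None):
    """Keep the top fraction by fitness and refill the rest with mutated children."""
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness, reverse=True)
    n_keep = max(1, int(len(ranked) * survive_fraction))
    survivors = ranked[:n_keep]  # everyone below the cut is culled
    children = [
        mutate(rng.choice(survivors), rng)
        for _ in range(len(population) - n_keep)
    ]
    return survivors + children
```

Population size stays constant across generations in this sketch, and survivors carry their return history forward while children start fresh. Both are design choices, not requirements.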

Why prediction markets

Kalshi and Polymarket contracts are cash-settled when a real-world event resolves. Outcomes are unambiguous, contracts are short-lived, and the universe of markets is broad enough that LLM-driven agents can reason across politics, sports, macro events, and culture using the same general toolset. That makes them an unusually clean environment to test whether a population of LLM agents can find edge.

Architecture

Each agent gets its own memory: a relational store for trades, positions, and outcomes, and a vector store for narrative context — articles, prior reasoning, market commentary. Memory is embedded at multiple granularities so an agent can retrieve either a one-line summary or the full reasoning trace behind a past decision.
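To make the multi-granularity idea concrete, here is a minimal in-memory sketch. It is not the Postgres/pgvector implementation: `MemoryRow`, `AgentMemory`, and the two granularity labels are hypothetical names, and `toy_embed` is a deterministic stand-in for whatever embedding model is actually used. The point is only that each decision is stored once per granularity, and retrieval filters by granularity before ranking by similarity.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryRow:
    decision_id: int
    granularity: str  # "summary" or "trace" (assumed labels)
    text: str
    embedding: list[float]


def toy_embed(text: str, dim: int = 16) -> list[float]:
    """Stand-in for a real embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class AgentMemory:
    def __init__(self):
        self.rows: list[MemoryRow] = []

    def remember(self, decision_id: int, summary: str, trace: str) -> None:
        """Store the same decision at two granularities, each with its own embedding."""
        for granularity, text in (("summary", summary), ("trace", trace)):
            self.rows.append(
                MemoryRow(decision_id, granularity, text, toy_embed(text))
            )

    def recall(self, query: str, granularity: str, k: int = 1) -> list[MemoryRow]:
        """Return the k most similar rows at the requested granularity."""
        q = toy_embed(query)
        candidates = [r for r in self.rows if r.granularity == granularity]
        # Vectors are unit length, so the dot product is the cosine similarity.
        candidates.sort(
            key=lambda r: sum(a * b for a, b in zip(q, r.embedding)),
            reverse=True,
        )
        return candidates[:k]
```

An agent that only needs a quick reminder would call `recall(query, "summary")`, while one revisiting a past decision in depth would ask for `"trace"` and get the full reasoning text back.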

An observer pattern sits alongside the agents. After every decision, a cheaper model (Haiku) classifies the agent's reasoning along several axes — what kind of signal it claimed to be acting on, how confident it was, what category of market it touched. This produces a clean dataset for analyzing why a given lineage of agents wins or loses, separate from the trading loop itself.
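The observer described above could be wired up roughly as follows. This is a sketch under stated assumptions: the label sets, the `Observation` fields, and the `Observer` class are hypothetical, and the classifier is injected as a plain callable so that no particular model API is implied. `keyword_classifier` is a trivial stand-in for the cheaper model, included only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed label sets; the real axes and values may differ.
SIGNAL_TYPES = {"news", "price_action", "base_rate", "other"}
CONFIDENCE_LEVELS = {"low", "medium", "high"}
MARKET_CATEGORIES = {"politics", "sports", "macro", "culture", "other"}


@dataclass(frozen=True)
class Observation:
    agent_id: str
    decision_id: int
    signal_type: str
    confidence: str
    market_category: str


def _pick(value, allowed: set) -> str:
    """Keep only known labels so the resulting dataset stays clean."""
    return value if value in allowed else "unknown"


class Observer:
    def __init__(self, classify: Callable[[str], dict]):
        self.classify = classify  # in practice, a call to the cheaper model
        self.log: list[Observation] = []

    def on_decision(self, agent_id: str, decision_id: int, reasoning: str) -> Observation:
        """Classify one decision's reasoning and append it to the analysis log."""
        raw = self.classify(reasoning)
        obs = Observation(
            agent_id,
            decision_id,
            signal_type=_pick(raw.get("signal_type"), SIGNAL_TYPES),
            confidence=_pick(raw.get("confidence"), CONFIDENCE_LEVELS),
            market_category=_pick(raw.get("market_category"), MARKET_CATEGORIES),
        )
        self.log.append(obs)
        return obs


def keyword_classifier(reasoning: str) -> dict:
    """Stand-in classifier for the example; a real one would prompt a model."""
    text = reasoning.lower()
    return {
        "signal_type": "news" if "news" in text else "other",
        "confidence": "high" if "high conviction" in text else "medium",
        "market_category": "politics" if "poll" in text or "election" in text else "other",
    }
```

Because the observer only reads the reasoning after the fact and writes to its own log, it never influences the trade itself, which keeps the analysis dataset separate from the trading loop.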

Status

Infrastructure is in place; the active work is on the fitness function and the selection mechanism. The interesting question isn't whether LLMs can trade. It's whether selection pressure across a population produces strategies that generalize beyond the markets they were selected on.

Python, Postgres, pgvector, Claude (Sonnet + Haiku), Kalshi & Polymarket APIs.