Co-Evolution

A Tiny, Exact Lab for Judge-Policy Self-Play

A fully enumerable toy experiment on evaluator drift, policy collapse, and the ceiling on self-improvement in judge-policy co-evolution.

Jean Michel A. Sarr