A Tiny, Exact Lab for Judge-Policy Self-Play
A fully enumerable toy experiment on evaluator drift, policy collapse, and the ceiling on self-improvement in judge-policy co-evolution.
Examining why RLHF faces fundamental limitations in scalability, human judgment quality, reward modeling, and governance, limitations that constrain the development of more capable AI …
The verdict is in. We deliver a scorecard on synthetic alignment, assessing which of RLHF's limitations have been solved and which remain, backed by six key empirical insights.
Exploring five critical research frontiers: meta-alignment, post-deployment adaptation, breaking the iteration ceiling, judge bias auditing, and co-evolution dynamics.