Research

Why RLHF Can't Scale: Understanding the Fundamental Limitations

Examining why RLHF faces fundamental limitations across scalability, human judgment quality, reward models, and governance that constrain the development of more capable AI …

Jean Michel A. Sarr • Nov 13, 2025 • 10 min read

What Works in Synthetic Alignment: Evidence and Scorecard

The verdict is in. We deliver a scorecard on synthetic alignment, assessing which of RLHF's limitations have been solved and which remain, backed by six key empirical insights.

Jean Michel A. Sarr • Nov 13, 2025 • 8 min read

The Path Forward: Five Critical Research Frontiers

Exploring five critical research frontiers: meta-alignment, post-deployment adaptation, breaking the iteration ceiling, judge bias auditing, and co-evolution dynamics.

Jean Michel A. Sarr • Nov 13, 2025 • 17 min read