Research

Why RLHF Can't Scale: Understanding the Fundamental Limitations

Examining why RLHF faces fundamental limitations across scalability, human judgment quality, reward models, and governance that constrain the development of more capable AI …

Jean Michel A. Sarr

What Works in Synthetic Alignment: Evidence and Scorecard

The verdict is in. We deliver a scorecard on synthetic alignment, assessing which of RLHF's limitations have been solved and which remain, backed by six key empirical insights.

Jean Michel A. Sarr

The Path Forward: Five Critical Research Frontiers

Exploring five critical research frontiers: meta-alignment, post-deployment adaptation, breaking the iteration ceiling, judge bias auditing, and co-evolution dynamics.

Jean Michel A. Sarr