Evaluation

What Works in Synthetic Alignment: Evidence and Scorecard

The verdict is in. We deliver a scorecard on synthetic alignment, assessing which of RLHF's limitations have been solved and which remain, backed by six key empirical insights.

Jean Michel A. Sarr