EleutherAI (1 year ago): Experiments in Weak-to-Strong Generalization. Writing up results from a recent project.
EleutherAI (1 year ago): Free Form Least-Squares Concept Erasure Without Oracle Concept Labels. Achieving even more surgical edits than LEACE without concept labels at inference time.