Tree Search Distillation for Language Models Using PPO

(ayushtambde.com)

69 points | by at2005 13 hours ago

5 comments