Supervised fine tuning on curated data is reinforcement learning

(arxiv.org)

71 points | by GabrielBianconi 4 days ago

19 comments