Show HN: Llama-8B Teaches Itself Baby Steps to Deep Research Using RL

(github.com)

39 points | by diegocaples 4 days ago

3 comments