Reinforcement Learning from Human Feedback (RLHF) in Notebooks

(github.com)

72 points | by ash_at_hny 3 days ago

3 comments