Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

(github.com)

262 points | by xlayn 4 days ago

74 comments