Potemkin Understanding in LLMs: New Study Reveals Flaws in AI Benchmarks

(socket.dev)

7 points | by akyuu 15 hours ago

2 comments