Predicting the Order of Upcoming Tokens Improves Language Modeling

(arxiv.org)

6 points | by wavelander a day ago

1 comment