Transformers Without Normalization

(arxiv.org)

2 points | by fzliu 16 hours ago

1 comment