Description
SRU++ makes it possible to design highly expressive and efficient neural models that require very little attention computation.
Summary
- Natural language models have achieved groundbreaking results in NLP and related fields [1, 2, 3, 4].
- Our model obtains better perplexity and bits-per-character (bpc) while using 2.5x-10x less training time and cost compared to top-performing Transformer models.
- In addition, not every SRU++ layer needs attention; attention can be enabled in only a subset of layers (see the sketch after this list).
- Lower numbers are better for all reported metrics.
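
The idea that only some layers need attention can be illustrated with a small sketch. The block below is a minimal PyTorch example, not the official SRU++ implementation: `RecurrentBlock`, `SparseAttentionStack`, and the `attn_every` parameter are hypothetical names, and `nn.GRU` / `nn.MultiheadAttention` are stand-ins for the SRU++ recurrence and its attention component. It only shows how attention can be restricted to every k-th layer of a deep stack.

```python
# Minimal sketch (assumptions noted above): a stack of recurrent layers where
# only every k-th layer adds an attention sub-module, illustrating the idea
# that not every layer needs attention.
import torch
import torch.nn as nn


class RecurrentBlock(nn.Module):
    """One layer: optional self-attention followed by a recurrent update."""

    def __init__(self, d_model: int, n_heads: int, use_attention: bool):
        super().__init__()
        self.use_attention = use_attention
        if use_attention:
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)  # stand-in recurrence

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.use_attention:
            # Attention output feeds the recurrence via a residual connection.
            a, _ = self.attn(x, x, x, need_weights=False)
            x = self.norm(x + a)
        out, _ = self.rnn(x)
        return out


class SparseAttentionStack(nn.Module):
    """Stack of `depth` layers; attention is enabled only every `attn_every` layers."""

    def __init__(self, d_model: int = 256, n_heads: int = 4,
                 depth: int = 10, attn_every: int = 5):
        super().__init__()
        self.layers = nn.ModuleList(
            RecurrentBlock(d_model, n_heads,
                           use_attention=((i + 1) % attn_every == 0))
            for i in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    model = SparseAttentionStack()
    x = torch.randn(2, 32, 256)   # (batch, sequence length, d_model)
    print(model(x).shape)         # torch.Size([2, 32, 256])
```

With `depth=10` and `attn_every=5`, only two of the ten layers compute attention, so the attention cost is a small fraction of the total compute while every layer still performs a recurrent update.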