Adaptive Semiparametric Language Models

By DeepMind - 2021-02-24

Description

We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture. Our model uses extended ...

Summary

  • We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture.
  • We design a gating function to adaptively combine multiple information sources to make a prediction.
  • This mechanism allows the model to draw on local context, short-term memory, or long-term memory (or any combination of them) on an ad hoc basis, depending on the context; see the sketch below.
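
To make the gating idea concrete, here is a minimal illustrative sketch in PyTorch: a module that predicts per-source mixing weights from the current hidden state and forms a convex combination of three next-token distributions. The class name AdaptiveGate, the tensor shapes, and the single-linear-layer gate are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveGate(nn.Module):
    """Hypothetical sketch of a context-dependent gate that mixes three
    next-token distributions: one from the local (parametric) context,
    one from a short-term memory (e.g. a cache of recent hidden states),
    and one from a long-term episodic memory (e.g. a kNN lookup).
    Names and shapes are illustrative assumptions."""

    def __init__(self, hidden_dim: int, num_sources: int = 3):
        super().__init__()
        # One mixing weight per information source, predicted from the
        # current hidden state so the gate can adapt at every time step.
        self.gate = nn.Linear(hidden_dim, num_sources)

    def forward(
        self,
        hidden: torch.Tensor,   # (batch, hidden_dim) current hidden state
        p_local: torch.Tensor,  # (batch, vocab) local-context distribution
        p_short: torch.Tensor,  # (batch, vocab) short-term memory distribution
        p_long: torch.Tensor,   # (batch, vocab) long-term memory distribution
    ) -> torch.Tensor:
        # Softmax over sources gives a convex combination, so the model
        # can rely on any single source or any mixture of them.
        weights = torch.softmax(self.gate(hidden), dim=-1)        # (batch, 3)
        sources = torch.stack([p_local, p_short, p_long], dim=1)  # (batch, 3, vocab)
        return (weights.unsqueeze(-1) * sources).sum(dim=1)       # (batch, vocab)
```

Because the weights come from a softmax, they always sum to one; the gate can collapse onto a single source by driving one weight toward 1, or blend sources when the context warrants it.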

Topics

  1. NLP (0.31)
  2. Machine_Learning (0.12)
  3. UX (0.06)

Similar Articles

UKPLab/EasyNMT

By GitHub - 2021-01-27

Easy to use, state-of-the-art Neural Machine Translation for 100+ languages - UKPLab/EasyNMT

FastFormers: 233x Faster Transformers inference on CPU

By Medium - 2020-11-04

Since the birth of BERT, Transformers have dominated NLP in nearly every language-related task, whether it is Question-Answering, Sentiment Analysis, Text classification or Text…