Attention mechanism in Deep Learning, Explained

By KDnuggets - 2021-02-09

Description

Attention is a powerful mechanism developed to enhance the performance of the Encoder-Decoder architecture on neural network-based machine translation tasks. Learn more about how this process works and ...

Summary

  • Attention is a powerful mechanism developed to enhance the performance of the Encoder-Decoder architecture on neural network-based machine translation tasks.
  • This helps the model cope efficiently with long input sentences.
  • By multiplying each encoder hidden state with its softmax score (a scalar), we obtain an alignment vector (also called an annotation vector).
  • The alignment vectors are summed up to produce the context vector.
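The weighting-and-summing steps in the bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, not the article's implementation; the shapes (T encoder steps, hidden size d) and raw score values are hypothetical.

```python
import numpy as np

def attention_context(encoder_states, scores):
    """encoder_states: (T, d) hidden states; scores: (T,) raw alignment scores."""
    # Softmax turns the raw scores into weights that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Scaling each hidden state by its scalar softmax weight gives the
    # alignment (annotation) vectors; summing them yields the context vector.
    alignment_vectors = weights[:, None] * encoder_states
    context = alignment_vectors.sum(axis=0)
    return weights, context

# Hypothetical example: 3 encoder steps, hidden size 4.
states = np.arange(12, dtype=float).reshape(3, 4)
weights, context = attention_context(states, np.array([0.1, 2.0, -1.0]))
```

The context vector ends up dominated by the hidden states with the highest scores, which is how the decoder can "attend" to the most relevant input positions.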


Topics

  1. NLP (0.29)
  2. Machine_Learning (0.28)
  3. Backend (0.06)

Similar Articles

By RStudio AI Blog - 2020-12-22

In forecasting spatially-determined phenomena (the weather, say, or the next frame in a movie), we want to model temporal evolution, ideally using recurrence relations. At the same time, we'd like to ...

Rethinking Attention with Performers

By Google AI Blog - 2020-10-23

Posted by Krzysztof Choromanski and Lucy Colwell, Research Scientists, Google Research Transformer models have achieved state-of-the-art...

Markov Chain

By DeepAI - 2019-05-17

Markov chains are used to model probabilities using information that can be encoded in the current state. Something transitions from one state to another semi-randomly, or stochastically.
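The state-to-state transitions described above can be sketched with a tiny two-state chain. The states and transition probabilities here are hypothetical, chosen only to illustrate that the next state depends solely on the current one.

```python
import random

# Hypothetical two-state weather chain: each row gives the possible
# next states and their probabilities, conditioned only on the current state.
TRANSITIONS = {
    "sunny": [("sunny", 0.8), ("rainy", 0.2)],
    "rainy": [("sunny", 0.4), ("rainy", 0.6)],
}

def step(state):
    """Pick the next state stochastically from the current state's row."""
    states, probs = zip(*TRANSITIONS[state])
    return random.choices(states, weights=probs)[0]

# Simulate a short walk through the chain.
state = "sunny"
walk = [state]
for _ in range(5):
    state = step(state)
    walk.append(state)
```

Note that `step` never looks at the walk's history — the Markov property in one line.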