Melvin's digital garden

Learning to transduce with unbounded memory

speaker: Alex Ognev
event: Papers We Love SG

** key contributions

  • implements continuously differentiable analogues of stacks, queues, and deques
  • superior generalization compared to deep recurrent NNs

** recurrent neural network

  • output and hidden state at time t are fed back as input at time t+1
  • has memory and can learn sequences of input
  • difficult to train/scale
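A minimal numpy sketch of that recurrence (not the paper's model; the weight names and toy dimensions here are made up for illustration): the hidden state computed at time t is fed back as input at t+1, which is where the memory comes from.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step: mix the current input with the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# toy dimensions and random weights, purely for illustration
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
W_x = rng.normal(size=(hidden_dim, input_dim))
W_h = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):  # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_x, W_h, b)        # the state at time t is fed back at t+1
```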

** long short-term memory

has memory cells controlled by special gates

  • input gate
  • forget gate
  • output gate

network learns when to remember values
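A sketch of that gating idea, assuming the standard LSTM step equations (the dict-of-weights layout is just for readability, not any particular library's API): the input, forget, and output gates let the network learn what to write into its memory cells, what to keep, and what to expose.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One standard LSTM step; W, U, b are dicts of weights keyed by 'i', 'f', 'o', 'g'."""
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate: how much to write
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate: how much to keep
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate: how much to expose
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate value for the cell
    c = f * c_prev + i * g   # memory cell: gated mix of old content and new candidate
    h = o * np.tanh(c)       # hidden state: gated read-out of the cell
    return h, c
```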

** continuous neural stack

values have a certainty

  • pop with certainty u_t
  • push value v_t with certainty p_t
  • read value at time t: weighted sum of values from the top until total certainty reaches 1
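A small numpy sketch of that mechanism, forward pass only and using the note's p_t/u_t notation (the paper expresses the same updates with min/max terms so that everything stays differentiable, and the values and certainties come from a learned controller rather than being hand-set as here).

```python
import numpy as np

class ContinuousStack:
    """Sketch of a continuous stack: every stored value carries a certainty (strength)."""

    def __init__(self):
        self.values = []      # stored vectors, bottom to top
        self.strengths = []   # certainty attached to each stored vector

    def step(self, v_t, p_t, u_t):
        """Pop with certainty u_t, push v_t with certainty p_t, then read."""
        # pop: remove u_t worth of certainty, starting from the top
        remaining = u_t
        for i in reversed(range(len(self.strengths))):
            removed = min(self.strengths[i], remaining)
            self.strengths[i] -= removed
            remaining -= removed
        # push: the new value sits on top with certainty p_t
        self.values.append(v_t)
        self.strengths.append(p_t)
        # read: weighted sum of values from the top until total certainty reaches 1
        read = np.zeros_like(v_t)
        budget = 1.0
        for value, strength in zip(reversed(self.values), reversed(self.strengths)):
            weight = min(strength, budget)
            read += weight * value
            budget -= weight
        return read

stack = ContinuousStack()
r1 = stack.step(np.array([1.0, 0.0]), p_t=1.0, u_t=0.0)  # push [1, 0] fully; read is [1, 0]
r2 = stack.step(np.array([0.0, 1.0]), p_t=0.5, u_t=0.0)  # push [0, 1] with certainty 0.5
# r2 == [0.5, 0.5]: the read takes 0.5 from the top value, then 0.5 more from the one below
```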

a neural network learns a controller for the data structure (see the sketch after the advantages list)

** advantages

  • memory is unbounded
  • the stack is suitable for input with hierarchical structure, such as languages
  • fewer parameters to learn during training as compared to an LSTM
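A rough sketch of what "learning a controller" means, under assumed shapes and made-up parameter names (a plain tanh recurrence stands in for whatever recurrent controller is actually used): at each step the controller sees the current input and the previous stack read, and emits the value to push plus the push and pop certainties that the stack sketch above consumes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def controller_step(x_t, r_prev, h_prev, p):
    """One step of a hypothetical recurrent controller driving the continuous stack.

    Inputs: current symbol x_t, previous stack read r_prev, previous hidden state h_prev.
    Outputs: new hidden state plus the signals the stack needs at this step.
    All parameter names in `p` are assumptions made for this sketch.
    """
    z = np.concatenate([x_t, r_prev])
    h = np.tanh(p['W_h'] @ z + p['U_h'] @ h_prev + p['b_h'])  # recurrent state update
    v_t = np.tanh(p['W_v'] @ h + p['b_v'])                    # value to push
    p_t = sigmoid(p['W_p'] @ h + p['b_p']).item()             # push certainty in (0, 1)
    u_t = sigmoid(p['W_u'] @ h + p['b_u']).item()             # pop certainty in (0, 1)
    return h, v_t, p_t, u_t  # feed v_t, p_t, u_t to the stack; its read becomes r at t+1
```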
