Learning to transduce with unbounded memory
speaker: Alex Ognev
event: Papers We Love SG

** key contributions
- implements continuously differentiable analogues of stacks, queues, and deques (double-ended queues)
- superior generalization compared to deep recurrent NNs

** recurrent neural network
- the output and hidden state at time t are fed back at time t+1
- has memory and can learn sequences of input
- difficult to train/scale
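A minimal sketch of that recurrence in NumPy: the hidden state produced at time t is fed back at time t+1, so the network carries a memory of the sequence. The weight shapes and initialisation below are illustrative assumptions, not from the talk.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    # New hidden state mixes the current input with the previous hidden state.
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    # Output at time t is a linear readout of the hidden state.
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

x_dim, h_dim, y_dim = 4, 8, 3
rng = np.random.default_rng(0)
W_xh = 0.1 * rng.standard_normal((h_dim, x_dim))
W_hh = 0.1 * rng.standard_normal((h_dim, h_dim))
W_hy = 0.1 * rng.standard_normal((y_dim, h_dim))
b_h, b_y = np.zeros(h_dim), np.zeros(y_dim)

h = np.zeros(h_dim)                          # initial memory
for x_t in rng.standard_normal((5, x_dim)):  # toy sequence of 5 inputs
    h, y = rnn_step(x_t, h, W_xh, W_hh, W_hy, b_h, b_y)
```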
** long short-term memory
has memory cells controlled by special gates:
- input gate
- forget gate
- output gate
the network learns when to remember values
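A minimal LSTM cell sketch in NumPy (illustrative, not the talk's exact formulation): the input, forget, and output gates decide what to write to, keep in, and read from the memory cell c.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One combined weight matrix produces the four gate pre-activations.
    z = W @ np.concatenate([x_t, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate value to write
    c_t = f * c_prev + i * g                      # forget old memory, add new
    h_t = o * np.tanh(c_t)                        # gated read from the cell
    return h_t, c_t

x_dim, h_dim = 4, 8
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4 * h_dim, x_dim + h_dim))
b = np.zeros(4 * h_dim)

h, c = np.zeros(h_dim), np.zeros(h_dim)
for x_t in rng.standard_normal((5, x_dim)):
    h, c = lstm_step(x_t, h, c, W, b)
```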
** continuous neural stack
stored values each have a certainty (strength)
- pop with certainty u_t
- push value v_t with certainty p_t
- read value at time t: weighted sum of values from the top until total certainty reaches 1
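A minimal sketch of that data structure in plain NumPy, following the pop/push/read rules from the slides; the class name and API are my own illustrative choices, not the paper's notation.

```python
import numpy as np

class NeuralStack:
    """Continuous-stack sketch: every stored value carries a certainty."""
    def __init__(self, dim):
        self.V = np.zeros((0, dim))  # value vectors, bottom of stack first
        self.s = np.zeros(0)         # certainty (strength) of each value

    def step(self, v, p, u):
        """Pop with certainty u, push value v with certainty p, then read."""
        # Pop: remove up to u units of certainty, starting from the top.
        s = self.s.copy()
        remaining = u
        for i in reversed(range(len(s))):
            removed = min(s[i], remaining)
            s[i] -= removed
            remaining -= removed
        # Push: the new value sits on top with certainty p.
        self.V = np.vstack([self.V, v])
        self.s = np.append(s, p)
        # Read: weighted sum of values from the top until total certainty is 1.
        r = np.zeros(self.V.shape[1])
        budget = 1.0
        for i in reversed(range(len(self.s))):
            w = min(self.s[i], budget)
            r += w * self.V[i]
            budget -= w
        return r

stack = NeuralStack(dim=2)
r1 = stack.step(np.array([1.0, 0.0]), p=0.8, u=0.0)  # read is 0.8 * [1, 0]
r2 = stack.step(np.array([0.0, 1.0]), p=0.5, u=0.1)  # pops a little, then pushes
```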
a recurrent neural network learns a controller for the data structure (see the wiring sketch after the advantages below)

** advantages
- memory is unbounded
- a stack is suitable for input with hierarchical structure, such as languages
- fewer parameters to learn during training as compared to an LSTM
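A rough sketch of the controller wiring mentioned above: a recurrent controller maps its state to a pop certainty u_t, a push certainty p_t, a value v_t to push, and an output, and receives the previous stack read r_{t-1} back as input. It reuses the NeuralStack class from the earlier sketch; all weight names and shapes are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def controller_step(x_t, r_prev, h_prev, params, stack):
    """One step of an RNN controller driving the neural stack."""
    W_h, W_u, W_p, W_v, W_o = params
    # The controller sees the input, the previous stack read, and its own state.
    h = np.tanh(W_h @ np.concatenate([x_t, r_prev, h_prev]))
    u = sigmoid(W_u @ h)[0]   # pop certainty u_t
    p = sigmoid(W_p @ h)[0]   # push certainty p_t
    v = np.tanh(W_v @ h)      # value v_t to push
    r = stack.step(v, p, u)   # interact with the unbounded memory
    o = W_o @ h               # network output at time t
    return h, r, o

# Toy dimensions; NeuralStack is the class from the sketch above.
x_dim, h_dim, m_dim, y_dim = 4, 8, 2, 3
rng = np.random.default_rng(0)
params = (0.1 * rng.standard_normal((h_dim, x_dim + m_dim + h_dim)),
          0.1 * rng.standard_normal((1, h_dim)),
          0.1 * rng.standard_normal((1, h_dim)),
          0.1 * rng.standard_normal((m_dim, h_dim)),
          0.1 * rng.standard_normal((y_dim, h_dim)))

stack = NeuralStack(dim=m_dim)
h, r = np.zeros(h_dim), np.zeros(m_dim)
for x_t in rng.standard_normal((5, x_dim)):
    h, r, o = controller_step(x_t, r, h, params, stack)
```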