The first and second generations of the Neural Turing Machine


A computer has a CPU and RAM.

A Differentiable Neural Computer (DNC) has a neural network as its controller, which takes the role of the CPU.
The memory is an $N \times W$ matrix that takes the role of the RAM, where $N$ is the number of locations and $W$ is the length of each piece of memory.

Memory augmentation and attention mechanism

Episodic memories, or event memories, are known to depend on the hippocampus in the human brain.

The main point is that the memory of the network is external to the network itself.

The attention mechanism defines distributions over the $N$ locations.
The $i$-th component of a weighting vector communicates how much attention the controller should give to the content in the $i$-th location of the memory.
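
To make the role of a weighting concrete, here is a minimal sketch of a read, assuming the read result is simply the weighted sum of the memory rows. The names `read_memory`, `memory`, and `weighting` are made up for this illustration, and the numbers are toy values.

```python
def read_memory(memory, weighting):
    """Return the read vector: the weighted sum of the N memory rows."""
    width = len(memory[0])
    read_vector = [0.0] * width
    for row, weight in zip(memory, weighting):
        for j in range(width):
            read_vector[j] += weight * row[j]
    return read_vector


# N = 3 locations, W = 2 values per location (toy numbers).
memory = [[1.0, 0.0],
          [0.0, 1.0],
          [2.0, 2.0]]
weighting = [0.5, 0.5, 0.0]  # attention split evenly between locations 0 and 1
read_vector = read_memory(memory, weighting)
print(read_vector)
```

Because the read is a smooth function of the weighting, gradients can flow through it, which is what keeps the memory access differentiable.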


Every unit and operation in this structure is differentiable.


When the controller wants to do something that involves memory, it doesn’t just look at every location of the memory.
Instead, it focuses its attention on those locations which contain the information it is looking for.

The weighting produced for an input is a distribution over the $N$ locations that expresses their relative importance in a particular process (reading or writing).
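
For the writing case, a weighting can be sketched the same way. The snippet below assumes the erase-then-add form of a soft write, where each location is partly erased and partly overwritten in proportion to its weight. The erase vector and write vector are not described in these notes, so `erase_vector`, `write_vector`, and `write_memory` are assumptions for the example.

```python
def write_memory(memory, weighting, erase_vector, write_vector):
    """Return a new memory after a soft write at the weighted locations.

    Each entry becomes m * (1 - w * e) + w * v, where w is the weight of
    the row, e the erase amount and v the new value for that column.
    """
    return [
        [m * (1.0 - w * e) + w * v
         for m, e, v in zip(row, erase_vector, write_vector)]
        for row, w in zip(memory, weighting)
    ]


memory = [[1.0, 1.0],
          [1.0, 1.0]]
weighting = [1.0, 0.0]     # write only to location 0
erase_vector = [1.0, 0.0]  # fully erase column 0, keep column 1
write_vector = [5.0, 2.0]
new_memory = write_memory(memory, weighting, erase_vector, write_vector)
print(new_memory)
```

A location with weight 0 is left untouched, so the weighting alone decides where the write lands.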

Note that the weightings are produced by means of a vector emitted by the controller, which is called the interface vector.

Three interactions between controller and memory

The interactions between the controller and the memory are mediated by the interface vector.

Content lookup

A particular set of values within the interface vector, collected in what is called the key vector, is compared to the content of each location. This comparison is made by means of a similarity measure.
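
A minimal sketch of content lookup, assuming cosine similarity as the similarity measure and a softmax to turn the scores into a weighting. The `strength` parameter (how sharply the weighting peaks) and the function names are assumptions for this example, not something these notes define.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between u and v (eps avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def content_weighting(memory, key, strength=1.0):
    """Softmax over the scaled similarities between the key and each row."""
    scores = [strength * cosine_similarity(row, key) for row in memory]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


memory = [[1.0, 0.0],
          [0.0, 1.0],
          [1.0, 1.0]]
key = [1.0, 0.0]
weights = content_weighting(memory, key, strength=10.0)
print(weights)
```

Location 0 matches the key exactly and gets most of the weight, location 2 matches partially, and location 1 is orthogonal to the key and gets the least.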

Temporal memory linkage

The transitions between consecutively written locations are recorded in an $N \times N$ matrix, called the temporal link matrix $L$. The order in which the controller writes to the memory is information in itself, and it is something we want to store.

The DNC stores the ‘temporal link’ to keep track of the order in which things were written, and records the current ‘usage’ level of each memory location.
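
The link bookkeeping can be sketched as follows. This assumes the convention that `link[i][j]` is close to 1 when location `i` was written right after location `j`, and it uses a helper `precedence` vector that remembers how recently each location was written. Both the helper and the function name are assumptions for the example.

```python
def update_links(link, precedence, write_weighting):
    """Update the N x N link matrix and the precedence vector after one write."""
    n = len(write_weighting)
    new_link = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a location is never linked to itself
                decay = 1.0 - write_weighting[i] - write_weighting[j]
                new_link[i][j] = (decay * link[i][j]
                                  + write_weighting[i] * precedence[j])
    total = sum(write_weighting)
    new_precedence = [(1.0 - total) * p + w
                      for p, w in zip(precedence, write_weighting)]
    return new_link, new_precedence


n = 3
link = [[0.0] * n for _ in range(n)]
precedence = [0.0] * n
# Write to location 0 first, then to location 2.
for w in ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]):
    link, precedence = update_links(link, precedence, w)
print(link[2][0])
```

After the two writes, `link[2][0]` is 1.0, recording that location 2 was written immediately after location 0, and `precedence` points at location 2 as the most recent write.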

Dynamic memory allocation

Each location has a usage level represented as a number from 0 to 1. A weighting that picks out an unused location is sent to the write head, so that it knows where to store new information. The word “dynamic” refers to the ability of the controller to reallocate memory that is no longer required, erasing its content.
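
A minimal sketch of how a weighting that picks out an unused location could be computed from the usage levels. It assumes the locations are visited from least used to most used, and that each one receives whatever share is left after the less-used locations before it. The function name `allocation_weighting` is made up for this example.

```python
def allocation_weighting(usage):
    """Return a weighting that favours the least-used locations."""
    # Indices sorted from least used to most used.
    order = sorted(range(len(usage)), key=lambda i: usage[i])
    allocation = [0.0] * len(usage)
    running_product = 1.0  # product of the usages of the locations seen so far
    for index in order:
        allocation[index] = (1.0 - usage[index]) * running_product
        running_product *= usage[index]
    return allocation


usage = [1.0, 0.0, 0.5]  # location 1 is completely free
allocation = allocation_weighting(usage)
print(allocation)
```

With these usage levels all of the weight goes to location 1, the only completely unused one. If the usage of a location later drops back toward 0 because its content is no longer required, the same computation starts offering it for writing again.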