Optimizing a cognitive model
The purpose of this example is to illustrate how NengoDL can be used to optimize a more complex cognitive model, involving the retrieval of information from highly structured semantic pointers. We will create a network that takes a collection of information (encoded as semantic pointers) as input, and train it to retrieve a specific element from that collection.
[1]:
%matplotlib inline
from urllib.request import urlretrieve
import zipfile
import matplotlib.pyplot as plt
import nengo
import nengo.spa as spa
import numpy as np
import tensorflow as tf
import nengo_dl
The first thing to do is define a function that produces random examples of structured semantic pointers. Each example consists of a collection of role-filler pairs of the following form:
\(TRACE_0 = \sum_{j=0}^{N-1} Role_{0,j} \circledast Filler_{0,j}\)
where terms like \(Role\) refer to simpler semantic pointers (i.e., random vectors), the \(\circledast\) symbol denotes circular convolution, and the summation means vector addition. That is, we define different pieces of information consisting of Roles and Fillers, and then we sum the information together in order to generate the full trace. As an example of how this might look in practice, we could encode information about a dog as
\(DOG = COLOUR \circledast BROWN + LEGS \circledast FOUR + TEXTURE \circledast FURRY + ...\)
The goal of the system is then to retrieve a cued piece of information from the semantic pointer. For example, if we gave the network the trace \(DOG\) and the cue \(COLOUR\) it should output \(BROWN\).
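To make the trace construction concrete, here is a minimal NumPy sketch (separate from the Nengo model built below) that assembles a two-pair trace by hand. The helper names (unit_vector, circ_conv) are ours for illustration, not part of the Nengo API:

import numpy as np

rng = np.random.RandomState(0)
d = 32

def unit_vector(d, rng):
    # random vector scaled to unit length, standing in
    # for a semantic pointer
    v = rng.randn(d)
    return v / np.linalg.norm(v)

def circ_conv(a, b):
    # circular convolution, computed via the FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

roles = [unit_vector(d, rng) for _ in range(2)]
fillers = [unit_vector(d, rng) for _ in range(2)]

# TRACE = ROLE_0 * FILLER_0 + ROLE_1 * FILLER_1
trace = sum(circ_conv(r, f) for r, f in zip(roles, fillers))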
[2]:
def get_data(n_items, pairs_per_item, vec_d, vocab_seed):
    # the vocabulary object will handle the creation of semantic
    # pointers for us
    rng = np.random.RandomState(vocab_seed)
    vocab = spa.Vocabulary(dimensions=vec_d, rng=rng, max_similarity=1)

    # initialize arrays of shape (n_inputs, n_steps, vec_d)
    traces = np.zeros((n_items, 1, vec_d))
    cues = np.zeros((n_items, 1, vec_d))
    targets = np.zeros((n_items, 1, vec_d))

    # iterate through all of the examples to be generated
    for n in range(n_items):
        role_names = ["ROLE_%d_%d" % (n, i) for i in range(pairs_per_item)]
        filler_names = [
            "FILLER_%d_%d" % (n, i) for i in range(pairs_per_item)]

        # create key for the 'trace' of bound pairs (i.e. a
        # structured semantic pointer)
        trace_key = 'TRACE_' + str(n)
        trace_ptr = vocab.parse('+'.join(
            "%s * %s" % (x, y) for x, y in zip(role_names, filler_names)))
        trace_ptr.normalize()
        vocab.add(trace_key, trace_ptr)

        # pick which element will be cued for retrieval
        cue_idx = rng.randint(pairs_per_item)

        # fill in the array elements corresponding to this example
        traces[n, 0, :] = vocab[trace_key].v
        cues[n, 0, :] = vocab["ROLE_%d_%d" % (n, cue_idx)].v
        targets[n, 0, :] = vocab["FILLER_%d_%d" % (n, cue_idx)].v

    return traces, cues, targets, vocab
Next we’ll define a Nengo model that retrieves cued items from structured semantic pointers. So, for a given trace (e.g., \(TRACE_0\)) and cue (e.g., \(Role_{0,0}\)), the correct output would be the corresponding filler (\(Filler_{0,0}\)). The model we’ll build will perform such retrieval by implementing a computation of the form:
\(TRACE_0 \:\: \circledast \sim Role_{0,0} \approx Filler_{0,0}\)
That is, convolving the trace with the inverse of the given cue will produce (approximately) the associated filler. More details about the mathematics of how/why this works can be found here.
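Continuing the NumPy sketch from above, we can check this unbinding identity directly. Under circular convolution the (pseudo-)inverse of a vector is its involution: keep the first element and reverse the rest. Convolving the trace with the inverted cue then yields a vector that is most similar to the bound filler:

def inverse(a):
    # involution: a*[i] = a[-i mod d]
    return np.concatenate(([a[0]], a[1:][::-1]))

retrieved = circ_conv(trace, inverse(roles[0]))

# the retrieved vector should be much more similar to fillers[0]
# than to fillers[1]
print(np.dot(retrieved, fillers[0]), np.dot(retrieved, fillers[1]))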
We can create a model to perform this calculation using the nengo.networks.CircularConvolution network that comes with Nengo.
[3]:
seed = 0
dims = 32
minibatch_size = 50
n_pairs = 2

with nengo.Network(seed=seed) as net:
    # use rectified linear neurons to ensure differentiability
    net.config[nengo.Ensemble].neuron_type = nengo.RectifiedLinear()
    net.config[nengo.Connection].synapse = None

    # provide a pointer and a cue as input to the network
    trace_inp = nengo.Node(np.zeros(dims))
    cue_inp = nengo.Node(np.zeros(dims))

    # create a convolution network to perform the computation
    # specified above
    cconv = nengo.networks.CircularConvolution(5, dims, invert_b=True)

    # connect the trace and cue inputs to the circular
    # convolution network
    nengo.Connection(trace_inp, cconv.input_a)
    nengo.Connection(cue_inp, cconv.input_b)

    # probe the output
    out = nengo.Probe(cconv.output)
In order to assess the retrieval accuracy of the model we need a metric for success. In this case we’ll say that a cue has been successfully retrieved if the output vector is more similar to the correct filler vector than it is to any of the other vectors in the vocabulary.
[4]:
def accuracy(output, vocab, targets, t_step=-1):
    # provide the probed output data, the vocab,
    # the target vectors, and the time step at which to evaluate

    # get output at the given time step
    output = output[:, t_step, :]

    # compute similarity between each output and vocab item
    sims = np.dot(vocab.vectors, output.T)
    idxs = np.argmax(sims, axis=0)

    # check that the output is most similar to the target
    acc = np.mean(np.all(vocab.vectors[idxs] == targets[:, 0], axis=1))

    return acc
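As a quick sanity check (not part of the original example), we can exercise this metric with a stand-in vocabulary object; the accuracy function only needs a .vectors attribute, so a SimpleNamespace suffices. Here the first output is nearest its target and the second is not, so we expect an accuracy of 0.5:

from types import SimpleNamespace

toy_vocab = SimpleNamespace(vectors=np.eye(2))  # two orthogonal items
toy_targets = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])

# first output is closest to vocab item 0 (its target, so correct);
# second output is also closest to item 0, but its target is item 1
toy_output = np.array([[[0.9, 0.1]], [[0.8, 0.2]]])

print(accuracy(toy_output, toy_vocab, toy_targets))  # 0.5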
Now we can run the model on some test data to check the baseline retrieval accuracy. Since we used only a small number of neurons in the circular convolution network, we should expect mediocre results.
[5]:
# generate some test inputs
test_traces, test_cues, test_targets, test_vocab = get_data(
    minibatch_size, n_pairs, dims, vocab_seed=seed)
test_inputs = {trace_inp: test_traces, cue_inp: test_cues}

# run the simulator for one time step to compute the network outputs
with nengo_dl.Simulator(
        net, minibatch_size=minibatch_size, seed=seed) as sim:
    sim.step(data=test_inputs)

print('Retrieval accuracy: ', accuracy(sim.data[out], test_vocab,
                                       test_targets))
Build finished in 0:00:01
Optimization finished in 0:00:00
Construction finished in 0:00:00
Retrieval accuracy: 0.04
These results indicate that the model is only rarely performing accurate retrieval; without training, this network cannot reliably manipulate structured semantic pointers.
We can visualize the similarity of the output for one of the traces to get a sense of what this accuracy looks like (the similarity to the correct output is shown in red).
[6]:
plt.figure(figsize=(10, 5))
bars = plt.bar(np.arange(len(test_vocab.vectors)),
               np.dot(test_vocab.vectors, sim.data[out][0, 0]))
bars[np.where(np.all(test_vocab.vectors == test_targets[0, 0],
                     axis=1))[0][0]].set_color("r")
plt.ylim([-1, 1])
plt.xlabel("Vocabulary items")
plt.ylabel("Similarity");
We can see that the actual output is not particularly similar to the desired output (the red bar), which illustrates that the model is not performing accurate retrieval.
Now we’ll train the network parameters to improve performance. We won’t directly optimize retrieval accuracy, but will instead minimize the mean squared error between the model’s output vectors and the vectors corresponding to the correct output items for each input cue. We’ll use a large number of training examples that are distinct from our test data, so as to avoid explicitly fitting the model parameters to the test items.
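For reference, the mean squared error objective described above can be written as

\(L = \frac{1}{N} \sum_{i=1}^{N} \lVert \hat{y}_i - y_i \rVert^2\)

where \(\hat{y}_i\) is the model's output vector for training example \(i\), \(y_i\) is the corresponding correct filler vector, and \(N\) is the number of training examples.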
To make this example run a bit quicker we'll download some pretrained model parameters by default. Set do_training=True to train the model yourself.
[7]:
sim = nengo_dl.Simulator(net, minibatch_size=minibatch_size, seed=seed)

# pick an optimizer and learning rate
optimizer = tf.train.RMSPropOptimizer(2e-3)

do_training = False
if do_training:
    # create training data and data feeds
    train_traces, train_cues, train_targets, _ = get_data(
        n_items=5000, pairs_per_item=n_pairs, vec_d=dims,
        vocab_seed=seed + 1)
    train_data = {trace_inp: train_traces, cue_inp: train_cues,
                  out: train_targets}

    # train the model
    sim.train(train_data, optimizer, n_epochs=100)
    sim.save_params("./spa_retrieval_params")
else:
    # download pretrained parameters
    urlretrieve(
        "https://drive.google.com/uc?export=download&"
        "id=1jm7EUt7P7IFMsxmoBFXX3me-NsDMw_85",
        "spa_retrieval_params.zip")
    with zipfile.ZipFile("spa_retrieval_params.zip") as f:
        f.extractall()

    # load parameters
    sim.load_params('./spa_retrieval_params')
Build finished in 0:00:01
Optimization finished in 0:00:00
Construction finished in 0:00:00
We can now recompute the network outputs using the trained model on the test data. We can see that the retrieval accuracy is significantly improved. You can modify the dimensionality of the vectors and the number of bound pairs in each trace to explore how these variables influence the upper bound on retrieval accuracy.
[8]:
sim.step(data=test_inputs)

print('Retrieval accuracy: ', accuracy(sim.data[out], test_vocab,
                                       test_targets))

sim.close()
Retrieval accuracy: 0.74
[9]:
plt.figure(figsize=(10, 5))
bars = plt.bar(np.arange(len(test_vocab.vectors)),
               np.dot(test_vocab.vectors, sim.data[out][0, 0]))
bars[np.where(np.all(test_vocab.vectors == test_targets[0, 0],
                     axis=1))[0][0]].set_color("r")
plt.ylim([-1, 1])
plt.xlabel("Vocabulary items")
plt.ylabel("Similarity");
Check out this example for a more complicated version of this task/model, in which a structured semantic pointer is built up over time by binding together sequentially presented input items.