NengoDL combines two frameworks: Nengo and TensorFlow. This tutorial is designed for people who are familiar with TensorFlow and want to learn more about neuromorphic modelling with NengoDL. For the opposite direction (users familiar with Nengo who want to learn how to use NengoDL), check out this tutorial.
[1]:
%matplotlib inline
import gzip
import pickle
from urllib.request import urlretrieve
import sys
import matplotlib.pyplot as plt
import nengo
from nengo.utils.matplotlib import rasterplot
import nengo_dl
import numpy as np
import tensorflow as tf
We’ll start with the very basics, where you might be wondering what Nengo is and why you would want to use it. Nengo is a tool for constructing and simulating neural networks. That is, to some extent, the same purpose as TensorFlow (although TensorFlow is more of a general computational framework with neural network leanings). For example, here is how we might build a simple two-layer auto-encoder network in TensorFlow:
[2]:
n_in = 784
n_hidden = 64
minibatch_size = 50
with tf.Graph().as_default() as auto_graph:
    # input
    tf_a = tf.placeholder(tf.float32, shape=(minibatch_size, n_in))
    # first layer
    tf_b = tf.layers.dense(
        tf_a, n_hidden, activation=tf.nn.relu,
        kernel_initializer=tf.initializers.glorot_uniform())
    # second layer
    tf_c = tf.layers.dense(
        tf_b, n_in, activation=tf.nn.relu,
        kernel_initializer=tf.initializers.glorot_uniform())
And here is how we would build the same network architecture in Nengo:
[3]:
with nengo.Network() as auto_net:
    # input
    nengo_a = nengo.Node(np.zeros(n_in))
    # first layer
    nengo_b = nengo.Ensemble(
        n_hidden, 1, neuron_type=nengo.RectifiedLinear())
    nengo.Connection(
        nengo_a, nengo_b.neurons, transform=nengo_dl.dists.Glorot())
    # second layer
    nengo_c = nengo.Ensemble(
        n_in, 1, neuron_type=nengo.RectifiedLinear())
    nengo.Connection(
        nengo_b.neurons, nengo_c.neurons,
        transform=nengo_dl.dists.Glorot())
    # probes are used to collect data from the network
    p_c = nengo.Probe(nengo_c.neurons)
Note the basic similarities: an overall container (tf.Graph or nengo.Network), containing layers (Ensembles) of neurons connected by weights.
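Both definitions above describe the same computation, which can be sketched framework-free. The NumPy code below is an illustrative stand-in, not what either library executes internally: it assumes Glorot uniform weights drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), omits biases, and uses helper names (glorot_uniform, relu) that are introduced here only for illustration.

```python
import numpy as np


def glorot_uniform(rng, fan_in, fan_out):
    """Sample a (fan_in, fan_out) weight matrix from U(-limit, limit)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def relu(x):
    """Rectified linear activation."""
    return np.maximum(x, 0.0)


n_in, n_hidden, minibatch_size = 784, 64, 50
rng = np.random.RandomState(0)

# one weight matrix per layer (biases omitted for brevity)
w0 = glorot_uniform(rng, n_in, n_hidden)
w1 = glorot_uniform(rng, n_hidden, n_in)

# forward pass: input -> hidden layer -> reconstructed output
x = np.ones((minibatch_size, n_in))
hidden = relu(x @ w0)
output = relu(hidden @ w1)
print(hidden.shape, output.shape)
```

The hidden layer compresses each 784-dimensional input to 64 values, and the second layer expands it back to 784, which is the auto-encoder structure both frameworks express.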
One difference you’ll note is that with Nengo we separate the creation of the layers and the creation of the connections between layers. This is because the connection structure in Nengo networks often has a lot more state and general complexity than in typical deep learning networks, so it is helpful to be able to control it independently (we’ll see examples of this later).
Another new object you may notice is the nengo.Probe. This is used to collect data from the simulation; by adding a probe to nengo_c.neurons, we are indicating that we want to collect the activities of those neurons when the simulation is running. You can think of this like the “fetch” arguments in a TensorFlow Session, except we’re explicitly defining which objects in the network we want to fetch during graph construction rather than at run time.
We will not go into a lot of detail on Nengo here; there is much more functionality available, but we will focus on the features most familiar or relevant to those coming from a TensorFlow background. For a more in-depth introduction to Nengo, check out the Nengo-specific documentation and examples.
To simulate a TensorFlow network we create a Session and call sess.run:
[4]:
with tf.Session(graph=auto_graph) as sess:
    sess.run(tf.global_variables_initializer())
    output = sess.run(
        tf_c, feed_dict={tf_a: np.ones((minibatch_size, n_in))})
Again, accomplishing the same thing in Nengo bears many similarities. We create a Simulator and call sim.run:
[5]:
with nengo_dl.Simulator(network=auto_net,
                        minibatch_size=minibatch_size) as sim:
    sim.run(0.001, data={nengo_a: np.ones((minibatch_size, 1, n_in))})
Build finished in 0:00:00
Optimization finished in 0:00:00
Construction finished in 0:00:00
Simulation finished in 0:00:00
One difference you may note is the 0.001 in the call to sim.run. This is specifying the length of time (in simulated seconds) that we want to simulate the network. 0.001 corresponds to one simulation timestep with the default Simulator dt of 0.001 (which is also why our input data has a shape of 1 in the second dimension).
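The relationship between run time, dt, and the shape of the input data can be sketched as follows. The helper n_steps_for is a hypothetical name used only here, and rounding run_time / dt to the nearest integer is an assumption about how the step count is derived.

```python
import numpy as np


def n_steps_for(run_time, dt=0.001):
    """Number of simulation timesteps needed to cover run_time seconds."""
    return int(np.round(run_time / dt))


minibatch_size, n_in = 50, 784

# one timestep, matching sim.run(0.001) with the default dt
single = np.ones((minibatch_size, n_steps_for(0.001), n_in))

# half a second of input would need 500 values per input dimension
longer = np.ones((minibatch_size, n_steps_for(0.5), n_in))

print(single.shape, longer.shape)
```

The second axis of the data always has one entry per simulation timestep, so running for longer means supplying a longer input sequence.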
This highlights a key difference between Nengo and TensorFlow. Nengo simulations are fundamentally temporal in nature; unlike TensorFlow where the graph simply represents an abstract set of computations, in Nengo we (almost) always think of the graph as representing a stateful neural simulation, where values are accumulated, updated, and communicated over time. This is not to say there is no overlap (we can create TensorFlow simulations that execute over time, and we can create Nengo simulations without temporal dynamics), but this is a different way of thinking about computations that influences how we construct and simulate networks in Nengo.
More details on the NengoDL Simulator can be found in the user guide.
Although Nengo can be used to create TensorFlow-style networks, it has been primarily designed for a different style of modelling: “neuromorphic” networks. Neuromorphic networks include features drawn from biological neural networks, in an effort to understand or recreate the functionality of biological brains. Note that these models fall on a spectrum with standard artificial neural networks, with different approaches incorporating different biological features. But in general the structure and parameterization of these networks often differ significantly from standard deep network architectures.
We touched on this above in the discussion of temporality, which is one common feature of neuromorphic networks. Another common characteristic is the use of more complicated neuron models, in particular spiking neurons. In contrast to “rate” neurons (like relu) that output a continuous value, spiking neurons communicate via discrete bursts of output called spikes.
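To make the distinction concrete, here is a minimal sketch of a rate neuron and a spiking neuron driven by the same constant input. It is a simplified integrate-and-fire model written for illustration, not Nengo's implementation; the drive value and the convention that a spike has height 1/dt are assumptions.

```python
import numpy as np

dt = 0.001      # timestep in seconds
n_steps = 1000  # one simulated second
drive = 20.5    # constant input current (assumed value)

# rate neuron: outputs a continuous firing rate on every timestep
rate_out = np.full(n_steps, max(drive, 0.0))

# spiking neuron: integrate the same drive and emit a spike
# each time the accumulated voltage crosses 1
voltage = 0.0
spike_out = np.zeros(n_steps)
for i in range(n_steps):
    voltage += max(drive, 0.0) * dt
    n_spikes = np.floor(voltage)
    voltage -= n_spikes
    # scale spikes by 1/dt so that they integrate to one per spike
    spike_out[i] = n_spikes / dt

spike_count = int(np.count_nonzero(spike_out))
print("rate output:", rate_out[0])
print("spikes in one second:", spike_count)
```

The spiking output is zero on most timesteps, but the number of spikes per second tracks the value the rate neuron outputs continuously.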
We can visualize this difference with a simple 1-layer network:
[6]:
with nengo.Network() as net:
    # our input node will output a sine wave with a period of 1 second
    a = nengo.Node(lambda t: np.sin(2 * np.pi * t))
    # we'll create one ensemble with rate neurons
    b_rate = nengo.Ensemble(
        10, 1, neuron_type=nengo.RectifiedLinear(), seed=2)
    nengo.Connection(a, b_rate)
    # and another ensemble with spiking neurons
    b_spike = nengo.Ensemble(
        10, 1, neuron_type=nengo.SpikingRectifiedLinear(), seed=2)
    nengo.Connection(a, b_spike)
    p_a = nengo.Probe(a)
    p_rate = nengo.Probe(b_rate.neurons)
    p_spike = nengo.Probe(b_spike.neurons)
with nengo_dl.Simulator(net) as sim:
    # simulate the model for 1 second
    sim.run(1.0)
plt.figure()
plt.plot(sim.trange(), sim.data[p_a])
plt.xlabel("time")
plt.ylabel("input value")
plt.title("a")
plt.figure()
plt.plot(sim.trange(), sim.data[p_rate])
plt.xlabel("time")
plt.ylabel("firing rate")
plt.title("b_rate")
plt.figure()
rasterplot(sim.trange(), sim.data[p_spike])
plt.xlabel("time")
plt.ylabel("neuron")
plt.title("b_spike");
Build finished in 0:00:00
Optimization finished in 0:00:00
Construction finished in 0:00:00
Simulation finished in 0:00:00
Each neuron responds to the input signal differently due to the random parameterization in the network (e.g. connection weights and biases). We have matched the parameterization in the rate and spiking ensembles so that it is easier to see the parallels.
Note that the same information is being represented in the two ensembles. For example, when the second neuron (orange) is outputting a high continuous value (in the second graph), the corresponding spiking neuron is outputting more discrete spikes (orange lines in the third graph).
We can see the parallels more clearly if we introduce another Nengo feature, synaptic filters. This is inspired by a biological feature where discrete spikes induce a continuous electrical waveform in the receiving neuron, at the synapse (the point where the two neurons connect). But computationally we can think of this simply as applying a filter to the spiking signal.
[7]:
# nengo uses a linear lowpass filter by default
filt = nengo.Lowpass(tau=0.05)
# apply filter to ensemble output spikes
filtered_spikes = filt.filt(sim.data[p_spike])
plt.figure()
plt.plot(sim.trange(), filtered_spikes)
plt.xlabel("time")
plt.ylabel("filtered spike train (firing rates)");
We can see how the spike trains, when viewed through a synaptic filter, approximate the continuous rate values in the second graph above.
In this example we have computed the filtered signal manually for demonstration purposes, but in a typical Nengo model these synaptic filters are applied throughout the model, on the Connection objects. For example, the above filtering would be equivalent to nengo.Connection(b_spike.neurons, x, synapse=0.05) (from the perspective of a hypothetical downstream object x).
This is a helpful duality to keep in mind when coming to neuromorphic modelling and Nengo from a standard deep network background. Although spiking neurons seem like a radically different paradigm, they can compute and communicate the same information as their rate counterparts. But note that this only makes sense when we think of the network temporally (neurons spiking and being filtered over time).
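The effect of a synaptic filter can also be sketched without Nengo. The code below applies a discrete first-order lowpass filter to a regular spike train; the recurrence y[k] = a*y[k-1] + (1-a)*x[k] with a = exp(-dt/tau) is one common discretization and may differ in detail from what nengo.Lowpass does internally. The 100 Hz firing rate is a made-up value.

```python
import numpy as np


def lowpass(signal, tau, dt):
    """Discrete first-order lowpass: y[k] = a * y[k-1] + (1 - a) * x[k]."""
    a = np.exp(-dt / tau)
    out = np.zeros(len(signal))
    y = 0.0
    for k, x in enumerate(signal):
        y = a * y + (1.0 - a) * x
        out[k] = y
    return out


dt, tau = 0.001, 0.05
rate = 100.0  # hypothetical firing rate in Hz

# regular spike train: one spike of height 1/dt every 10 timesteps
spikes = np.zeros(1000)
spikes[9::10] = 1.0 / dt

filtered = lowpass(spikes, tau, dt)

# after the initial transient, the filtered signal hovers around the rate
steady = filtered[500:]
print("mean of filtered spikes:", steady.mean())
print("ripple range:", steady.min(), steady.max())
```

A larger tau smooths the ripple further, at the cost of a slower response to changes in the underlying firing rate.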
There are many other neuron types built into Nengo (see the documentation for a complete list). These neuron models have various different behaviours, and managing their parameterization and simulation is an important part of Nengo’s design.
The goal of NengoDL is not to replace TensorFlow or Nengo, but to allow them to smoothly work together. Thus one important feature is the ability to write TensorFlow code directly, and insert it into a Nengo network. This allows us to use whichever framework is best suited for different parts of a model.
This functionality is accessed through the nengo_dl.TensorNode class. This allows us to wrap TensorFlow code in a Nengo object, so that it can easily communicate with the rest of a Nengo model. The TensorFlow code is written in a function that takes tf.Tensors as input, applies the desired manipulations through TensorFlow operations, and returns a tf.Tensor. We then pass that function to the TensorNode.
For simple cases we can use nengo_dl.tensor_layer. This is a simplified interface for constructing TensorNodes that mimics the common layer-based API of deep learning frameworks (such as tf.layers). For example, suppose we want to apply batch normalization to the output of one of the Nengo ensembles. There is no built-in way to do batch normalization in Nengo, so we can instead turn to TensorFlow for this part of the model.
[8]:
with net:
    batch_norm = nengo_dl.tensor_layer(
        b_rate.neurons, tf.layers.batch_normalization, momentum=0.9)
    p_batch_norm = nengo.Probe(batch_norm)
This is essentially equivalent to the TensorFlow function tf.layers.batch_normalization(b_rate.neurons, momentum=0.9), except it works with Nengo objects. For example, b_rate is a nengo.Ensemble in this case, and we can add Probes or Connections to batch_norm in the same way as any other Nengo object.
Using nengo_dl.tensor_layer is simply a shortcut for creating a TensorNode and Connection; the above is equivalent to:
[9]:
with net:
    batch_norm = nengo_dl.TensorNode(
        lambda t, x: tf.layers.batch_normalization(x, momentum=0.9),
        size_in=10)
    nengo.Connection(b_rate.neurons, batch_norm, synapse=None)
    p_batch_norm = nengo.Probe(batch_norm)
In general, we can use any function (a built in TensorFlow function or one we write ourselves) in a TensorNode. It must accept two parameters, t and x, where t is the current simulation time and x is the value of any Connections incoming to the TensorNode. Note that in the case of tensor_layers the t parameter is omitted. x will have shape (minibatch_size, size_in), where size_in is the dimensionality of the input Connections to the node (specified in the size_in=10 argument above). The TensorNode/tensor_layer function should return a tf.Tensor with shape (minibatch_size, size_out), where size_out is the output dimensionality of the node (dependent on the manipulations applied to the inputs x). We could explicitly specify size_out=10 in the above example, or if we don’t specify the output size it will be determined automatically by calling the node function with placeholder inputs.
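The shape contract described above can be illustrated with a NumPy stand-in. This is not NengoDL code: node_func is a made-up function operating on arrays rather than tf.Tensors, and probing it with a dummy input only mimics the idea of determining size_out by calling the function with placeholder inputs.

```python
import numpy as np

minibatch_size, size_in = 4, 10


def node_func(t, x):
    """Stand-in node function: sums neighbouring pairs of inputs."""
    # x arrives with shape (minibatch_size, size_in)
    assert x.shape == (minibatch_size, size_in)
    # the result has shape (minibatch_size, size_in // 2)
    return x[:, 0::2] + x[:, 1::2]


# call the function once with a dummy input to discover size_out
dummy = np.zeros((minibatch_size, size_in))
size_out = node_func(0.0, dummy).shape[1]
print("size_out:", size_out)
```

Here size_out differs from size_in because of the manipulation applied to x, which is why the output size generally has to be either declared or inferred.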
Here is a simple network to illustrate a TensorNode’s input and output:
[10]:
with nengo.Network() as net:
    # node to provide an input value for the TensorNode
    a = nengo.Node([0.5, -0.1])
    # a TensorNode function to illustrate i/o
    def tensor_func(t, x):
        # print out the value of inputs t and x
        t = nengo_dl.utils.print_op(t, "t:")
        with tf.control_dependencies([t]):
            x = nengo_dl.utils.print_op(x, "x:")
        # output t + x
        return tf.add(t, x)
    # create the TensorNode
    b = nengo_dl.TensorNode(tensor_func, size_in=2)
    nengo.Connection(a, b, synapse=None)
    p = nengo.Probe(b)
with nengo_dl.Simulator(net) as sim:
    print("TensorNode input:")
    sim.run_steps(10, progress_bar=False)
    print("TensorNode output:")
    print(sim.data[p])
Build finished in 0:00:00
Optimization finished in 0:00:00
Construction finished in 0:00:00
TensorNode input:
t: 0.001
x: [[ 0.5 -0.1]]
t: 0.002
x: [[ 0.5 -0.1]]
t: 0.003
x: [[ 0.5 -0.1]]
t: 0.004
x: [[ 0.5 -0.1]]
t: 0.0050000004
x: [[ 0.5 -0.1]]
t: 0.006
x: [[ 0.5 -0.1]]
t: 0.007
x: [[ 0.5 -0.1]]
t: 0.008
x: [[ 0.5 -0.1]]
t: 0.009000001
x: [[ 0.5 -0.1]]
t: 0.010000001
x: [[ 0.5 -0.1]]
TensorNode output:
[[ 0.501 -0.099]
[ 0.502 -0.098]
[ 0.503 -0.097]
[ 0.504 -0.096]
[ 0.505 -0.095]
[ 0.506 -0.094]
[ 0.507 -0.093]
[ 0.508 -0.092]
[ 0.509 -0.091]
[ 0.51 -0.09 ]]
We can see, as we expect, that the input tensor t is reflecting the current simulation time over the 10 timesteps we executed, and x contains the value of the input Node that we connected to the TensorNode. And we can see in the probe data that the TensorNode is outputting the operation we defined in TensorFlow (tf.add(t, x)).
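The probe output can be cross-checked with plain NumPy, assuming the simulation time on step i is dt * (i + 1) as the printed values of t suggest. This is only an arithmetic check, not a replacement for the simulation.

```python
import numpy as np

dt = 0.001
n_steps = 10
x = np.array([0.5, -0.1])  # value of the input Node

# simulation time at the end of each step: 0.001, 0.002, ..., 0.010
t = dt * np.arange(1, n_steps + 1)

# broadcast the scalar time against the two-dimensional input
output = t[:, None] + x[None, :]
print(np.round(output, 3))
```

Each row adds the current time to both input dimensions, so the first row is [0.501, -0.099] and the last is [0.51, -0.09], matching the probe data.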
One point that can be important to keep in mind is that the main Nengo simulation graph is built within a tf.while_loop (that is what allows us to simulate and optimize a Nengo network over time). So the code that is defined within the TensorNode function will be executed within that while_loop context, which is usually what we want. However, sometimes we may want to write code that will execute outside the simulation loop (for example, code required to create the TensorNode’s parameters). This can be achieved by passing a callable class to the TensorNode instead of a simple function. That callable class can optionally define pre_build and post_build methods. pre_build will be called before the main simulation loop is constructed. post_build will be called after everything else in the graph has been constructed and the simulation Session has been initialized (this allows the TensorNode to build functionality that depends on an active Session or the Simulator random seed).
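The order of the three stages can be sketched in plain Python. The toy_build function below is a hypothetical driver standing in for the Simulator's build process; it is not NengoDL code and only illustrates how optional hooks on a callable object might be invoked around the main call.

```python
class RecordingFunc:
    """Callable that records the order in which its hooks are invoked."""

    def __init__(self):
        self.calls = []

    def pre_build(self, shape_in, shape_out):
        # runs once, before the simulation loop is constructed
        self.calls.append("pre_build")

    def __call__(self, t, x):
        # the main node function, run inside the simulation loop
        self.calls.append("call")
        return x

    def post_build(self, sess, rng):
        # runs once, after everything else has been constructed
        self.calls.append("post_build")


def toy_build(func, shape_in, shape_out):
    """Hypothetical driver: call optional hooks around the main function."""
    if hasattr(func, "pre_build"):
        func.pre_build(shape_in, shape_out)
    result = func(0.0, [0.0] * shape_in[-1])
    if hasattr(func, "post_build"):
        func.post_build(sess=None, rng=None)
    return result


func = RecordingFunc()
toy_build(func, shape_in=(1, 1), shape_out=(1, 1))
print(func.calls)

# a plain function without hooks is handled by the same driver
toy_build(lambda t, x: x, shape_in=(1, 1), shape_out=(1, 1))
```

Because both hooks are optional, a simple function and a full callable class can be passed through the same interface.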
Here is a simple TensorNode that illustrates the different build stages:
[11]:
with nengo.Network() as net:
    class TensorFunc:
        def pre_build(self, shape_in, shape_out):
            # shape_in and shape_out are the input and output shape of
            # the TensorNode
            print("in pre_build, scope:",
                  tf.get_default_graph().get_name_scope())
        def __call__(self, t, x):
            # this is the main tensornode function, equivalent to the
            # tensor_func discussed above
            print("in build, scope:",
                  tf.get_default_graph().get_name_scope())
            return x
        def post_build(self, sess, rng):
            # post_build is called after the Simulator's underlying
            # tf.Session is initialized, and that Session is passed
            # in as `sess`. `rng` is the Simulator's random number
            # generator.
            print("in post_build, scope:",
                  tf.get_default_graph().get_name_scope())
    a = nengo_dl.TensorNode(TensorFunc(), size_in=1, size_out=1)
# build the network
with nengo_dl.Simulator(net, progress_bar=False) as sim:
    pass
in pre_build, scope: SimTensorNodeBuilder
in build, scope: while/iteration_0/SimTensorNodeBuilder
in post_build, scope: SimTensorNodeBuilder_1
More details on TensorNode usage can be found in the user guide.
NengoDL allows model parameters to be optimized via TensorFlow optimization algorithms, through the Simulator.train function. Returning to the autoencoder examples from the beginning of this tutorial, we’ll optimize those networks to encode MNIST digits.
[12]:
# download MNIST dataset
urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz",
            "mnist.pkl.gz")
with gzip.open("mnist.pkl.gz") as f:
    (train_data, _), _, (test_data, _) = pickle.load(
        f, encoding="latin1")
np.random.shuffle(train_data)
n_epochs = 2
In TensorFlow the training would be done something like:
[13]:
with auto_graph.as_default():
    # create placeholder for target values
    targets = tf.placeholder(tf.float32, (minibatch_size, n_in))
    # compute loss (mean squared error)
    loss = tf.reduce_mean(tf.square(targets - tf_c))
    # apply optimizer
    optimizer = tf.train.RMSPropOptimizer(1e-3)
    opt_op = optimizer.minimize(loss)
with tf.Session(graph=auto_graph) as sess:
    sess.run(tf.global_variables_initializer())
    # run training loop
    for i in range(n_epochs):
        for j in range(0, len(train_data), minibatch_size):
            sess.run(opt_op, feed_dict={
                tf_a: train_data[j:j+minibatch_size],
                targets: train_data[j:j+minibatch_size]})
    # evaluate performance on test set
    error = np.mean([
        sess.run(loss, feed_dict={
            tf_a: test_data[j:j+minibatch_size],
            targets: test_data[j:j+minibatch_size]})
        for j in range(0, len(test_data), minibatch_size)])
    print("error:", error)
    # display example output
    output = sess.run(tf_c, feed_dict={tf_a: test_data[:minibatch_size]})
    plt.figure()
    plt.imshow(output[0].reshape((28, 28)))
error: 0.013011572
Before running the same training in NengoDL, we’ll change the Nengo model parameters to more closely match the TensorFlow network (we omitted these details in the original presentation to keep things simple).
[14]:
# set initial neuron gains to 1 and biases to 0
for ens in auto_net.all_ensembles:
    ens.gain = nengo.dists.Choice([1])
    ens.bias = nengo.dists.Choice([0])
# disable synaptic filtering on all connections
for conn in auto_net.all_connections:
    conn.synapse = None
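To see why these settings matter, consider a simplified model of a rectified linear layer in which each neuron applies a gain and a bias to its input current before the nonlinearity. The sketch below assumes that form (rate = max(gain * current + bias, 0)) and compares it against a plain dense ReLU layer without biases; the function names and random values are made up for illustration.

```python
import numpy as np


def nengo_style_layer(x, weights, gain, bias):
    """Rectified linear layer with a per-neuron gain and bias."""
    current = gain * (x @ weights) + bias
    return np.maximum(current, 0.0)


def dense_relu(x, weights):
    """Plain dense layer with ReLU activation and no bias."""
    return np.maximum(x @ weights, 0.0)


rng = np.random.RandomState(0)
x = rng.uniform(-1, 1, size=(5, 8))
w = rng.uniform(-1, 1, size=(8, 3))

# randomly chosen gains and biases change the layer's output
randomized = nengo_style_layer(
    x, w, gain=rng.uniform(0.5, 2.0, size=3),
    bias=rng.uniform(-1.0, 1.0, size=3))

# gain 1 and bias 0 reduce the layer to a plain dense ReLU layer
matched = nengo_style_layer(x, w, gain=np.ones(3), bias=np.zeros(3))

print(np.allclose(matched, dense_relu(x, w)))
```

With unit gains and zero biases the two layers compute the same function, which puts the two training runs on a comparable footing.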
We also need to modify the data slightly. As mentioned above, NengoDL simulations are essentially temporal, so data is described over time (indicating what the inputs/targets should be on each simulation timestep). So instead of the data having shape (batch_size, n), it will have shape (batch_size, n_steps, n). In this case we’ll just be training for a single timestep, but we still need to add that extra axis with length 1.
[15]:
train_data = train_data[:, None, :]
test_data = test_data[:, None, :]
Now we can run the NengoDL equivalent of the above TensorFlow training:
[16]:
# define loss function (we could use the pre-defined
# `nengo_dl.obj.mse`, but we define it explicitly here
# for clarity)
def loss(outputs, targets):
    return tf.reduce_mean(tf.square(outputs - targets))
with nengo_dl.Simulator(auto_net, minibatch_size=minibatch_size) as sim:
    # sim.train automatically adds the necessary elements to the
    # graph and runs the training loop
    # note: the probe acts as the placeholder to feed in target values
    sim.train({nengo_a: train_data, p_c: train_data},
              tf.train.RMSPropOptimizer(1e-3),
              objective={p_c: loss}, n_epochs=n_epochs)
    # evaluate performance on test set
    error = sim.loss({nengo_a: test_data, p_c: test_data},
                     objective={p_c: loss})
    print("error:", error)
    # display example output
    sim.step(data={nengo_a: test_data[:minibatch_size]})
    plt.figure()
    plt.imshow(sim.data[p_c][0].reshape((28, 28)))
Build finished in 0:00:00
Optimization finished in 0:00:00
Construction finished in 0:00:00
Training finished in 0:00:25 (loss: 0.0124)
Calculation finished in 0:00:01
error: 0.012338359
More details on using sim.train can be found in the user guide.
NengoDL also provides access to a different optimization method, the Neural Engineering Framework (NEF). This uses linear least-squares optimization to solve for optimal connection weights analytically, rather than using an iterative gradient-descent based algorithm. The advantage of the NEF is that it is very fast and general (for example, it does not require the network to be differentiable). The disadvantage is that it optimizes each set of connection weights individually (i.e., it cannot jointly optimize across multiple layers).
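The core idea of solving for weights by least squares can be sketched in NumPy. This is a simplified illustration under stated assumptions, not Nengo's solver: the neurons are rectified linear rate neurons with randomly chosen encoders, gains, and biases, and the regularization strength is an arbitrary choice.

```python
import numpy as np

rng = np.random.RandomState(0)
n_neurons = 50

# random tuning: each neuron gets an encoder (+1 or -1), a gain, and a bias
encoders = rng.choice([-1.0, 1.0], size=n_neurons)
gains = rng.uniform(0.5, 2.0, size=n_neurons)
biases = rng.uniform(-1.0, 1.0, size=n_neurons)


def activities(x):
    """Rectified linear firing rates, shape (n_points, n_neurons)."""
    current = gains * encoders * x[:, None] + biases
    return np.maximum(current, 0.0)


# sample the represented range and the function we want to approximate
x = np.linspace(-1, 1, 200)
target = np.square(x)

# solve for the output weights with regularized least squares
A = activities(x)
reg = 0.01 * A.max()  # regularization scale (assumed value)
gram = A.T @ A + len(x) * reg ** 2 * np.eye(n_neurons)
decoders = np.linalg.solve(gram, A.T @ target)

# weighted sum of neural activities approximates the target function
estimate = A @ decoders
rmse = np.sqrt(np.mean((estimate - target) ** 2))
print("RMSE:", rmse)
```

No gradients are involved: the weights come from a single linear solve, which is why the approach is fast and does not require the neuron model to be differentiable.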
The NEF optimization is accessed by setting the function argument on a nengo.Connection. This specifies the function that we would like those connection weights to approximate. In addition, in previous examples you may have noticed that we were forming Connections using ensemble.neurons (rather than ensemble). Using ensemble.neurons specifies that we want to form a direct connection between ensemble neurons, without applying the NEF optimization. So when we want to use the function argument, the Connection source object should be an ensemble, not ensemble.neurons. For example, we could use the NEF to create a network to approximate the function \(\sin(x^2)\):
[17]:
with nengo.Network(seed=0) as net:
    # input node outputting a random signal for x
    inpt = nengo.Node(nengo.processes.WhiteSignal(1, 5, rms=0.3))
    # first ensemble, will compute x^2
    ens0 = nengo.Ensemble(50, 1)
    # second ensemble, will compute sin(x^2)
    ens1 = nengo.Ensemble(50, 1)
    # output node
    outpt = nengo.Node(size_in=1)
    # connect input to first ensemble
    nengo.Connection(inpt, ens0)
    # connect first to second ensemble, solve for weights
    # to approximate the square function
    nengo.Connection(ens0, ens1, function=np.square)
    # connect second ensemble to output, solve for weights
    # to approximate the sin function
    nengo.Connection(ens1, outpt, function=np.sin)
    # add a probe on the input and output
    inpt_p = nengo.Probe(inpt)
    outpt_p = nengo.Probe(outpt)
with nengo_dl.Simulator(net) as sim:
    sim.run(1.0)
plt.figure()
plt.plot(sim.trange(), sim.data[inpt_p], label="x")
plt.plot(sim.trange(), np.sin(sim.data[inpt_p]**2), label="sin(x^2)")
plt.plot(sim.trange(), sim.data[outpt_p], label="output")
plt.legend();
Build finished in 0:00:00
Optimization finished in 0:00:00
Construction finished in 0:00:00
Simulation finished in 0:00:00
The NEF optimization can be used in combination with the deep learning optimization methods. For example, we could optimize some parameters with the NEF and others with sim.train (see this example). Or we could initialize each set of connection weights individually with the NEF, and then further refine them with end-to-end training via sim.train. As always, the overall theme is that NengoDL allows us to use whichever method is most appropriate for a particular goal.
See this example for a deeper introduction to the principles of the NEF.
Neuromorphic hardware is specialized compute hardware designed to simulate neuromorphic networks quickly and efficiently. However, this custom hardware is often difficult to program, and doing so requires writing custom code for each neuromorphic platform. One of the primary design goals of Nengo is to alleviate these challenges, by providing a single API that can be used to build networks across many different neuromorphic platforms.
The idea is that the front-end network construction code is the same (Networks, Nodes, Ensembles, Connections, and Probes), and then each platform has its own Simulator class (the back-end) that compiles and executes that network definition for some compute platform. This provides a consistent interface so that we only need to write code once and can then run that network on novel hardware platforms with no additional effort. For example, we could take the network from above and simulate it on different hardware platforms:
[18]:
# run on a standard CPU
with nengo.Simulator(net) as sim:
    sim.run(1.0)
# run on Loihi neuromorphic hardware
# (requires https://www.nengo.ai/nengo-loihi/)
# with nengo_loihi.Simulator(net) as sim:
#     sim.run(1.0)
# run on SpiNNaker neuromorphic hardware
# (requires https://github.com/project-rig/nengo_spinnaker)
# with nengo_spinnaker.Simulator(net) as sim:
#     sim.run(1.0)
# run on any OpenCL-compatible hardware
# (requires https://github.com/nengo/nengo-ocl)
# with nengo_ocl.Simulator(net) as sim:
#     sim.run(1.0)
plt.figure()
plt.plot(sim.trange(), sim.data[inpt_p], label="x")
plt.plot(sim.trange(), np.sin(sim.data[inpt_p]**2), label="sin(x^2)")
plt.plot(sim.trange(), sim.data[outpt_p], label="output")
plt.legend();
We have commented out the different backends above because they require extra installation steps, but if you are running this example yourself you can install any of those backends (or more) and uncomment that code to see the same network running on that new hardware platform. Note that we can think of NengoDL as a TensorFlow back-end (among other things); it takes a standard Nengo network, and simulates it using TensorFlow.
We can take advantage of this cross-platform compatibility to effectively incorporate NengoDL’s deep learning functionality into any other Nengo back-end. We build our Network, optimize it in NengoDL, save the optimized model parameters back into the Network definition, and then simulate that optimized Network in a different back-end. See this example in nengo-loihi, where a spiking network is optimized in NengoDL and then deployed on Loihi.
In this tutorial we have demonstrated how to translate TensorFlow concepts into NengoDL, including network construction, execution, and optimization. We have also discussed how to use TensorNodes to combine TensorFlow and Nengo code, and introduced some of the unique features of Nengo (such as NEF optimization and neuromorphic cross-platform execution). However, there is much more functionality in NengoDL than we are able to introduce here; check out the user guide or other examples for more information. If you would like more information on how NengoDL is implemented under the hood using TensorFlow, check out the white paper.