API reference

LMU Layers

Core classes for the KerasLMU package:

- keras_lmu.LMUCell: Implementation of the LMU cell (to be used within a Keras RNN wrapper).
- keras_lmu.LMU: A layer of trainable low-dimensional delay systems.
- keras_lmu.layers.LMUFeedforward: Layer class for the feedforward variant of the LMU.
- class keras_lmu.LMUCell(*args, **kwargs)

  Implementation of the LMU cell (to be used within a Keras RNN wrapper).

  In general, the LMU cell consists of two parts: a memory component (decomposing the input signal using Legendre polynomials as a basis) and a hidden component (learning nonlinear mappings from the memory component). [1] [2]

  This class processes one step within the whole time sequence input. Use the LMU class to create a recurrent Keras layer that processes the whole sequence; calling LMU() is equivalent to RNN(LMUCell()).

  - Parameters:
    - memory_d : int
      Dimensionality of input to the memory component.
    - order : int
      The number of degrees in the transfer function of the LTI system used to represent the sliding window of history. This parameter sets the number of Legendre polynomials used to orthogonally represent the sliding window.
    - theta : float
      The number of timesteps in the sliding window that is represented using the LTI system. In this context, the sliding window represents a dynamic range of data, of fixed size, that will be used to predict the value at the next time step. If this value is smaller than the size of the input sequence, only that number of steps will be represented at the time of prediction; however, the entire sequence will still be processed so that information can be projected to and from the hidden layer. If trainable_theta is enabled, theta will be updated over the course of training.
    - hidden_cell : keras.layers.Layer
      Keras Layer/RNNCell implementing the hidden component.
    - trainable_theta : bool
      If True, theta is learnt over the course of training. Otherwise, it is kept constant.
    - hidden_to_memory : bool
      If True, connect the output of the hidden component back to the memory component (default False).
    - memory_to_memory : bool
      If True, add a learnable recurrent connection (in addition to the static Legendre system) to the memory component (default False).
    - input_to_hidden : bool
      If True, connect the input directly to the hidden component (in addition to the connection from the memory component) (default False).
    - discretizer : str
      The method used to discretize the A and B matrices of the LMU. Current options are “zoh” (short for Zero-Order Hold) and “euler”. “zoh” is more accurate, but training will be slower than with “euler” if trainable_theta=True. Note that a larger theta is needed when discretizing using “euler” (a value larger than 4*order is recommended).
    - kernel_initializer : tf.initializers.Initializer
      Initializer for weights from input to the memory/hidden component. If None, no weights will be used, and the input size must match the memory/hidden size.
    - recurrent_initializer : tf.initializers.Initializer
      Initializer for memory_to_memory weights (if that connection is enabled).
    - kernel_regularizer : keras.regularizers.Regularizer
      Regularizer for weights from input to the memory/hidden component.
    - recurrent_regularizer : keras.regularizers.Regularizer
      Regularizer for memory_to_memory weights (if that connection is enabled).
    - use_bias : bool
      If True, the memory component includes a bias term.
    - bias_initializer : tf.initializers.Initializer
      Initializer for the memory component bias term. Only used if use_bias=True.
    - bias_regularizer : keras.regularizers.Regularizer
      Regularizer for the memory component bias term. Only used if use_bias=True.
    - dropout : float
      Dropout rate on input connections.
    - recurrent_dropout : float
      Dropout rate on the memory_to_memory connection.
    - input_d : Optional[int]
      Size of the last axis of the input signals. This only needs to be specified if hidden_cell=None and input_to_hidden=True; otherwise the input dimensionality can be inferred dynamically.
  References

  [1] Voelker and Eliasmith (2018). Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Computation, 30(3): 569-609.

  [2] Voelker and Eliasmith. “Methods and systems for implementing dynamic neural networks.” U.S. Patent Application No. 15/243,223. Filing date: 2016-08-22.
  - property theta

    Value of the theta parameter. If trainable_theta=True, this returns the trained value, not the initial value passed in to the constructor.
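  When trainable_theta=True, the current window length can be read back through this property. A minimal sketch, assuming keras_lmu and TensorFlow are installed (all sizes here are arbitrary illustrative choices):

  ```python
  import tensorflow as tf
  import keras_lmu

  cell = keras_lmu.LMUCell(
      memory_d=1,
      order=4,
      theta=16,
      hidden_cell=tf.keras.layers.SimpleRNNCell(8),
      trainable_theta=True,  # theta is now updated during training
  )
  layer = tf.keras.layers.RNN(cell)
  layer.build((None, None, 1))  # create the weights, including theta

  # Before any training this is (approximately) the initial value of 16;
  # after training it reflects the learned window length.
  print(cell.theta)
  ```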
  - build(input_shape)

    Builds the cell.

    Notes: This method should not be called manually; rather, use the implicit layer callable behaviour (like my_layer(inputs)), which will apply this method with some additional bookkeeping.
  - call(inputs, states, training=False)

    Apply this cell to inputs.

    Notes: This method should not be called manually; rather, use the implicit layer callable behaviour (like my_layer(inputs)), which will apply this method with some additional bookkeeping.
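  As the description above notes, LMUCell is meant to be wrapped in a Keras RNN layer. A minimal sketch of that pattern, assuming keras_lmu and TensorFlow are installed (all sizes here are arbitrary illustrative choices):

  ```python
  import numpy as np
  import tensorflow as tf
  import keras_lmu

  # Wrap the cell in a Keras RNN layer; per the docs above, this is
  # equivalent to using the keras_lmu.LMU layer directly.
  cell = keras_lmu.LMUCell(
      memory_d=1,   # dimensionality of the input to the memory component
      order=8,      # number of Legendre polynomials representing the window
      theta=64,     # length of the sliding window, in timesteps
      hidden_cell=tf.keras.layers.SimpleRNNCell(32),
  )
  rnn = tf.keras.layers.RNN(cell)

  # (batch, time, features) input; the RNN returns the final hidden output.
  x = np.random.uniform(-1, 1, size=(4, 64, 1)).astype("float32")
  y = rnn(x)
  print(y.shape)  # (4, 32)
  ```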
- class keras_lmu.LMU(*args, **kwargs)
A layer of trainable low-dimensional delay systems.
Each unit buffers its encoded input by internally representing a low-dimensional (i.e., compressed) version of the sliding window.
Nonlinear decodings of this representation, expressed by the A and B matrices, provide computations across the window, such as its derivative, energy, median value, etc ([1], [2]). Note that these decoder matrices can span across all of the units of an input sequence.
- Parameters:
    - memory_d : int
      Dimensionality of input to the memory component.
    - order : int
      The number of degrees in the transfer function of the LTI system used to represent the sliding window of history. This parameter sets the number of Legendre polynomials used to orthogonally represent the sliding window.
    - theta : float
      The number of timesteps in the sliding window that is represented using the LTI system. In this context, the sliding window represents a dynamic range of data, of fixed size, that will be used to predict the value at the next time step. If this value is smaller than the size of the input sequence, only that number of steps will be represented at the time of prediction; however, the entire sequence will still be processed so that information can be projected to and from the hidden layer. If trainable_theta is enabled, theta will be updated over the course of training.
    - hidden_cell : keras.layers.Layer
      Keras Layer/RNNCell implementing the hidden component.
    - trainable_theta : bool
      If True, theta is learnt over the course of training. Otherwise, it is kept constant.
    - hidden_to_memory : bool
      If True, connect the output of the hidden component back to the memory component (default False).
    - memory_to_memory : bool
      If True, add a learnable recurrent connection (in addition to the static Legendre system) to the memory component (default False).
    - input_to_hidden : bool
      If True, connect the input directly to the hidden component (in addition to the connection from the memory component) (default False).
    - discretizer : str
      The method used to discretize the A and B matrices of the LMU. Current options are “zoh” (short for Zero-Order Hold) and “euler”. “zoh” is more accurate, but training will be slower than with “euler” if trainable_theta=True. Note that a larger theta is needed when discretizing using “euler” (a value larger than 4*order is recommended).
    - kernel_initializer : tf.initializers.Initializer
      Initializer for weights from input to the memory/hidden component. If None, no weights will be used, and the input size must match the memory/hidden size.
    - recurrent_initializer : tf.initializers.Initializer
      Initializer for memory_to_memory weights (if that connection is enabled).
    - kernel_regularizer : keras.regularizers.Regularizer
      Regularizer for weights from input to the memory/hidden component.
    - recurrent_regularizer : keras.regularizers.Regularizer
      Regularizer for memory_to_memory weights (if that connection is enabled).
    - use_bias : bool
      If True, the memory component includes a bias term.
    - bias_initializer : tf.initializers.Initializer
      Initializer for the memory component bias term. Only used if use_bias=True.
    - bias_regularizer : keras.regularizers.Regularizer
      Regularizer for the memory component bias term. Only used if use_bias=True.
    - dropout : float
      Dropout rate on input connections.
    - recurrent_dropout : float
      Dropout rate on the memory_to_memory connection.
    - return_sequences : bool, optional
      If True, return the full output sequence. Otherwise, return just the last output in the output sequence.
References
  [1] Voelker and Eliasmith (2018). Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Computation, 30(3): 569-609.

  [2] Voelker and Eliasmith. “Methods and systems for implementing dynamic neural networks.” U.S. Patent Application No. 15/243,223. Filing date: 2016-08-22.
  - property theta

    Value of the theta parameter. If trainable_theta=True, this returns the trained value, not the initial value passed in to the constructor.
  - build(input_shape)

    Builds the layer.

    Notes: This method should not be called manually; rather, use the implicit layer callable behaviour (like my_layer(inputs)), which will apply this method with some additional bookkeeping.
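  The LMU layer can be dropped into a model like any other Keras recurrent layer. A sketch of a small sequence-classification model, assuming keras_lmu and TensorFlow are installed (the layer sizes and 10-class output are illustrative choices, not recommendations):

  ```python
  import tensorflow as tf
  import keras_lmu

  inputs = tf.keras.Input(shape=(None, 1))  # variable-length univariate sequences
  lmu = keras_lmu.LMU(
      memory_d=1,
      order=16,
      theta=128,
      hidden_cell=tf.keras.layers.SimpleRNNCell(64),
      return_sequences=False,  # keep only the last output in the sequence
  )(inputs)
  outputs = tf.keras.layers.Dense(10, activation="softmax")(lmu)
  model = tf.keras.Model(inputs, outputs)
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  ```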
- class keras_lmu.layers.LMUFeedforward(*args, **kwargs)
Layer class for the feedforward variant of the LMU.
This class assumes no recurrent connections are desired in the memory component.
Produces the output of the delay system by evaluating the convolution of the input sequence with the impulse response from the LMU cell.
- Parameters:
    - memory_d : int
      Dimensionality of input to the memory component.
    - order : int
      The number of degrees in the transfer function of the LTI system used to represent the sliding window of history. This parameter sets the number of Legendre polynomials used to orthogonally represent the sliding window.
    - theta : float
      The number of timesteps in the sliding window that is represented using the LTI system. In this context, the sliding window represents a dynamic range of data, of fixed size, that will be used to predict the value at the next time step. If this value is smaller than the size of the input sequence, only that number of steps will be represented at the time of prediction; however, the entire sequence will still be processed so that information can be projected to and from the hidden layer.
    - hidden_cell : keras.layers.Layer
      Keras Layer implementing the hidden component.
    - input_to_hidden : bool
      If True, connect the input directly to the hidden component (in addition to the connection from the memory component) (default False).
    - discretizer : str
      The method used to discretize the A and B matrices of the LMU. Current options are “zoh” (short for Zero-Order Hold) and “euler”. “zoh” is more accurate, but training will be slower than with “euler” if trainable_theta=True. Note that a larger theta is needed when discretizing using “euler” (a value larger than 4*order is recommended).
    - kernel_initializer : tf.initializers.Initializer
      Initializer for weights from input to the memory/hidden component. If None, no weights will be used, and the input size must match the memory/hidden size.
    - kernel_regularizer : keras.regularizers.Regularizer
      Regularizer for weights from input to the memory/hidden component.
    - use_bias : bool
      If True, the memory component includes a bias term.
    - bias_initializer : tf.initializers.Initializer
      Initializer for the memory component bias term. Only used if use_bias=True.
    - bias_regularizer : keras.regularizers.Regularizer
      Regularizer for the memory component bias term. Only used if use_bias=True.
    - dropout : float
      Dropout rate on input connections.
    - return_sequences : bool, optional
      If True, return the full output sequence. Otherwise, return just the last output in the output sequence.
    - conv_mode : “fft” or “raw”
      The method for performing the impulse response convolution. “fft” uses FFT convolution (default). “raw” uses explicit convolution, which may be faster for particular models on particular hardware.
    - truncate_ir : float
      The portion of the impulse response to truncate when using “raw” convolution (see conv_mode). This is an approximate upper bound on the error relative to the exact implementation. Smaller theta values result in more truncated elements for a given value of truncate_ir, improving efficiency.
  - build(input_shape)

    Builds the layer.

    Notes: This method should not be called manually; rather, use the implicit layer callable behaviour (like my_layer(inputs)), which will apply this method with some additional bookkeeping.
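  Because LMUFeedforward computes its memory by convolving the input with the delay system's impulse response rather than by recurrence, it processes a whole sequence at once. A sketch, assuming keras_lmu and TensorFlow are installed (the sizes and Dense hidden component are illustrative choices):

  ```python
  import numpy as np
  import tensorflow as tf
  import keras_lmu

  layer = keras_lmu.layers.LMUFeedforward(
      memory_d=1,
      order=8,
      theta=32,
      hidden_cell=tf.keras.layers.Dense(16),  # non-recurrent hidden component
      conv_mode="fft",         # default; "raw" may be faster on some hardware
      return_sequences=True,   # return the hidden output at every timestep
  )

  x = np.random.uniform(-1, 1, size=(2, 32, 1)).astype("float32")
  y = layer(x)
  print(y.shape)  # (2, 32, 16)
  ```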