Legendre Memory Unit layer
layer_lmu.Rd
A layer of trainable low-dimensional delay systems. Each unit buffers its encoded input by internally representing a low-dimensional (i.e., compressed) version of the sliding window. Nonlinear decodings of this representation, expressed by the A and B matrices, provide computations across the window, such as its derivative, energy, or median value (1, 2). Note that these decoder matrices can span all of the units of an input sequence.
Usage
layer_lmu(
  object,
  memory_d,
  order,
  theta,
  hidden_cell,
  trainable_theta = FALSE,
  hidden_to_memory = FALSE,
  memory_to_memory = FALSE,
  input_to_hidden = FALSE,
  discretizer = "zoh",
  kernel_initializer = "glorot_uniform",
  recurrent_initializer = "orthogonal",
  kernel_regularizer = NULL,
  recurrent_regularizer = NULL,
  use_bias = FALSE,
  bias_initializer = "zeros",
  bias_regularizer = NULL,
  dropout = 0,
  recurrent_dropout = 0,
  return_sequences = FALSE,
  ...
)
Arguments
- memory_d
- Dimensionality of input to memory component. 
- order
- The number of degrees in the transfer function of the LTI system used to represent the sliding window of history. This parameter sets the number of Legendre polynomials used to orthogonally represent the sliding window. 
- theta
- The number of timesteps in the sliding window that is represented using the LTI system. In this context, the sliding window represents a dynamic range of data, of fixed size, that will be used to predict the value at the next time step. If this value is smaller than the size of the input sequence, only that number of steps will be represented at the time of prediction; however, the entire sequence will still be processed so that information can be projected to and from the hidden layer. If trainable_theta is enabled, then theta will be updated during the course of training.
- hidden_cell
- Keras Layer/RNNCell implementing the hidden component. 
- trainable_theta
- If TRUE, theta is learnt over the course of training. Otherwise, it is kept constant. 
- hidden_to_memory
- If TRUE, connect the output of the hidden component back to the memory component (default FALSE). 
- memory_to_memory
- If TRUE, add a learnable recurrent connection (in addition to the static Legendre system) to the memory component (default FALSE).
- input_to_hidden
- If TRUE, connect the input directly to the hidden component (in addition to the connection from the memory component) (default FALSE).
- discretizer
- The method used to discretize the A and B matrices of the LMU. Current options are "zoh" (short for Zero Order Hold) and "euler". "zoh" is more accurate, but training will be slower than "euler" if trainable_theta=TRUE. Note that a larger theta is needed when discretizing using "euler" (a value larger than 4*order is recommended).
- kernel_initializer
- Initializer for weights from input to memory/hidden component. If NULL, no weights will be used, and the input size must match the memory/hidden size.
- recurrent_initializer
- Initializer for memory_to_memory weights (if that connection is enabled).
- kernel_regularizer
- Regularizer for weights from input to memory/hidden component. 
- recurrent_regularizer
- Regularizer for memory_to_memory weights (if that connection is enabled).
- use_bias
- If TRUE, the memory component includes a bias term. 
- bias_initializer
- Initializer for the memory component bias term. Only used if use_bias=TRUE.
- bias_regularizer
- Regularizer for the memory component bias term. Only used if use_bias=TRUE.
- dropout
- Dropout rate on input connections. 
- recurrent_dropout
- Dropout rate on the memory_to_memory connection.
- return_sequences
- If TRUE, return the full output sequence. Otherwise, return just the last output in the output sequence. 
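The roles of order, theta and discretizer described above can be illustrated with a small base-R sketch. It rebuilds the continuous-time A and B matrices from the closed form given in reference 1 and discretizes them with either method. All function names (lmu_matrices, expm_taylor, lmu_discretize, spectral_radius) are hypothetical helpers written for this illustration and are not part of the package; the unit step size dt = 1 and the Taylor-series matrix exponential are assumptions, and the layer's own implementation may differ in detail.

```r
# Sketch only: base-R reconstruction of the LMU state-space matrices.
# Every function below is a hypothetical helper, not a package function.

# Continuous-time A (order x order) and B (order x 1) for a window of
# length theta, using the closed form from Voelker et al. (2019).
lmu_matrices <- function(order, theta) {
  q <- 0:(order - 1)
  i <- matrix(q, nrow = order, ncol = order)                # row index
  j <- matrix(q, nrow = order, ncol = order, byrow = TRUE)  # column index
  scale <- (2 * q + 1) / theta                              # factor for row i
  A <- ifelse(i < j, -1, (-1)^(i - j + 1)) * scale
  B <- matrix((-1)^q * scale, ncol = 1)
  list(A = A, B = B)
}

# Matrix exponential via scaling and squaring with a truncated Taylor series
# (assumption: adequate for the small, well-scaled matrices used here).
expm_taylor <- function(M, terms = 20) {
  s <- ceiling(log2(max(1, norm(M, "1")))) + 1
  Ms <- M / 2^s
  E <- diag(nrow(M))
  term <- diag(nrow(M))
  for (k in seq_len(terms)) {
    term <- term %*% Ms / k
    E <- E + term
  }
  for (k in seq_len(s)) E <- E %*% E
  E
}

# Discretize with step dt (assumed to be one timestep).
# "euler": Ad = I + dt * A,  Bd = dt * B
# "zoh":   Ad = exp(dt * A), Bd = A^{-1} (Ad - I) B
lmu_discretize <- function(order, theta, method = c("zoh", "euler"), dt = 1) {
  method <- match.arg(method)
  m <- lmu_matrices(order, theta)
  eye <- diag(order)
  if (method == "euler") {
    Ad <- eye + dt * m$A
    Bd <- dt * m$B
  } else {
    Ad <- expm_taylor(dt * m$A)
    Bd <- solve(m$A, (Ad - eye) %*% m$B)
  }
  list(A = Ad, B = Bd)
}

# Largest eigenvalue modulus; values above 1 mean the memory state can blow up.
spectral_radius <- function(M) max(Mod(eigen(M, only.values = TRUE)$values))

spectral_radius(lmu_discretize(2, theta = 1, method = "euler")$A)
spectral_radius(lmu_discretize(2, theta = 9, method = "euler")$A)
spectral_radius(lmu_discretize(2, theta = 1, method = "zoh")$A)
```

In this sketch the Euler update with theta = 1 has a spectral radius above 1, while theta = 9 (larger than 4*order) and the "zoh" update both stay below 1, which is consistent with the recommendation to use a larger theta with "euler".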
Output shape
- if return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each with shape (batch_size, state_size), where state_size could be a high dimension tensor shape.
- if return_sequences: N-D tensor with shape [batch_size, timesteps, output_size], where output_size could be a high dimension tensor shape, or [timesteps, batch_size, output_size] when time_major is TRUE.
- else, N-D tensor with shape [batch_size, output_size], where output_size could be a high dimension tensor shape.
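The shape rules above can be summarized in a few lines of base R. The helper lmu_output_shape is hypothetical and exists only for illustration; it does not call the layer, it ignores return_state, and it assumes a scalar output_size.

```r
# Hypothetical helper: expected shape of the output tensor, following the
# rules listed above (return_state is not modelled here).
lmu_output_shape <- function(batch_size, timesteps, output_size,
                             return_sequences = FALSE, time_major = FALSE) {
  if (!return_sequences) {
    # Only the last output of the sequence is returned.
    return(c(batch_size, output_size))
  }
  if (time_major) {
    c(timesteps, batch_size, output_size)
  } else {
    c(batch_size, timesteps, output_size)
  }
}

lmu_output_shape(32, 28, 10)                           # 32 10
lmu_output_shape(32, 28, 10, return_sequences = TRUE)  # 32 28 10
```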
References
- A. Voelker, I. Kajić and C. Eliasmith, Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks (2019) 
- A. Voelker and C. Eliasmith, Improving spiking dynamical networks: Accurate delays, higher-order synapses, and time cells. Neural Computation, 30(3): 569-609. (2018)
- A. Voelker and C. Eliasmith, Methods and systems for implementing dynamic neural networks. U.S. Patent Application No. 15/243,223.
Examples
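if (FALSE) {
library(keras)
# Input: sequences of 28 timesteps with 3 features each
inp <- layer_input(c(28, 3))
# Hidden component: an LSTM cell with 10 units
hidden_cell <- layer_lstm_cell(10)
# theta = 28 makes the sliding window span the whole input sequence
lmu <- layer_lmu(memory_d = 10, order = 3, theta = 28, hidden_cell = hidden_cell)(inp)
model <- keras_model(inp, lmu)
# Run the model on a batch of 32 constant sequences
model(array(1, c(32, 28, 3)))
}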