Gated Linear Unit
The gated linear unit in this form was introduced in Language Modeling with Gated Convolutional Networks by Dauphin et al., where it was applied to sequence processing tasks and compared with the gating mechanism used in LSTM layers. Its use for time series processing was explicitly proposed in the Temporal Fusion Transformer paper by Lim et al.
Arguments
- object
What to compose the new Layer instance with. Typically a Sequential model or a Tensor (e.g., as returned by layer_input()). The return value depends on object. If object is:
  - missing or NULL, the Layer instance is returned.
  - a Sequential model, the model with an additional layer is returned.
  - a Tensor, the output tensor from layer_instance(object) is returned.
- units
Positive integer, dimensionality of the output space.
- activation
Name of the activation function to use. If you don't specify anything, no activation is applied (i.e., "linear" activation: a(x) = x).
- return_gate
Logical. Whether to return the gate values in addition to the output. Default: FALSE
Value
Tensor of shape (batch_size, ..., units). If return_gate = TRUE, the gate values are also returned as a second tensor of identical shape.
Details
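Computed according to the equation: $$GLU(\gamma) = \sigma(W\gamma + b) \odot (V\gamma + c)$$ where \(\gamma\) is the input, \(W\) and \(V\) are weight matrices, \(b\) and \(c\) are bias vectors, \(\sigma\) is the sigmoid function and \(\odot\) is the element-wise (Hadamard) product.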
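As a rough illustration of the equation, the base-R sketch below evaluates a GLU on a plain matrix. It is not the layer's actual implementation: the functions sigmoid() and glu(), the name c_bias (standing in for c) and the random weights are all assumptions made for this example. With input_dim = 30 and units = 10, the two weight matrices and two bias vectors hold 620 values, which agrees with the parameter count reported in the Examples.

```r
# Minimal base-R sketch of the GLU equation (illustrative only).
sigmoid <- function(x) 1 / (1 + exp(-x))

# gamma: matrix (batch_size, input_dim)
# W, V:  matrices (input_dim, units); b, c_bias: vectors of length units
glu <- function(gamma, W, b, V, c_bias, return_gate = FALSE) {
  gate   <- sigmoid(sweep(gamma %*% W, 2, b, "+"))  # sigma(W gamma + b)
  linear <- sweep(gamma %*% V, 2, c_bias, "+")      # V gamma + c
  values <- gate * linear                           # element-wise product
  if (return_gate) list(values = values, gate = gate) else values
}

set.seed(1)
input_dim <- 30
units     <- 10
gamma  <- matrix(1, nrow = 32, ncol = input_dim)
W      <- matrix(rnorm(input_dim * units, sd = 0.1), input_dim, units)
V      <- matrix(rnorm(input_dim * units, sd = 0.1), input_dim, units)
b      <- rep(0, units)
c_bias <- rep(0, units)

res <- glu(gamma, W, b, V, c_bias, return_gate = TRUE)
dim(res$values)  # 32 10
dim(res$gate)    # 32 10

# Number of trainable values in this sketch
n_params <- length(W) + length(b) + length(V) + length(c_bias)  # 620
```

The gate always lies strictly between 0 and 1, so it acts as a soft switch on each unit of the linear branch.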
Input and Output Shapes
Input shape: nD tensor with shape: (batch_size, ..., input_dim). The most common situation would be a 2D input with shape (batch_size, input_dim).
Output shape: nD tensor with shape: (batch_size, ..., units). For instance, for a 2D input with shape (batch_size, input_dim), the output would have shape (batch_size, units).
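To make the nD case concrete, the following base-R sketch applies the same transformation along the last axis of a 3D array, assuming a hypothetical input of shape (batch_size, timesteps, input_dim). The helper names dense_last_axis() and glu_nd() are invented for this illustration and are not part of the package.

```r
# Hypothetical helper: affine map along the last axis of an nD array.
dense_last_axis <- function(x, W, b) {
  d    <- dim(x)
  lead <- d[-length(d)]                       # all leading dimensions
  # R arrays are column-major, so collapsing the leading dimensions
  # into rows keeps the last axis as the matrix columns.
  x2 <- matrix(x, nrow = prod(lead), ncol = d[length(d)])
  y2 <- sweep(x2 %*% W, 2, b, "+")
  array(y2, dim = c(lead, ncol(W)))           # restore leading dimensions
}

# GLU on an nD array: sigmoid gate times linear branch, element-wise.
glu_nd <- function(x, W, b, V, c_bias) {
  gate <- 1 / (1 + exp(-dense_last_axis(x, W, b)))
  gate * dense_last_axis(x, V, c_bias)
}

set.seed(1)
input_dim <- 30
units     <- 10
x <- array(1, dim = c(32, 5, input_dim))      # (batch_size, timesteps, input_dim)
W <- matrix(rnorm(input_dim * units, sd = 0.1), input_dim, units)
V <- matrix(rnorm(input_dim * units, sd = 0.1), input_dim, units)

out <- glu_nd(x, W, rep(0, units), V, rep(0, units))
dim(out)  # 32 5 10
```

Only the last dimension changes, from input_dim to units; every leading dimension is carried through unchanged.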
References
Y. N. Dauphin et al., Language Modeling with Gated Convolutional Networks. International Conference on Machine Learning, PMLR (2017)
B. Lim, S. O. Arik, N. Loeff, T. Pfister, Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting (2020)
Examples
library(keras)
# ================================================================
# SEQUENTIAL MODEL, NO GATE VALUES RETURNED
# ================================================================
model <-
  keras_model_sequential() %>%
  layer_glu(10, input_shape = 30)
#> Loaded Tensorflow version 2.10.0
model
#> Model: "sequential"
#> ________________________________________________________________________________
#> Layer (type) Output Shape Param #
#> ================================================================================
#> glu (GLU) (None, 10) 620
#> ================================================================================
#> Total params: 620
#> Trainable params: 620
#> Non-trainable params: 0
#> ________________________________________________________________________________
output <- model(matrix(1, 32, 30))
dim(output)
output[1,]
# ================================================================
# WITH GATE VALUES RETURNED
# ================================================================
inp <- layer_input(30)
out <- layer_glu(units = 10, return_gate = TRUE)(inp)
model <- keras_model(inp, out)
model
#> Model: "model"
#> ________________________________________________________________________________
#> Layer (type) Output Shape Param #
#> ================================================================================
#> input_1 (InputLayer) [(None, 30)] 0
#> glu_1 (GLU) [(None, 10), 620
#> (None, 10)]
#> ================================================================================
#> Total params: 620
#> Trainable params: 620
#> Non-trainable params: 0
#> ________________________________________________________________________________
c(values, gate) %<-% model(matrix(1, 32, 30))
dim(values)
dim(gate)
values[1,]
gate[1,]