Interpretable multi-head attention layer
An interpretable variant of multi-head attention, as used in the Temporal Fusion Transformer: every head shares a single value projection and the head outputs are averaged rather than concatenated, so the attention weights can be aggregated into one measure of which time steps the model attends to.
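The snippet below is only an illustrative sketch of that idea written with plain keras layers; the helper name sketch_interpretable_attention and its internals are made up for illustration and are not this package's implementation. Each head gets its own query and key projections, all heads share one value projection, and the head outputs are averaged before a final linear map.

library(keras)

sketch_interpretable_attention <- function(queries, keys, values,
                                           state_size, num_heads) {
  # One value projection shared by every head
  shared_values <- layer_dense(values, units = state_size, use_bias = FALSE)
  heads <- lapply(seq_len(num_heads), function(h) {
    q_h <- layer_dense(queries, units = state_size, use_bias = FALSE)
    k_h <- layer_dense(keys, units = state_size, use_bias = FALSE)
    # dot-product attention of each head over the shared values
    layer_attention(list(q_h, shared_values, k_h), use_scale = TRUE)
  })
  # Average the heads instead of concatenating them, then map back
  layer_dense(layer_average(heads), units = state_size, use_bias = FALSE)
}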
References
B. Lim, S. O. Arik, N. Loeff, T. Pfister, Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting (2020)
Examples
lookback <- 28    # number of past time steps fed to the encoder
horizon <- 14     # number of future time steps to forecast
all_steps <- lookback + horizon
state_size <- 5   # model (embedding) dimension

# Symbolic inputs: queries cover the forecast horizon, while keys and
# values cover the full lookback + horizon window.
queries <- layer_input(c(horizon, state_size))
keys <- layer_input(c(all_steps, state_size))
values <- layer_input(c(all_steps, state_size))

# Build the layer and apply it to the query/key/value tensors.
imh_attention <-
  layer_interpretable_mh_attention(
    state_size = state_size, num_heads = 10
  )(queries, keys, values)
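A minimal follow-up sketch (not part of the package documentation) of how the symbolic graph above could be wrapped into a keras model and run on random data. It assumes the keras functions are available and that the layer returns a single output tensor of shape (batch, horizon, state_size), as a standard keras attention layer would.

model <- keras_model(
  inputs = list(queries, keys, values),
  outputs = imh_attention
)

batch_size <- 32
q <- array(rnorm(batch_size * horizon * state_size),
           dim = c(batch_size, horizon, state_size))
k <- array(rnorm(batch_size * all_steps * state_size),
           dim = c(batch_size, all_steps, state_size))
v <- array(rnorm(batch_size * all_steps * state_size),
           dim = c(batch_size, all_steps, state_size))

out <- predict(model, list(q, k, v))
dim(out)  # expected: 32 14 5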