Single Transformer Block for the FT-Transformer
Source: R/PipeOpTorchFTTransformerBlock.R
mlr_pipeops_nn_ft_transformer_block.Rd
A transformer block consisting of a multi-head self-attention mechanism followed by a feed-forward network.
This is used in LearnerTorchFTTransformer.
nn_module
Calls nn_ft_transformer_block() when trained.
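As a sketch of how this PipeOp composes with others, the block can be stacked after the tokenizer and CLS pipeops listed under "See also" to build an FT-Transformer-style graph by hand. This is an untested illustration; the pipeop ids are taken from the "See also" list, and the `d_token` value is an arbitrary choice for the example.

```r
library(mlr3torch)
library(mlr3pipelines)

# Ingress numeric features, tokenize them, prepend a CLS token,
# then apply two transformer blocks (a sketch, not the canonical recipe).
graph = po("torch_ingress_num") %>>%
  po("nn_tokenizer_num", d_token = 32) %>>%
  po("nn_ft_cls") %>>%
  po("nn_ft_transformer_block", id = "block_1") %>>%
  po("nn_ft_transformer_block", id = "block_2")
```

Note that distinct `id`s are required when the same PipeOp appears more than once in a graph.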
See also
Other PipeOps:
mlr_pipeops_nn_adaptive_avg_pool1d,
mlr_pipeops_nn_adaptive_avg_pool2d,
mlr_pipeops_nn_adaptive_avg_pool3d,
mlr_pipeops_nn_avg_pool1d,
mlr_pipeops_nn_avg_pool2d,
mlr_pipeops_nn_avg_pool3d,
mlr_pipeops_nn_batch_norm1d,
mlr_pipeops_nn_batch_norm2d,
mlr_pipeops_nn_batch_norm3d,
mlr_pipeops_nn_block,
mlr_pipeops_nn_celu,
mlr_pipeops_nn_conv1d,
mlr_pipeops_nn_conv2d,
mlr_pipeops_nn_conv3d,
mlr_pipeops_nn_conv_transpose1d,
mlr_pipeops_nn_conv_transpose2d,
mlr_pipeops_nn_conv_transpose3d,
mlr_pipeops_nn_dropout,
mlr_pipeops_nn_elu,
mlr_pipeops_nn_flatten,
mlr_pipeops_nn_ft_cls,
mlr_pipeops_nn_geglu,
mlr_pipeops_nn_gelu,
mlr_pipeops_nn_glu,
mlr_pipeops_nn_hardshrink,
mlr_pipeops_nn_hardsigmoid,
mlr_pipeops_nn_hardtanh,
mlr_pipeops_nn_head,
mlr_pipeops_nn_identity,
mlr_pipeops_nn_layer_norm,
mlr_pipeops_nn_leaky_relu,
mlr_pipeops_nn_linear,
mlr_pipeops_nn_log_sigmoid,
mlr_pipeops_nn_max_pool1d,
mlr_pipeops_nn_max_pool2d,
mlr_pipeops_nn_max_pool3d,
mlr_pipeops_nn_merge,
mlr_pipeops_nn_merge_cat,
mlr_pipeops_nn_merge_prod,
mlr_pipeops_nn_merge_sum,
mlr_pipeops_nn_prelu,
mlr_pipeops_nn_reglu,
mlr_pipeops_nn_relu,
mlr_pipeops_nn_relu6,
mlr_pipeops_nn_reshape,
mlr_pipeops_nn_rrelu,
mlr_pipeops_nn_selu,
mlr_pipeops_nn_sigmoid,
mlr_pipeops_nn_softmax,
mlr_pipeops_nn_softplus,
mlr_pipeops_nn_softshrink,
mlr_pipeops_nn_softsign,
mlr_pipeops_nn_squeeze,
mlr_pipeops_nn_tanh,
mlr_pipeops_nn_tanhshrink,
mlr_pipeops_nn_threshold,
mlr_pipeops_nn_tokenizer_categ,
mlr_pipeops_nn_tokenizer_num,
mlr_pipeops_nn_unsqueeze,
mlr_pipeops_torch_ingress,
mlr_pipeops_torch_ingress_categ,
mlr_pipeops_torch_ingress_ltnsr,
mlr_pipeops_torch_ingress_num,
mlr_pipeops_torch_loss,
mlr_pipeops_torch_model,
mlr_pipeops_torch_model_classif,
mlr_pipeops_torch_model_regr.
Super classes
mlr3pipelines::PipeOp -> mlr3torch::PipeOpTorch -> PipeOpTorchFTTransformerBlock
Methods
Method new()
Create a new instance of this R6 class.
Usage
PipeOpTorchFTTransformerBlock$new(
id = "nn_ft_transformer_block",
param_vals = list()
)
Arguments
id
(character(1))
Identifier of the resulting object.
param_vals
(list())
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction.
Examples
# Construct the PipeOp
pipeop = po("nn_ft_transformer_block")
pipeop
#> PipeOp: <nn_ft_transformer_block> (not trained)
#> values: <attention_n_heads=8, attention_dropout=0.2, attention_initialization=kaiming, attention_normalization=<nn_layer_norm>, ffn_dropout=0.1, ffn_activation=<nn_reglu>, ffn_normalization=<nn_layer_norm>, residual_dropout=0, prenormalization=TRUE, is_first_layer=FALSE, query_idx=<NULL>, attention_bias=TRUE, ffn_bias_first=TRUE, ffn_bias_second=TRUE>
#> Input channels <name [train type, predict type]>:
#> input [ModelDescriptor,Task]
#> Output channels <name [train type, predict type]>:
#> output [ModelDescriptor,Task]
# The available parameters
pipeop$param_set
#> <ParamSet(16)>
#> id class lower upper nlevels default
#> <char> <char> <num> <num> <num> <list>
#> 1: attention_n_heads ParamInt 1 Inf Inf <NoDefault[0]>
#> 2: attention_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 3: attention_initialization ParamFct NA NA 2 <NoDefault[0]>
#> 4: attention_normalization ParamUty NA NA Inf <NoDefault[0]>
#> 5: ffn_d_hidden ParamInt 1 Inf Inf <NoDefault[0]>
#> 6: ffn_d_hidden_multiplier ParamDbl 0 Inf Inf <NoDefault[0]>
#> 7: ffn_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 8: ffn_activation ParamUty NA NA Inf <NoDefault[0]>
#> 9: ffn_normalization ParamUty NA NA Inf <NoDefault[0]>
#> 10: residual_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 11: prenormalization ParamLgl NA NA 2 <NoDefault[0]>
#> 12: is_first_layer ParamLgl NA NA 2 <NoDefault[0]>
#> 13: query_idx ParamUty NA NA Inf <NoDefault[0]>
#> 14: attention_bias ParamLgl NA NA 2 <NoDefault[0]>
#> 15: ffn_bias_first ParamLgl NA NA 2 <NoDefault[0]>
#> 16: ffn_bias_second ParamLgl NA NA 2 <NoDefault[0]>
#> value
#> <list>
#> 1: 8
#> 2: 0.2
#> 3: kaiming
#> 4: <nn_layer_norm[1]>
#> 5: [NULL]
#> 6: [NULL]
#> 7: 0.1
#> 8: <nn_reglu[1]>
#> 9: <nn_layer_norm[1]>
#> 10: 0
#> 11: TRUE
#> 12: FALSE
#> 13: [NULL]
#> 14: TRUE
#> 15: TRUE
#> 16: TRUE
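Any of the parameters above can be overridden at construction time. The following is a sketch using parameter names from the table; the particular values are arbitrary choices for illustration.

```r
# Construct the block with non-default hyperparameters
# (values here are illustrative, not recommendations).
pipeop = po("nn_ft_transformer_block",
  attention_n_heads = 4,
  attention_dropout = 0.1,
  is_first_layer = TRUE
)
```

Alternatively, the same settings can be passed via `param_vals = list(...)` to `PipeOpTorchFTTransformerBlock$new()`.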