Single Transformer Block for the FT-Transformer
Source: R/PipeOpTorchFTTransformerBlock.R
mlr_pipeops_nn_ft_transformer_block.Rd
A transformer block consisting of a multi-head self-attention mechanism followed by a feed-forward network.
This is used in LearnerTorchFTTransformer.
nn_module
Calls nn_ft_transformer_block() when trained.
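As a sketch of how this PipeOp composes with others, the block can be stacked after the tokenizer and CLS pipeops listed under "See also" to build an FT-Transformer-style graph by hand. This is an untested illustration; the pipeop ids are taken from the "See also" list, and the `d_token` value is an arbitrary choice for the example.

```r
library(mlr3torch)
library(mlr3pipelines)

# Ingress numeric features, tokenize them, prepend a CLS token,
# then apply two transformer blocks (a sketch, not the canonical recipe).
graph = po("torch_ingress_num") %>>%
  po("nn_tokenizer_num", d_token = 32) %>>%
  po("nn_ft_cls") %>>%
  po("nn_ft_transformer_block", id = "block_1") %>>%
  po("nn_ft_transformer_block", id = "block_2")
```

Note that distinct `id`s are required when the same PipeOp appears more than once in a graph.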
See also
Other PipeOps:
mlr_pipeops_nn_adaptive_avg_pool1d,
mlr_pipeops_nn_adaptive_avg_pool2d,
mlr_pipeops_nn_adaptive_avg_pool3d,
mlr_pipeops_nn_avg_pool1d,
mlr_pipeops_nn_avg_pool2d,
mlr_pipeops_nn_avg_pool3d,
mlr_pipeops_nn_batch_norm1d,
mlr_pipeops_nn_batch_norm2d,
mlr_pipeops_nn_batch_norm3d,
mlr_pipeops_nn_block,
mlr_pipeops_nn_celu,
mlr_pipeops_nn_conv1d,
mlr_pipeops_nn_conv2d,
mlr_pipeops_nn_conv3d,
mlr_pipeops_nn_conv_transpose1d,
mlr_pipeops_nn_conv_transpose2d,
mlr_pipeops_nn_conv_transpose3d,
mlr_pipeops_nn_dropout,
mlr_pipeops_nn_elu,
mlr_pipeops_nn_flatten,
mlr_pipeops_nn_ft_cls,
mlr_pipeops_nn_geglu,
mlr_pipeops_nn_gelu,
mlr_pipeops_nn_glu,
mlr_pipeops_nn_hardshrink,
mlr_pipeops_nn_hardsigmoid,
mlr_pipeops_nn_hardtanh,
mlr_pipeops_nn_head,
mlr_pipeops_nn_identity,
mlr_pipeops_nn_layer_norm,
mlr_pipeops_nn_leaky_relu,
mlr_pipeops_nn_linear,
mlr_pipeops_nn_log_sigmoid,
mlr_pipeops_nn_max_pool1d,
mlr_pipeops_nn_max_pool2d,
mlr_pipeops_nn_max_pool3d,
mlr_pipeops_nn_merge,
mlr_pipeops_nn_merge_cat,
mlr_pipeops_nn_merge_prod,
mlr_pipeops_nn_merge_sum,
mlr_pipeops_nn_prelu,
mlr_pipeops_nn_reglu,
mlr_pipeops_nn_relu,
mlr_pipeops_nn_relu6,
mlr_pipeops_nn_reshape,
mlr_pipeops_nn_rrelu,
mlr_pipeops_nn_selu,
mlr_pipeops_nn_sigmoid,
mlr_pipeops_nn_softmax,
mlr_pipeops_nn_softplus,
mlr_pipeops_nn_softshrink,
mlr_pipeops_nn_softsign,
mlr_pipeops_nn_squeeze,
mlr_pipeops_nn_tanh,
mlr_pipeops_nn_tanhshrink,
mlr_pipeops_nn_threshold,
mlr_pipeops_nn_tokenizer_categ,
mlr_pipeops_nn_tokenizer_num,
mlr_pipeops_nn_unsqueeze,
mlr_pipeops_torch_ingress,
mlr_pipeops_torch_ingress_categ,
mlr_pipeops_torch_ingress_ltnsr,
mlr_pipeops_torch_ingress_num,
mlr_pipeops_torch_loss,
mlr_pipeops_torch_model,
mlr_pipeops_torch_model_classif,
mlr_pipeops_torch_model_regr.
Super classes
mlr3pipelines::PipeOp -> mlr3torch::PipeOpTorch -> PipeOpTorchFTTransformerBlock
Methods
Method new()
Create a new instance of this R6 class.
Usage
PipeOpTorchFTTransformerBlock$new(
id = "nn_ft_transformer_block",
param_vals = list()
)
Arguments
id
(character(1))
Identifier of the resulting object.
param_vals
(list())
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction.
Examples
# Construct the PipeOp
pipeop = po("nn_ft_transformer_block")
pipeop
#> PipeOp: <nn_ft_transformer_block> (not trained)
#> values: <attention_n_heads=8, attention_dropout=0.2, attention_initialization=kaiming, attention_normalization=<nn_layer_norm>, ffn_dropout=0.1, ffn_activation=<nn_reglu>, ffn_normalization=<nn_layer_norm>, residual_dropout=0, prenormalization=TRUE, is_first_layer=FALSE, query_idx=<NULL>, attention_bias=TRUE, ffn_bias_first=TRUE, ffn_bias_second=TRUE>
#> Input channels <name [train type, predict type]>:
#> input [ModelDescriptor,Task]
#> Output channels <name [train type, predict type]>:
#> output [ModelDescriptor,Task]
# The available parameters
pipeop$param_set
#> <ParamSet(16)>
#> id class lower upper nlevels default
#> <char> <char> <num> <num> <num> <list>
#> 1: attention_n_heads ParamInt 1 Inf Inf <NoDefault[0]>
#> 2: attention_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 3: attention_initialization ParamFct NA NA 2 <NoDefault[0]>
#> 4: attention_normalization ParamUty NA NA Inf <NoDefault[0]>
#> 5: ffn_d_hidden ParamInt 1 Inf Inf <NoDefault[0]>
#> 6: ffn_d_hidden_multiplier ParamDbl 0 Inf Inf <NoDefault[0]>
#> 7: ffn_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 8: ffn_activation ParamUty NA NA Inf <NoDefault[0]>
#> 9: ffn_normalization ParamUty NA NA Inf <NoDefault[0]>
#> 10: residual_dropout ParamDbl 0 1 Inf <NoDefault[0]>
#> 11: prenormalization ParamLgl NA NA 2 <NoDefault[0]>
#> 12: is_first_layer ParamLgl NA NA 2 <NoDefault[0]>
#> 13: query_idx ParamUty NA NA Inf <NoDefault[0]>
#> 14: attention_bias ParamLgl NA NA 2 <NoDefault[0]>
#> 15: ffn_bias_first ParamLgl NA NA 2 <NoDefault[0]>
#> 16: ffn_bias_second ParamLgl NA NA 2 <NoDefault[0]>
#> value
#> <list>
#> 1: 8
#> 2: 0.2
#> 3: kaiming
#> 4: <nn_layer_norm[1]>
#> 5: [NULL]
#> 6: [NULL]
#> 7: 0.1
#> 8: <nn_reglu[1]>
#> 9: <nn_layer_norm[1]>
#> 10: 0
#> 11: TRUE
#> 12: FALSE
#> 13: [NULL]
#> 14: TRUE
#> 15: TRUE
#> 16: TRUE
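Any of the parameters above can be overridden at construction time. The following is a sketch using parameter names from the table; the particular values are arbitrary choices for illustration.

```r
# Construct the block with non-default hyperparameters
# (values here are illustrative, not recommendations).
pipeop = po("nn_ft_transformer_block",
  attention_n_heads = 4,
  attention_dropout = 0.1,
  is_first_layer = TRUE
)
```

Alternatively, the same settings can be passed via `param_vals = list(...)` to `PipeOpTorchFTTransformerBlock$new()`.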