
This module implements the GeGLU activation function, a Gated Linear Unit that uses the Gaussian Error Linear Unit (GELU) as its gate. It computes \(\text{GeGLU}(x, g) = x \cdot \text{GELU}(g)\), where \(x\) and \(g\) are obtained by splitting the input tensor in half along its last dimension.

Usage

nn_geglu()

References

Shazeer N (2020). “GLU Variants Improve Transformer.” arXiv:2002.05202, https://arxiv.org/abs/2002.05202.

Examples

x <- torch::torch_randn(10, 10)
glu <- nn_geglu()
glu(x)
#> torch_tensor
#>  0.5019 -0.0478  3.3846  0.1640 -0.1329
#>  0.0159  0.2733  0.0006  0.0956 -0.0281
#>  0.0038  1.3667  0.5392  0.1094 -0.7105
#> -0.2250  0.3124 -0.5476 -0.0186  0.0198
#> -0.0750 -0.1627 -0.1270 -0.2911  0.0552
#> -0.5853  0.2720 -0.1163  0.9933 -0.0611
#>  0.0317  0.0933  0.0303 -0.3528 -0.1193
#> -0.2845  0.0798  0.0772  0.0227 -0.0569
#> -0.0121 -0.1985 -0.2082  0.1170 -0.0601
#>  0.0411  0.0992  0.0005 -0.0273 -0.0529
#> [ CPUFloatType{10,5} ]
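
The same result can be computed by hand, which makes the splitting behaviour described above explicit. The sketch below is for illustration only and assumes the first half of the split is taken as \(x\) and the second half as \(g\); it uses torch_chunk() and nnf_gelu() from the torch package.

# Manual GeGLU: split the input in half along the last dimension and
# multiply the first half by GELU of the second half.
x <- torch::torch_randn(10, 10)
halves <- torch::torch_chunk(x, chunks = 2, dim = 2)  # two tensors of shape (10, 5)
manual <- halves[[1]] * torch::nnf_gelu(halves[[2]])

glu <- nn_geglu()
torch::torch_allclose(manual, glu(x))  # TRUE if the module splits the halves in this order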