
This module implements the Gaussian Error Linear Unit (GELU) Gated Linear Unit, or GeGLU, activation function. It computes \(\text{GeGLU}(x, g) = x \cdot \text{GELU}(g)\), where \(x\) and \(g\) are obtained by splitting the input tensor in half along its last dimension.
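As a point of reference, the same computation can be written out directly with torch operations. The helper below is a minimal sketch (geglu_manual is an illustrative name, not an exported function of this package), assuming an input whose last dimension is even; the exact GELU variant used internally by nn_geglu() may differ.

library(torch)

# Illustrative re-implementation of the GeGLU formula above (not the package's code).
geglu_manual <- function(input) {
  last_dim <- length(input$shape)                           # index of the last dimension
  halves <- torch_chunk(input, chunks = 2, dim = last_dim)  # split in half: x and g
  x <- halves[[1]]
  g <- halves[[2]]
  x * nnf_gelu(g)                                           # GeGLU(x, g) = x * GELU(g)
}

input <- torch_randn(10, 10)
geglu_manual(input)$shape  # 10 5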

Usage

nn_geglu()

References

Shazeer N (2020). “GLU Variants Improve Transformer.” arXiv:2002.05202, https://arxiv.org/abs/2002.05202.

Examples

x <- torch::torch_randn(10, 10)  # last dimension must be even so it can be split in half
glu <- nn_geglu()
glu(x)  # keeps the leading dimension, halves the last one: 10 x 5
#> torch_tensor
#> -0.0781  0.0429  0.2326 -0.3157  0.3587
#>  0.0269 -1.2169  1.1411  0.3839  1.0649
#> -2.8136  0.0258  0.3226 -0.1794 -0.2462
#> -0.0137  0.0488  0.0029 -0.0838 -0.1425
#> -0.7842 -1.2676  0.9033  0.0471  0.0375
#> -0.1288  0.3352 -0.1260  0.1101  2.9018
#> -0.1718 -0.0550 -0.2146  0.0457  0.1578
#>  0.0651  0.3436 -0.0093 -0.1082 -0.0523
#> -0.0214 -0.6303  0.0938  2.1864  0.0974
#> -0.0777 -0.1244  0.6496  0.0455 -0.0079
#> [ CPUFloatType{10,5} ]
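
In practice, GeGLU usually sits inside a transformer feed-forward block, where a preceding linear layer doubles the hidden dimension so that nn_geglu() can split it back in half. The block below is a sketch with illustrative layer sizes, not an API of this package.

# A GeGLU feed-forward block: 64 -> 256, gate down to 128, project back to 64.
ffn <- torch::nn_sequential(
  torch::nn_linear(64, 256),  # project to 2 * 128 so GeGLU has two halves to split
  nn_geglu(),                 # split and gate: last dimension 256 -> 128
  torch::nn_linear(128, 64)   # project back to the model dimension
)

h <- torch::torch_randn(32, 64)
ffn(h)$shape  # 32 64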