Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
arXiv link: https://arxiv.org/abs/1907.06732
Padé Activation Units (PAU) have been renamed Rational Activation Functions.
Please check the updated repository!
PAU matches or outperforms common activation functions in terms of predictive performance and training time, and therefore relieves the network designer of having to commit to a potentially underperforming choice.
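For reference, a PAU computes a learnable rational (Padé) function of its input. In the "safe" form used in the paper, with numerator degree m and denominator degree n:

F(x) = P(x) / Q(x) = (a_0 + a_1 x + ... + a_m x^m) / (1 + |b_1 x + ... + b_n x^n|)

where the coefficients a_j and b_k are trained jointly with the network weights.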
PAU is implemented as a PyTorch extension using CUDA 10.1, so all that is needed is to install the extension. This requires the CUDA compiler and dev tools, but the process is straightforward:
In the folder /pau/cuda, execute:
python3 setup.py install
For this, you might need superuser rights or to work in a virtual environment.
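To verify that the extension built correctly, a quick smoke test like the following should run without errors (it assumes a CUDA-capable GPU, since the extension is compiled for CUDA):

import torch
from pau.utils import PAU

pau = PAU().cuda()                  # default PAU, moved to the GPU
x = torch.randn(16, device="cuda")
print(pau(x).shape)                 # expected: torch.Size([16])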
PAU can be integrated in the same way as any other common activation function:
import torch
from pau.utils import PAU

D_in, H, D_out = 784, 128, 10  # example dimensions: input, hidden, output

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    PAU(),  # e.g. instead of torch.nn.ReLU()
    torch.nn.Linear(H, D_out),
)
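Because the rational coefficients of a PAU are ordinary parameters, any standard optimizer updates them together with the layer weights. A minimal sketch of one training step, continuing the example above with random data and an illustrative batch size (it assumes the CUDA build, so the model is moved to the GPU):

model = model.cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)  # also covers the PAU coefficients

x = torch.randn(32, D_in, device="cuda")           # random example batch
y = torch.randint(0, D_out, (32,), device="cuda")  # random example labels

optimizer.zero_grad()
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()   # gradients also flow into the rational coefficients
optimizer.step()  # weights and activation function are learned end-to-end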
To reproduce the results reported in the paper, execute:
$ export PYTHONPATH="./"
$ python experiments/main.py --dataset mnist --arch conv --optimizer adam --lr 2e-3
# --dataset: name of the dataset; mnist for MNIST, fmnist for Fashion-MNIST
# --arch: neural network architecture; vgg, lenet, or conv
# --optimizer: either adam or sgd
# --lr: learning rate
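For example, to train the VGG architecture on Fashion-MNIST with SGD (the learning rate below is only an illustrative value, not one taken from the paper):

$ python experiments/main.py --dataset fmnist --arch vgg --optimizer sgd --lr 1e-2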