SANN Overviews - Activation Functions
As mentioned before, a multilayer perceptron (MLP) is a feedforward neural network architecture with unidirectional full connections between successive layers. However, the architecture alone does not uniquely determine the properties of a network. In addition to the network architecture, the neurons of a network have activation functions that transform the incoming signals from the neurons of the previous layer using a mathematical function. The type of this function can profoundly influence the performance of the network. Thus, it is important to choose an appropriate type of activation function for the neurons of a neural network.
The input neurons usually have no activation function. In other words, they use the identity function, which means that the input signals are not transformed at all. Instead, they are combined in a weighted sum (weighted by the input-hidden layer weights) and passed on to the neurons in the layer above (usually called the hidden layer). For an MLP with two layers (MLP2), it is recommended that you use the tanh (hyperbolic tangent) function, although other types are also possible, such as the logistic sigmoid and exponential functions. The output neuron activation functions are, in most cases, set to the identity, but this may vary from task to task. For example, in classification tasks they are set to softmax (Bishop 1995), while for regression problems they are set to the identity (together with the choice of tanh for the hidden neurons).
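The sketch below illustrates this arrangement: identity input neurons feeding a tanh hidden layer, with an identity output for regression or a softmax output for classification. It is a minimal NumPy illustration, not SANN code; the function name, layer sizes, and weight values are arbitrary assumptions made for the example.

import numpy as np

def mlp2_forward(x, W1, b1, W2, b2, task="regression"):
    """Forward pass of a two-layer MLP (MLP2): identity inputs,
    tanh hidden units, identity or softmax outputs."""
    # Input neurons apply the identity; the weighted sum forms
    # the net input to the hidden layer.
    a_hidden = W1 @ x + b1
    h = np.tanh(a_hidden)          # recommended hidden activation
    a_out = W2 @ h + b2            # net input to the output layer
    if task == "classification":
        # Softmax yields normalized, probability-like outputs.
        e = np.exp(a_out - a_out.max())
        return e / e.sum()
    return a_out                   # identity output for regression

# Illustrative shapes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
print(mlp2_forward(x, W1, b1, W2, b2, task="classification"))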
The
set of neuron activation functions for the hidden and output neurons available
in STATISTICA Automatic Neural
Networks is given in the table below:
Identity
Definition: f(a) = a
Description: The activation of the neuron is passed on directly as the output.
Range: (-inf, +inf)

Logistic sigmoid
Definition: f(a) = 1 / (1 + e^(-a))
Description: An S-shaped curve.
Range: (0, 1)

Hyperbolic tangent
Definition: f(a) = (e^a - e^(-a)) / (e^a + e^(-a))
Description: A sigmoid curve similar to the logistic function. Often performs better than the logistic function because of its symmetry. Ideal for multilayer perceptrons, particularly the hidden layers.
Range: (-1, +1)

Exponential
Definition: f(a) = e^(-a)
Description: The negative exponential function.
Range: (0, +inf)

Sine
Definition: f(a) = sin(a)
Description: Possibly useful when recognizing radially distributed data. Not used by default.
Range: [-1, +1]

Softmax
Definition: f(a_i) = e^(a_i) / sum_j e^(a_j)
Description: Mainly used for (but not restricted to) classification tasks. Useful for constructing neural networks with normalized multiple outputs, which makes it particularly suitable for creating neural network classifiers with probabilistic outputs.
Range: (0, 1)

Gaussian
Definition: f(x) = exp(-||x - mu||^2 / (2 sigma^2))
Description: This type of isotropic Gaussian activation function is used solely by the hidden units of an RBF neural network, which are also known as radial basis functions. The location parameter (also known as the prototype vector) and spread parameter are equivalent to the input-hidden layer weights of an MLP neural network.
Range: (0, 1]
In the definitions above, a is the net input of a neuron (for MLPs, the weighted sum of the neuron's inputs). For the Gaussian, x represents the vector of the neuron's inputs, mu the vector of input weights (the RBF center), and sigma the RBF's spread (width).
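For reference, the following sketch implements the table's definitions in NumPy. The function names and signatures are illustrative assumptions, not part of SANN; the Gaussian takes the input vector, center, and spread described above.

import numpy as np

# Plain implementations of the activation functions in the table above.

def identity(a):
    return a                            # range (-inf, +inf)

def logistic(a):
    return 1.0 / (1.0 + np.exp(-a))     # range (0, 1)

def hyperbolic_tangent(a):
    return np.tanh(a)                   # range (-1, +1)

def negative_exponential(a):
    return np.exp(-a)                   # range (0, +inf)

def sine(a):
    return np.sin(a)                    # range [-1, +1]

def softmax(a):
    e = np.exp(a - np.max(a))           # subtract max for numerical stability
    return e / e.sum()                  # components lie in (0, 1) and sum to 1

def gaussian_rbf(x, mu, sigma):
    # Isotropic Gaussian over the input vector x, centered at mu
    # with spread sigma; range (0, 1].
    return np.exp(-np.sum((x - mu) ** 2) / (2.0 * sigma ** 2))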