 GLZ Introductory Overview - Computational Approach

To summarize the basic ideas, the generalized linear model differs from the general linear model (of which multiple regression, for example, is a special case) in two major respects. First, the distribution of the dependent or response variable can be (explicitly) non-normal and does not have to be continuous; for example, it can be binomial, multinomial, or ordinal multinomial (i.e., contain information on ranks only). Second, the dependent variable values are predicted from a linear combination of predictor variables, which is "connected" to the dependent variable via a link function. The general linear model for a single dependent variable can be considered a special case of the generalized linear model: in the general linear model the dependent variable values are expected to follow the normal distribution, and the link function is the identity function (i.e., the linear combination of values for the predictor variables is not transformed).

To illustrate, in the general linear model a response variable Y is linearly associated with values on the X variables by

Y = b0 + b1X1 + b2X2 + ... + bkXk + e

(where e stands for the error variability that cannot be accounted for by the predictors; note that the expected value of e is assumed to be 0), while the relationship in the generalized linear model is assumed to be

Y = g (b0 + b1X1 + b2X2 + ... + bkXk) + e

where e is the error, and g(...) is a function. Formally, the inverse function of g(...), say f(...), is called the link function; so that:

f(μy) = b0 + b1X1 + b2X2 + ... + bkXk

where μy stands for the expected value of y.
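
The relationship between the link function f and its inverse g can be checked numerically. The following is a minimal sketch, not part of any particular package: the coefficient and predictor values are hypothetical, and the log link is chosen only as an example. It computes the linear combination of predictors, obtains the expected value of y under the identity link (the general linear model case) and under a log link, and confirms that applying f to the expected value recovers the linear combination.

```python
import math

def linear_predictor(b, x):
    """Return b0 + b1*x1 + ... + bk*xk for coefficients b and predictors x."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))

# Hypothetical coefficients (b0, b1, b2) and predictor values (X1, X2),
# chosen only for illustration.
b = [0.5, 0.2, -0.1]
x = [1.0, 3.0]

eta = linear_predictor(b, x)  # the untransformed linear combination

# General linear model: identity link, so the expected value equals eta.
mu_identity = eta

# Generalized linear model with a log link: f = log, so its inverse g = exp.
f = math.log
g = math.exp
mu_log = g(eta)

# Applying the link function to the expected value recovers the linear predictor.
assert math.isclose(f(mu_log), eta)
```

With the identity link nothing is transformed, which is why the general linear model appears here as a special case.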

Link functions and distributions. Various link functions (see McCullagh and Nelder, 1989) can be chosen, depending on the assumed distribution of the y variable values:

Normal, Gamma, Inverse normal, and Poisson distributions:

Identity link: f(z) = z

Log link: f(z) = log(z)

Power link: f(z) = z^a, for a given a

Binomial and Ordinal Multinomial distributions:

Probit link: f(z)=invnorm(z) where invnorm is the inverse of the standard normal cumulative distribution function (see Distributions and their functions).

Complementary log-log link: f(z)=log(-log(1-z)) 
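
The link functions listed above can be written as short functions. This is a sketch under stated assumptions: the function names are hypothetical, and invnorm is taken to be the inverse standard normal cumulative distribution function, supplied here by NormalDist.inv_cdf from Python's standard library (Python 3.8 or later). The inverses for the probit and complementary log-log links are included to show how a linear predictor maps back to a probability.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, sd 1

def identity_link(z):
    return z

def log_link(z):
    return math.log(z)  # defined for z > 0

def power_link(z, a):
    return z ** a  # z raised to a given power a

def probit_link(z):
    # invnorm(z): inverse of the standard normal CDF, for 0 < z < 1
    return _STD_NORMAL.inv_cdf(z)

def cloglog_link(z):
    # complementary log-log: log(-log(1 - z)), for 0 < z < 1
    return math.log(-math.log(1.0 - z))

# Inverse links: map a linear predictor value back to a probability.
def probit_inverse(eta):
    return _STD_NORMAL.cdf(eta)

def cloglog_inverse(eta):
    return 1.0 - math.exp(-math.exp(eta))
```

For example, cloglog_inverse(cloglog_link(0.3)) should return approximately 0.3, since each inverse undoes its link.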