This page was last updated on 09/27/02
Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the Normal distribution, such as the Poisson, Binomial, Multinomial, and others. Generalized Linear Models also relax the requirement of equal or constant variances that is needed for hypothesis tests in traditional linear models.
The General Linear Univariate Model (GLUM)
Most parametric statistical analyses can be viewed as a process of fitting a linear model to the observed data and testing hypotheses about the fitted model's parameters. Even the lowly t-test is a form of the General Linear Univariate Model (GLUM). The Analysis of Variance (ANOVA), Regression, Multiple Regression, and the Analysis of Covariance (ANCOVA) are more complicated forms of the GLUM.
The least squares criterion is used to obtain estimates of the parameters of these models. Additional assumptions must be met in order to test hypotheses about the model's parameters. Besides the assumption of independence of the observations, which is required for all statistical analyses, hypothesis tests derived from GLUMs require normality of the response variable and constancy or homogeneity of variances.
The General Linear Multivariate Model (GLMM)
When attempting to explain variation in more than one response variable simultaneously, the modeling exercise is to fit the General Linear Multivariate Model (GLMM) to the data. Commonly used multivariate statistical procedures such as Multivariate Analysis of Variance (MANOVA), Multivariate Analysis of Covariance (MANCOVA), Discriminant Function Analysis (DFA), Canonical Correlation Analysis (CCA), and Principal Components Analysis (PCA) are all forms of the GLMM. To perform hypothesis tests in the context of the GLMM, one must assume that the response variables are multivariate normal and that the variance-covariance matrices are homogeneous.
When the distribution of the response variable(s) is not normal or multivariate normal, or if the variances or the variance-covariance matrices are not homogeneous, then application of hypothesis tests to GLUMs or GLMMs can lead to Type I and Type II error rates that differ from the nominal rates. Traditionally, transformations of the scale of the response variables have been applied to ensure that the assumptions required for hypothesis tests are met. For example, count data are often Poisson distributed and tend to be right-skewed. Furthermore, the variance of a Poisson random variable is equal to the mean of the response. Hence, for count data a transformation must both normalize the data and eliminate the inherent variance heterogeneity. Commonly, count data are transformed to a logarithmic scale or even a square-root scale; however, such transformations are not always successful in achieving the desired end. In fact, there is no a priori reason to believe that a scale exists that will ensure that data meet the normality and variance homogeneity assumptions.
Generalizing the Linear Model
The Generalized Linear Model is an extension of the General Linear Model to include response variables that follow any probability distribution in the exponential family of distributions. The exponential family includes such useful distributions as the Normal, Binomial, Poisson, Multinomial, Gamma, Negative Binomial, and others. Hypothesis tests applied to the Generalized Linear Model do not require normality of the response variable, nor do they require homogeneity of variances. Hence, Generalized Linear Models can be used when response variables follow distributions other than the Normal distribution, and when variances are not constant. For example, count data would be appropriately analyzed as a Poisson random variable within the context of the Generalized Linear Model.
Parameter estimates are obtained using the principle of maximum likelihood; therefore hypothesis tests are based on comparisons of likelihoods or the deviances of nested models.
What puts the "ized" in Generalized Linear Models
The common linear regression model (a form of the general linear model) specifies that the mean response µ is identical to a linear function η of the predictor variables x_j:

µ = η = β_0 + β_1x_1 + ... + β_px_p,    (1)

and uses least squares as the criterion by which to estimate the unknown parameters β = (β_0, β_1, ..., β_p)'. When observations are independent and normally distributed with constant variance σ^2, least squares estimation of β and σ^2 is equivalent to maximum likelihood estimation.
Generalized linear models encompass the general linear model and enlarge the class of linear least-squares models in two ways. First, the distribution of Y for fixed x is merely assumed to be from the exponential family of distributions, which includes important distributions such as the binomial, Poisson, exponential, and gamma distributions, in addition to the normal distribution. Second, the relationship between E(Y) = µ and η is specified by a link function η = g(µ), which is only required to be monotonic and differentiable.
The link function serves to link the random or stochastic component of the model (the probability distribution of the response variable) to the systematic component of the model (the linear predictor):

g(µ) = η = β_0 + β_1x_1 + ... + β_px_p,    (2)

where g(µ) is a (possibly nonlinear) link function that links the random component, E(Y), to the systematic component η. For traditional linear models, in which the random component consists of the assumption that the response variable follows the Normal distribution, the canonical link function is the identity link. The identity link specifies that the expected mean of the response variable is identical to the linear predictor, rather than to a nonlinear function of the linear predictor. The canonical link functions for a variety of probability distributions are given below.
Probability Distribution    Canonical Link Function
Normal                      Identity
Binomial                    Logit
Poisson                     Log
Gamma                       Reciprocal
Although other link functions are possible, the canonical links are most often used.
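The table above can be sketched as plain Python functions. This is purely illustrative: each canonical link g maps a mean µ onto the linear predictor scale, and its inverse maps a linear predictor back to a valid mean; the round-trip check below simply confirms that each inverse undoes its link (the test value 0.3 is arbitrary).

```python
# Canonical links and their inverses, written as (g, g_inverse) pairs.
import math

links = {
    "identity":   (lambda mu: mu,                    # Normal
                   lambda eta: eta),
    "logit":      (lambda mu: math.log(mu / (1 - mu)),  # Binomial
                   lambda eta: 1 / (1 + math.exp(-eta))),
    "log":        (lambda mu: math.log(mu),          # Poisson
                   lambda eta: math.exp(eta)),
    "reciprocal": (lambda mu: 1 / mu,                # Gamma
                   lambda eta: 1 / eta),
}

# Round-trip check: g_inv(g(mu)) should recover mu for a mean in range.
for name, (g, g_inv) in links.items():
    mu = 0.3
    print(name, round(g_inv(g(mu)), 6))  # each line ends in 0.3
```

Note that for the logit link the mean must lie in (0, 1), and for the log and reciprocal links it must be positive, which is exactly why these links keep fitted means in the range their distributions allow.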
The parameters in a generalized linear model can be estimated by the maximum likelihood method. For a given probability distribution specified by f(y_i; β, φ) and observations y = (y_1, y_2, ..., y_n)', the log-likelihood function for β and φ, expressed as a function of the mean values µ = (µ_1, ..., µ_n) of the responses {Y_1, Y_2, ..., Y_n}, has the form

l(µ, φ; y) = Σ_i log f(y_i; µ_i, φ).
The maximum likelihood estimates of the parameters β can be obtained by iteratively reweighted least squares (IRLS). Detailed information about the iterative algorithm and the asymptotic properties of the parameter estimates can be found in McCullagh and Nelder (1989).
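A minimal sketch of the IRLS algorithm for a Poisson model with the canonical log link is given below. To keep the example deterministic, the "observed" responses are placed exactly on a hypothetical model log(µ) = 0.5 + 0.8x, so the algorithm should recover those coefficients; real count data would of course scatter around the means.

```python
# Iteratively reweighted least squares for a Poisson GLM (log link).
import numpy as np

def irls_poisson(X, y, n_iter=25):
    """Estimate beta in log(mu) = X @ beta by IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                  # linear predictor
        mu = np.exp(eta)                # inverse of the log link
        W = mu                          # Poisson weights: Var(Y) = mu
        z = eta + (y - mu) / mu         # working (adjusted) response
        # Weighted least squares step: solve (X'WX) beta = X'Wz
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

x = np.linspace(0.0, 2.0, 50)
X = np.column_stack([np.ones_like(x), x])     # intercept + one covariate
y = np.exp(0.5 + 0.8 * x)                     # responses sitting on the model

beta_hat = irls_poisson(X, y)
print(beta_hat)  # close to [0.5, 0.8]
```

Each pass is just a weighted least-squares fit to a "working response," with the weights and working response recomputed from the current fitted means; for canonical links this is equivalent to Newton's method on the log-likelihood.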
Analogous to the residual sum of squares in linear regression, the goodness-of-fit of a generalized linear model can be measured by the scaled deviance

D = 2[l(y; y) − l(µ̂; y)],

where l(y; y) is the maximum log-likelihood achievable for an exact fit, in which the fitted values are equal to the observed values, and l(µ̂; y) is the log-likelihood calculated at the estimated parameters β̂. The deviance function is very useful for comparing two models when one model has parameters that are a subset of the second model's. The deviance is additive for such nested models if maximum likelihood estimates are used (McCullagh and Nelder 1989). Consider two nested models, the second having some covariates omitted, and denote the maximum likelihood fits of the two models by µ̂_1 and µ̂_2, respectively. Then the deviance difference D(µ̂_2) − D(µ̂_1) is identical to the likelihood-ratio statistic and has an approximate χ^2 distribution with degrees of freedom equal to the difference between the numbers of parameters in the two models. For probability distributions in the exponential family, the χ^2 approximation is usually quite accurate for differences of deviance, even though it may be inaccurate for the deviances themselves (McCullagh and Nelder 1989).
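The nested-model comparison described above can be sketched for the Poisson case, where the deviance has the closed form D = 2 Σ[y log(y/µ̂) − (y − µ̂)]. The counts below are hypothetical, and the IRLS fitter is the same minimal scheme sketched earlier; an intercept-only model is compared against a model with one covariate.

```python
# Deviance difference as a likelihood-ratio test for nested Poisson models.
import numpy as np

def fit_poisson(X, y, n_iter=100):
    """Return fitted means from a Poisson GLM (log link) fit by IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        z = X @ beta + (y - mu) / mu
        beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
    return np.exp(X @ beta)

def poisson_deviance(y, mu):
    # D = 2 * sum[ y*log(y/mu) - (y - mu) ]; the y*log(y/mu) term is 0 at y = 0
    term = np.where(y > 0, y * np.log(np.where(y > 0, y / mu, 1.0)), 0.0)
    return 2.0 * np.sum(term - (y - mu))

# Hypothetical counts that increase with the covariate x
x = np.arange(10, dtype=float)
y = np.array([1, 0, 2, 3, 3, 5, 4, 7, 8, 9], dtype=float)

mu_null = fit_poisson(np.ones((10, 1)), y)                  # intercept only
mu_full = fit_poisson(np.column_stack([np.ones(10), x]), y) # intercept + slope

# The deviance difference is referred to a chi-square with 1 df; values
# above ~3.84 (the 0.05 critical value) favor retaining the covariate.
d_diff = poisson_deviance(y, mu_null) - poisson_deviance(y, mu_full)
print(round(d_diff, 2))
```

Because these counts trend strongly with x, the deviance difference comes out well above the 3.84 cutoff, so the covariate would be retained.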
Overdispersion
If the sampling variance of a response variable Y_i is significantly greater than that predicted by the expected probability distribution, Y_i is said to be overdispersed.
The covariance matrix of β̂ is estimated by Cov(β̂) = φ(X'WX)^{-1}, where X is the covariate matrix and W is a weight matrix used in the iterative algorithm. If overdispersion occurs, ignoring it (i.e., setting φ = 1) will result in underestimating the standard errors of the parameter estimates, which may lead to incorrect conclusions. McCullagh and Nelder (1989) suggest modeling the mean and dispersion jointly as a way to take possible overdispersion into account; the detailed fitting procedure can be found there.
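A simpler, commonly used correction than joint mean-dispersion modeling is to estimate φ from the Pearson statistic of the fitted model and inflate the standard errors by √φ (a quasi-likelihood-style adjustment). The sketch below uses hypothetical counts, fitted means, and a hypothetical naive standard error purely to show the arithmetic.

```python
# Estimating the dispersion phi from the Pearson statistic for a Poisson fit.
import numpy as np

y  = np.array([0, 2, 1, 7, 3, 12, 2, 9, 15, 4], dtype=float)  # observed counts
mu = np.array([2, 2, 3, 3, 4, 5, 5, 6, 7, 8], dtype=float)    # fitted means

# Pearson X^2 = sum (y - mu)^2 / Var(mu); for the Poisson, Var(mu) = mu
pearson = np.sum((y - mu) ** 2 / mu)
phi = pearson / (len(y) - 2)   # residual df: n minus p parameters (p = 2 here)

naive_se = 0.10                # hypothetical SE computed assuming phi = 1
corrected_se = naive_se * np.sqrt(phi)
print(round(phi, 2), round(corrected_se, 3))
# phi well above 1 signals overdispersion, and the corrected SE is larger
```

If φ were near 1, the Poisson variance assumption would be adequate and no correction would be needed; here the counts are noisier than a Poisson predicts, so the naive standard error roughly doubles.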
GLZs can be fit and evaluated using S-PLUS, SAS, SPSS, and a number of other statistical packages. Of the major packages, S-PLUS and SAS provide the greatest flexibility in fitting and evaluating GLZs.
References
Agresti, A. 1996. An Introduction to Categorical Data Analysis. John Wiley & Sons: New York. (A very readable introduction to the many forms of the generalized linear model.)
McCullagh, P. and J.A. Nelder. 1989. Generalized Linear Models. Chapman and Hall: London. (The mathematical statistics of generalized linear models.)
Ecological Applications of Generalized Linear Models
Vincent, P.J. and J.M. Haworth. 1983. Poisson regression models of species abundance. Journal of Biogeography 10: 153-160.
Connor, E.F., E. Hosfield, D. Meeter, and X. Nui. 1997. Tests for aggregation and size-based sample-unit selection when sample units vary in size. Ecology 78: 1238-1249.
Links to Other Websites
- Introduction, bibliography, software, and other information on GLZs
- Fairly comprehensive introduction to GLZs
- Using Matlab to fit GLZs
- Brief introduction to GLZs