The loss function (the term loss was first used by Wald, 1939) represents a selected measure of the discrepancy between the observed data and the data "predicted" by the fitted function. This is the function that is minimized in the process of fitting a model. For example, in many traditional general linear model techniques, the loss function is the sum of squared deviations from the fitted line or plane.
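As a concrete illustration of the least squares loss, the following sketch (using hypothetical data and NumPy) computes the sum of squared deviations between observed values and a fitted straight line; the fitted coefficients minimize exactly this quantity:

```python
import numpy as np

# Hypothetical data: a linear relationship with Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=x.size)

def sum_squared_loss(params, x, y):
    """Least squares loss: sum of squared deviations from the fitted line."""
    slope, intercept = params
    predicted = slope * x + intercept
    return np.sum((y - predicted) ** 2)

# np.polyfit finds, in closed form, the line that minimizes this loss.
slope, intercept = np.polyfit(x, y, deg=1)
loss_at_fit = sum_squared_loss((slope, intercept), x, y)

# Any other choice of slope and intercept yields a larger (or equal) loss.
assert loss_at_fit <= sum_squared_loss((2.1, 0.9), x, y)
```

The particular slope and intercept values here are illustrative only; the point is that "fitting" means searching for the parameter values at which the chosen loss function is smallest.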

A common alternative to the least squares loss function is to maximize the likelihood or log-likelihood function (or, equivalently, to minimize the negative log-likelihood function; the term maximum likelihood was first used by Fisher, 1929a; see also Maximum Likelihood Method). These functions are typically used when fitting nonlinear models. In the most general terms, the likelihood function is defined as the probability of the observed data, viewed as a function of the model parameters; for n independent observations y1, ..., yn it takes the product form

L = f(y1 | theta) * f(y2 | theta) * ... * f(yn | theta)

where theta denotes the parameters of the model. In other words, we can compute the probability (now called L, the likelihood) that the specific dependent variable values observed in our sample would occur, given the respective regression model.
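To make the connection between the two loss functions concrete, the sketch below (again with hypothetical data) evaluates the negative log-likelihood of a linear model under independent Gaussian errors with a known standard deviation. In this special case, minimizing the negative log-likelihood is equivalent to minimizing the sum of squared residuals, so the least squares estimates are also the maximum likelihood estimates:

```python
import numpy as np

# Hypothetical data: a linear relationship with Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=x.size)

def neg_log_likelihood(params, x, y, sigma=1.0):
    """Negative log-likelihood of the data under a linear model with
    independent Gaussian errors of known standard deviation sigma."""
    slope, intercept = params
    residuals = y - (slope * x + intercept)
    n = y.size
    return (n / 2.0) * np.log(2.0 * np.pi * sigma**2) \
        + np.sum(residuals**2) / (2.0 * sigma**2)

# The least squares fit minimizes the residual sum of squares, and
# therefore also minimizes this negative log-likelihood.
slope, intercept = np.polyfit(x, y, deg=1)
assert (neg_log_likelihood((slope, intercept), x, y)
        <= neg_log_likelihood((2.1, 0.9), x, y))
```

For error distributions other than the Gaussian, the two loss functions generally differ, which is one reason likelihood-based losses are preferred for nonlinear and non-normal models.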

See also: Nonlinear Estimation, Variance Components and Mixed Model ANOVA/ANCOVA, Time Series,