 Survival & Failure Time Analysis Introductory Overview - Distribution Fitting

General Introduction. In summary, the life table gives a good indication of how failures are distributed over time. For predictive purposes, however, it is often desirable to understand the shape of the underlying survival function in the population. Four major distributions have been proposed for modeling survival or failure times: the exponential distribution, the linear exponential distribution, the Weibull distribution of extreme events (see Notes and Technical Information for the interpretation of the parameters for the Weibull distribution), and the Gompertz distribution. Survival Analysis fits all four of these theoretical distributions to the observed life table.

See also Weibull and Reliability/Failure Time Analysis in the Process Analysis module for options to fit the Weibull distribution to raw data using maximum likelihood methods.

Estimation. The procedure for estimating the parameters of the theoretical survival functions is essentially a least squares linear regression algorithm (see Gehan & Siddiqui, 1973). A linear regression algorithm can be used because all four theoretical distributions (exponential, linear exponential, Weibull, and Gompertz) can be "made linear" by appropriate transformations (see Notes and Technical Information). However, such transformations can give the residuals different variances at different times, which can distort the unweighted estimates. The fitting algorithm in Survival Analysis therefore also computes two types of weighted least squares estimates.
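The linearization idea can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm used by Survival Analysis. It assumes a Weibull hazard written as h(t) = lam * gam * t^(gam - 1), so that ln h(t) = ln(lam * gam) + (gam - 1) * ln t is a straight line in ln t. The function names, the interval midpoints, and the use of the number at risk as a weight are all hypothetical choices made for the example; the two weighting schemes actually used by the module are described in Notes and Technical Information.

```python
import math

def weighted_line_fit(x, y, w=None):
    """Return (intercept, slope) minimizing sum of w_i * (y_i - a - b * x_i)**2.

    With w=None every point gets weight 1 (ordinary least squares).
    """
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_weibull_hazard(midpoints, hazards, weights=None):
    """Estimate (lam, gam) from life-table hazards, assuming
    h(t) = lam * gam * t**(gam - 1), i.e. a line in log-log coordinates."""
    x = [math.log(t) for t in midpoints]
    y = [math.log(h) for h in hazards]
    intercept, slope = weighted_line_fit(x, y, weights)
    gam = slope + 1.0                  # slope is gam - 1
    lam = math.exp(intercept) / gam    # intercept is ln(lam * gam)
    return lam, gam

# Synthetic check: hazards generated from known parameters should be recovered.
true_lam, true_gam = 0.5, 1.5
midpoints = [0.5, 1.5, 2.5, 3.5, 4.5]          # hypothetical interval midpoints
hazards = [true_lam * true_gam * t ** (true_gam - 1) for t in midpoints]
at_risk = [100, 70, 45, 25, 10]                # hypothetical weights

lam_ols, gam_ols = fit_weibull_hazard(midpoints, hazards)
lam_wls, gam_wls = fit_weibull_hazard(midpoints, hazards, at_risk)
```

Because the synthetic hazards lie exactly on the transformed line, the unweighted and weighted fits agree here. With real life-table hazards, which are noisier in late intervals where few cases remain at risk, the two sets of estimates would generally differ.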

Goodness-of-fit. Given the estimated parameters of a distribution function and the respective model, we can compute the likelihood of the data. We can also compute the likelihood of the data under the null model, that is, a model that allows for a different hazard rate in each interval. Without going into details, these two likelihoods can be compared via an incremental Chi-square test statistic. If this Chi-square is statistically significant, we conclude that the respective theoretical distribution fits the data significantly worse than the null model; that is, we reject the respective distribution as a model for our data.
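One way to picture this comparison is the following Python sketch. It is an assumption-laden illustration rather than the module's actual computation: it treats the failures in each interval as binomial given the number exposed to risk, takes the null model to be the observed failure proportion in each interval, and uses an exponential distribution with a hypothetical rate as the theoretical model. The function names and the life-table counts are invented for the example.

```python
import math

def binomial_loglik(at_risk, failures, q):
    """Log-likelihood of interval failure counts, given the conditional
    failure probability q_i for each interval (binomial kernel only)."""
    ll = 0.0
    for n, d, qi in zip(at_risk, failures, q):
        if d > 0:
            ll += d * math.log(qi)
        if n - d > 0:
            ll += (n - d) * math.log(1.0 - qi)
    return ll

def exponential_interval_probs(lam, starts, ends):
    """Probability of failing in [start, end) given survival to start,
    under a constant hazard lam."""
    return [1.0 - math.exp(-lam * (e - s)) for s, e in zip(starts, ends)]

def likelihood_ratio_chi2(at_risk, failures, q_model, n_params):
    """Return (chi2, df) comparing a fitted model with the null model
    that allows a separate failure probability in every interval."""
    q_null = [d / n for n, d in zip(at_risk, failures)]
    ll_null = binomial_loglik(at_risk, failures, q_null)
    ll_model = binomial_loglik(at_risk, failures, q_model)
    chi2 = 2.0 * (ll_null - ll_model)
    df = len(at_risk) - n_params       # intervals minus fitted parameters
    return chi2, df

# Hypothetical life table with four unit-width intervals.
starts = [0.0, 1.0, 2.0, 3.0]
ends = [1.0, 2.0, 3.0, 4.0]
at_risk = [100, 80, 64, 51]
failures = [20, 16, 13, 10]

q_exp = exponential_interval_probs(0.22, starts, ends)   # assumed rate
chi2, df = likelihood_ratio_chi2(at_risk, failures, q_exp, n_params=1)
```

The resulting statistic would then be referred to a Chi-square distribution with `df` degrees of freedom. A large value relative to that distribution indicates that the theoretical distribution fits worse than the null model.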

Plots. Survival Analysis produces plots of the survival function, hazard, and probability density for the observed data and the respective theoretical distributions. These plots provide a quick visual check of the goodness-of-fit of the theoretical distribution.