# Statistical Modelling

Module code: MD7442

Module co-ordinator: Mark Rutherford

## Linear Models

Provides a coherent framework for fitting, interpreting and checking linear models, and gives practical experience of applying these models to data collected in both clinical and pre-clinical settings.

• Introduction to linear models
• Relationships between variables
• Simple regression models - least squares, model assumptions, Analysis of Variance, coefficient of determination, statistical tests for the regression parameters, using the model for prediction, association and causation, model choice, extrapolation
• General linear models - definition, least squares estimation, properties of LSEs, confidence intervals on regression parameters, hypothesis testing, mean estimation and the prediction of new observations
• Polynomial models and indicator variables - polynomial models, categorical covariates, types of sums of squares
• Collinearity
• Examination of the residuals
• Model Building
• Regression analysis in special cases - non-linear models, heterogeneous observations, correlated observations
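
As a flavour of the simple regression topics listed above, the following is a minimal sketch of a least squares fit with its coefficient of determination. It assumes NumPy is available; the function name `fit_simple_regression` and the toy data in the usage note are illustrative only and are not taken from the module materials.

```python
import numpy as np


def fit_simple_regression(x, y):
    """Fit y = b0 + b1 * x by ordinary least squares.

    Returns the coefficient vector (b0, b1) and R^2.
    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Design matrix: a column of ones for the intercept, then the covariate.
    X = np.column_stack([np.ones_like(x), x])

    # Least squares estimate: minimises the residual sum of squares.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    fitted = X @ beta
    ss_res = np.sum((y - fitted) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares about the mean
    r_squared = 1.0 - ss_res / ss_tot     # coefficient of determination

    return beta, r_squared
```

For example, `fit_simple_regression([0, 1, 2, 3], [1, 3, 5, 7])` recovers an intercept of 1 and a slope of 2, since the toy data lie exactly on that line.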

## Generalised Linear Models

Introduces the theory and application of Generalised Linear Models (GLMs). The module covers all stages of the modelling process: selecting an initial model, fitting, model checking, and the interpretation and communication of results. The necessary theory is developed at each stage.

• Introduction to GLMs - exponential family of distributions: the normal, Poisson, binomial and gamma, identifying the canonical and dispersion parameters and the mean and variance, defining the linear predictor, offset and link functions
• Selecting an Initial GLM
• Fitting a GLM - problems fitting a GLM by maximum likelihood, iteratively reweighted least squares (IRLS)
• Model Selection & Checking - deviance and Pearson's X² as goodness of fit measures, analysis of deviance tables, Pearson and deviance residuals, diagnostic plots and statistics
• Further GLMs & Extensions - log-linear models for multinomial distributions, over-dispersion, quasi-likelihood
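
To illustrate the fitting topic listed above, here is a minimal sketch of IRLS for one specific case: a Poisson GLM with the canonical log link. It assumes NumPy; the function name `fit_poisson_irls`, the starting values and the stopping rule are illustrative choices and are not prescribed by the module.

```python
import numpy as np


def fit_poisson_irls(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson GLM with log link by IRLS.

    X is the design matrix (include a column of ones for an intercept),
    y the vector of counts. Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Assumed starting point: linear predictor from slightly shifted counts,
    # so that the log is defined when a count is zero.
    eta = np.log(y + 0.5)
    beta = np.zeros(X.shape[1])

    for _ in range(n_iter):
        mu = np.exp(eta)          # inverse log link
        z = eta + (y - mu) / mu   # working response
        w = mu                    # working weights for Poisson with log link

        # Weighted least squares step: solve (X'WX) beta = X'Wz.
        XtWX = X.T @ (w[:, None] * X)
        XtWz = X.T @ (w * z)
        beta_new = np.linalg.solve(XtWX, XtWz)

        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        eta = X @ beta
        if converged:
            break

    return beta
```

With an intercept-only design matrix the fitted coefficient is the log of the sample mean, which gives a quick sanity check on the sketch.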

## Learning

• 20 one-hour lectures
• 20 one-hour workshops

## Assessment

• Exam, 90 minutes (50%)
• Coursework (50%)