MA7007 - Statistical Modelling and Forecasting (2020/21)
|Module specification||Module approved to run in 2020/21|
|Module title||Statistical Modelling and Forecasting|
|Module level||Masters (07)|
|Credit rating for module||20|
|School||School of Computing and Digital Media|
|Total study hours||200|
|Running in 2020/21||
Module summary
This module will introduce students to modern statistical modelling techniques and how those techniques can be used for prediction and forecasting. Throughout, the R statistical environment and software will be used in conjunction with relevant statistical libraries.
The module will introduce modern regression techniques (including smoothing) and discuss different model selection techniques (including classical statistical hypothesis testing) and how those techniques can be used for prediction purposes.
Module aims
1. To equip graduate students with modern statistical techniques.
2. To provide students with selected advanced statistical topics, including forecasting.
3. To prepare students to be able to read and understand professional articles.
4. To prepare students to carry out their own research and use modern statistical techniques as one of the tools for their research.
Syllabus
1. Introduction to probability, distribution functions and inference
Different types of probability distribution functions
Statistical inference and hypothesis testing
2. Flexible regression models for continuous and categorical data, including:
i) Generalised Linear Models (GLM)
ii) Generalised Additive Models (GAM) and
iii) Generalised Additive Models for Location, Scale and Shape (GAMLSS)
Introduction to smoothing techniques.
3. Model selection techniques and forecasting:
i) Likelihood ratio test
ii) Generalised Akaike information criterion
iii) The bias versus variance dilemma
iv) Forward, backward and stepwise selection techniques
v) Ridge regression and lasso
vi) Cross-validation techniques
vii) Validation and test sample techniques
Learning and teaching
The module will be delivered through a combination of lectures and associated tutorial and laboratory workshops over a period of 12 weeks. Lecture topics will be supplemented with laboratory sessions to illustrate the application of the techniques studied. The R software will be used and students will be encouraged to broaden their knowledge by exploring complex real-world data sets. Critical evaluation of the techniques used will be encouraged. The tutorial and lab sessions will also provide opportunities for students to obtain informal feedback from the teaching staff on their progress.
Additional teaching and learning resources will be made available via WebLearn, and students will be expected to spend a significant proportion of their time on private study.
Learning outcomes
On completing the module, students will be able to:
1. Demonstrate substantial knowledge and understanding of modern statistical modelling techniques and their relation to traditional statistical analyses.
2. Use R to analyse a wide range of univariate response data with explanatory variates, factors and smoothing.
3. Understand the different types of probability distribution functions and some of their properties.
4. Build a statistical model and check its assumptions.
5. Understand the power and limitations of prediction (forecasting) techniques.
6. Carry out independent investigation and write clear and concise scientific reports.
7. Critically evaluate and clearly understand how legal, social, ethical and professional issues apply to academic research and PhD programmes.
Assessment strategy
Due to its practical nature, the module will be assessed by means of a comprehensive case study investigation. The case study will be based on a realistic data-driven problem and will provide an opportunity for students to demonstrate their skills in problem solving, analysis of data using modern statistical techniques, critical evaluation of selected models and structured report writing.
The case study should include:
- Preliminary investigation of the problem at hand.
- Selecting suitable methods for solving the problem.
- Evaluating the limitations of the techniques and the impact of any simplifying assumptions on the validity of the solution.
- Gathering input data and using R software for analyses.
- Interpreting the results and writing a comprehensive report of about 5000 words.
Bibliography
Aitkin, M., Francis, B., Hinde, J. and Darnell, R. (2009) Statistical Modelling in R. Oxford Statistical Science Series.
Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer. [CORE]
Stasinopoulos, D.M., Rigby, R.A., Voudouris, V., Heller, G. and De Bastiani, F. (2015) Flexible Regression and Smoothing: The GAMLSS Packages in R. First Draft. http://www.gamlss.org/wpcontent/uploads/2015/07/FlexibleRegressionAndSmoothingDraft-1.pdf