module specification

CC7184 - Data Mining and Machine Learning (2022/23)

Module specification: Module approved to run in 2022/23
Module title: Data Mining and Machine Learning
Module level: Masters (07)
Credit rating for module: 20
School: School of Computing and Digital Media
Total study hours: 200
  52 hours: Assessment preparation / delivery
  100 hours: Guided independent study
  48 hours: Scheduled learning and teaching activities
Assessment components
Type                Weighting  Qualifying mark  Description
Coursework          50%                         Apply data mining and machine learning techniques to a real-world problem (2,000-word report + application)
Unseen Examination  50%                         2-hour unseen exam
Running in 2022/23

(Please note that module timeslots are subject to change)
Period           Campus  Day       Time       Module Leader
Spring semester  North   Thursday  Morning
Summer studies   North   Monday    Morning
Summer studies   North   Tuesday   Afternoon

Module summary

This module provides an appreciation of the fundamental concepts, algorithms and process of data mining and machine learning. It covers machine learning algorithms and data mining techniques for data analysis, pattern mining, clustering, classification and regression. It equips students with practical skills in applying data mining and machine learning techniques to real-world analytics problems.

The aims of this module are to:
• provide students with an understanding of the fundamental concepts, algorithms and process of data mining and machine learning
• convey the purpose and breadth of the application areas of data mining and machine learning
• enable students to understand and compare the techniques and tools available for various types of data analytics problems
• equip students with practical skills in applying data mining techniques to solve real-world analytics problems

Syllabus

• Concepts and fundamentals of data mining and machine learning [LO1], [LO2]
• Data mining process: Cross-Industry Standard Process for Data Mining (CRISP-DM) [LO1]
• Data preparation and graphical exploration: visualising large data sets, data cleaning, outlier detection, variable transformation [LO1], [LO3]
• Machine learning to classify: decision trees, Naïve Bayes, Bayesian networks, support vector machines [LO3], [LO4], [LO6]
• Machine learning to predict: logistic regression, neural networks and deep learning [LO3], [LO4], [LO5], [LO6]
• Mining relationships among records: cluster analysis, association analysis (‘market basket analysis’) [LO3], [LO4], [LO6]
• Model evaluation and predictive performance [LO4], [LO5], [LO6]

Balance of independent study and scheduled teaching activity

Topics will be introduced through the medium of formal lectures, supported by tutorial and workshop sessions, and blended learning as follows:
- Lecture (2 hours per week):
Introduction of the major topics identified in the syllabus, plus practical exercises, directed reading and other further study.
- Workshop (2 hours per week):
Data mining technical skills will be further developed through lab-based workshops. Specific practical exercises are set to support students' development of skills with widely used data mining packages (e.g. Python scikit-learn and TensorFlow).
- Blended learning:
The University’s VLE and online tools are used to deliver content, assessment and feedback, to encourage active learning, and to enhance student engagement and the learning experience.

Learning outcomes

On successful completion of this module the student should be able to:
LO1 Demonstrate a comprehensive understanding of the fundamental concepts, algorithms and process of data mining and machine learning
LO2 Demonstrate an understanding of the purpose and breadth of areas of application of data mining and machine learning
LO3 Identify machine learning algorithms appropriate for particular classes of problems
LO4 Undertake a comparative evaluation of the strengths and limitations of various data mining techniques
LO5 Demonstrate a comprehensive understanding of state-of-the-art techniques in data mining and machine learning
LO6 Demonstrate the capacity to perform a self-directed piece of practical work that applies data mining techniques to a real-world problem and considers potential commercial risk

Assessment strategy

The module will be assessed by a practical piece of coursework (50%) and a 2-hour unseen examination (50%).

The coursework is designed mainly to assess the practical aspects of the module. It will provide students with the opportunity to undertake research on current issues and practical techniques in data mining and machine learning (LO1, LO4, LO5, LO6). It will also enable students to apply their knowledge to a practical real-world problem, demonstrating their skills in problem-solving and critical thinking/evaluation (LO2, LO3, LO6).

The unseen examination will provide an opportunity for students to demonstrate their understanding of data mining and machine learning concepts and techniques, and their ability to apply these techniques appropriately to the solution of given problems/scenarios (LO1, LO2, LO3, LO4, LO5). The examination will test students' retention and understanding of, and insight into, material drawn from the module.

Bibliography

Reading list available at: https://rl.talis.com/3/londonmet/lists/15BCF94D-F01E-AD5A-FE85-619EE204ECA7.html?lang=en-GB

1. A. Aldo Faisal, Cheng Soon Ong and Marc Peter Deisenroth (2020) Mathematics for Machine Learning, Cambridge University Press
2. Aurélien Géron (2019) Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, O'Reilly Media [Core]
3. Ian H. Witten, Eibe Frank, Mark A. Hall and Christopher J. Pal (2016) Data Mining: Practical Machine Learning Tools and Techniques, Elsevier Science
4. Mohammed J. Zaki and Wagner Meira, Jr. (2020) Data Mining and Machine Learning: Fundamental Concepts and Algorithms, 2nd ed., Cambridge University Press [Core]
5. Xin-She Yang (2019) Introduction to Algorithms for Data Mining and Machine Learning, Elsevier Science