CSC-30043 - Data Mining
Coordinator: Sangeeta Sangeeta Tel: +44 1782 733079
Lecture Time: See Timetable...
Level: Level 6
Credits: 15
Study Hours: 150
School Office: 01782 733075

Programme/Approved Electives for 2025/26

None

Available as a Free Standing Elective

No

Co-requisites

None

Prerequisites

CSC-10058 Introduction to Data Science I
CSC-10060 Introduction to Data Science II

Barred Combinations

None

Description for 2025/26

As the capacity to store and process large amounts of data increases exponentially, the need for data mining expertise is paramount. Many companies have been harvesting data from their operations for many years and are now realising the potential of this data to inform their future operations. Improving efficiency, increasing sales potential and cutting costs are among the many benefits that can be gained from the data mining knowledge and skills that this module will provide.

Aims
To provide the full skillset required of a data scientist in order to identify and collect appropriate data sets (sampling, selection etc.), apply pre-processing methods (cleaning, filtering etc.) and subsequently apply data mining techniques to generate new information.

Intended Learning Outcomes

Identify and collect appropriate data in order to design a data mining workflow.: 1,2
Apply pre-processing techniques to the collected data sets that minimise bias and distortion in the data.: 1,2
Select and apply appropriate data mining techniques in order to extract new and useful information from the data.: 1,2
Validate the findings of a data analysis and quantify their validity.: 1,2

Study hours

Lectures: 24h
Group work and preparation for presentation: 50h
Independent study: 76h

School Rules

None

Description of Module Assessment

1: Group Assessment weighted 50%
Group Project based on solutions to external partners' data-related problems
External partners to the school/university (such as companies, services, government bodies etc.) will be invited to present data-related problems that groups of students will attempt to address. Real data from these partners will be analysed by applying the data mining techniques learned in the module. Each group will then present its solution to the problem providers in a 10-minute presentation at the end of the module. The presentation will cover data collection, workflow, the techniques used, and reflection on bias and distortion and on the validity of the results. The mark for each student in a group will be composed of a group element as well as an individual element.

2: Report weighted 50%
Individual Report
In this assessment, the student will write a report and reflect on their understanding of the various data mining principles.