CMT655: Manipulating and Exploiting Data

School: Cardiff School of Computer Science and Informatics
Department Code: COMSC
Module Code: CMT655
External Subject Code: 100370
Number of Credits: 20
Level: L7
Language of Delivery: English
Module Leader: Dr Nico Potyka
Semester: Spring Semester
Academic Year: 2025/6

Outline Description of Module

Modern applications rely on data to provide a meaningful experience to users. This module builds on the use of databases in previous modules. It discusses what data is and where it comes from, then examines the data manipulation layer of a multi-tiered architecture, exploring how to convert data into meaningful results using techniques such as classification and large-scale databases.

On completion of the module a student should be able to

  1. Critically assess the legal, social and ethical implications of data collection, storage, and knowledge extraction.

  2. Manipulate a range of supplied data into usable formats for various purposes, justifying the use of appropriate tools and critically appraising their underlying mechanisms and concepts.

  3. Design software based on requirements that uses third-party libraries for processing and extracting knowledge (e.g., for statistical analysis and natural language processing tasks).

  4. Package processed data in suitable formats for systems to use (e.g., JSON, CSV, web API).

  5. Analyse, design, and transform a conceptual schema into the schema of an appropriate database management system.

How the module will be delivered

The module will be delivered through a combination of lectures, supervised lab sessions and tutorials as appropriate. You will be expected to attend all timetabled sessions and engage with online material. You will be guided through learning activities appropriate to your module, which may include: 

online resources that you work through at your own pace (e.g. videos, web resources, e-books, quizzes),

online interactive sessions to work with other students and staff (e.g. discussions, live streaming of presentations, live-coding, team meetings),

face-to-face small group sessions (e.g. help classes, feedback sessions).

Skills that will be practised and developed

Designing appropriate software to generate meaningful content from data.

Processing data sets of varying sizes using appropriate data science techniques.

Exploiting third-party APIs and libraries to extract knowledge.

Professional and ethical behaviour, time management, and critical and defensible thinking.

How the module will be assessed

A blend of assessment types, which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Students will be provided with reassessment opportunities in line with University regulations. 

Assessment Breakdown

Type        %     Title       Duration (hrs)
Portfolio   100   Portfolio   N/A

Syllabus content

Using third-party libraries (e.g., Pandas, NumPy)

Basic notions of statistics

Data cleaning and processing tasks 

Statistical analysis programming tasks (e.g., regression, correlation)

Data formats (e.g., JSON, CSV) 

Legal, social, and ethical considerations in analysing data 

Designing database systems (e.g., relational and non-relational databases)

Using database systems for data processing tasks (e.g., SQL) 


Copyright Cardiff University. Registered charity no. 1136855