CM3104: Large-Scale Databases

School: Cardiff School of Computer Science and Informatics
Department Code: COMSC
Module Code: CM3104
External Subject Code: 100754
Number of Credits: 20
Level: L6
Language of Delivery: English
Module Leader: Dr Alia Abdelmoty
Semester: Autumn Semester
Academic Year: 2025/6

Outline Description of Module

This module explores a range of database technologies motivated by the demands of applications that create massive volumes of data of rapidly changing types: structured, semi-structured and unstructured. For example, the management of location and geospatial information has led to extensions of conventional relational databases that are supported by object-relational database systems. Access to massive quantities of social, scientific and commercial data on the web has led to more radical departures from the relational data model. The module introduces the modelling and management of large-scale datasets with a range of modern database technologies, including NoSQL document and graph databases.

On completion of the module a student should be able to

  1. Demonstrate an appreciation of applications of large-scale databases in a variety of commercial, scientific and professional contexts;

  2. Discuss how relational databases are extended with object-relational technologies to support management of spatial information;

  3. Understand the characteristics of geospatial information and the methods of processing it for storage and retrieval;

  4. Describe non-relational database approaches, including document and graph databases, that support access to large data sets;

  5. Choose and develop a non-relational database solution suitable for the type of data and application considered.

How the module will be delivered

Modules will be delivered through a combination of lectures, online seminars and supervised lab sessions. You will be guided through learning activities appropriate to your module, which may include: 

face-to-face lectures for demonstration and discussion of learning material

lab sessions for application and practice of learning material and assessment support

online resources that you work through at your own pace (e.g. videos, web resources, e-books, quizzes)

online interactive sessions to work with other students and staff (e.g. discussions, live streaming of presentations, live-coding, team meetings)

Skills that will be practised and developed

Creating and querying NoSQL databases using MongoDB;

Modelling and querying data in graph databases using Neo4j;

Using spatial SQL to store and retrieve spatial/geographic data;

Applying basic geoparsing methods in Python for place-name entity recognition and for geocoding place names.

How the module will be assessed

A blend of assessment types, which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Students will be provided with reassessment opportunities in line with University regulations. 

Assessment Breakdown

Type                            %    Title                  Duration (hrs)
Written Assessment              50   Coursework             N/A
Exam online – Autumn semester   50   Large Scale Database   2

Syllabus content

Review of applications that require support for massive quantities of data, with reference to Cloud Computing and to Big Data. 

Non-relational database management methods (NoSQL) for access to large distributed datasets. 

Spatial databases for Geographical Information Systems (GIS), including spatial data models and spatial extensions of SQL. 

Methods for indexing spatial and textual data, and for geo-referencing documents to support spatio-textual indexing.


Copyright Cardiff University. Registered charity no. 1136855