CM3104: Large-Scale Databases
School | Cardiff School of Computer Science and Informatics |
Department Code | COMSC |
Module Code | CM3104 |
External Subject Code | 100754 |
Number of Credits | 20 |
Level | L6 |
Language of Delivery | English |
Module Leader | Dr Alia Abdelmoty |
Semester | Autumn Semester |
Academic Year | 2025/6 |
Outline Description of Module
This module explores a range of database technologies motivated by the demands of applications that create massive volumes of data of rapidly changing types: structured, semi-structured and unstructured. For example, management of location and geo-spatial information has resulted in extensions to conventional relational databases that can be supported by object-relational database systems. Access to massive quantities of social, scientific and commercial data on the web has resulted in more radical departures from the relational data model. The module introduces the modelling and management of large-scale datasets with a range of modern database technologies, including NoSQL document and graph databases.
On completion of the module a student should be able to
- Demonstrate an appreciation of applications of large-scale databases in a variety of commercial, scientific and professional contexts;
- Discuss how relational databases are extended with object-relational technologies to support management of spatial information;
- Understand the characteristics of geospatial information and the methods of processing it for storage and retrieval;
- Describe non-relational database approaches, including document and graph databases, that support access to large data sets;
- Choose and develop a non-relational database solution suitable for the type of data and application considered.
How the module will be delivered
Modules will be delivered through a combination of lectures, online seminars and supervised lab sessions. You will be guided through learning activities appropriate to your module, which may include:
face-to-face lectures for demonstration and discussion of learning material;
lab sessions for application and practice of learning material and assessment support;
online resources that you work through at your own pace (e.g. videos, web resources, e-books, quizzes);
online interactive sessions to work with other students and staff (e.g. discussions, live streaming of presentations, live-coding, team meetings).
Skills that will be practised and developed
Creating and querying NoSQL databases using MongoDB;
Modelling and querying data in graph databases using Neo4j;
Using spatial SQL to store and retrieve spatial/geographic data;
Applying basic geoparsing methods in Python for place name entity recognition and for geocoding place names.
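As an informal illustration of the geoparsing skill listed above, the following minimal Python sketch recognises place names in text by dictionary lookup and geocodes them to coordinates. It is not taken from the module materials: the `GAZETTEER` table, its approximate coordinates and the `geoparse` function are hypothetical. In practice, a trained named-entity recogniser and a full gazetteer or geocoding service would usually take their place.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
# A real gazetteer would hold many more entries and handle ambiguous names.
GAZETTEER = {
    "Cardiff": (51.48, -3.18),
    "Swansea": (51.62, -3.94),
    "Newport": (51.58, -3.00),
}


def geoparse(text, gazetteer=GAZETTEER):
    """Return (name, (lat, lon)) for each gazetteer place found in text,
    in order of appearance."""
    if not gazetteer:
        return []
    # Try longer names first so a multi-word name wins over a shorter prefix.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    # Step 1 (recognition): find place names. Step 2 (geocoding): look up coordinates.
    return [(m.group(1), gazetteer[m.group(1)]) for m in pattern.finditer(text)]


matches = geoparse("The train from Cardiff to Swansea was delayed.")
for name, (lat, lon) in matches:
    print(f"{name}: lat={lat}, lon={lon}")
```

Exact string matching of this kind cannot distinguish between places that share a name, nor between a place and a person with the same name.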
How the module will be assessed
A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.
Students will be provided with reassessment opportunities in line with University regulations.
Assessment Breakdown
Type | % | Title | Duration(hrs) |
---|---|---|---|
Written Assessment | 50 | Coursework | N/A |
Exam online – Autumn semester | 50 | Large Scale Database | 2 |
Syllabus content
Review of applications that require support for massive quantities of data, with reference to Cloud Computing and to Big Data.
Non-relational database management methods (NoSQL) for access to large distributed datasets.
Spatial databases for Geographical Information Systems (GIS), including spatial data models and spatial extensions of SQL.
Methods for indexing spatial data and textual data and methods for geo-referencing documents to support spatio-textual indexing.