CMT309: Computational Data Science
| School | Cardiff School of Computer Science and Informatics |
| Department Code | COMSC |
| Module Code | CMT309 |
| External Subject Code | 100366 |
| Number of Credits | 20 |
| Level | L7 |
| Language of Delivery | English |
| Module Leader | Dr Oktay Karakus |
| Semester | Autumn Semester |
| Academic Year | 2025/6 |
Outline Description of Module
This module introduces the foundations of computational data science, covering both theoretical underpinnings and the practical computational applications of core data science knowledge and skills. Students will learn how to extract, store and analyse both numeric and textual data using a range of computational programming languages.
On completion of the module a student should be able to
- Use the Python programming language to complete a range of programming tasks
- Demonstrate familiarity with programming concepts and data structures
- Use code to extract, store and analyse textual and numeric data
- Carry out data analysis and statistical testing using code
- Critically analyse and discuss methods of data collection, management and storage
- Extract textual and numeric data from a range of sources, including online
- Reflect upon the legal, ethical and social issues relating to data science and its applications
How the module will be delivered
The module will be delivered mainly through in-person sessions, supported by online resources and reading materials.
You will be guided through learning activities appropriate to your module, which may include:
- In-person lectures will teach the theoretical aspects of each week’s topic, illustrated with visual examples. You will take part in discussions through questions and interactive polls (Mentimeter).
- Practical sessions will give you hands-on experience with the material covered in lectures. Early in the module, you will be given example scenarios and asked to develop Python code to solve a generic problem. Towards the end of the semester, case studies will focus on core data science topics.
- Self-paced online resources, including videos, web materials and e-books, will support the weekly lectures and practical sessions, so that you can build on your experience where the practical sessions alone were not enough.
Skills that will be practised and developed
Fundamental programming in Python
Reading and writing common data formats
Data analysis using appropriate libraries
Understanding HTML document structure, the fundamentals of the web (HTTP, APIs) and data scraping
How the module will be assessed
The module is assessed through a blend of assessment types, including coursework and portfolio assessments.
Students will be provided with reassessment opportunities in line with University regulations.
Assessment Breakdown
| Type | % | Title | Duration(hrs) |
|---|---|---|---|
| Practical-Based Assessment | 100 | Programming And Data Science Portfolio | N/A |
Syllabus content
Computational & algorithmic thinking and developing basic algorithmic steps for coding.
Basic programming in Python: Fundamental data types, program control structures, Object Oriented Programming and other basic language features.
Data extraction and importing; analysis using common libraries (e.g. pandas, numpy, scipy)
Data Visualisation (e.g. matplotlib, plotly)
Natural language processing using common libraries (e.g. regex, nltk)
Data Science applications
Legal issues relating to Data Science (GDPR)
Social and Ethical issues relating to Data Science
Descriptive statistics & Hypothesis testing
Retrieving data from online sources (web scraping, APIs)
Preprocessing and Cleaning Big Data