CM2105: Data Processing and Visualisation
School | Cardiff School of Computer Science and Informatics |
Department Code | COMSC |
Module Code | CM2105 |
External Subject Code | 100737 |
Number of Credits | 10 |
Level | L5 |
Language of Delivery | English |
Module Leader | Professor Hantao Liu |
Semester | Autumn Semester |
Academic Year | 2020/1 |
Outline Description of Module
The aim of this module is to develop the skills needed to process information and to provide an understanding of the statistical methods used to analyse the resulting data. The techniques studied will enable data collection from a range of sources, including files and the web. The module will cover Python packages for manipulating and converting information in order to extract data. Statistics that describe collections of data will be studied, along with basic methods to derive correlations and test simple hypotheses.
On completion of the module a student should be able to:
- Use Python to extract, manipulate, store and analyse information from a range of sources.
- Understand statistical methods to apply to data.
- Understand static visualisations of data.
- Create static visualisations of data.
How the module will be delivered
Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:
- on-line resources that you work through at your own pace (e.g. videos, web resources, e-books, quizzes)
- on-line interactive sessions to work with other students and staff (e.g. discussions, live streaming of presentations, live-coding, team meetings)
- face to face small group sessions (e.g. help classes, feedback sessions)
Skills that will be practised and developed
Retrieve and manipulate data
Statistical data analysis
Create static visualisations of data
How the module will be assessed
A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.
Assessment Breakdown
Type | % | Title | Duration(hrs) |
---|---|---|---|
Project | 100 | Individual Project Work | N/A |
Syllabus content
- Python modules for data processing, for example:
  - pandas
  - statsmodels
  - scipy
- Statistical methods
  - Linear regression
  - Linear correlation
  - Hypothesis testing
- Visualisation
  - Matplotlib
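As a rough illustration of two of the statistical topics listed above (linear correlation and linear regression), the following is a minimal sketch rather than module material. It deliberately uses only plain Python so the arithmetic is visible; in practice, packages such as scipy offer equivalent routines (for example `scipy.stats.pearsonr` and `scipy.stats.linregress`). The function names `pearson_r` and `fit_line` and the sample data are made up for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented purely for illustration.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]

r = pearson_r(hours, marks)
slope, intercept = fit_line(hours, marks)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A correlation coefficient close to 1 indicates a strong positive linear relationship, and the fitted slope estimates how much the second variable changes per unit of the first.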
Background Reading and Resource List
Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil and Rachel Schutt (ISBN-10: 1449358659)
Python for Data Analysis, Wes McKinney (ISBN-10: 1449319793)