CM2105: Data Processing and Visualisation

School Cardiff School of Computer Science and Informatics
Department Code COMSC
Module Code CM2105
External Subject Code 100737
Number of Credits 10
Level L5
Language of Delivery English
Semester Autumn Semester
Academic Year 2018/19

Outline Description of Module

The aim of this module is to develop the skills needed to process information and to provide an understanding of the statistical methods used to analyse the resulting data. The techniques studied will enable data collection from a range of sources, including files and the web. The module will cover Python modules for manipulating and converting information in order to extract data. Statistics that describe collections of data will be studied, along with basic methods for deriving correlations and testing simple hypotheses.

On completion of the module a student should be able to:

1. Use Python to extract, manipulate, store and analyse information from a range of sources.
2. Understand statistical methods that can be applied to data.
3. Understand static visualisations of data.
4. Create static visualisations of data.

How the module will be delivered

The module will be delivered through a combination of lectures, supervised lab sessions, example classes and tutorials as appropriate.

Skills that will be practised and developed

  • Retrieve and manipulate data

  • Perform statistical data analysis

  • Create static visualisations of data

How the module will be assessed

There will be two points of assessment in the module:

(1) Coursework: There will be an individual project that will assess knowledge of, and the ability to implement, data processing and visualisation techniques (LO1, LO4).

(2) Exam: There will be a 2-hour written exam. The exam will assess knowledge and understanding of data processing and visualisation (LO2, LO3).

Both assessments will allow the student to demonstrate their knowledge and practical skills and to apply the principles taught in lectures.

The potential for reassessment in this module is a written exam, weighted at 100%, held during the summer.

Assessment Breakdown

Type                      %     Title                                Duration (hrs)
Written Assessment        50    Individual Project Work              N/A
Exam - Autumn Semester    50    Data Processing And Visualisation    2

Syllabus content

Python modules for data processing, for example:
  • pandas
  • statsmodels
  • scipy

Statistical methods:
  • Linear regression
  • Linear correlation
  • Hypothesis testing

Visualisation:
  • Matplotlib

Background Reading and Resource List

Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil and Rachel Schutt (ISBN-10: 1449358659)
Python for Data Analysis, Wes McKinney (ISBN-10: 1449319793)


Copyright Cardiff University. Registered charity no. 1136855