CM2105: Data Processing and Visualisation

School: Cardiff School of Computer Science and Informatics
Department Code: COMSC
Module Code: CM2105
External Subject Code: 100737
Number of Credits: 10
Level: L5
Language of Delivery: English
Module Leader: Professor Hantao Liu
Semester: Autumn Semester
Academic Year: 2020/1

Outline Description of Module

The aim of this module is to develop the skills needed to process information and to provide an understanding of the statistical methods used to analyse the resulting data. The techniques studied will enable data collection from a range of sources, including files and the web. The module will cover Python modules for manipulating and converting information in order to extract data. Statistics that describe collections of data will be studied, along with basic methods for deriving correlations and testing simple hypotheses.

On completion of the module a student should be able to:

  1. Use Python to extract, manipulate, store and analyse information from a range of sources.
  2. Understand statistical methods to apply to data.
  3. Understand static visualisations of data.
  4. Create static visualisations of data.

How the module will be delivered

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • on-line resources that you work through at your own pace (e.g. videos, web resources, e-books, quizzes)
  • on-line interactive sessions to work with other students and staff (e.g. discussions, live streaming of presentations, live-coding, team meetings)
  • face to face small group sessions (e.g. help classes, feedback sessions)

Skills that will be practised and developed

Retrieve and manipulate data

Statistical data analysis

Create static visualisations of data

How the module will be assessed

A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Assessment Breakdown

Type      %     Title                     Duration (hrs)
Project   100   Individual Project Work   N/A

Syllabus content

Python modules for data processing, for example:

  • pandas
  • statsmodels
  • scipy

Statistical methods

  • Linear regression
  • Linear correlation
  • Hypothesis testing

Visualisation

  • Matplotlib

Background Reading and Resource List

Doing Data Science: Straight Talk from the Frontline, Cathy O'Neil and Rachel Schutt (ISBN-10: 1449358659)

Python for Data Analysis, Wes McKinney (ISBN-10: 1449319793)

 


Copyright Cardiff University. Registered charity no. 1136855