Fundamentals of Data Science

Module code: MA7419

This module will teach you to obtain, explore and manipulate data in an efficient and reproducible way. These are important skills in any area that uses data to solve practical problems, including applied statistics, business analytics, finance, and physical and social science research.

This module will use the R programming language and the RStudio integrated development environment (IDE). Both are free, and you can install them on your own computer. They are also available on university computers. Lectures will introduce a series of realistic data science problems, which you will then work collaboratively with other students to solve.

You will gain the skills to use publicly available datasets to develop solutions to real-world problems. The datasets chosen will reflect a range of data formats and application areas. For example: the World Bank Development Indicators database; the Human Mortality Database of mortality and population data; the Project Gutenberg database of copyright-free literature; and STATS19, Great Britain’s official road traffic casualty database. You will also need to research and use R software packages suitable for the task at hand.

A basic understanding of programming in any language is desirable.

