MATH2687: Data Science and Statistical Computing II

Please ensure you check the module availability box for each module outline, as not all modules will run in each academic year. Each module description relates to the year indicated in the module availability box, and this may change from year to year, due to, for example: changing staff expertise, disciplinary developments, the requirements of external bodies and partners, and student feedback. Current modules are subject to change in light of the ongoing disruption caused by Covid-19.

Type: Open
Level: 2
Credits: 10
Availability: Available in 2024/2025
Module Cap: None
Location: Durham
Department: Mathematical Sciences

Prerequisites

  • [Calculus I (Maths Hons) (MATH1081) or Calculus 1 (MATH1061) AND Linear Algebra I (Maths Hons) (MATH1091) or Linear Algebra 1 (MATH1071) AND Probability I (MATH1597) AND Statistics I (MATH1617)] OR [SMA (MATH1561) AND SMB (MATH1571)]

Corequisites

  • None

Excluded Combinations of Modules

  • None

Aims

  • To equip students with the skills to import, explore, manipulate, visualise and report real data sets using the statistical programming language R.
  • To introduce students to the concepts and mathematics behind sampling and sampling-based estimators.

Content

  • Modern usage of R.
  • Data import, cleaning, wrangling and exploration.
  • Visualisation, plotting, exploratory data analysis.
  • Application, methods, theory and coding of simulation-based approaches to statistics.
  • Monte Carlo hypothesis testing, non-parametric and parametric Bootstrap.
  • Approximating expectations of random variables by Monte Carlo and general Monte Carlo integration. Accuracy of approximations.
  • Simulating random variables via simple Monte Carlo methods (inverse transform/rejection/importance sampling).

Learning Outcomes

Subject-specific Knowledge:

By the end of the module students will:

  • have a solid foundation in the R programming language;
  • be able to import and manipulate real world data sets using modern libraries in the R ecosystem;
  • be able to perform an exploratory data analysis including a variety of visualisations;
  • understand the mathematics (methodology and theory) of sampling-based estimators and simple Monte Carlo simulation;
  • be able to use simulation approaches and apply the mathematics of sampling-based estimators to real world statistics problems.

Subject-specific Skills:

  • Students will have foundational skills in data science, specifically in data import, manipulation and exploration.
  • Students will have foundational skills in simulation and sampling-based methodology.

Key Skills:

  • Students will have basic mathematical skills in the following areas: problem solving, modelling, computation.

Modes of Teaching, Learning and Assessment and how these contribute to the learning outcomes of the module

  • Lectures demonstrate what is required to be learned and the application of the theory to practical examples.
  • Computer practicals consolidate the studied material, explore theoretical ideas in practice, enhance practical understanding, and develop practical data analysis skills.
  • Tutorials provide active problem-solving engagement and immediate feedback to the learning process.
  • Assignments for self-study develop problem-solving skills and enable students to test and develop their knowledge and understanding.
  • Formative assessments provide feedback to guide students in the correct development of their knowledge and skills in preparation for the summative assessment.
  • Computer-based examinations assess the ability to use statistical software and basic programming to solve predictable and unpredictable problems.
  • The end-of-year examination assesses the knowledge acquired and the ability to solve predictable and unpredictable problems.

Teaching Methods and Learning Hours

Activity                | Number | Frequency                            | Duration | Total | Monitored
Lectures                | 21     | Two in weeks 1-10 and one in week 21 | 1 hour   | 21    |
Tutorials               | 6      | Weeks 2, 4, 6, 8, 10, 21             | 1 hour   | 6     | Yes
Computer Practicals     | 10     | One in weeks 1-10                    | 1 hour   | 10    | Yes
Preparation and reading |        |                                      |          | 63    |
Total                   |        |                                      |          | 100   |

Summative Assessment

Component: Examination | Component Weighting: 70%
Element                    | Length / Duration | Element Weighting | Resit Opportunity
Written Examination        | 2 hours           | 100%              |

Component: Practical Assessment | Component Weighting: 30%
Element                    | Length / Duration | Element Weighting | Resit Opportunity
Computer-based examination | 2 hours           | 100%              |

Formative Assessment

Regular assignments to be formatively assessed and returned with feedback. Other problems are set for self-study and complete solutions are made available to students.
