NMAK21010U Topics in Statistical Genetics

Volume 2021/2022
Education

MSc Programme in Statistics

Content

Introduction to topics in Statistical Genetics, that is, the application of statistical methods to model and draw inferences from genetic data, in particular DNA data. Genetics has been of statistical interest for more than 100 years.

The course discusses the mathematical theory and statistical models used to understand how genetic data vary in populations and how we can draw inferences from genetic data. Mathematical and statistical theory underlies the vast progress and claims made in recent years about humans' relationship to Neanderthals, the origin and spread of diseases (such as Covid-19), and how the world was populated.

Random variables modelling genetic data from individuals in a population are highly correlated ("exchangeable random variables"), so standard asymptotic theory does not apply. The theory and models are based on Markov chains/processes, in discrete and continuous time. Inference procedures are either ad hoc or advanced, and are often based on models with latent variables.
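As a minimal illustration of the discrete-time Markov chains mentioned above, the following Python sketch simulates allele counts under a neutral Wright-Fisher model, in which each generation of a fixed-size population is formed by sampling with replacement from the previous one. The function name, its parameters, and the haploid, two-allele, no-mutation, no-selection setting are assumptions chosen for illustration; they are not taken from the course material.

```python
import random


def simulate_wright_fisher(pop_size, initial_count, generations, seed=None):
    """Simulate the count of one allele in a haploid Wright-Fisher population.

    Assumptions (illustrative only): constant population size, two alleles,
    no mutation and no selection.
    """
    rng = random.Random(seed)
    count = initial_count
    trajectory = [count]
    for _ in range(generations):
        freq = count / pop_size
        # Each of the pop_size offspring picks a parent uniformly at random,
        # so the next count is Binomial(pop_size, freq) given the current one.
        count = sum(rng.random() < freq for _ in range(pop_size))
        trajectory.append(count)
    return trajectory


if __name__ == "__main__":
    print(simulate_wright_fisher(pop_size=20, initial_count=10, generations=30, seed=1))
```

The states 0 and pop_size are absorbing: once the allele is lost or fixed, the count no longer changes.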

Key mathematical/statistical concepts are ancestral processes, the coalescent process, the age and frequency of alleles (genetic types) in populations, and inference for genetic data based on such processes. Relatedness between individuals is described in terms of a stochastic graph.
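To make the coalescent process concrete, here is a small Python sketch of the standard (Kingman) coalescent for a sample of n lineages. While k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2, so the expected time to the most recent common ancestor is 2(1 - 1/n). Time is measured in coalescent units, and the function names are hypothetical; this is a sketch under those assumptions rather than the formulation used in the course.

```python
import random


def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n sampled lineages."""
    # While k lineages remain, the waiting time is exponential with rate
    # k(k-1)/2, so its mean is 2/(k(k-1)); the sum telescopes to 2(1 - 1/n).
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


def simulate_tmrca(n, rng):
    """Draw one time to the most recent common ancestor under the coalescent."""
    total = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0
        total += rng.expovariate(rate)
    return total
```

Averaging simulate_tmrca(n, random.Random(0)) over many draws should approach expected_tmrca(n), which stays below 2 however large the sample is.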

Learning Outcome

Knowledge:

At the end of the course the student will have knowledge about the use of statistics in genetics, how genetic variation is modelled, ancestral processes, and how inference can be made from such processes.

The student will have the knowledge to explain

  • population genetic models, like the simple Wright-Fisher model
  • the coalescent process and the Ewens sampling formula
  • the frequency distribution of alleles (types)
  • statistical methods for inference on genetic data in different situations
  • the use of Markov chains to model genetic variation
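As an example of the kind of frequency distribution listed above, the Ewens sampling formula gives the probability that a sample of size n contains a_j allelic types represented exactly j times, for j = 1, ..., n, under the neutral infinite-alleles model with mutation parameter theta. The Python sketch below evaluates it directly; the function name and the list encoding of the counts are assumptions made for illustration.

```python
from math import factorial


def ewens_probability(counts, theta):
    """Probability of an allelic partition under the Ewens sampling formula.

    counts[j - 1] is the number of allelic types seen exactly j times in the
    sample, so the sample size is n = sum(j * counts[j - 1]).
    """
    n = sum(j * a for j, a in enumerate(counts, start=1))
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = 1.0
    for i in range(n):
        rising *= theta + i
    prob = factorial(n) / rising
    for j, a in enumerate(counts, start=1):
        prob *= theta ** a / (j ** a * factorial(a))
    return prob
```

For a sample of size 2, ewens_probability([0, 1], theta) is the probability that both genes are of the same type, 1 / (1 + theta), and ewens_probability([2, 0], theta) is the probability that they differ, theta / (1 + theta).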

 

Skills:

The student will acquire the skills to analyse simple genetic data sets and to extract basic mathematical properties of ancestral processes.

Competencies:

At the end of the course the students will have the competence to

  • carry out inference for (simple) genetic data sets
  • extract relevant mathematical properties of genetic models
  • extract biological insight from mathematical/statistical models

 

The course literature is to be decided, but it will likely be a mix of research papers and extracts from books.

 

Basic mathematical statistics and probability based on measure theory, such as from 2nd-year courses or equivalent.

Academic qualifications equivalent to a BSc degree are recommended.
Four hours of lectures and three hours of exercises per week for 7 weeks.
  • Category: Hours
  • Lectures: 28
  • Preparation: 130
  • Theory exercises: 18
  • Practical exercises: 3
  • Exam: 27
  • Total: 206
Oral
Continuous feedback during the course of the semester

Students receive feedback at the exercise sessions.

Credit
7.5 ECTS
Type of assessment
Written assignment, 27 hours
Written take-home assignment
Aid
All aids allowed
Marking scale
7-point grading scale
Censorship form
No external censorship
One internal examiner
Re-exam

As ordinary exam

Criteria for exam assessment

The student must in a satisfactory way demonstrate that he/she has mastered the learning outcome of the course.