NFYK15002U  Advanced Methods in Applied Statistics

Volume 2017/2018
Education

MSc Programme in Physics
MSc Programme in Environmental Science

Content

The course offers practical knowledge and hands-on experience in the computational analysis of data in frontier physics research, with particular relevance for particle physics, astrophysics, and cosmology. Lectures, examples, and exercises will be administered via computer demonstration, mainly using the Python or C/C++ programming languages.

 

Part of the course will focus on analysis techniques relevant to the specific graduate research topics and interests of the enrolled students.

Learning Outcome

Knowledge:

  • Be familiar with multiple machine learning algorithms and multivariate analysis techniques
  • Understand the biases and impacts of various confidence interval methods
  • Understand Bayesian and Frequentist approaches to interpreting data and the limits of assumed priors
  • Be familiar with minimization techniques such as hill-climbing methods, flocking algorithms, and simulated annealing

 

Skills:

  • Maximum Likelihood fitting
  • Construction of Confidence Intervals (Poisson, Feldman-Cousins, a priori and a posteriori p-values, etc.)
  • Apply computational methods to de-noise data and images
  • Code a chi-squared function in the language of the student's preference (Python, C/C++, Ruby, Java, R, etc.)
  • Creation and usage of spline functions
  • Application of Kernel Density Estimators
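
As an illustration of the chi-squared skill listed above, the following is a minimal sketch in Python, not official course material. It assumes the common form of the statistic, the sum over data points of ((observed - expected) / sigma)^2, with independent Gaussian uncertainties; the function name and argument names are hypothetical.

```python
def chi_squared(observed, expected, sigma):
    """Return the sum of squared residuals, each scaled by its uncertainty.

    Assumes independent measurements with Gaussian uncertainties `sigma`
    (an assumption of this sketch, not a statement from the course page).
    """
    if not (len(observed) == len(expected) == len(sigma)):
        raise ValueError("observed, expected, and sigma must have equal length")
    return sum(((o - e) / s) ** 2 for o, e, s in zip(observed, expected, sigma))


# Example: three measurements compared with a model prediction.
observed = [1.0, 2.0, 3.0]
expected = [1.0, 2.5, 2.0]
sigma = [1.0, 0.5, 0.5]
print(chi_squared(observed, expected, sigma))  # prints 5.0
```

The individual terms here are 0, 1, and 4, so the total is 5. A fit would typically minimize this quantity over the model parameters that produce `expected`.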

 

Competences:

This course provides advanced computational tools for data analysis as it relates to manuscript preparation, thesis writing, and understanding the methodology and statistical relevance of results in journal articles. Students will have enhanced general coding skills useful in both academia and industry.

See Absalon for final course material. The following is an example of expected course literature.

 “Statistical Data Analysis” by G. Cowan

 

Class lecture notes and links to scholarly articles will be posted online.

- It is absolutely necessary to have extensive knowledge of, and skill with, at least one applicable programming language (Python, C/C++, Ruby, R, Java, or MATLAB) for the course, with a preference for Python or C++. At a minimum, students should have accumulated at least 100 hours of writing, modifying, and debugging code in a single programming language. If you have any questions or concerns about the coding competency required, please contact Jason Koskinen.
- The ability and experience to install external software packages, e.g. the MultiNest Bayesian inference package or the “emcee” Markov Chain Monte Carlo sampler.
- Completion of “Applied Statistics: From Data to Results”, or equivalent, is strongly encouraged but not strictly required.
Instructor lectures, in-class examples, computer-based exercises, and discussion.
It is expected that students bring their own laptops or have access to a computer upon which they can install software to write, compile, and execute code.

There will be an introduction the week before the course begins to address software requirements and any additional course logistics.
Credit
7.5 ECTS
Type of assessment
Continuous assessment, throughout the course
Written assignment, 28 hours
Assessment will be based on:
1) Continuous evaluation consisting of:
- An in-class short oral presentation (10%)
- Graded problem sets and project(s) centered on the coding, implementation, and execution of a statistical method (50%)

2) Written assignment:
- Take-home final exam (40%)

Each part of the exam must be passed separately in order to pass the course.

In individual cases, a different weighting can be arranged to some extent by agreement between the student and the course responsible, if it can be justified. The agreement must be reached at least one week before the date of the take-home final exam.
Aid
All aids allowed
Marking scale
7-point grading scale
Censorship form
No external censorship
Several internal examiners
Re-exam

Only the part of the exam (continuous evaluation or written assignment) that was not passed can be re-taken. Points from the part of the exam that was passed (if any) are carried over and count with the same weight as at the regular exam.

If the continuous evaluation was not passed, a number of problem sets can be re-submitted no later than two days before the re-exam.

If the written assignment was not passed, the student must complete a new take-home exam.

Criteria for exam assessment

See Learning Outcome.

Category               Hours
Lectures               36
Practical exercises    32
Project work           36
Preparation            102
Total                  206