NSCPHD1219 Advanced methods in statistical data analysis

Volume 2014/2015
Content

Data from experiments in high energy physics and observations in astrophysics now demand highly sophisticated statistical treatment. By inviting experts from both areas, we give students a broad overview of the most advanced statistical methods currently in use. The course will cover the fundamental concepts of modern statistical data analysis, with examples drawn from both areas.

The course is well suited and highly relevant for PhD students from a wide range of scientific fields beyond the two mentioned. It will consist of lectures and practical problem-solving sessions (both calculations and computer exercises) and will last 5 full days.

The course will be advertised internationally, and is open to Danish and foreign students.

 

Scientific Content:

Day 1-3 lecturer: Dr. W. Verkerke

1) Basic Statistics

Mean, Variance, Standard Deviation. Gaussian Standard Deviation. Covariance, correlations. Basic distributions: Binomial, Poisson, Gaussian. Central Limit Theorem. Error propagation.

2) Event classification

Comparing discriminating variables. Choosing the optimal cut. Working in more than one dimension. Approximating the optimal discriminant. Techniques: Principal component analysis, Fisher Discriminant, Neural Network, Boosted Decision Trees, Probability Density Estimate, Empirical Modeling

3) Estimation and fitting

Introduction to estimation. Properties of chi-squared and Maximum Likelihood estimators. Measuring and interpreting Goodness-Of-Fit. Numerical stability issues in fitting. Mitigating fit stability problems. Bounding fit parameters. Fit validation studies. Maximum Likelihood bias issues at low statistics. Toy Monte Carlo techniques. Designing and understanding Joint fits. Designing and understanding Multi-dimensional fits.

4) Confidence interval, limits & significance

Probability, Bayes Theorem. Simple Bayesian methods and issues. Frequentist confidence intervals and issues. Classical hypothesis testing. Goodness-of-fit. Likelihood ratio intervals and issues. Nuisance parameters. Likelihood principle

5) Systematic uncertainties

Sources of systematic errors. Sanity checks versus systematic error studies. Common issues in systematic evaluations. Correlations between systematic uncertainties. Combining statistical and systematic errors. Problem-solving.

Day 3-5 lecturer: Dr. R. Trotta

1. Foundational aspects: what is probability?

Probability as frequency; Probability as degree of knowledge; Bayes Theorem; Priors; Building the likelihood function; Combination of multiple observations; Nuisance parameters

2. Learning from experience: Bayesian parameter inference

Markov Chain Monte Carlo methods; Importance sampling; Nested sampling; Reporting inferences; Credible regions vs confidence regions; The meaning of sigma

3. Bayesian model selection and cosmological applications

The different levels of inference; The Bayesian evidence and the Bayes factor; Computing Bayes factors; Information criteria for approximate model selection; The meaning of significance; Comparison with classical hypothesis testing; Model complexity; Bayesian model averaging

4. Experiment optimization and prediction

Fisher matrix formalism; Figures of merit; Expected usefulness of an experiment; Survey optimization; Experimental utility; Bayesian adaptive exploration; Applications to dark energy

 

Learning Outcome

The course is open to PhD students in all fields.

After the course, students will have a detailed understanding of the fundamental concepts of modern statistical data analysis. They will also be able to apply those concepts to concrete data analysis problems, having practised during the course on specific problems based on high energy particle physics or astrophysics data samples.

lecture notes

2 ECTS points will be assigned for preparation time ahead of the course start plus 5 days of active participation in lectures and exercises (8 hours each day).

An additional take-home exam, with a workload of 2 full-time weeks (10 x 8 hours), allows up to a total of 5 ECTS points to be assigned.
  • Category: Hours
  • Class Instruction: 40
  • Preparation: 10
  • Total: 50