NSCPHD1219 Advanced methods in statistical data analysis
Data from experiments in high energy physics and observations in astrophysics now demand highly sophisticated statistical treatment. By inviting experts from both areas, we provide the students with a broad overview of the most advanced statistical methods currently in use. The course will cover the fundamental concepts of modern statistical data analysis, including examples drawn from these two areas of science.
The course is well suited and highly relevant for PhD students from a wide range of scientific fields beyond the two mentioned. The course will consist of lectures and practical problem-solving sessions (both calculations and computer exercises) and will last 5 full days.
The course will be advertised internationally and is open to Danish and foreign students.
Scientific Content:
Day 1-3 lecturer: Dr. W. Verkerke
1) Basic Statistics
Mean, Variance, Standard Deviation. Gaussian Standard Deviation. Covariance, correlations. Basic distributions: Binomial, Poisson, Gaussian. Central Limit Theorem. Error propagation.
2) Event classification
Comparing discriminating variables. Choosing the optimal cut. Working in more than one dimension. Approximating the optimal discriminant. Techniques: Principal component analysis, Fisher Discriminant, Neural Network, Boosted Decision Trees, Probability Density Estimate, Empirical Modeling
3) Estimation and fitting
Introduction to estimation. Properties of chi-squared and Maximum Likelihood estimators. Measuring and interpreting Goodness-Of-Fit. Numerical stability issues in fitting. Mitigating fit stability problems. Bounding fit parameters. Fit validation studies. Maximum Likelihood bias issues at low statistics. Toy Monte Carlo techniques. Designing and understanding Joint fits. Designing and understanding Multi-dimensional fits.
4) Confidence interval, limits & significance
Probability, Bayes Theorem. Simple Bayesian methods and issues. Frequentist confidence intervals and issues. Classical hypothesis testing. Goodness-of-fit. Likelihood ratio intervals and issues. Nuisance parameters. Likelihood principle
5) Systematic uncertainties
Sources of systematic errors. Sanity checks versus systematic error studies. Common issues in systematic evaluations. Correlations between systematic uncertainties. Combining statistical and systematic errors. Problem-solving.
Day 3-5 lecturer: Dr. R. Trotta
1. Foundational aspects: what is probability?
Probability as frequency; Probability as degree of knowledge; Bayes Theorem; Priors; Building the likelihood function; Combination of multiple observations; Nuisance parameters
2. Learning from experience: Bayesian parameter inference
Markov Chain Monte Carlo methods; Importance sampling; Nested sampling; Reporting inferences; Credible regions vs confidence regions; The meaning of sigma
3. Bayesian model selection and cosmological applications
The different levels of inference; The Bayesian evidence and the Bayes factor; Computing Bayes factors; Information criteria for approximate model selection; The meaning of significance; Comparison with classical hypothesis testing; Model complexity; Bayesian model averaging
4. Experiment optimization and prediction
Fisher matrix formalism; Figures of merit; Expected usefulness of an experiment; Survey optimization; Experimental utility; Bayesian adaptive exploration; Applications to dark energy
The course is open to PhD students in all fields.
After the course, the students will have a detailed understanding
of the fundamental concepts of modern statistical data analysis.
They will also be able to use those concepts in solving concrete
problems arising in data analysis, after having trained on some
specific problems based on high energy particle physics or
astrophysics data samples during the course.
lecture notes
An additional take-home exam, with a workload of 2 full-time weeks (10 x 8 hours), allows up to a total of 5 ECTS points to be awarded.
- Category: Hours
- Class Instruction: 40
- Preparation: 10
- Total: 50
Course information
- Language: English
- Course code: NSCPHD1219
- Level: Ph.D.
- Duration: 1 week
- Placement: Autumn
- Schedule: A PhD school of 1 week's duration, with lectures and exercises 8 hours each day.
- Course capacity: 35
- Study board: Natural Sciences PhD Committee
Contracting department
- The Niels Bohr Institute
Course responsibles
- Steen Harle Hansen
- Stefania Xella
Lecturers
Dr. Roberto Trotta (Imperial College, London)
Dr. Wouter Verkerke (Nikhef, Amsterdam)