NMAK14029U Statistics for Bioinformatics and eScience (StatBI/E)

Volume 2014/2015
Education
MSc Programme in Bioinformatics
Content

The course is based on a set of concrete cases that will take the participants through the following content.

 

  • Standard discrete and continuous distributions, descriptive methods, the frequency and Bayesian interpretations, conditioning, independence, and selected probability results.
  • Simulation. 
  • Mean, variance, estimators, two-sample comparisons, multiple testing.
  • Maximum likelihood and least squares estimation.
  • Standard errors and confidence intervals.
  • Bootstrapping. 
  • Correlation, linear, non-linear, logistic and Poisson regression.
  • Dimensionality reduction, model selection and model validation. 
  • The statistical programming language R. 
  • Models for neuron activity, gene expression, database searches, motif and word occurrences, internet traffic, diagnostic tests etc.

 

Learning Outcome

Knowledge:

The basic concepts in mathematical statistics, such as:

  • Probability distributions
  • Standard errors and confidence intervals
  • Maximum likelihood and least squares estimation
  • Bootstrapping
  • Hypothesis testing and p-values
  • Linear, non-linear, logistic and Poisson regression


Skills:

  • Master practical implementation in R.
  • Use computer simulations for computations with probability distributions, including bootstrapping.
  • Compute uncertainty measures, such as standard errors and confidence intervals, for estimated parameters.
  • Compute predictions based on regression models taking into account the uncertainty of the predictions.
  • Assess a fitted distribution using descriptive methods.
  • Use general purpose methods, such as the method of least squares and maximum likelihood, to fit probability distributions to empirical data.
  • Summarize empirical data and compute relevant descriptive statistics for discrete and continuous probability distributions.


Competences:

  • Formulate scientific questions in statistical terms.
  • Interpret and report the conclusions of a practical data analysis.
  • Assess the fit of a regression model based on diagnostic quantities and plots.
  • Investigate scientific questions that are formulated in terms of comparisons of distributions or parameters by statistical methods.
  • Investigate scientific questions regarding association in terms of linear, non-linear, logistic and Poisson regression models.
MSc students and BSc students in their 3rd year with MatIntro or an equivalent course.
5 hours of lectures and 3 hours of exercises per week. 7 weeks of classes.
Category             Hours
Exam                 30
Lectures             35
Practical exercises  21
Preparation          90
Project work         30
Total                206
Credit
7.5 ECTS
Type of assessment
Written assignment, 30 hours
Two-day take-home assignment.
Exam registration requirements
Approval of a midway group project report.
Marking scale
7-point grading scale
Censorship form
No external censorship
Re-exam
If ten or fewer students have signed up for the re-exam, the type of assessment will be changed to a 30-minute oral exam with 30 minutes of preparation. All aids allowed.
Criteria for exam assessment

The student must demonstrate in a satisfactory way that he/she has mastered the learning outcome.