NMAK14029U Statistics for Bioinformatics and eScience (StatBI/E)

Volume 2015/2016
Education
MSc Programme in Bioinformatics
Content

The course is based on a set of concrete cases that will take the participants through the following content.

 

  • Standard discrete and continuous distributions, descriptive methods, the frequency and Bayesian interpretations, conditioning, independence, and selected probability results.
  • Simulation.
  • Mean, variance, estimators, two-sample comparisons, multiple testing.
  • Maximum likelihood and least squares estimation.
  • Standard errors and confidence intervals.
  • Bootstrapping.
  • Correlation, linear, non-linear, logistic and Poisson regression.
  • Dimensionality reduction, model selection and model validation.
  • The statistical programming language R.
  • Models for neuron activity, gene expression, database searches, motif and word occurrences, internet traffic, diagnostic tests, etc.

 

Learning Outcome

Knowledge:

The basic concepts in mathematical statistics, such as:

  • Probability distributions
  • Standard errors and confidence intervals
  • Maximum likelihood and least squares estimation
  • Bootstrapping
  • Hypothesis testing and p-values
  • Linear, non-linear, logistic and Poisson regression


Skills:

  • Master practical implementation in R.
  • Use computer simulations for computations with probability distributions, including bootstrapping.
  • Compute uncertainty measures, such as standard errors and confidence intervals, for estimated parameters.
  • Compute predictions based on regression models taking into account the uncertainty of the predictions.
  • Assess a fitted distribution using descriptive methods.
  • Use general purpose methods, such as the method of least squares and maximum likelihood, to fit probability distributions to empirical data.
  • Summarize empirical data and compute relevant descriptive statistics for discrete and continuous probability distributions.


Competences:

  • Formulate scientific questions in statistical terms.
  • Interpret and report the conclusions of a practical data analysis.
  • Assess the fit of a regression model based on diagnostic quantities and plots.
  • Investigate scientific questions that are formulated in terms of comparisons of distributions or parameters by statistical methods.
  • Investigate scientific questions regarding association in terms of linear, non-linear, logistic and Poisson regression models.
MSc students and BSc students in their 3rd year with MatIntro or an equivalent course.
5 hours of lectures and 3 hours of exercises per week. 7 weeks of classes.
Category             Hours
Exam                 30
Lectures             35
Practical exercises  21
Preparation          90
Project work         30
Total                206
Credit
7.5 ECTS
Type of assessment
Written assignment, 30 hours
2-day take-home assignment.
Exam registration requirements
Approval of a midway group project report.
Marking scale
7-point grading scale
Censorship form
No external censorship
One internal examiner
Re-exam
If ten or fewer students have signed up for the re-exam, the type of assessment will be changed to a 30 min. oral exam with 30 min. preparation. All aids allowed.
Whether the exam is written or oral, the midway group project report must be approved. If it is not approved before the ordinary exam, it must be resubmitted no later than two weeks before the beginning of the re-exam week.
Criteria for exam assessment

The student must in a satisfactory way demonstrate that he/she has mastered the learning outcome.