NPLK19000U Big Data in Biotechnology
MSc Programme in Biotechnology
MSc Programme in Biotechnology with a minor subject
MSc Programme in Environmental Science
MSc Programme in Human Physiology
This is an introductory course in large-scale data analysis. Many experimental procedures, such as the various “-omics” techniques routinely employed in biotechnology, produce vast amounts of data, so the amount of available data in many biotechnological disciplines is steadily increasing. Although most biotechnological data does not reach the size and complexity that formally defines big data, fundamental knowledge of, and skills in, large-scale computing systems and analysis methods are required to make use of this wealth of information. The purpose of this course is to introduce students to the theory and practice of large-scale data analysis, which will allow them to perform and assess different types of “-omics”-scale data procedures. Tentative list of data types to be covered in the course: transcriptomic data (RNA-seq), metabolomic data (LC-MS), and biological image data.
This course covers the fundamental challenges of analysing large amounts of data, i.e. how to handle large data files and how to overcome computational and storage limitations. The course provides the knowledge and skills needed to perform data wrangling and normalization. Students will obtain working knowledge of basic data handling, data analysis, and data visualization. Through in-depth focus on handling and analysing a relevant set of data types with programming-based techniques, the course addresses the statistical and computational challenges of large-scale data analysis.
The experimental methods used to generate the data types in the course will also be covered briefly, because understanding how data are generated is often needed to assess bias and confounding factors.
At course completion, the student will have knowledge, skills, and competences covering:
- The classification of data
- The general principles of large-scale data analysis
- Common pitfalls in large-scale data analysis
- The basic concepts underlying clustering and visualization techniques
- Data management, including how to efficiently store, transfer, and analyse large amounts of data
- How to structure and perform large-scale data analyses in a coding-based software environment, such as R or Python
- Handling and modifying large datasets
- Visualization and dissemination of data
- Analysing different types of large-scale biotechnology data
- Digital reflection and critical evaluation of the quality of different types of biotechnology data
- Digital analysis and methodology of large-scale data analyses
Original literature, software manuals and tutorials, and teacher provided compendia.
- Theory exercises
- Practical exercises
Continuous feedback during the course
- 7.5 ECTS
- Type of assessment: Written examination, 3 hours under invigilation
- Type of assessment details: The course has been selected for ITX exam. See important information about ITX exams at Study Information, menu point: Exams -> Exam types and rules -> Written on-site exams (ITX)
- Only certain aids allowed: for the ITX exam, the software programs R and RStudio will be available during the exam.
- Marking scale: 7-point grading scale
- Censorship form: No external censorship
20-minute oral exam on the course curriculum without preparation. No aids allowed.
Criteria for exam assessment
See learning outcome.
- Course code: NPLK19000U
- Full Degree Master
- 1 block
- Block 1
- Course capacity
The number of seats may be reduced in the late registration period
- Study Board for the Biological Area
- Department of Plant and Environmental Sciences
- Faculty of Science
- Henrik Hjarvard de Fine Licht