NIFK24000U Data Management and Analysis Using R

Volume 2024/2025
Education

MSc Programme in Agricultural Economics
MSc Programme in Environmental and Natural Resource Economics

Content

The amount of data available is increasing dramatically. Your future career and Master's thesis will require you to handle and extract information from large quantities of data. We have designed this hands-on course to equip you to meet the data challenges ahead.

You will be introduced to concepts, terminology and methods relevant to handling data and spatial information in R. By the end of the course, you will have a toolbox of scripts to optimize data management procedures by looping through data and using vectorized operations. You will work in RStudio, writing and debugging code for merging datasets, cleaning data and coding different types of variables, as well as overlaying spatial layers. We will use both base R and tidyverse packages.

You will also be introduced to basic procedures for analysis. This includes tabulating basic statistical measures, specifying regression models and interpreting and visualizing results. Throughout the course, the focus will be on writing, adapting and implementing code in R scripts.

The course aims to develop students’ data management and analysis skills through hands-on group work. The last week of the course will be independent (unsupervised) group project work with empirical datasets.

The course mainly uses the free statistical software package R and briefly introduces the geographical information software QGIS.

Don’t be a slave to the spreadsheet. Join our course and become part of an ever-growing, vibrant community that uses the programming environment R as its playground.

Learning Outcome

This course aims to provide participants with tools and experience in managing and analyzing data, using cross-sectional and spatial data as examples, that would be required to conduct an MSc thesis project or do research based on quantitative data in social sciences and beyond.

 

Knowledge:

  • Know the code required to identify different types of datasets and variables (including the nature of maps and geodata) and the implications for the choice of appropriate data management procedure and analysis strategy
  • Have an overview of principles and procedures for importing, merging, coding, transforming and otherwise preparing data for statistical analysis in R
  • Know the arguments for using scripts
  • Possess an overview of basic approaches to quantitative data analysis

 

Skills:

  • Apply procedures for managing different types of data in R in preparation for statistical analysis 
  • Combine different datasets and produce composite maps from multiple sets of digital spatial data
  • Implement statistical analysis in R to derive basic cross-sectional and spatial metrics and estimate linear regression models
  • Solve coding problems in data management and basic statistical analysis in R using available online support, including ChatGPT
  • Generate figures and graphs to interpret, visualize and present statistical results in a clear and concise manner

 

Competencies:

  • Formulate and implement a strategy for solving data management and analysis problems by combining tools from different packages in R to address analytical research problems in empirical datasets
  • Program and debug a script, using online tools such as ChatGPT, to answer specific research questions
Literature

No obligatory literature curriculum. Relevant material will be shared through Absalon. 

A basic statistics course is recommended, and some experience with R and insight into simple data management and analysis are expected.

Academic qualifications equivalent to a BSc degree are recommended.
The course involves hands-on writing of R code, focusing on providing students with practical programming skills. Students will implement code from packages relevant for data management as well as analysis. Learning outcomes are thus achieved by students individually, supported by peer groups, by working on scripts with illustrative exercises. Teachers will assist when students are stuck, but the goal is for students to become self-reliant and independent. Students are therefore expected to solve problems by, for instance, searching online for how others have solved similar programming problems.

Exercises will be based on datasets from small case studies and larger surveys focusing on natural resource management problems examined from a natural and social science perspective. During the exercises, students will accumulate a command library for the relevant tasks, applicable to similar data management and analysis projects. As students come from diverse backgrounds with very different skills, we will proceed at different paces, offering some students more complicated challenges, including scripts on machine learning and Bayesian statistics.

Students must hand in a written, unsupervised group assignment (the course project).
  • Lectures: 30 hours
  • Preparation: 40 hours
  • Practical exercises: 40 hours
  • Project work: 96 hours
  • Total: 206 hours
Oral
Individual
Collective
Continuous feedback during the course of the semester
Feedback by final exam (In addition to the grade)
Peer feedback (Students give each other feedback)
Credit
7.5 ECTS
Type of assessment
Oral examination, 15 minutes
Type of assessment details
The exam involves a plenary presentation of relevant code, intended to further learning, including from failed attempts to solve coding problems. Students will be assessed individually based on a short oral presentation, in plenary, of the course project, taking as the point of departure their script with data management procedures and analysis output such as tables, figures and models testing their research questions and hypotheses.
Exam registration requirements

The exam is conditional on students' course participation, which includes handing in a written, unsupervised group assignment (the course project) on Thursday of the third week of the course. Students will select a dataset and develop a data management and analysis strategy for their course project. Students will receive feedback on their project on Friday, the last day of the course.

Aid
All aids allowed
Marking scale
passed/not passed
Censorship form
No external censorship
Several internal examiners
Exam period

The exam is scheduled for Friday in week 34, the last day of the course.

Re-exam

Same as the ordinary exam.

If the student has not handed in a written assignment (i.e. the course project), it must be handed in three weeks prior to the re-exam and must be approved before the exam.

Criteria for exam assessment

To pass the course, the student must convincingly fulfil the learning outcomes described above and display command of the packages, individual commands and procedures covered by the curriculum.