NIFK19006U Managing and Analyzing Data in Social Science
MSc programme in Agricultural Economics
MSc programme in Environmental and Natural Resource Economics
Are you feeling the constraints of Excel spreadsheets? The amount of data available is increasing dramatically. Both your future career and your Master's thesis will require you to handle and extract information from large quantities of data. We have designed this hands-on course to equip you to meet the data challenges ahead.
You will be introduced to concepts, terminology and methods relevant to handling data and spatial information in R and QGIS. By the end of the course, you will have a toolbox of scripts enabling you to optimise data management procedures by looping through data and using vector-oriented iterative processes. You will work in RStudio, writing and debugging code for merging datasets, cleaning data and coding different types of variables, as well as overlaying spatial layers.
You will also be introduced to basic procedures for testing hypotheses. This includes tabulating basic statistical measures, specifying regression models, and interpreting and visualising results. Throughout the course, the focus will be on making the data handling process transparent and on reflecting on how data management choices and the choice of statistical approach affect the validity and reliability of the results, as well as good scientific practice.
The course aims to develop students’ skills to conduct their own data management and analysis through hands-on work in groups. The last week of the course will be independent (supervised) group project work with empirical datasets.
The course uses the free statistical software package R and the geographical information software QGIS.
Don’t be a slave to the spreadsheet. Join our course and become part of an ever-growing, vibrant community using the object-oriented programming environment R as their playground.
The aim of this course is to provide participants with the tools and experience in managing and analysing data, with a focus on socioeconomic and spatial data, that are required to conduct an MSc thesis project or do research based on quantitative data in the social sciences and beyond.
Describe different types of datasets and variables (incl. the nature of maps and geodata) and the implications for the choice of appropriate data management procedure and analysis strategy
Explain principles of good conduct in relation to data storage, documentation and anonymization of sensitive personal data
Show an overview of principles and procedures for importing, merging, coding, transforming and otherwise preparing data for statistical analysis in R and QGIS
Describe the arguments for using scripts
Present an overview of basic approaches to quantitative data analysis
Apply procedures for managing different types of data in R and QGIS in preparation for statistical analysis
Combine different data sets and produce composite maps from multiple sets of digital spatial data
Develop research questions and hypotheses
Implement statistical analysis in R to derive basic cross-sectional and spatial metrics and estimate linear regression models
Solve coding problems in data management and basic statistical analysis in R
Interpret, visualize and present statistical results in a clear and concise manner
Formulate relevant research questions and hypotheses to address analytical research problems in relation to empirical datasets in the context of social science
Program a script to answer specific research questions
Argue convincingly for an appropriate choice of data management procedures and statistical methods suitable to answer basic research questions and test hypotheses based on available data and specific empirical problems
Discuss the results of empirical data analysis in terms of relevance, reliability, validity and interpretation
Reflect critically on the implications of data quality, data handling procedures, statistical methods and tests in relation to conclusions drawn from the analysis
Examples of relevant literature:
Paradis, E.: 2005, R for beginners. http://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf
Ricci, V.: 2005, R Functions For Regression Analysis. http://cran.r-project.org/doc/contrib/Ricci-refcard-regression.pdf
Thiede R., Sutton T., Düster H., Sutton M.: 2014, Quantum GIS Training Manual Release 1.0. http://manual.linfiniti.com/LinfinitiQGISTrainingManual-en.pdf
Abedin, J., & Das, K. K. (2015). Data Manipulation with R. Packt Publishing Ltd. http://www.allitebooks.com/data-manipulation-with-r-second-edition/
Osborne, J. W. (2012). Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage.
The precise literature list will be available on the course homepage in Absalon.
Academic qualifications equivalent to a BSc degree are recommended.
- 7.5 ECTS
- Type of assessment
- Oral examination, 15 minutes
Students will be assessed individually based on a short oral presentation in plenum covering the test of their own research hypothesis, the script with data management procedures, and the output of the analysis, such as tables, figures and models, based on the written assignment.
- Exam registration requirements
Admission to the exam is conditional on handing in a written group assignment on Thursday in the third week of the course. Students will receive feedback on the assignment on Friday, the last day of the course.
- All aids allowed
- Marking scale
- passed/not passed
- Censorship form
- No external censorship
One or more internal examiners
- Exam period
The exam is scheduled for Friday in week 34, the last day of the course.
As the ordinary exam.
If the student has not handed in a written assignment, it must be handed in three weeks prior to the registration deadline for the re-exam. It must be approved before the exam.
Criteria for exam assessment
To pass the course, the student must convincingly fulfil the learning outcomes described above.
- Practical exercises
- Project work