ASDK20004U Advanced Social Data Science I

Volume 2020/2021

Mandatory course on the MSc programme in Social Data Science at the University of Copenhagen. The course is open only to students enrolled in the MSc programme in Social Data Science.


The course introduces students to advanced quantitative social science methods, supervised machine learning and formal models of networks. The social sciences have developed a number of methods and approaches for inferring causal relations and testing theory based on observational and ‘found’ data. At the same time, machine learning methods are becoming ever more prominent, both for measurement and analysis. The first part of the course introduces advanced regression models and key research designs for causal identification from observational data in the social sciences, including regression discontinuity, difference-in-differences, event studies and instrumental variables. The second part of the course introduces the basic approaches to and methods of supervised machine learning in a social science context. This includes linear models, tree-based classification and (cross-)validation. We also introduce the intersection of machine learning and empirical social science methods, and the challenges of (re)interpreting machine learning results through a social science lens, with a focus on the explainability and interpretability of machine learning models. Finally, the course introduces basic network concepts and measures to be explored further in Social Data Theory.

Learning Outcome

Knowledge:


  • Show familiarity with advanced regression methods and different research designs for causal inference in the social sciences.
  • Describe core concepts and methods in supervised machine learning, including linear models, tree-based classification, overfitting, bias/variance trade-off and cross-validation.
  • Provide an overview of empirical issues at the intersection between machine learning and social science and describe challenges of interpretability of machine learning models.
  • Define key concepts in the analysis of complex networks.

Skills:



  • Implement common social science identification strategies to handle problems of endogeneity and selection.
  • Set up and execute simple supervised machine learning models for measurement and prediction in Python.
  • Explain challenges in applying and learning from machine learning in a social science context.
  • Structure network data in Python, and construct and extract various network measures.

Competencies:



  • Design and carry out basic analyses of complex social science networks.
  • Evaluate and implement appropriate modelling approaches given the dataset and objective, i.e. whether the goal is to evaluate a policy, build a model that best fits the data or construct new measures.
  • Critically assess how various research designs and identification strategies can or cannot be applied to questions of causal relationships in observational and ‘found’ data and use this to develop data collection strategies.
  • Account for the possibilities and limitations in the use of machine learning in the social sciences and reflect upon contemporary (mis)use of applications of machine learning in policy and research contexts.
Teaching combines lectures and classes, with a heavy emphasis on hands-on work with data in Python. Classes will present students with opportunities to apply their knowledge of programming and data handling and structuring from SDS Base Camp and Elementary Social Data Science to more advanced concepts and problems.
  Category        Hours
  Lectures        28
  Preparation     70
  Exercises       42
  Project work    66
  Total           206
Peer feedback (Students give each other feedback)
7.5 ECTS
Type of assessment
Written assignment
Written exam in the form of a group project.
Exam registration requirements

To be eligible for the exam in Advanced Social Data Science I (ASDSI), students must have passed all courses in semester 1 (i.e. Social Data Science Base Camp, Elementary Social Data Science and Data Governance: Law, Ethics and Politics).

Another requirement for eligibility is that students have completed a number of compulsory problem sets based on a social science question combining knowledge of social science research design with methods from the course. The problem sets must be completed in groups and must be approved by the instructor.

All aids allowed
Marking scale
7-point grading scale
Censorship form
No external censorship

If there are only a few students, the reexamination will take the form of an extended synopsis with oral defence. The assessment will be based exclusively on the oral defence.

Criteria for exam assessment

The exam will be assessed on the basis of the learning outcome (knowledge, skills and competencies) for the course.