ASDK20006U Advanced Social Data Science II
Full-degree students enrolled at the Faculty of Social Science, UCPH
- MSc in Security Risk Management
- Master Programme in Social Data Science
- Master Programmes in Sociology
- Master Programme in Political Science and Social Science
- Master Programmes in Economics
Mandatory course on the MSc programme in Social Data Science at the University of Copenhagen.
The wealth of new data in the digital society is characterized by high-frequency, highly granular observations, allowing for both comprehensive and detailed analysis of social and individual behaviour. Text data such as comments and conversations on social media have the potential to provide thick descriptions of social interactions and individual values in large-scale, sometimes population-level, settings. Digitalization of large corpora of legal, administrative and political texts allows for dynamic analysis of evolving social ideas and issues. Most digital data do not arrive in simple, accessible, quantifiable and comparable forms, but as text, sound and pictures. Advanced Social Data Science II focuses on unstructured data and on methods for processing, transforming and handling complex, high-dimensional text data. The course presents classic text data methods for characterizing and developing typologies and categories of individual and social behaviour, networks and ideas. Furthermore, it introduces state-of-the-art methods for classifying complex unstructured text data, and relates such data-driven methods to existing theoretical methods and models in the social sciences. Coding in the course is in Python, and the course requires familiarity with Python.
After completing the course, the student is expected to be able to:
Knowledge:
- Understand key concepts in traditional bag-of-words approaches to text data and how they relate to embedding- and transformer-based approaches.
- Explain the differences between, and the capabilities of, neural network architectures such as RNNs, LSTMs and attention-based models.
- Account for various learning strategies, algorithms and approaches: clustering and unsupervised learning, supervised learning, semi-supervised learning, and transfer learning.
- Account for the potential of different representations, encodings and transformations of text, structured and unstructured.
Skills:
- Apply and justify preprocessing methods for text data.
- Extract reliable information from text data using supervised and unsupervised learning and techniques from natural language processing.
- Use scikit-learn and PyTorch to apply basic and advanced machine learning models.
- Apply state-of-the-art deep transfer learning to classify unstructured data.
Competences:
- Integrate theoretical and applied knowledge within the field of Social Data Science and formulate compelling research questions given an unstructured dataset.
- Construct data sets from unstructured text and media data that are validated and well documented.
- Independently carry out a problem-driven end-to-end analysis given an unstructured dataset of text, including exploratory analysis and discovery using unsupervised methods and supervised learning for measurement, and assessment of model-based biases.
- Critically evaluate the implications of results, considering model limitations and biases, and systematic noise introduced by data collection and sampling methods.
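As a rough illustration of the bag-of-words idea named under Knowledge, the sketch below turns documents into token counts and compares them with cosine similarity. It is a minimal, standard-library-only sketch and not course material: the helper names (`tokenize`, `bag_of_words`, `cosine_similarity`) and the toy sentences are assumptions made up for this example, and in practice the course tools (scikit-learn, PyTorch) would take the place of these helpers.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Map a document to its token counts, discarding word order."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    # Only tokens present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy documents, invented for illustration only.
docs = [
    "The minister defended the new tax policy.",
    "The new tax policy was defended by the minister.",
    "Fans celebrated the football result.",
]
vectors = [bag_of_words(d) for d in docs]
sim_same_topic = cosine_similarity(vectors[0], vectors[1])
sim_other_topic = cosine_similarity(vectors[0], vectors[2])
```

The first two sentences share almost all of their vocabulary, so their similarity is higher than that of the first and third. Because word order is discarded, an active and a passive phrasing look nearly identical, which is one limitation that embedding- and transformer-based representations aim to address.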
Examples of course readings:
● Gentzkow, M., Kelly, B. T., & Taddy, M. Text as Data. *Journal of Economic Literature*.
● Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. *Political Analysis*, 21(3), 267–297.
● Hastie, T., Tibshirani, R., & Friedman, J. (2008). *The Elements of Statistical Learning: Data Mining, Inference, and Prediction*.
● Hovy, D. *Text Analysis in Python for Social Scientists*.
● Jurafsky, D., & Martin, J. H. (2014). *Speech and Language Processing*. Vol. 3. London: Pearson.
● Rodriguez, P. L., & Spirling, A. (2022). Word embeddings: What works, what doesn’t, and how to tell the difference for applied research. *The Journal of Politics*, 84(1), 101–115.
● Bail, C. A. (2024). Can generative AI improve social science? *Proceedings of the National Academy of Sciences*, 121(21), e2314021121.
- Category: Hours
- Lectures: 28
- Class Instruction: 42
- Preparation: 112
- Exam: 24
- Total: 206
When you are registered for the course, you will be signed up for the exam.
- Full-degree students – sign up at Selfservice on KUnet
The exam dates are listed on the page Exams – Faculty of Social Sciences – University of Copenhagen (ku.dk).
Please note that it is your own responsibility to check for overlapping exam dates.
- Credit: 7.5 ECTS
- Type of assessment: On-site written exam, 4 hours under invigilation
- Type of assessment details: The exam can only be written individually. The exam assesses students’ ability to explain, apply, and critically evaluate methods for computational text analysis. Exam questions may include conceptual questions as well as questions involving code, data processing, model interpretation, and validation.
- Examination prerequisites: 11 out of the 14 assignments must be approved for the student to participate in the exam.
- Aid: Only certain aids allowed. Students may bring one two-sided A4 sheet of notes. No internet access, digital notes, communication tools, or generative AI tools are allowed during the exam.
- Marking scale: 7-point grading scale
- Censorship form: No external censorship
- Exam period: The examination date can be found in the exam schedule. The exact time and place will be available in Digital Exam from the middle of the semester.
- Re-exam: Same as the ordinary exam. The reexamination date/period can be found in the reexam schedule.
Criteria for exam assessment
Students are assessed on the extent to which they master the learning outcome for the course.
To obtain the top grade “12”, the student must demonstrate an excellent performance, with no or only a few minor weaknesses, displaying a high level of command of all aspects of the relevant material and the ability to make use of the knowledge, skills and competencies listed in the learning outcomes.
To obtain the passing grade “02”, the student must demonstrate, in a satisfactory way, a minimally acceptable level of the knowledge, skills and competencies listed in the learning outcomes.
Course information
- Language: English
- Course code: ASDK20006U
- Credit: 7.5 ECTS
- Level: Full Degree Master
- Duration: 1 block
- Placement: Block 4
Study board
- Social Data Science
Contracting departments
- Social Data Science
- Department of Political Science
- Department of Sociology
- Department of Economics
Contracting faculty
- Faculty of Social Sciences
Course Coordinators
- Frederik Hjorth
- Clara Johan E Vandeweerdt