ASDK20006U Advanced Social Data Science II
Full-degree students enrolled at the Faculty of Social Sciences, UCPH
- MSc in Security Risk Management
- Master Programme in Social Data Science
- Master Programmes in Sociology
- Master Programme in Political Science
- Master Programmes in Economics
Mandatory course on the MSc programme in Social Data Science at the University of Copenhagen.
The wealth of new data in the digital society is characterized by high-frequency observations at high granularity, allowing for both comprehensive and detailed analysis of social and individual behaviour. Digital messages, comments and conversations on social media have the potential to provide thick descriptions of social interactions and individual values in large-scale, sometimes population-level, settings. Likewise, the digitalization of large corpora of legal, administrative and political texts allows for dynamic analysis of evolving social ideas and issues. However, most digital data do not arrive in simple, accessible, quantifiable and comparable forms, but as text, sound and pictures. Advanced Social Data Science II focuses on unstructured data and on methods for processing, transforming and otherwise handling complex, high-dimensional data. The course presents classic unsupervised learning methods for characterizing and developing typologies and categories of individual and social behaviour, networks and ideas. Furthermore, it introduces state-of-the-art methods of self-supervision and transfer learning for classifying complex unstructured data such as text and audio-visual data, and relates such data-driven methods to existing theoretical methods and models in the social sciences.
Knowledge
- Explain the differences between, and capabilities of, neural network architectures such as CNN, RNN, LSTM and attention-based models.
- Account for various learning strategies, algorithms and approaches, including clustering and other unsupervised learning, supervised learning, semi-supervised learning, transfer learning and multi-task learning.
- Account for the potential of different representations, encodings and transformations of text, both structured and unstructured.
Skills
- Extract reliable information from text data using supervised and unsupervised learning and techniques from natural language processing.
- Use scikit-learn and PyTorch to apply basic and advanced machine learning models.
- Apply state-of-the-art deep transfer learning to classify unstructured data.
Competencies
- Integrate theoretical and applied knowledge within the field of Social Data Science and formulate compelling research questions given an unstructured dataset.
- Construct validated and documented data sets for social science from unstructured text and media data.
- Independently carry out an end-to-end analysis given an unstructured dataset of text, including exploratory analysis and discovery using unsupervised methods and supervised learning for measurement, and assessment of model-based biases.
- Critically evaluate the implications of results, considering model limitations and biases, and systematic noise introduced by data collection and sampling methods.
- Communicate results to specialists within the academic field using comprehensive statistics and modern visualization methods, in particular methods for plotting new data types.
Examples of course readings:
- Bishop, Christopher: *Pattern Recognition and Machine Learning*. Springer, 2006.
- Cantu, Francisco & Michelle Torres: "Learning to See: Visual Analysis for Social Science Data".
- Gentzkow, M., Kelly, B. T., & Taddy, M. Text as Data. *Journal of Economic Literature*.
- Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. *Political Analysis*, 21(3), 267-297.
- Hastie, T., Tibshirani, R., & Friedman, J. (2008). *The Elements of Statistical Learning: Data Mining, Inference and Prediction*.
- Hovy, D. *Text Analysis in Python for Social Scientists.*
- Jurafsky, Dan, and James H. Martin. *Speech and language processing*. Vol. 3. London: Pearson, 2014.
- Krippendorff, Klaus. *Content analysis: An introduction to its methodology*. Sage Publications, 2018.
| Category | Hours |
|---|---|
| Lectures | 28 |
| Preparation | 112 |
| Exercises | 42 |
| Exam | 24 |
| Total | 206 |
- Credit: 7,5 ECTS
- Type of assessment: Written assignment
- Type of assessment details: 72-hour written take-home exam. The exam can be written individually or in groups of 2-4 students.
- Aid: All aids allowed. ChatGPT and other large language model tools are permitted as a dedicated source, meaning that text copied verbatim must be quoted, the tool must be cited, and the specific use made of such tools must be described in the submitted exam.
- Marking scale: 7-point grading scale
- Censorship form: No external censorship
- Re-exam:
A written take-home assignment, written either in a group or individually, on a subject pertaining to the course content and prescribed literature. The subject must be pre-approved by the course lecturer(s). The essay must be structured like a standard academic written assignment based on an explicitly defined research question, include the application of multiple methods taught in the course, and relate the use of these to relevant course readings.
Criteria for exam assessment
The exam will be assessed on the basis of the learning outcomes (knowledge, skills and competencies) for the course.
Course information
- Language: English
- Course code: ASDK20006U
- Credit: 7,5 ECTS
- Level: Full Degree Master
- Duration: 1 block
- Placement: Block 4
- Course capacity: 70 students
Study board
- Social Data Science
Contracting departments
- Social Data Science
- Department of Political Science
- Department of Sociology
- Department of Economics
Contracting faculty
- Faculty of Social Sciences
Course Coordinators
- Frederik Georg Hjorth (fh@ifs.ku.dk)