Machine Learning in Social Science

Period: 2024-

Social-science theories are often evaluated with a single model specification, which makes it hard to compare competing explanations and to identify where theories fail.

This research line uses benchmark-style evaluation, prediction gaps, and large administrative and survey datasets to connect prediction to explanation: where model performance diverges across specifications or subgroups, that divergence offers substantive clues about hidden mechanisms and subgroup-specific dynamics.
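The prediction-gap idea can be illustrated with a minimal sketch: fit two competing model specifications, score each within subgroups, and treat the per-subgroup difference in performance as a diagnostic. Everything below (the synthetic data, the two toy "specifications", the subgroup labels) is illustrative and not taken from the papers listed here.

```python
# Illustrative prediction-gap analysis on synthetic data.
# Two toy "model specifications" are compared within subgroups;
# a gap that appears in one subgroup but not the other hints at
# a subgroup-specific mechanism.
import random

random.seed(0)

def make_row():
    group = "A" if random.random() < 0.5 else "B"
    x = random.random()
    if group == "A":
        y = x > 0.5                    # outcome driven by the covariate
    else:
        y = random.random() < 0.5      # outcome unrelated to it
    return group, x, y

data = [make_row() for _ in range(1000)]

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Two hypothetical specifications: a covariate-free baseline
# and a rule that uses the covariate.
def baseline(_x):
    return True            # always predict the positive class

def feature_rule(x):
    return x > 0.5         # prediction uses the covariate

gaps = {}
for group in ("A", "B"):
    rows = [(x, y) for g, x, y in data if g == group]
    acc_base = accuracy([baseline(x) for x, _ in rows], [y for _, y in rows])
    acc_feat = accuracy([feature_rule(x) for x, _ in rows], [y for _, y in rows])
    gaps[group] = acc_feat - acc_base  # per-subgroup prediction gap

print(gaps)  # large gap in A, near-zero gap in B
```

In this toy setup the covariate-based specification improves prediction only in subgroup A, so the prediction gap localizes where the extra explanatory variable actually matters.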

Selected Publications

  1. Prediction Gaps as Pathways to Explanation: Rethinking Educational Outcomes through Differences in Model Performance

    J Garcia-Bernardo, E Jaspers, W Machado, S Plach, EJ van Leeuwen (2025). arXiv preprint.

    Code PDF

  2. Population-Scale Network Embeddings Expose Educational Divides in Network Structure Related to Right-Wing Populist Voting

    M Luken, J Garcia-Bernardo, S Deb, F Hafner, M Khosla (2025). arXiv preprint.

    PDF

  3. Using Large Language Models for Text Annotation in Social Science and Humanities: A Hands-On Python/R Tutorial

    Q Fang, J Garcia-Bernardo, EJ van Kesteren (2025). OSF preprint.

    Software PDF

  4. Combining the strengths of Dutch survey and register data in a data challenge to predict fertility (PreFer)

    E Sivak, P Pankowska, A Mendrik, T Emery, J Garcia-Bernardo, et al. (2024). Journal of Computational Social Science.

    PDF

  5. The potential of benchmark challenges in the social sciences

    P Pankowska, A Mendrik, T Emery, J Garcia-Bernardo (2024). Social Science Information.

  6. Avoiding Overfitting in Variable-Order Markov Models: a Cross-Validation Approach

    V Secchini, J Garcia-Bernardo, P Jansky (2025). arXiv preprint.

    Code PDF