Machine Learning in Social Science

Period: 2024-

Social-science theories are often evaluated with a single model specification, which makes it hard to compare competing explanations and to identify where theories fail.

This research line uses benchmark-style evaluation, prediction gaps, and large administrative and survey datasets to connect prediction to explanation: where model performance diverges across specifications or subgroups, that divergence offers substantive clues about hidden mechanisms and subgroup-specific dynamics.
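The prediction-gap idea can be illustrated with a minimal sketch: fit two competing model specifications, score each within subgroups, and treat the per-subgroup difference in performance as a diagnostic. Everything below (the synthetic data, the two toy "specifications", the subgroup labels) is illustrative and not taken from the papers listed here.

```python
# Illustrative prediction-gap analysis on synthetic data.
# Two toy "model specifications" are compared within subgroups;
# a gap that appears in one subgroup but not the other hints at
# a subgroup-specific mechanism.
import random

random.seed(0)

def make_row():
    group = "A" if random.random() < 0.5 else "B"
    x = random.random()
    if group == "A":
        y = x > 0.5                    # outcome driven by the covariate
    else:
        y = random.random() < 0.5      # outcome unrelated to it
    return group, x, y

data = [make_row() for _ in range(1000)]

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Two hypothetical specifications: a covariate-free baseline
# and a rule that uses the covariate.
def baseline(_x):
    return True            # always predict the positive class

def feature_rule(x):
    return x > 0.5         # prediction uses the covariate

gaps = {}
for group in ("A", "B"):
    rows = [(x, y) for g, x, y in data if g == group]
    acc_base = accuracy([baseline(x) for x, _ in rows], [y for _, y in rows])
    acc_feat = accuracy([feature_rule(x) for x, _ in rows], [y for _, y in rows])
    gaps[group] = acc_feat - acc_base  # per-subgroup prediction gap

print(gaps)  # large gap in A, near-zero gap in B
```

In this toy setup the covariate-based specification improves prediction only in subgroup A, so the prediction gap localizes where the extra explanatory variable actually matters.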

Selected Publications

  1. Prediction Gaps as Pathways to Explanation: Rethinking Educational Outcomes through Differences in Model Performance

    J Garcia-Bernardo, E Jaspers, W Machado, S Plach, EJ van Leeuwen (2025). arXiv preprint.

    Code PDF

  2. Population-Scale Network Embeddings Expose Educational Divides in Network Structure Related to Right-Wing Populist Voting

    M Luken, J Garcia-Bernardo, S Deb, F Hafner, M Khosla (2025). arXiv preprint.

    PDF

  3. Using Large Language Models for Text Annotation in Social Science and Humanities: A Hands-On Python/R Tutorial

    Q Fang, J Garcia-Bernardo, EJ van Kesteren (2025). OSF preprint.

    Software PDF

  4. Combining the strengths of Dutch survey and register data in a data challenge to predict fertility (PreFer)

    E Sivak, P Pankowska, A Mendrik, T Emery, J Garcia-Bernardo, et al. (2024). Journal of Computational Social Science.

    PDF

  5. The potential of benchmark challenges in the social sciences

    P Pankowska, A Mendrik, T Emery, J Garcia-Bernardo (2024). Social Science Information.

  6. Avoiding Overfitting in Variable-Order Markov Models: a Cross-Validation Approach

    V Secchini, J Garcia-Bernardo, P Jansky (2025). arXiv preprint.

    Code PDF