Hollow Datasets: Algorithmic Calculability in Data Curation

Authors

  • Alejandro Alvarado Rojas, University of Southern California

DOI:

https://doi.org/10.5210/spir.v2024i0.15045

Keywords:

critical data studies, data curation, data science platforms, hollow datasets, infrastructure studies

Abstract

Data science platforms are infrastructures for the collaborative curation, processing, analysis, and application of datasets. In facilitating access to data resources, these platforms change the social and material conditions under which knowledge is generated from data, a shift that may be characterized as the platformization of data science. Platform configurations shape the curatorial practices that render data actionable. However, the specific mechanisms through which these platforms organize data curation remain overlooked. In this study, I examine the sociotechnical organization of data curation on Kaggle, a prominent data science platform. Conceptualizing Kaggle as a calculative infrastructure, I conduct a technographic analysis of Kaggle’s Usability Rating to unpack how data quality is calculated. Findings suggest that making data curation calculable operates through an algorithmic rationality that conditions the generation of hollow datasets by reducing meaningful, contextual dataset contents to numerical indicators. The concept of hollow datasets captures how digital platform logics and data science cultures reconfigure data curation as a procedural achievement in pursuit of data quality.

Published

2026-01-02

How to Cite

Alvarado Rojas, A. (2026). Hollow Datasets: Algorithmic Calculability in Data Curation. AoIR Selected Papers of Internet Research. https://doi.org/10.5210/spir.v2024i0.15045

Section

Papers A