New paper published in Scientific Data, a Nature journal

Cuentos: A Large-Scale Eye-Tracking Reading Corpus on Spanish Narrative Texts
Fermin Travi, Bruno Bianchi, Diego Fernandez Slezak & Juan E Kamienkowski

Abstract

Eye-tracking is a well-established method for studying reading processes. Our gaze jumps word to word, sampling information almost sequentially. Time spent on each word, along with skipping or revisiting patterns, provides proxies for cognitive processes during comprehension. However, few studies have focused on Spanish, where empirical data remain scarce, and little is known about how findings from other languages translate to Spanish reading behavior. We present the largest publicly available Spanish eye-tracking dataset to date, comprising readings of self-contained stories from 113 native speakers (mean age 23.8; 61 females, 52 males). The dataset comprises both long stories (3300 ± 747 words, 11 readings per item on average) and short stories (795 ± 135 words, 50 readings per item on average), providing extensive coverage of natural reading scenarios with over 940,000 fixations covering close to 40,000 words (8,500 unique words). This comprehensive resource offers opportunities to investigate Spanish eye movement patterns, explore language-specific cognitive processes, examine Spanish linguistic phenomena, and develop computational algorithms for reading research and natural language processing applications.

Link to the article: https://doi.org/10.1038/s41597-026-06798-z

Acknowledgements

We thank Julia Carbajal and Diego Shalom for contributing to the long stories corpus data collection, Gabriel Leclercq for contributing to the preprocessing of the short stories corpus data, Malena Mul Fedele and Daniel Vigo for their collaboration on the fatigue measurements, and Julia Carbajal, Diego Shalom, Sebastian Cantini Burden, Gabriel Leclercq, and Alfredo Umfurer for enlightening discussions across different projects using the present dataset. This research was supported by Agencia I+D+i (PICT 2021-I-A-00998, Argentina) and CONICET (PIP 11220220100240CO, Argentina).