D-036
Phonetic and Phonemic Segmentation: Neural Representations of Speech Attributes in Natural Dialogue
Juan Octavio Castro1,2, Joaquín E. González1, Jazmín Vidal Domínguez1, Agustín Gravano3,4, Pablo E. Riera1,5, Juan E. Kamienkowski1,5,6
  1. Laboratorio de Inteligencia Artificial Aplicada, Instituto de Ciencias de la Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires - CONICET, Argentina
  2. Departamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina
  3. Laboratorio de Inteligencia Artificial; Escuela de Negocios, Universidad Torcuato Di Tella, Argentina
  4. CONICET, Argentina
  5. Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina
  6. Maestría de Explotación de Datos y Descubrimiento del Conocimiento, FCEyN-FI, UBA, Argentina.
Presenting Author:
Juan Octavio Castro
joctavio287@gmail.com
The study of speech in natural environments poses challenges for traditional electroencephalography (EEG) analysis approaches. In recent years, machine learning models—particularly regularized linear encoding models—have enabled a transition toward experimental designs that incorporate dynamic, naturalistic stimuli, such as speech during dialogue. This work aims to understand how different speech attributes are encoded in the brain in the context of unscripted natural dialogue. To this end, we extract low-level attributes (e.g., envelope, pitch, spectrogram), high-level attributes (e.g., phonemes, phonological features), and attributes derived from representations produced by deep neural networks (Wav2Vec2.0, Whisper). The results show that including high-level attributes significantly improves the prediction of brain signals across all frequency bands. In particular, predictions based on phonemes and phonological features suggest that neural sensitivity is consistent with the hypothesis of a hierarchical language-processing system.
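As an illustration of the general approach, the following is a minimal sketch of a regularized linear encoding model of the kind described above: a time-lagged stimulus attribute (here, a synthetic stand-in for the speech envelope) is regressed onto multichannel EEG with ridge regression, and prediction quality is scored by the correlation between predicted and actual signals on held-out data. All data, dimensions, and the regularization strength are illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_samples, n_lags, n_channels = 1000, 32, 8  # illustrative sizes, not from the study

# Synthetic stimulus attribute (stand-in for the speech envelope) at the EEG rate.
envelope = rng.standard_normal(n_samples)

# Time-lagged design matrix: each column is the attribute shifted by one sample,
# so the linear model learns a temporal response function per EEG channel.
X = np.column_stack([np.roll(envelope, lag) for lag in range(n_lags)])
X[:n_lags] = 0.0  # discard samples that wrapped around

# Simulate EEG that partially tracks the lagged stimulus, plus noise.
true_weights = rng.standard_normal((n_lags, n_channels))
eeg = X @ true_weights + 0.5 * rng.standard_normal((n_samples, n_channels))

# Regularized (ridge) linear encoding model: fit on the first half,
# evaluate on the held-out second half.
half = n_samples // 2
model = Ridge(alpha=1.0).fit(X[:half], eeg[:half])
pred = model.predict(X[half:])

# Score: Pearson correlation between predicted and actual EEG, per channel.
corr = [np.corrcoef(pred[:, c], eeg[half:, c])[0, 1] for c in range(n_channels)]
```

In practice such models concatenate many attribute streams (envelope, spectrogram bands, phoneme indicators, network embeddings) into the design matrix, so the gain from adding high-level attributes can be read off as the improvement in held-out correlation.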