Can Informativity Effects Be Predictability Effects in Disguise?

Bibliographic Details
Main Author: Vsevolod Kapatsinski
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/27/7/739
Description
Summary: Recent work in corpus linguistics has observed that informativity predicts articulatory reduction of a linguistic unit above and beyond the unit’s predictability in the local context, i.e., the unit’s probability given the current context. Informativity of a unit is the inverse of average (log-scaled) predictability and corresponds to its information content. Research in the field has interpreted effects of informativity either as speakers being sensitive to the information content of a unit when deciding how much effort to put into pronouncing it, or as the accumulation of memories of pronunciation details in long-term memory representations. However, average predictability can improve the estimate of a unit’s local predictability above and beyond the predictability observed in that context, especially when that context is rare. Therefore, informativity can explain variance in a dependent variable like reduction above and beyond local predictability simply because it improves the (inherently noisy) estimate of local predictability. This paper shows how to estimate, via simulation, the proportion of an observed informativity effect that is likely to be artifactual, i.e., due entirely to informativity improving the estimates of predictability. The proposed simulation approach can be applied to any real dataset to investigate whether an effect of informativity is likely to be real, under the assumption that corpus probabilities are an unbiased estimate of the probabilities driving reduction behavior, and how much of it is likely to be due to noise in predictability estimates.
ISSN: 1099-4300
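
The mechanism described in the summary can be illustrated with a small simulation. The sketch below is not the article's procedure; it is a minimal toy example under assumptions made here for illustration: local surprisal (negative log predictability) is a word-level mean plus a context-level deviation, the corpus estimate of it carries additive Gaussian noise, informativity is the unweighted mean of a word's observed surprisals, and the outcome (a stand-in for a reduction measure such as duration) depends only on the true local surprisal. The names `simulate`, `fit_two_predictors`, `noise_sd`, `n_words`, and `n_contexts` are hypothetical.

```python
import random


def fit_two_predictors(y, x, z):
    """Return OLS slopes (b_x, b_z) for y ~ intercept + x + z.

    Solves the centered 2x2 normal equations directly, so no external
    libraries are needed.
    """
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    szy = sum((b - mz) * (c - my) for b, c in zip(z, y))
    det = sxx * szz - sxz ** 2
    b_x = (sxy * szz - szy * sxz) / det
    b_z = (szy * sxx - sxy * sxz) / det
    return b_x, b_z


def simulate(noise_sd, n_words=200, n_contexts=20, seed=1):
    """Simulate a dataset in which informativity has NO causal effect.

    Assumption (illustrative only): estimation noise in local surprisal
    is Gaussian with standard deviation `noise_sd`.
    Returns the fitted slopes (local surprisal, informativity).
    """
    rng = random.Random(seed)
    outcome, observed, informativity = [], [], []
    for _ in range(n_words):
        word_mean = rng.gauss(0.0, 1.0)
        # True local surprisal of this word in each of its contexts.
        true_s = [word_mean + rng.gauss(0.0, 1.0) for _ in range(n_contexts)]
        # Noisy corpus estimate of that surprisal.
        obs_s = [s + rng.gauss(0.0, noise_sd) for s in true_s]
        # Informativity: average observed surprisal across contexts.
        info = sum(obs_s) / n_contexts
        for s, o in zip(true_s, obs_s):
            # The outcome is driven by TRUE local surprisal only.
            outcome.append(1.0 * s + rng.gauss(0.0, 0.5))
            observed.append(o)
            informativity.append(info)
    return fit_two_predictors(outcome, observed, informativity)


if __name__ == "__main__":
    for noise_sd in (0.0, 1.0):
        b_local, b_info = simulate(noise_sd)
        print(f"noise_sd={noise_sd}: local slope={b_local:.2f}, "
              f"informativity slope={b_info:.2f}")
```

With `noise_sd=0.0` the informativity slope should sit near zero, because the observed surprisal already equals the true one. With `noise_sd=1.0` the informativity slope should become clearly positive even though informativity plays no causal role in generating the outcome, since averaging across contexts cancels part of the estimation noise. Real corpus estimates are count-based and noisier in rare contexts, which this Gaussian simplification does not capture.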