Which genetic mechanisms underlie the relationship between preschool vocabulary and later literacy skills?

Last updated 18 December 2023

Preschool vocabulary acquisition is associated with later language and literacy skills.1 Genetic factors might partially explain this link,2 but the precise mechanisms are unclear. Thus far, twin-based studies have implicated mechanisms involving genetic amplification or genetic innovation.2,3

In their latest study, an international team of researchers, including first author Ellen Verhoef and lead scientist Beate St Pourcain from the Max Planck Institute (MPI) for Psycholinguistics, investigated this problem using genome-wide genetic data from participants in a large UK population-based cohort. Specifically, the scientists weighed the evidence for an amplification of genetic factors related to early vocabulary against the evidence for genetic innovation occurring during development. To do so, they studied expressive and receptive vocabulary skills at 38 months of age and various language- and literacy-related skills, as well as nonverbal intelligence, at 7 to 13 years of age in approximately 6,000 unrelated children. They then analyzed the genetic relationships between early-childhood expressive and receptive vocabulary and later language- and literacy-related skills.

The researchers found little support for the emergence (i.e. innovation) of novel genetic sources for language, literacy or cognitive ability during mid-childhood or early adolescence. However, they did find evidence that genetic factors contributing to early-childhood receptive vocabulary were amplified. These genetic factors seemed to explain most of the genetic variance underlying differences in later reading, verbal and nonverbal cognitive skills.

“While individual predictions of a child’s future language and reading abilities using very early vocabulary scores are poor, and this includes genetic predictions, our study clearly highlights that the genetic foundations underlying these early skills play an important role during later life, in particular for literacy and cognitive skills, as observed in a large population-based cohort”, explains St Pourcain. “Thus, our study underlines the need for (1) accurate and detailed assessments of language skills during toddlerhood, which are currently only sparsely available in large-scale cohorts, and (2) an in-depth characterisation of genetic factors contributing to early language development, so that we can better understand the genetic and non-genetic processes contributing to later-life outcomes”.


Referring to

Verhoef, E. et al. (2020). J. Child Psychol. Psychiatr. doi: 10.1111/jcpp.13327.


1. Bleses, D. et al. (2016). Early productive vocabulary predicts academic achievement 10 years later. Appl. Psycholinguist. 37, 1461–1476. doi: 10.1017/S0142716416000060.

2. Hayiou-Thomas, M.E. et al. (2012). The etiology of variation in language skills changes with development: A longitudinal twin study of language from 2 to 12 years. Dev. Sci. 15, 233–249. doi: 10.1111/j.1467-7687.2011.01119.x.

3. Harlaar, N. et al. (2008). Why do preschool language abilities correlate with later reading? A twin study. J. Speech Lang. Hear. Res. 51, 688–705. doi: 10.1044/1092-4388(2008/049).


Genetic amplification: genetic factors are associated with a trait throughout development and increasingly explain that trait as age progresses.

Genetic innovation: genetic factors (that were previously unrelated to a trait) become associated with that trait during development.
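
To make the two definitions concrete, here is a toy sketch in Python. It is not the authors' method: the `Loadings` class, the `describe` function and all the numbers are hypothetical, and the sketch assumes a trait influenced by two independent, unit-variance genetic factors, so that the variance each factor explains is simply its squared loading.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Loadings:
    """Hypothetical loadings of a trait on two independent genetic factors."""
    early_factor: float  # factor already associated with the trait at the first age
    new_factor: float    # factor that may only become associated later


def explained_variance(loadings: Loadings) -> Tuple[float, float]:
    # Assumption: independent, unit-variance factors, so the variance
    # explained by each factor is its squared loading.
    return loadings.early_factor ** 2, loadings.new_factor ** 2


def describe(first_age: Loadings, later_age: Loadings) -> List[str]:
    """Label the pattern(s) seen between two ages, following the definitions above."""
    early_then, new_then = explained_variance(first_age)
    early_now, new_now = explained_variance(later_age)
    patterns = []
    # Amplification: the same factor explains more of the trait at the later age.
    if early_then > 0 and early_now > early_then:
        patterns.append("amplification")
    # Innovation: a previously unrelated factor becomes associated with the trait.
    if new_then == 0 and new_now > 0:
        patterns.append("innovation")
    return patterns


# Made-up numbers, chosen only to illustrate each pattern.
amplified = describe(Loadings(0.3, 0.0), Loadings(0.6, 0.0))
innovated = describe(Loadings(0.3, 0.0), Loadings(0.3, 0.5))
print(amplified)  # ['amplification']
print(innovated)  # ['innovation']
```

In the first call the early factor explains 9% of trait variance at the first age and 36% later, with no new factor appearing. In the second, the early factor's contribution stays the same while a second factor begins to contribute. Both patterns can also occur together.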

Dr Jessica Edwards
Jessica received her MA in Biological Sciences and her DPhil in Neurobehavioural Genetics from the University of Oxford (Magdalen College). After completing her post-doctoral research, she moved into scientific editing and publishing, first working for Spandidos Publications (London, UK) and then moving to Nature Publishing Group. Jessica is now a freelance editor and science writer, and started writing for “The Bridge” in December 2017.
