New cross-linguistic data and methods support the weak negative effect of the "derived" allele of ASPM on tone, but not of Microcephalin

Dan Dediu
PLOS ONE 16(6): e0253546
DOI: 10.1371/journal.pone.0253546

While it is generally accepted that language and speech have genetic foundations, and that the widespread inter-individual variation observed in many of their aspects is partly driven by variation in genes, it is much less clear whether differences between languages may also be partly rooted in our genes. One such proposal is that the population frequencies of the so-called “derived” alleles of two genes involved in brain growth and development, ASPM and Microcephalin, are related to the probability of speaking a tone language. The original study introducing this proposal used a cross-linguistic statistical approach, showing that these associations are “special” when compared with many other possible relationships between genetic variants and linguistic features. Recent experimental evidence strongly supports a negative effect of the “derived” allele of ASPM on tone perception and/or processing within individuals, but finds no effect for Microcephalin. Motivated by these experimental findings, I conduct here a cross-linguistic statistical test using a larger and updated dataset of 175 samples from 129 unique (meta)populations and a battery of methods: mixed-effects regression (Bayesian and maximum-likelihood), mediation and path analysis, decision trees, and random forests. Permutations and restricted sampling are used to control for the confounding effects of genealogy (language families) and contact (macroareas). Overall, the results support a weak negative effect of ASPM-D on the presence of tone, above and beyond the strong confounding influences of genealogy and contact, but they suggest that the original association between tone and Microcephalin (MCPH1) might have been a false positive, explained by differences between populations and languages within and outside Africa.
Thus, these cross-linguistic population-scale statistical results are fully consonant with the inter-individual-level experimental results, and suggest that the observed linguistic diversity may be, at least in some cases, partly driven by genetic diversity.