Molecular Medicine Israel

Discovery of 42 genome-wide significant loci associated with dyslexia

Abstract

Reading and writing are crucial life skills but roughly one in ten children are affected by dyslexia, which can persist into adulthood. Family studies of dyslexia suggest heritability up to 70%, yet few convincing genetic markers have been found. Here we performed a genome-wide association study of 51,800 adults self-reporting a dyslexia diagnosis and 1,087,070 controls and identified 42 independent genome-wide significant loci: 15 in genes linked to cognitive ability/educational attainment, and 27 new and potentially more specific to dyslexia. We validated 23 loci (13 new) in independent cohorts of Chinese and European ancestry. Genetic etiology of dyslexia was similar between sexes, and genetic covariance with many traits was found, including ambidexterity, but not neuroanatomical measures of language-related circuitry. Dyslexia polygenic scores explained up to 6% of variance in reading traits, and might in future contribute to earlier identification and remediation of dyslexia.

Main

The ability to read is crucial for success at school and for access to employment, information, and health and social services, and is related to attained socioeconomic status [1]. Dyslexia is a neurodevelopmental disorder characterized by severe reading difficulties, present in 5–17.5% of the population, depending on diagnostic criteria [2,3]. It often involves impaired phonological processing (the decoding of sound units, or phonemes, within words) and frequently co-occurs with psychiatric and other developmental disorders [4], especially attention-deficit hyperactivity disorder (ADHD) [5,6] and speech and language disorders [7,8]. Dyslexia may represent the low extreme of a continuum of reading ability, a complex multifactorial trait with heritability estimates ranging from 40% to 80% [9,10]. Identifying genetic risk factors not only improves understanding of the biological mechanisms, but may also expand diagnostic capabilities, facilitating earlier identification of individuals prone to dyslexia and co-occurring disorders so that they can receive specific support.

Previous genome-wide investigations of dyslexia have been limited to linkage analyses of affected families [11] or modest (n < 2,300 cases) association studies of diagnosed children and adolescents [12]. Candidate genes from linkage studies show inconsistent replication, and genome-wide association studies (GWAS) have not found significant associations, although LOC388780 and VEPH1 were supported in gene-based tests [12]. Larger cohorts are vital for increasing sensitivity to detect new genetic associations of small effect. Here, we present the largest dyslexia GWAS to date, with 51,800 adults self-reporting a dyslexia diagnosis and 1,087,070 controls, all of whom are research participants with the personal genetics company 23andMe, Inc. We validate our association discoveries in independent cohorts, provide functional annotations of significant variants (mainly single-nucleotide polymorphisms (SNPs)) and potential causal genes, and estimate SNP-based heritability. Lastly, we investigate genetic correlations with reading and related skills, health, socioeconomic, and psychiatric measures, and evaluate the evidence for previously implicated dyslexia candidate genes in our well-powered results.

Results

Genome-wide associations

The full dataset included 51,800 (21,513 males, 30,287 females) participants responding ‘yes’ to the question ‘Have you been diagnosed with dyslexia?’ (cases) and 1,087,070 (446,054 males, 641,016 females) participants responding ‘no’ (controls). Participants were aged 18 years or over (mean ages of cases and controls were 49.6 years (s.d. 16.2) and 51.7 years (s.d. 16.6), respectively). We identified 42 independent genome-wide significant associated loci (P < 5 × 10⁻⁸) and 64 loci with suggestive significance (P < 1 × 10⁻⁶) (Fig. 1 and Supplementary Table 1). Genomic inflation was moderate (λGC = 1.18) and consistent with polygenicity (see Q–Q plot, Extended Data Fig. 1). We also performed sex-specific GWAS and, because dyslexia prevalence was higher in our younger (5.34% in 20- to 30-year-olds) than older (3.23% in 80- to 90-year-olds) participants, age-specific GWAS (younger or older than 55 years). These subsample analyses showed high consistency with the main GWAS (of the full sample). Genetic correlation estimated by linkage disequilibrium (LD) score regression (LDSC) was 0.91 (95% confidence interval (CI): 0.86–0.96; P = 8.26 × 10⁻²⁵³) between males and females, and 0.97 (95% CI: 0.91–1.02; P = 2.32 × 10⁻²⁶⁸) between younger and older adults.
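
As a rough illustration of how a genomic inflation factor such as the λGC = 1.18 reported above is conventionally obtained, the sketch below converts two-sided P values to 1-d.f. chi-square statistics and divides their median by the median expected under the null (about 0.455). This is the standard textbook definition and not necessarily the pipeline used in this study; the function name `genomic_inflation` and the evenly spaced input P values are hypothetical and purely illustrative.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-d.f. chi-square distribution (about 0.4549), obtained as the
# square of the 75th percentile of the standard normal distribution.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC for two-sided P values, each in the interval (0, 1].

    Hypothetical helper for illustration; not taken from the study.
    """
    # Map each two-sided P value back to a 1-d.f. chi-square statistic:
    # the lower-tail normal quantile of p/2 is -|z|, and chi-square = z**2.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Ratio of the observed median to the median expected under the null.
    return median(chi2) / CHI2_1DF_MEDIAN


# Illustrative input only: evenly spaced P values mimic a GWAS with no signal.
n = 10_001
null_p = [(i + 0.5) / n for i in range(n)]
lambda_null = genomic_inflation(null_p)
print(f"lambda_GC for uniform P values: {lambda_null:.3f}")
```

A value close to 1 indicates no inflation of test statistics. Values above 1 can arise from confounding such as population stratification, or from true polygenic signal spread across many variants, which is the interpretation given here for the observed value.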
