Abstract
The carefully orchestrated process of neurogenesis governs the expansion of the developing cortex, with the extension of these neurogenic pathways underpinning the increasingly complex cortex in the primate lineage. This expansion has co-occurred with the rapid evolution of genomic loci speculated to be responsible for the regulation of these pathways. Disruption of these tightly regulated processes results in abnormal structural changes in the developed cortex, known as cortical malformations.
Periventricular nodular heterotopia (PVNH) is a genetically and phenotypically heterogeneous cortical malformation characterised by the mispositioning of grey matter along the surface of the lateral ventricles and is associated with epilepsy and developmental delay. Currently, only a third of patients have a molecular diagnosis, presenting a considerable diagnostic gap. Further identification of genetic determinants of this phenotype will highlight cellular pathways critical for neurogenesis, while simultaneously improving patient management. This thesis details the utilisation of high-throughput sequencing analyses on a cohort of 202 PVNH families in combination with functional assays to improve current diagnostic rates.
The first aspect of this study relates to the assessment of data quality in the current cohort, spurred by the discovery of multiple seemingly pathogenic changes resulting from vector contamination. Consequently, a rigorous workflow of quality control steps was employed to assess the extent of vector contamination and similar quality issues in the PVNH cohort. This process revealed weaknesses in the currently available quality control tools, requiring the development of a new programme that led to the detection of 11 vector contamination events, allowing for appropriate mitigation.
The primary variant analyses of this study focused on a subcohort of 137 PVNH patients and their unaffected parents. These trio analyses sought to assess the contribution of coding de novo variation and biallelic genotypes to the pathogenesis of PVNH. Pathogenic variants were discovered for 17 patients, with variant recurrence observed in two genes: FLNA (n = 4), a known PVNH gene, and SON (n = 5), a novel PVNH gene associated with ZTTK (Zhu-Tokita-Takenouchi-Kim) syndrome, a severe multi-system congenital malformation disorder. The remaining pathogenic variants were associated with a broad range of neurodevelopmental phenotypes. This finding suggests that a significant proportion of PVNH cases occur in association with known neurodevelopmental diseases, where PVNH is a rare or currently underappreciated phenotypic component.
The third aspect of this project involved exploratory variant analyses in the 102 unresolved whole-genome sequence PVNH trios. This included profiling several forms of variation known to be associated with other neurodevelopmental diseases, including mosaic variation, structural changes, and non-coding variants. In addition, 56 PVNH patients without complete parental information were screened for pathogenic variation using a restricted gene filtering approach, under the hypothesis that a considerable portion of PVNH cases are associated with a broad range of neurodevelopmental disorders without currently recognised associations with PVNH. In combination, these analyses led to the discovery of a further 14 pathogenic variants, including another pathogenic variant in SON.
Finally, the association of SON with PVNH was further explored. This was achieved using a combination of transcriptomic analyses focused on understanding differential transcript splicing and expression, and functional analyses aimed at understanding the differential localisation and protein interactors of SON isoforms. These analyses led to the identification of a currently unannotated protein-coding transcript of SON that may represent a critical element in the pathogenesis of ZTTK syndrome. Proteomic analyses aimed at assessing the binding partners of each RNA-binding isoform of SON further supported this idea. In addition, this proteomic information was used to prioritise candidate variants identified in the unresolved PVNH trio subcohort, under the hypothesis that shared pathways are being differentially disrupted in this condition.
Together, these analyses represent a comprehensive genomic examination of a cohort of PVNH families and contribute insights into the complex genetic architecture underlying this condition, while providing diagnostic answers for 31 families. Further, this study provides a range of analytical frameworks for identifying quality anomalies and candidate variation, which are broadly applicable to the study of the genomic determinants of rare diseases using high-throughput sequencing data.