Abstract
Breast cancer is a heterogeneous disease with a complex genetic architecture. It is the most commonly diagnosed cancer in women worldwide. Oestrogen receptor-positive (ER+) breast cancer accounts for up to 80% of all cases. This type of breast cancer is primarily driven by sustained oestrogen signalling, which promotes proliferation and differentiation in both healthy and malignant breast tissue. In breast cancer, normal oestrogen receptor (ER) signalling is disrupted. The pathogenic reprogramming of the oestrogen-regulated network plays a crucial role in neoplastic transformation, tumour progression, and endocrine resistance. However, the mechanisms mediating this process are not yet well understood.
The ER is encoded by the ESR1 gene on the long arm of chromosome 6. Yet, despite its well-established role in driving ER+ breast cancer, ESR1 mutations are rare in primary breast cancer. The 6q25.1 locus, which encompasses the ESR1 gene, has emerged as a significant genetic susceptibility locus associated with multiple breast cancer subtypes. Moreover, in ER+ tumours, ESR1 expression levels show a positive association with those of three upstream genes, including CCDC170, a pattern not observed in oestrogen receptor-negative (ER−) breast cancer. In this thesis, the 6q25.1 locus is defined as a 725 kb region (chr6: 151725897-152450754) spanning RMND1, ARMT1, CCDC170 and ESR1.
This thesis aimed to investigate the structural organisation and epigenetic landscape of the 6q25.1 locus to determine the mechanisms through which genetic variants within this region influence breast cancer susceptibility.
The first aim of this study was to investigate the genetic complexity and structural organisation of the 6q25.1 locus to better understand its association with different breast cancer subtypes and its variable risk signals across populations. Breast cancer incidence varies significantly worldwide, with the highest rates reported in Europe, North America, and Oceania, and substantially lower rates in East and South-East Asia. Analysis of the locus revealed that risk variants in this region are more strongly associated with ER-negative breast cancer and converge at regulatory sites primarily involved in the regulation of CCDC170 and ESR1. In Europeans, intragenic variants spanning the CCDC170–ESR1 locus represent two distinct risk signals with minimal linkage disequilibrium. In East Asian populations, by contrast, these variants cluster as a single highly correlated signal.
Using publicly available Hi-C datasets, this study demonstrated that the structural organisation and epigenetic landscape of the 6q25.1 locus also differ between breast cancer models. These differences suggest that the cell-type-specific effects of 6q25.1 risk variants could be mediated by variations in 3D chromatin structure. Cohesin was shown to play a critical role in regulating genetic interactions and chromatin accessibility within this locus. In MCF7 cells, depletion of the RAD21 cohesin subunit led to a gain in short-range interactions and a loss of chromatin accessibility, particularly within an ER+ specific super-enhancer upstream of ESR1. This change was also associated with a reduction in H3K27ac at multiple sites. These findings suggest that cohesin may play a role in the transcriptional regulation of 6q25.1 genes, potentially mediated through cohesin-dependent interactions with the super-enhancer in ER+ breast cancer.
The second aim was to determine the mechanism through which a specific genetic risk variant within the 6q25.1 locus influences breast cancer susceptibility. The rs77275268 variant is in strong linkage disequilibrium (r² > 0.9) with a leading candidate causal variant, rs9397435, in both European and East Asian populations, and coincides with a putative CTCF binding site. In an ER+ breast cancer model, rs77275268 was associated with a site-specific gain in CTCF and RAD21 binding, which coincided with a reduction in H3K27ac and downregulation of CCDC170. CTCF is a key architectural protein that demarcates chromatin domains and acts as an insulator by forming loop boundaries that restrict enhancer–promoter interactions. Therefore, increased CTCF binding at this site may reinforce insulation between the ER+ super-enhancer and adjacent gene promoters, thereby disrupting normal regulatory communication. Although the precise role of CCDC170 remains unclear, altered expression of this gene has been linked to prognosis in both ER+ and ER− breast cancer, suggesting that it may represent an important clinical target.
The third aim of this thesis was to investigate whether the rs77275268 variant influences transcript splicing or isoform usage at the 6q25.1 locus. This was motivated by growing evidence that chromatin architectural proteins such as CTCF and cohesin not only regulate gene expression but also modulate alternative splicing, potentially through changes in enhancer–promoter interactions and chromatin looping. Given that rs77275268 was associated with a site-specific gain in CTCF and RAD21 binding, it was hypothesised that the variant may also influence isoform diversity. To test this, long-read transcriptomic sequencing was performed using the Oxford Nanopore platform. This analysis identified two putative novel ESR1 isoforms that may be influenced by the rs77275268 allele, suggesting a potential role for this variant in modulating alternative splicing. While further validation is required to confirm the structure and relative abundance of these transcripts, the findings support a broader role for chromatin architecture in shaping transcript diversity.
In conclusion, this thesis underscores the pivotal role of chromatin architectural proteins, particularly cohesin, in the regulation of the 6q25.1 locus. It proposes a mechanism through which cohesin modulates the ER+ super-enhancer, influencing gene expression and potentially driving the coregulation of critical genes in ER+ breast cancer. These insights highlight the importance of further research into the genetic and epigenetic interactions at this locus, which could reveal novel therapeutic targets for managing breast cancer progression and improve the assessment of breast cancer risk.