Abstract
CpG dinucleotides are known to play a crucial role in regulatory domains, affecting gene expression in their natural context. Here, we demonstrate that intragenic CpG frequency and distribution impact transgene and genomic gene expression levels in mammalian cells. As shown for the Macrophage Inflammatory Protein 1α, de novo RNA synthesis correlates with the number of CpG dinucleotides, whereas RNA splicing, stability, nuclear export and translation are not affected by the sequence modification. Differences in chromatin accessibility in vivo and altered nucleosome positioning in vitro suggest that increased CpG levels destabilize the chromatin structure. Moreover, enriched CpG levels correlate with increased RNA polymerase II elongation rates in vivo. Interestingly, elevated CpG levels, particularly at the 5′ end of the gene, promote efficient transcription. We show that this is a genome-wide feature of highly expressed genes by identifying a domain of ∼700 bp with high CpG content downstream of the transcription start site that correlates with high levels of transcription. We suggest that these 5′ CpG domains are required to distort the chromatin structure and to increase gene activity.