Abstract
Next-generation DNA sequencing and analysis of amplicons spanning the pharmacogene CYP2D6 suggested that the Nextera transposase, which is used to fragment the DNA and add sequencing priming sites, displayed a targeting bias. This bias manifested as dramatically lower sequencing coverage at sites in the amplicon that appeared likely to form G-quadruplex structures. Because secondary DNA structures such as G-quadruplexes are abundant in the human genome and are known to interact with many other proteins, we investigated these sites of low coverage further. Our investigation revealed that G-quadruplex structures form in vitro at these sites within the CYP2D6 pharmacogene, and that G-quadruplexes can interact with the hyperactive Tn5 transposase (EZ-Tn5) with high affinity. These findings indicate that secondary DNA structures such as G-quadruplexes may represent preferential transposon integration sites, and they provide additional evidence for a role of G-quadruplex structures in transposition or viral integration processes.
• Regions of low read depth were observed in sequencing data generated from Nextera libraries.
• We showed DNA in these regions could form G-quadruplexes.
• Sequences from these regions bind to the EZ-Tn5 transposase with high affinity.
• G-quadruplexes may represent target sites for Tn5 and other transposases.
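
As an illustration of how sites "likely to form G-quadruplex structures" might be flagged in an amplicon sequence, the sketch below scans both strands for a widely used putative-quadruplex motif: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This is not necessarily the prediction method used in this study. The function name `find_g4_motifs`, the loop-length limits, and the example sequence are assumptions made for illustration only.

```python
import re

# Assumed heuristic motif: G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# The lookahead wrapper lets overlapping candidate sites be reported.
G4_PLUS = re.compile(
    r"(?=(G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}))"
)
# A C-rich match on the given strand implies a G-rich motif on the opposite strand.
G4_MINUS = re.compile(
    r"(?=(C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}))"
)


def find_g4_motifs(seq):
    """Return sorted (start, strand, matched_sequence) tuples for putative motifs.

    `start` is the 0-based offset on the supplied strand; strand "-" means the
    G-rich motif lies on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G4_PLUS), ("-", G4_MINUS)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence (telomere-like repeat), not taken from CYP2D6.
    example = "ATATGGGTTAGGGTTAGGGTTAGGGATAT"
    for start, strand, motif in find_g4_motifs(example):
        print(start, strand, motif)
```

A sequence match of this kind only marks a candidate site; whether a quadruplex actually forms has to be established experimentally, as was done in vitro in this work.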