Abstract
DNA methylation is a modification to DNA that can change gene function without altering the underlying sequence. While this epigenetic mark is not required in the most basal stem cells of the body, it accumulates during development in a cell- and lineage-specific manner (Hon et al., 2013; Lee et al., 2014). As a result, DNA methylation can be used as a biomarker to quantitate specific cell types of interest within heterogeneous tissues. Much research over the last decade has focused on DNA methylation changes that occur in cancer or on biomarkers that utilise cell-free circulating DNA. However, there has been little research into the differential methylation patterns between leukocytes and other tissues as a detection tool for inflammation in various contexts. Furthermore, the widespread use of DNA methylation biomarkers has been limited by the cost and throughput of methylation analysis. In this PhD project, I set out to address these gaps by, first, developing novel DNA methylation biomarkers that distinguish blood/immune-derived cells from other cell types and, second, integrating these pan-leukocyte biomarkers into a high-throughput, cost-effective bisulphite amplicon sequencing pipeline capable of processing thousands of samples simultaneously.
I identified candidate biomarker loci using publicly available methylome datasets of pure cell populations encompassing mesoderm-, ectoderm-, and endoderm-derived tissues. Methylation at these loci was measured with a pipeline consisting of open-source nucleic acid handling protocols (DNA extraction and clean-up), a dual-index, four-primer PCR that integrates sequencing adapters and indexes into the initial amplification process, and high-throughput multiplex sequencing with Illumina next-generation sequencing technology. Both the biomarker loci and the pipeline were validated by in vitro mixing of peripheral blood mononuclear cell DNA and intestinal organoid DNA at a defined range of ratios.
Several of the preliminary biomarker loci have previously been reported in the literature on inflammatory diseases. However, many authors attribute these observations to dynamic methylation changes in local tissues rather than to changes in overall cellular composition. These findings emphasise the importance of fully understanding the changes in cellular composition that occur during disease progression and of using adequate controls. Moreover, I used TCGA datasets to show a strong, positive linear relationship between infiltrating leukocytes and DNA methylation levels at the HOXA3 locus in six cancer types. I hypothesise that this relationship is due to leukocyte infiltration rather than being a feature of the diseased cells.
The ability of each pan-leukocyte biomarker to detect inflammation was tested on faecal samples with associated calprotectin scores (calprotectin is a marker of intestinal immune cell activity found in stool). I observed progressive increases in DNA methylation at the biomarker loci in stool with low, intermediate, and high levels of calprotectin. Furthermore, there was an overall linear relationship between DNA methylation at the biomarker loci and faecal calprotectin levels (R² = 0.48).
Each DNA methylation biomarker could also distinguish leukocyte fractions from epithelial cells in human breast milk. Leukocytes were tracked in human breast milk over a one-month period that included instances of mastitis. The total leukocyte fraction increased significantly during mastitis and decreased after recovery. Moreover, the DNA methylation pattern at one biomarker is conserved in Bos taurus and can be used to detect the leukocyte fraction in cow’s milk.
Finally, I optimised a low-cost, high-throughput bisulphite conversion pipeline for DNA methylation markers based on BOMB open-source protocols. As mentioned above, open-source protocols for purposes other than methylation analysis had already been implemented in the high-throughput pipeline; however, they had not yet been optimised for DNA methylation analysis. I showed that the BOMB protocol performs as well as more expensive commercial kits, suggesting that it could easily be substituted into the high-throughput pipeline or into any other form of methylation analysis.
By combining novel DNA methylation biomarkers with cost-effective, high-throughput techniques, I developed a low-cost assay for precise quantitation of leukocytes against a background of other cell types in bodily fluids such as saliva, breast milk, and stool. Collectively, my data provide an excellent proof of concept for using DNA methylation as a high-throughput, cost-effective screening tool for various pathologies.