Learning Gene Interactions and Networks from Perturbation Screens and Expression Data
We investigate a variety of methods to first discover and then understand genetic interactions. Beginning with pairwise interactions, we propose a method for inferring pairwise gene interactions en masse from short interfering RNA (siRNA) screens. We use the siRNA off-target effects to form a matrix of knocked-down genes, and model the observed fitness as a linear combination of individual and pairwise effects in this matrix. These effects can then be inferred using a variety of statistical learning methods. We evaluate two such methods for this task, xyz and glinternet. Both find interactions in small simulated data sets, but neither scales to genome-scale data: in our larger simulations, each suffers in either accuracy or running time. We overcome these limitations by developing our own lasso-based regression method, which takes into account the binary nature of our perturbation screens. Using a compressed sparse representation of the pairwise interaction matrix, and parallelising updates, we are able to run this method on exome-scale data. Generalising from pairwise interactions, we then consider network models, in which pairwise gene interactions form the edges of a graph. Such networks are often understood in terms of functional modules: groups of genes that act together to perform a task. We develop a method that combines pairwise interaction and gene expression data to effectively find functional modules in simulated data.
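The linear model described in the abstract can be illustrated with a small, self-contained sketch. It is not the thesis implementation, nor xyz or glinternet: it is a plain cyclic coordinate-descent lasso, run serially on a toy simulated screen. All names, sizes, effect values, and the penalty `lam` are assumptions chosen for illustration. Because every feature is binary, each column is stored only as the list of siRNA rows where it equals 1, which loosely mirrors the idea of a sparse representation of the pairwise interaction matrix.

```python
import random
from itertools import combinations


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def simulate_screen(n_sirnas, n_genes, knockdown_prob, main, pairs, noise_sd, rng):
    """Simulate a binary siRNA-by-gene knockdown matrix and noisy fitness values.

    `main` maps a gene index to its individual effect; `pairs` maps a gene pair
    to its interaction effect (both hypothetical, for illustration only).
    """
    screen = [[1 if rng.random() < knockdown_prob else 0 for _ in range(n_genes)]
              for _ in range(n_sirnas)]
    fitness = []
    for row in screen:
        value = sum(effect for g, effect in main.items() if row[g])
        value += sum(effect for (a, b), effect in pairs.items() if row[a] and row[b])
        fitness.append(value + rng.gauss(0.0, noise_sd))
    return screen, fitness


def build_features(screen, n_genes):
    """Return feature names and, per feature, the rows where it equals 1.

    A single-gene feature is 1 when that gene is knocked down; a pairwise
    feature is 1 only when both genes are knocked down by the same siRNA.
    """
    names = [(g,) for g in range(n_genes)] + list(combinations(range(n_genes), 2))
    columns = [[i for i, row in enumerate(screen) if all(row[g] for g in name)]
               for name in names]
    return names, columns


def lasso_binary(columns, fitness, lam, sweeps=100):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n = len(fitness)
    beta = [0.0] * len(columns)
    residual = list(fitness)  # y - Xb, with b initially zero
    for _ in range(sweeps):
        for j, rows in enumerate(columns):
            if not rows:
                continue  # feature never observed; leave its effect at zero
            z = len(rows) / n  # (1/n) * sum of x_ij^2 for a binary column
            rho = sum(residual[i] for i in rows) / n + z * beta[j]
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                for i in rows:
                    residual[i] -= delta
                beta[j] = new
    return beta


rng = random.Random(0)
screen, fitness = simulate_screen(
    n_sirnas=300, n_genes=8, knockdown_prob=0.3,
    main={0: -1.0}, pairs={(2, 5): -2.0}, noise_sd=0.1, rng=rng)
names, columns = build_features(screen, n_genes=8)
beta = lasso_binary(columns, fitness, lam=0.01)
effects = dict(zip(names, beta))
top = max(effects, key=lambda name: abs(effects[name]))
print("largest estimated effect:", top, round(effects[top], 2))
```

In this toy setting the number of pairwise columns grows quadratically with the number of genes, which is why a dense, serial approach of this kind would not be expected to reach genome-scale data.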
Advisor: Gavryushkin, Alex; Huang, Zhiyi; Vignes, Matthieu
Degree Name: Master of Science
Degree Discipline: Computer Science
Publisher: University of Otago
Keywords: epistasis; gene interactions; regression; clustering; gene expression; parallel; siRNA; perturbation screen; protein-protein interaction; simulation; co-expression; functional module
Research Type: Thesis