Searching for genes

University of Oxford statisticians and geneticists have developed a new strategy for surveying the human genome to identify the genes involved in human disease.

Finding the gene loci involved in particular human diseases (that is, the places along the human genome where those genes are located) has proved remarkably difficult.

Almost all current search methods look separately at each position in the human genome where a disease-related variant might lie (there are hundreds of thousands, or even millions, of such positions). Yet disease is often caused by the interaction of genes at different points on the genome, so looking at individual gene loci one at a time might not be very effective.

However, checking each possible pair of loci involves many more tests: if a study examines 300,000 variants, there are about 45,000,000,000 possible interacting pairs.
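The pair count quoted above follows from the number of unordered pairs that can be drawn from n variants, n(n - 1)/2. The short Python check below is only an illustration of that arithmetic; the function name `count_pairs` is made up for this example.

```python
import math


def count_pairs(n_variants: int) -> int:
    """Number of unordered pairs of distinct variants: n * (n - 1) / 2."""
    return math.comb(n_variants, 2)


n_variants = 300_000  # study size quoted in the article
n_pairs = count_pairs(n_variants)

# Print with thousands separators so the scale is easy to read.
print(f"{n_variants:,} variants give {n_pairs:,} possible pairs")
```

The exact figure is 44,999,850,000, which rounds to the 45 billion cited.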

This raises both practical and statistical problems. The statistical problem is that when such an enormous number of hypotheses is tested, the chance of at least one false positive result is very high. This multiple testing problem requires each individual test to meet a very stringent significance threshold, and it has led to a widespread view that searching for interacting genes is not feasible.
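To give a feel for how tight the per-test control has to be, the sketch below applies a Bonferroni correction, one common way of handling multiple testing. The article does not say which correction the researchers used, so the Bonferroni method, the overall 5% error rate and the assumption of independent tests are all illustrative choices, not the study's own settings.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def prob_any_false_positive(per_test_alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive, 1 - (1 - a)^n.

    Assumes the tests are independent (an idealisation) and uses
    log1p/expm1 to stay accurate when a is tiny and n is huge.
    """
    return -math.expm1(n_tests * math.log1p(-per_test_alpha))


alpha = 0.05                          # assumed overall error rate
n_single = 300_000                    # one test per variant
n_pairs = math.comb(n_single, 2)      # one test per pair of variants

single_threshold = bonferroni_threshold(alpha, n_single)
pair_threshold = bonferroni_threshold(alpha, n_pairs)

print(f"per-test threshold, single loci: {single_threshold:.3g}")
print(f"per-test threshold, all pairs:   {pair_threshold:.3g}")
print(f"uncorrected false-positive risk: {prob_any_false_positive(alpha, n_single):.3f}")
```

Under these assumptions, testing every pair demands a per-test threshold roughly 150,000 times smaller than testing single loci, which is why the pairwise search was widely thought to lack power.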

The study, by Oxford statisticians Dr Jonathan Marchini and Professor Peter Donnelly and geneticist Professor Lon Cardon, aimed to evaluate the power of genome-wide searches for pairs of genes acting in combination. The researchers used models to compare the efficacy of looking at each gene locus separately with that of looking at every possible combination of two loci. They found that the latter was both feasible and, contrary to prevailing wisdom, a more powerful way of identifying the disease genes in many settings.

In practice, the researchers recommend a middle ground: a two-stage strategy that aims to get the best of both approaches. Their paper, published in Nature Genetics, recommends first performing a locus-by-locus search, and then testing in pairs all loci that reach some low level of association with the disease.
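The two-stage idea can be sketched in a few lines of Python. This is a minimal illustration, not the method from the paper: the function `two_stage_search`, the locus names, the p-values and both thresholds are hypothetical, and the statistical tests themselves are stood in for by simple lookups.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def two_stage_search(
    loci: Sequence[str],
    single_pvalue: Callable[[str], float],
    pair_pvalue: Callable[[str, str], float],
    stage1_alpha: float,
    stage2_alpha: float,
) -> Tuple[List[str], List[Tuple[str, str, float]]]:
    """Stage 1 keeps loci with weak single-locus evidence; stage 2 tests them in pairs."""
    # Stage 1: lenient locus-by-locus filter.
    survivors = [locus for locus in loci if single_pvalue(locus) <= stage1_alpha]

    # Stage 2: test every pair among the survivors only.
    hits = []
    for a, b in combinations(survivors, 2):
        p = pair_pvalue(a, b)
        if p <= stage2_alpha:
            hits.append((a, b, p))
    return survivors, hits


# Toy inputs, invented purely to exercise the function.
single_p = {"locus_A": 0.04, "locus_B": 0.08, "locus_C": 0.60, "locus_D": 0.02}
pair_p = {
    ("locus_A", "locus_B"): 1e-9,
    ("locus_A", "locus_D"): 0.30,
    ("locus_B", "locus_D"): 0.20,
}

survivors, hits = two_stage_search(
    loci=list(single_p),
    single_pvalue=single_p.__getitem__,
    pair_pvalue=lambda a, b: pair_p[(a, b)],
    stage1_alpha=0.1,        # assumed lenient first-stage cut-off
    stage2_alpha=0.05 / 3,   # assumed correction for the 3 pairs tested
)
print(survivors)
print(hits)
```

The point of the design is that the second stage only pays the multiple-testing cost for pairs among the surviving loci, which is far fewer than all pairs across the genome.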