by Laurent Jacob, Department of Statistics at UC Berkeley
Measuring gene expression to study a biological phenomenon or to build prognosis tools is now common practice. When analyzing this type of data, one is very often interested in detecting pre-defined sets of genes that are known to work together and that are significantly differentially expressed between two particular conditions. Multivariate statistics make it possible to test for differential expression directly at the gene set level, which makes them more interpretable than the widely used gene set enrichment approach. However, they are known to lose power quickly as the dimension increases. At the same time, an increasing number of regulation networks are becoming available, specifying, for example, which genes activate or inhibit the expression of other genes. We intend to use these networks to build spaces of lower dimension that still retain most of the expression shift of gene sets. This makes multivariate testing tractable and provably more powerful under the assumption of a (partly) coherent expression shift.
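To make the idea of a network-derived, lower-dimensional test space concrete, here is a minimal sketch. The abstract does not say how the space is built or which test is applied, so everything specific below is an assumption made for illustration: the space is spanned by the first `k` eigenvectors of the (unsigned) graph Laplacian, "coherent" is read as "smooth over the network", the test statistic is the classical two-sample Hotelling T², and the function names `laplacian_basis`, `hotelling_t2`, and `graph_reduced_t2` are hypothetical. Activation and inhibition signs on the edges are ignored.

```python
import numpy as np


def laplacian_basis(adjacency, k):
    """Return the k Laplacian eigenvectors with the smallest eigenvalues.

    Assumption: these "low-frequency" directions stand in for the
    lower-dimensional space mentioned in the text.
    """
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, :k]


def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 statistic with a pooled covariance estimate."""
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    pooled = ((n1 - 1) * cov_x + (n2 - 1) * cov_y) / (n1 + n2 - 2)
    return (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(pooled, diff)


def graph_reduced_t2(x, y, adjacency, k):
    """Project both samples onto the k-dimensional graph basis, then test."""
    basis = laplacian_basis(adjacency, k)
    return hotelling_t2(x @ basis, y @ basis)


# Toy data (entirely synthetic): 6 genes connected as a path graph,
# 20 samples per condition, and a shift that is identical across genes.
n_genes = 6
adjacency = np.zeros((n_genes, n_genes))
for i in range(n_genes - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(20, n_genes))
y = rng.normal(size=(20, n_genes)) + 0.8

print("T^2 in the full space:   ", hotelling_t2(x, y))
print("T^2 in the reduced space:", graph_reduced_t2(x, y, adjacency, k=2))
```

The two statistics are not directly comparable as raw numbers, since each has its own null distribution that depends on the dimension. The sketch only shows the mechanics of projecting onto a network-derived basis before applying a multivariate test.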