Abstract

DNA microarrays make it possible to study the expression profile of the whole genome of an organism. The technology is expensive, so the number of samples tested is often limited (at most 5 replicates). Under those conditions, statistical tests perform poorly. Various methods have been developed to optimize differential expression analysis. We describe a set of these methods to give a comprehensive view of the various approaches and of their main advantages and limitations.

Our first objective is to guide the analysis of differential expression at the gene level by using biological or empirical information. To improve the performance of statistical tests, we propose to share information across genes that are grouped according to an appropriate criterion. The window t-test was developed following this strategy; it exploits the empirical relationship between variability and mean expression level. The window t-test depends only on the number of replicates, which determines the number of probesets used to compute the variance estimates. An evaluation of methods shows that the window t-test performs as well as or better than the best existing methods.

Many kinds of biological information can be used to define gene sets (metabolic pathway, chromosomal location…). Current methods for gene-set analysis of differential expression were developed to test several different hypotheses. We generalize gene-set analysis to answer the main question, Q0: "Do the individual expression values of the gene-set members differ between two conditions?" We developed FAERI to answer this question by considering three criteria: the correlation between genes, the expression level, and the direction of the response (under- or over-expression). FAERI is a modified ANOVA-2 procedure that starts with a two-step reduction of the expression data (Z-standardization, then directional reduction). ANOVA-2 is shown to be the best-performing method for uni-directional gene sets (all members are either activated or repressed).
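The variance-sharing idea behind the window t-test can be illustrated with a minimal sketch. It is not the implementation from the thesis: the abstract states only that variance estimates are computed from a number of probesets determined by the number of replicates. The function name `window_t_statistics`, the `window` parameter, and the choice to average pooled variances over probesets ranked by mean expression are all assumptions made for illustration.

```python
import numpy as np


def window_t_statistics(group_a, group_b, window=51):
    """t-like statistic with variance shared across neighbouring probesets.

    group_a, group_b: arrays of shape (n_probesets, n_replicates).
    window: hypothetical number of probesets, adjacent in mean-expression
        rank, whose variances are averaged. The abstract only says that this
        number is determined by the number of replicates.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    n_a, n_b = a.shape[1], b.shape[1]

    # Per-probeset pooled variance, as in the classical Student t-test.
    var_a = a.var(axis=1, ddof=1)
    var_b = b.var(axis=1, ddof=1)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

    # Rank probesets by overall mean expression, so that neighbours in the
    # ranking have similar expression levels.
    overall_mean = np.concatenate([a, b], axis=1).mean(axis=1)
    order = np.argsort(overall_mean)

    # Average the pooled variances in a sliding window centred on each
    # probeset; the window is truncated at both ends of the ranking.
    sorted_var = pooled[order]
    n = sorted_var.size
    half = window // 2
    cumsum = np.concatenate([[0.0], np.cumsum(sorted_var)])
    lo = np.clip(np.arange(n) - half, 0, n)
    hi = np.clip(np.arange(n) + half + 1, 0, n)
    window_var_sorted = (cumsum[hi] - cumsum[lo]) / (hi - lo)

    # Restore the original probeset order.
    window_var = np.empty(n)
    window_var[order] = window_var_sorted

    diff = a.mean(axis=1) - b.mean(axis=1)
    return diff / np.sqrt(window_var * (1.0 / n_a + 1.0 / n_b))
```

With `window=1` the sketch reduces to the ordinary two-sample Student t statistic. Larger windows trade probeset-specific variance estimates for more stable ones, which is the motivation given above for sharing information across genes when only a few replicates are available.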
FAERI proves to be the most appropriate method for all gene-set types tested.

We developed PEGASE to perform differential expression analysis at both the gene and the gene-set level. A consensus evaluation across several methods is included, so that users obtain reliable results even when choosing an optimal method is difficult. Several methods are implemented in PEGASE at both levels, and their performance can be evaluated on the basis of biological or empirical knowledge. PEGASE is also used as a back-end by PHOENIX, an online tool for microarray data analysis.

1. Berger F., De Hertogh B., Pierre M., Gaigneaux A. & Depiereux E. The "Window t-test": a simple and powerful approach to detect differentially expressed genes in microarray datasets. Cent. Eur. J. Biol., 2008, 3, 327-344.
2. Berger F., De Hertogh B., Bareke E., Pierre M., Gaigneaux A. & Depiereux E. PHOENIX: a web-interface for (re)analyses of microarray data. Cent. Eur. J. Biol., 2009, 4(4), 603-618.
|Date of Award||19 Nov 2009|
|Supervisor||Eric DEPIEREUX (Supervisor), MARCEL REMON (President), Mauro Delorenzi (Jury), J.-L. RUELLE (Jury) & Christophe Lambert (Jury)|
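Critical development of statistical methods for analysing the differential expression of genes and gene sets, measured on DNA microarrays (original title: Développement critique de méthodes d'analyse statistique de l'expression différentielle de gènes et de groupes de gènes, mesurées sur damiers à ADN).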
Berger, F. (Author). 19 Nov 2009
Student thesis: Doctor of Sciences