Bioinformatics-driven approaches to building new fly models of human disease. Stephanie E. Mohr1, Yanhui Hu1, Ian Flockhart1, Juliane Schneider2, Charles Roesel1,3, Lizabeth Perkins1, Norbert Perrimon1,4. 1) Dept Gen, Harvard Med Sch, Boston, MA; 2) Countway Medical Library, Harvard Med Sch, Boston, MA; 3) Grad Program in Bioinformatics, Northeastern University, Boston, MA; 4) Howard Hughes Medical Institute, Boston, MA.
Drosophila is used to model human diseases at cell, pathway, organ and organism levels, and to learn about the normal functions of disease-associated genes. Development of new fly disease models depends on 1) accurate associations between human disease terms, human genes and their fly orthologs, and 2) availability or production of relevant reagents. To improve the quality and ease of identifying fly orthologs of human disease genes, we developed the ortholog tool DIOPT (www.flyrnai.org/diopt) and DIOPT-DIST (www.flyrnai.org/diopt-dist), the latter incorporating disease information from Online Mendelian Inheritance in Man (OMIM) and genome-wide association studies (GWAS). Recent improvements include fuzzy search; inclusion of a tenth algorithm (OrthoDB) in the DIOPT voting system output; and a combined automated and curated approach to map terms between OMIM and Medical Subject Headings. We used DIOPT-DIST to identify genes represented in the Drosophila RNAi Screening Center and Transgenic RNAi Project (TRiP) reagent collections, and to nominate genes for production of new TRiP lines. A large number of our reagents could be put to immediate use to study disease gene orthologs. The diseases covered include ciliopathies, cohesinopathies, disorders related to lysosomes, peroxisomes or mitochondria, and enzyme deficiencies. In total, we identified 860 fly genes that are high-confidence orthologs (DIOPT score 8) matching 853 human genes and 1200 diseases. Many of these diseases are rare, poorly understood, and/or not previously modeled in the fly. Altogether, our team, whose expertise spans bioinformatics, library science, and molecular genetics, is using a variety of approaches to make disease-relevant software tools and reagents better and easier to access, with the ultimate goal of facilitating meaningful disease-related studies at the lab bench.
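The voting idea behind the DIOPT score can be illustrated with a minimal sketch. It assumes the score for a human-fly gene pair is simply the number of ortholog prediction algorithms that report that pair, and that high-confidence orthologs are selected with a score cutoff. The algorithm names, gene identifiers, and the function names `vote_scores` and `high_confidence` are hypothetical placeholders, not the DIOPT code or data.

```python
from collections import defaultdict

# Hypothetical toy input: each algorithm maps to the set of
# (human_gene, fly_gene) pairs it predicts to be orthologs.
predictions = {
    "algo_A": {("HUMAN1", "fly1"), ("HUMAN2", "fly2")},
    "algo_B": {("HUMAN1", "fly1")},
    "algo_C": {("HUMAN1", "fly1"), ("HUMAN2", "fly3")},
}


def vote_scores(predictions):
    """Count, for each gene pair, how many algorithms predict it."""
    scores = defaultdict(int)
    for pairs in predictions.values():
        for pair in pairs:
            scores[pair] += 1
    return dict(scores)


def high_confidence(scores, min_score):
    """Return the gene pairs whose vote count is at least min_score."""
    return sorted(pair for pair, score in scores.items() if score >= min_score)


scores = vote_scores(predictions)
# With only three toy algorithms, require agreement from all three.
hits = high_confidence(scores, min_score=3)
```

With ten algorithms contributing votes, the same filter would be applied with the cutoff used for the high-confidence set.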