Phylogenetic footprinting and comparative analysis of related cis-regulatory modules reveals structural constraints on enhancer evolution and function. Thomas Brody1, Alexander Kuzin1, Mukta Kundu1, Jermaine Ross1, Amar Yavatkar2, Ward F. Odenwald1. 1) Neural Cell-Fate Determinants Section; 2) Information Technology Program, NINDS/NIH, Bethesda, MD.

We have developed web-accessible alignment algorithms and a genome-wide database of Drosophila conserved sequence clusters to explore the structural and functional constraints on the regulatory genome. Analysis with the comparative genomics tool EvoPrinter reveals that 1) enhancers consist of clusters of conserved sequence blocks (CSBs); 2) the flexibility of non-conserved inter-cluster sequences can be used to define the functional limits of enhancers; 3) CSBs are often organized into tightly associated groupings, referred to as super-blocks, in which multiple CSBs are connected by spacers of invariant length; and 4) insertions between CSBs can be used to resolve semi-autonomous sub-modules. The alignment tool cis-Decoder reveals three aspects of enhancer structure: 1) conserved clusters contain overlapping repeat motifs, suggesting that cooperative interactions among multiple factors are required for enhancer activity; 2) most developmental enhancers contain multiple binding sites for one or more signature factors that functionally define their cis-regulatory behavior; and 3) coordinately regulated enhancers can often be identified by their shared conserved elements, present in the same proportionate balance. Functional tests reveal that 1) dynamic gene expression is carried out by multiple sub-pattern enhancers that drive expression in overlapping, non-identical subsets of cells; 2) many enhancers are multipurpose, functioning in embryos, larvae and/or adults (in evolutionary terms, it might be easier to incorporate novel functions into pre-existing enhancers than to create enhancers anew); and 3) most conserved sequence clusters function as cis-regulatory modules, suggesting that the multiplicity of enhancers, on the order of 70,000 in the entire genome, represents an added dimension to the regulatory complexity required for organism development. These studies highlight the advantages of using evolutionary conservation as a guide to the analysis of cis-regulatory sequences.
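
The idea that enhancers appear as clusters of conserved sequence blocks can be illustrated with a minimal sketch. This is not the EvoPrinter algorithm. It assumes the orthologous sequences are already aligned to equal length with `-` as the gap character, and the function names `conserved_blocks` and `cluster_blocks` and the thresholds `min_len` and `max_gap` are hypothetical choices made only for illustration.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # half-open column range (start, end) in the alignment


def conserved_blocks(aligned: List[str], min_len: int = 6) -> List[Block]:
    """Return runs of columns where every sequence has the same non-gap base.

    Assumption: `aligned` holds pre-aligned sequences of equal length.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    blocks: List[Block] = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    return blocks


def cluster_blocks(blocks: List[Block], max_gap: int = 10) -> List[List[Block]]:
    """Group consecutive blocks whose separation is at most `max_gap` columns."""
    clusters: List[List[Block]] = []
    for block in blocks:
        if clusters and block[0] - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append(block)
        else:
            clusters.append([block])
    return clusters


# Toy alignment of three hypothetical species (not real Drosophila sequence).
alignment = [
    "AAAACCCCGGGGTTTTAAAACCCC",
    "AAAATCCCGGGGAAAATTTTCCCC",
    "AAAACCCCGGGGTTTTAAAACCCC",
]
blocks = conserved_blocks(alignment, min_len=4)
clusters = cluster_blocks(blocks, max_gap=2)
```

In this toy alignment the first two blocks are separated by a single diverged column and fall into one cluster, while the third block lies beyond a longer non-conserved stretch and forms its own cluster. Real comparisons would need thresholds tuned to the species set and its evolutionary divergence.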