Simbios Talk by Mitul Saha, Stanford University, November 5, 2008

Title: Automatic Identification of Conserved Domains in cryoEM Map

A major development in structural biology over the last 15 years has been the success of cryoEM in the enormously challenging task of determining the structures of large macromolecular assemblies (LMAs, such as ribosomes, chaperonins, and viruses). CryoEM has emerged as a method distinctly suited to determining the structures of LMAs under near-native conditions and to inferring the conformational flexibility associated with their working mechanisms. However, unlike X-ray crystallography and NMR, cryoEM does not yield structures at atomic resolution. A significant portion of current cryoEM-based structure determination research aims to bridge this resolution gap between cryoEM and the conventional methods (X-ray crystallography, NMR) through computational means.

Toward this end, we present MOTIF-EM, a new, first-of-its-kind, fully automated computational tool for identifying domains or motifs in large macromolecular assemblies (such as chaperonins and viruses) that remain conserved upon conformational or evolutionary changes. MOTIF-EM takes cryoEM volumetric maps as input. The technique it uses to detect conserved substructures is inspired by a recent breakthrough in 2D object recognition. It works by constructing rotationally invariant, low-dimensional representations of local regions in the input cryoEM maps. Correspondences between these reduced representations are established across the input maps by comparing them with a simple metric. The correspondences are then clustered using hash tables and graph theory to retrieve conserved structural domains or motifs. MOTIF-EM has been used to extract domains in maps of large macromolecular assemblies (such as the viruses P22 and Epsilon 15, the 70S ribosome, and GroEL) that remain conserved upon conformational changes or are repeated within the same structure. It has also been used to build atomic-resolution models for certain maps. MOTIF-EM was also able to identify the conserved folds shared among the dsDNA bacteriophages HK97, Epsilon 15, and Phi 29, which in turn points to close evolutionary links among dsDNA phages.
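The pipeline described above (local rotation-invariant descriptors, matching with a simple metric, grouping of consistent correspondences) can be illustrated with a small Python sketch. This is not the MOTIF-EM implementation: the descriptor (mean density in spherical shells), the Euclidean nearest-neighbour matching, the hand-picked keypoints, the random toy map, and every function name below are assumptions chosen for illustration. The hash-table step is omitted in favour of brute-force comparison, and the graph step is reduced to finding the largest connected component of a distance-compatibility graph.

```python
import numpy as np


def radial_descriptor(vol, center, radius=4.0, n_shells=4):
    """Mean density in concentric shells around `center`.

    Shell averages are unchanged by rotations about `center`, so the
    descriptor is rotation-invariant (only approximately so on a voxel
    grid for arbitrary angles). Hypothetical descriptor, for illustration.
    """
    grid = np.indices(vol.shape)
    dist = np.sqrt(sum((g - c) ** 2 for g, c in zip(grid, center)))
    edges = np.linspace(0.0, radius, n_shells + 1)
    return np.array([vol[(dist >= lo) & (dist < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])


def match_descriptors(desc_a, desc_b, max_dist):
    """Pair each descriptor in A with its nearest neighbour in B (Euclidean)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j = int(np.argmin(dists))
        if dists[j] < max_dist:
            matches.append((i, j))
    return matches


def consistent_matches(pts_a, pts_b, matches, tol=0.5):
    """Largest connected group of matches with compatible pairwise distances.

    Two matches are linked when the distance between their keypoints is
    (nearly) the same in both maps, as it must be under a rigid motion.
    """
    n = len(matches)
    adj = {k: set() for k in range(n)}
    for m in range(n):
        for k in range(m + 1, n):
            (ia, ib), (ja, jb) = matches[m], matches[k]
            da = np.linalg.norm(pts_a[ia] - pts_a[ja])
            db = np.linalg.norm(pts_b[ib] - pts_b[jb])
            if abs(da - db) < tol:
                adj[m].add(k)
                adj[k].add(m)
    best, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:  # depth-first search over one component
            node = stack.pop()
            comp.append(node)
            for nb in adj[node] - seen:
                seen.add(nb)
                stack.append(nb)
        if len(comp) > len(best):
            best = comp
    return sorted(matches[k] for k in best)


# Toy data (assumption): a random "map" and a copy rotated by 90 degrees.
rng = np.random.default_rng(0)
map_a = rng.random((16, 16, 16))
map_b = np.rot90(map_a, k=1, axes=(1, 2))

# Hand-picked keypoints (assumption). The last point in A has no counterpart
# in B; the first and last points in B are distractors.
pts_a = np.array([(4, 4, 4), (4, 10, 6), (10, 5, 9), (11, 11, 11), (7, 7, 12)])
pts_b = np.array([(8, 8, 8), (4, 11, 4), (4, 9, 10), (10, 6, 5),
                  (11, 4, 11), (5, 12, 5)])

desc_a = np.array([radial_descriptor(map_a, p) for p in pts_a])
desc_b = np.array([radial_descriptor(map_b, p) for p in pts_b])

matches = match_descriptors(desc_a, desc_b, max_dist=1e-6)
conserved = consistent_matches(pts_a, pts_b, matches)
print(conserved)
```

In this toy setup the four keypoints in the first map that have a rotated counterpart in the second map are the ones expected to survive both the matching and the consistency step. Real maps would call for a much looser matching threshold, automatic keypoint selection, and a stricter grouping criterion than connected components (for example, cliques), none of which are specified in the abstract.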