Simbios Talk by Patrice Koehl, University of California, Davis, April 18, 2006

Title: Protein structure and sequence spaces: how big are they?

Gene families and protein structure families are known to vary widely in size, from orphans to heavily populated sets of far-diverged homologs. The underlying causes of these irregularities in sequence space remain unclear. In the first part of this seminar, I will show that geometry and thermodynamics are key elements for measuring the ability of a protein to accept mutations without losing its structure. In the second part, I will propose new descriptions of protein structure space and sequence space. In particular, I will show that a protein structure can be represented by a one-dimensional string without loss of information. In parallel, I will demonstrate the advantages of using a vectorial representation of protein sequences.