Title: Protein structure and sequence spaces: how big are they?
Gene families and protein structure families are known to vary widely in size,
from orphans to densely populated sets of highly diverged homologs. The causes
of these irregularities in sequence space remain unclear. In the first part of
this seminar, I will show that
geometry and thermodynamics are key elements for measuring the ability of a
protein to accept mutations without losing its structure.
In the second part of the seminar, I will propose new descriptions of the
protein structure space and sequence space. In particular, I will show that
a protein structure can be represented by a one-dimensional string without
loss of information. I will also demonstrate the advantages of using a
vector representation of protein sequences.