Molecular Medicine Israel

Big-data approaches to protein structure prediction

A protein’s structure determines its function. Experimental protein structure determination is cumbersome and costly, which has driven the search for methods that can predict protein structure from sequence information (1). About half of the known proteins are amenable to comparative modeling; that is, an evolutionarily related protein of known structure can be used as a template for modeling the unknown structure. For the remaining proteins, no satisfactory solution had been found. On page 294 of this issue, Ovchinnikov et al. (2) used recently developed methodology for predicting intraprotein amino acid contacts, in combination with protein sequences from metagenomics of microbial DNA, to compute reliable models for 622 protein families and discovered more than 100 new folds along the way. The fast-paced growth of metagenomics data should enable reliable structure prediction of many more protein families.
