A structural basis for sequence comparisons. An evaluation of scoring methodologies

J Mol Biol. 1993 Oct 20;233(4):716-38. doi: 10.1006/jmbi.1993.1548.

Abstract

A residue-exchange matrix has been derived that is suitable for comparison of amino acid sequences. The matrix is based on the tabulation of 207,795 amino acid replacements observed in 65 homologous sets of structurally aligned three-dimensional structures (235 proteins). Most of the data come from structural comparisons with between 15 and 40% sequence identity. As a result, a scoring matrix such as the one devised here should provide a sensitive basis for comparing amino acid sequences and for searching amino acid databases for homologous sequences. To assess the value of this matrix, we have made a comparative analysis with 12 other published scoring matrices that have been used for the alignment of protein amino acid sequences. We find that the matrix derived here is among the better performers in terms of alignment significance, detection of homologous sequences, and accuracy of alignments.
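
The abstract does not state the formula used to convert the tabulated replacements into scores. As a rough illustration only, the sketch below uses a generic log-odds calculation, a common way to turn aligned-pair counts into an exchange matrix; it is an assumption here, not the authors' documented procedure. The function name `log_odds_matrix`, the `scale` parameter, and the toy data are hypothetical.

```python
import math
from collections import Counter


def log_odds_matrix(aligned_pairs, scale=2.0):
    """Build a symmetric log-odds exchange matrix from aligned residue pairs.

    Generic sketch (assumption): score(a, b) = scale * log2(observed / expected).
    This is not taken from the paper, whose exact method the abstract omits.
    """
    pair_counts = Counter()
    for a, b in aligned_pairs:
        # Count each aligned pair in both directions so the matrix is symmetric.
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1

    total = sum(pair_counts.values())

    # Background frequency of each residue across all aligned positions.
    residue_counts = Counter()
    for (a, _), n in pair_counts.items():
        residue_counts[a] += n

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total
        expected = (residue_counts[a] / total) * (residue_counts[b] / total)
        scores[(a, b)] = scale * math.log2(observed / expected)
    return scores


# Toy data (hypothetical): residues paired at structurally equivalent positions.
toy_pairs = [("A", "A")] * 8 + [("G", "G")] * 8 + [("A", "G")] * 2
scores = log_odds_matrix(toy_pairs)
```

In this toy case, conserved pairs receive positive scores and the rarer A/G exchange receives a negative score. Pairs that never occur in the input are simply absent from the result; a real derivation would need pseudocounts or a similar treatment for unobserved exchanges.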

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Cytochrome c Group / chemistry
  • Endopeptidases / chemistry
  • Globins / chemistry
  • Humans
  • Information Systems
  • Molecular Sequence Data
  • Reproducibility of Results
  • Retroviridae / enzymology
  • Sequence Analysis / methods*
  • Sequence Homology, Amino Acid*

Substances

  • Cytochrome c Group
  • Globins
  • Endopeptidases