Prediction of zinc-binding sites in proteins and efficient protein structure description and comparison
Stockholm University, Faculty of Science, Department of Physical, Inorganic and Structural Chemistry. (Sven Hovmöller)
2008 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

A large number of proteins require certain metals to stabilize their structures or to function properly. About one third of all proteins in the Protein Data Bank (PDB) contain metals and it is estimated that approximately the same proportion of all proteins are metalloproteins.

Zinc, the second most abundant transition metal found in eukaryotic organisms, plays key roles, mainly structural and catalytic, in many biological functions. Predicting whether a protein binds zinc, and even the exact location of its binding sites, is important when investigating the function of an experimentally uncharacterized protein.

Describing and comparing protein structures efficiently and accurately is essential for systematic annotation of the functional properties of proteins, whether for individual proteins or on a genome scale. Dozens of structure comparison methods have been developed in the past decades. In recent years, several research groups have worked on methods for fast comparison of protein structures that represent three-dimensional (3D) protein structures as one-dimensional (1D) geometrical strings. Each symbol in such a string denotes the clustered region of φ/ψ torsion angle space into which a residue of the polypeptide backbone falls. These 1D geometrical strings, called shape strings, are as compact as 1D secondary structures but carry more detailed structural information in loop regions, and are thus more suitable for fast structure database searching, classification of loop regions and evaluation of model structures.
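
The idea of encoding a backbone as a 1D string of region symbols can be illustrated with a minimal Python sketch. The three regions and their boundaries below (A, B, L) are hypothetical and chosen only for illustration; the abstract does not state which φ/ψ clusters or symbols the reviewed methods use. The function names shape_symbol, shape_string and identity are likewise assumptions.

```python
from typing import Iterable, Tuple


def shape_symbol(phi: float, psi: float) -> str:
    """Map one (phi, psi) pair, in degrees, to a region symbol.

    The partition is a hypothetical, coarse split of torsion angle space,
    not the clustering used by any published shape-string method.
    """
    if phi >= 0:
        return "L"  # positive phi: left-handed region
    if -120 <= psi < 50:
        return "A"  # negative phi, psi near helical values: alpha-like
    return "B"      # negative phi, remaining psi: extended, beta-like


def shape_string(torsions: Iterable[Tuple[float, float]]) -> str:
    """Encode a sequence of (phi, psi) pairs as a 1D string, one symbol per residue."""
    return "".join(shape_symbol(phi, psi) for phi, psi in torsions)


def identity(s1: str, s2: str) -> float:
    """Fraction of positions with the same symbol in two equal-length strings."""
    if len(s1) != len(s2):
        raise ValueError("strings must have equal length")
    if not s1:
        return 0.0
    return sum(a == b for a, b in zip(s1, s2)) / len(s1)
```

Because two structures reduce to two short strings, a comparison such as identity(shape_string(a), shape_string(b)) costs one pass over the residues, which is what makes this kind of representation attractive for database searching. Real methods would also need to align strings of different lengths, which this sketch does not attempt.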

In this thesis, a new method for predicting zinc-binding sites in proteins from amino acid sequences is described. The method predicts zinc-binding Cys, His, Asp and Glu (the four most common zinc-binding residues) with 75% precision (86% for Cys and His only) at 50% recall, according to a rigorous 5-fold cross-validation on a non-redundant set of 2727 unique PDB chains, of which 235 bind zinc. Compared to a previously published method, it predicts zinc-binding Cys and His with about 10% higher precision at different recall levels. In addition, different methods for describing and comparing protein structures are reviewed. Some recently developed methods based on 1D geometrical representation of backbone structures are emphasized and analyzed in detail.
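
A figure such as "precision at 50% recall" can be read as follows: rank all candidate residues by predicted score, walk down the ranking until half of the true zinc-binding residues have been recovered, and report the fraction of residues seen so far that are true binders. The Python sketch below shows that calculation under stated assumptions. The function name precision_at_recall and the input format (one score and one 0/1 label per residue) are hypothetical, and this is not the evaluation code used in the thesis.

```python
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float],
    labels: Sequence[int],
    target_recall: float = 0.5,
) -> float:
    """Precision at the first rank cutoff whose recall reaches target_recall.

    scores: predicted score per residue (higher means more likely zinc-binding).
    labels: 1 for a true zinc-binding residue, 0 otherwise.
    Tied scores are not treated specially in this sketch.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have equal length")
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank residues from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for n_predicted, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_positives >= target_recall:
            return true_positives / n_predicted

    # Not reachable: the full ranking always has recall 1.0.
    raise RuntimeError("target recall was never reached")
```

In a 5-fold cross-validation, the scores passed to such a function would come from models that did not see the corresponding chains during training, so that each residue is scored exactly once as held-out data.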

Place, publisher, year, edition, pages
2008. 42 p.
Keyword [en]
zinc-binding, shape strings, protein structures, secondary structures, machine learning
National Category
Biochemistry and Molecular Biology; Bioinformatics and Systems Biology; Structural Biology
Research subject
Biochemistry; Structural Biology; Molecular Biology
URN: urn:nbn:se:su:diva-32783
OAI: diva2:281539
2008-04-18, Magnélisalen, kemiska övningslaboratoriet, Svante Arrhenius väg 12, Frescati, 10:00 (English)
protein structure prediction
Available from: 2010-01-20 Created: 2009-12-16 Last updated: 2010-01-20 Bibliographically approved

By author/editor
Shu, Nanjiang