Ensemble methods for protein structure prediction
Stockholm University, Faculty of Science, Department of Biochemistry and Biophysics.
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins play an essential role in virtually all of life's processes. Their function is tightly coupled to the three-dimensional structure they adopt.

Solving protein structures experimentally is a complicated, time- and resource-consuming endeavor. Given the rapid growth in the number of known protein sequences, it is very likely that only a small fraction of known proteins will ever have their structures solved experimentally. Recently, computational methods for protein structure prediction have become increasingly accurate and promise to bridge this gap.

In this work, we show how the rapidly growing amounts of available biological data can be used to improve the accuracy of protein structure prediction. We discuss the use of multiple sources of structural information to improve the quality of predicted models. Methods for assigning estimated quality scores to predicted models are discussed as well. In particular, we present a novel, successful approach to clustering-based quality assessment, which runs nearly 50 times faster than other methods of comparable accuracy, making it possible to tackle much larger problems.

Additionally, this thesis discusses the impact that the recent breakthroughs in sequencing, and the consequent rapid growth of sequence data, have on the prediction of residue-residue contacts. We propose a novel methodology that predicts such contacts with astonishing, previously unheard-of accuracy. These contacts can in turn be used to guide protein modeling, enabling the discovery of protein structures that have been unattainable by conventional prediction methods.

Finally, a considerable part of this dissertation discusses the community efforts in protein structure prediction, as embodied by CASP (Critical Assessment of protein Structure Prediction) experiments.

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2013. 60 p.
Keyword [en]
protein structure prediction, model quality assessment, contact prediction, homology modeling, ab-initio prediction, consensus prediction, structural bioinformatics, bioinformatics, protein structure
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
URN: urn:nbn:se:su:diva-89366
ISBN: 978-91-7447-698-9 (print)
OAI: oai:DiVA.org:su-89366
DiVA: diva2:617433
Public defence
2013-05-31, Magnéli Hall, Arrhenius Laboratory, Svante Arrhenius väg 16 B, Stockholm, 10:00 (English)
Opponent
Supervisors
Funder
EU, FP7, Seventh Framework Programme, 215524
Note

At the time of the doctoral defense, the following papers were unpublished and had a status as follows: Paper 3: Submitted. Paper 4: Submitted.

Available from: 2013-05-09 Created: 2013-04-23 Last updated: 2015-10-27 Bibliographically approved
List of papers
1. Assessment of global and local model quality in CASP8 using Pcons and ProQ
2009 (English). In: Proteins: Structure, Function, and Genetics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 77, no 9, p. 167-172. Article in journal (Refereed). Published
Abstract [en]

Model Quality Assessment Programs (MQAPs) are programs developed to rank protein models. These methods can be trained to predict the overall global quality of a model or which local regions of a model are likely to be incorrect. In CASP8, we participated with two predictors that predict both global and local quality using either consensus information, Pcons, or purely structural information, ProQ. Consistent with results in previous CASPs, the best performance in CASP8 was obtained using the Pcons method. Furthermore, the results show that the modification introduced into Pcons for CASP8 improved the predictions against GDT_TS, and a correlation coefficient above 0.9 is now achieved, whereas the correlation for ProQ is about 0.7. The correlation is better for the easier than for the harder targets, but it is not below 0.5 for a single target and is below 0.7 for only three targets. The correlation coefficient for the best local quality MQAP is 0.68, showing that there is still clear room for improvement in this area. We also find that Pcons is still not always able to identify the best model. However, we show that, using a linear combination of Pcons and ProQ, it is possible to select models that are better than the models from the best single server. In particular, the average quality over the hard targets increases by about 6% compared with using Pcons alone.
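
As an illustration of the two ideas in this abstract (correlating predicted quality with a reference score such as GDT_TS, and selecting a model by a linear combination of two predictors), here is a minimal sketch. The function names and the weight value are assumptions made for the example; the abstract does not state the coefficients actually used for Pcons and ProQ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def combine(consensus_scores, structural_scores, weight=0.8):
    """Linear combination of two per-model scores.

    `weight` is a hypothetical value chosen for illustration only.
    """
    return [
        weight * c + (1.0 - weight) * s
        for c, s in zip(consensus_scores, structural_scores)
    ]


def select_best(scores):
    """Index of the model with the highest combined score."""
    return max(range(len(scores)), key=scores.__getitem__)
```

In practice the weight would be fitted on targets with known structures, and the correlation would be computed per target between predicted scores and the measured GDT_TS of each model.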

Keyword
quality assessment, MQAP, consensus
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
urn:nbn:se:su:diva-34572 (URN)
10.1002/prot.22476 (DOI)
000272244700019 ()
19544566 (PubMedID)
Available from: 2010-01-11 Created: 2010-01-11 Last updated: 2017-12-12 Bibliographically approved
2. Improved predictions by Pcons.net using multiple templates
2011 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no 3, p. 426-427. Article in journal (Refereed). Published
Abstract [en]

Multiple templates can often be used to build more accurate homology models than models built from a single template. Here we introduce PconsM, an automated protocol that uses multiple templates to build protein models. PconsM has been among the top-performing methods in the recent CASP experiments and consistently performs better than the single-template models used in Pcons.net. In particular, for the easier targets with many alternative templates with a high degree of sequence identity, quality is readily improved by a few percent over the highest-ranked model built on a single template. PconsM is available as an additional pipeline within the Pcons.net protein structure prediction server.

National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
urn:nbn:se:su:diva-67495 (URN)
10.1093/bioinformatics/btq664 (DOI)
000286991300021 ()
Note

AuthorCount: 4

Available from: 2011-12-29 Created: 2011-12-28 Last updated: 2017-12-08 Bibliographically approved
3. PconsC: combination of direct information methods and alignments improves contact prediction
2013 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 14, p. 1815-1816. Article in journal (Refereed). Published
Abstract [en]

Recently, several new contact prediction methods have been published. They (i) use large sets of multiple aligned sequences and (ii) assume that correlations between columns in these alignments can be the result of indirect interactions. These methods are clearly superior to earlier methods when it comes to predicting contacts in proteins. Here, we demonstrate that combining predictions from two prediction methods, PSICOV and plmDCA, and two alignment methods, HHblits and jackhmmer, at four different e-value cut-offs provides a relative improvement of 20% in comparison with the best single method, exceeding 70% correct predictions for one contact prediction per residue.
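
The figure quoted above ("correct predictions for one contact prediction per residue") can be read as the precision of the top L ranked contacts, where L is the sequence length. The sketch below shows that evaluation only; it does not reproduce how PconsC combines its inputs. The function name, the dictionary layout, and the `per_residue` parameter are assumptions made for the example.

```python
def top_l_precision(predicted, true_contacts, length, per_residue=1.0):
    """Fraction of the highest-scoring predicted contacts that are correct.

    predicted:     dict mapping a residue pair (i, j) to a confidence score
    true_contacts: set of residue pairs observed in the known structure
    length:        sequence length L
    per_residue:   number of predictions evaluated per residue
                   (1.0 means the top L contacts)
    """
    ranked = sorted(predicted.items(), key=lambda item: item[1], reverse=True)
    top = ranked[: max(1, int(length * per_residue))]
    if not top:
        return 0.0
    hits = sum(1 for pair, _score in top if pair in true_contacts)
    return hits / len(top)
```

Residue pairs are assumed to be stored in a consistent order, for example always with i < j, so that a predicted pair and a true pair compare equal.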

National Category
Biological Sciences
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
urn:nbn:se:su:diva-92624 (URN)
10.1093/bioinformatics/btt259 (DOI)
000321747800017 ()
Note

AuthorCount: 3

Available from: 2013-08-19 Created: 2013-08-14 Last updated: 2017-12-06 Bibliographically approved
4. PconsD: Ultra rapid, accurate model quality assessment for protein structure prediction
2013 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 29, no 14, p. 1817-1818. Article in journal (Refereed). Published
Abstract [en]

Clustering methods are often needed for accurately assessing the quality of modeled protein structures. The recent blind evaluation of quality assessment methods in CASP10 showed that there is very little difference between many different methods as far as ranking models and selecting the best model are concerned. When comparing many models, the computational cost of the model comparison can become significant. Here, we present PconsD, a very fast, stream-computing method for distance-driven model quality assessment that runs on consumer hardware. PconsD is at least one order of magnitude faster than other methods of comparable accuracy.
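
To show why the cost of model comparison grows quickly, here is a minimal CPU sketch of clustering-style (consensus) quality assessment based on intra-model distances: each model is scored by its average similarity to all other models, which requires comparing every pair of models. This is not the PconsD implementation; the similarity function and all names are assumptions made for the example, and the stream-computing (OpenCL) aspect is not shown.

```python
from itertools import combinations
from math import dist


def distance_profile(coords):
    """All pairwise distances between the residues of one model."""
    return [dist(a, b) for a, b in combinations(coords, 2)]


def similarity(profile_a, profile_b):
    """Hypothetical similarity: 1 / (1 + mean absolute distance difference)."""
    mean_diff = sum(abs(a - b) for a, b in zip(profile_a, profile_b)) / len(profile_a)
    return 1.0 / (1.0 + mean_diff)


def consensus_scores(models):
    """Score each model by its mean similarity to every other model.

    models: list of models, each a list of (x, y, z) coordinates with the
            same number of residues. At least two models are required.
    The loop over model pairs is the quadratic step that dominates run time.
    """
    profiles = [distance_profile(model) for model in models]
    n = len(profiles)
    totals = [0.0] * n
    for i, j in combinations(range(n), 2):
        s = similarity(profiles[i], profiles[j])
        totals[i] += s
        totals[j] += s
    return [total / (n - 1) for total in totals]
```

Because distances within a model do not depend on its orientation, no structural superposition is needed before comparing two models, which is one reason distance-based comparison suits highly parallel hardware.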

Keyword
protein structure prediction, quality assessment, stream computing, GPGPU, MQAP, OpenCL
National Category
Bioinformatics and Systems Biology
Research subject
Biochemistry with Emphasis on Theoretical Chemistry
Identifiers
urn:nbn:se:su:diva-89364 (URN)
10.1093/bioinformatics/btt272 (DOI)
000321747800019 ()
Funder
EU, FP7, Seventh Framework Programme, 215524
Swedish Research Council, VR-NT 2009-5072
Swedish Research Council, VR-M 2010-3555
VINNOVA
EU, FP7, Seventh Framework Programme, 201924
Note

AuthorCount: 2

Available from: 2013-04-23 Created: 2013-04-23 Last updated: 2017-12-06 Bibliographically approved

By author/editor
Skwark, Marcin J.