The structural basis of hyperpromiscuity in a core combinatorial network of type II toxin–antitoxin and related phage defense systems
2023 (English). In: Proceedings of the National Academy of Sciences of the United States of America, ISSN 0027-8424, E-ISSN 1091-6490, Vol. 120, no 33, article id e2305393120. Article in journal (Refereed), Published
Abstract [en]

Toxin-antitoxin (TA) systems are a large group of small genetic modules found in prokaryotes and their mobile genetic elements. Type II TAs are encoded as bicistronic (two-gene) operons that encode two proteins: a toxin and a neutralizing antitoxin. Using our tool NetFlax (standing for Network-FlaGs for toxins and antitoxins), we have performed a large-scale bioinformatic analysis of proteinaceous TAs, revealing interconnected clusters constituting a core network of TA-like gene pairs. To understand the structural basis of toxin neutralization by antitoxins, we have predicted the structures of 3,419 complexes with AlphaFold2. Together with mutagenesis and functional assays, our structural predictions provide insights into the neutralizing mechanism of the hyperpromiscuous Panacea antitoxin domain. In antitoxins composed of standalone Panacea, the domain mediates direct toxin neutralization, while in multidomain antitoxins the neutralization is mediated by other domains, such as PAD1, Phd-C, and ZFD. We hypothesize that Panacea acts as a sensor that regulates TA activation. We have experimentally validated 16 NetFlax TA systems and used domain annotations and metabolic labeling assays to predict their potential mechanisms of toxicity (such as membrane disruption, and inhibition of cell division or protein synthesis) as well as biological functions (such as antiphage defense). We have validated the antiphage activity of a RosmerTA system encoded by Gordonia phage Kita, and used fluorescence microscopy to confirm its predicted membrane-depolarizing activity. The interactive version of the NetFlax TA network that includes structural predictions can be accessed at http://netflax.webflags.se/.

Place, publisher, year, edition, pages
2023. Vol. 120, no 33, article id e2305393120
Keywords [en]
toxin, antitoxin, AlphaFold, phage, Panacea
National Category
Biochemistry Molecular Biology
Identifiers
URN: urn:nbn:se:su:diva-224333
DOI: 10.1073/pnas.2305393120
PubMedID: 37556498
Scopus ID: 2-s2.0-85167528527
OAI: oai:DiVA.org:su-224333
DiVA, id: diva2:1817587
Funder
Knut and Alice Wallenberg Foundation, 2020.0037
Swedish Research Council, 2019-01085
Swedish Research Council, 2022-01603
Swedish Research Council, 2021-01146
Swedish Research Council, 2021-03979
Carl Tryggers foundation, CTS19:24
The Kempe Foundations, SMK-2061.1
Swedish Cancer Society, 20 0872 Pj
The Crafoord Foundation, 20220562
Ragnar Söderbergs stiftelse, M23/14
Available from: 2023-12-06 Created: 2023-12-06 Last updated: 2025-02-20 Bibliographically approved
In thesis
1. Unlocking protein sequences: Advances in protein structure and ligand-binding site prediction
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The protein sequence determines how it will fold into its unique three-dimensional structure. Once folded, proteins perform their functions by interacting with other proteins or molecules called ligands within the cell. Experimental determination of protein structure and function is tedious. Computational approaches aim to accurately predict the properties of proteins to complement experimental efforts of understanding biochemical mechanisms within the cell. This thesis introduces computational techniques that predict the structure of protein complexes and identify protein residues involved in interactions with common biomolecules, such as metal ions and nucleic acids, based on sequence information. 

AlphaFold, a method that predicted protein structure using sequence information with almost experimental accuracy, was a critical breakthrough that shaped the field of protein structure prediction. Subsequently, approaches such as FoldDock adapted the AlphaFold pipeline for dimer complexes. Paper I applies the FoldDock protocol to understand toxin-antitoxin systems. These protein complexes are highly evolutionarily conserved, and high-confidence dimer predictions were generated. Paper II applies the FoldDock protocol to study protein-protein interactions in the human proteome. To verify the reliability of machine-learning-based computational methods, they must be tested on independent data different from the data used to train the method. Paper III involves generating and using a homology-reduced independent test set to benchmark the performance of protein complex structure predictors, including the recent AlphaFold release adapted for multi-chain proteins – AlphaFold-Multimer. A confidence score (pDockQ2) was proposed to estimate the quality of the interfaces within multimers. Paper I, Paper II and Paper III are associated with predicting and evaluating protein-protein interactions.

Representation learning involves finding effective representations of input data to maximise available information, making the data easier to understand and process for downstream prediction tasks. A recent advance in protein representation learning is protein language models (pLMs), where large language models are trained on a massive corpus of protein sequences. Highly contextualised and informative vector representations contained in the last hidden layer of the model have been used to predict numerous properties, such as ligand binding sites, subcellular localisation, and post-translational modifications, among others. Paper IV uses residue-level embeddings to predict whether a protein binds to one or more of the ten most common ions. It also predicts residue-level binding probabilities for multiple ions simultaneously. Paper V expands this approach beyond metals. It explores the impact of structure-informed features alongside sequence embeddings to predict whether a residue binds to nucleic acids, small molecules or metals. Paper IV and Paper V are associated with developing machine learning methods to predict and evaluate protein-ligand interactions.

In summary, the research conducted within this thesis offers valuable insights into three crucial levers to systematically harness the potential of machine learning for protein bioinformatics. These are (1) construction of homology-reduced non-redundant datasets, (2) finding optimal protein representations, and (3) rigorous evaluation and inference. 

Place, publisher, year, edition, pages
Stockholm: Department of Biochemistry and Biophysics, Stockholm University, 2024. p. 55
National Category
Bioinformatics (Computational Biology)
Research subject
Biochemistry towards Bioinformatics
Identifiers
urn:nbn:se:su:diva-224344 (URN)
978-91-8014-613-5 (ISBN)
978-91-8014-614-2 (ISBN)
Public defence
2024-01-26, Air & Fire, SciLifeLab, Tomtebodavägen 23A, Solna, 09:00 (English)
Available from: 2024-01-02 Created: 2023-12-07 Last updated: 2023-12-20 Bibliographically approved

Authority records

Shenoy, Aditi; Elofsson, Arne
