Maximizing gain in high-throughput screening using conformal prediction
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Karolinska Institutet, Swetox, Sweden. ORCID iD: 0000-0003-3107-331X
Number of Authors: 4
2018 (English). In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 10, article id 7. Article in journal (Refereed). Published
Abstract [en]

Iterative screening has emerged as a promising approach to increase the efficiency of screening campaigns compared to traditional high-throughput approaches. By learning from a subset of the compound library, predictive models can infer which compounds to screen next, resulting in more efficient screening. One way to evaluate screening is to consider the cost of screening compared to the gain associated with finding an active compound. In this work, we introduce a conformal predictor coupled with a gain-cost function with the aim of maximizing gain in iterative screening. Using this setup, we were able to show that by evaluating the predictions on the training data, very accurate predictions can be made of which settings will produce the highest gain on the test data. We evaluate the approach on 12 bioactivity datasets from PubChem, training the models using 20% of the data. Depending on the settings of the gain-cost function, the settings generating the maximum gain were accurately identified in 8-10 out of the 12 datasets. Broadly, our approach can predict which strategy generates the highest gain based on the results of the gain-cost evaluation: to screen the compounds predicted to be active, to screen all the remaining data, or not to screen any additional compounds. When the algorithm indicates that the predicted active compounds should be screened, our approach also indicates which confidence level to apply in order to maximize gain. Hence, our approach facilitates decision-making and allocation of resources where they deliver the most value by indicating in advance the likely outcome of a screening campaign.
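The strategy comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the linear form of the gain-cost function (gain per hit minus cost per screened compound), the function names `gain` and `best_strategy`, and the simplification that a compound counts as "predicted active" when its conformal p-value for the active class exceeds the significance level are all assumptions made here for illustration.

```python
from typing import Sequence, Tuple


def gain(n_screened: int, n_hits: int,
         cost_per_compound: float, gain_per_hit: float) -> float:
    """Assumed linear gain-cost function: reward for hits minus screening cost."""
    return n_hits * gain_per_hit - n_screened * cost_per_compound


def best_strategy(p_active: Sequence[float],
                  is_active: Sequence[int],
                  significance_levels: Sequence[float],
                  cost_per_compound: float,
                  gain_per_hit: float) -> Tuple[str, float]:
    """Return the (strategy label, gain) pair with the highest gain.

    p_active  : hypothetical conformal p-values for the active class
    is_active : known labels (1 = active, 0 = inactive) for the same compounds
    """
    n_total = len(is_active)
    # Two baseline strategies: screen nothing, or screen everything.
    candidates = {
        "screen none": 0.0,
        "screen all": gain(n_total, sum(is_active),
                           cost_per_compound, gain_per_hit),
    }
    # Third strategy: screen only predicted actives, at each significance level.
    for eps in significance_levels:
        selected = [a for p, a in zip(p_active, is_active) if p > eps]
        candidates[f"screen predicted actives (significance {eps})"] = gain(
            len(selected), sum(selected), cost_per_compound, gain_per_hit)
    return max(candidates.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Toy, made-up values purely to exercise the function.
    label, value = best_strategy(
        p_active=[0.9, 0.8, 0.3, 0.1, 0.05],
        is_active=[1, 1, 0, 0, 0],
        significance_levels=[0.2, 0.5],
        cost_per_compound=1.0,
        gain_per_hit=5.0,
    )
    print(label, value)
```

In the spirit of the abstract, such an evaluation would be run on the training data to pick a strategy and confidence level, which is then applied to the remaining compounds; the toy values above are invented and carry no experimental meaning.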

Place, publisher, year, edition, pages
2018. Vol. 10, article id 7
Keywords [en]
Conformal prediction, HTS, Gain-cost function, PubChem datasets
National Category
Chemical Sciences; Computer and Information Sciences
Identifiers
URN: urn:nbn:se:su:diva-154852
DOI: 10.1186/s13321-018-0260-4
ISI: 000425976800001
PubMedID: 29468427
OAI: oai:DiVA.org:su-154852
DiVA, id: diva2:1195797
Available from: 2018-04-06 Created: 2018-04-06 Last updated: 2022-05-10 Bibliographically approved

Authority records

Svensson, Fredrik; Norinder, Ulf
