Duplicate detection in adverse drug reaction surveillance
Stockholm University, Faculty of Science, Department of Mathematics.
2007 (English). In: Data Mining and Knowledge Discovery, ISSN 1384-5810. Article in journal (Refereed), Published
URN: urn:nbn:se:su:diva-24198
ISI: 000245669600001
OAI: diva2:196999
Part of urn:nbn:se:su:diva-6764
Available from: 2007-04-16. Created: 2007-04-10. Last updated: 2011-03-08. Bibliographically approved
In thesis
1. Statistical methods for knowledge discovery in adverse drug reaction surveillance
2007 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Collections of individual case safety reports are the main resource for early discovery of unknown adverse reactions to drugs once they have been introduced to the general public. The data sets involved are complex and based on voluntary submission of reports, but contain pieces of very important information. The aim of this thesis is to propose computationally feasible statistical methods for large-scale knowledge discovery in these data sets. The main contributions are a duplicate detection method that can reliably identify pairs of unexpectedly similar reports and a new measure for highlighting suspected drug-drug interaction.

Specifically, we extend the hit-miss model for database record matching with a hit-miss mixture model for scoring numerical record fields and a new method to compensate for strong record field correlations. The extended hit-miss model is implemented for the WHO database and demonstrated to be useful in real-world duplicate detection, despite the noisy and incomplete information on individual case safety reports. The Information Component measure of disproportionality has been in routine use since 1998 to screen the WHO database for excessive adverse drug reaction reporting rates. Here, it is further refined. We introduce improved credibility intervals for rare events, post-stratification adjustment for suspected confounders, and an extension to higher-order associations that allows for simple but robust screening for potential risk factors. A new approach to identifying reporting patterns indicative of drug-drug interaction is also proposed. Finally, we describe how imprecision estimates specific to each prediction of a Bayes classifier may be obtained with the Bayesian bootstrap. Such case-based imprecision estimates allow for better prediction when different types of errors have different associated losses, with a possible application in combining quantitative and clinical filters to highlight drug-ADR pairs for clinical review.
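
The abstract names the Information Component measure of disproportionality but does not state its formula. As a rough illustration only, a measure of this kind can be sketched as the base-2 logarithm of the ratio between the observed number of reports for a drug-ADR pair and the number expected if drug and ADR were reported independently. The shrinkage constant of 0.5, the function name `information_component`, and all parameter names are assumptions of this sketch and are not taken from the thesis, whose exact formulation and credibility intervals may differ.

```python
import math


def information_component(n_pair, n_drug, n_adr, n_total, shrink=0.5):
    """Illustrative log2 observed-to-expected ratio for a drug-ADR pair.

    n_pair  : reports listing both the drug and the ADR (observed count)
    n_drug  : reports listing the drug
    n_adr   : reports listing the ADR
    n_total : all reports in the database
    shrink  : constant added to observed and expected counts (an assumption
              of this sketch) so that rare events do not give extreme values
    """
    if n_total <= 0 or n_drug <= 0 or n_adr <= 0 or n_pair < 0:
        raise ValueError("drug, ADR and total counts must be positive; pair count non-negative")
    # Count expected under independence of drug and ADR reporting.
    expected = n_drug * n_adr / n_total
    return math.log2((n_pair + shrink) / (expected + shrink))


# Hypothetical counts, not data from the WHO database.
ic = information_component(n_pair=10, n_drug=100, n_adr=100, n_total=10_000)
print(f"IC = {ic:.3f}")
```

A positive value indicates that the pair is reported more often than expected under independence, a value of zero that observed and expected counts agree, and a negative value that the pair is reported less often than expected.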

Place, publisher, year, edition, pages
Stockholm: Matematiska institutionen, 2007. 41 p.
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
urn:nbn:se:su:diva-6764 (URN)
91-7155-411-4 (ISBN)
Public defence
2007-05-07, Room 14, Building 5, Kräftriket, Stockholm, 13:00
Available from: 2007-04-16. Created: 2007-04-10. Last updated: 2011-06-20. Bibliographically approved

By author/editor
Norén, G. Niklas
By organisation
Department of Mathematics
