Solving the correspondence problem in analytical chemistry: Automated methods for alignment and quantification of multiple signals
Stockholm University, Faculty of Science, Department of Analytical Chemistry.
2012 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

When applying statistical data analysis techniques to analytical chemical data, all variables must have correspondence over the samples dimension in order for the analysis to generate meaningful results. Peak shifts in NMR and chromatography destroy that correspondence and create data matrices that have to be aligned before analysis. In this thesis, new methods are introduced that allow for automated transformation from unaligned raw data to aligned data matrices where each column corresponds to a unique signal. These methods are based on linear multivariate models for the peak shifts and on the Hough transform for establishing the parameters of these linear models. Methods for quantification under difficult conditions, such as crowded spectral regions, noisy data and unknown peak identities, are also introduced. These methods include automated peak selection and a robust method for background subtraction. This thesis focuses on the processing of the data; the experimental work is secondary and is not discussed in great detail.

All the developed methods are put together in a full procedure that takes us from raw data to a table of concentrations in a matter of minutes.

The procedure is applied to 1H-NMR data from biological samples, which is one of the toughest alignment tasks available in the field of analytical chemistry. It is shown that the procedure performs consistently on the same level as much more labor-intensive manual techniques such as Chenomx NMRSuite spectral profiling.

Several kinds of datasets are evaluated using the procedure. Most of the data are from the field of metabolomics, where the goal is to establish concentrations of as many small molecules as possible in biological samples.

Place, publisher, year, edition, pages
Stockholm: Department of Analytical Chemistry, Stockholm University, 2012. 74 p.
National Category
Chemical Sciences
Research subject
Analytical Chemistry
Identifiers
URN: urn:nbn:se:su:diva-74556
ISBN: 978-91-7447-485-5 (print)
OAI: oai:DiVA.org:su-74556
DiVA: diva2:516732
Public defence
2012-05-25, Magnélisalen, Kemiska övningslaboratoriet, Svante Arrhenius väg 16 B, Stockholm, 13:00 (English)
Opponent
Supervisors
Available from: 2012-05-03 Created: 2012-03-16 Last updated: 2012-05-02 Bibliographically approved
List of papers
1. Vibrational overtone combination spectroscopy (VOCSY)—a new way of using IR and NIR data
2007 (English) In: Analytical and Bioanalytical Chemistry, ISSN 1618-2642, E-ISSN 1618-2650, Vol. 388, no 1, 179-188 p. Article in journal (Refereed) Published
Abstract [en]

This work explores a novel method for rearranging 1st order (one-way) infra-red (IR) and/or near infra-red (NIR) ordinary spectra into a representation suitable for multi-way modelling and analysis. The method is based on the fact that the fundamental IR absorption and the first, second, and consecutive overtones of NIR absorptions represent identical chemical information. It is therefore possible to rearrange these overtone regions of the vectors comprising an IR and NIR spectrum into a matrix where the fundamental, 1st, 2nd, and consecutive overtones of the spectrum are arranged as either rows or columns in a matrix, resulting in a true three-way tensor of data for several samples. This tensorization facilitates explorative analysis and modelling with multi-way methods, for example parallel factor analysis (PARAFAC), N-way partial least squares (N-PLS), and Tucker models. The vibrational overtone combination spectroscopy (VOCSY) arrangement is shown to benefit from the “order advantage”, producing more robust, stable, and interpretable models than, for example, the traditional PLS modelling method. The proposed method also opens the field of NIR for true peak decomposition—a feature unique to the method because the latent factors acquired using PARAFAC can represent pure spectral components whereas latent factors in principal component analysis (PCA) and PLS usually do not.
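The rearrangement described in this abstract can be illustrated with a minimal Python sketch that stacks the fundamental band and its overtone bands of one spectrum as rows of a matrix. It assumes, purely for simplicity, that overtone k of a fundamental at wavenumber v lies at exactly (k + 1) * v, ignoring anharmonicity. The names `interp` and `vocsy_matrix` and the use of linear interpolation are illustrative choices, not taken from the paper.

```python
from bisect import bisect_left
from typing import List, Sequence


def interp(x: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Linearly interpolate the curve (xs, ys) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # first index with xs[i] >= x
    w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] * (1.0 - w) + ys[i] * w


def vocsy_matrix(
    wavenumbers: Sequence[float],
    absorbance: Sequence[float],
    fundamental: Sequence[float],
    n_orders: int = 3,
) -> List[List[float]]:
    """Return an (n_orders x len(fundamental)) matrix for one spectrum.

    Row 0 is the fundamental band, row 1 the first overtone, and so on.
    ASSUMPTION: overtone k is read at (k + 1) * v (no anharmonicity).
    """
    return [
        [interp((k + 1) * v, wavenumbers, absorbance) for v in fundamental]
        for k in range(n_orders)
    ]
```

Collecting one such matrix per sample gives a samples x orders x wavenumbers array, which is the kind of three-way structure that multi-way methods such as PARAFAC operate on.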

Keyword
Near-infrared, Infrared, Calibration, PARAFAC, Multi-way, Second-order advantage
National Category
Analytical Chemistry
Research subject
Analytical Chemistry
Identifiers
urn:nbn:se:su:diva-10782 (URN) 10.1007/s00216-007-1180-8 (DOI) 000245292200022 ()
Available from: 2008-01-07 Created: 2008-01-07 Last updated: 2017-12-13 Bibliographically approved
2. Proof of principle of a generalized fuzzy Hough transform approach to peak alignment of one-dimensional 1H NMR data
2007 (English) In: Analytical and Bioanalytical Chemistry, ISSN 1618-2642, E-ISSN 1618-2650, Vol. 389, no 3, 875-885 p. Article in journal (Refereed) Published
Abstract [en]

In metabolic profiling, multivariate data analysis techniques are used to interpret one-dimensional (1D) 1H NMR data. Multivariate data analysis techniques require that peaks are characterised by the same variables in every spectrum. This location constraint is essential for correct comparison of the intensities of several NMR spectra. However, variations in physicochemical factors can cause the locations of the peaks to shift. The location prerequisite may thus not be met, and so, to solve this problem, alignment methods have been developed. However, current state-of-the-art algorithms for data alignment cannot resolve the inherent problems encountered when analysing NMR data of biological origin, because they are unable to align peaks when the spatial order of the peaks changes—a commonly occurring phenomenon. In this paper a new algorithm is proposed, based on the Hough transform operating on an image representation of the NMR dataset that is capable of correctly aligning peaks when existing methods fail. The proposed algorithm was compared with current state-of-the-art algorithms operating on a selected plasma dataset to demonstrate its potential. A urine dataset was also processed using the algorithm as a further demonstration. The method is capable of successfully aligning the plasma data but further development is needed to address more challenging applications, for example urine data.
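The idea of locating peak trajectories in an image representation of the dataset can be sketched with an ordinary (non-fuzzy) Hough transform. The Python sketch below is not the algorithm from the paper: it assumes that every detected peak is a (row, position) point, where row is the index of the spectrum in the image, and that a peak's position drifts linearly with row. The function name `hough_lines` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def hough_lines(
    points: Iterable[Tuple[float, float]],
    slopes: Sequence[float],
    bin_width: float = 1.0,
    min_votes: int = 3,
) -> List[Tuple[float, float, int]]:
    """Vote for lines of the form  position = intercept + slope * row.

    points: (row, position) pairs, one per detected peak.
    slopes: candidate slopes to test (the discretised parameter axis).
    Returns (slope, intercept, votes) tuples, most-voted first.
    """
    votes: Counter = Counter()
    for row, position in points:
        for slope in slopes:
            # Each peak votes for the intercept bin that would put it
            # on a line with this slope.
            intercept_bin = round((position - slope * row) / bin_width)
            votes[(slope, intercept_bin)] += 1
    found = [
        (slope, b * bin_width, n)
        for (slope, b), n in votes.items()
        if n >= min_votes
    ]
    return sorted(found, key=lambda t: -t[2])
```

Because votes are accumulated per candidate line rather than per neighbouring peak, two trajectories that cross each other are still recovered as separate lines, which is the case the abstract describes as a change in the spatial order of the peaks.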

Keyword
NMR, Peak detection, Hough transform, Alignment, Metabolic profiling
National Category
Analytical Chemistry
Research subject
Analytical Chemistry
Identifiers
urn:nbn:se:su:diva-10783 (URN) 10.1007/s00216-007-1475-9 (DOI) 000249645800024 ()
Available from: 2008-01-07 Created: 2008-01-07 Last updated: 2017-12-13 Bibliographically approved
3. A solution to the 1D NMR alignment problem using an extended generalized fuzzy Hough transform and mode support
2009 (English) In: Analytical and Bioanalytical Chemistry, ISSN 1618-2642, E-ISSN 1618-2650, Vol. 395, no 1, 213-223 p. Article in journal (Refereed) Published
Abstract [en]

This paper approaches the problem of intersample peak correspondence in the context of later applying statistical data analysis techniques to 1D 1H-nuclear magnetic resonance (NMR) data. Any data analysis methodology will fail to produce meaningful results if the analyzed data table is not synchronized, i.e., each analyzed variable frequency (Hz) does not originate from the same chemical source throughout the entire dataset. This is typically the case when dealing with NMR data from biological samples. In this paper, we present a new state of the art for solving this problem using the generalized fuzzy Hough transform (GFHT). This paper describes significant improvements since the method was introduced for NMR datasets of plasma in Csenki et al. (Anal Bioanal Chem 389:875-885, 2007); the method is now capable of synchronizing peaks from more complex datasets such as urine as well as plasma data. We present a novel way of globally modeling peak shifts using principal component analysis, a new algorithm for calculating the transform and an effective peak detection algorithm. The algorithm is applied to two real metabonomic 1H-NMR datasets and the properties of the method are compared to bucketing. We implicitly prove that GFHT establishes the objectively true correspondence. Desirable features of the GFHT are: (1) intersample peak correspondence even if peaks change order on the frequency axis and (2) the method is symmetric with respect to the samples.
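For context on the baseline mentioned in this abstract, bucketing can be sketched in a few lines of Python: intensities are summed inside fixed-width intervals of the frequency axis, so a peak that shifts within one interval still lands in the same variable. The function name `bucket`, the equal-width buckets and the axis values are assumptions made for illustration; the paper's own bucketing settings are not given in this record.

```python
from typing import List, Sequence


def bucket(axis: Sequence[float], intensity: Sequence[float], width: float) -> List[float]:
    """Sum intensities in consecutive buckets of `width` along the axis."""
    start = min(axis)
    n_buckets = int((max(axis) - start) / width) + 1
    buckets = [0.0] * n_buckets
    for x, y in zip(axis, intensity):
        # Every data point contributes to exactly one bucket.
        buckets[int((x - start) / width)] += y
    return buckets
```

The sketch also shows the limitation of the approach: a shift that carries a peak across a bucket boundary moves its intensity to a different variable, and peaks that share a bucket can no longer be told apart.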

Keyword
Metabolic profiling, NMR, Peak detection, Image processing, Hough transform, Synchronization, Alignment
National Category
Analytical Chemistry
Research subject
Analytical Chemistry
Identifiers
urn:nbn:se:su:diva-35113 (URN) 10.1007/s00216-009-2940-4 (DOI) 000268866800024 ()
Available from: 2010-01-14 Created: 2010-01-14 Last updated: 2017-12-12 Bibliographically approved
4. Time-resolved biomarker discovery in 1H-NMR data using generalized fuzzy Hough transform alignment and parallel factor analysis
2010 (English) In: Analytical and Bioanalytical Chemistry, ISSN 1618-2642, E-ISSN 1618-2650, Vol. 396, no 5, 1681-1689 p. Article in journal (Refereed) Published
Abstract [en]

This work addresses the subject of time-series analysis of comprehensive 1H-NMR data of biological origin. One of the problems with toxicological and efficacy studies is the confounding of correlation between the administered drug, its metabolites and the systemic changes in molecular dynamics, i.e., the flux of drug-related molecules correlates with the molecules of system regulation. This correlation poses a problem for biomarker mining since this confounding must be untangled in order to separate true biomarker molecules from dose-related molecules. One way of achieving this goal is to perform pharmacokinetic analysis. The difference in pharmacokinetic time profiles of different molecules can aid in the elucidation of the origin of the dynamics; this can be achieved even when the identity of the molecule is unknown. This mode of analysis is the basis for metabonomic studies of toxicology and efficacy. One major problem concerning the analysis of 1H-NMR data generated from metabonomic studies is peak positional variation and peak overlap. These phenomena induce variance in the data, obscuring the true information content, and are hence unwanted but hard to avoid. Here, we show that by using the generalized fuzzy Hough transform spectral alignment, variable selection, and parallel factor analysis, we can solve both the alignment and the confounding problem stated above. Using the outlined method, several different temporal concentration profiles can be resolved and the majority of the studied molecules and their respective fluxes can be attributed to these resolved kinetic profiles. The resolved time profiles thereby simplify finding true biomarkers and bio-patterns for early detection of biological conditions as well as providing more detailed information about the studied biological system.
The presented method represents a significant step forward in time-series analysis of biological 1H-NMR data as it provides almost full automation of the whole data analysis process and is able to analyze over 800 unique features per sample. The method is demonstrated using a 1H-NMR rat urine dataset from a toxicology study and is compared with a classical approach: COW alignment followed by bucketing.

Keyword
Urine, 1H-NMR, Alignment, Multivariate, Metabolic profiling, PARAFAC, Drug metabolism, Toxicology
National Category
Analytical Chemistry
Research subject
Analytical Chemistry
Identifiers
urn:nbn:se:su:diva-54042 (URN) 10.1007/s00216-009-3421-5 (DOI)
Available from: 2011-01-25 Created: 2011-01-25 Last updated: 2017-12-11 Bibliographically approved
5. Automated annotation and quantification of metabolites in 1H NMR data of biological origin
2012 (English) In: Analytical and Bioanalytical Chemistry, ISSN 1618-2642, E-ISSN 1618-2650, Vol. 403, no 2, 443-455 p. Article in journal (Refereed) Published
Abstract [en]

In 1H NMR metabolomic datasets, there are often over a thousand peaks per spectrum, many of which change position drastically between samples. Automatic alignment, annotation, and quantification of all the metabolites of interest in such datasets have not been feasible. In this work we propose a fully automated annotation and quantification procedure which requires annotation of metabolites only in a single spectrum. The reference database built from that single spectrum can be used for any number of 1H NMR datasets with a similar matrix. The procedure is based on the generalized fuzzy Hough transform (GFHT) for alignment and on principal component analysis (PCA) for peak selection and quantification. We show that we can establish quantities of 21 metabolites in several 1H NMR datasets and that the procedure is extendable to include any number of metabolites that can be identified in a single spectrum. The procedure speeds up the quantification of previously known metabolites and also returns a table containing the intensities and locations of all the peaks that were found and aligned but not assigned to a known metabolite. This enables both biopattern analysis of known metabolites and data mining for new potential biomarkers among the unknowns.
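One way PCA can serve for quantification once peaks are aligned is sketched below in Python. The sketch rests on assumptions that are not stated in this record: the aligned peaks of one metabolite are collected in a samples x peaks intensity matrix, all of those peaks scale with the metabolite's concentration, and an uncentered first principal component is used. The function name `first_component` is hypothetical, and the power iteration is only a dependency-free stand-in for a library PCA routine.

```python
from typing import List, Sequence, Tuple


def first_component(
    X: Sequence[Sequence[float]], n_iter: int = 100
) -> Tuple[List[float], List[float]]:
    """Uncentered rank-1 PCA by power iteration.

    X: samples x peaks matrix of non-negative intensities with at least
    one non-zero entry. Returns (scores, loadings) such that
    X[i][j] is approximated by scores[i] * loadings[j].
    """
    n_samples, n_peaks = len(X), len(X[0])
    loadings = [1.0] * n_peaks
    for _ in range(n_iter):
        # Project every sample onto the current loading vector ...
        scores = [sum(x * l for x, l in zip(row, loadings)) for row in X]
        # ... then update the loadings from the scores and renormalise.
        loadings = [
            sum(X[i][j] * scores[i] for i in range(n_samples))
            for j in range(n_peaks)
        ]
        norm = sum(l * l for l in loadings) ** 0.5
        loadings = [l / norm for l in loadings]
    scores = [sum(x * l for x, l in zip(row, loadings)) for row in X]
    return scores, loadings
```

Under these assumptions the scores are proportional to the metabolite's concentration in each sample, so they give relative rather than absolute quantities. A peak whose intensities do not follow the common pattern is reproduced poorly by the rank-1 model, which hints at how the same decomposition could also support peak selection.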

Keyword
1H NMR, Alignment, Multivariate, Metabolomics, Hough transform, Urine, Quantification, Spectral profiling
National Category
Analytical Chemistry
Research subject
Analytical Chemistry
Identifiers
urn:nbn:se:su:diva-74546 (URN) 10.1007/s00216-012-5789-x (DOI) 000302256800012 ()
Available from: 2012-03-16 Created: 2012-03-16 Last updated: 2017-12-07 Bibliographically approved

Open Access in DiVA

fulltext (1876 kB), 1462 downloads
File information
File name: FULLTEXT01.pdf
File size: 1876 kB
Checksum (SHA-512): 99b5495537df9a15a45eef7acfef88b13a9f44447d32f48e347478c4440ed2bccb6ac1893a8137a61dc222b1e7c6607a7cbd648835691ab8defb5eef543da162
Type: fulltext
Mimetype: application/pdf

Author
Alm, Erik