Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Regression Trees for Streaming Data with Local Performance Guarantees
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2014 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Online predictive modeling of streaming data is a key task for big data analytics. In this paper, a novel approach for efficient online learning of regression trees is proposed, which continuously updates, rather than retrains, the tree as more labeled data become available. A conformal predictor outputs prediction sets instead of point predictions; which for regression translates into prediction intervals. The key property of a conformal predictor is that it is always valid, i.e., the error rate, on novel data, is bounded by a preset significance level. Here, we suggest applying Mondrian conformal prediction on top of the resulting models, in order to obtain regression trees where not only the tree, but also each and every rule, corresponding to a path from the root node to a leaf, is valid. Using Mondrian conformal prediction, it becomes possible to analyze and explore the different rules separately, knowing that their accuracy, in the long run, will not be below the preset significance level. An empirical investigation, using 17 publicly available data sets, confirms that the resulting rules are independently valid, but also shows that the prediction intervals are smaller, on average, than when only the global model is required to be valid. All-in-all, the suggested method provides a data miner or a decision maker with highly informative predictive models of streaming data.

Place, publisher, year, edition, pages
IEEE Computer Society, 2014. 461-470 p.
Keyword [en]
Conformal prediction, Interpretable models, Regression trees, Streaming data
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-111094DOI: 10.1109/BigData.2014.7004263ISBN: 978-1-4799-5665-4 (print)OAI: oai:DiVA.org:su-111094DiVA: diva2:774231
Conference
2014 IEEE International Conference on Big Data, Washington DC, 27-30 October, 2014
Available from: 2014-12-22 Created: 2014-12-22 Last updated: 2015-02-11Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text
By organisation
Department of Computer and Systems Sciences
Information Systems

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 33 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf