Size matters: choosing the most informative set of window lengths for mining patterns in event sequences
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
2015 (English). In: Data Mining and Knowledge Discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no 6, 1838-1864 p. Article in journal (Refereed), Published
Abstract [en]

In order to find patterns in data, it is often necessary to aggregate or summarise data at a higher level of granularity. Selecting the appropriate granularity is a challenging task and often no principled solutions exist. This problem is particularly relevant in analysis of data with sequential structure. We consider this problem for a specific type of data, namely event sequences. We introduce the problem of finding the best set of window lengths for analysis of event sequences for algorithms with real-valued output. We present suitable criteria for choosing one or multiple window lengths and show that these naturally translate into a computational optimisation problem. We show that the problem is NP-hard in general, but that it can be approximated efficiently and even analytically in certain cases. We give examples of tasks that demonstrate the applicability of the problem and present extensive experiments on both synthetic data and real data from several domains. We find that the method works well in practice, and that the optimal sets of window lengths themselves can provide new insight into the data.

Place, publisher, year, edition, pages
2015. Vol. 29, no 6, 1838-1864 p.
Keyword [en]
Event sequence, Pattern mining, Window length, Output-space clustering, Exploratory data analysis
National Category
Information Systems
Research subject
Computer and Systems Sciences
URN: urn:nbn:se:su:diva-111102
DOI: 10.1007/s10618-014-0397-3
ISI: 000361826200012
OAI: diva2:774239
Available from: 2014-12-22. Created: 2014-12-22. Last updated: 2015-10-26. Bibliographically approved

By author/editor
Papapetrou, Panagiotis