Semigeometric Tiling of Event Sequences
2016 (English). In: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016, Proceedings, Part I / [ed] Paolo Frasconi, Niels Landwehr, Giuseppe Manco, Jilles Vreeken, Springer, 2016, 329-344 p. Conference paper (Refereed)
Event sequences are ubiquitous, e.g., in finance, medicine, and social media. Often the same underlying phenomenon, such as television advertisements during the Super Bowl, is reflected in independent event sequences, such as the activity of different Twitter users. It is hence of interest to find combinations of temporal segments and subsets of sequences where an event of interest, like a particular hashtag, has an increased occurrence probability. Such patterns allow exploration of the event sequences in terms of their evolving temporal dynamics, and provide more fine-grained insights into the data than, for example, straightforward clustering can reveal. We formulate the task of finding such patterns as a novel matrix tiling problem, and propose two algorithms for solving it. Our first algorithm is a greedy set-cover heuristic, while in the second approach we view the problem as time-series segmentation. We apply the algorithms to real and artificial datasets and obtain promising results. The software related to this paper is available at https://github.com/bwrc/semigeom-r.
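To make the tiling idea concrete, the following is a minimal sketch of a generic greedy set-cover heuristic on a binary matrix whose rows are sequences and whose columns are time steps. It is not the paper's algorithm or the code in the linked repository. The function names, the brute-force enumeration of candidate tiles, and the `min_density` threshold are all assumptions made for illustration. A tile here is a subset of rows combined with a contiguous column interval, which is one plausible reading of "subsets of sequences" and "temporal segments".

```python
from itertools import combinations


def ones_in_tile(matrix, rows, start, end):
    """Return the set of (row, col) cells equal to 1 inside the tile."""
    return {(r, c) for r in rows for c in range(start, end) if matrix[r][c] == 1}


def candidate_tiles(matrix):
    """Enumerate every (row subset, column interval) pair.

    Brute force, exponential in the number of rows: an illustrative
    assumption that is only feasible for tiny inputs.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    for size in range(1, n_rows + 1):
        for rows in combinations(range(n_rows), size):
            for start in range(n_cols):
                for end in range(start + 1, n_cols + 1):
                    yield rows, start, end


def greedy_tiling(matrix, k, min_density=1.0):
    """Pick up to k tiles, each time taking the dense tile that covers
    the most ones not covered by earlier picks (classic greedy cover).

    min_density is a hypothetical parameter: the fraction of cells in a
    tile that must be 1 for the tile to be eligible.
    """
    covered = set()
    chosen = []
    for _ in range(k):
        best_tile, best_gain, best_ones = None, 0, set()
        for rows, start, end in candidate_tiles(matrix):
            ones = ones_in_tile(matrix, rows, start, end)
            area = len(rows) * (end - start)
            if len(ones) / area < min_density:
                continue  # tile is too sparse to count as a pattern
            gain = len(ones - covered)
            if gain > best_gain:
                best_tile, best_gain, best_ones = (rows, start, end), gain, ones
        if best_tile is None:
            break  # no eligible tile covers anything new
        chosen.append(best_tile)
        covered |= best_ones
    return chosen
```

Each returned tile is a triple `(rows, start, end)` with a half-open column interval. On realistic data the exhaustive enumeration would have to be replaced by a restricted candidate set, which this sketch does not attempt.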
Place, publisher, year, edition, pages
Springer, 2016. 329-344 p.
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9851
Event sequences, Tiling, Covering, Binary matrices
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-135442
DOI: 10.1007/978-3-319-46128-1_21
ISBN: 978-3-319-46127-4 (print)
ISBN: 978-3-319-46128-1 (print)
OAI: oai:DiVA.org:su-135442
DiVA: diva2:1045226
European Conference, ECML PKDD 2016, Riva del Garda, Italy, September 19-23, 2016