Z-Series: Mining and learning from complex sequential data
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
The amount and complexity of sequential data collected across various domains have grown rapidly, posing significant challenges for extracting useful knowledge from such data sources. The complexity arises from diverse sequence representations with varying granularities, such as multivariate time series, histogram snapshots, and heterogeneous health records, which often describe a single data instance with multiple sequences. Due to this complexity, the underlying temporal relations between sequences may not be clear and can change over time, making knowledge discovery even more challenging.
To address these challenges, this thesis proposes event intervals as a unified representation for complex sequential data. Event intervals capture the underlying temporal relations between sequences by comparing the relative locations of event intervals in both the time and value dimensions, making them suitable for describing diverse sequential data. The proposed artifacts aim to efficiently and effectively discover patterns of interest, transform sequential data in different application domains through temporal abstraction, and provide interpretable features for machine learning tasks without compromising performance. The effectiveness of the proposed artifacts is evaluated through empirical experiments and practical evaluations, which demonstrate their applicability and performance.
The thesis is structured into three parts. First, it introduces state-of-the-art frameworks for mining event interval sequences, including frequent arrangement mining, classification, and clustering. The utility of these frameworks is demonstrated through comparative empirical evaluations against other frameworks. Second, the thesis applies temporal abstraction to complex sequential data in different application domains, showcasing its applicability through tasks such as disproportionality analysis and local grouping detection for time series. Lastly, event intervals are used as interpretable features for learning tasks, outperforming competing algorithms that use other feature representations. This part focuses on univariate and multivariate time series, with extensive experiments on publicly available benchmark datasets supported by statistical tests.
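As an illustration of the event-interval representation described in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes a simple two-threshold discretization for temporal abstraction and a reduced set of four temporal relations in the spirit of Allen's interval algebra. The names `to_event_intervals` and `relation`, the labels, and the thresholds are all hypothetical.

```python
from typing import List, Tuple

# An event interval: (label, start index, end index), end inclusive.
Interval = Tuple[str, int, int]


def to_event_intervals(series: List[float], low: float, high: float) -> List[Interval]:
    """Abstract a series into maximal runs labelled 'low', 'mid' or 'high'.

    The two thresholds are an assumption made for this sketch only.
    """
    def label(v: float) -> str:
        if v < low:
            return "low"
        if v > high:
            return "high"
        return "mid"

    intervals: List[Interval] = []
    for t, v in enumerate(series):
        lab = label(v)
        if intervals and intervals[-1][0] == lab:
            # Same label as the previous time point: extend the current run.
            prev_label, start, _ = intervals[-1]
            intervals[-1] = (prev_label, start, t)
        else:
            # Label changed: open a new event interval.
            intervals.append((lab, t, t))
    return intervals


def relation(a: Interval, b: Interval) -> Tuple[str, str, str]:
    """Return (first label, relation, second label) for two event intervals.

    The pair is ordered by start time (ties: longer interval first), so only
    four illustrative relations are needed.
    """
    x, y = sorted([a, b], key=lambda iv: (iv[1], -iv[2]))
    (lx, s1, e1), (ly, s2, e2) = x, y
    if (s1, e1) == (s2, e2):
        name = "equals"
    elif e1 < s2:
        name = "before"
    elif e2 <= e1:
        name = "contains"
    else:
        name = "overlaps"
    return (lx, name, ly)


if __name__ == "__main__":
    print(to_event_intervals([1, 2, 8, 9, 9, 3, 1], low=3, high=7))
    print(relation(("A_high", 0, 4), ("B_low", 2, 6)))
```

In a multivariate setting, each variable would be abstracted separately and relations would then be computed between intervals from different variables, which is one way a single data instance described by several sequences can be reduced to one event interval sequence.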
Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2023, p. 102
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 23-009
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-222042
ISBN: 978-91-8014-508-4 (print)
ISBN: 978-91-8014-509-1 (electronic)
OAI: oai:DiVA.org:su-222042
DiVA, id: diva2:1803200
Public defence
2023-11-24, L30, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Opponent
Supervisors
2023-10-31, 2023-10-07, 2023-10-24. Bibliographically approved
List of papers