Coriander: A Toolset for Generating Realistic Android Digital Evidence Datasets
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-5115-1453
2017 (English). In: Digital Forensics and Cyber Crime: Proceedings / [ed] Petr Matoušek, Martin Schmiedecker, Springer, 2017, p. 228-233. Conference paper, Published paper (Refereed)
Abstract [en]

Triage has been suggested as a means to prioritize and identify sources and artifacts of evidence that might be of most interest when faced with large amounts of digital evidence. Memory Forensics has long relied on simple string matching to triage evidence sources. In this paper, we describe the early developments in our study of Machine Learning-based triage for Memory Forensics. To start off, there are no large datasets of memory captures available. We thus develop a toolset to enable the automated creation of realistic Android process memory dumps. Using our toolset, we generate a dataset of 2375 process memory string dumps from both malicious and benign Android applications, classified by VirusTotal and sourced from the AndroZoo project. Our dataset and toolset are made available online to help promote research in this field and related areas.

Place, publisher, year, edition, pages
Springer, 2017. p. 228-233
Series
Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, ISSN 1867-8211 ; 216
Keywords [en]
Android Forensics, Digital Forensics, Mobile Forensics, Memory Forensics, Digital Evidence, Datasets, Metadata, Machine Learning, Triage
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-149260
DOI: 10.1007/978-3-319-73697-6_18
ISBN: 978-3-319-73696-9 (print)
ISBN: 978-3-319-73697-6 (electronic)
OAI: oai:DiVA.org:su-149260
DiVA, id: diva2:1159976
Conference
9th International Conference, ICDF2C 2017, Prague, Czech Republic, October 9-11, 2017
Available from: 2017-11-24. Created: 2017-11-24. Last updated: 2022-02-28. Bibliographically approved.
In thesis
1. Advancing Automation in Digital Forensic Investigations
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Digital Forensics is used to aid traditional preventive security mechanisms when they fail to curtail sophisticated and stealthy cybercrime events. The Digital Forensic Investigation process is largely manual in nature, or at best quasi-automated, requiring a highly skilled labour force and involving a sizeable time investment. Industry-standard tools are evidence-centric, automate only a few precursory tasks (e.g. parsing and indexing) and have limited capabilities for integrating multiple evidence sources. Furthermore, these tools are always human-driven.

These challenges are exacerbated in the increasingly computerized and highly networked environment of today. Volumes of digital evidence to be collected and analyzed have increased, and so has the diversity of digital evidence sources involved in a typical case. This further handicaps digital forensics practitioners, labs and law enforcement agencies, causing delays in investigations and legal systems due to backlogs of cases. Improved efficiency of the digital investigation process is needed, in terms of increasing the speed and reducing the human effort expended. This study aims at achieving this time and effort reduction, by advancing automation within the digital forensic investigation process.

Using a Design Science research approach, artifacts are designed and developed to address these practical problems. In summary, the requirements and architecture of a system for automating digital investigations in highly networked environments are designed. The architecture initially focuses on automation of the identification and acquisition of digital evidence, while later versions focus on full automation and self-organization of devices for all phases of the digital investigation process. Part of the remote evidence acquisition capability of this system architecture is implemented as a proof of concept. The speed and reliability of capturing digital evidence from remote mobile devices over a client-server paradigm are evaluated. A method for the uniform representation and integration of multiple diverse evidence sources, enabling automated correlation, simple reasoning and querying, is developed and tested. This method is aimed at automating the analysis phase of digital investigations. Machine Learning (ML)-based triage methods are developed and tested to evaluate the feasibility and performance of using such techniques to automate the identification of priority digital evidence fragments. Models from these ML methods are evaluated in identifying network protocols within DNS-tunneled network traffic. A large dataset is also created for future research in ML-based triage for identifying suspicious processes for memory forensics.

From an ex ante evaluation, the designed system architecture enables individual devices to participate in the entire digital investigation process, contributing their processing power towards alleviating the burden on the human analyst. Experiments show that remote evidence acquisition of mobile devices over networks is feasible; however, a single-TCP-connection paradigm scales poorly. A proof-of-concept experiment demonstrates the viability of the automated integration, correlation and reasoning over multiple diverse evidence sources using semantic web technologies. Experimentation also shows that ML-based triage methods can enable prioritization of certain digital evidence sources, for acquisition or analysis, with up to 95% accuracy.

The artifacts developed in this study provide concrete ways to enhance automation in the digital forensic investigation process to increase the investigation speed and reduce the amount of costly human intervention needed.

 

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2018. p. 149
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 18-002
Keywords
Digital Forensics, Machine Learning, Computer Forensics, Network Forensics, Predictive Modelling, Distributed Systems, Mobile Devices, Mobile Forensics, Memory Forensics, Android, Semantic Web, Hypervisors, Virtualization, Remote Acquisition, Evidence Analysis, Correlation, P2P, Bittorrent
National Category
Computer Systems; Communication Systems; Telecommunications; Computer Sciences; Computer Engineering; Information Systems
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-161555
ISBN: 978-91-7797-521-2
ISBN: 978-91-7797-520-5
Public defence
2018-12-17, L30, NOD-huset, Borgarfjordsgatan 12, Kista, 14:00 (English)
Available from: 2018-11-22. Created: 2018-10-30. Last updated: 2022-02-26. Bibliographically approved.

Authority records

Homem, Irvin
