Duneld, Martin
Publications (10 of 22)
Sandell, J., Asplund, E., Ayele, W. Y. & Duneld, M. (2024). Performance Comparison Analysis of ArangoDB, MySQL, and Neo4j: An Experimental Study of Querying Connected Data. In: Tung X. Bui (Ed.), Proceedings of the 57th Annual Hawaii International Conference on System Sciences. Paper presented at 57th Annual Hawaii International Conference on System Sciences (HICSS 2024), Honolulu, USA, January 3-6, 2024 (pp. 7760-7769). Honolulu
2024 (English). In: Proceedings of the 57th Annual Hawaii International Conference on System Sciences / [ed] Tung X. Bui, Honolulu, 2024, p. 7760-7769. Conference paper, Published paper (Refereed)
Abstract [en]

Choosing and developing performant database solutions helps organizations optimize their operational practices and decision-making. Since graph data is becoming more common, it is crucial to develop and use them in big data with complex relationships with high and consistent performance. However, legacy database technologies such as MySQL are tailored to store relational databases and need to perform more complex queries to retrieve graph data. Previous research has dealt with performance aspects such as CPU and memory usage. In contrast, energy usage and temperature of the servers are lacking. Thus, this paper evaluates and compares state-of-the-art graphs and relational databases from the performance aspects to allow a more informed selection of technologies. Graph-based big data applications benefit from informed selection database technologies for data retrieval and analytics problems. The results show that Neo4j performs faster in querying connected data than MySQL and ArangoDB, and energy, CPU, and memory usage performances are reported in this paper.

Place, publisher, year, edition, pages
Honolulu, 2024
Series
Proceedings of the Annual Hawaii International Conference on System Sciences, ISSN 1530-1605, E-ISSN 2572-6862
Keywords
Graph Data, Querying Performance, Connected Data, Energy Usage, Performance Benchmark
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-231316 (URN); 2-s2.0-85199800629 (Scopus ID); 978-0-9981331-7-1 (ISBN)
Conference
57th Annual Hawaii International Conference on System Sciences (HICSS 2024), Honolulu, USA, January 3-6, 2024
Available from: 2024-06-18. Created: 2024-06-18. Last updated: 2025-02-24. Bibliographically approved
Li, X., Henriksson, A., Duneld, M., Nouri, J. & Wu, Y. (2024). Supporting Teaching-to-the-Curriculum by Linking Diagnostic Tests to Curriculum Goals: Using Textbook Content as Context for Retrieval-Augmented Generation with Large Language Models. In: Andrew M. Olney; Irene-Angelica Chounta; Zitao Liu; Olga C. Santos; Ig Ibert Bittencourt (Ed.), Artificial Intelligence in Education: 25th International Conference, AIED 2024, Recife, Brazil, July 8–12, 2024, Proceedings, Part I. Paper presented at Artificial Intelligence in Education. AIED 2024, Recife, Brazil, July 8–12, 2024. (pp. 118-132). Springer Nature
2024 (English). In: Artificial Intelligence in Education: 25th International Conference, AIED 2024, Recife, Brazil, July 8–12, 2024, Proceedings, Part I / [ed] Andrew M. Olney; Irene-Angelica Chounta; Zitao Liu; Olga C. Santos; Ig Ibert Bittencourt, Springer Nature, 2024, p. 118-132. Conference paper, Published paper (Refereed)
Abstract [en]

Using AI for automatically linking exercises to curriculum goals can support many educational use cases and facilitate teaching-to-the-curriculum by ensuring that exercises adequately reflect and encompass the curriculum goals, ultimately enabling curriculum-based assessment. Here, we introduce this novel task and create a manually labeled dataset where two types of diagnostic tests are linked to curriculum goals for Biology G7-9 in Sweden. We cast the problem both as an information retrieval task and a multi-class text classification task and explore unsupervised approaches to both, as labeled data for such tasks is typically scarce. For the information retrieval task, we employ SOTA embedding model ADA-002 for semantic textual similarity (STS), while we prompt a large language model in the form of ChatGPT to classify diagnostic tests into curriculum goals. For both task formulations, we investigate different ways of using textbook content as a pivot and provide additional context for linking diagnostic tests to curriculum goals. We show that a combination of the two approaches in a retrieval-augmented generation model, whereby STS is used for retrieving textbook content as context to ChatGPT that then performs zero-shot classification, leads to the best classification accuracy (73.5%), outperforming both STS-based classification (67.5%) and LLM-based classification without context (71.5%). Finally, we showcase how the proposed method could be used in pedagogical practices.

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 14829
Keywords
Teaching-to-the-Curriculum, Semantic Textual Similarity, Large Language Models, ChatGPT, Retrieval-Augmented Generation.
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-232105 (URN); 10.1007/978-3-031-64302-6_9 (DOI); 001312807700009 (); 2-s2.0-85200234051 (Scopus ID); 978-3-031-64302-6 (ISBN); 978-3-031-64301-9 (ISBN)
Conference
Artificial Intelligence in Education. AIED 2024, Recife, Brazil, July 8–12, 2024.
Available from: 2024-07-24. Created: 2024-07-24. Last updated: 2025-02-07. Bibliographically approved
Wu, Y., Henriksson, A., Nouri, J., Duneld, M. & Li, X. (2023). Beyond Benchmarks: Spotting Key Topical Sentences While Improving Automated Essay Scoring Performance with Topic-Aware BERT. Electronics, 12(1), Article ID 150.
2023 (English). In: Electronics, E-ISSN 2079-9292, Vol. 12, no 1, article id 150. Article in journal (Refereed), Published
Abstract [en]

Automated Essay Scoring (AES) automatically allocates scores to essays at scale and may help teachers reduce the heavy burden during grading activities. Recently, researchers have deployed neural-based AES approaches to improve upon the state-of-the-art AES performance. These neural-based AES methods mainly take student essays as the sole input and focus on learning the relationship between student essays and essay scores through deep neural networks. However, their only product, the predicted holistic score, is far from providing adequate pedagogical information, such as automated writing evaluation (AWE). In this work, we propose Topic-aware BERT, a new method of learning relations among scores, student essays, as well as topical information in essay instructions. Beyond improving the AES benchmark performance, Topic-aware BERT can automatically retrieve key topical sentences in student essays by probing self-attention maps in intermediate layers. We evaluate the performance of Topic-aware BERT of different variants to (i) perform AES and (ii) retrieve key topical sentences using the open dataset Automated Student Assessment Prize and a manually annotated dataset. Our experiments show that Topic-aware BERT achieves a strong AES performance compared with the previous best neural-based AES methods and demonstrates effectiveness in identifying key topical sentences in argumentative essays.

Keywords
Artificial Intelligence, Natural Language Processing, Automated Essay Scoring, Automated Writing Evaluation, BERT
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-213557 (URN); 10.3390/electronics12010150 (DOI); 000910414500001 (); 2-s2.0-85145830143 (Scopus ID)
Note

This article belongs to the Special Issue Artificial Intelligence Solutions and Applications for Distributed Systems in Smart Spaces

Available from: 2023-01-09. Created: 2023-01-09. Last updated: 2025-02-07. Bibliographically approved
Li, X., Henriksson, A., Nouri, J., Duneld, M. & Wu, Y. (2023). Linking Swedish Learning Materials to Exercises through an AI-Enhanced Recommender System. In: Marcelo Milrad, Nuno Otero, María Cruz Sánchez‑Gómez, Juan José Mena, Dalila Durães, Filippo Sciarrone, Claudio Alvarez-Gómez, Manuel Rodrigues, Pierpaolo Vittorini, Rosella Gennari, Tania Di Mascio, Marco Temperini, Fernando De la Prieta (Ed.), Methodologies and Intelligent Systems for Technology Enhanced Learning, 13th International Conference. Paper presented at 13th International Conference on Methodologies and Intelligent Systems for Technology Enhanced Learning (MIS4TEL 2023), Guimarães, Portugal, July 12-14, 2023 (pp. 96-107). Cham: Springer
2023 (English). In: Methodologies and Intelligent Systems for Technology Enhanced Learning, 13th International Conference / [ed] Marcelo Milrad, Nuno Otero, María Cruz Sánchez‑Gómez, Juan José Mena, Dalila Durães, Filippo Sciarrone, Claudio Alvarez-Gómez, Manuel Rodrigues, Pierpaolo Vittorini, Rosella Gennari, Tania Di Mascio, Marco Temperini, Fernando De la Prieta, Cham: Springer, 2023, p. 96-107. Conference paper, Published paper (Refereed)
Abstract [en]

As an integral part of AI-enhanced learning, a content recommender automatically filters and recommends relevant learning materials to the learner or the instructor in a learning system. It can effectively help instructors in pedagogical practices and support students in self-regulated learning. Content recommendation technologies and applications have been studied extensively, however, the SOTA technologies have not adequately adapted to the education domain and there is very limited research on how different models and solutions can be applied in the Swedish context and for multiple subjects. In this paper, we develop a text similarity-based content recommender system. Specifically, given a quiz, we automatically recommend supportive learning resources as a reference to the answer and link back to the textbook sections where the examined knowledge points reside. We present a generic method for Swedish educational content recommendations using the most representative models, evaluate and analyze in multi-dimensions such as model types, pooling methods, subjects etc. The best results are obtained by Sentence-BERT (SBERT) with max paragraph-level pooling, outperforming traditional Natural Language Processing (NLP) models and knowledge graph-based models, obtaining on average 95% in Recall@3 and 82% in MRR, and outstanding in dealing with texts containing symbols, equations or calculations. This research provides empirical evidence and analysis, and can be used as a guidance when building a Swedish educational content recommender.

Place, publisher, year, edition, pages
Cham: Springer, 2023
Series
Lecture Notes in Networks and Systems, ISSN 2367-3370, E-ISSN 2367-3389 ; 764
Keywords
AI-enhanced Learning, Educational Content Recommender, NLP, Text Similarity, Textual Semantic Search
National Category
Computer Sciences
Identifiers
urn:nbn:se:su:diva-223047 (URN); 10.1007/978-3-031-41226-4_10 (DOI); 2-s2.0-85172692344 (Scopus ID); 978-3-031-41225-7 (ISBN); 978-3-031-41226-4 (ISBN)
Conference
13th International Conference on Methodologies and Intelligent Systems for Technology Enhanced Learning (MIS4TEL 2023), Guimarães, Portugal, July 12-14, 2023
Available from: 2023-10-18. Created: 2023-10-18. Last updated: 2024-09-06. Bibliographically approved
Wu, Y., Nouri, J., Megyesi, B., Henriksson, A., Duneld, M. & Li, X. (2023). Towards Data-effective Educational Question Generation with Prompt-based Learning. Paper presented at Computing Conference 2023. Springer Nature
2023 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Practice and exam-style questions, as essential educational tools, contribute to educators’ effective teaching. Automatic question generation (QG) is a promising technique that can eliminate the manual effort of constructing questions and boost technology-enhanced education systems. Recently, deep neural network-based question-generation approaches have significantly improved upon state-of-the-art of question generation. Nevertheless, these approaches are often developed based on huge and non-educational datasets consisting of over 100,000 examples, which negatively affect the scalability and reliability of the educational QG systems. This study proposes a prompt-based learning QG approach that could generate questions in a data-effective way. The proposed prompt-based learning QG approach is trained and evaluated on a general dataset SQuAD, and an educational dataset SciQ. Experiment results demonstrate that our approach outperforms existing best QG models by a vast margin in data-effective scenarios and could generate high-quality educational questions with as few as 1,000 training examples.

Place, publisher, year, edition, pages
Springer Nature, 2023
Series
Lecture Notes in Networks and Systems, ISSN 2367-3370, E-ISSN 2367-3389 ; 711
Keywords
Question Generation, Natural Language Processing, Artificial Intelligence, Prompt-based Learning
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-224808 (URN); 10.1007/978-3-031-37717-4_11 (DOI); 2-s2.0-85174674904 (Scopus ID); 978-3-031-37716-7 (ISBN)
Conference
Computing Conference 2023
Available from: 2023-12-27. Created: 2023-12-27. Last updated: 2024-10-30. Bibliographically approved
Wu, Y., Henriksson, A., Duneld, M. & Nouri, J. (2023). Towards Improving the Reliability and Transparency of ChatGPT for Educational Question Answering. In: LNCS Springer Conference Proceedings. Paper presented at Eighteenth European Conference on Technology Enhanced Learning EC-TEL, 2023. Springer
2023 (English). In: LNCS Springer Conference Proceedings, Springer, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Large language models (LLMs), such as ChatGPT, have shown remarkable performance on various natural language processing (NLP) tasks, including educational question answering (EQA). However, LLMs generate text entirely based on knowledge obtained during pre-training, which means they struggle with recent information or domain-specific knowledge bases. Moreover, only providing answers to questions posed to LLMs without any grounding materials makes it difficult for students to judge their validity.

We therefore propose a method for integrating information retrieval systems with LLMs when developing EQA systems, which in addition to improving EQA performance grounds the answers in the educational context. Our experiments show that the proposed system outperforms vanilla ChatGPT with a vast margin of 110.9%, 67.8%, and 43.3% on BLEU, ROUGE, and METEOR scores. In addition, we argue that the use of the retrieved educational context enhances the transparency and reliability of the EQA process, making it easier to determine the correctness of the answers.

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 14200
Keywords
AI, NLP, ChatGPT, LLMs, Educational Question Answering
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-224809 (URN); 10.1007/978-3-031-42682-7_32 (DOI); 2-s2.0-85171972992 (Scopus ID); 978-3-031-42681-0 (ISBN)
Conference
Eighteenth European Conference on Technology Enhanced Learning EC-TEL, 2023
Available from: 2023-12-27. Created: 2023-12-27. Last updated: 2024-10-30. Bibliographically approved
Li, X., Nouri, J., Henriksson, A., Duneld, M. & Wu, Y. (2022). Automatic Educational Concept Extraction Using NLP. In: Marco Temperini; Vittorio Scarano; Ivana Marenzi; Milos Kravcik; Elvira Popescu; Rosa Lanzillotti; Rosella Gennari; Fernando De la Prieta; Tania Di Mascio; Pierpaolo Vittorini (Ed.), Methodologies and Intelligent Systems for Technology Enhanced Learning, 12th International Conference. Paper presented at MIS4TEL 2022, 12th International Conference on Methodologies and Intelligent Systems for Technology Enhanced Learning, L'Aquila (Italy) / Hybrid, 13-15 July, 2022 (pp. 133-138). Springer Nature
2022 (English). In: Methodologies and Intelligent Systems for Technology Enhanced Learning, 12th International Conference / [ed] Marco Temperini; Vittorio Scarano; Ivana Marenzi; Milos Kravcik; Elvira Popescu; Rosa Lanzillotti; Rosella Gennari; Fernando De la Prieta; Tania Di Mascio; Pierpaolo Vittorini, Springer Nature, 2022, p. 133-138. Conference paper, Published paper (Refereed)
Abstract [en]

Educational concepts are the core of teaching and learning. From the perspective of educational technology, concepts are essential meta-data, representative terms that can connect different learning materials, and are the foundation for many downstream tasks. Some studies on automatic concept extraction have been conducted, but there are no studies looking at the K-12 level and focused on the Swedish language. In this paper, we use a state-of-the-art Swedish BERT model to build an automatic concept extractor for the Biology subject using fine-annotated digital textbook data that cover all content for K-12. The model gives a recall measure of 72% and has the potential to be used in real-world settings for use cases that require high recall. Meanwhile, we investigate how input data features influence model performance and provide guidance on how to effectively use text data to achieve the optimal results when building a named entity recognition (NER) model.

Place, publisher, year, edition, pages
Springer Nature, 2022
Series
Lecture Notes in Networks and Systems, ISSN 2367-3370, E-ISSN 2367-3389 ; 580
Keywords
Concept extraction, NLP, BERT, Sequence model, NER
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-213067 (URN); 10.1007/978-3-031-20617-7_17 (DOI); 000921287500017 (); 2-s2.0-85144211791 (Scopus ID); 978-3-031-20617-7 (ISBN); 978-3-031-20616-0 (ISBN)
Conference
MIS4TEL 2022, 12th International Conference on Methodologies and Intelligent Systems for Technology Enhanced Learning, L'Aquila (Italy) / Hybrid, 13-15 July, 2022
Available from: 2022-12-19. Created: 2022-12-19. Last updated: 2024-09-06. Bibliographically approved
Dziadek, J., Henriksson, A. & Duneld, M. (2017). Improving Terminology Mapping in Clinical Text with Context-Sensitive Spelling Correction. In: Rebecca Randell, Ronald Cornet, Colin McCowan, Niels Peek, Philip J. Scott (Ed.), Informatics for Health: Connected Citizen-Led Wellness and Population Health. Paper presented at The Medical Informatics Europe (MIE) Conference, Manchester, UK, 24-26 April, 2017 (pp. 241-245). IOS Press
2017 (English). In: Informatics for Health: Connected Citizen-Led Wellness and Population Health / [ed] Rebecca Randell, Ronald Cornet, Colin McCowan, Niels Peek, Philip J. Scott, IOS Press, 2017, p. 241-245. Conference paper, Published paper (Refereed)
Abstract [en]

The mapping of unstructured clinical text to an ontology facilitates meaningful secondary use of health records but is non-trivial due to lexical variation and the abundance of misspellings in hurriedly produced notes. Here, we apply several spelling correction methods to Swedish medical text and evaluate their impact on SNOMED CT mapping; first in a controlled evaluation using medical literature text with induced errors, followed by a partial evaluation on clinical notes. It is shown that the best-performing method is context-sensitive, taking into account trigram frequencies and utilizing a corpus-based dictionary.

Place, publisher, year, edition, pages
IOS Press, 2017
Series
Studies in Health Technology and Informatics, ISSN 0926-9630, E-ISSN 1879-8365 ; 235
Keywords
spelling correction, terminology mapping, clinical text
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-149424 (URN); 10.3233/978-1-61499-753-5-241 (DOI); 978-1-61499-752-8 (ISBN); 978-1-61499-753-5 (ISBN)
Conference
The Medical Informatics Europe (MIE) Conference, Manchester, UK, 24-26 April, 2017
Available from: 2017-11-30. Created: 2017-11-30. Last updated: 2025-02-07. Bibliographically approved
Henriksson, A., Kvist, M., Dalianis, H. & Duneld, M. (2015). Identifying adverse drug event information in clinical notes with distributional semantic representations of context. Journal of Biomedical Informatics, 57, 333-349
2015 (English). In: Journal of Biomedical Informatics, ISSN 1532-0464, E-ISSN 1532-0480, Vol. 57, p. 333-349. Article in journal (Refereed), Published
Abstract [en]

For the purpose of post-marketing drug safety surveillance, which has traditionally relied on the voluntary reporting of individual cases of adverse drug events (ADEs), other sources of information are now being explored, including electronic health records (EHRs), which give us access to enormous amounts of longitudinal observations of the treatment of patients and their drug use. Adverse drug events, which can be encoded in EHRs with certain diagnosis codes, are, however, heavily underreported. It is therefore important to develop capabilities to process, by means of computational methods, the more unstructured EHR data in the form of clinical notes, where clinicians may describe and reason around suspected ADEs. In this study, we report on the creation of an annotated corpus of Swedish health records for the purpose of learning to identify information pertaining to ADEs present in clinical notes. To this end, three key tasks are tackled: recognizing relevant named entities (disorders, symptoms, drugs), labeling attributes of the recognized entities (negation, speculation, temporality), and relationships between them (indication, adverse drug event). For each of the three tasks, leveraging models of distributional semantics – i.e., unsupervised methods that exploit co-occurrence information to model, typically in vector space, the meaning of words – and, in particular, combinations of such models, is shown to improve the predictive performance. The ability to make use of such unsupervised methods is critical when faced with large amounts of sparse and high-dimensional data, especially in domains where annotated resources are scarce.

Keywords
adverse drug events, electronic health records, corpus annotation, machine learning, distributional semantics, relation extraction
National Category
Computer Sciences; Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-122464 (URN); 10.1016/j.jbi.2015.08.013 (DOI); 000363437500028 ()
Projects
High-Performance Data Mining for Drug Effect Detection
Funder
Swedish Foundation for Strategic Research, IIS11-0053
Available from: 2015-11-02. Created: 2015-11-02. Last updated: 2025-02-01. Bibliographically approved
Velupillai, S., Duneld, M., Henriksson, A., Kvist, M., Skeppstedt, M. & Dalianis, H. (Eds.). (2015). Louhi 2014: Special issue on health text mining and information analysis. Paper presented at EACL 2014 Workshop - The Fifth International Workshop on Health Text Mining and Information Analysis, Gothenburg, Sweden, April 27, 2014. London: BioMed Central
2015 (English). Conference proceedings (editor) (Refereed)
Place, publisher, year, edition, pages
London: BioMed Central, 2015
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-119911 (URN)
Conference
EACL 2014 Workshop - The Fifth International Workshop on Health Text Mining and Information Analysis, Gothenburg, Sweden, April 27, 2014
Note

Special Issue: BMC Medical Informatics and Decision Making, ISSN 1472-6947, Volume 15, Supplement 2.

Available from: 2015-11-11. Created: 2015-08-28. Last updated: 2022-02-23. Bibliographically approved