Supporting Teaching-to-the-Curriculum by Linking Diagnostic Tests to Curriculum Goals: Using Textbook Content as Context for Retrieval-Augmented Generation with Large Language Models
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-7860-1784
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0001-9731-1048
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. ORCID iD: 0000-0002-9942-8730
Number of Authors: 5
2024 (English)
In: Artificial Intelligence in Education: 25th International Conference, AIED 2024, Recife, Brazil, July 8–12, 2024, Proceedings, Part I / [ed] Andrew M. Olney; Irene-Angelica Chounta; Zitao Liu; Olga C. Santos; Ig Ibert Bittencourt, Springer Nature, 2024, p. 118-132
Conference paper, Published paper (Refereed)
Abstract [en]

Using AI for automatically linking exercises to curriculum goals can support many educational use cases and facilitate teaching-to-the-curriculum by ensuring that exercises adequately reflect and encompass the curriculum goals, ultimately enabling curriculum-based assessment. Here, we introduce this novel task and create a manually labeled dataset where two types of diagnostic tests are linked to curriculum goals for Biology G7-9 in Sweden. We cast the problem both as an information retrieval task and as a multi-class text classification task and explore unsupervised approaches to both, as labeled data for such tasks is typically scarce. For the information retrieval task, we employ the SOTA embedding model ADA-002 for semantic textual similarity (STS), while for the classification task we prompt a large language model in the form of ChatGPT to classify diagnostic tests into curriculum goals. For both task formulations, we investigate different ways of using textbook content as a pivot to provide additional context for linking diagnostic tests to curriculum goals. We show that a combination of the two approaches in a retrieval-augmented generation model, whereby STS is used to retrieve textbook content as context for ChatGPT, which then performs zero-shot classification, leads to the best classification accuracy (73.5%), outperforming both STS-based classification (67.5%) and LLM-based classification without context (71.5%). Finally, we showcase how the proposed method could be used in pedagogical practices.
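
The retrieval-augmented classification pipeline described in the abstract can be illustrated with a short sketch: embed the textbook sections and a diagnostic test item, retrieve the most similar sections by cosine similarity, and pass them as context to a chat model that performs zero-shot classification into curriculum goals. This is a minimal illustration under stated assumptions, not the authors' code; the prompt wording, the data structures, and the choice of gpt-3.5-turbo as a stand-in for "ChatGPT" are assumptions.

import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

def embed(texts):
    # One embedding vector per input text, using ADA-002 as named in the abstract.
    response = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in response.data])

def retrieve_context(test_item, textbook_sections, k=3):
    # Rank textbook sections by cosine similarity (STS) to the diagnostic test item.
    section_vecs = embed(textbook_sections)
    item_vec = embed([test_item])[0]
    sims = section_vecs @ item_vec / (
        np.linalg.norm(section_vecs, axis=1) * np.linalg.norm(item_vec))
    top = np.argsort(sims)[::-1][:k]
    return [textbook_sections[i] for i in top]

def classify(test_item, curriculum_goals, context_sections):
    # Zero-shot classification of the test item into one curriculum goal,
    # with the retrieved textbook sections supplied as context (the RAG step).
    goals = "\n".join(f"{i}: {g}" for i, g in enumerate(curriculum_goals))
    prompt = ("Relevant textbook content:\n" + "\n\n".join(context_sections) +
              "\n\nCurriculum goals:\n" + goals +
              "\n\nDiagnostic test item:\n" + test_item +
              "\n\nAnswer with the number of the best-matching curriculum goal.")
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative model choice, not confirmed by the paper
        messages=[{"role": "user", "content": prompt}])
    return reply.choices[0].message.content.strip()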

Place, publisher, year, edition, pages
Springer Nature, 2024. p. 118-132
Series
Lecture Notes in Computer Science (LNCS), ISSN 0302-9743, E-ISSN 1611-3349 ; 14829
Keywords [en]
Teaching-to-the-Curriculum, Semantic Textual Similarity, Large Language Models, ChatGPT, Retrieval-Augmented Generation.
National Category
Natural Language Processing
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-232105
DOI: 10.1007/978-3-031-64302-6_9
ISI: 001312807700009
Scopus ID: 2-s2.0-85200234051
ISBN: 978-3-031-64302-6 (electronic)
ISBN: 978-3-031-64301-9 (print)
OAI: oai:DiVA.org:su-232105
DiVA, id: diva2:1885717
Conference
Artificial Intelligence in Education. AIED 2024, Recife, Brazil, July 8–12, 2024.
Available from: 2024-07-24 Created: 2024-07-24 Last updated: 2025-02-07
Bibliographically approved
In thesis
1. Exploring Natural Language Processing for Linking Digital Learning Materials: Towards Intelligent and Adaptive Learning Systems
2024 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The digital transformation in education has created many opportunities but also made it challenging to navigate the growing landscape of digital learning materials. The volume and diversity of learning resources create challenges for both educators and learners to identify and utilize the most relevant resources based on specific learning contexts. In light of this, there is a critical demand for systems capable of effectively connecting different learning materials to support teaching and learning activities and, for that purpose, natural language processing can be used to provide some of the essential building blocks for educational content recommendation systems. Hence, this thesis explores the use of natural language processing techniques for automatically linking and recommending relevant learning resources in the form of textbook content, exercises and curriculum goals. A key question is how to represent diverse learning materials effectively and, to that end, various language models are explored; the obtained representations are then used for measuring semantic textual similarity between learning materials. Learning materials can also be represented based on educational concepts, which is investigated in an ontology-based linking approach. To further enhance the representations and improve linking performance, different language models can be combined and augmented using external knowledge in the form of knowledge graphs and knowledge bases. Beyond approaches based on semantic textual similarity, prompting large language models is explored and a method based on retrieval-augmented generation (RAG) to improve linking performance is proposed. 
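
As a concrete illustration of the STS-based linking described above, the following sketch represents an exercise and a set of textbook sections with a pre-trained sentence-embedding model and links the exercise to the most similar section by cosine similarity. The model choice and example texts are illustrative assumptions; the thesis evaluates a range of representation methods rather than this particular one.

from sentence_transformers import SentenceTransformer, util

# Illustrative pre-trained model; not necessarily one of the models evaluated in the thesis.
model = SentenceTransformer("all-MiniLM-L6-v2")

textbook_sections = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Cells divide through mitosis to produce two identical daughter cells.",
]
exercise = "Explain how plants use sunlight to produce glucose."

# Encode the learning materials and measure semantic textual similarity.
section_vecs = model.encode(textbook_sections, convert_to_tensor=True)
exercise_vec = model.encode(exercise, convert_to_tensor=True)
scores = util.cos_sim(exercise_vec, section_vecs)[0]

best = int(scores.argmax())
print(f"Linked to section {best} (cosine similarity {float(scores[best]):.2f})")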

The thesis presents a systematic empirical evaluation of natural language processing techniques for representing and linking digital learning content, spanning different types of learning materials, use cases, and subjects. The results demonstrate the feasibility of unsupervised approaches based on semantic textual similarity of representations derived from pre-trained language models, and show that contextual embeddings outperform traditional text representation methods. Furthermore, zero-shot prompting of large language models can outperform methods based on semantic textual similarity by leveraging RAG to exploit an external knowledge base in the form of a digital textbook. The potential practical applications of the proposed approaches for automatic linking of digital learning materials pave the way for the development of intelligent and adaptive learning systems, including intelligent textbooks.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2024. p. 70
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 24-011
Keywords
Natural Language Processing, Technology Enhanced Learning, Educational Content Recommendation, Intelligent Textbooks, Pre-Trained Language Models, Large Language Models, Semantic Textual Similarity, Knowledge Graphs
National Category
Computer and Information Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-232990 (URN)
978-91-8014-927-3 (ISBN)
978-91-8014-928-0 (ISBN)
Public defence
2024-10-22, Lilla hörsalen, NOD-huset, Borgarfjordsgatan 12, Kista, 13:00 (English)
Opponent
Supervisors
Available from: 2024-09-27 Created: 2024-09-06 Last updated: 2024-09-19
Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Li, Xiu; Henriksson, Aron; Duneld, Martin; Nouri, Jalal; Wu, Yongchao
