Achievement tests and optimal design for pretesting of questions
Stockholms universitet, Samhällsvetenskapliga fakulteten, Statistiska institutionen. ORCID iD: 0000-0003-2889-0263
2019 (English). Doctoral thesis, comprising papers (Other academic)
Abstract [en]

Achievement tests are used to measure students' proficiency in a particular area of knowledge. Computerized achievement tests (e.g. GRE and SAT) usually draw their questions from an item bank. An item bank is a large collection of items with known characteristics (e.g. difficulty). Item banks are continuously updated and revised, with new items replacing obsolete, overexposed or flawed items over time. This thesis is devoted to updating and maintaining the item bank with high-quality questions and better estimates of item parameters (item calibration).

The thesis contains four manuscripts. One paper investigates the impact of student ability dimensionality on the estimated parameters and the other three deal with item calibration.

In the first paper, we investigate how the ability dimensionality influences the estimates of the item parameters. Through a case study and a simulation study, we find that a multidimensional model discriminates better among the students.

The second paper describes a method for optimal item calibration by efficiently selecting the examinees based on their ability levels. We develop an algorithm which selects intervals for the students' ability levels for optimal calibration of the items. We also develop an equivalence theorem for item calibration to verify the optimal design.  

The algorithm developed in Paper II becomes complicated as the number of items to calibrate increases. In Paper III we therefore develop a new exchange algorithm based on the equivalence theorem from Paper II.

Finally, the fourth paper generalizes the exchange algorithm described in Paper III to the case where the students draw on multidimensional abilities to answer the questions.

Place, publisher, year, edition, pages
Department of Statistics, Stockholm University, 2019, p. 26
Keywords [en]
Achievement test, Equivalence theorem, Exchange algorithm, Item calibration, Item response theory model, Optimal experimental design
National category
Research subject
Statistics
Identifiers
URN: urn:nbn:se:su:diva-174079
ISBN: 978-91-7797-879-4 (print)
ISBN: 978-91-7797-880-0 (digital)
OAI: oai:DiVA.org:su-174079
DiVA, id: diva2:1357038
Public defence
2019-11-15, William-Olssonsalen, Geovetenskapens hus, Svante Arrhenius väg 14, floor 1, Stockholm, 10:00 (English)
Opponent
Supervisor
Note

At the time of the doctoral defense, the following papers were unpublished and had the following status: Paper 1: Manuscript. Paper 3: Manuscript. Paper 4: Manuscript.

Available from: 2019-10-23. Created: 2019-10-02. Last updated: 2022-02-26. Bibliographically approved
List of papers
1. Discrimination with unidimensional and multidimensional item response theory models for educational data
2022 (English). In: Communications in statistics. Simulation and computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 51, no. 6, p. 2992-3012. Article in journal (Refereed). Published
Abstract [en]

Achievement tests are used to characterize the proficiency of higher-education students. Item response theory (IRT) models are applied to these tests to estimate the ability of students (as a latent variable in the model). For good-quality IRT parameters to be estimated, especially ability parameters, it is important that the appropriate number of dimensions is identified. Through a case study, based on a statistics exam for students in higher education, we show how dimensions and other model parameters can be chosen in a real situation. Our model choice is based on both empirical evidence and background knowledge of the test. We show that dimensionality influences the estimates of the item parameters, especially the discrimination parameter, which provides information about the quality of the item. We perform a simulation study to generalize our conclusions. Both the simulation study and the case study show that multidimensional models have the advantage of discriminating better between examinees. We conclude from the simulation study that it is safer to use a multidimensional model than a unidimensional one if it is unknown which model is the correct one.

Keywords
Achievement tests, Discrimination, Multidimensional four parameter logistic model, Multidimensional graded response model, Multidimensional item response theory
National category
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-177401 (URN)
10.1080/03610918.2019.1705344 (DOI)
000504938300001 ()
2-s2.0-85078600822 (Scopus ID)
Available from: 2020-01-05. Created: 2020-01-05. Last updated: 2022-09-27. Bibliographically approved
2. Optimal Item Calibration for Computerized Achievement Tests
2019 (English). In: Psychometrika, ISSN 0033-3123, E-ISSN 1860-0980, Vol. 84, no. 4, p. 1101-1128. Article in journal (Refereed). Published
Abstract [en]

Item calibration is a technique to estimate characteristics of questions (called items) for achievement tests. In computerized tests, item calibration is an important tool for maintaining, updating and developing new items for an item bank. To efficiently sample examinees with specific ability levels for this calibration, we use optimal design theory, assuming that the probability of answering correctly follows an item response model. Locally optimal unrestricted designs usually have only a few design points for ability. In practice, it is hard to sample examinees from a population with these specific ability levels because such examinees are unavailable or in limited supply. To counter this problem, we use the concept of optimal restricted designs and show that this concept fits item calibration naturally. We prove an equivalence theorem needed to verify the optimality of a design. Locally optimal restricted designs provide intervals of ability levels for optimal calibration of an item. Assuming a two-parameter logistic model, we present several scenarios with D-optimal restricted designs for calibration of a single item and for simultaneous calibration of several items. These scenarios show that the naive way of sampling examinees around unrestricted design points is not optimal.
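
As a rough illustration of the D-optimality criterion mentioned in this abstract, the following Python sketch evaluates the log-determinant of the Fisher information for a single two-parameter logistic item and searches for the best symmetric two-point unrestricted design. It is only a sketch under stated assumptions: the parametrization P(theta) = 1 / (1 + exp(-a (theta - b))), the parameter values, the function names and the grid search are all illustrative choices and are not taken from the paper.

```python
import numpy as np

def p_correct(theta, a, b):
    """Assumed 2PL response probability with discrimination a and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information for (a, b) contributed by one examinee with ability theta."""
    p = p_correct(theta, a, b)
    grad = np.array([theta - b, -a])  # gradient of a * (theta - b) w.r.t. (a, b)
    return p * (1.0 - p) * np.outer(grad, grad)

def log_det_information(thetas, weights, a, b):
    """D-criterion: log-determinant of the weighted information matrix of a design."""
    m = sum(w * item_information(t, a, b) for t, w in zip(thetas, weights))
    sign, logdet = np.linalg.slogdet(m)
    return logdet if sign > 0 else -np.inf

# Hypothetical item parameters, chosen only for illustration.
a, b = 1.0, 0.0

# Grid search over symmetric two-point designs {b - c, b + c} with equal weights.
offsets = np.linspace(0.5, 3.0, 2501)
scores = [log_det_information([b - c, b + c], [0.5, 0.5], a, b) for c in offsets]
best_offset = offsets[int(np.argmax(scores))]
print(f"best symmetric offset from b: {best_offset:.3f}")
```

For a = 1 the search should land near an offset of about 1.54, where the success probabilities are roughly 0.18 and 0.82. This is the kind of few-point unrestricted design that, according to the abstract, is hard to realize with the examinees actually available.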

Keywords
achievement tests, computerized tests, item calibration, optimal restricted design, two-parameter logistic model
National category
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-169646 (URN)
10.1007/s11336-019-09673-6 (DOI)
000492593800010 ()
Available from: 2019-06-12. Created: 2019-06-12. Last updated: 2022-02-26. Bibliographically approved
3. An exchange algorithm for optimal calibration of items in computerized achievement tests
(English). Manuscript (preprint) (Other academic)
Abstract [en]

The importance of large-scale achievement tests, like national tests in school, eligibility tests for university, or international assessments for evaluation of students, is increasing. Pretesting of questions for such tests is done to determine characteristic properties of the questions by adding them to an ordinary achievement test. If computerized tests are used, it has been shown using optimal experimental design methods that it is efficient to assign pretest questions to examinees based on their abilities. We can consider the specific distribution of abilities of the available examinees and apply restricted optimal designs. A previously used algorithm optimizes the criterion directly. Here we develop a new algorithm which builds on an equivalence theorem. It discretizes the design space with the possibility to change the grid during the run, makes use of an exchange idea and filters computed designs. We illustrate how the algorithm works in some examples and how convergence can be checked. We show that this new algorithm can be used flexibly even if different models are assumed for different questions.
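
The exchange idea on a discretized ability grid can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of the manuscript: it is a plain greedy exchange heuristic that evaluates the criterion directly, does not use the equivalence theorem, and neither adapts the grid nor filters designs. The two-parameter logistic items, the standard normal ability distribution, the sum-of-log-determinants criterion and all names are assumptions made for illustration.

```python
import numpy as np

def info_2pl(theta, a, b):
    """Fisher information for (a, b) of an assumed 2PL item at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    grad = np.array([theta - b, -a])
    return p * (1.0 - p) * np.outer(grad, grad)

def criterion(assign, grid, mass, items):
    """Sum over items of log det of the information from the grid points assigned to it."""
    total = 0.0
    for i, (a, b) in enumerate(items):
        m = np.zeros((2, 2))
        for theta, w, k in zip(grid, mass, assign):
            if k == i:
                m += w * info_2pl(theta, a, b)
        sign, logdet = np.linalg.slogdet(m)
        if sign <= 0:
            return -np.inf  # this item cannot be calibrated from its grid points
        total += logdet
    return total

def greedy_exchange(grid, mass, items, max_sweeps=20):
    """Move single grid points between items while the criterion improves."""
    assign = np.arange(len(grid)) % len(items)  # alternating start design
    best = criterion(assign, grid, mass, items)
    for _ in range(max_sweeps):
        improved = False
        for j in range(len(grid)):
            current = assign[j]
            for k in range(len(items)):
                if k == current:
                    continue
                assign[j] = k
                value = criterion(assign, grid, mass, items)
                if value > best + 1e-12:
                    best, current, improved = value, k, True
            assign[j] = current  # keep the best item found for this grid point
        if not improved:
            break
    return assign, best

# Hypothetical setup: abilities roughly standard normal, two items with made-up parameters.
grid = np.linspace(-3.0, 3.0, 61)
mass = np.exp(-0.5 * grid**2)
mass /= mass.sum()
items = [(1.0, -1.0), (1.5, 1.0)]

start = np.arange(len(grid)) % len(items)
start_value = criterion(start, grid, mass, items)
final_assign, final_value = greedy_exchange(grid, mass, items)
print(start_value, final_value)
```

Because an exchange is accepted only when it increases the criterion, the final value can never be worse than the starting value. Whether the result is actually optimal would still have to be checked separately, for example with an equivalence theorem of the kind the abstract refers to.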

National category
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-174075 (URN)
Available from: 2019-10-02. Created: 2019-10-02. Last updated: 2022-02-26. Bibliographically approved
4. Optimal Calibration of Items for Multidimensional Achievement Tests
2024 (English). In: Journal of educational measurement, ISSN 0022-0655, E-ISSN 1745-3984, Vol. 61, no. 2, p. 274-302. Article in journal (Refereed). Published
Abstract [en]

Multidimensional achievement tests have recently been gaining importance in educational and psychological measurement. For example, multidimensional diagnostic tests can help students to determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for future multidimensional achievement tests, we use optimal design theory. We generalize a previously developed exchange algorithm for optimal design computation to the multidimensional setting. We also develop an asymptotic theorem stating which item should be calibrated by examinees with extreme abilities. For several examples, we compute the optimal design numerically with the exchange algorithm. We see clear structures in these results and explain them using the asymptotic theorem. Moreover, we investigate the performance of the optimal design in a simulation study.
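
The abstract does not state which multidimensional item response model is used. For orientation only, one common choice is the compensatory multidimensional two-parameter logistic model, written here with assumed notation: \boldsymbol{\theta}_j is the ability vector of examinee j, \mathbf{a}_i the discrimination vector of item i, and d_i its intercept.

```latex
P(X_{ij} = 1 \mid \boldsymbol{\theta}_j)
  = \frac{1}{1 + \exp\{-(\mathbf{a}_i^{\top}\boldsymbol{\theta}_j + d_i)\}}
```

Under such a model, calibrating item i means estimating \mathbf{a}_i and d_i, and the design question is which ability vectors \boldsymbol{\theta}_j should be paired with which item.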

Keywords
Achievement tests, exchange algorithm, item calibration, multidimensional item response model, optimal restricted design
National category
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-174077 (URN)
10.1111/jedm.12386 (DOI)
001184813000001 ()
2-s2.0-85187667942 (Scopus ID)
Available from: 2019-10-02. Created: 2019-10-02. Last updated: 2024-09-05. Bibliographically approved

Open Access in DiVA

Achievement tests and optimal design for pretesting of questions (1085 kB), 1722 downloads
File information
File: FULLTEXT01.pdf. File size: 1085 kB. Checksum (SHA-512):
52354cfeb6c7004a775f276311aa99772215300fe309c007d38853e329b14ddc25d4536fe10c4e746986047a5d403cd23e8b6c158911377d0b222580f9b20fdd
Type: fulltext. MIME type: application/pdf

Person

Ul Hassan, Mahmood
