Publications (10 of 31)
Bjermo, J., Fackle-Fornius, E. & Miller, F. (2025). Optimizing calibration designs with uncertainty in abilities. British Journal of Mathematical & Statistical Psychology
2025 (English). In: British Journal of Mathematical & Statistical Psychology, ISSN 0007-1102, E-ISSN 2044-8317. Article in journal (Refereed). Epub ahead of print
Abstract [en]

Before items can be implemented in a test, the item characteristics need to be calibrated through pretesting. To achieve high-quality tests, it is crucial to maximize the precision of the estimates obtained during item calibration. Higher precision can be attained if calibration items are allocated to examinees based on their individual abilities. Methods from optimal experimental design can be used to derive an optimal ability-matched calibration design. However, such an optimal design assumes that the examinees' abilities are known. In practice, the abilities are unknown and are estimated from a limited number of operational items. We develop the theory for properly handling the uncertainty in abilities and show how the optimal calibration design can be derived when this uncertainty is taken into account. We demonstrate that the derived designs are more robust when the uncertainty in abilities is acknowledged. Additionally, the method has been implemented in the R package optical.
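The design criterion behind this work rests on the Fisher information that a calibration item provides at a given ability. As a hedged illustration (not code from the article or from the optical package; the model is the standard two-parameter logistic (2PL) item response model, and all parameter values are invented), the sketch below compares the item information at a point ability with the information averaged over a normal distribution representing uncertainty in an estimated ability:

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

def expected_information(theta_hat, se, a, b, n_grid=201):
    """Item information averaged over a normal ability distribution centered
    at the estimated ability theta_hat with standard error se -- one
    illustrative way to acknowledge uncertainty in abilities."""
    grid = np.linspace(theta_hat - 4 * se, theta_hat + 4 * se, n_grid)
    w = np.exp(-0.5 * ((grid - theta_hat) / se) ** 2)
    w /= w.sum()
    return float(np.sum(w * item_information(grid, a, b)))

# At the item's difficulty the point information is maximal (a^2 / 4);
# averaging over ability uncertainty shrinks it.
point = item_information(0.0, a=1.5, b=0.0)
smoothed = expected_information(0.0, se=0.5, a=1.5, b=0.0)
print(point, smoothed)
```

Averaging flattens the information curve, which is one intuition for why designs that acknowledge ability uncertainty turn out to be more robust.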

Keywords
ability, computerized adaptive tests, item calibration, optimal experimental design
National Category
Statistics in Social Sciences
Identifiers
urn:nbn:se:su:diva-242416 (URN), 10.1111/bmsp.12387 (DOI), 2-s2.0-105000444923 (Scopus ID)
Available from: 2025-04-23 Created: 2025-04-23 Last updated: 2025-04-23
Ul Hassan, M. & Miller, F. (2024). Optimal Calibration of Items for Multidimensional Achievement Tests. Journal of Educational Measurement, 61(2), 274-302
2024 (English). In: Journal of Educational Measurement, ISSN 0022-0655, E-ISSN 1745-3984, Vol. 61, no. 2, p. 274-302. Article in journal (Refereed). Published
Abstract [en]

Multidimensional achievement tests have recently been gaining importance in educational and psychological measurement. For example, multidimensional diagnostic tests can help students determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for future multidimensional achievement tests, we use optimal design theory. We generalize a previously developed exchange algorithm for optimal design computation to the multidimensional setting. We also develop an asymptotic theorem that identifies which items should be calibrated by examinees with extreme abilities. For several examples, we compute the optimal design numerically with the exchange algorithm. We see clear structures in these results and explain them using the asymptotic theorem. Moreover, we investigate the performance of the optimal design in a simulation study.

Keywords
Achievement tests, exchange algorithm, item calibration, multidimensional item response model, optimal restricted design
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-174077 (URN), 10.1111/jedm.12386 (DOI), 001184813000001 (), 2-s2.0-85187667942 (Scopus ID)
Available from: 2019-10-02 Created: 2019-10-02 Last updated: 2024-09-05. Bibliographically approved
Miller, F. & Fackle-Fornius, E. (2024). Parallel Optimal Calibration of Mixed-Format Items for Achievement Tests. Psychometrika, 89(3), 903-928
2024 (English). In: Psychometrika, ISSN 0033-3123, E-ISSN 1860-0980, Vol. 89, no. 3, p. 903-928. Article in journal (Refereed). Published
Abstract [en]

When large achievement tests are conducted regularly, items need to be calibrated before being used as operational items in a test. Methods have been developed to optimally assign pretest items to examinees based on their abilities. Most of these methods, however, are intended for situations where examinees arrive sequentially to be assigned to calibration items. In several calibration tests, examinees instead take the test simultaneously, in parallel. In this article, we develop an optimal calibration design tailored to such parallel test setups. Our objective is both to investigate the efficiency gain of the method and to demonstrate that it can be implemented in real calibration scenarios. For the latter, we have employed the method to calibrate items for the Swedish national tests in Mathematics. In this case study, as in many real test situations, items are of mixed format, and the optimal design method needs to handle that. The method we propose works for mixed-format tests and accounts for varying expected response times. Our investigations show that the proposed method considerably enhances calibration efficiency.

Keywords
Achievement tests, Calibration, Mixed-format items, Optimal design, Swedish national test
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-229049 (URN), 10.1007/s11336-024-09968-3 (DOI), 001204673600001 (), 38619664 (PubMedID), 2-s2.0-85190464106 (Scopus ID)
Available from: 2024-05-20 Created: 2024-05-20 Last updated: 2025-02-13. Bibliographically approved
Tsirpitzi, R. E., Miller, F. & Burman, C.-F. (2023). Robust optimal designs using a model misspecification term. Metrika (Heidelberg), 86, 781-804
2023 (English). In: Metrika (Heidelberg), ISSN 0026-1335, E-ISSN 1435-926X, Vol. 86, p. 781-804. Article in journal (Refereed). Published
Abstract [en]

Much of classical optimal design theory relies on specifying a model with only a small number of parameters. In many applications, such models will give reasonable approximations. However, they will often be found not to be entirely correct when enough data are at hand. A property of classical optimal design methodology is that the amount of data does not influence the design when a fixed model is used. However, it is reasonable that a low-dimensional model is satisfactory only if limited data are available. With more data available, more aspects of the underlying relationship can be assessed. We consider a simple model that is not thought to be fully correct. The model misspecification, that is, the difference between the true mean and the simple model, is explicitly modeled with a stochastic process. This gives a unified approach for handling situations with both limited and rich data. Our objective is to estimate the combined model, which is the sum of the simple model and the assumed misspecification process. In our situation, the low-dimensional model can be viewed as a fixed effect and the misspecification term as a random effect in a mixed-effects model. Our aim is to predict within this model. We describe how we minimize the prediction error using an optimal design. We compute optimal designs for the full model in different cases. The results confirm that the optimal design depends strongly on the sample size. In low-information situations, traditional optimal designs for models with a small number of parameters are sufficient, while the inclusion of the misspecification term leads to very different designs in data-rich cases.
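An illustrative sketch of the combined model described above: a linear fixed part plus a Gaussian-process misspecification term, with the universal-kriging prediction variance computed for two candidate designs. The squared-exponential kernel, length-scale, designs, and noise level are all invented for illustration and are not taken from the article:

```python
import numpy as np

def sq_exp_kernel(x1, x2, var=1.0, ell=0.3):
    """Squared-exponential covariance for the misspecification process Z."""
    d = np.subtract.outer(x1, x2)
    return var * np.exp(-0.5 * (d / ell) ** 2)

def prediction_variance(x_design, x_new, noise_var, reps=1):
    """Universal-kriging variance for the combined model (intercept + slope
    as fixed effects, GP misspecification as random effect) at x_new.
    `reps` replicates each design point, mimicking a larger sample."""
    F = np.column_stack([np.ones_like(x_design), x_design])
    C = sq_exp_kernel(x_design, x_design) + (noise_var / reps) * np.eye(len(x_design))
    k = sq_exp_kernel(x_design, x_new)                  # cross-covariances (n, m)
    f = np.column_stack([np.ones_like(x_new), x_new])
    Ci_k = np.linalg.solve(C, k)
    Ci_F = np.linalg.solve(C, F)
    # BLUP variance = prior var - data term + correction for estimating beta
    u = f.T - F.T @ Ci_k
    M = F.T @ Ci_F
    return (sq_exp_kernel(x_new, x_new).diagonal()
            - np.sum(k * Ci_k, axis=0)
            + np.sum(u * np.linalg.solve(M, u), axis=0))

x_new = np.linspace(0, 1, 11)
two_point = np.array([0.0, 1.0, 0.0, 1.0])   # classical two-point design (replicated)
spread = np.array([0.0, 1/3, 2/3, 1.0])      # design spread over the region
v2 = prediction_variance(two_point, x_new, noise_var=1.0, reps=50).mean()
vs = prediction_variance(spread, x_new, noise_var=1.0, reps=50).mean()
print(v2, vs)
```

Replicating observations (`reps`) mimics a growing sample size; in this data-rich setting, the design spread over the region predicts the combined model better than the classical two-point design, echoing the conclusion that the optimal design depends strongly on the sample size.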

Keywords
Fedorov algorithm, Gaussian process, Mixed-effects model, Optimal experimental design, Statistical modelling
National Category
Mathematics
Identifiers
urn:nbn:se:su:diva-215443 (URN)10.1007/s00184-023-00893-6 (DOI)000928574600001 ()2-s2.0-85147564000 (Scopus ID)
Available from: 2023-03-16 Created: 2023-03-16 Last updated: 2023-10-09Bibliographically approved
ul Hassan, M. & Miller, F. (2022). Discrimination with unidimensional and multidimensional item response theory models for educational data. Communications in Statistics - Simulation and Computation, 51(6), 2992-3012
2022 (English). In: Communications in Statistics - Simulation and Computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 51, no. 6, p. 2992-3012. Article in journal (Refereed). Published
Abstract [en]

Achievement tests are used to characterize the proficiency of higher-education students. Item response theory (IRT) models are applied to these tests to estimate the ability of students (as a latent variable in the model). For quality IRT parameters to be estimated, especially ability parameters, it is important that the appropriate number of dimensions is identified. Through a case study based on a statistics exam for students in higher education, we show how dimensions and other model parameters can be chosen in a real situation. Our model choice is based both on empirical evidence and on background knowledge of the test. We show that dimensionality influences the estimates of the item parameters, especially the discrimination parameter, which provides information about the quality of the item. We perform a simulation study to generalize our conclusions. Both the simulation study and the case study show that multidimensional models have the advantage of better discriminating between examinees. We conclude from the simulation study that it is safer to use a multidimensional model than a unidimensional one when it is unknown which model is correct.

Keywords
Achievement tests, Discrimination, Multidimensional four parameter logistic model, Multidimensional graded response model, Multidimensional item response theory
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-177401 (URN), 10.1080/03610918.2019.1705344 (DOI), 000504938300001 (), 2-s2.0-85078600822 (Scopus ID)
Available from: 2020-01-05 Created: 2020-01-05 Last updated: 2022-09-27. Bibliographically approved
Ul Hassan, M. & Miller, F. (2021). An exchange algorithm for optimal calibration of items in computerized achievement tests. Computational Statistics & Data Analysis, 157, Article ID 107177.
2021 (English). In: Computational Statistics & Data Analysis, ISSN 0167-9473, E-ISSN 1872-7352, Vol. 157, article id 107177. Article in journal (Refereed). Published
Abstract [en]

The importance of large-scale achievement tests, like national tests in school, eligibility tests for university, or international assessments for the evaluation of students, is increasing. Pretesting of questions for such tests is done to determine characteristic properties of the questions by adding them to an ordinary achievement test. If computerized tests are used, it has been shown using optimal experimental design methods that it is efficient to assign pretest questions to examinees based on their abilities. The specific distribution of abilities of the available examinees is considered and restricted optimal designs are applied. A new algorithm is developed which builds on an equivalence theorem. It discretizes the design space with the possibility to change the grid adaptively during the run, makes use of an exchange idea, and filters computed designs. How the algorithm works, and how convergence can be checked, is illustrated through examples. The new algorithm is flexible and can be used even if different models are assumed for different questions.
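The article's algorithm handles restricted designs for item response models; the toy sketch below shows only the underlying exchange idea, applied to D-optimal design for a quadratic regression model on a discretized design space (a standard textbook example, not the article's setting):

```python
import numpy as np

def info_matrix(x):
    """Normalized information matrix for quadratic regression f(x) = (1, x, x^2)."""
    F = np.column_stack([np.ones_like(x), x, x**2])
    return F.T @ F / len(x)

def exchange_d_optimal(grid, n_points, n_iter=200, seed=0):
    """A basic exchange algorithm: repeatedly swap one support point for the
    grid candidate that increases the determinant of the information matrix,
    stopping when no single exchange improves the design."""
    rng = np.random.default_rng(seed)
    design = rng.choice(grid, size=n_points, replace=False)
    best_det = np.linalg.det(info_matrix(design))
    for _ in range(n_iter):
        improved = False
        for i in range(n_points):
            for cand in grid:
                trial = design.copy()
                trial[i] = cand
                d = np.linalg.det(info_matrix(trial))
                if d > best_det + 1e-12:
                    design, best_det, improved = trial, d, True
        if not improved:
            break
    return np.sort(design), best_det

grid = np.linspace(-1, 1, 41)
design, det_val = exchange_d_optimal(grid, n_points=3)
print(design)  # the D-optimal design for quadratic regression is {-1, 0, 1}
```

The article's contribution goes beyond this: an adaptive grid, a filtering step for computed designs, and restricted designs verified via an equivalence theorem.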

Keywords
Achievement tests, Computerized tests, Exchange algorithm, Experimental design, Item response model, Optimal restricted design
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-190950 (URN), 10.1016/j.csda.2021.107177 (DOI), 000620292000002 ()
Available from: 2021-03-04 Created: 2021-03-04 Last updated: 2022-02-25. Bibliographically approved
Bjermo, J. & Miller, F. (2021). Efficient Estimation of Mean Ability Growth Using Vertical Scaling. Applied Measurement in Education, 34(3), 163-178
2021 (English). In: Applied Measurement in Education, ISSN 0895-7347, E-ISSN 1532-4818, Vol. 34, no. 3, p. 163-178. Article in journal (Refereed). Published
Abstract [en]

In recent years, interest in measuring growth in student ability in various subjects between different grades in school has increased. Therefore, good precision in the estimated growth is important. This paper aims to compare estimation methods and test designs with respect to the precision and bias of the estimated growth in mean ability between two groups of students that differ substantially. This comparison is performed through a simulation study. One- and two-parameter item response models are assumed, and the estimated abilities are vertically scaled using the non-equivalent anchor test design by estimating the abilities in one single run, so-called concurrent calibration. The connection between the test design and the Fisher information is also discussed. The results indicate that the expected a posteriori estimation method is preferred when estimating differences in mean ability between groups. Results also indicate that a test design with common items of medium difficulty leads to better precision, which coincides with previous results from horizontal equating.
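The expected a posteriori (EAP) method favored in this study can be sketched with simple numerical quadrature: the ability estimate is the posterior mean of theta given the item responses. The sketch below uses a 2PL model and a standard normal prior; the item parameters and response patterns are invented for illustration:

```python
import numpy as np

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def eap_ability(responses, a, b, prior_mean=0.0, prior_sd=1.0, n_grid=121):
    """Expected a posteriori (EAP) ability estimate via quadrature on a grid:
    the posterior mean of theta given a vector of 0/1 item responses."""
    theta = np.linspace(prior_mean - 5 * prior_sd, prior_mean + 5 * prior_sd, n_grid)
    prior = np.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
    like = np.ones_like(theta)
    for y, ai, bi in zip(responses, a, b):
        p = p_2pl(theta, ai, bi)
        like *= p if y == 1 else (1.0 - p)   # likelihood of the observed response
    post = prior * like
    post /= post.sum()
    return float(np.sum(post * theta))

a = np.array([1.0, 1.2, 0.8, 1.5])    # invented discrimination parameters
b = np.array([-1.0, 0.0, 0.5, 1.0])   # invented difficulty parameters
all_right = eap_ability([1, 1, 1, 1], a, b)
all_wrong = eap_ability([0, 0, 0, 0], a, b)
print(all_right, all_wrong)
```

Because the prior shrinks estimates toward the group mean, EAP stays well-behaved even for extreme response patterns, which helps when the quantity of interest is a group-level mean difference.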

National Category
Educational Sciences, Mathematics
Identifiers
urn:nbn:se:su:diva-195839 (URN), 10.1080/08957347.2021.1933981 (DOI), 000661773400001 ()
Available from: 2021-08-26 Created: 2021-08-26 Last updated: 2022-02-25. Bibliographically approved
Tsirpitzi, R. E. & Miller, F. (2021). Optimal dose-finding for efficacy-safety models. Biometrical Journal, 63(6), 1185-1201
2021 (English). In: Biometrical Journal, ISSN 0323-3847, E-ISSN 1521-4036, Vol. 63, no. 6, p. 1185-1201. Article in journal (Refereed). Published
Abstract [en]

Dose-finding is an important part of the clinical development of a new drug. The purpose of dose-finding studies is to determine a suitable dose for future development based on both efficacy and safety. Optimal experimental designs have already been used to plan this kind of study; however, such designs often focus on efficacy only. We consider an efficacy-safety model, which is a simplified version of the bivariate Emax model. We use the clinical utility index concept, which provides the desired balance between efficacy and safety. Maximizing the utility of the patients yields the estimated dose and leads us to locally c-optimal designs. An algebraic solution for c-optimal designs is determined for arbitrary c vectors using a multivariate version of Elfving's method. The solution shows that the expected therapeutic index of the drug is a key quantity determining the number of doses, the doses themselves, and their weights in the optimal design. A sequential design is proposed to resolve the complication of parameter dependency, and it is illustrated in a simulation study.
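A minimal sketch of the clinical utility idea, using separate univariate Emax curves for efficacy and safety rather than the article's bivariate Emax model; all parameter values and the penalty weight are invented for illustration:

```python
import numpy as np

def emax(dose, e0, emax_max, ed50):
    """Emax model for a dose-response relationship."""
    return e0 + emax_max * dose / (ed50 + dose)

def utility(dose, eff_params, saf_params, penalty):
    """A simple clinical utility index: efficacy minus a penalized
    safety response, each modeled with its own Emax curve."""
    return emax(dose, *eff_params) - penalty * emax(dose, *saf_params)

doses = np.linspace(0, 100, 1001)
eff = (0.0, 1.0, 10.0)   # invented efficacy parameters: ED50 = 10
saf = (0.0, 1.0, 50.0)   # invented safety parameters: ED50 = 50
u = utility(doses, eff, saf, penalty=0.8)
best_dose = doses[np.argmax(u)]
print(best_dose, u.max())
```

With these assumed parameters, the utility d/(10+d) - 0.8*d/(50+d) peaks at d = 30, illustrating how the balance between the two curves, rather than efficacy alone, determines the estimated dose; the ratio of the two ED50 values plays the role of a therapeutic index.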

Keywords
bivariate model, dose-finding, Elfving's method, optimal design, sequential design
National Category
Pharmaceutical Sciences, Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-194263 (URN), 10.1002/bimj.202000181 (DOI), 000637704200001 (), 33829555 (PubMedID)
Available from: 2021-06-17 Created: 2021-06-17 Last updated: 2022-02-25. Bibliographically approved
Ul Hassan, M. & Miller, F. (2019). Optimal Item Calibration for Computerized Achievement Tests. Psychometrika, 84(4), 1101-1128
2019 (English). In: Psychometrika, ISSN 0033-3123, E-ISSN 1860-0980, Vol. 84, no. 4, p. 1101-1128. Article in journal (Refereed). Published
Abstract [en]

Item calibration is a technique to estimate the characteristics of questions (called items) for achievement tests. In computerized tests, item calibration is an important tool for maintaining, updating and developing new items for an item bank. To efficiently sample examinees with specific ability levels for this calibration, we use optimal design theory, assuming that the probability of answering correctly follows an item response model. Locally optimal unrestricted designs usually have only a few design points for ability. In practice, it is hard to sample examinees from a population at these specific ability levels due to unavailability or limited availability of examinees. To counter this problem, we use the concept of optimal restricted designs and show that this concept fits naturally to item calibration. We prove an equivalence theorem needed to verify the optimality of a design. Locally optimal restricted designs provide intervals of ability levels for the optimal calibration of an item. Assuming a two-parameter logistic model, several scenarios with D-optimal restricted designs are presented for the calibration of a single item and the simultaneous calibration of several items. These scenarios show that the naive way of sampling examinees around unrestricted design points is not optimal.
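The flavor of a locally D-optimal unrestricted two-point design for a single 2PL item can be shown by brute force; this toy grid search is not the article's method (which derives optimal restricted designs via an equivalence theorem), and the item parameters are invented:

```python
import numpy as np

def info_2pl(theta, a, b):
    """Per-examinee Fisher information matrix for the item parameters (a, b)
    of a 2PL item answered at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    w = p * (1.0 - p)
    d = theta - b
    return w * np.array([[d * d, -a * d], [-a * d, a * a]])

def best_two_point(a, b, grid):
    """Locally D-optimal unrestricted two-point design (equal weights) for
    one item, found by exhaustive search over pairs of grid abilities."""
    best = (None, -np.inf)
    for i, t1 in enumerate(grid):
        for t2 in grid[i + 1:]:
            M = 0.5 * info_2pl(t1, a, b) + 0.5 * info_2pl(t2, a, b)
            d = np.linalg.det(M)
            if d > best[1]:
                best = ((t1, t2), d)
    return best

grid = np.linspace(-4, 4, 161)
(t1, t2), det_val = best_two_point(a=1.0, b=0.0, grid=grid)
print(t1, t2)
```

For an item with b = 0, the two support points come out symmetric around the difficulty. Sampling examinees exactly at such isolated abilities is rarely feasible, which is precisely the motivation for the restricted designs with ability intervals studied in the article.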

Keywords
achievement tests, computerized tests, item calibration, optimal restricted design, two-parameter logistic model
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:su:diva-169646 (URN), 10.1007/s11336-019-09673-6 (DOI), 000492593800010 ()
Available from: 2019-06-12 Created: 2019-06-12 Last updated: 2022-02-26. Bibliographically approved
Miller, F. & Burman, C.-F. (2018). A decision theoretical modeling for Phase III investments and drug licensing. Journal of Biopharmaceutical Statistics, 28(4), 698-721
2018 (English). In: Journal of Biopharmaceutical Statistics, ISSN 1054-3406, E-ISSN 1520-5711, Vol. 28, no. 4, p. 698-721. Article in journal (Refereed). Published
Abstract [en]

For a new candidate drug to become an approved medicine, several decision points have to be passed. In this article, we focus on two of them: First, based on Phase II data, the commercial sponsor decides to invest (or not) in Phase III. Second, based on the outcome of Phase III, the regulator determines whether the drug should be granted market access. Assuming a population of candidate drugs with a distribution of true efficacy, we optimize the two stakeholders' decisions and study the interdependence between them. The regulator is assumed to seek to optimize the total public health benefit resulting from the efficacy of the drug and a safety penalty. In optimizing the regulatory rules, in terms of minimal required sample size and the Type I error in Phase III, we have to consider how these rules will modify the commercial optimization made by the sponsor. The results indicate that different Type I errors should be used depending on the rarity of the disease.

Keywords
Clinical trials, drug regulation, optimal Type I error, rare diseases, sample size
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:su:diva-157844 (URN), 10.1080/10543406.2017.1377729 (DOI), 000434668800008 (), 28920757 (PubMedID)
Available from: 2018-06-26 Created: 2018-06-26 Last updated: 2022-02-26. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-4161-7851