Statistical Methods and Data Analysis

Item Response Theory: How Modern Tests Work

Every time you take a standardized test — an IQ assessment, a college entrance exam, a professional certification — the questions have been calibrated using sophisticated statistical models that most test-takers never learn about. Item Response Theory (IRT) is the mathematical framework behind virtually all modern psychological and educational testing, …

Addressing the Divide Between Psychology and Psychometrics
Statistical Methods and Data Analysis

Bridging Psychology and Psychometrics

In 2024, Psychometrika ran an unusual exchange. Three senior psychometricians — Klaas Sijtsma, Jules Ellis, and Denny Borsboom — published a focus article arguing that the humble sum score, the simple total of right-or-wrong answers on a test, is psychometrics’ greatest accomplishment and should remain central to practice. Two commentaries, …

Interpreting Differential Item Functioning with Response Process Data
Statistical Methods and Data Analysis

Differential Item Functioning and Response Process

A test item that functions differently for two groups of equally able examinees is said to exhibit differential item functioning (DIF), and identifying these items is now a routine part of large-scale assessment quality control. The hard part has never been detection: statistical tests for DIF have been …

Integrating SDT and IRT Models for Mixed-Format Exams
Statistical Methods and Data Analysis

Lawrence T. DeCarlo’s recent article introduces a psychological framework for mixed-format exams, combining signal detection theory (SDT) for multiple-choice items and item response theory (IRT) for open-ended items. This fusion allows for a unified model that captures the nuances of each item type while providing insights into the underlying cognitive …

Rotation Local Solutions in Multidimensional Item Response Models
Statistical Methods and Data Analysis

Rotation Local Solutions in Multidimensional IRT

Multidimensional item response theory (MIRT) extends unidimensional models such as the 2PL or 3PL to test items that load on more than one latent trait. Once a model has more than one factor, the factor solution is not unique: any rotation of the factor axes produces an equivalent fit, so an …
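
The rotation point can be checked numerically. The sketch below assumes a compensatory two-dimensional 2PL, P(correct) = logistic(a · θ + d), which is one common MIRT form and not necessarily the one the article uses. The function names and all numeric values are hypothetical, chosen only for illustration. Rotating the discrimination vector and the trait vector by the same angle leaves the response probability unchanged.

```python
import math


def rotate(vec, angle):
    """Rotate a 2-D vector counterclockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (c * x - s * y, s * x + c * y)


def m2pl_prob(a, theta, d):
    """Compensatory 2-D 2PL (assumed form): logistic(a . theta + d)."""
    logit = a[0] * theta[0] + a[1] * theta[1] + d
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical item and examinee values, for illustration only.
a = (1.2, 0.5)        # item discriminations on the two traits
theta = (0.3, -0.8)   # examinee position on the two traits
d = -0.4              # item intercept
angle = math.radians(35)

p_original = m2pl_prob(a, theta, d)
# Apply the same orthogonal rotation to loadings and traits.
p_rotated = m2pl_prob(rotate(a, angle), rotate(theta, angle), d)

# The two probabilities agree up to floating-point error.
print(abs(p_original - p_rotated) < 1e-12)
```

Because the inner product a · θ is preserved under any orthogonal rotation, the likelihood cannot distinguish the rotated solution from the original one.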

Group-Theoretical Symmetries in Item Response Theory (IRT)
Statistical Methods and Data Analysis

Group-Theoretic Symmetries in Item Response Theory

Item response theory (IRT) parameters are not unique. Different parameterizations of the same model fit the data identically, and the choice between them is settled by convention rather than discovered from the data. The standard fixes — anchoring the latent-trait scale, fixing one item’s parameters, or imposing identification constraints during …

Theoretical Framework for Bayesian Hierarchical 2PLM with ADVI
Statistical Methods and Data Analysis

Bayesian Hierarchical 2PLM with ADVI

Calibrating a two-parameter logistic (2PL) item response theory model on a small or sparse dataset is a recurring practical problem. Maximum-likelihood estimators give unstable estimates with wide standard errors when there are few respondents per item, and they offer no principled way to share information across items or examinee subgroups. …

Evaluating Coefficient Alpha and Alternatives in Non-Normal Data
Statistical Methods and Data Analysis

Coefficient Alpha and Alternatives in Non-Normal Data

Cronbach’s coefficient alpha is the most reported reliability statistic in psychology and educational measurement. It is also one of the most misunderstood. The classical formula assumes that test items measure a single construct with equal factor loadings (tau-equivalence), uncorrelated errors, and continuously distributed scores. Real psychological measurement rarely meets all three assumptions: …
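
For reference, the classical formula is alpha = k/(k-1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The following is a minimal sketch of that textbook formula using sample variances. The function name `cronbach_alpha` and the toy score matrices are hypothetical and are not taken from the article.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Classical coefficient alpha.

    `scores` is a list of respondents, each a list of k item scores.
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (one column per item).
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the sum score across respondents.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): two items that move together perfectly,
# and two items that are uncorrelated.
parallel_items = [[1, 1], [2, 2], [3, 3]]
uncorrelated_items = [[0, 0], [0, 1], [1, 0], [1, 1]]

print(cronbach_alpha(parallel_items))
print(cronbach_alpha(uncorrelated_items))
```

The first toy matrix gives an alpha of 1 and the second an alpha of 0 (up to floating-point rounding), which brackets the usual range. The sketch says nothing about how alpha behaves when the assumptions listed above fail.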

Refining Reliability with Attenuation-Corrected Estimators
Statistical Methods and Data Analysis

Attenuation-Corrected Reliability Estimators

Most psychometrics textbooks teach the classical “correction for attenuation” — Spearman’s century-old technique for estimating what the correlation between two psychological constructs would be if the tests measuring them were perfectly reliable. The technique is simple: divide the observed correlation by the square root of the product of the two …
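
A minimal sketch of the classical correction, assuming the standard Spearman form in which the observed correlation is divided by the square root of the product of the two tests' reliabilities. The function name `disattenuate` and the example numbers are hypothetical.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation (standard textbook form).

    r_xy  : observed correlation between the two test scores
    rel_x : reliability of the first test, in (0, 1]
    rel_y : reliability of the second test, in (0, 1]
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical example: observed r = 0.40, both reliabilities 0.80.
corrected = disattenuate(0.40, 0.80, 0.80)
print(round(corrected, 3))
```

With perfectly reliable tests (both reliabilities equal to 1) the function returns the observed correlation unchanged. Lower reliabilities inflate the corrected value, and with noisy reliability estimates it can exceed 1.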

Decoding Prior Sensitivity in Bayesian Structural Equation Modeling for Sparse Factor Loading Structures
Statistical Methods and Data Analysis

Bayesian SEM Prior Sensitivity

The standard confirmatory factor analysis (CFA) machinery requires the analyst to commit, in advance, to which cross-loadings are exactly zero. This independent-clusters assumption is convenient for identification but rarely correct: most psychological measures have small but nonzero cross-loadings, and forcing them to zero produces systematic misfit that propagates into biased …