Psychological Measurement and Testing

Assessing Verbal Intelligence with the IAW Test

Published: April 7, 2023

Most vocabulary tests on standard intelligence batteries — the WAIS, the Stanford-Binet, the RIAS — present examinees with a target word and ask them to define it or pick the best definition from a multiple-choice list. The format is efficient, easy to score, and has decades of psychometric research behind it. It is also, in important ways, an unnatural cognitive task. Real verbal ability rarely involves selecting one of four definitions; it involves retrieving the right word for the meaning you have in mind. The I Am a Word (IAW) test, developed by Xavier Jouve and most recently revised in 2023, takes the inverse approach: present the meaning, the structural constraints, and the context, and ask the examinee to produce the word. The format change is small in description but has real psychometric and conceptual consequences.

What the IAW test actually does

The IAW test consists of 100 open-ended verbal items administered without a time limit. Each item presents structural and semantic clues — typically a definition or sentence-frame plus indications of word length, part of speech, or letter constraints — and the examinee types the target word. Several design choices distinguish it from conventional vocabulary measures:

  • Production rather than recognition. The examinee must retrieve the word from semantic memory, not select among presented alternatives. This eliminates the four-options-in-front-of-you cueing artifact that affects multiple-choice vocabulary scores.
  • No time pressure. The untimed format separates the construct of “verbal knowledge” from “verbal speed” — two related but distinguishable abilities that timed multiple-choice formats blend.
  • Multiple correct answers per item. Where natural language admits several semantically appropriate words for a single meaning slot, the scoring accepts any of the validated synonyms, reflecting how lexical access actually operates.
  • Automated scoring. Despite being open-ended, scoring is automated against a curated answer set, retaining the reliability advantages of objective scoring while accommodating linguistic flexibility.
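
The combination of the last two points can be sketched in a few lines. The following is a minimal illustration of scoring an open-ended response against a validated answer set, assuming simple string normalization; the item, the clue, and the synonym set are invented for illustration and are not drawn from the actual IAW item bank or its scoring pipeline.

```python
# Sketch of automated scoring for an open-ended vocabulary item,
# assuming each item carries a curated set of accepted synonyms.
# Item content and answer set below are invented for illustration.

def normalize(response: str) -> str:
    """Case-fold and strip whitespace so 'Ephemeral ' matches 'ephemeral'."""
    return response.strip().casefold()

def score_item(response: str, accepted: set[str]) -> int:
    """Return 1 if the normalized response is in the validated answer set."""
    return 1 if normalize(response) in accepted else 0

# Hypothetical item: "lasting a very short time (adjective)"
accepted = {"ephemeral", "evanescent", "transient"}

print(score_item("Ephemeral ", accepted))  # 1
print(score_item("temporary", accepted))   # 0
```

A production scorer would also need to handle spelling tolerance and morphological variants, but the core design choice is the same: objective set membership rather than human judgment.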

The validation work, reported in Jouve (2023) using a sample of 1,083 examinees, found a Cronbach’s alpha of .95 for internal consistency and a correlation of .83 with the Wechsler Adult Intelligence Scale–Third Edition (WAIS-III) Verbal Comprehension Index, as well as a strong correlation with the Reynolds Intellectual Assessment Scales (RIAS) Verbal Intelligence Index. These numbers locate the IAW in the same psychometric neighborhood as established verbal-ability subtests of major commercial batteries.
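
For readers who want to see what is behind the .95 figure, Cronbach’s alpha is computed directly from a persons-by-items score matrix. The sketch below uses the standard formula on a tiny invented 4-person, 3-item matrix; the IAW’s reported alpha comes from its full 100-item, 1,083-person sample.

```python
# Cronbach's alpha from a persons-by-items score matrix, using the
# standard formula: alpha = k/(k-1) * (1 - sum(item variances) / total variance).
# The 4x3 toy matrix below is invented for illustration.
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: list of per-person item-score lists (0/1 here)."""
    k = len(scores[0])                        # number of items
    items = list(zip(*scores))                # transpose to per-item columns
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

toy = [[1, 1, 1],
       [1, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
print(round(cronbach_alpha(toy), 2))  # 0.75
```

Adding items that hang together with the rest pushes alpha upward, which is one reason a 100-item test can reach .95 while a 3-item toy sits at .75.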

The format question: why open-ended matters

The choice between multiple-choice and open-ended verbal testing is older than commercial intelligence assessment. Heim and Watts’s 1967 experiment in the British Journal of Educational Psychology directly compared the two formats on the same vocabulary content with the same examinees and reported that the formats produce non-trivially different score patterns. Multiple-choice items show systematic cueing effects: examinees can sometimes identify the correct definition by elimination of distractors, by partial recognition of the target word, or by drawing on test-wiseness rather than lexical knowledge. Open-ended formats remove these supports and arguably tap a more authentic representation of vocabulary depth.

The trade-off is in scoring: open-ended responses are harder to score reliably, particularly when the response space is large. Historically, this is why high-stakes commercial vocabulary tests have favored multiple-choice. The IAW’s automated scoring against a validated synonym set is the modern resolution to this trade-off — open-ended response acceptance with multiple-choice-grade scoring reliability.

What “verbal intelligence” actually means

Vocabulary measures are among the most reliable and most g-loaded of any cognitive task. Stanovich’s 1993 chapter in Advances in Child Development and Behavior argues that vocabulary functions as a particularly informative cognitive measure because it is, in effect, a distillation of accumulated language exposure. A child or adult’s working vocabulary reflects a long history of reading, conversation, and instruction, with each individual word a small data point in that history.

That gives vocabulary an unusual property among cognitive tests: it is highly stable, highly reliable, and substantially heritable, while also being directly responsive to environmental enrichment. The IAW, like other vocabulary measures, inherits this property. A high IAW score does not testify to fluid problem-solving in the moment but to a long arc of language exposure and retention.

In the Cattell-Horn-Carroll (CHC) framework, vocabulary tests load primarily on comprehension-knowledge (Gc), the broad ability that encompasses verbal-conceptual knowledge accumulated over time. Production-format vocabulary tests like the IAW are sometimes hypothesized to also tap long-term retrieval (Glr) components more strongly than recognition-format tests, because retrieval rather than recognition is the operative process. The IAW’s untimed format separates Glr efficiency from Glr access, which traditional speeded vocabulary tests blend.

How the IAW compares to standard battery vocabulary subtests

The validation correlation of .83 with the WAIS-III VCI is informative for what it implies about construct overlap. Two tests that correlate at .83 are measuring substantially the same underlying construct — they are not independent measures with weak overlap. Practical implications:

  • The IAW is unlikely to identify cognitive strengths or weaknesses that the WAIS VCI misses. Both tests are measuring essentially the same verbal-knowledge factor.
  • The IAW can serve as a stand-alone verbal-ability indicator. Where a full WAIS administration is impractical and only verbal ability is needed, the IAW’s psychometrics support its use as a faster, simpler alternative.
  • Score discrepancies between the IAW and a battery VCI deserve scrutiny. If a child or adult scores substantially higher on the IAW than on the WAIS VCI, the gap may reflect the format difference (production-friendly vs. recognition-friendly) and points toward strengths in retrieval or comfort with open-ended response formats.

The Reynolds Intellectual Assessment Scales (RIAS), discussed by Brueggemann, Reynolds, and Kamphaus (2006) in Gifted Education International, is one of several contemporary intelligence batteries that produce a Verbal Intelligence Index distinct from a nonverbal index. The RIAS VIX was the second criterion measure in the IAW validation, and the strong correlation with the RIAS VIX further supports the IAW’s positioning as a verbal-intelligence measure rather than a domain-specific vocabulary task.

Practical applications

Several use cases follow from the IAW’s psychometric properties:

  • Research where verbal ability is a covariate. Studies needing to control for or characterize verbal ability without administering a full IQ battery can use the IAW as a screening or matching variable.
  • Self-testing and educational settings. The untimed, open-ended format reduces the test-anxiety component that compresses scores in multiple-choice timed administration. Examinees self-pacing through open-ended items often perform at a level closer to their underlying ability.
  • Cross-population assessment. Open-ended production removes the reliance on familiarity with the multiple-choice testing format that disadvantages some examinees. The IAW is correspondingly more accessible to individuals with limited test-taking experience.
  • Tracking verbal development. Because vocabulary grows with continued language exposure, the IAW can be used to track changes in verbal ability over time, with the caveat that practice effects on the same items should be considered for short-interval retesting.

What the IAW does not do

Several boundaries on interpretation:

  • It is a verbal-knowledge measure, not a general intelligence measure. A high IAW score does not directly imply high fluid reasoning, working memory, or processing speed. Verbal abilities are heavily g-loaded but not equivalent to g.
  • It is calibrated for English speakers. Like other vocabulary tests, it is not directly translatable to other languages without renorming. Cross-language validation work is a separate undertaking.
  • It depends on reading and writing exposure. Individuals with limited literacy may underperform on the IAW relative to their underlying cognitive ability — the same caveat applies to all written vocabulary tests.
  • The validation evidence is from the test author. The .95 alpha and .83 correlation come from Jouve (2023), the test’s developer. Independent replication by other research groups is the standard expectation for a test in widespread clinical use, and as of writing the IAW has not yet accumulated that independent literature.

Open questions

Several questions remain for future work:

  • Test-retest reliability. Internal consistency (alpha) measures the homogeneity of items at a single administration; test-retest reliability captures stability across occasions. The two are related but distinct, and the test-retest properties of the IAW are not yet established at the same level of detail as internal consistency.
  • Cross-population norms. Performance norms for specific clinical populations (older adults, individuals with learning disabilities, second-language English speakers) would expand the test’s applicability.
  • Comparison with newer intelligence batteries. The validation used the WAIS-III, an older edition. Comparison with current batteries (WAIS-IV, WAIS-V) would update the criterion evidence.
  • Differential item functioning. Whether items perform comparably across demographic subgroups is the standard fairness check, and item-level DIF analysis on a test of this size is a natural next step.
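
The last point can be made concrete. The workhorse DIF statistic is the Mantel-Haenszel common odds ratio: examinees are stratified by total score, each stratum contributes a 2×2 table of group by correct/incorrect, and an odds ratio near 1 means the item behaves the same in both groups at equal ability. The sketch below is a minimal version with invented counts, not an analysis of actual IAW data.

```python
# Mantel-Haenszel common odds ratio for one item, the standard DIF statistic.
# Examinees are stratified by total score; each stratum contributes a 2x2
# table of (group x correct/incorrect). The counts below are invented.

def mantel_haenszel_or(strata):
    """strata: list of (ref_correct, ref_wrong, focal_correct, focal_wrong)."""
    num = den = 0.0
    for a, b, c, d in strata:        # a = ref correct, b = ref wrong,
        n = a + b + c + d            # c = focal correct, d = focal wrong
        num += a * d / n
        den += b * c / n
    return num / den

strata = [(30, 10, 20, 20),   # low-score stratum
          (40, 5, 30, 15)]    # high-score stratum
print(round(mantel_haenszel_or(strata), 1))  # 3.4 -> item favors the reference group
```

In practice the ratio is tested for significance and flagged against conventional effect-size thresholds, but the logic per item is exactly this stratified comparison.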

Frequently Asked Questions

What does the IAW test actually measure?

Verbal ability — specifically, the breadth and depth of an individual’s vocabulary and the ability to retrieve appropriate words from semantic memory given contextual constraints. It maps onto the comprehension-knowledge (Gc) factor in the Cattell-Horn-Carroll framework.

How is the IAW different from the vocabulary subtest in a regular IQ test?

Standard IQ vocabulary subtests usually present a word and ask for its definition. The IAW inverts this: it presents a definition or context and asks the examinee to produce the word. The IAW is also untimed and accepts multiple correct answers per item where the language admits synonyms.

What’s the difference between a Cronbach’s alpha of .95 and a correlation of .83?

Alpha measures internal consistency — how well the items on a single test administration hang together as measures of the same construct. The .83 correlation with the WAIS-III VCI is concurrent validity — how strongly the IAW score relates to a separate, established measure of the same construct. They are different kinds of evidence and a strong test usually has both.

Is open-ended testing harder than multiple-choice?

Generally, yes — at the same level of item difficulty, open-ended formats produce lower raw scores because cueing effects and elimination strategies are unavailable. But this is part of the design intention: removing cueing produces a less inflated estimate of vocabulary knowledge.

Can the IAW be used to estimate IQ?

The IAW provides a verbal-ability score that strongly correlates with verbal IQ measures from major batteries. It is best used as a verbal-intelligence indicator rather than a stand-alone full-scale IQ estimate, since fluid reasoning, working memory, and other cognitive components are not assessed.

How long does the IAW take?

Because it is untimed, administration time varies widely by examinee. Most adults complete the 100 items in under an hour; some take longer, which is part of the design — pacing is set by the examinee, not by the clock.

What populations is the IAW validated for?

The 2023 validation sample of 1,083 examinees is the primary evidence base. Generalization to specific clinical populations — older adults, learning-disabled examinees, non-native English speakers — is a separate empirical question that further validation work would need to address.



Cite This Article

Jouve, X. (2023, April 7). Assessing Verbal Intelligence with the IAW Test. PsychoLogic. https://www.psychologic.online/2023/04/07/iaw-verbal-intelligence-test/
