
Assessing the Reliability of JCCES in Measuring Crystallized Cognitive Skills

Published: April 17, 2023

The Jouve-Cerebrals Crystallized Educational Scale (JCCES) is a measure of academic and educational crystallized knowledge designed to assess what the Cattell-Horn-Carroll model calls Gc — the broad cognitive ability that captures depth and breadth of acquired knowledge. Its 2023 psychometric evaluation reported a Cronbach’s alpha of .96 across 1,079 examinees, two-parameter logistic (2PL) item response theory calibration with good model fit for the majority of items, and an explicit alternative-answer-recognition mechanism that broadens scoring criteria for ambiguous responses (Jouve, 2023). This article examines what those properties mean in practice and how the JCCES sits within the broader literature on educational crystallized intelligence measurement.

What the JCCES measures

The JCCES samples educational and cultural knowledge across several content domains — vocabulary, general information, science, history, literature, geography, and other curricular content typically acquired through formal schooling and informed reading. The scale reflects the educational facet of crystallized intelligence (Gc), which Schneider and McGrew (2018) define within the Cattell-Horn-Carroll framework as the breadth and depth of a person’s acquired declarative and procedural knowledge, particularly the verbal and culturally mediated knowledge developed through investment of fluid reasoning into educational and life experiences.

This makes the JCCES distinct from a vocabulary-only Gc measure. Where a pure vocabulary subtest taps the narrow ability of lexical knowledge (VL within CHC), an educational scale taps a broader composite that includes general information (K0), academic knowledge across content areas, and the depth of learned material that examinees have integrated. The 2023 technical paper described items spanning a wide difficulty range to support measurement at both ends of the ability distribution, with the goal of avoiding the floor-and-ceiling compression that affects fixed-form tests at the tails.

Reliability and what .96 means

The reported Cronbach’s alpha of .96 is high in absolute terms but is in the expected range for a long, content-diverse Gc instrument. Several factors support this magnitude:

  • Content diversity within a coherent construct. Items sample multiple knowledge domains, but all converge on the same underlying ability (educational Gc). High inter-item correlations are expected when each item indicates the same latent trait through different surface content.
  • Item count. All else equal, Cronbach’s alpha increases with the number of items contributing to the score — the Spearman-Brown logic. A long scale built on a coherent construct will reliably reach .95+ when items are well developed.
  • Sample size. The 1,079-examinee development sample is large by psychometric standards, providing the precision needed to estimate item parameters and reliability coefficients with narrow confidence intervals.

The interpretation worth emphasizing is that high internal consistency does not by itself establish validity. A scale could be highly internally consistent (all items measure the same thing) while measuring something other than what it claims to measure. The convergent and discriminant validity evidence — how scores correlate with established Gc measures, how they differentiate from non-Gc abilities — is the complementary requirement. The JCCES technical paper reported convergent validity evidence within the Cogn-IQ system, and replication in independent samples remains an ongoing research need.

The 2PL IRT framework and why it matters

Classical Test Theory (CTT) and Item Response Theory (IRT) offer different views of the same data. CTT works at the test level: total scores, mean and variance, internal consistency. IRT works at the item level: estimating item difficulty (b parameter) and item discrimination (a parameter) so that the relationship between examinee ability and item-correct probability is modeled explicitly.

The two-parameter logistic (2PL) model used for the JCCES specifies the probability of a correct response as a logistic function of the difference between examinee ability (θ) and item difficulty (b), scaled by the item discrimination (a). This produces several practical benefits over CTT:

  • Items can be characterized independently of the sample. The difficulty and discrimination parameters describe properties of the item itself, allowing comparison across item pools, equating across forms, and adaptive testing applications.
  • Examinee ability can be estimated independently of the specific items administered. Two examinees who answered different subsets of items can be placed on a common ability metric.
  • Information functions identify where the test is most precise. A 2PL-calibrated test can be evaluated for which ability ranges it measures well and where additional items would improve precision.
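
The 2PL response function and its item information function can be written down directly. This is a generic sketch of the model class with illustrative parameter values, not actual JCCES item parameters:

```python
import math

def p_correct(theta, a, b):
    """2PL model: probability of a correct response given examinee
    ability theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information the item contributes at ability theta;
    for the 2PL it peaks where theta equals the difficulty b."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# An examinee whose ability equals the item's difficulty has a 50%
# chance of a correct response, and the item is maximally
# informative at exactly that point.
print(p_correct(0.0, a=1.5, b=0.0))          # → 0.5
print(item_information(0.0, a=1.5, b=0.0))   # → 0.5625
```

Summing `item_information` over a test's items gives the test information function, whose inverse square root is the standard error of the ability estimate — this is how a 2PL-calibrated test identifies the ability ranges it measures precisely.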

König, Spoden, and Frey (2022) studied the robustness of 2PL calibration in small-sample contexts and showed that the model produces stable estimates even when sample size is below the conventional N = 500 threshold, provided that item-pool characteristics are favorable. The JCCES’s 1,079-examinee sample is well above the threshold and supports stable item-parameter estimates.

The technical paper noted that the 2PL fit was good for the majority of items, with a subset showing model-fit limitations. This is typical of large item pools and is the appropriate basis for item revision: poorly-fitting items are flagged for review, replacement, or alternative scoring rather than being assumed to function as the model expects.

Alternative answer recognition

One distinctive feature of the JCCES is explicit recognition of alternative answers — accepting responses that are technically not the keyed correct answer but that demonstrate the underlying knowledge or reasoning the item is testing. The 2023 evaluation incorporated this through a kernel-estimator approach that allows partial credit or full credit for substantively correct responses outside the original key.

This addresses a well-known fairness issue in fixed-key scoring of open-response items. Examinees from different educational backgrounds, age cohorts, or cultural contexts may produce responses that meet the item’s measurement intent without exactly matching the keyed answer. Conventional scoring would mark these wrong; alternative-answer-recognition scoring credits them according to the knowledge they actually demonstrate.
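
The mechanics can be illustrated with a simplified scorer. The item ID, answers, and weights below are invented for illustration, and a lookup table is a simplification of the kernel-estimator approach the technical paper describes — the point is only that scoring consults a set of recognized responses rather than a single keyed string:

```python
# Hypothetical scoring key: each item maps normalized responses to
# credit weights. These entries are illustrative, not JCCES content.
ALT_KEY = {
    "q17_largest_planet": {
        "jupiter": 1.0,         # keyed answer
        "planet jupiter": 1.0,  # alternative phrasing, full credit
    },
}

def score_response(item_id, response, key=ALT_KEY):
    """Return credit for a free-text response, accepting recognized
    alternatives instead of requiring an exact keyed match."""
    # Normalize case and collapse internal/surrounding whitespace.
    normalized = " ".join(response.strip().lower().split())
    return key.get(item_id, {}).get(normalized, 0.0)

print(score_response("q17_largest_planet", "  Jupiter "))       # → 1.0
print(score_response("q17_largest_planet", "Planet  Jupiter"))  # → 1.0
print(score_response("q17_largest_planet", "saturn"))           # → 0.0
```

A production scorer would also need partial-credit weights and a review pipeline for responses outside the key, but even this skeleton shows where the fairness gain comes from: the key encodes measurement intent rather than a single surface form.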

Cormier and colleagues (2022) provided systematic evidence that examinee characteristics — particularly linguistic background and exposure history — affect cognitive test performance more strongly than test characteristics in many contexts. Tests that handle response variability flexibly are more robust to these examinee-level differences than tests that score against a rigid key, which is why the JCCES design choice has practical implications for fairness across diverse examinee populations.

How the JCCES fits within Gc measurement

The CHC framework — refined by McGrew (2023) in his 30-year retrospective on Carroll’s three-stratum theory — organizes cognitive abilities into a hierarchy with general intelligence (g) at the top, broad abilities (Gc, Gf, Gv, Gs, Gwm, Glr, Ga, Grw) at the middle, and narrow abilities at the base. Within Gc, several narrow abilities are recognized:

  • Lexical Knowledge (VL): knowledge of word definitions and word concepts.
  • Language Development (LD): general understanding of spoken native language.
  • General Verbal Information (K0): general knowledge about cultural, historical, and academic content.
  • Information about Culture (K2): specific knowledge of cultural artifacts, conventions, and symbols.
  • Communication Ability (CM): ability to communicate verbal information clearly.

An educational scale like the JCCES samples primarily K0 and K2, with secondary loadings on VL through any items that depend on word knowledge. This positioning differentiates the JCCES from purely vocabulary-focused Gc measures (which load most heavily on VL) and from achievement-test style instruments (which sample academic content but typically without psychometric calibration as a measure of the underlying Gc construct).

Carroll’s (1993) original factor-analytic survey, which underpins the modern CHC framework, established that knowledge-tapping items across content areas cluster consistently on the broad Gc factor across hundreds of independent datasets. Modern instruments — Wechsler Information subtests, Woodcock-Johnson Comprehension-Knowledge cluster, KABC-II Knowledge subtest — all draw on this validated approach. The JCCES follows the same tradition with contemporary IRT-based calibration and the alternative-answer-recognition extension.

Practical implications

For researchers and clinicians considering the JCCES:

  • Use it when broad educational/cultural Gc is the construct of interest. Vocabulary-only or single-domain measures cannot substitute for content-diverse instruments when the question is about the breadth of accumulated knowledge.
  • Recognize the cultural and educational specificity. Educational Gc instruments necessarily reflect the cultural and curricular context in which examinees acquired their knowledge. Cross-cultural use requires either norm reference within the target population or careful interpretive caveats.
  • Pair with measures of fluid reasoning. Gc and Gf are partially separable broad abilities, and a complete cognitive profile requires both. Educational Gc alone cannot identify high-Gf, low-Gc patterns or vice versa.
  • Read the technical paper for the 2PL fit details. Items with model-fit limitations should be interpreted with appropriate caution, particularly in high-stakes applications.

For self-administered users, the JCCES provides a credible estimate of educational crystallized ability, with the standard caveats that apply to unproctored cognitive testing: the score is informative for individual reflection and research participation but is not equivalent to a clinically administered cognitive assessment.

What remains to be established

Several research questions remain open. Independent replication of the reliability and validity coefficients in non-Cogn-IQ samples would strengthen the evidence base. Item-level differential functioning across linguistic, educational, and demographic subgroups would clarify the bounds of fair use. Convergent and discriminant validity correlations with established Gc instruments outside the Cogn-IQ system (Wechsler Information, WJ-IV Comprehension-Knowledge, KABC-II Knowledge) would situate the JCCES within the broader Gc measurement landscape. The 2023 paper provides the necessary first-step evidence; ongoing research can extend it.

The takeaway

The Jouve-Cerebrals Crystallized Educational Scale (JCCES) is a 2PL-calibrated educational Gc measure with a Cronbach’s alpha of .96, alternative-answer-recognition scoring, and a 1,079-examinee development study supporting its psychometric properties. It samples the broader educational and cultural facet of crystallized intelligence rather than the narrower vocabulary facet, and its IRT-based design supports calibrated, fair item-level evaluation. As with any single-construct measure, the practical strength is in scoping: when broad educational Gc is the construct of interest, the JCCES is purpose-built; when a comprehensive cognitive profile is required, it should be paired with measures of other CHC broad abilities.

References

  • Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press.
  • Cormier, D. C., Bulut, O., McGrew, K. S., & Kennedy, K. (2022). Linguistic influences on cognitive test performance: Examinee characteristics are more important than test characteristics. Journal of Intelligence, 10(1), 8. https://doi.org/10.3390/jintelligence10010008
  • Jouve, X. (2023). Evaluating the Jouve Cerebrals Crystallized Educational Scale (JCCES): Reliability, internal consistency, and alternative answer recognition. Cogn-IQ Research Papers. https://pubscience.org/ps-1mSR3-32426c-iYHT
  • König, C., Spoden, C., & Frey, A. (2022). Robustness of the performance of the optimized hierarchical two-parameter logistic IRT model for small-sample item calibration. Behavior Research Methods, 55(8), 3965–3983. https://doi.org/10.3758/s13428-022-02000-5
  • McGrew, K. S. (2023). Carroll’s three-stratum (3S) cognitive ability theory at 30 years: Impact, 3S-CHC theory clarification, structural replication, and cognitive–achievement psychometric network analysis extension. Journal of Intelligence, 11(2), 32. https://doi.org/10.3390/jintelligence11020032
  • Schneider, W. J., & McGrew, K. S. (2018). The Cattell–Horn–Carroll theory of cognitive abilities. In D. P. Flanagan & E. M. McDonough (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (4th ed., pp. 73–163). Guilford Press.


📋 Cite This Article

Jouve, X. (2023, April 17). Assessing the Reliability of JCCES in Measuring Crystallized Cognitive Skills. PsychoLogic. https://www.psychologic.online/2023/04/17/jcces-crystallized-intelligence-reliability/
