Psychological Measurement and Testing

JCCES: Reliability of the Crystallized Skills Scale

Assessing the Jouve-Cerebrals Crystallized Educational Scale (JCCES)
Published: April 17, 2023

The Jouve-Cerebrals Crystallized Educational Scale (JCCES) is a three-subtest cognitive battery — verbal analogies, mathematical problems, and general knowledge — built on the Cattell-Horn-Carroll (CHC) framework with primary loading on crystallized intelligence (Gc) and a secondary quantitative-knowledge (Gq) component. Across cross-battery factor analyses, JCCES indicators load strongly on a general factor (g) shared with the WAIS Verbal Comprehension Index, the ACT, the GAMA, the RIAS, and the AFQT, with the Math Problems subtest forming a partly-separable Gq facet. The composite Cognitive Acumen Index correlates r = .80 with WAIS FSIQ, r = .82 with WAIS VCI, r = .84 with the AFQT, and r = .83 with the SAT Composite (Cogn-IQ, 2025). The “Educational” element of the name refers to the school-acquired content of the math subtest and parts of the general-knowledge subtest, not to an achievement-test framing — the test is a traditional cognitive-ability battery in the Carroll (1993) sense.

Three subtests, two indices

The JCCES contains 129 items distributed across three subtests:

  • Verbal Analogies (VA, 41 items, α = .85). Classical A:B::C:D verbal reasoning, in the format used since early intelligence-test work and retained across modern batteries. Loads primarily on Gc through the lexical-knowledge requirement and secondarily on fluid reasoning (Gf) through the inductive identification of the underlying relation. Open-ended response with multiple accepted answers per item.
  • Math Problems (MP, 32 items, α = .92). Quantitative-reasoning items requiring arithmetic and algebraic operations together with strategy formulation. The clearest “educational” component, since most examinees acquire the underlying content through formal schooling rather than independent learning. Loads on Gq with secondary Gf loading; in cross-battery factor analyses with the GAMA, MP behaves as a bridge toward Gf (Cogn-IQ, 2025).
  • General Knowledge (GK, 56 items, α = .94). Hybrid items spanning Wechsler-Information-style world knowledge (acquired incidentally through reading, conversation, media exposure) and curriculum-derived content (history, science, literature acquired through schooling). The mix per item varies; collectively the subtest loads strongly on verbal Gc with the school-acquired subset adding educational variance. Open-ended with multiple accepted answers.

Two composite indices summarize the subtest profile, both on the standard IQ metric (M = 100, SD = 15):

  • Cognitive Acumen Index (CAI): the all-subtest composite (α = .96, SEM ≈ 2.96), the broad Gc-plus-Gq estimate.
  • Verbal Acumen Index (VAI): the verbal-only composite combining VA and GK (α = .93), excluding the math component for cases where pure verbal-Gc estimation is the goal.

The composite formulas use a modified Tellegen-Briggs Formula 4 with cubic correction, addressing tail-compression bias that affects the original TB-4 method by approximately 2-3 points on average and up to 6 points at distribution extremes (Cogn-IQ, 2025).
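The uncorrected Tellegen-Briggs composite can be sketched as follows. The cubic tail correction is specified only in the technical manual and is not reproduced here; the example scores are hypothetical, while the intercorrelations are the JCCES subtest values reported later in this article.

```python
import math

def tellegen_briggs_composite(subtest_scores, intercorrs, mean=100, sd=15):
    """Classic Tellegen-Briggs composite from subtest standard scores.

    subtest_scores: standard scores on the IQ metric (M=100, SD=15).
    intercorrs: pairwise correlations between the subtests.
    Note: this is the uncorrected formula; the JCCES's cubic tail
    correction is defined only in the technical manual.
    """
    k = len(subtest_scores)
    z_sum = sum((s - mean) / sd for s in subtest_scores)
    # SD of a sum of k unit-variance scores with these intercorrelations
    sd_sum = math.sqrt(k + 2 * sum(intercorrs))
    return mean + sd * z_sum / sd_sum

# Hypothetical subtest scores; intercorrelations VAxGK=.77, VAxMP=.59, MPxGK=.55
cai = tellegen_briggs_composite([115, 110, 120], [.77, .59, .55])
```

Because the subtests are imperfectly correlated, the composite lands farther from the mean than the average of the three subtest scores, which is the point of composite scoring.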

Convergent validity across multiple criterion batteries

The JCCES Technical Manual (Cogn-IQ, 2025) reports convergent-validity correlations against six external criterion measures spanning the major individually-administered IQ batteries (WAIS, WISC, RIAS), aptitude tests (AFQT, SAT, GAMA), and academic-achievement composites (SAT subscores). The pattern across these criteria — not any single correlation — is what locates the JCCES within the cognitive-ability landscape.

Wechsler-family correlations (CAI):

  • WAIS VCI: r = .82 (N = 56) | WAIS Vocabulary: .75 | WAIS Similarities: .57 | WAIS Information: .83
  • WAIS FSIQ: .80 (N = 43) | WAIS VIQ: .80 | WAIS PIQ: .67
  • WISC FSIQ: .82 (N = 15) | WISC VIQ: .73 | WISC PIQ: .64

The WAIS Information correlation (.83) is the highest single correlation in the panel, reflecting the content overlap between Information items and the JCCES General Knowledge subtest. The Similarities correlation is lower (.57), consistent with Similarities tapping a verbal-abstraction strand that the JCCES verbal subtests sample less directly. The CAI–WAIS PIQ correlation of .67 is meaningfully lower than the CAI–VIQ correlation of .80, the expected pattern for a Gc-anchored measure compared with the more Gv- and Gs-loaded performance subtests.

RIAS correlations (CAI):

  • RIAS VIX (Verbal Intelligence Index): .80 (N = 119)
  • RIAS Guess What: .75 | RIAS Verbal Reasoning: .64

The RIAS sample (N = 119) is the largest individually-administered comparison and provides the most stable estimate. The .80 with VIX is comparable to the correlations with the WAIS verbal indices (VCI .82, VIQ .80).

Aptitude-test correlations (CAI):

  • AFQT Percentile: .84 (N = 62) — the Armed Forces Qualification Test, a long-validated military-aptitude composite.
  • SAT Composite (2005–2016 edition): .83 (N = 117) | SAT Math: .69 | SAT Writing: .78 | SAT Reading: .73
  • GAMA IQ (a non-verbal Gf battery): .59 (N = 59) | GAMA Analogies: .51 | GAMA Constructions: .47

The aptitude-test panel is informative for what it shows about construct breadth. The .84 with AFQT and .83 with SAT Composite indicate that the JCCES taps the same g-loaded construct that drives performance on these widely-used selection instruments. The substantially lower correlation with the GAMA IQ (.59) — a figural, non-verbal Gf measure — reflects the partial separation of Gc and Gf at the broad-ability stratum: the JCCES is a Gc-anchored measure, and its correlation with a Gf-anchored measure should be moderate rather than high.

The VAI (verbal-only composite) shows the same general pattern with slightly different magnitudes: VAI–WAIS VCI is .85, VAI–RIAS VIX is .83, but VAI–GAMA IQ drops to .47 (Cogn-IQ, 2025). The VAI’s purer verbal loading produces tighter correlations with verbal criteria and looser correlations with figural-Gf criteria — the expected differential-validity pattern for a verbal-only composite.

Factor structure: three cross-battery EFAs

The technical manual reports three cross-battery exploratory factor analyses, each examining how JCCES subtests load alongside subtests from a different external battery. The methodology follows established factor-analytic practice (Gorsuch, 1983; Horn, 1965; Velicer, 1976; Carroll, 1993), with parallel analysis for factor retention and CHC interpretation following McGrew (2009) and Schneider and McGrew (2018).

JCCES × ACT (N = 58)

Two factors retained, accounting for 73.9% common variance (F1 = 49.2%, F2 = 24.7%). The factor-analytic decomposition produced two near-orthogonal domains (HTMT = .612):

  • F1 — academic literacy/science cluster: ACT English, ACT Reading, ACT Science, with JCCES VA and GK.
  • F2 — quantitative cluster: ACT Math with JCCES MP.

g-loadings were uniformly high across all indicators (computed following Jensen, 1998): ACT Science .861, ACT Math .861, ACT English .841, JCCES MP .794, ACT Reading .739, JCCES GK .657, JCCES VA .590. Explained common variance attributable to g (ECV) was .807 with ωh = .998, meaning that of the variance shared across indicators, 80.7% loaded on the general factor. The CHC interpretation: F1 maps onto Gc/Grw (reading-and-writing), F2 maps onto Gq, with a strong general factor spanning both.
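As a minimal sketch of how ECV is computed from a bifactor-style decomposition: it is the share of all common variance carried by the general-factor loadings relative to the group-factor loadings. The loading values below are illustrative, not the manual's.

```python
def ecv(g_loadings, group_loadings):
    """Explained common variance of the general factor: squared
    g-loadings as a share of all squared factor loadings."""
    g = sum(l ** 2 for l in g_loadings)
    grp = sum(l ** 2 for l in group_loadings)
    return g / (g + grp)

# Illustrative only: strong g-loadings, modest group-factor loadings
share = ecv([.86, .84, .79, .74], [.35, .30, .28, .25])
```

An ECV near .80, as in the JCCES × ACT analysis, says that most of the common variance runs through the general factor rather than through the group factors.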

JCCES × GAMA (N = 57)

Two factors retained, accounting for 68.6% common variance (F1 = 39.3%, F2 = 29.3%). HTMT = .458, indicating clearer factor separation than the ACT analysis. The two strands:

  • F1 — figural-reasoning set: GAMA Sequences, GAMA Constructions, with JCCES MP cross-loading.
  • F2 — verbal/knowledge set: JCCES VA and GK.

g-loadings were sizeable across all indicators: GAMA SEQ .786, GAMA CON .779, JCCES MP .780, GAMA ANA .734, JCCES GK .558, GAMA MAT .539, JCCES VA .471. The MP cross-domain proximity (complexity ≈ 1.97) is the analytic signature of a subtest that bridges Gq and Gf. ECV = .663 with ωh = .994 — a slightly weaker general factor than the ACT analysis, reflecting the cleaner Gc–Gf separation between the JCCES (Gc-anchored) and GAMA (Gf-anchored) batteries.

The CHC interpretation reads directly: shared general ability is prominent, GAMA emphasizes Gf, JCCES emphasizes Gc, and MP bridges toward Gf. This is the classic Gc–Gf differentiation pattern documented by Carroll (1993) and reaffirmed in McGrew’s (2023) 30-year three-stratum review.

JCCES × WAIS VCI (N = 56)

Parallel analysis suggested one factor; two factors are reported for descriptive completeness, accounting for 77.8% common variance (F1 = 56.5%, F2 = 21.3%). The decomposition:

  • F1 — broad Gc factor: JCCES VA, JCCES GK, WAIS Vocabulary, WAIS Information.
  • F2 — quantitative/abstraction strand: JCCES MP and WAIS Similarities cross-link.

g-loadings were very high across the Gc indicators: VA .918, GK .918, WAIS Vocabulary .917, WAIS Information .909, with WAIS Similarities .756 and JCCES MP .658. ECV = .904, ωh = .998 — the strongest general-factor evidence in the three analyses, consistent with the high construct overlap between two Gc batteries. The CHC interpretation: overlap is dominated by Gc (Vocabulary, Information, VA, GK), MP reflects Gq within the same general factor, and Similarities contributes a verbal-abstraction nuance within Gc (Keith, 2005).

Subtest intercorrelations

The within-test subtest correlations clarify the hierarchical structure:

  • VA × GK: r = .77 — strong, consistent with both subtests loading heavily on verbal Gc.
  • VA × MP: r = .59 — moderate, consistent with the partial separation of Gc and Gq.
  • MP × GK: r = .55 — moderate, consistent with MP tapping a partly-distinct quantitative ability from the verbal-knowledge subtests.

The pattern shows the expected structure for a Gc-anchored battery with a Gq subtest: a tight verbal subfactor (VA + GK) and a partly-separable quantitative subfactor (MP), both contributing to the overall composite. The cross-battery factor analyses confirm this internal pattern at the broader cognitive-ability level.

Reliability and what α = .96 means

Internal consistency for the JCCES, computed across the technical-manual sample (N = 1,551), is α = .96 for the full-scale CAI, with subtest reliabilities of .85 (VA), .92 (MP), and .94 (GK), and α = .93 for the VAI. The CAI’s standard error of measurement on the IQ-scaled metric is approximately 2.96 points, with a 95% confidence band of [.957, .963] on the reliability coefficient.

These coefficients fall in the expected range for a long, well-developed cognitive battery with homogeneous loading on the underlying construct. The verbal subtests’ high reliabilities (.85–.94) reflect the open-ended response format and the careful item development across two decades of revision; the math subtest’s .92 reflects the relative homogeneity of quantitative items at varying difficulty levels.
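The relationship between α and the SEM on the IQ metric is direct: SEM = SD·√(1 − α). A reliability of about .961 reproduces the manual's SEM of roughly 2.96; the confidence-band helper below is a standard textbook construction, not a JCCES-specific procedure.

```python
import math

def sem(sd, alpha):
    """Standard error of measurement from a reliability coefficient."""
    return sd * math.sqrt(1 - alpha)

def confidence_band(score, sd, alpha, z=1.96):
    """Approximate 95% confidence interval around an observed score."""
    e = z * sem(sd, alpha)
    return (score - e, score + e)

# alpha ≈ .961 gives SEM ≈ 2.96 on the IQ metric (SD = 15)
cai_sem = sem(15, .961)
band = confidence_band(120, 15, .961)
```

With an SEM near 3 points, an observed CAI of 120 carries a 95% band of roughly ±6 points, which is the practical meaning of α = .96 for score interpretation.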

The 2PL IRT framework

The JCCES has been calibrated using the two-parameter logistic (2PL) item response theory model in addition to classical-test-theory analysis. The 2PL specifies the probability of a correct response as a logistic function of the difference between examinee ability (θ) and item difficulty (b), scaled by the item discrimination parameter (a). This produces:

  • Sample-independent characterization of items, supporting equating across forms and computer-adaptive applications.
  • Item-independent estimation of examinee ability, allowing comparison across examinees who answered different subsets.
  • Information functions identifying ability ranges where the test is most precise and where additional items would improve measurement.
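The 2PL response function and its item-information function can be sketched as follows; the parameter values are illustrative, not JCCES calibrations.

```python
import math

def p_correct(theta, a, b):
    """2PL: probability of a correct response given ability theta,
    item discrimination a, and item difficulty b."""
    return 1 / (1 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta.
    Information peaks where theta equals the item difficulty b."""
    p = p_correct(theta, a, b)
    return a ** 2 * p * (1 - p)

# A discriminating item (a = 1.5) centered at average ability (b = 0):
# an examinee at theta = b has exactly a 50% chance of answering correctly.
p = p_correct(0.0, 1.5, 0.0)
```

Summing item-information curves across the pool gives the test-information function, which identifies the ability ranges where measurement is most precise, the third bullet above.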

König, Spoden, and Frey (2022) showed that 2PL calibrations produce stable item-parameter estimates even in small-sample contexts when item-pool characteristics are favorable. The JCCES’s 1,551-examinee sample is well above conventional thresholds for stable calibration. The technical manual reports good 2PL fit for the majority of items, with a subset showing model-fit limitations flagged for ongoing revision — the standard psychometric pattern for large item pools.

Discrepancy norms and base rates

The technical manual reports cumulative base rates for absolute standard-score differences across the four meaningful contrasts (CAI–VAI, VA–MP, VA–GK, MP–GK). Selected thresholds (Cogn-IQ, 2025):

  • |CAI − VAI| ≥ 7 occurs in 38.1% of cases; ≥ 10 occurs in 13.5%; ≥ 14 occurs in 1.5% — the CAI–VAI gap reflects the relative contribution of MP, so a large gap indicates either unusual quantitative strength (CAI > VAI) or unusual quantitative weakness (CAI < VAI).
  • |VA − GK| differences track the within-verbal cluster and remain small (≥ 14 occurs in 0% of cases in the development sample).
  • |VA − MP| and |MP − GK| differences track the verbal-quantitative separation and can be substantial in profiles with marked Gc–Gq dissociation.

These base rates support clinical and research interpretation of subtest discrepancies. By the thresholds above, a CAI–VAI gap of 10 points or more occurs in 13.5% of cases and a gap of 14 or more in only 1.5%, so gaps of roughly 14 points and above mark unusual profiles that warrant attention for what they imply about the examinee's relative quantitative strength or weakness.
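Under a bivariate-normal assumption, the base rate of a discrepancy can be approximated from the correlation between the two composites. This textbook approximation is illustrative only and does not substitute for the manual's empirical table; the correlation value in the example is an assumption, not a reported JCCES figure.

```python
import math

def discrepancy_base_rate(threshold, r, sd=15):
    """Two-tailed base rate of |score1 - score2| >= threshold for two
    standard scores correlated r, assuming bivariate normality.
    A textbook approximation, not the empirical JCCES table."""
    sd_diff = sd * math.sqrt(2 * (1 - r))   # SD of the difference score
    z = threshold / sd_diff
    # survival probability of |Z| >= z via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Assuming the composites correlate about .90, a 10-point gap comes out
# near 0.136, in the neighborhood of the empirical 13.5% reported above.
rate = discrepancy_base_rate(10, .90)
```

The approximation makes the logic of the table transparent: the more highly correlated two composites are, the smaller the SD of their difference and the rarer any given gap becomes.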

Open-ended scoring with alternative-answer recognition

The JCCES uses open-ended responses scored 0/1, with the verbal subtests (VA and GK) accepting multiple correct answers per item where multiple substantively-correct responses exist. This addresses two well-known issues with rigid-key scoring of verbal items.

First, examinees from different educational and cultural backgrounds may produce responses that satisfy the item’s measurement intent without exactly matching the keyed answer. Conventional rigid-key scoring would mark these wrong; alternative-answer-recognition scoring marks them according to actual content. Cormier et al. (2022) provided systematic evidence that examinee characteristics — particularly linguistic background and exposure history — affect cognitive test performance more strongly than test characteristics, making this fairness consideration concrete rather than abstract.

Second, multiple-choice formats are susceptible to guessing, particularly at the high-ability end where item discrimination is hardest to maintain. Open-ended responses eliminate the guessing-noise component, which contributes to the test’s reliability and its usability across the full ability range.
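A minimal sketch of 0/1 scoring with alternative-answer recognition: a response is correct if its normalized form matches any entry in the accepted-answer set. The item, keys, and normalization rules (case and whitespace only) are hypothetical, not the actual JCCES keys.

```python
def score_response(response, accepted_answers):
    """0/1 scoring with alternative-answer recognition.
    Normalization here is illustrative: lowercase and collapse whitespace."""
    norm = " ".join(response.lower().split())
    keys = {" ".join(a.lower().split()) for a in accepted_answers}
    return int(norm in keys)

# Hypothetical item accepting synonymous answers
keys = ["large", "big", "huge"]
credited = score_response("  Big ", keys)    # matches an accepted key
missed = score_response("tall", keys)        # no accepted key matches
```

The design point is that the key set, not a single string, defines correctness, so substantively equivalent responses from different linguistic backgrounds receive credit.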

Practical implications

For researchers and clinicians considering the JCCES:

  • Use the CAI when broad Gc-plus-Gq estimation is the goal. The all-subtest composite is the appropriate index for general crystallized-intelligence assessment with a quantitative-reasoning component.
  • Use the VAI when pure verbal-Gc estimation is the goal. The verbal-only composite is the closer parallel to the WAIS VCI or RIAS VIX and is appropriate when the construct of interest is verbal Gc specifically.
  • Account for the Gc–Gf differentiation. The JCCES × GAMA factor analysis (.59 correlation between full composites) is the clearest evidence that the JCCES does not substitute for a Gf measure. A complete cognitive profile combines Gc-anchored instruments like the JCCES with Gf-anchored instruments like Raven’s APM or the GAMA.
  • Expect cultural and educational variance in MP and GK. Examinees with limited formal schooling, recent immigrants, or those from substantially different curricular backgrounds may show CAI–VAI discrepancies reflecting schooling rather than cognitive differences. The discrepancy base rates quantify how unusual a given gap is; differences of 14 points or more (about 1.5% of cases) warrant particular attention.
  • Consult the technical manual for full norm tables, item parameters, and analytic detail. The Cogn-IQ (2025) manual is the authoritative source for high-stakes applications.

For self-administered users, the JCCES provides a credible estimate of crystallized intelligence with the standard caveats applying to unproctored cognitive testing: the score is informative for individual reflection and research participation but is not equivalent to a clinically-administered IQ assessment.

Open research directions

The JCCES has been in use since 2005 (initially under the CCAT name) and renamed in 2010 to reflect its primary Gc focus, with continuous refinement through 2023 and into the current 2025 manual. The accumulated reliability, validity, and factor-analytic evidence reflects roughly two decades of psychometric work rather than a recently-developed instrument. Useful extensions of the existing evidence base include independent replication of the convergent-validity correlations and factor-analytic results in samples outside the Cogn-IQ ecosystem and differential item functioning analyses across linguistic, educational, and demographic subgroups to refine the bounds of fair use.

The takeaway

The JCCES is a three-subtest crystallized-intelligence battery — verbal analogies, mathematical problems, and general knowledge — with reliability α = .96 for the all-subtest composite (CAI) and α = .93 for the verbal-only composite (VAI). Convergent-validity correlations span r = .80–.84 with WAIS FSIQ, AFQT, and SAT Composite, with appropriately lower correlations against figural-Gf criteria (r = .59 with GAMA IQ). Three cross-battery factor analyses (× ACT, × GAMA, × WAIS VCI) confirm a strong general factor with Gc, Gq, and Gf differentiation aligning with Carroll-CHC theory. The “Educational” element of the name reflects the school-acquired content of the math subtest and parts of the general-knowledge subtest, but the test as a whole is a traditional cognitive-ability battery, not an academic-achievement scale.

References

  • Cogn-IQ. (2025). JCCES Technical Manual. Cogn-IQ. https://www.cogn-iq.org/methods/jcces-manual/
  • Jouve, X. (2023). Evaluating the Jouve Cerebrals Crystallized Educational Scale (JCCES): Reliability, internal consistency, and alternative answer recognition. Cogn-IQ Research Papers. https://pubscience.org/ps-1mSR3-32426c-iYHT
  • Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press.
  • Schneider, W. J., & McGrew, K. S. (2018). The Cattell–Horn–Carroll theory of cognitive abilities. In D. P. Flanagan & E. M. McDonough (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (4th ed., pp. 73–163). Guilford Press.
  • McGrew, K. S. (2023). Carroll’s three-stratum (3S) cognitive ability theory at 30 years: Impact, 3S-CHC theory clarification, structural replication, and cognitive–achievement psychometric network analysis extension. Journal of Intelligence, 11(2), 32. https://doi.org/10.3390/jintelligence11020032
  • König, C., Spoden, C., & Frey, A. (2022). Robustness of the performance of the optimized hierarchical two-parameter logistic IRT model for small-sample item calibration. Behavior Research Methods, 55(8), 3965–3983. https://doi.org/10.3758/s13428-022-02000-5
  • Cormier, D. C., Bulut, O., McGrew, K. S., & Kennedy, K. (2022). Linguistic influences on cognitive test performance: Examinee characteristics are more important than test characteristics. Journal of Intelligence, 10(1), 8. https://doi.org/10.3390/jintelligence10010008



Cite This Article

Jouve, X. (2023, April 17). JCCES: Reliability of the Crystallized Skills Scale. PsychoLogic. https://www.psychologic.online/jcces-reliability/
