The IQ Bell Curve: How Scores Are Distributed

Published: March 17, 2026

The bell curve that plots IQ scores is one of the most recognizable images in popular psychology, and one of the most widely misunderstood. It is not a discovered natural law of intelligence; it is a deliberate engineering choice imposed on raw test scores during a process called standardization. Understanding why test publishers force the distribution into this shape — and what it does and does not tell you about your own score — is the difference between reading an IQ report as a meaningful piece of measurement and reading it as a verdict.

The short version: IQ scores are constructed to follow a normal distribution with a mean of 100 and a standard deviation of 15. About 68% of the population scores between 85 and 115, about 95% between 70 and 130, and roughly 99.7% between 55 and 145. The mathematical framework is what makes percentile rankings, confidence intervals, and cross-test comparisons possible — but it is also what makes naïve interpretations of a single score misleading.

Why IQ scores follow a normal distribution

The bell curve pattern is engineered into IQ tests by design. When test developers create a new intelligence assessment, they administer it to a large normative sample — typically 2,000–4,000 people stratified by age, sex, education, and ethnicity. Raw scores from this sample are then mathematically transformed so that the final distribution has a mean of exactly 100 and a standard deviation of 15. Without that transformation, raw scores from any single test would have whatever shape the items happened to produce, which would be useless for comparison across tests, ages, or populations.

This norming approach was formalized by David Wechsler in 1939 with the deviation IQ, which replaced the older mental-age ratio method. The mean of 100 was kept for continuity with that earlier metric, and the standard deviation of 15 was chosen because it approximated the spread of scores observed under the older ratio method and produced round-number cutoffs at ±1, ±2, and ±3 SD.

There is a deeper reason the normal distribution works as well as it does for cognitive ability. The Central Limit Theorem states that when a trait is influenced by many small, independent contributing factors, the resulting distribution will approximate a bell curve regardless of the individual factors’ distributions. Intelligence is genuinely polygenic: the largest genome-wide association meta-analysis to date (Savage et al., 2018; N = 269,867) identified hundreds of variants each contributing a tiny fraction of variance, and Plomin and Deary (2015) summarize the broader literature as showing thousands of loci with small effects layered on a similarly diffuse mosaic of environmental influences. That polygenic, multifactorial architecture is exactly the structure the Central Limit Theorem predicts will produce something close to a normal distribution. The bell curve is therefore not just a statistical convention; it is a reasonable approximation of how cognitive variation actually accumulates in the population.
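
One way to see the Central Limit Theorem argument concretely is to simulate it. The sketch below is purely illustrative: the number of contributing factors and their (uniform) distribution are arbitrary assumptions rather than estimates from the GWAS literature, but the summed trait still comes out essentially normal.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative assumption: 200 small, independent, non-normal (uniform) contributions
# per simulated person -- stand-ins for many genetic and environmental influences.
n_people, n_factors = 50_000, 200
trait = rng.uniform(-1.0, 1.0, size=(n_people, n_factors)).sum(axis=1)

# Rescale the summed trait to the IQ metric (mean 100, SD 15), mirroring standardization.
iq = 100 + 15 * (trait - trait.mean()) / trait.std()

# Despite the non-normal inputs, the summed trait lands very close to a normal curve.
for lo, hi in [(85, 115), (70, 130), (55, 145)]:
    share = np.mean((iq >= lo) & (iq <= hi))
    print(f"{lo}-{hi}: {share:.1%}")   # roughly 68%, 95%, and 99.7%
```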

What standard deviations mean in IQ terms

Once raw scores are transformed to the IQ metric, the standard deviation is the key to interpretation. With a mean of 100 and an SD of 15, the proportion of the population in each band follows directly from the normal distribution:

| Range | IQ Score Band | Population Coverage | Approximate Ratio |
| --- | --- | --- | --- |
| Within ±1 SD | 85–115 | 68.2% | ~2 in 3 people |
| Within ±2 SD | 70–130 | 95.4% | ~19 in 20 |
| Within ±3 SD | 55–145 | 99.7% | ~997 in 1,000 |
| Above +2 SD | ≥ 130 | 2.3% | ~1 in 44 |
| Above +3 SD | ≥ 145 | 0.13% | ~1 in 741 |
| Below −2 SD | ≤ 70 | 2.3% | ~1 in 44 |

The numbers become rapidly more extreme as you move into the tails. An IQ of 145 (+3 SD) corresponds to roughly 1 in 741 people; an IQ of 160 (+4 SD), assuming the normal distribution holds — which, as discussed below, it does not entirely — corresponds to about 1 in 31,560. For where these scores sit conceptually and how they connect to descriptive labels, see the related guides on high IQ ranges and percentiles and what an IQ of 130, 140, or 150 actually means.
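
All of these band percentages and tail frequencies can be reproduced directly from the normal distribution. A minimal sketch in Python, assuming only the mean-100, SD-15 convention described above:

```python
from scipy.stats import norm

MEAN, SD = 100, 15  # the deviation-IQ convention

def iq_band_share(low, high, mean=MEAN, sd=SD):
    """Proportion of the population expected to score between low and high."""
    return norm.cdf(high, mean, sd) - norm.cdf(low, mean, sd)

def iq_rarity(score, mean=MEAN, sd=SD):
    """Approximate '1 in N' rarity of scoring at or above `score`."""
    return 1 / norm.sf(score, mean, sd)

print(f"85-115 covers {iq_band_share(85, 115):.1%}")   # about 68.3%
print(f"70-130 covers {iq_band_share(70, 130):.1%}")   # about 95.4%
print(f"IQ 145+: about 1 in {iq_rarity(145):,.0f}")    # roughly 1 in 741
print(f"IQ 160+: about 1 in {iq_rarity(160):,.0f}")    # roughly 1 in 31,600 under a strict normal model
```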

How tests are normed: the bell curve as a construction

Producing a normally distributed IQ scale involves several technical steps that most test-takers never see, and that explain why the distribution looks the way it does.

Step 1 — Item development and pilot testing. Developers write hundreds of candidate items spanning the range of difficulty the test must cover. These are administered to pilot samples and analyzed using either classical test theory or item response theory (IRT) to estimate item difficulty, discrimination, and how well each item fits with the others.

Step 2 — Standardization sampling. The finalized item set is administered to a carefully constructed normative sample. The WAIS-IV standardization included 2,200 adults stratified to match U.S. Census demographics on age, sex, education, ethnicity, and region; the WISC-V used a comparably structured sample of 2,200 children and adolescents.

Step 3 — Raw-to-scaled-score conversion. Raw scores are converted to scaled scores through a process that imposes the desired distributional shape: typically by ranking scores in the standardization sample, converting ranks to z-scores via the normal distribution, and linearly transforming to the IQ metric (×15 + 100). The output is, by construction, normally distributed within the standardization sample.
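
A stripped-down version of that conversion might look like the sketch below. It captures the general rank-to-normal idea, not the smoothing, continuity corrections, or subtest aggregation a real publisher applies, and the toy raw-score distribution is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.stats import norm, rankdata

def raw_to_iq(raw_scores):
    """Map raw norm-sample scores to deviation IQs (mean 100, SD 15) by pushing
    percentile ranks through the inverse normal CDF, as in Step 3 above."""
    raw = np.asarray(raw_scores, dtype=float)
    n = len(raw)
    ranks = rankdata(raw, method="average")   # tied raw scores share an average rank
    percentiles = (ranks - 0.5) / n           # keep every percentile strictly inside (0, 1)
    z = norm.ppf(percentiles)                 # rank -> z-score via the normal quantile function
    return 100 + 15 * z                       # linear transform onto the IQ metric

# Toy norm sample: whatever shape the raw scores take, the output is normal by construction.
raw_sample = np.random.default_rng(0).poisson(lam=30, size=2200)
iqs = raw_to_iq(raw_sample)
print(f"mean {iqs.mean():.1f}, SD {iqs.std():.1f}")   # approximately 100 and 15
```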

Step 4 — Age norming. Cognitive abilities change with age, so separate norm tables are built for different age bands. A 25-year-old and a 70-year-old who give the same raw performance will receive different IQ scores because they are compared to different age-peer reference distributions. “IQ 110” means “outperforming about 75% of same-age peers,” not “75% of all humans.”
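
To make the age-norming point concrete, here is a toy deviation-IQ lookup against two hypothetical age bands. The raw-score means and SDs are invented for illustration and do not come from any published norm table.

```python
# Hypothetical age-band norm parameters: the raw-score means and SDs below are
# invented for illustration and are not taken from any real test manual.
AGE_BAND_NORMS = {
    "20-34": {"raw_mean": 42.0, "raw_sd": 8.0},
    "65-79": {"raw_mean": 34.0, "raw_sd": 9.0},
}

def deviation_iq(raw_score, age_band, norms=AGE_BAND_NORMS):
    """Score the same raw performance against same-age peers only."""
    band = norms[age_band]
    z = (raw_score - band["raw_mean"]) / band["raw_sd"]
    return 100 + 15 * z

# The identical raw performance earns different IQs in different age bands.
print(round(deviation_iq(40, "20-34")))   # slightly below the young-adult mean -> ~96
print(round(deviation_iq(40, "65-79")))   # well above the older-adult mean -> ~110
```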

The crucial point is that the bell curve is the output of standardization, not its input. Whatever shape raw performance actually takes, the scaling procedure forces the published distribution to be normal. The Central Limit Theorem argument explains why this forced shape is also a reasonable description of underlying variation; the standardization machinery explains why the published numbers fit the curve regardless.

Where the normal distribution actually fails

For the central 95% of the distribution, the bell curve is an excellent description. At the extremes, three known deviations matter for interpretation.

Excess at the low end. More people score below IQ 70 than the normal curve predicts. The excess reflects pathological causes of intellectual disability — chromosomal disorders, severe perinatal injury, profound environmental deprivation — that produce cognitive impairment outside the polygenic-environmental architecture of normal variation. Zigler and Hodapp’s (1986) “two-group” model formalized this: the lower tail of normal variation accounts for some mild intellectual disability, while organic causes produce a separate, smaller distribution at lower ability that adds a bump to the overall curve below about IQ 50.

Test ceiling at the high end. IQ tests have finite item banks, and once an examinee answers every available high-difficulty item correctly the test cannot distinguish further. This produces a ceiling effect that compresses scores above approximately IQ 145–160 on most clinical tests, so whether the actual ability distribution thins out exactly as the normal curve predicts at IQ 160+ is not cleanly testable with standard instruments.

Conditional standard errors increase at the tails. Reliability is highest near the mean, where the item bank is densest, and lower at the extremes. The published average reliability (α ≈ .97 and SEM ≈ 2.6 for full-scale IQ on Wechsler tests) understates the measurement uncertainty around scores near IQ 145 or 55, where confidence intervals are wider than the average SEM implies.

The Flynn effect: why the bell curve keeps moving

Even when a test is well-normed, the population it measures does not stand still. James Flynn’s (1987) analysis of IQ data from 14 nations established that raw IQ performance has risen substantially across the twentieth century — roughly 3 points per decade in the United States and most industrialized countries. The phenomenon, now called the Flynn effect, has been confirmed in two independent meta-analyses. Trahan, Stuebing, Fletcher, and Hiscock (2014), pooling 285 studies, estimated mean gains of 2.31 IQ points per decade (2.93 for modern Stanford-Binet and Wechsler tests since 1972). Pietschnig and Voracek’s (2015) more comprehensive synthesis of 271 samples and nearly 4 million participants estimated annual gains of 0.41 IQ points for fluid reasoning, 0.28 for full-scale IQ, and 0.21 for crystallized knowledge.

For the bell curve this has two consequences. First, every set of norms has a sell-by date: a person scoring 100 against 1990 norms might score only about 91 against 2020 norms because the 1990 mean is now below the 2020 mean. Publishers re-standardize periodically (WAIS-IV in 2008, WAIS-5 in 2024) to recenter, but between revisions the norms drift relative to the actual population. Second, gains have not been uniform across cognitive domains — fluid reasoning has risen faster than crystallized knowledge — so the shape of the cognitive-ability distribution has shifted in ways a single Flynn-effect number obscures.
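
As a back-of-the-envelope illustration of that first consequence, the drift correction is just the assumed gain rate multiplied by the age of the norms. The 3-points-per-decade figure below is the rule-of-thumb rate cited above, not a property of any specific test.

```python
def flynn_adjusted_iq(observed_iq, norm_year, current_year, gain_per_decade=3.0):
    """Rough correction for norm obsolescence: a score earned against old norms
    overstates standing relative to today's population by roughly the Flynn-effect
    gain accumulated since the test was standardized."""
    drift = gain_per_decade * (current_year - norm_year) / 10
    return observed_iq - drift

# A score of 100 against 1990 norms, re-expressed against a 2020 reference population:
print(flynn_adjusted_iq(100, norm_year=1990, current_year=2020))   # 91.0
```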

Why the normal distribution matters for test interpretation

The practical value of forcing scores into a bell curve is that it converts raw item-counts into language that means something across people, ages, and tests.

  • Percentile rankings. Telling a parent their child scored at the 84th percentile (IQ 115) communicates standing in a way that “47 of 60 items correct” cannot. Percentiles flow directly from the normal distribution and make scores from different tests comparable.
  • Confidence intervals. Because the test’s standard error of measurement is known and the score distribution is normal, clinicians can compute the probability that an examinee’s true score falls within a given range. A reported IQ of 110 with a 95% confidence interval of 105–115 communicates the genuine uncertainty in the measurement; a single number alone implies false precision.
  • Discrepancy analysis. When a person performs unevenly across cognitive domains — high verbal, low processing speed, for example — the normal distribution provides the framework to determine whether the difference is statistically unusual or within expected sampling variation. This is foundational for diagnoses such as specific learning disabilities and ADHD.
  • Cross-test comparison. Because all major modern IQ tests are normed to the same metric (mean = 100, SD = 15), Wechsler, Stanford-Binet, and WJ-IV scores can be meaningfully compared. Older or specialized scales that use different SDs (e.g., the Cattell Culture Fair test uses SD = 24) require translation: a “Cattell IQ” of 148 corresponds to roughly the 98th percentile, the same as a “Wechsler IQ” of 130 — both are +2 SD on their respective scales despite the 18-point numerical gap. The sketch after this list shows how these conversions work in practice.
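
A compact sketch of those conversions, using the SEM of about 2.6 and the SD = 24 Cattell scale mentioned in this article as illustrative parameters:

```python
from scipy.stats import norm

def iq_percentile(score, mean=100, sd=15):
    """Percentile rank implied by a normally distributed IQ score."""
    return 100 * norm.cdf(score, mean, sd)

def iq_confidence_interval(score, sem=2.6, coverage=0.95):
    """Symmetric confidence band around an observed score, given the test's SEM."""
    z = norm.ppf(0.5 + coverage / 2)
    return score - z * sem, score + z * sem

def convert_scale(score, from_sd, to_sd, mean=100):
    """Translate a score between scales with different SDs by matching z-scores."""
    z = (score - mean) / from_sd
    return mean + z * to_sd

print(f"IQ 115 -> {iq_percentile(115):.0f}th percentile")                         # ~84th
lo, hi = iq_confidence_interval(110)
print(f"95% CI around 110: {lo:.1f} to {hi:.1f}")                                 # about 105 to 115
print(f"Cattell 148 on the Wechsler metric: {convert_scale(148, 24, 15):.0f}")    # 130
```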

Schmidt and Hunter’s (1998) Psychological Bulletin meta-analysis of 85 years of personnel-selection research established that general mental ability — measured by exactly this kind of standardized IQ scoring — is among the strongest single predictors of job performance across occupations, with validity coefficients around r = 0.5 for medium-complexity work. Predictive validity at that level depends entirely on the standardization framework that the normal distribution provides; without it, raw scores would not even be comparable, let alone interpretable as predictors.

Common misconceptions about the IQ bell curve

Several persistent myths about the bell curve resist correction in popular coverage.

“IQ is fixed and the curve is destiny.” The bell curve describes the population distribution at a moment in time; it says nothing about within-person stability or capacity for change. Education, health, and cumulative experience can shift an individual’s measured score by 10–15 points or more across a lifetime even while the population distribution holds its shape. Deary (2012) reviews both the substantial test-retest stability of IQ across decades and the real, smaller malleability seen with specific interventions.

“A 15-point difference always means the same thing.” Fifteen points is one SD in the population, but the practical implications depend on where on the curve the difference falls. The gap between 85 and 100 typically affects everyday functioning more than the gap between 130 and 145, because the lower portion spans the threshold where common cognitive demands become difficult.

“All IQ tests are on the same scale.” Modern Wechsler, Stanford-Binet, and most clinical batteries use mean = 100 and SD = 15. Older or specialized tests do not. A “140” on Wechsler (+2.67 SD, 99.6th percentile) is very different from a “140” on the Cattell Culture Fair (+1.67 SD, 95th percentile). Check the SD of the test before interpreting the number.

“The bell curve proves group differences are innate.” The within-group distribution is silent on the causes of between-group differences. High within-group heritability does not imply that between-group differences are genetic — a methodological point that has been settled in behavior genetics for half a century. Mean differences between groups are an empirical observation; their causes are a separate, harder question the bell curve cannot answer.

Frequently asked questions

Why is the average IQ exactly 100?

Because test publishers define it that way. The mean of 100 was inherited from the older mental-age ratio method, where 100 represented mental age equal to chronological age. Wechsler’s 1939 deviation-IQ formulation kept the mean at 100 for continuity, and every major IQ test since has done the same. There is nothing special about the number itself — it is a chosen anchor, not a discovered constant.

Why is the standard deviation 15 instead of 10 or 20?

SD = 15 is a convention adopted by Wechsler and now used by Stanford-Binet, Woodcock-Johnson, the Reynolds Intellectual Assessment Scales, and most other major batteries. Some older or specialized tests use different values: the Cattell Culture Fair uses SD = 24, and earlier Stanford-Binet revisions used SD = 16. When comparing scores across tests, what matters is not the raw number but how many SDs above or below the mean it represents.

Does the bell curve actually describe intelligence in the real world?

For the central 95% of the population, yes — closely. The Central Limit Theorem applied to the polygenic, multifactorial architecture of intelligence (Plomin & Deary, 2015; Savage et al., 2018) predicts approximately normal variation, which is what large-sample data show. At the low extreme, organic causes of intellectual disability add a bump that the normal curve underpredicts. At the high extreme, test ceiling effects and limited sample sizes make the precise shape uncertain.

How rare is an IQ of 130 or 140?

Under the normal distribution, IQ 130 (+2 SD) occurs in about 2.3% of the population — roughly 1 in 44 people. IQ 140 (+2.67 SD) is around 1 in 261. IQ 145 (+3 SD) is about 1 in 741. These are theoretical frequencies; the upper-tail rates depend on whether the actual ability distribution thins out exactly as the normal curve predicts, which is not perfectly established because most clinical IQ tests cannot measure reliably above approximately 145–160.

Has the average IQ gone up over time?

Yes, substantially. The Flynn effect — first documented systematically by Flynn (1987) and quantified in meta-analyses by Trahan et al. (2014) and Pietschnig and Voracek (2015) — refers to gains of approximately 2–3 IQ points per decade across the twentieth century. Tests are re-normed periodically to recenter the average at 100, so a person scoring 100 against current norms would have scored higher against older norms.

Why does my reported IQ score include a confidence interval?

Because no test measures perfectly. The standard error of measurement (SEM) for full-scale IQ on a modern Wechsler test is about 2.6 points, which means the reported score is the center of a band of probable true scores rather than a single fixed value. A 95% confidence interval of about ±5 points around the reported score is standard. Two scores whose confidence intervals overlap are not meaningfully different even if the point estimates differ by several points.

References

  • Deary, I. J. (2012). Intelligence. Annual Review of Psychology, 63, 453-482. https://doi.org/10.1146/annurev-psych-120710-100353
  • Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101(2), 171-191. https://doi.org/10.1037/0033-2909.101.2.171
  • Pietschnig, J., & Voracek, M. (2015). One century of global IQ gains: A formal meta-analysis of the Flynn effect (1909–2013). Perspectives on Psychological Science, 10(3), 282-306. https://doi.org/10.1177/1745691615577701
  • Plomin, R., & Deary, I. J. (2015). Genetics and intelligence differences: Five special findings. Molecular Psychiatry, 20(1), 98-108. https://doi.org/10.1038/mp.2014.105
  • Savage, J. E., Jansen, P. R., Stringer, S., Watanabe, K., Bryois, J., de Leeuw, C. A., et al. (2018). Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nature Genetics, 50(7), 912-919. https://doi.org/10.1038/s41588-018-0152-6
  • Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262-274. https://doi.org/10.1037/0033-2909.124.2.262
  • Trahan, L. H., Stuebing, K. K., Fletcher, J. M., & Hiscock, M. (2014). The Flynn effect: A meta-analysis. Psychological Bulletin, 140(5), 1332-1360. https://doi.org/10.1037/a0037173
  • Wechsler, D. (1939). The Measurement of Adult Intelligence. Williams & Wilkins.
  • Zigler, E., & Hodapp, R. M. (1986). Understanding Mental Retardation. Cambridge University Press.

Cite This Article

Jouve, X. (2026, March 17). The IQ Bell Curve: How Scores Are Distributed. PsychoLogic. https://www.psychologic.online/iq-bell-curve/