Age bias may hide hypothyroidism under a normal TSH

Scientists have been explaining that TSH responds to thyroid hormones differently in childhood, early adulthood, late adulthood and very advanced age.

This poses a problem for regions that have implemented TSH-only screening for thyroid dysfunction.

The effect of age on TSH is one of many factors that can make this screening test less accurate (Ling et al, 2018).

Thyroid scientists have known about the age-distortions of TSH for at least 20 years (Hollowell et al, 2002). People have been advocating for age-specific reference intervals for a long time (Surks & Hollowell, 2007).

Many advocates emphasize the older population’s need for a higher TSH reference upper limit while ignoring the effects a higher limit would have on the younger population.

That one-sided advocacy to prevent overmedication of the elderly has failed to emphasize the fact that age can manipulate the TSH-FT4 relationship in the opposite direction in younger adults.

It has led some people to imagine that raising the TSH limit for all ages wouldn’t hurt.

Here’s a research-based example of what age bias can do:

If a region’s TSH upper limit is inflated high enough, such as 6.5 mIU/L (Alberta raised its reference limit from 4.0 to 6.5 in June 2022), it will reduce false-positive screening results for hypothyroidism in very old people. An upper limit of 6.5 will accommodate a lot of healthy older women’s naturally higher TSH levels.

The very old person’s test result will make clinical biochemists and health care systems happy. A FT4 test won’t be triggered by a “high” result in their algorithm. That money can now pay for a test for a disease people in our medical culture consider more urgent or deadly.

But the very same TSH upper limit of 6.5 will increase the rate of false-negative screening results for 20- to 70-year-olds with thyroid dysfunction. Their TSH does not rise as high in response to the same FT4 levels as it does in older citizens. The TSH will not be a “sensitive” screening tool for them. The 6.5 upper limit will fail to flag a result such as a TSH of 5.5 mIU/L as abnormally high for their age.

The young and middle-aged adult’s false-negative result will make clinical biochemists and health care administrators shrug. They accept the fact that a TSH of 5.5 is an abnormal result for their age. Meh. It’s only “mild” hypothyroidism, they say.

But nobody who supports this kind of policy checks to see if people are suffering excruciating symptoms that a naive physician will blame on their other health conditions for the next ten years or more. It’s a medical error that they can sweep under the rug. It won’t be considered an error if the result is declared “normal” in their population.

Recently, researchers found that women in their 20s had a 97.5th percentile TSH of 3.21 mIU/L, while women in their 60s had an upper limit of TSH of 5.07 (Barhanovic et al, 2019).
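To make the comparison concrete, here is a minimal Python sketch that checks one TSH result against the two female upper limits quoted above and against a single age-blind limit of 6.5 mIU/L. The function and variable names are illustrative assumptions, not part of any laboratory system. Only the numbers come from the sources cited in this article.

```python
# Upper limits in mIU/L, as quoted in this article.
# Names are hypothetical and chosen for illustration only.
AGE_SPECIFIC_UPPER = {"20-29": 3.21, "60-69": 5.07}  # women, Barhanovic et al, 2019
SINGLE_UPPER = 6.5  # the age-blind upper limit from the example in this article


def is_flagged(tsh, upper_limit):
    """Return True when a TSH result exceeds the given upper reference limit."""
    return tsh > upper_limit


tsh = 5.5  # the example result discussed in this article
for age_group, limit in AGE_SPECIFIC_UPPER.items():
    print(
        age_group,
        "age-specific flag:", is_flagged(tsh, limit),
        "| single-limit flag:", is_flagged(tsh, SINGLE_UPPER),
    )
```

In this sketch, a TSH of 5.5 is flagged under either age-specific limit but passes as “normal” under the single 6.5 limit.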

Another study found that if a 25-year-old with a Free T4 level of 10 pmol/L has a given TSH response, it’s likely a 45-year-old will have a 22% higher TSH. A 65-year-old is likely to have an additional 22% higher TSH (Brown et al, 2016).
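As a rough illustration of how those percentages stack up, the Python sketch below assumes that the “additional 22%” compounds on top of the 45-year-old’s value. That compounding rule and the baseline TSH of 2.0 mIU/L are assumptions made for illustration. They are not taken from Brown and colleagues.

```python
def projected_tsh(tsh_at_25, steps, increase_per_step=0.22):
    """Scale a baseline TSH by 22% per age step (assumed to compound)."""
    return tsh_at_25 * (1 + increase_per_step) ** steps


baseline = 2.0  # hypothetical TSH (mIU/L) for a 25-year-old at FT4 10 pmol/L
for age, steps in ((25, 0), (45, 1), (65, 2)):
    print(age, round(projected_tsh(baseline, steps), 2))
# prints:
# 25 2.0
# 45 2.44
# 65 2.98
```

Under these assumptions, the 65-year-old’s TSH is about 49% higher than the 25-year-old’s at the same FT4.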

Complicating the age factor is biological sex. On average, the male HPT axis does not age in the same way as the female HPT axis. On average, across all decades of life, females will have a 12% higher TSH than males when their FT4 is at the lower reference limit (Brown et al, 2016).

But in fact, you can’t generalize about age without considering sex, nor generalize about sex without considering age, because the TSH-FT4 relationship fluctuates in different directions in males and females from one age group to the next (Strich et al, 2016).

If administrators continue to force everyone in the population to be screened by an age-blind, sex-blind TSH interval, then patients will rely on physicians who understand how to interpret a TSH screening test with age and sex in mind.

In this scientific research review intended for thyroid experts (physicians, administrators, scientists and educated thyroid patients), I discuss some of the best recent research on age- and sex-based shifts in TSH reference intervals, as well as age-based shifts in the TSH-FT4 relationship.

COPYRIGHT NOTICE: Reproduction of a copyrighted article’s data, graphics, and quotations falls within US copyright law as “fair use” and within Canadian copyright law as “fair dealing.” — “A fair use is any copying of copyrighted material done for a limited and “transformative” purpose, such as to comment upon, criticize, or parody a copyrighted work. Such uses can be done without permission from the copyright owner” (Stanford University).

How healthy and how age-biased is the reference population?

The healthiest populations that have no interferences with thyroid function tend to have a narrow TSH interval whose upper limit rises only to around 2.5 or 3.0 mIU/L (Walsh, 2022; Aw et al, 2019).

However, since the early 2000s, the consensus has been to place the upper reference limit of TSH around 4 or 4.5. This was a compromise to prevent overdiagnosis and overtreatment of hypothyroidism.

Scientists have always known that the statistics for TSH differ based on the criteria you use to screen the population.

When the reference population is not carefully screened, the percentage of people with TSH above 4.5 and below 0.4 rises in the advanced age group.

Three different levels of screening are depicted by three lines on the following graph from Hollowell and colleagues’ 2002 report based on the National Health and Nutrition Examination Survey in the United States (NHANES III).

The X axis, in fine print, has age intervals on it.

Look at the dip in the “20-29” age group in the graph on the left for the hypothyroid end of the spectrum, and contrast it with the bump in the 20-39 age group in the graph on the right. This is where later researchers found an age-based shift in the TSH response to FT4 that can conceal some cases of hypothyroidism within the high-normal range. Apparently this age group makes hyperthyroidism a little easier to detect. I’ll return to this age group below.

On the right-hand side of both graphs, even in the healthiest “reference population,” the percentage of people with abnormal results rises with advanced age, after the 6th decade for hypothyroidism, and after the 7th decade for hyperthyroidism.

How do you screen a population to set a reference interval? In Hollowell’s study, the “disease free” population was filtered only by the absence of self-reported thyroid disease.

“Disease-free excludes those people who have reported having thyroid disease, goiter, or taking thyroid medications”

(Hollowell et al, 2002)

In contrast, their “reference population” was more carefully screened not only for the biochemical absence of thyroid disease, but many other health factors:

“Risk factors include pregnancy, taking estrogen, androgens, or lithium, and the presence of thyroid antibodies and biochemical evidence of hypothyroidism or hyperthyroidism.” 

(Hollowell et al, 2002)

If they don’t exclude “risk factors,” researchers will obtain an inflated reference interval because of the age effect.

In addition, it may also matter if a researcher divides their population by race/ethnicity, although it’s hard to tell if the effect was based on ancestry, iodine intake, or nutrition.

Therefore, scientists have known for a long time that a single TSH reference interval such as 0.4 to 4.5 mIU/L was going to detect hypothyroidism in the elderly more often than in the young.

In 2002, they were clearly fascinated by the age-effect, but it was not yet widely known that younger people could manifest the same degree of thyroid dysfunction with a more subtle elevation in TSH than their elders.

Part of their problem was imprecise assay technology. Their distinction between “clinically significant” versus “subclinical or mild” thyroid dysfunction was based on Total T4 measurements. Quality Free T4 immunoassays were not yet in widespread use.

In 2007, Surks and Hollowell published a follow-up article in which they argued for age-specific intervals:

“TSH distribution progressively shifts toward higher concentrations with age. The prevalence of SCH may be significantly overestimated unless an age-specific range for TSH is used.”

In the body of their 2007 report, they even suggested that they had underestimated the true prevalence of abnormally elevated TSH in younger age groups:

“If an age-specific 97.5 centile is used instead of the fixed 4.5 mIU/liter, as suggested by the present analysis, the prevalence of raised TSH concentrations would be less in older people and greater in younger age groups than previously reported.”

(Surks & Hollowell, 2007)

Many other scientists have echoed this call for age-based TSH intervals, but the drive for simplicity overwhelms the drive for accurate diagnosis. People who imagine that a “mildly elevated” TSH cannot be associated with tissue hypothyroidism clearly do not understand how harmful a false-negative diagnosis can be.

Now that we know the background, let’s leave 2002 behind and move on to newer research that builds on this.

The 20-39 age group has a distinctive TSH-FT4 relationship.

Hadlow and colleagues (2013) performed some excellent research that demonstrated that age alters the correlation between TSH and FT4.


In the graphs below, the exclusion criteria for the study population were:

  • no factors [such as TPOab positive status, drugs] confounding the physiological HPT axis response or resulting in analytical interference with thyroid function testing.
  • no hospitalized patients,
  • no pregnant women,
  • no patients aged younger than 1 year,
  • no patients receiving specialist endocrine, surgical, or medical care,
  • no patients having records with an unknown time of collection or collection times outside usual office hours
  • no treatment of Graves disease or thyrotoxicosis, multinodular goiter, thyroid cancer, partial or total thyroidectomy, or hypopituitarism
  • no history of treatment with radioiodine, antithyroid drugs, lithium, antiepileptic drugs, amiodarone, or liothyronine (LT3). (Patients treated with thyroxine (LT4) were analyzed separately from the rest of the population.)

Each age group has a different TSH-FT4 relationship trajectory through the normal reference ranges for both hormones. This holds true in a large population where the effects of thyroid autoimmunity, drugs, and other factors are excluded.

The largest difference is seen in the median TSH of the age group (both sexes) between 20 and 40 years of age, depicted by the lightest gray line that sags below the other three darker lines throughout the blue rectangle representing normal TSH and normal FT4.

Find the trendlines that align with “6” on the Y axis.

Find the trendlines that align with “4” on the Y axis. They reveal the following discrepancies across age groups.

  • The 50th percentile of 20-39-year-olds will have FT4 ~10 pmol/L
  • The 50th percentile of 40-59-year-olds will have FT4 ~12 pmol/L
  • The 50th percentile of >80-year-olds will have FT4 ~15 pmol/L

There is only a 50-50 chance that a “normal” TSH level of 3.9 mIU/L will rule out a low FT4 of <10 pmol/L for the age group 20-39. Their trendline crosses the point where the lower FT4 reference limit and the upper TSH reference limit meet. Given that the trendlines are 50th percentile medians, half of the data points fall on each side of the trendline.

This means that a senior with a TSH of 4.5 is not “equally hypothyroid” (will not have the same FT4 level) as a 30-year-old man with a TSH of 4.5.

It compromises the “predictive value” of the TSH-only screening test when its thresholds (cutoffs) are not closely correlated with a diagnosis confirmed by FT4 measurements.

Brown’s estimate of the difference of age and sex on TSH-FT4 relationships

Next is another set of images expressing a similar concept, from Brown and colleagues (2016), an article also coauthored by Hadlow.


This population of 4,427 people was not as carefully groomed as the 120,000+ population in the graphs above:

  • Not a diverse ethnic/racial group. Busselton, Australia is a “predominantly Caucasian population.”
  • Not likely to suffer iodine deficiency, as it is “an iodine-sufficient region.”
  • Not taking known medications that may influence TSH or FT4, based on the individual’s self-report as they completed a survey. Excluded on this basis were “128 individuals on thyroxine treatment; 4 on antithyroid drug treatment; and 39 on drugs affecting pituitary–thyroid axis function or thyroid function tests (including amiodarone, phenytoin, carbamazepine and lithium carbonate).”
  • After biochemical analysis, the following were excluded: “4 with low free T4 accompanied by inappropriately low TSH suggesting hypopituitarism” and “2 with [extremely] outlying free T4”

Therefore, the population included people with positive TPO antibodies, and people who may have had severe chronic illnesses despite not being hospitalized.

The graph on the left depicts the “negative double sigmoidal” pattern of the hormone correlations with a scatterplot, showing once again that it is not a simple “log-linear” regression line, but a hormone relationship that flattens within the normal ranges when many diverse subgroups are included.

Each dot represents one person’s TSH-FT4 relationship.

On the right, the section in the red box is expanded, and the logarithmic TSH scale becomes a linear TSH scale.

There are “regression model predictions” in the graph on the right.

A caution is in order. This was a simulation based on all the data in three age groups, so it is not based on a real scatterplot of data, but based on input variables into a program, isolating age from other variables. Therefore, it is highly unlikely that a healthy individual aged 16-40 will have a TSH of 2.2 mIU/L when their FT4 is as low as 10 pmol/L, as this region of the red box, enlarged in the linear scale graph, was sparsely populated in the scatterplot graph on the left.

Nevertheless, the visualization based on modeling shows that the degree of TSH elevation per unit of FT4 based on age group is significant.

This age graph is duplicated below, next to the effect of biological sex, for comparison:

As stated in small print under the image, at an FT4 of 10 pmol/L:

  • TSH is 22% higher per age tertile, on average
  • TSH is 12% higher for males than females, on average.

These graphs involve artificial statistical manipulations:

  • The graph on the left depicts the effect of age alone (all other variables, including sex, being controlled)
  • The graph on the right depicts the effect of sex alone (all other variables, including age, being controlled).

Since an individual cannot have an age without simultaneously having a biological sex, these graphics are not realistic as they artificially isolate the one variable from the other. Unfortunately, in reality,

  • One cannot simply “add” the TSH-amplifying effect of being female to the TSH-amplifying effect of being over 80 years old.
  • Nor can one “subtract” the effect of being male from the TSH-limiting effect of being aged 16-40.

The age-effect and sex-effect are not linear across age groups, as shown below.

How TSH-FT4 relationships change across age in females

This section answers an important question:

How biased can TSH screening be when there is a single TSH reference interval for all ages >1 year old, when screening adult females?

Females are at higher risk of acquired hypothyroidism than males.

“Spontaneous hypothyroidism is about 10 times more prevalent in women compared to men”

(Effraimidis et al, 2021)

Back in 2002, Hollowell found that the effect of “risk factors” on the average TSH level is also unevenly distributed among men and women across the lifespan; the women with risk factors had a TSH that increased through adulthood.

Therefore, it’s important to have a screening threshold that is sensitive to the development of hypothyroidism at all ages.

In 2019, Barhanovic and colleagues examined the association between age and TSH in 946 healthy females aged 20-69 on an Abbott Architect i2000 platform.


Their exclusion criteria were:

  • no personal and/or family history of thyroid disease,
  • not positive for thyroglobulin and/or thyroid peroxidase antibodies (TGAb <4.1 mU/L, TPOAb <5.6 IU/mL),
  • not taking medicine affecting thyroid function (oral contraceptives, estrogen, glucocorticoids, anti-epileptic drugs, amiodarone, and lithium),
  • no diagnosis or a suspected case of polycystic ovarian syndrome (PCOS),
  • no history of severe non-thyroid illness,
  • no pregnancy or breast feeding.
  • no subjects with serum values of TSH<0.1 mU/L or TSH>10.0 mU/L as these results indicate a high probability of thyroid dysfunction.

As you can see, the TSH upper limit in the female reference population rises with age.

The TSH reference interval always has a skewed population distribution with a low median. According to their data tables, this is the difference that age makes to the percent of their female reference population with higher TSH levels:

  • Among women aged 20-29,
    • 11.3% had a TSH between 2.5 and 4.0, and
    • 1% had a TSH above 4.0 mIU/L.
  • Among women aged 60-69,
    • 27.6% had a TSH between 2.5 and 4.0, and
    • 6.2% had a TSH above 4.0 mIU/L.
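The contrast between the two age groups is easier to see when the percentages are combined. The Python sketch below uses only the four figures listed above. The dictionary layout and function names are illustrative assumptions, and it assumes the two TSH bands do not overlap.

```python
# Percent of each female age group in each TSH band, as listed above.
shares = {
    "20-29": {"2.5-4.0": 11.3, ">4.0": 1.0},
    "60-69": {"2.5-4.0": 27.6, ">4.0": 6.2},
}


def share_above_2_5(age_group):
    """Total percent of the group with TSH above 2.5 mIU/L (bands assumed disjoint)."""
    return round(sum(shares[age_group].values()), 1)


def ratio_above_4(older, younger):
    """How many times more common a TSH above 4.0 is in the older group."""
    return round(shares[older][">4.0"] / shares[younger][">4.0"], 1)


print(share_above_2_5("20-29"))         # 12.3
print(share_above_2_5("60-69"))         # 33.8
print(ratio_above_4("60-69", "20-29"))  # 6.2
```

In other words, about one in eight of the youngest women had a TSH above 2.5 mIU/L, compared with about one in three of the oldest women.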

What did they find regarding the FT3 and FT4 reference intervals? Did they fall as one would expect, as TSH rose across the age groups? No, they did not. According to their age groups’ median levels of FT3 and FT4,

“No statistically significant differences in the serum values of free triiodothyronine [FT3] were observed between any of the groups (p=0.17).

Free thyroxin [FT4] values were significantly higher in the two oldest groups compared to first, second and third age-related group (p<0.01).”

(Barhanovic et al, 2019)

The differences between age groups are exaggerated when looking at the upper reference limits:

  • The FT4 upper reference limit was
    • 17.53 and 16.11 pmol/L (10.59 – 17.53) in the two youngest cohorts, but
    • 19.08 (11.01 – 19.08) in the oldest cohort.
    • This is a rise from 63.2% to 100% of the oldest cohort’s reference range width (11.01-19.08)
  • The FT3 upper reference limit was
    • 5.8 and 5.7 pmol/L (3.2 – 5.8) in the two youngest cohorts, and
    • 6.5 in the oldest cohort (3.2 – 6.5).
    • This is a rise from about 76% to 100% of the oldest cohort’s reference range width (3.2 – 6.5)
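The percentages in the list above can be checked with a small calculation. This Python sketch expresses a younger cohort’s upper limit as a fraction of the oldest cohort’s reference range width. The helper name is hypothetical, and the sketch assumes the lower of the two younger upper limits (16.11 for FT4, 5.7 for FT3) is the one being compared.

```python
def percent_of_range(value, lower, upper):
    """Position of a value within a reference range, as a percent of the range width."""
    return (value - lower) / (upper - lower) * 100


# Oldest cohort's reference ranges, as listed above.
ft4_percent = percent_of_range(16.11, lower=11.01, upper=19.08)  # FT4, pmol/L
ft3_percent = percent_of_range(5.7, lower=3.2, upper=6.5)        # FT3, pmol/L

print(round(ft4_percent, 1))  # 63.2
print(round(ft3_percent, 1))  # 75.8
```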

Therefore, the average TSH levels in the oldest group of women were not inverse reflections of lower FT3 or FT4 levels. Both the FT3 and FT4 rose with age in untreated, healthy women without thyroid disease.

In fact, the extra TSH stimulation appears to have enhanced T3 and T4 secretion from the older women’s healthy thyroid glands. The HPT axis is altered by aging.

How would a single TSH reference range affect Barhanovic’s young adult female cohorts?

Barhanovic reported that many potentially hypothyroid women between age 20 and 40 would fail to be detected by TSH-only screening policy if the reference range used the manufacturer’s recommended upper limit of 4.94 mIU/L instead of their cohort’s age-specific upper limit.

  • 5.2% of healthy women aged 20-29 (TSH 3.21 upper limit)
  • 3.3% of women aged 30-39 (TSH 3.60 upper limit)

Since the upper limit is set at the 97.5th percentile, only 2.5% of the healthy population ought to be above the reference interval. This was not the case for the two youngest age groups, which had more than 2.5% of their population above range.

Finally, consider that their reference interval only accounts for the norms of healthy women. If the local policy is to reserve TSH screening only for symptomatically hypothyroid patients and those with risk factors for thyroid disease, the percentage of women with a falsely “normal” TSH test result may rise.

The TSH-FT4 relationship differs in males and females.

In their excellent research, Strich and colleagues (2016) analyzed the shifts in the core of the HPT axis by sex. They used a Cobas Roche platform for all their data. Their sample size was 10,227 males and 17,713 females (27,940 total).


Their exclusion criteria were:

  • no positive titers of TPOAb or TGAb,
  • no past or current thyroid treatment with levothyroxine, methimazole, propylthiouracil, recombinant thyrotropin,
  • no treatment with antiepileptic drugs, lithium, or glucocorticoids.
  • no samples in which TSH levels were above 7.5mIU/L or below 0.2mIU/L

(NOTE: In Strich’s table, ages 40-60 and 60-80 each have single data points representing two decades of life. In the graph above, these data have been spread across two decades so that the X axis can function as a timeline with 10 year increments.)

It certainly looks like the HPT axis is skewed not only by age but also by sex, even when thyroid antibodies are excluded.

The different TSH-FT4 relationship that Hadlow and colleagues saw in the 20-39 age group is confirmed here in the age 20-29 group by Strich and colleagues. There is a trend toward a positive TSH-FT4 correlation in both males and females in the 20-29 age group.

The “negative logarithmic” relationship is temporarily reversed, on average, in the 20-39 age group, which becomes mildly positive in males.

Averaging the data from both sexes into a single line on a graph will conceal the sex difference.

Isn’t it interesting how male and female pituitary sensitivity to FT4, or their thyroid’s sensitivity to TSH, or both, can shift in opposite directions over a lifespan?

Strich articulated the results this way:

“There was a parallel stepwise decrease in FT4 levels until age 40. …

There was a negative correlation between FT4 and TSH (r = −0.02, P = 0.01) up to age 80.

In general, both FT4 and FT3 are slightly lower among females for each TSH quartile.”

(Strich et al, 2016)

In part 2, I’ll provide more of a review of Strich’s article, which gives insight into the way age and sex shift the FT3:FT4 ratio as well.

This biological fact makes it impossible for a single TSH reference interval to maintain its reputation as an equally “sensitive” screening test across all age groups.

What are the potential harms of an age-blind, sex-blind TSH interval?

Here are two brief hypothetical examples, followed by a real case study from research.


If the TSH reference interval is too high for his age and sex, an untreated man in his 30s or 40s may have a normal TSH despite moderate thyroid hypofunction.

He may also have type 2 diabetes lowering his FT3:FT4 ratio below his age cohort (Gu et al, 2017). Having diabetes is a risk factor for thyroid disease, but his dosing of a TSH-lowering drug like metformin may conspire with his age and sex to prevent his TSH from rising above range (Cannarella et al, 2021).

His undiagnosed thyroid disease may cause infertility (Lotti et al, 2016), or may worsen his type 2 diabetes.


If the TSH reference interval is too high for her age and sex, a high-normal TSH may hide subclinical hypothyroidism in an untreated 26-year-old woman who recently got married and decided to stop contraceptive use.

If she has autoimmunity against thyroglobulin (TGAb) but doesn’t have thyroid peroxidase antibodies (TPOAb), her TSH will not rise as high as peers who have early Hashimoto’s (Brown et al, 2016).

Let’s imagine she isn’t pregnant yet, but her test results are very high-normal in a region like Alberta that raised its TSH reference limit to 6.5.

What would you do? Would you risk it and leave her untreated, just in case her TSH “normalizes” after the first sensitive 3 months of pregnancy? Her risk of hypothyroidism is higher after many years of contraceptive use (Qiu et al, 2021).

Her thyroid may or may not be healthy enough to sustain her fetus through a healthy pregnancy, given the pressures put on the HPT axis (ATA guidelines — Alexander et al, 2017).

Pregnancy trimester-specific TSH reference intervals exist, but they may also be subject to administrative inflation of their upper limit. They do not prepare women for an unplanned pregnancy that may be at higher risk of adverse events such as miscarriage (Liu et al, 2014).

Physician age- and sex-bias when judging TSH during pregnancy

Imagine the female in the previous example subsequently became pregnant and had a TSH of 5 mIU/L in early pregnancy.

Apparently some Alberta scientists would want her to remain untreated until her 2nd TSH test during pregnancy confirms subclinical or overt hypothyroidism (Yamamoto et al, 2020).

They are of the school of thought that a TSH no higher than 10 mIU/L is always mild subclinical hypothyroidism and is safe for all human beings of all ages and both sexes and regardless of pregnancy status.

They demonstrated no concern about whether her concurrent FT4 or FT3 would also be low or low-normal for her age and trimester. A woman’s TSH can also be reduced in the first trimester, when placenta-derived hCG stimulates the thyroid directly, according to guidelines (Alexander et al, 2017).

Final thoughts

Age is not an isolated influence. Many factors conspire with age to lower a person’s TSH from “high” into the “normal” range (Ling et al, 2018).

Meanwhile, administrators move TSH reference limits around to save dollars.

Publications that praise the benefits of raising the upper TSH reference limit fail to account for how it would discriminate against the younger hypothyroid population (Henze et al, 2019; Symonds et al, 2020). They also fail to inquire into the clinical history and consequences faced by patients who are misdiagnosed “euthyroid” when they are not.

This is not ethical. Accurate screening and diagnosis matters. Hypothyroid people are not statistics. You can’t compare an old woman with a TSH of 7 with a young man with a TSH of 7 and say they have the same thyroid status.

Individual physicians can’t fix a broken health care system. No matter how correct or flawed local TSH reference intervals are for the whole population or part of a population, clinical judgment is necessary for the diagnosis of an individual.

Instead of expecting the HPT axis to behave the same way in all people, physicians must become aware of the ways in which true hypothyroidism can be concealed by various health factors that limit TSH secretion, and the effects of TSH stimulation.

Therefore, science-based thyroid education and clinical sensitivity are the only ways forward.


References for all articles cited in the “analyzing normal lab results” series are in a separate post.

2 thoughts on “Age bias may hide hypothyroidism under a normal TSH”

  1. Dear Tania, thank you, thank you, thank you!!!
    So interesting. I never thought that the influence of age on HPT function could be so strong. I thought that since the pituitary-thyroid axis setpoints are determined genetically, then they will be almost constant in time. After reading your article, I did a search and found this.
    “Genome-wide association studies (GWAS) performed so far have revealed genetic variants in about 30 loci robustly associated with thyroid function. However, these variants explain only <9% of the heritability of TSH and FT4 variation, while in total, it has been estimated at 65 and 39–80% for TSH and FT4, respectively, suggesting that many loci still await discovery.” (2018)

    Given such strong external influences on thyroid function, it is difficult to assume the existence of a single reference interval of TSH.

  2. According to statistical axioms, the thyroid reference ranges for old folks should be formed specially for them as they are different from the subjects who were tested to form the existing reference ranges. This difference may not be clear, but it does create different concerns.
