We’ve known since at least 2002 that being within “normal range” is never good enough when it comes to thyroid hormone blood tests.
An important set of articles has taught us that each human being has an optimal range for TSH, Total T4 and Total T3 that is far narrower than the population-wide reference range.
A third article has informed us that the individual healthy ranges for Free T4 and Free T3 occupy an even smaller share of their laboratory ranges than those for Total T4 and Total T3 do.
This post is a summary and critical review of these three articles:
- Andersen, S., Pedersen, K. M., Bruun, N. H., & Laurberg, P. (2002). Narrow Individual Variations in Serum T4 and T3 in Normal Subjects: A Clue to the Understanding of Subclinical Thyroid Disease. The Journal of Clinical Endocrinology & Metabolism, 87(3)
- Andersen, S., Bruun, N. H., Pedersen, K. M., & Laurberg, P. (2003). Biologic Variation is Important for Interpretation of Thyroid Function Tests. Thyroid, 13(11)
- Ankrah-Tetteh, T., Wijeratne, S., & Swaminathan, R. (2008). Intraindividual variation in serum thyroid hormones, parathyroid hormone and insulin-like growth factor-1. Annals of Clinical Biochemistry, 45(Pt 2)
The first scientific article that taught this principle in 2002 has now been cited by 435 additional scientific articles, according to the Scopus database that tracks scientific citations.
The second article, published in 2003 in a different journal as the authors’ further commentary on the same experiment, has been cited 132 times.
The third article published in 2008, which provided results for Free T3 and Free T4, has been cited only 25 times.
This principle is incredibly well-known within the field of thyroid science. A very powerful lesson was learned from only two experiments with a very small set of human participants.
Why, then, are population-wide reference ranges still being used to diagnose thyroid disease?
Why do “subclinical” ranges, which are even _wider_ than the population-wide ranges, exclude so many from thyroid therapy?
Why do some people still imagine that a Free T3 barely hanging on at the lower end of reference range is normal and acceptable for the severely symptomatic thyroid patient?
And even worse, why is the TSH healthy-thyroid population reference range being used to evaluate lifelong thyroid hormone therapy as a success, as if Humpty Dumpty could be put back together again merely by the means of TSH-normalizing masking tape?
The research teaches us that we can’t predict whether the patient sitting before us has a homeostatic thyroid set point for T3 and/or T4 in the upper half of reference or the middle of reference when TSH is normal.
We can’t tell whether TSH, T3 and T4 interrelationships are being distorted by illness or by a wide variety of medications if we don’t know where a person’s healthy constellation of thyroid hormone set points is found.
The most productive question to ask is this: How can doctors and patients work together to discover individually optimal thyroid hormone levels? Read on.
In Andersen’s 2002 study (also discussed in 2003), 16 men from Denmark averaging 38 years old (24 – 52) who had no sign of thyroid disease (no goiter, no thyroid diagnosis, no thyroid medication) were tested monthly over 12 months.
In 2008, Ankrah-Tetteh and two colleagues studied 10 healthy people’s TSH, Free T4, Free T3, Parathyroid hormone (PTH) and insulin-like growth factor-1 (IGF-1). Samples were taken weekly over six weeks, following the findings of Andersen’s study that explained how many repeated tests would be necessary to establish an individual’s unique range.
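The articles summarized here do not spell out the calculation in this post, but a formula commonly used in the biological-variation literature estimates the number of samples as n = (Z × CV / D)², where CV is the combined within-person and assay variation (in percent) and D is how close (in percent) the estimate should be to the true set point. The sketch below assumes that formula; the function name and the input values are hypothetical and are not taken from Andersen's or Ankrah-Tetteh's data.

```python
import math


def samples_needed(cv_percent, closeness_percent, z=1.96):
    """Estimate how many samples are needed to pin down a personal set point.

    Assumed formula: n = (z * CV / D) ** 2, rounded up to a whole sample.
    z = 1.96 corresponds to 95% confidence.
    """
    if closeness_percent <= 0:
        raise ValueError("closeness_percent must be positive")
    return math.ceil((z * cv_percent / closeness_percent) ** 2)


# Hypothetical inputs: 20% combined variation, set point wanted within 10%.
n = samples_needed(cv_percent=20.0, closeness_percent=10.0)
print(f"Samples needed: {n}")
```

Under this assumed formula, a hormone that varies more within the person needs many more repeat tests before its set point can be located with the same precision.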
Andersen and colleagues found that on average,
- Total T3 levels fit within 54% of the population reference range.
- Total T4 levels fit within 58% of the range.
- TSH fit within 49% of reference range.
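To make percentages like these concrete, here is a minimal sketch of how one person's spread can be expressed as a share of the reference range. It assumes the individual range is simply the observed minimum to maximum, which may differ from how Andersen's team defined it; the TSH values and the 0.4 to 4.0 mU/L reference range are hypothetical.

```python
def fraction_of_reference_range(values, ref_low, ref_high):
    """Width of one person's observed range as a fraction of the
    population reference range width."""
    if ref_high <= ref_low:
        raise ValueError("ref_high must be greater than ref_low")
    individual_width = max(values) - min(values)
    return individual_width / (ref_high - ref_low)


# Hypothetical monthly TSH results (mU/L) for one healthy person.
monthly_tsh = [1.2, 1.5, 1.1, 1.8, 1.4, 1.6, 1.3, 1.9, 1.7, 1.5, 1.2, 1.6]

# Hypothetical laboratory reference range for TSH (mU/L).
share = fraction_of_reference_range(monthly_tsh, ref_low=0.4, ref_high=4.0)
print(f"This person's TSH spans {share:.0%} of the reference range")
```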
All three tests yielded an “Index of Individuality” (IOI or II), a ratio of the variation within individuals to the variation between individuals, below 0.6.
This is a poor result for all three tests.
In contrast, “When the ratio is greater than 1.4, the reference range works as intended.”
Hm, this means that TSH, T3 and T4 reference ranges are NOT going to work as intended.
The “II” ratio below 0.6 means that for TSH, T3 and T4 levels, “laboratory reference ranges are relatively insensitive to aberrations from normality in the individual.”
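The ratio can be sketched in a few lines. This is a simplified version that assumes the index is the within-person standard deviation divided by the between-person standard deviation and ignores assay (analytical) variation, which fuller versions of the index include. The Total T3 values and person labels are hypothetical.

```python
from statistics import mean, stdev, variance


def index_of_individuality(results_by_person):
    """Simplified Index of Individuality: within-person SD / between-person SD.

    Assumptions: assay variation is ignored, every person has the same
    number of samples, and the SD of the per-person means stands in for
    between-person variation. Both CVs would share the same grand mean,
    so the ratio of SDs equals the ratio of CVs.
    """
    person_means = [mean(v) for v in results_by_person.values()]
    within_variances = [variance(v) for v in results_by_person.values()]
    within_sd = mean(within_variances) ** 0.5
    between_sd = stdev(person_means)
    return within_sd / between_sd


# Hypothetical Total T3 results (nmol/L): each person is stable around
# a different personal set point.
t3 = {
    "person_a": [1.4, 1.5, 1.4, 1.5],
    "person_b": [1.9, 2.0, 1.9, 2.0],
    "person_c": [2.4, 2.5, 2.4, 2.5],
}

ioi = index_of_individuality(t3)
print(f"Index of Individuality: {ioi:.2f}")
```

In this made-up data each person barely moves while the three people sit far apart, so the index comes out well below 0.6, the same pattern the studies report.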
Healthy individuals do not change their FT3 or FT4 levels very much. They are very stable within individuals.
The only reason the reference range was so wide for the population as a whole is that some individuals had a lower, yet narrow range within the population, while others had a higher, yet narrow range within the population.
Andersen and team explained what this means for the individual:
“An individual test result may be far outside the individual reference range while still lie well within the laboratory reference range. This indicates a low sensitivity of the population-based reference ranges and causes uncertainty in the diagnosis of overt, and in particular subclinical thyroid disease.”
It does more than cause uncertainty in overt and subclinical thyroid disease — it makes it very difficult to define optimal thyroid _health_ by the laboratory reference ranges.
“Consequently, a test result within the laboratory reference range does not necessarily indicate a normal thyroid function in the individual.”
“A test result” is a single test result, but there are three hormones here.
This philosophy of judging each separate hormone in isolation from the others is flawed, because none of these test results stands strong on its own as an isolated result. None, not even the TSH.
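The situation described in the quotations above, a result inside the laboratory range but outside the person's own range, can be illustrated by checking a whole panel at once. This is only a sketch: the `classify` helper, the laboratory ranges, the personal ranges and the results are all hypothetical.

```python
def classify(value, lab_range, personal_range):
    """Compare one result with the laboratory range and the person's own range."""
    in_lab = lab_range[0] <= value <= lab_range[1]
    in_personal = personal_range[0] <= value <= personal_range[1]
    if in_lab and not in_personal:
        return "within lab range, outside personal range"
    if in_lab:
        return "within both ranges"
    return "outside lab range"


# All numbers are hypothetical: (lab range, personal range, today's result).
panel = {
    "TSH": ((0.4, 4.0), (1.0, 2.0), 3.6),
    "Total T4": ((60, 140), (95, 115), 82),
    "Total T3": ((1.1, 2.5), (1.8, 2.1), 1.9),
}

report = {
    name: classify(result, lab, personal)
    for name, (lab, personal, result) in panel.items()
}
for name, verdict in report.items():
    print(f"{name}: {verdict}")
```

Judged one at a time against the laboratory range, all three results would be called normal. Only the comparison with the personal ranges shows that TSH and Total T4 have both shifted.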
Despite testing all three hormone levels, Andersen’s team made some contradictory and unjustified claims and conclusions regarding the usefulness of the TSH test alone and regarding the role of symptoms in diagnosis, as I discuss below.
Ankrah-Tetteh found that the free thyroid hormones (FT4, FT3) occupied an even smaller share of the population-wide reference range than Andersen’s total hormones did. Free T3 had the narrowest intra-individual range.
- Free T3 fit within only 38% of the population reference range
- Free T4 levels fit within 41%
- TSH levels fit within 68%
This strongly suggests that carefully regulating FT3 and FT4 within a very narrow and individualized range is even more important to the untreated thyroid-healthy human body than normalizing one’s TSH to fit within the population.
Unfortunately, Ankrah-Tetteh did not comment on the practical implications of their results — they merely cited Andersen and team’s discussion.
This is unfortunate because of the major flaws in Andersen’s discussion.
ANDERSEN WAS PROFOUNDLY TSH-CENTRIC
Andersen and team’s data is valuable in itself and is worth pondering, despite the limitations of its small data set.
Unfortunately, their interpretations of their data exhibit a failure of scientific logic and clinical judgment because they cannot see the biases inherent in the diagnostic and therapeutic paradigm they espouse.
Perhaps one reason why Andersen and team’s study has been cited 435 times is that they harp on endlessly about the sensitivity with which TSH “amplifies” changes in thyroid hormone levels.
This biochemical amplification makes the researchers rhetorically amplify the necessity of discovering and treating a high or low TSH rather than optimizing a person’s T3 or T4.
The mere amplification of digits on a measurement scale does nothing to make the TSH more important to the human body than T3 or T4 in bloodstream.
Why are we prioritizing a lab test just because it gives us a wide variation in numeric values?
A narrow variation is equally a justification for medical interest and inquiry.
A narrower-than-population range does not prove T3 is unimportant, but rather that more careful control of blood concentrations is crucial to health. T3 is the last hormone to fall below reference in autoimmune hypothyroidism before therapy is initiated, mainly because it is buoyed up by many of the body’s compensatory mechanisms. This is the profound lesson taught by Abdalla & Bianco in 2014 when they called for a paradigm shift toward seeing the T3 hormone as essential, in their article titled “Defending plasma T3 is a biological priority.”
Andersen and team do not seem to realize that outside of thyroid therapy, the TSH must increase, decrease and fluctuate in order to try to keep T3 and T4 so stable and within narrow limits, and that the body’s TSH adjustment is not just an amplified response but a means to an end. What they interpret as a signal of abnormality can be equally a signal of functionality or loss of functionality.
We use a large wrench to tighten or loosen a small nut and bolt — does that make the wrench more important than the nut and bolt? Can we tell if the wrench (TSH) is ineffective at adjusting T3 (if it has lost its hold on a disintegrating nut), merely by judging the position of its handle by a population statistic? An equally impotent TSH may fall within reference or not far beyond it.
Andersen’s alarmism about out-of-range TSH exemplifies the “overzealous” attitude that Utiger cautioned us against in 1988 as the TSH test was refined. Their discussion promotes the idea that all TSH values outside population reference are dangerous and a cause for alarm regardless of a T3 or T4 value concurrently within reference range.
Many of Andersen’s interpretations unfortunately promote the widespread “in or out of range” philosophy of thyroid test result interpretation.
They rush to reaffirm the very reference boundaries their data has questioned.
This is an interpretation that is clearly at odds with the findings of their study.
How did they arrive at this inconsistent interpretation?
STARTING WITH A PRESUMPTION
The 2002 study begins with a fundamental presumption that has nothing to do with their methods or the results.
All research data analysis is based on warrants or presumptions, but one must carefully examine whether one’s warrants are relevant to the study at hand.
It begins with a bold claim about the uselessness of thyroid clinical symptoms and signs and uses that claim to heighten clinical dependency on T3, T4, and TSH test interpretation:
“Clinical symptoms and signs are often nonspecific, and the diagnosis and monitoring of therapy depends crucially on measurements of thyroid hormones and TSH in blood. (2,3)”
How does such an opening statement even frame the relevance of their study?
Their study did not examine any “clinical symptoms and signs.” Neither did their study examine “monitoring of [thyroid] therapy.”
In addition, what are citations 2 and 3 about? Do they even support such a claim?
Citation #2 is a study that pointed out that three doctors’ diagnoses conflicted with each other in the absence of laboratory test data on a large number of undiagnosed women. This study proved more about the diversity of human clinical judgment given a lack of a shared validated systematic clinical scoring tool. It said nothing about the intrinsic significance of clinical symptoms and signs.
Citation #3 is an American Thyroid Association guideline for the detection of thyroid dysfunction. It focuses on the use of TSH alone in screening, and merely advocates screening TSH more often when symptoms and signs are present — like banging one’s head repeatedly against the same brick wall, never analyzing T3 or T4 concentrations.
Therefore, it’s unclear why these two items are cited to support such a statement. Neither citation casts any doubt on the importance of clinical symptoms and signs. Neither focused on monitoring thyroid therapy using thyroid hormones in addition to TSH.
Why did they believe that these documents justified their opening claim of relevance?
Why, then, are they invoking these belief systems?
Apparently they are appealing to their audience at the outset and can count on readers responding with nods of agreement.
It worked. The fact that this opening sentence was published with these citations attests to the complete failure of peer reviewers to do a simple fact-checking exercise to see if the logic of the sentence fit with the content of the references cited.
These beliefs have little to do with the design or results of their experiment, but these beliefs do return to preside with great power over the interpretation and application of their results.
ANDERSEN JUMPS TO CONCLUSIONS
Andersen’s team cites prior beliefs about TSH’s amplified response to both T3 and T4.
They use these beliefs to minimize the relative importance of their T3 and T4 findings and exaggerate the importance of the TSH.
“Serum TSH responds heavily to minor changes in thyroid hormone concentrations in serum. Hence, subclinical thyroid disease with abnormal TSH but T4 and T3 within laboratory reference ranges is probably always a sign that T4 and T3 are outside the individual reference range and thus an indicator for abnormal thyroid function in the individual. This emphasizes the importance of serum TSH relative to T3 and T4 this being total or estimated free hormone concentrations in serum. If there are clinical signs, or if other conditions such as pregnancy requires normal thyroid function to ensure normal fetal brain development, then there is a need for treatment.” (2003, p. 1075)
They rely on now-falsified presumptions about TSH sensitivity to T4 and T3 hormones in circulation (this belief is especially untrue regarding T3 levels in T4-monotherapy and in cases of nonthyroidal illness).
Their focus on TSH reference boundaries contrasts directly with their own findings about the utter failure and uselessness of a “normal TSH” range to judge the individual’s current thyroid hormone status as truly euthyroid.
Based on prior scientific proof that TSH amplifies changes in T3 and T4, they extend this to the judgment that the clinical significance of TSH beyond reference range is also amplified.
In other words, they jump to unjustified conclusions based on the beliefs (paradigms, philosophies) promoted by the medical culture at the time.
Nothing in their own study leads one to believe that TSH is more important than T3 or T4 to the human body.
Nothing in their study examined “clinical signs” even though they mention them here, because they already dismissed signs and symptoms at the outset.
Nothing in their study of 16 men touches on the issue of pregnancy and normal fetal brain development.
They do not pay enough attention to the ways their own study casts serious doubt on the TSH-first testing policy advocated since the late 1980s.
FAILURE OF THYROID GUIDELINES
Instead of proving a failure of thyroid blood testing, the research points to the failure of decades of thyroid therapy guidelines, which have misapplied reference ranges across diverse populations and individuals.
The failure is the overreliance on TSH population-wide reference range boundaries to define hypothyroidism, euthyroidism and hyperthyroidism, with FT4 used only to provide clarity when TSH-only diagnosis is too vague.
On the one hand, the main message of Andersen, the universal principle, is that the individual has a unique narrow range of hormone levels that is dwarfed by the wide range of the population. Even though the research was done on very small numbers of healthy-thyroid people, aspects of this research apply to ALL human beings.
On the other hand, thyroid guideline writers have acknowledged that this research’s strong cautions about going outside of the reference ranges do NOT apply to ALL human beings. They teach us that there are exceptional situations like fetal life, euthyroid old age, fasting, pregnancy, hypothalamus and pituitary disorders, and the list goes on.
Even more radical exceptions exist.
Exception: Thyroid guidelines have even allowed T3 and T4 to fall below range during the descent into nonthyroidal illness (also known as Low T3 syndrome) while TSH remains normal or low. This is a state that so many have been taught to believe is always benign and temporary even though the depth and duration of the T3 deficit is so often predictive of fatality or continued morbidity within a year after this crisis.
Exception: Thyroid guidelines have permitted the TSH to rise swiftly far above reference range during the recovery phase of nonthyroidal illness, because nature tends to follow the lower T3 with a lowered T4, which then finally permits TSH to rise (if it rises swiftly enough) and to overstimulate healthy thyroid tissue (if there is enough to stimulate), thereby replenishing depleted T3 stores and permitting recovery of health.
Thyroid guidelines for hypothyroid diagnosis have not heeded the caution of Andersen’s team — Andersen would have stood strongly against the current “wait and see” policy of prolonging the subclinical hypothyroid state before therapy by permitting TSH to fly over range until it surpasses an arbitrary limit of 10.0 mU/L.
Here’s where exceptions stop.
Thyroid guideline writers have refused to consider that research provides a basis for individualized accommodations within thyroid therapy by adjusting TSH, FT3 and FT4.
In thyroid therapy guidelines, they promote the belief that they are judging dosing by the measuring stick of thyroid health by “normalizing the TSH,” when in fact, standard thyroid guidelines do not follow nature exactly.
Thyroid guidelines pick and choose certain principles of this research to rigidly enforce, exaggerate, and guard with fearful prohibitions. Thyroid guidelines pick and choose _other_ principles of this research to utterly ignore and treat with disdain as unnecessary or irrelevant information.
Thyroid guidelines judge with biased judgment whenever they excuse the T3:T4 ratio distortions in T4-monotherapy just because they so often hide within the Free T3 and Free T4 population reference ranges.
This excuse too easily accommodates the dosing effect of their favorite pharmaceutical and refuses to subject it to health outcomes research.
Therapy guidelines judge with biased judgment whenever they permit chronic hypothyroidism to persist in therapy by permitting Free T3 level to be confined in the lower half of reference range or even fall below the reference boundary.
These guidelines are not just biased, but utterly blind to what really matters to thyroid patients’ health and well being.
They call themselves guidelines, but they provide no guidance to help health practitioners discover where the patient’s individual optimal range for each hormone lies in their altered state of thyroid disease and therapy.
IS THYROID TESTING USELESS? NO!
We’ve learned that unlike many other blood tests in use today, thyroid hormone and TSH blood tests fail to reach an “Index of Individuality” of even 0.6, far short of the 1.4 at which a reference range works as intended.
This means that when a single, isolated laboratory result falls somewhere within these reference ranges, it fails to achieve diagnostic sensitivity for the individual patient.
What do you do? Does this mean that all thyroid testing is useless, including the TSH which fails to meet this criterion?
Again, it means the opposite. It means that hormone concentrations in blood are so biologically significant that they are far more narrowly controlled in healthy individuals than in the population at large.
The human body obviously cares intensely about where the thyroid hormone supply is in blood, in relation to the wider statistical range.
It means that all three hormones are likely interrelated and looking at all of them is going to be more informative than looking at each in isolation.
It means that a group of thyroid test results can provide indices of thyroid hormone supply in bloodstream in relation to the individual’s genetic diversity, their current metabolic needs, and their fluctuating physiological status over a lifetime.
It means that wise clinicians should also look beyond TSH, FT4 and FT3 to attend to other biomarkers of tissue thyroid hormone sufficiency and listen to the patients’ symptoms to interpret the results in context.
HOW CAN WE HELP THYROID PATIENTS?
NONE of these three studies examined what the optimal ranges were for individuals with damaged or missing thyroids while on thyroid therapy.
Only a few recent studies have bravely begun to do this work by resurrecting the importance of clinical symptoms:
- Hoermann, R., Midgley, J. E. M., Larisch, R., & Dietrich, J. W. (2019). Functional and Symptomatic Individuality in the Response to Levothyroxine Treatment. Frontiers in Endocrinology, 10. https://doi.org/10.3389/fendo.2019.00664
- Ito, M., Miyauchi, A., Hisakado, M., Yoshioka, W., Kudo, T., Nishihara, E., … Nakamura, H. (2019). Thyroid function related symptoms during levothyroxine monotherapy in athyreotic patients. Endocrine Journal. https://doi.org/10.1507/endocrj.EJ19-0094
- Larisch, R., Midgley, J. E. M., Dietrich, J. W., & Hoermann, R. (2018). Symptomatic Relief is Related to Serum Free Triiodothyronine Concentrations during Follow-up in Levothyroxine-Treated Patients with Differentiated Thyroid Cancer. Experimental and Clinical Endocrinology & Diabetes: Official Journal, German Society of Endocrinology [and] German Diabetes Association, 126(9), 546–552. https://doi.org/10.1055/s-0043-125064
The lesson of these articles is that in standard thyroid therapy, achieving a Free T3 of a certain level for the individual is crucial to chronic symptom relief, and that the TSH reference range is not a relevant judge of the patient’s achievement of this FT3 level.
To those who question the relevance of thyroid patients’ symptoms, the evidence that symptoms relate to FT3 levels is never going to be enough.
The skeptics would rather wait until 20 more years of thyroid therapy research passes while their patients suffer with a merely normalized TSH.
Irresponsible therapists do not care if ignoring Free T3 and Free T4 testing and failing to optimize these levels contributes to a patient’s chronic symptoms and exacerbates many other chronic illnesses in their body.
Those who question the relevance of thyroid patients’ symptoms are allowing themselves to be blind physicians led by the blind therapy guidelines and are not listening to research evidence.
REFERENCES
Abdalla, S. M., & Bianco, A. C. (2014). Defending plasma T3 is a biological priority. Clinical Endocrinology, 81(5), 633–641. https://doi.org/10.1111/cen.12538
Andersen, S., Pedersen, K. M., Bruun, N. H., & Laurberg, P. (2002). Narrow Individual Variations in Serum T4 and T3 in Normal Subjects: A Clue to the Understanding of Subclinical Thyroid Disease. The Journal of Clinical Endocrinology & Metabolism, 87(3), 1068–1072. https://doi.org/10.1210/jcem.87.3.8165
Andersen, S., Bruun, N. H., Pedersen, K. M., & Laurberg, P. (2003). Biologic Variation is Important for Interpretation of Thyroid Function Tests. Thyroid, 13(11), 1069–1078. https://doi.org/10.1089/105072503770867237
Ankrah-Tetteh, T., Wijeratne, S., & Swaminathan, R. (2008). Intraindividual variation in serum thyroid hormones, parathyroid hormone and insulin-like growth factor-1. Annals of Clinical Biochemistry, 45(Pt 2), 167–169. https://doi.org/10.1258/acb.2007.007103