Discordance in thyroid symptoms reported by patients and their doctors

There’s no surprise that patients and doctors would disagree about the incidence rate of hypothyroid symptoms in a given patient.

“We have an average of 7.2 symptoms per patient” says the cohort of 262 patients.

“No, you have an average of 4.0 symptoms per patient,” says the cohort of 100 doctors about those 262 patients.

This is especially unsurprising during standard therapy in contemporary society, now that the TSH test is commonly used to declare a patient “euthyroid” despite any or all chronic symptoms of hypothyroidism.

But to what degree do patients and physicians disagree in their reporting on a symptom by symptom basis?

What was the most common symptom reported by patients and underreported by doctors?

In this post, I reproduce verbatim the abstract of a presentation to the American Thyroid Association (ATA) at a conference in 2018.

It’s relevant to note that the lead author is Yi J. Chen, who works for AbbVie, the makers of Synthroid brand of levothyroxine.

With that in mind, I have some questions and concerns about this research poster in relation to Mr. Chen’s other posters presented at the same ATA conference as this one.

Abstract for Poster #323

“Symptoms such as tiredness, constipation, depression, cold intolerance and weight gain, are often associated with hypothyroidism (HT).

The Adelphi Disease Specific Program (DSP) is a large scale, real-world data generation study. The DSP collected data on 1,000 adult patients with a confirmed diagnosis of HT from 60 primary care physicians and 40 endocrinologists in the US in Jan-Apr 2018, and part of the data describes the symptoms associated with HT and HT management in the real world.

In addition to detailed data on disease (e.g., diagnosis, severity, history, comorbid conditions) and treatment from the physicians, both physicians and patients also independently provided information on the patient’s current symptoms.

Number of symptoms reported by physicians vs. by patients was compared at a significance level of 0.05. A Kappa statistic was calculated to assess the agreement level between patient and physician reporting on each symptom type.

This analysis included 262 patients who had symptoms reported by themselves and symptoms documented by their physicians.

  • The mean age was 48.7 (SD 15.7),
  • 78% were female,
  • 70% had overt HT, and
  • 94% were receiving prescribed HT treatment.

Mean number of symptoms reported by physicians was

  • 4.0 (SD 3.8) compared to a mean of
  • 7.2 (SD 7.9) symptoms reported by patients ( p < 0.01).

The most commonly-reported symptoms were

  • weight gain (33% physician-reported and 40% patient-reported),
  • inability to lose weight (34% and 31%),
  • dry/flaky skin (26% and 26%),
  • head hair loss (17% and 22%),
  • brittle hair (15% and 20%),
  • low energy/excessive tiredness (21% and 48%),
  • constipation (15% and 30%),
  • depression (14% and 24%) and
  • cold extremities/cold intolerance (7% and 22%).

Discordance of patient and physician symptom reporting was present (or observed) in the symptoms of cold intolerance, tiredness, weight gain, constipation and depression (all kappa statistic <0.40, indicating fair agreement).

Symptoms were more often reported by HT patients than their physicians, most notably tiredness and cold intolerance.

Further investigation is needed to determine the reasons for discordance and ways to improve it.”


It is unfortunate that only the abstract of this presentation is available and not the full text. One cannot find out any of the methods of data collection or analysis at the “Adelphi Disease Specific Programme” on the Adelphi Group website.

Several fundamental questions arise.

Why aren’t the symptoms correlated with any biometric measurements?

Two of the symptoms related to “weight,” but there is no BMI or weight gain/loss data or fat/lean body mass.

Three of the symptoms were skin / hair related, but no dermatologist was involved in measuring dry/flaky skin or hair loss or brittle hair.

This is very frustrating and dismaying for a presentation at a medical conference, given that thyroid disease is treated largely in relation to measurements, especially laboratory test results, not just self-reported symptoms.

Not having such data reported reduces the clinical relevance of your findings down to the level of a mere disagreement between subjective report of a patient vs. a doctor’s poor memory or poor regard for a patient’s subjective reports.

As doctors walked by this poster presentation, one can imagine many of them thinking to themselves this:

  • “Yeah, so what? Of course patients will exaggerate their number of symptoms when asked, and doctors’ reports are more accurate because they are objective.”

There is no mention of whether the findings are biased by methods by which doctors collected “patient symptoms” data — presumably they only collected data from their patient’s self-report during a prior visit to their own office.

Is this a measurement of physician recall? Is this a measure of the variance in patient reporting in two situations, an office visit versus a survey?

It is very possible that some patients would find certain symptoms like “constipation” or “depression” too embarrassing or filled with negative stigma to report to their doctor.

What were the patients’ Free T3 and Free T4 levels?

The only thyroid biomarkers that directly correlate with “thyroid symptoms” are thyroid hormones, FT3 and FT4.

It is a biological fact that in a person with severe thyroid dysfunction, TSH levels do not directly cause hypothyroid or hyperthyroid symptoms, because TSH binds to a different receptor than thyroid hormones do. TSH has its indirect effect by stimulating T3 and T4 secretion from the thyroid.

In earlier studies that examined hypothyroid symptoms extensively using scoring symptoms and multiple biomarkers, it was not the TSH that was correlated with them. Instead, symptoms were closely related to the Free T3 and Free T4 most of all, and then ankle reflex test and cholesterol. The correlation was especially strong in cases of overt hypothyroidism before therapy had commenced (Meier et al, 2003; Zulewski et al, 1997).

Interestingly, Chen’s 2018 study reports that, on average, patients reported 7.2 hypothyroid symptoms each (only 9 symptoms are listed in the abstract), while doctors reported 4.0 symptoms per patient.

That is a very high number of symptoms. By comparison, Zulewski’s 1997 study, which correlated symptoms with FT3 and FT4 rather than TSH, found a mean score of 7.8 out of 12 symptoms only in the category of overt hypothyroidism prior to therapy.

Zulewski and team even created a reference range for the symptom score of 1 point per symptom present, with “two cut-off points: hypothyroid range, more than 5 points (positive predictive value, 96.9%); euthyroid, 2 points or less (negative predictive value for exclusion of hypothyroidism, 94.2%); and intermediate range, 2–5 points.” (Zulewski et al, 1997)
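To make those cut-offs concrete, here is a minimal sketch in Python. The function name is my own invention, and I am assuming whole-number scores, so that “intermediate” means 3 to 5 points. It only illustrates the cut-offs quoted above; it is not Zulewski’s full scoring procedure.

```python
def classify_zulewski_score(score: float) -> str:
    """Classify a symptom score using the cut-offs quoted from Zulewski et al. (1997).

    Hypothetical helper for illustration only; not a diagnostic tool.
    """
    if score < 0:
        raise ValueError("score cannot be negative")
    if score > 5:       # "hypothyroid range, more than 5 points"
        return "hypothyroid"
    if score <= 2:      # "euthyroid, 2 points or less"
        return "euthyroid"
    return "intermediate"  # everything between the two cut-offs


if __name__ == "__main__":
    for s in (1, 2, 3, 5, 6, 8):
        print(s, classify_zulewski_score(s))
```

Keep in mind that the symptom list in Chen’s abstract is not the same as Zulewski’s 12-item list, so the two counts are not on the same scale.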

Why are the researchers not dismayed and shocked by such a high number of symptoms reported by both patients AND doctors (4.0, 7.2)?

Why are they just coolly and calmly reporting it as a discrepancy between patient and doctor report?

Why are the results reported as means, or averages?

It is a tradition to give results as averages, especially with large sample sizes like 262 patients. But it is also quite common to give statistical indicators of distribution along with an average.

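Here, the only measures of dispersion given were the standard deviations of the total number of symptoms reported (SD 3.8 for physicians, SD 7.9 for patients). Nothing was given for the individual symptoms or for the gap within each patient-doctor pair.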

What was the range of variation in patient-report and physician report for each reported symptom? How wide was the spread? Were there 30% of patients that had 1 symptom reported by a doctor but 11 symptoms reported by the patient?

Knowing this range would give us a better idea of individual variation in patient-reported symptoms in response to therapy, since supposedly 94% of these patients were on standard therapy.
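Here is a rough sketch of the kind of per-patient breakdown I am asking for. The ten patients below are entirely invented (they are NOT data from the study); I chose the numbers so that the averages match the abstract’s 7.2 and 4.0. The five-symptom threshold is also an arbitrary assumption of mine.

```python
from statistics import mean, median

# Hypothetical symptom counts for ten patients (invented for illustration,
# NOT data from the study), chosen so the means match the abstract's 7.2 and 4.0.
patient_counts = [2, 3, 11, 4, 12, 1, 9, 5, 14, 11]
physician_counts = [2, 3, 1, 4, 2, 1, 8, 5, 3, 11]

# Per-patient gap: symptoms the patient reported minus symptoms the doctor reported.
gaps = [p - d for p, d in zip(patient_counts, physician_counts)]

# Share of patients whose doctor reported at least five fewer symptoms
# than the patient did (the threshold of 5 is arbitrary).
share_large_gap = sum(g >= 5 for g in gaps) / len(gaps)

if __name__ == "__main__":
    print("mean patient count:   ", mean(patient_counts))
    print("mean physician count: ", mean(physician_counts))
    print("mean gap:             ", mean(gaps))
    print("median gap:           ", median(gaps))
    print("share with gap >= 5:  ", share_large_gap)
```

In this invented cohort, the averages look like a modest, uniform disagreement, yet most patients have no gap at all and three patients account for nearly all of it. Averages alone cannot tell those two situations apart.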

In thyroid therapy, the concern today should arise with the patients who struggle with signs and symptoms of unresolved hypothyroidism.

We know from recent clinical studies that some patients respond well to LT4 monotherapy, while others struggle to achieve optimal levels of Free T3 and Free T4 that can provide freedom from hypothyroidism (see listed articles by Hoermann, Midgley, Larisch and Dietrich; and a study of symptoms by Ito and colleagues).

What was the thyroid status of patients in this study?

Can you make heads or tails of this finding out of context?

  • “70% had overt HT, and 94% were receiving prescribed HT treatment.”

Any reasonable person should be puzzled. In their abstract, HT meant hypothyroidism. In thyroid science, the term “overt hypothyroidism” is usually used when TSH exceeds 10.0 mIU/L at diagnosis, before therapy.

Does this mean that 70% of the 262 patients were still classified as being in a state of hypothyroidism during the study? If so, then no wonder they had symptoms! A TSH over 10.0 mIU/L and an FT4 below reference usually determine a diagnosis of “overt HT.” Yet they were receiving HT treatment, so how could their TSH be over 10.0 mIU/L?

I’m flummoxed, unless they mean that they “had previously been diagnosed with overt HT.”

If 30% of these 262 patients did NOT have a prior diagnosis of overt HT and 6% were not even on HT therapy, what the heck were they doing in the data analysis, having their symptom reports averaged with those of other patients?

Subclinical hypothyroid patients will have a lot more living thyroid gland tissue that can convert T4 into T3 and respond to TSH, and by being included, they’re muddying the results for the utterly thyroidless or thyroid-fibrosed people.

What was the variation in agreement vs. disagreement?

Even if the study is purely focused on subjective patient vs. doctor reported symptoms, were some patients grossly misrepresented (or misremembered) by their doctors while others were in agreement?

That would also give us an idea of the rate at which patients’ symptoms are not being acknowledged or recalled by their doctors.
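For readers who have not met a kappa statistic before, here is a minimal sketch of how agreement on a single symptom could be computed. I am assuming Cohen’s kappa for two raters and simple yes/no reports; the abstract does not say which kappa variant was used. The function name and the twenty patients are hypothetical, not study data.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters giving binary (0/1) ratings on the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must rate the same non-empty set of cases")

    # Observed agreement: share of cases where both raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's overall "yes" rate.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1 - expected)


# Hypothetical example: 20 patients, one symptom (say, tiredness).
# 1 = symptom reported, 0 = not reported. Invented data, NOT from the study.
patient_says = [1] * 10 + [0] * 10
physician_says = [1, 1, 1] + [0] * 7 + [1] + [0] * 9

if __name__ == "__main__":
    # Observed agreement is 12/20 and chance agreement is 0.5,
    # so kappa comes out at approximately 0.2.
    print(round(cohen_kappa(patient_says, physician_says), 2))
```

A kappa of 1 would mean perfect agreement and 0 would mean no better than chance. A single kappa per symptom still says nothing about which patients were the ones disagreed with, which is the question I am raising here.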

What was the mean reported number of symptoms reported by the 40 endocrinologists vs. the 60 primary care physicians?

Such a finding would be of interest in light of the recent patient-blaming study (Esfandiari et al, 2019) that showed many physicians, including the researchers of the study, considered patients’ requests for tests and therapies in response to symptoms as “barriers” to adequate thyroid hormone therapy.

In Esfandiari’s study, endocrinologists were far more likely to report patients’ requests as a barrier to their therapy.

Perhaps endocrinologists reported fewer patient symptoms as well, not really being interested in hearing thyroid symptoms when the TSH was normalized.

ATA: Why are Synthroid-makers’ staff presenting 4 posters at your conference?

As noted above, this poster’s lead author is Yi J. Chen, who works for AbbVie, the makers of Synthroid brand of levothyroxine.

Looking at the larger context of this poster presentation, one can easily see that a total of four posters, including this one summarized above, were presented by Mr. Chen and a similar list of collaborators.

Mr. Chen of AbbVie is always named in the abstract of each poster presentation, along with at least one other staff member of AbbVie, and about two or three other people who don’t work for AbbVie, at least not at the moment.

FOUR posters by the same presenter(s), in each of which two co-presenters were AbbVie staff?

Why so many? Is this normal every year?

One of these presentations was explicitly biased toward the use of Synthroid, but without much logical reasoning supporting it, and you supposedly peer-reviewed the abstract and approved it to be presented at your conference!

Consider this study in the context of abstracts #321 and #322

Just before the abstract for the poster #323 above, there were two other abstracts by Chen & others in the PDF.

Poster #321 by Chen & friends was titled “Comparative effectiveness of Synthroid vs. Generic levothyroxine on TSH lab outcomes: a confirmatory analysis in a US managed care population.”

It’s actually quite a medically ridiculous study, an exercise in circular reasoning and medical navel-gazing.

What do you think they would discover?

Given that the target of all levothyroxine therapy is currently a superficial normalization of TSH rather than real health outcomes or symptom severity or symptom number, is there any surprise that a normal TSH target was achieved about 75-80% of the time?

Given that a doctor isn’t supposed to target any specific region within the TSH normal range according to the clinical guidelines, and some patients’ TSH results may easily fall near the upper or lower borderline, is there any surprise that one year later, a previously borderline TSH result had fallen above or below the statistical reference range by even 0.1 mU/L, approximately 20% of the time?

Why do you not acknowledge that testing in winter can yield a significantly higher TSH result, and testing in summer a significantly lower one, as discovered in thousands of thyroidless patients on thyroid therapy in Sicily (Gullo et al, 2017)?

How easy (or should I say difficult) is it to normalize the highly variable biomarker of TSH when even in healthy controls, there can be a huge circadian rhythm causing daily fluctuating TSH levels? Testing TSH earlier in the morning in such patients will yield a TSH higher than when testing in the afternoon, and one healthy person’s TSH fluctuated over 79.3% of the population reference range each day (Russell et al, 2008).

The data revealed that not much difference existed in terms of percent of TSH within the reference range, 78.5% for Synthroid and 77.2% for Generic Levothyroxine. They even tried narrowing the TSH reference range to manipulate the data, achieving a spread of 75.2% and 73.9%.

Woo hoo for a difference of 1.3 percentage points in BOTH data analyses (do the math).
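If you would rather not do the math by hand, here is the subtraction as a tiny Python sketch. The percentages are the ones quoted above from abstract #321; the function name and the dictionary labels are my own.

```python
def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Difference between two percentages, in percentage points (one decimal)."""
    return round(rate_a - rate_b, 1)


# (Synthroid %, generic levothyroxine %) with TSH in range, as quoted from abstract #321.
in_range_rates = {
    "standard TSH reference range": (78.5, 77.2),
    "narrowed TSH reference range": (75.2, 73.9),
}

if __name__ == "__main__":
    for label, (synthroid, generic) in in_range_rates.items():
        gap = percentage_point_gap(synthroid, generic)
        print(f"{label}: {gap} percentage points")
```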

Why should anyone care about a difference of 1.3 percentage points in the rate of achieving a normalized vs. not normalized TSH?

Even more importantly, why should anyone credit Synthroid versus Generic levothyroxine for achievement of a TSH value?

When the TSH target is achieved, is it not a reflection on the expertise of the physician in managing thyroid care, and a reflection of the power of T4 dosing to tame (75-80% of the time) the wildly temperamental and oversensitive nature of this biomarker? Why would you attribute TSH normalization to an intrinsic property of the Synthroid brand pharmaceutical?

Yet the abstract concludes with a drug-boosting correlation boast: “Synthroid was associated with better TSH outcomes as compared with generic levothyroxine in a US managed care population, consistent with the previous findings using other real-world data sources.”

And who cares if TSH is achieved if the patient’s hypothyroid symptoms and signs are not being resolved within this reference range?

The next abstract listed in the PDF booklet, #322, also by Mr. Chen and another AbbVie associate and some other medical folks who wanted a line on their publication list on their CV, was titled “Economic value of achieving TSH lab data within reference range for hypothyroidism treatment in a US managed care population.”

There, you didn’t measure any other index besides TSH (no mention of FT3, no FT4), so you are blind to all other biomarkers but the one you favor, not allowing any of them to compete for statistical significance against your favorite horse, the TSH.

Again, it engages in navel-gazing research that merely reinforces guidelines without any proof of their cause-effect relationship on health outcomes in reality.

Tell me by what physiological mechanism a TSH concentration above or below reference — in a person whose TSH can’t stimulate a thyroid gland to produce anything — by itself causes health outcomes? How could TSH’s relationship to health outcomes occur without the mediation of a FT3 value and FT3:FT4 ratio?

Don’t you know that thyroid dogma already acknowledges that TSH remains normalized during critical illness when T3 drops below reference?

Given these many confounding variables on a cause-effect relationship between TSH and health outcomes during thyroid therapy, by what mechanism does a TSH result outside of reference have significance when it is merely “correlated” or “associated” with hospital stays or pharmacy costs? How do you know T3 levels aren’t doing most of the work behind the scenes, as the most neglected thyroid test but the body’s most valuable thyroid hormone concentration (Bianco et al, 2019)?

In this abstract #322, you also categorized patients themselves as “TSH Achievers vs. Non-Achievers.” The language throughout that abstract focused on the patient’s non-adherence and/or blamed the patient’s body. It was the “levothyroxine user” (not the doctor dosing them) who did not achieve the targeted TSH on some lab tests. Is it not rather the fault of the guidelines for making TSH alone the judge and/or the fault of the doctor in dosing it incorrectly?

Oh but this is an ATA conference, and you wouldn’t want to blame the ATA or their members for being the true “non-achievers,” now would you?

You claimed, of course, in poster #322, that the patient achieving a normal-appearing TSH value (without comparing it with any other surrogate endpoint or direct health outcome) was cheaper for the health care system than not achieving it. “The study results suggest that persistent levothyroxine users who achieved TSH goals are associated with less hypothyroidism related health resource utilization and costs.” Of course a non-normal TSH will cost reflex FT4 and/or FT3 testing to start with, given progressive TSH testing policies, and it would usually result in a dose adjustment at an extra doctor visit.

And were the costs more often incurred when the TSH was above reference or below it? There was silence on that aspect of the data, usually provided with similar studies in the past.

These are the 2 studies whose abstracts appear before the “Discordance in Symptoms” study.

The 4th study by Chen and friends, which follows the symptom study #323, is a study of the process of the hypothyroid diagnosis. It places symptoms only at the beginning of the process, silently presuming that the symptoms end after therapy — or that they no longer matter, even if they continue.

In the context of Synthroid advertising and the praise of TSH Achievers, why raise the question of patient-reported symptoms when you don’t show any concern for them or any motive to resolve the doctors’ underreporting of them in the context of therapy?

In your symptom study, you didn’t bother to correlate whether the more symptomatic patients were users of Synthroid or Generic or whether they were TSH achievers or were saving the health care system money.

Dear ATA, do you really have a panel that reviews abstracts for posters based on their medical logic and rigorous methodology, or do you just look at the names and titles and say, “yeah, we want these people to present on this topic at our conference because it validates what we already believe?”

Advice to Mr. Chen of AbbVie

I would like to address my final questions and comments to Mr. Chen, the representative of the makers of Synthroid and the lead author of the study:

  1. Do the makers of Synthroid brand Levothyroxine desire their pharmaceutical to be dosed to genuine effectiveness, OR is their concern merely that it be dosed to ATA guideline criteria?
  2. Do the makers of Synthroid desire that patients all over the world would stop blaming their drug for chronic signs and symptoms of unresolved hypothyroidism, OR are they only interested in persuading doctors to continue to prescribe it?

Hmm, who is your real customer, the doctor and medical system, or the patient?

I would suggest that you are marketing to the doctor and health care systems while completely disregarding the patient’s well being and health as judged by anything other than your favorite biomarker, the TSH.

If you protest that you do care for the health of the Synthroid patient, I would like you to consider these new research findings within the context of thyroid therapy:

In patients with no living thyroid tissue, freedom from hypothyroidism is achieved at levels below the TSH reference range and also when T3 is higher than the mid-reference statistical mean in healthy controls. (Ito et al, 2019; Larisch et al, 2018; Hoermann et al, 2019a & b)

These findings suggest that the discovery of patients’ individual optimal, symptom-free Free T3 and Free T4 levels should be primary targets of dosing levothyroxine.

Consider also the fact that a statistically normal TSH has never been properly validated as a surrogate endpoint in any form of thyroid therapy, since it would take an expensive Phase III clinical trial to do this.

Why is it that you are not interested in funding such an important study, but you would fund studies regarding symptoms and TSH? Could it be out of a desire to cover up the historical shame that you have never commissioned a Phase III clinical trial, because the FDA has never yet required you to? During the 1970s, your drug rose to market dominance hand in hand with normalization of TSH, based on medical persuasion, not clinical trials that compared health outcomes with the other therapies on the market.

According to a presentation by Eugene Sullivan of the FDA, “The primary endpoint(s) of confirmatory Phase 3 trials should represent (directly or through a validated surrogate) something that matters to a patient.”

Let us tell you, we don’t care about our TSH number. We do care about our symptoms and our lifelong health, and you haven’t properly validated the normo-thyroid TSH range as a surrogate endpoint for those real outcomes. Our normalized TSH has no role in protecting our health if it can’t prevent a chronically low(er) FT3 value or eliminate our hypothyroid symptoms on therapy.

With this in mind, can’t you please do something about the ridiculous policies that currently forbid Free T3 and Free T4 testing whenever the TSH is normalized during thyroid therapy?

Can’t you point out that people are treating the thyroid-disabled on therapy as if they have an intact, medically unmanipulated HPT axis, and they are being screened for primary thyroid failure over and over again when they’ve already been diagnosed?

We are now being judged by how well we are biochemically stuffed into a TSH corset to fit into a normo-thyroid suit of clothing while our FT3 could be nowhere near where our TSH would prescribe it to be maintained in health.

In light of your set of 4 posters, I have some homework to assign to you, Mr. Chen and AbbVie:

Go compare health care costs with both FT3 and FT4 levels and FT3:FT4 ratios, not just “in or out” of reference range but also within range. Then correlate them with the rate at which symptoms were reported by the patients but disregarded by their physicians due to their supposedly “euthyroid” TSH, and correlate those variables with the health care costs!


Bianco, A. C., Dumitrescu, A., Gereben, B., Ribeiro, M. O., Fonseca, T. L., Fernandes, G. W., & Bocco, B. M. L. C. (2019). Paradigms of Dynamic Control of Thyroid Hormone Signaling. Endocrine Reviews, 40(4), 1000–1047. https://doi.org/10.1210/er.2018-00275

Chen, Y. J., Gossain, V. V., Anderson, P., Soni-Brahmbhatt, S., Gillespie, A., Piercy, J., & James, H. (2018). 323. Discordance in symptoms reported by hypothyroidism patients and their physicians. Thyroid, 28 S1, A-112-113. https://www.liebertpub.com/doi/pdf/10.1089/thy.2018.29065.abstracts

Esfandiari, N. H., Reyes-Gastelum, D., Hawley, S. T., Haymart, M. R., & Papaleontiou, M. (2019). Patient Requests for Tests and Treatments Impact Physician Management of Hypothyroidism. Thyroid: Official Journal of the American Thyroid Association, 29(11), 1536–1544. https://doi.org/10.1089/thy.2019.0383

Gullo, D., Latina, A., Frasca, F., Squatrito, S., Belfiore, A., & Vigneri, R. (2017). Seasonal variations in TSH serum levels in athyreotic patients under L-thyroxine replacement monotherapy. Clinical Endocrinology, 87(2), 207–215. https://doi.org/10.1111/cen.13351

Hoermann, R., Midgley, J. E. M., Larisch, R., & Dietrich, J. W. (2019a). Functional and Symptomatic Individuality in the Response to Levothyroxine Treatment. Frontiers in Endocrinology, 10. https://doi.org/10.3389/fendo.2019.00664

Hoermann, R., Midgley, J. E. M., Larisch, R., & Dietrich, J. W. (2019b). Individualised requirements for optimum treatment of hypothyroidism: Complex needs, limited options. Drugs in Context, 8, 212597. https://doi.org/10.7573/dic.212597

Ito, M., Miyauchi, A., Hisakado, M., Yoshioka, W., Kudo, T., Nishihara, E., … Nakamura, H. (2019). Thyroid function related symptoms during levothyroxine monotherapy in athyreotic patients. Endocrine Journal. https://doi.org/10.1507/endocrj.EJ19-0094

Larisch, R., Midgley, J. E. M., Dietrich, J. W., & Hoermann, R. (2018). Symptomatic Relief is Related to Serum Free Triiodothyronine Concentrations during Follow-up in Levothyroxine-Treated Patients with Differentiated Thyroid Cancer. Experimental and Clinical Endocrinology & Diabetes: Official Journal, German Society of Endocrinology [and] German Diabetes Association, 126(9), 546–552. https://doi.org/10.1055/s-0043-125064

Meier, C., Trittibach, P., Guglielmetti, M., Staub, J.-J., & Müller, B. (2003). Serum thyroid stimulating hormone in assessment of severity of tissue hypothyroidism in patients with overt primary thyroid failure: Cross sectional survey. BMJ, 326(7384), 311–312. https://doi.org/10.1136/bmj.326.7384.311

Midgley, J. E. M., Larisch, R., Dietrich, J. W., & Hoermann, R. (2015). Variation in the biochemical response to l-thyroxine therapy and relationship with peripheral thyroid hormone conversion efficiency. Endocrine Connections, 4(4), 196–205. https://doi.org/10.1530/EC-15-0056

Russell, W., Harrison, R. F., Smith, N., Darzy, K., Shalet, S., Weetman, A. P., & Ross, R. J. (2008). Free Triiodothyronine Has a Distinct Circadian Rhythm That Is Delayed but Parallels Thyrotropin Levels. The Journal of Clinical Endocrinology & Metabolism, 93(6), 2300–2306. https://doi.org/10.1210/jc.2007-2674

Zulewski, H., Müller, B., Exer, P., Miserez, A. R., & Staub, J. J. (1997). Estimation of tissue hypothyroidism by a new clinical score: Evaluation of patients with various grades of hypothyroidism and controls. The Journal of Clinical Endocrinology and Metabolism, 82(3), 771–776. https://doi.org/10.1210/jcem.82.3.3810
