Key Concepts for assessing claims about treatment effects
There are endless claims about treatments in the mass media, advertisements and everyday personal communication. Some are true and some are false. Many are unsubstantiated: we do not know whether they are true or false. Unsubstantiated claims about the effects of treatments are often wrong. Consequently, people who believe and act on these claims suffer unnecessarily and waste resources by doing things that do not help and might be harmful, and by not doing things that do help.
We have prepared a list of key concepts that people can use to assess claims about the effects of a treatment, including whether
- The basis for a claim is reliable; i.e. whether it is based on fair comparisons of treatments
- The results of fair comparisons are relevant to them, and what those results imply for their decision
- Additional information is needed to assess the reliability and relevance of claims about treatments and, if so, what information is needed
The list serves as a syllabus for identifying the resources needed to help people understand and apply the concepts, and is intended to be universally relevant. Effective treatments can prevent health problems, save lives and improve quality of life. However, nature is a great healer and people often recover from illness without treatment. Likewise, some health problems may get worse despite treatment, or treatment may actually make things worse. For these reasons, knowledge of the natural course of illness should be the starting point for making informed decisions about treatments.
We have written the concepts and explanations in plain language. However, some of these concepts may be unfamiliar and difficult to understand. We did not design the list as a teaching tool. It is a framework, or starting point, for teachers, journalists and other intermediaries for identifying and developing resources (such as longer explanations, examples, games and interactive applications) to help people to understand and apply the concepts.
The list is expected to be a “living” document, open to modification, additions and deletions, and is subject to yearly review. The list here is a revised version of earlier lists. The next update is planned for September 2018. For any comments or suggestions, please contact us.
Current at September 2017; next update September 2018.
The list includes 36 concepts, divided into 3 groups:
- Claims: are they justified?
  - Treatments can harm
  - Anecdotes are unreliable evidence
  - Association is not the same as causation
  - Common practice is not always evidence-based
  - Newer is not necessarily better
  - Expert opinion is not always right
  - Beware of conflicting interests
  - More is not necessarily better
  - Earlier is not necessarily better
  - Hope may lead to unrealistic expectations
  - Explanations about how treatments work can be wrong
  - Dramatic treatment effects are rare
- Comparisons: are they fair and reliable?
  - Comparisons are needed to identify treatment effects
  - Comparison groups should be similar
  - People’s outcomes should be analyzed in their original groups
  - Comparison groups should be treated equally
  - People should not know which treatment they get
  - People’s outcomes should be assessed similarly
  - All should be followed up
  - Consider all of the relevant fair comparisons
  - Reviews of fair comparisons should be systematic
  - Peer-review and publication do not guarantee reliable information
  - All fair comparisons and outcomes should be reported
  - Subgroup analyses may be misleading
  - Relative measures of effects can be misleading
  - Average measures of effects can be misleading
  - Fair comparisons with few people or outcome events can be misleading
  - Confidence intervals should be reported
  - Don’t confuse “statistical significance” with “importance”
  - Don’t confuse “no evidence of a difference” with “evidence of no difference”
- Choices: making informed choices
Claims: are they justified?
Not all claims about the effects of treatments are reliable. Well-informed treatment decisions require reliable information.
1-1 Treatments can harm
People often exaggerate the benefits of treatments and ignore or downplay potential harms. However, few effective treatments are 100% safe.
Implication: Always consider the possibility that a treatment may have harmful effects.
1-2 Anecdotes are unreliable evidence
People often believe that an improvement in a health problem (e.g. recovery from a disease) was due to having received a treatment. Similarly, they might believe that an undesirable health outcome was due to having received a treatment. However, the fact that an individual got better after receiving a treatment does not mean that the treatment caused the improvement, or that others receiving the same treatment will also improve. The improvement might have occurred even without treatment.
Implication: Claims about the effects of a treatment may be misleading if they are based on stories about how a treatment helped individual people, or if those stories attribute improvements to treatments that have not been assessed in systematic reviews of fair comparisons.
1-3 Association is not the same as causation
The fact that a treatment outcome (i.e. a potential benefit or harm) is associated with a treatment does not mean that the treatment caused the outcome. For example, people who seek and receive a treatment may be healthier and have better living conditions than those who do not seek and receive the treatment. Therefore, people receiving the treatment might appear to benefit from the treatment, but the difference in outcomes could be because of their being healthier and having better living conditions, rather than because of the treatment.
Implication: Unless other reasons for an association between an outcome and a treatment have been ruled out by a fair comparison, do not assume that the outcome was caused by the treatment.
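The following is a minimal sketch of how such a misleading association can arise. All of the counts are invented for illustration and do not come from any study. It assumes that healthier people are more likely to receive the treatment and that the treatment has no effect at all within either group of people.

```python
# Hypothetical counts: (number of people, number who recovered).
# Within each health stratum, recovery is identical with and without treatment.
groups = {
    ("healthier", "treated"): (80, 72),
    ("healthier", "untreated"): (20, 18),
    ("less healthy", "treated"): (20, 10),
    ("less healthy", "untreated"): (80, 40),
}

def recovery_rate(treatment, stratum=None):
    """Share of people who recovered, optionally within one health stratum."""
    people = recovered = 0
    for (health, arm), (n, r) in groups.items():
        if arm == treatment and stratum in (None, health):
            people += n
            recovered += r
    return recovered / people

# Crude comparison: treated people appear to do much better.
print("All treated:  ", recovery_rate("treated"))
print("All untreated:", recovery_rate("untreated"))

# Like-with-like comparison: no difference within either stratum.
for health in ("healthier", "less healthy"):
    print(health, recovery_rate("treated", health), recovery_rate("untreated", health))
```

In this made-up example the treated group recovers more often overall only because it contains more of the healthier people, not because the treatment does anything.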
1-4 Common practice is not always evidence-based
Treatments that have not been properly evaluated but are widely used or have been used for a long time are often assumed to work. Sometimes, however, they may be unsafe or of doubtful benefit.
Implication: Do not assume that treatments are beneficial or safe simply because they are widely used or have been used for a long time, unless this has been shown in systematic reviews of fair comparisons of treatments.
1-5 Newer is not necessarily better
New treatments are often assumed to be better simply because they are new or because they are more expensive. However, a new treatment is only very slightly more likely to be better than to be worse than other available treatments. Some side effects of treatments, for example, take time to appear and it may not be possible to know whether they will appear without long-term follow-up.
Implication: A treatment should not be assumed to be beneficial and safe simply because it is new, brand-named or expensive.
1-6 Expert opinion is not always right
Doctors, researchers, patient organisations and other authorities often disagree about the effects of treatments. This may be because their opinions are not always based on systematic reviews of fair comparisons of treatments.
Implication: Do not rely on the opinions of experts or other authorities about the effects of treatments, unless they clearly base their opinions on the findings of systematic reviews of fair comparisons of treatments.
1-7 Beware of conflicting interests
People with an interest in promoting a treatment beyond wanting to help people, such as making money, may promote treatments by exaggerating benefits and ignoring potential harmful effects. Conversely, people may be opposed to a treatment for a range of reasons, such as cultural practices.
Implication: Ask if people making claims that a treatment is effective have conflicting interests. If they have conflicting interests, be careful not to be misled by their claims about the effects of treatments.
1-8 More is not necessarily better
Increasing the dose or amount of a treatment (e.g. how many vitamin pills you take) often increases harms without increasing beneficial effects.
Implication: If a treatment is believed to be beneficial, do not assume that more of it is better.
1-9 Earlier is not necessarily better
People often assume that early detection of disease leads to better outcomes. However, screening people to detect disease is only helpful if two conditions are met. First, there must be an effective treatment. Second, people who are treated before the disease becomes apparent must do better than people who are treated after the disease becomes apparent. Screening tests can be inaccurate (e.g. misclassifying people who do not have disease as having disease). Screening can also cause harm by labelling people as being sick when they are not and because of side effects of the tests and treatments.
Implication: Do not assume that early detection of disease is worthwhile if it has not been assessed in systematic reviews of fair comparisons between people who were screened and people who were not screened.
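As a rough illustration of how an inaccurate screening test can label healthy people as sick, the sketch below counts true and false positive results. The population size, the share of people with the disease, and the two accuracy figures (how often the test detects disease, and how often it correctly clears people without it) are all assumptions chosen for the example, not figures from the list.

```python
from fractions import Fraction

def screening_counts(population, prevalence, detect_rate, clear_rate):
    """Return (true positives, false positives, share of positives who have the disease).

    detect_rate: assumed chance the test is positive in someone with the disease.
    clear_rate:  assumed chance the test is negative in someone without it.
    """
    diseased = population * prevalence
    not_diseased = population - diseased
    true_positives = diseased * detect_rate
    false_positives = not_diseased * (1 - clear_rate)
    share_correct = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_correct

# Hypothetical scenario: 10,000 people screened, 1 in 100 has the disease,
# the test detects 90% of disease and correctly clears 95% of healthy people.
tp, fp, share = screening_counts(10_000, Fraction(1, 100), Fraction(9, 10), Fraction(95, 100))
print(f"true positives: {tp}, false positives: {fp}, "
      f"positives who really have the disease: {float(share):.0%}")
```

Under these assumed figures, most people with a positive result do not have the disease, which is one way screening can cause harm even when the test looks accurate.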
1-10 Hope may lead to unrealistic expectations
Hope can be a good thing, but sometimes people in need or desperation hope that treatments will work and assume they cannot do any harm. Similarly, fear can lead people to use treatments that may not work and can cause harm. As a result, they may waste time and money on treatments that have never been shown to be useful, or may actually cause harm.
Implication: Do not assume that a treatment is beneficial or safe, or that it is worth whatever it costs, simply because you hope that it might help.
1-11 Explanations about how treatments work can be wrong
Treatments that should work in theory often do not work in practice, or may turn out to be harmful. An explanation of how or why a treatment might work does not prove that it works or that it is safe.
Implication: Do not assume that claims about the effects of treatments based on an explanation of how they might work are correct if the treatments have not been assessed in systematic reviews of fair comparisons of treatments.
1-12 Dramatic treatment effects are rare
Large effects (where everyone or nearly everyone treated experiences a benefit or a harm) are easy to detect without fair comparisons, but few treatments have effects that are so large that fair comparisons are not needed.
Implication: Claims of large effects are likely to be wrong. Expect treatments to have moderate, small or trivial effects, rather than dramatic effects. Do not rely on claims of small or moderate effects of a treatment that are not based on systematic reviews of fair comparisons of treatments.
Comparisons: are they fair and reliable?
Well-informed treatment decisions require systematic reviews of fair comparisons of treatments; i.e. comparisons designed to minimise the risk of systematic and random errors. Non-systematic summaries can be misleading, and not all comparisons of treatments are fair comparisons.
2-1 Comparisons are needed to identify treatment effects
If a treatment is not compared to something else, it is not possible to know what would happen without the treatment, so it is difficult to attribute outcomes to the treatment.
Implication: Always ask what the comparisons are when considering claims about the effects of treatments. Claims that are not based on appropriate comparisons are not reliable.
2-2 Comparison groups should be similar
If people in the treatment comparison groups differ in ways other than the treatments being compared, the apparent effects of the treatments might reflect those differences rather than actual treatment effects. Differences in the characteristics of the people in the comparison groups might result in estimates of treatment effects that appear either larger or smaller than they actually are. A method such as allocating people to different treatments by assigning them random numbers (the equivalent of flipping a coin) is the best way to ensure that the groups being compared are similar in terms of both measured and unmeasured characteristics.
Implication: Be cautious about relying on the results of non-randomized treatment comparisons (for example, if the people being compared chose which treatment they received). Be particularly cautious when you cannot be confident that the characteristics of the comparison groups were similar. If people were not randomly allocated to treatment comparison groups, ask if there were important differences between the groups that might have resulted in the estimates of treatment effects appearing either larger or smaller than they actually are.
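The contrast between random allocation and people choosing their own treatment can be sketched with a small simulation. Everything here is an assumption made for illustration: a single unmeasured characteristic ("healthy"), a 50% chance of having it, and an 80% versus 20% chance of choosing the treatment depending on it.

```python
import random

def share_healthy(group):
    """Proportion of a group with the (unmeasured) 'healthy' characteristic."""
    return sum(person["healthy"] for person in group) / len(group)

def simulate(n=10_000, seed=0):
    rng = random.Random(seed)
    people = [{"healthy": rng.random() < 0.5} for _ in range(n)]

    # Random allocation: shuffle everyone, then split into two equal groups.
    shuffled = people[:]
    rng.shuffle(shuffled)
    random_gap = share_healthy(shuffled[: n // 2]) - share_healthy(shuffled[n // 2 :])

    # Self-selection: healthier people are assumed more likely to seek the treatment.
    chose, declined = [], []
    for person in people:
        p_choose = 0.8 if person["healthy"] else 0.2
        (chose if rng.random() < p_choose else declined).append(person)
    chosen_gap = share_healthy(chose) - share_healthy(declined)

    return random_gap, chosen_gap

random_gap, chosen_gap = simulate()
print(f"difference in share healthy, random allocation: {random_gap:+.3f}")
print(f"difference in share healthy, self-selection:    {chosen_gap:+.3f}")
```

With random allocation the two groups end up with nearly the same share of healthy people, even though nobody measured that characteristic. With self-selection the groups differ substantially before any treatment is given.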
2-3 People’s outcomes should be analyzed in their original groups
Randomized allocation helps to ensure that the groups have similar characteristics. However, people sometimes do not receive or take the allocated treatments. The characteristics of such people often differ from those who do take the treatment as allocated. Therefore, excluding from the analysis people who did not receive the allocated treatment may mean that like is no longer being compared with like.
Implication: Be cautious about relying on the results of treatment comparisons if patients’ outcomes are not counted in the group to which they were allocated. For example, in a comparison of surgery and drug treatments, people who die while waiting for surgery should be counted in the surgery group, even though they did not receive surgery.
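The surgery example can be worked through with made-up numbers. The sketch below assumes 100 people allocated to each group, with 10 people in the surgery group dying while waiting. These counts are hypothetical and chosen so that the two treatments are in fact equally good.

```python
def death_rate(records):
    """Proportion of people in a list of records who died."""
    return sum(r["died"] for r in records) / len(records)

# Hypothetical trial: 100 people allocated to each group.
surgery_group = (
    [{"received": False, "died": True}] * 10    # died while waiting for surgery
    + [{"received": True, "died": True}] * 9    # died after surgery
    + [{"received": True, "died": False}] * 81
)
drug_group = (
    [{"received": True, "died": True}] * 19
    + [{"received": True, "died": False}] * 81
)

# Counting everyone in the group they were allocated to.
as_allocated = death_rate(surgery_group)

# Excluding people who never received surgery (a misleading comparison).
as_treated = death_rate([r for r in surgery_group if r["received"]])

print(f"drug group:                      {death_rate(drug_group):.0%}")
print(f"surgery group, as allocated:     {as_allocated:.0%}")
print(f"surgery group, waiters excluded: {as_treated:.0%}")
```

When everyone is counted in their original group, the death rates are the same. Dropping the people who died while waiting makes surgery look far better than the drug, although nothing about the treatments has changed.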
2-4 Comparison groups should be treated equally
Apart from the treatments being compared, people in the treatment comparison groups should otherwise receive similar care. If, for example, people in one group receive more attention and care than people in the comparison group, differences in outcomes could be due to differences in the amount of attention each group received rather than due to the treatments that are being compared. One way of preventing this is to keep providers unaware (“blind”) of which people have been allocated to which treatment.
Implication: Be cautious about relying on the results of treatment comparisons if people in the groups that are being compared were not cared for similarly (apart from the treatments being compared). The results of such comparisons could be misleading.
2-5 People should not know which treatment they get
People in a treatment group may experience improvements (for example, less pain) because they believe they are receiving a better treatment, even if the treatment is not actually better (this is called a placebo effect), or because they behave differently (due to knowing which treatment they received, compared to how they otherwise would have behaved). If individuals know that they are receiving (they are not “blinded” to) a treatment that they believe is better, some or all of the apparent effects of the treatment may be due either to a placebo effect or because the recipients behaved differently.
Implication: Be cautious about relying on the results of treatment comparisons if the participants knew which treatment they were receiving, because this may have affected their expectations or behaviour. The results of such comparisons could be misleading.
2-6 People’s outcomes should be assessed similarly
If an outcome is measured differently in two comparison groups, differences in that outcome may be due to how the outcome was measured rather than because of the treatment received by people in each group. For example, if outcome assessors believe that a particular treatment works and they know which patients have received that treatment, they may be more likely to observe better outcomes in those who have received the treatment. One way of preventing this is to keep outcome assessors unaware (“blind”) of which people have been allocated to which treatment. This is less important for “objective” outcomes like death than for “subjective” outcomes like pain.
Implication: Be cautious about relying on the results of treatment comparisons if outcomes were not measured in the same way in the different treatment comparison groups. The results of such comparisons could be misleading.
2-7 All should be followed up
People in treatment comparisons who are not followed up to the end of the study may have worse outcomes than those who are followed up. For example, they may have dropped out because the treatment was not working or because of side effects. If those people are excluded, the findings of the study may be misleading.
Implication: Be cautious about relying on the results of treatment comparisons if many people were lost to follow-up, or if there was a big difference between the comparison groups in the percentages of people lost to follow-up. The results of such comparisons could be misleading.
2-8 Consider all of the relevant fair comparisons
A single comparison of treatments rarely provides conclusive evidence and results are often available from other comparisons of the same treatments. These other comparisons may have different results or may help to provide more reliable and precise estimates of the effects of treatments.
Implication: Consider all of the relevant fair comparisons.
2-9 Reviews of fair comparisons should be systematic
Reviews that do not use systematic methods may result in biased or imprecise estimates of the effects of treatments because the selection of studies for inclusion may be biased or the methods may result in some studies not being found. In addition, the appraisal of some studies may be biased, or the synthesis of the results of the selected studies may be inadequate or inappropriate.
Implication: Whenever possible, use systematic reviews of fair comparisons rather than non-systematic reviews of fair comparisons of treatments to inform your decisions.
2-10 Peer-review and publication do not guarantee reliable information
Even though a comparison of treatments has been published in a prestigious journal, it may not be a fair comparison and the results may not be reliable. Peer review (assessment of a study by others working in the same field) does not guarantee that published studies are reliable. Assessments vary and may not be systematic.
Implication: Always consider whether a comparison of the effects of treatments is fair and whether the results are reliable. Peer-review is a poor indicator of reliability.
2-11 All fair comparisons and outcomes should be reported
Many fair comparisons never get published, and outcomes are sometimes left out. Those that do get published are more likely to report favourable results. As a consequence, reliance on published reports sometimes results in the beneficial effects of treatments being overestimated and the adverse effects being underestimated. Biased under-reporting of research is a major problem that is far from being solved. It is scientific and ethical malpractice, and wastes research resources.
Implication: Be aware of the risk of biased under-reporting of fair comparisons, whether or not the authors of systematic reviews have addressed this risk.
2-12 Subgroup analyses may be misleading
Comparisons of treatments often report results for a selected group of participants in an effort to assess whether the effect of a treatment is different for different types of people (e.g. men and women or different age groups). These analyses are often poorly planned and reported. Most differential effects suggested by these ‘subgroup results’ are likely to be due to the play of chance and are unlikely to reflect true differences.
Implication: Findings based on results for subgroups of people within a treatment comparison may be misleading.
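One way to see why chance alone can produce apparent subgroup differences is a simple probability calculation. The sketch below assumes that each subgroup is tested separately, that the tests are independent, and that the conventional 5% cut-off is used. These are simplifying assumptions for illustration only.

```python
def chance_of_false_alarm(subgroups, threshold=0.05):
    """Probability that at least one of several independent subgroup tests
    crosses the threshold by chance alone, when there is no true difference."""
    return 1 - (1 - threshold) ** subgroups

for k in (1, 5, 10, 20):
    print(f"{k:2d} subgroups: {chance_of_false_alarm(k):.0%} chance of at least one chance finding")
```

Under these assumptions, examining ten subgroups gives roughly a 40% chance of at least one "finding" even when the treatment works the same way in everyone.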
2-13 Relative measures of effects can be misleading
Relative measures of effects (e.g. the ratio of the probability of an outcome in one treatment group compared with that in a comparison group) are insufficient for judging the importance of the difference (between the probabilities of the outcome). A relative effect may give the impression that a difference is larger than it actually is when the likelihood of the outcome is small to begin with.
For example, if a treatment reduces the probability of getting an illness by 50% but also has harms, and your risk of getting the illness is 2 in 100, receiving the treatment is likely to be worthwhile. If, however, your risk of getting the illness is 2 in 10,000, then receiving the treatment is unlikely to be worthwhile even though the relative effect is the same.
Implication: Always consider the absolute effects of treatments – that is, the difference in outcomes between the treatment groups being compared. Do not make a treatment decision based on relative effects alone.
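The arithmetic behind the example above can be sketched as follows. The 50% relative reduction and the two starting risks come from the example. The "number needed to treat" (one divided by the absolute reduction) is a common summary added here for illustration and is not part of the original list.

```python
from fractions import Fraction

def absolute_effect(baseline_risk, relative_reduction):
    """Return (risk with treatment, absolute risk reduction, number needed to treat)."""
    treated_risk = baseline_risk * (1 - relative_reduction)
    absolute_reduction = baseline_risk - treated_risk
    number_needed_to_treat = 1 / absolute_reduction
    return treated_risk, absolute_reduction, number_needed_to_treat

halving = Fraction(1, 2)  # the 50% relative reduction from the example

for baseline in (Fraction(2, 100), Fraction(2, 10_000)):
    treated, reduction, nnt = absolute_effect(baseline, halving)
    print(f"risk without treatment {float(baseline):.4%}, with treatment {float(treated):.4%}, "
          f"absolute reduction {float(reduction):.4%}, people treated per illness avoided: {nnt}")
```

The relative effect is identical in both cases, but at a starting risk of 2 in 100 the treatment prevents one illness for every 100 people treated, while at 2 in 10,000 it takes 10,000 people. Any harms are carried by everyone treated, so the balance differs greatly.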
2-14 Average measures of effects can be misleading
For outcomes that are measured on a scale (e.g. weight or pain) the difference between the average in one treatment group and the average in a comparison group may not make it clear how many people experienced a big enough change (e.g. in weight or pain) for them to notice it, or that they would regard as important.
Implication: When outcomes are measured on a scale, it cannot be assumed that everyone has experienced the average effect of a treatment.
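A small made-up example shows how the same average can hide very different experiences. The pain-score changes and the threshold for a noticeable change below are assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical improvements in pain score (in points) for ten people per group.
steady_group = [1] * 10            # everyone improves a little
uneven_group = [5, 5] + [0] * 8    # two people improve a lot, eight not at all

NOTICEABLE = 2  # assumed smallest change a person would notice

def noticeable_share(changes, threshold=NOTICEABLE):
    """Proportion of people whose improvement reaches the threshold."""
    return sum(change >= threshold for change in changes) / len(changes)

for name, changes in (("steady", steady_group), ("uneven", uneven_group)):
    print(f"{name}: average improvement {mean(changes)}, "
          f"share with a noticeable improvement {noticeable_share(changes):.0%}")
```

Both groups have the same average improvement, yet in one group nobody notices a change and in the other a fifth of the people do.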
2-15 Fair comparisons with few people or outcome events can be misleading
When there are only a few outcome events, differences in outcome frequencies between the treatment comparison groups may easily have occurred by chance and may mistakenly be attributed to differences between the treatments.
Implication: Be cautious about relying on the results of treatment comparisons with few outcome events. The results of such comparisons could be misleading.
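To give a sense of how easily chance produces differences in small comparisons, the sketch below calculates the probability of seeing a gap of at least 5 percentage points between two groups that actually have the same true risk. The 10% risk, the group sizes and the 5-point gap are assumptions picked for the example.

```python
from math import comb

def binomial_pmf(n, p):
    """Probability of each possible number of events (0..n) among n people."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def chance_of_gap(n, risk, gap_points=5):
    """Probability that two groups of n people, sharing the same true risk,
    show observed risks at least gap_points percentage points apart."""
    pmf = binomial_pmf(n, risk)
    total = 0.0
    for a, prob_a in enumerate(pmf):
        for b, prob_b in enumerate(pmf):
            # Integer comparison avoids floating-point rounding at the boundary.
            if abs(a - b) * 100 >= gap_points * n:
                total += prob_a * prob_b
    return total

for n in (20, 500):
    print(f"{n} people per group: {chance_of_gap(n, 0.10):.1%} chance of a gap of 5 points or more")
```

With 20 people per group and only a couple of events expected in each, a gap of this size is the usual result even though the treatments are identical. With 500 people per group it becomes rare.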
2-16 Confidence intervals should be reported
The observed difference in outcomes is the best estimate of how effective or safe treatments are (or would be, if the comparison were made in many more people). However, because of the play of chance, the true difference may be larger or smaller. The confidence interval is the range within which the true difference is likely to lie, after taking into account the play of chance. Although a confidence interval (margin of error) is more informative than a p-value, the latter is often reported. P-values are often misinterpreted to mean that treatments have or do not have important effects.
Implication: Understanding a confidence interval may be necessary to understand the reliability of an estimated treatment effect. Whenever possible, consider confidence intervals when assessing estimates of treatment effects. Do not be misled by p-values.
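As a hedged illustration, the sketch below computes an approximate 95% confidence interval for the difference between two groups' risks, using a simple normal approximation. This is one common method among several, and the event counts are invented. It compares a small and a large study that observe exactly the same difference.

```python
import math

def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Return (difference, lower bound, upper bound) for risk in A minus risk in B.

    Uses a normal approximation; z=1.96 corresponds to a 95% interval.
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    difference = risk_a - risk_b
    standard_error = math.sqrt(
        risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b
    )
    return difference, difference - z * standard_error, difference + z * standard_error

# Hypothetical studies: both observe 20% versus 30% with an unwanted outcome.
small_study = risk_difference_ci(10, 50, 15, 50)
large_study = risk_difference_ci(200, 1000, 300, 1000)

for name, (diff, low, high) in (("small", small_study), ("large", large_study)):
    print(f"{name} study: difference {diff:+.1%}, 95% interval {low:+.1%} to {high:+.1%}")
```

The best estimate is the same in both studies, but the small study's interval is wide and includes no difference at all, while the large study's interval is narrow. That contrast is lost if only the observed difference is reported.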
2-17 Don’t confuse “statistical significance” with “importance”
Statistical significance is often confused with importance. The cut-off for considering a result as statistically significant is arbitrary, and statistically non-significant results can be either informative (showing that it is very unlikely that a treatment has an important effect) or inconclusive (showing that the relative effects of the treatments compared are uncertain).
Implication: Claims that results were significant or non-significant usually mean that they were statistically significant or non-significant. This is not the same as important or not important. Do not be misled by such claims.
2-18 Don’t confuse “no evidence of a difference” with “evidence of no difference”
Systematic reviews sometimes conclude that there is “no evidence” of effect when there is uncertainty about the difference between two treatments. This is often misinterpreted as meaning that there is no difference between the treatments compared. However, studies can never show that there is “no effect” or “no difference”. They can only rule out important effects or differences.
Implication: Don’t be misled by statements of “no effect” or “no difference” between treatments. Consider instead the degree to which it is possible to confidently rule out an important difference.
Choices: making informed choices
Well-informed treatment choices require judgements about relevance and importance. The results of specific fair comparisons may not be relevant to you.
3-1 Do the outcomes measured matter to you?
A fair comparison may not include all outcomes that are relevant to treatments. Patients, professionals and researchers may have different views about which outcomes are important. For example, studies often measure outcomes, such as heart rhythm irregularities, as surrogates for important outcomes, like death after heart attack. However, the effects of treatments on surrogate outcomes often do not provide a reliable indication of the effects on outcomes that are important.
Implication: Always consider the possibility that outcomes that are important to you may not have been addressed in fair comparisons. Do not be misled by surrogate outcomes.
3-2 Are you very different from the people studied?
Systematic reviews of studies that only include animals or a selected minority of people are unlikely to provide results that are relevant to most people.
Implication: Results of systematic reviews of studies in animals or highly-selected groups of people may be misleading.
3-3 Are the treatments practical in your setting?
A fair comparison of the effects of a surgical procedure done in a specialised hospital may not provide a reliable estimate of the effects and safety of the same procedure performed in other settings. Similarly, comparing a new drug to a drug or dose that is not commonly used (and which may be less effective or safe than those in common use) would not provide a good estimate of how the new drug compares to what is commonly done.
Implication: Be aware that your circumstances may be sufficiently different from those in the research studies, and that the results may not apply to you.
3-4 Do treatment comparisons reflect your circumstances?
Some treatment comparisons are designed to find out if a treatment can work under ideal circumstances, for example with people who are most likely to benefit, and most likely to comply, and with highly trained practitioners who deliver the treatment exactly as intended. These comparisons, which are sometimes called explanatory or efficacy studies, may not reflect what happens under usual circumstances.
Implication: Be aware that the results of studies with the aim of finding out if a treatment can work may overestimate the benefits of a treatment under more usual circumstances.
3-5 How certain is the evidence?
The certainty of the evidence (the extent to which the research provides a good indication of the likely effects of treatments) can affect the treatment decisions people make. For example, someone might decide not to use or to pay for a treatment if the certainty of the evidence is low or very low. How certain the evidence is depends on the fairness of the comparisons, the risk of being misled by the play of chance, and how directly relevant the evidence is. Systematic reviews provide the best basis for these judgements and should report an assessment of the certainty of the evidence based on these judgements.
Implication: When using the findings of systematic reviews to inform your decisions, always consider the degree of certainty of the evidence.
3-6 Do the advantages outweigh the disadvantages?
Decisions about whether or not to use a treatment should be informed by the balance between the potential benefits and the potential harms, costs and other advantages and disadvantages of the treatment. This balance often depends on the baseline risk (i.e. the likelihood of an individual experiencing an undesirable event), or on the severity of the symptoms. The balance between the advantages and disadvantages of a treatment is more likely to favour taking a treatment for people with a higher baseline risk or more severe symptoms.
Implication: Always consider the balance between advantages and disadvantages of treatments, taking into consideration the baseline risk or the severity of symptoms.