We define the following words and phrases to unclutter our communications while retaining clarity of meaning. These definitions are predominantly statistical in nature and align with commonly accepted statistical meanings. Although our plain-language terms may cause some initial confusion for those who prefer standard statistical language, they should clarify things for the primary audience: those with a diagnosis and their loved ones.

Evidence-based. Treatments that are based on empirical evidence ranging from well-designed non-randomized open-label trials to meta-analyses and systematic reviews of randomized controlled trials. Evidence-based treatments vary in their strength of evidence. We subdivide strength of evidence into gold-standard evidence and suggestive evidence.

Gold-standard or solid evidence. Results from trials that match at least one of the following: (i) a well-designed randomized controlled trial (RCT) with at least 30 people* in the treatment arm, (ii) a well-designed meta-analysis or systematic review of such RCTs with a narrow confidence interval, or (iii) in cases of conflicting well-designed studies or meta-analyses, the preponderance of evidence of (i) and (ii). Further, the RCTs should be double-blinded when reasonably possible. The nature of certain therapies, especially psychosocial therapies, makes blinding effectively impossible.


Suggestive evidence. Suggestive evidence is empirical evidence that has lower strength of evidence than gold-standard evidence. This can be well-designed non-randomized studies, significant cohort or case-control studies, or situations in which multiple RCTs show conflicting results, but at least one well-designed RCT is supportive.

Treatment works. A treatment works if there is gold-standard evidence that supports that the treatment provides statistically better outcomes than placebo.

Treatment doesn’t work. A treatment doesn't work if there is gold-standard evidence that the treatment fails to provide statistically better outcomes than placebo. A treatment that doesn't work may still provide benefit to some individuals outside the statistical norms.

Response or Respond. Response is the measure of improvement - very often a degree of symptom improvement - chosen by the study designer as a threshold to classify whether the treatment was beneficial. People who experience that degree of improvement or better are said to have responded or are responders to the treatment. The threshold is usually > 50% symptom improvement, though some studies set it as low as 20%.

Substantial symptom improvement or substantial benefit. Substantial symptom improvement and substantial benefit are terms that mean response. When used in the context of a particular study, they mean response as defined by the study designer. When used more broadly to describe symptom improvement, they mean >50% symptom improvement. These terms are introduced because most non-medical people are not familiar with the statistical definition of response but easily understand the concept of substantial symptom improvement embodied in the idea of response. In a study where 45% of people respond to treatment, we say "45% of people see substantial benefit" from treatment.

Number Needed to Treat (NNT) and Number Needed to Harm (NNH). A treatment's NNT is considered one of the best metrics to predict the impact of treatment and compare treatments. NNT teases apart the placebo (control) impact from the non-placebo impact of a treatment. Likewise, NNH teases apart the frequency of side effects attributable to the treatment from the frequency of side effects found in the control group. Using NNT and NNH helps avoid overestimating both the effectiveness and the negative impact of a treatment by removing the placebo impact.

"Attributable benefit", benefit "due to", and "odds that a treatment works". All of these terms mean the same thing and equal 1/NNT. We use a variety of terms to enhance communication and choose the phrase that is most easily understood and most appropriate for the context. These terms represent the benefit of a treatment above placebo, which equals the response rate in the treatment group less the response rate in the placebo (control) group. This difference is also called the absolute risk reduction. For example, consider a trial where a drug treatment group has a 50% response rate and the placebo group a 30% response rate. The attributable benefit is 50% - 30% = 20%, and the NNT = 1/(50% - 30%) = 5. The 50% treatment response rate is not fully attributable to the treatment; some is attributable to the placebo effect, so we must subtract out those who would have responded to placebo (control). See a review of attributable risk. Therefore, in this example, the odds that you will see substantial symptom improvement (i.e. respond) attributable to treatment, and the odds that the treatment works, are 20%. Note the distinction between the two phrases substantial symptom improvement attributable to treatment (or "due to" treatment) and substantial symptom improvement. The former excludes the placebo (control) effect; the latter does not. The goal in introducing these phrases is to use accessible language yet still be clear about their statistical meaning.
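For readers who like to check the arithmetic, the example above can be reproduced with a minimal Python sketch. The function names are our own illustrations, not part of any statistical library, and rates are written as fractions (0.50 for 50%).

```python
def attributable_benefit(treatment_response_rate, control_response_rate):
    """Response rate in the treatment group less the response rate in the control group."""
    return treatment_response_rate - control_response_rate


def number_needed_to_treat(treatment_response_rate, control_response_rate):
    """NNT = 1 / attributable benefit."""
    benefit = attributable_benefit(treatment_response_rate, control_response_rate)
    if benefit <= 0:
        # Assumption of this sketch: NNT is only reported when treatment beats control.
        raise ValueError("treatment did not outperform control; NNT is not meaningful")
    return 1 / benefit


# Example from the text: 50% response with treatment, 30% with placebo.
benefit = attributable_benefit(0.50, 0.30)
nnt = number_needed_to_treat(0.50, 0.30)

print(f"Attributable benefit: {benefit:.0%}")
print(f"NNT: {nnt:.1f}")
```

The guard against a zero or negative difference is a design choice for this sketch: when the treatment group does no better than the control group, dividing by the difference would give a meaningless or undefined number.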


"Attributable risk", "attributable harm", and risk "due to". All of these terms mean the same thing and equal 1/NNH. This concept parallels the immediately preceding one, except that it relates to harms such as side effects. If 25% of people in the treatment group experience a particular side effect and 5% in the control group, then the treatment has an attributable risk of 20% for the side effect. Studies rarely provide attributable harm information, so in nearly all cases we specify the frequency of a harm experienced in the treatment group.
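The same arithmetic applies to harms. The sketch below, again using illustrative function names of our own choosing, works through the side-effect example above and derives the NNH from the relationship attributable risk = 1/NNH.

```python
def attributable_risk(treatment_harm_rate, control_harm_rate):
    """Harm rate in the treatment group less the harm rate in the control group."""
    return treatment_harm_rate - control_harm_rate


def number_needed_to_harm(treatment_harm_rate, control_harm_rate):
    """NNH = 1 / attributable risk."""
    risk = attributable_risk(treatment_harm_rate, control_harm_rate)
    if risk <= 0:
        # Assumption of this sketch: NNH is only reported when the harm is
        # more frequent in the treatment group than in the control group.
        raise ValueError("harm is not more frequent with treatment; NNH is not meaningful")
    return 1 / risk


# Example from the text: 25% of the treatment group and 5% of the control
# group experience the side effect.
risk = attributable_risk(0.25, 0.05)
nnh = number_needed_to_harm(0.25, 0.05)

print(f"Attributable risk: {risk:.0%}")
print(f"NNH: {nnh:.1f}")
```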

Relative Risk. Relative risk (RR) is another, but less desirable, way to compare risk between the treatment group and the control group. The relative risk is the proportion of bad outcomes in the treatment group divided by the proportion of bad outcomes in the control group. If 15% of people in the treatment group experience weight gain and only 5% in the placebo (control) group, then the relative risk (RR) of weight gain is 15%/5% = 3 (i.e. three times as likely to experience weight gain if treated). The issue with RR is that it doesn't give a sense of absolute significance, which is especially important for fairly rare events. We sometimes use RR when we are unable to extract the absolute risk reduction (ARR) from the study.
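To illustrate why RR alone can mislead, the following Python sketch compares the weight-gain example above with a rare side effect. The rare-event rates (0.3% and 0.1%) are hypothetical numbers made up for illustration, not taken from any study, and the function names are our own.

```python
def relative_risk(treatment_rate, control_rate):
    """Proportion of bad outcomes with treatment divided by the proportion in control."""
    if control_rate <= 0:
        raise ValueError("relative risk is undefined when the control rate is zero")
    return treatment_rate / control_rate


def absolute_difference(treatment_rate, control_rate):
    """Proportion of bad outcomes with treatment less the proportion in control."""
    return treatment_rate - control_rate


examples = {
    # Weight-gain example from the text: 15% vs 5%.
    "weight gain (from the text)": (0.15, 0.05),
    # Hypothetical rare side effect (invented rates): 0.3% vs 0.1%.
    "rare side effect (hypothetical)": (0.003, 0.001),
}

for label, (treated, control) in examples.items():
    rr = relative_risk(treated, control)
    diff = absolute_difference(treated, control)
    print(f"{label}: RR = {rr:.1f}, absolute difference = {diff:.1%}")
```

Both cases have roughly the same RR of 3, yet the absolute difference is 10 percentage points in the first and a fraction of a percentage point in the second.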

Recovery. We mean personal recovery: the combination of clinical recovery plus gains in nonclinical factors such as meaning, purpose, happiness, employment, relationships, and self-determination. Clinical recovery is related to symptom reduction only. We choose this measure because it is what people in mental distress predominantly seek.


Remission. An absolute symptom level that does not interfere with an individual's behavior, is below that required for a diagnosis of a disorder, and is maintained for at least 6 months.


Significant. We seek to avoid the stand-alone adjective significant because of the frequent confusion between its statistical meaning and its non-statistical meaning. We prefer the terms statistically significant and substantial benefit, respectively.

Clinically significant. Clinically significant benefit for a treatment is a level of symptom improvement that can be reasonably and clearly discerned by a patient. Clinically significant is differentiated from statistically significant. For example, antidepressants are shown to have statistically significant benefit for depression, but some meta-analyses show this benefit is as low as 8% over placebo. Some researchers and clinicians see this small benefit as not clinically significant since a person would have a difficult time discerning an 8% decrease in symptoms.

* = we consider this threshold quite low, but it is the number chosen by the Canadian Network for Mood and Anxiety Treatments (CANMAT) and International Society for Bipolar Disorders (ISBD) and reflects that the evidence for some bipolar treatments can be scant.

