Charles M. Beasley, Jr. and Roy Tamura: What We Know and Do Not Know by Conventional Statistical Standards About Whether a Drug Does or Does Not Cause a Specific Side Effect (Adverse Drug Reaction): Overview


Charles M. Beasley, Jr.: Reply to Hector Warnes’ comments


          We thank Dr. Warnes for his comments about our recent set of postings.  Dr. Warnes has clearly understood one of our most important points: it is likely that some proportion of the adverse drug reactions listed in a drug’s product labeling lack the level of “proof” of being adverse drug reactions of that drug that is required as “proof” of efficacy for the drug to be approved for the treatment of a medical disorder.  He goes on to “dare to say” that perhaps one-third to one-half of listed adverse drug reactions lack “proof” as adverse reactions equivalent to “proof” of efficacy.

          Our experience over our 28 years in the pharmaceutical industry suggests that United States Food and Drug Administration expectations for the labeling of adverse events as adverse drug reactions have moved toward a standard of at least some credible evidence that an adverse event is an adverse drug reaction.  For fluoxetine, the first drug that reached approval shortly after I joined Eli Lilly and Company in 1987 and with which I worked directly, virtually all adverse events reported in the clinical development trials were listed.  For olanzapine, approved in 1996, adverse events that were highly non-specific, or for which there was substantial reason to believe they were not adverse drug reactions, were not listed.  For tadalafil, approved in 2003, the only adverse events listed were those with reasonable evidence of being adverse drug reactions and those of such major clinical significance that medical prudence suggested the need to include them.  I cannot speak to the specifics of adverse drug reaction listing standards in other regulatory venues.  However, a widely held concept is that being over-inclusive of adverse events that are unlikely to be adverse drug reactions dilutes the clinical utility of product labeling.

          We did not offer any estimate of the proportion of listed adverse drug reactions lacking robust “proof” of status.  However, we did suggest that for many drugs the adverse events labeled as adverse drug reactions would need to occur with an incidence >2-3%, with a substantially lower incidence in the appropriate control group, to have robust “proof” of being adverse drug reactions (Section 6).  This required incidence varies depending on the sample sizes of the investigational drug group and the proper control group included in the set of studies in the development program for the drug.  Cardiovascular and anti-diabetic drug classes, among a few other classes, often have much larger sample sizes useful for proper comparisons in their development databases than other classes such as drugs for psychiatric disorders.  The larger the useful comparative sample sizes, the greater the sensitivity to smaller differences in, or ratios of, incidences between groups.
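
          The dependence of sensitivity on sample size can be sketched numerically.  The example below is a rough illustration only: it assumes a two-sided, two-sample z-test for proportions with equal group sizes, and the incidences (3% on drug, 1% on control) and the function name are illustrative assumptions, not values or methods taken from our postings.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_drug, p_control, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two incidences.

    Normal approximation with equal group sizes; the negligible
    probability of rejecting in the wrong direction is ignored.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p_drug + p_control) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(
        (p_drug * (1 - p_drug) + p_control * (1 - p_control)) / n_per_group
    )
    return std_normal.cdf((abs(p_drug - p_control) - z_alpha * se_null) / se_alt)


# Hypothetical incidences: 3% on drug versus 1% on control.
for n in (250, 500, 1000, 2000):
    print(n, round(power_two_proportions(0.03, 0.01, n), 3))
```

          Under these assumptions the estimated power rises steadily with the per-group sample size, which is the sense in which larger development databases are more sensitive to small differences in incidence.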

          Lack of robust “proof” that an adverse event is an adverse drug reaction for a given drug does not, in our opinion, imply that adverse events with lesser evidence of being adverse drug reactions should not be listed as adverse drug reactions.  Consistent with the first principle of “first, do no harm,” it is reasonable to expect a lower standard of “proof” than that required for efficacy to list an adverse event as an adverse drug reaction.  We believe that what is most fundamentally important is for any individual who uses these lists of adverse drug reactions for any purpose to recognize the potential for false positive inclusion of an adverse event in the list of adverse drug reactions.  Most persons probably recognize that if a medical condition (adverse event) is not listed as an adverse drug reaction, this is not strong evidence that the medical condition is not an adverse drug reaction with a very low incidence.  This matter was addressed in our discussion of the Rule-of-3.
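
          As a reminder of what the Rule-of-3 implies, the sketch below compares the approximation 3/n with the exact one-sided 95% upper confidence bound on the true incidence when an event is observed in none of n treated patients.  The function names and the sample sizes shown are illustrative assumptions.

```python
def rule_of_three_upper(n):
    """Approximate 95% upper bound on incidence after 0 events in n patients."""
    return 3.0 / n


def exact_upper(n, confidence=0.95):
    """Exact upper bound: the incidence p solving (1 - p)**n = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# Hypothetical database sizes; zero events of interest observed in each.
for n in (100, 1000, 5000):
    print(n, rule_of_three_upper(n), exact_upper(n))
```

          Even with no events among several thousand patients, a true incidence of a few per ten thousand cannot be excluded, which is why absence from the list is weak evidence against a rare adverse drug reaction.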

          As Dr. Warnes pointed out, and as our work illustrated, rare adverse drug reactions are almost always identified after initial drug approval.  This identification begins with the observation and reporting of adverse events.  We briefly discussed the need for better (higher quality, more robust “proof”) and more rapid means of determining whether such events are or are not adverse drug reactions.

          Of the eight points of Dr. Mehta cited by Dr. Warnes, we believe numbers seven and eight are particularly relevant to our work.  Number seven addresses alternative statistical methods.  We showed that alternative inferential analytical methods for the same outcome data could require different sample sizes.  Expert statistical consultation can optimize the analytical methods both for planned, a priori analysis of a specific data type collected in an experimental design and for post hoc analysis of such data.
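
          The point that the choice of method changes the required sample size can be illustrated with a simple, hypothetical comparison.  The sketch below contrasts the per-group sample size for an uncorrected two-sample z-test for proportions with the size after the Fleiss continuity correction (which approximates the requirement of a corrected chi-square test).  The choice of these two methods, the incidences (3% versus 1%), and the 80% power target are assumptions made for illustration and are not necessarily the methods compared in our postings.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_uncorrected(p1, p0, alpha=0.05, power=0.80):
    """Per-group n for a two-sided z-test of two proportions (no correction)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p0) / 2
    numerator = (
        z_a * sqrt(2 * p_bar * (1 - p_bar))
        + z_b * sqrt(p1 * (1 - p1) + p0 * (1 - p0))
    ) ** 2
    return numerator / (p1 - p0) ** 2


def n_continuity_corrected(p1, p0, alpha=0.05, power=0.80):
    """Per-group n after the Fleiss continuity correction (equal groups)."""
    n = n_uncorrected(p1, p0, alpha, power)
    return n / 4 * (1 + sqrt(1 + 4 / (n * abs(p1 - p0)))) ** 2


# Hypothetical incidences: 3% on drug versus 1% on control.
print(ceil(n_uncorrected(0.03, 0.01)), ceil(n_continuity_corrected(0.03, 0.01)))
```

          The same data, hypothesis, alpha, and power lead to a noticeably larger required sample size under the more conservative method, which is one reason the analytical method should be chosen with expert statistical input before the sample size is fixed.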

          Point number eight is critical to the process of performing the best job possible in the development of a list of adverse drug reactions for a given drug.  Inferential statistical results provide what is essentially a probability estimate for the “proof of the truth” of the null hypothesis that might be rejected.  The best decisions about what are or are not adverse drug reactions result from complex cognitive processes involving multiple levels and types of data.  These types of data can range from information about the pharmacological actions of the drug and well-accepted consequences of those actions, through information about the kinetics and metabolism of the drug, on to individual case reports, and finally to formal studies with or without randomization and with or without proper control.  However, statistically significant evidence from studies with randomization and proper control comparison is one important and very robust type of data.


August 15, 2019