Martin M. Katz: Onset of antidepressant effect - Leslie C. Morey’s comment
In following the interesting discussion between Dr. Klein and Dr. Katz with regard to the onset of antidepressant effects, there appears to be an opportunity to illustrate the implications of research results for clinical decision-making. In attempting to do so, I will use the 2x2 tables provided by Dr. Katz from the Katz et al. (2011) paper to explore some of the concerns raised by Dr. Klein. I believe that both Drs. Klein and Katz raise important points. Please note that this comment seeks to illustrate a point using the Katz data; I make no claims about interpreting the general trends in this research literature.
I believe that one of the central critiques from Dr. Klein is that the data from Katz et al. (2011) do not necessarily support a recommendation for early discontinuation of an antidepressant based upon a lack of an early response. Dr. Katz notes that there are clear data suggesting that active drug demonstrates superiority to placebo, at a group level, within the first week or two of treatment. However, Dr. Klein points out, accurately I believe, that these group differences do not necessarily support the premise of recommending early discontinuation for a particular patient. This is basically an issue of decision theory, in which we must use Bayesian concepts to assist our decision making. The most relevant concept for this particular issue is that of “negative predictive power” (NPP): if we have decided, based upon early indications, that there will not be a treatment response, what is the probability that this decision will be correct? I concur with Dr. Klein that the numbers presented in Katz et al. (2011), which highlight “Percentage of Correct Predictions at Treatment Outcome”, are examples of “sensitivity” values, which do not directly address the advisability of recommendations based upon early response. As noted by Dr. Klein, what is needed to determine these NPP values are the full 2x2 (early response by final response) results from that study, which Dr. Katz has provided in his most recent comment.
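To make the distinction between sensitivity and NPP concrete, here is a minimal sketch of how the four indices come out of a single 2x2 table. The class name, field names, and counts are my own assumptions for illustration; the counts are hypothetical and are not taken from Dr. Katz's tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarlyByFinal:
    """2x2 table of early improvement (rows) by final response (columns)."""
    early_resp: int        # early improvement, responder at endpoint
    early_nonresp: int     # early improvement, non-responder at endpoint
    no_early_resp: int     # no early improvement, responder at endpoint
    no_early_nonresp: int  # no early improvement, non-responder at endpoint

    def sensitivity(self) -> float:
        # P(early improvement | final response)
        return self.early_resp / (self.early_resp + self.no_early_resp)

    def specificity(self) -> float:
        # P(no early improvement | final non-response)
        return self.no_early_nonresp / (self.no_early_nonresp + self.early_nonresp)

    def ppp(self) -> float:
        # Positive predictive power: P(final response | early improvement)
        return self.early_resp / (self.early_resp + self.early_nonresp)

    def npp(self) -> float:
        # Negative predictive power: P(final non-response | no early improvement)
        return self.no_early_nonresp / (self.no_early_nonresp + self.no_early_resp)


# HYPOTHETICAL counts for 50 patients; not taken from Dr. Katz's tables.
table = EarlyByFinal(early_resp=26, early_nonresp=13,
                     no_early_resp=1, no_early_nonresp=10)

for name in ("sensitivity", "specificity", "ppp", "npp"):
    print(f"{name}: {getattr(table, name)():.1%}")
```

Sensitivity and specificity condition on the final outcome, whereas PPP and NPP condition on the early observation, which is the only information the clinician has at the time of the decision.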
However, there are inconsistencies among Dr. Katz’s 2x2 tables. In his first two tables (concerning Depressed Mood and Anxiety), he reports that 27 patients in his study demonstrated a 50% decrease on the HAM-D with active treatment, which amounts to a 54% response rate. However, in his third table on the HAM-D, he reports that 39 patients demonstrated the same 50% decrease with active treatment, which amounts to a 78% response rate. After going back to the Katz et al. (2011) re-analysis and then back to the original Katz et al. (2004) paper, it appears that there was a 62% response rate to DBI (apparently 16 of 26 patients) and a 46% response rate to paroxetine (11 of 24 patients); thus, the 54% overall response rate appears to be the correct one. As such, the third table on the HAM-D is probably presented incorrectly: the marginal probabilities line up in a manner suggesting that rows and columns have been switched. If we transpose that third table according to this supposition, we arrive at the appropriate 54% response rate for active treatment. The numbers in the fourth 2x2 table, which describes placebo, also do not appear to add up correctly. The Katz et al. (2004) article indicates a 30% response rate for placebo (presumably 6 of 20 patients), yet the 2x2 table in Dr. Katz’s response indicates that 15 of 19 patients recovered. Again, it appears that rows and columns were switched; transposing the table reproduces the reported result of 6 patients responding to placebo over the course of the trial.
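The consistency check described above can be sketched in a few lines. The per-drug figures (16 of 26 and 11 of 24) are those quoted in this comment; the 2x2 cell counts and both function names are hypothetical, chosen only so that the margins reproduce the 78% and 54% figures.

```python
def transpose(table):
    """Swap the rows and columns of a 2x2 table given as nested lists."""
    (a, b), (c, d) = table
    return [[a, c], [b, d]]


def first_column_share(table):
    """Share of all patients who fall in the first column of the table."""
    (a, b), (c, d) = table
    return (a + c) / (a + b + c + d)


# Pooled response rate from the per-drug figures quoted in the text.
pooled = (16 + 11) / (26 + 24)

# HYPOTHETICAL cell counts. The first column is read as "final responders",
# but the table is assumed to have been printed with its dimensions swapped.
as_presented = [[26, 1], [13, 10]]
corrected = transpose(as_presented)

print(f"pooled response rate:       {pooled:.0%}")
print(f"response rate as presented: {first_column_share(as_presented):.0%}")
print(f"response rate, transposed:  {first_column_share(corrected):.0%}")
```

Only the transposed table agrees with the pooled rate, which is the sense in which the marginals suggest that rows and columns were switched.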
If my reorganization of these data is correct, we can now calculate the four NPP values needed to address Dr. Klein’s question. These are as follows:
NPP = accuracy of a decision to discontinue medication at 2 weeks based upon < 20% improvement:
Active Drug: NPP
Depressed Mood-Retardation 88.2%
Hamilton Rating Scale 90.9%
Placebo: NPP
Hamilton Rating Scale 75.0%
What do these values tell us? It appears that in this limited sample, a lack of early response to active drug has considerable negative predictive power, consistent with the viewpoint of Dr. Katz. It is reasonable to consider the HAM-D results as the most reliable and broad indicator of early response. If all of the patients without an early response on the HAM-D had been discontinued early in the trial, only 9.1% of them would have been patients who went on to respond by the end of the trial. There are other utility considerations as to whether that “error rate” is acceptable, but the data do indicate that this early information about response on the HAM-D is quite predictive of eventual outcome. In addition, the 75% NPP for placebo provides a reasonable rationale for rapid discontinuation of placebo if there is no early response, a result that might be anticipated.
The limits of drawing broad conclusions about this issue from these data should be apparent. First, I am presuming that I have interpreted (or reinterpreted) the tables from Dr. Katz correctly. Furthermore, the sample size is not sufficient to have confidence that these Bayesian estimates would be stable across other samples. In addition, this particular study was of two specific antidepressants. Because these Bayesian estimates are strongly influenced by a priori probabilities (here, the likelihood of a positive response to treatment over the course of the trial), these estimates would vary with different medications having different ultimate treatment efficacy, even if the early response profiles did not differ from those observed here. Even so, I believe this interchange is valuable as a demonstration of the need to consider the implications of research findings at the level of the individual treatment decision. In sum, I believe that Dr. Klein is correct in suggesting that Katz et al. did not present the most informative analyses to answer the central question in their 2011 paper; that paper essentially presented sensitivity and specificity values, but what was needed to answer this question were positive predictive power and, more importantly for the discontinuation question, negative predictive power. Even so, when the proper figures are calculated, I believe that Dr. Katz is still correct: a decision to discontinue treatment following the lack of an early response is likely to be correct (based upon these limited data, and on the HAM-D) about 90% of the time.
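Two of the caveats above, sensitivity to the a priori response rate and instability in a small sample, can be illustrated with a short sketch. The sensitivity and specificity values are hypothetical, and the Wilson score interval is my own choice of method for expressing uncertainty around 10 correct decisions out of 11; neither is taken from the Katz papers.

```python
import math


def npp_from_rates(sensitivity, specificity, base_rate):
    """NPP by Bayes' rule; base_rate is the prior probability of final response."""
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    return true_neg / (true_neg + false_neg)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# HYPOTHETICAL early-response profile, held fixed while the base rate varies.
SENS, SPEC = 26 / 27, 10 / 23

for base_rate in (0.30, 0.54, 0.80):
    npp = npp_from_rates(SENS, SPEC, base_rate)
    print(f"base rate {base_rate:.0%}: NPP {npp:.1%}")

low, high = wilson_interval(10, 11)
print(f"approximate 95% interval for 10/11: {low:.1%} to {high:.1%}")
```

With the early-response profile held constant, the NPP falls as the overall response rate rises, and the interval around an estimate based on 11 patients is wide.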
Leslie C. Morey
December 17, 2015