Thomas A. Ban
Neuropsychopharmacology in Historical Perspective.
Education in the field in the Post-Psychopharmacology Era
Martin M. Katz*: Clinical Trials of Antidepressants: How Changing the Model Can Uncover New, More Effective Molecules (New York: Springer; 2016)
Book review followed by an exchange between the author and Per Bech, Walter Brown, Malcolm Lader and Donald F. Klein
Review by Martin M. Katz
INFORMATION ON CONTENT: This brief book makes note of recent failures and abandonment by many companies of antidepressant drug development. It takes current clinical trial protocols to task and replaces them with a contemporary framework for improving next-generation antidepressants and their underlying science. New, innovative models are based on a neurobehaviorally informed understanding of drug mechanisms and the component cognitive, mood, and behavioral aspects of depression. The book reconceptualizes not only the clinical trial process but the clinical concept of depression itself, from a “holistic” to a “dimensional” model. These changes are essential to bring pharmaceutical research and development up to date, in order to boost efficiency and effectiveness in finding new molecules and reducing waste. In proposing a new theory of depression, it brings decades of research on the onset and specificity of drug actions up to date, illustrating the application of the new models with case studies and a review of salient depression methods. It is a follow-up to the author’s earlier, more conceptually oriented treatment of the subject in his book, Depression and Drugs (Springer 2013), demonstrating the potential benefits of such wide-scale change.
Included in the coverage:
· Why now the need for a new clinical trials model for antidepressants?
· Aims and basic requirements of clinical trials: conventional and component-specific models.
· Methods for measuring the components and the profile of drug actions: the multivantaged and video approaches.
· Achieving the ideal clinical trial: an example of the merged componential and established models.
· Prediction and shortening the clinical trial.
· The video clinical trial.
AUTHOR’S COMMENT: This new book was designed to follow up the author’s earlier treatise: Depression and Drugs: The Neurobehavioral Structure of a Psychological Storm (Springer 2013). It is intended, in part, to apply the principles, the new theory of “opposed neurobehavioral states” and the methodology developed to test that theory, and to manage the thorny problems associated with the evaluation of new putative antidepressants. The multivantaged (MV) and video models for evaluation are described in the first book and illustrated in more detail in the new presentation, complete with case studies so that the reader can more easily follow the procedures. It is hoped that these “new” models can advance the science and introduce greater efficiency into the trial process, thus encouraging the development of more effective and more rapidly acting drugs.
February 25, 2016
Per Bech May 26, 2016 Comment
Martin Katz July 14, 2016 Reply to Bech’s comments
Walter Brown July 7, 2016 Comment
Martin Katz July 21, 2016 Reply to Brown’s comment
Malcolm Lader June 23, 2016 Comment
Martin Katz July 28, 2016 Reply to Lader’s comment
Walter Brown August 3, 2017 Final comment
Donald F. Klein January 11, 2018 Final comment
Per Bech’s comment
Although the pharmacological industries have doubled their research funds since 1991 for identifying new drugs with an antidepressant action more efficacious than the existing antidepressants, no such new drugs have been marketed. This is the background for Martin Katz’ description of how his model can uncover new, more effective molecules.
The FDA requirements focusing on optimal dosage and marketability are referred to by Katz as an applied science, whereas his new model is a component-specific model in which the clinical trial becomes a potential step in facilitating an advance to finding new and more effective treatments of major depression. Thus, according to Katz, antidepressants are not “diagnosis-specific,” but are in their modes of action “component-specific.” He refers, in this respect, to Hotelling’s principal component analysis by which he has identified such components as depressed mood, psychic anxiety, psychomotor retardation, psychomotor agitation, hostility, somatization, interpersonal sensitivity, sleep, or cognitive impairment. These components can then be parts of specific dimensions, namely (1) anxiety-agitation-somatization-sleep; (2) depressed mood-retardation; and (3) hostility-interpersonal sensitivity. At the item level of these dimensions, Katz predominantly refers to the Hamilton Depression Scale (HAM-D) and the Symptom Checklist (SCL-90).
From a clinimetric point of view, it is indeed valid to have these two rating scales as the platform for the component/dimensional specific approach. The time has come to use both the HAM-D and the SCL-90 as multidimensional scales and not within the concept of the traditional FDA guided trials where these scales are “bureaucratically” considered as unidimensional. The total scores of the HAM-D or the SCL-90 are not sufficient statistics when using Katz’ model for the identification of new, more effective molecules.
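The point about total scores can be made concrete with a minimal sketch in Python. It scores two hypothetical patients on the three dimensions listed above. The component names, their assignment to dimensions, and the 0-4 ratings are illustrative assumptions only; they are not the actual HAM-D or SCL-90 item keys, nor data from Katz’s studies.

```python
# Hypothetical assignment of component ratings to the three dimensions named
# in the text. Names, assignment and ratings are illustrative assumptions.
DIMENSIONS = {
    "anxiety-agitation-somatization-sleep": [
        "psychic_anxiety", "agitation", "somatization", "sleep",
    ],
    "depressed mood-retardation": ["depressed_mood", "retardation"],
    "hostility-interpersonal sensitivity": [
        "hostility", "interpersonal_sensitivity",
    ],
}


def total_score(ratings):
    """Unidimensional summary: the plain sum of all component ratings."""
    return sum(ratings.values())


def dimension_scores(ratings):
    """Multidimensional summary: one sum per dimension."""
    return {
        dim: sum(ratings[c] for c in components)
        for dim, components in DIMENSIONS.items()
    }


# Two invented patients with the same total but very different profiles.
patient_a = {
    "psychic_anxiety": 4, "agitation": 3, "somatization": 3, "sleep": 3,
    "depressed_mood": 1, "retardation": 0,
    "hostility": 1, "interpersonal_sensitivity": 1,
}
patient_b = {
    "psychic_anxiety": 1, "agitation": 0, "somatization": 1, "sleep": 2,
    "depressed_mood": 4, "retardation": 4,
    "hostility": 2, "interpersonal_sensitivity": 2,
}

for name, ratings in [("A", patient_a), ("B", patient_b)]:
    print(name, total_score(ratings), dimension_scores(ratings))
```

Both invented patients have the same total, yet one is dominated by the anxiety dimension and the other by depressed mood-retardation. A drug acting on only one dimension would look identical in the two patients if only the total were analyzed.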
Throughout his new book, Katz has used his previous placebo-controlled trials with desipramine versus paroxetine to demonstrate the onset of action for the componential approach, illustrating the superiority of desipramine on the dimension of depressed mood - retardation, with an onset of action already after three days compared to 13 days for paroxetine and 21 days for placebo. From an economic standpoint, when performing clinical trials of antidepressants, Katz recommends such intensive assessments of specific drug-induced changes. He concludes that the field of neuropsychopharmacology stands to gain new knowledge of importance to both basic and clinical research.
We all must read Martin Katz’ attempt to educate us about his very impressive work on going beyond the FDA model of applied science to the basic science of clinical psychopharmacology.
May 26, 2016
Martin M. Katz’s reply to Per Bech’s comment
Per Bech captures the main issue in the book and fully endorses the need to extend the assessment of drug actions in clinical trials to include the componential, dimensional approach. Bech, who sharpened the Hamilton Rating Scale to make it even more effective in clinical trials, is aware of that scale’s strengths and limitations and strongly supports my multifaceted approach to the problem. He is one of the leading authorities in psychopharmacology in Europe, well in touch with how clinical trials are conducted throughout the pharmaceutical industry on both continents. I thank him for his support of my efforts to encourage the field to broaden the assessment effort in these trials and to render them more productive in seeking new, more effective molecules.
July 14, 2016
Walter A. Brown’s comment
Although in the past 50 years both the US federal government and the pharmaceutical industry have spent billions of dollars seeking new treatments for mental illness, clinicians and researchers agree that no truly novel psychotropic drug has surfaced over this time. The key point here is novel.
Antidepressants are a case in point. The pharmaceutical industry comes up with “new” antidepressants all the time and they are launched with great fanfare. But these “new” antidepressants are invariably me-too variants of older drugs. In some instances, the antidepressants now in use have fewer side effects than the older ones but they are no more effective. And the newer antidepressants share many of the limitations of their forbears. Like the first antidepressants, the newer ones take several weeks to exert their full effects and they are ineffective in a large proportion of patients. The psychiatric community has acknowledged this lack of treatment innovation as a major problem. Although some of the reasons for the absence of innovation have been identified, the remedy is far from clear.
First, as many have lamented, despite great advances in our understanding of the brain, little is known about the specific brain abnormalities giving rise to depression. Thus, there are no obvious targets for which to design new antidepressants. As a result, pharmaceutical companies - a major source of treatment innovation - search for potentially useful “new” drugs by looking for compounds which are similar in structure or effects to the existing ones. This approach does identify drugs which work about as well as the existing ones (me-too drugs) but it can only fail with respect to innovation.
In addition, as Martin Katz suggests in this persuasive monograph, even if a researcher has in hand a compound with novel psychotropic properties, our current system for evaluating psychotropic drugs makes it unlikely that its novel clinical effects would be detected, particularly if they were unexpected.
Mindful of the impediments to new antidepressant development and the high failure rate of contemporary antidepressant clinical trials (only about half the trials of approved antidepressants show them to be significantly better than placebo), Katz tackles several features of clinical trials methodology with an eye toward improving the success, efficiency and scientific value of those trials.
There’s a good bit of wisdom in this brief (66-page) volume. Katz argues, convincingly, that since clinical trials are time consuming and expensive it makes sense to maximize the information that they provide. Instead of the current practice of evaluating outcome simply by the change in total score on a measure of depression severity, like the HAM-D or MADRS, Katz suggests that in addition to assessing changes in the depressive syndrome as a whole, efficacy studies should also include thorough measurement of the individual components of depression: anxiety, motor retardation, hostility and so forth. Katz points out that analysis of components provides more information on a drug’s spectrum of action and would foster a better understanding of the relationship between a drug’s pharmacologic activity and its behavioral effects. A clinical trial thus modified would go beyond a strictly commercial venture and advance the science of psychopharmacology. In some instances, analysis of components might point to a symptom of depression that is particularly responsive to an experimental drug and thus rescue an otherwise failed trial. If this approach had been followed in the first trials of SSRIs, their value as anxiolytics would have been discovered far earlier.
I agree wholeheartedly with Katz’s idea that the information provided by clinical trials and their scientific value would be enhanced by a components analysis. But I would take his concept of maximizing information a bit further. Let’s not forget that the antidepressant activity of the very first antidepressants, imipramine and iproniazid, was discovered when they were being studied for other conditions; imipramine was first tried in patients with schizophrenia (a few got hypomanic and a few showed a reduction in depressive symptoms) and iproniazid induced euphoria in some of the tubercular patients who got it. It’s difficult to deliberately court serendipity, but clinical trials could incorporate, as a matter of policy, an open-minded stance to clinical effects, frequent, meticulous and extensive clinical observation and attention to and follow up of unexpected clinical changes.
Katz also points to data from his own and others’ studies that challenge the widely held belief that it takes several weeks of antidepressant treatment before improvement occurs. He shows that much of the symptom relief brought by antidepressants comes in the first two weeks of treatment and that the type of early response predicts response later down the line. Notably, the absence of improvement in the first two weeks is highly predictive of lack of response at six weeks. Clinical trials could be less costly and time consuming, Katz suggests, if they were shortened on the basis of early response. Although early response can be detected with conventional severity ratings on the HAM-D, Katz’s work suggests that measurements of components are more sensitive to early clinical change. He points out that prospective studies are required to pin down the relationship between early changes in depressive components and eventual outcome. Such studies would, needless to say, provide information pertinent to clinical practice as well as clinical trial design.
Katz’s final recommendation is to use central ratings of videotaped interviews to assess patients in clinical trials. He provides a number of arguments for the value of this approach in multicenter trials, including reduction of variability among sites and raters, an enhanced capacity to observe and evaluate nonverbal behavior (Katz maintains that it’s easier for one observing the interview than one conducting the interview to assess such behavior) and the capacity to establish an archive of taped interviews for further study. These proposed advantages of video-based ratings make sense on intuitive grounds, and Katz points to data generated by him and his colleagues that suggest these ratings are reliable and more sensitive to clinical change than conventional ratings. Nevertheless, given the logistical hurdles and expense of this approach, data showing conclusively that it provides an advantage in reliability, validity and outcome are required before implementation is warranted.
Katz gives a nod to ketamine, but throughout his book he refers to monoaminergic systems, serotonin, norepinephrine and neurotransmitters as providing the neurophysiologic basis for both depressive symptoms and drug actions. Given the ever-vanishing validity of the monoamine hypothesis, this book would rest on firmer ground if it stuck to psychopathology and eschewed unproven neurochemistry. As Katz says: “The essence of what is proposed here is that we convert the ‘clinical trial’ into a ‘scientific, clinical study’ aimed at achieving both the practical, primary aim of determining whether the new drug is efficacious for the targeted disorder, and the secondary scientific aims of describing the nature and timing of the full range of clinical actions the drug has on the major aspects of the depressive disorder.” This conversion can be accomplished without recourse to pathophysiological theories.
A few spots need copyediting. There are some useful appendices, including one which lists the instruments used to measure the depressive components.
July 7, 2016
Martin M. Katz’s reply to Walter Brown’s comment
Walter Brown, a highly experienced figure in the clinical trials field, provides a detailed analysis of the book and a sharp, well-thought-through review of its contents. He points to the failure of the drug industry to come up with novel drugs and the slow pace in uncovering the "little-known specific brain abnormalities" that underlie depression. The monograph, he states, is persuasive in arguing that even if new effects of a trial drug were present, the current trial model is expensive and wasteful and is not designed to uncover them. Confining assessment to one depression scale prevents possible specific effects on particular symptoms, like anxiety (in other words, anything novel in the drug's effects), from being detected. He understands that such studies, if they applied a componential approach, would provide a "spectrum of drug actions" not available in the current model. He cites the new models’ strengths in clarifying onset of clinical action, in predicting outcome from early reactions to drugs, and the potential for shortening such trials. Regarding limitations, although impressed with the early results reported utilizing the video approach, he is somewhat reluctant to see it applied generally before further data on validity are produced. Also, believing that the "validity of the monoamine hypothesis of drug efficacy is slowly vanishing," he suggests the author stick to psychopathology until there is more clarity concerning neurochemical mechanisms in this area. In summary, the author agrees that results linking behavioral and neurochemical factors are only starting to be uncovered but believes that this area of research is farther along than Brown acknowledges. Nevertheless, W. Brown sees much to gain by the field attending seriously to the book's proposed changes in the clinical trials model, and provides an excellent overview of its content.
July 21, 2016
Malcolm Lader’s comment
Martin Katz is a psychologist with a distinguished record in psychopharmacological research. In this book of exemplary succinctness, he concentrates on the FDA requirements for efficacy trials for antidepressants. He is particularly critical of the wasteful nature of these trials and the limited conclusions that can be drawn. The Hamilton Depression Scale is a particular bête noire (Hamilton 1960). My comments are primarily designed to stimulate controversy and initiate a discussion. Thankfully, as a European I do not have to comply with the rigid, almost ossified, FDA regulations. I have those of the EMA instead!
I shall consider some general points first. What is the purpose of an efficacy trial of this sort? It is basically a legal procedure to establish efficacy according to pre-set, usually legislative criteria. The outcome variable may be specified as, for example, a proportional drop in the Hamilton Depression Scale Score. But this is an artificial outcome. The practicing clinician actually relies on a probabilistic analysis of the chance of obtaining a useful therapeutic response in her/his patients as compared with other treatments, both pharmacological and non-pharmacological. In clinical practice this therapeutic response is a pragmatic outcome such as discharge from hospital or outpatient clinic (Keller 2003). Furthermore, in conjunction with the efficacy, it is essential that the risks of the treatment are carefully evaluated so that a proper risk-benefit ratio can be estimated (Friedman and Leon 2007). Such profiles usually need much larger numbers than for an efficacy trial particularly if the profile of adverse effects contains some rare but severe, even life-threatening, reactions. Post-marketing surveillance may be needed to fulfil that role. In addition, the clinician will have calibrated this risk-benefit ratio against the severity of the condition that he is treating, accepting greater risks for a more severe indication. He may conclude that the risks outweigh the benefits in all but the most severe of the patients who seek help. Also, a differential response in some patients needs careful evaluation so that a particularly responsive sub-type can be identified.
Severity is an important dimension that regulatory authorities may overlook or delegate to cost-effectiveness assessments. As a general rule it is easiest to establish efficacy in the most severely ill patients such as those with a Hamilton Score in the 30s or a MADRS of at least 30 or even 35 (Montgomery and Asberg 1979; Thase 2011). Too often, because of the exigencies of being able to recruit patients at an adequate rate, quite mildly ill patients are included and those may not show an adequate response.
One factor which is overlooked in this book is that most cases of unipolar depression have a self-limiting time span (Spijker, de Graaf, van Bijl et al. 2002). Natural remission is the rule rather than the exception. This raises practical problems - if the trial goes on for too long, say over 12 weeks, natural remission in the placebo group will obfuscate the improvement in the drug-treated patients. The theoretical way to control for this is to have a non-treatment group but this raises major practical and ethical problems.
Katz inveighs against the wasteful nature of the trials carried out under FDA auspices. I entirely agree with the waste of expensive resources but also question whether trials with such limited results can be truly ethical. Patients are being exposed to untried treatment procedures for a limited and over-focused return.
One glaring example of this waste of patients and resources concerns the offset of action of putative antidepressants. A pharmaceutical company has a responsibility, scientific and moral, not to introduce any new medication to the market until it has been shown that the medication can be discontinued at the end of treatment with impunity or with only minor perturbations. The placebo-controlled trial provides an appropriate framework in which to establish whether cessation of treatment is uneventful, attended by a few symptoms, or by a recognizable and troublesome withdrawal reaction (Wilson and Lader 2015).
Another neglected topic is compliance which can vitiate the usefulness of an efficacious compound (Demyttenaere and Haddad 2000).
Katz implies that the FDA-type trial could fulfil two main goals. It can establish efficacy for registration purposes and it could be used for more widely useful scientific purposes. I believe that he is right that opportunities are lost but essentially, he is asking for scientific studies into antidepressants to be carried out in a controlled context, a laudable aim. Unfortunately, this cannot be achieved in the controlled trials before efficacy is actually established. Otherwise, if the candidate antidepressant proves inefficacious, much time, effort, and ethical credibility will be lost trying to elucidate the other aspects of the psychopharmacology such as biochemical changes. Caution is needed not to substitute one source of waste with another. I am also less enthusiastic than Katz in accepting correlations between antidepressant effects and biochemical changes. The relationships probably hold for norepinephrine (and I think dopamine) and motor activity, and between serotonin and anxiety, but I am less convinced that the biochemical correlates of depression itself are firmly established. To suggest that they could form the basis of a new model and thereby act as surrogate markers for clinical depression is surely an over-simplification. Correlations appear stronger with adverse effects than wanted effects (Gelder, Andreasen, López-Ibor and Geddes 2009).
I would also urge evaluation of correlates of insomnia which is not only a common concomitant of depression but a notable harbinger (Benca and Peterson 2008).
Katz adduces a small study from his own group to bolster his support of the different model of depression. I am concerned that he uses a circular argument when he states that his sample were “soundly diagnosed” as depressed. This merely means that the investigators came to some consensus on empirically derived criteria à la current DSM. He also falls back on the weak argument that it is “common knowledge” that a high level of anxiety accompanies depression and retardation. This is too facile. The approach needed in this argument is a return to first principles by carrying out a large study on a population sample with no preconceived assumptions about psychopathological categories. But I do think that the categorical approach merely serves to establish reimbursement criteria for health insurance agencies.
Katz takes particular issue with the Hamilton Depression Scale. It is a poor creature, indeed, with insensitive items. I once gently chided Max Hamilton – one did not confront him trenchantly – about the Scale. He generously admitted that it had been drawn up hastily from text-book descriptions and had not been adequately tested for reliability and validity. Max regarded his Anxiety Scale as superior and so do I. In fact, the MADRS generally seems superior to the Hamilton for rating changes in depression severity (Carmody, Rush, Bernstein et al. 2006).
In conclusion, I heartily support Katz’s criticisms and his plea for a new approach that maximizes the biological factors. But I do not think that this constitutes a new model. Certainly more can be achieved and Katz points the way. But I am not fully convinced that we know enough as yet for the alternative model to prove successful in the search for new medications. We are still caught in the Laocoönian coils of serendipity in the history of antidepressant discovery.
Benca R, Peterson MJ. Insomnia and depression. Sleep Medicine, 2008;9;Suppl 1:S3–S9.
Carmody T, Rush AJ, Bernstein I, Warden D, Brannan S, Burnham B, Woo A, Trivedi M. The Montgomery Åsberg and the Hamilton Ratings of depression. A comparison of measures. European Neuropsychopharmacology, 2006;16:601–11.
Demyttenaere K, Haddad P. Compliance with antidepressant therapy and antidepressant discontinuation symptoms. Acta Psychiatrica Scandinavica, 2000;10, Suppl.S403:50–56.
Friedman RA, Leon AC. Expanding the Black Box: Depression, Antidepressants, and the Risk of Suicide. New England Journal of Medicine, 2007;356:2343-46.
Gelder MG, Andreasen NC, López-Ibor JJ, Geddes JR. New Oxford Textbook of Psychiatry. Oxford: Oxford University Press; 2009, pp. 1190-2.
Hamilton M. A rating scale for depression. J Neurol Neurosurg Psychiatry, 1960;23:56-62.
Keller MB. Past, present, and future directions for defining optimal treatment outcome in depression: Remission and beyond. JAMA, 2003;289:3152-60.
Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. British Journal of Psychiatry, 1979;134:382-9.
Spijker J, de Graaf R, van Bijl R, Beekman ATF, Ormel J, Nolen WA. Duration of major depressive episodes in the general population: results from The Netherlands Mental Health Survey and Incidence Study (NEMESIS). British Journal of Psychiatry, 2002;181:208-13.
Thase ME, Larsen KG, Kennedy SH. Assessing the ‘true’ effect of active antidepressant therapy v. placebo in major depressive disorder: use of a mixture model. British Journal of Psychiatry, 2011;199:501-07.
Wilson E, Lader M. A review of the management of antidepressant discontinuation symptoms. Therapeutic Advances in Psychopharmacology, 2015;5:357-68.
June 23, 2016
Martin M. Katz’s reply to Malcolm Lader’s comment
Malcolm Lader is well known in British and European psychiatry and psychopharmacology, having contributed substantially to the literature on clinical trials. He is in accord with the author's criticisms of the FDA model for evaluating antidepressants and broadens that critique to include the wasteful aspects and the "limited conclusions that can be drawn" from such trials. Although he agrees that the Hamilton Depression Rating Scale has turned out to be a very important instrument for standardizing the measurement of efficacy, he draws on his personal experience with Professor Hamilton and the method to cite his reservations about its methodologic inadequacies. Dr. Lader also takes the opportunity to identify other problems to which the FDA presumably has not attended, e.g., allowing trials with ineffective drugs to go on too long, thus further jeopardizing patient health, not following up with sufficient time to detect potentially serious withdrawal effects, etc. He is also critical of the author's evidence on correlations between clinical and biochemical effects, accepting some of these results, not others. Lader has a point that the correlational evidence is far from overwhelming, but at the same time it is, within its limitations, sound and, in my view, a highly useful step toward uncovering the complex interactive relationships between behavior and chemistry that characterize this neurobehavioral syndrome and that underlie the efficacy of the antidepressant agents. M. Lader, although supportive of the general approach and the new models proposed by the author, is not convinced, through his cursory analysis of their background, that we know enough about these methods as vehicles for finding new antidepressants. He is, however, prepared to await further developments in clinical trials research. One direct way in which that can be accomplished is for investigators to begin to apply these proposed, well-researched alternative methods in more studies.
July 28, 2016
Walter Brown’s final comment
Establishing the efficacy of an antidepressant is a perilous business. In the conventional antidepressant clinical trial, drug-placebo differences are vanishingly small (effect sizes average about 0.3), and fewer than half the clinical trials of approved antidepressants show the antidepressant to have an advantage over placebo. Even when an antidepressant proves better than placebo, it’s not at all clear that the drug will be useful in a clinical situation; the sorts of depressed patients who make up the subjects of contemporary clinical trials bear little resemblance to those seen in clinical practice.
Martin Katz, in the book reviewed here, tackles some of the problematic features of antidepressant trials and, focusing on measurement issues, offers some solutions. His comments about his book and the comments of others provide a reasonably thorough and balanced assessment of the book’s main thrust, which is that the sorts of outcome measures used in clinical trials need an overhaul and that assessment of depressive components would be more informative than the global appraisals now in vogue. Katz may be right. Clearly anything that can be done to improve the efficiency of and knowledge gained from clinical trials is welcome.
Still, the dearth of treatment innovation pointed to in these exchanges will not be resolved by tweaking clinical trials. Clinical trials are not meant as a route to discovery. When something new comes out of a formal trial it’s strictly by accident. As William Thomas Beaver, a clinical pharmacologist credited with drafting the initial FDA regulations defining adequate and controlled clinical studies said: “The function of the controlled clinical trial is not the ‘discovery’ of a new drug or therapy. Discoveries are made in the animal laboratory, by chance observation, or at the bedside by an acute clinician. The function of the formal controlled clinical trial is to separate the relative handful of discoveries which prove to be true advances in therapy from a legion of false leads and unverifiable clinical impressions, and to delineate in a scientific way the extent of and the limitations which attend the effectiveness of drugs.”
In trying to understand why nothing new has come along, we need to ask ourselves why the 1950s saw a flood of novel psychotropic drugs that hasn’t been equaled since. The far more sophisticated research methods of today have brought us, by contrast, a trickle of me-too drugs which offer a minuscule, if any, advantage over the older agents. Current research methods certainly have their advantages. Placebo control groups, randomization to treatment arms and standard rating instruments help prevent the misleading conclusions that can be drawn from clinical impression alone. Yet, unfettered, open-minded, clinical observation, unconstrained by regulatory requirements, was the “method” which allowed Roland Kuhn and Nathan Kline to detect the healing power of the first antidepressants. Perhaps their experiences can inform the search for better approaches to drug discovery.
Affidavit of William Thomas Beaver, M.D. (personal correspondence, Peter Barton Hutt Esq. and Dr Robert Temple, FDA, December 2007, FDA History Office Files). In: Junod SW. FDA and Clinical Drug Trials: A Short History. www.fda.gov.
August 3, 2017
Donald F. Klein’s final comment
Comments on salient aspects of reviews by Bech, Lader, and Brown of the new book by Martin Katz are presented. Critical statements and rejoinders by Katz and Klein are interspersed.
The usefulness and novelty of the book’s suggestions on how to revivify novel drug discovery is a central concern. That the novel drug discovery process has dried up furnishes the declared motive for Katz’s new book.
Bech, an eminent statistician well versed in psychopharmacology, has argued, like Katz, that the Hamilton Depression Scale is multi-dimensional. His contribution to this review of Katz’ book is on the whole quite supportive. This is surprising since, in Per Bech’s book Clinical Psychometrics, which I reviewed for INHN, he points out that factor analysis depends upon a rule-of-thumb selection of the number of factors, which are then rotated (by various methods) to differing definitions of simple structure. Bech holds that these procedures do not flow from a logical basis that allows firm deductions or sampling inferences. This defect is affirmed by the lack of factor replicability across various samples. Strikingly, Bech argues that the ubiquitous factor analysis does not provide appropriate measures of change or a foundation for diagnosis. This critically challenges Katz’ work, as well as the NIMH-sponsored RDoC manifesto for dimensional primacy via multivariate analysis. Bech supports scale analysis using more modern approaches, including the IRT, Rasch and Guttman approaches. He does not claim that this subscale approach leads to novel drug discovery. Bech avoids a confrontation with Katz, who reciprocates. Bech does not reject the value of Hamilton subscales. His team developed a six-item subscale believed to improve depression diagnostic specificity as well as sensitivity to change (Timmerby, Andersen, Søndergaard et al. 2017).
There is a superficial similarity to Katz’ assertions about the utility of a component/dimensional approach for psychopharmacologic studies. Surprisingly, Bech does not address the usefulness of Katz’ components. Further, claims that componential analysis is required for identification of new, more effective drugs go unremarked. Assertions such as “Antidepressants are not ‘diagnosis-specific,’ but are in their modes of action ‘component-specific’” seem ill-founded since Katz studied only depressed patients. Hotelling’s principal component analysis technique allowed components to be discovered, “…such as depressed mood, psychic anxiety, psychomotor retardation, psychomotor agitation, hostility, somatization, interpersonal sensitivity, sleep, or cognitive impairment. These components can then be parts of specific dimensions, namely: (1) anxiety-agitation-somatization-sleep; (2) depressed mood-retardation; and (3) hostility-interpersonal sensitivity.” It is common in a Hotelling analysis that on the major first factor all loading variables are positive, while the second factor is bipolar. That is, some loading variables have positive, and some have negative loadings. The British tradition uses only the contrast evident in the second factor. “In contrast, an American approach rapidly emerged in which factor analysis was used to identify as many factors as possible.” Bech argues that these factors, even if “rotated to simplicity,” cannot be represented by a total since they contain items relevant to both severity and group discrimination. This impairs their use both as change and diagnostic measures. Therefore, factor scores derived from patients’ status scores are not particularly sensitive to change, as they bury the relevant change-sensitive items under many unaffected loading items. This problem has been described in a largely unnoticed paper (Klein and Fink 1963). An immediate problem with Katz’ components is psychopathology coverage. For instance, where do hallucinations or delusions or mania or dementia fit? Bech does not address whether these dimensions differ from those produced by ordinary factor analyses or have some qualities that make them particularly useful for drug discovery.
Lader’s extensive review reflects his expertise in epidemiology, pharmacology and nosology (Wilson and Lader 2015). He agrees with Bech and Katz that the FDA-required clinical trial has limited legal purposes with regard to marketing, and that artificial outcomes may serve those purposes. What goes unstated is that it is not the FDA that limits the clinical trials involved, but rather industry’s profit-maximizing decision to restrict the extent of clinical trials to the economic minimum that passes FDA standards. Lader notes that Katz suggests decreasing waste by expanding the limited FDA requirements for an efficacy trial into complex measurements, including componential analyses, that will lay the groundwork for drug discovery and for broadening the range of therapeutic indications. Lader points out that expensive data gathering during this pre-marketing Phase Three trial runs the risk of extensive expenditure on an agent that proves a failure (as is currently frequent). “Caution is needed not to substitute one source of waste with another.” However, the pragmatic outcome measures mentioned are not suitable for outpatient practices. A realistic outcome determination depends upon the current medication profile, patient health status, functional abilities, symptomatic state as well as social functioning, work and family engagement, over several time periods. It is not required by FDA’s sparse standards and is rarely done.
- Lader holds that the measures used in FDA-approved trials do not reflect the superior clinical judgments that compare treatments by “a probabilistic analysis of the chance of obtaining a useful therapeutic response.” It’s hard to see what the clinician bases such estimates on. Few methodologically sound studies compare various treatments and use a valid control group. Therefore, the clinician does not have enough sound information to support objective choices. The standard waiting list control is unsound. A diagnosis is made, but treatment is delayed. This generates entirely different emotions and expectancies than placebo treatment. It may increase anxiety. This artifact leads to an exaggerated difference between waitlist and active medication. Waitlists may result in covert protocol non-compliance by self-treatment. Concurrent placebo-treated controls are necessary to establish efficacy. Such a control group also needs a medication treatment arm to preserve blindness for the placebo-treated group. This increases trial complexity but justifies efficacy interpretations and public health relevance. The usual simple two-group design is frequently vulnerable to a covert allegiance effect. It does not prove efficacy or allow judgments of respective value with other treatments.
- The clinician does not have the ability or unbiased information needed for a probabilistic analysis, so Lader’s suggestion is reduced to a best guess.
- The patient may not be sympathetic to both clinical and research prescriptions, which leads to covert non-compliance. This, as well as dropouts, destroys randomization. This grave problem produces misleading estimates of both efficacy and safety. It is rarely corrected.
- Lader believes that the emphasis on waste in premarketing studies is due to a confusion between the FDA’s narrowly defined regulatory choices, which justify economically rigorous Pharma-supported clinical trials, and the support of science, where the unknown truths of the therapeutic situation justify wide exploration. Nobody notes that the FDA is a Federal Regulatory Agency debarred from generating knowledge unless closely tied to medication evaluation. Knowledge generation is NIH and NSF’s turf.
- The new models proposed by Katz have not convinced Lader that these methods are vehicles for finding new antidepressants. He is, however, prepared to await further developments in clinical trials research, which is an exasperating truism. Katz argues that the only direct way to relevant research is for investigators to apply his proposed, “well researched” alternative methods in their studies. The cited studies do not address the validation of componential measures. Rather, they supposedly support Katz’s stand for decreasing the time span for a clinical evaluation from the usual six weeks to approximately two weeks.
- Brown, a clinical psychiatrist with extensive clinical trials experience and a focus on the importance of placebo, notes, in agreement with Katz, that remarkable biological advances have not produced an understanding of how brain processes can eventuate in depression. “The pharmaceutical industry comes up with ‘new’ antidepressants all the time, and they are launched with great fanfare. But these ‘new’ antidepressants are invariably me-too variants of older drugs.”
- Brown agrees with Katz’s suggestion that if a researcher has in hand a compound with novel psychotropic properties, our current system for evaluating psychotropic drugs makes it unlikely that its novel clinical effects would be detected, particularly if they were unexpected.
This claim is entirely out of keeping with the recognition of anticonvulsants as mood stabilizers as well as the recent furor over the psychedelic ketamine’s quick action. Clinical scientists have eyes, interviews and often understanding. They can see beneficial changes in their patients before scale evaluations and modify these instruments appropriately. A model change is the expansion of the Hamilton Depression Scale from 17 to 21 items, allowing the distinctive features of atypical depression to be assessed. Scale composition is not a limiting factor on discovery. No psychiatric drug has been discovered by scale analysis.
Scales are used for validation purposes, as well-defined concrete referents for patient evaluation. That is not the discovery process. Katz’s central prediction is that scale refinement and extensions of clinical assessment by video recording will lead to discoveries that clinical observation misses. The logical analysis and positive pilot findings that justify investments in expensive programs are not presented. The face validity of Katz’s program depends on confusing possible increases in scale reliability with a unique discovery process.
Katz adduces a small controlled study from his group that contrasts a medicated group vs. placebo. Strangely, the medicated group reported for independent analysis combines the separate randomizations of desipramine and paroxetine. No justification is given for this senseless procedure. Also, he bolsters his support for a different model of depression by using a sample who were “soundly diagnosed” as depressed. This means the investigators stringently applied criteria for Major Depressive Disorder from some accepted source. Such criteria are usually based on variables that portray a dilute version of melancholia. Therefore, variance in depression’s measurement becomes constricted, which is considered useful. But factor analyses are effectively based on correlations or similar indices of coherence. The constricted variance of the depression variables also constricts correlations with depression towards zero. Analyses within depression may indicate various item groupings that are not relevant to depression diagnosis. Rather, they refer to depression modifiers. This view is supported by the labels of the three proposed dimensions. Katz did not address this issue. Positive within-drug analyses were criticized for an obscure presentation that could be swiftly rectified. However, obscurity persisted. Relying on reported inferential statistics appeared to support Katz’s views on early onset of drug effect, the predictability of both major and absent benefit, and radical shortening of clinical trials. Our presentation of difficulties with Katz’s analyses provides candid examples of why relying solely on inferential statistics affords an inadequate basis for thoughtful conclusions. The requested 2x2 data layout, as presented in Martin M. Katz’s (2015) response to Donald F. Klein’s reply to Carlos Morra’s comment, presented in a parallel project (Martin M. Katz: Onset of antidepressant action), was insufficiently identified, as Leslie Morey (2015) agreed.
The ambiguity is the uncertainty about which table row should be considered as early improvement. Assuming early improvement refers to row 2, this table roughly agrees with Katz’s statement that "70% of patients showing early improvement would go on to respond at 6 or 8 weeks."
Hamilton Rating Scale
Early improvement <20%     15      2
Early improvement >20%      8     25
Note that 33 are predicted to do well but only 27 (82%) did. Based on Katz’s within-drug analysis, the drug is overvalued.
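The arithmetic behind these percentages can be rechecked with a minimal sketch. It assumes, since the table carries no column heads, that the first column counts patients who did not respond at outcome and the second those who did; the variable names are illustrative only.

```python
# Early-improvement table as quoted in the text.
# ASSUMPTION: column 1 = did not respond at outcome, column 2 = responded.
# The source table has no column heads, so this reading is not confirmed.
table = {
    "early improvement <20%": (15, 2),
    "early improvement >20%": (8, 25),
}

# Everyone in the ">20%" row is "predicted to do well".
predicted_to_respond = sum(table["early improvement >20%"])

# Responders across both rows (second column total).
responders_total = sum(row[1] for row in table.values())

# Early improvers who actually responded.
early_improvers_who_respond = table["early improvement >20%"][1]

pct_responders_vs_predicted = round(100 * responders_total / predicted_to_respond)
pct_early_improvers_responding = round(
    100 * early_improvers_who_respond / predicted_to_respond
)

print(predicted_to_respond, responders_total)
print(pct_responders_vs_predicted, pct_early_improvers_responding)
```

Under that assumption, 25 of the 33 early improvers (about 76%) went on to respond, which is the figure that roughly matches the quoted 70%.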
One might be interested in the possibility that a very low pre-score would indicate a likely treatment shift. Even better, however, such a score should allow a drug-free period of clinical watchful waiting.
The hopefully predictive correlation (0.6) between pre and post measures, accounting for 36% of the variance, is generally considered too low for predictive use. Further problems remain. The "active drug" sample, N = 50, combines the Paroxetine study (N=24) with the DMI study (N=26). No justification is given. The combination of Paroxetine, picked as a serotonergic agent and DMI as a noradrenergic agent requires a prior justification. Apparently an increase in sample size was considered necessary.
Katz provided placebo data to Morey who shared it. This allows progress from a predictive study, derived entirely from within drug data, to an estimate derived from contrasting drug vs. placebo.
                 Drug   Placebo
Recovered          27       6
Not recovered      23      13
Chi-square = 2.77, p = 0.09 (two-tailed)
This analysis, focused on rejecting the null hypothesis, does not have sufficient strength to be a useful predictor. The correlation of 0.6 found here has 95% confidence limits of 0.39 and 0.72. So, the correlation's upper limit remains insufficient for predictive utility, even if one loads the dice by an untrue assumption of sample bivariate normality. Katz's argument is questioned by the insignificant contrast between drug and placebo outcomes. Even strong findings, if derived from a small data set, would call for large sample replication before allowing interpretation as sound predictions about the useful length of definitive clinical trials. That this insignificant, 6-week, drug vs. placebo contrast justifies the utility of a much shorter clinical trial is preposterous. Katz's claim that larger studies have already agreed with his conclusions needs more than an article reference. The exact analyses allowing parallel conclusions must be pointed out. I have failed to find them.
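The two statistics discussed above can be rechecked with a short sketch. It assumes that the first column of the recovery table is the drug group and the second the placebo group, that the chi-square is the Pearson statistic without continuity correction, and that the confidence limits come from Fisher's z transformation with N = 50. The text states none of these choices, and the function names are illustrative.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def fisher_z_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence limits for a correlation via Fisher's z transform."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Recovered: 27 drug, 6 placebo; not recovered: 23 drug, 13 placebo (ASSUMED column order).
chi2, p = pearson_chi_square_2x2(27, 6, 23, 13)

# r = 0.6 from the text; N = 50 is the combined active-drug sample (ASSUMED to be the N used).
low, high = fisher_z_interval(0.6, 50)

print(f"chi-square = {chi2:.2f}, two-tailed p = {p:.3f}")
print(f"approximate 95% limits for r = 0.6, N = 50: {low:.2f} to {high:.2f}")
```

Under these assumptions the chi-square should come out near the 2.77 reported above, with a p value just under 0.10. The method behind the quoted confidence limits is not stated, so this approximation need not reproduce them exactly.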
It is also illogical for large, supposedly definitive trials to be followed by a small trial that could add nothing new. Katz replies that the large studies used total Hamilton scores whereas his small study was investigating componential scores. It follows that claims that componential analysis was backed up by large trials are incorrect.
There was no indication of how many analyses were picked over to find supportive ones. It is well known that conclusions based on within-drug analyses are often meaningless. Adequate placebo controls and proper analyses are required for the correct understanding of real effects. The late partial release of detailed placebo data allowed the comparison of the medicated group to placebo. It was non-significant. This casts doubt on all of Katz’s analyses, but this was denied by an assertion of trust in their own analyses. However, requested data allowing independent analyses were not made available.
To sum up, the reviews did not address major issues invalidating Katz’s conclusions. Several illogical beliefs were not exposed. The illogic of supposing that tightening up such descriptions would somehow produce novel drugs was reviewed. It was almost unnecessary to review the data analyses since the logical framework was so impaired by the history of discovery. The persuasiveness of these propositions relies on a historical and logical confusion that increasing reliability somehow suffices for increases in discovery. The claimed relation to novel psychiatric drug discovery is not evidenced but appeals to wishful thinking. However, Katz’s data analysis, used to support his conclusions, proved to be, at least, questionable. Both book and reviews fail.
Katz MM. Response to Klein’s reply to Morra’s comment. inhn.org.controversies. October 15, 2015.
Klein DF, Fink M. Multiple item factors as change measures in psychopharmacology. Psychopharmacologia, 1963;4:43-52.
Morey LC. Comment on the interaction between Katz and Klein. inhn.org.controversies. December 17, 2015.
Timmerby N, Andersen JH, Søndergaard S, Østergaard SD, Bech P. A Systematic Review of the Clinimetric Properties of the 6-Item Version of the Hamilton Depression Rating Scale (HAM-D6). Psychother Psychosom. 2017;86:141-9.
Wilson E, Lader M. A review of the management of antidepressant discontinuation symptoms. Therapeutic Advances in Psychopharmacology, 2015; 5:357-68.
January 11, 2018
*Martin M. Katz passed away on January 12, 2017.
March 5, 2020