Barry Blackwell: Corporate Corruption in the Psychopharmaceutical Industry

Jay D. Amsterdam, Leemon B. McHenry and Jon N. Jureidini’s commentary: Industry-Corrupted Psychiatric Trials


1.      Introduction


As Marcia Angell, former editor of the New England Journal of Medicine, reported in 2008:

Over the past 2 decades, the pharmaceutical industry has gained unprecedented control over the evaluation of its own products. Drug companies now finance most clinical research on prescription drugs, and there is mounting evidence that they often skew the research they sponsor to make their drugs look better and safer (2008). 

In this article we provide an overview of three industry-sponsored clinical trials in psychiatry as part of the “mounting evidence” confirming Angell’s conclusion that drug evaluation is a “broken system.” We focus on three compelling cases that demonstrate the extent to which the pharmaceutical industry will corrupt science for profit: GlaxoSmithKline’s studies 329 and 352 and Forest Laboratories’ CIT-MD-18. These cases reveal the crucial elements that enable this state of affairs: industry manipulation of scientific data, ghostwriting of clinical trial reports, academic physicians serving the marketing objectives of the sponsor companies, and the failure of checks and balances in the peer-review system and in regulatory bodies.

Our paper is a follow-up to Dr. Barry Blackwell’s Corporate Corruption in the Psychopharmaceutical Industry, which initiated this topic. We address a subset of the issues identified in Dr. Blackwell’s essay, specifically those relating to the corruption of clinical trials, and expect that others will cover broader topics surrounding the general problem and its solution.




2.      Medical Ghostwriting and Data Misrepresentation


It is now fairly well known in academic medicine that pharmaceutical companies launder their promotional efforts through medical communication companies engaged to ghostwrite articles and then pay key opinion leaders to sign on to the fraudulent articles (Healy et al., 2003; Sismondo, 2007). Some of the most frequently used medical communications companies or public relations firms include: Scientific Therapeutics Information, Inc., Current Medical Directions, Complete Healthcare Communications, Complete Medical Communications Limited, Carus Clinical Communications, Medical Education Systems, Intramed, Rx Communications, Excerpta Medica, Adelphi Ltd., Weber Shandwick, Prescott Medical Communications, Design Write, Ruder Finn, Belsito and Company, and Cohn and Wolfe. There is even a medical communication company called “Ethical Strategies,” which attracts pharmaceutical clients with the promise that they fuse “the highest calibre public relations counsel with unparalleled industry knowledge to help organisations strategically define and deliver effective communication.” These firms are also engaged to “neutralize” physicians who have been identified as disloyal to their client’s ineffective or unsafe drugs (McHenry, 2005; 2010, 136). What is less clear, however, is the fine detail of the business, which is only just starting to emerge in litigation.


First, pharmaceutical companies seeking to “launch” a new drug on the market or a new indication for a drug already approved for another indication (e.g., adolescent depression, social anxiety disorder, erectile dysfunction, hypercholesterolemia) will hire a public relations firm and a medical communication company as part of their marketing strategy and publication planning. Such firms will set up advisory board meetings with key opinion leaders and marketing executives in advance of the clinical trials. Once a trial is complete, the medical writer employed by the medical communications firm produces a draft manuscript, typically from a summary of the Final Study Report of the clinical trial, and then seeks feedback first from the pharmaceutical sponsor and then from the external ‘authors’ and the internal company scientists who work for the sponsor. It is typically at this stage in the production of the manuscript that misrepresentation of the trial data occurs, since the medical writer is under the direction of internal marketing executives to “spin” the data “on-message.” The medical ghostwriter then revises a number of drafts, eventually submits the completed manuscript to a medical journal for peer review, replies to feedback from the peer-review process, and finally replies to criticism in letters to the editor of the journal after publication. Once the article is submitted, the medical ghostwriter either disappears from the scene or is acknowledged in the fine print for a contribution of “editorial assistance.” The going price for a manuscript is $18,000 to $40,000, depending on the number of drafts produced and other services specified in the contracts, such as organizing teleconferences and advisory board meetings (McHenry, 2010, 130).


Ghostwriting in the hands of the pharmaceutical industry has become a major factor in the “crisis of credibility” in academic medicine. Academic authorship should be an assertion of intellectual responsibility. It is assumed that the signed authors have collectively been responsible for study design, conduct, data analysis and writing. The integrity of science depends on the trust placed in individual clinicians and researchers and in the peer-review system, which is the foundation of a reliable body of knowledge. When, however, academic physicians allow their names to appear on ghostwritten articles, they betray this basic ethical responsibility and are guilty of academic dishonesty. An annual Harvard University Master Class in Psychopharmacology offering continuing medical education demonstrates the point. Several of the presenters, advertised as “world renowned faculty,” have been some of the worst offenders in medical ghostwriting scandals. One in particular claims to be “author of over 1000 scientific articles and book chapters, Co-editor of Textbook of Psychopharmacology.” Ghostwriting goes beyond the simple procedure of drafting a manuscript; it provides an academic façade for research that has been designed, conducted and analyzed by industry, and it is the vehicle through which the misrepresentation of the data in favor of the study drug is achieved.

While the vast majority of ghostwritten publications in medicine will never come to light as ghostwritten, all industry-sponsored clinical trials are suspect and should be treated as such.  To date, the only cases in which ghostwriting has been exposed to the public are those in which there were damages that resulted in litigation or from physicians who were approached to participate in ghostwritten publications and blew the whistle (Fugh-Berman, 2005).  As far as litigation is concerned, only a few select cases will surface since the majority will disappear in legal settlement agreements. Incriminating documents remain proprietary information if plaintiffs’ attorneys do not seek to remove the confidentiality designation of protective orders.


3.      Key Opinion Leaders as Co-Conspirators


The term “key opinion leader” (KOL) or “thought leader” is an industry creation for physicians who influence their peers' medical practice and prescribing behavior.  Pharmaceutical companies claim to engage KOLs in the drug development process to gain expert evaluation and feedback on marketing strategy, but in reality these physicians are carefully vetted by the industry on the basis of their malleability to the sponsor’s products.  In this regard, KOLs are highly paid “product champions” who are engaged to “defend the molecule” (McHenry, 2005, 17).  Few physicians and psychiatrists can resist the lure of fame and fortune offered by industry, but the primary motive of ethical duty to patients is compromised by profitable drug promotion.  Marketing directives threaten the accuracy of research results and university professors become what David Healy calls “ornamental additions to business” (2004, xv).

The industry-academic partnerships that have empowered the KOL phenomenon are often traced to one of the most influential pieces of legislation to affect the field of intellectual property law, the Bayh-Dole Act of 1980. As noted by Sheldon Krimsky, such legislation was explicitly designed for the privatization of knowledge in the United States, and resulted from a paradigm shift in the philosophy of government from creating public wealth and safety nets for the less fortunate to maximizing private, for-profit sectors (2003, 108). The Bayh-Dole Act created a uniform patent policy that allowed universities to retain ownership of inventions made under federally funded research. The motivation was to speed up the commercialization of federally funded research, create new industries and open new markets from the university-patented inventions.

The growth of university patents and the commercialization of research that followed Bayh-Dole at first seemed to have nothing but positive effects, such as the rapid development of pharmaceuticals, but it soon became clear that the legislation had negative results.  Universities that were losing government funding found the new source of revenue in the technology transfer to industry, but at the price of a proliferation of conflicts of interest.  Krimsky reports that it increased consulting arrangements with greater emphasis on intellectual property (2006, 22). Angell argues that it created a culture of secrecy that “may actually have slowed the sharing of scientific information and the exploration of new scientific leads” (Angell, 2004, 203).  The most disturbing aspect of these arrangements, however, is the manipulation of research results in favor of the sponsor company’s products.

In the description below of corrupted psychiatric trials, KOLs engaged by GlaxoSmithKline and Forest Laboratories Inc. became the named authors on the ghostwritten publications that appeared in The American Journal of Psychiatry and The Journal of the American Academy of Child and Adolescent Psychiatry. Several of these academic psychiatrists also served on the companies’ Speakers Bureaus and Advisory Boards and provided company-sponsored continuing medical education lectures. What passed as “medical education,” however, was carefully disguised drug promotion created by medical communication companies and public relations firms.


4.      Complicit Medical Journals


Medical journals are part of the problem rather than the solution. Instead of demanding rigorous peer review of submissions and an independent analysis of the data, medical journal editors are pressured to publish favorable articles of industry-sponsored trials and rarely publish critical deconstructions of ghostwritten clinical trials (Horton, 2004, 7; Healy, 2008). This is due to the simple fact that medical journals and their owners have become dependent upon pharmaceutical revenue, and as a result they fail to adhere to the standards of science. Thus good news (the drug is safe and effective) makes money; bad news (the drug is unsafe and ineffective) makes no money, except perhaps for plaintiffs’ attorneys. Good news means more pharmaceutical advertising and more orders of reprints, which are disseminated by the sales force. The pharmaceutical industry and the medical journals make billions of dollars, especially from the publication of an allegedly positive trial of a blockbuster drug.

Serious problems with industry-sponsored clinical trials have been clearly identified in the process of peer review of the submitted manuscript; yet these manuscripts are published against the reviewers’ negative recommendations. Submissions that deconstruct industry-sponsored clinical trials pass peer review yet are rejected by journal editors who override peer review or by attorneys representing the journals’ owners. Moreover, the pharmaceutical and medical device industries manipulate journal editors with threats of libel actions. Finally, when journal editors and their owners, such as the American Academy of Child and Adolescent Psychiatry and the American Psychiatric Association, are confronted with indisputable evidence of industry fraud published in the journals, they refuse to retract (Newman, 2010; Jureidini et al., 2011). When the probability of having a ghostwritten, fraudulent, industry-sponsored clinical trial accepted for publication in a high-impact medical journal is substantially higher than the probability of having a critical deconstruction of the same trial accepted, there can be no confidence in the medical literature. In this regard, medical journals, contrary to common opinion, are not reliable sources of medical knowledge. They are guilty of publishing pseudo-science and have become, in the words of former BMJ editor Richard Smith, “an extension of the marketing arm of pharmaceutical companies” (Smith, 2005).


5.      Three Case Studies


Few ghostwritten articles of clinical trials in psychiatry have been deconstructed in order to publicly reveal their sub rosa research misconduct and misrepresentation of outcome data. As noted above, the only cases of industry-sponsored, ghostwritten articles that have been exposed to the public are those resulting in litigation or from researchers who have blown the whistle on the practice.  As far as litigation is concerned, two industry-sponsored, ghostwritten psychiatric articles have been deconstructed from court documents and have received recent media attention. These two deconstructed articles will be summarized together, as the two studies have much in common and are both the result of pharmaceutical companies manipulating outcome data in order to promote the off-label marketing of antidepressant medication to children and adolescents. The third deconstruction article has received less media attention. It examines an industry-sponsored, ghostwritten article that came to light as part of an academic whistle blower complaint of plagiarism and research misconduct against prominent academic professors at medical research universities and several pharmaceutical company executives. It involved the manipulation of sample size estimates and the misrepresentation of outcome data in adults with bipolar major depressive disorder.

All three of these deconstructed psychiatric trials were published in a medical journal that does not depend on pharmaceutical industry revenue, the International Journal of Risk & Safety in Medicine (see Jureidini et al., 2008; Amsterdam & McHenry, 2012; Jureidini et al., 2016).

5.1 SmithKline Beecham (GlaxoSmithKline) Paroxetine Study 329

SmithKline Beecham’s Study 329 was designed to compare the efficacy and safety of paroxetine and imipramine with placebo in the treatment of adolescents with unipolar major depression.  The 1993 protocol for the study (and its subsequent amendments) specified two primary outcome measures: change in total Hamilton Rating Scale for Depression (HRSD) score; and proportion of remitters and responders with a change in HRSD score ≤8 or reduced by ≥50%. The protocol also specified six secondary outcome measures.  A total of 275 subjects were enrolled between April 1994 and March 1997.

The published study 329 article was ghostwritten by Sally Laden of Scientific Therapeutics Information, Inc., under the direct sponsorship of GSK employees and was published by the Journal of the American Academy of Child and Adolescent Psychiatry (JAACAP) in July 2001 under the so-called ‘authorship’ of Keller et al. (2001). As indicated below, the ‘positive’ results published in the JAACAP article were a gross misrepresentation of the actual ‘negative’ study results. Deconstruction of the ghostwritten 329 article was accomplished by examining approximately 10,000 court documents from a class action lawsuit, Beverly Smith vs. SmithKline Beecham. A selection of these documents is available at the website of Healthy Skepticism.

Keller et al. claimed that paroxetine was “generally well tolerated and effective for major depression in adolescents” (2001, 762), while GSK claimed that paroxetine demonstrated “Remarkable Efficacy and Safety” (SmithKline Beecham, 2001a). The JAACAP article eventually became one of the most frequently cited studies in the medical literature in support of antidepressant use in child and adolescent depression (Journal Citation Reports). However, unknown to JAACAP readers, study 329 was negative on all protocol-designated primary outcomes and on most protocol-designated secondary outcomes, and GSK and its ghostwriters withheld clinically important adverse event information on paroxetine-induced suicidal and manic-like behaviors in children and adolescents.

Initial data analysis showed that there was no significant difference between the paroxetine and placebo groups on any of the eight protocol-specified outcome measures (SmithKline Beecham, 1998a). Undaunted by these disappointing outcomes, the sponsor and the investigators performed additional, non-protocol-designated post hoc analyses showing more favorable results for paroxetine. Even then, only two of these post hoc comparisons were statistically significant for paroxetine (versus placebo) by the time study 329 was first ghostwritten for publication in the Journal of the American Medical Association (JAMA), which rejected it for publication in 1999 (SmithKline Beecham, 1998b). Four of the six negative protocol-specified secondary outcome measures had been removed from the list of secondary outcomes, and two additional post hoc ‘positive’ outcomes had been added. Thus, overall, four of the eight ‘negative’ protocol-designated outcomes were replaced with four ‘positive’ outcomes (although many other ‘negative’ measures had been tested and rejected along the way). Although GSK and Keller et al. testified in court that this procedure was part of an analytical plan formulated prior to breaking the blind, no evidence of this plan was ever produced, casting doubt on Keller et al.’s claim that these ‘positive’ outcomes were really declared a priori (2001, 764).

The GSK-funded ghostwriters also conflated the primary and secondary outcomes as early as the first draft of the manuscript, and all eight outcomes were described as ‘primary’ in the results section. However, in later drafts of the 329 manuscript, the term ‘primary’ was replaced by the term ‘depression-related’ outcomes (SmithKline Beecham, 1999a). These later drafts reported that paroxetine was more effective than placebo on four of eight outcomes, without disclosing that the original protocol-designated primary and secondary outcomes were really ‘negative’ (see McHenry et al., 2008).

In July 2000, after being rejected by peer review from JAMA, the revised manuscript was submitted to JAACAP, where one peer reviewer asked that the primary outcomes be specifically reported (SmithKline Beecham, 2000). Despite this request, the two original protocol-designated primary outcomes were still not declared, and the study sponsor and ‘authors’ continued to claim efficacy for paroxetine based on the conflated outcomes. This conflation extended throughout the remainder of the JAACAP peer-review process, obscuring the original ‘negative’ primary outcome results by reporting ‘positive’ outcome results.

Finally, the JAACAP article stated that: “Paroxetine was generally well tolerated in this adolescent population, and most adverse effects were not serious” (2001, 769). This disingenuous statement hid the fact that GSK’s final study report in November 1998 indicated the presence of many serious and severe adverse events occurring in the paroxetine-treated subjects. Specifically, suicidal thoughts and behaviour were grouped under the euphemism of ‘emotional lability’ (SmithKline Beecham, 1998a, 109); the report shows that five of the six occurrences of ‘emotional lability’ were rated “severe” and that all five had self-harmed or reported emergent suicidal ideation. Moreover, a review of several serious adverse event (SAE) reports in the final study report (SmithKline Beecham, 1998a, 276-307) revealed three additional cases of suicidal ideas or self-harm that had not been classified as ‘emotional lability.’ Thus, the article’s ‘authors’ should have known that at least eight subjects in the paroxetine group had self-harmed or reported worsening suicidal ideation, compared to only one subject receiving placebo. Although the GSK senior scientist, Dr. McCafferty, eventually composed a statement in an early JAACAP draft that 11 subjects on paroxetine (versus 2 on placebo) had SAEs, in subsequent drafts McCafferty’s SAE disclosures of overdose and mania were edited out, and SAEs on paroxetine were attributed to other non-study-related causes (SmithKline Beecham, 1999b).

As reported in CMAJ (Kondro, 2004), an internal GSK ‘Position Piece on the Phase III clinical studies’ from 1998 stated that study 329 “failed to demonstrate a statistically significant difference from placebo on the primary efficacy measures,” and set as a target “To effectively manage the dissemination of these data in order to minimize any potential negative commercial impact.” A cover letter reads: “As you know, the results of the studies were disappointing in that we did not reach statistical significance on the primary endpoints and thus the data do not support a label claim for the treatment of Adolescent Depression” (SmithKline Beecham, 1998c).  These documents were disavowed by GSK, but there was certainly more than one person expressing caution at this time.  One of those persons who can be cited stated: “Originally we had planned to do extensive media relations surrounding this study until we actually viewed the results. Essentially the study did not really show Paxil was effective in treating adolescent depression, which is not something we want to publicize” (SmithKline Beecham, 2001b).

Finally, in an unprecedented reanalysis of an industry-sponsored trial, the raw data of study 329 were made available through a negotiated settlement in a legal action (The People of the State of New York vs. GlaxoSmithKline, 2004). The data were re-examined by Le Noury et al. and published in the BMJ in September 2015. This reanalysis began life as part of the RIAT (Restoring Invisible and Abandoned Trials) initiative by Doshi et al. (2013). Reporting on this reanalysis, Le Noury et al. wrote:

The efficacy of paroxetine and imipramine was not statistically or clinically significantly different from placebo for any prespecified primary or secondary efficacy outcome…There were clinically significant increases in harms, including suicidal ideation and behaviour and other serious adverse events in the paroxetine group and cardiovascular problems in the imipramine group.

They concluded:

Access to primary data from trials has important implications for both clinical practice and research, including that published conclusions about efficacy and safety should not be read as authoritative. The reanalysis of Study 329 illustrates the necessity of making primary trial data and protocols available to increase the rigour of the evidence base (Le Noury et al., 2015).

In summary, the results of the paroxetine 329 study demonstrated no significant superiority of paroxetine versus imipramine or placebo on the two protocol-designated primary outcome measures or on the six protocol-designated secondary outcome measures. Nevertheless, at least 19 additional post hoc efficacy outcomes were examined. In the final analysis, a ‘positive’ result was found on only 4 of 27 known outcome measures (a finding that could have occurred by chance alone), and there was a significantly higher rate of paroxetine-related SAEs versus placebo. Consequently, paroxetine study 329 was ‘negative’ for efficacy and ‘positive’ for harm.
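
The concern about chance findings among 27 outcome measures can be made concrete with some simple arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in the study documents: that each outcome was tested at the conventional 0.05 significance level, and that the tests were independent of one another (outcomes in a depression trial are in fact correlated, so the figures are rough guides rather than exact values). Under those assumptions, testing 27 outcomes with no true drug effect would be expected to yield about 1.35 nominally ‘positive’ results, and the probability of at least one such result would be roughly 75%.

```python
# Illustrative multiple-comparisons arithmetic for 27 outcome measures.
# ASSUMPTIONS (not from the study documents): each test uses alpha = 0.05
# and the tests are statistically independent.

ALPHA = 0.05      # assumed conventional significance threshold
N_OUTCOMES = 27   # number of known outcome measures, as stated in the text


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of nominally significant results if no effect exists."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test threshold that keeps the family-wise error rate near alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    print("Expected false positives:", expected_false_positives(N_OUTCOMES, ALPHA))
    print("Family-wise error rate:", round(familywise_error_rate(N_OUTCOMES, ALPHA), 3))
    print("Bonferroni per-test threshold:", round(bonferroni_threshold(N_OUTCOMES, ALPHA), 5))
```

The Bonferroni threshold is included only as one familiar way of correcting for multiple testing; nothing in the source indicates that any such correction was applied to the post hoc analyses.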


5.2 Forest Laboratories Citalopram Study CIT-MD-18

The CIT-MD-18 study protocol was dated September 1, 1999 and amended April 8, 2002.  The study was conducted between 1999 and 2002 and was designed as a 9-week, 20-site, randomized, double-blind comparison of the safety and efficacy of citalopram versus placebo in 160 children (age 7-11) and adolescents (age 12-17) with major depressive disorder. It was designated a Phase III registration trial supporting an FDA indication for depression in pediatric patients. Forest also parsed out the CIT-MD-18 adolescent results to support an FDA adolescent major depressive disorder indication for escitalopram (Lexapro®). The study design included a 1-week, single-blind placebo lead-in followed by an 8-week, double-blind treatment phase during which there were 5 study visits. The primary efficacy measure was the change from baseline to week 8 on the Children’s Depression Rating Scale - Revised (CDRS-R) total score. Secondary efficacy measures were the Clinical Global Impression severity and improvement subscales, Kiddie Schedule for Affective Disorders and Schizophrenia - depression module, and Children’s Global Assessment Scale.

According to court documents made public as part of the Celexa and Lexapro Marketing and Sales Practices Litigation, part of which settled in 2014, and now posted on the Drug Industry Document Archive, the manuscript was ghostwritten by Natasha Mitchner at Weber Shandwick Communications, under instruction from Jeffrey Lawrence (Product Manager, Forest Marketing). In an October 15, 2001 email, Mary Prescott of Weber Shandwick makes it explicit that the manuscript was written prior to the selection of Dr. Karen Wagner as lead author, and of the other so-called academic ‘authors’ (Forest, 2001a).

Dr. Wagner’s input was sought only after the first draft of the CIT-MD-18 manuscript was prepared and reviewed by Forest Research Institute employees. In an email dated December 17, 2001, Mr. Lawrence of Forest wrote to Ms. Mitchner: “Could you do me a favor and finish up the pediatric manuscript?  I know you said you only had a bit more to do... I took a quick look at it and it looked good so I’d like to get it circulated around here before we send if off to Karen [Wagner]” (Forest, 2001b).

Forest’s control over manuscript production allowed for the presentation of selected data to create a positive spin on the study outcome. The published Wagner et al. article concluded that citalopram produced a significantly greater reduction in depressive symptoms than placebo in this population of children and adolescents (Wagner et al., 2004, 1079). This conclusion was supported by claims that citalopram reduced the mean CDRS-R scores significantly more than placebo beginning at week 1 and at every week thereafter (effect size=2.9), and that response rates at week 8 were significantly greater for citalopram (36%) versus placebo (24%). Wagner et al. also claimed comparable rates of tolerability and treatment discontinuation for adverse events (citalopram=5.6%; placebo=5.9%) (2004, 1079).

However, deconstruction of these data and supporting documents led us to conclude otherwise. We found that the claims of Wagner et al. were predicated upon a combination of misleading analysis of the primary study outcome, an implausible effect size, introduction of post hoc outcomes as if they were primary outcomes, failure to report negative secondary outcomes, inclusion of eight unblinded subjects in efficacy analyses, and misleading analysis and reporting of adverse events.

For example, contrary to protocol stipulation, Forest increased the final study sample size by adding back into the primary outcome analysis eight of nine subjects who, per protocol, should have been excluded from the data analysis because they were inadvertently dispensed unblinded study drug (Jureidini, 2012). The protocol stipulated: “Any patient for whom the blind has been broken will immediately be discontinued from the study and no further efficacy evaluations will be performed” (Forest, 1999). Appendix Table 6 of the CIT-MD-18 Study Report showed that Forest had performed a primary outcome calculation excluding these subjects (Forest, 2002). This protocol-specified exclusion resulted in a ‘negative’ primary efficacy outcome. Ultimately, however, eight of the excluded subjects were added back into the analysis, turning the marginally insignificant outcome (p<0.052) into a statistically significant outcome (p<0.038), although we would note that there was still no clinically meaningful difference in symptom reduction between citalopram and placebo on the mean CDRS-R scores. The unblinding error was not reported in the published article.

Forest also failed to follow their protocol-stipulated plan for analysis of age-by-treatment interaction. The primary outcome variable was the change in total CDRS-R score at week 8 for the entire citalopram versus placebo group, using a 3-way ANCOVA test of efficacy (Forest, 2002). Although a significant efficacy value favouring citalopram was produced after including the unblinded subjects in the ANCOVA, this analysis resulted in an age-by-treatment interaction with no significant efficacy demonstrated in children. This important efficacy information was withheld from public scrutiny and was not presented in the published article, nor did the published article report the power analysis used to determine the sample size. No adequate description of this analysis was available in either the study protocol or the study report. Furthermore, no indication was made in these study documents as to whether Forest originally intended to examine citalopram efficacy in children and adolescent subgroups separately or whether the study was powered to show citalopram efficacy in these subgroups. If Forest did intend to examine the subgroups separately, then it would appear that Forest could not make a claim for efficacy in children (and possibly not even in adolescents). However, if Forest powered the study to make a claim for efficacy in the combined child plus adolescent group, this may have been invalidated as a result of the ANCOVA age-by-treatment interaction and would have shown that citalopram was not effective in children.

A further exaggeration of the effect of citalopram was to report “effect size on the primary outcome measure” of 2.9, which was extraordinary and not consistent with the primary data. This claim was questioned by Martin et al. who criticized the article for miscalculating effect size or using an unconventional calculation, which clouded “communication among investigators and across measures” (2005, 817).  The origin of the effect size calculation remained unclear, even after Wagner et al. publicly acknowledged an error and stated that “With Cohen’s method, the effect size was 0.32,” (2005, 819), which is more typical of antidepressant trials. Moreover, we would also note that the use of an effect size calculation was a post hoc maneuver by Forest to spin a more ‘positive’ outcome for citalopram and was not stipulated in the study protocol.
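
For readers unfamiliar with the statistic, the conventional Cohen calculation divides the difference between group means by the pooled standard deviation. The sketch below shows that calculation in general form; the function names are our own and the input values in the usage example are purely hypothetical, chosen for illustration and not taken from the CIT-MD-18 data. It is not a reconstruction of the calculation Wagner et al. or Forest performed, which, as noted, remains unclear.

```python
import math

# Sketch of the conventional Cohen's d calculation.
# All numeric inputs below are HYPOTHETICAL and are not CIT-MD-18 data.


def pooled_sd(sd_a: float, n_a: int, sd_b: float, n_b: int) -> float:
    """Pooled standard deviation of two independent groups."""
    numerator = (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    return math.sqrt(numerator / (n_a + n_b - 2))


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Difference in group means expressed in pooled-SD units."""
    return (mean_a - mean_b) / pooled_sd(sd_a, n_a, sd_b, n_b)


if __name__ == "__main__":
    # Hypothetical mean improvement scores for a drug group and a placebo group.
    d = cohens_d(mean_a=24.0, sd_a=10.0, n_a=80,
                 mean_b=20.0, sd_b=10.0, n_b=80)
    print("Cohen's d for the hypothetical groups:", round(d, 2))
```

Because the statistic is expressed in standard deviation units, a value of 2.9 would imply that the drug group improved by almost three pooled standard deviations more than the placebo group, which helps explain why the figure drew immediate criticism.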

Finally, although Wagner et al. correctly reported that “the rate of discontinuation due to adverse events among citalopram-treated patients was comparable to that of placebo” (2004, 1082), the authors failed to mention that the five citalopram-treated subjects discontinuing treatment did so due to hypomania, agitation, and akathisia. None of these potentially dangerous states of over-arousal occurred with placebo. Furthermore, citalopram-induced anxiety in one subject was severe enough to warrant premature treatment discontinuation, while irritability occurred in three other citalopram subjects (versus one placebo subject). Taken together, these adverse events raise concerns about dangers from the activating effects of citalopram that should have been reported in the Wagner et al. article. Instead, Wagner et al. reported that “adverse events associated with behavioral activation (such as insomnia or agitation) were not prevalent in this trial” (2004, 1082) and claimed that “there were no reports of mania” (2004, 1081), without acknowledging the case of hypomania. Furthermore, the published article also failed to report that one patient on citalopram developed abnormal liver function tests (Forest, 2002).

In a letter to the editor of the American Journal of Psychiatry, Mathews et al. criticized the manner in which Wagner et al. dealt with adverse outcomes in the CIT-MD-18 data, stating that “given the recent concerns about the risk of suicidal thoughts and behaviors in children treated with SSRIs, this study could have attempted to shed additional light on the subject” (2005, 818).   Wagner et al. responded: “At the time the [CIT-MD-18] manuscript was developed, reviewed, and revised, it was not considered necessary to comment further on this topic” (2005, 819).  However, Wagner et al. were disingenuous in their lack of concern over potential citalopram-induced suicidal risk.  In fact, although this was disclosed in neither the Wagner et al. article nor Wagner’s letter to the editor, the negative 2001 Lundbeck study had already raised concern over heightened suicide risk (Wagner et al., 2004, 2005; von Knorring et al., 2006).

Forest received FDA approval in 2009 for escitalopram in the treatment of adolescent depression on the basis of the SCT-MD-32 trial of escitalopram and the allegedly positive CIT-MD-18 trial of citalopram. This approval was queried in 2011 by Carandang et al., who urged Health Canada not to follow the FDA decision and “demand that standards and process be met until sufficient evidence supporting safety and efficacy is provided for a pediatric indication” (2011, 323).  The analysis of the documents discussed herein confirms Carandang et al.’s concerns; the CIT-MD-18 study was negative and therefore not supportive of Forest’s Lexapro adolescent indication application.

5.3 SmithKline Beecham (GlaxoSmithKline) Paroxetine Study 352

Deconstruction of the GlaxoSmithKline (GSK) paroxetine 352 study was based, in part, upon the following sources: documents pertaining to a March 22, 2001 Complaint of Plagiarism of a Nemeroff et al. 2001 article in the American Journal of Psychiatry (AJP) made to the Chair of the Department of Psychiatry at the University of Pennsylvania Perelman School of Medicine (who also happened to be listed as the second author on the plagiarized article); court testimony from the Kilker v. GSK litigation in October 2009; the United States Senate Report on Ghostwriting in Medical Literature, June 24, 2010; the GSK Clinical Trials Website Result Summary for Study 29060/352, updated 09 March 2005; GSK Paroxetine Protocol PAR-29060/352 (amended 22 July 1994); and evidence presented in a Complaint of Scientific Misconduct against Dwight L. Evans, Laszlo Gyulai, Charles Nemeroff, Gary S. Sachs, Charles L. Bowden et al., July 8, 2011, filed with the Office of Research Integrity (ORI) of the Department of Health and Human Services: ORI 2012-33.

The publication of the paroxetine 352 study was intended to facilitate the off-label prescription of paroxetine for the treatment of the depressive phase of bipolar disorder. This area of treatment represented a natural extension of the already approved indication for paroxetine, unipolar major depressive disorder. Because the 352 study has received less public attention than the 329 and CIT-MD-18 studies, it deserves closer scrutiny.

The paroxetine 352 article was ghostwritten by Sally Laden of Scientific Therapeutics Information, Inc., under the sponsorship of GSK employees and was published by the AJP in June 2001 under the authorship of Nemeroff et al. (2001).  However, the role of GSK and the ghostwriters was not acknowledged in the article.  At least two ghostwritten drafts of the 352 manuscript were produced before the names of any academic authors appeared on the title page. Eventually, prominent academic researchers (with financial ties to GSK), as well as GSK employees, were designated by GSK as ‘authors’ on the third draft of the 352 manuscript.  According to evidence from a Complaint of Research Misconduct made to the Office of Research Integrity of the Department of Health and Human Services on June 25, 2012, the so-called authors were chosen by GSK in consultation with Sally Laden. Most of the named authors on the published article had little or no direct involvement in the design, daily conduct, or data analysis of the study, or in the writing of the manuscript. In fact, the first and second authors on the published article (i.e., Dr. Nemeroff and Dr. Evans) were only selected for this role late in the vetting process (after several other authors were moved to less prominent positions in the byline). GSK had originally selected Dr. Laszlo Gyulai, from the University of Pennsylvania, as the paper’s first author.  However, Dr. Gyulai was removed from this position and replaced by Dr. Nemeroff.   The evidence also indicates that the final GSK-assigned authors on the published article never reviewed or even saw preliminary drafts of the paper, and only saw the final edited manuscript just prior to final acceptance by the AJP (Amsterdam & McHenry, 2012).

The 352 study was designed as an 18-site, 10-week, randomized, double-blind, placebo-controlled comparison of paroxetine versus imipramine in subjects with bipolar type I disorder and was designated a Phase IV (i.e., post-marketing, non-indication) study with a projected duration of 2 years.  Its objective was “to compare the efficacy and safety of paroxetine and imipramine to [placebo] in the treatment of bipolar depression in subjects stabilized on lithium therapy” (GSK Clinical Trials Website Result Summary for Study 29060/352).  The primary efficacy measures were the change from baseline in the HRSD total score and the change from baseline in the Clinical Global Impression Severity of Illness (CGI/S) score, for paroxetine versus placebo and for imipramine versus placebo. The protocol-designated secondary outcomes were the proportion of subjects with a final HRSD score ≤7 or a final CGI/S score ≤2.  Additional secondary outcomes included the proportion of subjects experiencing adverse events, premature treatment discontinuation, and manic or hypomanic reactions as determined by the DSM-III-R Mania/Hypomania Assessment and the Young Mania Rating Scale (YMRS).

The study population consisted of outpatient subjects ≥ 18 years old, with a lifetime diagnosis of bipolar type I disorder and a history of at least one prior manic or major depressive episode within the preceding 5 years, who failed to respond to lithium carbonate for ≥ 7 weeks at therapeutic lithium levels (Nemeroff et al., 2001, 907).

The original protocol called for a sample size of 62 subjects per treatment group (or a total of 186 subjects). However, during the course of the study the protocol sample size estimate was amended downward to 46 subjects per treatment group (or a total of 138 subjects) (Amsterdam & McHenry, 2012).  

The statistical plan called for separate analyses on the entire subject population, and on two subgroups of subjects: (i) those who experienced a manic or hypomanic episode during the study; and (ii) those who did not. The YMRS was to be used to assess severity of manic and/or hypomanic symptoms across treatment conditions, and the relationship between change from baseline in YMRS scores and HRSD scores was to be specifically examined.  Factors that might influence treatment outcome were to be examined via the use of interaction terms in the regression models and those that were not statistically significant (i.e., p>0.1) in the primary analysis would be dropped from all subsequent analyses. 

The protocol also specifically noted that the comparison of primary interest was paroxetine versus placebo regardless of baseline lithium level stratification. Finally, mania and hypomania were to be analyzed using logistic regression models that included the effect terms of ‘treatment’, ‘investigator’, and ‘treatment x investigator’ interaction. The protocol noted that if the interaction was not significant, it would be dropped from the model. Virtually none of these protocol-designated procedures were followed or reported in the published article (2001, 908).

The original sample size estimate of 62 subjects per treatment condition was reduced to 46 per group during the study; and this may have been the result of exceedingly slow subject enrolment which ultimately led GSK to add a 19th investigative site. By the time that the study was prematurely terminated by GSK, only 117 (of the originally projected 186) subjects were enrolled, resulting in final sample sizes for paroxetine (n=35), imipramine (n=39), and placebo (n=43). However, by the time the study was published, the declared sample size estimate had mysteriously changed once again: “the study was designed (sic) to enroll 35 patients per arm, which would allow 70% power to detect a 5-point difference on the Hamilton depression scale score (SD=8.5) between treatment groups” (2001, 908). 

Although the article noted that statistical power was estimated at only 70%, Nemeroff et al. failed to inform the reader that this value was unconventionally low, that it was derived post hoc (after the analyses had been completed), and that it did not reflect the original protocol-designated power estimate, which was based upon 62 subjects per group and was reduced during the study to 46 subjects per group. Moreover, Nemeroff et al. did not inform the reader that the sample size estimate was further reduced to 35 subjects per group after the data were analysed. No mention was made that this second, post hoc change occurred as an extra-regulatory protocol violation of HHS Good Clinical Practice Guidelines, or that it was made in order to allow the final sample size estimate of 35 subjects per group to comport with the final sample size of the truncated paroxetine enrolment (i.e., n=35). Nemeroff et al. did not acknowledge clearly that the study failed to recruit the originally projected sample size necessary to test the primary study hypothesis; by mentioning the low 70% power estimate, they only hinted that the study had insufficient statistical power to adequately test the primary study aims.
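The relationship between the three sample size figures and statistical power can be illustrated with a rough calculation. The sketch below uses a normal approximation for a two-group comparison of means and takes the 5-point difference and SD of 8.5 from the passage quoted from the published article. The two-sided alpha of 0.05, the function name, and the approximation itself are assumptions made here for illustration; the protocol's actual power calculation is not reproduced in this paper.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, difference=5.0, sd=8.5, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation; the negligible contribution of the opposite
    tail is ignored. alpha = 0.05 is an assumption, not a protocol value.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = (difference / sd) * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


# Sample sizes per group named in the text: original, amended, published.
powers = {n: approx_power(n) for n in (62, 46, 35)}
for n, p in powers.items():
    print(f"n = {n} per group: approximate power {p:.2f}")
```

Under these assumptions, 62, 46 and 35 subjects per group correspond to roughly 90%, 80% and 70% power respectively, which is consistent with each reduction in the declared sample size lowering the stated power by about ten percentage points.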

For the study results, Nemeroff et al. played down the protocol-designated statistical procedures that were to be used for the primary efficacy analyses and, instead, emphasized the statistical procedures used for the unnecessary post hoc lithium level stratification efficacy analyses. Nemeroff et al. did not note that a sample size of 35 subjects per group was insufficient to test for differences among lithium level subgroups. Furthermore, “no adjustments for multiple comparisons were made” (2001, 908); such adjustments, if properly applied, would have nullified the only ‘positive’ paroxetine finding in the study.

Although Nemeroff et al. reported the presence of manic symptoms using the Mania/Hypomania Assessment, no detailed analyses were presented, and none of the data from the YMRS ratings were ever reported (nor was the existence of the YMRS as an outcome measure even mentioned). In addition, Nemeroff et al. provided no information to readers as to how the 4 manic episodes on imipramine and 2 on placebo were determined.

GSK conflated primary and post hoc analyses to present the only ‘positive’ post hoc finding for paroxetine in the entire study as if it were the primary outcome (i.e., a stratified lithium level analysis). However, according to the protocol statistical plan, the post hoc lithium level stratification analyses were completely unnecessary. Of more than 30 separate primary, secondary and post hoc efficacy analyses reported in the GSK Clinical Trials Website Results Summary, only the post hoc comparison of paroxetine versus placebo in subjects with low baseline lithium levels showed a statistically ‘positive’ result for paroxetine. Nemeroff et al. attributed the ‘negative’ primary outcome finding of paroxetine versus placebo in all subjects to an excessive response to placebo in the “high” lithium level subgroup (2001, 909), although there is no evidence to support this conclusion. Nemeroff et al. then emphasized the single ‘positive’ paroxetine efficacy finding as if it were the primary study aim (2001, 909).
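To illustrate why a single ‘positive’ result among some 30 unadjusted analyses carries little weight, the following minimal sketch computes the chance of at least one false-positive finding and a Bonferroni-adjusted significance threshold. It assumes 30 independent tests at alpha = 0.05; the analyses in a clinical trial are correlated rather than independent, and Bonferroni is only one of several possible adjustments, so the figures are indicative only. The function names are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni adjustment."""
    return alpha / n_tests


# "More than 30" analyses are described in the text; 30 is used here as
# a conservative round figure, and independence is an assumption.
n_analyses = 30
fwer = familywise_error_rate(n_analyses)
threshold = bonferroni_threshold(n_analyses)
print(f"Chance of at least one false positive: {fwer:.2f}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
```

Under these assumptions the chance of at least one spurious ‘significant’ result is close to 80%, and an individual comparison would need a p-value below about 0.0017 to remain significant after adjustment.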

In addition, Nemeroff et al. conflated efficacy and safety data to favor paroxetine by only presenting selected data on treatment-emergent manic and sexual side effect symptoms. For example, Nemeroff et al. presented only the number of clinician-reported manic episodes and minimized the rate of manic and hypomanic symptoms occurring with paroxetine (versus imipramine and placebo).  They also favored the sexual side effect profile of paroxetine (versus imipramine) by portraying paroxetine as having virtually no sexual side effects (2001, 911).  Nemeroff et al. failed to report a higher frequency of paroxetine-induced treatment-emergent depressive symptoms versus imipramine (which was reported in the GSK Clinical Trials Website Results Summary).

In summary, the paroxetine 352 study was a non-informative trial with insufficient statistical power and inconclusive results.  As a consequence, GSK and the ghostwriters had to rely upon conflated post hoc analyses of data subsets in order to portray a favorable result for paroxetine. In this regard, the study was not designed to test whether or not paroxetine (or imipramine) was superior to placebo in lithium level subgroups. Nemeroff et al. failed to disclose that these lithium level subgroups were stratified according to a single baseline lithium level determination for statistical purposes only and that there were no actual discrete low or high lithium level subgroups present in the study, as all subjects were maintained, according to protocol, at therapeutic lithium levels throughout the study. It was the conflation of primary and post hoc analyses that allowed Nemeroff et al. to present the lithium subgroup analyses as if they were the primary and most clinically meaningful entities. Finally, Nemeroff et al. misled their readers in asserting that there is no therapeutic advantage to using antidepressant therapy in bipolar depressed patients with “high” lithium levels. The 352 study was not designed to test this hypothesis and this conclusion does not appear to be supported by the data.  Moreover, with the exception of the paroxetine 352 study, there are no other published studies in the medical literature reporting a lack of antidepressant efficacy in patients with “high” lithium levels.  Conversely, the assertion that antidepressants may be more effective in patients with “low” lithium levels is potentially dangerous and inconsistent with most published practice guidelines for treating bipolar depression (e.g., Pacchiarotti et al., 2013) and may put patients with “low” lithium levels at greater risk for developing antidepressant-induced mania and suicidal ideation (Dunner, 1983; Wehr et al., 1988; Sachs et al., 1994; Leverich et al., 2006).

By downplaying the well-known side effect profile of paroxetine and portraying it as being effective for bipolar depression without manic episodes, Nemeroff et al. were able to successfully conflate efficacy and side effect aims to favor paroxetine over imipramine (irrespective of safety issues).


6.      Conclusion


The problem of truth and transparency in published scientific reports of corporate-sponsored clinical trials has been an ongoing concern in the medical and bioethics literature. The difference between what a trial should report and what has actually been reported in the medical journals over the past thirty years is so alarming that some editors have declared a crisis of credibility (Fava, 2006).  Corporate mischaracterization of clinical trial results is of particular concern in psychiatry, where outcome measures are more subjective and easily manipulated. Because few industry-sponsored studies receive public scrutiny and even fewer are ever formally retracted, it is important to make these articles transparent to correct the scientific record.  It is furthermore imperative to inform the medical community of mischaracterized data that could lead to potential harm to vulnerable patients.



Amsterdam JD, McHenry L. (2012) The paroxetine 352 bipolar trial: A study in medical ghostwriting. Int J Risk Saf Med. 24(4): 221-231.

Angell, M. (2004) The Truth About the Drug Companies, New York: Random House.

Angell, M. (2008) Industry-sponsored clinical research: A broken system, JAMA. 300(9):1069-1071. doi:10.1001/jama.300.9.1069.

Carandang C, Jabbal R, MacBride A, Elbe D. (2011) A review of escitalopram and citalopram in child and adolescent depression. J Can Acad Child Adolesc Psych; 20(4): 323.


Doshi P, Dickersin K, Healy D, Vedula SS, Jefferson T. (2013) Restoring invisible and abandoned trials: a call for people to publish the findings. BMJ;346:f2865.


Dunner DL. (1983) Subtypes of bipolar affective disorder with particular regard to bipolar II. Psychiatry Developments 1:75-85.

Fugh-Berman A. (2005) The corporate coauthor, J Gen Intern Med, 20: 547.


Fava G. (2006) A different medicine is possible, Psychother Psychosom, 75: 1-3.


Forest (1999) Forest Research Institute. Study Protocol for MD-18. September 01, 1999. Accessed February 2015.


Forest (2001a) E-mail re: Pediatric data dated 10/15/2001. Accessed February 2015.

Forest (2001b) E-mail re: Pediatric Manuscript dated 12/17/01. Accessed February 2015.


Forest (2002) Forest Research Institute. Study Report for Protocol No. CIT-MD-18. April 8, 2002. Accessed February 2015.


Healy D, Cattell D. (2003) Interface between authorship, industry and science in the domain of therapeutics, Br J Psychiatry, 183:22-27.


Healy, D. (2004) Let Them Eat Prozac, New York: New York University Press.

Healy, D. (2008) Our censored journals, Mens Sana Monographs, 6: 244-256.

Horton, R. (2004) The dawn of McScience, New York Review of Books, March 11: 7.

Jureidini J, McHenry L, Mansfield P. (2008) Clinical trials and drug promotion: selective reporting of study 329. Int J Risk Saf Med. 20(1-2): 73–81

Jureidini, J, McHenry, L. (2011) Conflicted medical journals and the failure of trust, Accountability in Research, 18: 45-54.

Jureidini, Jon. (2012) Declaration of Dr. Jon Jureidini. November 15. Accessed February 2015

Jureidini, J, Amsterdam, J, McHenry, L. (2016) The citalopram CIT-MD-18 pediatric depression trial: A deconstruction of medical ghostwriting, data manipulation and academic malfeasance. Int J Risk Saf Med. 28: 33-43.



Keller, M. B., Ryan, N. D., Strober, M., Klein, R. G., Kutcher, S. P., Birmaher, B., Hagino, O. R., Koplewicz, H., Carlsson, G. A., Clarke, G. N., Emslie, G. J., Feinberg, D., Geller, B., Kusumakar, V., Papatheodorou, G., Sack, W. H., Sweeney, M., Wagner, K. D., Weller, E., Winters, N. C., Oakes, R., McCafferty, J. P. (2001) Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial. J Am Acad Child Adolesc Psychiatry. Jul; 40(7):762-72.

Kondro W, Sibbald B. (2004) Drug company experts advised staff to withhold data about SSRI use in children. CMAJ Mar 2;170(5):783

Krimsky, S. (2003) Science in the Private Interest, Lanham: Rowman & Littlefield.

Krimsky, S. (2006) “Autonomy, disinterest, and entrepreneurial science,” Society May/June, 43/4: 22-29.

Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C, Abi-Jaoude E. (2015) Restoring study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ ; Sep 16;351:h4320.


Leverich GS, Altshuler LL, Frye MA, Suppes T, McElroy SL, Keck PE Jr, Kupka RW, Denicoff KD, Nolen WA, Grunze H, Martinez MI, Post RM. (2006) Risk of switch in mood polarity to hypomania or mania in patients with bipolar depression during acute and continuation trials of venlafaxine, sertraline, and bupropion as adjuncts to mood stabilizers. Am J Psychiatry. 163:232-9.

Martin A, Gilliam WS, Bostic JQ, Rey JM. (2005). Letter to the editor. Child psychopharmacology, effect sizes, and the big bang. Am J Psych: 162 (4): 817.


Mathews M, Adetunji B, Mathews J, Basil B, George V, Mathews M, Budur K, Abraham S. (2005) Child psychopharmacology, effect sizes, and the big bang.  Am J Psych; 162 (4): 818.


McHenry, L. (2005) “On the origin of great ideas: Science in the age of big pharma,” Hastings Center Report, 35, no. 6: 17-19.

McHenry, L, Jureidini, J. (2008) Industry-sponsored ghostwriting in clinical trial reporting: A case study. Account Res, 15/3: 152-167


McHenry, L. (2010) “Of sophists and spin-doctors: Industry-sponsored ghostwriting and the crisis of academic medicine,” Mens Sana Monographs, 8: 129-145.

Nemeroff CB, Evans DL, Gyulai L, Sachs GS, Bowden CL, Gergel IP, Oakes R, Pitts CD. (2001) Double-blind, placebo-controlled comparison of imipramine and paroxetine in the treatment of bipolar depression. Am J Psychiatry; 158(6):906-12.

Newman, M. (2010) The rules of retraction, BMJ, 341: 1246-1248.

Pacchiarotti, I., Bond, D.J., Baldessarini, R.J., Nolen, W.A., Grunze, H., Licht, R.W., Post, R.W., Berk, M., Goodwin, G.M., Sachs, G.S., Tondo L., Findling, R.L., Youngstrom, E.A., Tohen, M., Undurraga, J., González-Pinto, A., Goldberg, J.F., Yildiz, A., Altshuler, L.L., Calabrese, J.R., Mitchell, P.B., Thase, M.E., Koukopoulos, A., Colom, F., Frye, M.A., Malhi, G.S., Fountoulakis, K.N., Vázquez, G., Perlis, R.H., Ketter, T.A., Cassidy, F., Akiskal, H., Azorin, J-M., Valentí, M., Mazzei, D.H., Lafer, B., Kato, T., Mazzarini, L., Martínez-Aran, A., Parker, G., Souery, D., Özerdem, A., McElroy, S.L., Girardi, P., Bauer, M., Yatham, L.N., Zarate, C.A., Nierenberg, A.A., Birmaher, B., Kanba, S., El-Mallakh, R.S., Serretti, A., Rihmer, Z., Young, A.H., Kotzalidis, G.D., MacQueen, G.M., Bowden, C.L., Ghaemi, S.N., Lopez-Jaramillo, C., Rybakowski, J.K., Ha, K., Perugi, G., Kasper, S., Amsterdam, J.D., Hirschfeld, R.M., Kapczinski, F., Vieta, E. (2013) The International Society for Bipolar Disorders (ISBD) Task Force report on antidepressant use in bipolar disorders. American Journal of Psychiatry, 170(11):1249-1262.


Sachs GS, Lafer B, Stoll AL, Banov M, Tibault AB, Tohen M, Rosenbaum JF. (1994) A double-blind trial of bupropion versus desipramine for bipolar depression. J Clin Psychiatry. 55:391-3.


Sismondo, S. (2007) Ghost management: How much of the medical literature is shaped behind the scenes by the pharmaceutical industry? PloS Medicine; 4: e286. 

SmithKline Beecham (1998a) A Multi-center, double-blind, placebo controlled study of paroxetine and imipramine in adolescents with unipolar major depression – Acute Phase. Final Clinical Report. SB Document Number: BRL-029060/RSD-100TW9/1/CPMS-329. 24 November. (accessed March 2007)


SmithKline Beecham (1998b) Draft I. 18 Dec, 1998. (accessed 10 March 2007)


SmithKline Beecham, (1998c) Seroxat/Paxil adolescent depression position piece on the Phase III clinical studies. Available at Last accessed March 2008.


SmithKline Beecham (1999a) Draft submitted to JAMA, 30 July. (accessed 10 March 2007)


SmithKline Beecham (1999b) S Laden. Response to JAMA review. 10 Dec. (accessed 10 March 2007)


SmithKline Beecham (2000). Response to JAACAP Reviewers, November 3.  (accessed 10 March 2007)


SmithKline Beecham (2001a) Hawkins to all sales representatives selling Paxil, Aug 16, 2001, (accessed 10 March 2007)


SmithKline Beecham (2001b) White to Hood, 03/05. Available at  Last accessed March 2008.


Smith, R. (2005) Medical journals are an extension of the marketing arm of pharmaceutical companies, PLoS Med.; 2(5): e138.


von Knorring AL, Olsson GI, Thomsen PH, Lemming OM, Hulten A. (2006) A randomized, double-blind, placebo-controlled study of citalopram in adolescents with depression. J Clin Psychopharm; 26(3): 311-315.

Wagner KD, Robb AS, Findling RL, Jin J, Gutierrez MM, Heydorn WE. (2004)  A randomized, placebo-controlled trial of citalopram for the treatment of major depression in children and adolescents. Am J Psych; 161 (6): 1079-1083.

Wagner KD, Robb AS, Findling RL, Jin J. (2005) Dr. Wagner and colleagues reply. Am J Psych; 162(4): 819.


Wehr TA, Sack DA, Rosenthal NE, Cowdry RW. (1988) Rapid cycling affective disorder: Contributing factors and treatment responses in 51 patients. Am J Psychiatry. 145(2):179-84.


Jay D. Amsterdam, Leemon B. McHenry and Jon N. Jureidini

September 8, 2016