Journal Article, 1991

Pahnke’s “Good Friday Experiment”: A Long-Term Follow-Up and Methodological Critique

Introduction & Methodology

On Good Friday, 1962, before services commenced in Boston University’s Marsh Chapel, Walter Pahnke administered small capsules to twenty Protestant divinity students. Thus began the most scientific experiment in the literature designed to investigate the potential of psychedelic drugs to facilitate mystical experience (Pahnke, 1963, 1966, 1967, 1970; Pahnke & Richards, 1969a, 1969b, 1969c). Half the capsules contained psilocybin (30 mg), an extract of psychoactive mushrooms, and the other half contained a placebo. According to Pahnke, the experiment determined that “the persons who received psilocybin experienced to a greater extent than did the controls the phenomena described by our typology of mysticism” (Pahnke, 1963, p. 220).

This paper is a brief methodological critique and long-term follow-up study to the “Good Friday Experiment.” Pahnke, who was both a physician and a minister, conducted the experiment in 1962 for his Ph.D. in Religion and Society at Harvard University, with Timothy Leary as his principal academic advisor (Leary, 1962, 1967, 1968). Describing the experiment, Walter Houston Clark, 1961 recipient of the American Psychological Association’s William James Memorial Award for contributions to the psychology of religion, writes, “There are no experiments known to me in the history of the scientific study of religion better designed or clearer in their conclusions than this one” (Clark, 1969, p. 77).

Since a classic means of evaluating mystical experiences is by their fruits, follow-up data is of fundamental importance in evaluating the original experiment. A six-month follow-up was part of the original experiment and a longer term follow-up would probably have been conducted by Pahnke himself had it not been for his death in 1971. For over twenty-five years it has not been legally possible to replicate or revise this experiment. Hence, this long-term follow-up study, conducted by the author, is offered as a way to advance scientific knowledge in the area of psychedelics and experimental mysticism. Lukoff, Zanger and Lu’s review (1990) of psychoactive substances and transpersonal states offers a recent overview of this topic.

Though all raw data from the original experiment is lost, including the uncoded list of participants, extensive research over a period of four years and the enthusiastic cooperation of most of the original subjects have resulted in the identification and location of nineteen out of the original twenty subjects. From November, 1986 to October, 1989, this author tape recorded personal interviews with sixteen of the original subjects, meeting fifteen in their home cities throughout the United States and interviewing one subject (from the control group) over the telephone. In addition to the interviews, all sixteen subjects participating in the long-term follow-up, nine from the control and seven from the experimental group, were re-administered the six-month 100-item follow-up questionnaire used in the original experiment.

Of the remaining three subjects from the experimental group, one is deceased, the identity of another is unknown, and the third declined to participate, citing concerns about privacy. One subject from the control group declined to be interviewed or to fill out the questionnaire because he interpreted Pahnke’s pledge of confidentiality to mean that the subjects should not talk about the experiment to anyone. This author’s discussion of the meaning of confidentiality and mention of the explicit support for the long-term follow-up by Pahnke’s wife failed to enlist his participation.

Informal discussions were also conducted with seven of Pahnke’s ten original research assistants for purposes of gathering background information about the experiment. At the time of the experiment, these people were professors or students of religion, psychology and philosophy at universities, colleges and seminaries in the Boston area.

Methodology of the Original Experiment

Pahnke hypothesized that psychedelic drugs, in this case psilocybin, could facilitate a “mystical” experience in religiously inclined volunteers who took the drug in a religious setting. He further hypothesized that such experiences would result in persisting positive changes in attitudes and behavior.

Pahnke believed the most conducive environment for his experiment would be a community of believers participating in a familiar religious ceremony designed to elicit religious feelings, in effect creating an atmosphere similar to that of the tribes which used psilocybin-containing mushrooms for religious purposes (Harner, 1973; Hofmann, Ruck & Wasson, 1978; Hofmann & Schultes, 1979; Wasson, 1968). Accordingly, the experiment was designed to administer psilocybin to a previously acquainted group of Christian divinity students in church during a Good Friday service.

Methodologically, the study was designed as a randomized, controlled, matched-group, double-blind experiment using an active placebo. Prior to Good Friday, twenty white male Protestant volunteers, all of whom were students at the same theological school in the Boston area, were given a series of psychological and physical tests. Ten closely matched pairs were created using variables such as past religious experience, religious background and training, and general psychological makeup. On the morning of the experiment, a helper who did not participate further in the experiment and who did not know any of the subjects flipped a coin to determine to which group, psilocybin or placebo, each member of the pair would be assigned.
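The pairwise coin flip described above can be illustrated with a short sketch. It is not a reconstruction of Pahnke's actual procedure: the subject labels, the function name `assign_pairs`, and the seeded random generator standing in for the coin are all assumptions made for illustration, since the original list of participants is lost.

```python
import random


def assign_pairs(pairs, rng=None):
    """Assign one member of each matched pair to each group by a coin flip.

    `pairs` is a list of (subject_a, subject_b) tuples. The matching itself
    (religious background, psychological makeup, etc.) is assumed to have
    been done beforehand and is not modeled here.
    """
    rng = rng or random.Random()
    assignment = {}
    for first, second in pairs:
        # One simulated fair coin flip per pair decides who gets psilocybin.
        if rng.random() < 0.5:
            drug, control = first, second
        else:
            drug, control = second, first
        assignment[drug] = "psilocybin"
        assignment[control] = "placebo"
    return assignment


# Hypothetical labels S01..S20 grouped into ten matched pairs.
pairs = [(f"S{2 * i + 1:02d}", f"S{2 * i + 2:02d}") for i in range(10)]
assignment = assign_pairs(pairs, random.Random(1962))
```

Because the flip is made within each pair rather than across the whole group, the design guarantees ten subjects per group and keeps the matched variables balanced between the psilocybin and placebo conditions.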

Three different methods were used to create numerical scales quantifying the experiences of the subjects in terms of an eight-category typology of mystical experiences designed by Pahnke especially for the experiment. Blind independent raters trained in content-analysis procedures scored descriptions of the experiences written by the subjects shortly after Good Friday as well as transcripts of three separate tape-recorded interviews conducted immediately, several days and six months after the experiment. A 147-item questionnaire was administered to the subjects one or two days after Good Friday and a 100-item questionnaire was administered six months after the experiment. The subjects’ responses to the interviews and the two questionnaires were transformed into three distinct scores averaging the percentage of the maximum possible score in each category. The three complementary scores were then compared with one another.
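As a rough illustration of what a "percentage of the maximum possible score in each category" means, the sketch below computes such a score from item ratings. The item identifiers, the item-to-category mapping, and the ratings are hypothetical; the actual assignment of the 147 and 100 questionnaire items to categories is not given here, and the function name `category_scores` is an assumption.

```python
def category_scores(responses, categories, scale_max):
    """Return each category's score as a percentage of its maximum possible score.

    responses:  dict mapping item id -> numeric rating given by one subject
    categories: dict mapping category name -> list of item ids in that category
    scale_max:  highest rating an item can receive on this questionnaire
    """
    scores = {}
    for name, items in categories.items():
        ratings = [responses[item] for item in items]
        max_possible = scale_max * len(ratings)
        scores[name] = 100.0 * sum(ratings) / max_possible
    return scores


# Hypothetical example: five items, two categories, a zero-to-four scale.
categories = {"unity": ["q1", "q2", "q3"], "transiency": ["q4", "q5"]}
responses = {"q1": 4, "q2": 3, "q3": 2, "q4": 1, "q5": 0}
scores = category_scores(responses, categories, scale_max=4)
# scores == {"unity": 75.0, "transiency": 12.5}
```

Expressing every category on the same zero-to-one-hundred scale is what allows scores from instruments with different numbers of items to be set side by side.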

Pahnke secured support and permission to use Marsh Chapel from Rev. Howard Thurman, Boston University’s dynamic black chaplain. Several small meeting rooms and a self-contained basement chapel were set aside on Good Friday for the participants in the experiment while the main service led by Rev. Thurman was taking place upstairs in the larger chapel. The two-and-a-half hour service was broadcast into the basement chapel, where altar, pews, stained glass windows and various religious symbols were permanently located.

Pahnke gave an active placebo of nicotinic acid to the controls who were expecting to receive either the psilocybin or an inactive placebo. This was done in order to “potentiate suggestion in the control subjects, all of whom knew that psilocybin produced various somatic effects, but none of whom had ever had psilocybin or any related substance before the experiment” (Pahnke, 1963, p. 89).

The ten research assistants worked as part of the experimental team in order to provide emotional support to the subjects prior to and during the service. Subjects were divided into five groups of four with two research assistants, known as group leaders, assigned to each group. These small groups met for two hours prior to the service to build trust and facilitate group support. Subjects were encouraged to “go into the unexplored realms of experience during the actual experiment and not try to fight the effects of the drug even if the experience became very unusual or frightening” (Pahnke, 1963, p. 96).

As a precaution against biasing the subjects toward the typology of mystical experience, leaders were told not to discuss specific aspects of the psychedelic or mystical experience. The lack of overt bias was confirmed by all of the subjects in their long-term follow-up interviews. In a typical long-term follow-up report, psilocybin subject S.J. (all initials used to identify subjects are coded to preserve anonymity) made the following remarks both about the preparation phase of the experiment and the conduct of the group leaders:

None of the fine points of the mystical experience were given to us. We were not told to read any books such as Stace’s book on mysticism or Jacob Boehme’s books, nothing like that. They did not bias us in any way towards that, not at all.

At the insistence of one of the group leaders as well as Pahnke’s faculty sponsor, Timothy Leary, but over the objections of Pahnke, all of the group leaders were also given a pill prior to the service (Leary, 1984, p. 107). This was done in a double-blind manner with one of each group’s leaders receiving a half dose of psilocybin (15 mg) and the other the placebo. Pahnke was concerned this would lead to charges of experimenter bias being leveled against the study, but Leary and the group leader felt that the full involvement of the group leaders would create more of a community feeling and lend necessary confidence to the subjects. Though administered a capsule at the Good Friday service, the group leaders’ reactions were not tape recorded, nor did they fill out questionnaires. Pahnke himself refrained from having any personal experiences with any psychedelic drug until after the experiment and follow-up had been completed.

Difficulties with the Double-Blind

The double-blind was successfully sustained through all of the preparation phases of the experiment up to and including ingestion of the capsule. The double-blind was even sustained for a portion of the Good Friday service itself because of the use of nicotinic acid as an active placebo. Nicotinic acid acts more quickly than does psilocybin and produces a warm flush through vasodilation of blood vessels in the skin and general relaxation. Subjects in the placebo group mistakenly concluded, in the early stages of the experiment, that they were the ones who had received the psilocybin (Pahnke, 1963, p. 212). The group leaders, unaware that an active placebo was going to be used, were also initially unable to distinguish whether subjects had received the psilocybin or the placebo.

Psilocybin’s powerful subjective effects were eventually obvious to all subjects who received it, even though they had not previously ingested the drug or anything similar to it (Pahnke, 1963, p. 212). Inevitably, the double-blind was broken during the service as the psychoactive effects of the psilocybin deepened and the physiological effects of the nicotinic acid faded. At the end of the day of the experiment, all subjects correctly determined whether they had received the psilocybin or the placebo even though they were never told which group they were in (Pahnke, 1963, p. 210). Pahnke himself remained technically blind until after the six-month follow-up. The comments of subject O.W., gathered in the course of this author’s long-term follow-up, are typical of members of the control group.

After about a half hour I got this burning sensation. It was more like indigestion than a burning sensation. And I said to T.B., “Do you feel anything?” And he said, “No, not yet.” We kept asking, “Do you feel anything?” I said, “You know, I’ve got this burning sensation, and it’s kind of uncomfortable.” And T.B. said, “My God, I don’t have it, you got the psilocybin, I don’t have it.” I thought, “Jeez, at least I was lucky in this trial. I’m sorry T.B. didn’t get it, but I’m gonna’ find out.” I figured, with my luck, I’d probably get the sugar pill, or whatever it is. And I said to Y.M., “Do you feel anything?” No, he didn’t feel anything. So I sat there, and I remember sitting there, and I thought, “Well, Leary told me to chart my course so I’m gonna’ concentrate on that.” And I kept concentrating and sitting there and all I did was get more indigestion and uncomfortable.

Nothing much more happened and within another 40 minutes, 45 minutes, everybody was really quiet and sitting there. Y.M. was sitting there and looking ahead, and all of the sudden T.B. says to me, “Those lights are unbelievable.” And I said, “What lights?” He says, “Look at the candles.” He says, “Can you believe that?” And I looked at the candles, and I thought, “They look like candles.” He says, “Can’t you see something strange about them?” So I remember squinting and looking. I couldn’t see anything strange. And he says, “You know it’s just spectacular.” And I looked at Y.M. and he was sitting there saying, “Yeah.” And I thought, “They got it, I didn’t.”

The follow-up interviews yielded no evidence that the experimental team consciously used their knowledge of which pill the subjects had received to bias the results. However, unconscious bias resulting in an “expectancy effect” cannot be ruled out (Barber, 1976). Still, valuable information can be generated without the successful use of the double-blind methodology. Louis Lasagna, Director of the Center for the Study of Drug Development at Tufts University, writes,

We have witnessed the ascendancy of the randomized, double-blind, controlled clinical trial (RCCT), to the point where many in positions of authority now believe that data obtained via this technique should constitute the only basis for registering a drug or indeed for coming to any conclusions about its efficacy at any time in the drug’s career. My thesis is that this viewpoint is untenable, needlessly rigid, unrealistic, and at times unethical. … Modern trial techniques [were not] necessary to recognize the therapeutic potential of chloral hydrate, the barbiturates, ether, nitrous oxide, chloroform, curare, aspirin, quinine, insulin, thyroid, epinephrine, local anesthetics, belladonna, antacids, sulfonamides, and penicillin, to give a partial list … (Lasagna, 1985, p. 48).

Commenting about the attempt to remove the experimenter from the experiment completely, Tooley and Pratt remark:

In certain participant-observer situations (e.g. psychotherapy, education, change induction, action research) the purpose might be to influence the system under investigation as much as possible, but still accounting for (though now exploiting) the variance within the system attributable to the several significant and relevant aspects of the investigator’s participant observation. From this perspective, the quixotic attempt to eliminate the effects of participant-observation in the name of a misplaced pseudo-objectivity is fruitless, not so much because it is impossible but because it is unproductive …. From our point of view … the question becomes not how to eliminate bias (unaccounted-for influence) of participant observation, but how optimally to account for and exploit the effects of the participant observation transaction in terms of the purposes of the research (Tooley & Pratt, 1964, pp. 254-56).

The loss of the double-blind makes it impossible to determine the relative contributions of psilocybin and suggestion in producing the subjects’ reported experiences. If the experiment had been designed specifically to measure the pure drug effects of psilocybin, the failure of the double-blind would be quite damaging. In this instance the loss of the double-blind is of lesser significance because the entire experiment was explicitly designed to maximize the combined effect of psilocybin and suggestion. The setting was religious, the participants were religiously inclined and the mood was positive and expectant. Pahnke did not set out to investigate whether psilocybin was able to produce mystical experiences irrespective of preparation and context. He designed the experiment to determine whether volunteers who received psilocybin within a highly supportive, suggestive environment similar to that found in the ritual use of psychoactive substances by various native cultures would report more elements of a classical mystical experience (as defined by the questionnaires) than volunteers who did not receive psilocybin. The loss of the double-blind may have enhanced the power of suggestion to some extent and suggests that restraint should be used in attributing the experiences of the experimental group exclusively to the psilocybin (Zinberg, 1984).

Critique of the Questionnaire

Pahnke designed the questionnaire he used to measure the occurrence of a mystical experience specifically for the experiment. No similar questionnaires existed at the time (Larson, 1986; Rue, 1985; Silverman, 1983). Pahnke decided to measure the mystical experience in reference to eight distinct experiential categories. The categories include 1) sense of unity, 2) transcendence of time and space, 3) sense of sacredness, 4) sense of objective reality, 5) deeply felt positive mood, 6) ineffability, 7) paradoxicality and 8) transiency. These categories are very similar to those elaborated by such well-respected scholars of mystical experience as William James (1902), Evelyn Underhill (1910), and W.T. Stace (1960) and are accepted as valid even by academic critics of the Good Friday experiment such as R.C. Zaehner (1972). At present, the scientific questionnaire most widely used by researchers to assess mystical experiences is a 32-item questionnaire created by Ralph Hood, also based on categories developed by W.T. Stace (Spilka, Hood & Gorsuch, 1985).

Zaehner’s critique of Pahnke’s questionnaire is that it does not contain a category for experiences which are specifically Christian, such as identification with the death and rebirth of Jesus Christ. From Zaehner’s perspective, this omission made it impossible to determine if the experiences reported by the subjects during the Good Friday experiment were religious, since he thought a religious experience for Christians necessarily involves a theistic encounter with Christ. Zaehner objected to the claim that an experience of a generalized, non-specific, apprehension of a transcendent reality beyond any specific cultural forms and figures could properly be called religious. Anticipating this critique, Pahnke asserted in the thesis that he was not attempting to resolve the question of what can properly be called religious but was simply investigating mystical experiences, regardless of whether or not they were considered religious. This author will also leave this delicate discussion to others.

The questionnaire used in the Good Friday experiment has been modified and expanded over the years by Pahnke, William Richards, Stanislav Grof, Franco Di Leo, and Richard Yensen for use in subsequent psychedelic research (Richards, 1975, 1978). From the initial creation of the questionnaire by Pahnke in 1962 to Di Leo and Yensen’s computerized version, called the Peak Experience Profile, the basic items relating to the mystical experience have remained essentially unchanged (Di Leo, 1982). While the original follow-up questionnaire was composed of eight different categories, the Peak Experience Profile uses only six. The category of transiency was eliminated since it measures any altered state of consciousness whether mystical or not. The paradoxicality and alleged ineffability categories were combined into the ineffability category. Over the years, new categories measuring transpersonal but not necessarily mystical experiences were added. For example, new questions relate to the reexperiencing of the stages of birth and the perinatal matrixes as defined by Grof (Grof, 1975, 1980) and also to past-life experiences (Ring, 1982, 1984, 1988). A series of questions relating to difficult and painful nadir experiences, in some sense the opposites of peak experiences, has also been added.

In Pahnke’s original questionnaire and in the subsequent revisions, the completeness with which each subject experienced each category is measured through numerical responses to category-specific questions. Pahnke’s subjects rated each question on the post-drug questionnaire from zero to four, with zero indicating that the item was not experienced at all and four indicating that it was experienced as strongly as or more strongly than ever before. The six-month follow-up questionnaire used a zero to five scale, with four indicating that it was experienced as strongly as before and five indicating that it was experienced more strongly than ever before.
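One consequence of the two scales is that the same numerical rating carries a different weight on each questionnaire once it is expressed as a percentage of the maximum. The sketch below is illustrative only: the constant and function names are assumptions, and it presumes that the maximum possible score for an item is simply the top rating on its scale.

```python
POST_DRUG_MAX = 4  # post-drug questionnaire: zero-to-four scale
FOLLOW_UP_MAX = 5  # six-month follow-up questionnaire: zero-to-five scale


def percent_of_max(rating, scale_max):
    """Express a single item rating as a percentage of the scale maximum."""
    if not 0 <= rating <= scale_max:
        raise ValueError(f"rating {rating} is outside 0..{scale_max}")
    return 100.0 * rating / scale_max


post_drug = percent_of_max(4, POST_DRUG_MAX)  # 100.0
follow_up = percent_of_max(4, FOLLOW_UP_MAX)  # 80.0
```

Under this assumption a rating of four is the ceiling on the post-drug questionnaire but only four-fifths of the ceiling on the follow-up questionnaire, which is worth keeping in mind when percentage scores from the two instruments are compared.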

The questions themselves are of two types. The predominant type asks the subject about experiences of a new perspective. For example, some of the questions used to determine the sense of unity ask subjects to rate the degree to which they experienced a pure awareness beyond any empirical content, a fusion of the self into a larger undifferentiated whole, or a freedom from the limitations of the self in connection with a unity or bond with what was felt to be all-encompassing and greater-than-self. Questions of this type are sufficiently detailed and specific to be an effective test for the specific category.

The second type of question, used much less frequently, asks about the loss of a normal state. For example, two questions used to determine the presence of a sense of unity simply required subjects to rate the degree to which they lost their sense of self or experienced a loss of their own identity. This type of question is a minor weak point of the questionnaire because it can be rated highly without having anything to do with mystical experiences. For example, one subject reported in the follow-up interview that under the influence of psilocybin he temporarily had difficulty recalling his career choice, home, names of his wife and children, and even his own name. This experience of a powerful loss of the usual sense of self and identity would be scored by the questionnaire as evidence of mystical experience but may not actually be related to it, because such a loss can occur for a variety of reasons. Though the questionnaire has relatively few questions of this type, some overestimation of the completeness of the mystical experience could have been introduced into the data as a result.

In addition to asking questions about the experience itself, the follow-up questionnaire also sought to assess the effects of that experience on the attitudes and behaviors of the subjects. For example, the subjects’ attitude changes were assessed by asking them to use a 0 to 5 scale to rate whether they had experienced an increase or a decrease in their feelings of happiness, joy, peace, reverence, creativity, vocational commitment, need for service, anxiety, and hatred. Changes in subjects’ behavior were assessed by means of questions asking whether or not they experienced changes in their relationships with others, in time spent in quiet meditation or devotional life, or whether they thought their behavior had changed in positive or negative ways.

Pahnke’s questionnaire gathered information only from the self-reports of the subjects, resulting in a general sense of the subjects’ own assessment of the direction of the effects of their Good Friday experience. The data do not yield specific information about the internal psychodynamic mechanisms at work within each subject, nor do they include the views of significant others regarding the effects of the experiment on the subjects.

In contemporary psychotherapy research, more sophisticated methods than Pahnke’s are used to assess personality change (Beutler & Crago, 1983). Reports from significant others such as family members and close friends of the subject are almost always used to add an important “objective” element in assessing personality change. Data from the follow-up questionnaires, administered by Pahnke at six months and by the author after twenty-four to twenty-seven years, should be considered valuable as far as they go, but this is not very far. Since no detailed personality tests were given prior to the experiment, results of such tests at the time of the long-term follow-up would have been of little value, and no such tests were conducted. The long-term follow-up interviews, because of their open-ended format and extensive questioning, yielded more detailed information than the questionnaire about the content of the experiences and the persisting effects.

Source: Rick Doblin, “Pahnke’s ‘Good Friday Experiment’: A Long-Term Follow-Up and Methodological Critique,” The Journal of Transpersonal Psychology, 1991, Vol. 23, No. 1. Digitally restored from the original publication. Text correction and HTML formatting by AI restoration pipeline. Published here by the Church of Ambrosia as a primary historical document.
