Some Thoughts on the Problems Which are Connected with Randomized and Blind Clinical Studies
By: Peter Heusser
This is an article by a physician who has been involved with cancer therapy for a number of years. He has taken a look at the hallowed ground of scientific research. Randomization to create comparable groups becomes a total impossibility if the individual, unique factor is taken into consideration. It is doubtful that a human being can be found who does not attest to his own uniqueness. If this is one's therapeutic orientation, then randomization is not possible. The only alternative is to eliminate individuality and take a set of agreed-upon comparables, e.g. age, weight, profession, gender, illness, etc. Here illness is looked on as an isolate. It is just this isolation that the population is beginning to revolt against: not the isolation of the illness, but personal isolation.
The author further looks at the intricacies of the placebo, pointing here to the role of the therapist. Mechanical dispensers and mechanical communicators would be the next step in eliminating the therapist and, with him, the placebo effect of the physician. A provocative placebo trial system might be to assign patients to a randomized study in which medicine-dispensing machines are used. This might appear far-fetched, but many a patient seen in practice senses that this is what some therapists have become today. Recognized researchers are now pointing to such problems, but they are not being heard.
The author goes on to make suggestions which could serve as a springboard for another approach to the evaluation of therapies.
reprinted from Complementary Medicine\Compendium III
On the Comparability of Groups
The comparability of groups must be checked with respect to the relevant parameters. It is obvious that quantitative parameters can be compared. However, what about qualitative ones? Can elements like suffering, family, and biographical situations really be compared in such a way that equal groups of patients can be set up? Every doctor who thinks about patients who have the same diagnosis, and who might even be of the same age and sex, will have to admit that they are really all very different and that they also react very differently.
In medicine which is oriented entirely towards natural science, the inner connection between that which is the same in patients (diagnosis, age, sex, organs affected, histology, etc.) and that which is different in them is still an insufficiently discussed question. At any rate, a few leading physicians are beginning to think about this relationship. In his book about Goethe and Medicine (Artemis 1991), Frank Nager goes into particular aspects of individual patients. He uses the example of Goethe's diseases, and of his attitude towards them, to show that there is a significant connection between someone's diseases and what Goethe called one's own law of growth (a supreme guiding principle or entelechy), in other words, the individual element which constitutes the soul and spiritual core of the human being. If there is such a connection, and Rudolf Steiner describes it at great length, no satisfying comparability between patients and groups of patients can exist, even if the age, sex, diagnosis, histology, risk factors, and other parameters are comparable, since they are always influenced by incomparable individual and unique ones. Patients are only conditionally comparable even with respect to comparable parameters because, unlike experiments in the inorganic realm, a countless number of factors are involved which can by no means all be equalized. Realistically, homogeneous groups of patients are an illusion. Even well carried out randomized studies are theoretically irreproducible and, therefore, inconclusive.
Their applicability is therefore only relative, quite apart from the fact that, strictly speaking, the results of a study are always only valid for the kind of patients who were investigated in that study. Real life contains a much broader patient population. Therefore, for purely theoretical reasons alone, randomized studies cannot give the doctor the certainty for which one had originally hoped in concrete, daily decision-making situations.
This is the situation one actually finds today in the entire field of conducting studies. It is probable that more randomized studies have been carried out in oncology, especially on breast cancer, than in any other medical field. The following notable statement by Professor K. Brunner from Berlin in 1983 summarizes what therapy research has arrived at in this area on the basis of the so-called formation of comparable groups.
"In spite of a large number of good, carefully planned randomized studies, (and an even larger number of bad ones), they don't provide us with any firm data to help us decide on the best possible therapy when we are confronted with an individual case. To be sure, every experienced oncologist thinks that he knows when, for which symptom he has to give a patient an aggressive and burdensome chemotherapy, when he should give a relatively mild treatment, or none at all. He can hardly ever base such decisions on the firm results of large randomized studies. One might also say that he could back up almost any therapeutic decision with the results which have been published, since the latter are often contradictory. One can advance the somewhat provocative thesis that in spite of fifteen years of intensive research, oncology is unable to formulate valid recommendations for medical practice in an individual case. A well informed oncologist, as a rule, cannot do this unless he has a large amount of personal experience."
The situation remains the same to this day. If anything, the criticism of experts in the field has increased (Abel, 1990; Hossfeld, 1990). One should note that, in the end, it is the "large amount of personal experience" which is the decisive factor in the assessment of study results. This experience was gained from many individual cases. But originally it was precisely the experience from individual cases which people distrusted and which made them carry out randomized, double-blind studies on "comparable" groups of patients. Leading scientists and practitioners now have to admit that the experience they despised is an indispensable means for gaining medical knowledge. In other words, the results of randomized studies by no means give a certainty which could simply replace the certainty gained through practical experience.
Randomization makes little sense unless the groups of patients are sufficiently comparable, but this comparability is very doubtful for the anthropological reasons stated above. In anthroposophically oriented medicine, one tries to take into account, as much as possible, the interaction of the soul and spirit with the body. This is why therapy, even for typical diseases, is often individualized. Randomization, for methodological reasons, makes little sense because the treatment groups are not homogeneous and, therefore, no more comparable than non-randomized groups. Strictly speaking, randomization would not be able to guarantee the structural equality of groups which is demanded for "proof", even if pebbles were used instead of people with souls and spirits. Structural equality of pebbles would be attained with only a certain probability. Therefore, the result of the study could always be falsely negative or falsely positive with some residual degree of probability.
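The pebble remark can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not part of the original argument: each "pebble" carries a single hypothetical numerical property drawn from a standard normal distribution, the function name `imbalance_probability` and the threshold of 0.5 are chosen purely for illustration, and allocation is by simple shuffling.

```python
import random
import statistics


def imbalance_probability(n_per_group, threshold, trials=2000, seed=0):
    """Estimate how often simple randomization leaves two groups whose
    mean values of one property differ by more than `threshold`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    exceed = 0
    for _ in range(trials):
        # Hypothetical property of each unit (assumption: mean 0, sd 1).
        units = [rng.gauss(0.0, 1.0) for _ in range(2 * n_per_group)]
        rng.shuffle(units)  # random allocation to the two groups
        group_a = units[:n_per_group]
        group_b = units[n_per_group:]
        diff = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
        if diff > threshold:
            exceed += 1
    return exceed / trials


# Share of trials in which the two groups end up noticeably unequal.
small = imbalance_probability(n_per_group=10, threshold=0.5)
large = imbalance_probability(n_per_group=200, threshold=0.5)
print(f"10 per group:  {small:.3f}")
print(f"200 per group: {large:.3f}")
```

In this sketch the share of unequal allocations falls as the groups grow, but equality is only ever attained with a certain probability, and the sketch considers one measurable property only. It says nothing about the qualitative, individual factors discussed above.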
Since one uses inclusion and exclusion criteria to select patients for randomized studies, the latter are not necessarily representative of the overall population, and their results are not necessarily transferable to practical life.
Drop-outs can be conditioned by the treatment. One must, therefore, assess the effect of the medicaments on them both individually and collectively, otherwise the overall result of the study might be falsified. In other words, a randomized study contains an assessment of the individual case and also of the now non-randomized collective.
This is an implicit admission of the existence of two kinds of therapy assessments which is precisely what one wanted to eliminate through the randomization.
From an ethical and a legal point of view, a randomized study is only permissible if the status of a medication is still undecided. If there is good reason to believe that a remedy is effective, one cannot withhold it from patients in the control group. To begin with, one does, of course, have these undecided situations.
However, as a rule, randomized studies cannot convert this undecidedness into a real decidedness. One continually runs into contradictory results in the area of coronary diseases and in cancer treatment, which is where the most randomized studies have been done. This is why Kiene (1991) made the following statement: "One can either find a realistic and directly valid statistical proof of efficacy with a non-randomized study, or one can't find it at all." In this connection, it is interesting that Gehan (1978, 1981) came to the realization that no randomized studies were done at the decisive breakthrough points of chemotherapeutic progress, for instance, in the successful treatment of Hodgkin's disease and of other lymphomas, acute leukemias in children, Wilms' tumor, and scrotal tumors. Feinstein (1984) found a randomized and a non-randomized study in the literature which were done under the same conditions, with the same results.
Another ethical problem which cannot really be reconciled with the thinking of an anthroposophical doctor is the fact that in randomization the therapeutic decision is not made according to the best knowledge and conscience of the doctor, but according to a non-human element, namely, chance. However, according to the law and to his Hippocratic oath, a doctor is obliged to do things for his patient according to his best knowledge and conscience. This is why Begemann (1985) said that in principle randomization is not doctoral. The human relation between doctor and patient is a decisive factor in every therapy. By comparison with the results of non-randomized studies, and of conventional, retrospective investigations, the results of randomized studies have not brought such a terribly great increase in medical knowledge. Therefore, they do not justify the elimination of the human element in the testing of medications. Of course, this applies not only to randomization, but also to blinding. Hellman (1991) thinks that every patient has a fundamental human right to an individual engagement with his doctor, and that even if a patient agrees to be randomized, his doctor is nevertheless obliged to inform him in accordance with his best knowledge and conscience.
Because of the large number of patients in the treatment groups and controls, randomized studies are expensive, and they last a long time. This expense would be justified if one could really achieve more reliable and convincing results for medical practice through these studies than through all others. But as we saw above, this is very doubtful. Sometimes these mammoth studies are even a hindrance to medical progress. For instance, it took thirty years of randomized studies before oncologists had to admit what practitioners had already known before, namely, that chemotherapy had brought practically no therapeutic progress with respect to advanced solid tumors (Abel, 1990; Hossfeld, 1990). This would have been realized much sooner through a series of smaller, one-armed, non-randomized studies with historical comparison and through the acceptance of practical experience. This would have saved a lot of money and spared thousands of patients unnecessary side effects, some of them lethal. That is another ethical problem.
The theoretical basis for blinding lies in two assumptions: that the placebo effects are comparable in the treatment group and in the control group, and that the overall effect is the sum of the effects of the medicament and of the placebo. The true effect of the medicament could therefore be determined through the difference between the treatment group and the control group. But placebo research shows that these assumptions are wrong and that, therefore, the conclusion which is drawn from them is without foundation. This will be shown in the following:
Where interactions between the test persons exist, a kind of adaptation of the placebo effect can occur. The placebo group can take on the reactions of the medication or verum group, whereas the verum group shows less response (Knowles & Lucas 1960, Atoymata 1954). This also occurs in animal experiments. When rats treated with amphetamine are commingled with untreated ones, the amphetamine-induced behavior of the treated rats is diminished, whereas the untreated ones take on some of the treated rats' behavior (McDonald & Heimstra 1965). Thus the psychogenic effect in one group can be the opposite of that in the other one. In this case, the physiogenic effect of the medicament cannot be determined through the difference between the two groups.
Such an adaptation can also result from a situation in which patients are previously informed that they will receive no active substances, and are then given some anyway. In this case the effect of the active substance can be reduced through a skeptical attitude (Skoulund 1991).
Blinding is eliminated as soon as the doctor or the patient knows the effect or side effect of a substance (Noy 1986). Knowledge of such an effect can then strengthen the psychogenic effect, so that the latter can be stronger in the verum group than in the placebo group. Here again the difference between the two groups does not correspond to the actual effect of the medicaments.
Past experience with therapies has a considerable influence on the size of the placebo effect in both a positive and a negative sense.
In one experiment (Bettermann 1968), the placebo effect fluctuated between 17% and 66%, depending on the way the patients had been treated before. This makes one wonder whether it is possible to homogenize patients for a randomized double blind study with respect to previous experiences.
The different psychogenic reactions which were found in various investigations indicate that a medicament's objective effect cannot be established by double-blind studies. "It must be appreciated that the placebo effect can be one of the reasons for failure to recognize a useful drug in a therapeutic trial" (Lasagna et al., 1954).
Therefore, some investigators tried to eliminate the placebo reactors through an introductory placebo washout before the actual studies. But Reimher et al. (1989) showed that the placebo reaction was not the same in the experimental and placebo groups even after such a laundering operation. It is impossible to exclude the hidden placebo responders.
If one tells patients that a medication has a strong effect, or that the effect of another one is questionable, it can lead to fluctuations of up to 30% between the verum group and the placebo group. Since this difference is considered to be the physical effect of the medicament, the latter fluctuates by 30% merely through suggestion (Wied 1953). Thus one can see that it is impossible to objectively determine the pure effect of a medicament through double-blind studies, for the doctor cannot possibly always conceal his opinions about a medicament or placebo when he gives it to his patients. Every kind of human communication can have a suggestive effect. This can only be eliminated by eliminating the doctor and replacing him with a medication dispenser.
Something similar applies to other information which a doctor gives his patients, such as what the general effect of the medication is. Whether or not a medication has a physiogenic effect can sometimes depend on whether the patients are informed about it. In a double-blind study by Penick (1964), phenmetrazine had no anorectic effect if the patients were not informed about the intended effect, whereas the food intake of informed patients declined considerably. However, there was no loss of appetite in the placebo group in spite of information. Thus information about the medication was a promoter of its physiogenic effect, but not of the psychogenic placebo effect.
One sees that the interaction between physiogenic and psychogenic effect is considerably more complicated than theorists think, and that the overall effect on a patient can by no means be determined by simply adding the physiogenic and psychogenic effects.
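The reasoning behind the difference method can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `group_difference` and all numbers are hypothetical, chosen as percentage points of responders, and are not taken from any of the studies cited. It shows that the difference between the groups equals the medicament's own effect only if the psychogenic component is the same in both groups, which is precisely the assumption that the findings above call into question.

```python
def group_difference(drug_effect, psychogenic_verum, psychogenic_placebo):
    """Difference between the groups as the simple additive model computes it.

    All arguments are hypothetical response rates in percentage points.
    """
    verum_total = drug_effect + psychogenic_verum  # what is observed under verum
    placebo_total = psychogenic_placebo            # what is observed under placebo
    return verum_total - placebo_total


# Assumed "true" physiogenic effect of 20 points in every scenario.
equal_placebo = group_difference(20, 30, 30)       # psychogenic parts cancel
stronger_in_verum = group_difference(20, 40, 25)   # e.g. side effects break the blinding
weaker_in_verum = group_difference(20, 15, 30)     # e.g. adaptation between the groups

print(equal_placebo, stronger_in_verum, weaker_in_verum)
```

Only the first scenario returns the assumed 20 points. In the other two, the same medicament appears stronger or weaker merely because the psychogenic component differs between the groups. The sketch still treats the two components as separable within a group, which, according to the findings described above, is itself too simple.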
It is also impossible to eliminate the doctor's influence through a double-blind study. Uhlehuth et al. (1959) reported on a double-blind study, with sleeping medicines and placebos, which was carried out by two doctors. One had confidence in the pharmacotherapy, while the other was skeptical. In spite of the double-blind design, the doctor with the positive attitude attained a significant difference, whereas the skeptical one did not. It is also interesting that it was the effect of the medication, and not the effect of the placebo, which was increased for the first doctor. Lehmann (1964) said the following about this: "It is nevertheless surprising that the therapists' attitudes so strongly and deceptively broke through the very barrier which had to be assumed to be a foolproof protection against attitudinal contamination, namely, the double-blind drug control."
Thus blinding does not permit one to distinguish the physiogenic and psychogenic effects of medications sufficiently, nor does it enable one to eliminate the influence of the doctor. Because of the different development of so-called psychogenic effects in the test group and the control group, it might even cover up the effects of the medicaments. Blinding does not provide the certainty which it was supposed to give.
Blinding and randomization are ethically problematical because they consciously interfere with the relationship between doctor and patient. They are also not justifiable from a factual point of view because this interference does not result in any certainty.
Randomized double-blind studies were based on the model of natural scientific experiments, and they were supposed to give proofs. However, they do not give the certainty for which one had hoped. For medical and anthropological reasons, the comparability of patients and groups is always only relative. This is why randomization does not guarantee the equality of structure which is necessary for proof. Such studies are not really reproducible. In practice, randomization does not help doctors to make decisions about individual cases or about the unrandomized collective. They basically have to rely upon their own judgment. Blinding is based on the unjustified assumption of an additive relation of the physiogenic and psychogenic effects of medicaments. Placebo research has shown that this relation is not additive, but considerably more complex. The pure effect of the medication cannot be derived as a difference between the test group and the control group. Blinding can even conceal the physiogenic effect of a medication. Randomized double-blind studies were also supposed to eliminate the influence of the doctor, but they have not been able to do this. Randomized and blinded studies do not provide the reliability, validity, and applicability which one expects of a proof which is supposed to be relevant to practice.
In addition, there are ethical objections because in randomization and blinding one gives up medical thinking in favor of a scientific one. Since one gains very little from such scientific studies when compared with the uncertainties of non-randomized studies and of individual decisions, this abandonment of medical thinking cannot be justified. This is especially true since the patient has a certain fundamental right to this kind of thinking. In other words, randomized double-blind studies cannot be made into a standard for medication testing. They do not correspond to the real nature of the human being, nor to the direct, individual relation of doctor to patient.
One should work at a study culture which reckons completely with man's nature, and with the doctor/patient relationship that necessarily belongs to this. The studies should be done under real conditions, not artificial conditions. This would involve the following:
1. The need of the patient for a particular doctor or a particular form of therapy should not be suppressed, so that the patient's will to get better is not disturbed from the start. The assignment of forms of therapy or doctors should not occur by chance, but through the appropriate information and in accordance with the patient's own choice. (The placebo expert should have no objection to this. On the basis of his own presuppositions, he would have to say that the positive will to get better in one of the groups would have to average out statistically with the positive will to get better in the other group.)
2. In comparative studies one would have to compare patients with the same diagnoses. One would try to get the greatest possible comparability with respect to quantitative parameters. One should keep the relativity of this comparison and the relativity of the results in mind and accept these from the start, for one is comparing patients who are only approximately equal. One should also describe the qualitative and particular aspects of each patient, and here one must consciously accept that the concepts which refer to the particular patients cannot be generalized, making an individual assessment necessary in addition to the collective one. One must also accept the fact that concepts which refer to the patient's subjective feelings or to his/her soul and spiritual aspects are necessarily indistinct by comparison with quantitative concepts; thus some observations may have to be expressed in pictures rather than in concepts. One will also be unable to avoid the fact that individual assessors focus on different aspects of the same phenomena. Hence, it is a question of the doctor taking hold of the actual situation which he has before him. The situation will then include the qualitative and non-comparable elements as well as the quantitative and comparable ones. One will then be able to check the former intersubjectively if they are based on actual observations which are put into concepts which correspond with the observations.
3. One should document the individual cases as carefully as possible. Similar individual cases can then be compiled into a non-randomized collective in which the quantitative and comparable elements can be statistically evaluated as one part. One-armed studies can be done and then historically compared with others. Within this study design, one can compare individual patients with others or with the collective. One will accept the fact that the statistical results are necessarily relative due to the nature of the human being.
4. Such studies would be considerably less expensive than randomized ones, and for the reasons mentioned, one should be able to discover just as much through them. The compliance of the patients and the motivation of the doctors should be considerably greater than in randomized studies. The ethical problems in randomization and blinding would be eliminated. The studies would correspond to everyday medical practice and their results would, therefore, be easier to apply to this practice than those from randomized studies.
5. The documentation should include:
— The quantitative parameters before, during and after therapy at meaningful intervals.
— Qualitative aspects and individual peculiarities
— The use of semi-quantitative, comparative scales for qualitative aspects can be useful, but they cannot replace individual descriptions of individual aspects. The patient could be of great help here if he kept a diary.
— The connection between pathology and therapy, that is, the establishment of therapeutic measures should be described as clearly as possible in accordance with the rationale of the medical system which is being used.
— The connection between therapy and course of disease should be clear from the documentation, for example, through the following elements:
temporal connection
rhythmic temporal connection
comparison of the patient's subjective descriptions with the doctor's objective assessment
agreement of the course with the rationale of the therapeutic application
comparison of the course with spontaneous healings or with typical courses in the literature
the connection between dose and effect
the comparison of consecutive, possibly unsuccessful applications with others in which there was an effect on the patient
the existence of a medicinal effect in contrast to a placebo effect could be established in the individual patient through comparisons such as the ones one often finds in homeopathic casuistics
The collective comparison of many individual cases could then confirm the regularity of the medicinal effect. This is, for instance, how the greater part of the homeopathic materia medica arose.
6. It is obvious that under the above conditions one could select different therapeutic measures even for the same diagnoses and for comparable quantitative or semi-quantitative parameters. The therapy should develop in accordance with the overall assessment of the individual case by the individual doctor. It should also be accepted that various measures can be used to arrive at a goal. For instance, even school medicine uses a variety of medicaments with quite different active principles in order to reduce high blood pressure. "Standardized therapies for standardized diagnoses" contradicts a medicine which is in accordance with the human being.
7. The directors of NRP 34 should take into account the prototypic change from an abstract, disease-oriented medicine to an individualizing, patient-centered one, which has recently been seriously discussed at large medical congresses (Anschvetz 1991). An art of healing which is moving in this direction can only be tested if the corresponding methods can be carried out in a really individual and patient-centered way, in correspondence with the conditions of everyday practice.