Sigmund Freud (1856—1939)

Sigmund Freud, the father of psychoanalysis, was a physiologist, medical doctor, psychologist, and influential thinker of the early twentieth century. Working initially in close collaboration with Josef Breuer, Freud elaborated the theory that the mind is a complex energy-system, the structural investigation of which is the proper province of psychology. He articulated and refined the concepts of the unconscious, infantile sexuality and repression, and he proposed a tripartite account of the mind’s structure—all as part of a radically new conceptual and therapeutic frame of reference for the understanding of human psychological development and the treatment of abnormal mental conditions. Notwithstanding the multiple manifestations of psychoanalysis as it exists today, it can in almost all fundamental respects be traced directly back to Freud’s original work.

Freud’s innovative treatment of human actions, dreams, and indeed of cultural artifacts as invariably possessing implicit symbolic significance has proven to be extraordinarily fruitful, and has had massive implications for a wide variety of fields including psychology, anthropology, semiotics, and artistic creativity and appreciation. However, Freud’s most important and frequently reiterated claim, that with psychoanalysis he had invented a successful science of the mind, remains the subject of much critical debate and controversy.

Table of Contents

  1. Life
  2. Backdrop to His Thought
  3. The Theory of the Unconscious
  4. Infantile Sexuality
  5. Neuroses and the Structure of the Mind
  6. Psychoanalysis as a Therapy
  7. Critical Evaluation of Freud
    1. The Claim to Scientific Status
    2. The Coherence of the Theory
    3. Freud’s Discovery
    4. The Efficacy of Psychoanalytic Therapy
  8. References and Further Reading
    1. Works by Freud
    2. Works on Freud and Freudian Psychoanalysis

1. Life

Freud was born in Freiberg, Moravia in 1856, but when he was four years old his family moved to Vienna, where he was to live and work until the last years of his life. In 1938 the Nazis annexed Austria, and Freud, who was Jewish, was allowed to leave for England. It was thus above all with the city of Vienna that Freud’s name was destined to be deeply associated for posterity, founding as he did what was to become known as the first Viennese school of psychoanalysis, from which flowed psychoanalysis as a movement and all subsequent developments in this field. The scope of Freud’s interests, and of his professional training, was very broad. He always considered himself first and foremost a scientist, endeavoring to extend the compass of human knowledge, and to this end (rather than to the practice of medicine) he enrolled at the medical school at the University of Vienna in 1873. He concentrated initially on biology, doing research in physiology for six years under the great German scientist Ernst Brücke, who was director of the Physiology Laboratory at the University, and thereafter specialized in neurology. He received his medical degree in 1881, and having become engaged to be married in 1882, he rather reluctantly took up more secure and financially rewarding work as a doctor at Vienna General Hospital. Shortly after his marriage in 1886, which was extremely happy and gave Freud six children—the youngest of whom, Anna, was herself to become a distinguished psychoanalyst—Freud set up a private practice in the treatment of psychological disorders, which provided him with much of the clinical material on which he based his theories and pioneering techniques.

In 1885-86, Freud spent the greater part of a year in Paris, where he was deeply impressed by the work of the French neurologist Jean-Martin Charcot, who was at that time using hypnotism to treat hysteria and other abnormal mental conditions. When he returned to Vienna, Freud experimented with hypnosis but found that its beneficial effects did not last. At this point he decided to adopt instead a method suggested by the work of an older Viennese colleague and friend, Josef Breuer, who had discovered that when he encouraged a hysterical patient to talk uninhibitedly about the earliest occurrences of the symptoms, the symptoms sometimes gradually abated. Working with Breuer, Freud formulated and developed the idea that many neuroses (phobias, hysterical paralysis and pains, some forms of paranoia, and so forth) had their origins in deeply traumatic experiences which had occurred in the patient’s past but which were now forgotten—hidden from consciousness. The treatment was to enable the patient to recall the experience to consciousness, to confront it in a deep way both intellectually and emotionally, and in thus discharging it, to remove the underlying psychological causes of the neurotic symptoms. This technique, and the theory from which it is derived, was given its classical expression in Studies on Hysteria, jointly published by Freud and Breuer in 1895.

Shortly thereafter, however, Breuer found that he could not agree with what he regarded as the excessive emphasis which Freud placed upon the sexual origins and content of neuroses, and the two parted company, with Freud continuing to work alone to develop and refine the theory and practice of psychoanalysis. In 1900, after a protracted period of self-analysis, he published The Interpretation of Dreams, which is generally regarded as his greatest work. This was followed in 1901 by The Psychopathology of Everyday Life, and in 1905 by Three Essays on the Theory of Sexuality. Freud’s psychoanalytic theory was initially not well received—when its existence was acknowledged at all, it was usually by people who were, as Breuer had foreseen, scandalized by the emphasis placed on sexuality by Freud. It was not until 1908, when the first International Psychoanalytical Congress was held at Salzburg, that Freud’s importance began to be generally recognized. This was greatly facilitated in 1909, when he was invited to give a course of lectures in the United States, which were to form the basis of his 1910 book Five Lectures on Psycho-Analysis. From this point on Freud’s reputation and fame grew enormously, and he continued to write prolifically until his death, producing in all more than twenty volumes of theoretical works and clinical studies. He was also not averse to critically revising his views, or to making fundamental alterations to his most basic principles when he considered that the scientific evidence demanded it—this was most clearly evidenced by his advancement of a completely new tripartite (id, ego, and super-ego) model of the mind in his 1923 work The Ego and the Id. He was initially greatly heartened by attracting followers of the intellectual caliber of Adler and Jung, and was correspondingly disappointed when they both went on to found rival schools of psychoanalysis—thus giving rise to the first two of many schisms in the movement—but he knew that such disagreement over basic principles had been part of the early development of every new science. After a life of remarkable vigor and creative productivity, he died of cancer while exiled in England in 1939.

2. Backdrop to His Thought

Although a highly original thinker, Freud was also deeply influenced by a number of diverse factors which overlapped and interconnected with each other to shape the development of his thought. As indicated above, both Charcot and Breuer had a direct and immediate impact upon him, but some of the other factors, though no less important than these, were of a rather different nature. First of all, Freud himself was very much a Freudian—his father had two sons by a previous marriage, Emmanuel and Philip, and the young Freud often played with Emmanuel’s son John, who was his own age. Freud’s self-analysis, which forms the core of his masterpiece The Interpretation of Dreams, originated in the emotional crisis which he suffered on the death of his father and the series of dreams to which this gave rise. This analysis revealed to him that the love and admiration which he had felt for his father were mixed with very contrasting feelings of shame and hate (such a mixed attitude he termed ambivalence). Particularly revealing was his discovery that he had often fantasized as a youth that his half-brother Philip (who was of an age with his mother) was really his father, and certain other signs convinced him of the deep underlying meaning of this fantasy—that he had wished his real father dead because he was his rival for his mother’s affections. This was to become the personal (though by no means exclusive) basis for his theory of the Oedipus complex.

Secondly, and at a more general level, account must be taken of the contemporary scientific climate in which Freud lived and worked. In most respects, the towering figure of nineteenth-century science was Charles Darwin, who had published his revolutionary Origin of Species when Freud was three years old. The evolutionary doctrine radically altered the prevailing conception of man—whereas before, man had been seen as a being different in nature from the members of the animal kingdom by virtue of his possession of an immortal soul, he was now seen as being part of the natural order, different from non-human animals only in degree of structural complexity. This made it possible and plausible, for the first time, to treat man as an object of scientific investigation, and to conceive of the vast and varied range of human behavior, and the motivational causes from which it springs, as being amenable in principle to scientific explanation. Much of the creative work done in a whole variety of diverse scientific fields over the next century was to be inspired by, and derive sustenance from, this new world-view, which Freud, with his enormous esteem for science, accepted implicitly.

An even more important influence on Freud, however, came from the field of physics. The second half of the nineteenth century saw monumental advances in contemporary physics, which were largely initiated by the formulation of the principle of the conservation of energy by Helmholtz. This principle states, in effect, that the total amount of energy in any given physical system is always constant, that energy can be transformed but not annihilated, and that consequently when energy is moved from one part of the system, it must reappear in another part. The progressive application of this principle led to major discoveries in the fields of thermodynamics, electromagnetism and nuclear physics which, with their associated technologies, have so comprehensively transformed the contemporary world. As we have seen, when he first came to the University of Vienna, Freud worked under the direction of Ernst Brücke, who in 1873-74 published his Lecture Notes on Physiology (Vorlesungen über Physiologie. Vienna: Wilhelm Braumüller), setting out the view that all living organisms, including humans, are essentially energy-systems to which, no less than to inanimate objects, the principle of the conservation of energy applies. Freud, who had great admiration and respect for Brücke, quickly adopted this new dynamic physiology with enthusiasm. From there it was but a short conceptual step—but one which Freud was the first to take, and on which his claim to fame is largely grounded—to the view that there is such a thing as psychic energy, that the human personality is also an energy-system, and that it is the function of psychology to investigate the modifications, transmissions and conversions of psychic energy within the personality which shape and determine it. This latter conception is the very cornerstone of Freud’s psychoanalytic theory.

3. The Theory of the Unconscious

Freud’s theory of the unconscious, then, is highly deterministic—a fact which, given the nature of nineteenth century science, should not be surprising. Freud was arguably the first thinker to apply deterministic principles systematically to the sphere of the mental, and to hold that the broad spectrum of human behavior is explicable only in terms of the (usually hidden) mental processes or states which determine it. Thus, instead of treating the behavior of the neurotic as being causally inexplicable—which had been the prevailing approach for centuries—Freud insisted, on the contrary, on treating it as behavior for which it is meaningful to seek an explanation by searching for causes in terms of the mental states of the individual concerned. Hence the significance which he attributed to slips of the tongue or pen, obsessive behavior and dreams—all these, he held, are determined by hidden causes in the person’s mind, and so they reveal in covert form what would otherwise not be known at all. This suggests the view that freedom of the will is, if not completely an illusion, certainly more tightly circumscribed than is commonly believed, for it follows from this that whenever we make a choice we are governed by hidden mental processes of which we are unaware and over which we have no control.

The postulate that there are such things as unconscious mental states at all is a direct function of Freud’s determinism, his reasoning here being simply that the principle of causality requires that such mental states should exist, for it is evident that there is frequently nothing in the conscious mind which can be said to cause neurotic or other behavior. An unconscious mental process or event, for Freud, is not one which merely happens to be out of consciousness at a given time, but is rather one which cannot, except through protracted psychoanalysis, be brought to the forefront of consciousness. The postulation of such unconscious mental states entails, of course, that the mind is not, and cannot be, either identified with consciousness, or an object of consciousness. To employ a much-used analogy, it is rather structurally akin to an iceberg, the bulk of it lying below the surface, exerting a dynamic and determining influence upon the part which is amenable to direct inspection—the conscious mind.

Deeply associated with this view of the mind is Freud’s account of instincts or drives. Instincts, for Freud, are the principal motivating forces in the mental realm, and as such they energize the mind in all of its functions. There are, he held, an indefinitely large number of such instincts, but these can be reduced to a small number of basic ones, which he grouped into two broad generic categories, Eros (the life instinct), which covers all the self-preserving and erotic instincts, and Thanatos (the death instinct), which covers all the instincts towards aggression, self-destruction, and cruelty. Thus it is a mistake to interpret Freud as asserting that all human actions spring from motivations which are sexual in their origin, since those which derive from Thanatos are not sexually motivated—indeed, Thanatos is the irrational urge to destroy the source of all sexual energy in the annihilation of the self. Having said that, it is undeniably true that Freud gave sexual drives an importance and centrality in human life, human actions, and human behavior which was new (and to many, shocking), arguing as he does that sexual drives exist and can be discerned in children from birth (the theory of infantile sexuality), and that sexual energy (libido) is the single most important motivating force in adult life. However, a crucial qualification has to be added here—Freud effectively redefined the term sexuality to make it cover any form of pleasure which is or can be derived from the body. Thus his theory of the instincts or drives is essentially that the human being is energized or driven from birth by the desire to acquire and enhance bodily pleasure.

4. Infantile Sexuality

Freud’s theory of infantile sexuality must be seen as an integral part of a broader developmental theory of human personality. This had its origins in, and was a generalization of, Breuer’s earlier discovery that traumatic childhood events could have devastating negative effects upon the adult individual, and took the form of the general thesis that early childhood sexual experiences were the crucial factors in the determination of the adult personality. From his account of the instincts or drives it followed that from the moment of birth the infant is driven in his actions by the desire for bodily/sexual pleasure, where this is seen by Freud in almost mechanical terms as the desire to release mental energy. Initially, infants gain such release, and derive such pleasure, from the act of sucking. Freud accordingly terms this the oral stage of development. This is followed by a stage in which the locus of pleasure or energy release is the anus, particularly in the act of defecation, and this is accordingly termed the anal stage. Then the young child develops an interest in its sexual organs as a site of pleasure (the phallic stage), and develops a deep sexual attraction for the parent of the opposite sex, and a hatred of the parent of the same sex (the Oedipus complex). This, however, gives rise to (socially derived) feelings of guilt in the child, who recognizes that it can never supplant the stronger parent. A male child also perceives himself to be at risk. He fears that if he persists in pursuing the sexual attraction for his mother, he may be harmed by the father; specifically, he comes to fear that he may be castrated. This is termed castration anxiety. Both the attraction for the mother and the hatred are usually repressed, and the child usually resolves the conflict of the Oedipus complex by coming to identify with the parent of the same sex. This happens at the age of five, whereupon the child enters a latency period, in which sexual motivations become much less pronounced. This lasts until puberty when mature genital development begins, and the pleasure drive refocuses around the genital area.

This, Freud believed, is the sequence or progression implicit in normal human development, and it is to be observed that at the infant level the instinctual attempts to satisfy the pleasure drive are frequently checked by parental control and social coercion. The developmental process, then, is for the child essentially a movement through a series of conflicts, the successful resolution of which is crucial to adult mental health. Many mental illnesses, particularly hysteria, Freud held, can be traced back to unresolved conflicts experienced at this stage, or to events which otherwise disrupt the normal pattern of infantile development. For example, homosexuality is seen by some Freudians as resulting from a failure to resolve the conflicts of the Oedipus complex, particularly a failure to identify with the parent of the same sex; the obsessive concern with washing and personal hygiene which characterizes the behavior of some neurotics is seen as resulting from unresolved conflicts/repressions occurring at the anal stage.

5. Neuroses and the Structure of the Mind

Freud’s account of the unconscious, and the psychoanalytic therapy associated with it, is best illustrated by his famous tripartite model of the structure of the mind or personality (although, as we have seen, he did not formulate this until 1923). This model has many points of similarity with the account of the mind offered by Plato over 2,000 years earlier. The theory is termed tripartite simply because, again like Plato, Freud distinguished three structural elements within the mind, which he called id, ego, and super-ego. The id is that part of the mind in which are situated the instinctual sexual drives which require satisfaction; the super-ego is that part which contains the conscience, namely, socially-acquired control mechanisms which have been internalized, and which are usually imparted in the first instance by the parents; while the ego is the conscious self that is created by the dynamic tensions and interactions between the id and the super-ego and has the task of reconciling their conflicting demands with the requirements of external reality. It is in this sense that the mind is to be understood as a dynamic energy-system. All objects of consciousness reside in the ego; the contents of the id belong permanently to the unconscious mind; while the super-ego is an unconscious screening-mechanism which seeks to limit the blind pleasure-seeking drives of the id by the imposition of restrictive rules. There is some debate as to how literally Freud intended this model to be taken (he appears to have taken it extremely literally himself), but it is important to note that what is being offered here is indeed a theoretical model rather than a description of an observable object, which functions as a frame of reference to explain the link between early childhood experience and the mature adult (normal or dysfunctional) personality.

Freud also followed Plato in his account of the nature of mental health or psychological well-being, which he saw as the establishment of a harmonious relationship between the three elements which constitute the mind. If the external world offers no scope for the satisfaction of the id’s pleasure drives, or more commonly, if the satisfaction of some or all of these drives would indeed transgress the moral sanctions laid down by the super-ego, then an inner conflict occurs in the mind between its constituent parts or elements. Failure to resolve this can lead to later neurosis. A key concept introduced here by Freud is that the mind possesses a number of defense mechanisms to attempt to prevent conflicts from becoming too acute, such as repression (pushing conflicts back into the unconscious), sublimation (channeling the sexual drives into the achievement of socially acceptable goals, in art, science, poetry, and so forth), fixation (the failure to progress beyond one of the developmental stages), and regression (a return to the behavior characteristic of one of the stages).

Of these, repression is the most important, and Freud’s account of this is as follows: when a person experiences an instinctual impulse to behave in a manner which the super-ego deems to be reprehensible (for example, a strong erotic impulse on the part of the child towards the parent of the opposite sex), then it is possible for the mind to push this impulse away, to repress it into the unconscious. Repression is thus one of the central defense mechanisms by which the ego seeks to avoid internal conflict and pain, and to reconcile reality with the demands of both id and super-ego. As such it is completely normal and an integral part of the developmental process through which every child must pass on the way to adulthood. However, the repressed instinctual drive, as an energy-form, is not and cannot be destroyed when it is repressed—it continues to exist intact in the unconscious, from where it exerts a determining force upon the conscious mind, and can give rise to the dysfunctional behavior characteristic of neuroses. This is one reason why dreams and slips of the tongue possess such a strong symbolic significance for Freud, and why their analysis became such a key part of his treatment—they represent instances in which the vigilance of the super-ego is relaxed, and when the repressed drives are accordingly able to present themselves to the conscious mind in a transmuted form. The difference between normal repression and the kind of repression which results in neurotic illness is one of degree, not of kind—the compulsive behavior of the neurotic is itself a manifestation of an instinctual drive repressed in childhood. Such behavioral symptoms are highly irrational (and may even be perceived as such by the neurotic), but are completely beyond the control of the subject because they are driven by the now unconscious repressed impulse. Freud located the key repressions, for both the normal individual and the neurotic, in the first five years of childhood, and, of course, held them to be essentially sexual in nature, since, as we have seen, repressions which disrupt the process of infantile sexual development in particular lead, on his account, to a strong tendency to neurosis in adult life. The task of psychoanalysis as a therapy is to find the repressions which cause the neurotic symptoms by delving into the unconscious mind of the subject, and by bringing them to the forefront of consciousness, to allow the ego to confront them directly and thus to discharge them.

6. Psychoanalysis as a Therapy

Freud’s account of the sexual genesis and nature of neuroses led him naturally to develop a clinical method for treating such disorders. This has become so influential today that when people speak of psychoanalysis they frequently refer exclusively to the clinical treatment; however, the term properly designates both the clinical treatment and the theory which underlies it. The aim of the method may be stated simply in general terms—to re-establish a harmonious relationship between the three elements which constitute the mind by excavating and resolving unconscious repressed conflicts. The actual method of treatment pioneered by Freud grew out of Breuer’s earlier discovery, mentioned above, that when a hysterical patient was encouraged to talk freely about the earliest occurrences of her symptoms and fantasies, the symptoms began to abate, and were eliminated entirely when she was induced to remember the initial trauma which occasioned them. Turning away from his early attempts to explore the unconscious through hypnosis, Freud further developed this talking cure, acting on the assumption that the repressed conflicts were buried in the deepest recesses of the unconscious mind. Accordingly, he got his patients to relax in a position in which they were deprived of strong sensory stimulation, and even of keen awareness of the presence of the analyst (hence the famous use of the couch, with the analyst virtually silent and out of sight), and then encouraged them to speak freely and uninhibitedly, preferably without forethought, in the belief that he could thereby discern the unconscious forces lying behind what was said. This is the method of free-association, the rationale for which is similar to that involved in the analysis of dreams—in both cases the super-ego is to some degree disarmed, its efficiency as a screening mechanism is moderated, and material is allowed to filter through to the conscious ego which would otherwise be completely repressed. The process is necessarily a difficult and protracted one, and it is therefore one of the primary tasks of the analyst to help the patient recognize, and overcome, his own natural resistances, which may exhibit themselves as hostility towards the analyst. However, Freud always took the occurrence of resistance as a sign that he was on the right track in his assessment of the underlying unconscious causes of the patient’s condition. The patient’s dreams are of particular interest, for reasons which we have already partly seen. Taking it that the super-ego functioned less effectively in sleep, as in free-association, Freud made a distinction between the manifest content of a dream (what the dream appeared to be about on the surface) and its latent content (the unconscious, repressed desires or wishes which are its real object). The correct interpretation of the patient’s dreams, slips of the tongue, free-associations, and responses to carefully selected questions leads the analyst to a point where he can locate the unconscious repressions producing the neurotic symptoms, invariably in terms of the patient’s passage through the sexual developmental process, the manner in which the conflicts implicit in this process were handled, and the libidinal content of the patient’s family relationships. To effect a cure, the analyst must enable the patient himself to become conscious of unresolved conflicts buried in the deep recesses of the unconscious mind, and to confront and engage with them directly.

In this sense, then, the object of psychoanalytic treatment may be said to be a form of self-understanding—once this is acquired it is largely up to the patient, in consultation with the analyst, to determine how he shall handle this newly-acquired understanding of the unconscious forces which motivate him. One possibility, mentioned above, is the channeling of sexual energy into the achievement of social, artistic or scientific goals—this is sublimation, which Freud saw as the motivating force behind most great cultural achievements. Another possibility would be the conscious, rational control of formerly repressed drives—this is suppression. Yet another would be the decision that it is the super-ego and the social constraints which inform it that are at fault, in which case the patient may decide in the end to satisfy the instinctual drives. But in all cases the cure is created essentially by a kind of catharsis or purgation—a release of the pent-up psychic energy, the constriction of which was the basic cause of the neurotic illness.

7. Critical Evaluation of Freud

It should be evident from the foregoing why psychoanalysis in general, and Freud in particular, have exerted such a strong influence upon the popular imagination in the Western world, and why both the theory and practice of psychoanalysis should remain the object of a great deal of controversy. In fact, the controversy which exists in relation to Freud is more heated and multi-faceted than that relating to virtually any other post-1850 thinker (a possible exception being Darwin), with criticisms ranging from the contention that Freud’s theory was generated by logical confusions arising out of his alleged long-standing addiction to cocaine (see Thornton, E.M. Freud and Cocaine: The Freudian Fallacy) to the view that he made an important, but grim, empirical discovery which he knowingly suppressed in favor of the theory of the unconscious, because he knew the latter would be more socially acceptable (see Masson, J. The Assault on Truth).

It should be emphasized here that Freud’s genius is not (generally) in doubt, but the precise nature of his achievement is still the source of much debate. The supporters and followers of Freud (and Jung and Adler) are noted for the zeal and enthusiasm with which they espouse the doctrines of the master, to the point where many of the detractors of the movement see it as a kind of secular religion, requiring as it does an initiation process in which the aspiring psychoanalyst must himself first be analyzed. In this way, it is often alleged, the unquestioning acceptance of a set of ideological principles becomes a necessary precondition for acceptance into the movement—as with most religious groupings. In reply, the exponents and supporters of psychoanalysis frequently analyze the motivations of their critics in terms of the very theory which those critics reject. And so the debate goes on.

Here we will confine ourselves to: (a) the evaluation of Freud’s claim that his theory is a scientific one, (b) the question of the theory’s coherence, (c) the dispute concerning what, if anything, Freud really discovered, and (d) the question of the efficacy of psychoanalysis as a treatment for neurotic illnesses.

a. The Claim to Scientific Status

This is a crucially important issue since Freud saw himself first and foremost as a pioneering scientist, and repeatedly asserted that the significance of psychoanalysis is that it is a new science, incorporating a new scientific method of dealing with the mind and with mental illness. There can, moreover, be no doubt but that this has been the chief attraction of the theory for most of its advocates since then—on the face of it, it has the appearance of being not just a scientific theory but an enormously strong one, with the capacity to accommodate, and explain, every possible form of human behavior. However, it is precisely this latter which, for many commentators, undermines its claim to scientific status. On the question of what makes a theory a genuinely scientific one, Karl Popper’s criterion of demarcation, as it is called, has now gained very general acceptance: namely, that every genuine scientific theory must be testable, and therefore falsifiable, at least in principle. In other words, if a theory is incompatible with possible observations, it is scientific; conversely, a theory which is compatible with all possible observations is unscientific (see Popper, K. The Logic of Scientific Discovery). Thus the principle of the conservation of energy (physical, not psychic), which influenced Freud so greatly, is a scientific one because it is falsifiable—the discovery of a physical system in which the total amount of physical energy was not constant would conclusively show it to be false. It is argued that nothing of the kind is possible with respect to Freud’s theory—it is not falsifiable. If the question is asked: “What does this theory imply which, if false, would show the whole theory to be false?,” the answer is “Nothing” because the theory is compatible with every possible state of affairs. Hence it is concluded that the theory is not scientific, and while this does not, as some critics claim, rob it of all value, it certainly diminishes its intellectual status as projected by its strongest advocates, including Freud himself.

b. The Coherence of the Theory

A related (but perhaps more serious) point is that the coherence of the theory is, at the very least, questionable. What is attractive about the theory, even to the layman, is that it seems to offer us long sought-after and much needed causal explanations for conditions which have been a source of a great deal of human misery. The thesis that neuroses are caused by unconscious conflicts buried deep in the unconscious mind in the form of repressed libidinal energy would appear to offer us, at last, an insight into the causal mechanism underlying these abnormal psychological conditions as they are expressed in human behavior, and further to show us how they are related to the psychology of the normal person. However, even this is questionable, and is a matter of much dispute. In general, when it is said that an event X causes another event Y to happen, both X and Y are, and must be, independently identifiable. It is true that this is not always a simple process, as in science causes are sometimes unobservable (sub-atomic particles, radio and electromagnetic waves, molecular structures, and so forth), but in these latter cases there are clear correspondence rules connecting the unobservable causes with observable phenomena. The difficulty with Freud’s theory is that it offers us entities (for example, repressed unconscious conflicts) which are said to be the unobservable causes of certain forms of behavior. But there are no correspondence rules for these alleged causes—they cannot be identified except by reference to the behavior which they are said to cause (that is, the analyst does not demonstratively assert: “This is the unconscious cause, and that is its behavioral effect;” rather he asserts: “This is the behavior, therefore its unconscious cause must exist”), and this does raise serious doubts as to whether Freud’s theory offers us genuine causal explanations at all.

c. Freud’s Discovery

At a less theoretical, but no less critical level, it has been alleged that Freud did make a genuine discovery which he was initially prepared to reveal to the world. However, the response he encountered was so ferociously hostile that he masked his findings and offered his theory of the unconscious in its place (see Masson, J. The Assault on Truth). What he discovered, it has been suggested, was the extreme prevalence of child sexual abuse, particularly of young girls (the vast majority of hysterics are women), even in respectable nineteenth century Vienna. He did in fact offer an early seduction theory of neuroses, which met with fierce animosity, and which he quickly withdrew and replaced with the theory of the unconscious. As one contemporary Freudian commentator explains it, Freud’s change of mind on this issue came about as follows:

Questions concerning the traumas suffered by his patients seemed to reveal [to Freud] that Viennese girls were extraordinarily often seduced in very early childhood by older male relatives. Doubt about the actual occurrence of these seductions was soon replaced by certainty that it was descriptions about childhood fantasy that were being offered. (MacIntyre).

In this way, it is suggested, the theory of the Oedipus complex was generated.

This statement raises a number of questions, not least: what does the expression “extraordinarily often” mean in this context? By what standard is this being judged? The answer can only be: by the standard of what we generally believe—or would like to believe—to be the case. But the contention of some of Freud’s critics here is that his patients were not recalling childhood fantasies, but traumatic events from their childhood which were all too real. Freud, according to them, had stumbled upon and knowingly suppressed the fact that the level of child sexual abuse in society is much higher than is generally believed or acknowledged. If this contention is true—and it must at least be contemplated seriously—then this is undoubtedly the most serious criticism that Freud and his followers have to face.

Further, this particular point has taken on an added and even more controversial significance in recent years, with the willingness of some contemporary Freudians to combine the theory of repression with an acceptance of the widespread social prevalence of child sexual abuse. The result has been that in the United States and Britain in particular, many thousands of people have emerged from analysis with recovered memories of alleged childhood sexual abuse by their parents; memories which, it is suggested, were hitherto repressed. On this basis, parents have been accused and repudiated, and whole families have been divided or destroyed. Unsurprisingly, this in turn has given rise to a systematic backlash in which organizations of accused parents, seeing themselves as the true victims of what they term False Memory Syndrome, have denounced all such memory-claims as falsidical—the direct product of a belief in what they see as the myth of repression (see Pendergrast, M. Victims of Memory). In this way, the concept of repression, which Freud himself termed the foundation stone upon which the structure of psychoanalysis rests, has come in for more widespread critical scrutiny than ever before. Here, two facts are frequently lost sight of in the extreme heat generated by this debate, perhaps understandably: that, unlike some of his contemporary followers, Freud did not himself ever countenance the extension of the concept of repression to cover actual child sexual abuse, and that we are not necessarily forced to choose between the views that all recovered memories are either veridical or falsidical.

d. The Efficacy of Psychoanalytic Therapy

It does not follow that, if Freud’s theory is unscientific, or even false, it cannot provide us with a basis for the beneficial treatment of neurotic illness, because the relationship between a theory’s truth or falsity and its utility-value is far from being a straightforward one. The theory upon which the use of leeches to bleed patients in eighteenth century medicine was based was quite spurious, but patients did sometimes actually benefit from the treatment! And of course even a true theory might be badly applied, leading to negative consequences. One of the problems here is that it is difficult to specify what counts as a cure for a neurotic illness as distinct, say, from a mere alleviation of the symptoms. In general, however, the efficacy of a given method of treatment is usually clinically measured by means of a control group—the proportion of patients suffering from a given disorder who are cured by treatment X is measured by comparison with those cured by other treatments, or by no treatment at all. Such clinical tests as have been conducted indicate that the proportion of patients who have benefited from psychoanalytic treatment does not diverge significantly from the proportion who recover spontaneously or as a result of other forms of intervention in the control groups used. So the question of the therapeutic effectiveness of psychoanalysis remains an open and controversial one.

8. References and Further Reading

a. Works by Freud

  • The Standard Edition of the Complete Psychological Works of Sigmund Freud (Ed. J. Strachey with Anna Freud), 24 vols. London: Hogarth Press, 1953-1964.

b. Works on Freud and Freudian Psychoanalysis

  • Abramson, J.B. Liberation and Its Limits: The Moral and Political Thought of Freud. New York: Free Press, 1984.
  • Bettelheim, B. Freud and Man’s Soul. Knopf, 1982.
  • Cavell, M. The Psychoanalytic Mind: From Freud to Philosophy. Harvard University Press, 1993.
  • Cavell, M. Becoming a Subject: Reflections in Philosophy and Psychoanalysis. New York:  Oxford University Press, 2006.
  • Chessick, R.D. Freud Teaches Psychotherapy. Hackett Publishing Company, 1980.
  • Cioffi, F. (ed.) Freud: Modern Judgements. Macmillan, 1973.
  • Deigh, J. The Sources of Moral Agency: Essays in Moral Psychology and Freudian Theory. Cambridge, UK: Cambridge University Press, 1996.
  • Dilman, I. Freud and Human Nature. Blackwell, 1983
  • Dilman, I. Freud and the Mind. Blackwell, 1984.
  • Edelson, M. Hypothesis and Evidence in Psychoanalysis. University of Chicago Press, 1984.
  • Erwin, E. A Final Accounting: Philosophical and Empirical Issues in Freudian Psychology. MIT Press, 1996.
  • Fancher, R. Psychoanalytic Psychology: The Development of Freud’s Thought. Norton, 1973.
  • Farrell, B.A. The Standing of Psychoanalysis. Oxford University Press, 1981.
  • Fingarette, H. The Self in Transformation: Psychoanalysis, Philosophy, and the Life of the Spirit. HarperCollins, 1977.
  • Freeman, L. The Story of Anna O.—The Woman Who Led Freud to Psychoanalysis. Paragon House, 1990.
  • Frosh, S. The Politics of Psychoanalysis: An Introduction to Freudian and Post-Freudian Theory. Yale University Press, 1987.
  • Gardner, S. Irrationality and the Philosophy of Psychoanalysis. Cambridge, Cambridge University Press, 1993.
  • Gay, V.P. Freud on Sublimation: Reconsiderations. Albany, NY: State University of New York Press, 1992.
  • Grünbaum, A. The Foundations of Psychoanalysis: A Philosophical Critique. University of California Press, 1984.
  • Hook, S. (ed.) Psychoanalysis, Scientific Method, and Philosophy. New York University Press, 1959.
  • Jones, E. Sigmund Freud: Life and Work (3 vols), Basic Books, 1953-1957.
  • Klein, G.S. Psychoanalytic Theory: An Exploration of Essentials. International Universities Press, 1976.
  • Lear, J. Love and Its Place in Nature: A Philosophical Interpretation of Freudian Psychoanalysis. Farrar, Straus & Giroux, 1990.
  • Lear, J. Open Minded: Working Out the Logic of the Soul. Cambridge, Harvard University Press, 1998.
  • Lear, J. Happiness, Death, and the Remainder of Life. Harvard University Press, 2000.
  • Lear, J. Freud. Routledge, 2005.
  • Levine, M.P. (ed). The Analytic Freud: Philosophy and Psychoanalysis. London: Routledge, 2000.
  • Levy, D. Freud Among the Philosophers: The Psychoanalytic Unconscious and Its Philosophical Critics. New Haven, CT: Yale University Press, 1996.
  • MacIntyre, A.C. The Unconscious: A Conceptual Analysis. Routledge & Kegan Paul, 1958.
  • Mahony, P.J. Freud’s Dora: A Psychoanalytic, Historical and Textual Study. Yale University Press, 1996.
  • Masson, J. The Assault on Truth: Freud’s Suppression of the Seduction Theory. Faber & Faber, 1984.
  • Neu, J. (ed). The Cambridge Companion to Freud. Cambridge University Press, 1994.
  • O’Neill, J. (ed). Freud and the Passions. Pennsylvania State University Press, 2004.
  • Pendergrast, M. Victims of Memory. HarperCollins, 1997.
  • Popper, K. The Logic of Scientific Discovery. Hutchinson, 1959.
  • Reiser, M. Mind, Brain, Body: Towards a Convergence of Psychoanalysis and Neurobiology. Basic Books, 1984.
  • Ricoeur, P. Freud and Philosophy: An Essay in Interpretation (trans. D. Savage). Yale University Press, 1970.
  • Robinson, P. Freud and His Critics. Berkeley, University of California Press, 1993.
  • Rose, J. On Not Being Able to Sleep: Psychoanalysis and the Modern World. Princeton University Press, 2003.
  • Roth, P. The Superego. Icon Books, 2001.
  • Rudnytsky, P.L. Freud and Oedipus. Columbia University Press, 1987.
  • Said, E.W. Freud and the Non-European. Verso (in association with the Freud Museum, London), 2003.
  • Schafer, R. A New Language for Psychoanalysis. Yale University Press, 1976.
  • Sherwood, M. The Logic of Explanation in Psychoanalysis. Academic Press, 1969.
  • Smith, D.L. Freud’s Philosophy of the Unconscious. Kluwer, 1999.
  • Stewart, W. Psychoanalysis: The First Ten Years, 1888-1898. Macmillan, 1969.
  • Sulloway, F. Freud, Biologist of the Mind. Basic Books, 1979.
  • Thornton, E.M. Freud and Cocaine: The Freudian Fallacy. Blond & Briggs, 1983.
  • Tauber, A.I. Freud, the Reluctant Philosopher. Princeton University Press, 2010.
  • Wallace, E.R. Freud and Anthropology: A History and Reappraisal. International Universities Press, 1983.
  • Wallwork, E. Psychoanalysis and Ethics. Yale University Press, 1991.
  • Whitebook, J. Perversion and Utopia: A Study in Psychoanalysis and Critical Theory. MIT Press, 1995.
  • Whyte, L.L. The Unconscious Before Freud. Basic Books, 1960.
  • Wollheim, R. Freud. Fontana, 1971.
  • Wollheim, R. (ed.) Freud: A Collection of Critical Essays. Anchor, 1974.
  • Wollheim, R. & Hopkins, J. (eds.) Philosophical Essays on Freud. Cambridge University Press, 1982.

See also the articles on Descartes’ Mind-Body Distinction, Higher-Order Theories of Consciousness, and Introspection.

Author Information

Stephen P. Thornton
University of Limerick
Ireland

Elizabeth Cady Stanton (1815—1902)

Elizabeth Cady Stanton was one of the most influential public figures in nineteenth-century America. She was one of the nation’s first feminist theorists and certainly one of its most productive activists. She was in the tradition of Abigail Adams, who implored her husband John to “remember the ladies” as he helped form the new American nation. Along with Susan B. Anthony, Stanton fueled the movement for women’s suffrage. She advocated for change in both the public and private lives of women–regarding property rights, equal education, employment opportunity, more liberal divorce provisions, and child custody rights. By addressing such a wide range of women’s issues, she laid down the foundation for the three main branches of feminism that exist today: liberal feminism, which focuses on women’s similarities to men and emphasizes equality; cultural feminism, which celebrates women’s differences from men and aims for gender equity; and dominance feminism, which focuses on male power and female submission and aims to overturn all forms of gender hierarchy.

Stanton was motivated by liberal humanist ideals of egalitarianism and individual autonomy, which were an outgrowth of the Enlightenment. She was familiar with the philosophical thinkers whose works and ideas were discussed among American intellectuals at the time: Jean-Jacques Rousseau, Immanuel Kant, Mary Wollstonecraft, Johann Wolfgang Goethe, Alexis de Tocqueville, and later John Stuart Mill and Harriet Taylor Mill. Stanton’s writings and speeches demonstrate this. In addition, her years of studying and bantering with the apprentices in her father’s law office ensured that she was acquainted with the works of English legal theorists Edward Coke, William Blackstone, and Jeremy Bentham, as well as with those of her father’s influential colleagues, Joseph Story and James Kent.

Although she was theoretically minded, Stanton had no interest in living the life of an intellectual, removed from the hubbub of social and political life. In fact, her feminist theory grew out of the real-life problems women faced in her age and was developed to solve them. This places her squarely in the American tradition established by Benjamin Franklin, of developing philosophical thought that can be applied to everyday life. Stanton, then, could be termed an applied philosopher or a philosopher-practitioner. She did not seek out theory for theory’s sake, but instead put theory into practice for the purpose of improving social and political life.

Table of Contents

  1. Early Life and Education
  2. Marriage and Family in an Activist World
  3. Women’s Rights Activism
  4. Writings and Influence
    1. The Woman’s Bible
  5. Conclusion
  6. References and Further Reading
    1. Works by Elizabeth Cady Stanton
    2. Biographical and Historical Works Relating to Elizabeth Cady Stanton
    3. Papers and Articles Relating to Daniel Cady

1. Early Life and Education

Elizabeth Cady, the third surviving child and second of the five daughters of Margaret (formerly Livingston) and Daniel Cady (1773-1859), was born November 12, 1815, in Johnstown, New York. Her mother was from a well-to-do family with ties to the American Revolution. Margaret Livingston’s father had been a colonel in the Continental Army, assisting in the capture of John Andre, one of Benedict Arnold’s co-conspirators. Daniel Cady was a prominent lawyer and politician in the state of New York. He became a member of the New York State Assembly (1808-13), held office as a member of Congress (1815-17), and served on the New York Supreme Court (1847-54).

From a young age, Elizabeth was keenly aware of the gender-based power imbalances that were in place in her day. With bitterness, she later recounted the many times her father responded to her aspirations and achievements by declaiming that she should have been born a boy. After the death of her older brother, the only boy in the family to have reached adulthood, she resolved to do her best to fill the void his death left in her father’s life. She learned to play chess and ride a horse. She studied Greek with the family’s minister, the Rev. Simon Hosack. She entered Johnstown Academy and won prizes and awards. Though she remained unable to please and impress her father to her satisfaction, these moments clearly motivated her to achieve. In all likelihood, Daniel Cady was not quite as indifferent to his daughter’s achievements, nor as hostile to feminism, as her memoir sometimes suggests. His laments that she was not a boy were perhaps more a recognition of the social constraints she would face as a grown woman than an expression of his own need for a son.

When Elizabeth was a child, she overheard a conversation her father had with a woman, a would-be client for whom the law provided no remedy. She voiced her dismay to her father, and this is the counsel he gave her: “When you are grown up, and able to prepare a speech, you must go down to Albany and talk to the legislators; tell them all you have seen in this office–the sufferings of these Scotchwomen . . . if you can persuade them to pass new laws, the old ones will be a dead letter” (Eighty Years, pp. 31-32). Prior to their conversation, young Elizabeth’s plan had been to destroy all the laws that were unjust to women by cutting them out of his law books! His advice gave her an alternative and foreshadowed the career she would make for herself as a reformer.

Born into a world of wealth and privilege, Elizabeth benefited from a better education than most girls were granted in her day. After attending Johnstown Academy, she entered the Troy Female Seminary. She felt it unjust that she was barred from attending the more academically rigorous Union College, then an all-male institution. While she gained greater understanding of women and feminine culture at Troy, overall her experience there convinced her that male-female co-education is superior to single-sex education. Seeing and visiting with men was such a novelty at Troy that it created an almost unnatural obsession with the other sex. In Elizabeth’s view it also exaggerated any differences between the genders and intensified the deficiencies of each.

Elizabeth did not complete a degree at Troy. In part this was because of the influence of the evangelist Charles Finney, a pivotal regional figure in America’s “Second Great Awakening”. Attendance at Finney’s revival sessions was mandatory at Troy, and several of her classmates were readily converted by his “fire and brimstone” sermons. Yet his preaching left Elizabeth terrified and perplexed. She considered his calls to give her heart to Jesus irrational, if not incomprehensible, and she refused to repent. Even so, she was still disturbed by the images of hell and damnation Finney had planted in her mind. Her parents, like many traditional Protestants, rejected the heightened emotionalism of Finney’s evangelical fervor and allowed her to withdraw from Troy. They treated her to a retreat in Niagara where all talk of religion was forbidden, so that she could settle herself and regain her spiritual bearings. After this exposure to Protestant revivalism, Elizabeth remained a religious skeptic for the rest of her life.

Elizabeth continued to study on her own after her time at Troy Seminary. She read the moral philosophy of George Combe and discussed the novels of Sir Walter Scott, James Fenimore Cooper, and Charles Dickens with her brother-in-law Edward Bayard. She also spent time with her intellectual and reform-minded cousins in nearby Peterboro, New York. These were the children of her mother’s sister Elizabeth (Livingston) and her husband Peter Smith.

In the Smith household, Elizabeth was exposed to a number of new people as well as to new social and political ideas. Her aunt and uncle were egalitarians not only in the ideal, but in the everyday, sense. Their home was open to African Americans on their way to freedom in Canada as well as to Oneida Indians they had befriended. It also teemed with activists and intellectuals who discussed, debated and strategized about the social and political events of the day–chief among them abolition. Her uncle, Peter Smith, was a staunch advocate of racial equality who sought an end to American slavery. Her cousin Gerrit Smith followed closely in his father’s footsteps. Gerrit and his friends in the abolition movement would not only influence Elizabeth, but introduce lifelong challenges as she and other social reformers sought to bring full equality to all people, regardless of color, creed, or gender.

2. Marriage and Family in an Activist World

It was at the home of her cousin, Gerrit Smith, that Elizabeth Cady met Henry Stanton, a man ten years her senior. He was already an extremely prominent and influential abolitionist orator. Stanton had begun his career as a journalist, and he met Theodore Weld while he was attending the Rochester Manual Labor Institute and Weld was touring the country to learn more about manual labor schools. Both were compelling public speakers. Both were committed to social and political reform. And both had been influenced by Charles Finney. Stanton first met Finney in Rochester, when Finney was serving as replacement pastor at a local church. Like Weld–and in stark contrast to his future wife–Stanton was thoroughly impressed by Finney as an orator and theological thinker. The excesses of Finney’s fire and brimstone approach, which had so troubled Elizabeth Cady, were of no concern to Stanton. He was simply full of awe and admiration for the man.

Stanton and Weld became lifelong friends, and at Weld’s behest, Stanton began attending the newly-established Lane Theological Seminary in 1832. Lane was based on the manual labor model and initially was a great success. In 1834, however, a student-sponsored debate over slavery raised the ire of the institution’s board of trustees, and a gag order was issued: henceforth, no events related to political issues were to be held without prior approval from the board. Nearly half the students at the seminary–Stanton and Weld among them–withdrew from the institution in protest. Stanton then began working alongside Weld, first as an agent of the American Anti-Slavery Society, then as an officer of the organization. After he and Elizabeth married, Henry studied law under Daniel Cady and became a lawyer and a political operative. He aspired to hold office himself, and succeeded in doing so for a short time in the early 1850s. However, he was mostly known as an orator, a social/political activist, and a journalist, writing for the Anti-Slavery Standard, The Liberator, The New York Tribune, and The New York Sun.

In the 1830s, Stanton was a frequent visitor to the Smith household and a chief contributor to their many discussions about social and political issues. When Elizabeth and Stanton met in 1839, she was under the illusion that he was already married, so her earliest interactions with him were simply those of an acquaintance who shared his interest in abolition, not of a potential love interest. Once the two discovered the misunderstanding, they began courting against the wishes of Elizabeth’s father. Daniel Cady was no more fond of abolitionism than he was of social reformers in general, and Stanton’s personal cause with Cady was not helped by his questionable financial prospects.

After succumbing to family pressure and breaking her engagement to Stanton, Elizabeth had a change of heart, and the two married hastily in May 1840. They then went to London, where Henry was due to serve as a delegate at the World Antislavery Convention. Well-known within women’s rights history is the fact that none of the female delegates at the meeting were allowed to take a seat on the convention floor, but were segregated behind a screen in the balcony. This enraged Elizabeth as well as other American women present, such as Lucretia Mott, Abby Southwick, and Elizabeth Neal. Significantly, Henry gave a speech in favor of full participation by the women present, but his support stopped there. He did not join William Lloyd Garrison and a handful of other male delegates when they sat in the women’s section as an act of protest against such overt inequality.

Henry Stanton’s moderate support of women’s rights in London signaled an ongoing point of disconnect between Elizabeth and her husband. His passion was for abolition. Suffragists and feminists argued that women needed more social and political freedom than they currently had, but for Henry the plight of slaves held in bondage, abused, oppressed, and murdered at their masters’ whim was a far greater concern than women’s liberty to cast a ballot or to hold office. Elizabeth’s passion was for women’s rights. Certainly American slavery was cruel and unjust, but the system of oppression that permitted it was the same system that allowed men to rule over women with arbitrary and capricious authority. A woman who was married to a kind and egalitarian man was simply lucky. The legal system still maintained the power of all men over their wives, no matter how cruel and unkind they might be.

Biographers have debated whether Henry was truly an advocate of Elizabeth’s quest for women’s rights, merely moderately supportive, or actually antagonistic to both her quest and her stature as a suffragist. Like both Henry Blackwell and Theodore Weld, Stanton accepted Elizabeth’s decision to excise the word “obey” from the vows she spoke at their wedding. The minister performing the ceremony was troubled by this detour from convention, and Elizabeth was convinced that the lengthy prayer he offered after the ceremony–lasting nearly an hour–was payback for this crucial omission from their marriage vows. There is no evidence that the matter troubled her husband. Even so, others in their reform-minded circles went further to advance equality in marriage. Theodore Weld, who wed the feminist and abolitionist Angelina Grimke in 1838, vowed to treat his wife as an equal partner in their marriage. Marrying in 1855, Henry Blackwell went much further, denouncing marriage as an institution that enforced male dominance over women. In addition, Blackwell accepted Lucy Stone’s decision to keep her own surname after marriage, the first woman on record in America to have done so. Other male reformers supported or worked alongside their wives in the suffrage struggle. For example, Henry Blackwell spoke on behalf of women’s suffrage and essentially co-edited The Woman’s Journal with Stone. Clearly, Henry Stanton did not match this level of commitment to his wife’s suffrage work.

Yet the level of support that Henry Stanton gave to Elizabeth’s work as a women’s rights advocate must also be measured against the standard set by her father. Daniel Cady repeatedly lamented the fact that Elizabeth was female because he believed her intellect and forceful personality would go to waste in a woman. Women in the world they lived in were meant to attend to the hearth and home, not to go out into the world to become intellectuals or, worse still, rabble-rousing activists. At the same time, her father was not completely unmoved by seeing Elizabeth act on her convictions. After she read one of her upcoming women’s rights speeches to him, he asked if she–a woman of position, privilege, and relative power–genuinely believed that she was at a disadvantage as a woman. When Elizabeth responded by reminding him of all the laws that privileged men and harmed women, her father turned to his law books to provide her with another example that would help further illustrate her point. While outwardly never more than lukewarm toward her feminist efforts, Daniel Cady often provided support in this way–giving her legal ammunition to use in her writings and speeches.

Elizabeth was accustomed to receiving only the dimmest signs of approval from her father. So as an adult, she neither expected nor needed the motivation of resounding applause for her suffrage work from Henry Stanton. She had found her calling as a women’s rights advocate, and though she was a formidable force at the podium, Henry still wrote affectionately to his “Dear Lizzie” whenever the two were apart. Though not a proponent of women’s rights himself, he does not appear to have been hostile to her leadership role in the women’s movement. It simply seems that, like her father, he found it difficult to fully comprehend Elizabeth’s passion for women’s rights when he saw her as a woman of considerable privilege. Because both Elizabeth and Henry were fairly protective of their private life together, it may be impossible to know for certain how each felt about the other’s reform work. Each seems to have been impressed by the other’s achievements in their chosen reform efforts, though neither was exactly a cheerleader for the other’s cause.

After the London antislavery convention, Elizabeth and Henry toured Europe for six months, returning home in the winter and settling with Elizabeth’s parents. During this period, Henry studied law under Daniel Cady, before taking up a position in Boston in 1843. In Boston Elizabeth met prominent reformers and intellectuals, among them Lydia Maria Child, Frederick Douglass, Ralph Waldo Emerson, Margaret Fuller, Nathaniel Hawthorne, James Russell Lowell, Abby Kelly, Elizabeth Palmer Peabody, John Greenleaf Whittier, and Paulina Wright. She regularly listened to the sermons of the radical Unitarian and abolitionist minister, Theodore Parker, and was eager to attend the “conversations” of Amos Bronson Alcott and Margaret Fuller. She also visited the utopian Brook Farm community, admiring its idealism, though not the spartan way of life of its inhabitants.

Elizabeth loved Boston, and the art, culture, and intellectual life it had to offer. The loss of all this made the adjustment to rural life difficult for her when, in 1847, the couple moved to Seneca Falls in upstate New York. But with a house generously provided by Daniel Cady for his daughter’s growing family, there was really no way the couple could refuse. By 1847 they had three children, and there would be more–each named in honor of a beloved family member or friend: Daniel Cady (born 1842), Henry Brewster (1844), Gerrit Smith (1845), Theodore Weld (1851), Margaret Livingston (1852), Harriot Eaton (1856) and Robert Livingston (1859).

3. Women’s Rights Activism

Elizabeth Cady Stanton became acquainted with women’s rights activists for the first time at the antislavery convention in London. Women’s abilities, achievements, and rights had been of concern to her since her youth, when she bantered with boys at Johnstown Academy and with the young men apprenticing at her father’s law office. Yet it was her experience as a housewife in Seneca Falls that prompted her to take action on behalf of women’s rights.

In her earliest years as a wife and mother, Cady Stanton found fulfillment in managing a household. In fact, she thrived on the day-to-day challenge of doing so with order and efficiency. After a time, however, the novelty wore off, and she found housework mundane and depressing. She also found herself sympathizing with everyday women who did not have the same access to power and privilege that she had. Assisting victims of domestic abuse in the area on several occasions, Cady Stanton saw how the same unjust laws that she had intuitively resented and wanted to change as a child were especially burdensome to women without means. Just at this point in her life, an invitation for a visit came from Lucretia Mott, who was only eight miles away in Waterloo. Cady Stanton eagerly took the trip to meet with Mott and other reformers in the community, and the idea was born to hold a convention to discuss women’s rights. In one afternoon, the group planned and announced the two-day meeting, the first of its kind, to be held only five days later. The event was a success that far exceeded the expectations of Cady Stanton and her convention co-planners. While a group of about fifty devoted social reformers from nearby Rochester and Syracuse were expected to participate, over two hundred people attended. Nearly seventy signed the Declaration of Sentiments, which Cady Stanton had authored, modeling it after the American Declaration of Independence.

The Seneca Falls Women’s Rights Convention was followed a month later by another such meeting in Rochester, thus setting in motion a tradition that was to shape the nineteenth-century women’s movement. Conventions to discuss women’s rights were held annually between 1850 and 1860, with Elizabeth Cady Stanton and her good friend and colleague, Susan B. Anthony, playing complementary roles. Anthony was the strategist, tactician, and all-round logistics coordinator. Cady Stanton was the philosophical thinker, writer, and theoretician. Anthony sometimes chastised Cady Stanton for letting family obligations, namely childrearing, get in the way of her women’s rights work. At the same time, however, she was known to Elizabeth’s children as “Aunt Susan”, and after they had turned seven they often went on extended visits to Anthony’s home in Rochester, a place they later described with awe and wonder as a landmark in their lives.

A network of countless brilliant and talented women worked to advance women’s rights in nineteenth-century America, among them Amelia Bloomer, Olympia Brown, Paulina Wright Davis, Abby Foster, Matilda Joslyn Gage, Frances Watkins Harper, Isabella Beecher Hooker, Julia Ward Howe, Mary Ashton Rice Livermore, Lucretia Mott, Ernestine Rose, Caroline Severance, Anna Howard Shaw, Lucy Stone, Sojourner Truth, Frances Willard and Victoria Woodhull. The lifelong friendship of two of them, Elizabeth Cady Stanton and Susan B. Anthony, makes for a unique and compelling story. On the surface, the two could not have been more different. Cady Stanton was born of privilege, had a forceful and sometimes challenging personality, was fond of luxury, was a religious skeptic, and refused to believe that women had to choose between motherhood and public activism. Anthony was not born into wealth, had a quiet and calm demeanor, eschewed self-indulgence, was a devout Quaker who later converted to Unitarianism, and felt strongly that domesticity seriously compromised women’s participation in public life. Yet the contrasts between Cady Stanton and Anthony served to complement, rather than to compete with, each other. Anthony had a room in Cady Stanton’s home where she was welcome to stay at any time. When Cady Stanton was unable to attend a convention, Anthony would often read the speech Elizabeth had written. Cady Stanton could pen an eloquent essay or speech with ease, an ability that Anthony greatly admired.

While the relationship between Stanton and Anthony remained stable, the movement they were part of was not always placid. During the Civil War, suffrage leaders agreed to focus on supporting the war effort, forming the Women’s Loyal League. Once the war had ended and slavery had been abolished, Stanton and Anthony joined Frederick Douglass and others to form the American Equal Rights Association. This organization was devoted to securing voting rights for newly-freed African Americans and for all women simultaneously. However, when the fifteenth amendment to the Constitution was being ratified without the word “sex” included in the text, the branch of the women’s suffrage movement that was led by Stanton and Anthony was outraged. The text read: “The right of citizens of the United States to vote shall not be denied or abridged by the United States or by any state on account of race, color, or previous condition of servitude”. It was inconceivable to Stanton and her colleagues that their male advocates had failed to bring women along in the struggle for voting rights. In response, Stanton and her white female colleagues made arguments on behalf of women that today smack of elitism, if not outright racism. How, they asked, could persons who were uneducated and lacking autonomy while held for years in bondage be given the right to vote when well-educated and “cultured” white women continued to be treated like children at best, chattel at worst? In their anger, they allied themselves with race-baiters in the months prior to the amendment’s passage. Stanton, Anthony, and others felt strongly that any change to the constitution involving voting rights simply must be universal–it must include African American males and females, as well as white women and others who did not yet hold the franchise. Yet clearly the pair of activists were willing to turn a blind eye to the ways in which their arguments fueled the fires of race-hate across the nation.

The race issue contributed to a schism in the women’s movement that would last for two decades. Some women, led by Lucy Stone and joined by Julia Ward Howe and Caroline Severance, formed a new organization based in Boston, the American Woman Suffrage Association (AWSA). Stanton and Anthony, for their part, formed the New York-based National Woman Suffrage Association (NWSA). The AWSA ultimately endorsed the amendment giving only African American males the vote, believing that their good will and co-operative spirit would be rewarded in time. Unfortunately, the NWSA and the AWSA had to struggle to distinguish themselves from each other to the rank and file of women’s rights supporters. They printed rival publications to promote women’s rights from the perspective of the leaders of each organization: Stanton and Anthony published The Revolution (1868-1872) and Stone The Woman’s Journal (1870-1917). They also competed for members and political support. It was not until 1890 that the divided movement reunited as the National American Woman Suffrage Association, continuing the voting rights struggle for another thirty years.

4. Writings and Influence

As noted, Elizabeth Cady Stanton was an eloquent and prolific writer. While she served as the philosopher of the suffrage movement, Susan B. Anthony served as its strategist. Historians have noted that their respective strengths complemented each other. Equally significant are the different approaches they took to securing rights for women. Anthony was single-minded in her quest for the vote as the stepping stone that would provide women access to all other rights. If only women could vote and hold public office, they would then be able to advocate for themselves: Women could vote for candidates with policies that empower and support women and their families. They could press for changes to laws related to marriage, divorce, and child custody. Women could help enact any number of provisions that would give them more power and influence in society. If only they had the vote.

Stanton also believed in the power of voting rights, but she saw no reason not to carry on the battle for women’s equality on all fronts at once. Women had needed property rights when legislatures began to pass them in the 1830s and ’40s. They continued to need and deserve other rights related to property and financial self-determination in the 1860s, ’70s, and ’80s, and Stanton spoke out about how and why these rights should be granted. Similarly, laws dealing with the most intimate aspects of women’s lives–marriage, divorce, and child custody–merited attention in Stanton’s “here and now,” not at some future time when women would hold the power of the franchise.

Stanton’s published works addressed four main areas of feminist concern: personal/social freedom, marriage/family matters, legal/political rights, and religion’s role and influence. Chief among the social constraints that restricted women’s freedom in Stanton’s day was the lack of access to education. She began discussing this subject early in her work as a reformer. Prohibitions against rigorous academic training for girls and women thwarted their intellectual growth and thus the levels of personal and social development they could achieve. The tradition of single-sex education further exacerbated this problem. The respective weaknesses of men and women (which Stanton believed were not natural to each gender but nurtured by social norms and values) were reinforced when they were deprived of interaction with one another, and this perpetuated the imbalance of power based on gender. The inferior education women received when Stanton began her work also had the potential to contribute to women’s moral inferiority. Without properly exercising their intellectual powers and being challenged to make difficult academic and moral distinctions, women were unable to function as independent decision-makers. This harmed not only women as individuals, but also the social institutions they were a part of: The family, the local community, and the state. In this sense, Stanton laid the foundation for what would later be called liberal feminism, a school of feminist thought which maintains that women are more similar to men than they are different from them. Thus, it aims for equal treatment of men and women, particularly in matters related to education, employment, pay equity and political participation.

Stanton was among the bolder women’s rights advocates of her era in that she took on issues relating to marriage that other suffragists considered off-limits. She and Anthony both referred to the institution of marriage as it existed in their time as “male marriage”, that is, a civil union made for and by men, which was used to perpetuate male power and authority. Among the most controversial of these issues was divorce. Stanton was under none of the popular illusions that marriage was a blessed institution that, fairytale-like, brought out the best in people. She resented the suggestion that a virtuous and patient woman could persuade–through her love, faith or virtue–a domineering, alcoholic or abusive man to become a more kind and considerate husband. In Stanton’s view, women were only and always harmed by their relationships with such men. Their own moral character was compromised, as was the overall moral tone of their home and family. Speaking in favor of pending legislation in New York that would liberalize divorce policies, Stanton said that rather than prevent a woman from leaving an abusive and alcoholic husband, the law should prohibit such men from getting married. Such a policy would go much farther toward protecting the institution of marriage than the laws then in force were able to do. Here we see in Stanton’s thought a kernel of what would later become dominance feminism, a branch of feminist thought that focuses on the differences between men and women and concludes that, whether natural or socially constructed, gender distinctions are used to reinforce male dominance and female submission. Like Stanton, today’s dominance feminists are concerned with marriage and divorce. They also venture into territory that Stanton and her contemporaries only dared to hint at in the age of Victorian propriety: Domestic violence, rape, incest, pornography, and prostitution. The aim of dominance feminism is to overturn the male power structure that makes these abuses of women possible.

Of all aspects of Stanton’s thought, her discussion of the legal and political rights of women is the best known in popular accounts of women’s history today. Women’s right to participate fully in public, political life was certainly of paramount importance to her. Over and over in her lectures and essays, Stanton emphasized the ways in which men’s dominance in public life disabled and disgraced women. With men as the chief arbiters of legal and political right, women were rendered “civilly dead,” as Stanton so eloquently termed it at the Seneca Falls convention. Women were powerless to stop their husbands if they chose to squander the family’s resources, even if the source of those resources was the savings or inheritance that women themselves had brought into their households. Men made all the laws regarding education, employment, marriage, divorce, child custody, property, inheritance, and breaches of the civil and criminal codes–every aspect of life that could and did affect women’s lives, but over which women were powerless to effect change. Men thus became like monarchs ruling over all classes of society, able to wield tyrannical force if they willed to do so. All the other social inequalities that concerned Stanton trickled down from this one arena–that of legal and political rights.

Significantly, when Stanton spoke in favor of universal suffrage–that is, of extending voting rights not only to all African American males but also to all women–after the Civil War, she cautioned against maintaining distinctions among the various classes of people in society. The entire class of African Americans held in slavery had been prohibited from voting since the founding of the country. As legislators considered extending the franchise, Stanton implored them to erase all similar social distinctions. Women should no more be treated as a separate class of individuals prohibited from voting than newly freed African Americans should be. On American soil, Stanton said, all citizens were to be granted equal consideration in this way.

This stance, too, created some friction for Stanton as the post-Civil War discourse on voting rights got underway. She and Anthony were absolutely unyielding in their call for women’s full political equality. While other suffragists, like Lucy Stone, were willing to consider partial suffrage, which would allow women to vote on local issues of concern to them, like education or municipal budgets, Stanton and Anthony held firm: Women are equal in all ways to men and should be treated as such. At times they displayed their own class biases on this point. Why should an ignorant and uneducated man of any race be granted the right to vote in all matters when well-educated and cultured women were allowed to vote only on matters like school policies and local road construction? Most critically, as the post-War discussion advanced and feminists were being urged to accept this as “the Negro’s hour” (as Frederick Douglass so famously put it), Lucy Stone and others were willing to surrender women’s voting rights for the time being to appease their allies in Congress. Stone and others assumed the discussion of women’s voting rights would be resumed in a timely fashion, but they believed that, for the peace of the nation, they needed to step aside temporarily. Stanton and Anthony, seeing the writing on the wall, kept insisting on female suffrage until the very last legislators had cast their votes. Ultimately, the controversy over universal suffrage resulted in the schism between Stone and her camp and the Stanton-Anthony coalition discussed above.

a. The Woman’s Bible

Stanton had always been critical of the ways in which the Christian churches contributed to women’s oppression, but she addressed this topic head-on in the last decade of her life. In publishing The Woman’s Bible, Stanton was far ahead of the feminist curve. Biblical criticism of any sort was relatively new, having been initiated in Germany in the late eighteenth and nineteenth centuries by thinkers like Johann Gottfried Eichhorn (1752–1827), Wilhelm Martin Leberecht de Wette (1780–1849) and Julius Wellhausen (1844–1918). The idea of a largely secular examination of the Bible that investigated its implications, flaws and shortcomings as related to women was virtually unheard of.

The Woman’s Bible consists of a collection of essays by a committee of women intellectuals on passages of the Judeo-Christian scriptures that discuss women. Stanton and her colleagues took a critical approach to the story of Eve in the Garden of Eden, for instance. First of all, the story is deemed an allegory, not a factual account. Darwin’s theory of evolution provides a more plausible account, on which human beings developed over time rather than in one act of creation. Secondly, Eve is praised, rather than blamed, for taking the fruit, because this act demonstrates her thirst for knowledge. As one contributor put it, “Fearless of death if she can gain wisdom [she] takes of the fruit” (The Woman’s Bible, p. 26). The work takes a similar realist approach to the story of Jesus’ life. Cady Stanton and her contributors readily write about him as a man, not as God nor even as God’s Son. Stanton’s view on the miracles he performed is that human development is such that any human being could perform miracles, if only they have the will to make themselves as pure and good as Jesus was.

Other passages, such as those that recount the marriages of the patriarchs, are discussed so as to reify ideals of marital love. In both the Old and New Testament passages, Stanton and her contributors highlight and heighten the role of the women in question. They put particular emphasis on Miriam’s role in the quest for Jewish freedom, for instance. They also discuss the important work of Deborah the judge, as well as of female disciples in the early Christian church, noting that they were committed to their cause and were conveyors of God’s word. In this sense, Stanton sowed the seed of cultural feminism–the idea that women bring a special perspective and set of values to their participation in society, values that need to be recognized, understood, and appreciated.

The overall objective of The Woman’s Bible was to use textual analysis and historical criticism to dismantle the traditional interpretation of any one biblical passage and replace it with a feminine, if not a feminist, perspective. On this level the effort was a success. Though it went into additional printings, The Woman’s Bible brought Elizabeth Cady Stanton a great deal of criticism, more for her audacity in taking on the project than for what she actually said about the Bible or its merits.

Although Stanton’s effort in The Woman’s Bible did not match the increasingly rigorous standards of her contemporaries in theology who were beginning their own critical examinations of the Jewish and Christian scriptures, neither did it fail as an intellectual exercise. It was a groundbreaking work, which called into question a number of widely accepted claims about the nature of God, God’s esteem for and relation to women, and women’s place within faith communities. It also paved the way for future work for and by women in religion in the twentieth century. Second wave feminists in the 1960s and 1970s, who were struggling for the full ordination of women in the Christian and Jewish traditions, relied heavily on this early work of feminist criticism by Stanton. Academic women in the same era were inspired by her example and produced more modern and academically rigorous works that scrutinized sacred texts and religious traditions from a feminist perspective as well. Despite her own religious skepticism, Stanton would have been heartened to see a future in which over half of the students in mainstream Protestant seminaries are women, and the ordination of women is commonplace in liberal Protestant and Jewish traditions.

5. Conclusion

Certainly Elizabeth Cady Stanton had an immense amount of influence in her day; her influence and her legacy continue even if today’s feminists do not always overtly recognize her as a source. At the heart of her ideals and advocacy lies a blurring of the distinction between the “public” world of law and politics, and the “private” world of home and family–a distinction that held sway through so much of the eighteenth and nineteenth centuries. Feminists in the 1970s and ’80s would craft a mantra of sorts that got to the heart of this distinction-blurring: “The personal is political”. Cady Stanton’s work is truly the source of such thinking. Stanton wanted the “public” realm, particularly law and politics, to be imported into the home. She wanted to bring an end to the abuse and neglect men were allowed to impose on women because the law turned a blind eye to domestic violence as a “private” matter. Other women activists of the day, such as the temperance leader Frances Willard, wanted to see women’s “private” virtues exported into the public realm: Give us access to the vote, and we will clean up the crime and corruption of the “public” political realm. Susan B. Anthony’s position represented a middle ground. She rejected the public/private distinction as much as Cady Stanton did, but she did so from a different angle and for different reasons. Access to the vote and the ability to hold public office would allow women to speak for themselves and act on their own behalf. If they are dissatisfied with the laws governing marriage and divorce, then give them voting rights and let them change such laws.

Fond of luxury and susceptible to self-indulgence all her life, Elizabeth Cady Stanton became obese late in life and suffered from maladies related to her overall poor health: Fading eyesight, decreased mobility and chronic fatigue. Even so she remained active and engaged in life, and optimistic that women would indeed succeed in winning the vote in the twentieth century. At the very least she could rest knowing that she had passed on the legacy of the suffrage struggle to her daughter’s generation of women’s rights activists. Stanton’s daughter, Harriot Stanton Blatch, was a feminist activist in her own right who helped compile the six-volume History of Woman Suffrage, which Stanton, Anthony, Matilda Joslyn Gage and others had begun in 1881. While her daughter was able to vote for the last twenty years of her life, Elizabeth Cady Stanton herself was never able to cast a ballot. She died October 26, 1902, in New York City, just shy of eighteen years before the passage of the nineteenth amendment.

6. References and Further Reading

a. Works by Elizabeth Cady Stanton

  • “Address of Mrs. Elizabeth Cady Stanton”, delivered at Seneca Falls and Rochester (July and August, 1848).
  • “It is so Unladylike” (1853).
  • “I Have all the Rights I Want” (1858).
  • “The Slave’s Appeal” (1860).
  • “Appeal to the Women of New York” (1860).
  • “Address of Elizabeth Cady Stanton on the Divorce Bill”, before the Judiciary Committee of the New York Senate (February 1861).
  • “Free Speech” (1861).
  • “Address in Favor of Universal Suffrage” before the Judiciary Committees of the Legislature of New York (January 1867).
  • “Marriage/Divorce” (1871).
    • A memorial for Victoria Woodhull, so in actuality dating from later than 1871.
  • “Woman Suffrage,” delivered to the Judiciary Committee of the House of Representatives (1890).
  • “Patriotism and Chastity” (1891).
  • “Solitude of Self,” delivered to the Judiciary Committee of the United States Congress (January 1892).
  • “Suffrage, a Natural Right” (1894).
  • The Woman’s Bible (1895). New York: European Publishing Company.
  • “Bible and Church Degrade Woman” (1898).
  • Eighty Years and More (1898). New York: European Publishing Company.
  • Elizabeth Cady Stanton Papers, Rutgers University Archives.

b. Biographical and Historical Works Relating to Elizabeth Cady Stanton

  • Baker, Jean (2005). Sisters: The Lives of America’s Suffragists. New York: Hill and Wang.
  • Davis, Sue (2009). The Political Thought of Elizabeth Cady Stanton. New York: New York University Press.
  • DuBois, Ellen Carol (1981). Elizabeth Cady Stanton, Susan B. Anthony: Correspondence, Writings, Speeches. Boston: Northeastern University Press.
  • Gage, Matilda Joslyn, and others. A History of Woman Suffrage (in six volumes, 1881-1922). New York: C. Mann.
  • Ginzberg, Lori D. (2009). Elizabeth Cady Stanton: An American Life. New York: Hill and Wang.
  • Griffith, Elisabeth (1984). In Her Own Right: The Life of Elizabeth Cady Stanton. New York: Oxford University Press.
  • Matthews, Jean V. (1997). Women’s Struggle for Equality: The First Phase, 1828-1876. Lanham, MD: Ivan R. Dee Pub.
  • Stanton, Henry B. (1885). Random Recollections. New York: Harper & Bros.
  • Stanton, Theodore and Harriot Stanton Blatch (1922). Elizabeth Cady Stanton. New York: Harper & Bros.

c. Papers and Articles Relating to Daniel Cady

  • Cady, Daniel. Letters and Papers at Syracuse University Library.
    • Also includes some of Elizabeth Cady Stanton’s letters to her relatives.
  • “Retirement of Judge Daniel Cady,” New York Times (January 25, 1855).
  • “Daniel Cady”, in Biographical Directory of the United States Congress. Washington, DC: United States Government Printing Office.
  • Raymond, William (1851). “Daniel Cady,” in Biographical Sketches of Distinguished Men of Columbia County (New York). Albany, NY: Weed, Parsons, & Co.

Author Information

Dorothy Rogers
Email: rogersd@mail.montclair.edu
Montclair State University
U.S.A.

Relational Models Theory

Relational Models Theory is a theory in cognitive anthropology positing a biologically innate set of elementary mental models and a generative computational system operating upon those models.  The computational system produces compound models, using the elementary models as a kind of lexicon.  The resulting set of models is used in understanding, motivating, and evaluating social relationships and social structures.  The elementary models are intuitively quite simple and commonsensical.  They are as follows: Communal Sharing (having something in common), Authority Ranking (arrangement into a hierarchy), Equality Matching (striving to maintain egalitarian relationships), and Market Pricing (use of ratios).  Even though Relational Models Theory is classified as anthropology, it bears on several philosophical questions.

It contributes to value theory by describing a mental faculty which plays a crucial role in generating a plurality of values.  It thus shows how a single human nature can result in conflicting systems of value.  The theory also contributes to philosophy of cognition.  The complex models evidently result from a computational operation, thus supporting the view that a part of the mind functions computationally.  The theory contributes to metaphysics.  Formal properties posited by the theory are perhaps best understood abstractly, raising the possibility that these mental models correspond to abstract objects.  If so, then Relational Models Theory reveals a Platonist ontology.

Table of Contents

  1. The Theory
    1. The Elementary Models
    2. Resemblance to Classic Measurement Scales
    3. Self-Organization and Natural Selection
    4. Compound Models
    5. Mods and Preos
  2. Philosophical Implications
    1. Moral Psychology
    2. Computational Conceptions of Cognition
    3. Platonism
  3. References
    1. Specifically Addressing Relational Models Theory
    2. Related Issues

1. The Theory

a. The Elementary Models

The anthropologist Alan Page Fiske pioneered Relational Models Theory (RMT).  RMT was originally conceived as a synthesis of certain constructs concerning norms formulated by Max Weber, Jean Piaget, and Paul Ricoeur.  Fiske then explored the theory among the Moose people of Burkina Faso in Africa.  He soon realized that its application was far more general, giving special insight into human nature.  According to RMT, humans are naturally social, using the relational models to structure and understand social interactions and regarding the application of these models as intrinsically valuable.  All relational models, no matter how complex, are, according to RMT, analyzable into four elementary models: Communal Sharing, Authority Ranking, Equality Matching, and Market Pricing.

Any relationship informed by Communal Sharing presupposes a bounded group, the members of which are not differentiated from each other.  Distinguishing individual identities is socially irrelevant.  Generosity within a Communal Sharing group is not usually conceived of as altruism, due to this shared identity, even though there is typically much behavior which otherwise would seem like extreme altruism.  Members of a Communal Sharing relationship typically feel that they share something in common, such as blood, deep attraction, national identity, a history of suffering, or the joy of food.  Examples include nationalism, racism, intense romantic love, indiscriminately killing any member of an enemy group in retaliation for the death of someone in one’s own group, and sharing a meal.

An Authority Ranking relationship is a hierarchy in which individuals or groups are placed in relations of higher and lower rank.  Those ranked higher have prestige and privilege not enjoyed by those who are lower.  Further, the higher typically have some control over the actions of those who are lower.  However, the higher also have duties of protection and pastoral care for those beneath them.  Metaphors of spatial relation, temporal relation, and magnitude are typically used to distinguish people of different rank; for example, a King having a larger audience room than a Prince, or a King arriving after a Prince for a royal banquet.  Further examples include military rankings, the authority of parents over their children especially in more traditional societies, caste systems, and God’s authority over humankind.  Brute coercive manipulation is not considered to be Authority Ranking; it is more properly categorized as the Null Relation, in which people treat each other in non-social ways.

In Equality Matching, one attempts to achieve and sustain an even balance and one-to-one correspondence between individuals or groups.  When there is not a perfect balance, people try to keep track of the degree of imbalance in order to calculate how much correction is needed.  “Equality matching is like using a pan balance: People know how to assemble actions on one side to equal any given weight on the other side” (Fiske 1992, 691).  If you and I are out of balance, we know what would restore equality.  Examples include the principle of one-person/one-vote, rotating credit associations, equal starting points in a race, taking turns offering dinner invitations, and giving an equal number of minutes to each candidate to deliver an on-air speech.

Market Pricing is the application of ratios to social interaction.  This can involve maximization or minimization as in trying to maximize profit or minimize loss.  But it can also involve arriving at an intuitively fair proportion, as in a judge deciding on a punishment proportional to a crime.  In Market Pricing, all socially relevant properties of a relationship are reduced to a single measure of value, such as money or pleasure.  Most utilitarian principles involve maximization.  An exception would be Negative Utilitarianism whose principle is the minimization of suffering.  But all utilitarian principles are applications of Market Pricing, since the maximum and the minimum are both proportions.  Other examples include rents, taxes, cost-benefit analyses including military estimates of kill ratios and proportions of fighter planes potentially lost, tithing, and prostitution.

RMT has been extensively corroborated by controlled studies using a great variety of methods to investigate diverse phenomena, including cross-cultural studies (Haslam 2004b).  The research shows that the elementary models play an important role in cognition, including the perception of other persons.

b. Resemblance to Classic Measurement Scales

It may be jarring to learn that intense romantic love and racism are both categorized as Communal Sharing or that tithing and prostitution are both instances of Market Pricing.  These examples illustrate that a relational model is, at its core, a meaningless formal structure.  Implementation in interpersonal relations and attendant emotional associations enter in on a different level of mental processing.  Each model can be individuated in purely formal terms, each elementary model strongly resembling one of the classic scale types familiar from measurement theory.  (Strictly speaking, it is each mod which can be individuated in purely formal terms.  This finer point will be discussed in a later section.)

Communal Sharing resembles a nominal (categorical) scale.  A nominal scale is simply classifying things into categories.  A questionnaire may be designed to categorize people as theist, atheist, agnostic, and other.  Such a questionnaire is measuring religious belief by using a nominal scale.  The groups into which Communal Sharing sorts people are similar.  One either belongs to a pertinent group or one does not, there being no degrees or shades of gray.  Another illustration of nominal scaling is the pass/fail system of grading.  Authority Ranking resembles an ordinal scale, in which items are ranked.  The ranking of students according to their performance is one example.  The ordered classification of shirts in a store as small, medium, large, and extra large is another.  Equality Matching resembles an interval scale.  On an interval scale, any unit measures the same magnitude at any point on the scale.  For example, on the Celsius scale the difference between 1 degree and 2 degrees is the same as the difference between 5 degrees and 6 degrees.  Equality Matching resembles an interval scale insofar as one can measure the degree of inequality in a social relationship using equal intervals so as to judge how to correct the imbalance.  It is by use of such a scale that people in an Equality Matching interaction can specify how much one person owes another.  However, an interval scale cannot be used to express a ratio because it has no absolute zero point.  For example, the zero point on the Celsius scale is not absolute, so one cannot say that 20 degrees is twice as warm as 10 degrees; on the Kelvin scale, by contrast, the zero point is absolute, so ratios can be expressed.  Given that Market Pricing is the application of ratios to social interactions, it resembles a ratio scale such as the Kelvin scale.  One cannot, for example, meaningfully speak of the maximization of utility without presupposing some sort of ratio scale for measuring utility.  Maximization would correspond to 100 percent.
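
The Celsius/Kelvin point can be made concrete with a small arithmetic sketch in Python (the sketch and its numbers are purely illustrative and are not drawn from the RMT literature): dividing two Celsius readings looks like a ratio, but converting to Kelvin shows that the apparent ratio was an artifact of an arbitrary zero point, while differences survive the conversion.

    # Illustrative sketch: why ratio talk fails on an interval scale (Celsius)
    # but succeeds on a ratio scale (Kelvin).

    def celsius_to_kelvin(c: float) -> float:
        """Shift the zero point to absolute zero."""
        return c + 273.15

    a_c, b_c = 20.0, 10.0              # two temperatures in Celsius
    print(a_c / b_c)                   # 2.0 -- looks like "twice as warm"

    a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)
    print(a_k / b_k)                   # ~1.035 -- the "ratio" depended on an
                                       # arbitrary zero point

    # Differences, which are all an interval scale is entitled to report,
    # are preserved by the change of zero point.
    print(a_c - b_c)                   # 10.0
    print(a_k - b_k)                   # 10.0 (up to floating-point rounding)

Only a scale with an absolute zero licenses ratio claims, which is why Market Pricing is modeled on the ratio scale.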

c. Self-Organization and Natural Selection

The four measurement scales correspond to different levels of semantic richness and precision.  The nominal scale conveys little information, being very coarse grained.  For example, pass/fail grading conveys less information than ranking students.  Giving letter grades is even more precise and semantically rich, conveying how much one student out-performs another; this is the use of an interval scale.  The most informative and semantically rich is a percentage grade, which expresses the ratio by which one student out-performs another, hence a ratio scale.  For example, if graded accurately, a student scoring 90 percent has done twice as well as a student scoring 45 percent.  Counterexamples may seem apparent: two students could be ranked differently while receiving the same letter grade under a deliberately coarse-grained letter grading system designed to minimize low grades.  To take an extreme case, a very generous instructor might award an A to every student (after all, no student was completely lost in class) while at the same time mentally ranking the students in terms of their performance.  Split grades are sometimes used to smooth out the traditional coarse-grained letter grading system.  But, if both scales are as sensitive as possible and based on the same data, the interval scale will convey more information than the ordinal scale.  The ordinal ranking will be derivable from the interval grading, but not vice versa.  This is more obvious in the case of temperature measurement, in which grade inflation is not an issue.  Simply ranking objects in terms of warmer/colder conveys less information than does Celsius measurement.

One scale is more informative than another because it is less symmetrical; greater asymmetry means that more information is conveyed.  On a measurement scale, a permutation which distorts or changes information is an asymmetry.  Analogously, a permutation in a social-relational arrangement which distorts or changes social relations is an asymmetry.  In either case, a permutation which does not carry with it such a distortion or change is symmetric.  The nominal scale type is the most symmetrical scale type, just as Communal Sharing is the most symmetrical elementary model.  In either case, the only asymmetrical permutation is one which moves an item out of a category, for example, expelling someone from the social group.  Any permutation within the category or group makes no difference; no difference to the information conveyed, no difference to the social relation.  In the case of pass/fail grading, the student’s performance could be markedly different from what it actually was; so long as the student did well enough to pass (or poorly enough to fail), the grade would not change.  Thanks to this high degree of symmetry, the nominal scale conveys relatively little information.

The ordinal scale is less symmetrical.  Any permutation that changes rankings is asymmetrical, since it distorts or changes something significant.  But items arranged could change in many respects relative to each other while their ordering remains unaffected, so a high level of symmetry remains.  Students could vary in their performance, but so long as their relative ranking remains the same, this would make no difference to grades based on an ordinal scale.

An interval scale is even less symmetrical and hence more informative, as seen in the fact that a system of letter grades conveys more information than does a mere ranking of students.  An interval scale conveys the relative degrees of difference between items.  If one student improves from doing C level work to B level work, this would register on an interval scale but would remain invisible on an ordinal scale if the change did not affect student ranking.  Analogously, in Equality Matching, if one person, and one person only, were to receive an extra five minutes to deliver their campaign speech, this would be socially significant.  By contrast, in Authority Ranking, the addition of an extra five minutes to the time taken by a Prince to deliver a speech would make no socially significant difference provided that the relative ranking remains undisturbed (for example, the King still being allotted more time than the Prince, and the Duke less than the Prince).

In Market Pricing, as in any ratio scale, the asymmetry is even greater.  Adding five years to the punishment of every convict could badly skew what should be proportionate punishments.  But giving an extra five minutes to each candidate would preserve balance in Equality Matching.

The symmetries of all the scale types have an interesting formal property.  They form a descending symmetry subgroup chain.  In other words, the symmetries of a ratio scale form a subset of the symmetries of a relevant interval scale, the symmetries of that scale form a subset of the symmetries of a relevant ordinal scale, and the symmetries of that scale form a subset of the symmetries of a relevant nominal scale.  More specifically, the scale types form a containment hierarchy.  Analogously, the symmetries of Market Pricing form a subset of the symmetries of Equality Matching which form a subset of the symmetries of Authority Ranking which form a subset of the symmetries of Communal Sharing.  Descending subgroup chains are common in nature, including inorganic nature.  The symmetries of solid matter form a subset of the symmetries of liquid matter which form a subset of the symmetries of gaseous matter which form a subset of the symmetries of plasma.
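
In measurement theory, scale types are standardly individuated by the transformations that leave their information intact, and the containment of one family of transformations in another is one way to picture the descending chain just described. The following sketch is illustrative only (the function and variable names are hypothetical and not part of RMT): rescaling preserves ratios, intervals, and order; rescaling with an offset preserves intervals and order but not ratios; an arbitrary monotone stretch preserves order alone.

    # Illustrative sketch: each scale type is characterized by the
    # transformations that leave its information intact.

    data = [1.0, 2.0, 5.0, 6.0]

    scale_up = lambda x: 3.0 * x        # preserves ratios, intervals, order
    shift    = lambda x: 3.0 * x + 7.0  # preserves intervals and order, not ratios
    stretch  = lambda x: x ** 3         # preserves order only
    # (any bijection at all preserves nominal category membership)

    def preserves_ratios(f):
        return abs(f(data[1]) / f(data[0]) - data[1] / data[0]) < 1e-9

    def preserves_intervals(f):
        # Simplified check: the two equal gaps in the data stay equal.
        return abs((f(data[1]) - f(data[0])) - (f(data[3]) - f(data[2]))) < 1e-9

    def preserves_order(f):
        xs = [f(x) for x in data]
        return xs == sorted(xs)

    for name, f in [("scale_up", scale_up), ("shift", shift), ("stretch", stretch)]:
        print(name, preserves_ratios(f), preserves_intervals(f), preserves_order(f))
    # scale_up True  True  True
    # shift    False True  True
    # stretch  False False True

Each family of admissible transformations is a subset of the next, paralleling the descending subgroup chain that runs from Market Pricing through Equality Matching and Authority Ranking to Communal Sharing.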

This raises interesting questions about the origins of these patterns in the mind: could they result from spontaneous symmetry breakings in brain activity rather than being genetically encoded?  Darwinian adaptations are genetically encoded, whereas spontaneous symmetry breaking is ubiquitous in nature rather than being limited to genetically constrained structures.  The appeal to spontaneous symmetry breaking suggests a non-Darwinian approach to understanding how the elementary models could be “innate” (in the sense of being neither learned nor arrived at through reason).  That is, are the elementary relational models results of self-organization rather than learning or natural selection?  If they are programmed into the genome, why would this programming imitate a pattern in nature which usually occurs without genetic encoding?  The spiral shape of a galaxy, for example, is due to spontaneous symmetry breaking, as is the transition from liquid to solid.  But these transitions are not encoded in genes, of course.  Being part of the natural world, why should the elementary models be understood any differently?

d. Compound Models

While all relational models are analyzable into four fundamental models, the number of models as such is potentially infinite.  This is because social-relational cognition is productive; any instance of a model can serve as a constituent in an even more complex instance of a model.  Consider Authority Ranking and Market Pricing; an instance of one can be embedded in or subordinated to an instance of the other.  When a judge decides on a punishment that is proportionate to the crime, the judge is using a ratio scale and hence Market Pricing.  But the judge is only authorized to do this because of her authority, hence Authority Ranking.  We have here a case of Market Pricing embedded in a superordinate (as opposed to subordinate) structure of Authority Ranking resulting in a compound model.  Now consider ordering food from a waiter.  The superordinate relationship is now Market Pricing, since one is paying for the waiter’s service.  But the service itself is Authority Ranking with the customer as the superior party.  In this case, an instance of Authority Ranking is subordinate to an instance of Market Pricing.  This is also a compound model with the same constituents but differently arranged.  The democratic election of a leader is Authority Ranking subordinated to Equality Matching.  An elementary school teacher’s supervising children to make sure they take turns is Equality Matching subordinated to Authority Ranking.

A model can also be embedded in a model of the same type.  In some complex egalitarian social arrangements, one instance of Equality Matching can be embedded in another.  Anton Pannekoek’s proposed Council Communism is one such example.  The buying and selling of options is the buying and selling of the right to buy and sell, hence recursively embedded Market Pricing.  Moose society is largely structured by a complex model involving multiple levels of Communal Sharing.  A family among the Moose is largely structured by Communal Sharing, as is the village which embeds it, as is the larger community that embeds the village, and so on.  In principle, there is no upper limit on the number of embeddings in a compound model.  Hence, the number of potential relational models is infinite.
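
The recursive character of compounding can be pictured with a short data-structure sketch in Python. It is only an illustration (the class and field names are hypothetical and are not Fiske’s notation): an elementary model is a leaf, a compound embeds one model in another as subordinate to superordinate, and nothing prevents the nesting from continuing indefinitely.

    from dataclasses import dataclass
    from typing import Union

    # Labels for the four elementary models.
    CS, AR, EM, MP = ("Communal Sharing", "Authority Ranking",
                      "Equality Matching", "Market Pricing")

    @dataclass
    class Compound:
        superordinate: "Model"   # the embedding (framing) model
        subordinate: "Model"     # the embedded model

    Model = Union[str, Compound]

    # A judge fixing a punishment proportionate to the crime:
    # Market Pricing embedded in a superordinate Authority Ranking relation.
    sentencing = Compound(superordinate=AR, subordinate=MP)

    # Ordering food from a waiter: Authority Ranking (customer over waiter)
    # embedded in a superordinate Market Pricing relation (paying for service).
    restaurant = Compound(superordinate=MP, subordinate=AR)

    # Same-type embedding, for example a family structured by Communal Sharing
    # embedded in a village also structured by Communal Sharing.
    village = Compound(superordinate=CS, subordinate=CS)

    def depth(m: Model) -> int:
        """Number of embeddings in a model; unbounded in principle."""
        if isinstance(m, Compound):
            return 1 + max(depth(m.superordinate), depth(m.subordinate))
        return 0

    print(depth(MP), depth(sentencing), depth(Compound(CS, village)))   # 0 1 2

Because a compound can itself appear inside another compound, the number of constructible models has no upper bound, which is the sense in which social-relational cognition is productive.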

e. Mods and Preos

A model, whether elementary or compound, is devoid of meaning when considered in isolation.  As purely abstract structures, models are sometimes known as “mods”, which is an abbreviation of “cognitively modular but modifiable modes of interacting” (Fiske 2004, 3).  (This may be a misnomer, since, as purely formal structures devoid of semantic content, mods are not modes of social interaction any more than syntax is a communication system.)  In order to externalize models, that is, in order to use them to interpret or motivate or structure interactions, one needs “preos,” these being “socially transmitted prototypes, precedents, and principles that complete the mods, specifying how, when and with respect to whom the mods apply” (2004, 4).  Strictly speaking, a relational model is the union of a mod with a preo.  A mod has the formal properties of symmetry, asymmetry, and in some cases embeddedness.  But a mod requires a preo in order to have the properties intuitively identifiable as meaningful, such as social application, emotional resonance, and motivating force.

The notion of a preo updates and includes the notion of an implementation rule, from an earlier stage of relational-models theorizing.  Fiske has identified five kinds of implementation rules (1991, 142).  One kind specifies the domain to which a model applies.  For example, in some cultures Authority Ranking is used to structure and give meaning to marriage.  In other cultures, Authority Ranking does not structure marriage and may even be viewed as immoral in that context.  Another sort of implementation rule specifies the individuals or groups which are to be related by the model.  Communal Sharing, for example, can be applied to different groups of people.  Experience, and sometimes also agreement, decides who is in the Communal Sharing group.  In implementing Authority Ranking, it is not enough to specify how many ranks there are.  One must also specify who belongs to which rank.  A third sort of implementation rule defines values and categories.  In Equality Matching, each participant must give or receive the same thing.  But what counts as the same thing?  In Authority Ranking, a higher-up deserves honor from a lower-down, but what counts as honor and what constitutes showing honor?  There are no a priori or innate answers to these questions; culture and mutual agreement help settle such matters.  Consider the principle of one-person/one-vote, an example of Equality Matching.  Currently in the United States and Great Britain, whoever gets the most votes wins the election.  But it is also possible to have a system in which a two-thirds majority is necessary for there to be a winner.  Without a two-thirds majority, there may be a coalition government, a second election with the lowest performing candidates eliminated, or some other arrangement.  These are different ways of determining what counts as treating each citizen as having an equal say.  A fourth determines the code used to indicate the existence and quality of the relationship.  Authority Ranking is coded differently in different cultures, as it can be represented by the size of one’s office, the height of one’s throne, the number of bars on one’s sleeve, and so forth.  A fifth sort of implementation rule concerns a general tendency to favor some elementary models over others.  For example, Market Pricing may be highly valued in some cultures as fair and reasonable while categorized as dehumanizing in others.  The same is clearly true of Authority Ranking.  Communal Sharing is much more prominent and generally valued in some cultures than in others.  This does not mean that any culture is completely devoid of any specific elementary model but that some models are de-emphasized and marginalized in some cultures as compared to others.  So the union of mod and preo may even serve to marginalize the resulting model in relation to other models.
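
One way to picture the relation between mods and preos is as a formal label paired with a bundle of culturally supplied parameters roughly corresponding to the five kinds of implementation rules just described. The sketch below is a loose illustration only; the field names and example values are hypothetical and are not Fiske’s terminology.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Preo:
        """Culturally transmitted parameters that complete a mod."""
        domain: str              # where the mod applies (e.g. marriage, elections)
        participants: List[str]  # who is related by the model
        categories: dict         # what counts as "the same", "honor", a valid vote
        code: str                # how the relationship is marked (office size, bars on a sleeve)
        salience: float          # how strongly the culture favors this model (0..1)

    @dataclass
    class RelationalModel:
        mod: str                 # the formal structure, e.g. "Equality Matching"
        preo: Preo               # the parameters that give it social meaning

    # The same mod united with different preos yields different norms.
    turn_taking_dinners = RelationalModel(
        mod="Equality Matching",
        preo=Preo(domain="dinner invitations",
                  participants=["household A", "household B"],
                  categories={"equal return": "one invitation per invitation received"},
                  code="verbal invitation",
                  salience=0.8))

    one_person_one_vote = RelationalModel(
        mod="Equality Matching",
        preo=Preo(domain="elections",
                  participants=["all enfranchised citizens"],
                  categories={"winning": "simple plurality"},  # could instead be a two-thirds rule
                  code="secret ballot",
                  salience=0.9))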

The fact that the same mod can be united with different preos is one source of normative plurality across cultures, to be discussed in the next section.  Another source is the generation of distinct compound mods.  Different cultures can use different mods, since there is a considerable number of potential mods to choose from.

2. Philosophical Implications

a. Moral Psychology

Each elementary model crucially enters into certain moral values.  An ethic of service to one’s group is a form of Communal Sharing.  It is an altruistic ethic in some sense, but bear in mind that all members of the group share a common identity, so, strictly speaking, it is not true altruism.  Authority Ranking informs an ethic of obedience to authority, including respect, honor, and loyalty.  Any questions of value remaining to be clarified are settled by the authority; subordinates are expected to follow the values thus dictated.  Fairness and even distribution are informed by Equality Matching.  John Rawls’ veil of ignorance exemplifies Equality Matching; a perspective in which one does not know which role one will play ensures that one aims for equality.  Gregory Vlastos has even attempted to reduce all distributive justice to a framework that can be identified with Equality Matching.  Market Pricing informs libertarian values of freely entering into contracts and taking risks with the aim of increasing one’s own utility or the utility of one’s group.  But this also includes suffering the losses when one’s calculations prove incorrect.  Utilitarianism is a somewhat counterintuitive attempt to extend this sort of morality to all sentient life, but is still recognizable as Market Pricing.  It would be too simple, however, to say that there are only four sorts of values in RMT.  In fact, combinations of models yield complex models, resulting in a potential infinity of complex values.  Potential variety is further increased by the variability of possible preos.  This great variety of values leads to value conflicts, most noticeably across cultures.

RMT strongly suggests value pluralism, in Isaiah Berlin’s sense of “pluralism”. The pluralism in question is a cultural pluralism, different traditions producing mutually incommensurable values. Berlin drew a distinction between relativism and pluralism, even though there are strong similarities between the two. Relativism and pluralism both acknowledge values which are incommensurable, meaning that they cannot be reconciled and that there is no absolute or objective way to judge between them. Pluralism, however, acknowledges empathy and emotional understanding across cultures. Even if one does not accept the values of another culture, one still has an emotional understanding of how such values could be adopted. This stands in contrast to relativism, as defined by Berlin. If relativism is true, then there can be no emotional understanding of alien values. One understands the value system of an alien culture in essentially the same manner as one understands the behavior of ants or, for that matter, the behavior of tectonic plates; it is a purely causal understanding. It is the emotionally remote understanding of the scientist rather than the empathic understanding of someone engaging, say, with the poetry and theatre of another culture. If one adopts RMT, pluralism seems quite plausible. Given that one has the mental capacity to generate the relevant model, one can replicate the alien value in oneself. One is not simply thinking about the foreigner’s relational model, but using one’s shared human nature to produce that same model in oneself. This does not, however, mean that one adopts that value, since one can also retain the conflicting model characteristic of one’s own culture. One’s decisive motivation may still flow wholly from the latter.

But the significance of RMT for the debate over pluralism and absolutism may be more complex than indicated above.  Since RMT incorporates the view that people perceive social relationships as intrinsic values, this may indicate that a society which fosters interactions and relationships is absolutely better than one which divides and atomizes, at least along that one dimension.  This may be an element of moral absolutism in RMT, and it is interesting to see how it is to be reconciled with any pluralism also implied.

b. Computational Conceptions of Cognition

The examples of embedding in Section 1.d. not only illustrate the productivity of social-relational cognition, but also its systematicity. To say that thought is systematic is to say that the ability to think a given thought renders probable the ability to think a semantically close thought. The ability to conceive of Authority Ranking embedding Market Pricing makes it highly likely that one can conceive of Market Pricing embedding Authority Ranking. One finds productivity and systematicity in language as well. Any phrase can be embedded in a superordinate phrase. For example, the determiner phrase [the water] is embedded in the prepositional phrase [in [the water]], and the prepositional phrase [in [the water]] is embedded in the determiner phrase [the fish [in [the water]]]. The in-principle absence of limit here means that the number of phrases is infinite. Further, the ability to parse (or understand) a phrase renders very probable the ability to parse (or understand) a semantically close phrase. For example, being able to mentally process Plato did trust Socrates makes it likely that one can process Socrates did trust Plato as well as Plato did trust Plato and Socrates did trust Socrates. Productivity and systematicity, either in language or in social-relational cognition, constitute a strong inductive argument for a combinatorial operation that respects semantic relations. (The operation respects semantic relations, given that the meaning of a generated compound is a function of the meanings of its constituents and their arrangement.) In other words, it is an argument for digital computation.
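
The combinatorial point can be illustrated with a short sketch. This is purely illustrative and is not Fiske’s or Bolender’s formalism; the class and its labels are invented for the example. A single recursive operation (embedding one model in another) yields productivity, since embeddings can be stacked without limit, and systematicity, since any compound that can be built can also be built with its constituents rearranged.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Mod:
        # A relational model: an elementary label plus an optional embedded Mod.
        label: str
        embedded: Optional["Mod"] = None

        def describe(self) -> str:
            if self.embedded is None:
                return self.label
            return f"{self.label} embedding ({self.embedded.describe()})"

    # Systematicity: if this compound is thinkable...
    ar_over_mp = Mod("Authority Ranking", Mod("Market Pricing"))
    # ...so is the semantically close compound with its constituents swapped.
    mp_over_ar = Mod("Market Pricing", Mod("Authority Ranking"))

    # Productivity: embeddings can be stacked indefinitely.
    deeper = Mod("Communal Sharing", ar_over_mp)

    print(ar_over_mp.describe())
    print(mp_over_ar.describe())
    print(deeper.describe())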

This is essentially Noam Chomsky’s argument for a computational procedure explaining syntax (insofar as syntax is not idiomatic).  It is also essentially Jerry Fodor’s argument for computational procedures constituting thought processes more generally.  That digital computation underlies both complex social-relational cognition and language raises important questions.  Are all mental processes largely computational or might language and social-relational cognition be special cases?  Do language and social-relational cognition share the same computational mechanism or do they each have their own?  What are the constraints on computation in either language or social-relational cognition?

c. Platonism

Chomsky has noted the discrete infinity of language.  Each phrase consists of a number of constituents which can be counted using natural numbers (discreteness), and there is no longest phrase meaning that the set of all possible phrases is infinite.  Analogous points apply to social-relational cognition.  The number of instances of an elementary mod within any mod can be counted using natural numbers.  In the case discussed earlier in which a customer is ordering food from a waiter, there is one instance of Authority Ranking embedded in one instance of Market Pricing.  The total number of instances is two, a natural number.  There is no principled upper limit on the number of embeddings, hence infinity.  The discrete infinity of language and social-relational cognition is tantamount to their productivity.

However, some philosophers, especially Jerrold Katz, have argued that nothing empirical can exhibit discrete infinity. Something empirical may be continuously infinite, such as a volume of space containing infinitely many points. But the indefinite addition of constituent upon constituent has no empirical exemplification. Space-time, assuming it is finite, would contain only finite energy and a finite number of particles; there would not be infinitely many objects, as discrete infinity would require. On this reasoning, the discrete infinity of an entity can only mean that the entity exists beyond space and time, still assuming that space-time is finite. Sentences, and by similar reasoning compound mods as well, would then be abstract objects, like numbers, rather than neural features or processes. One finds here a kind of Platonism, Platonism here defined as the view that there are abstract objects.

As a tentative reply, one could say that the symbols generated by a computational system are potentially infinite in number, but this raises questions about the nature of potentiality.  What is a merely potential mod or a merely potential sentence?  It is not something with any spatiotemporal location or any causal power.  Perhaps it is sentence types (as contrasted with tokens) that exhibit discrete infinity.  And likewise with mods, it is mod types that exhibit discrete infinity.  But here too, one is appealing to entities, namely types, that have no spatiotemporal location or causal power.  By definition, these are abstract objects.

The case for Platonism is perhaps stronger for compound mods, but one could also defend the same conclusion with regard to the elementary mods. Each elementary mod, as noted earlier, corresponds to one of the classic measurement scales. Different scale types are presupposed by different logics. Classical two-valued logic presupposes a nominal scale, as illustrated by the law of excluded middle: a statement is either on the truth scale, in which case it is true, or off the scale, in which case it is false. Alternatively, one could posit two categories, one for true and one for false, and stipulate that any statement belongs to one category or the other. Fuzzy logics conceive truth either in terms of interval scales (for example, it is two degrees more true that Michel is bald than that Van is bald) or in terms of ratio scales (for example, it is 80 percent true that Van is bald and 100 percent true that Michel is bald). Even though it has perhaps not been formalized, there is intuitively a logic which presupposes an ordinal scale: a logic, say, in which it is more true that chess is a game than that Ring a Ring o’ Roses is a game, even though it would be meaningless to ask how much more. If nominal, ordinal, interval, and ratio scales are more basic than various logics, then the question arises as to whether they can seriously be considered empirical or spatiotemporal. If anything is Platonic, then something more basic than logic is likely to be Platonic. And what is an elementary mod aside from the scale type which it “resembles”? Is there any reason to distinguish the elementary mod from the scale type itself? If not, then the elementary mods themselves are abstract objects, at least on this argument.
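
The contrast between scale types can be made concrete with a small sketch. This is only an illustration and not a formalization the article commits to; the min rule for fuzzy conjunction is one standard choice from Zadeh-style fuzzy logic, and the names and truth degrees are invented for the example.

    # Classical two-valued logic: truth on a nominal scale (a statement is
    # simply in the "true" category or in the "false" category).
    def classical_and(p: bool, q: bool) -> bool:
        return p and q

    # A ratio-scaled fuzzy logic: truth as a degree between 0 and 1, so
    # "it is 80 percent true that Van is bald" is a well-formed claim.
    def fuzzy_and(p: float, q: float) -> float:
        return min(p, q)  # one common choice for fuzzy conjunction

    van_is_bald = 0.8      # 80 percent true
    michel_is_bald = 1.0   # 100 percent true

    print(classical_and(True, False))              # False
    print(fuzzy_and(van_is_bald, michel_is_bald))  # 0.8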

Does reflection upon language and the relational models support a Platonist metaphysic?  If so, what is one to make of the earlier discussion of RMT appealing, as it did, to neural symmetry breakings and mental computations?  If mods are abstract objects, then the symmetry breakings and computations may belong to the epistemology of RMT rather than to its metaphysics.  In other words, they may throw light on how one knows about mods rather than actually constituting the mods themselves.  Specifically, the symmetry breaking and computations may account for the production of mental representations of mods rather than the mods themselves.  But whether or not there is a good case here for Platonism is, no doubt, open to further questioning.

3. References

a. Specifically Addressing Relational Models Theory

  • Bolender, John. (2010), The Self-Organizing Social Mind (Cambridge, Mass.: MIT Press).
    • Argues that the elementary relational models are due to self-organizing brain activity.  Also contains a discussion of possible Platonist implications of RMT.
  • Bolender, John. (2011), Digital Social Mind (Exeter, UK: Imprint Academic).
    • Argues that complex relational models are due to mental computations.
  • Fiske, Alan Page. (1990), “Relativity within Moose (‘Mossi’) culture: four incommensurable models for social relationships,” Ethos, 18, pp. 180-204.
    • Fiske here argues that RMT supports moral relativism, although his “relativism” may be the same as Berlin’s “pluralism.”
  • Fiske, Alan Page. (1991), Structures of Social Life: The Four Elementary Forms of Human Relations (New York: The Free Press).
    • The classic work on RMT, containing the first full statement of the theory and a wealth of anthropological illustrations.
  • Fiske, Alan Page. (1992), “The Four Elementary Forms of Sociality: Framework for a Unified Theory of Social Relations,” Psychological Review, 99, pp. 689-723.
    • Essentially, a shorter version of Fiske’s (1991).  Nonetheless, this is a detailed and substantial introduction to RMT.
  • Fiske, Alan Page. (2004), “Relational Models Theory 2.0,” in Haslam (2004a).
    • An updated introduction to RMT.
  • Haslam, Nick. ed. (2004a), Relational Models Theory: A Contemporary Overview (Mahwah, New Jersey and London: Lawrence Erlbaum).
    • An anthology containing an updated introduction to RMT as well as discussions of controlled empirical evidence supporting the theory.
  • Haslam, Nick. (2004b), “Research on the Relational Models: An Overview,” in Haslam (2004a).
    • Reviews controlled studies corroborating that the elementary relational models play an important role in cognition including person perception.
  • Pinker, Steven. (2007), The Stuff of Thought: Language as a Window into Human Nature (London: Allen Lane).
    • Argues that Market Pricing, in contrast to the other three elementary models, is not innate and is somehow unnatural.

b. Related Issues

  • Berlin, Isaiah. (1990), The Crooked Timber of Humanity: Chapters in the History of Ideas. Edited by H. Hardy (London: Pimlico).
    • A discussion of value pluralism in the context of history of ideas.
  • Fodor, Jerry A. (1987), Psychosemantics: The Problem of Meaning in the Philosophy of Mind (Cambridge, Mass. and London: MIT Press).
    • The Appendix argues that systematicity and productivity in thought require a combinatorial system.  The point, however, is a general one, not specifically focused on social-relational cognition.
  • Katz, Jerrold J. (1996), “The unfinished Chomskyan revolution,” Mind & Language, 11 (3), pp. 270-294.
    • Argues that only an abstract object can exhibit discrete infinity.
  • Rawls, John. (1971), A Theory of Justice (Cambridge, Mass.: Harvard University Press).
    • The veil of ignorance illustrates Equality Matching.
  • Stevens, S. S. (1946), “On the Theory of Scales of Measurement,” Science, 103, pp. 677-680.
    • A classic discussion of the types of measurement scales.
  • Szpiro, George G. (2010), Numbers Rule: The Vexing Mathematics of Democracy, from Plato to the Present (Princeton: Princeton University Press).
    • Illustrates various ways in which Equality Matching can be implemented.
  • Vlastos, Gregory. (1962), “Justice and Equality,” in Richard B. Brandt, ed. Social Justice (Englewood Cliffs, New Jersey: Prentice-Hall).
    • An attempt to understand all distributive justice in terms of Equality Matching.

Author Information

John Bolender
Email: bolender@metu.edu.tr
Middle East Technical University
Turkey

Global Ethics: Capabilities Approach

The capabilities approach is meant to identify a space in which we can make cross-cultural judgments about ways of life. It is radically different from, yet indebted to, traditional ethical theories such as virtue ethics, consequentialism and deontology.

This article begins with a background on global ethics. This situates the capabilities approach as a possible solution to the problems that arise from globalization. The second section provides Amartya Sen’s account of the basic framework of the capabilities approach. That section also shows how Martha Nussbaum develops the approach. The third section describes Nussbaum’s list of ten central capabilities. This list has been viewed by some philosophers as a definitive list, while others, notably Sen, have argued that no list is complete, because a list should always be subject to revision. The fourth section shows how the approach is similar to, yet very different from, traditional ethical theories such as virtue ethics, consequentialism and deontology. The capabilities approach is shown to add to the approaches of global ethics such as communitarianism, human rights, and the approach of John Rawls. The section compares Michael Boylan’s table of embeddedness with Nussbaum’s capabilities list. The fifth section discusses two main philosophical critiques of the capabilities approach. First, and most notably, Alison Jaggar criticizes Nussbaum for not paying closer attention to asymmetrical power relations. Second, Bernard Williams raises questions about what constitutes a capability. The sixth section shows how the capabilities approach has been applied to advance various areas of applied philosophy including the environment and disability ethics. The final section explains how the capabilities approach has been undertaken as a global endeavor by the United Nations Development Program to fight poverty and illiteracy and to empower women.

Table of Contents

  1. Background of Global Ethics
  2. The Capabilities Approach
    1. Sen
    2. Nussbaum
  3. Nussbaum’s List of Central Capabilities
  4. The Relationship between the Capabilities Approach and Other Ethical Theories
    1. Virtue Ethics
    2. Communitarianism
    3. Deontology
    4. Rawls’ The Law of Peoples
    5. Human Rights
    6. Consequentialism
    7. Boylan’s Table of Embeddedness
  5. Philosophical Criticisms of the Capabilities Approach
    1. Illiberal and Neo-Colonialist
    2. What Is a Capability?
  6. Philosophical Applications
    1. The Environment
    2. Disability Ethics
  7. United Nations Development Program
  8. References and Further Reading

1. Background of Global Ethics

Issues of globalization have sparked great controversy since the 1980s. Globalization, broadly construed, is manifested in various forms of social activity including economic, political and cultural life. Practicing global ethics entails moral reasoning across borders. Borders can entail culture, religion, ethnicity, gender, race, class, sexuality, global location, historical experience, environment, species and nations. Ethicists ask how we best address issues of globalization–that is, how we begin to address conflicts that arise when vastly different cultural norms, values, and practices collide.

There have been two broad philosophical approaches to addressing cross-border moral disagreement and conflict. The dominant approach aims to develop moral theories that are not committed to a single metaphysical world-view or religious foundation, but are compatible with various perspectives. In other words, the goal is to develop a theory that is both ‘thick’ (that is, it has a robust conception of the good embedded within a particular context, and respects local traditions) and ‘thin’ (that is, it embraces a set of universal norms). Universalist approaches of this kind include human rights theory, Onora O’Neill’s deontology, Seyla Benhabib’s discourse ethics and Martha Nussbaum’s capabilities approach. They tend to be associated with constructing ‘thin’ theories of morality. The other approach, most notably advocated by Michael Walzer, is communitarianism. Communitarians deny the possibility of developing a single universal standard of flourishing that is both thick enough to be useful and thin enough to support reasonable pluralism.

The debate between these two approaches to global ethics has reached an impasse. Since communitarians hold that moral norms are always local and valid only within a particular community, universalists charge the communitarians with relativism. Moreover, universalists argue that communitarians fail to provide useful methods for addressing cross-border moral conflict. However, the communitarians charge the universalists with either positing theories that are too thin to be useful or advancing theories that are substantive but covertly build in premises that are not universally shared, and so risk cultural imperialism.

Martha Nussbaum believes her capabilities theory resolves the impasse and offers a viable approach to global ethics that provides a universal measure of human flourishing while also respecting religious and cultural differences. The capabilities approach, she argues, is universal, but ‘of a particular type.’ That is, it is a thick (or substantive) theory of morality that accommodates pluralism. Thus, she argues that her theory avoids criticisms applied to other universalists and communitarians. Before examining her theory, we must address her predecessor, Amartya Sen.

2. The Capabilities Approach

a. Sen

Amartya Sen, an economic theorist and founder of the capabilities approach, developed his theory in order to identify a space in which we can make cross-cultural judgments on the quality of life. To best understand how these judgments can be passed, we must investigate a critical distinction made by proponents of the capabilities approach–between function and capability. A function, on the one hand, according to Sen, is an achievement, but this should be broadly understood to include any ‘state of being.’ Let’s examine Sen’s bike-riding example to shed light on a ‘function.’ He says a bicyclist has achieved the purpose of what one does with a bike–namely, ride it. From this example, clearly the choice to ride a bike is a function of a human being; however, the scope of functioning is not limited merely to a person’s intention to ride the bike. A ‘function’ entails any ‘state of being’ which includes excitement, happiness and fear. For example, a child who first begins to ride her bike may display a great amount of fear as she wobbles down the road, but once she understands how to ride the bike smoothly, she can enjoy (or perhaps become excited by) riding her bike. Thus, when the child rides her bike (and is excited by doing so), she has performed the functions of riding a bike, and having the emotions associated with doing so, while partaking in the capability of play.

A capability, on the other hand, is a possibility, not just any possibility, but a real one. For example, we can talk about the possibility that a person in a deeply poverty-stricken area might find employment and support a family. However, such a possibility may not be real given external circumstances–for example, no clothing, food or shelter. Put differently, a ‘capability set’ (as Sen calls it) is the total set of functions available for a person to perform. By describing it in this way, Sen draws a deep connection between freedom and function. That is to say, the more limited one’s freedom, the fewer opportunities one has to fulfill one’s functions. In sum, Crocker (2008) says succinctly that, according to Sen, a capability for X entails (1) having the real possibility for X, which (2) depends on my powers, and (3) no external circumstances preventing me from X.
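
Crocker’s three conditions can be summarized schematically. The sketch below is only an illustrative rendering of that summary, not anything Sen or Crocker provides; the function and argument names are invented for the example.

    def has_capability(possible: bool, within_powers: bool,
                       externally_prevented: bool) -> bool:
        # Crocker's gloss of Sen, as reported above: a capability for X requires
        # (1) a possibility for X, (2) the agent's own powers, and (3) no
        # external circumstances preventing X.
        return possible and within_powers and not externally_prevented

    # The employment example above: the option exists and the person is able,
    # but external circumstances (no clothing, food or shelter) prevent it.
    print(has_capability(True, True, True))   # False
    print(has_capability(True, True, False))  # True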

A capability and a function should not be understood as mutually exclusive or as completely paralleling one another. Let’s consider two people with the same capabilities. Even though they have the same capabilities, they may participate in radically different functions. For example, two people may both have the opportunity to engage in play, but do so in radically different ways (for example, one may swim while the other volunteers at a homeless shelter). Proponents of the capabilities approach argue this makes the theory most attractive, that is, it accommodates various ways of life even though it puts forth a conception of the good. Now, let’s consider a situation in which people participate in the same functions, but possess different capability sets. Consider Sen’s example of hunger. Two people may be hungry, but for radically different reasons. Consider, on the one hand, a person who seeks to fulfill her desire to eat, but cannot because of socio-economic circumstances. On the other, a person may be hungry because she is fasting for religious reasons or protesting an injustice. In both examples, the person goes hungry, but for radically different reasons.

b. Nussbaum

Nussbaum begins her capabilities approach by noting her indebtedness to Aristotle and Karl Marx (and to a lesser extent, J.S. Mill). Like Sen, she embraces the capabilities/function distinction. However, she begins to part ways with Sen’s philosophy when she grounds her theory in Marx and Aristotle. In doing so she argues that a function must not be performed in just any way, but in a ‘truly human way.’ That is to say, if a person lives a life where she is unable to exercise her human powers (for example, self-expressive creativity) then she is living her life in more of an animalistic manner than as a human being.

Nussbaum seeks a capabilities approach that can fully express human powers and not just provide (real) opportunities for people to perform certain functions. In other words, she does not deny, as Sen argues, that a capability is a real possibility or opportunity for an individual to perform certain actions, but that is merely necessary and not sufficient for the capabilities approach. Sen is missing, according to Nussbaum, aspects of what is particularly unique to human beings, that is, human powers. Nussbaum understands the capabilities/function distinction as multiply realized–that is, while the capabilities are the space for the opportunity for particular actions, the way in which that space is manifested, via different actions, is a person’s functioning.

Nussbaum notes that there are three specific differences that set her capabilities approach apart from Sen’s. First, Nussbaum (2000) charges Sen with not explicitly rejecting cultural relativism. While she agrees with his sympathies for universal norms, she criticizes his failure to reject cultural relativism completely. Second, Nussbaum criticizes Sen for not grounding his theory in a Marxian/Aristotelian idea of true human functioning. This is not to say that he would reject Nussbaum’s conclusions drawn from Marx and Aristotle, but rather that he is not specifically indebted to (and does not ground his theory in) them. Third, Sen does not provide an explicit list of central capabilities. As a matter of fact, Sen has been critical of attempting to provide a list of central capabilities. These three points of division, then, separate Sen and Nussbaum.

Nussbaum’s two philosophical justifications are the non-Platonic substantive good approach (that is, intuitionism) and a limited role for proceduralism (that is, discourse ethics)–both of which are points of contention amongst critics. According to the former, the primary justification for the capabilities approach, we test various ethical theories against our fixed intuitions and decide which theory best matches them. Nussbaum contends that the theory that best represents our intuitions is the capabilities approach. The intuition that grounds the capabilities, according to Nussbaum, is the intuition of a dignified human life whereby people have the capability to pursue their conception of the good in cooperation with others. Consider her example of a person’s fixed intuition that rape is damaging to human dignity. She claims that if one tests that intuition against all ethical theories, it will be best represented by the capabilities approach.

One may have reservations about this justification in situations where a person has underdeveloped intuitions (that is, intuitions that have not been challenged by competing intuitions) or mistaken intuitions. In response, Nussbaum argues that underdeveloped and mistaken intuitions must be rejected and replaced with those of diversely experienced people who have tested their intuitions against competing beliefs. Although Nussbaum notes the primacy of intuitionism, she also argues that proceduralism provides an ancillary justification for the capabilities approach.

Nussbaum’s proceduralism begins not with an intuition, but with a decision procedure, and it is the procedure that confers justification on the outcome. She is sympathetic to this form of proceduralism since it is rooted in Kantian discourse ethics (adopted by Jean Hampton), and has accordingly built into it a conception of equal human worth. In that sense proceduralism is similar to the intuitionist justification. However, there are stark contrasts. What is proceduralism, then? The version Nussbaum is concerned with claims that one consults the desires or preferences of another who is impacted by the outcome of the decision at hand. Similar to the concern above, Nussbaum fears that many people’s desires (like intuitions) will be corrupt, and thus produce a morally repugnant conclusion. Therefore, she seeks not just any desires, but ‘informed desires,’ that is, desires constructed by treating people with dignity. However, because not all desires are informed, and yet proceduralism calls for us to consult all desires affected by the decision, the capabilities approach would be placed on too weak a foundation. Thus, in virtue of all the mistaken desires, proceduralism merely plays an ancillary role. Yet it is fair to say that if everyone had informed desires, then Nussbaum would accept proceduralism as a primary justification for the capabilities approach.

These two justifications are meant to be mutually reinforcing. They are meant to justify both the capabilities approach qua theory and the particular list of central capabilities put forth by Nussbaum. However, due to the limitations Nussbaum places on proceduralism, we must rely on intuitionism as the main justification.

3. Nussbaum’s List of Central Capabilities

There is much debate over whether Nussbaum’s list of central capabilities is revisable, and thus subject to change, or whether it is a fixed set of capabilities that cannot be compromised. Earlier in her career, Nussbaum (1995) argued that her list was static; however, she has since backed off such a claim and acknowledged the possibility that it could be altered. From her book, Women and Human Development: The Capabilities Approach (WHD hereafter), here is her list of capabilities, along with a brief description of each.

1. Life – Able to live to the end of a normal-length human life, and not to have one’s life reduced to the point of not being worth living.

2. Bodily Health – Able to have good health, which includes (but is not limited to) reproductive health, nourishment and shelter.

3. Bodily Integrity – Able to change locations freely, in addition to having sovereignty over one’s body, which includes being secure against assault (for example, sexual assault, child sexual abuse and domestic violence) and having the opportunity for sexual satisfaction.

4. Senses, Imagination and Thought – Able to use one’s senses to imagine, think and reason in a ‘truly human way’–informed by an adequate education. Furthermore, the ability to produce self-expressive works and engage in religious rituals without fear of political ramifications. The ability to have pleasurable experiences and avoid unnecessary pain. Finally, the ability to seek the meaning of life.

5. Emotions – Able to have attachments to things outside of ourselves; this includes being able to love others, grieve at the loss of loved ones and be angry when it is justified.

6. Practical Reason – Able to form a conception of the good and critically reflect on it.

7. Affiliation

A. Able to live with and show concern for others, to empathize with (and show compassion for) others, and to have the capability for justice and friendship. Institutions help develop and protect forms of affiliation.

B. Able to have self-respect and not be humiliated by others, that is, being treated with dignity and equal worth. This entails (at the very least) protections against discrimination on the basis of race, sex, sexuality, religion, caste, ethnicity and nationality. In work, this means entering relationships of mutual recognition.

8. Other Species – Able to have concern for and live with other animals, plants and the environment at large.

9. Play – Able to laugh, play and enjoy recreational activities.

10. Control over One’s Environment

A. Political – Able to participate effectively in political life, which includes having the right to free speech and association.

B. Material – Able to own property, not just formally, but materially (that is, as a real opportunity). Furthermore, having the ability to seek employment on an equal basis with others, and freedom from unwarranted search and seizure.

Even though Nussbaum claims each of the ten capabilities is equally important, she places special emphasis on two of them–namely, practical reason and affiliation. We see their importance when she explicitly says that the core intuition behind human functioning is that of a dignified free person who constructs her way of life in reciprocity with others, rather than merely following, or being shaped by, others. Furthermore, Nussbaum notes that these two capabilities suffuse all the others, and this, in turn, makes the pursuit of the others truly human.

Furthermore, Nussbaum argues that the list is ‘thick,’ but ‘vague.’ It is thick because it provides a specific conception of the good life (that is, human flourishing); however, it is not so thick that it mandates how one ought to live one’s life. Thus, the capabilities list is ‘thick’ enough to allow us to make cross-cultural judgments (for example, identifying areas where an individual or groups of people are unable to actualize a capability), and yet ‘vague’ enough for an individual to choose whether or not (or how) she wishes to participate in a capability.

Finally, Nussbaum says that citizens should be guaranteed a social minimum whereby capabilities can be realized. It is the role of institutions to ensure that a threshold level of central capabilities is achieved. Institutions (for example, religious, labor, government, and so forth) come in many forms, and protect various interests. For example, the Self Employed Women’s Association (SEWA) helps women secure protection and benefits for work in which they have traditionally been underappreciated. However, as Nussbaum notes, achieving the threshold may not be enough for justice.

4. The Relationship between the Capabilities Approach and Other Ethical Theories

The ethical theories that have dominated Western philosophy include (in one form or another) virtue ethics, consequentialism and deontology. The capabilities approach cannot be reduced to any of those ethical theories; however, it is indebted, more or less, to each of them. This section will review Rawls and human rights, both of which have numerous deontological underpinnings, as well as communitarianism, which is closely linked with virtue ethics. It will also consider Michael Boylan’s ‘table of embeddedness’ in order to see the challenges and parallels between it and Nussbaum’s list of capabilities. Throughout, the section will explore parallels and differences between the capabilities approach and the above ethical theories.

a. Virtue Ethics

Even though there are clear differences between the virtue tradition (specifically, Aristotle) and the capabilities approach, Nussbaum uses the former as a point of departure. That is, Aristotle is the foundation for the capabilities approach because Nussbaum seeks a theory that provides the opportunity for human beings to use their powers to flourish in a truly human way.

Virtue ethics, broadly speaking, like the capabilities approach, claims human beings should exercise their powers qua human in order to live well. Contemporary neo-Aristotelians strive to explicate an account of flourishing, whether by providing a naturalistic account of flourishing or by drawing on empirical psychology. Nussbaum, however, interprets Aristotle’s account of functioning as a moral concept rather than a naturalistic one. However, unlike other neo-Aristotelians (and Aristotle himself), Nussbaum has no intention of providing a comprehensive doctrine of human flourishing, although, as noted above, she believes she is providing a tentatively comprehensive list of capabilities.

There is another stark contrast between virtue ethics and the capabilities approach–namely, character building and motivation. Nussbaum is less concerned with why people perform certain actions, or with building one’s character over time through proper motivations, and more concerned with providing the proper space that allows an individual to use her powers to fulfill a capability, if she chooses. One should not take this claim to mean that Nussbaum is not concerned with motivation at all; rather, it should be viewed as a shift in emphasis. Nussbaum argues in WHD that informed desires (that is, the justification for the capabilities approach) cannot be just any desires, but only those which contribute to living well. For example, even though one may fulfill the capability of practical reason through education, one should not use it in a way that coerces others. Such a desire would be condemned by Nussbaum since, on the one hand, it prevents the coerced person from participating in all the capabilities, and on the other, it does not reflect an informed desire.

b. Communitarianism

Communitarianism is, on the one hand, a critique of liberal theory and, on the other, an emphasis on the importance of political norms within a community. In brief, liberal theorists contend that a self is ahistorical, asocial and apolitical. Thus it is not necessarily the case that it will be burdened by the practices and beliefs of its community. Michael Sandel, a nationalist-communitarian, explains that a liberal self is ‘unencumbered’–that is, it is not wedded to a particular conception of the good not of its choosing. This abstract ontology allows liberals to make certain moves in the political sphere. For example, the concept of ‘justice’ entails universal normative claims since all human beings are ontologically the same.

In contrast, Alasdair MacIntyre, a communitarian indebted to Aristotle, argues against liberal political theory beginning with its conception of the self. He says a self is embedded within a particular set of cultural beliefs, practices and history. MacIntyre, following Aristotle, claims that in order for one to live a good life, one must be virtuous. A virtue, according to MacIntyre (2007), is a character trait that allows us to achieve goods that are internal to one’s practices. By ‘practice,’ he is referring to a “socially established cooperative human activity through which goods internal to that form of activity are realized in the course of trying to achieve those standards of excellence….” Thus, living a good life entails being virtuous within the context of a given practice (or community).

Furthermore, communitarians believe justice is limited to communities rather than human beings at large. This, in turn, allows them to reject the notion that we can make universal normative judgments. Finally, MacIntyre believes we need to extend our conception of virtue from the individual to the community. It is somewhat unclear what exactly a virtuous community would look like; however, we know that it would have a conception of the good life toward which people strive. This is clearly contrary to the liberal project, in which individuals pursue whatever conception of the good they wish as long as they do not interfere with or harm another.

Nussbaum is sympathetic to communitarianism insofar as it acknowledges the importance of local traditions and practices that shape our lives. For example, a Hindu woman in India will have a set of beliefs that shape who she is that differs from that of a Protestant male in the United States. However, Nussbaum ultimately rejects communitarianism. In her section entitled “Defending Universal Values” from WHD, she says communitarians fail to recognize that there is a conception of the individual that is not indebted to a particular metaphysical tradition. She argues that each person should be treated as an end, worthy of respect, dignity and honor. As mentioned in section 2, Nussbaum believes the capabilities approach is founded on the intuition that each person is worthy of a dignified life, and this intuition holds irrespective of one’s community.

c. Deontology

In putting forth her ancillary justification for the capabilities, Nussbaum is indebted to Jean Hampton’s Kantian proceduralism. Nussbaum (2000) believes we need a “Kantian conception of human worth that prominently includes the ideas of equal worth and nonaggregation” (Nussbaum’s italics). There are two points to take from this claim. First, she is indebted to the Kantian notion that all human beings have intrinsic worth, and as a result, they should always be treated as ends and never merely as means. Second, she is critiquing the consequentialist argument for aggregate utility; her specific problems with that argument are discussed in the section on consequentialism below.

Although Nussbaum is clearly indebted to deontology, since it provides a justification (albeit an auxiliary one) for the capabilities, there remain questions as to what extent Kant plays a role. David Crocker (2008) argues that her Kantian equal-worth commitment is nothing more than an addition onto her Aristotelianism, since the latter justifies moral and political inequality.

d. Rawls’ The Law of Peoples

John Rawls uses the same methodology (and preserves the liberal ontological framework of ‘autonomy’ and ‘reason’) in The Law of Peoples as in A Theory of Justice; however, he extends justice to a global scale rather than confining it to the nation. Beginning with the ‘global original position,’ Rawls argues that all reasonable (or decent) persons would construct political ideals that benefit all liberal peoples; these ideals would be reached via overlapping consensus. See Daniels (1989) and Pogge (1989) for further discussion of Rawls’ original position. A liberal, democratic society, according to Rawls (1999), would include the following benefits: (1) fair equality of opportunity, including education, (2) a decent distribution of income, (3) society as employer of last resort through general or local government, (4) basic health care for all citizens and (5) public financing of elections (p. 50).

Rawls (1999) claims that the policies constructed by liberal peoples should direct non-liberal societies to (ideally) all become liberal. Rawls deems an illiberal society which rejects the possibility of becoming liberal (by, for example, abiding by human rights regulations) an ‘outlaw state.’ While liberal societies should attempt to tolerate illiberal societies initially, he contends an outlaw state eventually subjects itself to severe sanctions and possible intervention.

Nussbaum is not only indebted to Rawls specifically, but also often praises the values of liberalism. First, she is committed to Rawls’ method of ‘overlapping consensus’ insofar as it is politically advantageous to perform such tasks as fairly distributing primary goods. Furthermore, Nussbaum (2000) respects Rawls’ attentiveness to “pluralism and paternalism” while remaining committed to the importance of basic liberties. Finally, Nussbaum agrees with Rawls (and liberalism more generally) that we should treat people as dignified human beings, and respect their autonomy qua individuals.

Nussbaum is also critical of Rawls, beginning with his reluctance to make comparisons of well-being. Rawls refuses to make such comparisons since each person constructs their own conception of the good, so a person may be satisfied with their way of life even though another may find it unsatisfactory. While there may be fears of paternalism, Nussbaum is clear that we should make comparisons of well-being in order to identify certain areas as needing more resources than others. From this, Nussbaum (2000) criticizes Rawls for not taking seriously enough how greatly individuals vary in their needs. Consider her example. If we are concerned with spending resources on increasing literacy rates around the world, we will have to spend much more on women than on men given the discrepancy between them. However, Nussbaum argues that Rawls’ approach could not properly address such obstacles when distributing resources, since he is merely concerned with resource distribution and not cognizant of the variations in need within a particular region.

e. Human Rights

The rhetoric of human rights has arguably been more powerful than any other approach to global justice. There is debate amongst human rights advocates in regard to the origin of rights, how they are manifested (that is, who possesses them), the possibility of their belonging to groups and how they ought to be enforced. Nonetheless, human rights are universal political norms that belong to every individual simply in virtue of being human. It does not matter to which affiliation one belongs; merely in virtue of being a human being, one is guaranteed minimal norms (for example, the right to life or liberty). These are minimal insofar as they are not connected with any conception of the good life, and thus do not preclude any groups of people (or communities). For further discussion on the nature of human rights see Griffin (2008) and Donnelly (2003).

Alan Gewirth, in The Community of Rights, attempts to make human rights compatible with communities. We can see the difficulty of such a task given the commitment communitarian theorists have to a common good, on the one hand, and the value-neutral approach of rights theory, on the other. Nonetheless, Gewirth argues that if a community does not uphold a doctrine of human rights, then it ought to be rejected as a legitimate community. Gewirth thus puts forth a theory of human rights while respecting the role communities play in our lives. Furthermore, Will Kymlicka (1989) extends the concept of rights by constructing a theory of rights that takes account of communities, or group rights.

In WHD, Nussbaum directly addresses the “very close” relationship between human rights and the capabilities approach. She believes the capabilities approach has advantages over human rights insofar as it can take a clear position on issues the latter cannot in addition to providing a clear goal. For example, human rights theorists often disagree on the origin and foundation of rights, whereas the capabilities approach, according to Nussbaum, is not plagued by such criticisms. She raises two concerns for why we should reject human rights in favor of the capabilities approach, and then provides four key roles for human rights.

Nussbaum first claims that human rights proponents often make rights claims in regard to property or economic advantage (for example, that people have a right to shelter). However, in converting the language of rights into that of capabilities, she explains, such a claim becomes problematic insofar as it can be understood in many ways, including in terms of resources, utility and capabilities. The human rights tradition would discuss it in terms of resources; however, merely providing resources does not necessarily raise everyone to the same level of capability so as to allow them to fulfill their functions. Second, the language of capability ethics does not carry all the baggage that pertains to human rights. Although Nussbaum rejects the charge that human rights are simply Western, she also says the capabilities approach avoids the troubles surrounding this debate.

Even though Nussbaum is critical of human rights, she believes they play an essential role in global ethics. She presents the following four roles (or advantages) of human rights. First, human rights have the advantage of lending urgency to claims of injustice. Second, human rights (as of now) have rhetorical power. Third, human rights place value on people’s autonomy. Finally, human rights preserve a sense of agreement insofar as they put forth norms that apply to everyone.

f. Consequentialism

It would be easy to mistake the capabilities approach for a consequentialist argument to increase the overall utility in the world, where ‘utility’ can be understood in many ways–including ‘happiness.’ Peter Singer (1972), in his influential work “Famine, Affluence and Morality,” puts forth arguments for fighting global poverty from a consequentialist standpoint. In sum, he argues through a series of objections and replies that those in positions of material power should donate to those in less favorable conditions in order to increase the overall utility (and ultimately decrease poverty) throughout the world. It can be said that Singer’s consequentialism and the capabilities approach are similar insofar as they both more or less seek to directly reduce poverty and, furthermore, to provide more opportunities for those who have few or none.

However, Nussbaum (2000) provides three reasons why consequentialism differs from the capabilities approach. First, one major difference is for whom the ethical theory accounts. On the one hand, consequentialism is interested in maximizing the utility of everyone (that is, the aggregate). On the other, the capabilities approach is interested in the individual. For example, Nussbaum says that the aggregative solution does not tell us who is at the bottom and who is at the top, that is, who has control over material goods and whether or not someone else deserves a share of them. Thus, by focusing on the individual, we are best able to identify who needs resources and how much.

Second, and related to the above point, consequentialism tends to ignore cross-cultural differences, that is, it ignores the fact that people live vastly different lives. As consequentialism is concerned with overall utility (and not merely particular persons or groups of people), it may ignore a particular good that is minimized in one culture but widely present in another. Put differently, there are many goods–including education and religion–that are highly important to some and relatively unimportant to others. Consequentialism aggregates all goods under the heading of ‘utility,’ and thus we are unable to identify which goods must be properly distributed to a particular region. The capabilities approach, however, is interested not only in allowing groups of people to use their power to fulfill a capability, but in enabling each individual person to partake in a capability.

Finally, consequentialism ignores relevant aspects of individuals, including emotions (that is, how individuals feel about what is happening to them) and what they are able to do or be (that is, fulfill a capability). This critique tends to be leveled at consequentialism at large (and not specifically from the capabilities perspective), but it is still worth noting. Since the capabilities approach strives for human flourishing, which entails the ability to express emotions without fear, we can understand why Nussbaum reiterates this critique.

g. Boylan’s Table of Embeddedness

Michael Boylan, in A Just Society, presents a ‘table of embeddedness,’ which is meant to describe a hierarchy of goods. Boylan’s argument for the table can be seen as follows: if people desire to be good, and becoming good requires action, then all people desire to act; the table links Boylan’s preconditions for action to a hierarchy of goods, as described below.

Boylan (2004) splits the table into two levels–basic goods and secondary goods. The former, on the one hand, is broken further into ‘most deeply embedded’ goods (for example, food, clothing, shelter and freedom from being harmed) and ‘deeply embedded’ goods (for example, literacy, basic math skills, being treated with self-respect, and so forth). On the other hand, Boylan divides the latter into ‘life enhancing’ goods (for example, societal respect, equal opportunity and equal political participation), ‘useful’ goods (for example, property, gain from one’s labor and the pursuit of goods owned by the general public such as a cell phone) and ‘luxurious’ goods (for example, the pursuit of pleasant goods such as vacationing and the use of one’s will to possess a large portion of society’s resources). Even though society has no duty to provide ‘useful’ or ‘luxurious’ goods, it has an obligation to provide basic goods and life enhancing goods (from the secondary goods) to its members. Finally, in striving for equal respect, Boylan claims society may have to spend greater resources on those who are disadvantaged; Nussbaum would be sympathetic to Boylan’s claim that some groups of people, given their unfortunate circumstances, require disproportionately more resources than others. This was her critique of Rawls–namely, that he did not account for the varying needs of individuals. Furthermore, Nussbaum would also grant that society has an obligation to provide its citizens with Boylan’s basic goods such as food, shelter and water. However, the roles that the two lists play will differ given how their respective authors understand their purposes.

Nussbaum’s list, unlike Boylan’s, is not hierarchical; rather, everyone ought to have an equal opportunity to perform a function that fulfills a capability. In other words, no capability, according to Nussbaum, is more essential than another. Marcus Düwell (2009) provides two criticisms of this view. First, he claims that the lack of a hierarchy of goods (or capabilities) raises concerns about the approach’s practical guidance on “morally contested topics.” Even though Nussbaum argues that no primacy should be given to a particular capability, it is worth noting that it would be difficult to fulfill the capability of ‘bodily integrity,’ for example, if one’s capability of life were taken away. Second, it also raises concerns about the extent to which the capabilities are “foundational moral obligations for others.”

5. Philosophical Criticisms of the Capabilities Approach

The capabilities approach has endured many criticisms since its inception. Two are considered here. The first is constructed from a feminist and non-Western perspective; this article focuses on Alison Jaggar’s version of it since it embodies many concerns about power relations. The second, which concerns the very nature of a capability, can be found in many theorists, but the focus here is limited to Bernard Williams, who puts forth two challenges in an attempt to clarify what a capability is. Jaggar’s criticisms are directed at Nussbaum, while Williams’ critique is directed primarily towards Sen. Together they provide a broad array of criticisms of the capabilities theory in general.

a. Illiberal and Neo-Colonialist

Alison Jaggar criticizes both Nussbaum’s justifications for the capabilities approach and her list. Jaggar believes Nussbaum may have ignored power asymmetries that exist not only between men and women, but also between Western and non-Western peoples. She argues that the intuitionist and proceduralist justifications seem to be neo-colonialist and illiberal.

First, Jaggar (2006) argues that Nussbaum’s theory appears to be neo-colonialist insofar as those in power have the “final authority…to assess the moral worth of…[other’s] voices”. This is problematic for the intuitionist justification since those whose intuitions do not match the capabilities list, for example, will have their views reinterpreted and possibly jettisoned. Put differently, there are no mechanisms in Nussbaum’s approach that encourage self-criticism from those who possess the list. Furthermore, Jaggar emphasizes that Nussbaum is committed to a politically liberal project (that is, considering everyone’s intuitions); however, the intuitionist justification paradoxically dismisses ideas that do not match the theory put forth by Nussbaum, and thus it illiberally disregards others. In order for Nussbaum’s theory to encourage self-criticism, she must include all intuitions.

Second, the capabilities list seems to be illiberal since “other voices” (that is, mistaken or uninformed desires) are not ready for a proceduralist justification. Since Nussbaum demands only informed desires participate in the proceduralist justification for the list, desires that do not match the list will be unable to partake in the discourse. Furthermore, because these voices are silenced, there may be capabilities missing from the list or capabilities on the list that ought to be challenged. Regardless, they will be left untouched.

In sum, Jaggar criticizes Nussbaum’s justifications for the capabilities approach since they ignore asymmetrical power relationships. Jaggar believes that even though Nussbaum claims to be paying attention to such relations, she paradoxically fails to produce a theory that yields an outcome that is cognizant of power. It’s worth noting, though, that Jaggar does not believe these criticisms ultimately entail rejecting the capabilities. Rather, she believes that placing discourse ethics as the main justification for the capabilities may allow the theory to be self-critical, and thus, fully aware of power dynamics.

b. What Is a Capability?

Williams’ (1987) primary concern with the capabilities approach is to understand what is meant by a ‘capability.’ In pursuing this inquiry, he believes Sen in particular, but capabilities proponents in general, are unclear on the relationship between ‘choice’ and ‘capability.’ Williams does not provide knock-down arguments against the capabilities approach, but rather poses two challenges for the capabilities theorist to consider.

First, Williams asks what it means to have the capability to do X. Consider his example. If a person is posted once a year to a desirable holiday resort, does she have the capability to go? In a trivial sense, “yes,” but not in a meaningful way (that is, in a way that contributes to the well-being of an individual). If the term ‘capability’ is understood merely as ‘possibility,’ then it could be granted that she has the capability to go, although there is still something missing–namely, the ability to choose whether or not to go. This example is meant to illustrate the correlation between capabilities and choice. That is, according to Williams, in this case a capability cannot exist without the option to choose it. However, consider Sen’s example in which a capability exists without the ability to choose it. Sen, in his Tanner Lectures, notes that life expectancy is higher in China than in India. He believes this example shows that the higher one’s life expectancy, the higher one’s capability for a standard of living. In response to this claim, Williams asks what capability is increased by a greater life expectancy. He poses this question since it might be the case that living longer only gives one more time to contemplate whether to commit suicide. In this example, Williams is pointing out the problems with a capability that completely lacks choice.

Second, and related to the above challenge, Williams questions the relationship of the capability of doing X to the actual ability to do X here and now. He notes that the ‘actual ability to do X’ can be understood as ‘can do X.’ In other words, if a person possesses the capability to do X, then it must be the case that she can do X. Consider Sen’s example of the capability of breathing unpolluted air. He would argue that if a person has the capability to breathe unpolluted air, then she can do so. Williams grants that a person living in Los Angeles cannot breathe unpolluted air here and now, but that is not to say she cannot do so at all. In other words, this person has the capability to breathe unpolluted air even though she cannot exercise it here and now; this is contrary, though, to Sen’s claim above that if one has the capability to do X, one can do X. Because she has the capability to breathe unpolluted air, it seems she could simply move to a place where it is possible to do so. Williams argues, though, that there are large costs associated with such a move. Suppose the person does not have the economic means to relocate. Does she really have the capability, then, to breathe unpolluted air? Logically speaking, “yes,” but certainly not in any meaningful sense. Once the opportunity costs associated with a capability such as breathing unpolluted air are considered, some capabilities may become nearly impossible for many to acquire. Thus, Williams argues, it is not simply because one can do X that one has the capability to do X.

6. Philosophical Applications

The capabilities approach is often discussed in terms of providing opportunities (Sen) and using human powers (Nussbaum). More often than not it is deployed as an argument to reduce poverty or increase the well-being of people around the globe. Recently, it has also provided the framework to advance arguments in other areas of applied ethics, including business ethics, the environment, disability ethics and animal ethics. This entry focuses only on the environment and disability ethics because these two areas call attention to how far the capabilities approach can be extended.

a. The Environment

The biggest challenge facing capabilities theorists with regard to the environment concerns emphasis. The goal of the capabilities approach–whether Sen’s or Nussbaum’s–is human flourishing or well-being; it is never simply understood as non-human or ecological flourishing. Of course, this is not to say that the capabilities approach has nothing to say about the environment, or worse, that the environment must be harmed in order for human beings to flourish. Still, there are obstacles to putting forth not only an environmentally friendly capabilities approach, but one in which environmental flourishing is taken just as seriously as human flourishing.

There seem to be two ways in which we can approach environmental ethics from a capabilities perspective. By briefly examining each, we gain a broader perspective on how the capabilities approach begins to assess environmental concerns. The first is to begin with the capabilities list and show how environmental values relate to human flourishing. Recall Nussbaum’s eighth capability (out of ten), other species: the ability to live with concern for, and in relation to, animals, plants and the environment at large. There are two points we can take from this capability. First, Nussbaum clearly believes the environment plays a role in human flourishing; otherwise she would not have included it as a capability. Even though the environment seems to play an instrumental role insofar as it contributes to human flourishing, it is nonetheless an essential capability. Furthermore, Nussbaum’s list is beneficial because she believes it should be implemented as public policy, which would force countries that do not take the environmental capability seriously to reconsider their current policies. Second, however, Victoria Kamsler (2006) notes that Nussbaum places this capability eighth on the list, which, she argues, makes it hard to deny that it is given less emphasis than almost all the other capabilities. In Nussbaum’s defense, the capabilities are meant to be mutually reinforcing, and thus the dignity of a human being as truly human cannot be secured without taking environmental flourishing seriously.

Second, rather than starting with the list and placing instrumental value on the environment, one may begin with a general account of flourishing that can be applied to non-human beings such as animals and the environment. Here, the environment is understood as intrinsically valuable (that is, valuable independently of human beings). Kamsler notes that Nussbaum believes the “most basic intuition behind [the] capability theory… ‘wants to see each thing flourish as the sort of thing that it is’”. In other words, the environment qua capability must be treated as an entity that must flourish in its own right, and not merely for the value it provides human beings.

There still remains a lingering question about the relationship between the environment and the capabilities approach. If the capabilities approach is understood as anthropocentric insofar as it is concerned with human flourishing, what should we do when concern for the environment’s flourishing impedes human flourishing? In other words, there seem to be cases in which concern for the environment’s flourishing will directly conflict with human flourishing (for example, the capability of work and the protection of forests). Kamsler addresses this conflict when she says that the only way to overcome this seemingly tragic dilemma is through technological and political means. This is not to say that such a solution will not be costly or conflict with other capabilities, but it is a solution that goes beyond mere complacency about the dilemma.

b. Disability Ethics

A person cannot be said to flourish, according to the capabilities approach, if she is unable to perform functions that fulfill the capabilities. This raises interesting questions about people with disabilities, who may be physically or mentally impaired in ways that prevent them from performing many functions. Nussbaum has given this topic ample discussion in her Tanner Lectures and various publications.

Nussbaum addresses the question of disabilities via the capabilities list. Her early formulation of the list excluded many people from the ability to live a truly human life, since she required such a life to include, for example, the use of all five senses. She has since retreated from such bold statements. However, Nussbaum (1995) does note that it would be difficult to imagine a person living a truly human life with a total lack of the senses, imagination and reasoning.

Nussbaum (2002) has extended her account of functioning in a truly human way (that is, functioning with human dignity) “as containing many different types of animal dignity, all of which deserve respect and even wonder”. In other words, she believes the mentally disabled can gain dignity not merely from rationality, but also through support for the “capabilities of life, health, and bodily integrity. It will also provide stimulation for senses, imagination and thought”. This passage indicates a clear responsibility on the part of the state not only to allow such stimulation of the senses to occur, but actually to provide the resources for it to occur.

There are interesting questions about how to implement policies that provide the best opportunity for disabled people to perform functions that fulfill capabilities. Nussbaum heralds the Individuals with Disabilities Education Act (IDEA) as a way to understand how the capabilities can be manifested in the current education system. IDEA begins with the idea of human individuality: instead of lumping all disabled students into one group, it takes each student on a case-by-case basis. This approach, in turn, allows each student to receive the proper care she needs. The Act does not treat education simply as a ‘human right,’ because that would entail the goal of merely providing an education to the student, that is, ensuring she receives an education in one form or another. What makes the Act uniquely indebted to the capabilities approach is its commitment to providing the opportunity for students to use their powers qua human beings to fulfill their functions in a truly human way–for example, via their senses, imagination and thought.

7. United Nations Development Program

The UNDP is an organization built on the theoretical principles of the capabilities approach. Its goals include helping countries address challenges pertaining to democratic governance, poverty reduction, crisis prevention and recovery, environment and energy, and HIV/AIDS. The organization is also clear that none of these solutions should ever come at the expense of women, since it is an advocate of empowering women. These focus areas are designed to address the various challenges facing nations; in addition, there are eight concrete goals the UNDP is interested in achieving.

The UNDP has put forth eight Millennium Development Goals (MDGs). The MDGs include the following: (1) eradicate extreme poverty and hunger, (2) achieve universal primary education, (3) promote gender equality and empower women, (4) reduce child mortality, (5) improve maternal health, (6) combat HIV/AIDS, malaria and other diseases, (7) ensure environmental sustainability and (8) develop a global partnership for development. The success or failure of achieving these goals is based on a measurement from the Human Development Report (HDR).

The HDR is designed to measure the ways in which people can live up to their full potential in accordance with their desires and interests. Mahbub ul Haq, founder of the HDR, says “the basic purpose of development is to enlarge people’s choices…[which include] greater access to knowledge, better nutrition and health services, more secure livelihoods, security against crime and physical violence, satisfying leisure hours, political and cultural freedoms and sense of participation in community activities.” There are two points to take from this. First, the theoretical commitments of the capabilities approach are clearly preserved in the measurement of the MDGs. Second, the HDR is not committed merely to measuring wealth, but rather to measuring the opportunities a person has to fulfill whichever of the capabilities she is interested in pursuing.

8. References and Further Reading

  • Appiah, Kwame A. (2006) Cosmopolitanism: Ethics in a World of Strangers, W.W. Norton: NY.
  • Benhabib, Seyla (1995) “Cultural Complexity, Moral Interdependence, and the Global Dialogical Community,” in Women, Culture and Development, Martha C. Nussbaum and Jonathan Glover (eds.), Clarendon Press: Oxford.
  • Boylan, Michael (2004) A Just Society, Rowman & Littlefield Publishers, Inc.: Lanham, MD.
  • Crocker, David (2008) Ethics of Global Development: Agency, Capability and Deliberative Democracy, Cambridge University Press: NY.
  • Daniels, Norman (1989) Reading Rawls: Critical Studies on Rawls’ “A Theory of Justice,” Stanford University Press: Stanford, CA.
  • Donnelly, Jack (2003) Universal Human Rights in Theory and Practice, Cornell University Press: Ithaca, NY.
  • Düwell, Marcus (2009) “On the Possibility of a Hierarchy of Moral Goods,” in Morality and Justice: Reading Boylan’s A Just Society, John-Stewart Gordon (ed.), Rowman & Littlefield Publishers, Inc.: Lanham, MD.
  • Gewirth, Alan (1978) Reason and Morality, The University of Chicago Press: Chicago, IL.
  • Gewirth, Alan (1996) The Community of Rights, The University of Chicago Press: Chicago, IL.
  • Griffin, James (2008) On Human Rights, Oxford University Press: Oxford.
  • Jaggar, Alison (2006) “Reasoning About Well-Being: Nussbaum’s Methods of Justifying the Capabilities,” The Journal of Political Philosophy, 14:3, 301-322.
  • Kamsler, Victoria (2006) “Attending to Nature: Capabilities and the Environment,” in Capabilities Equality: Basic Issues and Problems, Alexander Kaufman (ed.), Routledge: NY, 198-213.
  • Kymlicka, Will (1989) Liberalism, Community and Culture, Oxford University Press: Oxford.
  • MacIntyre, Alasdair (1988) Whose Justice? Which Rationality?, University of Notre Dame Press: Notre Dame, IN.
  • MacIntyre, Alasdair (2007) After Virtue, University of Notre Dame Press: Notre Dame, IN.
  • Nussbaum, Martha (1995) “Human Capabilities, Female Human Beings,” in Women, Culture, and Development: A Study of Human Capabilities, Martha C. Nussbaum and Jonathan Glover (eds.), Clarendon Press: Oxford.
  • Nussbaum, Martha (2000) Women and Human Development: The Capabilities Approach, Cambridge University Press: Cambridge.
  • Nussbaum, Martha (2002) “Capabilities and Disabilities: Justice for Mentally Disabled Citizens,” Philosophical Topics, 30:2, 133-165.
  • O’Neill, Onora (1996) Towards Justice and Virtue: A Constructive Account of Practical Reason, Cambridge University Press: Cambridge.
  • Pogge, Thomas W. (1989) Realizing Rawls, Cornell University Press: Ithaca, NY.
  • Rawls, John (1999) A Theory of Justice, Harvard University Press: Cambridge, MA.
  • Rawls, John (1999) The Law of Peoples, Harvard University Press: Cambridge, MA.
  • Sandel, Michael J. (1982) Liberalism and the Limits of Justice, Cambridge University Press: Cambridge.
  • Sandel, Michael (1996) Democracy’s Discontent: America in Search of a Public Philosophy, Harvard University Press: Cambridge, MA.
  • Sen, Amartya (1985) “Well-being, Agency and Freedom: The Dewey Lectures,” Journal of Philosophy, 82:4, 169-221.
  • Sen, Amartya (1987) The Standard of Living: The Tanner Lectures, Cambridge University Press: Cambridge.
  • Sen, Amartya (2009) The Idea of Justice, Harvard University Press: Cambridge, MA.
  • Singer, Peter (1972) “Famine, Affluence, and Morality,” Philosophy and Public Affairs, 1:1, 229-243.
  • Williams, Bernard (1987) “The Standard of Living: Interests and Capabilities,” in The Standard of Living, Geoffrey Hawthorn (ed.), Cambridge University Press: Cambridge, 94-102.

Author Information

Chad Kleist
Email: chad.kleist@marquette.edu
Marquette University
U. S. A.

Autonomy: Normative

Autonomy is variously rendered as self-law, self-government, self-rule, or self-determination. The concept first came into prominence in ancient Greece (from the Greek auto-nomos), where it characterized self-governing city-states. Only later–during the European Enlightenment–did autonomy come to be widely understood as a property of persons. Today the concept is used in both senses, although most contemporary philosophers deal with autonomy primarily as a property of persons. This orientation will be maintained here.

Most people would agree that autonomy is normatively important. This agreement is reflected both in the presence of broad assent to the principle that autonomy deserves respect, and in the popular practice of arguing for the institution (or continuation, or discontinuation) of public policy based in some way on the value of self-determination. Many also believe that developing and cultivating autonomy is an important–indeed, on some accounts, an indispensable–part of living a good life. But although the claim that autonomy is normatively significant in some way is intuitively compelling, it is not obvious why autonomy has this significance, or what weight autonomy-based considerations should be given in relation to competing normative considerations. In order to answer these questions with sufficient rigor, it is necessary to have a more detailed understanding of what autonomy is.

This article will be devoted to canvassing the leading work done by philosophers on these two issues, beginning with the question of the nature of autonomy, and then moving to the question of the normative significance of autonomy. It will be seen that autonomy has been understood in several different ways, that it has been claimed to have normative significance of various kinds, and that it has been employed in a wide range of philosophical issues. Special attention will be paid to the question of justification of the principle of respect for autonomous choice.

Table of Contents

  1. History of the Concept of Autonomy
  2. Conceptions of Autonomy
    1. Moral Autonomy
    2. Existentialist Autonomy
    3. Personal Autonomy
    4. Autonomy as a Right
  3. The Normative Roles of Autonomy
    1. Autonomy in Ethical Theory
    2. Autonomy in Applied Ethics
    3. Autonomy in Political Philosophy
    4. Autonomy in Philosophy of Education
  4. Warrant for the Principle of Respect for Autonomous Choice
  5. References and Further Reading

1. History of the Concept of Autonomy

The concept of autonomy first came into prominence in ancient Greece, where it characterized self-governing city-states. Barring one exception (mentioned below), autonomy was not explicitly predicated of persons, although there is reason to hold that many philosophers of that time had something similar in mind when they wrote of persons being guided or ruled by reason. Plato and Aristotle, for example–as well as many of the Stoics–surely would have agreed that a person ruled by reason is a properly self-governing or self-ruling person. What one does not find, however, are ancient philosophers speaking of the ideal of autonomy as that of living according to one’s unique individuality. The one exception to this appears to be found in the thinker and orator Dio of Prusa (ca. 50–ca. 120), who, in his 80th Discourse, clearly seems to predicate autonomy of individual persons in roughly the sense in which it has come to be understood in our own day (see Cooper 2003).

Medieval philosophers made no use of the concept of autonomy that is worthy of note, although once again, many of them would doubtless have agreed that those who live in accordance with right reason and the will of God are properly self-governing. The concept of autonomy did not circulate in learned circles again until the Renaissance and early modern times, when it was employed both in the traditional political sense and in an ecclesiastical sense, to refer to churches that were–or at least claimed to be–independent of the authority of the Roman Catholic Pope (see Pohlmann 1971).

The concept of autonomy came into philosophical prominence for the first time with the work of Immanuel Kant. Kant’s work on autonomy, however, was strongly influenced by the writings of Jean-Jacques Rousseau, so a brief word on Rousseau is in order. Although Rousseau did not use the term ‘autonomy’ in his writings, his conception of moral freedom–defined as “obedience to the law one has prescribed to oneself”–has a clear relation to Kant’s understanding of autonomy (as will be shown below). Moreover, Rousseau wrote of moral freedom as a property of persons, thus presaging Kant’s predication of autonomy of persons. The connections between Rousseau and Kant should not be taken too far, however; for Rousseau was primarily concerned with the question of how moral freedom can be achieved and sustained by individuals within society, given the presence of relations of social dependency and the possibility of domination, whereas Kant was primarily concerned with the place of autonomy in accounts of the subjective conditions requisite for, and the nature of, morality. Because of the connections Kant drew between autonomy and morality, Kant’s conception of autonomy is sometimes referred to as ‘moral autonomy’.

In the nineteenth century, John Stuart Mill contributed to the discussion on the normative significance of autonomy in his work On Liberty. Although Mill did not use the term ‘autonomy’ in this work, he is widely understood as having had self-determination in mind. Mill’s work continues to have considerable influence on discussions on the normative significance of autonomy in relation to paternalism of various kinds.

A tremendous amount of research on autonomy has taken place in the last several decades in both the analytic and continental traditions. Continental philosophers speak more often of authenticity than of autonomy, but there are clear connections between the two, insofar as the ‘self’ in ‘self-determination’ is plausibly understood as the authentic self. Philosophers working in the analytic tradition have gone into great detail attempting to discern necessary and sufficient conditions for the presence of autonomy, as well as to uncover the ground and implications of its normative significance.

2. Conceptions of Autonomy

There are several different conceptions of autonomy, all of which are loosely based upon the core notions of self-government or self-determination, but which differ considerably in the details.

a. Moral Autonomy

As mentioned, moral autonomy is associated with the work of Kant, and is also referred to as ‘autonomy of the will’ or ‘Kantian autonomy.’ This form of autonomy consists in the capacity of the will of a rational being to be a law to itself, independently of the influence of any property of objects of volition. More specifically, an autonomous will is said to be free in both a negative and a positive sense. The will is negatively free in that it operates entirely independently of alien influences, including all contingent empirical determinations associated with appetite, desire-satisfaction, or happiness. The will is positively free in that it can act in accordance with its own law. Kant’s notion of autonomy of the will thus involves, as Andrews Reath has written, “not only a capacity for choice that is motivationally independent, but a lawgiving capacity that is independent of determination by external influence and is guided by its own internal principle–in other words, by a principle that is constitutive of lawgiving” (Reath 2006).  Now, because the lawgiving of the autonomous will contains no content given by contingent empirical influences, this lawgiving must be universal; and because these laws are the product of practical reason, they are necessary. Insofar, then, as Kant understood moral laws as universal and necessary practical laws, it can be seen why Kant posited an essential connection between the possession of autonomy and morality: the products of the autonomous will are universal and necessary practical laws–that is, moral laws. It is thus by virtue of our autonomy that we are capable of morality, and we are moral to the extent that we are autonomous. It is for this reason that Kant’s conception of autonomy is described as moral autonomy. Moral autonomy refers to the capacity of rational agents to impose upon themselves–to legislate for themselves–the moral law.

Furthermore, the capacity for autonomy, according to Kant, is “the basis of the dignity of human nature and of every rational nature,” and in accordance with this rational nature it is an end in itself. Moreover, it “restricts freedom of action, and is an object of respect”. Many thinkers have followed Kant in grounding the dignity of persons (and respect for persons generally) in our capacity for autonomy (although it should be noted that not all of these thinkers have accepted Kant’s conception of autonomy). More will be said on this below.

Moral autonomy is said to be a bivalent property possessed by all rational beings by virtue of their rationality–although according to Kant, it is certainly possible not to live in accordance with its deliverances in practice (for more on Kant’s conception of autonomy, see Hill 1989, Guyer 2003, and Reath 2006).

One of the most common objections to this conception of autonomy is that such a robust form of independence from contingent empirical influences is not possible. Kant defended the possibility of such robust independence by arguing that human agents inhabit two realms at once: the phenomenal realm of experience, in relation to which we are determined; and a noumenal or transcendental realm of the intellect, in relation to which we are free. Given the further claim that our noumenal self can exercise efficient causality in the phenomenal realm, Kant held that our autonomy is in large part constituted by our noumenal freedom. The postulation of such a form of freedom may be criticized as metaphysically extravagant, however; and if such freedom is not possible, then neither is moral autonomy in Kant’s strict sense. Some thinkers have argued that Kant’s theorization on the noumenal realm was not meant to have metaphysical significance. Thomas Hill has argued, for example, that Kant may have been merely elaborating on the practical conditions in which we must understand ourselves insofar as we conceive ourselves as free. Objectors have insisted, however, that Kant intended to assert the more robust form of metaphysical freedom. Indeed, it could be pressed, he must have; for without this sense of freedom being operative, actual autonomy–and hence morality, by Kant’s lights–would not be possible.

b. Existentialist Autonomy

Existentialist autonomy is an extreme form of autonomy associated principally with the writings of Jean-Paul Sartre. It refers to the complete freedom of subjects to determine their natures and guiding principles independently of any forms of social, anthropological or moral determination. To possess existentialist autonomy is thus to be able to choose one’s nature without constraint from any principles not of one’s own choosing. Sartre held this radical freedom to be entailed by the truth of atheism. According to Sartre, God’s nonexistence entails two key conclusions: firstly, humans cannot have a predetermined nature; and secondly, there cannot exist a realm of values possessing independent validity. Taken together, these entail that human beings are radically free: “For if indeed existence precedes essence, one will never be able to explain one’s action by reference to a given and specific human nature; in other words, there is no determinism–man is free, man is freedom. Nor, on the other hand, if God does not exist, are we provided with any values or commands that could legitimize our behavior. Thus we have neither behind us, nor before us in a luminous realm of values, any means of justification or excuse.” Fettered neither by a predetermined nature nor by an independently existing order of values, “[m]an is nothing else but what he makes of himself” (Sartre 1946).

Like moral autonomy, existentialist autonomy is a bivalent property which all human persons are said to possess (although possibly without being aware of this). Unlike moral autonomy, however, existentialist autonomy has no necessary connections to morality or to rationality as traditionally conceived.

The primary objection to existentialist autonomy is that it is too radical to be plausible.  Even if God does not exist, it is argued, it does not follow that humans lack a nature that determines–at least to some extent–their choices, tendencies, proclivities, and guiding principles. A thoroughly naturalistic conception of human nature, informed by an understanding of the evolutionary forces operative in human psychology, seems to militate against the notion that humans are as unbounded as existentialist autonomy suggests we are. At the very least, it could be argued that empirical evidence does not speak in favor of the existence of existentialist autonomy in any robust form.

c. Personal Autonomy

Without question, the majority of contemporary work on autonomy has centered on analyses of the nature and normativity of personal autonomy. Personal autonomy (also referred to as ‘individual autonomy’) refers to a psychological property, the possession of which enables agents to reflect critically on their natures, preferences and ends, to locate their most authentic commitments, and to live consistently in accordance with these in the face of various forms of internal and external interference. Personally autonomous agents are said to possess heightened capacities for self-control, introspection, independence of judgment, and critical reflection; and to this extent personal autonomy is often put forth as an ideal of character or a virtue, the opposite of which is blind conformity, or not ‘being one’s own person.’

As mentioned above, personal autonomy has an essential relation to authenticity: the personally autonomous agent is the agent who is effective in determining her life in accordance with her authentic self. Personal autonomy is thus constituted, on the one hand, by a cluster of related capacities (often termed ‘authenticity conditions’), centered on identifying one’s authentic nature or preferences and, on the other hand, by a cluster of capacities (often termed ‘competency conditions’) that are centered on being able effectively to live in accordance with these throughout one’s life in the face of various recalcitrant foreign influences. These capacities may be possessed singly or in unison, and often require a considerable amount of life experience to assume robust forms.

One of the most intractable problems surrounding personal autonomy concerns the analysis of the authentic self (the ‘self’ in ‘self-determination’, as it were).  Some philosophers have claimed that no such self exists; and indeed, some philosophers claim that no self exists at all (for an overview of these problems, see Friedman 2003 and Mackenzie & Stoljar 2000).  Most philosophers accept the possibility of the authentic self at least as a working hypothesis, however, and concentrate attention on the question of how authenticity is secured by an agent. The most popular and influential account is based on the work of Harry Frankfurt and Gerald Dworkin. According to their ‘hierarchical’ account, agents validate the various commitments (beliefs, values, desires, and so forth) that constitute their selves as their own by a process of reflective endorsement. On this account, agents are said to possess first-order and second-order volitions. Our first-order volitions are what we want; and our second-order volitions are what we want to want. According to the hierarchical model, our first-order desires, commitments, and so on are authentic when they are validated by being in harmony with our second-order volitions: that is, when we want what we want to want. Following from this model, an agent is autonomous in relation to a given object when the agent is able to determine her first-order volitions (and corresponding behavior) by her second-order volitions. A simple example may help to illustrate the model. Say that I am a smoker. Although I enjoy lighting up, I do not reflectively endorse my smoking; I desire it, but I do not want to desire it. On the hierarchical model, smoking is not an aspect of my authentic self, because I do not reflectively endorse it; and to the extent that I am unable to change my habits, I am not autonomous in relation to smoking. Conversely, if I can bring my first-order volitions into harmony (or identity) with my second-order volition, then my desire is authentic because it is reflectively endorsed; and to the extent that I can mold my behavior in accordance with my reflective will, I am autonomous in relation to smoking.  Persons who possess the requisite capacities to form authentic desires and effectively to generally live in accordance with them are autonomous agents according to this model (see Frankfurt 1971, 1999 and Dworkin 1988).

The hierarchical model remains–in outline, at least–the leading account of authenticity undergirding most contemporary accounts of personal autonomy, although it has been attacked on many fronts. The primary objection tendered against this account is ‘the problem of origins.’ As we have seen, authentic selfhood as reflective endorsement holds that my authentic self is the self that I reflectively ratify: the self that I endorse as expressing, in a deep sense, who I fundamentally am or wish to be. The problem of origins arises when one attempts to explain how this act of reflective endorsement actually constitutes a break from other-determination (that is, from foreign influence). For, could it not be the case that what appears to me to be an independent act of reflective endorsement is itself conditioned by other-determining factors and therefore ultimately an other-determined act? If this is the case, then it doesn’t seem that the possession of autonomy or the making of autonomous choices is possible. In short, the problem is how to sustain an account of self-determination that is not threatened by the pervasive effects of other-determination (see Taylor 2005 for elaboration on the problem of origins and related sub-problems). Much work on theories of personal autonomy has been explicitly devoted to addressing precisely these sorts of difficulties.

Besides analyzing and clarifying the authenticity conditions necessary for autonomy, philosophers have also worked on providing a thorough account of the competency conditions necessary for the presence of autonomy (see Meyers 1989, Mele 1993, and Berofsky 1995). Competency conditions, as we have seen, are those capacities or conditions that need to be present in order for one to be effective in living according to one’s authentic self-conception in the face of various kinds of interference to that end.  Examples of competency conditions include self-control, logical aptitude, instrumental rationality, resolve, temperance, calmness, and a good memory.

In addition to authenticity and competency conditions, many theories of personal autonomy require the presence of certain external enabling conditions: that is, external or environmental (social, legal, familial, and so forth) conditions which are more or less out of the agent’s control, but which must be in place in order for fully autonomous living to be possible. Such enabling conditions include, for example, a modicum of social freedom, an array of substantive options for choice, the presence of authenticity-oriented social relations, and autonomy-supporting networks of social recognition and acknowledgment (see Raz 1986 and Anderson & Honneth 2005). Without these conditions, effective autonomous living is said by some to be impossible, even where authenticity and competency conditions are robustly satisfied. Different autonomy theorists place different emphases on external enabling conditions. Some contend that external enabling is a necessary condition for autonomy (see Oshana 1998). Others hold that autonomy more properly concerns agential satisfaction of authenticity and competency conditions, regardless of whether the external environment allows for actual autonomous expression (see Christman 2007). Both views can claim some intuitive support. On the one hand, it is reasonable to hold that it is only fitting to call a person ‘autonomous’ if that person is in fact effective in living according to her authentic self-conception. Yet, it also makes sense to call persons ‘autonomous’ who have formed an authentic self-conception and possess the requisite competency conditions effectively to express that self-conception, but happen to lack the contingent socio-relational conditions that allow for the expression of that authentic self. A possible solution to this impasse may be to avoid seeking hard and fast borders to the existence of autonomy, and say that autonomy is present in both cases, but is more robust where the proper external enabling conditions are in place.

The question of normative commitments associated with personal autonomy possession has also been a matter of some dispute. Many philosophers hold that autonomy is normatively content-neutral. According to this account, one (or one’s commitments) can be autonomous regardless of the values one endorses. On this account, one could commit to any kind of life–even the life of a slave–and still be autonomous (see, for example, Friedman 2003). Other philosophers hold that autonomy possession requires substantive normative constraints of some kind or other–at the very least, it is argued that one must value autonomy in order to be truly autonomous (see Oshana 2003). As with the debate just mentioned, both sides of this debate can claim some intuitive support; this can be shown through the asking of opposing but seemingly equally compelling (apparently rhetorical) questions; namely, ‘Can’t one autonomously choose whatever one wants?’, and, ‘How can we call someone autonomous who doesn’t value or seek autonomous living?’ One possible solution to this debate is to say that while almost any individual choice can be autonomous, persons cannot live autonomous lives as a whole without some commitment to the value of autonomy.

Unlike moral and existentialist autonomy, personal autonomy is possessed in degrees, depending on the presence and strength of the constellation of internal capacities and external enabling conditions that make it possible. While not all persons possess personal autonomy, it is commonly claimed that virtually everyone–with the exception of the irredeemably pathological and the handicapped–possesses the capacity for personal autonomy. In addition, the links between personal autonomy possession and moral agency are usually said to be thin at best. Even those who hold that personal autonomy possession requires substantive normative commitments of some kind (such as, for example, a commitment to the value of autonomy itself) usually hold that it is quite possible to be an autonomous villain. Some philosophers have argued that personal autonomy possession requires the presence of normative competency conditions that effectively provide agents with the capacity to distinguish right from wrong (see Wolf 1990), but this strong account is in general disfavor, and even if the account is correct, few would argue that this means that personally autonomous agents must also always act morally. In the face of this, one may wonder why autonomy-based claims are said to generate demands of respect upon others. This question will be dealt with in more detail in section 4 below.

Lastly, a word should be given on the relation between personal autonomy and freedom (or liberty, which is here taken to be synonymous with freedom). Although it is not uncommon to find the terms ‘(personal) autonomy’ and ‘freedom’ used essentially synonymously, there are some important differences between them.

More often than not, to claim that a person is free is to claim that she is negatively free in the sense that she is not constrained by internal or external forces that hinder making a choice and executing it in action. There is a clear distinction between autonomy and negative freedom, however, given that autonomy refers to the presence of a capacity for effective authentic living, and negative freedom refers to a lack of constraints on action.  It is entirely possible for a person to be free in this negative sense but nonautonomous, or–on accounts that do not require the presence of external enabling conditions for autonomy to be present–for a person to be autonomous but not (negatively) free.

Some writers also speak of positive freedom, and here the connections with autonomy become much deeper. Speaking very generally, to be free in this sense is to possess the abilities, capacities, knowledge, entitlements or skills necessary for the achievement of a given end. For example, I am only (positively) free to win an Olympic gold medal in archery if I am extremely skilled in the sport. Here it should be clear that one can be positively free in many ways and yet not be autonomous. Some philosophers, however, following Isaiah Berlin (Berlin 1948), have described positive freedom in such a way that it becomes basically synonymous with personal autonomy. Like autonomy, the conception of freedom that is operative in a given discussion can vary considerably; but more often than not personal autonomy is distinguished from freedom by the necessary presence, in the former, of a connection to the authenticity of the agent’s self-conception and life-plan–a connection that is usually not found in conceptions of freedom.

d. Autonomy as a Right

Lastly, autonomy is sometimes spoken of in a manner that is more directly normative than descriptive. In political philosophy and bioethics especially, it is common to find references to persons as autonomous, where the autonomy referred to is understood principally as a right to self-determination. In these contexts, to say that a person is autonomous is largely to say that she has a right to determine her life without interference from social or political authorities or forms of paternalism. Importantly, this right to self-directed living is often said to be possessed by persons by virtue either of their potential for autonomous living or of their inherent dignity as persons, but not by virtue of the presence of a developed and active capacity for autonomy (see Hill 1989). Some have argued that political rights (Ingram 1994) and even human rights generally (Richards 1989) are fundamentally based upon respect for the entitlements that attend possessing the capacity for autonomy.

3. The Normative Roles of Autonomy

Although disagreements concerning the nature of autonomy are rife, almost no one disagrees that autonomy has normative significance of some kind; and this agreement is found both in relation to the claim that autonomy is normatively significant for the autonomous agent and to the claim that autonomy is normatively significant for the addressees of autonomy-based demands. Following from this, autonomy plays an important normative role in a variety of philosophical areas.

a. Autonomy in Ethical Theory

Autonomy is referenced or invoked in a number of key ways in ethical theory:

(i) Autonomy serves as a ground for the claims that persons have dignity and inherently deserve basic moral respect

(ii) Autonomy is said to have a value that grounds the claim that persons deserve to be told the truth

(iii) Autonomy is referenced as a fundamental principle of ethics in Kantian deontology

(iv) Autonomy is commonly viewed as a key component of human well-being (and is therefore significant for utilitarianism)

(v) Autonomy is defended as an important virtue

(vi) Autonomy is said to be necessary for moral responsibility

(vii) Autonomy is said to have a value that grounds the claim that autonomy-based demands are worthy of special respect

(i) Ever since Kant, autonomy (or the capacity for autonomy) has been referenced by some philosophers as that property of human beings by virtue of which they possess inherent dignity and therefore inherently deserve to be treated with basic moral respect.  Kant’s justification for the claim that autonomy grounds the inherent dignity of persons was based on the view that it is by virtue of our autonomy that we are ends-in-ourselves.  Beings that lack autonomy are, precisely because of this lack, essentially at the mercy of the determinism that characterizes the phenomenal realm: they are controlled by forces that have nothing to do with their own will. Beings that possess autonomy on the other hand, are, precisely because of this possession, free from this determination; they have the capacity for freedom through the active exercise of their autonomous wills, which allows for the legislation of universal law. Autonomous agents are not passive players in life; they are active agents, determining themselves by their own will, the authors of the laws that they follow (see Guyer 2003). As such, they are not passive means towards nature’s determined ends, but are ends-in-themselves, by virtue of which they possess inherent dignity and deserve basic moral respect. Many have followed Kant in referencing autonomy as the ground of human dignity and as the basis of the basic moral respect owed to persons, although not all have followed Kant in the details of his account (for a recent account that moves away from Kant’s conception of noumenal freedom, see Korsgaard 1996). The most common objection leveled against this account is that it runs into problems involving exclusion. Most would argue that the mentally handicapped, for example, are owed basic moral respect, even if they do not possess (even the capacity for) autonomy. And if human dignity is indexed to the presence of autonomy, it is argued, this would entail, counter-intuitively, that those who are more autonomous have more dignity, and are more worthy of respect. It may also be argued that the capacity for autonomy is a poor ground for human dignity (and respect for persons) for other reasons–for example, because autonomy has no essential connection to morality, or because better grounds are available, or because the very project of grounding human dignity on a property of some kind is ill-conceived. Despite these worries, however, appeals to autonomy as a basis for human dignity and basic moral respect remain quite popular.

(ii) Some philosophers have argued that a proper appreciation for others as autonomous (or as possessing the capacity for autonomy) requires that one not seek to deceive them.  Respect for autonomy is thus said to have an important relation to truthfulness. In Thomas Hill’s words, “Lies often reflect inadequate respect for the autonomy of the person who is deceived.” (Hill 1991) We saw above that autonomy’s value has been used to ground the basic moral respect owed to persons; and the present injunction against deception may be viewed as a specific form that autonomy-based respect for persons may take. It is easy to see why a connection between respect for autonomy and truthfulness (or what comes to the same thing–an injunction against deception) has been attractive to some philosophers, especially those in the Kantian tradition. When we deceive others for our own purposes, we bypass their reflective abilities and make them instruments in the achievement of our own ends, and in doing this we fail to treat them as persons capable and deserving of self-determination. Proper respect for persons as autonomous thus requires a commitment to truthfulness. It has been argued, however, that one may respect and value the autonomy of another while deceiving them at the same time (Buss 2005). One may, for example, use forms of deception so that another’s capacity for autonomy may flourish. The basic idea here is that one may still reason for oneself despite being deliberately influenced by the deceptive behavior of others. As Sarah Buss writes, “To put it somewhat crudely, whether an instance of practical reasoning is self-determined is a matter of whether it is really the agent herself who is doing the reasoning. And this would seem to depend on whether she determines her response to the considerations that figure in her reasoning–not on how the considerations to which she responds relate to reality, nor on how she came to be aware of these considerations.” It may be argued, however, that the conception of autonomy underlying this claim is too thin to be acceptable, and that a better conception would contain the resources necessary to judge self-determining reasoning influenced by the deliberate deception of another as nonautonomous. In this vein, some have argued that a person is autonomous in relation to a given desire or choice only if that person would not feel alienated from the causal process that gave rise to that desire or choice (Christman 2007). On the assumption that persons would feel alienated from deceptive desire- or choice-forming processes, the associated desires or choices would not count as autonomous. In response to this, however, it may be argued that autonomous agents may not feel alienated from all (or many) deceptive forms of influence upon the formation of their desires and choices, depending on the circumstances (Buss 2005). If this were the case, then a commitment to the value of autonomy may not be inconsistent with certain forms of deception or manipulation. Yet, given the traditional opposition between autonomous self-determination and agential determination rooted in deceit and manipulation, it is to be expected that resistance to the notion that they are not incompatible will continue.

(iii) Autonomy plays a key role in Kant’s deontological ethics. We have already seen this in the way in which Kant grounds human dignity in autonomy. But autonomy plays a further (and closely related) normative role for Kant. It is often said that Kant held that the Categorical Imperative can be expressed in three closely related formulas: the Formula of Universal Law, the Formula of Humanity, and the Formula of the Kingdom of Ends. It has also been claimed, however, that Kant defended a fourth formula, which may be called the Formula of Autonomy. Although Kant did not state this formula explicitly, it has been argued that it can be plausibly derived from his description of the Categorical Imperative as “the idea of the will of every rational being as a will that legislates universal law.” The corresponding Formula of Autonomy could then be expressed as an imperative in this way: act so that the maxims you will could be the legislation of universal law. According to this formula, we should act according to principles that express the autonomy of the will. This formulation is important, firstly because it suggests that Kant conceived autonomy as a normative principle (and not merely as a condition of the will that makes morality possible), and secondly because it further reinforces Kant’s claim that humans, as autonomous law-givers, are the source of the universal law that guarantees their freedom and hence marks them out as possessing inherent dignity (see Reath 2006).

(iv) Autonomy is commonly held to be a core component of well-being. This view goes back at least to Mill’s On Liberty, and has been accepted by many contemporary philosophers as well (see for example Griffin 1986 and Sumner 1996). In this connection, some argue that autonomy is an intrinsic part of well-being, and others argue that being autonomous reliably leads to well-being (and hence has instrumental prudential value).  Although thus far, the normative importance of autonomy has been described as being associated primarily with deontology, the claim that autonomy is a core component of well-being shows that it can play a key role in consequentialist moral theories as well. Indeed, as will be discussed in greater detail below (section 4), although most defenses of the principle of respect for autonomy are deontological in nature, it is also possible to defend the principle on consequentialist grounds. From this point of view, it can be argued that autonomy deserves respect because respecting autonomy is reliably conducive to well-being.

(v) Autonomy has been claimed to be an important virtue to possess. It is not difficult to see why this is the case. The autonomous person is a person possessing a constellation of widely desirable qualities such as self-control, self-knowledge, rationality and reflective maturity. To be autonomous is to be self-governing; to be free from domination by foreign influences over one’s character and values; to ‘be one’s own person’. Following from this, it is claimed by some that autonomy is a great virtue to possess – one which constitutes an important part of human flourishing. It may be objected, however, that an excessive concern with autonomy can be at odds with virtue, especially if robust autonomy entails an inability to exhibit loyalty or fidelity to projects, other persons or communities. Recent work on personal autonomy, however, has tended to support the notion that autonomy possession is not incompatible with these and similar forms of attachment (Friedman 2003).

(vi) Autonomy has been seen by some thinkers as having implications for a correct account of moral responsibility. Some accounts hold that autonomy is a necessary condition for moral responsibility. The basic defense of this claim is that it makes little sense to say that someone is morally responsible for her actions if she is not the author of those actions; and since one is the author of one’s actions only if one is autonomous, autonomy possession is necessary for moral responsibility. According to this account, the class of actions that are autonomous and the class of actions for which we are morally responsible are identical, or at least almost so (see Fischer and Ravizza 1998). Other accounts hold that although persons are certainly morally responsible for their autonomous actions, they are also morally responsible for a wider range of actions as well. An account of this sort is often made by those who hold a more demanding conception of autonomy; and defenders of this account argue that we still want to hold persons morally responsible for the many actions that do not satisfy robust autonomy conditions on the one hand, but are not constituted of sheer heteronomy (brainwashing, psychosis, coercion, and so forth) on the other (see Arpaly 2005).

(vii) Many thinkers believe that autonomous claims or demands are worthy of special normative uptake–special respect–by virtue of the fact that they are autonomous. It is important to see how this claim is different from the first point given above (viz., that autonomy is said to ground basic moral respect for persons). The former claim is that the fact that persons are autonomous (or have the capacity for autonomy) is what grounds their special dignity, by virtue of which they are owed basic moral respect. Now, it is possible to owe someone basic moral respect, but not to owe special respect to a subset of their choices. Imagine that someone is brainwashed, for example. Many would argue that although we owe that person basic moral respect (for example, we are obligated, say, not to harm them or to lie to them), we do not owe special respect to that person’s demands (say, to promote or not interfere with those demands). The current claim holds, however, that the fact that a person’s choices are autonomous generates special demands of respect for those choices over and above the basic respect owed to the chooser (whether this be conceived as being by virtue of their capacity for autonomy or not). This principle–that autonomous choice deserves special respect–may be justified in either a deontological or consequentialist manner. Because of the considerable importance of this principle, however, it deserves a more detailed discussion, which is provided in section 4 below.

b. Autonomy in Applied Ethics

The principle of respect for autonomy has had a considerable influence on applied ethics largely because of its versatility: it can be invoked in any applied ethics debate that bears (even remotely) on morally significant situations that involve the demands of self-determination, free choice, authenticity or independence. Seven of the most important of these debates–certainly not an exhaustive list–will be briefly canvassed below:

(i) Autonomy and informed consent

(ii) Autonomy and abortion

(iii) Autonomy and end-of-life decisions

(iv) Autonomy and same-sex marriage

(v) Autonomy and just war theory

(vi) Autonomy and advertising

(vii) Autonomy and environmental ethics

(i) Respect for autonomy has had a major influence on debates in medical ethics, especially those concerning the constraints that should be in place within the physician-patient relationship. Perhaps the most important such constraint is that of informed consent. According to this principle, a patient should not receive medical treatment of any sort unless she is well-informed enough as to the treatment’s nature and effects to be able to make an informed decision about it. The patient must agree to the treatment on the basis of this information. Many have argued that the requirement of informed consent is necessitated as part and parcel of a more basic imperative to respect patient autonomy (Dworkin 2006). Few argue that respecting patient autonomy has no weight at all; more commonly, objectors argue that overriding patient autonomy is sometimes justified by the good consequences that will likely result from doing so.

(ii) Autonomy is also referenced as an important value to be taken into consideration in the abortion debate, although it is referenced in different ways. On the one hand, it is argued that some abortions are justified as an expression of a woman’s reproductive autonomy (see Overall 1990 and Fischer 2003). On the other hand, it could be argued that abortion is morally unacceptable, among other reasons, because it fails to respect the potential future autonomy of the aborted (for a related argument, see Marquis 1989).  Assuming that both of these autonomy-based arguments have weight, adjudicating this dispute requires–among other things–establishing and defending the relative weights of actual and potential autonomy, both in relation to particular choices and in relation to lives as wholes.

(iii) Many argue that considerations of respect for autonomy are also decisive in the debates concerning the moral acceptability of euthanasia and suicide. Respect for autonomy can be viewed as a reason for accepting voluntary euthanasia. The basic argument here is one of consistency: if respect for others’ autonomy requires respecting others’ self-determining life-choices (at least when these are competently made), and if end-of-life decisions are placed within the ambit of life-choices, then end-of-life decisions made by competent, autonomous persons should be respected, even if these decisions involve voluntary euthanasia (for a related argument, see Brock 1993). Some have cast doubt, however, on whether a decision to die can be an autonomous decision at all, given the likely presence of psychological factors such as fear, hopelessness, and despair–factors which would undermine careful introspection and critical thought (Hartling 2006). Respect for autonomy can also be seen as a reason for respecting the decision to end one’s life, even when reasons of mercy are not in play–that is, in cases of suicide–at least where there is reason to hold that the agent is sufficiently competent and rational (Webber & Shulman 1987). Some argue, however, that autonomy-based defenses of voluntary euthanasia and suicide involve a contradiction insofar as they invoke the value of autonomy to justify an act that destroys autonomy (Safranek 1998 and Doerflinger 1989). If correct, these arguments do not show that voluntary euthanasia or suicide are unacceptable; they show rather that arguments to establish their acceptability cannot be based on respect for autonomy. It may be a cause of worry, however, that such arguments prove too much by rendering unacceptable autonomy-based respect for any decision that involves a subsequent lessening of free choice.

(iv) Autonomy also carries normative weight in a number of applied ethics debates relating to public policy. Respect for autonomy can be straightforwardly referenced, for example, as an argument in favor of the acceptability of same-sex marriage: respect for others’ autonomy entails respect for their autonomous decisions, and decisions regarding marriage–even same-sex marriage–fall within these parameters (when autonomous).  Objectors might argue, however, that homosexual marriage is immoral, and that the right to noninterference with autonomous choice does not exist where the object of the choice is immoral. Few would argue, for example, that there exists an obligation to respect someone’s autonomous decision to embezzle money, given that that act is immoral. The question then becomes whether same-sex marriage is morally acceptable.

(v) Respect for autonomy also plays a role in discussions of just war theory. Specifically, it has been referenced as the key principle determining the proper constraints and limitations that should be in place if we wish our prosecution of war to be just. It has been argued, for example–and in partial conjunction with what has been said above–that possession of autonomy (or the existence of a capacity for it) is the ground of human dignity, and hence that actions which honor that dignity must center on respect for autonomy. In relation to war, this suggests that while war may sometimes be morally permissible (in cases of self-defense, for example), wartime actions cannot involve violations of others’ autonomy, especially that of noncombatants (for an extended discussion, see Zupan 2004).

(vi) In business ethics, respect for autonomy has been identified as a key reason why persuasive advertising practices are morally unacceptable (Crisp 1987). The arguments given in support of this claim largely follow those mentioned above in relation to truthfulness–viz. that respect for others’ autonomy is incompatible with deception or manipulation–combined with the claim that persuasive advertising practices constitute deception or manipulation. In this vein it has also been argued that persuasive advertising undermines consumer autonomy by creating foreign desires and wants and by producing compulsive behavior in consumers. Some have argued, however, that persuasive advertising practices do not threaten consumers’ autonomy, at least not necessarily or intrinsically (Arrington 1982). From this point of view, although such deception may occur, it is the exception; advertising usually provides consumers with the information necessary for making informed decisions. It has also been argued that, even if persuasive advertising thwarts autonomy, it is still in consumers’ interests to be exposed to it, given that companies would likely only go to such trouble for products that will be market-winners, and hence that consumers would have desired and bought those products anyway, even after careful consideration (Nelson 1978). One obvious problem with this argument is that it assumes that heavy persuasive marketing is a sign of product quality, which is certainly debatable; but even if that premise is granted, it may still be argued that advertising laden with rhetorical devices, by attempting to bypass consumers’ critical thinking abilities, violates their autonomy.

(vii) Respect for autonomy has even been referenced in relation to issues in environmental ethics. Eric Katz has argued, for example, that nature as a whole constitutes an ‘autonomous subject’, which therefore deserves moral respect and should not be treated as a mere means to the satisfaction of human ends (Katz 1997). Critics of this view may wonder whether the notion of an autonomous subject operative here has been stretched to breaking point, or rendered hollow. If correct, this criticism does not, of course, entail that prohibitions against using nature as a mere means to human ends cannot be provided; it simply means that an acceptable defense cannot be based on autonomy-related considerations.

It should be clear from the breadth and diversity of the employment of the principle of respect for autonomy that it is both extremely versatile and a mainstay of applied ethics debates. The brief sketches given above concern some of the more prominent autonomy-related discussions in applied ethics, but other debates in applied ethics–relating, for example, to injunctions against discrimination (Gardner 1992 and Doyle 2007) or against domestic abuse (Friedman 2003), to name a couple–have been approached and adjudicated in reference to the importance of respecting autonomy as well.

More often than not, however, those who reference the principle of respect for autonomy in applied ethics either take its normative force for granted, or only devote passing attention to the question of its justification. Yet, given that the principle is neither self-evident nor immune to challenge, it is very important that those who reference the principle be able to provide a robust justification of its normative weight. Because of its fundamentality, this issue will be considered separately and in more detail in section 4 below.

c. Autonomy in Political Philosophy

Autonomy is considered normatively significant for issues in political philosophy, primarily in relation to discussions of social justice and rights. It is particularly important for political liberalism (see, for example, Christman and Anderson 2005); and some have argued that autonomy is the core value of liberalism (see White 1991 and Dagger 2005).  Four of the most important issues in political philosophy that invoke the normative significance of autonomy include:

(i) The establishment and validation of just social and political principles

(ii) The legitimation of political power

(iii) The justification of political rights (both specific and general)

(iv) The acceptability of political paternalism

(i) A conception of the autonomous individual provides the perspective from which social and political principles are formulated, and validated as just, in several contractarian political theories. A classic example is provided in Rawls’ A Theory of Justice (1971). According to Rawls, principles of social justice are best conceived and validated based on what would be acceptable to (representatives of) members of society gathering together in an ‘original position’ behind a ‘veil of ignorance’. Rawls argued that the conditions that constrain this process will ensure that those taking part in it are acting autonomously (that is, according to Rawls, as free and rational). Of key importance is the veil of ignorance because, by preventing any detailed knowledge of one’s condition or place in society, it “deprives the persons in the original position of the knowledge that would enable them to choose heteronomous principles”. Rawls concludes: “[W]e can say that by acting from these principles persons are acting autonomously: they are acting from principles that they would acknowledge under conditions that best express their nature as free and equal rational beings.” Given these key constraints, the contracting process results, according to Rawls, in valid principles of social justice. Here it can be seen that autonomy has double (and mutually supporting) normative significance: it characterizes members of society in an idealized way in order to form the normatively privileged perspective from which to establish principles of social justice; and it provides the standard that validates those principles as just (viz., by being accepted by autonomous agents). One can see the influence of Kant’s conception of autonomy and its normative significance in this doctrine. Roughly put, Kant held that moral principles are those that would be accepted by persons within an idealized constraint–viz., insofar as they are autonomous. Similarly, Rawls argued that the principles of social justice are those that would be accepted by persons within an idealized constraint–viz., insofar as they are conceived as autonomous (free and rational) agents behind a veil of ignorance in the original position. Rawls explicitly acknowledged his indebtedness to Kant in this regard.

(ii) Relatedly, a cornerstone of political liberalism is the view that political power is legitimized by its free acceptance by a state’s subjects, who are conceived (at least minimally) as autonomous persons. John Locke is recognized as one of the key progenitors of this view of the legitimation of political power. In Two Treatises of Government (1689), he wrote: “Men being, as has been said, by nature all free, equal, and independent, no one can be put out of his estate, and subjected to the political power of another, without his own consent, which is done by agreeing with other men to join and unite into a community for their comfortable, safe, and peaceable living one amongst another…” What secures the legitimacy of government on such a view is precisely the agreement of those who are not only naturally equal in standing, but also free and independent–that is, self-directing. The tradition of placing crucial normative weight on the autonomy of the contracting parties has continued to the present. Referring to liberalism in political philosophy in general, John Christman has written (2005), “Liberal legitimacy…assumes that autonomous citizens can endorse the principles that shape the institutions of political power….In this way, political power is an outgrowth of autonomous personhood and choice.” As before, autonomy is fulfilling two (mutually supporting) roles: it is being used to delimit the normatively privileged perspective from which judgment is authoritative regarding political legitimacy, and it is informing (at least partly) the standard in relation to which that judgment (viz., the acceptance of a political power) is made.

(iii) Autonomy is referenced as a core ground in the justification of political rights in a broadly liberal political framework. It is argued, for example, that a theory of autonomy must be presupposed if we are to arrive at a theory of rights that is in principle acceptable to all. It is also argued that autonomy is absolutely central in views of rights that enshrine the idea that people have the freedom equally to conceive and enjoy widely different forms of meaningful life. Attracta Ingram (1994) provides a clear articulation of the view that autonomy deserves a central place in the defense of a scheme of political rights: “I think that the most compelling answer is that people’s most vital human interest is in living meaningful lives. This interest cannot be secured while they are at risk of slavery, social subordination, repression, persecution, and grinding poverty – conditions that history shows to be the lot of many in societies which do not recognize the value of individual freedom. So it is rational for people to want to develop the mental capacities and social environment necessary to living independent lives. Since what is at stake is the proper distribution of human freedom, we have here matters of justice and rights; the province of political morality.” (112-3) Autonomy is thus said by some to be a–or the–core unifying value in a conception of rights that is liberal (and hence pluralistic) in tenor (see also Richards 1989). In addition, the value of autonomy is referenced to justify particular rights such as the right to free speech (Brison 2000) or the right to privacy (Kupfer 1987).

(iv) Autonomy is referenced by many as the core value that militates against the acceptability of political (and informal) paternalism. According to a widely-accepted conception, an act is paternalistic if it involves direct interference with another’s actions and will for the purpose of advancing (what the interferer takes to be) that person’s own good. Paternalism bypasses the agent’s capacity to be self-directing and ignores the agent’s wishes regarding the way she would like to live her own life; and it is these factors that constitute a violation of the autonomy of the one suffering paternalistic influence. It is commonly held that possession of the capacity for autonomy gives the agent a right and an authority–at least in relation to minimally voluntary, self-regarding choices (all else being equal)–to be self-determining without interference (for a detailed account of paternalism and the defense of the claims of autonomy, see VanDeVeer 1986; see also Mill’s classic argument against paternalism in On Liberty). Supporters of paternalistic doctrines tend to argue that paternalism is justified based either on the highly beneficial consequences of such interference, or on the ground that a policy of paternalism could be hypothetically accepted by autonomous agents when the possible related consequences are severe enough (on the latter see Rawls 1999). It has also been argued that a certain degree of paternalism is unavoidable, but that such paternalism should be constrained by the goal of leading persons to welfare-promoting choices while not threatening freedom of choice (Sunstein and Thaler 2003).

It is worth mentioning in passing that J.S. Mill, who is often referenced as a champion of individual liberty and a firm critic of paternalistic policy, endorsed a strong version of paternalism, but only in relation to “those backward states of society in which the race itself may be considered as in its nonage.” In relation to these, Mill claimed that “Despotism is a legitimate mode of government in dealing with barbarians, provided the end be their improvement and the means justified by actually effecting that end.” Although these claims of Mill’s would find few supporters today, it is worth adding that the standard that Mill employed to ground the distinction between unjustified and justified paternalism was the presence of a kind of maturity of thought and judgment that is not greatly dissimilar to autonomy: “[A]s soon as mankind have attained the capacity of being guided to their own improvement by conviction or persuasion…compulsion, either in the direct form or in that of pains and penalties for noncompliance, is no longer admissible as a means to their own good, and justifiable only for the security of others.”

d. Autonomy in Philosophy of Education

Several philosophers have argued that autonomy development is the most important goal (or at least one of the most important goals) of a liberal education. Reasoning in support of this claim usually takes two forms. Firstly, some argue that autonomy should be the primary goal of liberal education because autonomy enhancement is the most important goal of the liberal state, and hence an education in such a state should be an education for autonomy (see White 1991, and compare with Raz 1986, ch. 14). Secondly, some argue that autonomy should be the goal of liberal education because it should be a key goal of any form of education, largely because an education for autonomy is crucial for human well-being across the board.

The latter position has been challenged by communitarians, however, who argue that there is no justification for the claim that autonomy is universally valuable, and who see autonomy as at best a parochial (Western) value (see MacIntyre 1981, White 1991, and Raz 1986). The communitarian argument has been challenged in various ways. It has been directly counter-argued, for example, that autonomy is universally intrinsic to well-being (see Norman 1994 and Haji & Cuypers 2008). In addition, the epistemic benefits of autonomy development for forming rational judgments about one’s life have been cited as reason for allowing the state to mandate education for autonomy, even over the protests of more traditionally-minded parents (MacMullen 2007). Although communitarians continue to be suspicious of the claim that autonomy should be a goal of all education, it is widely agreed that education for autonomy is central to an education in a liberal society.

4. Warrant for the Principle of Respect for Autonomous Choice

As mentioned above (in section 3a), the idea that autonomy gives rise to demands of respect can take two forms. On the one hand, it is argued that the possession of autonomy or the capacity for it grounds human dignity and the basic moral respect for persons that attends that dignity. On the other hand, it is argued that the fact that a choice or demand is autonomous is reason to give special or added normative uptake to that choice or demand. For clarity, one might refer to the former as the principle of respect for autonomy and the latter as the principle of respect for autonomous choice. The principle of respect for autonomy has already been examined in connection with Kant’s moral philosophy, and it was shown that although this principle has been popular, it is also quite controversial, largely because of problems involving exclusion. The principle of respect for autonomous choice will be examined in the present section. As shown above, this principle plays a key role in a variety of normative debates, especially debates in applied ethics. As has been mentioned, however, the principle is often either invoked without supporting argument or is given thin justification at best. The principle is therefore worthy of further discussion, especially with regard to its normative warrant. What is the warrant for the claim that autonomous choices give rise to special demands of respect? Two views have emerged on this question. Unsurprisingly, these views can be delineated along deontological and consequentialist lines.

Firstly, many philosophers, following Kant (often only roughly), contend that autonomous choices deserve special respect because persons, as capable of self-determination, are entitled, all else being equal, to be self-determining without interference. This may be termed the authority view of the justification for the principle of respect for autonomous choice. The authority view is most often allied to the view that respect for autonomy functions primarily as a side-constraint which forbids paternalistically-motivated interference in the self-regarding, minimally voluntary choices of others, even if such interference would be prudentially best for the choosing persons.

Secondly, some philosophers, following Mill, contend that autonomous choices deserve special respect because a policy of such respect conduces to desirable prudential consequences, either for the choosing agent, or aggregately. On this view, autonomous choices are not to be respected merely because they are autonomous, or because those making them have a capacity for self-determination, but rather because doing so will lead to the most beneficial prudential results. This more consequentialist view of the normative warrant for the principle of respect for autonomous choice may be termed the benefit view.

Based on the literature, it is quite clear that the authority view is the dominant view in the field, and has been for some time. Many philosophers hold that persons have a right to have their self-determining choices respected even in cases where there is good reason to think that the fulfillment of their autonomous choices would lead to bad prudential results (see Wellman 2003 and Darwall 2006). Against this it may be argued that where the prudential results of respecting a person’s autonomous choice are disastrous enough, interference may be justified, thus opening the door to the salience of consequentialist considerations bearing upon the principle (see Young 1982 and Wellman 2003). It has also been argued that the relation between the fulfillment of at least minimally robust autonomous choice and the resulting expression of authentic selfhood (conceived as highly prudentially significant) suggests that the benefit view deserves to be given closer attention (Piper 2009). Given the great popularity and wide employment of the principle of respect for autonomous choice, it is safe to say that the question of its normative warrant deserves far greater attention than it has thus far received.

5. References and Further Reading

  • Anderson, Joel and Axel Honneth. “Autonomy, Vulnerability, Recognition, and Justice.”  In Autonomy and the Challenges to Liberalism: New Essays, eds. John Christman and Joel Anderson, 127-149. Cambridge: Cambridge University Press, 2005.
  • Arpaly, Nomy. “Responsibility, Applied Ethics, and Complex Autonomy Theories.” In Personal Autonomy: New Essays in Personal Autonomy and Its Role in Contemporary Moral Philosophy, ed. James Stacey Taylor, 162-180. Cambridge: Cambridge University Press, 2005.
  • Arrington, Robert L. “Advertising and Behavior Control.” Journal of Business Ethics 1, no. 1 (Feb. 1982): 3-12.
  • Berlin, Isaiah. Two Concepts of Liberty. Oxford: Oxford University Press, 1958.
  • Berofsky, Bernard. Liberation from Self: A Theory of Personal Autonomy. Cambridge: Cambridge University Press, 1995.
  • Brison, Susan J. “Relational Autonomy and the Freedom of Expression.”  In Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self, eds. Catriona Mackenzie and Natalie Stoljar, 280-299. Oxford: Oxford University Press, 2000.
  • Brock, Dan W. “Voluntary Active Euthanasia.” The Hastings Center Report 22, no. 2 (1993): 10-22.
  • Buss, Sarah. “Valuing Autonomy and Respecting Persons: Manipulation, Seduction, and the Basis of Moral Constraints.” Ethics 115, no. 2 (Jan 2005): 195-235.
  • Christman, John and Joel Anderson, eds. Autonomy and the Challenges to Liberalism: New Essays. Cambridge: Cambridge University Press, 2005.
  • Christman, John. “Autonomy, History, and the Subject of Justice.” Social Theory and Practice 33, no. 1 (January 2007): 1-26.
  • Christman, J. “Autonomy, Self-Knowledge, and Liberal Legitimacy.” In Autonomy and the Challenges to Liberalism: New Essays, eds. John Christman and Joel Anderson, 330-357. Cambridge: Cambridge University Press, 2005.
  • Cooper, John. “Stoic Autonomy.” In Autonomy, eds. Ellen Frankel Paul, Fred Miller, and Jeffrey Paul: 1-29. Cambridge: Cambridge University Press, 2003.
  • Crisp, Roger. “Persuasive Advertising, Autonomy, and the Creation of Desire.” Journal of Business Ethics 6, no. 5 (July 1987): 413-418.
  • Dagger, Richard. “Autonomy, Domination, and the Republican Challenge to Liberalism.”  In Autonomy and the Challenges to Liberalism: New Essays, eds. John Christman and Joel Anderson, 177-203. Cambridge: Cambridge University Press, 2005.
  • Darwall, Stephen. “The Value of Autonomy and Autonomy of the Will.” Ethics 116, no. 2 (January 2006): 263-284.
  • Doerflinger, Richard. “Assisted Suicide: Pro-Choice or Anti-Life?” The Hastings Center Report 19, no. 1 (Jan-Feb 1989): 16-19.
  • Doyle, Oran. “Direct Discrimination, Indirect Discrimination, and Autonomy.” Oxford Journal of Legal Studies 27, no. 3 (2007): 537-553.
  • Dworkin, Gerald. The Theory and Practice of Autonomy. Cambridge: Cambridge University Press, 1988.
  • Dworkin, Gerald. “Autonomy and Informed Consent.” In Ethical Health Care, eds. Patricia Illingworth and Wendy Parmet, 79-91. Upper Saddle River, NJ: Pearson Prentice-Hall, 2006.
  • Fischer, John Martin. “Abortion, Autonomy, and Control Over One’s Body.” Social Philosophy and Policy 20, no. 2 (2003): 286-306.
  • Fischer, John Martin and Mark Ravizza. Responsibility and Control: A Theory of Moral Responsibility. Cambridge: Cambridge University Press, 1998.
  • Frankfurt, Harry. “Freedom of the Will and the Concept of a Person.” In The Importance of What We Care About by Harry Frankfurt, 11-25. Cambridge: Cambridge University Press, 1988.
  • Frankfurt, Harry. Necessity, Volition, and Love. Cambridge: Cambridge University Press, 1999.
  • Friedman, Marilyn. Autonomy, Gender, Politics. Oxford: Oxford University Press, 2003.
  • Gardner, John. “Private Activities and Personal Autonomy: At the Margins of Anti-discrimination Law.” In Discrimination: The Limits of the Law, eds. Bob Hepple and Erika Szyszczak, 148-171. London: Mansell, 1992.
  • Griffin, James. Well-Being: Its Meaning, Measurement, and Moral Importance. Oxford: Oxford University Press, 1986.
  • Guyer, Paul. “Kant on the Theory and Practice of Autonomy.” In Autonomy, eds. Ellen Frankel Paul, Fred Miller, and Jeffrey Paul: 70-98. Cambridge: Cambridge University Press, 2003.
  • Hartling, O.J.  “Euthanasia–the Illusion of Autonomy.” Medicine and Law 25, no. 1 (2006): 189-99.
  • Hill, Thomas. “The Kantian Conception of Autonomy.” In The Inner Citadel: Essays on Individual Autonomy, ed. John Christman, 91-105. Oxford: Oxford University Press, 1989.
  • Hill, Thomas. “Autonomy and Benevolent Lies.” In Autonomy and Self-Respect by Thomas Hill, 25-42. Cambridge: Cambridge University Press, 1991.
  • Ingram, Attracta. A Political Theory of Rights. Oxford: Oxford University Press, 1994.
  • Haji, Ishtiyaque and Stefaan Cuypers. “Authenticity-Sensitive Preferentism and Educating for Well-Being and Autonomy.” Journal of Philosophy of Education 42, no. 1 (February 2008): 85-106.
  • Kant, Immanuel, trans. Mary Gregor. Groundwork of the Metaphysics of Morals.  Cambridge: Cambridge University Press, 1998.
  • Katz, Eric. Nature As Subject. New York: Rowman & Littlefield, 1997.
  • Korsgaard, Christine. Creating the Kingdom of Ends. New York: Cambridge University Press, 1996.
  • Kupfer, Joseph. “Privacy, Autonomy, and Self-Concept.” American Philosophical Quarterly 24, no. 1 (January 1987): 81-9.
  • MacIntyre, Alasdair. After Virtue: A Study in Moral Theory. Notre Dame: University of Notre Dame Press, 1981.
  • Mackenzie, Catriona and Natalie Stoljar, eds. Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self. Oxford: Oxford University Press, 2000.
  • MacMullen, Ian. Faith in Schools? Autonomy, Citizenship and Religious Education in the Liberal State. Princeton, NJ: Princeton University Press, 2007.
  • Marquis, Don. “Why Abortion is Immoral.” Journal of Philosophy 86 (April 1989): 183-202.
  • Mele, Alfred. Autonomous Agents: From Self-Control to Autonomy. Oxford: Oxford University Press, 1995.
  • Meyers, Diana. Self, Society, and Personal Choice. New York: Columbia University Press, 1989.
  • Mill, John Stuart. On Liberty. Edited by Curran V. Shields. New Jersey: Prentice-Hall Inc., 1997.
  • Nelson, Philip. “Advertising and Ethics.” In Ethics, Free Enterprise, and Public Policy: Original Essays on Moral Issues in Business, eds. Robert T. De George and Joseph A. Pichler. New York: Oxford University Press, 1978.
  • Norman, Richard. “‘I Did It My Way’: Some Thoughts on Autonomy.” Journal of Philosophy of Education 28, no. 1 (1994): 25-34.
  • Oshana, Marina. “Personal Autonomy and Society.” Journal of Social Philosophy 29, no. 1 (Spring 1998): 81-102.
  • Oshana, Marina. “How Much Should We Value Autonomy?” In Autonomy, eds. Ellen Frankel Paul, Fred Miller, and Jeffrey Paul, 99-126. Cambridge: Cambridge University Press, 2003.
  • Overall, Christine. “Selective Termination of Pregnancy and Women’s Reproductive Autonomy.” The Hastings Center Report 20, no. 3 (May-June 1990): 6-11.
  • Piper, Mark. “On Respect for Personal Autonomy and the Value Instantiated in Autonomous Choice.” Southwest Philosophy Review 25, no. 1 (January 2009): 189-198.
  • Pohlmann, R. “Autonomie.” In Historisches Wörterbuch der Philosophie, ed. J. Ritter, 1: 701-719. Basel: Schwabe, 1971.
  • Rawls, John. A Theory of Justice, rev. ed. Cambridge, MA: Harvard University Press, 1999.
  • Raz, Joseph. The Morality of Freedom. Oxford: Oxford University Press, 1986.
  • Reath, Andrews. “Autonomy of the Will as the Foundation of Morality.” In Agency and Autonomy in Kant’s Moral Theory by Andrews Reath, 121-172. Oxford: Oxford University Press, 2006.
  • Richards, David A.J. “Rights and Autonomy.” In The Inner Citadel: essays on Individual Autonomy, ed. John Christman, 203-220. Oxford: Oxford University Press, 1989.
  • Safranek, John. “Autonomy and Assisted Suicide: The Execution of Freedom.” The Hastings Center Report 28, no. 4 (July-Aug. 1998): 32-36.
  • Sartre, Jean-Paul, trans. Bernard Frechtman. Existentialism is a Humanism. New York: Philosophical Library, 1946.
  • Sumner, L.W. Welfare, Happiness, and Ethics. Oxford: Oxford University Press, 1996.
  • Sunstein, Cass and Richard Thaler. “Libertarian Paternalism is Not an Oxymoron.” The University of Chicago Law Review 70, no. 4 (Fall 2003): 1159-1202.
  • Taylor, James Stacey. “Introduction.” In Personal Autonomy: New Essays on Personal Autonomy and Its Role in Contemporary Moral Philosophy, ed. James Stacey Taylor, 1-29. Cambridge: Cambridge University Press, 2005.
  • VanDeVeer, Donald. Paternalistic Intervention: The Moral Bounds of Benevolence.  Princeton, NJ: Princeton University Press, 1986.
  • Webber, May and Ernest Shulman. “Personal Autonomy and Rational Suicide.” Paper given at the Annual Convention of the American Association of Suicidology/International Association for Suicide Prevention (1987).
  • Wellman, Christopher Heath. “The Paradox of Group Autonomy.” In Autonomy, eds. Ellen Frankel Paul, Fred Miller, and Jeffrey Paul, 265-285. Cambridge: Cambridge University Press, 2003.
  • White, John. Education and the Good Life: Autonomy, Altruism, and the National Curriculum. New York: Teachers College Press, 1991.
  • Wolf, Susan. Freedom Within Reason. New York: Oxford University Press, 1990.
  • Young, Robert. “The Value of Autonomy.” The Philosophical Quarterly 32, no. 126 (January 1982): 35-44.
  • Zupan, Daniel S. War, Morality, and Autonomy: An Investigation in Just War Theory. Hampshire, England: Ashgate Publishing Ltd., 2004.

Author Information

Mark Piper
Email: pipermc@jmu.edu
James Madison University
U. S. A.

Cloning

In biology, the activity of cloning creates a copy of some biological entity such as a gene, a cell, or perhaps an entire organism. This article discusses the biological, historical, and moral aspects of cloning mammals. The main area of concentration is the moral dimensions of reproductive cloning, specifically the use of cloning in order to procreate.

The article summarizes the different types of cloning, such as recombinant DNA/molecular cloning, therapeutic cloning, and reproductive cloning. It explores some classic stereotypes of human clones, and it illustrates how many of these stereotypes can be traced back to media portrayals about human cloning. After a brief history of the development of cloning technology, the article considers arguments for and against reproductive cloning.

One of the most predominant themes underlying arguments for reproductive cloning is an appeal to procreative liberty. Because cloning may provide the only way for some individuals to have a child that is genetically their own, a ban on cloning interferes with their reproductive autonomy.

Arguments against cloning appeal to concerns about a clone’s lack of genetic uniqueness and what may be implied because of this. Human cloning is of special interest. There are concerns that cloned humans would lack individuality, that they would be treated in undignified ways by their creators, or that they would be damaged by society’s expectations that they should be more like those from whom they were cloned. Because they would essentially be facsimiles of the original person, there is concern that the clones might possess less moral worth. The predominant theme underlying arguments against human cloning is that the cloned child would undergo some sort of physical, social, mental, or emotional harm. Because of these and other concerns, the United Nations and many countries have banned human cloning. An important philosophical issue is whether such a response against human cloning is warranted.

Table of Contents

  1. Types of Cloning
    1. Recombinant DNA Technology / Molecular Cloning
    2. Therapeutic Cloning
    3. Reproductive Cloning
  2. Misconceptions About Cloning and Their Sources
  3. Cloning Mammals: A Brief History
  4. Arguments in Favor of Reproductive Cloning and Responses
    1. Reproductive Liberty: The Only Way to Have a Genetically Related Child
    2. Cloning and Savior Siblings
    3. Cloning In Order to “Replace” a Deceased Child
    4. The Resultant Loss of Therapeutic Cloning for Stem Cell Research and Treating Diseases
  5. Arguments Against Reproductive Cloning and Responses
    1. The Right to an Open Future
    2. The Right to a Unique Genetic Identity
    3. Cloning is Wrong because it is “Playing God” or because it is “Unnatural”
    4. The Dangers of Cloning
    5. Cloning Entails the Creation of Designer Children, or it Turns Children into Commodities
    6. Cloning and the Ambiguity of Familial Roles
  6. References and Further Reading

1. Types of Cloning

a. Recombinant DNA Technology / Molecular Cloning

DNA/Molecular cloning has been in use by molecular biologists since the early 1970s. When scientists wish to replicate a specific gene to facilitate more thorough study, molecular cloning is implemented in order to generate multiple copies of the DNA fragment of interest. In this process, the specific DNA fragment is transferred from one organism into a self-replicating genetic element, e.g., a bacterial plasmid (Allison, 2007).

Because this kind of cloning does not result in the genesis of a human organism, it has no reproductive intent or goals, and it does not result in the creation and destruction of embryos, there is little to no contention regarding its use.

b. Therapeutic Cloning

Embryonic stem cells are derived from human embryos at approximately five days post-fertilization, in the blastocyst stage of development. Because of their plasticity, embryonic stem cells can be manipulated to become any cell in the human body, e.g., neural cells, retinal cells, liver cells, pancreatic cells, or heart cells. Many scientists hope that, with proper research and application, embryonic stem cells can be used to treat a wide variety of afflictions, e.g., tissue toxicity resulting from cancer therapy (National Cancer Institute, 1999), Alzheimer’s disease (Gearhart, 1998), Parkinson’s disease (Freed et al, 1999; National Institute of Neurological Disorders and Stroke, 1999; Wager et al. 1999; Gearhart, 1998), diabetes (Voltarelli et al, 2007; Shapiro et al., 2000), heart disease (Lumelsky, 2001; Zulewski, 2001), and limb paralysis (Kay and Henderson, 2001).

One current obstacle to the successful use of embryonic stem cells for disease therapy concerns immunological rejection. If a patient were to receive stem cell therapy in order to treat some affliction, her body may reject the stem cells for the same reason human bodies have a tendency to reject donated organs: the body fails to recognize foreign cells as its own and therefore attacks them. One way to overcome stem cell rejection is to create embryos through somatic cell nuclear transfer (SCNT) using the patient’s own DNA. In SCNT, an ovum is emptied of its own nucleus and DNA, and the chromosomal DNA from another person (in this case, the patient) is inserted. The ovum is then artificially induced (usually via an electrical current) to begin dividing as if it had been naturally fertilized. Once the embryo is approximately five days old, the stem cells are removed, cultured, differentiated into the desired type of body cell, and inserted back into the patient (the genetic donor in this case). Since the embryo would be a genetic duplicate of the patient, there would be no immunological rejection. In 2008, a California research team succeeded in creating human embryos via SCNT and growing them to the blastocyst stage (French et al., 2008). One use of this technology, for example, is to help treat individuals in the aftermath of a heart attack. Using SCNT to create a genetically identical blastocyst, new healthy cells could be derived and inserted back into the genetic donor’s heart in order to replace the damaged cardiac cells (Strauer, 2009).

It may also be possible to use therapeutic cloning to repair defective genes by homologous recombination (Doetschman et al., 1987). Cellular models of diseases can be developed as well, along with the ability to test drug efficacy: “cloning a single skin cell from a patient with a disease could be used to produce inexhaustible amounts of cells and tissue with that disease. The tissue could be experimented upon to understand why disease occurs. It could be used to understand the genetic contribution to disease and to test vast arrays of new drugs which could not be tested in human people” (Savulescu, 2007, 1-2). Pluripotent stem cells can also be used to test drug toxicity, which could also diminish the chances of drug-related birth defects (Boiani and Schöler, 2002, 124).

Therapeutic cloning is controversial because isolating the stem cells from the embryo destroys it. Many individuals regard the human embryo as a person with moral rights, and so they consider its destruction to be morally impermissible. Moreover, because the embryos are created with the explicit intention to destroy them, there are concerns that this treats the embryos in a purely instrumental manner (Annas et al., 1996). Although some ethicists are in favor of using surplus embryos from fertility treatments for research (since the embryos were slated for destruction in any case), they are simultaneously against creating embryos solely for research due to the concern that doing so treats the embryos purely as means (Outka, 2002; Peters, 2001). Indeed, it is precisely because of these ethical issues that some individuals object to the positive connotations of the term “therapeutic” and refer to this work, instead, as “research cloning.” The term “therapeutic cloning” is, however, more widely used.

c. Reproductive Cloning

SCNT can also be used for reproductive purposes. Unlike in therapeutic cloning, the cloned embryo is transferred into the uterus of a female of the same species and, upon successful implantation, is allowed to gestate as a naturally fertilized egg would. The cloned embryo would possess chromosomal DNA identical to that of its genetic predecessor, but, because of the use of a different ovum, its mitochondrial DNA (the genetic material inhabiting the cytoplasm of the enucleated ovum) would differ, and, consequently, it would not be 100% genetically identical (unlike monozygotic multiples who, because they are derived from the same ovum, share identical chromosomal and mitochondrial DNA). In addition to this slight genetic difference, the cloned embryo would likely be gestated in a different uterine environment, which can also influence its development in ways that distinguish it from its genetic predecessor. For example, a cloned entity’s phenotype (its appearance) may differ markedly from that of its genetic predecessor because the embryo can undergo epigenetic reprogramming, where nongenetic (i.e., environmental) causes influence genes to manifest themselves differently. The result is that the genes behave in ways that may lead to a difference in appearance.

In addition to somatic cell nuclear transfer, there is another, less controversial and less technologically complex, manner of reproductive cloning: artificial embryo twinning. Here, an embryo is created in a Petri dish via In Vitro Fertilization (IVF). The embryo is then induced to divide into genetic copies of itself, thereby artificially mimicking what happens when monozygotic multiples are formed (Illmensee et al., 2009). The embryos are then transferred into a womb and, upon successful implantation and gestation, are born as identical multiples. If implantation is unsuccessful, the process is repeated.

One argument in favor of artificial embryo twinning is that it provides an infertile couple, who may not have been able to produce many viable embryos through IVF, with more embryos that they can then implant for an increased chance at successful reproduction (Robertson, 1994). Because some of the embryos may be saved and implanted later, it is possible to create identical multiples who are not born at the same time. One advantage to doing this is that the later born twin could serve as a blood or bone marrow donor for her older sibling should the need arise; because they are genetically identical, the match would be guaranteed (the converse could also hold, that is, the older individual could serve as a donor for the clone should the latter ever need it. The existence of a cloned person, therefore, could be mutually beneficial, rather than asymmetrical). However, some concerns have been raised. For example, it has been argued that artificially dividing the embryo constitutes an immoral manipulation of it and that, as much as possible, a unique embryo should be allowed to develop without interference (McCormick, 1994). Concerns over individuality have also been raised; whereas naturally occurring twins are valued as individuals, one worry is that embryos created through artificial twinning, precisely because of the synthetic nature of their genesis, may not be as valued (McCormick, 1994).

2. Misconceptions About Cloning and Their Sources

The general public still seems to regard human reproductive cloning as something that can occur only in the realm of science fiction. The portrayal of cloning in movies, television, and even in journalism has ranged from the comedic to the sinister. Human clones have often been depicted in movies as nothing but carbon copies of their genetic predecessor with no minds of their own (e.g., Multiplicity and Star Wars: Attack of the Clones), as products of scientific experiments that have gone horribly wrong, resulting in deformed quasi-humans (Alien Resurrection) or murderous children (Godsend), as persons created simply for spare parts for their respective genetic predecessor (The Island), or as deliberate recreations of famous persons from the past who are expected to act just like their respective predecessor (The Boys from Brazil). Even when depicting nonhuman cloning, films (such as Jurassic Park) tend to portray products of cloning as menacing, modern-day Frankensteinian monsters of sorts, which serve to teach humans a lesson about the dangers of “playing God.”

Many other media outlets, although they usually shy away from the ominous representation of clones so prevalent in the movies, have typically portrayed clones as, essentially, facsimiles of their genetic predecessor. On the several occasions on which Time Magazine has addressed the issue of cloning, its covers have featured duplicate instances of the same picture. For example, the February 19, 2001 cover shows two mirror-image infants staring at each other, the tagline suggesting that cloning may be used by grieving parents who wish to resurrect their dead child. Even a Discovery Channel program, meant to educate its viewers on the nature of cloning, initially portrays a clone as nothing more than a duplicate of the original person. Interestingly enough, however, a few minutes into the program, the narrator, speaking over a picture of two identical cows, says: “But even if a clone person is created, that doesn’t mean it would be an exact copy of the original.” Yet almost immediately afterwards, the same narrator calls a clone “You, version 2.0.”

As philosopher Patrick Hopkins has pointed out, media conceptions about what human cloning entails, and the type of offspring that will arise from cloning, employ the tacit premise that clones are nothing but copies. The predominant belief that fuels this conception is that genetic determinism is true, i.e., that a person’s genes are the sole determining factor of her behavior and physical appearance; essentially, that a person’s identity is solely determined by her genetic constitution. If a person believes that genetic determinism is true, then it follows that she believes that a cloned person would be psychologically identical with her genetic predecessor because they are (almost) genetically identical. Hopkins also points out that, like the narrator in the Discovery Channel program, many media outlets “engage in confusing, contradictory bits of double-talk (or double-show). The images and not-very-clever headlines all convey unsettling messages that clones will be exact copies, while inside the stories go to some effort to educate us that clones will not in fact be exact copies” (1998, 129-130).

3. Cloning Mammals: A Brief History

In 1894, Hans Driesch cloned a sea urchin through inducing twinning by shaking an embryonic sea urchin in a beaker full of sea water until the embryo cleaved into two distinct embryos. In 1902, Hans Spemann cloned a salamander embryo through inducing twinning as well, using a hair from his infant son as a noose to divide the embryo. In 1928, Spemann successfully cloned a salamander using nuclear transfer. This involved enucleating a single-celled salamander embryo and inserting into it the nucleus of a differentiated salamander embryonic cell. In 1951, Robert Briggs and Thomas King, using Spemann’s methods of embryonic nucleus transfer, successfully cloned frogs. In 1962, John Gurdon announced that he too had successfully cloned frogs but, unlike Briggs and King’s method, he did so by transferring differentiated intestinal nuclei from feeding tadpoles (Wilmut et al., 2000). Gurdon’s successful use of differentiated nuclei, rather than the embryonic nuclei used by Briggs and King, was particularly surprising to the scientific community. Because embryonic cells are undifferentiated, and therefore extremely malleable, it was not too surprising that transferred embryonic nuclei produced distinct embryos when inserted into an enucleated oocyte. However, inciting differentiated nuclei to behave as undifferentiated nuclei was thought to be impossible, since the conventional wisdom at the time was that once a cell was differentiated (e.g., once it became a cardiac cell, a liver cell, or a blood cell) it could never reverse into an undifferentiated state. It was for this reason that, for a long time, creating a cloned embryo from adult somatic cells was thought to be impossible – it would require taking long-time differentiated cells and getting them to behave like the totipotent cells (cells that are able to differentiate into any cell type, including the ability to form an entirely distinct organism) found in newly fertilized eggs.

In 1995, Dr. Ian Wilmut and Dr. Keith Campbell successfully cloned two mountain sheep, Megan and Morag, from embryonic sheep cells. One year later, in 1996, Wilmut and Campbell successfully cloned the first mammal to be born from an adult somatic cell, specifically an udder cell (a cell from a sheep’s mammary gland): Dolly the sheep (Wilmut et al., 1997). In other words, Wilmut and Campbell were able to take a fully differentiated adult cell and revert it back to an undifferentiated, totipotent, state. This was the first time the process had been accomplished for mammalian reproduction. Furthermore, they were able to create a viable pregnancy and produce from it a healthy lamb (however, there were 276 failed attempts before Dolly was created, which, as will be discussed below, raises concerns over the safety and efficacy of the procedure). Dolly the sheep died in 2003 after having been euthanized due to her suffering from pulmonary adenomatosis, a disease fairly common in sheep that are kept indoors; indeed, many members of Dolly’s flock had succumbed to the same disease. Additionally, she suffered from arthritis. Before she died, she produced six healthy lambs through natural reproduction. Since Dolly, many more mammals have been cloned through the use of SCNT. Some examples are deer, ferrets (Li et al., 2006), mules (Lovgren, 2003), other sheep, goats, cows, mice, pigs, rabbits, a gaur, dogs, and cats. One possible use of reproductive cloning technology is to help save endangered species (Lanza et al., 2000). In 2005, two endangered gray wolves were cloned in Korea (Oh et al., 2008).

The successful cloning of household pets holds special significance in that, when discussing the circumstances that led to their cloning, we can begin to discuss the ethical issues that arise in human reproductive cloning. In 2001, the first feline created via somatic cell nuclear transfer was born. She was named CC, short for “Copy Cat,” and was born at the College of Veterinary Medicine at Texas A&M University. The research that led to her creation was funded by the California-based company “Genetic Savings and Clone,” which, between 2004 and 2006, offered grieving pet owners a chance to clone their sick or deceased pets (it closed its doors in 2006 due to the unsustainability of its business). What is most striking about CC is not simply her mere existence, but also that CC does not look or act like her feline progenitor, Rainbow. Whereas Rainbow, a calico, is stocky and has patches of tan, orange, and white throughout her body, CC barely resembles a calico at all. Not only is she lanky and thin, but she also has a grey coat over a white body and lacks the patches of orange and tan typical of calicos. There are personality differences between Rainbow and CC as well; whereas Rainbow is described as shy, reticent, and a more “hands-off” kind of cat, CC is described as more playful, inquisitive, and affectionate (Hays, 2003).

“Genetic Savings and Clone” was founded by Lou Hawthorne, who was seeking a means to clone his family’s beloved dog Missy. Although Missy died before she was successfully cloned, Hawthorne banked her DNA in the hopes of ultimately succeeding in this endeavor. In 2004, a Texas woman paid $50,000 to clone her deceased Maine Coon Nicky and, as a result, Little Nicky, the world’s first commercially cloned cat, was born. This was followed, in 2005, by the birth of Snuppy, the world’s first cloned dog. In 2007, three clones from Missy’s DNA were created and returned to the Hawthorne family. All this has prompted some pet owners to pay large sums of money to clone their beloved deceased pets. Alan and Kristine Wolf paid thousands of dollars to have their deceased cat, Spot, cloned from skin cells they had preserved. In the Wolfs’ minds, preserving Spot’s skin cells was almost equivalent to preserving Spot himself. In other words, the Wolfs (and the woman who cloned Nicky) were willing to spend an exorbitant amount of money to clone their pets not merely in order to receive another pet, but rather to receive what was, in their eyes, the same pet that they had lost (Masterson, 2010).

This allows us to begin exploring the ethical issues in the reproductive cloning debate. Some questions that arise are: Why did these individuals regard the recreation of the same DNA to equate to the recreation of the same entity that had died? Will these expectations transfer over to human cloning, where people will regard cloned children as the same individuals as their genetic predecessors, and therefore treat them with this expectation in mind? Will cloning, thus, compromise a child’s identity? Are such concerns grave enough to permanently ban reproductive cloning altogether?

4. Arguments in Favor of Reproductive Cloning and Responses

a. Reproductive Liberty: The Only Way to Have a Genetically Related Child

The Argument.

Procreative liberty is a right well established in Western political culture (Dworkin, 1994). However, not everyone is physically capable of procreating through traditional modes of conception. Cloning may be the only way for an otherwise infertile couple to have a genetically related child. Therefore, providing cloning as an option contributes to a greater scope of procreative liberty (Häyry, 2003; Harris, 2004; Robertson, 1998). For example, a couple may be able to generate only a few embryos from IVF procedures; cloning via artificially induced twinning would increase the number of embryos to a quantity that is more likely to result in a live birth. In another case, the male partner in a relationship may be unable to produce viable sperm and, instead of seeking a sperm donor, the couple can choose to use SCNT in order to produce a genetic copy of the prospective father. Since the prospective mother would use her own ova, they would both contribute genetically to the child (albeit in different proportions than a couple who conceived through the union of their gametes). In yet another example, neither parent may have usable gametes, so they employ a donor ovum, clone one of the two parents, and gestate the fetus in the female’s uterus. Or, perhaps one of the prospective parents is predisposed to certain genetic disorders and, in order to completely avoid their offspring inheriting these disorders, they decide to clone the other prospective parent. A single woman may want to have a baby, and would rather clone herself instead of using donated sperm. Also, cloning may give homosexual couples the opportunity to have genetically related children (this is especially true for female couples, where one partner provides the mitochondrial DNA and the other partner provides the chromosomal DNA). These are a few examples of how cloning may provide a genetically related child to a person otherwise unable to have one. Because cloning may be the only way some people can procreate, to deny cloning to these people would be a violation of procreative liberty (Robertson, 2006).

Response 1: Negative vs. positive right to procreate.

One response is to distinguish between a positive right to procreate and a negative right to procreate (Pearson, 2007), and argue that reproductive liberty can be fully respected in the latter sense, and only conditionally respected in the former sense. This conditional respect may support the permissibility of prohibiting human cloning for reproductive purposes.

A negative right to x means that others have a prima facie obligation not to interfere with your pursuit of x. If you possess a negative right to x, this entails only one obligation on the part of others: the obligation not to obstruct your obtainment of x. For example, if I have a negative right to life, what this entails is that others have an obligation not to kill me, since killing me obstructs or hinders my right. Another way to regard it is that a negative right requires only passive obligations (the obligation to not do something, or to refrain from acting).

A positive right requires more from obligation-bearers; it requires that active steps be taken in order to provide the right-bearers with the means to fulfill that right. If I have a positive right to life, for instance, it is not just that others have an obligation to not kill me; they have a further obligation to provide me with any services that I would need to ensure my survival. That is, the obligation becomes an active one as well as a passive one: an obligation to not destroy my life and also to provide services that enable me to preserve my life.

Keeping this distinction in mind, it is possible to deny that the right to reproduce is a positive right in the first place. That is, while we ought not to prevent anyone from procreating, we are not required to provide them with any technology whatsoever in order to enable them to procreate if they cannot do so by their own means. Hence, limiting an otherwise infertile couple’s access to certain types of assisted reproductive technology would not necessarily infringe on their (negative) right to procreate (Courtwright and Doron, 2007). Some have argued the opposite, however, and have maintained that respect for procreative liberty entails not only access to assisted reproductive technology but also the right to employ gamete donors and surrogate mothers (Ethics Committee of the American Fertility Society, 1985).

Response 2: Procreative liberty is not categorical.

Another possible response is to stress that, even if there is a positive right to procreate, the right is a prima facie rather than a categorical one, and it is not the case that any step taken to combat infertility is in itself ethical (McCormick, 1993). Therefore, determining what types of services can be offered to infertile couples must be tempered by certain considerations; for example, the safety of the offspring born as a result of these services must be taken into account. If a particular type of reproductive technology poses a health risk to the resulting children, this is grounds enough to prevent the use of that technology (Cohen, 1996). In other words, even granting that individuals have a positive right to procreate, it does not follow from this alone that they should be provided with any means necessary for successful procreation. They may not be entitled to the use of a certain technological advancement (e.g., SCNT) if that advancement is deemed to pose a danger to the resulting offspring. Robertson concedes this objection, but he responds that “if a ban on cloning is justified, then a ban on many other forms of assisted reproduction and genetic selection should be as well, yet few persons are prepared to go that far” (2006, 206). That is, in order for advocates of this objection to be consistent, they should be equally willing to ban other forms of reproductive technology that may result in harm to potential offspring.

b. Cloning and Savior Siblings

The Argument.

The concept of a “savior sibling,” a child deliberately conceived so that she can provide a means (through the donation of bodily fluids, umbilical cord blood, a non-vital organ, or tissue) to save an older sibling from illness or death, is not new. What is new is that cloning would ensure that the new child is an appropriate match for the existing, ailing person, since the two would be genetically identical. Permitting cloning, therefore, would allow for a more expedient means of creating a savior sibling, since the alternatives (using preimplantation genetic diagnosis to screen embryos to determine which are genetically compatible with the sibling, implanting into a womb only the ones that are a match and discarding the others, or creating an embryo through natural reproduction and terminating the pregnancy if it is not a genetic match) are more involved and more time-consuming. Of course, the rights of the new child would have to be respected; tissue, organs, or bodily fluids should be removed only with her consent (although this would not apply to umbilical cord blood banking, since an infant lacks the capacity to give consent) (Robertson, 2006).

Response: Violating Kant’s formula of humanity.

Such a prospect raises concerns that cloning would facilitate viewing the resulting children as objects of manufacture, rather than as individuals with value and dignity of their own. The prospect of creating a child, solely to meet the needs of another child and not for her own sake, reduces the created child to a mere means to achieve the ends of the parents and the sick child. While it is admirable that the parents wish to save their existing child, it is not ethically permissible to create another child solely as an instrument to save the life of her sibling (Quintavalle, 2001).

Another way of explaining it is that creating a child solely for the purpose of providing life-saving aid for another child violates the second formulation of Immanuel Kant’s categorical imperative, the formula of humanity. Kant proscribes treating persons as a mere means, rather than as ends in themselves, maintaining that persons should “act in such a way that [humanity is treated] always at the same time as an end and never simply as a means” (1981, 36). Creating a child for the sole purpose of saving another child violates the formula of humanity because the child is created specifically for this end.

It should be noted, however, that such an objection would apply to any method that is used to create a child for similar reasons, including any other type of reproductive technology or even natural procreation. It is the intention with which a child is created that is in question here, not the method that is used in order to create the child. Another response is that Kant’s dictum is misapplied. A child who is created as a “savior sibling” may still, also, be loved and respected as an individual in her own right, and therefore may not necessarily be treated solely as a means (Boyle and Savulescu, 2001).

c. Cloning In Order to “Replace” a Deceased Child

The Argument.

In his article “Even If It Worked, Cloning Won’t Bring Her Back,” ethicist Thomas Murray recounts a letter he heard read at a congressional hearing on human reproductive cloning. A chemist, who was presenting her views in support of reproductive cloning, read a letter by a father grieving the death of his infant son. Murray writes:

Eleven days ago, as I awaited my turn to testify at a congressional hearing on human reproductive cloning, one of five scientists on the witness list took the microphone. Brigitte Boisselier, a chemist working with couples who want to use cloning techniques to create babies, read aloud a letter from “a father (Dada).” The writer, who had unexpectedly become a parent in his late thirties, describes his despair over his 11-month-old son’s death after heart surgery and 17 days of “misery and struggle.” The room was quiet as Boisselier read the man’s words: “I decided then and there that I would never give up on my child. I would never stop until I could give his DNA – his genetic make-up – a chance” (2001).

Cloning, the argument goes, would provide grieving parents with this unique opportunity, the only opportunity “to get back the child that they lost”; depriving them of it would be morally wrong.

Response 1: Assuming genetic determinism.

Like many of the arguments against reproductive cloning listed below, this argument in favor of cloning, despite its emotional appeal, erroneously assumes that genetic determinism is true. The grieving father’s letter maintained that he would never “give up on my child,” and that the way he would achieve this is to “give his DNA – his genetic make-up – a chance.” In other words, the father equated his son as an individual person with his son’s genetic make-up: the assumption is that because he could recreate his son’s genes, he could recreate his son as a person. The tacit implication here is that cloning is desirable because it somehow presents a way to cheat death; it is through cloning that his son could be, in some sense, resurrected.

Given that individuals have sought to clone their deceased pets, the idea that grieving parents would seek to clone a deceased child is not far-fetched. Thomas Murray continues his article by disclosing that he too is a grieving father, having suffered the death of his twenty-year-old daughter, who was abducted from her college campus and shot. Yet cloning, Murray continues, “can neither change the fact of death nor deflect the pain of grief” (2001). Murray goes on to stress that, due to the many influences beyond genetic duplication, a clone would not, in fact, be a mere copy of its genetic predecessor. One interesting point is that both detractors of cloning (e.g., Kass and Callahan, whose views are explored below) and supporters of cloning (like the researcher who read this letter at the congressional hearing) converge in committing the same fallacy. Both assume that cloning recreates identity, and they differ only as to the desirability of that consequence. Yet, given the evidence that the robust form of genetic determinism these arguments assume is false (Resnik and Vorhaus, 2006; Elliot, 1998), both detractors and supporters of cloning who rely on it produce faulty arguments.

Response 2: A child is not replaceable.

Given the evidence that genetic determinism is false, Murray further stresses that using cloning as a method of replacing a dead child “is unfair. No child should have to bear the oppressive expectation that he or she will live out the life denied to his or her idealized genetic avatar…. Cloning a child to be a reincarnation of someone else is a grotesque, fun-house mirror distortion of parental expectations” (2001). Dan Brock further supports the contention that cloning in order to replace a deceased child is misguided (Brock, 1997). Moreover, because the parents have cloned this child with the express purpose of replacing a deceased child, the expectation that the new child will be just like the deceased one would be overwhelming and would impede the child’s ability to develop her own individuality (Levick, 2004). It should be stressed, however, that this response targets a particular use of cloning (one based on faulty assumptions), not the cloning procedure itself.

d. The Resultant Loss of Therapeutic Cloning for Stem Cell Research and Treating Diseases

The Argument.

Although SCNT is used to create embryos for therapeutic cloning, there is no intent to implant them in order to create children. Rather, the intent is to use the cells of the embryo in order to further research that may ultimately lead to treatments or cures for certain afflictions. Therefore, a categorical ban on SCNT affects not just the prospect of reproductive cloning, but also the research that could be done with cloned embryos. At the very least, the argument concludes, SCNT should be allowed for research and therapeutic purposes (Devolder and Savulescu, 2006; American Medical Association, 2003; Maas, 2001). This was the position presented by Senator Arlen Specter in his proposed Senate Bill 2439, called the “Human Cloning Prohibition Act of 2002:  A Bill to Prohibit Human [Reproductive] Cloning While Preserving Important Areas of Medical Research, Including Stem Cell Research.”

Response 1: Therapeutic cloning leads to reproductive cloning.

The first response maintains that, because therapeutic cloning and reproductive cloning both implement SCNT, allowing the procedure to be perfected for therapeutic cloning makes it more likely that it will later be used for reproductive purposes (Rifkin, 2002; Kass, 1998).

Response 2: Embryo experimentation is unethical.

The second response applies not just to therapeutic cloning, but to any type of embryo experimentation. From the time that an ovum is fertilized and syngamy (the fusion of two gametes to form a new and distinct genetic code) has successfully taken place, there exists a subject, the embryo, that is a bearer of dignity, moral status, and moral rights. It is unethical to experiment on an embryo for the same reason it is unethical to experiment on any human being; and since embryo experimentation often results in the destruction of the embryo, such experimentation amounts to murdering the embryo (Deckers, 2007; Oduncu, 2003; Novak, 2001). Typically, those who offer this second response (e.g., the Catholic Church) regard the human embryo as a complete moral subject from conception (Pope John Paul II, 1995; Pope Paul VI, 1968), and therefore consider any experiment that harms or destroys an embryo to be morally tantamount to an experiment that would destroy a person.

5. Arguments Against Reproductive Cloning and Responses

a. The Right to an Open Future

The Argument.

According to some ethicists who oppose human cloning, a cloned child’s identity and individuality will be compromised given that she will be “saddled with a genotype that has already lived” (Kass, 1998, 56; see also Annas, 1998 and Kitcher, 1997). Because of the expectation that the cloned child will re-live the life of her genetic predecessor, the child would necessarily be deprived of her right to an open future. Because all children deserve a life and a future that is completely open to them in terms of its prospects (Feinberg, 1980), and because being the product of cloning would necessarily deprive the resulting child of these prospects, cloning is seriously immoral. In a sense, this objection maintains that a cloned child would either lack the free will to live her life according to her own desires and goals or, at the very least, have her free will severely restricted by her parents or by a society that has certain expectations of her given her genetic lineage. The child would be destined to live in the shadows of her genetic predecessor (Holm, 1998).

Response 1: Faulting cloning for the misconceptions of others.

This argument is unsuccessful in illustrating that there is something intrinsically morally wrong with cloning. The subject of this objection is not cloning itself, but rather the erroneous attitude that parents may take toward their cloned child. The child’s very desire to be different from her predecessor would itself illustrate that she is not destined to be like her predecessor. Once prospective parents, and society in general, come to understand that cloned children will possess just as much individuality as any other person, it is possible that these fears, and the attempts to control the child’s future, will largely abate (Wachbroit, 1997). Additionally, if the reason people treat cloned children unfavorably is their misconceptions about cloning, then the proper response is not to ban cloning at the expense of compromising procreative liberty, but rather to work to rectify these prejudices and misconceptions (Burley and Harris, 1999).

Moreover, it is not just the parents of cloned children who may be guilty of violating a child’s right to an open future; many parents are, to varying degrees of severity, already guilty of violating such a right with their naturally created children, and such attempts to direct a child’s future often fail (see Agar, 2004, 106 for such an example). If such parents are not deprived of their opportunity to have children out of concern that they will violate their child’s right to an open future, then we seem hard pressed to find a reason to deprive couples who would turn to cloning for reproductive purposes of a similar opportunity.

Response 2: Assuming genetic determinism (again).

At its core, however, this objection assumes the very controversial thesis either that a person’s genes play an almost fatalistic role in her life decisions, or that individuals in society will assume some robust version of genetic determinism to be true and will treat cloned children according to that assumption. As noted above, there is much evidence to suggest that genetic determinism is not true. In their article “Genetic Modification and Genetic Determinism,” David Resnik and Daniel Vorhaus state that, when it comes to genetic modification, “even if a desired trait is successfully expressed it may not actually restrict options for the child… the open future critique paints with a far broader brush, alleging that the act of modification per se impacts the child’s right to an open future. And it is this claim that we reject…” (2006, 9). The same can be said about cloning (Pence, 1998 and 2008; Wachbroit, 1997). Even if a cloned child did display certain behavioral traits belonging to her genetic predecessor, it is unclear whether this similarity in traits would entail that her future is closed off. Moreover, there is much evidence that the general public, by and large, rejects genetic determinism (Hopkins, 1998).

There is evidence, however, that some would regard cloning as a method for resuscitating the dead (the grieving father in Murray’s article attests to this, as do the individuals who are willing to pay thousands of dollars to clone a deceased pet). This supports Kass’ claim that many people may expect a cloned child to be like her genetic predecessor. However, this misconception may quickly be rectified simply by observing the unique personality of the cloned child, especially since her experiences and her nurture, removed by at least a generation, will be substantially different from those of her genetic predecessor (Dawkins, 1998; Pence, 1998).

b. The Right to a Unique Genetic Identity

The Argument.

Because cloning entails recreating an existing person’s genetic code (with the exception of the difference in mitochondrial DNA), some argue that cloning would, necessarily, entail a violation of the cloned child’s right to a distinctive genetic identity (European Parliament, 1998). According to this objection, our DNA is what endows each human being with uniqueness and dignity (Callahan, 1993). Because cloning recreates a pre-existing DNA sequence, the cloned child would be denied that uniqueness and, therefore, her dignity would be compromised. This objection appears to be an incarnation of the objection from the Right to an Open Future. Certainly the concerns are similar: that a cloned child would be deprived of her own individual identity because of her genetic origins. However, whereas in the objection from the Right to an Open Future, the cloned child is deprived of individuality based on the perception of others (and, as is developed above, this does not seem to really be an objection to the practice of cloning simpliciter), this objection indicates that there is something inherently individuality-compromising, and therefore dignity-compromising, in recreating an existing genetic code. If this objection is successful, if recreating a pre-existing genetic code is intrinsically morally objectionable, then it would seem to present an objection to the actual cloning process.

Response 1: Genetic duplication and identical multiples.

Callahan argues that there is something intrinsically identity-depriving, and therefore dignity-depriving, in duplicating a genetic code. However, there is much evidence against this claim. As noted above, CC the cat neither looks nor acts like Rainbow, her genetic predecessor. The strongest evidence against the claim, however, is the existence of identical multiples, who are, in essence, clones of nature (Pence, 2004; Gould, 1997). No one claims that identical multiples’ right to a unique genetic identity is compromised simply in virtue of the manner of their creation, which calls into question whether such a right exists in the first place (Silver, 1998; Tooley, 1998; Rhodes, 1995). If Callahan’s concerns were accurate, identical multiples would fail to be individuals in their own right and would, consequently, be harmed because of this. However, there is no evidence that identical multiples feel this way, and there does not seem to be anything inherent about sharing a genetic code that compromises individuality (Elliot, 1998). The fact that identical multiples do not seem harmed or deprived of individuality merely by virtue of not possessing a unique genetic code is evidence that Callahan’s concern about cloning in this regard is misguided.

Response 2: Forgetting nurture.

Lastly, proponents of this objection ignore the very important role that nurture plays in shaping a person’s identity. A cloned child would be gestated in a different uterine environment. She would be born either into the same family as her genetic predecessor, though with a different family dynamic, or into a different family altogether. She would also likely be raised in a much different society (e.g., a child born in 2010 would face vastly different social influences than a child born in the 1960s or 1970s). She would have different friends, attend different schools, play different games, watch different television shows, and listen to different music. The generational and historical differences between a clone and her genetic predecessor would undoubtedly go a long way when it comes to shaping the personality of the former (Pence, 1998; Dawkins, 1998; Harris, 1997; Bor, 1997).

What forms or shapes each person’s individual identity is an intricate interaction of genetics and nurture (Ridley, 2003). While being genetically identical to a pre-existing person will most likely result in some similarities, the resemblance would certainly not be strong enough to deprive a cloned child of her individuality or dignity. A cloned child’s future would remain open, and there is no evidence that she would be denied something irreplaceably unique by not having a unique genetic code. Moreover, concerns that genetic duplication compromises dignity overemphasize the role that genetics plays as the source of human dignity. Human dignity, some philosophers have argued, has its source in our being persons and autonomous rational beings. Since, presumably, a clone would still be a person and an autonomous rational being, a clone would certainly retain her human dignity (Glannon, 2005; Elliot, 1998).

c. Cloning is Wrong because it is “Playing God” or because it is “Unnatural”

The Argument.

Another common concern is that cloning is morally wrong because it oversteps the boundaries of humans’ role in scientific research and development. These boundaries are set by either God (and therefore cloning is wrong because it is “playing God”) or nature (and therefore cloning is wrong because it is “unnatural”). Any method of procreation that does not implement traditional modes of conception, i.e., that does not involve the union of sperm and ova, is guilty of one (or both) of these infractions (Goodman, 2008; Tierney, 2007). Moreover, advocates of this objection caution against removing God from the process of creation altogether, which, it is argued, is what reproductive cloning achieves (Rifkin, 2000).

Response 1: Clarifying the meaning of “playing God.”

Advocates of the “playing God” objection bear the onus of defining exactly what “playing God” means. One possible definition is that anything that interferes with nature, or the natural progression of life, interferes with God’s plan for humanity and is therefore morally wrong. But this is too vague; humans constantly interfere with nature in ways that are not morally criticized. Almost all instances of medical advancement in the past 100 years (e.g., vaccines against diseases, respirators, incubators for preterm infants, pacemakers) interfere with nature in the sense that they prevent otherwise harmful or fatal afflictions from taking their toll on a human body. Would the same advocates of this objection against cloning object to artificial insulin injections to treat diabetes (Glannon, 2005)? More broadly, almost everything humans do, from wearing clothing, to using phones and computers, to relying on indoor plumbing, interferes in some sense with nature.

Perhaps a more charitable understanding is that “playing God” is morally wrong in the case of cloning because cloning artificially creates life outside the practice of sexual intercourse (Meilaender, 1997). Adhering to this definition of “playing God,” however, would condemn not only cloning but any form of assisted reproductive technology, e.g., IVF, artificial insemination, or intrauterine insemination. In addition, anything that thwarts the natural process of conception (i.e., birth control) may also be morally condemned. In the “Instruction on Respect for Human Life in Its Origin and on the Dignity of Procreation,” the Catholic Church denounces all forms of reproductive technology on the grounds that reproductive creation is strictly God’s domain (Congregation for the Doctrine of the Faith, 1987). However, most people who denounce human cloning on the grounds that it “plays God” do not denounce other forms of artificial reproduction on similar grounds.

Response 2: Knowing God’s will.

Yet another response is that this objection purports to know God’s will in regard to technological advancements such as cloning. However, since key religious texts (e.g., the Bible, the Torah, and the Qur’an) make no mention of such advancements, it is presumably impossible to determine what God would have to say about them. In other words, inferences about God’s will on such matters are tenuous because we have little basis from which to draw them (Pence, 2008).

Response 3: Biologism Fallacy.

One response to the “unnatural” objection is similar to the first response to the “playing God” objection: most everything humans do, from medicine to modern forms of sanitation, is “unnatural,” and most of it is not considered morally objectionable as a consequence. A second response is that such an objection commits what philosopher Daniel Maguire calls the “Biologism Fallacy”: “the fallacious effort to wring a moral mandate out of raw biological facts” (1983, 148). In other words, “unnatural” is not synonymous with “immoral” (and conversely, “natural” is not synonymous with “moral”). While it is true that cloning (along with other types of reproductive technologies) is not the “natural” way of conceiving a child, this alone does not render cloning immoral.

d. The Dangers of Cloning

The Argument.

Many philosophers and ethicists who would otherwise support reproductive cloning concede that concern for the safety of children born via cloning is reason to caution against its use (Harris, 2004; Glannon, 2005). The claim is that a cloned child would be in danger of suffering from severe genetic defects as a result of being a clone, or that cloning would result in a high number of severely defective embryos before one healthy human embryo is developed. Ian Wilmut, Dolly’s creator, has denounced human reproductive cloning as too dangerous to attempt (Travis, 2001). According to Wilmut, “Dolly was derived from 277 embryos, so the other 276 didn’t make it. The previous year’s work, which led to the birth and survival of Megan and Morag, used more than 200 embryos. We have success rates of roughly one in a hundred or less” (Klotzko, 1998, 134). Even if a clone were to appear healthy at birth, there are concerns about health problems arising later in life. For example, while there is no evidence that Dolly’s respiratory issues were due to her being a clone, questions remain as to whether her arthritis, which is uncommon among sheep her age, resulted from the nature of her genesis (Williams, 2003). Even attempting to perfect human reproductive cloning would entail a trial-and-error approach that would lead to the destruction of many embryos and might produce severely disabled children before a healthy one is born.

Response 1: The nonidentity problem.

One response typically given by philosophers considering the ethics of preconception decisions that may lead to the birth of a disabled child appeals to Derek Parfit’s nonidentity problem (Parfit, 1984, though Parfit himself does not apply this to cloning). Applied to preconception choices, the argument runs as follows. Suppose I desire to become pregnant, but am currently suffering from a physical ailment that would result in conceiving and birthing an infant with developmental impairments. If I were to wait two months, however, my ailment would pass and I would conceive a perfectly healthy baby. Most people would agree that I should wait those two months; and, indeed, if I do not wait, many people would say that I acted wrongly. The resulting child, moreover, would most likely be identified as the victim of my actions. This intuitive response, however, is surprisingly tricky to defend. If harm is defined as making someone worse off than she otherwise would have been, it is difficult to maintain that I harmed the resulting child by my actions, even if she were impaired. For the child that would have been born two months later would not have been the same child as the one born if I do not wait; the impaired child would never have existed had I waited those two months. Unless the child’s life is so bad that her nonexistence would be preferable, I did not make the child worse off by conceiving her and giving birth to her with those impairments, and thus I did not harm her. And because I did not harm her, I did not do anything morally wrong in this circumstance. The argument can best be standardized as follows:

1. I have harmed an individual only if I have made her worse off than she otherwise would have been had it not been for my actions.

2. Only if I have harmed someone can my action be deemed morally wrong.

3. A child born with mental, physical, or developmental impairments usually does not have a life that is so bad that it renders nonexistence preferable.

4. Therefore, a child born with mental, physical or developmental impairments is not made worse off by being brought into existence.

5. Therefore, deliberate conception, gestation, and birthing of a child with mental, physical, or developmental impairments does not, usually, harm the child (unless the impairments are so bad that they make the child’s life worse than not having existed at all).

6. Therefore, I have (usually) done nothing morally wrong by deliberately bringing into existence a child who suffers from mental, physical, or developmental impairments.

Using the nonidentity problem in the context of the reproductive cloning debate yields the following result: the alternative to being born a clone is not being born at all. Unless the cloned child’s life is made so horrible by her disabilities that it would have been better had she not been born at all, she was not harmed by being brought into existence via cloning, even if she is born with genetic defects as a result. As long as the cloned child has a life that, despite her genetic defects, is still worth living, it would be permissible to use cloning to bring her into being (Lane, 2006).

It is important to note, however, that the nonidentity problem is controversial, and not all philosophers and ethicists agree with its conclusion (Weinberg, 2008; Cohen, 1996). Indeed, many argue that it would be morally impermissible to bring into the world a child who suffers, even if the child’s life has a net value that renders it worth living (Steinbock and McClamrock, 1994).

Response 2: The dangers of natural reproduction.

Natural reproduction can itself produce dangerous results. Fertilized eggs are lost in natural reproduction more often than most women are aware; one study claims that as many as 73% of fertilized eggs do not survive to six weeks’ gestation (Boklage, 1990). Of those that do implant, approximately 2% to 3% of newborn infants suffer from congenital abnormalities of varying degrees of severity (Kumar et al., 2004). If safety concerns about cloning are severe enough to warrant banning its practice, this could be justified only if cloning were riskier (that is, if it resulted in the birth of more children with more severe abnormalities) than natural reproduction. Some couples choose to reproduce in full knowledge that one or both of them harbor genetic disorders that may be passed along to their offspring, some of which, such as Huntington’s disease, are rather severe. Yet these parents are not prohibited from procreating because of this. Therefore, if parents are not prohibited from procreating on the grounds that they may pass along a severe genetic defect to their children, then it is difficult to deny a set of parents who can only rely on cloning for procreation the chance to do so on safety grounds alone (unless the abnormalities that may result from cloning are more severe than those that may result from natural conception) (Brock, 1997). Similarly, objecting to cloning on the grounds that embryos are sacrificed in order to achieve a live birth is a valid objection only if the number of embryos lost is greater in cloning than in natural reproduction.

Finally, even if safety concerns are sufficient to warrant a current ban on human reproductive cloning, such concerns would be temporary, and would abate as cloning becomes safer. Indeed, safety concerns led the National Bioethics Advisory Commission (1997) to recommend a temporary, rather than permanent, moratorium on human reproductive cloning.

e. Cloning Entails the Creation of Designer Children, or it Turns Children into Commodities

The Argument.

If we engage in cloning, this objection goes, we run the risk of inserting our will too much into our procreative decisions; we would get to choose not just to have a child, but what kind of child to have. In doing so, we risk relegating children to the status of mere possessions or commodities, rather than regarding them as beings with their own intrinsic worth (Harakas, 1998; Kass, 1998; Meilaender, 1997). When a couple engages in sexual intercourse and produces a baby, the child is an “offspring of a man and woman, but a replication of neither; their offspring but not their product whose meaning and destiny they might determine” (Meilaender, 1997, 42). Because cloning involves the artificial process of recreating a pre-existing genetic code, prospective parents could, first, choose their child’s DNA (thereby creating a “designer child”) and, second, because they would be creating a “replica” of an existing person, come to consider the child more akin to property than to an individual in her own right. These factors would contribute to viewing and treating the child as a mere commodity. The more “artificial” conception becomes, the more the resulting children will be seen as the possessions of their parents, rather than as persons in their own right. Rev. Stanley Harakas puts the point as follows: “Cloning would deliberately deny by design the cloned human being a set of loving and caring parents. The cloned human being would not be the product of love, but of scientific procedures. Rather than being considered persons, the likelihood is that these cloned human beings would be considered ‘objects’ to be used” (1998, 89).

Although he rejects the contention that clones would not be considered persons, Thomas Shannon expresses concerns that the increasing artificiality of conception, not just via the use of cloning, but via the use of all forms of artificial reproductive technologies, will “transform our thinking about ourselves, and the transformation will be in a mechanistic direction” (Shannon and Walter, 2003, 134). That is, the move away from natural conception towards artificial conception will lead to humans collectively regarding themselves as more machine-like rather than as organic beings.

Response 1: Cloning is not genetic modification.

Cloning does not necessarily entail the creation of “designer” children because cloning recreates a pre-existing genetic code; it does not involve modifying or enhancing DNA in order to produce a child with certain desired traits. Cloning is not to be equated with genetic modification or enhancement (Wachbroit, 1997; Strong, 1998).

Response 2: Natural vs. artificial conception.

Advocates of the objection that cloning transforms procreation into manufacture seem to assume that, whereas we do not regard children who arise from natural reproduction as ours to do with as we wish, we would regard them that way if they arose from artificial conception. That is, the tacit premise is that there is some trait inherent in artificial (i.e., non-sexual) conception that necessitates parents regarding their children as mere objects, a trait not found in “natural” conception. Yet we can look to the children who are products of modern-day artificial reproduction to see that such a concern is not supported by the evidence. There are many children who are products of artificial reproductive technologies (IVF, intrauterine insemination, gender selection, and gamete intrafallopian transfer, among others), and there does not seem to be an increase in despotic control over these children on the part of their parents. One study found that children born from IVF and DI (donor insemination) fare as well as children born via natural conception. More importantly, given Meilaender’s concern that the quality of parenting is compromised in tandem with the artificiality of conception, the study found that “the quality of parenting in families with a child conceived by assisted conception is superior to that shown by families with a naturally conceived child, even when gamete donation is used in the child’s conception” (Golombok et al., 1995, 295; see also Golombok, 2003 and Golombok et al., 2001).

Meilaender may respond that, in these cases, the children are still the product of a union of sperm and ovum, whereas this is not the case with cloning. However, it is unclear why generating a child from somatic cells is more likely to foster despotism than generating a child from germ cells. Some have argued that, on the contrary, a cloned child might feel even closer to the parent from whom she was cloned, given that they would share all of their genetic information rather than just half (Pence, 2008). Moreover, the findings of the study supported the thesis that “genetic ties are less important for family functioning than a strong desire for parenthood” (Golombok et al., 1995, 296), which suggests that the parents of cloned children would not be as caught up with the genetic origins of their offspring, and so their parenting would not be as affected by those origins, as Meilaender contends. According to the study, the quality of parenting increased in tandem with the amount of effort it took to achieve parenthood. It could be argued, therefore, that the quality of parenting for cloned children would be just as good as, if not superior to, that of naturally conceived children.

Response 3: Clones would not be loveless creations.

Harakas claims that cloned children will be deprived of loving parents because their genesis will be one of science, rather than love. The studies conducted by Golombok certainly seem to provide evidence to the contrary. Intentionally taking steps to create a child via cloning (or any other kind of reproductive technology) could be seen, instead, as a mutual affirmation of love on behalf of the prospective parents and clear evidence that they really desired the resulting child. Whereas in sexual reproduction the child may be a product of chance, a cloned child would be a product of deliberate choice, which, according to some philosophers, could be a superior method of creation in some respects (Buchanan et al. 2000). Creating a child via cloning does not entail that there is a lack of mutual love between the parents, or that the resulting child would be any less loved (Strong, 1998). Genesis via sexual reproduction is neither a necessary nor a sufficient condition for being born to a set of loving parents and in a nurturing environment.

f. Cloning and the Ambiguity of Familial Roles

The Argument.

Genetically speaking, a cloned child would be her genetic predecessor’s identical twin sibling. If the child is cloned with the intent that she serve as the social child of her genetic predecessor, she would be, genetically, her social mother’s twin sister (or her social father’s twin brother), and her social grandparents’ genetic daughter. The concern is that such a radical alteration of familial relationships would be detrimental to the cloned child (Kass, 1998; O’Neill, 2002). As Paul Ramsey puts it: “To mix the parental and the twin relation might well be psychologically disastrous for the young” (Ramsey, 1970). Widespread cloning would exacerbate the problem by distorting generational boundaries, which would add a layer of confusion to society’s conception of the nature of the family and the roles of its individual members (Kass, 1998).

Response 1: No such confusion is likely to arise.

There are two ways to respond to this argument. First, doubts can be cast on whether this confusion would really ensue. Second, even if such confusion did result, it is questionable whether it would be any more detrimental to the child than any confusion that currently exists about parental roles given certain reproductive technologies. For example, it is physically possible for a child to have as many as six distinct “parents”: three genetic parents (the mitochondrial DNA donor, the somatic cell donor used to re-nucleate an enucleated ovum, and the sperm donor), one gestational parent, and two (perhaps even more) social parents. If a cloned child would experience no more confusion than a child in such a situation, then we would be hard pressed to show why the prospective parents of the former ought to be denied the opportunity to have a genetically related child on these grounds alone (Harris, 2004). Moreover, doubts can be cast on whether the ambiguity of genetic lineage caused by the cloning relationship would really result in the consequences that Kass and O’Neill fear. A social father, for example, is not likely to suddenly rescind his responsibilities toward his daughter because the child is, genetically, his wife’s twin sister (Wachbroit, 1997). Finally, as is evident from children raised by adoptive parents, social parents usually retain the honorific role of the child’s “real” parents, even though there are no genetic ties between them and the adopted child. In other words, what defines a parent seems to have less to do with genetics and more to do with who performs the social role of mother or father (Purdy, 2005).

Response 2: Such confusion would not warrant a prohibition on cloning.

Even if there were such confusion, however, would it be so detrimental as to warrant banning reproductive cloning altogether? Moreover, even if there were a detriment, it is unclear whether that would be a result of society’s prejudice and fear of human cloning, or a result that inherently comes with being a clone. Finally, it would have to be clear that being the genetic twin to a social parent is so detrimental that it would warrant interfering with the prospective parents’ reproductive liberty. Indeed, for any purported harm that may come from cloning (whether physical, psychological, or emotional), it must be argued why those harms are sufficient for banning reproductive cloning if comparable harm would not be sufficient for banning any other kind of reproductive method, whether natural or artificial (Harris, 2004; Robertson, 2006).

6. References and Further Reading

  • Agar, Nicholas (2004), Liberal Eugenics: A Defence of Human Enhancement. Malden: Blackwell.
  • Allison, Lizabeth (2007), Fundamental Molecular Biology, chapter 8: “Recombinant DNA Technology and Molecular Cloning.” Malden: Blackwell, pp. 180-231.
  • Annas, George et  al. (1996), “The Politics of Human-Embryo Research – Avoiding Ethical Gridlock.” New England Journal of Medicine, 334.20: 293-340.
  • Annas, George (1998), “The Prospect of Human Cloning: An Opportunity for National and International Cooperation” in Human Cloning: Biomedical Ethical Review. James Humber and Robert Almeder (eds). Totowa: Humana Press, pp. 53-63.
  • Boiani, Michele and Hans Schöler (2002), “Determinants of Pluripotency in Mammals” in Principles of Cloning. Jose Cibelli, Robert Lanza, Keith Campbell, Michael D. West (eds.) New  York: Academic Press, pp. 109-152.
  • Boklage, Charles (1990), “Survival Probability of Human Conceptions from Fertilization to Term.” International Journal of Fertility, 35.2:75-94.
  • Bor, Jonathan, “Cloning Adds a Dimension to Nature-Nurture Debate: Identical Humans are Not in the Cards.” The Baltimore Sun, March 9, 1997.
  • Boyle, Robert and Julian Savulescu (2001) “Ethics of Using Preimplantation Genetic Diagnosis to Select a Stem Cell Donor for an Existing Person.”  BMJ 323:1240-1243.
  • Brannigan, Michael (ed.) (2001), Ethical Issues in Human Cloning. New York, NY: Seven Bridges Press.
  • Brock, Dan (1997), “Cloning Human Beings: An Assessment of the Ethical Issues Pro and Con,” in Cloning Human Beings Volume II: Commissioned Papers. Rockville, MD: National Bioethics Advisory Commission.
  • Buchanan, Allen et al. (2002), From Chance to Choice: Genetics and Justice. Cambridge: Cambridge University Press.
  • Burley, Justin and John Harris (1999), “Human Cloning and Child Welfare.” Journal of Medical Ethics, 25.2: 108-113.
  • Callahan, Daniel (1993), “Perspective on Cloning: A Threat to Individual Uniqueness; an Attempt to Aid Childless Couples by Engineered Conceptions Could Transform the Idea of Human Identity” in Los Angeles Times, Nov. 12, 1993.
  • Cohen, Cynthia (1996), “‘Give Me Children or I Shall Die!’: New Reproductive Technologies and Harm to Children.” Hastings Center Report, 26.2: 19-27.
  • Courtwright, Andrew M. and Wechsler Doron (2007), “Is Restricting Access to Assisted Reproductive Technology an Infringement of Reproductive Rights?”  American Medical Association Journal of Ethics, 9.9:  635-640.
  • Dawkins, Richard (1998), “What’s Wrong with Cloning?” in Clones and Clones: Facts and Fantasies About Human Cloning. Martha Nussbaum and Cass Sunstein (eds). New York: Norton and Company Inc, pp. 54-66.
  • Deckers, Jan (2007), “Are Those Who Subscribe to the View that Early Embryos are Persons Irrational and Inconsistent?: A Reply to Brock.” Journal of Medical Ethics, 33.2: 102-106.
  • Devolder, Katrien and Julian Savulescu (2006), “The Moral Imperative to Conduct Cloning and Stem Cell Research.” Cambridge Quarterly of Healthcare Ethics, 15.1: 7-21.
  • Doetschman, R.G. et al. (1987), “Targeted Correction of a Mutant HPRT Gene in Mouse Embryonic Stem Cells.” Nature, 330: 576–578
  • Dworkin, Ronald (1994), Life’s Dominion: An Argument About Abortion, Euthanasia, and Individual Freedom. New York: Vintage Press.
  • Elliot, David (1998), “Uniqueness, Individuality, and Human Cloning,” Journal of Applied Philosophy 15.3: 217-230.
  • Ethics Committee of the American Fertility Society (1985), “The Constitutional Aspects of Procreative Liberty” in Ethical Issues in the New Reproductive Technologies. Richard T. Hull (ed.) Belmont: Wadsworth Publishing Company, pp. 8-15.
  • Feinberg, Joel (1980), “A Child’s Right to an Open Future” in Philosophy of Education: An Anthology. Randall Curren (ed.) Malden: Blackwell, pp. 112-123.
  • Freed, Curt R. et al. April 21, 1999. “Double-Blind Controlled Trial of Human Embryonic Dopamine Cell Transplants in Advanced Parkinson’s Disease: Study, Design, Surgical Strategy, Patient Demographics, and Pathological Outcome” (presented to the American Academy of Neurology).
  • French, Andrew J. et al. (2008), “Development of Human Cloned Blastocysts Following Somatic Cell Nuclear Transfer with Adult Fibroblasts.” Stem Cells 26.2: 485-493.
  • Gearhart, John D (1998), “New Potential for Human Embryonic Stem Cells,” Science 282: 1061.
  • Glannon, Walter (2005), Biomedical Ethics. New  York: Oxford University Press.
  • Golombok, Susan et al. (1995), “Families Created by the New Reproductive Technologies: Quality of Parenting and Social and Emotional Development of the Children.” Child Development, 66.2: 285-298.
  • Golombok, Susan (2001), “The ‘Test-Tube’ Generation: Parent-Child Relationships and the Psychological Well-Being of In Vitro Fertilization Children at Adolescence.” Child Development, 72.2: 599-608.
  • Golombok, Susan (2003), “ Reproductive Technology and Its Impact on Child Psychosocial and Emotional Development” in: Tremblay RE, Barr RG, Peters RDeV (eds). Encyclopedia on Early Childhood Development [online]. Montreal, Quebec: Centre of Excellence for Early Childhood Development; 2003:1-7.
  • Goodman, Jim. “Cloning Animals is Unnatural, Unethical.” The Capital Times, January 25, 2008.
  • Gould, Stephen Jay (1997), “Individuality: Cloning and Discomfiting Cases of Siamese Twins.” The Sciences 37: 14-16.
  • Harakas, Stanley (1998), “To Clone or Not to Clone?” in Ethical Issues in Human Cloning. Edited by Michael C. Brannigan. New York, NY: Seven Bridges Press, pp. 89-90.
  • Harris, John (1997), “Good-bye, Dolly?: The Ethics of Human Cloning.” Journal of Medical Ethics, 23.6: 353-360.
  • Harris, John (2004), On Cloning. New York: Routledge.
  • Häyry, Matti (2003), “Philosophical Arguments For and Against Human Cloning.” Bioethics, 17.5-6: 447-459.
  • Holm, Soren (1998), “A Life in Shadows: One Reason Why We Should Not Clone Humans.” Cambridge Quarterly of Healthcare Ethics, 7.2: 160-162.
  • Hopkins, Patrick (1998), “Bad Copies: How Popular Media Represent Cloning as an Ethical Problem” in Ethical Issues in Human Cloning. Edited by Michael C. Brannigan. New York, NY: Seven Bridges Press, pp. 128-140.
  • Illmensee, Karl et al. (2009), “Human Embryo Twinning with Applications in Reproductive Medicine.” Fertility and Sterility 93.2: 423-427.
  • Kant, Immanuel (1981), Grounding for the Metaphysics of Morals. Indianapolis: Hackett Publishing Company.
  • Kass, Leon (1998), “The Wisdom of Repugnance: Why We Should Ban the Cloning of Humans” in Ethical Issues in Human Cloning. Edited by Michael C. Brannigan. New York, NY: Seven Bridges Press, pp. 43-66.
  • Kitcher, Phillip (1997), “Whose Self is it Anyway?” Sciences, 37.5: 58-62.
  • Klotzko, Arlene (1998), “Voices from Roslin: The Creators of Dolly Discuss Science, Ethics, and Social Responsibility.” Cambridge Quarterly of Healthcare Ethics, 7.2: 121-140.
  • Kumar, Vinay et al. (2004), Robbins & Cotran Pathologic Basis of Disease, 7th Edition. New York: Saunders.
  • Lane, Robert (2006), “Safety, Identity, and Consent: A Limited Defense of Reproductive Human Cloning.” Bioethics, 20.3: 125-135.
  • Lanza, Robert et al. (2000), “Cloning Noah’s Ark.” Scientific American, 283.5:84-89.
  • Levick, Stephen (2004), Clone Being: Exploring the Psychological and Social Dimensions. Lanham: Rowman and Littlefield Publishers, Inc.
  • Li, Ziyi (2006), “Cloned Ferrets Produced by Somatic Cell Nuclear Transfer.” Developmental Biology, 293.2: 439-448.
  • Lovgren, Stefan, May 29, 2003. ”U.S. Team Produces First Mule Clone.” National Geographic News.
  • Lumelsky, Nadya et al. (2001), “Differentiation of Embryonic Stem Cells to Insulin- Secreting Structures Similar to Pancreatic Islets,” Science, 292: 1389–1394.
  • Maas, Heiko. August 24, 2001. “Taking the Stand for Therapeutic Cloning.” Science Career Magazine.
  • Maguire, Daniel (1983), “The Morality of Homosexual Marriages” in Same-Sex Marriage: The Moral and Legal Debate. Amherst: Prometheus Books, pp. 147-161.
  • McCormick, Richard. “Should We Clone Humans?” The Christian Century, November 17-24, 1993.
  • McCormick, Richard (1994), “Blastomere Separation: Some Concerns.” Hastings Center Report 24.2: 14- 16.
  • Meilaender, Gilbert (1997), “Begetting and Cloning.” First Things, 74: 41-43.
  • Murray, Thomas. April 8, 2001. “Even If It Worked, Cloning Won’t Bring Her Back.” The Washington Post.
  • National Bioethics Advisory Commission. “Cloning Human Beings: Reports and Recommendations.” June 9, 1997.
  • National Institute of Neurological Disorders and Stroke.  March 25, 1999, “What Would You Hope to Achieve from Stem Cell Research?” Response to Senator Arlen Specter’s Inquiry.
  • Novak, Michael. September 3, 2001. “The Stem Cell Slide: Be Alert to the Beginnings of Evil.”  National Review.
  • O’Neill, Onora (2002), Autonomy and Trust in Bioethics. Cambridge: Cambridge University Press.
  • Oduncu, Fuat S. (2003), “Stem Cell Research in Germany: Ethics of Healing vs. Human Dignity.” Medicine, Healthcare, and Philosophy 6.1: 5-16.
  • Oh, H.J. (2008), “Cloning Endangered Gray Wolves (Canis lupus) from Somatic Cells Collected Postmortem.” Theriogenology, 70.4: 638-647.
  • Outka, Gene (2002), “The Ethics of Human Stem Cell Research.” Kennedy Institute of Ethics Journal 12.2: 175-213.
  • Parfit, Derek (1984), Reasons and Persons. Oxford: Oxford University Press.
  • Pearson, Yvette E. (2007), “Storks, Cabbage Patches, and the Right to Procreate.” Journal of Bioethical Inquiry 4: 105-115.
  • Pence, Gregory (1998), Who’s Afraid of Human Cloning? Lanham: Rowman and Littlefield.
  • Pence, Gregory (2008), Classic Cases in Medical Ethics: Accounts of Cases That Have Shaped and Define Medical Ethics. New York: McGraw Hill.
  • Peters, Ted (2001), “The Stem Cell Controversy” in The Stem Cell Controversy. Michael Ruse and Christopher Pynes (eds.) Amherst: Prometheus Books, pp. 231-237.
  • Pope John Paul II (1995), Encyclical Evangelium Vitae. New York: Random House.
  • Pope Paul VI (1968), Encyclical Humanae Vitae, AAS 60, No.4, 1 – 11, Q. eciv.
  • President’s Council on Bioethics (2002), Human Cloning and Human Dignity. New York: Public Affairs Reports.
  • Purdy, Laura (2005), “Like Motherless Children: Fetal Eggs and Families.” Journal of Clinical Ethics 16.4: 329-334.
  • Quintavalle, Josephine. December 11, 2001. Quoted in: BBC News. “Doctor Plans UK ‘Designer Baby’ Clinic.” <http://news.bbc.co.uk/1/hi/health/1702854.stm>
  • Ramsey, Paul (1970), Fabricated Man: The Ethics of Genetic Control. New Haven: Yale University Press.
  • Resnik, David B and Daniel B Vorhaus (2006), “Genetic modification and genetic determinism.” Philosophy, Ethics, and Humanities in Medicine, 1:9.
  • Rhodes, Rosamond (1995), “Clones, Harms, and Rights.” Cambridge Quarterly of Healthcare Ethics, 4.3: 285-290.
  • Ridley, Matt (2003). Nature via Nurture: Genes, Experience, and What Makes Us Human. London: Fourth Estate.
  • Rifkin, Jeremy. February 3, 2000. “Cloning: What Hath Genomics Wrought?” Los Angeles Times.
  • Rifkin, Jeremy. July-August 2002. “Why I Oppose Human Cloning.” Tikkun, 17.4: 23- 32.
  • Robertson, John (1998), “Human Cloning and the Challenge of Regulation.” New England Journal of Medicine 339: 119-122.
  • Robertson, John (2006), “Reproductive Liberty and Cloning Humans.” Annals of the New York Academy of Sciences 913: 198-208.
  • Robertson, John (1994), “The Question of Human Cloning.” Hastings Center Report 24.2: 6-14.
  • Ruse, Michael and Aryne Sheppard (eds.) (2001). Cloning: Responsible Science or Technomadness? Amherst: Prometheus Books.
  • Shannon, Thomas and James Walter (2003), The New Genetic Medicine. Lanham: Rowman and Littlefield Publishers, Inc.
  • Shapiro, James et al. (2000), “Islet Transplantation in Seven Patients with Type I Diabetes Mellitus Using a Glucocorticoid-Free Immunosuppressive Regimen.” New England Journal of Medicine, 343.4: 230-238.
  • Silver, Lee (1998), “Cloning, Ethics, and Religion” Cambridge Quarterly of Healthcare Ethics, 7.2: 168-172.
  • Steinbock, Bonnie and Ron McClamrock, (1994) “When Is Birth Unfair to the Child?”, Hastings Center Report 24.6: 15-21.
  • Strauer, B.E. (2009), “Therapeutic Potentials of Stem Cells in Cardiac Diseases.” Minerva Cardioangiol 57. 2: 249–267.
  • Strong, Carson (1998), “Cloning and Infertility.” Cambridge Quarterly of Healthcare Ethics, 7.2: 279-293.
  • Tierney, John. “Are Scientists Playing God?: It Depends on Your Religion.” The New York Times, November 20, 2007.
  • Tooley, Michael (1998), “The Moral Status of Cloning Humans” in Human Cloning: Biomedical Ethical Review. James Humber and Robert Almeder (eds). Totowa: Humana Press, pp. 67-101.
  • Travis, John. “Dolly Was Lucky: Scientists Warn that Cloning is Too Dangerous for People.” Science News, October 20, 2001.
  • Voltarelli, Julio C. (2007), “Autologous Nonmyeloablative Hematopoietic Stem Cell Transplantation in Newly Diagnosed Type 1 Diabetes Mellitus.” Journal of the American Medical Association, 297.14: 1568-1576.
  • Wachbroit, Robert (1997), “Genetic Encores: The Ethics of Human Cloning.” Institute for Philosophy and Public Policy, 17.4: 1-7.
  • Wagner, John et al. (1999), “Induction of a Midbrain Dopaminergic Phenotype in Nurr1-Overexpressing Neural Stem Cells by Type 1 Astrocytes.” Nature Biotechnology 17: 653-659.
  • Wakayama, Teruhiko et al. (1999), “Mice Cloned from Embryonic Stem Cells.” Proceedings of the National Academy of Sciences of the United States of America 96.26: 14984–14989.
  • Weinberg, Rivka (2008), “Identifying and Dissolving the Non-Identity Problem.” Philosophical Studies, 137.1: 3-18.
  • Williams, Nigel (2003). “Death of Dolly Marks Cloning Milestone.” Current Biology, 13.6: R209-R210.
  • Wilmut, Ian et  al. (1997), “Viable Offspring Derived from Fetal and Adult Mammalian Cells.” Nature 385.6619: 810-813.
  • Wilmut, Ian et al. (2000), The Second Creation: Dolly and the Age of Biological Control. London: Headline Book Publishing
  • Zulewski, Henry et al. (2001), “Multipotential nestin-positive stem cells isolated from adult pancreatic islets differentiate ex vivo into pancreatic endocrine, exocrine, and hepatic phenotypes,” Diabetes 50: 521–533.

Author Information

Bertha Alvarez Manninen
Email: bertha.manninen@asu.edu
Arizona State University at the West Campus
U. S. A.

Autonomy

Autonomy is an individual’s capacity for self-determination or self-governance. Beyond that, it is a much-contested concept that comes up in a number of different arenas. For example, there is the folk concept of autonomy, which usually operates as an inchoate desire for freedom in some area of one’s life, and which may or may not be connected with the agent’s idea of the moral good. This folk concept of autonomy blurs the distinctions that philosophers draw among personal autonomy, moral autonomy, and political autonomy. Moral autonomy, usually traced back to Kant, is the capacity to deliberate and to give oneself the moral law, rather than merely heeding the injunctions of others. Personal autonomy is the capacity to decide for oneself and pursue a course of action in one’s life, often regardless of any particular moral content. Political autonomy is the property of having one’s decisions respected, honored, and heeded within a political context.

A further distinction can be made between autonomy as a bare capacity to make decisions and autonomy as an ideal. When autonomy functions as an ideal, agents who do not meet certain criteria in reaching a decision are deemed non-autonomous with respect to that decision. This can operate both locally, in terms of particular actions, and globally, in terms of agents as a whole. For instance, children, agents with cognitive disabilities of a certain kind, or members of oppressed groups have been deemed non-autonomous because individual or social constraints leave them unable to fulfill certain criteria of autonomous agency.

There is debate over whether autonomy needs to be representative of a kind of “authentic” or “true” self. This debate is often connected to whether the autonomy theorist believes that an “authentic” or “true” self exists. In fact, conceptions of autonomy are often connected to conceptions of the nature of the self and its constitution. Theorists who hold a socially constituted view of the self will have a different idea of autonomy (sometimes even denying its existence altogether) than theorists who think that there can be some sort of core “true” self, or that selves as agents can be considered in abstraction from relational and social commitments and contexts.

Finally, autonomy has been criticized as a bad ideal, one that promotes a pernicious model of human individuality and overlooks the importance of social relationships and dependency. Responses to these criticisms have taken various forms, but for the most part philosophers of autonomy have sought to show that the social aspects of human action are compatible with their conceptions of self-determination, arguing that there need be no antagonism between our social and relational ties and our ability to decide our own course of action.

This article will focus primarily on autonomy at the level of the individual and the work being done on personal autonomy, but will also address the connection of autonomy to issues in bioethics and political theory.

Table of Contents

  1. The History of Autonomy
    1. Before Kant
    2. Kant
    3. The Development of Individualism in Autonomy
    4. Autonomy and Psychological Development
  2. Personal Autonomy
    1. Content-Neutral or Procedural Accounts
      1. Hierarchical Procedural Accounts
      2. Criticisms of Hierarchical Accounts
      3. Coherentist Accounts
    2. Substantive Accounts
  3. Feminist Philosophy of Autonomy
    1. Feminist Criticisms of Autonomy
    2. Relational Autonomy
  4. Autonomy in Social and Political Context
    1. Autonomy and Political Theory
    2. Autonomy and Bioethics
  5. References and Further Reading

1. The History of Autonomy

a. Before Kant

The roots of autonomy as self-determination can be found in ancient Greek philosophy, in the idea of self-mastery. For both Plato and Aristotle, the most essentially human part of the soul is the rational part, illustrated by Plato’s representation of this part as a human, rather than a lion or many-headed beast, in his description of the tripartite soul in the Republic. A just soul, for Plato, is one in which this rational human part governs over the two others. Aristotle identifies the rational part of the soul as most truly a person’s own in the Nicomachean Ethics (1166a17-19).

Plato and Aristotle also both associate the ideal for humanity with self-sufficiency and a lack of dependency on others. For Aristotle, self-sufficiency, or autarkeia, is an essential ingredient of happiness, and involves a lack of dependence upon external conditions for happiness. The best human will be one who is ruled by reason, and is not dependent upon others for his or her happiness.

This ideal continues through Stoic philosophy and can be seen in the early modern philosophy of Spinoza. The concept of autonomy itself continued to develop in the modern period with the decrease of religious authority and the increase of political liberty and emphasis on individual reason. Rousseau’s idea of moral liberty, as mastery over oneself, is connected with civil liberty and the ability to participate in legislation.

b. Kant

Kant further developed the idea of moral autonomy as having authority over one’s actions. Rather than letting the principles by which we make decisions be determined by our political leaders, pastors, or society, Kant called upon the will to determine its guiding principles for itself, thus connecting the idea of self-government to morality; instead of being obedient to an externally imposed law or religious precept, one should be obedient to one’s own self-imposed law. The former he called heteronomy; the latter autonomy. In his “What is Enlightenment?” essay, he described enlightenment as “the human being’s emergence from his self-incurred minority” and called on his readers to have the courage to use their own understanding “without direction from another” (Kant 1996, 17). This description is close to what we might acknowledge today as personal autonomy, but Kant’s account is firmly located within his moral philosophy.

In acting we are guided by maxims, which are the subjective principles by which we might personally choose to abide. If these maxims can be deemed universal, such that they would be assented to and willed by any rational being, and thus not rooted in any individual’s particular contingent experience, then they may gain the status of objective laws of morality. Each moral agent, then, is to be seen as a lawgiver in a community where others are also lawgivers in their own right, and hence are to be respected as ends in themselves; Kant calls this community the kingdom of ends.

While the will is supposed to be autonomous, for Kant, it is also not supposed to be arbitrary or particularistic in its determinations. He sees our inclinations and emotional responses as external to the process of the will’s self-legislation; consequently, letting them determine our actions is heteronomous rather than autonomous. Feelings, emotions, habits, and other non-intellectual factors are excluded from autonomous decision-making. Any circumstances that particularize us are also excluded from autonomous decision-making.

The reason for Kant’s exclusion of feelings, inclinations, and other particular aspects of our lives from the structure of autonomy is rooted in his metaphysical account of the human being, which radically separates the phenomenal human self from the noumenal human self. All empirical aspects of our selfhood — all aspects of our experience — are part of the phenomenal self, and subject to the deterministic laws of natural causality. Our freedom, on the other hand, cannot be perceived or understood; rather we must posit the freedom of the will as the basis for our ability to act morally.

Contemporary Kantians within moral theory do not adhere to Kant’s metaphysics, but seek to understand how something like Kant’s conception of autonomy can still stand today. Thomas Hill suggests, for example, that the separation of our free will from our empirical selfhood be taken not as a metaphysical claim but as a normative claim about what ought to count as reasons for acting (Hill 1989, 96-97).

There are significant differences between Kant’s conception of moral autonomy and the conceptions of personal autonomy developed within the last thirty years, which attempt to articulate how social and cultural influences can be compatible with autonomous decision-making. Further, the majority of contemporary theories of personal autonomy are content-neutral accounts of autonomy which are unconcerned with whether or not a person is acting according to moral laws; they focus more on determining whether or not a person is acting for his or her own reasons than on putting any restrictions on autonomous action.

c. The Development of Individualism in Autonomy

Between Kant’s description of moral autonomy and the recent scholarship on personal autonomy, however, there was a process of individualizing the idea of autonomy. The Romantics, reacting against the emphasis on the universality of reason put forth by the Enlightenment, of which Kant’s philosophy was a part, prized particularity and individuality. They highlighted the role of the passions and emotions over reason, and the importance of developing one’s own unique self. John Stuart Mill also praised and defended the development and cultivation of individuality as worthwhile in itself, writing that “A person whose desires and impulses are his own – are the expression of his own nature, as it has been developed and modified by his own culture – is said to have a character. One whose desires and impulses are not his own has no character, no more than a steam engine has a character” (Mill 1956, 73).

The Romantic conception of individuality was then echoed within the conception of authenticity that runs through phenomenological and existential philosophy. Heidegger posits an inner call of conscience summoning us away from ‘das Man’: in order to be authentic, we need to heed this inner call and break away from inauthentically following the crowd. This conception of authenticity became intertwined with the idea of autonomy: both involve a call to think for oneself and contain a streak of individualism (see Hinchman 1996).

Unlike the universalism espoused by Kantian autonomy, however, authenticity, like the Romantic view, involves a call to be one’s own person, not merely to think for oneself. For Kant, thinking for oneself would, if undertaken properly, lead to universalizing one’s maxims; for both the Romantics and the Existentialists, as well as for Mill, there is no such expectation. This division is still present in the contrast between conceiving of autonomy as a key feature of moral motivation, and autonomy as self-expression and development of individual practical identity.

The emphasis on autonomy within this strain of philosophy was criticized by Emmanuel Lévinas, who sees autonomy as part of our selfish and close-minded desire to strive toward our own fulfillment and self-gratification rather than being open to the disruptive call of the other’s needs (Lévinas 1969). He argues for the value of heteronomy over autonomy. For Lévinas, in heteronomy, the transcendent face of the other calls the ego into question, and the self realizes its unchosen responsibility and obligation to the other. The self is hence not self-legislating, but is determined by the call of the other. This criticism of the basic structure of autonomy has been taken up within continental ethics, which attempts to determine how or whether a practical, normative ethics could be developed within this framework (see for example Critchley 2007).

d. Autonomy and Psychological Development

The connection between autonomy and the ideal of developing one’s own individual self was adopted within the humanistic psychologies of Abraham Maslow and Carl Rogers, who saw the goal of human development as “self-actualization” and “becoming a person,” respectively. For Maslow and Rogers, the most developed person is the most autonomous, and autonomy is explicitly associated with not being dependent on others.

More recently Lawrence Kohlberg developed an account of moral psychological development, in which more developed agents display a greater amount of moral autonomy and independence in their judgments. The highest level bears a great resemblance to the Kantian moral ideal, in its reference to adopting universal values and standards as one’s own.

Kohlberg’s work was criticized by Carol Gilligan, who argued that this pattern reflected male development, but not female. Instead of taking “steps toward autonomy and independence,” in which “separation itself becomes the model and the measure of growth,” “for women, identity has as much to do with intimacy as with separation” (Gilligan 1982, 98). The trajectory of female development is thus less a matter of individualization and independence than of ultimately balancing and harmonizing an agent’s interests with those of others.

Gilligan does not entirely repudiate autonomy itself as a value, but she also does not suggest how it can be distinguished from the ideals of independence and separation from others. Her critiques have been widely influential and have played a major role in provoking work on feminist ethics and, despite her criticism of the ideal of autonomy, conceptions of “relational autonomy.”

The contemporary literature on personal autonomy within philosophy tends to avoid these psychological ideas of individual development and self-actualization. For the most part, it adopts a content-neutral approach that rejects any particular developmental criteria for autonomous action, and is more concerned with articulating the structure by which particular actions can be deemed autonomous (or, conversely, the structure by which an agent can be deemed autonomous with respect to particular actions).

2. Personal Autonomy

The contemporary discussion of personal autonomy is primarily distinguished from Kantian moral autonomy by its commitment to metaphysical neutrality. Related to this is its adherence to at least a procedural individualism: on contemporary accounts of personal autonomy, an action is judged to be autonomous not because of its rootedness in universal principles, but on the basis of features of the action and of the decision-making process that are internal and particular to the individual agent.

The main distinction within personal autonomy is that between content-neutral accounts, which do not specify any particular values or principles that must be endorsed by the autonomous agent, and substantive accounts, which specify some particular value or values that must be included within autonomous decision-making.

a. Content-Neutral or Procedural Accounts

Content-neutral accounts, also called procedural accounts, deem a particular action autonomous if it has been endorsed through a process of critical reflection. They represent the majority of accounts of personal autonomy. Procedural accounts specify criteria by which an agent’s actions can be said to be autonomous, criteria that do not depend on any particular conception of which kinds of actions or agents count as autonomous. They are neutral with respect to what an agent might conceive of as good or might be trying to achieve.

i. Hierarchical Procedural Accounts

The beginning of the contemporary discussion of personal autonomy is in the 1970s works of Harry Frankfurt and Gerald Dworkin. Their concern was to give an account of what kind of individual freedom ought to be protected, and how that moral freedom may be described in the context of contemporary conceptions of free will. Their insight was that our decisions are worth protecting if they are somehow rooted in our values and overall commitments and objectives, and that they are not worth protecting if they run counter to those values, commitments, and objectives. The concept of personal autonomy, thus, can be used as a way of protecting certain decisions from paternalistic interference. We may not necessarily want to honor the decision of a weak-willed person who decides to do something against their better judgment and against their conscious desire to do otherwise, whereas we do want to protect a person’s decision to pursue an action that accords with their self-consciously held values, even if it is not what we ourselves would have done. Frankfurt and Dworkin phrase this insight in terms of a hierarchy of desires.

Frankfurt’s and Dworkin’s hierarchical accounts of autonomy form the basis upon which the mainstream discussion builds and against which it reacts. Roughly speaking, according to this hierarchical model, an agent is autonomous with respect to an action on the condition that his or her first-order desire to perform the act is sanctioned by a second-order volition endorsing that first-order desire (see Frankfurt 1988, 12-25). This account is neutral with respect to the origins of the higher-order desires, and thus does not exclude values and desires that are socially or relationally constituted. What matters is not the cause of such desires but solely the agent’s identification with them (Frankfurt 1988, 53-54). Autonomy includes our ability to consider and ask whether we do, in fact, identify with our desires or whether we might wish to override them (Dworkin 1988). The “we,” in this case, is constituted by our higher-order preferences; Dworkin speaks of them as the agent’s “true self” (Dworkin 1989, 59).

ii. Criticisms of Hierarchical Accounts

There are several objections to the hierarchical model, most of which revolve around the problem of locating the source of an agent’s autonomy and question the idea that autonomy can somehow be located in the process of reflective endorsement itself.

First, the Problem of Manipulation criticism points out that because Frankfurt’s account is ahistorical, it does not protect against the possibility that someone, such as a hypnotist, may have interfered with the agent’s second-order desires. We would hesitate to call such a hypnotized or mind-controlled agent autonomous with respect to his or her actions under these circumstances, but since the hierarchical model does not specify where or how the second order volitions ought to be generated, it cannot adequately distinguish between an autonomous agent and a mind-controlled one. The structure of autonomous agency therefore seems to have a historical dimension to it, since the history of how we developed or generated our volitions seems to matter (see Mele 2001, 144-173).

John Christman develops a historical model of autonomy in order to rectify this problem, such that the means and historical process by which an agent reaches certain decisions is used in determining his or her status as autonomous or not (Christman 1991). This way, an agent brainwashed into having desire X would be deemed nonautonomous with respect to X.  The theory runs into difficulty in a case where an agent might freely choose to give up his or her autonomy, or conversely where an agent might endorse a desire but not endorse the means by which he or she was forced into developing the desire (see Taylor 2005, 10-12), but at least it draws attention to some of the temporal features of autonomous agency.

Another criticism of the hierarchical model is the Regress or Incompleteness Problem. According to Frankfurt and Dworkin, an agent is autonomous with respect to his or her first-order desires as long as they are endorsed by second-order desires. However, this raises the question of the source of the second-order volitions; if they themselves rely on third-order volitions, and so on, then there is the danger of an infinite regress in determining the source of the autonomous endorsement (see Watson 1975). If, on the other hand, the second-order desires are autonomous for some reason other than a higher-order volition, then the hierarchical model is incomplete in its explanation of autonomy. Frankfurt, while acknowledging that there is “no theoretical limit” to the series of higher-order desires, holds that the series can end with an agent’s “decisive commitment” to one of the first-order desires (Frankfurt 1988, 21). However, the choice to terminate the series is itself arbitrary if there is no reason behind it (Watson 1975).

Frankfurt responds to this criticism in “Identification and Wholeheartedness” by defining a decisive commitment as one which the agent makes without reservation, and where the agent feels no reason to continue deliberating (Frankfurt 1988, 168-9). To stop at this point is, Frankfurt argues, hardly arbitrary. It is possible that the agent is mistaken in his or her judgment, but that is always a possibility in deliberation, and thus not an obstacle to Frankfurt’s theory in particular. In making a decision, an agent “also seeks thereby to overcome or to supersede a condition of inner division and to make himself into an integrated whole” (Frankfurt 1988, 174). Thus, by making this decision, the agent has endorsed an intention that establishes “a constraint by which other preferences and decisions are to be guided” (Frankfurt 1988, 175), and thus is self-determining and autonomous.

The criterion of wholeheartedness and unified agency has been criticized by Diana Meyers, who argues for a decentered, fivefold notion of the subject, which includes the unitary, decision-making self, but also acknowledges the functions of the self as divided, as relational, as social, as embodied, and as unconscious (Meyers 2005). The ideal of wholeheartedness has also been criticized on the grounds that it does not reflect the agency of agents from oppressed groups or from mixed traditions. Edwina Barvosa-Carter sees ambivalence as an inescapable feature of much decision-making, especially for mixed-race individuals who have inherited conflicting values, commitments, and traditions (Barvosa-Carter 2007). Marina Oshana makes a similar point, with reference to living within a racist society (Oshana 2005).

In any case, it is a puzzle how decisive commitments or higher-order desires acquire their authority without themselves being endorsed, since deriving authority from external manipulation would seem to undermine this authority. This is the Ab Initio Problem: If the source of an agent’s autonomy is ultimately something that can’t itself be reflectively endorsed, then the agent’s autonomy seems to originate with something with respect to which he or she is non-autonomous, something that falls outside the hierarchical model.

An objection related to the Regress Problem is that the hierarchical account seems to give unjustified ontological priority to higher-order versions of the self (see Thalberg 1978). Marilyn Friedman has argued that it begs the question to assume some sort of uncaused “true self” at the top of the hierarchical pyramid. In order to give a procedural account that avoids these objections, Friedman has proposed an integration model in which desires of different orders are integrated with one another rather than arranged in a pyramid (Friedman 1986).

iii. Coherentist Accounts

Part of the appeal of understanding autonomy lies not simply in explaining how we make decisions, but in the fact that the idea of autonomy suggests something about how we identify ourselves and what we identify with. For Frankfurt, we identify with a lower-order desire if we have a second-order volition endorsing it. However, our second-order volitions do not necessarily represent us; we may, as Frankfurt acknowledges, have no reason for them.

This concern drives some of the other approaches to personal autonomy, such as Laura Ekstrom’s coherentist account (Ekstrom 1993). Since autonomy is self-governance, it stands to reason that in order to understand autonomous agency, we must clarify our notion of the self and hence what counts as the self’s own reasons for acting; she argues that this will help avoid the Regress Problem and the Ab Initio Problem.

Ekstrom’s account of the self is based on the endorsement of preferences. An agent has a preference when he or she holds a certain first-order desire to be good; a preference is thus similar to one of Frankfurt’s second-order volitions. Preferences presuppose higher-level states, since they result from an agent’s higher-order reflection on his or her desires with regard to their goodness. A self, then, is a particular character with certain beliefs and preferences that have been endorsed through a process of self-reflection, together with the ability to reshape those beliefs and preferences in light of self-evaluation. The true self includes those beliefs and preferences which cohere together; that coherence itself gives them their authorization. A preference is thus endorsed if it coheres with the agent’s character.

Michael Bratman develops a similar account, arguing that our personal identity is partly constituted by the organizing and coordinating function of our long-range plans and intentions (Bratman 2007, 5). Our decisions are autonomous or self-governing with respect to these plans.

This is, of course, only a very brief survey of the literature on proceduralist accounts of autonomy, and it omits the various defenses of the hierarchical model and the objections to Friedman’s, Christman’s, and others’ formulations. But it should be enough to make clear the way in which theorists offering these accounts strive to ensure that no particular view of what constitutes a flourishing human life is imported into their accounts of autonomy. Autonomy is just one valued human property amongst others, and need not do all the work of describing human flourishing (Friedman 2003).

b. Substantive Accounts

Some doubt, however, that proceduralist accounts are adequate to capture autonomous motivation and action, or to rule out actions or agents that we would hesitate to call autonomous. Substantive accounts of autonomy, of which there are both weak and strong varieties, set further requirements that an action must meet in order to count as autonomous. Whether weak or strong, all substantive accounts place some particular constraints on what can be considered autonomous; an example might be an account of autonomy specifying that we cannot autonomously choose to be enslaved. Susan Wolf offers a strong substantive account, in which agents must have “normative competency,” that is, the capacity to identify right and wrong (Wolf 1990). We do not need to be metaphysically responsible for ourselves or absolutely self-originating, but as agents we are morally responsible and capable of revising ourselves according to our moral reasoning (Wolf 1987). Similarly, Paul Benson’s early work on autonomy also advocated a strong substantive account, stressing normative competence and the threat that oppressive or inappropriate socialization poses to that competence and thus to our autonomy (Benson 1991).

Contemporary Kantians such as Thomas Hill and Christine Korsgaard also advocate substantive accounts of autonomy. Korsgaard argues that we have practical identities which guide us and serve as the source of our normative commitments (Korsgaard 1996). We have multiple such identities, not all of which are moral, but our most general practical identity is as a member of the “kingdom of ends,” our identity as moral agents. This identity generates universal duties and obligations. Just as Kant called autonomy our capacity for self-legislation, so too Korsgaard calls autonomy our capacity to give ourselves obligations to act based on our practical identities. Since one of these is a universal moral identity, autonomy itself thus has substantive content.

Autonomy, for Hill, means that principles will not simply be accepted because of tradition or authority, but can be challenged through reason. He acknowledges that in our society we do not experience the kind of consensus about values and principles that Kant supposed ideally rational legislators might possess, but argues that it is still possible to bear in mind the perspective of a possible kingdom of ends. Human dignity, the idea of humanity as an end in itself, can represent a shared end regardless of background or tradition (Hill 2000, 43-45).

Substantive accounts have been criticized for conflating personal and moral autonomy and for setting too high a bar for autonomous action. If too much is expected of autonomous agents’ self-awareness and moral reflection, then can any of us be truly said to be autonomous (see for example Christman 2004 and Narayan 2002)? Does arguing that agents living under conditions of oppressive socialization have reduced autonomy help set a standard for promotion of justice, or does it overemphasize their diminished capacity without encouraging and promoting the capacities that they do have? This interplay between our socialization and our capacity for autonomy is highlighted in the relational autonomy literature, covered below.

In order to find some middle ground between substantive and procedural accounts, Paul Benson has also suggested a weak substantive account, which does not specify any content but requires that the agent regard himself or herself as worthy to act; in other words, the agent must have self-trust and self-respect (Benson 1991). This condition serves to limit what behavior can be deemed autonomous and to bring the account in line with our intuition that a mind-controlled or utterly submissive agent is not acting autonomously, while not ruling out the agent’s ability to decide what values he or she wants to live by.

3. Feminist Philosophy of Autonomy

a. Feminist Criticisms of Autonomy

Feminist philosophers have been critical of concepts and values traditionally seen to be gender neutral, finding that when examined they reveal themselves to be masculine (see Jaggar 1983, Benjamin 1988, Grimshaw 1986, Harding and Hintikka 2003, and Lloyd 1986). Autonomy has long been coded masculine and associated with masculine ideals, despite being something which women have called for in their own right. Jessica Benjamin argues that while we are formally committed to equality, “gender polarity underlies such familiar dualisms as autonomy and dependency” (Benjamin 1988, 7). There has been some debate over whether autonomy is actually a useful value for women, or whether it has been tarnished by association.

Gilligan’s criticisms of autonomy have already been covered, but Benjamin writes along similar lines that:

The ideal of the autonomous individual could only be created by abstracting from the relationship of dependency between men and women. The relationships which people require to nurture them are considered private, and not truly relationships with outside others. Thus the other is reduced to an appendage of the subject – the mere condition of his being – not a being in her own right. The individual who cannot recognize the other or his own dependency without suffering a threat to his identity requires the formal, impersonal principle of rationalized interaction, and is required by them. (Benjamin 1988, 197)

Benjamin ultimately argues that the entire structure of recognition between men and women must be altered in order to permit an end to domination. Neither Gilligan nor Benjamin addresses the possibility of reformulating the notion of autonomy itself, but each sees it as essentially linked with individualism and separation. Sarah Hoagland is more emphatic: she openly rejects autonomy as a value, referring to it as “a thoroughly noxious concept” as it “encourages us to believe that connecting and engaging with others limits us” (Hoagland 1988, 144).

These criticisms have been countered, however, by feminists looking to retain the value of autonomy, who argue that the critics conflate the ideal of “autonomy” with that of “substantive independence.” Autonomy, while it has often been associated with individualism and independence, does not necessarily entail these. Most feminist criticism of autonomy is based on the idea that autonomy implies a particular model or expectation of the self. Marilyn Friedman and John Christman, however, point out that the proceduralist notion of autonomy which is the focus of contemporary philosophical attention does not have such an implication, but is metaphysically neutral and value neutral (Friedman 2000, 37-46; Christman 1995).

b. Relational Autonomy

A feminist attempt to rehabilitate autonomy as a value, and to further underscore the contingency of its relationship to atomistic individualism or independence, emerges in the growing research on “relational autonomy” (Nedelsky 1989, Mackenzie and Stoljar 2000). It addresses the challenge of balancing agency with social embeddedness, without promoting an excessively individualistic liberal atomism, or denying women the agency required to criticize or change their situation. The feminist work on relational autonomy attempts to capture the best of the available positions.

It is worth noting first, for clarity, that there are two levels of relationality at work within relational autonomy: social and relational sources of values, goals, and commitments, and social and relational commitments themselves. While all acknowledge that relationality at both levels is not incompatible with autonomy, not all accounts of relational autonomy require that we pursue social and relational commitments. For instance, on Marilyn Friedman’s account, a person could autonomously choose to be a hermit, despite having been brought up in a family and in a society and having been shaped by that upbringing (Friedman 2003, 94). However other relational autonomy theorists are skeptical about neatly separating the two, because they note that even our unchosen relationships still affect our self-identity and opportunities. They argue that while we need not pursue relationships, we cannot opt out entirely. Anne Donchin demonstrates this with regard to testing for genetically inherited disease (Donchin 2000).

In general, on relational autonomy accounts, autonomy is seen as an ideal by which we can measure how well an agent is able to negotiate his or her pursuit of goals and commitments, some of which may be self-chosen, and some the result of social and relational influences. Social and relational ties are examined in terms of their effect on an agent’s competency in this negotiation: some give strength, others create obstacles, and others are ambiguous. The primary focus of most relational autonomy accounts, however, tends to be less on procedure and more on changing the model of the autonomous self from an individualistic one to one embedded in a social context.

4. Autonomy in Social and Political Context

The value of autonomy can be seen in its social and political context. The idea that our decisions, if made autonomously, are to be respected and cannot be shrugged off is a valuable one. It concerns the legitimacy of our personal decisions in a social, political, and legislative context.

a. Autonomy and Political Theory

The importance and nature of the value of autonomy is debated within political theory, but is generally intertwined with the right to pursue one’s interests without undue restriction. Discussions about the value of autonomy concern the extent of this right, and how it can be seen as compatible with social needs.

Kant described the protection of autonomy at the political level as encapsulated in the principle of right: that each person has the right to any action that can coexist with the freedom of every other person in accordance with universal law (Kant 1996, 387). Mill’s On Liberty similarly defends the rights of individuals to pursue their own personal goals, and emphasizes the need for being one’s own person (Mill 1956). On his view, this right prohibits paternalism, that is, restrictions on or interference with a person of mature age for his or her own benefit. As Mill writes, “The only part of the conduct of anyone for which he is amenable to society is that which concerns others. In the part which merely concerns himself, his independence is, of right, absolute. Over himself, over his own body and mind, the individual is sovereign” (Mill 1956, 13).

Non-interference is generally seen as key to political autonomy; Gerald Gaus specifies that “the fundamental liberal principle” is “that all interferences with action stand in need of justification” (Gaus 2005, 272). If any paternalistic interference is to be permitted, it is generally restricted to cases where the agent is not deemed to be autonomous with respect to a decision (see for example Dworkin 1972); autonomy serves as a bar to be reached in order for an agent’s decisions to be protected (Christman 2004). The question is then how high the bar ought to be set, and thus what individual actions count as autonomous for the purposes of establishing social policy. Because of this, there is a strong connection between personal and political autonomy.

Further, there is also a connection between political liberalism and content-neutral accounts of autonomy which do not require any predetermined values for the agent to be recognized as autonomous. As Christman and Anderson point out, content-neutral accounts of autonomy accord with liberalism’s model of accommodating pluralism in ways of life, values, and traditions (Christman and Anderson 2005).

The framework of seeing the value of political autonomy in terms of protecting individual choices and decisions, however, has been criticized by those who argue that it rests on an inadequate model of the self.

Communitarians such as Michael Sandel criticize the model of the autonomous self implicit in liberal political theory, arguing that it does not provide an adequate notion of the human person as embedded within and shaped by societal values and commitments. Procedural accounts of autonomous decision-making do not adequately recognize the way our relational commitments shape us. We do not choose our values and commitments from the position of already being autonomous individuals; in other words, the autonomous self does not exist prior to the values and commitments that constitute the basis for its decisions. To deliberate in the abstract from these values and commitments is to leave out the self’s very identity, and that which gives meaning to the deliberation (Sandel 1998).

Feminist scholars have agreed with some of the communitarian criticism, but also caution that the values and commitments that communitarians appeal to may not be ones that are in line with feminist goals, in particular those values that concern the role and makeup of the family (Okin 1989 and Weiss 1995).

Another criticism of the dominant model of autonomy within political theory is made by Martha Fineman, who argues for the need to rethink the conceptions of autonomy that undergird legal and governmental policies in order to better recognize our interdependence and the dependence of all of us upon society (Fineman 2004, 28-30). Drawing not on the philosophical literature on personal or relational autonomy but on sociological theories and accounts of legal and government policy, she traces the historical and cultural associations of autonomy with individuality and masculinity, and argues that we need to recognize that real human flourishing includes dependency.

Recognizing the different levels of autonomy at play within the political sphere as a whole can help to clarify what is at stake, and to avoid one-sided accounts of autonomy or the autonomous self. Rainer Forst outlines five different conceptions of autonomy that can combine into a multidimensional account (Forst 2005). The first is moral autonomy, in which an agent can be considered autonomous as long as he or she “acts on the basis of reasons that take every other power equally into account” and which are “justifiable on the basis of reciprocally and generally binding norms” (Forst 2005, 230). Even though this is an interpersonal norm, it is relevant to the political, argues Forst, because it promotes the mutual respect needed for political liberty. Ethical autonomy concerns a person’s desires in the quest for the good life, in the context of the person’s values, commitments, relationships, and communities. Legal autonomy is thus the right not to be forced into a particular set of values and commitments, and is neutral toward them. Political autonomy concerns the right to participate in collective self-rule, exercised with the other members of the relevant community. Finally, social autonomy concerns whether an agent has the means to be an equal member of this community. Attending to social autonomy helps to demonstrate the responsibility of members of the community to consider each other’s needs, and to evaluate political and social structures in terms of whether they serve to promote the social autonomy of all of the members. Forst argues that ultimately “citizens are politically free to the extent to which they, as freedom-grantors and freedom-users, are morally, ethically, legally, politically, and socially autonomous members of a political community … Rights and liberties therefore have to be justified not only with respect to one conception of autonomy but with a complex understanding of what it means to be an autonomous person” (Forst 2005, 238).

Whether or not one agrees with this particular way of dividing the conceptions of autonomy, or with the particular explanation of the details of any of the conceptions, Forst’s account highlights the way that understanding the contribution of autonomy to political theory involves a multifaceted approach. It is of limited use to say that citizens are autonomous because they have the right to vote, if their material needs are not met, or if they are not free in their choice of values or ethical commitments.  Taking ethical autonomy into consideration can help to meet some of the concerns raised above by communitarian and feminist critics of autonomy; meanwhile, taking legal autonomy into account alongside ethical autonomy can help to provide the bulwark of protection against oppressive traditions that feminists are concerned about.

This can also be related to the work done by Martha Nussbaum and Amartya Sen on the capabilities approach to human rights, in which societies are called upon to ensure that all human beings have the opportunity to develop certain capabilities; agents then have a choice whether or not to develop them (see for example Sen 1999 and Nussbaum 2006).  The kind of political autonomy granted to subjects, then, depends on their ability to cultivate these various capabilities within a given society.

b. Autonomy and Bioethics

In applied ethics, such as bioethics, autonomy is a key value. It is appealed to by both sides in a number of debates; in debates over hate speech, for example, it is invoked both on behalf of the right to free speech and on behalf of the right to be free from hate speech (Mackenzie and Stoljar 2000, 4). There is a lack of consensus, however, on how autonomy ought to be understood: how much rationality it requires, and whether it involves merely the negative right against interference or also positive duties of moral reflection and self-legislation.

Autonomy has long been an important principle within biomedical ethics. For example, in the Belmont Report, published in the United States in 1979, which articulates guidelines for experimentation on human subjects, the protection of subjects’ autonomy is enshrined in the principle of “respect for persons.” One of the three key principles of the Report, it states that participants in trials ought to be treated as autonomous, and that those with diminished autonomy (due to cognitive or other disabilities or illnesses) are entitled to protection. The principle is applied in the form of informed consent, which the Report presumes to be the best way to protect autonomy.

One of the standard textbooks in biomedical ethics, Principles of Biomedical Ethics by Tom L. Beauchamp and James F. Childress, defends four principles for ethical decision-making, of which “respect for autonomy” is the first, even though it is not intended to override other moral considerations.  The principle can be seen as both a negative and a positive obligation. The negative obligation for health care professionals is that patients’ autonomous decisions should not be constrained by others. The positive obligation calls for “respectful treatment in disclosing information and fostering autonomous decision-making” (Beauchamp and Childress 2001, 64).

Beauchamp and Childress accept that a patient can autonomously choose to be guided by religious, traditional, or community norms and values. While they acknowledge that it can be difficult to negotiate diverse values and beliefs in sharing information necessary for decision-making, this does not excuse a failure to respect a patient’s autonomous decision: “respect for autonomy is not a mere ideal in health care; it is a professional obligation. Autonomous choice is a right, not a duty of patients” (Beauchamp and Childress 2001, 63).

Autonomy is also important within the disability rights movement, where the slogan “Nothing about us without us” is a call for autonomy or self-determination (see Charlton 1998). The slogan not only rejects having decisions made for people with disabilities by others, but also speaks to the desire for empowerment and for recognition as agents capable of self-determination.

The relational approach to autonomy has become popular in the spheres of health care ethics and disability theory. The language of relational autonomy has been helpful in reframing the dichotomy between strict independence and dependence and providing a way of framing the relationship between a person with a disability and his or her caretaker or guardian. It has also been argued that a relational approach to patient autonomy provides a better model of the decision-making process.

Criticisms of a rationalistic and individualistic ideal of autonomy and the development of the idea of relational autonomy have been taken up within the mainstream of biomedical ethics. In response to criticism that early editions of their textbook on biomedical ethics had not paid adequate heed to intimate relationships and the social dimensions of patient autonomy, Beauchamp and Childress emphasize that they “aim to construct a conception of respect for autonomy that is not excessively individualistic (neglecting the social nature of individuals and the impact of individual choices and actions on others), not excessively focused on reason (neglecting the emotions), and not unduly legalistic (highlighting legal rights and downplaying social practices)” (Beauchamp and Childress, 2001, 57).

Their account of autonomy, however, has still been criticized by Anne Donchin as being a “weak concept” of relational autonomy (Donchin 2000). While they do not deny that selves are developed within a context of community and human relationships, agents are still assumed to have consciously chosen their beliefs and values and to be capable of detaching themselves from relationships at will (Donchin 2000, 238). A strong concept of relational autonomy, on the other hand, holds that “there is a social component built into the very meaning of autonomy,” and that autonomy “involves a dynamic balance among interdependent people tied to overlapping projects” (Donchin 2000, 239). The autonomous self is one “continually remaking itself in response to relationships that are seldom static,” and which “exists fundamentally in relation to others” (Donchin 2000, 239). Donchin argues that it is the strong concept of relational autonomy that offers the most helpful account of decision-making in health care.

5. References and Further Reading

  • Barvosa-Carter, Edwina. “Mestiza Autonomy as Relational Autonomy: Ambivalence and the Social Character of Free Will,” The Journal of Political Philosophy Vol. 15, no. 1 (2007), 1-21.
  • Beauchamp, Tom L. and James F. Childress. Principles of Biomedical Ethics, 5th ed, Oxford and New York: Oxford University Press, 2001.
  • Benjamin, Jessica. The Bonds of Love: Psychoanalysis, Feminism, and the Problem of Domination, New York: Pantheon Books, 1988, 183-224.
  • Benson, Paul. “Autonomy and Oppressive Socialization,” Social Theory and Practice 17, no. 3 (1991), 385-408.
  • Bratman, Michael. Structures of Agency, Oxford: Oxford University Press, 2007.
  • Charlton, James I. Nothing About Us Without Us: Disability, Oppression and Empowerment, Berkeley and Los Angeles: University of California Press, 1998.
  • Christman, John (ed.). The Inner Citadel: Essays on Individual Autonomy, New York and Oxford: Oxford University Press, 1989.
  • Christman, John, and Joel Anderson (eds.). Autonomy and the Challenges to Liberalism, Cambridge: Cambridge University Press, 2005.
  • Christman, John. “Autonomy and Personal History,” Canadian Journal of Philosophy, 21 no. 1(1991), 1-24.
  • Christman, John. “Autonomy, Self-Knowledge, and Liberal Legitimacy,” in Autonomy and the Challenges to Liberalism, ed. John Christman and Joel Anderson, Cambridge: Cambridge University Press, 2005.
  • Christman, John. “Feminism and Autonomy,” in “Nagging” Questions: Feminist Ethics in Everyday Life, ed. Dana E. Bushnell. Lanham, MD: Rowman and Littlefield Publishers, Inc., 1995, 17-39.
  • Christman, John. “Relational Autonomy, Liberal Individualism, and the Social Constitution of Selves,” Philosophical Studies 117, no. 1-2 (2004), 143-164.
  • Critchley, Simon. Infinitely Demanding: Ethics of Commitment, Politics of Resistance, London: Verso, 2007.
  • Donchin, Anne. “Autonomy and Interdependence: Quandaries in Genetic Decision Making.” In Relational Autonomy: Feminist Perspectives on Autonomy, Agency, and the Social Self, edited by Catriona Mackenzie and Natalie Stoljar, 236-258. Oxford: Oxford University Press, 2000.
  • Dworkin, Gerald. “Paternalism,” The Monist, 56 no. 1 (1972), 64-84.
  • Dworkin, Gerald. The Theory and Practice of Autonomy, Cambridge: Cambridge University Press, 1988.
  • Dworkin, Gerald. “The Concept of Autonomy,” in The Inner Citadel, ed. John Christman, 54-62.
  • Ekstrom, Laura. “A Coherence Theory of Autonomy,” Philosophy and Phenomenological Research, 53 (1993), 599–616.
  • Fineman, Martha Albertson. The Autonomy Myth: A Theory of Dependency. New York: The New Press, 2004.
  • Forst, Rainer. “Political Liberty: Integrating Five Conceptions of Autonomy,” in Autonomy and the Challenges to Liberalism, 2005, 226-242.
  • Frankfurt, Harry. The Importance of What We Care About, Cambridge: Cambridge University Press, 1988.
  • Friedman, Marilyn. “Autonomy and the Split-Level Self,” Southern Journal of Philosophy 24 (1986), 19-35.
  • Friedman, Marilyn. Autonomy, Gender, Politics, Oxford: Oxford University Press, 2003.
  • Gaus, Gerald F. “The Place of Autonomy Within Liberalism,” in Autonomy and the Challenges to Liberalism, 2005, 272-306.
  • Gilligan, Carol. In a Different Voice: Psychological Theory and Women’s Development. Cambridge, Mass.: Harvard University Press, 1982.
  • Grimshaw, Jean. Philosophy and Feminist Thinking. Minneapolis: University of Minnesota Press, 1986.
  • Harding, Sandra and Merrill B. Hintikka, eds., Discovering Reality: Feminist Perspectives on Epistemology, Metaphysics, Methodology, and Philosophy of Science, 2 ed.. Dordrecht: Kluwer Academic Publishers, 2003.
  • Hill, Thomas. “The Kantian Conception of Autonomy,” in The Inner Citadel, ed. John Christman, 91-105.
  • Hill, Thomas. “A Kantian Perspective on Moral Rules,” in Respect, Pluralism, and Justice: Kantian Perspectives (Oxford and New York: Oxford University Press, 2000), 33-55.
  • Hinchman, Lewis. “Autonomy, Individuality, and Self-Determination,” in What is Enlightenment? Eighteenth-Century Answers and Twentieth-Century Questions., ed. James Schmidt. Berkeley: University of California Press, 1996, 488-516.
  • Hoagland, Sarah L. Lesbian Ethics: Toward New Value. Palo Alto, California: Institute of Lesbian Studies, 1988, 144.
  • Jaggar, Alison M. Feminist Politics and Human Nature, Totowa, NJ: Rowman & Allenheld, 1983.
  • Kant, Immanuel. Practical Philosophy, ed. and trans. Mary Gregor. Cambridge: Cambridge University Press, 1996.
  • Korsgaard, Christine. The Sources of Normativity. New York: Cambridge University Press, 1996.
  • Kymlicka, Will. Contemporary Political Philosophy: An Introduction. Oxford: Oxford University Press, 1991.
  • Lévinas, Emmanuel. Totality and Infinity, trans. Alphonso Lingis, Pittsburgh: Duquesne University Press, 1969.
  • Lloyd, Genevieve. The Man of Reason: Male and Female in Western Philosophy (London: Routledge, 1986).
  • Mackenzie, Catriona, and Stoljar, Natalie, (eds.). Relational Autonomy, New York and Oxford: Oxford University Press, 2000.
  • Mele, Alfred R. Autonomous Agents: From Self-Control to Autonomy, New York and Oxford: Oxford University Press, 2001.
  • Meyers, Diana Tietjens. Self, Society, and Personal Choice, New York: Columbia University Press, 1989.
  • Meyers, Diana Tietjens. “Decentralizing Autonomy: Five Faces of Selfhood.” In Autonomy and the Challenges to Liberalism, edited by John Christman and Joel Anderson, 27-55. Cambridge: Cambridge University Press, 2005.
  • Mill, John Stuart. On Liberty, Indianapolis and New York: The Liberal Arts Press, 1956; originally published 1859.
  • Narayan, Uma. “Minds of Their Own: Choices, Autonomy, Cultural Practices, and Other Women,” in A Mind of One’s Own: Feminist Essays on Reason and Objectivity, ed. Louise M. Antony and Charlotte Witt (Boulder, CO: Westview Press, 2002), 418-432.
  • Nedelsky, Jennifer. “Reconceiving Autonomy: Sources, Thoughts and Possibilities.” Yale Journal of Law and Feminism, no. 1 (1989): 7-36.
  • Nussbaum, Martha. Frontiers of Justice: Disability, Nationality, Species Membership, Cambridge, MA: Belknap Press, 2006.
  • Okin, Susan Moller. Justice, Gender, and the Family, New York: Basic Books, Inc., 1989.
  • Oshana, Marina A. L. “Autonomy and Self-Identity.” In Autonomy and the Challenges to Liberalism: New Essays, edited by John Christman and Joel Anderson, 77-97. Cambridge: Cambridge University Press, 2005.
  • Sandel, Michael J. Liberalism and the Limits of Justice, 2nd ed., Cambridge: Cambridge University Press, 1998.
  • Sen, Amartya. Development as Freedom, Oxford: Oxford University Press, 1999.
  • Taylor, James S. (ed.). Personal Autonomy, Cambridge: Cambridge University Press, 2005.
  • Thalberg, Irving. “Hierarchical Analyses of Unfree Action,” Canadian Journal of Philosophy, no. 8 (1978).
  • Watson, Gary. “Free Agency,” Journal of Philosophy, no. 72 (1975), 205-220.
  • Weiss, Penny A. “Feminism and Communitarianism: Comparing Critiques of Liberalism.” In Feminism and Community, edited by Penny A. Weiss and Marilyn Friedman. Philadelphia: Temple University Press, 1995, 161-186.
  • Wolf, Susan. Freedom within Reason (New York: Oxford University Press, 1990).
  • Wolf, Susan. “Sanity and the Metaphysics of Responsibility,” in Responsibility, Character and the Emotions, ed. Ferdinand Schoeman. New York: Cambridge University Press, 1987, 46-62.

Author Information

Jane Dryden
Email: jdryden@mta.ca
Mount Allison University
Canada

Metaphilosophy

What is philosophy? What is philosophy for? How should philosophy be done? These are metaphilosophical questions, metaphilosophy being the study of the nature of philosophy. Contemporary metaphilosophies within the Western philosophical tradition can be divided, rather roughly, according to whether they are associated with (1) Analytic philosophy, (2) Pragmatist philosophy, or (3) Continental philosophy.

The pioneers of the Analytic movement held that philosophy should begin with the analysis of propositions. In the hands of two of those pioneers, Russell and Wittgenstein, such analysis gives a central role to logic and aims at disclosing the deep structure of the world. But Russell and Wittgenstein thought philosophy could say little about ethics. The movement known as Logical Positivism shared the aversion to normative ethics. Nonetheless, the positivists meant to be progressive. As part of that, they intended to eliminate metaphysics. The so-called ordinary language philosophers agreed that philosophy centrally involved the analysis of propositions, but, and this recalls a third Analytic pioneer, namely Moore, their analyses remained at the level of natural language as against logic. The later Wittgenstein has an affinity with ordinary language philosophy. For Wittgenstein had come to hold that philosophy should protect us against dangerous illusions by being a kind of therapy for what normally passes for philosophy. Metaphilosophical views held by later Analytic philosophers include the idea that philosophy can be pursued as a descriptive but not a revisionary metaphysics and that philosophy is continuous with science.

The pragmatists, like those Analytic philosophers who work in practical or applied ethics, believed that philosophy should treat ‘real problems’ (although the pragmatists gave ‘real problems’ a wider scope than the ethicists tend to). The neopragmatist Rorty goes so far as to say the philosopher should fashion her philosophy so as to promote her cultural, social, and political goals. So-called post-Analytic philosophy is much influenced by pragmatism. Like the pragmatists, the post-Analyticals tend (1) to favor a broad construal of the philosophical enterprise and (2) to aim at dissolving rather than solving traditional or narrow philosophical problems.

The first Continental position considered herein is Husserl’s phenomenology. Husserl believed that his phenomenological method would enable philosophy to become a rigorous and foundational science. Still, on Husserl’s conception, philosophy is both a personal affair and something that is vital to realizing the humanitarian hopes of the Enlightenment. Husserl’s existential successors modified his method in various ways and stressed, and refashioned, the ideal of authenticity presented by his writings. Another major Continental tradition, namely Critical Theory, makes of philosophy a contributor to emancipatory social theory; and the version of Critical Theory pursued by Jürgen Habermas includes a call for ‘postmetaphysical thinking’. The later thought of Heidegger advocates a postmetaphysical thinking too, albeit a very different one; and Heidegger associates metaphysics with the ills of modernity. Heidegger strongly influenced Derrida’s metaphilosophy. Derrida’s deconstructive approach to philosophy (1) aims at clarifying, and loosening the grip of, the assumptions of previous, metaphysical philosophy, and (2) means to have an ethical and political import.

Table of Contents

  1. Introduction
    1. Some Pre-Twentieth Century Metaphilosophy
    2. Defining Metaphilosophy
    3. Explicit and Implicit Metaphilosophy
    4. The Classification of Metaphilosophies – and the Treatment that Follows
  2. Analytic Metaphilosophy
    1. The Analytic Pioneers: Russell, the Early Wittgenstein, and Moore
    2. Logical Positivism
    3. Ordinary Language Philosophy and the Later Wittgenstein
    4. Three Revivals
      1. Normative Philosophy including Rawls and Practical Ethics
      2. History of Philosophy
      3. Metaphysics: Strawson, Quine, Kripke
    5. Naturalism including Experimentalism and Its Challenge to Intuitions
  3. Pragmatism, Neopragmatism, and Post-Analytic Philosophy
    1. Pragmatism
    2. Neopragmatism: Rorty
    3. Post-Analytic Philosophy
  4. Continental Metaphilosophy
    1. Phenomenology and Related Currents
      1. Husserl’s Phenomenology
      2. Existential Phenomenology, Hermeneutics, Existentialism
    2. Critical Theory
      1. Critical Theory and the Critique of Instrumental Reason
      2. Habermas
    3. The Later Heidegger
    4. Derrida’s Post-Structuralism
  5. References and Further Reading
    1. Explicit Metaphilosophy and Works about Philosophical Movements or Traditions
    2. Analytic Philosophy including Wittgenstein, Post-Analytic Philosophy, and Logical Pragmatism
    3. Pragmatism and Neopragmatism
    4. Continental Philosophy
    5. Other

1. Introduction

The main topic of the article is the Western metaphilosophy of the last hundred years or so. But that topic is broached via a sketch of some earlier Western metaphilosophies. (In the case of the sketch, ‘Western’ means European. In the remainder of the article, ‘Western’ means European and North American. On Eastern metaphilosophy, see the entries filed under such heads as ‘Chinese philosophy’ and ‘Indian philosophy’.) Once that sketch is in hand, the article defines the notion of metaphilosophy and distinguishes between explicit and implicit metaphilosophy. Then there is a consideration of how metaphilosophies might be categorized and an outline of the course of the remainder of the article.

a. Some Pre-Twentieth Century Metaphilosophy

Socrates believed that the unexamined life – the unphilosophical life – was not worth living (Plato, Apology, 38a). Indeed, Socrates saw his role as helping to rouse people from unreflective lives. He did this by showing them, through his famous ‘Socratic method’, that in fact they knew little about, for example, justice, beauty, love or piety. Socrates’ use of that method contributed to his being condemned to death by the Athenian state. But Socrates’ politics contributed too; and here one can note that, according to the Republic (473c-d), humanity will prosper only when philosophers are kings or kings’ philosophers. It is notable too that, in Plato’s Phaedo, Socrates presents death as liberation of the soul from the tomb of the body.

According to Aristotle, philosophy begins in wonder, seeks the most fundamental causes or principles of things, and is the least necessary but thereby the most divine of sciences (Metaphysics, book alpha, sections 1–3). Despite the point about necessity, Aristotle taught ethics, a subject he conceived as ‘a kind of political science’ (Nicomachean Ethics, book 1) and which had the aim of making men good. Later philosophers continued and even intensified the stress on philosophical practicality. According to the Hellenistic philosophers – the Cynics, Sceptics, Epicureans and Stoics – philosophy revealed (1) what was valuable and what was not, and (2) how one could achieve the former and protect oneself against longing for the latter. The Roman Cicero held that to study philosophy is to prepare oneself for death. The later and neoplatonic thinker Plotinus asked, ‘What, then, is Philosophy?’ and answered, ‘Philosophy is the supremely precious’ (Enneads, I.3.v): a means to blissful contact with a mystical principle he called ‘the One’.

The idea that philosophy is the handmaiden of theology, earlier propounded by the Hellenistic thinker Philo of Alexandria, is most associated with the medieval age and particularly with Aquinas. Aquinas resumed the project of synthesizing Christianity with Greek philosophy – a project that had been pursued already by various thinkers including Augustine, Anselm, and Boethius. (Boethius was a politician inspired by philosophy – but the politics ended badly for him. In those respects he resembles the earlier Seneca. And, like Seneca, Boethius wrote of the consolations of philosophy.)

‘[T]he word “philosophy” means the study [or love – philo] of wisdom, and by “wisdom” is meant not only prudence in our everyday affairs but also a perfect knowledge of all things that mankind is capable of knowing, both for the conduct of life and for the preservation of health and the discovery of all manner of skills.’ Thus Descartes (1988: p. 179). Locke’s Essay Concerning Human Understanding (bk. 4. ch. 19, p. 697) connects philosophy with the love of truth and identifies the following as an ‘unerring mark’ of that love: ‘The not entertaining any Proposition with greater assurance than the Proofs it is built upon will warrant.’ Hume’s ‘Of Suicide’ opens thus: ‘One considerable advantage that arises from Philosophy, consists in the sovereign antidote which it affords to superstition and false religion’ (Hume 1980: 97). Kant held that ‘What can I know?’, ‘What ought I to do?’, and, ‘What may I hope?’ were the ultimate questions of human reason (Critique of Pure Reason, A805 / B833) and asserted that philosophy’s ‘peculiar dignity’ lies in ‘principles of morality, legislation, and religion’ that it can provide (A318 / B375). According to Hegel, the point of philosophy – or of ‘the dialectic’ – is to enable people to recognize the embodiment of their ideals in their social and political lives and thereby to be at home in the world. Marx’s famous eleventh ‘Thesis on Feuerbach’ declared that, while philosophers had interpreted the world, the point was to change it.

b. Defining Metaphilosophy

As the foregoing sketch begins to suggest, three very general metaphilosophical questions are (1) What is philosophy? (2) What is, or what should be, the point of philosophy? (3) How should one do philosophy? Those questions resolve into a host of more specific metaphilosophical conundra, some of which are as follows. Is philosophy a process or a product? What kind of knowledge can philosophy attain? How should one understand philosophical disagreement? Is philosophy historical in some special or deep way? Should philosophy make us better people? Happier people? Is philosophy political? What method(s) and types of evidence suit philosophy? How should philosophy be written (presuming it should be written at all)? Is philosophy, in some sense, over – or should it be?

But how might one define metaphilosophy? One definition owes to Morris Lazerowitz. (Lazerowitz claims to have invented the English word ‘metaphilosophy’ in 1940. But some foreign-language equivalents of the term ‘metaphilosophy’ antedate 1940. Note further that, in various languages including English, sometimes the term takes a hyphen after the ‘meta’.) Lazerowitz proposed (1970) that metaphilosophy is ‘the investigation of the nature of philosophy.’ If we take ‘nature’ to include both the point of philosophy and how one does (or should do) philosophy, then that definition fits with the most general metaphilosophical questions just identified above. Still: there are other definitions of metaphilosophy; and while Lazerowitz’s definition will prove best for our purposes, one needs – in order to appreciate that fact, and in order to give the definition a suitable (further) gloss – to survey the alternatives.

One alternative definition construes metaphilosophy as the philosophy of philosophy. Sometimes that definition intends this idea: metaphilosophy applies the method(s) of philosophy to philosophy itself. That idea itself comes in two versions. One is a ‘first-order’ construal. The thought here is this. Metaphilosophy, as the application of philosophy to philosophy itself, is simply one more instance of philosophy (Wittgenstein 2001: section 121; Williamson 2007: ix). The other version – the ‘second-order’ version of the idea that metaphilosophy applies philosophy to itself – is as follows. Metaphilosophy stands to philosophy as philosophy stands to its subject matter or to other disciplines (Rescher 2006), such that, as Williamson puts it (loc. cit) metaphilosophy ‘look[s] down upon philosophy from above, or beyond.’ (Williamson himself, who takes the first-order view, prefers the term ‘the philosophy of philosophy’ to ‘metaphilosophy’. For he thinks that ‘metaphilosophy’ has this connotation of looking down.) A different definition of metaphilosophy exploits the fact that ‘meta’ can mean not only about but also after. On this definition, metaphilosophy is postphilosophy. Sometimes Lazerowitz himself used ‘metaphilosophy’ in that way. What he had in mind here, more particularly, is the ‘special kind of investigation which Wittgenstein had described as one of the “heirs” of philosophy’ (Lazerowitz 1970). Some French philosophers have used the term similarly, though with reference to Heidegger and/or Marx rather than to Wittgenstein (Elden 2004: 83).

What then commends Lazerowitz’s (original) definition – the definition whereby metaphilosophy is investigation of the nature (and point) of philosophy? Two things. (1) The two ‘philosophy-of-philosophy’ construals are competing specifications of that definition. Indeed, those construals have little content until after one has a considerable idea of what philosophy is. (2) The equation of metaphilosophy and post-philosophy is narrow and tendentious; but Lazerowitz’s definition accommodates post-philosophy as a position within a more widely construed metaphilosophy. Still: Lazerowitz’s definition does require qualification, since there is a sense in which it is too broad. For ‘investigation of the nature of philosophy’ suggests that any inquiry into philosophy will count as metaphilosophical, whereas an inquiry tends to be deemed metaphilosophical only when it pertains to the essence, or very nature, of philosophy. (Such indeed is a third possible reading of the philosophy-of-philosophy construal.) Now, just what so pertains is moot; and there is a risk of being too unaccommodating. We might want to deny the title ‘metaphilosophy’ to, say, various sociological studies of philosophy, and even, perhaps, to philosophical pedagogy (that is, to the subject of how philosophy is taught). On the other hand, we are inclined to count as metaphilosophical claims about, for instance, philosophy corrupting its students or about professionalization corrupting philosophy (on these claims one may see Stewart 1995 and Anscombe 1957).

What follows will give a moderately narrow interpretation to the term ‘nature’ within the phrase ‘the nature of philosophy’.

c. Explicit and Implicit Metaphilosophy

Explicit metaphilosophy is metaphilosophy pursued as a subfield of, or attendant field to, philosophy. Metaphilosophy so conceived has waxed and waned. In the early twenty-first century, it has waxed in Europe and in the Anglophone (English-speaking) world. Probable causes of the increasing interest include Analytic philosophy having become more aware of itself as a tradition, the rise of philosophizing of a more empirical sort, and a softening of the divide between ‘Analytic’ and ‘Continental’ philosophy. (This article will revisit all of those topics in one way or another.) However, even when waxing, metaphilosophy generates much less activity than philosophy. Certainly the philosophical scene contains few book-length pieces of metaphilosophy. Books such as Williamson’s The Philosophy of Philosophy, Rescher’s Essay on Metaphilosophy, and What is Philosophy? by Deleuze and Guattari – these are not the rule but the exception.

There is more to metaphilosophy than explicit metaphilosophy. For there is also implicit metaphilosophy. To appreciate that point, consider, first, that philosophical positions can have metaphilosophical aspects. Many philosophical views – views about, say, knowledge, or language, or authenticity – can have implications for the task or nature of philosophy. Indeed, all philosophizing is somewhat metaphilosophical, at least in this sense: any philosophical view or orientation commits its holder to a metaphilosophy that accommodates it. Thus if one advances an ontology one must have a metaphilosophy that countenances ontology. Similarly, to adopt a method or style is to deem that approach at least passable. Moreover, a conception of the nature and point of philosophy, albeit perhaps an inchoate one, motivates and shapes much philosophy. But – and this is what allows there to be implicit metaphilosophy – sometimes none of this is emphasized, or even appreciated at all, by those who philosophize. Much of the metaphilosophy treated here is implicit, at least in the attenuated sense that its authors give philosophy much more attention than metaphilosophy.

d. The Classification of Metaphilosophies – and the Treatment that Follows

One way of classifying metaphilosophy would be by the aim that a given metaphilosophy attributes to philosophy. Alternatively, one could consider that which is taken as the model for philosophy or for philosophical form. Science? Art? Therapy? Something else? A further alternative is to distinguish metaphilosophies according to whether or not they conceive philosophy as somehow essentially linguistic. Another criterion would be the rejection or adoption or conception of metaphysics (metaphysics being something like the study of the fundamental nature of reality). And many further classifications are possible.

This article will employ the Analytic–Continental distinction as its most general classificatory schema. Or rather it uses these categories: (1) Analytic philosophy; (2) Continental philosophy; (3) pragmatism, neopragmatism, and post-Analytic philosophy, these being only some of the most important metaphilosophies of the last century or so. Those metaphilosophies are distinguished from one another via the philosophies or philosophical movements (movements narrower than those of the three top-level headings) to which they have been conjoined. That approach, and indeed the article’s most general schema, means that this account is organized by chronology as much as by theme. One virtue of the approach is that it provides a degree of historical perspective. Another is that the approach helps to disclose some rather implicit metaphilosophy associated with well-known philosophies. But the article will be thematic to a degree because it will bring out some points of identity and difference between various metaphilosophies and will consider criticisms of the metaphilosophies treated. However, the article will not much attempt to determine, on metaphilosophical or other criteria, the respective natures of Analytic philosophy, pragmatism, or Continental philosophy. The article employs those categories solely for organizational purposes. But note the following points.

  1. The particular placing of some individual philosophers within the schema is problematic. The case of the so-called later Wittgenstein is particularly moot. Is he ‘Analytic’? Should he have his own category?
  2. The delineation of the traditions themselves is controversial. The notions of the Analytic and the Continental are particularly vexed. The difficulties start with the fact that here a geographical category is juxtaposed to a more thematic or doctrinal one (Williams 2003). Moreover, some philosophers deny that Analytic philosophy has any substantial existence (Preston 2007; see also Rorty 1991a: 217); and some assert the same of Continental philosophy (Glendinning 2006: 13 and ff).
  3. Even only within contemporary Western history, there are significant approaches to philosophy that seem to at least somewhat warrant their own categories. Among those approaches are ‘traditionalist philosophy’, which devotes itself to the study of ‘the grand […] tradition of Western philosophy ranging from the Pre-Socratics to Kant’ (Glock 2008: 85f.), feminism, and environmental philosophy. This article does not examine those approaches.

2. Analytic Metaphilosophy

a. The Analytic Pioneers: Russell, the Early Wittgenstein, and Moore

Bertrand Russell, his pupil Ludwig Wittgenstein, and their colleague G. E. Moore – the pioneers of Analytic philosophy – shared the view that ‘all sound philosophy should begin with an analysis of propositions’ (Russell 1992: 9; first published in 1900). In Russell and Wittgenstein such analysis was centrally a matter of logic. (Note, however, that the expression ‘Analytic philosophy’ seems to have emerged only in the 1930s.)

Russellian analysis has two stages (Beaney 2007: 2–3 and 2009: section 3; Urmson 1956). First, propositions of ordinary or scientific language are transformed into what Russell regarded as their true form. This ‘logical’ or ‘transformative’ analysis draws heavily upon the new logic of Frege and finds its exemplar in Russell’s ‘theory of descriptions’ (Analytic Philosophy, section 2.a). The next step is to correlate elements within the transformed propositions with elements in the world. Commentators have called this second stage or form of analysis – which Russell counted as a matter of ‘philosophical logic’ – ‘reductive’, ‘decompositional’, and ‘metaphysical’. It is decompositional and reductive inasmuch as, like chemical analysis, it seeks to resolve its objects into their simplest elements, such an element being simple in that it itself lacks parts or constituents. The analysis is metaphysical in that it yields a metaphysics. According to the metaphysics that Russell actually derived from his analysis – the metaphysics which he called ‘logical atomism’ – the world comprises indivisible ‘atoms’ that combine, in structures limned by logic, to form the entities of science and everyday life. Russell’s empiricism inclined him to conceive the atoms as mind-independent sense-data. (See further Russell’s Metaphysics, section 4.)
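A brief illustration of the transformative stage may help; the rendering below is the standard textbook formalization of Russell’s own example from ‘On Denoting’, given here for orientation rather than quoted from any of the works cited in this article. The sentence ‘The present King of France is bald’ is analyzed as

\[ \exists x \, \big( K(x) \wedge \forall y \, ( K(y) \rightarrow y = x ) \wedge B(x) \big) \]

where K(x) abbreviates ‘x is now King of France’ and B(x) abbreviates ‘x is bald’. The apparent subject–predicate sentence turns out, on analysis, to be an existential and uniqueness claim, which is why Russell held that grammatical form can mask logical form.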

Logic in the dual form of analysis just sketched was the essence of philosophy, according to Russell (2009: ch. 2). Nonetheless, Russell wrote on practical matters, advocating, and campaigning for, liberal and socialist ideas. But he tended to regard such activities as unphilosophical, believing that ethical statements were non-cognitive and hence little amenable to philosophical analysis (see Non-Cognitivism in Ethics). But he did come to hold a form of utilitarianism that allowed ethical statements a kind of truth-aptness. And he did endorse a qualified version of this venerable idea: the contemplation of profound things enlarges the self and fosters happiness. Russell held further that practicing an ethics was little use given contemporary politics, a view informed by worries about the effects of conformity and technocracy. (On all this, see Schultz 1992.)

Wittgenstein agreed with Frege and Russell that ‘the apparent logical form of a proposition need not be its real one’ (Wittgenstein 1961: section 4.0031). And he agreed with Russell that language and the world share a common, ultimately atomistic, form. But Wittgenstein’s Tractatus Logico-Philosophicus developed these ideas into a somewhat Kantian and perhaps even Schopenhauerian position. (That book, first published in 1921, is the main and arguably only work of the so-called ‘early Wittgenstein’. Section 2.c treats Wittgenstein’s later views. The title of the book translates roughly as ‘logico-philosophical treatise’.)

The Tractatus maintains the following.

Only some types of proposition have sense (or are propositions properly so called), namely, those that depict possible states of affairs. ‘The cat is on the mat’ is one such proposition. It depicts a possible state of affairs. If that state of affairs does not obtain – if the cat in question is not on the mat in question – then the proposition is rendered false but still has sense. The same holds for most of the propositions of our everyday speech and for scientific propositions. Matters are otherwise with propositions of logic. Propositions of logic express tautologies or contradictions; they do not depict anything – and that entails that they lack sense. (Wittgenstein calls them ‘senseless’, sinnlos.) Nor do metaphysical statements make sense. (They are ‘nonsense’, Unsinn.) Such statements concern value or the meaning of life or God. Thus, they do try to depict something; but that which they try to depict is no possible state of affairs within the world. ‘[W]henever someone […] want[s] to say something metaphysical’, one should ‘demonstrate to him that he had failed to give a meaning to certain signs in his propositions’ (section 6.53).
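By way of illustration (an example in the spirit of the Tractatus, not one drawn from it), a truth-table check shows that

\[ p \lor \neg p \ \text{ is true under every assignment of truth-values to } p, \qquad p \land \neg p \ \text{ is false under every assignment.} \]

Neither rules any state of affairs in or out, so neither depicts anything; both are, in the Tractatus’s terms, senseless. A contingent proposition such as p, by contrast, divides the possibilities into those in which it is true and those in which it is false, and so has sense.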

A complication is that the Tractatus itself tries to say something metaphysical or at least something logical. Consequently the doctrines of the book entail that it itself lacks sense. Accordingly, Wittgenstein ends the Tractatus with the following words. ‘My propositions serve as elucidations in the following way: anyone who understands me eventually recognises them as nonsensical, when he has used them – as steps – to climb up beyond them […] He must transcend these propositions, and then he will see the world aright. ¶ What we cannot speak about we must pass over in silence’ (section 6.54–7).

Here is the metaphilosophical import of all this. Philosophy is ‘a critique of language’ that exposes metaphysical talk as senseless (section 4.0031). Accordingly, and as just heard, we are to eschew such talk. Yet, Wittgenstein’s attitude to such discourse was not straightforwardly negative. For, as seen, the Tractatus itself is senseless by its own lights. Moreover, the book uses the seemingly honorific words ‘mystical’ and ‘higher’ in relation to the states of affairs that various metaphysical or metaphysico-logical statements purport to depict (section 6.42–6.522). There is an element of reverence, then, in the ‘passing over in silence’; there are some things that philosophy is to leave well enough alone.

Like Russell and Wittgenstein, Moore advocated a form of decompositional analysis. He held that ‘a thing becomes intelligible first when it is analyzed into its constituent concepts’ (Moore 1899: 182; see further Beaney 2009: section 4). But Moore uses normal language rather than logic to specify those constituents; and, in his hands, analysis often supported commonplace, pre-philosophical beliefs. Nonetheless, and despite confessing that other philosophers rather than the world prompted his philosophizing (Schilpp 1942: 14), Moore held that philosophy should give ‘a general description of the whole Universe’ (1953: 1). Accordingly, Moore tackled ethics and aesthetics as well as epistemology and metaphysics. His Principia Ethica used the not-especially-commonsensical idea that goodness was a simple, indefinable quality in order to defend the meaningfulness of ethical statements and the objectivity of moral value. Additionally, Moore advanced a normative ethic, the wider social or political implications of which are debated (Hutchinson 2001).

Russell’s tendency to exclude ethics from philosophy, and Wittgenstein’s protective version of the exclusion, are contentious and presuppose their respective versions of atomism. In turn, that atomism relies heavily upon the idea, as metaphilosophical as it is philosophical, of an ideal language (or at least of an ideal analysis of natural language). Later sections criticize that idea. Such criticism finds little target in Moore. Yet Moore is a target for those who hold that philosophy should be little concerned with words or even, perhaps, with concepts (see section 2.c and the ‘revivals’ treated in section 2.d).

b. Logical Positivism

We witness the spirit of the scientific world-conception penetrating in growing measure the forms of personal and public life, in education, upbringing, architecture, and the shaping of economic and social life according to rational principles. The scientific world-conception serves life, and life receives it. The task of philosophical work lies in […] clarification of problems and assertions, not in the propounding of special “philosophical” pronouncements. The method of this clarification is that of logical analysis.

The foregoing passages owe to a manifesto issued by the Vienna Circle (Neurath, Carnap, and Hahn 1973: 317f. and 328). Leading members of that Circle included Moritz Schlick (a physicist turned philosopher), Rudolf Carnap (primarily a logician), and Otto Neurath (economist, sociologist, and philosopher). These thinkers were inspired by the original positivist, Auguste Comte. Other influences included the empiricisms of Hume, Russell and Ernst Mach, and also the Russell–Wittgenstein idea of an ideal logical language. (Wittgenstein’s Tractatus, in particular, was a massive influence.) The Circle, in turn, gave rise to an international movement that went under several names: logical positivism, logical empiricism, neopositivism, and simply positivism.

The clarification or logical analysis advocated by positivism is two-sided. Its destructive task was the use of the so-called verifiability principle to eliminate metaphysics. According to that principle, a statement is meaningful only when either true by definition or verifiable through experience. (So there is no synthetic apriori. See Kant, Metaphysics, section 2, and A Priori and A Posteriori.) The positivists placed mathematics and logic within the true-by-definition (or analytic apriori) category, and science and most normal talk in the category of verifiable-through-experience (or synthetic aposteriori). All else was deemed meaningless. That fate befell metaphysical statements and finds its most famous illustration in Carnap’s attack (1931) on Heidegger’s ‘What is Metaphysics?’ It was the fate, too, of ethical and aesthetic statements. Hence the non-cognitivist meta-ethics that some positivists developed.

The constructive side of positivistic analysis involved epistemology and philosophy of science. The positivists wanted to know exactly how experience justified empirical knowledge. Sometimes – the positivists took various positions on the issue – the idea was to reduce all scientific statements to those of physics. (See Reductionism.) That particular effort went under the heading of ‘unified science’. So too did an idea that sought to make good on the claim that positivism ‘served life’. The idea I have in mind was this: the sciences should collaborate in order to help solve social problems. That project was championed by the so-called Left Vienna Circle and, within that, especially by Neurath (who served in a socialist Munich government and, later, was a central figure in Austrian housing movements). The positivists had close relations with the Bauhaus movement, which was itself understood by its members as socially progressive (Galison 1990).

Positivism had its problems and its detractors. The believer in ‘special philosophical pronouncements’ will think that positivism decapitates philosophy (compare section 4.a below, on Husserl). Moreover, positivism itself seemingly involved at least one ‘special’ – read: metaphysical – pronouncement, namely, the verifiability principle. Further, there is reason to distrust the very idea of providing strict criteria for nonsense (see Glendinning 2001). Further yet, the idea of an ideal logical language was attacked as unachievable, incoherent, and/or – when used as a means to certify philosophical truth – circular (Copi 1949). There were the following doubts, too, about whether positivism really ‘served life’. (1) Might positivism’s narrow notion of fact prevent it from comprehending the real nature of society? (Critical Theory leveled that objection. See O’Neill and Uebel 2004.) (2) Might positivism involve a disastrous reduction of politics to the discovery of technical solutions to depoliticized ends? (This objection owes again to Critical Theory, but also to others. See Galison 1990 and O’Neill 2003.)

Positivism retained some coherence as a movement or doctrine until the late 1960s, even though the Nazis – with whom the positivists clashed – forced the Circle into exile. In fact, that exile helped to spread the positivist creed. But, not long after the Second World War, the ascendancy that positivism had acquired in Anglophone philosophy began to diminish. It did so partly because of the developments considered by the next section.

c. Ordinary Language Philosophy and the Later Wittgenstein

Some accounts group ordinary language philosophy and the philosophy of the later Wittgenstein (and of Wittgenstein’s disciples) together – under the title ‘linguistic philosophy’. That grouping can mislead. All previous Analytic philosophy was centrally concerned with language. In that sense, all previous Analytic philosophy had taken the so-called ‘linguistic turn’ (see Rorty 1992). Nevertheless, ordinary language philosophy and the later Wittgenstein do mark a change. They twist the linguistic turn away from logical or constructed languages and towards ordinary (that is, vernacular) language, or at least towards natural (non-artificial) language. Thereby the new bodies of thought represent a movement away from Russell, the early Wittgenstein, and the positivists (and back, to an extent, towards Moore). In short – and as many accounts of the history of Analytic philosophy put it – we have here a shift from ideal language philosophy to ordinary language philosophy.

Ordinary language philosophy began with and centrally comprised a loose grouping of philosophers among whom the Oxford dons Gilbert Ryle and J. L. Austin loomed largest. The following view united these philosophers. Patient analysis of the meaning of words can tap the rich distinctions of natural languages and minimize the unclarities, equivocations and conflations to which philosophers are prone. So construed, philosophy is unlike natural science and even, insofar as it avoided systematization, unlike linguistics. The majority of ordinary language philosophers did hold, with Austin, that such analysis was not ‘the last word’ in philosophy. Specialist knowledge and techniques can in principle everywhere augment and improve it. But natural or ordinary language ‘is the first word’ (Austin 1979: 185; see also Analytic Philosophy, section 4.a).

The later Wittgenstein did hold, or at least came close to holding, that ordinary language has the last word in philosophy. This later Wittgenstein retained his earlier view that philosophy was a critique of language – of language that tried to be metaphysical or philosophical. But he abandoned the idea (itself problematically metaphysical) that there was one true form to language. He came to think, instead, that all philosophical problems owe to ‘misinterpretation of our forms of language’ (Wittgenstein 2001: section 111). They owe to misunderstanding of the ways language actually works. A principal cause of such misunderstanding, Wittgenstein thought, is misassimilation of expressions one to another. Such misassimilation can be motivated, in turn, by a ‘craving for generality’ (Wittgenstein 1975: 17ff.) that is inspired by science. The later Wittgenstein’s own philosophizing means to be a kind of therapy for philosophers, a therapy which will liberate them from their problems by showing how, in their very formulations of those problems, their words have ceased to make sense. Wittgenstein tries to show how the words that give philosophers trouble – words such as ‘know’, ‘mind’, and ‘sensation’ – become problematical only when, in philosophers’ hands, they depart from the uses and the contexts that give them meaning. Thus a sense in which philosophy ‘leaves everything as it is’ (2001: section 124). ‘[W]e must do away with all explanation, and description alone must take its place’ (section 109). Still, Wittgenstein himself once asked, ‘[W]hat is the use of studying philosophy if all that it does for you is to enable you to talk with some plausibility about some abstruse questions of logic, etc. […]’? (cited in Malcolm 1984: 35 and 93). And in one sense Wittgenstein did not want to leave everything as it was. To wit: he wanted to end the worship of science. For the view that science could express all genuine truths was, he held, barbarizing us by impoverishing our understanding of the world and of ourselves.

Much metaphilosophical flak has been aimed at the later Wittgenstein and ordinary language philosophy. They have been accused of: abolishing practical philosophy; rendering philosophy uncritical; trivializing philosophy by making it a mere matter of words; enshrining the ignorance of common speech; and, in Wittgenstein’s case – and in his own words (taken out of context) – of ‘destroy[ing] everything interesting’ (2001: section 118; on these criticisms see Russell 1995: ch. 18, Marcuse 1991: ch. 7 and Gellner 2005). Nonetheless, it is at least arguable that these movements of thought permanently changed Analytic philosophy by making it more sensitive to linguistic nuance and to the oddities of philosophical language. Moreover, some contemporary philosophers have defended more or less Wittgensteinian conceptions of philosophy. One such philosopher is Peter Strawson (on whom see section 2.d.iii). Another is Stanley Cavell. Note also that some writers have attempted to develop the more practical side of Wittgenstein’s thought (Pitkin 1993, Cavell 1979).

d. Three Revivals

Between the 1950s and the 1970s, there were three significant, and persisting, metaphilosophical developments within the Analytic tradition.

i. Normative Philosophy including Rawls and Practical Ethics

During positivism’s ascendancy, and for some time thereafter, substantive normative issues – questions about how one should live, what sort of government is best or legitimate, and so on – were widely deemed quasi-philosophical. Positivism’s non-cognitivism was a major cause. So was the distrust, in the later Wittgenstein and in ordinary language philosophy, of philosophical theorizing. This neglect of the normative had its exceptions. But the real change occurred with the appearance, in 1971, of A Theory of Justice by John Rawls.

Many took Rawls’ book to show, through its ‘systematicity and clarity’, that normative theory was possible ‘without loss of rigor’ (Weithman 2003: 6). Rawls’ procedure for justifying normative principles is of particular metaphilosophical note. That procedure, called ‘reflective equilibrium’, has three steps. (The quotations that follow are from Schroeter 2004.)

  1. ‘[W]e elicit the moral judgments of competent moral judges’ on whatever topic is at issue. (In A Theory of Justice itself, distributive justice was the topic.) Thereby we obtain ‘a set of considered judgments, in which we have strong confidence’.
  2. ‘[W]e construct a scheme of explicit principles, which will ‘‘explicate’’, ‘‘fit’’, ‘‘match’’ or ‘‘account for’’ the set of considered judgments.’
  3. By moving ‘back and forth between the initial judgments and the principles, making the adjustments which seem the most plausible’, ‘we remove any discrepancy which might remain between the judgments derived from the scheme of principles and the initial considered judgments’, thereby achieving ‘a point of equilibrium, where principles and judgments coincide’.

The conception of reflective equilibrium was perhaps less philosophically orthodox than most readers of Theory of Justice believed. For Rawls came to argue that his conception of justice was, or should be construed as, ‘political not metaphysical’ (Rawls 1999b: 47–72). A political conception of justice ‘stays on the surface, philosophically speaking’ (Rawls 1999b: 395). It appeals only to that which ‘given our history and the traditions embedded in our public life […] is the most reasonable doctrine for us’ (p. 307). A metaphysical conception of justice appeals to something beyond such contingencies. However: despite advocating the political conception, Rawls appeals to an ‘overlapping consensus’ (his term) of metaphysical doctrines. The idea here, or hope, is this (Rawls, section 3; Freeman 2007: 324–415). Citizens in modern democracies hold various and not fully inter-compatible political and social ideas. But those citizens will be able to unite in supporting a liberal conception of justice.

Around the same time as Theory of Justice appeared, a parallel revival in normative philosophy began. This was the rise of practical ethics. Here is how one prominent practical ethicist presents ‘the most plausible explanation’ for that development. ‘[L]aw, ethics, and many of the professions—including medicine, business, engineering, and scientific research—were profoundly and permanently affected by issues and concerns in the wider society regarding individual liberties, social equality, and various forms of abuse and injustice that date from the late 1950s’ (Beauchamp 2002: 133f.). Now the new ethicists, who insisted that philosophy should treat ‘real problems’ (Beauchamp 2002: 134), did something largely foreign to previous Analytic philosophy (and to that extent did not, in fact, constitute a revival). They applied moral theory to such concrete and pressing matters as racism, sexual equality, abortion, governance and war. (On those problems, see Ethics, section 3.)

According to some practical ethicists, moral principles are not only applied to, but also drawn from, cases. The issue here – the relation between theory and its application – broadened out into a more thoroughly metaphilosophical debate. For, soon after Analytic philosophers had returned to normative ethics, some of them rejected a prevalent conception of normative ethical theory, and others entirely rejected such theory. The first camp rejects moral theory qua ‘decision procedure for moral reasoning’ (Williams 1981: ix-x) but does not foreclose other types of normative theory such as virtue ethics. The second and more radical camp holds that the moral world is too complex for any (prescriptive) codification that warrants the name ‘theory’. (On these positions, see Lance and Little 2006, Clarke 1987, Chappell 2009.)

ii. History of Philosophy

For a long time, most analytic philosophers held that the history of philosophy had little to do with doing philosophy. For what – they asked – was the history of philosophy save, largely, a series of mistakes? We might learn from those mistakes, and the history might contain some occasional insights. But (the line of thought continues) we should be wary of resurrecting the mistakes and beware the archive fever that leads to the idea that there is no such thing as philosophical progress. But in the 1970s a more positive attitude to the history of philosophy began to emerge, together with an attempt to reinstate or re-legitimate serious historical scholarship within philosophy (compare Analytic Philosophy section 5.c).

The newly positive attitude towards the history of philosophy was premised on the view that the study of past philosophies was of significant philosophical value. Reasons adduced for that view include the following (Sorell and Rogers 2005). History of philosophy can disclose our assumptions. It can show the strengths of positions that we find uncongenial. It can suggest roles that philosophy might take today by revealing ways in which philosophy has been embedded in wider intellectual and sociocultural frameworks. A more radical view, espoused by Charles Taylor (1984: 17), is that ‘Philosophy and the history of philosophy are one’; ‘we cannot do the first without also doing the second.’

Many Analytic philosophers continue to regard the study of philosophy’s history as very much secondary to philosophy itself. By contrast, many so-called Continental philosophers take the foregoing ideas, including the more radical view – which is associated with Hegel – as axiomatic. (See much of section 4, below.)

iii. Metaphysics: Strawson, Quine, Kripke

Positivism, the later Wittgenstein, and Ordinary Language Philosophy suppressed Analytic metaphysics. Yet it recovered, thanks especially to three figures, beginning with Peter Strawson.

Strawson had his origins in the ordinary language tradition and he declares a large debt or affinity to Wittgenstein (Strawson 2003: 12). But he is indebted, also, to Kant; and, with Strawson, ordinary language philosophy became more systematic and more ambitious. However, Strawson retained an element of what one might call, in Rae Langton’s phrase, Kantian humility. In order to understand these characterizations, one needs to appreciate that which Strawson advocated under the heading of ‘descriptive metaphysics’. In turn, descriptive metaphysics is best approached via that which Strawson called ‘connective analysis’.

Connective analysis seeks to elucidate concepts by discerning their interconnections, which is to say, the ways in which concepts variously imply, presuppose, and exclude one another. Strawson contrasts this ‘connective model’ with ‘the reductive or atomistic model’ that aims ‘to dismantle or reduce the concepts we examine to other and simpler concepts’ (all Strawson 1991: 21). The latter model is that of Russell, the Tractatus, and, indeed, Moore. Another way in which Strawson departs from Russell and the Tractatus, but not from Moore, lies in this: a principal method of connective analysis is ‘close examination of the actual use of words’ (Strawson 1959: 9). But when Strawson turns to ‘descriptive metaphysics’, such examination is not enough.

Descriptive metaphysics is, or proceeds via, a very general form of connective analysis. The goal here is ‘to lay bare the most general features of our conceptual structure’ (Strawson 1959: 9). Those most general features – our most general concepts – have a special importance. For those concepts, or at least those of them in which Strawson is most interested, are (he thinks) basic or fundamental in the following sense. They are (1) irreducible, (2) unchangeable in that they comprise ‘a massive central core of human thinking which has no history’ (1959: 10) and (3) necessary to ‘any conception of experience which we can make intelligible to ourselves’ (Strawson 1991: 26). And the structure that these concepts comprise ‘does not readily display itself on the surface of language, but lies submerged’ (1959: 9f.).

Descriptive metaphysics is considerably Kantian (see Kant, metaphysics). Strawson is Kantian, too, in rejecting what he calls ‘revisionary metaphysics’. Here we have the element of Kantian ‘humility’ within Strawson’s enterprise. Descriptive metaphysics ‘is content to describe the actual structure of our thought about the world’, whereas revisionary metaphysics aims ‘to produce a better structure’ (Strawson 1959: 9; my stress). Strawson urges several points against revisionary metaphysics.

  1. A revisionary metaphysic is apt to be an overgeneralization of some particular aspect of our conceptual scheme.
  2. A revisionary metaphysic is apt to conflate conceptions of how things really are with some Weltanschauung.
  3. Revisionary metaphysics attempts the impossible, namely, to depart from the fundamental features of our conceptual scheme.

The first point shows the influence of Wittgenstein. So does the third, although it is also (as Strawson may have recognized) somewhat Heideggerian. The second point is reminiscent of Carnap’s version of logical positivism. All this notwithstanding, and consistently enough, Strawson held that systems of revisionary metaphysics can, through the ‘partial vision’ (1959: 9) that they provide, be useful to descriptive metaphysics.

Here are some worries about Strawson’s metaphilosophy. ‘[T]he conceptual system with which “we” are operating may be much more changing, relative, and culturally limited than Strawson assumes it to be’ (Burtt 1963: 35). Next: Strawson imparts very little about the method(s) of descriptive metaphysics (although one might try to discern techniques – in which imagination seems to play a central role – from his actual analyses). More serious is that Strawson imparts little by way of answer to the following questions. ‘What is a concept? How are concepts individuated? What is a conceptual scheme? How are conceptual schemes individuated? What is the relation between a language and a conceptual scheme?’ (Haack 1979: 366f.). Further: why believe that the analytic philosopher has no business providing ‘new and revealing vision[s]’ (Strawson 1992: 2)? At any rate, Strawson helped those philosophers who rejected reductive (especially Russellian and positivistic) versions of analysis but who wanted to continue to call themselves ‘analytic’. For he gave them a reasonably narrow conception of analysis to which they could adhere (Beaney 2009: section 8; compare Glock 2008: 159). Finally note that, despite his criticisms of Strawson, the contemporary philosopher Peter Hacker defends a metaphilosophy rather similar to descriptive metaphysics (Hacker 2003 and 2007).

Willard Van Orman Quine was a second prime mover in the metaphysical revival. Quine’s metaphysics, which is revisionary in Strawson’s terms, emerged from Quine’s attack upon ‘two dogmas of modern empiricism’. Those ostensible dogmas are: (1) ‘belief in some fundamental cleavage between truths that are analytic, or grounded in meanings independently of matters of fact, and truths that are synthetic, or grounded in fact’; (2) ‘reductionism: the belief that each meaningful statement is equivalent to some logical construction upon terms which refer to immediate experience’ (Quine 1980: 20). Against 1, Quine argues that every belief has some connection to experience. Against 2, he argues that the connection is never direct. For when experience clashes with some belief, which belief(s) must be changed is underdetermined. Beliefs ‘face the tribunal of sense experience not individually but as a corporate body’ (p. 41; see Evidence section 3.c.i). Quine expresses this holistic and radically empiricist conception by speaking of ‘the web of belief’. Some beliefs – those near the ‘edge of the web’ – are more exposed to experience than others; but the interlinking of beliefs is such that no belief is immune to experience.

Quine saves metaphysics from positivism. More judiciously put: Quine’s conception, if correct, saves metaphysics from the verifiability criterion (q.v. section 2.b). For the notion of the web of belief implies that ontological beliefs – beliefs about ‘the most general traits of reality’ (Quine 1960: 161) – are answerable to experience. And, if that is so, then ontological beliefs differ from other beliefs only in their generality. Quine infers that, ‘Ontological questions […] are on a par with questions of natural science’ (1980: 45). In fact, since Quine thinks that natural science, and in particular physics, is the best way of fitting our beliefs to reality, he infers that ontology should be determined by the best available comprehensive scientific theory. In that sense, metaphysics is ‘the metaphysics of science’ (Glock 2003a: 30).

Is the metaphysics of science actually only science? Quine asserts that ‘it is only within science itself, and not in some prior philosophy, that reality is to be identified and described’ (1981: 21). Yet he does leave a job for the philosopher. The philosopher is to translate the best available scientific theory into that which Quine called ‘canonical notation’, namely, ‘the language of modern logic as developed by Frege, Peirce, Russell and others’ (Orenstein 2002: 16). Moreover, the philosopher is to make the translation in such a way as to minimize the theory’s ontological commitments. Only after such a translation, which Quine calls ‘explication’, can one say, at a philosophical level: ‘that is What There Is’. (However, Quine cannot fully capitalize those letters, as it were. For he thinks that there is a pragmatic element to ontology. See section 3.a below.) This role for philosophy is a reduced one. For one thing, it deprives philosophy of something traditionally considered one of its greatest aspirations: necessary truth. On Quine’s conception, no truth can be absolutely necessary. (That holds even for the truths of Quine’s beloved logic, since they, too, fall within the web of belief.) By contrast, even Strawson and the positivists – the latter in the form of ‘analytic truth’ – had countenanced versions of necessary truth.
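A standard illustration of how regimentation into canonical notation can trim ontological commitments comes from Quine’s ‘On What There Is’ (collected in the 1980 volume cited above); the formalization below is the usual textbook rendering rather than a quotation. The apparent name ‘Pegasus’ is traded for a predicate, so that ‘Pegasus does not exist’ goes into canonical notation as

\[ \neg \exists x \, \mathrm{Pegasizes}(x) \]

On this rendering the sentence carries no commitment to an object named ‘Pegasus’; the only entities a theory is committed to are those that must be admitted as values of its bound variables.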

Saul Kripke – the third important reviver of metaphysics – allows the philosopher a role that is perhaps slightly more distinct than the one Quine allows. Kripke does that precisely by propounding a new notion of necessity. (That said, some identify Ruth Barcan Marcus as the discoverer of the necessity at issue.) According to Kripke (1980), a truth T about X is necessary just when T holds in all possible worlds that contain X. To explain: science shows us that, for example, water is composed of H2O; the philosophical question is whether that truth holds of all possible worlds (all possible worlds in which water exists) and is thereby necessary. Any such science-derived necessities are aposteriori just because, and in the sense that, they are (partially) derived from science.
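The formal backbone of such examples is the necessity of identity, provable in quantified modal logic; the derivation below is the standard one associated with Barcan Marcus and Kripke, reconstructed here for illustration rather than quoted from the works cited.

\[
\begin{aligned}
&(1)\ \forall x \, \Box (x = x) && \text{everything is necessarily self-identical} \\
&(2)\ \forall x \forall y \, \big( x = y \rightarrow ( \Box (x = x) \rightarrow \Box (x = y) ) \big) && \text{substitutivity of identicals} \\
&(3)\ \forall x \forall y \, \big( x = y \rightarrow \Box (x = y) \big) && \text{from (1) and (2)}
\end{aligned}
\]

Given the empirical identification of water with H2O, the identity then holds in every possible world in which water exists, although it is knowable only aposteriori.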

Aposteriori necessity is a controversial idea. Kripke realizes this. But he asks why it is controversial. The notions of the apriori and aposteriori are epistemological (they are about whether or not one needs to investigate the world in order to know something), whereas – Kripke points out – his notion of necessity is ontological (that is, about whether things could be otherwise). As to how one determines whether a truth obtains in all possible worlds, Kripke’s main appeal is to the intuitions of philosophers. The next subsection somewhat scrutinizes that appeal, together with some of the other ideas of this subsection.

e. Naturalism including Experimentalism and Its Challenge to Intuitions

Kripke and especially Quine helped to create, particularly in the United States, a new orthodoxy within Analytic philosophy. That orthodoxy is naturalism or – the term used by its detractors – scientism. But naturalism (/scientism) is no one thing (Glock 2003a: 46; compare Papineau 2009). Ontological naturalism holds that the entities treated by natural science exhaust reality. Metaphilosophical naturalism – which is the focus in what follows – asserts a strong continuity between philosophy and science. A common construal of that continuity runs thus. Philosophical problems are in one way or another ‘tractable through the methods of the empirical sciences’ (Naturalism, Introduction). Now, within metaphilosophical naturalism, one can distinguish empirical philosophers from experimental philosophers (Prinz 2008). Empirical philosophers enlist science to answer, or to help answer, philosophical problems. Experimental philosophers (or ‘experimentalists’) themselves do science, or do so in collaboration with scientists. Let us start with empirical philosophy.

Quine is an empirical philosopher in his approach to metaphysics and even more so in his approach to epistemology. Quine presents and urges his epistemology thus: ‘The stimulation of his sensory receptors is all the evidence anybody has had to go on, ultimately, in arriving at his picture of the world. Why not just see how this construction really proceeds? Why not settle for psychology?’ (Quine 1977: 75). Such naturalistic epistemology – in Quine’s own formulation, ‘naturalized epistemology’ – has been extended to moral epistemology. ‘A naturalized moral epistemology is simply a naturalized epistemology that concerns itself with moral knowledge’ (Campbell and Hunter 2000: 1). There is such a thing, too, as naturalized aesthetics: the attempt to use science to solve aesthetical problems (McMahon 2007). Other forms of empirical philosophy include neurophilosophy, which applies methods from neuroscience, and sometimes computer science, to questions in the philosophy of mind.

Naturalized epistemology has been criticized for being insufficiently normative. How can descriptions of epistemic mechanisms determine license for belief? The difficulty seems especially pressing in the case of moral epistemology. Wittgenstein’s complaint against naturalistic aesthetics – a view he called ‘exceedingly stupid’ – may intend a similar point. ‘The sort of explanation one is looking for when one is puzzled by an aesthetic impression is not a causal explanation, not one corroborated by experience or by statistics as to how people react’ (all Wittgenstein 1966: 17, 21). A wider disquiet about metaphilosophical naturalism is this: it presupposes a controversial view explicitly endorsed by Quine, namely that science alone provides true or good knowledge (Glock 2003a: 28, 46). For that reason and for others, some philosophers, including Wittgenstein, are suspicious even of scientifically-informed philosophy of mind.

Now the experimentalists – the philosophers who actually do science – tend to use science not to propose new philosophical ideas or theories but rather to investigate existing philosophical claims. The philosophical claims at issue are based upon intuitions, intuitions being something like ‘seemings’ or spontaneous judgments. Sometimes philosophers have employed intuitions in support of empirical claims. For example, some ethicists have asserted, from their philosophical armchairs, that character is the most significant determinant of action. Another example: some philosophers have speculated that most people are ‘incompatibilists’ about determinism. (The claim in this second example is, though empirical, construable as a certain type of second-order intuition, namely, as a claim that is empirical, yet made from the armchair, about the intuitions that other people have.) Experimentalists have put such hunches to the test, often concluding that they are mistaken (see Levin 2009 and Levy 2009). At other times, though, the type of intuitively-based claim that experimentalists investigate is non-empirical or at least not evidently empirical. Here one finds, for instance, intuitions about what counts as knowledge, about whether some feature of something is necessary to it (recall Kripke, above), about what the best resolution of a moral dilemma is, and about whether or not we have free will. Now, experimentalists have not quite tested claims of this second sort. But they have used empirical methods in interrogating the ways in which philosophers, in considering such claims, have employed intuitions. Analytic philosophers have been wont to use their intuitions about such non-empirical matters to establish burdens of proof, to support premises, and to serve as data against which to test philosophical theories. But experimentalists have claimed to find that, at least in the case of non-philosophers, intuitions about such matters vary considerably. (See for instance Weinberg, Nichols and Stich 2001.) So, why privilege the intuitions of some particular philosopher?

Armchair philosophers have offered various responses. One is that philosophers’ intuitions diverge from ‘folk’ intuitions only in this way: the former are more considered versions of the latter (Levin 2009). But might not such considered intuitions vary among themselves? Moreover: why trust even considered intuitions at all? Why not think – with Quine (and William James, Richard Rorty, Nietzsche, and others) – that intuitions are sedimentations of culturally or biologically inherited views? A traditional response to that last question (an ‘ordinary language’ response and equally, perhaps, an ‘ideal language’ response) runs as follows. Intuitions do not convey views of the world. Rather they convey an implicit knowledge of concepts or of language. A variation upon that reply gives it a more naturalistic gloss. The idea here is that (considered) intuitions, though indeed ‘synthetic’ and, as such, defeasible, represent good prima facie evidence for the philosophical views at issue, at least if those views are about the nature of concepts (see for instance Graham and Horgan 1994).

3. Pragmatism, Neopragmatism, and Post-Analytic Philosophy

a. Pragmatism

The original or classical pragmatists are the North Americans C.S. Peirce (1839–1914), William James (1842–1910), John Dewey (1859–1952) and, perhaps, G. H. Mead. The metaphilosophy of pragmatism unfolds from that which became known as ‘the pragmatic maxim’.

Peirce invented the pragmatic maxim as a tool for clarifying ideas. His best known formulation of the maxim runs thus: ‘Consider what effects, which might conceivably have practical bearings, we conceive the object of our conception to have. Then, our conception of these effects is the whole of our conception of the object’ (Peirce 1931-58, volume 5: section 402). Sometimes the maxim reveals an idea to have no meaning. Such was the result, Peirce thought, of applying the maxim to transubstantiation, and, indeed, to many metaphysical ideas. Dewey deployed the maxim similarly. He saw it ‘as a method for inoculating ourselves against certain blind alleys in philosophy’ (Talisse and Aikin 2008: 17). James construed the maxim differently. Whereas Peirce seemed to hold that the ‘effects’ at issue were, solely, effects upon sensory experience, James extended those effects into the psychological effects of believing in the idea(s) in question. Moreover, whereas Peirce construed the maxim as a conception of meaning, James turned it into a conception of truth. ‘“The true”’ is that which, ‘in almost any fashion’, but ‘in the long run and on the whole’, is ‘expedient in the way of our thinking’ (James 1995: 86). As a consequence of these moves, James thought that many philosophical disputes were resolvable, and were only resolvable, through the pragmatic maxim.

None of the pragmatists opposed metaphysics as such or as a whole. That may be because each of them held that philosophy is not fundamentally different to other inquiries. Each of Peirce, James and Dewey elaborates the notion of inquiry, and the relative distinctiveness of philosophy, in his own way. But there is common ground on two views. (1) Inquiry is a matter of coping. Dewey, and to an extent James, understand inquiry as an organism trying to cope with its environment. Indeed Dewey was considerably influenced by Darwin. (2) Experimental science is the exemplar of inquiry. One finds this second idea in Dewey but also and especially in Peirce. The idea is that experimental science is the best method or model of inquiry, be the inquiry practical or theoretical, descriptive or normative, philosophical or non-philosophical. ‘Pragmatism as attitude represents what Mr. Peirce has happily termed the “laboratory habit of mind” extended into every area where inquiry may fruitfully be carried on’ (Dewey 1998, volume 2: 378). Each of these views (that is, both 1 and 2) may be called naturalistic (the second being a version of metaphilosophical naturalism; q.v. section 2.e).

By the pragmatists’ own account (though Peirce is perhaps an exception), pragmatism was a humanism. Its purpose was to serve humanity. Here is James (1995: 2): ‘no one of us can get along without the far-flashing beams of light it sends over the world’s perspectives’, the ‘it’ here being pragmatist philosophy and also philosophy in general. James held further that pragmatism, this time in contrast with some other philosophies, allows the universe to appear as ‘a place in which human thoughts, choices, and aspirations count for something’ (Gallie 1952: 24). As to Dewey, he held the following. ‘Ideals and values must be evaluated with respect to their social consequences, either as inhibitors or as valuable instruments for social progress’; and ‘philosophy, because of the breadth of its concern and its critical approach, can play a crucial role in this evaluation’ (Dewey, section 4). Indeed, according to Dewey, philosophy is to be ‘a social hope reduced to a working programme of action, a prophecy of the future, but one disciplined by serious thought and knowledge’ (Dewey 1998, vol. 1: 72). Dewey himself pursued such a programme, and not only in his writing – in which he championed a pervasive form of democracy – but also (and to help enable such democracy) as an educationalist.

Humanism notwithstanding, pragmatism was not hostile to religion. Dewey could endorse religion as a means of articulating our highest values. James tended to hold that the truth of religious ideas was to be determined, at the broadest level, in the same way as the truth of anything else. Peirce, for his part, was a more traditional philosophical theist. The conceptions of religion advocated by James and Dewey have been criticized for being very much reconceptions (Talisse and Aikin 2008: 90–94). A broader objection to pragmatist humanism is that to make man the measure of all things is false and even pernicious. One finds versions of that objection in Heidegger and Critical Theory. One could level the charge, too, from the perspective of environmental ethics. Rather differently, and even more broadly, one might think that ‘moral and political ambitions’ have no place ‘within philosophy proper’ (Glock 2003a: 22 glossing Quine). Objections of a more specific kind have targeted the pragmatic maxim. Critics have faulted Peirce’s version of the pragmatic maxim for being too narrow or too indeterminate; and Russell and others have criticized James’ version as a misanalysis of what we mean by ‘true’.

Pragmatism was superseded (most notably in the United States) or occluded (in those places where it took little hold in the first place) by logical positivism. But the metaphilosophy of logical positivism has important similarities to pragmatism’s. Positivism’s verifiability principle is very similar to Peirce’s maxim. The positivists held that science is the exemplar of inquiry. And the positivists, like the pragmatists, aimed at the betterment of society. Note also that positivism itself dissolved partly because its original tenets underwent a ‘“pragmaticization”’ (Rorty 1991b: xviii). That pragmaticization was the work especially of Quine and Davidson, who are ‘logical pragmatists’ in that they use logical techniques to develop some of the main ideas of the pragmatists (Glock 2003a: 22–3; see also Rynin 1956). The ideas at issue include epistemological holism and the underdetermination of various types of theory by evidence. The latter is the aforementioned (section 2.d.iii) pragmatic element within Quine’s approach to ontology (on which see further Quine’s Philosophy of Science, section 3).

b. Neopragmatism: Rorty

The label ‘neopragmatism’ has been applied to Robert Brandom, Susan Haack, Nicholas Rescher, Richard Rorty, and other thinkers who, like them, identify themselves with some part(s) of classical pragmatism. (Karl-Otto Apel, Jürgen Habermas, John McDowell, and Hilary Putnam are borderline cases; each takes much from pragmatism but is wary about ‘pragmatist’ as a self-description.) This section concentrates upon the best known, most controversial, and possibly the most metaphilosophical, of the neopragmatists: Rorty.

Much of Rorty’s metaphilosophy issues from his antirepresentationalism. Antirepresentationalism is, in the first instance, this view: no representation (linguistic or mental conception) corresponds to reality in a way that exceeds our commonsensical and scientific notions of what it is to get the world right. Rorty’s arguments against the sort of privileged representations that are at issue here terminate or summarize as follows. ‘[N]othing counts as justification unless by reference to what we already accept […] [T]here is no way to get outside our beliefs and our language so as to find some test other than coherence’ (Rorty 1980: 178). Rorty infers that ‘the notion of “representation,” or that of “fact of the matter,”’ has no ‘useful role in philosophy’ (Rorty 1991b: 2). We are to conceive ourselves, or our conceptions, not as answerable to the world, but only to our fellows (see McDowell 2000: 110).

Rorty thinks that antirepresentationalism entails the rejection of a metaphilosophy which goes back to the Greeks, found classic expression in Kant, and is pursued in Analytic philosophy. That metaphilosophy, which Rorty calls ‘epistemological’, presents philosophy as ‘a tribunal of pure reason, upholding or denying the claims of the rest of culture’ (Rorty 1980: 4). More fully: philosophy judges discourses, be they religious, scientific, moral, political, aesthetical or metaphysical, by seeing which of them, and to what degree, disclose reality as it really is. (Clearly, though, more needs to be said if this conception is to accommodate Kant’s ‘transcendental idealism’. See Kant: Metaphysics, section 4.)

Rorty wants the philosopher to be, not a ‘cultural overseer’ adjudicating types of truth claims, but an ‘informed dilettante’ and a ‘Socratic intermediary’ (Rorty 1980: 317). That is, the philosopher is to elicit ‘agreement, or, at least, exciting and fruitful disagreement’ (Rorty 1980: 318) between or within various types or areas of discourse. Philosophy so conceived Rorty calls ‘hermeneutics’. The Rortian philosopher does not seek some schema allowing two or more discourses to be translated perfectly one to the other (an idea Rorty associates with representationalism). Instead she inhabits the hermeneutic circle. ‘[W]e play back and forth between guesses about how to characterize particular statements or other events, and guesses about the point of the whole situation, until gradually we feel at ease with what was hitherto strange’ (Rorty 1980: 319). Rorty connects this procedure to the ‘edification’ that consists in ‘finding new, better, more interesting, more fruitful ways of speaking’ (Rorty 1980: 360) and, thereby, to a goal he calls ‘existentialist’: the goal of finding new types of self-conception and, in that manner, finding new ways to be.

Rorty’s elaboration of all this introduces further notable metaphilosophical views. First: ‘Blake is as much of a philosopher as Fichte and Henry Adams more of a philosopher than Frege’ (Rorty 1991a: xv). For Sellars was right, Rorty believes, to define philosophy as ‘an attempt to see how things, in the broadest possible sense of the term, hang together, in the broadest possible sense of the term’ (Sellars 1963: 1; compare section 6, Sellars’ Philosophy of Mind; presumably, though, Rorty holds that one has good philosophy when such attempts prove ‘edifying’). Second: what counts as a philosophical problem is contingent, and not just in that people only discover certain philosophical problems at certain times. Third: philosophical argument, at least when it aspires to be conclusive, requires shared assumptions; where there are no or few shared assumptions, such argument is impossible.

The last of the foregoing ideas is important for what one might call Rorty’s practical metaphilosophy. Rorty maintains that one can argue about morals and/or politics only with someone with whom one shares some assumptions. The neutral ground that philosophy has sought for debates with staunch egoists and unbending totalitarians is a fantasy. All the philosopher can do, besides point that out, is to create a conception that articulates, but does not strictly support, his or her moral or political vision. The philosopher ought to be ‘putting politics first and tailoring a philosophy to suit’ (Rorty 1991b: 178) – and similarly for morality. Rorty thinks that no less a political philosopher than John Rawls has already come close to this stance (Rorty 1991b: 191). Nor does Rorty bemoan any of this. The ‘cultural politics’ which suggests ‘changes in the vocabularies deployed in moral and political deliberation’ (Rorty 2007: ix) is more useful than the attempt to find philosophical foundations for some such vocabulary. The term ‘cultural politics’ could mislead, though. Rorty does not advocate an exclusive concentration on cultural as against social or economic issues. He deplores the sort of philosophy or cultural or literary theory that makes it ‘almost impossible to clamber back down […] to a level […] on which one might discuss the merits of a law, a treaty, a candidate, or a political strategy’ (Rorty 2007: 93).

Rorty’s metaphilosophy, and the philosophical views with which it is intertwined, have been attacked as irrationalist, self-refuting, relativist, unduly ethnocentric, complacent, anti-progressive, and even as insincere. Even Rorty’s self-identification with the pragmatist tradition has been challenged (despite the existence of at least some clear continuities). So have his readings, or appropriations, of his philosophical heroes, who include not only James and Dewey but also Wittgenstein, Heidegger and, to a lesser extent, Davidson and Derrida. For a sample of all these criticisms, see Brandom 2000 (which includes replies by Rorty) and Talisse and Aikin 2008: 140–148.

c. Post-Analytic Philosophy

‘Post-Analytic philosophy’ is a vaguely defined term for something that is a current rather than a group or school. The term (in use as early as Rajchman and West 1985) denotes the work of philosophers who owe much to Analytic philosophy but who think that they have made some significant departure from it. Often the departures in question are motivated by pragmatist allegiance or influence. (Hence the placing of this section.) The following are all considerably pragmatist and are all counted as post-Analytic philosophers: Richard Rorty; Hilary Putnam; Robert Brandom; John McDowell. Still, those same figures exhibit, also, a turn to Hegel (a turn rendered slightly less remarkable by Hegel’s influence upon Peirce and especially upon Dewey). Some Wittgensteinians count as post-Analytic too, as might the later Wittgenstein himself. Stanley Cavell stands out here, though in one way or another Wittgenstein strongly influenced most of the philosophers mentioned in this paragraph. Another common characteristic of those deemed post-Analytic is interest in a range of ‘Continental’ thinkers. Rorty looms large here. But there is also the aforementioned interest in Hegel, and, for instance, the fact that one finds McDowell citing Gadamer.

Post-Analytic philosophy is associated with various more or less metaphilosophical views. One is the rejection or severe revision of any notion of philosophical analysis. Witness Rorty, Brandom’s self-styled ‘analytic pragmatism’, and, perhaps, metaphilosophical naturalism (q.v. section 2.e). (Still: only rarely – as in Graham and Horgan 1994, who advocate what they call ‘Post-Analytic Metaphilosophy’ – do naturalists call themselves ‘post-Analytic’.) Some post-Analytic philosophers go further, in that they tend, often under the influence of Wittgenstein, to attempt less to solve and more to dissolve or even discard philosophical problems. Each of Putnam, McDowell and Rorty has his own version of this approach, and each singles out for dissolution the problem of how mind or language relates to the world. A third characteristic feature of post-Analytic philosophy is the rejection of a certain kind of narrow professionalism. That sort of professionalism is preoccupied with specialized problems and tends to be indifferent to broader social and cultural questions. One finds a break from such narrow professionalism in Cavell, in Rorty, in Bernard Williams, and to an extent in Putnam (although also in such “public” Analytic philosophers as A. C. Grayling).

Moreover, innovative or heterodox style is something of a criterion of post-Analytic philosophy. One thinks here especially of Cavell. But one might mention McDowell too. Now, one critic of McDowell faults him for putting ‘barriers of jargon, convolution, and metaphor before the reader hardly less formidable than those characteristically erected by his German luminaries’ (Wright 2002: 157). The criticism betokens the way in which post-Analytic philosophers are often regarded, namely as apostates. Post-Analytic philosophers tend to defend themselves by arguing that Analytic philosophy needs to reconnect itself with the rest of culture, and/or that Analytic philosophy has itself shown the untenability of some of its most central assumptions and even perhaps ‘come to the end of its own project—the dead end’ (Putnam 1985: 28).

4. Continental Metaphilosophy

a. Phenomenology and Related Currents

i. Husserl’s Phenomenology

Phenomenology, as pursued by Edmund Husserl, describes phenomena. Phenomena are things in the manner in which they appear. That definition becomes more appreciable through the technique by which Husserl means to gain access to phenomena. Husserl calls that technique the epoche (a term that owes to Ancient Greek skepticism). He designates the perspective that it achieves – the perspective that presents one with ‘phenomena’ – ‘the phenomenological reduction’. The epoche consists in suspending ‘the natural attitude’ (another term of Husserl’s coinage). The natural attitude comprises assumptions about the causes, the composition, and indeed the very existence of that which one experiences. The epoche, Husserl says, temporarily ‘brackets’ these assumptions, or puts them ‘out of play’ – allowing one to describe the world solely in the manner in which it appears. That description is phenomenology.

Phenomenology means to have epistemological and ontological import. Husserl presents the epistemological import – to begin with that – in a provocative way: ‘If “positivism” is tantamount to an absolutely unprejudiced grounding of all sciences on the “positive,” that is to say, on what can be seized upon originaliter, then we are the genuine positivists’ (Husserl 1931:  20). The idea that Husserl shares with the positivists is that experience is the sole source of knowledge. Hence Husserl’s ‘principle of all principles’: ‘whatever presents itself in “intuition” in primordial form […] is simply to be accepted as it gives itself out to be, but obviously only within the limits in which it thus presents itself’ (Husserl 1931: section 24). However, and like various other philosophers (including William James and the German Idealists), Husserl thinks that experience extends beyond what empiricism makes of it. For one thing – and this reveals phenomenology’s intended ontological import – experience can be of essences. A technique of ‘imaginative variation’ similar to Descartes’ procedure with the wax (see Descartes, section 4) allows one to distinguish that which is essential to a phenomenon and, thereby, to make discoveries about the nature of such phenomena as numbers and material things. Now, one might think that this attempt to derive essences from phenomena (from things in the manner in which they appear) must be idealist. Indeed – and despite the fact that he used the phrase ‘to the things themselves!’ as his slogan – Husserl did avow a ‘transcendental idealism’, whereby ‘transcendental subjectivity […] constitutes sense and being’ (Husserl 1999b: section 41). However, the exact content of that idealism – i.e. the exact meaning of the phrase just quoted – is a matter of some interpretative difficulty. It is evident enough, though, that Husserl’s idealism involves (at least) the following ideas. Experience necessarily involves various ‘subjective achievements’. Those achievements comprise various operations that Husserl calls ‘syntheses’ and which one might (although here one encounters difficulties) call ‘mental’. Moreover, the achievements are attributable to a subjectivity that deserves the name ‘transcendental’ in that (1) the achievements are necessary conditions for our experience, (2) the subjectivity at issue is transcendent in this sense: it exists outside the natural world (and, hence, cannot entirely be identified with what we normally construe as the mind). (On the notion of the transcendental, see further Kant’s transcendental idealism and transcendental arguments.)

Husserl argued that the denial of transcendental subjectivity ‘decapitates philosophy’ (Husserl 1970: 9). He calls such philosophy ‘objectivism’ and asserts that it confines itself to the ‘universe of mere facts’ and allies itself with the sciences. (Thus Husserl employs ‘positivism’ and ‘naturalism’ as terms with similar import to ‘objectivism’.) But objectivism cannot even understand science itself, according to Husserl; for science, he maintains, presupposes the achievements of transcendental subjectivity. Further, objectivism can make little sense of the human mind, of humanity’s place within nature, and of values. These latter failings contribute to a perceived meaninglessness to life and a ‘fall into hostility toward the spirit and into barbarity’ (Husserl 1970: 9). Consequently, and because serious investigation of science, mind, our place in nature, and of values belongs to Europe’s very raison d’être, objectivism helps to cause nothing less than a ‘crisis of European humanity’ (Husserl 1970: 299). There is even some suggestion (in the same text) that objectivism prevents us from experiencing people as people: as more than mere things.

The foregoing shows that phenomenology has a normative aspect. Husserl did make a start upon a systematic moral philosophy. But phenomenology is intrinsically ethical (D. Smith 2003: 4–6), in that the phenomenologist eschews prejudice and seeks to divine matters for him- or herself.

ii. Existential Phenomenology, Hermeneutics, Existentialism

Husserl hoped to found a unified and collaborative movement. His hope was partially fulfilled. Heidegger, Sartre and Merleau-Ponty count as heirs to Husserl because (or mainly because) they believed in the philosophical primacy of description of experience. Moreover, many of the themes of post-Husserlian phenomenology are present already, one way or another, in Husserl. But there are considerable, and indeed metaphilosophical, differences between Husserl and his successors. The metaphilosophical differences can be unfolded from this: Heidegger, Sartre and Merleau-Ponty adhere to an ‘existential’ phenomenology. ‘Existential phenomenology’ has two senses. Each construal matters metaphilosophically.

In one sense, ‘existential phenomenology’ denotes phenomenology that departs from Husserl’s self-proclaimed ‘pure’ or ‘transcendental’ phenomenology. At issue here is this view of Husserl’s: it is logically possible that a consciousness could survive the annihilation of everything else (Husserl 1999b: section 13). Existential phenomenologists deny the view. For they accept a kind of externalism whereby experience, or the self, is what it is – and not just causally – by dint of the world that is experienced. (On externalism, see Philosophy of Language, section 4a and Mental Causation, section 3.b.ii.) Various slogans and terms within the work of the existential phenomenologists express these views. Heidegger’s Being and Time presents the human mode of being as ‘being-in-the-world’ and speaks not of ‘the subject’ or ‘consciousness’ but of ‘Da-sein’ (‘existence’ or, more literally, ‘being-there’). Merleau-Ponty asserts that we are ‘through and through compounded of relationships with the world’, ‘destined to the world’ (2002: xi–xv). In Being and Nothingness, Sartre uses such formulations as ‘consciousness (of) a table’ (sic) in order to signal his rejection of ‘the “reificatory” idea of consciousness as some thing or container distinct from the world in the midst of which we are conscious’ (as Cooper puts it – Cooper 1999: 201).

Existential phenomenology, so construed, has metaphilosophical import because it has methodological implications. Being and Nothingness holds that the inseparability of consciousness from the objects of consciousness ruins Husserl’s method of epoche (Sartre 1989: part one, chapter one; Cerbone 2006: 1989). Merleau-Ponty may not go as far. His Phenomenology of Perception has it that, because we are ‘destined to the world’, ‘The most important lesson of the reduction is the impossibility of a complete reduction’ (2002: xv). But the interpretation of this remark is debated (see J. Smith 2005). At any rate, Merleau-Ponty found a greater philosophical use for the empirical sciences than did Husserl. Heidegger was more inclined to keep the sciences in their place. But he too – partly because of his existential (externalist) conception of phenomenology – differed from Husserl on the epoche. Again, however, Heidegger’s precise position is hard to discern. (Caputo 1977 describes the interpretative problem and tries to solve it.) Still, Heidegger’s principal innovation in philosophical method has little to do with the epoche. This article considers that innovation before turning to the other sense of existential phenomenology.

Heidegger’s revisions of phenomenological method place him within the hermeneutic tradition. Hermeneutics is the art or practice of interpretation. The hermeneutic tradition (sometimes just called ‘hermeneutics’) is a tradition that gives great philosophical weight to an interpretative mode of understanding. Members of this tradition include Friedrich Schleiermacher (1768–1834), Wilhelm Dilthey (1833–1911) and, after Heidegger, Hans-Georg Gadamer (1900–2002) and Paul Ricœur. Heidegger is hermeneutical in that he holds the following. All understanding is interpretative in that it always has preconceptions. One has genuine understanding insofar as one has worked through the relevant preconceptions. One starts ‘with a preliminary, general view of something; this general view can guide us to insights, which then lead – should lead – to a revised general view, and so on’ (Polt 1999: 98). This ‘hermeneutic circle’ has a special import for phenomenology. For (according to Heidegger) our initial understanding of our relations to the world involves some particularly misleading and stubborn preconceptions, some of which derive from philosophical tradition. Heidegger concludes that what is necessary is ‘a destruction—a critical process in which the traditional concepts, which at first must necessarily be employed, are deconstructed down to the [experiential, phenomenological] sources from which they were drawn’ (Heidegger 1988: 22f.). But Heidegger’s position may be insufficiently, or inconsistently, hermeneutical. The thought is that Heidegger’s own views entail a thesis that, subsequently, Gadamer propounded explicitly. Namely: ‘The very idea of a definitive interpretation [of anything] seems to be intrinsically contradictory’ (Gadamer 1981: 105). This thesis, which Gadamer reaches by conceiving understanding as inherently historical and linguistic, bodes badly for Heidegger’s aspiration to provide definitive ontological answers (an aspiration that he possessed at least as much as Husserl did). Yet arguably (compare Mulhall 1996: 192–5) that very result gels with another of Heidegger’s goals, namely, to help his readers to achieve authenticity (on which more momentarily).

The second meaning or construal of ‘existential phenomenology’ is existentialism. Gabriel Marcel invented that latter term for ideas held by Sartre and by Simone de Beauvoir. Subsequently, Heidegger, Merleau-Ponty, Camus, Karl Jaspers, Kafka, and others, got placed under the label. A term used so broadly is hard to define precisely. But the following five theses each have a good claim to be called ‘existentialist’. Indeed: each of the major existential phenomenologists held some version of at least most of the theses (although, while Sartre came to accept the label ‘existentialist’, Heidegger did not).

  1. One’s life determines ever anew the person that one is.
  2. One is free to determine one’s life and, hence, one’s identity.
  3. There is no objective moral order that can determine one’s values. One encounters values within the world (indeed, one encounters them bound up with facts); but nothing rationally compels decision between values.
  4. 1–3 perturb. Hence a tendency towards the inauthenticity (Heidegger’s term) or bad faith (Sartre’s term) which consists in the denial or refusal of those points – often by letting society determine one’s values and/or identity.
  5. The relation to one’s death – as well as to certain types of anxiety and absurdity or groundlessness – is important for disclosing possibilities of authentic existence.

These theses indicate that, for the existentialist, philosophy must be practical. It is not, though, that existentialism puts ethics at the heart of philosophy. That is because a further central existentialist idea is that no-one, even in principle, can legislate values for another. True, Sartre declared freedom to be ‘the foundation of all values’ (Sartre 2007: 61); and he wrote Notebooks for an Ethics. According to the ethic in question, to will one’s own freedom is to will the freedom of others. But in no further way does that ethic make much claim to objectivity. Instead, much of it turns upon the ‘good faith’ that consists in not denying the fact of one’s freedom.

What of politics? Little in Husserl fits a conventional understanding of political philosophy. Sartre came to hold that his existential ethics made sense only for a society that had been emancipated by Marxism (Sartre 1963: xxv–xxvi). Merleau-Ponty developed a phenomenologically informed political philosophy – and disagreed with Sartre on concrete political questions and on the manner in which the philosopher should be ‘engaged’ (Diprose and Reynolds: ch. 8; Carman and Hansen 2005: ch. 12). Sartre and Merleau-Ponty give one to think, also, about the idea of artistic presentations of philosophy (Diprose and Reynolds: ch.s 9 and 18). What of Heidegger? He was, of course, a Nazi, although for how long – how long after he led the ‘Nazification’ of Freiburg University – is debated, as is the relation between his Nazism and his philosophy (Wolin 1993; Young 1997; see also section 4.c below). Now the ‘Heidegger case’ raises, or makes more urgent, some general metaphilosophical issues. Should philosophers get involved in politics? And was Gilbert Ryle right to say – as allegedly, apropos Heidegger, he did say (Cohen 2002: 337 n. 21) – that ‘a shit from the heels up can’t do good philosophy’?

The foregoing material indicates a sense in which phenomenology is its own best critic. Indeed, some reactions against phenomenology and existentialism as such – against the whole or broad conception of philosophy they embody – owe to apostates or to heterodox philosophers within those camps. We saw that, in effect, Sartre came to think that existentialism was insufficient for politics. In fact, he came to hold this: ‘Every philosophy is practical, even the one which at first sight appears to be the most contemplative [. . . Every philosophy is] a social and political weapon’ (Sartre 1960: 5). Levinas accused phenomenologists prior to himself of ignoring an absolutely fundamental ethical dimension to experience (see Davis 1996). Derrida resembles Sartre and Levinas, in that, like them, he developed his own metaphilosophy (treated below) largely via internal criticism of phenomenology. Another objection to phenomenology is that it collapses philosophy into psychology or anthropology. (Husserl himself criticized Heidegger in that way.) Rather differently, some philosophers hold that, despite its attitude to naturalism, phenomenology needs to be naturalized (Petitot et al. 1999). As to existentialism, it has been criticized for ruining ethics and for propounding an outlook that is not only an intellectual mistake but also – and Heidegger is taken as the prime exhibit – politically dangerous (see Adorno 1986 and ch. 8 of Wolin).

b. Critical Theory

‘Critical Theory’ names the so-called Frankfurt School – the tradition associated with the Institute of Social Research (Institut für Sozialforschung) which was founded in Frankfurt in 1924. (See Literary Theory section 1 for a wider or less historical notion of Critical Theory.) According to Critical Theory, the point of philosophy is that it can contribute to a critical and emancipatory social theory. The specification of that idea depends upon which Critical Theory is at issue; Critical Theory is an extended and somewhat diverse tradition. Its first generation included Theodor Adorno, Max Horkheimer and Herbert Marcuse. Most of the members of this generation had Jewish backgrounds. For that reason, and because the Institute was Marxist, the first generation fled the Nazis. The Institute re-opened in Frankfurt in 1950. Within the second generation, the most prominent figures are Jürgen Habermas and Albrecht Wellmer. Within the third, Axel Honneth is the best known. There is a fourth generation too. Moreover, there were stages or phases within the first generation. Following Dubiel (1985), we may distinguish, within that generation: (i) an initial stage (1924 to around 1930) in which the school was more traditionally Marxist than it was subsequently; (ii) a ‘materialist’ stage (1930–1937); (iii) a stage (1937–1940) that began with the adoption of the label ‘Critical Theory’; and (iv) the ‘critique of instrumental reason’ (1940–1945). The treatment of first-generation Critical Theory that follows confines itself to stages (iii) and (iv).

i. Critical Theory and the Critique of Instrumental Reason

It was Horkheimer who introduced the term ‘the critical theory of society’ (‘Critical Theory’ for short) in 1937. He was director of the Institute at the time. He introduced the phrase partly from prudence. By 1937 the Frankfurt School was in the United States, where it was unwise to use the word ‘Marxist’ or even ‘materialist’. But prudence was not the only motive for the new name. Horkheimer meant to clarify and shape the enterprise he was leading. That enterprise, he proposed (see Horkheimer 1937), was the construction of a social theory that was, for one thing, broad. It treats society as a whole or in all its aspects. That breadth, together with the idea that society is more independent of the economy than traditional Marxism recognizes, means that Critical Theory ought to be interdisciplinary. (The expertise of the first generation encompassed economics, sociology, law, politics, psychology, aesthetics and philosophy.) Next, Critical Theory is emancipatory. It aims at a society that is rational and free and which meets the needs of all. It is to that end that Critical Theory is critical. It means to reveal how contemporary capitalist society, in its economy and its culture and in their interplay, deceives and dominates.

Critical Theory so defined involves philosophy in several ways. (1) From its inception, it adapted philosophical ideas, especially from German Idealism, in order to analyze society. Nonetheless, and following Lukács, (2) Critical Theory thought that some parts of some philosophies could be understood as unknowing reflections of social conditions. (3) Philosophy has a role to play, not as the normative underpinning of the theory, but in justifying the lack of such underpinning. To begin to explain that third point: Horkheimer and company little specified the rational society they sought and little defended the norms by which they indicted contemporary society. With Marx, they held that one should not legislate for what should be the free creation of the future. With Hegel, they held that, anyway, knowledge is conditioned by its time and place. They held also, and again in Hegelian fashion, that there are norms that exist (largely unactualized) within capitalism – norms of justice and freedom and so forth – which suffice to indict capitalism. (4) Critical Theory conceives itself as philosophy’s inheritor. Philosophy, especially post-Kantian German Idealism, had tried to overcome various types of alienation. But only the achievement of a truly free society could actually do that, according to Critical Theory. Note lastly here that, at least after 1936, Critical Theory denied both that ostensibly Marxist regimes were such and that emancipation was anywhere near at hand. Consequently, this stage of Critical Theory tended to aim less at revolution and more at propagating awareness of the faults of capitalism and (to a lesser extent) of ‘actually existing socialism’.

There is a sense in which philosophy looms larger (or even larger) in the next phase of the first generation of Critical Theory. For this phase of the movement (the ‘critique of instrumental reason’ phase) propounded that which we might call (with a nod to Lyotard) a (very!) grand narrative. Adorno and Horkheimer are the principal figures of this phase, and their co-authored Dialectic of Enlightenment its main text. That text connects enlightenment to that which Max Weber had called ‘the disenchantment of the world’. To disenchant the world is to render it calculable. The Dialectic traces disenchantment from the historical Enlightenment back to the proto-rationality of myth and forward to modern industrial capitalism (to its economy, psychology, society, politics, and even to its philosophies). Weber thought that disenchantment had yielded a world wherein individuals were trapped within an ‘iron cage’ (his term) of economy and bureaucracy. Here is the parallel idea in the Dialectic. Enlightenment has reverted to myth, in that the calculated world of contemporary capitalism is ruled, as the mythic world was ruled, by impersonal and brutish forces. Adorno and Horkheimer elaborate via the idea of instrumental reason (although, actually, the preferred term in Dialectic of Enlightenment – and in Horkheimer’s Eclipse of Reason, something of a popularization of the Dialectic – is ‘subjective reason’). Disenchantment produces a merely instrumental reason in that it pushes choice among ends outside of the purview of rationality. That said, the result – Horkheimer and Adorno argue – is a kind of instrumentalization of ends. Ends get replaced, as a kind of default, by things previously regarded merely instrumentally. Thus, at least or especially by the time of contemporary capitalism, life comes to be governed by such means-become-ends as profit, technical expertise, systematization, distraction, and self-preservation.

Do these ideas really amount to Critical Theory? Perhaps they are too abstract to count as interdisciplinary. Worse: they might seem to exclude any orientation towards emancipation. True, commentators show that Adorno offered more practical guidance than was previously thought; also, first-generation Critical Theory, including the critique of instrumental reason, did inspire the 1960s student movement. However: while Marcuse responded to that movement with some enthusiasm, Adorno and Horkheimer did not. Perhaps they could not. For though they fix their hopes upon reason (upon ‘enlightenment thinking’), they indict that very same thing. They write (2002: xvi):

We have no doubt—and herein lies our petitio principii—that freedom in society is inseparable from enlightenment thinking. We believe we have perceived with equal clarity, however, that the very concept of that thinking, no less than the concrete historical forms, the institutions of society with which it is intertwined, already contains the germ of the regression.

ii. Habermas

Habermas is a principal source of the criticisms of Adorno and Horkheimer just presented. (He expresses the last of those criticisms by speaking of a ‘performative contradiction’.) Nonetheless, or exactly because he thinks that his predecessors have failed to make good upon the conception, Habermas pursues Critical Theory as Horkheimer defined it, which is to say, as broad, interdisciplinary, critical, and emancipatory social theory.

Habermas’ Critical Theory comprises, at least centrally, his ‘critique of functionalist reason’, which is a reworking of his predecessors’ critique of instrumental reason. The central thesis of the critique of functionalist reason is that the system has colonized the lifeworld. In order to understand the thesis, one needs to understand not only the notions of system, lifeworld, and colonization but also the notion of communicative action and – this being the most philosophical notion of the ensemble – the notion of communicative rationality.

Communicative action is action that issues from communicative rationality. Communicative rationality consists, roughly, in ‘free and open discussion [of some issue] by all relevant persons, with a final decision being dependent upon the strength of better argument, and never upon any form of coercion’ (Edgar 2006: 23). The lifeworld comprises those areas of life that exhibit communicative action (or, we shall see, which could and perhaps should exhibit it). The areas at issue include the family, education, and the public sphere. A system is a social domain wherein action is determined by more or less autonomous or instrumental procedures rather than by communicative rationality. Habermas counts markets and bureaucracies as among the most significant systems. So the thesis that the lifeworld has been colonized by the system is the following claim. The extension of bureaucracy and markets into areas such as the family, education, and the public sphere prevents those spheres from being governed by free and open discussion.

Habermas uses his colonization thesis to explain alienation, social instability, and the impoverishment of democracy. He maintains, further, that even systems cannot function if colonization proceeds beyond a certain point. The thinking runs thus. Part of the way in which systems undermine communicative action is by depleting resources (social, cultural and psychological) necessary for such action. But systems themselves depend upon those resources. (Note that, sometimes, Habermas uses the term ‘lifeworld’ to refer to those resources themselves rather than to a domain that does or could exhibit communicative action.) Still: Habermas makes it relatively clear that the colonization thesis is meant not only as descriptive but also as normative. For consider the following. (1) A ‘critique’ – as in ‘critique of functional reason’ – is, at least in its modern usage, an indictment. (2) Habermas presents the creation of a ‘communicative’ lifeworld as essential to the completion – a completion that he deems desirable – of what he calls ‘the unfinished project of modernity’. (3) Habermas tells us (in his Theory of Communicative Action, which is the central text for the colonization thesis) that he means to provide the normative basis for a critical theory of society.

How far does Habermas warrant the normativity, which is to say, show that colonization is bad? It is hard to be in favour of self-undermining societies. But (some degree of?) alienation might be thought a price worth paying for certain achievements; and not everyone advocates democracy (or at least the same degree or type of it). But Habermas does have the following argument for the badness of colonization. There is ‘a normative content’ within language itself, in that ‘[r]eaching understanding is the inherent telos of human speech’; and/but a colonized lifeworld, which by definition is not a domain of communicative action, thwarts that telos. (Habermas 1992a: 109 and Habermas 1984: 287 respectively.)

The idea that language has a communicative telos is the crux of Habermas’ thought. For it is central both to his philosophy of language (or to his so-called universal pragmatics) and to his ethics. To put the second of those points more accurately: the idea of a communicative telos is central to his respective conceptions of both ethics and morality. Habermas understands morality to be a matter of norms that are mainly norms of justice and which are in all cases universally binding. Ethics, by contrast, is a matter of values, where those values: express what is good for some individual or some group; have no authority beyond the individual or group concerned; and are trumped by morality when they conflict with it. Habermas has a principle, derived from the linguistic, communicative telos mentioned above, which he applies to both moral norms and ethical values. To wit: a norm or value is acceptable only if all those affected by it could accept it in reasonable – rational and uncoerced – discourse. This principle makes morality and ethics matters not for the philosopher but ‘for the discourse between citizens’ (Habermas 1992a: 158). (For more on Habermas’ moral philosophy – his ‘discourse ethics’, as it is known – and on his political philosophy, and also on the ways in which the various aspects of his thought fit together, see Finlayson 2005. Note, too, that in the twenty-first century Habermas has turned his attention to (1) that which religion can contribute to the public discourse of secular states and (2) bioethics.)

Habermas’ denial that philosophers have special normative privileges is part of his general (meta)philosophical orientation. He calls that orientation ‘postmetaphysical thinking’. In rejecting metaphysics, Habermas means to reject not only a normative privilege for philosophy but also the idea that philosophy can ‘make claims about the world as a whole’ (Dews 1995: 209). Habermas connects postmetaphysical thinking to something else too. He connects it to his rejection of that which he calls ‘the philosophy of consciousness’. Habermas detects the philosophy of consciousness in Descartes, in German Idealism, and in much other philosophy besides. Seemingly a philosophy counts as a philosophy of consciousness, for Habermas, just in case it holds this: the human subject apprehends the world in an essentially individual and non-linguistic way. To take Habermas’ so-called ‘communicative turn’ is to reject that view; it is to hold, instead, that human apprehension is at root both linguistic and intersubjective. Habermas believes that Wittgenstein, Mead, and others prefigured and even somewhat accomplished this ‘paradigm shift’ (Habermas 1992a: 173, 194).

Habermasian postmetaphysical thinking has been charged both with retaining objectionable metaphysical elements and with abandoning too many of philosophy’s aspirations. (The second criticism is most associated with Karl-Otto Apel, who nonetheless has co-operated with Habermas in developing discourse ethics. On the first criticism, see for instance Geuss 1981: 94f.) Habermas has been charged, also, with making Critical Theory uncritical. The idea here is this. In allowing that it is alright for some markets and bureaucracies to be systems, Habermas allows too much. (A related but less metaphilosophical issue, touched on above, is whether Habermas has an adequate normative basis for his social criticisms. This issue is an instance of the so-called normativity problem in Critical Theory, on which see Freyenhagen 2008; Finlayson 2009.)

Here are two further metaphilosophical issues. (1) Is it really tenable or desirable for philosophy to be as intertwined with social science as Critical Theory wishes it to be? (For an affirmative answer, see Geuss 2008.) (2) Intelligibility seems particularly important for any thinker who means ‘to reduce the tension between his own insight and the oppressed humanity in whose service he thinks’ (Horkheimer 1937: 221); but Critical Theory has been criticized as culpably obscure and even as mystificatory (see especially the pieces by Popper and Albert in Adorno et al. 1976). Adorno has been the principal target for such criticisms (and Adorno did defend his style; see Joll 2009). Yet Habermas, too, is very hard to interpret. That is partly because this philosopher of communication exhibits an ‘unbelievable compulsion to synthesize’ (Knödler-Bunte in Habermas 1992a: 124), which is to say, to combine seemingly disparate – and arguably incompatible – ideas.

c. The Later Heidegger

‘The later Heidegger’ is the Heidegger of, roughly, the 1940s onwards. (Some differences between ‘the two Heideggers’ will emerge below. But hereafter normally ‘Heidegger’ will mean ‘the later Heidegger’.) Heidegger’s difficult, radical, and influential metaphilosophy holds that: philosophy is metaphysics; metaphysics involves a fundamental mistake; metaphysics is complicit in modernity’s ills; metaphysics is entering into its end; and ‘thinking’ should replace metaphysics/philosophy.

Heidegger’s criterion of metaphysics – to start with that – is the identification of being with beings. To explain: metaphysics seeks something designatable as ‘being’ in that metaphysics seeks a principle or ground of beings; and metaphysics identifies being with beings in that it identifies this principle or ground (i.e. being) with something that is itself a being – or at least a cause or property of some being or beings. Heidegger’s favored examples of such construals of being include: the Idea in Plato; Aristotelian or Cartesian or Lockean ‘substance’; various construals of God; the Leibnizian ‘monad’; Husserlian subjectivity; and the Nietzschean ‘will to power’. Philosophy is co-extensive with metaphysics in that all philosophy since Plato involves such a project of grounding.

Now Heidegger himself holds that beings (das Seiende) have a dependence upon being (das Sein). Yet, being is ‘not God and not a cosmic ground’ (Heidegger 1994: 234). Indeed, being is identical to no being or being(s) or property or cause of any being(s) whatsoever. This distinction is ‘the ontological difference—the differentiation between being and beings’ (Heidegger 1982: 17; this statement is from Heidegger’s earlier work, but this idea, if not quite the term, persists). We may put the contention thus: pace metaphysics/philosophy, being is not ontic. But what, then, is being?

It may be that Heidegger employs ‘das Sein’ in two senses (Young 2002: ch. 1, Philipse 1998: section 13b; compare for instance Caputo 1993: 30). We might (as do Young and Philipse) use ‘being’, uncapitalized, to refer to the first of these senses and ‘Being’ (capitalized) to refer to the other. (Where both senses are in play, as sometimes they seem to be in Heidegger’s writing, this article resorts sometimes to the German das Sein. Note, however, that this distinction between two senses of Heideggerian Sein is interpretatively controversial.) In the first and as it were lowercase sense, being is what Heidegger calls sometimes a ‘way of revealing’. That is, it is something – something ostensibly non-ontic – by dint of which beings are ‘revealed’ or ‘unconcealed’ or ‘come to presence’, and indeed do so in the particular way or ways in which they do. In the second and ‘uppercase’ sense, Being is that which is responsible for unconcealment, i.e. is responsible for das Sein in the first, lowercase sense. A little more specifically, Being (in this second, uppercase sense) ‘sends’ or ‘destines’ being; accordingly, it is that from which beings are revealed, the ‘reservoir of the non-yet-uncovered, the un-uncovered’ (Heidegger 1971: 60). With this second notion of das Sein, Heidegger means to stress the following point (a point that perhaps reverses a tendency in the early Heidegger): humanity does not determine, at least not wholly, how beings are ‘unconcealed’.

One wants specification of all this. We shall see that Heidegger provides some. Nevertheless, it may be a mistake to seek an exact specification of the ideas at issue. For Heidegger may not really mean das Sein (in either sense) to explain anything. He may mean instead to stress the mysteriousness of the fact that beings are accessible to us in the form that they are and, indeed, at all.

One way in which Heidegger fills out the foregoing ideas is by positing ‘epochs’ of being, which is to say, a historical series of ontological regimes (and here lies another difference between the earlier and the later Heidegger). The series runs thus: (1) the ancient Greek understanding of being, with which Heidegger associates the word ‘physis’; (2) the Medieval Christian understanding of being, whereby beings (except God and artifacts) are divinely created things; (3) the modern understanding of being as resource (on which more below). That said, sometimes Heidegger gives a longer list of epochs, in which list the epochs correlate with metaphysical systems. Thus the idea of a ‘history of being [Seinsgeschichte] as metaphysics’ (Heidegger 2003: 65). It is important that this history, and indeed the simpler tripartite scheme, does not mean to be a history merely of conceptions of being. It means to be also a history of being itself, i.e. of ontological regimes. Heidegger holds, then, that beings are ‘unconcealed’ in different ways in different epochs (although he holds also that each metaphysic ‘absolutizes’ its corresponding ontological regime, i.e. that each metaphysic overlooks the fact that beings are unconcealed differently in different epochs; see Young 2002: 29, 54, 68).

Heidegger allows also for some ontological heterogeneity within epochs. Here one encounters Heidegger’s notion of ‘the thing’ (das Ding). Trees, hills, animals, jugs, bridges, and pictures can be Things in the emphatic sense at issue, but such Things are ‘modest in number, compared with the countless objects’. A Thing has ‘a worlding being’. It opens a world by ‘gathering’ the fourfold (das Geviert). The fourfold is a unity of ‘earth and sky, divinities and mortals’. (All Heidegger 1971: 179ff.) Some of this conception is actually fairly straightforward. Heidegger tries to show how a bridge (to take one case) can be so interwoven with human life and thereby with other entities that, via the ‘world’ that comprises those interrelations (a world not identical with any particular being), the following is the case. The Thing (the bridge), persons, and numerous other phenomena all stand in relations of mutual determination, i.e. make each other what they are.

But in modernity ontological variety is diminished, according to Heidegger. In modernity Things become mere objects. Indeed subsequently objects themselves, together with human beings, become mere resources. A resource (or ‘standing-reserve’; the German is Bestand) is something that, unlike an object, is determined wholly by a network of purposes into which we place it. Heidegger’s examples include a hydroelectric powerplant on the Rhine and an airplane, together with the electricity and fuel systems to which those artifacts are connected. Heidegger associates resources with modern science and with ‘the metaphysics of subjectivity’ within which (he argues) modern science moves. That metaphysics, which tends towards seeing man as the measure of all things, is in fact metaphysics as such, according to Heidegger. For anthropocentrism is incipient in the very beginnings of philosophy, blossoms in various later philosophers including Descartes and Kant, and reaches its apogee in Nietzsche, the extremity of whose anthropocentrism is the end of metaphysics. It is the end of metaphysics (or, pleonastically: of the metaphysics of subjectivity) in that here, in Nietzsche’s extreme anthropocentrism, metaphysics reaches  its completion or full unfolding. And that end reflects the reign of resources. ‘[T]he world of completed metaphysics can be stringently called “technology”’ (Heidegger 2003: 82). However, in Heidegger’s final analysis the ubiquity of resources owes not to science or metaphysics but to a ‘mode of revealing’; it owes to an epochal ontological regime that Heidegger calls ‘Enframing’, even if he seems to think, also, that a change in human beings could mitigate Enframing and prepare for something different and better. (More on this mitigation shortly.)

What, though, is wrong with the real being revealed as resource? Enframing is ‘monstrous’ (Heidegger 1994: 321). It is monstrous – Heidegger contends – because it is nihilism. Nihilism is a ‘forgetfulness’ of das Sein (Seinsvergessenheit). Some such forgetfulness is nigh inevitable. We are interested in beings as they present themselves to us. So we overlook the conditions of that presentation, namely, being and Being. But Enframing represents a more thoroughgoing form of forgetfulness. The hegemony of resources makes it very hard (harder than usual – recall above) to conceive that beings could be otherwise, which is to say, to conceive that there is something called ‘Being’ that could yield different regimes of being. In fact, Enframing actively denies being/Being. That is because Enframing, or the metaphysics/science that corresponds to it, proceeds as if humanity were the measure of all things and hence as if being, or that which grants being independently of us (Being), were nothing. Such nihilism sounds bearable. But Heidegger lays much at its door: an impoverishment of culture; a deep kind of homelessness; the devaluation of the highest values (see Young 2002: ch. 2 and passim). He goes so far as to trace ‘the events of world history in this [the twentieth] century’ to Seinsvergessenheit (Heidegger in Wolin 1993: 69).

Heidegger’s response to nihilism is ‘thinking’ (Denken). The thinking at issue is a kind of thoughtful questioning. Its object – that which it thinks about – can be the pre-Socratic ideas from which philosophy developed, or philosophy’s history, or Things, or art. Whatever its object, thinking always involves recognition that it is das Sein, albeit in some interplay with humanity, which determines how beings are. Indeed, Heideggerian thinking involves wonder and gratitude in the face of das Sein. Heidegger uses Meister Eckhart’s notion of ‘releasement’ to elaborate upon such thinking. The idea (prefigured, in fact, in Heidegger’s earlier work) is of a non-impositional comportment towards beings which lets beings be what they are. That comportment ‘grant[s] us the possibility of dwelling in the world in a totally different way’. It promises ‘a new ground and a new foundation upon which we can stand and endure in the world of technology without being imperiled by it’ (Heidegger 1966: 55). Heidegger calls the dwelling at issue ‘poetic’ and one way in which he specifies it is via various poets. Moreover, some of Heidegger’s own writing is semi-poetic. A small amount of it actually consists of poems. So it is not entirely surprising to find Heidegger claiming that,  ‘All philosophical thinking’ is ‘in itself poetic’ (Heidegger 1991, vol. 2: 73; Heidegger made this claim at a time when he still considered himself a philosopher as against a non-metaphysical, and hence non-philosophical, ‘thinker’). The claim is connected to the centrality that Heidegger gives to language, a centrality that is summed up (a little gnomically) in the statement that language is ‘the house of das Sein’ (Heidegger 1994: 217).

Heideggerian ‘thinking’ has been attacked as (some mixture of) irrationalist, quietist, reactionary, and authoritarian (see for example Adorno 1973 and Habermas 1987b: ch. 6). A related objection is that, though Heidegger claimed to leave theology alone, what he produced was an incoherent reworking of religion (Haar 1993; Philipse 1998). Of the more or less secular or (in Caputo’s term) ‘demythologized’ construals of Heidegger, many are sympathetic and, among those, many fasten upon such topics as technology, nihilism, and dwelling (Borgmann 1984; Young 2002: chs. 7–9; Feenberg 1999: ch. 8). Other secular admirers – including, notably, Rorty and Derrida – concentrate upon Heidegger’s attempt to encapsulate and interrogate the entire philosophical tradition.

d. Derrida’s Post-Structuralism

Structuralism was an international trend in linguistics, literary theory, anthropology, political theory, and other disciplines. It sought to explain phenomena (sounds, tropes, behaviors, norms, beliefs . . .) less via the phenomena themselves, or via their genesis, and more via structures that the phenomena exist within or instantiate. The post-structuralists applied this structural priority to philosophy. They are post-structuralists less because they came after structuralism and more because, in appropriating structuralism, they distanced themselves from the determinism and scientism it often involved (Dews 1987: 1–4). The post-structuralists included Deleuze, Foucault, Lyotard and Lacan (and sometimes post-structuralism is associated with ‘post-modernism’; see Malpas 2003: 7–11). Each of these thinkers (perhaps excepting Lacan) is highly metaphilosophical. But attention here is restricted to the best known and most controversial of the post-structuralists, namely, Jacques Derrida.

Derrida practiced ‘deconstruction’ (Déconstruire, la Déconstruction; Derrida adapts the notion of deconstruction from Heidegger’s idea of ‘destruction’, on which latter see section 4.a.ii above). Deconstruction is a ‘textual “operation”’ (Derrida 1987: 3). The notion of text here is a broad one. It extends from written texts to conceptions, discourses, and even practices. Nevertheless, Derrida’s early work concentrates upon actual texts and, more often than not, philosophical ones. The reason Derrida puts ‘operation’ (‘textual “operation”’) within scare-quotes is that he holds that deconstruction is no method. That in turn is for two reasons (each of which should become clearer below). First, the nature of deconstruction varies with that which is deconstructed. Second, there is a sense in which texts deconstruct themselves. Nonetheless: deconstruction, as a practice, reveals such alleged self-deconstruction; and that practice does have a degree of regularity. The practice of deconstruction has several stages. (In presenting those stages, ‘text’ is taken in the narrow sense. Moreover, it is presumed that in each case a single text is, at least centrally, at issue.)

Deconstruction begins with a commentary (Derrida 1976: 158) – with a ‘faithful’ and ‘interior’ reading of a text (Derrida 1987: 6). Within or via such commentary, the focus is upon metaphysical oppositions. Derrida understands metaphysics as ‘the metaphysics of presence’ (another notion adapted from Heidegger); and an opposition belongs to metaphysics (pleonastically, the metaphysics of presence) just in case: (i) it contains a privileged term and a subordinated term; and (ii) the privileged term has to do with presence. ‘Presence’ is presence to consciousness and/or the temporal present. The oppositions at issue include not only presence–absence (construed in either of the two ways just indicated) but also, and among others (and with the term that is privileged within each opposition given first) these: ‘normal/abnormal, standard/parasitic, fulfilled/void, serious/nonserious, literal/nonliteral’ (Derrida 1988: 93).

The next step in deconstruction is to show that the text undermines its own metaphysical oppositions. That is: the privileged terms reveal themselves to be less privileged over the subordinate terms – less privileged vis-à-vis presence, less ‘simple, intact, normal, pure, standard, self-identical’ (Derrida 1988: 93) – than they give themselves out to be. Here is a common way in which Derrida tries to establish the point. He tries to show that a privileged term essentially depends upon, or shares some crucial feature(s) with, its supposed subordinate. One of Derrida’s deconstructions of Husserl can serve as an example. Husserl distinguishes mental life, which he holds to be inherently intentional (inherently characterized by aboutness) from language, which is intentional only via contingent association with such states. Thereby Husserl privileges the mental over the linguistic. However: Husserl’s view of the temporality of experience entails that the presence he makes criterial for intrinsic intentionality – a certain presence of meanings to the mind – is always partially absent. Or so Derrida argues (Derrida, section 4). A second strategy of Derrida’s ‘is to apply a distinction onto itself reflexively and thus show that it itself is imbued with the disfavored term’ (Landau, 1992/1993: 1899). ‘For example, Derrida shows that when Aristotle and other philosophers discuss the nature of metaphors (and thereby the distinction between metaphors and non-metaphors), they use metaphors in the discussions themselves’ (idem) – and so fail in their attempts to relegate or denigrate metaphor. A further strategy involves the notion of undecidability (see Derrida, section 5).

A third stage or aspect of deconstruction is, one can say, less negative or more productive (and Derrida himself calls this the productive moment of deconstruction). Consider Derrida’s deconstruction(s) of the opposition between speech and writing. Derrida argues, initially, as follows. Speech – and even thought, understood as a kind of inner speech – shares with writing features that have often been used to present writing as only a poor descendant of speech. Those features include being variously interpretable and being derivative of something else. But there is more. Derrida posits something, which he calls archi-écriture, ‘arche-writing’, which is ‘fundamental to signifying processes in general, a “writing” that is the condition of all forms of expression, whether scriptural, vocal, or otherwise’ (Johnson 1993: 66). Indeed: as well as being a condition of possibility, arche-writing is, in Derrida’s frequent and arresting phrase, a condition of its impossibility. Arche-writing establishes or reveals a limit to any kind of expression (a limit, namely, to the semantic transparency, and the self-sufficiency, of expressions). Other deconstructions proceed similarly. A hierarchical opposition is undermined; a new term is produced through a kind of generalization of the previously subordinate term; and the new term – such as ‘supplement’, ‘trace’ and the neologism différance (Derrida, section 3.c–e) – represents a condition of possibility and impossibility for the opposition in question.

What is the status of these conditions? Sometimes Derrida calls them ‘quasi-transcendental’. That encourages this idea: here we have an account not just of concepts but of things or phenomena. Yet Derrida himself does not quite say that. He denies that we can make any simple distinction between text and world, between conceptual system and phenomena. Such may be part of the thrust of the (in)famous pronouncement, ‘There is nothing outside of the text’ (il n’y a pas de hors-texte; Derrida 1976: 158). Nor does Derrida think that, by providing such notions as arche-writing, he himself wholly evades the metaphysics of presence. ‘We have no language—no syntax and no lexicon—that is foreign to this history; we can pronounce not a single deconstructive proposition which has not already had to slip into the form, the logic, and the implicit postulations of precisely what it seeks to contest’ (Derrida 1990: 280f.). Still: ‘if no one can escape this necessity, and if no one is therefore responsible for giving in to it […] this does not mean that all the ways of giving in to it are of equal pertinence. The quality and fecundity of a discourse are perhaps measured by the critical rigor with which this relation to the history of metaphysics and to inherited concepts is thought’ (Derrida 1990: 282).

Derrida retained the foregoing views, which he had developed by the end of the 1960s. But there were developments of metaphilosophical significance. (1) In the ’70s, his style became more playful, and his approach to others’ texts became more literary (and those changes more or less persisted; Derrida would want to know, however, just what we understand by ‘playful’ and ‘literary’). (2) Again from the ’70s onwards, Derrida joined with others in order to sustain and promote the teaching of philosophy in schools, to consider philosophy’s role, and to promote philosophy that transgressed disciplinary boundaries. (3) In the ’80s, Derrida tried to show that deconstruction had an ethical and political import. He turned to themes that included cosmopolitanism, decision, forgiveness, law, mourning, racism, responsibility, religion, and terrorism – and claimed, remarkably, that ‘deconstruction is justice’ (Derrida 1999: 15). To give just a hint of this last idea: ‘Justice is what the deconstruction of the law’ – an analysis of the law’s conditions of possibility and impossibility, of its presuppositions and limits – ‘means to bring about’, where ‘law’ means ‘legality, legitimacy, or legitimation (for example)’ (Caputo 1997: 131f.). (On some of these topics, see Derrida, section 7.) (4) By the ’90s, if not earlier, Derrida held that in philosophy the nature of philosophy is always and everywhere at issue (see for instance Derrida 1995: 411).

Despite his views about the difficulty of escaping metaphysics, and despite his evident belief in the critical and exploratory value of philosophy, Derrida has been attacked for undermining philosophy. Habermas provides an instance of the criticism. Habermas argued that Derrida erases the distinction between philosophy and literature. Habermas recognizes that Derrida means to be ‘simultaneously maintaining and relativizing’ the distinction between literature and philosophy (Habermas 1987b: 192). But the result, Habermas thinks, is an effacement of the differences between literature and philosophy. Habermas adds, or infers, that ‘Derrida does not belong to those philosophers who like to argue’ (Habermas 1987b: 193). Derrida objected to being called unargumentative. He objected, also, to Habermas’ procedure of using other deconstructionists – those that Habermas deemed more argumentative – as the source for Derrida’s views.

Subsequently, Habermas and Derrida underwent something of a rapprochement. Little reconciliation was achieved in the so-called ‘Derrida affair’, wherein a collection of philosophers, angry that Derrida was to receive an honorary degree from Cambridge, alleged that Derrida ‘does not meet accepted standards of clarity or rigor’ (quoted Derrida 1995: 420; a detailed attack upon Derrida’s scholarship is Evans 1991).

There might be a sense in which Derrida is too rigorous. For he holds this: ‘Every concept that lays claim to any rigor whatsoever implies the alternative of “all or nothing”’ (Derrida 1988: 116). One might reject that view. Might it be, indeed, that Derrida insists upon rigid oppositions ‘in order to legitimate the project of calling them into question’ (Gerald Graff in Derrida 1988: 115)? One might object, also, that Derrida’s interrogation of philosophy is more abstract, more intangible, than most metaphysics. Something Levinas said apropos Derrida serves as a response. ‘The history of philosophy is probably nothing but a growing awareness of the difficulty of thinking’ (Levinas 1996: 55; compare Derrida 1995: 187f.). The following anxiety might persist. Despite Derrida’s so-called ethical and political ‘turns’, and despite the work he has inspired within the humanities, deconstruction does little to illuminate phenomena that are not much like anything reasonably designatable as a text (Dews 1987: 35). A more general version of the anxiety is that, for all the presentations of Derrida as ‘a philosopher of difference’, deconstruction obscures differences (Kearney 1984: 114; Habermas 1992a: 159).

5. References and Further Reading

Note that, in the case of many of the items that follow, the date given for a text is not the date of its first publication.

a. Explicit Metaphilosophy and Works about Philosophical Movements or Traditions

  • Anscombe, G. E. M. (1957) ‘Does Oxford Moral Philosophy Corrupt Youth?’ in her Human Life, Action, and Ethics: Essays, pp. 161–168. Exeter, UK: Imprint Academic, 2005. Edited by Mary Geach and Luke Gormally.
  • Beaney, Michael (2007) ‘The Analytic Turn in Early Twentieth-Century Philosophy’, in Beaney, Michael ed. The Analytic Turn. Essays in Early Analytic Philosophy and Phenomenology, New York and London: Routledge, 2007.
    • Good on, especially, the notions of analysis in early Analytic philosophy and on the historical precedents of those notions.
  • Beaney, Michael (2009) ‘Conceptions of Analysis in Analytic Philosophy’: Supplement to entry on ‘Analysis’, The Stanford Encyclopedia of Philosophy (Summer 2009 Edition), Edward N. Zalta (ed.).
  • Beauchamp, Tom L. (2002) ‘Changes of Climate in the Development of Practical Ethics’, Science and Engineering Ethics 8: 131–138.
  • Bernstein, Richard J. (2010) The Pragmatic Turn. Cambridge MA and Cambridge.
    • An account of the influence and importance of pragmatism.
  • Chappell, Timothy (2009) ‘Ethics Beyond Moral Theory’, Philosophical Investigations 32(3): 206–243.
  • Chase, James, and Reynolds, Jack (2010) Analytic Versus Continental: Arguments on the Methods and Value of Philosophy. Stocksfield: Acumen.
  • Clarke, Stanley G. (1987) ‘Anti-Theory in Ethics’, American Philosophical Quarterly 24(3): 237–244.
  • Deleuze, Gilles, and Guattari, Félix (1994) What is Philosophy? London and New York: Verso. Trans. Graham Burchell and Hugh Tomlinson.
    • Less of an introduction to metaphilosophy than its title might suggest.
  • Galison, Peter (1990) ‘Aufbau/Bauhaus: Logical Positivism and Architectural Modernism’, Critical Inquiry, 16(4[Summer]): 709–752.
  • Glendinning, Simon (2006) The Idea of Continental Philosophy: A Philosophical Chronicle. Edinburgh: Edinburgh University Press.
  • Glock, Hans-Johann (2008) What Is Analytic Philosophy? Cambridge and New York: Cambridge University Press.
    • Comprehensive. Illuminating. Not introductory.
  • Graham, George and Horgan, Terry (1994) ‘Southern Fundamentalism and the End of Philosophy’, Philosophical Issues 5: 219–247.
  • Lazerowitz, Morris (1970) ‘A Note on “Metaphilosophy”’, Metaphilosophy, 1(1): 91–91 (sic).
    • An influential (but very short) definition of metaphilosophy.
  • Levin, Janet (2009) ‘Experimental Philosophy’, Analysis, 69(4): 761–769.
  • Levy, Neil (2009) ‘Empirically Informed Moral Theory: A Sketch of the Landscape’, Ethical Theory and Moral Practice 12:3–8.
  • McNaughton, David (2009) ‘Why Is So Much Philosophy So Tedious?’, Florida Philosophical Review IX(2): 1–13.
  • Joll, Nicholas (2009) ‘How Should Philosophy Be Clear? Loaded Clarity, Default Clarity, and Adorno’, Telos 146 (Spring): 73–95.
  • Joll, Nicholas (Forthcoming) Review of Jürgen Habermas et al, An Awareness of What Is Missing (Polity, 2010), Philosophy.
    • Tries to clarify and evaluate some of Habermas’ thinking on religion.
  • Papineau, David (2009) ‘The Poverty of Analysis’, Proceedings of the Aristotelian Society Supplementary Volume lxxxiii: 1–30.
  • Preston, Aaron (2007) Analytic Philosophy: The History of an Illusion. London and New York: Continuum.
    • Argues, controversially, that Analytic philosophy has never had any substantial philosophical or metaphilosophical unity.
  • Prinz, Jesse J. (2008) ‘Empirical Philosophy and Experimental Philosophy’ in J. Knobe and S. Nichols (eds.) Experimental Philosophy. Oxford: Oxford University Press, 2008.
  • Urmson, J. D. (1956) Philosophical Analysis: Its Development Between the Two World Wars. London: Oxford University Press.
  • Rescher, Nicholas (2006) Philosophical Dialectics. An Essay on Metaphilosophy. Albany: State University of New York Press.
    • Centres upon the notion of philosophical progress. Contains numerous, occasionally gross typographical errors.
  • Rorty, Richard ed. (1992) The Linguistic Turn: Essays in Philosophical Method, Chicago and London: University of Chicago Press. Second edition.
    • A useful study of 1930s to 1960s Analytic metaphilosophy.
  • Rorty, Richard, Schneewind, Jerome B., and Skinner, Quentin eds. (1984) Philosophy in History: Essays in the Historiography of Philosophy. Cambridge: Cambridge University Press.
  • Sorell, Tom, and Rogers, G. A. J. eds. (2005) Analytic Philosophy and History of Philosophy. Oxford and New York: Oxford University Press.
  • Stewart, Jon (1995) ‘Schopenhauer’s Charge and Modern Academic Philosophy: Some Problems Facing Philosophical Pedagogy’, Metaphilosophy 26(3): 270–278.
  • Taylor, Charles (1984) ‘Philosophy and Its History’, in Rorty, Schneewind, and Skinner 1984.
  • Williams, Bernard (2003) ‘Contemporary Philosophy: A Second Look’ in The Blackwell Companion to Philosophy, ed. Nicholas Bunnin and E. P. Tsui-James, pp. 25–37. Oxford: Blackwell. Second edition.
  • Williamson, Timothy (2007) The Philosophy of Philosophy, Malden MA and Oxford: Blackwell.
    • A dense, rather technical work aiming to remedy what it sees as a metaphilosophical lack in Analytic philosophy. Treats, among other things, these notions: conceptual truth; intuitions; thought experiments.

b. Analytic Philosophy including Wittgenstein, Post-Analytic Philosophy, and Logical Pragmatism

  • Austin, J. L. (1979) Philosophical Papers. Third edition. Oxford and New York: Oxford University Press.
  • Burtt, E. A. (1963) ‘Descriptive Metaphysics’, Mind 72(285):18–39.
  • Campbell, Richmond and Hunter, Bruce (2000) ‘Introduction’, in R. Campbell and B. Hunter eds. Moral Epistemology Naturalized, Supple. Vol., Canadian Journal of Philosophy: 1–28.
    • Campbell has published a similar piece, under the title ‘Moral Epistemology’, in the online resource the Stanford Encyclopedia of Philosophy.
  • Carnap, Rudolf (1931) ‘The Elimination of Metaphysics Through Logical Analysis of Language’ in Ayer, A. J. (1959) ed. Logical Positivism. Glencoe IL: The Free Press.
  • Cavell, Stanley (1979) The Claim of Reason. Wittgenstein, Skepticism, Morality, and Tragedy. Oxford: Oxford University Press.
  • Cohen, G. A. (2002) ‘Deeper into Bullshit’, in Buss, Sarah and Overton, Lee eds. Contours of Agency: Themes from the Philosophy of Harry Frankfurt, Cambridge, MA: MIT Press.
    • Adapts Harry Frankfurt’s construal of bullshit in order to diagnose and indict much ‘bullshit in certain areas of philosophical and semi-philosophical culture’ (p. 335). Reprinted in Hardcastle, Gary L. and Reich, George A. eds. Bullshit and Philosophy, Chicago and La Salle, IL: Open Court, 2006.
  • Copi, Irving M. (1949) ‘Language Analysis and Metaphysical Inquiry’ in Rorty 1992.
  • Freeman, Samuel (2007) Rawls. Oxford and New York: Routledge.
  • Gellner, Ernest (2005) Words and Things. An Examination of, and an Attack on, Linguistic Philosophy. Second edition. Abingdon and New York: Routledge.
  • Glock, Hans-Johann (2003a) Quine and Davidson on Language, Thought and Reality. Cambridge and New York: Cambridge University Press.
  • Glock, Hans-Johann ed. (2003b) Strawson and Kant. Oxford and New York: Oxford University Press.
  • Haack, Susan (1979) ‘Descriptive and Revisionary Metaphysics’, Philosophical Studies 35: 361–371.
  • Hacker, P. M. S. (2003) ‘On Strawson’s Rehabilitation of Metaphysics’ in Glock ed. 2003b.
  • Hacker, P. M. S. (2007) Human Nature: the Categorial Framework. Oxford: Blackwell.
  • Hutchinson, Brian (2001) G. E. Moore’s Ethical Theory: Resistance and Reconciliation. Cambridge: Cambridge University Press.
  • Kripke, Saul A. (1980) Naming and Necessity. Oxford: Blackwell. Revised and Enlarged edition.
  • Lance, M. and Little, M. (2006) ‘Particularism and Anti-Theory’, in D. Copp, ed., The Oxford Handbook of Ethical Theory, Oxford and New York: Oxford University Press.
  • Loux, Michael J. (2002) Metaphysics: A Contemporary Introduction, second ed. London and New York: Routledge.
  • Malcolm, Norman (1984) Ludwig Wittgenstein: A Memoir. With a biographical sketch by G. H. von Wright and Wittgenstein’s letters to Malcolm. Second ed. Oxford and New York: Oxford University Press.
  • McDowell, John (1994) Mind and World. Cambridge MA and London: Harvard University Press.
    • Perhaps the paradigmatic ‘post-Analytic’ text.
  • McDowell, John (2000) ‘Towards Rehabilitating Objectivity’ in Brandom ed. (2000).
  • McMahon, Jennifer A. (2007) Aesthetics and Material Beauty: Aesthetics Naturalized. New York and London: Routledge.
  • Moore, G. E. (1899) ‘The Nature of Judgement’, in G. E. Moore Selected Writings, London: Routledge, 1993, ed. T. Baldwin.
  • Moore, G. E. (1953) Some Main Problems of Philosophy. New York: Humanities Press.
    • From lectures given in 1910 and 1911.
  • Moore, G. E. (1993) Principia Ethica. Cambridge and New York: Cambridge University Press.
    • Second and revised edition, containing some other writings by Moore.
  • Neurath, Otto, Carnap, Rudolf, and Hahn, Hans (1996) ‘The Scientific Conception of the World: the Vienna Circle’, in Sarkar, Sahotra ed. The Emergence of Logical Empiricism: from 1900 to the Vienna Circle. New York: Garland Publishing, 1996. pp. 321–340.
    • An English translation of the manifesto issued by the Vienna Circle in 1929.
  • Orenstein, Alex (2002) W. V. Quine. Chesham, UK: Acumen.
  • Pitkin, Hanna (1993) Wittgenstein and Justice. On the Significance of Ludwig Wittgenstein for Social and Political Thought. Berkeley and London: University of California Press.
  • Putnam, Hilary (1985) ‘After Empiricism’ in Rajchman and West 1985.
  • Quine, W. V. O. (1960) Word and Object. Cambridge MA: MIT Press.
  • Quine, W. V. O. (1977) Ontological Relativity and Other Essays. New York: Columbia University Press. New edition.
  • Quine, W. V. O. (1980) From A Logical Point of View. Harvard: Harvard University Press. New edition.
  • Quine, W. V. O. (1981) Theories and Things. Cambridge, MA: Harvard University Press.
  • Rawls, John (1999a) A Theory of Justice. Revised edition. Cambridge MA: Harvard University Press.
  • Rawls, John (1999b) Collected Papers ed. Samuel Freeman. Cambridge, MA: Harvard University Press.
  • Russell, Bertrand (1992) A Critical Exposition of the Philosophy of Leibniz. London and New York: Routledge.
  • Russell, Bertrand (1995) My Philosophical Development. Abingdon, UK and New York: Routledge.
  • Russell, Bertrand (2009) Our Knowledge of the External World: As a Field for Scientific Method in Philosophy. Abingdon and New York: Routledge.
  • Rynin, David (1956) ‘The Dogma of Logical Pragmatism’, Mind 65(259): 379–391.
  • Schilpp, Paul Arthur ed. (1942) The Philosophy of G. E. Moore. Evanston and Chicago: Northwestern University Press.
  • Schroeter, François (2004) ‘Reflective Equilibrium and Antitheory’, Noûs, 38(1): 110–134.
  • Schultz, Bart (1992) ‘Bertrand Russell in Ethics and Politics’, Ethics, 102: 3 (April): 594–634.
  • Sellars, Wilfrid (1963) Science, Perception and Reality. London: Routledge & Kegan Paul; New York: The Humanities Press.
  • Strawson, Peter (1959) Individuals: An Essay in Descriptive Metaphysics. London: Methuen.
  • Strawson, Peter (1991) Analysis and Metaphysics. An Introduction to Philosophy. Oxford and New York: Oxford University Press.
    • Both an introduction to philosophy and an introduction to Strawson’s own philosophical and metaphilosophical views.
  • Strawson, Peter (2003) ‘A Bit of Intellectual Autobiography’ in Glock ed. 2003b.
  • Weinberg, Jonathan M., Nichols, Shaun and Stich, Stephen (2001) ‘Normativity and Epistemic Intuitions’, Philosophical Topics, 29(1&2): 429–460.
  • Williams, Bernard (1981) Moral Luck. Cambridge: Cambridge University Press.
  • Wittgenstein, Ludwig (1961) Tractatus Logico-Philosophicus. Trans. D.F. Pears and B.F. McGuinness. Routledge: London.
  • Wittgenstein, Ludwig (1966) Lectures and Conversations on Aesthetics, Psychology and Religious Belief. Oxford: Blackwell.
  • Wittgenstein, Ludwig (1969) The Blue and Brown Books. Preliminary Studies for the “Philosophical Investigations”. Blackwell: Oxford.
  • Wittgenstein, Ludwig (2001) Philosophical Investigations. The German Text, with a Revised English Translation. Malden MA and Oxford: Blackwell. Third edition. Trans. G. E. M. Anscombe.
    • The major work of the ‘later’ Wittgenstein.
  • Wright, Crispin (2002) ‘Human Nature?’ in Nicholas H. Smith ed. Reading McDowell. On Mind and World. London and New York: Routledge.

c. Pragmatism and Neopragmatism

  • Brandom, Robert B. ed. (2000) Rorty and His Critics. Malden MA and Oxford: Blackwell.
  • Dewey, John (1998) The Essential Dewey, two volumes, Larry Hickman and Thomas M. Alexander eds. Indiana University Press.
  • James, William (1995) Pragmatism: A New Name for Some Old Ways of Thinking. New York: Dover Publications.
    • Lectures.
  • Peirce, C. S. (1931–58) The Collected Papers of Charles Sanders Peirce, eds. C. Hartshorne, P. Weiss (Vols. 1–6) and A. Burks (Vols. 7–8). Cambridge MA: Harvard University Press.
  • Rorty, Richard (1980) Philosophy and the Mirror of Nature. Oxford: Blackwell.
    • Rorty’s magnum opus.
  • Rorty, Richard (1991a) Consequences of Pragmatism (Essays: 1972–1980). Hemel Hempstead, UK: Harvester Wheatsheaf.
  • Rorty, Richard (1991b) ‘The Priority of Democracy to Philosophy’, pp. 175–196 of his Objectivity, Relativism, and Truth. Philosophical Papers, Volume 1. Cambridge, New York and Melbourne: Cambridge University Press.
  • Rorty, Richard (1998) Achieving Our Country. Leftist Thought in Twentieth-Century America. Cambridge MA and London: Harvard University Press.
  • Rorty, Richard (2007) Philosophy as Cultural Politics. Philosophical Papers, Volume 4. Cambridge: Cambridge University Press.
  • Talisse, Robert B. and Aikin, Scott F. (2008) Pragmatism: A Guide for the Perplexed. Continuum: London and New York.
    • Good and useful.

d. Continental Philosophy

  • Adorno, Theodor W. (1986) The Jargon of Authenticity. London and Henley: Routledge and Kegan Paul, 1986; trans. Knut Tarnowski and Frederic Will.
  • Adorno, Theodor W. and Horkheimer, Max (2002) Dialectic of Enlightenment. Philosophical Fragments. Stanford: Stanford University Press. Trans. Edmund Jephcott.
  • Adorno, Theodor W. (1976) with R. Dahrendorf, J. Habermas, H. Pilot, and K. Popper, The Positivist Dispute in German Sociology, trans. G. Adey and D. Frisby, London: Heinemann Educational Books.
    • Documents from debates between Popperians (who were not, in fact, positivists in any strict sense) and the Frankfurt School.
  • Baxter, Hugh (1987) ‘System and Life-World in Habermas’ Theory of Communicative Action’, Theory and Society 16(1) (January): 39–86.
  • Braver, Lee (2009) Heidegger’s Later Writings. A Reader’s Guide. London and New York: Continuum.
    • Accessible and helpful, yet perhaps somewhat superficial.
  • Caputo, John D. (1977) ‘The Question of Being and Transcendental Phenomenology: Reflections on Heidegger’s Relationship to Husserl’, Research in Phenomenology 7(1): 84–105.
  • Caputo, John D (1993) Demythologizing Heidegger. Bloomington and Indianapolis: Indiana University Press.
    • More ‘Continental’ than one might guess merely from the title.
  • Caputo, John D. (1997) ‘A Commentary’, Part Two of Derrida, Jacques (1997) Deconstruction in a Nutshell. A Conversation with Jacques Derrida. New York: Fordham University Press. Edited and with a commentary by John D. Caputo.
  • Carman, Taylor, and B. N. Hansen eds. (2005) The Cambridge Companion to Merleau-Ponty. Cambridge: Cambridge University Press.
  • Cerbone, David (2006) Understanding Phenomenology. Chesham, UK: Acumen.
    • A good introduction to phenomenology.
  • Cooper, David (1999) Existentialism: A Reconstruction. 2nd ed. Oxford and Malden, MA: Blackwell.
    • Careful, argumentative, fairly accessible.
  • Davis, Colin (1996) Levinas. An Introduction. Cambridge: Polity.
    • Not only introduces Levinas but also mounts a strong challenge to him.
  • Derrida, Jacques (1976) Of Grammatology. Baltimore and London: Johns Hopkins University Press. Trans. G. C. Spivak.
  • Derrida, Jacques (1987) Positions. London: Athlone. Trans. Alan Bass.
    • Three relatively early interviews with Derrida. Relatively accessible.
  • Derrida, Jacques (1988) Limited Inc. Evanston, IL: Northwestern University Press.
    • Contains Derrida’s side of an (acrimonious) debate with John Searle. Includes an Afterword wherein Derrida answers questions put to him by Gerald Graff.
  • Derrida, Jacques (1990) Writing and Difference. London: Routledge. Trans. Alan Bass.
  • Derrida, Jacques (1995) Points . . . : Interviews, 1974–1994. Trans. Peggy Kamuf et al. Stanford, CA: Stanford University Press.
  • Derrida, Jacques (1999) ‘Force of Law’ in Drucilla Cornell, Michel Rosenfeld, and David Gray Carlson eds. (1992) Deconstruction and the Possibility of Justice, New York: Routledge.
  • Dews, Peter (1987) Logics of Disintegration. Post-structuralist Thought and the Claims of Critical Theory. London and New York: Verso.
  • Dews, Peter (1995) ‘Morality, Ethics and “Postmetaphysical Thinking”’ in his The Limits of Disenchantment. Essays on Contemporary European Philosophy. London and New York: Verso, 1995.
  • Diprose, Rosalyn and Reynolds, Jack eds. (2008) Merleau-Ponty: Key Concepts. Chesham, UK: Acumen.
  • Dubiel, Helmut (1985) Theory and Politics. Studies in the Development of Critical Theory. Cambridge MA: MIT Press.
  • Edgar, Andrew (2006) Habermas: The Key Concepts. London and New York: Routledge.
  • Elden, Stuart (2004) Understanding Henri Lefebvre: Theory and the Possible. London and New York: Continuum.
  • Evans, J. Claude (1991) Strategies of Deconstruction: Derrida and the Myth of the Voice. Minneapolis: University of Minnesota Press.
    • Detailed contestation of Derrida’s interpretation of, especially, Husserl.
  • Finlayson, Gordon (2005) Habermas: A Very Short Introduction. Oxford: Oxford University Press.
  • Finlayson, Gordon (2009) ‘Morality and Critical Theory. On the Normative Problem of Frankfurt School Social Criticism’, Telos (146: Spring): 7–41.
  • Freyenhagen, Fabian (2008) ‘Moral Philosophy’ in Deborah Cook (ed.) Theodor Adorno: Key Concepts. Stocksfield: Acumen.
    • A good and somewhat revisionist synopsis of Adorno’s moral philosophy.
  • Gadamer, Hans-Georg (1981) Reason in the Age of Science. Cambridge MA: MIT Press. Trans. Frederick Lawrence.
  • Geuss, Raymond (1981) The Idea of a Critical Theory. Cambridge and New York: Cambridge University Press.
  • Geuss, Raymond (2008) Philosophy and Real Politics. Princeton and Oxford: Princeton University Press.
  • Glendinning, Simon (2001) ‘Much Ado About Nothing (on Herman Philipse, Heidegger’s Philosophy of Being)’. Ratio 14 (3):281–288.
  • Haar, Michel (1993) Heidegger and the Essence of Man. New York: State University of New York Press. Trans. William McNeill.
  • Habermas, Jürgen (1984) The Theory of Communicative Action, Volume 1: Reason and the Rationalization of Society. Cambridge: Polity. Trans. Thomas McCarthy.
  • Habermas, Jürgen (1987a) Knowledge and Human Interests. Cambridge: Polity. Second edition. Trans. Jeremy Shapiro.
  • Habermas, Jürgen (1987b) The Philosophical Discourse of Modernity: Twelve Lectures. Cambridge: Polity Press in association with Blackwell Publishers. Trans. Frederick Lawrence.
    • One of Habermas’ more accessible – and more polemical – works.
  • Habermas, Jürgen (1992a) Autonomy and Solidarity. Interviews with Jürgen Habermas. Ed. Peter Dews. Revised edition.
    • A good place to start with Habermas.
  • Habermas, Jürgen (1992b) Postmetaphysical Thinking: Philosophical Essays. Oxford: Polity Press. Trans. William Mark Hohengarten.
  • Habermas, Jürgen (2008) Between Naturalism and Religion. Philosophical Essays. Cambridge and Malden MA: Polity. Trans. Ciaran Cronin.
  • Heidegger, Martin (1962) Being and Time. Oxford: Blackwell. Trans. John Macquarrie and Edward Robinson.
    • The ‘early’ Heidegger’s main work.
  • Heidegger, Martin (1966) Discourse on Thinking. A translation of Gelassenheit. New York: Harper & Row. Trans. John M. Anderson and E. Hans Freund.
  • Heidegger, Martin (1971) Poetry, Language, Thought. New York: Harper & Row. Trans. Albert Hofstadter.
  • Heidegger, Martin (1982) The Basic Problems of Phenomenology. Bloomington and Indianapolis: Indiana University Press. Revised ed. Trans. Albert Hofstadter.
    • Close in its doctrines to Being and Time, but often considerably more accessible.
  • Heidegger, Martin (1991) Nietzsche, 4 volumes. New York: HarperCollins. Trans. David Farrell Krell.
  • Heidegger, Martin (1994) Basic Writings. London: Routledge. Revised and expanded edition.
    • Contains ‘What is Metaphysics?’, ‘Letter on Humanism’, and ‘The Question Concerning Technology’, among other texts.
  • Heidegger, Martin (2003) The End of Philosophy. Chicago: University of Chicago Press. Trans. Joan Stambaugh.
  • Held, David (1990) Introduction to Critical Theory. Cambridge: Polity.
    • Broad-brush and fairly accessible account of first-generation Critical Theory and of the relatively early Habermas.
  • Horkheimer, Max (1937) ‘Traditional and Critical Theory’ in Horkheimer, Critical Theory: Selected Essays. London and New York: Continuum, 1997.
  • Horkheimer, Max (1974) Eclipse of Reason. New York: Continuum.
    • Like Horkheimer and Adorno’s Dialectic of Enlightenment, but more accessible.
  • Husserl, Edmund (1931) Ideas. General Introduction to Pure Phenomenology. George Allen & Unwin Ltd / Humanities Press. Trans. W. R. Boyce Gibson.
    • Kluwer have produced a newer and more accurate version of this book; but the Boyce Gibson version is slightly more readable.
  • Husserl, Edmund (1970) The Crisis of the European Sciences and Transcendental Phenomenology. Evanston, IL: Northwestern University Press. Trans. David Carr.
  • Husserl, Edmund (1999a) The Idea of Phenomenology Dordrecht: Kluwer. Trans. Lee Hardy.
    • Probably Husserl’s most accessible (or least inaccessible) statement of phenomenology.
  • Husserl, Edmund (1999b) Cartesian Meditations. An Introduction to Phenomenology. Trans. Dorian Cairns. Dordrecht: Kluwer.
  • Johnson, Christopher (1993) System and Writing in the Philosophy of Jacques Derrida. Cambridge: Cambridge University Press.
  • Johnson, Christopher (1999) Derrida. The Scene of Writing. New York: Routledge.
    • Good, short, and orientated around Derrida’s Of Grammatology.
  • Landau, Iddo (1992/1993 [sic]) ‘Early and Later Deconstruction in the Writings of Jacques Derrida’, Cardozo Law Review, 14: 1895–1909.
    • Unusually clear.
  • Levinas, Emmanuel (1996) Proper Names. Stanford: Stanford University Press.
  • Malpas, Simon (2003) Jean-François Lyotard. London and New York: Routledge.
  • Marcuse, Herbert (1991) One-Dimensional Man. Second edition. Routledge: London.
    • A classic work of first-generation Critical Theory.
  • Merleau-Ponty, Maurice (2002) Phenomenology of Perception. New York: Routledge. Trans. Colin Smith.
    • Merleau-Ponty’s principal work.
  • Mulhall, Stephen (1996) Heidegger and Being and Time. Routledge: London and New York.
  • Outhwaite, William (1994) Habermas. A Critical Introduction. Cambridge: Polity.
  • Pattison, George (2000) The Later Heidegger. London and New York: Routledge.
    • A helpful introduction to ‘the later Heidegger’.
  • Philipse, Herman (1998) Heidegger’s Philosophy of Being: a Critical Interpretation. New Jersey: Princeton University Press.
    • A large, serious, and very controversial work that sets out to understand, but also to demolish much of, Heidegger. Q.v. Glendinning (2001) – which defends Heidegger.
  • Plant, Robert (Forthcoming) ‘This strange institution called “philosophy”: Derrida and the primacy of metaphilosophy’, Philosophy and Social Criticism.
  • Polt, Richard (1999) Heidegger: An Introduction. London: UCL Press.
    • Superb introduction, but light on the later Heidegger.
  • Russell, Matheson (2006) Husserl: A Guide for the Perplexed. London and New York: Continuum.
    • Excellent.
  • Sartre, Jean-Paul (1963) The Problem of Method. Trans. Hazel E. Barnes. London: Methuen.
  • Sartre, Jean-Paul (1989) Being and Nothingness. An Essay on Phenomenological Ontology. London: Routledge. Trans. Hazel E. Barnes.
    • The early Sartre’s major work.
  • Sartre, Jean-Paul (1992) Notebooks for an Ethics. Chicago and London: Chicago University Press. Trans. David Pellauer.
  • Sartre, Jean-Paul (2004) The Transcendence of the Ego. A Sketch for a Phenomenological Description. Abingdon, U.K.
  • Sartre, Jean-Paul (2007) Existentialism and Humanism. London: Methuen. Trans. Philip Mairet.
    • Sartre’s philosophy at its most accessible.
  • Smith, David (2003) Husserl and the Cartesian Meditations. London and New York: Routledge.
  • Smith, Joel (2005) ‘Merleau-Ponty and the Phenomenological Reduction’, Inquiry 48(6): 553–571.
  • Wolin, Richard, ed. (1993) The Heidegger Controversy: A Critical Reader. Cambridge MA and London: MIT Press.
    • The controversy in question concerns Heidegger’s Nazism. See also Young 1997.
  • Young, Julian (1997) Heidegger, Philosophy, Nazism. Cambridge: Cambridge University Press.
  • Young, Julian (2002) Heidegger’s Later Philosophy. Cambridge: Cambridge University Press.
    • A slim introduction to, and an attempt to make compelling, the thought of the later Heidegger.

e. Other

  • Borgmann, Albert (1984) Technology and the Character of Contemporary Life: A Philosophical Inquiry. Chicago and London: University of Chicago Press.
    • Interesting and impassioned. Influenced by Heidegger.
  • Descartes, René (1988) The Philosophical Writings of Descartes (3 vols). Cambridge: Cambridge University Press. Trans. John Cottingham, Robert Stoothoff, and Dugald Murdoch. Volume one.
  • Feenberg, Andrew (1999) Questioning Technology. London and New York: Routledge.
    • This book has at least one foot in the Critical Theory tradition but also appropriates some ideas from Heidegger.
  • Hume, David (1980) Dialogues Concerning Natural Religion and the Posthumous Essays ‘Of the Immortality of the Soul’ and ‘Of Suicide.’ Indianapolis: Hackett. Ed. Richard H. Popkin.
  • Kant, Immanuel Critique of Pure Reason. Various translations.
    • As is standard, the article above refers to this work using the ‘A’ and ‘B’ nomenclature. The number(s) following ‘A’ denote pages from Kant’s first edition of the text. Number(s) following ‘B’ denote pages from Kant’s second edition.
  • Locke, John (1975) An Essay Concerning Human Understanding. Oxford: Oxford University Press.
  • O’Neill, John (2003) ‘Unified Science as Political Philosophy: Positivism, Pluralism and Liberalism’, Studies in History and Philosophy of Science, vol. 34 (September): 575–596.
  • O’Neill, John and Uebel, Thomas (2004) ‘Horkheimer and Neurath: Restarting a Disrupted Debate’, European Journal of Philosophy, 12(1): 75–105.
  • Petitot, Jean, Varela, Francisco, Pachoud, Bernard, and Roy, Jean-Michel eds. (2000) Naturalizing Phenomenology: Issues in Contemporary Phenomenology and Cognitive Science. Stanford: Stanford University Press.

Author and Article Information

Nicholas Joll
Email: joll.nicholas@gmail.com
United Kingdom

Article first published 17/11/2010. Last revised 01/08/2017.

Michel Foucault (1926–1984)

Michel Foucault was a major figure in two successive waves of 20th century French thought – the structuralist wave of the 1960s and then the poststructuralist wave. By the premature end of his life, Foucault had some claim to be the most prominent living intellectual in France.

Foucault’s work is transdisciplinary in nature, ranging across the concerns of the disciplines of history, sociology, psychology, and philosophy. As of the first decade of the 21st century, Foucault is the author most frequently cited in the humanities in general. In the field of philosophy this is not so, despite philosophy being the primary discipline in which he was educated, and the one with which he ultimately identified. This relative neglect is because Foucault’s conception of philosophy, in which the study of truth is inseparable from the study of history, is thoroughly at odds with the prevailing conception of what philosophy is.

Foucault’s work can generally be characterized as philosophically oriented historical research; towards the end of his life, Foucault insisted that all his work was part of a single project of historically investigating the production of truth. What Foucault did across his major works was to attempt to produce an historical account of the formation of ideas, including philosophical ideas. Such an attempt was neither a simple progressive view of history, seeing it as inexorably leading to our present understanding, nor a thoroughgoing historicism that insists on understanding ideas only by the immanent standards of their time. Rather, Foucault continually sought a way of understanding the ideas that shape our present not only in terms of the historical function these ideas played, but also by tracing the changes in their function through history.

Table of Contents

  1. Life
  2. Early works on psychology
  3. Archaeology
    1. The History of Madness
    2. Writings on Art and Literature
    3. The Birth of the Clinic
    4. The Order of Things
    5. The Archaeology of Knowledge
  4. Genealogy
    1. Discipline and Punish
    2. The Will to Knowledge
    3. Lecture Series
  5. Governmentality
  6. Ethics
  7. References and Further Reading
    1. Primary
    2. Secondary

1. Life

Michel Foucault was born Paul-Michel Foucault in 1926 in Poitiers in western France. His father, Paul-André Foucault, was an eminent surgeon, who was the son of a local doctor also called Paul Foucault. Foucault’s mother, Anne, was likewise the daughter of a surgeon, and had longed to follow a medical career herself, but, as such a career was not available to women at the time, her wish had to wait to be fulfilled by Foucault’s younger brother. It is surely no coincidence, then, that much of Foucault’s work would revolve around the critical interrogation of medical discourses.

Foucault was schooled in Poitiers during the years of German occupation. Foucault excelled at philosophy and, having from a young age declared his intention to pursue an academic career, persisted in defying his father, who wanted the young Paul-Michel to follow his forebears into the medical profession. The conflict with his father may have been a factor in Foucault’s dropping the ‘Paul’ from his name. The relationship between father and son remained cool up until the father’s death in 1959, though Foucault remained close to his mother.

He moved to Paris in 1945, just after the end of the war, to prepare for the entrance examinations for the École Normale Supérieure d’Ulm, which was then (and still is) the most prestigious institution for education in the humanities in France. In this preparatory khâgne year, he was taught philosophy by the eminent French Hegelian, Jean Hyppolite. Foucault entered the École Normale in 1946, where he was taught by Maurice Merleau-Ponty and mentored by Louis Althusser. Foucault primarily studied philosophy, but also obtained qualifications in psychology. These years at the École Normale were marked by depression – and attempted suicide – which is generally agreed to have resulted from Foucault’s difficulties coming to terms with his homosexuality. While at the École Normale, Foucault also joined the French Communist Party in 1950 under the influence of Althusser, but was never active in it and, thoroughly disillusioned, left with Althusser’s assent in 1952.

Foucault passed the agrégation in philosophy at the École Normale in 1951. The same year, he began teaching psychology there, where his students included Jacques Derrida, who would later become a philosophical antagonist of Foucault’s. Foucault also began to work as a laboratory researcher in psychology. He would continue to work in psychology in various capacities until 1955, when he took up a position as director of the Maison de France at the University of Uppsala in Sweden. From Sweden, he moved to Poland as French cultural attaché in 1958, and then from there moved to the Institut Français in Hamburg in 1959. During these overseas postings, he wrote his first major work and primary doctoral thesis, a history of madness, which was later published in 1961. In 1960, Foucault returned to France to teach psychology in the philosophy department of the University of Clermont-Ferrand. He remained in that post until 1966, during which time he lived in Paris and commuted to teach. It was in Paris in 1960 that Foucault met the militant leftist Daniel Defert, then a student and later a sociologist, with whom he would form a partnership that lasted the rest of Foucault’s life.

From 1964, Defert was posted to Tunisia for 18 months of compulsory military service, during which time Foucault visited him more than once. This led to Foucault taking up a chair of philosophy at the University of Tunis in 1966, where he was to remain until 1968, for the most part missing the events of May 1968 in Paris. 1966 also saw the publication of Foucault’s The Order of Things, which received both praise and criticism. It became a bestseller despite its length and the obscurity of its argumentation, and cemented Foucault’s standing as a major figure in the French intellectual firmament.

Returning to France in 1968, Foucault presided over the creation and then running of the philosophy department at the new experimental university at Vincennes in Paris. The new university was created as an answer to the student uprising of 1968, and inherited its ferment. Foucault assembled a department composed mostly of militant Marxists, including some who have gone on to be among the most prominent French philosophers of their generation: Alain Badiou, Jacques Rancière, and Étienne Balibar. After scandals related to this militancy, the department was briefly stripped of its official accreditation. Foucault was already moving on, however; he was in 1970 elected to a chair at France’s most prestigious intellectual institution, the Collège de France, which he held for the rest of his life. The only duty of this post is to give an annual series of lectures based on one’s current research. At the time of writing, Foucault’s thirteen Collège lecture series are in the process of being published in their entirety: eight have appeared in French, seven have been published in English.

The early 1970s were a politically tumultuous period in Paris, where Foucault was again living. Foucault threw himself into political activism, primarily in relation to the prison system, as a founder of what was called the “Prisons Information Group.” It originated in an effort to aid political prisoners, but in fact sought to give a voice to all prisoners. In this connection, Foucault became close to Gilles Deleuze; during this friendship Foucault wrote an enthusiastic foreword to the English-language edition of Deleuze and Félix Guattari’s Anti-Oedipus, before the two fell out.

In the late ‘70s, the political climate in France cooled considerably; Foucault largely withdrew from activism and turned his hand to journalism. He covered the Iranian Revolution first-hand in newspaper dispatches as the events unfolded in 1978 and 1979. He began to spend more and more time teaching in the United States, where he had lately found an enthusiastic audience.

It was perhaps in the United States that Foucault acquired HIV. He developed AIDS in 1984 and his health quickly declined. From his sick-bed he finished editing two volumes on ancient sexuality, which were published that year, before dying on 26 June, leaving the editing of a fourth and final volume uncompleted. He bequeathed his estate to Defert, with the proviso that there were to be no posthumous publications, a testament which has been subject to ever more elastic interpretation since.

A note on dates: Where there is any disagreement among sources as to the facts of Foucault’s biography, the chronology compiled by Daniel Defert at the start of Foucault’s Dits et écrits is considered in this article to be definitive.

2. Early works on psychology

Foucault’s earliest work lacks a distinctively “Foucauldian” perspective. In these works, Foucault displays influences typical of young French academics of the time: phenomenology, psychoanalysis, and Marxism. Foucault’s primary work of this period was his first monograph, Mental Illness and Personality, published in 1954. This slim volume, commissioned for a series intended for students, begins with an historical survey of the types of explanation put forward in psychology, before producing a synthesis of perspectives from evolutionary psychology, psychoanalysis, phenomenology and Marxism. From these perspectives, mental illness can ultimately be understood as an adaptive, defensive response by an organism to conditions of alienation, which an individual experiences under capitalism. Foucault first modified the book in 1962 in a new edition, entitled Mental Illness and Psychology. This involved changing the later parts – the most Marxist material and the conclusion – to bring them into line with the theoretical perspective that he had by then expounded in The History of Madness. According to this view, madness is something natural, and alienation is responsible not so much for creating mental illness as such as for making madness into mental illness. This was a perspective with which Foucault in turn later grew unhappy, and he had the book taken out of print for a time in France.

Foucault’s other major publication of this early period, a long introduction (much longer than the text it introduced) to the French translation of Ludwig Binswanger’s Dream and Existence, a work of Heideggerian existential psychoanalysis, appeared in the same month in 1954 as Mental Illness and Personality. Far from merely introducing Binswanger’s text, Foucault here expounds a novel account of the relation between imagination, dream and reality. He combines Binswanger’s insights with Freud’s, while arguing that neither Binswanger nor Freud understands the fundamental role of dreaming for the imagination. Since imagination is necessary to grasp reality, dreaming is also essential to existence itself.

3. Archaeology

a. The History of Madness

Foucault’s first canonical monograph, in the sense of a work that he never repudiated, was his 1961 primary doctoral thesis, Madness and Unreason: A History of Madness in the Classical Age, which has ultimately come to be known simply as the History of Madness. It is best known in the English-speaking world by an abridged version, Madness and Civilization, since for decades the latter was the only version available in English. History of Madness is a work of some originality, showing several influences, but not slavishly following any convention. It resembles Friedrich Nietzsche’s Birth of Tragedy in style and form (though greatly exceeding it in length), proposing a disjunction between reason and unreason similar to Nietzsche’s Apollonian/Dionysian distinction. It also bears the influence of French history and philosophy of science, the most prominent twentieth century representative of which was Gaston Bachelard, the developer of a notion of “epistemological rupture” to which most of Foucault’s works are indebted. Yet Georges Canguilhem’s focus on the division of the normal from the pathological is perhaps the most telling influence on Foucault in this book. Foucault’s thought moreover continues to owe something to Marxism and to social history more generally, constituting an historical analysis of social divisions.

The History of Madness follows logically enough from Foucault’s interest in psychology. The link is stronger even than the title indicates: much of the work is concerned with the birth of medical psychiatry, which Foucault associates with extraordinary changes in the treatment of the mad in modernity, meaning first their systematic exclusion from society in early modernity, followed by their pathologization in late modernity. The History of Madness thus sets the pattern for most of Foucault’s works by being concerned with discrete changes in a given area of social life at particular points in history. Like Foucault’s other major works of the 1960s, it fits broadly into the category of the history and philosophy of science. It has wider philosophical import than that, however, with Foucault ultimately finding that madness is negatively constitutive of Enlightenment reason via its exclusion. The exclusion of unreason itself, concomitant with the physical exclusion of the mad, is effectively the dark side of the valorization of reason in modernity. For this reason, the original main title of the work was Madness and Unreason. Foucault argues in effect for the recuperation of madness, via a valorization of philosophers and artists deemed mad, such as Nietzsche, a recuperation which Foucault thinks the works of such men already portend.

b. Writings on Art and Literature

Foucault’s writings on art and literature have received relatively little attention, even though Foucault’s work is widely influential among scholars of art and literature. This is surely because Foucault’s work directly in these areas is relatively minor and marginal in his corpus. Still, Foucault wrote several short treatments of artists, including Manet and Magritte, and more substantially on literature. In 1963, Foucault wrote a short book on the novelist Raymond Roussel, published in English as Death and the Labyrinth, which is exceptional as Foucault’s only book-length piece of literary or artistic criticism, and which Foucault himself never considered to be of similar importance to his other books of the 1960s. Nonetheless, the figure of Roussel offers something of a bridge between The History of Madness and the work that Foucault would now go on to do, not least because Roussel is a writer who could be categorized as rehabilitating madness in the literary sphere. Roussel was a madman – eccentrically suicidal – whose work consisted in playing games with language according to arbitrary rules, but with the utmost dedication and seriousness, the purpose of which was to investigate language itself, and its relation to extra-linguistic things. This latter theme is precisely the one that came to preoccupy Foucault in the 1960s, in the form of uncovering the rules of the production of discourse.

Although the Roussel book was the only one Foucault wrote on literature, he produced literary essays throughout the 1960s. He wrote several studies of French literary intellectuals, such as the “Preface to Transgression” about the work of Georges Bataille in relation to that of the Marquis de Sade, the “Prose of Actaeon” about Pierre Klossowski, and the “Thought of the Outside” about Maurice Blanchot. These were all figures who wrote literature or wrote about it, but they were all philosophical thinkers too, influenced by Nietzsche and/or Martin Heidegger: it was through his contemporary Blanchot, a Heideggerian, that Foucault came to Bataille, and thus to Nietzsche, who proved to be a decisive influence on Foucault’s work at multiple points. Foucault also wrote “Language to Infinity,” about de Sade and his literary influence, and a piece on Flaubert at this time. All of these works contribute to a general engagement by Foucault with the theme of language and its relation to its exterior, a theme which is explored at greater length in his contemporaneous monographs.

c. The Birth of the Clinic

The major work of 1963 for Foucault was his follow-up to The History of Madness, entitled The Birth of the Clinic: An Archaeology of Medical Perception. The Birth of the Clinic examines the emergence of modern medicine. It follows on from the History of Madness logically enough: the analysis of the psychiatric classification of madness as disease is followed by an analysis of the emergence of modern medicine itself. However, this new study is a considerably more modest work than its predecessor, due largely to a significant methodological tightening. The preface to The Birth of the Clinic proposes to look at discourses on their own terms as they historically occur, without the hermeneutics that attempts to interpret them in their relation to fundamental reality and historical context. That is, as Foucault puts it, to treat signifiers without reference to the signified, to look at the evolution of medical language without passing judgment on the things it supposedly referred to, namely disease.

The main body of the work is an historical study of the emergence of clinical medicine around the time of the French Revolution, when the transformation of social institutions and political imperatives combined to produce modern institutional medicine for the first time. The leitmotif of the work is the notion of a medical “gaze”: modern medicine is a matter of attentive observation of patients, without prejudging the maladies one may find, in the service of the demographic needs of society. There is significant tension between this methodology and the rest of the book, however, since much of what the book discusses is clearly not signifiers themselves. The fulfillment of the intention announced at the beginning of The Birth of the Clinic is found rather in Foucault’s next book, The Order of Things, first published in 1966.

d. The Order of Things

Subtitled “An Archaeology of the Human Sciences,” this book aims to uncover the history of what today are called the “human sciences.” This is an obscure area, in fact, certainly to English-speaking readers, who are often not used to seeing the relevant disciplines grouped in this way. The human sciences do not comprise mainstream academic disciplines; they are rather an interdisciplinary space for reflection on the “man” who is the subject of more mainstream scientific knowledge, now taken as an object. This space sits between the more conventional areas and naturally associates with disciplines such as anthropology, history, and, indeed, philosophy. Disciplines identified as “human sciences” include psychology, sociology, and the history of culture.

The mainstay of the book is not concerned with this narrow area, however, but with its pre-history, in the sense of the academic discourses which preceded its very existence. In dealing with these, Foucault employs a method which is certainly similar to that of his earlier works, but is now more deliberate, namely the broad procedure of looking for what in the French philosophy of science are called “epistemic breaks.” Foucault does not use this phrase, which originated with Gaston Bachelard, but uses a resonant neologism, “episteme.” In using this term, Foucault refers to the stable ensemble of unspoken rules that governs knowledge, which is itself susceptible to historical breaks. The book tracks two major changes in the Western episteme, the first being at the beginning of the “Classical” age during the seventeenth century, and the second being at the beginning of a modern era at the turn of the nineteenth. Foucault does not concern himself here with why these shifts happen, only with what has happened. This, then, is the work that he calls “archaeology.” In the original preface to The History of Madness, Foucault describes what he is doing as the “archaeology” of madness. This notion, used here apparently off-handedly, becomes the name of Foucault’s research project through the 1960s. In The Birth of the Clinic, Foucault again uses the word “archaeology” only once, but this time in the subtitle itself. Only with The Order of Things is archaeology formulated as a methodology.

In The Order of Things, Foucault is concerned only to analyze the transformations in discourse as such, with no consideration of the concrete institutional context. The consideration of that context is now put aside until the 1970s. He shows that in each of the disciplines he looks at, the precursors of the contemporary disciplines of biology, economics, and linguistics, the same general transformations occur at roughly the same time, encompassing myriad changes at a local level that might not seem connected to one another.

Before the Classical age, Foucault argues, Western knowledge was a rather disorganized mass of different kinds of knowledge (superstitious, religious, philosophical), with the work of science being to note all kinds of resemblances between things. With the advent of the Classical age, clear distinctions between academic disciplines emerge, part of a general enthusiasm for categorizing information. The aim at this stage is for a total, definitive cataloguing and categorization of what can be observed. Science is concerned with superficial visibles, not looking for anything deeper. Language is understood as simply transparently representing things, such that the only concern with language is the work of clarification. For the first time, however, there is an appreciation of the reflexive role of subjects in the enquiry they are conducting – the scientist is himself an object for enquiry, an individual conceived simultaneously as both subject and object. Then, from the beginning of the nineteenth century, a new attention to language emerges, and the search begins for precisely what is hidden from our view, hidden logics behind what we can see. To this tendency belong theories as diverse as the dialectical view of history, psychoanalysis, and Darwinian evolution. Foucault criticizes all such thought as involving a division between what is “the Same” and what is other, with the latter usually excluded from scientific inquiry, focusing all the time on “man” as a privileged object of inquiry. Foucault ultimately argues, however, that there are signs of the end of “man” as an object of knowledge, as our thought, in the shape of the “counter-sciences” of psychoanalysis and ethnology, plumbs areas beyond what can be understood in terms of the concept of “man.” One sees, again, the valorization here of mad writers, such as Roussel and Nietzsche: the historico-philosophical thesis of The History of Madness, and its project of the recuperation of madness, is here inscribed in terms of the production of knowledge.

e. The Archaeology of Knowledge

Foucault followed The Order of Things with The Archaeology of Knowledge, which was published in 1969. In this work, Foucault tries to consolidate the method of archaeology: it is the only one of Foucault’s major works that does not comprise an historical study, and thus his most theoretical work. It is the work of Foucault’s that has been most influential in literary criticism and some other applied areas.

Archaeology, Foucault now declares, means approaching language in a way that does not refer to a subject who transcends it – though he acknowledges he has not been rigorous enough in this respect in the past. That is not to say that Foucault is making a strong metaphysical claim about subjectivity, but rather only that he is proposing a mode of analysis that subordinates the role of the subject. Foucault in fact proposes to suspend acceptance not only of the notion of a subject who produces discourse but of all generally accepted discursive unities, such as the book. Instead, he wants to look only at the surface level of what is said, rather than to try to interpret language in terms of what stands behind it, be that hidden meaning, structures, or subjects. Foucault’s suggestion is to look at language in terms of discrete linguistic events, which he calls “statements,” so as to understand the multitudinous ways in which statements relate to one another. Foucault’s statement is not defined by content (a statement is not a proposition), nor by its simple materiality (the sounds made, the marks on paper). The specificity of a statement is rather determined both by such intrinsic properties and by its extrinsic relations, by context as well as content.

Foucault asserts the autonomy of discourse, that language has a power that cannot be reduced to other things, either to the will of a speaking subject, or to economic and social forces, for example. This is not to say that statements exist independently of extra-linguistic reality, however, or of larger “discursive formations” in which they occur. It is rather the opposite. Both these things in effect need to be factored into analyses of statements – the identity of the statement is conditioned both by its relation to other statements, to discourse as such, and to reality, as well as by its intrinsic form. The statement is governed by a “system of its functioning,” which Foucault calls the “archive.” Archaeology is now interpreted as the excavation of the archive. This of course retroactively includes much of what Foucault has been doing all along.

Foucault followed this work with his celebrated 1969 essay, “What is an Author?” (somewhat confusingly, many versions of this essay circulate, including multiple translations of the original as well as a version Foucault himself delivered in English many years later), which effectively concludes the series of Foucault’s writings on literature in the 1960s. This work represents an extension in literary theory of the impulse behind the Archaeology, with Foucault systematically criticizing the notion of an author, and suggesting that we can move beyond ascribing transcendent sovereignty to the subject in our understanding of discourse, understanding the subject rather as a function of discourse.

4. Genealogy

The period after May 1968 saw considerable social upheaval in France, particularly in the universities, where the revolt of that month had begun. Foucault, returning to this atmosphere from a Tunis that was also in political ferment, was politicized.

His work quickly reflected his new engagement (the Archaeology was completed early in 1968, though published the next year). His inaugural lecture at the Collège de France in 1970, published in French as The Order of Discourse (L’ordre du discours – it is available in diverse anthologized English translations under various titles, including as an appendix to the American edition of The Archaeology of Knowledge), represented an attempt to move the analysis of discourse that had preoccupied him through the 1960s onto a more political terrain, asking questions now about the institutional production of discourse. Here, Foucault announces a new project, which he designates “genealogy,” though Foucault never repudiates the archaeological method as such.

“Genealogy” implies doing what Foucault calls the “history of the present.” A genealogy is an explanation of where we have come from: while Foucault’s genealogies stop well before the present, their purpose is to tell us how our current situation originated, and they are motivated by contemporary concerns. Of course, one may argue that all history has these features, but with genealogy this is intended rather than a matter of unavoidable bias. Some of Foucault’s archaeologies can be said to have had similar features, but their purpose was to look at epistemic shifts discretely, in themselves, without insisting on this practical relevance. The word “genealogy” is drawn directly from Nietzsche’s Genealogy of Morals: genealogy is a Nietzschean form of history, though rather more meticulously historical than anything Nietzsche ever attempted.

a. Discipline and Punish

In the early 1970s, Foucault’s involvement with the prisoners’ movement led him to lecture two years running on prisons at the Collège de France, which resulted in his 1975 book, Discipline and Punish: The Birth of the Prison. The subtitle here references The Birth of the Clinic, indicating some continuity of project; both titles in turn of course reference Nietzsche’s Birth of Tragedy.

Discipline and Punish is a book about the emergence of the prison system. The conclusion of the book in relation to this subject matter is that the prison is an institution, the objective purpose of which is to produce criminality and recidivism. The system incorporates the movement calling for prison reform as an integral and permanent part of itself. This thesis is somewhat obscured by a particular figure from the book that has garnered much more attention, namely Jeremy Bentham’s “panopticon,” a design for a prison in which every prisoner’s every action was visible, which greatly influenced nineteenth-century penal architecture, and indeed institutional architecture more generally, up to the level of city planning. Though Foucault is often presented as a theorist of “panopticism,” this is not the central claim of the book.

The more important general theme of the book is that of “discipline” in the penal sense, a specific historical form of power that was taken up by the state with professional soldiering in the seventeenth century, and spread widely across society, first via the panoptic prison, then via the division of labor in the factory and universal education. The purpose of discipline is to produce “docile bodies,” the individual movements of which can be controlled, which in turn involves the psychological monitoring and control of individuals, and which for Foucault produces individuals as such.

b. The Will to Knowledge

Foucault indeed focused on the concept of power so much that he remarked that he had produced analyses of power relations rather than the genealogies he had intended. Foucault began talking about power as soon as he began to do genealogy, in The Order of Discourse. In Discipline and Punish he develops a notion of “power-knowledge,” recombining the analysis of the epistemic with analysis of the political. Knowledge is now for Foucault incomprehensible apart from power, although he continues to insist on the relative autonomy of discourse. He introduces the notion of power-knowledge precisely as a replacement for the Marxist notion of ideology, in which knowledge is seen as distorted by class power: for Foucault, there is no pure knowledge apart from power, but knowledge also has real and irreducible importance for power.

Foucault sketches a notion of power in Discipline and Punish, but his conception of power is primarily expounded only in a work published the following year, in 1976: the first volume of his History of Sexuality, titled The Will to Knowledge. The latter title is a reference to Nietzsche’s Will to Power (the original French title is retained by the current Penguin English edition – the English translation published in America, however, is titled simply The History of Sexuality: An Introduction).

The Will to Knowledge is an extraordinarily influential work, perhaps Foucault’s most influential. The central thesis of the book is that, contrary to popular perceptions that we are sexually repressed, the entire notion of sexual repression is part and parcel of a general imperative for us to talk about sex like never before: the production of behavior is represented simply as the liberation of innate tendencies.

The problem, says Foucault, is that we have a negative conception of power, which leads us only to call power that which prohibits, while the production of behavior is not problematized at all. Foucault claims that all previous political theory has found itself stuck in a view of power propagated in connection with absolute monarchy, and that our political thought has not caught up with the French Revolution, hence there is today a need “to cut off the head of the king” in political theory. Foucault’s point is that we imagine power as being a thing that can be possessed by individuals, as organized pyramidally, with one person at the apex, operating via negative sanctions. Foucault argues that power is in fact more amorphous and autonomous than this, and essentially relational. That is, power consists primarily not of something a person has, but rather is a matter of what people do, subsisting in our interactions with one another in the first instance. As such, power is completely ubiquitous throughout social networks. Moreover, people, to put it crudely, are as much products of power as they are wielders of it. Power thus has a relative autonomy apropos of people, just as they do apropos of it: power has its own strategic logics, emerging from the actions of people within a network of power relations. The carceral system and the device of sexuality are two prime examples of such strategies of power: they are not constructed deliberately by anyone or even by any class, but rather emerge of their own accord.

This leads Foucault to an analysis of the specific historical dynamics of power. He introduces the concept of “biopower,” which combines disciplinary power, as discussed in Discipline and Punish, with a “biopolitics” that invests people’s lives at a biological level, “making” us live according to norms, in order to regulate humanity at the level of the population, while keeping in reserve the bloody sword of “thanatopolitics,” now exaggerated into an industrial warfare that kills millions. This specific historical thesis is dealt with in more detail in the article Foucault and Feminism, in the first section. Foucault’s concerns with sexuality, bodies, and norms form a potent mix that has, via the work of Judith Butler in particular, been one of the main influences on contemporary feminist thought, as well as being influential in diverse areas of the humanities and social sciences.

c. Lecture Series

After his lectures on prisons, Foucault for two years returned to the old theme of institutional psychiatry in work that effectively provides a bridge between the theme (and theory) of the genealogy of prisons, and that of sexuality. The first of these, Psychiatric Power, is a genealogical sequel to The History of Madness. The second, Abnormal, is closer to The Will to Knowledge: as its title suggests, it is concerned with the production of norms, though again not straying far from the psychiatric context. The following year, 1976, Foucault lectured on the genealogy of racism in Society Must Be Defended, which provides a useful companion to The Will to Knowledge, and contains perhaps the clearest exposition of Foucault’s thoughts on biopower. The publication of these lecture series, and, a fortiori, of the lecture series that were given in the eight years between the publication of The Will to Knowledge and the deathbed publication of the next volumes of The History of Sexuality, is transforming our picture of Foucault’s later thought.

5. Governmentality

The notion of biopolitics, as the regulation of populations, brought Foucault’s thinking to the question of the state. Foucault’s work on power had generally been a matter of minimizing the importance of the state in the network of power relations, but now he started to ask about it specifically, via a genealogy of “government,” first in Security, Territory, Population, and then in his genealogy of neoliberalism, The Birth of Biopolitics. Foucault here coins the term “governmentality,” which has a rather shifting meaning.

The function of the notion of governmentality is to throw the focus of thinking about contemporary societies onto government as such, as a technique, rather than to focus on the state or the economy. Well before the publication of these lecture series in recent years, one of these lectures from Security, Territory, Population, dealing with this concept and published in English as “Governmentality,” had already become the basis for what is effectively an entire school of sociology and political theory.

This notion of government takes Foucault’s researches on biopower and puts them on a more human plane, in a tendential move away from the bracketing of subjectivity that had marked Foucault’s approach up to that point. The notion of government for Foucault, like that of power, straddles a gap between the statecraft that is ordinarily called “government” today, and personal conduct, so-called “government of the self.” The two are closely related inasmuch as, in a rather Aristotelian way, governing others depends on one’s relation to oneself. This thematic indeed takes Foucault in precisely the direction of Ancient Greek ethics.

6. Ethics

Foucault’s final years lecturing at the Collège de France, the early 1980s, saw his attention move from modern reflections on government, first to Christian thought, and then to Ancient thought. Foucault is here following the genealogy of government, but there are other factors at work. Another reason for this trajectory is the History of Sexuality project, for which Foucault found it necessary to move further and further back in time to trace the roots of contemporary thinking about sex. However, one might ask why Foucault never found it necessary to do this with any other area, for example madness, where doubtless the roots could have been traced further back. A further reason for this turn at this time was a changed climate in French academe, where, the political militancy of the seventies in abeyance, there was a general “turn to ethics.”

The ultimate output of this period was the second and third volumes of the History of Sexuality, written and published at the same time, and constituting in effect a single intellectual effort. These volumes deal with Ancient sex literature, Greek and then Roman. They lack great theoretical conclusions like those of the first volume. They are patient studies of primary texts, and ones that are further from the present, both in the sense of dealing with more chronologically remote material, and in the sense of their relevance to our present-day concerns, than any others Foucault ever made. The relevance of the historical analysis is particularly unclear due to the absence of the fourth volume of the History of Sexuality. It was partially drafted but far from complete, and hence is unpublished and likely to remain so. Dealing with the Christian part of the history of sexuality, it would serve to link the second and third volumes to the first.

The extant volumes chart the changes that occurred within Ancient thinking about sex, between Greek and Roman thinking. There are certainly significant changes over the thousand years of Ancient writing about sex – an increasing attention to individuals, for example – but for the purposes of the present it is the general differences between Ancient and modern attitudes that are more instructive. For the Ancients, sex was consistently a relatively minor ethical concern, simply one of many concerns relevant to diet and health more generally.

What Foucault got from studying this material, which he discussed in relation to the present primarily elsewhere than in these two books, is the notion of an ethics concerned with one’s relation to one’s self. Self-constitution is the overarching problematic of Foucault’s research in his final years. This “care for the self” Foucault manifestly finds attractive, though he is scathing about the precise modality it takes in patriarchal Ancient society, and he expresses some wish to resurrect such an ethics today, though he demurs on the question of whether such a resurrection is really possible. Thus, the point for Foucault is not to expound an ethics; it is rather the new analytical possibilities of focusing on subjectivity itself, rather than bracketing it as Foucault had tended to do previously. Foucault becomes increasingly interested in the way subjectivity is constituted precisely by the way in which subjects produce themselves via a relation to truth. Foucault now proclaims that his work was always about subjectivity. The dry investigations of the 1960s, while explicitly concerned with truth, were always about the way in which “the human subject fits into certain games of truth.”

7. References and Further Reading

Below is a list of English translations of works by Foucault that are named above, in the order they were originally written. The shorter writings and interviews of Foucault are also of extraordinary interest, particularly to philosophers. In French, these have been published in an almost complete collection, Dits et écrits, by Gallimard, in both a four-volume and a two-volume edition. In English, however, Foucault’s shorter works are spread across many overlapping anthologies, which even between them omit much that is important. The three-volume Essential Works series of anthologies, published by Penguin and the New Press, and edited by Paul Rabinow (vol. 1 Ethics, vol. 2 Aesthetics, vol. 3 Power), is the closest to a comprehensive collection in English, although the most compendious single-volume anthology is Foucault Live (ed. Sylvère Lotringer, New York: Semiotext(e), 1996).

a. Primary

  • Mental Illness and Psychology. Berkeley: University of California Press, 1987.
  • The History of Madness. London: Routledge, 2006.
  • Death and the Labyrinth. London: Continuum, 2004.
  • The Birth of the Clinic. London: Routledge, 1989.
  • The Order of Things. London: Tavistock, 1970.
  • The Archaeology of Knowledge. New York: Pantheon, 1972.
  • Psychiatric Power. New York: Palgrave Macmillan, 2006.
  • Discipline and Punish. London: Allen Lane, 1977.
  • Abnormal. London: Verso, 2003.
  • Society Must Be Defended. New York: Picador, 2003.
  • An Introduction. Vol. 1 of The History of Sexuality. New York: Pantheon, 1978. Reprinted as The Will to Knowledge, London: Penguin, 1998.
  • Security, Territory, Population. New York: Picador, 2009.
  • The Birth of Biopolitics. New York: Picador, 2010.

b. Secondary

  • Timothy J. Armstrong (ed.). Michel Foucault: Philosopher. Hemel Hempstead: Harvester Wheatsheaf, 1992.
    • A particularly good collection of papers on Foucault from his contemporaries.
  • Gilles Deleuze. Foucault. Trans. Seán Hand. London: Athlone, 1988.
    • The best book about Foucault’s work, from one who knew him, though predictably idiosyncratic.
  • Gary Gutting. Michel Foucault’s Archaeology of Scientific Reason. Cambridge: Cambridge University Press, 1989.
    • The definitive volume on Foucault’s archaeological period, and on Foucault and the philosophy of science.
  • Gary Gutting (ed.). Cambridge Companion to Foucault. Cambridge: Cambridge University Press, 1994.
    • Brilliant and comprehensive introductory essays on aspects of Foucault’s thought.
  • David Couzens Hoy (ed.). Foucault: A Critical Reader. Oxford: Blackwell, 1986.
  • Mark G. E. Kelly. The Political Philosophy of Michel Foucault. New York: Routledge, 2009.
    • For the political aspect of Foucault’s thought, from a philosophical perspective.
  • David Macey. The Lives of Michel Foucault. London: Hutchison, 1993.
    • This is the most comprehensive and most sober of the available biographies of Foucault.
  • David Macey. Michel Foucault. London: Reaktion Books, 2004.
    • A readable, abbreviated biography of Foucault.
  • Michael Mahon. Foucault’s Nietzschean Genealogy: Truth, Power, and the Subject. Albany: SUNY Press, 1992.
    • A pointedly philosophical work on the influence of Nietzsche on Foucault.
  • Jeremy Moss (ed.). The Later Foucault. London: Sage, 1998.
    • On Foucault’s late work.
  • Barry Smart (ed.). Michel Foucault: Critical Assessments (multi-volume). London: Routledge, 1995.

Author Information

Mark Kelly
Email: m.kelly@mdx.ac.uk
Middlesex University
United Kingdom

Diogenes of Apollonia (5th cn. B.C.E.)

Diogenes of Apollonia is often considered to be the last of the Presocratic Greek philosophers, although it is more than likely that Democritus was still active after the death of Diogenes. Diogenes’ main importance in the history of philosophy is that he synthesized the earlier Ionic monism of Anaximenes and Heraclitus with the pluralism of Empedocles and Anaxagoras. Diogenes serves as a sort of culminating point for Presocratic philosophy, uniting its differing tendencies toward emphasizing the absolute indivisibility or identity of reality with the equally absolute multiplicity of differing beings. Just as for Heraclitus, the truth for Diogenes was that one self-identical thing is all different things. By abiding by the Presocratic natural law that out of nothing comes nothing and into nothing, nothing goes, Diogenes proposed a definition of nature that identified it with life and explicitly affirmed that it is generated from itself. Diogenes’ main idea was that nature, the entire universe, is an indivisibly infinite, eternally living, and continuously moving substance he called, following Anaximenes, air. All the natural changes occurring throughout the universe—the various forms, the incalculable multiplicity the singular being takes—are one substance, air, under various modes. Air is also intelligent. Indeed, air is intelligence, or noesis in the Ancient Greek. Noesis is the purely intuitive, rational thinking that expresses and sustains all cosmic processes. As the self-causal power of rational, intuitive intelligence, air is also a god. When air is defined solely as an atmospheric condition, as it is today, and in relation to the three other main elements, namely, fire, water, and earth, Diogenes’ air becomes the soul of singular beings. The soul is the source of every living thing’s sensitive ability to live, know, and thus also affect and be affected by other singular beings. The soul is also the way the absolute cosmic air identifies itself through a number of living differentiations as the means by which living creatures exhibit their differing degrees of temperature and density. Through the soul, air is sometimes rarer or more condensed, and likewise sometimes hotter or cooler. The soul is the life-principle that, when mixed with and operating through other aerated forms like blood and veins, allows for the living functions of all singular beings to remain self-sustaining until the necessary process of decomposition affects them. Such decomposition, however, is just another means for nature’s processes to continue to function insofar as each decomposed being is the simultaneous site for the next modification that air will engender and express through itself. Ultimately, for Diogenes, the essence of all reality, identified as intelligent and divine air, is that it is both nature and life, as nature and life are identical as one absolute substance.

Table of Contents

  1. Life and Work
  2. Substance Monism
  3. Air
  4. Intelligence and Divinity
  5. Cosmology and Physiology
  6. Influence and Historical Role
  7. References and Further Reading

1. Life and Work

The exact chronology of the life of Diogenes of Apollonia is unknown, but most accounts place the date of his acme somewhere around 460-430 BCE. It was once believed that he was from the Cretan city of Apollonia, but it is now thought that the Apollonia of which he was a citizen was the Milesian colony on the Pontus that was actually founded by the Presocratic philosopher Anaximander, and which is today the Bulgarian Black Sea resort town of Sozopol. It is also thought Diogenes lived for some time in Athens and that while there, he became so unpopular (being thought an atheist) that his life was in danger. Further evidence of Diogenes’ probable residence in Athens is the parody we find of him in Aristophanes’ The Clouds, even though it is Socrates who is portrayed as holding Diogenes’ views. Diogenes Laertius writes, “Diogenes, son of Apollothemis, an Apolloniate, a physicist and a man of exceptional repute. He was a pupil of Anaximenes, as Antisthenes says. His period was that of Anaxagoras” (IX, 57). Theophrastus also mentions that Diogenes of Apollonia was ‘almost the youngest’ of the physical philosophers. It has been persuasively argued that Diogenes Laertius was more than likely confused when he wrote that Diogenes of Apollonia was a pupil of Anaximenes, given the chronology and geography on which most commentators agree. Diogenes did, like Anaximenes, hold that the fundamental substance of nature is air, but it is highly unlikely that he could have studied with him. On the other hand, the view that Diogenes flourished in roughly the same period as Anaxagoras is uncontroversial.

There has been much debate over whether Diogenes wrote a single book or even as many as four. Only fragments of Diogenes’ work survive. Most of the fragments we have come from Simplicius’ commentaries on Aristotle’s Physics and On the Heavens. Simplicius writes,

Since the generality of enquirers say that Diogenes of Apollonia made air the primary element, similarly to Anaximenes, while Nicolaus in his theological investigation relates that Diogenes declared the material principle to be between fire and air…, it must be realized that several books were written by this Diogenes (as he himself mentioned in On Nature, where he says that he had spoken also against the physicists—whom he calls ‘sophists’—and written a Meteorology, in which he also says he spoke about the material principle, as well as On the Nature of Man); in the On Nature, at least, which alone of his works came into my hands, he proposes a manifold demonstration that in the material principle posited by him is much intelligence. (Kirk, Raven, and Schofield: 1983, 435)

The debate is over whether On Nature is the one book that Diogenes wrote and which covered many different yet nevertheless interrelated topics (such as man, meteorology, and the Sophists), or whether On Nature, On the Nature of Man, Meteorologia, and Against the Sophists were four separate works. Diels, the early German collator of the Presocratic fragments, preferred the former option (DK 64B9), while commentators like Burnet (EGP 353) preferred the latter view. It is also entirely possible that Simplicius was either confused or misinformed in his reading of Diogenes, because the quotations of Diogenes’ work that he himself provides contain discussions of, for example, the nature of man, which should have been impossible if he indeed had only a copy of On Nature in his possession. At the same time, we have evidence from a work of the medical author Galen that a certain Diogenes wrote a treatise that dealt with a number of diseases and their causes and remedies. It is probable that this was Diogenes of Apollonia because we have other reports from Galen (and Theophrastus) that Diogenes held views about diagnosing a patient by analyzing his tongue and general complexion. This evidence, along with his discussions regarding anatomy and the function of veins, leads to the probability that Diogenes was a professional doctor of some sort who could have produced a technical medical treatise. Another interesting piece of evidence that suggests Diogenes could have been a doctor is the methodological claim he makes regarding his own form of writing, which sounds very similar to what is said in the beginning of some of the more philosophical works in the Hippocratic corpus. Diogenes Laertius says that this was the first line of Diogenes’ book: “It is my opinion that the author, at the beginning of any account, should make his principle or starting-point indisputable, and his explanation simple and dignified” (Fr. 1). Such a no-nonsense approach to writing was often championed by the early medical thinkers.

2. Substance Monism

Following his own recommendation that an author should clearly state his purpose up front, Diogenes began his account of nature by explicitly establishing his principle, or starting-point.  He writes:

My opinion, in sum, is that all existing things are differentiated from the same thing, and are the same thing. And this is manifest: for if the things that exist at present in this world-order—earth and water and air and fire and all the other things apparent in this world-order—if any of these were different from the other (different, that is, in its own proper nature), and did not retain an essential identity while undergoing many changes and differentiations, it would be in no way possible for them to mix with each other, or for one to help or harm the other, or for a growing plant to grow out of the earth or for a living creature or anything else to come into being, unless they were so composed as to be the same thing.  But all these things, being differentiated from the same thing, become different kinds at different times and return into the same thing. (Fr. 2)

Diogenes was what we today call a ‘substance monist’. Substance monism is the idea that everything is one thing. In other words, it means that all putatively different things essentially are one self-identical thing. Substance monism is an answer to the question, ‘What is, and how many are there?’ According to Diogenes, for anything to be it must paradoxically be both identical to and different from the one, the thing that is – the one substance that is everything. The differences of things from the one thing that is, however, are never ‘proper,’ as Diogenes argues. That is to say, the differences of things are never substantial, but rather they are only adjectival differences.

Now, while we do not find the term ‘substance’ in the fragments we have of Diogenes’ writing, the idea of a substance, and, moreover, the idea of substance monism, can help us understand what Diogenes meant when he said ‘all existing things are differentiated from the same thing, and are the same thing.’ A substance is what a thing is. It is the basic being of a thing; the essential reality a thing has to have in order for it to be what it is. Things are substances if they essentially are the things they are. The essence of a substance is its own existence. This line of argument was common to all the Presocratics because for them it was a natural law that out of nothing came nothing and into nothing, nothing went. To truly be, something had to be the essential source or cause of its own existence. Reality or being, therefore, for most of the Presocratics, and especially for Diogenes, is absolutely immanent to itself, and so all the differences there are in nature inhere in, or are internal to, it. This line of reasoning was an early version of what was to become the ontological argument. A substance is a thing that exists because that is what it is: a thing that exists, a thing that exists on the basis of its own immanent self-sufficiency.

Diogenes was concerned with understanding what it is that makes a thing be what it is, what a thing’s substantial being is, and how many of these things or substances there really are. He wanted to know what makes a thing substantial. To understand what things are, what makes things be what they are, and how many of them there are, Diogenes simply observed both what he himself was composed of and what the primary qualities of everything he had ever experienced and thus thought about were. Like all the Presocratic philosophers, Diogenes observed above all that all things are natural or physical. Diogenes observed that all things of this ‘present world-order’ are natural or physical elemental qualities such as earth, water, fire, and air. The observation that all things are natural or physical also implied that all things change, and that everything is moving in some degree, both growing and decaying, composing and decomposing, and speeding up and slowing down. For Diogenes, then, all things are physical and moving, for they are all natural and living. Therefore, the one self-identical substance that is in essence all different things is nature itself, which is the mobile, living, and absolutely physical identity of the universe. Furthermore, all the different things nature expresses of itself, or modifies via itself, are variable forms of earth, water, fire, and air, which compose and decompose with each other in many ways as nature lives and moves. The elemental qualities of nature differ from each other only in degree and are in essence simply a variety of ways in which nature is identical to itself.

The observation that all things are physical, mobile, and different only in elemental degrees led Diogenes to note that if this is indeed the case then all things must be interrelated in some way. Relations, however, seem to demand some form of proper or substantial difference in order to occur. Diogenes was troubled by the apparent demands of proper duality implied by the living and flowing relations he observed as occurring throughout all of nature. The problem he had was that if all the things he observed relating throughout nature were really different from each other, then there was nothing in them or about them that made such relations even possible in the first place (for how could things truly relate that are really different from each other?). Even more threateningly, everything he perceived as expressing a certain substantial identity would then be utterly deceptive and false. In response to this dilemma, he noticed that if things relate in some degree, as they certainly seem to, there must be at least something they share, something in common between them that enables them to relate. That it is manifestly clear that things relate allowed Diogenes to assert the equally indubitable fact that there must be something between them that they all share which allows them to relate. If things were so different from each other that either they could not relate at all or that their relations brought about only their total fragmentation or annihilation, nothing in nature could grow or move or become in any way, which would be radically contrary to what he observed happening in nature. For this reason, Diogenes posited that there must be some one thing, some self-identical substance that allows all the naturally different things to interact, relate, and compose and decompose with each other. Without a fundamental substance implicitly and inherently linking all things together, nothing would have a common ground to share and work upon or a situational medium through which to change and grow. Therefore, there must be a thing that makes all things relatable, a thing that allows all things to be different from each other to some degree, yet still be connected enough to each other to allow them to interact and compose and decompose with each other. This thing, for Diogenes, was going to have to be everywhere, all the time, because there was nowhere at any time that he did not observe natural bodies moving, growing, and relating.

Substance monism, therefore, served not only to explain the absolute immanence and essential self-identity of nature to itself, but also to explain how all the kinds of living, growing, and interacting of singular beings occur throughout nature. By sharing the common substance they all modify, all the different things of nature, all the elemental and formal means of composing and decomposing, could relate, interact, and help and harm each other through the infinite and eternal process of natural or physical growth and decay. In other words, for Diogenes and his kind of substance monism, being is becoming, nature is nurturing, and all forms of movement, work, creation, destruction, and causality are so many ways one self-identical substance naturally lives the life of all its self-differentiated forms. For Diogenes, substance monism entails that nature is life and that, in essence, the universe lives. One absolutely physical identity underwrites all the apparent diversity.

3. Air

Diogenes’ substance monism may seem radically opposed to what we believe today, especially with respect to our definitions of nature and life. Yet, even in Diogenes’ own time, his thinking was considered to be as peculiar and eclectic as that of many of the other Presocratics. Presocratic philosophy was often considered, in its own time and even today, to be neither religious nor scientific, but rather idiosyncratic and esoteric because of its emphasis on achieving the experience of a direct and immediate intuition of the essence of nature. Such an intuition defines the rarity and excellence of Presocratic wisdom. Like other Presocratics, Diogenes was a sage-like independent spirit who neither followed nor founded a school and who made use of the best elements of other philosophies that he thought worthy of greater elaboration and that could yield him the wisdom he sought and loved. One such philosopher he borrowed from, as we mentioned, was Anaximenes. Like Anaximenes, Diogenes maintained that air is the one substance of which everything is made and of which everything is a mode. In his Refutation of All Heresies, Hippolytus reports,

Anaximenes…said that infinite air was the principle, from which the things that are becoming, and that are, and that shall be, and gods and things divine, all come into being, and the rest from its products. The form of air is of this kind: whenever it is most equable, it is invisible to sight, but is revealed by the cold and the hot and the damp and by movement. It is always in motion; for things that change do not change unless there be movement. Through becoming denser or finer it has different appearances; for when it is dissolved into what is finer it becomes fire, while winds, again, are air that is becoming condensed, and cloud is produced from air by felting. When it is condensed still more, water is produced; with a further degree of condensation earth is produced, and when condensed as far as possible, stones. The result is that the most influential components of generation are opposites, hot and cold. (Kirk, Raven, and Schofield: 1983, 145)

Diogenes agreed with Anaximenes and proposed that air is the one substance that is reality. Following Anaximenes, Diogenes argued that air is the essential identity of all different things and that all different things are just so many forms of condensed or rarefied air. Nature, as air, is an infinite and eternal process that, through its indivisible mobility and continuity, constantly becomes all the ways it comes to be and passes away through an absolute multiplicity of singular beings. All different things are momentarily denser or finer forms or modes of one ubiquitous air. Through Simplicius, Theophrastus tells us,

Diogenes the Apolloniate, almost the youngest of those who occupied themselves with these matters (that is, physical studies), wrote for the most part in an eclectic fashion, following Anaxagoras in some things and Leucippus in others. He, too, says that the substance of the universe is infinite and eternal air, from which, when it is condensed and rarefied and changed in its dispositions, the form of other things comes into being. This is what Theophrastus relates about Diogenes; and the book of Diogenes which has reached me, entitled On Nature, clearly says that air is that from which all the rest come into being. (Fr. 2)

Now, there is for us something obviously problematic about Diogenes’ thinking regarding air. The problem we have with trying to reconcile Diogenes’ thinking with what we know today is figuring out how ‘air’ can still be an absolutely cosmic, indivisibly infinite, and eternally living substance when it is limited to only the earth’s atmosphere. We understand air today to be reducible to other properties. To approach this problem, we must first understand what Diogenes meant by the term in question. Aer in Ancient Greek was rooted in the verb ‘to blow, or breathe,’ and the term often denoted a certain sense of loftiness and light, spirited movement. Aer was also associated with the wind, the sky, and brightness. What Diogenes meant by air was the celerity and rapidity of the light and fluid movement of nature’s waxing and waning, its constant condensing and rarefying, its expanding and contracting. Air, for Diogenes, is the gaseous fluidity of all living and natural phenomena. It is important to understand that by ‘air’ Diogenes did not intend the grand total of all the substantially distinct atoms of oxygen, nitrogen, argon and so on that compose our atmosphere, but rather the simple fact that all things are natural, living, and moving. Air, for Diogenes, was both the constant stirring of the atmosphere as a singular elemental formation, and also all the ‘inhalations’ and ‘exhalations’ of the planetary and celestial movements. Air expresses the becoming of being, the living of nature. A mobile movement, a movement conceived not as the attribute or property of an immobile substance, but rather as a substance itself, movement itself conceived as substance, is what Diogenes understood by air. Air is the indivisible body that is the universe, all that is: “this very thing [air] is both eternal and immortal body, but of the rest some come into being, some pass away” (Fr. 7). And of the rest that come into being and pass away, they are all ways air modifies itself. Atmospheric air is, therefore, another way absolute, substantial air (aer) becomes and expresses itself.

4. Intelligence and Divinity

Diogenes, moreover, says that air is intelligence. The Ancient Greek term for intelligence is noesis. Noesis is not just intelligence in the sense of being sharp or smart. What Diogenes designated by noesis was the active power of a mind to immediately intuit and know what it thinks. Noesis is not so much a belief held by a mind, as it is the activity of thinking itself that is a mind. A mind is an actively thinking thing. Now, we might be wondering how the absolute cosmic substance, air, could also have an immediately intuitive and active mind, that is, how it could also be a thinking thing. First, it is important to keep in mind that everything was physical for Diogenes. Thinking was a physical process for him that was not limited to only organisms with brains. (There will be more on this in the next section.) In other words, thinking did not solely mean cognition for Diogenes. Air is intelligence itself; pure thought intuitively thinking itself.  Just as all singular bodies are in air as modes or ways it modifies and transforms itself through condensation and rarefaction, so too are all minds, all intellects or intelligent beings, in air as modes or ideas through which it immediately intuits and thus thinks itself. If air is intelligence, or purely active thinking, and intelligence is thus the one indivisible body that imbues everything, then every singular body is also going to be imbued with mind. Second, Diogenes argued that intelligence was the power inherent to air with which it could absolutely and internally differentiate itself in a rational and measured fashion. We have already seen the four main elements of nature as an example of this rational and measured differentiation. Intelligence was for Diogenes a sufficient reason for all the differences of degree found throughout nature:

For, he [Diogenes] says, it would not be possible without intelligence for it [sc. the substance] so to be divided up that it has measures of all things—of winter and summer and night and day and rains and winds and fair weather. The other things, too, if one wishes to consider them, one would find disposed in the best possible way. (Fr. 3)

The intelligence and the soul, the thinking and the living of singular beings are modifications of substantial air-intelligence. Through the cessation of breathing, sensing, and knowing, living beings decompose and lose their intelligence, but only so there can be a simultaneous re-composition of air-intelligence elsewhere. Diogenes says, “Men and the other living creatures live by means of air, through breathing it. And this is for them both soul [that is, life principle] and intelligence, as will be clearly shown in this work; and if this is removed, then they die and intelligence fails.” (Fr. 7)

Diogenes also says that air is divine. Divinity designated natural power for the Presocratics, who also tended not to anthropomorphize their gods. Instead, a divinity for the first philosophers was more a natural force, usually an elemental power found permeating all of nature and imbuing it with all its creative and destructive power. Along with substance monism, pantheism—the idea that everything is divine, that God is all things—was an idea shared by many of the Presocratics. For Diogenes, substance monism definitely entailed pantheism. Air-intelligence is divine. Only a god could remain identical to itself while also rationally differentiating itself through an infinity of singular beings. Only a god as well could have the intuitive intelligence to actively and affirmatively know all the self-identical differentiations it expressed of itself. As Diogenes says, it is only nature conceived as an absolutely immanent and divine air-intelligence that could be “both great and strong and eternal and immortal and much-knowing” (Fr. 8). Diogenes summarized all these points wonderfully when he wrote:

And it seems to me that that which has intelligence is what men call air, and that all men are steered by this and that it has power over all things. For this very thing seems to me to be a god and to have reached everywhere and to dispose all things and to be in everything. And there is no single thing that does not have a share of this; but nothing has an equal share of it, one with another, but there are many fashions both of air itself and of intelligence. For it is many-fashioned, being hotter and colder and drier and moister and more stationary and more swiftly mobile, and many other differentiations are in it both of taste and of color unlimited in number. And yet of all living creatures the soul is the same, air that is warmer than the outside, in which we exist, but much cooler than that near the sun. But in none of living creatures is this warmth alike (since it is not even so in individual men); the difference is not great, but as much as still allows them to be similar. Yet it is not possible for anything to become truly alike, one to the other, of the things undergoing differentiation, without becoming the same. Because, then, the differentiation is many-fashioned, living creatures are many fashioned and many in number, resembling each other neither in form nor in way of life nor in intelligence, because of the number of differentiations. Nevertheless, they all live and see and hear by the same thing, and have the rest of their intelligence from the same thing. (Fr. 5)

5. Cosmology and Physiology

Singular beings are not only composed of air; they also live and have intelligence by breathing air. The soul or life principle of all things is an absolute and divine air-intelligence that, in a sense, breathes through itself in all the forms it takes on. Air is both eternal and omnipresent as it takes on an unlimited number of forms. Like many of the Presocratics, Diogenes provides an account of how air modifies itself through a variety of physical compositions ranging from galaxies and solar systems to respiratory, circulatory, and cognitive systems. Diogenes provides us with a cosmogony that explains the creation of the earth and sun on the basis of the condensation and rarefaction of air. In the pseudo-Plutarchean Stromateis, which Eusebius preserved, it is stated that:

Diogenes the Apolloniate premises that air is the element, and that all things are in motion and the worlds innumerable. He gives this account of cosmogony: the whole was in motion, and became rare in some places and dense in others; where the dense ran together centripetally it made the earth, and so the rest by the same method, while the lightest parts took the upper position and produced the sun. (Kirk, Raven, and Schofield: 1983, 445)

Diogenes also made some cosmological observations. He gave an interesting account of heavenly bodies that included an attempt to explain meteorites.

Diogenes says that the heavenly bodies are like pumice-stone, and he considers them as the breathing-holes of the world; and they are fiery. With the visible heavenly bodies are carried round invisible stones, which for this reason have no name: they often fall on the earth and are extinguished, like the stone star that made its fiery descent at Aegospotami. (Kirk, Raven, and Schofield: 1983, 445)

There are many similarities between Diogenes’ cosmogony and cosmology and those of his fellow Presocratics. First, he posits the existence of innumerable worlds, as do many other Presocratics. It makes sense that Diogenes asserts an immeasurable plurality of worlds because he places no restrictions on the number of differentiations and compositions air can take. Why wouldn’t there be a plethora of worlds littered throughout the universe insofar as worlds are, by definition, just momentary formations of the universe (air) anyway? Second, it is from Anaxagoras that Diogenes likely borrowed the idea of a noetic substance forming a vortex within itself. Third, it was common in the Ionic tradition to describe the origin of the earth as the formation of more concentrated and denser material in the center of such a vortex. Likewise, the rarer material would go to the extremes of the vortex, following the law that differentiation is a symmetrical process whereby like follows like. Lighter air, therefore, tends towards greater heights and extremities, while denser air tends to concentrate into relative core positions. With respect to astronomical objects, it seems Diogenes said heavenly bodies were like pumice-stone because pumice is both glowing and light, or ‘airy,’ and composed of translucent and very porous bubble walls, which are, once again, qualities that accommodate the substance that Diogenes countenances.

From extrasolar objects and the solar system down to the earth itself, Diogenes continues to explain all physical and psychological phenomena as so many self-modifying processes of one substantial air. Within and through the atmospheric air of our planet, Diogenes addresses the thinking and sensing of particular organisms. The law of like following like is as applicable on earth as it is throughout the cosmos. In Theophrastus’ De sensu, Diogenes is reported as having had a detailed theory of sensation and cognition based on the reception and circulation of air within and between singular beings. Each of the five senses is dealt with in terms of how it processes air. Degrees of intelligence or cognitive ability are also delineated by the amount and kind of air each being possesses. The differences between beings are defined by how swiftly, and with how much agility, they engender and circulate air. Some beings, for example, have more intelligence or more complex brain activity, while others have, say, a better sense of smell. All kinds of perception, however, are ways that air processes and modifies itself.

Diogenes attributes thinking and the senses, as also life, to air. Therefore he would seem to do so by the action of similars (for he says that there would be no action or being acted upon, unless all things were from one). The sense of smell is produced by the air round the brain. Hearing is produced whenever the air within the ears, being moved by the air outside, spreads toward the brain. Vision occurs when things are reflected on the pupil, and it, being mixed with the air within, produces a sensation. A proof of this is that, if there is an inflammation of the veins (that is, those in the eye), there is no mixture with the air within, nor vision, although the reflexion exists exactly as before. Taste occurs to the tongue by what is rare and gentle. About touch he gave no definition, either about its nature or its objects. But after this he attempts to say what is the cause of more accurate sensations, and what sort of objects they have. Smell is keenest for those who have least air in their heads, for it is mixed most quickly; and, in addition, if a man draws it in through a longer and narrower channel; for in this way it is more swiftly assessed. Therefore some living creatures are more perceptive of smell than are men; yet nevertheless, if the smell were symmetrical with the air, with regard to mixture, man would smell perfectly. (Kirk, Raven, and Schofield: 1983, 448)

It seems that for Diogenes correspondence in perception entails a matching-up of the degrees of air within the brain with air that is being received through the sensitive faculties. Sensation itself is the reception of air by air and so is a mixing of airs through the aerated blood channels that are themselves oxygenated through respiration. (Diogenes also attempted an anatomy of the veins.) Usually, the reception of air by air takes place in an organism as an agitation or irritation of the sense organs and thus also the brain. An accurate or adequate perception is one in which there is a mutually interpenetrating coalescence of finer air flows within, between, and amongst the parts of organisms and the finer air received through sensations. This entails that a certain kind of affective or sensitive openness, which can be regarded as a susceptibility to finer air, allows for greater perceptual correspondences with the other kinds of air-composites. Such affective openness implies that one must come to pursue or avoid interaction with other air-composites in accordance with how they increase or decrease one’s respiratory and cognitive abilities. The trick is to have sensitive correspondences serve the rationally differentiated regulatory systems that allow organisms to survive and persevere. Overall, Diogenes was one of the first thinkers to emphasize the relationship between sensation, respiration, and cognition.

Theophrastus continues in his report of Diogenes’ thinking regarding sensation and cognition. Pleasure and pain are also definable by the sensitive reception and circulation of air.

That the air within perceives, being a small portion of the god, is indicated by the fact that often, when we have our mind on other things, we neither see nor hear.  Pleasure and pain come about in this way: whenever air mixes in quantity with blood and lightens it, being in accordance with nature, and penetrates through the whole body, pleasure is produced; but whenever the air is present contrary to nature and does not mix, then the blood coagulates and becomes weaker and thicker, and pain is produced. Similarly, confidence and health and their opposites… Thought, as has been said, is caused by pure and dry air; for a moist emanation inhibits the intelligence; for this reason thought is diminished in sleep, drunkenness, and surfeit. That moisture removes intelligence is indicated by the fact that other living creatures are inferior in intellect, for they breathe the air from the earth and take to themselves moister sustenance. (Kirk, Raven, and Schofield: 1983, 448)

The key to cultivating a stronger intelligence, greater pleasures, and a good sense of taste (for the wise man is the sage, the sapiens, the one who tastes well) is to take in, breathe, and allow to permeate one’s organic structure the finer, lighter, drier, warmer, and swifter air. To breathe well is to live well. To stand erect, awake, warm-blooded, firm, and at attention is to manifest a stronger and more well-regulated and attuned disposition. Like Heraclitus, Diogenes advises that one must avoid excessive moistening. To become more god-like, more substantially identical with what one essentially is, one should actively, aggressively, and affirmatively seek out other aerated bodies of similar dispositions and compose well with them. Certain compositions lead to the reproduction of new organic forms. Since air is the vitality of its own natural and substantial existence, it will continuously reproduce itself through the distribution of its own aerated seeds. Indeed, air, understood as nature’s ubiquitous and eternal living, is constantly conceiving itself, impregnating and giving birth to its own various forms and gradients of denser or finer air.

Diogenes, it is worth mentioning, also had an interest in embryology. The self-conception of air takes place through the intermingling of aerated sperm and eggs. For Diogenes, life grows naturally and intelligently at all levels because of the aerated nature of blood and veins.

And in the continuation he shows that also the sperm of living creatures is aerated and acts of intelligence take place when the air, with the blood, gains possession of the whole body through the veins; in the course of which he gives an accurate anatomy of the veins. Now in this he clearly says that what men call air is the material principle. (Fr. 5)

6. Influence and Historical Role

The Eleatic philosophers were monists, believing that were there two things, we would have to say of one that it is not (the other). They thought, however, that one may not speak of what is not, as one would be speaking of nothing. The fact that there is only one thing in existence was thought to entail that change could not occur, as there would need to be two things to supply the relata required for a causal relation. Diogenes seems to have agreed with the monistic aspect of the Eleatic philosophy while attempting to accommodate the possibility of change. His move was to claim that the one thing might be a causa sui, and that the change we experience is the alteration thereof. The substance best suited to serve as this substrate was thought to be air, a choice reminiscent of Anaximenes. One also finds, arguably, the influence of Anaxagoras in the claim that this substance is intelligence or nous. Finally, it is worth noting that the idea that the universe is a living being is broached in Plato’s Timaeus. And the idea of substance monism has had other advocates in the history of philosophy, the most famous perhaps being Benedict Spinoza.

7. References and Further Reading

There are no monographs on Diogenes of Apollonia in English. Unfortunately, Diogenes has been given rather brief attention throughout the secondary literature. Diogenes is usually addressed in chapters in books on the Presocratics.

  • Barnes, Jonathan. The Presocratic Philosophers. London: Routledge & Kegan Paul (1 vol. ed.), 1982, 568-592.
  • Burnet, J. Early Greek Philosophy. London: Black (4th ed.), 1930.
  • Diels, H. “Leukippos und Diogenes von Apollonia.” RM 42, 1887, 1-14.
  • Diller, H. “Die philosophiegeschichtliche Stellung des Diogenes von Apollonia.” Hermes 76, 1941, 359-81.
  • Guthrie, W.K.C. The Presocratic Tradition from Parmenides to Democritus. Vol. II. Cambridge: Cambridge University Press, 1993, 362-381.
  • Huffmeier, F. “Teleologische Weltbetrachtung bei Diogenes von Apollonia.” Philologus 107, 1963, 131-38.
  • Jaeger, Werner. The Theology of the Early Greek Philosophers. Oxford: Oxford University Press, 1967, 155-171.
  • Kirk, G.S., J.E. Raven, and M. Schofield. The Presocratic Philosophers. 2nd edn. Cambridge: Cambridge University Press, 1983.
  • Laks, Andre. “Soul, Sensation, and Thought.” The Cambridge Companion to Early Greek Philosophy. Cambridge: Cambridge University Press, 1999, 250-270.
  • Laks, André. Diogène d’Apollonie. Lille: Presses Universitaires de Lille, 1983.
  • McKirahan, Richard D. Philosophy Before Socrates. Indianapolis: Hackett Publishing Company, 1994, 344-352.
  • Shaw, J. R. “A Note on the Anatomical and Philosophical Claims of Diogenes of Apollonia.” Apeiron 11.1, 1977, 53-7.
  • Warren, James. Presocratics. Berkeley: University of California Press, 2007, 175-181.

Author Information

Jason Dockstader
Email: jdock36@hotmail.com
University College Cork
Ireland

David Hume: Causation

David Hume (1711-1776) is one of the British Empiricists of the Early Modern period, along with John Locke and George Berkeley. Although the three advocate similar empirical standards for knowledge, that is, that there are no innate ideas and that all knowledge comes from experience, Hume is known for applying this standard rigorously to causation and necessity. Instead of taking the notion of causation for granted, Hume challenges us to consider what experience allows us to know about cause and effect.

Hume shows that experience does not tell us much. Of two events, A and B, we say that A causes B when the two always occur together, that is, are constantly conjoined. Whenever we find A, we also find B, and we have a certainty that this conjunction will continue to happen. Once we realize that “A must bring about B” is tantamount merely to “Due to their constant conjunction, we are psychologically certain that B will follow A”, then we are left with a very weak notion of necessity. This tenuous grasp on causal efficacy helps give rise to the Problem of Induction–that we are not reasonably justified in making any inductive inference about the world. Among Hume scholars it is a matter of debate how seriously Hume means us to take this conclusion and whether causation consists wholly in constant conjunction.

This article examines the empirical foundations that lead Hume to his account of causation before detailing his definitions of causation and how he uses these key insights to generate the Problem of Induction. After explicating these two main components of Hume’s notion of causation, three families of interpretation will be explored: the causal reductionist, who takes Hume’s definitions of causation as definitive; the causal skeptic, who takes Hume’s problem of induction as unsolved; and the causal realist, who introduces additional interpretive tools to avoid these conclusions and maintains that Hume has some robust notion of causation.

Table of Contents

  1. Causation’s Place in Hume’s Taxonomy
  2. Necessary Connections and Hume’s Two Definitions
  3. The Problem of Induction
  4. Causal Reductionism
  5. Causal Skepticism
  6. Causal Realism
  7. References and Further Reading
    1. A Note on Hume’s Works
    2. Hume’s Works on Causation
    3. Works in the History of Philosophy
    4. Contemporary Metaphysics of Causation

1. Causation’s Place in Hume’s Taxonomy

Hume’s most important contributions to the philosophy of causation are found in A Treatise of Human Nature, and An Enquiry concerning Human Understanding, the latter generally viewed as a partial recasting of the former. Both works start with Hume’s central empirical axiom known as the Copy Principle. Loosely, it states that all constituents of our thoughts come from experience. By learning Hume’s vocabulary, this can be restated more precisely. Hume calls the contents of the mind perceptions, which he divides into impressions and ideas. Though Hume himself is not strict about maintaining a concise distinction between the two, we may think of impressions as having their genesis in the senses, whereas ideas are products of the intellect. Impressions, which are either of sensation or reflection (memory), are more vivid than ideas. Hume’s Copy Principle therefore states that all our ideas are products of impressions.

At first glance, the Copy Principle may seem too rigid. To use Hume’s example, we can have an idea of a golden mountain without ever having seen one. But to proffer such examples as counter to the Copy Principle is to ignore the activities of the mind. The mind may combine ideas by relating them in certain ways. If we have the idea of gold and the idea of a mountain, we can combine them to arrive at the idea of a golden mountain. The Copy Principle only demands that, at bottom, the simplest constituent ideas that we relate come from impressions. This means that any complex idea can eventually be traced back to its constituent impressions.

In the Treatise, Hume identifies two ways that the mind associates ideas, via natural relations and via philosophical relations. Natural relations have a connecting principle such that the imagination naturally leads us from one idea to another. The three natural relations are resemblance, contiguity, and cause and effect. Of these, Hume tells us that causation is the most prevalent. But cause and effect is also one of the philosophical relations, where the relata have no connecting principle, instead being artificially juxtaposed by the mind. Of the philosophical relations, some, such as resemblance and contrariety, can give us certitude. Some cannot. Cause and effect is one of the three philosophical relations that afford us less than certain knowledge, the other two being identity and situation. But of these, causation is crucial. It alone allows us to go beyond what is immediately present to the senses and, along with perception and memory, is responsible for all our knowledge of the world. Hume therefore recognizes cause and effect as both a philosophical relation and a natural relation, at least in the Treatise, the only work where he draws this distinction.

The relation of cause and effect is pivotal in reasoning, which Hume defines as the discovery of relations between objects of comparison. But note that when Hume says “objects”, at least in the context of reasoning, he is referring to the objects of the mind, that is, ideas and impressions, since Hume adheres to the Early Modern “way of ideas”, the belief that sensation is a mental event and therefore all objects of perception are mental. But causation itself must be a relation rather than a quality of an object, as there is no one property common to all causes or to all effects. By so placing causation within Hume’s system, we arrive at a first approximation of cause and effect. Causation is a relation between objects that we employ in our reasoning in order to yield less than demonstrative knowledge of the world beyond our immediate impressions. However, this is only the beginning of Hume’s insight.

2. Necessary Connections and Hume’s Two Definitions

In both the Treatise and the Enquiry, we find Hume’s Fork, his bifurcation of all possible objects of knowledge into relations of ideas and matters of fact. Hume gives several differentiae distinguishing the two, but the principal distinction is that the denial of a true relation of ideas implies a contradiction. Relations of ideas can also be known independently of experience. Matters of fact, however, can be denied coherently, and they cannot be known independently of experience. Although Immanuel Kant later seems to miss this point, arguing for a middle ground that he thinks Hume missed, the two categories must be exclusive and exhaustive. A true statement must be one or the other, but not both, since its negation must either imply a contradiction or not. There is no middle ground. Yet given these definitions, it seems clear that reasoning concerning causation always invokes matters of fact. For Hume, the denial of a statement whose truth condition is grounded in causality is not inconceivable (and hence, not impossible; Hume holds that conceivability implies possibility). For instance, a horror movie may show the conceivability of decapitation not causing the cessation of animation in a human body. But if the denial of a causal statement is still conceivable, then its truth must be a matter of fact, and must therefore be in some way dependent upon experience. Though for Hume, this is true by definition for all matters of fact, he also appeals to our own experience to convey the point. Hume challenges us to consider any one event and meditate on it; for instance, a billiard ball striking another. He holds that no matter how clever we are, the only way we can infer if and how the second billiard ball will move is via past experience. There is nothing in the cause that will ever imply the effect in an experiential vacuum. And here it is important to remember that, in addition to cause and effect, the mind naturally associates ideas via resemblance and contiguity. Hume does not hold that, having never seen a game of billiards before, we cannot know what the effect of the collision will be. Rather, we can use resemblance, for instance, to infer an analogous case from our past experiences of transferred momentum, deflection, and so forth. We are still relying on previous impressions to predict the effect and therefore do not violate the Copy Principle. We simply use resemblance to form an analogous prediction. And we can charitably make such resemblances as broad as we want. Thus, an objection such as “under a Humean account, the toddler who burned his hand would not fear the flame after only one such occurrence, because he has not experienced a constant conjunction” is unfair to Hume, as the toddler would have had thousands of experiences of the principle that like causes like, and could thus employ resemblance to reach the conclusion to fear the flame.

If Hume is right that our awareness of causation (or power, force, efficacy, necessity, and so forth – he holds all such terms to be equivalent) is a product of experience, we must ask what this awareness consists in. What is meant when some event is judged as cause and effect?  Strictly speaking, for Hume, our only external impression of causation is a mere constant conjunction of phenomena, that B always follows A, and Hume sometimes seems to imply that this is all that causation amounts to. (And this notion of causation as constant conjunction is required for Hume to generate the Problem of induction discussed below.)  Nevertheless, ‘causation’ carries a stronger connotation than this, for constant conjunction can be accidental and therefore doesn’t get us the necessary connection that gives the relation of cause and effect its predictive ability. We may therefore now say that, on Hume’s account, to invoke causality is to invoke a constant conjunction of relata whose conjunction carries with it a necessary connection.

Hume points out that this second component of causation is far from clear. What is this necessity that is implied by causation?  Clearly it is not a logical modality, as there are possible worlds in which the standard laws of causation do not obtain. It might be tempting to state that the necessity involved in causation is therefore a physical or metaphysical necessity. However, Hume considers such elucidations unhelpful, as they tell us nothing about the original impressions involved. At best, they merely amount to the assertion that causation follows causal laws. But invoking this common type of necessity is trivial or circular when it is this very efficacy that Hume is attempting to discover.

We must therefore follow a different route in considering what our impression of necessity amounts to. As causation, at base, involves only matters of fact, Hume once again challenges us to consider what we can know of the constituent impressions of causation. Once more, all we can come up with is an experienced constant conjunction. Of the common understanding of causality, Hume points out that we never have an impression of efficacy. Because of this, our notion of causal law seems to be a mere presentiment that the constant conjunction will continue to be constant, some certainty that this mysterious union will persist. Hume argues that we cannot conceive of any other connection between cause and effect, because there simply is no other impression to which our idea may be traced. This certitude is all that remains.

For Hume, the necessary connection invoked by causation is nothing more than this certainty. Hume’s Copy Principle demands that an idea must have come from an impression, but we have no impression of efficacy in the event itself. Instead, the impression of efficacy is one produced in the mind. As we experience enough cases of a particular constant conjunction, our minds begin to pass a natural determination from cause to effect, adding a little more “oomph” to the prediction of the effect every time, a growing certitude that the effect will follow again. It is the internal impression of this “oomph” that gives rise to our idea of necessity, the mere feeling of certainty that the conjunction will stay constant. Ergo, the idea of necessity that supplements constant conjunction is a psychological projection. We cannot help but think that the event will unfurl in this way.

Having approached Hume’s account of causality by this route, we are now in a position to see where Hume’s two definitions of causation given in the Treatise come from. (He gives similar but not identical definitions in the Enquiry.) He defines “cause” in the following two ways:

(D1)      An object precedent and contiguous to another, and where all the objects resembling the former are placed in like relations of precedency and contiguity to those objects that resemble the latter.

(D2)      An object precedent and contiguous to another, and so united with it, that the idea of the one determines the mind to form the idea of the other, and the impression of the one to form a more lively idea of the other. (T 1.3.14.31; SBN 170)

There are reams of literature addressing whether these two definitions are the same and, if not, to which of them Hume gives primacy. J.A. Robinson is perhaps the staunchest proponent of the position that the two are nonequivalent, arguing that there is a nonequivalence in meaning and that they fail to capture the same extension. Two objects can be constantly conjoined without our mind determining that one causes the other, and it seems possible that we can be determined that one object causes another without their being constantly conjoined. But if the definitions fail in this way, then it is problematic that Hume maintains that both are adequate definitions of causation. Some scholars have argued for ways of squaring the two definitions (Don Garrett, for instance, argues that the two are equivalent if they are both read objectively or both read subjectively), while others have given reason to think that seeking to fit or eliminate definitions may be a misguided project.

One alternative to fitting the definitions lies in the possibility that they are doing two separate things, and it might therefore be inappropriate to reduce one to the other or claim that one is more significant than the other. There are several interpretations that allow us to meaningfully maintain the distinction (and therefore the nonequivalence) between the two definitions unproblematically. For instance, D1 can be seen as tracing the external impressions (that is, the constant conjunction) requisite for our idea of causation while D2 traces the internal impressions, both of which are important to Hume in providing a complete account. As Hume says, the definitions are “presenting a different view of the same object.” (T 1.3.14.31; SBN 170) Supporting this, Harold Noonan holds that D1 is “what is going on in the world” and that D2 is “what goes on in the mind of the observer” and therefore, “the problem of nonequivalent definitions poses no real problem for understanding Hume.” (Noonan 1999: 150-151) Simon Blackburn provides a similar interpretation that the definitions are doing two different things, externally and internally. However, Blackburn has the former as giving the “contribution of the world” and the latter as giving the “functional difference in the mind that apprehends the regularity.” (Blackburn 2007: 107) Yet this is not the only way to grant a nonequivalence without establishing the primacy of one over the other.

Another method is to cash out the two definitions in terms of the types of relation. Some scholars have emphasized that, according to Hume’s claim in the Treatise, D1 defines the philosophical relation of cause and effect while D2 defines the natural relation. Walter Ott argues that, if this is right, then the lack of equivalence is not a problem, as philosophical and natural relations would not be expected to capture the same extension. (Ott 2009: 239) This way of dismissing the nonequivalence of the two definitions becomes more problematic, however, when we realize that Hume does not make the distinction between natural and philosophical relations in the Enquiry, yet provides approximately the same two definitions. If the definitions were meant to separately track the philosophical and natural relations, we might expect Hume to have explained that distinction in the Enquiry rather than dropping it while still maintaining two definitions. Perhaps for this reason, Jonathan Bennett suggests that it is best to forget Hume’s comment on this correspondence. (Bennett 1971: 398)

Though this treatment of literature considering the definitions as meaningfully nonequivalent has been brief, it does serve to show that the definitions need not be forced together. In fact, later in the Treatise, Hume states that necessity is defined by both, either as the constant conjunction or as the mental inference, that they are two different senses of necessity, and Hume, at various points, identifies both as the essence of connection or power. Whether or not Robinson is right in thinking Hume is mistaken in holding this position, Hume himself does not seem to believe one definition is superior to the other, or that they are nonequivalent.

Beyond Hume’s own usage, there is a second worry lingering. Attempting to establish primacy between the definitions implies that they are somehow the bottom line for Hume on causation. But Hume is at pains to point out that the definitions are inadequate. In discussing the “narrow limits of human reason and capacity,” Hume asks,

And what stronger instance can be produced of the surprizing ignorance and weakness of the understanding than [the analysis of causation]?…so imperfect are the ideas we form concerning it, that it is impossible to give any just definition of cause, except what is drawn from something extraneous and foreign to it….But though both these definitions be drawn from circumstances foreign to cause, we cannot remedy this inconvenience, or attain any more perfect definition…. (EHU 7.29; SBN 77, emphasis his)

The tone this passage conveys is one of resigned dissatisfaction. Although Hume does the best that can be expected on the subject, he is dissatisfied, but this dissatisfaction is inevitable. This is because, as Hume maintains in Part VII of the Enquiry, a definiens is nothing but an enumeration of the constituent simple ideas in the definiendum. However, Hume has just given us reason to think that we have no such satisfactory constituent ideas, hence the “inconvenience” requiring us to appeal to the “extraneous.”  This is not to say that the definitions are incorrect. Note that he still applies the appellation “just” to them despite their appeal to the extraneous, and in the Treatise, he calls them “precise.”  Rather, they are unsatisfying. It is an inconvenience that they appeal to something foreign, something we should like to remedy. Unfortunately, such a remedy is impossible, so the definitions, while as precise as they can be, still leave us wanting something further. But if this is right, then Hume should be able to endorse both D1 and D2 as vital components of causation without implying that he endorses either (or both) as necessary and sufficient for causation. For these reasons, Hume’s discussion leading up to the two definitions should be taken as primary in his account of causation rather than the definitions themselves.

3. The Problem of Induction

The second of Hume’s influential causal arguments is known as the problem of induction, a skeptical argument that utilizes Hume’s insights about experience limiting our causal knowledge to constant conjunction. Though Hume gives a quick version of the Problem in the middle of his discussion of causation in the Treatise (T 1.3.6), it is laid out most clearly in Section IV of the Enquiry. An influential argument, the Problem has had a drastic impact on the field of epistemology through its skeptical conclusions. It should be noted, however, that not everyone agrees about what exactly the Problem consists in. Briefly, the typified version of the Problem as arguing for inductive skepticism can be described as follows:

Recall that proper reasoning involves only relations of ideas and matters of fact. Again, the key differentia distinguishing the two categories of knowledge is that asserting the negation of a true relation of ideas is to assert a contradiction, but this is not the case with genuine matters of fact. But in Section IV, Hume only pursues the justification for matters of fact, of which there are two categories:

(A)           Reports of direct experience, both past and present

(B)           Claims about states of affairs not directly observed

Matters of fact of category (A) would include sensory experience and memory, against which Hume never raises doubts, contra René Descartes. For Hume, (B) would include both predictions and the laws of nature upon which predictions rest. We cannot claim direct experience of predictions or of general laws, but knowledge of them must still be classified as matters of fact, since both they and their negations remain conceivable. In considering the foundations for predictions, however, we must remember that, for Hume, only the relation of cause and effect gives us predictive power, as it alone allows us to go beyond memory and the senses. All such predictions must therefore involve causality and must therefore be of category (B). But what justifies them?

It seems to be the laws governing cause and effect that provide support for predictions, as human reason tries to reduce particular natural phenomena “…to a greater simplicity, and to resolve the many particular effects into a few general causes….” (EHU 4.12; SBN 30)  But this simply sets back the question, for we must now wonder what justifies these “general causes.”  One possible answer is that they are justified a priori as relations of ideas. Hume rejects this solution for two reasons:  First, as shown above, we cannot meditate purely on the idea of a cause and deduce the corresponding effect and, more importantly, to assert the negation of any causal law is not to assert a contradiction.

Here we should pause to note that the generation of the Problem of Induction seems to essentially involve Hume’s insights about necessary connection (and hence our treating it first). Since the Problem of Induction demands that causal connections cannot be known a priori, and that our access is only to constant conjunction, the Problem seems to require the most crucial components of his account of necessity. It is therefore an oddity that, in the Enquiry, Hume waits until Section VII to explicate an account of necessity already utilized in the Problem of Section IV. In the Treatise, however, a version of the Problem appears after Hume’s insights about experience limiting causation to constant conjunction but before the explication of the projectivist necessity and his presenting of the two definitions. It is therefore not entirely clear how Hume views the relationship between his account of necessity and the Problem. Stathis Psillos, for instance, views Hume’s inductive skepticism as a corollary to his account of necessary connection. (Psillos 2002: 31)  However, Peter Millican rightly points out that the Problem can still be construed so as to challenge most non-reductive causal theories as well. (Millican 2002: 141)  Kenneth Clatterbaugh goes further, arguing that Hume’s reductive account of causation and the skepticism the Problem raises can be parsed out so they are entirely separable. (Clatterbaugh 1999: 186)  D.M. Armstrong disagrees, arguing that “…if laws of nature are nothing but Humean uniformities, then inductive scepticism is inevitable.” (Armstrong 1999: 52)

Whether the Problem of induction is in fact separable from Hume’s account of necessary connection, he himself connects the two by arguing that “…the knowledge of this relation is not, in any instance, attained by reasonings a priori; but arises entirely from experience, when we find that any particular objects are constantly conjoined with each other.” (EHU 4.6; SBN 27)  Here, Hume invokes the account of causation explicated above to show that the necessity supporting (B) is grounded in our observation of constant conjunction. This is to say that (B) is grounded in (A). But again, (A) by itself gives us no predictive power. We have thus merely pushed the question back one more step and must now ask with Hume, “What is the foundation of all conclusions from experience?” (EHU 4.14; SBN 32, emphasis his)

The answer to this question seems to be inductive reasoning. We use direct observation to draw conclusions about unobserved states of affairs. But this is just to once more assert that (B) is grounded in (A). The more interesting question therefore becomes how we do this. What lets us reason from (A) to (B)?  The only apparent answer is the assumption of some version of the Principle of the Uniformity of Nature (PUN), the doctrine that nature is always uniform, so unobserved instances of phenomena will resemble the observed. This is called an assumption since we have not, as yet, established that we are justified in holding such a principle. Once more, it cannot be known a priori, as we assert no contradiction by maintaining its falsity. A sporadic, random universe is perfectly conceivable. Therefore, knowledge of the PUN must be a matter of fact. But the principle is predictive and not directly observed. This means that the PUN is an instance of (B), but we were invoking the PUN as the grounds for moving from beliefs of type (A) to beliefs of type (B), thus creating a vicious circle when attempting to justify type (B) matters of fact. We use knowledge of (B) as a justification for our knowledge of (B). The bottom line for Hume’s Problem of induction seems to be that there is no clear way to rationally justify any causal reasoning (and therefore no inductive inference) whatsoever. We have no ground that allows us to move from (A) to (B), to move beyond sensation and memory, so any matter of fact knowledge beyond these becomes suspect.

Louis Loeb calls this reconstruction of Hume targeting the justification of causal inference-based reasoning the “traditional interpretation” (Loeb 2008: 108), and Hume’s conclusion that causal inferences have “no just foundation” (T 1.3.6.10; SBN 91) lends support to this interpretation. Under this reconstruction, the epistemic circularity revealed by Hume’s Problem of Induction seems detrimental to knowledge. However, there are philosophers (Max Black, R. B. Braithwaite, Charles Peirce, and Brian Skyrms, for instance) that, while agreeing that Hume targets the justification of inductive inference, insist that this particular justificatory circle is not vicious or that it is unproblematic for various reasons. As discussed below, Hume may be one such philosopher. Alternatively, there are those that think that Hume claims too much in insisting that inductive arguments fail to lend probability to their conclusions. D. C. Stove maintains that, while Hume argues that inductive inference never adds probability to its conclusion, Hume’s premises actually only support “inductive fallibilism”, a much weaker position that induction can never attain certainty (that is, that the inferences are never valid). Hume illicitly adds that no invalid argument can still be reasonable. (Stove 1973: 48)

But not all are in agreement that Hume’s intended target is the justification of causal or inductive inference. Tom Beauchamp and Alexander Rosenberg agree that Hume’s argument implies inductive fallibilism, but hold that this position is adopted intentionally as a critique of the deductivist rationalism of Hume’s time. (Beauchamp and Rosenberg 1981: 44) Annette Baier defends a similar account, focusing on Hume’s use of “reason” in the argument, which she insists should be used only in the narrow sense of Hume’s “demonstrative sciences”. (Baier 1991: 60) More recently, Don Garrett has argued that Hume’s negative conclusion is one of cognitive psychology, that we do not adopt induction based on doxastically sufficient argumentation. Induction is simply not supported by argument, good or bad. Instead, it is an instinctive mechanism that we share with animals. (Garrett 1997: 92, 94) Similarly, David Owen holds that Hume’s Problem of induction is not an argument against the reasonableness of inductive inference, but, “Rather Hume is arguing that reason cannot explain how we come to have beliefs in the unobserved on the basis of past experience.” (Owen 1999: 6) There are thus a variety of interpretations of Hume’s Problem of induction and, as we will see below, how we interpret the Problem will inform how we interpret his ultimate causal position.

4. Causal Reductionism

Having described these two important components of his account of causation, let us consider how Hume’s position on causation is variously interpreted, starting with causal reductionism. The family of reductionist theories, often read out of Hume’s account of necessity outlined above, maintains that causation, power, necessity, and so forth, as something that exists between external objects rather than in the observer, is constituted entirely by regular succession. In the external world, causation simply is the regularity of constant conjunction. In fact, the defender of this brand of regularity theory of causation is generally labeled a “Humean” about causation. However, since this interpretation of Hume’s own historical position remains in contention, the appellation will be avoided here.

Because of the variant opinions of how we should view the relationship between the two definitions proffered by Hume, we find two divergent types of reduction of Humean causation. First, there are reductionists that insist Hume reduces causation to nothing beyond constant conjunction, that is, the reduction is to a simple naïve regularity theory of causation, and therefore the mental projection of D2 plays no part. The motivation for this interpretation seems to be an emphasis on Hume’s D1, either by saying that it is the only definition that Hume genuinely endorses, or that D2 somehow collapses into D1 or that D2 does not represent a genuine ontological reduction, and is therefore not relevant to the metaphysics of causation. Robinson, for instance, claims that D2 is explanatory in nature, and is merely part of an empiricist psychological theory. (Robinson 1962)

This focus on D1 is regarded as deeply problematic by some Hume scholars (Francis Dauer, H.O. Mounce, and Fred Wilson, for instance), because it seems to be an incomplete account of Hume’s discussion of necessary connection presented above. A reductive emphasis on D1 as definitive ignores not only D2 as a definition but also ignores all of the argument leading up to it. This is to disregard the discussion through which Hume accounts for the necessity of causation, a component which he describes as “of much greater importance” than the contiguity and succession of D1. (T 1.3.2.11; SBN 77)  In short, a reduction to D1 ignores the mental determination component. However, this practice may not be as uncharitable as it appears, as many scholars see the first definition as the only component of his account relevant to metaphysics. For instance, D.M. Armstrong, after describing both components, simply announces his intention to set aside the mental component as irrelevant to the metaphysics of causation. (Armstrong 1983: 4)  J. L. Mackie similarly stresses that, “It is about causation so far as we know about it in objects that Hume has the firmest and most fully argued views,” (Mackie 1980: 21) and it is for this reason that he focuses on D1.

However, not everyone agrees that D2 can or should be dropped so easily from Hume’s system. In addition to its accounting for the necessity of causation mentioned above, recall that Hume makes frequent reference to both definitions as accurate or just, and at one point even refers to D2 as constituting the essence of causation. Therefore, whether or not the projectivism of D2 actually is relevant to the metaphysics of causation, a strong case can be made that Hume thinks it is so, and therefore an accurate historical interpretation needs to include D2 in order to capture Hume’s intentions. (Below, the assumption that Hume is even doing metaphysics will also be challenged.)  The more common Humean reduction, then, adds a projectivist twist by somehow reducing causation to constant conjunction plus the internal impression of necessity. (See, for instance, Beauchamp and Rosenberg 1981: 11, Goodman 1983: 60, Mounce 1999: 42, Noonan 1999: 140-145, Ott 2009: 224 or Wilson 1997: 16)  Of course while this second type of reductionist agrees that the projectivist component should be included, there is less agreement as to how, precisely, it is supposed to fit into Hume’s overall causal picture. Largely for this reason, we have a host of reductionist interpretations rather than a single version. The unifying thread of the reductionist interpretations is that causation, as it exists in the object, is constituted by regularity.

But given the Humean account of causation outlined above, it is not difficult to see how Hume’s writings give rise to such reductionist positions. After all, both D1 and D2 seem reductive in nature. If, as is often the case, we take definitions to represent the necessary and sufficient conditions of the definiendum, then both the definitions are reductive notions of causation. D1 reduces causation to proximity, contiguity, and constant conjunction, and D2 similarly reduces causation to proximity, contiguity, and the internal mental determination that moves the first object or idea to the second. Even considering Hume’s alternate account of definitions, where a definition is an enumeration of the constituent ideas of the definiendum, this does not change the two definitions’ reductive nature. Given that Hume’s discussions of causation culminate in these two definitions, combined with the fact that the conception of causation they provide is used in Hume’s later philosophical arguments of the Treatise, the definitions play a crucial role in understanding his account of causation. Therefore, the various forms of causal reductionism can constitute reasonable interpretations of Hume. By putting the two definitions at center stage, Hume can plausibly be read as emphasizing that our only notion of causation is constant conjunction with certitude that it will continue. Nevertheless, reductionism is not the only way to interpret Hume’s theory of causation.

5. Causal Skepticism

One way to interpret the reasoning behind assigning Hume the position of causal skepticism is by assigning similar import to the passages emphasized by the reductionists, but interpreting the claims epistemically rather than ontologically. In other words, rather than interpreting Hume’s insights about the tenuousness of our idea of causation as representing an ontological reduction of what causation is, Humean causal skepticism can instead be viewed as his clearly demarcating the limits of our knowledge in this area and then tracing out the ramifications of this limiting. (Below, we will see that the causal realists also take Hume’s account of necessity as epistemic rather than ontological.)  If Hume’s account is intended to be epistemic, then the Problem of induction can be seen as taking Hume’s insights about our impressions of necessity to an extreme but reasonable conclusion. If it is true that constant conjunction (with or without the added component of mental determination) represents the totality of the content we can assign to our concept of causation, then we lose any claim to robust metaphysical necessity. But once this is lost, we also sacrifice our only rational grounding of causal inference. Our experience of constant conjunction only provides a projectivist necessity, but a projectivist necessity does not provide any obvious form of accurate predictive power. Hence, if we limit causation to the content provided by the two definitions, we cannot use this weak necessity to justify the PUN and therefore cannot ground predictions. We are therefore left in a position of inductive skepticism which denies knowledge beyond memory and what is present to the senses. By limiting causation to constant conjunction, we are incapable of grounding causal inference; hence Humean inductive skepticism.

In this way, the causal skeptic interpretation takes the “traditional interpretation” of the Problem of induction seriously and definitively, maintaining that Hume never solved it. Since we never directly experience power, all causal claims certainly appear susceptible to the Problem of Induction. The attempted justification of causal inference would lead to the vicious regress explained above in lieu of finding a proper grounding. The family of interpretations that take Hume’s ultimate position to be that of a causal skeptic therefore ascribes to him what seems a reasonable conclusion: that we have no knowledge of inductive causal claims, as they would necessarily lack proper justification. We can never claim knowledge of category (B). D. M. Armstrong reads Hume this way, seeing Hume’s reductivist account of necessity and its implications for laws of nature as ultimately leading him to skepticism. (Armstrong 1983: 53) Other Hume scholars who defend a skeptical interpretation of causation include Martin Bell (Read and Richman 2007: 129) and Michael Levine, who maintains that Hume’s causal skepticism ultimately undermines his own Enquiry argument against miracles.

There are, however, some difficulties with this interpretation. First, it relies on assigning the “traditional interpretation” to the Problem of induction though, as discussed above, this is not the only account. Secondly, reading the conclusion of the Problem of Induction in this way is difficult to square with the rest of Hume’s corpus. For instance, the Copy Principle, fundamental to his work, has causal implications, and Hume relies on inductive inference as early as T 1.1.1.8; SBN 4. Hume consistently relies on analogical reasoning in the Dialogues Concerning Natural Religion even after Philo grants that the necessity of causation is provided by custom, and the experimental method used to support the “science of man” so vital to Hume’s Treatise clearly demands the reliability of causal inference. Hume’s causal skepticism would therefore seem to undermine his own philosophy. Of course, if this is the correct way to read the Problem of Induction, then so much the worse for Hume.

A more serious challenge for the skeptical interpretation of Hume is that it ignores the following Part of the Enquiry, in which Hume immediately provides what he calls a “solution” to the Problem of Induction. Hume states that, even though they are not supported by reason, causal inferences are “essential to the subsistence of all creatures,” and that:

It is more conformable to the ordinary wisdom of nature to secure so necessary an act of the mind, by some instinct or mechanical tendency, which may be infallible in its operations, may discover itself at the first appearance of life and thought, and may be independent of all the laboured deductions of the understanding. As nature has taught us the use of our limbs, without giving us the knowledge of the muscles and nerves by which they are actuated; so she has implanted in us an instinct, which carries forward the thought in a correspondent course to that which she has established among external objects; though we are ignorant of those powers and forces, on which this course and succession of objects totally depends. (EHU 5.22; SBN 55)

Here, Hume seems to have causal inference supported by instinct rather than reason. The causal skeptic will interpret this as descriptive rather than normative, but others are not so sure. It is not clear that Hume views this instinctual tendency as doxastically inappropriate in any way. Therefore, another interpretation of this “solution” is that Hume thinks we can be justified in making causal inferences. However, it is not reason that justifies us, but rather instinct (and reason, in fact, is a subspecies of instinct for Hume, implying that at least some instinctual faculties are fit for doxastic assent). This will be discussed more fully below.

6. Causal Realism

Against the positions of causal reductionism and causal skepticism is the New Hume tradition. It started with Norman Kemp Smith’s The Philosophy of David Hume, and defends the view that Hume is a causal realist, a position that entails the denial of both causal reductionism and causal skepticism by maintaining that the truth value of causal statements is not reducible to non-causal states of affairs and that they are, in principle, knowable. (Tooley 1987: 246-47) The case for Humean causal realism is the least intuitive, given the explications above, and will therefore require the most explanation. However, the position can be rendered more plausible with the introduction of three interpretive tools whose proper utilization seems required for making a convincing realist interpretation. Of these, two are distinctions which realist interpretations insist that Hume respects in a crucial way but that non-realist interpretations often deny. The last is some mechanism by which to overcome the skeptical challenges Hume himself raises.

The first distinction is between ontological and epistemic causal claims. Galen Strawson points out that we can distinguish:

(O)  Causation as it is in the objects, and

(E)  Causation so far as we know about it in the objects.

He maintains, “…Hume’s Regularity theory of causation is only a theory about (E), not about (O).” (Strawson 1989: 10)  Whether or not we agree that Hume limits his theory to the latter, the distinction itself is not difficult to grasp. It simply separates what we can know from what is the case. The realist interpretation then applies this to Hume’s account of necessary connection, holding that it is not Hume’s telling us what causation is, but only what we can know of it. Hume’s account is then merely epistemic and not intended to have decisive ontological implications. This undercuts the reductionist interpretation. Simply because Hume says that this is what we can know of causation, it does not follow that Hume therefore believes that this is all that causation amounts to. In fact, such an interpretation might better explain Hume’s dissatisfaction over the definitions. If Hume were a reductionist, then the definitions should be correct or complete and there would not be the reservations discussed above.

Further, given Hume’s skeptical attitude toward speculative metaphysics, it seems unlikely that he would commit the Epistemic Fallacy and allow the inference from “x is all we can know of y” to “x constitutes the real, mind-independent essence of y,”  as some (though not all) reductionist accounts would require. In fact, Hume must reject this inference, since he does not believe a resemblance thesis between perceptions and external objects can ever be philosophically established. He makes this denial explicit in Part XII of the Enquiry.

The epistemic interpretation of the distinction can be made more compelling by remembering what Hume is up to in the third Part of Book One of the Treatise. Here, as in many other areas of his writings, he is doing his standard empiricist investigation. Since we have some notion of causation, necessary connection, and so forth, his Copy Principle demands that this idea must be traceable to impressions. Hume’s account of causation should therefore be viewed as an attempt to trace these genesis impressions and to thereby reveal the true content of the idea they comprise. Thus, it is the idea of causation that interests Hume. In fact, the title of Section 1.3.2 is “Of probability; and of the idea of cause and effect”. He announces, “To begin regularly, we must consider the idea of causation, and see from what origin it is deriv’d.” (T 1.3.2.4; SBN 74, his emphasis) Hume therefore seems to be doing epistemology rather than metaphysics. (Mounce 1999: 32 takes this as indicative of a purely epistemic project.)

Although this employment of the distinction may proffer a potential reply to the causal reductionist, there is still a difficulty lurking. While it may be true that Hume is trying to explicate the content of the idea of causation by tracing its constituent impressions, this does not guarantee that there is a coherent idea, especially when Hume makes occasional claims that we have no idea of power, and so forth. The challenge seems to amount to this: even if the previous distinction is correct, and Hume is talking about what we can know but not necessarily what is, the causal realist holds that substantive causal connections exist beyond constant conjunction. This is to posit a far stronger claim than merely having an idea of causation. The realist reading has Hume asserting that there is causation beyond constant conjunction, thereby attributing to him a positive ontological commitment, whereas his own skeptical arguments against speculative metaphysics, which reject parity between ideas and objects, should, at best, only imply agnosticism about the existence of robust causal powers. (It is for this reason that Martin Bell and Paul Russell reject the realist interpretation.) There therefore seems to be a tension between accepting Hume’s account of necessary connection as purely epistemic and attributing to Hume the existence of an entity beyond what we can know by investigating our impressions.

Put another way, Hume’s Copy Principle requires that our ideas derive their content from constitutive impressions. However, if the previous distinction is correct, then Hume has already exhaustively explicated the impressions that give content to our idea of causation.  This is the very same content that leads to the two definitions. It seems that Hume has to commit himself to the position that there is no clear idea of causation beyond the proffered reduction. But if this is true, and Hume is not a reductionist, what is he positing?  It is here that the causal realist will appeal to the other two interpretive tools, viz. a second distinction and a belief mechanism, the former allowing us to make sense of the positive claim and the latter providing justification for it.

The realists claim that the second distinction is explicit in Hume’s writing. This is the distinction between “conceiving” or “imagining” and merely “supposing”. The general proposal is that we can and do have two different levels of clarity when contemplating a particular notion. We can either have a Cartesian clear and distinct idea, or we can have a supposition, that is, a vague, incomplete, or “relative” notion. The suggestion is this:  Simple ideas are clear and distinct (though not as vivid as their corresponding impressions) and can be combined via the various relations. Groups of simple ideas compiled by means of these relations form mental objects. In some cases they combine coherently, forming clear and distinct complex ideas, while in other cases the fit is not so good, either because we do not see how the constituent ideas relate or because something is missing from our conception. These suppositions do not attain the status of complex ideas in their own right, and remain an amalgamation of simple ideas that lack unity. The claim would then be that we can conceive distinct ideas, but only suppose incomplete notions.

Something like this distinction has historical precedent. In the Fifth Replies, Descartes distinguishes between some form of understanding and a complete conception. Berkeley also distinguishes between an “idea” and a mere “notion” in the third Dialogue and the second edition of the Principles. Perhaps most tellingly, Locke uses terminology identical to Hume’s in regard to substance, claiming we have “…no other idea of it at all, but only a Supposition….” (Essay, II.xxiii.2, emphasis his)  Such a supposition is “an obscure and relative Idea.” (Essay, II.xxiii.3)

The realist employment of this second distinction is twofold. First, the realist interpretation will hold that claims in which Hume states that we have no idea of power, and so forth, are claims about conceiving of causation. They claim only that we have no clear and distinct idea of power, or that what is clearly and distinctly conceived is merely constant conjunction. But a more robust account of causation is not automatically ruled out simply because our notion is not distinct. In this way, the distinction may blunt the force of the passages where Hume seems pessimistic about the content of our idea of causation.

The second step of the causal realist interpretation is then to insist that we can at least suppose (in the technical sense) a genuine cause, even if the notion is opaque, that is, to insist that mere suppositions are fit for doxastic assent. There does not seem to be anything terribly problematic in believing in something of which we have only an unclear representation. To return to the Fifth Replies, Descartes holds that we can believe in the existence and coherence of an infinite being on the basis of such vague ideas, implying that a clear and distinct idea is not necessary for belief. Hume denies clear and distinct content beyond constant conjunction, but it is not obvious that he denies all content beyond constant conjunction.

This second distinction is not without controversy. Briefly, against the distinction, Kenneth Winkler offers an alternative suggestion that Hume’s talk of secret connections is actually a reference to further regularities that are simply beyond current human observation (such as the microscopic or subatomic), while ultimately interpreting Hume as an agnostic about robust causation. (Winkler 1991: 552-556)  John Wright argues that this is to ignore Hume’s reasons for his professed ignorance of the hidden, that is, our inability to make causal inferences a priori. (Wright 1983: 92)  Alternatively, Blackburn, a self-proclaimed “quasi-realist”, argues that the terminology of the distinction is too infrequent to bear the philosophical weight that the realist reading would require. (Blackburn 2007: 101-102)  P.J.E. Kail resists this by pointing out that Hume’s overall attitude strongly suggests that he “assumes the existence of material objects,” and that Hume clearly employs the distinction and its terminology in at least one place: T 1.4.2.56; SBN 217-218. (Kail 2007: 60)  There, Hume describes a case in which philosophers develop a notion impossible to clearly and distinctly perceive, namely that there are properties of objects independent of any perception. We simply cannot conceive such an idea, but it certainly remains possible to entertain or suppose this conjecture. Clatterbaugh takes an even stronger position than Blackburn, positing that for Hume to talk of efficacious secret powers would be literally to talk nonsense, and would force us to disregard Hume’s own epistemic framework (Clatterbaugh 1999: 204), while Ott similarly argues that the inability to give content to causal terms means Hume cannot meaningfully affirm or deny causation. (Ott 2009: 198)

Even granting that Hume not only acknowledges this second distinction but genuinely believes that we can suppose a metaphysically robust notion of causal necessity, the realist still faces a difficulty. How can Hume avoid the anti-realist criticism of Winkler, Ott, and Clatterbaugh that his own epistemic criteria demand that he remain agnostic about causation beyond constant conjunction?  In other words, given the skeptical challenges Hume levels throughout his writings, why think that such a seemingly ardent skeptic would insist that this supposition is, in fact, the nature of reality, rather than merely admitting the possibility of believing in it?  The realist seems to require some Humean device that would show this position to be epistemically tenable, that is, that our notion of causation can reasonably go beyond the content identified by the arguments leading to the two definitions of causation and provide a robust notion that can defeat the Problem of Induction.

This is where the realists (and non-realists) seem most divided in their interpretations of Hume. Generally, the appeal is to Hume’s texts suggesting he embraces some sort of non-rational mechanism by which such beliefs are formed and/or justified, such as his purported solution to the Problem of Induction. This picture has been parsed out in terms of doxastic naturalism, transcendental arguments, psychological necessity, instinct, and even some form of proper function. However, what the interpretations all have in common is that humans arrive at certain mediate beliefs via some method quite distinct from the faculty of reason.

Let us now consider the impact that adopting these naturally formed beliefs would have on Hume’s causal theory. Its function is twofold. First, it provides some justification for why it might be plausible for Hume to deem mere suppositions fit for belief. Second, it answers the skeptical challenges raised by the “traditional interpretation” of the Problem of Induction by providing a way to justify causal beliefs even though those beliefs appear to be without rational grounds. It accomplishes the latter by emphasizing what the argument actually concludes, namely that inductive reasoning is groundless, that there is no rational basis for inductive inference. As Hume says, “Reason can never show us the connexion of one object with another….” (T 1.3.6.12; SBN 92, emphasis mine)  In granting such a mechanism, we grant Hume the epistemic propriety of affirming something reason cannot establish. Further, it smooths over worries about consistency arising from the fact that Hume seemingly undercuts all rational belief in causation, but then merrily shrugs off the Problem and continues to invoke causal reasoning throughout his writings.

In the realist framework outlined above, doxastic naturalism is a necessary component of a consistent realist picture. Kemp Smith argues for something stronger, namely that this non-rational mechanism itself implies causal realism. After discussing the non-rational belief mechanism responsible for our belief in body, he goes on to argue, “Belief in causal action is, Hume argues, equally natural and indispensable; and he freely recognizes the existence of ‘secret’ causes, acting independently of experience.” (Kemp Smith 2005: 88)  He connects these causal beliefs to the unknown causes that Hume tells us are “original qualities in human nature.” (T 1.1.4.6; SBN 13)  Kemp Smith therefore holds that Humean doxastic naturalism is sufficient for Humean causal realism. The reductionist, however, will rightly point out that this move is entirely too fast. Even granting that Hume has a non-rational mechanism at work, and that we arrive at causal beliefs via this mechanism, it does not follow that Hume himself believes in robust causal powers, or that it is appropriate to do so. However, combining Humean non-rational justification with the two distinctions mentioned above at least seems to form a consistent alternative to the reductionist and skeptical interpretations. Just which of these three is right, however, remains contentious.

7. References and Further Reading

a. A Note on Hume’s Works

Hume wrote all of his philosophical works in English, so there is no concern about the accuracy of English translation. For the casual reader, any edition of his work should be sufficient. However, Oxford University Press has produced the definitive Clarendon Edition of most of his works. For the serious scholar, these are indispensable, as they contain copious helpful notes about Hume’s changes between editions, and so forth. The general editor of the series is Tom L. Beauchamp.

When referencing Hume’s works, however, there are standard editions of the Treatise and his Enquiries, originally edited by L.A. Selby-Bigge and later revised by P.H. Nidditch. Hence, citations will often be given with an SBN page number. But Hume also numbered his own works to varying degrees. The Treatise is divided into three Books, each with Parts, Sections, and paragraphs. Hence, four numbers can give a precise location of a passage. Hume’s two definitions of cause are found at T 1.3.14.31; SBN 170, that is, in the Treatise, Book One, Part Three, Section Fourteen, paragraph thirty-one. This paragraph can be found on page 170 of the Selby-Bigge/Nidditch editions. Hume’s shorter works, such as the Enquiry Concerning Human Understanding, are not as thoroughly subdivided. Instead, the Enquiry is divided only into Sections, only some of which have Parts. Hence, we also find Hume’s definitions at EHU 7.29; SBN 76-77, that is, Section Seven of the Enquiry, paragraph twenty-nine, pages 76 and 77 of the Selby-Bigge/Nidditch editions.

b. Hume’s Works on Causation

  • Hume, David. A Treatise of Human Nature. Clarendon Press, Oxford, U.K., 2007, edited by David Fate Norton and Mary J. Norton.
  • Hume, David. An Enquiry Concerning Human Understanding. Clarendon Press, Oxford, U.K., 2000, edited by Tom L. Beauchamp.

c. Works in the History of Philosophy

  • Ayers, Michael. “Natures and Laws from Descartes to Hume”, in The Philosophical Canon in the Seventeenth and Eighteenth Centuries: Essays in Honour of John W. Yolton, edited by G.A.J. Rogers and S. Tomaselli, University of Rochester Press, Rochester, New York, 1996.
    • This article argues that there are two main traditions of efficacy in the Early Modern period, that objects have natures or that they follow laws imposed by God. This bifurcation then informs how Hume argues, as he must engage the former.
  • Baier, Annette C. A Progress of Sentiments: Reflections on Hume’s Treatise. Harvard University Press, Cambridge, Massachusetts, 1991.
    • Baier argues for a nuanced reading of the Treatise, that we can only understand it with the addition of the passions, and so forth, of the later Books.
  • Beauchamp, Tom L. and Rosenberg, Alexander. Hume and the Problem of Causation. Oxford University Press, New York, New York, 1981.
    • This is an important but technical explication and defense of the Humean causal reductionist position, both as a historical reading and as a contemporary approach to causation. The authors argue directly against the skeptical position, instead insisting that the Problem of Induction targets only Hume’s rationalist predecessors.
  • Beebee, Helen. Hume on Causation. Routledge, New York, New York, 2006.
    • Beebee rejects the standard interpretations of Hume’s causation before proffering her own, which is grounded in human nature and his theory of mind. Her critiques of the standard Humean views are helpful and clear.
  • Bennett, Jonathan. Learning from Six Philosophers. (two volumes)  Oxford University Press, Oxford, U.K., 2001.
    • These two volumes constitute a solid introduction to the major figures of the Modern period. Volume One discusses Descartes, Spinoza, and Leibniz, and Volume Two is an updated recasting of his Locke, Berkeley, Hume: Central Themes.
  • Bennett, Jonathan. Locke, Berkeley, Hume: Central Themes. Oxford University Press, Glasgow, U.K., 1971.
    • This is an excellent overview of the main doctrines of the British empiricists.
  • Blackburn, Simon. “Hume and Thick Connexions”, as reprinted in Read, Rupert and Richman, Kenneth A. (Editors). The New Hume Debate (Revised Edition). Routledge, New York, New York, 2007, pages 100-112.
    • This is the second, updated version of an important investigation into the realism/reductionism debate. He ultimately adopts a “quasi-realist” position that is weaker than the realist definition given above.
  • Buckle, Stephen. Hume’s Enlightenment Tract: The Unity and Purpose of An Enquiry Concerning Human Understanding. Clarendon Press, Oxford, U.K., 2001.
    • This book examines the Enquiry, distancing it from the standard reading that treats it as a mere recasting of the Treatise. Instead, Buckle argues that the work stands alone as a cohesive whole.
  • Clatterbaugh, Kenneth. The Causation Debate in Modern Philosophy, 1637-1739. Routledge, New York, New York, 1999.
    • This book traces the various causal positions of the Early Modern period, both rationalist and empiricist.
  • Costa, Michael J. “Hume and Causal Realism”. Australasian Journal of Philosophy, Volume 67, Number 2, pages 172-190.
    • Costa gives his take on the realism debate by clarifying several notions that are often run together. Like Blackburn, he ultimately defends a view somewhere between reductionism and realism.
  • Craig, Edward. “The Idea of Necessary Connexion” in Reading Hume on Human Understanding, edited by Peter Millican, Oxford University Press, New York, New York, 2002, pages 211-229.
    • This article is an updated and expanded defense of the Hume section of The Mind of God and the Works of Man.
  • Craig, Edward. The Mind of God and the Works of Man. Oxford University Press Clarendon, New York, New York, 1987.
    • A complex book that discusses the works of several philosophers in arguing for its central thesis, Craig’s work is one of the first to defend a causal realist interpretation of Hume.
  • Dauer, Francis Watanabe. “Hume on the Relation of Cause and Effect” in A Companion to Hume, edited by Radcliffe, Elizabeth S, Blackwell Publishing, Ltd, Malden, MA, 2008, pages 89-105.
    • Dauer takes a careful look at the text of the Treatise, followed by a critical discussion of the three most popular interpretations of the two definitions.
  • Fogelin, Robert J. Hume’s Skepticism in the Treatise of Human Nature. Routledge and Kegan Paul, London, U.K., 1985.
  • Garrett, Don. Cognition and Commitment in Hume’s Philosophy. Oxford University Press. New York, New York, 1997.
    • This is a great introduction to some of the central issues of Hume’s work. Garrett surveys the various positions on each of ten contentious issues in Hume scholarship before giving his own take. Among other things, he argues for a novel way to square the two definitions of cause.
  • Howson, Colin. Hume’s Problem: Induction and the Justification of Belief. Oxford University Press, Oxford, U.K., 2000.
    • This highly technical text first defends Hume’s inductive skepticism against contemporary attempts at refutation, ultimately concluding that the difficulties in justifying induction are inherent. Nevertheless, given certain assumptions, induction becomes viable. Howson then goes on to provide a reliable Bayesian framework of a limited type.
  • Kail, P.J.E. Projection and Realism in Hume’s Philosophy. Oxford University Press, Oxford, U.K., 2007.
    • This book explores the projectivist strand of Hume’s thought, and how it helps clarify Hume’s position within the realism debate, presenting Hume’s causal account as a combination of projectivism and realism.
  • Kemp Smith, Norman. The Philosophy of David Hume. Palgrave Macmillan, New York, New York, 2005.
    • This is the work that started the New Hume debate. Palgrave Macmillan has released it in a new edition with an extended introduction describing the work’s importance and the status of the debate.
  • Livingston, Donald W. “Hume on Ultimate Causation.”  American Philosophical Quarterly, Volume 8, 1971, pages 63-70.
    • This is a concise argument for causal realism, which Livingston later expands into a book. Here, he defends the Humean skeptical realism that he considers necessary for other strands of Hume’s philosophy.
  • Livingston, Donald W. Hume’s Philosophy of Common Life. University of Chicago Press, Chicago, Illinois, 1984.
    • This is one of the standard explications of Humean causal realism. It stresses Hume’s position that philosophy should conform to and explain common beliefs rather than conflict with them.
  • Loeb, Louis E. “Inductive Inference in Hume’s Philosophy”, in A Companion to Hume, edited by Radcliffe, Elizabeth S, Blackwell Publishing, Ltd, Malden, MA, 2008, pages 106-125.
    • This is a contemporary analysis of the Problem of Induction that ultimately rejects causal skepticism.
  • Loeb, Louis E. Stability and Justification in Hume’s Treatise, Oxford University Press, Oxford, U.K., 2002.
    • This well-argued work offers an interpretation of the Treatise built around Hume’s claim that the mind ultimately seeks stability in its beliefs. Linking justification with “settled beliefs” provides a positive rather than merely destructive epistemology.
  • McCracken, Charles J. Malebranche and British Philosophy. Clarendon Press, Oxford, U.K., 1983.
    • Among other things, McCracken shows how much of Hume’s insight into our knowledge of causal necessity can be traced back to the occasionalism of Malebranche.
  • Millican, Peter. “Hume, Causal Realism, and Causal Science”, Mind, Volume 118, Issue 471, July, 2009, pages 647-712.
    • After giving an overview of the recent debate, Millican argues that the New Hume debate should be settled via Hume’s logic, rather than language, and so forth. He largely rejects the realist interpretation, since the reductionist interpretation is required to carry later philosophical arguments that Hume gives.
  • Millican, Peter. “Hume’s Sceptical Doubts concerning Induction”, in Reading Hume on Human Understanding, edited by Peter Millican, Clarendon Press, Oxford, U.K., 2002, pages 107-173.
    • This is a somewhat technical reconstruction of the Problem of Induction, as well as an exploration of its place within Hume’s philosophy and its ramifications.
  • Mounce, H.O. Hume’s Naturalism, Routledge, New York, New York, 1999.
    • This book is an extended development of Hume’s doxastic naturalism over his empiricism.
  • Noonan, Harold W. Routledge Philosophy Guidebook to Hume on Knowledge. Routledge, London, U.K., 1999.
    • Noonan gives an accessible introduction to Hume’s epistemology.
  • Ott, Walter. Causation and Laws of Nature in Early Modern Philosophy. Oxford University Press, Oxford, U.K., 2009.
    • This is an advanced survey of causation in the Early Modern period, covering both the rationalists and the empiricists.
  • Owen, David. Hume’s Reason. Oxford University Press, New York, New York, 1999.
    • This book is an extended treatment of Hume’s notion of reason and its impact on many of his important arguments.
  • Robinson, J. A. “Hume’s Two Definitions of ‘Cause’”. The Philosophical Quarterly, Volume 12, 1962.
    • This article is a concise argument for the difficulties inherent to squaring the two definitions.
  • Robinson, J. A. “Hume’s Two Definitions of ‘Cause’ Reconsidered”. The Philosophical Quarterly, Volume 15, 1965, as reprinted in Hume, A Collection of Critical Essays, edited by V. C. Chappell. University of Notre Dame Press, Notre Dame, Indiana, 1966.
    • This is an updated follow-up to his previous article.
  • Read, Rupert and Richman, Kenneth A. (editors). The New Hume Debate (Revised Edition). Routledge, New York, New York, 2007.
    • This compilation presents a balanced collection of the important works on both sides of the causal realism debate.
  • Stove, David. Probability and Hume’s Inductive Skepticism. Oxford University Press, Oxford, U.K., 1973.
    • Stove presents a math-heavy critique of Hume’s inductive skepticism, insisting that Hume claims too much. Instead of concluding that inductive inference adds nothing to the probability of a conclusion, his premises imply only inductive fallibilism, that is, that inductive inferences never attain deductive certainty. While no inductive inference is valid, this does not imply that none can be reasonable.
  • Strawson, Galen. The Secret Connexion: Causation, Realism, and David Hume. Oxford University Press Clarendon, New York, New York, 1989.
    • This book is perhaps the most clear and complete explication of the New Hume doctrines.
  • Wilson, Fred. Hume’s Defense of Causal Inference. University of Toronto Press, Toronto, Canada, 1997.
    • Wilson’s main goal is to defend an anti-skeptical interpretation of Hume’s causal inference, but the book is wide-ranging and rich in many areas of Hume scholarship.
  • Winkler, Kenneth P. “The New Hume”, The Philosophical Review, Volume 100, Number 4, October 1991, pages 541-579.
    • Winkler presents a clear and concise case against the realist interpretation.
  • Wright, John P. The Sceptical Realism of David Hume. University of Minnesota Press, Minneapolis, Minnesota, 1983.
    • This book is one of the standard explications of Humean causal realism. The interpretation is arrived at via a focus on Hume’s attention to human nature. The book also places Hume’s notion of knowledge within its historical context.

d. Contemporary Metaphysics of Causation

  • Armstrong, D. M. What is a Law of Nature? Cambridge University Press, New York, New York, 1983.
    • This book investigates the status of the laws of nature. Armstrong ultimately argues that laws are relations between universals or properties.
  • Goodman, Nelson. Fact, Fiction, and Forecast. Fourth Edition, Harvard University Press, Cambridge, Massachusetts, 1983.
    • Goodman explicates the Problem of Induction and develops a more general form of the difficulty it raises.
  • Mackie, J. L. The Cement of the Universe: A Study of Causation. Oxford University Press Clarendon, New York, New York, 1980.
    • This work begins with Hume’s analysis of causation and then goes on to consider what we can know about causation as it exists in external objects. Though it is highly technical, it touches many issues important to contemporary metaphysics of causation.
  • Psillos, Stathis. Causation and Explanation. McGill-Queen’s University Press, Montreal, Canada, 2002.
    • This book is an accessible survey of contemporary causality, linking many of the important issues and engaging the relevant literature.
  • Tooley, Michael. Causation: A Realist Approach. Clarendon Press, Oxford, U.K., 1987.
    • Tooley presents a contemporary defense of realism with efficacy as relations among universals. In doing so, he clarifies many notions and commitments of the various realist and anti-realist positions.

Author Information

C. M. Lorkowski
Email: clorkows@kent.edu
Kent State University
U. S. A.

Skeptical Theism

Skeptical theism is the view that God exists but that we should be skeptical of our ability to discern God’s reasons for acting or refraining from acting in any particular instance.  In particular, says the skeptical theist, we should not grant that our inability to think of a good reason for doing or allowing something is indicative of whether or not God might have a good reason for doing or allowing something.  If there is a God, he knows much more than we do about the relevant facts, and thus it would not be surprising at all if he has reasons for doing or allowing something that we cannot fathom.

If skeptical theism is true, it appears to undercut the primary argument for atheism, namely the argument from evil.  This is because skeptical theism provides a reason to be skeptical of a crucial premise in the argument from evil, namely the premise that asserts that at least some of the evils in our world are gratuitous.  If we are not in a position to tell whether God has a reason for allowing any particular instance of evil, then we are not in a position to judge whether any of the evils in our world are gratuitous.  And if we cannot tell whether any of the evils in our world are gratuitous, then we cannot appeal to the existence of gratuitous evil to conclude that God does not exist.  The remainder of this article explains skeptical theism more fully, applies it to the argument from evil, and surveys the reasons for and against being a skeptical theist.

Table of Contents

  1. Introduction to Skeptical Theism
  2. The Argument from Evil
  3. Responses to the Argument from Evil
    1. Denying the Minor Premise
    2. Skepticism about the Minor Premise
  4. Defenses of Skeptical Theism
    1. Arguments from Analogy
    2. Arguments from Complexity
    3. Arguments from Enabling Premises
  5. Objections to Skeptical Theism
    1. Implications for the Divine-Human Relationship
    2. Implications for Everyday Knowledge
    3. Implications for Commonsense Epistemology
    4. Implications for Moral Theory
    5. Implications for Moral Living
  6. References and Further Reading

1. Introduction to Skeptical Theism

Skeptical theism is a conjunction of two theses.  The first thesis of skeptical theism is that theism is true, where “theism” is roughly the view that God exists and “God,” in turn, is an honorific title describing the most perfect being possible.  This is the being putatively described in classical western theologies of Judaism, Christianity, Islam, and some theistic forms of Eastern religions.  The second thesis is that a certain limited form of skepticism is true, where this skepticism applies to the ability of humans to make all-things-considered judgments about what God would do or allow in any particular situation.  Not all theists are skeptical theists, and not all of the philosophers who endorse the skeptical component of skeptical theism are theists.  Since it is the skeptical component that is of most interest, it will be the focus in what follows.

It is important to get clear on the scope of the skepticism endorsed by skeptical theists.  First, it is not a global skepticism—skeptical theists are not committed to the view that we cannot know anything at all.  Instead, the skepticism is (putatively) limited to a narrow range of propositions, namely those having to do with God’s reasons for action.  For example, a skeptical theist could admit that humans have ceteris paribus knowledge of God’s reasons for actions.  An example of such knowledge might be the following: other-things-being-equal, God will eliminate suffering when he is able to do so.  However, knowing this latter claim is consistent with denying that we know the following: God will eliminate this particular instance of suffering.  Holding the combination of these two views is possible for the following reason: while we might know that other-things-being-equal, God will eliminate suffering when he is able to do so, we might not know whether or not other things are equal in any particular instance of suffering.

As an example of this limited sort of skepticism, consider a much more mundane example.  One might know that other-things-being-equal, it is better to save aces in a hand of draw poker (since aces are the highest denomination).  However, one might know this while at the same time withholding judgment on whether or not it is a good idea for Jones to save aces in any particular hand, since one would not know what Jones’ other cards were (for example, perhaps saving an ace requires discarding a member of a four-of-a-kind set in another denomination).

2. The Argument from Evil

Agnosticism is the philosophical view that neither affirms that God exists nor affirms that God does not exist.  On the other hand, atheism is the view that God does not exist.  Perhaps the most powerful argument for atheism is the argument from evil.  According to this line of reasoning, the fact that the world contains evil is powerful evidence that God does not exist.  This is because God is supposed to be the most perfect being possible, and among these perfections is both perfect power and perfect goodness.  If God were perfectly powerful, then he would be able to eliminate all instances of evil.  If God were perfectly good, then he would want to eliminate all instances of evil.  Thus, if God exists, there would be no evil.  But there is evil.  Therefore, God does not exist.

While the foregoing sketches the rough terrain, the argument from evil comes in two distinct forms.  First is the logical problem of evil.  According to the logical problem of evil, it is not logically possible for both evil and God to coexist.  Any world in which God exists will be a world devoid of any evil.  Thus, anyone who believes both that God exists and that evil exists is committed to an implicit contradiction.

Second is the evidential argument from evil. According to the evidential argument from evil, while it is logically possible that both God and evil coexist, the latter is evidence against the former.  The evidential argument is sometimes put in terms of an inference to the best explanation (that is, the abductive argument from evil) and sometimes in terms of probabilities (that is, the inductive argument from evil).  In either case, certain facts about the existence, nature and distribution of evils in the world are offered as pro tanto evidence against the truth of theism.  This article focuses on the probabilistic (inductive) version of the evidential argument from evil as it is the most common in the contemporary literature.

It is widely conceded that there is no logical problem of evil for the following reason: if there is a God, he would allow any particular instance of evil that is necessary either to avoid some evil equally bad or worse or to secure some compensating (or justifying) good.  For instance, the experience of pain is an intrinsic evil.  However, the fact that a human father allows his child to experience the pain of an inoculation does not thereby show that the father is not perfectly good.  That is because, although evil in itself, the pain was necessary to secure a compensating good, namely being immune to a painful or deadly disease.  Philosophers call any instance of evil that is not necessary either to avoid some evil equally bad or worse or to secure some compensating (or justifying) good a gratuitous evil.  Thus, it is only the existence of gratuitous evil (instead of any evil whatsoever) that poses a (putative) problem for theism.

With the distinction between gratuitous and non-gratuitous evil in hand, the evidential argument from evil can be formulated as follows:

1. If God exists, then there are no instances of gratuitous evil.

2. It is likely that at least some instances of evil are gratuitous.

3. Therefore, it is likely that God does not exist.

The gist is that insofar as we have reason to believe that at least some of the evils in our world are not necessary either to avoid some evil equally bad or worse or to secure some compensating (or justifying) good, we have reason to believe that God does not exist.  So there is still a sense in which a logical problem of evil remains—it is logically impossible that God and gratuitous evil coexist.  The evidential character of the argument centers on premise (2): the best we can do is to present an inductive case for the claim that any particular evil in our world is gratuitous.
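The logical skeleton of the argument can be made explicit. The following regimentation is offered only as an illustrative sketch, not as any particular author’s official formulation; the letters G and E and the one-half threshold are introduced here purely for exposition. Let G be the proposition that God exists and E the proposition that at least one instance of evil is gratuitous:

\[
(1)\;\; G \rightarrow \neg E \qquad\quad
(2)\;\; \Pr(E) > \tfrac{1}{2} \qquad\quad
\therefore\;\; \Pr(\neg G) \,\ge\, \Pr(E) \,>\, \tfrac{1}{2}
\]

The inference is a probabilistic analogue of modus tollens: if (1) is granted as certain, then E and G cannot both be true, so whatever probability attaches to E attaches at least as strongly to the denial of G. This makes it clear why the debate below concentrates entirely on premise (2), that is, on whether we are ever in a position to judge that some evil is likely gratuitous.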

3. Responses to the Argument from Evil

Theists have challenged both premises in the argument from evil.  Regarding premise (1), some have challenged the notion that God is required by his moral perfection to eliminate all instances of gratuitous evil (for example, Van Inwagen 2003).  However, by and large, theists have focused their attention on the minor premise: the claim that it is likely that some of the evils in our world are gratuitous.  There are two ways of responding to this premise.  One may either deny it or seek to show that we should be agnostic about it.  Each strategy is sketched below.

a. Denying the Minor Premise

Challenges to the argument from evil that purport to show that premise (2) is false are typically called theodicies.  A theodicy is an attempt to show that no actual evil in our world is gratuitous, or, in logically equivalent terms, that all the evils in our world are necessary either to avoid some evil equally bad or worse or to secure some compensating (or justifying) good.  If a theist can successfully show this, then premise (2) in the argument from evil is false, and the argument from evil is unsound.

Theodicies take a number of different forms.  Some try to show that the evils in our world are necessary for compensating goods such as moral development, significant free will, and so on.  Others try to show that evils in our world are necessary to avoid evils equally bad or worse.  In either case, a successful theodicy will have to be thorough—if even one instance of evil in the world turns out to be gratuitous, the minor premise is true and the argument from evil goes through.

b. Skepticism about the Minor Premise

The burden of proof for a theodicy is tremendously high.  The theodicist must show that all of the evils in our world are non-gratuitous.  For this reason, many theistic philosophers prefer only to show that we should be agnostic about premise (2).  Skepticism about premise (2) is typically defended in one of two ways: by appeal to a defense or by appeal to the resources of skeptical theism.

Unlike a theodicy, a defense does not attempt to show what God’s actual reason is for allowing any particular instance of evil.  Instead, it attempts to show what God’s reasons might be for all we know.  And if God might have reasons for allowing a particular evil that we do not know about, then we are in no position to endorse premise (2) in the evidential argument from evil.  The idea is that there are relevant alternatives that we are in no position to rule out, and unless we are in such a position, we should not conclude that the minor premise is true.

For example, suppose you are a juror in a criminal case, and—given only the videotape evidence—you cannot determine whether the defendant or his twin committed the crime.  In this case, you are not justified in concluding that the defendant is guilty, and that is because there is a live possibility that you cannot rule out, and this possibility would show that the defendant is innocent.  The same might be said of premise (2) in the argument from evil: there are live possibilities that we are in no position to rule out, and these possibilities would show that God is justified in allowing the evils in our world.  And if so, we are in no position to endorse premise (2) of the argument from evil.

Skeptical theism provides a second, independent case for agnosticism about premise (2).  This case takes the form of an undercutting defeater for the standard defense of premise (2).  Why should we think that it is likely that at least some of the evils in our world are gratuitous?  The standard defense of this claim is as follows:

Well, it seems like many of the evils in our world are gratuitous, so it is likely that at least some instances of evil are gratuitous.

Put differently, we cannot see any reason for God to allow some of the evils in our world; therefore, we should conclude that there is no reason for God to allow some of the evils in our world.  Call this inference pattern the “noseeum” inference (“if we can’t see ‘um, they ain’t there”).

The skeptical theist denies the strength of this noseeum inference.  The fact that an evil appears to be gratuitous to us is not indicative of whether or not it is gratuitous.  So on the one hand, the skeptical theist is happy to grant that it seems as if many of the evils in our world are gratuitous.  However, she denies that this fact is good evidence for the claim that such evils really are gratuitous.  And hence we have no reason to endorse premise (2) in the argument from evil.

4. Defenses of Skeptical Theism

As a reply to the argument from evil, skeptical theism seems initially quite plausible.  Surely if there were a God, there would be many, many cases in which we could see no reason for a course of action although such reasons were available to God. Some things that look unjustifiable given our own perspectives are justifiable once one has all the facts.  Besides relying on this initial plausibility, skeptical theists have defended their view in roughly three ways.

a. Arguments from Analogy

The fact that a young child cannot discern a reason for her parents allowing her to suffer pain does not constitute a good reason for the young child to conclude that there are no such reasons.  In this case, a clear example of the noseeum inference fails.  Given the child’s limited knowledge and experience as compared to the knowledge and experience of her parents, she ought not conclude that her parents are not justified in allowing a certain evil to occur.  Other similar examples are easy to come by: if one does not play much chess, the fact that one cannot see why the chess master makes a particular move is not indicative of whether or not such a move is justified.  It would be silly to reason as follows: I cannot see a good reason for that move; therefore, there is no good reason for that move.

If these cases are persuasive, the skeptical theist can defend her position accordingly.  The cognitive distance between a young child and her parents is analogous to the cognitive distance between a human agent and God.  Thus, the fact that a human is unable to see a reason for allowing a particular evil is not a good reason for concluding that God would have no reason for allowing that evil.

b. Arguments from Complexity

On its face, premise (2) is very straightforward: it is very likely that at least some of the evils in our world are gratuitous.  But when we get clear on what that means, we see that this kind of judgment is extraordinarily complex.  It says, in effect, that we are able to identify some instances of evil which were not necessary either to avoid an evil equally bad or worse or to secure some compensating good.  How could we ever know such complex facts?  For example, consider the following:

On the night that Sir Winston Churchill was conceived, had Lady Randolph Churchill fallen asleep in a slightly different position, the precise pathway that each of the millions of spermatozoa took would have been slightly altered.  As a result…Sir Winston Churchill, as we knew him, would not have existed, with the likely result that the evolution of World War II would have been substantially different… (Durston 2000, p. 66)

On the face of it, it appears that it would not matter what position Lady Churchill slept in.  Put differently, it appears that there is no good reason to prefer her sleeping in one position rather than another.  But given the specifics of human reproduction, this assumption is unwarranted and—in this case—plausibly false.  So the fact that we cannot see a reason is not indicative of whether or not there is any such reason.  The same point applies, mutatis mutandis, to the inference from “we can see no reason to allow this evil” to “there is no reason to allow this evil.”

c. Arguments from Enabling Premises

One of the most sophisticated defenses of skeptical theism insists that some sort of enabling premise must be reasonably believed before noseeum inferences are warranted and, further, that this enabling premise is not reasonably believed with regard to inferences about what God would allow.  Two such enabling premises have been proposed in the literature: the first concerns our sensitivity to evidence and the second concerns the representativeness of our inductive samples.

The most common instance of the sensitivity strategy invokes an epistemic principle dubbed the Condition on Reasonable Epistemic Access, or “CORNEA” for short (Wykstra 1984).  CORNEA says that inferences from “I see no X” to “There is no X” are justified only if it is reasonable to believe that if there were an X, I would likely see it.  So, for example, the inference from “I see no elephant in my office” to “There is no elephant in my office” is licensed by CORNEA since I reasonably believe that if there were an elephant in my office, I would likely see it.  However, such skeptical theists have insisted that it is not reasonable for me to think that if there were a reason for allowing any particular evil, I would be aware of it.  Given this assumption, CORNEA says that the inference from “I see no reason for allowing this instance of evil” to “There is no reason for allowing this instance of evil” is not justified.
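The shape of this constraint can be put schematically. The rendering below is a sketch rather than Wykstra’s own formalism, and the probabilistic gloss on “I would likely see it” is an interpretive assumption introduced here for clarity; S stands for the epistemic subject and X for the kind of item in question:

\[
\frac{S \text{ detects no } X}{\text{there is no } X}
\quad \text{is licensed only if } S \text{ may reasonably believe that } \Pr\big(S \text{ detects an } X \,\big|\, \text{there is an } X\big) \text{ is high.}
\]

Applied to the argument from evil, X is a God-justifying reason for permitting a given instance of suffering. The skeptical theist’s contention is that the right-hand condition fails for creatures in our cognitive position, so the noseeum inference behind premise (2) is blocked even though its premise, that we see no such reason, is granted.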

The second strategy has to do with our knowledge of the representativeness of the inductive sample used in the noseeum inference.  According to this version of the strategy, the inductive move from “I see no X” to “There is no X” is warranted only if it is reasonable for me to believe that my inductive sample of X’s is representative of the whole.  For example, one should not rely on inductive evidence to conclude that all crows are black unless it is reasonable to assume that one’s sample of crows is representative of all crows.  As applied to the argument from evil, the inference from “I can see no reason to allow this evil” to “There is no reason to allow this evil” is justified only if it is reasonable for one to believe that the sample of reasons currently understood is representative of all of the reasons there are.  The crucial question then becomes whether or not any of us have good reason to think that our sample of goods, evils, and the connections between them is suitably representative.  Some philosophers think that we do have such reason (for example, Tooley 1991).  Others think that our knowledge is not representative (for example, Sennett 1993).  Others think we cannot tell one way or the other whether our sample is representative, and thus we lack good reason for thinking that the sample is representative, as required by the second strategy (for example, Bergmann 2001).

5. Objections to Skeptical Theism

As with any form of skepticism, skeptical theism has its critics.  Some of these critics are theists who think that skeptical theism has unwelcome implications for issues of importance to theism (such as knowledge of God, relationship with God, and the like).  Other critics think that skeptical theism has unwelcome implications for more general issues such as everyday knowledge, moral living, and so on.  The objections to skeptical theism fall roughly into five different sorts.

a. Implications for the Divine-Human Relationship

One prominent criticism of skeptical theism is that it eliminates the potential for a close relationship between humans and God.  It does so in two ways.  First, if skeptical theism undercuts arguments against the existence of God by highlighting the fact that we know very little about how God would act (all-things-considered), then by parity of reasoning it also undercuts arguments for the existence of God.  Skeptical theist considerations seem to suggest agnosticism about whether God would create a world, fine-tune the universe, create rational beings, and so on, despite the fact that each of these is an assumption in standard arguments for the existence of God.  And the same considerations appear to undercut our knowledge of God’s interactions in the world; it is no longer open to the theist to say what God wants in her life (all-things-considered), whether a particular event was a miracle, and so on.

Second, skeptical theism not only appears to undercut one’s knowledge of God, but it also seems to undercut one’s trust in God.  Being in a close relationship with another person requires some kind of understanding of what the other person wants and why the other person acts as she does.  Furthermore, communication is important to a relationship, but skeptical theists should not trust communication from God (including divine commands, mystical experiences, and so on).  Why?  Because for all we know, God has a reason for deceiving us that is beyond our ken.

b. Implications for Everyday Knowledge

Any non-global version of skepticism will face objections that attempt to stretch the skepticism to new areas of inquiry.  One objection of this sort claims that skeptical theism collapses into a near-global skepticism that disallows what we might think of as everyday knowledge.  Consider the claim that all crows are black.  This seems a perfect example of everyday knowledge.  But a skeptical crowist might respond as follows: “for all we know, there are purple crows beyond our ken; thus, the fact that we see no purple crows is not indicative of the fact that there are no purple crows.”  Thus we do not know that all crows are black.

c. Implications for Commonsense Epistemology

Others have argued not that skeptical theism is incompatible with any particular knowledge claim but that it is incompatible with a promising set of theories in epistemology.  In particular, skeptical theism appears to rule out so-called commonsense epistemologies that rely on something like the principle of credulity: other things being equal, it is reasonable to believe that things are as they appear.  The problem is that skeptical theists grant that at least some evils appear gratuitous, thus, by the principle of credulity, they ought to grant that it is reasonable to believe that at least some evils are gratuitous.  But that is precisely what skeptical theism denies.

d. Implications for Moral Theory

The skeptical theist’s strategy relies on the presumption that there are some moral judgments that we are not justified in making.  Consider an instance of childhood cancer.  The skeptical theist is unwilling to grant that this evil is gratuitous because—for all we know—it was necessary either to prevent some evil equally bad or worse or to secure some compensating good.  Furthermore, if the evil is not gratuitous, it seems that it would be morally permissible (or even morally obligatory) for God to allow that evil to occur.  This is how the skeptical theist hopes to get God off the hook: we cannot blame him for creating the actual world if he meets all of his moral obligations in doing so.

The putative problem is that the skeptical theist seems to be committed to a consequentialist view of ethics, and many philosophers find such a view unappealing.  The apparent implications result from the fact that a skeptical theist seems to allow that no matter how horrendous a particular instance of evil might be, it can always be justified given good enough consequences.  Thus, if one thinks that there are some things that morally ought not be allowed regardless of consequences (such as the torture of an innocent person), this putative implication counts against skeptical theism.

e. Implications for Moral Living

Finally, the most pressing objection to skeptical theism is that it seems to preclude both the possibility of engaging in moral deliberation and the possibility of moral knowledge.  The putative problem can be sketched as follows: if, for any instance of evil, we are unable to tell whether or not the evil is gratuitous, then we are unable to engage in moral deliberation and arrive at a view about what is reasonable for us to do.  For example, suppose a skeptical theist comes upon a young boy drowning in a pond.  His skeptical theism seems to commit him to reasoning as follows: for all I know, the boy’s death is necessary to prevent some greater evil or to secure some greater good, thus I do not have a reason to intervene.

Skeptical theists have offered a number of interesting responses to this objection.  Some think that what is wrong for a person depends only on what he or she knows, and thus it would be wrong for the bystander to let the boy drown since he does not know that the boy’s death is non-gratuitous.  Others think that what is right for God to allow might be different from what is right for us to allow.  In that case, it might be wrong for you to let the boy drown even though you cannot conclude (for skeptical theist reasons) that it is wrong for God to do the same.  Still others insist that there is no unique difficulty here: everyone faces the hurdle of attempting to decide whether a particular event will have, on balance, good or bad consequences.  In that case, though it is true that moral deliberation is difficult given skeptical theism, it is also difficult given any view of religious epistemology.

6. References and Further Reading

  • Almeida, M. & Oppy, G. (2003) “Sceptical Theism and Evidential Arguments from Evil,” Australasian Journal of Philosophy 81:4, pp. 496-516.
    • An objection to skeptical theism based on its implications for the moral life.
  • Alston, W. (1991) “The Inductive Argument from Evil and the Human Cognitive Condition,” Philosophical Perspectives 5, pp. 29-67.
    • A defense of skeptical theism by appeal to analogy.
  • Bergmann, M. (2001) “Skeptical Theism and Rowe’s New Evidential Argument from Evil,” Nous 35, pp. 278-296.
    • Seminal statement of skeptical theism and a defense of skeptical theism by appeal to enabling premises.
  • Draper, P. (1989) “Pain and Pleasure: An Evidential Problem for Theists,” Nous 23, pp. 331-350.
    • A concise statement of the abductive argument from evil.
  • Dougherty, T. (2008) “Epistemological Considerations Concerning Skeptical Theism,” Faith & Philosophy 25, pp. 172-176.
    • An objection to skeptical theism based on its implications for commonsense epistemology.
  • Durston, K. (2000) “The consequential complexity of history and gratuitous evil,” Religious Studies 36, pp. 65-80.
    • A defense of skeptical theism by appeal to complexity.
  • Hasker, W. (2004) “The sceptical solution to the problem of evil,” in Hasker, W. Providence, Evil, and the Openness of God (Routledge) pp. 43-57.
    • An example of an objection to skeptical theism by a theist.
  • Hick, J. (1966) Evil and the God of Love (Harper & Row).
    • A clear presentation and defense of a soul-crafting theodicy.
  • Howard-Snyder, D. (2010) “Epistemic Humility, Arguments from Evil, and Moral Skepticism,” in Kvanvig, J. (ed.) Oxford Studies in Philosophy of Religion (Oxford: Oxford University Press) pp. 17-57.
    • Responding to an objection to skeptical theism based on its implications for moral living.
  • Jordan, J. (2006) “Does Skeptical Theism Lead to Moral Skepticism?” Philosophy and Phenomenological Research 72, pp. 403-416.
    • An objection to skeptical theism based on its implications for moral living.
  • Mackie, J.L. (1955) “Evil and Omnipotence,” Mind 64:254, pp. 200-212.
    • The classic statement of the logical problem of evil.
  • McBrayer, J. (2010) “Skeptical Theism,” Philosophy Compass 4:1, pp. 1-13 (Blackwell).
    • A thorough review of the case for and against skeptical theism with an exhaustive bibliography.
  • McBrayer, J. (2009) “CORNEA and Inductive Evidence,” Faith & Philosophy 26:1, pp. 77-86.
    • An objection to the defense of skeptical theism by appeal to enabling premises.
  • Plantinga, A. (1974) God, Freedom, and Evil (Eerdmans).
    • The classic response to the logical problem of evil.
  • Rowe, W. (2001) “Skeptical Theism: A Response to Bergmann,” Nous 35, pp. 297-303.
    • An objection to the defense of skeptical theism by appeal to analogies and enabling premises.
  • Rowe, W. (1979) “The Problem of Evil and Some Varieties of Atheism,” American Philosophical Quarterly 16, pp. 335-41.
    • A clear and classic statement of the evidential argument from evil.
  • Sennett, J. (1993) “The Inscrutable Evil Defense against the Inductive Argument from Evil,” Faith & Philosophy 10, pp. 220-229.
    • A defense of skeptical theism by appeal to enabling premises.
  • Tooley, M. (1991) “The Argument from Evil,” Philosophical Perspectives 5, pp. 89-134.
    • An objection to the defense of skeptical theism by appeal to enabling premises.
  • Trakakis, N. (2003) “Evil and the complexity of history: a response to Durston,” Religious Studies 39, pp. 451-458.
    • An objection to the defense of skeptical theism by appeal to complexity.
  • Van Inwagen, P. (2003) The Problem of Evil (Oxford University Press).
    • A clear presentation of the argument from evil (§2) and an example of a defense.
  • Wilks, I. (2009) “Skeptical Theism and Empirical Unfalsifiability,” Faith & Philosophy 26:1, pp. 64-76.
    • An objection to skeptical theism based on its implications for everyday knowledge.
  • Wykstra, S. (1984) “The Humean Obstacle to Evidential Arguments from Suffering: On Avoiding the Evils of ‘Appearance’,” International Journal of Philosophy of Religion 16, pp. 73-93.
    • A defense of skeptical theism by appeal to enabling premises.

Author Information

Justin P. McBrayer
Email: mcbrayer_j@fortlewis.edu
Fort Lewis College
U. S. A.

The American Environmental Justice Movement

The origin of the American environmental justice movement can be traced back to the emergence of the American Civil Rights movement of the 1960s, and more specifically to the U.S. Civil Rights Act of 1964.  The movement reached a new level with the publication of Robert Bullard’s Dumping in Dixie in the 1990s, which constituted a clarion call for environmental justice. Although environmentalism and the environmental justice movement are related, there is a difference.  Environmentalism is concerned with humanity’s adverse impact upon the environment, whereas proponents of environmental justice are primarily concerned with the impact of an unhealthy environment thrust upon a collective body of life, encompassing both human and non-human existence and, in some instances, plant life.  The efforts of the environmental justice movement differ from those of the environmentalist movement in that, at the heart of environmental injustice, there are issues of racism and socio-economic injustice.  Although environmentalism focuses upon and acknowledges the negative impact of humanity’s actions upon the environment, the environmental justice movement builds upon the philosophy and work of environmentalism by stressing the manner in which adversely impacting the environment in turn adversely impacts the population of that environment.

Table of Contents

  1. The Definition of Environmental Justice
  2. History of the Environmental Justice Movement
  3. Environmental Racism and Environmental Justice
  4. Principles of the Environmental Justice Movement
  5. Causes of Environmental Injustice
  6. Major Events in the Environmental Justice Movement
  7. Environmental Justice Policy and Law
  8. References and Further Reading
    1. Books
    2. Journals
    3. Governmental and Legal Publications

1. The Definition of Environmental Justice

Although the origin of the environmental justice movement is traced to the passing of the Civil Rights Act of 1964, Robert Bullard’s Dumping in Dixie, published in the 1990s, is considered to be the first book addressing the reality of environmental injustice.  The work examines the widening economic, health and environmental disparities between racial groups and socioeconomic groups at the end of the twentieth and the beginning of the twenty-first centuries.  Bullard states that in writing the book he operated with the assumption that all Americans have a basic right to live, work, play, go to school and worship in a clean and healthy environment (DD, xii).  Bullard’s analysis in Dumping in Dixie “chronicles the emergence of the environmental justice movement in an effort to develop common strategies that are supportive of building sustainable African American communities and other people of color communities” (DD, xiii).

Bullard’s wife, a practicing attorney, suggested that he study the spatial location of all the municipal solid-waste disposal facilities in Houston, Texas.  The study was done as part of a class-action lawsuit she filed against the city of Houston, the State of Texas, and Browning Ferris Industries.  The lawsuit originated from a plan to site a municipal landfill in a suburban, middle-income neighborhood of single-family homeowners.  The lawsuit became known as Bean v. Southwestern Waste Management and was the first lawsuit in the United States charging environmental discrimination in waste facility location under the Civil Rights Act.  The Northwood Manor neighborhood consisted of over 82 percent African American residents (DD, xii).

The emergence of the environmental justice movement is directly linked to the environmental movement.  Some contend that environmentalism and the environmental justice movement are so interrelated that the movement has essentially redefined the nature of environmentalism.  According to Bullard, an environmental revolution is taking shape in the United States which “has touched communities of color from New York to California and from Florida to Alaska” and any location “where African Americans, Latinos, Asians, Pacific Islanders, and Native Americans live and comprise a major portion of the population” (CER, 7).  According to Bullard, the influence of the environmental justice movement has broadened the spectrum of environmentalism to include what might be regarded as the trivialities of everyday life, such as play and attending school, and even something as simple as where humans, animals and plants reside.  Bullard points out that the environmental justice movement in the United States focuses upon a diversity of areas including wilderness and wildlife preservation, resource conservation, pollution abatement and population control (DD, 1).  The environmental justice movement served to interrelate the physical, social, and cultural dimensions of human, non-human and plant existence under the rubric of environmentalism in general and environmental justice in particular (Bullard, 1999).  The environmental justice movement has indirectly heightened concern not only for human existence, but also for animals and plant life.  The reality is that no single definition of environmental justice exists.  However, a significant legal definition used by the Environmental Protection Agency describes environmental justice as:

[T]he fair treatment and meaningful involvement of all people regardless of race, ethnicity, income, national origin, or educational level with respect to the development, implementation and enforcement of environmental laws, regulations and policies. Fair treatment means that no population, due to policy or economic disempowerment, is forced to bear a disproportionate burden of the negative human health or environmental impacts of pollution or other environmental consequences resulting from industrial, municipal, and commercial operations or the execution of federal, state, local, and tribal programs and policies (EPA, 2).

The environmental justice movement is concerned with the pursuit of social justice, and the preamble to the Principles of Environmental Justice adopted at the First National People of Color Environmental Leadership Summit in Washington D.C. in 1991 reflects the primacy of this concern.  According to the environmental justice movement, all Americans, regardless of whether they are white or black, rich or poor, are entitled to equal protection under the law.  The environmental justice movement advocates for quality education, employment, and housing, as well as the health of the physical environments in which individuals, families and groups live (DD, 7).

While the environmental justice movement is rooted in significant philosophical and sociological underpinnings, the movement strives to be intensely practical.  Few environmentalists realize the sociological implications of what has been termed the “not-in-my-backyard” phenomenon, which entails the recognition that hazardous waste, garbage dumps and polluting industries will inevitably be located in someone’s backyard.  The question then emerges as to whose backyards these toxic facilities will be located in.  Based upon sociological analysis, Bullard concluded that such facilities repeatedly end up in poor, powerless, black communities rather than affluent suburbs (DD, 4).

It is important to note that the movement is critical of Western theories of jurisprudence and philosophy, which are founded upon Kantian, Cartesian and Lockean assumptions.  For instance, Kantian jurisprudence is committed to the idea of the universality of rules in addressing a wide range of moral issues, Cartesian dualism devalues the significance of physical existence and threats to that existence, and the philosophical conclusions of John Locke preserve individualism at the expense of the collective group.  The environmental justice movement rejects each of these, concluding that no universal law or rule can be applied in a diversity of moral contexts, that the physical existence of a collective body is to be aggressively protected, and, finally, that no one individual or particular group is to be victimized for the benefit of another.  In short, such theories do not “embrace the whole community of life as the relevant moral community” (Rasmussen, 12).  Not only do these traditional philosophical underpinnings of the Western worldview fail to include all members of the total human community, they also fail to acknowledge the significance of life in the non-human sphere.

It is also important to note that environmental justice advocates reject the Rawlsian understanding of justice as “fairness.”  In acknowledging the reality of social, economic and moral inequity, Rawls argued that such inequities are permissible only on the condition that they benefit the least advantaged.  In the philosophy of the environmental justice movement, however, to adopt Rawls’ definition of justice and to tolerate actual instances of inequity and injustice on the grounds that they benefit the collective victims is to perpetuate centuries of oppression, which have become part and parcel of inadequate and distorted forms of institutional decision-making (Deane Drummond, 10).  Furthermore, for environmental justice proponents, “justice is justice as distribution, recognition, and participation, linked in ways that address the wellbeing of the whole community of life in a given locale” (Rasmussen, 17).

Part of the uniqueness of the environmental justice movement is its focus on injustice as a collective experience.  Consequently, those in the movement strive for the actual pursuit, promotion, and establishment of better living conditions for collective entities, both human and non-human.  As such, at its very core the environmental justice movement is transformational and strives to empower collective victims of environmental injustice with the capacity for self-provision, self-organization, and self-governance (Rasmussen, 17).

In addition, and as previously indicated, there is an important distinction to be made between environmentalism and the environmental justice movement.  While environmentalism is concerned with environmental injustice and the pursuit of justice, it is primarily concerned with the abuse of the environment by a hierarchical model which places humanity at the top, with the result being the abuse of nature.  Environmental justice advocates, on the other hand, are more concerned with what is termed “social ecology” or “human welfare ecology.”  Their primary concern is the impact of systemic institutional flaws, the natural result of a progression of historical events, which produce decisions that impose unjust living conditions upon groups of people who lack organization, power and prominence.  At the risk of oversimplification, whereas environmentalism is concerned with humanity’s adverse impact upon the environment, environmental justice proponents are primarily concerned with the impact of an unhealthy environment thrust upon a collective body of life, both human and non-human, including in some instances plant life.  The efforts of the environmental justice movement thus go beyond those of the environmentalist movement.

Environmental justice advocates contend that instances of environmental injustice are not simply arbitrary realities which occur in varying contexts.  Rather, instances of environmental injustice are the outcome of institutional oppression and isolation, which have set up an inevitable framework of the powerful oppressing the powerless.  The victims, through a succession of historical and social realities, have been cut off from the power required even to challenge the causes of environmental injustice.  In a very real sense, the environmental justice movement represents another dimension of social liberation, which attempts to protect victims from institutional and systemic oppression.  However, the task of the environmental justice movement should not be understood only in terms of the negative.  The central and positive question of the environmental justice movement is, “What constitutes healthy, livable, sustainable, and viable communities in the place we live, work, and play as the outcome of interrelated natural, built, social, and cultural/spiritual environments?” (Lee, 141-44).

The environmental justice movement also understands environmental injustice as part of a history of oppression and contends that profound historical realities predating the contemporary context of human existence in the Western world lie at the root of environmental injustice.  Advocates of environmental justice contend that the lack of power on the part of the victims of environmental injustice has a direct line of continuity running from recent civil rights struggles back to the Civil War, and they even trace the root cause of the systemic lack of power of certain groups to the impact of European-based realities which continue to shape the modern context of environmental injustice.  Environmental justice proponents focus upon what are termed “the four interlocking C’s” which have led to the exploitation of particular groups of people.  These “C’s” are conquest, colonization, commerce, and Christian implantation.

The call for environmental justice draws on both environmental and ecological economics, reflected respectively in the work of environmental economics advocates such as Herman Daly, John Kenneth Galbraith and Nicholas Georgescu-Roegen, and ecological economics advocates such as Rebecca Pates and John Hagan.  While the environmental justice movement is primarily concerned with issues related to the United States, any consideration of the movement must acknowledge the contributions of these individuals and others whose work addresses global considerations, since many of the issues with which the environmental justice movement is concerned are also taken up by movements outside the United States.

2. History of the Environmental Justice Movement

The environmental justice movement originated with the passing of the Civil Rights Act of 1964 and of Title VI, which prohibited the use of federal funds to discriminate on the basis of race, color and national origin.  The movement is also related to the work of Dr. Martin Luther King, Jr. in the late 1960’s and his efforts on behalf of black sanitation workers in the city of Memphis, Tennessee.  In 1969, Ralph Abascal of the California Rural Legal Assistance filed a suit on behalf of six migrant farm workers, which resulted in the banning of the pesticide DDT.  In addition, Congress passed the National Environmental Policy Act (NEPA) that same year.  In 1971, the President’s Council on Environmental Quality (CEQ) acknowledged racial discrimination which adversely affected the urban poor and the quality of their environment.  In 1978, residents of Houston’s Northwood Manor subdivision protested the Whispering Pines Sanitary Landfill, and in 1979 Linda McKeever Bullard filed a lawsuit on behalf of Houston’s Northeast Community Action Group.  This lawsuit, titled Bean v. Southwestern Waste Management Inc., constituted the first civil rights suit challenging the siting of a waste facility.  The United Church of Christ Commission for Racial Justice issued the “Toxic Waste and Race in the United States” report in 1987.  The report was the first national study exposing the relationship between waste facility location and race.  The Clean Air Act Amendments were passed in 1990, and Bullard’s book Dumping in Dixie was published in the same year.  This particular work constituted the first textbook on environmental justice.  The first National People of Color Environmental Leadership Summit was held in Washington D.C. in 1991.  In 1994, the Environmental Justice Resource Center was formed at Clark Atlanta University in Atlanta, Georgia.  In addition, during the same year the Washington Office on Environmental Justice (WOEJ) opened in Washington D.C.  The United States environmental justice movement progressed onto the global stage in 1995 when environmental justice delegates participated in the 4th World Conference on Women in Beijing.

The environmental justice movement has existed for more than two decades, reaching an apex in the 1990’s. The movement emerged from an increased awareness of the disproportionately high impacts of environmental pollution on economically and politically disadvantaged communities. It addresses issues such as social, economic and political marginalization of minorities and low income populations, and is also concerned with the perceived increase of pollution not only in neighborhoods and communities, but also in the workplace.

There is no specific founding point for the environmental justice movement, but it was largely created through the fusion of two other movements, the economic analysis of the anti-toxics movement and the racial critique of the Civil Rights movement, together with the over-arching perspective of a third, faith.  Other strong contributions have come from academia, from Native Americans, and from the labor movement (Timeline).

African Americans did not significantly challenge the environmental problems adversely affecting their communities prior to the call for environmental justice.  The shift from denial to acknowledgment and action emerged during the 1980’s.  Until that time, African American resistance was largely limited to local issues and to the individualistic nature of the African American struggle for equality.  However, in the 1980’s a transition took place which would give rise to the environmental justice movement as an extension of the Civil Rights movement.  This shift took place under the designation of “environmental activism” (DD, 29).

The environmental justice movement is credited with having begun in Warren County, North Carolina, where residents demonstrated against a landfill to be placed in their county.  The citizens’ reaction reflected the merging of civil rights activists and environmentalists.  Representatives from these two groups reportedly lay down in front of trucks transporting large amounts of PCB-contaminated soil into the largely African American area of Warren County.  While the Warren County demonstrations were unsuccessful, they did bring a renewed focus to the issue of the disproportionately high impact of environmental pollution upon minority communities such as Warren County.  Ultimately, this event also placed environmental justice concerns onto the political agenda.

In 1992, a National Law Journal report alleged that the Environmental Protection Agency (EPA) had discriminated in its enforcement of environmental protection law, thereby supporting the observations of those among whom the movement originally emerged.  The report indicated that federal fines were more lenient for industries operating in communities of color.  In addition, the report contended that the cleanup of environmental disasters in communities of color was much slower than in wealthier white communities.  Furthermore, the report indicated that standards for cleanup in communities of color were not as well established or as stringent as those applied in white communities.

3. Environmental Racism and Environmental Justice

Environmental justice advocates argue that an intimate relationship exists among the triad of environmental racism, environmental discrimination, and environmental policymaking.  Environmental injustice and environmental racism have their roots in a politico-institutional context bent toward discrimination.  Municipal, state, and federal regulations are, on this view, aimed at permitting, condoning and even promoting environmental racism.

In addition, environmental justice proponents contend that governmental policy is also bent toward the deliberate targeting of communities of color for toxic waste disposal and for the siting of polluting industries.  Further, policy and legislation not only permit but also endorse the official sanctioning of life-threatening poisons and pollutants being located in communities of color.  Environmental justice advocates also contend that residents of victimized groups are cut off from access to political power and consequently have been excluded from service on decision-making boards and regulatory bodies, thereby subtly yet deliberately promoting environmental injustice and environmental racism.  Each of these elements contributes to the existence and propagation of environmental injustice and environmental racism (CER, 3).

Environmental justice proponents contend, “Experiences of environmental racism and injustice are not random, nor are they individual.” Consequently, the environmental justice movement is concerned with these two matters, collectivism and perceived intentionality.  On the one hand, environmental justice advocates concern themselves with environmental injustice as it happens to groups; and on the other hand, environmental justice advocates are also concerned with the systemic causes of environmental injustice (Rasmussen, 3-4).

Robert Bullard states that race is a major factor in predicting the placement of Locally Unwanted Land Uses (LULUs).  Some would contend that socio-economic class is the central issue, however.  Bullard counters that while race and class are combined factors, race is still the predominant factor.  Environmental justice activists maintain that race dominates policy decisions made by those in positions of power because the power arrangements of socio-economic institutions are out of balance.

Bullard also maintains that environmental justice is not a social program, nor is it an affirmative action program, and that ultimately the central concern of the movement is the implementation of justice.  In addition, Bullard maintains that race, while constituting a portion of the problematic equation associated with environmental injustice, is not the only concern of the movement.

We are just as much concerned with inequities in Appalachia, for example, where the whites are basically dumped on because of lack of economic and political clout and lack of having a voice to say ‘no’ and that’s environmental injustice.  We are trying to work with folks across the political spectrum; Democrats, Republicans, independents, on the reservations, in the barrios, in the ghettos, on the border and internationally to see that we address these issues in a comprehensive manner. (Interview)

However, in his earlier work entitled Confronting Environmental Racism: Voices from the Grassroots, Bullard does give voice to his belief that the problem of environmental injustice is to a large extent a racially oriented problem and that this is a problem which communities of color face.  He couches his discussion concerning environmental justice in the context of the recognition that at the heart of the problem of environmental injustice is a racially divided nation in which extreme racial inequalities persist.  However, by the time of Bullard’s more major work entitled Dumping in Dixie, he had acknowledged that the reality of environmental injustice transcends the issue of the victimization of any one race or ethnic group (CER, 7).

4. Principles of the Environmental Justice Movement

As indicated above, a 1992 National Law Journal report concluded that the EPA had discriminated in its enforcement of environmental protection law, law that was intended to remedy the reality of environmental racism in the United States.  In 1991, at the First National People of Color Environmental Leadership Summit meeting in Washington D.C., the Principles of Environmental Justice were adopted.  These principles represent an initial rallying cry on behalf of those inhabitants, human and non-human, who are the victims of environmental injustice, and eventually established a context for a guide to action regarding governmental legislation.  Those principles are:

  1. Environmental justice affirms the sacredness of Mother Earth, ecological unity and the interdependence of all species, and the right to be free from ecological destruction.
  2. Environmental justice demands that public policy be based on mutual respect and justice for all peoples, free from any form of discrimination or bias.
  3. Environmental justice mandates the right to ethical, balanced and responsible uses of land and renewable resources in the interest of a sustainable planet for humans and other living things.
  4. Environmental justice calls for universal protection from nuclear testing, extraction, production and disposal of toxic/hazardous wastes and poisons and nuclear testing that threaten the fundamental right to clean air, land, water, and food.
  5. Environmental justice affirms the fundamental right to political, economic, cultural and environmental self-determination of all peoples.
  6. Environmental justice demands the cessation of the production of all toxins, hazardous wastes, and radioactive materials, and that all past and current producers be held strictly accountable to the people for detoxification and the containment at the point of production.
  7. Environmental justice demands the right to participate as equal partners at every level of decision-making including needs assessment, planning, implementation, enforcement and evaluation.
  8. Environmental justice affirms the right of all workers to a safe and healthy work environment, without being forced to choose between an unsafe livelihood and unemployment. It also affirms the right of those who work at home to be free from environmental hazards.
  9. Environmental justice protects the right of victims of environmental injustice to receive full compensation and reparations for damages as well as quality health care.
  10. Environmental justice considers governmental acts of environmental injustice a violation of international law, the Universal Declaration on Human Rights, and the United Nations Convention on Genocide.
  11. Environmental justice must recognize a special legal and natural relationship of Native Peoples to the U.S. government through treaties, agreements, compacts, and covenants affirming sovereignty and self-determination.
  12. Environmental justice affirms the need for urban and rural ecological policies to clean up and rebuild our cities and rural areas in balance with nature, honoring the cultural integrity of all our communities, and providing fair access for all to the full range of resources.
  13. Environmental justice calls for the strict enforcement of principles of informed consent, and a halt to the testing of experimental reproductive and medical procedures and vaccinations on people of color.
  14. Environmental justice opposes the destructive operations of multi-national corporations.
  15. Environmental justice opposes military occupation, repression and exploitation of lands, peoples and cultures, and other life forms.
  16. Environmental justice calls for the education of present and future generations, which emphasizes social and environmental issues, based on our experience and an appreciation of our diverse cultural perspectives.
  17. Environmental justice requires that we, as individuals, make personal and consumer choices to consume as little of Mother Earth’s resources and to produce as little waste as possible; and make the conscious decision to challenge and reprioritize our lifestyles to insure the health of the natural world for present and future generations (ejnet).

The First National People of Color Environmental Leadership Summit brought together hundreds of environmental justice activists from both the national and the global stage.  The objective of the conference was to advocate for local and regional environmental justice activism in the form of both regional and ethnic networks.  The Summit led to the creation of the Asian Pacific Environmental Network, the Northeast Environmental Justice Network, the Southern Organizing Committee for Economic and Environmental Justice and the Midwest/Great Lakes Environmental Justice Network.  In 1993, Max Baucus, a Democrat from Montana, introduced the Environmental Justice Act of 1993, which addressed assertions that poor and minority areas are disproportionately affected by environmental pollution.  Representative John Lewis, a Democrat from Georgia, introduced a similar bill in the House of Representatives.

5. Causes of Environmental Injustice

Environmental injustice is said to exist when members of disadvantaged ethnic minority or other groups suffer disproportionately at the local, regional (subnational), or national levels from environmental risks or hazards, or from violations of fundamental human rights as a result of environmental factors.  In addition, environmental injustice has occurred when an individual or group of individuals is denied access to environmental investments, benefits, and natural resources.  Furthermore, environmental injustice has taken place when individuals or collective groups are denied access to information, to participation in decision-making, or to justice in environment-related matters.  The study of environmental injustice has the responsibility of examining the hierarchies of power that are inherent in any given socio-cultural context and the manner in which those hierarchies not only tolerate but also propagate environmental injustice against any number of disadvantaged groups (EIPS, 2).

One cause of environmental injustice is institutionalized racism.  Institutionalized racism, considered to be the natural outgrowth of racism, is defined as the practice of deliberately and intentionally targeting neighborhoods and communities composed largely of people of color and of low socio-economic status.  According to environmental justice proponents, this racism has become acculturated and engrained in contemporary social institutions, not least a governmental bureaucracy at the municipal, state, and federal levels which not only permits but reinforces the imposition of environmental injustice upon these groups.  Bunyan Bryant defines environmental racism as “the systematic exclusion of people of color from environmental decisions affecting their communities” (Bryant, 5 and Rasmussen, 8).

Another factor leading to the reality of environmental injustice is the commoditization of land, water, energy and air, which results in their being secured and protected for the benefit of those in power at the expense of those who lack power.  Advocates of environmental justice remind us that, regardless of our status in life, we all exist collectively within this biosphere.  Therefore “we breathe the same air, share the same atmosphere with the same ozone layer and climate patterns, eat food from the same soils and seas, and harvest the same acid rain” (Rasmussen, 8).

In addition, unresponsive and unaccountable governmental policies and regulations at all levels of government contribute to environmental racism and environmental injustice.  Government authorities are frequently unresponsive to community needs regarding environmental inequities due to the existence of an oppressive power structure.  Furthermore, governmental responsiveness to powerful corporations, which exert power in their own self-interest, also poses problems.  Consequently, the victims of environmental injustice find it difficult, if not impossible, to use governmental resources and power to advance their cause (Rasmussen, 8).

Moreover, the lack of resources and power in affected communities is a major contributor to the presence of environmental racism.  Underlying the previous obstacles is the common denominator of powerlessness: the victims of environmental injustice have few financial resources to invest in the struggle for environmental justice and little political power with which to wage it.  Specifically, the groups adversely affected by environmental inequities lack the capacity to function as an organized bloc representing their interests against those who hold authority and affluence (Rasmussen, 8).

Finally, a piecemeal approach to regulation, which allows loopholes and the consequent ongoing victimization of low-income populations of color, contributes to the reality of environmental racism.  The ongoing process of governmental regulation also poses a problem in combating environmental injustice and implementing environmental justice.  The gaps between pieces of legislation passed in an effort to combat environmental injustice frequently provide a context for skirting the intent of that legislation (Rasmussen, 8).

6. Major Events in the Environmental Justice Movement

A major event contributing to the development of the environmental movement in the United States was the National Environmental Policy Act of 1969 (NEPA).  The Act established a foundation for United States environmental policy and required that “any major federal action significantly affecting the quality of the human environment” undergo evaluation and public disclosure of potential environmental impact through a required Environmental Impact Statement (EIS).  The EIS required by NEPA applies broadly to such categories as highways and other transit projects and programs, natural resource leasing and extraction, industrial farming and policies governing genetically modified crops, as well as large scale urban development projects (NEPA 1969).  NEPA was signed into law on January 1, 1970.  The Act establishes national environmental policy and goals for the protection, maintenance, and enhancement of the environment, and provides a process for implementing these goals within the federal agencies.

NEPA also established the Council on Environmental Quality (CEQ).  In its 1971 annual report, the CEQ noted that populations of low-income people of color were disproportionately exposed to significant environmental hazards.  This recognition constitutes the earliest governmental report acknowledging the existence of what may be termed environmental inequality in the United States.  In 1983, Robert Bullard published his groundbreaking case study of waste disposal practices in Houston, Texas, entitled “Solid Waste Sites and the Black Houston Community.”  The case study resulted in the publication of Bullard’s Dumping in Dixie: Race, Class, and Environmental Quality in 1990.  Bullard’s original study discovered that waste sites were not scattered randomly throughout the city of Houston, but were more likely to be located in African American neighborhoods, and even, more shockingly, near schools.  Bullard’s work was the first actual study to examine the causes of environmental racism.  Bullard discovered a multiplicity of factors which led to this environmental inequality, including housing discrimination, lack of zoning, and racially and socio-economically insensitive decisions made by public officials over a period of fifty years.

In 1983, further documenting the realities of environmental discrimination, a congressionally authorized U.S. General Accounting Office study uncovered that three out of four off-site, commercial hazardous waste landfills in the southeastern United States were located within predominately African American communities, despite the fact that African Americans made up only one-fifth of the region’s population.  In 1990, sociologist Robert Bullard published his influential work Dumping in Dixie.  His was the first major study of environmental racism linking hazardous facility locations with historical patterns of segregation in the South.  In addition, Bullard’s study was one of the first to explore the social and psychological impacts of environmental racism on local populations, as well as to acknowledge the emerging environmental justice movement as a response from the affected communities to these increasingly documented environmental threats.

On February 11, 1994, President Bill Clinton signed Executive Order 12898, Federal Actions to Address Environmental Justice in Minority Populations and Low-Income Populations, to focus federal attention on the environmental and human health conditions of minority and low-income populations with the goal of achieving environmental protection for all communities.  The Order directed federal agencies to develop environmental justice strategies to address disproportionately high and adverse human health or environmental effects of their programs on minority and low-income populations.  The Order is also intended to promote nondiscrimination in federal programs that affect human health and the environment, and it aims to provide minority and low-income communities with access to public information and public participation in matters relating to human health and the environment.  The Presidential Memorandum accompanying the Order underscores certain provisions of existing law that can help ensure that all communities and persons across the nation live in a safe and healthy environment.  Also in 1994, the Environmental Protection Agency renamed the Office of Environmental Equity the Office of Environmental Justice.  The Environmental Justice Act of 1999, introduced in Congress, was also a sign of significant progress.  In 2003, the EPA established its environmental justice bibliographic database.

7. Environmental Justice Policy and Law

The environmental justice movement credits its momentum and effectiveness to the U.S. Constitution and to three significant legal provisions: sections 601 and 602 of Title VI of the Civil Rights Act, and 42 U.S.C. 1983.

The Fourteenth Amendment and Equal Protection

Prior to the establishment of terms such as “environmental justice” or “environmental racism,” residents of minority communities who believed they were the victims of unfair environmental policy brought Fourteenth Amendment actions against local municipalities seeking fair treatment.  In Dowdell v. City of Apopka, 1983, discrimination in street paving, water distribution, and storm drainage services was established.  In United Farm Workers of Florida v. City of Delray Beach, 1974, it was established that there were violations of farm workers’ civil rights by city officials.  In Johnson v. City of Arcadia, 1978, the court found discrimination in access to paved streets, parks, and the water supply.  The Supreme Court’s decision in Washington v. Davis, 1976, announced the rule that impermissible discrimination under the Fourteenth Amendment requires a showing of intent, not simply of disparate impact.  In Village of Arlington Heights v. Metropolitan Housing Development Co., 1977, the Court established a set of factors to determine whether invidious discrimination underlies an otherwise legitimate exercise of government authority.

Title VI, Civil Rights Act 601, 602, and 42 U.S.C. 1983

Title VI, Civil Rights Act 601 states, “no person in the United States shall on the grounds of race, color or national origin be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving federal financial assistance” (U.S.C. 1994).  Title VI, Civil Rights Act 602 requires “agencies that disperse federal funds to promulgate regulations implementing Title VI Civil Rights Act and to create an enforcement framework that details the manner in which discrimination claims will be processed” (Shanahan, 403-406).

In addition to the two foregoing provisions, environmental justice advocates also use 42 U.S.C. 1983 in order to establish that an agency’s decision will have a negative impact on the community.  42 U.S.C. 1983 states:

Every person who, under color of any statute, ordinance, regulation, custom, or usage, of any State or Territory or the District of Columbia, subjects, or causes to be subjected, any citizen of the United States or other person within the jurisdiction thereof to the deprivation of any rights, privileges, or immunities secured by the Constitution and laws, shall be liable to the party injured in an action at law (U.S.C. 1983).

These provisions were beneficial to the environmental justice movement until 2001, when the Supreme Court, in Alexander v. Sandoval, held that “602 does not provide an implied private right of action to enforce disparate impact regulations promulgated by federal agencies pursuant to 602.”

8. References and Further Reading

a. Books

  • Bullard, Robert. Dumping in Dixie: Race, Class, and Environmental Quality. Westview Press, 2000. (cited as DD)
  • Bullard, Robert. Confronting Environmental Racism: Voices from the Grassroots. South End Press, 1993. (cited as CER)
  • Bryant, Bunyan, ed. Environmental Justice: Issues, Problems, and Solutions. Island Press, 1995. (cited as EJ)
  • Camacho, David E. Environmental Injustices, Political Struggles: Race, Class, and the Environment. Duke University Press, 1998. (cited as EIPS)
  • Rawls, John. A Theory of Justice, 2nd edition. Oxford University Press, 1999. (cited as TJ)
  • Rawls, John. Justice as Fairness: A Restatement. Belknap Press, 2001. (cited as JF)

b. Journals

  • “Environmental Justice: An Interview with Robert Bullard.” Earth First! Journal, July 1999. (cited as Interview)
  • Deane-Drummond, Celia. “Environmental Justice and the Economy: A Christian Theologian’s Views.” Ecotheology 11.3 (2006): 24-34. (cited as Deane Drummond)
  • Lee, Charles. “Environmental Justice: Building a Unified Vision of Health and the Environment.” Environmental Health Perspectives 110, Supplement 2 (April 2002): 141-144. (cited as Lee)
  • Rasmussen, Larry. “Environmental Racism and Environmental Justice: Moral Theory in the Making?” Journal of the Society of Christian Ethics 24.1 (2004): 11-28. (cited as Rasmussen)
  • Shanahan, Alice M. “Permitting Justice: EPA’s Revised Guidance for Investigating Title VI Administrative Complaints.” Environmental Lawyer (Feb. 2001): 403, 406 (citing the Civil Rights Act of 1964, §602, 78 Stat. at 252-253). (cited as Shanahan)

c. Governmental and Legal Publications

  • 42 U.S.C. § 1983 (2002). (cited as U.S.C. 1983)
  • 42 U.S.C. § 2000 (d) (1994). (cited as U.S.C. 1994)
  • Alexander v. Sandoval, 532 U.S. 275 (2001). (cited as Alexander)

Author Information

Eddy F. Carder
Email: efcarder@pvamu.edu
Prairie View A & M University
U. S. A.

Jerry A. Fodor (1935—2017)

Jerry Fodor was one of the most important philosophers of mind of the late twentieth and early twenty-first centuries. In addition to exerting an enormous influence on virtually all parts of the literature in the philosophy of mind since 1960, Fodor’s work had a significant impact on the development of the cognitive sciences. In the 1960s, along with Hilary Putnam, Noam Chomsky, and others, Fodor presented influential criticisms of the behaviorism that dominated much philosophy and psychology at the time. Fodor went on to articulate and defend an alternative conception of intentional states and their content that he argued vindicates the core elements of folk psychology within a physicalist framework.

Fodor developed two theories that have been particularly influential across disciplinary boundaries. He defended a “Representational Theory of Mind,” according to which thinking is a computational process defined over mental representations that are physically realized in the brain. On Fodor’s view, these mental representations are internally structured much like sentences in a natural language, in that they have both a syntax and a compositional semantics. Fodor also defended an influential hypothesis about mental architecture, namely, that low-level sensory systems (and language) are “modular,” in the sense that they’re “informationally encapsulated” from the higher-level “central” systems responsible for belief formation, decision-making, and the like. Fodor’s work on modularity has been especially influential among evolutionary psychologists, who go much further than Fodor in claiming that the systems underlying even high-level cognition are modular, a view that Fodor himself vehemently resisted.

Fodor defended a number of other well-known views. He was an early proponent of the claim that mental states are functional states, defined by their role in a cognitive system and not by the physical material that constitutes them. Alongside functionalism, Fodor articulated an early and influential version of non-reductive physicalism, according to which mental states are realized by, but not reducible to, physical states of the brain. Fodor was also a staunch defender of nativism about the structure and contents of the human mind, arguing against a variety of empiricist theories and famously arguing that all lexical concepts are innate. Fodor vigorously argued against all versions of conceptual role semantics in philosophy and psychology, and articulated an alternative view he called “informational atomism,” according to which lexical concepts are unstructured “atoms” that have their content in virtue of standing in certain external, “informational” relations to entities in the environment.

Table of Contents

  1. Biography
  2. Physicalism, Functionalism, and the Special Sciences
  3. Intentional Realism
  4. The Representational Theory of Mind
  5. Content and Concepts
  6. Nativism
  7. Modularity
  8. References and Further Reading

1. Biography

Jerry Fodor was born in New York City on April 22, 1935. He received his A.B. degree from Columbia University in 1956 and his Ph.D. from Princeton University in 1960. His first academic position was at MIT, where he taught in the Departments of Philosophy and Psychology until 1986. He was Distinguished Professor at CUNY Graduate Center from 1986 to 1988, when he moved to Rutgers University, where he was State of New Jersey Professor of Philosophy and Cognitive Science until his retirement in 2016. Fodor died on November 29, 2017.

2. Physicalism, Functionalism, and the Special Sciences

Throughout his career Fodor endorsed physicalism, the claim that all the genuine particulars and properties in the world are either identical to or in some sense determined by and dependent upon physical particulars and properties. Although there are contested questions about how physicalism should be formulated and understood (Melnyk 2003, Stoljar 2010), there is nevertheless widespread acceptance of some or other version of physicalism among philosophers of mind. To accept physicalism is to deny that psychological and other non-basic properties of the world “float free” from fundamental physical properties. Accepting physicalism thus goes hand in hand with rejecting mind-body dualism.

Some of Fodor’s early work (1968, 1975) aimed (i) to show that “mentalism” was a genuine alternative to dualism and behaviorism, (ii) to show that behaviorism had a number of serious shortcomings, (iii) to defend functionalism as the appropriate physicalist metaphysics underlying mentalism, and (iv) to defend a conception of psychology and other special sciences according to which higher-level laws and the properties that figure in them are irreducible to lower-level laws and properties. Let’s consider each of these in turn.

For much of the twentieth century, behaviorism was widely regarded as the only viable physicalist alternative to dualism. Fodor helped to change that, in part by drawing a clear distinction between mere mentalism, which posits the existence of internal, causally efficacious mental states, and dualism, which is mentalism plus the view that mental states are states of a non-physical substance. Here’s Fodor in his classic book Psychological Explanation:

[P]hilosophers who have wanted to banish the ghost from the machine have usually sought to do so by showing that truths about behavior can sometimes, and in some sense, logically implicate truths about mental states. In so doing, they have rather strongly suggested that the exorcism can be carried through only if such a logical connection can be made out. … [O]nce it has been made clear that the choice between dualism and behaviorism is not exhaustive, a major motivation for the defense of behaviorism is removed: we are not required to be behaviorists simply in order to avoid being dualists (1968, pp. 58-59).

Fodor thus argues that there’s a middle road between dualism and behaviorism. Attributing mental states to organisms in explaining how they get around in and manipulate their environments need not involve the postulation of a mental substance different in kind from physical bodies and brains. In Fodor’s view, behaviorists influenced by Wittgenstein and Ryle ignored the distinction between mentalism and dualism. As Fodor puts it, “confusing mentalism with dualism is the original sin of the Wittgensteinian tradition” (1975, p. 4).

In addition to clearly distinguishing mentalism from dualism, Fodor put forward a number of trenchant objections to behaviorism and the various arguments for it. He argued that neither knowing about the mental states of others nor learning a language with mental terms requires that there be a logical connection between mental and behavioral terms, thus undermining a number of epistemological and linguistic arguments for behaviorism (Fodor and Chihara 1965, Fodor 1968). Perhaps more importantly, Fodor argued that empirical theories in cognitive psychology and linguistics provide a powerful argument against behaviorism, since they posit the existence of various mental states that are not definable in terms of overt behavior (Fodor 1968, 1975). Along with the arguments of Putnam (1963, 1967) and Chomsky (1959), among others, Fodor’s early arguments against behaviorism were an important step in the development of the then emerging cognitive sciences.

Central to this development was the rise of functionalism as a genuine alternative to behaviorism, and Fodor’s Psychological Explanation (1968) was one of the first in-depth treatments and defenses of this view (see also Putnam 1963, 1967). Unlike behaviorism, which attempts to explain behavior in terms of law-like relationships between stimulus inputs and behavioral outputs, functionalism explains behavior in terms of internal properties that mediate between inputs and outputs. Indeed, the main claim of functionalism is that mental properties are individuated in terms of the various causal relations they enter into, where such relations are not restricted to mere input-output relations, but also include their relations to a host of other properties that figure in the relevant empirical theories. Although, at the time, the distinctions between various forms of functionalism weren’t as clear as they are now, Fodor’s brand of functionalism is a version of what is now known as “psycho-functionalism”. On this view, the causal roles that define mental properties are provided by empirical psychology, and not, say, the platitudes of commonsense psychology, or the analyticities expressive of the meanings of mental terms; see Rey (1997, ch.7) and Shoemaker (2003) for discussion.

By defining mental properties in terms of their causal roles, functionalists allow that the same mental property can be instantiated by different kinds of physical systems. Functionalism thus goes hand in hand with the multiple realizability of mental properties. If a given mental property, M, is a functional property that’s defined by a specific causal condition, C, then any number of distinct physical properties, P1, P2, P3… Pn, may each “realize” M in virtue of meeting condition C. Functionalism thereby characterizes mental properties at a level of abstraction that ignores differences in the physical structure of the systems that have these properties. Early functionalists, like Fodor and Putnam, thus took themselves to be articulating a position that was distinct not only from behaviorism, but also from type-identity theory, which identifies mental properties with neurophysiological properties of the brain. If functionalism implies that mental properties can be realized by different physical properties in different kinds of systems (or the same system over time), then functionalism apparently precludes identifying mental properties with physical properties.

Fodor, in particular, articulated his functionalism so that it was seen to have sweeping consequences for debates concerning reductionism and the unity of science. In his seminal essay “Special Sciences” (1974), and also in the introductory chapter of his classic book The Language of Thought (1975), Fodor spells out a metaphysical picture of the special sciences that eventually came to be called “non-reductive physicalism”. This picture is physicalist in that it accepts what Fodor calls the “generality of physics,” which is the claim that every event that falls under a special science predicate also falls under a physical predicate, but not vice versa. It’s non-reductionist in that it denies that “the special sciences should reduce to physical theories in the long run” (1974, p. 97). Traditionally, reductionists sought to articulate bridge laws that link special science predicates with physical predicates, either in the form of bi-conditionals or identity statements. Fodor argues not only that the generality of physics does not require the existence of bridge laws, but that such laws will in general be unavailable given that the events picked out by special science predicates will be “wildly disjunctive” from the perspective of physics (1974, p. 103). Multiple realizability thus guarantees that special science predicates will cross-classify phenomena picked out by physical predicates. This, in turn, undermines the reductionist hope of a unified science whereby the higher-level theories of the special sciences reduce to lower-level theories and ultimately to fundamental physics. On Fodor’s picture, then, the special sciences are “autonomous” in that they articulate irreducible generalizations that quantify over irreducible and causally efficacious higher-level properties (1974, 1975; see also 1998b, ch.2).

Functionalism and non-reductive physicalism are now commonplace in philosophy of mind, and provide the backdrop for many contemporary debates about psychological explanation, laws, multiple realizability, mental causation, and more. This is something for which Fodor surely deserves much of the credit (or blame, depending on one’s view; see Kim 2005 and Heil 2003 for criticisms of the metaphysical underpinnings of non-reductive physicalism).

3. Intentional Realism

A central aim of Fodor’s work is to defend the core elements of folk psychology as at least the starting point for a serious scientific psychology. At a minimum, folk psychology is committed to two kinds of states: belief-like states, which represent the world and guide one’s behavior, and desire-like states, which represent one’s goals and motivate behavior. We routinely appeal to such states in our common-sense explanations of people’s behavior.  For example, we explain why John went to the store in terms of his desire for milk and his belief that there’s milk for sale at the store. Fodor is impressed by the remarkable predictive power of such belief-desire explanations. The following passage is typical:

Common sense psychology works so well it disappears. It’s like those mythical Rolls Royce cars whose engines are sealed when they leave the factory; only it’s better because they aren’t mythical. Someone I don’t know phones me at my office in New York from—as it might be—Arizona. ‘Would you like to lecture here next Tuesday?’ are the words he utters. ‘Yes thank you. I’ll be at your airport on the 3 p.m. flight’ are the words that I reply. That’s all that happens, but it’s more than enough; the rest of the burden of predicting behavior—of bridging the gap between utterances and actions—is routinely taken up by the theory. And the theory works so well that several days later (or weeks later, or months later, or years later; you can vary the example to taste) and several thousand miles away, there I am at the airport and there he is to meet me. Or if I don’t turn up, it’s less likely that the theory failed than that something went wrong with the airline. … The theory from which we get this extraordinary predictive power is just good old common sense belief/desire psychology. … If we could do that well with predicting the weather, no one would ever get his feet wet; and yet the etiology of the weather must surely be child’s play compared with the causes of behavior. (1987, pp. 3-4)

Passages like this may suggest that Fodor’s intentional realism is wedded to the folk-psychological categories of “belief” and “desire”. But this isn’t so. Rather, Fodor’s claim is that there are certain core elements of folk psychology that will be shared by a mature scientific psychology. In particular, Fodor’s view is that a mature psychology will posit states with the following features:

(1) They will be intentional: they will be “about” things and they will be semantically evaluable. (John’s belief that there’s milk at the store is about the milk at the store, and can be semantically evaluated as true or false.)

(2) They will be causal: they will figure in genuine causal explanations and laws. (John’s belief that there’s milk at the store and his desire for milk figure in a law-like causal explanation of John’s behavior.)

Fodor’s intentional realism thus doesn’t require that folk-psychological categories themselves find a place in a mature psychology. Indeed, Fodor has suggested that the individuation conditions for beliefs are “so vague and pragmatic” that they may not be fit for empirical psychology (1990, p. 175). What Fodor is committed to is the claim that a mature psychology will be intentional through and through, and that the intentional states it posits will be causally implicated in law-like explanations of human behavior. Exactly which intentional states will figure in a mature psychology is a matter to be decided by empirical inquiry, not by a priori reflection on our common sense understanding.

Fodor’s defense of intentional realism is usefully viewed as part of a rationalist tradition that stresses the human mind’s striking ability to think about indefinitely many arbitrary properties of the world. Our minds are apparently sensitive not only to abstract properties such as being a democracy and being virtuous, but also to abstract grammatical properties such as being a noun phrase and being a verb phrase, as well as to such arbitrary properties as being a tiny folded piece of paper, being an oddly-shaped canteen, being a crumpled shirt, and being to the left of my favorite mug. On Fodor’s (1986) view, a system can selectively respond to such non-sensory properties (or properties that are not “transducer detectable”) only if it’s an intentional system capable of manipulating representations of these properties. More specifically, Fodor claims that the distinguishing feature of intentional systems is that they’re sensitive to “non-nomic” properties, that is, properties of objects that do not determine that they fall under laws of nature. Consider Fodor’s (1986) example being a crumpled shirt. Although laws of nature govern crumpled shirts, no object is subsumed under a law in virtue of being a crumpled shirt. Nevertheless, the property of being a crumpled shirt is one that we can represent an object as having, and such representations do enter into laws. For instance, there’s presumably a law-like relationship between my noticing the crumpled shirt, my desire to remark upon it, and my saying “there’s a crumpled shirt”. On Fodor’s view, the job of intentional psychology is to articulate laws governing mental representations that figure in genuine causal explanations of people’s behavior (Fodor 1987, 1998a).

Although positing mental representations that have semantic and causal properties—states that satisfy (1) and (2) above—may not seem particularly controversial, the existence of causally efficacious intentional states has been denied by all manner of behaviorists, epiphenomenalists, Wittgensteinians, interpretationists, instrumentalists, and (at least some) connectionists. Much of Fodor’s work is devoted to defending intentional realism against such views as they have arisen in both philosophy and psychology. In addition to defending intentional realism against the behaviorism of Skinner and Ryle (Fodor 1968, 1975, Fodor et al. 1974), Fodor defends it against the threat of epiphenomenalism (Fodor 1989), against Wittgenstein and other defenders of the “private language argument” (Fodor and Chihara 1965, Fodor 1975), against the eliminativism of the Churchlands (Fodor 1987, 1990), against the instrumentalism of Dennett (Fodor 1981a, Fodor and Lepore 1992), against the interpretationism of Davidson (Fodor 1990, Fodor and Lepore 1992, Fodor 2004), and against certain versions of connectionism (Fodor and Pylyshyn 1988, Fodor 1998b, chs. 9 and 10).

4. The Representational Theory of Mind

For physicalists, accepting that there are mental states that are both intentional and causal raises the question of how such states can exist in a physical world. Intentional realists must explain, for instance, how lawful relations between intentional states can be understood physicalistically. Of particular concern is the fact that at least some intentional laws describe rational relations between the states they quantify over, and, at least since Descartes, philosophers have worried about how a purely physical system could be rational (see Lowe 2008 for skepticism from a non-Cartesian dualist). Fodor’s Representational Theory of Mind (RTM) is his attempt to answer such worries.

As Fodor points out, RTM is “really a loose confederation of theses” that “lacks, to put it mildly, a canonical formulation” (1998a, p. 6). At its core, though, RTM is an attempt to combine Alan Turing’s work on computation with intentional realism (as outlined above). Broadly speaking, RTM claims that mental processes are computational processes, and that intentional states are relations to mental representations that serve as the domain of such processes. On Fodor’s version of RTM, these mental representations have both syntactic structure and a compositional semantics. Thinking thus takes place in an internal language of thought.

Turing demonstrated how to construct a purely mechanical device that could transform syntactically-individuated symbols in a way that respects the semantic relations that exist between the meanings, or contents, of the symbols. Formally valid inferences are the paradigm. For instance, modus ponens can be realized on a machine that’s sensitive only to syntactic properties of symbols. The device thus doesn’t have “access” to the symbols’ semantic properties, but can nevertheless transform the symbols in a truth-preserving way. What’s interesting about this, from Fodor’s perspective, is that mental processes also involve chains of thoughts that are truth-preserving. As Fodor puts it:

[I]f you start out with a true thought, and you proceed to do some thinking, it is very often the case that the thoughts that thinking leads you to will also be true. This is, in my view, the most important fact we know about minds; no doubt it’s why God bothered to give us any. (1994, p. 9)

In order to account for this “most important” fact, RTM claims that thoughts themselves are syntactically-structured representations, and that mental processes are computational processes defined over them. Given that the syntax of a representation is what determines its causal role in thought, RTM thereby serves to connect the fact that mental processes are truth-preserving with the fact that they’re causal. On Fodor’s view, “this bringing of logic and logical syntax together with a theory of mental processes is the foundation of our cognitive science” (2008, p. 21).

Suppose a thinker believes that if John ran, then Mary swam. According to RTM, for a thinker to hold such a belief is for the thinker to stand in a certain computational relation to a mental representation that means if John ran, then Mary swam. Now suppose the thinker comes to believe that John ran, and as a result comes to believe that Mary swam. RTM has it that the causal relations between these thoughts hold in virtue of the syntactic form of the underlying mental representations. By picturing the mind as a “syntax-driven machine” (Fodor, 1987, p. 20), RTM thus promises to explain how the causal relations among thoughts can respect rational relations among their contents. It thereby provides a potentially promising reply to Descartes’ worry about how rationality could be exhibited by a mere machine. As Fodor puts it:

So we can now (maybe) explain how thinking could be both rational and mechanical. Thinking can be rational because syntactically specified operations can be truth preserving insofar as they reconstruct relations of logical form; thinking can be mechanical because Turing machines are machines. … [T]his really is a lovely idea and we should pause for a moment to admire it. Rationality is a normative property; that is, it’s one that a mental process ought to have. This is the first time that there has ever been a remotely plausible mechanical theory of the causal powers of a normative property. The first time ever. (2000, p. 19)
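To make the idea concrete, here is a minimal sketch of a “syntax-driven machine” in Python. It is purely illustrative and not Fodor’s own formalism: the belief box, the tuple encoding, and the single inference rule are invented for the example. The point is only that the rule matches the shapes of symbol structures, never their interpretations, and yet the transitions it licenses (modus ponens) are truth-preserving.

```python
# A toy "syntax-driven machine" (illustrative only; not Fodor's own formalism).
# Mental representations are modeled as nested tuples; the inference rule
# inspects their form alone and has no access to what the symbols are about.

def apply_modus_ponens(beliefs):
    """Add Q to the belief set whenever it contains both ("IF", P, Q) and P."""
    derived = set(beliefs)
    changed = True
    while changed:
        changed = False
        for b in list(derived):
            if isinstance(b, tuple) and len(b) == 3 and b[0] == "IF":
                _, antecedent, consequent = b
                if antecedent in derived and consequent not in derived:
                    derived.add(consequent)
                    changed = True
    return derived

belief_box = {
    ("IF", ("RAN", "JOHN"), ("SWAM", "MARY")),  # if John ran, then Mary swam
    ("RAN", "JOHN"),                            # John ran
}
print(apply_modus_ponens(belief_box))  # now also contains ("SWAM", "MARY")
```

Nothing in the rule “knows” that RAN is about running; provided the premises are true, the added symbol is true as well. That is the feature of formal operations which, on the account in the text, Fodor takes Turing to have shown can be mechanized.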

In Fodor’s view, it’s a major argument in favor of RTM that it promises an explanation of how mental processes can be truth-preserving, and a major strike against traditional empiricist and associationist theories that, in his view, they offer no plausible competing explanation (2000, pp. 15-18; 2003, pp. 90-94; Fodor and Pylyshyn 1998). (Note that Fodor does not think that RTM offers a satisfying explanation of all aspects of human rationality, as discussed below in the section on modularity.)

In addition to explaining how truth-preserving mental processes could be realized causally, Fodor argues, RTM provides the only hope of explaining the so-called “productivity” and “systematicity” of thought (Fodor 1987, 1998a, 2008). Roughly, productivity is the feature of our minds whereby there is no upper bound to the number of thoughts we can entertain. We can think that the dog is on the deck; that the dog, which chased the squirrel, is on the deck; that the dog, which chased the squirrel, which foraged for nuts, is on the deck; and so on, indefinitely.

Of course, there are thoughts whose contents are so long or complex that other factors prevent us from entertaining them. But abstracting away from such performance limitations, it seems that a theory of our conceptual competence must account for such productivity. Thought also appears to be systematic, in the following sense: a mind that is capable of entertaining a certain thought, p, is also capable of entertaining logical permutations of p. For example, minds that can entertain the thought that the book is to the left of the cup can also entertain the thought that the cup is to the left of the book. Although it’s perhaps possible that there could be minds that do not exhibit such systematicity—a possibility denied by some, for example, Evans (1982) and Peacocke (1992)—it at least appears to be an empirical fact that all minds do.

In Fodor’s view, RTM is the only theory of mind that can explain productivity and systematicity. According to RTM, mental states have internal, constituent structure, and the content of mental states is determined by the content of their constituents and how those constituents are put together. Given a finite base of primitive representations, our capacity to entertain endlessly many thoughts can be explained by positing a finite number of rules for combining representations, which can be applied endlessly many times in the course of constructing complex thoughts. RTM offers a similar explanation of systematicity. The reason that a mind that can entertain the thought that the book is to the left of the cup can also entertain the thought that the cup is to the left of the book is that these thoughts are built up out of the same constituents, using the same rules of combination. RTM thus explains productivity and systematicity because it claims that mental states are representations that have syntactic structure and a compositional semantics. One of Fodor’s main arguments against alternative, connectionist theories is that they fail to account for such features (Fodor and Pylyshyn 1988, Fodor 1998b, chs. 9 and 10).
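The combinatorial point can be illustrated with a short sketch (my own toy encoding, not anything in Fodor): a finite base of primitive symbols and a single recursive rule of combination already suffice to generate unboundedly many structured representations, and to guarantee that rearrangements of a representation’s constituents are available to the very same system.

```python
# Toy illustration of compositional structure (invented encoding).
# A finite base of primitives plus one recursive rule of combination yields
# unboundedly many structured "thoughts" (productivity), and any rearrangement
# of the same constituents is automatically constructible (systematicity).

PRIMITIVES = {"BOOK", "CUP", "DOG", "DECK", "SQUIRREL", "NUTS"}

def combine(relation, x, y):
    """The single rule of combination; arguments may themselves be complex."""
    return (relation, x, y)

# Productivity: the rule can be applied again and again, without bound.
t1 = combine("ON", "DOG", "DECK")
t2 = combine("AND", t1, combine("CHASED", "DOG", "SQUIRREL"))
t3 = combine("AND", t2, combine("FORAGED-FOR", "SQUIRREL", "NUTS"))
print(t3)

# Systematicity: whatever can build LEFT-OF(BOOK, CUP) can, from the very same
# constituents and the very same rule, build LEFT-OF(CUP, BOOK).
a = combine("LEFT-OF", "BOOK", "CUP")
b = combine("LEFT-OF", "CUP", "BOOK")
assert set(a[1:]) == set(b[1:])  # same constituents, differently arranged
```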

A further argument Fodor offers in favor of RTM is that successful empirical theories of various non-demonstrative inferences presuppose a system of internal representations in which such inferences are carried out. For instance, standard theories of visual perception attempt to explain how a percept is constructed on the basis of the physical information available and the visual system’s built-in assumptions about the environment, or “natural constraints” (Pylyshyn 2003). Similarly, theories of sentence perception and comprehension require that the language system be able to represent distinct properties (for instance, acoustic, phonological, and syntactic properties) of a single utterance (Fodor et al. 1974). Both sorts of theories require that there be a system of representations capable of representing various properties and serving as the medium in which such inferences are carried out. Indeed, Fodor claims that the best argument in favor of RTM is that “some version or other of RTM underlies practically all current psychological research on mentation, and our best science is ipso facto our best estimate of what there is and what it’s made of” (Fodor 1987, p. 17). Fodor’s The Language of Thought (1975) is the locus classicus of this style of argument.

5. Content and Concepts

Suppose, as RTM suggests, that mental processes are computational processes, and that this explains how rational relations between thoughts can be realized by purely causal relations among symbols in the brain. This leaves open the question of how such symbols come to have their meaning, or content. At least since Brentano, philosophers have worried about how to integrate intentionality into the physical world, a worry that has famously led some to accept the “baselessness of intentional idioms and the emptiness of a science of intention” (Quine 1960, p. 221). Much of Fodor’s work from the 1980s onward was focused on this representational (as opposed to the computational) component of RTM. Although Fodor’s views changed in various ways over the years, some of which are documented below, a unifying theme throughout this work is that it’s at least possible to provide a naturalistic account of intentionality (Fodor 1987, 1990, 1991, 1994, 1998a, 2004, 2008; Fodor and Lepore 1992, 2002; Fodor and Pylyshyn 2014).

In the 1960s and early 1970s, Fodor endorsed a version of so-called “conceptual role semantics” (CRS), according to which the content of a representation is (partially) determined by the conceptual connections it bears to other representations. To take two hoary examples, CRS has it that “bachelor” gets its meaning, in part, by bearing an inferential connection to “unmarried,” and “kill” gets its meaning, in part, by bearing an inferential connection to “die”. Such inferential connections hold, on Fodor’s early view, because “bachelor” and “kill” have complex structure at the level at which they’re semantically interpreted—that is, they have the structure exhibited by the phrases “unmarried adult male” and “cause to die” (Katz and Fodor 1963). In terms of concepts, the claim is that the concept BACHELOR has the internal structure exhibited by ‘UNMARRIED ADULT MALE’, and the concept KILL has the internal structure exhibited by ‘CAUSE TO DIE’. (This article follows the convention of writing the names of concepts in capitals.)

However, Fodor soon came to think that there are serious objections to CRS. Some of these objections were based on his own experimental work in psycholinguistics, which he took to provide evidence against the existence of complex lexical structure. In particular, experimental evidence suggested that understanding a sentence does not involve recovering the (putative) decompositions of the lexical items it contains (Fodor et al. 1975, Fodor et al. 1980). For example, if “bachelor” has the semantic structure exhibited by “unmarried adult male,” then there is an implicit negation in the sentence “If practically all the men in the room are bachelors, then few men in the room have spouses.” But the evidence suggested that it’s easier to understand that sentence than similar sentences containing either an explicit negative (“not married”) or a morphological negative (“unmarried”), as in “If practically all the men in the room are not married/unmarried, then few men in the room have spouses”. This shouldn’t be the case, Fodor reasoned, if “bachelor” includes the negation at the level at which it is semantically interpreted (Fodor et al. 1975, Fodor et al. 1980). (For alternative explanations see Jackendoff (1983, pp. 125-127; 1992, p. 49; 2002, ch. 11), Katz (1977, 1981) and Miller and Johnson-Laird (1976, p. 328).)

In part because of the evidence against decompositional structure, Fodor at one point seriously considered the view that inferential connections among lexical items hold in virtue of inference rules, or “meaning postulates,” which renders CRS consistent with a denial of the claim that lexical items are semantically structured (1975, pp. 148-152). However, Fodor ultimately became convinced that Quine’s doctrine of “confirmation holism” undermines the appeal to meaning postulates, and more generally, any view that implies a principled distinction between those conceptual connections that are “constitutive” of a concept and those that are “merely collateral”. According to confirmation holism, our beliefs don’t have implications for experience when taken in isolation. As Quine famously puts it, “our statements about the external world face the tribunal of sense experience not individually but only as a corporate body” (1953, p. 41). This implies that disconfirming a belief is never simply a matter of testing it against experience. For one could continue to hold a belief in the face of recalcitrant data by revising other beliefs that form part of one’s overall theory. As Quine says, “any statement can be held true come what may, if we make drastic enough adjustments elsewhere in the system” (1953, p. 43). Such Quinean considerations motivate Fodor’s claim that CRS theorists should not appeal to meaning postulates:

Exactly because meaning postulates break the ‘formal’ relation between belonging to the structure of a concept and being among its constitutive inferences, it’s unclear why it matters … whether a given inference is treated as meaning-constitutive. Imagine two minds that differ in that ‘whale → mammal’ is a meaning postulate for one but is ‘general knowledge’ for the other. Are any further differences between these minds entailed? If so, which ones? Is this wheel attached to anything at all? It’s a point that Quine made against Carnap that the answer to ‘When is an inference analytic?’ can’t be just ‘Whenever I feel like saying that it is’. (1998a, p. 111)

Moreover, confirmation holism suggests that the epistemic properties of a concept are potentially connected to the epistemic properties of every other concept, which, according to Fodor, suggests that CRS inevitably leads to semantic holism, the claim that all of a concept’s inferential connections are constitutive. But Fodor argues that semantic holism is unacceptable, since it’s incompatible with the claim that concepts are shareable: “since practically everybody has some eccentric beliefs about practically everything, holism has it that nobody shares any concepts with anybody else” (2004, p. 35; see also Fodor and Lepore 1992, Fodor 1998a). This implication would undermine the possibility of genuine intentional generalizations, which require that type-identical contents are shared across both individuals and different time-slices of the same individual. (Fodor rejects appeals to a weaker notion of “content similarity”; see Fodor and Lepore 1992, pp. 17-22; Fodor 1998a, pp. 28-34.)

Proponents of CRS might reply to these concerns about semantic holism by accepting the ‘molecularist’ claim that only some inferential connections are concept-constitutive. But Fodor suggests that the only way to distinguish the constitutive connections from the rest is to endorse an analytic/synthetic distinction, which, again, confirmation holism gives us reason to reject (for example, 1990, p. x, 1998a, p. 71, 1998b, pp. 32-33, 2008). Fodor’s Quinean point, ultimately, is that theorists should be reluctant to claim that there are certain beliefs people must hold, or inferences they must accept, in order to possess a concept. For thinkers can apparently have any number of arbitrarily strange beliefs involving some concept, consistent with them sharing that concept with others. As Fodor puts it:

[P]eople can have radically false theories and really crazy views, consonant with our understanding perfectly well, thank you, which false views they have and what radically crazy things it is that they believe. Berkeley thought that chairs are mental, for Heaven’s sake! Which are we to say he lacked, the concept MENTAL or the concept CHAIR? (1987, p. 125) (For further reflections along similar lines, see Williamson 2007.)

On Fodor’s view, proponents of CRS are faced with two equally unsatisfying options: they can agree with Quine about the analytic/synthetic distinction, but at the cost of endorsing semantic holism and its unpalatable consequences for the viability of intentional psychology; or they can deny holism and accept molecularism, but at the cost of endorsing an analytic/synthetic distinction, which Fodor thinks nobody knows how to draw.

It bears emphasis that Fodor doesn’t claim that confirmation holism, all by itself, rules out the existence of certain “local” semantic connections that hold as a matter of empirical fact. Indeed, contemporary discussions of possible explanatory roles for analyticity involve delicate psychological and linguistic considerations that are far removed from the epistemological considerations that motivated the positivists. For instance, there are the standard convergences in people’s semantic-cum-conceptual intuitions, which cry out for an empirical explanation. Although some argue that such convergences are best explained by positing analyticities (Grice and Strawson 1956, Rey 2005, Rives 2016), Fodor argues that all such intuitions can be accounted for by an appeal to Quinean “centrality” or “one-criterion” concepts (Fodor 1998a, pp. 80-86). Considerations in linguistics that bear on the existence of an empirically grounded analytic/synthetic distinction include the syntactic and semantic analyses of ‘causative’ verbs, the ‘generativity’ of the lexicon, and the acquisition of certain elements of syntax. Fodor has engaged linguists on a number of such fronts, arguing against proposals of Jackendoff (1992), Pustejovsky (1995), Pinker (1989), Hale and Keyser (1993), and others, defending the Quinean line (see Fodor 1998a, pp. 49-56, and Fodor and Lepore 2002, chs. 5-6; see Pustejovsky 1998 and Hale and Keyser 1999 for rejoinders). Fodor’s view is that all of the relevant empirical facts about minds and language can be explained without any analytic connections, but merely deeply believed ones, precisely as Quine argued.

On Fodor’s view, the problems plaguing CRS ultimately arise as a result of its attempt to connect a theory of meaning with certain epistemic conditions of thinkers. A further argument against such views, Fodor claims, is that such epistemic conditions violate the compositionality constraint that is required for an explanation of productivity and systematicity (see above). For instance, if one believes that brown cows are dangerous, then the concept BROWN COW will license the inference ‘BROWN COW → DANGEROUS’; but this inference is not determined by the inferential roles of BROWN and COW, which it ought to be if meaning-constituting inferences are compositional (Fodor and Lepore 2002, ch.1; for discussion and criticism, see, for example, Block 1993, Boghossian 1993, and Rey 1993).

Another epistemic approach, favored by many psychologists, takes concepts to have “prototype” structure. According to these theories, the structure of a lexical concept specifies the prototypical features of its instances, that is, the features that its instances tend to (but need not) have (Rosch and Mervis 1975). Prototype theories are epistemic accounts because, on these views, having a concept is a matter of knowing the features of its prototypical instances. Given this, Fodor argues that prototype theories are also in danger of violating compositionality. For example, knowing what prototypical pets (dogs) are like and what prototypical fish (trout) are like does not guarantee that you know what prototypical pet fish (goldfish) are like (Fodor 1998a, pp. 102-108, Fodor and Lepore 2002, ch. 2). Since compositionality is required in order to explain the productivity and systematicity of thought, and prototype structures do not compose, it follows that concepts don’t have prototype structure. Fodor (1998b, ch. 4) extends this kind of argument to epistemic accounts that posit so-called “recognitional concepts,” that is, concepts that are individuated by certain recognitional capacities. (For discussion and criticism, see, for example, Horgan 1998, Recanati 2002, and Prinz 2002.)

Fodor thus rejects all theories that individuate concepts in terms of their epistemic properties and their internal structure, and ultimately defends what he calls “informational atomism,” according to which lexical concepts are unstructured atoms whose content is determined by certain informational relations they bear to phenomena in the environment. In claiming that lexical concepts are internally unstructured, Fodor’s informational atomism is meant to respect the evidence and arguments against decomposition, definitions, prototypes, and the like. In claiming that none of the epistemic properties of concepts are constitutive, Fodor is endorsing what he sees as the only alternative to molecularist and holistic theories of content, neither of which, as we’ve seen, he takes to be viable. By separating epistemology from semantics in this way, Fodor’s theory places virtually no constraints on what a thinker must believe or infer in order to possess a particular concept. For instance, what determines whether a mind possesses DOG isn’t whether it has certain beliefs about dogs, but rather whether it possesses an internal symbol that stands in the appropriate mind-world relation to the property of being a dog. Rather than talking about concepts as they figure in beliefs, inferences, or other mental states, Fodor instead talks of mere “tokenings” of concepts, where for him these are internal symbols that need not play any specific role in cognition. In his view, this is the only way for a theory of concepts to respect Quinean strictures on analyticity and constitutive conceptual connections. Indeed, Fodor claims that by denying that “the grasp of any interconceptual relations is constitutive of concept possession,” informational atomism allows us to “see why Quine was right about there not being an analytic/synthetic distinction” (Fodor 1998a, p. 71).

Fodor’s most explicit characterization of the mind-world relation that determines content is his “asymmetric dependence” theory (1987, 1990). According to this theory, the concept DOG means dog because dogs cause tokenings of DOG, and non-dogs causing tokenings of DOG is asymmetrically dependent upon dogs causing DOG. In other words, non-dogs wouldn’t cause tokenings of DOG unless dogs caused tokenings of DOG, but not vice versa. This is Fodor’s attempt to meet Brentano’s challenge of providing a naturalistic sufficient condition for a symbol to have a meaning. Not surprisingly, many objections have been raised to Fodor’s asymmetric dependence theory; for an overview see Loewer and Rey 1991.
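The asymmetry can be made vivid with a toy model (my own encoding of the counterfactuals, with invented examples; it is not Fodor’s formulation). Wild tokenings caused by, say, a fox at dusk depend on the dog-to-DOG connection, but not the other way around:

```python
# Toy model of asymmetric dependence (invented encoding). Tokenings of DOG are
# caused both by dogs and, occasionally, by non-dogs (here, a fox at dusk).
# The error-causing connection exists only because the dog connection does:
# remove the dog law and the error law goes with it; remove the error law and
# the dog law stands.

def tokening_causes(dog_law_in_place, error_law_in_place):
    """Return the things that would cause a tokening of DOG."""
    causes = set()
    if dog_law_in_place:
        causes.add("dog")
        if error_law_in_place:  # the error law holds only given the dog law
            causes.add("fox at dusk")
    return causes

assert tokening_causes(True, True) == {"dog", "fox at dusk"}  # actual situation
assert tokening_causes(True, False) == {"dog"}                # dogs still cause DOG
assert tokening_causes(False, True) == set()                  # nothing causes DOG
```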

It’s important to see that in rejecting epistemic accounts of concepts Fodor is not claiming that epistemic properties are irrelevant from the perspective of a theory of concepts. For such properties are what sustain the laws that “lock” concepts onto phenomena in the environment. For instance, it is only because thinkers know a range of facts about dogs—what they look like, that they bark, and so forth—that the concept DOG is lawfully connected to dogs. Knowledge of such facts thus plays a causal role in fixing the content of DOG. But on Fodor’s view, this knowledge doesn’t play a constitutive role. While such epistemic properties mediate the connection between tokens of DOG and dogs, this is a mere “engineering” fact about us, which has no implications for the metaphysics of concepts or concept possession (1998a, p. 78). As Fodor puts it, “it’s that your mental structures contrive to resonate to doghood, not how your mental structures contrive to resonate to doghood, that is constitutive of concept possession” (1998a, p. 76). Although the internal relations that DOG bears to other concepts and to percepts are what mediate the connection between DOG and dogs, on Fodor’s view such relations do not determine the content of DOG.

Fodor’s theory is a version of semantic externalism, according to which the meaning of a concept is exhausted by its reference. There are two well-known problems with any such referentialist theory: Frege cases, which putatively show that concepts that have different meanings can nevertheless be referentially identical; and Twin cases, which putatively show that concepts that are referentially distinct can nevertheless have the same meaning. Together, Frege cases and Twin cases suggest that meaning and reference are independent in both directions. Fodor has had much to say about each kind of case, and his views on both have changed over the years.

If conceptual content is exhausted by reference, then two concepts with the same referent ought to be identical in content. As Fodor says, “if meaning is information, then coreferential representations must be synonyms” (1998a, p. 12). But, prima facie, this is false. For as Frege pointed out, it’s easy to generate substitution failures involving coreferential concepts: “John believes that Hesperus is beautiful” may be true while “John believes that Phosphorus is beautiful” is false; “Thales believes that there’s water in the cup” may be true while “Thales believes that there’s H2O in the cup” is false; and so on. Since it’s widely believed that substitution tests are tests for synonymy, such cases suggest that coreferential concepts aren’t synonyms. In light of this, Fregeans introduce a layer of meaning in addition to reference that allows for a semantic distinction between coreferential but distinct concepts. On their view, coreferential concepts are distinct because they have different senses, or “modes of presentation” of a referent, which Fregeans typically individuate in terms of conceptual role (Peacocke 1992).

In one of Fodor’s important early articles on the topic, “Methodological Solipsism Considered as a Research Strategy in Cognitive Psychology” (1980), he argued that psychological explanations depend upon opaque taxonomies of mental states, and that we must distinguish the content of coreferential terms for the purposes of psychological explanation. At that time Fodor thus allowed for a kind of content that’s determined by the internal roles of symbols, which he speculated might be “reconstructed as aspects of form, at least insofar as appeals to content figure in accounts of the mental causation of behavior” (1980, p. 240). However, once he adopted a purely externalist semantics (Fodor 1994), Fodor could no longer allow for a notion of content determined by such internal relations. If conceptual content is exhausted by reference, as informational semantics has it, then there cannot be a semantic distinction between distinct but coreferential concepts.

In later work Fodor thus proposes to distinguish coreferential concepts purely syntactically, and defends the view that modes of presentation (MOPs) are the representational vehicles of thoughts (Fodor 1994, 1998a, 2008, Fodor and Pylyshyn 2014). Taking MOPs to be the syntactically-individuated vehicles of thought serves to connect the theory of concepts to RTM. As Fodor and Pylyshyn put it:

Frege just took for granted that, since coextensive thoughts (concepts) can be distinct, it must be difference in their intensions that distinguish them. But RTM, in whatever form, suggests another possibility: Thoughts and concepts are individuated by their extensions together with their vehicles. The concepts THE MORNING STAR and THE EVENING STAR are distinct because the corresponding mental representations are distinct. That must be so because the mental representation that expresses the concept THE MORNING STAR has a constituent that expresses the concept MORNING, but the mental representation that expresses the concept THE EVENING STAR does not. That’s why nobody can have the concept THE MORNING STAR who doesn’t have the concept MORNING and nobody can have the concept THE EVENING STAR who doesn’t have the concept EVENING. … The result of Frege’s missing this was a century during which philosophers, psychologists, and cognitive scientists in general spent wringing their hands about what meanings could possibly be. (2014, pp. 74-75)

An interesting consequence of this syntactic treatment is that people’s behavior in Frege cases can no longer be given an intentional explanation. Instead, such behavior is explained at the level of syntactically-individuated representations. If, as Fodor suggested in his earlier work (1981), psychological explanations standardly depend upon opaque taxonomies of mental states, then this treatment of Frege cases would threaten the need for intentional explanations in psychology. In an attempt to block this threat, Fodor (1994) argues that Frege cases are in fact quite rare, and can be understood as exceptions rather than counterexamples to psychological laws couched in terms of broad content. The viability of a view that combines a syntactic treatment of Frege cases with RTM has been the focus of a fair amount of literature; see Arjo 1996, Aydede 1998, Aydede and Robbins 2001, Brook and Stainton 1997, Rives 2009, Segal 1997, and Schneider 2005.
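The contrast with a Fregean treatment can be pictured in a few lines (a toy rendering with invented labels, not Fodor’s notation): the two concepts share a broad content but are type-distinct because their vehicles differ.

```python
# Illustrative sketch (invented labels): coreferential concepts distinguished
# by their syntactically individuated vehicles rather than by Fregean senses.

from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    vehicle: tuple  # the structured mental representation (its "syntax")
    referent: str   # the broad content

MORNING_STAR = Concept(vehicle=("THE", "MORNING", "STAR"), referent="Venus")
EVENING_STAR = Concept(vehicle=("THE", "EVENING", "STAR"), referent="Venus")

assert MORNING_STAR.referent == EVENING_STAR.referent  # referentially identical
assert MORNING_STAR.vehicle != EVENING_STAR.vehicle    # yet type-distinct:
# one vehicle has a constituent expressing MORNING and the other does not
```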

Let us now turn to Fodor’s treatment of Twin cases. Putnam (1975) asks us to imagine a place, Twin Earth, which is just like Earth except the stuff Twin Earthians pick out with the concept WATER is not H2O but some other chemical compound XYZ. Consider Oscar and Twin Oscar, who are both entertaining the thought THERE’S WATER IN THE GLASS. Since they’re physical duplicates, they’re type-identical with respect to everything mental inside their heads. However, Oscar’s thought is true just in case there’s H2O in the glass, whereas Twin Oscar’s thought is true just in case there’s XYZ in the glass. A purely externalist semantics thus seems to imply that Oscar’s and Twin Oscar’s WATER concepts are of distinct types, despite the fact that Oscar and Twin Oscar are type-identical with respect to everything mental inside their heads. Supposing that intentional laws are couched in terms of broad content, it would follow that Oscar’s and Twin Oscar’s water-directed behavior doesn’t fall under the same intentional laws.

Such consequences have seemed unacceptable to many, including Fodor, who in his book Psychosemantics (1987) argues that we need a notion of “narrow” content that allows us to account for the fact that Oscar’s and Twin-Oscar’s mental states will have the same causal powers despite differences in their environments. Fodor there defends a “mapping” notion of narrow content, inspired by David Kaplan’s work on demonstratives, according to which the narrow content of a concept is a function from contexts to broad contents (1987, ch. 2). The narrow content of Oscar’s and Twin Oscar’s concept WATER is thus a function that maps Oscar’s context onto the broad content H2O and Twin Oscar’s context onto the broad content XYZ. Such narrow content is shared because Oscar and Twin Oscar are computing the same function. It was Fodor’s hope that this notion of narrow content would allow him to respect the standard Twin-Earth intuitions, while at the same time claim that the intentional properties relevant for psychological explanation supervene on facts internal to thinkers.
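Since on the mapping proposal narrow content just is a function from contexts to broad contents, the idea can be rendered in a one-function sketch (the context labels and contents here are my own stand-ins):

```python
# Toy rendering of the "mapping" notion of narrow content (invented labels).
# Oscar and Twin Oscar share the same function; plugged into their different
# contexts, it delivers different broad contents.

def narrow_content_of_WATER(context: str) -> str:
    """Map a context of acquisition onto the broad content WATER has there."""
    return {"Earth": "H2O", "Twin Earth": "XYZ"}[context]

oscars_broad_content = narrow_content_of_WATER("Earth")            # 'H2O'
twin_oscars_broad_content = narrow_content_of_WATER("Twin Earth")  # 'XYZ'
assert oscars_broad_content != twin_oscars_broad_content
```

Sameness of narrow content is thus modeled as computing the same function, even though the values that function delivers differ across the two contexts.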

However, in The Elm and the Expert (1994) Fodor gives up on the notion of narrow content altogether, and argues that intentional psychology need not worry about Twin cases. Such cases, Fodor claims, only show that it’s conceptually (not nomologically) possible that broad content doesn’t supervene on facts internal to thinkers. One thus cannot appeal to such cases to “argue against the nomological supervenience of broad content on computation since, as far as anybody knows … chemistry allows nothing that is as much like water as XYZ is supposed to be except water” (1994, p. 28). So since Putnam’s Twin Earth is nomologically impossible, and “empirical theories are responsible only to generalizations that hold in nomologically possible worlds,” Twin cases pose no threat to a broad content psychology (1994, p. 29). If it turned out that such cases did occur, then, according to Fodor, the generalizations missed by a broad content psychology would be purely accidental (1994, pp. 30-33). Fodor’s view is thus that Twin cases, like Frege cases, are fully compatible with an intentional psychology that posits only two dimensions to concepts: syntactically-individuated representations and broad contents. Much of Fodor’s work on concepts and content after The Elm and the Expert consisted of further articulation and defense of this view (Fodor 1998a, 2008, Fodor and Pylyshyn 2014).

6. Nativism

In The Language of Thought (1975), Fodor argued not only in favor of RTM but also in favor of the much more controversial view that all lexical concepts are innate. Fodor’s argument starts with the noncontroversial claim that in order to learn a concept one must learn its meaning, or content, together with the standard assumption that such learning proceeds by formulating and confirming hypotheses about what falls under the concept. But Fodor argues that any such account requires that learnable concepts have meanings that are semantically complex. For instance, if the meaning of BACHELOR is unmarried adult male, then a thinker can learn BACHELOR by confirming the hypothesis that it applies to things that are unmarried adult males. Of course, in order to formulate this hypothesis one must already possess the concepts UNMARRIED, ADULT, and MALE. Standard models of concept learning thus do not apply to primitive concepts that lack internal structure. For instance, one cannot formulate the hypothesis that red things fall under RED unless one already has RED, for the concept RED is a constituent of that very hypothesis. Therefore, primitive concepts like RED cannot be learned, that is, they must be innate. If, as Fodor argues, all lexical concepts are primitive, then it follows that all lexical concepts are innate (1975, ch. 2).

It bears emphasis that Fodor’s claim is not that experience plays no role in the acquisition of lexical concepts. Experience must play a role on any account of concept acquisition, just as it does on any account of language acquisition. Rather, Fodor’s claim is that lexical concepts are not learned on the basis of experience but triggered by it. As Fodor puts it in his most well-known article on the topic, “The Present Status of the Innateness Controversy,” his nativist claim is that the relation between experience and concept acquisition is brute-causal, not rational or evidential:

Nativists and Empiricists disagree on the extent to which the acquisition of lexical concepts is a rational process. In respect of this disagreement, the traditional nomenclature of “Rationalism vs. Empiricism” could hardly be more misleading. It is the Empiricist view that the relation between a lexical concept and the experiences which occasion its acquisition is normally rational—in particular, that the normal relation is that such experiences bestow inductive warrant upon hypotheses which articulate the internal structure of the concepts. Whereas, it’s the Rationalist view that the normal relation between lexical concepts and their occasioning experiences is brute-causal, i.e. “merely” empirical: such experiences function as the innately specified triggers of the concepts which they—to borrow the ethological jargon—“release”.  (1981b, pp. 279-280)

Most theories of concepts—such as conceptual role and prototype theories, discussed above—assume that many lexical concepts have some kind of internal structure. In fact, theorists are sometimes explicit that their motivation for positing complex lexical structure is to reduce the number of primitives in the lexicon. As Ray Jackendoff puts it:

Nearly everyone thinks that learning anything consists of constructing it from previously known parts, using previously known means of combination. If we trace the learning process back and ask where the previously known parts came from, and their previously known parts came from, eventually we have to arrive at a point where the most basic parts are not learned: they are given to the learner genetically, by virtue of the character of brain development. … Applying this view to lexical learning, we conclude that lexical concepts must have a compositional structure, and that the word learner’s [mind] is putting meanings together from smaller parts (2002, p. 334). (See also Levin and Pinker 1991, p. 4.)

It’s worth stressing that while those in the empiricist tradition typically assume that the primitives are sensory concepts, those who posit complex lexical structure need not commit themselves to any such empiricist claim. Rather, they may simply assume that very few lexical items are not decomposable, and deal with the issue of primitives on a case-by-case basis, as Jackendoff (2002) does. In fact, many of the (apparent) primitives appealed to in the literature—for example, EVENT, THING, STATE, CAUSE, and so forth—are quite abstract and thus not ripe for an empiricist treatment. In any case, as we noted above, Fodor is led to adopt informational atomism, in part, because he is persuaded by the evidence that lexical concepts do not have any structure, decompositional or otherwise. He thus denies that appealing to lexical structure provides an adequate reply to his argument for concept nativism (Fodor 1981b, 1998a, 2008, Fodor and Lepore 2002).

In Concepts: Where Cognitive Science Went Wrong (1998a), Fodor worries about whether his earlier view is adequate. In particular, he’s concerned about whether it has the resources to answer questions such as why it is experiences with doorknobs that trigger the concept DOORKNOB:

[T]here’s a further constraint that whatever theory of concepts we settle on should satisfy: it must explain why there is so generally a content relation between the experience that eventuates in concept attainment and the concept that the experience eventuates in attaining. … [A]ssuming that primitive concepts are triggered, or that they’re ‘caught’, won’t account for their content relation to their causes; apparently only induction will. But primitive concepts can’t be induced; to suppose that they are is circular. (1998a, p. 132)

Fodor’s answer to this worry involves a metaphysical claim about the nature of the properties picked out by most of our lexical concepts. In particular, he claims that it’s constitutive of these properties that our minds “lock” to them as a result of experience with their prototypical (stereotypical) instances. As Fodor puts it, being a doorknob is just “being the kind of thing that our kinds of minds (do or would) lock to from experience with instances of the doorknob stereotype” (1998a, p. 137; see also 2008). By construing such properties as mind-dependent in this way, Fodor thus provides a metaphysical reply to his worry above: there need not be a cognitive or evidential relation between our experiences with doorknobs and our acquisition of DOORKNOB, for being a doorknob just is the property that our minds lock to as a result of experiencing stereotypical instances of doorknobs. Fodor sums up his view as follows:

[I]f the locking story about concept possession and the mind-dependence story about the metaphysics of doorknobhood are both true, then the kind of nativism about DOORKNOB that an informational atomist has to put up with is perhaps not one of concepts but of mechanisms. That consequence may be some consolation to otherwise disconsolate Empiricists. (1998a, p. 142)

In LOT 2: The Language of Thought Revisited (2008), Fodor extends his earlier discussions of concept nativism. Whereas his previous argument turned on the empirical claim that lexical concepts are internally unstructured, Fodor here says that this claim is “superfluous”: “What I should have said is that it’s true and a priori that the whole notion of concept learning is per se confused” (2008, p. 130). Consider a patently complex concept such as GREEN OR TRIANGULAR. Learning this concept would require confirming the hypothesis that the things that fall under it are either green or triangular. However, Fodor says:

[T]he inductive evaluation of that hypothesis itself requires (inter alia) bringing the property green or triangular before the mind as such. You can’t represent something as green or triangular unless you have the concepts GREEN, OR, and TRIANGULAR. Quite generally, you can’t represent anything as such and such unless you already have the concept such and such. … This conclusion is entirely general; it doesn’t matter whether the target concept is primitive (like GREEN) or complex (like GREEN OR TRIANGULAR). (2008, p. 139)

Fodor’s diagnosis of this problem is that standard learning models wrongly assume that acquiring a concept is a matter of acquiring beliefs. Instead, Fodor suggests that “beliefs are constructs out of concepts, not the other way around,” and that the failure to recognize this is what leads to the above circularity (2008, pp. 139-140; see also Fodor’s contribution to Piattelli-Palmarini, 1980).

Fodor’s story about concept nativism in LOT 2 runs as follows: although no concepts—not even complex ones—are learned, concept acquisition nevertheless involves inductive generalizations. We acquire concepts as a result of experiencing their prototypical instances, and learning a prototype is an inductive process. Of course, if concepts were prototypes then it would follow that concept acquisition would be an inductive process. But, as we saw above, Fodor claims that concepts can’t be prototypes since prototypes violate compositionality. Instead, Fodor suggests that learning a prototype is a stage in the acquisition of a concept. His picture thus looks like this (2008, p. 151):

Initial state → (P1) → stereotype/prototype formation → (P2) → locking (= concept attainment).

Why think that P1 is an inductive process? Fodor here appeals to “well-known empirical results suggesting that even very young infants are able to recognize and respond to statistical regularities in their environments,” and claims that “a genetically endowed capacity for statistical induction would make sense if stereotype formation is something that minds are frequently employed to do” (2008, p. 153). What renders this picture consistent with Fodor’s claim that “there can’t be any such thing as concept learning” (p. 139) is that he does not take P2 to be an inferential or intentional process (pp. 154-155). What kind of process is it? Here, Fodor doesn’t have much to say, other than it’s the “kind of thing that our sort of brain tissue just does”: “Psychology gets you from the initial state to P2; then neurology takes over and gets you the rest of the way to concept attainment” (p. 152). So, again, Fodor’s ultimate story about concept nativism is consistent with the view, as he puts it in Concepts, that “maybe there aren’t any innate ideas after all” (1998a, p. 143). Instead, there are innate mechanisms, which take us from the acquisition of prototypes to the acquisition of concepts.
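The two-stage picture can be caricatured in a few lines of code. The feature-averaging model of P1 and the bare pairing that stands in for P2 are deliberately crude stand-ins of my own, not anything Fodor specifies.

```python
# Crude sketch of the acquisition pipeline described above (all details are
# stand-ins). P1, stereotype formation, is modeled as a statistical (inductive)
# process over experienced instances; P2, locking, is modeled as a bare,
# non-inferential pairing of the resulting state with a worldly property.

from statistics import mean

def form_prototype(instances):
    """P1: an inductive, statistical process over experienced instances."""
    features = instances[0].keys()
    return {f: mean(instance[f] for instance in instances) for f in features}

def lock(prototype, worldly_property):
    """P2: not an inference, just the establishing of a mind-world relation."""
    return {"locked_to": worldly_property, "mediating_prototype": prototype}

doorknob_experiences = [
    {"roundness": 0.9, "graspability": 0.8},
    {"roundness": 0.7, "graspability": 0.9},
]
DOORKNOB = lock(form_prototype(doorknob_experiences), "being a doorknob")
print(DOORKNOB)
```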

7. Modularity

In his influential book, The Modularity of Mind (1983), Fodor argues that the mind contains a number of highly specialized, “modular” systems, whose operations are largely independent from each other and from the “central” system devoted to reasoning, belief fixation, decision making, and the like. In that book, Fodor was particularly interested in defending a modular view of perception against so-called “New Look” psychologists and philosophers (for example, Bruner, Kuhn, Goodman), who took cognition to be more or less continuous with perception. Whereas New Look theorists focused on evidence suggesting various top-down effects in perceptual processing (for example, ways in which what people believe and expect can affect what they see), Fodor was impressed by evidence from the other direction suggesting that perceptual processes lack access to such “background” information. Perceptual illusions provide a nice illustration. In the famous Müller-Lyer illusion (Figure 1), the top line looks longer than the bottom line even though they’re identical in length.

Figure 1. The Müller-Lyer Illusion

As Fodor points out, if knowing that the two lines are identical in length does not change the fact that one looks longer than the other, then clearly perceptual processes don’t have access to all of the information available to the perceiver. Thus, there must be limits on how much information is available to the visual system for use in perceptual inferences. In other words, vision must be in some interesting sense modular. The same goes for other sensory/input systems, and, on Fodor’s view, certain aspects of language processing.

Fodor spells out a number of characteristic features of modules. That knowledge of an illusion doesn’t make the illusion go away illustrates one of their central features, namely, that they are informationally encapsulated. Fodor says:

[T]he claim that input systems are informationally encapsulated is equivalent to the claim that the data that can bear on the confirmation of perceptual hypotheses includes, in the general case, considerably less than the organism may know. That is, the confirmation function for input systems does not have access to all the information that the organism internally represents. (1983, p. 69)

In addition, modules are supposed to be domain specific, in the sense that they’re restricted in the sorts of representations (such as visual, auditory, or linguistic) that can serve as their inputs (1983, pp. 47-52). They’re also mandatory. For instance, native English speakers cannot hear utterances of English as mere noise (“You all know what Swedish and Chinese sound like; what does English sound like?” 1983, p. 54), and people with normal vision and their eyes open cannot help but see the 3-D objects in front of them. In general, modules “approximate the condition so often ascribed to reflexes: they are automatically triggered by the stimuli that they apply to” (1983, pp. 54-55). Not only are modular processes domain-specific and out of our voluntary control, they’re also exceedingly fast. For instance, subjects can “shadow” speech (repeat what is heard when it’s heard) with a latency of about 250 milliseconds, and match a description to a picture with 96% accuracy when exposed for a mere 167 milliseconds (1983, pp. 61-64). In addition, modules have shallow outputs, in the sense that the information they carry is simple, or constrained in some way, which is required because otherwise the processing required to generate them couldn’t be encapsulated. As Fodor says, “if the visual system can deliver news about protons, then the likelihood that visual analysis is informationally encapsulated is negligible” (1983, p. 87). Fodor tentatively suggests that the visual system delivers as outputs “basic” perceptual categories (Rosch et al. 1976) such as dog or chair, although others take shallow outputs to be altogether non-conceptual (Carruthers 2006, p. 4). In addition to these features, Fodor also suggests that modules are associated with fixed neural architecture, exhibit characteristic and specific breakdown patterns, and have an ontogeny that exhibits a characteristic pace and sequencing (1983, pp. 98-101).

On Fodor’s view, although sensory systems are modular, the “central” systems underlying belief fixation, planning, decision-making, and the like, are not. The latter exhibit none of the characteristic features associated with modules since they are domain-general, unencapsulated, under our voluntary control, slow, and not associated with fixed neural structures. Fodor draws attention, in particular, to two distinguishing features of central systems: they’re isotropic, in the sense that “in principle, any of one’s cognitive commitments (including, of course, the available experiential data) is relevant to the (dis)confirmation of any new belief” (2008, p. 115); and they’re Quinean, in the sense that they compute over the entirety of one’s belief system, as when one settles on the simplest, most conservative overall revision of one’s beliefs—as Fodor puts it, “the degree of confirmation assigned to any given hypothesis is sensitive to properties of the entire belief system” (1983, p. 107). Fodor’s picture of mental architecture is one in which there are a number of informationally encapsulated modules that process the outputs of transducer systems, and then generate representations that are integrated in a non-modular central system. The Fodorean mind is thus essentially a big general-purpose computer, with a number of domain-specific computers out near the edges that feed into it.
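That division of labor can be sketched as a toy architecture (invented names and details throughout; a schematic of the picture just described, not a model Fodor provides): each input module consults only its own proprietary database, whatever the organism happens to believe, whereas the central system may draw on any belief at all.

```python
# Toy schematic of the architecture described above (invented names/details).

class Module:
    """An input system: domain-specific, mandatory, informationally encapsulated."""
    def __init__(self, domain, proprietary_database):
        self.domain = domain
        self.proprietary_database = proprietary_database  # all it may consult

    def process(self, stimulus):
        if stimulus["domain"] != self.domain:  # domain specificity
            return None
        # Encapsulation: only the proprietary database is used; the organism's
        # background beliefs play no role here.
        return {"percept": stimulus["signal"],
                "assumptions_used": self.proprietary_database}

class CentralSystem:
    """Belief fixation: domain-general and unencapsulated."""
    def __init__(self, beliefs):
        self.beliefs = beliefs  # access to everything the organism believes

    def fix_belief(self, percepts):
        # Isotropic/Quinean: in principle any prior belief may bear on what to
        # conclude from the modules' shallow outputs.
        return {"evidence": percepts, "weighed_against": sorted(self.beliefs)}

vision = Module("visual", proprietary_database=["built-in rigidity assumption"])
mind = CentralSystem(beliefs={"the two lines are equal in length"})
percept = vision.process({"domain": "visual", "signal": "top line looks longer"})
print(mind.fix_belief([percept]))
```

Knowing that the lines are equal lives in the central system’s belief store; the vision module, lacking access to it, delivers the illusory percept all the same, which is the encapsulation point illustrated by the Müller-Lyer case above.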

Fodor’s work on modularity has been criticized on a number of fronts. Empiricist philosophers and psychologists are typically quite happy with the claim that the central system is domain-general, but have criticized Fodor’s claim that input systems are modular (see Prinz 2006 for an overview). Fodor’s work has also been attacked by those who share his rationalist and nativist sympathies. Most notably, evolutionary psychologists reject Fodor’s claim that there must be a non-modular system responsible for integrating modular outputs, and argue instead that the mind is nothing but a collection of modular systems (see Barkow, Cosmides, and Tooby 1992, Carruthers 2006, Pinker 1997, and Sperber 2002). According to such “massive modularity” theorists, what Fodor calls the “central” system is in fact built up out of a number of domain-specific modules, for example, modules devoted to common-sense reasoning about physics, biology, psychology, and the detection of cheaters, to name a few prominent examples from the literature. (The notion of “module” used by such theorists is different in various ways from the notion as introduced by Fodor; see Carruthers 2006 and Barrett 2015.) In addition, evolutionary psychologists claim that these central modules are adaptations, that is, products of selection pressures that faced our hominid ancestors.

That Fodor is a staunch nativist might lead one to believe that he is sympathetic to applying adaptationist reasoning to the human mind. This would be a mistake. Fodor has long been skeptical of the idea that the mind is a product of natural selection, and in his book The Mind Doesn’t Work That Way (2000) he replies to a number of arguments purporting to show that it must be. For instance, evolutionary psychologists claim that the mind must be “reverse engineered”: in order to figure out how it works, we must know what its function is; and in order to know what its function is we must know what it was selected for. Fodor rejects this latter inference, and claims that natural selection is not required in order to underwrite claims about the teleology of the mind. For the notion of function relevant for psychology might be synchronic, not diachronic: “You might think, after all, that what matters in understanding the mind is what ours do now, not what our ancestors’ did some millions of years ago” (1998b, p. 209). Indeed, in general, one does not need to know about the evolutionary history of a system in order to make inferences about its function:

[O]ne can often make a pretty shrewd guess what an organ is for on the basis of entirely synchronic considerations. One might thus guess that hands are for grasping, eyes for seeing, or even that minds are for thinking, without knowing or caring much about their history of selection. Compare Pinker (1997, p. 38): “psychologists have to look outside psychology if they want to explain what the parts of the mind are for.” Is this true? Harvey didn’t have to look outside physiology to explain what the heart is for. It is, in particular, morally certain that Harvey never read Darwin. Likewise, the phylogeny of bird flight is still a live issue in evolutionary theory. But, I suppose, the first guy to figure out what birds use their wings for lived in a cave. (2000, p. 86)

Fodor’s point is that even if one grants that natural selection underwrites teleological claims about the mind, it doesn’t follow that in order to understand a psychological mechanism one must understand the selection pressures that led to it.

Evolutionary psychologists also argue that the adaptive complexity of the mind is best explained by the hypothesis that it is a collection of adaptations. For natural selection is the only known explanation for adaptive complexity in the living world. In response, Fodor claims that the complexity of our minds is irrelevant to the question of whether they’re the products of natural selection:

[W]hat matters to the plausibility that the architecture of our minds is an adaptation is how much genotypic alteration would have been required for it to evolve from the mind of the nearest ancestral ape whose cognitive architecture was different from ours. About that, however, nothing is known. … [I]t’s entirely possible that quite small neurological reorganizations could have effected wild psychological discontinuities between our minds and the ancestral ape’s. … If that’s right, then there is no reason at all to believe that our cognition was shaped by the gradual action of Darwinian selection on prehuman behavioral phenotypes. (2000, pp. 87-88)

Fodor thus argues that adaptive complexity does not warrant the claim that our minds are products of natural selection. In a book co-authored with Massimo Piattelli-Palmarini, What Darwin Got Wrong (2010), Fodor goes much further, arguing that adaptationist explanations in general are both of decreasing interest in biology and, on further reflection, actually incoherent. Perhaps needless to say, the book has occasioned considerable controversy (see Sober 2010, Pigliucci 2010, Block and Kitcher 2010, and Godfrey-Smith 2010; Fodor and Piattelli-Palmarini reply to their critics in an afterword in the paperback edition of the book).

In The Mind Doesn’t Work That Way (2000), and also in LOT 2 (2008), Fodor reiterates and defends his claim that the central systems are non-modular, and connects this view to more general doubts about the adequacy of RTM as a comprehensive theory of the human mind, doubts that he first voiced in his classic The Modularity of Mind (1983). One of the main jobs of the central system is the fixation of belief via abductive inferences, and Fodor argues that the fact that such inferences are holistic, global, and context-dependent implies that they cannot be realized in a modular system. Given RTM’s commitment to the claim that computational processes are sensitive only to local properties of mental representations, these features of central cognition thus appear to fall outside of RTM’s scope (2000, chs. 2-3; 2008, ch. 4).

Consider, for instance, the simplicity of a belief. As Fodor says: “The thought that there will be no wind tomorrow significantly complicates your arrangements if you had intended to sail to Chicago, but not if your plan was to fly, drive, or walk there” (2000, p. 26). Whether or not a belief complicates a plan thus depends upon the beliefs involved in the plan—that is, the simplicity of a belief is one of its global, context-dependent properties. However, the syntactic properties of representations are local, in the sense that they are determined by the representations’ intrinsic, context-independent features. Fodor concludes that to the extent that cognition involves global properties of representations, RTM cannot provide a model of how cognition works:

[A] cognitive science that provides some insight into the part of the mind that isn’t modular may well have to be different, root and branch, from the kind of syntactical account that Turing’s insights inspired. It is, to return to Chomsky’s way of talking, a mystery, not just a problem, how mental processes could be simultaneously feasible and abductive and mechanical. Indeed, I think that, as things now stand, this and consciousness look to be the ultimate mysteries about the mind. (2000, p. 99)

Thus, although Fodor has long championed RTM as the best theory of cognition available, he claims that its application is limited to those portions of the mind that are modular. Needless to say, some disagree with Fodor’s assessment of the limits of RTM (see Carruthers 2006, Ludwig and Schneider 2008, Pinker 2005, and Barrett 2015).

8. References and Further Reading

  • Arjo, Dennis (1996) “Sticking Up for Oedipus: Fodor on Intentional Generalizations and Broad Content,” Mind & Language 11: 231-235.
  • Aydede, Murat (1998) “Fodor on Concepts and Frege Puzzles,” Pacific Philosophical Quarterly 79: 289-294.
  • Aydede, Murat & Philip Robbins (2001) “Are Frege Cases Exceptions to Intentional Generalizations?” Canadian Journal of Philosophy 31: 1-22.
  • Barkow, Jerome, Leda Cosmides, and John Tooby (Eds.) (1992) The Adapted Mind. Oxford: Oxford University Press.
  • Barrett, H. Clark (2015) The Shape of Thought: How Mental Adaptations Evolve. Oxford: Oxford University Press.
  • Block, Ned (1993). “Holism, Hyper-Analyticity, and Hyper-Compositionality,” Philosophical Issues 3: 37-72.
  • Block, Ned and Philip Kitcher (2010) “Misunderstanding Darwin: Natural Selection’s Secular Critics Get it Wrong,” Boston Review (March/April).
  • Boghossian, Paul (1993). “Does Inferential Role Semantics Rest on a Mistake?” Philosophical Issues 3: 73-88.
  • Brook, Andrew and Robert Stainton (1997) “Fodor’s New Theory of Content and Computation,” Mind & Language 12: 459-474.
  • Carruthers, Peter (2003) “On Fodor’s Problem,” Mind & Language 18: 502-523.
  • Carruthers, Peter (2006) The Architecture of the Mind: Massive Modularity and the Flexibility of Thought. Oxford: Oxford University Press.
  • Chomsky, Noam (1959) “A Review of B.F. Skinner’s Verbal Behavior,” Language 35: 26-58.
  • Evans, Gareth (1982) Varieties of Reference. Oxford: Oxford University Press.
  • Fodor, Janet, Jerry Fodor, and Merrill Garrett (1975) “The Psychological Unreality of Semantic Representations,” Linguistic Inquiry 4: 515-531.
  • Fodor, Jerry (1974) “Special Sciences (Or: The Disunity of Science as a Working Hypothesis)” Synthese 28:97-115.
  • Fodor, Jerry (1975) The Language of Thought. New York: Crowell.
  • Fodor, Jerry (1980) “Methodological Solipsism Considered as a Research Strategy in Cognitive Psychology,” Behavioral and Brain Sciences 3: 63-109. Reprinted in Fodor (1981a).
  • Fodor, Jerry (1981a) RePresentations: Philosophical Essays on the Foundations of Cognitive Science. Cambridge, MA: MIT Press.
  • Fodor, Jerry (1981b) “The Present Status of the Innateness Controversy,” In Fodor (1981a).
  • Fodor, Jerry (1983) The Modularity of Mind. Cambridge, MA: MIT Press.
  • Fodor, Jerry (1986) “Why Paramecia Don’t Have Mental Representations,” Midwest Studies in Philosophy 10: 3-23.
  • Fodor, Jerry (1987) Psychosemantics: The Problem of Meaning in the Philosophy of Mind. Cambridge, MA: MIT Press.
  • Fodor, Jerry (1989) “Making Mind Matter More,” Philosophical Topics 67: 59-79.
  • Fodor, Jerry (1990) A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
  • Fodor, Jerry (1991) “Replies,” In Loewer and Rey (Eds.) Meaning in Mind: Fodor and His Critics. Oxford: Blackwell.
  • Fodor, Jerry (1994) The Elm and the Expert: Mentalese and Its Semantics. Cambridge, MA: MIT Press.
  • Fodor, Jerry (1998a) Concepts: Where Cognitive Science Went Wrong. New York: Oxford University Press.
  • Fodor, Jerry (1998b) In Critical Condition: Polemical Essays on Cognitive Science and the Philosophy of Mind. Cambridge, MA: MIT Press.
  • Fodor, Jerry (2000) The Mind Doesn’t Work That Way: The Scope and Limits of Computational Psychology. Cambridge, MA: MIT Press.
  • Fodor, Jerry (2003) Hume Variations. Oxford: Oxford University Press.
  • Fodor, Jerry (2004) “Having Concepts: A Brief Refutation of the 20th Century,” Mind & Language 19: 29-47.
  • Fodor, Jerry, and Charles Chihara (1965) “Operationalism and Ordinary Language,” American Philosophical Quarterly 2: 281-295.
  • Fodor, Jerry, Thomas Bever, and Merrill Garrett (1974) The Psychology of Language: An Introduction to Psycholinguistics and Generative Grammar. New York: McGraw Hill.
  • Fodor, Jerry, Merrill Garrett, Edward Walker, and Cornelia Parkes (1980) “Against Definitions,” reprinted in Margolis and Laurence (1999).
  • Fodor, Jerry, and Zenon Pylyshyn (1988) “Connectionism and Cognitive Architecture: A Critical Analysis,” Cognition 28: 3-71.
  • Fodor, Jerry, and Ernest Lepore (1992) Holism: A Shopper’s Guide. Oxford: Blackwell.
  • Fodor, Jerry, and Ernest Lepore (2002) The Compositionality Papers. New York: Oxford University Press.
  • Fodor, Jerry, and Massimo Piattelli-Palmarini (2010) What Darwin Got Wrong. Farrar, Straus and Giroux.
  • Fodor, Jerry, and Zenon Pylyshyn (2014) Minds without Meanings. Cambridge, MA: MIT Press.
  • Godfrey-Smith, Peter (2010) “It Got Eaten,” London Review of Books, 32 (13): 29-30.
  • Hale, Kenneth, and Samuel Jay Keyser (1993) “On Argument Structure and Lexical Expression of Syntactic Relations,” in K. Hale and S.J. Keyser (Eds.) The View From Building 20 Cambridge, MA: MIT Press.
  • Hale, Kenneth, and Samuel Jay Keyser (1999) “A Response to Fodor and Lepore ‘Impossible Words?’” Linguistic Inquiry 30: 453–466.
  • Heil, John (2003). From An Ontological Point of View. Oxford: Oxford University Press.
  • Horgan, Terrence (1998). “Recognitional Concepts and the Compositionality of Concept Possession,” Philosophical Issues 9: 27-33.
  • Jackendoff, Ray (1983). Semantics and Cognition. Cambridge, MA: MIT Press.
  • Jackendoff, Ray (1992). Languages of the Mind. Cambridge, MA: MIT Press.
  • Katz, Jerrold (1977) “The Real Status of Semantic Representations,” Linguistic Inquiry 8: 559-84.
  • Katz, Jerrold (1981) Language and Other Abstract Objects. Oxford: Blackwell.
  • Katz, Jerrold, and J.A. Fodor (1963) “The Structure of a Semantic Theory,” Language 39:170-210.
  • Kim, Jaegwon (2005) Physicalism, or Something Near Enough. Princeton, NJ: Princeton University Press.
  • Loewer, Barry, and Georges Rey (Eds.) (1991). Meaning in Mind: Fodor and His Critics. Oxford: Blackwell.
  • Lowe, E.J. (2008) Personal Agency: The Metaphysics of Mind and Action. Oxford: Oxford University Press.
  • Ludwig, Kirk, and Susan Schneider (2008) “Fodor’s Challenge to the Classical Computational Theory of Mind,” Mind & Language, 23, 123-143.
  • Melnyk, Andrew (2003) A Physicalist Manifesto: Thoroughly Modern Materialism. Cambridge: Cambridge University Press.
  • Miller, George, and Johnson-Laird, Philip (1976). Language and Perception. Cambridge, MA: Harvard University Press.
  • Peacocke, Christopher (1992) A Study of Concepts. Cambridge, MA: MIT Press.
  • Piattelli-Palmarini, Massimo (1980) Language and Learning. Cambridge, MA: Harvard University Press.
  • Pigliucci, Massimo (2010) “A Misguided Attack on Evolution” Nature 464: 353–354.
  • Pinker, Steven (1989) Learnability and Cognition. Cambridge, MA: MIT Press.
  • Pinker, Steven (1997) How the Mind Works. New York: W. W. Norton & Company.
  • Pinker, Steven (2005) “So How Does the Mind Work?” Mind & Language 20: 1-24.
  • Prinz, Jesse (2002) Furnishing the Mind. Cambridge, MA: MIT Press.
  • Prinz, Jesse (2006) “Is the Mind Really Modular?” In Stainton (Ed.) Contemporary Debates in Cognitive Science. Oxford: Blackwell.
  • Pustejovsky, James (1995) The Generative Lexicon. Cambridge, MA: MIT Press.
  • Pustejovsky, James (1998) “Generativity and Explanation in Semantics: A Reply to Fodor and Lepore” Linguistic Inquiry 29: 289-311.
  • Putnam, Hilary (1963) “Brains and Behavior”, reprinted in Putnam 1975b, pp. 325–341.
  • Putnam, Hilary (1967) “The Nature of Mental States”, reprinted in Putnam 1975b, 429–440.
  • Putnam, Hilary (1975) “The Meaning of ‘Meaning’,” Minnesota Studies in the Philosophy of Science 7: 131-193.
  • Putnam, Hilary (1975b) Mind, Language, and Reality, vol. 2. Cambridge: Cambridge University Press.
  • Pylyshyn, Zenon (2003) Seeing and Visualizing. Cambridge, MA: MIT Press.
  • Quine, W.V. O. (1953) “Two Dogmas of Empiricism,” In From a Logical Point of View, Cambridge, MA: Harvard University Press.
  • Quine, W.V. O. (1960) Word and Object. Cambridge, MA: MIT Press.
  • Recanati, François (2002) “The Fodorian Fallacy,” Analysis 62: 285-289.
  • Rey, Georges (1993) “Idealized Conceptual Roles,” Philosophy and Phenomenological Research 53: 47-52.
  • Rey, Georges (2005) “Philosophical analysis as cognitive psychology,” In H. Cohen and C. Lefebvre (Eds.) Handbook of Categorization in Cognitive Science. Dordrecht: Elsevier.
  • Rives, Bradley (2009) “Concept Cartesianism, Concept Pragmatism, and Frege Cases,” Philosophical Studies 144: 211-238.
  • Rives, Bradley (2016) “Concepts and Analytic Intuitions,” Analytic Philosophy 57(4): 285-314.
  • Rosch, Eleanor and Carolyn Mervis (1975) “Family Resemblances: Studies in the Internal Structure of Categories,” Cognitive Psychology 7: 573-605.
  • Rosch, Eleanor, Mervis, C., Gray, W., Johnson, D., and Boyes-Braem, P. (1976). “Basic Objects in Natural Categories,” Cognitive Psychology 8: 382–439.
  • Schneider, Susan (2005) “Direct Reference, Psychological Explanation, and Frege Cases,” Mind & Language 20: 423-447.
  • Segal, Gabriel (1997) “Content and Computation: Chasing the Arrows,” Mind & Language 12: 490-501.
  • Shoemaker, Sydney (2003) “Some Varieties of Functionalism,” In Shoemaker, Identity, Cause, and Mind, Oxford: Oxford University Press.
  • Sober, Elliott (2010) “Natural Selection, Causality, and Laws: What Fodor and Piattelli-Palmarini Got Wrong,” Philosophy of Science 77(4): 594-607.
  • Sperber, Dan (2002) “In Defense of Massive Modularity,” In Dupoux (Ed.) Language, Brain, and Cognitive Development. Cambridge, MA: MIT Press.
  • Stoljar, Daniel (2010) Physicalism. New York: Routledge.
  • Williamson, Timothy (2007) The Philosophy of Philosophy. Oxford: Blackwell.

Author Information

Bradley Rives
Email: rives@iup.edu
Indiana University of Pennsylvania
U. S. A.

The Trinity

Christians believe that God is a Trinity of Persons, each omnipotent, omniscient and wholly benevolent, co-equal and fully divine. There are not three gods, however, but one God in three Persons: Father, Son and Holy Spirit. Prima facie, the doctrine of the Trinity seems gratuitous: why multiply divine beings beyond necessity—especially since one God is hard enough to believe in? For Christians, however, the Trinity doctrine is neither gratuitous nor unmotivated. Claims about Christ’s divinity are difficult to reconcile with the Christian doctrine that there is just one God: Trinitarian theology is an attempt to square the Christian conviction that Jesus is the Son of God, fully divine yet distinct from his Father, with the Christian commitment to monotheism. Nevertheless, while the Trinity doctrine purports to solve a range of theological puzzles, it poses a number of intriguing logical difficulties akin to those suggested by the identity of spatio-temporal objects through time and across worlds, puzzle cases of personal identity, and problems of identity and constitution. Philosophical discussions of the Trinity have suggested solutions to the Trinity puzzle comparable to solutions proposed to these classic identity puzzles. When it comes to the Trinity puzzle, however, one must determine whether such solutions accord with theological constraints.

Table of Contents

  1. History and Motivation
    1. Why Should One Believe It?
    2. God and World: The Great Chain of Being and the Bright Line
    3. Trinity East and West: Loose and Descending or Tight and Flat?
    4. The Industry Standard: Nicea and Beyond
  2. Theological Constraints
    1. Monotheism
    2. The Distinctness of Persons
    3. The Equality of Persons, the Descending Trinity and the Filioque Clause
    4. Personality
    5. Christology and the Jesus Predicate Problem
  3. Philosophical Puzzles and Solutions
    1. Trinity and Identity
    2. The “Is” of Predication
    3. Divine Stuff: ‘God’ as a Mass Term
    4. Relative Identity
    5. The Trinity and Other Identity Puzzles
  4. References and Further Reading

1. History and Motivation

a. Why Should One Believe It?

Why should one believe that God is a Trinity of Persons? Historically, most writers have held that even if the existence of God could be known by natural reason, his Trinitarian character could only be discovered through revelation.  Such revelations in the tradition of the Church can only be indirectly encountered through the explication and interpretation of Scripture. This was, for example, Aquinas’ view. However, other writers have suggested that even discounting revelation, reflection on the nature of God should lead us to recognize his Trinitarian character. For instance, Richard Swinburne argues that there is at least a plausibility argument for a Trinity of divine persons insofar as God’s perfectly loving nature drives the production of the Trinitarian Persons:

I believe that there is overriding reason for a first divine individual to bring about a second divine individual and with him to bring about a third divine individual…[L]ove is a supreme good. Love involves sharing, giving to the other what of one’s own is good for him and receiving from the other what of his is good for one; and love involves co-operating with another to benefit third parties. [Richard Swinburne, The Christian God, p. 177-178]

However, this is a minority view, as other contemporary writers reject a priori arguments for the doctrine of the Trinity. For example, Brian Leftow challenges it by asking why perfect love should stop at three rather than four or more.

If natural reason fails to provide a compelling reason to regard God as Trinitarian, an appeal to Scripture does not fare much better. There are few hints in the Bible of the Trinity doctrine, which developed later during the Patristic period. The Trinitarian formula figures in injunctions to baptize “in the name of the Father, Son and Holy Spirit” in Matthew 28:19, but both twofold and threefold patterns occur in the New Testament, and there is no mention of the Trinity as such. Rusch in The Trinitarian Controversy notes:

The binitarian formulas are found in Rom 8:11, 2 Cor. 4:14, Gal. 1:1, Eph. 1:20, 1 Tim. 1:2, 1 Pet. 1:21 and 2 John 1:13. The triadic schema is discovered in Matt. 28:19, 1 Cor. 6:11 and 12:4, Gal. 3:11-14, Heb. 10:29, and 1 Pet. 1:2. All these passages indicate that there is no fixity of wording. No doctrine of the Trinity in the Nicene sense is present in the New Testament. (William G. Rusch. The Trinitarian Controversy. Philadelphia: Fortress Press, 1980. p. 2)

Despite this ambiguity, the Gospels do pose puzzles that motivate the development of Trinitarian doctrine. First, they represent Jesus as both the Son of Man, who prays to the Father, and as a divine being, identified in the Fourth Gospel with the Logos, who is “with God and is God,” [John 1:1] and who also announces that he and the Father are “one” [John 10:30]. Second, Scripture speaks of the Spirit who descended on Jesus’ disciples at Pentecost, and who also is conventionally identified with the Spirit that moved over the waters at Creation in Genesis. Arguably, we may regard the Trinity doctrine as an explanatory hypothesis, which purports to make sense of divinity claims concerning the Son and Holy Spirit without undermining the Judeo-Christian commitment to monotheism.

b. God and World: The Great Chain of Being and the Bright Line

The Trinity doctrine is also part of a larger theological project. By the early Christian era, the Hebrew tradition included a plurality of divine, semi-divine and otherwise supernatural beings, which had to be reconciled with Hebraic monotheism. Some of these beings, such as Yahweh’s suite of Seraphim and Cherubim, were indigenous; others were absorbed from Hellenistic religious culture. In the interests of an orderly theological monotheism, these beings had to be defined in relation to God. Some were absorbed into the Godhead as aspects, powers or components of the one God, others were demoted to angelic or demonic status, and yet others were dismissed as the mere hypostatizations of theological façons de parler. The doctrine of the Trinity emerged as part of that theological tidying-up process, which, from the Judeo-Christian side, was aimed at drawing a bright line between the one God and everything else.

If Jews and Christians (insofar as they were faithful to their Hebraic roots) were intent on separating out the one God from all other things visible and invisible, Greeks had no compunction about multiplying supernatural beings. Indeed, the Greek problem of “the one and the many” was one of filling the gap between a simple, impassible, atemporal, incorporeal, incorruptible deity and a world of matter, the passive recipient of form, which was temporal, corporeal, and corruptible.

The traditional response was to introduce one or more divine, semi-divine or otherwise supernatural beings to mediate between the One and the many. So Plato, in the Timaeus, speculated that the material world had been created by the Demiurge, a Second God. This strategy was elaborated upon during the late Hellenistic period in Gnostic systems, which introduced elaborate systems of “emanations” from the divine in a continuum, a Great Chain of Being that stretched from the most elevated of beings, through intermediaries, and to those who were sufficiently remote from full divinity to be involved with the world of matter. During the Hellenistic period, Christians and Jews who engaged in philosophical theology, such as Origen and Philo, adopted similar views, since philosophy was the province of the Greeks and the philosophical lingua franca was Platonism.

In contrast, there was no reason to construct a Great Chain of Being within the Hebrew tradition. The writers of Hebrew Scripture did not have any compelling philosophical interests and did not look for mechanisms to explain how things came into being or operated: fiat was good enough. Perhaps more importantly, they did not view materiality as inherently imperfect or defective and so did not need to posit mediating beings to bridge an ontological gap between the divine and base matter, as the Greek tradition did. That tradition of mediating beings, though it continued to figure in popular piety, was officially repudiated by orthodox Christians. Yahweh, according to the Genesis accounts, created the world by fiat—no mechanism required—and saw that it was good.

For philosophical theologians in the grip of the problem of the One and the many, fiat would not do—and for Christian theologians, committed to monotheism, the doctrine of mediating divine or semi-divine beings posed special difficulties. As heirs to the Hebrew tradition, they recognized a fundamental ontological dividing line between a divine Creator and his creation and faced the problem of which side of the line the mediating being or beings occupied—exacerbated by the monotheistic assumption that there was only room for one Being on the side of the Creator.

c. Trinity East and West: Loose and Descending or Tight and Flat?

The Trinity doctrine was an attempt to accommodate both partisans of the Bright Line and partisans of the Great Chain of Being, including Christians who identified Jesus with the Logos, a mediating divine being through which the material world was created. Jews wanted to absorb all divine beings into the one God in the interests of promoting monotheism; Greeks wanted mediating beings to bridge the ontological gap between the world of time and change, and the transcendent reality beyond. In identifying the Logos, incarnate in Christ, as the Second Person of the Trinity, Christians aimed to bridge the ontological gap with a mediating being who was himself incorporated into the Godhead, satisfying the metaphysical interests of both Greeks and Jews.

Elaborated over three centuries before reaching its mature form in the ecumenical councils of Nicea (325 AD) and Constantinople (381 AD), the doctrine of the Trinity represents an attempt to organize and make sense of the Christian conviction that God created the material world, sustained it and acted within history, most particularly through Christ and in his Church. On this account, God creates all things through the Logos and enters into the world in Christ. Jesus promises that he will not abandon his people but that after ascending to his Father he will send the Holy Spirit to guide his Church. The Logos and Holy Spirit are not merely supernatural beings of an inferior order who do God’s business in the world: according to the Biblical tradition the material world is not an inferior realm to be handled by inferior mediating beings. Accordingly, Christians needed to hold that the Logos and Holy Spirit were fully divine. To preserve monotheism, however, there could not be divine beings other than God, so Christians were pressed to incorporate the Logos and Holy Spirit into the divine nature.

The tension between those two theological interests shaped the ongoing development of Trinitarian doctrine insofar as the goal of Christian orthodoxy was to make sense of the role of Christ as a mediating divine being—God with us, the Word made flesh through which all things were made—while maintaining the bright line between a transcendent, divine being and everything else in the interests of supporting monotheism. Painting with a broad brush, the former concern drove the development of Trinitarian doctrine that evolved into the Eastern tradition of Social Trinitarianism; the latter shaped the theology of Latin Trinitarianism that came to dominate the West.

Social Trinitarians, in the interests of explaining Christ’s mediating role, conceive of the Trinity as a divine society, each member of which is fully personal, each a center of consciousness, each involved in a loving relationship with the others. This view puts pressure on monotheism; however, advocates suggest that the cost is worth it in order to accommodate what they regard as compelling religious interests. Scripture represents Christ as communicating interpersonally with his Father, praying and being commended as the Son with whom his Father is well pleased. Social Trinitarians regard this sort of relationship as religiously important insofar as it models, in an ideal form, the relationship between God and us, and also between us and our fellows. In addition, the picture of the Trinity as a loving divine society makes sense of the notion of God as Love. For Social Trinitarians, in any case, the fundamental problem is one of making sense of the unity of Persons in one divine Being and this is, indeed, the project of the theologians credited with being the progenitors of Social Trinitarianism: the Cappadocian Fathers, Basil, Gregory of Nazianzus and Gregory of Nyssa.

Latin Trinitarians, by contrast, begin with God’s unity as given and seek to explain how the Persons may be distinguished one from another. If Social Trinitarians understand the Trinity as a society of Persons, Latin Trinitarians represent the Trinity in toto as an individual and imagine the Persons generated in some manner by the relations among them. In this vein, St. Augustine suggests that the Trinity is analogous to the mind, its knowledge of itself and love of itself, which are distinct but inseparable (Augustine, On the Trinity). Nevertheless, while Latin Trinitarianism makes monotheism unproblematic, it poses difficulties concerning the apparently interpersonal communication between Jesus and his Father, and in addition raises questions about how the Persons, in particular the Holy Spirit, can be understood as personal.

Although Social Trinitarianism and Latin Trinitarianism fall within the scope of Nicene orthodoxy, it may be instructive to consider the difference in heterodox views that emerge in the East and West. When Social Trinitarianism goes bad it degrades into Subordinationism, a family of doctrines that assign an inferiority of being, status or role to the Son and Holy Spirit within the Trinity, which has its roots in the emanationist theologies that proliferated in the Hellenistic world. This view is classically represented in the theology of the heresiarch Arius, who held that the Son was a mere creature, albeit “the first-born of all creation.” Eastern theology tends towards a “loose,” descending Trinity, to tri-theism and subordinationism and so Arianism is the characteristic Eastern heresy.

Western theology, by contrast, favors a “tight,” flat Trinity and in the first centuries of the Christian era tended toward ultra-high Christologies like Apollinarianism, the doctrine that, crudely, Jesus was a man in whom the Logos took the place normally occupied by a human rational soul, and Monophysitism, according to which Christ had only one nature, and that divine. If the characteristic Trinitarian heresy in the East was Arianism, the characteristic Western heresies belong to a family of heterodox views generically known as Monarchianism, a term coined by Tertullian to designate tight-Trinity doctrines in virtue of their emphasis on the unity of God as the single and only ruler or source of Being, including most notably Modalism (a.k.a. Sabellianism), the doctrine that the Persons of the Trinity are merely “modes,” aspects or offices of the one God.

d. The Industry Standard: Nicea and Beyond

There is enough doctrinal space between Arianism and Sabellianism to accommodate a range of theological accounts of the Trinity within the scope of Nicene orthodoxy. The Nicene formula declared that the Son was homoousios, “of the same substance” as the Father, which was elaborated by the Cappadocian Fathers in the dictum that the Persons of the Trinity were one ousia but three hypostases. This knocked out Arians on the one side and Sabellians on the other, but left room for a range of interpretations in between since “ousia” was, notoriously, ambiguous. Aristotle had used the term to designate both individuals, substances that are bearers of essences and properties, and the essential natures of individuals, the natural kinds in virtue of which they are substances in the first sense. So, individual human beings are substances in the first sense, and the human nature they share, the natural kind to which they belong, is a substance in the second sense.

The Nicene homoousios formula inherited the ambiguity. Understood in one way, the claim that the Persons of the Trinity were “homoousios” said that the Persons were the same individual, skating dangerously close to the Sabellian claim that they were “monoousios”—of one substance. Understood in the other way, it said merely that they were of the same kind, an interpretation compatible with tri-theism. The Cappadocians attempted to clarify and disambiguate the Nicene formula by employing the term “hypostasis,” used earlier by Origen, to capture the notion of individual identity rather than identity of kind. By itself, this did not solve the problem. First, apart from their revisionary theological usage, ousia and hypostasis were virtual synonyms: as a solution to the Trinity puzzle this formula was rather like saying that the Persons were one thing but different objects. Secondly, “one ousia” still failed to rule out tri-theism—indeed, in non-theological cases, one ousia, many hypostases is precisely what different individuals of the same species are. Homoousios, as intended, ruled out the doctrine that Father and Son were merely similar kinds of beings—homoiousios—but it did not rule out their being distinct individuals of the same kind.

The Cappadocian dictum, however, provided a framework for further discussion of the Trinity puzzle: the Trinitarian Persons were to be understood as being the same something but different something-elses and the substantive theological question was that of characterizing the ways in which they were bound together and individuated.

As to the latter question, Nicea opened the discussion of the “theology” of the Trinity, understood as the exploration of the relations amongst Persons—the “immanent Trinity” as distinct from the “economic Trinity,” that is, the Trinity understood in terms of the distinct roles of the Persons in their worldly activities, in creation, redemption and sanctification. Nicea cashed out the homoousios claim by noting that the Son was “begotten, not made,” indicating that he was, as noted in a parallel formula then current, “out of the Father’s ousia.” Furthermore, the Holy Spirit was declared at Constantinople to have the same sort of ontological status as the Son. So in the Fourth Century, at the Councils of Nicea and Constantinople, and through the work of the Cappadocians, the agenda for Trinitarian theology was set and the boundaries of orthodoxy were marked.

Within these parameters, the Trinity doctrine poses problems of three sorts: first, theological problems in reconciling theological doctrines concerning the character and properties of God with Trinitarian claims; secondly, theological puzzles that arise from Christological claims in particular; and finally logical puzzles posed by the Trinity doctrine itself. It remains to be seen whether it is possible to formulate a coherent doctrine of the Trinity within the constraints of Christian orthodoxy.

2. Theological Constraints

a. Monotheism

Christians claim to be monotheists and yet, given the doctrine of the Trinity, hold that there are three beings who are fully divine, viz. God the Father, Son and Holy Spirit. The first Trinity puzzle is that of explaining how we can attribute full divinity to the Persons of the Trinity without either compromising monotheism or undermining claims about the distinctness of Trinitarian persons.

Orthodox accounts of the Trinity hover uneasily between Sabellianism—which construes Trinitarian Persons as mere phases, aspects or offices of one God—and tri-theism, according to which the Persons are three Gods. Tri-theism is unacceptable since it is incompatible with the historical Christian commitment to monotheism inherited from the Hebrew tradition.

The fundamental problem for Trinitarian orthodoxy is to develop a doctrine of the Trinity that fits in the space between Sabellianism (or other versions of Monarchianism) and tri-theism. For Social Trinitarians in particular the problem has been one of articulating an account of the Trinity that affirms the individuality of the Persons and their relationships with one another without lapsing into tri-theism.

b. The Distinctness of Persons

Historically, Monarchianism—in particular Modalism (or Sabellianism), the doctrine that the Persons are “modes,” aspects, or roles of God—has been more tempting to Christians than tri-theism. The fundamental problem orthodox Latin Trinitarians face is that of maintaining a distinction between Trinitarian Persons sufficient to avoid Sabellianism, since orthodox Christians hold that the Persons of the Trinity are not merely aspects of God or God under different descriptions but in some sense distinct individuals such that Father ≠ Son ≠ Holy Spirit.

Christians hold that there are properties that distinguish the Persons. First, there are intra–Trinitarian relational properties the Persons have in virtue of their relations to other Trinitarian Persons: the Father begets the Son, but the Son does not beget the Son; the Spirit proceeds from the Father (and the Son) but neither the Father nor the Son proceeds from the Father (and the Son). Secondly, the Persons of the Trinity are distinguished in virtue of their distinctive “missions”—their activities in the world. The Second Person of the Trinity becomes incarnate, is born, suffers, dies, is buried, rises from the dead and ascends to the Father. According to orthodox doctrine, however, the same is not true of the Father (or Holy Spirit) and, indeed, the doctrine that the Father became incarnate, suffered and died is the heresy of patripassionism.

According to Latin Trinitarians, God, the Trinity, is an individual rather than a community of individuals sharing the same divine nature and each Person of the Trinity is that individual. Given this account, however, the trick is to block inferences from the ascription of properties characteristic of one Trinitarian Person to the ascription of those properties to other Persons. Moreover, since it is held that the Persons cannot be individuated by their worldly activities, Latin Trinitarians, whose project is to explain the distinctions between Persons, must develop an account of the intra–Trinitarian relations that distinguish them—a project which is at best speculative.

c. The Equality of Persons, the Descending Trinity and the Filioque Clause

Suppose that we tread the fine line and succeed in affirming both the participation of Trinitarian Persons in one God and their distinctness. Orthodoxy then requires, in addition, that we hold the Persons of the Trinity to be equal in power, knowledge, goodness and all properties pertaining to divinity other than those that are specific to the Persons individually. This poses problems when it comes to divine agency. Assuming that doing A and doing A* are equally good, it is logically possible that one Person may prefer A while another prefers A* (and that the third is, perhaps, indifferent). In the absence of a tie-breaker, it is hard to see how the Trinity can get anything done! If the Person who prefers A and the Person who prefers A* stick to their guns, neither can accomplish his end and so, it would seem, neither can count as omnipotent; if they defer to one another they also end up in a deadlock.

This is a difficulty for Social Trinitarians in particular insofar as they understand the Trinitarian Persons as distinct centers of consciousness and will whose projects might be incompatible. Swinburne, a Social Trinitarian, attempts to avoid this difficulty by suggesting that the Father, in virtue of his character as the Source of Trinitarian Persons, has the authority to “lay down the rules” so that irresolvable conflicts amongst Trinitarian Persons will be avoided (Swinburne, pp. 172-173). If however we assume that the preferences of one Trinitarian person take precedence so that the other Persons willingly defer to him as a matter of policy, then it is hard to avoid the suspicion that some Persons of the Trinity are “more equal than others”—the heresy of Subordinationism.

Even if Social Trinitarians avoid Subordinationism, the descending account of the Trinity, according to which the defining characteristic of the Father is that of being the Source of Trinitarian Persons, has theological ramifications which, in the end, resulted in the defining controversy between Eastern and Western churches concerning the Filioque clause. The original version of the Creed, formulated by the councils of Nicea and Constantinople, declares that the Holy Spirit proceeds from the Father (ek tou Patros ekporeuomenon). The Filioque Clause, affirming that the Holy Spirit proceeds from the Father and the Son (ex Patre Filioque procedit), which first appeared in the profession of faith formulated at the Council of Toledo in 589, spread throughout Gaul and eventually became normative in the West, was firmly rejected by the Eastern churches on the grounds that it undermined the doctrine that the Father was the Source of Trinitarian Persons and the personality of the Holy Spirit.

Photios, the 9th Century Patriarch of Constantinople who initiated the Photian Schism between East and West, argues in The Mystagogy of the Holy Spirit that the procession of the Holy Spirit from the Son as well as the Father implies that the Father is not up to the task of generating Trinitarian Persons. Either the Father can do the job on his own or he can’t. If he can, then the participation of the Son in the generation of the Holy Spirit is superfluous and so there is no reason to accept the Filioque Clause. If he can’t, then he is a theological failure, which is absurd. Photios, representing the Eastern tradition, assumes a descending account of the Trinity according to which the characteristic hypostatic property of the Father is his role as the Source of the other Trinitarian Persons. He assumes in addition that all properties of Trinitarian Persons are such that they are either generic properties of divinity, and so are shared by all Persons, or hypostatic properties possessed uniquely by the Persons they characterize. It follows from these assumptions that the Filioque Clause should be rejected.

Photios and other Eastern theologians also worried that the Western account of the Trinity undermined the personal character of the Holy Spirit. According to one metaphor, widely employed in the West, the Father, Son and Holy Spirit are analogous to the Lover, the Beloved and the Love between them. Love is not the sort of thing that can have psychological properties or count as a person, and so Eastern theologians charged that the “flat” Trinitarian picture that dominated Western Trinitarian theology, in which the Holy Spirit was understood as a relation or mediator between Father and Son, undermined the personhood of the Holy Spirit.

Is the “descending” picture at the heart of Eastern Trinitarian theology, according to which the Father is characteristically the progenitor of Trinitarian Persons, inherently subordinationist? It does not seem to be so since there is no compelling reason why we should regard the property of being the Source of Trinitarian persons as one that confers superior status or authority on its possessor. Some parents are smarter, better looking, and richer than their children; others are dumber, uglier, and poorer. When children are young their parents legitimately exercise authority over them; when they are grown up they become their parents’ peers. To the extent that the role of the Father as the Source of Trinitarian Persons is analogous to human parenthood there is no reason to regard the Father as in any respect superior to the other Persons and it is hard to see what other reason could be given for this view.

Nevertheless, the descending Trinity picture lends itself to subordinationist interpretations in a way that the flat Trinity model does not. So when, for example, Swinburne suggests that the Father’s essential character as Source of Trinitarian Persons confers on him the authority to resolve intra–Trinitarian disputes or entitles him to the deference of other Trinitarian Persons he is, at the very least, skating close to the edge of Subordinationism.

d. Personality

Finally, Christians hold that God is personal—the subject of psychological states. But what is personal: the Trinity in toto or the Persons individually? The Litany, which addresses the Persons individually and the Trinity in toto, suggests all of the above:

O God the Father, Creator of heaven and earth; Have mercy upon us.
O God the Son, Redeemer of the world; Have mercy upon us.
O God the Holy Ghost, Sanctifier of the faithful; Have mercy upon us.
O holy, blessed, and glorious Trinity, one God: Have mercy upon us.

But this does not seem to be a coherent position. If the Father, Son and Holy Spirit are distinct centers of consciousness, the sorts of beings to whom one can reasonably appeal for mercy, and the Trinity is a divine society as Social Trinitarians suggest, it would seem that the Trinity could not itself be personal in any robust sense. After invoking the Father, Son and Holy Ghost, the invocation of the Trinity seems superfluous—as if I were to ask permission to build a fence on our adjoining property lines from each of my neighbors and then get them together to ask permission of them as a group.

On the face of it, Latin Trinitarians have an easier time explaining what is personal: it is God, the Trinity; and the Persons are individually personal to the extent that each is God. The Father is God, so insofar as God, the Trinity, is personal, the Father is personal; the Son and Holy Spirit are God so they too are personal. The invocations in the Litany are indeed redundant because all four invoke no one other than God, but that is just a matter of poetic license. Nevertheless, some Christians, in particular Eastern Christians who are sympathetic to Social Trinitarianism, worry that some metaphors Latin Trinitarians exploit undermine the personal character of the Holy Spirit. In addition, Latin Trinitarianism makes it difficult to make sense of Gospel accounts of Jesus’ praying to the Father. Who was praying to whom? On the Latin Trinitarian account it seems that, insofar as we identify Jesus with the Second Person of the Trinity, God was simply talking to himself.

e. Christology and the Jesus Predicate Problem

The doctrine of the Trinity, as noted earlier, is motivated by the Christian conviction that Jesus was, in some sense, divine. Jesus, however, was born, suffered under Pontius Pilate, was crucified, died and was buried; he did not understand Chinese; he believed that David was the author of all the Psalms. These properties seem incompatible with divinity and, indeed, there appear to be a great many predicates that are true of Jesus which could not be true of God, and vice versa.

This is the Jesus Predicate Problem: we do not want to ascribe all the predicates that are true of Jesus to God simpliciter or, in particular, to God the Father. We do not, for example, want to hold that the Father suffered on the Cross—the heresy of Patripassionism. God, as traditionally understood, is impassible—he cannot be subject to suffering, pain or harm. Moreover, God has no beginning or end in time and is, according to most orthodox accounts, atemporal insofar as he is eternal rather than merely everlasting: he exists outside of time in what is, from the perspective of his subjectivity, the eternal now. Jesus, however, was born at a particular time and lived his life in time, so to maintain God’s atemporality we cannot allow predicates that assign temporal properties to Jesus to pass to God, or in particular to God the Father. In general, there are a range of predicates that are true of Jesus that, we want to hold, are not true of God the Father or of the Holy Spirit, and which we would hesitate to ascribe to God simpliciter insofar as they appear to be inconsistent with essential features of divinity.

To avoid the migration of Jesus’ predicates to other Persons of the Trinity, we need to create enough logical space between the Persons to block inferences from claims about Jesus to claims about the Father so that, in general, “Jesus Fs” does not entail “God the Father Fs” where “x Fs” says either that x has a property, is a certain kind of thing or does a certain kind of action. The trouble with Monarchian accounts, which make the Trinity “too tight,” is that they obliterate the logical space between the Persons that would block such inferences. Since Monarchians cannot use Trinitarian doctrine to block these inferences they use Christology to do the job—by either adopting very high Christologies or very low ones.  The wedge has to be driven somewhere and, if there isn’t enough logical space to drive it in between the First and Second Persons of the Trinity, it has to go in between the Second Person, the divine Logos which is from the beginning with God and is God, and whatever it is that is the subject of Jesus predicates.

One way to do this is via an ultra-high Christology according to which the troublesome Jesus predicates aren’t literally true of Christ the divine Logos but are true of something else—the human body he animates, a mere appearance or an imposter. To see how this works, consider Apollinarianism, an ultra-high Christology rejected at the Council of Constantinople in 381 and again at the Council of Chalcedon in 451, at which Christological doctrine was formulated. According to this heterodox view, the historical Jesus was a human being who had the Logos plugged into the place that would normally be occupied by a human rational soul. Christ is the Logos and, insofar as we ascribe such Jesus predicates as “___ suffered under Pontius Pilate,” “___ was crucified,” “___ died” and “___ was buried,” that is merely a façon de parler. Strictly speaking, what these predicates are true of is not Christ but only the body he used for a time to conduct his worldly operations. Consequently, they do not pass to the Logos or to other Persons of the Trinity, so there is no problem.

The other way to drive the wedge between the Father and the bearer of Jesus predicates is by adopting an ultra-low Christology, that is, by kicking Christ out of the Godhead altogether. Historically, this is the tack taken by Adoptionists, who held that the man Jesus became “Son of God” only by adoption and grace dispensed by God at his baptism, and the view held by contemporary quasi-Christians who deny the divinity of Christ. If Christ, the bearer of Jesus predicates, is not divine, problematic Jesus predicates do not pass to the Father, or to God simpliciter, so there is no problem.

Interestingly, Christians have historically rejected ultra-high Christologies on the grounds that they undermine soteriology. This concern was articulated by Gregory of Nazianzus in his critique of Apollinarianism by the dictum “non assumptus, non salus.” The idea is that God’s aim in becoming incarnate was to assume human nature in order to heal it—if Christ only seemed to be human that could not be accomplished. And if he only took on a human body and its vegetative and animal souls—the principles responsible for life, growth, locomotion and emotion—but not the rational soul of a human being, he would have left out the very component of humanness that was in need of healing, since it was precisely man’s rational nature that was corrupted by sin. Anselm makes the same point in Cur Deus Homo? Whatever we think of this sort of argument it was for this reason that Christians worried about Christologies that failed to recognize the full humanity of Christ.

Christians who could not accept either ultra-high or ultra-low Christologies attempted to circumvent the Jesus Predicate Problem by rejecting the ultra-tight Monarchian view of the Trinity. So, writing more than a century before Nicea, Hippolytus suggested that Heraclitean contradictions could be avoided by a Trinitarian doctrine that created enough logical space between the Persons to block inferences from the character of Christ, the Second Person of the Trinity, to claims about the character of the other Persons, the Father in particular. With sufficient logical space between the Persons, Christ’s vincibility, mortality and other properties that are prima facie incompatible with divinity or unworthy of a deity can be segregated so that they don’t transfer to the Father. Given a Subordinationist account on the descending model, according to which the Second Person is a semi-divine mediating figure, there is no problem assigning troublesome Jesus predicates to him.

The trouble is that once committed to the Nicene doctrine that Christ is wholly divine, consubstantial with and equal to the Father, “God of God, Light of Light, very God of very God,” the same problem arises all over again for the Second Person of the Trinity! If ascribing these properties to the Father is bad, ascribing them to the Son thus understood is just as bad. Historically, the Church’s way with Jesus predicate problems that threaten the doctrine of the Trinity has been to recharacterize them as Christological problems concerning the relation between Christ’s divine and human natures—which are beyond the scope of this essay.

We may ask, however, whether, once the Church’s Trinity theologians circumvent the Jesus Predicate Problem by passing the buck to the Christologists, there is any reason to worry about Modalism or other tight-Trinity doctrines that minimize the logical space between Persons. As we have seen, historically, the rationale for rejecting Sabellianism was the worry that it did not leave enough space to drive a wedge between Father and Son that would block inferences from “Jesus Fs” to “God the Father Fs.” If however we can contrive a theological account that blocks such inferences Christologically, by driving the wedge between the bearer of Jesus predicates and the Second Person of the Trinity—by, for example, distinguishing between Christ’s divine and human natures or between Christ qua human and Christ qua God—then there is no particular reason to worry about the space between Trinitarian Persons, and so it may be that Sabellianism is a more attractive proposition than it was initially thought to be.

3. Philosophical Puzzles and Solutions

For Christians, at least in the West, Quicunque Vult, commonly known as the Athanasian Creed, defines Trinitarian orthodoxy as follows:

We worship one God in Trinity, and Trinity in Unity, neither confounding the Persons, nor dividing the Substance
For there is one Person of the Father, another of the Son, and another of the Holy Ghost…
Such as the Father is, such is the Son, and such is the Holy Ghost…
[T]he Father is God, the Son is God, and the Holy Ghost is God.
And yet they are not three Gods, but one God

Christians are thus committed to the following claims:

(1) The Father is God

(2) The Son is God

(3) The Holy Spirit is God

(4) The Father is not the Son

(5) The Father is not the Holy Spirit

(6) The Son is not the Holy Spirit

(7) There is exactly one God

a. Trinity and Identity

Can one consistently believe (1) – (7)? It depends on how we read the “is” in (1) – (6). If we read it throughout as the “is” of strict identity, as “=” the answer is no. Identity is an equivalence relation: it is reflexive, symmetric and transitive, which is to say, for all x, y and z the following hold:

Reflexivity:           x = x

Symmetry:            If x = y then y = x

Transitivity:          If x = y and y = z then x = z

In addition, identity is an unrestricted indiscernibility relation for all properties, which is to say it obeys Leibniz’ Law, understood as the Indiscernibility of Identicals:

LL:                         If x = y then for all properties, P, x has P if and only if y has P

This is bad news. Suppose we read the “is” as “=” in (1) – (6). Then it follows from (1) and (2), by symmetry and transitivity, that the Father is the Son, which contradicts (4). Put another way, given LL, (1) entails that God has all the same properties as the Father, including the property of being identical with the Father insofar as everything has the property of self-identity. (2) says that the Son likewise has all the same properties as God. It follows that, since God has the property of being identical with the Father, the Son also has the property of being identical with the Father, which contradicts (4).
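
The argument can also be checked mechanically. The following is a minimal sketch in the Lean proof assistant, using hypothetical names f, s and g for the Father, the Son and God; it records only that, on the strict-identity reading, (1), (2) and (4) cannot all be true together:

```lean
-- A minimal sketch: under the strict-identity reading of "is",
-- (1), (2) and (4) are jointly inconsistent.
-- `f`, `s` and `g` are hypothetical names for the Father, the Son and God.
example (Person : Type) (f s g : Person)
    (h1 : f = g)        -- (1) The Father is God
    (h2 : s = g)        -- (2) The Son is God
    (h4 : f ≠ s)        -- (4) The Father is not the Son
    : False :=
  -- From (1) and (2), by symmetry and transitivity, f = s; this contradicts (4).
  h4 (h1.trans h2.symm)
```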

These formal features of identity are non-negotiable in the way that the four-sidedness of squares is: God cannot evade them any more than he can make a square with only three sides. God can make triangles—and pentagons, chiliagons or figures with any number of sides he pleases—but he cannot make such things squares. So, assuming that “God,” “Father,” “Son” and “Holy Spirit” don’t change their reference, the “is” that figures in (1) – (6) cannot be the “is” of strict identity.

b. The “Is” of Predication

In English, most occurrences of the word “is” do not express an identity. The “is” that occurs in (8) and (9) is the “is” of predication: it is used to ascribe a property to an object:

(8) Ducati is a dog.

(9) Ducati is canine.

(8) is not an identity statement because “a dog” does not pick out a particular object. Identity is a relation between objects; in particular, it is the relation that everything bears to itself and to no other thing. In a true identity statement, the nouns or noun phrases on either side of the identity pick out the very same thing. (10) and (11) are true identity statements:

(10) Ducati is the chocolate Lab at 613 Second Avenue.

(11) Ducati is Britynic Cadbury of Bourneville.

“The chocolate Lab at 613 Second Avenue” and “Britynic Cadbury of Bourneville” each pick out a particular dog (as it happens, the same dog that “Ducati” picks out), but “a dog” does not. (8) in fact says the same thing as (9): it says that Ducati has the property of being a dog, that is, the property of being canine. The “is” in (8), like the “is” in (9), is therefore the “is” of predication.

Now consider (1) – (3) understanding the “is” that occurs in each sentence as the “is” of predication to yield:

(1′) The Father is a God

(2′) The Son is a God

(3′) The Holy Spirit is a God

The “is” of predication does not express an equivalence relation and, in general, “x has P” and “y has P” do not imply “x is identical to y.” Ducati is a dog and Riley is a dog but it does not follow that Ducati is (identical to) Riley—in fact they are not. Similarly, (1′) and (2′) do not imply that the Father is the Son so there is no contradiction.

However, (1′) – (3′) just say that the Father, Son and Holy Spirit are each divine, in the way that (8) just says that Ducati is canine, and this leaves open the possibility that there are two or three Gods involved. They do not explain what makes the Persons one God or provide any rationale for (7). Furthermore, together with (4) – (6), it seems to follow that there are indeed three Gods, just as it follows from “Ducati is a dog,” “Riley is a dog” and “Ducati is not Riley” that there are (at least) two dogs.
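
The counting worry can be made equally explicit. The following minimal sketch, in the same style as before, uses a hypothetical predicate Divine for “is a God” and hypothetical names f, s and sp for the three Persons; it records that the predication reading, together with the distinctness claims (4) – (6), yields three pairwise distinct divine beings, which is in obvious tension with (7):

```lean
-- A minimal sketch: on the predication reading, (1')-(3') plus (4)-(6)
-- yield three pairwise distinct divine beings.
-- `Divine` is a hypothetical predicate for "is a God"; `f`, `s` and `sp`
-- are hypothetical names for the Father, the Son and the Holy Spirit.
example (Person : Type) (Divine : Person → Prop) (f s sp : Person)
    (h1 : Divine f) (h2 : Divine s) (h3 : Divine sp)   -- (1')-(3')
    (h4 : f ≠ s) (h5 : f ≠ sp) (h6 : s ≠ sp)           -- (4)-(6)
    : ∃ x y z, Divine x ∧ Divine y ∧ Divine z ∧ x ≠ y ∧ x ≠ z ∧ y ≠ z :=
  -- The three Persons themselves witness the claim.
  ⟨f, s, sp, h1, h2, h3, h4, h5, h6⟩
```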

This is the concern Gregory of Nyssa addressed in his response to Ablabius, who worried that understanding the unity of Trinitarian persons in terms of their sharing the property of divinity implied Tri-theism:

The argument which you state is something like this: Peter, James, and John, being in one human nature, are called three men: and there is no absurdity in describing those who are united in nature, if they are more than one, by the plural number of the name derived from their nature. If, then, in the above case, custom admits this, and no one forbids us to speak of those who are two as two, or those who are more than two as three, how is it that in the case of our statements of the mysteries of the Faith, though confessing the Three Persons, and acknowledging no difference of nature between them, we are in some sense at variance with our confession, when we say that the Godhead of the Father and of the Son and of the Holy Ghost is one, and yet forbid men to say “there are three Gods”? The question is, as I said, very difficult to deal with. (Gregory of Nyssa, “To Ablabius”)

This is a difficult question indeed.

c. Divine Stuff: ‘God’ as a Mass Term

Gregory proposed the following analogy by way of a solution:

That which is not thus circumscribed is not enumerated, and that which is not enumerated cannot be contemplated in multitude. For we say that gold, even though it be cut into many figures, is one, and is so spoken of, but we speak of many coins or many staters, without finding any multiplication of the nature of gold by the number of staters; and for this reason we speak of gold, when it is contemplated in greater bulk, either in plate or in coin, as “much,” but we do not speak of it as “many golds” on account of the multitude of the material,-except when one says there are “many gold pieces” (Darics, for instance, or staters), in which case it is not the material, but the pieces of money to which the significance of number applies: indeed, properly, we should not call them “gold” but “golden.” As, then, the golden staters are many, but the gold is one, so too those who are exhibited to us severally in the nature of man, as Peter, James, and John, are many, yet the man in them is one. (Gregory of Nyssa. “To Ablabius”)

What Gregory has noticed here is that “gold” is a mass term rather than a count noun. Mass terms have a number of features that distinguish them from count nouns: in particular, they do not take the plural, so to that extent, as Gregory remarks, “gold…is one.” Intuitively, count nouns designate “things” while mass terms designate “stuffs”—gold, water, oatmeal and the like.

However, Gregory has inferred that human nature and, by analogy, divinity, should be understood as stuff too, so that, just as there is one gold, parceled up into bits that are not properly speaking “gold” but merely golden, there is just one man, parceled up into bits each of which is not, properly speaking, man but merely human.

Richard Cartwright dismisses this solution very quickly as desperate, heretical and unhelpful:

It seems to have been left to Gregory of Nyssa, Basil’s younger brother, to notice that, thus understood, consubstantiality of the Father, the Son, and the Holy Spirit appears to license saying that there are three Gods.  Gregory himself rather desperately suggested that strictly speaking there is only one man. But besides being itself heretical, the suggestion is of no help. (Richard Cartwright. “On the Logical Problem of the Trinity” in Richard Cartwright, Philosophical Essays. MIT Press, 1987. P. 171)

Nevertheless, it may be possible to push a little further along this line. Even though, intuitively, we think of mass nouns as designating more or less homogeneous stuffs, without perceptible, discrete but continuous parts, “mass noun” is a grammatical category and determines not the character of what it designates but how we talk about it. The designata of some mass nouns have quite large, readily perceptible discrete parts. Consider “fruit,” which in English typically functions as a mass noun: the plural form, “fruits,” is not ill-formed but it is rare and occurs primarily in idioms like “by their fruits ye shall know them”; we say “a lot of fruit” but only rarely “a few fruits” or “many fruits.” Perhaps most tellingly, “fruit” takes what are called “sortalizing auxiliary nouns,” devices that attach to mass terms to yield noun phrases that behave like count nouns: so we talk about “bodies of water,” “piles of sand” and, notably, “pieces of fruit.” From the grammatical point of view, Gregory’s revisionary proposal is in order: we can, by an act of linguistic legislation, decide to treat, perhaps for convenience, “human” as a mass term designating a spatially extended but gappy object, so that Peter, James and John are not, properly speaking, humans but rather pieces of humanity, a stuff which consists of Peter, James, John and all their fellows pooled together.

Perhaps the Trinity puzzle can be resolved by an account along the lines of Gregory’s proposal, according to which we understand God as a concrete but non-spatio-temporal whole, whose simple, non-spatio-temporal parts are the Trinitarian Persons. If so, then, noting that parts need not be spatio-temporal, we might reconstruct (1) – (7) as follows:

(1”) The Father is a part of God

(2”) The Son is a part of God

(3”) The Holy Spirit is a part of God

(4”) The Father is not the same part of God as the Son

(5”) The Father is not the same part of God as the Holy Spirit

(6”) The Son is not the same part of God as the Holy Spirit

(7) There is exactly one God

(1”) – (7) are clearly consistent. Moreover, if we remember that “God” is being treated as a mass term, designating all the divinity there is, in the way that “water” designates all the world’s water, of which lakes, rivers and puddles are parts, there is no difficulty in holding that the Persons are equally divine. Every body of water, however small, is thoroughly H2O: the humblest puddle is as watery as the Pacific Ocean and so, to that extent, water is wholly present in each of its parts. Similarly, we can say that each Person is thoroughly God: divinity is wholly present in each of its (non-spatio-temporal) parts.
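
Why (1”) – (7) are consistent can be seen from a small model, offered only as an illustrative sketch: let there be a single whole g together with exactly three non-spatio-temporal parts f, s and h.

Parts:    f, s and h are each parts of g          [(1”) – (3”)]

Distinctness:    f is not s, f is not h, s is not h          [(4”) – (6”)]

One God:    counting wholes of divinity rather than parts, g is the one and only God          [(7)]

Since “God,” on the mass-term reading, designates all the divinity there is, counting Gods is counting wholes rather than parts, and the model verifies all seven claims.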

d. Relative Identity

Gregory’s proposal has not received widespread attention. However, a comparable proposal, viz. that the “is” in (1) – (6) be construed as designating relative identity relations, has been widely discussed, and solutions to the Trinity puzzle that make this move have been proposed by Peter Geach and, more recently, by Peter van Inwagen.

According to Geach, identity statements of the form “x is identical with y” are incomplete: they are elliptical for “x is the same F as y,” where F is a sortal term, that is, a count noun that conveys criteria of identity. So common nouns like “table,” “man,” and “set” are sortals: grammatically they are count nouns and semantically they, in effect, provide instructions about how to identify the objects falling under them, how to chop out the hunk of the world they fill, how to distinguish them from other objects and how to trace their histories to determine when (if ever) they come into being and when (if ever) they cease to exist. Defenders of the relative identity thesis suggest that we cannot obey the instruction to “count all the things in this room” because “thing” does not convey identity criteria. If I am to count things, I need to know what sorts of things I should count. If I am asked whether this is the same as that, before I can answer I have to ask, “The same what?”

Geach notes further that, where F and G are sortals, it is possible to have cases where some x and y are the same F but not the same G. So, for example, we may want to say that 2/3 is the same rational number as 4/6 but not the same ordered pair of integers or that two copies of Ulysses are the same literary work but not the same book.

Finally, sortal-relative-identity relations are equivalence relations but they are not indiscernibility relations for all properties unrestrictedly. For any sortal-relative-identity relation, being-the-same-F-as, there is a set of predicates, SF, the indiscernibility set for F, such that for any predicate P ∈ SF, if x is the same F as y then x has P if and only if y has P. For predicates P* ∉ SF, the inferences from x is the same F as y and x has P* to y has P* and vice versa do not go through.
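To make the restriction concrete, consider again the two copies of Ulysses (the particular predicates are chosen only for illustration). “Was written by Joyce” plausibly belongs to the indiscernibility set for literary work: if x is the same literary work as y and x was written by Joyce, then so was y. “Is a hardback,” by contrast, need not belong to that set: from “x is the same literary work as y” and “x is a hardback,” it does not follow that y is a hardback. Schematically:

P ∈ SF:    from “x is the same F as y” and “x has P,” infer “y has P”

P* ∉ SF:   from “x is the same F as y” and “x has P*,” one may not infer “y has P*”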

Now, as regards the Trinity puzzle, we note that “god” and “person” are sortals and hence that, given Geach’s suggestion, the following claims are consistent:

(1R) The Father is the same god as God

(2R) The Son is the same god as God

(3R) The Holy Spirit is the same god as God

(4R) The Father is not the same divine person as the Son

(5R) The Father is not the same divine person as the Holy Spirit

(6R) The Son is not the same divine person as the Holy Spirit

The relative identity account of Trinitarian claims is similar to the reconstruction of Trinitarian claims in (1”) – (6”) insofar as both rely on the strategy of invoking different relations in the first and last three statements: the relations of being-part-of-the-same-whole-as and being-the-same-part-as are different from one another, as are the relations of being-the-same-god-as and being-the-same-divine-person-as. Consequently, (1R) – (6R) are consistent with (7). Sortals, as noted, provide rules for counting. Counting by book, two copies of Ulysses count as two; counting by literary work, they count as one. Similarly, the suggestion is that counting by divine person, the Father, Son and Holy Spirit count as three, but counting by god they count as one, and so we can affirm (7): there is exactly one God. The relative identity strategy thus avoids Tri-theism.

The relative identity strategy also circumvents the Jesus Predicate Problem, at least to the extent that we want to block inferences from “The Son Fs” to “The Father Fs” for a range of predicates including “became incarnate,” “was crucified,” “suffered under Pontius Pilate” and the like. To block objectionable inferences, we note that these predicates do not fall within the indiscernibility set for divine person, and so the relative identity strategy avoids Patripassianism.

In addition, on this account, we can explain why (1R) – (3R) entail that the Father, Son and Holy Spirit each have those properties that are constitutive of full divinity. We note that “is omnipotent,” “is omniscient,” “is omnibenevolent” and other generically divine predicates are in the indiscernibility set for god, so that, given that God has the properties they designate, we may infer that the same is true of the Father, Son and Holy Spirit. Intuitively, there are generically divine properties, designated by predicates in the indiscernibility set for god, which all Trinitarian Persons have in virtue of (1R) – (3R), and there are hypostatic properties which each Person has in virtue of being the Person he is.

Relative identity is, however, a controversial doctrine in its own right and, even if we accept the metaphysical baggage it carries, may not be suitable for theological purposes. So Michael Rea worries that relative identity commits one to a theologically disastrous antirealism:

Many philosophers are attracted to antirealism, but accepting it as part of a solution to the problem of the Trinity is disastrous.  For clearly orthodoxy will not permit us to say that the very existence of Father, Son, and Holy Spirit is a theory-dependent matter.  Nor will it permit us to say that the distinctness of the divine Persons is somehow relative to our ways of thinking or theorizing. The latter appears to be a form of modalism. And yet it is hard to see how it could be otherwise if Geach’s theory of relative identity is true. For what else could it possibly mean to say that there is simply no fact about whether Father, Son, and Holy Spirit are the same thing as one another, the same thing as God, or, indeed, the same thing as Baal. (Michael Rea, “Relative Identity and the Doctrine of the Trinity,” Philosophia Christi vol. 5, No. 2)

e. The Trinity and Other Identity Puzzles

The logical problem of the Trinity arises because, as we have seen in 3.a, (1) – (7) are inconsistent if the “is” that figures in them is interpreted as the “is” of (absolute) identity. In this respect, the Trinity puzzle is comparable to a range of puzzles concerning the identity of ordinary material objects.

One range of such puzzles concerns the problem of material composition. A lump of clay is made into a statue. The statue and the lump occupy exactly the same spatial region, so we want to say that they are the same thing and that there is just one material object in the region “they” occupy: we balk at the idea of more than one material object occupying exactly the same place. But the statue and the clay do not have all the same properties: the statue was formed by the sculptor but the lump was not; the lump can survive the most radical changes in shape, including changes that would transform it into a different statue, but the statue cannot. Consequently, we cannot hold that there is a statue and a lump of clay and that they are strictly identical without falling afoul of Leibniz’s Law. We want to say that the statue and clay count as one material object but we are barred from holding that they are strictly identical. In this respect the problem of material composition poses the same problem as the Trinity doctrine: we want to say the Persons are one God but we are barred, in this case by theological concerns, from saying that they are strictly identical.
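
The conflict with Leibniz’s Law can be set out explicitly (the abbreviation Fx, for “x was formed by the sculptor,” and the constants s and l, for the statue and the lump, are introduced only for illustration):

Fs & ∼Fl

∀x∀y[x = y → (Fx ↔ Fy)]     (Leibniz’s Law)

Therefore, ∼(s = l)

Strict identity would require the statue and the lump to share all their properties; since they do not, they cannot be strictly identical, even though we want to count them as one material object.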

The problem posed by material composition and other identity puzzles, including the Ship of Theseus and the problem of the dividing self which figures in discussions of personal identity, is that there are a great many cases where we want to say that objects x and y are the same thing but where the relation between x and y violates the formal features of identity—either because it is one-many rather than one-one or because it is not an unrestricted indiscernibility relation. And this is precisely the problem posed by the doctrine of the Trinity.

It was noted above that the proposal in 3.b, that the “is” in (1) – (3) should be interpreted as the “is” of predication, is also unacceptable because it is tri-theistic. It was also noted that the accounts suggested in 3.c and 3.d are not overtly incoherent but ultimately depend, respectively, on whether a mereology and an account of relative identity are workable. The relative identity account has been discussed extensively in the literature. The worry about the relative identity account is not that it fails to produce the right results as regards the doctrine of the Trinity, but that relative identity is itself a questionable business and in any case carries metaphysical baggage that may be theologically unacceptable.

The moral of this story should perhaps be that “identity,” as Frege famously remarked, “gives rise to challenging questions which are not altogether easy to answer” (Gottlob Frege, “On Sense and Reference”). For all that critics have ridiculed the doctrine of the Trinity as a prime example of the absurdity of Christian doctrine—as the late Bishop Pike did when he suggested that the Trinity was “a sort of committee god”—Trinity talk is no worse off than much non-theological talk about the identities of non-divine persons and ordinary material objects.

4. References and Further Reading

  • Augustine. “On the Trinity.” The Early Church Fathers. Christian Classics Ethereal Library.
  • Baber, H. E. “Sabellianism Reconsidered.” Sophia vol. 41, No. 2 (October 2002): 1-18.
  • Baber, H. E. “Trinity, Filioque and Semantic Ascent” forthcoming in Sophia.
  • Bobrinskoy, Boris. The Mystery of the Trinity. Crestwood, NY: St. Vladimir’s Seminary Press, 1999.
  • Brower, Jeffrey E. and Michael C. Rea. “Material Constitution and the Trinity.” Faith and Philosophy 22 (2005): 57-76.
  • Brown, David. The Divine Trinity. London: Duckworth, 1985.
  • Cartwright, Richard. “On the Logical Problem of the Trinity.” In Philosophical Essays. The MIT Press, 1987.
  • Davis, Stephen T., Kendall, Daniel, S.J., and O’Collins, Gerald, S.J., eds. The Trinity. An Interdisciplinary Symposium on the Trinity. Oxford: Oxford University Press, 1999.
  • Gregory of Nyssa. “To Ablabius.” The Early Church Fathers. Christian Classics Ethereal Library.
  • Hebblethwaite, Brian. Philosophical Theology and Christian Doctrine. Oxford: Blackwell Publishing Ltd, 2005. Esp. Ch. 5: “Trinity.”
  • Hippolytus. Against All Heresies, Book IX. The Early Church Fathers. Christian Classics Ethereal Library.
  • Leftow, Brian. “Anti Social Trinitarianism.” In The Trinity: An Interdisciplinary Symposium on the Trinity, ed. Feenstra, R. J. and Plantinga, C. Notre Dame: University of Notre Dame Press, 1989.
  • Peters, Ted. God as Trinity. Louisville, KY: Westminster/John Knox Press, 1993.
  • Photios, Patriarch of Constantinople. On the Mystagogy of the Holy Spirit. Astoria, NY: Studion Publishers, Inc., 1983.
  • Rea, Michael C. “Relative Identity and the Doctrine of the Trinity.” Philosophia Christi vol. 5, No. 2 (2003): 431-445.
  • Rusch, William G., ed. The Trinitarian Controversy. Philadelphia: Fortress Press, 1980.
  • Stead, C. Divine Substance. Oxford: The Clarendon Press, 1977.
  • Studer, Basil. Trinity and Incarnation. Collegeville, MN: The Liturgical Press, 1993.
  • Swinburne, Richard. The Christian God. Oxford: Oxford University Press, 1994.
  • Van Inwagen, Peter. “And yet there are not three Gods but one God.” In Philosophy and the Christian Faith, ed. T. V. Morris. Notre Dame: University of Notre Dame Press, 1988.
  • Yandell, K. E. “The most brutal and inexcusable error in counting?: Trinity and consistency.” Religious Studies 30 (1994): 201-17.

Author Information

H.E. Baber
Email: baber@usd.edu
University of San Diego
U. S. A.

The Indispensability Argument in the Philosophy of Mathematics

In his seminal 1973 paper, “Mathematical Truth,” Paul Benacerraf presented a problem facing all accounts of mathematical truth and knowledge.  Standard readings of mathematical claims entail the existence of mathematical objects. But, our best epistemic theories seem to deny that knowledge of mathematical objects is possible. Thus, the philosopher of mathematics faces a dilemma: either abandon standard readings of mathematical claims or give up our best epistemic theories.  Neither option is attractive.

The indispensability argument in the philosophy of mathematics is an attempt to avoid Benacerraf’s dilemma by showing that our best epistemology is consistent with standard readings of mathematical claims.  Broadly speaking, it is an attempt to justify knowledge of an abstract mathematical ontology using only a strictly empiricist epistemology.

The indispensability argument in the philosophy of mathematics, in its most general form, consists of two premises.  The major premise states that we should believe that mathematical objects exist if we need them in our best scientific theory.  The minor premise claims that we do in fact require mathematical objects in our scientific theory.  The argument concludes that we should believe in the abstract objects of mathematics.

This article begins with a general overview of the problem of justifying our mathematical beliefs that motivates the indispensability argument.  The most prominent proponents of the indispensability argument have been W.V. Quine and Hilary Putnam.  The second section of the article discusses a reconstruction of Quine’s argument in detail.  Quine’s argument depends on his general procedure for determining a best theory and its ontic commitments, and on his confirmation holism.  The third and fourth sections of the article discuss versions of the indispensability argument which do not depend on Quine’s method: one from Putnam and one from Michael Resnik.

The relationship between constructing a best theory and producing scientific explanations has recently become a salient topic in discussions of the indispensability argument.  The fifth section of the article discusses a newer version of the indispensability argument, the explanatory indispensability argument.  The last four sections of the article are devoted to a general characterization of indispensability arguments over various versions, a brief discussion of the most prominent responses to the indispensability argument, a distinction between inter- and intra-theoretic indispensability arguments, and a short conclusion.

Table of Contents

  1. The Problem of Beliefs About Mathematical Objects
  2. Quine’s Indispensability Argument
    1. A Best Theory
    2. Believing Our Best Theory
    3. Quine’s Procedure for Determining Ontic Commitments
    4. Mathematization
  3. Putnam’s Success Argument
  4. Resnik’s Pragmatic Indispensability Argument
  5. The Explanatory Indispensability Argument
  6. Characteristics of Indispensability Arguments in the Philosophy of Mathematics
  7. Responses to the Indispensability Argument
  8. Inter-theoretic and Intra-theoretic Indispensability Arguments
  9. Conclusion
  10. References and Further Reading

1. The Problem of Beliefs About Mathematical Objects

Most of us have a lot of mathematical beliefs. For example, we might believe that the tangent to a circle intersects the radius of that circle at right angles, that the square root of two can not be expressed as the ratio of two integers, and that the set of all subsets of a given set has more elements than the given set. Most of us also believe that those claims refer to mathematical objects such as circles, integers, and sets. Regarding all these mathematical beliefs, the fundamental question motivating the indispensability argument is, “How can we justify our mathematical beliefs?”

Mathematical objects are in many ways unlike ordinary physical objects such as trees and cars. We learn about ordinary objects, at least in part, by using our senses. It is not obvious that we learn about mathematical objects this way. Indeed, it is difficult to see how we could use our senses to learn about mathematical objects. We do not see integers, or hold sets. Even geometric figures are not the kinds of things that we can sense. Consider any point in space; call it P. P is only a point, too small for us to see, or otherwise sense. Now imagine a precise fixed distance away from P, say an inch and a half. The collection of all points that are exactly an inch and a half away from P is a sphere. The points on the sphere are, like P, too small to sense. We have no sense experience of the geometric sphere. If we tried to approximate the sphere with a physical object, say by holding up a ball with a three-inch diameter, some points on the edge of the ball would be slightly further than an inch and a half away from P, and some would be slightly closer. The sphere is a mathematically precise object. The ball is rough around the edges. In order to mark the differences between ordinary objects and mathematical objects, we often call mathematical objects “abstract objects.”
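
The sphere of the example can be described with mathematical exactness, which brings out the contrast with any physical ball (the set-builder notation below is standard and is added only by way of illustration):

S = { x : the distance from x to P is exactly 1.5 inches }

Every point of S lies precisely an inch and a half from P; no physical ball is bounded by points that all satisfy that condition exactly.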

When we study geometry, the theorems we prove apply directly and exactly to mathematical objects, like our sphere, and only indirectly and approximately to physical objects, like our ball. Numbers, too, are insensible. While we might see or touch a bowl of precisely eighteen grapes, we see and taste the grapes, not the eighteen. We can see a numeral, “18,” but that is the name for a number, just as the term “Russell” is my name and not me. We can sense the elements of some sets, but not the sets themselves. And some sets are sets of sets, abstract collections of abstract objects. Mathematical objects are not the kinds of things that we can see or touch, or smell, taste or hear. If we can not learn about mathematical objects by using our senses, a serious worry arises about how we can justify our mathematical beliefs.

The question of how we can justify our beliefs about mathematical objects has long puzzled philosophers. One obvious way to try to answer our question is to appeal to the fact that we prove the theorems of mathematics. But, appealing to mathematical proofs will not solve the problem. Mathematical proofs are ordinarily construed as derivations from fundamental axioms. These axioms, such as the Zermelo-Fraenkel axioms for set theory, the Peano axioms for number theory, or the more familiar Euclidean axioms for geometry, refer to the same kinds of mathematical objects as the theorems derived from them. Our question remains about how we justify our beliefs in the axioms.

For simplicity, in addressing our question of how we can justify our beliefs about mathematical objects, we will consider only sets. Set theory is generally considered the most fundamental mathematical theory. All statements of number theory, including those concerning real numbers, can be written in the language of set theory. Through the method of analysis, all geometric theorems can be written as algebraic statements, where geometric points are represented by real numbers. Claims from all other areas of mathematics can be written in the language of set theory, too.
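
One standard way of carrying out such reductions (there are others) identifies the natural numbers with the von Neumann ordinals and ordered pairs—and hence the coordinates used in analytic geometry—with Kuratowski pairs, all of which are sets:

0 = ∅,   1 = {∅},   2 = {∅, {∅}},   3 = {∅, {∅}, {∅, {∅}}},  …

(a, b) = {{a}, {a, b}}

Each number n is the set of all smaller numbers, and a point in the plane can be treated as an ordered pair of real numbers, themselves definable set-theoretically.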

Sets are abstract objects, lacking any spatio-temporal location. Their existence is not contingent on our existence. They lack causal efficacy. Our question, then, given that we lack sense experience of sets, is how we can justify our beliefs about sets and set theory.

There are a variety of distinct answers to our question. Some philosophers, called rationalists, claim that we have a special, non-sensory capacity for understanding mathematical truths, a rational insight arising from pure thought. But, the rationalist’s claims appear incompatible with an understanding of human beings as physical creatures whose capacities for learning are exhausted by our physical bodies. Other philosophers, called logicists, argue that mathematical truths are just complex logical truths. In the late nineteenth and early twentieth centuries, the logicists Gottlob Frege, Alfred North Whitehead, and Bertrand Russell attempted to reduce all of mathematics to obvious statements of logic, for example, that every object is identical to itself, or that if p then p. But, it turns out that we can not reduce mathematics to logic without adding substantial portions of set theory to our logic. A third group of philosophers, called nominalists or fictionalists, deny that there are any mathematical objects; if there are no mathematical objects, we need not justify our beliefs about them.

The indispensability argument in the philosophy of mathematics is an attempt to justify our mathematical beliefs about abstract objects, while avoiding any appeal to rational insight. Its most significant proponent was Willard van Orman Quine.

2. Quine’s Indispensability Argument

Though Quine alludes to an indispensability argument in many places, he never presented a detailed formulation. For a selection of such allusions, see Quine 1939, 1948, 1951, 1955, 1958, 1960, 1978, and 1986a. This article discusses the following version of the argument:

QI: QI1. We should believe the theory which best accounts for our sense experience.
QI2. If we believe a theory, we must believe in its ontic commitments.
QI3. The ontic commitments of any theory are the objects over which that theory first-order quantifies.
QI4. The theory which best accounts for our sense experience first-order quantifies over mathematical objects.
QIC. We should believe that mathematical objects exist.

An ontic commitment to object o is a commitment to believing that o exists. First-order quantification is quantification in standard predicate logic. One presumption behind QI is that the theory which best accounts for our sense experience is our best scientific theory. Thus, Quine naturally defers much of the work of determining what exists to scientists. While it is obvious that scientists use mathematics in developing their theories, it is not obvious why the uses of mathematics in science should lead us to believe in the existence of abstract objects. For example, when we study the interactions of charged particles, we rely on Coulomb’s Law, which states that the electromagnetic force F between two charged particles 1 and 2 is proportional to the product of the charges q₁ and q₂ on the particles and inversely proportional to the square of the distance r between them.

CL:    F = k∣q₁q₂∣ / r² ,   where the electrostatic constant k ≈ 9 × 10⁹ N·m²/C²

∣q₁q₂∣ is the absolute value of the product of the two charges. Notice that CL refers to a real number, k, and employs mathematical functions such as multiplication and absolute value. Still, we use Coulomb’s Law to study charged particles, not to study mathematical objects, which have no effect on those particles. The plausibility of Quine’s indispensability argument thus depends on both (i) Quine’s claim that the evidence for our scientific theories transfers to the mathematical elements of those theories, which is implicit in QI1, and (ii) his method for determining the ontic commitments of our theories at QI3 and QI4. The method underlying Quine’s argument involves gathering our physical laws and writing them in a canonical language of first-order logic. The commitments of this formal theory may be found by examining its quantifications.
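
A single worked instance shows how thoroughly the law traffics in numbers: for two charges of one coulomb each, separated by one meter (values chosen purely for illustration),

F = (9 × 10⁹ N·m²/C²) × ∣1 C × 1 C∣ / (1 m)² ≈ 9 × 10⁹ N.

The inputs and the output are real numbers, and the law itself expresses a function from numbers to numbers, even though what it describes are charged particles.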

The remainder of this section discusses each of the premises of QI in turn.

a. A Best Theory

The first premise of QI is that we should believe the theory which best accounts for our sense experience, that is, we should believe our best scientific theory.

Quine’s belief that we should defer all questions about what exists to natural science is really an expression of what he calls, and has come to be known as, naturalism. He describes naturalism as, “[A]bandonment of the goal of a first philosophy. It sees natural science as an inquiry into reality, fallible and corrigible but not answerable to any supra-scientific tribunal, and not in need of any justification beyond observation and the hypothetico-deductive method” (Quine 1981: 72).

Quine’s naturalism was developed in large part as a response to logical positivism, which is also called logical empiricism or just positivism. Positivism requires that any justified belief be constructed out of, and be reducible to, claims about observable phenomena. We know about ordinary objects like trees because we have sense experience, or sense data, of trees directly. We know about very small or very distant objects, despite having no direct sense experience of them, by having sense data of their effects, say electron trails in a cloud chamber. For the positivists, any scientific claim must be reducible to sense data.

Instead of starting with sense data and reconstructing a world of trees and persons, Quine assumes that ordinary objects exist. Further, Quine starts with an understanding of natural science as our best account of the sense experience which gives us beliefs about ordinary objects. Traditionally, philosophers believed that it was the job of epistemology to justify our knowledge. In contrast, the central job of Quine’s naturalist is to describe how we construct our best theory, to trace the path from stimulus to science, rather than to justify knowledge of either ordinary objects or scientific theory.

Quine’s rejection of positivism included the insight, now known as confirmation holism, that individual sentences are only confirmed in the context of a broader theory. Confirmation holism arises from an uncontroversial observation that any sentence s can be assimilated without contradiction to any theory T, as long as we adjust truth values of any sentences of T that conflict with s. These adjustments may entail further adjustments, and the new theory may in the end look quite different than it did before we accommodated s. But we can, as a matter of logic, hold on to any sentence come what may. And, we are not forced to hold on to any statement, come what may; there are no unassailable truths.

For a simple example, suppose I have a friend named Abigail. I have a set of beliefs, which we can call a theory of my friendship with her. (A theory is just a collection of sentences.) Suppose that I overhear Abigail saying mean things about me. New evidence conflicts with my old theory. I have a choice whether to reject the evidence (for example, “I must have mis-heard”) or to accommodate the evidence by adjusting my theory, giving up the portions about Abigail being my friend, say, or about my friends not saying mean things about me. Similarly, when new astronomical evidence in the 15th and 16th centuries threatened the old geocentric model of the universe, people were faced with a choice of whether to accept the evidence, giving up beliefs about the Earth being at the center of the universe, or to reject the evidence. Instead of requiring that individual experiences are each independently assessed, confirmation holism entails that there are no justifications for particular claims independent of our entire best theory. We always have various options for restoring an inconsistent theory to consistency.

Confirmation holism entails that our mathematical theories and our scientific theories are linked, and that our justifications for believing in science and mathematics are not independent. When new evidence conflicts with our current scientific theory, we can choose to adjust either scientific principles or mathematical ones. Evidence for the scientific theory is also evidence for the mathematics used in that theory.

The question of how we justify our beliefs about mathematical objects arose mainly because we could not perceive them directly. By rejecting positivism’s requirement for reductions of scientific claims to sense data, Quine allows for beliefs in mathematical objects despite their abstractness. We do not need sensory experience of mathematical objects in order to justify our mathematical beliefs. We just need to show that mathematical objects are indispensable to our best theory.

QI1 may best be seen as a working hypothesis in the spirit of Ockham’s Razor. We look to our most reliable endeavor, natural science, to tell us what there is. We bring to science a preference that it account for our entrenched esteem for ordinary experience. And we posit no more than is necessary for our best scientific theory.

b. Believing Our Best Theory

The second premise of QI states that our belief in a theory naturally extends to the objects which that theory posits.

Against QI2, one might think that we could believe a theory while remaining agnostic or instrumentalist about whether its objects exist. Physics is full of fictional idealizations, like infinitely long wires, centers of mass, and uniform distributions of charge. Other sciences also posit objects that we do not really think exist, like populations in Hardy-Weinberg equilibrium (biology), perfectly rational consumers (economics), and average families (sociology). We posit such ideal objects to facilitate using a theory. We might believe our theory, while recognizing that the objects to which it refers are only ideal. If we hold this instrumentalist attitude toward average families and infinitely long wires, we might hold it toward circles, numbers and sets, too.

Quine argues that any discrepancy between our belief in a theory and our beliefs in its objects is illegitimate double-talk. One can not believe in only certain elements of a theory which one accepts. If we believe a theory which says that there are centers of mass, then we are committed to those centers of mass. If we believe a theory which says that there are electrons and quarks and other particles too small to see, then we are committed to such particles. If our best theory posits mathematical objects, then we must believe that they exist.

QI1 and QI2 together entail that we should believe in all of the objects that our best theory says exist. Any particular evidence applies to the whole theory, which produces its posits uniformly. Quine thus makes no distinction between justifications of observable and unobservable objects, or between physical and mathematical objects. All objects, trees and electrons and sets, are equally posits of our best theory, to be taken equally seriously. “To call a posit a posit is not to patronize it” (Quine 1960: 22).

There will be conflict between our currently best theory and the better theories that future science will produce. Those better theories are, of course, not yet available. Yet, what exists does not vary with our best theory. Thus, any current expression of our commitments is at best speculative. We must have some skepticism toward our currently best theory, if only due to an inductive awareness of the transience of such theories.

On the other hand, given Quine’s naturalism, we have no better theory from which to evaluate the posits of our currently best theory. There is no external, meta-scientific perspective. We know, casually and meta-theoretically, that our current theory will be superseded, and that we will give up some of our current beliefs; but we do not know how our theory will be improved, and we do not know which beliefs we will give up. The best we can do is believe the best theory we have, and believe in its posits, and have a bit of humility about these beliefs.

c. Quine’s Procedure for Determining Ontic Commitments

The first two premises of QI entail that we should believe in the posits of our best theory, but they do not specify how to determine precisely what the posits of a theory are. The third premise appeals to Quine’s general procedure for determining the ontic commitments of a theory. Anyone who wishes to know what to believe exists, in particular whether to believe that mathematical objects exist, needs a general method for determining ontic commitment. Rather than relying on brute observations, Quine provides a simple, broadly applicable procedure. First, we choose a best theory. Next, we regiment that theory in first-order logic with identity. Last, we examine the domain of quantification of the theory to see what objects the theory needs to come out as true.

Quine’s method for determining our commitments applies to any theory—theories which refer to trees, electrons and numbers, and theories which refer to ghosts, caloric and God.

We have already discussed how the first step in Quine’s procedure applies to QI. The second step of Quine’s procedure for determining the commitments of a theory refers to first-order logic as a canonical language. Quine credits first-order logic with extensionality, efficiency, elegance, convenience, simplicity, and beauty (Quine 1986: 79 and 87). Quine’s enthusiasm for first-order logic largely derives from various attractive technical virtues. In first-order logic, a variety of definitions of logical truth concur: in terms of logical structure, substitution of sentences or of terms, satisfaction by models, and proof. First-order logic is complete, in the sense that any valid formula is provable. Every consistent first-order theory has a model. First-order logic is compact, which means that any set of first-order axioms will be consistent if every finite subset of that set is consistent. It admits of both upward and downward Löwenheim-Skolem theorems, which mean that every theory which has an infinite model will have a model of every infinite cardinality (upward) and that every theory which has an infinite model of any cardinality will have a denumerable model (downward). (See Mendelson 1997: 377.)

Less technically, the existential quantifier in first-order logic is a natural equivalent of the English term “there is,” and Quine proposes that all existence claims can and should be made by existential sentences of first-order logic. “The doctrine is that all traits of reality worthy of the name can be set down in an idiom of this austere form if in any idiom” (Quine 1960: 228).

We should take first-order logic as our canonical language only if:

A.   We need a single canonical language;

B.   It really is adequate; and

C.   There is no other adequate language.

Condition A arises almost without argument from QI1 and QI2. One of Quine’s most striking and important innovations was his linking of our questions about what exists with our concerns when constructing a canonical formal language. When we regiment our correct scientific theory correctly, Quine argues, we will know what we should believe exists. “The quest of a simplest, clearest overall pattern of canonical notation is not to be distinguished from a quest of ultimate categories, a limning of the most general traits of reality” (Quine 1960: 161).

Whether condition B holds depends on how we use our canonical language. First-order logic is uncontroversially useful for what Quine calls semantic ascent. When we ascend, we talk about words without presuming that they refer to anything; we can deny the existence of objects without seeming to commit to them. For example, on some theories of language, sentences which contain terms that do not refer to real things are puzzling. Consider:

CP:     The current president of the United States does not have three children.

TF:     The tooth fairy does not exist.

If CP is to be analyzed as saying that there is a current president who lacks the property of having three children, then by parity of reasoning TF seems to say that there is a tooth fairy that lacks the attribute of existence. This analysis comes close to interpreting the reasonable sentence TF as a contradiction saying that there is something that is not.

In contrast, we can semantically ascend, claiming that the term “the tooth fairy” does not refer. TF is conveniently regimented in first-order logic, using “T” to stand for the property of being the tooth fairy. “∼(∃x)Tx” carries with it no implication that the tooth fairy exists. Similar methods can be applied to more serious existence questions, like whether there is dark energy or an even number greater than two which is not the sum of two primes. Thus, first-order logic provides a framework for settling disagreements over existence claims.
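
The contrast with CP is instructive. Regimenting CP along Russellian lines (with “Px” for “x is a current president of the United States” and “Cx” for “x has three children,” abbreviations introduced only for this illustration) gives

CP:    (∃x)(Px & (∀y)(Py → y = x) & ∼Cx)

which does commit us to the existence of a unique current president, whereas “∼(∃x)Tx” commits us to nothing at all.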

Against the supposed adequacy of first-order logic, there are expressions whose first-order logical regimentations are awkward at best. Leibniz’s law, that identical objects share all properties, seems to require a second-order formulation. Similarly, H resists first-order logical treatment.

H:     Some book by every author is referred to in some essay by every critic (Hintikka 1973: 345).

H may be adequately handled by branching quantifiers, which are not elements of first-order logic.
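
Leibniz’s law, for its part, is most naturally written with a quantifier binding a predicate variable, which is precisely what first-order logic forbids; the standard second-order formulation is

∀x∀y[x = y → ∀P(Px ↔ Py)]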

There are other limitations on first-order logic. Regimenting a truth predicate in first-order logic leads naturally to the paradox of the liar. And propositional attitudes, such as belief, create opaque contexts that prevent natural substitutions of identicals otherwise permitted by standard first-order inference rules. Still, defenders of first-order logic have proposed a variety of solutions to these difficulties, which may be due not to first-order logic itself but to deeper problems with language.

For condition C, Quine argues that no other language is adequate for canonical purposes. Ordinary language appears to be too sloppy, in large part due to its use of names to refer to objects. We use names to refer to some of the things that exist: “Muhammad Ali,” “Jackie Chan,” “The Eiffel Tower.” But some names, such as “Spiderman,” do not refer to anything real. Some things, such as most insects and pebbles, lack names. Some things, such as most people, have multiple names. We could clean up our language, constructing an artificial version in which everything has exactly one name. Still, in principle, there will not be enough names for all objects. As Cantor’s diagonal argument shows, there are more real numbers than there are available names for those numbers. If we want a language in which to express all and only our commitments, we have to look beyond languages which rely on names.

Given the deficiencies of languages with names and Quine’s argument that the existential quantifier is a natural formal equivalent of “there is,” the only obvious remaining alternatives to first-order logic as a canonical language are higher-order logics. Higher-order logics have all of the expressive powers of first-order logic, and more. Most distinctly, where first-order logic allows variables only in the object position (i.e. following a predicate), second-order logic allows variables in the predicate positions as well, and introduces quantifiers to bind those predicates. Logics of third and higher order allow further predication and quantification. As they raise no significant philosophical worries beyond those concerning second-order logic, this discussion will focus solely on second-order logic.

To see how second-order logic works, consider the inference R.

R: R1. There is a red shirt.
R2. There is a red hat.
RC. So, there is something (redness) that some shirt and some hat share.

RC does not follow from R1 and R2 in first-order logic, but it does follow in second-order logic. If we let Sx mean that x is a shirt, Hx that x is a hat, and Rx that x is red, and let P be a variable ranging over properties, then we have the following valid second-order inference RS:

RS: R1S. (∃x)(Sx & Rx)
R2S. (∃x)(Hx & Rx)
RCS. (∃P)(∃x)(∃y)(Sx & Hy & Px & Py)

Accommodating inferences such as R by extending one’s logic to second-order might seem useful. But, higher-order logics allow us to infer, as a matter of logic, that there is some thing, presumably the property of redness, that the shirt and the hat share. It is simple common sense that shirts and hats exist. It is a matter of significant philosophical controversy whether properties like redness exist. Thus, a logic which permits an inference like RS is controversial.

Quine’s objection to higher-order logics, and thus his defense of first-order logic, is that we are forced to admit controversial elements as interpretations of predicate variables. Even if we interpret predicate variables in the least controversial way, as sets of objects that have those properties, higher-order logics demand sets. Thus, Quine calls second-order logic, “Set theory in sheep’s clothing” (Quine 1986a: 66). Additionally, higher-order logics lack many of the technical virtues, such as completeness and compactness, of first-order logic. (For a defense of second-order logic, see Shapiro 1991.)

Once we settle on first-order logic as a canonical language, we must specify a method for determining the commitments of a theory in that language. Reading existential claims seems straightforward. For example, R2 says that there is a thing which is a hat, and which is red. But, theories do not determine their own interpretations. Quine relies on standard Tarskian model-theoretic methods to interpret first-order theories. On a Tarskian interpretation, or semantics, we ascend to a metalanguage to construct a domain of quantification for a given theory. We consider whether sequences of objects in the domain, taken as values of the variables bound by the quantifiers, satisfy the theory’s statements, or theorems. The objects in the domain that make the theory come out true are the commitments of the theory. “To be is to be a value of a variable” (Quine 1939: 50) is how Quine summarizes the point. (For an accessible discussion of Tarskian semantics, see Tarski 1944).
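
A toy case, invented purely for illustration, shows the procedure at work. Take a theory whose only theorem is the earlier sentence R2S, (∃x)(Hx & Rx). A domain containing a single object a, with “H” and “R” both true of a, satisfies the theorem. More generally, the theorem comes out true only if some value of the variable “x” is both a hat and red; so the theory’s ontic commitment is to at least one red hat, and to nothing further.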

Quine’s procedure for determining the commitments of a theory can prevent us from prejudging what exists. We construct scientific theory in the most effective and attractive way we can. We balance formal considerations, like the elegance of the mathematics involved, with an attempt to account for the broadest sensory evidence. The more comprehensive and elegant the theory, the more we are compelled to believe it, even if it tells us that the world is not the way we thought it is. If the theory yields a heliocentric model of the solar system, or the bending of rays of light, then we are committed to heliocentrism or bent light rays. Our commitments are the byproducts of this neutral process.

d. Mathematization

The final step of QI involves simply looking at the domain of the theory we have constructed. When we write our best theory in our first-order language, we discover that the theory includes physical laws which refer to functions, sets, and numbers. Consider again Coulomb’s Law: F = k∣q₁q₂∣ / r². Here is a partial first-order regimentation which suffices to demonstrate the commitments of the law, using ‘Px’ for ‘x is a charged particle’.

CLR:     ∀x∀y{(Px & Py) → ∃f [f(q(x), q(y), d(x,y), k) = F]}       where F = k ∣q(x) q(y)∣ / d(x,y)²

In addition to the charged particles over which the universal quantifiers in front range, there is an existential quantification over a function, f. This function maps numbers [the Coulomb’s Law constant k, and measurements of charge q(x) and q(y), and distance d] to other numbers (measurements of force F between the particles).

In order to ensure that there are enough sets to construct these numbers and functions, our ideal theory must include set-theoretic axioms, perhaps those of Zermelo-Fraenkel set theory, ZF.

The full theory of ZF is unnecessary for scientific purposes; there will be some sets which are never needed and some numbers which are never used to measure any real quantity. But, we take a full set theory in order to make our larger theory as elegant as possible. We can derive from the axioms of any adequate set theory a vast universe of sets. So, our regimented theory, which includes CLR together with set-theoretic axioms, contains or entails a host of existential mathematical claims. According to QI, we should believe that these mathematical objects exist.
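
To give a sense of what such axioms look like in the canonical notation, here is one of the simpler Zermelo-Fraenkel axioms, the axiom of pairing (the choice of this particular axiom is illustrative only):

∀x∀y∃z∀w[w ∈ z ↔ (w = x ∨ w = y)]

The existential quantifier “∃z” already commits the theory to a set—the pair {x, y}—for any objects x and y whatsoever, and the remaining axioms generate a far larger universe of sets.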

Examples such as CLR abound. Real numbers are used for measurement throughout physics, and other sciences. Quantum mechanics makes essential use of Hilbert spaces and probability functions. The theory of relativity invokes the hyperbolic space of Lobachevskian geometry. Economics is full of analytic functions. Psychology uses a wide range of statistics.

Opponents of the indispensability argument have developed sophisticated strategies for re-interpreting apparently ineliminable uses of mathematics, especially in physics. Some reinterpretations use alethic modalities (necessity and possibility) to replace mathematical objects. Others replace numbers with space-time points or regions. It is quite easy, but technical, to rewrite first-order theories in order to avoid quantifying over mathematical objects. It is less easy to do so while maintaining Quine’s canonical language of first-order logic.

For example, Hartry Field’s reformulation of Newtonian gravitational theory (Field 1980; discussed below in section 7) replaces the real numbers which are ordinarily used to measure fundamental properties such as mass and momentum with relations among regions of space-time. Field replaces the “2” in the claim “The beryllium sphere has a mass of 2 kg” with a ratio of space-time regions, one twice as long as the other. In order to construct the proper ratios of space-time regions, and having no mathematical axioms at his disposal, Field’s project requires either second-order logic or axioms of mereology, both of which are controversial extensions of first-order logic. For an excellent survey of dispensabilist strategies, and further references, see Burgess and Rosen 1997; for more recent work, see Melia 1998 and Melia 2000.

Quine’s indispensability argument depends on controversial claims about believing in a single best theory, finding our commitments by using a canonical first-order logic, and the ineliminability of mathematics from scientific theories. Other versions of the argument attempt to avoid some of these controversial claims.

3. Putnam’s Success Argument

In his early work, Hilary Putnam accepted Quine’s version of the indispensability argument, but he eventually differed with Quine on a variety of questions. Most relevantly, Putnam abandoned Quine’s commitment to a single, regimented, best theory; and he urged that realism in mathematics can be justified by its indispensability for correspondence notions of truth (which require set-theoretic relations) and for formal logic, especially for metalogical notions like derivability and validity which are ordinarily treated set-theoretically.

The position Putnam calls realism in mathematics is ambiguous between two compatible views. Sentence realism is the claim that sentences of mathematics can be true or false. Object realism is the claim that mathematical objects exist. Most object realists are sentence realists, though some sentence realists, including some structuralists, deny object realism. Indispensability arguments may be taken to establish either sentence realism or object realism. Quine was an object realist. Michael Resnik presents an indispensability argument for sentence realism; see section 4. This article takes Putnam’s realism to be both object and sentence realism, but nothing said below depends on that claim.

Realism contrasts most obviously with fictionalism, on which there are no mathematical objects, and many mathematical sentences considered to be true by the realist are taken to be false. To understand the contrast between realism and fictionalism, consider the following two paradigm mathematical claims, the first existential and the second conditional.

E:   There is a prime number greater than any you have ever thought of.

C:   The consecutive angles of any parallelogram are supplementary.

The fictionalist claims that mathematical existence claims like E are false since prime numbers are numbers and there are no numbers. The standard interpretation of C is as a universally quantified conditional: if any two angles are consecutive angles of a parallelogram, then they are supplementary. If there are no mathematical objects, then standard truth-table semantics for the material conditional entail that every instance of C has a false antecedent and so is true. The fictionalist claims that conditional statements which refer to mathematical objects, such as C, are only vacuously true, if true.
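
The point about vacuous truth is easiest to see in the regimentation of C (with “Pxy” for “x and y are consecutive angles of some parallelogram” and “Sxy” for “x and y are supplementary,” abbreviations introduced only for this illustration):

C:    ∀x∀y(Pxy → Sxy)

If there are no angles or parallelograms, then nothing satisfies “Pxy,” every instance of the conditional has a false antecedent, and C comes out true, but only vacuously—which is all the fictionalist is prepared to grant.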

Putnam’s non-Quinean indispensability argument, the success argument, is a defense of realism over fictionalism, and other anti-realist positions. The success argument emphasizes the success of science, rather than the construction and interpretation of a best theory.

PS: PS1. Mathematics succeeds as the language of science.
PS2. There must be a reason for the success of mathematics as the language of science.
PS3. No positions other than realism in mathematics provide a reason.
PSC. So, realism in mathematics must be correct.

Putnam’s success argument for mathematics is analogous to his success argument for scientific realism. The scientific success argument relies on the claim that any position other than realism makes the success of science miraculous. The mathematical success argument claims that the success of mathematics can only be explained by a realist attitude toward its theorems and objects. “I believe that the positive argument for realism [in science] has an analogue in the case of mathematical realism. Here too, I believe, realism is the only philosophy that doesn’t make the success of the science a miracle” (Putnam 1975a: 73).

One potential criticism of any indispensability argument is that by making the justification of our mathematical beliefs depend on our justification for believing in science, our mathematical beliefs become only as strong as our confidence in science. It is notoriously difficult to establish the truth of scientific theory. Some philosophers, such as Nancy Cartwright and Bas van Fraassen, have argued that science, or much of it, is false, in part due to idealizations. (See Cartwright 1983 and van Fraassen 1980.) The success of science may be explained by its usefulness, without presuming that scientific theories are true.

Still, even if science were only useful rather than true, PS claims that our mathematical beliefs may be justified by the uses of mathematics in science. The problems with scientific realism focus on the incompleteness and error of contemporary scientific theory. These problems need not infect our beliefs in the mathematics used in science. A tool may work fine, even on a broken machine. One could deny or remain agnostic towards the claims of science, and still attempt to justify our mathematical beliefs using Putnam’s indispensability argument.

The first two premises of PS are uncontroversial, so Putnam’s defense of PS focuses on its third premise. His argument for that premise is essentially a rejection of the argument that mathematics could be indispensable, yet not true. “It is silly to agree that a reason for believing that p warrants accepting p in all scientific circumstances, and then to add ‘but even so it is not good enough’” (Putnam 1971: 356).

For the Quinean holist, Putnam’s argument for PS3 has some force. Such a holist has no external perspective from which to evaluate the mathematics in scientific theory as merely useful. The holist can not say, “Well, I commit to mathematical objects within scientific theory, but I don’t really mean that they exist.”

In contrast, the opponent of PS may abandon the claim that our most sincere commitments are found in the quantifications of our single best theory. Instead, such an opponent might claim that only objects which have causal relations to ordinary physical objects exist. Such a critic is free to deny that mathematical objects exist, despite their utility in science, and nothing in PS prevents such a move, in the way that QI1-QI3 do for Quine’s original argument.

More importantly, any account of the applicability of mathematics to the natural world other than the indispensabilist’s refutes PS3. For example, Mark Balaguer’s plenitudinous platonism claims that mathematics provides a theoretical apparatus which applies to all possible states of the world. (See Balaguer 1998.) It explains the applicability of mathematics to the natural world, non-miraculously, since any possible state of the natural world will be described by some mathematical theory.

Similarly, and more influentially, Hartry Field has argued that the reason that mathematics is successful as the language of science is because it is conservative over nominalist versions of scientific theories. (See Field 1980, especially the preliminary remarks and Chapter 1.) In other words, Field claims that mathematics is just a convenient shorthand for a theory which includes no mathematical axioms.

In response, one could amend PS3 to improve Putnam’s argument:

PS3*: Realism best explains the success of mathematics as the language of science.

A defense of the new argument, PS*, would entail showing that realism is a better explanation of the utility of mathematics than other options.

4. Resnik’s Pragmatic Indispensability Argument

Michael Resnik, like Putnam, presents both a holistic indispensability argument, such as Quine’s, and a non-holistic argument called the pragmatic indispensability argument. In the pragmatic argument, Resnik first links mathematical and scientific justification.

RP: RP1. In stating its laws and conducting its derivations, science assumes the existence of many mathematical objects and the truth of much mathematics.
RP2. These assumptions are indispensable to the pursuit of science; moreover, many of the important conclusions drawn from and within science could not be drawn without taking mathematical claims to be true.
RP3. So, we are justified in drawing conclusions from and within science only if we are justified in taking the mathematics used in science to be true.
RP4. We are justified in using science to explain and predict.
RP5. The only way we know of using science thus involves drawing conclusions from and within it.
RPC. So, by RP3, we are justified in taking mathematics to be true (Resnik 1997: 46-8).

RP, like PS, avoids the problems that may undermine our confidence in science. Even if our best scientific theories are false, their undeniable practical utility still justifies our using them. RP states that we need to presume the truth of mathematics even if science is merely useful. The key premises for RP, then, are the first two. If we can also take mathematics to be merely useful, then those premises are unjustified. The question for the proponent of RP, then, is how to determine whether science really presumes the existence of mathematical objects, and mathematical truth. How do we determine the commitments of scientific theory?

We could ask scientists about their beliefs, but they may work without considering the question of mathematical truth at all. Like PS, RP seems vulnerable to the critic who claims that the same laws and derivations in science can be stated while taking mathematics to be merely useful. (See Azzouni 2004, Leng 2005, and Melia 2000.) The defender of RP needs a procedure for determining the commitments of science that blocks such a response, if not a more general procedure for determining our commitments. RP may thus be best interpreted as a reference back to Quine’s holistic argument, which provided both.

5. The Explanatory Indispensability Argument

Alan Baker and Mark Colyvan have defended an explanatory indispensability argument (Mancosu 2008: section 3.2; see also Baker 2005 and Lyon and Colyvan 2008).

EI: EI1. There are genuinely mathematical explanations of empirical phenomena.
EI2. We ought to be committed to the theoretical posits in such explanations.
EIC. We ought to be committed to the entities postulated by the mathematics in question.

EI differs from Quine’s original indispensability argument QI, and other versions of the argument, by seeking the justification for our mathematical beliefs in scientific explanations which rely on mathematics, rather than in scientific theories. Mathematical explanation and scientific explanation are difficult and controversial topics, beyond the scope of this article. Still, two comments are appropriate.

First, it is unclear whether EI is intended as a greater demand on the indispensabilist than the standard indispensability argument. Does the platonist have to show that mathematical objects are indispensable in both our best theories and our best explanations, or just in one of them? Conversely, must the nominalist dispense with mathematics in both theories and explanations?

Second, EI, like Putnam’s success argument and Resnik’s pragmatic argument, leaves open the question of how one is supposed to determine the commitments of an explanation. EI2 refers to the theoretical posits postulated by explanations, but does not tell us how we are supposed to figure out what an explanation posits. If the commitments of a scientific explanation are found in the best scientific theory used in that explanation, then EI is no improvement on QI. If, on the other hand, EI is supposed to be a new and independent argument, its proponents must present a new and independent criterion for determining the commitments of explanations.

Given the development of the explanatory indispensability argument and the current interest in mathematical explanation, it is likely that more work will be done on these questions.

6. Characteristics of Indispensability Arguments in the Philosophy of Mathematics

Quine’s indispensability argument relies on specific claims about how we determine our commitments, and on what our canonical language must be. Putnam’s success argument, Resnik’s pragmatic argument, and the explanatory indispensability argument are more general, since they do not specify a particular method for determining the commitments of a theory. Putnam and Resnik maintain that we are committed to mathematics because of the ineliminable role that mathematical objects play in scientific theory. Proponents of the explanatory argument argue that our mathematical beliefs are justified by the role that mathematics plays in scientific explanations.

Indispensability arguments in the philosophy of mathematics can be quite general, and can rely on supposedly indispensable uses of mathematics in a wide variety of contexts. For instance, in later work, Putnam defends belief in mathematical objects for their indispensability in explaining our mathematical intuitions. (See Putnam 1994: 506.) Since he thinks that our mathematical intuitions derive exclusively from our sense experience, this later argument may still be classified as an indispensability argument.

Here are some characteristics of many indispensability arguments in the philosophy of mathematics, no matter how general:

Naturalism: The job of the philosopher, as of the scientist, is exclusively to understand our sensible experience of the physical world.
Theory Construction: In order to explain our sensible experience we construct a theory, or theories, of the physical world. We find our commitments exclusively in our best theory or theories.
Mathematization: Some mathematical objects are ineliminable from our best theory or theories.
Subordination of Practice: Mathematical practice depends for its legitimacy on natural scientific practice.

Although the indispensability argument is a late twentieth century development, earlier philosophers may have held versions of the argument. Mark Colyvan classifies arguments from both Frege and Gödel as indispensability arguments, on the strength of their commitments to Theory Construction and Mathematization. (See Colyvan 2001: 8-9.) Both Frege and Gödel, though, deny Naturalism and Subordination of Practice, so they are not indispensabilists according to the characterization in this section.

7. Responses to the Indispensability Argument

The most influential approach to denying the indispensability argument is to reject the claim that mathematics is essential to science. The main strategy for this response is to introduce scientific or mathematical theories which entail all of the consequences of standard theories, but which do not refer to mathematical objects. Such nominalizing strategies break into two groups.

In the first group are theories which show how to eliminate quantification over mathematical objects within scientific theories. Hartry Field has shown how we can reformulate some physical theories to quantify over space-time regions rather than over sets. (See Field 1980 and Field 1989.) According to Field, mathematics is useful because it is a convenient shorthand for more complicated statements about physical quantities. John Burgess has extended Field’s work. (See Burgess 1984, Burgess 1991a, and Burgess and Rosen 1997.) Mark Balaguer has presented steps toward nominalizing quantum mechanics. (See Balaguer 1998.)

The second group of nominalizing strategies attempts to reformulate mathematical theories to avoid commitments to mathematical objects. Charles Chihara (Chihara 1990), Geoffrey Hellman (Hellman 1989), and Hilary Putnam (Putnam 1967b and Putnam 1975a) have all explored modal reformulations of mathematical theories. Modal reformulations replace claims about mathematical objects with claims about possibility.

Another line of criticism of the indispensability argument is that the argument is insufficient to generate a satisfying mathematical ontology. For example, no scientific theory requires any more than ℵ1 sets; we don’t need anything nearly as powerful as the full ZFC hierarchy. But, standard set theory entails the existence of much larger cardinalities. Quine calls such un-applied mathematics “mathematical recreation” (Quine 1986b: 400).

The indispensabilist can justify extending mathematical ontology a little bit beyond those objects explicitly required for science, for simplicity and rounding out. But few indispensabilists have shown interest in justifying beliefs in, say, inaccessible cardinals. (Though, see Colyvan 2007 for such an attempt.) Thus, the indispensabilist has a restricted ontology. Similarly, the indispensability argument may be criticized for making mathematical epistemology a posteriori, rather than  a priori, and for classifying mathematical truths as contingent, rather than necessary. Indispensabilists may welcome these departures from traditional interpretations of mathematics. (For example, see Colyvan 2001, Chapter 6.)

8. Inter-theoretic and Intra-theoretic Indispensability Arguments

Indispensability arguments need not be restricted to the philosophy of mathematics. Considered more generally, an indispensability argument is an inference to the best explanation which transfers evidence for one set of claims to another. If the transfer crosses disciplinary lines, we can call the argument an inter-theoretic indispensability argument. If evidence is transferred within a theory, we can call the argument an intra-theoretic indispensability argument. The indispensability argument in the philosophy of mathematics transfers evidence from natural science to mathematics. Thus, this argument is an inter-theoretic indispensability argument.

One might apply inter-theoretic indispensability arguments in other areas. For example, one could argue that we should believe in gravitational fields (physics) because they are ineliminable from our explanations of why zebras do not go flying off into space (biology). We might think that biological laws reduce, in some sense, to physical laws, or we might think that they are independent of physics, or supervenient on physics. Still, our beliefs in some basic claims of physics seem indispensable to other sciences.

As an example of an intra-theoretic indispensability argument, consider the justification for our believing in the existence of atoms. Atomic theory makes accurate predictions which extend to the observable world. It has led to a deeper understanding of the world, as well as further successful research. Despite our lacking direct perception of atoms, they play an indispensable role in atomic theory. According to atomic theory, atoms exist. Thus, according to an intra-theoretic indispensability argument, we should believe that atoms exist.

As an example of an intra-theoretic indispensability argument within mathematics, consider Church’s Thesis. Church’s Thesis claims that our intuitive notion of an algorithm is equivalent to the technical notion of a recursive function. Church’s Thesis is not provable, in the ordinary sense. But, it might be defended by using an intra-theoretic indispensability argument: Church’s Thesis is fruitful, and, arguably, indispensable to our understanding of mathematics.  For another example, Quine’s argument for QI2, that we must believe in the commitments of any theory we accept, might itself also be called an intra-theoretic indispensability argument.

9. Conclusion

There are at least three ways of arguing for empirical justification of mathematics. The first is to argue, as John Stuart Mill did, that mathematical beliefs are about ordinary, physical objects to which we have sensory access. The second is to argue that, although mathematical beliefs are about abstract mathematical objects, we have sensory access to such objects. (See Maddy 1990.) The currently most popular way to justify mathematics empirically is to argue:

A. Mathematical beliefs are about abstract objects;

B. We have experiences only with physical objects; and yet

C. Our experiences with physical objects justify our mathematical beliefs.

This is the indispensability argument in the philosophy of mathematics.

10. References and Further Reading

  • Azzouni, Jody.  2004. Deflating Existential Consequence: A Case for Nominalism. Oxford University Press.
  • Azzouni, Jody.  1998. “On ‘On What There Is’.”  Pacific Philosophical Quarterly 79: 1-18.
  • Azzouni, Jody.  1997b. “Applied Mathematics, Existential Commitment, and the Quine-Putnam Indispensability Thesis.”  Philosophia Mathematica (3) 5: 193-209.
  • Baker, Alan.  2005. “Are there Genuine Mathematical Explanations of Physical Phenomena?” Mind: 114: 223-238.
  • Baker, Alan.  2003. “The Indispensability Argument and Multiple Foundations for Mathematics.” The Philosophical Quarterly 53.210: 49-67.
  • Baker, Alan.  2001. “Mathematics, Indispensability and Scientific Practice.” Erkenntnis 55: 85-116.
  • Balaguer, Mark.  1998. Platonism and Anti-Platonism in Mathematics. New York: Oxford University Press.
  • Balaguer, Mark.  1996. “Toward a Nominalization of Quantum Mechanics.”  Mind 105: 209-226.
  • Bangu, Sorin Ioan.  2008. “Inference to the Best Explanation and Mathematical Realism.” Synthese 160: 13-20.
  • Benacerraf, Paul.  1973.  “Mathematical Truth.”   In Paul Benacerraf and Hilary Putnam, eds., Philosophy of Mathematics: Selected Readings, second edition, Cambridge: Cambridge University Press, 1983.
  • Burgess, John.  1991b. “Synthetic Physics and Nominalist Realism.”  In C. Wade Savage and Philip Ehrlich, eds., Philosophical and Foundational Issues in Measurement Theory, Hillsdale: Lawrence Erlbaum Associates, 1992.
  • Burgess, John.  1991a. “Synthetic Mechanics Revisited.” Journal of Philosophical Logic 20: 121-130.
  • Burgess, John.  1984. “Synthetic Mechanics.” Journal of Philosophical Logic 13: 379-395.
  • Burgess, John.  1983. “Why I am Not a Nominalist.”  Notre Dame Journal of Formal Logic 24.1: 93-105.
  • Burgess, John, and Gideon Rosen.  1997. A Subject with No Object. New York: Oxford University Press.
  • Cartwright, Nancy.  1983. How the Laws of Physics Lie. Oxford: Clarendon Press.
  • Chihara, Charles.  1990. Constructibility and Mathematical Existence. Oxford: Oxford University Press.
  • Colyvan, Mark.  2007.  “Mathematical Recreation versus Mathematical Knowledge.”  In Leng, Mary, Alexander Paseau and Michael Potter, eds, Mathematical Knowledge, Oxford University Press, 109-122.
  • Colyvan, Mark.  2002. “Mathematics and Aesthetic Considerations in Science.”  Mind 111: 69-78.
  • Colyvan, Mark.  2001. The Indispensability of Mathematics. Oxford University Press.
  • Field, Hartry.  1989. Realism, Mathematics, and Modality. Oxford: Basil Blackwell.
  • Field, Hartry.  1980. Science Without Numbers. Princeton: Princeton University Press.
  • Frege, Gottlob.  1953. The Foundations of Arithmetic. Evanston: Northwestern University Press.
  • Gödel, Kurt.  1963. “What is Cantor’s Continuum Problem?” In Paul Benacerraf and Hilary Putnam, eds., Philosophy of Mathematics: Selected Readings, second edition. Cambridge: Cambridge University Press, 1983.
  • Gödel, Kurt.  1961. “The Modern Development of the Foundations of Mathematics in the Light of Philosophy.”  In Solomon Feferman et al., eds., Kurt Gödel: Collected Works, Vol. III. New York: Oxford University Press, 1995.
  • Hellman, Geoffrey.  1989. Mathematics Without Numbers. New York: Oxford University Press.
  • Hintikka, Jaakko.  1973. “Quantifiers vs Quantification Theory.”  Dialectica 27.3: 329-358.
  • Leng, Mary.  2005.  “Mathematical Explanation.”  In Cellucci, Carlo and Donald Gillies eds, Mathematical Reasoning and Heuristics, King’s College Publications, London, 167-189.
  • Lyon, Aidan and Mark Colyvan.  2008. “The Explanatory Power of Phase Spaces.” Philosophia Mathematica 16.2: 227-243.
  • Maddy, Penelope.  1992. “Indispensability and Practice.”  The Journal of Philosophy 89: 275-289.
  • Maddy, Penelope.  1990. Realism in Mathematics. Oxford: Clarendon Press.
  • Mancosu, Paolo.  2008. “Explanation in Mathematics.”  The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.).
  • Marcus, Russell.  2007. “Structuralism, Indispensability, and the Access Problem.” Facta Philosophica 9, 2007: 203-211.
  • Melia, Joseph.  2002. “Response to Colyvan.”  Mind 111: 75-79.
  • Melia, Joseph.  2000. “Weaseling Away the Indispensability Argument.”  Mind 109: 455-479.
  • Melia, Joseph.  1998. “Field’s Programme: Some Interference.”  Analysis 58.2: 63-71.
  • Mendelson, Elliott.  1997. Introduction to Mathematical Logic, 4th ed.  Chapman & Hall/CRC.
  • Mill, John Stuart.  1941. A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. London, Longmans, Green.
  • Putnam, Hilary.  1994.  “Philosophy of Mathematics: Why Nothing Works.” In his Words and Life. Cambridge: Harvard University Press.
  • Putnam, Hilary.  1975b. Mathematics, Matter, and Method: Philosophical Papers, Vol. I. Cambridge: Cambridge University Press.
  • Putnam, Hilary.  1975a. “What is Mathematical Truth?”  In Putnam 1975b.
  • Putnam, Hilary.  1974. “Science as Approximation to Truth.”  In Putnam 1975b.
  • Putnam, Hilary.  1971. Philosophy of Logic. In Putnam 1975b.
  • Putnam, Hilary.  1967b. “Mathematics Without Foundations.”  In Putnam 1975b.
  • Putnam, Hilary.  1967a. “The Thesis that Mathematics is Logic.”  In Putnam 1975b.
  • Putnam, Hilary.  1956. “Mathematics and the Existence of Abstract Entities.”  Philosophical Studies 7: 81-88.
  • Quine, W.V. 1995. From Stimulus to Science. Cambridge: Harvard University Press.
  • Quine, W.V. 1986b. “Reply to Charles Parsons.” In Lewis Edwin Hahn and Paul Arthur Schilpp, eds., The Philosophy of W.V. Quine. La Salle: Open Court, 1986.
  • Quine, W.V. 1986a. Philosophy of Logic, 2nd edition.  Cambridge: Harvard University Press.
  • Quine, W.V. 1981. Theories and Things. Cambridge: Harvard University Press.
  • Quine, W.V.  1980. From a Logical Point of View. Cambridge: Harvard University Press.
  • Quine, W.V.  1978. “Success and the Limits of Mathematization.”  In Quine 1981.
  • Quine, W.V.  1969. Ontological Relativity and Other Essays. New York: Columbia University Press.
  • Quine, W.V.  1960. Word & Object. Cambridge: The MIT Press.
  • Quine, W.V.  1958. “Speaking of Objects.”  In Quine 1969.
  • Quine, W.V.  1955. “Posits and Reality.”  In The Ways of Paradox. Cambridge: Harvard University Press, 1976.
  • Quine, W.V.  1951. “Two Dogmas of Empiricism.”  In Quine 1980.
  • Quine, W.V.  1948. “On What There Is.”  In Quine 1980.
  • Quine, W.V.  1939. “Designation and Existence.”  In Feigl and Sellars, Readings in Philosophical Analysis, Appleton-Century-Crofts, Inc., New York: 1940.
  • Resnik, Michael.  1997. Mathematics as a Science of Patterns. Oxford: Oxford University Press.
  • Resnik, Michael.  1995. “Scientific vs. Mathematical Realism: The Indispensability Argument.”  Philosophia Mathematica (3) 3: 166-174.
  • Resnik, Michael D.  1993. “A Naturalized Epistemology for a Platonist Mathematical Ontology.”  In Sal Restivo, et al., eds., Math Worlds: Philosophical and Social Studies of Mathematics and Mathematics Education, Albany: SUNY Press, 1993.
  • Shapiro, Stewart.  1991. Foundations without Foundationalism: A Case for Second-Order Logic. Oxford University Press.
  • Sher, Gila. 1990. “Ways of Branching Quantifiers.” Linguistics and Philosophy 13: 393-422.
  • Sober, Elliott.  1993. “Mathematics and Indispensability.”  The Philosophical Review 102: 35-57.
  • Tarski, Alfred.  1944. “The Semantic Conception of Truth: and the Foundations of Semantics.” Philosophy and Phenomenological Research 4.3: 341-376.
  • Van Fraassen, Bas C.  1980. The Scientific Image. Oxford: Clarendon Press.
  • Whitehead, Alfred North and Bertrand Russell.  1997. Principia Mathematica to *56. Cambridge University Press.

Author Information

Russell Marcus
Email: rmarcus1@hamilton.edu
Hamilton College
U. S. A.

Transmission and Transmission Failure in Epistemology

An argument transmits justification to its conclusion just in case, roughly, the conclusion is justified in virtue of the premises’ being justified.  An argument fails to transmit justification just in case, roughly, the conclusion is not justified in virtue of the premises’ being justified.  An argument might fail to transmit justification for a variety of uncontroversial reasons, such as the premises’ being unjustified; the premises’ failing to support the conclusion; or the argument’s exhibiting premise circularity.  There are transmission issues concerning testimony, but this article focuses on when arguments (fail to) transmit justification or knowledge or some other epistemic status.

Transmission failure is an interesting issue because it is difficult to identify what, if anything, prevents competent deductions from justifying their conclusions.  One makes a competent deduction when she accepts a deductive argument in certain circumstances.  These deductions seem to be the paradigmatic form of reasoning in that they apparently must transmit justification to their conclusions.  At the same time, though, certain competent deductions seem bad.  Consider Moore’s Proof:  I have a hand therefore there is at least one material thing.  Some philosophers hold that Moore’s Proof cannot transmit justification to its conclusion under any circumstances, and so, despite appearances, some competent deductions are instances of transmission failure.  Identifying what, if anything, prevents such arguments from justifying their conclusions is a tricky, controversial affair.

Transmission principles are intimately connected with closure principles.  An epistemic closure principle might say that, if one knows P and deduces Q from P, then one knows that Q.  Closure principles are silent as to what makes Q known, but the corresponding transmission principles are not.  A transmission principle might say that, if one knows P and deduces Q from P, then one knows Q in virtue of knowing P.

Those sympathetic to Moore’s Proof sometimes say that the “proof” can justify its conclusion even though it lacks the power to resolve doubt.  An argument can resolve doubt about its conclusion when the argument can justify its conclusion even for a subject who antecedently disbelieves or withholds judgment about the argument’s conclusion.

Table of Contents

  1. Transmission: The General Concept
  2. Transmission in Epistemology
  3. Transmission Failure
    1. Uncontroversial Causes
    2. Why Transmission Failure is an Interesting Issue
    3. Two More Puzzling Cases
  4. Transmission (Failure) vs. Closure (Failure)
    1. The Basic Difference
    2. The (Misplaced?) Focus on Closure
    3. Why Transmission is an Interesting Issue, Revisited
  5. Transmission Failure: Two Common Assumptions
    1. Transmission of Warrant vs. Transmission of Justification
    2. Transmission vs. Resolving Doubt
  6. References and Further Reading

1. Transmission: The General Concept

The term ‘transmission’ is not unique to philosophical discourse: religious and cultural traditions often are transmitted from one generation to the next; diseases from one person to another; and information of various kinds from one computer to another (often via the internet).  A car’s transmission gets its name from its intended purpose, namely to transmit the energy from the engine to its wheels (to put it crudely).  The use of ‘transmission’ in epistemological contexts is deeply connected to its use in everyday contexts.  Tucker (2010, section 1) holds that one can clarify the epistemological concept of transmission by considering an everyday instance of transmission.

Under what conditions does Alvin’s computer A transmit information to another computer B?  Tucker suggests it will do so just in case (i) A had the information and (ii) B has the information in virtue of A’s having it.  The first condition is very intuitive.  If A does not have the information but B acquires it anyway, it may be true that something transmitted the information to B.  Yet, unless A has the information, it won’t be true that A transmitted the information to B.  The second condition is intuitive but vague.  If B has the information in virtue of A’s having it, then A causes B to have it.  Yet mere causation is not enough to satisfy this in virtue of relation.  If A sends the information to B over an Ethernet or USB cable, we do seem to have the requisite sort of causal relation, and, in these cases, A seems to transmit the information to B.

Suppose A just finished downloading the information, which makes Alvin so excited that he does a wild victory dance.  During this dance he accidentally hits B’s keyboard, which causes B to download the information from the internet (and not from Alvin’s computer).  In such a case, A’s having the information causes B to have it, but the information was not transmitted from A to B.  Although transmission requires that a causal relation hold, not just any causal relation will do.  This article will follow Tucker in using ‘in virtue of’ as a placeholder for whatever causal relation is required for transmission.

Generalizing from this example, Tucker concludes that transmission is a three-place relation between: (i) the property P that is transmitted; (ii) the thing a from which the property is transmitted; and (iii) the thing b to which the property is transmitted.  A property P is transmitted from a to b just in case b has P in virtue of a’s having P.  In the above example, the property P is having the information; a is A, Alvin’s computer; and b is B, some other computer.  So A transmits the information to B just in case B has the information in virtue of A’s having it.

The preceding discussion clarifies statements of the form ‘a transmits P to b’, but there is another, more informative kind of transmission ascription, which we can symbolize as ‘R transmits P from a to b’.  Contrast ‘A transmitted the information to B’ with the equally natural expression ‘The USB cable transmitted the information from A to B’.  Whereas the former notes only that the information was transmitted from A to B, the latter additionally notes how it was transmitted.  Under what conditions does the USB cable (more precisely: being connected by the USB cable) transmit the information from A to B?  I suggest that it will do so just in case (i) A had the information and (ii) B has the information in virtue of both A’s having it and A’s being connected by a USB cable to B.

2. Transmission in Epistemology

When epistemologists consider transmission or transmission failure, they generally ask such questions as:

  • Under what conditions does entailment transmit justification?
  • Under what conditions do competent deductions transmit rational belief?
  • Does testimony transmit knowledge?

Epistemologists, then, are concerned with whether some relation (for example, entailment, competent deduction, testimony) transmits some epistemic property (for example, being rational, being justified, being known, or being defeated).  They tend to have in mind, therefore, the more informative sort of transmission ascription (see section 1).  That is, they are concerned not just with whether a belief is, say, known in virtue of another belief’s being known; they are also concerned with whether, say, entailment is the particular relation that allows the first belief to be known in virtue of the second.

This article will focus exclusively on when arguments or inferences (fail to) transmit some epistemic value property, such as being justified or being known.  The reason is that, when philosophers talk about transmission failure as an independent issue, they tend to have in mind the conditions under which an argument or inference fails to transmit.  The conditions under which testimony transmits (or fails to transmit), say, knowledge are an interesting and important issue.  Yet these issues are often pursued in conjunction with or subsumed under other important issues relating to testimony, such as the conditions under which testimony preserves knowledge.  (For a brief introduction to some of the relevant transmission issues pertaining to testimony, see Lackey 2008, section 3.)  In any case, this article will focus on the transmission issues pertaining to arguments or inferences, rather than the issues pertaining to testimony or other epistemically interesting relations.

An argument is a set of propositions such that one proposition, the conclusion, is supported by or is taken to be supported by the other propositions in that set, the premises.  An argument, as such, is merely a set of propositions that bear a special relation with one another.  Arguments can play a role in transmitting justification or knowledge when a subject believes the premises or when a subject infers the conclusion from the premises.  If epistemic transmission is analogous to the above computer transmission case (sec. 1), then an argument transmits justification to its conclusion when (i) the premises have some epistemically valuable status (for example, being justified, being known) and (ii) the conclusion has that same status in virtue of the premises’ having it.  (Here and elsewhere, for the sake of simplicity, I ignore the additional complexity of the more informative transmission ascriptions.)  The following case seems to satisfy (i) and (ii), and so it seems to transmit justification from the premises to the conclusion.

The Counting Case: Consider this argument: (a) that there are exactly 25 people in the room; and (b) that if there are exactly 25 people in the room, then there are fewer than 100 people in the room; therefore (c) there are fewer than 100 people in the room.  Suppose that Counter justifiably believes (a) on the basis of perception; that he justifiably believes (b) a priori; and that he believes (c) on the basis of (a) and (b).

The Counting Case seems to be a paradigmatic case of successful transmission.  Counter’s beliefs in the premises, namely (a) and (b), are justified (so (i) is satisfied), and the conclusion, namely (c), seems to be justified in virtue of the premises’ being justified (so (ii) is satisfied).  Notice, however, that whether an argument transmits is relative to a subject.  The argument in the Counting Case transmits for Counter but not for someone who lacks justification for the premises.

The Counting Case also illustrates the deep connection between the transmission of justification and inferential justification.  When philosophers address inferential justification, they are concerned with the conditions under which the premises of an argument justify the argument’s conclusion.  If one belief (belief in the premise) justifies another belief (belief in the conclusion), belief in the conclusion is inferentially justified.  Notice that the conclusion in the Counting Case is inferentially justified because it is justified by the premises.  The Counting Case, therefore, illustrates both inferential justification and the successful transmission of justification.  This is no accident.  It is almost universally assumed that inferential justification works by transmission; it is assumed that when the conclusion is justified by the premises, the premises transmit their justification to their conclusions.  Hence, the transmission of justification across an argument is deeply connected to inferential justification.

It should be noted that sometimes, when philosophers talk about transmission, they use the term “transfer” rather than “transmission” (for example, Davies 1998).  The latter terminology seems preferable, as Davies now admits (2000: 393, nt. 17).  “Transfer” often connotes that, when P is transferred from a to b, a no longer has P.  If I transfer water from one cup to another, the transferred water is no longer in the first cup.  “Transmission” lacks that connotation: when a computer transmits some information to another computer, the first computer typically retains the transmitted information.

3. Transmission Failure

a. Uncontroversial Causes

An argument is an instance of transmission failure just in case it does not transmit (some degree of) justification (or whatever epistemic status is at issue) from the premises to the conclusion.  Arguments can fail to transmit justification to their conclusions for a number of reasons.  Here are a few relatively uncontroversial causes of transmission failure:

  • Unjustified Premises: If an argument’s premises are all unjustified, then the argument is a trivial case of transmission failure; for the premises had no justification to transmit to the conclusion in the first place.  It does not follow, though, that all of an inference’s premises must be justified for it to transmit justification to its conclusion.  Consider an inductive inference with 100 premises of the form ‘on this occasion the unsuspended pencil fell to the ground’.  If 99 of the 100 premises are justified, it seems that those 100 premises can transmit justification to the belief that the next unsuspended pencil will also fall, even though one of the premises fails to be justified.  (See the article “Deductive and Inductive Arguments” for a brief explanation of the differences between deductive and inductive arguments.)
  • Premise Circularity:  An argument is premise circular just in case its ultimate conclusion also appears as a premise.  For instance, consider P therefore Q therefore P.  The ultimate conclusion, P, is used as the sole premise for the intermediate conclusion, Q.  Even granting that P transmits justification to Q, it seems clear that the justification Q has in virtue of P cannot be transmitted back to P.  (The term ‘premise circular’ will be used loosely, such that both the extended argument P therefore Q therefore P and the second stage of the argument, Q therefore P, are premise circular.)
  • The Premises Fail to Evidentially Support Their Conclusion: Consider the argument: ‘I have a hand; therefore, the Stay Puft Marshmallow Man is eating a Ghostbuster’.  The premise is justified; however, it fails to transmit its justification to the conclusion because having a hand is not evidence that the Marshmallow Man is doing anything, much less eating a Ghostbuster.
  • The Premises Provide Less Than Maximal Evidential Support: An argument that provides maximal evidential support, such as one in the form of modus ponens, is capable of transmitting all of its premises’ justification to the conclusion.  Arguments that provide some less-than-maximal degree of support, such as a good inductive argument, fail to transmit all of the premises’ justification to the conclusion.  Good inductive arguments with justified premises both partially transmit and partially fail to transmit justification from the premises to the conclusion.  Other things being equal, the stronger the support, the more justification the argument transmits from the premises to the conclusion.
  • Defeaters: A good argument might fail to transmit justification because one has a relevant defeater (for example, relevant counterevidence).  Suppose I believe some mathematical theorem T on what is in fact exemplary deductive reasoning.  If I know that my coffee has been spiked with a drug known to cause egregious errors in reasoning, then my exemplary deductive reasoning is an instance of at least partial transmission failure.

b. Why Transmission Failure is an Interesting Issue

It is relatively uninteresting if an argument fails to transmit for any of the above reasons.  But suppose an argument has well-justified premises; the premises provide deductive (so maximal) support for their conclusion; the subject knows that the premises provide deductive support for their conclusions; there are no relevant defeaters; and it is not premise circular.  A person makes a competent deduction when they accept such an argument.  (Others use the term “competent deduction,” but they often mean something slightly different by the term, including Tucker (2010).)  One might think that competent deductions are the paradigm of good reasoning, that they must transmit justification to their conclusions.  Interest in transmission failure arises because, at first glance at least, there are such arguments that do seem to be instances of transmission failure.  Interest in transmission failure persists because it is very hard to identify what would cause such arguments to be instances of transmission failure.  Consider the following example.

Some philosophers, sometimes called “idealists,” hold that the only things that exist are minds and their ideas.  These idealists, therefore, are skeptics about material objects.  In other words, they reject that there are material objects, where material objects are non-mental objects composed of matter.  These philosophers tend to hold that there are ideas of hands but no hands.  There are ideas of chairs, even apparent perceptions of chairs, but there are no chairs.  Responding to these idealists, G. E. Moore declared that he could prove the existence of the external, or non-mental, world.  Here is his “proof”:

Moore’s Proof (MP)

(MP1)   I have a hand.

(If I have a hand, then there is at least one material object.)

(MP2)  There is at least one material object.

This argument is widely criticized and scorned.  Yet if it fails to transmit justification to its conclusion, why does it do so?

Well, Moore’s Proof is not an instance of transmission failure for any of the obvious reasons: it is a deductive argument; its premise seems well-justified on the basis of perceptual experience; there are no relevant defeaters; and it is not premise circular (that is, Moore did not—or at least need not—use MP2, the conclusion of Moore’s Proof, as a premise for his belief in MP1).  Still, it is hard to dispel the sense that this argument is bad.  This argument seems to beg the question against the skeptic, but it is unclear whether question-begging, by itself, can cause transmission failure (see sec. 5b).  Perhaps Moore’s Proof is not just question-begging, but also viciously circular in some way.  The problem is that it is hard to identify a type of circularity that both afflicts Moore’s argument and is clearly bad.

c. Two More Puzzling Cases

Moore’s Proof is a puzzling case.  If one accepts Moore’s Proof, she has made a competent deduction, which would seem to make it the paradigm of good reasoning.  Nonetheless, it still seems to be a bad argument.  The puzzling nature of this case also appears in a variety of other arguments, including the following two arguments.

Moore’s Proof is aimed at disproving idealism insofar as it is committed to skepticism about the material world, that is, the claim that the external world does not exist.  Consider, however, perceptual skepticism, the idea that, even if the external world does exist, our perceptual experiences do not give us knowledge (directly or via an inference) of this non-mental realm.  Proponents of this skepticism typically concoct scenarios in which we would have exactly the same experiences that we do have, but where our perceptual experiences are wildly unreliable.  One popular scenario is that I am the unwitting victim of a mad scientist.  The mad scientist removed my brain, placed it in a vat of nutrients, and then hooked me up to his supercomputer.  In addition to keeping me alive, this supercomputer provides me with a computer-generated reality, much like the virtual reality depicted in the movie The Matrix.  Although all of my perceptual experiences are wildly unreliable, they seem just as genuine and trustworthy as my actual experiences.  The skeptic then reasons as follows: if you cannot tell whether you are merely a brain-in-a-vat in the above scenario, then you do not know you have a hand; you cannot tell whether you are a brain-in-a-vat (because your experiences would seem just as genuine even if you were a brain-in-a-vat); therefore, you do not know whether you have a hand.  (See Contemporary Skepticism, especially section 1, for further discussion of this type of skepticism.)

Some philosophers respond that the sort of reasoning in Moore’s Proof can be applied to rule out the skeptical hypothesis that we are brains-in-vats.  Hence:

The Neo-Moorean Argument

(NM1)    I have a hand.

(If I have a hand, then I am not a brain-in-a-vat.)

(NM2)    I am not a brain-in-a-vat.

The Neo-Moorean Argument is just as puzzling as Moore’s Proof.  If one accepts the Neo-Moorean Argument, she has accepted a competent deduction which seems to be the paradigm of good reasoning.  Yet the argument still seems bad, which is why some philosophers hold that it is an instance of transmission failure.

The Zebra Argument, like the Neo-Moorean Argument, is intended to rule out a certain kind of skeptical scenario.  Bobby is at the zoo and sees what appears to be a zebra.  Quite naturally, he believes that the creature is a zebra on the basis of its looking like one.  His son, however, is not convinced and asks: “Dad, if a mule is disguised cleverly enough, it will look just like a real zebra.  So how do you know that the creature isn’t a cleverly disguised mule?”  Bobby answers his son’s question with:

The Zebra Argument

(Z1)        That creature is a zebra.

(If it is a zebra, then it is not a cleverly disguised mule.)

(Z2)        It is not a cleverly disguised mule.

It seems that to know that the creature is a zebra, one must know already in some sense that the creature is not a cleverly disguised mule.  Hence, Bobby’s argument seems to exhibit a suspicious type of circularity despite qualifying as a competent deduction.

(There is a rather wide variety of other puzzling cases.  For reasons that will be explained in the next section, arguments that allegedly violate closure principles are also potential examples of transmission failure.  Readers interested in semantic or content externalism should consider McKinsey’s Paradox in section 5 of the closure principles article.  Readers with expertise in the philosophy of mind might be interested in some examples raised by Davies (2003: secs. 3, 5).)

4. Transmission (Failure) vs. Closure (Failure)

Discussions of transmission and transmission failure are connected intimately with discussions of closure and closure failure, which raises the question of how these issues are related.

a. The Basic Difference

Closure principles say, roughly, that if one thing a has some property P and bears some relation R to another thing b, then b also will have P.  More succinctly (and ignoring universal quantification for simplicity’s sake), closure principles say that, if Pa and Rab, then Pb.  Suppose that the property being a pig is closed under the relation being the same species as.  Suppose, in other words, that if Albert is a pig, then anything that is the same species as Albert is also a pig.  Given this assumption, if Albert is a pig and Brutus is the same species as Albert, then Brutus is a pig.  Yet being a pig is clearly not closed under the relation being in the same class as.  Pigs are in the class of mammals along with humans, cows, poodles, and many other creatures.  If Albert is a pig and Brutus is in the same class as Albert, it does not follow that Brutus is a pig.  Brutus could be a terribly ferocious poodle and still be in the same class as Albert.

In epistemological contexts, the relevant P will be an epistemic property, such as being justified or known, and R will be something like being competently deduced from or being known to entail.  An epistemic closure principle might say: If Billy knows P and Billy competently deduces Q from P, then Billy also knows Q.

Transmission principles are stronger than their closure counterparts.  Transmission principles, in other words, say everything that their closure counterparts say and more besides.  Recall that closure principles hold that, if Pa and Rab, then Pb.  Transmission principles hold instead that, if Pa and Rab, then Pb in virtue of Pa and Rab.  Closure principles merely say that b has the property P, but they do not specify why b has that property.  Transmission principles say not only that b has P, but also that b has P because, or in virtue of, Pa and Rab.
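
The contrast can also be displayed symbolically.  The following LaTeX rendering is an editorial gloss on the two schemata just stated, not notation drawn from the authors discussed here; “in virtue of” is kept as an informal placeholder rather than a defined connective, and \text assumes the amsmath package.

\[ \text{Closure schema:}\quad (Pa \land Rab) \rightarrow Pb \]

\[ \text{Transmission schema:}\quad (Pa \land Rab) \rightarrow \bigl( Pb \ \text{in virtue of}\ (Pa \land Rab) \bigr) \]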

Notice that a closure principle can be true when the corresponding transmission principle is false.  Consider:

Pig Closure: If Albert is a pig and is the same species as Brutus, then Brutus is also a pig.

Pig Transmission: If Albert is a pig and is the same species as Brutus, then Brutus is a pig in virtue of Albert’s being a pig.

Even though we are assuming that Pig Closure is true, Pig Transmission will be false when Albert and Brutus are unrelated pigs.  Brutus’ being a pig might be explained by his parents being pigs and/or his having a certain DNA structure, but not by Albert’s being a pig.  Although closure principles can be true when their transmission counterparts are false, if a transmission principle is true, its closure counterpart must also be true.  This is because transmission principles say everything that their closure counterparts say (and more besides).

Epistemic closure principles likewise can be true when their transmission counterparts are false.

Simple Closure: If S knows that P and deduces Q from P, then S knows that Q.

Simple Transmission: If S knows that P and deduces Q from P, then S knows that Q in virtue of knowing that P.

Even supposing Simple Closure is true (which it probably is not), Simple Transmission is false.  Suppose S knows Q on the basis of perceptual experience and then comes to know P on the basis of her knowing Q.  It would be premise circular if she then also based her belief in Q on her belief in P.  If she did so, her extended argument would be Q therefore P therefore Q.  It is plausible in such a case that S still knows the conclusion Q on the basis of the relevant perceptual experience.  Assuming she still knows Q, her deduction from P to Q is not a counterexample to Simple Closure.  On the other hand, this case is a clear counterexample to Simple Transmission.  Although she knows Q, she knows it in virtue of the perceptual experience, not in virtue of deducing it from her knowledge that P.

The difference between closure and transmission principles was just explained.  Next, the difference between closure and transmission failure will be explained.  There is an instance of closure failure when Pa and Rab hold, but Pb does not.  Simple Closure suffers from closure failure just in case someone deduces Q from her knowledge that P but nonetheless fails to know that Q.  An instance of simple closure failure just is a counterexample to Simple Closure.

There is an instance of transmission failure whenever it is false that Pb in virtue of Pa and Rab.  There are three types of transmission failure which correspond to the three ways in which it might be false that Pb holds in virtue of Pa and Rab.  The first type occurs just in case either Pa or Rab does not hold.  If Pa and Rab do not both hold, then Pb cannot hold in virtue of Pa and Rab.  Consequently, Rab would fail to transmit P from a to b.  Notice that this first type of transmission failure can occur even if the relevant transmission principle is true.  Transmission principles do not say that Pa and Rab in fact hold; instead they say if Pa and Rab hold, then Pb holds in virtue of Pa and Rab.  If S fails to know P or fails to deduce Q from P, then the deduction fails to transmit knowledge from P to Q.  Nonetheless, Simple Transmission might still be true, because it does not demand that S actually deduce Q from her knowledge that P.  A similar point explains why one can have type-one transmission failure without having closure failure, that is, without having a counterexample to the corresponding closure principle.  There is, therefore, an interesting difference between transmission and closure failure: an instance of closure failure just is a counterexample to some relevant closure principle, but an instance of transmission failure need not be a counterexample to some relevant transmission principle.

Although the first type of transmission failure never provides a counterexample to some relevant transmission principle, the second and third types always provide such a counterexample. The second type occurs just in case Pa and Rab hold but Pb does not—precisely the same circumstances in which closure failure occurs.  In other words, the second type of transmission failure occurs just in case closure failure does.  It follows that all instances of closure failure are instances of transmission failure.  It does not follow, however, that all instances of transmission failure are instances of closure failure: there will be transmission failure without closure failure whenever there is transmission failure of the first or third types.  Simple Transmission suffers from type-two transmission failure (and closure failure) just in case S deduces Q from her knowledge that P but nonetheless fails to know Q.  (The idea that all instances of closure failure are instances of transmission failure but not vice versa also follows from the fact that transmission principles say everything that their closure counterparts say and more besides.  By saying everything that closure principles say, transmission principles will fail whenever their closure counterparts do.  By saying more than their closure counterparts, they sometimes will fail even when their closure counterparts do not.)

The third type of transmission failure occurs just in case Pa, Rab, and Pb hold, but Pb does not hold in virtue of Pa and Rab.  Since closure principles do not demand that Pb hold in virtue of Pa and Rab, a closure principle may be true even if its corresponding transmission principle suffers from type-three transmission failure.  Simple Transmission suffers from type-three transmission failure just in case S deduces Q from S’s knowledge that P, S knows Q, but S does not know Q in virtue of the deduction from her knowledge that P.  The premise circular argument discussed in this sub-section is a plausible example of this type of failure.  As was explained above, in such a case Simple Closure might hold but Simple Transmission would not.
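
The three types just described can be summarized in the same notation (again an editorial gloss, with “in virtue of” as the informal placeholder and \text assuming amsmath):

\[ \text{Type 1:}\quad \lnot (Pa \land Rab) \]

\[ \text{Type 2:}\quad (Pa \land Rab) \land \lnot Pb \qquad \text{(the case of closure failure)} \]

\[ \text{Type 3:}\quad (Pa \land Rab) \land Pb \land \lnot \bigl( Pb \ \text{in virtue of}\ (Pa \land Rab) \bigr) \]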

b. The (Misplaced?) Focus on Closure

There is no doubt that, in the epistemological literature, closure failure is in some sense the bigger issue.  Some epistemological theories seem committed to rejecting intuitive closure principles, and there is extensive debate over how serious a crime it is to reject these principles.  Although the literature on transmission failure is by no means scant, considerably more ink has been spilt over closure failure.  One naturally is inclined to infer that closure failure is the more important issue, but this may be incorrect: the literature’s focus on closure failure may be misplaced—though this potential misplacement is likely harmless.

Crispin Wright (1985: 438, nt. 1) was perhaps the first to distinguish between epistemic closure and transmission principles, but much of the literature has not observed this distinction, a fact that has been noted by Wright (2003: 76, nt.1) and Davies (2000: 394, nt. 19).  When some philosophers purport to talk about closure principles, they are really talking about transmission principles.  Consider Williamson’s “intuitive closure” principle: “knowing p1,…,pn, competently deducing q, and thereby coming to believe q is in general a way of coming to know q” (2000: 117, emphasis mine).  Closure principles can tell us that everything we competently deduce from prior knowledge itself will be known; however, only transmission principles can tell us the how, that is, that the conclusions are known in virtue of the competent deductions.  Hawthorne likewise treats closure principles as if they were transmission ones: “Our closure principles are perfectly general principles concerning how knowledge can be gained by deductive inference from prior knowledge” (2004: 36, emphasis mine).  Closure principles can tell us that everything we competently deduce from prior knowledge itself will be known; however, only transmission principles can tell us that our knowledge of these conclusions was gained by the deduction from prior knowledge.

Dretske’s 1970 paper “Epistemic Operators” introduced the epistemological world to the issue of closure failure, and his subsequent work on the topic has been extremely important.  Yet even he now admits that discussing transmission failure “provides a more revealing way” of explaining some of his key claims concerning closure failure (2005: 15).  One wonders, then, whether the literature’s greater focus on closure failure is (harmlessly?) misplaced.

c. Why Transmission is an Interesting Issue, Revisited

Although it seems salutary to appreciate the distinction between closure and transmission failure, it may be that some philosophers read too much into this distinction.  Although Wright holds that certain competent deductions are instances of transmission failure, he is “skeptical whether there are any genuine counterexamples to closure” (2002: 332; 2003: 57-8; cf. 2000: 157).  Davies seems sympathetic to a similar position at times (2000: 394) but not at others (1998: 326).  These remarks suggest the following way of explaining why transmission is an interesting issue: “Moore’s Proof seems to be a bad argument, but intuitive closure principles seem too plausible to reject.  This tension can be resolved when Moore’s Proof is treated as an instance of transmission rather than closure failure.  Moore’s Proof seems to be a bad argument and is a bad argument because it fails to transmit justification to its conclusion; it is not, however, a counterexample to intuitive closure principles.”

Smith (2009: 181) comes closest to endorsing this motivation explicitly, but even if it is not widely held, it is worth explaining why it fails.  To do so, two new closure principles need to be introduced.  Simple Closure and Simple Transmission were discussed in 4.A in order to provide a clear case in which a transmission principle is false even if its closure counterpart is true.  Yet Simple Closure is too simple to be plausible.  For example, it fails to account for defeaters (for example, relevant counterevidence).  If S deduces Q from her knowledge that P, then Simple Closure says that S knows Q.  Yet if S makes that deduction even though her total evidence supports ~Q, she will not know Q.

When philosophers defend closure principles, they typically defend, not Simple Closure, but something like:

Strong Closure: If S knows P and S competently deduces Q from P, then S knows that Q.

Simple Closure holds that knowledge is closed over deductions.  Strong Closure, on the other hand, holds that knowledge is closed over competent deductions.  Recall from 3.B that a deduction is competent just in case the premises are well justified; the premises provide deductive (so maximal) support for their conclusions; the subject knows that the premises provide deductive support; there are no relevant defeaters; and it is not premise circular.  Given that competent deductions seem, at first glance at least, to be the paradigm of good reasoning (see 3.B), it should not be surprising that philosophers defend something like Strong Closure.

The second closure principle that needs to be introduced is:

Weak Closure: If S knows P and S competently deduces Q from P, then S has some epistemic status for Q, no matter how weak.

Suppose S competently deduces Q from her knowledge that P.  Strong Closure holds that S must know Q.  Weak Closure, on the other hand, says only that S must have some positive epistemic status for Q, no matter how weak.  (It is worth noting that, despite its name, Weak Closure is not obviously a closure principle.  Closure principles say that if Pa and Rab, then Pb (see 4.A).  If there are three different epistemic properties P, P′, and P″, then Weak Closure has this form: if Pa and Rab, then Pb or P′b or P″b.  This concern can be ignored, because if Weak Closure fails to count as a closure principle, then there would only be further problems with the above motivation.)

Wright (2004), Davies (2003: 29-30), and perhaps also Smith (2009: 180-1) endorse an account of non-inferential knowledge which allows them to endorse Weak Closure but not Strong Closure.  (McLaughlin 2003: 91-2 endorses a similar view, but it is not clear that his explanation of transmission failure is compatible with even Weak Closure.)  Put simply, they hold that to have (strong) non-inferential justification for P, one must have prior entitlement for certain background assumptions.  An entitlement to some background assumption A is something like a very weak justification for A that one has automatically, or by default.  Since they suppose (as is common) that knowledge requires the strong type of justification, they also hold that non-inferential knowledge likewise requires this prior weak and default justification for background assumptions.  (The most extensive defense of this view of non-inferential justification is Wright’s 2004.  See Tucker’s 2009 for a criticism of this view as it relates to perceptual justification.)

Applied to Moore’s Proof, this view holds that, to have non-inferential knowledge that one has a hand (the premise of Moore’s Proof), one must have some prior entitlement to accept that there are material things (the conclusion of Moore’s Proof).  Since the conclusion of Moore’s Proof would not be used as a sub-premise to establish that one has hands, the proof would not count as premise circular.  Nonetheless, since knowing the premise would require some previous (however weak) justification for the conclusion, this view of non-inferential justification makes Moore’s Proof circular in some other sense.  Does this type of circularity prevent the premise from transmitting knowledge to the conclusion?  Wright and Davies certainly think so, but Cohen (1999: 76-7, 87, nt. 52) is more optimistic.  If Wright and Davies are correct, then one has some very weak justification for the conclusion of Moore’s Proof, but one does not and cannot know that conclusion.  Since the conclusion, that there are material things, does have some weak epistemic status, Wright and Davies can endorse Weak Closure.  Yet they are forced to reject Strong Closure because they hold that one cannot know that there are material things.

The ability to endorse Weak Closure is not enough for the above way of motivating the issue of transmission failure to succeed.  Strong Closure (or some principle in the general neighborhood) is what most epistemologists find too plausible to reject.  Since Wright and Davies must reject Strong Closure, their diagnosis of Moore’s Proof cannot explain the badness of Moore’s Proof without rejecting the version of closure that most philosophers find intuitive.  (See Silins 2005: 89-95 for related discussion.)

Something like Strong Closure seems extremely plausible even to those who ultimately reject it (for example, Dretske 2005: 18).  But why does it seem so plausible?  Tucker (2010: 498-9) holds that it seems so plausible because its corresponding transmission principle seems so plausible.  Consider:

Strong Transmission: If S knows P and S competently deduces Q from P, then S knows that Q in virtue of that competent deduction.

Strong Transmission says what Strong Closure says and, in addition, that the conclusion is known in virtue of that competent deduction.  Tucker’s suggestion is that Strong Closure seems plausible because Strong Transmission seems plausible.  It seems that justification is closed over a competent deduction because it seems that competent deductions must transmit justification to their conclusions, a point discussed above in section 2.B.  From this point of view, it is no surprise to find that the literature often treats closure principles as if they were transmission ones, for our intuitions concerning transmission would explain why certain closure principles seem so plausible.

5. Transmission Failure: Two Common Assumptions

It is commonly held that Moore’s Proof, the Neo-Moorean Argument, and the Zebra Argument are instances of transmission failure.  When philosophers attempt to explain why these arguments fail to transmit, they tend to make two assumptions.

a. Transmission of Warrant vs. Transmission of Justification

Much of the literature on transmission failure focuses on the transmission of warrant rather than the transmission of (doxastic) justification (see Wright 1985, 2002, 2003; Davies 1998, 2000, 2003; and Dretske 2005).  A warrant for P, roughly, is something that counts in favor of accepting P.  An evidential warrant for P is some (inferential or non-inferential) evidence that counts in favor of accepting P.  Entitlement, which was discussed in 4.B, is a type of non-evidential warrant for P, a warrant that one has by default.  One can have a warrant for P even if one does not believe P, or believes P but not on the basis of the warrant.  Notice that it is propositions that are warranted relative to a person.

(Doxastic) justification, on the other hand, is a property that beliefs have.  Roughly, a belief is justified when it is held in an epistemically appropriate way.  S is justified in believing P only if (i) S has warrant for P and (ii) S’s belief in P is appropriately connected to that warrant for P.  Hence, one can have warrant for a belief even though it is not justified.  Suppose Merla has some genuine evidential warrant for the proposition that Joey is innocent, so she satisfies (i); but her belief will not be justified if she believes that Joey is innocent solely because the Magic 8-Ball says so.  Although Merla would have warrant for Joey’s innocence, her belief in his innocence would not be connected appropriately to that warrant.  In other words, her belief would not be justified because it would not satisfy (ii).

Again, Wright, Davies, and Dretske focus on the transmission of warrant, not justification. In a representative statement, Davies maintains that “The question is whether the epistemic warrants that I have for believing the premises add up to an epistemically adequate warrant for the conclusion” (2000: 399, cf. 2003: 51). Dretske focuses more specifically on the transmission of evidential warrant.  Transmission failure, he says, is the idea “that some reasons for believing P do not transmit to things, Q, known to be implied by P” (2005: 15).  These philosophers hold that Moore’s Proof fails to transmit in the sense that it fails to make the warrant for its premise warrant for its conclusion.

These philosophers assume, however, that the failure to transmit warrant suffices for the failure to transmit justification.  In other words, they make:

Common Assumption 1: if an argument fails to transmit warrant, then it fails to transmit justification.

The difference between these two types of transmission failure is subtle.  To say that an argument fails to transmit justification is to say that an argument fails to make its conclusion justified.  To say that an argument fails to transmit warrant is to say that the argument fails to make belief in its conclusion justified in a very particular way, namely by converting warrant for the premise into warrant for the conclusion.

Davies, Wright, and, to a lesser extent, Dretske reveal this assumption when they discuss the significance of failing to transmit warrant.  Wright assumes that when an argument fails to transmit warrant, it is not an argument “whereby someone could be moved to rational [or justified] conviction of its conclusion” (2000: 140).  In one paragraph, Davies seems to suppose, at the very least, that “limitations on the transmission of epistemic warrants” suffice for “limitations on our ability to achieve knowledge [and presumably also justification] by inference” (2003: 35-6).  Although there is no one passage that illustrates this, Dretske (2005) assumes that an evidential warrant’s failing to transmit prevents knowledge (and presumably also justification) from transmitting.

This first assumption is significant because the transmission of justification seems to be the more important type of transmission. When we evaluate the quality of arguments (insofar as they are used to organize one’s beliefs) we want to know whether we can justifiably believe the conclusion in virtue of accepting the argument.  Whether an argument transmits warrant is usually relevant to this aim only insofar as it implies something about when the argument transmits justification.

Silins (2005: 87-88) and Tucker (2010: 505-7) criticize this first assumption.  Suppose that Harold’s belief in P is doxastically justified by his evidence E; he notices that P entails Q; and then he subsequently deduces Q from P.  According to Silins and Tucker, it is natural to identify Harold’s reason for accepting Q as P, not E.  Since we are supposing that P entails Q, P is presumably a warrant for Q.  But if P is Harold’s reason for Q and is itself a warrant for Q, it does not seem to matter whether the deduction transmits warrant, that is, whether the deduction makes E into a warrant for Q.  It is worth noting that, even if Common Assumption 1 is ultimately correct, Tucker and Silins still have a point: this assumption is not sufficiently obvious to be taken for granted, as Wright, Dretske, and Davies do.

b. Transmission vs. Resolving Doubt

The second common assumption may be the more important.  It says that failing to have the power to resolve doubt suffices for failing to transmit justification.  In other words:

Common Assumption 2: if an argument fails to have the power to resolve doubt, then it fails to transmit justification to its conclusion.

A deduction P therefore C has the power to resolve doubt (about its conclusion) iff it is possible for one to go from doubting C to justified belief in C solely in virtue of accepting P therefore C.  As I am using the term, one (seriously) doubts P just in case one either disbelieves or withholds judgment about P.  Withholding judgment is more than merely failing to believe or disbelieve P: it is resisting or refraining from both believing and disbelieving P, and one cannot do that unless one has considered P.

Suppose that Hillbilly has been very out of the loop for the last few years, and he doubts that Obama is the president.  He then discovers that both CNN and the NY Times say that Obama is the president.  He might justifiably infer that, after all, Obama is the president.  This is because the argument he would accept has the power to resolve doubt.  On the other hand, the Neo-Moorean Argument, for example, does not have the power to resolve doubt.  If one doubts NM2, that one is not a brain-in-a-vat, one cannot rationally believe NM1, that one has a hand.  So doubting the conclusion of the Neo-Moorean Argument prevents a key premise in the argument from being justified, thereby preventing the argument from justifying the conclusion.  Since the argument cannot justify its conclusion when the subject antecedently disbelieves or withholds judgment about the conclusion, it lacks the power to resolve doubt.

Wright (2002, 2003), Davies (2003), and McLaughlin (2000) make this second assumption. Wright maintains that “Intuitively, a transmissible warrant should make for the possible advancement of knowledge, or warranted belief, and the overcoming of doubt or agnosticism” (2002: 332, emphasis mine).  In another paper, he says of an example that, “The inference from A to B is thus not at the service of addressing an antecedent agnosticism about B.  So my warrant does not transmit” (2003: 63).

Davies’ (2003) Limitation Principles for the transmission of warrant are, he thinks, motivated “by making use of the idea that failure of transmission of epistemic warrant is the analogue, within the thought of a single subject, of the dialectical phenomenon of begging the question” (41).  In Davies’ view, “The speaker begs the question against the hearer if the hearer’s doubt rationally requires him to adopt background assumptions relative to which the considerations that are supposed to support the speaker’s premises no longer provide that support” (41).  Take the Zebra Argument.  If you doubted Z2, that the animal is not a cleverly disguised mule, then Davies suggests that your perceptual experience will no longer count in favor of your belief in Z1, that the animal is a zebra.  So if I offered you the Zebra Argument in order to convince you that the animal is not a cleverly disguised mule, I would beg the question against you.

It is pretty clear, as Davies’ discussion suggests, that accepting an argument that fails to be a “question-settling justification,” that is, accepting an argument lacking the power to resolve doubt, is the analogue of the dialectical phenomenon of begging the question (for example, 2003: 41-5, esp. 42).  Were I to accept the Zebra Argument when I have antecedent doubt about its conclusion, I would, as it were, beg the question against myself.  Yet Davies never provides any reason to believe that transmission failure is an analogue of begging the question.  He seems to take for granted that for something (for example, an experience or argument) to be a justification at all, it must have the power to resolve doubt.

McLaughlin’s (2000) primary concern is with the transmission of knowledge, not justification, but he seems to make a parallel assumption.  He says the Neo-Moorean Argument cannot transmit knowledge because it begs the question: “The premises fail to provide a sufficient epistemic basis on which to know the conclusion because my basis for one of the premises is dependent on the truth of the conclusion in such a way as to render the argument question begging” (104).  It is the Neo-Moorean Argument’s inability to resolve doubt that makes it question-begging.  Hence, McLaughlin seems to assume that the power to resolve doubt is required for the power to make a conclusion known.

Much of the literature on transmission failure, then, operates on the assumption that the power to justify requires the power to resolve doubt.  Taking this assumption for granted was probably reasonable when the literature was first published; the assumption has since been challenged, most directly by Pryor (2004), while Markie (2005: 409) and Bergmann (2006: 198-200) challenge similar assumptions in connection with easy knowledge and epistemic circularity, respectively.  Although Davies initially endorses Common Assumption 2, he seems inclined to reject it in his later work (2004: 242-3).  Those who challenge this assumption first emphasize (though not necessarily in these words) the conceptual distinction between transmission failure and the inability to resolve doubt, and then they contend that we need some special reason to think that the inability to resolve doubt suffices for transmission failure.

Sometimes philosophers press similar distinctions in different terminology, and it is worth explaining the connection with one other popular way of talking.  Some (for example, Pryor 2004: 369) hold that Moore’s Proof can transmit justification even though it is dialectically ineffective for some audiences.  An argument is dialectically effective for an audience when it will transmit justification (or knowledge) to the argument’s conclusion given the audience’s current beliefs, experiences, and other epistemically relevant factors.  Consider again Hillbilly’s argument: two reliable sources, namely CNN and the NY Times, say that Obama is the president; therefore, Obama is the president.  This argument is dialectically effective for Hillbilly because he has no antecedent doubt about the reliability of CNN and the NY Times.  This same argument nonetheless may be dialectically ineffective for his cousin if the cousin antecedently doubts (rationally or irrationally) the reliability of these two news outlets.  Before this argument will be dialectically effective for the cousin, her antecedent doubt must be resolved.

Defenders of Moore’s Proof sometimes say that the “proof” is dialectically effective for audiences that lack antecedent doubt about the argument’s conclusion that there are material things, but not for its intended audience, namely those skeptical of this conclusion.  Moore’s Proof fails to be dialectically effective for this skeptical audience because such skeptics tend to doubt the reliability of perception.

Appreciating the distinction between transmission failure and the inability to resolve doubt (or dialectical effectiveness) not only casts doubt on Common Assumption 2, but also provides proponents of Moore’s Proof with an error theory.  In general, an error theory attempts to explain why something seems true when it is not.  The proponent of Moore’s Proof wants to explain why Moore’s Proof seems to be an instance of transmission failure when it is not.  In other words, this error theory attempts to explain away the intuition that Moore’s Proof is an instance of transmission failure.  The proponent of this error theory will say that this intuition is partly right and partly wrong.  What it gets right is that Moore’s Proof exhibits a genuine failure, namely the failure to resolve doubt (and/or be dialectically effective for its target audience).  What it gets wrong is that Moore’s Proof is an instance of transmission failure.  Yet, since it is easy to conflate the two types of failure, it is easy to mistakenly think that Moore’s Proof is an instance of transmission failure too.

The success of this error theory depends on at least two factors.  The first is whether transmission failure and the inability to resolve doubt are in fact easily confused.  This seems plausible given the widespread tendency to implicitly endorse Common Assumption 2 without comment.  The second is whether one retains the intuition that Moore’s Proof is an instance of transmission failure.  If, after considering this error theory and carefully distinguishing transmission failure from the inability to resolve doubt, one no longer has the intuition that Moore’s Proof is a bad argument, then the error theory seems promising.  If, however, one retains the intuition that Moore’s Proof is a bad argument, it is far less plausible that the intuition of transmission failure arises from conflating transmission failure with the inability to resolve doubt.  Consequently, the error theory would seem considerably less promising.  (Wright 2008 responds to Pryor’s version of this error theory, a response which is criticized by Tucker’s 2010: 523-4.)

6. References and Further Reading

  • Bergmann, Michael. 2006. Justification without Awareness. Oxford: Oxford University Press.
    • In Chapter 7, Bergmann makes a distinction similar to the transmission/resolving doubt distinction and uses it to defend some instances of epistemic circularity.
  • Cohen, Stewart. 1999. “Contextualism, Skepticism, and the Structure of Reasons.” Philosophical Perspectives 13: 57-89.
    • Cohen’s main goal is to defend epistemic contextualism, but he also seems to approve of a type of circularity that Davies and Wright find vicious (see 76-7, 87, nt. 52).
  • Davies, Martin. 2004. “Epistemic Entitlement, Warrant Transmission, and Easy Knowledge.” Aristotelian Society Supplementary Volume 78: 213-45.
    • In this paper, Davies distances himself from his earlier work on transmission failure and seems sympathetic to the error theory discussed in 5.B.
  • Davies, Martin. 2003. “The Problem of Armchair Knowledge.” In Nuccetelli 23-56.
    • In this paper, Davies defends his early views concerning transmission failure, but perhaps its most useful contribution is that it considers a wide variety of cases that he holds are instances of transmission failure (see especially section 5).
  • Davies, Martin. 2000. “Externalism and Armchair Knowledge.” In Boghossian, Paul and Christopher Peacocke (eds.). New Essays on the A Priori. Oxford: Oxford University Press, 384-414.
    • This paper is probably the place to start for those interested in Davies’ early views on transmission failure.
  • Davies, Martin. 1998. “Externalism, Architecturalism, and Epistemic Warrant.” In Wright, Crispin, C. Smith, and C. Macdonald (eds.). Knowing Our Own Minds. Oxford: Oxford University Press, 321-361.
    • Davies presents his initial views on transmission failure, which he refines in his 2000 and 2003 and then apparently reconsiders in his 2004.
  • Dretske, Fred. 2005. “The Case against Closure.” In Steup, Matthias and Ernest Sosa (eds.). Contemporary Debates in Epistemology. Malden: Blackwell Publishing, 13-25.
    • Dretske defends his view that closure principles are false, and, in sec. 1, he explains how some of what he says about closure failure in his earlier work can be better expressed in terms of transmission failure.
  • Dretske, Fred. 1970. “Epistemic Operators.” Journal of Philosophy 67: 1007-23.
    • Dretske introduces closure failure as an issue for discussion, but his 2005 provides a simpler introduction to the closure failure issue.
  • Hawthorne, John. 2005. “The Case for Closure.” In Steup, Matthias and Ernest Sosa (eds.). Contemporary Debates in Epistemology. Malden: Blackwell Publishing, 26-42.
    • Hawthorne defends intuitive closure principles and criticizes Dretske’s views regarding closure (and transmission) failure.
  • Lackey, Jennifer. 2006.  “Introduction.” In Lackey, Jennifer and Ernest Sosa (eds.). The Epistemology of Testimony. Oxford: Oxford University Press.
    • In Section 3, Lackey briefly discusses some of the transmission issues concerning testimony.
  • Markie, Peter J. 2005. “Easy Knowledge.” Philosophy and Phenomenological Research 70: 406-16.
    • Markie discusses some competent deductions that seem to be instances of transmission failure (though he does not use that terminology), and he provides an error theory of the sort discussed in section 5.B above.
  • McKinsey, Michael. 2003. “Transmission of Warrant and Closure of Apriority.” In Nuccetelli 97-115.
    • McKinsey responds to Wright’s (2000) and Davies’ (1998, 2000, 2003) charge that McKinsey’s Paradox is an instance of transmission failure.
  • McLaughlin, Brian. 2003. “McKinsey’s Challenge, Warrant Transmission, and Skepticism.”  In Nuccetelli 79-96.
    • McLaughlin provides an objection to Wright’s 2000 conditions for transmission failure, which convinces Wright to modify those conditions in his later work.  It also provides a careful discussion of whether McKinsey’s Paradox is an instance of transmission failure.
  • McLaughlin, Brian. 2000. “Skepticism, Externalism, and Self-Knowledge.” The Aristotelian Society Supplementary Volume 74: 93-118.
    • On pages 104-5, McLaughlin connects transmission failure with question-begging and claims that the Neo-Moorean argument is an instance of transmission failure.
  • Nuccetelli, Susana (ed.). 2003. New Essays on Semantic Externalism and Self-Knowledge. Cambridge: MIT Press.
    • Several chapters of this collection were referenced in this article.
  • Pryor, James. 2004. “What’s Wrong with Moore’s Argument.” Philosophical Issues 14: 349-77.
    • Pryor defends Moore’s Proof from the charge of transmission failure, which includes a very careful discussion of the error theory discussed in 5.B.
  • Silins, Nicholas. 2005. “Transmission Failure Failure.” Philosophical Studies 126: 71-102.
    • Silins defends the Zebra Argument from the charge of transmission failure and provides detailed criticisms of the views of Wright and Davies.
  • Smith, Martin. 2009. “Transmission Failure Explained.” Philosophy and Phenomenological Research 79: 164-89.
    • Smith provides an account of transmission failure in terms of safety and reliability.  A full appreciation of Smith’s view requires at least some background in modal logic, particularly with counterfactuals, or subjunctive conditionals.
  • Tucker, Chris. 2010. “When Transmission Fails.” Philosophical Review 119: 497-529.
    • Tucker defends the Neo-Moorean and Zebra arguments by developing and defending a very permissive account of transmission failure. Much of this entry is merely a simplified version of the first half of Tucker’s 2010 paper.
  • Tucker, Chris. 2009. “Perceptual Justification and Warrant by Default.”  Australasian Journal of Philosophy 87: 445-63.
    • This paper attacks the view of non-inferential justification that Wright, and, to a lesser extent, Smith, Davies, and McLaughlin (2003) assume in their work on transmission failure.
  • Williamson, Timothy. 2000. Knowledge and Its Limits. Oxford: Oxford University Press.
    • This book contains some important work on closure failure that is also, in effect, work on transmission failure.
  • Wright, Crispin. 2008. “The Perils of Dogmatism.” In Nuccetelli, Susana and Gary Seay (eds.). Themes from G. E. Moore: New Essays in Epistemology and Ethics. Oxford: Oxford University Press, 25-48.
    • Wright criticizes an alternative to his account of non-inferential justification and, on page 38, he criticizes Pryor’s version of the error theory discussed in 5.B.
  • Wright, Crispin. 2004. “Warrant for Nothing (and Foundations for Free)?” Aristotelian Society Supplementary Volume 78: 167-212.
    • Wright’s extended defense of his account of non-inferential justification.
  • Wright, Crispin. 2003. “Some Reflections on the Acquisition of Warrant by Inference.” In Nuccetelli 57-78.
    • The place to start for those interested in understanding Wright’s account of transmission failure as it relates to McKinsey’s Paradox and content externalism.
  • Wright, Crispin. 2002. “(Anti)-Sceptics Simple and Subtle: G. E. Moore and John McDowell.” Philosophy and Phenomenological Research 65: 330-348.
    • The place to start for those interested in Wright’s account of transmission failure as it relates to perceptual justification.
  • Wright, Crispin. 2000. “Cogency and Question-Begging: Some Reflections on McKinsey’s Paradox and Putnam’s Proof.” Philosophical Issues 10: 140-63.
    • Wright provides a transmission failure principle which he refines in his 2002 and 2003 in light of McLaughlin’s 2003 criticism.
  • Wright, Crispin. 1985. “Facts and Certainty.” Proceedings of the British Academy, 429-472. Reprinted in Williams, Michael (ed.). 1993. Skepticism. Aldershot: Dartmouth Publishing Company Limited, 303-346.
    • Wright’s earliest work on transmission failure and perhaps the first paper to distinguish between closure and transmission principles.  Since Wright’s main focus is not transmission failure, readers might start with one of Wright’s later papers unless they are very interested in the full details of Wright’s broadly Wittgensteinian epistemology.

Author Information

Chris Tucker
Email: c.tucker@auckland.ac.nz
University of Auckland
New Zealand

Fictionalism in the Philosophy of Mathematics

The distinctive character of fictionalism about any discourse is (a) recognition of some valuable purpose to that discourse, and (b) the claim that that purpose can be served even if sentences uttered in the context of that discourse are not literally true. Regarding (b), if the discourse in question involves mathematics, either pure or applied, the core of the mathematical fictionalist’s view about such discourse is that the purpose of engaging in that discourse can be served even if the mathematical utterances one makes in the context of that discourse are not true (or, in the case of negative existentials such as ‘There are no square prime numbers’, are only trivially true).

Regarding (a), in developing their position, mathematical fictionalists must then add to this core view at the very least an account of the value of mathematical inquiry and an explanation of why that value can be expected to be served even if we do not assume the literal or face-value truth of mathematics.

The label ‘fictionalism’ suggests a comparison of mathematics with literary fiction, and although the fictionalist may wish to draw only the minimal comparison that both mathematics and fiction can be good without being true, fictionalists may also wish to develop this analogy in further dimensions, for example by drawing on discussions of the semantics of fiction, or on how fiction can represent. Before turning to these issues, though, this article considers what the literal truth of a sentence uttered in the context of mathematical inquiry would amount to, so as to understand the position that fictionalists wish to reject.

Table of Contents

  1. Face-value Semantics, Platonism, and its Competitors
  2. The Fictionalist’s Attitude: Acceptance without Belief
  3. Preface or Prefix Fictionalism?
  4. Mathematical Fictionalism and Empirical Theorizing
    1. Mathematical Fictionalism + Scientific Realism
    2. Mathematical Fictionalism + Nominalistic Scientific Realism
    3. Mathematical Fictionalism + Constructive Empiricism (Bueno)
  5. Hermeneutic or Revolutionary Fictionalism
  6. References and Further Reading

1. Face-value Semantics, Platonism, and its Competitors

In the context of quite ordinary mathematical theorizing we find ourselves uttering sentences whose literal or ‘face-value’ truth would seem to require the existence of mathematical objects such as numbers, functions, or sets.  Thus: ‘2 is an even number’ appears to be of subject-predicate form, with the singular term ‘2’ purporting to stand for an object which is said to have the property of being an even number.  ‘The empty set has no members’ uses a definite description, and at least since Russell presented his theory of definite descriptions it has standardly been assumed that the truth of such sentences requires, at a minimum, the existence and uniqueness of something satisfying the predicate ‘is an empty set’.  Most stark, though, is the use of the existential quantifier in the sentences used to express our mathematical theories.  Euclid proved a theorem whose content we would express by means of the sentence ‘There are infinitely many prime numbers’.  One would have to make some very fancy manoeuvres indeed to construe this sentence as requiring anything less than the existence of numbers – infinitely many of them. (For an argument against the ‘ontologically committing’ reading of ‘there is’, see Jody Azzouni (2004: 67), who holds that ‘there is’ in English functions as an ‘ontologically neutral anaphora’.  Azzouni’s position on ontological commitments is discussed helpfully in Joseph Melia’s (2005) online review of Azzouni’s book.  For the remainder of this article, though, we will assume, contrary to Azzouni’s position, that the literal truth of sentences of the form ‘there are Fs’ requires the existence of Fs.)

Similar points about the ‘face-value’ commitments of our ordinary utterances can be made if we move outside of the context of pure mathematics to sentences uttered in the context of ordinary day-to-day reasoning, or in the context of empirical science.  As Hilary Putnam (1971) famously pointed out, in stating the laws of our scientific theories we make use of sentences that, at face value, are dripping with commitments to mathematical objects.  Thus, Newton’s law of universal gravitation says that, between any two massive objects a and b, there is a force whose magnitude F is directly proportional to the product of the masses m_a and m_b of those objects and inversely proportional to the square of the distance d between them (F = Gm_am_b/d²).  Unpacking this statement a little bit, we see that it requires that, corresponding to any massive object o, there is a real number m_o representing its mass as a multiple of some unit of mass; corresponding to any distance between two objects there is a real number d representing that distance as a multiple of some unit of distance; and corresponding to any force there is a real number F representing the magnitude of that force as a multiple of some unit of force.  So, conjoined with the familiar truth that there are massive objects, the literal truth of Newton’s law requires not only that there be forces acting on these objects, and distances separating them, but that there be real numbers corresponding appropriately to masses, forces, and distances, and related in such a way that F = Gm_am_b/d².  So Newton’s law taken literally requires the existence of real numbers and of correspondences of objects with real numbers (i.e., functions).
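One rough way to display the commitments just unpacked is the following schematic regimentation, offered here purely for illustration (it is not Putnam’s own formalization, and the predicate and function names are placeholders):

\[
\forall x\,\forall y\,\Big(\mathrm{Massive}(x)\land \mathrm{Massive}(y)\;\rightarrow\;\exists\, m_x, m_y, d, F \in \mathbb{R}\;\big(\mathrm{mass}(x)=m_x \,\land\, \mathrm{mass}(y)=m_y \,\land\, \mathrm{dist}(x,y)=d \,\land\, F = \tfrac{G\,m_x m_y}{d^{2}}\big)\Big)
\]

Read this way, the law quantifies over real numbers and relies on functions such as mass and dist from objects to real numbers, which is precisely the ontological commitment at issue.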

These examples show that, on a literal or face-value reading, some of the sentences used to express our mathematical and scientific theories imply the existence of mathematical objects.  The theories that are expressed by means of such sentences are thus said to be ontologically committed to mathematical objects.  Furthermore, if we interpret these sentences at face value, and if we endorse those sentences when so interpreted (accepting them as expressing truths on that face value interpretation), then it seems that we too, by our acceptance of the truth of such sentences, are committed to an ontology that includes such things as numbers, functions, and sets.

Mathematical realists standardly endorse a face-value reading of those sentences used to express our mathematical and scientific theories, and accept that such sentences so interpreted express truths.  They therefore commit themselves to accepting the existence of mathematical objects.  In inquiring into the nature of the objects to which they are thereby committed, mathematical platonists typically go on to state that the objects to which they are committed are abstract, where this is understood negatively to mean, at a minimum, non-spatiotemporal, acausal, and mind-independent.  But many philosophers are wary of accepting the existence of objects of this sort, not least because (as Benacerraf  (1973) points out) their negative characterization renders it difficult, if not impossible, to account for our ability to have knowledge of such things.  And even without such specific epistemological worries, general ‘Ockhamist’ tendencies warn that we should be wary of accepting the existence of abstract mathematical objects unless the assumption that there are such things proves to be unavoidable.  For many philosophers then, fictionalists included, mathematical platonism presents itself as a last resort – a view to be adopted only if no viable alternative that does not require belief in the existence of abstract mathematical objects presents itself.

What, then, are the alternatives to platonism?  One might reject the face-value interpretation of mathematical sentences, holding that these sentences are true, but that their truth does not (despite surface appearances) require the existence of mathematical objects.  Defenders of such an alternative must provide a method for reinterpreting those sentences of mathematical and empirical discourse that appear to imply the existence of abstract mathematical objects so that, when so interpreted, these implications disappear.  Alternatively, one might accept the face-value semantics, but reject the truth of the sentences used to express our mathematical theories.  Assuming that it is an advantage of standard platonism that it provides a standard semantics for the sentences used to express our mathematical theories, and that it preserves our intuition that many of these sentences assert truths, each of these options preserves one advantage at the expense of the other.  A final alternative is both to reject the face-value interpretation of mathematical sentences and to reject the truth of those sentences once reinterpreted.  This apparently more drastic response is behind at least one position in the philosophy of mathematics that reasonably calls itself fictionalist (Hoffman (2004), building on Kitcher (1984)).  Hoffman’s claim is that ordinary mathematical utterances are best interpreted as making claims about the collecting and segregating abilities of a (fictional) ideal mathematician; these claims are not literally true, since the ideal mathematician does not really exist, but there is a standard of correctness for such claims, given by the ‘story’ provided of the ideal agent and his abilities.  However, the label ‘fictionalism’ in the philosophy of mathematics is generally used to pick out positions of the second kind, and that convention will be adhered to in what follows.  That is, according to mathematical fictionalists, sentences of mathematical and mathematically-infused empirical discourse should be interpreted at face value as implying the existence of mathematical objects, but we should not accept that such sentences so interpreted express truths.  As such, mathematical fictionalism is an error theory with respect to ordinary mathematical and empirical discourse.

2. The Fictionalist’s Attitude: Acceptance without Belief

At a minimum, then, mathematical fictionalists accept a face-value reading of sentences uttered in the context of mathematical and ordinary empirical theorizing, but when those sentences are, on that reading, committed to the existence of mathematical objects, mathematical fictionalists do not accept those sentences to be true.   Some fictionalists, for example Field (1989: 45), will go further than this and say that we ought to reject such sentences as false, taking it to be undue epistemic caution to maintain agnosticism rather than to reject the existence of mathematical objects once it is recognized that we have no reason to believe that there are such things.  Whether one follows Field in rejecting the existence of mathematical objects will depend on one’s motivation for fictionalism.  A broadly naturalist motivation, according to which we should accept the existence of all and only those objects whose existence is confirmed according to our best scientific standards, would seem to counsel disbelief.  On the other hand, fictionalists who reach that position from a starting point of constructive empiricism will take the agnosticism about the unobservable endorsed by that position to apply also in the mathematical case.  Either way, though, fictionalists will agree that one ought not accept the truth of sentences that, on a literal reading, are committed to the existence of mathematical objects.

But simply refusing to accept the truth of sentences of a discourse does not amount to fictionalism with respect to that discourse: one may, for example, refuse to accept the truth-at-face-value of sentences uttered by homeopaths in the context of their discourse concerning homeopathic medicine, but doing so would be indicative of a healthy scepticism rather than a fictionalist approach to the claims of homeopathy.  What is distinctive about fictionalism is that fictionalists place some value on mathematical theorizing: they think that there is some valuable purpose to engaging in discourse apparently about numbers, functions, sets, and so on, and that that purpose is not lost if we do not think that the utterances of our discourse express truths.

Mathematical fictionalists, while refusing to accept the truth of mathematics, do not reject mathematical discourse.  They do not want mathematicians to stop doing mathematics, or empirical scientists to stop doing mathematically-infused empirical science. Rather, they advocate taking an attitude sometimes called acceptance to the utterances of ordinary mathematical discourse.  That is, they advocate making full use of those utterances in one’s theorizing without holding those utterances to be true (an attitude aptly described by Chris Daly (2008, 426) as exploitation). It is, therefore, rather misleading that the locus classicus of mathematical fictionalism is entitled Science without Numbers.  As we will see below, much of Field’s efforts in Science without Numbers are focussed on explaining why scientists can carry on exploiting the mathematically-infused theories they have always used, without committing themselves to the truth of the mathematics assumed by those theories.

The reasonableness of an attitude of acceptance or exploitation rather than belief will depend on the mathematical fictionalist’s analysis of the purpose of mathematical theorizing: what advantages are gained by speaking as if mathematical sentences are true, and are these advantages ones that we can reasonably expect to remain if the sentences of mathematical discourse are not in fact true?  But even prior to addressing this question, some have doubted (for example, Horwich (1991) and O’Leary-Hawthorne (1994)) whether adopting the proposed attitude of mere acceptance without belief is even possible, given that, plausibly, one’s belief states are indicated by one’s propensities to behave in particular ways, and fictionalists advocate behaving ‘as if’ they are believers when engaging in a discourse they purport to accept. (These objections are aimed at Bas van Fraassen’s constructive empiricism (van Fraassen 1980), but apply equally well to fictionalists who do not believe our standard mathematical and scientific theories but similarly wish to continue to immerse themselves in ordinary mathematical and scientific activity, despite their reservations.)  Daly (2008) responds to this objection on the fictionalist’s behalf.

3. Preface or Prefix Fictionalism?

Mathematical fictionalists choose to speak ‘as if’ there are numbers, even though they do not believe that there are such things.  How should we understand their ‘disavowal’?  As David Lewis (2005: 315) points out, “There are prefixes or prefaces (explicit or implicit) that rob all that comes after of assertoric force.  They disown or cancel what follows, no matter what that may be.”  We might therefore imagine mathematical fictionalists as implicitly employing a prefix or preface that stops them from asserting, when doing mathematics, what they readily deny as metaphysicians.  The difference between a prefix and a preface is that “When the assertoric force of what follows is cancelled by a prefix, straightaway some other assertion takes place… Not so for prefaces” (ibid. 315).  Thus, preceding the sentence ‘Holmes lived at 221B Baker Street’ with the prefix ‘According to the Sherlock Holmes stories…’ produces another sentence with assertoric force, whose force is not disowned.  On the other hand, in preceding that very same sentence with the preface ‘Let’s make believe the Holmes stories are true, though they aren’t’, one is not making a further assertion, but rather indicating that one is stepping back from the business of making assertions.

When mathematical fictionalists speak ‘as if’ there are numbers, can we, then, read them as implicitly employing a disowning prefix or preface?  When they utter the sentence ‘There are infinitely many prime numbers’, should we ‘hear’ them as really having begun under their breath with the prefix ‘According to standard mathematics…’, or as having prefaced their utterance with a mumbled ‘Let’s make believe that the claims of standard mathematics are true, though they aren’t’?  Either reading is possible, but a prefix fictionalism would make the mathematical fictionalist’s insistence on a standard semantics for ordinary mathematical utterances rather less distinctive.  The claim that we have no reason to believe that ordinary mathematical utterances taken at face value are true, but that the addition of an appropriate prefix transforms them into true claims, may be correct; but it does not sufficiently distinguish fictionalism so construed from those reinterpretive anti-platonist views which simply hold that ordinary mathematical sentences should be given a non-standard semantics according to which they assert truths.  For example, Geoffrey Hellman’s modal structuralism holds that a mathematical sentence S uttered in the context of a mathematical theory T should be read as essentially ‘saying’: ‘T is consistent and it follows from the axioms of T that S’ (or, in other words, ‘According to (the consistent axioms of) a standard mathematical theory, S’, which is the result of applying a plausible fictionalist prefix to S).  Whether one reads this prefix into the semantics of mathematical claims or instead advocates a standard semantics but places all the value of ‘speaking as if’ there are mathematical objects on the possibility of construing one’s utterances as so prefixed is arguably a matter of taste.
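For comparison, a simplified schematic of Hellman’s modal-structural reading (his own formulation is second-order and considerably more careful than this sketch) renders a mathematical sentence S of theory T as:

\[
S \;\longmapsto\; \Diamond\,\exists X\, T(X) \;\land\; \Box\,\forall X\,\big(T(X)\rightarrow S(X)\big)
\]

Here T(X) says that the structure X satisfies the axioms of T, and S(X) is S relativized to X; the first conjunct plays the role of the consistency claim, and the second the role of ‘it follows from the axioms of T that S’.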

To take seriously the mathematical fictionalist’s insistence on a standard semantics, then, it is perhaps better to view mathematical fictionalists as implicitly or explicitly preceding their mathematical utterances with a disavowing preface which excuses them from the business of making assertions when they utter sentences whose literal truth would require the existence of mathematical objects.  But this, of course, raises the question of what they think they are doing when they engage, as fictionalists, in mathematical theorizing (both in the context of pure mathematics and in the context of empirical science). Pure mathematics does not present a major difficulty here – fictionalists may, for example, take the purpose of speaking as if the assumptions of our mathematical theories are true to be to enable us easily to consider what follows from those assumptions.  Pure mathematical inquiry can then be considered as speculative inquiry into what would be true if our mathematical assumptions were true, without concern about whether those assumptions are in fact true, and it is perfectly reasonable to carry out such inquiry as one would a conditional proof, taking mathematical axioms as undischarged assumptions.  But in the context of empirical science this answer is not enough.  In empirical scientific theorizing we require at least some of our theoretical utterances (minimally, those that report or predict observations) to be true, and part of the purpose of engaging in empirical scientific theorizing is to justify our unconditional assertion of the empirical consequences of our theories.  Despite the disavowing preface, which insulates mathematical fictionalists from the business of making assertions when they utter mathematical sentences in the context of their empirical theorizing, they are not, and would not want to be, entirely excused from the business of assertion.  The most pressing problem for mathematical fictionalists is to explain why they are licensed to endorse the truth of some, and only some, utterances made in the context of ordinary, mathematically-infused, empirical theorizing.
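To return to the point about pure mathematics made above: the conditional-proof picture can be put schematically (an illustrative gloss, not a formulation any particular fictionalist insists on). From a derivation of a theorem S from axioms A1, ..., An, one may assert the corresponding conditional without asserting the axioms themselves:

\[
A_1, \ldots, A_n \vdash S \quad\Longrightarrow\quad \vdash\; (A_1 \land \cdots \land A_n) \rightarrow S
\]

The fictionalist who treats axioms as undischarged assumptions asserts, at most, conditionals of the right-hand form.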

4. Mathematical Fictionalism and Empirical Theorizing

As we have already noted, our ordinary empirical theorizing is mathematical through and through.  We use mathematics in stating the laws of our scientific theories, in describing and organising the data to which those theories are applied, and in drawing out the consequences of our theoretical assumptions.  If we believe that the mathematical assumptions utilized by those theories are true, and also believe any non-mathematical assumptions we make use of, then we have no difficulty justifying our belief in any empirical consequences we validly derive from those assumptions: if the premises of our arguments are true then the truth of the conclusions we derive will be guaranteed as a matter of logic.  On the other hand, though, if we do not believe the mathematical premises in our empirical arguments, what reason have we to believe their conclusions?

Different versions of mathematical fictionalism take different approaches to answering this question, depending on how realist they wish to be about our scientific theories.  In particular, mathematical fictionalism can be combined with scientific realism (Hartry Field); with ‘nominalistic scientific realism’, or entity realism (Mark Balaguer, Mary Leng); and with constructive empiricism (Otávio Bueno).  We will consider these combinations separately.

a. Mathematical Fictionalism + Scientific Realism

I will here reserve the label ‘scientific realism’ for the Putnam-Boyd formulation of the view.  In Putnam’s words (1975: 73), scientific realists in this sense hold “that terms typically refer… that the theories accepted in mature science are typically approximately true, [and] that the same terms can refer to the same [thing] even when they occur in different theories”.  It is the first two parts of Putnam’s tripartite characterization that are particularly problematic for mathematical fictionalists, the first suggesting a standard semantics and the second a commitment to the truth of scientific theories.  If our mature scientific theories include, as Putnam himself contends that they do, statements whose (approximate) truth would require the existence of mathematical objects, then the combination of scientific realism with mathematical fictionalism seems impossible.

If one accepts scientific realism so-formulated, then what prospects are there for mathematical fictionalism?  The only room for manoeuvre comes with the notion of a mature scientific theory.  Certainly, in formulating the claims of our ordinary scientific theories we make use of sentences whose literal truth would require the existence of mathematical objects.  But we also make use of sentences whose literal truth would require the existence of ideal objects such as point masses or continuous fluids, and we generally do not take our use of such sentences to commit us to the existence of such objects.  Quine’s view is that, in our best, most careful expressions of mature scientific theories, sentences making apparent commitments to such objects will disappear in favour of literally true alternatives that carry with them no such commitments.  That is, we are not committed to point masses or continuous fluids because these theoretical fictions can be dispensed with in our best formulation of these mature theories.  Hartry Field, who wishes to combine scientific realism with mathematical fictionalism, thinks that the same can be said for the mathematical objects to which our ordinary scientific theories appear to be committed: in our best expressions of those theories, sentences whose literal truth would require the existence of such objects can be dispensed with.

Hence, the so-called ‘indispensability argument’ for mathematical platonism, and Field’s scientific realist response to this argument:

P1 (Scientific Realism): We ought to believe that the sentences used to express our best (mature) scientific theories, when taken at face value, are true or approximately true.

P2 (Indispensability):  Sentences whose literal truth would require the existence of mathematical objects are indispensable to our best formulations of our best scientific theories.

Therefore:

C (Mathematical Platonism): We ought to believe in the existence of mathematical objects.

(For an alternative formulation of the argument, and defence, see Colyvan (2001).)  In his defence of mathematical fictionalism, Field rejects P2, arguing that we can dispense with commitments to mathematical objects in our best formulations of our scientific theories.  In Science without Numbers (1980) Field makes the case for the dispensability of mathematics in Newtonian science, sketching how to formulate the claims of Newtonian gravitational theory without quantifying over mathematical objects.

But Field is a fictionalist about mathematics, not a mere skeptic about mathematically-stated theories.  That is, Field thinks that there is some value to speaking ‘as if’ there are mathematical objects, even though he does not accept that there really are such things.  The claim that we can dispense with mathematics in formulating the laws of our best scientific theories is, therefore, only the beginning of the story for Field: he also wishes to explain why it is safe for us to use our ordinary mathematical formulations of our scientific theories in our day-to-day theorizing about the world.

Field’s answer to this question is that our ordinary (mathematically-stated) scientific theories are conservative extensions of the literally true non-mathematical theories that we come to once we dispense with mathematics in our theoretical formulations.  A mathematically-stated empirical theory P is a conservative extension of a nominalistically stated theory N just in case any nominalistically stated consequence A of P is also a consequence of the nominalistic theory N.  Or, put another way, suppose we have an ordinary (mathematically-expressed, and therefore platonistic) scientific theory P and a nominalistically acceptable reformulation of that theory, N.  The nominalistically acceptable reformulation will aim to preserve P’s picture of the nonmathematical realm while avoiding positing the existence of any mathematical objects.  If this reformulation is successful, then every nonmathematical fact about the nonmathematical realm implied by P will also be implied by N.  In fact, typically, P will be identical to N + S: the combination of the nominalistic theory N with a mathematical theory S, such as set theory with nonmathematical urelements, that allows one to combine mathematical and nonmathematical vocabulary, e.g., by allowing nonmathematical vocabulary to figure in its comprehension schema.  In the case of Newtonian gravitational theory, Field makes the case for having found the appropriate theory N by sketching a proof of a ‘representation theorem’ which links up nonmathematical laws of N with laws of P that, against the backdrop of N + S, are materially equivalent to the nonmathematical laws.
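Put schematically (a compact restatement of the definition just given, with ⊨ standing for logical consequence), P = N + S is a conservative extension of N just in case, for every nominalistically stated sentence A:

\[
N + S \models A \;\;\Rightarrow\;\; N \models A
\]

Adding the mathematics S to the nominalistic theory N, in other words, yields no new nominalistically stated consequences.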

Why, if we have a pair of such theories, N and P, does this give us license to believe the nonmathematical consequences of P, a theory whose truth we do not accept?  Simply because those consequences are already consequences of the preferred nominalistic theory N, which as scientific realists we take to be true or approximately true.  Our confidence in the truth of the nonmathematical consequences we draw from our mathematically stated scientific theories piggybacks on our confidence in the truth of the nonmathematical theories those theories conservatively extend.

But why, we may ask, should we bother with the mathematically-infused versions of our scientific theories if these theories simply extend our literally believed nominalistic theories by adding a body of falsehoods?  Field’s answer is that mathematics is an incredibly useful, practically (and sometimes even theoretically) indispensable tool that enables us to draw out the consequences of our nominalistic theories.  Nominalistically stated theories are unwieldy, and arguments from nominalistically stated premises to nominalistically stated conclusions, even when available, can be difficult to find and impractically long to write down.   With the help of mathematics, though, such problems can become tractable.  If we want to draw out the consequences of a body of nominalistically stated claims, we can use a ‘representation theorem’ to enable us to ascend to their platonistically stated counterparts, give a quick mathematical argument to some platonistically stated conclusions, then descend, again via the representation theorem, to nominalistic counterparts of those conclusions.  In short, following Carl G. Hempel’s image, mathematics has the function of a ‘theoretical juice extractor’ when applied to the nominalistic theory N:

Thus, in the establishment of empirical knowledge, mathematics (as well as logic) has, so to speak, the function of a theoretical juice extractor: the techniques of mathematical and logical theory can produce no more juice of factual information than is contained in the assumptions to which they are applied; but they may produce a great deal more juice of this kind than might have been anticipated upon a first intuitive inspection of those assumptions which form the raw material for the extractor. — C. G. Hempel (1945: 391)
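Schematically, the detour described above runs as follows (an illustrative sketch; the ascent and descent are underwritten by the representation theorem, and the safety of the whole procedure by conservativeness):

\[
\text{nominalistic premises}
\;\xrightarrow{\ \text{representation theorem}\ }\;
\text{platonistic counterparts}
\;\xrightarrow{\ \text{mathematical derivation}\ }\;
\text{platonistic conclusion}
\;\xrightarrow{\ \text{representation theorem}\ }\;
\text{nominalistic conclusion}
\]

Conservativeness then guarantees that the nominalistic conclusion was already a consequence of the nominalistic premises on their own; the mathematics merely made it easier to find.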

Extracting the consequences of our nominalistically stated theories ‘by hand’ is extremely time-consuming (so much so as to make the procedure humanly impracticable without mathematics, as Ketland (2005) has pointed out).  Furthermore, if our nominalistic theories employ second-order logic in their formulation, some of these consequences can only be extracted with the help of mathematics (as noted by Field (1980: 115n. 30; 1985) and Urquhart (1990: 151), and discussed in detail by Shapiro (1983)).  Nevertheless, even though mathematics is practically and perhaps even theoretically indispensable for extracting the juice of factual information from our nominalistically stated theories, indispensability in this sense does not conflict with Field’s rejection of P2 of the indispensability argument as we have presented it, which requires only that we can state the assumptions of our best scientific theories in nonmathematical terms, not that we dispense with all uses of mathematics, for example in drawing out the consequences of those assumptions.

Field’s defense of fictionalism, though admirable, has its problems.  There are concerns both about Field’s dispensability claim and his conservativeness claim.  On the latter point, Shapiro objects that if our nominalistic scientific theories employ second-order logic, then the fact that mathematics is indispensable in drawing out some of the (semantic) consequences of those theories speaks against Field’s claim to have dispensed with mathematics.  As we have noted, indispensability in this sense does not affect Field’s ability to reject P2 of the original indispensability argument.  However, it does suggest a further indispensability argument that questions our license to use mathematics in drawing inferences if we do not believe that mathematics to be true.  Michael D. Resnik (1995: 169-70) expresses an argument of this form in his ‘Pragmatic Indispensability Argument’ as follows:

  1. In stating its laws and conducting its derivations science assumes the existence of many mathematical objects and the truth of much mathematics.
  2. These assumptions are indispensable to the pursuit of science; moreover, many of the important conclusions drawn from and within science could not be drawn without taking mathematical claims to be true.
  3. So we are justified in drawing conclusions from and within science only if we are justified in taking the mathematics used in science to be true.

Even if Field can dispense with mathematics in stating the laws of our scientific theories, the focus this argument places on the indispensability of mathematics in derivations, i.e., in drawing out the consequences of our scientific theories, presents a new challenge to Field’s program.

If we stick with second-order formulations of our nominalistic theories, then premise 2 of this argument is right in claiming that mathematics is indispensable to drawing out some of the consequences of these theories (in the sense that, for any consistent such theory and any sound derivation system for such a theory, there will be semantic consequences of those theories that are not derivable within those theories relative to that derivation system).  But does the use of mathematics in uncovering the consequences of our theories require belief in the truth of the mathematics used?  Arguably, our reliance on, e.g., model theory in working out what follows from our nominalistic assumptions requires only that we believe in the consistency of our set-theoretic assumptions, not in the actual existence of the sets they posit, so perhaps this form of the indispensability argument (based on the indispensability of mathematics in metalogic rather than in empirical science) can be responded to without dispensing with mathematics in such cases.  (See, e.g., Field (1984); Leng (2007).)
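
The reason these semantic consequences outrun any derivation system can be sketched briefly (a standard observation, due in essentials to Shapiro (1983), rather than anything argued in detail here): a second-order nominalistic theory of the kind Field envisages pins down its intended structure (spacetime with the structure of the four-dimensional continuum) up to isomorphism, and by Gödelian incompleteness no sound, effectively specifiable derivation system can capture every semantic consequence of a theory of that strength; set-theoretic model theory is then what we lean on to certify consequences that no such derivation system delivers.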

Field (1985: 255), though, has expressed some concerns about the second-order version of Newtonian gravitational theory developed in Science without Numbers.  Aside from the worry about the need to rely on set theory to discover the consequences of our second-order theories, there are more general concerns about the nominalistic acceptability of second-order quantification (with Quine (1970: 66), most famously, complaining that second-order logic is simply ‘set theory in sheep’s clothing’).  While there are various defences of the nominalistic cogency of second-order logic available, Field’s own considered view is that second-order quantification is best avoided by nominalists.  This, however, rather complicates the account of applications given in Science without Numbers, since Field can no longer claim that our ordinary (mathematically stated) scientific theories conservatively extend their nominalistically stated counterparts.  In the equation above, what we can say is that for a first-order nominalistic theory N, and a mathematical theory S such as set theory with nonmathematical urelements, N + S will be a conservative extension of N.  But for our ordinary mathematically stated scientific theory P, we will not be able to find an N such that N + S = P.  In fact, P will in general have ‘greater’ nominalistic content than any proposed counterpart first-order theory N does, by virtue of ruling out some ‘non-standard’ models that N will allow (effectively, because P will imply the existence in spacetime of a standard model for the natural numbers, whereas N will always admit of non-standard models).  We thus do not have the neat representation theorems that allow us to move from claims of N to equivalent claims of P that the machinery of Field’s ‘theoretical juice extractor’ requires.  As Field (2005) concedes, at best we can have partial representation theorems that allow for some match between our mathematical and nonmathematical claims, without straightforward equivalence.
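
In compressed form (again a rough schema rather than Field’s own formulation): what the juice-extractor picture wants is a nominalistic theory N with N + S = P, so that P extends N by mathematics alone; what the retreat to first-order N delivers is only that N + S is a conservative extension of N, with N + S falling short of P, since P rules out non-standard models of the spacetime points that any first-order N must admit.  The gap between these two situations is what forces the move from full to merely partial representation theorems.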

Setting aside the logical machinery required by Field’s account of applications, there are also concerns about the prospects for finding genuinely nominalistic alternatives to our current scientific theories.  Field’s sketched nominalization of Newtonian gravitational theory is meant to show the way for further nominalizations of contemporary theories.  But even if Field has succeeded in making the case for there being a genuinely nominalistic alternative to standard Newtonian science (something that has been questioned by those who are concerned about Field’s postulation of the existence of spacetime points with a structure isomorphic to the 4-dimensional real space R⁴), many have remained pessimistic about the prospect of extending Field’s technique to further theories, for various reasons.  As Alasdair Urquhart (1990) points out, Newtonian science is very convenient in that, since it assumes that spacetime has the structure of R⁴, it becomes easy to find claims about relations between spacetime points that correspond to mathematical claims expressed in terms of real numbers.  But contemporary science takes spacetime to have non-constant curvature, and with the lack of an isomorphism between spacetime and R⁴, the prospects for finding suitable representation theorems to match mathematical claims with claims expressed solely in terms of qualitative relations between spacetime points are less clear.  Furthermore, as David Malament (1982) notes, Newtonian science is likewise convenient in that its laws primarily concern spacetime points and their properties (the mass concentrated at a point, the distance between points, etc.).  But many of our best scientific theories (such as classical Hamiltonian mechanics or quantum mechanics) are standardly expressed as phase space theories, with their laws expressing relations between the possible states of a physical system.  An analogous approach to that of Science without Numbers would dispense with mathematical expressions of these relations in favour of nonmathematical expressions of the same – but this would still leave us with an ontology of possibilia, something that would presumably be at least as problematic as an ontology of abstract mathematical objects.  As Malament (1982: 533) points out, ‘Even a generous nominalist like Field cannot feel entitled to quantify over possible dynamical states.’  And finally, as Malament further notes, the case of quantum mechanics presents even more problems, since, as well as being a phase space theory, quantum mechanics in its Hilbert space formulation represents the quantum mechanical state of a physical system as an assignment of probabilities to measurement events.  What is given a mathematical measure is a proposition or eventuality (the proposition that ‘A measurement of observable A yields a value within the set Δ’ is assigned a probability).  But if propositions are the basic ‘objects’ whose properties are represented by the mathematical theory of Hilbert spaces, then applying an analogous approach to that of Science without Numbers would still throw up nominalistically unacceptable commitments.  For, Malament (1982: 534) asks, ‘What could be worse than propositions or eventualities’ for a nominalist such as Field?

These objections, while not conclusive, make clear just how much hard work remains for a full defense of Field’s dispensability claim.  It is not enough to rely on the sketch provided by Science without Numbers: mathematical fictionalists who wish to remain scientific realists in the Putnam-Boyd sense must show how they plan to dispense with mathematics in those cases where the analogy with Newtonian science breaks down (at a minimum, explaining how to deal with spacetime of non-constant curvature, phase space theories, and the probabilistic properties of quantum mechanics).  Mark Balaguer (1996) attempts the third of these challenges, arguing that we can dispense with quantum events just as long as we assume that there are physically real propensity properties of physical systems, an assumption that he claims to be ‘compatible with all interpretations of quantum mechanics except for hidden variables interpretations’ (Balaguer 1996, p. 217).  But there clearly remains much to be done to show that mathematical assumptions can be dispensed with in favour of nominalistically acceptable alternatives.

b. Mathematical Fictionalism + Nominalistic Scientific Realism

Despair about the prospects of completing Field’s project, as well as attention to the explanation of the applicability of mathematics that Field provides, has led some fictionalists (who, with a nod to Colyvan, I will label ‘Easy Road’ fictionalists) to wonder whether there isn’t an easier way to defend their position against the challenge of explaining why it is appropriate to trust the predictions of our mathematically-stated scientific theories if we do not believe those theories to be true.  Look again at Field’s explanation of the applicability of mathematics.  Field’s claim is, effectively, that our mathematically stated theories are predictively successful because they have a true ‘nominalistic’ core (as expressed, for theories for which we have dispensed with mathematics, by the claims of a nominalistic theory N).  What accounts for the trustworthiness of those theories is not that they are true (in their mathematical and nonmathematical parts), but that they are correct in the picture they paint of the nonmathematical realm.  Easy Road fictionalists then ask, doesn’t this explanation of the predictive success of a false theory undermine the case for scientific realism on the Putnam-Boyd formulation?

Realists typically claim that we have to believe that our scientific theories are true or approximately true, otherwise their predictive success would be miraculous (thus, according to Putnam (1975: 73), realism ‘is the only philosophy that doesn’t make the success of science a miracle’).  But Field’s explanation of the predictive success of ordinary (mathematically stated) Newtonian science shows how that success does not depend on the truth or approximate truth of that theory, only on its having a true ‘nominalistic content’ (as expressed in the nominalistic version of the theory).  Mightn’t we use this as evidence against taking the predictive success of even those theories for which we do not have a neat nominalistic alternative to be indicative of their truth?  Perhaps those, too, may be successful not because they are true in all their parts, but simply because they are correct in the picture they paint of the nonmathematical world?  The contribution mathematical assumptions might be making to our theoretical success would then not depend on the truth of those assumptions, but merely on their ability to enable us to represent systems of nonmathematical objects.  Mathematics provides us with an extremely rich language to describe such systems; maybe it is even indispensable to this purpose.  But if all that the mathematics is doing in our scientific theories is enabling us to form theoretically amenable descriptions of physical systems, then why take the indispensable presence of mathematical assumptions used for this purpose as indicative of their truth?

Thus, Mark Balaguer (1998: 131) suggests that fictionalists should not be scientific realists in the Putnam-Boyd sense, but should instead adopt ‘Nominalistic Scientific Realism’:

the view that the nominalistic content of empirical science—that is, what empirical science entails about the physical world—is true (or mostly true—there may be some mistakes scattered through it), while its platonistic content—that is, what it entails “about” an abstract mathematical realm—is fictional.

This view depends on holding that there is a nominalistic content to our empirical theories (even if this content cannot be expressed in nominalistic terms), and that it is reasonable to believe just this content (believing that, as we might say, our empirical theories are nominalistically adequate, not that they are true).    Similar claims (about the value of our mathematically stated scientific theories residing in their accurate nominalistic content rather than in their truth) can be found in the work of Joseph Melia (2000), Stephen Yablo (2005) and Mary Leng (2010), though of these only Leng explicitly endorses fictionalism. Difficulties arise in characterising exactly what is meant by the nominalistic content of an empirical theory (or the claim that such a theory is nominalistically adequate).  Yablo compares the nominalistic content of a mathematical utterance with the ‘metaphorical content’ of figurative speech, and as with metaphorical content, it is perhaps easier to make a case for our mathematically stated empirical theories having such content than to give a formal account of what that content is (we are assuming, of course, that mathematics may be indispensable to expressing the nominalistic content of our theories, so that we cannot in general expect to be able to identify the nominalistic content of a mathematically stated empirical theory with the literal content of some related theory).

As Leng (2010) argues, the case for the combination of mathematical fictionalism with nominalistic scientific realism depends crucially on showing that the fictionalist’s proposal, to continue to speak with the vulgar in doing science, remains reasonable on the assumption that there are no mathematical objects.  That is, fictionalists must explain why they can reasonably rely on our ordinary scientific theories in meeting standard scientific goals such as providing predictions and explanations, if they do not believe in the mathematical objects posited by those theories.  Balaguer attempts such an explanation by means of his principle of causal isolation, the claim that there are no causally efficacious mathematical objects.  According to Balaguer (1998: 133),

Empirical science knows, so to speak, that mathematical objects are causally inert.  That is, it does not assign any causal role to any mathematical entity.  Thus, it seems that empirical science predicts that the behaviour of the physical world is not dependent in any way on the existence of mathematical objects.  But this suggests that what empirical science says about the physical world—that is, its complete picture of the physical world—could be true even if there aren’t any mathematical objects.

But while causal inefficacy goes a long way to explaining why the existence of mathematical objects is not required for the empirical success of our scientific theories, and especially in drawing a line between unobservable mathematical and physical posits (such as electrons), by itself the principle of causal isolation doesn’t show mathematical posits to be an idle wheel.  Not all predictions predict by identifying a cause of the phenomenon predicted, and, perhaps more crucially, not all explanations explain causally.  Balaguer suggests that, were there no mathematical objects at all, physical objects in the physical world could still be configured in just the ways our theories claim.  But if there were no mathematical objects, would that mean that we would lose any means of explaining why the world is configured the way it is?  If mathematical posits are essential to some of our explanations of the behaviour of physical systems, and if explanations have to be true in order to explain, then a fictionalist who does not believe in mathematical objects cannot reasonably endorse the kinds of explanations usually provided by empirical science.

Hence yet another indispensability argument has been developed to press ‘nominalistic scientific realists’ on this issue (see, particularly, Colyvan (2002); Baker (2005)).  Alan Baker (2009: 613) summarises the argument as follows:

The Enhanced Indispensability Argument

(1)   We ought rationally to believe in the existence of any entity that plays an indispensable explanatory role in our best scientific theories.

(2)   Mathematical objects play an indispensable explanatory role in science.

(3)   Hence, we ought rationally to believe in the existence of mathematical objects.

What the fictionalist should make of this argument depends on what is meant by mathematical objects playing an ‘indispensable explanatory role’.  Let us suppose that this means that sentences whose literal truth would require the existence of mathematical objects are present amongst the explanantia (the explaining premises) of some explanations that we take to be genuinely explanatory.  Two lines of response suggest themselves: first, we may challenge the explanatoriness of such explanations, arguing that such candidate explanations are merely acting as placeholders for more basic explanations that do not assume any mathematics (effectively rejecting (2); Melia (2000, 2002) may be interpreted as taking this line; Bangu (2008) does so more explicitly).  On the other hand, we may accept that some such explanations are genuinely explanatory, but argue that the explanatoriness of the mathematics in these cases does not depend on its truth (effectively rejecting (1); this line is taken by Leng (2005a), who argues that explaining the behaviour of a physical system by appeal to the mathematical features of a mathematical model of that system does not require belief in the existence of the mathematical model in question, only that the features that the model is imagined to have are appropriately tied to the actual features of the physical system in question).  Causal inefficacy does make a difference here, but in a slightly more nuanced way than Balaguer’s discussion suggests: it is hard to see how a causal explanation remains in any way explanatory for one who does not believe in the existence of the object posited as cause.  On the other hand, though, it is less clear that an explanation that appeals to the mathematically described structure of a physical system loses its explanatory efficacy if one is merely pretending or imagining that there are mathematical objects that relate appropriately to it.  As Nancy Cartwright (1983) suggests, causal explanations may be special in this regard – to the extent that nominalistic scientific realists agree with Cartwright on the limitations of inference to the best explanation, we may consider nominalistic scientific realism to be most naturally allied to ‘entity realism’, rather than realism about theories, in the philosophy of science.

c. Mathematical Fictionalism + Constructive Empiricism (Bueno)

What makes ‘nominalistic scientific realism’ a broadly ‘realist’ approach to our scientific theories is that, although its proponents do not believe such theories to be true or even approximately true (due to their mathematical commitments), in holding that our best scientific theories are broadly correct in their presentation of the nonmathematical world, they take it that we have reason to believe in the unobservable physical objects that those theories posit.  An alternative fictionalist option is the combination of mathematical fictionalism with constructive empiricism, according to which we should believe only that our best scientific theories are empirically adequate, correct in their picture of the observable world, remaining agnostic about the claims that those theories make about unobservables.  This combination has been defended by Otávio Bueno (2009).

While Bas van Fraassen is standardly viewed as presenting constructive empiricism as agnosticism about the unobservable physical world, it would seem straightforward that any reason for epistemic caution about theories positing unobservable physical entities should immediately transfer to a caution about theories positing unobservable mathematical entities.  As Gideon Rosen (1994: 164) puts the point, abstract entities

are unobservable if anything is.  Experience cannot tell us whether they exist or what they are like.  The theorist who believes what his theories say about the abstract must therefore treat something other than experience as a source of information about what there is.  The empiricist makes it his business to resist this. So it would seem that just as he suspends judgment on what his theory says about unobservable physical objects, he should suspend judgment on what they say about the abstract domain.

This would suggest that constructive empiricism already encompasses, or at least should encompass, mathematical fictionalism, simply extending the fictionalist’s attitude to mathematical posits further to cover the unobservable physical entities posited by our theories.

Despite their natural affinity, the combination of mathematical fictionalism with constructive empiricism is not as straightforward as it may at first seem.  This is because, in characterising the constructive empiricist’s attitude of acceptance, van Fraassen appears to commit the constructive empiricist scientist to beliefs about mathematical objects.  Van Fraassen (1980: 64) adopts the semantic view of theories, and holds that a

theory is empirically adequate if it has some model such that all appearances are isomorphic to empirical substructures of that model.

To accept a theory is, for van Fraassen, to believe it to be empirically adequate, and to believe it to be empirically adequate is to believe something about an abstract mathematical model.  Thus, as Rosen (1994: 165) points out, for the constructive empiricist so-characterized,

The very act of acceptance involves the theorist in a commitment to at least one abstract object.

Bueno (2009: 66) suggests two options for the fictionalist in responding to this challenge: either reformulate the notion of empirical adequacy so that it does not presuppose abstract entities, or adopt a ‘truly fictionalist strategy’ that takes mathematical entities seriously as fictional entities, understood (following Thomasson 1999) as abstract artifacts created by the act of theorizing.  This latter strategy effectively reintroduces commitment to mathematical objects, albeit as ‘non-existent things’ (Bueno 2009: 74), a move that is hard to reconcile with the fictionalist’s insistence on a uniform semantics (on which ‘exists’ is held to mean exists).

5. Hermeneutic or Revolutionary Fictionalism

Each of these three versions of mathematical fictionalism can be viewed in one of two ways: as a hermeneutic account of the attitude mathematicians and scientists actually take to their theories, or as a potentially revolutionary proposal concerning the attitude one ought to take once one sees the fictionalist light.  Each version of fictionalism faces its own challenges (see, e.g., Burgess (2004) for objections to both versions, and, e.g., Leng (2005) and Balaguer (2009) for responses).  The question of whether fictionalism ought to be a hermeneutic or a revolutionary project is an interesting one, and has led at least one theorist with fictionalist leanings to hold back from wholeheartedly endorsing fictionalism.  From a broadly Quinean, naturalistic starting point, Stephen Yablo (1998) has argued that ontological questions should be answered, if at all, by considering the content of our best confirmed scientific theories.  However, Yablo notes, the sentences we use to express those theories can have a dual content – a ‘metaphorical’ content (such as the ‘nominalistic’ content of a mathematically stated empirical claim), as well as a literal content.  Working out what our best confirmed theories say involves us, Yablo thinks, in the hermeneutic project of working out what theorists mean by their theoretical utterances.  But for the more controversial cases, Yablo (1998: 257) argues, there may be no fact of the matter about whether theorists mean to assert the literal content or some metaphorical alternative.   We will often, he points out, utter a sentence S in a ‘make-the-most-of-it spirit’:

I want to be understood as meaning what I literally say if my statement is literally true… and meaning whatever my statement projects onto… if my statement is literally false.  It is thus indeterminate from my point of view whether I am advancing S’s literal content or not.

If answering ontological questions requires the possibility of completing the hermeneutic project of interpreting what theorists themselves mean by their mathematical utterances, and if (as Yablo (1998: 259) worries) the controversial mathematical utterances are permanently ‘equipoised between literal and metaphorical’, then there may be no principled way of choosing between mathematical fictionalism and platonism. Hence Yablo’s own reticence, in some incarnations, about endorsing mathematical fictionalism, even in the light of the acknowledged possibility of providing a fictionalist interpretation of our theoretical utterances.

6. References and Further Reading

  • Azzouni, J., (2004): Deflating Existential Consequence: A Case for Nominalism (Oxford: Oxford University Press)
  • Baker, A., (2005): ‘Are There Genuine Mathematical Explanations of Physical Phenomena?’, Mind 114: 223-38
  • Baker, A., (2009): ‘Mathematical Explanation in Science’, British Journal for the Philosophy of Science 60: 611-633
  • Balaguer, M. (1996): ‘Towards a Nominalization of Quantum Mechanics’, Mind 105: 209-26.
  • Balaguer, M., (1998): Platonism and Anti-Platonism in Mathematics (Oxford: Oxford University Press)
  • Balaguer, M. (2009): ‘Fictionalism, Theft, and the Story of Mathematics’, Philosophia Mathematica 17: 131-162
  • Bangu, S., (2008): ‘Inference to the Best Explanation and Mathematical Realism’, Synthese 160: 13-20
  • Benacerraf, P., (1973): ‘Mathematical Truth’, The Journal of Philosophy 70: 661-679
  • Bueno, O., (2009): ‘Mathematical Fictionalism’, in Bueno, O. and Linnebo, Ø., eds., New Waves in Philosophy of Mathematics (Hampshire: Palgrave MacMillan): 59-79
  • Cartwright, N., (1983): How the Laws of Physics Lie (Oxford: Oxford University Press)
  • Colyvan, M., (2001): The Indispensability of Mathematics (Oxford: Oxford University Press)
  • Colyvan, M., (2002): ‘Mathematics and Aesthetic Considerations in Science’, Mind 111: 69-74
  • Daly, C., (2008): ‘Fictionalism and the Attitudes’, Philosophical Studies 139: 423-440
  • Field, H., (1980): Science without Numbers (Princeton, NJ: Princeton University Press)
  • Field, H., (1984): ‘Is Mathematical Knowledge Just Logical Knowledge?’, Philosophical Review 93: 509-52.  Reprinted with a postscript in Field (1989): 79-124.
  • Field, H. (1985): ‘On Conservativeness and Incompleteness’, Journal of Philosophy 81: 239-60. Reprinted with a postscript in Field (1989): 125-46
  • Field, H., (1989): Realism, Mathematics and Modality (Oxford: Blackwell)
  • Hellman, G. (1989): Mathematics without Numbers: Towards a Modal-Structural Interpretation (Oxford: Oxford University Press)
  • Hempel, C. G. (1945): ‘On the Nature of Mathematical Truth’, American Mathematical Monthly 52: 543-56.  Reprinted in P. Benacerraf and H. Putnam, eds. (1983): Philosophy of Mathematics: Selected Readings (Cambridge: Cambridge University Press): 377-93
  • Hoffman, S. (2004): ‘Kitcher, Ideal Agents, and Fictionalism’, Philosophia Mathematica 12: 3-17
  • Horwich, P. (1991): ‘On the Nature and Norms of Theoretical Commitment’, Philosophy of Science 58: 1-14
  • Kalderon, M. E. (ed.), (2005): Fictionalism in Metaphysics (Oxford: Oxford University Press)
  • Ketland, J. (2005): ‘Some more curious inferences’, Analysis 65: 18-24
  • Kitcher, P. (1984):  The Nature of Mathematical Knowledge (Oxford: Oxford University Press)
  • Leng, M. (2005a): ‘Mathematical Explanation’, in C. Cellucci and D. Gillies, eds., Mathematical Reasoning, Heuristics and the Development of Mathematics (London: King’s College Publications); 167-89
  • Leng, M. (2005b): ‘Revolutionary Fictionalism: A Call to Arms’, Philosophia Mathematica 13: 277-293
  • Leng, M., (2010): Mathematics and Reality (Oxford: Oxford University Press)
  • Lewis, D., (2005): ‘Quasi-Realism is Fictionalism’, in Kalderon (2005): 314-321
  • Melia, J. (2000): ‘Weaseling Away the Indispensability Argument’, Mind 109: 455-79
  • Melia, J. (2002): ‘Response to Colyvan’, Mind 111: 75-79
  • Melia, J. (2004): ‘Review of Jody Azzouni, Deflating Existential Consequence: A Case for Nominalism’,  Notre Dame Philosophical Reviews 2005/08/15
  • O’Leary-Hawthorne, J. (1994): ‘What does van Fraassen’s critique of scientific realism show?’, The Monist 77: 128-145
  • Putnam, H. (1971): Philosophy of Logic (New York: Harper and Row).  Reprinted in Putnam (1979): 323-57
  • Putnam, H. (1975): ‘What is Mathematical Truth?’, Historia Mathematica 2: 529-43.  Reprinted in Putnam (1979): 60-78
  • Putnam, H. (1979): Mathematics, Matter and Method: Philosophical Papers Vol. II (Cambridge: Cambridge University Press, 2nd ed.)
  • Quine, W. V. (1970): Philosophy of Logic (Cambridge: MA, Harvard University Press, 2nd ed., 1986)
  • Resnik, M. D. (1995): ‘Scientific vs. Mathematical Realism: The Indispensability Argument’, Philosophia Mathematica 3: 166-74
  • Rosen, G. (1994): ‘What is Constructive Empiricism?’, Philosophical Studies 74: 143-78
  • Shapiro, S. (1983): ‘Conservativeness and Incompleteness’, Journal of Philosophy 80: 521-31
  • Thomas, R. (2000): ‘Mathematics and Fiction I: Identification’, Logique et Analyse 171-172: 301-340
  • Thomas, R. (2002): ‘Mathematics and Fiction II: Analogy’, Logique et Analyse 177-178: 185-228
  • Urquhart, A. (1990): ‘The Logic of Physical Theory’, in A. D. Irvine, ed., (1990): Physicalism in Mathematics (Dordrecht: Kluwer): 145-54
  • Van Fraassen, B. (1980): The Scientific Image (Oxford: Clarendon Press)
  • Yablo, S. (1998): ‘Does Ontology Rest on a Mistake?’, Aristotelian Society, Supplementary Volume 72: 228-61
  • Yablo, S. (2005): ‘The Myth of the Seven’, in Kalderon (2005): 88-115

Author Information

Mary Leng
Email: mcleng@liv.ac.uk
University of Liverpool
United Kingdom

Plato: Organicism

Organicism is the position that the universe is orderly and alive, much like an organism. According to Plato, the Demiurge creates a living and intelligent universe because life is better than non-life and intelligent life is better than mere life. It is the perfect animal.  In contrast with the Darwinian view that the emergence of life and mind are accidents of evolution, the Timaeus holds that the universe, the world, is necessarily alive and intelligent. And mortal organisms are a microcosm of the great macrocosm.

Although Plato is most famous today for his theory of Forms and for the utopian and elitist political philosophy in his Republic, his later writings promote an organicist cosmology which, prima facie, conflicts with aspects of his theory of Forms and of his signature political philosophy. The organicism is found primarily in the Timaeus, but also in the Philebus, Statesman, and Laws.

Because the Timaeus was the only major dialogue of Plato available in the West during most of the Middle Ages, during much of that period his cosmology was assumed by scholars to represent the mature philosophy of Plato, and when many Medieval philosophers refer to Platonism they mean his organicist cosmology, not his theory of Forms. Despite this, Plato’s organicist cosmology is largely unknown to contemporary philosophers, although many scholars have recently begun to show renewed interest.

Table of Contents

  1. Introduction
    1. Whitehead’s Reading of Plato
    2. Greek Organicism
  2. Plato’s Cosmogony and Cosmology
    1. Creation of the World Animal
    2. The Mortal Organism as Microcosm of the Macrocosm
    3. Creation as Procreation
    4. Emergence of Kosmos from Chaos
  3. Relevance to Plato’s Philosophy
    1. Relevance to Plato’s Aesthetics
    2. Relevance to Plato’s Ethics
    3. Relevance to Plato’s Political Philosophy
    4. Relevance to Plato’s Account of Health and Medicine
    5. Relevance to Plato’s Theory of Forms
  4. Influence of Plato’s Cosmology
    1. Transition to Aristotle’s Organicism
    2. Importance for Contemporary Philosophy
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Introduction

a. Whitehead’s Reading of Plato

In his 1927-28 Gifford Lectures, Whitehead (1978) makes the startling suggestion that Plato’s philosophy is akin to a philosophy of organism. This is surprising to many scholars because Plato’s signature doctrine, the theory of Forms, would seem to be as far removed from a philosophy of organism as possible. On the usual understanding of the theory of Forms, reality is divided into a perfect, eternal, unchanging world of Forms or universals, and a separate, finite, imperfect world of perceptible particulars, where the latter is an image of the former and is, in some obscure way, unreal, or at least less real than the Forms.  Since living things require growth and change, and since, according to the theory of Forms, changing perceptible things are mere images of the only genuine realities, the Forms, it would seem there can be no fundamental place for living organisms in Plato’s ontology.

The case for Whitehead’s thesis is based on Plato’s Timaeus, where he compares the kosmos to a living organism, but also, to a lesser degree, on the Laws, Statesman, Philebus and Critias.   Since the Timaeus is concerned with the temporal world, generally thought to be denigrated by the “other-worldly” Plato, its relevance to Plato’s philosophy has been doubted.   First, the cosmology of the Timaeus is not even presented by Socrates, but by Timaeus, a 5th-century Pythagorean.   Second, the Timaeus represents its organicist cosmology as a merely probable story.    Third, although Plato employs myths in most of his dialogues, these are generally combined with discursive argument, but the Timaeus is “myth from beginning to end” (Robin, 1996).   For these reasons, many scholars hold that the Timaeus represents a digression into physical speculations that have more to do with the natural sciences per se than they do with philosophy proper (Taylor, 1928).    Russell (1945) allows that the Timaeus deserves to be studied because it has had such great influence on the history of ideas, but holds that “as philosophy it is unimportant.”  The case is further complicated by the controversy over the longstanding view that the Timaeus is a later-period dialogue.  For a discussion of these stylometric and chronological disputes see Kraut (1992), Brandwood (1992), and Meinwald (1992).

It is worth remembering, however, that throughout most of the Middle Ages, the Timaeus was the only Platonic dialogue widely available in the West and most scholars at that time assumed that it represents Plato’s mature views (Knowles, 1989).   Second, the dialogue in the Timaeus appears to take up where that of the Republic leaves off, suggesting that Plato himself saw a continuity between the views in the two works.  It is also worth pointing out that some physicists, such as Heisenberg (1958), have claimed that the Timaeus provided inspiration for their rejection of the materialism of Democritus in favor of the mathematical forms of Plato and the Pythagoreans (see also Brisson and Meyerstein, 1995).   For these and other reasons, a growing number of scholars have, despite the controversies, begun to return to the Timaeus with renewed philosophical interest (Vlastos, 1975; Ostenfield, 1982; Annas, 1999; Sallis, 1999; Carone, 2000; and so forth).

b. Greek Organicism

In his introduction to Plato’s works, Cairns (1961) points out that the Greek view, as far back as we have records, is that the world is orderly and alive.  From this perspective, the failure to appreciate Plato’s organicism is part and parcel of a failure to appreciate Greek organicism more generally. For example, whereas modern scholars view the Milesians as forerunners of modern materialism (Jeans, 1958), the Milesians held that matter is alive (Cornford, 1965; Robin, 1996).  Similarly, Anaximenes did not hold that air is the basis of all things in the same sense, or for the same reasons, that a modern materialist might hold such a view.  He views air as breath and sees air as the basis of all things because he sees the world as a living thing and therefore “wants it to breathe” (Robin, 1996; Cornford, 1966). Pythagoras too, who exerted great influence on Plato, saw the world as a living, breathing being (Robinson, 1968).    Cornford (1966) notes that Plato’s description in the Timaeus of his world animal as a “well rounded sphere” has been seen by some scholars as the best commentary on Parmenides’ comparison of his One Being to a perfect sphere (raising the possibility of a Parmenidean organicism).    Finally, by stressing that fire is the basis of all things, Heraclitus did not mean that fire is the material out of which all things are made.  His fire is an “ever living” fire (Burnet, 1971).  Similar points could be made about other pre-Socratic philosophers.   The Greek tendency to view the world as a living thing is rooted in the fact that the early Greek notion of nature, physis, was closer in meaning to life than to matter (Cornford, 1965).   This is why, as far back as Hesiod, procreation plays such a prominent role in Greek creation stories, as it does in the Timaeus (Section 2c).   From this perspective, it is not surprising that Plato develops an organicist cosmology.    It would be surprising if he did not have one.

2. Plato’s Cosmogony and Cosmology

a. Creation of the World Animal

The Timaeus describes the world (kosmos) as a created living being.  The world is created by the “Demiurge  [ho demiourgos]” who follows an “eternal pattern” reminiscent of Plato’s Forms (Carone, 2000).  The materials out of which the kosmos is fashioned are already present.    The eternal patterns or Forms, the Demiurge himself, and the materials, all pre-exist the creation.  Thus, Plato’s Demiurge is not omnipotent, but is more like a craftsman, limited both by the eternal patterns and by the prior matter.  The creative act consists in putting “intelligence in soul and soul in body” in accord with the eternal patterns.  The soul in the Timaeus and Laws is understood as the principle of self-motion.

The pre-existing materials are described as “chaos.”   By “chaos” Plato does not mean the complete absence of order, but a kind of order, perhaps even a mechanical order, opposed to Reason.   This “chaotic” tendency survives the imposition of Form and is always threatening to break out and undermine the rational order of the world.   For this reason Plato’s kosmos exhibits a dynamical quality quite alien to modern thought.

The Demiurge creates a living and intelligent world because life is better than non-life and intelligent life is better than mere life.  It is “the perfect animal.”  In contrast with the Darwinian view that the emergence of life and mind are accidents of evolution, the Timaeus holds that the world is necessarily alive and intelligent.

The Timaeus identifies three different kinds of souls, the rational (eternal) soul, the spirited soul, and the plantlike soul capable of sensation but not of genuine self-motion.   The world-animal possesses the highest and most perfect kind of soul, the rational soul, but it also shares in the two lower types of soul as well.  The world may be the perfect animal, but it is not a perfect being because it possesses the lower types of soul.  The presence of these lower types of soul helps to explain the imperfection in the world.

The Timaeus holds that the world is “solitary.”   The Demiurge only creates one world, not because he is stingy, but because he can only create the best and there can only be one best world.   Since it is solitary, there is nowhere for it to go and nothing for it to perceive.   The perfect-animal has, therefore, no external limbs or sense organs.

The Demiurge gives the world the most suitable shape, that is, it is a sphere with each point on the circumference equidistant from the center.   Since it has no need of sense organs or limbs, it is perfectly smooth.  Although the pre-existing visible body is also a sphere, it turns out that a sphere is also the most suitable choice of shape for the perfect animal (Sect. 4c).  The Demiurge imposes an order on that pre-existing material sphere that makes it suitable for the introduction of a soul.    Thus, Plato does not deny that there are material or mechanical conditions for life and mind.  He only insists that these are subordinated in the world to the more basic rule by reason (McDonough, 1991).

The Demiurge makes the perfect animal in the shape of a sphere since a sphere “is the most like itself of all figures” and that makes for the most beautiful figure.  Unlike the modern view that values are a subjective coloring imposed by the human mind (Putnam, 1990), Plato’s kosmos is intrinsically beautiful and good.   Plato’s science of nature does not seek to strip things of value in order to see them “objectively”, but, rather, to describe the intrinsic values writ large in the perfect visible cosmic organism (Sect. 3a-3c).

The Demiurge puts the soul in the center of the sphere, but it “diffuses” throughout the entire sphere.   The Demiurge synchronizes the two spheres “center to center.”  Thus, Plato distinguishes between the organism’s spiritual center and its bodily center, and holds that these must be made, by the Demiurge, to correspond with each other.  This is an early version of the “correlation thesis” (Putnam, 1981), the view that there must be a correspondence between the mental and material states of the organism.   That which is produced directly by intelligence may only have a teleological explanation, while that caused by matter not controlled by intelligence may have only a physical explanation, but that which is produced by the informing of matter by intelligence admits of both a teleological and a physical explanation.   In that case, the teleological and physical “spheres” must correspond with each other.  The world-animal is One in the sense that it possesses an organic unity by virtue of its central order-imposing soul.

Since the kosmos is a perfect animal, and since an animal has parts, the world is “a perfect whole of perfect parts.”   The kosmos is a whole of parts because it is “the very image of that whole of which all the animals and their tribes are portions.”  The “whole” of which the kosmos is an image is called “the Form of the Intelligible Animal.”

The Form of the Intelligible Animal contains “all intelligible beings, just as this [visible] world contains all other visible creatures.”  The perfect animal must embrace all possible species of “intelligible beings.”   Thus, Plato’s world-animal is actually a whole ecosystem of interrelated animals.    It should not, however, be assumed that the cosmic animal is not also a single organism.   Although the human body is, in one sense, a single organism, it is, in another sense, a whole system of interrelated organisms (the individual cells of the body), which combine to form one more perfect organism.

The view that the Form of the intelligible animal contains all intelligible beings suggests that only animals are intelligible.   Matter as such is not intelligible.  A material thing is only intelligible because it instantiates a Form.  The Timaeus suggests that the total recipe for the instantiation of the Forms is a living organism.  The ideas that only living things are intelligible and that matter per se is unintelligible are foreign to the modern mind.   Nonetheless, Plato sees a close connection between life and intelligibility.

Since there is nothing outside the perfect animal, it exists “in itself.”  Since it exists “in itself,” it is self-sufficient in the visible world.  It does depend on the Forms, but it does not depend on anything more basic in the perceptible world.   Since it moves, but is an image of the self-sufficient Forms, it moves in the most self-sufficient way, that is, it is self-moving.   Since there is nothing outside it, it can only move “within its own limits,” that is, it can only rotate around its own axis. The circular motion of the perfect animal is the best perceptible image of the perfection and self-sameness of the eternal Forms.

Since the perfect animal is intelligent, it thinks.   Since it is self-moving, it is a self-moving thinker.   Since it is self-sufficient in the visible world, it is, in that realm, absolute spontaneity.   Plato’s characterization of the perfect animal as a “sensible God” expresses the fact that it possesses these divine qualities of self-sufficiency, self-movement, and absolute spontaneity deriving from its participation in an eternal pattern.

The Timaeus presents a complex mathematical account, involving the mixing of various types of being, in various precise proportions, of the creation of the “spherical envelope to the body of the universe,” that is, the heavens.  The more orderly movements of the heavenly bodies are better suited than those of earthly bodies to represent the eternal patterns, but they are not completely ordered.   In addition to the perfect circular movements of the stars, there is also the less orderly movement of the planets.  Plato distinguishes these as “the same” and “the different.”   Whereas the stars display invariable circular movements, the planets move in diverse manners, a different motion for each of the seven planets.   Thus, the movement of the stars is “undivided,” while that of the planets is divided into separate diverse motions.   Since the former is superior, the movements of the different are subordinated to those of “the same.”  The entirely regular movement of “the same” is the perfect image of the eternal patterns, while the movement of “the different” is a manifestation of the imperfect material body of the kosmos.   Nevertheless, since “the different” are in the heavens, they are still much more orderly than the “chaotic” movements of bodies on earth.   Although this account is plainly unbelievable, it sheds light on his concept of an organism and his views about intelligence.

To take one example, Plato invokes the dichotomy of “the same” and “the different” to explain the origins of knowledge and true belief.   Because the soul is composed of both “the same” and “the different,” she is capable of recognizing the sameness or difference in anything that “has being.”  Both knowledge and true opinion achieve truth, for “reason works with equal truth whether she is in the sphere of the diverse or of the same,” but intelligence and knowledge, the work of “the same,” are still superior to true belief, the work of “the different.”   Insofar as the heavens display the movements of “the same,” the world animal achieves intelligence and knowledge, but insofar as “the circle of the diverse” imparts the “intimations of sense” to the soul, mere true belief is achieved.    Plato is, in effect, describing a kind of celestial mechanism to explain the origins of the perfect animal’s knowledge on the one hand and true belief on the other.   His view implies that an organism must be imperfect if it is to have true beliefs about a corporeal world and that these imperfections must be reflected in its “mechanism” of belief.

Because of their perfect circular motions, the heavens are better suited than earthly movements to measure time.    Thus, time is “the moving image of eternity.”  This temporal “image of eternity” is eternal and “moves in accord with number” while eternity itself “rests in unity.”  But time is not a representation of just any Form.  It is an image of the Form of the Intelligible Animal.   Since time is measured by the movement of the perfect bodies in the heavens, and since that movement constitutes the life of the perfect animal, time is measured by the movement of the perfect life on display in the heavens, establishing a connection between time and life carried down to Bergson (1983).

b. The Mortal Organism as Microcosm of the Macrocosm

The Demiurge creates the world-animal, but leaves the creation of mortal animals to the “created gods,” by which Plato may mean the earth (female) and the sun (male).  Since the created gods imitate the creator, mortal animals are also copies of the world-animal.   Thus, man is a microcosm of the macrocosm, a view that extends from the pre-Socratics (Robinson, 1968), through Scholastic philosophy (Wulf, 1956) and the Renaissance (Cassirer, 1979), to Leibniz (1968), Wittgenstein (1966), Whitehead (1978), and others.

Although plants and the lesser animals are briefly discussed in the Timaeus, the only mortal organism described in detail is man.  Since imperfections are introduced at each stage of copying, man is less perfect than the cosmic-animal, the lesser animals are less perfect than man, and plants are less perfect than the lesser animals.  This yields a hierarchy of organisms, a “great chain of being,” arranged from the most perfect world-animal at the top to the least perfect organisms at the bottom (Lovejoy, 1964).

Since an ordinary organism is a microcosm of the macrocosm, the structure of a mortal organism parallels that of the macrocosm.  Since the structure of the macrocosm is the structure of the heavens (broadly construed to include the earth at the center of the heavenly spheres), one need not rely on empirical studies of ordinary biological organisms.  Since the Timaeus holds that the archetype of an organism is “writ large” in the heavens, the science of astronomy is the primary guide to the understanding of living things. In this respect, our modern view owes more to Aristotle, who accorded greater dignity to the empirical study of ordinary living things (Hamilton, 1964, p. 32).

Since the macrocosm is a sphere with the airy parts at the periphery and the earth at the center, ordinary organisms also have a spherical structure with the airy parts at the periphery and the heavier elements at the center.   Since an ordinary organism is less perfect than the world animal, its spherical shape is distorted.   Although there are three kinds of souls, these are housed in separate bodily spheres.   The rational, or immortal, soul is located in the sphere of the head.  The two mortal souls are encased in the sphere of the thorax and the sphere of the abdomen.   The division of the mortal soul into two parts is compared with the division of a household into the male and female “quarters.”

The head contains the first principle of life.  The soul is united with the body at its center.  Since Plato uses “marrow” as a general term for the material at the center of a seed, the head contains the brain “marrow” suited to house the most divine soul.  There are other kinds of “marrows” at the centers of the chest and abdomen.    The sphere is the natural shape for an animal because the principle of generation takes the same form as a seed, and most seeds are spherical.  The head is a “seed” that gives birth to immortal thoughts.  The thorax and abdomen are “seeds” that give birth to their own appropriate motions.

The motions in the various organic systems imitate the circular motions of the heavens.   Respiration is compared to “the rotation of a wheel.”    Since there can be no vacuum, air taken in at one part forces the air already there to move out of its place, which forces the air further down to move, and so on.  Plato gives a similar account of the circulatory system.  The blood is compelled to move by the action of the heart in the center of the chest.  “[T]he particles of the blood … which are contained within the frame of the animal as in a sort of heaven, are compelled to imitate the motion of the universe.”    The blood circulates around the central heart just as the stars circulate around the central earth.   Similar accounts are given of ingestion and evacuation.   The action of the lungs, heart, and so forth, constitutes the bodily mechanism that implements the organic telos.    In the Phaedo and Laws, Plato compares the Earth, the “true mother of us all,” to an organism with its own circulatory systems of subterranean rivers of water and lava.  The organic model of the heavens is the template for an organic model of the geological structure of the earth.

Since the perfect animal has no limbs or sense organs, “the other six [the non-circular] motions were taken away from him.”  Since there is no eternal pattern for these chaotic motions associated with animal life, they are treated as unintelligible.  There is, for Plato, no science of chaos.  His remarks are consistent with the view that there can be a mechanics of the non-circular bodily motions, but since such a mechanics cannot give the all-important reason for the motion, it does not qualify as a science in Plato’s sense.

Since the rise of the mechanistic world view in the 18th century, it has been impossible for modern thinkers to take Plato’s cosmology seriously.  It cannot, however, be denied that it is a breathtaking vision.   If nothing else, it is a startling reminder of how differently ancient thinkers viewed the universe.   According to the Timaeus, we on earth live at the center of one unique perfect cosmic organism, in whose image we have been created, and whose nature and destiny have been ordained by imperceptible transcendent forces from eternity.  When we look up at the night sky, we are not seeing mere physical bodies moving in accord with blind mechanical laws, but, rather, are, quite literally, seeing the radiant airy periphery of that single perfect cosmic life, the image of our own (better) selves, from which we draw our being, our guidance, and our destiny.

Finally, Plato is, in the Timaeus, fashioning important components of our concept of an organism, a concept that survives even when his specific, quaint theories do not.  For example, biologists have noted that animals, especially those, like Plato’s perfect animal, that have no need of external sense organs or limbs, tend towards a spherical shape organized around a center (Buchsbaum, 1957).  Indeed, central state materialism, the modern view that intelligence is causally traceable to the neural center, is, arguably, a conceptual descendant of Plato’s notion of an organism organized around a center.

c. Creation as Procreation

Whereas in his earlier dialogues Plato had distinguished Forms and perceptible objects, the latter copies of the former,  the Timaeus announces the need to posit yet another kind of being, “the Receptacle,” or “nurse of all generation.”  The Receptacle is like the Forms insofar as it is a “universal nature” and is always “the same,” but it must be “formless” so that it can “take every variety of form.”   The Receptacle is likened to “the mother” of all generation, while “the source or spring” of generation, the Demiurge, is likened to the father.   In the Timaeus, the creation of the world is not a purely intellectual act, but, following the sexual motif in pre-Socratic cosmogony, it is modeled on sexual generation.

Plato’s argument for positing the Receptacle is that since visible objects do not exist in themselves, and since they do not exist in the Forms, they must exist “in another,” and the Receptacle is this “other” in which visible objects exist.  The argument for positing the Receptacle is thus premised on the ontologically dependent status of visible objects.

Since the perfect motion is circular, generation too moves in a circle.  This is true of the generation of the basic elements, earth, air, fire, and water, out of each other, but it is also true of animal generation.  Since the parents of a certain type only generate offspring of the same type, the cycle of procreation always returns, in a circular movement, to the same point from which it started.  It is only in creating copies of themselves, which then go on to do the same, that mortal creatures partake of the eternal (essentially the same picture is found in Plato’s Symposium and in Aristotle’s Generation of Animals).  Since the sexual act presupposes the prior existence of the male and female principles, the procreation model also explains why Plato’s Demiurge does not create from nothing.

Plato identifies the Receptacle with space, but also suggests that the basic matters, such as fire, are part of its nature, so it cannot be mere space.   Although Plato admits that it somehow “partakes of the intelligible,” he also states that it “is hardly real” and that we only behold it “as in a dream.”   Despite the importance of this view in the Timaeus, Plato is clearly puzzled, and concludes that the Receptacle is only apprehended by a kind of “spurious reason.”   Given his comparison of the receptacle to the female principle, he may think that visible objects are dependent on “another” in something like the sense in which a foetus is dependent on the mother’s womb.  On the other hand, Plato admits that these are murky waters and it is doubtful that the sexual imagery can be taken literally.

d. Emergence of Kosmos from Chaos

The Western intellectual tradition begins, arguably, with the cosmogony in Hesiod’s Theogony, according to which the world emerges from chaos.  A similar account is found in Plato’s creation story in the Timaeus, where, in the beginning, everything is in “disorder” and any “proportion” between things is accidental.   None of the kinds, such as fire, water, and so forth, exist.  These had to be “first set in order” by God, who then, out of them, creates the cosmic animal.   Since the root meaning of the Greek “kosmos” is orderly arrangement, the Timaeus presents a classic picture of the emergence of order out of chaos.

The doctrine of emergent evolution, associated with Bergson (1983), Alexander (1920), and Morgan (1923), is the view that the laws of nature evolve over time (Nagel, 1979).   Since, in the Timaeus, the laws of nature are not fixed by the conditions in the primordial “chaos,” but only arise, under the supervision of the Demiurge, in a temporal process, Plato’s cosmology appears to anticipate these later views.  Mourelatos (1986) argues that emergentism is present in the later pre-Socratic philosophers.  Although emergentism has been out of fashion for some time, it has recently been enjoying a revival (see Kim, Beckermann, and Flohr, 1992; McDonough, 2002; Clayton and Davies, 2006; and so forth).

3. Relevance to Plato’s Philosophy

a. Relevance to Plato’s Aesthetics

Since reason dictates that the best creation is the perfect animal, the living kosmos is the most beautiful created thing.   Since the perfect animal is a combination of soul and body, these must be combined in the right proportion.   The correct proportion of these constitutes the organic unity of the organism.   Thus, the beauty of an organism consists in its organic unity.   Since other mortal organisms are microcosms of the macrocosm, the standard of beauty for a mortal organism is set by the beauty of the kosmos.   The beauty of a human being is, in effect, modeled on the beauty of a world.

There is a link between beauty and pleasure, but pleasure is derivative.  Since beauty is a matter of rational proportion, a rational person naturally finds the sight of beauty pleasurable.   Thus, a rational person finds a well-proportioned organism beautiful, where the relevant proportions include not merely physical proportions but the most basic proportion between body and soul.   Finally, since an organism has an organic unity, rationality, beauty, health and virtue can only occur together.    Thus, Plato’s aesthetics shades into his ethics, his view of medicine, and his conception of philosophy itself.

b. Relevance to Plato’s Ethics

Perhaps the most basic objection to Plato’s ethics is the charge that his view that the Forms are patterns for conduct is empty of content.   What can it mean for a changeable, corporeal, mortal, living creature to imitate a non-living immaterial, eternal, unchanging, abstract object?   Plato’s organicist cosmology addresses this gap in his ethical theory.

Since the kosmos is copied from the Form of the Intelligible Animal, and since man is a microcosm of the macrocosm, there is a kinship between the rational part of man and the cosmic life on display in the heavens.   There is a close link, foreign to the modern mind, between ethics and astronomy (Carone, 2000).  This explains why, in the Theaetetus, Socrates states that the philosopher spends their time “searching the heavens.”

Specifically, the ethical individual must strive to imitate the self-sufficiency of the kosmos.  Since the most fundamental dimension of self-sufficiency is self-movement, the ethical individual must strive to be self-moving (like the heavenly bodies).  Since the eternal soul is the rational soul, not the animal or vegetable soul, the ethical individual aims at the life of self-moving rational contemplation.  Since the highest form of the rational life is the life of philosophy, the ethical life coincides with the life of philosophy.

As self-moving, the ethical individual is not moved by external forces, but by the “laws of destiny.”  One must not interpret this in a modern sense.  Plato’s ethical individual is not a cosmic rebel.   The ethical individual does not have their own individualistic destiny.  Since a mortal living being is a microcosm of the macrocosm, it shares in the single law of destiny of the kosmos.  Socrates had earlier stated the analogous view in the Meno that “all nature is akin.”  There is a harmony between man’s law of destiny and that of the kosmos.   Because of their corrupt bodily nature, human beings have fallen away from their cosmic destiny.   Thus, the fundamental ethical imperative is that human beings must strive to reunite with the universal cosmic life from which they have fallen away, the archetype of which is displayed in the heavens.   The ethical law for man is but a special case of the universal law of destiny that applies to all life in the universe.

The bad life is the unbalanced life.   A life is unbalanced when it falls short of the ideal organic unity.   Thus, evil is a kind of disease of the soul.   Since the body is the inferior partner in the union of soul and body, evil results from the undue influence of the body on the soul.  Since body and soul are part of an organic unity, and since the soul does not move without the body and vice versa, the diseases of the soul are diseases of the body and vice versa.  Due regard must be given to the bodily needs, but since the soul is the superior partner in that union, the proper proportion is achieved when the rational soul rules the body.   The recipe for a good life is the same as the recipe for a healthy organism.   Thus, the ethics of the Timaeus shades into an account of health and medicine (Sect. 3d).   Since the ethical individual is the philosopher, the account of all of these shades into an account of the philosopher as well.   The ethical individual, the healthy individual, the beautiful individual, and the philosopher are one and the same.

The cosmology of the Timaeus may also serve to counterbalance the elitism in Plato’s earlier ethical views.  Whereas, in Plato’s middle period dialogues, it is implied that goodness and wisdom are only possible for the best human beings (philosophers), the Timaeus suggests the more egalitarian view that since human life is a microcosm of the macrocosm, ethical salvation is possible for all human beings (Carone, 2000).

Plato’s organicism also suggests a more optimistic view of ethical life than is associated with orthodox Platonism.  Whereas, in Plato’s middle period dialogues, the ethical person is represented to be at the mercy of an evil world, and unlikely to be rewarded for their good efforts, the Timaeus posits a “cosmic mechanism” in which virtue is its own reward (Carone, 2000).   Although Socrates may be victimized by unjust men, the ultimate justice is meted out, not in the human law courts, but in the single universal cosmic life.

On the more negative side, Plato’s celestial organicism does commit him to a kind of astrology:  The Demiurge “assigned to each soul a star, and having there placed them as in a chariot, he … declared to them the laws of destiny.”  Taken literally, this opens Plato to easy caricature, but taken symbolically, as it may well be intended, it is a return to the Pythagorean idea that ethical salvation is achieved, not by setting oneself up in individual opposition to the world, but by reuniting with the cosmic rhythm from which one has fallen away (Allen, 1966).   Although this may look more like a cult or religion to modern thinkers, it is worth noting that it does anticipate the criticism of  the human-centered vision of ethics by the modern “deep ecology” movement (Naess, 1990).

c. Relevance to Plato’s Political Philosophy

Since Plato sees an analogy between the polis and the kosmos (Carone, 2000), and since the kosmos is a living organism, Plato’s concept of organism illuminates his account of the polis.   Just as the kosmos is a combination of Reason (Nous) and Necessity (chaos), so too is the polis.   Just as the Demiurge brings the kosmos into being by making the primordial chaos submit to Reason, so too the Statesman brings the polis into being by making the chaos of human life submit to reason.  Carone (2000) suggests that politics, for Plato, is itself a synthesis of Reason and Necessity.   It is significant, in this connection, that in Greek the word “Demiurge” can mean magistrate (Carone, 2000). See Plato’s Political Philosophy.

d. Relevance to Plato’s Account of Health and Medicine

Since an organism is an organic whole, beauty, virtue, wisdom, and health must occur together.   Just as Plato’s organicism issues in an aesthetics and an ethics, it also issues in an account of medicine.   Health is a state of orderly bodily motions induced by the soul, while disease is a state of disorder induced by the chaos of the body.   The diseases of the soul, such as sexual intemperance, are caused by the undue influence of the body on the soul, with the consequence that a person who is foolish is not so voluntarily.

Since an organism is an organic whole, one does not treat the heart in order to cure the person.  One treats the whole person in order to cure the heart.   Since the union of body and soul is fundamental, health requires the correct proportion between them.  Since the enemy of health is the chaos of the body, health is achieved by imitating the rational pattern of the heavens.   Since the heavens are self-moving, that motion is the best which is self-produced.   Thus, a self-imposed “regimen” of rational discipline and gymnastic, including the arts and all philosophy, is the optimal way to manage disease.

Unfortunately, most professors of medicine fail to see that disease is a natural part of life.  Although mortal organisms live within limits, professors of medicine are committed to the impossible task of contravening these limits by external force, medications, surgery, and so forth.  By ignoring an organism’s inherent limits, they fail to respect the inner laws of harmony and proportion in nature.   Just as self-movement is, in general, good, movement caused by some external agency is, in general, bad.   Since an organism is a self-moving rational ordering with its own inherent limits, the best course is to identify the unhealthy habits that have led to the malady and institute a “regimen” to restore the organism to its natural cycles.   In a concession to common sense, however, Plato does allow that intervention by external force may be permissible when the disease is “very dangerous.”

Plato’s view of medicine may seem quaint, but since, on his view, beauty, health, virtue, and wisdom are aspects of (or, perhaps, flow from) a fundamental condition of organic unity, his views on medicine shed light on his aesthetics, ethics, and his conception of philosophy.   Health is, in various Platonic dialogues (Republic 444c-d, Laws 733e, and so forth), associated with the philosophical and virtuous life.  The fact that the Timaeus’ recipe for health includes a strong dose of “all philosophy” betokens Plato’s view that health, like wisdom and virtue, is a specific state of an organism that derives, and can only derive, from a certain central unifying power of the philosophic soul.

e. Relevance to Plato’s Theory of Forms

Although it may seem that Plato’s organicism is irrelevant to his theory of Forms, or even that it is incompatible with it, it is arguable that it supplements and strengthens the theory of Forms.  The three main tenets of the theory of Forms are that (1) the world of Forms is separate from the world of perceptible objects (the two-world view), (2)  perceptible objects are images or copies of the Forms, and (3)  perceptible objects are unreal or “less real” than the Forms.

With regard to the first thesis, there appears to be a tension between Plato’s organicism and the two-world view.  If the kosmos is perfect and beautiful, why not infer that the Forms are not separate from the kosmos but are present in it?   On the other hand, since Aristotle says in the Metaphysics that Plato never abandoned the two-world theory, it is prudent to leave the first thesis unchanged.  Even if Plato’s organicism undercuts some of the original motivations for the two-world view, it does not require its rejection (Sect. 4b).

Although Plato’s organicism does not require a rejection of the second thesis, the view that perceptible objects are images of the Forms, it puts it in a different light. It suggests that perceptible objects are not images of Forms in the sense in which a photograph is an image of a man, but in something like the sense in which a child is an image of its parents (Sect. 2c).   From this perspective, the orthodox reading of Plato relies on a one-sided view of the image-model and thereby makes Plato’s theory of Forms appear to denigrate the perceptible world more than it really must (Patterson, 1985).

Plato’s organicism also puts the third thesis, the view that perceptible objects are less real than the Forms, in a new light.   Since most philosophers see the picture of degrees of reality as absurd, Plato’s views are open to easy ridicule.   However, Plato’s organicism suggests that this objection is based on a confusion.   On this view, when Plato states or implies that some items are less real than others, he is arranging them in a hierarchy based on the degree to which they measure up to a certain ideal of organic unity.  On this scale, a man has more “being” than a tomato because a man has a higher degree of organic unity than a tomato.    That has nothing to do with the absurd view that tomatoes do not exist or that they only exist to a lesser degree.   The view that Plato is committed to these absurd ideas derives from an equivocation between Plato’s notion of “being” (roughly, organic unity) and the notion of existence denoted by the existential quantifier.

Rather than being either irrelevant to Plato’s philosophy or incompatible with it, Plato’s organicism provides new interpretations of certain concepts in those theories.   Indeed, it suggests that some of the standard criticisms of Plato’s views are based on equivocations.

4. Influence of Plato’s Cosmology

a. Transition to Aristotle’s Organicism

Although Plato’s organicism does seem to be consistent with a theory of Forms, it does not come without a price for that theory.  The theory of Forms had been posited to act as causes, as standards, and as objects of knowledge (Prior, 1985), and Plato’s organicism does undermine some of the original motivations for the theory of Forms.  For example, Plato’s argument that the Forms are needed as standards requires a depreciation of the perceptible world. If living organisms are not merely an image of perfection and beauty, but are themselves perfect and beautiful, then these can act as intelligible standards and there is no special need to posit another separate world of superior intelligible existence. Similar arguments can be extended to the view that Forms are needed as causes and as objects of knowledge.  If one enriches the perceptible world by populating it with intelligible entities, that is, living organisms possessed of their own internal idea, there is no need to look for intelligible standards, causes, or objects of knowledge, in a separate Platonic realm.  In that case, positing a world of separate Forms is an unnecessary metaphysical hypothesis.  This is precisely the direction taken by Aristotle.

Aristotle follows Plato in speaking of form and matter, but, unlike Plato, he does not separate the form from the perceptible objects. Aristotle holds that what is real are substances, roughly, individual packages of formed matter. However, not just any perceptible entity is a substance.  In the Metaphysics (1032a15-20), Aristotle states that “animals and plants and things of that kind” are substances “if anything is.”   On this view, part of the importance of the Timaeus is that it is intermediary between Plato’s orthodox theory of Forms and Aristotle’s theory of substance (Johansen, 2004), a point which is lost if the Timaeus is dismissed as a mere literary work with no philosophical significance.  See Sellars (1967), Furth (1987), and McDonough (2000) for further discussions of Aristotle’s organicism.

b. Importance for Contemporary Philosophy

Since Plato’s organicist cosmology includes many plainly unbelievable views (Russell, 1945), the question arises why modern philosophers should take it seriously. Several points of importance for contemporary philosophy have emerged.  First, Plato’s organicist cosmology is relevant to the interpretation of his theory of Forms, providing new interpretations of key terms in that pivotal theory, and it may even provide an escape from some of the standard objections to that theory (Sect. 3e). Second, Plato’s organicism is intimately linked to his notion of man as the microcosm, a view which appears again in Whitehead’s process philosophy, Wittgenstein’s Tractatus, and others. Third, Plato’s organicism illuminates his ethical views (Sect. 3b). Fourth, since Plato conceives of the polis on analogy with an organism, it sheds light on his political philosophy (Sect. 3c). Fifth, Plato’s organicism illuminates his account of health and medicine (Sect. 3d), which, in turn, is the classical inspiration for modern holistic views of health and medicine. Sixth, the concept of an organism as, roughly, a sphere organized around a causal center, of which modern “central state materialism” is a conceptual descendant, traces, arguably, to Plato’s Timaeus (Sect. 2b).  Seventh, the Timaeus deserves to be recognized for its contribution to the history of emergentism, which has again become topical in the philosophy of mind (Sect. 2d). Eighth, Aristotle’s theory of substance bears certain conceptual and historical connections to Plato’s organicism (Sect. 4a).  To the degree that these views are important to contemporary philosophy, and to the history of philosophy, Plato’s organicism is important as well.

5. References and Further Reading

a. Primary Sources

  • Aristotle. 1951.  Metaphysics. Trans. W.D. Ross. The Basic Works of Aristotle. Ed. Richard McKeon.  Pp. 689-933.
  • Aristotle.  1953. Generation of Animals. A.L. Peck, Trans. Cambridge, Mass: Harvard University Press & London, England: William Heinemann, Ltd.
  • Plato. 1968. Republic. Trans.,  Alan Bloom. New York and London: Basic Books.
  • Plato. 1969. Apology. Hugh Tredennick, Trans. Collected Dialogues of Plato.  E. Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp.3-26.
  • Plato.  1969.  Phaedo. Hugh Tredennick, Trans.  Collected Dialogues of Plato.  E. Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 40-98.
  • Plato.  1969.  Gorgias.  W.D. Woodhead, Trans.  Collected Dialogues of Plato.  E.  Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 229-307.
  • Plato. 1969.   Protagoras.   W.K.C. Guthrie, Trans.  Collected Dialogues of Plato.  E.  Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 308-352.
  • Plato.  1969.  Theaetetus.  F.M. Cornford, Trans. Collected Dialogues of Plato.  E.  Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 957-1017.
  • Plato.  1969.  Sophist.  F.M. Cornford, Trans.  Collected Dialogues of Plato.  E. Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 845-919.
  • Plato.  1969.  Philebus.   R. Hackforth, Trans.  Collected Dialogues of Plato.  E.  Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 1086-1150.
  • Plato.   1969.   Timaeus.   Benjamin Jowett, Trans.  Collected Dialogues of Plato.  E. Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 1151-1211.
  • Plato.  1969.  Laws.  A.E. Taylor, Trans.  Collected Dialogues of Plato.  E. Hamilton and H. Cairns, Ed.  Princeton:  Princeton University Press. Pp. 1225-1516.
  • Plato.  1997.  Symposium.  Alexander Nehamas and Paul Woodruff, Trans.  Plato: Complete Works.  John Cooper, Ed.  Indianapolis/Cambridge: Hackett. Pp. 457-505.

b. Secondary Sources

  • Allen, Reginald E.  1966.  Introduction to Greek Philosophy: Thales to Aristotle.  Ed. Reginald E. Allen.  New York: The Free Press.  Pp. 1-23.
  • Alexander, S. I.  1920.  Space, Time, and Deity, 2 vols. London: Macmillan.
  • Bergson, Henri.  1983.  Creative Evolution.  A. Mitchell, Trans.  Lanham, MD: University Press of America.
  • Brandwood, Leonard.  1992.  “Stylometry and Chronology.”  The Cambridge Companion to Plato.  Cambridge: Cambridge University Press.  Pp. 90-120.
  • Brisson,  Luc, and Meyerstein, F. Walter.    1995.  Inventing the Universe: Plato’s Timaeus, the Big Bang, and the Problem of Scientific Knowledge. Albany: State University of New York Press.
  • Buchsbaum, Ralph.  1957.  Animals Without Backbones. Vol. I.  Middlesex, England: Penguin Books.
  • Burnet, John.  1971.  Early Greek Philosophy.   London: Adam and Charles Black.
  • Cairns, Huntington.  1961.  Introduction to The Collected Dialogues of Plato.  Princeton: Princeton University Press. Pp. xiii-xxv.
  • Cassirer, Ernst.  1979.  The Individual and the Cosmos in Renaissance Philosophy.  Trans. Mario Domandi.  Philadelphia: University of Pennsylvania Press.
  • Cornford, F.M.  1965.   From Religion to Philosophy:  A Study in the Origins of Western Speculation.  New York: Harper and Row.
  • Cornford, F.M.  1966.  Plato’s Cosmology:  The Timaeus of Plato.  The Liberal Arts Press.
  • Cornford, F.M.  1997.  Introduction to Plato:  Timaeus.  Indianapolis: Hackett.  Pp. ix-xv.
  • Carone, Gabriela Roxana.  2005.  Plato’s Cosmology and its Ethical Dimensions.  Cambridge: Cambridge University Press.
  • Clayton, Philip, and Davies, Paul, Eds.  2006.   The Re-Emergence of Emergence: The Emergentist Hypothesis from Science to Religion.  Oxford: Oxford University Press.
  • Furth, Montgomery.  1988.  Substance, Form, and Psyche: An Aristotelian Metaphysics.  Cambridge: Cambridge University Press.
  • Hamilton, Edith.  1964.  The Greek Way.  New York: The W.W. Norton Co.
  • Heisenberg, Werner.  1958.  Physics and Philosophy.   London: George Allen and Unwin.
  • Johansen, Thomas Kjeller.  2004.  Plato’s Natural Philosophy: A Study of the Timaeus-Critias.   Cambridge: Cambridge University Press.
  • Kim, Jaegwon, Beckermann, Ansgar, and Flohr, Hans, Eds.  1992.  Emergence or Reduction? Berlin: De Gruyter.
  • Knowles, David.  1989.  Evolution of Medieval Thought.  United Kingdom: Longman.
  • Kraut, Richard.  1992.  “Introduction to the Study of Plato.”   The Cambridge Companion to Plato.  Cambridge: Cambridge University Press.  Pp. 1-50.
  • Leibniz, G.W.  1968.  “Principles of Nature and Grace.”  Leibniz: Philosophical Writings.  Trans. Mary Morris.  New York: Dutton & London: Dent.  Pp. 21-31.
  • Lovejoy, A.O.  1964.  The Great Chain of Being.  Cambridge: Harvard University Press.
  • McDonough, Richard.  1991. “Plato’s not to Blame for Cognitive Science.”  Ancient Philosophy. Vol. 11.  1991.  Pp. 301-314.
  • McDonough, Richard.  2000.  “Aristotle’s Critique of Functionalist Theories of  Mind.”  Idealistic Studies.  Vol. 30.  No. 3.  pp. 209-232.
  • McDonough, Richard.  2002.  “Emergence and Creativity: Five Degrees of Freedom” (including a discussion with the editor).  In Creativity, Cognition and Knowledge.  Terry Dartnall, Ed.  London:  Praeger.  Pp. 283-320.
  • Meinwald, Constance C.  1992.  “Goodbye to the Third Man.”  The Cambridge Companion to Plato.  Cambridge: Cambridge University Press.  Pp. 365-396.
  • Morgan, Lloyd.  1923.  Emergent Evolution. London: Williams and Norgate, 1923.
  • Mourelatos,  A.  1986.  “Quality, Structure, and Emergence in Later Pre-Socratic Philosophy.”  Proceedings of the Boston Colloquium in Ancient Philosophy.  2,  Pp. 127-194.
  • Muirhead, John H.  1931.  The Platonic Tradition in Anglo-Saxon Philosophy.  New York: The Macmillan Company & London: George Allen & Unwin.
  • Naess, Arne.  1990.  Ecology, Community, Lifestyle: Outline of an Ecosophy.  Cambridge: Cambridge University Press.
  • Nagel, Ernst.  1979.  The Structure of Science.  Indianapolis: Hackett.
  • Patterson, Richard.  1985.  Image and Reality in Plato’s Metaphysics.  Indianapolis: Hackett.
  • Prior, William J.  1985.  The Unity and Development of Plato’s Metaphysics.  LaSalle, Illinois: Open Court.
  • Putnam, Hilary. 1981.  Reason, Truth, and History.  Cambridge: Cambridge University Press.
  • Putnam, Hilary.  1990.  Realism with a Human Face.  Cambridge: Harvard University Press.
  • Robin, Leon.  1996.  Greek Thought and the Origins of the Scientific Spirit.  London and New York: Routledge.
  • Robinson, John Mansley.  1968.  An Introduction to Early Greek Philosophy.  Houghton Mifflin College Division.
  • Russell, Bertrand.  1945.  A History of Western Philosophy.  New York: Simon & Schuster.
  • Sallis, John.   1999.  Chorology:  On Beginning in Plato’s Timaeus.  Indianapolis: Indiana University Press.
  • Sellars, Wilfrid.  1967.  “Raw Materials, Subjects, and Substrata.”   Philosophical Perspectives.   Springfield, Illinois:  Charles C. Thomas, Publisher.  Pp. 137-152.
  • Taylor, A.E.  1928.  A Commentary on Plato’s Timaeus.  Oxford: Oxford University Press.
  • Vlastos, Gregory. 1975.  Plato’s Universe.  Seattle: University of Washington Press.
  • Whitehead, A. N.  1978.  Process and Reality (Corrected Edition).   New York: Macmillan and London: Collier Macmillan.
  • Wittgenstein, Ludwig.  1966.  Tractatus Logico-Philosophicus.  Trans. D. F. Pears and B. F. McGuinness.  New York: Routledge and Kegan Paul Ltd.
  • Wulf, Maurice De.  1956.  Scholastic Philosophy.   New York: Dover Publications.

Author Information

Richard McDonough
rmm249@cornell.edu
Arium Academy and James Cook University
Singapore

Berlin Circle

The Berlin Circle was a group of philosophers and scientists who gathered around Hans Reichenbach in the late 1920s. Among its other members were K. Grelling, C. G. Hempel, D. Hilbert, and R. von Mises. The Berlin Circle’s name was Die Gesellschaft für Empirische Philosophie (Society for Empirical Philosophy). It joined up with the Vienna Circle; together they published the journal Erkenntnis, edited by both Rudolf Carnap and Hans Reichenbach, and they organized several congresses on scientific philosophy, the first of which was held in Prague in 1929.

Members of the Berlin Circle were particularly active in analyzing contemporary physics, especially the theory of relativity, and in developing the frequency interpretation of probability. After the rise of Nazism, several members of the Berlin Circle emigrated from Germany. Reichenbach moved to Turkey in 1933 and then to the USA in 1938; Hempel to Belgium in 1934 and to the USA in 1939; Grelling was killed in a concentration camp. Hence the Berlin Circle was dispersed.

Author Information

Mauro Murzi
Italy

The Golden Rule

The most familiar version of the Golden Rule says, “Do unto others as you would have them do unto you.”  Moral philosophy has barely taken notice of the golden rule in its own terms despite the rule’s prominence in commonsense ethics. This article approaches the rule, therefore, through the rubric of building its philosophy, or clearing a path for such construction. The approach reworks common belief rather than elaborating an abstracted conception of the rule’s logic. Working “bottom-up” in this way builds on social experience with the rule and allows us to clear up its long-standing misinterpretations. With those misconceptions go many of the rule’s criticisms.

The article notes the rule’s highly circumscribed social scope in the cultures of its origin and its role in framing psychological outlooks toward others, not directing behavior. This emphasis eases the rule’s “burdens of obligation,” which are already more manageable than expected in the rule’s primary role, socializing children. The rule is distinguished from highly supererogatory rationales commonly confused with it—loving thy neighbor as thyself, turning the other cheek, and aiding the poor, homeless and afflicted. Like agape or unconditional love, these precepts demand much more altruism of us, and are much more liable to utopianism. The golden rule urges more feasible other-directedness and egalitarianism in our outlook.

A raft of additional rationales is offered to challenge the rule’s reputation as overly idealistic and infeasible in daily life. While highlighting the golden rule’s psychological functions, doubt is cast on the rule’s need for empathy and cognitive role-taking. The rule can be followed through adherence to social reciprocity conventions and their approved norms. These may provide a better guide to its practice than the personal exercise of its empathic perspective. This seems true even in novel situations for which these cultural norms can be extrapolated. Here the golden rule also can function as a procedural standard for judging the moral legitimacy of certain conventions.

Philosophy’s two prominent analyses of the golden rule are credited, along with the prospects for assimilating such a rule of thumb to a universal principle in general theory. The failures of this generalizing approach in preserving the rule’s distinct contours are detailed, however. The pivotal role of conceptual reductionism in mainstream ethical theory is discussed, noting that other forms of theorizing are possible and are better fit to rules of thumb. Circumscribed, interpersonal rationales like the golden rule need not be viewed philosophically as simply yet-to-be-generalized societal principles. Instead, the golden rule and its related rationales-of-scale may need more piecemeal analyses, perhaps know-how models of theory, integrating algorithms and problem-solving procedures that preserve their specialized roles and scope. Neither mainstream explanatory theory, hybrid theory, nor applied ethics currently focuses on such modeling. Consequently, the faults in golden-rule thinking, as represented in general principles, may say less about inherent flaws in the rule’s logic than about shortfalls in theory building.

Finally, a radically different perspective is posed, depicting the golden rule as a description, not prescription, that portrays the symptoms of certain epiphanies and personal transformations observed in spiritual experience.

Table of Contents

  1. Common Observations and Tradition
  2. What Achilles Heel?
  3. Sibling Rules and Associated Principles
  4. Golden Role-Taking and Empathy
  5. The Rule of Love: Agape and Unconditionality
  6. Philosophical Slight
  7. Sticking Points
  8. Ethical Reductionism
  9. Ill-Fitting Theory (Over-Generalizing Rules of Thumb)
  10. Know-How Theory (And Medium-Sized Rationales)
  11. Regressive Default (Is Ancient Wisdom Out-Dated?)
  12. When is a Rule Not a Rule, but a Description?
  13. References and Further Reading

1. Common Observations and Tradition

“Do unto others as you would have them do unto you.”  This seems the most familiar version of the golden rule, highlighting its helpful and proactive gold standard. Its corollary, the so-called “silver rule,” focuses on restraint and non-harm: “do nothing to others you would not have done to you.” There is a certain legalism in the way the “do not” corollary follows its proactive “do unto” partner, in both Western and Eastern scriptural traditions. The rule’s benevolent spirit seems protected here from being used to mask unsavory intents and projects that could be hidden beneath. (It is sobering to encounter the same positive-negative distinction, so recently introduced to handle modern moral dilemmas like abortion, thriving in 500 B.C.E.)

The golden rule is closely associated with Christian ethics, though its origins go further back and it graces Asian culture as well. Normally we interpret the golden rule as telling us how to act. But in practice its greater role may be psychological, alerting us to everyday self-absorption and to our failure to consider our impacts on others. The rule reminds us also that we are peers to others who deserve comparable consideration. It suggests a general orientation toward others, an outlook for seeing our relations with them. At the least, we should not impact others negatively, treating their interests as secondary.

This is a strongly egalitarian message. When first conveyed, in the inegalitarian social settings of the ancient Hebrews, it could have been a very radical message. But it likely was not, since it appears in scripture as an obscure bit of advice among scores of rules with greater point and stricture, given far more emphasis. Most likely the rule also assumed existing peer-conventions for interacting with clan-members, neighbors, co-workers, friends and siblings. In context, the rule affirmed a sentiment like “We’re all Jews here,” or “all of sect Y.” Only when this rule was made a centerpiece of social interaction (by Jesus or Yeshua, and fellow John-the-Baptist disciples) did it become a more radical message, crossing class, clan and tribal boundaries within Judaism. Of special note is the rule’s application to outcasts and those below one’s station—the poor, lepers, Samaritans, and certain heathens (goyim). Yeshua apparently made the rule second in importance only to the First Commandment of “the Father” (Hashem). This was to love God committedly, then love thy neighbor as thyself, which raised the rule’s status greatly. It brought social inclusivity to center stage, thus shifting the focus of Jewish ethics generally. Yet the “love thy neighbor” maxim far exceeds the golden rule in its moral expectations. It stresses loving identification with others, while the golden rule merely advises equal treatment.

Only when the golden rule was applied across various cultures did it become a truly revolutionary message. Its “good news,” spread by evangelists like Paul (Saul of Tarsus), fermented a consciousness-shift among early Christians, causing them actually to “love all of God’s children” equally, extending to the sharing of all goods and the acceptance of women as equals. Perhaps because such love and sharing radically departed from Jewish tradition, they were soon replaced with standard patriarchy and private property. The rule’s socialism might have fermented social upheaval in occupied Roman territories had it actually been practiced on a significant scale, which may help explain its persecution in that empire. Most likely the golden rule was not meant for such universalism, however, and cannot feasibly function on broad scales.

The Confucian version of the golden rule faced a more rigid Chinese clan system, outdoing the Hebrews in social-class distinctions and in the sense that many lives are worthless. More, Confucius himself made the golden rule an unrivaled centerpiece of his philosophy of life (The Analects, 1962). The rule, Kung-shu, came full-blown from the very lips and writings of the “morality giver” and in seemingly universal form. It played a role comparable to God’s will, in religious views, to which the concept of “heaven” or “fate” was a distant second. And Confucius explicitly depicted the “shu” component as human-heartedness, akin to compassion. The Confucian followers who succeeded Mencius, down to the neo-Confucians, however, emphasized the Kung component, or ritual righteousness. They increasingly interpreted the rule within the existing network of Chinese social conventions. It was a source of cultural status quoism—to each social station, its proper portion. Eventually, what came to be called the Rule of the Measuring Square was associated with up to a thousand ritual directives for daily life, encompassing etiquette, propriety and politeness within the array of traditional relationships and their strict role-obligations.  The social status quo in Confucian China was anything but compassionate, especially in the broader community and political arenas of life.

In traditional culture, the “others” in “do unto others” was interpreted as “relevant others,” which made the rule much easier to follow, if far less egalitarian or inspiring. One’s true peers were identified only within one’s class, gender, or occupation, as well as one’s extended family members. Generalizing peer relations more broadly was unthinkable, apparently, and was therefore not read into the rule’s intent. Confucius spoke of searching in vain his whole life for one person who could practice Kung-shu for a single day. But clearly he meant one “man,” not person, and one “gentleman” of the highest class. This classism was a source of conflict between Confucianism and Taoism, where the lowest of the low were often depicted as spiritual exemplars.

For the golden rule to have become so pervasive across historical epochs and cultures suggests a growing suspicion of class and ethnic distinctions—challenging ethnocentrism. This trend dovetails nicely with the rule’s challenge to egocentrism at the personal level.  The rule’s strong and explicit egalitarianism has the same limited capture today as it did originally, confined to distinctly religious and closed communities of very limited scope. It is unclear that devout, modern-day Jews or Christians vaunt strong equality of treatment even as an ideal to strive toward. We may speak of social outcasts in our society as comrades, and recognize members of “strange” cultures and unfriendly nations as “fellow children of God.” But we rarely place them on a par with those closer by or close to us, nor treat them especially well. Neither is it clear, to some, that doing so would be best. Instead, the rule’s original small scope and design is preserved, limited to primary groups at most.

Biblical scholars tend to see Yeshua’s message as meant for Jews per se, extending to the treatment of non-Jews yes, but as Jews should treat them. And this does not include treating them as Jews. The golden rule has a very different meaning when it is a circumscribed, in-group prescription. In this form, its application is guided by hosts of assumptions, expectations, traditions, and religious obligations, recognized like-mindedly by “the tribe.” This helps solve the ambiguity problem of how to apply the rule within different roles: parents dealing with children, supervisors with rank-and-file employees, and the like.

2. What Achilles Heel?

When considering a prominent view late in its history, its paths of development also merit analysis. How were its uses broadened or updated over time, to fit modern contexts? Arguably the Paulist extension of the rule to heathens was such a development, as was the rule’s secularization. Within moral theory, the rule’s philosophical recasting as a universal principle qualifies most clearly as such a development. Just as important are ways the rule has been misconstrued and misappropriated, veering from its design function.

We must acknowledge that the golden rule is no longer taken seriously in practice or even aspiration, but merely paid lip service. The same feature that makes the golden rule gleam—its idealism—has dimmed its prospects for influence. The rule is simply too idealistic; that is its established reputation. Note that over-idealism has not discredited Kantian or Utilitarian principles, by contrast, because general theory poses conceptual objects, idealized by nature. They focus on explanation in principle, not application in the concrete. But the golden rule is to be followed, and following the golden rule requires a saintly, unselfish disposition to operate, with a utopian world to operate in. This is common belief. Cloistered monasteries and spiritual communes (Bruderhofs, Koinonia) are its hold-out domains. But even as an ideal in everyday life, the rule is confined to preaching, teaching, and window dressing.  Why then make it the object of serious analysis? The following considerations challenge the rule’s blanket dismissal in practice.

First, the silver component of the golden rule merely bids that we do no harm by mistreating others—treating them the way we would not wish to be treated. There is a general moral consensus in any society on what constitutes harms and mistreatments, wrongs and injustices. So to obey this component of the golden rule is something we typically expect of each other, even without explicitly consulting a hallowed precept. Adhering specifically to the golden rule’s guidelines, then, raises no special difficulty. Its silver role is mostly educative in this context, helping us understand why we expect certain behavior from each other. “See how it feels” when folk violate expectations?

The gold in the rule asks more from us, treating people in fair, beneficial, even helpful ways. As some have it, we are to be loving toward others, even when others do not reciprocate, or in fact mistreat us.  This would be asking much. But despite appearance, the golden rule does not ask it of us. Nothing about love or generosity is mentioned in the rule, nor implied, much less letting oneself be taken advantage of.  Loving thy neighbor as oneself, or turning the other cheek, are distinct precepts—distinct from the golden rule and from each other. These rules are not stated or identified with the golden rationale in biblical or Confucian scripture. Nor are they illustrated together, say in the parables.

We may wish we loved everyone and that everyone loved us, but a wish is not a prescription or command—“Do unto.” And we cannot feasibly love on demand, either in our hearts or actions. (Can we learn to love others as ourselves over a lifetime?) But we can certainly consider how we need or prefer to be treated. And we can treat others that way on almost all occasions, on the spot, without needing to undergo a prior regimen of prayer, meditation, or working with the poor.

As noted, the golden rule may deal more with being other-directed and sensitive than with being proactive. Leading with the word “Do” does not necessarily signal the rule’s demand for action any more than parents saying to teenagers, “Be good,” when they go on a date. Whether they are (should be) a certain way isn’t the point. There is no need for them to engage their character and its traits, for example. The focus here is on what they do, actually, and should not do. Likewise with “Do your part” or “Don’t get in the way”: these are general directives of how to orient ourselves on certain occasions. They prime us to take certain sorts of postures, showing a readiness to cooperate or to ask others if we are being a pest, though we may not succeed even if we try. They prime us to apologize if in fact we do get in the way, but maybe not more than that.

No altruism (self-sacrifice) is needed for golden-ruling in this psychological form of adopting a certain “other-orientation,” in “the spirit of” greater awareness toward others. Usually one bears no cost to engage empathetic feelings, if that is what is needed. One wonders whether an implicit sense of this merely attitudinal “spirit” of the golden rule helps account for why we do not practice it—no hypocrisy required. If so, it would allow an uplifting turnaround in our moral self-understanding and self-criticism.

Conjuring up certain outlooks or orientations is an especially feasible task when provided a golden recipe for how—by role-taking, for example, or empathy or adherence to reciprocity norms. Once our heart goes out to others, following its spontaneous pull hardly requires going the extra foot, much less a mile in effort for anyone. We simply do what we feel, as much as the pull tugs us to. The truth is that we interact largely in words, and kindly words are free. We’re often not occupied when called upon to respond to others, so that responding lushly is easy—there is no hefty competition for our time or interest. Consider the sort of “do-unto” that can make a person’s week: “I wanted to mention how much I appreciate your support during this transition time for me. It’s noticeable, and it means a lot.”

It pays moral philosophy to think the golden rule through in such actual everyday circumstances before imagining the rule’s costs in principle, or worst-case scenarios. Where school systems routinely include some degree of moral education in their curricula, the case for golden-rule feasibility in a society is even stronger. And, arguably, most children already get some such training in school and at home implicitly.

The same reduced-effort scenario holds when sizing up moral exemplarism, often associated with the golden rule, and with living its sibling principles. Ministering to the poor and ill often involves the routine work of truckers or dock workers, loading canned food or medical supplies to be hauled away, or hauling it oneself. It may involve primitive nursing or cooking, and point-of-contact service work routinely taken on as jobs by non-exemplars. These are not seen as careers in saintly heroism. Pursuing such work as a mission, not an occupation, takes significant commitment and gumption. But many exemplars report gradually falling into their roles, without really noticing or thinking clearly (David Fattah of Umoja House), or of being dragged into “the life” by others (Andrei Sakharov and Martin Luther King, for example) (see Colby and Damon 1984, Oliner and Oliner 1988, The Noetics Institute “Creative Altruist” Profiles).  More, everyday exemplars report doing their work out of an atypical outlook on society and their relation to it. This comes spontaneously to them, as ours comes to us. No additional, much less extraordinary, effort is required. This seems the point of Mother Teresa’s refrain to those asking how she could possibly work with lepers and the dying: “Come see.”

If the golden rule is designed for small-group interaction, where face-to face relations dominate, a failure to reciprocate in kind will be noticed. It cannot be hidden as in anonymous, institutionally-mediated cooperation at a distance. Subtle pressures will be felt to conform with this group norm, and subtle sanctions will apply to those who take more than they give. Conforming to norms in this setting will be easier than usual, as well, since in-groups attract the like-minded. And in such contexts requiring extraordinarily helpful motivations and actions from others would be seen as unfair.

By assessing the golden rule outside of such contexts we miss its implicit components, the network of mutual understandings, and established community practices that make its adherence feasible and comprehensible. Such considerations are also crucial in determining the adequacy of the golden rule. The shortfalls that have been identified by the rule’s detractors seemingly arise when the rule is over-generalized and set to tasks beyond its design. If its function is primarily psychological, its conceptual or theoretical faults are not key. If its design is small-scale, fit to primary relations, its danger of allowing adherents to be stepped on is not key. The rule should not be used where those around you would let such things happen or could not see them happening. And if the rule’s guidance is judged too vague to follow reliably, we should look to the myriad expectations and implicit assumptions that go with it to see if they supply needed precision and clarity.

The golden rule is not only a distinct rationale within a family of related rationales. It is a general marker, the one explicit component in networks of more implicit rationales and specific prescriptions. Teachings that abstract the rule from its implicit corollaries and situational expectations fail to capture what the rule even says. Theoretical models of the rule that further abstract the rule’s logic from its substance, content or process, likely mutilate it beyond recognition.

“How would you feel if?” puts the golden rule’s peer spirit in a mother’s teaching hands when urging her egocentric, but sensitive, child to consider others. As a socializing device, the rule helps us identify our roles within a mutually respectful and cooperating community.  How well it accomplishes this socializing task is another crucial mark of its adequacy, perhaps the most crucial. The prospect of first engaging this rule typically captures childhood imaginations, like acquiring many highly useful social skills (Fowler 1981, Kohlberg 1968, 1982).

Putting these considerations together allows us to identify where the golden rule may be operating unnoticed as a matter of routine—in families, friendships, classrooms and neighborhoods, and in hosts of informal organizations aiming to perform services in the community. Isn’t it in fact typical in these interactions that we treat each other reciprocally, as each other would wish, want, choose, consent or prefer?

3. Sibling Rules and Associated Principles

The foregoing appeals for feasibility are not primarily defenses of the golden rule against criticism. They are clarifications of the rule that expose misconceptions central to its long-standing reputation. We now question, also, the much-admired roles of empathy and role-taking in the golden rule, which can ease adherence to it but are not necessary. The rule is certainly not a guideline for the process of empathizing or role-taking, as most believe and welcome. However, empathy can help apply the rule, and the rule can provide many “teaching moments” for promoting and practicing empathy, which is advantageous. But distinguishing empathy from the rule’s function also is fortunate for the empathetically challenged among us, and those not able to see others’ sides. Their numbers seem legion. The golden rule can be adhered to in other ways.

The golden rule is much-reputed for being the most culturally universal ethical tenet in human history. This suggests a golden link to human nature and its inherent aspirations. It recommends the rule as a unique standard for international understanding and cooperation—noble aims, much-lauded by supporters. In support of the link, golden logic and paraphrasing have been cited in tribal and industrialized societies across the globe, from time immemorial to the present. This supposedly renders the rule immune to cultural imperialism when made standard for human rights, international law, and the spreading of western democracy and education—a prospect many welcome, while others fear it. Note that if the golden rule is truly distinct from related principles such as loving thy neighbor as thyself and feeding the poor, these cherished claims for the rule are basically debunked.

Analysis of this endless stream of sightings shows no more than a family resemblance among distinct rationales (see the golden rule website in the references below). Some rationales deal with putting oneself in another’s place, others with viewing everyone as part of one human family, or divine family. Still others promote charity, forgiveness and love for all. Culturally, the golden rule rationale is mostly confined to certain strands of the Judeo-Christian and Chinese traditions, which are broad and lasting, at least until recently, but hardly universal (see Wattles 1996).

The golden rule’s distinctness, here, is seen relative to its origins. The original statement of the golden rule, in the Hebrew Torah, shows a rule, not an ethical principle, much less the sort of universal principle philosophers make of it. It is one of the simpler and most briefly stated dos and don’ts among long lists of particular rules in Leviticus (XIX: 10-18). These directives concern kosher eating, animal sacrifice procedures, threads that can’t be used together in woven clothing, and even the cleansing of “impurity” (such as menstruation) by bringing pigeons and doves to a rabbi for ceremonial disposition. If one blinks, or one’s mind wanders, one will miss it, its golden gleam notwithstanding. And even a devout Jew is likely to lose concentration when perusing these outdated, dubious and less than riveting observations.

No fair reading of Leviticus XIX: 18 would term its statement the golden rule, not in our modern sense, first stated in Matthew 7:12. For in Leviticus the commandment is merely not to judge an offender by his offense, and thereby hold a grudge against a fellow Jew for committing it. But love him as yourself. The latter, a crucially different principle, is meant here differently than we now interpret it as well. It perhaps can be rendered as “Remember that you offend fellow Jews also and so you are like the offender on other occasions.”

Seen amid such concrete and mutually understood practices of a small tribe, the golden rule poses no role-taking test. Any community member can comply simply by knowing which reciprocity practices are approved or frowned on. Recollecting what it was like to be on the receiving end of others’ slights or benefits also can help. But that would mean taking one’s own perspective, not another’s, in the past. Doing so is not essential to “golden-ruling” however, nor likely reliable. If a kind of imaginative role-playing is contemplated, one need only conjure up images of community elders frowning or fawning over a variety of choice options and everyday practices.

Neither in eastern nor western traditions did the golden rule shine alone. Thus viewing and analyzing it in isolation misses the point. The golden rule’s relation to associated sibling principles altered its meaning and purpose in different settings. The most prominent standard-bearer for this family of rules seems to have been “loving thy neighbor as thyself.” This “royal law” is a very different sort of prescription from the golden rule, foreseeing a variety of extraordinarily benevolent practices born of extraordinary identification with others. In Judaism, benevolence usually meant helping family members and neighbors primarily, focusing on one’s kind—one’s particular sect. Generosity meant hospitality to the stranger or alien as well, remembering that the Jews were once strangers in a strange land. Alms were given to the poor; crops were not gleaned from the edges of one’s farm-field so that the poor might find sustenance in the remains. Farmland was to lie fallow each seventh year (like the Sabbath when God rested) so that, in part, the poor then could find rest there, and room to grow (Deuteronomy XV: 7, Leviticus XXIII: 22, XXV: 25, 35).

Turning the other cheek (Luke 6:29), loving even one’s enemy (Matthew 5:44) and not turning away when anyone asks of you (5:42)—these go well beyond normal charity or benevolence, even more than identifying with our neighbor. What neighbor would strike you, or steal from you, taking your cloak so that you must give him your coat also (Matthew 5:40)? Such practices are not at all required or asked of the Confucian “gentleman,” whose Kung-shu practice is more about respect for elders and ancestors, and fulfilling hosts of family and community responsibilities.

With regard to Yeshua’s teachings on feeding the hungry, sheltering the homeless, or praying for those who shamefully use and abuse you, he summarily urged that followers “be perfect, even as your Father in Heaven is Perfect” (5:48). This far exceeds what the golden rule asks—simply that we consider others as comparable to us and consider our comparable impacts on them. These teachings do not represent fair or equal reciprocity in fact. Ask how you would wish to be treated if you were a shameful abuser or even a homeless person. There is sufficient testimony revealing that many abusers and homeless people do not at all want to be shown charity, but want condemnation or punishment in the first case, and to be left alone to fend for themselves in a “street community” in the second. They feel this is what they deserve. (To abuse-counselors and homeless shelter workers, this goes without saying.) What the abusive and homeless should want, or calculate as their desert, may be something different. But golden-rule role-taking will not tell.

There is one area where the golden rule extends too far, directly into the path of turning the other cheek. When we are seriously taken advantage of or mistreated, the rule bids that we treat the offender well nonetheless. We are to react to unfair treatment as if it were fair treatment, ignoring the moral difference. Critics jump on this problem, as they should, because the golden rule seems designed to highlight such cases. Here is where the rule most contrasts with our typical, pre-moral reaction, while also rising above (Old Testament) justice. In the process, it promotes systematic and egregious self-victimization in the name of self-sacrifice. Yet is self-sacrifice in the name of unfairness to be admired? Benevolence that suborns injustice, rather than adding ideals to it, seems morally questionable. Moreover, under the golden rule, both victimization and self-victimization seem endless, promoting further abuse in those who have a propensity for it. No matter how much someone takes advantage of us, we are to keep treating them well.  Here the golden rule seems simply unresponsive. Its call to virtuous self-expression is fine, as is its responsiveness to the equal personhood of the offender. But it neither addresses the wrong being committed, nor that part of the perpetrator to be faulted and held accountable. Interpersonally, the rule calls for a bizarre response, an almost obtuse or incomprehensible one. While a “forgiving” response may be preferable to retribution, why should just desert be completely ignored? It can certainly be integrated into the high-road alternative. In this type of case, the golden rule sides with its infeasible siblings. It bids us to play the exemplar of “new covenant” morality—the morality of love for all people as people, or as children of God. And this asks too much.

These criticisms have merit, but can be mitigated. When dealing with cases of unfairness and abuse, critics assume the golden rule requires us to “take the pain” uncomplainingly. There is no such proviso in the rule. As the Gandhi-King method has shown, it is perfectly legitimate to fault the action—even condemn the action—while not condemning the person, or taking revenge. The practice of abusing or taking advantage of someone does not define its author as a person after all, even when it is habitual. The wrongs anyone commits do not eradicate his good deeds, nor his potential for reform. And the golden rule has us recognize that. But the spirit of silent self-sacrifice is found more in the sibling principles than the golden rule, and should be kept there. In the current case we can readily respond to our oppressor by calling a spade a spade—“You took advantage of me, I noticed.” That would be a first response. “You keep taking advantage of me: that was abusive. I don’t like it; it’s not OK with me.” The abuser responds, “It seems like you like it. Why else would you take it and respond as if it’s OK?” We reply, “Why should I let your abuse drag me down to your level, compounding your offense?”

There are nice and not so nice ways to make this point. If Yeshua is our guide, not so nice approaches are acceptable. To treatment from those known as most righteous in Jerusalem, for example, he responded, “Woe to you Scribes and Pharisees, hypocrites all…you are like whited sepulchers, all clean and fair without, and inside filled with dead men’s bones and all corruption…yours is a house of desolation, the home of the lizard and the spider…Serpents, brood of vipers, how can any of you escape damnation?” (Matthew 23:13-39, as insightfully condensed by Zeffirelli.) If this be love, then it is certainly hard love, especially when we note that Yeshua faults the person here, not just the act.

We must also place these cases in social context to see how far the golden rule bids us go. If we are sensible, and have friends, it is unlikely we will place ourselves in the vicinity of serious abusers, or remain there. The social convention of avoiding those who hurt us must also figure into the rule’s understanding. The defense our friends will put up for us against abuse must figure into the rule’s feasibility as well.

Most morally important, these abuse cases do not illustrate the golden rule’s standard application—quite the contrary. Fair dealing with unfairness and abuse, in particular, calls for special principles of rectification, including punishment, recompense or reform. When used in this context, without alteration, the golden rule poses an alternative to the typical ways these practices are performed. But it then functions as a special principle of this sort. Among its aims, the rule certainly seems bent on goals like rectification, recompense and reform, but indirectly. Arguably the rule has us exemplify the right path—the path the perpetrator might have taken, but did not, thus demonstrating its allure, its superiority. This includes, for observers in the community, the superiority of fairness over retribution (“‘Vengeance is mine,’ saith the Lord.”) Teaching this lesson is aimed at raising moral consciousness, especially in the perpetrator. As such, it resembles the practice of “bearing unmerited suffering” in the Gandhi-King approach, aimed at piquing moral conscience in those oppressing us (King 1986).

Ideally, a perpetrator will think better of his practice, apologizing for past wrongs and making up for them. At least it might move him to abandon this sort of practice. And if moral processes are not awakened, then at least placing the offender in a morally disadvantageous position within the group will bring pressures to bear on his behavior. Exemplifying fairness in this way also demonstrates putting the person first, holding his status paramount relative to his actions, and our sense of offense.

Exemplifying a moral high road, so as to edify others, does not show passivity or weakness. It is normally communicated in a strong, positive pose. Standing above a vengeful or masochistic temptation uplifts the supposed victim rather than treading him further down. Indeed, its courageous spirit is key in working its effect, an effect achieved by Gandhi, King and legions of followers under the most morally hostile conditions. Aside from giving abusers pause, high-minded responses bring loud outcries of protest in one’s cause from outside observers, making reform prudent and practically necessary.

Again, these realities of the rule can only be seen in context, looking into the subtleties of interpersonal relating, communicated emotion, performance before a social audience and the like. The mere logic or golden principle of the thing is silent on them. The same holds for the less feasible sibling rules of the golden rule family, from giving to the poor to turning the other cheek. Trying them out makes a world of difference in understanding what they say. Consider an experiment with trying to “say yes to all who ask,” substituting “yes” generally where we routinely say “no” or “maybe.” Doing so may add much less than expected to our load because, first, it makes us more interested in being kinder, which is a rewarding experience, as it turns out. Second, we find that people do not generally ask much, especially when they see us at risk of being taken advantage of for our exceptional good will. Finding simple ways to make the most needy more self-reliant—such as simply encouraging them to be so—also may lighten the helping load. The good it does may be exceptional.

But what of the lingering “doormat problem” for those who are especially dependent and masochistic, all but inviting victimization from abusers? No full mitigation may be possible here. The golden rule, if not exacerbating the problem in practice, at least serves to legitimize it. Its rationale has been exploited by many, including some Christian churches and clergy who suborn victimization as a lifestyle, especially for wives and mothers. A rule cannot be responsible for those who misuse it, or fail to grasp its purposes. But those sustaining the rule bear a responsibility to clarify its intent. It certainly would be better if the rule itself made its intentions clear or included illustrations of proper use. Currently, it relies on the chance intervention of moral teachers or service organizations—those opposed to, say, domestic violence. Even Yeshua’s disciples complained that the parables, supposedly illustrating tenets like the golden rule, were perplexing. Confucian writing was definitely not geared to rank-and-file Chinese, much less children learning their moral lessons. This is an intolerable shortfall for an egalitarian socialization tool.

Consider a second corollary (the “copper” rule?) that might address such difficulties: “When misused by those you do unto fairly, do not quietly bear the offense, instead defending and deflecting it with as much understanding as can be summoned.” Notice that defending does not conflict with praying for those who shamelessly abuse us. (The “summoning understanding” proviso is meant to forestall reversion to a more pragmatic alternative such as “by any means necessary.”)

4. Golden Role-Taking and Empathy

“Putting oneself in the other guy’s place” is yet another distinct principle, as is “walking a mile in the other guy’s moccasins” (the Navaho version). The first involves taking a perspective, the second, gaining similar life experience in an ongoing way. Notice that “loving thy neighbor as thyself” requires neither of these operations, presuming that we know how to love ourselves and need only extend that love to someone else. But of course we may not know how to love ourselves, or how to do so in the right way. The same can be said of identifying, role-taking or learning from another’s type of experience. Given that we may not be loving enough to ourselves, loving our neighbor is best accomplished by referring to prevailing standards. Our own proclivities or values are certainly not the final word. Just as with acting as we’d have others act toward us, loving thy neighbor concerns how we’re supposed to love others, as we should love ourselves. We must consult the community, its ethical conventions or scriptures (including Kantian or Utilitarian scriptures). The last word comes through a critical comparison of these conventions, in experience, with our proclivities and values.

Neither we nor our neighbors likely think it is legitimate, or even kind, to give a thief additional portions of our property. Doing so might well be masochistic, or even egotistical, thinking about our own character development most, thereby exacerbating crime and endangering the community. If we were the thief, we might very well not think that we should be given more of a victim’s property than we stole. Instead, perhaps, we might wish to steal it. Role-taking cannot guide us here.  In fact, it could easily lead us astray in various misguided directions. Some would consider it ideal to be unconcerned with property because it puts spiritual concerns over materialism, or it puts charity before just desert. Others could make a case for better balancing the competing principles involved. What good does role-taking do here? And how can it work in a non-relativistic way, where everyone taking the other’s role would come to a similar realization of what to do correctly? The golden rule is not meant to raise such questions.

Philosophers deal with these problems by standardizing the way roles are taken, the thinking that goes on in the roles, and so forth. This is what devices like the Rawlsian (1972) veil of ignorance and original position, or the Habermasian (1990) ideal speech situation, are for. But surely the commonsense role-taking precepts we are talking about here do not even dream of such measures.

Prescriptions for role-taking are likely prominent in many cultures both for the increased psychological perspective they breed and the door they open to better interpersonal interaction. The interpersonal skill involved is perhaps the best explanation of their widespread use and praise, not their power of edification. It is true that if we truly wished to treat others as ourselves, or the way we would want to be treated—if we were them, not ourselves merely placed in their position—role-taking would help. But it is not unusual for primarily psychological or interpersonal tools to aid ethics without being part of ethics itself.

The golden rule’s (emotional) empathy component is as unclear as its role-taking component. To empathize is not really to take another’s perspective. If we truly took that perspective, we would not have to empathize. Being in that perspective would moot any attempt to “feel with” it from outside (Noddings 1984, Hoffman 1987). Even if we took the perspective without the associated emotion, our task would then be to conjure up the emotion in the perspective. It would not be to “feel with” anything. We’d be imaginatively in the other’s head and heart, feeling their feelings directly. More, in any relevant context, the golden rule urges us to think before we act, then imagine how we would feel, not how the other would. Thus any empathy involved would involve imaginatively “feeling with” ourselves, at a future time, as the recipient of another’s similar action. The point here is to supplant the other’s perspective and imagined reaction with our own. This is not how one empathizes. Emotionally, the appropriate orientation toward causing someone possible harm is worry or foreboding. Toward the prospect of doing future good, it is anticipation of shared joy, perhaps. “Feeling with” or empathizing with others would be prescribed as, “Do unto others in a way that brings them the likely joy you’d happily share.”

Consider more closely what we are supposed to achieve from role-taking and empathy via the golden rule. We get a sense of how others are different from us, and how their situation differs from ours, uniquely tailored to their perspective and feelings on the matter. We then put ourselves in their place with these differences intact, adding them to our perspective and subtracting from it where necessary. So we occupy their perspective as them, not us, just as we’d wish them to do toward us when acting. (We wouldn’t want them to treat us as they’d wish to be treated, but as we’d wish to be treated when they took our perspective.)

But this already is a consequence of applying the rule, not a way of applying it. If depicted as a rule’s rationale it would say, “Treat others the way they’d wish or choose.” Seemingly the best way to do that is to ask them how they’d like to be treated. If we can’t ask, then perhaps we are not so much doing unto them as guessing what they’d like. Putting oneself in their place here would not seem a good idea. Neither would empathy, as opposed to prediction. A good prediction would rest on some track record of what they’ve liked in the past, perhaps acquired from a friend of theirs or one’s own experience with them as a friend.

Without involving others, such role-taking is a unilateral affair, whether well-intended or otherwise. It is often paternalistic, choosing someone’s best interest for them. The whole process is typically done by oneself, within one’s self-perspective or ego, and it can be spun as one wishes, no checks involved. Fairer and more respectful alternatives would involve not only consulting others on their actual outlooks, but including them in our decision making. “Is it OK with you if….” This negotiating approach is based on a different sort of mutuality, democratizing our choices and actions so that they are multilateral.

5. The Rule of Love: Agape and Unconditionality

To some, the gold in the golden rule is love, the silver component, respect. The love connection is likely made in part by confusing the golden rule with its sibling, love thy neighbor as oneself. Traditionally, ethics could have made the connection semantically—it used the term “self-love” where we now say self-interest. This could render like interest in others as other-love. But this is not really in the spirit of unconditional love.

A more likely path to connecting agape with the golden rule is to consider how we’d ideally wish to be treated by others and most wish we could treat them in turn. Wouldn’t we prefer mutual love to mere respect or toleration? This formulation has appeal though it ignores an important reality. Though we might wish to be treated ideally, we might not wish, or feel able to reciprocate in kind. Keeping mutual expectations a bit less onerous—especially when they apply to strangers and possible enemies—may seem more palatable.

But this is to think in interested and conditional terms. Agapeistic love is disinterested or indifferent, if in a lushly loving way. Its bestowal is not based on anything in particular about the person, but only that they are a person. This sufficiently qualifies them as a beloved. And agape does not come out of us as an interest we have, whether toward people, the good, or anything similar. It comes only out of love, expressing love, or the good luring us with its goodness. Our staking a claim on the good, or aiming at it as a personal goal, is not involved. The same is true for self-regard. We love ourselves because we are lovable and valuable, like anyone else. The basic or essential self, the soul within us, is lovable whether we happen to like and esteem ourselves or not (Outka 1972).

The most obvious ethical implication of agape is that it is not socially discriminating. We do not love people because they are attractive, or hold compatible views, or work in a profession we respect.  Are they friend, stranger, or opponent? It doesn’t matter. Most surprising, we do not prefer those close to us or in a special relationship, including parent and child. (Children in agapeistic communities are often raised by the adults as a whole, and in separate quarters from parents, primarily inhabited by peers.)

For moral idealists, agape is most alluring. To love in a non-discriminating way has a certain unblemished perfection to it. Pursuing moral values simply for their value or goodness seems clearly more elevated than pursuing them out of personal preference. Loving someone because they happen to be related to us, or a friend, or could do us a favor is shown up as somewhat cheap and discriminatory by comparison. Seeing ourselves as special is revealed for the trap it is—being stuck with ourselves and our self-preference, a burden to aspiration. What is this condition but the ultimate hold of ego over us, binding us to all our attachments? (In philosophy, intellectual ego is a chief obstacle between us and truth, causing us to believe ourselves because we are ourselves, despite knowing that there are thinkers just as wise or wiser, with just as well-seasoned beliefs. Why be led around by the nose of our particular beliefs and interests just because they blare most loudly in our heads?)

Agape is worth pondering as a fit purveyor of the golden rule. What could be more golden? The golden rule’s raison d’être is indeed focused on countering egocentrism and self-interest. But promoting other-directedness is its remedy, not unconditionality. And concern for others’ interests is key to establishing equality as the rule directs. A plausible rendering of the golden rule, making its implicit concern for interests more visible, would go, “Treat others the way you would be interested in being treated, making adjustments for their differing interests.” In these terms, unconditional loving is a bad fit. Are we really “interested” in being treated as anyone should be treated regardless of the interests we identify with, as someone with a soul but no interests worth catering to? Likely not. This same lack of interest haunts Kant’s notion of respecting personhood unconditionally. The golden-rule problem is not that we’re failing to notice others’ personhood, but what others desire or prefer. We could indeed be faulted for ignoring others as persons, treating them like potted plants in the room, but that fault would arise only if they craved our notice, attention, or participation. Typically, it would be fine with others if we just went about our business while not getting into theirs.

To be told that we should not be interested, or to be dealt with by people who will not relate to us in interested terms, basically undermines the golden rule’s effectiveness. As with empathy, we cannot be uninterested on demand, or even after practicing to do so long and hard. And if we do not have our self-identified interests taken seriously, we feel that we are not taken seriously, whether we ideally should or not. Ethics is not only about ideals, nor in fact, primarily about ideals. If interests were not key for us and for them, the golden rule would be moot. With unconditional love, reciprocity is beside the point, along with its social reciprocity conventions. Taking any perspective is the same as taking any other. In fact, taking one’s own perspective in particular is discriminatory, even when expressing generosity to others. So is taking the perspective of any particular other. Happening to be ourselves, or a particular other, and taking that as a basis for favoritism, seems a condition—a failure in unconditionality. I could have been anyone, any of them, as they could have been me. So why do I take who I am or who they are so seriously? Unlike every other ethic, agape provides no basis for according ourselves special first-person discretion or privacy. The self-other gap is transcended. It’s not even clear how the typical moral division of labor is justified in agapeistic terms. In principle, when we raise our spoon filled with breakfast cereal at the morning table, the matter of whose mouth it goes into is in question.

Some agapeists would not go this far, instead keeping our self-identification intact. But there is good reason to go farther. Gandhi and King have forwarded a view of loving non-violence that doesn’t even allow self-defense because it involves the preference of self over other. Gandhi characterizes personal integrity as “living life as an open book” since one’s life is not one’s own, but merely one example of everyone’s life. And of course there are the turn the other cheek precepts of Yeshua, which push in this direction.

In any event, ethics is not built for such concerns. It is a system designed to handle conflicts of interest, the direction of interests toward values and, perhaps, the upgrading and transformation of interests into aspirations. Agape would function, within the golden rule, as something more like a song or affirmation for the self-transformations achieved. It is the very admirable diminution or lack of self-interest and of social discrimination in agapeistic love that puts an agapeistic golden rule out of reach. Its double dose of moral purity and perfection puts it doubly out of reach. We arguably cannot be perfect as our Father in Heaven is perfect (or complete). We also cannot realistically strive toward it, and most likely should not. Religiously, to do so seems a sacrilege—pretending to the level of understanding, wisdom and “lovability” of infinite godhood. Secularly, its beautiful intentions have unwanted consequences. Aside from the impersonality of childrearing, anyone who has borne the impersonal treatment or unearnable support of someone bent on “treating everyone the same” can testify to its alienating quality. We wish to be loved for us, for our self-identity and the values we identify with. When we are not loved this way, we do not feel loved at all—not loved for who we are. Ethically, we expect to be unique, or at least special in others’ eyes when we’ve created a special history. We are entitled to it. We build rights around it. And we feel callously disregarded when a loving gaze shows no special glint of recognition as it surveys us among a group of others. This is less egoism than a sense of distinctness and uniqueness within the additional expectations of realized relationship.

Putting the matter more generally, human motivational systems come individually packaged. They are hard-wired for harboring and pursuing interests. And a valid ethics is designed to serve human nature, even as it strives to improve it. If we can transcend human nature, then we need a different system of values, or perhaps nothing like an ethical system; we have risen beyond good and evil, indifferent to harm or death. We are born, and remain, psychologically individualized throughout life, not possessed of a hive mind in which we directly share our choice-making and experiences. We are each unquestionably possessed of this natural, immutable division of moral labors, which gives us direct and reliable control only of our own self. Hence we are held responsible only for our own actions, expected to do for ourselves, provided special standing to plead our own case of mistreatment, and accorded great discretion in our own individual sphere, to do as we like. When agapeistic morality puts our very nature on the spot, bidding us to recast basic motivations to suit—when it sets us in lifetime struggle against ourselves—it fails to acknowledge morality as our tool, not primarily our taskmaster. These considerations provide the needed boundary line to situate the golden rule this side of a feasibility-idealism divide. The golden rule is indeed designed for human nature as it is and for egos with interests, trying to be better to each other.

Admittedly, the question of agape’s realism may not be decidable, given the distinctly spiritual nature of the view. Christian agape, like Buddhist indifference and non-attachment, is said to be inexpressible in words. It can only be understood correctly through direct insight and experience. Granted, adherents of these ideals place the achievement of spiritual insight out of common hands. Only a few of the most gifted or fortunate adherents achieve it in a lifetime. As such, spiritual love cannot be the currency of the golden rule as we know it, negotiating mutual equality for the vast majority of humanity in everyday life.

What agapeists may be onto is that the golden rule has a dual nature. At a common level, it is a principle of ethical reciprocity. But for those who use its ethic to rise above good and evil in a mundane sense, the golden rule is a wisdom principle. It marks the transcendence of interested and egoistic perspectives. It points toward its sibling of loving thy neighbor as thyself because thy neighbor is us in some deeper sense, accessible by deeper, less egoistic love.

6. Philosophical Slight

With the foregoing array of “considered judgments” in hand, we are at last positioned to begin distinct philosophizing on the golden rule. That project starts by consulting philosophy’s reconstitution of traditional commonsense ethics—an added context for golden rule interpretation. Philosophical treatments of the golden rule itself come next, with an evaluation of their alternative top-down approach.

One reason philosophers emphasize the juxtaposition of ethics and human nature stems from the moralistic, if not masochistic, cast of ethical traditions. Nietzsche’s depiction of “slave morality” in Christianity is a case in point (Nietzsche 1955). Moral suspicion of medieval sharia laws in Islam is another. Because the golden rule is prominent in these suspect traditions, philosophy’s concerns are directly relevant. Self-interest has been rehabilitated in philosophical ethics, along with happiness as satisfying interests, not necessarily matching ethereal ideals or god’s will. Ethics in general has also been feminized to encompass self-caring as well, a kind of third-person empathy and supportive aid to oneself (Gilligan 1982). Here, a clarified golden rule notion can fit well.

The role of ethics as our tool and invention has been promoted over traditional views of its partial “imposition” by Nature, Reason or natural law. As Aristotelians note, the good for anything depends on its type or species: ethics is for “creatures like us,” not for the saintly beings we fall short of by nature. Ironically, this is a line preached by Yeshua continually in upholding spirituality, or the heart of “the law,” over the legal letter: “The law (Sabbath) was made for man, not man for the law” (Mark 2:27-28). On this view, ethics should not fate its users to a life of hypocrisy and of not feeling good enough.

For philosophers, however, even a clarified or unbiased depiction of the golden rule cannot overcome its shortfalls in specificity and decisiveness. Ply the rule in the handling of complex and nuanced problems of complex institutions and it is at sea. We cannot imagine how to begin its application. Exercise it within networks of social roles and practices and the rule seems utterly simplistic. (This said, the irony should not be lost here of critics setting the rule up to fail by over-generalizing its intended scope and standards for success.)

Maximum generalization is the dominant philosophical approach to the rule. And in this form there is no question that its shortfalls are many. The rule seems hopeless for dealing with highly layered institutions working through different hierarchies of status and authority. Yet the rule has been posed by philosophers as the ultimate grounding principle of the major moral-philosophic traditions—of a Kantian-like categorical imperative, and a Utilitarian prototype. It has been claimed, in fact, that the rule’s logic was designed for this generalization across cases, situations, and all varieties of societies (Singer 1963; Hare 1975).

These interpretations are highly unlikely judging from the rule’s strikingly ethnic origin and design function, as a bottom-up approach makes clear. As noted, this is a tribal or clan rule, cast in highly traditional societies and nurtured there. There is no evidence that it was ever originally intended to define human obligations and problem solving within the human community writ large, or in complex institutional settings in particular. And so the shortfalls found in taking it out of its cultural context, ignoring the range of practices and roles that it presumed, and placing it in types of social context that did not exist when it was born and raised, should be no surprise. The golden rule’s format invites first-person use, addressing interaction with one or two others. Since the rule’s chief role in society seemingly became the instruction of children, alerting them to impacts on others, its shortfalls in complex problem solving seem irrelevant. Likewise Kant’s categorical imperative falls short in deciding who does the laundry in a marriage, especially once emotions have become too frayed and raw to import formulae into the discussion.

In small-group interactions what would normally be tolerated as diversity of opinion and practice can be legitimately identified as problematic instead. Most often, like-minded group members have expressed commitment to common beliefs, values, and responsibilities. But more important, the rule is vastly more detailed and institutionalized here than it seems because of its guidance by established practices, conventions, and understandings. One’s reputation as a group member depends on holding up one’s end of approved norms, including the golden rule, lest one be considered unreliable and untrustworthy. In such contexts, one can imagine a corollary to the golden rule that would make sense: “Show not consideration to him who receiveth without thought of rendering back.” This seems contrary to the golden rule only because of our misidentification of the rule with sibling rationales of forgiveness and unconditional love—letting others abuse and take advantage of us. Moreover, this corollary need not sanction an actual comeuppance of offenders, which would violate golden-rule spirit; it can function instead as a threat or gentle reminder of joint expectations. Such expectations are a commonly accepted part of “doing unto each other” in a neighborhood or co-worker context where conventions of fairness, just desert and doing one’s share go with the territory.

Marcus Singer, in standard philosophical style, portrays the golden rule as a principle, not a rule. This is because it does not direct a specific type of action that can be morally evaluated in itself. Instead, it offers a rationale for generating such rules. Singer is a kind of “father of generalization” in ethics, holding that the rationale for action of any individual in types of situations holds for any other in like situations (Singer 1955). Singer argues further that the golden rule is a procedural principle, directing us through a process—perspective-taking, either real or imaginary, for example—to generate morally salient action directives.

Singer’s is the “ideal” or top-down theoretical approach, as contrasted with our building from common sense. It starts from an abstracted logical ideal, elaborating a theory around it by tracing its logical implications. The approach is notably uninfluenced by the golden rule’s 2,500 year history. Of course, philosophy need not start from the beginning when addressing a concept, nor be confined by an original intent or design or its cultural development.  The argument must be that the rule’s inner logic is the only active ingredient. The rest is chaff or flourish or unnecessary additives.

In principled form, Singer’s golden “rule” serves also as a standard for judging rules and directives for actions that impact us. The rationale of a contemplated action must adhere to the rubric of a self-other swap to pass ethical muster, in the way that, say, the maxim of our intention must pass the universalization test of Kant’s categorical imperative.

Singer’s view has merit, especially in emphasizing procedure. Still, the distinction between principles and rules may not be as sharp as claimed. General rules (rules of legal evidence, for example) also can be used to derive more specific rules based on their logics; principles need not be consulted. For example: Do nice things; do nice things anonymously for close neighbors in distress; leave breakfast bakery goods at the doorstep of a next-door neighbor the morning after they attend a close relative’s funeral; leave donuts and muffins on your next-door neighbor’s welcome mat, rewrapped in a white bag with a sedate silvery bow; leave bagels with chive cheese if they are Jewish, sfogliatelle if they are Italian. The most general rule here, “Do nice things,” targets a type of action that can be morally evaluated as right or wrong, but still needs a procedure for determining specific actions that fall in that category, especially at the borders. Consulting community reciprocity standards or conventions might be one. Thus, “do nice things by consulting community standards” would proceduralize a rule to generate more specific action directives. Again, no consultation with principles is needed.

A great asset of Singer’s view is its accent on the practical within the prescriptive essence of the rule. Most philosophical principles of ethics are explanatory, providing an ultimate ground for understanding prescriptions. These also can be used to justify moral rationales. But they are prescriptive only in the logical sense of distinguishing “shoulds” from “woulds” or “ares,” not the directive sense—do X in way Y. Singer’s take exposes the how-to or know-how of the golden rule. From here, the rule’s interpersonal role in communication and explanation to others is readily derived, especially during socialization. The rule is not portrayed, then as a stationary intellectual object notched on the wall of an inquiring mind. It takes on a life for the moral community living its life.

R. M. Hare basically places the golden rule in the company of the Kantian and Utilitarian theories, or his own “universal prescriptivism.” That is, he interprets it as a universal grounding principle, a fundamental explanatory principle—for reciprocal respect. This conceives ethical theory on the model of scientific theory, especially a physical theory with its laws of nature. A highlighted purpose of Hare’s account is to bring theoretical clarity and rational backing to what he sees as piecemeal intuitionist and situation-based ethics. These latter approaches typically use examples of ethical judgments that the author considers cogent, leaving the reader to agree or disagree on their intuitive appeal. Yet, Hare renders the crucial “as you would have them do” directive of the golden rule as both what we would “wish” them to do to us (before doing it) and what we are “glad” they did toward us (afterwards). He holds that the golden rule’s logic remains constant, despite these word and tense changes. Notably, no grounding is offered for this claim—for the switch from “would have” to “wish” or “glad,” as if these were obviously the same ideas. Hare apparently feels that they are.

But wishes, choices, preferences, and feelings of gladness certainly do not seem the same thing. Choices can come from wishes, though they rarely do, and one feels glad about the results of choices, if not wishes, generally. Wishing typically has higher goals and lower expectations than wanting; it’s bigger on imagination, weaker on real-world motivation. Choosing is usually endorsing and expressing a want, whether or not it expresses a preference among desired objects. None of these may augur a glad feeling, though one would hope they do, hoping also that our choices turn out well and that their consequences please us, which they often, sadly, do not.

7. Sticking Points

The greatest help that the golden rule’s common sense might seek from philosophy is a conceptual analysis of the “as you would have” notion (Matthew 7:12). This is a tricky phrase. Rendering the rule’s meaning in ways that collapse wish and want obscures important differences, as just noted. An alternative rendering is how you prefer they treat you, singling out the want that has highest priority for you in this peculiar context of mutual reciprocity, not necessarily in general. Further alternatives are treatments we would accept, or acquiesce in, or consent to, as opposed to actively and ideally choose, or choose as most feasible. These are four quite different options. Or would we have others do unto us as we believe or expect they should treat us, based on our or their value commitments and sense of entitlement? Are the expectations of just the two or three people involved to count, or count more than the so-called legitimate expectations of the community? Such interpretations can ride the rule of gold in quite different directions, led by individual tastes, group norms, or transcendent religious or philosophical principles. And we might see some of these as unfair or otherwise illegitimate.

In such contexts, philosophical analysis usually answers questions, clarifying differences in concepts, meanings and their implications. Hare’s account may very likely compound them. I may choose, wish or want that you would treat me with great kindness and generosity, showing me an unselfish plume of altruism. But if I then was legitimately expected to reciprocate out of consistency, I might consent, agree, or acquiesce only in mutual respect or minimal fairness, at most. This is all I’d willingly render to others, certainly, if they did not even render respect and fairness back. From this consent logic we move toward Kantian or social contract versions of mutual respect and a sort of rational expectation that can be widely generalized. But we move very far from the many spirits of the golden rule, wishful and ideal. We move from expanding self-regard other-directedly to hedging our bets, which makes great moral difference.

Similar problems of interpretation arise for the “as” in the related principle, love thy neighbor as thyself (Matthew 22:39). In ethical philosophy, as noted, “self-love” has been identified traditionally with self-interest or self-preference. In psychology, by contrast, it has been identified with self-esteem and locus of control. These are quite different orientations, setting different generalizable expectations in oneself and in others. It is not clear that generalizing self-love captures appropriate other-love. Common opinion has it that love of others should be more disinterested and charitable than love of self, or self-interest. We feel that it is fine to be hard on ourselves on occasion, but more rarely hard on others. We are our own business, but they are not. They are their own business. It seems morally appropriate to sacrifice our own interests but not those of others even when they are willing. We should not urge or perhaps even ask for such sacrifice, instead taking burdens on ourselves. Joys can be shared, but not burdens quite as much.

We are to be nicer, fairer, and more respectful of others than of ourselves. In fact, ethics is about treating others well, and doing so directly. To treat ourselves ethically is a kind of metaphor since only one person is involved in the exchange, and the exchange can only be indirect. We are not held blameworthy for running our self-esteem down when we think we deserve it, but we are to esteem others even when they have not earned it.

Kant, by contrast, poses equal respect for self and other, with little distinction. We are to treat humanity, whether in ourselves or others, as an end in itself and of infinite value. He also poses second-rung duties to self and other toward the pursuit of happiness—a rational, and so self-expressively autonomous, approach to goods. This might be thought to raise a serious question for altruism—the benefiting of others at our expense. Given duties to self and duties to others, even pertaining to the pursuit of happiness, it is not clear what the grounds would be for preferring others to oneself. Yet the one would be honored as generous, the other faulted as selfish. And this is so even if we have the perfect right to act autonomously in a generous way, therefore not using ourselves as a mere means to others’ happiness. Throughout his ethical works and essays on religion, however, Kant speaks of philanthropy, kindness, and generosity in praising terms without giving like credit to self-interest.

Some would criticize this penchant for treating others better than ourselves as a Christian bias against self-interest, too often cast as selfishness. But it seems in line with the very purposes of ethics, which is how to interact with others, not oneself. In any case, Yeshua’s conception of love was radically different from the traditional notion of his time as it is from our current common sense.

Most of the population originally introduced to the golden-rule family of rules was uneducated and highly superstitious, even as most may be today. The message greets most of us in childhood, its Christian trappings growing most, at present, in politically oppressive third-world oligarchies where (sophisticated) education is hard to come by. Likely the rule was designed for such audiences. It was designed to serve them, both as an uplifting inspiration and form of edification, raising their moral consciousness. Yet in these circumstances, the real possibility exists of conceiving the rule as, “if you’re willing to take it (bad treatment) you can blithely dish it out.” Vengeance is also a well-respected principle tied to lex talionis. A related misinterpretation puts us in another’s position with our particular interests intact, asking ourselves what we in particular would prefer. “If I were you, do you know what I would do in that situation?” Decades of research suggest that these are the interpretations most of us develop spontaneously as we are trying to figure out the golden rule and the place of its rationale in moral reasoning over childhood and adolescence (Kohlberg 1982).

We can scoff at the obtuseness of these renderings, but even sophisticates may know less about others’ perspectives than they typically assume. Many have great difficulty imagining strangers’ perspectives from the inside, instead making unwarranted assumptions biased to their own preferences (Selman 1980). Otherwise well-educated and experienced folk can be remarkably unskilled at such perspective-taking tasks. Indeed, feminist psychologists demonstrate this inadequacy empirically in psychological males, especially where it involves empathy or spontaneously “feeling with” others (Hoffman 1987). (In class, when I’ve fully distinguished empathy from cognitive role-taking, many of my brilliant male students confess, “I don’t think I’ve ever done or experienced that.”) Recent empathy programs designed to stop dangerous bullying in American public schools have acknowledged the absence of empathy in many children. Schools have resorted to bringing babies into the classroom to invoke hopefully deep-seated instincts for emotional identification (or “fellow-feeling”) with other members of our human species (Kohlberg 1969).

How we properly balance empathy with cognitive role-taking is a greater sticking point, plaguing psychological females and feminist authors as much as the rest. (The balance, again, is between feeling with, and imaginatively structuring, the person’s conceptual space and point of view.) Such integration problems make it unclear how to follow the golden rule properly in most circumstances. And that is quite a drawback for a moral guideline, if the rule is an action guideline. We might then be advised to seek a different approach such as an interpersonal form of participatory democracy, as was previously noted.

Again, these are precisely the sorts of uncertainties and questions that philosophical analysis and theory are supposed to help answer by moving from common sense to uncommonly good sense. To a certain extent, Kantian and Utilitarian theory does just that, better defining the role of careful thought and estimation (reason), moral personality (the components of “self” and “other” that most count) and how these ground equal consideration. But at some point they move to considerations that serve distinctly theoretical and intellectual purposes, removed from everyday thinking and choice. Kant’s “noumenal self” (composed of reason and free will) and the Bentham-Mill “Util-carrier” (an experience processor for pleasure and pain) are not the selves or others we care about when golden-ruling. Their morally relevant qualities cannot compete in importance with our other personal features. Indeed, we cannot identify with, much less respect, these one-sided, disembodied essences enough to overrule the array of motivations and personal qualities that match our sense of moral character and concern.

The theoretical rationality of maximizing good, even with prudence built in, is obviously extremist and over-generalized. Research in more practical-minded economics shows this clearly in coming up with concepts like “satisficing” (seeking enough goods in certain categories of those goods most important to us). But as philosophers say, the logics of good and reason in Utilitarianism cannot help but extend to maximization—it is simply irrational, all things considered, to pursue less of a good thing when one can acquire more good at little effort. If so, then perhaps so much the worse for generalization and consistency, which reasonable and personable people will avoid. Many of us wish theory to upgrade common sense, not throw it out the window with the golden rationale in tow.

8. Ethical Reductionism

Both present and likely future philosophical accounts may be unhelpful in bringing clarity to the golden rule in its own terms, rather distorting it through overgeneralization. Still, the crafting of general theory in ethics is an important project. It exposes ever deeper and broader logics underlying our common rationales, the golden rule being one. (It is important for some to review these fundamental issues for treating the golden rule philosophically.)

Relative to a commonsense understanding of the golden rule, it is a heady conceptual experience to see this simple rule of thumb universalized–inflated to epic proportions that encompass the entire blueprint for ethical virtue, reasoning, and behavior for humankind. Such is the case with Kantian and Utilitarian super-principles. To increase the complexity of the rule’s implications while retaining its simplicity, transformed to theoretical elegance, is no mean trick. Paul’s revelation that the golden rule is catholic achieved a like headiness in faith. Now to see that faith reinforced by the most rigorous standards of secular reasoning is quite an affirmation. It can also be recruited as a powerful ally in fending off secular criticism.

Often we fail to recognize that extreme reductionism is the centerpiece of the mainstream general theory project. The whole point is to render the seemingly diverse logics of even conflicting moral concepts and phenomena into a single one, or perhaps two. It is very surprising to find how far a rationale can be extended to cover types of cases beyond its seeming ken—to see how much the virtues of golden kindness or respect, for example, can be recast as mere components of a choice process. Character traits, as states of being, appear radically different from processes of deliberation, problem solving, and behavior after all. But the most salient psychological features of virtuous traits fade into the amoral background once the principled source of their moral relevance and legitimacy is redefined. Golden rule compassion becomes virtuous because it allows us to better consider an “other” as a “self,” not necessarily in itself, its expression, or in the good it does.

The project of general theory also exposes how the implications of the golden rule’s basic structure fall short when fully extended. Universalization reveals how the basically sound rationale of the golden rule can go unexpectedly awry at full tilt. This shows a hidden chink in its armor. But reducing principles also can overcome the skepticism of those who see the rule as a narrow slogan from the start. The rule can do much more than expected, it turns out, when its far-reaching implications are made explicit. And by exposing the rule’s shortfalls and flaws, we can identify the precise sorts of added components or remedies needed to complement it, thus setting it back on the right path.

These are the two prime fruits of general theorizing, determining the full extent of a rationale’s reach, before it stretches too thin, and stretching it fully and too thin to expose its failure scenarios. Universalization, in principle, reduces to absurdity in this sense.

Outfitting the golden rule for this project in the standard way, we get “Always act so that you treat any other person, in any context, the way that you would rationally prefer and expressively choose to be treated in that context.” “Never treat someone in a way that would not draw their consent.” (We could say “win” their consent, but that seems a bit “cheerleaderesque.” It invites a process of lobbying that might win or lose due to arbitrary rhetorical skills and which the rule likely does not intend.) Notice that “the standard way” among philosophers is simply to claim that “as you would have” means consent or rational preference without sufficient argument or justification. This is what philosophical research on the matter turns up.

What sorts of faults are revealed by tracing out this principle’s implications? One liability concerns justice. If we put ourselves in the position of someone who has done an injustice, we might reasonably conclude either that the person wishes to be punished, due to their keen sense of justice, or that they wish to be forgiven or to otherwise “get out of” being caught or held accountable. Wishing forgiveness, or at least to be given a second chance, has much to be said for it. And morally, getting one’s just desert also makes sense. A kind of paradox results, which Christians will recall from the Parable of the Laborers in the Vineyard (Matthew 20:1-16). The rule provides a moral advantage to both punisher and perpetrator in this case. Doing what is fair is good, but using one’s discretion to be forgiving is good also, perhaps better, though not obligatory—a win-win situation. Looking across situations, imagining the social practices and legitimate expectations that result, social members who commit offenses will suffer the luck of the draw. The accountability mechanism of society will not establish a uniform policy of punishment or recompense. (“Luckeee! You got judge X or you mugged a nice guy—wish I had.”)

For moral individualists or libertarians, this is no problem. Who can complain about getting either fair treatment or beneficial treatment? “Should someone be begrudged their generosity,” as the vineyard owner notes, or another their resulting windfall? We accept this discretionary arrangement in many everyday settings. But consider how two children will feel about such unequal treatment, which treats one as if s/he deserves more and the other less. Consider how this same sense of being mistreated, and perhaps of resentment, will arise in most small groups of peers. “Why her, not me? Why the favoritism—you value her that much more than me?” The pattern distributes costs and benefits unequally to equals. And that is unjust. Moral liberals will be especially offended by this result. As with many conflicts between moral camps, both sides have a point, which each side seems committed not to acknowledge. And thus far, no way of integrating these rival positions has gained general consensus.

Like any general principle, perhaps, the golden rule also seems incapable of distinguishing general relationships and responsibilities from special ones—responsibilities toward family members, communities of familiars and co-workers, not the wide world of strangers. A proper explanatory principle will allow us to derive such corollaries from its core rationale. But the golden rule falls short: it is truly a rule, not a principle. Compare it with the Utilitarian grounding principle of maximizing good. Maximizing is an ideal logic of reason. Good is an ideal of value. We can imagine how a most rational approach to value would promote special situations and relationships, why it would function differently there than in other situations, and why such situations and relationships have special value. Additional good results from family and friendship institutions when members treat each other as special, and especially well. The golden rule, by contrast, bids us to treat people as special when they are not, to treat strangers or enemies as we’d be naturally urged to treat intimates. This is difficult at best, and not clearly a reliable way to maximize good. It may detract from the good in fact. Also, what is the rationale for treating others as well as those closest to us? Why is showing favoritism toward our favorites a problem? The golden rule itself does not say or explain.

In work situations, are we to ignore who is the boss or supervisor, who is the rank-and-file employee, who is the support staff doing clerical or janitorial work? We’re to be decent to all in some sense, but some we can humanely “order around,” set deadlines for, and some we can’t.

These are serious problems for the golden rule. At a minimum, corollaries would have to be added to the rule explaining how roles and relationships figure in. Treat others as you would choose to be treated in the established social roles you each occupy, with their legitimate expectations: mother, father, or teacher to children and vice versa; spouses and friends to each other; peer co-workers; supervisor to rank-and-file employees and vice versa; and so forth. Alan Gewirth (1978) has proposed a rule in which we focus on mutual respect for our generic rights alone. This would leave all sorts of other choices to other rationales or to our discretion, which the golden rule does not, placing restraints on the rule that it would not currently acknowledge.

Both of these alternatives have horrible consequences for the golden rule, however. Rights simply do not cover enough ethical behavior to rule out forms of psychological cruelty, callousness, and interpersonal exclusion. The reciprocity they guarantee is compatible with most forms of face-to-face interaction that lack it, especially in public peer-relations such as the school or job site, but also in friendships and the family.

Where the ethics or ethos of a society is barbaric, and its hierarchies authoritarian, taking perspectives within roles legitimates these characteristics. How should a slave and her/his master reciprocate? How should a superior race reciprocate with members of a near sub-human race?  This inequality problem is egregious also in adhering to prevailing social reciprocity-conventions applying to roles. Neither ethically skilled role-taking nor empathy can set matters right.

9. Ill-Fitting Theory (Over-Generalizing Rules of Thumb)

Despite its assets, there are further reasons to think that the general theory project is inappropriate for many ethical rationales, the golden rule being perhaps chief among them. Its exposé of golden rule faults is more misleading than helpful. General theory assumes that the true and deeper logic of a rationale comes out through generalization, which often is not the case. This should be obvious when theorists note that a rationale cannot avoid certain far-flung implications, no matter how alien or morally outrageous they seem. This “gotcha” view of logical implication speaks badly for logical implication, working alone. Instead of revealing a flaw in the rule’s logic, it may show implicit features of a concept or phenomenon being ignored. In the golden rule’s case, the ignored feature might be a cultural design function meant purposely to limit the rule’s generalizability and social scope. We must get the rule’s actual “logic” straight before generalizing it, and this cannot be done in a purely top-down theoretical manner except by creating a different rule.

Rationalist by nature, general theory also assumes that the structure or logic of the rationale is the thing, not its psychological function, emotive effect, or motivational power. The fault here is not emphasizing rational components, but failing to integrate the additional components adequately. If the golden rule’s logic is procedural, as Singer claims, then it may not serve as a general explanation in the “knowing-that” sense, which Kantian, Utilitarian, and “universal prescriptivist” approaches like Hare’s ignore. And failing to provide a type of general explanation might not then be a failing.

Besides, the golden rule is unnecessary to the general theoretical project, as Kant (1956) himself made clear in dismissing it, and in a mere footnote no less (p. 97, 430:68). We can start with an ideal explanatory principle, ideally structured to capture the explanatory logic of equal consideration or perspective taking. There is no need to generalize from commonsense, distorting a rule designed only for commonsense purposes in a restricted locale. A reductive account provides an explanation and understanding of one sort, exposing the essential element or active ingredient underlying an ethics’ appearance. It allows us to strip bare what holds the golden rule together beneath surface content that often matters little to its substance. But this account provides neither a good explanation nor a good understanding of the rule as a whole, or of any of its elements, relative to the rule’s distinct meaning for its users or beneficiaries, or its distinctive application in any real-life situation. “I’ve become so focused on getting this project done on time that I’ve lost sight of these people working on it being my colleagues, indeed, my neighbors and friends, of their deserving to be treated that way.” This is especially true of the implied how-tos or forms of address that make all the difference when showing respect and concern for others. How we do unto our mother or our child or our co-worker, even when their basic personhood is most at stake, requires a remarkably different form of address to convey equal consideration. Patronizing someone (a parent) in showing respect can convey disrespect. So can failing to “patronize” (a child), thereby coming off as cold and remote. These are essential moral matters, golden-rule matters, not just a matter of discretionary style.

Unlike Kant, J. S. Mill (1861) identified the golden rule as “the complete spirit of the ethics of utility” (p. 418). It apparently served as a leading light for the Utilitarian principle, despite the principle’s appearance of not holding high each individual’s sanctity (as a “child of god”). This may seem outrageous to those who see both the golden rule and Kant’s principle as vaunting this sanctity, whatever their utility to society. For them, Utilitarianism makes an ethic out of the immoral logic of “ends justify the means,” willingly sacrificing the individual to the group—or obligating us to do so. The golden rule, by contrast, asks us to consider another’s equality, not sacrifice her or ourselves for group welfare.

But let us remember that these alleged features of utilitarianism are completely unintended, the result of outfitting its “advance the common good” rationales with universality, then trying to cover all ethical bases, working alone. (Obviously modern democratic constitutions have brought advancing the common good into line with securing individual rights simply by retaining both principles in their own terms and using each to regulate the other.) Even the lush empathy of Utilitarian intent, so key to who sacrifices or willingly serves, was eventually ejected from its general theory. This is an ultimate “golden rule lost” scenario, and reductionism gone wild. Many have noted how “each is to count for one” seems merely inserted into the Utilitarian concept with little utilitarian basis. The golden rule spirit may be one explanation.

The apparent association between the golden rule and the maximizing super-principle came basically from the central role of compassion in early Utilitarian theorizing. When people become experienced with each other, recognizing common needs, hopes and fears, failures and successes, they are moved to act with mutual understanding. This increases their like-mindedness and mutual identification in turn. The resulting sense of connection nurtures increasing indifference toward the narrow desires of those concerned, whether in oneself or others. Membership in, and contribution to, a shared community becomes defining. This is how golden rule other-directedness and equality move toward full mutuality in the pursuit of overall social good. And is there a more “Christian spirit” of charity and service available?

Like most key tenets of ethics, the golden rule shows two major sides: one promoting fairness and individual entitlement, conceived as reciprocity; the other promoting helpfulness and generosity to the end of social welfare. Both the Kantian and Utilitarian traditions focus on only one side, furthering the great distinctions in philosophical ethics—the deontology-teleology and justice-benevolence distinctions. For the general theory project, this one-sidedness is purposeful, a research tool for reductive explanation.

The Utilitarian, Charles Dickens, probably draped most golden-rule content and spirit over the utilitarian side in his A Christmas Carol. “Business? Mankind was my business, the common welfare was my business; charity, mercy, forbearance and benevolence were, all, my business. The dealings of my trade were but a drop of water in the comprehensive ocean of my business” (p. 30). In a small way here, Dickens highlighted the direct and visible hand of Utilitarian economics in contrast to the invisible hand of Utilitarian Adam Smith and his capitalist economics—a hand Dickens found quite lacking in compassion or egalitarian benefit. Dickens captured Utilitarianism’s moral hell in even more strikingly golden-rule terms: “The air was filled with phantoms, wandering hither and thither in restless haste, moaning as they went…one old ghost cried piteously at being unable to assist a wretched woman with an infant, whom it saw below upon a doorstep. The misery with them all was clearly, that they sought to interfere, for good, in human matters, and had lost the power forever” (p. 33). Arguably, the power lost was to treat those in need as one’s potentially needy self should be treated. The each-is-to-count-for-one equality of the golden rule is portrayed as a proven, socially institutionalized means to social good. “I have always thought of Christmas…as a kind, forgiving, charitable and pleasant time when men and women seem, by one consent, to open their shut-up hearts freely and to think of people below them as if they really were fellow-passengers to the grave, not another race of creatures bound on other journeys” (p. 9). What speech more heartily hits the golden tones of the golden rule?

10. Know-How Theory (And Medium-Sized Rationales)

What seems needed to philosophize ably about the golden rule and its relatives is a set of theoretical models fit for rules of thumb. These would be know-how models, defined by the conceptual work they drape around algorithms, operations, and steps in procedures for putting rules into effect. As noted, these may be psychological rules for taking certain moral points of view, rules of problem solving, negotiation, making contributions to ongoing practices and interactions, and more unilateral actions. These components would be given a context of use and interrelated in crucially different ways, with suggestions for interrelating them further. Illustrations would be provided of their application and misapplication, at high, medium, and low quality. The resulting combination would be given overall structure and comprehensibility, including the rationales needed to explain and justify its components. Rationales for applying the procedures would allow unique and flexible alliances among components, fit for particular functions and novel situations. This would encompass the best features of the otherwise inchoate rubric of a conceptual “tool-box.” The illustrations would encompass the best features of philosophically upgraded ethics codes. A range of corollaries would be provided for the rules involved (the golden rule family, for example), capturing the sorts of conventional expectations and practices presumed during the rules’ creation and development. These are of greatest importance to the rules’ practicality and success. And, of like importance, background frameworks would be provided for how to practice the rule, indicating the difference in orientation between the novice and the expert user. How might we follow a recipe when cooking the way a chef who “knows his way around the kitchen” would?

Relative to mainstream philosophical theory, this project might seem historically regressive, even anti-philosophical.  It resembles a return to the most piecemeal sort of intuitionism, combined with a “hands-on,” applied approach taken to a new clerical extreme. (Applied ethics already boasts hundreds of decision-making step procedures.) For traditional philosophers the small-scale common-sense rationales involved also may seem philosophically uninteresting. “Advancing the good” may be a fine tip for everyday practice but holds little conceptual subtlety compared to “maximizing the ratio of benefits over costs across the domain of sentient beings.” The unseen implications of a maximization principle provided us a new model of practical reason. Resubmitting the range of ethical concepts to it suggests that the aims and consequences of actions, combined with quality of experience, may be all that ethics comes to, personal integrity and inherent rights aside.  Thus the rules of thumb discussed by Mill in his Utilitarianism were quickly deserted by philosophers for rule-utilitarianism. This built newly generalized principles into the very structure of maximization (maximize the regard for rights as inherent and inviolable), turning the pre-existing utilitarian principle (regarding rights and all else as means to social good) into a super-principle, as some term it.

But if one looks back at Mill’s ethical writings as a whole, dropping preconceptions about a general theory of utilitarianism itself, one finds ethical rules that cross categories like deontology and teleology, working insightfully and usefully together, also by rule of thumb, not principle or intuition. There was a time when moral theorists simply dismissed intuitionist and applied-theory approaches. Hare does so above; Rawls did in his hallowed A Theory of Justice (1972), calling them half-theories. These theories cited piecemeal and ungrounded insights where a completed conceptual structure was required to provide a full explanatory account. But these forward-looking theorists worked such piecemeal views into pluralist, hybrid, or eclectic theoretical forms. Rawls himself discussed “mixed theories,” with medium-sized principles, which he acknowledged as formidable alternatives to a general theory like his own. (See Rawls’s multiple index references to intuitionism and mixed theories.) The golden rule can find a place here, a merely somewhat generalized, medium-sized or right-sized place, allowing it to function as a lived ethic, readily applicable to everyday life to several ends. There is a certain satisfaction as well in using the most ancient but enduring epigrams of ethics, such as the golden rule, to create the most cutting-edge theoretical forms.

11. Regressive Default (Is Ancient Wisdom Out-Dated?)

Serious innovation in ethics is a long time coming. Arguably, the golden rule has not been seriously updated in its own terms (conceptually, procedurally, and culturally) since 500 B.C.E., or perhaps 28 C.E., despite quite radical changes in the primary groups of modern societies, and the decrease in tribal societies. Since applied and practical ethics gathered steam, ground-breaking developments like the “ethics of complex organizations” have been few and far between. It is remarkable that moral philosophy is still focused on concepts that were contemporaries of phlogiston and élan vital, bile and humors. The golden rule long preceded these. Such notions were formulated and plied in an age of rampant superstition, seasoned by deep misconceptions about the nature of reality, human nature (psychology) and social organization. Modern empirical research has had difficulty finding the stable psychological traits that we continue to call virtues. If stable traits exist at all, they may not be organized morally. If they are, their stability and supposed resistance to situational factors appear remarkably weak (Kohlberg 1982a; Myers, chs. 4, 6-9, 12). But philosophers have given hardly a thought to the real prospect that there may be no such things—no real phenomena to cover our grab-bag folk terms. Virtue theorists seem unfazed as they experience a philosophical upswing. Brain research has uncovered forms of mental computation that differ significantly from what we term reasoning or emotion. This should be prompting experimental revamping of ethical thinking. Unfortunately, a raft of hasty interpretations of these findings’ significance (by J. Haidt, primarily) has provided grounds for undue skepticism.

The golden rule enjoys the reputation of enduring wisdom, even if its lack of conceptual sophistication leaves philosophers cold. But its ancient origin should make us wonder if it is in fact perennial hot air, misleading even regarding the framework in which moral philosophy is done.

The model of general theory, based on general laws, still enjoys mainstream status in moral philosophy, despite challenges that have diminished its domination. But consider what has happened to its scientific mentor. Important innovations in physics are questioning the use of general theories marked by laws of nature, gravity, and the like, holding that this centerpiece of physics for centuries was a wrong turn from the beginning that led to the dead end of string theory and an inability to understand quarks and quantum mechanics. Unthinkably regressive anthropomorphic alternatives, such as “biocentric cosmology,” are being taken seriously, or at least stated boldly before a scientific public. (This is the view that reality is determined by our observing it—a giant step beyond the Schrödinger’s Cat Paradox.) In part, this results from challenging the value of sophistication in views like string theory, which consider it explanatory to posit scores of non-existent and unknowable dimensions of reality to account for the realities we observe. More, these cutting-edge, potentially revolutionary ideas are being promulgated in high-level physics through such popular outlets as Discover Magazine (April 2010, pp. 32-44; May, pp. 52-55) and the Discovery TV Channel, available with “basic cable.”

Where are the parallels in ethics? Where are the steps beyond “subverting the dominant paradigm,” and toward posing real alternatives for public consumption where ethics meets reality, where the golden rule abides today? Currently, moral philosophy floods its public with an unstoppable stream of “theory and practice” texts championing Kantian deontology and Utilitarian teleology, with the supposedly direct application of their super-principles to concrete cases. (There is nothing like a fundamental explanation to decide an issue and take specific action, is there?) This is especially so when a reader need only follow the philosophical author’s advice to “balance” these two great and conflicting principles in application and practice, as philosophers have been unable to do for centuries. A chapter is always devoted to ancient “virtue ethics” in these volumes, despite no one apparently knowing how to apply moral traits or character to concrete cases in any specific way.

Before a rule like the golden one is either slighted or acknowledged, moral philosophy should consider innovative approaches to conceiving such rules, their fitness to current practice, and perhaps what we can learn from converting the rule to a programmable algorithm for autonomous agents. Perhaps simply “generalizing” the rule as anciently stated is not the most creative theorizing approach. A possible step in a new direction, though it originates in thinking more than a century old, is attempted below.
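As a purely illustrative aside on the “programmable algorithm” idea just mentioned, the following minimal sketch encodes the literal “as you would wish to be treated” reading as a consent-consistency check for an artificial agent. All names here (Party, would_consent, golden_rule_permits) are invented for this sketch; they belong to no existing library and are not the article’s own proposal.

    # Hypothetical sketch: a golden-rule consistency check for an autonomous agent.
    # The rule is read literally: permit an action only if the actor would consent
    # to receiving that same action in each affected party's position.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Party:
        name: str
        would_consent: Callable[[str], bool]  # does this party consent to receiving the action?

    def golden_rule_permits(actor: Party, action: str, affected: List[Party]) -> bool:
        """Permit an action only if the actor, imagined on the receiving end in
        each affected party's position, would consent to that same action."""
        return all(actor.would_consent(action) for _ in affected)

    # Toy usage: an agent deciding how to treat a bystander.
    agent = Party("agent", would_consent=lambda act: act != "withhold help")
    bystander = Party("bystander", would_consent=lambda act: True)
    print(golden_rule_permits(agent, "share resource", [bystander]))  # True
    print(golden_rule_permits(agent, "withhold help", [bystander]))   # False

Notably, the check applies the actor’s own preferences to every affected party, so it inherits exactly the objections about differing tastes and roles raised earlier; corollaries of the kind surveyed above would have to be coded in as well.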

12. When Is a Rule Not a Rule, but a Description?

In classic lectures, compiled as The Varieties of Religious Experience (1901/1985), William James declares the golden rule incompatible with human nature (Lect. 11). It routinely violates the basic structure of human embodiment, the laws of human motivation, and the principles of rational choice of behavior based on them, as depicted above. (James may have confused the rule with sibling principles when making this blanket observation.)

Yet gathered around this law-like “given” in James’s remarks are reams of psychological testimony on putative “conversion” and “visitation” experiences wrought by divinities (Lect. 9-12). James identifies certain common features and aftereffects in these putatively supernatural experiences, including ecstatic happiness and a sense of liberation, an expansive sense of self, and a self-diffusion into those nearby–selflessness of a special, merging sort. He notes, likewise, an overflowing urge to love, give and aid others, to nurture and support unlimited others, with unlimited energy, and no sense of sacrifice to oneself. The main attitude observed is “yea-saying” toward everything, reminiscent of Christian calls for “saying yes” to God and “to all who ask of ye” (Lect. 11). This syndrome of experiences and proclivities gives new meaning for its “patients” to what a devoted and dedicated life can be—not devotion to religious duties or divine commands, but the spontaneous embodiment of omnipresent love.

Mystical experience of this sort typically bridges the complete separation between perfect Godhood and sinful devotee, substituting a sense of oneness and “flow” within a cosmic ocean of bliss. James cites ways in which the lasting sensibilities of this experience induce the asceticism, spiritual purity, and willing material poverty associated with saintliness (Lect. 11). In a certain way, James goes on to provide a differential diagnosis of this syndrome of symptoms, or “golden rule effect.” The cause he infers is some sort of seizure—literally, a seizure—suggesting occult influences, unusual electro-chemical processes within the central nervous system, or both. James notes that when he exposed those who report these divine visitations to certain “ethers,” known for their hallucinogenic effects, they reported a strikingly similar experience “under the influence.” All that is missing is a sense of the supernatural. This is not the most morally reassuring depiction of the golden rule as a phenomenon, but so it goes.

Imagine now that there are a third and a fourth avenue to these experiences, or to the proclivities and golden behaviors that result. One, the third, might involve the secular spiritual transformation that comes from single-mindedness. When someone’s striving for a cherished goal becomes a life-mission, be it mastering a musical instrument or fine art, putting heart and soul into building a business, or putting a public policy in place (a new drunk-driving ban or universal health care), they often come to embody their goal. “He is his company.” “She has become her music” (“and she writes the songs”). Certainly in religion this is what is meant by terming someone holy or a living saint. This is also the secular goal of Confucian practice, to make li (behavioral ritual) into yi (character). One accomplishes this transformation by complete and intense concentration of thought and behavior, and by “letting go” of one’s self-awareness or ego in the task. The work takes over and one becomes “possessed” by it, either in an uplifting way, or in a way calling for exorcism, rehab, or at least “intervention” by friends and family. When morality sets the goal and means here, we term the culmination “moral exemplarism.”

This is the indirect pursuit of the golden rule that focuses on ideally good means to ideally good ends. “Love the good with your whole mind, your whole heart and your whole strength,” and then you will love your neighbor as yourself, and also treat her as you’d wish to be treated by her. The differential diagnosis here identifies devotion that leads to embodiment as the cause of the golden rule effect. And this devotion need not include any following or practicing of rules of thumb like the golden rule, purposely fulfilling duties, or engaging in those conventional activities associated with being morally upright. It can be as spiritual and abstract an activity as concentrated rational intuition ever-intent on an imagined Platonic form of the good, which presumably would direct one’s perception of every reflection of the Form in every ethical matter one dealt with in life.

Now consider a fourth avenue, much more common to everyday ethics. Here, doing good or being fair is a part-time activity, undertaken alongside hosts of alternatives. It is developed through socialization and reflective practice relative to the normative institutions of society. Social norms are internalized and habituated in action, even to the point of what we call character traits. When dealing with others, and typical moral issues, we gain a sense of proper reciprocity and the need for a certain egalitarianism in how we show respect. In addition, we hear of various rules and principles advising us on how to do this. Among these are members of the golden rule family, perhaps the golden rule itself. One flirts with following those rules of thumb within reach, the way one finds oneself tempted to buy a product one sees advertised. One notices ways that one’s activities already overlap with their biddings. And slowly the rule becomes a partial habit of heart and hand, an implicit directive.

Still, the rule is sometimes consciously referred to as a reminder. Like breathing, that is, the rule has an involuntary and voluntary component in one’s life. Other rules seem reachable somewhere down the road and may slowly become an ideal to work toward, walking a mile in others’ moccasins perhaps,  while an additional class of rules only gets our salute from afar—it is wildly out of reach. “Love your neighbor as yourself” seems in this class, along with, “turn the other cheek” or “give others anything they ask.”

Getting some perspective, the second and third avenues or “ways of embodiment” above are analogous to the two main schools of Zen Buddhism—Rinzai and Soto. In the first school, one experiences satori, or enlightened awakening, in a sudden flash. It is not known how; even a non-devotee may be blessed by this occurrence. One smiles, or laughs as a result, at the contrast in consciousness, then goes back to one’s daily life with no self-awareness of the whole new sense of reality and living it creates. Those around cannot help but notice the whole new range of behaviors that come out, filled with the compassion of a bodhisattva. To the master, it is daily life and interaction: “I eat when I am hungry, I sleep when I am tired.”

The third way is that of gradual enlightenment. One meditates for its own sake, with no special aim in mind—no awaited lightning strike from the blue. Over time, as one constantly “polishes one’s mirror,” Zen consciousness continually grows until normal consciousness and ego fade out, akin to the Hindu version of enlightenment, or moksha. Compassion grows beside it, imperceptibly, until one is a bodhisattva. To the recipient, Zen-mind seems ordinary mind.

The fourth way is more a simulation than a “way.” It is not a form of embodiment at all, and therefore does not generate the golden rule effect as a spontaneous offshoot. We learn to act, in some respects, as a master or exemplar would, but without embodying the character being expressed, or being truly self-expressive in our actions. What we call ethics as a whole—the ethics of duties, fulfilling obligations, adhering to responsibilities, and respecting rights—can be seen as this sort of partial simulation. We develop moral habits, of course, some of which link together in patterns and proclivities. And we can “engage” these. But we would not continue to carry around a sense of ethical assembly instructions or recipes, needing sometimes to refer to them directly—if we were ethics, if we embodied ethics. We don’t retain rules and instructions when we are friends or parents. (Those who read parenting books are either looking for improvements or fearing that they aren’t true parents yet.) Where else in our daily lives do we look to principles, rules of thumb, or formulas supplied as advice by a colleague or co-worker in order to do what we supposedly already can do? When we are a worker, we just work. When we are ethical, we often pause and consult a manual. This is not to deny automaticity or self-reliant reasoning in ethics.

The golden rule displays one algorithm for programming exemplary fair behavior, which can be habituated by repetition and even raised to an art by practice. Virtue ethics (habits) and deliberation ethics (normative ethics) fall here. What we are simulating are side-effects of a moral condition. We are trying to be good by imitating symptoms of being good.

A behavioral route can be taken instead to these simulations, side-stepping direct reference to the rule. In some ways it is more revealing of our simulation. Here we engage in repetitive behaviors that conform to a reciprocity convention, which in turn conforms to the rule. We do not act out of adherence to the rule, but only out of imitation of its applications or illustrations. This again was the Aristotelian approach to learning virtues, and also the Confucian approach for starting out. In Japan, this sort of approach extended from the Samurai tea ceremony to the Suzuki method of learning the violin (see Gardner 1993). Such programming is akin to behavioral shaping in behaviorist psychology, though it rests primarily on principles of competence motivation, not positive and negative reinforcement.

Social psychology has discovered that the single best way to create or change inner attitudes and motivations is to act as if one already possessed them. Over time, through the psychology of cognitive dissonance reduction, aided by an apparent consistency process in the brain, the mind supplies the motivation needed (Festinger 1957; Van Veen and others, 2009). These processes contradict common opinion on how motivations are developed, or at least they do so long as our resolve holds. Unless we keep the behavior going, by whatever means, our psychology will extinguish it for lack of a motivational correlate.

Here, as elsewhere, the golden rule can act as a conceptual test of whether the group reciprocity conventions of a society are ethically up to snuff. As a means to more morally direct simulation, those interested in the golden rule can try alternative psychological regimens—role-taking is one, empathy might be another. And these can be combined. Those who assume that exemplars must have taken these routes in their socialization may prefer such practices to conventional repetition. However, each is discretionary and but one practical means to the same end. Each has pros and cons: some routes serve certain personality types or learning styles well, others not so well. In certain cultures, mentoring, mimicking and emulating exemplars will be the way to go.

Deep Thoughts: Perhaps one can also try the way of humor: “Before you insult a man, walk a mile in his shoes. That way you’ll be a mile away when he gets offended, and you’ll have his shoes.”—Jack Handey

13. References and Further Reading

  • Allen, C. (1996). “What’s Wrong with the Golden Rule? Conducting Ethical Research in Cyberspace.” The Information Society, Vol. 1, No. 2: 174-188.
  • Colby, A. and Damon, W. (1984). Some Do Care. New York, NY: Free Press.
  • Colby, A. and Kohlberg, L. (1987). The Measurement of Moral Judgment, Vol. I. New York, NY: Cambridge University Press.
  • Confucius (1962). The Analects. New York, NY: Penguin Classics.
  • Cox, J. R. (1993). A Guide to Peer Counseling. New York, NY: Rowan and Littlefield.
  • Dickens, C. (1977). A Christmas Carol. New York, NY: Crown Publishers/Weathervane Books.
  • Firth, Roderick (1952). “Ethical Absolutism and the Ideal Observer.” Philosophy and Phenomenological Research, Vol. XII, No. 3: 317-345.
  • Festinger, L. (1957). A Theory of Cognitive Dissonance. Stanford, CA: Stanford University Press.
  • Fromm, Erich (1956). The Art of Loving. New York, NY: Harper and Row.
  • Fowler, J. (1981). Stages of Faith: The Psychology of Human Development and the Quest for Meaning. San Francisco, CA: Harper and Row.
  • Gandhi, M. (1956). All Men Are Brothers. New York, NY: Continuum Press.
  • Gardner, H. (1993). Frames of Mind: The Theory of Multiple Intelligences. New York, NY: Basic Books.
  • Gilligan, Carol (1982). In a Different Voice. Cambridge, MA: Harvard University Press.
  • Giraffe Heroes Project, Box 759, Langley, Washington 98260.
  • Habermas, J. (1990). “Discourse Ethics: Notes on a Program of Philosophical Justification.” In Moral Consciousness and Communicative Action. Cambridge, MA: MIT Press.
  • Hare, Richard M. (1975). “Abortion and the Golden Rule.” Philosophy and Public Affairs, Vol. 4, No. 3: 201-222.
  • Hoffman, M. L. (1987). “The Contribution of Empathy to Justice and Moral Judgment.” In Nancy Eisenberg and J. Strayer (eds.), Empathy and Its Development (Cambridge Studies in Social and Emotional Development). New York, NY: Cambridge University Press.
  • James, W. (1985). The Varieties of Religious Experience. Cambridge, MA: Harvard University Press.
  • Kant, I. (1956). Groundwork for a Metaphysics of Morals. New York, NY: Harper and Row.
  • King, M. L. (1986). Stride Toward Freedom. New York, NY: Harper and Row.
  • Kohlberg, L. (1968). “The Child as a Moral Philosopher.” Psychology Today, 1: 25-32.
  • Kohlberg, L. (1969). “Stage and Sequence: The Cognitive-Developmental Approach to Socialization.” In D. A. Goslin (ed.), Handbook of Socialization Theory. Chicago, IL: Rand McNally: 347-480.
  • Kohlberg, Lawrence (1982). “From Is to Ought.” In The Philosophy of Moral Development. New York, NY: Harper and Row.
  • Kohlberg, L. (1982a). “Education for Justice: A Modern Statement of the Socratic View.” In The Philosophy of Moral Development. New York, NY: Harper and Row.
  • Mencius (1993). The Book of Mencius. Trans. L. Giles. Clarendon, VT: Tuttle Publications.
  • Myers, D. G. (2005). Social Psychology. New York, NY: McGraw-Hill, chapters 4, 6, and 8.
  • Mill, John Stuart (1861). “Utilitarianism.” In The Utilitarians. Garden City, NY: Doubleday and Company.
  • Noddings, Nel (1984). Caring: A Feminine Approach to Ethics. Los Angeles, CA: University of California Press.
  • Noetics Institute: Creative Altruism Program. 101 San Antonio Rd., Petaluma, CA 94952.
  • Nietzsche, F. (1955). Beyond Good and Evil. Chicago, IL: Gateway Press.
  • Oliner, S., and Oliner, P. (1988). The Altruistic Personality. New York, NY: Free Press.
  • Outka, Gene (1972). Agape: An Ethical Analysis. New Haven, CT: Yale University Press.
  • Rawls, John (1972). A Theory of Justice. Cambridge, MA: Harvard University Press.
  • Selman, R. (1980). The Growth of Interpersonal Understanding: Developmental and Clinical Analyses. New York, NY: Academic Press.
  • Selman, R. (1971). “Taking Another’s Perspective: Role-Taking Development in Early Childhood.” Child Development, 42: 1721-1734.
  • Singer, M. (1955). “Generalization in Ethics.” Mind, 64 (255): 361-375.
  • Singer, M. (1963). “The Golden Rule.” Philosophy, Vol. XXXVIII, No. 146: 293-314.
  • Wattles, J. (1996). The Golden Rule. New York, NY: Oxford University Press.
  • Van Veen, V., Krug, M. K., Schooler, J. W., and Carter, C. S. (2009). “Neural Activity Predicts Attitude Change in Cognitive Dissonance.” Nature Neuroscience, 12 (11): 1469-1474.
  • Zeffirelli, F. (1977). Jesus of Nazareth (television mini-series).
  • Zeki, Semir (2000). “The Neural Basis of Romantic Love.” NeuroReport, 11: 3829-3834.

Author Information

Bill Puka
Email: billpuka@gmail.com
Rensselaer Polytechnic Institute
U. S. A.

The Philosophy of Social Science

The philosophy of social science can be described broadly as having two aims. First, it seeks to produce a rational reconstruction of social science. This entails describing the philosophical assumptions that underpin the practice of social inquiry, just as the philosophy of natural science seeks to lay bare the methodological and ontological assumptions that guide scientific investigation of natural phenomena. Second, the philosophy of social science seeks to critique the social sciences with the aim of enhancing their ability to explain the social world or otherwise improve our understanding of it. Thus philosophy of social science is both descriptive and prescriptive. As such, it concerns a number of interrelated questions. These include: What is the method (or methods) of social science? Does social science use the same methods as natural science? If not, should it aspire to? Or are the methods appropriate to social inquiry fundamentally different from those of natural science? Is scientific investigation of the social world even possible – or desirable? What type of knowledge does social inquiry produce? Can the social sciences be objective and value neutral? Should they strive to be? Does the social world represent a unique realm of inquiry with its own properties and laws? Or can the regularities and other properties of the social world be reduced to facts about individuals?

The following article will survey how philosophers of social science have addressed and debated these questions. It will begin by examining the question of whether social inquiry can – or should – have the same aims and use the same methods as the natural sciences. This is perhaps the most central and enduring issue in the philosophy of social science. Addressing it inevitably leads to discussion of other key controversies in the field, such as the nature of explanation of social phenomena and the possibility of value-free social science. Following examination of the views of proponents and critics of social inquiry modeled on the natural sciences will be a discussion of the debate between methodological individualists and methodological holists. This issue concerns whether social phenomena can be reduced to facts about individuals. The penultimate section of the article asks the question: How does social science as currently practiced enhance our understanding of the social world? Even if social science falls short of the goals of natural science, such as uncovering lawlike regularities and predicting phenomena, it nonetheless may still produce valuable knowledge. The article closes with a brief discussion of methodological pluralism. No single approach to social inquiry seems capable of capturing all aspects of social reality. But a kind of unification of the social sciences can be posited by envisioning the various methods as participating in an on-going dialogue with each other.

Table of Contents

  1. Naturalism and the Unity of Scientific Method
  2. Critiques of Naturalism
    1. The Absence of Social Laws
    2. Interpretivism and the Meaningfulness of the Social World
      1. Descriptivism
      2. Hermeneutics
    3. The Hidden Ideology of Value Neutrality
      1. Critical Theory
      2. Postmodernism
  3. Methodological Individualism versus Holism
  4. What Social Science Does
    1. Uncovering Facts
    2. Correlation Analysis
    3. Identifying Mechanisms
  5. Methodological Pluralism
  6. References and Further Reading

1. Naturalism and the Unity of Scientific Method

The achievements of the natural sciences in the wake of the scientific revolution of the seventeenth century have been most impressive. Their investigation of nature has produced elegant and powerful theories that have not only greatly enhanced understanding of the natural world, but also increased human power and control over it. Modern physics, for instance, has shed light on such mysteries as the origin of the universe and the source of the sun’s energy, and it has also spawned technology that has led to supercomputers, nuclear energy (and bombs), and space exploration. Natural science is manifestly progressive, insofar as over time its theories tend to increase in depth, range and predictive power. It is also consensual. That is, there is general agreement among natural scientists regarding what the aims of science are and how to conduct it, including how to evaluate theories. At least in the long run, natural science tends to produce consent regarding which theories are valid. Given this evident success, many philosophers and social theorists have been eager to import the methods of natural science to the study of the social world. If social science were to achieve the explanatory and predictive power of natural science, it could help solve vexing social problems, such as violence and poverty, improve the performance of institutions and generally foster human well-being. Those who believe that adapting the aims and methods of natural science to social inquiry is both possible and desirable support the unity of scientific method. Such advocacy in this context is also referred to as naturalism.

Of course, the effort to unify social and natural science requires reaching some agreement on what the aims and methods of science are (or should be).  A school of thought, broadly known as positivism, has been particularly important here. An analysis of positivism’s key doctrines is well beyond the scope of this article. However, brief mention of some of its key ideas is warranted, given their substantial influence on contemporary advocates of naturalism. The genesis of positivism can be traced to the ideas of the British empiricists of the seventeenth and eighteenth century, including most notably John Locke, George Berkeley, and David Hume. As an epistemological doctrine, empiricism in essence holds that genuine knowledge of the external world must be grounded in experience and observation. In the nineteenth century, Auguste Comte, who coined the term “positivism,” argued that all theories, concepts or entities that are incapable of being verified empirically must be purged from scientific explanations. The aim of scientific explanation is prediction, he argued, rather than trying to understand a noumenal realm that lies beyond our senses and is thus unknowable. To generate predictions, science seeks to uncover laws of succession governing relations between observed phenomena, of which gravity and Newton’s laws of motion were exemplars. Comte also advocated the unity of scientific method, arguing that the natural and social sciences should both adopt a positivist approach. (Comte was a founder of sociology, which he also called “social physics.”) In the middle third of the twentieth century an influential version of positivism, known as logical positivism, emphasized and refined the logical and linguistic implications of Comte’s empiricism, holding that meaningful statements about the world are limited to those that can be tested through direct observation.

For a variety of reasons, positivism began to fall out of favor among philosophers of science beginning in the latter half of the twentieth century. Perhaps its most problematic feature was the logical positivists’ commitment to the verifiability criterion of meaning. Not only did this implausibly relegate a slew of traditional philosophical questions to the category of meaningless, it also called into question the validity of employing unobservable theoretical entities, processes and forces in natural science theories. Logical positivists held that in principle the properties of unobservables, such as electrons, quarks or genes, could be translated into observable effects. In practice, however, such derivations generally proved impossible, and ridding unobservable entities of their explanatory role would require dispensing with the most successful science of the twentieth century.

Despite the collapse of positivism as a philosophical movement, it continues to exercise influence on contemporary advocates of the unity of scientific method. Though there are important disagreements among naturalists about the proper methodology of science, three core tenets that trace their origin to positivism can be identified. First, advocates of naturalism remain wedded to the view that science is a fundamentally empirical enterprise. Second, most naturalists hold that the primary aim of science is to produce causal explanations grounded in lawlike regularities. And, finally, naturalists typically support value neutrality – the view that the role of science is to describe and explain the world, not to make value judgments.

At a minimum, an empirical approach for the social sciences requires producing theories about the social world that can be tested via observation and experimentation. Indeed, many naturalists support the view, first proposed by Karl Popper, that the line demarcating science from non-science is empirical falsifiability. According to this view, if there is no imaginable empirical test that could show a theory to be false, then it cannot be called a scientific theory. Producing empirically falsifiable theories in turn necessitates creating techniques for systematically and precisely measuring the social world. Much of twentieth-century social science involved the formation of such tools, including figuring out ways to operationalize social phenomena – that is, conceptualize them in such a way that they can be measured. The data produced by such operationalizations in turn provide the raw, empirical material to construct and test theories. At the practical level, ensuring that scientific theories are subject to proper empirical rigor requires establishing an institutional framework through which a community of social scientists can try to test each other’s theories.

The purpose of a theory, according to naturalists, is to produce causal explanations of events or regularities found in the natural and social worlds. Indeed, this is the primary aim of science. For instance, astronomers may wish to explain the appearance of Halley’s comet at regular intervals of seventy-five years, or they might want to explain a particular event, such as the collision of the comet Shoemaker-Levy 9 with Jupiter in July 1994. Scientific explanations of such regularities or events in turn require identification of lawlike regularities that govern such phenomena. An event or regularity is formally explained when its occurrence is shown to be logically necessary, given certain causal laws and boundary conditions. This so-called covering law model thus views explanation as adhering to the structure of a deductive argument, with the laws and boundary conditions serving as premises in a syllogism. Underpinning the explanations of the periodic return of Halley’s comet or the impact of Shoemaker-Levy 9 in astronomy, for instance, would be certain causal laws of physics, namely gravity and Newton’s laws of motion. These laws may be invoked to produce causal explanations of a variety of other events and regularities, such as the orbits of the planets in our solar system, the trajectory of projectiles, the collapse of stars, and so forth. Thus the discovery of lawlike regularities offers the power to produce parsimonious explanations of a wide variety of phenomena. Proponents of the unity of scientific method therefore hold that uncovering laws of social phenomena should be a primary goal of social inquiry, and indeed represents the sine qua non for achieving genuinely scientific social investigation.
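A schematic rendering may help here; the symbols below are standard textbook labels for the covering-law (deductive-nomological) pattern just described, not notation drawn from this article:

\[
\begin{array}{l}
L_1, \ldots, L_n \quad \text{(general laws)}\\
C_1, \ldots, C_k \quad \text{(boundary conditions)}\\
\hline
E \quad \text{(the event or regularity to be explained)}
\end{array}
\]

The explanandum E follows deductively from the premises, just as the return of the comet follows from Newton’s laws together with its observed position and velocity.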

The doctrine of value neutrality is grounded in the so-called fact/value distinction, which traces its origins to David Hume’s claim that an ought cannot be derived from an is. That is, factual statements about the world can never logically compel a particular moral evaluation.  For instance, based on scientific evidence, biologists might conclude that violence and competition are natural human traits. But such a factual claim itself does not tell us whether violence and competition are good or bad. According to advocates of naturalism, the same holds true for claims about the social world. For example, political scientists might be able to tell us which social, political and material conditions are conducive to the development of democracy. But, according to this view, a scientific explanation of the causes of democracy cannot tell us whether we ought to strive to bring about democracy or whether democracy itself is a good thing.  Science can help us better understand how to manipulate the social world to help us achieve our goals, but it cannot tell us what those goals ought to be. To believe otherwise is to fall prey to the so-called naturalistic fallacy. However, value neutrality does not bar social scientists from providing an account of the values that individuals hold, nor does it prevent them from trying to discern the effects that values might have on individuals’ behavior or social phenomena. Indeed, Max Weber, a central figure in late nineteenth and early twentieth century sociology and a defender of value neutrality, insisted that providing a rich account of individuals’ values is a key task for social scientists. But he maintained that social scientists can and should keep their ethical judgment of people’s values separate from their scientific analysis of the nature and effects of those values.

2. Critiques of Naturalism

Naturalism has been highly influential in the social sciences, especially since the middle of the twentieth century and particularly in the United States. Movements to make social inquiry genuinely scientific have dominated many fields, most notably political science and economics. However, whether these efforts have been successful is contestable, and naturalism has been subjected to wide-ranging criticism. Some critics point to what they view as formidable obstacles to subjecting the social world to scientific investigation. These include the possible absence of law-like regularities at the social level, the complexity of the social environment, and the difficulty of conducting controlled experiments. These represent practical difficulties, however, and do not necessarily force the conclusion that modeling social inquiry on the natural sciences is doomed to failure. More radical critics of naturalism argue that the approach is thoroughly misconceived. Proponents of interpretive social inquiry are perhaps the most significant among such critics. Advocates of this approach claim that the aim of social investigation should be to enhance our understanding of a meaningful social world rather than to produce causal explanations of social phenomena grounded in universal laws. In addition, many proponents of interpretive social inquiry also cast doubt on the possibility, as well as the desirability, of naturalism’s goals of objectivity and value neutrality. Their skepticism is shared by adherents of two other influential schools of social inquiry, known as critical theory and postmodernism. But proponents of these approaches also emphasize the various ways in which social science can mask domination in society and generally serve to reinforce the status quo. These various criticisms of naturalism are considered below.

a. The Absence of Social Laws

Among critics who point to practical obstacles impeding efforts to model social inquiry on the natural sciences, perhaps the most important objection questions the very existence of law-like regularities in the social world. They argue that the stringent criteria that philosophers of science have established for deeming an observed regularity to be an authentic law-like regularity cannot be met by proposed social laws. For a regularity to be deemed a genuine law of nature, the standard view holds that it must be universal; that is, it must apply in all times and places. The second law of thermodynamics, for example, is held to apply everywhere in the universe and at all points in the past and future. In addition, the types of laws of most importance to science are causal laws. A law may be described as causal, as opposed to a mere accidental regularity, if it represents some kind of natural necessity – a force or power in nature – that governs the behavior of phenomena. Not all law-like regularities meet the causal requirement. For instance, it is a regularity of nature that the earth orbits the sun in a certain elliptical path once every 365 days. But the orbital regularities of earth and the other planets in the solar system have no causal powers themselves. They are rather the product of certain conditions and certain causal laws, namely gravity and Newton’s laws of motion.

Whether there are genuine law-like causal regularities that govern social phenomena is not at all clear. In any event, no laws governing the social world have been discovered that meet the demanding criteria of natural science. To be sure, social scientists have identified many social regularities, some of which they have even dubbed social laws. Examples from the discipline of economics would include the laws of supply and demand. From political science we find Roberto Michels’ iron law of oligarchy, which holds that popular movements, regardless of how democratically inclined, over time will become hierarchical in structure. Another proposed law of politics is Duverger’s Law, which posits that two-party systems will emerge in political systems that feature simple-majority, single-ballot electoral systems. But upon closer inspection, these laws fail to meet the criteria for genuine law-like regularities. Sometimes, particularly in economics (which boasts more purported laws than the other social sciences), the laws merely describe logical relationships between concepts. These laws may be true by definition, but because they do not describe the empirical world, they are not scientific laws. On the other hand, social laws that claim to describe empirical regularities invariably turn out to be imprecise, exception-ridden and time-bound or place-bound rather than precise and universal. Consider the law of demand from economics, which holds that consumer demand for a good will decrease if prices go up and increase if prices go down. Though this pattern typically occurs, it is not without exception. Sometimes increasing the price of a good also increases demand for it. This may happen when consumers interpret a higher price as signaling higher quality or because purchasing an expensive good provides an opportunity for conspicuous consumption – wasteful expenditure as a display of status. Moreover, the law of demand is a weak law; it merely specifies an inverse relationship between price and demand. Unlike the more precise laws of natural science, it does not specify the magnitude of the expected change.
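To make the contrast vivid, the weak, qualitative form of the law of demand can be set beside a precise quantitative law of natural science; the notation is a common textbook rendering rather than anything the article itself employs:

\[
\frac{\partial Q_d}{\partial P} < 0
\qquad \text{versus} \qquad
F = G\,\frac{m_1 m_2}{r^2}
\]

The first statement says only that quantity demanded falls as price rises, with the magnitude left unspecified; the second, Newtonian gravitation, predicts an exact value of the force for any given masses and distance.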

In many cases proposed social laws are grounded in simplified and therefore false assumptions about human nature. For instance, the laws of economics are typically grounded in the assumptions of rational choice theory. This theory posits that individuals always act rationally and instrumentally, weighing potential costs and benefits as they aim to maximize their own utility. But though individuals may typically act rationally in this sense, especially in the economic sphere, it is nonetheless the case that they do not always do so. Psychologists, for instance, have documented numerous ways in which individuals frequently fail to act rationally, owing to predictable kinds of flawed reasoning or perceptual errors. Moreover, it is evident that much behavior, even within the sphere of economics, is not instrumental but rather is guided by social norms, habit or tradition. Thus the laws of economics grounded in the assumption of instrumental rationality are in fact false. Outside of economics, the laws of social science are fewer and generally even more dubious. Duverger’s law, which is also grounded in similar assumptions about human rationality, admits of numerous exceptions. Many simple-majority, single-ballot systems do in fact exhibit more than two political parties. And Michels himself acknowledged that his eponymous law could be nullified if steps were taken to enhance norms of democratic participation within groups. At best, such purported laws could be described as tendencies or typical patterns rather than genuine law-like regularities.

The reason for the absence of genuine laws in the social sciences is a source of debate. Some argue that the failure to uncover social laws stems from the complexity of human behavior and the social world. Human behavior is the product of manifold factors, including biological, psychological and perhaps sociological forces, each of which is itself quite complex. Moreover, the social systems in which human behavior is embedded are themselves highly intricate. Untangling the myriad interactions between multiple individuals in, for example, an economic system is a daunting task. Perhaps it simply lies beyond human cognitive powers to detect law-like patterns in such a milieu. Or perhaps no law-like regularities even obtain at the social level, even if laws obtain at the level of individuals.

In addition to complexity, another impediment to social scientists’ ability to uncover law-like regularities is the difficulty, and sometimes impossibility, of conducting controlled experiments. Natural scientists often enjoy the ability to manipulate variables in a controlled laboratory setting. This helps them identify causal factors with respect to phenomena that they are trying to explain. For practical or ethical reasons, this is often not possible in the social sciences. In many cases the best a social scientist can hope for is to uncover so-called natural experiments, in which a suspected causal factor is present in one naturally occurring setting but absent in another. For instance, suppose social scientists wish to test the hypothesis that television viewing causes violence. They would benefit from a natural experiment if they could find two demographically similar communities, one of which has just recently received access to television and another that remains without it.  They could then track violence rates over time in the two communities to determine if exposure to television does in fact lead to more violence. The difficulty is that social scientists must wait for natural experiments to come to them and, in any event, such experiments seldom offer the opportunity to control for all the potentially relevant variables.
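As a rough illustration of the comparison such a natural experiment invites, the sketch below computes how much more violence grew in the community that gained television than in the one that did not. All data and names are invented for illustration; an actual study would have to control for many additional variables.

    # Hypothetical sketch of the natural-experiment comparison described above.
    # The figures are made up; "change" and the variable names are illustrative only.

    def change(rates):
        """Change in the violence rate from the first to the last observation."""
        return rates[-1] - rates[0]

    # Violent incidents per 10,000 residents, observed annually (invented numbers).
    with_tv = [12.0, 12.4, 13.1, 13.9]      # community that recently gained television access
    without_tv = [11.8, 11.9, 12.0, 12.1]   # demographically similar community without it

    # Compare the growth in violence across the two communities over the same period.
    estimated_effect = change(with_tv) - change(without_tv)
    print(f"Estimated effect of television access: {estimated_effect:+.1f} per 10,000")

Even in this toy form, the comparison only suggests, and cannot establish, a causal link, since the two communities may differ in unmeasured ways.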

Some observers have pointed to the relative youth of social science to explain the failure to uncover law-like regularities of the social world. According to this view, the social sciences are still awaiting their Galileo or Newton to provide an explanatory framework that will allow them to begin uncovering such laws. However, critics of this view may note that rigorous, systematic attempts to explain social behavior arguably date back all the way to the ancient Greeks. And attempts to produce empirically grounded social inquiry intentionally modeled on natural science are almost as old as the scientific revolution itself. At many points in the history of social science, eminent figures have emerged who seemed to offer the promise of putting social investigation on a proper scientific footing. These would include Thomas Hobbes, Adam Smith, Auguste Comte, Emile Durkheim, Max Weber, as well as the numerous advocates of behaviorism and positivism in the twentieth century. But, in the end, a consensus on method and the hoped-for scientific progress have failed to materialize.

The explanations discussed above for why social scientists have yet to identify genuine law-like regularities cite the practical difficulties of uncovering such laws in the social realm. But more radical critics of naturalism argue that the attempt to unify the methods of the natural and social sciences is deeply misguided. They claim that the social world is different from the natural world in crucial respects that render the methods of natural science at best inadequate for enhancing understanding of the social world. At worst, naturalism not only fundamentally mischaracterizes the social world, it also serves to reinforce oppressive beliefs, values and social practices. These critics include advocates of interpretive social inquiry, critical theorists, and postmodernists.

b. Interpretivism and the Meaningfulness of the Social World

Advocates of interpretivism propose an approach to social inquiry grounded in assumptions about the nature of the social world that differ profoundly from those held by supporters of naturalism. In particular, interpretivists assert that the social world is fundamentally unlike the natural world insofar as the social world is meaningful in a way that the natural world is not. This difference can be made clear by considering the difference between human action and the behavior of entities or systems found in the natural world. Suppose that there is an action by an individual that we wish to explain – for example, voting at a school board meeting for a particular proposal. Imagine that the individual votes for a measure by raising his hand. The act of voting entails more than a particular physical movement, however. In fact, in different situations the same physical behavior of hand raising could indicate different things – posing a question, pointing to the ceiling, yawning, and so forth. Thus to adequately explain the person’s behavior, it is not enough to explain the physical processes that caused the hand raising. Indeed, in most cases of social inquiry, the physical processes will be irrelevant to explanation of the behavior. Rather, what is required is an account of the meaning behind the action. In this example, that would be an account of what the person meant by raising his hand, namely to vote.

There is no equivalent type of explanation in the physical sciences. Astronomers, for instance, might wish to explain the orbital path of a comet. To do so, they cite relevant natural laws and conditions that produce the comet’s orbital trajectory. But the motion of the comet has no meaning per se in need of explanation (although the appearance of the comet might be interpreted by some human observers as having some meaning, such as auguring ill fortune). Similarly, a physiologist might seek to explain the biophysical processes that cause limbs to rise. But, again, the physical processes that cause a human arm to rise have no meaning as such.  It is only from the standpoint of social, as opposed to biological, behavior that the action has meaning. Moreover, the elements of the natural world – its objects, forces, events and phenomena – are not created or constituted by the meanings that human beings attribute to them. They exist independent of human beliefs, and the laws that govern them are not dependent on human beliefs either. Atoms, DNA, planets, and so forth, would still exist and be governed by natural laws if human beings did not exist. This is obviously not the case for the social world. Social institutions – a marketplace, a church, a business firm, a sports game, marriage, and so forth – are created and governed in part by the beliefs that people hold about them.

What implication does the meaningful nature of the social world have for the methods and aims of social inquiry? According to interpretivists, it means that the key aim of social inquiry should be to enhance our understanding of the social world’s meanings as opposed to producing causal explanations of social phenomena. Interpretivists often compare social inquiry to textual interpretation. The aim of textual interpretation is to make sense of a novel, play, essay, religious document or other text by laying bare the beliefs, intentions, connections and context that comprise its meaning. Similarly, interpretivists say, the aim of social inquiry should be to make sense of the actions, beliefs, social practices, rituals, value systems, institutions and other elements that comprise the social world. This involves uncovering the intentions and beliefs that inform human action, which in turn requires making sense of the broader social context in which those beliefs, intentions, and actions reside.

i. Descriptivism

Interpretive theory has drawn much of its inspiration from the fields of cultural anthropology and ethnomethodology, the study of how people make sense of their everyday world. Indeed, some advocates of interpretive social inquiry wish to make the aims and methods of these approaches the exemplar for all social inquiry. A key goal of cultural anthropology is to make sense of the beliefs, norms, practices, and rituals of foreign cultures. For instance, suppose an anthropologist wishes to explain a particular religious ceremony practiced by a hunter-gatherer tribe. According to interpretivists, the aim of such inquiry has nothing to do with identifying relevant law-like regularities or causal mechanisms that govern the ceremony. Nor should the litmus test of a successful explanation be the ability to generate predictions about the tribe’s behavior in the ceremony (although the capacity to predict behavior might be a byproduct of such inquiry). Rather, the anthropologist’s aim should be to make sense of the purpose and meaning of the ceremony. Naturally, this would require producing an account of how the members of the tribe understand their ceremony. But it would also entail placing the ceremony within the broader context of the tribe’s values, worldview, practices or institutions. The end product of such investigation would be a so-called thick description that enhances our understanding of the tribe, rather than a causal explanation of their behavior. This kind of social inquiry has been labeled “descriptivism.”

Many social scientists and philosophers acknowledge that advocates of descriptivism have identified an important difference between the social and natural worlds. And there is no doubt that the thick descriptions of foreign cultures that the approach produces have greatly enhanced our understanding of them. This in turn has increased understanding of human society generally, insofar as it has revealed the great diversity of human beliefs, values, traditions, and practices. However, the claim that the primary goal of social inquiry should be to produce thick descriptions has been subjected to serious criticism from advocates of naturalism as well as from critics who identify with the interpretive approach.

A key objection to descriptivism is that it would limit interpretive inquiry to describing cultures or societies in their own terms, leaving no room for criticizing the beliefs, values or self-understandings of those cultures or societies. Clearly, the objection runs, this is unsatisfactory, for persons and even cultures collectively can be unaware or deeply misguided about how their societies really function, and some beliefs and values operative in a society may be incoherent, contradictory, self-defeating or even delusional. Surely a primary task of social inquiry must be to provide accounts that are more penetrating and critical than descriptivism can offer. If, as the Canadian political theorist Charles Taylor has said, the primary aim of social investigation is to tell us “what is really going on,” then descriptivism falls far short of this goal (1985b: 92).

ii. Hermeneutics

An important criticism of descriptivism challenges the notion that the role of the social scientist is simply to re-express the ideas, beliefs, values and self-understandings of a culture or society by adopting the viewpoint of its inhabitants. This criticism has been developed by advocates of an alternative and influential version of interpretive theory that draws on the philosophical hermeneutics of continental thinkers such as Martin Heidegger, Hans-Georg Gadamer, Paul Ricoeur, as well as Anglo-American theorists working within the tradition, most notably Taylor. These theorists argue that coming to understand a culture or society – or another person or even a text or work of art – does not involve producing an objective description of an independent object. That is, the philosophical hermeneutics approach rejects a subject/object ontology in which knowledge consists of an accurate representation of an external world in the mind of a subject. Instead, explaining the beliefs of a culture or society, whether our own or a foreign one, entails a kind of dialogue with it. The process of coming to understand a culture, society or social practice is analogous to a conversation with another person, especially one aimed at getting to know the other person. In such a conversation, both participants may have their views challenged, their presuppositions about the other exposed, and in the process a better understanding of themselves and their conversation partner will emerge.

The same holds for attempts to understand whole societies or cultures, according to the hermeneutical theorists. Understanding is produced through a dialectical process in which the self-understanding of both parties – the investigator as well as the culture being studied – may be transformed. In striving to explain the worldview embedded in a culture – its beliefs, values, and self-definitions – we must necessarily compare and contrast those beliefs, values, and self-definitions to our own. In doing so, we may come to see limitations, inconsistencies, contradictions, lacunae or even plain falsehoods associated with our own worldview as well as that of others. “Understanding,” Charles Taylor has written, “is inseparable from criticism, but this in turn is inseparable from self-criticism” (1985b: 131).  Advocates of the philosophical hermeneutics approach emphasize that such interpretive inquiry may also be applied to our own world. Taylor, for instance, via deep interpretive inquiry has detected a legitimation crisis at the core of contemporary Western society (1985b: 248-288). He argues that the instrumentalist and acquisitive values of modern industrial society are in contradiction with (and in fact erode) other fundamental Western values, including genuine autonomy and community.

Hermeneutics’ rejection of naturalism’s subject/object epistemology, and its embrace of a dialogical model of understanding, also leads to a very different understanding of data in the social sciences. Naturalists, Taylor has argued, wish to make data univocal (1985a: 117). That is, they seek to build theories grounded in data that will admit of only one meaning. Univocal data allow for intersubjective agreement among scientists and thus are a key source of science’s claim to objectivity. In the natural sciences, the goal of producing univocal data is frequently achieved. Natural scientists do in fact often reach consensus on the meaning of data used to construct or test a theory – for example, the composition of gases detected in a volcanic eruption, the number of sea turtle eggs detected on a beach, or the kind of radiation emitted in a supernova. But advocates of a hermeneutical approach to social inquiry argue that the data of social science theories can only be made univocal at the cost of producing a highly distorted or largely vacuous description of the social world. The data of the social world are partly composed of intentions, beliefs, values, rituals, practices and other elements in need of interpretation. Interpreting them requires unpacking the larger web of meanings in which they are embedded. However, no interpretation of such data can be considered final and uncontestable. As with the interpretation of a novel, a poem or a painting, there will be no criteria or external data that can be appealed to that will produce a definitive and incorrigible interpretation of social phenomena. This does not mean that anything goes and that all interpretations should be considered equally plausible or valid. But it does mean that the data of social science cannot be univocal in naturalism’s sense. Rather, the data of social science will remain multivocal and always open to multiple meanings. If consensus about the meaning of social phenomena is to be attained, it must be arrived at via dialogue rather than appeal to data deemed to be external, objective and beyond dispute.

Supporters of the hermeneutical approach also emphasize that social inquiry is inherently evaluative. Here the hermeneutical tradition departs decisively from descriptivism and naturalism, both of which embrace the aim of objective, value-free social inquiry. Descriptivists believe that an objective account of a culture can be rendered by recovering the point of view of the culture’s members. There is no need to assess the validity, coherence or merit of a culture’s desires and values. In fact, if the culture under study is a foreign one, to attempt to do so risks ethnocentricity – the improper judging of another culture in terms of one’s own values. Advocates of naturalism, embracing the fact/value distinction discussed above, tend to view desires, purposes and values as merely individuals’ subjective preferences, which cannot be rationally assessed. We may seek to explain the causes of people’s beliefs and values, but moral evaluation of them lies beyond science. But hermeneutical interpretivists argue that desires, values and purposes are not merely subjective. As humans we do not simply desire or value some end or trait unreflectively and uncritically. We also evaluate our values, desires and purposes – assess them as noble or base, deep or superficial, authentic or inauthentic, rational or irrational. For instance, a person might desire to hurt someone physically, but also view that desire as shameful, inconsistent with his more deeply held values, and not reflective of the kind of person he aspires to be. Importantly, this person would not be the only one in a position to evaluate his desire. In fact, others might be more perceptive in identifying the inconsistencies between the person’s deeper sense of self and his desire to hurt another. This means that a person can be mistaken regarding his or her own values, purposes or desires. They do not necessarily have the final word. The same holds for entire societies and cultures. Incongruence between values, purposes, desires and beliefs may also occur at a society-wide level, and good interpretive inquiry will bring these inconsistencies to light. In doing so, it will be evaluative.

There is another sense in which a purely descriptivist approach can fail to provide an adequate account of what’s really going on in a society. A descriptivist account may fail to identify causal processes or mechanisms that operate, to borrow a phrase from Karl Marx, behind the back of a society’s inhabitants. Identifying such processes and mechanisms may take the form of revealing how individual actions or social policies or practices may produce unintended consequences (sometimes welcome, but also often unwanted). Adam Smith’s unpacking of the invisible hand mechanism of the market is an exemplar of such kinds of explanations. Individuals and, indeed, entire societies may be dimly or even wholly unaware of such processes, and simply producing a thick description of a society may leave them obscure. According to some social scientists, unveiling such mechanisms is a central task of social science. This view is discussed in the final section of this article.

Advocates of naturalism as well as of hermeneutics may agree that an important aim of social investigation is to uncover such unseen causal processes. However, proponents of the philosophical hermeneutics approach will insist that any such explanation must begin with an attempt to make sense of individuals on their own terms, with their own concepts and self-descriptions. “Interpretive social science,” Taylor says, “cannot by-pass the agent’s self-understanding” by creating some purportedly neutral scientific language (1985b: 118). But some naturalists will insist that social science explanations need not always be tied to the particular self-understandings of the people under study. In fact, both the explanandum (that is, the phenomena to be explained) and the explanans (the explanation itself) may sometimes be couched in a neutral, transcultural scientific language. Such explanations typically attempt to make sense of phenomena that are either universal or common at least to most human societies (for example, birth, death, violence, order, domination, hierarchy). They would also be grounded in assumptions about human goals (for example, nutrition, safety, material well-being, status) and human rationality (typically means-end rationality) posited to be species-specific rather than culture-specific. These explanations require merely a thin, rather than a thick, description of the social practice or phenomena to be explained. In this way, naturalists believe that science can offer explanations of social phenomena that transcend – and are in fact superior to – the self-understanding of the society being explained.

A related critique of interpretive social inquiry leveled by naturalists is the charge of particularism. This criticism says that interpretive social inquiry would appear to produce merely a collection of particularistic interpretive accounts of different cultures. That is, an interpretive approach would seem to limit social science’s ability to explain similar kinds of events and phenomena that occur in different cultures. Political scientists, for example, do not want merely to explain the Iranian Revolution or the Russian Revolution. They also want to explain revolutions in general. This requires uncovering the typical conditions, mechanisms or laws that produce revolutions. That is, it requires creating a model of a typical revolution. This in turn entails abandoning the thick descriptions of human beliefs and goals favored by interpretivists and replacing them with a thinner, more abstract account of human action – the sort used by rational choice theorists, for example. If interpretivists object to using this level of abstraction, naturalists argue, it appears they must relinquish the goal of producing explanations of social phenomena that transcend particular cultures. This would necessitate abandoning many important questions that the social sciences have traditionally sought to answer.

c. The Hidden Ideology of Value Neutrality

Two other schools of thought that reject naturalism are critical theory and postmodernism. Both of these approaches agree that social inquiry must be in part interpretive. They also agree with advocates of hermeneutics that interpretation is an inherently evaluative activity. Thus they reject naturalism’s goal of value neutrality.  Their most important contribution to the critique of value neutrality lies in their exploration of the various ways that social science can serve to legitimate and reinforce oppressive values, beliefs and practices and thereby mask domination. Far from being unbiased, value neutrality represents a hidden ideology.

i. Critical Theory

Critical theory traces its origins to the Frankfurt School, founded in the 1920s in Germany, which included such thinkers as Max Horkheimer, Theodor Adorno, Herbert Marcuse and Jurgen Habermas. Coming out of the Marxist tradition, members of this school took to heart Marx’s famous conclusion from his “Theses on Feuerbach”: “Philosophers have hitherto only interpreted the world in various ways; the point is to change it.” Marx viewed his efforts to explain the inner workings of capitalism and the logic of history as a scientific endeavor. But he also saw social inquiry as necessarily intertwined with critiquing society and ultimately liberating mankind from oppression. Following in this vein, the original critical theorists argued that a social scientist should not – and cannot – be a neutral observer of the social world. Thus the Frankfurt School sought to retain the social criticism intrinsic to Marxism while distancing their approach from the rigidified orthodox version of the doctrine that propped up the totalitarian system in the Soviet Union and its satellites. In place of orthodox Marxism they aimed to produce a new theory that could at once explain the failure of socialism in the Western liberal democracies and also provide a critique of what they saw as oppressive features of developed capitalist societies.

Today critical theory encompasses a broader group of social theorists than solely the contemporary descendants of the Frankfurt School. Use of the term has expanded to include many other approaches, such as feminism and other liberation ideologies that claim to offer both a systematic explanation and critique of economic, social and political structures, institutions or ideologies that are held to oppress people. The aim of critical theory is human emancipation, and this is accomplished in part by laying bare structural impediments to genuine freedom, contradictions and incoherencies in people’s beliefs and values, and hidden ideologies that mask domination. Liberation thus comes through enlightenment. When people are made aware of the true nature of their situation, they will cast off the shackles of oppression. In this sense, critical theory remains continuous with the broader Enlightenment project of the West that began in the seventeenth century: reason would triumph over irrationality, superstition and prejudice to usher in a new era of freedom and justice.

For critical theorists the sources of domination and false consciousness are wide-ranging. Those in the Marxist tradition, for instance, explore how the values, beliefs and hierarchies generated by capitalism serve to keep the working class deluded and exploited. Feminist critical theorists examine how patriarchal values, which they find are deeply embedded in contemporary institutions, legal systems, and social values, serve to keep women subordinate. But critical theorists also train much of their criticism on mainstream social science, particularly its claim to value neutrality. Like the advocates of hermeneutical social inquiry described above, critical theorists contend that social inquiry is an inherently evaluative enterprise. In fact, critical theorists hold that social science is a necessarily political enterprise. Mainstream social science modeled on naturalism, they charge, reinforces the status quo and serves the interests of the powerful, though usually unwittingly. In contrast, critical theory wears its values on its sleeve as an intentionally partisan endeavor on the side of liberation.

How, according to critical theorists, does naturalistic social science serve the status quo and mask domination? They argue that many of the supposedly neutral, objective concepts and categories of social science actually subtly but powerfully support particular political interests and worldviews. Consider the understanding of rationality that is central to standard economic theory. Economists conceptualize rational action in a particular way, namely as maximizing utility – choosing the most efficient means to achieve some end. Economists may claim that their concept of rationality is merely descriptive, containing no moral judgment of individuals’ behavior. But in ordinary use “rationality” clearly implies a positive moral evaluation, and its opposite, “irrationality,” indicates a negative judgment. Therefore designating actions as rational or irrational not only has the effect of evaluating certain kinds of behavior as superior to others; it also tends to justify public policy grounded in assumptions about what constitutes rational individual or government behavior. In particular, public policy guided by economists’ conceptualization of rationality will tend to be governed by instrumental reasoning – achieving the most efficient means to some desired end. As such, it will be biased against other values or motivations for action that may interfere with efficiency, such as social justice, tradition, or preserving community. Other concepts used by social scientists are similarly value laden, critical theorists charge. When political scientists, for instance, describe societies as developed, developing or undeveloped, such classification necessarily implies a moral and political hierarchy among nations, with the wealthy, capitalist societies invariably winding up on top.

Critical theorists also point to other ways in which social science has helped to justify and reinforce oppressive practices and beliefs. In particular, critical theorists charge that social science often serves to reify social processes. That is, it tends to foster the illusion that malleable or socially constructed aspects of society are natural, permanent or otherwise incapable of being altered. Social scientists tend to take the institutions and social structure of society, as well as its values, beliefs, customs and habits, as a given. In doing so they establish the parameters within which public policy must operate. According to critical theorists, this produces a bias towards the status quo, and also tends to reinforce the power of dominant groups or forces in society. For example, orthodox economists tend to depict certain features of capitalist economies, such as inequality and unemployment, as the enduring and inevitable (if unwelcome) results of the laws of the market system. Attempts to eliminate these features will be ultimately ineffective or produce unacceptably high tradeoffs, in the form of, for example, high inflation and sluggish growth. Nothing can be done about this unhappy situation, economists may say; it results from the fundamental and inalterable dynamics of economic systems. But critical theorists charge that the purported laws of economics are in fact the product of certain institutional arrangements, beliefs and values that can be altered. Other kinds of economic systems are in fact possible. Relying on the (often questionable) expertise of the economist turns public policy into merely a technical matter. The reality is that economic policy is also political policy. The institutions and values that underpin an economy reflect political choices. However, social science modeled on the natural sciences tends to blind the public – as well as social scientists themselves – to this reality.

In addition to helping reify social structures, critical theorists argue that the knowledge produced by social science too easily becomes a tool with which to manipulate people rather than to enlighten or emancipate them. Consider, for instance, some of the ways that governments and private industry use findings from psychology and sociology. Politicians and interest groups hire psychologists to find the best way to sell their policy initiatives to the public, rather than attempting to enhance public understanding of complex policy issues. Political parties and private corporations use focus groups to discover which words or images have the biggest impact on the public and adjust their rhetoric and advertising accordingly. Political consultants in the United States, for example, in recent years have advised opponents of the estate tax to dub it a death tax, which focus group research shows reduces support for it. Such studies have also led consultants to advise opponents of efforts to rein in carbon emissions to use the term “climate change” rather than “global warming.” Public opinion is thus manufactured rather than discovered through deliberation and analysis. Critical theorists claim that in this way social science fosters a society governed by technocratic control and is thus ultimately corrosive to genuine democracy.

Plainly critical theory has much in common with the hermeneutical approach described above. Critical theorists and proponents of a hermeneutical social inquiry both agree that social science is an inherently evaluative enterprise. Also, critical theorists agree that social inquiry must be, at least in part, an interpretive activity. Social inquiry, they agree, must aim at enhancing understanding of our world rather than merely enhancing our powers of prediction and technical control. But the two approaches differ fundamentally in their ontological assumptions about the social world and the relationship between the social scientist and the objects of his or her study. As noted above, the hermeneutical school holds that understanding is a dialogical and transformative process. Through what Hans-Georg Gadamer called a fusion of horizons, both the social inquirer and the target of inquiry create a kind of higher understanding that transcends the viewpoints of both parties.

In contrast, critical theorists, along with those in the naturalism camp, tend to embrace a subject/object ontology. From this standpoint, objective knowledge is produced when the social scientist produces an accurate representation of the social world. This understanding of the relationship between the social investigator and the subjects of his study privileges the social scientist as the knowing expert. The truth – provided by the expert – enlightens the subjects of inquiry and, it is hoped, thereby sets them free. They trade in their distorted ideological understanding for the clear-eyed perspective provided by critical theory. But advocates of hermeneutical inquiry, as well as other critics of naturalism, may object that this approach may undermine the liberationist goals of critical theory. Social inquiry should enlighten its subjects, but this is best attained through dialogue rather than a top-down imposition of expert analysis. Indeed, people may be inclined to reject the verdict of the critical theorists, opposing such knowledge as not reflective of their own self-understanding or experience. For this reason some proponents of hermeneutical inquiry support a participatory form of social science, in which social scientists and non-expert citizens work together in conducting research aimed at enlightening subjects and solving social problems.

It is important to note, however, that critical theorists often insist that the ultimate test of a theory is whether its intended audience accepts it as valid. The purportedly oppressed – for example, the working class, women, racial minorities – must come to see the critical theorists’ evaluation of their situation as true. Nonetheless, the privileged position of the critical theorist is perhaps still retained. For in practice he or she decides when the subjects of his inquiry are still in the grip of false consciousness and when they see their situation as it truly is – that is, when they see the world as critical theory depicts it. Presumably no feminist critical theorist would accept the falsification of her theory of women’s oppression if the subjects of her inquiry, after dialogue and reflection, concluded that traditional gender roles benefit women. Rather, she would conclude that the distorting powers of patriarchal ideology are more pervasive and entrenched than she had thought.

ii. Postmodernism

Adherents of another influential school of thought, postmodernism, have also been critical of social science’s claim to value neutrality and, again like the critical theorists, they tend to see social science as a potential source of domination. While postmodernism is a rather loosely defined category, with the views of thinkers associated with it varying widely, some key tenets of the approach can be identified. Central among them is cultural and historical relativism. According to postmodernists, what counts as knowledge and truth is always relative to a particular culture or historical period. This holds not only for moral and aesthetic judgments, but also for the claims to truth made by natural and social science. Thus science does not offer a method for arriving at universal, objective truths that transcend time and place. Rather, it represents one way of knowing that reflects certain values, beliefs and interests of modern, Western society. Moreover, for postmodernists there is no fixed, universal human nature. Instead, human nature (our beliefs, values, desires, interests, and even our emotions) is itself a product of a particular history or social configuration – or, as postmodernists sometimes say, human nature is socially constructed. (Hence a variant of postmodernism is known as social constructionism.)

Postmodernists’ relativism and their denial of a universal human nature lead to certain criticisms of social science modeled on naturalism. They reject as deeply misguided attempts by social scientists to uncover patterns, structures or laws that purportedly transcend history and culture. For postmodernists, understanding of particular societies must be local and contextual.  In this respect, postmodernists partly share the concern of critical theorists that social science tends to reify social patterns and structure. But postmodernists are also skeptical of critical theory’s approach to social inquiry. Though distorting ideologies and power structures may obscure the truth, critical theorists maintain that ultimately an objective picture of society can be rendered. Moreover, the critical theorists’ view of enlightenment is grounded in the view that there is an identifiable universal human nature in need of liberation. But, given their relativism, postmodernists tend to see these views as supporting subtle forms of Western imperialism. In seeking to emancipate people, critical theorists risk imposing their own ethnocentric views of rationality, autonomy and justice onto non-Western societies (or reinforcing them in Western ones). Thus for postmodernists, critical theory is grounded in many of the same faulty assumptions about objectivity, rationality and knowledge as mainstream social science.

Perhaps the most influential postmodern critic of social science was the French social theorist Michel Foucault. Foucault not only challenged the value neutrality of social science, he also disputed the broader enlightenment view (shared by most critical theorists as well as social science modeled on naturalism) that modern reason and science provide the route to moral and epistemological progress. Foucault’s critique of social science concerned the way social science categorized individuals and groups, which he believed constituted a subtle but pervasive form of social power. His critique in some ways resembles the critical theorists’ observations described above regarding the ideological nature of social science categories. But Foucault’s critique was more radical.

Foucault contended that most if not all of the social kinds identified and used by social scientists are inventions. That is, they are the creations of social science as opposed to discoveries of natural kinds that reflect the real underlying, objective structure of social reality. Foucault trained much of his criticism on the fields of clinical psychology, criminology, and sociology, which in the nineteenth century began creating elaborate taxonomies of abnormal types of persons, for example, psychopaths, neurotics, kleptomaniacs, delinquents, and the like. Many of these new kinds of persons were identified by reference to their sexual proclivities. For instance, before the emergence of clinical psychology as a discipline, the now commonplace view that homosexuals are a kind of person did not exist. Of course, people prior to the emergence of psychology recognized that some individuals are sexually attracted to people of the same sex. But they did not generally see this fact as a fundamental element of a person’s nature that could be used to categorize him or her as a particular kind of person.

Foucault argued that in the process of creating such categories, social science at the same time created and disseminated a particular view of normality. In this way social science became a new and important kind of potentially oppressive power in the modern world. According to Foucault, the state works hand in hand with other institutions of the modern world – prisons, schools, medical clinics, the military – to monitor and control people. It accomplishes this, however, neither principally through brute force nor via a regimen of rewards and punishments. Rather, the state works in concert with social science to construct the very categories through which individuals understand themselves. In doing so it establishes the criteria by which normal and abnormal behavior is understood, and thereby regulates behavior – most importantly by getting people to regulate themselves. In this way social science has in effect become a handmaiden to the forces of domination rather than a potential source of emancipation. Significantly, Foucault never claimed that this new type of control is intentional. It is merely an unwelcome artifact of social science.

Foucault’s depiction of social science was part of his broader account of how all social orders generate claims to truth and knowledge. For Foucault what counts as truth or knowledge in a particular society is merely the product of a certain configuration of power relations. There is no truth or knowledge outside of such power regimes, he argued. Since the nineteenth century, the social sciences in conjunction with the state have been instrumental in setting up a new system of power/knowledge, principally through creating – not discovering – the categories by which we make sense of our social world. But, for Foucault, the alliance of the state and social science is merely the latest power regime in human history. Other systems preceded it and no doubt new systems of power/knowledge will emerge in the future. Here critics point to a disturbing implication of Foucault’s ideas. It appears that for Foucault human beings, collectively or individually, cannot liberate themselves from the grip of such power regimes. They may trade one regime for another, but no genuine emancipation is possible. Indeed, given Foucault’s views of the self as thoroughly constructed by social forces, the very notion of liberation becomes incoherent. Thus Foucault’s radical relativism would seem to undermine the central aim of any critical approach that seeks to unmask oppressive ideologies, enhance human autonomy, advance justice or promote greater social transparency. The ideas of other influential postmodern and social constructionist critics of social inquiry (such as Richard Rorty and Kenneth Gergen) that entail relativism and deny the existence of a fixed human nature would seem to be vulnerable to such criticism, too. Postmodernists may charge that mainstream social science modeled on naturalism and critical theory alike have the effect of imposing certain modernist notions of normality, rationality, and autonomy. But critics of postmodernism can retort that by undermining the very possibility of genuine emancipation postmodernism invites nihilism, quietism or apathy.

3. Methodological Individualism versus Holism

Another long-standing controversy in the philosophy of social science is the debate between methodological individualists and methodological holists. The former hold that social facts and phenomena are reducible without remainder to facts about individuals. Advocates of methodological holism, on the other hand, argue that there are some facts, conventionally dubbed “social facts,” that are not reducible to facts about individuals and that social phenomena can sometimes be adequately explained without reference to individuals. It should be noted that there is no necessary connection between support for methodological individualism or holism and one’s stance vis-à-vis the naturalism debate. Nonetheless there is a tendency for advocates of naturalism to embrace methodological individualism. Still, holists are found in the naturalist camp, too, including Emile Durkheim and Auguste Comte, both of whom were key figures in founding the field of sociology.

The individualism-holism debate can be somewhat confusing because the terms of debate often refer to different claims. Sometimes methodological individualism is understood to be a theory of meaning that holds that all statements about social entities or phenomena can be defined in terms that refer solely to individuals. So, according to this view, the meaning of “bureaucracy” can be defined exclusively in terms of the individuals that compose a bureaucracy without reference to the properties of a bureaucracy as an institution. Methodological individualism can also constitute an ontological theory. This version claims that only individuals are real and that social entities, facts or phenomena are, at best, useful abstractions. According to this view we may speak of armies, trade cycles or riots in our explanations, but we must keep in mind that such entities and phenomena merely describe individuals and their interactions with each other. Our terms describing social entities and phenomena may be useful for formulating shorthand descriptions or explanations, but this does not mean that the entities and phenomena that they refer to actually exist.

Both the meaning and the ontological versions of methodological individualism are contested. Critics of the meaning theory note that the view entails barring reference to institutions, rules, and norms when defining social entities and phenomena. This, they charge, is simply not possible. For instance, explaining the meaning of “army” would require defining it in terms of the individuals that compose an army, namely soldiers. But the description of the soldiers could not contain any reference to the rules, aims, norms, social relations and structures that in part create an army. Not only would, for example, a description of a soldier as someone who belongs to an army be barred, also prohibited would be any reference to other holistic phenomena and entities, such as wars or platoons. The account of soldiers would have to be limited solely to narrow descriptions of their psychological dispositions. Such restriction seems highly implausible, not least because soldiers’ self-understanding naturally includes holistic entities and phenomena. If individuals incorporate holistic entities into their actions and self-descriptions, why must social science be barred from doing so? Moreover, a social science bereft of such references seems unimaginable, and, in any event, social scientists routinely and without controversy employ them in their descriptions and explanations. Thus few actual practitioners of social inquiry accept the meaning thesis.

The ontological thesis is generally regarded as less objectionable but is still contested. It is arguable that individuals are the only real inhabitants of the social world, even if people typically act as if social entities and phenomena are real. So, for instance, a person might favor privatization of government services on the ground that, in her judgment, government control fosters bureaucracies, which in her view are inherently inefficient. She may hold this belief about bureaucracies without knowing anything about the attitudes, values and so forth of particular individuals who work in them. That is, she believes something about the nature of bureaucracies themselves as opposed to merely holding certain beliefs about the individuals that inhabit them. Methodological holists may claim that her belief is grounded in a proper realist understanding of institutions. Bureaucracies are real entities, they argue, because the institutional structure of bureaucracies affects the behavior of the individuals within them. But methodological individualists can retort that in principle the structural properties of a bureaucracy can be reduced to facts about the individuals that comprise them. This is true even if individuals, including bureaucrats themselves, believe and act as if bureaucracies themselves have certain properties. It may be impossible to define a bureaucracy in terms that omit reference to holistic entities, but that does not mean that bureaucracies or other holistic entities are real. The situation can be compared to the relationship between paranormal investigators and the ghosts that they believe in. It may be impossible to define “paranormal investigator” without reference to the idea of ghosts and other fantastical entities. And it may be the case that belief in ghosts affects the behavior of paranormal investigators. But none of this proves that ghosts exist.

A third and least controversial version of methodological individualism merely posits that social phenomena must be animated by individual actions. Therefore any satisfactory explanation of a social event or regularity must show how it is the result of individuals responding to a particular social situation. This view does not require that holistic entities or phenomena be defined in terms of individual-level facts, nor does it require denying the reality of holistic entities or phenomena. It simply requires that whenever a holistic entity or phenomenon is claimed to cause certain effects, or whenever a social regularity is identified, some plausible mechanism at the individual level that produces the phenomenon must be identified.

Some advocates of methodological individualism have argued that methodological holism is politically dangerous. They claim that ascribing reality to holistic entities lends credence to the view that such entities have needs or interests of their own. As such, methodological holism too readily becomes the handmaiden to tyrannical regimes that claim that the needs of the state or the nation transcend those of actual, living people. For this reason, Karl Popper called methodological individualism a “democratic-individualist” approach to social inquiry, whereas methodological collectivism supported “totalitarian justice.” However, critics of methodological individualism claim that it too has its own built-in biases. By denying the reality of institutional structures and other holistic entities – or at least downplaying the degree to which they can constrain individuals’ actions – methodological individualism tends to support a conservative political outlook. This worldview attributes individuals’ social or economic position principally to their own actions and abilities rather than the social situation that they are embedded in. Thus the poor are poor owing to their own choices and effort, and not because the capitalist system presents obstacles to exiting their situation.

4. What Social Science Does

Reflecting the dominant tendency of the philosophy of social science, most of this article has focused on comparing social science to the natural sciences. We have seen that formidable problems are encountered when the social sciences strive to produce theories that approach the range, elegance, predictive power and objectivity associated with natural science. But instead of asking whether social science can or should mirror the natural sciences, another way to evaluate social science is to ask: How does social science enhance our understanding of the social world? Assessing the merits of social science in this way entails reflection on the actual practices of social scientists – the methods they use, the questions they ask, the puzzles they try to solve, the kind of evidence that they produce, and so forth. Even if social science has failed to produce theories that rival the elegant and powerful theories of the natural sciences, that does not necessarily show that social science is not a worthwhile endeavor. One way to measure the success of the social sciences is to ask whether their findings surpass common sense or folk wisdom, or otherwise tell us something useful, non-obvious or counterintuitive about the social world. This section examines three ways in which social science could be deemed successful by this standard: uncovering facts about the social world, finding correlations, and identifying mechanisms.

a. Uncovering Facts

An important task of social inquiry is to lay bare facts about an often murky social world. This can be a significant achievement in its own right, even if the discovery and collection of facts never leads to the more desirable goals of producing elegant theories and causal explanations of social phenomena or empowers us to make precise predictions about the social world. Without social science, our factual understanding of the social world would be left mainly to folk wisdom and anecdotal evidence, neither of which is very reliable. Uncovering facts about the social world is no mean feat. It often requires empirical rigor and conceptual sophistication. It also often necessitates developing special methods for measuring the entities and phenomena of the social world.

Following are just a few examples of factual questions that social science can help answer. These questions seem inherently interesting or are important from the standpoint of public policy, and the answers to them are not likely to be evident without sophisticated inquiry. From economics: What types of economic systems produce the most robust economic growth? Is the economy currently shrinking or growing? What is the current unemployment rate? Has the income of the median worker in European Union member states increased in the past decade, and, if so, by how much? Has social mobility increased or decreased in advanced industrial societies? From political science: Which nations enjoy the most political freedom? Has political freedom throughout the world increased in recent decades? Has warfare? How popular is the current U.S. president with the American people? Is political discourse getting more sophisticated or less? From sociology: Have community ties grown stronger or weaker in Western societies in the past century? Are people in societies with individualistic values happier than those in communitarian societies? From criminology: Has crime increased in recent decades? If so, what kinds of communities have seen the biggest increases? From social psychology: How many people in the Western world suffer from clinical depression? Has this number increased or decreased recently? We can also include among the facts uncovered by social inquiry the thick descriptions of cultures and practices that interpretive inquiry can produce.

Of course, what counts as a fact will be a partly interpretive matter and thus dependent upon the self-understandings of the persons being studied. How, for example, should we conceptualize and measure freedom or individualism or depression? The definitions of these terms will always be contestable and subject to change. And social scientists will always be vulnerable to the critique, discussed above, that the facts they uncover reflect their own biases, interests or worldviews. Nonetheless, there are facts about the social world, and it seems fatuous to deny that social science at its best has made us better acquainted with them, even if no purely neutral and objective concepts can be used to describe them. The same is true, after all, for natural science.

b. Correlation Analysis

A particularly important tool of the social sciences for enhancing understanding of the social world is a host of statistical techniques that can be broadly described as correlation analysis. These statistical innovations were developed by social scientists in the late nineteenth century and came into widespread use beginning in the twentieth. The aim behind their development was to help get a handle on one of the most difficult problems confronting social science: How to account for the often bewildering number of variables that potentially influence social phenomena. As noted above, isolating the effects of particular variables in the social realm presents a formidable challenge to social scientists, owing to the difficulty – and sometimes impossibility – of conducting controlled experiments. Multivariate regression analysis, structural equation modeling and other sophisticated statistical tools address this problem by giving social scientists the ability to gauge with mathematical precision the impact of multiple variables on social phenomena. For example, suppose criminologists wish to shed light on the factors that influence the rate of violent crime. A host of potential social variables might plausibly be thought to do so, including poverty, education, sex, race, population density, gun-control laws, television viewing, and so forth. Multivariate regression, which provides the ability to hold multiple variables artificially constant, allows researchers to determine how strongly each of these variables is associated with violent crime. Such analysis might be able to tell us, for example, that poverty, sex, and education level account for 60% of the variance in crime and that gun control laws have no effect. Multivariate regression can even help gauge the interactive effects of various factors, perhaps showing that education level alone has little effect on crime but does have an impact when combined with poverty and high population density.
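
The following minimal sketch, in Python, illustrates what holding multiple variables statistically constant looks like in practice. The data are synthetic and the variable names (poverty, education, density) are hypothetical; the example uses the statsmodels library for ordinary least squares and is an illustration of the general technique, not of any particular criminological study.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical predictors (all names are invented for illustration).
poverty = rng.normal(size=n)
education = rng.normal(size=n)
density = rng.normal(size=n)

# Hypothetical outcome: "crime" depends on poverty and on an interaction
# between education and population density, plus random noise.
crime = (0.6 * poverty
         - 0.1 * education
         + 0.3 * education * density
         + rng.normal(size=n))

# Fitting all predictors (plus the interaction term) at once is the
# statistical analogue of holding the other variables constant.
X = sm.add_constant(np.column_stack(
    [poverty, education, density, education * density]))
model = sm.OLS(crime, X).fit()

print(model.params)     # estimated association of each variable with crime
print(model.rsquared)   # share of the variance in crime accounted for
```

The fitted coefficients estimate each variable’s association with the outcome while the others are held constant, and the R-squared value reports the share of variance the predictors jointly account for – the kind of figure cited in the example above.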

Correlation analysis has greatly enhanced social scientists’ understanding of the social world, but it is hampered by serious limitations. In particular, it can never tell researchers whether one variable causes changes in another variable. This is so even if a one-to-one correspondence between variables is uncovered. For it is always possible that there is an unknown third variable that is the true cause behind changes in the variable that investigators seek to explain. For example, suppose statistical analysis demonstrates a strong and stable correlation between individuals’ average television-viewing hours and violence: the more television individuals watch, the more likely they are to commit violent acts. But such evidence by itself cannot tell researchers whether watching television makes people more inclined to commit acts of violence or whether the violence-prone are more likely to watch television. Perhaps an unaccounted-for third factor – say, poor social skills or unemployment – is the true cause of the violence and the increased television viewing. Explaining the cause of some phenomenon requires understanding of the causal mechanism that produces it. This correlation analysis cannot provide. It can, however, tell social scientists when a causal connection does not exist. Correlation does not entail causation, but causal connections always produce correlation. So failure to uncover a correlation between certain variables can inform researchers that there is no causal connection between them. In this way, correlation analysis provides an important tool for falsifying hypotheses.
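
A short simulation can make the third-variable problem vivid. In this hypothetical Python sketch (with invented variable names and effect sizes), television viewing and violence are both driven by an unmeasured confounder and have no causal influence on each other, yet they end up strongly correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical unmeasured confounder, e.g. social isolation.
isolation = rng.normal(size=n)

# Both variables are driven by the confounder; neither causes the other.
tv_hours = 2.0 * isolation + rng.normal(size=n)
violence = 1.5 * isolation + rng.normal(size=n)

r = np.corrcoef(tv_hours, violence)[0, 1]
print(f"correlation between TV viewing and violence: {r:.2f}")
# A large correlation appears here purely because of the shared cause,
# not because television viewing has any effect on violence.
```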

c. Identifying Mechanisms

Some philosophers have argued that the primary explanatory power of social science resides in its ability to identify mechanisms, as opposed to discovery of law-like generalizations. Among the more important advocates of this view is Jon Elster, who defines mechanisms as “frequently occurring and easily recognizable causal patterns that are triggered under unknown conditions or with indeterminate consequences” (1999: 1). Mechanisms, Elster says, “allow us to explain, but not predict.” We may not be able to say precisely under what conditions a mechanism will be triggered or exactly how it will operate in particular circumstances. Nonetheless, we know a mechanism when we see one.  Elster denies that social science has uncovered any genuine law-like regularities and doubts that it ever will. However, social scientists can and have identified numerous mechanisms, which produce explanations that go beyond mere description, even if they fall short of explanations grounded in universal laws or theories. Explanation by mechanisms may not always permit us to make predictions, but we can often identify their operation in hindsight. Key aims of social science thus include identifying mechanisms, describing them with greater detail, and, if possible, more precisely identifying the kinds of situations that can trigger them.

With respect to social inquiry, mechanisms can be divided into individual-level and social-level kinds. Individual-level mechanisms describe typical ways in which individuals form desires and beliefs or fall prey to perception or reasoning errors. An important category of these mechanisms has the effect of reducing cognitive dissonance – the uncomfortable psychological stress caused by holding two incompatible beliefs simultaneously. One common mechanism that combats cognitive dissonance is wishful thinking, in which a person represses unpleasant beliefs that he or she knows to be true. The sour-grapes effect, in contrast, works on desires rather than beliefs. This mechanism takes its name from one of Aesop’s fables in which a fox decides that some grapes are undesirable because they are too high atop a vine for him to reach. These psychological mechanisms may be triggered whenever individuals find themselves in a situation that is contrary to the way they would prefer it to be. However, we will generally not be able to predict whether one of these mechanisms will be triggered in such a situation – or, if one is triggered, which one. But we can identify their operation retrospectively, and in this sense they provide some general explanatory power. Elster argues that the works of the ablest social observers in the Western tradition are replete with such mechanisms. Much of his analysis has focused on Alexis de Tocqueville’s Democracy in America and Paul Veyne’s Bread and Circuses, which explore the complex interaction between beliefs, desires and norms in, respectively, nineteenth-century American democracy and the political institutions of classical antiquity. Their insightful use of mechanisms in their explanations allows their work to transcend mere idiographic description and to shed light on contemporary politics.

Social-level mechanisms involve the interaction of individuals. Unveiling them requires untangling such interaction to reveal how it produces social phenomena. Often the most important part of, for example, an economist’s work resides in developing models that show how consumers and producers (or other types of actors) interact with each other to produce a particular economic phenomenon. According to this view, the laws of economics and politics discussed above are best understood as typical patterns produced by human interaction rather than genuine law-like regularities. Seen this way, the fact that the law of demand and Michels’ laws, for instance, are exception-ridden and far from universal does not completely vitiate their explanatory power. They still capture important features of human social relations, even if they fail to give social scientists the ability to determine precisely when or under what circumstances such phenomena will occur. Their real value resides not in predicting outcomes but in demystifying an often-opaque social milieu.
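
As an illustration of such a model of interaction, the following minimal Python sketch shows a toy price-adjustment mechanism in which buyers and sellers jointly push the price toward the level where quantity demanded equals quantity supplied. The linear demand and supply schedules and the adjustment speed are hypothetical and chosen only for simplicity.

```python
def demand(price):
    # Buyers want less as the price rises (hypothetical linear schedule).
    return max(0.0, 100.0 - 2.0 * price)

def supply(price):
    # Sellers offer more as the price rises (hypothetical linear schedule).
    return 10.0 * price

price = 1.0
for _ in range(50):
    excess_demand = demand(price) - supply(price)
    price += 0.01 * excess_demand  # a shortage nudges the price up, a glut down

# The equilibrium solves 100 - 2p = 10p, i.e. p is roughly 8.33.
print(f"price settles near {price:.2f}, where demand roughly equals supply")
```

The point is not the particular numbers but the pattern: the regularity described by the law of demand emerges here from the interaction of many adjusting agents, which is how, on this view, such laws are best understood.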

Of special interest to social scientists are social-level mechanisms that produce unintended consequences. The paradigmatic case of an unintended-consequences explanation is Adam Smith’s invisible hand, a concept developed in his seminal work The Wealth of Nations. The invisible hand operates when individuals contribute to the public good by pursuing their own, narrow interests. This phenomenon is ubiquitous in a capitalist economy. Firms seek to increase their profit by striving to produce the best goods for the lowest price, and consumers seek to satisfy their own desires by purchasing such goods. But in seeking to advance their own aims, both also at the same time spur economic growth, which reduces unemployment and raises living standards. The unintended – and happy – result of such self-interested behavior is greater overall wealth and prosperity. Sometimes unintended consequences are unwelcome or even disastrous, as in the case of the so-called tragedy of the commons. This phenomenon, described by Garrett Hardin in an influential 1968 Science essay, occurs when individuals have free access to some desirable resource and each seeks to maximize his or her take of the resource, resulting in its depletion, which makes everybody worse off. An example is provided by the rapid exhaustion of the ocean’s stock of fish. Commercial fishermen each strive to maximize their haul of fish, leading to the swift decline of the total stock and a reduction in each fisherman’s daily haul. Paradoxically, to increase their take over the long run, fishermen must submit to limits on how much fish they can remove from the sea.
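
The dynamic of the commons can also be illustrated with a toy simulation. The sketch below is a minimal Python model with invented parameters (stock size, regrowth rate, number of boats, per-boat catch); the numbers are not drawn from Hardin’s essay or from fisheries data. It compares unrestricted harvesting with a modest per-boat cap and shows the capped regime yielding the larger total catch over the long run, which is the paradox noted above.

    # Minimal sketch of a common-pool resource dynamic (illustrative parameters only).
    # Each season every boat takes a share of the fish stock; the stock then regrows
    # logistically. Unrestricted harvesting collapses the stock; a per-boat cap sustains it.

    def simulate(seasons, boats, take_per_boat, stock=1000.0, growth=0.3, capacity=1000.0):
        total_catch = 0.0
        for _ in range(seasons):
            harvest = min(stock, boats * take_per_boat)        # cannot take more than exists
            stock -= harvest
            stock += growth * stock * (1.0 - stock / capacity)  # logistic regrowth
            total_catch += harvest
        return total_catch, stock

    unrestricted = simulate(seasons=30, boats=20, take_per_boat=30.0)  # each boat takes all it can
    capped       = simulate(seasons=30, boats=20, take_per_boat=3.0)   # agreed limit per boat

    print("unrestricted: total catch %.0f, final stock %.0f" % unrestricted)  # stock collapses
    print("capped:       total catch %.0f, final stock %.0f" % capped)        # larger long-run catch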

Considering the explanatory practices of some other fields that we are inclined to call sciences lends support to the legitimacy of explanation via mechanisms rather than universal laws. As Roy D’Andrade has noted, the explanations produced by, for example, biology, geology, meteorology and oceanography typically do not rely on universal laws. As in the social world, the regularities and patterns found in these sciences are not timeless and universal. Instead they are contingent and contextual in the sense that they are dependent upon certain historical and environmental factors. Change the conditions and the patterns or regularities may alter or disappear altogether. “The [biologist’s] description of DNA,” D’Andrade notes, is “… not the description of a law, but rather the description of a complex contingent mechanism” (1986: 21, emphasis added). Sciences that explain via identification of such mechanisms, which he dubs the “natural” sciences (as opposed to the “physical” sciences, such as physics, astronomy and chemistry), include, he says, much of psychology, sociology, anthropology, economics and other social sciences. Natural sciences tend to view the objects of their inquiries as machines. The machines of the social sciences (understood as natural sciences in D’Andrade’s sense) would include social structures and institutions, such as markets, bureaucracies and electoral systems. The questions that scientists ask about a machine are: What is it made of? and How does it work? Offering a mechanistic account of the inner workings of such machines yields an explanation with a degree of generalizable knowledge. However, he adds that in the natural sciences, “[G]eneralizations about how things work are often complex, true only of one particular kind of thing, and usually best stated in a simplified natural language” (1986: 21). This is an apt description of the kinds of mechanisms, discussed above, that social science uncovers.

5. Methodological Pluralism

At present there is no agreement about the proper approach to investigating the social world, as this tour through some long-standing issues and debates in the philosophy of social science should have made clear. This lack of consensus is reflected in the methodological pluralism that marks social inquiry as currently practiced. Social scientists in the naturalist mold use various kinds of quantitative analyses, rational choice models (particularly in economics and political science), and experimental research (particularly in psychology) to uncover facts, patterns, and mechanisms in the social realm.  Outside the mainstream, various approaches informed by the descriptivist, hermeneutical, critical theory, and postmodern views described in previous sections can be seen. These would include (to name but a few) existential and humanistic psychology; ethnomethodology in anthropology; phenomenology, deconstructionism, and Foucauldian genealogy in sociology; Marxism, constructivism, and critical theory in political science; and different kinds of participatory research in various fields.

It would be facile to suggest that all of these methods and the theories underpinning them can be fully reconciled. But it also seems doubtful that one approach alone (either among those currently in use or one yet to be discovered) could capture the whole of social reality in all its multi-textured dimensions. Thus the present methodological pluralism of social science seems welcome and necessary. That the social world is a meaningful world created by self-interpreting beings, as the interpretive school holds, is undeniable. Thus one of the aims of social inquiry should be to capture that meaning. Also, as the hermeneutical, postmodern and critical theory approaches insist, social inquiry is inherently evaluative. A purely objective, neutral science of the social world is neither possible nor desirable. So, room must be made in social investigation for reflection on the biases, interests and ideologies embedded in various social science methods. And, finally, naturalistic mainstream social scientists are surely right to continue searching for patterns, mechanisms and causal processes in the social world, for they do exist, even if they are only relatively enduring and dependent upon social context, including the shifting self-understandings of human beings.

From this vantage, a kind of unification of the social sciences can be envisioned, though not in the sense advocated by naturalism. Unification in this sense requires, as the hermeneutical approach suggests, that we view social science as social practice. The efforts of social scientists should be seen as part of a wider, on-going human project to better understand ourselves and our world, and to make our world better. The facts, patterns and mechanisms that mainstream social science uncovers, the meanings that descriptivism unveils, and the self-reflective awareness of the values embedded in such inquiry that critical theory and hermeneutics counsel, should all be part of this broader human conversation.

6. References and Further Reading

  • Adorno, Theodor et al. 1976. The Positivist Dispute in German Sociology. New York: Harper & Row.
    • Advocates of naturalism, including Karl Popper and Hans Albert, debate critical theorists Theodor Adorno and Jurgen Habermas.
  • Bishop, Robert C. 2007.  The Philosophy of the Social Sciences. New York: Continuum.
    • A thorough and accessible overview of key issues in the philosophy of social science, but also an argument against an objectivist view of social inquiry and a defense of a dialogical one.
  • Collingwood, R.G. 1946. The Idea of History. Oxford: Oxford University Press.
    • Traces the development of interpretive social inquiry and defends it as the proper approach to historical explanation.
  • Comte, Auguste. 1988. Introduction to Positive Philosophy. Frederick Ferre, trans. Indianapolis, IN: Hackett Publishing Company, Inc.
    • Classic defense of naturalism and methodological holism by the nineteenth century founder of sociology.
  • D’Andrade, Roy. 1986. “Three Scientific World Views and the Covering Law Model,” in Metatheory in Social Science, Donald W. Fiske and Richard A. Shweder (Eds.). Chicago: Chicago University Press.
  • Durkheim, Emile. 1951. Suicide: A Study in Sociology. New York: The Free Press.
    • Durkheim’s explanation of suicide, citing anomie as the key social factor leading to higher suicide rates.
  • Durkheim, Emile. 1982. Rules of Sociological Method. New York: The Free Press.
    • Contains Durkheim’s defense of naturalism and methodological holism.
  • Elster, Jon. 1993. Political Psychology. Cambridge: Cambridge University Press.
    • Examines how Tocqueville and Veyne use psychological and social-level mechanisms to shed light on, respectively, modern egalitarian democracy and the political institutions and practices of classical antiquity.
  • Elster, Jon. 1999. Alchemies of the Mind. Cambridge: Cambridge University Press.
    • Explores the work of classical theorists, literature and folk wisdom for insight into mechanisms governing the interaction between rationality and the emotions.
  • Elster, Jon. 2007. Explaining Social Behavior: More Nuts and Bolts for the Social Sciences. Cambridge: Cambridge University Press.
    • A defense of the view that social science explanations require identification of causal mechanisms, as well as an overview of the different tools and concepts at the disposal of social scientists to help them do so.
  • Foucault, Michel. 1970. The Order of Things: An Archaeology of the Human Sciences. Alan Sheridan, trans. New York: Pantheon.
    • Argues that the emergence of the social sciences marks the emergence of man as a new kind of object of knowledge.
  • Foucault, Michel. 1977. Discipline and Punish: The Birth of the Prison. Alan Sheridan, trans. New York: Pantheon.
    • Argues that, beginning in the late eighteenth century, the locus of punishment shifted from the body to the soul, reflecting a new kind of control.
  • Geertz, Clifford. 1977. The Interpretation of Cultures. New York: Basic Books.
    • Contains Geertz’s argument that the aim of social inquiry is to produce thick descriptions of human cultures.
  • Habermas, Jurgen. 1972. Knowledge and Human Interest. Boston: Beacon Press.
    • Argues that different kinds of human inquiry reflect different interests. The proper aim of social inquiry is human emancipation not technological control.
  • Hardin, Garrett. 1968. “The Tragedy of the Commons.” Science 162: 1243-1248.
  • Held, David. 1980. Introduction to Critical Theory. Berkeley, CA: University of California Press.
    • An expansive introduction to, and evaluation of, the Frankfurt school.
  • Hempel, Carl G. 1942. “The Function of General Laws in History.” Journal of Philosophy 39:35-48.
    • Classic defense of the covering-law or deductive nomological model of explanation.
  • Hollis, Martin. 1994. The Philosophy of Social Science. Cambridge: Cambridge University Press.
    • Introduction to key issues and controversies in the philosophy of social science.
  • Little, Daniel. 1991. Varieties of Social Explanation: An Introduction to the Philosophy of Social Science. Boulder, Colo.: Westview Press.
    • Introduction to the philosophy of social science with emphasis on actual explanations of practicing social scientists. Defends rational choice and materialist explanations, and advocates methodological pluralism.
  • Lukes, Steven. 1968.  “Methodological Individualism Reconsidered.” British Journal of Sociology 19:119-129.
    • Overview of different meanings ascribed to methodological individualism and analysis of their plausibility.
  • Martin, Michael and Lee C. McIntyre (Eds.) 1994. Readings in the Philosophy of Social Science. Cambridge, MA.: The MIT Press.
    • Contains most of the classic essays in the field as well as important contemporary articles.
  • Nagel, Ernest. 1979. The Structure of Science. Indianapolis, IN: Hackett Publishing Company, Inc.
    • Includes an influential defense of naturalism and the possibility of value-neutral social science.
  • Popper, Karl. 1985. “Individualism versus Collectivism.” In Popper Selections. Edited by David Miller. Princeton, NJ: Princeton University Press.
    • Argues against the possibility of reducing sociological phenomena to facts about individual psychology but maintains that a particular kind of methodological holism – methodological collectivism – is  philosophically confused and politically dangerous.
  • Richardson, Frank C. and Blaine J. Fowers. 1998. “Interpretive Social Science: An Overview.” American Behavioral Scientist 41:465-95.
    • An overview and critique of naturalism, descriptivism, critical theory, postmodernism and social constructionism, and an argument for understanding social theory as social practice grounded in a hermeneutical ontology.
  • Rosenberg, Alexander. 1995. Philosophy of Social Science. Boulder, Colo.: Westview Press.
    • A thorough and incisive introduction to the topic.
  • Root, Michael. 1993.  Philosophy of Social Science. Cambridge, MA: Blackwell.
    • A critique of value neutrality in the social sciences. Argues for participatory and partisan social inquiry.
  • Smith, Adam. 1982. The Wealth of Nations. New York: Penguin Classics.
    • A sweeping treatise of political economy that, among other things, provides an explanation of capitalism and serves as the foundation of modern economic theory.
  • Taylor, Charles. 1971. “Interpretation and the Sciences of Man.” Review of Metaphysics 25:3-51.
    • An influential defense of the view that social inquiry must be interpretative.
  • Taylor, Charles. 1985a. Philosophical Papers: Volume 1, Human Agency and Language. Cambridge: Cambridge University Press.
    • Contains Taylor’s essay “Peaceful Coexistence in Psychology,” in which he argues that social science data cannot be made univocal.
  • Taylor, Charles. 1985b. Philosophical Papers: Volume 2, Philosophy and the Human Sciences. Cambridge: Cambridge University Press.
    • An argument that theories in political science necessarily entail value judgments.
  • Watkins, J.W.N. 1957. “Historical Explanations in the Social Sciences.” British Journal for the Philosophy of Science 8:104-117.
    • An influential defense of methodological individualism.
  • Weber, Max. 1978. Max Weber: Selections in Translation. Cambridge: Cambridge University Press.
    • Contains Weber’s classic essays on the methodology of social science, including his discussion of rational action, interpretative understanding [Verstehen], ideal types, and value neutrality.
  • Winch, Peter. 1958. The Idea of a Social Science and Its Relation to Philosophy. Atlantic Highlands, NJ: Humanities Press International, Inc.
    • An extended argument against naturalism and for a descriptivist version of interpretive social inquiry, drawing upon the ideas of Ludwig Wittgenstein.

Author Information

William A. Gorton
Email: gorton@alma.edu
Alma College
U. S. A.

Philosophy of Love

This article examines the nature of love and some of its ethical and political ramifications. For the philosopher, the question “what is love?” generates a host of issues: for some, love is an abstract noun and nothing more, a word unattached to anything real or sensible; for others, it is a means by which our being—our self and its world—is irrevocably affected once we are ‘touched by love’; some have sought to analyze it, while others have preferred to leave it in the realm of the ineffable.

Yet it is undeniable that love plays an enormous and unavoidable role in our several cultures; we find it discussed in song, film, and novels—humorously or seriously; it is a constant theme of maturing life and a vibrant theme for youth. Philosophically, the nature of love has, since the time of the Ancient Greeks, been a mainstay in philosophy, producing theories that range from the materialistic conception of love as purely a physical phenomenon—an animalistic or genetic urge that dictates our behavior—to theories of love as an intensely spiritual affair that in its highest form permits us to touch divinity. Historically, in the Western tradition, Plato’s Symposium presents the initiating text, for it provides us with an enormously influential and attractive notion that love is characterized by a series of elevations, in which animalistic desire or base lust is superseded by a more intellectual conception of love, which in turn is surpassed by what may be construed as a theological vision of love that transcends sensual attraction and mutuality. Since then there have been detractors and supporters of Platonic love as well as a host of alternative theories—including that of Plato’s student Aristotle, with his more secular theory of true love as reflecting what he described as ‘two bodies and one soul.’

The philosophical treatment of love cuts across a variety of sub-disciplines, including epistemology, metaphysics, religion, human nature, politics and ethics. Statements or arguments concerning love, its nature and its role in human life, for example, often connect to one or all of the central theories of philosophy, and love is often compared with, or examined in the context of, the philosophies of sex and gender as well as of body and intentionality. The task of a philosophy of love is to present the appropriate issues in a cogent manner, drawing on relevant theories of human nature, desire, ethics, and so on.

Table of Contents

  1. The Nature of Love: Eros, Philia, and Agape
    1. Eros
    2. Philia
    3. Agape
  2. The Nature of Love: Further Conceptual Considerations
  3. The Nature of Love: Romantic Love
  4. The Nature of Love: Physical, Emotional, Spiritual
  5. Love: Ethics and Politics
  6. References and Further Reading

1. The Nature of Love: Eros, Philia, and Agape

The philosophical discussion regarding love logically begins with questions concerning its nature. This implies that love has a “nature,” a proposition that some may oppose, arguing that love is conceptually irrational, in the sense that it cannot be described in rational or meaningful propositions. For such critics, who are presenting a metaphysical and epistemological argument, love may be an ejection of emotions that defy rational examination; on the other hand, some languages, such as Papuan, do not even admit the concept, which negates the possibility of a philosophical examination. In English, the word “love,” which is derived from Germanic forms of the Sanskrit lubh (desire), is broadly defined and hence imprecise, which generates first-order problems of definition and meaning. These are resolved to some extent by reference to the Greek terms eros, philia, and agape.

a. Eros

The term eros (Greek erasthai) is used to refer to that part of love constituting a passionate, intense desire for something; it is often referred to as a sexual desire, hence the modern notion of “erotic” (Greek erotikos). In Plato’s writings, however, eros is held to be a common desire that seeks transcendental beauty: the particular beauty of an individual reminds us of the true beauty that exists in the world of Forms or Ideas (Phaedrus 249E: “he who loves the beautiful is called a lover because he partakes of it.” Trans. Jowett). The Platonic-Socratic position maintains that the love we generate for beauty on this earth can never be truly satisfied until we die; but in the meantime we should aspire beyond the particular stimulating image in front of us to the contemplation of beauty in itself.

The implication of the Platonic theory of eros is that ideal beauty, which is reflected in the particular images of beauty we find, becomes interchangeable across people and things, ideas, and art: to love is to love the Platonic form of beauty, not a particular individual, but the element they possess of true (Ideal) beauty. Reciprocity is not necessary to Plato’s view of love, for the desire is for the object (of Beauty) rather than for, say, the company of another and shared values and pursuits.

Many in the Platonic vein of philosophy hold that love is an intrinsically higher value than appetitive or physical desire. Physical desire, they note, is held in common with the animal kingdom. Hence, it is of a lower order of reaction and stimulus than a rationally induced love—that is, a love produced by rational discourse and exploration of ideas, which in turn defines the pursuit of Ideal beauty. Accordingly, the physical love of an object, an idea, or a person in itself is not a proper form of love, love being a reflection of that part of the object, idea, or person, that partakes in Ideal beauty.

b. Philia

In contrast to the desiring and passionate yearning of eros, philia entails a fondness and appreciation of the other. For the Greeks, the term philia incorporated not just friendship, but also loyalties to family and polis (one’s political community), job, or discipline. Philia for another may be motivated, as Aristotle explains in the Nicomachean Ethics, Book VIII, for the agent’s sake or for the other’s own sake. The motivational distinctions turn on the ground of the love: one may love another because the friendship is useful, as in the case of business contacts; because the other’s character and values are pleasing (with the implication that if those attractive habits change, so too does the friendship); or for who the other is in himself or herself, regardless of one’s own interests in the matter. The English concept of friendship roughly captures Aristotle’s notion of philia, as he writes: “things that cause friendship are: doing kindnesses; doing them unasked; and not proclaiming the fact when they are done” (Rhetoric, II. 4, trans. Rhys Roberts).

Aristotle elaborates on the kinds of things we seek in proper friendship, suggesting that the proper basis for philia is objective: those who share our dispositions, who bear no grudges, who seek what we do, who are temperate and just, who admire us appropriately as we admire them, and so on. Philia could not emanate from those who are quarrelsome, gossips, aggressive in manner and personality, who are unjust, and so on. The best characters, it follows, may produce the best kind of friendship and hence love: indeed, how to be a good character worthy of philia is the theme of the Nicomachean Ethics. The most rational man is he who would be the happiest, and he, therefore, who is capable of the best form of friendship, which between two “who are good, and alike in virtue” is rare (NE, VIII.4, trans. Ross). We can surmise that love between such equals – Aristotle’s rational and happy men – would be perfect, with circles of diminishing quality for those who are morally removed from the best. He characterizes such love as “a sort of excess of feeling” (NE, VIII.6).

Friendships of a lesser quality may also be based on the pleasure or utility that is derived from another’s company. A business friendship is based on utility – on mutual reciprocity of similar business interests; once the business is at an end, the friendship dissolves. This is similar to those friendships based on the pleasure that is derived from the other’s company, which is not a pleasure enjoyed for who the other person is in himself, but for the flow of pleasure from his actions or humour.

The first condition for the highest form of Aristotelian love is that a man loves himself. Without an egoistic basis, he cannot extend sympathy and affection to others (NE, IX.8). Such self-love is not hedonistic or glorified, depending on the pursuit of immediate pleasures or the adulation of the crowd; it is instead a reflection of his pursuit of the noble and virtuous, which culminates in the pursuit of the reflective life. Friendship with others is required “since his purpose is to contemplate worthy actions… to live pleasantly… sharing in discussion and thought” as is appropriate for the virtuous man and his friend (NE, IX.9). The morally virtuous man deserves in turn the love of those below him; he is not obliged to give an equal love in return, which implies that the Aristotelian concept of love is elitist or perfectionist: “In all friendships implying inequality the love also should be proportional, i.e. the better should be more loved than he loves” (NE, VIII.7). Reciprocity, although not necessarily equal, is a condition of Aristotelian love and friendship, although parental love can involve a one-sided fondness.

c. Agape

Agape refers to the paternal love of God for man and of man for God, but is extended to include a brotherly love for all humanity. (The Hebrew ahev has a slightly wider semantic range than agape.) Agape arguably draws on elements from both eros and philia in that it seeks a perfect kind of love that is at once a fondness, a transcending of the particular, and a passion without the necessity of reciprocity. The concept is expanded on in the Judeo-Christian tradition of loving God: “You shall love the Lord your God with all your heart, and with all your soul, and with all your might” (Deuteronomy 6:5) and loving “thy neighbour as thyself” (Leviticus 19:18). The love of God requires absolute devotion that is reminiscent of Plato’s love of Beauty (and Christian interpreters of Plato such as St. Augustine employed the connections), which involves an erotic passion, awe, and desire that transcends earthly cares and obstacles. Aquinas, on the other hand, picked up on the Aristotelian theories of friendship and love to proclaim God as the most rational being and hence the most deserving of one’s love, respect, and consideration.

The universalist command to “love thy neighbor as thyself” refers the subject to those surrounding him, whom he should love unilaterally if necessary. The command employs the logic of mutual reciprocity, and hints at an Aristotelian basis that the subject should love himself in some appropriate manner: for awkward results would ensue if he loved himself in a particularly inappropriate, perverted manner! Philosophers can debate the nature of the “self-love” implied in this—from the Aristotelian notion that self-love is necessary for any kind of interpersonal love, to the condemnation of egoism and of the impoverished bases for loving another that pride and self-glorification provide. St. Augustine relinquishes the debate—he claims that no command is needed for a man to love himself (De bono viduitatis, xxi). Analogous to the logic of “it is better to give than to receive,” the universalism of agape requires an initial invocation from someone: in a reversal of the Aristotelian position, the onus for the Christian is on the morally superior to extend love to others. Nonetheless, the command also entails an egalitarian love – hence the Christian code to “love thy enemies” (Matthew 5:44-45). Such love transcends any perfectionist or aristocratic notions that some are (or should be) more loveable than others. Agape finds echoes in the ethics of Kant and Kierkegaard, who assert the moral importance of giving impartial respect or love to another person qua human being in the abstract.

However, loving one’s neighbor impartially (James 2:9) invokes serious ethical concerns, especially if the neighbor ostensibly does not warrant love. Debate thus begins on which elements of a neighbor’s conduct should be included in agape, and which should be excluded. Early Christians asked whether the principle applied only to disciples of Christ or to all. The impartialists won the debate, asserting that the neighbor’s humanity provides the primary condition of being loved; nonetheless his actions may require a second order of criticism, for the logic of brotherly love implies that it is a moral improvement on brotherly hate. For metaphysical dualists, loving the soul rather than the neighbor’s body or deeds provides a useful escape clause – or, in turn, the justification for penalizing the other’s body for sin and moral transgressions while releasing the proper object of love, the soul, from its secular torments. For Christian pacifists, “turning the other cheek” to aggression and violence implies a hope that the aggressor will eventually learn to comprehend the higher values of peace, forgiveness, and a love for humanity.

The universalism of agape runs counter to the partialism of Aristotle and poses a variety of ethical implications. Aquinas admits a partialism in love towards those to whom we are related while maintaining that we should be charitable to all, whereas others such as Kierkegaard insist on impartiality. Recently, Hugh LaFollette (1991) has noted that to love those one is partial towards is not necessarily a negation of the impartiality principle, for impartialism could admit loving those closer to one as an impartial principle, and, employing Aristotle’s conception of self-love, he argues that loving others requires an intimacy that can be gained only within partial relationships. Others would claim that the concept of universal love, of loving all equally, is not only impracticable but logically empty – Aristotle, for example, argues: “One cannot be a friend to many people in the sense of having friendship of the perfect type with them, just as one cannot be in love with many people at once (for love is a sort of excess of feeling, and it is the nature of such only to be felt towards one person)” (NE, VIII.6).

2. The Nature of Love: Further Conceptual Considerations

Presuming love has a nature, it should be, to some extent at least, describable within the concepts of language. But what is meant by an appropriate language of description may be as philosophically beguiling as love itself. Such considerations invoke the philosophy of language, of the relevance and appropriateness of meanings, but they also provide the analysis of “love” with its first principles. Does it exist, and if so, is it knowable, comprehensible, and describable? Love may be knowable and comprehensible to others, as understood in the phrases “I am in love” and “I love you,” but what “love” means in these sentences may not be analyzed further: that is, the concept “love” is irreducible – an axiomatic, or self-evident, state of affairs that warrants no further intellectual intrusion, an apodictic category perhaps, that a Kantian may recognize.

The epistemology of love asks how we may know love, how we may understand it, whether it is possible or plausible to make statements about others or ourselves being in love (which touches on the philosophical issue of private knowledge versus public behavior). Again, the epistemology of love is intimately connected to the philosophy of language and theories of the emotions. If love is purely an emotional condition, it is plausible to argue that it remains a private phenomenon incapable of being accessed by others, except through an expression of language, and language may be a poor indicator of an emotional state both for the listener and the subject. Emotivists would hold that a statement such as “I am in love” is irreducible to other statements because it is a nonpropositional utterance, hence its veracity is beyond examination. Phenomenologists may similarly present love as a non-cognitive phenomenon. Scheler, for example, toys with Plato’s Ideal love, which is cognitive, claiming: “love itself… brings about the continuous emergence of ever-higher value in the object–just as if it were streaming out from the object of its own accord, without any exertion (even of wishing) on the part of the lover” (1954, p. 57). The lover is passive before the beloved.

The claim that “love” cannot be examined is different from the claim that “love” should not be subject to examination – that it should be put or left beyond the mind’s reach, out of a dutiful respect for its mysteriousness, its awesome, divine, or romantic nature. But if it is agreed that there is such a thing as “love” conceptually speaking, then when people present statements concerning love, or admonitions such as “she should show more love,” a philosophical examination seems appropriate: is love synonymous with certain patterns of behavior, with inflections in the voice or manner, or with the apparent pursuit and protection of a particular value (“Look at how he dotes upon his flowers – he must love them”)?

If love does possess “a nature” which is identifiable by some means – a personal expression, a discernible pattern of behavior, or other activity – it can still be asked whether that nature can be properly understood by humanity. Love may have a nature, yet we may not possess the proper intellectual capacity to understand it; we may perhaps gain glimpses of its essence, as Socrates argues in The Symposium, while its true nature remains forever beyond humanity’s intellectual grasp. Accordingly, love may be partially described, or hinted at, in a dialectic or analytical exposition of the concept, but never understood in itself. Love may therefore become an epiphenomenal entity, generated by human action in loving, but never grasped by the mind or language. Love may be so described as a Platonic Form, belonging to the higher realm of transcendental concepts that mortals can barely conceive of in their purity, catching only glimpses of the Forms’ conceptual shadows that logic and reason unveil or disclose.

Another view, again derived from Platonic philosophy, may permit love to be understood by certain people and not others. This invokes a hierarchical epistemology: only the initiated, the experienced, the philosophical, or the poetical or musical may gain insights into its nature. On one level this admits that only the experienced can know its nature, which is putatively true of any experience, but it also may imply a social division of understanding – that only philosopher kings may know true love. On the first implication, those who do not feel or experience love are incapable (unless initiated through rite, dialectical philosophy, artistic processes, and so on) of comprehending its nature, whereas the second implication suggests (though this is not a logically necessary inference) that the non-initiated, or those incapable of understanding, feel only physical desire and not “love.” Accordingly, “love” belongs either to the higher faculties of all, understanding of which requires being educated in some manner or form, or it belongs to the higher echelons of society – to a priestly, philosophical, artistic, or poetic class. The uninitiated, the incapable, or the young and inexperienced – those who are not romantic troubadours – are doomed only to feel physical desire. This separating of love from physical desire has further implications concerning the nature of romantic love.

3. The Nature of Love: Romantic Love

Romantic love is deemed to be of a higher metaphysical and ethical status than sexual or physical attractiveness alone. The idea of romantic love initially stems from the Platonic tradition that love is a desire for beauty – a value that transcends the particularities of the physical body. For Plato, the love of beauty culminates in the love of philosophy, the subject that pursues the highest capacity of thinking. The romantic love of knights and damsels emerged in the early medieval ages (eleventh-century France, fine amour); it was a philosophical echo of both Platonic and Aristotelian love and, literally, a derivative of the Roman poet Ovid and his Ars Amatoria. Romantic love theoretically was not to be consummated, for such love was transcendentally motivated by a deep respect for the lady; however, it was to be actively pursued in chivalric deeds rather than contemplated – in contrast to Ovid’s persistent sensual pursuit of conquests!

Modern romantic love returns to Aristotle’s version of the special love two people find in each other’s virtues – one soul and two bodies, as he poetically puts it. It is deemed to be of a higher status, ethically, aesthetically, and even metaphysically, than the love that behaviorists or physicalists describe.

4. The Nature of Love: Physical, Emotional, Spiritual

Some may hold that love is physical, i.e., that love is nothing but a physical response to another to whom the agent feels physically attracted. Accordingly, the action of loving encompasses a broad range of behavior including caring, listening, attending to another, preferring one person to others, and so on. (This would be proposed by behaviorists.) Others (physicalists, geneticists) reduce all examinations of love to the physical motivation of the sexual impulse – the simple sexual instinct that is shared with all complex living entities, which may, in humans, be directed consciously, sub-consciously or pre-rationally toward a potential mate or object of sexual gratification.

Physical determinists, those who believe the world to be entirely physical and every event to have a prior physical cause, consider love to be an extension of the chemical-biological constituents of the human creature and to be explicable according to such processes. In this vein, geneticists may invoke the theory that the genes (an individual’s DNA) form the determining criteria in any sexual or putative romantic choice, especially in choosing a mate. However, a problem for those who claim that love is reducible to the physical attractiveness of a potential mate, or to the blood ties of family and kin which forge bonds of filial love, is that such accounts do not capture the affections between those who cannot or do not wish to reproduce – that is, physicalism or determinism ignores the possibility of romantic, ideational love; it may explain eros, but not philia or agape.

Behaviorism, which stems from the theory of the mind and asserts a rejection of Cartesian dualism between mind and body, entails that love is a series of actions and preferences which are thereby observable to oneself and others. The behaviorist theory that love is observable (according to the recognizable behavioral constraints corresponding to acts of love) suggests also that it is theoretically quantifiable: that A acts in a certain way (actions X, Y, Z) around B, more so than he does around C, suggests that he “loves” B more than C. The problem with the behaviorist vision of love is that it is susceptible to the criticism that a person’s actions need not express their inner state or emotions—A may be a very good actor. Radical behaviorists, such as B. F. Skinner, claim that observable and unobservable behavior, such as mental states, can be examined from the behaviorist framework, in terms of the laws of conditioning. On this view, that one falls in love may go unrecognised by the casual observer, but the act of being in love can be examined by what events or conditions led to the agent’s believing she was in love: this may include the theory that being in love is an overtly strong reaction to a set of highly positive conditions in the behavior or presence of another.
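
The behaviorist claim that love is theoretically quantifiable can be given a crude illustration. The Python sketch below is only a toy: the agents, actions, and counts are hypothetical, and nothing here corresponds to an established psychological instrument; it simply tallies love-indicating actions per person, which is all the quantifiability claim amounts to.

    # Minimal sketch of the behaviorist idea that "love" is measurable from observed actions.
    # The actions and targets are hypothetical examples, not data or a validated measure.
    from collections import Counter

    observed_actions = [
        ("B", "caring"), ("B", "listening"), ("B", "attending"),
        ("C", "listening"),
        ("B", "preferring"),
    ]

    tallies = Counter(target for target, _ in observed_actions)
    # A performs more love-indicating actions around B than around C, which the
    # behaviorist reads as: A "loves" B more than C.
    print(tallies.most_common())   # [('B', 4), ('C', 1)]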

Expressionist love is similar to behaviorism in that love is considered an expression of a state of affairs towards a beloved, which may be communicated through language (words, poetry, music) or behavior (bringing flowers, giving up a kidney, diving into the proverbial burning building), but which is a reflection of an internal, emotional state, rather than an exhibition of physical responses to stimuli. Others in this vein may claim love to be a spiritual response, the recognition of a soul that completes one’s own soul, or complements or augments it. The spiritualist vision of love incorporates mystical as well as traditional romantic notions of love, but rejects the behaviorist or physicalist explanations.

Those who consider love to be an aesthetic response would hold that love is knowable through the emotional and conscious feeling it provokes yet which cannot perhaps be captured in rational or descriptive language: it is instead to be captured, as far as that is possible, by metaphor or by music.

5. Love: Ethics and Politics

The ethical aspects of love involve the moral appropriateness of loving and the forms it should or should not take. The subject area raises such questions as: is it ethically acceptable to love an object, or to love oneself? Is love of oneself or of another a duty? Should the ethically minded person aim to love all people equally? Is partial love morally acceptable or permissible (that is, not right, but excusable)? Should love only involve those with whom the agent can have a meaningful relationship? Should love aim to transcend sexual desire or physical appearances? Some of the subject area naturally spills into the ethics of sex, which deals with the appropriateness of sexual activity, reproduction, heterosexual and homosexual activity, and so on.

In the area of political philosophy, love can be studied from a variety of perspectives. For example, some may see love as an instantiation of social dominance by one group (males) over another (females), in which the socially constructed language and etiquette of love is designed to empower men and disempower women. On this theory, love is a product of patriarchy and acts analogously to Karl Marx’s view of religion as the opiate of the people: love is the opiate of women. The implication is that were they to shrug off the language and notions of “love,” “being in love,” “loving someone,” and so on, they would be empowered. The theory is often attractive to feminists and Marxists, who view social relations (and the entire panoply of culture, language, politics, institutions) as reflecting deeper social structures that divide people into classes, sexes, and races.

This article has touched on some of the main elements of the philosophy of love. It reaches into many philosophical fields, notably theories of human nature, the self, and the mind. The language of love, as it is found in other languages as well as in English, is similarly broad and deserves more attention.

6. References and Further Reading

  • Aristotle Nicomachean Ethics.
  • Aristotle Rhetoric. Rhys Roberts (trans.).
  • Augustine De bono viduitatis.
  • LaFollette, Hugh (1991). “Personal Relations.” Peter Singer (ed.) A Companion to Ethics. Blackwell, pp. 327-32.
  • Plato Phaedrus.
  • Plato Symposium.
  • Scheler, Max (1954). The Nature of Sympathy. Peter Heath (trans.). New Haven: Yale University Press.

Author Information

Alexander Moseley
Email: alexandermoseley@icloud.com
United Kingdom

Angélique de Saint Jean Arnauld d’Andilly (1624–1684)

Angélique de Saint-Jean Arnauld d’Andilly, an abbess of the convent of Port-Royal, was a leader of the intransigent party in the Jansenist movement.  A prolific author, Mère Angélique de Saint-Jean translated her determined opposition to civil and ecclesiastical authorities in the Jansenist controversy into a militant version of the neo-Augustinian philosophy she shared with other Jansenists.

Often citing the works of Saint Augustine himself, the abbess defends a dualistic metaphysics where mental reason opposes the physical senses and where supernatural faith opposes a reason ravaged by strong desires. Her moral theory presents an Augustinian account of virtue: the alleged natural virtues of the classical pagans are only disguised vices; authentic moral virtue can spring only from the theological virtues, infused through God’s sovereign grace.  Her epistemology criticizes the exercise of doubt in the religious domain, since such doubt often serves the interests of the civil and religious powers opposed to the Jansenist minority.  Power rather than a disinterested search for truth often characterizes dialogues inviting the minority to entertain doubts which will lead the minority to surrender its convictions to the stronger partner.  Strongly polemical in character, the writings of Mère Angélique de Saint-Jean detail a code of ethical resistance by which an embattled minority can refuse the coercion of the majority through a politics of non-compliance, silence, and spiritual solitude.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Themes
    1. Virtue Theory
    2. Code of Resistance
    3. Metaphysical Dualism
    4. Epistemology and Certitude
  4. Interpretation and Relevance
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born on November 28, 1624, Angélique Arnauld d’Andilly belonged to a noblesse de robe family prominent at the French court.  Her father Robert Arnauld d’Andilly was the superintendent of the estate of the Duc d’Orléans, the brother of Louis XIII; her mother Catherine Le Fèvre de la Broderie Arnauld d’Andilly was the daughter of an ambassador.  The family was closely tied to the Parisian convent of Port-Royal and the Jansenist movement with which the convent was allied.  Angélique’s aunts Angélique Arnauld and Agnès Arnauld served as Port-Royal’s abbesses during the convent’s reform in the early seventeenth century; her uncle Antoine Arnauld emerged as Jansenism’s leading philosopher and theologian; her uncle Henri Arnauld, bishop of Angers, became one of the movement’s leading defenders in the episcopate.  Four other aunts and her widowed grandmother became nuns at Port-Royal; four of her sisters would follow.  Her father, one brother, and three cousins would join the solitaires, a community of priests and laymen devoted to meditation and scholarship on the grounds of Port-Royal.  Her father would distinguish himself by his translations of Latin Christian classics; her cousin Louis-Isaac Le Maître de Sacy would become France’s leading biblical exegete and translator.  From infancy, Angélique Arnauld d’Andilly imbibed the convent’s radical Augustinian philosophy and her family’s taste for patristic literature.

Angélique Arnauld d’Andilly entered the convent school of Port-Royal in 1630.  She quickly established herself as an outstanding scholar, renowned for her fluency in Greek and Latin.  Madame de Sévigné praised her as a precocious genius; although hostile to Port-Royal, the Jesuit literary critic René Rapin praised her grasp of the works and thought of Saint Augustine.  Now known as Soeur Angélique de Saint-Jean, she pronounced her vows as a nun of Port-Royal in 1644.  Authorities confided a series of key convent positions to her: headmistress of the convent school, novice mistress, subprioress.  In the 1650s, as the dispute over Jansenism intensified, the nun commissioned a series of memoirs by and on the nuns central to the convent’s reform.  Apologetic works to prove the convent’s orthodoxy, the memoirs would survive as key literary documents attesting to the personalities and theories of Port-Royal.  Although respected for her intellectual and managerial skill, Mère Angélique de Saint-Jean did not impress all by her emotional temperament.  Even her uncle Antoine Arnauld and her aunt Mère Angélique Arnauld rebuked their niece for what they perceived as an intellectual vanity that often presented itself as icy imperiousness.

When the quarrel over Jansenism turned into “the crisis of the signature,” Mère Angélique de Saint-Jean quickly imposed herself as the head of the most intransigent group of nuns at Port-Royal.  In 1661 Louis XIV had declared that all priests, religious, and teachers must sign a formulary that assented to the Vatican’s condemnation of five theological errors allegedly contained in Cornelius Jansen’s work Augustinus.  Using the droit/fait distinction, Antoine Arnauld had argued that Jansenists could sign the formulary inasmuch as it touched on matters of droit (matters of faith and morals, in this case five theological propositions condemned by the church as heretical) but that they could not assent on matters of fait (empirical fact, in this case the church’s judgment that Jansen himself had defended the heretical propositions).  In June 1661 Soeur Angélique de Saint-Jean reluctantly signed the formulary but, against her uncle’s advice, added a postscript that indicated the strictly reserved nature of her assent.  When the Vatican annulled the reserved signatures and demanded new signatures without any postscript, Soeur Angélique de Saint-Jean cleverly added a new preface to the formulary which explained the conditional nature of the assent of the nuns.  In the face of the nuns’ recalcitrance, authorities took stronger measures against the convent.  In 1664 Soeur Angélique was exiled to the convent of the Anonciades, where she lived under virtual house arrest.  In 1665 the nun was regrouped with the other non-signing nuns at Port-Royal.  Deprived of the sacraments and placed under armed guard, the nuns still managed to maintain surreptitious contact with their external allies through the strategies of Soeur Angélique de Saint-Jean.  Throughout the period of persecution, the nun bitterly criticized moderates, such as Madame de Sablé, who sought to negotiate a compromise between the Jansenists and their opponents, as well as the minority of nuns who had signed the formulary without reservation.  Only with reluctance did she accept the “Peace of the Church” (1669-79), which lifted the sanctions from Port-Royal in return for minor concessions in a modified formulary.

Elected abbess in 1678, Mère Angélique de Saint-Jean delivered an extensive cycle of abbatial conferences at Port-Royal.  The conferences were largely commentaries on Scripture, the Rule of Saint Benedict, and the Constitutions of Port-Royal.  Her extensive correspondence, which often promoted the works and theories of Saint Augustine to her spiritual directees, and her writings on questions dealing with persecution circulated widely among the laity allied with Port-Royal.  In 1679, the persecution of Port-Royal abruptly recommenced.  Archbishop François Harlay de Champvallon ordered the closure of the convent’s school and novitiate; without the ability to accept younger members, the convent was doomed to a slow death.  The convent’s chaplain and confessors were expelled.  Although the nuns were free to pursue their cloistered activities, the newly imposed clerics clearly attempted to convince the nuns to renounce their alleged Jansenist heresies.

During the rest of her abbacy, Mère Angélique de Saint-Jean protested the injustice of this new persecution through letters addressed to bishops, courtiers, aristocrats, and ambassadors.  Her correspondence with the pope and the king shows her characteristic boldness.  Her appeal to Pope Innocent XI is a thinly veiled attack on the Jesuits: “If Your Holiness could finally be informed about all we have suffered, brought about only by the jealousy and malice of certain people against some very learned and very pious theologians, some of whom have participated in the governance of this convent, I am sure that the narrative of these sufferings, which has few parallels in recent centuries, would soften the heart of Your Holiness [L; letter of May 29, 1679 to Pope Innocent XI].”  Her protest to Louis XIV is a rebuke of the refusal of the throne to explain on what grounds this new persecution is justified: “Sire, it is the gravest sorrow of those who have such sentiments [of loyalty toward you] to perceive that you see us as something evil, but we have no way to leave this very painful state of affairs since we are not permitted to know what has placed us in this situation and what still keeps us here [L; letter of February 6, 1680 to King Louis XIV].”  Despite her protests, the sanctions against Port-Royal remained in place and the aging convent became increasingly isolated.

Still in office as abbess, Mère Angélique de Saint-Jean Arnauld d’Andilly died on January 29, 1684.

2. Works

In terms of philosophical significance, the most important works of Mère Angélique de Saint-Jean Arnauld d’Andilly are the commentaries produced during her abbacy (1678-84).  Discourses on the Rule of Saint Benedict gives the ancient monastic rule a radical Augustinian edge by its insistence on the absolute necessity of grace to cultivate any of the moral virtues praised by Saint Benedict.  Conferences on the Constitutions of Port-Royal emphasizes the rights of nuns to limited self-government and the right of the abbess to act as the principal spiritual director and theologian of the convent.  Reflections to Prepare the Nuns for Persecution, a commentary on Mère Agnès Arnauld’s earlier Counsels, stresses the opposition between the world and the disciple; it details the moral virtues and spiritual dispositions necessary to resist persecution for the sake of personal conscience.

Other opuscules develop Mère Angélique de Saint Jean’s epistemology and political philosophy.  On the Danger of Hesitation and Doubt Once We Know Our Duty analyzes the act of doubt in terms of power relationships. Never neutral, the exercise of self-doubt by a persecuted minority often serves the interests of a majority determined to vanquish the minority and coerce a change in its opinions.  Three Conferences on the Duty to Defend the Church argues that authentic religious obedience is not servility; it can express itself by staunch opposition to civil and ecclesiastical authorities when the latter endorse error or illegitimately invade the sanctuary of conscience.

The extensive correspondence of the abbess also indicates how her militant brand of Augustinianism differs from the more moderate version promoted by the clerical advisers of Port-Royal.  Her epistolary exchange with her uncle Antoine Arnauld details her opposition to compromise over the issue of the Augustinus and expresses the stark opposition between world and self which she considers the fate of concupiscent humanity.

The abbess’s best-known work, the autobiographical Report of Captivity, details her house arrest at the Anonciade convent; it illustrates how Mère Angélique de Saint-Jean personally used the techniques of resistance to oppression she champions in her more theoretical works.  Discourses of Mère Angélique de Saint-Jean Called “Miséricordes” provides a radical Augustinian framework for the Port-Royal genre of miséricorde, a type of eulogy for deceased nuns and lay benefactors given by the abbess in chapter.  In Mère Angélique de Saint-Jean’s version, the moral virtues of the deceased are clearly the work of divine grace, not of human will; they are an earnest of the election to which God’s inscrutable sovereignty has summoned them.

3. Philosophical Themes

Militancy is the salient trait of the philosophy developed by Mère Angélique de Saint-Jean.  Drawing on the general Augustinian philosophy of Port-Royal, the abbess stresses the stark opposition to the world which should characterize such a philosophy.  Her virtue theory conceives the monastic vows as a species of martyrdom against a corrupt society.  Her dualistic metaphysics studies the drama of the human will as a war between the opposed loves of self and of God.  In her theory of knowledge, the abbess condemns the exercise of doubt as a subtle acquiescence to powerful ecclesiastical and civic authorities who seek to coerce conscience.  In analyzing possible material cooperation with the persecutors of the convent, Arnauld d’Andilly insists on resistance rather than compromise as the path of authentic virtue.

a. Virtue Theory

In Discourses on the Rule of Saint Benedict [DRSB], Mère Angélique de Saint-Jean provides a commentary on the founding rule of Benedictine monasticism.  The commentary develops a theory of virtue which indicates the radical Augustinian moral orientation of the abbess’s moral philosophy.  The traditional monastic virtues assume a distinctive Jansenist coloration in the abbess’s treatment of them.

The virtue of obedience, embodied through the monastic vow of obedience to one’s superior, acquires a new necessity because of the radically disordered nature of human reason.  In this Augustinian account of human concupiscence, fallen reason is no longer capable of self-governance.  “In the original state of creation, there was a perfect relationship between human reason and will.  At the present time, however, this is no longer the case.  Reason has become an instrument in the hands of self-will, which uses it in an improper and destructive way by arming itself with the false appearances of reason to find justice in injustice itself [DRSB, 243].”  The virtue of silence also serves to curb the passions generated by the concupiscent will.  “In maintaining silence we mortify vanity, curiosity, self-love, and all the other poisons that use the tongue to spill outside and to encourage their impetuous, disordered movements [DRSB, 267].”  Similarly, the virtue of humility, the most prized moral virtue in Benedict’s original rule, is tied by the abbess to the controversial Jansenist doctrine of the small number of the elect.  “It is quite certain that only a few will be saved, since one must be saved through humility, which consists in the love of humility and abasement [DRSB, 311].”

The lack of such self-denying moral virtues in the majority of humanity indicates the depth of the depravity of the postlapsarian will.  “There is something perverted in humanity: its will….Humanity is wounded because it turned on itself by acting through its own will [DRSB, 326].”  In its state of weakness, humanity is utterly dependent on God’s grace to heal its concupiscence and to permit it to exercise its will on behalf of the moral good.  “We need God to give us his grace and light.  Without this assistance we move away from the path of salvation rather than toward it.  We are only shadows by ourselves.  We are mistaken about any light we seem to have if it is not God himself who lights our lamps and illumines us [DRSB, 53].”  In this Augustinian perspective, all authentic moral virtue is the result of God’s grace, not of human initiative.  Alleged natural virtue is an illusion of human pride.

In a distinctive recasting of the Augustinian framework of virtue, Mère Angélique de Saint-Jean underscores the militant nature of the moral virtues inspired by grace.  The monastic virtue of humility entails martyrdom as the nun confronts a persecutory world.  “We are obliged to be in the situation of suffering martyrdom….We do not know what God will expose us to, but we do know that as Christians and as nuns we are called to follow Jesus Christ and Jesus Christ crucified, to carry our cross after him and to renounce ourselves.  This cannot be done without suffering [DRSB, 381].”  Rather than providing a refuge from the warfare of a fallen world, the monastic virtues steel the nuns for a spiritual combat demanding the loss of one’s very self.

b. Code of Resistance

The Augustinian theory of virtue grounds Mère Angélique de Saint-Jean’s ethical code of resistance, developed abundantly in Reflections to Prepare the Nuns for Persecution [RPNP].  In this commentary on Mère Agnès Arnauld’s earlier Counsels on the Conduct Which the Nuns Should Maintain In the Event of a Change in the Governance of the Convent, the abbess systematically substitutes exhortations to militant resistance for her aunt’s counsels of prudent moderation.

This militant conception of the moral life appears clearly in Mère Angélique de Saint-Jean’s martial transposition of the theological virtues, the source of all authentic moral virtue.  The virtue of faith is no longer the simple assent of the mind to the truths revealed by God; it is a militant witness to the truth of this revelation through long-suffering combat.  “It is faith that supports us in all our afflictions.  Only on faith can we lean for the hope of our salvation.  It obliges us to believe in the mercy of God and to have recourse to this mercy in all our difficulties [RPNP, 20].”  Interpreted from a neo-Platonic dualistic perspective, this combative faith opposes the intellectual and moral inclination of the senses.  “We are everywhere in our senses.  If we are not careful, we follow their judgment rather than that of faith….Our faith should penetrate all the veils that fall before our eyes [RPNP, 288].”  It combats the passions, which can easily induce the believer to flee her moral duties during persecution.  “Faith lifts us up and makes us the master of our passions, while love for ourselves makes us slaves of an infinite number of masters, under whose domination we lose, if we are not careful, the true freedom of the children of God [RPNP, 168].”  Echoing the fideism of Saint-Cyran, the abbess argues that faith must oppose reason itself, when this all too human reason rationalizes away the persecution that is the price of witness to the truth.  “There is still one thing essential to make our suffering perfect: to arm ourselves against the reasoning of the human mind opposed to the principles of faith, which teaches us to find glory in disdain, riches in poverty, life in death [RPNP, 160].”  In this martial recasting of the theological virtues, Mère Angélique de Saint-Jean condemns fear as the most dangerous of the passions and cowardice as the gravest of the vices.

To endure persecution by the opponents of Port-Royal, Mère Angélique de Saint-Jean constructs a code of resistance to the oppressive authorities which is more rigorous than the supple code proposed by her aunt earlier in the persecution.  Whereas Mère Agnès had argued that nuns should largely follow the directives given by superiors in a foreign convent, Mère Angélique de Saint-Jean counsels strict non-compliance.  Whereas her aunt had recommended limited communications with certain appointed confessors and lecturers, Mère Angélique de Saint-Jean insists on determined refusal.  The abbess stresses in particular the need to refuse dialogue with all the imposed authorities.  Although such dialogue appears innocent, its purpose is to break the convictions of the persecuted nun and coerce her into surrender.  “People who find themselves removed from all occupations can easily become too preoccupied with considering only the faults and imperfections of their past life…They permit themselves to be overwhelmed by this view of things, which beats them down into mistrust and convinces them that they do not have enough proof that God was in them to persevere in that state to which he had called them.  So they wanted to seek counsel and light elsewhere and consulted other persons instead of those persons whom God had removed in order to be replaced by God in all things [RPNP, 116].”  In the psychological warfare imposed by the enemies of Port-Royal, isolation can easily lead to a pervasive remorse, easily exploited by one’s opponents.  The natural desire to seek dialogue in such persecutory solitude must be repressed in the knowledge that such communication will only be used to shake one’s religious convictions and to destroy one’s grace-inspired willingness to bear witness to the truth in the midst of persecution.

To survive persecution and its attendant psychological solitude, Mère Angélique de Saint-Jean develops a spirituality for the oppressed.  The imposed solitude, in which the nuns are deprived of the sacraments and of the celebration of the divine office, should be received as a grace and not only as a punishment.  The isolation imposed on the protesting nuns invites them to a more immediate communion with God, no longer accessed through the mediation of sacrament, ordained priest, and communal prayer.  “We can say that God in his goodness has put us in a place where we must serve him and that he has given us many means to accomplish this which we would not have otherwise encountered.  We must believe that the heavenly fire that descended apparently to steal certain goods will only turn this assistance into something of a more spiritual nature.  This will teach us to belong to God in a more perfect manner through suffering and privation than through peace and abundance [RPNP, 222].”  In the ecclesiastical deprivations provoked by their refusal to assent to falsehoods, the nuns have discovered a communion with God that transcends the limits of sacrament and social intercourse.  The recognition of God as pure Spirit actually intensifies when the only access to God becomes the solitary prayer of the individual persecuted for the sake of justice.

c. Metaphysical Dualism

Tied to the Augustinian account of virtue is a broader Augustinian metaphysical dualism.  The struggle to embrace the good reflects a deeper struggle in humanity between the peccatory will, locked into the self’s vanity, and the redeemed will, freed toward the love of God.  This civil war within humanity reflects a fundamental polarity between the forces of light and darkness that agitate the cosmos itself.  The Conferences on the Constitutions of the Monastery of Port-Royal exhibit this pervasive metaphysical dualism, even in Mère Angélique de Saint-Jean’s commentary on the legal provisions of the convent’s constitution.

Often citing Saint Augustine’s City of God, Mère Angélique de Saint-Jean conceives human nature primarily in terms of the orientation of its will.  The moral agent turns either toward the self in sin or toward God in authentic love.  “We must always arrive at the principle of Saint Augustine: love has built two cities; we are necessarily citizens of one or the other.  The love of God right up to the contempt of ourselves constitutes the City of God and the kingdom of Jesus Christ.  The love of ourselves right up to the contempt of God builds Babylon, which is the kingdom of the demon [CCPR, I: 321].”  In Mère Angélique’s dualistic universe, there is no middle ground between the virtuous and the vicious, the divine and the demonic.  The central volitional act of love turns either toward the creature or toward the Creator in an itinerary of damnation or salvation.

Only grace can free the concupiscent human will from its downward inclination.  Jesus Christ is not only the unsurpassable model of moral righteousness; he is the cause of this righteousness in the will of the disciple through the redemption wrought by the cross.  “Jesus Christ is not only our model; in order to become a source of grace for us, he annihilated himself.  As Saint Paul says, he shed his own blood to purify us from our dead works [CCPR, I: 384].”  It is the cross that frees the moral agent from the losing spiritual combat with vice into which the agent has been conceived.  Grace’s instauration or restoration of the virtuous life within the will and action of the disciple is as radical as grace’s resurrection of the dead.

d. Epistemology and Certitude

The ethics of resistance developed by Mère Angélique de Saint-Jean has its own epistemology.  The abbess repeatedly warns her embattled subjects that the very willingness to engage in doubt concerning one’s contested religious convictions is to prepare a moral surrender to the opponents of the truth concerning grace.  The opuscule On the Danger of Hesitation and Doubt Once We Know Our Duty [DHD] elaborates the abbess’s argument that rather than being a neutral exercise, the entertainment of doubt on one’s central theological beliefs constitutes a moral danger for the subject who engages in it.

When people are persecuted for their beliefs, the natural inclination of the persecuted is to seek the end of duress by negotiating with their opponents.  A compromise on the disputed points is seen as a supreme good, since it would promise the end of persecution.  Mère Angélique de Saint-Jean warns, however, that persecution is the normal state for the Christian.  The fact that one’s witness provokes the violent opposition of the world’s powerful normally indicates that one is on the path of truth rather than that of error.  “The servants of God know that they could never be in a stronger state of assurance than when they must suffer.  When their enemies hold them in a state of captivity, they find themselves in a greater freedom.  They are in less danger than when they are in the greatest of dangers [DHD, 290].”  Rather than encouraging doubt and debilitating self-scrutiny, the taste of persecution should assure the persecuted that their witness, in this case their testimony on behalf of the sovereignty of divine grace, defends a truth which a vain self-sufficient world desires to crush.  The fact of persecution should strengthen rather than weaken the certitude with which the persecuted hold their well-considered beliefs.

Another problem with the exercise of doubt is the network of power in which all acts of doubt and certitude are embedded.  Any dialogue between the Port-Royal Jansenists and their opponents is based on inequality.  The wealth and juridical/military power available to the persecuting members of state and church far outweigh the meager resources of the persecuted nuns.  Furthermore, the political concerns of the opponents of the nuns will dominate a dialogue in which the nuns’ concerns for the faith will be marginalized.  “These types [of negotiations] only open the door to purely human types of reasoning and all too carnal thoughts.  In these negotiations they claim to be willing to examine everything.  In such a case, one would have to be willing to disarm faith itself…We often speak without thinking through our greatest enemies, the senses, which borrow from reason what they need to plead their cause and often clothe themselves with the most beautiful verbal appearances [DHD, 291].”  To engage in doubt in such a rigged dialogue is not to enter into a mutual pursuit of the truth.  It is to surrender to those who will dominate the discussions through their superior power, eloquence, and emotional appeals to the interest of the persecuted in survival and freedom.  The most powerful and seductive arguments, not the most truthful, will determine the course and outcome of the proposed dialogue.  Moreover, the hypothetical willingness to abandon carefully developed convictions regarding grace and salvation borders on the gravely sinful. Fidelity to truth must trump the instinct for personal or corporate survival.  “Our faith is worth more than a convent and our conscience should be preferred to a building that in God’s sight would only be our tomb if we ever clung to it by defiling our conscience [DHD, 294].”

4. Interpretation and Relevance

Beginning with the eighteenth-century editions of her work, Mère Angélique de Saint-Jean has fascinated her commentators by her combative personality and by the high-profile persecution she and her convent endured.  The literary critic Sainte-Beuve and the dramatist Montherlant have continued this emphasis on the personality of the militant abbess and have provided a negative portrait of a sectarian whose stubbornness plunged her community into an isolation which more diplomatic leadership might have avoided.  The problem with this emphasis on the headstrong personality of the abbess lies in its obfuscation of the philosophical and theological positions which the abbess defended in her numerous works.  Drama trumps theory.  The originality of Mère Angélique de Saint-Jean’s philosophy has also been obscured by its assimilation to the generic Augustinianism of the Jansenist movement.  Her disagreements with the Jansenist mainstream, expressed in the stormy correspondence with her uncle Antoine Arnauld, have often been ignored.

The current philosophical retrieval of Mère Angélique de Saint-Jean has stressed the philosophy of resistance to oppression and the radical Augustinian recasting of moral virtue which the abbess develops in her writings.  Her epistemological analysis of the exercise of doubt as an expression of power imbalances between the majority and an ostracized minority constitutes one of the most contemporary traits of her philosophy of the duty to resist a peccatory and persecutory world.

5. References and Further Reading

The translations from French to English above are by the author of this article.

a. Primary Sources

  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Conférences de la Mère Angélique de Saint-Jean sur les Constitutions du monastère de Port-Royal du Saint-Sacrement, ed. Dom Charles Clémencet, 3 vols. (Utrecht: Aux dépens de la Compagnie, 1760).
    • The abbess’s commentary on the constitutions of Port-Royal stresses the rights of the nun and the abbess concerning the governance of the monastery.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Discours de la Révérende Mère Marie Angélique de S. Jean, Abbesse de P.R. des Champs, sur la Règle de S. Benoît (Paris: Osmont et Delespine, 1736).
    • The abbess’s commentary on the Rule of Saint Benedict has a neo-Augustinian stress on the grace essential for any practice of the Benedictine moral virtues.  The actual text of Mère Angélique de Saint-Jean’s commentary must be distinguished from Mère Angélique Arnauld’s earlier commentary on the Rule, which has been interpolated into the printed text.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Discours de la R. Mère Angélique de S. Jean, appellés Miséricordes, ou Recommandations, faites en chapitre, de plusieurs personnes unies à la Maison de Port-Royal des Champs (Utrecht: C. Le Fevre, 1735).
    • This collection of eulogies stresses that divine grace rather than human effort is the ultimate cause of the moral virtues apparent in the lives of righteous nuns and laity associated with Port-Royal.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Lettres de la Mère Angélique de Saint-Jean, ed. Rachel Gillet (P.R. Let 358-61).
    • Extant only in manuscript form at the Bibliothèque de la Société de Port-Royal in Paris, this three-volume collection of letters shows the metaphysical and ethical dualism of the abbess, especially in her letters to Antoine Arnauld.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Réflexions de la Mère Angélique de Saint-Jean Arnauld d’Andilly, Sur le danger qu’il y a d’hésiter et de douter, quand une fois l’on connaît son devoir, in Vies intéressantes et édifiantes des religieuses de Port-Royal et de plusieurs personnes qui y étaient attachées (Utrecht: Aux dépens de la Compagnie, 1750), I: 289-97.
    • This epistemological opuscule analyzes the exercise of doubt in terms of the power imbalance between majority and minority in times of persecution.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Réflexions de la R. Mère Angélique de S. Jean Arnauld, Abbesse de P.R. des Champs, Pour preparer ses soeurs à la persécution, conformément aux Avis que la R. Mère Agnès avait laissés sur cette matière aux religieuses de ce monastère (n.p.: 1737).
    • This address analyzes the virtues and dispositions necessary to resist oppression in the domain of religious conscience.
  • Arnauld d’Andilly, Mère Angélique de Saint-Jean. Relation de la captivité, ed. Louis Cognet (Paris: Gallimard, 1954).
    • This autobiographical narrative relates Soeur Angélique de Saint-Jean’s internment at the Anonciade convent during the crisis of the signature in 1664-65.  The work illustrates the nun’s methods of resistance to what she considered illegitimate authority.  A digital version of this work is available at Gallica: Bibliothèque numérique on the webpage of the Bibliothèque nationale de France.

b. Secondary Sources

  • Carr, Thomas M. Voix des Abbesses au grand siècle; La prédication au féminin à Port-Royal (Tübingen: Narr, 2006).
    • The monograph studies the varied literary genres and the moral pragmatism of the discourses given by Mère Angélique de Saint-Jean during her abbacy.
  • Conley, John J. Adoration and Annihilation: The Convent Philosophy of Port-Royal (Notre Dame, IN: University of Notre Dame Press, 2009): 175-236.
    • This philosophical study of the abbess stresses her Augustinian virtue theory, defense of women’s freedom, and theory and practice of resistance to oppressive authorities.
  • Grébil, Germain. “L’image de Mère Angélique de Saint-Jean au XVIIIe siècle,” Chroniques de Port-Royal 35 (1985): 110-25.
    • The article offers a Cartesian interpretation of the abbess’s treatise on the danger of doubt.
  • Montherlant, Henri de. Port-Royal (Paris: Gallimard, 1954).
    • The dramatic tragedy presents Mère Angélique de Saint-Jean as a haughty but heroic strategist of resistance.
  • Orcibal, Jean. Port-Royal entre le miracle et l’obéissance: Flavie Passart et Angélique de St.-Jean Arnauld d’Andilly (Paris: Desclée de Brouwer, 1957).
    • The monograph studies the complex theological background in the dispute between the signeuse Soeur Flavie and the nonsigneuse Soeur Angélique de Saint-Jean during the crisis of the signature.
  • Sainte-Beuve, Charles-Augustin. Port-Royal, 3 vols., ed. Maxime Leroy (Paris: Gallimard, 1953-55).
    • The nineteenth-century literary critic presents a critical portrait of Mère Angélique de Saint-Jean as a willful, intolerant sectarian.
  • Sibertin-Blanc, Brigitte. “Biographie et personnalité de la seconde Angélique,” Chroniques de Port-Royal 35 (1985): 74-82.
    • This biographical sketch justifiably expresses skepticism about the abbess’s claim of ignorance concerning the philosophical and theological disputes behind the controversy over Jansen’s Augustinus.
  • Weaver, F. Ellen. “Angélique de Saint-Jean: Abbesse et ‘mythographe’ de Port-Royal,” Chroniques de Port-Royal 35 (1985): 93-108.
    • The historian of Port-Royal demonstrates the apologetic nature and ends of the numerous memoirs written and commissioned by Mère Angélique de Saint-Jean.

Author Information

John J. Conley
Email: jconley1@loyola.edu
Loyola University
U. S. A.

Martin Buber (1878—1965)

Martin Buber was a prominent twentieth century philosopher, religious thinker, political activist and educator. Born in Austria, he spent most of his life in Germany and Israel, writing in German and Hebrew. He is best known for his 1923 book, Ich und Du (I and Thou), which distinguishes between “I-Thou” and “I-It” modes of existence. Often characterized as an existentialist philosopher, Buber rejected the label, contrasting his emphasis on the whole person and “dialogic” intersubjectivity with existentialist emphasis on “monologic” self-consciousness. In his later essays, he defines man as the being who faces an “other” and constructs a world from the dual acts of distancing and relating. His writing challenges Kant, Hegel, Marx, Kierkegaard, Nietzsche, Dilthey, Simmel and Heidegger, and he influenced Emmanuel Lévinas.

Buber was also an important cultural Zionist who promoted Jewish cultural renewal through his study of Hasidic Judaism. He recorded and translated Hasidic legends and anecdotes, translated the Bible from Hebrew into German in collaboration with Franz Rosenzweig, and wrote numerous religious and Biblical studies. He advocated a bi-national Israeli-Palestinian state and argued for the renewal of society through decentralized, communitarian socialism. The leading Jewish adult education specialist in Germany in the 1930s, he developed a philosophy of education based on addressing the whole person through education of character, and directed the creation of Jewish education centers in Germany and teacher-training centers in Israel.

Most current scholarly work on Buber is done in the areas of pedagogy, psychology and applied social ethics.

Table of Contents

  1. Biography
  2. Philosophical Anthropology
    1. Introduction
    2. “I-Thou” and “I-It”
    3. Distance and Relation
    4. Confirmation and Inclusion
    5. Good and Evil
    6. Hindrances to Dialogue
  3. Religious Writings
    1. Hasidic Judaism
    2. Biblical Studies
  4. Political Philosophy
  5. Philosophy of Education
  6. References and Further Reading
    1. General
    2. Mythology
    3. Philosophical Works
    4. Political and Cultural Writing
    5. Religious Studies
    6. Secondary Sources

1. Biography

Mordecai Martin Buber was born in Vienna on February 8, 1878. When he was three, his mother deserted him, and his paternal grandparents raised him in Lemberg (now Lviv) until the age of fourteen, after which he moved to his father’s estate in Bukovina. Buber would only see his mother once more, when he was in his early thirties. This encounter he described as a “mismeeting” that helped teach him the meaning of genuine meeting. His grandfather, Solomon, was a community leader and scholar who edited the first critical edition of the Midrashim, the traditional biblical commentaries. Solomon’s estate helped support Buber until it was confiscated during World War II.

Buber was educated in a multi-lingual setting and spoke German, Hebrew, Yiddish, Polish, English, French and Italian, with a reading knowledge of Spanish, Latin, Greek and Dutch. At the age of fourteen he began to be tormented by the problem of imagining and conceptualizing the infinity of time. Reading Kant’s Prolegomena to Any Future Metaphysics helped relieve this anxiety. Shortly afterwards he became taken with Nietzsche’s Thus Spoke Zarathustra, which he began to translate into Polish. However, this infatuation with Nietzsche was short-lived, and later in life Buber stated that Kant gave him philosophic freedom, whereas Nietzsche deprived him of it.

Buber spent his first year of university studies at Vienna. Ultimately the theatre culture of Vienna and the give-and-take of the seminar format impressed him more than any of his particular professors. The winters of 1897-98 and 1898-99 were spent at the University of Leipzig, where he took courses in philosophy and art history and participated in the psychiatric clinics of Wilhelm Wundt and Paul Flechsig (see Schmidt’s Martin Buber’s Formative Years: From German Culture to Jewish Renewal, 1897-1909 for an analysis of Buber’s life during university studies and a list of courses taken). He considered becoming a psychiatrist, but was upset at the poor treatment and conditions of the patients.

In the summer of 1899 he went to the University of Zürich, where he met his future wife, Paula Winkler (1877-1958, pen name Georg Munk). Paula formally converted from Catholicism to Judaism. They had two children, Rafael (1900-90) and Eva (1901-92).

From 1899-1901 Buber attended the University of Berlin, where he took several courses with Wilhelm Dilthey and Georg Simmel. He later explained that his philosophy of dialogue was a conscious reaction against their notion of inner experience (Erlebnis) (see Mendes-Flohr’s From Mysticism to Dialogue: Martin Buber’s Transformation of German Social Thought for an analysis of the influence of Dilthey and Simmel). During this time Buber gave lectures on the seventeenth-century Lutheran mystic Jakob Böhme, publishing an article on him in 1901 and writing his 1904 dissertation for the University of Vienna, “On the History of the Problem of Individuation: Nicholas of Cusa and Jakob Böhme.” After this he lived in Florence from 1905-06, working on a habilitation thesis in art history that he never completed.

In 1904 Buber came across Tzevaat Ha-RIBASH (The Testament of Rabbi Israel, the Baal-Shem Tov), a collection of sayings by the founder of Hasidism. Buber began to record Yiddish Hasidic legends in German, publishing The Tales of Rabbi Nachman, on the Rabbi of Breslov, in 1906, and The Legend of the Baal-Shem in 1907. The Legend of the Baal-Shem sold very well and influenced the writers Rainer Maria Rilke, Franz Kafka and Hermann Hesse. Buber was a habitual re-writer and editor of all of his writings, which went through many editions even in his lifetime, and many of these legends were later rewritten and included in his later two-volume Tales of the Hasidim (1947).

At the same time Buber emerged as a leader in the Zionist movement. Initially under the influence of Theodor Herzl, Buber joined the Democratic Faction of the Zionist Party, but dramatically broke away from Herzl after the 1901 Fifth Zionist Congress, when the organization refused to fund the faction’s cultural projects. In contrast to Herzl’s territorial Zionism, Buber’s Zionism, like that of Ahad Ha’am, was based on cultural renewal. Buber put together the first all-Jewish art exhibition in 1901, and in 1902 co-founded Jüdischer Verlag, a publishing house that produced collections of Jewish poetry and art, with poet Berthold Feiwel, graphic artist Ephraim Mosche Lilien and writer Davis Trietsch. This dedication to the arts continued through the 1910s and 20s, as Buber published essays on theatre and helped to develop both the Hellerau Experimental Theatre and the Düsseldorf Playhouse (see Biemann and Urban’s works for Buber’s notion of Jewish Renaissance and Braiterman for Buber’s relation to contemporaneous artistic movements).

Buber was the editor of the weekly Zionist paper Die Welt in 1901 and of Die Gesellschaft, a collection of forty sociopsychological monographs, from 1905-12 (on Die Gesellschaft see Mendes-Flohr’s From Mysticism to Dialogue: Martin Buber’s Transformation of German Social Thought). His influence as a Jewish leader grew with a series of lectures given between 1909-19 in Prague for the Zionist student group Bar Kochba, later published as “Speeches on Judaism,” and was established by his editorship of the influential monthly journal Der Jude from 1916-24. He also founded, and from 1926-29 co-edited, Die Kreatur with theologian Joseph Wittig and physician Viktor von Weizsäcker. In keeping with Buber’s lifelong effort to construct dialogue across borders, Die Kreatur was the first high-level periodical to be co-edited by members of the Jewish, Protestant and Catholic faiths. Buber continued inter-religious dialogue throughout his life, corresponding for instance with Protestant theologians Paul Tillich and Reinhold Niebuhr.

Despite his prolific publishing endeavors, Buber struggled to complete I and Thou. First drafted in 1916 and then revised in 1919, it was not until he went through a self-styled three-year spiritual ascesis in which he only read Hasidic material and Descartes’ Discourse on Method that he was able to finally publish this groundbreaking work in 1923. After I and Thou, Buber is best known for his translation of the Hebrew Bible into German. This monumental work began in 1925 in collaboration with Franz Rosenzweig, but was not completed until 1961, more than 30 years after Rosenzweig’s death.

In 1923 Buber was appointed the first lecturer in “Jewish Religious Philosophy and Ethics” at the University of Frankfurt. He resigned after Hitler came to power in 1933 and was banned from teaching in 1935, but continued to conduct Jewish-Christian dialogues and organize Jewish education until he left for British Palestine in 1938. Initially Buber had planned to teach half a year in Palestine at Hebrew University, an institution he had helped to conceive and found, and half a year in Germany. But Kristallnacht, the devastation of his library in Heppenheim, and charges of Reichsfluchtsteuer (Tax on Flight from the Reich), brought because he had not obtained a legal emigration permit, forced his relocation.

Buber engaged in “spiritual resistance” against Nazism through communal education, seeking to give a positive basis for Jewish identity by organizing the teaching of Hebrew, the Bible and the Talmud. He reopened an influential and prestigious Frankfurt center for Jewish studies, the Freies jüdisches Lehrhaus (Free Jewish House of Learning), in 1933 and directed it until his emigration. In 1934 he created and directed the “Central Office for Jewish Adult Education” for the Reichsvertretung der deutschen Juden (National Representation of German Jews).

After giving well-attended talks at the Berlin College of Jewish Education and the Berlin Philharmonie, Buber, who as one of the leading Jewish public figures in Germany was branded the “arch-Jew” by the Nazis, was banned from speaking in public or at closed sessions of Jewish organizations. Despite extreme political pressure, he continued to give lectures and published several essays, including “The Question to the Single One” in 1936, which uses an analysis of Kierkegaard to attack the foundations of totalitarianism (see Between Man and Man).

After his emigration Buber became Chair of the Department of Sociology at Hebrew University, a position he held until his retirement in 1951. Continuing the educational work he had begun in Germany, Buber established Beth Midrash l’Morei Am (School for the Education of Teachers of the People) in 1949 and directed it until 1953. This school prepared teachers to live and work in the hostels and settlements of newly arriving immigrants. Education was based on the notion of dialogue, with small classes, mutual questioning and answering, and psychological help for those coming from detention camps.

From the beginning of his Zionist activities Buber advocated Jewish-Arab unity in ending British rule of Palestine and a binational state. In 1925 he helped found Brit Shalom (Covenant of Peace), and in 1939 he helped form the League for Jewish-Arab Rapprochement and Cooperation, which consolidated all of the bi-national groups. In 1942, the League created a political platform that was used as the basis for the political party the Ichud (or Ihud, that is, Union). In recognition of his work for Jewish-Arab parity, Dag Hammarskjöld, then Secretary-General of the United Nations, nominated him for the Nobel Peace Prize in 1959.

In addition to his educational and political activities, the 1940s and 50s saw an outburst of more than a dozen books on philosophy, politics and religion, and numerous public talks throughout America and Europe. Buber received many honors, including the Goethe Prize of the University of Hamburg (1951), the Peace Prize of the German Book Trade (1953), election as the first Israeli honorary member of the American Academy of Arts and Sciences (1961), and the Erasmus Prize (1963). However, Buber’s most cherished honor was an informal student celebration of his 85th birthday, in which some 400 students from Hebrew University rallied outside his house and made him an honorary member of their student union.

On June 13, 1965 Martin Buber died. The leading Jewish political figures of the time attended his funeral. Classes were cancelled and hundreds of students lined up to say goodbye as Buber was buried in the Har-Hamenuchot cemetery in Jerusalem.

2. Philosophical Anthropology

a. Introduction

Martin Buber’s major philosophic works in English are the widely read I and Thou (1923), a collection of essays from the 1920s and 30s published as Between Man and Man, a collection of essays from the 1950s published as The Knowledge of Man: Selected Essays, and Good and Evil: Two Interpretations (1952). For many thinkers Buber is the philosopher of I and Thou, and he himself often suggested one begin with that text. However, his later essays articulate a complex and worthy philosophical anthropology.

Buber called himself a “philosophical anthropologist” in his 1938 inaugural lectures as Professor of Social Philosophy at the Hebrew University of Jerusalem, entitled “What is Man?” (in Between Man and Man). He states that he is explicitly responding to Kant’s question “What is man?” and acknowledges in his biographic writings that he has never fully shaken off Kant’s influence. But while Buber finds certain similarities between his thought and Kant’s, particularly in ethics, he explains in “Elements of the Interhuman” (in The Knowledge of Man, 1957) that their origin and goal differ. The origin for Buber is always lived experience, which means something personal, affective, corporeal and unique, and embedded in a world, in history and in sociality. The goal is to study the wholeness of man, especially that which has been overlooked or remains hidden. As an anthropologist he wants to observe and investigate human life and experience as it is lived, beginning with one’s own particular experience; as a philosophic anthropologist he wants to make understood these particular experiences that elude the universality of language. Any comprehensive overview of Buber’s philosophy is hampered by his disdain for systemization. Buber stated that ideologization was the worst thing that could happen to his philosophy and never argued for the objectivity of his concepts. Knowing only the reality of his own experience, he appealed to others who had analogous experiences.

Buber begins these lectures by asserting that man only becomes a problem to himself and asks “What is man?” in periods of social and cosmic homelessness. Targeting Kant and Hegel, he argues that while this questioning begins in solitude, in order for man to find who he is, he must overcome solitude and the whole way of conceiving of knowledge and reality that is based on solitude. Buber accuses Hegel of denigrating the concrete human person and community in favor of universal reason and argues that man will never be at home or overcome his solitude in the universe that Hegel postulates. With his emphasis on history, Hegel locates perfection in time rather than in space. This type of future-oriented perfection, Buber argues, can be thought, but it cannot be imagined, felt or lived. Our relationship to this type of perfection can only rest on faith in a guarantor for the future.

Instead, Buber locates realization in relations between creatures. Overcoming our solitude, which tends to oscillate between conceiving of the self as absorbed in the all (collectivism) and the all as absorbed into the self (solipsistic mysticism), we realize that we always exist in the presence of other selves, and that the self is a part of reality only insofar as it is relational. In contrast to the traditional philosophic answers to “What is man?” that fixate on reason, self-consciousness or free will, Buber argues that man is the being who faces an “other”, and a human home is built from relations of mutual confirmation. 

b. “I-Thou” and “I-It”

Martin Buber’s most influential philosophic work, I and Thou (1923), is based on a distinction between two word-pairs that designate two basic modes of existence: “I-Thou” (Ich-Du) and “I-It” (Ich-Es). The “I-Thou” relation is the pure encounter of one whole unique entity with another in such a way that the other is known without being subsumed under a universal. Not yet subject to classification or limitation, the “Thou” is not reducible to spatial or temporal characteristics. In contrast to this, the “I-It” relation is driven by categories of “same” and “different” and focuses on universal definition. An “I-It” relation experiences a detached thing, fixed in space and time, while an “I-Thou” relation participates in the dynamic, living process of an “other”.

Buber characterizes “I-Thou” relations as “dialogical” and “I-It” relations as “monological.” In his 1929 essay “Dialogue,” Buber explains that monologue is not just a turning away from the other but also a turning back on oneself (Rückbiegung). To perceive the other as an It is to take them as a classified and hence predictable and manipulable object that exists only as a part of one’s own experiences. In contrast, in an “I-Thou” relation both participants exist as polarities of relation, whose center lies in the between (Zwischen).

The “I” of man differs in both modes of existence. The “I” may be taken as the sum of its inherent attributes and acts, or it may be taken as a unitary, whole, irreducible being. The “I” of the “I-It” relation is a self-enclosed, solitary individual (der Einzige) that takes itself as the subject of experience. The “I” of the “I-Thou” relation is a whole, focused, single person (der Einzelne) that knows itself as subject. In later writings Buber clarified that inner life is not exhausted by these two modes of being. However, when man presents himself to the world he takes up one of them.

While each of us is born an individual, Buber draws on the Aristotelian notion of entelechy, or innate self-realization, to argue that the development of this individuality, or sheer difference, into a whole personality, or fulfilled difference, is an ongoing achievement that must be constantly maintained. In I and Thou, Buber explains that the self becomes either more fragmentary or more unified through its relationships to others. This emphasis on intersubjectivity is the main difference between I and Thou and Buber’s earlier Daniel: Dialogues on Realization (1913). Like I and Thou, Daniel distinguishes between two modes of existence: orienting (Orientierung), which is a scientific grasp of the world that links experiences, and realization (Verwirklichung), which is immersion in experience that leads to a state of wholeness. While these foreshadow the “I-It” and “I-Thou” modes, neither expresses a relationship to a real “other”. In I and Thou man becomes whole not in relation to himself but only through a relation to another self. The formation of the “I” of the “I-Thou” relation takes place in a dialogical relationship in which each partner is both active and passive and each is affirmed as a whole being. Only in this relationship is the other truly an “other”, and only in this encounter can the “I” develop as a whole being.

Buber identifies three spheres of dialogue, or “I-Thou” relations, which correspond to three types of otherness. We exchange in language, broadly conceived, with man, transmit below language with nature, and receive above language with spirit. Socrates is offered as the paradigmatic figure of dialogue with man, Goethe, of dialogue with nature, and Jesus, of dialogue with spirit. That we enter into dialogue with man is easily seen; that we also enter into dialogue with nature and spirit is less obvious and the most controversial and misunderstood aspect of I and Thou. However, if we focus on the “I-Thou” relationship as a meeting of singularities, we can see that if we truly enter into relation with a tree or cat, for instance, we apprehend it not as a thing with certain attributes, presenting itself as a concept to be dissected, but as a singular being, one whole confronting another.

Dialogue with spirit is the most difficult to explicate because Buber uses several different images for it. At times he describes dialogue with spirit as dialogue with the “eternal Thou,” which he sometimes calls God and which is eternally “other”. Because of this, I and Thou was widely embraced by Protestant theologians, who also held the notion that no intermediary was necessary for religious knowledge. Buber also argues that the precondition for a dialogic community is that each member be in a perpetual relation to a common center, or “eternal Thou”. Here the “eternal Thou” represents the presence of relationality as an eternal value. At other times, Buber describes dialogue with spirit as the encounter with form that occurs in moments of artistic inspiration or the encounter with personality that occurs in intensive engagement with another thinker’s works. Spiritual address is that which calls us to transcend our present state of being through creative action. The eternal form can either be an image of the self one feels called to become or some object or deed that one feels called to bring into the world.

Besides worries over Buber’s description of man’s dialogue with nature and spirit, three other main complaints have been raised against I and Thou. The first, mentioned by Walter Kaufmann in the introduction to his translation of I and Thou, is that the language is overly obscure and romantic, so that there is a risk that the reader will be aesthetically swept along into thinking the text is more profound than it actually is. Buber acknowledges that the text was written in a state of inspiration. For this reason it is especially important to also read his later essays, which are more clearly written and rigorously argued. E. la B. Cherbonnier notes in “Interrogation of Martin Buber” that every objective criticism of Buber’s philosophy would belong, by definition, to the realm of “I-It”. Given the incommensurability of the two modes, this means no objective criticism of the “I-Thou” mode is possible. In his response Buber explains that he is concerned to avoid internal contradiction and welcomes criticism. However, he acknowledges that his intention was not to create an objective philosophic system but to communicate an experience.

Finally, I and Thou is often criticized for denigrating philosophic and scientific knowledge by elevating “I-Thou” encounters above “I-It” encounters. It is important to note that Buber by no means renounces the usefulness and necessity of “I-It” modes. His point is rather to investigate what it is to be a person and what modes of activity further the development of the person. Though one is only truly human to the extent one is capable of “I-Thou” relationships, the “It” world allows us to classify, function and navigate. It gives us all scientific knowledge and is indispensable for life. There is a graduated structure of “I-It” relations as they approximate an “I-Thou” relationship, but the “I-Thou” remains contrasted to even the highest stage of an “I-It” relation, which still contains some objectification. However, each “Thou” must sometimes turn into an “It”, for in responding to an “other” we bind it to representation. Even the “eternal Thou” is turned into an It for us when religion, ethics and art become fixed and mechanical. However, an “I-It” relation can be constituted in such a way as to leave open the possibility of further “I-Thou” encounters, or so as to close off that possibility.

c. Distance and Relation

In I and Thou Martin Buber discusses the a priori basis of the relation, presenting the “I-Thou” encounter as the more primordial one, both in the life of humans, as when an infant reaches for its mother, and in the life of a culture, as seen in relationships in primitive cultures. However, in the 1951 essay “Distance and Relation,” written in the midst of the Palestinian conflicts, he explains that while this may be true from an anthropological perspective, from an ontological one it must be said that distance (Urdistanz) is the precondition for the emergence of relation (Beziehung), whether “I-Thou” or “I-It”. Primal distance sets up the possibility of these two basic word pairs, and the between (Zwischen) emerges out of them. Humans find themselves primally distanced and differentiated; it is our choice to then thin or thicken the distance by entering into an “I-Thou” relation with an “other” or withdrawing into an “I-It” mode of existence.

Only man truly distances, Buber argues, and hence only man has a “world.” Man is the being through whose existence what “is” becomes recognized for itself. Animals respond to the other only as embedded within their own experience, but even when faced with an enemy, man is capable of seeing his enemy as a being with similar emotions and motivations. Even if these are unknown, we are able to recognize that these unknown qualities of the other are “real” while our fantasies about the other are not. Setting at a distance is hence not the consequence of a reflective, “It” attitude, but the precondition for all human encounters with the world, including reflection.

Buber argues that every stage of the spirit, however primal, wishes to form and express itself. Form assumes communication with an interlocutor who will recognize and share in the form one has made. Distance and relation mutually correspond because in order for the world to be grasped as a whole by a person, it must be distanced and independent from him and yet also include him, and his attitude, perception, and relation to it. Consequently, one cannot truly have a world unless one receives confirmation of one’s own substantial and independent identity in one’s relations with others.

Relation presupposes distance, but distance can occur without genuine relation. Buber explains that distance is the universal situation of our existence; relation is personal becoming in the situation. Relation presupposes a genuine other and only man sees the other as other. This other withstands and confirms the self and hence meets our primal instinct for relation. Just as we have the instinct to name, differentiate, and make independent a lasting and substantial world, we also have the instinct to relate to what we have made independent. Only man truly relates, and when we move away from relation we give up our specifically human status.

d. Confirmation and Inclusion

Confirmation is a central theme of Martin Buber’s philosophic texts as well as his articles on education and politics. Buber argues that, while animals sometimes turn to humans in a declaring or announcing mode, they do not need to be told that they are what they are and do not see whom they address as an existence independent of their own experience. But because man experiences himself as indeterminate, his actualization of one possibility over another needs confirmation. In confirmation one meets, chooses and recognizes the other as a subject with the capacity to actualize one’s own potential. In order for confirmation to be complete one must know that he is being made present to the other.

As becomes clear in his articles on education, confirmation is not the same as acceptance or unconditional affirmation of everything the other says or does. Since we are not born completely focused and differentiated and must struggle to achieve a unified personality, sometimes we have to help an “other” to actualize themselves against their own immediate inclination. In these cases confirmation denotes a grasp of the latent unity of the other and confirmation of what the other can become. Nor does confirmation imply that a dialogic or “I-Thou” relation must always be fully mutual. Helping relations, such as educating or healing, are necessarily asymmetrical.

In the course of his writing Buber uses various terms, such as “embrace” or “inclusion” (Umfassung), “imagining the real” (Realphantasie), and in reference to Kant, “synthesizing apperception,” to describe the grasp of the other that is necessary for confirmation and that occurs in an “I-Thou” relation. “Imagining the real” is a capacity; “making present” is an event, the highest expression of this capacity in a genuine meeting of two persons. This form of knowledge is not the subsumption of the particularity of the other under a universal category. When one embraces the pain of another, this is not a sense of what pain is in general, but knowledge of this specific pain of this specific person. Nor is this identification with them, since the pain always remains their own specific pain. Buber differentiates inclusion from empathy. In empathy one’s own concrete personality and situation is lost in aesthetic absorption in the other. In contrast, through inclusion, one person lives through a common event from the standpoint of another person, without giving up their own point of view.

e. Good and Evil

Martin Buber’s 1952 Good and Evil: Two Interpretations answers the question “What is man?” in a slightly different way than the essays in Between Man and Man and The Knowledge of Man. Rather than focusing on relation, Good and Evil: Two Interpretations emphasizes man’s experience of possibility and struggle to become actualized. Framing his discussion around an analysis of psalms and Zoroastrian and Biblical myths, Buber interprets the language of sin, judgment and atonement in purely existential terms that are influenced by Hasidic Judaism, Kant’s analysis of caprice (Willkür) and focused will (Wille), and Kierkegaard’s discussion of anxiety. Buber argues that good and evil are not two poles of the same continuum, but rather direction (Richtung) and absence of direction, or vortex (Wirbel). Evil is a formless, chaotic swirling of potentiality; in the life of man it is experienced as endless possibility pulling in all directions. Good is that which forms and determines this possibility, limiting it to a particular direction. We manifest the good to the extent we become a singular being with a singular direction.

Buber explains that imagination is the source of both good and evil. The “evil urge” in the imagination generates endless possibilities. This is fundamental and necessary, and only becomes “evil” when it is completely separated from direction. Man’s task is not to eradicate the evil urge, but to reunite it with the good, and become a whole being. The first stage of evil is “sin,” occasional directionlessness. Endless possibility can be overwhelming, leading man to grasp at anything, distracting and busying himself, in order to not have to make a real, committed choice. The second stage of evil is “wickedness,” when caprice is embraced as a deformed substitute for genuine will and becomes characteristic. If occasional caprice is sin, and embraced caprice is wickedness, creative power in conjunction with will is wholeness. The “good urge” in the imagination limits possibility by saying no to manifold possibility and directing passion in order to decisively realize potentiality. In so doing it redeems evil by transforming it from anxious possibility into creativity. Because of the temptation of possibility, one is not whole or good once and for all. Rather, this is an achievement that must be constantly accomplished.

Buber interprets the claim that in the end the good are rewarded and the bad punished as the experience the bad have of their own fragmentation, insubstantiality and “non-existence.” Arguing that evil can never be done with the whole being, but only out of inner contradiction, Buber states that the lie or divided spirit is the specific evil that man has introduced into nature. Here “lie” denotes a self that evades itself, as manifested not just in a gap between will and action, but more fundamentally, between will and will. Similarly, “truth” is not possessed but is rather lived in the person who affirms his or her particular self by choosing direction. This process, Buber argues, is guided by the presentiment implanted in each of us of who we are meant to become.

f. Hindrances to Dialogue

Along with the evasion of responsibility and refusal to direct one’s possibilities described in Good and Evil: Two Interpretations (1952), Buber argues in “Elements of the Interhuman” (1957, in The Knowledge of Man) that the main obstacle to dialogue is the duality of “being” (Sein) and “seeming” (Schein). Seeming is the essential cowardice of man, the lying that frequently occurs in self-presentation when one seeks to communicate an image and make a certain impression. The fullest manifestation of this is found in the propagandist, who tries to impose his own reality upon others. Corresponding to this is the rise of “existential mistrust” described in Buber’s 1952 address at Carnegie Hall, “Hope for this Hour” (in Pointing the Way). Mistrust takes it for granted that the other dissembles, so that rather than genuine meeting, conversation becomes a game of unmasking and uncovering unconscious motives. Buber criticizes Marx, Nietzsche and Freud for meeting the other with suspicion and perceiving the truth of the other as mere ideology. Similarly, in his acceptance speech for the 1953 Peace Prize of the German Book Trade, “Genuine Dialogue and the Possibilities of Peace” (in Pointing the Way), Buber argues the precondition for peace is dialogue, which in turn rests on trust. In mistrust one presupposes that the other is likewise filled with mistrust, leading to a dangerous reserve and lack of candor.

As it is a key component of his philosophic anthropology that one becomes a unified self through relations with others, Buber was also quite critical of the psychiatrist Carl Jung and the philosophers of existence. He argued that subsuming reality under psychological categories cuts man off from relations and does not treat the whole person, and he especially objected to Jung’s reduction of psychic phenomena to categories of the private unconscious. Despite his criticisms of Freud and Jung, Buber was intensely interested in psychiatry and gave a series of lectures at the Washington School of Psychiatry at the request of Leslie H. Farber (1957, in The Knowledge of Man) and engaged in a public dialogue with Carl Rogers at the University of Michigan (see Anderson and Cissna’s The Martin Buber-Carl Rogers Dialogue: A New Transcript With Commentary). In these lectures, as well as his 1951 introduction to Hans Trüb’s Heilung aus der Begegnung (in English as “Healing Through Meeting” in Pointing the Way), Buber criticizes the tendency of psychology to “resolve” guilt without addressing the damaged relations at the root of the feeling. In addition to Farber, Rogers and Trüb, Buber’s dialogical approach to healing influenced a number of psychologists and psychoanalysts, including Viktor von Weizsäcker, Ludwig Binswanger and Arie Sborowitz.

Often labeled an existentialist, Buber rejected the association. He asserted that while his philosophy of dialogue presupposes existence, he knew of no philosophy of existence that truly overcomes solitude and lets in otherness far enough. Sartre in particular makes self-consciousness his starting point. But in an “I-Thou” relation one does not have a split self, a moment of both experience and self-reflection. Indeed, self-consciousness is one of the main barriers to spontaneous meeting. Buber explains the inability to grasp otherness as perceptual inadequacy that is fostered as a defensive mechanism in an attempt to not be held responsible to what is addressing one. Only when the other is accorded reality are we held accountable to him; only when we accord ourselves a genuine existence are we held accountable to ourselves. Both are necessary for dialogue, and both require courageous confirmation of oneself and the other.

In Buber’s examples of non-dialogue, the twin modes of distance and relation lose balance and connectivity, and one pole overshadows the other, collapsing the distinction between them. For example, mysticism (absorption in the all) turns into narcissism (a retreat into myself), and collectivism (absorption in the crowd) turns into lack of engagement with individuals (a retreat into individualism). Buber identifies this same error in Emmanuel Lévinas’ philosophy. While Lévinas acknowledged Buber as one of his main influences, the two had a series of exchanges, documented in Levinas & Buber: Dialogue and Difference, in which Buber argued that Lévinas had misunderstood and misapplied his philosophy. In Buber’s notion of subject formation, the self is always related to and responding to an “other”. But when Lévinas embraces otherness, he renders the other transcendent, so that the self always struggles to reach out to and adequately respond to an infinite other. This throws the self back into the attitude of solitude that Buber sought to escape.

3. Religious Writings

a. Hasidic Judaism

In his 1952 book Eclipse of God, Martin Buber explains that philosophy usually begins with a wrong set of premises: that an isolated, inquiring mind experiences a separate, exterior world, and that the absolute is found in universals. He prefers the religious, which in contrast, is founded on relation, and means the covenant of the absolute with the particular. Religion addresses whole being, while philosophy, like science, fragments being. This emphasis on relation, particularity and wholeness is found even in Buber’s earliest writings, such as his 1904 dissertation on the panentheistic German mystics Nicholas of Cusa and Jakob Böhme, “On the History of the Problem of Individuation: Nicholas of Cusa and Jakob Böhme.” Nicholas of Cusa postulates that God is a “coincidence of opposites” and that He “contracts” himself into each creature, so that each creature best approximates God by actualizing its own unique identity. Böhme similarly presents God as both transcendent and immanent, and elaborates that perfection of individuality is developed through mutual interaction.

The same elements that attracted Buber to Nicholas of Cusa and Böhme he found fulfilled in Hasidism, producing collections of Hasidic legends and anecdotes (Tales of Rabbi Nachman, The Legend of the Baal-Shem and Tales of the Hasidim) as well as several commentaries (including On Judaism, The Origin and Meaning of Hasidism and The Way of Man: According to the Teaching of Hasidism). The Hebrew tsimtsum expresses God’s “contraction” into the manifold world so that relation can emerge. In distinction from the one, unlimited source, this manifold is limited, but has the choice and responsibility to effect the unification (yihud) of creation. The restoration of unity is described as “the freeing of the sparks,” understood as the freeing of the divine element from difference through the hallowing of the everyday.

In addition to defining Hasidism by its quest for unity, Buber contrasts the Hasidic insistence on the ongoing redemption of the world with the Christian belief that redemption has already occurred through Jesus Christ. Each person is charged with the task of redeeming himself or herself and the section of creation he or she occupies. Redemption takes place in the relation between man and creator, and is neither solely dependent on God’s grace nor on man’s will. No original sin can prohibit man from being able to turn to God. However, Buber is not an unqualified voluntarist. As in his political essays, he describes himself as a realistic meliorist. One cannot simply will redemption. Rather, each person’s will does what it can with the particular concrete situation that faces it.

The Hebrew notions of kavana, or concentrated inner intention, and teshuva, or (re)turning to God with one’s whole being, express the conviction that no person or action is so sinful that it cannot be made holy and dedicated to God. Man hallows creation by being himself and working in his own sphere. There is no need to be other, or to reach beyond the human. Rather, one’s ordinary life activities are to be done in such a way that they are sanctified and lead to the unification of the self and creation. The legends and anecdotes of the historic zaddikim (Hasidic spiritual and community leaders) that Buber recorded depict persons who exemplify the hallowing of the everyday through the dedication of the whole person.

If hallowing is successful, the everyday is the religious, and there is no split between the political, social or religious spheres. Consequently, Buber rejects the notion that God is to be found through mystical ecstasy in which one loses one’s sense of self and is lifted out of everyday experience. Some commentators, such as Paul Mendes-Flohr and Maurice Friedman, view this as a turn away from his earlier preoccupation with mysticism in texts such as Ecstatic Confessions (1909) and Daniel: Dialogues on Realization (1913). In later writings, such as “The Question to the Single One” (1936, in Between Man and Man) and “What is Common to All” (1958, in The Knowledge of Man), Buber argues that special states of unity are experiences of self-unity, not identification with God, and that many forms of mysticism express a flight into a private sphere of illusion, away from the task of dealing with the realities of a concrete situation and working with others to build a common world. Buber is especially critical of Kierkegaard’s assertion that the religious transcends the ethical. Drawing on Hasidic thought, he argues that creation is not an obstacle on the way to God, but the way itself.

Buber did not strictly follow Judaism’s religious laws. Worried that an “internal slavery” to religious law stunts spiritual growth, he did not believe that revelation could ever be law-giving in itself, but that revelation becomes legislation through the self-contradiction of man. Principles require acting in a prescribed way, but the uniqueness of each situation and encounter requires each to be approached anew. He could not blindly accept laws but felt compelled to ask continually if a particular law was addressing him in his particular situation. While this stance rejects the universality of particular laws, it expresses a meta-principle of dialogical readiness.

Buber’s interpretation of Hasidism is not without its critics. Gershom Scholem in particular accused Buber of selecting elements of Hasidism to confirm his “existentialist” philosophy. Scholem argued that the emphasis on particulars and the concrete that Buber so admired does not exist in Hasidism and that Buber’s erroneous impressions derive from his attention to oral material and personalities at the expense of theoretical texts. In general Buber had little historical or scholarly interest in Hasidism. He took Hasidism to be less a historical movement than a paradigmatic mode of communal renewal and was engaged by the dynamic meaning of the anecdotes and the actions they pointed to. In a 1943 conversation with Scholem, Buber stated that if Scholem’s interpretation of Hasidism were accurate, then he would have labored for forty years over Hasidic sources in vain, for they would no longer interest him.

b. Biblical Studies

In addition to his work with Hasidism, Martin Buber also translated the Bible from Hebrew into German with Franz Rosenzweig, and produced several religious analyses, including Kingship of God, Moses: The Revelation and the Covenant, On the Bible: Eighteen Studies, The Prophetic Faith and Two Types of Faith. Counter to religious thinkers such as Karl Barth and Emmanuel Lévinas, Buber argues that God is not simply a wholly transcendent other, but also wholly same, closer to each person than his or her own self. However, God can be known only in his relation to man, not apart from it. Buber interprets religious texts, and the Bible in particular, as the history of God’s relation to man from the perspective of man. Thus, it is not accurate to say that God changes throughout the texts, but that the theophany, the human experience of God, changes. Consequently, Buber characterizes his approach as tradition criticism, which emphasizes experiential truth and uncovers historical themes, in contrast to source criticism, which seeks to verify the accuracy of texts.

When translating the Bible, Buber’s goal was to make the German version as close to the original oral Hebrew as possible. Rather than smoothing over difficult or unclear passages, he preferred to leave them rough. One important method was to identify keywords (Leitworte) and study the linguistic relationship between the parts of the text, uncovering the repetition of word stems and same or similar sounding words. Buber also tried to ward against Platonizing tendencies by shifting from static and impersonal terms to active and personal terms. For instance, whereas kodesh had previously been translated “holy,” he used the term “hallowing” to emphasize activity. Similarly, God is not the “Being” but the “Existing,” and what had been rendered “Lord” became “I,” “Thou” and “He.”

Buber made two important distinctions between forms of faith in his religious studies. In the 1954 essay “Prophecy, Apocalyptic, and the Historical Hour” (in Pointing the Way), he distinguishes between “apocalyptic” approaches, which dualistically separate God from world, and regard evil as unredeemable, and “prophetic” stances, which preserve the unity of God with the world and promise the fulfillment of creation, allowing evil to find direction and serve the good. In the prophetic attitude one draws oneself together so that one can contribute to history, but in the apocalyptic attitude one fatalistically resigns oneself. The tension between these two tendencies is illustrated in his 1943 historical novel Gog and Magog: A Novel (also published as For the Sake of Heaven: A Hasidic Chronicle-Novel).

In Two Types of Faith (1951), Buber distinguishes between the messianism of Jesus and the messianism of Paul and John. While he had great respect for Jesus as a man, Buber did not believe that Jesus took himself to be divine. Jesus’ form of faith corresponds to emunah, faith in God’s continual presence in the life of each person. In contrast, the faith of Paul and John, which Buber labels pistis, is that God exists in Jesus. They have a dualistic notion of faith and action, and exemplify the apocalyptic belief in irredeemable original sin and the impossibility of fulfilling God’s law. Buber accuses Paul and John of transforming myth, which is historically and biographically situated, into gnosis, and replacing faith as trust and openness to encounter with faith in an image.

4. Political Philosophy

Martin Buber’s cultural Zionism, with its early emphasis on aesthetic development, was inextricably linked to his form of socialism. Buber argues that it is an ever-present human need to feel at home in the world while experiencing confirmation of one’s functional autonomy from others. The development of culture and aesthetic capacities is not an end in itself but the precondition for a fully actualized community, or “Zionism of realization” (Verwirklichungszionismus). The primary goal of history is genuine community, which is characterized by an inner disposition toward a life in common. This refutes the common misconception that an “I-Thou” relation is an exclusive affective relation that cannot work within a communal setting. Buber critiques collectivization for creating groups by atomizing individuals and cutting them off from one another. Genuine community, in contrast, is a group bound by common experiences with the disposition and persistent readiness to enter into relation with any other member, each of whom is confirmed as a differentiated being. He argues that this is best achieved in village communes such as the Israeli kibbutzim.

In his 1947 study of utopian socialism, Paths in Utopia, and 1951 essay “Society and the State” (in Pointing the Way), Buber distinguished between the social and political principles. The political principle, exemplified in the socialism of Marx and Lenin, tends towards centralization of power, sacrificing society for the government in the service of an abstract, universal utopianism. In contrast, influenced by his close friend, anarchist Gustav Landauer, Buber postulates a social principle in which the government serves to promote community. Genuine change, he insists, does not occur in a top-down fashion, but only from a renewal of man’s relations. Rather than ever-increasing centralization, he argues in favor of federalism and the maximum decentralization compatible with given social conditions, which would be an ever-shifting demarcation line of freedom.

Seeking to retrieve a positive notion of utopianism, Buber characterizes genuine utopian socialism as the ongoing realization of the latent potential for community in a concrete place. Rather than seeking to impose an abstract ideal, he argues that genuine community grows organically out of the topical and temporal needs of a given situation and people. Rejecting economic determinism for voluntarism, he insists that socialism is possible to the extent that people will a revitalization of communal life. Similarly, his Zionism is not based on the notion of a final state of redemption but an immediately attainable goal to be worked for. This shifts the notion of utopian socialism from idealization to actualization and equality.

Despite his support of the communal life of the kibbutzim, Buber decried European methods of colonization and argued that the kibbutzim would only be genuine communities if they were not closed off from the world. Unlike nationalism, which sees the nation as an end in itself, he hoped Israel would be more than a nation and would usher in a new mode of being. The settlers must learn to live with Arabs in a vital peace, not merely next to them in a pseudo-peace that he feared was just a prelude to war. As time went on, Buber became increasingly critical of Israel, stating that he feared a victory for the Jews over the Arabs would mean a defeat for Zionism.

Buber’s criticism of Israeli policies led to many public debates with its political leaders, in particular David Ben-Gurion, Israel’s first Prime Minister. In a relatively early essay, “The Task” (1922), Buber argued that the politicization of all life was the greatest evil facing man. Politics inserts itself into every aspect of life, breeding mistrust. This conviction strengthened over time, and in his 1946 essay “A Tragic Conflict” (in A Land of Two Peoples) he described the notion of a politicized “surplus” conflict. When everything becomes politicized, imagined conflict disguises itself as real, tragic conflict. Buber viewed Ben-Gurion as representative of this politicizing tendency. Nevertheless, Buber remained optimistic, believing that the greater the crisis the greater the possibility for an elemental reversal and rebirth of the individual and society.

Buber’s relationship to violence was complicated. He argued that violence does not lead to freedom or rebirth but only renewed decline, and deplored revolutions whose means were not in alignment with their end. Afraid that capital punishment would only create martyrs and stymie dialogue, he protested the sentencing of both Jewish and Arab militants and called the execution of Nazi Adolf Eichmann a grave mistake. However, he insisted that he was not a pacifist and that, sometimes, just wars must be fought. This was most clearly articulated in his 1938 exchange of letters with Gandhi, who compared Nazi Germany to the plight of Indians in South Africa and suggested that the Jews use satyagraha, or non-violent “truth-force.” Buber was quite upset at the comparison of the two situations and replied that satyagraha depends upon testimony. In the face of total loss of rights, mass murder and forced oblivion, no such testimony was possible and satyagraha was ineffective (see Pointing the Way and The Letters of Martin Buber: A Life of Dialogue).

5. Philosophy of Education

In addition to his work as an educator, Martin Buber also delivered and published several essays on philosophy of education, including “Education,” given in 1925 in Heidelberg (in Between Man and Man). Against the progressive tone of the conference, Buber argued that the opposite of compulsion and discipline is communion, not freedom. The student is neither entirely active, so that the educator can merely free his or her creative powers, nor is the student purely passive, so that the educator merely pours in content. Rather, in their encounter, the educative forces of the instructor meet the released instinct of the student. The possibility for such communion rests on mutual trust.

The student trusts in the educator, while the educator trusts that the student will take the opportunity to fully develop herself. As the teacher awakens and confirms the student’s ability to develop and communicate herself, the teacher learns to better encounter the particular and unique in each student. In contrast to the propagandist, the true educator influences but does not interfere. This is not a desire to change the other, but rather to let what is right take seed and grow in an appropriate form. Hence theirs is a dialogical relationship, but not one of equal reciprocity: if the instructor is to do his or her job, the relationship cannot be one between equals.

Buber explains that one cannot prepare students for every situation, but one can guide them to a general understanding of their position and then prepare them to confront every situation with courage and maturity. This is character or whole person education. One educates for courage by nourishing trust through the trustworthiness of the educator. Hence the presence and character of the educator is more important than the content of what is actually taught. The ideal educator is genuine to his or her core, and responds with his or her “Thou”, instilling trust and enabling students to respond with their “Thou”. Buber acknowledges that teachers face a tension between acting spontaneously and acting with intention. They cannot plan for dialogue or trust, but they can strive to leave themselves open for them.

In “Education and World-View” (1935, in Pointing the Way), Buber further elaborates that in order to prepare for a life in common, teachers must educate in such a way that both individuation and community are advanced. This entails setting groups with different world-views before each other and educating, not for tolerance, but for solidarity. An education of solidarity means learning to live from the point of view of the other without giving up one’s own view. Buber argues that how one believes is more important than what one believes. Teachers must develop their students to ask themselves on what their world-view stands, and what they are doing with it.

6. References and Further Reading

a. General

  • “Interrogation of Martin Buber.” Conducted by M.S. Friedman. Philosophic Interrogations. Ed. S. and B. Rome. New York: Holt, Rinehart and Winston, 1964.
    • Questions by more than 50 major thinkers and Buber’s responses.
  • Martin Buber Werkausgabe. Ed. Paul Mendes-Flohr and Peter Schäfer. Gütersloh: Gütersloher Verlagshaus, 2001.
    • A critical 21-volume compilation of the complete writings of Buber in German, designed to replace Buber’s self-edited Werke.
  • The Letters of Martin Buber: A Life of Dialogue. Ed. Nahum N. Glatzer and Paul Mendes-Flohr. Trans. Richard and Clara Winston and Harry Zohn. Syracuse, N.Y.: Syracuse University Press, 1996.
    • Includes letters to his wife and family as well as many notable thinkers, including Gandhi, Walter Benjamin, Albert Einstein, Hermann Hesse, Franz Kafka, Albert Camus, Gustav Landauer and Dag Hammarskjöld.
  • The Martin Buber Reader. Ed. Asher Biemann. New York: Macmillan, 2002.
  • The Philosophy of Martin Buber: The Library of Living Philosophers, 12. Ed. Paul A. Schilpp and Maurice Friedman. La Salle, IL: Open Court, 1967.
    • Large collection of essays by Gabriel Marcel, Charles Hartshorne, Emmanuel Lévinas, Hugo Bergman, Jean Wahl, Ernst Simon, Walter Kaufmann and many others, with Buber’s replies and autobiographical statements.
  • Werke. 3 vols. Vol. 1: Schriften zur Philosophie. Vol. 2: Schriften zur Bibel. Vol. 3: Schriften zum Chassidismus. Munich and Heidelberg: Kösel Verlag and Lambert Schneider, 1962-63.
    • Comprehensive collection (more than four thousand pages long), edited by Buber. Lacks some very early and very late essays, which may be found in the Martin Buber Archives of the Jewish National and University Library at the Hebrew University of Jerusalem.

b. Mythology

  • Tales of Rabbi Nachman. Trans. Maurice Friedman. Amherst, N.Y.: Humanity Books, 1988.
  • The Legend of the Baal-Shem. Trans. Maurice Friedman. London: Routledge, 2002.
  • Tales of the Hasidim (The Early Masters and The Later Masters). New York: Schocken Books, 1991.
  • Gog and Magog: A Novel. Trans. Ludwig Lewisohn. Syracuse, N.Y.: Syracuse University Press, 1999.
    • Previously published as For the Sake of Heaven: A Hasidic Chronicle-Novel.

c. Philosophical Works

  • Between Man and Man. Trans. Ronald Gregor-Smith. New York: Routledge, 2002.
    • Good introduction to Buber’s thought that includes “Dialogue,” “What is Man?” “The Question to the Single One” (on Kierkegaard), and lectures on education.
  • Daniel: Dialogues on Realization. Trans. Maurice S. Friedman. New York: McGraw-Hill, 1965.
    • Early work, important for understanding the development to I and Thou.
  • Eclipse of God: Studies in the Relation Between Religion and Philosophy. Trans. Maurice Friedman. Atlantic Highlands, N.J.: International Humanities Press, 1988.
    • Includes critiques of Heidegger, Sartre and Jung.
  • Good and Evil: Two Interpretations. Pt. 1: Right and Wrong, trans. R.G. Smith. Pt. 2: Images of Good and Evil, trans. M. Bullock. Upper Saddle River, N.J.: Prentice Hall, 1997.
    • Very helpful to an understanding of Buber’s moral philosophy and relation to existentialism.
  • I and Thou. Trans. Ronald Gregor-Smith. New York: Scribner, 1984.
  • I and Thou. Trans. Walter Kaufmann. New York: Simon and Schuster, 1996.
  • The Knowledge of Man: Selected Essays. Trans. Maurice Friedman and Ronald Gregor-Smith. Amherst, N.Y.: Prometheus Books, 1998.
    • Mature and technical, with the important “Distance and Relation” and lectures given for the Washington School of Psychiatry.

d. Political and Cultural Writing

  • A Land of Two Peoples: Martin Buber on Jews and Arabs. Ed. Paul R. Mendes-Flohr. Chicago: University of Chicago Press, 2005.
  • Israel and the World: Essays in a Time of Crisis. New York: Schocken Books, 1963.
  • On Zion: The History of an Idea. Trans. Stanley Godman. New York: Schocken Books, 1986.
  • Paths in Utopia. Trans. R. F. Hull. Syracuse, N.Y.: Syracuse University Press, 1996.
    • History and defense of utopian socialism, including analyses of Marx, Lenin, Landauer and kibbutzim.
  • Pointing the Way: Collected Essays. Ed. and trans. Maurice Friedman. Atlantic Highlands, N.J.: Humanities Press, 1988.
    • Mix of early and late essays, including essays on theatre, Bergson and Gandhi, and “Education and World-View,” “Society and the State,” “Hope for this Hour” and “Genuine Dialogue and the Possibilities of Peace.”
  • The First Buber: Youthful Zionist Writings of Martin Buber. Trans. Gilya G. Schmidt. Syracuse, N.Y.: Syracuse University Press: 1999.

e. Religious Studies

  • Ecstatic Confessions: The Heart of Mysticism. Ed. Paul R. Mendes-Flohr. San Francisco: Harper & Row, 1985.
  • Hasidism and Modern Man. Ed. and trans. Maurice S. Friedman. New York: Harper Torchbooks, 1958.
  • Kingship of God. Trans. Richard W. Scheimann. New York: Harper, 1973.
  • Moses: The Revelation and the Covenant. Amherst, N.Y.: Humanity Books, 1998.
  • On Judaism. Ed. Nahum Glatzer. New York: Schocken Books, 1967.
  • On the Bible: Eighteen Studies. Ed. Nahum Glatzer. New York: Schocken Books, 1968.
  • The Origin and Meaning of Hasidism. Ed. and trans. Maurice Friedman. New York: Horizon Press, 1960.
  • The Prophetic Faith. New York: Collier Books, 1985.
  • The Way of Man: According to the Teaching of Hasidism. London: Routledge, 2002.
    • Best short introduction to Buber’s interpretation of Hasidism.
  • Two Types of Faith. Trans. Norman P. Goldhawk. Syracuse, N.Y.: Syracuse University Press, 2003.

f. Secondary Sources

  • Anderson, Rob and Kenneth N. Cissna. The Martin Buber-Carl Rogers Dialogue: A New Transcript With Commentary. Albany: State University of New York Press, 1997.
  • Atterton, Peter, Matthew Calarco, and Maurice Friedman, eds. Lévinas & Buber: Dialogue and Difference. Pittsburgh: Duquesne University Press, 2004.
    • Mix of primary sources, commentaries and argumentative essays.
  • Biemann, Asher D. Inventing New Beginnings: On the Idea of Renaissance in Modern Judaism. Stanford, CA: Stanford University Press, 2009.
    • Details Buber’s notions of Jewish Renaissance and aesthetic education.
  • Braiterman, Zachary. The Shape of Revelation: Aesthetics and Modern Jewish Thought. Stanford, CA: Stanford University Press, 2007.
    • Studies the relation between the philosophy of Buber and Rosenzweig and the aesthetics of early German modernism, especially the transition from Jugendstil to Expressionism.
  • Friedman, Maurice S. Encounter on the Narrow Ridge: A Life of Martin Buber. New York: Paragon House, 1991.
    • Biography largely condensed from Martin Buber’s Life and Work.
  • Friedman, Maurice S. Martin Buber’s Life and Work. 3 vols. Vol. 1: The Early Years, 1878-1923. Vol. 2: The Middle Years, 1923-1945. Vol. 3: The Later Years, 1945-1965. Detroit: Wayne State University Press, 1988.
  • Mendes-Flohr, Paul. From Mysticism to Dialogue: Martin Buber’s Transformation of German Social Thought. Detroit: Wayne State University Press, 1989.
    • Explores the influence of Landauer, Dilthey and Simmel, and Buber’s work as the editor of Die Gesellschaft.
  • Schmidt, Gilya G. Martin Buber’s Formative Years: From German Culture to Jewish Renewal, 1897-1909. Tuscaloosa: University of Alabama Press, 1995.
    • Buber’s early intellectual influences, life during university studies and turn to Zionism.
  • Scholem, Gershom. “Martin Buber’s Conception of Judaism,” in On Jews and Judaism in Crisis: Selected Essays. Ed. Werner Dannhauser. New York: Schocken, 1976.
  • Shapira, Avraham. Hope for Our Time: Key Trends in the Thought of Martin Buber. Trans. Jeffrey M. Green. Albany: State University of New York Press, 1999.
    • Systematic presentation of Buber’s main philosophic concepts.
  • Theunissen, Michael. The Other: Studies in the Social Ontology of Husserl, Heidegger, Sartre, and Buber. Cambridge, MA: MIT Press, 1984.
  • Urban, Martina. Aesthetics of Renewal: Martin Buber’s Early Representation of Hasidism as Kulturkritik. Chicago: The University of Chicago Press, 2008.
    • Discusses Buber’s hermeneutics, notions of anthology and Jewish renewal, and phenomenological presentation of Hasidism.

Author Information

Sarah Scott
Email: scots087@newschool.edu
The New School for Social Research
U. S. A.

Aquinas: Philosophical Theology

In addition to his moral philosophy, Thomas Aquinas (1225-1274) is well-known for his theological writings.  He is arguably the most eminent philosophical theologian ever to have lived.  To this day, it is difficult to find someone whose work rivals Aquinas’ in breadth and influence.  Although his work is not limited to illuminating Christian doctrine, virtually all of what he wrote is shaped by his theology.  Therefore it seems appropriate to consider some of the theological themes and ideas that figure prominently in his thought.

The volume and depth of Aquinas’ work resists easy synopsis.  Nevertheless, an abridged description of his work may help us appreciate his philosophical skill in exploring God’s nature and defending Christian teaching.  Although Aquinas does not think that philosophical reasoning can provide an exhaustive account of the divine nature, it is (he insists) both a source of divine truth and an aid in vindicating the intellectual credibility of those doctrines at the heart of the Christian faith.  From this perspective, philosophical reasoning can be (to use a common phrase) a tool in the service of theology.

An adequate understanding of Aquinas’ philosophical theology requires that we first consider the twofold manner whereby we come to know God:  reason and sacred teaching.  Our discussion of what reason reveals about God will naturally include an account of philosophy’s putative success in demonstrating both God’s existence and certain facts about God’s nature.  Yet because Aquinas also thinks that sacred teaching contains the most comprehensive account of God’s nature, we must also consider his account of faith—the virtue whereby we believe well with respect to what sacred teaching reveals about God.  Finally, we will consider how Aquinas employs philosophical reasoning when explaining and defending two central Christian doctrines:  the Incarnation and the Trinity.

Table of Contents

  1. Preliminary Matters: How Can We Know Divine Truth?
  2. Natural Theology
    1. Can We Demonstrate God’s Existence?
    2. A Sample Demonstration: The Argument from Efficient Causality
    3. God’s Nature
  3. Faith
    1. What is Faith?
    2. Faith and Voluntariness
    3. Faith and Reason
  4. Christian Doctrine
    1. Incarnation and Atonement
    2. Trinity
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Internet Resources

1. Preliminary Matters: How Can We Know Divine Truth?

How can we know realities of a divine nature?  Aquinas  posits a “twofold mode of truth concerning what we profess about God” (SCG 1.3.2).  First, we may come to know things about God through rational demonstration.  By demonstration Aquinas means a form of reasoning that yields conclusions that are necessary and certain for those who know the truth of the demonstration’s premises.  Reasoning of this sort will enable us to know, for example, that God exists.  It can also demonstrate many of God’s essential attributes, such as his oneness, immateriality, eternality, and so forth (SCG 1.3.3).  Aquinas is not claiming that our demonstrative efforts will give us complete knowledge of God’s nature.  He does think, however, that human reasoning can illuminate some of what the Christian faith professes (SCG 1.2.4; 1.7).  Those aspects of the divine life which reason can demonstrate comprise what is called natural theology, a subject we will address in section 2.

Obviously, some truths about God surpass what reason can demonstrate.  Our knowledge of them will therefore require a different source of divine truth, namely, sacred teaching.  According to Aquinas, sacred teaching contains the most complete and reliable account of what we profess about God (SCG I.5.3).  Of course, whether sacred teaching is authoritative vis-à-vis divine realities depends on whether what it says about God is true.  How, then, can we be confident that sacred teaching is, in fact, a reliable source of divine knowledge?  An extended treatment of this matter requires that we consider the role faith plays in endorsing what sacred teaching proposes for belief.  This issue is addressed in section 3.

2. Natural Theology

Generally speaking, natural theology (NT) is a discipline that seeks to demonstrate God’s existence or aspects of his nature by means of human reason and experience.  The conclusions of NT do not rely on supernaturally revealed truths;  its point of departure is that which can be ascertained by means of the senses or rational methods of investigation.  So understood, NT is primarily a philosophical enterprise.  As one commentator explains, NT “amounts to forgoing appeals to any putative revelation and religious experience as evidence for the truth of propositions, and accepting as data only those few naturally evident considerations that traditionally constitute data acceptable for philosophy generally.  That’s what makes it natural theology” (Kretzmann, 1997: 2).

A caveat:  It is a mistake to construe NT as an autonomous branch of inquiry, at least in Aquinas’ case.  In fact, partitioning NT from divine revelation does a disservice to the theological nature of Aquinas’ overall project (for an extended defense of this position, see Hibbs, 1995 and 1998; Stump, 2003: 26-32).  For Aquinas is not content with simply demonstrating the fact of God’s existence.  The first article of ST makes this clear.  There, he asks whether knowledge of God requires something more than what philosophical investigation is able to tell us (ST Ia 1.1).  His answer is yes: although natural human reason can tell us quite a bit about God, it cannot give us salvific knowledge.  He writes:  “it was necessary for the salvation of man that certain truths which exceed human reason should be made known to him by divine revelation” (Ibid.).  In discussing truths that human reason can demonstrate, then, we should keep in mind that they comprise an overture to a more enriched and explicitly theological account of God’s nature.

a. Can We Demonstrate God’s Existence?

Aquinas thinks there are a variety of ways to demonstrate God’s existence.  But before he turns to them, he addresses several objections to making God an object of demonstration.  This essay will consider two of those objections.  According to the first objection, God’s existence is self-evident.  Therefore, any effort to demonstrate God’s existence is, at best, unnecessary (ST Ia 2.1 ad 1; SCG 1.10.1).  For Aquinas, this objection rests on a confusion about what it means for a statement to be self-evident.  He explains:  a statement is self-evident if its predicate is contained in the essence of the subject (ST Ia 2.1).  For example, the statement a triangle is a 3-sided planar figure is self-evident because the predicate-term (3-sided planar figure) is a part of the subject-term’s (triangle) nature.  Anyone who knows what a triangle is will see that this statement is axiomatic; it needs no demonstration.  On the other hand, this statement will not appear self-evident to those who do not know what a triangle is.  To employ Aquinas’ parlance, the statement is self-evident in itself (per se notum secundum se) but not self-evident to us (per se notum quoad nos) (ST IaIIae 94.2;  Cf. ST Ia 2.1).  For a statement is self-evident in itself so long as it accurately predicates of the subject-term the essential characteristics it has.  Whether a statement is self-evident to us, however, will depend on whether we understand the subject-term to have those characteristics.

The aforementioned distinction (per se notum secundum se/per se notum quoad nos) is helpful when responding to the claim that God’s existence is self-evident.  For Aquinas, the statement God exists is self-evident in itself since existence is a part of God’s essence or nature (that is, God is his existence—a claim to which we’ll turn below).  Yet the statement is not self-evident to us because God’s essence is not something we can comprehend fully.  Indeed, it is unlikely that even those acquainted with the idea of God will, upon reflecting on the idea, understand that existence is something that God has necessarily.  Although Aquinas does not deny that knowledge of God is naturally implanted in us, such knowledge is, at best, inchoate and imprecise;  it does not convey absolutely that God exists (ST Ia 2.1 ad 1).  We acquire definitive knowledge of God’s existence in the same way we come to understand other natural causes, namely by identifying certain facts about the world (observable effects whose obviousness makes them better known to us) and then attempting to demonstrate their pre-existing cause (ST Ia 2.2).  In other words, knowledge of God’s existence must be acquired through a posteriori demonstrations.  We will consider one of these demonstrations below.  At this point, we simply are trying to show that since God’s existence is not (to us) self-evident, the use of theistic demonstrations will not be a pointless exercise.

The second objection to the demonstrability of God’s existence is straightforward:  that which is of faith cannot be demonstrated.  Since God’s existence is an article of faith, it is not something we can demonstrate (ST Ia 2.2 obj. 1).  Aquinas’ response to this argument denies that God’s existence is an article of faith.  That is, he denies that God’s existence is a supernaturally revealed truth.  Instead, God’s existence is a demonstrable fact which supernaturally revealed truths presuppose.  The assent of faith involves embracing doctrinal teachings about God, whose existence is already assumed.  For this reason, Aquinas describes God’s existence not as an article of faith but as a preamble to the articles.  As such, God’s existence can be the subject of demonstration.

Aquinas concedes that, for some people, God’s existence will be a matter of faith.  After all, not everyone will be able to grasp the proofs for God’s existence.  Thus for some people it is perfectly appropriate to accept on the basis of sacred teaching that which others attempt to demonstrate by means of reason (ST Ia 2.2 ad 1).

b. A Sample Demonstration: The Argument from Efficient Causality

In the Summa Theologiae Ia 2.3, Aquinas offers five demonstrations for God’s existence (these are famously referred to as the “five ways”).  Each demonstration proceeds roughly as follows:  Aquinas identifies some observable phenomenon and then attempts to show that, necessarily, the cause of that phenomenon is none other than God. The phenomena Aquinas cites in these demonstrations include: 1) motion; 2) the existence of efficient causes; 3) the reality of contingency; 4) the different grades of perfection in the natural order; and 5) the end-directed activity of natural objects.   We should note that these demonstrations are highly abridged versions of arguments he addresses at length elsewhere (most notably, SCG I.13). Constraints of space do not permit an explication of each argument. But it will be helpful to consider at least one argument in order to see how these demonstrations typically proceed.

Aquinas’ argument from efficient causes—also known as “the second way”—is straightforward and does not lend itself to many interpretative disputes.   The argument is as follows:

In the world of sense we find there is an order of efficient causes. There is no case known (neither is it, indeed, possible) in which a thing is found to be the efficient cause of itself; for then it would be prior to itself, which is impossible. Now in efficient causes it is not possible to go on to infinity, because in all efficient causes following in order, the first is the cause of the intermediate cause, and the intermediate [cause] is the cause of the ultimate cause, whether the intermediate cause be several, or only one. Now to take away the cause is to take away the effect. Therefore, if there be no first cause among efficient causes, there will be no ultimate, nor any intermediate cause. But if in efficient causes it is possible to go on to infinity, there will be no first efficient cause, neither will there be an ultimate effect, nor any intermediate efficient causes; all of which is plainly false. Therefore it is necessary to admit a first efficient cause, to which everyone gives the name of God (ST Ia 2.3).

For our purposes, it might be helpful to present Aquinas’ argument in a more formal way:

  1. The world contains instances of efficient causation (given).
  2. Nothing can be the efficient cause of itself.
  3. So, every efficient cause must have a prior cause.
  4. But we cannot have an infinite regress of efficient causes.
  5. So there must be a first efficient cause “to which everyone gives the name God.”

First premise.  Like all of Aquinas’ theistic demonstrations, this one begins by citing an observable fact about the world, namely, that there are causal forces that produce various effects.  Aquinas does not say what these effects are, but according to John Wippel, we can assume that these effects would include “substantial changes (generation and corruption of substances) as well as various instances of motion … that is, alteration, local motion, increase, and decrease” (Wippel, 2006: 58). Note here that there is no need to prove this premise.  Its truth is manifestly obvious, and thus Aquinas employs it as an argumentative point of departure.

Second premise. Aquinas then claims that it is impossible for any being to be the efficient cause of itself.  Why is self-causation impossible?  For the sake of ease, consider what it would mean for something to be the cause of its own existence (although this is not the only form of self-causation Aquinas has in mind).  In order to bring about the existence of anything, one needs a certain amount of causal power.  Yet a thing cannot have causal power unless it exists.  But if something were to be the cause of itself—that is, if it were to bring about its own existence—it would have to exist prior to itself, which is impossible (ST Ia 2.3).  Hence the third premise: every efficient cause must have a prior cause.

Aquinas’ argument in the first way—which is structurally similar to the argument from efficient causality—employs a parallel line of reasoning.  There, he says that to be in motion is to move from potentiality to actuality.  When something moves, it goes from having the ability to move to the activity of moving.  Yet something cannot be the source of its own movement.  Everything that moves does so in virtue of being moved by something that is already actual or “in act.”  In short, “whatever is in motion must be put in motion by another” (ST Ia 2.3).

Aquinas’ aim here is not to explain discrete or isolated instances of causation.  His interest, rather, is the existence of a causal order—one consisting of substances whose existence and activity depend on prior causes of that same order (Wippel, 59).  Yet this attempt to clarify Aquinas’ aim introduces an obvious problem.  If every constituent member of that order is causally dependent on something prior to itself, then it appears that the order in question must consist of an infinite chain of causes.  Yet Aquinas denies this implication (fourth premise): if the causal order is infinite, then (obviously) there could be no first cause.  But without a first cause, then (necessarily) there could be no subsequent effects—including the intermediate efficient causes and ultimate effect (ST Ia 2.3).  In other words, the absence of a first cause would imply an absence of the causal order we observe.  But since this implication is manifestly false, he says, there must be a first cause, “to which everyone gives the name God” (Ibid.).

A few clarifications about this argument are in order.  First, commentators stress that this argument does not purport to show that the world is constituted by a temporal succession of causes that necessarily had a beginning (see for example Copleston, 1955: 122-123).  Interestingly, Aquinas himself denies that the argument from efficient causality contradicts the eternality of the world (ST Ia 46.2 ad 1).  Whether the world began to exist can only be resolved, he thinks, by appealing to sacred teaching.  Thus he says that “by faith alone do we hold, and by no demonstration can it be proved, that the world did not always exist” (ST Ia 46.2).  With respect to the second way, then, Aquinas’ aim is simply to demonstrate that the order of observable causes and effects cannot be a self-existing reality.

An illustration may help clarify the sort of argument Aquinas wishes to present.  The proper growth of, say, plant life depends on the presence of sunlight and water.  The presence of sunlight and water depends on ideal atmospheric activities.  And those atmospheric activities are themselves governed by more fundamental causes, and so forth.  In this example, the events described proceed not sequentially, but concurrently. Even so, they constitute an arrangement in which each event depends for its occurrence on causally prior events or phenomena.  According to Copleston, illustrations of this sort capture the kind of causal ordering that interests Aquinas.  For “when Aquinas talks about an ‘order’ of efficient causes he is not talking of a series stretching back into the past, but of a hierarchy of causes, in which a subordinate member is here and now dependent on the causal activity of a higher member” (Copleston, 1955: 122).  Thus we might explain the sort of ordering that interests Aquinas as a metaphysical (as opposed to a temporal) ordering of causes.  And it is this sort of order that requires a first member, that is, “a cause which does not depend on the causal activity of a higher cause” (Ibid., 123).  For, as we have already seen, the absence of a first cause would imply the absence of subsequent causes and effects.  Unless we invoke a cause that itself transcends the ordering of dependent causes, we would find it difficult to account for the causal activities we presently observe.  Aquinas therefore states there must be “a first efficient, and completely non-dependent cause,” whereby “the word ‘first’ does not mean first in the temporal order but supreme or first in the ontological order” (Ibid.: 123;  For valuable commentaries on these points, see Copleston, 122-124; Wippel, 2006: 59; Reichenbach, 2008).

Second, it may appear that Aquinas is unjustified in describing the first efficient cause as God, at least if by “God” one has in mind a person possessing the characteristics Christian theologians and philosophers attribute to him (for example, omniscience, omnipotence, omnipresence, love, goodness, and so forth).  Yet Aquinas does not attempt to show through the previous argument that the demonstrated cause has any of the qualities traditionally predicated of the divine essence.  He says:  “When the existence of a cause is demonstrated from an effect, this effect takes the place of the definition of the cause in proof of the cause’s existence” (ST Ia 2.2 ad 2).  In other words, the term God—at least as it appears in ST Ia 2.2—refers only to that which produces the observed effect.  In the case of the second way, God is synonymous with the first efficient cause;  it does not denote anything of theological substance.  We might think of the term “God” as a purely nominal concept Aquinas intends to investigate further (Te Velde, 2006: 44; Wippel, 2006: 46).  For the study of what God is must be subsequent to demonstrating that he is.  A complete account of the divine nature requires a more extensive examination, which he undertakes in the subsequent articles of ST.

c. God’s Nature

Once Aquinas completes his discussion of the theistic demonstrations, he proceeds to investigate God’s nature.  Such an investigation poses unique challenges.  Although Aquinas thinks that we can demonstrate God’s existence, our demonstrative efforts cannot tell us everything about what God is like.  As we noted before, God’s nature—that is, what God is in himself—surpasses what the human intellect is able to grasp (SCG I.14.2).  Aquinas therefore does not presume to say explicitly or directly what God is.  Instead, he investigates divine nature by determining what God is not.  He does this by denying of God those properties that are conceptually at odds with what is already concluded by means of the five ways (ST Ia 3 prologue;  Cf. SCG I.14.2 and 3).

Aquinas acknowledges a potential worry for his view.  If the method by which we investigate God is one of strict remotion, then no divine predicate can describe what God is really like.  As one objection states:  “it seems that no name can be applied to God substantially.  For Damascene says … ‘Everything said of God signifies not his substance, but rather exemplifies what he is not; or expresses some relation, or something following his nature or operation’” (ST Ia 13.2 ad 1).  In other words, the terms we attribute to God either function negatively (for example, to say God is immaterial is to say he is “not material”) or describe qualities that God causes his creatures to have.  To illustrate this second alternative:  consider what we mean when we say “God is good” or “God is wise.”  According to the aforementioned objection, to say that God is good or wise is just to say that God is the cause of goodness and wisdom in creatures; the predicates in question here do not tell us anything about God’s nature (Ibid.).

For Aquinas, however, the terms we predicate of God can function positively, even if they cannot capture perfectly or make explicit the divine nature.  Here’s how.  As we have discussed, natural knowledge of God is mediated by our knowledge of the created order.  The observable facts of that order reveal an efficient cause that is itself uncaused—a self-subsisting first mover that is uncreated and is not subject to any change.  According to Aquinas, this means that God, from whom everything else is created, “contains within Himself the whole perfection of being” (ST Ia 4.2). But as the ultimate cause of our own existence, God is said to have all the perfections of his creatures (ST Ia 13.2).  Whatever perfections reside in us must be deficient likenesses of what exists perfectly in God.  Consequently, Aquinas thinks that terms such as good and wise can refer back to God.  Of course, those terms are predicated of God imperfectly just as God’s creatures are imperfect likenesses of him.  “So when we say, ‘God is good,’ the meaning is not, ‘God is the cause of goodness,’ or ‘God is not evil’; but the meaning is, ‘Whatever good we attribute to creatures, pre-exists in God,’ and in a more excellent and higher way” (Ibid.).

Moreover, denying certain properties of God can, in fact, give us a corresponding (albeit incomplete) understanding of what God is like.  In other words, the process of articulating what God is not does not yield an account of the divine that is wholly negative.  Here is a rough description of the way Aquinas’ reasoning proceeds:  we reason from theistic arguments (particularly the first and second ways) that God is the first cause; that is, God is the first being in the order of efficient causality. If this is so, there can be no potency or unrealized potential in God.  For if something has the potential or latent capacity to act, then its activity must be precipitated by some prior actuality.  But in this line of reasoning, there is no actuality prior to God.  It must follow, then, that God is pure actuality, and this in virtue of being the first cause (ST Ia 3.1).  So although this process denies God those traits that are contrary to what we know about him, those denials invariably yield a fairly substantive account of the divine life.

Other truths necessarily follow from the idea that God is pure actuality.  For example, we know that God cannot be a body.  For a characteristic feature of bodies is that they are subject to being moved by something other than themselves.  And because God is not a body, he cannot be a composite of material parts (ST Ia 3.7).  Not only does Aquinas think that God is not a material composite, he also insists that God is not a metaphysical composite (Vallencia, 2005).  In other words, God is not an amalgam of attributes, nor is he a being whose nature or essence can be distinguished from his existence.  He is, rather, a simple being.

The doctrine of divine simplicity is complicated and controversial—even among those who admire Aquinas’ philosophical theology.  But the following account should provide the reader with a rough sketch of what this doctrine involves.  Consider the example of a human being. A person is a human being in virtue of her humanity, where “humanity” denotes a species-defining characteristic.  That is, humanity is an essence or “formal constituent” that makes its possessor a human being and not something else (ST Ia 3.3).  Of course, a human being is also a material being.  In virtue of materiality, she possesses numerous individuating accidents. These would include various physical modifications such as her height or weight, her particular skin pigmentation, her set of bones, and so forth.  According to Aquinas, none of these accidental traits are included in her humanity (indeed, she could lose these traits, acquire others, and remain a human being).  They do, however, constitute the particular human being she is.  In other words, her individuating accidents do not make her human, but they do make her a particular exemplification of humanity.  This is why it would be incorrect to say that this person is identical to her humanity; instead, the individuating accidents she has make her one of many instances thereof.

But what about substances that are not composed of matter?  Such things cannot have multiple instantiations since there is no matter to individuate them into discrete instances of a specific nature or essence.  An immaterial substance then will not instantiate its nature. Instead, the substance will be identical to its nature.  This is why Aquinas insists that there can be no distinction between (1) God and (2) that by which he is God.  “He must be his own Godhead, His own life, and thus whatever else is predicated of him” (ST Ia 3.3).  For example, we often say that God is supremely good.  But it would be a mistake on Aquinas’ view to think that goodness is a property that God has, as if goodness is a property independent of God himself.  For “in God, being good is not anything distinct from him;  he is his goodness” (SCG I.3.8).  Presumably we can say the same about his knowledge,  perfection, wisdom, and other essential attributes routinely predicated of him.

So far we’ve considered the way God, as a non-physical being, is simple.  What he is (God) is indistinguishable from that by which he is (his divine essence).  Presumably other immaterial beings would be simple in precisely this way in virtue of their immateriality.  Consider, for example, the notion of angels.  That there is no matter with which to individuate angelic beings implies that there will not be multiple instantiations of an angelic nature.  Like Aquinas’ notion of God, each angelic being will be identical to its specific essence or nature (ST Ia 3.3).  But God is obviously unlike angelic beings in an important way.  Not only is God the same as his essence;  he is also the same as his existence (ST Ia 3.4;  Cf. 50.2 ad 3).  In order to see what this means, consider the conclusions from section 2.b.  There, we noted that the constituent members of the causal order cannot be the cause of their own existence and activity.  For “it is impossible for a thing’s existence to be caused by its essential constituent principles, for nothing can be the sufficient cause of its own existence, if its existence is caused” (Ibid.).  Thus the constituent members of the causal order must exist in virtue of some other, exterior principle of causality.

We are now in a position to see why, according to Aquinas, God and the principle by which he exists must be the same.  Unlike the constituent members of the causal order, all of whom receive their existence from some exterior principle, God is an uncaused cause.  In other words, God’s existence is not something bequeathed by some exterior principle or agent.  If it were, then God and the principle by which he exists would be different.  Yet the idea that God is the first efficient cause who does not acquire existence from something else implies that God is his own existence (Ibid.).  Brian Davies explains this implication of the causal argument in the following way:

The conclusion Aquinas draws [from the five ways] is that God is his own existence.  He is Ipsum Esse Subsistens.  “Existence Itself” or “underived … Existence.” To put it another way, God is not a creature.  Creatures, Aquinas thinks, “have” existence, for their natures (what they are) do not suffice to guarantee their existence (that they are).  But with God this is not so.  He does not “have” existence; his existence is not received or derived from another.  He is his own existence and is the reason other things have it (Davies, 1992: 55).

For additional discussion of Aquinas’ argument for God’s existence, see Scriptural Roots and Aquinas’s Fifth Way.

3. Faith

So far, this article has shown how and to what extent human reason can lead to knowledge about God and his nature.  Aquinas clearly thinks that our demonstrative efforts can tell us quite a bit about the divine life.  Yet he also insists that it was necessary for God to reveal to us other truths by means of sacred teaching.  Unlike the knowledge we acquire by our own natural aptitudes, Aquinas contends that revealed knowledge gives us a desire for goods and rewards that exceed this present life (SCG I.5.2).  Also, revealed knowledge may tell us more about God than what our demonstrative efforts actually show.  Although our investigative efforts may confirm that God exists, they are unable to prove (for example) that God is fully present in three divine persons, or that it is the Christian God in whom we find complete happiness (ST Ia 1.1;  SCG I.5.3).  Revealed knowledge also curbs the presumptuous tendency to think that our cognitive aptitudes are sufficient when trying to determine (more generally) what is true (SCG I.5.4).

Moreover, Aquinas contends that it was fitting for God to make known through divine revelation even those truths that are accessible to human reason.  For if such knowledge depended strictly on the difficult and time-intensive nature of human investigation, then few people would actually possess it.  Also, our cognitive limitations may result in a good deal of error when trying to contrive successful demonstrations of divine realities.  Given our proneness to mistakes, relying on natural aptitude alone may seem particularly hazardous, especially when our salvation is at stake (Ibid.; Cf. SCG I.4.3-5).  For this reason, Aquinas insists that having “unshakable certitude and pure truth” with respect to the divine life requires that we avail ourselves of truths revealed by God and held by faith (Ibid., I.4.6).

a. What is Faith?

But what is “faith”?  Popular accounts of religion sometimes construe faith as a blind, uncritical acceptance of myopic doctrine.  According to Richard Dawkins, “faith is a state of mind that leads people to believe something—it doesn’t matter what—in the total absence of supporting evidence. If there were good supporting evidence then faith would be superfluous, for the evidence would compel us to believe it anyway” (Dawkins, 1989: 330).  Such a view of faith might resonate with contemporary skeptics of religion.  But as we shall see, this view is not remotely like the one Aquinas—or historic Christianity for that matter—endorses.

To begin with, Aquinas takes faith to be an intellectual virtue or habit, the object of which is God (ST IIaIIae 1.1;  4.2).  There are other things that fall under the purview of faith, such as the doctrine of the Trinity and the Incarnation.  But we do not affirm these specific doctrines unless they have some relation to God.  According to Aquinas, these doctrines serve to explicate God’s nature and provide us with a richer understanding of the one in whom our perfect happiness consists (Ibid.).  And although faith is an intellectual virtue, it would be a mistake to construe the act of faith as something that is purely cognitive in nature, such as the belief that 2 + 2 = 4, or that Venus is a planet, or that red is a primary color.  These beliefs are not (so it seems) things over which we have much voluntary control.  Perhaps this is because their truth is manifestly obvious or because they are based on claims that are themselves self-evident.  In either case, it doesn’t appear that we choose to believe these things.

By contrast, the assent of faith is voluntary.  To employ Aquinas’ terminology, the assent of faith involves not just the intellect but the will (ST IIaIIae 1.2).  By will Aquinas means a native desire or love for what we think contributes to our happiness.  How is the will involved in the assent of faith?  Aquinas appears to have something like this in mind:  suppose a person, upon hearing a homily or a convincing argument, becomes persuaded that ultimate human happiness consists in union with God.  For Aquinas, the mere acknowledgment of this truth does not denote faith—or at least a commendable form of faith that is distinct from believing certain propositions about God.  After all, the demons believe many truths about God, but they are compelled to believe due to the obviousness of those truths.  Their belief is not shaped by an affection for God and thus not praiseworthy (ST IIaIIae 5.2 ad 1 and 2).  Thus we can imagine that a person who is convinced of certain sacred truths may (for any number of reasons) choose not to consider or endorse what she now believes. Alternatively, she may, out of love for God, actively seek God as her proper end.   According to Aquinas, this love for God is what distinguishes faith from the mere acknowledgement that certain theological statements are true.  For faith involves an appetitive aspect whereby the will—a love or desire for goodness—moves us to God as the source of ultimate happiness (ST IIaIIae 2.9 ad 2; IIaIIae 4.2;  cf. Stump, 1991: 191).  We’ll say more about the relationship between love and faith in the following sub-section.

But what prompts the will to desire God?  After all, Christianity teaches that our wills have been corrupted by the Fall.  As a result of that corruption, Christian doctrine purports that we invariably love the wrong things and are inclined to ends contrary to God’s purposes.  The only way we would be motivated to seek God is if our wills were somehow changed;  that is, we must undergo some interior transformation whereby we come to love God.  According to Aquinas, that transformation comes by way of grace.  We will say more about grace in the following subsection of this article.  For now, we can construe grace as Aquinas does:  a good-making habit that inclines us to seek God and makes us worthy of eternal life (QDV 27.1).  According to Aquinas, if a person seeks God as the supreme source of human happiness, it can only be because God moves her will by conferring grace upon her.  That is why Aquinas insists that faith involves a “[voluntary] assent to the Divine truth at the command of the will moved by the grace of God” (ST IIaIIae 2.9;  Cf. 2.2).  Of course, just what it means for one’s will to be both voluntary and moved by God’s grace is a subject about which there is contentious debate.  How can the act of faith be voluntary if the act itself is a result of God generating a change in the human will?  This is the problem to which we’ll now turn.

b. Faith and Voluntariness

We may think that voluntary actions are the products of our own free decisions, not the effects of causal forces outside of our wills.  According to Aquinas, however, the act of faith is precipitated by grace, whereby God draws the will to himself (ST IaIIae 109.7).  Does the infusion of grace contravene the sort of voluntariness that Aquinas insists is a component of faith?  Limitations of space prohibit an extensive treatment of this subject.  For this reason, a brief presentation of Aquinas’ view will follow.

The act of faith has a twofold cause:  one is external, the other is internal.  First, Aquinas says that faith requires an “external inducement, such as seeing a miracle, or being persuaded by someone” by means of reason or argument (ST IIaIIae 6.1;  Cf. 2.9 ad 3).  Observing a supernatural act or hearing a persuasive sermon or argument may corroborate the truth of sacred teaching and, in turn, encourage belief.  These inducements, however, are not sufficient for producing faith since not everyone who witnesses or hears them finds them compelling.  As Aquinas observes, of “those who see the same miracle, or who hear the same sermon, some believe, and some do not” (Ibid.).  We must therefore posit an internal cause whereby God moves the will to embrace that which is proposed for belief.  But how is it that God moves the will?  In other words, what does God do to the will that makes the assent of faith possible?  And how does God’s effort to dispose our will in a certain way not contravene its putative freedom?  None of the proposed answers to this question are uncontroversial, but what follows appears to be faithful to the view Aquinas favored (for some competing interpretations of Aquinas’ account, see Jenkins, 1998; Ross, 1985; Penelhum, 1977; and Stump, 1991 and 2003).

As indicated in the previous sub-section, charity, or the love of God, moves a person to faith (ST IIaIIae 4.3).  Aquinas states “charity is the form of faith” because the person who places her faith in God does so because of her love for God.  Thus we might think of the inward cause of faith as a kind of infused affection or, better yet, moral inclination whereby the will is directed to God (Ibid.; 23.8).  As a result of this reorientation of the will, a person will be able to view Christian teaching more favorably than she would were it not for the infusion of charity.  John Jenkins endorses a similar account.  He suggests that pride, excessive passion, and other vicious habits generate within us certain prejudices that prevent us from responding positively to sacred teaching (Jenkins, 1998: 207-208).  A will that is properly directed to God, however, does not refuse a fair and charitable evaluation of Scripture’s claims.  Jenkins writes:  “a good will [and by this he means a will that has been moved by God’s grace]…permits us to see clearly and impartially that truths which are beyond our understanding…nevertheless have been revealed by God and are to be believed” (Ibid., 208).  In other words, faith formed by charity transforms the will by allaying the strength of those appetitive obstacles that forestall love of God.  In turn, faith directs us to God and motivates us to embrace sacred teaching (ST IIaIIae 2.9 ad 3).

On this view of faith, the person who subordinates herself to God does so not as a result of divine coercion but by virtue of an infused disposition whereby she loves God.  In fact, we might argue that God’s grace makes a genuinely free response possible.  For grace curtails pride and enables us to grasp and fairly assess what the Christian faith proposes for belief (Jenkins, 209).  In doing so, it permits us to freely endorse those things that we in our sinful state would never be able—or want—to understand and embrace.  According to this view, God’s grace does not contravene the voluntary nature of our will.

c. Faith and Reason

It is clear from the preceding account of faith that Aquinas staunchly supports the use of argument in defending those claims that are proposed for belief.  Indeed, the arguments offered in support of Christian claims often provide us with the motivation we sometimes need in order to embrace them.  But does the use of reasons or argument compromise the merit of faith?  Aquinas expresses the objection this way:  “faith is necessary [in order to assent to] divine things….  Therefore in this subject it is not permitted to investigate the truth [of divinely revealed realities] by reasoning” (De trinitate, 2.1 obj. 3).  He also quotes St. Gregory’s objection:  “ ‘Faith has no merit where human reason supplies proof.’ But it is wrong to do away with the merit of faith.  Therefore it is not right to investigate matters of faith by reason” (Ibid., obj. 5).  In short, human investigation into sacred doctrine threatens to render faith superfluous.  For if one were to offer a good argument for the truth of what God reveals, then there would be no need for us to exercise faith in regard to that truth.

Yet Aquinas insists that Christianity’s doctrinal truths—truths we are to embrace by faith—are often confirmed by “fitting arguments” (SCG I.6.1), and that faith can be strengthened by the use of reason (De trinitate, 2.1).  What sort of reasoning or argumentation does Aquinas have in mind?  He makes a distinction between demonstrative reasoning and persuasive reasoning.  As we saw earlier, demonstrative reasoning yields a conclusion that is undeniable for anyone who grasps the truth of the demonstration’s premises.  In these cases, believing the demonstration’s conclusion is not a voluntary affair.  If I know that the interior angles of any rectangle sum to 360° and that a square is a rectangle, then I cannot help but believe that the interior angles of a square sum to 360°.  In cases of demonstrative reasoning, knowledge of a demonstration’s premises is sufficient to guarantee assent to its conclusion (De trinitate 2.1 ad 5).  Were a person to grasp the truth of sacred doctrine by means of this sort of reasoning, belief would be necessitated and the merit of faith destroyed (Ibid.).
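
To illustrate the necessitating character of demonstrative reasoning, here is a minimal sketch in the Lean proof assistant (not drawn from Aquinas; the names Shape, Rectangle, Square, and AngleSum360 are placeholders rather than a formalization of geometry).  The point is only that, once the premises are accepted, assent to the conclusion cannot be withheld:

    -- A toy rendering of demonstrative reasoning: knowledge of the premises
    -- guarantees assent to the conclusion.  The predicates are illustrative only.
    example {Shape : Type} {Rectangle Square AngleSum360 : Shape → Prop}
        (h1 : ∀ s, Rectangle s → AngleSum360 s)  -- every rectangle's interior angles sum to 360°
        (h2 : ∀ s, Square s → Rectangle s)       -- every square is a rectangle
        : ∀ s, Square s → AngleSum360 s :=       -- hence every square's interior angles sum to 360°
      fun s hs => h1 s (h2 s hs)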

Persuasive reasoning, on the other hand, does no such thing.  We might think of persuasive reasoning as playing an apologetic role vis-à-vis theological claims (Stump suggests something along these lines.  See 1991: 197).  In his own gloss on Aquinas, Jenkins suggests that “persuasive reasoning” consists of “credibility arguments” that corroborate the truth of what sacred scripture teaches but are ultimately unable “to move one to assent to the articles of faith” (Jenkins, 1997: 185-186).  In other words, the arguments in which persuasive reasoning consists may provide reasons for accepting certain doctrines, but they cannot compel acceptance of those doctrines.  One still needs the grace of faith in order to embrace them.  Whatever merits persuasive reasoning confers on sacred teaching, “it is the interior movement of grace and the Holy Spirit which is primary in bringing one to see that these truths  have been divinely revealed and are to be believed” (Ibid., 196).  This is why Aquinas insists that human investigation into matters of faith does not render faith superfluous or “deprive faith of its merit” (De trinitate 2.1 ad 5).

For more discussion, see the article Faith and Reason.

4. Christian Doctrine

A closer look at some central Christian doctrines is now in order.  The term “Christian doctrine” refers to the specific, developed teachings that are at the heart of Christian faith and practice.  And although there are many doctrines that constitute sacred teaching, at least two are foundational to Christianity and subject to thorough analysis by Aquinas.  These include the Incarnation and the Trinity.  Aquinas takes both of these doctrines to be essential to Christian teaching and necessary to believe in order to receive salvation (see ST IIaIIae 2.7 and 8, respectively).  For this reason it will be beneficial to explore what these doctrines assert.

a. Incarnation and Atonement

The doctrine of the Incarnation teaches that God literally and in history became human in the person of Jesus Christ.  The doctrine of the Incarnation further teaches that Christ is the complete and perfect union of two natures, human and divine.  The idea here is not that Jesus is some strange hybrid, a chimera of human and divine parts.  The idea rather is that in Christ there is a merger of two natures into one hypostasis—a subsisting individual composed of two discrete but complete essences (ST III 2.3).  In short, Christ is a single person who is fully human and fully divine;  he is God and man.  Aquinas’ efforts to explicate and defend this doctrine are ingenious but may prove frustrating without a more advanced understanding of the metaphysical framework he employs (see Stump 2003 for a treatment of this subject).  Rather than pursue the complexities of that framework, we will instead address a different matter to which the Incarnation is intricately connected.

According to Christian teaching, human beings are estranged from God.  That estrangement is the result of original sin—a “heritable stain” we contract from our first parent, Adam (Catholic Encyclopedia, “Original Sin”;  Cf. ST IaIIae 81.1).  So understood, sin refers not to a specific immoral act but to a spiritual wounding that diminishes the good of human nature (ST IaIIae 85.1 and 3).  Sin’s stain undermines our ability to deliberate well about practical matters; it hardens our wills toward evil; and it exacerbates the impetuosity of passion, thereby making virtuous activity more difficult (ST IaIIae 85.3).  Further, Christian doctrine states that we become progressively more corrupt as we yield to sinful tendencies over time.  Sinful choices produce corresponding habits, or vices, that reinforce hostility towards God and put beatitude further beyond our reach.  No amount of human effort can remedy this problem.  The damage wrought by sin prevents us from meriting divine favor or even wanting the sort of goods that make union with God possible.

The Incarnation makes reconciliation with God possible.  To understand this claim, we must consider another doctrine to which the Incarnation is inextricably tied, namely, the doctrine of the Atonement.  According to the doctrine of Atonement, God reconciles himself to human beings through Christ, whose suffering and death compensates for our transgressions (ST III 48.1).  In short, reconciliation with God is accomplished by means of Christ’s satisfaction for sin.  Yet this satisfaction does not consist in making reparations for past transgressions.  Rather it consists in God healing our wounded natures and making union with him possible. The most fitting way to accomplish this task was through the Incarnation (ST III 1.2).  From this perspective, satisfaction is more restorative than retributive.  As Eleonore Stump notes:  “the function of  satisfaction for Aquinas is not to placate a wrathful God … [but] to restore a sinner to a state of harmony with God by repairing or restoring in the sinner what sin has damaged” (Stump, 2003: 432).  Aquinas emphasizes the restorative nature of satisfaction by detailing the many blessings Christ’s Incarnation and atoning work afford.  A partial list is as follows:  the incarnation provides human beings with a tangible manifestation of God himself, thereby inspiring faith;  it prompts in us hope for salvation;  it demonstrates God’s love for human beings, and in turn kindles within us a love for God; correlatively, it produces in us sorrow for past sins and a desire to turn from them; it provides us a template of humility, constancy, obedience, and justice, all of which are required for salvation; and it merits “justifying grace” for human beings (ST III 1.2; 46.3;  Cf.  90.4).

This last benefit requires explanation.  As we saw in the previous section, our natures are corrupted by sin, resulting in a weakened inclination to act virtuously and obey God’s commandments.  Only a supernatural transformation of our recalcitrant wills can heal our corrupt nature and make us people who steadily trust, hope in, and love God as the source of our beatitude.  In short, Christian doctrine purports that we need God’s grace—a divinely infused quality that inclines us toward God.  This brief description of grace might suggest that it is an infused virtue much like faith, hope, and charity.  According to Aquinas, however, grace is not a virtue.  Rather, it is the infused virtues’ “principle and root,” a disposition that is antecedently necessary in order to practice the virtues themselves (ST IaIIae 110.3;  110 ad 3).  We might be inclined to think of God’s grace as a transformative quality that enables us to desire our supernatural end, fulfill God’s commandments, and avoid sin (ST IaIIae 109.2, 4, and 8).

This account helps explain why grace is said to justify sinners. Justification consists not only in the remittance of sins, but in a transmutation whereby our wills are supernaturally directed away from morally deficient ends and towards God.  In short, justification produces within us a certain “rectitude of order” whereby our passions are subordinate to reason and reason is subordinate to God as our proper end (ST IaIIae 113.1;  113.1 ad 2).  In this way God, by means of his grace, heals our fallen nature, pardons sin, and makes us worthy of eternal life.

Now, remission of sin and moral renovation cannot occur apart from the work God himself accomplishes through Christ. According to Aquinas, forgiveness and righteousness are made possible through Christ’s passion, or the love and obedience he exemplified through his suffering and death (ST III 48.2). Thus he describes Christ’s suffering and death as instruments of God’s grace; for through them we are freed from sin and reconciled to God (ST IaIIae 112.1 ad 1; ST III 48.1; 49.1-4). But how exactly does Christ’s suffering and death merit the remission of sin and unify us with God? Briefly: Christ’s loving obedience and willingness to suffer for justice’s sake merited God’s favor. Yet such favor was not limited to Christ. For “by suffering out of love and obedience, Christ gave more to God than what was required to compensate for the offense of the whole human race” (ST III 48.2; Cf. 48.4). That is to say, so great was his love and obedience that he accrued a degree of merit that was sufficient to atone for everyone’s sin (Ibid.). On this account, God accepts Christ’s perfect expression of love and obedience as a more than adequate satisfaction for sin, thereby discharging us from the debt of punishment incurred by our unfaithfulness and lack of love. But again, the aim of satisfaction is not to appease God through acts of restitution but to renovate our wills and make possible a right relationship with him (Stump, 432). After all, satisfaction for sin does us little good if Christ’s actions do not serve to change us by transforming the vicious inclinations that alienate us from God. Thus we ought not to look at Christ simply as an instrument by which our sins are wiped clean, but as one whose sacrificial efforts produce in us a genuine love for God and make possible the very union we desire (ST III 49.1; 49.3 ad 3).

The preceding survey of the Incarnation and the Atonement will undoubtedly raise further questions that we cannot possibly address here.  For example, one issue this article has not addressed is the role of Christian sacraments in conferring grace and facilitating the believer’s incorporation into God’s life (see, for example, ST III 48.2 ad 2; 62.1 and 2.  For a careful treatment of this issue, see Stump:  2003, chapter 15).  Instead, this brief survey attempts only a provisional account of how the Incarnation makes atonement for sin and reconciliation with God possible.

b. Trinity

This section will focus on the doctrine of the Trinity.  Aquinas’ definition of the Trinity is in full accord with the orthodox account of what Christians traditionally believe about God.  According to that account, God is one.  That is, his essence is one of supreme unity and simplicity.  Yet the doctrine also states that there are three distinct persons:  Father, Son, and Holy Spirit.  By distinct, Aquinas means that the persons of the Trinity are real individuals and not, say, the same individual understood under different descriptions.  Moreover, each of the three persons is identical to the divine essence.  That is, each person of the Trinity is equally God.  The doctrine is admittedly confounding.  But if it is true, then it should be internally coherent.  In fact, Aquinas insists that, although we cannot prove the doctrine through our own demonstrative efforts, we can nevertheless show that this and other doctrines known through the light of faith are not contradictory (de Trinitate, 1.1 ad 5; 1.3).

Aquinas’ exposition of the Trinity endeavors to avoid two notable heresies:  Arianism and Sabellianism.  In short, Arianism denies Christ’s full divinity.  It teaches that Christ was created by God at a point in time and therefore not co-eternal with him.  Although Arianism insists that Christ is divine, his creaturely dependence on God implies that he does not share God’s essence.   In short, God and Christ are distinct substances.  What follows from this teaching is that Christ is “a second, or inferior God, standing midway between the First Cause and creatures; as Himself made out of nothing, yet as making all things else; as existing before the worlds of the ages; and as arrayed in all divine perfections except the one which was their stay and foundation” (Catholic Encyclopedia, “Arianism”).  The other heresy, Sabellianism, attempts to preserve divine unity by denying any real distinction in God.  According to this doctrine, the names “Father,” “Son,” and “Holy Spirit” refer not to discrete persons but different manifestations of the same divine being.  Aquinas’ account attempts to avoid these heresies by affirming that the persons of the Trinity are distinct without denying the complete unity of the divine essence.

How does Aquinas go about defending the traditional doctrine?  The challenge, of course, is to show that the claim

(1) the persons of the God-head are really distinct

is consistent with the claim that

(2) God is one

In an effort to reconcile (1) and (2), Aquinas argues that there are relations in God.  For example, we find in God the relational notion of paternity (which implies fatherhood) and filiation (which implies sonship) (ST Ia 28.1 sed contra).  Paternity and filiation imply different things.  And while I may be both a father and a son, these terms connote a real relation between distinct  persons (me, my son, and my father).  Thus if there is paternity and filiation in God, then there must be a real distinction of persons that the divine essence comprises (ST Ia 28.3).  We should note here that Aquinas avoids using the terms “diversity” and “difference” in this context because such terms contravene the doctrine of simplicity (ST Ia 31.2).  The notion of distinction, however, does not contravene the doctrine of simplicity because (according to Aquinas) we can have a distinction of persons while maintaining divine unity.

This last claim is obviously the troubling one.  How can we have real distinction within a being that is perfectly one?  The answer to this question requires we look a bit more closely at what Aquinas means by relation. The idea of relation goes back at least as far as Aristotle (for a good survey of medieval analyses of relations, see Brower, 2005).  For Aristotle and his commentators, the term relation refers to a property that allies the thing that has it with something else.  Thus he speaks of a relation as that which makes something of, than, or to some other thing (Aristotle, Categories, Book 7, 6b1).  For example, what is larger is larger than something else;  to have knowledge is to have knowledge of something;  to incline is to incline toward something;  and so forth (Ibid. 6b5).

Aquinas’ attempt to make sense of the Trinity involves use of (or perhaps, as Brower notes, a significant departure from) Aristotle’s idea of relation (Brower, 2005).  On the one hand, Aquinas’ understanding of relation as it applies to creatures mirrors Aristotle’s view:  a relation is an accidental property that signifies a connection to something else (ST Ia 28.1).  On the other hand, the notion of relation need not denote a property that allies different substances.  It can also refer to distinctions that are internal to a substance.  This second construal is the way Aquinas understands the notion of relation as it applies to God.  For there is within God a relation of persons, each of which enjoys a characteristic the others do not have.  As we noted before, God the Father has the characteristic of paternity, God the Son has the characteristic of filiation, and so on.  These characteristics are unique to each person, thus creating a kind of opposition that connotes real distinction (ST Ia 28.3).

Care is required before proceeding here.  These relations not only inhere in the divine essence; they are identical to it, in the sense that each member of the Trinity is identical to God (ST Ia 28.2 and 29.4).  From this abbreviated account we see that relation as it exists in God is not, as it is for creatures, an accidental property.  For the relation, being identical to God, does not add to or modify the divine substance in any way.  Aquinas says:  “whatever has an accidental existence in creatures, when considered as transferred to God, has a substantial existence; for there is no accident in God; since all in Him is His essence. So, in so far as relation has an accidental existence in creatures, relation really existing in God has the existence of the divine essence in no way distinct therefrom” (ST Ia 28.2).  Seen this way, it is somewhat misleading to say that relation is something that “inheres in” God;  for the relation is identical to God himself (Emery, 2007:  94).

This admittedly truncated account of Aquinas’ position does little more than restate, in greater detail, the very claim he needs to explain.  One can still ask:  how can God be a perfect unity and still comprise a plurality of distinct persons?  Aquinas is aware of the worry.  For “if in God there exists a number of persons, then there must be whole and part in God, which is inconsistent with the divine simplicity” (ST Ia 30.1 obj. 4;  cf. ST Ia 39.1 obj. 1).  Aquinas recognizes that most people will find it difficult to imagine how something can have within itself multiple relations and at the same time be an unqualified unity.

In order to show how one might have a plurality while preserving unity, consider the following analogy.  Using Aristotle’s account of material constitution as a point of departure, Jeffrey Brower and Michael Rea suggest that a bronze statue involves two discrete things: a lump of bronze and the statue itself.  Although the lump of bronze and the statue are distinct things, “they are numerically one material object.  Likewise, the persons of the Trinity are three distinct persons but numerically one God” (Brower and Rea, 2005: 69).  Although the authors do not have Aquinas’ account of divine relations in mind when using this analogy, we may cautiously avail ourselves of their insights.  If we can think of the lump of bronze and the configuration by which the bronze is a statue as a relation of two things, then we can see that relation does not concern anything that is not identical to the object (the bronze statue).  Such an account is similar to the one Aquinas has in mind when attempting to reconcile (1) and (2).  For although each person of the Trinity is distinct from the others, no person is distinct from God (ST Ia 28.2;  cf. 39.1).

Some readers might object to the use of such analogies.  In the present case, the relations that inhere in God are persons, not formally discrete features of an artifact.  Moreover, the analogy does not adequately capture the precise nature of the relations as they exist in God.  For Aquinas, the divine relations are relations of procession.  Here Aquinas takes himself to be affirming sacred teaching, which tells us that Jesus “proceeded and came forth” from the Father (John 8:42) and that the Holy Spirit proceeds from both the Father and the Son (according to the Catholic expression of the Nicene Creed).  Aquinas is careful to note that the form of procession mentioned here does not consist in the production of separate beings.  Jesus does not, as Arius taught, proceed from God as a created being.  Nor does the Holy Spirit proceed from Father and Son as a creature of both.  Were this the case, neither the Son nor the Holy Spirit would be truly God (ST Ia 27.1).  Instead, the procession to which Aquinas refers does not denote an outward act at all;  procession is internal to God and not distinct from him.

In order to make sense of this idea, Aquinas employs the analogy of understanding, which consists in an interior process, namely, the conceptualization of an object understood and signified by speech (Ibid.).  He refers to this process as intelligible emanation. Intelligible concepts proceed but are not distinct from the agent who conceives them.  This notion is central to Aquinas’ account of how Father and Son relate to each other.  For the Son does not proceed from the Father as a separate being but as an intelligible conception of God himself.  Thus Aquinas describes the Son as the “supreme perfection of God, the divine Word [who] is of necessity perfectly one with the source from which he proceeds” (ST Ia 27.1 ad 2;  Cf. 27.2).  To put the matter another way, the divine Word is the likeness of God himself—a concept emanating from God’s own self-understanding.  These words may sound cryptic to the casual reader, but Davies helps render them comprehensible.  He suggests that the Son’s relationship to God is not unlike our self-conception’s relationship to ourselves.  For “there is similarity between me and my [self-image] insofar as my concept of myself really corresponds to what I am” (Davies, 196).  Similarly, Aquinas thinks of the Son “as the concept in the mind of the one conceiving of himself.  In God’s case this means that the Father brings forth the Son, who is like him insofar as he is properly understood, and who shares the divine nature since God and his understanding are the same”  (Ibid.).

Aquinas’ attempt to render the doctrine of the Trinity coherent is controversial and involves complexities not addressed here.  Yet I imagine Aquinas himself would not be surprised by the consternation some readers might express in response to his attempts to illuminate and defend this and other sacred teachings.  After all, Aquinas contends that knowledge of the divine nature will, if acquired by our own investigative efforts, be quite feeble (SCG IV.1.4;  ST IIaIIae 2.3).  And this is why God, in his goodness, must reveal to us things that transcend human reason.  But even once these things are revealed, our understanding of them will not be total or immediate.  What is required is a form of intellectual training whereby we gradually come to comprehend that which is difficult to grasp in an untutored state (Jenkins: 219).  And even those who reach a proper state of intellectual maturation will not be able to comprehend these mysteries fully, which may explain why attempts to clarify and defend these doctrines can produce so much debate.  Yet Aquinas expresses the hope that what we cannot understand completely now will be apprehended more perfectly after this life, when, according to Christian doctrine, we will see God face to face (SCG IV.1.4-5).

5. References and Further Reading

a. Primary Sources

  • Thomas Aquinas, St. Compendium of Theology. 1947. St. Louis, MO: B. Herder.
  • Thomas Aquinas, St. Quaestiones de veritate (QDV). 1954. Trans. Robert W. Mulligan, S.J. Henry Regnery Company.
  • Thomas Aquinas, St. Summa contra gentiles (SCG), vol. I. 1975. Trans. Anton Pegis. Notre Dame: University of Notre Dame Press.
  • Thomas Aquinas, St. Summa contra gentiles (SCG), vol. IV. 1975. Trans. Charles J. O’Neil. Notre Dame: University of Notre Dame Press.
  • Thomas Aquinas, St. Summa theologiae (ST). 1981. Trans. Fathers of the English Dominican Province. Westminster: Christian Classics.
  • Thomas Aquinas, St. Super Boethium de Trinitate (de Trinitate). 1993. In Aquinas On Faith and Reason, ed. Stephen Brown.  Indianapolis: Hackett Publishing.

b. Secondary Sources

  • Brower, Jeffrey and Michael Rea.  2005.  “Material Constitution and the Trinity.” Faith and Philosophy 22 (January): 57-76.
  • Copleston, Frederick. 1955.  Aquinas. Baltimore: Penguin Books.
  • Davies, Brian.  1992.  The Thought of Thomas Aquinas.  New York:  Oxford University Press.
  • Dawkins, Richard.  1989 (2nd edition). The Selfish Gene.  Oxford: Oxford University Press.
  • Emery, Gilles.  2006 (2nd edition). Trinity in Aquinas.  Ann Arbor: Sapientia Press.
  • Emery, Gilles.  2007.  The Trinitarian Theology of St. Thomas Aquinas.  Oxford:  Oxford University Press.
  • Floyd, Shawn. 2006. “Achieving a Science of Sacred Doctrine,” The Heythrop Journal.  January, 47: 1-15.
  • Hibbs, Thomas.  1995.  Dialectic and Narrative in Aquinas: An Interpretation of Summa Contra Gentiles.  Notre Dame: University of Notre Dame Press.
  • Hibbs, Thomas.  1998.  “Kretzmann’s Theism vs. Aquinas’ Theism:  Interpreting the Summa Contra Gentiles,” The Thomist.  October 62: 603-22.
  • Jenkins, John.  1998. Knowledge and Faith in Thomas Aquinas.  Cambridge:  Cambridge University Press.
  • Kretzmann, Norman.  1997.  The Metaphysics of Theism:  Aquinas’ Natural Theology in Summa Contra Gentiles I.  Oxford: Oxford University Press.
  • Penelhum, Terence.  1977.  “The Analysis of Faith in Thomas Aquinas,” Religious Studies 13.
  • Ross, James. 1985.  “Aquinas on Belief and Knowledge,” in Essays Honoring Allan B. Wolter, eds. W.A. Frank and G.J. Etzkorn.  St. Bonaventure: The Franciscan Institute.
  • Stump, Eleonore.  1991.  “Aquinas on Faith and Goodness” in Being and Goodness:  The Concept of the Good in Metaphysics and Philosophical Theology, ed. Scott MacDonald.  Ithaca: Cornell University Press.
  • Stump, Eleonore.  2003. Aquinas.  New York: Routledge.  The chapter on the Atonement was especially helpful in writing section 4a of this article.
  • te Velde, Rudi.  2006.  Aquinas on God: The Divine Science of the Summa Theologiae. Burlington:  Ashgate Publishing Company.
  • Wippel, John. 2006.  “The Five Ways,” in Aquinas’ Summa Theologiae:  Critical Essays, ed. Brian Davies.  Lanham:  Rowman & Littlefield Publishers.

c. Internet Resources

  • Barry, William. “Arianism,” The Catholic Encyclopedia (2003 Edition).
  • Brower, Jeffrey.  2005.  “Medieval Theories of Relations,” in The Stanford Encyclopedia of Philosophy (2007 Edition), Edward N. Zalta (ed.).
  • Vallicella, William.  2006.  “Divine Simplicity,” in The Stanford Encyclopedia of Philosophy (2007 Edition), Edward N. Zalta (ed.).

Author Information

Shawn Floyd
Email: sfloyd@malone.edu
Malone University
U.S.A.

Virtue Ethics

Virtue ethics is a broad term for theories that emphasize the role of character and virtue in moral philosophy rather than either doing one’s duty or acting in order to bring about good consequences. A virtue ethicist is likely to give you this kind of moral advice: “Act as a virtuous person would act in your situation.”

Most virtue ethics theories take their inspiration from Aristotle who declared that a virtuous person is someone who has ideal character traits. These traits derive from natural internal tendencies, but need to be nurtured; however, once established, they will become stable. For example,  a virtuous person is someone who is kind across many situations over a lifetime because that is her character and not because she wants to maximize utility or gain favors or simply do her duty. Unlike deontological and consequentialist theories, theories of virtue ethics do not aim primarily to identify universal principles that can be applied in any moral situation. And virtue ethics theories deal with wider questions—“How should I live?” and “What is the good life?” and “What are proper family and social values?”

Since its revival in the twentieth century, virtue ethics has been developed in three main directions: Eudaimonism, agent-based theories, and the ethics of care. Eudaimonism bases virtues in human flourishing, where flourishing is equated with performing one’s distinctive function well. In the case of humans, Aristotle argued that our distinctive function is reasoning, and so the life “worth living” is one in which we reason well. An agent-based theory holds that the virtues are determined by common-sense intuitions about which character traits we as observers judge to be admirable in other people. The third branch of virtue ethics, the ethics of care, was proposed predominantly by feminist thinkers. It challenges the idea that ethics should focus solely on justice and autonomy; it argues that more feminine traits, such as caring and nurturing, should also be considered.

Here are some common objections to virtue ethics. Its theories provide a self-centered conception of ethics, because human flourishing is seen as an end in itself, and they do not sufficiently consider the extent to which our actions affect other people. Virtue ethics also does not provide guidance on how we should act, as there are no clear principles for guiding action other than “act as a virtuous person would act given the situation.” Lastly, the ability to cultivate the right virtues will be affected by a number of factors beyond a person’s control, such as one’s education, society, friends and family. If moral character is so reliant on luck, what role does this leave for appropriate praise and blame of the person?

This article looks at how virtue ethics originally defined itself by calling for a change from the dominant normative theories of deontology and consequentialism. It goes on to examine some common objections raised against virtue ethics and then looks at a sample of fully developed accounts of virtue ethics and responses.

Table of Contents

  1. Changing Modern Moral Philosophy
    1. Anscombe
    2. Williams
    3. MacIntyre
  2. A Rival for Deontology and Utilitarianism
    1. How Should One Live?
    2. Character and Virtue
    3. Anti-Theory and the Uncodifiability of Ethics
    4. Conclusion
  3. Virtue Ethical Theories
    1. Eudaimonism
    2. Agent-Based Accounts of Virtue Ethics
    3. The Ethics of Care
    4. Conclusion
  4. Objections to Virtue Ethics
    1. Self-Centeredness
    2. Action-Guiding
    3. Moral Luck
  5. Virtue in Deontology and Consequentialism
  6. References and Further Reading
    1. Changing Modern Moral Philosophy
    2. Overviews of Virtue Ethics
    3. Varieties of Virtue Ethics
    4. Collections on Virtue Ethics
    5. Virtue and Moral Luck
    6. Virtue in Deontology and Consequentialism

1. Changing Modern Moral Philosophy

a. Anscombe

In 1958 Elizabeth Anscombe published a paper titled “Modern Moral Philosophy” that changed the way we think about normative theories. She criticized modern moral philosophy’s preoccupation with a law conception of ethics. A law conception of ethics deals exclusively with obligation and duty. Among the theories she criticized for their reliance on universally applicable principles were J. S. Mill’s utilitarianism and Kant’s deontology. These theories rely on rules of morality that are claimed to be applicable to any moral situation (that is, Mill’s Greatest Happiness Principle and Kant’s Categorical Imperative). This approach to ethics relies on universal principles and results in a rigid moral code. Further, these rigid rules are based on a notion of obligation that is meaningless in modern, secular society because they make no sense without assuming the existence of a lawgiver—an assumption we no longer make.

In its place, Anscombe called for a return to a different way of doing philosophy. Taking her inspiration from Aristotle, she called for a return to concepts such as character, virtue and flourishing. She also emphasized the importance of the emotions and understanding moral psychology. With the exception of this emphasis on moral psychology, Anscombe’s recommendations that we place virtue more centrally in our understanding of morality were taken up by a number of philosophers. The resulting body of theories and ideas has come to be known as virtue ethics.

Anscombe’s critical and confrontational approach set the scene for how virtue ethics was to develop in its first few years. The philosophers who took up Anscombe’s call for a return to virtue saw their task as being to define virtue ethics in terms of what it is not—that is, how it differs from and avoids the mistakes made by the other normative theories. Before we go on to consider this in detail, we need to take a brief look at two other philosophers, Bernard Williams and Alasdair MacIntyre, whose call for theories of virtue was also instrumental in changing our understanding of moral philosophy.

b. Williams

Bernard Williams’ philosophical work has always been characterized by its ability to draw our attention to a previously unnoticed but now impressively fruitful area for philosophical discussion. Williams criticized how moral philosophy had developed. He drew a distinction between morality and ethics. Morality is characterized mainly by the work of Kant and notions such as duty and obligation. Crucially associated with the notion of obligation is the notion of blame. Blame is appropriate because we are obliged to behave in a certain way, and if we are capable of conforming our conduct to that obligation yet fail to do so, we have violated our duty.

Williams was also concerned that such a conception of morality rejects the possibility of luck. If morality is about what we are obliged to do, then there is no room for what is outside of our control. But sometimes attainment of the good life is dependent on things outside of our control.

In response, Williams takes a wider concept, ethics, and rejects the narrow and restricting concept of morality. Ethics encompasses many emotions that are rejected by morality as irrelevant. Ethical concerns are wider, encompassing friends, family and society and make room for ideals such as social justice. This view of ethics is compatible with the Ancient Greek interpretation of the good life as found in Aristotle and Plato.

c. MacIntyre

Finally, the ideas of Alasdair MacIntyre acted as a stimulus for the increased interest in virtue. MacIntyre’s project is as deeply critical of many of the same notions, like ought, as Anscombe and Williams. However, he also attempts to give an account of virtue. MacIntyre looks at a large number of historical accounts of virtue that differ in their lists of the virtues and have incompatible theories of the virtues. He concludes that these differences are attributable to different practices that generate different conceptions of the virtues. Each account of virtue requires a prior account of social and moral features in order to be understood. Thus, in order to understand Homeric virtue you need to look at its social role in Greek society. Virtues, then, are exercised within practices that are coherent, social forms of activity and seek to realize goods internal to the activity. The virtues enable us to achieve these goods. There is an end (or telos) that transcends all particular practices and it constitutes the good of a whole human life. That end is the virtue of integrity or constancy.

These three writers have all, in their own way, argued for a radical change in the way we think about morality. Whether they call for a change of emphasis from obligation, a return to a broad understanding of ethics, or a unifying tradition of practices that generate virtues, their dissatisfaction with the state of modern moral philosophy laid the foundation for change.

2. A Rival for Deontology and Utilitarianism

There are a number of different accounts of virtue ethics. It is an emerging concept and was initially defined by what it is not rather than what it is. The next section examines claims virtue ethicists initially made that set the theory up as a rival to deontology and consequentialism.

a. How Should One Live?

Moral theories are concerned with right and wrong behavior. This subject area of philosophy is unavoidably tied up with practical concerns about the right behavior. However, virtue ethics changes the kind of question we ask about ethics. Where deontology and consequentialism concern themselves with the right action, virtue ethics is concerned with the good life and what kinds of persons we should be. “What is the right action?” is a significantly different question to ask from “How should I live? What kind of person should I be?” Where the first type of question deals with specific dilemmas, the second is a question about an entire life. Instead of asking what is the right action here and now, virtue ethics asks what kind of person should one be in order to get it right all the time.

Whereas deontology and consequentialism are based on rules that try to give us the right action, virtue ethics makes central use of the concept of character. The answer to “How should one live?” is that one should live virtuously, that is, have a virtuous character.

b. Character and Virtue

Modern virtue ethics takes its inspiration from the Aristotelian understanding of character and virtue. Aristotelian character is, importantly, about a state of being. It’s about having the appropriate inner states. For example, the virtue of kindness involves the right sort of emotions and inner states with respect to our feelings towards others. Character is also about doing. Aristotelian theory is a theory of action, since having the virtuous inner dispositions will also involve being moved to act in accordance with them. Realizing that kindness is the appropriate response to a situation and feeling appropriately kindly disposed will also lead to a corresponding attempt to act kindly.

Another distinguishing feature of virtue ethics is that character traits are stable, fixed, and reliable dispositions. If an agent possesses the character trait of kindness, we would expect him or her to act kindly in all sorts of situations, towards all kinds of people, and over a long period of time, even when it is difficult to do so. A person with a certain character can be relied upon to act consistently over a time.

It is important to recognize that moral character develops over a long period of time. People are born with all sorts of natural tendencies. Some of these natural tendencies will be positive, such as a placid and friendly nature, and some will be negative, such as an irascible and jealous nature. These natural tendencies can be encouraged and developed or discouraged and thwarted by the influences one is exposed to when growing up. There are a number of factors that may affect one’s character development, such as one’s parents, teachers, peer group, role-models, the degree of encouragement and attention one receives, and exposure to different situations. Our natural tendencies, the raw material we are born with, are shaped and developed through a long and gradual process of education and habituation.

Moral education and development is a major part of virtue ethics. Moral development, at least in its early stages, relies on the availability of good role models. The virtuous agent acts as a role model and the student of virtue emulates his or her example. Initially this is a process of habituating oneself in right action. Aristotle advises us to perform just acts because this way we become just. The student of virtue must develop the right habits, so that he tends to perform virtuous acts. Virtue is not itself a habit. Habituation is merely an aid to the development of virtue, but true virtue requires choice, understanding, and knowledge. The virtuous agent doesn’t act justly merely out of an unreflective response, but has come to recognize the value of virtue and why it is the appropriate response. Virtue is chosen knowingly for its own sake.

The development of moral character may take a whole lifetime. But once it is firmly established, one will act consistently, predictably and appropriately in a variety of situations.

Aristotelian virtue is defined in Book II of the Nicomachean Ethics as a purposive disposition, lying in a mean and being determined by the right reason. As discussed above, virtue is a settled disposition. It is also a purposive disposition. A virtuous actor chooses virtuous action knowingly and for its own sake. It is not enough to act kindly by accident, unthinkingly, or because everyone else is doing so; you must act kindly because you recognize that this is the right way to behave. Note here that although habituation is a tool for character development it is not equivalent to virtue; virtue requires conscious choice and affirmation.

Virtue “lies in a mean” because the right response to each situation is neither too much nor too little. Virtue is the appropriate response to different situations and different agents. The virtues are associated with feelings. For example: courage is associated with fear, modesty is associated with the feeling of shame, and friendliness is associated with feelings about social conduct. The virtue lies in a mean because it involves displaying the mean amount of emotion, where mean stands for appropriate. (This does not imply that the right amount is a modest amount. Sometimes quite a lot may be the appropriate amount of emotion to display, as in the case of righteous indignation.) The mean amount is neither too much nor too little and is sensitive to the requirements of the person and the situation.

Finally, virtue is determined by the right reason. Virtue requires the right desire and the right reason. To act from the wrong reason is to act viciously. On the other hand, the agent can try to act from the right reason, but fail because he or she has the wrong desire. The virtuous agent acts effortlessly, perceives the right reason, has the harmonious right desire, and has an inner state of virtue that flows smoothly into action. The virtuous agent can act as an exemplar of virtue to others.

It is important to recognize that this is a perfunctory account of ideas that are developed in great detail in Aristotle. They are related briefly here as they have been central to virtue ethics’ claim to put forward a unique and rival account to other normative theories. Modern virtue ethicists have developed their theories around a central role for character and virtue and claim that this gives them a unique understanding of morality. The emphasis on character development and the role of the emotions allows virtue ethics to have a plausible account of moral psychology—which is lacking in deontology and consequentialism. Virtue ethics can avoid the problematic concepts of duty and obligation in favor of the rich concept of virtue. Judgments of virtue are judgments of a whole life rather than of one isolated action.

c. Anti-Theory and the Uncodifiability of Ethics

In the first book of the Nicomachean Ethics, Aristotle warns us that the study of ethics is imprecise. Virtue ethicists have challenged consequentialist and deontological theories because they fail to accommodate this insight. Both deontological and consequentialist theories rely on one rule or principle that is expected to apply to all situations. Because their principles are inflexible, they cannot accommodate the complexity of all the moral situations that we are likely to encounter.

We are constantly faced with moral problems. For example: Should I tell my friend the truth about her lying boyfriend? Should I cheat in my exams? Should I have an abortion? Should I save the drowning baby? Should we separate the Siamese twins? Should I join the fuel protests? All these problems are different and it seems unlikely that we will find the solution to all of them by applying the same rule. If the problems are varied, we should not expect to find their solution in one rigid and inflexible rule that does not admit exception. If the nature of the thing we are studying is diverse and changing, then the answer cannot be any good if it is inflexible and unyielding. The answer to “how should I live?” cannot be found in one rule. At best, for virtue ethics, there can be rules of thumb—rules that are true for the most part, but may not always be the appropriate response.

The doctrine of the mean captures exactly this idea. The virtuous response cannot be captured in a rule or principle that an agent could simply learn and then apply in order to act virtuously. Knowing virtue is a matter of experience, sensitivity, the ability to perceive, and the ability to reason practically, and it takes a long time to develop. The idea that ethics cannot be captured in one rule or principle is the “uncodifiability of ethics thesis.” Ethics is too diverse and imprecise to be captured in a rigid code, so we must approach morality with a theory that is as flexible and as situation-responsive as the subject matter itself. As a result some virtue ethicists see themselves as anti-theorists, rejecting theories that systematically attempt to capture and organize all matters of practical or ethical importance.

d. Conclusion

Virtue ethics initially emerged as a rival account to deontology and consequentialism. It developed from dissatisfaction with the notions of duty and obligation and their central roles in understanding morality. It also grew out of an objection to the use of rigid moral rules and principles and their application to diverse and different moral situations. Characteristically, virtue ethics makes a claim about the central role of virtue and character in its understanding of moral life and uses it to answer the questions “How should I live? What kind of person should I be?” Consequentialist theories are outcome-based and Kantian theories are agent-based. Virtue ethics is character-based.

3. Virtue Ethical Theories

Raising objections to other normative theories and defining itself in opposition to the claims of others was the first stage in the development of virtue ethics. Virtue ethicists then took up the challenge of developing full-fledged accounts of virtue that could stand on their own merits rather than simply criticize consequentialism and deontology. These accounts have been predominantly influenced by the Aristotelian understanding of virtue. While some versions of virtue ethics take inspiration from Plato’s, the Stoics’, Aquinas’, Hume’s and Nietzsche’s accounts of virtue and ethics, Aristotelian conceptions of virtue ethics still dominate the field. There are three main strands of development for virtue ethics: Eudaimonism, agent-based theories and the ethics of care.

a. Eudaimonism

“Eudaimonia” is an Aristotelian term loosely (and inadequately) translated as happiness. To understand its role in virtue ethics we look to Aristotle’s function argument. Aristotle recognizes that actions are not pointless because they have an aim. Every action aims at some good. For example, the doctor’s vaccination of the baby aims at the baby’s health, the English tennis player Tim Henman works on his serve so that he can win Wimbledon, and so on. Furthermore, some things are done for their own sake (ends in themselves) and some things are done for the sake of other things (means to other ends). Aristotle claims that all the things that are ends in themselves also contribute to a wider end, an end that is the greatest good of all. That good is eudaimonia. Eudaimonia is happiness, contentment, and fulfillment; it’s the name of the best kind of life, which is an end in itself and a means to live and fare well.

Aristotle then observes that where a thing has a function, the good of the thing is when it performs its function well. For example, the knife has a function, to cut, and it performs its function well when it cuts well. This argument is applied to man: man has a function, and the good man is the man who performs his function well. Man’s function is what is peculiar to him and sets him apart from other beings—reason. Therefore, the function of man is reason and the life that is distinctive of humans is the life in accordance with reason. If the function of man is reason, then the good man is the man who reasons well. This is the life of excellence or of eudaimonia. Eudaimonia is the life of virtue—activity in accordance with reason, man’s highest function.

The importance of this point for eudaimonist virtue ethics is that it reverses the relationship between virtue and rightness. A utilitarian could accept the value of the virtue of kindness, but only because someone with a kind disposition is likely to bring about consequences that maximize utility; the virtue is justified only by the consequences it brings about. In eudaimonist virtue ethics, by contrast, the virtues are justified because they are constitutive elements of eudaimonia (that is, human flourishing and wellbeing), which is good in itself.

Rosalind Hursthouse developed one detailed account of eudaimonist virtue ethics. Hursthouse argues that the virtues make their possessor a good human being. All living things can be evaluated qua specimens of their natural kind. Like Aristotle, Hursthouse argues that the characteristic way of human beings is the rational way: by their very nature human beings act rationally, a characteristic that allows us to make decisions and to change our character and allows others to hold us responsible for those decisions. Acting virtuously—that is, acting in accordance with reason—is acting in the way characteristic of the nature of human beings and this will lead to eudaimonia. This means that the virtues benefit their possessor. One might think that the demands of morality conflict with our self-interest, as morality is other-regarding, but eudaimonist virtue ethics presents a different picture. Human nature is such that virtue is not exercised in opposition to self-interest, but rather is the quintessential component of human flourishing. The good life for humans is the life of virtue and therefore it is in our interest to be virtuous. It is not just that the virtues lead to the good life (e.g. if you are good, you will be rewarded), but rather a virtuous life is the good life because the exercise of our rational capacities and virtue is its own reward.

It is important to note, however, that there have been many different ways of developing this idea of the good life and virtue within virtue ethics. Philippa Foot, for example, grounds the virtues in what is good for human beings. The virtues are beneficial to their possessor or to the community (note that this is similar to MacIntyre’s argument that the virtues enable us to achieve goods within human practices). Rather than being constitutive of the good life, the virtues are valuable because they contribute to it.

Another account is given by perfectionists such as Thomas Hurka, who derive the virtues from the characteristics that most fully develop our essential properties as human beings. Individuals are judged against a standard of perfection that reflects very rare or ideal levels of human achievement. The virtues realize our capacity for rationality and therefore contribute to our well-being and perfection in that sense.

b. Agent-Based Accounts of Virtue Ethics

Not all accounts of virtue ethics are eudaimonist. Michael Slote has developed an account of virtue based on our common-sense intuitions about which character traits are admirable. Slote makes a distinction between agent-focused and agent-based theories. Agent-focused theories understand the moral life in terms of what it is to be a virtuous individual, where the virtues are inner dispositions; Aristotelian theory is an example of an agent-focused theory. By contrast, agent-based theories are more radical in that their evaluation of actions is dependent on ethical judgments about the inner life of the agents who perform those actions. There are a variety of human traits that we find admirable, such as benevolence, kindness, and compassion, and we can identify these by looking at the people we admire, our moral exemplars.

c. The Ethics of Care

Finally, the Ethics of Care is another influential version of virtue ethics. Developed mainly by feminist writers, such as Annette Baier, this account of virtue ethics is motivated by the thought that men think in masculine terms such as justice and autonomy, whereas women think in feminine terms such as caring. These theorists call for a change in how we view morality and the virtues, shifting towards virtues exemplified by women, such as taking care of others, patience, the ability to nurture, and self-sacrifice. These virtues have been marginalized because society has not adequately valued the contributions of women. Writings in this area do not always explicitly make a connection with virtue ethics; much of their discussion of specific virtues and their relation to social practices and moral education, however, is central to virtue ethics.

d. Conclusion

There are many different accounts of virtue ethics. The three types discussed above are representative of the field. There is a large field, however, of diverse writers developing other theories of virtue. For example, Christine Swanton has developed a pluralist account of virtue ethics with connections to Nietzsche. Nietzsche’s theory emphasizes the inner self and provides a possible response to the call for a better understanding of moral psychology. Swanton develops an account of self-love that allows her to distinguish true virtue from closely related vices, e.g. self-confidence from vanity or ostentation, virtuous and vicious forms of perfectionism, etc. She also makes use of the Nietzschean ideas of creativity and expression to show how different modes of acknowledgement are appropriate to the virtues.

Historically, accounts of virtue have varied widely. Homeric virtue should be understood within the society in which it occurred: the standard of excellence was determined from within that particular society, and accountability was determined by one’s role within it. Moreover, one’s worth was measured relative to that of others, and competition was crucial in determining it.

Other accounts of virtue ethics are inspired by Christian writers such as Aquinas and Augustine (see the work of David Oderberg). Aquinas’ account of the virtues is distinctive because it allows a role for the will. One’s will can be directed by the virtues, and we are subject to the natural law because we have the potential to grasp the truth of practical judgments. To possess a virtue is to have the will to apply it and the knowledge of how to do so. Humans are susceptible to evil, and acknowledging this allows us to be receptive to the virtues of faith, hope and charity—virtues of love that are significantly different from Aristotle’s virtues.

The three types of theories covered above developed over long periods, answering many questions and often changing in response to criticisms. For example, Michael Slote has moved away from agent-based virtue ethics to a more Humean-inspired sentimentalist account of virtue ethics. Humean accounts of virtue ethics rely on the motive of benevolence and the idea that actions should be evaluated by the sentiments they express. Admirable sentiments are those that express a concern for humanity. The interested reader should seek out the work of these writers in the original to gain a full appreciation of the depth and detail of their theories.

4. Objections to Virtue Ethics

Much of what has been written on virtue ethics has been in response to criticisms of the theory. The following section presents three objections and possible responses, based on broad ideas held in common by most accounts of virtue ethics.

a. Self-Centeredness

Morality is supposed to be about other people. It deals with our actions to the extent that they affect other people. Moral praise and blame are attributed on the grounds of an evaluation of our behavior towards others and the ways in which we exhibit, or fail to exhibit, a concern for the well-being of others. Virtue ethics, according to this objection, is self-centered because its primary concern is with the agent’s own character. Virtue ethics seems to be essentially interested in the acquisition of the virtues as part of the agent’s own well-being and flourishing. Morality requires us to consider others for their own sake and not because they may benefit us. There seems to be something wrong with aiming to behave compassionately, kindly, and honestly merely because this will make oneself happier.

Related to this objection is a more general objection against the idea that well-being is a master value and that all other things are valuable only to the extent that they contribute to it. This line of attack, exemplified in the writings of Tim Scanlon, objects to treating well-being as a moral notion and sees it as closer to self-interest. Furthermore, well-being does not admit of comparisons across individuals. Thus, well-being cannot play the role that eudaimonists would have it play.

This objection fails to appreciate the role of the virtues within the theory. The virtues are other-regarding. Kindness, for example, is about how we respond to the needs of others. The virtuous agent’s concern is with developing the right sort of character that will respond to the needs of others in an appropriate way. The virtue of kindness is about being able to perceive situations where one is required to be kind, have the disposition to respond kindly in a reliable and stable manner, and be able to express one’s kind character in accordance with one’s kind desires. The eudaimonist account of virtue ethics claims that the good of the agent and the good of others are not two separate aims. Both rather result from the exercise of virtue. Rather than being too self-centered, virtue ethics unifies what is required by morality and what is required by self-interest.

b. Action-Guiding

Moral philosophy is concerned with practical issues. Fundamentally it is about how we should act. Virtue ethics has criticized consequentialist and deontological theories for being too rigid and inflexible because they rely on one rule or principle. One reply to this is that these theories are action-guiding. The existence of “rigid” rules is a strength, not a weakness, because they offer clear direction about what to do. As long as we know the principles, we can apply them to practical situations and be guided by them. Virtue ethics, it is objected, with its emphasis on the imprecise nature of ethics, fails to give us any help with the practicalities of how we should behave. A theory that fails to be action-guiding is no good as a moral theory.

The main response to this criticism is to stress the role of the virtuous agent as an exemplar. Virtue ethics reflects the imprecise nature of ethics by being flexible and situation-sensitive, but it can also be action-guiding through the example of the virtuous agent. The virtuous agent is the agent who has a fully developed moral character, who possesses the virtues and acts in accordance with them, and whose example shows us what to do. Further, virtue ethics places considerable emphasis on the development of moral judgment. Knowing what to do is not a matter of internalizing a principle, but a life-long process of moral learning that will only provide clear answers when one reaches moral maturity. Virtue ethics cannot give us easy, instant answers, because such answers do not exist. Nonetheless, it can be action-guiding if we understand the role of the virtuous agent and the importance of moral education and development. If virtue consists of the right reason and the right desire, virtue ethics will be action-guiding when we can perceive the right reason and have successfully habituated our desires to affirm its commands.

c. Moral Luck

Finally, there is a concern that virtue ethics leaves us hostage to luck. Morality is about responsibility and the appropriateness of praise and blame. However, we only praise and blame agents for actions taken under conscious choice. The road to virtue is arduous and many things outside our control can go wrong. Just as the right education, habits, influences, examples, etc. can promote the development of virtue, the wrong influencing factors can promote vice. Some people will be lucky and receive the help and encouragement they need to attain moral maturity, but others will not. If the development of virtue (and vice) is subject to luck, is it fair to praise the virtuous (and blame the vicious) for something that was outside of their control? Further, some accounts of virtue are dependent on the availability of external goods. Friendship with other virtuous agents is so central to Aristotelian virtue that a life devoid of virtuous friendship will be lacking in eudaimonia. However, we have no control over the availability of the right friends. How can we then praise the virtuous and blame the vicious if their development and respective virtue and vice were not under their control?

Some moral theories try to eliminate the influence of luck on morality (primarily deontology). Virtue ethics, however, answers this objection by embracing moral luck. Rather than trying to make morality immune to matters that are outside of our control, virtue ethics recognizes the fragility of the good life and makes it a feature of morality. It is only because the good life is so vulnerable and fragile that it is so precious. Many things can go wrong on the road to virtue, so much so that virtue may be lost altogether, but this vulnerability is an essential feature of the human condition, and it makes the attainment of the good life all the more valuable.

5. Virtue in Deontology and Consequentialism

Virtue ethics offers a radically different account from those of deontology and consequentialism. Virtue ethics, however, has influenced modern moral philosophy not only by developing a full-fledged account of virtue, but also by prompting consequentialists and deontologists to re-examine their own theories with a view to taking advantage of the insights of virtue.

For years deontologists relied mainly on the Groundwork of the Metaphysics of Morals for discussions of Kant’s moral theory. The emergence of virtue ethics caused many writers to re-examine Kant’s other works. The Metaphysics of Morals, Anthropology from a Pragmatic Point of View and, to a lesser extent, Religion Within the Limits of Reason Alone have become sources of inspiration for the role of virtue in deontology. Kantian virtue is in some respects similar to Aristotelian virtue. In the Metaphysics of Morals, Kant stresses the importance of education, habituation, and gradual development—all ideas that have been used by modern deontologists to illustrate the common-sense plausibility of the theory. For Kantians, the main role of virtue and appropriate character development is that a virtuous character will help one formulate appropriate maxims for testing. In other respects, Kantian virtue remains rather dissimilar from other conceptions of virtue. The differences rest on at least three ideas. First, Kantian virtue is a struggle against the emotions: whether one thinks the emotions should be subjugated or eliminated, for Kant moral worth comes only from the motive of duty, a motive that struggles against inclination. This is quite different from the Aristotelian picture of harmony between reason and desire. Second, for Kant there is no such thing as weakness of will, understood in the Aristotelian sense of the distinction between continence and incontinence; Kant concentrates on fortitude of will, and a failure of such fortitude is self-deception. Finally, Kantians need to give an account of the relationship between virtue as it occurs in the empirical world and Kant’s remarks about moral worth in the noumenal world (remarks that can be interpreted as creating a contradiction between ideas in the Groundwork and in other works).

Consequentialists have found a role for virtue as a disposition that tends to promote good consequences. Virtue is not valuable in itself, but rather valuable for the good consequences it tends to bring about. We should cultivate virtuous dispositions because such dispositions will tend to maximize utility. This is a radical departure from the Aristotelian account of virtue for its own sake. Some consequentialists, such as Driver, go even further and argue that knowledge is not necessary for virtue.

Rival accounts have thus tried to incorporate the benefits of virtue ethics and to develop in ways that allow them to respond to the challenges raised by virtue ethics. This has led to very fruitful and exciting work being done within this area of philosophy.

6. References and Further Reading

a. Changing Modern Moral Philosophy

  • Anscombe, G.E.M., “Modern Moral Philosophy”, Philosophy, 33 (1958).
    • The original call for a return to Aristotelian ethics.
  • MacIntyre, A., After Virtue (London: Duckworth, 1985).
    • His first outline of his account of the virtues.
  • Murdoch, I., The Sovereignty of Good (London: Ark, 1985)
  • Williams, B., Ethics and the Limits of Philosophy (London: Fontana, 1985).
    • Especially Chapter 10 for the thoughts discussed in this paper.

b. Overviews of Virtue Ethics

  • Oakley, J., “Varieties of Virtue Ethics”, Ratio, vol. 9 (1996)
  • Trianosky, G.V., “What is Virtue Ethics All About?” in Statman D., Virtue Ethics (Cambridge: Edinburgh University Press, 1997)

c. Varieties of Virtue Ethics

  • Adkins, A.W.H., Moral Values and Political Behaviour in Ancient Greece from Homer to the End of the Fifth Century (London: Chatto and Windus, 1972).
    • An account of Homeric virtue.
  • Baier, A., Postures of the Mind (Minneapolis: University of Minnesota Press, 1985)
  • Blum, L.W., Friendship, Altruism and Morality (London: 1980)
  • Cottingham, J., “Partiality and the Virtues”, in Crisp R. and Slote M., How Should One Live? (Oxford: Clarendon Press, 1996)
  • Cottingham, J., “Religion, Virtue and Ethical Culture”, Philosophy, 69 (1994)
  • Cullity, G., “Aretaic Cognitivism”, American Philosophical Quarterly, vol. 32, no. 4, (1995a).
    • Particularly good on the distinction between aretaic and deontic.
  • Cullity, G., “Moral Character and the Iteration Problem”, Utilitas, vol. 7, no. 2 (1995b)
  • Dent, N.J.H., “The Value of Courage”, Philosophy, vol. 56 (1981)
  • Dent, N.J.H., “Virtues and Actions”, The Philosophical Quarterly, vol. 25 (1975)
  • Dent, N.J.H., The Psychology of the Virtues (G.B.: Cambridge University Press, 1984)
  • Driver, J., “Monkeying with Motives: Agent-based Virtue Ethics”, Utilitas, vol. 7, no. 2 (1995).
    • A critique of Slote’s agent-based virtue ethics.
  • Foot, P., Natural Goodness (Oxford: Clarendon Press, 2001).
    • Her more recent work, developing new themes in her account of virtue ethics.
  • Foot, P., Virtues and Vices (Oxford: Blackwell, 1978).
    • Her original work, setting out her version of virtue ethics.
  • Hursthouse, R., “Virtue Theory and Abortion”, Philosophy and Public Affairs, 20, (1991)
  • Hursthouse, R., On Virtue Ethics (Oxford: OUP, 1999).
    • A book length account of eudaimonist virtue ethics, incorporating many of the ideas from her previous work and fully developed new ideas and responses to criticisms.
  • McDowell, J., “Incontinence and Practical Wisdom in Aristotle”, in Lovibond S and Williams S.G., Essays for David Wiggins, Aristotelian Society Series, Vol.16 (Oxford: Blackwell, 1996)
  • McDowell, J., “Virtue and Reason”, The Monist, 62 (1979)
  • Roberts, R.C., “Virtues and Rules”, Philosophy and Phenomenological Research, vol. LI, no. 2 (1991)
  • Scanlon, T.M., What We Owe to Each Other (Cambridge: Harvard University Press, 1998).
    • A comprehensive criticism of well-being as the foundation of moral theories.
  • Slote, M., From Morality to Virtue (New York: OUP, 1992).
    • His original account of agent-based virtue ethics.
  • Slote, M., Morals from Motives, (Oxford: OUP, 2001).
    • A new version of sentimentalist virtue ethics.
  • Swanton, C., Virtue Ethics (New York: OUP, 2003).
    • A pluralist account of virtue ethics, inspired by Nietzschean ideas.
  • Walker, A.D.M., “Virtue and Character”, Philosophy, 64 (1989)

d. Collections on Virtue Ethics

  • Crisp, R. and M. Slote, How Should One Live? (Oxford: Clarendon Press, 1996).
    • A collection of more recent as well as critical work on virtue ethics, including works by Kantian critics such as O’Neill, consequentialist critics such as Hooker and Driver, an account of Humean virtue by Wiggins, and others.
  • Crisp, R. and M. Slote, Virtue Ethics (New York: OUP, 1997).
    • A collection of classic papers on virtue ethics, including Anscombe, MacIntyre, Williams, etc.
  • Engstrom, S., and J. Whiting, Aristotle, Kant and the Stoics (USA: Cambridge University Press, 1996).
    • A collection bringing together elements from Aristotle, Kant and the Stoics on topics such as the emotions, character, moral development, etc.
  • Hursthouse, R., G. Lawrence and W. Quinn, Virtues and Reasons (Oxford: Clarendon Press, 1995).
    • A collection of essays in honour of Philippa Foot, including contributions by Blackburn, McDowell, Kenny, Quinn, and others.
  • Rorty, A.O., Essays on Aristotle’s Ethics (USA: University of California Press, 1980).
    • A seminal collection of papers interpreting the ethics of Aristotle, including contributions by Ackrill, McDowell and Nagel on eudaimonia, Burnyeat on moral development, Urmson on the doctrine of the mean, Wiggins and Rorty on weakness of will, and others.
  • Statman, D., Virtue Ethics (Cambridge: Edinburgh University Press, 1997).
    • A collection of contemporary work on virtue ethics, including a comprehensive introduction by Statman, an overview by Trianosky, Louden and Solomon on objections to virtue ethics, Hursthouse on abortion and virtue ethics, Swanton on value, and others.

e. Virtue and Moral Luck

  • Andree, J., “Nagel, Williams and Moral Luck”, Analysis 43 (1983).
    • An Aristotelian response to the problem of moral luck.
  • Nussbaum, M., Love’s Knowledge (Oxford: Oxford University Press, 1990)
  • Nussbaum, M., The Fragility of Goodness (Cambridge: Cambridge University Press, 1986).
    • Includes her original response to the problem of luck as well as thoughts on rules as rules of thumb, the role of the emotions, etc.
  • Statman, D., Moral Luck (USA: State University of New York Press, 1993).
    • An excellent introduction by Statman as well as almost every article written on moral luck, including Williams’ and Nagel’s original discussions (and a postscript by Williams).

f. Virtue in Deontology and Consequentialism

  • Baron, M.W., Kantian Ethics Almost Without Apology (USA: Cornell University Press, 1995).
    • A book length account of a neo-Kantian theory that takes virtue and character into account.
  • Baron, M.W., P. Pettit and M. Slote, Three Methods of Ethics (GB: Blackwell, 1997).
    • Written by three authors adopting three perspectives, deontology, consequentialism and virtue ethics, this is an excellent account of how the three normative theories relate to each other.
  • Driver, J., Uneasy Virtue (Cambridge: Cambridge University Press, 2001).
    • A book length account of a consequentialist version of virtue ethics, incorporating many of her ideas from previous pieces of work.
  • Herman, B., The Practice of Moral Judgement (Cambridge: Harvard University Press, 1993).
    • Another neo-Kantian who has a lot to say on virtue and character.
  • Hooker, B., Ideal Code, Real World (Oxford: Clarendon Press, 2000).
    • A modern version of rule-consequentialism, which is in many respects sensitive to the insights of virtue.
  • O’Neill, O., “Kant’s Virtues”, in Crisp R. and Slote M., How Should One Live? (Oxford: Clarendon Press, 1996).
    • One of the first Kantian responses to virtue ethics.
  • Sherman, N., The Fabric of Character (GB: Clarendon Press, 1989).
    • An extremely sympathetic account of Aristotelian and Kantian ideas on the emotions, virtue and character.
  • Sherman, N., Making a Necessity of Virtue (USA: Cambridge University Press, 1997).

Author Information

Nafsika Athanassoulis
Email: n.athanassoulis@keele.ac.uk
Keele University
United Kingdom

Marie Le Jars de Gournay (1565—1645)

A close friend and editor of Montaigne, Marie Le Jars de Gournay is best known for her proto-feminist essays defending equality between the sexes.  Her unusual lifestyle as a single woman attempting to earn her living through writing matched her theoretical argument on the right of equal access of women and men to education and public offices.  Gournay’s extensive literary corpus touches a wide variety of philosophical issues.  Her treatises on literature defend the aesthetic and epistemological value of metaphor in poetic speech.  Her works in moral philosophy analyze the virtues and vices of the courtier, with particular attention to the evil of slander.  Her educational writings emphasize formation in moral virtue according to the Renaissance tradition of the education of the prince.  Her social criticism attacks corruption in the court, clergy and aristocracy of the period.  In her writings on gender, Gournay marshals classical, biblical, and ecclesiastical sources to demonstrate the equality between the sexes and to promote the rights of women in school and in the workplace.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Themes
    1. Language, Literature, Aesthetics
    2. Moral Philosophy
    3. Social Criticism
    4. Philosophy of Education
    5. Gender and Equality
  4. Reception and Interpretation
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born on October 6, 1565, Marie Le Jars belonged to a minor aristocratic family.  Her father Guillaume Le Jars hailed from a noble family in the region of Sancerre; her mother Jeanne de Hacqueville descended from a family of jurists.  Her maternal grandfather and paternal uncle had distinguished themselves as writers.  After her birth, her father purchased the estate of Gournay-sur-Aronde; the family name now included “de Gournay.”

After the death of her father in 1578, Marie Le Jars de Gournay retired with her mother and siblings to the chateau of Gournay.  An avid reader, she provided herself with her own education, centered on the classics and French literature.  By the end of her adolescence, she had become fluent in Latin, learned at least some Greek, and had become a devotee of Ronsard and the Pléiade poets.  Philosophically, she read Plutarch and Stoic authors.  Once she discovered the Essays of Montaigne, she became his enthusiastic disciple, with special interest in the more Stoic strands of his thought.

In 1588 Gournay personally met with Montaigne; the meeting would establish a lifelong friendship.  Shortly after this encounter, Gournay wrote her novella The Promenade of Monsieur de Montaigne, Concerning Love in the Work of Plutarch.  As subsequent correspondence and meetings deepened their association, Montaigne referred to Gournay as his “adopted daughter” and increasingly shared his intellectual preoccupations with her.

After the death of her mother in 1591, Gournay found herself in straitened financial circumstances.  In 1593, the widow of the recently deceased Montaigne asked Gournay to edit a posthumous edition of the works of Montaigne.  After working for more than a year at Montaigne’s estate in the Bordeaux region, Gournay produced the new edition of the works in 1595, accompanied by a long preface of her own composition.  Later in life, Gournay would produce numerous new and expanded editions of the works of Montaigne.

During the next decades, Gournay led a precarious existence in the salons and courts of Paris.  As a single woman attempting to make a living through writing, translation, and editing, she became the object of mockery as well as of fascination in the capital’s literary coteries.  Her translations from the Latin, especially of Vergil, earned her a reputation as a classical scholar.  Often modeled after Montaigne’s essays, her treatises took sides in the controversies of the day.  She praised the older poetry of the Pléiade and condemned newer, more neoclassical poetry.  She defended the centrality of free will against Augustinians who stressed predestination.  She championed a humanistic model of education, with its emphasis on the mastery of classical languages, against more scientific models.  Her work as a controversialist reached its apogee in 1610, when she defended the unpopular Jesuits, whom many French pamphleteers had blamed for the assassination of King Henri IV by a religious fanatic the same year.

Despite her controversial reputation, Gournay became influential in court circles.  She undertook writing assignments for Queen Margot, Marie de Médicis, and Louis XIII.  In recognition of her literary skill, Cardinal Richelieu granted her a state pension in 1634.  During the same period she assisted in the organization of the nascent Académie française.  A committed Catholic sympathetic to the anti-Protestant parti dévot, she still maintained close connections to more libertine members of the Parisian salons, such as Gabriel Naudé and François La Mothe Le Vayer.  She maintained a correspondence with other European female scholars, notably Anna Maria van Schurman and Bathsua Reginald Makin.

Having experienced opprobrium as a career woman devoted to professional writing, Gournay used her writings to criticize the misogyny of Parisian literary society.  Her treatises Equality Between Men and Women (1622) and Complaint of Ladies (1626) defended the equality between the sexes and argued for equal access of both genders to education and to public offices.  In 1626, she published a collection of her previous writings.  A financial and critical success, the collection was subsequently expanded and reprinted by Gournay in 1634 and 1641. She died on July 13, 1645.

2. Works

The works of Marie Le Jars de Gournay cover a variety of literary genres.  As a translator, she published French versions of Cicero, Ovid, Tacitus, Sallust, and Vergil.  Her multi-volume translation of the Aeneid was the most celebrated of her translations of the Latin classics.

As a novelist, she wrote The Promenade of Monsieur de Montaigne, Concerning Love in the Work of Plutarch.  Written in 1588, this early work already raises Gournay’s proto-feminist concerns on the difficulties experienced by women who attempt to be the intellectual peers of men.  Her poetry, modeled after the outdated verse of Ronsard, was less successful.

Her successive editions of the works of Montaigne, first published in 1595, enhanced Montaigne’s reputation among the literary and philosophical elite of Europe.  Her repeatedly revised preface to these editions constituted an apology for the philosophical value and erudition of Montaigne’s essays.

As a formidable essayist herself, Gournay focused on several issues: the nature of literature; the education of the prince; the nature of virtue and vice; the moral defects of contemporary society.  Especially controversial were her treatises defending the equality between the sexes and the right of women to pursue a humanistic education.  Equality Between Men and Women, Complaint of Ladies, and Apology for the Woman Writing are illustrative of this genre.

In 1626 Gournay published a collection of her extant writings, called The Shadow of the Damoiselle de Gournay.  In subsequent years, she revised and expanded this edition of her works.  Named The Offerings or Presents of Demoiselle de Gournay, the last collection of her works was published in 1641.  This edition of her works runs to more than one thousand closely printed pages.

3. Philosophical Themes

Gournay’s treatises study numerous philosophical issues.  Her works on literary theory defend the value of figurative speech, especially metaphor, to communicate complex metaphysical truths. Her moral theory reflects the ethics of the Renaissance courtier.  Personal honor is the preeminent virtue, calumny the major vice.  Her pioneering work on gender insists on the equality of the sexes and on the malicious prejudice which has barred women from educational and work opportunities.  Especially bold is her social criticism. Numerous essays condemn the political and religious institutions of contemporary France for their moral defects.

a. Language, Literature, Aesthetics

A prolific poet and translator, Gournay devotes numerous treatises to issues of language and literature.  Against the neoclassical purism of certain literary critics of the period, Gournay defends the value of neologism and figurative speech.  In particular she defends the aesthetic and epistemological value of metaphor in poetic discourse.  Not only does metaphor please the senses of the reader; it communicates certain truths about God, nature, and the human soul which cannot be expressed through more concise, abstract rhetoric.  Defense of Poetry provides her most extensive analysis of innovation and simile in the poetic expression of truth.

Against contemporary critics who attempt to purify the French language through increasing stress on the rules of grammar and rhetoric, Gournay argues that poetic speech is inventive by nature.  The use of neologisms, elaborate analogies, and colorful synonyms constitutes the craft of the poet.  “Every artisan practices his or her craft according to the judgment of his or her mind.  We are artisans in our own language.   In other words, we are not only bound to work according to what we have received and learned; we are even more bound to shape, enrich, and build it, in order to add riches to riches, beauties to beauties.”  Vibrant innovation rather than imitation is the duty of the poet in creating his or her discourse.

The richly figured rhetoric defended by Gournay contrasts with the purified, restrained speech promoted by the influential neoclassical literary establishment.  The attempt to purify language of complex metaphor only leads to a vitiated speech produced by grammarians rather than poets.  “We must also laugh at what happens to these overly refined scholars when they spot some metaphorical phrase that bears excellence in its construction, brilliance, or exquisite power.  Not only do they fail to notice its beauty or its value; they denounce it and preach that restraint is more preferable.”  The pedantic speech encouraged by such strictures against metaphor quickly numbs the reader by its aridity and lack of variation.

Not only is speech denuded of metaphor bound to bore the reader; it fails to communicate the complex truths which poetry is destined to express.  The abstract, purified language of the scholar is incapable of expressing the surging emotive and spiritual life of the human person.  “[This purist approach] cannot pierce right down to the bone, as is necessary for the imagination to be properly expressed.  This must be done by a lively and powerful attack…. [The purists’] principal concern is to flee not only the frequent metaphors and proverbs we formerly used, but also to abandon borrowings from foreign languages, new expressive styles of speaking, and most of the lively diction and popular expressions—all those devices which everywhere strengthen a clause by making it more striking, especially in poetry.”  The new pedantic poetry tends toward the decorative; authentic poetry, rich in its figurative devices and colorful rhetoric, is alone capable of expressing the life great poetry embodies.  “May others look for milk and honey, if that’s what they want.  We are looking for what is called spirit and life.  I call it ‘life’ with good reason, since all speech that lacks this celestial ray in its composition—this ray of powerful dexterity, suppleness, agility, capacity to soar—is dead.”

Gournay invokes the philosopher Seneca to support her thesis that authentic poetic expression of life necessarily employs vivid, figured speech.  Great poetry often enjoys the mystical air of religious revelation.  “Seneca, a philosopher, a grave Stoic, teaches us that the soul escapes from itself and soars outside of humanity in order to give birth to something high and ecstatic far above its peers and above humanity itself.”  The rule-based strictures of the neoclassical establishment, which focuses on the surface rather than the substance of speech, threaten to destroy the religious vision which is the font of poetic inspiration.  “True poetry is an Apollonian furor.  Do they [neoclassical critics] want us to be their disciples after having been those of Apollo?  Rather, do they want us to be their schoolboys, since they crank out laws for us in order for us to crank out others?”

For Gournay, her battle against the surging neoclassical aesthetic in France is neither a simple issue of taste nor nostalgia for the embroidered lyricism of the Pléiade; it is a combat for a poetry that can express the richness of the experience of life through the verbal armory of synonym, analogy, and simile.  Only in metaphorical speech can the author express the complexity of the soul’s pilgrimage as well as touch the senses and imagination of the potential reader.

b. Moral Philosophy

In many treatises Gournay presents her moral philosophy.  Centered on questions of virtue and vice, Gournay’s moral theory defends an aristocratic code of conduct tied to the virtue of honor.  With personal reputation as a supreme good, calumny emerges as a principal evil and violence practiced in defense of one’s honor as a moral duty.

Like other neo-Stoic authors of the period, Gournay admits that the nature and authenticity of virtue is elusive.  But unlike many of her contemporaries, she does not simply dismiss virtue as a mask of the vice of pride.  In Vicious Virtue, she argues that the elusiveness of virtue is tied to the hidden motivations behind virtuous acts.  While one may observe external actions, one cannot observe the occult motives inspiring the moral agent to act in an apparently virtuous manner.  “One cannot remove from humanity all the virtuous actions it practices because of coercion, self-interest, chance, or accident.  Even graver are the external virtues which follow on some vicious inclination…To eliminate all such virtuous acts would place the human race closer to the rank of simple animals than I would dare to say.”  Much, if not all, of human moral action is motivated by immoral or amoral factors.  External virtuous conduct is caused more by personal interest or accident than by conscious virtuous intention.  To eliminate all the moral actions inspired by less than virtuous motives is to eliminate practically all deliberative moral action; the only remaining activity is comparable to that manifested by non-rational animals.

Despite the fragility of virtue, Gournay identifies certain virtues and vices as central in the moral conflicts of the age.  Calumny is a particularly dangerous vice because it destroys the personal honor and social reputation which Gournay considers a paramount moral good.  Undoubtedly, Gournay’s personal experience of battling the criticisms launched against her and of the backbiting gossip of the court helped to focus her campaign against calumny.

Of Slander provides a detailed analysis of the malicious gossip which Gournay believed to be one of the principal evils of the age.  Citing Aristotle, Gournay claims that personal honor is the most estimable external possession of the human person.  “Every day we risk many goods for the sake of life and we risk our lives for the sake of a piece of honor.  This is why Aristotle calls it the greatest of external goods, just as he qualifies shame as the greatest of external evils.  Moreover, can we deny that the love of honor is necessary as the powerful author and tutor of virtue?  At the very least, it is nine-tenths of the ten parts that make up virtue.  This is because few people are capable of biting right into this fruit, which seems too bitter without this bait.”  Since honor is so central for the cultivation of virtue, the loss of personal reputation paralyzes the pursuit of the good and constitutes a severe moral loss for the individual so affected.

Not only does calumny destroy the reputation and happiness of the victimized individual; it constitutes a grave evil for the entire ambient society.  Invoking the medieval monastic author Bernard of Clairvaux, Gournay argues that calumny and other forms of malicious gossip constitute a serious sin, which God has promised to punish with special force.  “Both the gossip and his or her voluntary audience carry around the devil, one on the tongue, the other in the ear.  This murderous thrust of the tongue transpierces three persons in one blow: the offended party, the speaker, and the listener.”  By destroying truth and the respect of persons, calumny attacks the very foundations of social life.  It abets other expressions of the contempt of persons, such as mockery and sarcasm.  Of Slander demonstrates how easily the practice of calumny leads to physical violence, as in its frequent provocation of recourse to the duel by the party whose honor has been outraged.

Gournay’s adherence to an aristocratic code of honor also appears in her treatment of violence in Is Revenge Legitimate? The treatise recognizes that Christianity would appear to ban vengeance; the believer is called to love his or her enemies and exercise forbearance in the face of evil.  But Gournay argues that only minor or ineradicable evils are to be treated this way; major injustices, such as assaults on the good name of oneself or of one’s family, demand swift reparation.  God’s greatest gift to human beings, reason itself, indicates that such moral infractions require strict retribution if the order of justice sustaining society is to be preserved.  “We must not doubt that this great God has given us Reason as the touchstone and lighthouse in this life.  He has based his moral laws on reason and reason on his moral laws….The Free Will which God has given us as the instrument of our salvation would be useless or would rather be a dangerous trap if it were not enlightened by this Reason, because this power did not have any light itself.  We must see if Reason could tolerate the entire abolition of vengeance, if justice and utility could do without it.”  For Gournay, the answer is obviously negative.  The defense of the social order requires the willingness of individuals and of the state to uphold justice by swiftly punishing those who have violated the honor of others.  The risks of abuse in the execution of this retribution should not blind individuals and the state to its necessity.  The alternative is anarchy.

c. Social Criticism

In her analysis of virtue and vice, Gournay attacks the corruption of prominent social institutions of the period.  She does not hesitate to criticize the moral failings of three powerful institutions: the court, the clergy, and the aristocracy.  Her critique of the vices typical of each institution serves the broader goal of the moral reform of France along the lines of the principles of the reforming council of the Catholic Church, the Council of Trent.

In Considerations on Some Tales of Court, Gournay criticizes the malicious gossip that dominates the court atmosphere.  Flowing from false personal pride, this tendency to slander other courtiers easily leads to violence.  “Slander is beloved by those who are looking for a fight.  It seems to give them a certain distinction of freedom.  But when I see the measures of security many of those who practice slander today take, I see the mark of servitude rather than of freedom.”  The treatise recounts how such contemptuous slander has recently provoked duels and civil wars in France.  Ultimately, it weakens loyalty to the throne by the ridicule it heaps upon the royal family and courtiers, thus undermining the stability of the social order itself.

Counsels to Certain Churchmen focuses on a particular abuse: laxity in the practice of sacramental confession.  In principle, sacramental confession is the occasion for the Catholic to express sorrow for sins, express the sincere resolution to avoid committing these sins in the future, and, if judged properly contrite, to receive absolution and an appropriate restitutionary penance from the confessor.  In practice, lax confessors, who mechanically grant absolution, have turned the sacrament into a “cosmetics machine.”  No moral reform or authentic repentance occurs in this conspiracy between hardened sinners and indulgent confessors.  To make confession once again an instrument of moral conversion, Gournay insists that the confessor must employ the armory of spiritual arms available to him in treating obdurate sinners.  “To move or strengthen a penitent toward this charity [repentant love of God], notably in what concerns abstaining from committing the offense in the future, the confessor should not spare the use of solicitude, remonstrance, threats, infliction of penalties, even on occasion the refusal of absolution.  Divine, civil, and philosophical judgments tell us that if we do not prevent a crime, its evil is imputed to us.  The mouth of Saint Paul says the same thing about our responsibility: in cases of necessity, he orders us to abandon the sinner to Satan through excommunication in order to bring the sinner to repentance.”  The moral rigorism of Gournay is evident in this exhortation to confessors.  Only a strict practice of repentance and restitution on the part of the sinner and of demanding scrutiny on the part of the confessor can make the act of sacramental confession the serious means of moral conversion it was instituted to be.

The vices of the French aristocracy are the object of attack in Of the Nothingness of the Average Courage of this Time and Of the Low Price of the Quality of the Nobility.  The primary vice of this social class is the absence of what should be their defining virtue: courage.  Gournay understands by courage the willingness to defend the weak which originally distinguished the nobility of the sword.  “Generous courage necessarily includes courtesy and benevolence, conjoined with a prudent use of courageous force.  It should not appear to vanquish the strong more than it lifts up the weak.  Among other reasons, this is because the vindication and protection of the weak is the very justification for the use of force and of its consequences.”  Even before the chivalric code, Plato had accurately defined the virtue of fortitude as “a prudent, tolerant expression of courage in order to realize what is just and helpful.”

According to Gournay, the traditional courage of the aristocracy has deteriorated into a cult of power for its own sake.  Rather than exercising its martial prerogatives on behalf of the oppressed, many nobles have become oppressors themselves by the use of violent power to advance their own interests or even whims.  “The first problem is the power and arrogance which flow from the sword hanging at the side of nobles.  Few of them manage to avoid becoming intoxicated by this power.  The second is a certain contagious illusion they pick up by imitation of others.  They start to believe that they are the important people, the eminent ones, and the leaders of a gang in court or in the provinces.  They usurp power, they strike a peasant or a simple bourgeois, they insult the first and worst-armed person they meet simply to have revenge—and any remonstrance concerning their behavior has little effect.  They make a scepter or rather a god out of power.”  This corruption of power into violent self-importance threatens the civil order, since it inaugurates lawlessness and civil wars motivated by little more than personal jealousy.

Gournay does not spare the poorer classes in her social critique.  In Of False Devotions Gournay criticizes those who believe that the performance of external devotions guarantees their salvation; it is only the cultivation of moral virtues in free cooperation with divine grace that can unite the human soul to God.  Gournay places unusual emphasis on two moral virtues involving self-reflection: integrity and probity.  “Among the virtues preeminent in rank are those of integrity and probity, because they give us a special attachment to the Creator and contain all the other considerations we owe the divine majesty.  The other virtues ally us primarily to other human beings.”  Given the self-reflective quality of these prime virtues, Gournay censures unbalanced devotionalism for its irrational, whimsical qualities.  The wish to please God through external gifts displaces the hard work of moral reform that should be the touchstone of the Christian life.

The stress on the cultivation of virtue over the pursuit of external devotion is not a uniquely Christian concern; Gournay cites Aristotle in her argument that the upright moral agent must carefully attempt to eradicate every vice.  “The Philosopher holds that a human being is vicious if he or she possesses just one vice and is not virtuous if he or she does not possess all the virtues.”  Certain Catholic popular devotions run the risk of deceiving their practitioners about the true state of their souls if they divert the devotee from moral scrutiny and repentance.  “These devourers of rosaries who called themselves devout are lying if they are covetous, envious, imposters, mockers, or slanderers, that is to say, the executioners of reputation, or if they assault some other interests of their neighbors.”  As is typical in Gournay’s scale of virtue, the mendacious destruction of another’s reputation emerges as the gravest vice.  When such vices are allied to the ostentatious practice of popular devotions, the vice is doubled by hypocrisy.

d. Philosophy of Education

Closely linked to her moral philosophy is Gournay’s educational theory.  In several treatises on the education of a prince, Gournay argues that the primary purpose of education is the formation of moral character.  The cultivation of virtue in general and of the virtues specific to one’s state in life constitutes the principal aim of instruction.  Humanistic in nature, the ideal education of the prince also entails extensive exercise in modern and classical languages.

Of the Education of the Royal Children of France outlines the primarily moral nature of authentic education.  In Gournay’s perspective, the pupil must be encouraged to cultivate moral virtue by precept and example.  The development of a moral personality is not guaranteed by nature or providence, since human beings possess a spacious free will.  “The salvation of the human race depends on what falls under its choice and free will.  Prevenient grace cannot force this choice although it does encourage the will to make the good choice and strengthens it when it consents.  Because of this we know that if we try to imprint on minds such qualities as faith, virtue, and reason—which we could otherwise call God’s commandments—the minds will conserve the impression of such qualities.”  Art must build on nature in developing the moral character of the pupil, because the adult’s exercise of freedom will be shaped by the moral dispositions encouraged in early age.  The assistance provided by God’s grace in adhering to the good should not be exaggerated, since grace does not overwhelm the moral agent’s exercise of personal freedom.  In her treatment of grace and freedom, Gournay clearly sides with her Jesuit allies against the emphasis of neo-Augustinian Catholics and Protestants on predestination.

The principal emphasis in this moral formation is the cultivation of the virtues.  Successful education should emphasize the development of virtues proper to the pupil’s future state in life as well as the development of the cardinal virtues.  Gournay’s plan for the education of the prince illustrates the mixture of generic and specific moral habits.  “Our muses or sciences should teach Prudence, Temperance, Fortitude and Justice.  Beyond that, they should teach liveliness, concentration, elegance, eloquence, good judgment, and restraint.  Because we are speaking of courtiers, they should also teach chivalry, courtesy, politeness, and a charming personal grace.”  The development of a moral character capable of leadership, diplomacy, and inspiration is the ultimate aim of such a royal pedagogy.

Since the cultivation of moral personality is the central aim of education, successful education depends to a great extent on the moral character of those chosen as teachers.  In the case of royalty, extraordinary care must be exercised in the choice of governors, teachers, and tutors.  Gournay sketches the ideal portrait of the governor chosen to supervise the education of the prince.  “We seek a governor who respects the laws of heaven and earth and who loves his country; a man of the ancient faith; a man who has never damaged the goods, honor, tranquility, or liberty of another; a man who prefers to undergo an injustice than to commit one; a man who is dutiful, well-mannered, charitable, free from pride and vanity; a disinterested man, who sees clearly and who acts in his own affairs as he advises others to do; a man whom one can believe when he is speaking about a friend, an enemy, or himself; who easily accepts obligations; whose words are without artifice, whose counsel is honest, whose resolution is constant; a man who has noble courage, diligent work habits, solid morals, a sense of moderation, even temper; someone whose self-possession protects him from the lure and applause of the world.”  This endless catalogue of ethical qualities for the ideal governor indicates the centrality of moral character for all educational personnel, since the moral personality of the prince develops in large part through emulation of those who instruct him.

In Institution of the Sovereign Prince, Gournay details the more humanistic side of her model of education.  In addition to moral education, the prince requires a literary formation.  Among the disciplines to be studied, Gournay underscores the importance of grammar, logic, philosophy, and theological doctrine.  She stresses the role of languages in this humanistic curriculum.  In addition to French, the pupil should learn Latin; ideally the pupil should master Latin as Montaigne did, by learning to speak it from the cradle.  Tutors should guide the pupil through Latin classics.  Also desirable is the study of Greek and Hebrew.  Not only will this literary formation provide the sovereign with cultural polish; it will permit him to understand more deeply the issues of polity and justice treated in depth by Holy Scripture and the classics.

This classical study will also reinforce the pedagogical effort to strengthen the pupil’s commitment to moral virtue.  A lifelong habit of serious study of the classics will encourage the monarch’s commitment to virtue.  The governed will imitate the virtue, or lack of it, in the rule and the recreation of the one who governs.  “It is necessary for a ruler to find his relaxation and delight in the muses; otherwise, he will surrender to a life of debauchery, luxury, gambling, or gossip….If he indulges in debaucheries, luxury, and gambling, one will see soon enough that his subjects will grow morally ill through the contagion.”  The humanistic initiation into the appreciation of classical literature and art complements the moral formation that might prove too austere without the allure of the muses.

e. Gender and Equality

The most celebrated of Gournay’s treatises defend the equality between the sexes.  Equality Between Men and Women argues that the current subordination of women to men is based on prejudice; only the lack of educational opportunities explains the difference in cultural achievement between the sexes.  Complaint of Ladies explores the roots of the misogyny which has reduced women to a state of servitude.

In Equality Between Men and Women, Gournay develops a cumulative argument from classical, biblical, and ecclesiastical arguments to demonstrate gender equality.  This catalogue of philosophical and theological authorities, as well as the historical achievements of women themselves, indicates that prejudice alone has caused the irrational denigration of women that has become the creed of contemporary society.

Among philosophers supporting gender equality, Plato holds pride of place.  “Plato, to whom no one denies the title of ‘divine,’ assigns them [women] the same rights, faculties, and functions in the Republic.”  The treatise also marshals citations from Aristotle, Cicero, Plutarch, Boccaccio, Tasso, and Erasmus on behalf of gender equality.  The historical achievements of Sappho, Hypatia of Alexandria, and Catherine of Siena, among other women, indicate the intellectual capacity of women.

Scripture and church history provide just as ample a catalogue of citations supporting the equality of the sexes and examples of women who held offices comparable to those held by men.  From the opening book of Genesis, Holy Scripture insists that both men and women are made in the divine image; thus, they are both capable of rational reflection and are both subjects of the same rights and duties.  Several women are named as authors of biblical texts: Anne, Mary, Judith.  Deborah served as a prophet, Judith as a warrior, Thecla as a coworker with Saint Paul.  Of special interest to Gournay is the status of Mary Magdalene, who is the first disciple commissioned to announce the news of Christ’s resurrection and who bears the ancient title of ‘Apostle to the Apostles.’  Sacred tradition often depicts her preaching to the masses in Provence.  The sacerdotal ministries of church governance and preaching thus appear to be open to women as well as men.  Gournay dismisses Saint Paul’s restrictions on the teaching and preaching activities of women in church as a simple precaution against the possible temptation caused by the view of women who are “more gracious and attractive” than men.

Giving women equal access to education will quickly overcome the misogynist burdens under which they currently labor.  Deprivation of education is the sole cause of the current gap between the sexes in the area of cultural achievement.  “If the ladies arrive less frequently to the heights of excellence than do the gentlemen, it is because of this lack of good education.  It is sometimes due to the negative attitude of the teacher and nothing more.  Women should not permit this to weaken their belief that they can achieve anything.”  The path to sexual equality in the future lies in the improvement of educational opportunities for women and in the dismantling of the misogynistic stereotypes which discourage women from even attempting cultural achievements.

Complaint of Ladies explores the depth of the misogyny which makes sexual equality such a distant, chimerical goal.  Gournay condemns the current social situation of women as one of tacit slavery.  “Blessed are you, Reader, if you are not of the sex to which one forbids all goods, depriving it of freedom.  One denies this sex just about everything: all the virtues and all the public offices, titles, and responsibilities.  In short, this sex has its own power taken away; with this freedom gone, the possibility of developing virtues through the use of freedom disappears.  This sex is left with the sovereign and unique virtues of ignorance, servitude, and the capacity to play the fool, if this game pleases it.”  Despite the clear philosophical, historical, scriptural, and ecclesiastical evidence for the dignity of woman and for her fundamental equality with man, the political and literary mainstream of French society continues to treat women as chattel.

Especially disturbing is the contempt with which the era’s misogynist literature treats women.  Gournay condemns the sarcastic dismissal of woman which characterizes so many of these texts.  “When I read these writings by men, I suspect that they see more clearly the anatomy of their beards than they see the anatomy of their reasons.  These tracts of contempt written by these doctors in moustaches are in fact quite handy to brush up the luster of their reputation in public opinion, since to gain esteem from the masses—this beast with several heads—nothing is easier than to mock so and so and [to compare them to] a poor crazy woman.”  If in principle men and women are clearly equal, in fact this equality will be difficult to practice in a society poisoned by a popular misogynist art, whose irrational fantasies require no further justification.

4. Reception and Interpretation

The reception of the works of Marie Le Jars de Gournay follows three distinct periods.  During her lifetime, Gournay’s writings attracted a large cultivated public.  Her polemical style and combative positions in the religious, political, and literary quarrels of the period made her a prominent essayist.  By the end of the seventeenth century, she had become virtually unread.  New editions of Montaigne had superseded her own; the antiquated quarrels over Ronsard or grace elicited little interest.  This oblivion lasted well into the twentieth century.  At the end of the century, Gournay’s works underwent a revival in literary and philosophical circles.  The major impetus for this new interest was the feminist effort to expand the canon of the humanities to include the texts of long-ignored female authors.  Gournay’s proto-feminist essays on gender equality constituted the focus of this revival.  In recent decades, they have become the object of numerous editions, translations, and commentaries.

The interpretation of the philosophy of Gournay remains largely tied to her work on the equality between the sexes and her attendant criticism of the social oppression of women.  While her pioneering work in gender theory merits such scholarly attention, it has tended to obscure her other philosophical concerns.  Gournay’s contributions to aesthetics, ethics, pedagogy, social criticism, and theology invite further discovery.  Gournay’s works have also suffered from her close association with Montaigne.  While her philosophy is clearly indebted to the mentor she reverently invokes as “the author of the Essais,” it differs from the more skeptical theories of Montaigne.  Whereas Montaigne often invokes a constellation of classical authorities to show their contradictions and to argue that many controversies have no certain solution, Gournay frequently invokes a catalogue of classical and biblical authorities to demonstrate the impressive consensus that exists among philosophical and theological authorities on a disputed topic and thus to identify the correct solution.  Gournay’s distinctive method of Catholic humanism, in which a flood of classical and ecclesiastical authorities is harmonized to prove the truth of a contested philosophical thesis, requires further scholarly analysis.

5. References and Further Reading

The translations from French to English above are by the author of this article.

a. Primary Sources

  • Gournay, Marie Le Jars de. Les avis, ou les Présens de la Demoiselle de Gournay (Paris: s.n., 1641).
    • [This is Gournay’s last and most complete edition of her works.  An electronic version of this work is available on the Gallica section of the website for the Bibliothèque nationale de France.]
  • Gournay, Marie Le Jars de. “Préface” aux Essais de Montaigne (Paris: Tardieu-Denesle, 1828): 3-52.
    • [Gournay’s preface explains her interpretation of the essays of Montaigne and the history of her relationship to Montaigne.  An electronic version of this work is available on the Gallica section of the website for the Bibliothèque nationale de France.]
  • Gournay, Marie Le Jars de. Apology for the Woman Writing and Other Works, ed. and trans. Richard Hillman and Colette Quesnel (Chicago: University of Chicago Press, 2002).
    • [This English translation of Gournay’s writings concerning gender features a substantial biography and bibliography.]
  • Gournay, Marie Le Jars de. Preface to the Essays of Montaigne, ed. and trans. Richard Hillman and Colette Quesnel (Tempe, AZ: Medieval and Renaissance Texts and Studies, 1998).
    • [This English translation of Gournay’s preface to Montaigne features a thorough discussion of the relationship between Montaigne and Gournay; Hillman’s scholarly notes also contextualize the argument of the preface and of the essays.]

b. Secondary Sources

  • Bauschatz, Cathleen M. “Marie de Gournay’s Gendered Images for Language and Poetry,” Journal of Medieval and Renaissance Studies, 25 (1995): 489-500.
    • [Bauschatz links Gournay’s philosophy of language and art to concerns for sexual difference.]
  • Butterworth, Emily. Poisoned Words: Slander and Satire in Early Modern France (London: Legenda, 2006).
    • [In a chapter devoted to Gournay, Butterworth studies Gournay’s preoccupation with the moral problem of slander.]
  • Cholakian, Patricia Francis. “The Economics of Friendship: Gournay’s Apologie pour celle qui escrit,” Journal of Medieval and Renaissance Studies, 25 (1995): 407-17.
    • [Cholakian underscores the differences between Montaigne and Gournay concerning the capacity of women to cultivate friendship.]
  • Deslauriers, Marguerite. “One Soul in Two Bodies: Marie de Gournay and Montaigne,” Angelaki: Journal of the Theoretical Humanities, 13, 2 (2008): 5-15.
    • [Deslauriers analyzes the multiple ways in which Gournay claims to be spiritually united to Montaigne.]
  • Dykeman, Therese Boos. The Neglected Canon: Nine Women Philosophers, First to the Twentieth Century (Dordrecht, Boston: Kluwer Academic, 1999).
    • [Dykeman’s introduction to several translated essays by Gournay provides a solid biography of Gournay and develops a philosophical interpretation of her work.]
  • Franchetti, Anna Lia. L’ombre discourante de Marie de Gournay (Paris: Champion, 2006).
    • [This erudite study of the later work of Gournay argues for the Stoic influences on Gournay’s moral philosophy and philosophy of language.]
  • Lewis, Douglas. “Marie de Gournay and the Engendering of Equality,” Teaching Philosophy 22:1 (1999): 53-76.
    • [Lewis analyzes the rhetorical and argumentative strategies used by Gournay in her defense of gender equality.]
  • McKinley, Mary. “An editorial revival: Gournay’s 1617 Preface to the Essais,” Montaigne Studies 8 (1996): 145-58.
    • [By comparing the 1595 and 1617 prefaces to Montaigne, McKinley demonstrates the changes in Gournay’s intellectual convictions over two decades.]

Author Information

John J. Conley
jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Anne Le Fèvre Dacier (1647—1720)

A distinguished classicist during the reign of the French king Louis XIV, Madame Dacier achieved renown for her translation of Greek and Latin texts into French.  Her translations of Homer’s Iliad (1699) and Odyssey (1708) remain monuments of neoclassical French prose.  In defending Homer during a new chapter of the literary quarrel between the ancients and the moderns, Dacier developed her own philosophical aesthetics.  She insists on the centrality of taste as an indicator of the level of civilization, both moral and artistic, within a particular culture.  Exalting ancient Athens, she defends a primitivist philosophy of history, in which modern society represents an artistic and ethical decline from its Hebrew and Hellenic ancestors.  A proponent of Aristotle, Dacier defends the Aristotelian theory that art imitates nature, but she adds a new emphasis on the social character of the nature that art allegedly imitates.  In her philosophy of language, she explores the nature and value of metaphor in evoking spiritual truths; she also condemns the rationalist critique of language which dismisses the fictional or the analogous as a species of obscurantism.  The Bible’s robust use of metaphor has established a literary as well as a spiritual norm for Christian civilization.  Against modern censors of classical literature on the grounds of obscenity, Dacier defends the pedagogical value of the classics, especially the epics of Homer, in forming the moral character and even the piety of those who avidly study them.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Aesthetics
    1. Theory of Taste
    2. Mimesis and Nature
    3. Theory of Language
    4. Moral Pedagogy of Literature
  4. Reception and Interpretation
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born in Preuilly-sur-Claise on August 5, 1647, Anne Le Fèvre was raised in the city of Saumur in the Loire region of central France.  Her father, Tanneguy Le Fèvre, was a professor of classical languages at a local academy.  Under her father’s tutelage, Anne Le Fèvre quickly learned Latin and Greek and demonstrated a precocious skill for the translation of the classics into French.  Her adolescent marriage in 1664 to the publisher of her father’s works, Jean Lesnier, rapidly deteriorated; the embittered spouses agreed to a permanent separation.

After the death of her father in 1674, Madame Le Fèvre Lesnier enjoyed the patronage of Pierre-Daniel Huet, a royal tutor to the French dauphin and the future bishop of Avranches.  A member of the Académie française, the scholarly cleric introduced her to the controversies surrounding Descartes in contemporary French philosophy.  Originally a supporter of Cartesianism, Huet would turn decisively against it in his Critique of Cartesian Philosophy (1689).  Huet encouraged Le Fèvre Lesnier’s move to Paris and commitment to a scholarly life devoted to the translation of the classics.

Published in 1674, her first translation, an edition of Callimachus, received the acclaim of fellow classicists.  The Duke of Montausier, the overseer of the dauphin’s education, then invited Le Fèvre Lesnier to contribute translations to the series ad usum Delphini (“for the use of the dauphin”) which he had initiated.  Her editions of Publius Annius Florus (1674), Sextus Aurelius Victor (1681), Eutropius (1683), and Dictys of Crete (1684) spread the fame of the series far beyond the court circles for which it had been originally designed.  Le Fèvre Lesnier also published independent translations of Anacreon (1681), Sappho (1681), Terence (1683), Plautus (1683), and Aristophanes (1684). The emergence of a provincial woman as France’s preeminent classicist made Madame Le Fèvre Lesnier a celebrity in the literary salons of Paris.  Padua’s Academy of Ricovrati elected her to membership in 1679.

Shortly after the death of her separated husband, Anne Le Fèvre Lesnier married fellow classicist André Dacier, a former student of her father, in 1683.  A year later, the couple retired to Castres in order to devote themselves to theological study.  Originally Protestants, both Monsieur and Madame Dacier decided to embrace Catholicism.  When they formally entered the Catholic Church in 1685, Louis XIV granted the couple a royal pension in recognition of their conversion.

In the following years, Madame Dacier published new translations of Plautus, Aristophanes, and Terence and collaborated with her husband on several translations, notably new French versions of Plutarch and Marcus Aurelius.  These translations of ancient Stoic authors reflected Madame Dacier’s sympathy for neo-Stoicism and her opposition to neo-Epicureanism in the philosophical debates of the period.  The literary skill and classical erudition of Madame Dacier earned her the praise of France’s most influential literary critic, Nicolas Boileau.

In 1699, Madame Dacier published her major work, a French translation of Homer’s Iliad.  Her version of Homer’s Odyssey followed in 1708. Widely acclaimed as both faithful translations and graceful examples of French prose, the books re-ignited the long-simmering querelle des anciens et des modernes.  Siding with the “ancients,” Dacier defended the superiority of classical literature, notably the epics of Homer, over the literary products of modern France.  Supporting the “moderns,” Antoine Houdar de la Motte published his own version of Homer in 1714, in which he radically altered the text to suit modern sensibilities and in which he criticized the stylistic and moral flaws of Homer compared with the poetry of modern France.  In the same year, Madame Dacier published her major treatise on the question: Of the Causes of the Corruption of Taste.  The work lambasted La Motte’s translation of Homer and provided a point-by-point refutation of his critique of antiquity.  The lengthy treatise also permitted Dacier to declare her philosophical allegiance to Aristotle on artistic questions and to present her own philosophy of art and language.

Even the clergy divided in this new chapter of the querelle.  Supporting La Motte, Abbé Terrasson claimed that with its superior knowledge of the world, due to the philosophy of Descartes and technological progress, modern French culture had produced a superior literature. Defending Dacier, Bishop Fénelon argued that classical literature remained superior to the uneven literary achievements of modern France.

When the Jesuit Jean Hardouin proposed a new system for interpreting Homer, Madame Dacier refuted it in her second major theoretical work: Homer Defended against the Apology of Father Hardouin, or the Sequel to the Causes of the Corruption of Taste (1716).  This treatise reconfirmed her commitment to a neo-Aristotelian theory of art and literary exegesis.  It also expanded the grounds for defending the moral and artistic superiority of ancient civilization.

Madame Dacier died at the Louvre on August 17, 1720.

2. Works

The works of Madame Dacier divide into two categories: translations from classical languages and polemical treatises.

The major translations cover the genres of history, drama, lyrical poetry, and epic.  The histories include translations of the history of Rome by Publius Annius Florus (70-140); the history of Rome by Sextus Aurelius Victor (320-390); the history of Rome by Eutropius (the fourth-century historian of Julian the Apostate); and the chronicle of the Trojan wars by Dictys of Crete (author of a highly fictionalized alleged eyewitness account of the war, known through a fourth-century Latin translation by Q. Septimius).  This emphasis on military and political history reflects the type of education deemed appropriate for France’s dauphin.

The dramas include translations of the comedies of Aristophanes (446-386 B.C.E.), Plautus (254-184 B.C.E.), and Terence (195-159 B.C.E.).  The lyrical poetry includes the epigrams of the Alexandrian poet Callimachus (310-240 B.C.E.), the verse of Anacreon (570-488 B.C.E.), and the love poems of Sappho (620-570 B.C.E.).  The alleged licentiousness of many of these authors appealed to the more libertine salons and sharpened the controversy over the appropriateness of a woman engaged in publication.

It was in the genre of the epic that Madame Dacier achieved her greatest fame.  Her translation of the Iliad in 1699 established her as a master of neoclassical French style and confirmed her reputation as a preeminent classicist.  Her preface to the Iliad, staunchly defending Homer and Hellenic civilization, helped re-launch the querelle des anciens et des modernes.  Her translation of the Odyssey in 1708 achieved similar acclaim from scholars and the general public.  Both works underwent numerous re-editions and were frequently used in Francophone secondary schools for courses in literature.

Madame Dacier wrote her two polemical treatises toward the end of her life.  On the surface, Of the Causes of the Corruption of Taste (1714) and Homer Defended against the Apology of Father Hardouin, or Sequel to the Causes of the Corruption of Taste (1716) are both occasional works.  Both books, however, transcend the immediate disputes of the quarrel over Homer; they permit Dacier to present a neo-Aristotelian theory of art, language, mimesis, and moral education.

3. Philosophical Aesthetics

The philosophical aesthetics developed by Madame Dacier appears primarily in her treatise Of the Causes of the Corruption of Taste [CCT].  Like other eighteenth-century philosophers, Dacier places the question of taste at the center of aesthetic investigation.  She considers a society’s degree of artistic taste to be linked to its degree of moral probity and political order.  In her normative judgments, Dacier praises the achievement of ancient Greece and judges modern France as decadent in comparison.  Declaring herself a partisan of Aristotle, Dacier defends the mimetic thesis that art imitates nature, but she redefines “nature” to include the psychology of the characters depicted and the predominant traits of the society mirrored in art.  Her philosophy of language defends the value of metaphorical speech against the rationalist charge of opacity.  For Dacier, classical literature possesses ethical as well as formal value inasmuch as it can encourage the formation of moral and even religious virtues in the character of the modern Christian reader.

a. Theory of Taste

For Dacier, taste is a central symptom of the general moral and political quality of society.  The capacity of a particular culture to produce and appreciate sublime works of art, especially literary works, indicates the culture’s degree of moral and civic maturity.  The decline of literary taste presages a decline in virtue among the youth who are exposed to mediocre art.  “If we tolerate false [artistic] principles spoiling the mind and judgment [of young people], there are no more resources left for them.  Bad taste and ignorance will finish off this work of leveling.  As a result, literature will be entirely lost.  And it is literature which is the source of good taste, of politeness, and of all good government” [CCT].  Dacier invokes Plato’s authority in defense of her thesis that civic virtue and vice are tied to the quality of the art and literature habitually diffused among the members of the polis.  “This is why Socrates wanted his fellow citizens to commit themselves entirely to the youth and to take great care to prepare and form good subjects for the republic” [CCT].  Through a process of empathetic imitation by its audience, great art, as exemplified by Homer’s epics, encourages the ascent of the moral, social, and political virtues central to civilization.

Fragile and fleeting, artistic taste can easily decline.  Dacier designates three principal causes of the corruption of taste: poor education, the ignorance of teachers, and the laziness and negligence of the pupils themselves.  Likewise, when a society abandons the humanist ideal of an educated public who reads and cherishes the classics in the original languages, the civic virtues nurtured by exposure to the classics will inevitably fade.

Dacier identifies two particular causes of the decline of literature and morality in contemporary France.  The first is the omnipresence of licentious literature.  “One factor contributing to the corruption of taste consists of these licentious shows which directly attack religion and morals.  Their soft and effeminate poetry and music communicates all of their poison to the soul and disables all the nerves of the mind” [CCT].  Not without irony, the translator of Plautus and Aristophanes condemns licentious theater for its weakening of intellectual and moral clarity among its habitual spectators.

The second cause, the vogue of sentimental novels, works a similar destruction of heroic virtue through its poorly constructed tales of romantic love.  “The other cause consists of these frivolous and sentimental works…these false epic poems, these absurd novels produced by ignorance and love.  They transform the greatest heroes of antiquity into bourgeois damsels.  They so accustom young people to these false characters that they can only tolerate true heroes when they resemble these bizarre and extravagant personages” [CCT].  A public sated with sentimental tales of seduction will have little capacity to understand, let alone practice, the heroic civic virtues represented by the characters of Homeric or Virgilian epic.

Dacier’s analysis of the decline of taste and the related decline of civic culture is inscribed in her primitivist philosophy of history.  The most perfect examples of literary style lie in the inspired books of the Bible.  “When I read the books of Moses and other sacred authors who lived before the time of Homer, I am not astonished by the great taste which reigns in their writings, since they had the true God as their teacher.  One senses that no human production could possibly reach the divine character of these writings” [CCT].

Although pagan, ancient Egyptian culture receives a similar panegyric.  “I see that geometry, architecture, painting, sculpture, astronomy, and divination flourished among the Egyptians only a few centuries after the great flood.  I see a people convinced of the immortality of the soul and of the necessity of a religion, a people who had a very mysterious and enigmatic theology and who built temples and who gave to Greece her very cult and gods.  When I see the ancient monuments which remain from this people, I cannot doubt that good taste must have also reigned in their writings, although this baffles me and I do not know where all of this could have come from” [CCT].

The culture of ancient Greece, in particular the epics of Homer, also miraculously resisted the tendency of civilization to decline artistically and morally since biblical times.  “I see in Greece all at once a stroke of genius.  I see a poet who, two hundred years after the Trojan War and against the degradation imprinted by nature into all the productions of the human mind, combines the glory of invention with that of perfection.  He gives us a sort of poem without any previous model, which he had imitated from no one, and which no one has been able to imitate since then.  This poem’s story, union and composition of its parts, harmony and nobility of diction, artful combination of truth and falsehood, magnificence of ideas, and sublimity of views have always caused it to be considered the most perfect work issuing from a human hand” [CCT].

The current disdain for Homer and other classical authors reflects the literary-cultural decadence affecting contemporary France.  The loss of the Renaissance humanist’s veneration of the classics indicates a moral and political, as well as artistic, decline for French society.  “Everywhere today there reigns a certain spirit [disdainful of the classics] more than capable of damaging literature and poetry.  This fact has already caused foreigners to reproach us that we are degenerating away from that good taste we had happily developed in the previous century” [CCT].  For Dacier, the only solution to this cultural decline is the neoclassical one: a renewed study of classical languages and literature, with a new literary effort to imitate classical authors in vernacular works and concomitantly an effort to renew political society through the imitation of the civic virtues exalted by Homer and similar Greco-Roman authors.

b. Mimesis and Nature

Throughout her polemical writings, Madame Dacier cites Aristotle’s Poetics as her primary authority for her thesis that art constitutes the imitation of nature.  An oft-cited secondary source for this mimetic theory is Horace’s Art of Poetry.  Against rationalists such as La Motte, Dacier insists that great art’s imitation of nature does not consist in the reproduction of what literally exists in the external physical world; rather, it mirrors the acts of the hidden soul and rightly incorporates mythology, hyperbole, and idealization into its portrait of the moral universe.

In imitating nature, literature must focus on what is true.  Even in writing fiction, the writer must so manipulate the characters and action that they acquire the qualities of verisimilitude.  “I am convinced that a writer writes the true more effectively than he or she does the false.  The mind struck by a real object feels it much more forcefully than if it were struck by an object it only creates by itself or that it does not believe to exist” [CCT].  Like all other artists, the poet must draw his or her truth from nature, even if the usual domain of the poet is the spiritual nature of the human soul in conflict rather than the physical landscape.

The poet’s imitation of nature is never the literal reproduction of preexistent physical or moral nature.  Embellishment of nature is often obligatory if the poet is to place into proper relief the character of his or her personages.  “The exceptional brilliance which the poet [Homer] has given to the valor of this hero [Achilles] has confused them [the critics of Homer].  They didn’t see that this exaggerated valor is there to bring out the nature of his character and not to hide his faults. Poets are like painters. They must make their hero more beautiful, as long as they always conserve the resemblance to the hero and they only add what is compatible with the basic character with which they have clothed their hero” [CCT].  In fashioning the hero who dominates the epic and tragic drama, the author inevitably eliminates and exaggerates certain details of human moral action in order to create a striking moral ideal.

To understand the legitimate freedom of the artist in the imitation of nature, it is crucial to grasp the distinction between history and poetry.  Whereas the historian depicts what actually happens, the poet can present the probable or the possible.  Whereas the historian focuses on the unique fact, the poet dwells on general human truths.  “History writes about only what has happened; poetry writes about what might have or must have happened, either necessarily or probably.  History reports on particular things, poetry on general things.  That is why poetry has greater moral value than does history.  General things interest all human beings while particular things are related only to one human being” [CCT].  In this neo-Aristotelian concept of poetic truth, the freedom of the artist is not unlimited.  Poetic license to embellish character or plot cannot trespass the limits of the probable.

The legitimate freedom of the poet can also be grasped by contrasting poetry with politics and other practical arts.  The truth expressed in the imaginative world of poetry differs from the truth sought in political judgment.  “Aristotle was right to say that ‘one must not judge the excellence of poetry as one judges the excellence of politics, nor even the excellence of all the other arts.’  Politics and all the other arts seek the true or the possible.  Poetry seeks the astonishing and the marvelous as long as it does not clearly shock the sense of what is probable” [CCT].

Even the other fine arts do not enjoy the freedom proper to the poet in his or her evocation of nature inasmuch as they focus their imitation of nature on specific, external objects.  “In fact, all the other imitations, those of painting, sculpture, architecture, and all the other arts, aim at the imitation of only one thing” [CCT].  Literature alone imitates the universe of the human moral agent; the legitimate license of the poet flows from the challenge of this elusive, spiritual object of mimesis.

In her refutation of La Motte and other critics of Homer, Dacier defends Homer’s use of mythology and other fictional devices in his presentation of the character of Achilles.  In particular, she defends the episode of Achilles with the Phoenix, which La Motte had dismissed as a literary absurdity.  Dacier contends that the dialogue with the Phoenix helped to enhance the moral character of the flawed hero.  “No one is more convinced than I am that everything which exists in nature is not good to be depicted just because it exists.  But I believe that what the Phoenix says [in this passage disputed by La Motte] is not in the nature of the things one should not depict.  In all times and in all nations…images depend on customs and on ways of thinking.  What Homer is doing here…is still quite natural and quite appropriate to show Achilles’s tenderness.  This flows quite logically from the tenderness which the Phoenix just showed him.  It even serves to heighten the grandeur of Achilles.  What kind of child could this be who would have his tears washed away by someone like the Phoenix, son of a king?” [CCT]  The imitation of nature depends on psychological and social context.  The depiction of a mythological character like the Phoenix helps to reveal the positive moral traits of Achilles, in particular his tenderness and his king-like dignity.  Thus, the fact that such a fictional character does not exist in physical nature does not eliminate its usefulness for illuminating the moral nature of the epic hero.  The presence of the Phoenix is also justified, in a similar manner, by the cultural context of the poem’s genesis and setting.  The dialogue between the Phoenix and the warrior is perfectly logical within the religious presuppositions of the ancient Greek world.  It is this world, not the more skeptical world of eighteenth-century France, which art must imitate in Homer’s poetry.

c. Theory of Language

Like her mimetic theory of art, Dacier’s theory of language contests the rationalist thesis that ideal speech provides a clear one-to-one correspondence between a particular object and its linguistic signifier.  Dacier insists on the necessity and value of metaphorical speech, even outside the domain of poetry.

Metaphorical speech often communicates truths which cannot be expressed by more literal speech.  The frequent use of analogy is necessary for the effective communication of moral truths which elude reduction to straightforward description.  “To depict well the objects of which one speaks, there is no method more certain than to provide images by comparison.  Does poetry alone use it?  Doesn’t eloquent oration use it just as much?  Doesn’t God use it?  Aren’t the divine Scriptures full of it?  Didn’t Our Lord use it again and again in his discourses?” [CCT]  The repeated use of metaphor in the Bible itself confirms the propriety of the recourse to metaphor in various types of religious and secular speech.

Dacier mocks the rationalists like La Motte who condemn metaphorical speech as a species of obscurantism.  “Should we say, like these literalist minds, that these [biblical] comparisons illuminate nothing and that it would have been better for the Holy Spirit to have made a plain depiction of these objects than to have had recourse to these misleading similarities?…Should we be so sure that these comparisons are imperfect and that they only serve to confuse matters rather than to clarify them?…Doesn’t one sense the awful impiety of such a position?  It is not without reason that Scripture calls impiety ignorance” [CCT].  The effort to eliminate metaphorical speech in favor of more literal language reflects the incapacity to grasp the moral realities and religious mysteries only communicated through elaborate simile.  The conceits of Scripture provide the inspired models for this metaphorical use of language to evoke the spiritual.

Rather than being inferior to clear propositional language in revealing the truth, poetry is actually more powerful than philosophy.  Homer reveals the capacity of poetry to unveil the true through the use of analogy.  “No poet has been more successful than him [Homer] in depicting objects by similarity.  Could the most philosophical discourse give a stronger and livelier picture of these objects than the images he draws in the mind through these comparisons?” [CCT]  Rather than representing confusion, metaphorical speech in the hands of a master like Homer evokes complex spiritual truths which more prosaic speech cannot express.

Dacier also defends metaphorical speech because it has the power to touch the emotions and will of the reader as well as his or her intellect.  Rather than conveying simple information, metaphorical rhetoric possesses a persuasive power absent in more literal forms of communication.  The value of metaphorical speech in the moral and religious realms lies in its capacity to shape the action of the moral agent and to convert the sinner.

d. Moral Pedagogy of Literature

For Dacier, the study of classical literature is essential to shape the moral character of the members of society, especially its governing elite.  Against the criticism that both Greek culture and literature are marked by immorality, Dacier defends the moral probity of classical Greek authors and their capacity to foster virtue in their readers.  Against the theological argument that the Greek pantheon and the authors it inspired feature immoral deities, Dacier claims that the theology of classical Greek poetry is closer to that of Christian monotheism than its modern critics would admit.

Homer epitomizes the moral value of the Greek classics.  “No philosopher has given greater precepts of morality than has Homer….Everyone [except modern critics] has recognized that the Iliad and the Odyssey are two quite perfect tableaux of human life.  With admirable variety, they represent everything that is worthy of praise or blame, that is useful or pernicious, in a word all the evils which madness can produce and all the goods which wisdom can cause” [CCT].  As evidence of Homeric passages promoting virtue, Dacier cites the prudence and wisdom apparent in King Nestor’s discourses in the Iliad and the Odyssey.

According to Dacier, La Motte and other modern critics of Homer have seriously misunderstood the moral structure of Homer’s epics and the classical Greek society they mirror.  In particular, they have misconstrued the moral nature of the epic hero Achilles.  Rather than being a model for moral imitation by the reader, Achilles is in fact a warning against the destructiveness of the vices of vanity, temerity, and arrogance with which Homer has clothed his character.  Dacier cites her philosophical guide Aristotle in this interpretation of Achilles as a salutary warning against vice.  “Did Aristotle ignore the continual emotional eruptions of Achilles?  Where did Aristotle consider them a virtue?  Undoubtedly, he made us see that the character of Achilles must represent not what a man does in anger, but rather everything anger itself can do.  Consequently, he considered this hero the brutal opposite of the man who does good” [CCT].  The modern dismissal of classical Greek literature as morally damaging is based upon such basic misinterpretations of the moral character of the epic and tragic hero.

Dacier defends the theological as well as the moral probity of Homer’s epics.  Against the common Christian charge that classical literature features a pantheon of violent, vicious deities, she insists that Homer actually provides a portrait of God and of the human soul which accords with the biblical prophets and apostles on numerous points.  “Homer recognizes one superior God, on which all the other gods are dependent.  Everywhere he supports human freedom and the concept of a double destiny so necessary to harmonize this freedom with predestination; the immortality of the soul; and punishments and rewards after death.  He recognized the great truth that human beings have nothing good which they have not received from God; that it is from God that comes all the success in what they undertake; that they must request this happy outcome by their prayers; and that the misfortune which occurs to them is called down by their folly and by the improper use they make of their freedom” [CCT].  Given their sound theology and moral psychology, the epics of Homer and similar classical Greek works of literature can nurture the theological as well as the moral virtues essential for a Christian political order.

4. Reception and Interpretation

Since the late seventeenth century, Madame Dacier has been recognized as a preeminent classicist and translator.  The essayist Madame de Lambert praised her contemporary for having contradicted anti-intellectual stereotypes of women.  “I esteem Madame Dacier infinitely.  Our sex owes her a great deal.  She has protested against the common error which condemns us to ignorance.  As much from contempt as from an alleged superiority, men have denied us all learning. Madame Dacier is an example proving that we are capable of learning.  She has associated erudition with good manners.” [New Reflections on Women, 1727]

Dacier received philosophical as well as literary recognition.  Gilles Ménage dedicated his History of Women Philosophers (1690) to Dacier under the accolade of “the most erudite woman in the present or the past.”  In his Philosophical Dictionary (1764), Voltaire argued that “Madame Dacier was no doubt a woman superior to her sex and she has done a great service to letters.”  Countless encyclopedias of women authors cite Dacier for her erudition and scholarly productivity but her philosophical reflection has received comparatively little attention.

Recent scholarship has continued this literary rather than philosophical focus on Dacier.  Garnier (2002), Hayes (2002), and Moore (2000) examine questions of translation during the querelle des anciens et des modernes.  Bury (1999) studies Dacier in the context of the role of women intellectuals in the period.

The challenge for a philosophical interpretation of Dacier is to analyze the theories of art, mimesis, language, and education developed in her more theoretical works.  There is also the historical challenge to explore the neo-Aristotelianism she defended in the aesthetic rather than the customary metaphysical realm.  Her role in diffusing Stoic philosophy through the translations of Plutarch and Marcus Aurelius she co-authored with her husband also merits further study.

5. References and Further Reading

All translations from French to English are by the author of this article.

a. Primary Sources

  • Dacier, Anne Le Fèvre, Des causes de la corruption du goût. Paris: Rigaud, 1714.
    • A digital version of this work is available online at Gallica: Bibliothèque numérique on the webpage of the Bibliothèque nationale de France.
  • Dacier, Anne Le Fèvre, Homère défendu contre l’apologie du Père Hardouin, ou la suite aux causes de la corruption du goût. Paris: Coignard, 1716.
    • A digital version of this work is available online at Gallica: Bibliothèque numérique on the webpage of the Bibliothèque nationale de France.

b. Secondary Sources

  • Bury, Emmanuel, “Madame Dacier,” in Femmes savantes, savoirs de femmes: Du crépuscule de la Renaissance à l’aube des Lumières, ed. Colette Nativel. Geneva: Droz, 1999: 209-20.
    • The author analyzes Dacier in terms of leading women intellectuals of early modernity.
  • Garnier, Bruno, “Anne Dacier, un esprit moderne au pays des anciens,” in Portraits de traductrices, ed. Jean Delisle. Ottawa and Artois: Presse universitaire d’Ottawa and Artois Presse universitaire, 2002: 13-54.
    • The author focuses on Dacier’s methods of translation.
  • Hayes, Julie Candler, “Of Meaning and Modernity: Anne Dacier and the Homer Debate,” in Strategic Rewriting, ed. David Lee Rubin. Charlottesville, VA: Rookwood, 2002: 173-95.
    • The author studies Dacier’s principles of translation and role in the querelle des anciens et des modernes.
  • Moore, Fabienne, “Homer Revisited: Anne Le Fèvre Dacier’s Preface to Her Prose Translation of the Iliad in Early Eighteenth-Century France,” Studies in the Literary Imagination, 33, 2 (2000): 87-107.
    • The author analyzes the moral theories as well as the translation methods of Dacier.

Author Information

John J. Conley
E-mail: jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Agnès Arnauld (1593—1671)

An abbess of the Jansenist convent of Port-Royal, Mère Agnès Arnauld developed an Augustinian philosophy shaped by the mystical currents of the French Counter-Reformation.  Her philosophy of God depicts a deity who is radically other than his creatures.  Only a negative theology, a theology of what God is not, can explore the divine attributes.  In her ethical theory, Mère Agnès contextualizes moral virtue by analyzing those religious virtues proper to a nun in a contemplative order.  Influenced by the mystical école française, the abbess stresses self-annihilation as the summit of the nun’s life of virtue.  In her legal writings tied to the reformation of the convent, the abbess defends the spiritual freedom of women.  Women are to enjoy vocational freedom, the freedom to pursue education, and the freedom to hold opinions on disputed theological questions.  Similarly, women are to enjoy substantial freedom in the exercise of their authority as superiors of convents.  During the persecution of the convent, Mère Agnès developed a moral code of resistance to abuses of power.  She details the conditions under which cooperation with illegitimate commands of civil or ecclesiastical authority could be tolerated or rejected.

Table of Contents

  1. Biography
  2. Works
  3. Philosophical Themes
    1. Negative Theology
    2. Monastic Virtue Theory
    3. Law and Freedom
    4. Ethics of Resistance
  4. Interpretation and Relevance
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Biography

Born on December 31, 1593, Jeanne Arnauld was the third daughter of Antoine Arnauld the Elder and Marie Catherine Marion Arnauld.  From birth, the prominent family of jurists had designated the infant for an abbacy in a convent.  Through negotiations with King Henri IV and a fraudulent transaction with the Vatican, in which documents attesting the candidate’s age were falsified, her maternal grandfather Simon Marion had Jeanne appointed the abbess of the Benedictine convent of Saint-Cyr in 1599.  Assuming her religious name of Mère Catherine-Agnès de Saint-Paul (commonly known as Mère Agnès), the infant abbess took with relish to the liturgical offices and other practices of the monastery.

As her elder sister’s reform of Port-Royal became increasingly stormy, Mère Agnès devoted her time to supporting Mère Angélique in the work of Port-Royal’s transformation.  Sealing her commitment to monastic reform, Mère Agnès renounced the abbacy of Saint-Cyr in 1610, was clothed in the Cistercian habit of Port-Royal in 1611, and pronounced her vows as a member of the community in 1612.  Mère Agnès assisted her sister in governing the burgeoning convent through a series of major offices: mistress of novices, subprioress, and vicar abbess.

During the decade of the 1610s, Mère Agnès emerged as one of the convent’s leading spiritual directors.  Her extensive correspondence reveals the eclectic influences on her thought: the Jesuit Jean Suffren; the Capuchin Archange de Pembroke; and the Feuillant Eustache de Saint-Paul Assaline.  François de Sales influenced the characteristic moderation expressed in Mère Agnès’s judgments.  On issues of gender and mystical states, the central reference was Teresa of Avila.  Her Path of Perfection, Interior Castle, and Autobiography are repeatedly cited.

In the 1620s, the convent became more turbulent.  With the transfer of the convent from the rural valley of the Chevreuse to the Parisian Saint-Jacques neighborhood in 1625, the convent came under the influence of new Oratorian chaplains, notably Pierre de Bérulle and Charles de Condren.  A disciple of the Platonist Pseudo-Dionysius, Bérulle encouraged an apophatic mysticism, which stressed the incapacity of the human mind to know God through image or concept.  Condren emphasized complete abandonment to God’s will, climaxed by self-annihilation.  During the Oratorian ascendancy (1625-1636), Sébastien Zamet, an episcopal overseer of the convent and an ally of the Oratorians, pushed the convent in a less austere direction.  Liturgical offices became more complicated, church decorations became more sumptuous, and nuns were encouraged to share their latest mystical insights with devout laity in the front parlors.  The original reformers quietly fumed at what they considered a regression toward conventual decadence.

During the Oratorian ascendancy, Mère Agnès composed a small treatise, Private Chaplet of the Blessed Sacrament, under the direction of Condren.  Honoring Christ in the Eucharist, each of the sixteen stanzas devoted itself to one of the sixteen centuries elapsed since the Last Supper.  At Condren’s suggestion, the pious litany was expanded; the nun explained the meaning of the various apophatic titles ascribed to God. [An apophatic theology considers God to be ineffable, and it attempts to describe God in terms of what God is not.]  As the Chaplet quietly circulated among the nuns and lay benefactors of Port-Royal, a crisis erupted in 1633.  Octave Bellegarde, another episcopal supervisor of Port-Royal who disapproved of Zamet’s reform and of the Oratorians’ speculative mysticism, denounced the pamphlet as heretical and temerarious.  In June 1633, a committee of the theology faculty of the Sorbonne condemned Mère Agnès’s text as destructive of morals because of its stress on passive abandonment to God.  The Jesuit Étienne Binet seconded the condemnation, while the Jansenist Jean Duvergier de Hauranne, abbé de Saint-Cyran, vigorously defended the orthodoxy of the treatise.  The Vatican’s halfhearted intervention into the pamphlet war pleased neither party.  Mère Agnès’s text was withdrawn from circulation, but neither was its theology condemned nor was it placed on the Index of Forbidden Books.

During her first abbacy over Port-Royal (1636-42), Mère Agnès restored the convent to the austerity of the Angelican reform.  Her relationship with the convent’s new chaplain, however, proved less than amicable.  A close friend and disciple of Cornelius Jansen, bishop of Ypres and Louvain theologian, Saint-Cyran imported the radical Augustinian theology of Jansen into the convent.  This theology’s emphasis on practical morality and the value of occasional deprivation of the sacraments clashed with the abbess’s more exuberant mystical piety.  By the time of Saint-Cyran’s imprisonment by Richelieu (1638-43), however, Mère Agnès had become a partisan of the Jansenist movement and intensified the convent’s cult of Saint-Cyran through the circulation of his letters and conferences.

At the end of her abbacy, Mère Agnès was appointed the convent’s mistress of novices.  The nun flourished in this role of spiritual counselor, both through conversation with the novices and through an extensive correspondence.  Her letters to Jacqueline Pascal on the proper moment for entering Port-Royal express the prudence and moderation for which the nun was renowned: “Our Lord wants to purify you by this delay because you have not always desired it.  It is necessary to have a hunger and thirst for justice to expiate the disgust one once had for this vocation in earlier times.  Saint Augustine wonderfully describes this delay of God’s grace in the souls of those who desire the abundance of God’s grace, which God has postponed [L; to Jacqueline Pascal; February 25, 1650].”  Through her correspondence, the nun also participated in the philosophical and theological controversies of the day swirling around the publication of Antoine Arnauld’s Frequent Communion and Jean Brisacier’s Jansenism Confounded.

When Mère Agnès assumed her second abbacy of Port-Royal (1658-1661), the controversy over Jansenism had erupted into a crisis.  The French throne demanded that every priest, nun, and teacher in the realm sign a statement assenting to the Vatican’s condemnation of five heretical propositions found in Jansen’s massive Augustinus (published posthumously in 1640).  Using an ingenious theological distinction, Antoine Arnauld the Younger argued that members of the church were only bound to assent to church judgments of droit (concerning faith and morals); they were not bound to accept church judgments of fait (empirical fact).  The first type of judgments was essential to the church’s mission of salvation, whereas the second was not.  In the “crisis of the signature,” the Jansenists were willing to assent to the condemnation of the five theses concerning grace and freedom, but they could not assent to the erroneous judgment that Jansen had supported such heresies.  Mère Agnès indicates her refusal to give an unreserved signature to the controversial document.  “The church is attacked in truth and charity, the two columns that support it.  This is what they are trying to destroy by this unfortunate signature, which we would offer against the truth and thereby destroy the charity we should have for the dead as well as the living.  We would be subscribing to the condemnation of a holy bishop who never taught the heresies they impute to him [L; to Madame de Foix; December 10, 1662].”

As the 1660s progressed, Mère Agnès witnessed the progressive intensification of the persecution of the convent.  The convent school and novitiate were closed; the chaplains were expelled.  Foreign nuns hostile to Jansenism were imported to govern Port-Royal, now surrounded by an armed guard.  The most recalcitrant nuns, including Mère Agnès, were exiled to foreign convents.  Mère Agnès herself was placed with the Visitation nuns; she refused to sign the controversial statement although some of her own nieces in the convent eventually yielded.  By the end of the 1660s, a truce was arranged to resolve the growing scandal of an entire convent under interdict.  In 1669, the “Peace of the Church” permitted the reopening of the convent, the resumption of liturgical life, and the reopening of the convent school and novitiate.  Several uncharacteristically placid years descended on Port-Royal.

Mère Agnès Arnauld died on February 19, 1671.

2. Works

A prolific author, Mère Agnès was one of the few Port-Royal nuns to see her works published during her lifetime.  Circulating first as a devotional pamphlet, Private Chaplet of the Blessed Sacrament (1626) caused an international dispute over its controversial apophatic approach to the divine attributes.  Louvain, the Jansenists, and the Oratorians defended the work, while the Jesuits and the Sorbonne opposed it.  Working in collaboration with Mère Angélique Arnauld and Antoine Arnauld, Mère Agnès was the principal author of the Constitutions of Port-Royal (1665), the legal framework for the Angelican reform of the convent and a theological text marked by the abbess’s Oratorian insistence on the annihilation of oneself.  Her Image of a Perfect and an Imperfect Nun (1665) provides the fullest exposition of her virtue theory.  The contemplative virtues central to the monastic life, especially the spirit of adoration, are stressed; to be authentic, virtue must empty itself of all self-interest.  Her accent on the intellectual nature of religious contemplation provoked a new controversy.  Martin de Barcos (1696) and Jean Desmarets de Saint-Sorlin (1665) criticized her approach as too intellectualist; Pierre Nicole (1679) defended her use of reason in meditation.  The Spirit of Port-Royal (1665) underscores self-annihilation in its treatment of the spiritual character of the convent community.

The posthumously published works of Mère Agnès also make a singular contribution to the philosophical and theological canon of Port-Royal.  An exercise in moral casuistry, Counsels on the Conduct Which the Nuns Should Maintain in the Event of a Change in the Governance of the Convent (1718) tackles the problem of moral cooperation with evil as it analyzes which actions would be legitimate and illegitimate in obeying the civil and ecclesiastical authorities who were persecuting the convent.  The two-volume Letters of Mère Agnès Arnauld, abbess of Port-Royal (1858) reflects the Augustinian axis of the abbess’s philosophy.  She repeatedly refers to the texts of Saint Augustine himself and modern Augustinian writers such as Jansen, Saint-Cyran, Antoine Arnauld, and Teresa of Avila in justifying her positions on spiritual government and theological controversy.

3. Philosophical Themes

The philosophical reflection of Mère Agnès Arnauld follows two primary avenues: philosophy of God and moral philosophy.  Influenced by the apophatic theology of the Oratorians, her philosophy of God stresses God’s alterity (otherness) and the incapacity of human concepts to penetrate the divine essence.  Her moral philosophy develops a theocentric account of the virtues central to the monastic life.  It also presents a casuistic analysis of the permissible and impermissible modes of cooperation with the persecutors of Port-Royal.

a. Negative Theology

In the Private Chaplet of the Blessed Sacrament [PC], Mère Agnès provides the most substantial expression of her apophatic theology.  A devotional treatise written in praise of Christ’s presence in the Eucharist, the Private Chaplet stresses the negative attributes of God disclosed in the eucharistic Christ.  The adorer of the Eucharist cannot penetrate the essence of the godhead, affirmed more accurately by terms expressing what he is not than by those expressing what he is.

A series of titles express this divine unknowability.  God is inaccessible. “He remains in himself, letting creatures remain in the incapacity to approach him [PC no.11].”  God is incomprehensible.  “He alone knows his ways.  He justifies to himself alone the plans he has for his creatures [PC no.12].”  God is entirely sovereign.  “He acts as the first cause without any subordination to the ends he has given himself [PC no.13].”  Other negative divine attributes include illimitability, inapplicability, and incommunicability.

Even the positive attributes ascribed to God receive an apophatic reinterpretation.  God’s holiness is entirely other than the alleged holiness of certain human creatures.  “The company God wants to keep with humanity is separate from it.  He resides only in himself.  It is not reasonable that God should approach us because we are only sin [PC no.1].”  The existence allegedly shared by both God and creatures is illusory.  Divine existence only manifests the non-being of creatures, especially peccatory human creatures.  “God is everything he wants to be and makes all other beings disappear.  As the sun blots out all other light, God exists simply to exist [PC no.4].”  The analogy of being disappears in this exaltation of divine alterity.

Throughout her writings, Mère Agnès emphasizes the rupture between God and human beings.  Analogical presentations of the divine attributes are inevitably anthropomorphic projections of human attributes into the divine essence.  As in the act of adoration before the Eucharist, the primary act of metaphysical affirmation of God is the adorer’s humble recognition of his or her utter incapacity to imagine or name the magnum mysterium that is the cause and the end of cosmic and human existence.  Only the language of negation and alterity can prevent both piety and philosophical reflection from deteriorating into imaginary projection.

b. Monastic Virtue Theory

In Image of a Perfect and an Imperfect Nun [IP], Mère Agnès analyzes the moral virtues proper to a nun committed to a strictly cloistered community.  Her distinction between the perfect and imperfect is not the one between virtue and vice.  The dividing line between authentic virtue and its subtle counterfeits lies in the difference between theocentric and anthropocentric postures of the will.

The monastic virtue of reverence illustrates the difference.  Both perfect and imperfect nuns practice their external obligations of divine worship and of reverence for their superiors.   The perfect focus on God alone, ignoring other creatures “as if they did not exist [IP, 7].”  The imperfect, however, suffer from a vacillating attention that “desires something other than God and that fears losing something other than God that pleases them [IP, 10].”  This anthropocentric turning on oneself corrupts the virtue that should be purely focused on God.

Other monastic virtues exemplify the split between anthropocentric and theocentric versions of virtue.  Perfect submission to the divine will accepts the periods of aridity and desolation which characterize spiritual maturation.  Imperfect submission, however, bridles at such deprivation and clings to sensible consolations.  Perfect zeal desires nothing other than the glory of God.  Imperfect zeal becomes fascinated with the external means used to glorify God and seeks recognition for its efforts.  Perfect repentance firmly renounces all sin and seeks solitude to reform one’s life.  Imperfect repentance vacillates and cannot give finality to its vague, contradictory desires for reform.  Recalling Pascal, Mère Agnès describes the vacillation of the imperfect soul: “Her mind is like a reed shaken by the wind, which makes it turn now this way and now that [IP, 53].”

This anti-anthropocentric account of virtue ultimately celebrates the annihilation of the self in the perfect practice of the monastic virtues.  Authentic humility entails recognition of one’s utter dependence on God for the least moral action.  “It is on this incapacity to perform the least good and to avoid the least evil without God’s help that the true nun establishes the unshakable foundation of humility [IP, 94].”  Similarly, authentic poverty acknowledges one’s utter non-existence in the face of God.  “It is the knowledge that she has nothing that was hers before she was created out of nothing, especially since the sin of Adam, who made all humanity worthy of not only losing the goods of heaven but of losing the goods of earth as well [IP, 100].”  Clearly influenced by the Oratorian spirituality of annihilation, the abbess depicts perfect virtue as a collapse of the moral agent into the divine will.

At the apex of the monastic virtues lie the contemplative virtues of solitude and adoration.  Authentic solitude permits the nun to recognize her utter uselessness in the face of God’s grandeur.  “God reduces us to be totally useless so that we might experience what the Prophet says: ‘Since the Lord is God, he has no need of our goods.’  This is to say that no matter how excellent our works may be, they provide no benefit to him; they are only advantageous for ourselves [IP, 148].”  In the act of adoration, the perfect nun experiences this self-annihilation in its fullest; she also discovers the source of this abolition of the human self in God’s operation of grace in the cross of Christ.  “She hears the voice of her Savior, who commands her to announce his death through her voluntary death to all things and to herself until he comes, which is to say, until she dies in her body.  He further tells us to find her glory and her rest only in the cross, in humiliation and privation of what she loves.  She should do so out of love for the one who dispossessed himself of his own glory, who annihilated himself, and who died for her salvation [IP, 160].”

This account of virtue is Augustinian in its stress on the utter necessity of grace for the performance of any moral action.  The “natural” virtues of prudence, fortitude, temperance, and justice are notable by their absence since they are illusory manifestations of pride.  The account is Oratorian inasmuch as it stresses the annihilation of the self as the key trait of the moral agent perfectly united to God in the practice of virtue.  It is also contemplative inasmuch as it integrates the practice of the moral virtues into the gaze of the adorer who knows through speculative experience and divine illumination how radically all good dispositions and good actions are caused by the sovereign godhead.  This architectonic contemplative gaze simultaneously recognizes the utter nothingness of the human agent distorted by sin and concupiscence.

c. Law and Freedom

As the principal author of the Constitutions of the Monastery of Port-Royal [CM], Mère Agnès crafted the basic legal structure for the reformed convent.  The piecemeal reforms effected by her sister Mère Angélique would now be embedded in a legal document recognized by ecclesiastical authority.  In the Constitutions, Mère Agnès also sketches her philosophy of freedom and rights in a gendered key.  The freedom of women to pursue a vocation, to develop a theological culture, and to exercise limited self-government is affirmed.  In particular, the authority of the convent’s abbess to govern and instruct her nuns without external interference is underscored.

Rooted in the Angelican reform, the Constitutions emphasize the vocational freedom of women.  The convent will only accept women who have indicated their desire to pursue a monastic calling free of parental pressure.  “We should not admit any girl if she is not truly called by God.  She should show by her life and actions a true and sincere desire to serve God.  Without this we should never admit anyone for any other reason, even when it is a question of the intelligence, the wealth, or the noble title the candidate might bring [CM, 54].”  To emphasize this vocational freedom, the Constitutions abolish the dowry requirement, long traditional for choir nuns in Benedictine and Cistercian convents.  “If a poor but excellent girl, clearly called by God, presents herself for admission, we should not refuse her, although the convent would be heavily burdened.  We would then hope that God who sent her would feed her.  We should not be afraid to make such commitments as long as we choose souls carefully and only accept souls rich in virtue instead of temporal advantages [CM, 74].”  Although this policy of vocational freedom faithfully followed the canon law of the church, it shocked French aristocratic opinion, long accustomed to placing unmotivated widows and surplus daughters in convents through the gift of a dowry.

Similarly, the education of women in Port-Royal’s convent school was to respect this vocational freedom.  The school was to accept only pupils whose parents had not already designated them for the married or cloistered life.  “We will only accept those girls whose parents offer them to God in indifference: that is, indifference as to whether they have decided to become nuns or whether they have decided to return to the world [CM, 99].”  For Mère Agnès, the major purpose of the convent school was to permit the pupils to discern their personal vocations through prayer, sacramental life, and dialogue with the teaching nuns.

The Constitutions also stress the spiritual freedom of the individual nun in her times of personal prayer.  In a period when many religious orders minutely prescribed methods of meditation, Mère Agnès insisted on the spontaneity and freedom which a nun must enjoy as she advances in prayerful maturity.  “Saint Benedict’s intention was that we should give the Holy Spirit room and time to stir up in us the spirit of meditation, which consists in a sincere desire to belong to God and to do so in purity and compunction of heart….True meditation is a celestial gift and not a human one.  It is the Holy Spirit praying for us when he makes us pray [CM, 43].”  Like virtue, prayer is theocentric in its very causation.  A certain illuminism emerges in this Augustinian account of prayer.

To exercise this contemplative freedom, the nun must develop an extensive theological culture.  In a period when personal meditation on Scripture was still considered suspect, Mère Agnès stipulates a comparatively wide range of theological texts to be studied by the nuns.  In addition to biblical reading, nuns are to meditate on works from the patristic period (Augustine, the desert fathers, Dorotheus, and Bernard of Clairvaux) and the modern period (François de Sales, Louis of Grenada, and Teresa of Avila).

The nuns are also to enjoy limited self-governance.  The abbesses are to be elected by the nuns meeting in chapter.  The term of office was now to be fixed at three years, renewable for one additional term.  Although the bishop appoints a clerical overseer for the convent, the overseer is to be chosen from a list of three names presented by the abbess.  Similarly, the abbess is to exercise the right of approval for the chaplains and confessors who serve the convent.  The authority of the abbess in the reformed convent is especially pronounced.  She is to serve as the nuns’ principal spiritual director and to enjoy an extensive teaching role.  She is to provide lectures commenting on key monastic texts, such as the Rule of Saint Benedict.  In the conférence, one of the reformed Port-Royal’s creations, the abbess is to field the questions of her fellow nuns on both practical and speculative issues troubling the convent.

So strong was the reformed convent’s accent on theological culture and debate that critics derided the Port-Royal nuns as théologiennes.  In Mère Agnès’ perspective, the authentic nun is the woman who freely pursues a personal vocation, who strengthens this vocation through substantial theological study, who chooses her own superiors, and who pursues God in spontaneous meditation guided by the Holy Spirit.  The challenge to the patriarchal tradition of the forced vocation and the illiterate convent was evident.

Even in her legal texts, the Augustinian philosophy of Mère Agnès is evident.  The Constitutions are not a blueprint for human efforts to build the ideal convent; they reflect the work of divine grace within the reformers.  “As Saint Augustine says, we must work to conquer our vices by constant efforts and ardent prayers, but we must recognize at the same time that our efforts as well as our prayers, if there is anything good in them, are the effects of grace [CM, 273].” Corporate, no less than individual, acts of virtue have a single causation in the operation of divine grace.

d. Ethics of Resistance

As the opposition to Port-Royal intensified, Mère Agnès composed Counsels on the Conduct Which the Nuns Should Maintain in the Event of a Change in the Governance of the Convent [CC].  The work attempted to prepare the nuns to negotiate the persecutions which would soon overwhelm the convent.  Mère Agnès presciently saw the exile of recalcitrant nuns, the imposition of foreign superiors, and the use of ecclesiastical interdict (barring a Christian from participation in the sacraments) as probable tactics of the new persecution.  Her Counsels functions both as a casuistical manual, which instructs nuns on acceptable and unacceptable scenarios of cooperation with hostile authorities, and as a moral exhortation, which analyzes the virtues the nuns should cultivate under duress.

If foreign superiors are imposed on the convent, the Port-Royal nuns should refuse to acknowledge their authority.  Such imposed superiors represent a violation of the convent’s constitutions, which have been duly approved by the Vatican and the French throne.  “These superiors cannot have a true authority by usurping a power that does not belong to them.  They will be intruders, even when they want to adorn themselves with the obedience due superiors [CC, 83].”  The nuns of Port-Royal have not pledged to follow generic vows of poverty, chastity, and obedience; with the approval of church and state, they have promised to live these vows in the convent of Port-Royal, ruled by its constitutions and laws.  The imposition of foreign superiors represents a serious violation of this vocational right.

In practice, the nun must distinguish between acceptable and unacceptable cooperation with the commands of these illegitimate superiors.  Material cooperation is the easiest.  The nun should quickly accept commands concerning manual labor, meals, and the physical disposition of one’s space.  Even here, however, the nun must refuse commands to perform activities incompatible with the ethos of Port-Royal; making elaborate vestments or placing flowers on the altar, for example, would violate the convent’s austere understanding of poverty.  Moral cooperation with the illegitimate superiors should be refused.  The nun is not to reveal her convictions or feelings to the illegitimate superiors or their attendant clergy.  If a command is refused, no explanation is to be given.  Under no circumstances should the nun agree to the demand for an unreserved signature on the statement assenting to the church’s condemnation of Jansen; to do so would be to deny the truth concerning grace.  Even conversations on this topic are to be avoided.

The problem of material cooperation when one is exiled to a foreign convent is comparatively easy.  A legitimate superior of a foreign convent exercises a certain authority over the entire house, including the guests who reside there.  An exiled Port-Royal nun should readily accept the host convent’s different material culture, even to the point of breaking the reformed Port-Royal’s vegetarianism, and its different spiritual culture, including participation in a different version of the divine office than that used at Port-Royal.  Even here, however, a strict silence should be employed to avoid any moral cooperation with Port-Royal’s persecutors.  If an exiled nun confesses her sins to the convent’s confessor, she should reveal nothing in her conscience beyond those sins, soberly described.  Interviews with the new superior should be respectful, but the exiled nun should not reveal her internal state of mind.  An asceticism of the tongue is essential in this genteel campaign of resistance.

The abbess also reminds the nuns of the virtues they need to cultivate during the impending persecution.  They need to acquire the virtues of the martyrs who have preceded them.  “God clearly permits us to be consoled by the thought that we are suffering because we feared to offend him by assenting against our conscience to something we thought impossible to do without attacking the truth [CC, 104].”  In refusing to assent to the condemnation of what they believed to be Jansen’s accurate theory of grace, the nuns have become the most modern of victims: the martyr to conscience.

The persecution also gives the nuns the occasion to deepen the virtues of humility and of dependence on God’s providence.  Most strikingly, the deprivation of Holy Communion (as part of the censure of interdict) permits the nuns to discover an asacramental type of communion with Christ that seems to transcend the value of sacramental communion.  “Instead of the bread of God, we receive the word of God himself, which must be heard in our heart….We place our confidence in the promise made to us in Holy Scripture that the spiritual anointing, even greater in affliction, will teach us everything [CC, 95].” Under the brunt of persecution, the nun’s piety becomes comparatively antinomian, no longer requiring the sacramental or sacerdotal mediation of the church to experience intimate communion with God.

4. Interpretation and Relevance

Several factors have limited the reception and interpretation of the works of Mère Agnès Arnauld as properly philosophical works.  First, a number of secondary works have focused exclusively on the controversy over the nun’s early pamphlet Private Chaplet of the Blessed Sacrament. The commentaries by Saint-Cyran (1633), Binet (1635), Armogathe (1991), and Lesaulnier (1994) indicate the longstanding interest in the church-state controversy behind this international quarrel.  This focus on the ecclesiastical politics behind the abbess’s early work has tended to devalorize the more detailed positions on virtue and authority developed by Mère Agnès in her works of maturity.  Second, the Jansenist movement often distanced itself from the works of the nun, which were considered too mystical for the practical, rationalist piety of the Jansenist mainstream.  Barcos’s (1696) critique of Mère Agnès’s theories of prayer indicates the disdain of the later Jansenist movement for a mysticism-oriented philosophy too dependent on its Oratorian sources.

The contemporary philosophical retrieval of the thought of Mère Agnès is focusing more on her work as a moralist.  Her virtue theory privileges those intellectual and volitional habits that typify the way of life proper to a strictly cloistered convent.  Contemplation itself, interpreted as a loving gaze on God freed from all self-interest, becomes the keystone of the authentic virtuous life.  As Mesnard (1994) argues, the more active dimension of the life of the embattled nun merits new consideration.  Her reflections on the quandaries of material cooperation with evil constitute a casuistry for the oppressed.  The ethics of resistance she constructed during the persecution of Port-Royal remains to be explored.

5. References and Further Reading

All translations from French to English above are by the author of this article.

a. Primary Sources

  • Arnauld, Mère Agnès. Avis donnés par la Mère Catherine Agnès de Saint-Paul, Sur la conduite que les religieuses doivent garder, au cas qu’il arrivât du changement dans le gouvernement de sa maison (N.p.: 1718).
    • [The treatise analyzes the morality of material cooperation with the opponents of the convent.]
  • Arnauld, Mère Agnès. Counsels on the Conduct Which the Nuns Should Maintain in the Event of a Change in the Governance of the Convent (1718).
  • Arnauld, Mère Agnès. L’image d’une religieuse parfaite et d’une imparfaite, avec les occupations intérieures pour toute la journée (Paris: Charles Savreux, 1665).
    • [This work analyzes the difference between theocentric and anthropocentric virtue, with the accent placed on the Oratorian virtue of self-annihilation.]
  • Arnauld, Mère Agnès. Les constitutions du monastère de Port-Royal du Saint-Sacrement (Mons: Gaspard Migeot, 1665).
    • [Written with the collaboration of Antoine Arnauld and Mère Angélique Arnauld, this juridical document provides the legal framework for the reformed Port-Royal convent.]
  • Arnauld, Mère Agnès. Lettres de la Mère Agnès Arnauld, abbesse de Port-Royal, ed. Prosper Faugère [and Rachel Galet], 2 vols. (Paris: Benjamin Duprat, 1858).
    • [The correspondence indicates the broad Augustinian culture of the abbess as well as the principles of her methods of governance and of spiritual direction.]
  • Arnauld, Mère Agnès. Private Chaplet of the Blessed Sacrament (1626).
  • Arnauld, Mère Agnès. Spirit of Port-Royal (1665).

b. Secondary Sources

  • Armogathe, Robert. “Le chapelet secret de Mère Agnès Arnauld,” XVIIe siècle no. 170 (1990): 77-86.
    • [The article provides an excellent critical edition of the Private Chaplet and an analysis of the work’s theology of rupture.]
  • Barcos, Martin de. Les sentiments de l’abbé Philérème sur l’oraison mentale (Cologne: P. Du Marteau, 1696).
    • [The Jansenist leader criticizes Mère Agnès’s approach to meditation as too methodical and too intellectualist.]
  • Binet, Étienne. Discussion sommaire d’un livret intitulé “Le chapelet secret du très-saint Sacrement” (Paris: 1635).
    • [The Jesuit author criticizes Mère Agnès’s Private Chaplet for its alleged asacramentalism and discouragement of the cultivation of the moral virtues.]
  • Bugnion-Sécretan, Perle. Mère Agnès Arnauld, 1593-1672; Abbesse de Port-Royal (Paris: Cerf, 1996).
    • [This biography uses Mère Agnès’s correspondence to probe the abbess’s psychological life.]
  • Conley, John J. Adoration and Annihilation: The Convent Philosophy of Port-Royal (Notre Dame, IN: University of Notre Dame Press, 2009): 113-174.
    • [This chapter analyzes Mère Agnès’s Augustinian philosophy, especially its theory of virtue, freedom, and the divine attributes.]
  • Chédozeau, Bernard. “Aux sources du Traité de l’oraison de Pierre Nicole: Martin de Barcos et Jean Desmarets de Saint-Sorlin lecteurs des Occupations intérieures de la Mère Agnès Arnauld,” Chroniques de Port-Royal 43 (1994): 123-34.
    • [The article traces the influence of Mère Agnès’s spirituality on subsequent controversies over the nature of Christian contemplation.]
  • Desmarets de Saint-Sorlin, Jean. Le chemin de la paix et celui de l’inquiétude, vol. 1 (Paris: C. Audinet, 1665).
    • [The book condemns Mère Agnès’s theories of prayer as too rationalistic.]
  • Lesaulnier, Jean. “Le chapelet secret de la Mère Agnès Arnauld,” Chroniques de Port-Royal 43 (1994): 9-23.
    • [The article provides a well-documented textual study of the various versions of and controversy over the abbess’s Private Chaplet.]
  • Mesnard, Jean. “Mère Agnès femme d’action,” Chroniques de Port-Royal 43 (1994): 57-80.
    • [Unlike other commentators, the author stresses the practical rather than the mystical dimension of Mère Agnès’s work and theories.]
  • Nicole, Pierre. Traité de l’oraison (Paris: H. Josset, 1679).
    • [A leading Jansenist philosopher defends Mère Agnès’s spirituality as both doctrinally orthodox and philosophically sound.]
  • Saint-Cyran, Jean du Vergier de Hauranne, abbé de. Examen d’une apologie qui a été faite pour servir de défense à un petit livre intitulé Le chapelet secret du Très-Saint Sacrement (Paris, 1633).
    • [A defense of the Augustinian pedigree and orthodoxy of Mère Agnès’s Private Chaplet, the work marked Saint-Cyran’s inaugural alliance with the convent of Port-Royal.]
  • Timmermans, Linda. “La ‘Religieuse Parfaite’ et la théologie: L’attitude de la Mère Agnès à l’égard de la participation aux controverses,” Chroniques de Port-Royal 43 (1994): 97-112.
    • [The commentary on Image of a Perfect Nun argues that the abbess desired nuns to abstain from theological disputes; Mère Agnès’s own participation in several such public disputes is downplayed.]
  • Weaver, F. Ellen. La Contre-Réforme et les constitutions de Port-Royal (Paris: Cerf, 2002).
    • [This study stresses the link between the abbess’s vision of reformed conventual life and both the earlier Cistercian tradition and other “non-Jansenist” currents in the Counter-Reformation.]

Author Information

John J. Conley
Email: jconley1@loyola.edu
Loyola University in Maryland
U. S. A.

Modal Illusions

We often talk about how things could have been, given different circumstances, or about how things might be in the future. When we speak this way, we presume that these situations are possible. However, sometimes people make mistakes regarding what is possible or regarding what could have been the case. When what seems possible to a person is not really possible, this person is subject to a modal illusion. With a modal illusion either (i) things seem like they could have been otherwise when they could not have been otherwise or (ii) things seem as if they could not have been otherwise when they could have been otherwise. The most widely discussed cases are instances of the former. Certain impossibilities seem (at least to some people) to be possible. Because of these illusions, there are certain necessary truths (truths which could not have been false) that are mistakenly thought to be contingent. Of particular concern to philosophers working on modal illusions are certain necessary truths that are known a posteriori, and which strike some people as contingent. The most discussed examples are found in Saul Kripke’s Naming and Necessity (1972), the work that sparked the contemporary interest in modal illusions.

While many elementary necessary truths seem to be necessary, the “necessary a posteriori” do not always seem to be so. For example, it is obviously a necessary truth that two is greater than one. It does not seem that things could have been otherwise. On the other hand, it is also a necessary truth that water is composed of H2O (as Kripke (1972) explains), but this might not seem to be necessary. The proposition expressed by the sentence ‘water is H2O’ strikes some people as contingently true because it seems that water could have been composed of something else. However, water could not have been composed of anything other than H2O since that’s what water is. Anything else would not be water. We came to know the composition of water through experience, and so one might think that we could have had different experiences that would have shown that water was composed of XYZ, for example, and not H2O. However, the idea that things could have been otherwise, and that the proposition that water is composed of XYZ is merely contingently false, is a modal illusion.

Table of Contents

  1. Modal Illusions
  2. The Necessary A Posteriori
  3. Ramifications
  4. Similarity Accounts
  5. Objections
    1. True Modal Beliefs and False Non-Modal Beliefs
    2. Other Examples of Modal Illusions
  6. Two-Dimensionalist Accounts
  7. Objections
    1. Other Examples of Modal Illusions
    2. The Epistemic Status of the Secondary Proposition
    3. Believing Impossibilities
  8. Possibility Accounts
  9. Objections
    1. Conceivability and Possibility
    2. Impossible Worlds
    3. Metaphysical Possibility
  10. References and Further Reading

1. Modal Illusions

Unless otherwise specified, the terms ‘necessary,’ ‘contingent,’ ‘possible,’ ‘impossible,’ and all of their cognates refer to metaphysical notions. The phrases ‘could have been,’ ‘could not have been,’ and so forth are also used in a metaphysical sense. If p is necessarily true, then p could not have been false. If p is necessarily false, then p could not have been true. The proposition expressed by the sentence ‘2 is greater than 1,’ for example, is necessarily true since it could not have been false. If p is contingently true, then although p is true, it could have been false. For example, the proposition expressed by the sentence, ‘John McCain lost the 2008 U.S. Presidential election,’ is contingently true since it could have been false. McCain could have won the 2008 election. If p is possible, then either p is true or p is contingently false. The proposition expressed by the sentence ‘McCain won the 2008 election’ is false, but it is possible that McCain could have won the 2008 election.
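
These definitions can be summarized in standard modal notation.  The following lines are a brief sketch in LaTeX; the symbols □ (‘necessarily’) and ◇ (‘possibly’) are standard, but they are not notation used elsewhere in this article:

\Box p \leftrightarrow \neg\Diamond\neg p    % p is necessary iff not-p is not possible
\Diamond p \leftrightarrow \neg\Box\neg p    % p is possible iff not-p is not necessary
p \text{ is contingently true} \leftrightarrow (p \wedge \Diamond\neg p)    % true, but could have been false
p \text{ is contingently false} \leftrightarrow (\neg p \wedge \Diamond p)  % false, but could have been true
p \text{ is possible} \leftrightarrow (p \vee (\neg p \wedge \Diamond p))   % that is, true or contingently false

Given the standard assumption that whatever is true is also possible, the last line agrees with the gloss above: p is possible just in case p is either true or contingently false.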

Certainly, a person can be mistaken about the modal properties of many different types of statements or propositions. A person might mistakenly believe that a contingent truth known a priori is necessarily true. Kripke (1972) gives examples of the “contingent a priori” that may also be illusory. Consider Kripke’s example, ‘stick S is one meter,’ said of the stick used to fix the reference of ‘one meter.’ Kripke points out that ‘stick S is one meter’ is contingent because stick S could have been a different length; it could have been longer or shorter than one meter. Yet, the speaker knows that stick S is one meter a priori because stick S is being used to fix the referent of ‘one meter.’ Before one knows how long the stick actually is, one knows that it is one meter long. Still, it strikes some people as false that stick S could have been longer or shorter than one meter, since stick S is fixing the reference of ‘one meter.’ The thought is that stick S could have been many lengths, but it could not have been longer than or shorter than one meter, since ‘one meter’ refers to whatever length stick S happens to be. Those who are struck by the appearance that stick S could not have been longer or shorter than one meter are subject to a modal illusion. (However, this does not seem to be a common mistake made regarding Kripke’s examples of the “contingent a priori”. Rather, it seems that when a person doubts the Kripkean examples of the “contingent a priori”, the person believes that these truths are knowable a posteriori. One might argue that while it is true that stick S is one meter, one could only have known that through experience.)

There may also be “contingent a posteriori” truths that are thought to be necessary. For example, Kripke (1972, p. 139) points out that it is sometimes mistakenly thought that light could not have existed without being seen as light. “The fact that we identify light in a certain way seems to us to be crucial even though it is not necessary; the intimate connection may create an illusion of necessity.” It is merely contingently true that light is seen as light, but some might think it is necessarily true and that things could not have been otherwise.

Finally, there are certainly necessary a priori truths that strike some people as merely contingently true. Any mistake about what could have been the case or could not have been the case is a modal illusion. However, the most commonly discussed examples of modal illusions are Kripke’s examples of the “necessary a posteriori” and, therefore, these will be the focus of this entry. Sections 3 through 8 below provide an overview of the most prominent explanations offered by contemporary philosophers regarding how or why a person subject to a modal illusion of the necessary a posteriori comes to make the mistake.

2. The Necessary A Posteriori

The following are the three most commonly discussed examples of modal illusions of the “necessary a posteriori”:

(a) Hesperus is Phosphorus.

(b) Water is H2O.

(c) This table is made of wood. (Said of a table originally made of wood.)

The examples above do strike many people as contingent on first consideration. However, the propositions expressed by each of the above sentences are necessary. For example, ‘Hesperus is Phosphorus’ is both necessary and knowable a posteriori. Given that Hesperus is Phosphorus, Hesperus is necessarily Phosphorus, since being self-identical is a necessary property (any object is necessarily identical to itself). Yet, we came to know that Hesperus is Phosphorus through empirical means. The proposition expressed by the sentence might seem contingent to someone if that person thought that Hesperus could have been distinct from Phosphorus. (b) and (c) are also necessary since composition is a necessary property of an object or substance. But of course, we need empirical evidence to know the composition of water or this table, and so both (b) and (c) are a posteriori.
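
The modal half of this claim can be set out as a short derivation.  The following is a sketch, in LaTeX, of the familiar argument for the necessity of identity often associated with Ruth Barcan Marcus and Kripke; the article itself does not present the argument in this formal dress:

% 1. Necessity of self-identity: every object is necessarily identical to itself.
\forall x\, \Box(x = x)
% 2. An instance of Leibniz's Law: if x = y, then y has every property x has
%    (here, the property of being necessarily identical to x).
\forall x \forall y\, \bigl(x = y \rightarrow (\Box(x = x) \rightarrow \Box(x = y))\bigr)
% 3. From 1 and 2: if two things are identical at all, they are necessarily identical.
\forall x \forall y\, \bigl(x = y \rightarrow \Box(x = y)\bigr)

Applied to (a): since Hesperus is Phosphorus, Hesperus is necessarily Phosphorus, even though the identity had to be discovered empirically.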

Although (a), (b), and (c) are necessary truths, the following propositions are necessarily false, but may seem to some people to be merely contingently false:

(a1) Hesperus is distinct from Phosphorus.

(b1) Water is XYZ.

(c1) This table is made of ice. (Said of a table originally made of wood.)

It might seem that Hesperus could have been distinct from Phosphorus, that water could have been composed of XYZ, or that this table could have been made of ice. A person might consider this table, think about what it could have been made of, come to the mistaken conclusion that it could have been made of ice, and then conclude that the proposition expressed by the sentence ‘this table is made of ice’ is merely contingently false. But of course, this table could not have been made of ice. Given that this table is made of wood, it is necessarily made of wood. Any table made of ice would not be this same table.

Of course, some philosophers deny that these examples are necessary. In that case, there is no modal illusion to explain since what seems contingent is contingent and what seems possible is possible. However, the accounts considered below all attempt to explain the illusion in these cases because each of them accepts the Kripkean conclusions about the necessary nature of the above examples.

3. Ramifications

The correct solution to the problem of modal illusions will have an important impact on many philosophical issues because it is common for philosophical arguments to rely upon thought experiments about what is and is not possible.  For example, in the philosophy of mind, some say that they can conceive of mental activity without any physical activity or of a mental entity existing in the absence of a physical entity. Indeed, this was part of Descartes’ argument. Descartes relied on the seeming possibility that his mind or soul could exist without his body. Descartes’ narrator claimed that he could imagine being deceived about having a body, but he could not imagine being deceived about being a thinking being. So it seems that the mind or soul could exist or could have existed without the body. If this is genuinely possible, then physicalism must be false.

The possibility of a philosophical zombie is often used in arguments against a physical reduction of consciousness. Some people believe that philosophical zombies could have existed. One might imagine a being completely identical in every physical respect to a human being that is nevertheless not conscious; there is no mental activity whatsoever. There are no emotions, thoughts, beliefs, fears, desires, and so forth, even though all the corresponding neurological events are happening in the body. Moreover, the zombie exhibits all the behaviors of a person with emotions, thoughts, beliefs, fears, desires, and so forth. For example, it acts angry when the neurological firings that normally occur when a person experiences anger take place in its brain. However, the zombie does not feel anger; the zombie does not feel anything! If these sorts of creatures could have existed, then mental activity does not supervene on physical activity. All the physical facts would be the same as they actually are, but there would be no mental facts.

Another example many dualists use is the feeling, which strikes many people, that pain could exist or could have existed without the corresponding physical activity in the body. Some say that they can imagine pain, the sensation, without the correlated neurological, physical activity in the body that occurs whenever a person has pain (call that C-Fiber stimulation). If this represents a genuine metaphysical possibility, then pain and other conscious events are not identical with, or reducible to, physical events.

Dualists use the sort of reasoning in these examples to show that there is no necessary connection between the mental and the physical. But perhaps these are modal illusions. Perhaps zombie worlds, body-less souls, and pain in the absence of C-Fiber stimulation are not really possible. It may be the case that although a philosophical zombie seems possible it is not possible, just as it is the case that XYZ-water seems possible, even though it is not possible. In responding to arguments that rely on these appearances of possibility, many physicalists point to the Kripkean examples of the “necessary a posteriori”, arguing that these examples strike many people as contingent even though they are necessary. So even if it is necessary that mental events are physical events and even if it is true that mental events could not have existed without the corresponding physical events, it might seem as though they could have, just as it might seem as though water could have existed without being H2O even though it could not have.

Depending on the correct account of modal illusions, the seeming possibilities of philosophical zombies and of a purely mental world may or may not count as modal illusions. Different explanations of modal illusions have different consequences for the materialist/dualist debate because only some explanations of modal illusions will count zombie worlds and body-less souls as modal illusions.

4. Similarity Accounts

Some explanations of modal illusions contend that the person who is struck by the feeling that things could have been otherwise does not really have an impossible situation in mind. Instead, the situation the person considers is one in which there are similar objects or a similar substance, and the situation has been re-described. This family of accounts, called Similarity Accounts, includes Kripke’s own. According to Kripke, it might seem possible that this (wooden) table could have been made of ice because we claim that we can imagine this table being made of ice. However, Kripke (1972, p. 114) says, “this is not to imagine this table as made of…ice, but to…imagine another table, resembling this one in all the external details made of…ice.” According to Kripke, the intuition that leads a person to conclude that this table could have been made of ice is not an intuition about this table but an intuition about a similar one. The intuition must be re-described.

Kripke (1972, p. 142) also argues that the necessarily false propositions ((a1), (b1), and (c1)) could not have been true but some “appropriate corresponding qualitative statement” for each is true. Kripke (1972, p. 143) claims that the sentence, ‘two distinct bodies might have occupied in the morning and the evening, respectively, the very positions actually occupied by Hesperus-Phosphorus-Venus’ is true and should replace the “inaccurate statement that Hesperus might have turned out not to be Phosphorus”. It is unclear whether Kripke wants to maintain that the person subject to the modal illusion really has that corresponding statement in mind or whether he simply wants to maintain that this corresponding statement should replace the false statement the person does have in mind. In either case, Kripke adopts a Similarity Account approach in saying that the person has the false belief because she considers a situation in which some planet similar to Hesperus is distinct from some planet similar to Phosphorus—and not a situation in which Hesperus is not Phosphorus. Similarity Accounts argue that if Hesperus could not have been distinct from Phosphorus, then when a person claims to believe that they could have been distinct, it cannot be because she has imagined a scenario or situation in which they are distinct since there is no such possible scenario or situation.

Kripke goes on to argue that there is no similar explanation about the belief that pain could have existed in the absence of C-Fiber stimulation. One can imagine that pain could have existed in the absence of C-Fiber stimulation; there is no re-description necessary because there is no other feeling that is very much like pain that the person imagines. To be a pain is to be felt as a pain, according to Kripke, and so if we imagine the sensation of pain without C-Fiber stimulation, the sensation we imagine must be pain—otherwise, what would the similar phenomenon be if not pain?

The appearance of pain without C-Fiber stimulation is not like the appearance of water without hydrogen and oxygen, according to Kripke. It is not true that to be water is to be experienced as water. A person can have all the experiences of water and yet the substance could be something else. When one imagines water composed of XYZ, according to these accounts, the person has imagined this similar substance—one that is experienced as water but is not water. However, when one imagines pain existing in the absence of C-Fiber stimulation, there is no phenomenon similar to pain that the person really imagines. One cannot have all the experiences of pain without there being pain. So Similarity Accounts cannot explain away the intuition that pain could have existed in the absence of C-Fiber stimulation; on Kripke’s view, this intuition is not false and so is not a modal illusion.

5. Objections

a. True Modal Beliefs and False Non-Modal Beliefs

According to Similarity Accounts, the reason a person believes that something impossible could have been the case is because she imagines a situation that could have been the case for some similar objects or substances. It might seem that water could have been composed of XYZ because a person might imagine some substance very similar to water in all qualitative respects, but this substance will not really be water.

Consider a true modal belief, such as the belief that John McCain could have won the 2008 U.S. Presidential election. Normally, we would say that this is a belief regarding John McCain himself and not someone similar to John McCain in the relevant respects. Indeed, this is what Kripke wants to hold about true modal beliefs. Kripke (1972, p. 44) writes, “When you ask whether it is necessary or contingent that Nixon won the election, you are asking the intuitive question whether in some counterfactual situation this man would in fact have lost the election.” He adamantly opposes the idea that the intuition is about some man similar to Nixon, yet he claims that the intuition that this (wooden) table might have been made of ice is not about this table. There may be a reason to explain true and false modal intuitions in this non-uniform way, but without an argument, we have no reason to claim that our false modal intuitions are about objects similar to the objects we claim they are about while our true modal intuitions are about the very objects we claim they are about.

Such a theory is also non-uniform in how it would be extended to treat false non-modal beliefs.  The belief that New York City is the capital of New York State is a false non-modal belief. (It is a false belief, yet the belief is not at all about what could have been the case.) If a Similarity Account were extended to explain how or why a person has false beliefs more generally, the account would say that the person comes to this belief because he has an intuition that some city, similar to New York City in the relevant respects, is the capital of New York State. This is clearly an implausible explanation of such a false belief. We have no reason to believe that our common false beliefs stem from true beliefs about similar objects.

Now consider a necessary falsehood that a person mistakenly believes is true. Any mathematical falsehood would count. The mathematical falsehood that 18 squared is 314 (it is actually 324) is necessarily false; it could not have been true, but someone might mistakenly believe it is true. If a Similarity Account were extended to treat false beliefs more generally, the account would say that the person who believes that 18 squared is 314 does not really have 18 in mind but some number similar to 18 in the relevant respects. This is what the theory would say to explain any false mathematical beliefs. Because many (if not all) Similarity Accounts argue that one can never imagine impossibilities (which is Barcan Marcus’ claim in “Rationality and Believing the Impossible” (1983)), no one could ever believe that a mathematical falsehood either could have been true or even is true. But clearly, we are capable of believing mathematical falsehoods.

b. Other Examples of Modal Illusions

In many occurrences of modal illusions, a person will come to realize that the proposition expressed by the sentence is necessary and will still be struck by the feeling that things could have been otherwise. As Alex Byrne (2007, p. 13) says, “A modal illusion, properly so-called, would require the appearance that p is possible in the presence of the conviction that p is impossible.” For example, a person who has read Kripke many times and acknowledges that water is necessarily H2O may still be struck by the appearance that water could have been XYZ. Call this a “person in the know.” A Similarity Account cannot explain the modal illusion in these cases. The subject in the know at once believes that water is necessarily composed of H2O and is struck by the feeling that things could have been otherwise. A Similarity Account would say that the intuition really concerns some other substance, similar to water in the relevant respects, and that the sentence ‘water could have been XYZ’ needs to be replaced.

Our subject in the know might say, “I know that it is impossible that p but it still sure seems like p could have been the case.” A Similarity Account might argue that what our subject really means is that “I know that it is impossible that p* but it still sure seems like p* could have been the case.” In that case, the account would need to explain why it is p* she has in mind in both instances. But more importantly, the account would need to explain this new illusion: if p* is possible and it strikes her as possible, why does she claim to know that it is impossible that p*? p* is possible and so according to this type of explanation, she must have a different proposition in mind. Might that be p**?

Similarity Accounts also cannot explain the illusion that this very table could have been made of ice. Imagine a person points to a wooden table and claims, “this very table is made of wood, but it could have been made of ice.” The person cannot be more specific about which table he means to consider; it is this very one in front of him, one that is made of wood. It would be absurd to say that the person is considering some other similar table that is made of ice. Our subject has said that the table he means to consider is made of wood. He could even have said, “It seems to me that any wooden table could have been made of ice.” Similarity Accounts fail to explain this illusion as well. It cannot be that our subject means to consider a specific table but mistakenly considers some similar one. He is making the claim about any wooden table, whatsoever. What is he considering in this case if the Similarity Account is correct?

6. Two-Dimensionalist Accounts

Another type of account that seeks to explain how modal illusions of the “necessary a posteriori” arise makes use of the two-dimensional semantic framework proposed by philosophers such as David Chalmers and Frank Jackson. This sort of approach aims to explain how a person might mistakenly think that a necessary proposition is contingent. As opposed to a traditional view of reference, the two-dimensional semantic framework proposes that there are two intensions of certain words. According to one common view of reference, a concept determines a function from possible worlds to referents. The function is an “intension” and it determines the “extension”. Two-Dimensionalism proposes that sometimes there are two intensions because often there is no single intension that can do all the work a meaning needs to do.

For example, Chalmers and Jackson explain that ‘water’ has two intensions. Under the common view of reference, the concept “water” determines a function from possible worlds to water/H2O. The function is an intension and determines that the extension of ‘water’ is always water/H2O. But according to the two-dimensional framework, there are two different intensions, two different functions from possible worlds to extensions. While the secondary intension of ‘water’ always picks out water/H2O, the primary intension picks out the “watery stuff” of a world—the clear, drinkable liquid that fills the lakes, rivers, and oceans in a possible world. In certain possible worlds, that stuff is composed of XYZ and so it might seem as if the proposition expressed by the sentence ‘water is XYZ’ is merely contingently false. That is an illusion caused by conflating the primary and secondary intensions of ‘water.’ The primary intension is meant to capture the “cognitive significance” of the term, which is what a person subject to a modal illusion must have in mind.

Certain sentences thus express two different propositions depending on the two different intensions of the terms in the sentence. According to Two-Dimensionalists, the primary proposition determines the epistemic property of the sentence (whether it is a priori or a posteriori) while the secondary proposition determines the modal property of the sentence (whether it is necessary or contingent). With any example of the Kripkean “necessary a posteriori”, the primary proposition is a posteriori but not necessary, while the secondary proposition is necessary but not a posteriori. The secondary proposition in this case, that water is H2O, is necessary in the standard Kripkean sense, but it is not a posteriori because the secondary intension always picks out H2O in any possible world; we do not need to do empirical investigation to know that water is water.  The primary proposition is not necessary since the watery stuff of a world could be composed of H2O, XYZ, or something else. However, it is a posteriori. We need empirical evidence to know what water is composed of in any world.
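
As an illustration, the two intensions of ‘water’ can be pictured as two functions on possible worlds.  The following sketch uses labels of our own (primary, secondary, and the world name w_XYZ); it is not Chalmers’ or Jackson’s notation:

% Primary intension: evaluate the world as actual; pick out whatever plays the watery-stuff role there.
\mathrm{primary}(\text{'water'})(w) = \text{the clear, drinkable liquid filling the lakes and oceans of } w
% Secondary intension: evaluate the world as counterfactual; pick out what water actually is.
\mathrm{secondary}(\text{'water'})(w) = \mathrm{H_2O}
% At a Twin-Earth-style world w_XYZ where the watery-stuff role is played by XYZ:
\mathrm{primary}(\text{'water'})(w_{XYZ}) = \mathrm{XYZ}, \qquad \mathrm{secondary}(\text{'water'})(w_{XYZ}) = \mathrm{H_2O}
% So the primary proposition expressed by 'water is XYZ' is true at some worlds (only contingently false),
% while the secondary proposition is false at every possible world (necessarily false).

This mirrors the division of labor just described: the primary proposition carries the epistemic status of the sentence, while the secondary proposition carries its modal status.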

Jackson (1997, p.76) holds that the secondary proposition is “normally meant by unadorned uses of the phrase ‘proposition expressed by a sentence’” and Chalmers (1996, p. 64) too says that the secondary proposition “is more commonly seen as the proposition expressed by a statement.” Therefore, one might say that the proposition expressed by ‘Hesperus is Phosphorus’ is necessary. If it seems contingent to a person, that is a modal illusion and the illusion is explained by the fact that the primary proposition is not necessary. According to this sort of account, when a person is subject to a modal illusion and concludes that a necessary truth is contingent, she does not consider the proposition expressed. Rather, the sentence misdescribes the situation she is considering. Her mistake is not simply in concluding that the proposition is contingent but in reporting what proposition she is considering. Two-Dimensionalist Accounts have this feature in common with Similarity Accounts: the person subject to the modal illusion does not have some impossible situation in mind. The situation she has in mind is not described correctly.

Chalmers uses his Two-Dimensionalist explanation of modal illusions to argue for dualism. According to Chalmers, pain in the absence of C-Fiber stimulation is not a modal illusion. In The Conscious Mind, Chalmers (1996, p. 133) says, “with consciousness, the primary and secondary intensions coincide.” The primary intension of ‘pain’ picks out painful sensations, feelings experienced as pain, but the secondary intension of ‘pain’ also picks out painful sensations, feelings experienced as pain, since what it means to be a pain is to be experienced as a pain. It does not always pick out C-Fiber stimulation. So, painy-stuff cannot be misdescribed by the word ‘pain’ since all that it is to be a pain is to be felt as a pain. The secondary proposition—the proposition that backs the necessity or contingency of a sentence—expressed by ‘pain is C-Fiber stimulation’ is contingent. The proposition could have been false since the secondary intension of ‘pain’ picks out something other than C-Fiber stimulation in some possible worlds. The person who believes that the proposition expressed by the sentence ‘pain is C-Fiber stimulation’ is contingently true has not made a mistake.

While Jackson once used his account of modal illusions to defend a dualist theory, he now supports physicalism. Given his physicalist commitments, Jackson should hold that a person who is struck by the feeling that pain could have existed in the absence of C-Fiber stimulation is under a modal illusion. Given his Two-Dimensionalist commitments, however, it is hard to know what he would say to explain the illusion. A Two-Dimensionalist Account of the illusion that pain could have existed in the absence of C-Fiber stimulation should say that the person who believes this imagines a situation in which the primary intension of ‘pain’ picks out something just like pain, but is not pain. It is unclear how a Two-Dimensionalist could make this sort of approach work since, as Chalmers (1996, p. 133) and Kripke (1972, p. 151) have noted, ‘pain’ always picks out pain and not painy-stuff. There is no painy-stuff that is not pain.  But perhaps what Jackson wants to argue is that while we believe we are imagining a world in which there is pain and no C-Fiber stimulation, there really must be C-Fiber stimulation in that situation.

7. Objections

a. Other Examples of Modal Illusions

Consider again the person in the know who is subject to a modal illusion. Two-Dimensionalist accounts fail to explain the illusion in these cases. Chalmers argues that it might seem as if the proposition expressed by the sentence ‘water is XYZ’ is contingently false because the sentence is used to express something true in some possible worlds. Chalmers (2007, p. 67) says that the person subject to the modal illusion considers “a conceivable situation—a small part of a world” in which watery stuff (and not water) is XYZ but the subject misdescribes the situation she is considering using the term ‘water.’ According to Chalmers (1996, p. 367, footnote 32), there is a “gap between what one finds conceivable at first glance and what is really conceivable.” It might seem conceivable that water could have been XYZ, but it is not really conceivable since it is impossible. While this may be a plausible explanation in the typical cases of modal illusions, it is an implausible explanation for what happens in the case of our subject in the know. This person knows enough to recognize that there might be a situation in which the watery stuff at a world is composed of XYZ and thus makes the primary proposition expressed by the sentence ‘water is XYZ’ true, but she does not have that proposition or situation in mind. Rather, it strikes her as possible—even though she believes it is not possible—that water could have been XYZ and that the proposition expressed (the secondary proposition) is contingently false. The person in the know would explicitly consider the secondary proposition and it might still strike her as merely contingently false.

The Two-Dimensionalist explanations also fail to explain modal illusions involving ‘This table is made of wood’ or other sentences that use demonstratives. Imagine our subject is asked whether it seems that this table could have been made of ice and a certain wooden table is pointed to. If it strikes our subject as possible, she is subject to a modal illusion. Given that the table is made of wood, it could not have been made of anything else. According to a Two-Dimensionalist explanation of modal illusions, the reason it might seem as if this table could have been made of ice is that our subject has imagined a scenario in which the primary proposition expressed by the sentence ‘this table is made of ice’ is true. It is unclear what scenario or possible world would verify the sentence. If there is one table referred to when our interrogator uses the phrase ‘this table’ and points to a specific table, what might the primary intension of ‘this table’ pick out if not this very one?

Nathan Salmon (2005, p. 234) argues that in using the demonstrative and ostensively referring to the table, “I make no reference—explicit or implicit, literal or metaphorical, direct or allusive—to any … table other than the one I am pointing to.” There is no similar table our subject is asked to consider. It is stipulated when she is asked whether it seems that this very table could have been made of ice that she is to consider this very table. When asked to imagine this very table being made of ice, either one can or one cannot. If one can, the object of belief is this very table and one is subject to a modal illusion. If one comes to the conclusion that this table could have been made of ice, one has come to a conclusion about this very table. It is an incorrect conclusion, but that doesn’t mean it wasn’t this table the person considered when reasoning to this mistaken conclusion.

Finally, consider another, less discussed example of the “necessary a posteriori”. Kripke (1972) argues that every person necessarily has the parents that he or she has. Still, it seems to many people as if other people could have been their parents. If it seems to a person that she could have had different parents, that person must be subject to a modal illusion. According to Two-Dimensionalist Accounts, the reason a person makes this mistake is that she imagines a possible world in which someone very much like herself has parents other than the ones she actually has. If our subject, for example, believes that the proposition expressed by ‘I am the daughter of the Queen of England’ is merely contingently false, it is because she considers a world that would verify the primary proposition. The primary proposition, ostensibly, is true in some possible worlds, namely worlds in which someone very much like the speaker is the daughter of the Queen of England.

It seems very unlikely that a person would mean to imagine a world in which she is the daughter of the Queen of England and instead imagine a world in which someone just like her is the daughter of the Queen of England. It seems strange that anyone would mistakenly and unknowingly use the word ‘I’ to refer to someone other than himself or herself. Furthermore, Chalmers (2006) argues that what makes the primary proposition true in certain possible worlds is not that the speakers in those worlds use the terms in a certain way. The way they use the terms is irrelevant. We are concerned with how we use the terms and what those terms would pick out in other possible worlds. So in this case, it is not because there is some doppelganger of our subject who uses ‘I’ to refer to herself that the sentence ‘I am the daughter of the Queen of England’ is true. It is a matter of how our subject uses the term and what the word ‘I’ would pick out in this other possible world. But given that our subject could not have been the daughter of the Queen of England (since she is not), it is unclear to whom ‘I’ refers in this possible world if not the subject herself.

b. The Epistemic Status of the Secondary Proposition

Chalmers (1996) explains that statements of the “necessary a posteriori” express two propositions; one is necessary and the other is a posteriori but not necessary. Chalmers (1996, p. 64) claims that a statement is necessarily true in the first (a priori) sense if the associated primary proposition holds in all centered possible worlds (that is, if the statement would turn out to express a truth in every context of utterance). A statement is necessarily true in the a posteriori sense if the associated secondary proposition holds in all possible worlds (that is, if the statement as uttered in the actual world is true in all counterfactual worlds).
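
The two definitions just given can be summarized schematically (the notation is an editorial gloss, not Chalmers’ own symbolism). Write pri_S for the primary intension of a statement S, a function from centered worlds considered as actual to truth values, and sec_S for its secondary intension, a function from worlds considered as counterfactual to truth values. Then:

  \mathrm{Necessary}_{1}(S) \iff \forall w\, [\mathrm{pri}_S(w) = \mathrm{true}] \quad \text{(w ranging over centered worlds)}

  \mathrm{Necessary}_{2}(S) \iff \forall v\, [\mathrm{sec}_S(v) = \mathrm{true}] \quad \text{(v ranging over possible worlds)}

An instance of the “necessary a posteriori” is then a statement whose secondary intension is true at every world while its primary intension is false at some centered world.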

A statement such as ‘Hesperus is Phosphorus,’ for example, is not necessary in the first, a priori, sense because the primary proposition does not hold in all centered possible worlds—the statement does not express a truth in every context of utterance. However, it is necessary in the secondary sense since the secondary proposition holds in all possible worlds. The statement, as uttered in the actual world, is true in all counterfactual worlds. This is because the secondary proposition expresses something like “Venus-Hesperus-Phosphorus is Venus-Hesperus-Phosphorus.” Chalmers says that this secondary proposition is not a posteriori, however. The primary proposition is a posteriori but not necessary, while the secondary proposition is necessary but not a posteriori. If it is not a posteriori, it must be either a priori or not knowable. This example perhaps seems to be a priori, since it would not take any empirical investigation to know that Venus is Venus, and certainly this is a fact that we can know.

But consider a statement such as ‘water is H2O.’ This statement is necessary in the secondary sense because the secondary proposition holds in all possible worlds. The statement, as uttered in the actual world, is true in all counterfactual worlds since the secondary intension of ‘water’ always picks out H2O. But on this view the secondary proposition is not a posteriori. Then it is either a priori or it is not knowable at all. Since we can, of course, know that water is H2O, it must be knowable a priori; but it is unclear how a person could know the composition of water without empirical evidence.

The objection can also be made using ‘This table is made of wood.’ The secondary proposition expressed by this sentence (said of a table actually originally made of wood) is necessary in the secondary sense because the sentence, as uttered in this world, is true in all counterfactual worlds. But again, on this view the secondary proposition is not both necessary and a posteriori. Either it is not knowable at all or else it is knowable a priori. Since we can, of course, know that this table is made of wood, that must be something we can know a priori. But it is even less plausible that we can know what a table is made of a priori than that we can know the composition of water a priori. How could we know what any table is made of without empirical evidence?

Yet Two-Dimensionalist Accounts rely on this idea to explain modal illusions of the “necessary a posteriori”. It is because one proposition is a posteriori and not necessary while the other proposition is necessary and not a posteriori that we make these modal mistakes. The proposition expressed (the necessary one) may seem contingent because the primary proposition is not necessary; and because the primary proposition is not knowable a priori, one might imagine that it could have been false, since one can imagine a possible world in which it is false. But if the secondary proposition is not a priori either, then we have no need to posit a primary proposition to explain the illusion.

c. Believing Impossibilities

Finally, Two-Dimensionalist Accounts assume that a person cannot imagine impossibilities, but it seems quite plausible that we can and often do imagine or believe impossibilities. We believe mathematical falsehoods, for example, which are surely impossible. Two-Dimensionalists maintain that the scenario imagined has been misdescribed and that it is not an impossible scenario that the person believes to be possible. But if a person can believe that the mathematically impossible is possible, it is a natural extension to say that a person can believe other impossibilities are possible, including metaphysically impossible scenarios such as that water could have been XYZ.

Chalmers (1996, p. 97) recognizes that some mathematical falsehoods are conceivable in a sense; both Goldbach’s Conjecture and its negation are conceivable “in some sense” but “the false member of the pair will not qualify as conceivable” in Chalmers’ usage since there is no scenario that verifies the false member of the pair. Call Goldbach’s Conjecture g and its negation ¬g. When a person claims to believe ¬g, assuming g is true, the belief must be misdescribed. Chalmers (1996, p. 67) says that although one might claim to believe that Goldbach’s Conjecture is false, he is only “conceiving of a world where mathematicians announce it to be so; but if in fact Goldbach’s Conjecture is true, then one is misdescribing this world; it is really a world in which the Conjecture is true and some mathematicians make a mistake.” This might be a plausible explanation of what is going on in the Goldbach case since, at this time, we do not know which is true and which is false, but consider any very complicated mathematical proposition that is known to be true. If someone claims to believe it is false, Chalmers would have to argue that the person has misdescribed the world imagined. This is clearly not the case in most occurrences of false mathematical beliefs. The mathematician who has erred does not imagine a situation in which the complicated mathematical proposition is “announced” to be false; he believes it is false. Two-Dimensionalist Accounts cannot explain these common mathematical false beliefs.
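
For reference, Goldbach’s Conjecture g can be stated formally (a standard formulation supplied here for clarity; it is not quoted from Chalmers):

  g:\quad \forall n\, [(\mathrm{Even}(n) \wedge n > 2) \rightarrow \exists p\, \exists q\, (\mathrm{Prime}(p) \wedge \mathrm{Prime}(q) \wedge n = p + q)]

¬g is its negation, the claim that some even number greater than 2 is not the sum of two primes. Exactly one of g and ¬g is true, and whichever member is false is, on the standard view of mathematical truth, necessarily false.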

8. Possibility Accounts

Rather than invoking a substitute object of thought and saying that there is only one sense of ‘possibility’ relevant to the discussion, another approach to modal illusions would be to maintain that there is only one object of thought under consideration but different senses of ‘possibility’ are in play. One way to do this is to hold that it is possible that water is XYZ, for example, in some non-metaphysical sense. Such Possibility Accounts deny the assumptions made by Similarity Accounts and Two-Dimensionalist accounts that one cannot believe the impossible and that when one claims to believe the impossible, one has mis-described or re-described one’s belief. Possibility Accounts argue that the person does have in mind some impossible world, or at least some impossible situation, and mistakenly believes that it is possible or could have obtained. The reason the impossible situation might seem possible is because it is possible in some other sense.

There are many occurrences of modal illusions in which there is no similar substance or object that can serve as the object of thought and explain the illusion. Possibility Accounts deny that the false modal intuition is about some other object or substance and instead claim that the belief is about a metaphysically impossible situation and that the reason it strikes many people as possible is that it is possible in an epistemic sense. Of course there are many definitions of ‘epistemic possibility.’ According to some theorists, p is epistemically possible if p is true for all one knows. According to others, p is epistemically possible if p is not known to be false. And according to others, p is epistemically possible if p cannot be known to be false a priori. It is some version of this last definition that many theorists rely on to explain modal illusions of the “necessary a posteriori” using a Possibility Account. Since all of the examples discussed here are necessary and a posteriori, they cannot be known to be false a priori.  Therefore, each example is epistemically possible. Since each example is epistemically possible, it might seem to a person that things could have been otherwise even though things could not have been otherwise. The appearance of metaphysical possibility is explained by the epistemic possibility.
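
For comparison, the three definitions just mentioned can be set side by side (the labels EP1 through EP3 and the operator K for ‘it is known that’ are introduced here only for exposition):

  \mathrm{EP}_1(p) \iff p \text{ is consistent with everything the subject knows}

  \mathrm{EP}_2(p) \iff \neg K \neg p \quad (\text{it is not known that } \neg p)

  \mathrm{EP}_3(p) \iff \neg p \text{ is not knowable a priori}

On a version of EP3, a metaphysically impossible claim such as ‘water is XYZ’ counts as epistemically possible because it cannot be ruled out a priori; its seeming metaphysically possible is then explained by this epistemic possibility.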

This type of account claims that a person subject to a modal illusion can, and usually does, have a metaphysical impossibility in mind, but it also claims that when the person believes the proposition expressed by the sentence ‘Hesperus is distinct from Phosphorus’ is contingently false, the proposition the person thinks is contingently false is the proposition expressed by the sentence and not some other. It is not that she believes that the sentence could have expressed something else and thus could have been true. Rather, she believes of the proposition expressed that it could have been true.

Possibility Accounts are thought to be able to explain those modal illusions that the other two types of accounts cannot explain. For example, when the person in the know says, “I know that it is impossible that p but it still sure seems like p could have been the case,” the Possibility Account argues that the subject can at once know that p is (metaphysically) impossible and be struck by the feeling that p is possible if p is possible in some other sense. Consider, too, the failed attempts to explain the modal illusion that this very table could have been made of ice. If this table could have been made of ice in some other sense, then the reason one might think that it could have been made of ice (in a metaphysical sense) is clear. Possibility accounts then must be able to explain how these impossibilities are possible in some other sense.

Stephen Yablo, a prominent defender of Possibility Accounts of modal illusions, claims that while water could not have been XYZ in a metaphysical sense, water could have been XYZ in a “conceptual” sense: if p is conceptually possible, then p could have turned out to be the case. Yablo explains that if p is metaphysically possible then p could have turned out to have been the case. There are certain propositions that while metaphysically impossible are conceptually possible. Such a proposition p could not have turned out to have been the case even though it could have turned out to be the case. This explains modal illusions of the “necessary a posteriori”. All of the examples so far considered are conceptually possible even though they are metaphysically impossible. (a1), (b1), and (c1) could have turned out to be so.

Yablo insists that conceptual possibility should not be reduced to the a priori, but without reducing it, ‘conceptual possibility’ could be cashed out in any number of ways. For instance, consider again Goldbach’s Conjecture. In some sense, either g or ¬g “could turn out to be the case” since we don’t know which is true. But in another sense, only one of g and ¬g could turn out to be the case since, if g is false, it is not only necessarily false but logically impossible. Even though we don’t know right now whether g or ¬g is true, only one of them could turn out to be true in a certain sense. It is therefore not clear whether something such as Goldbach’s Conjecture could turn out to be true.

On the other hand, Yablo (1993, pp. 29-30) argues that it is conceptually possible that “there should be a town whose resident barber shaved all and only the town’s non-shavers.” This means that it could have turned out to be the case that there is a town whose resident barber shaves all and only the town’s non-shavers. However, it certainly could not have turned out to be the case that there is such a town. The example is different from Goldbach’s Conjecture. In the Goldbach case, we do not know which member of the pair is the necessary falsehood, so in some sense the necessary falsehood could turn out to be the case. In the barber case, however, we know that the proposition is false, so it could not turn out to be true. And if it could not turn out to be the case, then such a town is not conceptually possible, contrary to Yablo’s claims.
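
The logical impossibility at issue can be made explicit with a textbook formalization (added here for illustration; the quantifiers are restricted to the residents of the town). The barber sentence asserts:

  \exists x\, \forall y\, [\mathrm{Shaves}(x, y) \leftrightarrow \neg \mathrm{Shaves}(y, y)]

Instantiating y to the barber x himself gives \mathrm{Shaves}(x, x) \leftrightarrow \neg \mathrm{Shaves}(x, x), a contradiction. Unlike the Goldbach case, no further information is needed to see that the barber scenario is false; it is ruled out by logic alone.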

Other Possibility Accounts avoid this problem by defining ‘epistemic possibility’ or ‘conceptual possibility’ in another way. For example, Scott Soames says that p is epistemically possible if and only if p is a way the world could conceivably be and that p is a way the world could conceivably be if we need evidence to rule out that it is the way the world is. For example, it is epistemically possible that water is XYZ because it is conceivable that the world is such that water is composed of XYZ. We do need evidence to rule out that this is the way the world is because we need evidence to know the composition of water. For all instances of the “necessary a posteriori”, one does need evidence to rule out metaphysical impossibilities that are epistemically possible. On the other hand, one does not need evidence to rule out that the world is such that there is a town whose resident barber shaves all and only the town’s non-shavers. This is not epistemically possible and not an example of the “necessary a posteriori”.

According to the schema Soames offers to identify instances of the “necessary a posteriori”, (a) is not an example of the “necessary a posteriori”. Soames argues that the proposition expressed by the sentence ‘Hesperus is Phosphorus’ is necessary, but it is not a posteriori since the proposition expressed is something like “Venus is Venus.” Clearly we do not need empirical evidence to know that this is true, and we do not need empirical evidence to rule out that the world is such that Venus is not Venus (or that Hesperus is not Phosphorus). If we do not need evidence to rule out that this is the way the world is, then it is not epistemically possible.

The problem with Soames’ account is that we did need evidence to know that Hesperus is Phosphorus. The ancients who made this discovery did not do it from the armchair; they needed empirical evidence. Soames claims that “the function of empirical evidence needed for knowledge that Hesperus is Phosphorus is not to rule out possible world-states in which the proposition is false…evidence is needed to rule out possible states in which we use the sentence … to express something false.” The ancients, though, did not need empirical evidence to rule out worlds in which the sentence is used to express something false. They needed evidence to know that Hesperus and Phosphorus were the same.

Furthermore, it seems that Soames could argue similarly regarding the other two examples: Soames could say that we did not need evidence to rule out a possible world-state in which the proposition that water is H2O is false, but we needed evidence to rule out possible states in which we use the sentence ‘water is H2O’ to express something false. This is similar to what the Two-Dimensionalists argue, although Soames himself gives rather forceful and convincing arguments against Two-Dimensionalism. He does not adopt this strategy for either of the other two examples. Although Soames’ general explanation is promising, it is a problem that he rejects the explanation for one important example of modal illusions of the “necessary a posteriori”.

A Possibility Account might say that a philosophical zombie is epistemically possible but not metaphysically possible or that pain in the absence of C-Fiber stimulation is epistemically possible but not metaphysically possible. This is a common position taken by those who adopt a Possibility Account. Chalmers (1996, p. 137) explains: “On this position, ‘zombie worlds’ may correctly describe the world that we are conceiving, even according to secondary intensions. It is just that the world is not metaphysically possible.” Chalmers (1996, p. 131) claims that this is “by far the most common strategy used by materialists” and recognizes Bigelow and Pargetter (1990) and Byrne (1993) among that camp.

However, not all Possibility Accounts defend this view in the way Chalmers describes. According to some Possibility Accounts, the reason the examples of the “necessary a posteriori” strike some people as contingent is because one cannot know that their negations are false a priori. Because we cannot know that the propositions expressed by the sentences ‘philosophical zombies exist’ and ‘pain is not C-Fiber stimulation’ are false a priori, these are epistemic possibilities. Since they are epistemically possible, it might seem to some people that they are metaphysically possible even if they are not—even if physicalism is true.

On the other hand, one could adopt a Possibility Account and deny physicalism. In that case, one could allow that philosophical zombies and pain in the absence of C-Fiber stimulation are both epistemically possible and metaphysically possible. One could adopt a Possibility Account of modal illusions but deny that the dualist intuitions count as modal illusions. Accordingly, the propositions expressed by the sentences ‘philosophical zombies do not exist,’ and ‘pain is C-Fiber stimulation’ would not count as genuine instances of the “necessary a posteriori”.

9. Objections

a. Conceivability and Possibility

There is a common view that conceivability implies possibility. Gendler and Hawthorne (2002) discuss this alleged implication in detail in their introduction to Conceivability and Possibility. According to this view, it cannot both be true that water could not have been XYZ and that someone might conceive that water is XYZ. If conceivability implies possibility and a person conceives that water is or could have been XYZ, then it must be possible that water could have been XYZ. However, given Kripke’s convincing arguments, most will reject this conclusion. On the other hand, if conceivability implies possibility and water could not have been XYZ, then a person who says she conceives that water is or could have been XYZ must not really be conceiving what she claims to conceive. This motivates some who adopt a view claiming the belief needs to be redescribed. Given the objections to such accounts (including the strong objection that we do believe impossibilities), it seems equally objectionable to claim that the person is not really conceiving of water when she claims to conceive that water might have been XYZ.

There does not seem to be an independent reason to maintain the link between conceivability and possibility. If conceivability does not imply possibility, then it might be the case that while water could not have been XYZ, one might conceive that it could have been, and some version of a Possibility Account would thereby have more force. Moreover, while there is no apparent independent reason to maintain the link, there are many reasons to reject it. First of all, our modal intuitions are not infallible, so we would have no reason to believe that whatever seems possible is possible. To think so is to give more credit than is due to our modal intuitions. If our modal intuitions were infallible, we would be unable to explain other modal errors that we make, such as our mathematical errors. Secondly, modal justification itself is not something philosophers have come to agree upon. We are still not sure what justifies our modal knowledge, and so we cannot hold, at this time, that our modal intuitions always count as knowledge. Finally, our a posteriori justification in general is fallible. Since this is so, we have good reason to think that our a posteriori justification when it comes to modal truths might also be fallible.

b. Impossible Worlds

Chalmers objects to Possibility Accounts, or what he calls “two-senses views,” because he believes such accounts are committed to incorporating impossible worlds into their metaphysics. If p is impossible yet epistemically possible, it must be true at some world; but if p is metaphysically impossible, it is true in no possible world. Therefore, it seems that there are metaphysically impossible worlds in which, or at which, p is true. The idea of countenancing worlds that are impossible strikes many philosophers as highly problematic.

However, not all Possibility Accounts, or two-senses views, are committed to impossible worlds. If the definition of ‘possibility’ relies on possible worlds, this might be a valid concern, but not all Possibility Accounts rely on such a definition. For example, Yablo makes no mention of possible worlds. According to Yablo, p is conceptually possible if p is a way the world could have turned out to be. Yablo (1996) insists that a way the world could have turned out to be is not a possible world; it is not an entity at all. A way the world could have been or could be is analogous to a way one feels or a way a bird might build a nest, and when one talks about a way a bird might build a nest, one does not make reference to a thing.

c. Metaphysical Possibility

Perhaps the most forceful objection to a Possibility Account is that it presumes there is some sort of primitive notion of metaphysical modality that is left undefined, one that cannot be identified or analyzed in non-modal terms. Those who use the terms ‘metaphysically necessary’ or ‘metaphysically possible’ have only explained how they use these terms, but no one has given an analysis of what they mean. The question arises as to what may be meant by ‘water is necessarily H2O’: if this does not just reduce to possible worlds or to the a priori, then what does it reduce to, if anything?

Some have argued that these notions are vague and that, although there are examples of what most people mean by metaphysically necessary and possible, there is no clear way to decide what counts as metaphysically possible in the problematic cases, including cases that have the dualists’ concerns at their center.

This is a strong objection but perhaps not an insurmountable one. While there are no clear definitions of these terms in the literature, most philosophers who use them have a basic understanding of what they mean. There is some intuitive sense that philosophers, following Kripke, have in mind. Furthermore, philosophers and non-philosophers alike do think that, although things are one way, some things could have been otherwise. It is this notion that philosophers are referring to when they use the term ‘metaphysical possibility.’ Kripke himself recognizes that there are no good definitions for these terms and that there are no necessary and sufficient conditions spelled out for either metaphysical necessity or metaphysical possibility. Still, we have a basic understanding of these notions. If p is necessary, p could not have been otherwise and ¬p could not have been true. If p is false but possible, then p could have been the case even though it is not actually the case.

10. References and Further Reading

  • Barcan Marcus, R. (1983). Rationality and Believing the Impossible. Journal of Philosophy, Vol. 80, No. 6 (June 1983), pp. 321-388.
  • Bealer, G. (2004). The Origins of Modal Error. Dialectica, Vol. 58, pp. 11-42.
  • Bealer, G. (2002). Modal Epistemology and the Rationalist Renaissance. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press.
  • Bigelow, J. & Pargetter, R. (1990). Acquaintance With Qualia. Theoria, Vol. 56, pp. 129-147.
  • Byrne, A. (2007). Possibility and Imagination. Philosophical Perspectives, 21, pp. 125-144.
  • Byrne, A. (1993). The Emergent Mind. Ph.D. Dissertation, Princeton University.
  • Chalmers, D. (2007). Propositions and Attitude Ascriptions: A Fregean Account. Nous.
  • Chalmers, D. (2006). The Foundations of Two-Dimensional Semantics. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 55-140.
  • Chalmers, D. (2002). Does Conceivability Entail Possibility? In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 145-200.
  • Chalmers, D. (1996). The Conscious Mind. Oxford & New York: Oxford University Press.
  • Della Rocca, M. (2002). Essentialism versus Essentialism. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 223-252.
  • Descartes, R. (1996). Meditations on First Philosophy, translated by J. Cottingham. Cambridge: Cambridge University Press.
  • Descartes, R. (1983). Principles of Philosophy, translated by V.R. Miller & R.P. Miller. Dordrecht: D. Reidel.
  • Evans, G. (2006). Comments on ‘Two Notions of Necessity.’ In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 176-180.
  • Fine, K. (2002). The Varieties of Necessity. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 253-282.
  • Garcia-Carpintero, M. & Macia, J. (eds.). (2006). Two-Dimensional Semantics. Oxford & New York: Oxford University Press.
  • Gendler, T.S. & Hawthorne, J. (eds.). (2002). Conceivability and Possibility. Oxford & New York: Oxford University Press.
  • Hill, C. (2006). Modality, Modal Epistemology, and the Metaphysics of Consciousness. In S. Nichols (ed.), The Architecture of Imagination. Oxford: Oxford University Press. pp. 205-235.
  • Hill, C. (1997). Imaginability, Conceivability, Possibility, and the Mind-Body Problem. Philosophical Studies, 87, pp. 61-85.
  • Hill, C. (1991). Sensations: A Defense of Type Materialism. Cambridge: Cambridge University Press.
  • Jackson, F. (2003). Mind and Illusion. In A. O’Hear (ed.), Minds and Persons. Cambridge: Cambridge University Press. pp. 251-272.
  • Jackson, F. (1997). From Metaphysics to Ethics: A Defense of Conceptual Analysis. Oxford: Oxford University Press.
  • Kripke, S. (1972). Naming and Necessity. Cambridge, MA: Harvard University Press.
  • Kripke, S. (1971). Identity and Necessity. In M.K. Munitz (ed.), Identity and Individuation. New York: New York University Press.
  • Loar, B. (1990). Phenomenal States. Philosophical Perspectives, Vol. 4, pp. 81-108.
  • Ludwig, K. (2003). The Mind-Body Problem: An Overview. In T. A. Warfield & S. P. Stich (eds.), The Blackwell Guide to the Philosophy of Mind. Malden, MA: Blackwell. pp. 1-46.
  • Lycan, W. G. (1995). A Limited Defense of Phenomenal Information. In T. Metzinger (ed.), Conscious Experience. Paderborn: Schöningh. pp. 243-258.
  • Salmon, N. (2005). Reference and Essence. Amherst, NY: Prometheus Books.
  • Sidelle, A. (2002). On the Metaphysical Contingency of the Laws of Nature. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 309-366.
  • Soames, S. (2006). Kripke, the Necessary A Posteriori, and the Two-Dimensionalist Heresy. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 272-292.
  • Soames, S. (2005). Reference and Description. Princeton and Oxford: Princeton University Press.
  • Sorensen, R. (2006). Meta-Conceivability and Thought Experiments. In S. Nichols (ed.), The Architecture of Imagination. Oxford: Oxford University Press. pp. 257-272.
  • Sorensen, R. (2002). The Art of the Impossible. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 337-368.
  • Sorensen, R. (1996). Modal Bloopers: Why Believable Impossibilities are Necessary. American Philosophical Quarterly, 33 (1), pp. 247-261.
  • Sorensen, R. (1992). Thought Experiments. New York: Oxford University Press.
  • Tye, M. (1995). Ten Problems of Consciousness. Cambridge, MA: MIT Press.
  • Wong, K. (2006). Two-Dimensionalism and Kripkean A Posteriori Necessity. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 310-326.
  • Yablo, S. (2006). No Fool’s Cold: Notes on Illusions of Possibility. In M. Garcia-Carpintero and J. Macia (eds.), Two-Dimensional Semantics. Oxford: Oxford University Press. pp. 327-346.
  • Yablo, S. (2001). Coulda Shoulda Woulda. In T. Gendler & J. Hawthorne (eds.), Conceivability and Possibility. Oxford: Oxford University Press. pp. 441-492.
  • Yablo, S. (2000). Textbook Kripkeanism and the Open Texture of Concepts. Pacific Philosophical Quarterly, 81, pp. 98-122.
  • Yablo, S. (1996). How In the World? Philosophical Topics, 24, pp. 255-286.
  • Yablo, S. (1993). Is Conceivability a Guide to Possibility? Philosophy and Phenomenological Research, 53, pp. 1-42.

Author Information

Leigh Duffy
Email: duffy.leigh@gmail.com
Buffalo State College
U. S. A.

Foundationalism

Epistemic foundationalism is a view about the proper structure of one’s knowledge or justified beliefs.  Some beliefs are known or justifiably believed only because some other beliefs are known or justifiably believed.  For example, you can know that you have heart disease only if you know some other claims such as your doctors report this and doctors are reliable.  The support these beliefs provide for your belief that you have heart disease illustrates that your first belief is epistemically dependent on these other two beliefs.  This epistemic dependence naturally raises the question about the proper epistemic structure for our beliefs.  Should all beliefs be supported by other beliefs?  Are some beliefs rightly believed apart from receiving support from other beliefs?  What is the nature of the proper support between beliefs?  Epistemic foundationalism is one view about how to answer these questions.  Foundationalists maintain that some beliefs are properly basic and that the rest of one’s beliefs inherit their epistemic status (knowledge or justification) in virtue of receiving proper support from the basic beliefs.  Foundationalists have two main projects: a theory of proper basicality (that is, a theory of noninferential justification) and a theory of appropriate support (that is, a theory of inferential justification).

Foundationalism has a long history.  Aristotle in the Posterior Analytics argues for foundationalism on the basis of the regress argument.  Aristotle assumes that the alternatives to foundationalism must either endorse circular reasoning or land in an infinite regress of reasons.  Because neither of these views is plausible, foundationalism comes out as the clear winner in an argument by elimination.  Arguably, the most well known foundationalist is Descartes, who takes as the foundation the allegedly indubitable knowledge of his own existence and the content of his ideas.  Every other justified belief must be grounded ultimately in this knowledge.

The debate over foundationalism was reinvigorated in the early part of the twentieth century by the debate over the nature of the scientific method.  Otto Neurath (1959; original 1932) argued for a view of scientific knowledge illuminated by the raft metaphor according to which there is no privileged set of statements that serve as the ultimate foundation of knowledge; rather, knowledge arises out of a coherence among the set of statements we accept.  In opposition to this raft metaphor, Moritz Schlick (1959; original 1932) argued for a view of scientific knowledge akin to the pyramid image in which knowledge rests on a special class of statements whose verification doesn’t depend on other beliefs.

The Neurath-Schlick debate transformed into a discussion over the nature and role of observation sentences within a theory.  Quine (1951) extended this debate with his metaphor of the web of belief in which observation sentences are able to confirm or disconfirm a hypothesis only in connection with a larger theory.  Sellars (1963) criticizes foundationalism as endorsing a flawed model of the cognitive significance of experience.  Following the work of Quine and Sellars, a number of people arose to defend foundationalism (see the section below on modest foundationalism).  This touched off a burst of activity on foundationalism in the late 1970s to early 1980s.  One of the significant developments from this period is the formulation and defense of reformed epistemology, a foundationalist view that proposes foundational beliefs such as the belief that there is a God (see Plantinga (1983)).  While the debate over foundationalism has abated in recent decades, new work has picked up on neglected topics about the architecture of knowledge and justification.

Table of Contents

  1. Knowledge and Justification
  2. Arguments for Foundationalism
    1. The Regress Argument
    2. Natural Judgment about Cases
  3. Arguments against Foundationalism
    1. The Problem of Arbitrariness
    2. The Sellarsian Dilemma
  4. Types of Foundationalist Views
    1. Theories of Noninferential Justification
      1. Strong Foundationalism
      2. Modest Foundationalism
      3. Weak Foundationalism
    2. Theories of Proper Inference
      1. Deductivism
      2. Strict Inductivism
      3. Liberal Inductivism
      4. A Theory of Inference and A Theory of Concepts
  5. Conclusion
  6. References and Further Reading

1. Knowledge and Justification

The foundationalist attempts to answer the question: what is the proper structure of one’s knowledge or justified beliefs? This question assumes a prior grasp of the concepts of knowledge and justification.  Before the development of externalist theories of knowledge (see entry on internalism and externalism in epistemology) it was assumed that knowledge required justification.  On a standard conception of knowledge, knowledge was justified true belief.  Thus investigation on foundationalism focused on the structural conditions for justification.  How should one’s beliefs be structured so as to be justified?  The following essay discusses foundationalism in terms of justification (see BonJour (1985) for a defense of the claim that knowledge requires justification).  Where the distinction between justification and knowledge is relevant (for example, weak foundationalism), this article will observe it.

What is it for a belief to be justified?  Alvin Plantinga (1993) observes that the notion of justification is heavily steeped in deontological terms, terms like rightness, obligation, and duty.  A belief is justified for a person if and only if the person is right to believe it or the subject has fulfilled her intellectual duties relating to the belief.  Laurence BonJour (1985) presents a slightly different take on the concept of justification, stating that it is “roughly that of a reason or warrant of some kind meeting some appropriate standard” (pp. 5-6).  This ‘appropriate standard’ conception of justification permits a wider understanding of the concept of justification.  BonJour, for instance, takes the distinguishing characteristic of justification to be “its essential or internal relation to the cognitive goal of truth” (p. 8).  Most accounts of justification assume some form of epistemic internalism.  Roughly speaking, this is the view that a belief’s justification does not require that it meet some condition external to a subject’s perspective, conditions such as being true, being produced by a reliable process, or being caused by the corresponding fact (see entry on internalism and externalism in epistemology).  All the relevant conditions for justification are ‘internal’ to a subject’s perspective.  These conditions vary from facts about a subject’s occurrent beliefs and experiences to facts about a subject’s occurrent and stored beliefs and experiences and further to facts simply about a subject’s mind, where this may include information that, in some sense or other, a subject has difficulty bringing to explicit consciousness.  Although some externalists offer accounts of justification (see Goldman (1979) & Bergmann (2006)), this article assumes that justification is internalistic.  Externalists have a much easier time addressing concerns over foundationalism.  It is a common judgment that the foundationalist / coherentist debate takes place against the backdrop of internalism (see BonJour (1999)).

2. Arguments for Foundationalism

This section discusses prominent arguments for a general type of foundationalism.  Section 4, on varieties of foundationalism, discusses more specific arguments aimed to defend a particular species of foundationalism.

a. The Regress Argument

The epistemic regress problem has a long history.  Aristotle used the regress argument to prove that justification requires basic beliefs, beliefs that are not supported by any other beliefs but are able to support further beliefs (see Aristotle’s Posterior Analytics I.3:5-23).  The regress problem was prominent in the writings of the ancient skeptics, especially Sextus Empiricus’s Outlines of Pyrrhonism and Diogenes Laertius’s “The Life of Pyrrho” in his book The Lives and Opinions of Eminent Philosophers.  In the 20th century the regress problem has received new life in the development of the coherentist and infinitist options (see BonJour (1985) and Klein (1999), respectively).

To appreciate the regress problem, begin with the thought that the best way to have a good reason for some claim is to have an argument from which the claim follows.  Thus one possesses good reason to believe p when it follows from the premises q and r.  But then we must inquire about the justification for believing the premises.  Does one have a good argument for the premises?  Suppose one does.  Then we can inquire about the justification for those premises.  Does one have an argument for those claims?  If not, then it appears one lacks a good reason for the original claim because the original claim is ultimately based on claims for which no reason is offered.  If one does have an argument for those premises, then either one will continue to trace out the chain of arguments to some premises for which no further reason is offered, or one will trace out the chain of arguments until one loops back to the original claims, or one will continue to trace out the arguments without end.  We can then begin to see the significance of the regress problem: is it arguments all the way down?  Does one eventually come back to premises that appeared in earlier arguments or does one eventually come to some ultimate premises, premises that support other claims but do not themselves require any additional support?

Skepticism aside, the options in the regress problem are known as foundationalism, coherentism, and infinitism.  Foundationalists maintain that there are some ultimate premises, premises that provide good reasons for other claims but themselves do not require additional reasons.  These ultimate premises are the proper stopping points in the regress argument.  Foundationalists hold that the other options for ending the regress are deeply problematic and that consequently there must be basic beliefs.

Coherentists and infinitists deny that there are any ultimate premises.  A simple form of coherentism holds that the arguments for any claim will eventually loop back to the original claim itself.  As long as the circle of justifications is large enough, it is rationally acceptable.  After all, every claim is supported by some other claim and, arguably, the claims fit together in such a way as to provide an explanation of their truth (see Lehrer (1997), Chs 1 & 2).

Infinitists think that both the foundationalist and coherentist options are epistemically objectionable.  Infinitists (as well as coherentists) claim that the foundationalist option lands in arbitrary premises, premises that are alleged to support other claims but themselves lack reasons.  Against the coherentist, infinitists claim that coherentism simply endorses circular reasoning: no matter how big the circle, circular arguments do not establish that the original claim is true.  Positively, infinitists maintain that possessing a good reason for a claim requires that it be supported by an infinite string of non-repeating reasons (see Klein (1999)).
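
The three positions can be pictured schematically (an editorial summary of the structures just described, where B_i \leftarrow B_j abbreviates ‘belief B_i is justified by belief B_j’):

  \text{Foundationalism:}\quad B_1 \leftarrow B_2 \leftarrow \cdots \leftarrow B_n \quad (B_n \text{ is basic, requiring no further reason})

  \text{Simple coherentism:}\quad B_1 \leftarrow B_2 \leftarrow \cdots \leftarrow B_n \leftarrow B_1 \quad (\text{a closed loop of mutual support})

  \text{Infinitism:}\quad B_1 \leftarrow B_2 \leftarrow B_3 \leftarrow \cdots \quad (\text{no end and no repetition})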

Foundationalists use the regress argument to set up the alternative epistemological positions and then proceed to knock down these positions.  Foundationalists argue against infinitism that we never possess an infinite chain of non-repeating reasons.  At best, when we consider the justification for some claim, we are able to carry this justification out several steps, but we never come close to anything resembling an unending chain of justifications.  For this criticism and others of infinitism see Fumerton (1998).

Against the coherentist, the foundationalist agrees with the infinitist’s criticism mentioned above that circular reasoning never justifies anything.  If p is used to support q, then q itself cannot be used in defense of p, no matter how many intermediate steps there are between q and p.  This verdict against simple coherentism is strong, but the foundationalist strategy is complicated by the fact that it is hard to find an actual coherentist who endorses circular reasoning (though see Lehrer (1997), Chs 1 and 2, for remarks about the circular nature of explanation).  Coherentists, rather, identify the assumption of linear inference in the regress argument and replace it with a stress on the holistic character of justification (see BonJour (1985)).  The assumption of linear inference in the regress argument is clearly seen above in the idea that the regress traces out arguments for some claim, where the premises of those arguments are known or justifiably believed prior to the conclusion being known or justifiably believed.  The form of coherentism that rejects this assumption in the regress argument is known as holistic coherentism.

Foundationalist arguments against holistic coherentism must proceed with more care.  Because holistic coherentists disavow circular reasoning and stress the mistaken role of linear inference in the regress argument, foundationalists must supply a different argument against this option.  A standard argument against holistic coherentism is that unless the data used for coherence reasoning has some initial justification it is impossible for coherence reasoning to provide justification.  This problem affected Laurence BonJour’s attempt to defend coherentism (see BonJour (1985), pp. 102-3).  BonJour argued that coherence among one’s beliefs provided excellent reason to think that those beliefs were true.  But BonJour realized that he needed an account of how one was justified in believing that one had certain beliefs, i.e., what justified one in thinking that one did indeed hold the system of beliefs one takes oneself to believe.  BonJour quickly recognized that coherence couldn’t provide justification for this belief but it wasn’t until later in his career that he deemed this problem insuperable for a pure coherentist account (see BonJour (1997) for details).

The regress problem provides a powerful argument for foundationalism.  The regress argument, though, does not resolve particular questions about foundationalism.  The regress provides little guidance about the nature of basic beliefs or the correct theory of inferential support.  As we just observed with the discussion of holistic coherentism, considerations from the regress argument show, minimally, that the data used for coherence reasoning must have some initial presumption in its favor.  This form of foundationalism may be far from the initial hope of a rational reconstruction of common sense.  Such a reconstruction would amount to setting out in clear order the arguments for various commonsense claims (for example, I have hands, there is a material world, I have existed for more than five minutes, and so on) that exhibits the ultimate basis for our view of things.  We shall consider the issues relating to varieties of foundationalist views below.

b. Natural Judgment about Cases

Another powerful consideration for foundationalism is our natural judgment about particular cases.  It seems evident that some beliefs are properly basic.  Leibniz, for instance, gives several examples of claims that don’t “involve any work of proving” and that “the mind perceives as immediately as the eye sees light” (see New Essays, IV, chapter 2, 1).  Leibniz mentions the following examples:

White is not black.

A circle is not a triangle.

Three is one and two.

Other philosophers (for example, C.I. Lewis, Roderick Chisholm, and Richard Fumerton) have found examples of such propositions in appearance states (traditionally referred to as the given).  For instance, it may not be evident that there is a red circle before one because one may be in a misleading situation (for example, a red light shining on a white circle).  However, if one carefully considers the matter, one may be convinced that something appears red.  Foundationalists stress that it is difficult to see what one could offer as a further justification for the claim about how things seem to one.  In short, truths about one’s appearance states are excellent candidates for basic beliefs.

As we shall see below a feature of this appeal to natural judgment is that it can support strong forms of foundationalism.  Richard Fumerton maintains that for some cases, for example, pain states, one’s belief can reach the highest level of philosophical assurance (see Fumerton (2006)).  Other philosophers (for example, James Pryor (2000)) maintain that some ordinary propositions, such as I have hands, are foundational.

3. Arguments Against Foundationalism

This section examines two general arguments against foundationalism.  Arguments against specific incarnations of foundationalism are considered in section 4.

a. The Problem of Arbitrariness

As noted above, the regress argument figures prominently in arguing for foundationalism.  The regress argument supports the conclusion that some beliefs must be justified independently of receiving warrant from other beliefs.  However, some philosophers judge that this claim amounts to accepting some beliefs as true for no reason at all, that is, as epistemically arbitrary beliefs.  This objection has significant bite against a doxastic form of foundationalism (the language of ‘doxastic’ comes from the Greek word ‘doxa’ meaning belief).  Doxastic foundationalism is the view that the justification of one’s beliefs is exclusively a matter of what other beliefs one holds.  Regarding the basic beliefs, a doxastic foundationalist holds that these beliefs are ‘self-justified’ (see Pollock & Cruz (1999), 22-23).  The contents of the basic beliefs are typically perceptual reports, but importantly a doxastic foundationalist does not conceive of one’s corresponding perceptual state as a reason for the belief.  Doxastic foundationalists hold that one is justified in accepting a perceptual report simply because one has the belief.  However, given the fallibility of perceptual reports, it is epistemically arbitrary to accept a perceptual report for no reason at all.

The arbitrariness objection against non-doxastic theories must proceed with more care.  A non-doxastic form of foundationalism denies that justification is exclusively a matter of relations between one’s beliefs.  Consider a non-doxastic foundationalist who attempts to stop the regress with non-doxastic states like experiences.  This foundationalist claims that, for example, the belief that there is a red disk before one is properly basic.  This belief is not justified on the basis of any other beliefs but is instead justified by the character of one’s sense experience.  Because one can tell by reflection alone that one’s experience has a certain character, the experience itself provides one with an excellent reason for the belief.  The critic of non-doxastic foundationalism argues that stopping with this experience is arbitrary.  After all, there are scenarios in which this experience is misleading.  If, for example, the disk is white but illuminated with red light, then one’s experience will mislead one to think that the disk is really red.  Unless one has a reason to think that these scenarios fail to obtain, it is improper to stop the regress of reasons here.

One foundationalist solution to the arbitrariness problem is to move to epistemically certain foundations.  Epistemically certain foundations are beliefs that cannot be misleading and so cannot provide a foothold for arbitrariness concerns.  If, for instance, one’s experience is of a red disk and one believes just that one’s experience has this character, it is difficult to see how one’s belief could be mistaken in this specific context.  Consequently, it is hard to make sense of how one’s belief about the character of one’s experience could be epistemically arbitrary.  In general, many foundationalists want to resist this move.  First, relative to the large number of beliefs we have, there are few epistemically certain beliefs.  Second, even if one locates a few epistemically certain beliefs, it is very difficult to reconstruct our common-sense view of the world from those beliefs.  If the ultimate premises of one’s view include only beliefs about the current character of one’s sense experience, it is nearly impossible to figure out how to justify beliefs about the external world or the past.

Another foundationalist response to the arbitrariness argument is to note that it is merely required that a properly basic belief possess some feature in virtue of which the belief is likely to be true.  It is not required that a subject believe her belief possesses that feature.  This response has the virtue of allowing for modest forms of foundationalism in which the basic beliefs are less than certain.  Critics of foundationalism continue to insist that unless the subject is aware that the belief possesses this feature, her belief is an improper stopping point in the regress of reasons.  For a defense of the arbitrariness objection against foundationalism see Klein (1999) & (2004), and for responses to Klein see Bergmann (2004), Howard-Snyder & Coffman (2006), Howard-Snyder (2005), and Huemer (2003).

b. The Sellarsian Dilemma

The Sellarsian dilemma was first formulated in Wilfrid Sellars’s rich, but difficult, essay “Empiricism and the Philosophy of Mind.”  Sellars’s main goal in this essay is to undermine the entire framework of givenness ((1963), p. 128).  Talk of ‘the given’ was prevalent in earlier forms of foundationalism (see, for example, C.I. Lewis (1929), Ch 2).  The phrase ‘the given’ refers to elements of experience that are putatively immediately known in experience.  For instance, if one looks at a verdant golf course the sensation green is alleged to be given in experience. In a Cartesian moment one may doubt whether or not one is actually perceiving a golf course but, the claim is, one cannot rationally doubt that there is a green sensation present. Strong foundationalists appeal to the given to ground empirical knowledge.  In “Empiricism and the Philosophy of Mind” Sellars argues that the idea of the given is a myth.

The details of Sellars’s actual argument are difficult to decipher.  The most promising reconstruction of Sellars’s argument occurs in chapter 4 of BonJour (1985).  BonJour formulates the dilemma using the notion of ‘assertive representational content’.  Representational content is the kind of content possessed by beliefs, hopes, and fears.  A belief, a hope, or a fear could be about the same thing; one could believe that it is raining, hope that it is raining, or fear that it is raining.  These states all have in common the same representational content.  Assertive representational content is content that is presented as being true but may, in fact, be false.  A good case of assertive content comes from the Müller-Lyer illusion.  In this well-known experiment a subject experiences two vertical lines as being unequal in length even though they have the same length.  The subject’s experience presents as true the content that these lines are unequal.

Given the notion of assertive representational content, BonJour reformulates the Sellarsian dilemma: either experience has assertive representational content or it does not.  If experience has assertive representational content, then one needs an additional reason to think that the content is correct.  If, however, experience lacks this content, then experience cannot provide a reason for thinking that some proposition is true.  The dilemma focuses on non-doxastic foundationalism and is used to argue that however the view is filled out, it cannot make good on the intuition that experience is a proper foundation for justification.

Let us examine each option of the dilemma, starting with the second option.  A defense of this option observes that it is difficult to understand how experience could provide a good reason for believing some claim if it failed to have representational content.  Think of the olfactory experience associated with a field of flowers in full bloom.  Apart from a formed association between that experience and its cause, it is difficult to understand how that experience has representational content.  In other words, the experience lacks any content; it makes no claim that the world is one way rather than another.  However, if that is right, how could that experience provide any reason for believing that the world is one way rather than another?  If the experience itself is purely qualitative then it cannot provide a reason to believe that some proposition is true.  In short, there is a strong judgment that apart from the representational content of experience, experience is powerless to provide reasons.

A defense of the first option of the dilemma takes us back to issues raised by the arbitrariness objection.  If experience does have assertive representational content then that content can be true or false.  If the content is possibly false, the experience is not a proper stopping point in the regress of reasons.  The whole idea behind the appeal to the given was to stop the regress of reasons in a state that did not require further justification because it was not the sort of thing that needed justification.  If experience, like belief, has representational content then there is no good reason to stop the regress of reasons with experience rather than belief.  In brief, if experience functions as a reason in virtue of its assertive representational content then there is nothing special about experience as opposed to belief in its ability to provide reasons.  Since the arbitrariness objection shows that belief is not a proper stopping point in the regress, the Sellarsian dilemma shows that experience is not a proper stopping point either.

Probably the best foundationalist response to the Sellarsian dilemma is to argue that the first option of the dilemma is mistaken; experience has assertive representational content and can still provide a regress-stopping reason to believe that some claim is true.  There are broadly two kinds of responses here, depending on whether one thinks that the content of experience could be false.  On one view, experience carries a content that may be false, but this experiential content provides a basic reason for thinking that the content is true.  For instance, it may perceptually seem to one that there is a coffee mug on the right corner of the desk.  This content may be false, but in virtue of its being presented as true in experience one has a basic reason for thinking that it is true (see Pryor (2000) & Huemer (2001) for developments of this view).  The other view one might take is that experiential content—at least the kind that provides a basic reason—cannot be false.  On this view, the kind of content that experience provides for a basic reason is something like this: it perceptually seems that there is a red disk before me.  Laurence BonJour (in BonJour & Sosa (2003)) develops a view like this.  On his view, one has a built-in constitutive awareness of experiential content, and in virtue of that awareness of content one has a basic reason to believe that the content is true.  For a good criticism of BonJour’s strategy, see Bergmann (2006), Chapter 2.  For a different, externalist response to the dilemma see Jack Lyons (2008).

See the Encyclopedia article “Coherentism” for more criticism of foundationalism.

4. Types of Foundationalist Views

This section surveys varieties of foundationalist views.  As remarked above, foundationalists have two main projects: providing a suitable theory of noninferential justification and providing an adequate theory of proper inference.  We will examine three views on noninferential justification and three views on inferential justification.

a. Theories of Noninferential Justification

An adequate theory of noninferential justification is essential for foundationalism.  Foundationalist views differ on the nature of noninferential justification.  We can distinguish three types of foundationalist views corresponding to the strength of justification possessed by the basic beliefs: strong, modest, and weak foundationalism.  In the following we shall examine these three views and the arguments for and against them.

i. Strong Foundationalism

Strong foundationalists hold that the properly basic beliefs are epistemically exalted in some interesting sense.  In addition to basic beliefs possessing the kind of justification necessary for knowledge (let us refer to this as “knowledge-level justification”), strong foundationalists claim that the properly basic beliefs are infallible, indubitable, or incorrigible.  Infallible beliefs cannot possibly be false.  Indubitable beliefs cannot be doubted even though their content may be false, and incorrigible beliefs cannot be undermined by further information.  The focus on these exalted epistemic properties grows out of Descartes’ method of doubt.  Descartes aimed to locate secure foundations for knowledge and dismissed any claims that were fallible, dubitable, or corrigible.  Thus, Descartes sought the foundations of knowledge in restricted mental states like I am thinking.  Before we examine arguments against strong foundationalism, let us investigate some arguments in favor of it.

Probably the most widespread argument for strong foundationalism is the need for philosophical assurance concerning the truth of one’s beliefs (see Fumerton (2006)).  If one adopts the philosophical undertaking of tracing out the ultimate reasons for one’s view, it can seem particularly remiss to stop this quest with fallible, dubitable, or corrigible reasons.  As Descartes realized, if the possibility that one is dreaming is compatible with one’s evidence, then that evidence is not an adequate ground for a philosophically satisfying reconstruction of knowledge.  Consequently, if a philosophically satisfying perspective on knowledge is to be found, it will be located in foundations that are immune from doubt.

Another argument for strong foundationalism is C.I. Lewis’s contention that probability must be grounded in certainty (see Lewis (1952); also see Pastin (1975a) for a response to Lewis’s argument).  Lewis’s argument appeals explicitly to the probability calculus, but we can restate the driving intuition without utilizing any formal machinery.  Lewis reasoned that if a claim is uncertain then it is rationally acceptable only given further information.  If that further information is uncertain then it is acceptable only given additional information.  If this regress continues without ever arriving at a certainty, then Lewis conjectures that the original claim is not rationally acceptable.

We can get a sense of Lewis’s intuition by considering a conspiracy theorist who has a defense for every claim in his convoluted chain of reasoning.  We might think that, in general, the theorist is right about the conditional claims—if this is true then that is probably correct—but just plain wrong that the entire chain of arguments supports the conspiracy theory.  We correctly realize that the longer the chain of reasoning, the less likely it is that the conclusion is true.  The chance of error grows with the amount of information.  Lewis’s argument takes this intuition to its limit: unless uncertainties are grounded in certainties, no claim is ever rationally acceptable.
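
To illustrate the driving intuition with a rough calculation (the numbers here are purely illustrative, and the formalization is a gloss rather than Lewis’s own presentation), suppose each link in a chain of reasoning confers a probability of 0.9 on the next and that the links are probabilistically independent.  Then the probability that the final conclusion is true after n links is approximately

\[
P(\text{conclusion}) \approx 0.9^{\,n}, \qquad 0.9^{1} = 0.9, \quad 0.9^{5} \approx 0.59, \quad 0.9^{20} \approx 0.12 .
\]

On these assumptions the credibility of the conclusion dwindles as the chain lengthens, which is the intuition Lewis presses to its limit.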

Let us examine several arguments against strong foundationalism.  The most repeated argument against strong foundationalism is that its foundations are inadequate for a philosophical reconstruction of knowledge.  We take ourselves to know much about the world around us, from mundane facts about our immediate surroundings to more exotic facts about the far reaches of the universe.  Yet if the basic material for this reconstruction is restricted to facts about an individual’s own mind, it is nearly impossible to figure out how we can get back to our ordinary picture of the world.  In this connection, strong foundationalists face an inherent tension between the quest for epistemic security and the hope for suitable content to reconstruct commonsense.  Few strong foundationalists have been able to find a suitable balance between these competing demands.  Some philosophers with a more metaphysical bent aimed to reduce each statement about the material world to a logical construction of statements about an individual’s own sense experience.  This project is known as phenomenalism.  The phenomenalist’s guiding idea was that statements about the physical world were really complex statements about sensations.  If this guiding idea could be worked out, then strong foundationalists would have a clear conception of how the “commonsense” picture of the world could be justified.  However, this guiding idea could never be worked out.  See, for instance, Roderick Chisholm’s (1948) article.

Another argument against strong foundationalism is David Armstrong’s ‘distinct existence’ argument ((1968), 106-7).  Armstrong argues that there is a difference between an awareness of X and X, where X is some mental state.  For instance, there is a difference between being in pain and awareness of being in pain.  As long as awareness of X is distinct from X, Armstrong argues that it is possible for one to seemingly be aware of X without X actually occurring.  For instance, an intense pain that gradually fades away can lead to a moment in which one has a false awareness of being in pain.  Consequently, the thought that one can enjoy an infallible awareness of some mental state is mistaken.

A recent argument against strong foundationalism is Timothy Williamson’s anti-luminosity argument (see Williamson (2000)).  Williamson does not talk about foundationalism directly but rather about the ongoing temptation in philosophy to postulate a realm of luminous truths, truths that shine so brightly they are always open to our view if we carefully consider the matter.  Even so, his argument clearly applies to the strong foundationalist.  Williamson’s actual argument is intricate and we cannot go into it in much detail.  The basic idea is that appearance states (for example, its seeming as if there is a red item before you) permit of a range of similar cases.  Think of color samples.  There is a string of color samples from red to orange in which each shade is very similar to the next.  If appearance states genuinely provided certainty, indubitability, or the like, then one should always be able to tell what state one is in.  But there are cases so similar that one might make a mistake.  Thus, because appearance states ebb and flow in this gradual way, they cannot provide certainty, indubitability, or the like.  There is a burgeoning discussion of the anti-luminosity argument; see Fumerton (2009) for a strong foundationalist response and Meeker & Poston (2010) for a recent discussion and references.

ii. Modest Foundationalism

Prior to 1975 foundationalism was largely identified with strong foundationalism.  Critics of foundationalism attacked the claims that basic beliefs are infallible, incorrigible, or indubitable.  However, around this time there was a growing recognition that foundationalism was compatible with basic beliefs that lacked these epistemically exalted properties.  William Alston (1976a; 1976b), C.F. Delaney (1976), and Mark Pastin (1975a; 1975b) all argued that a foundationalist epistemology merely required that the basic beliefs have a level of positive epistemic status independent of warranting relations from other beliefs. In light of this weaker form of foundationalism the attacks against infallibility, incorrigibility, or indubitability did not touch the core of a foundationalist epistemology.

William Alston probably did the most to rehabilitate foundationalism.  Alston provides several interrelated distinctions that illustrate the limited appeal of certain arguments against strong foundationalism and also display the attractiveness of modest foundationalism.  The first distinction Alston drew was between epistemic beliefs and non-epistemic beliefs (see 1976a).  Epistemic beliefs are beliefs whose content contains an epistemic concept such as knowledge or justification, whereas a non-epistemic belief does not contain an epistemic concept.  The belief that there is a red circle before me is not an epistemic belief because its content does not contain any epistemic concepts.  However, the belief that I am justified in believing that there is a red circle before me is an epistemic belief on account of the epistemic concept justified figuring in its content.  Alston observes that prominent arguments against foundationalism tend to run these two types of belief together.  For instance, an argument against foundationalism might require that, to be justified in believing that p, one must justifiedly believe that one is justified in believing that p.  That is, the argument against foundationalism assumes that epistemic beliefs are required for the justification of non-epistemic beliefs.  As Alston sees it, once these two types of belief are clearly separated we should be suspicious of any such argument that requires epistemic beliefs for the justification of non-epistemic beliefs (for details see (1976a) and (1976b)).

A closely related distinction for Alston is the distinction between the state of being justified and the activity of exhibiting one’s justification.  Alston argues in a like manner that prominent objections to foundationalism conflate these two notions.  The state of being justified does not imply that one can exhibit one’s justification.  Reflection on actual examples supports Alston’s claim.  Grandma may be justified in believing that she has hands without being in a position to exhibit her justification.  Timmy is justified in believing that he has existed for more than five minutes, but he can do very little to demonstrate his justification.  Therefore, arguments against foundationalism should not assume that justification requires the ability to exhibit one’s justification.

A final, closely allied, distinction is between a justification regress argument and a showing regress argument.  Alston argues that the standard regress argument is a regress of justification that points to the necessity of immediately justified beliefs.  This argument is distinct from a showing regress in which the aim is to demonstrate that one is justified in believing p.  This showing regress requires that one prove, for each belief one has, that one is justified in holding it.  Given Alston’s earlier distinctions, this demand implies that one must have epistemic beliefs for each non-epistemic belief, and it also conflates the state of being justified with the activity of exhibiting one’s justification.

With these three distinctions in place and the further claim that immediately justified beliefs may be fallible, revisable, and dubitable, Alston makes quick work of the standard objections to strong foundationalism.  The arguments against strong foundationalism fail to apply to modest foundationalism and further have no force against the claim that some beliefs have a strong presumption of truth.  Reflection on actual cases supports Alston’s claim.  Grandma’s belief that she has hands might be false and might be revised in light of future evidence.  Perhaps Grandma has been fitted with a prosthetic device that looks and functions just like a normal hand.  Nonetheless, when she looks and appears to see a hand, she is fully justified in believing that she has hands.

Alston’s discussion of modest foundationalism does not mention weaker forms of foundationalism.  Further, Alston is not clear on the precise epistemic status of these foundations.  Alston describes the ‘minimal’ form of foundationalism as simply being committed to non-inferentially justified beliefs.  However, as we shall shortly see, BonJour distinguishes modest and weak forms of foundationalism.  For purposes of terminological regimentation we shall take ‘modest’ foundationalism to be the claim that the basic beliefs possess knowledge-adequate justification even though these beliefs may be fallible, corrigible, or dubitable.  A corollary to modest foundationalism is the thesis that the basic beliefs can serve as premises for additional beliefs.  The picture the modest foundationalist offers us, then, is that of knowledge (and justification) resting on a foundation of propositions whose positive epistemic status is sufficient to support inference to other beliefs but may be undermined by further information.

A significant development in modest foundationalism is the rise of reformed epistemology.  Reformed epistemology is a view in the epistemology of religious belief which holds that the belief that there is a God can be properly basic.  Alvin Plantinga (1983) develops this view.  Plantinga holds that an individual may rationally believe that there is a God even though the individual does not possess sufficient evidence to convince an agnostic.  Furthermore, the individual need not know how to respond to various objections to theism.  On Plantinga’s view, as long as the belief is produced in the right way it is justified.  Plantinga further develops reformed epistemology in his (2000) volume, where he presents the view as a form of externalism that holds that the justification-conferring factors for a belief may include external factors.

Modest foundationalism is not without its critics.  Some strong foundationalists argue that modest foundationalism is too modest to provide adequate foundations for empirical knowledge (see McGrew (2003)).  Timothy McGrew argues that empirical knowledge must be grounded in certainties.  McGrew deploys an argument similar to C.I. Lewis’s argument that probabilities require certainties.  McGrew argues that every statement that has less than unit probability is grounded in some other statement.  If the probability that it will rain today is .9, then there must be some additional information that one is taking into account to get this probability.  Consequently, if the alleged foundations are merely probable then they are really no foundations at all.  Modest foundationalists disagree.  They hold that some statements may have an intrinsic non-zero probability (see, for instance, Mark Pastin’s response to C.I. Lewis’s argument in Pastin (1975a)).
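
McGrew’s claim can be made explicit in probabilistic notation (the formalization is offered only as a gloss on the argument as summarized above, not as McGrew’s own presentation).  On this reading, to say that the probability of rain today is .9 is to say that

\[
P(\text{rain today} \mid E) = 0.9
\]

for some body of background evidence E, so the probability assignment implicitly appeals to E.  If E is itself merely probable, the same question arises for E, and the regress continues unless it terminates in certainties.  The modest foundationalist reply, as noted above, is that some statements can carry an intrinsic non-zero probability that is not conditional on any further information.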

iii. Weak Foundationalism

Weak foundationalism assigns the basic beliefs the most modest epistemic status of the three views.  Laurence BonJour mentions it as a possible foundationalist view in his (1985) book The Structure of Empirical Knowledge.  According to BonJour, the weak foundationalist holds that some non-inferential beliefs are minimally justified, where this justification is not strong enough to satisfy the justification condition on knowledge.  Further, this justification is not strong enough to allow the individual beliefs to serve as premises to justify other beliefs (see BonJour (1985), 30).  However, because knowledge and inference are fundamental features of our epistemic practices, a natural corollary to weak foundationalism is that coherence among one’s beliefs is required for knowledge-adequate justification and also for one’s beliefs to function as premises for other beliefs.  Thus, for the weak foundationalist, coherence has an ineliminable role for knowledge and inference.

This form of foundationalism is a significant departure from the natural stress foundationalists place on the regress argument.  Attention to the regress argument focuses one back on the ultimate beliefs of one’s view.  If these beliefs are insufficient to license inference to other beliefs, it is difficult to make good sense of a reconstruction of knowledge.  At the very least, the reconstruction will not proceed in a step-by-step manner in which one begins with a limited class of beliefs—the basic ones—and then moves to the non-basic ones.  If, in addition, coherence is required for the basic beliefs to serve as premises for other beliefs, then this form of weak foundationalism looks very similar to refined forms of coherentism.

Some modest foundationalists maintain that weak foundationalism is inadequate.  James Van Cleve contends that weak foundationalism is inadequate to generate justification for one’s beliefs (van Cleve (2005)).  Van Cleve presents two arguments for the claim that some beliefs must have a high intrinsic credibility (pp. 173-4).  First, while coherence can increase the justification for thinking that one’s ostensible recollections are correct, one must have significant justification for thinking that one has correctly identified one’s ostensible recollection.  That is to say, one must have more than weak justification for thinking one’s apparent memory does report that p, whether or not this apparent memory is true.  Apart from the thought that one has strong justification for believing that one’s ostensible memory is as one takes it to be, Van Cleve argues it is difficult to see how coherence could increase the justification for believing that those apparent memories are true.

The second argument Van Cleve offers comes from Bertrand Russell ((1948), p. 188).  Russell observes that one fact makes another probable or improbable only in relation to a law.  Therefore, for coherence among certain facts to make another fact probable, one must have sufficient justification for believing a law that connects the facts.  Van Cleve explains that we might not require a genuine law but rather an empirical generalization that connects the two facts.  Nonetheless, Russell’s point is that, for coherence to increase the probability of some claim, we must have more than weak justification for believing some generalization.  The problem for the weak foundationalist is that our justification for believing an empirical generalization depends on memory.  Consequently, memory must supply the needed premise in a coherence argument, and it can do this only if memory supplies more than weak justification.  In short, the coherence among ostensible memories increases justification only if we have more than weak justification for believing some generalization provided by memory.

b. Theories of Proper Inference

Much of the attention on foundationalism has focused on the nature and existence of basic beliefs.  Yet a crucial element of foundationalism is the nature of the inferential relations between basic beliefs and non-basic beliefs.  Foundationalists claim that all of one’s non-basic beliefs are justified ultimately by the basic beliefs, but how is this supposed to work?  What are the proper conditions for the justification of the non-basic beliefs?  The following discusses three approaches to inferential justification: deductivism, strict inductivism, and liberal inductivism.

i. Deductivism

Deductivists hold that proper philosophical method consists in the construction of deductively valid arguments whose premises are indubitable or self-evident (see remarks by Nozick (1981) and Lycan (1988)).  Deductivists travel down the regress in order to locate the epistemic atoms from which they attempt to reconstruct the rest of one’s knowledge by deductive inference.  Descartes’ epistemology is often aligned with deductivism.  Descartes locates the epistemically basic beliefs in beliefs about the ideas in one’s mind and then deduces from those ideas that a good God exists.  Then given that a good God exists, Descartes deduces further that the ideas in his mind must correspond to objects in reality.  Therefore, by a careful deductive method, Descartes aims to reconstruct our knowledge of the external world.

Another prominent example of deductivism comes from phenomenalism.  As mentioned earlier, phenomenalism is the attempt to analyze statements about physical objects in terms of statements about patterns of sense data.  Given this analysis, the phenomenalist can deduce our knowledge of the external world from knowledge of our own sensory states.  Whereas Descartes’ deductivism took a theological route through the existence of a good God, the phenomenalist eschews theology and attempts a deductive reconstruction by a metaphysical analysis of statements about the external world.  Though this project failed, it illustrates a tendency in philosophy to grasp for certainty.

Contemporary philosophers dismiss deductivism as implausible.  Deductivism requires strong foundationalism because the ultimate premises must be infallible, indubitable, or incorrigible.  However, many philosophers judge that the regress stopping premises need not have these exalted properties. Surely, the thought continues, we know things like I have hands and the world has existed for more than five minutes? Additionally, if one restricts proper inference to deduction then one can never expand upon the information contained in the premises.  Deductive inference traces out logical implications of the information contained in the premises.  So if the basic premises are limited to facts about one’s sensory states then one can’t go ‘beyond’ those states to facts about the external world, the past, or the future.  To accommodate that knowledge we must expand either our premises or our conception of inference.  Either direction abandons the deductivist picture of proper philosophical method.

ii. Strict Inductivism

One response to the above challenge for deductivism is to move to modest foundationalism, which allows the basic premises to include beliefs about the external world or the past.  However, even this move is inadequate to account for all our knowledge.  In addition to knowing particular facts about the external world or the past we know some general truths about the world such as all crows are black.  It is implausible that this belief is properly basic.  Further, the belief that every observed and unobserved crow is black is not implied by any properly basic belief such as this crow is black.  In addition to moving away from a strong foundationalist theory of non-inferential justification, one must abandon deductivism.

To accommodate knowledge of general truths, philosophers must allow for other kinds of inference besides deductive inference.  The standard form of non-deductive inference is enumerative induction.  Enumerative induction works by listing (that is, enumerating) the observed instances and then concluding on the basis of a sufficient sample that all the relevant instances have the target property.  Suppose, for instance, one knows that 100 widgets from the Kenosha Widget Factory have a small k printed on them and that one knows of no counterexamples to this.  Given this knowledge, one can infer by enumerative induction that every widget from the Kenosha Widget Factory has a small k printed on it.  Significantly, this inference is liable to mislead.  Perhaps the widgets one has examined are special in some way that is relevant to the small printed k.  For example, the widgets come from an exclusive series of widgets produced to celebrate Kafka’s birthday.  Even though the inference may mislead, it is still intuitively a good inference.  Given a sufficient sample size and no counterexamples, one may infer that the sample is representative of the whole.
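
Put schematically (this is a standard textbook rendering of the pattern rather than anything specific to this article), the widget inference has the following form, where K abbreviates “is a widget from the Kenosha Widget Factory” and G abbreviates “has a small k printed on it”:

\[
Ka_1 \wedge Ga_1,\ \dots,\ Ka_n \wedge Ga_n, \ \text{and no observed } K \text{ lacking } G \ \Longrightarrow\ \text{(probably) } \forall x\,(Kx \rightarrow Gx)
\]

Here the double arrow marks a defeasible inferential step rather than a logical entailment; as noted, the conclusion may turn out to be false even when all the premises are true.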

The importance of enumerative induction is that it allows one to expand one’s knowledge of the world beyond the foundations.  Moreover, enumerative induction is a form of linear inference.  The premises of the induction are known or justifiably believed prior to the conclusion being justifiably believed.  This suggests that enumerative induction is a natural development of the foundationalist conception of knowledge.  Knowledge rests on properly basic beliefs and those other beliefs that can be properly inferred from the basic beliefs by deduction and enumerative induction.

iii. Liberal Inductivism

Strict inductivism is motivated by the thought that we have some kind of inferential knowledge of the world that cannot be accommodated by deductive inference from epistemically basic beliefs.  A fairly recent debate has arisen over the merits of strict inductivism.  Some philosophers have argued that there are other forms of non-deductive inference that do not fit the model of enumerative induction.  C.S. Peirce describes a form of inference called “abduction” or “inference to the best explanation.”  This form of inference appeals to explanatory considerations to justify belief.  One infers, for example, that two students copied answers from a third because this is the best explanation of the available data—they each make the same mistakes and the two sat in view of the third.  Alternatively, in a more theoretical context, one infers that there are very small unobservable particles because this is the best explanation of Brownian motion.  Let us call ‘liberal inductivism’ any view that accepts the legitimacy of a form of inference to the best explanation that is distinct from enumerative induction.  For a defense of liberal inductivism see Gilbert Harman’s classic (1965) paper.  Harman defends a strong version of liberal inductivism according to which enumerative induction is just a disguised form of inference to the best explanation.

A crucial task for liberal inductivists is to clarify the criteria that are used to evaluate explanations.  What makes one hypothesis a better explanation than another?  A standard answer is that hypotheses are rated as to their simplicity, testability, scope, fruitfulness, and conservativeness.  The simplicity of a hypothesis is a matter of how many entities, properties, or laws it postulates.  The theory that the streets are wet because it rained last night is simpler than the theory that the streets are wet because there was a massive water balloon fight between the septuagenarians and octogenarians last night.  A hypothesis’s testability is a matter of whether it can be determined to be true or false.  Some hypotheses are more favorable because they can easily be put to the test, and when they survive the test they receive confirmation.  The scope of a hypothesis is a matter of how much data the hypothesis covers.  If two competing hypotheses both entail the fall of the American dollar, but one of them also entails that the Yen rose, the hypothesis that explains this additional fact has greater scope.  The fruitfulness of a hypothesis is a matter of how well it can be implemented for new research projects.  Darwin’s theory of the origin of species has tremendous fruitfulness because, for one, it opened up the study of molecular genetics.  Finally, the conservativeness of a hypothesis is a matter of its fit with our previously accepted theories and beliefs.

The liberal inductivist points to the alleged fact that many of our commonsense judgments about what exists are guided by inference to the best explanation.  If, for instance, we hear scratching in the walls and witness the disappearance of cheese, we infer that there are mice in the wainscoting.  As the liberal inductivist sees it, this amounts to a primitive use of inference to the best explanation.  The mice hypothesis is relatively simple, testable, and conservative.

The epistemological payout for accepting the legitimacy of inference to the best explanation is significant.  This form of inference is ideally suited for dealing with under-determination cases, cases in which one’s evidence for a hypothesis is compatible with its falsity.  For instance, the evidence we possess for believing that the theory of general relativity is correct is compatible with the falsity of that theory.  Nonetheless, we judge that we are rational in believing that general relativity is true based on the available evidence.  The theory of general relativity is the best available explanation of the data.  Similarly, epistemological under-determination arguments focus on the fact that the perceptual data we possess is compatible with the falsity of our commonsense beliefs.  If a brain-in-the-vat scenario obtained, then one would have all the same sensation states and still believe that, for example, one was seated at a desk.  Nevertheless, the truth of our commonsense beliefs is the best available explanation for the data of sense.  Therefore, our commonsense beliefs meet the justification condition for knowledge.  See Jonathan Vogel (1990) for a response to skepticism along these lines and see Richard Fumerton (1992) for a contrasting perspective.

Liberal inductivism is not without its detractors.  Richard Fumerton argues that every acceptable inductive inference is either a straightforward case of induction or a combination of straightforward induction and deduction (see Fumerton (1980)).  Fumerton focuses on paradigm cases of alleged inference to the best explanation and argues that these cases are enthymemes (that is, arguments with suppressed premises).  He considers a case in which someone infers that a person walked recently on the beach from the evidence that there are footprints on the beach and that if a person walked recently on the beach there would be footprints on the beach.  Fumerton observes that this inference fits into the standard pattern of inference to the best explanation.  However, he then argues that the acceptability of this inference depends on our justification for believing that in the vast majority of cases footprints are produced by people.  Fumerton thus claims that this paradigmatic case of inference to the best explanation is really a disguised form of inference to a particular: the vast majority of footprints are produced by persons; there are footprints on the beach; therefore, a person walked on the beach recently.  The debate over the nature and legitimacy of inference to the best explanation is an active and exciting area of research.  For an excellent discussion and defense of inference to the best explanation see Lipton (2004).

iv. A Theory of Inference and A Theory of Concepts

There are non-trivial connections between a foundationalist theory of inference and a theory of concepts.  This is one of the points at which epistemology meets the philosophy of mind.  Both deductivists and strict inductivists tend to accept a thesis about the origin of our concepts.  They both tend to accept the thesis of concept empiricism, according to which all of our concepts derive from experience.  Following Locke and Hume, concept empiricists stress that we cannot make sense of any ideas that are not based in experience.  Some concept empiricists are strong foundationalists, in which case they work with a very limited range of sensory concepts (for example, C.I. Lewis); others are modest foundationalists who take concepts of the external world to be disclosed in experience (that is, direct realists).  Concept empiricists are opposed to inference to the best explanation because a characteristic feature of inference to the best explanation is inference to an unobservable.  As the concept empiricist sees it, this is illegitimate because we lack the ability to think of genuine non-observables.  For a sophisticated development of this view see van Fraassen (1980).

Concept rationalists, by contrast, allow that we possess concepts that are not disclosed in experience.  Some concept rationalists, like Descartes, held that some concepts are innate such as the concepts God, substance, or I.  Other concept rationalists view inference to the best explanation as a way of forming new concepts.  In general concept rationalists do not limit the legitimate forms of inference to deduction and enumerative induction.  For a discussion of concept empiricism and rationalism in connection with foundationalism see Timothy McGrew (2003).

5. Conclusion

Foundationalism is a multifaceted doctrine.  A well-worked-out foundationalist view needs to combine a theory of non-inferential justification with a view of the nature of inference.  The nature and legitimacy of non-deductive inference is a relatively recent topic, and there is hope that significant progress will be made on this score.  Moreover, given the continued interest in the regress problem, foundationalism proves to be of perennial interest.  The issues that drive research on foundationalism are fundamental epistemic questions about the structure and legitimacy of our view of the world.

6. References and Further Reading

  • Alston, W. 1976a.  “Two Types of Foundationalism.” The Journal of Philosophy 73, 165-185.
  • Alston, W. 1976b.  “Has foundationalism been refuted?” Philosophical Studies 29, 287-305.
  • Armstrong, D.M. 1968. A Materialist Theory of Mind.  New York: Routledge.
  • Audi, R. 1993.  The Structure of Justification.  New York: Cambridge.
  • Bergmann, Michael. 2004.  “What’s not wrong with foundationalism,” Philosophy and Phenomenological Research LXVIII, 161-165.
  • Bergmann, Michael. 2006. Justification without Awareness.  New York: Oxford.
  • BonJour, L. 1985.  The Structure of Empirical Knowledge.  Cambridge, MA. Harvard University Press.
  • BonJour, L.  1997.  “Haack on Experience and Justification.”  Synthese 112:1, 13-23.
  • BonJour, L. 1999.  “The Dialectic of Foundationalism and Coherentism.” In The Blackwell Guide to Epistemology eds. John Greco and Ernest Sosa.  Malden, MA: Blackwell, 117-142.
  • BonJour, L and Sosa, E. 2003.  Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Malden, MA: Blackwell.
  • Chisholm, R. 1948. “The Problem of Empiricism,” The Journal of Philosophy 45, 512-517.
  • Delaney, C.F. 1976. “Foundations of Empirical Knowledge – Again,” New Scholasticism L, 1-19.
  • Fumerton, R. 1980.  “Induction and Reasoning to the Best Explanation.”  Philosophy of Science 47, 589-600.
  • Fumerton, R. 1992.  “Skepticism and Reasoning to the Best Explanation.”  Philosophical Issues 2, 149-169.
  • Fumerton, R. 1998.  “Replies to My Three Critics.” Philosophy and Phenomenological Research 58, 927-937.
  • Fumerton, R.  2006. “Epistemic Internalism, Philosophical Assurance and the Skeptical Predicament,” in Knowledge and Reality, eds. Crisp, Davidson, and Laan. Dordrecht: Kluwer, 179-191.
  • Fumerton, R. 2009. “Luminous enough for a cognitive home.”  Philosophical Studies 142, 67-76.
  • Goldman, A. 1979.  “What is Justified Belief?” in Justification and Knowledge. Ed. George Pappas.  Dordrecht: D. Reidel, 1-23.
  • Haack, S. 1993.  Evidence and Inquiry: Towards Reconstruction in Epistemology. Malden, MA: Blackwell.
  • Harman, Gilbert. 1965. “Inference to the Best Explanation.”  The Philosophical Review 74, 88-95.
  • Howard-Snyder, Daniel. 2005.  “Foundationalism and Arbitrariness,” Pacific Philosophical Quarterly 86, 18-24.
  • Howard-Snyder, D & Coffman, E.J. 2006 “Three Arguments Against Foundationalism: Arbitrariness, Epistemic Regress, and Existential Support,” Canadian Journal of Philosophy 36:4, 535-564.
  • Huemer, Michael. 2003.  “Arbitrary Foundations?” The Philosophical Forum XXXIV, 141-152.
  • Klein, Peter.  1999.  “Human knowledge and the regress of reasons,” Philosophical Perspectives 13, 297-325.
  • Klein, Peter. 2004. “What is wrong with foundationalism is that it cannot solve the epistemic regress problem,”  Philosophy and Phenomenological Research LXVIII, 166-171.
  • Lehrer, K. 1997.  Self-Trust.  New York: Oxford.
  • Lewis, C.I. 1929.  Mind and the World Order.  New York: Dover Publications.
  • Lewis, C.I.  1952.  “The Given Element in Empirical Knowledge.” The Philosophical Review 61, 168-175.
  • Lipton, P. 2004.  Inference to the Best Explanation 2nd edition.  New York: Routledge.
  • Lycan, W. 1988.  Judgment and Justification.  New York: Cambridge.
  • Lyons, J. 2008.  “Evidence, Experience, and Externalism,” Australasian Journal of Philosophy 86, 461-479
  • McGrew, T. 2003. “A Defense of Classical Foundationalism,” in The Theory of Knowledge, ed. Louis Pojman, Belmont: CA. Wadsworth, pp. 194-206.
  • Meeker, K & Poston, T.  2010.  “Skeptics without Borders.”  American Philosophical Quarterly 47:3, 223-237.
  • Neurath, Otto.  1959.  “Protocol Sentences.” In Logical Positivism ed. A.J. Ayer Free Press, New York, 199-208.
  • Nozick, R. 1981.  Philosophical Explanations.  Cambridge, MA: Harvard University Press.
  • Pastin, M. 1975a. “C.I. Lewis’s Radical Foundationalism” Nous 9, 407-420.
  • Pastin, M. 1975b. “Modest Foundationalism and Self-Warrant,” American Philosophical Quarterly 4, 141-149.
  • Plantinga, A. 1983.  “Reason and Belief in God,” in Faith and Rationality. Eds. Alvin Plantinga and Nicholas Wolterstorff.  Notre Dame, IN: University of Notre Dame Press.
  • Plantinga, A. 1993.  Warrant: The Current Debate.  New York: Oxford.
  • Plantinga, A. 2000.  Warranted Christian Belief.  New York: Oxford.
  • Pollock, J and Cruz, J. 1999.  Contemporary Theories of Knowledge 2nd edition.  New York: Rowman & Littlefield.
  • Pryor, J. 2000.  “The Skeptic and the Dogmatist.”  Nous 34, 517-549.
  • Pryor, J. 2001. “Highlights of Recent Epistemology,” The British Journal for the Philosophy of Science 52, 95-124.
    • Stresses that modest foundationalism looks better in 2001 than it looked circa 1976.
  • Quine, W.V.O.  1951. “Two Dogmas of Empiricism.”  The Philosophical Review 60, 20-43.
  • Rescher, N. 1973.  The Coherence Theory of Truth.  New York: Oxford.
  • Russell, B. 1948.  Human Knowledge.  New York: Routledge.
  • Schlick, Moritz. 1959.  “The Foundation of Knowledge.” In Logical Positivism ed. A.J. Ayer Free Press, New York, 209-227.
  • Sellars, Wilfrid. 1963.  “Empiricism and the Philosophy of Mind,” in Science, Perception, and Reality.  Atascadero, CA: Ridgeview Publishing Co, pp. 127-196.
  • Triplett, Timm. 1990. “Recent work on Foundationalism,” American Philosophical Quarterly 27:2, 93-116.
  • van Cleve, James. 2005.  “Why Coherence is Not Enough:  A Defense of Moderate Foundationalism,” in Contemporary Debates in Epistemology, edited by Matthias Steup and Ernest Sosa. Oxford:  Blackwell, pp. 168-80.
  • van Fraassen, Bas. 1980.  The Scientific Image.  New York: Oxford.
  • Vogel, Jonathan. 1990.  “Cartesian Skepticism and Inference to the Best Explanation.”  The Journal of Philosophy 87, 658-666.

Author Information

Ted Poston
University of South Alabama
U. S. A.

Lokayata/Carvaka—Indian Materialism

In its most generic sense, “Indian Materialism” refers to the school of thought within Indian philosophy that rejects supernaturalism.  It is regarded as the most radical of the Indian philosophical systems.  It rejects the existence of other-worldly entities such as an immaterial soul or god, as well as the after-life.  Its primary philosophical import comes by way of a scientific and naturalistic approach to metaphysics.  Thus, it rejects ethical systems that are grounded in supernaturalistic cosmologies.  The good, for the Indian materialist, is strictly associated with pleasure, and the only ethical obligation forwarded by the system is the maximization of one’s own pleasure.

The terms Lokāyata and Cārvāka have historically been used to denote the philosophical school of Indian Materialism.  Literally, “Lokāyata” means philosophy of the people.  The term was first used by the ancient Buddhists, who until around 500 B.C.E. employed it to refer to both a common tribal philosophical view and a sort of this-worldly philosophy or nature lore.  The term has evolved to signify a school of thought that has been scorned by religious leaders in India and remains on the periphery of Indian philosophical thought.  After 500 B.C.E., the term acquired a more derogatory connotation and became synonymous with sophistry.  It was not until sometime between the 6th and 8th centuries C.E. that the term “Lokāyata” began to signify Materialist thought.  Indian Materialism has also been named Cārvāka after one of the two founders of the school.  Cārvāka and Ajita Kesakambalin are said to have established Indian Materialism as a formal philosophical system, but some still hold that Bṛhaspati was its original founder.  Bṛhaspati allegedly authored the classic work on Indian Materialism, the Bṛhaspati Sῡtra.  There are some conflicting accounts of Bṛhaspati’s life, but, at the least, he is regarded as the mythical authority on Indian Materialism and at most the actual author of the since-perished Bṛhaspati Sῡtra.  Indian Materialism has for this reason also been named “Bṛhaspatya.”

Table of Contents

  1. History
    1. Vedic Period
    2. Epic Period and Brāhmaṇical Systems
  2. Status in Indian Thought
    1. Contributions to Science
    2. Materialism as Heresy
  3. Doctrine
    1. Epistemology
    2. Ontology
    3. Cosmology
  4. Ethics
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1.  History

Traces of materialism appear in the earliest recordings of Indian thought.  Initially, Indian Materialism or Lokāyata functioned as a sort of negative reaction to spiritualism and supernaturalism.  During the 6th and 7th centuries C.E. it evolved into a formal school of thought and remains intact, though consistently marginalized.

a. Vedic Period

Vedic thought, in the most comprehensive sense, refers to the ideas contained within the Samhitas and the Brāhmaṇas, including the Upaniṣads.  Historians have estimated that the Vedas were written and compiled between the years 1500 B.C.E. and 300 B.C.E.  It is difficult to point to one philosophical view in the Upaniṣads, at least by Western standards; however, they are considered by scholars to comprise all of the philosophical writing of the Vedas.  The Vedas exemplify the speculative attitude of the ancient Indians, who had the extreme luxury of reflecting on the whence and whither of their existence.  The ancient Indians, also called Aryans, flourished due to the bounty of food and resources provided by the land.  Free from the burdens of political conflict and social upheaval, they were able to ponder the origin of the universe and the purpose of life.  Their meditations on such subjects have been recorded in the literature of the Vedas.

The Vedic period marked the weakest stage of the development of Indian Materialism.  In its most latent form, Materialism is evident in early Vedic references to a man who was known as Bṛhaspati and his followers.  The literature suggests that Bṛhaspati did not attempt to forward a constructive system of philosophy but rather characteristically refuted the claims of other schools of thought.  In this sense, followers of Bṛhaspati were not only skeptical but intentionally destructive of the orthodoxies of the time.  It is thought that any mention of “unbelievers” or “scoffers” in the Vedas refers to those who identified with Bṛhaspati and his materialist views.  Thus, Materialism in its original form was essentially anti-Vedic.  One of Bṛhaspati’s principal objections to orthodoxy was the practice of repeating verses of sacred texts without understanding their meaning.  However, Bṛhaspati’s ideas (“Bṛhaspatya”) could not become a coherent philosophical view without some positive import.  His followers eventually adopted the doctrine of “Svabhava,” which at this point in history signified the rejection of 1) the theory of causation and 2) the notion that there are good and evil consequences of moral actions.  “Svabhava” enhanced Bṛhaspatya by providing it with the beginnings of a metaphysical framework.  In the concluding portions of the Vedas there are violent tales of the opposition of the Bṛhaspatya people to the spiritualism of the time.  Interestingly, the following anecdote from the Taittiriya Brāhmaṇa implies that the gods were impervious to the destructive efforts of Bṛhaspati:

Once upon a time Bṛhaspati struck the goddess Gāyatrī on the head.  The head   smashed into pieces and the brain split.  But Gāyatrī is immortal.  She did not die.  Every bit of her brain was alive. (Dakshinaranjan, 12)

The term “Svabhava” in Sanskrit can be translated as “essence” or “nature.”  Bṛhaspati used the term to indicate a school of thought that rejected supernaturalism and the ethical teachings that followed from supernaturalist ideologies.  Bṛhaspati and his followers were scorned and ridiculed for not believing in the eternal nature of reality and for not revering the gods and the truths they were supposed to have espoused.  It is interesting to note that while other schools have incorporated the “Svabhava” as a doctrine of essences or continuity of the soul, the use of the term by Bṛhaspati was specifically meant to represent his association with philosophical naturalism.  Naturalism, in this sense, rejects a Platonic notion of essences and the dualism that is exemplified in Platonic philosophy as well as some of the Indian spiritualistic schools.  This brand of dualism is that which asserts that there are two categorically different realms of reality: the material and the immaterial.  Supernaturalism in general embraces this doctrine and holds that the latter realm is not encompassed by “nature.”  In contrast to this, Naturalism rejects the existence of the immaterial realm and suggests that all of reality is encompassed by nature.  Widely varying schools of Naturalism exist today and do not necessarily embrace the mechanistic materialism that was originally embraced by the Cārvāka.

b. Epic Period and Brāhmaṇical Systems

The major work of the Epic Period of Indian history (circa 200 B.C.E. to 200 C.E.) is the Mahābhārata.  The Great War between the Kurus and the Pandavas inspired a many-sided conversation about morality.  Conversation developed into intellectual inquiry and religion began to be replaced by philosophy.  It was around the beginning of this period that the Bṛhaspati school began to merge with the philosophical naturalism of the time.  Naturalism rejected the existence of a spiritual realm and also rejected the notion that the morality of an action can cause either morally good or evil consequences.  Naturalist underpinnings helped to further shape Indian Materialism into a free-standing philosophical system.   The term Lokāyata replaced Bṛhaspatya and scholars have speculated that this was due to the desire for a distinction between the more evolved philosophical system and its weaker anti-Vedic beginnings.   The Lokāyata remained oppositional to the religious thought of the time, namely, Jainism and Buddhism, but it was also positive in that it claimed the epistemological authority of perception.  Furthermore, it attempted to explain existence in terms of the four elements (earth, air, fire, water).  While there is little certainty about the formal development of the Lokāyata school during the Epic Period, it is suspected that its adoption of naturalistic metaphysics led to its eventual association with scientific inquiry and rationalistic philosophy.  Materialism stood out as a doctrine because it rejected the theism of the Upaniṣadic teachings as well as the ethical teachings of Buddhism and Jainism.  It stood for individuality and rejected the authority of scripture and testimony.

The Lokāyata adopted its hedonistic values during the development of the Brāhmaṇical systems of philosophy (circa 1000 C.E.).  As a reaction against the ascetic and meditative practices of the religious devout, Indian Materialism celebrated the pleasures of the body.  People began gratifying their senses with no restraint.  Pleasure was asserted as the highest good and, according to the Lokāyata, was the only reasonable way to enjoy one’s life.  Some scholarship suggests that during this stage of its development Indian Materialism began to be referred to as “Cārvāka” in addition to the “Lokāyata.”  This is contrary to the more popular view that the school was named Cārvāka after its historical founder helped to establish the Lokāyata as a legitimate philosophy.  The term Cārvāka literally means “entertaining speech” and is derived from the term charva, which means to chew or grind with one’s teeth.  It is possible that Cārvāka himself acquired the name due to his association with Indian Materialism, which then led to the school acquiring the name as well.  This is one of many areas of the history of Indian Materialism that remains open to debate.

2.  Status in Indian Thought

The perceived value of Lokāyata from within the Indian Philosophical community is as relevant a topic as its philosophical import.  If nothing else, the etymology of the term Lokāyata is evidence of the consistent marginalization of Indian Materialism.  Because of its association with hedonistic behavior and heretical religious views, followers of the spiritualistic schools of Indian philosophy (Jainism, Buddhism, Hinduism) are reticent on the subject of the materialistic tendencies present in their own systems; however, some scholars, such as Daya Krishna, have suggested that materialism is, in varying degrees, present in all Indian philosophical schools.  This is not to say that materialism replaces other ideologies—it is to say rather that notions about the priority of this-worldliness appear even in some spiritualistic schools.  While matter does not take priority over the spiritual realm in every sense, its significance is elevated more so than in other major world religions.  This observation, for some, carries little weight when examining the philosophical import of the various Indian schools of thought; however, it seems relevant when considering the evolution of Indian thought.  The original meaning of Lokāyata as prevalent among the people has become true in the sense that it is pervasive in Indian philosophical thought at large.  This is not to say that materialism is widely accepted or even that its presence is overtly acknowledged, but it is difficult to deny its far-reaching influence on Indian Philosophy as a whole.

a. Contributions to Science

The most significant influence that Materialism has had on Indian thought is in the field of science.  The spread of Indian Materialism led to the mindset that matter can be of value in itself.  Rather than treating the body as a burden to our minds or souls, the Materialist view promoted the notion that the body itself can be regarded as wondrous and full of potential.  Evidence of this shift in perspective can be seen in the progress of science over the course of India’s history.  Materialist thought dignified the physical world and elevated the sciences to a respectable level.  Moreover, the Materialist emphasis on empirical validation of truth became the golden rule of the Scientific Method.  Indian Materialism pre-dated the British Empiricist movement by over a millennium.  Whereas the authority of empirical evidence carried little weight in Ancient India, modern thought began to value the systematic and cautious epistemology that first appeared in the thought of the Lokāyata.

b. Materialism as Heresy

Regardless of its positive influence on Indian thought, the fact remains that Indian Materialism is often regarded as blatant heresy against the Spiritualistic schools.  It rejects the theism of Hinduism as well as the moralism of Buddhist and Jain thought.  The anti-orthodox claims of the Materialists are seen as heretical by the religious masses and fly in the face of the piety promoted by most religious sects.  However, it is questionable whether the formal ethics of Materialism are truly practiced to their logical extent by those who claim to belong to the school.  It is suspected by many scholars that Indian Materialism today stands for an atheistic view that values science in place of supernaturalism.  More than anything, Materialists have historically expressed a view that has not found favor among the established religious and social authorities.

3.  Doctrine

There are no existing works that serve as the doctrinal texts for the Lokāyata.  The available materials on the school of thought are incomplete and have suffered through centuries of deterioration.  Mere fragments of the Bṛhaspati Sῡtra remain in existence and because of their obscure nature provide little insight into the doctrine and practices of ancient Indian Materialists.  Clues about the history of Indian Materialism have been pieced together to formulate at best a sketchy portrayal of how the “philosophy of the people” originated and evolved over thousands of years.

a. Epistemology

Epistemological thought varies in Indian philosophy according to how each system addresses the question of “Pramānas” or the “sources and proofs of knowledge.”  (Mittal 41)  The Lokāyata (Cārvāka) school recognized perception (pratyakṣa) alone as a reliable source of knowledge.  They therefore rejected two commonly held pramānas: 1) inference (anumāna) and 2) testimony (śabda).  Because of its outright rejection of such commonly held sources of knowledge, the Lokāyata was not taken seriously as a school of philosophy.  The common view was that Cārvākas merely rejected truth claims and forwarded none of their own.  To be a mere skeptic during the time amounted to very low philosophical stature.

However, there are additional accounts of the Lokāyata that suggest that the epistemology was more advanced and positivistic than that of mere skepticism.  In fact, it has been compared to the empiricism of John Locke and David Hume.  The Cārvākas denied philosophical claims that could not be verified through direct experience.  Thus, the Lokāyata denied the validity of inferences that were made based upon truth claims that were not empirically verifiable.  However, logical inferences that were made based on premises that were derived from direct experience were held as valid.  It is believed that this characterization of the epistemology of the Lokāyata most accurately describes the epistemological position of contemporary Indian Materialism.

Cārvākas were, in a sense, the first philosophical pragmatists.  They realized that not all sorts of inference were problematic; in order to proceed through daily life inference is a necessary step.  For practical purposes, the Lokāyata made a distinction between inferences made based on probability as opposed to certainty.  The common example used to demonstrate the difference is the inference that if smoke is rising from a building it is probably an indication that there is a fire within the building.  However, Cārvākas were unwilling to accept anything beyond this sort of mundane use of inference, such as the mechanical inference forwarded by the Buddhists.  The Lokāyata refused to accept inferences about what has never been perceived, namely god or the after-life.

b. Ontology

The ontology of the Lokāyata rests on the denial of the existence of non-perceivable entities such as God or a spiritual realm.  Critics of this school of thought point to the fallacy of moving from the premise “the soul cannot be known” to the conclusion “the soul does not exist.”  Again, there is a pragmatic tendency in this sort of thinking.  It seems that followers of the Lokāyata were not concerned with truths that could not be verified; however, they were not entirely skeptical.  The Lokāyata posited that the world itself and all material objects of the world are real.  They held that all of existence can be reduced to the four elements of air, water, fire, and earth.  All things come into existence through a mixture of these elements and will perish with their separation.  Perhaps the most philosophically sophisticated position of Indian Materialism is the assertion that even human consciousness is a material construct.  According to K. K. Mittal, the ontology of the Lokāyata is strictly set forth as follows:

  1. Our observation does not bring forth any instance of a disincarnate consciousness. For the manifestation of life and consciousness, body is an inalienable factor.
  2. That body is the substratum of consciousness can be seen in the undoubted fact of the arising of sensation and perception only in so far as they are conditioned by the bodily mechanism.
  3. The medicinal science by prescribing that certain foods and drinks (such as Brāhmighrta) have the properties conducive to the intellectual powers affords another proof and evidence of the relation of consciousness with body and the material ingredients (of food).  (Mittal 47)

According to Mittal (ibid.), two schools of thought within the Lokāyata arose out of these tenets.  One forwarded the position that there can be no self or soul apart from the body; the other posited that a soul can exist alongside a body as long as the body lives, but that the soul perishes with the body.  The latter view adopted the position that the soul is pure air or breath, which is a form of matter.  Therefore, the Lokāyata collectively rejects the existence of an other-worldly soul, while sometimes accepting the notion of a material soul.

c. Cosmology

To speculate as to why the universe exists would be an exercise in futility for an Indian Materialist.  The purpose and origin of existence is not discoverable through scientific means.  Furthermore, speculation about such matters leads to anxiety and frustration, which reduce pleasure and overall contentment.  There is no teleology implicit in Indian Materialism, which is evidenced in the school’s position that the universe itself probably came into existence by chance.  Although there can be no certainty about the origin of the universe, the most probable explanation is that it evolved as a result of a series of random events.

There is also no doctrine of Creation in the Lokāyata.  The principles of karma (action) and niyati (fate) are rejected because they are derived from the notion that existence in itself is purposeful.  The fundamental principle of Indian Materialism was and remains “Svabhava” or nature.  This is not to suggest that nature itself has no internal laws or continuity.  It would be a misinterpretation of Indian Materialism to suppose that it forwards a cosmology of chaos.  Rather, it resembles most closely the naturalism forwarded by the American philosopher John Dewey.  While it posits no “creator” or teleology, Indian Materialism regards nature itself as a force that thrives according to its own law.

4. Ethics

The most common view among scholars regarding the ethic of Indian Materialism is that it generally forwards Egoism.  In other words, it adopts the perspective that an individual’s ends take priority over the ends of others.  Materialists are critical of other ethical systems for being tied to notions of duty or virtue that are derived from false, supernaturalist cosmologies.  Indian Materialism regards pleasure in itself and for itself as the only good and thus promotes hedonistic practices.  Furthermore, it rejects a utilitarian approach to pleasure.  Utilitarianism regards pleasure (both higher and lower) as the ultimate good and therefore promotes the maximization of the good (pleasure) on a collective level.  Indian Materialism rejects this move away from pure egoism.  The doctrine suggests that individuals have no obligation to promote the welfare of society and would only tend to do so if it were to ultimately benefit them as well.

It is interesting to note that the Cārvāka school has been maligned by virtually all schools of Indian philosophy not merely for its rejection of the supernatural but probably more so for its insistent rejection of anything beyond Egoistic ethics.  In fact, some scholars hold that Indian Materialism is purely nihilistic.  On this reading, an Egoistic or Hedonistic ethic is not an essential element of the system, though both serve as accurate descriptions of the values and practices of the Cārvāka people.  This view holds that the axiology of the Cārvāka was purely negative: it claims nothing more than the rejection both of what we now think of as a Platonic notion of “The Good” and of any notion of “god” or “gods.”

The term “nāstika” is used by almost all schools of Indian Philosophy as a critical term for another school of thought that has severely breached what is thought to be acceptable in terms of both religious beliefs and ethical values.  The most frequent recipient of this label is the Cārvāka school.  Because the two carry a similar stigma, the term “Cārvāka” and the more general term “nāstika” are sometimes used interchangeably simply to denote a brand of thinking that does not fall in line with the classical schools of Indian thought.  The chief insult conveyed by the term “nāstika” is that its recipient has strayed dangerously from the path toward enlightenment.  Ethical practices and one’s spiritual education in Indian culture are inextricably tied to one another.  Those who identify with the Indian Materialist school are criticized by the prominent Indian philosophical schools of thought because they are viewed as largely ignorant of both metaphysical and moral truths.  This sort of ignorance is perceived as a grave threat not to the greater good of society, but to the individual who is bereft of spiritual and moral knowledge.  That Indian Philosophy as a whole shows concern for the individual beliefs and practices of its members stands in stark contrast to the cultural and individual relativism largely embraced by the West.

5. References and Further Reading

a. Primary Sources

  • Gunaratna. Tarkarahasyadīpika. Cārvāka/Lokāyata: an Anthology of Source Materials and Some Recent Studies. Ed. Debiprasad Chattopadhyaya. New Delhi: Indian Council of Philosophical Research in association with Rddhi-India Calcutta, 1990.
  • The Mahābhārata. Trans. and Ed. James L. Fitzgerald.  Chicago: University of Chicago Press, 2004.
  • The Rāmāyaṇa of Vālmīki: An Epic of Ancient India.  Ed. Robert Goldman and Sally J. Sutherland.  Trans. Robert Goldman.  Princeton: Princeton University Press, 1984.
  • The Hymns of the Rgveda. Ed. Jagdish L. Shastri.  Trans. Ralph T. H. Griffith.  New Revised Edition.  Delhi: Motilal Banarsidass, 1973.

b. Secondary Sources

  • Chattopadhyaya, Debiprasad.  Lokāyata; a Study in Ancient Materialism. Bombay: People’s Publishing House, 1959.
  • Daksinaranjan, Sastri.  A Short History of Indian Materialism.  Calcutta: The Book Company, Ltd., 1957.
  • Dasgupta, Surendranath.  A History of Indian Philosophy.  Vol. V.  Cambridge: Cambridge University Press, 1955.
  • Flint, Robert.  Anti-theistic theories: being the Baird lecture for 1877. Edinburgh and London: W. Blackwood and Sons, 1879.
  • Garbe, Richard.  The Philosophy of Ancient India.  Chicago: Open Court Publishing Company, 1899.
  • Grimes, John A.  A Concise Dictionary of Indian Philosophy: Sanskrit Terms Defined in English. New and Revised Edition.  Albany: State University of New York Press, 1996.
  • Halbfass, Wilhelm. Tradition and Reflection: Explorations in Indian Thought. Albany, NY: State University of New York Press, 1991.
  • Hopkins, Edward Washburn.  Ethics of India.  New Haven: Yale University Press, 1924.
  • Mittal, Kewal Krishan.  Materialism in Indian Thought.  New Delhi: Munshiram Manoharlal Publishers Pvt. Ltd., 1974.
  • Radhakrishnan, Sarvepalli.  Indian Philosophy. Vols. I & II.  New York: Macmillan, 1927-1929.
  • Raju, P. T. The Philosophical Traditions of India.  Pittsburgh: University of Pittsburgh Press, 1972.
  • Raju, P. T.  Structural Depths of Indian Thought. Albany, NY: State University of New York  Press, 1985.
  • Ranganathan, Shyam.  Ethics and The History of Indian Philosophy. Delhi: Motilal Banarsidass Publishers Pvt. Ltd., 2007.
  • Sharma, Ishwar Chandra.  Ethical Philosophies of India. Lincoln, NE: Johnsen Publishing Company, 1965.
  • Smart, Ninian.  Doctrine and Argument in Indian Philosophy.  London: Allen and Unwin, 1964.
  • Vanamamalai, N.  “Materialist Thought in Early Tamil Literature.” Social Scientist, 2.4 (1973): 25-41.

Author Information

Abigail Turner-Lauck Wernicki
Email: awernicki@racc.edu
Holy Family University
U. S. A.

Mathematical Structuralism

The theme of mathematical structuralism is that what matters to a mathematical theory is not the internal nature of its objects, such as its numbers, functions, sets, or points, but how those objects relate to each other. In a sense, the thesis is that mathematical objects (if there are such objects) simply have no intrinsic nature. The structuralist theme grew most notably from developments within mathematics toward the end of the nineteenth century and on through to the present, particularly, but not exclusively, in the program of providing a categorical foundation to mathematics.

Philosophically, there are a variety of competing ways to articulate the structuralist theme. These invoke various ontological and epistemic themes. This article begins with an overview of the underlying idea, and then proceeds to the major versions of the view found in the philosophical literature. On the metaphysical front, the most pressing question is whether there are or can be “incomplete” objects that have no intrinsic nature, or whether structuralism requires a rejection of the existence of mathematical objects altogether. Each of these options yields distinctive epistemic questions concerning how mathematics is known.

There are ontologically robust versions of structuralism, philosophical theories that postulate a vast ontology of structures and their places; on the other hand, there are versions of structuralism amenable to those who prefer desert landscapes, denying the existence of distinctively mathematical objects altogether; and also there are versions of structuralism in between those two—postulating an ontology for mathematics, but not a specific realm of structures. The article sketches the various strengths of each option and the challenges posed for them.

Table of Contents

  1. The Main Idea
  2. Taking on the Metaphysics: The Ante Rem Approach
  3. Getting by without Ontology: Structuralism without (Ante Rem) Structures
  4. References and Further Reading

1. The Main Idea

David Hilbert’s Grundlagen der Geometrie [1899] represents the culmination of a trend toward structuralism within mathematics.  That book gives what, with some hindsight, we might call implicit definitions of geometric notions, characterizing them in terms of the relations they bear to each other.  The early pages contain phrases such as “the axioms of this group define the idea expressed by the word ‘between’ . . .” and “the axioms of this group define the notion of congruence or motion.”  The idea is summed up as follows:

We think of . . . points, straight lines, and planes as having certain mutual relations, which we indicate by means of such words as “are situated,” “between,” “parallel,” “congruent,” “continuous,” etc.  The complete and exact description of these relations follows as a consequence of the axioms of geometry.

Hilbert also remarks that the axioms express “certain related fundamental facts of our intuition,” but in the book—in the mathematical development itself—all that remains of the intuitive content are the diagrams that accompany some of the theorems.

Mathematical structuralism is similar, in some ways, to functionalist views in, for example, philosophy of mind.  A functional definition is, in effect, a structural one, since it, too, focuses on relations that the defined items have to each other.  The difference is that mathematical structures are more abstract, and free-standing, in the sense that there are no restrictions on the kind of things that can exemplify them (see Shapiro [1997, Chapter 3, §6]).

There are several different, and mutually incompatible, philosophical ideas that can underlie and motivate mathematical structuralism.  Some philosophers postulate an ontology of structures, and claim that the subject matter of a given branch of mathematics is a particular structure, or a class of structures.  An advocate of a view like this would articulate what a structure is, and then say something about the metaphysical nature of structures, and how they and their properties can become known.  Other structuralist views deny the existence of structures, and develop the underlying theme in other ways.

Let us define a system to be a collection of objects together with certain relations on those objects.  For example, an extended family is a system of people under certain blood and marital relations—father, aunt, great niece, son-in-law, and so forth.  A work of music is a collection of notes under certain temporal and other musical relations.  To get closer to mathematics, define a natural number system to be a countably infinite collection of objects with a designated initial object and a one-to-one successor relation, such that the principle of mathematical induction and the other axioms of arithmetic are satisfied.  Examples of natural number systems are the Arabic numerals in their natural order, an infinite sequence of distinct moments of time in temporal order, the strings on a finite (or countable) alphabet arranged in lexical order, and, perhaps, the natural numbers themselves.

To bring Hilbert [1899] into the fold, define a Euclidean system to be three collections of objects, one to be called “points,” a second to be called “lines,” and a third to be called “planes,” along with certain relations between them, such that the axioms are true of those objects and relations, so construed.  Otto Blumenthal reports that in a discussion in a Berlin train station in 1891, Hilbert said that in a proper axiomatization of geometry, “one must always be able to say, instead of ‘points, straight lines, and planes’, ‘tables, chairs, and beer mugs’” (“Lebensgeschichte” in Hilbert [1935, 388-429]; the story is related on p. 403).  In a much-discussed correspondence with Gottlob Frege, Hilbert wrote (see Frege [1976], [1980]):

Every theory is only a scaffolding or schema of concepts together with their necessary relations to one another, and that the basic elements can be thought of in any way one likes.  If in speaking of my points, I think of some system of things, e.g., the system love, law, chimney-sweep . . . and then assume all my axioms as relations between these things, then my propositions, e.g., Pythagoras’ theorem, are also valid for these things . . . [A]ny theory can always be applied to infinitely many systems of basic elements.

A structure is the abstract form of a system, which ignores or abstracts away from any features of the objects that do not bear on the relations.  So, the natural number structure is the form common to all of the natural number systems.  And this structure is the subject matter of arithmetic.  The Euclidean-space-structure is the form common to all Euclidean systems.  The theme of structuralism is that, in general, the subject matter of a branch of mathematics is a given structure or a class of related structures—such as all algebraically closed fields.

A structure is thus a “one over many,” a sort of universal.  The difference between a structure and a more traditional universal, such as a property, is that a property applies to, or holds of, individual objects, while a structure applies to, or holds of, systems.  Structures are thus much like structural universals, whose existence remains subject to debate among metaphysicians (see, for example, Lewis [1986], Armstrong [1986], Pagès [2002]).  Indeed, one might think of a mathematical structure as a sort of free-standing structural universal, one in which the nature of the individual objects that fill the places of the structure is irrelevant (see Shapiro [2008, §4]).

Any of the usual array of philosophical views on universals can be adapted to structures.  One can be a Platonic ante rem realist, holding that each structure exists and has its properties independent of any systems that have that structure.  On this view, structures exist objectively, and are ontologically prior to any systems that have them (or at least ontologically independent of such systems).  Or one can be an Aristotelian in re realist, holding that structures exist, but insisting that they are ontologically posterior to the systems that instantiate them.  Destroy all the natural number systems and, alas, you have destroyed the natural number structure itself.  A third option is to deny that structures exist at all.  Talk of structures is just a convenient shorthand for talk of systems that have a certain similarity.

In a retrospective article, Paul Bernays [1967, 497] provides a way to articulate the latter sort of view:

A main feature of Hilbert’s axiomatization of geometry is that the axiomatic method is presented and practiced in the spirit of the abstract conception of mathematics that arose at the end of the nineteenth century and which has generally been adopted in modern mathematics.  It consists in abstracting from the intuitive meaning of the terms . . . and in understanding the assertions (theorems) of the axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.  Thus, an axiom system is regarded not as a system of statements about a subject matter but as a system of conditions for what might be called a relational structure . . . [On] this conception of axiomatics, . . . logical reasoning on the basis of the axioms is used not merely as a means of assisting intuition in the study of spatial figures; rather, logical dependencies are considered for their own sake, and it is insisted that in reasoning we should rely only on those properties of a figure that either are explicitly assumed or follow logically from the assumptions and axioms.

Advocates of these different ontological positions concerning structures take different approaches to other central philosophical concerns, such as epistemology, semantics, and methodology.  Each such view has it relatively easy with some issues and faces deep, perhaps intractable, problems with others.  The ante rem realist, for example, has a straightforward account of reference and semantics:  the variables of a branch of mathematics, like arithmetic, analysis, and set theory, range over the places in an ante rem structure.  Each singular term denotes one such place.  So the language is understood at face value.  But the ante rem realist must account for how one obtains knowledge of structures, so construed, and she must account for how statements about ante rem structures play a role in scientific theories of the physical world.  As suggested by the above Bernays passage, the nominalistic, eliminative structuralist has it easier on epistemology.  Knowing a truth of, say, real analysis, is knowing what follows from the description of the theory.  But this sort of structuralist must account for the semantics of the reconstrued statements, how they are known, how they figure in science, and so forth.

2. Taking on the Metaphysics: The Ante Rem Approach

To repeat, the ante rem structuralist holds that, say, the natural number structure and the Euclidean space structure exist objectively, independent of the mathematician, her form of life, and so forth, and also independent of whether the structures are exemplified in the non-mathematical realm.  That is what makes them ante rem.  The semantics of the respective languages is straightforward:  The first-order variables range over the places in the respective structure, and a singular term such as ‘0’ denotes a particular place in the structure.

So, on this view, the statements of a mathematical theory are to be read at face value.  The grammatical structure of the mathematical language reflects the underlying logical form of the propositions.  For example, in the arithmetic equation, 3×8=24, the numerals ‘3’, ‘8’, and ‘24’, at least seem to be singular terms—proper names.  On the ante rem view, they are singular terms.  The role of a singular term is to denote an individual object.  On the ante rem view, these numerals denote places in the natural number structure.  And, of course, the equation expresses a truth about that structure.  In this respect, then, ante rem structuralism is a variation on traditional Platonism.

For this perspective to make sense, however, one has to think of a place in a structure as a bona fide object, the sort of thing that can be denoted by a singular term, and the sort of thing that can be in the range of first-order variables.  To pursue the foregoing analogy with universals, a place in a structure is akin to a role or an office, one that can be occupied by different people or things.  So, the idea here is to construe an office as an object in its own right, at least with respect to the structure.

There is, of course, an intuitive difference between an object and a place in a structure, between an office-holder and an office.  Indeed, the ante rem view depends on that very distinction, in order to characterize structures in the first place (in terms of systems).  Yet, we also think of the places in ante rem structures as objects in their own right.

This is made coherent by highlighting and enforcing a distinction in linguistic practice.  It is a matter of keeping track of what we are talking about at any given time.  There are two different orientations involved in discussing structures and their places.  First, a structure, including its places, can be discussed in the context of systems that exemplify the structure.  For example, one might say that the current vice president used to be a senator, or that the white king’s bishop in one game was the white queen’s bishop in another game.  For a more mathematical example, in the system of Arabic numerals, the symbol ‘2’ plays the two-role (if we think of the structure as starting with one), while in the system of roman numerals, the string ‘II’ plays that role.  Call this the places-are-offices perspective.  This office-orientation presupposes a background ontology that supplies objects that fill the places of the structures.  In the case of political systems, the background ontology is people (who have met certain criteria, such as being of a certain age and being duly elected); in the case of chess games, the background ontology is small, moveable objects—pieces with certain colors and shapes.  In the case of arithmetic, anything at all can be used as the background ontology: anything at all can play the two-role in a natural number system.  This is what is meant by saying that mathematical structures, like this one, are “free-standing”.

In contrast to the places-are-offices perspective, there are contexts in which the places of a given structure are treated as objects in their own right.  We say that the vice president presides over the senate, and that a bishop that is on a black square cannot move to a white square, without intending to speak about any particular vice president or chess piece.  Such statements are about the roles themselves.  Call this the places-are-objects perspective.  Here, the statements are about the respective structure as such, independent of any exemplifications it may have.  The ante rem structuralist proposes that we think of typical statements in pure mathematics as made in the places-are-objects mode.  This includes such simple equations as 3×8=24, and more sophisticated statements, for example, that there are infinitely many prime numbers.

To be sure, one can think of statements in the places-are-objects mode as simply generalizations over all systems that exemplify the structure.  This is consonant with the above passage from Bernays [1967], suggesting that we understand “the assertions (theorems) of [an] axiomatized theory in a hypothetical sense, that is, as holding true for any interpretation . . . for which the axioms are satisfied.” However, the ante rem structuralist takes the mathematical statements, in places-are-objects mode, as being about the structure itself.  Indeed, on that view, the structure exists, and so we can talk about its places directly.

So, for the ante rem structuralist, in the places-are-objects mode, singular terms denoting places are bona fide singular terms, and variables ranging over places are bona fide variables ranging over those places.  Places are bona fide objects.

The ante rem structuralist envisions a smooth interplay between places-are-offices statements and places-are-objects statements.  When treating a structure in the places-are-offices mode, the background ontology sometimes includes places from other structures.  We say, for example, that the finite von Neumann ordinals exemplify the natural number structure (under the ordinal successor relation, in which the successor of an ordinal α is α∪{α}).  In the places-are-objects mode, for set theory, the variables range over the places of the iterative hierarchy, such places construed as objects.  The von Neumann ordinals are some of those places-cum-objects.  We think of those objects as forming a system, under the ordinal successor relation.  That system exemplifies the natural number structure.  And in that system, the set-cum-object {∅,{∅}} is in the two-role (beginning with the empty set, as zero).  We sometimes write {∅,{∅}}=2.  From the ante rem perspective, a symbol denoting {∅,{∅}} is construed in the places-are-objects mode, vis-à-vis the iterative hierarchy, and “2” is in the places-are-offices mode, vis-à-vis the natural number structure.  So construed, “{∅,{∅}}=2” is not actually an identity, but more like a predication.  It says that a certain von Neumann ordinal plays a certain role in a given natural number system.  In the Zermelo system, {{∅}}=2.

Sometimes, the background ontology for the places-are-offices perspective consists of places of the very structure under discussion.  It is commonly noted, for example, that the even natural numbers exemplify the natural number structure.  That is, we consider a system whose objects are the even natural numbers (themselves construed from the places-are-objects mode), under the relation symbolized by “+2”.  That system exemplifies the natural number structure.  In that system, 6 is in the three-role.  Of course, it would be confusing to write 6=3, but if care is taken, it can be properly understood, remembering that it is not an identity.

Trivially, the natural number structure itself, construed from the places-are-objects mode, exemplifies the natural number structure.  In that case, 6 plays the six-role.  Some philosophers might think that this raises a problem analogous to Aristotle’s Third Man argument against Plato.  It depends on the relationship between an ante rem structure and the systems it exemplifies.  In short, the Third Man arguments are problematic if one holds that the reason why a given collection of objects, under certain relations, is a natural number system is that it exemplifies the natural number structure.  This invokes something like the Principle of Sufficient Reason.  If something is so, then there must be a reason why it is so, and the cited reason must be, in some metaphysical sense, prior to the something.  From that perspective, one cannot hold that the natural number structure itself exemplifies the natural number structure because it exemplifies the natural number structure.  That would be a circular reason.  The ante rem structuralist is free to reject this instance of the Principle of Sufficient Reason.  She simply points out that the reason the natural number structure exemplifies the natural number structure is that it has a distinguished initial position and satisfies the relevant principles (for an opposing construal, see Hand [1993]).

The ante rem structuralist should say something about the metaphysical nature of a structure, and how it is that mathematical objects are somehow constituted by a structure.  Consider the following slogan from Shapiro [1997, p. 9]:

Structures are prior to places in the same sense that any organization is prior to the offices that constitute it.  The natural number structure is prior to “6,” just as “baseball defense” is prior to “shortstop” or “U.S. Government” is prior to “Vice President.”

What is this notion of priority? For the non-mathematical examples such as baseball defenses and governments, one might characterize the priority in terms of possible existence.  To say that A is prior to B is to say that B could not exist without A.  No one can be a shortstop independent of a baseball defense; no one can be vice president independent of a government (or organization).  Unfortunately, this articulation of the priority does not make sense of the mathematical cases.  The ante rem structuralist follows most ontological realists in holding that the mathematical structures and their places exist of necessity.  It does not make sense to think of the natural number structure existing without its places, nor for the places to exist without the structure.

The dependence relation in the slogans for ante rem structuralism is that of constitution.  Each ante rem structure consists of some places and some relations.  A structure is constituted by its places and its relations, in the same way that any organization is constituted by its offices and the relations between them.  The constitution is not that of mereology.  It is not the case that a structure is just the sum of its places, since, in general, the places have to be related to each other via the relations of the structure.  An ante rem structure is a whole consisting of, or constituted by, its places and its relations.

3. Getting by without Ontology: Structuralism without (Ante Rem) Structures

Some philosophers find the existence of ante rem structures extravagant.  For such thinkers, there are other ways to preserve the structuralist insights.  One can take structures to exist, but only in the systems that exemplify them.  Metaphysically, the idea is to reverse the priority cited above: structures are posterior to the systems that exemplify them—although, again, it may prove difficult to articulate the relevant notion of priority.  This would be an Aristotelian, in re realism.  On a view like this, the only structures that exist are those that are exemplified.  I do not know of any philosophers of mathematics who articulate such a view in detail.  I mention it, in passing, in light of the connection between structures and traditional universals.

Another, perhaps ontologically cleaner, option is to reject the existence of structures, in any sense of “existence.”  On such a view, apparent talk of structures is only a façon de parler, a way of talking about systems that are structured in a certain way.  The view is sometimes dubbed eliminative structuralism.

The eliminativist can acknowledge the places-are-objects orientation when discussing structures or, to be precise, when discussing structured systems, but he cannot understand such statements literally (without adopting an error theory).  For the eliminativist, the surface grammar of places-are-objects statements does not reflect their underlying logical form, since, from that perspective, there are no structures and there are no places to which one can refer.

The ante rem structuralist and the eliminativist agree that statements in the places-are-objects mode imply generalizations concerning systems that exemplify the structure.  We say, for example, that the vice president presides over the senate, and this entails that all vice presidents preside over their respective senates.  The chess king can move one square in any direction, so long as the move does not result in check.  This entails that all kings are so mobile, and so immobile.  Of course, the generalizations themselves do not entail that there are any vice presidents or chess kings—nor do they entail that there are any structures.

The eliminative structuralist holds that places-are-objects statements are just ways of expressing the relevant generalizations, and he accuses the ante rem structuralist of making too much of their surface grammar, trying to draw deep metaphysical conclusions from that.  The same goes for typical statements in pure mathematics.  Those, too, should be regimented as generalizations over all systems that exemplify the given structure or structures.  For example, the statement “For every natural number n there is a prime p>n” is rendered:

In any natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, any sentence Φ in the language of arithmetic gets regimented as something like:

(Φʹ)      In any natural number system S, Φ[S],

where Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and interpreting the non-logical terminology in terms of the relations of S.
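
For readers who find a symbolic rendering helpful, the shape of this regimentation can be sketched as a universally quantified conditional.  This is offered only as an illustrative gloss, not as notation used in this article; “NatSys(S)” is an assumed abbreviation for “S is a natural number system.”

```latex
% Illustrative sketch only: the eliminative regimentation of an arithmetic
% sentence \Phi, with NatSys(S) an assumed abbreviation for
% "S is a natural number system".
\Phi' \;:\quad \forall S\,\bigl(\mathrm{NatSys}(S) \rightarrow \Phi[S]\bigr)
```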

In a similar manner, the eliminative structuralist paraphrases or regiments—and deflates—what seem to be substantial metaphysical statements, the very statements made by his philosophical opponents.  For example, “the number 2 exists” becomes “in every natural number system, there is an object in the 2-place”.  Or “real numbers exist” becomes “every real number system has objects in its places.”  These are trivially true, analytic if you will—not the sort of statements that generate heated metaphysical arguments.

The sailing is not completely smooth for the eliminative structuralist, however.  As noted, this view takes the places-are-offices perspective to be primary—paraphrasing places-are-objects statements in those terms.  Places-are-offices statements, recall, presuppose a background of objects to fill the places in the systems.  For mathematics, the nature of these objects is not relevant.  For example, as noted, anything at all can play the two-role in a natural number system.  Nevertheless, for the regimented statements to get their expected truth-values, the background ontology must be quite extensive.

Suppose, for example, that the entire universe consists of no more than 10^100,000 objects.  Then there are no natural number systems (since each such system must have infinitely many objects).  So for any sentence Φ in the language of arithmetic, the regimented sentence Φʹ would be vacuously true.  So the eliminativist would be committed to the truth of (the regimented version of) 1+1=0.

In other words, a straightforward, successful eliminative account of arithmetic requires a countably infinite background ontology.  And it gets worse for other branches of mathematics.  An eliminative account of real analysis demands an ontology whose size is that of the continuum; for functional analysis, we’d need the powerset of that many objects.  And on it goes.  The size of some of the structures studied in mathematics is staggering.

Even if the physical universe does exceed 10^100,000 objects, and, indeed, even if it is infinite, there is surely some limit to how many physical objects there are (invoking Cantor’s theorem that the powerset of any set is larger than the set itself).  Branches of mathematics that require more objects than the number of physical objects might end up being vacuously trivial, at least by the lights of the straightforward, eliminative structuralist.  This would be bad news for such theorists, as the goal is to make sense of mathematics as practiced.  In any case, no philosophy of mathematics should be hostage to empirical and contingent facts, including features of the size of the physical universe.

In the literature, there are two eliminativist reactions to this threat of vacuity.  First, the philosopher might argue, or assume, that there are enough abstract objects for every mathematical structure to be exemplified.  In other words, we postulate that, for each field of mathematics, there are enough abstract objects to keep the regimented statements from becoming vacuous.

Some mathematicians, and some philosophers, think of the set-theoretic hierarchy as the ontology for all of mathematics.  Mathematical objects—all mathematical objects—are sets in the iterative hierarchy.  Less controversially, it is often thought that the iterative hierarchy is rich enough to recapitulate every mathematical theory.  Penelope Maddy [2007, 354] writes:

Set theory hopes to provide a dependable and perspicuous mathematical theory that is ample enough to include (surrogates for) all the objects of classical mathematics and strong enough to imply all the classical theorems about them.  In this way, set theory aims to provide a court of final appeal for claims of existence and proof in classical mathematics . . . Thus set theory aims to provide a single arena in which the objects of classical mathematics are all included, where they can be compared side-by-side.

One might wonder why it is that a foundational theory only needs “surrogates” for each mathematical object, and not the real things.  For a structuralist, the answer is that in mathematics the individual nature of the objects is irrelevant.  What matters is their relations to each other (see Shapiro [2004]).

An eliminative structuralist might maintain that the theory of the background ontology for mathematics—set theory or some other—is not, after all, the theory of a particular structure.  The foundation is a mathematical theory with an intended ontology in the usual, non-structuralist sense.  In the case of set theory, the intended ontology is the sets.  Set theory is not (merely) about all set-theoretic systems—all systems that satisfy the axioms.  So, the foundational theory is an exception to the theme that mathematics is the science of structure.  But, the argument continues, every other branch of mathematics is to be understood in eliminative structuralist terms.  Arithmetic is the study of all natural number systems—within the iterative hierarchy.  Euclidean geometry is the study of all Euclidean systems, and so forth.  There are thus no structures—ante rem or otherwise—and, with the exception of sets, or whatever the background ontology may be, there are no mathematical objects either.  Øystein Linnebo [2008] articulates and defends a view like this.  Although there is not much discussion of the background ontology, Paul Benacerraf’s classic [1965] can be read in these terms as well.  Benacerraf famously argues that there are no numbers—talk of numbers is only a way to talk about all systems of a certain kind, but he seems to have no similar qualms about sets.

Of course, this ontological version of eliminative structuralism is anathema to a nominalist, who rejects the existence of abstracta altogether.  For the nominalist, sets and ante rem structures are pretty much on a par—neither are wanted.  The other prominent eliminative reaction to the threat of vacuity is to invoke modality.  In effect, one avoids (or attempts to avoid) a commitment to a vast ontology by inserting modal operators into the regimented generalizations.  To reiterate the above example, the modal eliminativist renders the statement “For every natural number n there is a prime p>n” as something like:

In any possible natural number system S, for every object x in S, there is another object y in S such that y comes after x in S and y has no divisors in S other than itself and the unit object of S.

In general, let Φ be any sentence in the language of arithmetic; Φ gets regimented as:

In any possible natural number system S, Φ[S],

or, perhaps,

Necessarily, for any natural number system S, Φ[S],

where, again, Φ[S] is obtained from Φ by restricting the quantifiers to the objects in S, and by interpreting the non-logical terminology in terms of the relations of S.
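
Sketched in the same assumed notation as before, and again only as an illustrative gloss rather than the article’s own formalism, the modal regimentation differs in prefixing a necessity operator to the generalization:

```latex
% Illustrative sketch only: the modal-eliminative regimentation, with \Box
% read as "necessarily" and NatSys(S) as in the earlier sketch.
\Phi' \;:\quad \Box\,\forall S\,\bigl(\mathrm{NatSys}(S) \rightarrow \Phi[S]\bigr)
```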

The difference with the ontological, eliminative program, of course, is that here the variables ranging over systems are inside the scope of a modal operator.  So the modal eliminativist does not require an extensive rich background ontology.  Rather, she needs a large ontology to be possible.  Geoffrey Hellman [1989] develops a modal program in detail.

The central, open problem with this brand of eliminativist structuralism concerns the nature of the invoked modality.  Of course, it won’t do much good to render the modality in terms of possible worlds.  If we do that, and if we take possible worlds, and possibilia, to exist, then modal eliminative structuralism would collapse into the above, ontological version of eliminative structuralism.  Not much would be gained by adding the modal operators.  So the modalist typically takes the modality to be primitive—not defined in terms of anything more fundamental.  But, of course, this move does not relieve the modalist of having to say something about the nature of the indicated modality, and having to say something about how we know propositions about what is possible.

Invoking metaphysical possibility and necessity does not seem appropriate here.  Intuitively, if mathematical objects exist at all, then they exist of necessity.   And perhaps also intuitively, if mathematical objects do not exist, then their non-existence is necessary.  Physical and conceptual modalities are also problematic for present purposes.

Hellman mobilizes the logical modalities for his eliminative structuralism.  Our arithmetic sentence Φ becomes

In any logically possible natural number system S, Φ[S],

or, perhaps,

It is logically necessary that for any natural number system S, Φ[S].

In contemporary logic textbooks and classes, the logical modalities are understood in terms of sets.  To say that a sentence is logically possible is to say that there is a certain set that satisfies it.  Of course, this will not do here, for the same reason that the modalist cannot define the modality in terms of possible worlds.  It is especially problematic here.  It does no good to render mathematical ‘existence’ in terms of logical possibility if the latter is to be rendered in terms of existence in the set-theoretic hierarchy.  Again, the modalist takes the notion of logical possibility to be a primitive, explicated by the theory as a whole.  For more on this program, see Hellman [1989], [2001], [2005].

To briefly sum up and conclude, the parties to the debate over how to best articulate the structuralist insights agree that each of the major versions has its strengths and, of course, each has its peculiar difficulties.  Negotiating such tradeoffs is, of course, a stock feature of philosophy in general.  The literature has produced an increased understanding of mathematics, of the relevant philosophical issues, and how the issues bear on each other, and the discussion shows no signs of abating. For additional discussion see “The Applicability of Mathematics.”

4. References and Further Reading

  • Armstrong, D. [1986], “In defence of structural universals,” Australasian Journal of Philosophy 64, 85-88.
    • As the title says.
  • Awodey, S. [1996], “Structure in mathematics and logic:  a categorical perspective,” Philosophia Mathematica (3) 4, 209-237.
    • Articulates a connection between structuralism and category theory.
  • Awodey, S. [2004], “An answer to Hellman’s question:  ‘Does category theory provide a framework for mathematical structuralism?’,” Philosophia Mathematica (3) 12, 54-64.
    • Continuation of the above.
  • Awodey, S. [2006], Category theory, Oxford, Oxford University Press.
    • Readable presentation of category theory.
  • Benacerraf, P. [1965], “What numbers could not be,” Philosophical Review 74, 47-73; reprinted in Philosophy of mathematics, edited by P. Benacerraf and H. Putnam, Englewood Cliffs, New Jersey, Prentice-Hall, 1983, 272-294.
    • Classic motivation for the (eliminative) structuralist perspective.
  • Bernays, P. [1967], “Hilbert, David” in The encyclopedia of philosophy, Volume 3, edited by P. Edwards, New York, Macmillan publishing company and The Free Press, 496-504.
  • Chihara, C. [2004], A structural account of mathematics, Oxford, Oxford University Press.
    • Account of the application of mathematics in “structural” terms, but without adopting a structuralist philosophy.
  • Frege, G. [1976], Wissenschaftlicher Briefwechsel, edited by G. Gabriel, H. Hermes, F. Kambartel, and C. Thiel, Hamburg, Felix Meiner.
  • Frege, G. [1980], Philosophical and mathematical correspondence, Oxford, Basil Blackwell.
  • Hale, Bob [1996], “Structuralism’s unpaid epistemological debts,” Philosophia Mathematica (3) 4, 124-143.
    • Criticism of modal eliminative structuralism.
  • Hand, M. [1993], “Mathematical structuralism and the third man,” Canadian Journal of Philosophy 23, 179-192.
    • Critique of ante rem structuralism, on Aristotelian grounds.
  • Hellman, G. [1989], Mathematics without numbers, Oxford, Oxford University Press.
    • Detailed articulation and defense of modal eliminative structuralism.
  • Hellman, G. [2001], “Three varieties of mathematical structuralism,” Philosophia Mathematica (3) 9, 184-211.
    • Comparison of the varieties of structuralism, favoring the modal eliminative version.
  • Hellman, G. [2005], “Structuralism,” Oxford handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 536-562.
    • Comparison of the varieties of structuralism, again favoring the modal eliminative version.
  • Hilbert, D. [1899], Grundlagen der Geometrie, Leipzig, Teubner; Foundations of geometry, translated by E. Townsend, La Salle, Illinois, Open Court, 1959.
  • Hilbert, D. [1935], Gesammelte Abhandlungen, Dritter Band, Berlin, Julius Springer.
  • Lewis, D. [1986], “Against structural universals,” Australasian Journal of Philosophy 64, 25-46.
    • As the title says.
  • Linnebo, Øystein [2008], “Structuralism and the Notion of Dependence,” Philosophical Quarterly 58, 59-79.
    • An ontological eliminative structuralism, using set theory as the (non-structural) background foundation.
  • MacBride, F. [2005], “Structuralism reconsidered,” Oxford Handbook of philosophy of mathematics and logic, edited by Stewart Shapiro, Oxford, Oxford University Press, 563-589.
    • Philosophically based criticism of the varieties of structuralism.
  • Maddy, P. [2007], Second philosophy: a naturalistic method, Oxford, Oxford University Press.
  • McLarty, C. [1993], “Numbers can be just what they have to,” Nous 27, 487-498.
    • Connection between category theory and the philosophical aspects of structuralism.
  • Pagès, J. [2002], “Structural universals and formal relations,” Synthese 131, 215-221.
    • Articulation and defenses of structural universals.
  • Resnik, M. [1981], “Mathematics as a science of patterns: Ontology and reference,” Nous 15, 529-550.
    • Philosophical articulation of structuralism, with focus on metaphysical issues.
  • Resnik, M. [1982], “Mathematics as a science of patterns: Epistemology,” Nous 16, 95-105.
    • Philosophical articulation of structuralism, with focus on epistemological issues.
  • Resnik, M. [1992], “A structuralist’s involvement with modality,” Mind 101, 107-122.
    • Review of Hellman [1989] focusing on issues concerning the invoked notion of modality.
  • Resnik, M. [1997], Mathematics as a science of patterns, Oxford, Oxford University Press.
    • Detailed articulation of a realist version of structuralism.
  • Shapiro, S. [1997], Philosophy of mathematics: structure and ontology, New York, Oxford University Press.
    • Elaborate articulation of structuralism, with focus on the various versions; defense of the ante rem approach.
  • Shapiro, S. [2004], “Foundations of mathematics:  metaphysics, epistemology, structure,” Philosophical Quarterly 54, 16-37.
    • The role of structuralist insights in foundational studies.
  • Shapiro, S. [2008], “Identity, indiscernibility, and ante rem structuralism: the tale of i and -i,” Philosophia Mathematica (3) 16, 285-309.
    • Treatment of the identity relation, from an ante rem structuralist perspective, and the metaphysical nature of structures.

Author Information

Stewart Shapiro
Email: shapiro.4@osu.edu
The Ohio State University, U.S.A. and
University of St. Andrews, United Kingdom

Connectionism

Connectionism is an approach to the study of human cognition that utilizes mathematical models, known as connectionist networks or artificial neural networks.  Often, these come in the form of highly interconnected, neuron-like processing units. There is no sharp dividing line between connectionism and computational neuroscience, but connectionists tend more often to abstract away from the specific details of neural functioning to focus on high-level cognitive processes (for example, recognition, memory, comprehension, grammatical competence and reasoning). During connectionism’s ideological heyday in the late twentieth century, its proponents aimed to replace theoretical appeals to formal rules of inference and sentence-like cognitive representations with appeals to the parallel processing of diffuse patterns of neural activity.

Connectionism was pioneered in the 1940s and had attracted a great deal of attention by the 1960s. However, major flaws in the connectionist modeling techniques were soon revealed, and this led to reduced interest in connectionist research and reduced funding.  But in  the 1980s  connectionism underwent a potent, permanent revival. During the later part of the twentieth century, connectionism would be touted by many as the brain-inspired replacement for the computational artifact-inspired ‘classical’ approach to the study of cognition. Like classicism, connectionism attracted and inspired a major cohort of naturalistic philosophers, and the two broad camps clashed over whether or not connectionism had the wherewithal to resolve central quandaries concerning minds, language, rationality and knowledge. More recently, connectionist techniques and concepts have helped inspire philosophers and scientists who maintain that human and non-human cognition is best explained without positing inner representations of the world. Indeed, connectionist techniques are now very widely embraced, even if few label themselves connectionists anymore. This is an indication of connectionism’s success.

Table of Contents

  1. McCulloch and Pitts
  2. Parts and Properties of Connectionist Networks
  3. Learning Algorithms
    1. Hebb’s Rule
    2. The Delta Rule
    3. The Generalized Delta Rule
  4. Connectionist Models Aplenty
    1. Elman’s Recurrent Nets
    2. Interactive Architectures
  5. Making Sense of Connectionist Processing
  6. Connectionism and the Mind
    1. Rules versus General Learning Mechanisms: The Past-Tense Controversy
    2. Concepts
    3. Connectionism and Eliminativism
    4. Classicists on the Offensive: Fodor and Pylyshyn’s Critique
      1. Reason
      2. Productivity and Systematicity
  7. Anti-Representationalism: Dynamical Systems Theory, A-Life and Embodied Cognition
  8. Where Have All the Connectionists Gone?
  9. References and Further Reading
    1. References
    2. Connectionism Freeware

1. McCulloch and Pitts

In 1943, neurophysiologist Warren McCulloch and a young logician named Walter Pitts demonstrated that neuron-like structures (or units, as they were called) that act and interact purely on the basis of a few neurophysiologically plausible principles could be wired together and thereby be given the capacity to perform complex logical calculation (McCulloch & Pitts 1943). They began by noting that the activity of neurons has an all-or-none character to it – that is, neurons are either ‘firing’ electrochemical impulses down their lengthy projections (axons) towards junctions with other neurons (synapses) or they are inactive. They also noted that in order to become active, the net amount of excitatory influence from other neurons must reach a certain threshold and that some neurons must inhibit others. These principles can be described by mathematical formalisms, which allows for calculation of the unfolding behaviors of networks obeying such principles. McCulloch and Pitts capitalized on these facts to prove that neural networks are capable of performing a variety of logical calculations. For instance, a network of three units can be configured so as to compute the fact that a conjunction (that is, two complete statements connected by ‘and’) will be true only if both component statements are true (Figure 1). Other logical operations involving disjunctions (two statements connected by ‘or’) and negations can also be computed. McCulloch and Pitts showed how more complex logical calculations can be performed by combining the networks for simpler calculations. They even proposed that a properly configured network supplied with infinite tape (for storing information) and a read-write assembly (for recording and manipulating that information) would be capable of computing whatever any given Turing machine (that is, a machine that can compute any computable function) can.

Figure 1: Conjunction Network. We may interpret the top (output) unit as representing the truth value of a conjunction (that is, activation value 1 = true and 0 = false) and the bottom two (input) units as representing the truth values of each conjunct. The input units each have an excitatory connection to the output unit, but for the output unit to activate, the sum of the input unit activations must still exceed a certain threshold. The threshold is set high enough to ensure that the output unit becomes active just in case both input units are activated simultaneously. Here we see a case where only one input unit is active, and so the output unit is inactive. A disjunction network can be constructed by lowering the threshold so that the output unit will become active if either input unit is fully active. [Created using Simbrain 2.0]
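
The behavior of such a network can be sketched in a few lines of code. The following is a minimal illustration, not McCulloch and Pitts' original formalism; the particular weight and threshold values are chosen here only to reproduce the conjunction and disjunction cases described above.

```python
# A minimal sketch of a McCulloch-Pitts style threshold unit: it sums its
# weighted inputs and fires (outputs 1) only if the sum reaches its threshold.
# Weight and threshold values below are illustrative choices, not the
# article's own specification.

def threshold_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    net = sum(a * w for a, w in zip(inputs, weights))
    return 1 if net >= threshold else 0

# Conjunction: with unit weights, a threshold of 2 requires both inputs active.
def conjunction(p, q):
    return threshold_unit([p, q], [1, 1], threshold=2)

# Disjunction: lowering the threshold to 1 lets either active input suffice.
def disjunction(p, q):
    return threshold_unit([p, q], [1, 1], threshold=1)

print(conjunction(1, 0))  # 0 -- only one conjunct is true, as in Figure 1
print(conjunction(1, 1))  # 1
print(disjunction(1, 0))  # 1
```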

Somewhat ironically, these proposals were a major source of inspiration for John von Neumann’s work demonstrating how a universal Turing machine can be created out of electronic components (vacuum tubes, for example) (Franklin & Garzon 1996, Boden 2006). Von Neumann’s work yielded what is now a nearly ubiquitous programmable computing architecture that bears his name. The advent of these electronic computing devices and the subsequent development of high-level programming languages greatly hastened the ascent of the formal classical approach to cognition, which was inspired by formal logic and based on the rule-governed manipulation of sentence-like representations (see Artificial Intelligence). Then again, electronic computers were also needed to model the behaviors of complicated neural networks.

For their part, McCulloch and Pitts had the foresight to see that the future of artificial neural networks lay not with their ability to implement formal computations, but with their ability to engage in messier tasks like recognizing distorted patterns and solving problems requiring the satisfaction of multiple ‘soft’ constraints. However, before we get to these developments, we should consider in a bit more detail some of the basic operating principles of typical connectionist networks.

2. Parts and Properties of Connectionist Networks

Connectionist networks are made up of interconnected processing units which can take on a range of numerical activation levels (for example, a value ranging from 0 to 1). A given unit may have incoming connections from, or outgoing connections to, many other units. The excitatory or inhibitory strength (or weight) of each connection is determined by its positive or negative numerical value. The following is a typical equation for computing the influence of one unit on another:

Influence_iu = a_i * w_iu

This says that for any unit i and any unit u to which it is connected, the influence of i on u is equal to the product of the activation value of i and the weight of the connection from i to u. Thus, if a_i = 1 and w_iu = 0.02, then the influence of i on u will be 0.02. If a unit has inputs from multiple units, the net influence of those units will just be the sum of these individual influences.
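
This calculation can be written out directly. The snippet below is only a small illustration; the particular activation and weight values are invented for the example.

```python
# Sketch of the influence equation above: the influence of unit i on unit u
# is a_i * w_iu, and the net influence on u is the sum over all incoming
# connections. The values below are arbitrary illustrative choices.

a = [1.0, 0.5, 0.0]           # activations of three sending units
w_to_u = [0.02, -0.01, 0.03]  # weights on their connections to unit u

net_influence = sum(a_i * w_iu for a_i, w_iu in zip(a, w_to_u))
print(round(net_influence, 3))  # 1.0*0.02 + 0.5*(-0.01) + 0.0*0.03 = 0.015
```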

One common sort of connectionist system is the two-layer feed-forward network. In these networks, units are segregated into discrete input and output layers such that connections run only from the former to the latter. Often, every input unit will be connected to every output unit, so that a network with, for instance, 100 units in each layer will possess 10,000 inter-unit connections. Let us suppose that in a network of this very sort each input unit is randomly assigned an activation level of 0 or 1 and each weight is randomly set to a level between -0.01 and 0.01. In this case, the activation level of each output unit will be determined by two factors: the net influence of the input units, and the degree to which the output unit is sensitive to that influence, something which is determined by its activation function. One common activation function is the step function, which sets a very sharp threshold. For instance, if the threshold on a given output unit were set through a step function at 0.65, the level of activation for that unit under different amounts of net input could be graphed out as follows:

Figure 2: Step Activation Function

Thus, if the input units have a net influence of 0.7, the activation function returns a value of 1 for the output unit’s activation level. If they had a net influence of 0.2, the output level would be 0, and so on. Another common activation function has more of a sigmoid shape to it – that is, graphed out, it looks something like this:

Figure 3: Sigmoid Activation Function

Thus, if our net input were 0.7, the output unit would take on an activation value somewhere near 0.9.
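
The two activation functions just described can be written down in a few lines. The sketch below is illustrative rather than drawn from the text; in particular, the exact curve in Figure 3 is not specified, so the gain parameter that steepens the standard logistic function is an assumption.

    import math

    def step(net, threshold=0.65):
        # Step activation: all-or-none output once the net input reaches the threshold.
        return 1.0 if net >= threshold else 0.0

    def sigmoid(net, gain=3.0):
        # Logistic activation: a smooth, S-shaped squashing of the net input into (0, 1).
        # The gain of 3.0 is assumed here so that sigmoid(0.7) comes out near 0.9.
        return 1.0 / (1.0 + math.exp(-gain * net))

    print(step(0.7), step(0.2))    # 1.0 0.0
    print(round(sigmoid(0.7), 2))  # 0.89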

Now, suppose that a modeler sets the activation values across the input units (that is, encodes an input vector) of our 200-unit network so that some units take on an activation level of 1 and others take on a value of 0. In order to determine what the value of a single output unit would be, one would have to perform the procedure just described (that is, calculate the net influence and pass it through an activation function). To determine what the entire output vector would be, one must repeat the procedure for all 100 output units.

As discussed earlier, the truth-value of a statement can be encoded in terms of a unit’s activation level. There are, however, countless other sorts of information that can be encoded in terms of unit activation levels. For instance, the activation level of each input unit might represent the presence or absence of a different animal characteristic (say, “has hooves,” “swims,” or “has fangs”), whereas each output unit represents a particular kind of animal (“horse,” “pig,” or “dog”). Our goal might be to construct a model that correctly classifies animals on the basis of their features. We might begin by creating a list (a corpus) that contains, for each animal, a specification of the appropriate input and output vectors. The challenge is then to set the weights on the connections so that when one of these input vectors is encoded across the input units, the network will activate the appropriate animal unit at the output layer. Setting these weights by hand would be quite tedious given that our network has 10,000 weighted connections. Researchers would discover, however, that the process of weight assignment can be automated.
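
As a toy illustration (the features, animals, and vectors below are invented for the example, not taken from the text), such a corpus can be represented as a list of input-output vector pairs:

    # Each input vector flags the presence (1) or absence (0) of a feature;
    # each target vector activates exactly one animal unit.
    features = ["has hooves", "swims", "has fangs"]
    animals = ["horse", "pig", "dog"]

    corpus = [
        ([1, 0, 0], [1, 0, 0]),  # hooved, does not swim, no fangs -> "horse"
        ([1, 1, 0], [0, 1, 0]),  # hooved and swims                -> "pig"
        ([0, 1, 1], [0, 0, 1]),  # swims and has fangs             -> "dog"
    ]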

3. Learning Algorithms

a. Hebb’s Rule

The next major step in connectionist research came on the heels of neurophysiologist Donald Hebb’s (1949) proposal that the connection between two biological neurons is strengthened (that is, the presynaptic neuron will come to have an even stronger excitatory influence) when both neurons are simultaneously active.  As it is often put, “neurons that fire together, wire together.” This principle would be expressed by a mathematical formula which came to be known as Hebb’s rule:

change of weight_iu = a_i * a_u * lrate

The rule states that the weight on a connection from input unit i to output unit u is to be changed by an amount equal to the product of the activation value of i, the activation value of u, and a learning rate. [Notice that a large learning rate conduces to large weight changes and a smaller learning rate to more gradual changes.] Hebb’s rule gave connectionist models the capacity to modify the weights on their own connections in light of the input-output patterns they have encountered.

Let us suppose, for the sake of illustration, that our 200-unit network started out life with connection weights of 0 across the board. We might then take an entry from our corpus of input-output pairs (say, the entry for donkeys) and set the input and output values accordingly. Hebb’s rule might then be employed to strengthen connections from active input units to active output units. [Note: if units are allowed to have weights that vary between positive and negative values (for example, between -1 and 1), then Hebb’s rule will strengthen connections between units whose activation values have the same sign and weaken connections between units with different signs.] This procedure could then be repeated for each entry in the corpus. Given a corpus of 100 entries and 10,000 applications of the rule per entry, a total of 1,000,000 applications of the rule would be required for just one pass through the corpus (called an epoch of training). Here, clearly, the powerful number-crunching capabilities of electronic computers become essential.
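
A minimal sketch of what one such epoch of Hebbian training might look like in code is given below; it assumes the toy corpus format introduced above, and the function name, learning rate, and miniature corpus are illustrative.

    def hebbian_epoch(corpus, weights, lrate=0.1):
        # weights[i][u] is the weight on the connection from input unit i to output unit u.
        for input_vec, target_vec in corpus:
            for i, a_i in enumerate(input_vec):
                for u, a_u in enumerate(target_vec):
                    # Hebb's rule: change of weight_iu = a_i * a_u * lrate.
                    weights[i][u] += a_i * a_u * lrate
        return weights

    # Start, as in the illustration above, with every weight set to zero.
    tiny_corpus = [([1, 0, 0], [1, 0, 0]), ([0, 1, 1], [0, 0, 1])]
    weights = [[0.0] * 3 for _ in range(3)]
    weights = hebbian_epoch(tiny_corpus, weights)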

Let us assume that we have set the learning rate to a relatively high value and that the network has received one epoch of training. What we will find is that if a given input pattern from the training corpus is encoded across the input units, activity will propagate forward through the connections in such a way as to activate the appropriate output unit. That is, our network will have learned how to appropriately classify input patterns.

As a point of comparison, the mainstream approach to artificial intelligence (AI) research is basically an offshoot of traditional forms of computer programming. Computer programs manipulate sentential representations by applying rules which are sensitive to the syntax (roughly, the shape) of those sentences. For instance, a rule might be triggered at a certain point in processing because a certain input was presented – say, “Fred likes broccoli and Sam likes cauliflower.” The rule might be triggered whenever a compound sentence of the form p and q is input and it might produce as output a sentence of the form p (“Fred likes broccoli”). Although this is a vast oversimplification, it does highlight a distinctive feature of the classical approach to AI, which is the assumption that cognition is effected through the application of syntax-sensitive rules to syntactically structured representations. What is distinctive about many connectionist systems is that they encode information through activation vectors (and weight vectors), and they process that information when activity propagates forward through many weighted connections.

In addition, insofar as connectionist processing is in this way highly distributed (that is, many processors and connections simultaneously shoulder a bit of the processing load), a network will often continue to function even if part of it gets destroyed (if connections are pruned). The same kind of parallel and distributed processing that enables this kind of graceful degradation also allows connectionist systems to respond sensibly to noisy or otherwise imperfect inputs. For instance, even if we encoded an input vector that deviated from the one for donkeys but was still closer to the donkey vector than to any other, our model would still likely classify it as a donkey. Traditional forms of computer programming, on the other hand, have a much greater tendency to fail or completely crash due to even minor imperfections in either programming code or inputs.

The advent of connectionist learning rules was clearly a watershed event in the history of connectionism. It made possible the automation of vast numbers of weight assignments, and this would eventually enable connectionist systems to perform feats that McCulloch and Pitts could scarcely have imagined. As a learning rule for feed-forward networks, however, Hebb’s rule faces severe limitations. Particularly damaging is the fact that the learning of one input-output pair (an association) will in many cases disrupt what a network has already learned about other associations, a process known as catastrophic interference. Another problem is that although a set of weights often exists that would allow a network to perform a given pattern association task, its discovery is often beyond the capabilities of Hebb’s rule.

b. The Delta Rule

Such shortcomings led researchers to investigate new learning rules, one of the most important being the delta rule. To train our network using the delta rule, we start it out with random weights and feed it a particular input vector from the corpus. Activity then propagates forward to the output layer. Afterwards, for a given unit u at the output layer, the network takes the actual activation of u and its desired activation and modifies weights according to the following rule:

change of weight_iu = lrate * (desired_u - a_u) * a_i

That is, to modify a connection from input i to output u, the delta rule computes the product of the difference between the desired activation of u and the actual activation (the error score), the activation of i, and a (typically very small) learning rate. Thus, assuming that unit u should be fully active (but is not) and input i happens to be highly active, the delta rule will increase the strength of the connection from i to u. This will make it more likely that the next time i is highly active, u will be too. If, on the other hand, u should have been inactive but was not, the connection from i to u will be pushed in a negative direction. As with Hebb’s rule, when an input pattern is presented during training, the delta rule is used to calculate how the weights from each input unit to a given output unit are to be modified, a procedure repeated for each output unit. The next item in the corpus is then input to the network and the process repeats, until the entire corpus (or at least that part of it that the researchers want the network to encounter) has been run through. Unlike Hebb’s rule, the delta rule typically makes small weight changes, meaning that several epochs of training may be required before a network achieves competent performance. Again unlike Hebb’s rule, however, the delta rule will in principle always slowly converge on a set of weights that will allow for mastery of all associations in a corpus, provided that such a set of weights exists. Famed connectionist Frank Rosenblatt called networks of the sort just discussed perceptrons, and he proved this convergence property about them, which became known as the perceptron convergence theorem.
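
The following sketch shows delta-rule training for a tiny two-layer network with step-function output units; the task (logical AND), the learning rate, and all names are illustrative assumptions rather than details from the text.

    def step(net, threshold=0.5):
        return 1.0 if net >= threshold else 0.0

    def train_delta(corpus, n_in, n_out, lrate=0.1, epochs=50):
        weights = [[0.0] * n_out for _ in range(n_in)]
        for _ in range(epochs):
            for input_vec, target_vec in corpus:
                for u in range(n_out):
                    # Forward pass for output unit u.
                    net = sum(input_vec[i] * weights[i][u] for i in range(n_in))
                    error = target_vec[u] - step(net)   # desired minus actual activation
                    # Delta rule: change of weight_iu = lrate * error * a_i.
                    for i in range(n_in):
                        weights[i][u] += lrate * error * input_vec[i]
        return weights

    # A linearly separable task (logical AND), which a perceptron can master.
    and_corpus = [([0, 0], [0]), ([0, 1], [0]), ([1, 0], [0]), ([1, 1], [1])]
    print(train_delta(and_corpus, n_in=2, n_out=1))  # converges to weights near [[0.3], [0.3]]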

Rosenblatt believed that his work with perceptrons constituted a radical departure from, and even spelled the beginning of the end of, logic-based classical accounts of information processing (1958, 449; see also Bechtel & Abrahamsen 2002, 6). Rosenblatt was very much concerned with the abstract information-processing powers of connectionist systems, but others, like Oliver Selfridge (1959), were investigating the ability of connectionist systems to perform specific cognitive tasks, such as recognizing handwritten letters. Connectionist models began around this time to be implemented with the aid of von Neumann devices, which, for reasons already mentioned, proved to be a blessing.

There was much exuberance associated with connectionism during this period, but it would not last long. Many point to the publication of Perceptrons by prominent classical AI researchers Marvin Minsky and Seymour Papert (1969) as the pivotal event. Minsky and Papert showed (among other things) that perceptrons cannot learn some sets of associations. The simplest of these is a mapping from the truth values of statements p and q to the truth value of p XOR q (where p XOR q is true just in case p is true or q is true, but not both). No set of weights will enable a simple two-layer feed-forward perceptron to compute the XOR function. The fault here lies largely with the architecture, for feed-forward networks with one or more layers of hidden units intervening between input and output layers (see Figure 4) can be made to perform the sorts of mappings that troubled Minsky and Papert. However, these critics also speculated that three-layer networks could never be trained to converge upon the correct set of weights. This dealt connectionism a serious setback, for it helped to deprive connectionists of the AI research funds being doled out by the Defense Advanced Research Projects Agency (DARPA). Connectionists found themselves at a major competitive disadvantage, leaving classicists with the field largely to themselves for over a decade.

c. The Generalized Delta Rule

In the 1980s, as classical AI research was hitting doldrums of its own, connectionism underwent a powerful resurgence thanks to the advent of the generalized delta rule (Rumelhart, Hinton, & Williams 1986). This rule, which is still the backbone of contemporary connectionist research, enables networks with one or more layers of hidden units to learn how to perform sets of input-output mappings of the sort that troubled Minsky and Papert. The simpler delta rule (discussed above) uses an error score (the difference between the actual activation level of an output unit and its desired activation level) and the incoming unit’s activation level to determine how much to alter a given weight. The generalized delta rule works roughly the same way for the layer of connections running from the final layer of hidden units to the output units. For a connection running into a hidden unit, the rule calculates how much the hidden unit contributed to the total error signal (the sum of the individual output unit error signals) rather than the error signal of any particular unit. It adjusts the connection from a unit in a still earlier layer to that hidden unit based upon the activity of the former and upon the latter’s contribution to the total error score. This process can be repeated for networks of varying depth. Put differently, the generalized delta rule enables backpropagation learning, where an error signal propagates backwards through multiple layers in order to guide weight modifications.

Figure 4: Three-layer Network [Created using Simbrain 2.0]
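
As a concrete illustration of the generalized delta rule, here is a compact, textbook-style sketch of backpropagation for a three-layer network of sigmoid units learning the XOR mapping that defeated two-layer perceptrons. It is a generic implementation offered under stated assumptions (squared-error learning, the learning rate, the number of hidden units, and all names are illustrative choices), not the PDP group’s original code.

    import math, random

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def train_xor(epochs=20000, lrate=0.5, n_hidden=3, seed=1):
        random.seed(seed)
        # Input-to-hidden and hidden-to-output weights, plus biases, start out random.
        w_ih = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(2)]
        b_h = [random.uniform(-1, 1) for _ in range(n_hidden)]
        w_ho = [random.uniform(-1, 1) for _ in range(n_hidden)]
        b_o = random.uniform(-1, 1)
        corpus = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

        for _ in range(epochs):
            for x, target in corpus:
                # Forward pass: input -> hidden -> output.
                h = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(2)) + b_h[j])
                     for j in range(n_hidden)]
                o = sigmoid(sum(h[j] * w_ho[j] for j in range(n_hidden)) + b_o)
                # Backward pass: output error, then each hidden unit's share of it.
                delta_o = (target - o) * o * (1 - o)
                delta_h = [delta_o * w_ho[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
                # Weight changes follow the generalized delta rule.
                for j in range(n_hidden):
                    w_ho[j] += lrate * delta_o * h[j]
                    b_h[j] += lrate * delta_h[j]
                    for i in range(2):
                        w_ih[i][j] += lrate * delta_h[j] * x[i]
                b_o += lrate * delta_o
        return w_ih, b_h, w_ho, b_o

With most random seeds this settles on weights that handle all four XOR patterns correctly, though, like any gradient-descent learner, it can occasionally get stuck in a local minimum.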

4. Connectionist Models Aplenty

Connectionism sprang back onto the scene in 1986 with a monumental two-volume compendium of connectionist modeling techniques (volume 1) and models of psychological processes (volume 2) by David Rumelhart, James McClelland and their colleagues in the Parallel Distributed Processing (PDP) research group. Each chapter of the second volume describes a connectionist model of some particular cognitive process along with a discussion of how the model departs from earlier ways of understanding that process. It included models of schemata (large scale data structures), speech recognition, memory, language comprehension, spatial reasoning and past-tense learning. Alongside this compendium, and in its wake, came a deluge of further models.

Although this new breed of connectionism was occasionally lauded as marking the next great paradigm shift in cognitive science, mainstream connectionist research has not tended to be directed at overthrowing previous ways of thinking. Rather, connectionists seem more interested in offering a deeper look at facets of cognitive processing that have already been recognized and studied in disciplines like cognitive psychology, cognitive neuropsychology and cognitive neuroscience. What are highly novel are the claims made by connectionists about the precise form of internal information processing. Before getting to those claims, let us first discuss a few other connectionist architectures.

a. Elman’s Recurrent Nets

Over the course of his investigation into whether or not a connectionist system can learn to master the complicated grammatical principles of a natural language such as English, Jeffrey Elman (1990) helped to pioneer a powerful, new connectionist architecture, sometimes known as an Elman net. This work posed a direct challenge to Chomsky’s proposal that humans are born with an innate language acquisition device, one that comes preconfigured with vast knowledge of the space of possible grammatical principles. One of Chomsky’s main arguments against Skinner’s behaviorist theory of language-learning was that no general learning principles could enable humans to produce and comprehend a limitless number of grammatical sentences. Although connectionists had attempted (for example, with the aid of finite state grammars) to show that human languages could be mastered by general learning devices, sentences containing multiple center-embedded clauses (“The cats the dog chases run away,” for instance) proved a major stumbling block. To produce and understand such a sentence requires one to be able to determine subject-verb agreements across the boundaries of multiple clauses by attending to contextual cues presented over time. All of this requires a kind of memory for preceding context that standard feed-forward connectionist systems lack.

Elman’s solution was to incorporate a side layer of context units that receive input from and send output back to a hidden unit layer. In its simplest form, an input is presented to the network and activity propagates forward to the hidden layer. On the next step (or cycle) of processing, the hidden unit vector propagates forward through weighted connections to generate an output vector while at the same time being copied onto a side layer of context units. When the second input is presented (the second word in a sentence, for example), the new hidden layer activation is the product of both this second input and activity in the context layer – that is, the hidden unit vector now contains information about both the current input and the preceding one. The hidden unit vector then produces an output vector as well as a new context vector. When the third item is input, a new hidden unit vector is produced that contains information about all of the previous time steps, and so on. This process provides Elman’s networks with time-dependent contextual information of the sort required for language-processing. Indeed, his networks are able to form highly accurate predictions regarding which words and word forms are permissible in a given context, including those that involve multiple embedded clauses.
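
The core architectural idea can be sketched in a few lines of code. The sketch below shows only the forward dynamics just described: at each time step the hidden vector is computed from the current input together with a copy of the previous hidden vector held in the context layer. All names and sizes are illustrative, and the random, untrained weights stand in for weights that a real model would learn (for example, by backpropagation).

    import math, random

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def elman_forward(sequence, n_in, n_hidden, n_out, seed=0):
        random.seed(seed)
        w_ih = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
        w_ch = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
        w_ho = [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden)]
        context = [0.0] * n_hidden      # the context layer starts out empty
        outputs = []
        for x in sequence:              # one input vector per time step (e.g., per word)
            hidden = [sigmoid(sum(x[i] * w_ih[i][j] for i in range(n_in)) +
                              sum(context[c] * w_ch[c][j] for c in range(n_hidden)))
                      for j in range(n_hidden)]
            outputs.append([sigmoid(sum(hidden[j] * w_ho[j][k] for j in range(n_hidden)))
                            for k in range(n_out)])
            context = hidden[:]         # copy the hidden vector onto the context layer
        return outputs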

While Chomsky (1993) has continued to self-consciously advocate a shift back towards the nativist psychology of the rationalists, Elman and other connectionists have at least bolstered the plausibility of a more austere empiricist approach. Connectionism is, however, much more than a simple empiricist associationism, for it is at least compatible with a more complex picture of internal dynamics. For one thing, to maintain consistency with the findings of mainstream neuropsychology, connectionists ought to (and one suspects that most do) allow that we do not begin life with a uniform, amorphous cognitive mush. Rather, as mentioned earlier, the cognitive load may be divided among numerous, functionally distinct components. Moreover, even individual feed-forward networks are often tasked with unearthing complicated statistical patterns exhibited in large amounts of data. As an indication of just how complicated a process this can be, the task of analyzing how connectionist systems manage to accomplish the impressive things that they do has turned out to be a major undertaking unto itself (see Section 5).

b. Interactive Architectures

There are, it is important to realize, connectionist architectures that do not incorporate the kinds of feed-forward connections upon which we have so far concentrated. For instance, McClelland and Rumelhart’s (1989) interactive activation and competition (IAC) architecture and its many variants utilize excitatory and inhibitory connections that run back and forth between the units in different groups. In IAC models, weights are hard-wired rather than learned, and units are typically assigned their own particular, fixed meanings. When a set of units is activated so as to encode some piece of information, activity may shift around a bit, but as units compete with one another through inter-unit inhibitory connections to become most active, activity will eventually settle into a stable state. The stable state may be viewed as the network’s reaction to the stimulus, which, depending upon the process being modeled, might be a semantic interpretation, a classification or a mnemonic association. The IAC architecture has proven particularly effective at modeling phenomena associated with long-term memory (content addressability, priming and language comprehension, for instance). The connection weights in IAC models can be set in various ways, including on the basis of individual hand selection, simulated evolution or statistical analysis of naturally occurring data (for example, the co-occurrence of words in newspapers or encyclopedias (Kintsch 1998)).

An architecture that incorporates similar competitive processing principles, with the added twist that it allows weights to be learned, is the self-organizing feature map (SOFM) (see Kohonen 1983; see also Miikkulainen 1993). SOFMs learn to map complicated input vectors onto the individual units of a two-dimensional array of units. Unlike feed-forward systems that are supplied with information about the correct output for a given input, SOFMs learn in an unsupervised manner. Training consists simply in presenting the model with numerous input vectors. During training the network adjusts its inter-unit weights so that each unit becomes highly ‘tuned’ to a specific kind of input vector and the two-dimensional array is divided up in ways that reflect the most salient groupings of vectors. In principle, nothing more complicated than a Hebbian learning algorithm is required to train most SOFMs. After training, when an input pattern is presented, competition yields a single clear winner (for example, the most highly active unit), which is called the system’s image (or interpretation) of that input.
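
A hedged sketch of this kind of unsupervised training is given below: for each input vector the best-matching unit on a two-dimensional grid is found, and it and its neighbors have their weights nudged toward that input. The grid size, learning rate and neighborhood radius are illustrative choices (a realistic Kohonen map would also shrink the radius and learning rate over time), not values taken from the text.

    import math, random

    def train_sofm(inputs, grid=10, dim=4, epochs=20, lrate=0.3, radius=2.0, seed=0):
        random.seed(seed)
        # Each cell of the grid holds a weight vector with the same dimensionality as the inputs.
        w = [[[random.random() for _ in range(dim)] for _ in range(grid)] for _ in range(grid)]
        for _ in range(epochs):
            for x in inputs:
                # Best-matching unit: the grid cell whose weight vector lies closest to x.
                bmu = min(((r, c) for r in range(grid) for c in range(grid)),
                          key=lambda rc: sum((w[rc[0]][rc[1]][k] - x[k]) ** 2 for k in range(dim)))
                for r in range(grid):
                    for c in range(grid):
                        if math.hypot(r - bmu[0], c - bmu[1]) <= radius:
                            # Move the winner and its neighbors toward the input vector.
                            for k in range(dim):
                                w[r][c][k] += lrate * (x[k] - w[r][c][k])
        return w

    # Usage: weights = train_sofm([[0.1, 0.9, 0.2, 0.4], [0.8, 0.1, 0.7, 0.3]])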

SOFMs were coming into their own even during the connectionism drought of the 1970s, thanks in large part to Finnish researcher Teuvo Kohonen. Ultimately it was found that, with proper learning procedures, trained SOFMs exhibit a number of biologically interesting features that will be familiar to anyone who knows a bit about topographic maps (for example, retinotopic, tonotopic and somatotopic maps) in the mammalian cortex. SOFMs tend not to allow a portion of the map to go unused; they represent similar input vectors with neighboring units, which collectively amount to a topographic map of the space of input vectors; and if a training corpus contains many similar input vectors, the portion of the map devoted to the task of discriminating between them will expand, resulting in a map with a distorted topography. SOFMs have even been used to model the formation of retinotopically organized columns of contour detectors found in the primary visual cortex (Goodhill 1993). SOFMs thus reside somewhere along the upper end of the biological-plausibility continuum.

Here we have encountered just a smattering of connectionist learning algorithms and architectures, which continue to evolve. Indeed, despite what in some quarters has been a protracted and often heated debate between connectionists and classicists (discussed below), many researchers are content to move back and forth between, and also to merge, the two approaches depending upon the task at hand.

5. Making Sense of Connectionist Processing

Connectionist systems generally learn by detecting complicated statistical patterns present in huge amounts of data. This often requires detection of complicated cues as to the proper response to a given input, the salience of which often varies with context. This can make it difficult to determine precisely how a given connectionist system utilizes its units and connections to accomplish the goals set for it.

One common way of making sense of the workings of connectionist systems is to view them at a coarse, rather than fine, grain of analysis — to see them as concerned with the relationships between different activation vectors, not individual units and weighted connections. Consider, for instance, how a fully trained Elman network learns how to process particular words. Typically nouns like “ball,” “boy,” “cat,” and “potato” will produce hidden unit activation vectors that are more similar to one another (they tend to cluster together) than they are to “runs,” “ate,” and “coughed”. Moreover, the vectors for “boy” and “cat” will tend to be more similar to each other than either is to the “ball” or “potato” vectors. One way of determining that this is the case is to begin by conceiving activation vectors as points within a space that has as many dimensions as there are units. For instance, the activation levels of two units might be represented as a single point in a two-dimensional plane where the y axis represents the value of the first unit and the x axis represents the second unit. This is called the state space for those units. Thus, if there are two units whose activation values are 0.2 and 0.7, this can be represented as the point where these two values intersect (Figure 5).

Figure 5: Activation of Two Units Plotted as Point in 2-D State Space

The activation levels of three units can be represented as the point in a cube where the three values intersect, and so on for other numbers of units. Of course, there is a limit to the number of dimensions we can depict or visualize, but there is no limit to the number of dimensions we can represent algebraically. Thus, even where many units are involved, activation vectors can be represented as points in high-dimensional space and the similarity of two vectors can be determined by measuring the proximity of those points in high-dimensional state space. This, however, tells us nothing about the way context determines the specific way in which networks represent particular words. Other techniques (for example, principal components analysis and multidimensional scaling) have been employed to understand such subtleties as the context-sensitive time-course of processing.
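
The proximity idea is simple to operationalize; one common (though not the only) choice of metric is Euclidean distance, as in this small sketch with made-up three-unit hidden vectors:

    import math

    def distance(vec_a, vec_b):
        # Euclidean distance between two activation vectors viewed as points in state space.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)))

    # Invented hidden-unit vectors for three words (purely illustrative values).
    boy, cat, potato = [0.8, 0.2, 0.1], [0.7, 0.3, 0.1], [0.1, 0.9, 0.4]
    print(distance(boy, cat) < distance(boy, potato))  # True: "boy" lies nearer to "cat"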

One of the interesting things revealed about connectionist systems through these sorts of techniques has been that networks which share the same connection structure but begin training with different random starting weights will often learn to perform a given task equally well and to do so by partitioning hidden unit space in similar ways. For instance, the clustering in Elman’s models discussed above will likely obtain for different networks even though they have very different weights and activities at the level of individual connections and units.

At this point, we are also in a good position to understand some differences in how connectionist networks code information. In the simplest case, a particular unit will represent a particular piece of information – for instance, our hypothetical network about animals uses particular units to represent particular features of animals. This is called a localist encoding scheme. In other cases an entire collection of activation values is taken to represent something – for instance, an entire input vector of our hypothetical animal classification network might represent the characteristics of a particular animal. This is a distributed coding scheme at the whole-animal level, but still a localist encoding scheme at the feature level. When we turn to hidden-unit representations, however, things are often quite different. Hidden-unit representations of inputs are often distributed without employing localist encoding at the level of individual units. That is, particular hidden units often fail to have any particular input feature that they are exclusively sensitive to. Rather, they participate in different ways in the processing of many different kinds of input. This is called coarse coding, and there are ways of coarse coding input and output patterns as well. The fact that connectionist networks excel at forming and processing these highly distributed representations is one of their most distinctive and important features.

Also important is that connectionist models often excel at processing novel input patterns (ones not encountered during training) appropriately. Successful performance of a task will often generalize to other related tasks. This is because connectionist models often work by detecting statistical patterns present in a corpus (of input-output pairs, for instance). They learn to process particular inputs in particular ways, and when they encounter inputs similar to those encountered during training they process them in a similar manner. For instance, Elman’s networks were trained to determine which words and word forms to expect given a particular context (for example, “The boy threw the ______”). After training, they could do this very well even for sentence parts they had not encountered before. One caveat here is that connectionist systems with numerous hidden units (relative to the amount of variability in the training corpus) tend to use the extra memory to ‘remember by rote’ how to treat specific input patterns rather than discerning more abstract statistical patterns obtaining across many different input-output vectors. Consequently, in such cases performance tends not to generalize to novel cases very well.

As we have seen, connectionist networks have a number of desirable features from a cognitive modeling standpoint. There are, however, also serious concerns about connectionism. One is that connectionist models must usually undergo a great deal of training on many different inputs in order to perform a task and exhibit adequate generalization. In many instances, however, we can form a permanent memory (upon being told of a loved one’s passing, for example) with zero repetition (this was also a major blow to the old psychological notion that rehearsal is required for a memory to make it into long-term storage). Nor is there much need to fear that subsequent memories will overwrite earlier ones, a process known in connectionist circles as catastrophic interference. We can also very quickly detect patterns in stimuli (for instance, the pattern exhibited by “J, M, P…”) and apply them to new stimuli (for example, “7, 10, 13…”). Unfortunately, many (though not all) connectionist networks (namely many back-propagation networks) fail to exhibit one-shot learning and are prone to catastrophic interference.

Another worry about back-propagation networks is that the generalized delta rule is, biologically speaking, implausible. It certainly does look that way so far, but even if the criticism hits the mark we should bear in mind the difference between computability theory questions and learning theory questions. In the case of connectionism, questions of the former sort concern what sorts of things connectionist systems can and cannot do, and questions of the latter address how connectionist systems might come to learn (or evolve) the ability to do these things. The back-propagation algorithm makes the networks that utilize it implausible from the perspective of learning theory, not computability theory. It should, in other words, be viewed as a major accomplishment when a connectionist network that utilizes only biologically plausible processing principles (for example, activation thresholds and weighted connections) is able to perform a cognitive task that had hitherto seemed mysterious. It constitutes a biologically plausible model of the underlying mechanisms regardless of whether it came to possess that structure through hand-selection of weights, Hebbian learning, back-propagation or simulated evolution.

6. Connectionism and the Mind

The classical conception of cognition was deeply entrenched in philosophy (namely in empirically oriented philosophy of mind) and cognitive science when the connectionist program was resurrected in the 1980s. Nevertheless, many researchers flocked to connectionism, feeling that it held much greater promise and that it might revamp our common-sense conception of ourselves. During the early days of the ensuing controversy, the differences between connectionist and classical models of cognition seemed to be fairly stark. Connectionist networks learned how to engage in the parallel processing of highly distributed representations and were fault tolerant because of it. Classical systems were vulnerable to catastrophic failure due to their reliance upon the serial application of syntax-sensitive rules to syntactically structured (sentence-like) representations. Connectionist systems superimposed many kinds of information across their units and weights, whereas classical systems stored separate pieces of information in distinct memory registers and accessed them in serial fashion on the basis of their numerical addresses.

Perhaps most importantly, connectionism promised to bridge low-level neuroscience and high-level psychology. Classicism, by contrast, lent itself to dismissive views about the relevance of neuroscience to psychology. It helped spawn the idea that cognitive processes can be realized by any of countless distinct physical substrates (see Multiple Realizability). The basic idea here is that if the mind is just a program being run by the brain, the material substrate through which the program is instantiated drops out as irrelevant. After all, computationally identical computers can be made out of neurons, vacuum tubes, microchips, pistons and gears, and so forth, which means that computer programs can be run on highly heterogeneous machines. Neural nets are but one of these types, and so they are of no essential relevance to psychology. On the connectionist view, by contrast, human cognition can only be understood by paying considerable attention to the kind of physical mechanism that instantiates it.

Although these sorts of differences seemed fairly stark in the early days of the connectionism-classicism debate, proponents of the classical conception have recently made great progress emulating the aforementioned virtues of connectionist processing. For instance, classical systems have been implemented with a high degree of redundancy, through the action of many processors working in parallel, and by incorporating fuzzier rules to allow for input variability. In these ways, classical systems can be endowed with a much higher level of fault and noise tolerance, not to mention processing speed (see Bechtel & Abrahamsen 2002). We should also not lose sight of the fact that classical systems have virtually always been capable of learning. They have, in particular, long excelled at learning new ways to efficiently search branching problem spaces. That said, connectionist systems seem to have a very different natural learning aptitude – namely, they excel at picking up on complicated patterns, sub-patterns, and exceptions, and apparently without the need for syntax-sensitive inference rules. This claim has, however, not gone uncontested.

a. Rules versus General Learning Mechanisms: The Past-Tense Controversy

Rumelhart and McClelland’s (1986) model of past-tense learning has long been at the heart of this particular controversy. What these researchers claimed to have shown was that over the course of learning how to produce past-tense forms of verbs, their connectionist model naturally exhibited the same distinctive u-shaped learning curve as children. Of particular interest was the fact that early in the learning process children come to generate the correct past-tense forms of a number of verbs, mostly irregulars (“go” → “went”). Later, performance drops precipitously as they pick up on certain fairly general principles (for example, adding “-ed”) and over-apply them even to previously learned irregulars (“went” may become “goed”). Lastly, performance increases as the child learns both the rules and their exceptions.

What Rumelhart and McClelland (1986) attempted to show was that this sort of process need not be underwritten by mechanisms that work by applying physically and functionally distinct rules to representations. Instead, all of the relevant information can be stored in superimposed fashion within the weights of a connectionist network (really three of them linked end-to-end). Pinker and Prince (1988), however, would charge (inter alia) that the picture of linguistic processing painted by Rumelhart and McClelland was extremely simplistic and that their training corpus was artificially structured (namely, that the proportion of regular to irregular verbs varied unnaturally over the course of training) so as to elicit u-shaped learning. Plunkett and Marchman (1993) went a long way towards remedying the second apparent defect, though Marcus (1995) complained that they did not go far enough since the proportion of regular to irregular verbs was still not completely homogeneous throughout training. As with most of the major debates constituting the broader connectionist-classicist controversy, this one has yet to be conclusively resolved. Nevertheless, it seems clear that this line of connectionist research does at least suggest something of more general importance – namely, that an interplay between a structured environment and general associative learning mechanisms might in principle conspire so as to yield complicated behaviors of the sort that lead some researchers to posit inner classical processes.

b. Concepts

Some connectionists also hope to challenge the classical account of concepts, which embody knowledge of categories and kinds. It has long been widely held that concepts specify the individually necessary and jointly sufficient conditions for category membership – for instance, “bachelor” might be said to apply to all and only unmarried, eligible males. Membership conditions of this sort would give concepts a sharp, all-or-none character, and they naturally lend themselves to instantiation in terms of formal inference rules and sentential representations. However, as Wittgenstein (1953) pointed out, many words (for example, “game”) seem to lack these sorts of strict membership criteria. Instead, their referents bear a much looser family resemblance relation to one another. Rosch & Mervis (1975) later provided apparent experimental support for the related idea that our knowledge of categories is organized not in terms of necessary and sufficient conditions but rather in terms of clusters of features, some of which (namely those most frequently encountered in category members) are more strongly associated with the category than others. For instance, the ability to fly is more frequently encountered in birds than is the ability to swim, though neither ability is common to all birds. On the prototype view (and also on the closely related exemplar view), category instances are thought of as clustering together in a hyper-dimensional semantic space (a space in which there are as many dimensions as there are relevant features). In this space, the prototype is the central region around which instances cluster (exemplar theory essentially does away with this abstract region, allowing only for memory of actual concrete instances). There are clearly significant isomorphisms between concepts conceived of in this way and the kinds of hyper-dimensional clusters of hidden unit representations formed by connectionist networks, and so the two approaches are often viewed as natural allies (Horgan & Tienson 1991). This way of thinking about concepts has, of course, not gone unchallenged (see Rey 1983 and Barsalou 1987 for two very different responses).

c. Connectionism and Eliminativism

Neuroscientist Patricia Churchland and philosopher Paul Churchland have argued that connectionism has done much to weaken the plausibility of our pre-scientific conception of mental processes (our folk psychology). Like other prominent figures in the debate regarding connectionism and folk psychology, the Churchlands appear to be heavily influenced by Wilfrid Sellars’ view that folk psychology is a theory that enables predictions and explanations of everyday behaviors, a theory that posits the internal manipulation of sentence-like representations of the things that we believe and desire. The classical conception of cognition is, accordingly, viewed as a natural spinoff of this folk theory. The Churchlands maintain that neither the folk theory nor the classical theory bears much resemblance to the way in which representations are actually stored and transformed in the human brain. What leads many astray, say Churchland and Sejnowski (1990), is the idea that the structure of an effect directly reflects the structure of its cause (as exemplified by the homuncular theory of embryonic development). Thus, many mistakenly think that the structure of the language through which we express our thoughts is a clear indication of the structure of the thoughts themselves. The Churchlands think that connectionism may afford a glimpse into the future of cognitive neuroscience, a future wherein the classical conception is supplanted by the view that thoughts are just points in hyper-dimensional neural state space and sequences of thoughts are trajectories through this space (see Churchland 1989).

A more moderate position on these issues has been advanced by Daniel Dennett (1991), who largely agrees with the Churchlands regarding the broadly connectionist character of our actual inner workings. He also maintains, however, that folk psychology is for all practical purposes indispensable. It enables us to adopt a high-level stance towards human behavior wherein we are able to detect patterns that we would miss if we restricted ourselves to a low-level neurological stance. In the same way, he claims, one can gain great predictive leverage over a chess-playing computer by ignoring the low-level details of its inner circuitry and treating it as a thinking opponent. Although an electrical engineer who had perfect information about the device’s low-level inner workings could in principle make much more accurate predictions about its behavior, she would get so bogged down in those low-level details as to make her greater predictive leverage useless for any real-time practical purposes. The chess expert wisely forsakes some accuracy in favor of a large increase in efficiency when he treats the machine as a thinking opponent, an intentional agent. Dennett maintains that we do the same when we adopt an intentional stance towards human behavior. Thus, although neuroscience will not discover any of the inner sentences (putatively) posited by folk psychology, a high-level theoretical apparatus that includes them is an indispensable predictive instrument.

On a related note, McCauley (1986) claims that whereas it is relatively common for one high-level theory to be eliminated in favor of another, it is much harder to find examples where a high-level theory is eliminated in favor of a lower-level theory in the way that the Churchlands envision. However, perhaps neither Dennett nor McCauley is being entirely fair to the Churchlands in this regard. What the Churchlands foretell is the elimination of a high-level folk theory in favor of another high-level theory that emanates out of connectionist and neuroscientific research. Connectionists, we have seen, look for ways of understanding how their models accomplish the tasks set for them by abstracting away from neural particulars. The Churchlands, one might argue, are no exception. Their view that sequences of thoughts are trajectories through a hyperdimensional landscape abstracts away from most neural specifics, such as action potentials and inhibitory neurotransmitters.

d. Classicists on the Offensive: Fodor and Pylyshyn’s Critique

When connectionism reemerged in the 1980s, it helped to foment resistance to both classicism and folk psychology. In response, stalwart classicists Jerry Fodor and Zenon Pylyshyn (1988) formulated a trenchant critique of connectionism. One imagines that they hoped to do for the new connectionism what Chomsky did for the associationist psychology of the radical behaviorists and what Minsky and Papert did for the old connectionism. They did not accomplish that much, but they did succeed in framing the debate over connectionism for years to come. Though their criticisms of connectionism were wide-ranging, they were largely aimed at showing that connectionism could not account for important characteristics of human thinking, such as its generally truth-preserving character, its productivity, and (most important of all) its systematicity. Of course they had no qualms with the proposal that vaguely connectionist-style processes happen, in the human case, to implement high-level, classical computations.

i. Reason

Unlike Dennett and the Churchlands, Fodor and Pylyshyn (F&P) claim that folk psychology works so well because it is largely correct. On their view, human thinking involves the rule-governed formulation and manipulation of sentences in an inner linguistic code (sometimes called mentalese). [Incidentally, one of the main reasons why classicists maintain that thinking occurs in a special ‘thought language’ rather than in one’s native natural language is that they want to preserve the notion that people who speak different languages can nevertheless think the same thoughts – for instance, the thought that snow is white.] One bit of evidence that Fodor frequently marshals in support of this proposal is the putative fact that human thinking typically progresses in a largely truth-preserving manner. That is to say, if one’s initial beliefs are true, the subsequent beliefs that one infers from them are also likely to be true. For instance, from the belief that the ATM will not give you any money and the belief that it gave money to the people before and after you in line, you might reasonably form a new belief that there is something wrong with either your card or your account. Says Fodor (1987), if thinking were not typically truth-preserving in this way, there wouldn’t be much point in thinking. Indeed, given a historical context in which philosophers throughout the ages frequently decried the notion that any mechanism could engage in reasoning, it is no small matter that early work in AI yielded the first fully mechanical models and perhaps even artificial implementations of important facets of human reasoning. On the classical conception, this can be done through the purely formal, syntax-sensitive application of rules to sentences insofar as the syntactic properties mirror the semantic ones. Logicians of the late nineteenth and early twentieth century showed how to accomplish just this in the abstract, so all that was left was to figure out (as von Neumann did) how to realize logical principles in artifacts.

F&P (1988) argue that connectionist systems can only ever realize the same degree of truth-preserving processing by implementing a classical architecture. This would, on their view, render connectionism a sub-cognitive endeavor. One way connectionists could respond to this challenge would be to create connectionist systems that support truth-preservation without any reliance upon sentential representations or formal inference rules. Bechtel and Abrahamsen (2002) explore another option, however, which is to situate important facets of rationality in human interactions with the external symbols of natural and formal languages. Bechtel and Abrahamsen argue that “the ability to manipulate external symbols in accordance with the principles of logic need not depend upon a mental mechanism that itself manipulates internal symbols” (1991, 173). This proposal is backed by a pair of connectionist models that learn to detect patterns during the construction of formal deductive proofs and to use this information to decide on the validity of arguments and to accurately fill in missing premises.

ii. Productivity and Systematicity

Much more attention has been paid to other aspects of F&P’s (1988) critique, such as their claim that only a classical architecture can account for the productivity and systematicity of thought. To better understand the nature of their concerns, it might help if we first consider the putative productivity and systematicity of natural languages.

Consider, to start with, the following sentence:

(1)  “The angry jay chased the cat.”

The rules governing English appear to license (1), but not (2), which is made from (modulo capitalization) qualitatively identical parts:

(2)  “Angry the the chased jay cat.”

We who are fluent in some natural language have knowledge of the rules that govern the permissible ways in which the basic components of that language can be arranged – that is, we have mastery of the syntax of the language.

Sentences are, of course, also typically intended to carry or convey some meaning. The meaning of a sentence, say F&P (1988), is determined by the meanings of the individual constituents and by the manner in which they are arranged. Thus (3), which is made from the same constituents as (1), conveys a very different meaning.

(3)  “The angry cat chased the jay.”

Natural language expressions, in other words, have a combinatorial syntax and semantics.

In addition, natural languages appear to be characterized by certain recursive rules which enable the production of an infinite variety of syntactically distinct sentences. For instance, in English one such rule allows any two grammatical statements to be combined with ‘and’. Thus, if (1) and (3) are grammatical, so is this:

(4)  “The angry jay chased the cat and the angry cat chased the jay.”

Sentence (4) too can be combined with another, as in (5) which conjoins (4) and (3):

(5)  “The angry jay chased the cat and the angry cat chased the jay, and the angry cat chased the jay.”

Earlier we discussed another recursive principle which allows for center-embedded clauses.

One who has mastered the combinatorial and recursive syntax and semantics of a natural language is, according to classicists like F&P (1988), thereby capable in principle of producing and comprehending an infinite number of grammatically distinct sentences. In other words, their mastery of these linguistic principles gives them a productive linguistic competence. It is also reputed to give them a systematic competence, in that a fluent language user who can produce and understand one sentence can produce and understand systematic variants. A fluent English speaker who can produce and understand (1) will surely be able to produce and understand (3). It is, on the other hand, entirely possible for one who has learned English from a phrase-book (that is, without learning the meanings of the constituents or the combinatorial semantics of the language) to be able to produce and understand (1) but not its systematic variant (3).

Thinking, F&P (1988) claim, is also productive and systematic, which is to say that we are capable of thinking an infinite variety of thoughts and that the ability to think some thoughts is intrinsically connected with the ability to think others. For instance, on this view, anyone who can think the thought expressed by (1) will be able to think the thought expressed by (3). Indeed, claims Fodor (1987), since to understand a sentence is to entertain the thought the sentence expresses, the productivity and systematicity of language imply the productivity and systematicity of thought. F&P (1988) also maintain that just as the productivity and systematicity of language is best explained by its combinatorial and recursive syntax and semantics, so too is the productivity and systematicity of thought. Indeed, they say, this is the only explanation anyone has ever offered.

The systematicity issue has generated a vast debate (see Bechtel & Abrahamsen 2002), but one general line of connectionist response has probably garnered the most attention. This approach, which appeals to functional rather than literal compositionality (see van Gelder 1990), is most often associated with Smolensky (1990) and with Pollack (1990), though for simplicity’s sake discussion will be restricted to the latter.

Pollack (1990) uses recurrent connectionist networks to generate compressed, distributed encodings of syntactic strings and subsequently uses those encodings to either recreate the original string or to perform a systematic transformation of it (e.g., from “Mary loved John” to “John loved Mary”). Pollack’s approach was quickly extended by Chalmers (1990), who showed that one could use such compressed distributed representations to perform systematic transformations (namely moving from an active to a passive form) of even sentences with complex embedded clauses. He showed that this could be done for both familiar and novel sentences. What this suggests is that connectionism might offer its own unique, non-classical account of the apparent systematicity of thought processes. However, Fodor and McLaughlin (1990) argue that such demonstrations only show that networks can be forced to exhibit systematic processing, not that they exhibit it naturally in the way that classical systems do. After all, on a classical account, the same rules that license one expression will automatically license its systematic variant. It bears noting, however, that this approach may itself need to impose some ad hoc constraints in order to work. Aizawa (1997) points out, for instance, that many classical systems do not exhibit systematicity. On the flipside, Matthews (1997) notes that systematic variants that are licensed by the rules of syntax need not be thinkable. Waskan (2006) makes a similar point, noting that thinking may be more and less systematic than language and that the actual degree to which thought is systematic may be best accounted for by, theoretically speaking, pushing the structure of the world ‘up’ into the thought medium, rather than pushing the structure of language ‘down’. This might, however, come as cold comfort to connectionists, for it appears to  merely replace one competitor to connectionism with another.

7. Anti-Representationalism: Dynamical Systems Theory, A-Life and Embodied Cognition

As alluded to above, whatever F&P may have hoped, connectionism has continued to thrive. Connectionist techniques are now employed in virtually every corner of cognitive science. On the other hand, despite what connectionists may have wished for, these techniques have not come close to fully supplanting classical ones. There is now much more of a peaceful coexistence between the two camps. Indeed, what probably seems far more important to both sides these days is the advent and promulgation of approaches that reject or downplay central assumptions of both classicists and mainstream connectionists, the most important being that human cognition is largely constituted by the creation, manipulation, storage and utilization of representations. Many cognitive researchers who identify themselves with the dynamical systems, artificial life and (albeit to a much lesser extent) embodied cognition approaches endorse the doctrine that one version of the world is enough. Even so, practitioners of the first two approaches have often co-opted connectionist techniques and terminology. In closing, let us briefly consider the rationale behind each of these two approaches and their relation to connectionism.

Briefly, dynamical systems theorists adopt a very high-level perspective on human behavior (inner and/or outer) that treats its state at any given time as a point in high-dimensional space (where the number of dimensions is determined by the number of numerical variables being used to quantify the behavior) and treats its time course as a trajectory through that space (van Gelder & Port 1995). As connectionist research has revealed, there tend to be regularities in the trajectories taken by particular types of system through their state spaces. As paths are plotted, it is often as if the trajectory taken by a system gets attracted to certain regions and repulsed by others, much like a marble rolling across a landscape can get guided by valleys, roll away from peaks, and get trapped in wells (local or global minima). The general goal is to formulate equations like those at work in the physical sciences that will capture such regularities in the continuous time-course of behavior. Connectionist systems have often provided nice case studies in how to characterize a system from the dynamical systems perspective. However, whether working from within this perspective in physics or in cognitive science, researchers find little need to invoke the ontologically strange category of representations in order to understand the time course of a system’s behavior.

Researchers in artificial life primarily focus on creating artificial creatures (virtual or real) that can navigate environments in a fully autonomous manner. The strategy generally favored by artificial life researchers is to start small, with a simple behavior repertoire, to test one’s design in an environment (preferably a real one), to adjust it until success is achieved, and then to gradually add layers of complexity by repeating this process. In one early and influential manifesto of the ‘a-life’ movement, Rodney Brooks claims, “When intelligence is approached in an incremental manner, with strict reliance on interfacing to the real world through perception and action, reliance on representation disappears” (Brooks 1991). The aims of a-life research are sometimes achieved through the deliberate engineering efforts of modelers, but connectionist learning techniques are also commonly employed, as are simulated evolutionary processes (processes that operate over both the bodies and brains of organisms, for instance).

8. Where Have All the Connectionists Gone?

There may well be fewer today who label themselves “connectionists” than there were during the 1990s. Fodor & Pylyshyn’s (1988) critique may be partly responsible for this shift, though it is probably due more to the novelty of the approach wearing off and the initial fervor dying down. Also relevant is the fact that connectionist techniques are now very widely employed throughout cognitive science, often by people who have very little in common ideologically. It is thus increasingly hard to discern, among those who utilize connectionist modeling techniques, any one clearly demarcated ideology or research program. Even many of those who continue to maintain an at least background commitment to the original ideals of connectionism might nowadays find that there are clearer ways of signaling who they are and what they care about than calling themselves “connectionists.” In any case, whether or not connectionist techniques are limited in some important respects, it is perfectly clear that connectionist modeling techniques are powerful and flexible enough to have been widely embraced by philosophers and cognitive scientists, whether they be mainstream moderates or radical insurgents. It is therefore hard to imagine any technological or theoretical development that would lead to connectionism’s complete abandonment. Thus, despite some early fits and starts, connectionism is now most assuredly here to stay.

9. References and Further Reading

a. References

  • Aizawa, K. (1997). Explaining systematicity, Mind and Language, 12, 115-136.
  • Barsalou, L. (1987). The instability of graded structure: Implications for the nature of concepts. In U. Neisser (Ed.), Concepts and conceptual development: Ecological and intellectual factors in categorization. Cambridge, UK: Cambridge University Press, 101-140.
  • Bechtel, W. & A. Abrahamsen. (1991). Connectionism and the mind: An introduction to parallel processing in networks. Cambridge, MA: Basil Blackwell.
  • Bechtel, W. & A. Abrahamsen. (2002). Connectionism and the mind: An introduction to parallel processing in networks, 2nd Ed. Cambridge, MA: Basil Blackwell.
    • Highly recommended introduction to connectionism and the philosophy thereof.
  • Boden, M. (2006). Mind as machine: A history of cognitive science. New York: Oxford.
  • Brooks, R. (1991). Intelligence without representation. Artificial Intelligence, 47, 139-159.
  • Chalmers, D. (1990). Syntactic transformations on distributed representations. Connection Science, 2, 53-62.
  • Chomsky, N. (1993). On the nature, use and acquisition of language. In A. Goldman (Ed.), Readings in the Philosophy and Cognitive Science. Cambridge, MA: MIT, 511-534.
  • Churchland, P.M. (1989). A neurocomputational perspective: The nature of mind and the structure of science. Cambridge, MA: MIT.
  • Churchland, P.S. & T. Sejnowski. (1990).  Neural representation and neural computation. Philosophical Perspectives, 4, 343-382.
  • Dennett, D. (1991). Real Patterns. The Journal of Philosophy, 88, 27-51.
  • Elman, J. (1990). Finding Structure in Time. Cognitive Science, 14, 179-211.
  • Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT.
  • Fodor, J. & B. McLaughlin. (1990). Connectionism and the problem of systematicity: Why Smolensky’s solution doesn’t work, Cognition, 35, 183-204.
  • Fodor, J. & Z. Pylyshyn. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.
  • Franklin, S. & M. Garzon. (1996). Computation by discrete neural nets. In P. Smolensky, M. Mozer, & D. Rumelhart (Eds.) Mathematical perspectives on neural networks (41-84). Mahwah, NJ: Lawrence Earlbaum.
  • Goodhill, G. (1993). Topography and ocular dominance with positive correlations. Biological Cybernetics, 69, 109-118.
  • Hebb, D.O. (1949). The Organization of Behavior. New York: Wiley.
  • Horgan, T. & J. Tienson (1991). Overview. In Horgan, T. & J. Tienson (Eds.) Connectionism and the Philosophy of Mind. Dordrecht: Kluwer.
  • Kintsch, W. (1998). Comprehension: A Paradigm for Cognition. Cambridge: Cambridge University Press.
  • Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59-69.
  • Marcus, G. (1995). The acquisition of the English past tense in children and multilayered connectionist networks. Cognition, 56, 271-279.
  • Matthews, R. (1997). Can connectionists explain systematicity? Mind and Language, 12, 154-177.
  • McCauley, R. (1986). Intertheoretic relations and the future of psychology. Philosophy of Science, 53, 179-199.
  • McClelland, J. & D. Rumelhart. (1989). Explorations in parallel distributed processing: A handbook of models, programs, and exercises. Cambridge, MA: MIT.
    • This excellent hands-on introduction to connectionist models of psychological processes has been replaced by: R. O’Reilly & Y. Munakata. (2000). Computational explorations in cognitive neuroscience: Understanding the mind by simulating the brain. Cambridge, MA: MIT. Companion software called Emergent.
  • McCulloch, W. & W. Pitts. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
  • Miikkulainen, R. (1993). Subsymbolic Natural Language Processing. Cambridge, MA: MIT.
    • Highly recommended for its introduction to Kohonen nets.
  • Minsky, M. & S. Papert. (1969). Perceptrons: An introduction to computational geometry. Cambridge, MA: MIT.
  • Pinker, S. & A. Prince. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 73-193.
  • Plunkett, K. & V. Marchman. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48, 21-69.
  • Pollack, J. (1990). Recursive distributed representations. Artificial Intelligence, 46, 77-105.
  • Rey, G. (1983). Concepts and stereotypes. Cognition, 15, 237-262.
  • Rosch, E. & C. Mervis. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573-605.
  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
  • Rumelhart, D., G. Hinton, & R. Williams. (1986). Learning internal representations by error propagation. In D. Rumelhart & J. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition, Vol. 1. Cambridge, MA: MIT, 318-362.
  • Selfridge, O. (1959). Pandemonium: A paradigm for learning. Rpt. in J. Anderson & E. Rosenfeld (1988), Neurocomputing: Foundations of research. Cambridge, MA: MIT, 115-122.
  • Smolensky, P. (1990). Tensor product variable binding and the representation of symbolic structures in connectionist networks. Artificial Intelligence, 46, 159–216.
  • van Gelder, T. (1990). Compositionality: A connectionist variation on a classical theme. Cognitive Science, 14, 355-384.
  • van Gelder, T. & R. Port. (1995). Mind as motion: Explorations in the dynamics of cognition. Cambridge, MA: MIT.
  • Waskan, J. (2006). Models and Cognition: Prediction and explanation in everyday life and in science. Cambridge, MA: MIT.
  • Wittgenstein, L. (1953). Philosophical Investigations. New York: Macmillan.

b. Connectionism Freeware

  • BugBrain provides an excellent, accessible, and highly entertaining game-based hands-on tutorial on the basics of neural networks and gives one a good idea of what a-life is all about as well. BugBrain comes with some learning components, but they are not recommended.
  • Emergent is research-grade software that accompanies O’Reilly and Munakata’s Computational explorations in cognitive neuroscience (referenced above).
  • Simbrain is a fairly accessible, but somewhat weak, tool for implementing a variety of common neural network architectures.
  • Framsticks is a wonderful program that enables anyone with the time and patience to evolve virtual stick creatures and their neural network controllers.

Author Information

Jonathan Waskan
Email: waskan@illinois.edu
University of Illinois at Urbana-Champaign
U. S. A.

Poincaré’s Philosophy of Mathematics

Jules Henri Poincaré was an important French mathematician, scientist, and philosopher of the late nineteenth and early twentieth century who was especially known for his conventionalist philosophy.  Most of his published work was in analysis, topology, probability, mechanics and mathematical physics.  His overall philosophy of mathematics is Kantian in spirit: he held that intuition provides a foundation for all of mathematics, including geometry.

He advocated conventionalism for some principles of science, most notably for the choice of applied geometry (the geometry that is best paired with physics for an account of reality).  But the choice of a geometric system is not an arbitrary convention.  According to Poincaré, we choose the system based on considerations of simplicity and efficiency given the overall empirical and theoretical situation in which we find ourselves.  Along with the desiderata of theoretical simplicity and efficiency, empirical information must inform and guide our choices, including our geometric choices.  Thus, even with respect to applied geometry, where Poincaré is at his most conventional, empirical information is crucial to the choice we make.

Balancing the empirical element, there is also a strongly a priori element in Poincaré’s philosophical views, for he argued that intuition provides an a priori epistemological foundation for mathematics.  His views about intuition descend from Kant, whom Poincaré explicitly defends.  Kant held that space and time are the forms of experience, and provide the a priori, intuitive sources of mathematical content.  While defending the same basic vision, Poincaré adapts Kant’s views by rejecting space and time as that foundation.  Rather than time, Poincaré argues for the intuition of indefinite repetition, or iteration, as the main source of extra-logical content in number theory.  Rather than space, Poincaré argues that, in addition to iteration, we must presuppose an intuitive understanding of both the continuum and the concept of group in geometry and topology.

Table of Contents

  1. Introduction
  2. Geometry and the A Priori
  3. Poincaré’s Relationship to Kant
  4. Poincaré’s Arguments for Intuition: Continuity
  5. Poincaré’s Arguments for Intuition: Indefinite Repetition
    1. Argument One
    2. Argument Two
    3. Argument Three
    4. Argument Four
  6. Intuition and Other Topics in Poincaré’s Philosophy
    1. Predicativism
    2. Philosophy of Science
  7. References and Further Reading

1. Introduction

Jules Henri Poincaré (1854-1912) was an important French mathematician, scientist and thinker.  He was a prolific mathematician, publishing in a wide variety of areas, including analysis, topology, probability, mechanics and mathematical physics.  He also wrote popular and philosophical works on the foundations of mathematics and science, from which one can sketch a picture of his views.

Because he was an eminent mathematician, Poincaré’s philosophical views were influential and taken seriously during his lifetime.  Today, however, his papers seem somewhat loose, informal, and at times polemical.  Indeed, many are based on speeches he gave to primarily non-philosophical audiences, and part of their aim was to entertain.  One must therefore be careful when reading Poincaré not to misinterpret him as being inconsistent, or as not taking philosophy seriously.  He was a mathematician, not a trained philosopher.  Yet he regarded philosophical and foundational questions as important to science, and one can still find many philosophical insights in his writings.

He was also a Kantian in that he was committed to mathematical intuition as the foundation of mathematics.  Though he is known above all for his conventionalist philosophy, his views are really quite complicated and subtle.  He espoused conventionalism for some principles of science, most notably for the choice of applied geometry, but he was not a conventionalist about every aspect of science.  Even the choice of a geometric system is not a completely arbitrary convention.  It is not the kind of choice that could be based on the flip of a coin, for example.  Rather, we choose – according to Poincaré – based on considerations of simplicity and efficiency given the overall empirical and theoretical situation in which we find ourselves.  His point is that when articulating a theoretical framework for a given base of evidence there are almost always alternatives.  This has become known under the slogan “underdetermination of theory by data.”  So, there are almost always choices in how we construct our theory.  Along with the desiderata of theoretical simplicity and efficiency, empirical information must inform and guide our choices – including our geometric choices.  Thus, even with respect to applied geometry, where Poincaré is at his most conventional, empirical information is crucial to the choice we make.

Balancing the empirical element, there is also a strongly apriorist element in Poincaré’s philosophical views.  First, he viewed Euclidean geometry as so simple that we would always prefer to alter physics than to choose a non-Euclidean geometry.  This is despite the fact that he actually used non-Euclidean geometry in some of his work on celestial mechanics.  We can regard this belief in the inherent simplicity and appeal of Euclidean geometry as simply a case of a bad gamble:  he bet on the wrong horse because he bet too early (prior to general relativity).  However, there is a second, more deeply seated, apriorist element in geometry – one that links his philosophy of geometry with his more general philosophy of mathematics.  That is his belief that mathematical intuition provides an a priori epistemological foundation for mathematics, including geometry.

2. Geometry and the A Priori

All geometries are based on some common presuppositions in the axioms, postulates, and/or definitions.  Non-Euclidean geometries can be constructed by substituting alternative versions of Euclid’s parallel postulate; but they begin by keeping some axioms fixed.  Keeping these aspects of the axiomatic structure fixed is what makes the different systems all geometries.  Unifying the various geometric systems is the fact that they determine the possible constructions, or objects, in space.  What primarily differentiates Riemannian and Lobachevskian geometries from Euclidean geometry is their differing existence claims regarding parallel lines (whether or not they exist, and if so how many).  In Euclidean geometry, given a line, there is exactly one parallel to it on the plane through a given external point.  In Lobachevskian geometry there are infinitely many such parallels; and in Riemannian, there are none.  The different axioms regarding parallels yield different internal angle sum theorems in each geometry:  Euclidean triangles have internal angles that sum to exactly 180 degrees; Lobachevskian triangles sum to less than 180 degrees; and Riemannian triangles sum to greater than 180 degrees.  (In the latter two cases, how much more or less than 180 degrees depends on the size of the triangle relative to the curvature of the space.)
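
The parenthetical point can be stated compactly in modern notation (a standard formula not given in the text): for a geodesic triangle of area A with interior angles α, β, γ on a surface of constant curvature K,

    \[
    \alpha + \beta + \gamma \;=\; \pi + K \cdot A .
    \]

When K = 0 the sum is exactly π (180 degrees), as in Euclidean geometry; when K < 0 (the Lobachevskian case) the sum falls short of π; and when K > 0 (the Riemannian case) it exceeds π, with the discrepancy growing in proportion to the triangle’s area.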

If we consider the unifying features of these three approaches to geometry, that is, the features that the different metric systems share, a natural question concerns the epistemological and methodological status of this common basis.  One thought is that what grounds this common basis, which we might call “pure geometry in general”, is an intuitive understanding of space in general.  This is essentially what Poincaré proposed:  that there is an a priori intuitive basis for geometry in general, upon which the different metric geometries can be constructed in pure mathematics.  Once constructed, they can then be applied depending on empirical and theoretical need.  The a priori basis for geometry has two elements for Poincaré.  First, he postulated that we have an intuitive understanding of continuity, which – applied to the idea of space – provides an a priori foundation for all geometry, as well as for topology.  Second, he proposed that we also have an a priori understanding of group theory.  This additional group-theoretic element, applied for example to rigid body motion, leads to the set of geometries of constant curvature.

For Poincaré, therefore, even if physics can help us choose between different metric geometries, the set of possibilities from which it chooses is a priori delimited by the nature of our minds.  We are led to a delimited set of possible geometries by our intuition of continuity coupled with the a priori understanding of groups.  Together these constrain our natural assumptions about possible constructions and motions in space.

As in contemporary conceptions of mathematics, Poincaré made a fairly sharp distinction between pure and applied geometry.  Pure geometry is part of pure mathematics.   As such its foundation consists in a combination of logic and intuition.  In this way, he is a Kantian about all of pure mathematics, including the mathematical study of various geometric systems.  (There is also a hint of Hilbertian axiomatics here:  in pure geometry one studies various axiom systems.)  Conventionalism for Poincaré describes applied geometry only – to characterize the quasi-empirical choice of which metric geometry to pair with physics to best model the world.

Poincaré’s philosophy of pure mathematics is in fact dominated by the attempt to defend mathematical intuition.  This takes various forms throughout his career, but perhaps the most important example is his defense of some version of Kant’s theory of intuition in arithmetic, in opposition to the logicist program.  The logicists attempted to provide a mathematical demonstration that arithmetic has no need for intuition, Kantian or otherwise, by deriving the basic postulates of arithmetic from logical laws and logically expressed definitions alone.  Poincaré argued against this program, insisting that any formal system adequate to derive the basic postulates of arithmetic will by necessity presuppose some intuitive arithmetic.

In contrast with geometry, where there is a range of genuine alternatives to consider, he agreed with the logicists that there is only one genuine arithmetic.  So, the set of options is here much more strictly delimited – to one.  He disagreed with the logicists, who saw the uniqueness and epistemic depth of arithmetic as an indicator that it is nothing more than logic.  For Poincaré, arithmetic is uniquely forced on us by intuition rather than by logic alone.  Furthermore, for Poincaré, arithmetic lies at the bottom of the scientific pyramid:  it is the most fundamental of the sciences, the one presupposed by all the rest.  Arithmetic’s foundation is thus important for the rest of the sciences.  In order to understand Poincaré’s philosophy of science and mathematics in general, therefore, one must come to grips with his philosophy of arithmetic.

3. Poincaré’s Relationship to Kant

We must begin with Kant, who is the historical source of Poincaré’s appeal to mathematical intuition.  For Kant, there are two a priori intuitions, space and time; and these provide the form of all experience.  All experiences, inner and outer, are temporal, or in time; and all outer experiences are also spatial.  A thought or desire might be an example of a non-spatial but temporal experience; and taking a walk would be in both space and time.  According to Kant, the mind comes equipped with these forms – for otherwise, he argues, we could not account for the coherence, structure, and universality of human experience.  In his vision, a priori intuition, or spatio-temporality, helps to mold brute sensations into the objects of experience.

These same a priori intuitions, the a priori form of all experience, also explain how mathematics is both a priori (non-empirical) and yet has non-trivial content.  In short, a priori intuition supplies the non-empirical content of mathematics.  Mathematics has a distinctive subject matter, but that subject matter is not provided by some external reality, Platonic or otherwise.  Rather, it is provided a priori – by the mind itself.  Intuitive space provides much of the a priori synthetic content for geometry (which is Euclidean for Kant); and intuitive time provides the a priori synthetic content for quantitative mathematics.  This makes mathematical knowledge both synthetic and a priori.  It is synthetic because it is not mere analysis of concepts, and has an intuitive subject matter.  It is a priori because its subject matter or content, spatio-temporality, is given a priori by the form of experience.

Poincaré adopts Kant’s basic vision of mathematics as synthetic a priori knowledge owing to the epistemological and methodological foundation provided by a priori intuition.  Yet, as we have seen already, he does not agree with many details of Kant’s philosophy of mathematics.  Unlike Kant, Poincaré considers Euclidean geometry to be a kind of choice; so Euclidean geometry is not uniquely, or a priori, imposed by intuition.  The closest thing to Kant’s intuitive space, for Poincaré, is not Euclidean space but rather the more minimal intuitive idea of continuity, which is one of the features presupposed in Euclidean space.  Rather than intuitive time, Poincaré emphasizes the intuitive understanding of indefinite iteration for number theory.  Though he views time as a “form pre-existent in our mind”, and one can hypothesize about the connection between this form and the intuition of indefinite iteration, Poincaré does not himself stress the connection.  Thus, both sources of mathematical information – the intuitive continuum and the intuition of indefinite iteration – are somewhat less robust, and less connected to experience, for Poincaré than for Kant.

4. Poincaré’s Arguments for Intuition: Continuity

First, we shall deal briefly with the intuitive continuum.  The clearest argument for an a priori intuition of spatial or mathematical continuity is quite Kantian, but it appears only late in Poincaré’s writings (Last Essays).  In earlier works his remarks about the continuum are less definite and less Kantian.  For example, in Science and Hypothesis, chapter II, he focused more on priority than apriority, arguing that the continuum is mathematically prior to analysis rather than that it is given by a priori intuition.  He thought analysis presupposes the mathematical continuum because one cannot generate the real number continuum by set theoretic constructions, “from below.”  To get genuine continuity, rather than a merely dense set, and to account for the origin, utility, and our overall understanding of the symbolic constructions, Poincaré felt we must appeal to a preconceived idea of a continuum, where “the line exists previous to the point” (pp. 18, 21).  There is no clear suggestion here of the ideas of Kant or of the idea that a continuum is given by a priori intuition.  The mathematical continuum is rather presented as partly suggested by experience and geometry, and then refined by analysis.

A few years later, in The Value of Science, he moves closer to an apriorist view – though he does not yet use the term “intuition” in connection with the continuum.  In Chapter III he discusses the “primitively amorphous” continuum that forms a common basis for the different metric systems  (p. 37).  And in Chapter IV he asserts that the mathematical continuum is constructed from “materials and models” rather than nothing.  “These materials, like these models, preexist within [the mind],” (p. 72).  He goes on to say that it is experience that enables us to choose from the different possible models.  Thus, he has here taken a big step towards suggesting a Kantian intuition of continuity – in asserting that some materials must pre-exist within the mind in order to construct the mathematical continuum.

Later, however, Poincaré explicitly connects this idea of the pre-existence of the continuum with intuition:

“I shall conclude that there is in all of us an intuitive notion of the continuum of any number of dimensions whatever because we possess the capacity to construct a physical and mathematical continuum; and that this capacity exists in us before any experience because, without it, experience properly speaking would be impossible and would be reduced to brute sensations, unsuitable for any organization; and because this intuition is merely the awareness that we possess this faculty. And yet this faculty could be used in different ways; it could enable us to construct a space of four just as well as a space of three dimensions.  It is the exterior world, it is experience which induces us to make use of it in one sense rather than in the other.” (Last Essays, 44)

The intuitive continuum is an a priori basis for mathematical and empirical construction.  In arguing for this intuition, Poincaré appeals to its necessity for coherent, organized, experience, as well as its necessity for our capacity to construct mathematical theories of the continuum.  His approach here is now quite similar to some of Kant’s transcendental arguments.  For example, Kant argues that spatio-temporality must be brought to rather than derived from experience, for it is what makes experience coherent.  In other words, Kant argues that spatio-temporality cannot be derived, for it is required in order for us to derive anything from experience.  Poincaré’s appeal to intuition in order to explain both a mathematical capacity – the capacity to construct certain mathematical structures – and the fact that our experience is coherent, is thus very reminiscent of Kant.  It is a priori because it is necessarily prior to experience, providing its form or capacity for organization.

5. Poincaré’s Arguments for Intuition: Indefinite Repetition

In contrast, even Poincaré’s clearest arguments for an intuition of iteration seem quite non-Kantian, for they are less connected to coherent experience, and more focused on pure mathematical contexts.  Several arguments are sketched below.

a. Argument One

One approach involves a kind of Sherlock Holmes strategy.  Poincaré considers several alternatives to mathematics being synthetic a priori, or based on intuition, and eliminates them.  In the course of the argument he ends up with the view that inductive reasoning is especially characteristic of mathematics, and that this is why mathematics is synthetic a priori.  Induction will turn out to be the main conduit of intuition in mathematics, but first Poincaré focuses simply on its classification as synthetic a priori.  This particular argument has three parts.

He begins by considering the alternative that mathematics, being a priori, is purely deductive and has no extra-logical content.  Against this, Poincaré leverages his famous giant tautology objection.  If mathematics were just logic, it would be a giant tautology.  It is not.  Thus, mathematics has some non-logical source of information or content.

The very possibility of mathematical science seems an insoluble contradiction.  If this science is only deductive in appearance, from whence is derived that perfect rigour which is challenged by none?  If, on the contrary, all the propositions which it enunciates may be derived in order by the rules of formal logic, how is it that mathematics is not reduced to a gigantic tautology?… Are we then to admit that the enunciations of all the theorems with which so many volumes are filled, are only indirect ways of saying that A is A? (Science and Hypothesis, pp 1-2)

Though this reductio by ridicule is amusing, it presupposes some things about logic, which, after logicism, are neither obvious nor uncontroversial.  One presupposition is that if something is a tautology we could recognize it.  This had already been contested by the logicist Dedekind, who acknowledged that chains of inferences can be so long, unconscious, and even frightening, that we may not recognize them as purely logical, even if they are. (Dedekind, p. 33)  Another presupposition Poincaré makes here is that logic is a giant tautology, which had already been contested by the logicist Frege, who explicitly disputes the idea that logic is sterile, (Frege, section 17).  Finally, even if we grant Poincaré’s presuppositions about logic, that it is recognizably empty, the extra-logical content on which mathematics depends is undetermined by this argument.  Additional arguments are required to move us towards the conclusion that mathematics is synthetic a priori, dependent on intuition rather than experience or some other source for its content.

Thus, Poincaré continues in the second part of this argument by considering the possibility that the extralogical content is simply provided by the non-logical axioms.  Formalism, or axiomatics, would be an example of this type of view.  In opposition to this, Poincaré argues that axiomatics is not faithful to mathematics.  According to the axiomatic viewpoint, logic can only extract what is given in the axioms (Science and Hypothesis, 2).  Poincaré feels that mathematics does more than squeeze out information that resides in axioms.  Mathematical growth can occur, he thought, within mathematics itself – without the addition of new axioms or other information.  He insists, in fact, that growth occurs by way of mathematical reasoning itself.

So, if mathematical reasoning can yield genuine growth without adding new axioms, and given his conception of logic as empty, then mathematical reasoning, not just mathematical content, must transcend logic alone.  How can mathematical reasoning transcend logic?  Well, mathematicians constantly use the tool of reasoning by recurrence, or inductive reasoning and definition, in order to make general definitions and conclusions.  A simple example of the principle of induction is:  if we can show that 0 has a property P, and we can also show that for any number n, if n has P then n+1 has P, then we can conclude that all numbers have the property P.  Poincaré regarded inductive reasoning as mathematical reasoning par excellence; and he felt that it transcends logic because it gives us a way to jump over infinite steps of reasoning.  Once we think about it a bit, we see it must be true: P(0) together with P(n) → P(n+1) entails P(1); P(1) together with P(n) → P(n+1) entails P(2); and so on.  The conclusion of induction – that for all n, P(n) – enables us to jump over these tedious modus ponens steps, and Poincaré viewed it as a major source of progress in mathematics (Science and Hypothesis, 10-11).
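
In modern notation (not Poincaré’s own), the principle he has in mind can be displayed as

    \[
    \bigl[\, P(0) \;\wedge\; \forall n\,\bigl(P(n) \rightarrow P(n+1)\bigr) \,\bigr] \;\Longrightarrow\; \forall n\, P(n).
    \]

Unfolding the premises instance by instance yields, by repeated modus ponens, only the finite chain P(0), P(1), P(2), and so on; the conclusion that P holds for every n covers all of these infinitely many steps at once, which is precisely the leap Poincaré thinks logic by itself cannot supply.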

Finally, to finish off this argument, Poincaré examines the nature of induction and reasoning by recurrence.  He argues that since induction cannot be logically derived, and it was certainly not traditionally regarded as a logical principle, it is synthetic.  However, it is not a merely experimental truth, because – despite the fact that it transcends logic – it is “imposed on us with an irresistible weight of evidence,” (Science and Hypothesis, 12-13).  Thus, he concludes, it is synthetic and a priori.  This status is also why it could not be regarded as a mere convention:  because it is not a choice or a definition.  Rather, it is a rule that is imposed on us by the nature of our own minds,  (Science and Hypothesis, 48).  By way of this three-part argument, Poincaré feels he has exhausted the likely alternatives; and is left with only one viable option, which is that induction is a synthetic a priori principle.

b. Argument Two

The second argument is by introspection.  This follows the last part of his argument above, and consists of an examination of the nature of the “irresistible weight of evidence” which forces induction on us.  The aim of this reflection is to establish why induction is synthetic a priori: namely, that it is based on a priori intuition.  Here we get some of the distinctive flavor of Poincaré’s conception of intuition in contrast with Kant’s.  For we see that for Poincaré, the intuition can be a kind of insight, somewhat evocative of Husserl, rather than a form of experience.  The intuition of iteration involves insight into a power of the mind itself.  So, it is the mind having a self-insight into its own power to conceive of the indefinite iteration of an act once seen to be possible:

Why then is this view [the judgement that induction is a true principle] imposed upon us with such an irresistible weight of evidence?  It is because it is only the affirmation of the power of the mind which knows it can conceive of the indefinite repetition of the same act, when the act is once possible.  The mind has a direct intuition of this power, and experiment can only be for it an opportunity of using it, and thereby of becoming conscious of it. (Science and Hypothesis, 13).

In this case intuition gives us insight into a power of our own minds, a power to conceive of indefinite repetition, which in turn enables us to understand why induction must be true.  Thus, intuition lies at the foundation of mathematics – whenever we explicitly (as in induction) or implicitly conceive of indefinite iterations (as in understanding domains generated by iterated processes such as the successor function).  Mathematical induction is different, however, from scientific induction, for it is certain while empirical induction is never certain.  Its certainty derives from the fact that it merely affirms a property of the mind itself rather than making an assertion about something outside the mind (a priori versus a posteriori) (Science and Hypothesis, 13).  In this second argument, Poincaré uses intuition to explain the synthetic a priori status of induction.  Thus, despite the somewhat non-Kantian flavor of this intuition – its connection to insight rather than the form of experience – Poincaré’s use of it is analogous to that of Kant, who also appealed to intuition to explain the synthetic a priori status of mathematics.

c. Argument Three

A third argument is really a set of objections to logicism, which take the form of circularity arguments.  When combined they add up to a powerful objection against logical or set theoretic reconstructions of arithmetic.  Each argument follows the same basic format, which is that any formal reconstruction of arithmetic that tries to avoid intuition will fail; for it will presuppose intuition somewhere in the reconstruction.

There are at least four, and taking them in order, the first two objections may not seem very impressive.

(i) First, Poincaré seems to treat logicism as a kind of formalism or conventionalism, as if the Peano Axioms are implicit definitions of the concept of number.  Against this he argues that showing that these axioms are consistent requires the use of induction, which is itself one of the implicit definitions.  So this would be a circular endeavor.

And it would be if that were what logicism was up to.  However, logicists aimed to derive the Peano Axioms – including induction – from explicit definitions of zero, number and successor; they did not use the Peano axioms as (implicit) definitions themselves.  So this first argument seems to misfire.

(ii) In the second circularity argument Poincaré objects that the symbolism of logicism merely hides the fact that its definitions of the numbers are circular.  For example, he complains that the logicist definition of zero uses symbolic notation that means, “Zero is the number of things satisfying a condition never satisfied.  But as ‘never’ means in no case I do not see that the progress is great…” (Ewald translation, 1905b, VII, 1029)  He makes similar remarks against the standard definition of one, which in a sense invokes the idea of two.

Now, anyone familiar with contemporary logic may regard Poincaré’s complaint as a mere psychological objection based on logical ignorance, but I think this is too easy a dismissal.  His view is that a basic understanding of number is necessary in order to understand the symbolic definitions of the numbers, and this is not obviously a purely psychological point.  It is a normative claim about understanding rather than an empirical claim about how we happen to think.  So this argument cannot be immediately dismissed as has been claimed (e.g., see Goldfarb 1988).

(iii) The last two arguments are intertwined and are generally regarded as stronger.  Following on the second argument above, Poincaré’s third objection complains that the new logic is mathematics in (symbolic) disguise.  We can reconstruct this argument along the following lines.  Modern symbolic logic has an infinite combinatorial nature, which makes it very different from Aristotelian logic.  For example, the standard definition of well-formed formula is recursive, which as we noted above is a peculiarly mathematical tool according to Poincaré.  It is the recursive nature of logic that makes it infinite.  Since recursive definition was formerly a peculiarly mathematical tool, the worry is that the logicist has in some sense shifted the boundary between mathematics and logic.  If logic has “invaded the territory” of mathematics, and “stolen” some of its tools, then of course it would have more power.  In thus shifting the boundary, Poincaré believes, logicists have presupposed an essentially arithmetic, intuitive tool.  That is, the logicist has not avoided intuition, for he presupposes intuition in the very tools he uses, that is, in the new logic itself.
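
A representative illustration (schematic, and not Poincaré’s own example) is the usual recursive definition of a well-formed formula for a simple propositional language:

    \[
    \varphi \;::=\; p \;\mid\; \neg \varphi \;\mid\; (\varphi \wedge \varphi) \;\mid\; (\varphi \rightarrow \varphi),
    \]

where p ranges over the atomic sentence letters.  Because the category of formula is defined in terms of itself, grasping it requires grasping the indefinite iterability of the formation rules, which is just the sort of arithmetical, intuitive resource Poincaré claims the logicist cannot do without.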

(iv) Fourth, if the logicist is, even just potentially, adding substantive content to logic via these new powerful tools, he owes us a justification that the new principles are – at least – consistent.  For example, the logicist could treat the rules of inference as disguised definitions of the logical constants, and then show that their use can never lead to inconsistency.  But, Poincaré objects, there will be no such consistency proof without induction.  So, the logicist will still have to presuppose induction, which raises two problems.  The justification would be circular, since induction is one of the principles to be derived; and logicism would be explicitly depending on intuition in justifying the new logical principles, which is precisely what it claimed to avoid.

This is not the place to assess Poincaré’s objections to logicism and the extent to which they can be dismissed as psychologistic.  (See Goldfarb 1988 for such arguments; and see the response, Folina 2006, for a rebuttal.)  Let us just say that when put together, these arguments suggest a genuine challenge to logicism along the following lines.  Modern symbolic logic has an infinite combinatorial structure, which can only be justified by mathematical means, including inductive tools.

d. Argument Four

This structure owes itself to the fact that ordinary definitions of well-formed formula in a standard system are recursive; and thus the inference rules themselves – which depend on what makes something a well-formed formula of a certain type – will also inherit this infinite combinatorial nature (Argument 3).  Any proper understanding of the rules of inference will thus presuppose some grasp of the recursive procedures that determine them (Argument 2).  Thus, logicist reconstructions of arithmetic, even if symbolic, cannot reduce arithmetic to an intuition-free content if recursive reasoning is intuitive.

6. Intuition and Other Topics in Poincaré’s Philosophy

To conclude, consider two other important topics:  Poincaré’s advocacy of predicative definitions in mathematics; and the more general issue of his philosophy of natural science.  Each fits with his semi-Kantian defense of intuition in mathematics.

a. Predicativism

Poincaré was central in advancing our understanding of the nature of the vicious circle paradoxes of mathematics.  He was the first to articulate a general distinction between predicative and impredicative definitions, and he helped to show the relevance of this distinction to the paradoxes in general.  Rather than treating the paradoxes on a case-by-case basis, he and Russell saw a common cause underlying all of them – that of self-reference.  Russell’s solution to the paradoxes – his ramified theory of types (developed in Principia Mathematica) – is indeed an attempt to formalize the idea of eliminating impredicative definitions.

The vicious circle paradoxes of mathematics showed that one can create a contradiction in mathematics by using a certain kind of self-referential definition along with some basic existence principles.  The most famous is Russell’s paradox, because it was Russell who first published his discovery of an inconsistency in Frege’s logicist system.  In generating the numbers, Frege had used an axiom that entails that any property whatever determines a set – the set of objects that have that property.  Russell then considered the property of being non-self-membered.  Some sets are self-membered (the set of abstract objects is itself an abstract object, so it is self-membered); some are non-self-membered (the set of elephants is not an elephant, so it is non-self-membered).  However, if non-self-membered is a bona fide property, then it too should determine a set according to Frege’s axiom:  the set of all sets that are non-self-membered.  This yields a contradiction because, given this property and the existence of the set by Frege’s axiom, the set in question is both self-membered and non-self-membered.
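
In symbols (a standard modern reconstruction rather than Frege’s or Russell’s own notation), comprehension on the property of non-self-membership yields a set R together with a contradiction:

    \[
    R = \{\, x \mid x \notin x \,\} \qquad\text{whence}\qquad R \in R \;\longleftrightarrow\; R \notin R.
    \]

Instantiating the defining condition with R itself delivers both directions of the biconditional.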

The property of being non-self-membered, however, is impredicative – for, to collect together all the sets that have this property, one must see whether the property applies to the set one is in the process of collecting.  In general, impredicative definitions appeal either implicitly or explicitly to a collection to which the object being defined belongs.  The problem with outlawing all impredicative definitions, however, is that many are unproblematic.  For example, “Tallest person in the room” is strictly speaking impredicative but neither logically inconsistent nor even confusing.  “Least upper bound” was thought by many mathematicians to fall into this category – of strictly impredicative but not viciously circular.  Indeed, the program to eliminate impredicativity from mathematics was doomed to fail.  Too many widely accepted definitions would have been eliminated; and mathematics would, as Weyl put it, have been almost unbearably awkward. (Weyl, 54)
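
The least upper bound case can be made explicit in modern notation (not a formulation found in Poincaré): for a set S of reals that is bounded above,

    \[
    b = \operatorname{lub}(S) \;\Longleftrightarrow\; \forall s \in S\,(s \le b) \;\wedge\; \forall c\,\bigl[\,\forall s \in S\,(s \le c) \rightarrow b \le c\,\bigr].
    \]

The right-hand side quantifies over all upper bounds c of S, a totality to which the number being defined itself belongs; that is what makes the definition impredicative, even though nothing viciously circular is going on.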

Poincaré’s attitude to impredicativity was interesting and complex.  He was central in characterizing the notion, and as a constructivist he was someone for whom the notion is important.  However, he did not advocate a formal reconstruction of mathematics by eliminating all impredicativity.  Instead, he first advocated simply avoiding impredicative definitions.  Second, and more importantly, he distinguished between different definition contexts.  One definition context is constructive.  When the object does not already exist by virtue of another definition or presupposition, the definition context is constructive – and then it must be predicative.  For otherwise we are attempting to build something out of materials that require it to already exist, which is certainly a viciously circular procedure.  The other definition context is non-constructive, such as, when a definition merely identifies, or picks out, an already existing object.  In this case impredicativity is harmless, for it is more like the case of the “tallest man in the room”, which merely picks out an existing person and does not thereby construct him.  So, for Poincaré even the constructivist needs to worry about impredicativity only in certain situations:  when the definition is playing the role of a construction.

In this way, despite the fact that Poincaré was a constructivist, he did not regard all mathematical definitions as constructions.  There are two types of nonconstructive definition contexts:  when the object exists by way of a prior definition, and when the object exists by guarantee of intuition.  For him, least upper bound was indeed similar to the specification, “Tallest man in the room” – because he regarded sets of upper bounds as given a priori by the intuitive continuum, since all real numbers are thereby guaranteed to exist.  By relying on intuition to supplement his constructivism, he attempted to avoid the unbearable awkwardness and restrictions of a purely predicativist approach to mathematics.

b. Philosophy of Science

Poincaré’s philosophy of natural science covers much interesting terrain.  He was famous for distinguishing between types of hypotheses in science, but he also distinguished between types of facts, emphasizing the importance of simple facts in science.  Simple facts are the most general and most useful facts, which also have the power to unify different areas of science.  These same facts are the ones that interest us, and they are the most beautiful as well.  Their beauty rests on their “harmonious order of the parts,” which “a pure intelligence can grasp” (The Value of Science, 8).  Simplicity, beauty, and utility are one and the same for Poincaré.

A second important theme in Poincaré’s vision of scientific knowledge involves his appeal to Darwin’s theory of evolution.  He asks why we find beauty in the simple, general, harmonious facts.  One answer is Darwinian:  natural selection will favor creatures that find interest and beauty in the facts that prove more useful to their survival.  The idea is that humans’ noticing of, and interest in, regularities no doubt helped them survive.  Indeed, Poincaré appeals to natural selection in just this context (The Value of Science, 5, 9).

Third, as noted above, Poincaré makes important distinctions between types of hypotheses in Science and Hypothesis.  Some hypotheses are mere conventions, or definitions in disguise; some are tentative hypotheses that are malleable as a theory is being articulated or built; and some are verifiable, “And when once confirmed by experiment become truths of great fertility” (Science and Hypothesis, xxii).  Though he is a conventionalist about some aspects of science, he opposes what he calls nominalism, which places too much emphasis on free choice in science.

Poincaré regards the utility of science as evidence that scientists do not create facts – they discover facts.  Yet, on the other hand, he does not espouse a sort of direct realism by which science merely reflects the objective world.  Science neither creates, nor passively reflects truth.  Rather, it has a limited power to uncover certain kinds of truths – those that capture, “Not things themselves… but the relations between things; outside those relations there is no reality knowable,”  (Science and Hypothesis, xxiv).

Let us consider these three aspects of Poincaré’s philosophy of science side by side with his constructivist philosophy of mathematics.  For Poincaré the most harmonious, simple, and beautiful facts are those that are typically expressed mathematically.  He goes so far as to assert that the only objective reality that science can discover consists of relations between facts; and these relations are expressed mathematically, (The Value of Science, 13).  Thus, mathematics does not merely provide a useful language for science; it provides the only possible language for knowing the only types of facts we can objectively know – the relational facts.

Poincaré’s emphasis on structural, or relational, facts, together with his rejection of the idea that science discovers the essences of things themselves, has been characterized by some as structural realism (Worrall).  Structural realism currently takes various forms, but the basic aim is to stake out a moderate, middle position between skepticism and naïve realism.  We cannot know things in themselves, or things directly.  So, against naïve realism, science does not directly reflect reality.  Yet the success of science is surely not a miracle; its progress not a mere illusion.  We can explain this success, without naïve or direct realism, with the hypothesis that the important, lasting truths that science discovers are structural, or relational, in character.  Poincaré indeed espouses views that fit well with structural realism.

If relations are the most objective facts we can know; and if this is a form of realism; then relations must be real.  A question arises, however, over whether or not Poincaré’s underlying Kantian views are in tension with the realism in structural realism.  That is, given Poincaré’s anti-realism about mathematics, emphasizing the mathematical nature of the structural facts we can know seems to move us even further away from realism.  So, a question is whether his view should really be called structural Kantianism rather than structural realism.  If structure is mathematical, and mathematics is not conceived realistically, then how can he be a realist about structure?

I think there is a way to preserve the realism in his structural realism by remembering two things:  first, his appeal to the empirical basis and utility of science in opposition to the nominalist; and second, his Darwinism.  Take utility first.  We express the lasting, useful scientific relations mathematically; but it does not follow that the relations expressed mathematically have no reality to them over and above mathematical reality.  If the relations had no such reality, they wouldn’t be so useful.  Moreover, since the scientist relies on experimental facts, “His freedom is always limited by the properties of the raw material on which he works” (The Value of Science, 121).  The rules that the scientist lays down are not arbitrary, like the rules of a game; they are constrained by experiment (The Value of Science, 114).  They are also proven by their long-term usefulness; and some facts even survive theory change, at least in rough form (The Value of Science, 95).

For Poincaré, the true relations, the real relations, are shown by their endurance through theory change; and he believed science had uncovered a number of such truths.  This is consistent with the view that what endures through scientific change, the enduring mathematically expressed relations, reflects reality as it really is, (Science and Hypothesis, chapter X).  This is the same structural realist idea that science can cut nature at its joints, where the increasing complexity of science, including the overthrow of old theories for new ones, can sometimes be construed as science making more refined cuts in roughly the same places as it progresses.  (Think of a 16th century map, which is superseded by newer, more precise, maps.  It is not that the earlier map represented nothing.)

We can bolster this picture with Poincaré’s Darwinism.  We evolved in the world as it is.  This is a kind of minimal realism, for it entails that the world is a certain way independent of our social and scientific constructions.  Evolutionary pressure gives us capacities that help us to survive.  So, there is an evolved fit between our cognitive structures and the structures of the world.  If there weren’t, we wouldn’t have survived; indeed Poincaré suggests that if the world did not contain real regularities then there might be no life at all:

The most interesting facts are those which may serve many times; these are the facts which have a chance of coming up again.  We have been so fortunate as to be born in a world where there are such…. In [a world without recurring facts] there would be no science; perhaps thought and even life would be impossible, since evolution could not there develop the preservational instincts. (The Value of Science, 5)

The existence of life, no less than of science, confirms the existence of real regularities in the world.  We are beings who notice, and even look for, regularities.  So we survive.  In addition, although we impose mathematics on our cognition of the world, on the way we cognize the regularities, what we impose is not arbitrary.  Rather, mathematics reflects aspects of our cognitive capacities that have helped us survive in the world as it is.  That is, our inclination to search for order and regularities is also what makes us mathematical.

Kantian constructivism about mathematics is thus not opposed to scientific realism, provided realism is not taken in a naïve way.  For Poincaré, the structural realist hypothesis is that the enduring relations, which we can know, are real, because we have evolved to cut nature at its real joints or, as he once put it, its “nodal points” (Science and Method, 287).  Mathematics is a sort of by-product of evolution, on this picture.  In this way, Poincaré’s Kantianism about pure mathematics is supported by a Darwinian conception of human evolution – a picture that enables his philosophy of mathematics to coexist with his diverse views about natural science.

7. References and Further Reading

  • Dedekind:  Essays on the Theory of Numbers, Berman transl, Dover, 1963.
  • Ewald:    From Kant to Hilbert, Oxford University Press, 1996. (Contains good translations of several papers by Poincaré that were formerly available in English only in abridged form.)
  • Folina:  Poincaré and the Philosophy of Mathematics, Macmillan, 1992.
  • Folina:  “Poincaré’s Circularity Arguments for Mathematical Intuition,” The Kantian Legacy in Nineteenth Century Science, Friedman and Nordmann eds, MIT Press, 2006.
  • Frege:  The Foundations of Arithmetic, J L Austin transl, Oxford, 1969.
  • Goldfarb:  “Poincaré against the logicists,” History and Philosophy of Modern Mathematics, Aspray and Kitcher eds, University of Minnesota Press, 1988.
  • Greffe, Heinzmann and Lorenz:  Henri Poincaré, Science and Philosophy, Akademie Verlag and Albert Blanchard, 1994.  (Anthology containing a wide variety of papers.)
  • Kant:  Critique of Pure Reason, N K Smith transl, St Martin’s Press, 1965.
  • Poincaré:  Science and Hypothesis, W J Greenstreet transl, Dover 1952 (reprint of 1905; includes introduction by Larmor and general prefatory essay by Poincaré).
  • Poincaré: The Value of Science, George Bruce Halsted transl, Dover, 1958 (includes prefatory essay by Poincaré on the choice of facts).
  • Poincaré: Science and Method, Francis Maitland transl, Thoemmes Press, 1996   (reprint of 1914 edition with preface by Russell).
  • Poincaré: Last Essays, John Bolduc transl, Dover, 1963.
  • Poincaré: “Mathematics and Logic” (I, 1905b), in From Kant to Hilbert, Ewald ed, Halsted and Ewald transl, Oxford University Press, 1996.
  • Russell with Alfred North Whitehead:  Principia Mathematica, 1910-1913. 3 vols. Cambridge, UK: Cambridge Univ. Press. Revised ed., 1925-1927.
  • Weyl:  Philosophy of Mathematics and Natural Science, Helmer transl, Atheneum, 1963.
  • Worrall:  “Structural realism: the best of both worlds?” in Dialectica 43, pp. 99-124, 1989.

Author Information

Janet Folina
Email: folina@macalester.edu
Macalester College
U. S. A.

 

Tibetan Philosophy

The term “Tibet” refers to a geographic area around the Himalayan mountains and the culture which originated there.  Tibetan thought is a living tradition of rigorous argumentation, psychological insights, and philosophically relevant ideas concerning metaphysics, epistemology, ethics, and moral psychology.  It has a rigorous and formal system of philosophical debate and a wealth of meditative traditions, both of which provide insights into the nature of reality, the self, and truth.

Though it is strongly influenced by earlier Indian Buddhist philosophy, Tibetan thought offers a range of perspectives on these issues and presents many insights and practices of its own.  This article will provide an overview of topics that have been influential in Tibetan thought and attempt to emphasize topics that are indigenously Tibetan or have been significantly developed by Tibetan thinkers.  It is important to keep in mind that Tibetan intellectual culture often treats innovation differently than the West does.  When a thinker comes up with a new distinction, argument, or practice, it is likely to be attributed to an older, often Indian, source for various reasons including (but by no means limited to) modesty, authority, loyalty, or admiration.

Though this article avoids assuming a background knowledge of Buddhism, an understanding of the basic ideas and worldview of Buddhism, in particular Mahāyāna Buddhism, is essential for understanding Tibetan philosophy.

The italicized parenthetical terms are Tibetan unless otherwise noted, and they are transliterated using the Wylie system.  They are not essential for understanding the ideas of the article; they are provided to avoid the confusion caused by different writers using different English glosses.

Table of Contents

  1. Introduction
    1. The Tibetan Cultural Sphere
      1. Language and Geography
      2. Religions
    2. Philosophy
      1. Religion and Philosophy
      2. Tibetan Debate
  2. Metaphysics and Epistemology
    1. Mādhyamaka and Yogācāra
    2. The Doctrine of the Two Truths
    3. Contemplative Practices
  3. Ethics and Moral Psychology
    1. Mahāyāna Buddhist Ethics
      1. The Bodhisattva Ideal
      2. Mismatched Categories
    2. Tibetan Emphases and Innovations
      1. Elegant Sayings
      2. The Stages of the Path
      3. Mind Training
  4. References and Further Reading

1. Introduction

a. The Tibetan Cultural Sphere

i. Language and Geography

The term “Tibetan” refers to a cultural sphere that includes not only the present-day Tibetan Autonomous Region, but also parts of the Sichuan, Yunnan, Gansu, and Qinghai provinces of the People’s Republic of China, as well as areas of Nepal, Bhutan, and northern India.  Though the spoken varieties of Tibetan in these areas are quite diverse (and often mutually unintelligible), the regions share a common written heritage of literature, poetry, song, and philosophical texts.  Within that shared heritage, Tibetan philosophy remains very much a living tradition with a variety of philosophical views and topical emphases.

ii. Religions

Buddhism has had a profound influence on Tibetan thought and culture.  It began to gain ground in Tibet after being favored by King Songtsän Gampo around 641 CE.  There is also, however, an indigenous Tibetan religion known as Bön (bon). Despite a somewhat competitive history, Bön and Buddhism have influenced each other greatly, making it difficult to draw a clear distinction between the two.

Today there are four main sects of Tibetan Buddhism.  The differences between the sects are not always purely philosophical; they often involve which practices, lineage masters, and texts the sects emphasize, as well as which translations they use.  The four major sects are:

  1. Nyingma (rnying ma) “Ancient”
  2. Sakya (sa skya) “White Earth”
  3. Kagyu (bka’ brgyud) “Oral Transmission”
  4. Gelug (dge lugs) “Way of Virtue”

The Gelug, the sect of the Dalai Lamas, came to hold the majority of political power from the seventeenth century onward.  Since the late nineteenth century a non-sectarian movement (ris med), encouraged in recent times by the current Dalai Lama, has become popular, fostering a more open approach between the sects and a mixing of practices.

The texts of the Tibetan Buddhist Canon are divided into two sections: the “Translated Words,” or Kangyur (bka’ ‘gyur), which are texts said to be the teachings of the Buddha, and the “Translated Teachings,” or Tengyur (bstan ‘gyur), which are treatises and commentaries written by Indian and Tibetan authors.

b. Philosophy

i. Religion and Philosophy

Unlike in Western philosophy since the Enlightenment, there is no rigid separation between religion and philosophy in Tibetan thought.  This does not mean that Tibetan philosophy is essentially non-rational or superstitious in nature, nor should it preclude philosophical interest, any more than the references to Apollo in Plato or to God in Descartes prevent philosophers from finding interesting philosophical theses in their works.  However, this lack of separation between the religious and the philosophical does mean that a modern reader must keep in mind that Tibetan thinkers are likely to have aims and motives outside those usually found in Western philosophy.

Being overwhelmingly Buddhist in nature, Tibetan philosophy has a soteriological aim; one engages in philosophical investigation not only to gain an understanding of the world, but so that such an understanding can aid in eliminating suffering. For Buddhists, all human suffering arises from misunderstanding the nature of the world; through study and philosophical reflection one can come to a better grasp of the nature of reality, particularly of suffering and its causes.  With this understanding, one can avoid much suffering by beginning to act, and to cultivate dispositions, in accord with reality.  Modern philosophical theorizing in the West is commonly thought to aim at discovering the nature of reality or the best way to live.  However, such theorizing does not often include the aim of integrating that view of reality into everyday action or of cultivating one’s own dispositions so as to actually live in the best way possible.  For Tibetans, and for the Buddhist tradition more generally, since the goal of philosophical investigation is to produce a practical result, one deals not only with questions like “What is the best way to act?” but also “How can I come to act that way?”

ii. Tibetan Debate

The distinctive form of Tibetan debate (rtsod pa) plays an important part in philosophical investigation in Tibetan intellectual communities.  It is central in the Gelug sect, in particular for those earning their kenpo (mkhan po) degrees, though it is also practiced in other sects to varying extents.  The practice involves a seated defender (dam bca’ ba) and a standing challenger (rigs lam pa).  The roles are quite different: the defender must assert a thesis and attempt to defend its truth.  The challenger, by contrast, asks questions in an attempt to get the defender to accept statements that are contradictory (for example, both “all colors are white” and “there is a color that is red”) or absurd (for example, “the color of a white religious conch shell is red”).  The challenger is not held responsible for the truth of what the questions suggest; like someone raising an objection at a lecture, the challenger does not have to assert any thesis, but only aims to show that the defender is mistaken.

The debate begins with the challenger invoking Mañjuśrī, the bodhisattva of wisdom.  This invocation is variously interpreted, but can be seen most generally as a reminder to the debaters that they are aiming at wisdom, at finding out the truth about the subject.  The challenger then sets the topic of debate by asking a question, to which the defender replies, revealing his thesis. The challenger may ask further questions to clarify the defender’s thesis or to establish common assumptions, or may simply begin the debate.  During the debate, the challenger raises questions of a particular form; a complete question is one that contains a subject, a predicate, and a reason.  For example, the question “(Do you agree that) the subject, Socrates, is mortal because of being a man (?)” ascribes a predicate (being mortal) to the subject (Socrates) in virtue of a reason (being a man).  When an element is omitted or ambiguous, the defender may ask for clarification, but upon receiving a complete question, the defender has three possible replies:

  1. “I accept” (’dod)
  2. “The reason is not established” (rtags ma grub)
  3. “It does not pervade” (ma khyab)

If the defender thinks that the proposed relationship between the subject, predicate, and reason holds, then she responds with “I accept.”  When the reason does not apply to the subject, the defender asserts that the reason is not established. For example, “Socrates is mortal because of being an elephant” would warrant this reply because the reason, being an elephant, does not apply to the subject, Socrates.  The denial of pervasion, a Tibetan innovation not found in the earlier Indian Buddhist debate system of Dharmakīrti, claims that the reason does not entail the predicate.  There are two kinds of failure of pervasion: one of uncertainty (ma nges pa) and one of contradiction or exclusion (’gal pa).  “Socrates is a philosopher because of being a man” is uncertain because some but not all men are philosophers; the reason, being a man, does not entail the predicate, being a philosopher. “Socrates is a reptile because of being a man” is contradictory because the terms “man” and “reptile” are exclusive; there are no men that are reptiles.
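
The relationships at work in these three replies can be sketched in code.  The following is a minimal, hypothetical illustration in Python (the helper names and the toy facts are invented for the sketch, not drawn from any debate manual): replying that “the reason is not established” corresponds to the reason failing to apply to the subject, while denying pervasion corresponds to the reason failing to entail the predicate.

    # A minimal sketch of the three debate replies, using invented toy data.
    # A complete question has the form: subject S is P because of being R.

    def reply(subject, predicate, reason, applies, entails):
        """Classify the defender's reply to a complete question."""
        if not applies(subject, reason):
            return "The reason is not established"   # rtags ma grub
        if not entails(reason, predicate):
            return "It does not pervade"             # ma khyab
        return "I accept"                            # 'dod

    # Toy facts for illustration only.
    FACTS = {("Socrates", "man"): True, ("Socrates", "elephant"): False}
    ENTAILMENTS = {("man", "mortal"): True, ("man", "philosopher"): False,
                   ("man", "reptile"): False}

    applies = lambda s, r: FACTS.get((s, r), False)
    entails = lambda r, p: ENTAILMENTS.get((r, p), False)

    print(reply("Socrates", "mortal", "man", applies, entails))       # I accept
    print(reply("Socrates", "mortal", "elephant", applies, entails))  # not established
    print(reply("Socrates", "philosopher", "man", applies, entails))  # does not pervade

Note that this sketch does not distinguish the two kinds of pervasion failure (uncertainty and exclusion); it captures only the coarser three-way classification described above.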

2. Metaphysics and Epistemology

a. Mādhyamaka and Yogācāra

Metaphysics and epistemology in Tibet are deeply rooted in Indian Mahāyāna Buddhist philosophy.  A focal question is what, if anything, has an intrinsic, unchanging essence or nature (Sanskrit: svabhāva).  One may ask of a chair, or of one’s self, whether there is some intrinsic chair-ness or self-ness to be found. The two major schools that came to Tibet from Indian Mahāyāna Buddhism, Yogācāra (the “Mind Only” school) and Mādhyamaka (the “Middle Way” school), provide somewhat different answers to this question.

The Yogācāra school, associated with Vasubandhu and his half-brother Asaṅga, replies that awareness or consciousness is the only thing with an intrinsic essence.  The general idea is that while what we perceive as reality might not have an intrinsic nature, the awareness that we have of the flow of such perceptions does have such a nature.  This school is sometimes compared with German Idealism in the West.

The Mādhyamaka school, founded by Nāgārjuna, denies that anything has an unchanging essence; this is known as the Doctrine of Emptiness (Sanskrit: śūnyatā).  To say that all phenomena are empty is to say that they are empty of a stable and unconditioned essence — tables have no intrinsic table-ness and selves have no intrinsic self-ness. This may sound extreme, but Mādhyamaka sees itself as being a middle way between the extremes of positing an entity with an eternal essence and the nihilistic denial of any existence at all (to say a chair lacks an unchanging essence is not to say that it does not exist at all).  Though the Mādhyamaka view, championed by the Gelug sect, is often seen in Tibet as the higher teaching, both Yogācāra and Mādhyamaka ideas are present.

Within the Mādhyamaka school there is a distinction over the proper method of discourse with non-Mādhyamaka philosophers, specifically whether or not it is appropriate to make positive assertions in debate. The Svātantrika view, associated with Bhāvaviveka, permits the use of assertions and independent syllogisms in debate. However, the Prāsaṅgika view, attributed to Chandrakīrti and Buddhapālita, permits only the use of logical consequences, a kind of negative method of reductio ad absurdum to establish the Mādhyamaka view in debate.  Anything else, they contend, would give the impression that they accept the unconditioned essence of any of the topics under debate.  This method has been compared with that of Wittgenstein (at least the Wittgenstein of the Tractatus Logico-Philosophicus) and the Skeptics of ancient Greece.

It is important to note that this distinction is an indigenous Tibetan one; there is no evidence of either of the terms being used by Indian Mādhyamaka philosophers. The Sanskrit terms Prāsaṅgika and Svātantrika are inventions of Western scholars to translate the Tibetan terms rang rgyud pa (the Autonomists or Svātantrika) and thal ’gyur pa (the Consequentialists or Prāsaṅgika).  Through the influence of the immensely important Gelug thinker Tsongkhapa, the Prāsaṅgika became the more influential view in Tibet. A clear and accessible entry point to these issues can be found in Jamgön Mipham’s Introduction to the Middle Way.

b. The Doctrine of the Two Truths

A seminal concept in Mādhyamaka thought, and in Mahāyāna Buddhism generally, is the idea that there are two truths: a conventional or nominal truth (Sanskrit: saṃvṛti-satya) and an ultimate truth (Sanskrit: paramārtha-satya).  The idea is similar to Berkeley’s dictum that we think with the learned, but speak with the vulgar; we can accept certain conventions without thinking them to be ultimately real.  The notion can be understood epistemically or metaphysically; the term rendered here as “truth” (Sanskrit: satya, Tibetan: bden pa) can mean “true” in the sense of a true proposition but also “real” in the sense of something actually existing in the way that it appears. Suppose one were to stumble upon a friend watching a Felix the Cat cartoon and ask him what is happening. The friend is likely to reply with something like, “Felix just got hit on the head with a hammer and he’s mad.” The reply is conventionally true; the question was asked from within a system of conventions, one that assumes there are entities called characters that can perform actions and feel emotions, and the reply is true within those conventions. When pressed, both may well admit that the ultimate truth is quite different; in fact there is no Felix, simply a series of lines organized in a certain way so as to create drawings that bear a resemblance to a cat, which, when shown in rapid succession, create the visual illusion of actions, events, and emotions. This is the ultimate truth about what is really happening, but to reply in this way would be both impolite and pragmatically unhelpful. The view has some affinities with fictionalism in Western philosophy in that both acknowledge some value in claims that are metaphysically ungrounded.

For the Mādhyamaka philosopher, talk of physical objects, persons, causes, and all other phenomena is true only in the conventional sense. One issue of debate in Tibet has been the relationship between the Two Truths. A radical view, advanced in the fourteenth century by Dolpopa, holds that the Two Truths are completely separate, a doctrine called Emptiness of Other (gzhan stong): the idea that emptiness itself has a stable and unchanging nature.  The prevailing view, advocated by Tsongkhapa and the Gelug tradition, proposes a deep unity between the two truths. This view holds the distinction between conventional and ultimate reality to be itself merely conventional, a doctrine called Emptiness of Self (rang stong). On this view, the property of lacking an essential nature is nothing more than a conventional designation (for more on this, see Kapstein 2001, pp. 221ff). The idea that emptiness itself is not an ultimately real property, the emptiness of emptiness, is taken to be paradoxical to varying degrees (see Garfield 1995, pp. 319-321 and Hayes 1994) and is said to be one of the most difficult and subtle points in Mādhyamaka philosophy.

The Two Truths are especially important when one keeps in mind the soteriological aim of Buddhist philosophy; the doctrine allows a place for teachings that are not, strictly speaking, true but that benefit the student. The language used in Tibetan to translate “conventional truth” reflects this; the most common terms, both translated into English as “conventional,” are tanyé (tha snyad) and kundzob (kun rdzob). The former means simply a mental label for something, a conventional sign for communication, while the latter, kundzob, means something that obscures, hides, or fakes. The distinction suggests two sorts of conventional truth: those that obscure the ultimate truth and those that do not. This finds support in common sense, as some false speech is used to obscure reality, as in that of political spin doctors, while other false speech is used to illuminate a truth about reality, such as telling a fictional story to teach a truth about human psychology. This distinction is explained in greater detail in Garfield (2002), pp. 60ff, where he notes that emptiness itself is conventional in the illuminating tanyé sense, but not in the concealing kundzob sense.

c. Contemplative Practices

There are also meditative practices that allow the meditator to experience the emptiness of phenomena in a more direct way. One tradition, associated with the Kagyu sect, is known in Sanskrit as Mahāmudrā (Tibetan: phyag rgya chen po), meaning “The Great Seal.” Another tradition, known as Dzogchen (rdzogs chen) or “The Great Perfection,” has its roots in the Bön and Nyingma traditions.  These practices tend to emphasize first-hand experience and the relationship with a qualified teacher.

The core of these practices involves close observation of the mind at rest and during the arising and passing of thoughts and emotions. Through this kind of meditation one comes to see one’s own true nature (ngo rang) and to experience emptiness directly. These meditations are often described with language suggesting spontaneity, immediacy, and ineffability: a non-conceptual and non-dualistic awareness of reality, which is taken to be in some sense perfect as it is. To many, these features evoke affinities with mysticism that place such practices outside the purview of modern Analytic philosophy, though epistemological issues like introspection, phenomenology, and the limits of language remain relevant.

3. Ethics and Moral Psychology

The ethics of Tibetan philosophy is inextricably linked to Buddhist ethics, in particular the ideas of Mahāyāna Buddhism.  The Mahāyāna Buddhist tradition is far too varied and vast to be adequately covered here, so what follows is a small sampling of some of the issues that have received a good deal of attention in Tibetan thought and some of the indigenous Tibetan innovations.

a. Mahāyāna Buddhist Ethics

i. The Bodhisattva Ideal

A concept central to the distinction between Mahāyāna (“The Greater Vehicle”) and the earlier Theravāda (“The School of the Elders”) Buddhism is that of the Bodhisattva. The term “bodhisattva” (literally “enlightenment-being”) in the older Pāli literature is used to describe the Buddha before he became enlightened. There is a collection of stories of the Buddha’s previous lives, known as the Jātaka Tales, which describe how the Buddha of our time behaved in his previous lives as an animal, human, or other creature. The tales teach a moral by describing the selfless and virtuous actions of the Buddha-to-be, and in these tales he is called a bodhisattva. The ideal in Theravāda Buddhism, however, is one who has awakened and escaped suffering: a Buddha.

In Mahāyāna Buddhism the Bodhisattva began to take on a more central role as a spiritual and ethical ideal. Bodhisattvas, rather than becoming enlightened and escaping the sufferings of this world, choose to forgo their own enlightenment and remain in this world in order to relieve the suffering of others. Though the idea is rooted in earlier Indian thought, such as Śāntideva’s classic Way of the Bodhisattva (Sanskrit: Bodhicaryāvatāra), the emphasis on the Bodhisattva figure and the ideal of selfless compassion are central to ethics in Tibet as well. Scores of texts composed in Tibetan praise Bodhisattvas and their motivation (Sanskrit: bodhicitta), from Thogmé Zangpo’s Thirty-Seven Practices of Bodhisattvas (Tibetan: rgyal sras lag len so bdun ma) to the more recent Vast as the Heavens, Deep as the Sea (Tibetan: byang chub sems kyi bstod pa rin chen sgron ma) by Khunu Rinpoche.

ii. Mismatched Categories

Modern scholars disagree about the most accurate way to view Buddhist ethics in terms of the standard Western ethical categories. Buddhist ethics seems to have affinities with all of the major ethical theories in the West. Its emphasis on the elimination of suffering is similar to Utilitarian theories like that of Jeremy Bentham, its emphasis on a universal outlook is similar to Kantian claims about the categorical imperative, and its Bodhisattva ideal seems similar to the sort of ideal agent imagined in Virtue Ethics.

Naturally, there are problems with each interpretation.  It is not clear that the Utilitarian framework can account for the intrinsic value given to certain motivations and to skillful actions; for example, one might think that skillful actions (Sanskrit: kuśala) lead to the elimination of suffering because they are right, rather than being right because they lead to the elimination of suffering. It is also not clear that a Kantian framework can accommodate the central role of compassion and sympathy; moreover, given the importance that Buddhist ethics places on the consequences of actions, the Kantian framework seems ill-fitting.

The view championed by Damien Keown characterizes Buddhist ethics in terms of Aristotelian virtue ethics. For Aristotle, one develops certain character traits so that one may achieve flourishing (Greek: eudaimonia).  Similarly, argues Keown, the bodhisattva develops certain traits with the goal of achieving freedom from suffering (Sanskrit: nirvāṇa). On this reading, flourishing and freedom from suffering each function as the goal for the sake of which good traits are cultivated. Many scholars, however, famously Peter Harvey, claim that Buddhist ethics cannot be placed entirely in any single Western category.  Instead, they see Buddhist ethics as best understood as having similarities with each, though not falling exclusively into any particular one.

b. Tibetan Emphases and Innovations

i. Elegant Sayings

A popular genre of ethical advice in Tibet is that of Legshé (legs bshad) or “Elegant Sayings.” These are related to the Indian subhāṣita format and are unusually secular in content for Tibetan literature. They are in verse form, usually in four-line stanzas of seven syllables per line. Commonly studied in schools and memorized, these texts are very popular among Tibetans and often familiar to non-scholars.

The most popular of these texts, The Elegant Sayings of Sakya Paṇḍita (sa skya legs bshad), was composed by Sakya Paṇḍita, an important figure in the Sakya sect, around the thirteenth century. The content often concerns the traits and conduct of wise (mkhas pa), noble (ya rabs), and foolish (blun po) people, along with other advice regarding common human problems and tendencies. The advice is often juxtaposed with a metaphor or a similar case from everyday life.  For example, regarding determining who is wise, Sakya Paṇḍita writes:

Without questioning a wise person,
One cannot measure their depth.
Without striking a drum with a stick,
One cannot distinguish it from other drums.

Important topics include the best attitude towards achievement and failure, praise and blame, wealth, anger, and work, among others. Sakya Paṇḍita’s text inspired many similar works, most popularly Virtuous Good Advice (dge ldan legs bshad) by Panchen Sonam Drakpa, which is quite similar to Sakya Paṇḍita’s text, and A Treatise on Water and Wood (chu shing bstan bcos) by Gung Thang Tenpé Dronmé, which uses only forest and water imagery.  A more detailed introduction to Legshé literature, along with a translation of Sakya Paṇḍita’s text, can be found in John Davenport’s Ordinary Wisdom.

ii. The Stages of the Path

A conceptual frame that became important in Tibet is the idea of stages on the path to enlightenment (lam rim). Its roots are in the Indian Buddhist idea of the Bodhisattva Stages (Sanskrit: bodhisattva-bhūmi), though the notion took hold in Tibet through the Bengali monk Atiśa, who was invited there to clarify the teachings early in the eleventh century. In his Lamp for the Path to Enlightenment (byang chub lam gyi sgron ma), Atiśa distinguishes three kinds of persons or abilities (skyes bu gsum):

  1. Person of Small Ability (skyes bu chung ba)
  2. Person of Intermediate Ability (skyes bu ’bring ba)
  3. Person of Great Ability (skyes bu chen po)

Those of Small Ability can seek only worldly pleasures and are concerned with their own happiness and their future well-being. Those of Intermediate Ability are able to reject worldly pleasures, but seek to end only their own suffering.  Those of Great Ability take on suffering in order to end the suffering of others.  This division can be understood as applying to the particular situation in Tibet, in which mass monasticism and more esoteric forms of Buddhism could both be found. The teaching of the three kinds of abilities can be understood as a schema for determining whether or not a monk is ready for certain higher teachings and practices. The threefold division can also be understood in a wider sense, as applying to people in general and to how their abilities should be gauged.

Aside from the obvious emphasis on altruism, the doctrine exemplifies what Harvey (2000, p. 51) terms gradualism. For many ethical systems in the West, normative prescriptions apply to everyone (or perhaps to everyone who can grasp them, regardless of ethical development). In many forms of Buddhist ethics, though some prescriptions, like refraining from taking life, apply to everyone, others apply only to those with a certain depth of moral or spiritual understanding. Harvey notes that while lay practitioners usually follow five precepts, an ordained monk is subject to two hundred or more. Similarly, different teachings, practices, and requirements are suitable for the three kinds of abilities. Those of Small Ability might benefit most from reflecting on the impermanence of worldly pleasures and the inevitability of death, while the kind of altruism and patience that those of higher stages develop is out of their reach, and demanding it of them could prove detrimental. Atiśa notes that just as birds with undeveloped wings cannot fly, people with undeveloped understanding cannot help others in certain ways. The implication seems to be that just as we cannot demand of baby birds that they fly, we can nonetheless encourage them to act in ways that nurture the development of their wings.

iii. Mind Training

An area developed extensively in Tibet is that of Lojong (blo sbyong) or Mind Training. Recall that, because of the soteriological aspect of Tibetan ethics, the aim is not solely to give an account of what the right actions and attitudes are, but to come to manifest those attitudes and actually act in that way. Lojong is a type of meditative practice that aims at helping the practitioner to generate compassion and to lessen attachment to external factors like praise and popular opinion.

One kind of Lojong, often associated with Śāntideva, is the practice of Exchanging Self and Other (bdag gzhan mnyam brje). In this practice the meditator imagines himself to be another person, often a sequence of people who are beneath, equal to, and then superior to the practitioner in some respect.  By doing this, the practitioner can come to realize that the other person is the same as himself in wishing to be happy and to avoid suffering. After some practice, it becomes easier to overcome obstacles (both petty and serious) to treating others in a compassionate way.

Another kind of Lojong practice, often attributed to Atiśa but popularized by Chekawa Yeshe Dorje, is that of Giving and Taking (gtong len). In this practice one imagines taking in the suffering of others and giving them happiness in return. This often takes the form of visualizing that with each breath one inhales the suffering of others as thick black smoke and exhales happiness to them in the form of white light.

A general feature of Lojong is the development of an ability to take negative circumstances, like being surrounded by suffering or anger, and transform them into positive attitudes and actions. Two foundational texts in this regard are Eight Verses for Training the Mind (blo sbyong tshig brgyad ma) by Geshé Langri Tangpa and The Seven-Point Mind Training (blo sbyong don bdun ma) by Chekawa Yeshé Dorjé.

4. References and Further Reading

  • Clayton, Barbra. 2006. Moral Theory in Śāntideva’s Śikṣāsamuccaya. New York: Routledge.
    • Though primarily a discussion of Śāntideva’s lesser-known work, it offers a good overview of his life and works as well as an informed discussion of how to consider Buddhist ethics in Western categories.
  • Dreyfus, Georges J. B. 2003. The Sound of Two Hands Clapping: The Education of a Tibetan Buddhist Monk. Berkeley: University of California Press.
    • This first-hand account of Tibetan monastic life offers a realistic picture of the actual practices as well as excellent information on Tibetan debate.
  • Garfield, Jay. 2002. Empty Words. New York: Oxford University Press.
    • An insightful collection of essays on a variety of topics in Buddhist Philosophy which focuses on Tibetan Buddhism and Analytic Philosophy.
  • Garfield, Jay. trans. 1995. The Fundamental Wisdom of the Middle Way: Nāgārjuna’s Mūlamadhyamakakārikā. New York: Oxford University Press.
    • A translation from the Tibetan text of Nāgārjuna’s most famous philosophical work.  Garfield also provides very clear and philosophically informed commentary.
  • Harvey, Peter. 2000. An Introduction to Buddhist Ethics. Cambridge: Cambridge University Press.
    • A very clear introduction to Buddhist ethics with an emphasis on normative questions.
  • Hayes, Richard. 1994. “Nāgārjuna’s Appeal” in The Journal of Indian Philosophy Vol. 22, pp.299-378.
    • A classic paper that argues that Nāgārjuna’s arguments essentially rely on the fallacy of equivocation over the term Svabhāva.
  • Kapstein, Matthew. 2001. Reason’s Traces. Boston: Wisdom Publications.
    • A philosophically informed discussion of personal identity, metaphysics, and epistemology in Indian and Tibetan Buddhism.
  • Keown, Damien. 1992. The Nature of Buddhist Ethics. New York: St. Martin’s Press.
    • A very interesting philosophical discussion of Buddhist ethics, offering an interpretation of Buddhist ethics that emphasizes the similarity to Aristotelian virtue ethics.
  • Khunu Rinpoche. Gareth Sparham, trans. 1999. Vast as the Heavens Deep as the Sea. Boston: Wisdom Publications.
    • A recent text in verse form praising bodhicitta, the aspiration for enlightenment.
  • Mipham, Jamgön and Chandrakirti. Padmakara Translation Group trans. 2002. Introduction to the Middle Way. Boston: Shambhala Press.
    • As a translation of Chandrakīrti’s Madhyamakāvatāra with commentary by Mipham Jamgön, it is an important primary text.  Its introduction provides a very clear and understandable way into Mādhyamaka philosophy.
  • Patrul Rinpoche. 1998. Words of My Perfect Teacher. Boston: Shambhala Press.
    • A very popular practical guide and explanation of the Tibetan Buddhist spiritual path.
  • Perdue, Daniel. 1992. Debate in Tibetan Buddhism. Ithaca: Snow Lion Press.
    • An extensive translation and explanation of an introductory Tibetan debate manual.
  • Rossi, Donatella. 1999. The Philosophical View of the Great Perfection in the Tibetan Bon Religion. Ithaca: Snow Lion Press.
    • An overview of Dzog Chen in the Bön and Nyingma traditions; includes translations along with the original Tibetan.
  • Sakya Pandita. John Davenport trans. 2000. Ordinary Wisdom. Boston: Wisdom Publications.
    • A translation and explanation of the most famous of the Legs Bshad texts.
  • Sonam Rinchen and Ruth Sonam. 1997. The Thirty-Seven Practices of Bodhisattvas. Ithaca: Snow Lion Press.
  • Sonam Rinchen and Ruth Sonam. 1997. Atisha’s Lamp for the Path to Enlightenment. Ithaca: Snow Lion Press.
  • Sonam Rinchen and Ruth Sonam. 2001. Eight Verses for Training the Mind. Ithaca: Snow Lion Press.
    • These editions are translations by Ruth Sonam with explanations by Geshe Sonam Rinchen.  They all include the original Tibetan and offer clear background for understanding the root texts.
  • Sparham, Gareth. 1993. Ocean of Eloquence. New York: SUNY Press.
    • A translation of Tsong Kha Pa’s commentary on the Yogācāra Doctrine of Mind.  An example of Yogācāra study and practice in Tibet.
  • Thupten Jinpa, ed. 2006. Mind Training: The Great Collection. Boston: Wisdom Publications.
    • An excellent collection of the Lojong or “Mind Training” literature with commentaries.
  • Thurman, Robert. 1991. The Central Philosophy of Tibet: A Study and Translation of Jey Tsong Khapa’s Essence of True Eloquence. Princeton: Princeton University Press.
    • A long introduction gives a detailed overview of Tibetan philosophy followed by a translation of an important text on Mādhyamaka by Tsong Kha Pa.
  • Wayman, Alex. 1991. Ethics of Tibet. New York: SUNY Press.
    • A translation and explanation of the Bodhisattva section of Tsong Kha Pa’s Lamrim Chenmo.  Offers an overview of the stages of the bodhisattva path.

Author Information

Nicolas Bommarito
Email:  npbommar@buffalo.edu
University at Buffalo
U. S. A.

Divine Immutability

Divine immutability, the claim that God is immutable, is a central part of traditional Christianity, though it has come under sustained attack in the last two hundred years.  This article first catalogues the historical precedent for and against this claim, then discusses different answers to the question, “What is it to be immutable?”   Two definitions of divine immutability receive careful attention.  The first is that for God to be immutable is for God to have a constant character and to be faithful in divine promises; this is a definition of “weak immutability.”  The second, “strong immutability,” is that for God to be immutable is for God to be wholly unchanging. After showing some implications of the definitions, the article focuses on strong immutability and provides some common arguments against the claim that God is immutable, understood in that way.  While most of the historical evidence discussed in this article is from Christian sources, the core discussion of what it is to be strongly immutable, and the arguments against it, are not particular to Christianity.

Table of Contents

  1. Some Historical Evidence for Divine Immutability
    1. Biblical Evidence for and against Divine Immutability
    2. Conciliar Evidence for Divine Immutability
    3. The Protestant Reformers and Divine Immutability
    4. Divine Immutability and Traditional Christianity
  2. What It Is To Be Immutable
    1. Immutability as Constancy of Character
    2. Strong Immutability—God Does Not Change in Any Way
  3. Objections to Strong Immutability
    1. God’s Knowledge of Temporally Indexed Truths, Omniscience and Immutability
    2. Immutability and Modal Collapse
    3. Responsiveness and an Immutable God
    4. Personhood and Immutability
    5. Immutability, Time, and Freedom
  4. Related Issues
    1. Divine Timelessness or Eternality
    2. Divine Impassibility
    3. The Incarnation
    4. Intrinsic/Extrinsic Properties
  5. References and Further Reading

1. Some Historical Evidence for Divine Immutability

Divine immutability is a central aspect of the traditional Christian doctrine of God, as this section will argue. For more detail on this point, see Dorner (1994) chapter 2 and Weinandy (1985).

a. Biblical Evidence for and against Divine Immutability

There are many biblical passages commonly cited as evidence either for or against the doctrine of divine immutability. This short section discusses just a few, with the aim of showing that the Bible is not explicitly clear one way or the other on the question of whether God is immutable. (See Gavrilyuk (2004), p 37-46, for a discussion of these passages and others.) Whichever view one takes on immutability, there are difficult passages for which one has to account.

In some places the Bible appears to speak in favor of divine mutability. For instance, consider these two passages:

Did Hezekiah king of Judah or anyone else in Judah put [Micah] to death? Did not Hezekiah fear the LORD and seek his favor? And did not the LORD relent, so that he did not bring the disaster he pronounced against them? (Jeremiah 26:19. This and all subsequent quotations from the Bible are taken from the New International Version).

In this first example we see the Lord relenting, not doing what he had said he would do.  That appears to be a case of changing from one course or plan of action to another.  Such change seems even clearer in the following case, where God, in response to a sin of David, sends an angel to destroy Jerusalem and then, grieving the destruction, calls off the angel.

And God sent an angel to destroy Jerusalem. But as the angel was doing so, the LORD saw it and was grieved because of the calamity and said to the angel who was destroying the people, “Enough! Withdraw your hand” (1 Chronicles 21:15).

In this example, God puts a particular plan of action into effect and then, it appears, grieves his decision and reverses it.  God does so as a result of the calamity the angel was causing in destroying the people. God responds to his creation here, and relents.  Both of these texts, and others like them, seem to indicate that God changes, at least in changing his mind and his commands. Other relevant biblical passages include, but are not limited to, Exodus 32:14 and Amos 7:1-3.

If all the evidence from the Bible were against immutability, one might think that the case against divine immutability, at least for the Christian and the Jew, would be closed.  However, the Bible also seems to teach that God does not change his mind.  For instance:

God is not a man, that he should lie, nor a son of man, that he should change his mind. Does he speak and then not act? Does he promise and not fulfill? (Numbers 23:19).

He who is the Glory of Israel does not lie or change his mind; for he is not a man, that he should change his mind (1 Samuel 15:29).

These two passages claim that God doesn’t change his mind and so are in tension with the previous two texts.  Beyond these two passages that claim that God does not change his mind, there are also passages where God is said not to change, for instance:

I the LORD do not change. So you, O descendants of Jacob, are not destroyed (Malachi 3:6).

Every good and perfect gift is from above, coming down from the Father of the heavenly lights, who does not change like shifting shadows (James 1:17).

Theologians and philosophers who wish to provide scriptural evidence for divine immutability have commonly cited these passages.

So the Biblical texts are either unclear as to whether God changes, or they are inconsistent.  If one wishes to maintain the consistency of scripture on the doctrine of God, one needs either to read the passages where God appears to change in light of the passages that claim he does not, or vice versa.  Either way, the Biblical evidence seems too weak to prove either divine immutability or its contrary.

b. Conciliar Evidence for Divine Immutability

While the biblical evidence seems to underdetermine whether divine immutability is true, the conciliar evidence favors the doctrine of divine immutability. While the later councils explicitly include immutability in their discussions of God’s nature, the earlier councils only discussed divine immutability in relation to the incarnation, the Christian teaching that the Second Person of the Trinity, the Son of God, became man.  This is because the incarnation seemed to require a change of some sort in God.  These early councils employed divine immutability to argue that there was no change in the Godhead when the Son became incarnate.

For instance, consider the conclusion to the creed of the first general council, Nicaea, in 325 (note that this is the end of the original creed, and not the more familiar Nicene-Constantinopolitan creed commonly employed in liturgies today):

And those who say “there once was when he was not”, and “before he was begotten he was not”, and that he came to be from things that were not, or from another hypostasis or substance, affirming that the Son of God is subject to change or alteration—these the catholic and apostolic church anathematizes (Tanner, 1990, p 5, emphasis  mine).

Here the council anathematizes those who claim that the Son of God is subject to change or alteration.  Some, particularly the Arians, were teaching that the Son was a creature and not the Creator.  This anathema is an attempt to rule out such a position by ruling out change in the Son, which only makes sense if God is changeless.  For how would anathematizing the view that the Son changes rule out the Son’s being a creature, unless being subject to change is incompatible with being God?  One should note, though, that even though the Arians taught that the Son was mutable, they did not deny the immutability of the Father, and in fact were attempting to safeguard the immutability of God in teaching that the Son was a creature (see Gavrilyuk (2004) p 105-7 and Weinandy (1985) p 5-20 for more on this).

Also, see the third letter of Cyril to Nestorius from the council of Ephesus, 431, which says, when speaking of Christ:

We do not say that his flesh was turned into the nature of the godhead or that the unspeakable Word of God was changed into the nature of the flesh. For he (the Word) is unalterable and absolutely unchangeable and remains always the same as the scriptures say (Tanner, 1990, p 51, the emphasis is mine.)

Here the council claims that the Word of God, the Second Person of the Trinity, is unalterable and absolutely unchangeable.  Notice, too, that the claim is made to defend against the unorthodox view that the two natures of Christ mixed in the incarnation.  So whatever immutability comes to, it must come to something that rules out the admixture of natures.

Thirdly, see the Letter of Cyril to John of Antioch about Peace, again from the council of Ephesus:

…God the Word, who came down from above and from heaven, “emptied himself, taking the form of a slave”, and was called son of man, though all the while he remained what he was, that is God (for he is unchangeable and immutable by nature)… (Tanner,1990, p 72, the emphasis is mine).

Here the council claims that God is unchangeable and immutable by nature.  Whereas the first two passages cited attribute immutability to the Son, this passage attributes it more generally to God.  But even still, it would be an odd Trinitarian theology that claimed the Son to be immutable but the other Persons to be mutable. Also of note is the letter of Pope Leo to Flavian, bishop of Constantinople, about Eutyches, read at the council of Chalcedon where Pope Leo writes of “the unalterable God, whose will is indistinguishable from his goodness” (Tanner, 1990, p 79).

The closer to the present one comes in western conciliar documents, the more explicitly and repeatedly one finds affirmation of divine immutability. For instance, see the fourth council of Constantinople (869-870), the eighth ecumenical council, by western reckoning, where the Fathers claim in their creedal statement:

We confess, indeed, God to be one…ever existing without beginning, and eternal, ever the same and like to himself, and suffering no change or alteration… (Tanner, 1990, p 161).

Notice that here the object said to be without change or alteration is explicitly God.  The first two conciliar statements cited claim that the Son is immutable, and the third quotation appears to claim that God, and not just the Son, is immutable; but here the object is clearly God.  Also, the creed from the Fourth Lateran council, which met in 1215, begins, “We firmly believe and simply confess that there is only one true God, eternal and immeasurable, almighty, unchangeable, incomprehensible and ineffable…” (Tanner, 1990, p 230); the council of Basel-Ferrara-Florence-Rome, which met from 1431-1445, “deliver[ing]…the following true and necessary doctrine…firmly professes and preaches one true God, almighty, immutable and eternal…” (Tanner, 1990, p 570); and the First Vatican council, which met from 1869-1870, “believes and acknowledges that there is one true and living God…he is one, singular, completely simple and unchangeable spiritual substance…” (Tanner, 1990, p 805).  Such texts show that the early church councils of undivided Christendom, as well as the later western councils of the Catholic Church, clearly teach that God is immutable.

c. The Protestant Reformers and Divine Immutability

It isn’t just early Christianity in general and Catholicism in particular that dogmatically affirms divine immutability.  One can find divine immutability in the confessions and canons of traditional Protestantism.  For instance, see the confession of faith from the French (or Gallican) Confession of 1559:

We believe and confess that there is but one God, who is one sole and simple essence, spiritual, eternal, invisible, immutable, infinite, incomprehensible, ineffable, omnipotent; who is all-wise all-good, all-just, and all-merciful (Schaff, 1877, p 359-360).

Also, see the Belgic Confession of 1561, Article 1:

We all believe with the heart, and confess with the mouth, that there is one only simple and spiritual Being, which we call God; and that he is eternal, incomprehensible invisible, immutable, infinite, almighty, perfectly wise, just, good, and the overflowing fountain of all good. (Schaff, 1877, p 383-384)

For a confessional Lutheran affirmation of divine immutability, see, for instance, “The Strong Declaration of The Formula of Concord,” XI.75, found in The Book of Concord:

And since our election to eternal life is founded not upon our godliness or virtue, but alone upon the merit of Christ and the gracious will of His Father, who cannot deny Himself, because He is unchangeable in will and essence…

In addition, see the first head, eleventh article of the canons of Dordt, from 1618-1619:

And as God himself is most wise, unchangeable, omniscient, and omnipotent, so the election made by him can neither be interrupted nor changed, recalled nor annulled; neither can the elect be cast away, nor their number diminished (Schaff, 1877, p 583).

And, finally, see the Westminster Confession of Faith from 1647:

There is but one only living and true God, who is infinite in being and perfection, ‘a most pure spirit, invisible, without body, parts, or passions, immutable, immense, eternal, incomprehensible, almighty, most wise, most holy… (Schaff, 1877, p 606).

These texts show that the dogmatic and confessional affirmations of divine immutability carry on into Protestantism.

d. Divine Immutability and Traditional Christianity

If one understands traditional Christianity either as the faith of the early, undivided Church or as the intersection of the great, historical confessional statements of Christendom, then one has strong reason to believe that traditional Christianity includes the claim that God is immutable.  Having reason to affirm that God is immutable, however, does not by itself give one reason to favor a particular definition of immutability.  The following section discusses the two leading rival theories of what it is for God to be immutable.

2. What It Is To Be Immutable

Even if it is clear that traditional Christianity includes the doctrine of divine immutability, what, precisely, that doctrine amounts to is not perspicuous.  There are many subtle and nuanced views of immutability—far too many to receive individual attention in this article.  This article focuses on the two most commonly discussed views of immutability.  One is that divine immutability merely guarantees that God’s character is unchanging, and that God will remain faithful to his promises and covenants.  This first view does not preclude other sorts of change in God.  Another, stronger, view of immutability is that the doctrine of divine immutability rules out all intrinsic change in God.  This latter understanding of immutability is the historically common view.

a. Immutability as Constancy of Character

Some thinkers see immutability as the claim that God’s character is constant.  For instance, see Richard Swinburne’s The Coherence of Theism, where he discusses both types of immutability under consideration in this section. There he sides with the constancy of character view, which he describes as follows: “[i]n the weaker way to say of a person that he is immutable is simply to say that he cannot change in character” (Swinburne, 1993, p 219).  Isaak Dorner’s view is that God is ethically immutable but that divine vitality requires divine change; see Dorner (1994), especially the helpful introduction by Williams, p 19-23, and Dorner’s third essay, “The Reconstruction of the Immutability Doctrine.”  For discussions of Dorner, see Richards (2003) p 198-199 and Williams (1986). This view of immutability understands divine immutability to be the claim that God is constant in his character and virtue; that God is not fickle; and that God will remain true to his promises.

Notice that if immutability is understood in this sense, the Bible passages cited in section 1 may be easier to reconcile than they are on strong immutability.  The passages where God relents are not passages that prove that God is inconstant in character.  It may well be God’s good character that causes him to relent.  Given the earlier circumstances, God formed one set of intentions due to his constantly good character.  When the circumstances changed, God formed a different set of intentions, again due to his constantly good character.  What changes in these passages is not God’s good character; it is the circumstances God is in when he forms his intentions. Where the Bible teaches that God is unchanging, it means, on this understanding of immutability, that God’s character will not change.  It does not mean the stronger claim that God will not change at all.

One more point in favor of this understanding of immutability is that, if it were true, other problems with divine immutability, problems discussed below in section 3, would no longer be problems.  For instance, there would be no problem explaining how an unchanging God has knowledge of changing truths (such as what time it is).  God’s knowledge could change, on this understanding of immutability, provided that such change in knowledge does not rule out constancy of character.

Another problem discussed in section 3 is that of the responsiveness of an immutable God.  Given weak immutability, divine immutability doesn’t necessitate divine unresponsiveness.  This is because God’s responding to prayers doesn’t require that his character change.  In fact, it could be exactly because his character does not change that he responds to prayers.  So responsiveness is not incompatible with this notion of immutability.  On the constancy of character understanding of immutability, not all change, and in particular, not change as a result of responding to prayer, is inconsistent with immutability.

Nevertheless, if this is the burden of divine immutability—that God’s character is constant—who would deny it (that is, what theist would deny it)?  Divine immutability is a modest thesis when understood as constancy of character.  But even if it is innocuous, and even if it has the above-mentioned positive features, it still has difficulties.  It still leaves a problem for biblical exegesis.  That’s because the first two passages discussed above in section 1 seem to show God changing his mind, whereas the second two seem to teach that God does not change his mind.  So while the fact that it provides some way to reconcile some of the biblical evidence is a point in favor of the constancy of character view, it still faces a difficulty in understanding the scriptures that seem to claim that God does not change his mind.

Moreover, divine immutability understood as involving only constancy of character seems in tension with the use that the early church councils made of the concept.  For instance, both quotations from the council of Ephesus claim that the Second Person of the Trinity did not change when assuming the human nature, and both point, as evidence, to the fact that he is unchangeable and immutable.  In fact, the second quotation from Ephesus has it that God is unchangeable and immutable by his very nature.  Immutability, however, would be no evidence for the claim that the Second Person of the Trinity did not change when assuming the human nature if all immutability amounts to is constancy of character.  How could the constancy of the Second Person’s character entail that he would not change when assuming the human nature?  What does that have to do with whether Christ’s “flesh was turned into the nature of the godhead or that the unspeakable Word of God was changed into the nature of the flesh”?  The change being ruled out at Ephesus is not moral change or change of character, but change of properties and change of nature.  So the early church councils did not have the constancy of character view in mind when they claimed that God is immutable.  If they had had such a view in mind, they would not have thought to point to divine immutability in support of the claim that Christ did not change in becoming incarnate.

In regard to the later church councils and confessional statements, they do not define the meaning of “immutability” when they assert it in their lists of divine attributes.  Again, however, one notices that they do not place the affirmation of divine immutability in a discussion of God’s character but in a discussion of God’s existence.  One finds immutability in a list of other nonmoral attributes, not subordinated to the affirmation that God is wholly good or holy.

For instance, the Fourth council of Constantinople teaches that God is immutable and unchangeable, and this not in relation to God’s character but in discussion of God’s very existence (“ever existing without beginning, and eternal, ever the same and like to himself, and suffering no change or alteration….”).  The claim of immutability isn’t made in relation to God’s moral character but in a list of affirmations concerning God’s mode of existence.

So, for the reasons given in the preceding paragraphs, divine immutability, taken in its traditional sense, should not be understood to mean merely constancy of character.  Surely constancy of character is a part of the concept.  But divine immutability must be more robust than that to do the work it has been tapped to do in traditional Christianity.

b. Strong Immutability—God Does Not Change in Any Way

A stronger understanding of divine immutability is that God is literally unable to change.  As Thomas Aquinas, a commonly cited proponent of this view, says: God is “altogether immutable…it is impossible for God to be in any way changeable” (Summa Theologiae, the First Part, Question nine, Article one, the response; the quotation is from the translation at newadvent.org). God doesn’t change by coming to be or ceasing to be; by gaining or losing qualities; by any quantitative growth or diminishment; by learning or forgetting anything; by starting or stopping willing what he wills; or in any other way that requires going from being one way to being another.

Whenever a proposition about God changes truth-value, the reason for the change, on this view of immutability, is not a change in God but a change in something else. (I speak here of a proposition changing its truth-value, though it is not essential for divine immutability that propositions can change truth-values.  If the reader holds a view on which propositions have their truth-values eternally, the reader may substitute his or her preferred paraphrase for apparent change in the truth-value of propositions.)  Father Jones is praising God, and so the proposition that God is being praised by Father Jones is true.  Later that same proposition is no longer true, but not because of any change in God.  It is no longer true because Father Jones stopped praising God, and not because God is in any way different than he was.  Likewise in other situations: God does not go from being one way to being another; rather, something else changes and on account of that a proposition about God changes its truth-value.

One may wonder about the viability of this account when it deals with events that clearly seem to involve God doing something.  For instance, God talked to Abraham at a certain time in history.  Consider the proposition: God is talking to Abraham.  That was true at one point (Hagar might have whispered it to Ishmael after the youth asked what his father was doing).  At other times, God is not talking to Abraham.  But isn’t the change here a change in what God is doing?  Doesn’t God go from talking to not talking to Abraham?  And if so, how does that fit with the claim made in the previous paragraph, that changes in propositions about God are due to changes in things besides God?

The defender of strong immutability will draw a distinction here between the actions of God and their effects.  God, on this view, is unchangingly performing his divine action or actions, but the effects come and go.  Compare: In one swift action I throw a barrel full of messages in bottles overboard in the middle of the Atlantic.  This action of mine has multiple effects: it causes waves and ripples as the bottles hit the water.  Later, it causes other effects as people read the messages I’ve sent.  I convey some information to those whom the bottles reach, but the action I performed to do so has long since ceased.  Depending on one’s view of divine simplicity and divine eternity, some aspects of this analogy will have to be changed.  But the point remains: one action can have multiple effects at multiple times.  God immutably acts to talk with Abraham, and either does so atemporally or, if God is inside of time, has always and will always so act.  The changing of the truth-value of the proposition that God is talking to Abraham is not due to God changing, on this theory, but due to the effects of God’s action coming and going.

Strong immutability has a few things going for it.  First, it is congruent with the final four passages of Scripture cited in section 1.  If God is strongly immutable, he cannot change his mind, and he also cannot change.  So these last four passages pose no problem on this understanding of immutability.

Also, this stronger notion of immutability does the work needed for the early councils, which point to immutability to show that the Second Person of the Trinity does not change when assuming the human nature.  The conciliar reference to divine immutability is understandable if immutability is understood as strong immutability, whereas it is not understandable if it is understood in the weaker constancy of character sense.

Finally, this strong understanding of divine immutability is very common in church history. Just like the constancy of character model of divine immutability, however, this understanding is not without its own problems.  First, it has to provide a way of understanding the first two scripture citations, as well as the many others where God appears to change. Furthermore, it has other difficulties, which are considered in the following section.

3. Objections to Strong Immutability

There are many objections to the strong view of divine immutability, some of which were discussed in the previous section, including changes which appear to be changes in God, but which, on this view, are parsed as changes in other things, such as the effects of the unchanging divine action.  This section discusses some other objections to strong immutability.

a. God’s Knowledge of Temporally Indexed Truths, Omniscience and Immutability

Here is a truth that I know:  that it is now 2:23pm.  That is something I couldn’t know a minute ago, and it is something that I won’t know in a minute.  At that time, I’ll know a different truth: that it is now 2:24pm.  Either God knows such temporally indexed truths—truths that include reference to particular times at which they are true—or not.  If God does not know such truths, then he is not omniscient, since there is something to be known—something a lowly creature like me does, in fact, know—of which God is ignorant.  Since very few theists, especially of a traditional stripe, are willing to give up divine omniscience, very few will be willing to claim that God is ignorant of temporally changing truths like truths about what time it is.

If God is omniscient, then God knows such temporally changing truths.  If God does know such temporally changing truths, then God changes, since God goes from knowing that it is now 2:23pm to knowing that it is now 2:24pm.  And worse, God changes with much more frequency, since there are more fine-grained truths to know about time than which minute it is (for instance, what second it is, what millisecond it is, and so on).  If God knows such truths at some times but not at others, God changes.  And if God changes, divine immutability is false.  So if God is omniscient, he is not immutable.  Therefore, God is either not immutable or not omniscient.  And since both attributes are explicitly affirmed by traditional Christianity (and other monotheisms), there is a problem here for the traditional proponent of divine immutability.  This argument was put forward forcefully by Norman Kretzmann in his article “Omniscience and Immutability” (1966).
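The argument can be laid out as a short numbered derivation.  What follows is only a rough regimentation for orientation, not Kretzmann’s own wording or numbering:

\begin{enumerate}
  \item God is omniscient. \hfill (assumption)
  \item An omniscient being knows, at each time $t$, the temporally indexed truth that it is now $t$. \hfill (assumption about omniscience)
  \item So at $t_1$ God knows that it is now $t_1$, and at a later time $t_2$ God knows that it is now $t_2$ and no longer knows that it is now $t_1$. \hfill (from 1 and 2)
  \item Whatever knows different things at different times goes from being one way to being another, and so changes. \hfill (assumption)
  \item Therefore God changes, and so is not immutable. \hfill (from 3 and 4)
\end{enumerate}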

There are a few common responses to this argument.  First, one can claim that in order to be omniscient, God needn’t know indexed truths as indexed truths.  Second, one might claim that knowledge is not an intrinsic state or property, and that God’s immutability extends only to God’s intrinsic properties.  Third, one might argue that God does not know in the same way that we know, and this problem arises only if God knows things by being acquainted with particular propositions, as we know things.  Fourth, one might respond by assuming God is atemporally eternal and distinguishing the present-tensed terms in the premises between the eternal and temporal present.

Consider the first response.  God needn’t know that now it is 2:23pm.  Rather he knows the same fact under a non-temporally-indexed description.  For instance, God knows that the expression of this proposition, that it is now 2:23pm, is simultaneous with a state that, by convention, we call 2:23pm.  Such knowledge of simultaneity doesn’t require a temporal indexing, and so doesn’t require change across time.  One may wonder here, though, whether indexicals can be eliminated from all indexed propositions without any change in the meaning of the propositions. (For more on whether knowledge of indexical propositions can be reduced to knowledge of nonindexed propositions, see John Perry (1979).)

The second response is put forward by Brian Leftow.  Leftow understands divine immutability as the doctrine that God undergoes no change of intrinsic properties.  Intrinsic properties are properties that involve only the bearer of that property, or, put another way, properties that a thing would have even if it were the only thing in existence, or, put yet another way, properties a thing would have that don’t require other things to have particular properties (Leftow, 2004). My shape is a property intrinsic to me, as is my being rational.  If you could quarantine me from the influence of everything else, I’d still have my bodily shape and my rationality.  My distance from the Eiffel Tower and my height relative to my little cousin, however, are extrinsic properties, since they require the existence of certain things and their having particular properties.  By changing something else and leaving me the same—let my cousin grow for a few more years—you can change my extrinsic properties.  But not so with my intrinsic properties. (This is a rough understanding of intrinsic properties, since if you quarantined me off from the influence of everything I wouldn’t have air to breathe, wouldn’t be under the influence of gravity, light, or anything else.  What it is to be intrinsic is notoriously difficult to define.  For more on intrinsic properties, see David Denby (2006).)

Is God’s knowledge intrinsic or extrinsic to God?  On this definition of intrinsic, God’s knowledge of creatures is extrinsic.  For instance, God’s being such that he knows that it is now 2:24pm entails that something else (for instance, the universe, or the present) has a property (for instance, to give some examples from Leftow (2008), being a certain age, or being a certain temporal distance from the first instant). Likewise for God’s knowledge of other changing facts; since God’s knowing that a is F, where a is not God, entails something about another being having a property—namely, it entails that a is F—such properties of God are extrinsic.  Hence God’s going from knowing that a is F to knowing that a is not F does not require an intrinsic change, and thus is not contrary to divine immutability.

This response faces a difficulty because even if God’s knowledge of other things is extrinsic, since it entails properties in things other than God, belief is not extrinsic.  My knowledge of who is in the adjoining office changes when people come and go, since knowledge entails truth, and the truth of who is there changes.  But my belief about who is there, having no necessary relation to truth, can remain constant even across change in truth-values.  This shows that even if knowledge is extrinsic, since it fluctuates with truth, belief is not, since beliefs can be as they are whether or not the world is as they present it.

So even if God’s knowledge of creatures is extrinsic, God’s beliefs concerning creatures are intrinsic, since they don’t require anything of creatures.  This suggests that the intrinsic/extrinsic distinction will not save strong immutability from an argument from changing truths based on beliefs rather than knowledge.  In response to an argument run from beliefs rather than knowledge, one might point out that God believes all and only what is true.  Thus God’s beliefs about creatures, and not merely his knowledge about them, will be extrinsic. This is because God believes something if and only if he knows it, and he knows it if and only if it is true: God’s belief that a is F entails, and is entailed by, that a is F.

A second difficulty with Leftow’s response is that knowing and believing seem to be quintessential intrinsic properties, which might lead one to reject this understanding of intrinsic properties.  A third problem is that this view, far from keeping God unchanging, instead has some of his properties changing every instant, since he extrinsically changes with every passing instant.  If change of a property entails change full stop, and it seems to, then God is continually changing on this view.  A fourth and final problem is that this answer is inconsistent with another traditional attribute of God—atemporality.  An atemporal God cannot change at all, since change requires time.  So even if this response can answer the other problems, the proponent of divine eternality, and this includes Leftow, will not be able to embrace this response.

Tom Sullivan champions the third response. He argues that the problem arises due to a misunderstanding of how God knows.  We know by being properly related to certain thoughts or propositions.  So when the time changes, the proposition or thought we need to be related to in order to know the truth changes.  But if God does not know by being related to propositions, but in some other sui generis way that doesn’t require change in relation to propositions, then the problem may be defused (Sullivan, 1991).

This is a negative response, since it only says we don’t know as God knows, and doesn’t spell out the mode of knowing that God has.  And this counts against the response, since it doesn’t give us a way of understanding how God knows.  Because it is undeveloped, it is hard to assess its merits.  Nevertheless, if it is true that God knows in a way unique to him, then that way may help solve the problem.

A final response is due to Eleonore Stump and Norman Kretzmann. Their response assumes divine eternity, which implies, in part, that God is atemporal.  They argue that the claim that God knows what time it is now is ambiguous between four readings, depending on whether the “knows” is understood as an eternally present or temporally present verb, and depending on whether the now refers to the temporal now or the atemporal now.  Thus, God knows (eternally or temporally) what time it is now (that is, in the temporal present or the eternal present).  Nothing can know what time it is in the eternal present, since in the eternal present there is no time.  So we must understand the sense of ‘now’ to be ranging over the temporal present and not the eternal present.  God, since eternal, cannot know at the present time, but must know eternally.  So the only viable reading of the four possible readings is God knows eternally what is happening in the temporal present.  Consider the following inference introduced earlier: “If God does know such temporally changing truths, then God changes, since God goes from knowing that it is now 2:23pm to knowing that it is now 2:24pm.”  This inference, Stump and Kretzmann claim, does not hold when it is disambiguated as they disambiguate it.  For God eternally knows that at different times different truths are true, for instance, that it is now (at the temporal present) a certain time, but he knows these truths in one unchanging, atemporal action.  God’s eternal knowledge not only doesn’t allow for change; it positively rules change out, since change is inconsistent with eternity.  God eternally knows what is happening now, and at every other time, but in so knowing doesn’t go from being one way to being another.  Rather God simultaneously knows (on the assumption of divine eternity) in one act of knowing all temporally indexed truths (Stump and Kretzmann, 1981, p 455-458).

This response requires the assumption of divine eternity, which may be a cost for some defenders of divine immutability.  Also, it requires an understanding of simultaneity that can allow for God to be simultaneous with all times, but not entail that all times be simultaneous. Stump and Kretzmann offer such an account of simultaneity. (For more on this topic, see Leftow (1991) chapters 14 and 15.)

b. Immutability and Modal Collapse

One might worry that strong immutability leads to a modal collapse—that whatever is actually the case is necessary and whatever is not the case is impossible.  For, one might think, if it is impossible that God change, then no matter what happens, God will be the same.  So, no matter what happens, God will talk to Abraham at a certain time.  God can’t change to do anything else.  And if God can’t change to do anything else, then it seems like he’s stuck doing what he does, knowing what he knows, desiring what he desires, and so on, come what may.  And if that’s true, it is a small step to saying nothing could be different than it is, since if God hadn’t talked to Abraham at a certain time, God would be different.  And if God were different, he would be mutable.

The key to responding to this objection is to draw a distinction between being different in different circumstances and changing.  Divine immutability rules out that God go from being one way to being another way.  But it does not rule out God knowing, desiring, or acting differently than he does.  It is possible that God not create anything.  If God hadn’t created anything, he wouldn’t talk to Abraham at a certain time (since no Abraham would exist).  But such a scenario doesn’t require that God change, since it doesn’t require that there be a time when God is one way, and a later time when he is different.  Rather, it just requires the counterfactual difference that if God had not created, he would not talk to Abraham.  Such a truth is neutral to whether or not God changes.  In short, difference across possible worlds does not entail difference across times.  Since all that strong immutability rules out is difference across times, divine immutability is not inconsistent with counterfactual difference, and hence does not entail a modal collapse.  Things could have been otherwise than they are, and, had they been different, God would immutably know things other than he does, all without change (to see more on this, see Stump (2003) p 109-115.) In the words of one Catholic dogmatist:

Because of His unchangeableness God cannot revoke what he has once freely decreed,—such decisions, for instance, as to create a visible world, to redeem the human race, to permit Christ to die on the cross, etc.—though it is possible, of course, that some other Economy different from the present might be governed by entirely different divine decrees (Pohle, 1946, p 283).

One might still have worries about modal collapse here, especially if one affirms the doctrine of divine simplicity along with strong immutability, as most proponents of strong immutability do.

As I’ve argued, strong immutability rules out differences across times, but not across possible situations or worlds (or Economies, as Pohle has it).  The doctrine of divine simplicity—the thesis that in God there is no composition whatsoever, that God is uniquely metaphysically simple—seems to rule out difference across possible worlds. For what is there in God to be different if God is wholly simple?  So it seems that these two doctrines together rule out God’s being different at all, either across time or across worlds, and so, together, they seem to entail a modal collapse.

The first thing to note here is that, even if it is true that the doctrines of divine simplicity and strong immutability together entail a modal collapse—and there is good reason to be suspicious of this claim—the doctrine of divine simplicity is doing all the work in entailing the modal collapse.  This is because it, and it alone, seems to entail that God is the same in all possible worlds—strong immutability is silent on this point.  The second thing to note here is that the doctrine of divine simplicity can be understood in many different ways, some of which do not require simplicity to entail modal collapse.  Enumerating and defending these ways, however, is beyond the scope of this entry. (For two such understandings of divine simplicity, see Stump (2003), p 109-115, and Brower (2008)).

c. Responsiveness and an Immutable God

Adherents to the three great monotheisms, as well as other theists, traditionally believe that God answers prayers.  Answering prayers requires a response to the actions of another (in particular, a response to a petition).  Here is an argument that begins with responsiveness and concludes that God is mutable.  God is responsive to prayers.  Anything that is responsive, in responding, undergoes change.  Thus if God responds to prayers, then God undergoes change.  If God undergoes change, then God is not immutable.  Therefore, if God responds to prayers, then God is not immutable.

One response to this argument is to define immutability in the weaker sense of constancy of character (the discussion here follows Eleonore Stump’s treatment of divine responsiveness in her book Aquinas (Stump, 2003, p 115-118).  See also Stump and Kretzmann, “Eternity,” especially pages 450-451).  Immutability, so defined, does not rule out responsiveness to prayers.  In fact, it might be God’s character that accounts for divine responsiveness.  The defender of strong immutability, however, will have to make a different reply.  Since she will affirm that God responds to prayers, she will reject the claim that responsiveness requires change.  One way to support such a rejection is to provide an analysis of responsiveness that doesn’t require change across time.  Here are two such analyses:

J is responsive to T’s request to x if and only if J does x because T requested it.

J is responsive to T’s request to x if and only if J does x, and J might not have done x if T didn’t request it.

If either of these two closely related views is correct, then responsiveness doesn’t require temporal priority or change.  Notice that nothing in these two understandings of responsiveness requires change on the part of a responder.  In many cases where someone changes in responding it is, in part, due to her gaining new knowledge or having to prepare to respond.  But suppose that there was no point in her existence where she didn’t know that to which she responds or wasn’t prepared to respond.  It might be hard to imagine what that would be like for a human, since we humans were once ignorant, powerless babes.  But suppose a person were omniscient and omnipotent for all of his existence.  God, since omniscient, knows of all petitions, and, since omnipotent, needn’t ever prepare to answer a petition.  So God doesn’t fall under the conditions that humans fall under which require change on their parts to respond.  God can be immutably responding to the petitions of his followers.  That is, God can act in certain ways because his followers ask him to, and he might not have acted that way had they not asked.  But he doesn’t need to change in order to do so.

What responsiveness does require is counterfactual difference.  That is, had the circumstances been different than they are, then God might have done differently.  And that’s true.  Had Monica not asked for Augustine’s conversion, and God saved Augustine, at least in part, because Monica asked him to, God might not have converted Augustine.  All this leads to an important point: responsiveness is a modal, not temporal, concept.  That is, responsiveness has to do with difference across possible situations and not change across times. To respond is to do something because of something else.  Since we’ve seen in the previous objection that divine immutability does not rule out counterfactual difference, responsiveness is not ruled out by immutability.  While in very many cases it seems that responsiveness will require change, it does not require change in situations where the responder need not gain knowledge and need not prepare to respond.

d. Personhood and Immutability

Some thinkers have claimed that there is an inconsistency in something’s being both a person and unchanging.  One reason for thinking that personhood and immutability are inconsistent is that being a person requires being able to respond, and responsiveness is not possible for something immutable.  That objection was already discussed in the preceding section.  But there are other reasons for thinking that personhood and immutability are inconsistent.

Richard Swinburne claims that personhood and immutability are inconsistent because immutability is inconsistent with responsiveness, as the previous objection had it, and additionally because immutability is inconsistent with freedom.  God is free, and, according to Swinburne:

[A]n agent is perfectly free at a certain time if his action results from his own choice at that time and if his choice is not itself brought about by anything else.  Yet a person immutable in the strong sense would be unable to perform any action at a certain time other than what he had previously intended to do.  His course of action being fixed by his past choices, he would not be perfectly free (Swinburne, 1993, p 222).

A strongly immutable God cannot be free, and God is perfectly free, so God is not strongly immutable.

One response to this problem is to invoke divine timelessness.  If God is outside of time, this passage, which is about things that are “free at a certain time” does not apply to God. Furthermore, if we were to drop the “at a certain time” from the text, the proponent of divine timelessness would still have a response to this argument.  Given that God is atemporal, it isn’t true of God that he “previously intended to do” anything.  There are no previous or later intentions for an atemporal being—they are all at once.  Likewise, he would have no “past choices” to fix his actions.  So this argument is not applicable to an atemporal, immutable person.

Even for a temporally located immutable person, there are still responses to this argument.  The perfectly free, temporally located, immutable person needn’t have his actions brought about by anything else besides his own choices.  Such an agent can still fulfill the criterion set out by Swinburne for being perfectly free.  God’s immutable action is brought about by his own choice at a time, and his choice is not brought about by any previous things, including previous choices.  Swinburne is right that God’s past choices would bring about his present actions (being immutable, God’s choices can’t change, so the past choices are identical with the present choices), but he is wrong in thinking that his choice is brought about by previous things.  For the choice of a temporal, immutable God is everlastingly the exact same (if God goes from choosing one thing to not choosing that thing, he is not immutable).  God’s action is everlastingly the same, and everlastingly brought about by God’s choice, which is also everlastingly the same.  God’s course of action is, as Swinburne says, fixed by past choices, but those past choices are identical with the current choices, and the choices are not brought about by anything else.  So such a being will fulfill the definition of what it is to be perfectly free.

One might also think that personhood requires rationality, consciousness, the ability to communicate, and being self-conscious (William Mann, 1983, p 269-272). Notice that none of these properties are inconsistent with immutability.  Some aspects of human rationality and consciousness aren’t available for an immutable person, for example, getting angry, learning something new, or becoming aware of a situation.  That doesn’t entail that an immutable person cannot be rational or conscious at all.  Rather, it means that the aspects of rationality or consciousness that require temporal change are ruled out.  But an immutable God can still be aware of what Moses does, still respond in a way we can call wrathful, and still love Moses.  Such actions are clear cases of rationality and consciousness and none of them require, as a necessary condition, change in the agent.

e. Immutability, Time, and Freedom

Suppose that God is in time, but immutable.  That means his knowledge can’t change over time, as discussed in a previous objection.  So anything God knows now, he knew a thousand years ago.  And here’s one thing that God knows now: what I freely chose to eat for breakfast yesterday.  I know such a truth, so God can’t be ignorant of it.  Given immutability, God can’t go from not knowing it to knowing it.  So he has everlastingly known it.  Similarly for all other truths.  In general, God knows what we are going to do before we do it.

If God knows before I act that I am going to act in that way, then I can’t do anything but act in that way.  And if, for every one of my actions, I can’t do otherwise, then I can’t be free.  Put another way, God’s knowledge ten thousand years ago that I would do thus-and-such entails that now I do thus-and-such.  And that’s true of all my actions.  So God’s knowledge determines all of my actions.

The proponent of an eternal, immutable God doesn’t face this problem, since on that view God doesn’t, strictly speaking, know anything before anything else.  Likewise, someone who denies immutability may get around this objection by affirming that God changes to learn new facts as time marches on.  But the defender of a temporal, immutable God has neither of these options available.

One response open to the defender of a temporal, immutable God is to embrace the view, presented above in section 3.a, that immutability doesn’t rule out extrinsic change, and gaining or losing knowledge is extrinsic change.  The benefits and costs of this view were discussed above.

Another response would be to argue that there is an asymmetry between truths and the world which allows for prior logical determination not to render a posterior action unfree. Truths are true because reality is as it is, and not the other way around.  So God’s knowledge that I do thus-and-such is true because I do thus-and-such, and not the converse.  In order to get unfree action, one must have one’s actions be done because of something else, such as force.  Since the dependence of truth on reality requires the “because of” relations to run the other way, actions entailed by the truth of earlier truths are not thereby rendered unfree. (Trenton Merricks, 2009; see also Kevin Timpe, 2007).

A final response is to claim that God knows all the actions that I will do, and he knew them far before I do actually perform those actions, but, were I to freely do something else, he would have known differently than he does.  This answer requires backwards counterfactual dependence of God’s knowledge on future actions.  But it doesn’t, at least without much argument, require backwards causation. This view is known as Ockham’s Way Out, and was popularized in an article by Alvin Plantinga (1986) entitled, aptly, “On Ockham’s Way Out.”

4. Related Issues

There are both philosophical and theological issues related to divine immutability.  Some theological issues include the relationship between immutability and other attributes and the consistency of God becoming man yet being strongly immutable.  As for philosophically related issues, one is the issue discussed above in section 3.e: the issue of (theological) determinism and free will.  Another relevant issue is the distinction, so important to Leftow’s understanding of immutability (see section 3.a), between intrinsic and extrinsic properties.

a. Divine Timelessness or Eternality

As is clear from the responses to some objections in section 3, supposing that God is outside of time has some advantages when it comes to answering objections to divine immutability (Mann, 1983). Divine timelessness entails divine immutability, given that change has as a necessary condition time in which to change.  But running the entailment relation the other way—from immutability to timelessness—is more difficult.  If one can show that existing in time requires at least one sort of intrinsic change—if, for instance, change in age or duration of existence is intrinsic change—then one can argue that immutability and temporality are inconsistent (Leftow, 2004). For arguments from immutability to timelessness, see Leftow (2004).

b. Divine Impassibility

Divine impassibility is the claim that God cannot have affects, or be affected by things.  Paul Gavrilyuk describes it as follows:

[T]hat [God] does not have the same emotions as the gods of the heathen; that his care for human beings is free from self-interest and any association with evil; that since he has neither body nor soul, he cannot directly have the experiences typically connected with them; that he is not overwhelmed by emotions and in the incarnation emerges victorious over suffering and death (Gavrilyuk (2004) 15-16; for other definitions of the term, see Creel (1986) 3-10).

Notice that impassibility, as so described, doesn’t entail immutability.  An agent can be impassible in the sense described by Gavrilyuk but still mutable.  He can, for instance, change in going from not promising to promising and be impassible.  Likewise, an immutable God can be passible.  He can be continually undergoing an emotion without change—for instance, he could everlastingly feel sorrow over human sin (Leftow, 2004). Neither doctrine entails the other. Nevertheless, they are closely related and often discussed in tandem.

c. The Incarnation

The incarnation is the doctrine, central to Christianity, that the Son of God, the Second Person of the Trinity, assumed a full human nature (that is, all that there is to a human), and became man.  Thus the one divine person had two natures—one divine, and one human, each with its own intellect and will, and these two natures didn’t mix together or exclude one another.  For the most important traditional expression of this doctrine, see the council of Chalcedon.  (Though it must be said that the doctrine wasn’t fully developed—in particular, the parts about Christ having two wills—until later councils.)

The incarnation raises questions concerning the immutability of God insofar as in the incarnation the Second Person of the Trinity becomes a man, and becoming, at least on the face of it, appears to involve change.  So the incarnation, it has been argued, is inconsistent with divine immutability.

This is not the place to go into a theological discussion of the consistency of the two teachings.  One should note, however, that the very church fathers and councils that teach that Christ’s two natures didn’t change one another or mix together also appeal, as we saw in sections 1.b and 2, to the claim that God is absolutely unchangeable by his very nature.  So the principle of charity dictates that if we find ourselves understanding immutability and the incarnation such that there is an explicit, obvious contradiction between them, noticeable by the merest reflection upon the two doctrines, the chances are that it is our understanding, and not the traditional doctrine, that is at fault. For more on the relationship between the incarnation and immutability, see Richards (2003) p 209-210 and Dodds (1986) p 272-277.  Stump (2003) chapter 14 is helpful here as well.  Also, see Weinandy (1985), which is a book-length discussion of this very question.

d. Intrinsic/Extrinsic Properties

The distinction between intrinsic and extrinsic properties is important to the discussion of divine immutability because there needs to be a way to distinguish between the predications concerning God which can change in truth-value without precluding divine immutability and those that can’t.  This was discussed in sections 2.b and 3.a.  Divine immutability is compromised if the proposition that God is planning to redeem creation changes in truth-value, but it is not compromised if the proposition that God is being praised by Father Jones changes in truth-value.  The difference between propositions of these two sorts is often spelled out in terms of intrinsic and extrinsic properties (oftentimes extrinsic changes are called Cambridge changes).  God’s plans are intrinsic to God, but his being praised is extrinsic to him (unless he is praising himself).

5. References and Further Reading

  • Brower, Jeffrey. “Making Sense of Divine Simplicity”. Faith and Philosophy 25(1) 2008. p 3-30.
  • Creel, Richard. Divine Impassibility. Cambridge: Cambridge University Press, 1986.
  • Denby, David. “The Distinction between Intrinsic and Extrinsic Properties”. Mind: A Quarterly Review of Philosophy 115(457) 2006. p 1-17.
  • Dodds, Michael. The Unchanging God of Love: a Study of the Teaching of St. Thomas Aquinas on Divine Immutability in View of Certain Contemporary Criticism of This Doctrine. Fribourg: Editions Universitaires, 1986.
    • This book provides a detailed and historical look at Thomas Aquinas’ understanding of immutability, as well as defending it against objections.
  • Dorner, I. and Robert Williams. Divine Immutability. Minneapolis: Fortress Press, 1994.
    • This is an important work on immutability by a 19th century theologian, which receives more attention in theological than in philosophical contexts.
  • Gavrilyuk, Paul. The Suffering of the Impassible God. Oxford Oxfordshire: Oxford University Press, 2004.
    • This is a good, recent discussion of divine impassibility.
  • Kretzmann, Norman. “Omniscience and Immutability”. Journal of Philosophy 63(14) 1966. p 409-421.
  • Leftow, Brian. “Eternity and Immutability”. The Blackwell Guide to Philosophy of Religion. Ed. William E. Mann. Blackwell Publishing, 2004.
    • This is an excellent article on divine immutability and eternality from a philosophical viewpoint.
  • Leftow, Brian. “Immutability”. The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.).
    • This, too, is an excellent article on divine immutability from a philosophical viewpoint.
  • Leftow, Brian. Time and Eternity. Ithaca: Cornell University Press, 1991.
    • This book provides a technical, extended discussion of divine eternality, its entailments, and arguments for and against it.
  • Mann, William. “Simplicity and Immutability in God”. International Philosophical Quarterly 23, 1983. p 267-276.
    • This article argues that divine immutability is best understood in the light of divine eternality and simplicity.  It also includes a nice discussion of immutability and personhood.
  • Merricks, Trenton.  “Truth and Freedom”. Philosophical Review 118(1), 2009. p 29-57.
  • Perry, John. “The Problem of the Essential Indexical”. Noûs 13, 1979. p 3-21.
  • Plantinga, Alvin. “On Ockham’s Way Out”. Faith and Philosophy 3(3) 1986. p 235-269.
  • Pohle, Joseph and Arthur Preuss.  God: His Knowability, Essence, and Attributes.  St. Louis, MO: Herder Book Co, 1946.
    • This is a volume from a standard dogmatic set, which contains biblical, patristic, and philosophical arguments for Catholic dogmas.
  • Richards, Jay. The Untamed God. Downers Grove: InterVarsity Press, 2003.
    • This book is about divine immutability and simplicity.  It is written at a good level for a beginner, but contains discussion useful for advanced readers as well.
  • Schaff, Philip.  The Creeds of Christendom: The Evangelical Protestant Creeds, with Translations. Harper, 1877.
    • This is a useful collection of confessional statements from the protestant reformers and their successors.
  • Stump, Eleonore. Aquinas. New York: Routledge, 2003.
    • An excellent discussion of Aquinas’s philosophy, which includes extended discussions of divine responsiveness, immutability, simplicity, and eternality.
  • Stump, Eleonore, and Norman Kretzmann, “Eternity”. Journal of Philosophy 78, 1981. p 429-458.
    • A seminal article on the relationship between time and God.
  • Sullivan, Thomas D.  “Omniscience, Immutability, and the Divine Mode of Knowing”. Faith and Philosophy 8(1) 1991. p 21-35.
  • Swinburne, Richard. The Coherence of Theism. Oxford: Clarendon Press, 1993.
  • Tanner, Norman. Decrees of the Ecumenical Councils. Franklin: Sheed & Ward, 1990.
    • An excellent two volume work which contains the decrees of the councils in the original languages, with facing translations.
  • Timpe, Kevin. “Truthmaking and Divine Eternity”. Religious Studies 43(3) 2007. p 299-315.
  • Weinandy, Thomas. Does God Change?. Still River: St. Bede’s Publications, 1985.
    • This book is an interesting historical discussion of what it means to say that God is immutable but became man.
  • Williams, Robert R., “I. A Dorner: The Ethical Immutability of God”. Journal of the American Academy of Religion 54(4), 1986. p 721-738.

Author Information

Tim Pawl
Email: timpawl@stthomas.edu
University of Saint Thomas
U. S. A.

Paraconsistent Logic

A paraconsistent logic is a way to reason about inconsistent information without lapsing into absurdity. In a non-paraconsistent logic, inconsistency explodes in the sense that if a contradiction obtains, then everything (everything!) else obtains, too. Someone reasoning with a paraconsistent logic can begin with inconsistent premises—say, a moral dilemma, a Kantian antinomy, or a semantic paradox—and still reach sensible conclusions, without completely exploding into incoherence.

Paraconsistency is a thesis about logical consequence: not every contradiction entails arbitrary absurdities. Beyond that minimal claim, views and mechanics of paraconsistent logic come in a broad spectrum, from weak to strong, as follows.

On the very weak end, paraconsistent logics are taken to be safeguards to control for human fallibility. We inevitably revise our theories, have false beliefs, and make mistakes; to prevent falling into incoherence, a paraconsistent logic is required. Such modest and conservative claims say nothing about truth per se. Weak paraconsistency is still compatible with the thought that if a contradiction were true, then everything would be true, too—because, beliefs and theories notwithstanding, contradictions cannot be true.

On the very strong end of the spectrum, paraconsistent logics underwrite the claim that some contradictions really are true. This thesis—dialetheism—is that sometimes the best theory (of mathematics, or metaphysics, or even the empirical world) is contradictory. Paraconsistency is mandated because the dialetheist still maintains that not everything is true. In fact, strong paraconsistency maintains that all contradictions are false—even though some contradictions also are true. Thus, at this end of the spectrum, dialetheism is itself one of the true contradictions.

This article offers a brief discussion of some main ideas and approaches to paraconsistency. Modern logics are couched in the language of mathematics and formal symbolism. Nevertheless, this article is not a tutorial on the technical aspects of paraconsistency, but rather a synopsis of the underlying ideas. See the  suggested readings for formal expositions, as well as historical material.

Table of Contents

  1. The Problem
  2. Logical Background
    1. Definitions
    2. Two Grades of Paraconsistency
    3. Requirements for a Logic to Be Paraconsistent
  3. Schools of Paraconsistent Logic
    1. Discussive Logic
    2. Preservationism
    3. Adaptive Logic
    4. Relevance
    5. Logics of Formal Inconsistency
    6. Dialetheism
  4. Applications
    1. Moral Dilemmas
    2. Law, Science, and Belief Revision
    3. Closed Theories – Truth and Sets
      1. Naïve Axioms
      2. Further Logical Restrictions
    4. Learning, Beliefs, and AI
  5. Conclusion
  6. References and Further Reading

1. The Problem

Consider an example due to Alan Weir, concerning a political leader who absolutely, fundamentally believes in the sanctity of human life, and so believes that war is always wrong. All the same, a situation arises where her country must enter into war (else people will die, which is wrong). Entering into war will inevitably mean that some people will die. Plausibly, the political leader is now embroiled in a dilemma. This is exactly when paraconsistent inference is appropriate. Imagine our leader thinking, ‘War is always wrong, but since we are going to war anyway, we may as well bomb civilians.’ Absurdist reasoning of this sort is not only bad logic, but just plain old bad.

David Hume once wrote (1740, p. 633),

I find myself involv’d in such a labyrinth, that, I must confess, I neither know how to correct my former opinions, nor how to render them consistent.

As Schotch and Jennings rightly point out, ‘it is no good telling Hume that if his inconsistent opinions were, all of them, true then every sentence would be true.’ The best we could tell Hume is that at least some of his opinions are wrong—but ‘this, so far from being news to Hume, was what occasioned much of the anguish he evidently felt’ (Schotch et al. p. 23). We want a way to keep sensible and reasonable even when—especially when—such problems arise. We need a way to keep from falling to irrational pieces when life, logic, mathematics or even philosophy leads us into paradox and conundrum. That is what paraconsistent logics are for.

2. Logical Background

a. Definitions

A logic is a set of well-formed formulae, along with an inference relation ⊢. The inference relation, also called logical consequence, may be specified syntactically or semantically, and tells us which formulae (conclusions) follow from which formulae (premises). When a sentence B follows from a bunch of sentences A0, A1, …, An, we write

A0, A1, …, An ⊢ B.

When the relation ⊢ holds, we say that the inference is valid. The set of all sentences that can be validly inferred from a given collection of premises is called a theory.

A key distinction behind the entire paraconsistent enterprise is that between consistency and coherence. A theory is consistent if no pairs of contradictory sentences A, ¬A are derivable, or alternatively iff no single sentence of the form A & ¬A is derivable. Coherence is a broader notion, sometimes called absolute (as opposed to simple) consistency, and more often called non-triviality. A trivial or absurd theory is one in which absolutely every sentence holds. The idea of paraconsistency is that coherence is possible even without consistency. Put another way, a paraconsistent logician can say that a theory is inconsistent without meaning that the theory is incoherent, or absurd. The former is a structural feature of the theory, worth repair or further study; the latter means the theory has gone disastrously wrong. Paraconsistency gives us a principled way to resist equating contradiction with absurdity.
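Here is a toy illustration of the difference, over a small finite language; the language and the tilde encoding of negation are assumptions of the example, not part of any official definition.

# Inconsistency versus triviality, illustrated over a four-sentence language.
LANGUAGE = {'A', '~A', 'B', '~B'}

theory_1 = {'A', '~A', 'B'}   # inconsistent, but not every sentence is in it
theory_2 = set(LANGUAGE)      # trivial (absurd): every sentence of the language

def inconsistent(theory):
    """A theory is inconsistent iff it contains some sentence and its negation."""
    return any(s in theory and '~' + s in theory
               for s in theory if not s.startswith('~'))

def trivial(theory):
    """A theory is trivial iff it contains every sentence of the language."""
    return LANGUAGE <= theory

print(inconsistent(theory_1), trivial(theory_1))   # True False
print(inconsistent(theory_2), trivial(theory_2))   # True True

The paraconsistent thought is that theories like the first are still worth reasoning about, while only theories like the second have gone disastrously wrong.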

Classical logic, the logic developed by Boole, Frege, Russell, and others in the late 1800s, and the logic almost always taught in university courses, has an inference relation according to which

A, ¬A ⊢ B

is valid. Here the conclusion, B, could be absolutely anything at all. Thus this inference is called ex contradictione quodlibet (from a contradiction, everything follows) or explosion. Paraconsistent logicians have urged that this feature of classical inference is incorrect. While the reasons for denying the validity of explosion will vary according to one’s view of the role of logic, a basic claim is that the move from a contradiction to an arbitrary formula does not seem like reasoning. As the founders of relevant logic, Anderson and Belnap, urge in their canonical book Entailment, a ‘proof’ submitted to a mathematics journal in which the essential steps fail to provide a reason to believe the conclusion, e.g. a proof by explosion, would be rejected out of hand. Mark Colyvan (2008) illustrates the point by noting that no one has laid claim to a startlingly simple proof of the Riemann hypothesis:

Riemann’s Hypothesis: All the non-trivial zeros of the zeta function have real part equal to 1/2.
Proof: Let R stand for the Russell set, the set of all sets that are not members of themselves. It is straightforward to show that this set is both a member of itself and not a member of itself. Therefore, all the non-trivial zeros of Riemann’s zeta function have real part equal to 1/2.

Needless to say, the Riemann hypothesis remains an open problem at time of writing.

Minimally, paraconsistent logicians claim that there are or may be situations in which paraconsistency is a viable alternative to classical logic. This is a pluralist view, by which different logics are appropriate to different areas. Just as a matter of practical value, explosion does not seem like good advice for a person who is faced with a contradiction, as the quote from Hume above makes clear. More forcefully, paraconsistent logics make claim to being a better account of logic than the classical apparatus. This is closer to a monistic view, in which there is, essentially, one correct logic, and it is paraconsistent.

b. Two Grades of Paraconsistency

Let us have a formal definition of paraconsistency.

Definition 1. A logic is paraconsistent iff it is not the case for all sentences A, B that A, ¬A ⊢ B.

This definition simply is the denial of ex contradictione quodlibet; a logic is paraconsistent iff it does not validate explosion. The definition is neutral as to whether any inconsistency will ever arise. It only indicates that, were an inconsistency to arise, this would not necessarily lead to inferential explosion. In the next definition, things are a little different:

Definition 2. A logic is paraconsistent iff there are some sentences A, B such that ⊢ A and ⊢ ¬A, but not ⊢ B.

A logic that is paraconsistent in the sense of definition 2 automatically satisfies definition 1. But the second definition suggests that there are actually inconsistent theories. The idea is that, in order for explosion to fail, one needs to envisage circumstances in which contradictions obtain. The difference between the definitions is subtle, but it will help us distinguish between two main gradations of paraconsistency, weak and strong.
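Before turning to that distinction, it may help to see Definition 1 in action. The following is a minimal sketch of a brute-force validity checker for one well-known paraconsistent semantics, Priest’s three-valued Logic of Paradox (LP); the choice of LP here is an assumption of the sketch made purely for illustration, since any logic satisfying Definition 1 would do. The values are 1 (true only), 0.5 (both true and false), and 0 (false only), with 1 and 0.5 designated.

from itertools import product

# Brute-force validity check for the three-valued logic LP.
# An inference is valid iff every valuation that designates all the
# premises also designates the conclusion.
DESIGNATED = {1, 0.5}

def neg(a): return 1 - a
def conj(a, b): return min(a, b)

def valid(premises, conclusion, atoms):
    for values in product([0, 0.5, 1], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) in DESIGNATED for p in premises) and conclusion(v) not in DESIGNATED:
            return False
    return True

# Explosion (A, not-A entails B) fails: the valuation A = 0.5, B = 0 is a countermodel.
print(valid([lambda v: v['A'], lambda v: neg(v['A'])],
            lambda v: v['B'], atoms=['A', 'B']))                # False

# Adjunction (A, B entails A & B) still holds.
print(valid([lambda v: v['A'], lambda v: v['B']],
            lambda v: conj(v['A'], v['B']), atoms=['A', 'B']))  # True

Since explosion has a countermodel, a logic with this semantics satisfies Definition 1; a logician who also held that the middle value is genuinely instantiated would be moving toward Definition 2.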

Roughly, weak paraconsistency is the cluster concept that

  • any apparent contradictions are always due to human error;
  • classical logic is preferable, and in a better world where humans did not err, we would use classical logic;
  • no true theory would ever contain an inconsistency.

Weak paraconsistent logicians see their role as akin to doctors or mechanics. Sometimes information systems develop regrettable but inevitable errors, and paraconsistent logics are tools for damage control. Weak paraconsistentists look for ways to restore consistency to the system or to make the system work as consistently as possible. Weak paraconsistentists have the same view, more or less, of contradictions as do classical logicians.

On the other side, strong paraconsistency includes ideas like

  • Some contradictions may not be errors;
  • classical logic is wrong in principle;
  • some true theories may actually be inconsistent.

A strong paraconsistentist considers relaxing the law of non-contradiction in some way, either by dropping it entirely, so that ¬(A & ¬A) is not a theorem, or by holding that the law can itself figure into contradictions, of the form

Always, not (A and not A),
and sometimes, both A and not A.

Strong paraconsistentists may be interested in inconsistent systems for their own sake, rather like a mathematician considering different non-Euclidean systems of geometry, without worry about the ‘truth’ of the systems; or a strong paraconsistentist may expect that inconsistent systems are true and accurate descriptions of the world, like a physicist considering a non-Euclidean geometry as the actual geometry of space.

It is important to keep weak paraconsistency distinct from logical pluralism, and strong paraconsistency or dialetheism (see §3f.) distinct from logical monism. For example, one can well be a weak paraconsistentist, insofar as one claims that explosion is invalid, even though there are no true contradictions, and at the same time a logical monist, holding that the One True Logic is paraconsistent. This was the position of the fathers of relevance logic, Anderson and Belnap, for instance. Similarly, one could be a dialetheist and a logical pluralist, as is the contemporary philosophical logician Jc Beall (see suggested readings).

c. Requirements for a Logic to be Paraconsistent

All approaches to paraconsistency seek inference relations that do not explode. Sometimes this is accomplished by going back to basics, developing new and powerful ideas about the meaning of logical consequence, and checking that these ideas naturally do not lead to explosion (e.g. relevance logic, §3d). More often paraconsistency is accomplished by looking at what causes explosion in classical inference, and simply removing the causes. In either case, there are some key constraints on a paraconsistent logic that we should look at up front.

Of course, the main requirement is to block the rule of explosion. This is not really a limitation, since explosion is prima facie invalid anyway. But we cannot simply remove the inference of explosion from classical logic and automatically get a paraconsistent logic. The reason for this, and the main, serious constraint on a paraconsistent logic, was made prominent by C. I. Lewis in the early twentieth century. Suppose we have both A and ¬A as premises. If we have A, then we have that either A or B, since a disjunction only requires that one of its disjuncts holds. But then, given ¬A, it seems that we have B, since if either A or B, but not A, then B. Therefore, from A and ¬A, we have deduced B. The problem is that B is completely arbitrary—an absurdity. So if it is invalid to infer everything from a contradiction, then this rule, called disjunctive syllogism,

A ∨ B, ¬A ⊢ B,

must be invalid, too.
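A quick hand check, under the same illustrative LP-style semantics used in the sketch above (values 1, 0.5, 0, with 1 and 0.5 designated; again an assumption of the example, not part of Lewis’s argument), shows where disjunctive syllogism breaks down:

# At the "glut" valuation A = 0.5, B = 0, both premises of disjunctive
# syllogism are designated but the conclusion is not.
A, B = 0.5, 0
premise_1 = max(A, B)   # A or B  -> 0.5, designated
premise_2 = 1 - A       # not A   -> 0.5, designated
conclusion = B          # B       -> 0, not designated
print(premise_1, premise_2, conclusion)

The very valuation that blocks explosion is the one at which disjunctive syllogism fails, which is Lewis’s argument run in reverse.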

There are two things to remark about the failure of disjunctive syllogism (DS).

First, we might say that classical logic runs into trouble when it comes to inconsistent situations. This is something like the way Newtonian physics makes bad predictions when it comes to the large-scale structure of space-time. And so, just as Newtonian physics is still basically accurate and applicable on medium-sized domains, we can say that classical logic is still accurate and appropriate in consistent domains. For working out sudoku puzzles, paying taxes, or solving murder mysteries, there is nothing wrong with classical reasoning. For exotic objects like contradictions, though, classical logic is unprepared.

Secondly, since DS is a valid classical inference, we can see clearly that a paraconsistent logic will validate fewer inferences than classical logic. (No classically invalid inferences are going to become valid by dint of inconsistent information.) That is the whole idea—that classical logic allows too much, and especially given the possibility of inconsistency, we must be more discriminating. This is sometimes expressed by saying that paraconsistent logics are ‘weaker’ than classical logic; but since paraconsistent logics are more flexible and apply to more situations, we needn’t focus too much on the slang. Classical logic is in many ways more limited than paraconsistent logic (see §4c.).

A third point, which we will take up in §3d, is that the invalidity of DS shows, essentially, that for the basic inference of modus ponens to be valid in all situations, we need a new logical connective for implication, not defined in terms of disjunction and negation. Now we turn to some weak and strong systems of paraconsistency.

3. Schools of Paraconsistent Logic

a. Discussive Logic

The first paraconsistent logic was developed by Jaśkowski, a student of Łukasiewicz, in Poland in 1948. He gave some basic criteria for a paraconsistent logic:

To find a system of sentential calculus which:
1) when applied to contradictory systems would not entail their triviality;
2) would be rich enough to enable practical inference;
3) would have intuitive justification.

To meet his own criteria, Jaśkowski’s idea is to imagine a group of people having a discussion, some of whom are disagreeing with each other. One person asserts: ‘Wealth should be distributed equally amongst all persons.’ Another person says, ‘No, it should not; everyone should just have what he earns.’ The group as a whole is now in an inconsistent information state. We face such states all the time: reading news articles, blogs, and opinion pieces, we take in contradictions (even if each article is internally consistent, which is unusual). How to reason about conflicting information like this?

Jaśkowski’s idea is to prevent the inconsistent information from co-mingling. He does so, in effect, by blocking the rule of adjunction:

A, B ⊢ A & B.

This rule says that, given two premises A and B, we can conjoin them into a single statement, (A & B). If the adjunction rule is removed, then we can have A and ¬A, without deriving a full-blown contradiction A & ¬A. The information is kept separate. On this approach, the classical rule of explosion actually can still hold, in the form

A & ¬A ⊢ B.

The aim of this approach is not to prevent explosion at the sentence level, but rather to ensure that no single contradictory sentence (as opposed to a pair of jointly inconsistent sentences) can ever arise. So while the inconsistency arising from different disagreeing parties can be handled coherently, a person who is internally contradictory is still reckoned to be absurd.
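Here is a minimal sketch of the non-adjunctive idea, restricted to bare assertions; the participant sets and sentence labels are made up for illustration, and the sketch is a toy picture of the idea rather than Jaśkowski’s own formal system.

# Each participant holds an internally consistent set of assertions.
# The group "asserts" whatever some participant asserts, but a conjunction
# is only formed inside a single participant's view, so adjunction across
# participants is blocked.
participants = [
    {'Wealth_should_be_shared'},       # first speaker
    {'~Wealth_should_be_shared'},      # second speaker
]

def group_asserts(sentence):
    return any(sentence in p for p in participants)

def group_asserts_conjunction(s1, s2):
    return any(s1 in p and s2 in p for p in participants)

print(group_asserts('Wealth_should_be_shared'))        # True
print(group_asserts('~Wealth_should_be_shared'))       # True
print(group_asserts_conjunction('Wealth_should_be_shared',
                                '~Wealth_should_be_shared'))   # False

Both a sentence and its negation are group-asserted, yet no single contradictory conjunction is ever formed, so the explosive form A & ¬A ⊢ B never gets the chance to fire.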

In 1974, Rescher and Brandom suggested a very similar approach, in terms of worlds. As Belnap has pointed out, the non-adjunctive idea has obvious applications to computer science, for example when a large amount of polling data is stored by a system.

b. Preservationism

Around 1978, the Canadian logicians Schotch and Jennings developed an approach to modal logic and paraconsistency that has some close affinities with the discussive approach. Their approach is now known as the preservationist school. The fundamental idea is that, given an inconsistent collection of premises, we should not try to reason about the collection of premises as a whole, but rather focus on internally consistent subsets of premises. Like discussive logic, preservationists see an important distinction between an inconsistent data set, like

{A, ¬A},

which is considered tractable, versus an outright contradiction like

A & ¬A,

which is considered hopeless. The whole idea is summarized in a paraphrase of Gillman Payette, a major contributor to the preservationist program:

Question: How do you reason from an inconsistent set of premises?
Answer: You don’t, since every formula follows in that case. You reason from consistent subsets of premises.

Preservationists begin with an already defined logic X, usually classical logic. They assert that we, as fallible humans, are simply sometimes ‘stuck with bad data’; and this being the case, some kind of repair is needed on the logic X to ensure coherence. Preservationists define the level of a set of premises to be the least number of cells into which the set must be divided for every cell to be internally consistent. They then define an inference relation, called forcing, in terms of the logic X, as follows:

A set of sentences Γ forces A iff there is at least one internally consistent subset Δ of Γ such that A is an X-valid inference from Δ.

Forcing preserves the level of Γ. If there is any consistency to preserve, forcing ensures that things do not get any more inconsistent. In particular, if a data set is inconsistent but contains no single-sentence contradictions, then the forcing relation is paraconsistent.
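As a toy illustration of the level idea, here is a sketch restricted to premise sets of bare literals (an atom such as A, or its negation written ~A); the restriction to literals and the brute-force search over partitions are simplifying assumptions of the example, not part of the preservationist formalism itself.

def consistent(cell):
    """A set of literals is consistent iff no atom occurs both plain and negated."""
    atoms = {lit.lstrip('~') for lit in cell}
    return all(not ({a, '~' + a} <= cell) for a in atoms)

def partitions(items):
    """Generate every way of dividing a list into non-empty cells."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        for i, cell in enumerate(smaller):
            yield smaller[:i] + [cell | {first}] + smaller[i + 1:]
        yield smaller + [{first}]

def level(premises):
    """Least number of cells needed so that every cell is internally consistent."""
    best = None
    for p in partitions(sorted(premises)):
        if all(consistent(cell) for cell in p):
            best = len(p) if best is None else min(best, len(p))
    return best

print(level({'A', 'B'}))              # 1: already consistent
print(level({'A', '~A', 'B'}))        # 2: e.g. {A, B} and {~A}
print(level({'A', '~A', 'B', '~B'}))  # 2: e.g. {A, B} and {~A, ~B}

On this toy measure a consistent set has level 1, and adding further disagreements need not raise the level, since the conflicting literals can be parcelled out among the existing cells.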

Aside from paraconsistent applications, and roots in modal logic, preservationists have recently proved some deep theorems about logic more generally. Payette has shown, for example, that two logics are identical iff they assign any set of sentences the same level.

Detour: Chunk and Permeate

Closely related to the preservationist paradigm is a technique called chunk and permeate, developed by Bryson Brown and Graham Priest to explain the early differential calculus of Newton and Leibniz (see inconsistent mathematics). It is known that the early calculus involved contradictions of some kind, in particular, infinitesimal numbers that are sometimes identical to zero, and other times of a non-zero quantity. Brown and Priest show how reasoning about infinitesimals (and their related notions of derivatives) can be done coherently, by breaking up the reasoning into consistent ‘chunks,’ and defining carefully controlled ‘permeations’ between the chunks. The permeations show how enough but not too much information can pass from one chunk to another, and thus reconstruct how a correct mathematical solution can obtain from apparently inconsistent data.

c. Adaptive Logic

Taking applied examples from scientific reasoning as its starting point, the adaptive logic program considers systems in which the rules of inference themselves can change as we go along. The logics are dynamic. In dynamic logics, rules of inference change as a function of what has been derived to that point, and so some sentences which were derivable at a point in time are no longer derivable, and vice versa. The program has been developed by Diderik Batens and his school in Ghent.

The idea is that our commitments may entail a belief that we nevertheless reject. This is because, as humans, our knowledge is not closed under logical consequence and so we are not fully aware of all the consequences of our commitments. When we find ourselves confronted with a problem, there may be two kinds of dynamics at work. In external dynamics, a conclusion may be withdrawn given some new information; logics in which this is allowed are called non-monotonic. External dynamics are widely recognized and are also important to the preservationist program. In internal dynamics, the premises themselves may lead to a conclusion being withdrawn. This kind of dynamic is less recognized and is more properly within the ambit of paraconsistency. Sometimes, we do derive a consequence we later reject, without modifying our convictions.

Adaptive systems work by recognizing abnormalities, and deploying formal strategies. Both of these notions are defined specifically to the task at hand; for instance, an abnormality might be an inconsistency, or it might be an inductive inference, and a strategy might be to delete a line of a proof, or to change an inference rule. The base paraconsistent logic studied by the adaptive school is called CLuN, which is all of the positive (negation-free) fragment of classical logic, plus the law of excluded middle A ∨ ¬A.

d. Relevance

Relevant logic is not fundamentally about issues of consistency and contradiction. Instead the chief motivation of relevant logic is that, for an argument to be valid, the premises must have a meaningful connection to the conclusion. For example, classical inferences like

B ⊢ A → B,

or

¬(A → B) ⊢ A,

seem to relevance logicians to fail as decent logical inferences. The requirement that premises be relevant to the conclusion delivers a paraconsistent inference relation as a byproduct, since in ex contradictione quodlibet, the premises A and ¬A do not have anything to do with an arbitrary conclusion B. Relevant logic begins with Ackermann, and was properly developed in the work of Anderson and Belnap. Many of the founders of relevant logic, such as Robert Meyer and Richard Routley, have also been directly concerned with paraconsistency.

From our perspective, one of the most important aspects of relevant logic is that it provides an implication connective that obeys modus ponens, even in inconsistent situations. In §2b, we saw that the disjunctive syllogism is not paraconsistently valid; and so in any logic in which implication is defined by negation and disjunction, modus ponens is invalid, too. That is,

A ⊃ B := ¬A ∨ B

does not, as we saw in §2b above, define a conditional that obeys

A, A ⊃ B ⊢ B.
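
To see the failure concretely, here is a tiny check using the three values of LP, the dialetheic logic described in §3f below ('t' for true only, 'b' for both, 'f' for false only; 'at least true' means 't' or 'b'). The encoding is invented for illustration.

```python
NEG = {'t': 'f', 'b': 'b', 'f': 't'}
ORDER = {'f': 0, 'b': 1, 't': 2}
def disj(x, y): return max(x, y, key=ORDER.get)  # 'or' takes the greater value

A, B = 'b', 'f'                # A is a glut (both true and false), B is plain false
hook = disj(NEG[A], B)         # A hook B, defined as not-A or B
at_least_true = {'t', 'b'}
print(A in at_least_true)      # True: the premise A holds
print(hook in at_least_true)   # True: so does the premise A hook B
print(B in at_least_true)      # False: yet the conclusion B does not
```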

In the argot, we say that ‘hook is not detachable’ or ‘ponenable’. In relevant logic, implication A → B is not defined with truth-functional connectives at all, but rather is defined either axiomatically or semantically (with worlds or algebraic semantics). Going this way, one can have a very robust implication connective, in which not only modus ponens is valid,

A → B, A; therefore, B.

Other widely used inferences obtain, too. Let’s just mention a few that involve negation in ways that might seem suspect from a paraconsistent point of view. We can have contraposition

A → B ⊢ ¬B → ¬A,

which gives us modus tollens

A → B, ¬B ⊢ ¬A.

With the law of non-contradiction ¬(A & ¬A), this gives us reductio ad absurdum, in two forms,

A → (B & ¬B) ⊢ ¬A,

A → ¬A ⊢ ¬A,

and consequentia mirabilis:

¬A → A ⊢ A.

Evidently the relevant arrow restores a lot of power apparently lost in the invalidity of disjunctive syllogism.

There are a great number of relevant logics differing in strength. One can do away with the laws of non-contradiction and excluded middle, giving a very weak consistent paraconsistent logic called B (for basic). Or one can add powerful negation principles as we have just seen above for inconsistent but non-trivial logics. The relevant approach was used in Meyer’s attempt to found a paraconsistent arithmetic in a logic called R# (see inconsistent mathematics). It has also been used by Brady for naïve set theory (§4c), and, more recently, Beall for truth theory. On the other hand, relevant logics validate fewer entailments than classical logic; in order for A → B to be valid, we have additional requirements of relevance besides truth preservation in all possible circumstances. Because of this, it is often difficult to recapture within a relevant logic some of classical mathematical reasoning. We return to this problem in §4c below.

e. Logics of Formal Inconsistency

One of the first pioneers of paraconsistent logic was Newton C. A. da Costa in Brazil, in the 1950s. Da Costa’s interests have been largely in paraconsistent mathematics (with applications to physics), and his attitude toward paraconsistency is more open minded than some of the others we have seen. Da Costa considers the investigation of inconsistent but not trivial theories as akin to the study of non-Euclidean geometry. He has been an advocate of paraconsistency not only for its pragmatic benefits, for example in reconstructing infinitesimal calculus, but also as an investigation of novel structure for its own sake. He gives the following methodological guidelines:

  • In these calculi, the principle of contradiction should not be generally valid;
  • From two contradictory statements it should not in general be possible to deduce any statement whatever;
  • The extension of these calculi to quantification calculi should be immediate.

Note that da Costa’s first principle is not like any we’ve seen so far, and his third is more ambitious than others. His main system is an infinite hierarchy of logics known as the C systems.

The main idea of the C systems is to track which sentences are consistent and to treat them differently from sentences that may be inconsistent. Following this method means, first of all, that consistency and inconsistency become things the logic itself can talk about; the logic can model how a person can or should reason about inconsistent information. Secondly, this gives us a principled way to make our paraconsistent logic as much like classical logic as possible: when all the sentences in play are marked as consistent, they can safely be reasoned about in a classical way, for example, using disjunctive syllogism.

To make this work, we begin with a base logic, called C(0). When a sentence A behaves consistently in C(0), we mark it according to this definition:

A0 := ¬(A & ¬A).

Then, a strong kind of negation can be defined:

¬*A := ¬A & A0.

The logic with these two connectives added to it, we call C(1). In C(1) then we can have inferences like

¬A ∨ B, A, A0 ⊢ B.

And in the same way that we reached C(1), we could go on and define a logic C(2), with an operator A1 = (A0)0, that means something like ‘behaves consistently in C(1)’. The C systems continue up to the first transfinite ordinal, C(ω).
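
The recapture of disjunctive syllogism by a consistency mark can be illustrated with a simple three-valued model of the kind used for LP in §3f below, where 'b' marks a sentence that is both true and false. This is only a sketch: da Costa's C(1) is not itself a many-valued logic, and the encoding, with a cons operator standing in for A0, is invented here for illustration.

```python
from itertools import product

ORDER = {'f': 0, 'b': 1, 't': 2}
DESIGNATED = {'t', 'b'}                     # 'at least true'

def neg(x): return {'t': 'f', 'b': 'b', 'f': 't'}[x]
def disj(x, y): return max(x, y, key=ORDER.get)
def cons(x): return 't' if x in ('t', 'f') else 'f'   # the consistency mark A0

def ev(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return neg(ev(f[1], v))
    if op == 'or':   return disj(ev(f[1], v), ev(f[2], v))
    if op == 'cons': return cons(ev(f[1], v))
    raise ValueError(op)

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def valid(premises, conclusion):
    # Valid: no valuation makes every premise at least true and the conclusion not.
    ats = sorted(set().union(*(atoms(g) for g in premises + [conclusion])))
    for vals in product('tbf', repeat=len(ats)):
        v = dict(zip(ats, vals))
        if all(ev(p, v) in DESIGNATED for p in premises) and ev(conclusion, v) not in DESIGNATED:
            return False
    return True

A, B = ('atom', 'A'), ('atom', 'B')
ds = [('or', ('not', A), B), A]             # not-A or B, together with A
print(valid(ds, B))                         # False: disjunctive syllogism fails
print(valid(ds + [('cons', A)], B))         # True: it returns once A is marked consistent
```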

More recently, a broad generalization of the C-systems has been developed by Carnielli, Marcos, and others, called logics of formal inconsistency. Da Costa’s C-systems are a subclass (albeit an important one) of the much wider family of the LFIs. The C-systems are precisely the LFIs where consistency can be expressed as a unary operator.

These logics have been used to model some actual mathematics. The axioms of Zermelo–Fraenkel set theory and some postulates about identity (=) can be added to C(1), as can axioms asserting the existence of a universal set and a Russell set. This yields an inconsistent, non-trivial set theory. Arruda and Batens obtained some early results in this set theory. Work in arithmetic, infinitesimal calculus, and model theory has also been carried out by da Costa and his students.

A driving idea of da Costa’s paraconsistency is that the law of non-contradiction ¬(A & ¬A) should not hold at the propositional level. This is, philosophically, how his approach works: ¬(A & ¬A) is not a logical truth. Aside from some weak relevant logics, this is a unique feature of the C systems (among paraconsistent logics). In other schools like the discussion and preservationist schools, non-contradiction holds not only at the level of sentences, but as a normative rule; and in the next school we consider, non-contradiction is false, but it is true as well.

f. Dialetheism

The best reason to study paraconsistency, and to use it for developing theories, would be if there were actually contradictions in the world (as opposed to in our beliefs or theories). That is, if it turns out that the best and truest description of the world includes some inconsistency, then paraconsistency is not only required, but is in some sense natural and appropriate. ‘Dialetheism’ is a neologism meaning two-way truth and is the thesis that some sentences are both true and false, at the same time and in the same way. Dialetheism is particularly motivated as a response to the liar paradox and set theoretic antinomies like Russell’s Paradox, and was pioneered by Richard Routley and Graham Priest in Australia in the 1970s. Priest continues to be the best known proponent.

A dialetheic logic is easiest to understand as a many-valued logic. This is not the only way to understand dialetheism, and the logic we are about to consider is not the only logic a dialetheist could use. Dialetheism is not a logic. But here is a simple way to introduce the concept. In addition to the truth-values true and false, sentences can also be both. This third value is a little unusual, maybe, but uncomplicated: if a sentence A is both, then A is true, and A is false, and vice versa. The most straightforward application of a ‘both’ truth-value is Priest’s logic of paradox, or LP. In LP the standard logical connectives have a natural semantics, which can be deduced following the principle that a sentence is designated iff it is at least true—i.e. iff it is true only, or both true and false. If

¬A is true when A is false,

and

¬A is false when A is true,

for example, then

¬A is both iff A is both.

So inconsistent negation is something like a fixed point. An argument is valid in LP iff it is not possible for the conclusion to be completely false but all the premises at least true. That is, suppose we have premises that are all either true or both. If the argument is valid, then the conclusion is also at least true.

In LP, every sentence of the form ¬(A & ¬A) is at least true, while some instances are also false. So the law of non-contradiction is itself a dialetheia—the schema ¬(A & ¬A) is universal but also has counterexamples—and furthermore, dialetheism says of itself that it is both true and false. (The statement ‘there are true contradictions’ is both true—there are some—and false—all contradictions are false.) This may seem odd, but it is appropriate, given dialetheism’s origins in the liar paradox.
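
These claims about LP can be checked by brute force. The following sketch is illustrative only; the encoding of formulas and the function names are invented here, though the truth tables are the standard ones for LP.

```python
from itertools import product

ORDER = {'f': 0, 'b': 1, 't': 2}            # false only < both < true only
DESIGNATED = {'t', 'b'}                     # 'at least true'

def neg(x): return {'t': 'f', 'b': 'b', 'f': 't'}[x]
def conj(x, y): return min(x, y, key=ORDER.get)
def disj(x, y): return max(x, y, key=ORDER.get)

def ev(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return neg(ev(f[1], v))
    if op == 'and':  return conj(ev(f[1], v), ev(f[2], v))
    if op == 'or':   return disj(ev(f[1], v), ev(f[2], v))
    raise ValueError(op)

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def lp_valid(premises, conclusion):
    # Valid: no valuation makes every premise at least true and the conclusion not.
    ats = sorted(set().union(*(atoms(g) for g in premises + [conclusion])))
    for vals in product('tbf', repeat=len(ats)):
        v = dict(zip(ats, vals))
        if all(ev(p, v) in DESIGNATED for p in premises) and ev(conclusion, v) not in DESIGNATED:
            return False
    return True

A, B = ('atom', 'A'), ('atom', 'B')
lnc = ('not', ('and', A, ('not', A)))       # the law of non-contradiction
print(lp_valid([], lnc))                    # True: a theorem of LP
print(ev(lnc, {'A': 'b'}))                  # 'b': yet also false when A is a glut
print(lp_valid([A, ('not', A)], B))         # False: explosion fails
print(lp_valid([A, ('or', ('not', A), B)], B))  # False: so does disjunctive syllogism
```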

LP uses only extensional connectives (and, or, not) and so has no detachable conditional. If one adds a detachable conditional to LP, the most natural extension, given LP’s semantics, is the logic called RM3. Unfortunately, this logic is not appropriate for naïve set theory or truth theory (see §4c.ii). If a fourth, neutral truth value is added to LP, the logic is weakened to the system of first degree entailment, FDE. In FDE, the inference

B ⊢ A ∨ ¬A

is not valid, any more than explosion is. This makes some sense: if explosion is invalid because its premises do not represent reasoning that actually leads to its conclusion, then this inference should be invalid too, since its premise does not ‘lead to’ the conclusion. Because of this, FDE has no theorems, of the form ⊢ A, at all.
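
The same style of check extends to FDE if each sentence is assigned the set of classical values it takes, possibly neither, in the manner of Dunn's relational semantics. Again the encoding is an invented illustration.

```python
from itertools import product

# A value is the set of classical values a sentence relates to:
# {True} true only, {False} false only, {True, False} both, frozenset() neither.
VALUES = [frozenset(s) for s in ({True}, {False}, {True, False}, ())]

def neg(x): return frozenset({not v for v in x})

def conj(x, y):
    out = set()
    if True in x and True in y: out.add(True)
    if False in x or False in y: out.add(False)
    return frozenset(out)

def disj(x, y):
    out = set()
    if True in x or True in y: out.add(True)
    if False in x and False in y: out.add(False)
    return frozenset(out)

def ev(f, v):
    op = f[0]
    if op == 'atom': return v[f[1]]
    if op == 'not':  return neg(ev(f[1], v))
    if op == 'and':  return conj(ev(f[1], v), ev(f[2], v))
    if op == 'or':   return disj(ev(f[1], v), ev(f[2], v))
    raise ValueError(op)

def atoms(f):
    return {f[1]} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def fde_valid(premises, conclusion):
    # Valid: whenever every premise is at least true, so is the conclusion.
    ats = sorted(set().union(*(atoms(g) for g in premises + [conclusion])))
    for vals in product(VALUES, repeat=len(ats)):
        v = dict(zip(ats, vals))
        if all(True in ev(p, v) for p in premises) and True not in ev(conclusion, v):
            return False
    return True

A, B = ('atom', 'A'), ('atom', 'B')
lem = ('or', A, ('not', A))
print(fde_valid([B], lem))              # False: B does not deliver excluded middle
print(fde_valid([], lem))               # False: FDE has no theorems at all
print(fde_valid([A, ('not', A)], B))    # False: explosion still fails
```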

4. Applications

A paraconsistent logic becomes useful when we are faced with inconsistencies. Motivations for and applications of paraconsistency arise from situations that are plausibly inconsistent—that is, situations in which inconsistency is not merely due to careless mistakes or confusion, but rather inconsistency that is not easily dispelled even upon careful and concentrated reflection. A student making an arithmetic error does not need a paraconsistent logic, but rather more arithmetic tutorials (although see inconsistent mathematics). On the other hand, people in the following situations may turn to a paraconsistent toolkit.

a. Moral Dilemmas

A mother gives birth to identical conjoined twins (in an example due to Helen Bohse). Doctors quickly assess that if the twins are not surgically separated, then neither will survive. However, doctors also know only one of the babies can survive surgery. The babies are completely identical in all respects. It seems morally obligatory to save one life at the expense of the other. But because there is nothing to help choose which baby, it also seems morally wrong to let one baby die rather than the other. Quite plausibly, this is an intractable moral dilemma with premises of the form we ought to save the baby on the left, and, by symmetrical reasoning about the baby on the right, also we ought not to save the baby on the left. This is not yet technically a contradiction, but unless some logical precautions are taken, it is a tragic situation on the verge of rational disaster.

A moral dilemma takes the form O(A) and O(¬A), that it is obligatory to do A and it is obligatory to do ¬A. In standard deontic logic—a logic of moral obligations—we can argue from a moral dilemma to moral explosion as follows (see Routley and Plumwood 1989). First, obligations ‘aggregate’:

O(A), O(¬A) ⊢ O(A & ¬A).

Next, note that A & ¬A is equivalent to (A & ¬A) & B. (‘Equivalent’ here can mean classically, or in the sense of C. I. Lewis’ strict implication.) Thus

O(A & ¬A) ⊢ O((A & ¬A) & B)

But O((A & ¬A) & B) ⊢ O(B). So we have shown from inconsistent obligations O(A), O(¬A), that O(B), that anything whatsoever is obligatory—in standard, non-paraconsistent systems.
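
A complementary, semantic way to see the explosion: in standard possible-worlds semantics for deontic logic, O(A) holds at a world when A is true at every deontically ideal world accessible from it, and the accessibility relation is usually required to be serial, which makes a genuine dilemma outright unsatisfiable. The sketch below uses an invented miniature model and drops seriality so that the dilemma can be modelled at all; it shows why a dilemma leaves no ideal worlds, at which point every obligation whatsoever holds vacuously.

```python
# Propositions are functions of a world's facts; O(p) holds at a world when p
# is true at every deontically accessible ('ideal') world.
def O(prop, world, access, facts):
    return all(prop(facts[w]) for w in access[world])

save_left     = lambda f: f['saved_left_twin']
not_save_left = lambda f: not f['saved_left_twin']
anything      = lambda f: f['anything_whatsoever']

# A toy model: from world 'w', no world counts as ideal.
access = {'w': []}
facts  = {'w': {'saved_left_twin': False, 'anything_whatsoever': False}}

print(O(save_left, 'w', access, facts))      # True, vacuously
print(O(not_save_left, 'w', access, facts))  # True, vacuously: the dilemma holds
print(O(anything, 'w', access, facts))       # True: anything whatsoever is obligatory
```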

A paraconsistent deontic logic can follow any of the schools we have seen already. A standard paraconsistent solution is to follow the non-adjunctive approach of Jaśkowski and the preservationists. One can block the rule of modal aggregation, so that both O(A), O(¬A) may hold without implying O(A & ¬A).

Alternatively, one could deny that A & ¬A is strictly equivalent to (A & ¬A) & B, by adopting a logic (such as a relevant logic) in which such an equivalence fails. Taking this path, we would then run into the principle of deontic consistency,

O(A) ⊢ P(A),

that if you ought to do A, then it is permissible to do A. (You are not obliged not to do A.) Accordingly, from O(A & ¬A), we get P(A & ¬A). If we had the further axiom that inconsistent actions are not permitted, then we would now have a full blown inconsistency, P(A & ¬A) and ¬P(A & ¬A). If reductio is allowed, then we would also seem to have obligations such that O(A) and ¬O(A). This move calls attention to which obligations are consistent. One could drop deontic consistency, so that A is obligatory without necessarily being permissible. Or one could reason that, however odd inconsistent actions may sound, there is no obvious reason they should be impermissible. The result would be strange but harmless statements of the form P(A & ¬A).

A principle even stronger than deontic consistency is the Kantian dictum that ‘ought implies can,’ where ‘can’ means basic possibility. Kant’s dictum converts moral dilemmas to explicit contradictions. It also seems to rule out moral dilemmas: since it is not possible, for example, both to save and not to save the baby on the left in our conjoined twins example, it cannot be obligatory to do both, appearances to the contrary. So an option for the paraconsistent deontic logician is to deny Kant’s dictum. Perhaps we have unrealizable obligations; indeed, this seems to be the intuition behind moral dilemmas. A consequence of denying Kant’s dictum is that, sometimes, we inevitably do wrong.

Most liberally, one can keep everything and accept that sometimes inconsistent action is possible. For example, if I make a contract with you to break this very contract, then I break the contract if and only if I keep it. By signing, I am eo ipso breaking and not breaking the contract. In general, though, how one could do both A and its negation is a question beyond the scope of logic.

b. Laws, Science, and Revision

Consider a country with the following laws (in an example from Priest 2006, ch. 13):

(1) No non-Caucasian people shall have the right to vote.
(2) All landowners shall have the right to vote.

As it happens, though, Phil is not Caucasian, and owns a small farm. The laws, as they stand, are inconsistent. A judge may see this as a need to impose a further law (e.g. non-Caucasians cannot own land) or revise one of the current laws. In either case, though, the law as it stands needs to be dealt with in a discriminating way. Crucially, the inferential background of the current laws does not seem to permit or entail total anarchy.

Similarly, in science we hold some body of laws as true. It is part of the scientific process that these laws can be revised, updated, or even rejected completely. The process of such progress again requires that contradictions not be met with systemic collapse. At present, it seems extremely likely that different branches of science are inconsistent with one another—or even within the same discipline, as is the case in theoretical physics with relativity and quantum mechanics. Does this situation make science absurd?

c. Closed Theories – Truth and Sets

Conceptual closure means taking a full account of whatever is under study. Suppose, for example, we are studying language. We carry out our study using language. A closed theory would have to account for our study itself; the language of the theory would have to include terms like ‘language’, ‘theory’, ‘true’, and so forth. More expansively, a theory of everything would include the theory itself. Perhaps the simplest way to grasp the nature of a closed theory is through a remark of Wittgenstein in the preface to his Tractatus: ‘In order to draw a limit to thought, one would have to find both sides of the limit thinkable.’ Priest has argued that the problematic of closure can be seen in the philosophies of Kant and Hegel, as well as in earlier Greek and Medieval thought, and continues on in postmodernist philosophies. As was discovered in the 20th century, closed formal theories are highly liable to be inconsistent, because they are extremely conducive to self-reference and diagonalization (see logical paradoxes).

For logicians, the most important of the closed theories, susceptible to self-reference, are of truth and sets. Producing closed theories of truth and sets using paraconsistency is, at least to start with, straightforward. We will look at two paradigm cases, followed by some detail on how they can be pursued.

i. Naïve Axioms

In modern logic we present formal, mathematical descriptions of how sentences are true and false, e.g. (A & B) is true iff A is true and B is true. This itself is a rational statement, presumably governed by some logic and so itself amenable to formal study. To reason about it logically, we would need to study the truth predicate, ‘x is true.’ An analysis of the concept of truth that is almost too-obviously correct is the schema

T(‘A’) iff A.

It seems so obvious—until (even when?) a sentence like

This sentence of the IEP is false,

a liar paradox which leads to a contradiction, falls out the other side. A paraconsistent logic can be used for a theory of truth in which the truth schema is maintained, but where either the derivation of the paradox is blocked (by dropping the law of excluded middle) or else the contradiction is not explosive.

In modern set theory, similarly, we understand mathematical objects as being built out of sets, where each set is itself built out of pre-given sets. The resulting picture is the iterative hierarchy of sets. The problem is that the iterative hierarchy itself is a mathematically definite object, but cannot itself reside on the hierarchy. A closed theory of sets will include objects like this, beginning from an analysis of the concept of set that is almost too-obviously correct: the naïve comprehension schema,

x is a member of {y: A(y)} iff A(x).

A way to understand what naïve comprehension means is to take it as the claim: any collection of objects is a set, which is itself an object. Naïve set theory can be studied, and has been, with paraconsistent logics; see inconsistent mathematics. Contradictions like the existence of a Russell set {y: y is not a member of y} arise but are simply theorems: natural parts of the theory; they do not explode the theory.

ii. Further Logical Restrictions

For both naïve truth theory and naïve set theory, there is an additional and extremely important restriction on the logic. A logic for these schemas cannot validate contraction,

If (if A then (if A then B)), then (if A then B).

This restriction is due to Curry’s paradox, which is a stronger form of the liar paradox. A Curry sentence says

If this sentence is true, then everything is true.

If the Curry sentence, call it C, is put into the truth-schema, then everything follows by the principle of contraction:

1) T(‘C’) iff (if T(‘C’) then everything). [truth schema]
2) If T(‘C’) then (if T(‘C’) then everything). [from 1]
3) If T(‘C’) then everything. [from 2 by contraction]
4) T(‘C’) [modus ponens on 1, 3]
5) Everything. [modus ponens on 3, 4]

Since not everything is true, if the T schema is correct then contraction is invalid. For set theory, analogously, the Curry set is

C = {x: If x is a member of x, then everything is true},

and a similar argument establishes triviality.

As was discovered later by Dunn, Meyer and Routley while studying naïve set theory in relevant logic, the sentence

(A & (A → B)) → B

is a form of contraction too, and so must similarly not be allowed. (Let A be a Curry sentence and B be absurdity.) Calling this sentence (schema) invalid is different than blocking modus ponens, which is an inference, validated by a rule. The above sentence, meanwhile, is just that—a sentence—and we are saying whether or not all its instances are true. If naïve truth and set theories are coherent, instances of this sentence are not always true, even when modus ponens is valid.

The logic LP does not satisfy contraction and so a dialetheic truth or set theory can be embedded in it. Some basic contradictions, like the liar paradox and Russell’s paradox, do obtain, as do a few core operations. Because LP has no conditional, though, one does not get very far. Most other paraconsistent logics cannot handle naïve set theory and naïve truth theory as stated here. A hard problem in (strong) paraconsistency, then, is how to formulate the ‘iff’ in our naïve schemata, and in general how to formulate a suitable conditional. The most promising candidates to date have been relevant logics, though as we have seen there are strict limitations.

d. Learning, Beliefs, and AI

Some work has been done to apply paraconsistency to modeling cognition. The main idea here is that the limitations on machine reasoning as (apparently) dictated by Gödel’s incompleteness theorems no longer hold. What this has to do with cognition per se is a matter of some debate, and so most applications of paraconsistency to epistemology are still rather speculative. See Berto 2009 for a recent introduction to the area.

Tanaka has shown how a paraconsistent reasoning machine revises its beliefs differently from the way suggested by the more orthodox but highly idealized Alchourrón-Gärdenfors-Makinson theory. That prevailing theory of belief revision has it that inconsistent sets of beliefs are impossible. Paraconsistent reasoning machines, meanwhile, are situated reasoners, embedded in sets of beliefs (say, acquired simply via education) that can occasionally be inconsistent. Consistency is just one criterion of epistemic adequacy among others—simplicity, unity, explanatory power, etc. If this is right, the notion of recursive learning might be extended, to shed new light on knowledge acquisition, conflict resolution, and pattern recognition. If the mind is able to reason around contradiction without absurdity, then paraconsistent machines may be better able to model the mind.

Paraconsistent logics have been applied by computer scientists in software architecture (though this goes beyond the expertise of the present author). That paraconsistency could have further applications to the theory of computation was explored by Jack Copeland and Richard Sylvan. Copeland has independently argued that there are effective procedures that go beyond the capacity of Turing machines. Sylvan (formerly Routley) further postulated the possibility of dialethic machines, programs capable of computing their own decision functions. In principle, this is a possibility. The non-computability of decision functions, and the unsolvability of the halting problem, are both proved by reductio ad absurdum: if a universal decision procedure were to exist, it would have some contradictions as outputs. Classically, this has been interpreted to mean that there is no such procedure. But, Sylvan suggests, there is more in heaven and earth than is dreamt of in classical theories of computation.

5. Conclusion

Paraconsistency may be minimally construed as the doctrine that not everything is true, even if some contradictions are. Most paraconsistent logicians subscribe to views on the milder end of the spectrum; most paraconsistent logicians are actually much more conservative than a slur like Quine’s ‘deviant logician’ might suggest. On the other hand, taking paraconsistency seriously means on some level taking inconsistency seriously, something that a classically minded person will not do. It has therefore been thought that, insofar as true inconsistency is an unwelcome thought—mad, bad, and dangerous to know—paraconsistency might be some kind of gateway to darker doctrines. After all, once one has come to rational grips with the idea that inconsistent data may still make sense, what, really, stands in the way of inconsistent data being true? This has been called the slippery slope from weak to strong paraconsistency. Note that the slippery slope, while proposed as an attractive thought by those more inclined to strong paraconsistency, could seem to go even further, away from paraconsistency completely and toward the insane idea of trivialism: that everything really is true. That is, contradictions obtain, but explosion is also still valid. Why not?

No one, paraconsistentist or otherwise, is a trivialist. Nor is paraconsistency an invitation to trivialism, even if it is a temptation to dialetheism. By analogy, when Hume pointed out that we cannot be certain that the sun will rise tomorrow, no one became seriously concerned about the possibility. But people did begin to wonder about the necessity of the ‘laws of nature’, and no one now can sit as comfortably as before Hume awoke us from our dogmatic slumber. So too with paraconsistent logic. In one sense, paraconsistent logics can do much more than classical logics. But in studying paraconsistency, especially strong paraconsistency closer to the dialetheic end of the spectrum, we see that there are many things logic cannot do. Logic alone cannot tell us what is true or false. Simply writing down the syntactic marking ‘⊢ A’ does nothing to show us that A cannot be false, even if A is a theorem. There is no absolute safeguard. Defending consistency, or denying the absurdity of trivialism, is ultimately not the job of logic alone. Affirming coherence and denying absurdity is an act, a job for human beings.

6. References and Further Reading

It’s a little dated, but the ‘bible’ of paraconsistency is still the first big collection on the topic:

  • Priest, G., Routley, R. & Norman, J. eds. (1989). Paraconsistent Logic: Essays on the Inconsistent. Philosophia Verlag.

This covers most of the known systems, including discussive and adaptive logic, with original papers by the founders. It also has extensive histories of paraconsistent logic and philosophy, and a paper by the Routleys on moral dilemmas. For more recent work, see also

  • Batens, D., Mortensen, C., Priest, G., & van Bendegem, J.-P. eds. (2000). Frontiers of Paraconsistent Logic. Kluwer.
  • Berto, F., Mares, E., Paoli, F., and Tanaka, K. eds. (2013). The Fourth World Congress on Paraconsistency. Springer.

A roundabout philosophical introduction to non-classical logics, including paraconsistency, is in

  • Beall, JC and Restall, Greg (2006). Logical Pluralism. Oxford University Press.

Philosophical introductions to strong paraconsistency:

  • Priest, Graham (2006). In Contradiction: A Study of the Transconsistent. Oxford University Press. Second edition.
  • Priest, Graham (2006). Doubt Truth to be a Liar. Oxford University Press.
  • Berto, Francesco (2007). How to Sell a Contradiction. Studies in Logic vol. 6. College Publications.

More philosophical debate about strong paraconsistency is in the excellent collection

  • Priest, G., Beall, JC, and Armour-Garb, B. eds. (2004). The Law of Non-Contradiction. Oxford University Press.

For the technical how-to of paraconsistent logics:

  • Beall, JC and van Fraassen, Bas (2003). Possibilities and Paradox: An Introduction to Modal and Many-Valued Logics. Oxford University Press.
  • Gabbay, Dov M. & Guenthner, F. eds. (2002). Handbook of Philosophical Logic. Second edition, vol. 6, Kluwer.
  • Priest, Graham (2008). An Introduction to Non-Classical Logic. Cambridge University Press. Second edition.

For a recent introduction to preservationism, see

  • Schotch, P., Brown, B. and Jennings, R. eds. (2009). On Preserving: Essays on Preservationism and Paraconsistent Logic. University of Toronto Press.
  • Brown, Bryson and Priest, Graham (2004). “Chunk and Permeate I: The Infinitesimal Calculus.” Journal of Philosophical Logic 33, pp. 379–88.

Logics of formal inconsistency:

  • Carnielli, W. A. and Marcos, J. (2002). “A Taxonomy of C-systems.” In Paraconsistency: The Logical Way to the Inconsistent, Lecture Notes in Pure and Applied Mathematics, vol. 228, pp. 1–94.
  • Carnielli, W. A., Coniglio, M. E., and Marcos, J. (2007). “Logics of Formal Inconsistency.” In Gabbay, D. and Guenthner, F. eds. Handbook of Philosophical Logic, vol. 14, Springer, pp. 15–107.
  • da Costa, Newton C. A. (1974). “On the Theory of Inconsistent Formal Systems.” Notre Dame Journal of Formal Logic 15, pp. 497–510.
  • da Costa, Newton C. A. (2000). Paraconsistent Mathematics. In Batens et al. (2000), pp. 165–180.
  • da Costa, Newton C. A., Krause, Décio & Bueno, Otávio (2007). “Paraconsistent Logics and Paraconsistency.” In Jacquette, D. ed. Philosophy of Logic (Handbook of the Philosophy of Science), North-Holland, pp. 791–912.

Relevant logics:

  • Anderson, A. R. and Belnap, N. D., Jr. (1975). Entailment: The Logic of Relevance and Necessity. Princeton University Press, vol. I.
  • Mares, E. D. (2004). Relevant Logic: A Philosophical Interpretation. Cambridge University Press.

The implications of Gödel’s theorems:

  • Berto, Francesco (2009). There’s Something About Gödel. Wiley-Blackwell.

Belief revision:

  • Tanaka, Koji (2005). “The AGM Theory and Inconsistent Belief Change.” Logique et Analyse 189–92, pp. 113–50.

Artificial Intelligence:

  • Copeland, B. J. and Sylvan, R. (1999). “Beyond the Universal Turing Machine.” Australasian Journal of Philosophy 77, pp. 46–66.
  • Sylvan, Richard (2000). Sociative Logics and their Applications. Priest, G. and Hyde, D. eds. Ashgate.

Moral dilemmas:

  • Bohse, Helen (2005). “A Paraconsistent Solution to the Problem of Moral Dilemmas.” South African Journal of Philosophy 24, pp. 77–86.
  • Routley, R. and Plumwood, V. (1989). “Moral Dilemmas and the Logic of Deontic Notions.” In Priest et al. 1989, 653–690.
  • Weber, Zach (2007). “On Paraconsistent Ethics.” South African Journal of Philosophy 26, pp. 239–244.

Other works cited:

  • Colyvan, Mark (2008). “Who’s Afraid of Inconsistent Mathematics?” Protosociology 25, pp. 24–35. Reprinted in G. Preyer and G. Peter eds. Philosophy of Mathematics: Set Theory, Measuring Theories and Nominalism, Frankfurt: Verlag, 2008, pp. 28–39.
  • Hume, David (1740). A Treatise of Human Nature, ed. L. A. Selby-Bigge. Second edition 1978. Oxford: Clarendon Press.

Author Information

Zach Weber
Email: zweber@unimelb.edu.au
University of Melbourne
Australia

Email: z.weber@usyd.edu.au
University of Sydney
Australia

Divine Simplicity

Divine simplicity is central to the classical Western concept of God. Simplicity denies any physical or metaphysical composition in the divine being. This means God is the divine nature itself and has no accidents (properties that are not necessary) accruing to his nature. There are no real divisions or distinctions in this nature. Thus, the entirety of God is whatever is attributed to him.  Divine simplicity is the hallmark of God’s utter transcendence of all else, ensuring the divine nature to be beyond the reach of ordinary categories and distinctions, or at least their ordinary application. Simplicity in this way confers a unique ontological status that many philosophers find highly peculiar.

Inspired by Greek philosophy, the doctrine exercised a formative influence on the development of Western philosophy and theology. Its presence reverberates throughout an entire body of thought. Medieval debates over simplicity invoked fundamental problems in metaphysics, semantics, logic, and psychology, as well as theology. For this reason, medieval philosopher-theologians always situate the doctrine within a larger framework of concepts and distinctions crafted to deal with its consequences. An inadequate grasp of this larger framework continues to hamper the modern debates. Detractors and proponents frequently talk past each other, as this article will show. Reconstructing this larger context is not feasible here. But it will be necessary to refer to its main outlines if one is to capture the basic sense of the doctrine in its original setting.

The following overview begins with a look at some high watermarks of the doctrine. Next it  looks at what has motivated the doctrine throughout its long career. A look at the origins and motives is followed by some representative objections. The bulk of the rest of the article  sketches some common responses to these objections. The responses invoke aspects of the doctrine’s original context to further understanding of it. This treatment will mainly discuss objections to the doctrine’s internal coherence. Problems involving the compatibility of simplicity with another particular teaching generally require highly individual treatment beyond the present scope; this is also so with revealed matters such as the Trinity or Incarnation. However, some general considerations will prove applicable to these individual issues. Progress on the systematic issues seems tied to  understanding the intrinsic claims of the doctrine. A separate article examines God’s immutability, though again some considerations here could prove applicable. The following discussion will suggest that disagreements over simplicity tend to reflect prior theological disagreements over  the fundamental character of God and  what language about God can or cannot imply.

Table of Contents

  1. Origins
  2. Doctrine and Implications
  3. Motives
  4. Difficulties
  5. Responses
    1. Ontology
    2. Persons
    3. Negations
    4. Multiple Predicates
    5. Existence
  6. Conclusion
  7. References and Further Reading

1. Origins

Classic statements of the doctrine of divine simplicity are found in Augustine (354–430), Anselm (1033–1109), and Aquinas (1225–74). Aquinas is often thought to represent the historical peak of the doctrine’s articulation and defense. Modern discussions usually reference his version as a standard; however, the roots of simplicity go back to the Ancient Greeks, well before its formal defense by representative thinkers of the three great monotheistic religions—Judaism, Christianity, and Islam. (The current English-speaking debates over simplicity usually refer to its Western, Christian developments, which are thus a focus of the present discussion.) Greek philosophers well before Socrates and Plato were fascinated by the idea of a fundamental unity underlying the vast multiplicity of individuals and their kinds and qualities. One idea proposed all things as sharing a common element, a universal substrate providing the stuff of which all things are made. Another idea proposed a being or principle characterized by a profound unity and inhabiting a realm above all else. Thales (640–546 B.C.E.) proposes water to be the common element from which all things in the universe are made. Anaximenes (588–524 B.C.E.) posits all material objects as ultimately constituted by compressed air of varying density. Parmenides (c. 515–c. 450 B.C.E.) presents an early Monism, the idea that all things are of a single substance. He holds that common to all things is their being, taken as a collective undifferentiated mass of all the being in the universe. He further introduces being as possessing an incorruptible perfection. Plato (428–348 B.C.E.) locates unity in the Forms. His metaphysics posits a supreme good constituting a unity beyond all ordinary being. The Platonic idea of a highest principle, combining supreme unity and utter perfection, strongly influenced Jewish and early Christian discussions of God’s supreme unity and perfection. Plato leaves the causal role of the supreme good somewhat vague. Aristotle (384–322 B.C.E.) posits the supreme being to be a subsisting and unchanging form that is also a first mover. Aristotle’s prime mover sits at the top of an efficient causal hierarchy governing all motion and change in the universe. Aristotle’s first mover is a simple, unchanging form that still causally affects other beings: in Aristotle’s case the heavenly spheres would move themselves in imitation of the divine perfection, resulting in the motions of terrestrial beings. Aristotle’s god is still considered ontologically finite by theistic standards and remains only a cosmic mover rather than a creator ex nihilo. The Platonic notion of a supreme perfection at a remove from all things and Aristotle’s causally efficacious, disembodied mind would combine to suggest a powerful model for Western theologians seeking language to describe God’s nature.

The Greek emphasis on a simple first principle figures prominently in the revival of classical Hellenistic philosophy at the close of the ancient world. Christianity is in its infancy when the Jewish theologian Philo of Alexandria (c. 30 B.C.E.–  50 C.E.) observes that it is already commonly accepted to think of God as Being itself and utterly simple. Philo is drawing on philosophical accounts of a supreme unity in describing God as uncomposite and eternal. He identifies this simple first being of the philosophers with the personal God of the Hebrew Scriptures who consciously creates things modeled after the divine ideas. Neoplatonist philosophers Plotinus (205–70) and later Proclus (410–85) will also posit a simple first principle. Plotinus’s Enneads speak of a One that exceeds all of the categories applicable to other things. Consequently it is unknowable and inexpressible (1962, V.3.13, VI.9.3). Plotinus voices an argument for the One’s simplicity that will emerge as a standard line of argument in later thinkers:

Even in calling it The First we mean no more than to express that it is the most absolutely simplex: it is Self-Sufficing only in the sense that it is not of that compound nature which would make it dependent upon any constituent [emphasis added]; it is the Self-Contained because everything contained in something alien must also exist by that alien. (1962, II.9.1)

For the One to have any metaphysical components is for them to account for the existence and character of the composite. Plotinus is working from the idea of a being that is utterly self-explanatory and thus is uncaused. A similar view of the first cause as lacking any internal or external causes will motivate Scholastic accounts of simplicity. Proclus’s Elements of Theology opens its analysis of the first principle by emphasizing its simplicity. (The work actually defends polytheism against the emerging Christianity.) This prioritizing of simplicity in the Elements is imitated in the anonymous Book of Causes and Dionysius’s On the Divine Names, two works that circulate to great effect in the medieval schools.

Christian theological speculation from the beginning views simplicity as essential for preserving God’s transcendence. The second-century Christian apologist Athenagoras of Athens argues that the Christian God by definition has no beginning; thus God is utterly indivisible and unchangeable. The Church Fathers—including Sts. Clement, Basil, and Cyril—see simplicity as preserving God’s transcendence and absolute perfection. St. John Damascene (c. 675–749) in book 3 of his An Exposition of the Orthodox Faith describes the divine nature as a unified single act (energeia) (1899). He allows it can be intellectually conceived under different aspects while remaining a simple being. Dionysius is the sixth-century Christian author of On the Divine Names. He long enjoyed authoritative status in the West after being mistaken for Dionysius of the Areopagus, whom St. Paul mentions in Acts. Unlike St. Augustine in his On the Trinity, Dionysius begins his account of the divine nature with divine simplicity. Aquinas, in his last great theological synthesis, places simplicity at the head of the divine predicates (Summa theologiae Ia q.3). He first argues that simplicity is part and parcel of being a first cause. Simplicity then becomes a foundation for his account of the other major predicates of God’s nature (Burns 1993; Weigel 2008, ch. 1). However, well before Aquinas’s sophisticated treatment of the doctrine, representative thinkers of all three great monotheistic traditions recognize the doctrine of divine simplicity to be central to any credible account of a creator God’s ontological situation. Avicenna (980–1037), Averroes (1126–98), Anselm of Canterbury, Philo of Alexandria, and Moses Maimonides (1135–1204) all go out of their way to affirm the doctrine’s indispensability and systematic potential.

2. Doctrine and Implications

The doctrine proceeds by denying in God forms of ontological composition that are found in creatures. The forms of composition in question will vary with different ontological systems, particularly so in the modern cacophony of approaches to ontology. For now, it will help to stick with the claims as presented in the classic doctrine. First, God lacks any matter in his being. There are no physical parts. God is also completely independent of matter. Therefore, nothing about God depends on matter to be what it is. Second, the divine nature is not composed with something else. God is the divine nature, so there are no accidental features or other ontological accretions in God. All that God is, he is through and through. The identification of God with his nature is also understood to mean that God exhausts what it is to be divine. For instance, Socrates and Plato do not exhaust what it is to be human because each manifests different ways to be a human being. God cannot be any more divine than he is. This has the further implication that the divine nature is not sharable by multiple beings. Socrates and Plato both possess a human identity. The divine nature, however, is exclusive to God.

Another major tenet is that God is maximal existence. Aquinas calls God ipsum esse subsistens, subsistent existence itself. The Church Fathers from early on affirm God as the absolute Being. Augustine calls God “existence itself” (ipsum esse). God is the ultimate in being. God is not just the best among extant beings. There is no possible being that could be more or better than God is. Hence, God is maximal perfection and goodness. This also means God is infinite. God lacks the ontological limitations creatures have because God has no potentiality to be in a different state than he is. An immediate consequence of simplicity is that classical theism acknowledges severe limits on what created minds can know about God. Human beings can affirm propositions true of God, but no finite mind even approaches comprehending all that God is. A God that is simple is also immutable. A change requires that something in a being undergoes alteration and something else remains continuous. Yet a simple being does not have changeable components, and maximal being cannot be other than it is. There is no temporal unfolding of successive states and God is not subject to place. Thus a simple and immutable God is eternal, not subject to time. As Nicholas Wolterstorff aptly observes, divine simplicity seems to be the ontological basis for “grant[ing] a large number of other divine attributes,” and consequently “one’s interpretation of all God’s other attributes will have to be formed in light of that conviction” (Wolterstorff 1991, 531).

3. Motives

Proponents of the doctrine historically favor two lines of reasoning already mentioned. Classical theism wants to preserve God’s transcendence and also insure God is a genuine first cause. A truly uncaused first cause depends on nothing. Anselm, for instance, holds that God’s supreme perfection precludes division even “by any mind.” Yet in arguing for this state of perfection he uses the idea seen in Plotinus that components determine a composite to be what it is (Proslogion, ch. 19). Internal components are “causes” in the broad sense that the Greeks used [aition] to speak of that which determines something else to exist or be a certain way. (The narrowing of causation to efficient causation comes later.) Aquinas in his Summa theologiae similarly argues for simplicity: “Because every composite is posterior to its components, and depends upon them. However, God is the first being, as shown above [in the arguments for his existence]” (Ia q.3 a.7). Contemporary scholars often refer to God’s independence from all things as his aseity. God is not “self-caused,” as in causing himself to exist by a kind of ontological bootstrapping. Instead, he is a first cause that transcends everything and sustains everything in existence at all moments. This will be the kind of entity for which the question of its own causation or dependence cannot arise. Its nature is self-explanatory.

This idea of a first cause being utterly uncaused has its origin in a model of explanation that sees all things as subject to the principle of sufficient explanation. Everything in existence requires complete explanation for why it exists and why it has the properties it does. Something with a nature that cannot account for its own existence eventually refers back, in this model, to a single, self-explanatory first cause. (It is important to remember that the model here seeks causal explanations of particular entities. Gottfried Leibniz [1646–1716] by contrast defends the principle of a sufficient reason for the truth of all propositions. Some critics argue that this latter model poses the dilemma of having to create necessarily [not freely] or else God would have to create for reasons independent of God.) Philosophers will debate whether this model holds or whether such a first cause exists; however, such discussions fall outside the present scope. The point is that simplicity emerges from a certain view of the world’s causal intelligibility, combined with a strict reading of the unconditioned nature of the first uncaused cause. Marilyn Adams follows how these considerations about a first cause influence the doctrine of simplicity, in her study of simplicity beginning with the writings of Maimonides and ending with William of Ockham (c. 1287–1347) (1987, 930–60).

Classical theism sees simplicity as guaranteeing God’s transcendence. A simple being does not form any mixture or composition with anything else. This rules out pantheistic conceptions of God. God cannot be an aspect of the natural world, such as a world-soul. The Church Fathers, Augustine, and the Scholastics also understand simplicity as maintaining the infinite ontological distance regarded as definitive of transcendence. A complex and mutable being is not something Augustine, Maimonides, or Aquinas would call God. A composite and changeable being they see as much like the rest of creation and not transcending it in any robust sense. Christian ecclesiastical documents reflect similar concerns. Correspondence by Pope St. Leo the Great (reigned 440–61) affirms God’s simplicity and immutability. Simplicity is affirmed in the Council of Lateran IV (1215) and again as recently as Vatican I (1870). One might propose a lesser transcendence that allows for composition and change but that is another discussion. Classical theism remains consistent on the matter. Rising dissatisfaction with a simple and unchanging God in the West parallels the rising popularity of immanent, process-oriented conceptions of the divine nature (Rogers 1996, 165). (See Process Philosophy.) It was just such a dissatisfaction that led philosophers late in the last century to revive modern versions of age-old objections to the doctrine of divine simplicity.

4. Difficulties

Contemporary objections to the intrinsic coherence of the doctrine are interrelated. They rely on similar assumptions about the doctrine and its categories. One line of critique cites the intrinsic claims of the doctrine as incoherent because calling God subsistent existence does not make sense. Another line of critique looks at multiple predicates as introducing divisions in God. The relevant predicates here signify the presence of a positive reality and include such traditional predicates as God is ‘good,’ ‘wise,’ and ‘living.’ Positive divine predicates contrast with negative ones, such as calling God ‘immaterial’ or ‘immutable.’ Here the term’s immediate significance is to deny a reality or situation. In this case the terms signify the absence of matter and change.

Alvin Plantinga’s critique of simplicity in his Does God Have a Nature? (1980) has become a touchstone in the contemporary debates. Earlier versions of most of Plantinga’s objections can be found in other authors (Bennett 1969; Ross 1969; LaCroix 1977; Martin 1976; Wainwright 1979). Before that, discussions of simplicity percolated through other traditions, such as in religious schools and seminaries. The recent attention to these issues by analytic philosophers is not as novel as might be thought. Variations of them are probably as old as the doctrine of divine simplicity itself.

One of Plantinga’s major criticisms is that simplicity is incompatible with God appearing to have multiple attributes. According to the doctrine, “[God] doesn’t merely have a nature or essence; he just is that nature, … [and] each of his properties is identical with each of his properties…so that God has but one property.” But this “seems flatly incompatible with the obvious fact that God has several properties; he has power and mercifulness, say, neither of which is identical with the other” (1980, 46–47). Two objections are in play. First, positive predicates normally signify distinct features or aspects in things. Whatever makes Socrates wise differs from what makes him good. Would not God also have distinct properties? Plantinga’s second objection notes that God’s nature is identical  with what is predicated of it. Socrates is not his goodness or wisdom but  God is identical with his properties (which are identical with each other). Yet, no subject is its properties, much less a property, period. Similar versions of this critique are elsewhere (see, for example, Bennett 1969; Mann 1982).

Plantinga sees an even more basic problem here. Plantinga thinks properties and natures are abstract objects: “Still further we have been speaking of [God’s] own properties; but of course there is the rest of the Platonic menagerie—the propositions, properties, numbers, sets, possible worlds, and all the rest” (1980, 35). Properties and natures are abstract objects that neither subsist as individual things, such as oak trees and cats, nor inhere in individuals. This view of properties and natures as abstracta is a common one in the analytic tradition. It flourished during the middle and later decades of the last century and appears still widely held, if less dominant. If Plantinga is right, nothing divine is a property or nature:

No property could have created the world; no property could be omniscient, or, indeed, know anything at all. If God is a property, then he isn’t a person but a mere abstract object; he has no knowledge, awareness, power, love or life. So taken, the simplicity doctrine seems an utter mistake. (47)

Properties in this view are things individuals can exemplify or instantiate, but not actually be. A painted wooden fence, for instance, exemplifies the property of being red. But redness itself is an abstract object separate from the individuals exemplifying it. Variations on this criticism in Plantinga are raised by Richard Gale (1991, 23) and Christopher Hughes (1989, 10–20) among others.

There is an additional line of objection here that commentators often miss. Plantinga takes it for granted God is a person: “If God is a property, then he isn’t a person but a mere abstract object . . .” (1980, 47). Persons are not abstract objects. Moreover, persons are composite and changeable. They have faculties of understanding and volition that involve composition and a temporal sequence of states. So nothing simple can be a person. Yet God is obviously a person, according to Plantinga and others. He is obviously then not simple. David Hume (1711–76) argues along a similar line. A simple and immutable being has no mind, for “a mind whose acts and sentiments and ideas are not distinct and successive . . . has no thought, no reason, no will, no sentiment, no love, no hatred; or in a word, is no mind at all” (1980, part 4). A simple God is not a person, nor could God have the sort of mind persons have.

Another attack on the intrinsic coherence of the doctrine cites the claim that God is Being or existence itself. This basic claim appears early on in the doctrine’s history and is held by contemporary defenders of the doctrine (see, for example, Miller 1996; Davies 2004, 174–75). But detractors find the claim puzzling at best. Christopher Hughes speaks for many in calling it “perhaps the single most baffling claim Aquinas makes about God” (1989, 4). Anthony Kenny’s analysis concludes in even stronger terms by calling the position “nothing but sophistry and illusion” (2002, 194). A. N. Prior criticizes the view as simply ill-formed, that it “is just bad grammar, a combining of words that fails to make them mean—like ‘cat no six’” (1955, 5).

The theological controversy is rooted in a prior philosophical controversy over what it means to predicate existence of objects. According to one prevalent view of existence, saying “Fido exists” adds nothing to Fido. It adds no determinate feature the way predicating ‘hairy’ or ‘four-legged’ does. Existence then is not a real property. If existence is treated as a constituent of things, then there is also a certain paradox involving the denial something exists. To say “Fido does not exist” seems to presuppose Fido is there to be talked about, but then does not exist. This is self-contradictory. Given these apparent oddities, some philosophers decided existence is not predicated of extra-mental things but of concepts. Gottlob Frege (1848–1925) will say that asserting “There exists no four-sided triangle” is just to assign the concept of such a triangle the number zero. C. J. F. Williams echoes the Fregean view in his critique of God as just “to be”: “No doubt the question ‘What is it for x to be?’ is, by Frege’s standards, and they are the right standards, ill formed. To be cannot for anything be the same as to be alive, since the latter is something that can be said of objects, while the former is used to say something of concepts” (1997, 227). This modern analysis of existence goes back to Immanuel Kant’s (1724–1804) critique of Rene Descartes’ (1596–1650) version of the ontological argument. Kant seems to have read Pierre Gassendi’s (1592–1665) analysis of Descartes’ argument. Gassendi holds that existence does not qualify as a property; it is not a property of God or of anything else. If existence is not really saying anything directly about things, then it is nonsense to say God is literally just existence.

But suppose one allows that existence might be some sort of extra-mental aspect of things. There seem to be other problems in identifying God with existence. Existence never just occurs by itself in some rarefied form. One affirms the existence of dogs and begonias and such. Anthony Kenny notes, “If told simply that Flora is, I am not told whether she is a girl or a goddess or a cyclone, though she may be any of these. But God’s esse is esse which permits no further specification. Other things are men or dogs or clouds, but God is not anything, he just is” (2000, 58). How could existence itself subsist? Even if there could be something like mere existence, surely God could not be some rarefied glob of existence. God would seem to have many other properties. Thus, the problem of calling God subsistent existence returns one to the original problem of predicating multiple properties.

These objections represent the bulk of the objections commonly leveled at the doctrine’s basic coherence. One might summarize them as follows:

(a) God has several properties. Simplicity must deny this.

(b) Multiple properties occur as distinct from each other in things. Simplicity problematically says they are identical in God.

(c) God is a subsisting, individual thing. Properties do not subsist.

(d) In fact, properties, essences, natures are abstracta. God is not an abstract object.

(e) God is a person. Persons are ontologically complex.

(f) Simplicity says God is Being or subsistent existence. Existence is not a property, like being round.

(g) Nothing at all can be just existence.

(h) If God is some kind of rarified existence, this raises the same problem in (a).

These difficulties are hardly exhaustive. Still, together they account for much of the contemporary opposition to simplicity. They also embody certain assumptions that other kinds of objections tend to use. What follows can only be a sketch of some common responses to the above objections. Another task will be to demonstrate how proponents of classical simplicity tend to invoke background assumptions different from those of its critics.

5. Responses

a. Ontology

Looking at the contemporary ontology in which these objections are couched is a good place to start. Plantinga considers natures, properties, essences, and the like to be causally inert abstract objects that are separate from particular individual things. In this scheme, saying God is a nature is a category mistake. It is like referring to someone’s poodle as a prime number.

However, classical simplicity uses a metaphysics that sees the predication of natures and properties differently. Natures, essences, and properties are in this view constituents of things. Nicholas Wolterstorff characterizes this difference in ontological outlook in the following manner:

The theistic identity claims [in simplicity] were put forward by thinkers working within a very different ontological style from ours. They worked within an ontology I shall call constituent ontology. [Contemporary philosophers] typically work within a style that might be called relation ontology….Claims which are baffling in one style will sometimes seem relatively straightforward in another. (1991, 540–41)

Contemporary ontologies of this sort regard natures and properties as abstracta, which individual objects only “have” in the sense of exemplifying or instantiating them. Medieval proponents of simplicity regard such things as natures and properties as entities that actually inhere in the individuals that have them. Wolterstorff observes,

An essence is [for twentieth-century philosophers] an abstract entity. For a medieval, I suggest, the essence or nature was just as concrete as that of which it is the nature….Naturally the medieval will speak of something as having a certain nature. But the having here is to be understood as having as one of its constituents . . . for [contemporary philosophers], having an essence is . . . exemplifying it. (1991, 541–42)

Many medieval thinkers would say that Socrates and Plato both have a human nature. This means there is an intrinsic set of properties constituting their identity as human beings rather than as some other kind of natural object. Despite having the same nature, Socrates and Plato are of course distinct individuals. How so? Each individual is made out of a different parcel, or quantity, of matter. Each has different accidental features (non-essential properties). Socrates and Plato are thus two separate composites. Moreover, each has his individual humanity. The nature present in each is individualized or “particularized” in virtue of being in separate lumps of matter, and secondarily by the presence of different accidental, individualized features inhering in the individual composite substance. The humanity in each is not an exact replica of the humanity in the other, in the way new Lincoln pennies look the same and differ only in their location. In this ontological outlook, a mind can form a general concept of human nature in abstraction from its various particularized instances. But this common, abstract humanity is only an object of thought. There is no non-individualized human nature outside of minds producing abstract concepts. On this ontological perspective, there is no Platonic human nature outside of individual human beings. One might give a similar account of various properties Socrates and Plato have. Each has white skin. Each composite is white in its own particular way. One can say here that Socrates’ whiteness inheres in this composite, Plato’s in that one. The way each is white will thus look similar but also slightly different. One can form an abstract, general concept of being white that abstracts from its particular instances. However, the medievals believe such mental abstractions hardly commit one to ontological abstracta apart from minds or individual instances. Consequently, humanity and whiteness are not part of a menagerie of Platonic entities separate from the individual composite beings that exemplify them.

Similarly, classical ontology holds that the divine nature is not an abstract object. The divine nature, or the what-it-is to be God, is not separate from the being that is God. Since simplicity denies matter and accidents in God, here, as Aquinas explains in Summa theologiae, is the extraordinary case where a certain entity just is its own nature:

God is the same as his essence or nature . . . in things composed of matter and form, the nature or essence must differ from the suppositum [that is, the whole subject]….Hence the thing which is a man has something more in it than [its] humanity….On the other hand, in things not composed of matter and form, in which individualization is not due to individual matter…the forms themselves should be subsisting supposita. Therefore suppositum and nature in them are identified. Since God is not composed of matter and form, he must be his own Godhead, his own life, and whatever else is predicated of him. (Ia q.3 a.3)

Socrates is more than his nature; a human being is a material entity and has non-essential features in addition to his nature. God just is a nature, which does not form a composite with anything else. Such an extraordinary being is difficult to imagine or know much about. But, if natures and properties can be individual components of things, then simplicity hardly makes God an abstract object. Some commentators acknowledge the different approach classical ontology takes toward natures and properties, but raise objections to it (for example, Hughes 1989, 12–20). Defenders of simplicity do not find such reservations compelling, and they make the further point that simplicity at bottom never considers God an abstract object (Bergmann and Brower 2006; Leftow 1990, 593–94). The main point is that one’s own ontology might not be that of another age. One should also keep in mind that contemporary defenders of simplicity show a variety of ontological predilections; some mix historical and contemporary ontological views without seeing incoherence in this (for example, Vallicella 1992; Miller 1996). Adjudicating among these rival ontologies, however, is the substance of a much longer discussion (see, for example, Leftow 2003 and the other sources cited in this paragraph).

b. Persons

Modern authors sometimes speak of God as a person (for example, Plantinga 1980, 47, 57). If God is a person and if simplicity leaves no room for being a person, then simplicity seems incompatible with believing in God. Certainly there are reasons for calling God a person. Classical theism predicates of God things commonly associated with persons, such as knowledge and will. This is not all. Human persons and their cognitive faculties are composite and changeable. So, if human persons are the model for God’s being a person, then simplicity runs into the problems Plantinga and Hume raise above. But it would be odd if Jewish, Christian, and Islamic thinkers over the centuries had simply forgotten that God is supposed to be like a human person whenever they affirmed God’s simplicity. In fact, referring to God as a person is more complicated than one might think.

Many theists nowadays take it for granted that God is a person, albeit a kind of disembodied super-powerful one. Brian Davies observes that the formula ‘God is a person’ “is by no means a traditional one. It does not occur in the Bible. It is foreign to the Fathers and to writers up to and beyond the Middle Ages. Nor does it occur in the creeds” (2000, 560). Judaism believes man is in the image of God because man has understanding and free choice. Yet that is a long way from God actually being a person, much less in the way persons are persons. (Man is in the image of God but not vice versa.) Islam regards the ninety-nine names of Allah as titles of honor and not at all descriptions of God’s essence. The Christian Trinity speaks of three persons of one substance (ousia or substantia). It does not say the Godhead itself is a person, or that God is three persons in one person.

Stanley Rudman argues that thinking of the Godhead itself as a person is a relatively recent development (1998, ch. 8). It is mostly absent from Western theology before the eighteenth century. William Paley (1741–1805) and Friedrich Schleiermacher (1768–1834) provide early examples of trying out the idea. The nineteenth century sees an emphasis on God as a person or personality gain considerable momentum. In the present day, the eminent philosopher of religion Richard Swinburne does not find it particularly controversial to say, “That God is a person yet one without a body seems the most elementary claim of theism” (1999, 99). The difficulty lies in how one understands predicating ‘person.’ The modern sensibility seems to regard God as a person not altogether dissimilar to the way Socrates is a person. God is a disembodied mind that performs discursive thinking and makes a succession of distinct choices.

Far different is how Aquinas sees the predication of ‘person’ to God. He allows one can use the term. But here it signifies in a manner unlike its everyday use (Summa theologiae Ia q.29 a.4). It never applies univocally to God and creatures, but must be differently conceived in each case (q.29 a.4 ad 4). Aquinas notes that ‘person’ signifies “what is most perfect in all of nature—that is, a subsistent individual of a rational nature.” Working with this general idea, God is called a person because “his essence contains every perfection,” including supreme intelligence, and because “the dignity of the divine nature excels every dignity” (q.29 a.4 ad 2). ‘Person’ thus applies to God in a manner eminently surpassing creatures. The overall context suggests Aquinas regards the term as mainly honorific, in the way God is thought of as a king on account of his rule over creation.

God is not a person if that implies any diminution of his maximal perfection. God does not pass from being potentially in some state to actually acquiring it. God has a rational nature, but only “if reason be taken to mean, not discursive thought, but in a general sense, an intelligent nature” (q.29 a.4 ad 3). Human persons need not be the definitive model for persons. If they are, God surely is not a person. Predicates God shares with persons, such as intellect and will, apply only by analogy. The predicates must abstract from, or be stripped of, any implication of change, composition, or imperfection. The language of personality applies with the realization that, as Brian Davies notes,

Our language for what is personal (and our primary understanding of this) comes from our knowledge of human beings. And we ought to be struck by a difference between what it takes to be a human being and what it must take to be God. . . . [They do not] reflect a knowledge of God as he is in himself. (2000, 561)

The modern tendency to think of God as a person leads to anthropomorphic interpretations of traditional divine predicates, and this arguably misses the intent of the original proponents of simplicity. A similar problem involves a lack of familiarity with the religious epistemology surrounding the doctrine.

c. Negations

Simplicity traditionally emphasizes God as profoundly unlike created beings. Classical philosophical theology frequently approaches divine predication using negative theology. God is seen as profoundly unknown as he is in himself. Much of what can be affirmed about God expresses what God is not, and in general how unlike and beyond created things God is. This preserves a sense of God’s infinite ontological distance from creatures. It also ensures predicates are not applied as if categories used for persons and everyday objects apply in roughly the same way to God.

Negative predicates such as ‘simple’ and ‘immutable’ signify the removal of features commonly found in created things. Negations should not immediately suggest positive imagery of what God is like. A temptation is to think these terms mean what it would be like for, say, an animate object or a human being to lack such features. Everyday human experience does not associate a lack of complexity with richness and perfection. One imagines dull uniformity, like a bowl of tepid porridge. Aquinas realizes this and follows his presentation of simplicity with accounts of God’s unlimited perfection and goodness. Similar caution applies to thinking about God’s immutability. Grace Jantzen observes of an unchangeable God: “A living God cannot be static: life implies change . . . [divine immutability] would preclude divine responsiveness and must rather be taken as steadfastness of character” (1983, 573). However, classical theists will argue that the correct image here should not be that of a static and inert physical object. The historical sources do not suggest this, and often go to great lengths to guard against this confusion. God has unlimited perfection; statues and rocks do not. As Brian Davies observes, “living” predicated of God does not mean a literal-minded image of biological life and physical change. Instead it acknowledges God’s independence from things and his being a source of change in them (Davies 2004, 165–66).

Classical simplicity maintains that what God is like in himself lies beyond our knowledge. Concepts deriving from everyday experience of physical objects remain profoundly inadequate to the reality of God. An expert might acquire a good sense of how a complicated machine works; no comparable grasp of the divine nature is available. Aquinas accordingly introduces simplicity by saying it is safer to consider the ways God is unlike the created order rather than like it: “Now we cannot know how God is, but only how he is not; we must therefore consider the ways in which God does not exist, rather than the ways in which he does” (Summa theologiae Ia q.3 introduction). The context suggests one cannot know the essence of God, or have any direct acquaintance with it the way one knows physical things. Positive predications of the form “God is A” can lead readers to confuse the semantic distinction between subject and predicate with a real distinction between God and separate properties. Plotinus operates with a similar caution in denying one can properly even say the One is (1962, V.4.1). This does not mean the One is nonexistent. It signals that the One is beyond anything that could be associated with the world of changing and composite beings. Boethius discusses God as a simple being and then qualifies this by saying that God is not to be thought of as a subject. Dionysius (1957) shows an affinity with this position in his On the Divine Names.

Moses Maimonides also displays great caution in his account of simplicity and divine predication. For Maimonides, even positive predicates apply to God with severe qualifications to avoid compromising God’s simplicity (2000, chs. 50–58). Scripture enjoins the believer to affirm God is good, wise, just, and such. Yet positive predicates can only express that (a) God is the ultimate cause of certain good qualities, or (b) the predicate is a disguised negation, denying something of God. ‘God is good’ might mean God is the cause of good things. ‘God is living’ assures that God is not like something dead or ineffective. Subsequent thinkers will point out difficulties with this view of positive predicates. Saying nothing positive directly about God licenses some strange expressions: God is the cause of everything, and there are innumerable things God is not, so God might be called a ‘lion’ to avoid the impression of weakness, or ‘quick-witted’ to preclude the impression that God is dull.

Aquinas will cite the Aristotelian dictum (Physics 184a23–184b12) that to affirm something exists is to have at least a very partial and incomplete notion of what it is or is like. In addition, some modern commentators point out that agnosticism about God’s essence can go too far. ‘Simple’ is a negative predicate. But the doctrine implies God is unsurpassed perfection and ultimate being. The absence of anything like direct acquaintance with the divine nature could still allow positive things to be affirmed of it. This returns the discussion to the problem of assigning multiple predicates.

d. Multiple Predicates

Multiple predicates differ from each other in meaning. Must they imply multiple properties that are components in God? Maimonides handles this by denying that positive predicates of God actually refer to the divine nature. There is another way. Positive predicates are affirmed of the divine essence, but do not pick out multiple properties in God. God does not have properties, strictly speaking, if one has distinct component features in mind. The undivided reality of God grounds the truth of predicates that differ in meaning but all refer to the whole nature. Each predicate corresponds to a way of considering the divine reality. Yet none of these affirmations, taken individually or collectively, implies division. None exhaustively expresses the maximal perfection to which they all refer. One might invoke the contemporary distinction between the sense of a predicate (its meaning or conceptual associations) and its reference (the thing or things to which the predicate refers). The divine predicates differ in sense, but share the simple nature as their common referent. (Modern theories of reference differ from medieval theories of signification, but the basic idea does no harm here.) Aquinas remarks on these predicates:

God, however, as considered in himself is altogether one and simple; but nevertheless our intellect knows him by diverse conceptions, because it cannot see him as he is in himself. But, although it understands him under diverse conceptions, it knows that all these conceptions correspond (respondet) [emphasis added] to one and the same simple thing. Therefore, this plurality, which is [a plurality] according to reason, is represented by the plurality of subject and predicate; and the intellect represents the unity by composition. (Summa theologiae, Ia q.13 a.12)

“Good” and “living” are associated with two different concepts. Applied to creatures they signify distinct, inherent properties. Applied to God they are both true, but the ontological basis of their truth is the whole of what God is. The predicates retain their creaturely modes of signifying, where the mind associates the predicate with a limited and accidental property. Aquinas will say each signifies a perfection creatures have in common with God. John Damascene uses the metaphor of God being an infinite ocean of perfection, which can answer to distinctive intellectual conceptualizations while remaining undivided and unlimited in itself.

This does not mean a person grasps what it is about God or “in” God (a misleading expression) that corresponds to the predicate. One can say that certain predicates should be affirmed, but claiming to know just what they signify at the level of the divine is another matter. This raises the question of what the features inhering in created things could have in common with the divine reality. God’s nature seems to stretch the identity of what is predicated beyond its original significance. Marilyn Adams (1987) has suggested that the real issue with simplicity is not that multiple predicates imply composition. The problem is how the identity of the perfection signified is maintained between its created and divine applications. Aquinas notes that divine perfection differs from created perfection not just in degree. Since God is simple and maximal perfection, an entirely different mode of existence is involved. This is why he will say the predicates apply to God analogously, and not univocally, as “wise” applies to Plato and Socrates. Proponents of simplicity use a variety of solutions to show how the same predicate might refer to God and creatures. Such approaches vary widely, according to an individual’s views on ontology and religious language (see, for example, Miller 1996; Klima 2001; Teske 1981; Vallicella 1992; Weigel 2008, ch. 6).

e. Existence

Similar considerations about divine predication can make sense of saying God is existence. As noted, contemporary philosophers often deny existence is predicated of things (Williams 1997; Kenny 2002, 110–11). Others question this. They note that the Fregean view of existence originally flourished in response to long-faded controversies in late-nineteenth- and early-twentieth-century theories of quantification and reference (Smith 1961, 118–33; Klima 2001; Knuuttila 1986; Miller 1996, 15–27). Gyula Klima observes that medieval theories of signification predicate existence of things in the world. They also speak of entities that do not exist, without generating the obscure paradoxes that modern assumptions about reference seem to generate (2001; Spade 1982). Some philosophers think that predicating existence of objects does say something non-trivial about them. That existence is not a determinate property, such as being orange, does not mean its predication adds nothing of significance. John Smith argues in this vein that “It is obvious that at least one considerable difference between lions and unicorns is that the former do exist while the latter do not,” and this need not involve some well-defined concept of existence (1961, 123). Philosophers aware of the variety of semantic theories now circulating in English-speaking philosophy see the exclusively Fregean interpretation of existence as commanding less assent than it once did.

Fortunately, a sensible reading of the claim can be found without getting philosophers to agree on what existence is. First, God is not the being of all things collectively considered. This is just to have a universal concept of being that abstracts from individual beings and their determinations. But God is no lump sum of existence, which would be pantheistic. Second, saying God is existence does not mean God is some bland, characterless property of existence that one sees as common to cats, trees, and ballpoint pens. Instead, speaking of God as existence itself is a kind of shorthand for God’s ontology. Saying God’s essence is to exist expresses God’s independence from creatures as the uncaused source of all else. God depends on nothing for the being that God is. It also signals God’s supreme perfection. God’s maximal perfection and supreme unity surpass all individual beings and their limitations. Augustine will say in On the Trinity that because God is supreme among all beings, God is said to exist in the highest sense of the expression, “for it is the same thing to God to be, and to be great” (1963, V.10.11). Finally, Aquinas says that God is the full and exhaustive expression of the divine nature (Summa theologiae, Ia q.2 a.3). No other possible being rivals the divine plenitude. So, nothing else can be God. Calling God subsistent existence underscores God as (a) uncaused and independent, (b) maximal perfection, (c) simple, and (d) one.

6. Conclusion

Assessing the doctrine of divine simplicity is far more complicated than lining up objections and replies. The doctrine’s currents run deep in the history of Western philosophical and religious thought, predating the rise of Jewish and Christian philosophical theology. The doctrine is still regarded by many as an indispensable tenet of classical theism. Simplicity speaks to one’s fundamental understanding of God. Philosophers and theologians will continue to reach widely varying conclusions about simplicity, and the challenges it poses in a variety of areas ensure it will continue to receive much attention for the foreseeable future.

7. References and Further Reading

  • Adams, Marilyn McCord. William Ockham. 2 vols. Notre Dame, IN: University of Notre Dame Press, 1987.
    • Comprehensive overview of Ockham’s (c. 1287–1347) thought and contrasting medieval positions. Extensive discussion of medieval views of simplicity.
  • Anselm of Canterbury. Monologion. In Anselm of Canterbury: The Major Works, edited and translated by Brian Davies and Gareth Evans, 5–81. Oxford: Oxford University Press, 1998.
    • Early medieval account of simplicity and the classic divine predicates.
  • Anselm of Canterbury. Proslogion. In Anselm of Canterbury: The Major Works, edited and translated by Brian Davies and Gareth Evans, 82–104. Oxford: Oxford University Press, 1998.
  • Aquinas, Thomas. Summa Theologica. (also Summa theologiae) Translated by the English Dominican Fathers. New York: Benziger Brothers, 1947.
    • A comprehensive medieval defense of simplicity and other classic divine predicates.
  • Aquinas, Thomas. On the Power of God. Translated by the English Dominican Fathers. Westminster, MD: Newman Press, 1952.
    • Extensive treatment of the problem of simplicity and multiple predicates.
  • Augustine. On the Trinity. Translated by Stephen McKenna. Washington, DC: Catholic University of America Press, 1963.
    • His handling of simplicity proves influential in later, medieval accounts.
  • Bennett, Daniel. “The Divine Simplicity.” Journal of Philosophy 69, no. 19 (1969): 628–37.
    • Examines analytic objections to ascribing multiple properties to a simple God.
  • Bergmann, Michael, and Jeffrey Brower. “A Theistic Argument against Platonism (and in Support of Truthmakers and Divine Simplicity).” In Oxford Studies in Metaphysics 2, edited by Dean Zimmerman, 357–86. Oxford: Oxford University Press, 2006.
    • Argues against properties having to be abstract objects.
  • The Book of Causes. Anonymous. Translated by Dennis Brand. Milwaukee, WI: Marquette University Press, 1984.
    • Thought to be by an unknown Arabic author abstracting from Proclus’s Elements of Theology.
  • Burns, Peter. “The Status and Function of Divine Simpleness in Summa theologiae Ia, qq.2–13.” Thomist 57, no. 1 (1993): 1–26.
    • Discusses the place and influence of simplicity in Aquinas’s account of the divine nature.
  • Davies, Brian. “A Modern Defence of Divine Simplicity.” In Philosophy of Religion: A Guide and Anthology, edited by Brian Davies, 549–64. Oxford: Oxford University Press, 2000.
    • A sympathetic treatment of the compatibility of simplicity with other predicates.
  • Davies, Brian. Introduction to the Philosophy of Religion. 3rd ed. Oxford: Oxford University Press, 2004.
  • Dionysius. Dionysius the Areopagite “On the Divine Names” and “The Mystical Theology.” Translated by C. Rolt. London: SPCK, 1957.
    • Influential on later medieval thought about simplicity and the divine nature.
  • Gale, Richard. On the Nature and Existence of God. Cambridge: Cambridge University Press, 1991.
    • A critical response to analytic defenses of theism.
  • Hughes, Christopher. On a Complex Theory of a Simple God. Ithaca, NY: Cornell University Press, 1989.
    • Critiques Aquinas’ account of simplicity and suggests another account.
  • Hume, David. Dialogues concerning Natural Religion. Edited by Richard Popkin. Indianapolis, IN: Hackett, 1980.
    • Historically regarded as a powerful critique of the classic concept of God and arguments for God’s existence.
  • Jantzen, Grace. “Time and Timelessness.” In A New Dictionary of Christianity, edited by Alan Richardson and John Bowden. London: SCM, 1983.
    • Briefly critiques an eternal and immutable God.
  • John of Damascus (John Damascene). An Exposition of the Orthodox Faith. Translated by E.W. Watson and L. Pullan. In Nicene and Post-Nicene Fathers, second series, vol. 9. Edited by Philip Schaff and Henry Wace. Buffalo, NY: Christian Literature, 1899.
    • Systematic discussion of the divine nature and human knowledge of God. Influential precursor to Scholastic discussions.
  • Kenny, Anthony. Aquinas on Being. Oxford: Oxford University Press, 2002.
    • Argues for the incoherence of Aquinas’s ontology of existence.
  • Klima, Gyula. “Existence and Reference in Medieval Logic.” In New Essays in Free Logic, edited by Alexander Hieke and Edgar Morscher, 197–226. Dordrecht: Kluwer Academic, 2001.
    • Sophisticated technical defense of some medieval theories of existence and predication.
  • Knuuttila, Simo. “Being qua Being in Thomas Aquinas and John Duns Scotus.” In The Logic of Being: Historical Studies, edited by Simo Knuuttila and Jaakko Hintikka, 201–22. Dordrecht: Kluwer Academic, 1986.
    • Explanation and defense of Aquinas’s views on existence.
  • LaCroix, Richard. “Augustine on the Simplicity of God.” New Scholasticism 51, no. 4 (1977): 453–69.
    • Critique of Augustine’s account.
  • Leftow, Brian. “Is God an Abstract Object?” Noûs 24, no. 4 (1990): 581–98.
    • Examines the role of theories of properties in accounts of the divine nature.
  • Leftow, Brian. “Aquinas on Attributes.” Medieval Philosophy and Theology 11, no. 1 (2003): 1–41.
    • Explanation and defense of Aquinas on divine predication.
  • Maimonides, Moses. The Guide for the Perplexed. Rev. ed. Translated by M. Friedlander. Mineola, NY: Dover, 2000.
    • An early medieval Jewish thinker’s account of the divine nature. Influential in subsequent Scholastic discussions.
  • Mann, William. “Divine Simplicity.” Religious Studies 18 (1982): 451–71.
    • Critique of divine simplicity and often cited in contemporary discussions.
  • Martin, C.B. “God, the Null Set and Divine Simplicity.” In The Challenge to Religion Today, edited by John King-Farlow, 138–43. New York: Science History, 1976.
    • Poses objections to simplicity in an analytic vein.
  • Miller, Barry. A Most Unlikely God: A Philosophical Inquiry into the Nature of God. Notre Dame, IN: University of Notre Dame Press, 1996.
    • Sympathetic reconstruction of the classic concept of God using analytic philosophy.
  • Morris, Thomas. “On God and Mann: A View of Divine Simplicity.” Religious Studies 21, no. 3 (1985): 299–318.
    • A well-known reply to Mann (1982).
  • Owen, H. P. Concepts of Deity. London: MacMillan, 1971.
    • Comprehensive survey of conceptions of the divine nature. Defends classical monotheism.
  • Plantinga, Alvin. Does God Have a Nature? Milwaukee, WI: Marquette University Press, 1980.
    • A monograph-length analytic critique of divine simplicity and the classic concept of God. The text serves as a touchstone for contemporary philosophical debates over simplicity.
  • Plotinus. Enneads. 3rd ed. Translated by Stephen MacKenna. Revised by B. S. Page. New York: Pantheon Books, 1962.
    • Neoplatonic treatment of the divine nature.
  • Prior, A. N. “Can Religion Be Discussed?” In New Essays in Philosophical Theology, edited by Anthony Flew and Alasdair MacIntyre, 1–11. London: SCM Press, 1955.
    • Critical assessment of some traditional theological positions.
  • Proclus. The Elements of Theology. Translated with a commentary by E. Dodds. Oxford: Oxford University Press, 1933.
  • Rogers, Katherin. “The Traditional Doctrine of Divine Simplicity.” Religious Studies 32, no. 2 (1996): 165–86.
    • Survey of some problems classical simplicity raises.
  • Ross, James. Philosophical Theology. New York: Bobbs-Merrill, 1969.
    • Assesses traditional philosophical theology by combining an analytic approach with a grasp of Scholastic positions.
  • Rudman, Stanley. Concepts of Person and Christian Ethics. Cambridge: Cambridge University Press, 1998.
    • Discusses the idea of the Godhead as a person and its recent history.
  • Smith, John. Reason and God: Encounters of Philosophy with Religion. New Haven, CT: Yale University Press, 1961.
    • Examines some traditional and contemporary views in philosophical theology. Defends existence as a valid predicate in theological contexts.
  • Spade, Paul. “The Semantics of Terms.” In The Cambridge History of Later Medieval Philosophy, edited by Norman Kretzmann, Anthony Kenny, and Jan Pinborg, 188–96. New York: Cambridge University Press, 1982.
    • Discussion of medieval semantic theories.
  • Stump, Eleonore, and Norman Kretzmann. “Absolute Simplicity.” Faith and Philosophy 2, no. 4 (1985): 353–82.
    • Defends the compatibility of simplicity with divine power and willing.
  • Swinburne, Richard. The Coherence of Theism. Oxford: Oxford University Press, 1999.
    • Sympathetic treatment of traditional theistic philosophical positions.
  • Teske, Roland. “Properties of God and the Predicaments in De Trinitate V.” Modern Schoolman 59 (1981): 1–19.
    • Examines multiple predicates of a simple God in Augustine’s work.
  • Vallicella, William. “Divine Simplicity: A New Defense.” Faith and Philosophy 9, no. 4 (1992): 471–78.
    • A contemporary analytic defense of divine simplicity.
  • Wainwright, William. “Augustine on God’s Simplicity: A Reply.” New Scholasticism 53, no. 1 (1979): 124–27.
  • Weigel, Peter. Aquinas on Simplicity: An Investigation into the Foundations of His Philosophical Theology. Frankfurt: Peter Lang, 2008.
    • Examines the ontological background to Aquinas’s account of simplicity and philosophical theology.
  • Williams, C. J. F. “Being.” In A Companion to Philosophy of Religion, edited by Philip Quinn and Charles Taliaferro, 223–28. Oxford: Blackwell, 1997.
    • Critique of predicating existence in theological contexts.
  • Wolterstorff, Nicholas. “Divine Simplicity.” In Philosophical Perspectives 5: Philosophy of Religion 1991, edited by James Tomberlin, 531–52. Atascadero, CA: Ridgefield, 1991.
    • A critical assessment of some problems raised by simplicity and often cited in contemporary discussions.

Author Information

Peter Weigel
Email: pweigel2@washcoll.edu
Washington College

Predicative and Impredicative Definitions

The distinction between predicative and impredicative definitions is today widely regarded as an important watershed in logic and the philosophy of mathematics. A definition is said to be impredicative if it generalizes over a totality to which the entity being defined belongs. Otherwise the definition is said to be predicative. In the examples below, (2) and (4) are impredicative.

  1. Let π be the ratio between the circumference and diameter of a circle.
  2. Let n be the least natural number such that n cannot be written as the sum of at most four cubes.
  3. A natural number n is prime if and only if  n > 1 and the only divisors of n are 1 and n itself.
  4. A person x is general-like if and only if, for every property P which all great generals have, x too has P.

Definition (1) is predicative since π is defined solely in terms of the circumference and diameter of some given circle.  Definition (2), on the other hand, is impredicative, as this definition generalizes over all natural numbers, including n itself. Definition (3) is predicative, as the property of being prime is defined without any generalization over properties. By contrast, definition (4) is impredicative, as the property of being general-like is defined by generalization over the totality of all properties.

Impredicative definitions have long been controversial in logic and the philosophy of mathematics. Many prominent logicians and philosophers—most importantly Henri Poincaré, Bertrand Russell, and Hermann Weyl—have rejected such definitions as viciously circular. However, it turns out that the rejection of such definitions would require a major revision of classical mathematics. The most common contemporary view is probably that of Kurt Gödel, who argued that impredicative definitions are legitimate provided one holds a realist view of the entities in question.

Although few theorists any longer reject all impredicative definitions, it is widely recognized that such definitions require stronger theoretical assumptions than do predicative definitions.

Table of Contents

  1. Paradoxes and the Vicious Circle Principle
  2. Impredicativity in Classical Mathematics
  3. Defenses of Impredicative Definitions
  4. References and Further Readings

1. Paradoxes and the Vicious Circle Principle

The notion of predicativity has its origin in the early twentieth-century debate between Poincaré, Russell, and others about the nature and source of the logical paradoxes ([Poincaré 1906], [Russell 1908]). So it will be useful to review some of the most important logical paradoxes.

Russell’s paradox. Let the Russell class R be the class of all classes that are not members of themselves. If R is a member of itself, then it doesn’t satisfy the membership criterion and hence isn’t a member of itself. If, on the other hand, R isn’t a member of itself, then it does satisfy the membership criterion and hence is a member of itself after all. Thus, R is a member of itself iff (if and only if) R is not a member of itself.
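
The reasoning compresses into a single formal line (a standard formalization, not tied to any one of the sources cited here):

\[
R = \{\, x \mid x \notin x \,\} \quad\Longrightarrow\quad R \in R \;\leftrightarrow\; R \notin R,
\]

since instantiating the membership condition with x := R yields the biconditional directly.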

The Liar paradox. “This sentence is false.” If this quoted sentence is true, then what it says is correct, which means that the sentence is false. If, on the other hand, the sentence is false, then what it says is correct, which means that the sentence is true. Thus, the sentence is true just in case it is false.

Berry’s paradox. There are only finitely many strings over the English alphabet of fewer than 200 characters, so only finitely many natural numbers can be named by such strings. But there are infinitely many natural numbers. Hence there must be a least integer not nameable in fewer than 200 characters. But we have just named it in fewer than 200 characters!
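
To make the finiteness claim concrete, here is a quick count (the alphabet size of 27 symbols, 26 letters plus a space, is an illustrative assumption rather than part of the paradox itself):

\[
\#\{\text{strings of length} < 200\} \;\le\; \sum_{i=0}^{199} 27^{i} \;=\; \frac{27^{200} - 1}{26} \;<\; 27^{200},
\]

which is finite, so only finitely many natural numbers can be singled out by such strings.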

Both Poincaré and Russell argued that the paradoxes are caused by some form of vicious circularity. What goes wrong, they claimed, is that an entity is defined, or a proposition is formulated, in a way that is unacceptably circular. Sometimes this circularity is transparent, as in the Liar paradox. But in other paradoxes there is no explicit circularity. For instance, the definition of the Russell class makes no explicit reference to the class being defined. Nor does the definition in Berry’s paradox make any explicit reference to itself.

However, Poincaré and Russell argued that paradoxes such as Russell’s and Berry’s are guilty of an implicit form of circularity. The problem with the Russell class is said to be that its definition generalizes over a totality to which the defined class would belong. This is because the Russell class is defined as the class whose members are all and only the non-self-membered objects. So one of the objects that needs to be considered for membership in the Russell class is this very class itself. Similarly, the definition in Berry’s paradox generalizes over all definitions, including the very definition in question.

Poincaré’s and Russell’s diagnosis is very general. Whenever we generalize over a totality, we presuppose all the entities that make up this totality. So when we attempt to define an entity by generalizing over a totality to which this entity would belong, we are tacitly presupposing the entity we are trying to define. And this, they claim, involves a vicious circle. The solution to the paradoxes is therefore to ban such circles by laying down what Russell calls the Vicious Circle Principle. This principle has received a bewildering variety of formulations. Here are two famous examples (from [Russell 1908], p. 225):

Whatever involves all of a collection must not be one of the collection.

If, provided a certain collection has a total, it would have members only definable in terms of that total, then the said collection has no total.

In a justly famous analysis, Gödel distinguishes between the following three forms of the Vicious Circle Principle ([Gödel 1944]):

(VCP1) No entity can be defined in terms of a totality to which this entity belongs.

(VCP2) No entity can involve a totality to which this entity belongs.

(VCP3) No entity can presuppose a totality to which this entity belongs.

The clearest of these principles is probably (VCP1), for it is simply a ban on impredicative definitions: it requires that a definition not generalize over a totality to which the entity defined would belong.

According to Gödel, the other two principles, (VCP2) and (VCP3), are more plausible than the first, if not necessarily convincing. The tenability of these two principles is a fascinating question but beyond the scope of this survey.

For two other introductions to the question of predicativity, see [Giaquinto 2002] and (a bit more advanced) [Feferman 2005].

2. Impredicativity in Classical Mathematics

Assume Poincaré and Russell are right that impredicative definitions must be banned. What consequences would this ban have? It was soon realized that classical mathematics relies heavily on impredicative definitions. Here are two famous examples. (The examples inevitably involve some mathematics but can be skimmed by less mathematically inclined readers.)

Example 1: Arithmetic

In many approaches to the foundations of mathematics, the property N of being a natural number is defined as follows. An object x has the property N just in case x has every property F which is had by zero and is inherited from any number u to its successor u+1. Or in symbols:

Def-N   N(x) ↔ ∀F[F(0) ∧ ∀u(F(u) → F(u + 1)) → F(x)]

This definition has the nice feature of entailing the principle of mathematical induction, which says that any property F which is had by zero and is inherited from any number u to its successor u+1 is had by every natural number:

∀F{F(0) ∧ ∀u(F(u) → F(u + 1)) → ∀x(N(x) → F(x))}
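
To see why Def-N entails this, one can reason as follows (a sketch using only the definition above):

\begin{align*}
&\text{Assume } F(0) \wedge \forall u\,(F(u) \rightarrow F(u+1)) \text{ for some property } F.\\
&\text{Take any } x \text{ with } N(x).\\
&\text{By Def-N, } x \text{ has every property satisfying that assumption; in particular, } F(x).\\
&\text{Since } x \text{ was arbitrary, } \forall x\,(N(x) \rightarrow F(x)).
\end{align*}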

However, Def-N is impredicative because it defines the property N by generalizing over all arithmetical properties, including the one being defined.

Example 2: Analysis

Assume the rational numbers Q have been constructed from sets. Assume we want to go on and construct the real numbers R as lower Dedekind cuts of rationals. That is, assume we want to represent each real number by an appropriate downward closed set of rationals. An important task will then be to ensure that the Dedekind cuts which we use to represent real numbers have the following property, which plays a key role in many proofs in real analysis:

Least Upper Bound Property. Let X be a non-empty collection of reals with an upper bound. (An upper bound of X is a real number which is larger than any element of X.) Then X has a least upper bound. That is, X has an upper bound which is smaller than or equal to any other upper bound of X.

The standard proof that the class of Dedekind cuts has the Least Upper Bound Property involves the following definition of a Dedekind cut z, which can be seen to be the least upper bound of some given non-empty set X which has an upper bound:

∀q[q ∈ z ↔ ∃y(y ∈ X ∧ q ∈ y)]
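
In other words, z is the union of the cuts belonging to X. Here is a sketch of the standard verification that z is the least upper bound (lower cuts are ordered by inclusion, so ⊆ plays the role of ≤):

\begin{align*}
&\textit{Upper bound: } \text{for each } y \in X, \text{ every } q \in y \text{ satisfies } q \in z \text{ by the definition, so } y \subseteq z.\\
&\textit{Least: } \text{if } w \text{ is any upper bound, then } y \subseteq w \text{ for all } y \in X; \text{ so every } q \in z \text{ lies in some such } y \subseteq w, \text{ hence } z \subseteq w.
\end{align*}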

However, this definition of the Dedekind cut z is impredicative because it generalizes over all Dedekind cuts y.

Responses to impredicativity in classical mathematics

So classical mathematics relies on impredicative definitions. What does this mean for the proposed ban on such definitions? Three different kinds of response have been developed.

  1. Russell and Whitehead’s response in their famous Principia Mathematica was to adopt the Axiom of Reducibility. This axiom says (loosely speaking) that every impredicative definition can be turned into a predicative one. However, this axiom has struck most people as intolerably ad hoc.
  2. Another response was initiated by Hermann Weyl [Weyl 1918] and has more recently been pursued by Solomon Feferman. (See [Feferman 1998] as well as [Feferman 2005] for a survey.) This response is to reconstruct as much of classical mathematics as possible in a way that avoids the use of impredicative definitions. Although this approach is hard to carry out and sometimes rather cumbersome, it has turned out that a surprisingly large amount of mathematics—including most of what is needed for the purposes of empirical science—can be reconstructed in a way that is predicative given the natural numbers.
  3. A third response is associated with Gödel. The fact that classical mathematics uses impredicative definitions should, according to Gödel, be considered a refutation of the vicious circle principle and its ban on impredicative definitions rather than the other way round. In Gödel’s words, we should “consider this rather as a proof that the vicious circle principle is false than that classical mathematics is false.” ([Gödel 1944], p. 135)

3. Defenses of Impredicative Definitions

The response of Gödel’s that we have just considered amounts to a pragmatic defense of impredicative definitions. Since classical mathematics is a scientifically respectable discipline, we have good reason to believe that its core forms of definition are legitimate, including many impredicative ones. But although this pragmatic defense of impredicative definitions has significant force, it would be useful to know why such definitions are legitimate despite their apparent circularity. We will now consider some attempted answers to this question, including one due to Gödel himself.

Our journey begins with Frank Ramsey’s “Foundations of Mathematics” ([Ramsey 1931]), written in 1925 when he was merely 22 years old. Ramsey provides some examples of impredicative definitions which appear to be entirely unproblematic:

(5) Let Julius be the tallest person in the room.

(6) Let f(p,q) be the truth-function which is the conjunction of p, q, p ∨ q, and p ∧ q.

(A truth-function is a function from truth-values to truth-values.) These definitions are impredicative because (5) generalizes over all people in the room, including Julius (whoever he or she turns out to be), and because (6) defines the truth-function f(p,q) by generalizing over the four listed truth-functions, one of which is easily seen to be identical to f(p,q), namely p ∧ q.
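
That identity is easy to check by brute force. The following minimal Python script (purely illustrative, not part of Ramsey’s text) enumerates all four truth-value assignments:

    from itertools import product

    # f(p, q): the conjunction of the four listed truth-functions: p, q, p or q, p and q
    def f(p, q):
        return p and q and (p or q) and (p and q)

    # The defined function coincides with plain conjunction on every assignment.
    for p, q in product([True, False], repeat=2):
        assert f(p, q) == (p and q)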

Ramsey is surely right that these two definitions are harmless. But why is that so? Ramsey isn’t entirely explicit here. His core idea appears to be that an impredicative definition is permissible provided the entity defined can at least in principle be specified or characterized independently of the totality in terms of which it is defined. Indeed, Julius (whoever he or she may be) can be specified by pointing to a person, and f(p,q), by means of a truth table.

This theme of independent specifiability is developed further in an influential article by Paul Bernays, [Bernays 1935]. Bernays is particularly interested in our conception of sets, which, he argues, does not require all sets to be explicitly definable. Consider first the case of a finite set, say a set S with n elements. By means of what Bernays calls “combinatorial reasoning”—that is, reasoning based on grouping and selecting objects—we establish that S has 2^n subsets. We establish this by observing that all the different subsets of S correspond to all the different ways of making an independent choice as to whether each element of S is to be included in some given subset. There is no need to define all the subsets explicitly.

Much the same goes for infinite sets, according to Bernays. Our conception of infinite sets is “quasi-combinatorial” in the sense that it is based on an analogy with the combinatorial conception of finite sets. For instance, this enables us to establish that the number of subsets of the set N of natural numbers is 2^ω, where ω is the cardinality or size of N. Note that this fact is established without any need to provide an explicit definition of all the subsets.

The quasi-combinatorial conception of sets ensures that sets can, at least in principle, be specified independently of their definitions. And this in turn ensures that impredicative definitions of sets are permissible. This is because sets do not depend on their explicit definitions, if any, but rather are tied to their quasi-combinatorial specifications.

Gödel also provides a philosophical defense of impredicative definitions, which supplements his pragmatic defense mentioned above. This philosophical defense has been very influential and is the source of what is probably the dominant contemporary view on the matter. According to Gödel, impredicative definitions are indeed problematic if one believes that mathematical objects are in some sense constructed by us. For:

the construction of a thing can certainly not be based on a totality of things to which the thing to be constructed belongs. ([Gödel 1944], p. 136)

But there is no such problem if instead one holds a realist view of mathematical objects:

If, however, it is a question of objects that exist independently of our constructions, there is nothing in the least absurd in the existence of totalities containing members which can be described […] only by reference to this totality. (ibid.)

Gödel’s view is thus that a ban on impredicative definitions is justified if one holds a constructivist view of the entities concerned but not if one holds a realist view.

This means that Gödel’s analysis differs from Ramsey’s and Bernays’. Gödel bases the legitimacy of impredicative definitions on the independent existence of the entities in question, whereas Ramsey and Bernays base it on these entities’ independent specifiability. Which analysis is more plausible? Examples such as (5) are handled well by both analyses. But other examples are handled much better by the Ramsey-Bernays analysis than by Gödel’s. For instance, it seems unlikely that one has to be a realist about truth-functions in order to accept the legitimacy of Ramsey’s impredicative definition (6). In a similar vein, it seems unlikely that one has to be a realist about fictional characters in order to accept the legitimacy of the following impredicative definition.

(7) Let Julia be the most beautiful character in the story of Cinderella.

Clearly, Julia is identical to Cinderella. And this identification does not require a fictional character to enjoy any real or independent existence.

These considerations suggest that the Ramsey-Bernays analysis has at least as much initial plausibility as Gödel’s. But further investigation will be needed to settle the matter.

4. References and Further Readings

  • Benacerraf, P. and Putnam, H., editors (1983). Philosophy of Mathematics: Selected Readings. Cambridge University Press, Cambridge. Second edition.
  • Bernays, P. (1935). “On Platonism in Mathematics.” Reprinted in (Benacerraf and Putnam, 1983).
  • Ewald, W. (1996). From Kant to Hilbert: A Source Book in the Foundations of Mathematics volume 2. Oxford University Press, Oxford.
  • Feferman, S. (1998). “Weyl Vindicated: Das Kontinuum Seventy Years Later.” In Feferman’s In the Light of Logic, pages 249-283. Oxford University Press, Oxford.
  • Feferman, S. (2005). “Predicativity.” In Shapiro, S., editor, Oxford Handbook of the Philosophy of Mathematics and Logic, pages 590-624. Oxford University Press, Oxford.
  • Giaquinto, M. (2002). The Search for Certainty: A Philosophical Account of Foundations of Mathematics. Clarendon, Oxford.
  • Gödel, K. (1944). “Russell’s Mathematical Logic.” In (Benacerraf and Putnam, 1983).
  • Poincaré, H. (1906). “Les Mathematiques et la Logique.” Revue de Métaphysique et de Morale, 14:294-317. Translated as “Mathematics and Logic, II” in (Ewald, 1996), pp. 1038-1052.
  • Ramsey, F. (1931). “The Foundations of Mathematics.” In Braithwaite, R., editor, The Foundations of Mathematics and Other Essays. Routledge & Kegan Paul, London.
  • Russell, B. (1908). “Mathematical Logic as Based on a Theory of Types.” American Journal of Mathematics, 30:222-262.
  • Weyl, H. (1918). Das Kontinuum. Verlag von Veit & Comp, Leipzig. Translated as The Continuum by S. Pollard and T. Bole, Dover, 1994.

Author Information

Oystein Linnebo
Email: o.linnebo@bbk.ac.uk
Birkbeck, University of London
Great Britain

Altruism and Group Selection

Ever since Darwin created his theory of evolution in the nineteenth century, and especially since the nineteen sixties, scientists and philosophers of science have been intensely debating whether and how selection occurs at the level of the group. The debates over group selection maintain their vitality for several reasons: because group selection may explain the evolution of altruism; because “altruistic” traits—traits that reduce an individual’s fitness while increasing the fitness of another—constitute a well-known puzzle for the theory of natural selection; because altruism is a phenomenon that one seems to encounter daily in biology and society; and because altruism via group selection may explain some major evolutionary transitions in the history of life (such as the transition from separate molecules into a gene, from individual genes into a chromosome, from individual cells into a multi-cellular organism, and from multi-cellular organisms into a social group).

After so many years of unresolved debates, one is prone to ask: Is the group selection debate merely waiting for more data and experimentation, or are there further issues that need clarification? One type of dispute is semantic, requiring examination of the various meanings of “altruism,” “group” and “unit of selection.” Another type of dispute concerns heuristic strategies, such as the assumption that phenomena similar in one respect, however dissimilar in other respects, call for a similar explanation or a similar causal mechanism. This strategy encourages the parties to seek a single evolutionary explanation or a single selection process to drive the evolution of altruistic traits. Finally, there could be values and visual images that are historically entrenched in favor of a particular kind of explanation or against it. This article develops some major historical, empirical, conceptual and practical aspects of the debates over group selection.

Table of Contents

  1. The Concept of Altruism
  2. A Chronology of the Debates
  3. Non-Empirical Aspects of the Debates
  4. Empirical Aspects of the Debates
  5. Practices in the Debates: Sociobiology
  6. References and Further Reading

1. The Concept of Altruism

Selection among groups rather than individuals is not a straightforward idea, especially not ontologically. Nonetheless, the notion of group selection is often used in evolutionary discourse, especially for explaining the evolution of altruism or sociality (the tendency to form social groups). The meaning of “altruism” in ordinary language is quite different from its use among evolutionary biologists (Sober and Wilson, 1998, pp. 17-18). An ultimate motivation of assisting another regardless of one’s direct or indirect self-benefit is necessary for an act to be altruistic in the ordinary sense, that is, for what we might call moral altruism (see psychological egoism). However, motivations and intentions are not accessible to someone studying non-humans. Thus, they are not part of the meaning of “altruism” in the biological sense. Biological altruism is a course of action that enhances the expected fitness of another at the expense of one’s own fitness. Whether altruism occurs depends on several things: on the population’s initial conditions, on the definition of “altruism” as absolute or relative fitness reduction, that is, whether one suffers a net loss or not (Kerr et al. 2003), and on the meaning of “fitness” as an actuality or a propensity (Mills and Beatty, 1979). Unlike in ordinary speech, in biological discourse a trait that carries a cost to the individual, even if relatively small and with no net reduction of fitness, is typically labeled “altruistic” or, equivalently, “cooperative.”
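
To illustrate the fitness accounting at stake, consider a toy model of the kind common in this literature (the baseline fitness w_0, benefit b, and cost c are hypothetical parameters, not drawn from any particular study). Suppose a group of n individuals contains k altruists, and each altruist pays a cost c to spread a total benefit b evenly among the other n − 1 group members:

\[
w_{\text{altruist}} = w_0 - c + \frac{b\,(k-1)}{n-1},
\qquad
w_{\text{selfish}} = w_0 + \frac{b\,k}{n-1}.
\]

Within any mixed group the selfish members do better, by c + b/(n − 1); yet the group’s average fitness, w_0 + k(b − c)/n, rises with k whenever b > c. This tension between within-group cost and between-group benefit is precisely what the debates surveyed below turn on.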

These distinctions between ordinary and technical senses of “altruism” notwithstanding, many scientists link them in the evolutionary debates over group selection. Connecting biological and moral altruism is typically done without conflating the two, that is, without committing the naturalistic fallacy of “is implies ought.” An example of such a fallacy might be: since group selection is found everywhere in nature, we should act for the benefit of the group. Instead, some scientists argue that the abundance of group selection processes throughout human evolution can explain why humans sometimes hold genuinely altruistic motivations (for example, Darwin, 1871; Sober and Wilson, 1998, part II). Others argue that moral altruism should be praised with extra vigor, since the process of group selection hardly – if ever – occurs in nature, so human altruism is not “in harmony” with nature but rather a struggle against it (Dawkins, 1976; Williams, 1987). In short, linking “altruism” with “group selection” is historically very common, although conceptually it is not necessary. As we shall see below, a process of group selection can act on non-altruistic traits, and the evolution of a cooperative trait need not always require a group selection process. Karl Popper (1945) blamed Plato for the historical identification of the moral concept of altruism with collectivism and for contrasting altruism to individualism:

Now it is interesting that for Plato, and for most Platonists, altruistic individualisms cannot exist. According to Plato, the only alternative to collectivism is egoism; he simply identifies all altruism with collectivism; and all individualism with egoism. This is not a matter of terminology, of mere words, for instead of four possibilities, Plato recognized only two. This has created considerable confusion in speculation on ethical matters, even down to our own day (Popper, 1945, p. 101).

Whether due to Plato or to local circumstances within the nineteenth-century scientific community, “altruism” and “group selection” have been linked since the origin of evolutionary biology.

2. A Chronology of the Debates

Ever since Darwin, “altruism” and “group selection” have been found together (Darwin, 1859, p. 236; Lustig, 2004). In his 1871 book The Descent of Man, Darwin pointed to a selection process at the group level as an evolutionary explanation for human altruism:

When two tribes of primeval man, living in the same country, came into competition, if (other things being equal) the one tribe included a great number of courageous, sympathetic and faithful members, who were always ready to warn each other of danger, to aid and defend each other, this tribe would succeed better and conquer the other (Darwin, 1871, p. 113).

Such altruistic behavior seems to raise a problem for a theory of natural selection, since:

It is extremely doubtful whether the offspring of the more sympathetic and benevolent parents, or of those who were the most faithful to their comrades, would be reared in greater numbers than the children of selfish and treacherous parents belonging to the same tribe. He who was ready to sacrifice his life, as many a savage has been, rather than betray his comrades, would often leave no offspring to inherit his noble nature (Darwin, 1871, p. 114).

Given this characterization, one might think that altruistic traits would gradually disappear. Yet such traits appear quite common in nature. Darwin suggests several mechanisms within a single group to explain the puzzle of the evolution of altruism – such as reciprocal reward and punishment – that often benefit the benevolent individual in the long run relative to others in his or her group. In other words, Darwin points to selection at the level of the individual rather than the group, which renders morally praised behavior non-altruistic in the biological sense. Yet Darwin immediately makes it clear that selection between groups is the dominant process selecting for human morality, since whatever forces might act within that tribe, the disparity in accomplishment is greater between tribal groups than within each group:

It must not be forgotten that although a high standard of morality gives but a slight or no advantage to each individual man and his children over the other men of the same tribe, yet that an increase in the number of well-endowed men and an advancement in the standard of morality will certainly give an immense advantage to one tribe over another. A tribe including many members, who from possessing in a high degree the spirit of patriotism, fidelity, obedience, courage and sympathy, were always ready to aid one another, and to sacrifice themselves for the common good, would be victorious over most other tribes; and this would be natural selection (Darwin, 1871, pp. 115-116).

Since Darwin, and with a similar naturalistic stance, biologists have continued to try to explain altruism – in humans and non-humans alike – via group selection models. Assuming that group selection does not conflict with individual selection was a common, uncritical presumption until World War II (Simpson, 1941). The three decades that followed marked a dramatic change. Historians such as Keller (1988) and Mitman (1992) showed that during the 1950s and 1960s, many Anglo-American researchers came to identify altruism with conformity – and with being a tool of totalitarianism – while viewing conflicts of interest as crucial for the checks and balances of a functioning democracy. Vero C. Wynne-Edwards’s attempt at a grand synthesis of all population dynamics under the process of group selection (Wynne-Edwards, 1962) is an example. The attack on group selection, although already a long-standing element of David Lack’s controversy with Wynne-Edwards (Lack, 1956), became the focus of attention largely due to John Maynard Smith’s 1964 paper and George C. Williams’ 1966 book Adaptation and Natural Selection.

Williams (1966) advocated the parsimony of explaining seemingly sacrificial behavior without invoking altruism (in the sense of an absolute reduction in fitness) or the mysterious mechanism of selection at the group level, but rather via the fitness benefits to the individual or the gene involved. The “gene’s eye-view” employed by Maynard Smith and Williams was given its most general form in William D. Hamilton’s 1964 papers. “Hamilton’s rule,” often used interchangeably with “kin selection” (Frank, 1998, pp. 4, 46-47; Foster et al., 2005), states that an altruistic gene will increase its frequency in a population if the ratio between the donor’s cost (c) and the benefit to the recipient (b) is less than the coefficient of (genetic) relatedness between the donor and recipient (r); that is, r > c / b. In other words, a gene for altruism (that is, an abstract gene type, not a material stretch of DNA nor a specific gene token) will spread in the population if enough organisms with an above-average chance of carrying that gene – that is, relatives – are made better off by the altruistic act, even if the individual organism must sacrifice its life. It should be clear that the altruistic “trait” explained in these “gene’s eye view” models is no more than a quantified disposition to act altruistically given certain initial circumstances. Such gene-centered models, offered in the nineteen sixties by Hamilton, Maynard Smith, and Williams, and assembled in the nineteen seventies in Richard Dawkins’s The Selfish Gene (1976) and Edward O. Wilson’s Sociobiology (1975), appeared to have ended the idea of group selection altogether (although Wilson did use “group selection” for his gene-centered synthesis). Finally, a single unifying model was on offer to solve Darwin’s “difficulty” with no reference to mechanisms at the level of the group.
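
Stated compactly, the condition just described is usually written as follows (this is simply the inequality from the text rearranged into its more familiar form; b, c and r are the quantities already defined above):

\[ r > \frac{c}{b} \quad \Longleftrightarrow \quad rb - c > 0 \]

where c is the fitness cost paid by the donor, b the fitness benefit conferred on the recipient, and r the coefficient of genetic relatedness between them. When the inequality holds, the expected indirect benefit to copies of the altruistic gene carried by relatives outweighs the direct cost to its bearer.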

Both these books quickly became best sellers, though not everyone accepted the gene’s eye-view, either as an actual causal process of selection (Gould, 1980, Ch. 8; Sober and Lewontin, 1982) or as a useful heuristic (Wimsatt, 2007, Ch. 4-5, which reorganize Wimsatt’s 1980 and 1981 papers). Opponents of gene selection granted that the outcomes of selection are often conveniently described in genetic terms for the purpose of “bookkeeping” the records of evolution. However, they argued, the gene’s eye-view fails to test the causes that produced such an outcome (Davidson, 2001). In other words, a model that only measures average change in gene frequency in a population may be adequate for predicting biological events yet inadequate for explaining why and how they actually occurred. These objections to gene selection are not only heuristic but also metaphysical (Agassi, 1998[1964]), since they guide one’s practice toward seeking observations of different events rather than toward describing the same events differently.

The objections to gene selection notwithstanding, throughout the heated controversy over Wilson’s, and to a lesser degree Dawkins’s, book, “group selection” was not a viable alternative (Lewontin et al., 1984). Things began to change only in the second half of the nineteen seventies, when David S. Wilson (1975), Michael J. Wade (1976, 1978), Dan Cohen and Ilan Eshel (1976) and Carlo Matessi and Suresh D. Jayakar (1976) independently reexamined the theory. D. S. Wilson is perhaps the biologist most closely associated with reviving the idea of group selection. In Wilson’s (1975) trait-group selection model, any set of organisms whose members interact in ways that affect one another’s fitness is a group, regardless of how short-lived and spatially dispersed this group is. Wilson further demonstrated that even when an altruist loses fitness relative to an egoist within every group, the variance in fitness between groups – favoring those groups with more altruists – can override the variance in fitness within each group – favoring an egoist over an altruist – and thus selection at the group level can override selection at the individual level. This variance in group fitness could be inherited in many population structures, including those required for kin selection (Maynard Smith, 1964) and reciprocity (Trivers, 1971; Axelrod and Hamilton, 1981). Thus, Wilson could show that his model incorporates seemingly competing models as instances of group selection.
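
A minimal numerical sketch may make Wilson’s point concrete. The following toy model is offered only as an illustration, not as Wilson’s own 1975 formalism; the fitness values, group sizes and the simple public-goods fitness function are assumptions chosen for clarity. Within each group altruists reproduce less than egoists, yet the altruist-rich group out-reproduces the altruist-poor one, so the global frequency of altruists rises even as it falls within every group.

    # Toy illustration of trait-group selection (hypothetical parameters).
    # Within every group an altruist is less fit than an egoist, but the
    # altruist-rich group contributes more offspring to the common pool,
    # so the global frequency of altruists can still increase.

    def next_generation(groups, b=5.0, c=1.0, base=10.0):
        """Each altruist pays cost c and produces a benefit b shared by all
        group members; returns offspring counts per type for each group."""
        new_groups = []
        for altruists, egoists in groups:
            n = altruists + egoists
            shared_benefit = b * altruists / n          # public good from altruists
            w_altruist = base + shared_benefit - c      # altruists pay the cost
            w_egoist = base + shared_benefit            # egoists do not
            new_groups.append((altruists * w_altruist, egoists * w_egoist))
        return new_groups

    def altruist_frequency(groups):
        total_altruists = sum(a for a, e in groups)
        total = sum(a + e for a, e in groups)
        return total_altruists / total

    # Two groups of equal size: one altruist-rich, one altruist-poor.
    groups = [(90.0, 10.0), (10.0, 90.0)]
    print("global frequency before:", round(altruist_frequency(groups), 3))  # 0.5
    groups = next_generation(groups)
    print("global frequency after: ", round(altruist_frequency(groups), 3))  # about 0.55

In this run the altruists’ share shrinks inside each group taken separately, yet their share of the whole population grows, because the variance in fitness between the groups outweighs the variance within them.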

The models of Cohen and Eshel (1976) and of Matessi and Jayakar (1976) clearly showed how group selection might occur in nature and that it might not be rare at all. In addition to modeling, Wade (1976, 1980) conducted laboratory experiments (mainly on the red flour beetle Tribolium castaneum) that demonstrated the strong causal effects of group selection in a given population. Wade compared the evolutionary response of an inter-group selection process (that is, selection between reproductively isolated breeding groups in a population) to a process of kin selection (that is, selection between groups of relatives in a population with random mating within a common pool), to a random process (that is, selection between groups chosen at random) and to a process of individual selection (that is, selection within groups in each of these population structures). His theoretical and empirical results demonstrated the causal importance of the group selection process during evolution: when group selection was taking place, it generated an evolutionary response over and above all the other processes, one easily detectable even when individual selection or a random process promoted the same trait as group selection, that is, even when the trait affected was not altruistic (Griesemer and Wade, 1988).

Since the early nineteen eighties, philosophers of biology have become involved in the debates surrounding group selection (Hull, 1980; Sober and Lewontin, 1982; Brandon, 1982, 1990; Sober, 1984; Griesemer, 1988; Lloyd, 1988; Sober and Wilson, 1994), and gradually “group selection” (sometimes called “multi-level selection”) became a dominant view in the philosophy of science (Lloyd, 2001; Okasha, 2006). One cannot say the same about evolutionary biology, where the gene’s eye-view is still a dominant scientific perspective.

Thirty years after the publication of Sociobiology, however, E. O. Wilson revised his assessment of the importance of kinship in relation to altruism (Wilson and Hölldobler, 2005). Originally, E. O. Wilson thought the answer to “the central theoretical problem” of altruism – in humans and non-humans alike – was all about kinship (Wilson and Hölldobler, p. 3). Now Wilson argues that kin ties had a minor evolutionary effect, if any, on the evolution of high-level social organization (“eusociality”), and he commits to D. S. Wilson’s model of trait-group selection (D. S. Wilson and E. O. Wilson, 2007). This disagreement over the evolution of cooperation via group selection is still very much alive in biology and philosophy. Clarifying some of the concepts involved may help in understanding its dynamics.

3. Non-Empirical Aspects of the Debates

The concept of group selection refers to three different, albeit often overlapping, issues: the first involves selection, the second adaptation, and the third evolutionary transitions. For studying selection, it is necessary to determine whether variations in fitness and in trait frequency between groups exceed those variations within groups (Price, 1972; Sober and Lewontin 1982; Sober and Wilson, 1998), and whether this variance is a mere statistical by-product of selection acting between individuals or an actual causal effect of a selection process that took place at the group level (Sober, 1984; Okasha, 2006).
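
The comparison of between-group and within-group variation referred to here is often formalized with the multilevel version of the Price equation (Price, 1972). In one standard formulation (given only as a reminder of the formal background; notation varies across authors, and group terms are weighted by group size), the change in the population mean of a character z decomposes into a between-group and a within-group term:

\[ \bar{w}\,\Delta\bar{z} \;=\; \operatorname{Cov}(W_k, Z_k) \;+\; \operatorname{E}\big[\operatorname{Cov}_k(w_{jk}, z_{jk})\big] \]

where Z_k and W_k are the mean character and mean fitness of group k, and z_{jk} and w_{jk} are the character and fitness of individual j in group k. The first term tracks selection between groups, the second the average selection within groups; the question raised in the text is whether the first term reflects a genuine causal process at the group level or is a mere statistical by-product of the second.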

In addition, studying group adaptation requires further information on group heritability (Lloyd, 1988; Brandon, 1990; Wade, 1978, 1985; Okasha, 2006): whether and how the average trait in a daughter group resembles the average trait in its mother group more than it resembles the population mean. Is this statistical resemblance between mother and daughter groups, if found, a result of random or of group-structured mating in the population? Is it regularly expected in a given population structure, or is it a product of chance, in the sense of an irregular event?

The third issue concerns how the evolutionary transition from solitary organisms to social groups occurred (Maynard Smith and Szathmáry, 1995; Jablonka and Lamb, 2005). That is, it concerns how various cooperative adaptations have combined to bring about systematic altruism, so that individuals have lost their independent reproduction and mate only within the larger encompassing whole or social group. In this third type of question, one cannot assume that a group structure already exists in the population in order to explain the evolution of altruism within such a population – as did Darwin and many others – nor even assume that a gene for altruism already exists – as did Hamilton and many others; rather, one must explain how societies, phenotypes and genotypes emerge and co-evolve (Griesemer, 2000).

The notions of group selection and group adaptation both rely upon the meaning of a “unit of selection.” A unit of selection shows phenotypic variance, fitness variance, and heritability of traits relating to fitness (Lewontin, 1970). Lewontin has shown that multiple structural units – for example, genotype, organism, and group – could satisfy the conditions of a unit of selection. However, the function of a unit of selection is still under conceptual dispute. Is the function of a unit of selection to replicate itself from generation to generation (Dawkins, 1976), or is it to interact with its environment in a way that causes differential reproduction (Hull, 1980)? Focusing on the function of a unit of selection as a replicator means that the gene is the “real” or major unit of selection, since an organism that reproduces sexually replicates only one half of its traits on average, and a group that splits into daughter groups has an even smaller chance of replicating its trait – for example, its frequency of altruists – to the next generation of groups. Alternatively, viewing the “unit of selection” as an interactor means that a single gene cannot be a unit of selection; only whole genomes (that is, individual organisms) and perhaps groups could function as such units.

But must one choose a single perspective for explaining the evolution of altruism? Kitcher, Sterelny and Waters (1990) argue for a pluralist view on which several equally adequate models can be used to represent the same facts. Kerr and Godfrey-Smith (2002) develop this pluralistic view into a mathematical representation that fully translates the unit of the group from a mere background for its individuals – that is, “group” as “contextual” – to an emergent unit in its own right – that is, “group” as a “collective” – and vice versa. The advantage of pluralism in this case is that one need not decide which process actually took precedence – for example, group selection or individual selection – in explaining and predicting the evolution of altruism.

Yet pluralism comes at a price if one wishes to understand the evolution of altruism via its causal evolutionary process. In the history of science, the translatability of competing models relative to a body of empirical knowledge has repeatedly prompted scientists and philosophers to search for additional observations and/or experiments that would “break the tie” and decide which model to uphold (Agassi, 1998[1964]). In the debates over the evolution of altruism, Lloyd (2005), Wimsatt (2007, Ch. 10) and Griesemer (2006) argue that in most cases – or at least in the interesting cases where a causal process might be operating at the level of the group – interchangeable abstract models require one to minimize empirical details about population structure and dynamics, details which are necessary for confirming one’s evolutionary explanation. These disputes over the relevant function of a unit of selection, or over the plurality of representation, have been at the center of the philosophical debates over group selection for several decades.

4. Empirical Aspects of the Debates

Semantic disputes notwithstanding, whether or not groups in a certain population actually show heritable variance in fitness is an empirical question (Griesemer 2000). Since Wade has already demonstrated the noticeable evolutionary effects of group selection, whether or not the population is in fact divided into social groups with heritable variance in fitness should be tested in each case, prior to describing these entities as “replicators” or “interactors,” “contextual backgrounds” or “emergent collectives.”

Brandon (1990, pp. 98–116) reviewed the empirical criteria for a process of group selection to take place: when there is no variance in group fitness, or when the variance in group fitness does not depend on group structure (for example, when group differential reproduction is independent of the relative frequency of altruists in the group, but instead depends on the frequency of hurricanes in its environment), a process of selection between groups cannot occur. When both individual selection and group selection processes affect a trait, selection within groups is more effective when the variance in the fitness of individuals within each group exceeds the variance in mean fitness between groups, or when the variance in a group trait is not heritable.

“Group trait” in this context need not be a unique holistic trait but can be the mean phenotype of individuals in that group; similarly, “group fitness” is the mean fitness of individuals within a group relative to the mean fitness of another group; and “group heritability” traces phenotypic variation among parent-offspring lineages of groups: if the trait of a daughter group resembles that of its mother group significantly more than it resembles the population mean, then realized group heritability is non-zero. This “group trait” describes an individual’s trait within the context of a group-structured population (Heisler and Damuth, 1987), which leads Maynard Smith and Williams to argue that this is not a group trait at all, or that describing it as an individual trait is more useful (cf. Okasha, 2006, p. 180). Whatever the verdict on the characterization of “group trait” and “group fitness,” there is an empirical dimension to a selection process at the level of the group, and empirical criteria for testing such a process are available. One might therefore expect multiple field and laboratory tests of the existence of group selection. Natural and laboratory tests do exist (Goodnight and Stevens, 1997), yet the common practice in these debates invests relatively little in empirical study. The next section attempts to describe this practice and suggest a rational explanation for it.
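
One common way of operationalizing the realized group heritability just mentioned (offered here only as an illustration, not as a definition drawn from the works cited) is the regression of daughter-group means on mother-group means, by analogy with the parent-offspring regression used for individual-level traits:

\[ h^2_{\text{group}} \;\approx\; \frac{\operatorname{Cov}(\bar{z}_{\text{daughter}}, \bar{z}_{\text{mother}})}{\operatorname{Var}(\bar{z}_{\text{mother}})} \]

A slope near zero means that daughter groups resemble the population mean rather than their mother groups; a substantial positive slope is the non-zero realized group heritability the text describes.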

5. Practices in the Debates: Sociobiology

One of the most revealing examples of the practice in the debates over group selection is a recent exchange between Wilson, the author of Sociobiology, and Dawkins, the author of The Selfish Gene, who used to employ similar selection models but now deeply disagree over the role of group selection in the evolution of eusociality.

Both sides declare that their models are translatable (Wilson and Wilson, 2007; Dawkins, 1982, p. 1), that is, that each can agree with any set of data the other agrees with. If this disagreement were purely about terminology, one would expect the scientific community to gradually lose interest in it. This has not happened. Another possibility is that the models agree with all the data but differ greatly in their heuristic value. In that case, one would expect many methodological comparisons of model performance – for example, comparisons of the models’ precision, generality, accuracy, complexity, and/or elegance – for various species and social phenomena in the lab and in nature. Yet these are not a central part of the debate either (Sober and Wilson, 1998). Rather, it seems there is no “given” phenomenon both sides use; instead, the disputants clash over how to define or describe the phenomenon the models attempt to fit. In short, they disagree over what it is that we see when several ants walk by.

For Wilson and Wilson (2007), as in earlier work by Sober and Wilson (1989), a “group” is any aggregate of individuals that is small compared to the total population to which they belong and whose members non-randomly interact in a way that affects each other’s fitness. This is an extremely abstract understanding of what constitutes a group: one that fits many kinds of cases and is almost completely unconstrained by any particular population structure, dynamic, duration or size. Nor does it require groups to multiply as anything like cohesive wholes in order to acquire heritable variance in fitness. Indeed, such a broad definition of “group” is central to Wilson and Wilson’s definition of “group selection”: “the evolution of traits based on the differential survival and reproduction of groups” (Wilson and Wilson, p. 329). Such a group selection model need not differ empirically from the similarly broad definition of “kin selection”: “selection affected by relatedness among individuals” (Foster et al. 2005, p. 58). “Relatedness” here does not refer only to family descent but to an index of comparison between any set of individuals, including strangers, from the same species.

As with Wilson and Wilson’s “group selection,” no particular population structure constrains Foster et al.’s application of “kin selection.” The difference between the models lies in model structure: whereas the group selection model partitions the overall selection in the population into “within-group selection” and “between-group selection” components, the alternative models – for example, kin selection, reciprocity, indirect and network reciprocity (Nowak, 2006) – do not employ such a partition, since in these models whatever enhances group fitness always enhances the inclusive fitness of each individual (or rather what Dawkins “only partly facetiously” describes as “that property of an individual organism which will appear to be maximized when what is really maximized is gene survival” (Dawkins, 1982, p. 187)).
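
To make the contrast in model structure concrete, here is a minimal sketch of how the two components of the group selection model can be computed from data. The individual trait and fitness values are hypothetical, and equal group sizes are assumed so that the group-size weighting of the Price partition can be ignored; the sketch simply evaluates the between-group and average within-group covariance terms introduced earlier.

    # Sketch: partitioning selection into between-group and within-group
    # components, as in the multilevel Price partition (hypothetical data).
    from statistics import mean

    def cov(xs, ys):
        mx, my = mean(xs), mean(ys)
        return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

    # Each group is a list of (trait, fitness) pairs; trait 1 = altruist, 0 = egoist.
    # Groups are of equal size, so simple (unweighted) averages suffice.
    groups = [
        [(1, 1.3), (1, 1.2), (0, 1.4)],   # altruist-rich group, high mean fitness
        [(1, 0.8), (0, 1.0), (0, 1.1)],   # altruist-poor group, low mean fitness
    ]

    group_trait_means = [mean(z for z, w in g) for g in groups]
    group_fitness_means = [mean(w for z, w in g) for g in groups]

    between = cov(group_fitness_means, group_trait_means)   # selection between groups
    within = mean(cov([w for _, w in g], [z for z, _ in g]) for g in groups)  # average within groups

    print("between-group component:", round(between, 4))    # positive: altruist-rich group does better
    print("within-group component: ", round(within, 4))     # negative: altruists do worse inside each group

In the alternative models described above, the same data would instead be summarized by a single inclusive-fitness quantity, with no such partition into levels.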

This theoretical difference in model structure does not necessarily single out different causal factors, since the context that can affect the frequency of altruists – population structure and ecology – can be captured by both Wilson’s and Dawkins’s models (Wilson, 2008; Foster et al., 2005) and does not constrain either of them. So, argue Foster et al. (2005), if Wilson’s new group model does not generate facts unattainable otherwise, why accept his definition of “kin selection” rather than Maynard Smith’s original 1964 definition, “the evolution of characteristics which favor the survival of close relatives of the affected individual” (Maynard Smith, 1964, p. 1145)? Yet Wilson asks in return, why not go back to Darwin’s explanation of group selection? Thus the debate again seems to be over terminology, this time with a historical twist.

But why should biologists care (as they obviously do)? If the disagreement were mainly about choosing among interchangeable perspectives on the same phenomenon – a choice based on personal taste, historical usage, or the heuristic value of each model – one would expect the scientific debate to gradually dissolve in the first two cases and to become pragmatically or methodologically based in the third. Since the debate has neither dissolved nor turned pragmatic, and since one can plausibly assume that it is a rational one, the best remaining explanation is that Wilson and Dawkins disagree over semantics because each hopes that his concepts and models refer to different evolutionary processes in the world. To use Dawkins’s terms, even when modestly arguing over the flipping picture we receive from a Necker cube (Dawkins, 1982, p. 1), the non-modest aim remains to decipher the picture we see from an East African mountain: whether the small spots below are insects or buffalos (Dawkins, p. 7).

When Wilson looks at a social group he sees a unit that is a target of selection, while Dawkins sees an illusory by-product of a different selection process acting at a single level of organization: gene selection. They disagree the way they do because they aim at representing empirical facts accurately, but since both sides employ overly broad definitions of “group,” “group selection” and “kin selection,” it becomes very difficult to identify a specific fact, for example a particular population dynamic or structure, against which to test these models in a particular case (Shavit, 2005). In short, Wilson’s and Dawkins’s concepts might be too broad to hold enough empirical content to advance the scientific debate over the evolution of altruism by group selection.

Not all supporters of group selection use such broad concepts. Wade (1978, 1985) defined “group selection” and “kin selection” in accord with different population structures, so his constrained models could clearly refer to distinct selection processes, which he and his colleagues then compared in the lab and in the field. Both Dawkins and Wilson may object that Wade’s definitions are too narrow. They would be right in the sense that his definitions do not cover many kinds of cases, yet that does not imply that they do not cover many cases. They do (for example, Wade and Goodnight, 1998 on various taxa; Avilés, 1997 on spiders). It seems that such narrow definitions – those that restrict the kinds of cases – readily supply empirical tools for determining what is and is not happening in a given population, whereas the broad definitions used by Dawkins and Wilson are more likely to lead their users to talk past each other without resolution. Nonetheless, the use of broad concepts seems to dominate the field, perhaps partly due to the political images and memories that everyday terms such as “altruism,” “group” and of course “selection” carry into science from society at large (Shavit, 2008). Employing social metaphors laden with multiple conflicting meanings began with Darwin, and, ever since, explaining the evolution of altruism by group selection has stubbornly remained “one special difficulty” (Darwin, 1859, p. 236).

6. References and Further Reading

  • Agassi, J.: 1998 [1964], “The Nature of Scientific Problems and Their Roots in Metaphysics,” in Bunge, M. (ed.): The Critical Approach: Essays in Honor of Karl Popper, Free Press, New York 189-211.
  • Avilés, L.: 1997, “Causes and Consequences of Cooperation and Permanent-Sociality in Spiders,” in Choe, J. C. and Crespi, B. J. (eds.): The Evolution of Social Behavior in Insects and Arachnids, Cambridge University Press, Cambridge.
  • Axelrod, R. and Hamilton, W. D.: 1981, “The Evolution of Cooperation,” Science 211, 1390–1396.
  • Mills, S. K. and Beatty, J. H.: 1979, “The Propensity Interpretation of Fitness,” Philosophy of Science 46, 236–286.
  • Brandon, R.: 1982, “The Levels of Selection,” in P. Asquith and T. Nickles (eds.), PSA 1982, Vol. 1, Philosophy of Science Association, East Lansing, MI, 315-323.
  • Brandon, R.: 1990, Adaptation and Environment, Princeton University Press, Princeton, New Jersey.
  • Cohen, D. and Eshel, I.: 1976, “On the Founder Effect and the Evolution of Altruistic Traits,” Theoretical Population Biology 10, 276–302.
  • Darwin, C.: 1859, On the Origin of Species, The Heritage Press, New York, 1963.
  • Darwin, C.: 1871, The Descent of Man, The Heritage Press, New York, 1972.
  • Davidson, D.: 2001, Essays on Actions and Events, Clarendon Press, Oxford.
  • Dawkins, R.: 1976, The Selfish Gene, Oxford University Press, Oxford.
  • Dawkins, R.: 1982, The Extended Phenotype, Oxford University Press, Oxford.
  • Foster, K. R., Wenseleers, T. and Ratnieks, F. L. M.: 2006, “Kin Selection is the Key to Altruism,” Trends in Ecology and Evolution 21, 57–60.
  • Frank, S. A.: 1998, Foundations of Social Evolution, Princeton University Press, Princeton, New Jersey.
  • Gould, S. J.: 1980, The Panda’s Thumb, W.W. Norton & Company, New York.
  • Griesemer, J.: 2000, “The Units of Evolutionary Transition,” Selection 1, 67–80.
  • Griesemer, J. and Wade, M. J.: 1988, “Laboratory Models, Causal Explanation and Group Selection,” Biology and Philosophy 3, 67–96.
  • Hamilton, W. D.: 1964, “The Genetical Evolution of Social Behavior. I,” Journal of Theoretical Biology 7, 1–16.
  • Hamilton, W. D.: 1964b, “The Genetical Evolution of Social Behavior. II,” Journal of Theoretical Biology 7, 17–52.
  • Heisler, I. L. and Damuth J.: 1987, “A Method for Analyzing Selection in Hierarchically Structured Populations,” American Naturalist 130, 582–602.
  • Hull, D.: 1980, “Individuality and Selection,” Annual Review of Ecology and Systematics 11, 311–332.
  • Jablonka E. and Lamb M.: 2005, Evolution in Four Dimensions, M.I.T. Press, Cambridge Massachusetts.
  • Keller, E. F.: 1988, “Demarcating Public from Private Values in Evolutionary Discourse,” Journal of the History of Biology 21, 195–211.
  • Kitcher, P., Sterelny, K. and Waters, C. K.: 1990, “The Illusory Riches of Sober’s Monism,” The Journal of Philosophy 87, 158–161.
  • Kropotkin, P.: 1902, Mutual Aid, Heinemann, London.
  • Lack, D. L.: 1956, Swift in a Tower, Methuen, London.
  • Lewontin, R. C.: 1970, “The Units of Selection,” Annual Reviews of Ecology and Systematics 1, 1–17.
  • Lewontin R. C., Rose S. and Kamin L.: 1984, Not in our Genes, Pantheon, New York.
  • Lloyd, E. A.: 1988, The Structure and Confirmation of Evolutionary Theory, second ed., Princeton University Press, Princeton, 1994.
  • Lloyd, E. A.: 2001, “Units and Levels of Selection: An Anatomy of the Units of Selection Debates,” in R. S. Singh et al. (eds.), Thinking About Evolution, Cambridge University Press, Cambridge.
  • Lloyd, E. A.: 2005, “Why the Gene Will Not Return,” Philosophy of Science 72, 287–310.
  • Lustig, A. J.: 2004, “Ant Utopias and Human Dystopias Around World War I,” in F. Vidal and L. Daston (eds.), The Moral Authority of Nature, University of Chicago Press, Chicago.
  • Matessi, C. and Jayakar, S. D.: 1976, “Conditions for the Evolution of Altruism Under Darwinian Selection,” Theoretical Population Biology 9, 360–387.
  • Maynard Smith, J.: 1964, “Group Selection and Kin Selection,” Nature 201, 1145–1147.
  • Maynard Smith, J. and Szathmáry, E.: 1995, The Major Transitions in Evolution, W. H. Freeman, Oxford.
  • Mitman, G.: 1992, The State of Nature, University of Chicago Press, Chicago.
  • Nowak, M. A.: 2006, “Five Rules for the Evolution of Cooperation,” Science 314, 1560–1563.
  • Okasha, S.: 2006, Evolution and Levels of Selection, Oxford University Press, Oxford.
  • Popper, K. R.: [1945] 2006, The Open Society and Its Enemies, Routledge, London.
  • Segerstråle, U.: 2000, Defenders of the Truth, Oxford University Press, Oxford.
  • Shavit, A.: 2005, “The Notion of ‘Group’ and Tests of Group Selection,” Philosophy of Science 72, 1052–1063.
  • Shavit, A.: 2008, One for All? Facts and Values in the Debates over the Evolution of Altruism, The Magnes Press, Jerusalem.
  • Simpson, G. G.: 1941, “The Role of the Individual in Evolution,” Journal of the Washington Academy of Sciences 31, 1–20.
  • Sober, E.: 1984, “Holism, Individualism, and the Units of Selection,” in E. Sober (ed.), Conceptual Issues in Evolutionary Biology, M.I.T. Press, Cambridge, Mass., pp. 184–209.
  • Sober, E. and Wilson, D. S.: 1998, Unto Others, Harvard University Press, Cambridge, Mass.
  • Trivers, R.: 1971, “The Evolution of Reciprocal Altruism,” Quarterly Review of Biology 46, 35–57.
  • Wade, M. J.: 1976, “Group Selection Among Laboratory Populations of Tribolium,” Proceedings of the National Academy of Sciences 73, 4604–4607.
  • Wade, M. J.: 1978, “A Critical Review of the Models of Group Selection,” Quarterly Review of Biology 53, 101–114.
  • Wade, M. J.: 1980, “An Experimental Study of Kin Selection,” Evolution 34, 844–855.
  • Wade, M. J.: 1985, “Soft Selection, Hard Selection, Kin Selection and Group Selection,” American Naturalist 125, 61–73.
  • Wade, M. J. and Goodnight, C. J.: 1998, “The Theories of Fisher and Wright in the Context of Metapopulations: When Nature Does Many Small Experiments,” Evolution 52, 1537–1553.
  • Williams, G. C.: 1966, Adaptation and Natural Selection, Princeton University Press, Princeton.
  • Williams, G. C.: 1989, “A Sociobiological Expansion of Evolution and Ethics,” in J. Paradis and G. C. Williams (eds.), Evolution and Ethics, Princeton University Press, Princeton, New Jersey.
  • Wilson, D. S.: 1975, “A General Theory of Group Selection,” Proceedings of the National Academy of Sciences 72, 143–146.
  • Wilson, D. S. and Wilson, E. O.: 2007, “Rethinking the Theoretical Foundation of Sociobiology,” The Quarterly Review of Biology 82, 327–348.
  • Wilson, E. O.: 1975, Sociobiology, Harvard University Press, Cambridge, Massachusetts.
  • Wilson, E. O. and Hölldobler, B.: 2005, “Eusociality: Origin and Consequences,” Proceedings of the National Academy of Sciences 102, 13367–13371.
  • Wimsatt, W. C.: 2007, Re-Engineering Philosophy for Limited Beings, Harvard University Press, Cambridge, Massachusetts.
  • Wynne-Edwards, V. C.: 1962, Animal Dispersion, in Relation to Social Behaviour, Oliver and Boyd, Edinburgh, Great Britain.

Author Information

Ayelet Shavit
Email: ashavit@telhai.ac.il
Tel Hai Academic College
Israel

Medieval Theories of Practical Reason

Practical reason is the employment of reason in service of living a good life, and the great medieval thinkers all gave accounts of it. Practical reason is reasoning about, or better, reasoning toward, an action, and an action always has a goal or end, this end being understood to be in some sense good. The medievals generally concurred that practical reasoning was always in some way directed toward the agent’s ultimate goal or final end (although there were important differences in how the agent’s relation to the final end was conceived).

In every medieval account, we find important roles for the intellect and the will—for the intellect in identifying goods to be honored and pursued, and for the will in tending toward such goods. Medieval accounts always paid attention to the relationship between practical reason and the moral trinity of happiness, law, and virtue. Perhaps the most important difference between these accounts is that some philosophers assign primacy to the intellect but others assign it to the will. This difference has led historians to identify schools of thought called intellectualism and voluntarism.

This article traces some of the main lines of medieval thought about practical reason, from its roots in Aristotle and Augustine through some of its most interesting expressions in Aquinas and Scotus, the ablest exponents, respectively, of intellectualism and voluntarism. The article points out the important differences among these theorists, but it also highlights the themes common to all the medievals, and it indicates some points of contact with contemporary work on practical reason, including debates about particularism and internalism.

Table of Contents

  1. Precursors: Aristotle and Augustine
    1. Aristotle
    2. Augustine
    3. Intellectualism and Voluntarism
  2. Intellectualist Theory: Aquinas
    1. The Interaction of Intellect and Will in Generating Action
    2. The Practical Syllogism
    3. Happiness, Law, and Virtue
    4. Final Comments
  3. Voluntarist Theory: Scotus
    1. Freedom of the Will
    2. Synderesis, Conscience, and the Natural Law
    3. The Non-Teleological Character of Scotus’ Thought
    4. Note on Ockham
  4. Medieval and Modern
    1. The Current Influence of Aquinas and Scotus
    2. The Medievals and Particularism
    3. The Medievals and Internalism
  5. Conclusion: Common Themes among the Medievals
  6. References and Further Reading
    1. Primary Sources
    2. Secondary Sources

1. Precursors: Aristotle and Augustine

The two most important influences upon medieval thought about practical reason were Aristotle and St. Augustine, and this first section identifies a few of the key ideas they bequeathed to their successors.

a. Aristotle

Aristotle’s theory is teleological and eudaimonist: All action is undertaken for an end, and our proximate ends, when we act rationally, form a coherent hierarchical structure leading up to our final end of eudaimonia (happiness, flourishing). Although we presuppose rather than reason about our final end formally considered—it is that which we pursue for its own sake, and for the sake of which we pursue all else; it is that which makes life worthwhile—practical reason does help us work out the correct way to think about just what that final end is, and about how to move toward it. Reason does this by means of the practical syllogism: The major premise identifies the end, some good recognized as worthy of pursuit; the minor premise interprets the agent’s situation in relation to the end; the conclusion is characteristically a choice leading directly to action that pursues means to the end (for example, Some pleasant relaxation would be good right now; reading this novel would be pleasant and relaxing; I shall read it (and straightaway I commence reading)). The work practical reason does in formulating the minor premise and identifying the means is called deliberation. While we cannot deliberate about the end identified in the major premise as an end, we can deliberate about it under its aspect as a means to some further end. Thus practical reason can (although seldom will it explicitly do so in practice) take the form of a chain of syllogisms, with the major premise of the first identifying the final end to be pursued, and the conclusion both identifying the means to that end and supplying the major premise of the next (now serving as a proximate end), until we finally reach down to something to be done here and now (the means to the most proximate end). Here is a compressed example: I should flourish as a human being, and my flourishing requires the practice of civic virtue, so I should practice civic virtue; I should practice civic virtue, in my circumstances civic virtue requires me to enlist in the army to defend my city, so I should enlist; I should enlist, and here is a recruiter to whom I must speak in order to enlist; I choose to speak to the recruiter.

Notice that in this syllogism the premises do not mention desire—the majors do not state “I want X,” but rather that X is a good to be pursued. Yet the conclusion does mention desire, or rather is a desire (for that is what choice is, deliberated desire). This is not an oversight on Aristotle’s part. Although he holds that reason and desire work together to produce action, he insists that desire naturally tends to what cognition identifies as good—as he puts it at Metaphysics 1072a29, “desire is consequent upon opinion rather than opinion on desire, for the thinking is the starting point.” Reason serves as the formal cause of action by identifying the actions (determining what “form” our actions should take) leading to the apprehended good, which is the final cause or end of action; desire serves as the efficient cause, putting the man in motion toward the end. So when a prospective end is recognized as good, a desire for it follows. The practical syllogism serves to transmit the desire for the end identified by reason as good down to means identified by reason as the appropriate way to the end.

Yet, because cognition includes sense perception, things other than those identified by reason can be presented to desire as good (as any dieter knows when offered dessert). This allows Aristotle to propose a solution to the problem of akrasia or “weakness of will,” the choosing of something we know to be bad—to put it crudely, we know it is bad, but it looks good. For reasoning to be effectively practical, and for practice to be rational, the desires must be in line with reason; for the desires to be consistently in line with reason, the moral virtues, which “train” the emotions to bring them into line with reason, are necessary. When the moral virtues, together with prudence, are present, Aristotle takes it that reasoning well and acting accordingly will follow naturally (we can speak of virtue as “second nature”).

b.  Augustine

The idea of virtuous action becoming natural is one of the points on which Augustine will disagree with Aristotle. He learns from his own experience (for example, in his robbing of the pear tree recounted in Confessions II) and from his reflections on the sin of the angels (see On Free Choice of the Will III) that the will can choose what the intellect rejects. Although the intellect is required for willing in the sense that it presents objects as good to the will, willing has no cause other than the will itself. Augustine, unlike some later Augustinians, is a eudaimonist, seeing our final end as eternal life in peace, that is, in right relation to and enjoyment of God (see The City of God XIX). Yet it should be noted that, drawing on his own experience and the writings of St. Paul, he identifies “two loves” of the will, love of God and love of self, and holds that the struggle between these two for ascendancy is the key to each human life, and indeed to history. No trace of such a struggle is to be found in Aristotle; nor is there any such role for faith as we find in Augustine. Both in Confessions XI and in The City of God XIX Augustine chronicles the woes of temporal human existence, and the impossibility of finding peace, our final end, during our life on earth. It is thus in some sense reasonable for us to turn humbly to faith in God as our only hope for salvation. This turning, or conversion, requires an act of willed submission to God. Only after this can the intellect know, by faith, the true character of our final end, and thus only after such willing can practical reason become truly informed as to how to act. The need for conversion brings one more un-Aristotelian idea into the picture, that of obedience to divine law.

c.  Intellectualism and Voluntarism

Aristotle’s account of practical reason could be characterized as intellectualist, not because he ignores the very important role of desire, but because reason plays the leading role, and desire is naturally inclined to follow reason (“desire is consequent upon opinion … for the thinking is the starting point”). Further, although Aristotle employs the concept of rational wish, there is serious debate as to whether this can rightly be identified with what the medievals, following Augustine, call the will. By contrast, Augustine may be termed a voluntarist, not because reason is unimportant, but because with him it is the will that plays the primary role. As we have seen, even in the absence of passion, the will may choose contrary to the judgment of the intellect, and it is only by willed humility that we can come to know our true final end by faith.

Throughout much of the Christian Middle Ages, Augustine’s influence predominates. And although much important work was done on topics highly relevant to practical reasoning—for example, passages in Peter Lombard’s Sentences, and the work of St. Anselm on the will and of Abelard on ethics—practical reasoning itself was not generally treated in a rigorous and systematic way. But in the twelfth century, translations of Aristotle’s works, together with Muslim and Jewish commentaries, began to flow into Western Europe, and to gain in influence, eventually rivaling or surpassing the importance of Augustine’s thought. These thinkers do treat practical reasoning in rigorous fashion, and under their influence, so too do the great thinkers of the High Middle Ages. In doing so, all draw on both Aristotle and Augustine, and although it is common practice to identify some as “Aristotelians” and “intellectualists,” and others as “Augustinians” and “voluntarists,” this does run the risk of oversimplifying. The reader should keep in mind that there is no one account of the relation between intellect and will that all intellectualists held, nor one opposed account that all voluntarists held. Instead, scholars sort thinkers according to whether they hold certain characteristic theses concerning such questions as these: Is the intellect or the will the higher power? Is the will a passive power (a “moved mover”) or an active one (a “self-mover”)? What sort of cause does the intellect exert on the will’s choice—does it specify the act of will, or can the will act independently and control its own choices (and can it act contrary to judgment)? A metaphor commonly used by those now classified as voluntarists was that of the Lord and the Lampbearer: The will is the lord, deciding where to go; the intellect contributes to the decision, but in the same manner as the servant who lights the way (or rather the possible ways) with a lamp (see for example Henry of Ghent, Quodlibet Iq14). Intellectualists, by contrast, would see the intellect as the lord, and the will as the lieutenant or executive officer.

In the intellectualist camp we can probably include St. Albert (see the first McCluskey entry for a discussion) and John of Paris; in the voluntarist camp, St. Bonaventure and Henry of Ghent. Others, such as Giles of Rome, occupy a position in the disputed middle ground (see Kent for an intellectualist reading of Giles; Eardley for a moderately voluntarist reading). The following sections will focus on the two figures who are arguably the most important and influential thinkers of the High Middle Ages, taking Aquinas as a representative of intellectualism, and Scotus as a representative of voluntarism. But it should be kept in mind that Aquinas treats Augustine as an authority and has a much more robust conception of the will than does Aristotle, and likewise that Scotus draws heavily upon Aristotle and insists upon a very important role for the intellect.

2.  Intellectualist Theory: Aquinas

Like both Aristotle and Augustine, St. Thomas Aquinas (1225-1274) is a eudaimonist; like Augustine he takes seriously both obedience to divine law and the role of the will in the genesis of action; yet like Aristotle he is an intellectualist. (This is generally accepted, but it should be noted that some scholars have argued for somewhat more voluntarist readings of Aquinas than that offered below. See Eardley and Westberg for sources, discussion, and criticism of these interpretations.) For Aquinas, practical reasoning plays out in a dynamic exchange between intellect and will, an exchange in which intellect always has the first word (reason being the first principle of human action), but in which the will plays a key role and the agent remains free.

a.  The Interaction of Intellect and Will in Generating Action

For Aquinas, the will tends naturally toward the good, but to act it must have the good presented to it by reason in its practical capacity. Further, after apprehending and willing the good, the agent must decide whether and how to pursue it, which involves a process of collaboration between intellect and will. Let us begin with an example, making use of Ralph McInerny’s immortal character, Fifi LaRue. In the midst of a bad day, Fifi sees a travel poster advertising a Roman holiday, apprehends “how nice that would be,” and forms a wish to go. She considers the idea as befitting, and enjoys it. Nothing seems to stand in the way; the trip would be delightful and cause no problems; she forms the intention to go. But she must take counsel as to how she could accomplish it. Due to time constraints, she must fly, but could take a bus or taxi to the airport; she consents to both. Yet the bus would be so crowded … let it be the taxi then, she judges, and so chooses. Here is a taxi; she must hail it by raising her arm. So she commands, and so uses her arm. The taxi pulls up, and off she goes.

This example involves the steps and terms Aquinas spells out in questions 8-17 of the prima secundae (the first part of the second part of the Summa theologiae), and we should now look at some of the details of this complex discussion: The intellect apprehends something as good and thereby presents it to the will, which then wills or wishes that good as an end—call this simple willing. (Strictly speaking, it would be more proper to say the agent apprehends the good by means of her intellect and simply wills it by means of her will; this is always what Aquinas means, although for convenience he often speaks of the intellect apprehending and so forth.) This does not yet mean that the agent pursues the good; she may decide not to for a variety of reasons—perhaps it is pleasant but sinful, and she immediately rejects it—or may be as yet undecided. She may then continue to consider the good, apprehend it as befitting in some ways, and, in a second act of will regarding the possible end, enjoy it (while we perfectly enjoy only an end possessed, we may imperfectly enjoy or entertain the idea of possessing it). Again, actual pursuit need not follow—perhaps the good is befitting but not currently feasible (Fifi, perhaps, lacks the money). Finally, the agent may actually undertake to pursue this good as an end, to tend toward it, and this act of will Aquinas calls intention (and here again, Aquinas is explicit that an act of reason precedes this act of will; cf. q12a1ad3).

Now intending the good as an end, the agent must determine how best to pursue it—she must decide upon means to the end. When the means are not immediately obvious, the agent deliberates or takes counsel, in which reason seeks out acceptable ways to the end; such ways being found the will then consents to them. Reason must then issue a judgment (q14a1) as to which is preferable, followed by the act of will called choice (q13, q15a3ad3). So, Fifi took counsel as to how to reach the airport, identifies and accepts two ways (bus or taxi), then judges the taxi superior and so chooses that means. But in considering how to get from America to Rome, she is able to skip the counsel/consent stage because the means (flying) are immediately obvious (she has no time for sailing).

The choice having been made, it is time to execute. Here again we see the same pattern of an act of intellect, command, followed by an act of will, use, whereby the will employs faculties of the soul, parts of the body, or material objects to make the choice effective. So when the taxi draws near, Fifi sees that she must wave, and commands “this (waving) is to be done.”  This command informs, or gives exact shape to, her already present will to take a taxi (her choice). Her will then uses her arm, puts it in motion.

Now the process described is a complex one, having as many as twelve steps from the initial apprehension of a good down to use. Do we really go through all of this? Aquinas does not mean that we consciously rehearse all the steps every time we perform an action (just as we do not consciously rehearse rules of grammar in articulating a thought). The twelve-step process is a logical reconstruction of the role of intellect and will in generating action. The steps are those we could consciously rehearse, and perhaps sometimes do (if facing a complicated matter, say, or if doggedly pressed for an explanation or justification of a past action). Usually, our actual practical reasoning will be much more concise. Daniel Westberg and others have argued that we should understand Aquinas to have in mind a streamlined version of the process centered around intention (apprehension and intention), decision (judgment and choice), and execution (command and use), with intellect and will working in unison at each stage. Other acts mentioned by Aquinas, such as counsel and consent, may serve auxiliary roles in complex situations.

Westberg stresses that we should not take Aquinas to mean that at each stage intellect renders its judgment and then the will decides whether or not to follow it—as we will see, this is the way of the voluntarists. Instead, the will naturally tends toward the good presented to it by the intellect at each stage. So for example in discussing whether choice is an act of intellect or of will, Aquinas says choice “is materially an act of the will, but formally an act of the reason” (q13a1)—roughly, the intellect in presenting some particular thing or action as good “forms” or makes specific the will’s general tendency toward the good (Aquinas follows Aristotle in maintaining that, like substances, accidents, including actions, can be analyzed in terms of form and matter). It is because the act of choice is completed by the will (judgment alone is not yet choice) that Aquinas is prepared to call it an act of will. Yet there is a real sense in which the stage Westberg calls “decision” comprises one act of the reasoning agent, an act whose form derives from reason and whose matter is supplied by will.

Voluntarists will charge that here the intellect is determining the will, which is thus not free. Now Aquinas calls that free which “retains the power of being inclined to various things” (Iaq83a1); a subject is free if it has this power. A rock is not free, since it can be inclined to heat or to chill only by the external power of fire or ice; it does not itself retain that power. Aquinas’s implied response to the voluntarist charge in the course of his discussion of choice is that the act of choice is free because the judgment that forms it is free, and the judgment is free because in considering any particular good, reason can focus on how it is good or on how it is lacking in goodness, leading to a judgment for or against it (q13a6). Worth noting, too, is that the will (and those other affective powers, the passions) play a role in attracting or diverting the attention of reason during the counsel it takes prior to judgment. But Aquinas’s more complete response would be that, strictly, it is not the will or reason that is free; the person is free in making the judgment and thus in making the choice the judgment informs. The intellect does not make the judgment, the person—the willing and feeling, as well as thinking, person—makes it by means of his intellect. The person is the subject that “retains the power” of, say, sitting closer to or further from the fire and thus being hot or cold; he exercises this power by means of his faculties of reason and will.

All of this shows how things can in many ways be more complicated, and less mechanical, than the initial description of Fifi’s pursuit of a Roman holiday suggested. One especially important factor, just touched on, is the reflexivity of both intellect and will. The will, for example, uses both intellect and itself throughout the process of deliberation (see q16a4c&ad3). In reaching her judgment, Fifi focused on the bus being crowded, but if her affections were more attuned to saving money, she might have focused instead on its economy. Further, she could at any point consider whether she should deliberate further and decide whether or not to do so. There is a potentially infinite regress here, but not an actual one. In taking counsel, having consented to taking the bus, she could yield to impatience and hop on the bus she sees rather than thinking further and realizing that a taxi would be better. Neither the bus nor the taxi, nor for that matter any other means or particular good in this life, is a perfect good. Thus none of them determine reason in its favor. Our judgment, and thus our choice, remain free. This highlights one reason Aquinas can be called an intellectualist, namely that he identifies reason as the source of freedom (see Iaq59a3: “wherever there is intellect, there is free-will”). But again, if this seems, paradoxically, to locate freedom in reason rather than will, it is well to remember that Aquinas’s talk of the intellect doing this, and the will that, is all shorthand for the person acting by means of each faculty. It is the person, not her faculties, who judges and chooses; and does both freely.

b.  The Practical Syllogism

But how does such reasoning relate to the Aristotelian notion of the practical syllogism Aquinas adopts? The intellectual acts regarding, and the pursuant intention of, the end supply the major premise (say, “I should go to Rome.”). The minor premise is supplied by deliberation, resulting in judgment and choice (“Taking a cab to the airport is the best way to Rome.”). This may take a major premise-minor premise form as above, but often the deliberation of the agent would be better represented as a longer argument with several premises, or as an iterated series of two-premise arguments finally reaching down to the concrete action. In this case, the means to the end initially chosen would then become the object of intention as a proximate end (q12a2), and counsel would be taken as to the means to that end, and so forth, until something that can be done here and now is reached (much as we saw above in the discussion of Aristotle).

Two questions present themselves at this point: What sort of reasoning goes into the formation of intentions, and how is this reasoning, and the reasoning involved in counsel, done well or ill? Sketching an answer to these questions requires a discussion of happiness, law, and virtue.

c.  Happiness, Law, and Virtue

Aquinas agrees with Aristotle that we have a final end, and with Augustine that it is not to be attained in this life (it is not a Roman holiday, unless perhaps in a very metaphorical sense). “Happiness” is a common, but potentially misleading, translation of beatitudo. Blessedness or flourishing would be better, for in fact our final end is our completion or perfection. Aquinas takes it that we all agree, or would agree upon reflection, on that. There is neither need nor room for practical reasoning about it. Yet we disagree over that in which it consists: one says wealth, another power, another (Fifi, perhaps) pleasure. And here we can reason: The mere fact that Aquinas wrote the first five questions of the prima secundae shows that he thought so. There he argues that because the will wills the good universally, and only God is universally good, our final end is attained in virtuous activity culminating in the right relation to God (although we may not know that the happiness we seek can be found only in and with God), which consists principally in loving contemplation and secondarily in obedient service. Only this perfects our nature as rational creatures. Although Aquinas agrees with Augustine that this end can be attained, or even adequately understood, only by God’s grace, Aquinas takes it that we do tend naturally (even if inadequately) toward it, and that its attainment fulfills, as well as transcends, our nature (“Grace does not destroy nature but perfects it,” Iaq1a8ad2). What reason is able to make out about our final end, then, is reliable and authoritative, even if always incomplete.

There is a long-standing controversy in Aquinas scholarship concerning the relationship between what Aquinas calls imperfect and perfect beatitude: Do we have a natural final end of humanly virtuous activity and a distinct supernatural final end of contemplation of and friendship with God? Or do we have just one final end that is naturally unattainable? Here readers are referred to Bradley for a very thorough discussion of the issues involved.

Because by our nature we have a final end, any other end we have (going to Rome, perhaps) could be reconsidered in its light, and since everything we do is (perhaps unconsciously) done for the sake of the final end (Ia-IIaeq1a6), every other good we pursue, though seen as an end, is also a means to our final end, and under this aspect can be deliberated about, evaluated, and judged appropriate or not. In this sense, ends too are objects of counsel and judgment (q14a2). Fifi might adopt the end of going to Rome capriciously, but she might also stand back and take counsel about it under its aspect of a means to her conception of her final end. That is the sort of reasoning that can go into the formation of intentions. To see how Aquinas thinks such reasoning, as well as the reasoning about means, should be done, we must look at how his discussion of the final end relates to his discussion of the natural law.

As natural creatures, we have a natural inclination (in fact, an ordered set of natural inclinations) toward our perfection as human beings. As rational creatures, we can understand and endorse these inclinations, and articulate them into principles of practical reason, which are at the same time precepts of the natural law. How so? As Pamela Hall and Jean Porter have argued, the process of articulation involves a reflective, and developing, grasp of human nature and its tendencies, including an understanding of it and them as good. This understanding is ultimately founded on the recognition that human nature is created and directed by God, Goodness itself (this recognition can be achieved, however imperfectly, by means of natural knowledge of God). This allows the articulated principles to meet the criteria of law (q90a4): They are ordinances of reason (our own, and ultimately God’s) for the common good (due to our social nature), made by Him who has care of the community (again, God), and promulgated (they are made known, or knowable, to us through our natural inclinations). So although the precepts of the natural law ultimately derive their authority from God, they can be known independently of any knowledge of God (as Bradley puts it, they are “metaphysically theonomous” but “logically autonomous”), and knowledge of them certainly does not require revelation.

Briefly setting out the inclinations and some of the precepts should illustrate this process of articulation, and at the same time give some indication of how it is connected with our pursuit of our final end. Like all things, we are naturally inclined toward our own good or perfection (good is that which all things seek), and thus as being is the first thing apprehended by reason simply, good is the first thing apprehended by reason as practical, or as directed toward action. And Aquinas takes it that, just as a grasp of the meaning of being and non-being leads naturally to knowledge of the principle of noncontradiction, so a grasp of good and evil leads to knowledge of the first principle of practical reason, good is to be done and pursued, and evil avoided: “All other precepts of the natural law are based upon this: so that whatever the practical reason naturally apprehends as man’s good (or evil) belongs to the precepts of the natural law as something to be done or avoided” (q94a2). And what do we naturally apprehend as good? Those things toward which we are naturally inclined, for good is an end and these are our ends by nature. Aquinas identifies three levels of these inclinations: That common to all substances (the inclination to continue to exist), that shared with other animals (inclinations to reproduce and to educate one’s offspring), and that proper to rational beings (to know the truth, ultimately about God, and to live in society). Phrases such as “and so forth” and “and other such things” occur in this passage, indicating that this is a quick overview rather than an exhaustive statement of the content of the natural law.

How are these inclinations articulated into precepts? This question might take the form of a procedural question concerning how we might move from an inclination to a norm (a version of the concern about moving from is to ought); this is addressed above (the inclinations are directives given by eternal reason—the natural law is a participation in the eternal law in the sense that our natural inclinations have their origin in God’s plan and creative action (q91a2)). But it might also take the substantive form of asking how we move from the inclinations mentioned to particular norms, and this needs to be explained. As we saw, Aquinas holds that as soon as we understand the meaning of the terms “good” and “evil,” we naturally understand that good is to be done and pursued and evil avoided—we have this knowledge by a “natural habit” he calls synderesis (see q94a1ad2 and Iaq79a12). We know other things in this way too: That we are to fulfill our special obligations to others, and to do evil to no one—these are elucidations of the first principle, and from them flow a number of other principles, which have also been revealed to us in the Decalogue (see Ia-IIaeq100): The command to honor one’s parents functions as a paradigm for honoring one’s indebtedness in general; the commands forbidding murder, adultery, and theft speak to refraining from doing evil to others by deed; the commands forbidding false witness and coveting speak to refraining from doing evil by word or thought.

Aquinas is not as explicit as we might wish about how we acquire this knowledge, and there is some dispute here among commentators. One question is, must we acquire it at all? Does not Aquinas say that the principles grasped by synderesis are self-evident (if that is a good translation of per se nota)? The answer is that, yes, we must acquire it, for there is no innate knowledge; synderesis is a habit and so must be acquired. We do acquire it naturally, in this sense, that once we come to understand the terms employed in the principles, the principles are naturally known to be true. Experience and reflection are needed to grasp the meaning of such terms as good and evil, the proper objects of special obligations, the scope of non-maleficence. In this process our natural inclinations play a role: life, family, social life, and knowledge are good for each, and our social nature further directs us to attend to the common good and the good of our neighbor as well as our own private good. We might sketch the process as follows (although Aquinas never puts it quite this way): Good is to be done and evil avoided. So first, since good is to be done, and special obligations indicate goods owed to others, they are to be fulfilled. Second, since evil is to be avoided, it is to be done to no one (our social inclination here coming into play); we are naturally inclined to life, family, and society, so obtaining these things is good for each and losing them evil; thus murder, adultery, and so forth are evil and so not to be done.

In any event, once we have such principles in hand, as Aquinas takes it we all do, we have also in hand a way of evaluating whether we should allow our simple willings (such as “how nice a Roman holiday would be”) to pass into intention: Would it be good or evil to go now to Rome? Is it consistent with my flourishing as a rational creature? Would it, for example, violate any special obligation I am under, or perhaps require stealing? As said above, any proximate end an agent is considering whether to adopt may also be seen as a means to the agent’s final end, and its suitability as such may be judged by its accord with precepts of the natural law—these should serve, we may say, as penultimate major premises, under the first principle of practical reason, of any practical syllogism (or, when stated negatively, as a “filter” for all prospective means or proximate ends).

There is one major piece of the puzzle we have yet to deal with, the role of virtue in all of this. First, how exactly do these three, blessedness, law, and virtue, fit together? As indicated above, the natural law is a participation in the eternal law that resides primarily in our natural inclinations: the rational creature “has a share of the Eternal Reason, whereby it has a natural inclination to its proper act and end: and this participation of the eternal law in the rational creature is called the natural law” (q91a2). Our natural inclinations direct us toward our proper end, that is to say toward beatitudo, and the attainment of it is the fulfillment of our inclinations. But as we have also seen, our blessedness consists in virtuous activity (culminating in the loving contemplation of God). Such being the case, we should expect the natural law to direct us toward virtuous activity, and Aquinas does say explicitly that the natural law prescribes virtuous activity (q94a3, and see Pinckaers for an interesting development of the idea that the natural inclinations are the “seeds of the virtues,” into which they grow through the work of reason and habituation). So the natural law, by informing our natural inclinations, directs us toward our final end, with the virtues serving as (constitutive) means to it.

Second, how does virtue play this role? We move toward our end through free, reasoned action, and cannot simply decide to grasp our final end. We must make a series of choices and carry them out, and it is here that virtue plays its principal role. One thing we clearly must do is reason well about how to act; we require excellence in practical reasoning. And that is to say we require prudence, which just is the virtue that applies right reason to action. But we also require the moral virtues such as justice and fortitude, which enable our knowledge of both the ends and means in practical reasoning. Aquinas is clear, as Aristotle was not, that we naturally know the ends we should pursue (this is the role of synderesis; see above, and also IIa-IIaeq47a6), but he also insists that we are rightly disposed toward that end by the moral virtues (Ia-IIaeq65a1)—the moral virtues safeguard us from “forgetting” our ends under the influence of vice, custom, or passion (q94a6)—fortitude, for instance, helps us control our fear of dangers so as to remain committed to the common good. The virtues also enable us to find the right means to the end. This is properly the work of prudence. Looking at how prudence does this work will clarify how the moral virtues play a supporting role in it. Aquinas says prudence has eight “quasi-integral parts” which can be classified as follows: Those that supply knowledge (memory and understanding or an intuitive grasp of the salient features of the present situation), those that acquire knowledge (docility and shrewdness), that which uses knowledge (reasoning, constructing the practical syllogism), and those that apply knowledge in command, the chief act of prudence (foresight directs present actions to the foreseen end, circumspection adjusts means to circumstances, and caution avoids obstacles to realizing the end). Prudence depends on the moral virtues not just to safeguard reason’s grasp of principles, but throughout its reasoning toward action. The parts of prudence just enumerated should make this clear: Docility, for example, requires humility. Also, the identification of the correct means to an intended end involves the understanding, or intuitive grasp, of the situation that helps supply the minor premise in a practical syllogism (see IIa-IIaeq49a2ad1). But this understanding can be corrupted by the intrusion of passion, as in cases of incontinence (Ia-IIaeq77a2), a state to which all are subject, unless fortified by the moral virtues (Fifi’s hopping impatiently on the bus although a cab would have been better presents a very mild case of such incontinence).

d.  Final Comments

So for Aquinas practical reason is our capacity to discover how to move from our present situation toward the attainment of our final end. In successful practical reasoning, synderesis, prudence, and moral virtue work together to ensure that the action meets all of the criteria of a good action (q18aa1-4): suitability of object (what kind of action is this, borrowing or stealing?), due attention to circumstances (might frankness here and now be unduly embarrassing to one’s interlocutor?), and goodness of the end of action (is my goal in giving alms to impress a potential benefactor, or to succor the less fortunate; ultimately, the end is good if and only if it is conducive to the agent’s final end). While practical reasoning presupposes our understanding of our final end as perfection, everything else in our practical lives, including our conception of our final end and to what extent we honor the principles grasped by synderesis, lies within its scope. When practical reasoning is done well, leading to good action, the agent at one and the same time pursues her own perfection (the Aristotelian moment) and obeys the eternal law of God (the Augustinian)—the etymological connection between prudence and providence mirrors a metaphysical connection, for our practical reason participates in the eternal reason (q91a2; see also q19a10). Since our perfection is perfection as creatures, there is no tension between it and obedience—for Aquinas, practical reason is not torn between the fulfillment of obligation and the fulfillment of the agent.

3.  Voluntarist Theory: Scotus

The reception of Aristotle and other non-Christian thinkers was never entirely easy, and worries about the influence of Greek and Arabic thought culminated, just after Aquinas’s death, in the Condemnations of 1277. In issuing them, the Bishop of Paris condemned 219 propositions drawn chiefly from Aristotle and his commentators, and while the principal target of these condemnations was the teaching of a “radical Aristotelianism” (or “Latin Averroism”) contrary to the Catholic faith by masters on the Faculty of Arts such as Siger of Brabant, a number of the condemned propositions were drawn from Aquinas’s work, although Aquinas was not named. In their wake the marriage between Greek and Biblical thought, between Aristotle and Augustine we might say, became a stormier one. Among the chief concerns of the Condemnations were divine and human freedom, and later thinkers were especially concerned to safeguard both. Many of them, rejecting Aquinas’s account of human freedom, found it necessary to portray the will itself as free. One way they did this was to stress the will’s independence from determination by nature, including the natural power of the intellect and the second nature imparted by the virtues. The will was seen as free rather than as natural, and as nobler than the intellect—thus these thinkers are often called voluntarists.

John Duns Scotus (c. 1266-1308) is the most impressive and influential of the post-1277 thinkers, and his sharp break with eudaimonism in many ways anticipates modern moral theory, especially that of Kant. It should be noted, though, that even in making this break Scotus is working within the medieval tradition, drawing here especially on St. Anselm’s work On the Fall of the Devil; Scotus is also indebted to his Franciscan predecessors and fellow-travelers such as Henry of Ghent. The following presents some of the main lines of his account of practical reason, but readers should be aware that there are currently some major disputes over how to interpret Scotus; some of these will be mentioned, and readers are invited to consult the secondary sources cited for further information.

a.  Freedom of the Will

Scotus emphasizes the freedom of the will in three key ways. The first two are rooted in his (characteristically voluntarist) teaching that the will is a self-mover rather than moved by anything else (an active rather than passive power); the third helps explain this capacity for self-movement. The first, then, lies in his emphasis on the dominance of the will over other powers, including the intellect. Just as in seeing we can focus on an object not in the center of our visual field, so in intellection the will can focus on and enjoy something other than what the intellect directly presents, and thus redirect the intellect (Opus Oxoniense II, dist. 42, qq1-4, nn. 10-11). The moral importance of this is that the will can turn aside from what the intellect presents as good and pursue something else (although that something else must be good in some respect). Second, he insists that in addition to being able to will or “nil” (velle or nolle), the will always retains the option simply to refrain from willing (non velle). This is important, for Scotus takes it that if we necessarily will something, we are not free. Scotus allows that the will is unable to nil beatitude, but holds that it can refrain from willing it, and so remains free (Ordinatio IV, suppl., dist. 49, qq9-10). This points up an important difference between his account and that of an intellectualist like Aquinas, who maintains that when the intellect has perfect vision of a perfect good (as it does only in the beatific vision), the intellect sees it as good, and the will adheres to that good, both from natural necessity. Scotus denies the necessity of willing the good presented by the intellect even here. The third point concerns his adoption of Anselm’s notion of the two affections of the will (which itself draws on Augustine’s account of the two loves of the will). The will’s tendency toward the agent’s perfection is called the affectio commodi, the natural appetite of the will, which prevents us from nilling our own perfection. It is similar to the will itself as it is understood by eudaimonist thinkers like Aquinas. See (Williams 1995) for an argument that it is identical to the will so understood; see (Toner 2005) for an argument that it is not. But it does not exhaust the will for Scotus, nor does it necessitate the willing of happiness, due to the affectio iustitiae, the tendency of the will to love things in accordance with their goodness, and not simply as means to or constituents of our own happiness. It is this affection, for Scotus, that grants the will its “native liberty.”

It also renders his account of practical reason more complicated, for now we see two distinct ways in which reason can present something as good to the will: First, something may be judged to be conducive to our happiness or perfection as rational agents (attracting the affectio commodi); second, something may be judged to be morally good or right or just (appealing to the affectio iustitiae). Thus, we can reason about how to attain happiness, or how to act justly. And although these will come together in our final union with God, they are always formally distinct and will often pull apart in this life. There is a hint here of what Sidgwick would much later call a dualism of practical reason, a dualism which in various forms characterizes most modern moral systems, Kantian or utilitarian.  Scotus’ response to this situation also anticipates modern moral thinking (see Toner on this)—the pursuit of happiness must be moderated by justice; as Scotus puts it, the affection for justice acts as a “checkrein” (moderatrix) on the affection for happiness (Ordinatio II, dist. 6, q2). If the pursuit is not so moderated, it will be bad or at best morally indifferent. A crucial, and characteristically voluntarist, implication follows: Once the intellect has judged an act to be good (in either broad sense), the will remains free to follow the judgment or not, according to which affection it acts on. It may refuse to pursue a good conducive to happiness because doing so conflicts with a requirement of justice; it may turn from a good required by justice in order to pursue happiness instead (in the Ordinatio passage just cited, Scotus accounts for the sin of the angels along these lines). For better or worse, depending upon what one takes freedom to involve, Aquinas’s moderately intellectualist view that reason and will concur in free choice has been replaced by the voluntarist view that once reason has done its work, the will must independently make its free choice.

Here we touch on a controversial area. None of the voluntarists held that reason could be dispensed with, or was unimportant. At the least, reason must present options (and recommendations) to the will for it to be able to choose. Henry of Ghent had maintained that this was the extent of reason’s contribution to free choice (that it was merely a causa sine qua non—a necessary pre-condition of willing, but not properly a cause of it). Scotus at one point held a more moderate view, that reason served as a partial efficient cause of willing. Some Scotus scholars argue that he later moved further in the voluntarist direction, coming to accept something close to Henry’s view (or at least acknowledging it as an account just as persuasive as his own earlier view; see (Dumont 2001) for a detailed discussion). Whatever the correct view of Scotus’ mature position, however, the point about the will’s independence from reason should not be taken to be a denial of reason’s important role leading up to choice.

It would be an even greater mistake to think that, because Scotus is a voluntarist, he downplays reason’s contribution to choosing morally good actions. In fact, Scotus insists as firmly as Aquinas that to be morally good, an action must be willed in accordance with right reason (Quodlibet, q18). What does this involve for Scotus?

b.  Synderesis, Conscience, and the Natural Law

Scotus follows tradition in invoking the notions of synderesis and conscience (Ordinatio II, dist. 39): Conscience is the habit of drawing the right conclusions about what is to be done by means of the practical syllogism. As such it depends upon knowledge of the first principles of practical reason, and synderesis is the habit of knowing these. What are they? Like Aquinas, Scotus takes them to be precepts of the natural law, but his handling of these precepts is quite different. His treatment of natural law makes no reference to natural inclinations—instead of being articulations of the directedness of human nature, the precepts are rules that are self-evident to reason because their denials lead to contradictions. For example, since good is the object of love and God is infinite goodness itself, the first principle of practical reason is that God is to be loved or, most strictly, God is not to be hated (Ordinatio III, suppl., dist. 37), for “goodness itself is to be hated” is self-contradictory. Scotus also relates the natural law to the Decalogue, and holds that from this first principle the precepts of the First Table (relating to God) follow and so belong to the natural law strictly speaking. The precepts of the Second Table (relating to neighbor), however, belong to the natural law only broadly speaking—they are consonant with the principles known to be true analytically, but do not follow from them necessarily. In this passage, Scotus also distinguishes the precepts of the First and Second Tables, the precepts that belong to the natural law strictly and only broadly speaking, as follows: It is, in the abstract, possible for us to attain our final end of loving God without following the precepts of the Second Table (although not in the concrete, given that God actually has issued these commands), but it is absolutely impossible for us to attain it while disobeying the precepts of the First. Thus, practical reason by itself is sufficient to tell us that if God exists, we must not hate Him and must have no other gods before Him. Scotus does not think we are left merely with theoretical possibilities and unaided practical reason—we know from Revelation that God has ordained the precepts of the Second Table, which are thus binding (for having been commanded, they move beyond being merely consonant with the love of God). Still, strictly speaking they are contingent and could be set aside or altered by God’s absolute power. Indeed Scotus thinks that in certain cases God has actually dispensed from them (see Ordinatio III, suppl., dist. 37; there is dispute among scholars as to how malleable the content of moral principles concerning love of neighbor is, and how open to rational investigation; see for example Wolter, Williams 1995, and Mohle’s contribution to Williams 2003).

To illustrate the relationship of consonance, Scotus gives us an example of the analogous relationship in positive law between “the principle of positive law,” that life in community should be peaceful, and secondary legal principles concerning private property. The institution of private property is not absolutely required to preserve peace, but given the infirmities of human nature, the common holding of property is likely to result in dispute and neglect. Thus allowing people to have their own possessions is “exceedingly consonant with peaceful living.” Likewise, although failing to love one’s neighbor is not strictly inconsistent with loving God (nor rejecting precepts stated in the Second Table strictly inconsistent with loving one’s neighbor), there is a harmony or consonance at both points (between love of God and neighbor, and between love of neighbor and honoring these precepts), for God has created us as social creatures and the precepts of the Second Table are conducive to social life. Although Scotus is not explicit, we may surmise that the principle that life in community ought to be peaceful belongs to the natural law in this broad sense, as peaceful life with God’s other rational creatures seems “exceedingly consonant” with love of God. As we will see, Scotus does explicitly say elsewhere that the “Silver Rule” belongs to the law of nature (broadly speaking). Prohibitions against murder, adultery, false witness and so forth follow from these pretty clearly, by way of consonance if not strict logical necessity.

So right practical reason begins from the precepts of the natural law, but how does it move to the judgment of conscience? Let us look at a case of deciding what to say when asked about one’s role in a certain affair, perhaps when lying might keep the agent out of some trouble. Scotus takes it that reason can grasp the wrongness of lying on the following basis: The Silver Rule, “Do not do to others what you would not want them to do to you,” is not only a commandment but a law of nature, at least in the broad sense; no one would want to be deceived by his neighbor; therefore, …. (Ordinatio III, suppl., dist. 38). With this principle in hand, how is one to act? It will depend on the particulars of the situation. The agent should now know that he should not deceive, but should tell the truth (or perhaps remain silent, if, say, the person asking is a gossip with no real stake in the matter; let us assume such is not the case). This much is clear from reason’s grasp of the principle and its understanding of the agent himself as a rational being, the action as speaking to another rational being, and the object as telling the truth (Scotus gives an example with the agent under the description of (rational) animal, the action as eating, and the object as nourishing food; Quodlibet, q18). But practical reason still has work to do: It must discern the right manner in which to tell the truth (say, calmly and straightforwardly rather than aggressively or evasively), and the right time and place (later in private, rather than now in company, say). Most importantly, it must place the act in service of a “worthy purpose,” direct it to an appropriate end (one that is just rather than merely advantageous—for acts that proceed solely from the affectio commodi will not be fully in accordance with right reason, since they focus only on the value of their objects to the agent, ignoring what intrinsic value they may have—thus Scotus holds that they are at best morally indifferent).

c.  The Non-Teleological Character of Scotus’ Thought

Much of the detail above is similar to what Aquinas says about the moral goodness of action, which should not be surprising because both are drawing on Aristotle and Christian tradition, but there is an important difference as to the goodness of the ends of particular actions. Aquinas takes it that in intending, the will (and its proximate ends) should be ordered to the final end or highest good. This final end is the perfection of the agent, which itself consists in the right relation to God. In principle, the agent could articulate this ordering as a series of syllogisms in which practical reason clarified the way the pursuit of this proximate end is linked to the pursuit of the agent’s final end as set by her nature as a rational creature. A metaphorical way of putting this: Actions can be seen as episodes in a story that the agent, by means of her practical reason, is writing (or co-authoring, given God’s providential role). In the well-written story (the practically rational life blessed by grace), the episodes successfully lead up to the happy ending, in which the agent is united with her true love and, quite literally, lives happily ever after.

For Scotus, this teleological character largely (though not entirely) disappears. Actions must still be related to God, whom Scotus is happy to refer to as our final end. But now God in a way serves less as final end than as first cause, in the sense of author of the moral law or of dispensations from it; God is not so much sought after as an end, as honored and obeyed as source. At least in those actions that have creatures as their object (that is, most actions we perform in this life)—and which are therefore only contingently related to our attainment of God as our final end—practical reason does not identify the right way to act by discerning how the prospective actions contribute to a series leading up to the right relation to God (it does not construct a series of syllogisms in the way just mentioned). Instead, each prospective action is judged separately, as to whether it honors God appropriately, expresses love of God and obeys His commands (although such thoughts need not be always present in the agent’s mind). Actions may still be teleologically ordered, for a number of actions may be ordered to the accomplishment of a moral end. But it is no longer the case that all actions and their ends must be organized into a pattern or narrative completed only in the agent’s attainment of her final end, and that they can be fully assessed only in light of their place in such a pattern. Instead, each action (or course of action) stands alone as a complete work, and the ends of actions may be judged in light of their fit with the situation and their accord or discord with precepts of the natural law or other authoritative source (revealed commands, a divine dispensation). Picking up the author metaphor again, life is not so much a novel as a collection of epigrams and short stories, dedicated with love to God. This deep difference between Aquinas and Scotus is reflected in—indeed is a consequence of—their different formulations of the first principle of practical reason: “Good is to be done and pursued, and evil avoided” (Aquinas); “God is to be loved, and never hated” (Scotus). The one focuses on pursuit of the good (relationship with God); the other on the expression of love for God.

Related to this is Scotus’ reduced role for the moral virtues: He holds that prudence can exist without moral virtue and that, as free, we always have what we need to do the right action here and now; it need not be part of a larger pattern involving the development of character (Ordinatio III, suppl., dist. 36). Yet, Scotus has no wish to deny that the virtues are important: they can help turn the will from evil (the willing of which can blind the intellect to the truth by turning it away for a time), can help facilitate the will’s choosing in accordance with the right judgment of prudence, and can also help the act to be done in the right manner. Moral virtue assists us, then, both in reasoning about action and in making that reasoning effectively practical, but it is not essential to performing morally good actions.

d.  Note on Ockham

Now it is perhaps these non-teleological aspects of Scotus’ thought, more than any others, that mark him out as a transitional figure. It is thus worth noting that it is concerning this feature of his thought that some of the disputes mentioned above are taking place. Williams (1995) and MacIntyre (1990) stress the role of obligation and divine commands in his theory; Hare and Ingham stress instead the role of love and the goal of relationship with God—views perhaps susceptible of some kind of teleological interpretation after all. However Scotus should in the end be read on this, it does seem fair to say, at the least, that divine commands, and the related notions of obligation and obedience, play a more prominent role in his thinking than they do in that of Aquinas.

And in any event, the later Franciscan William of Ockham will leave little doubt that he is a divine command theorist (but see Osborne and the noted selections in Spade for a recent exchange on this). This does not mean that he is not concerned with practical reason; he still insists that the morally good action is the one dictated by right reason and willed because so dictated (Quodlibet IIIq15). But practical reason now operates within the framework of God’s ordained power, wholly constructed by God’s sovereign will. Knowledge of what God’s power has actually ordained, and thus of how we should act, is now even more dependent upon revelation; God could, by his absolute power, command us even to hate him, and it would then be right for us to do so. Here we have moved from Scotus’ moderate voluntarism to an extreme form in which morality consists in the obligation impressed by the commanding divine will upon the obedient (or otherwise) human will, and in which practical reason serves merely to help articulate what has been commanded and how to carry it out. The prevailing order, for Ockham, is one in which familiar concepts have application (prudence, the moral virtues, the Decalogue), but the radical contingency hanging about the whole is novel.

4.  Medieval and Modern

This section briefly examines the influence of these two theorists on contemporary practical reasoning theory, and also explores the relation between their views of practical reason and some common positions in current debates (those between Generalists and Particularists, and between Internalists and Externalists).

a.  The Current Influence of Aquinas and Scotus

The two figures focused on above are the two who seem most relevant to contemporary theorizing about practical reason. Aquinas’s influence is widespread: In Anglophone moral philosophy Alasdair MacIntyre is perhaps the best known among his many followers, developing Aquinas’s thought in ways more sensitive to the context of culture and tradition. Candace Vogler develops a broadly Thomistic theory of practical reason, exploring both his account of the capital vices and his division of the good into befitting, pleasurable, and useful (see (Toner 2005) for a short look at this division, and (Vogler 2002) for a very thorough treatment), concluding that in an atheistic context, it will be reasonable for some agents to be vicious. In general, the relevance of Aquinas’s thought as a development of Aristotle makes him a likely source for anyone working on practical reasoning or moral theory in this tradition, a fact not missed by some prominent moral theorists, most notably Philippa Foot and Rosalind Hursthouse. As for Scotus, his affinity with, and likely indirect influence upon, Kant has been remarked by friends and foes alike (Williams and MacIntyre, for example). His direct influence on current thinking has not been great, but if the continuing progress on the critical edition of his works and the proliferation of Scotus scholarship are any indication, this may be beginning to change. In mainstream English philosophy, John Hare is perhaps the most prominent theorist so far to develop positions deeply indebted to Scotus. Scotus’ combination within his moral theory of deontological and virtue elements should make his thinking of interest to Kantian or other deontological theorists intent on appropriating broadly Aristotelian notions of virtue. Also, his subtle treatment of the relations between reason, divine and human freedom, and the absolute and ordained powers of God should make him of great interest to contemporary divine command theorists (Hare provides one example of this).

b. The Medievals and Particularism

Turning to the first of the current debates concerning practical reason: Let generalism be the view that the presence of some features of action (say that it causes pleasure, or is unkind) always tends to make the action right (or wrong)—such features have invariable “deontic valence.” This may come in forms “thin” (some natural features of action, say conduciveness to pleasure, always have a positive valence) or “thick” (while there are no such natural features, there are certain thick features, like kindness or fairness or spitefulness, that have invariable valence). Particularism, then, is the denial of this. We may speak of thin or thick forms of particularism, these being denials of the corresponding forms of generalism (one may, then, be at the same time a thick generalist and thin particularist). Where do the medievals fall along this spectrum? They tend toward thick generalism; indeed, we might say, toward thick absolutism, a form of generalism maintaining that there are some features of action that not only tend to make an action right or wrong, but always succeed in doing so. For Aquinas, for example, the fact that any action was vicious, or violated any precept of the natural law, would make it wrong. This is thick rather than thin generalism because the precepts have evaluative content that cannot be reduced to merely “natural” or thin terms (for example, while the precept against murder is certainly not just the claim that “wrongful killing is wrong,” it is the claim that “intentional killing of the innocent is wrong,” and “innocence” cannot be reduced to thin, non-evaluative language). For Scotus, things look quite similar, within the framework of God’s ordained power. But because dispensations are possible by God’s absolute power, the features picked out by natural law precepts relevant to the Second Table are not of invariable valence (that Isaac was innocent may actually tend to make sacrificing him right, given God’s command to Abraham). Still, there are some absolutes for Scotus, those pertaining to the love of God in the First Table. Ockham comes the closest to particularism, leaving just one feature of actions that has invariably positive valence, its having been commanded by God. Ockham also maintains that, when possible, loving God above all things is always right, subtly reconciling this with his claim that God could command us not to love Him (on the grounds that given such a command it would be impossible to love Him above all things; see Quodlibet IIIq14).

c. The Medievals and Internalism

Let us turn to reasons for action and their connection to motivation. Internalism comes in many forms, but common to them is the claim that if an agent has a reason to do some action A, she also has a motive to A (the denial of this—the assertion that an agent may have a reason to A but have no motive to A—is called “externalism”). One characteristic form of internalism, often referred to as “Humean,” is the claim that if R is a reason for S to do A, then A must serve some desire that S actually has. The medievals were not internalists in this sense. A Thomistic agent, for example, has a reason to pursue a good perfective of him even if he has no desire for it at present. But does not the agent have another desire the good serves, namely a desire for perfection? Actually, no. It is the will that naturally aims at what is perfective of the agent, and the will is a power, not a standing desire. But the will is naturally inclined to pursue such goods, so perhaps a modified internalism would fit, one that cited not just actual but also counterfactual desires (the agent would desire it if suitably informed, and so forth)? Perhaps so, but details aside, there is one more critical qualification to make: Although internalism strictly requires only a connection between reason and motivation, it is usually also held that the latter has priority, that the explanatory direction is from desire to reason for action. For Aquinas, the direction is instead from reason to desire (the various acts of reason serving as the formal causes of the corresponding acts of will). Allowing for this, and given careful specifications of the counterfactual conditions, Aquinas and other intellectualists could probably be brought under some fold or other of the big tent of internalism.

For Scotus and other (sometimes more thoroughgoing) voluntarists, things are harder to see. The relation between intellect and will is looser, but still it is not held that the will’s desiring something can create a reason for the agent to act; instead, reason serves as a sort of necessary condition of the will’s act of desire (as mentioned above, perhaps as a partial efficient cause, as Scotus held at one point, or perhaps as a causa sine qua non, as Henry of Ghent held and—some argue—Scotus later held). If the will is the total cause of its own willing, or at least the primary cause, it can refrain from willing in accordance with the judgment presented by right practical reason (recall Scotus’ point about non velle). Scotus even, following Anselm, performs a thought experiment concerning an angel created without the affectio iustitiae, maintaining that it could then only pursue its own happiness, and not what is intrinsically just. He does not explicitly say that such an angel correctly identifies the right reasons for action, but given the independence of prudence from the moral virtues, it seems likely that it could (“God is not to be hated” is, after all, supposed to be self-evidently true; and such an angel could understand the content of God’s revealed commands). If so, it could have reasons (not to hate God, not to commit or encourage lying or murder) with no corresponding desires (since it lacks the affectio iustitiae that would motivate it to follow these precepts even in cases in which doing so is not instrumental to its own happiness).

It is dangerous to sort philosophers according to distinctions they themselves did not have in mind (notice my hesitant language about Aquinas’s internalism above), but it seems that Scotus and other voluntarists would likely be externalists. This much can be said more confidently: neither intellectualist nor voluntarist agents look much like the internalist and externalist agents one typically meets in the contemporary literature. But perhaps this is an advantage, for the medievals develop options largely ignored in much current discussion. And it may be that the presence of more angels—falling, deformed, whole, and standing firm—would make for much livelier discussion.

5. Conclusion: Common Themes among the Medievals

So far this article has emphasized differences between the medieval accounts of practical reason, and their connections with some points in current theorizing. It is worth bringing out a few features that bring the medievals together while distinguishing them as a group from most current theorists. First, there is the shared Aristotelian and Augustinian heritage, already mentioned above. With this comes an agreement that our final end is the right relationship with God, a union with God by means of intellect and will. This is perfectly clear in intellectualists like Aquinas, but also holds for voluntarists. Scotus, for example, agrees that God is our final end; the initially open question is how to relate to Him: qua object of the affectio commodi (as the source of our perfection), or qua object of the affectio iustitiae (as perfect in Himself). And for all of the medievals, the good life consists in the successful attempt to achieve this union, to find, we might say, one’s proper place in Creation. In The City of God XIX.13, Augustine defines peace—our final end on his account—as the tranquillity of order, where order is the arrangement of things in which each finds its proper place in relation to the others, under God.

None of this is intended to paper over important differences, for example about just how to characterize that proper place, or whether the attempt to find it is best seen as a unified narrative or as a set of independent courses of action (whether life is a novel, we might say, or an anthology of short stories). It is intended only to stress the broad and important agreement underlying the differences in their accounts of practical reason. This is an agreement we should not find surprising, given their shared belief, based both on philosophical argument and on faith, in a providential Creator who is both Reason and Goodness. And it is an agreement whose importance we can recognize when we note that no medieval ever held that right practical reason could recommend an immoral course of action, as, if Vogler is right, it can often do in an atheistic context.

6. References and Further Reading

a. Primary Sources

  • Anselm, On the Fall of the Devil, translated by Ralph McInerny in Anselm of Canterbury: The Major Works, edited by Brian Davies and Gillian Evans (Oxford: Oxford University Press, 1998).
  • Aristotle, The Nicomachean Ethics, translated by Terence Irwin (Indianapolis: Hackett Press, second edition 1999).
  • Aristotle, On the Soul, translated by J.A. Smith in The Complete Works of Aristotle, volume 1, edited by Jonathan Barnes (Princeton: Princeton University Press, 1984).
  • Aristotle, Metaphysics, translated by W.D. Ross in The Complete Works of Aristotle, volume 1, edited by Jonathan Barnes (Princeton: Princeton University Press, 1984).
  • Augustine, On Free Choice of the Will, translated by Thomas Williams (Indianapolis: Hackett Press, 1993).
  • Augustine, Confessions, translated by R.S. Pine-Coffin (London: Penguin Classics, 1961).
  • Augustine, The City of God against the Pagans, translated by R.W. Dyson (Cambridge: Cambridge University Press, 1998).
  • Henry of Ghent, Quodlibetal Questions on Free Will, translated by Roland Teske (Milwaukee: Marquette University Press, 1993).
  • Ockham (Occam), William of. Quodlibetal Questions, translated by Alfred Freddoso and Francis Kelley (New Haven: Yale University Press, 1998).
  • Scotus, John Duns. Duns Scotus on the Will and Morality, selections made and translated by Allan Wolter (Washington: The Catholic University of America Press, 1997).
    • Many of Scotus’ writings are divided in much the way described below for Aquinas. One further subdivision, often included in works commenting on Peter Lombard’s Sentences (such as Scotus’ Ordinatio), the distinctio, is noted as “dist.”
  • Thomas Aquinas, Summa theologiae, translated by the Fathers of the English Dominican Province (Allen, TX: Christian Classics, 1981).
    • This work is divided into three parts, with the second itself sub-divided into two parts. The parts are further broken up into questions, and the questions into articles. The articles themselves comprise objections to the position Aquinas will take, a claim “to the contrary,” Aquinas’s argument for his position, and replies to the objections. Parts are customarily referred to as follows: Ia, IIa, IIIa (from the Latin prima, secunda, and tertia); the parts of the second part as Ia-IIae and IIa-IIae (from prima secundae and secunda secundae—first of the second, second of the second). Questions are denoted simply by “q,” articles by “a,” and replies to objections by “ad” (“toward”). If not otherwise noted, the reference is to the body of the article or corpus (“c”), Aquinas’s argument for his position. So for instance, Ia-IIaeq13a1ad3 refers to the first part of the second part, question 13, article 1, reply to the third objection.
  • Thomas Aquinas, Commentary on Aristotle’s Nicomachean Ethics, translated by C.I. Litzinger (Notre Dame: Dumb Ox Books, 1993).

b. Secondary Sources

  • Bradley, Denis. Aquinas on the Twofold Human Good (Washington: The Catholic University of America Press, 1997).
  • Cross, Richard. Duns Scotus (Oxford: Oxford University Press, 1999).
  • Dahl, Norman. Practical Reason, Aristotle, and Weakness of the Will (Minneapolis: University of Minnesota Press, 1984).
  • Dumont, Stephen. “Did Duns Scotus Change His Mind on the Will?” in Nach der Verurteilung von 1277, edited by Jan Aersten, Kent Emery, and Andreas Speer (Berlin: Walter de Gruyter, 2001), 719-794.
  • Eardley, P.S. “Thomas Aquinas and Giles of Rome on the Will,” The Review of Metaphysics 56 (2003): 835-862.
  • Gallagher, David. “Thomas Aquinas on the Will as Rational Appetite,” Journal of the History of Philosophy 29 (1991), 559-584.
  • Hall, Pamela. Narrative and the Natural Law: An Interpretation of Thomistic Ethics (Notre Dame: University of Notre Dame Press, 1994).
  • Hare, John. “Scotus on Morality and Nature,” Medieval Philosophy and Theology 9 (2000), 15-38.
  • Hare, John. God’s Call (Grand Rapids: Eerdmans, 2000).
  • Ingham, Mary Beth. “Duns Scotus, Morality and Happiness: A Reply to Thomas Williams,” American Catholic Philosophical Quarterly 74 (2000), 173-195.
  • Ingham, Mary Beth and Mechthild Dreyer. The Philosophical Vision of John Duns Scotus (Washington: The Catholic University of America Press, 2004).
  • Kent, Bonnie. Virtues of the Will (Washington: The Catholic University of America Press, 1995).
  • MacIntyre, Alasdair. Whose Justice? Which Rationality? (Notre Dame: University of Notre Dame Press, 1988).
  • MacIntyre, Alasdair. Three Rival Versions of Moral Enquiry (Notre Dame: University of Notre Dame Press, 1990).
  • MacDonald, Scott. “Ultimate Ends in Practical Reasoning: Aquinas’s Aristotelian Moral Psychology and Anscombe’s Fallacy,” The Philosophical Review 100 (1991): 31-65.
  • MacDonald, Scott and Eleonore Stump. (editors), Aquinas’s Moral Theory: Essays in Honor of Norman Kretzmann (Ithaca: Cornell University Press, 1999).
  • McCluskey, Colleen. “Worthy Constraints in Albertus Magnus’s Theory of Action,” Journal of the History of Philosophy 39 (2001): 491-533.
  • McCluskey, Colleen. “Medieval Theories of Free Will,” Internet Encyclopedia of Philosophy.
  • McInerny, Ralph. Aquinas on Human Action (Washington: The Catholic University of America Press, 1992).
  • Osborne, Thomas. “Ockham as a Divine-Command Theorist,” Religious Studies 41 (2005): 1-22.
  • Pinckaers, Servais. The Sources of Christian Ethics, translated by Sister Mary Thomas Noble (Washington: The Catholic University of America Press, 1995).
  • Porter, Jean. Nature as Reason: A Thomistic Theory of the Natural Law (Grand Rapids: Eerdmans, 2005).
  • Rist, John. Augustine: Ancient Thought Baptized (Cambridge: Cambridge University Press, 1996).
  • Spade, Paul Vincent. (editor), The Cambridge Companion to Ockham (Cambridge: Cambridge University Press, 1999).
    • See especially the essays by King and McCord Adams.
  • Toner, Christopher. “Angelic Sin in Aquinas and Scotus and the Genesis of Some Central Objections to Contemporary Virtue Ethics,” The Thomist 69 (2005): 79-125.
  • Vogler, Candace. Reasonably Vicious (Cambridge: Harvard University Press, 2002).
  • Westberg, Daniel. Right Practical Reason (Oxford: Clarendon Press, 1994).
  • Williams, Thomas. “How Scotus Separates Morality from Happiness,” American Catholic Philosophical Quarterly 69 (1995), 425-445.
  • Williams, Thomas. (editor), The Cambridge Companion to Duns Scotus (Cambridge: Cambridge University Press, 2003).
    • See especially the essays by Mohle, Williams, and Kent.
  • Wolter, Allan. “Native Freedom of the Will as a Key to the Ethics of Scotus” in The Philosophical Theology of John Duns Scotus, edited by Marilyn McCord Adams (Ithaca: Cornell University Press, 1990).

Author Information

Christopher Toner
Email: christopher.toner@stthomas.edu
University of St. Thomas
U. S. A.

Rudolph Hermann Lotze (1817–1881)

Hermann Lotze was a key figure in the philosophy of the second half of the nineteenth century, influencing practically all the leading philosophical schools of the late nineteenth and early twentieth centuries, including (i) the neo-Kantians; (ii) Brentano and his school; (iii) the British idealists; (iv) William James’s pragmatism; (v) Husserl’s phenomenology; (vi) Dilthey’s philosophy of life; (vii) Frege’s new logic; and (viii) early Cambridge analytic philosophy.

Lotze’s main philosophical significance is as a contributor to an anti-Hegelian objectivist movement in German-speaking Europe. The publication of the first editions of his Metaphysics (1841) and Logic (1843) constituted the third wave of this movement. The first came in 1837, in the form of Bolzano’s Wissenschaftslehre. The second came three years later, in 1840, when Friedrich Adolf Trendelenburg published his Logische Untersuchungen. Lotze’s early works furthered this objectivist line of thought. And when a new surge of philosophical objectivism crested again in the 1870s, Lotze used the opportunity to restate his position in the second editions of his Logic (1874) and of his Metaphysics (1879).

Closely following Trendelenburg, Lotze advanced an objectivist philosophy that did not start from the subject-object opposition in epistemology. He insisted that this opposition is based on a metaphysical relation that is more fundamental (Schnädelbach 1983, p. 219). In this way, the very possibility of philosophical subjectivism was suspended.

Lotze promoted the “universal inner connection of all reality” by uniting all objects and terms in a comprehensive, ordered arrangement. Especially important to Lotze’s theories of order is the concept of relation. A favorite saying of his illustrates this point. “The proposition, ‘things exist’,” he repeatedly said, “has no intelligible meaning except that they stand in relations to each other.”

The priority of orderly relations in Lotze’s ontology entailed that nature is a cosmos, not chaos. Furthermore, since the activity that is typical for humans—thinking—is an activity of relating, man is a microcosm. This point convinced Lotze to jointly study microcosm and macrocosm, a conviction which found expression in his three-volume Microcosm (1856/64).

The distinction between the universe as macrocosm and humanity as microcosm gave rise to another central component of Lotze’s philosophy: his anthropological stance.  According to Lotze, the fundamental metaphysical and logical problems of philosophy are to be discussed and answered through the lens of the microcosm, that is, in terms of the specific perceptual and rational characteristics of human beings.  There is no alternative access to them.

Lotze’s philosophical work was guided by his double qualification in medicine and philosophy. While he chose academic philosophy as his profession, his medical training was an ever-present influence on his philosophical thought, in two respects. First, his overall philosophy was characterized by a concern for scientific exactness; he criticized any philosophical doctrine that discards the results of science. Second, he devoted many academic years to (more or less philosophical) studies in medicine and physiology. His efforts in this direction resulted in foundational works in psychology, in virtue of which there is reason to count him among psychology’s founding fathers.

Table of Contents

  1. Life and Works
    1. Biography
    2. Influences and Impacts
    3. Works
  2. Philosophical Principles and Methods
    1. Rigorous, Piecemeal Philosophy
    2. The Principle of Teleomechanism
    3. Regressive Analysis
    4. Anthropology as Prima Philosophia
    5. Methods: Eclecticism and Dialectics
  3. Theoretical Philosophy
    1. Ethics
    2. Ontology and Metaphysics
    3. Epistemology
    4. Logic
    5. Philosophy of Mind
    6. Philosophy of Nature
    7. Philosophy of Language
  4. Philosophy and Life
    1. Anthropology
    2. Social Philosophy
    3. Philosophy of History
    4. Political Philosophy
    5. Philosophy of Religion
    6. Religious Practice
  5. References and Further Reading
    1. Primary Sources
    2. Secondary Sources
    3. Bibliographies
    4. Biographies
    5. Further Reading

1. Life and Works

a. Biography

Rudolph Hermann Lotze was born in Bautzen (Saxony) on May 21, 1817, the third child of a military medical doctor. Two years later the family moved to nearby Zittau.

Lotze’s father died in 1827, when Hermann was 12. Soon thereafter, the family got into serious financial troubles.  This series of events shaped Lotze’s character in significant ways. He was independent, ambitious, serious and thrifty, but also melancholic, reserved, even shy.

Between 1828 and 1834 Hermann attended the local High School (Gymnasium). In 1834 he registered at the University of Leipzig.  He wanted to study philosophy—a wish nourished by his love of art and poetry—and he did. However, his experience with financial hardship urged him to simultaneously pursue a degree in the more practical and lucrative field of medicine. Four years later, in 1838, he received doctorates in both disciplines.

After practicing medicine for a year in Zittau, Lotze joined the University of Leipzig as an adjunct lecturer in the Department of Medicine in 1839, and in the Department of Philosophy in 1840. In 1840 Lotze completed post-doctoral dissertations (Habilitation) in both medicine and philosophy. As a result, he received a license to teach (venia legendi) at German universities in these two fields.

In 1839, Lotze became engaged to Ferdinande Hoffmann of Zittau (b. 1819), and they were married in 1844.  The marriage produced four sons.  Lotze was deeply attached to his wife, and her death in 1875 was a loss from which he never recovered. One of his numerous British students, Richard Haldane (who later became Lord Chancellor), described him after his wife’s death as one who “seldom sees people, as he lives a sort of solitary life in the country where his home is, about half a mile from Göttingen, and is looked upon as unsociable” (Kuntz 1971, p. 50).

In the year of his marriage, 1844, Lotze was named Herbart’s successor as Professor of Philosophy at the University of Göttingen. He remained at Göttingen until 1880, when he was named Professor of Philosophy at the University of Berlin. A few months later (on July 1, 1881) he died of a cardiac defect that he had suffered from all his life. He was succeeded in the Berlin Chair by Wilhelm Dilthey.

b. Influences and Impacts

Among Lotze’s teachers were Gustav Theodor Fechner, from whom he learned the importance of quantitative experiment, and Christian Weiße, who helped the young Hermann to see the philosophy of German idealism from its aesthetic perspective. Lotze was especially influenced by Kant, Hegel, Herbart, Schelling and Fries. He was personally introduced to Fries—who at the time was a Professor in Jena—by his friend and Fries’ student Ernst Friedrich Apelt.

Some philosophers believe that Lotze was also influenced by his countryman Leibniz (Leibniz was born and raised in Leipzig, Saxony).  Indeed, there are some common points between these two philosophers. But Lotze himself denied such an influence. A hidden influence (seldom discussed in the literature) came from Schleiermacher—via Trendelenburg—who had insisted against the Kant–Drobisch idea of formal logic that logic must be developed together with metaphysics.

Many British and American philosophers of the 1870s and 1880s admired Lotze. William James considered him “the most exquisite of contemporary minds” (Perry 1935, ii., p. 16). Josiah Royce, James Ward and John Cook Wilson studied under him in Göttingen.  Oxford’s T. H. Green was so enthusiastic about Lotze that in 1880 he began the large project of translating his System of Philosophy. The project was incomplete two years later at the time of Green’s death, but it was continued by a team under the guidance of Bernard Bosanquet. Besides Green and Bosanquet, A. C. Bradley (brother of F. H. Bradley), R. L. Nettleship and J. Cook Wilson took part in the general editing. The translation appeared in 1884. In parallel, James Ward and Henry Sidgwick at Cambridge were instrumental in preparing the translation of Lotze’s Microcosm by Elizabeth Hamilton (daughter of William Hamilton) and E. E. Constance Jones, which was published in 1885.

c. Works

Lotze’s first publications were his “lesser” Metaphysics (1841) and “lesser” Logic (1843), in which he charted his philosophical program. His Habilitation in medicine was published in 1842 under the title Allgemeine Pathologie und Therapie als mechanische Naturwissenschaften.

Over the next ten years, Lotze worked on problems at the intersection of medicine and philosophy, in particular the relation between soul and body. The results of these studies were published in two books: Allgemeine Physiologie des körperlichen Lebens (1851) and Medicinische Psychologie oder Physiologie der Seele (1852). During this period, Lotze also published extensive essays on “Leben. Lebenskraft” (1843), “Instinct” (1844), and “Seele und Seelenleben” (1846). He also published important works on aesthetics in these years: “Über den Begriff der Schönheit” (1845), “Über Bedingungen der Kunstschönheit” (1847), and “Quaestiones Lucretianae” (1852).

Microcosm (published in 3 volumes between 1856 and 1864) marked a new period in Lotze’s philosophical development. In this monumental work, he synthesized his earlier ideas: the logico-metaphysical ideas of 1841–3, his psychological ideas of 1842–52, and his aesthetic ideas of 1845–52. Despite some interpretations to the contrary, the book was not only a popular treatise. It also developed technical logical and metaphysical ideas in a form not found in his earlier work.

Shortly after Lotze finished Microcosm, he started his System of Philosophy, which consisted of his “greater” Logic (1874) and “greater” Metaphysics (1879). A third part of the system, on Ethics, Aesthetics and Religious Philosophy, remained unfinished at the time of his death. Briefly, the difference between Microcosm and System of Philosophy can be put this way: while Microcosm was something of an encyclopedia of philosophical deliberations on human life, private and public, the System was an encyclopedia of the philosophical disciplines.

Lotze possessed an extraordinary ability for studying languages. Many of his papers were written in French, some of them in Latin (e.g., “Quaestiones lucretianae”). Lotze also published a volume of his poetry (Lotze 1840).

2. Philosophical Principles and Methods

a. Rigorous, Piecemeal Philosophy

It will come as no surprise, given his medical training, that Lotze was a scientifically oriented philosopher.  His credo was that no philosophical theory should contradict scientific results. In his medical writings, and above all in the programmatic Allgemeine Pathologie of 1842, he rejected all forms of vitalism (which claims that organismic life is explained by causes other than biochemical reactions) more radically than anyone before him.

Lotze was not a lone pioneer in embracing the scientific orientation in philosophy. In this he followed his teacher and friend, the early experimental psychologist Gustav Fechner, as well as Hegel’s contemporaries and rivals, Fries and Herbart. However, he was unique insofar as he introduced a method for recasting particular problems of German Idealism in a refined, philosophical–logical form that was science-friendly. A typical example in this respect was his approach to studying thinking. Lotze connected thinking to two “logically different” domains, valuing and becoming (see section 3.d, below), and considered each of them to be explored by a special science: logic investigates the validity of thinking, and psychology investigates the development of thinking.

Lotze’s new method disciplined metaphysics and ethics on the one hand, and enriched logic on the other.  In other words, it made  metaphysics and ethics more exact, formal disciplines, while making logic more philosophical.

One of Lotze’s motives for embracing this approach was his desire to eliminate the radical disagreements that traditionally had characterized philosophical theorizing—a main source of philosophy’s developing reputation for being unscientific. Lotze believed that the formal (logical) presentation of philosophical theories eliminates their subjective side—the principal source of philosophical animus—and that, thus purified, even seemingly contradictory systems could be shown consistent with one another (Misch 1912, p. xxii).

Lotze’s commitment to this approach led to radical changes in his philosophical practice. In particular, he started to investigate philosophical problems bit by bit, piecemeal, so that a later discovery of a mistake in his investigation did not make his overall philosophy false. (This practice was later followed by Russell (cf. Russell 1918, p. 85) and became central to analytic philosophy.) Lotze’s piecemeal philosophy was facilitated by the introduction—or in some cases the revival—of many concepts which are still widely discussed today, including: (i) the concept of value in logic (its best known successor was the concept of truth-value); (ii) the context principle; (iii) the idea of concept/judgment as a function; (iv) the metaphors of coloring expressions and of saturated–unsaturated expressions; (v) the objective content of perception or the concept of the given (its best known successor was the concept of sense-data); (vi) the objective content of judgments; and (vii) anti-psychologism in logic. These concepts proved to be seminal to a certain line of German-language philosophy: in various combinations, they play central roles in the thought of Frege, Brentano, Husserl, and those associated with their schools.

In short, Lotze introduced several philosophical–logical problems and theses which could be further investigated independently of his overall system. In this sense he instructed his readers to regard his philosophy as “an open market, where the reader may simply pass by the goods he does not want” (Lotze 1874, p. 4). Among other things, this characteristic of Lotze’s philosophy made him the most “pillaged” philosopher of the nineteenth century (Passmore 1966, p. 51). Many of his theses were embraced without crediting him.

b. The Principle of Teleomechanism

A central principle of Lotze’s philosophy was that all processes and movements—physical, biological, psychological, bodily, social, ethical, cultural—are accomplished in a way that can best be called mechanical. This “Principle of Mechanism” helped Lotze to avoid references to deep, metaphysical causes, such as vitalism in the philosophy of biology. In contrast, he insisted that, when theorizing, we are obliged to look to reality as revealed by experiment. On this point, he was clearly influenced by his education as a medical doctor.

At the same time, however, Lotze believed that there were features of experience—such as life, mind, and purpose (telos)—that could not be explained mechanistically. Lotze took these limitations on mechanistic explanation to indicate—even delineate—a “higher and essential being”, reference to which was necessary in order to make mechanistic explanations fully intelligible.  For instance, Lotze thought that our ideas of forces and natural laws describe but do not explain how things work in nature. To understand this, we must connect them with the realm of the trans-sensual (Übersinnliche, 1856b, p. 306).  Only by making this connection can we understand the processes carried out through these mechanisms.

At first glance, this move to teleology as a necessary explanatory category may seem incompatible with Lotze’s own Principle of Mechanism.  He did not think so, however, and part of Lotze’s achievement was the way in which he sought to show these prima facie contrary categories compatible.

Lotze’s solution was to declare the Principle of Mechanism not a metaphysical principle, but a purely methodological principle belonging mainly to the natural sciences.  That is, the principle does not imply that reality is, at bottom, mechanistic.  Rather, it only prescribes a methodology and a mode of interpretation or description as means to achieving a useful understanding of the processes of our environment.  As purely methodological, Lotze’s “Principle of Mechanism” does not claim to capture the full nature of those processes, nor even to begin to describe their sources.  Nor does it claim to explain—or explain away—life, mind, and purpose.  To the contrary, it is consistent with the view that mechanistic processes are the means by which purposes are realized in the world.

Thus, ultimately, Lotze’s position required seeking both mechanistic descriptions of natural processes and teleological explanations of those processes. Lotze called this hybrid position “teleomechanism,” or “teleological idealism.”

In Lotze’s hands, the “Principle of Teleomechanism” (i.e., that ultimate explanations should have the hybrid form described above) shapes logic, metaphysics and science through what he calls idealities (Orth 1986, p. 45): the fundamental orienting concepts of these fields. Among the idealities are ethical values, logical validities and aesthetic worth. In science and metaphysics, the idealities of spatial and temporal order, the principle of atomicity (cf. sections 3.a and 3.e) and the aforementioned relationism (cf. the opening summary at the head of this article) play a central role.

c. Regressive Analysis

The declared objective of Lotze’s philosophy was a “reflection on the meaning of our human being [Dasein]” (1856b, p. 304). The urgency of this task was a consequence of the scientific and industrial revolution of the beginning and the middle of the nineteenth century. That revolution dramatically changed the way in which humans see the cosmos and universe. It eroded the unity of God and humanity; traditional mythology proved inconsistent. As a consequence, the world started to seem alien, cold, immense. A substantial weakening in religious belief followed. Lotze saw danger in the numerous attempts (on the side of the mechanistic philosopher-scientists like Ludwig Büchner, Heinrich Czolbe, Franz Fick, Jacob Moleschott and Karl Vogt) to prove that the microcosm of human beings is merely mechanical, or materialistic. His objective was to disprove such attempts and to make people feel at home in the world again.

Contrary to the trends in then-current anthropology, Lotze did not seek to explain humanity in terms of the technologies it produced. Rather, he thought, the keys for understanding the human race are found in the results of human education and schooling (Bildung), as they have been developed in history. This meant that his philosophical investigations began not simply with the elements of human culture, but with developed human cultures taken as wholes, and indeed the history of such cultures taken as a whole. From these wholes, he then worked “backwards”, analyzing their “parts”, such as logic, metaphysics, science and mathematics. This is the approach of regressive analysis (1874, § 208; 1879, pp. 179 ff.).

Lotze believed that the main educational goods (Bildungsgüter) of human culture are usually conveyed by poetry and religion. They provide a “higher perspective on things,” the “point of view of the heart.” This means that the mechanistic processes upon which science focuses are not the only key to understanding the world; they are not even the most important key. To the contrary, science becomes intelligible and useful for humans only in connection with the historically developed values and forms of schooling and education characteristic of a developed human culture (cf. Lotze’s Principle of Teleomechanism, in section 2.b, above). This point is clearly seen in the fact that we have a priori notions neither of bad and good, nor of blue or sweet (1864, p. 241).

But how exactly can the history of culture command the shape of logic, metaphysics and science? Lotze’s answer in brief is: through the  idealities they produce. As magnitudes identifiable in experience, these idealities serve as orientating concepts for all academic disciplines, giving them direction and purpose within the context of a unified human life in a developed human culture.

Following Kant, Lotze claimed that idealities pertain to mental, not material, reality. However, they require matter in order to be exemplified or articulated by human beings. We understand idealities only in experience. To be more specific, we find them at work above all in our sensual life and in our feelings of pleasure and displeasure. We find them further in ethics, aesthetics, science, mathematics, metaphysics and logic. The spatial order, for example, is such an ideality: it is revealed via the matrix of discrete material entities in their dimensional magnitude and in the spaces between them, but it is not given as another thing among things. Rather, it is mentally “noticed” as a necessary “backdrop” to, a “condition of the possibility of”, the matrix of material things. (This conception was adopted by Bertrand Russell in An Essay on the Foundations of Geometry; cf. Milkov 2008.)

Given his views on the relation of the material to the ideal, Lotze was convinced that the quarrel between materialism and idealism was misguided. It was a quarrel about meaning: Idealists see too much meaning (borne by ideal entities) in reality, while materialists see no meaning in it at all. Fearing that the characteristically vague aesthetic elements of human experience would undermine exact science, the materialists attempted to extract all humanistic meaning from reality by sanctioning only mathematical descriptions of mechanically-construed natural processes (the likes of which we see in scientific formulae, such as F = ma in physics). But Lotze thought such fears were in vain. Just as mechanism was compatible with teleology, so Lotze thought that aesthetics (poetry) and religion (revealed truth) were compatible with the mathematics and calculation preferred by the materialists. By the same token, the acceptance of mechanism as a purely methodological principle in science did not invalidate the belief in free will. On the contrary: since mechanism made the spiritual effort to achieve the trans-sensual more strenuous, it only “increased the poetical appeal of the world” (1856b, p. 306).

d. Anthropology as Prima Philosophia

Lotze’s main objective was the investigation of the concrete human being with her imaginings, dreams and feelings. He considered these elements—as expressed in poetry and art—as constitutive of a human person and her life. This explains the central role that the concept of home (Heimat) plays in his metaphysics. The related concept in his philosophy of mind is feeling and heart (Gemüt), as different from mind (Geist) and soul (Seele). Indeed, Lotze introduced the concept of heart in the wake of German mysticism (e.g., Meister Eckhart); however, he used it in a quite realistic sense. Heart is what makes us long for home. The longing itself is a result of our desires which we strive to satisfy. Life consists, above all, in consuming (genießen) goods, material and ideal. This conception of human life is, of course, close to hedonism (cf. section 3.a).

Lotze did not introduce anthropological investigation into philosophy. Rather, it began in the sixteenth century, in an effort to renovate theology. During the next three centuries, anthropology became a favorite subject among German university philosophers—including Kant. In his anthropology, however, Lotze did not follow Kant. Kant distinguished between theoretical philosophy and mundane philosophy, with anthropology falling in the latter category. But Lotze abolished Kant’s distinction between the theoretical and mundane (1841a, p. 17), and he developed his “theoretical anthropology” exactly in order to merge the two philosophical disciplines into one.

The conclusion Lotze drew was that Kant’s question “what can I know?” cannot be answered in the abstract; it can only be answered in terms of embodied persons in concrete socio-historical situations. Only when we embrace this perspective, Lotze thought, can we also grasp the depth and the importance of metaphysical problems.

This point brings us to the most important characteristic of Lotze’s philosophy. Lotze did not simply shift from metaphysics to anthropology. Rather, his anthropology became philosophy proper (Orth 1986, p. 43).

e. Methods: Eclecticism and Dialectics

From the very beginning of his career, Lotze subscribed to the view that, “When we cannot necessarily join one of the dominating parties, we [shall …] stay in the middle via free eclecticism” (Lotze 1843, p. 1). Today the word “eclecticism” is used mainly in a pejorative sense, but this was not true for Lotze. To the contrary, he thought eclecticism a most useful method in philosophy, and in 1840 even lauded it in a poem entitled “Eclecticism” (Kroneberg 1899, p. 218).

Lotze’s eclecticism was characterized by his logical turn in metaphysics. Indeed, as seen in section 2.a, the latter made his philosophy a rigorous science, enabling him to compress many of the problems of generations of philosophers into a unified theory. This point explains the astonishing success with which Lotze employed his eclecticism. It enabled him to look past the differences of philosophers like Kant, J. G. Fichte, Schelling, and Hegel, and to focus on what he took to be the most valuable ideas common to them.  Distilling their thought, he frequently reformulated their views in logically exact expressions.

Consistent with his eclecticism, Lotze also used something approaching Hegel’s dialectical method (Lotze 1841a, p. 320). This is why “there are some passages [in Lotze’s writings] in which he does seem conscious of the contradictions and [nevertheless] attempts to mediate between the two,” rather than eliminating one of them (Kuntz 1971, p. 34).

Some authors have a negative view of these Hegelian tendencies in Lotze. For example, Eduard von Hartmann complains that “there is scarcely a ‘yes’ by Lotze, which is not undermined at another place by a ‘no’” (Hartmann 1888, p. 147). Yet other philosophers, like George Santayana, have recognized that, despite the apparent contradictions, Lotze’s system remained very consistent overall.  Careful attention reveals that most of the supposed contradictions are apparent only, and result from the failure to note the varying perspectives from which Lotze conducted his philosophical research.

For instance, as discussed in section 2.b., Lotze insisted that mechanistic descriptions were appropriate and indeed required in science, but inappropriate in metaphysics, where teleological explanations are required.  It is easy to see this double-demand for mechanism and teleology as contradictory, so long as one fails to recognize that each demand is a “methodological” demand only, made by the requirements of two disciplines with differing norms and purposes.  Similarly, the idealistic tendencies of his system were part of a psychological description of reality, “a personal manner of reading things, a poetic intuition of the cosmic life” (Santayana 1889, 155).  Other aspects of his system—like his atomism—were radically objectivistic, suited only to the demands of scientific description and scientific work.

Lotze’s perspectivalism—his tendency to treat some views as “merely methodological” from within a given disciplinary perspective—can make him difficult to follow. The problem is compounded by his tendency to, on occasion, switch perspectives in the course of a single work. For instance, he begins his ontological investigations with pluralistic realism only to end them with monistic idealism. As a result, Lotze’s views are frequently difficult to state, and also difficult to criticize.

Lotze also introduced a specific method of discussing different views (Ansichten) on the subject under scrutiny. He was against the hasty satisfaction of our theoretical needs and expectations through one-sided theories. Furthermore, Lotze claimed that his final solutions were merely views which satisfy “needs of the heart”. Incidentally, this point can be comfortably interpreted in the sense of Freud and Wittgenstein: philosophical puzzles are similar to mental neuroses, which can be treated by changing the perspective.

3. Theoretical Philosophy

a. Ethics

Lotze’s ethics were influenced by J. F. Herbart, who preceded Lotze in the Chair of Philosophy at Göttingen. For Herbart, philosophical exploration begins with the analysis of the objects immediately given in inner and outer experience (Pester 1997, p. 119). Being was for Herbart real—beyond and independent of the world of ideas. From this followed a strict division between theoretical and practical philosophy—reality and values, being and obligation, are independent of one another.

Lotze agreed with Herbart that we cannot draw conclusions about value from facts about reality, but he insisted that we can do the reverse; that is, we can draw conclusions about reality from facts about values. He expressed this belief in the claim that both logic and metaphysics are ultimately based on ethics. Lotze already declared this idea in his first philosophical work, his lesser Metaphysics, where he claimed that “the beginning of metaphysics lies not in itself but in ethics” (1841a, p. 329). Two years later he postulated that “the logical forms cannot be independent from metaphysical presupposition, and they also cannot be totally detached from the realm of morality” (1843, p. 7).

Of course, ethics is not presented in metaphysics in propositional form. Rather, ethics enters metaphysics in judgments about which possibilities for ordering facts correspond to an ideally presupposed order or to Lotze’s idealities (see section 2.c, above). In this sense, there is no knowledge without ethical presuppositions.

Lotze’s idealities found expression above all in the concept of value. More especially, Lotze claimed that “values are the key for the world of forms” (1857, p. 22). This position explains why he is widely considered in the literature to be the philosopher who introduced the concept of “values” into philosophy.

Lotze was adamant that the measure of values is only the “satisfaction of the sentimental needs [Gemütsbedürfnisse]” (1852, p. 242). The most natural of these satisfactions is pleasure. This means that moral principles are to be founded on the principle of delight (Lustprincip). This is an empirical solution to the problems of ethics, one clearly related to Epicurean hedonism.

This position explains why Lotze avoided Kant’s formalism of the categorical imperative. Instead, following Fries, he accepted a psychological basis of the maxims of ethics, claiming that we draw our moral principles from the immediate certainty with which we consider something as true or good (1858, p. 287).

The point which unites the subjectivism of this position with Lotze’s idiosyncratic objectivism (cf. the summary) is that, despite assuming values to be recognized via delight, he does not limit them to persons only. Rather, Lotze understands values—by way of being idealities—also as crucial for the apprehension of physical facts: they constitute the “meaning of the world in general—as a universal method for speculative expansion of all appearances” (Misch 1912, p. lxv).

b. Ontology and Metaphysics

According to Lotze’s metaphysics, the world consists of substances in relation, and so of substances and relations.  Let’s examine these categories, beginning with substances.

In the Aristotelian tradition, only wholes exhibiting an organic unity, such as a particular human being or a particular horse, can count as substances—arbitrary collections of things, like a heap of sand or the random assortment of items in a person’s pocket, do not count.

Lotze does not embrace this traditional conception of substance. Instead, he defends a constructivist position which assumes that a substance is a whole composed of parts that hang together in a particular relation of dependence. More especially, the elements of the substance (the whole) stand to one another in a relation in which the elements affect each other reciprocally, binding each other together into the whole that they constitute.

In order to specify this kind of relation, Lotze borrowed from Ammonius (28,1,14) the term effectus transeunt (“action in passing”, or “cursory action”). Effectus transeunt is the minimal effect that elements A and B exercise on each other in the substance M, in virtue of which they stay in M. Through effectus transeunt, the otherwise independent elements of the substance become interdependent. To put this in other words, effectus transeunt produces the “ontological glue” that binds elements into organic wholes.

Formally, we can describe the construction of a substance this way. The elements of a substance (a whole) stand to one another in a reciprocal relation and in a unique order (Folge) (Lotze 1879, § 69). Furthermore, if we call the whole (the substance) M, and its elements A, B, and R (A and B are particular elements which are in the focus of our attention, and R designates the sum of all unspecified elements which can occur in the whole), we can denote the whole with the formula M = φ[A B R], where φ stands for the connection between the elements. The type of connection is a resultant of the specific relations and positions of the elements of the substance, as well as of their order in it (§ 70). In fact, this is the structure of the minimal composite unity.
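Set out in display form, and purely as a restatement of Lotze’s own notation above (all symbols are taken from the preceding paragraph; nothing new is added), the schema of the minimal composite unity reads:

\[
M \;=\; \varphi\,[\,A \quad B \quad R\,],
\]

where A and B are the particular elements under consideration, R is the sum of the remaining, unspecified elements, and φ is the connection that results from the specific relations, positions, and order of the elements within the whole.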

In general, relations play a central role in Lotze’s ontology. One of his slogans was: “It belongs to the notion and nature of existing [object] to be related” (Lotze 1885, ii. p. 587). Lotze was interested in what Bertrand Russell later called “internal” relations, or relations between the elements in the substances. The substances themselves stand in “external” relations to one another.

The external relations are of various kinds, each of which has its idiosyncratic type of coordinate. For example, the system of geometrical relations and the system of colors are two networks of relations essential to the material world, but not to the world of art, or to the spiritual world of men. There are also other kinds of relation-networks (see Lotze 1856a, pp. 461–2; Lotze 1885 ii. p. 575). For instance, from the perspective of the subject, Lotze’s universe has at least two further relation-networks:

  1. that of perception; this network is the universe of what he calls “local-signs” (see section 3.e);
  2. that of judgments and concepts; this network is the universe of states of affairs (see section 3.d).

In metaphysics proper, Lotze transformed the Hegelian dichotomy between being and becoming into the trichotomy of being, becoming, and value. The given is; it is opposed both to what happens (e.g., changes) and to the validities. Transition between these three is impossible.

From the perspective of his conception of values, Lotze also suggested a new interpretation of Plato’s theory of ideas. Ideas have two characteristics: (i) they have their own autonomous being; (ii) at the same time, ideas have properties similar to those of the objects of reality. Lotze’s claim was that these two conditions are fulfilled only by values. In fact, Plato’s ideas are validities, or truths. Plato misrepresented them as “ideas” only because in Greek there is no expression for things which have no being, and values are just such things (1874, § 317). The fact that Plato’s ideas are validities, Lotze argues further, explains why they are beyond space and time, beyond things and minds, while nonetheless remaining atomistic. Lotze’s interpretation of Plato’s ideas was further developed by Paul Natorp (Natorp 1902).

c. Epistemology

Lotze’s task in epistemology was to secure knowledge, which is to be extracted from, and separated from, perception. The main characteristic of knowledge is that it is true. To Lotze, this means that it, and only it, presents the things as they really are—and, in fact, that is what is expected of thinking as its result.

The difference between perception and knowledge (or thinking; in identifying thinking and knowledge Lotze was followed by Frege) can be set out in the following way. Perception (including imagining, daydreaming, etc.) notes accidental relations of ideas, but knowledge asserts a natural fit (a “necessary connection”) among these ideas: they belong together (zusammengehören). In other words, the perceiving mind conceives “kaleidoscopically” a multiplicity of contingent pictures (Bilder) (1843, p. 72). Only then comes thinking, which consists in going through the ideas a second time, producing in this way “secondary thoughts” (Nebengedanken). The latter connect only those ideas which intrinsically belong together.

Lotze describes his “secondary thoughts” as constituting “a critical stand towards an idea.” This conception assumes that we have a kind of intuition that helps us to judge whether the connection of ideas that lies before us—in our perception—is true or false.

Some authors have claimed that this idea is a further transformation of Hegel’s method of dialectical self-development of the truth (Misch 1912, p. xxvii). But it would be more correct to say that Lotze’s secondary thoughts are an incorporation into logic of the old Platonic–Aristotelian idea of peirastic (tentative, experimental) inquiry that tests different opinions and decides which connection of ideas they make is true and which false. (This interpretation was supported by Lotze’s pupils, Julius Bergmann and Wilhelm Windelband.) Indeed, Lotze is adamant that “this inner regularity of the content sought-after, being unknown yet, is not open to us in specific realistic definitions of thought. However, being present in the form of opinion, it really has […] the defensive [intuitive] force to negotiate what is not suitable to her” (Lotze 1841a, p. 33).

d. Logic

The concept of the judgment and its content (Urteilsinhalt) played a central role in Lotze’s logic.  He claimed that the content of judgment is not an interrelation of ideas, as Hume and Mill believed, but an interrelation of objective contents, or things: it is a state of affairs (a concept introduced by Lotze and later also used by Husserl and Wittgenstein—cf. Milkov 2002). Since there is no difference between the content of judgments and reality, the state of affairs has the structure of the substance or of the minimal composite unity. This position was another expression of Lotze’s objectivism (see the summary).

But the content of judgment also has two other dimensions which have little to do with its structural characteristics:

First, the content of the judgment is asserted by the judgment. Thus, the judgment has an assertoric quality, and what Lotze calls its affirmation (Bejahung), or “positing” (Setzung). For Lotze, this is the ultimate quality of a judgment—it is what makes a judgment a judgment, as opposed to a complex of terms. Later, this conception was also adopted by Frege, who assumed that the judgment acknowledges the truth of its content, so that only this acknowledgement makes the combination of ideas a judgment. In other words, the judgment is an acceptance of content as true, or a rejection of it as false.

This characteristic of judgment was connected with a variant of the context principle, according to which a word has a meaning not in isolation but in the context of a proposition in which it occurs: “The affirmation of a single notion has no meaning which we can specify; we can affirm nothing but a judgment in which the content of one notion is brought into relation with that of another” (Lotze 1864, p. 465; Lotze 1885 ii. p. 582). Frege followed Lotze on this point as well.

Second, the content of judgment has a value: this is a point that connects Lotze’s logic with his ethics (cf. section 2.c, above). To be more specific, Lotze claimed that concepts have meaning (Bedeutung), but not value. They can have a value only through the proposition in which they occur—in its context (Lotze 1874, § 321). In 1882 Lotze’s closest pupil, Wilhelm Windelband, introduced the concept of truth-value in the wake of this idea. Nine years later, this concept was also embraced by Frege in his “Function and Concept.”

Following Herbart, and developing further the idea of content of judgment, Lotze also explored the idea of the “given” (Gegebene) in philosophy. More especially, Lotze understood the given as an “experienced content of perception” that was different from the content of judgment, or the state of affairs. Later this conception of the given was instrumental in the coining of the concept of sense-data (see Milkov 2001).

e. Philosophy of Mind

As was shown in the explanation of the principle of teleomechanism (section 2.b), Lotze was adamant that the way in which phenomena are explained in physics is not appropriate for the mental or psychical world. For instance, mechanical descriptions do not explain why we experience the effects of light-waves as color, or of sound-waves as tones. In this regard, Lotze criticized Herbart’s view that the interaction of ideas in a person’s mind (such as how ideas compete to capture a person’s attention or compel belief) is to be explained on analogy with the physical conception of force. On Lotze’s view, the content of ideas is more important than their intensity (1856a, pp. 238 ff.).

Concerning the relation between soul and body, the so-called “mind-body problem,” Lotze did not offer a positive theory—in fact, he denies that we can understand this relation—but adopted a version of occasionalism.  Occasionalism is the view that events in the mental realm are synchronized with events in the material realm in such a way that it seems that the two realms are interacting, even though they do not in fact interact.  To adopt this as a methodological stance was Lotze’s way of saying that, even though the two realms may interact, we do not need to understand how they do in order to have a perfectly good, practical theory about the relation between mind and body  (1852, pp. 77 f.).

To the extent that Lotze develops a solution to the “mind-body problem,” he does so by introducing his famous conception of local-signs (Localzeichen), which explains the relation between mind and matter in terms of our perception of space and movement. According to Lotze, what we directly see when perceiving a movement are only patches of color. What helps us to perceive the fact of movement is the effort that we ourselves make in perceiving the movement. Lotze calls this stimulus a “local-sign.” It is a means of transforming sense-perceptions into space-values.

This means that our knowledge of the connection of mind to matter is not a fruit of reflection but of activity (in this assumption Lotze followed J. G. Fichte); it is not simply a matter of grasping. Indeed, the process of space-perception is an activity of constructing external objects and events in consciousness (1856a, pp. 328 f.). This conception was another critique of the purely mechanical understanding in philosophy.

Lotze’s theory of local signs was further developed by Hermann von Helmholtz in the conception that sense-organs do not supply isomorphic pictures of the outer world, but only signals which perception further transforms into pictures. Helmholtz’s theory, in turn, was later embraced by the logical empiricists Moritz Schlick and Hans Reichenbach.

Lotze further claimed that thoughts are tools (organa) for deciphering messages of reality. This deciphering takes place in the realization of values. The aim of human thought is not to serve as a lens for immediately grasping reality, but to be valid. This means that the structure of thoughts has scarcely anything to do with the structure of the facts. Nevertheless, their effects coincide (1874, § 342). Thus, despite the fact that there are no general ideas in reality, we understand reality only through general ideas.

Lotze did not believe that this conception leads to epistemological pessimism. It is true that “reality may be more extensive than our capacities for representing it (whether by knowledge, feeling, etc.)” can assimilate (Cuming 1917, p. 163). Lotze insisted, however, that these features of reality are beyond the interests of philosophers, since they lie beyond their (human) reach (in essence, along the lines of the saying: “what the eye does not see, the heart does not grieve over”).

f. Philosophy of Nature

As a young man Lotze was friends with Ernst Friedrich Apelt, a pupil of Fries (cf. section 1.b). Through Apelt, Lotze became familiar with Friesian philosophy, which he later used as a convenient foil in the development of some of his own views. Fries’ philosophy followed Kant formally, but in fact was more mechanical and calculative than Kant’s. In truth, it was even more mechanical and calculative than the philosophy of Herbart, who himself was a well-known mechanistic Kantian.

Lotze criticized Fries for being too formal and forgetting the “deep problems” of philosophy. Specifically, Lotze attacked Fries’ (and arguably Kant’s) dynamic understanding of matter, which represents it as simply the interplay of powers. Thus construed, the standard, empirical properties of matter (such as extension, solidity, place, and so on) disappear. Against this conception, Lotze embraced a form of atomism, which he saw as necessary for the individuation of material objects. Indeed, humans understand something only when the content of their judgment is articulated, and there cannot be an articulation without individuation; furthermore, individuation is best carried out when we accept that there are atoms. Besides, Lotze was convinced that the order in the world cannot come into being from a purposeless and planless beginning—from what today is called “atomless gunk.” The point is that order presupposes articulation and individuation: it is order between individuals—between Lotze’s variables A, B, and R (cf. section 3.b).

Apparently, Lotze did not understand atoms as they were understood in antiquity: as ultimate elements of reality which have different forms, but the same substance. He did conceive of them as the ultimate building blocks of the material world, but he saw them as idiosyncratic and as remaining unmodified in all compositions and divisions. In other words, whereas the ancient atomism saw each atom as made out of the same kind of substance, Lotze saw each atom as being made of a unique kind of substance, so that each atom is sui generis.

A further difference from ancient atomism was that Lotze’s atoms were punctual (i.e., point-like), without extension (unräumlich). Indeed, extension is possible only where there are many points which can be easily identified and differentiated. The extensionless atoms find their mutual place in space through their powers. To be more specific, we conceive of them as impermeable, filling up the space, only because of their demonstrated reciprocal resistance (1856a, p. 402).

An important characteristic of matter is its passivity, i.e., its ability to be affected from the outside. True to his anthropological stance, Lotze accepted that only if two essences mutually produce their respective “sufferings” (Leiden) can they be their respective interacting causes (1864, p. 574). (The concept of “suffering” shows the influence on Lotze of his countryman Jacob Böhme; both were born in Upper Lusatia, Saxony.) At the same time, Lotze was adamant that the concepts of suffering, effecting, and interaction are only—although inescapable—scientific metaphors. We must not conceive of them literally. However, they help us to grasp the nature of the problem.

In questions of space, Lotze used his teacher Weiße, rather than Fries, as a foil. Weiße had distinguished between space and interaction (Wechselwirkung) of substance. Moreover, for Weiße, interaction is the condition of space (2003, pp. 85 f.). In contrast, Lotze differentiated, not between interaction and space (he was convinced that the two coincide), but between extension and place. “Extension” refers to an infinite multiplicity of directions. Only place, however, makes these possibilities concrete, putting them into three coordinated directions (Pester 1997, p. 110).

g. Philosophy of Language

Starting with his lesser Logic, Lotze made great efforts to elaborate a convincing philosophy of language. His first step in this direction was to connect language with logic by claiming that logic begins with exploring language forms (1843, p. 40). The reason for this assumption was that the living, unconscious “spirit of [ordinary] language” makes a connection between what one experiences concretely in sense perception, and the abstract forms that one extracts from sense perception (p. 82). (This idea was also adopted—via Frege—in Wittgenstein’s Tractatus, 3.1: “In a proposition a thought finds expression that can be perceived by the senses.”) Indeed, our language functions on the level of perceptions. This, however, is not a hindrance to our using it to convey truths of a higher order: truths of science, mathematics, logic, etc. (1856a, p. 304).

Lotze criticized the idea that language has meaning by picturing reality. According to Lotze, not even the pictures formed by perceiving are pictures proper (cf. section 3.e, above)—much less, therefore, pictures supposedly embedded in the structures of language.  Rather than performing a picturing function, language provides something of a method.  To be more specific, it provides rules for transforming signals from the sensual world into the phenomena of our mental world, and vice-versa: from our perception into the meanings we formulate and communicate with the help of the language.  In fact, the whole relation between microcosm and macrocosm was understood by Lotze in this way. The microcosm can be characterized as a “language of the macrocosm”, and at the same time, a place for understanding the possibilities of speaking about the macrocosm (Orth 1986, p. 48).

4. Philosophy and Life

a. Anthropology

Lotze was adamant that we cannot prefer logical forms over facts, as Hegel had once done. In particular, he criticized Hegel’s ladder-model of natural history, which claimed that we can deduce the value and importance of every particular species from its place on the ladder of evolution. Instead of formal (logical) rankings of living species, Lotze promoted a comparison of their natural figures (Gestalten). (From this perspective he also criticized Darwin’s theory of evolution.) The difference between the mind of animals and that of man arises not because of a difference in the elements which they contain; in both cases the same building blocks, or “mosaic-stones” (Mosaikstifte), enter into the scene. Rather, the difference results from the way in which they are combined and used (1858, p. 266).

Lotze also criticized the intellectualism of the German Idealists. Instead, he sided with the German Enlightenment’s tendency to emphasize the importance of sensuality, of feelings and imagination (Phantasie). In this key, he classified animals not according to their capacity to think (as Herder did), but according to their physical performance and forms of consumption (genießen). On this point he was criticized by many of his contemporaries, including his friends, the “speculative theists” I. H. Fichte and C. H. Weiße. These two found in the Microcosm too little idealism and too much realism (Weiße 1865, pp. 289 ff.).

This reproach was scarcely justified, for Lotze endorsed the essential difference between the human mind and that of other animals. The difference was that all human thought has reference to, or is at least formed from within, traditions: in language, science, skills, morals, as well as in practical habits and in judgments of everyday life (1858, p. 262). Moreover, Lotze claimed that “to know man means, above all, to know his vocation [Bestimmung], the means which he has in disposition to achieve it, as well as the hindrances that he must overcome in this effort” (p. 72). In this kind of anthropology, the ability to use the arm, and later also instruments, was most important.

b. Social Philosophy

Lotze treated every epoch of human culture as developed around a particular value: (i) the Orient developed a taste for the colossal, (ii) the Jews for the elevated, (iii) the Greeks for the beautiful, (iv) the Romans for dignity and elegance, (v) the Middle Ages for the fantastic and emblematic, and (vi) Modernity for the critical and inventive. These orientations and achievements are on a par with one another (1864, pp. 124 ff.). The acceptance of the plurality of values was unique in German philosophy at the time: for instance, whereas anti-Semitic judgments can easily be found in Herder and Kant, none are to be found in Lotze.

According to Lotze, achieving social progress is not a matter of quantitative growth but of reaching a “systematic complete harmony” in this or that particular culture. This state could be attained, for example, if the rules of social conduct are conceived of as a system of rights and duties of an objective spiritual (geistiges) organism (p. 424). Such a society could be considered a work of Nature, “or rather not simply of Nature, but of the Moral World Order [sittliche Weltordnung] which is independent of the individual” (p. 443).

Lotze was not convinced that the scientific and technological progress of the human race through the first half of the nineteenth century had increased its humaneness. For the increase in humanity’s power over nature was accompanied by a proportional increase in our dependence upon it. The new ways of life afforded by developing technologies created new consumption needs, but many of these new needs were superfluous—not needs at all, but only desires—and some of them could be positively harmful. Thus it is not unreasonable to think that we might have been better off without the technologies that, although they enabled humanity to solve certain practical problems, created others that were previously unknown.

However, such felt-needs/desires cannot be eliminated through mere insight into truth, e.g., by recognizing that they are superfluous and harmful. The disapproving stance on this matter, taken by Diogenes of Sinope or Rousseau, is attractive and plausible mainly as a critique. Indeed, the natural state, which they advocated, can be seen as a state of innocence, but also as one of barbarism.

As a solution to this problem Lotze accepted that there is a constant human way of life which repeats itself practically unchanged: its purposes, motives and habits have the same form. This is the course of the world (der Weltlauf), an ever-green stalk from which the colorful blossoms of history cyclically emerge. In fact, the true goods of our inner life increase only slowly, if they increase at all (1858, p. 345).

Perhaps the most interesting development of modern times is the introduction of the division of labor and the new (Protestant) phenomenon of the “profession.” (This idea was further developed by Max Weber.) An important effect of this process is that life is now divided into work and leisure (1864, p. 281; pp. 245–7).

Every profession stimulates the heart to embody a specific direction of imagination, a perspective on the world, and a way of judging. This state of affairs produced different forms of existence (Existenzarten), making modernity one of the most interesting epochs of human history. The main disadvantage of the professional life, Lotze says, is its monotony (1858, pp. 437–8).

c. Philosophy of History

The history of human society is a central subject of Lotze’s Microcosm.  Lotze’s views on this topic are best presented in contrast with what was then the standard or “mainstream” approach to history, which he faulted for lacking realism, and therefore for failing to generate genuine historical knowledge.

Mainstream history was inspired by two chief sources: Hegelianism, and what may loosely be described as positivism.  Although radically different in their guiding assumptions, these two movements overlapped in their consequences for history.

Hegel believed that history is produced by the movements of an arcane entity called “the world-spirit” (Weltgeist) and of its interaction with humanity. Specifically, Hegel believed that the Weltgeist’s goal was to bring the human race into the full realization of the idea of humanity, i.e., into an ideal state of being. To this end, it leads certain humans—through means of which they themselves are unaware—to advance the race in various ways. These humans (heroes) turn out to be the great figures in history, and their movements and achievements, as Hegel saw it, constitute history. That is, history consists not of everything that happens, but above all of great movements that advance humanity significantly toward its ideal, of those events that constitute a substantial realization of the ideal.

In short, the Hegelian approach requires commitment to an inevitably contentious idealization of humanity, an assumption about what counts as the highest realization of human nature. Lotze claimed that such theories have their place in Philosophy, but they can only skew our perceptions when allowed to control our search for fundamental data in History. In Hegel’s case, for instance, his ideal of humanity led him to neglect both the contributions of women to history (1864, pp. 47 ff.; in this regard Lotze appears as a precursor of modern feminism), and the role played by the mundane aspects of individuals’ lives—which of course constitutes the larger part of human history. (This claim shows Lotze as a predecessor of the nouvelle histoire school of Marc Bloch, which emphasized the historical study of la vie quotidienne, the facts of everyday life in the past.)

The positivist approach to history, exemplified by Leopold von Ranke and Johann Gustav Droysen, had similar consequences. By focusing too much on “objective” facts and formal considerations, and too little on the concrete, embodied, and emotional aspects of human life, it eliminated from consideration historically significant but “ordinary” elements of human life.

Lotze rejected both the idealism of Hegel and the demand for “objective facticity” that came from the positivists. Against Hegel, Lotze argued that human progress proceeds neither linearly nor ladder-wise: many achievements of human society disappear without a trace, while others disappear for a time, only to be reintroduced by new generations. Rather, Lotze saw humanity developing in a spiral pattern, in which moments of progress are offset by moments of regress. To be sure, this perspective appears rather gloomy alongside the mainstream approach, but it is clearly more realistic, and better suited to teaching humanity about itself.

Lotze agreed with Lessing’s thesis that the purpose of history is the education of humanity. (This point coheres with Lotze’s claim, discussed in section 2.b–c above, that we can understand philosophy and science starting from the history of human education and schooling.) That assumption helps to draw a more realistic picture of human progress than the one Hegelian and positivist history provided. Because he saw history as a didactic tool, Lotze’s desiderata for good historical work were shaped by his ideals for education. In particular, they were modeled on his conviction that the purpose of human spiritual life consists in the richness of an education capable of harmonizing all the aspects of a concrete, embodied person’s life. This is what drove Lotze to reject the positivists’ “objective facticity” as inadequate for history.

Lotze’s alternative was an aesthetic, or poetic, approach to history (1864, p. 46). As he saw it, poetry and history are both creative, setting up new life-worlds. The task of the historian was to present concepts as they were understood in their original contexts, exactly as they were embraced, felt, and consumed in the past—not anachronistically, as they might be understood in the present, through the “lens” of a different form of life. This task required not only the focus on empirical fact characteristic of positivist history, but also an element of poetic imagination—for only the latter could add flesh to the dry bones of empirical fact. By combining both modes of cognition, the historian was to determine how a concept fitted into the total form of life characteristic of the period in which it originated, as well as of those periods that inherited the concept—in effect, to re-create the life-world of the people whose concept it was. This line of thought was later developed by R. G. Collingwood.

d. Political Philosophy

Lotze’s political philosophy discussed such themes as social rationalization, power, bureaucracy, national values, sovereignty, and international relations. Above all, he defended enlightened, hereditary monarchy, which he saw as offering “the greatest security for steady development”—in his view, the thing of greatest value in political life (p. 444). Further, being a philosopher of the concrete, full-blooded human being, with his feelings and imagination, Lotze defended paternal patriotism; he preferred love for the concrete fatherland over love for the state with its institutions. In particular, Lotze criticized the view (defended by his contemporary Jacob Burckhardt) that the State should exist for its own sake. He also distrusted parliamentary representation and party politics.

Lotze repudiated Plato’s model of the state as an analog of the human person, and accepted instead a model of political equilibrium construed as “the result of the reciprocal action of unequal forces” (p. 423). In matters of international law, he was an advocate of a balance of power among sovereign states. He believed that “the increasing relations between the different divisions of humankind changed in great measure the significance of the political boundaries and gave new stimulus to the idea of cosmopolitanism” (p. 436).

Lotze disparaged those critics of modernity who claimed that its proponents merely defend their desire for material well-being. Moreover, although he did not use the term “liberalism,” Lotze adhered to the principles of what we would now call “classical bourgeois liberalism”; but he criticized the “Manchester liberalism” (cf. the “turbo-capitalism” of the “roaring 1990s”) that followed the ideas of such thinkers as Thomas Malthus, referring, among other things, to what today is called “the paradox of liberalism”: liberalism fails to show how an isolated human being can be a subject of rights. Indeed, right is a reciprocal, and so collective, concept: “one’s right is what the others feel for us as a duty” (p. 427).

Lotze criticized the concept of natural law employed by mainstream Western philosophers such as Aristotle and Hobbes, who claimed that law is set by nature. Instead, Lotze sympathized with the historicist conception of law developed by Leopold von Ranke and Friedrich von Savigny, who defended the thesis that the notions of law are coined in human practice. Lotze used to say that “the beginning of all legitimacy is illegitimate, although it need not be at the same time illegal” (p. 417).

e. Philosophy of Religion

The religion of the modern man was for Lotze a feeling of life (Lebensgefühl) in which awareness of the fragility of the human race is connected with a sense of conscience about one’s lay profession. (The latter point was extensively discussed by Max Weber.) Men know how modest their life-tasks are and are nevertheless happy to pursue them. This is a belief which follows conscience and the inner voice, and which is nevertheless exactly as certain as the knowledge we receive through the senses (1858, pp. 447 f.).

Lotze criticizes the Enlightenment claim that religion is only a product of human reason. If that were true, then it would be possible to replace religion with philosophy. However, for Lotze, reason alone is not enough to grasp religious truth: we learn it through revelation, which can be thought of as the historical action of God (1864, p. 546). Lotze also criticizes Fries, who compared religion, which starts from unproven truths, to science, which is also ultimately based on unproved axioms we believe. Rather, whereas the axioms of science are general and hypothetical judgments, the propositions of religion are apodictic.

A leading idea of Lotze’s philosophy of religion was that “all the processes in nature are understandable only through the continuing involvement of God; only this involvement arranges the passing of the interaction [Übergang des Wechselwirkungs] between different parts of the world” (p. 364). This claim can best be interpreted with reference to Lotze’s concept of idealities (discussed in section 2.b–c, above). Idealities are magnitudes, identifiable in experience, and are constitutive for all academic fields: science, mathematics, metaphysics. In particular, they help to orient our concepts and studies.

In more concrete terms, Lotze hung the intelligibility of natural processes on the concept of God because of his anthropological stance—that is, because of the role the concept of humanity played in his philosophy. The important point, however, is that, to him, that concept does not have a generic character; we can grasp it only in terms of particular individuals, or persons (p. 52). This explains why Lotze claimed that the kind of purposive, creative power seen in natural processes is unthinkable except in relation to a living personality with its will; and, since the processes of nature emanate from no human will, we are left with the person of God (pp. 587 ff.).

Lotze’s use of God as a necessary explanatory category is reminiscent of Kant, and has a somewhat “methodological” quality about it—we cannot prove the existence of God, Lotze thought, but we must nonetheless believe in Him, for only thus is our world ultimately intelligible. This point of Lotze’s was interpreted by the religious liberals of the fin de siècle (by the Congregationalists, in particular) as supporting the claim that religion is a matter of judgments of value in the Kingdom of God—a thesis made popular by Lotze’s contemporary Albrecht Ritschl (1822–1889), who fought against the conservative-Lutheran and confessional theology of the time.

f. Religious Practice

Lotze understood the world-religions to have started in the Orient, with the picture, familiar from the Old Testament, of the world as a system developing according to general laws. Later, the West accepted this belief in the form of Christianity. In the Age of Enlightenment, however, the West started to consider the universe as something unfinished, giving individuals the opportunity to shape it according to their own specific purposes. (This stance was theoretically grounded by Kant.) The future was seen as formless in principle, so that human action could change reality in an absolutely new way (Lotze 1864, p. 331). Embracing this view, believers abandoned quietism and embraced the vita activa. As the horizons of human imagination were reduced to the practical tasks of the earthly world, the need to connect that world with the transcendental waned. The result was the belief in progress and a turn away from God. From then on, Godhood was considered mainly in moral terms.

Pagans, in their most developed form in antiquity, believed in reason, in self-respect, and in the sublime. (Lotze called this stance the “heroism of pure reason.”) Unfortunately, the pagans failed to foster humaneness. This was the historical achievement of Christianity, which developed a totally new understanding of moral duties. Of course, the pagans recognized moral duties too; however, they understood them as having the same necessity as natural laws. Christianity—especially Protestantism—by contrast, taught its believers to carry out their duties by following their personal conscience. In consequence, Christianity (i) established an immediate connection to God, and (ii) made it possible for individual Christians to pursue their own preferred values, independently of their social background and of their actual place in society. In this way, respect for human dignity was secured.

Historically, Christianity placed importance on the activity of teaching and learning through the establishment of schools. Christianity, however, is not simply a teaching. It requires faithfulness to the historical God, realized through revelation. That is why Christian dogmatics must be preserved and cultivated.

Lotze’s conclusion was that we must look upon Christian dogmatics as posing questions about the purpose of human life, not as giving answers. Lotze was confident that every new generation would return to these questions. Of course, dogmatics can be criticized: indeed, the critical Protestant theology was, historically, the best example of such criticism. But, according to Lotze, we must not cast Christian dogmatics away as obsolete.

5. References and Further Reading

a. Primary Sources

  • Lotze, Rudolph Hermann. (1840) Gedichte, Leipzig: Weidmann.
  • Lotze, Rudolph Hermann.  (1841a). Metaphysik, Leipzig: Weidmann.
  • Lotze, Rudolph Hermann. (1841b). “Bemerkungen über den Begriff des Raumes. Sendeschreiben an C. H. Weiße,” Zeitschrift für Philosophie und Spekulative Theologie 8: 1–24; in Lotze 1885/91, i, pp. 86–108.
  • Lotze, Rudolph Hermann. (1843). Logik, Leipzig: Weidmann.
  • Lotze, Rudolph Hermann. (1845). Über den Begriff der Schönheit, Göttingen: Vandenhoeck & Ruprecht.
  • Lotze, Rudolph Hermann.  (1852). Medicinische Psychologie, oder Physiologie der Seele, Leipzig: Weidmann.
  • Lotze, Rudolph Hermann.  (1856a). Mikrokosmus: Ideen zur Naturgeschichte und Geschichte der Menschheit, Versuch einer Anthropologie, 1st vol., Leipzig: Hirzel.
  • Lotze, Rudolph Hermann.  (1856b). “Selbstanzeige des ersten Bandes des Mikrokosmus,” Göttinger gelehrte Anzeigen 199: 1977–92; in Lotze 1885/91, iii, pp. 303–14.
  • Lotze, Rudolph Hermann. (1857). Streitschriften, Part One, Leipzig: Hirzel.
  • Lotze, Rudolph Hermann. (1858). Mikrokosmus, 2nd vol., Leipzig: Hirzel.
  • Lotze, Rudolph Hermann.  (1864). Mikrokosmus, 3rd vol., Leipzig: Hirzel.
  • Lotze, Rudolph Hermann. (1868). Geschichte der Aesthetik in Deutschland, München: Cotta.
  • Lotze, Rudolph Hermann.  (1874). Logik, Leipzig: Hirzel.
  • Lotze, Rudolph Hermann.  (1879). Metaphysik, Leipzig: Hirzel.
  • Lotze, Rudolph Hermann. (1884). Outlines of Metaphysic, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann. (1885). Microcosmus: An Essay Concerning Man and his Relation to the World, 2 vols., E. Hamilton and E. E. Constance Jones, Trans., Edinburgh: T. & T. Clark.
  • Lotze, Rudolph Hermann. (1885a). Outlines of Aesthetics, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann. (1885b). Outlines of Practical Philosophy, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann. (1885c). Outlines of Philosophy of Religion, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann.  (1885/91). Kleine Schriften, ed. by David Peipers, 4 vols., Leipzig: Hirzel.
  • Lotze, Rudolph Hermann. (1886). Outlines of Psychology, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann. (1887). Outlines of Logic, trans. and ed. by G. T. Ladd, Boston: Ginn.
  • Lotze, Rudolph Hermann.  (1887). Logic (B. Bosanquet et al., trans.), 2nd ed., Oxford: Clarendon Press.
  • Lotze, Rudolph Hermann.  (1888). Metaphysic (B. Bosanquet et al., trans.), 2nd ed., Oxford: Clarendon Press.
  • Lotze, Rudolph Hermann.  (2003). Briefe und Dokumente, Zusammengestellt, eingeleitet und kommentiert von Reinhardt Pester, Würzburg: Königshausen & Neumann.

b. Secondary Sources

  • Cuming, Agnes. (1917). “Lotze, Bradley, and Bosanquet”, Mind 26: 162–70.
  • Hartmann, Eduard von. (1888). Lotze’s Philosophie, Leipzig: Friedrich.
  • Kronenberg, Moritz. (1899). Moderne Philosophen, München: Beck.
  • Kuntz, P. G. (1971). “Rudolph Hermann Lotze, Philosopher and Critic”, Introduction to: Santayana 1889, pp. 3–94.
  • Milkov, Nikolay. (2001). “The History of Russell’s Concepts ‘Sense-data’ and ‘Knowledge by Acquaintance’,” Archiv für Begriffsgeschichte 43: 221–31.
  • Milkov, Nikolay.  (2002). “Lotze’s Concept of ‘States of Affairs’ and its Critics,” Prima Philosophia 15: 437–50.
  • Milkov, Nikolay.  (2008). “Russell’s Debt to Lotze,” Studies in History and Philosophy of Science, Part A, 39: 186–93.
  • Misch, Georg. (1912). “Einleitung”, in: Hermann Rudolph Lotze, Logik, hg. von G. Misch, Leipzig: Felix Meiner, pp. ix–cxxii.
  • Natorp, Paul. (1902). Platos Ideenlehre, Leipzig: Dürr.
  • Orth, E. W. (1986). “R. H. Lotze: Das Ganze unseres Welt- und Selbstverständnisses,” in: Josef Speck (ed.), Grundprobleme der großen Philosophen. Philosophie der Neuzeit IV, Göttingen: Vandenhoeck & Ruprecht, pp. 9–51.
  • Passmore, John. (1966). A Hundred Years of Philosophy, 2nd ed., Harmondsworth: Penguin.
  • Perry, Ralph Barton. (1935). The Thought and Character of William James, 2 vols., Boston: Little, Brown, and Co.
  • Pester, Reinhardt. (1997). Hermann Lotze. Wege seines Denkens und Forschens, Würzburg: Königshausen & Neumann.
  • Pester, Reinhardt. (2003). “Unterwegs von Göttingen nach Berlin: Hermann Lotzes Psychologie im Spannungsfeld von Psychologie und Philosophie,” in L. Sprung and W. Schönpflug (eds.), Zur Geschichte der Psychologie in Berlin, 2nd ed., Frankfurt: Peter Lang, pp. 125–51.
  • Russell, Bertrand. (1918). Mysticism and Logic, 3rd ed., London: Allen & Unwin, 1963.
  • Santayana, George. (1889). Lotze’s System of Philosophy, ed. by P. G. Kuntz, Bloomington: Indiana University Press, 1971.
  • Weiße, C. H. (1865). “Rezension von Mikrokosmus by H. Lotze,” Zeitschrift für Philosophie und philosophische Kritik 47: 272–315.

c. Bibliographies

  • Kuntz, P. G. (1971). “Lotze Bibliography”, in: Santayana 1889, pp. 233–69.
  • Pester, Reinhardt. (1997). “Bibliographie”, in: Pester, pp. 344–94.

d. Biographies

  • Falckenberg, Richard. (1901). Hermann Lotze, Stuttgart: Frommann.
  • Wentscher, Max. (1913). Hermann Lotze, Heidelberg: Winter.

e. Further Reading

  • Bauch, Bruno. (1918). “Lotzes Logik und ihre Bedeutung im deutschen Idealismus”, in: Beiträge zur Philosophie des deutschen Idealismus 1: 45–58.
  • Devaux, Philippe. (1932). Lotze et Son Influence sur la Philosophie Anglo-Saxonne, Bruxelles: Lamartin.
  • Frege, Gottlob. (1883). “17 Key Sentences on Logic”, in: idem, Posthumous Writings, ed. by Brian McGuinness, Oxford: Blackwell, 1979, pp. 174–175.
  • Gabriel, Gottfried. (1989a). “Einleitung des Herausgebers. Lotze und die Entstehung der modernen Logik bei Frege”, in H. R. Lotze, Logik, Erstes Buch. Vom Denken, Hamburg: Meiner, xi–xliii.
  • Gabriel, Gottfried.  (1989b). “Einleitung des Herausgebers: Objektivität, Logik und Erkenntnistheorie bei Lotze und Frege”, in H. R. Lotze, Logik, Drittes Buch. Vom Erkennen (Methodologie), Hamburg: Meiner, xi–xxxiv.
  • Harte, Frederick E. (1913). The Philosophical Treatment of Divine Personality: from Spinoza to Hermann Lotze, London: C. H. Kelly.
  • Hauser, Kai. (2003). “Lotze and Husserl,” Archiv für die Geschichte der Philosophie 85: 152–78.
  • Heidegger, Martin. (1978). Frühe Schriften, Frankfurt: Klostermann.
  • Jones, Henry. (1895). A Critical Account of the Philosophy of Lotze: The Doctrine of Thought, Glasgow: MacLehose.
  • Kraushaar, Otto. (1938 / 1939). “Lotze as a Factor in the Development of James’s Radical Empiricism and Pluralism,” The Philosophical Review, 47: 517–26 / 49: 455–71.
  • Moore, Vida F. (1901). The Ethical Aspect of Lotze’s Metaphysics, New York: Macmillan.
  • Orth, E. W. (1984). “Dilthey und Lotze. Zur Wandlung des Philosophiebegriffs im 19. Jahrhundert,” Dilthey-Jahrbuch, 2: 140–58.
  • Robins, Edwin Proctor. (1900). Some Problems of Lotze’s Theory of Knowledge, New York: Macmillan.
  • Schoen, Henri. (1901). La Métaphysique de Hermann Lotze: La philosophie des Actions et des Réactions Réciproques, Paris: Fischbacher.
  • Stumpf, Carl. (1917). “Zum Gedächtnis Lotzes,” in: Kantstudien 22: 1–26.
  • Thomas, E. E. (1921). Lotze’s Theory of Reality, London: Longmans Green.
  • Valentine, C. W. (1911). The Philosophy of Lotze in its Theological Aspects, Glasgow: Robert Maclehose.
  • Wentscher, Max. (1924). Fechner und Lotze, München: Reinhardt.

Author Information

Nikolay Milkov
Email: nikolay.milkov@upb.de
Universität Paderborn
Germany

Mathematical Platonism

Mathematical platonism is any metaphysical account of mathematics that implies mathematical entities exist, that they are abstract, and that they are independent of all our rational activities. For example, a platonist might assert that the number pi exists outside of space and time and has the characteristics it does regardless of any mental or physical activities of human beings. Mathematical platonists are often called “realists,” although, strictly speaking, there can be realists who are not platonists because they do not accept the platonist requirement that mathematical entities be abstract.

Mathematical platonism enjoys widespread support and is frequently considered the default metaphysical position with respect to mathematics. This is unsurprising given its extremely natural interpretation of mathematical practice. In particular, mathematical platonism takes at face value such well-known truths as that “there exist” an infinite number of prime numbers, and it provides straightforward explanations of mathematical objectivity and of the differences between mathematical and spatio-temporal entities. Thus arguments for mathematical platonism typically assert that in order for mathematical theories to be true their logical structure must refer to some mathematical entities, that many mathematical theories are indeed objectively true, and that mathematical entities are not constituents of the spatio-temporal realm.

The most common challenge to mathematical platonism argues that mathematical platonism requires an impenetrable metaphysical gap between mathematical entities and human beings. Yet an impenetrable metaphysical gap would make our ability to refer to, have knowledge of, or have justified beliefs concerning mathematical entities completely mysterious. Frege, Quine, and “full-blooded platonism” offer the three most promising responses to this challenge.

Nominalism, logicism, formalism and intuitionism are traditional opponents of mathematical platonism, but these metaphysical theories are not discussed in detail in the present article.

Table of Contents

  1. What Is Mathematical Platonism?
    1. What Types of Items Count as Mathematical Ontology?
    2. What Is It to Be an Abstract Object or Structure?
    3. What Is It to Be Independent of All Rational Activities?
  2. Arguments for Platonism
    1. The Fregean Argument for Object Platonism
      1. Frege’s Philosophical Project
      2. Frege’s Argument
    2. The Quine-Putnam Indispensability Argument
  3. Challenges to Platonism
    1. Non-Platonistic Mathematical Existence
    2. The Epistemological and Referential Challenges to Platonism
  4. Full-Blooded Platonism
  5. Supplement: Frege’s Argument for Arithmetic-Object Platonism
  6. Supplement: Realism, Anti-Nominalism, and Metaphysical Constructivism
    1. Realism
    2. Anti-Nominalism
    3. Metaphysical Constructivism
  7. Supplement: The Epistemological Challenge to Platonism
    1. The Motivating Picture Underwriting the Epistemological Challenge
    2. The Fundamental Question: The Core of the Epistemological Challenge
    3. The Fundamental Question: Some Further Details
  8. Supplement: The Referential Challenge to Platonism
    1. Introducing the Referential Challenge
    2. Reference and Permutations
    3. Reference and the Löwenheim-Skolem Theorem
  9. References and Further Reading
    1. Suggestions for Further Reading
    2. Other References

1. What Is Mathematical Platonism?

Traditionally, mathematical platonism has referred to a collection of metaphysical accounts of mathematics, where a metaphysical account of mathematics is one that entails theses concerning the existence and fundamental nature of mathematical ontology. In particular, such an account of mathematics is a variety of (mathematical) platonism if and only if it entails some version of the following three Theses:

  1. Existence: Some mathematical ontology exists.
  2. Abstractness: Mathematical ontology is abstract.
  3. Independence: Mathematical ontology is independent of all rational activities, that is, the activities of all rational beings.

In order to understand platonism so conceived, it will be useful to investigate what types of items count as mathematical ontology, what it is to be abstract, and what it is to be independent of all rational activities. Let us address these topics.

a. What Types of Items Count as Mathematical Ontology?

Traditionally, platonists have maintained that the items that are fundamental to mathematical ontology are objects, where an object is, roughly, any item that may fall within the range of the first-order bound variables of an appropriately formalized theory and for which identity conditions can be provided. Section 2 provides an outline of the evolution of this conception of an object. Those readers who are unfamiliar with the terminology “first-order bound variable” can consult Model-Theoretic Conceptions of Logical Consequence. Let us call platonisms that take objects to be the fundamental items of mathematical ontology object platonisms. So, object platonism is the conjunction of three theses: some mathematical objects exist, those mathematical objects are abstract, and those mathematical objects are independent of all rational activities. In the last hundred years or so, object platonisms have been defended by Gottlob Frege [1884, 1893, 1903], Crispin Wright and Bob Hale [Wright 1983], [Hale and Wright 2001], and Neil Tennant [1987, 1997].

Nearly all object platonists recognize that most mathematical objects naturally belong to collections (for example, the real numbers, the sets, the cyclic group of order 20). To borrow terminology from model theory, most mathematical objects are elements of mathematical domains. Consult Model-Theoretic Conceptions of Logical Consequence for details. It is well recognized that the objects in mathematical domains have certain properties and stand in certain relations to one another. These distinctively mathematical properties and relations are also acknowledged by object platonists to be items of mathematical ontology.

More recently, it has become popular to maintain that the items that are fundamental to mathematical ontology are structures rather than objects. Stewart Shapiro [1997, pp. 73-4], a prominent defender of this thesis, offers the following definition of a structure:

I define a system to be a collection of objects with certain relations. … A structure is the abstract form of a system, highlighting the interrelationships among the objects, and ignoring any features of them that do not affect how they relate to other objects in the system.

According to structuralists, mathematics’ subject matter is mathematical structures. Individual mathematical entities (for example, the complex number 1 + 2i) are positions or places in such structures. Controversy exists over precisely what this amounts to. Minimally, there is agreement that the places of structures exhibit a greater dependence on one another than object platonists claim exists between the objects of the mathematical domains to which they are committed. Some structuralists add that the places of structures have only structural properties—properties shared by all systems that exemplify the structure in question—and that the identity of such places is determined by their structural properties. Michael Resnik [1981, p. 530], for example, writes:

In mathematics, I claim, we do not have objects with an “internal” composition arranged in structures, we only have structures. The objects of mathematics, that is, the entities which our mathematical constants and quantifiers denote, are structureless points or positions in structures. As positions in structures, they have no identity or features outside a structure.

An excellent everyday example of a structure is a baseball defense (abstractly construed); such positions as pitcher and shortstop are the places of this structure. Although the pitcher and shortstop of any specific baseball defense (for example, of the Cleveland Indians’ baseball defense during a particular pitch of a particular game) have a complete collection of properties, if one considers these positions as places in the structure “baseball defense,” the same is not true. For example, these places do not have a particular height, weight, or shoe size. Indeed, their only properties would seem to be those that reflect their relations to other places in the structure “baseball defense.”
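
A purely mathematical illustration of the same point may help (this is a standard example from the structuralism literature, not drawn from the texts quoted above). The finite von Neumann ordinals and the finite Zermelo ordinals are two distinct systems of sets, yet both exemplify the natural-number structure; considered as a place in that structure, the number 2 is neither of the particular sets that occupy its place in these two systems:

\[
\begin{array}{ll}
\text{von Neumann system:} & 0 = \varnothing,\quad 1 = \{\varnothing\},\quad 2 = \{\varnothing,\{\varnothing\}\},\quad \dots,\quad n+1 = n \cup \{n\}\\[2pt]
\text{Zermelo system:} & 0 = \varnothing,\quad 1 = \{\varnothing\},\quad 2 = \{\{\varnothing\}\},\quad \dots,\quad n+1 = \{n\}
\end{array}
\]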

Although we might label platonisms of the structural variety structure platonisms, they are more commonly labeled ante rem (or sui generis) structuralisms. This label is borrowed from ante rem universals—universals that exist independently of their instances. Consult Universals for a discussion of ante rem universals. Ante rem structures are typically characterized as ante rem universals that, consequently, exist independently of their instances. As such, ante rem structures are abstract, and are typically taken to exist independently of all rational activities.

b. What Is It to Be an Abstract Object or Structure?

There is no straightforward way of addressing what it is to be an abstract object or structure, because “abstract” is a philosophical term of art. Although its primary uses share something in common—they all contrast abstract items (for example, mathematical entities, propositions, type-individuated linguistic characters, pieces of music, novels, etc.) with concrete, most importantly spatio-temporal, items (for example, electrons, planets, particular copies of novels and performances of pieces of music, etc.)—its precise use varies from philosopher to philosopher. Illuminating discussions of these different uses, the nature of the distinction between abstract and concrete, and the difficulties involved in drawing this distinction—for example, whether my center of gravity/mass is abstract or concrete—can be found in [Burgess and Rosen 1997, §I.A.i.a], [Dummett 1981, Chapter 14], [Hale 1987, Chapter 3] and [Lewis 1986, §1.7].

For our purposes, the best account takes abstract to be a cluster concept, that is, a concept whose application is marked by a collection of other concepts, some of which are more important to its application than others. The most important or central member of the cluster associated with abstract is:

1. non-spatio-temporality: the item does not stand to other items in a collection of relations that would make it a constituent of the spatio-temporal realm.

Non-spatio-temporality does not require an item to stand completely outside of the network of spatio-temporal relations. It is possible, for example, for a non-spatio-temporal entity to stand in spatio-temporal relations that are, non-formally, solely temporal relations—consider, for example, type-individuated games of chess, which came into existence at approximately the time at which people started to play chess. Some philosophers maintain that it is possible for non-spatio-temporal objects to stand in some spatio-temporal relations that are, non-formally, solely spatial relations. Centers of gravity/mass are a possible candidate. Yet, the dominant practice in the philosophy of mathematics literature is to take non-spatio-temporal to have an extension that only includes items that fail to stand in all spatio-temporal relations that are, non-formally, solely spatial relations.

Also fairly central to the cluster associated with abstract are, in order of centrality:

2.  acausality: the item neither exerts a strict causal influence over other items nor does any other item causally influence it in the strict sense, where strict causal relations are those that obtain between, and only between, constituents of the spatio-temporal realm—for example, you can kick a football and cause it (in a strict sense) to move, but you can’t kick a number.

3.  eternality: where this could be interpreted as either

3a. omnitemporality: the item exists at all times, or

3b. atemporality: the item exists outside of the network of temporal relations,

4.  changelessness: none of the item’s intrinsic properties change—roughly, an item’s intrinsic properties are those that it has independently of its relationships to other items, and

5. necessary existence: the item could not have failed to exist.

An item is abstract if and only if it has enough of the features in this cluster, where the features had by the item in question must include those that are most central to the cluster.

Differences in the use of “abstract” are best accounted for by observing that different philosophers seek to communicate different constellations of features from this cluster when they apply this term. All philosophers insist that an item have Feature 1 before it may be appropriately labeled “abstract.” Philosophers of mathematics invariably mean to convey that mathematical entities have Feature 2 when they claim that mathematical objects or structures are abstract. Indeed, they typically mean to convey that such objects or structures have either Feature 3a or 3b, and Feature 4. Some philosophers of mathematics also mean to convey that mathematical objects or structures have Feature 5.

For cluster concepts, it is common to call those items that have all, or most, of the features in the cluster paradigm cases of the concept in question. With this terminology in place, the content of the Abstractness Thesis, as intended and interpreted by most philosophers of mathematics, is more precisely conveyed by the Abstractness+ Thesis: the mathematical objects or structures that exist are paradigm cases of abstract entities.

c. What Is It to Be Independent of All Rational Activities?

The most common account of the content of “X is independent of Y” is X would exist even if Y did not. Accordingly, when platonists affirm the Independence Thesis, they affirm that their favored mathematical ontology would exist even if there were no rational activities, where the rational activities in question might be mental or physical.

Typically, the Independence Thesis is meant to convey more than indicated above. The Independence Thesis is typically meant to convey, in addition, that mathematical objects or structures would have the features that they in fact have even if there were no rational activities or if there were quite different rational activities to the ones that there in fact are. We exclude these stronger conditions from the formal characterization of “X is independent of Y,” because there is an interpretation of the neo-Fregean platonists Bob Hale and Crispin Wright that takes them to maintain that mathematical activities determine the ontological structure of a mathematical realm satisfying the Existence, Abstractness, and Independence Theses, that is, mathematical activities determine how such a mathematical realm is structured into objects, properties, and relations. See, for example, [MacBride 2003]. Although this interpretation of Hale and Wright is controversial, were someone to advocate such a view, he or she would be advocating a variety of platonism.

2. Arguments for Platonism

Without doubt, it is everyday mathematical activities that motivate people to endorse platonism. Those activities are littered with assertions that, when interpreted in a straightforward way, support the Existence Thesis. For example, we are familiar with saying that there exist an infinite number of prime numbers and that there exist exactly two solutions to the equation x² − 5x + 6 = 0. Moreover, it is an axiom of standard set theories that the empty set exists.
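
As a quick check of the second assertion (a routine calculation, not part of the original text), the quadratic factors, so its two solutions can be read off directly:

\[
x^{2} - 5x + 6 = (x - 2)(x - 3) = 0 \quad\Longrightarrow\quad x = 2 \ \text{ or } \ x = 3 .
\]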

It takes only a little consideration to realize that, if mathematical objects or structures do exist, they are unlikely to be constituents of the spatio-temporal realm. For example, where in the spatio-temporal realm might one locate the empty set, or even the number four—as opposed to collections with four elements? How much does the empty set or the real number π weigh? There appear to be no good answers to these questions. Indeed, even to ask them appears to be to engage in a category mistake. This suggests that the core content of the Abstractness Thesis—that mathematical objects or structures are not constituents of the spatio-temporal realm—is correct.

The standard route to the acceptance of the Independence Thesis utilizes the objectivity of mathematics. It is difficult to deny that “there exist infinitely many prime numbers” and “2 + 2 = 4” are objective truths. Platonists argue—or, more frequently, simply assume—that the best explanation of this objectivity is that mathematical theories have a subject matter that is quite independent of rational beings and their activities. The Independence Thesis is a standard way of articulating the relevant type of independence.

So, it is easy to establish the prima facie plausibility of platonism. Yet it took the genius of Gottlob Frege [1884] to transparently and systematically bring together considerations of this type in favor of platonism’s plausibility. In the very same manuscript, Frege also articulated the most influential argument for platonism. Let us examine this argument.

a. The Fregean Argument for Object Platonism

i. Frege’s Philosophical Project

Frege’s argument for platonism [1884, 1893, 1903] was offered in conjunction with his defense of arithmetic logicism—roughly, the thesis that all arithmetic truths are derivable from general logical laws and definitions. In order to carry out a defense of arithmetic logicism, Frege developed his Begriffsschrift [1879]—a formal language designed to be an ideal tool for representing the logical structure of what Frege called thoughts. Contemporary philosophers would call them “propositions,” and they are what Frege took to be the primary bearers of truth. The technical details of Frege’s begriffsschrift need not concern us; the interested reader can consult the articles on Gottlob Frege and Frege and Language. We need only note that Frege took the logical structure of thoughts to be modeled on the mathematical distinction between a function and an argument.

On the basis of this function-argument understanding of logical structure, Frege incorporated two categories of linguistic expression into his begriffsschrift: those that are saturated and those that are not. In contemporary parlance, we call the former singular terms (or proper names in a broad sense) and the latter predicates or quantifier expressions, depending on the types of linguistic expressions that may saturate them. For Frege, the distinction between these two categories of linguistic expression directly reflected a metaphysical distinction within thoughts, which he took to have saturated and unsaturated components. He labeled the saturated components of thoughts “objects” and the unsaturated components “concepts.” In so doing, Frege took himself to be making precise the notions of object and concept already embedded in the inferential structure of natural languages.

ii. Frege’s Argument

Formulated succinctly, Frege’s argument for arithmetic-object platonism proceeds as follows:

i. Singular terms referring to natural numbers appear in true simple statements.

ii. It is possible for simple statements with singular terms as components to be true only if the objects to which those singular terms refer exist.

Therefore,

iii. the natural numbers exist.

iv. If the natural numbers exist, they are abstract objects that are independent of all rational activities.

Therefore,

v. the natural numbers are existent abstract objects that are independent of all rational activities, that is, arithmetic-object platonism is true.
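
The first three steps can be regimented schematically (this rendering is an interpretive gloss added here, not Frege’s own notation; the sample statement “2 + 2 = 4” is borrowed from the earlier discussion of objectivity). Premise (i) supplies a true simple statement, “2 + 2 = 4,” in which “4” occurs as a singular term; premise (ii) can then be written as a schema, and (iii) follows by instantiation:

\[
\text{(ii)}\quad \bigl[\varphi(t) \ \text{is true and}\ t \ \text{is a singular term}\bigr] \;\Rightarrow\; \exists x\,(x = t)
\qquad\qquad
\text{(iii)}\quad \therefore\; \exists x\,(x = 4).
\]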

In order to more fully understand Frege’s argument, let us make four observations: (a) Frege took natural numbers to be objects, because natural number terms are singular terms, (b) Frege took natural numbers to exist because singular terms referring to them appear in true simple statements—in particular, true identity statements, (c) Frege took natural numbers to be independent of all rational activities, because some thoughts containing them are objective, and (d) Frege took natural numbers to be abstract because they are neither mental nor physical. Observations (a) and (b) are important because they are the heart of Frege’s argument for the Existence Thesis, which, at least if one judges by the proportion of his Grundlagen [1884] that was devoted to establishing it, was of central concern to Frege. Observations (c) and (d) are important because they identify the mechanisms that Frege used to defend the Abstractness and Independence Theses. For further details, consult [Frege 1884, §26 and §61].

Frege’s argument for the thesis that some simple numerical identities are objectively true relies heavily on the fact that such identities allow for the application of natural numbers in representing and reasoning about reality, especially the non-mathematical parts of reality. It is applicability in this sense that Frege took to be the primary reason for judging arithmetic to be a body of objective truths rather than a mere game involving the manipulation of symbols. The interested reader should consult [Frege 1903, §91]. A more detailed formulation of Frege’s argument for arithmetic-object platonism, which incorporates the above observations, can be found below in section 5.

The central core of Frege’s argument for arithmetic-object platonism continues to be taken to be plausible, if not correct, by most contemporary philosophers. Yet its reliance on the category “singular term” presents a problem for extending it to a general argument for object platonism. The difficulty with relying on this category can be recognized once one considers extending Frege’s argument to cover mathematical domains that have more members than do the natural numbers (for example, the real numbers, complex numbers, or sets). Although there is a sense in which many natural languages do contain singular terms that refer to all natural numbers—such natural languages embed a procedure for generating a singular term to refer to any given natural number—the same cannot be said for real numbers, complex numbers, and sets. The sheer size of these domains excludes the possibility that there could be a natural language that includes a singular term for each of their members. There are uncountably many members in each such domain. Yet no language with an uncountable number of singular terms could plausibly be taken to be a natural language, at least not if what one means by a natural language is a language that could be spoken by rational beings with the same kinds of cognitive capacities that human beings have.

So, if Frege’s argument, or something like it, is to be used to establish a more wide-ranging object platonism, then that argument is either going to have to exploit some category other than singular term or it is going to have to invoke this category differently than Frege did. Some neo-Fregean platonists such as [Hale and Wright 2001] adopt the second strategy. Central to their approach is the category of possible singular term. [MacBride 2003] contains an excellent summary of their strategy. Yet the more widely adopted strategy has been to give up on singular terms altogether and instead take objects to be those items that may fall within the range of first-order bound variables and for which identity conditions can be provided. Much of the impetus for this more popular strategy came from Willard Van Orman Quine. See [1948] for a discussion of the primary clause and [1981, p. 102] for a discussion of the secondary clause. It is worth noting, however, that a similar constraint to the secondary clause can be found in Frege’s writings. See discussions of the so-called Caesar problem in, for example, [Hale and Wright 2001, Chapter 14] and [MacBride 2005, 2006].

b. The Quine-Putnam Indispensability Argument

Consideration of the Quinean strategy of taking objects to be those items that may fall within the range of first-order bound variables naturally leads us to a contemporary version of Frege’s argument for the Existence Thesis. This Quine-Putnam indispensability argument (QPIA) can be found scattered throughout Quine’s corpus. See, for example, [1951, 1963, 1981]. Yet nowhere is it developed in systematic detail. Indeed, the argument is given its first methodical treatment in Hilary Putnam’s Philosophy of Logic [1971]. To date, the most extensive sympathetic development of the QPIA is provided by Mark Colyvan [2001]. Those interested in a shorter sympathetic development of this argument should read [Resnik 2005].

The core of the QPIA is the following:

i. We should acknowledge the existence of—or, as Quine and Putnam would prefer to put it, be ontologically committed to—all those entities that are indispensable to our best scientific theories.

ii. Mathematical objects or structures are indispensable to our best scientific theories.

Therefore,

iii. We should acknowledge the existence of—be ontologically committed to—mathematical objects or structures.

Note that this argument’s conclusion is akin to the Existence Thesis. Thus, to use it as an argument for platonism, one needs to combine it with considerations that establish the Abstractness and Independence Theses.

So, what is it for a particular, perhaps single-membered, collection of entities to be indispensable to a given scientific theory? Roughly, it is for those entities to be ineliminable from the theory in question without significantly detracting from the scientific attractiveness of that theory. This characterization of indispensability suffices for noting that, prima facie, mathematical theories are indispensable to many scientific theories, for, prima facie, it is impossible to formulate many such theories—never mind formulate those theories in a scientifically attractive way—without using mathematics.

However, the indispensability thesis has been challenged. The most influential challenge was made by Hartry Field [1980]. Informative discussions of the literature relating to this challenge can be found in [Colyvan 2001, Chapter 4] and [Balaguer 1998, Chapter 6].

In order to provide a more precise characterization of indispensability, we will need to investigate the doctrines that Quine and Putnam use to motivate and justify the first premise of the QPIA: naturalism and confirmational holism. Naturalism is the abandonment of the goal of developing a first philosophy. According to naturalism, science is an inquiry into reality that, while fallible and corrigible, is not answerable to any supra-scientific tribunal. Thus, naturalism is the recognition that it is within science itself, and not in some prior philosophy, that reality is to be identified and described. Confirmational holism is the doctrine that theories are confirmed or infirmed as wholes, for, as Quine observes, it is not the case that “each statement, taken in isolation from its fellows, can admit of confirmation or infirmation …, statements … face the tribunal of sense experience not individually but only as a corporate body” [1951, p. 38].

It is easy to see the relationship between naturalism, confirmational holism, and the first premise of the QPIA. Suppose a collection of entities is indispensable to one of our best scientific theories. Then, by confirmational holism, whatever support we have for the truth of that scientific theory is support for the truth of the part of that theory to which the collection of entities in question is indispensable. Further, by naturalism, that part of the theory serves as a guide to reality. Consequently, should the truth of that part of the theory commit us to the existence of the collection of entities in question, we should indeed be committed to the existence of those entities, that is, we should be ontologically committed to those entities.

In light of this, what is needed is a mechanism for assessing whether the truth of some theory or part of some theory commits us to the existence of a particular collection of entities. In response to this need, Quine offers his criterion of ontological commitment: theories, as collections of sentences, are committed to those entities over which the first-order bound variables of the sentences contained within them must range in order for those sentences to be true.

Although Quine’s criterion is relatively simple, it is important that one appropriately grasp its application. One cannot simply read ontological commitments from the surface grammar of ordinary language. For, as Quine [1981, p. 9] explains,

[T]he common man’s ontology is vague and untidy … a fenced ontology is just not implicit in ordinary language. The idea of a boundary between being and nonbeing is a philosophical idea, an idea of technical science in the broad sense.

Rather, what is required is that one first regiment the language in question, that is, cast that language in what Quine calls “canonical notation.” Thus,

[W]e can draw explicit ontological lines when desired. We can regiment our notation. … Then it is that we can say the objects assumed are the values of the variables. … Various turns of phrase in ordinary language that seem to invoke novel sorts of objects may disappear under such regimentation. At other points new ontic commitments may emerge. There is room for choice, and one chooses with a view to simplicity in one’s overall system of the world. [Quine 1981, pp. 9-10]

To illustrate, the everyday sentence “I saw a possible job for you” would appear to be ontologically committed to possible jobs. Yet this commitment is seen to be spurious once one appropriately regiments this sentence as “I saw a job advertised that might be suitable for you.”
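
Conversely, regimentation can make a genuine commitment explicit. As a schematic illustration (the regimentation below is our gloss on the criterion, not an example taken from Quine), the earlier sentence “there exist infinitely many prime numbers” can be cast in canonical notation as

\[
\forall x\, \exists y\, \bigl(\, y > x \ \wedge\ \mathrm{Prime}(y) \,\bigr),
\]

and, since the bound variables x and y must range over numbers for this sentence to be true, a theory containing it is, by Quine’s criterion, ontologically committed to numbers.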

We now have all of the components needed to understand what it is for a particular collection of entities to be indispensable to a scientific theory. A collection of entities is indispensable to a scientific theory if and only if, when that theory is optimally formulated in canonical notation, the entities in question fall within the range of the first-order bound variables of that theory. Here, optimality of formulation should be assessed by the standards that govern the formulation of scientific theories in general (for example, simplicity, fruitfulness, conservativeness, and so forth).

Now that we understand indispensability, it is worth noting the similarity between the QPIA and Frege’s argument for the Existence Thesis. We observed above that Frege’s argument has two key components: recognition of the applicability of numbers in representing and reasoning about the world as support for the contention that arithmetic statements are true, and a logico-inferential analysis of arithmetic statements that identified natural number terms as singular terms. The QPIA encapsulates directly parallel features: ineliminable applicability to our best scientific theories (that is, indispensability) and Quine’s criterion of ontological commitment. While the language and framework of the QPIA are different from those of Frege’s argument, these arguments are, at their core, identical.

One important difference between these arguments is worth noting, however. Frege’s argument is for the existence of objects; his analysis of natural languages only allows for the categories “object” and “concept.” Quine’s criterion of ontological commitment recommends commitment to any entity that falls within the range of the first-order bound variables of any theory that one endorses. While all such entities might be objects, some might be positions or places in structures. As such, the QPIA can be used to defend ante rem structuralism.

3. Challenges to Platonism

a. Non-Platonistic Mathematical Existence

Since the late twentieth century, an increasing number of philosophers of mathematics in the platonic tradition have followed the practice of labeling their accounts of mathematics as “realist” or “realism” rather than “platonist” or “platonism.” Roughly, these philosophers take an account of mathematics to be a variety of (mathematical) realism if and only if it entails three theses: some mathematical ontology exists, that mathematical ontology has objective features, and that mathematical ontology is, contains, or provides the semantic values of the components of mathematical theories. Typically, contemporary platonists endorse all three theses, yet there are realists who are not platonists. Normally, this is because these individuals do not endorse the Abstractness Thesis. In addition to non-platonist realists, there are also philosophers of mathematics who accept the Existence Thesis but reject the Independence Thesis. Section 6 below discusses accounts of mathematics that endorse the Existence Thesis, or something very similar, yet reject either the Abstractness Thesis or the Independence Thesis.

b. The Epistemological and Referential Challenges to Platonism

Let us consider the two most common challenges to platonism: the epistemological challenge and the referential challenge. Sections 7 and 8 below contain more detailed, systematic discussions of these challenges.

Proponents of these challenges take endorsement of the Existence, Abstractness and Independence Theses to amount to endorsement of a particular metaphysical account of the relationship between the spatio-temporal and mathematical realms. Specifically, according to this account, there is an impenetrable metaphysical gap between these realms. This gap is constituted by a lack of causal interaction between these realms, which, in turn, is a consequence of mathematical entities being abstract (see [Burgess and Rosen 1997, §I.A.2.a]). Proponents of the epistemological challenge observe that, prima facie, such an impenetrable metaphysical gap would make human beings’ ability to form justified mathematical beliefs and obtain mathematical knowledge completely mysterious. Proponents of the referential challenge, on the other hand, observe that, prima facie, such an impenetrable metaphysical gap would make human beings’ ability to refer to mathematical entities completely mysterious. It is natural to suppose that human beings do have justified mathematical beliefs and mathematical knowledge, for example, that 2 + 2 = 4, and do refer to mathematical entities, for example, when we assert “2 is a prime number.” Moreover, it is natural to suppose that the obtaining of these facts is not completely mysterious. The epistemological and referential challenges are challenges to show that the truth of platonism is compatible with the unmysterious obtaining of these facts.

This raises two questions. Why do proponents of the epistemological challenge maintain that an impenetrable metaphysical gap between the mathematical and spatio-temporal realms would make human beings’ ability to form justified mathematical beliefs and obtain mathematical knowledge completely mysterious? (For readability, we shall drop the qualifier “prima facie” in the remainder of this discussion.) And, why do proponents of the referential challenge insist that such an impenetrable metaphysical gap would make human beings’ ability to refer to mathematical entities completely mysterious?

To answer the first question, consider an imaginary scenario. You are in London, England while the State of the Union address is being given. You are particularly interested in what the U.S. President has to say in this address. So, you look for a place where you can watch the address on television. Unfortunately, the State of the Union address is only being televised on a specialized channel that nobody seems to be watching. You ask a Londoner where you might go to watch the address. She responds, “I’m not sure, but if you stay here with me, I’ll let you know word for word what the President says as he says it.”  You look at her confused. You can find no evidence of devices in the vicinity (for example, television sets, mobile phones, or computers) that could explain her ability to do what she claims she will be able to. You respond, “I don’t see any TVs, radios, computers, or the like. How are you going to know what the President is saying?”

That such a response to this Londoner’s claim would be appropriate is obvious. Further, its aptness supports the contention that you can only legitimately claim knowledge of, or justified beliefs concerning, a complex state of affairs if there is some explanation available for the existence of the type of relationship that would need to exist between you and the complex state of affairs in question in order for you to have the said knowledge or justified beliefs. Indeed, it suggests something further: the only kind of acceptable explanation available for knowledge of, or justified beliefs concerning, a complex state of affairs is one that appeals directly or indirectly to a causal connection between the knower or justified believer and the complex state of affairs in question. You questioned the Londoner precisely because you could see no devices that could put her in causal contact with the President, and the only kind of explanation that you could imagine for her having the knowledge (or justified beliefs) that she was claiming she would have would involve her being in this type of contact with the President.

An impenetrable metaphysical gap between the mathematical and spatio-temporal realms of the type that proponents of the epistemological challenge insist exists if platonism is true would exclude the possibility of causal interaction between human beings, who are inhabitants of the spatio-temporal realm, and mathematical entities, which are inhabitants of the mathematical realm. Consequently, such a gap would exclude the possibility of there being an appropriate explanation of human beings having justified mathematical beliefs and mathematical knowledge. So, the truth of platonism, as conceived by proponents of the epistemological challenge, would make all instances of human beings having justified mathematical beliefs or mathematical knowledge completely mysterious.

Next, consider why proponents of the referential challenge maintain that an impenetrable metaphysical gap between the spatio-temporal and mathematical realms would make human beings’ ability to refer to mathematical entities completely mysterious. Once again, this can be seen by considering an imaginary scenario. Imagine that you meet someone for the first time and realize that you went to the same university at around the same time years ago. You begin to reminisce about your university experiences, and she tells you a story about John Smith, an old friend of hers who was a philosophy major, but who now teaches at a small liberal arts college in Ohio, was married about 6 years ago to a woman named Mary, and has three children. You, too, were friends with a John Smith when you were at the University. You recall that he was a philosophy major, intended to go to graduate school, and that a year or so ago a mutual friend told you that he is now married to a woman named Mary and has three children. You incorrectly draw the conclusion that you shared a friend with this woman while at the University. As a matter of fact, there were two John Smiths who were philosophy majors at the appropriate time, and these individuals’ lives have shared similar paths. You were friends with one of these individuals, John Smith1, while she was friends with the other, John Smith2.

Your new acquaintance proceeds to inform you that John and Mary Smith got divorced recently. You form a false belief about your old friend and his wife. What makes her statement and corresponding belief true is that, in them, “John Smith” refers to John Smith2, “Mary Smith” refers to Mary Smith2, John Smith2’s former wife, and John Smith2 and Mary Smith2 stand to a recent time in the triadic relation “x got divorced from y at time t.” Your belief is false, however, because, in it, “John Smith” refers to John Smith1, “Mary Smith” refers to Mary Smith1, John Smith1’s wife, and John Smith1 and Mary Smith1 fail to stand to a recent time in the triadic relation “x got divorced from y at time t.”

Now, consider why John Smith1 and Mary Smith1 are the referents of your use of “John and Mary Smith” while John Smith2 and Mary Smith2 are the referents of your new acquaintance’s use of this phrase. It is because she causally interacted with John Smith2 while at the University, while you causally interacted with John Smith1. In other words, your respective causal interactions are responsible for your respective uses of the phrase “John and Mary Smith” having different referents.

Reflecting on this case, you might conclude that there must be a specific type of causal relationship between a person and an item if that person is to determinately refer to that item. For example, this case might convince you that, in order for you to use the singular term “two” to refer to the number two, there would need to be a causal relationship between you and the number two. Of course, an impenetrable metaphysical gap between the spatio-temporal realm and the mathematical realm would make such a causal relationship impossible. Consequently, such an impenetrable metaphysical gap would make human beings’ ability to refer to mathematical entities completely mysterious.

4. Full-Blooded Platonism

Of the many responses to the epistemological and referential challenges, the three most promising are (i) Frege’s, as developed in the contemporary neo-Fregean literature, (ii) Quine’s, as developed by defenders of the QPIA, and (iii) a response that is commonly referred to as full-blooded or plenitudinous platonism (FBP). This third response has been most fully articulated by Mark Balaguer [1998] and Stewart Shapiro [1997].

The fundamental idea behind FBP is that it is possible for human beings to have systematically and non-accidentally true beliefs about a platonic mathematical realm—a mathematical realm satisfying the Existence, Abstractness, and Independence Theses—without that realm in any way influencing us or us influencing it. This, in turn, is supposed to be made possible by FBP combining two theses: (a) Schematic Reference: the reference relation between mathematical theories and the mathematical realm is purely schematic, or at least close to purely schematic and (b) Plenitude: the mathematical realm is VERY large. It contains entities that are related to one another in all of the possible ways that entities can be related to one another.

What it is for a reference relation to be purely schematic will be explored later. For now, these theses are best understood in light of FBP’s account of mathematical truth, which, intuitively, relies on two further theses: (1) Mathematical theories embed collections of constraints on what the ontological structure of a given “part” of the mathematical realm must be in order for the said part to be an appropriate truth-maker for the theory in question. (2) The existence of any such appropriate part of the mathematical realm is sufficient to make the said theory true of that part of that realm. For example, it is well known that arithmetic characterizes an ω-sequence, a countably infinite collection of objects that has a distinguished initial object and a successor relation that satisfies the induction principle. Thus, illustrating Thesis 1, any part of the mathematical realm that serves as an appropriate truth-maker for arithmetic must be an ω-sequence. Intuitively, one might think that not just any ω-sequence will do; rather, one needs a very specific ω-sequence, that is, the natural numbers. Yet proponents of FBP deny this intuition. According to them, illustrating Thesis 2, any ω-sequence is an appropriate truth-maker for arithmetic; arithmetic is a body of truths that concerns any ω-sequence in the mathematical realm.
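
To make Thesis 2 concrete, consider the following minimal sketch (in Python; the helper names are purely illustrative and come from no text on FBP). It builds two different ω-sequences, the natural numbers under the usual successor and the even numbers under the successor “add two,” and checks that the statement “2 + 3 = 5,” read structurally, comes out true of both. It is offered only as an informal illustration of the claim that arithmetic constrains the structure of its truth-makers without singling one of them out.

# A minimal, illustrative sketch (helper names are hypothetical): two different
# omega-sequences, each generated from its own initial element and successor.

def make_omega_sequence(initial, successor, length):
    """Return the first `length` elements of an omega-sequence."""
    elements, current = [], initial
    for _ in range(length):
        elements.append(current)
        current = successor(current)
    return elements

naturals = make_omega_sequence(0, lambda n: n + 1, 20)   # 0, 1, 2, 3, ...
evens = make_omega_sequence(0, lambda n: n + 2, 20)      # 0, 2, 4, 6, ...

# Read "2 + 3 = 5" structurally: the element in position 2, added to the
# element in position 3, is the element in position 5.  Both sequences make
# the statement true of their own elements.
assert naturals[2] + naturals[3] == naturals[5]   # 2 + 3 == 5
assert evens[2] + evens[3] == evens[5]            # 4 + 6 == 10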

Those familiar with the model-theoretic notion of “truth in a model” will recognize the similarities between it and FBP’s conception of truth. (Those who are not can consult Model-Theoretic Conceptions of Logical Consequence, where “truth in a model” is called “truth in a structure.”) These similarities are not accidental; FBP’s conception of truth is intentionally modeled on this model-theoretic notion. The outstanding feature of model-theoretic consequence is that, in constructing a model for evaluating a semantic sequent (a formal argument), one doesn’t care which specific objects one takes as the domain of discourse of that model, which specific objects or collections of objects one takes as the extension of any predicates that appear in the sequent, or which specific objects one takes as the referents of any singular terms that appear in the sequent. All that matters is that those choices meet the constraints placed on them by the sequent in question. So, for example, if you want to construct a model to show that ‘Fa & Ga’ does not follow from ‘Fa’ and ‘Gb’, you could take the domain of your model to be the set of natural numbers, assign extensions to the two predicates by requiring Ext(F) = {x: x is even} and Ext(G) = {x: x is odd}, and assign denotations Ref(a) = 2 and Ref(b) = 3. Alternatively, you could take the domain of your model to be {Hillary Clinton, Bill Clinton}, Ext(F) = {Hillary Clinton}, Ext(G) = {Bill Clinton}, Ref(a) = Hillary Clinton, and Ref(b) = Bill Clinton. A reference relation is schematic if and only if, when employing it, there is the same type of freedom concerning which items are the referents of quantifiers, predicates, and singular terms as there is when constructing a model. In model theory, the reference relation is purely schematic. This reference relation is employed largely as-is in Shapiro’s structuralist version of FBP, whereas Balaguer’s version of FBP places a few more constraints on it. Yet neither Shapiro’s nor Balaguer’s constraints undermine the schematic nature of the reference relation they employ in characterizing their respective versions of FBP.
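
The freedom just described can be illustrated with a short sketch (again in Python; the evaluate function and the two toy models are hypothetical, not drawn from any model-theoretic text). Both models make the premises ‘Fa’ and ‘Gb’ true and the would-be conclusion ‘Fa & Ga’ false; which domain is chosen is irrelevant, only the constraints imposed by the sequent matter.

# A toy evaluator for "truth in a model" (hypothetical, for illustration only).
# A model assigns extensions to the predicates F and G and referents to the
# terms a and b; truth of an atomic formula is membership of the referent in
# the relevant extension.

def evaluate(model, formula):
    """Evaluate 'Fa', 'Gb', and so on, or a conjunction such as 'Fa & Ga'."""
    if '&' in formula:
        left, right = (part.strip() for part in formula.split('&'))
        return evaluate(model, left) and evaluate(model, right)
    predicate, term = formula[0], formula[1]
    return model['ref'][term] in model['ext'][predicate]

# Model 1: a numerical domain, F = even, G = odd.
model1 = {'ext': {'F': {0, 2, 4, 6, 8}, 'G': {1, 3, 5, 7, 9}},
          'ref': {'a': 2, 'b': 3}}

# Model 2: a two-person domain.
model2 = {'ext': {'F': {'Hillary Clinton'}, 'G': {'Bill Clinton'}},
          'ref': {'a': 'Hillary Clinton', 'b': 'Bill Clinton'}}

# In both models the premises are true and the conclusion false, so the
# sequent fails; which specific objects were chosen plays no role.
for model in (model1, model2):
    assert evaluate(model, 'Fa') and evaluate(model, 'Gb')
    assert not evaluate(model, 'Fa & Ga')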

By endorsing Thesis 2, proponents of FBP endorse the Schematic Reference Thesis. Moreover, Thesis 2 and the Schematic Reference Thesis distinguish the requirements on mathematical reference (and, consequently, truth) from the requirements on reference to (and, consequently, truth concerning) spatio-temporal entities. As illustrated in section 3 above, the logico-inferential components of beliefs and statements about spatio-temporal entities have specific, unique spatio-temporal entities or collections of spatio-temporal entities as their referents. Thus, the reference relationship between spatio-temporal entities and spatio-temporal beliefs and statements is non-schematic.

FBP’s conception of reference appears to provide it with the resources to undermine the legitimacy of the referential challenge. According to proponents of FBP, in offering their challenge, proponents of the referential challenge illegitimately generalized a feature of the reference relationship between spatio-temporal beliefs and statements and spatio-temporal entities, namely, its non-schematic character.

So, the Schematic Reference Thesis is at the heart of FBP’s response to the referential challenge. By contrast, the Plenitude Thesis is at the heart of FBP’s response to the epistemological challenge. To see this, consider an arbitrary mathematical theory that places an obtainable collection of constraints on any truth-maker for that theory. If the Plenitude Thesis is true, we can be assured that there is a part of the mathematical realm that will serve as an appropriate truth-maker for this theory, because the truth of the Plenitude Thesis amounts to the mathematical realm containing some part that is ontologically structured in precisely the way required by the constraints embedded in the particular mathematical theory in question. So, the Plenitude Thesis ensures that there will be some part of the mathematical realm that will serve as an appropriate truth-maker for any mathematical theory that places an obtainable collection of constraints on its truth-maker(s). Balaguer uses the term “consistent” to pick out those mathematical theories that place obtainable constraints on their truth-maker(s). However, what Balaguer means by this is not, or at least should not be, deductive consistency. The appropriate notion is closer to Shapiro’s [1997] notion of coherence, which is a primitive modeled on set-theoretic satisfiability. Yet, however one states the above point, it has direct consequences for the epistemological challenge. As Balaguer [1998, pp. 48–9] explains:

If FBP is correct, then all consistent purely mathematical theories truly describe some collection of abstract mathematical objects. Thus, to acquire knowledge of mathematical objects, all we need to do is acquire knowledge that some purely mathematical theory is consistent […] But knowledge of the consistency of a mathematical theory … does not require any sort of contact with, or access to, the objects that the theory is about. Thus, the [epistemological challenge has] been answered: We can acquire knowledge of abstract mathematical objects without the aid of any sort of contact with such objects.

5. Supplement: Frege’s Argument for Arithmetic-Object Platonism

Frege’s argument for arithmetic-object platonism proceeds in the following way:

i. The primary logico-inferential role of natural number terms (for example, “one” and “seven”) is reflected in numerical identity statements such as “The number of states in the United States of America is fifty.”

ii. The linguistic expressions on each side of identity statements are singular terms.

Therefore, from (i) and (ii),

iii. In their primary logico-inferential role, natural number terms are singular terms.

Therefore, from (iii) and from Frege’s logico-inferential analysis of the category “object,”

iv. the items referred to by natural number terms (that is, the natural numbers) are members of the logico-inferential category object.

v. Many numerical identity statements (for example, the one mentioned in (i)) are true.

vi. An identity statement can be true only if the object referred to by the singular terms on either side of that identity statement exists.

Therefore, from (v) and (vi),

vii. the objects to which natural number terms refer (that is, the natural numbers) exist.

viii. Many arithmetic identities are objective.

ix. The existent components of objective thoughts are independent of all rational activities.

Therefore, from (viii) and (ix),

x. the natural numbers are independent of all rational activities.

xi. Thoughts with mental objects as components are not objective.

Therefore, from (viii) and (xi),

xii. the natural numbers are not mental objects.

xiii. The left hand sides of numerical identity statements of the form given in (i) show that natural numbers are associated with concepts in a specific way.

xiv. No physical objects are associated with concepts in the way that natural numbers are.

Therefore, from (xiii) and (xiv),

xv. The natural numbers are not physical objects.

xvi. Objects that are neither mental nor physical are abstract.

Therefore, from (xii), (xv), and (xvi),

xvii. the natural numbers are abstract objects.

Therefore, from (vii), (x), and (xvii),

xviii. arithmetic-object platonism is true.

Return to section 2 where this section is referenced.

6. Supplement: Realism, Anti-Nominalism, and Metaphysical Constructivism

a. Realism

Since the late twentieth century, an increasing number of philosophers of mathematics who endorse the Existence Thesis, or something very similar, have followed the practice of labeling their accounts of mathematics “realist” or “realism” rather than “platonist” or “platonism,” where, roughly, an account of mathematics is a variety of (mathematical) realism if and only if it entails three theses: that some mathematical ontology exists, that this ontology has objective features, and that this ontology is, contains, or provides the semantic values of the logico-inferential components of mathematical theories. The influences that motivated individual philosophers to adopt this practice are diverse. In the broadest of terms, however, this practice is the result of the dominance of certain strands of analytic philosophy in the philosophy of mathematics.

In order to see how one important strand contributed to the practice of labeling accounts of mathematics “realist” rather than “platonist,” let us explore, in a little more detail, Quinean frameworks, that is, frameworks that embed the doctrines of naturalism and confirmational holism. Two features of such frameworks warrant particular mention.

First, within Quinean frameworks, mathematical knowledge is on a par with empirical knowledge; both mathematical statements and statements about the spatio-temporal realm are confirmed and infirmed by empirical investigation. As such, within Quinean frameworks, neither type of statement is knowable a priori, at least in the traditional sense. Yet nearly all prominent Western thinkers have considered mathematical truths to be knowable a priori. Indeed, according to standard histories of Western thought, this way of thinking about mathematical knowledge dates back at least as far as Plato. So, to reject it is to reject something fundamental to Plato’s thoughts about mathematics. Consequently, accounts of mathematics offered within Quinean frameworks almost invariably reject something fundamental to Plato’s thoughts about mathematics. In light of this, and the historical connotations of the label “platonism,” it is not difficult to see why one might want to use an alternate label for such accounts that accept the Existence Thesis (or something very similar).

The second feature of Quinean frameworks that warrants particular mention in regard to the practice of using “realism” rather than “platonism” to label accounts of mathematics is that, within such frameworks, mathematical entities are typically treated and thought about in the same way as the theoretical entities of non-mathematical natural science. In some Quinean frameworks, mathematical entities are simply taken to be theoretical entities. This has led some to worry about other traditional theses concerning mathematics. For example, mathematical entities have traditionally been considered necessary existents, and mathematical truths have been considered to be necessary, while the constituents of the spatio-temporal realm—among them, theoretical entities such as electrons—have been considered to be contingent existents, and truths concerning them have been considered to be contingent. Mark Colyvan [2001] uses his discussion of the QPIA—in particular, the abovementioned similarities between mathematical and theoretical entities—to motivate skepticism about the necessity of mathematical truths and the necessary existence of mathematical entities. Michael Resnik [1997] goes one step further and argues that, within his Quinean framework, the distinction between the abstract and the concrete cannot be drawn in a meaningful way. Of course, if this distinction cannot be drawn in a meaningful way, one cannot legitimately espouse the Abstractness Thesis. Once again, it looks as though we have good reasons for not using the label “platonism” for the kinds of accounts of mathematics offered within Quinean frameworks that accept the Existence Thesis (or something very similar).

b. Anti-Nominalism

Most of the Quinean considerations relevant to the practice of labeling metaphysical accounts of mathematics “realist” rather than “platonist” center on problems with the Abstractness Thesis. In particular, those who purposefully characterize themselves as realists rather than platonists frequently want to deny some important feature or features in the cluster of features associated with being abstract. Such individuals typically do not question the Independence Thesis. John Burgess’ qualms about metaphysical accounts of mathematics are broader than this. He takes the primary lesson of Quine’s naturalism to be that investigations into “the ultimate nature of reality” are misguided, for we cannot reach the “God’s eye perspective” that they assume. The only perspective that we (as finite beings situated in the spatio-temporal world, using the best methods available to us, that is, the methods of common sense supplemented by scientific investigation) can obtain is a fallible, limited one that has little to offer concerning the ultimate nature of reality.

Burgess takes it to be clear that both pre-theoretic common sense and science are ontologically committed to mathematical entities. He argues that those who deny this, that is, nominalists, do so because they misguidedly believe that we can obtain a God’s eye perspective and have knowledge concerning the ultimate nature of reality. In a series of manuscripts responding to nominalists—see, for example, [Burgess 1983, 2004] and [Burgess and Rosen 1997, 2005]—Burgess has defended anti-nominalism. Anti-nominalism is, simply, the rejection of nominalism. As such, anti-nominalists endorse ontological commitment to mathematical entities, but refuse to engage in speculation about the metaphysical nature of mathematical entities that goes beyond what can be supported by common sense and science. Burgess is explicit that neither common sense nor science provide support for endorsing the Abstractness Thesis when understood as a thesis about the ultimate nature of reality. Further, given that, at least on one construal, the Independence Thesis is just as much a thesis about the ultimate nature of reality as is the Abstractness Thesis, we may assume that Burgess and his fellow anti-nominalists will be unhappy about endorsing it. Anti-nominalism, then, is another account of mathematics that accepts the Existence Thesis (or something very similar), but which cannot be appropriately labeled “platonism.”

c. Metaphysical Constructivism

The final collection of metaphysical accounts of mathematics worth mentioning because of their relationship to, but distinctness from, platonism are those that accept the Existence Thesis—and, in some cases, the Abstractness Thesis—but reject the Independence Thesis. At least three classes of accounts fall into this category. The first class consists of accounts that take mathematical entities to be constructed mental entities. At some points in his corpus, Alfred Heyting suggests that he takes mathematical entities to have this nature—see, for example, [Heyting 1931]. The second class consists of accounts that take mathematical entities to be the products of mental or linguistic human activities. Some passages in Paul Ernest’s Social Constructivism as a Philosophy of Mathematics [1998] suggest that he holds this view of mathematical entities. The third class consists of accounts that take mathematical entities to be social-institutional entities like the United States Supreme Court or Greenpeace. Reuben Hersh [1997] and Julian Cole [2008, 2009] endorse this type of social-institutional account of mathematics. Although all of these accounts are related to platonism in that they take mathematical entities to exist or they endorse ontological commitment to mathematical entities, none can be appropriately labeled “platonism.”

Return to section 3 where this section is referenced.

7. Supplement: The Epistemological Challenge to Platonism

Contemporary versions of the epistemological challenge, sometimes offered under the label “the epistemological argument against platonism,” can typically be traced back to Paul Benacerraf’s paper “Mathematical Truth” [1973]. In fairness to Frege, however, it should be noted that human beings’ epistemic access to the kind of mathematical realm that platonists take to exist was a central concern in his work. Benacerraf’s paper has inspired much discussion, an overview of which appears in [Balaguer 1998, Chapter 2]. Interestingly, very little of this extensive literature has served to develop the challenge itself in any great detail. Probably the most detailed articulation of some version of the challenge can be found in two papers collected in [Field 1989]. The presentation of the challenge provided here is inspired by Hartry Field’s formulation, yet is a little more detailed than his.

The epistemological challenge begins with the observation that an important motivation for platonism is the widely held belief that human beings have mathematical knowledge. One might maintain that it is precisely because we take human beings to have mathematical knowledge that we take mathematical theories to be true. In turn, their truth motivates platonists to take their apparent ontological commitments seriously. Consequently, while all metaphysical accounts of mathematics need to address the prima facie phenomenon of human mathematical knowledge, this task is particularly pressing for platonist accounts, for a failure to account for human beings’ ability to have mathematical knowledge would significantly diminish the attractiveness of any such account. Yet it is precisely this ability to account for human beings having mathematical knowledge that (typical) proponents of the epistemological challenge doubt platonists possess.

a. The Motivating Picture Underwriting the Epistemological Challenge

In order to understand the doubts of proponents of the epistemological challenge, one must first understand the conception or picture of platonism that motivates them. Note that, in virtue of their endorsement of the Existence, Abstractness, and Independence Theses, platonists take the mathematical realm to be quite distinct from the spatio-temporal realm. The doubts underwriting the epistemological challenge derive their impetus from a particular picture of the metaphysical relationship between these distinct realms.  According to this picture, there is an impenetrable metaphysical gap between the mathematical and spatio-temporal realms. This gap is constituted by the lack of causal interaction between these two realms, which, in turn, is a consequence of mathematical entities being abstract—see [Burgess and Rosen 1997, §I.A.2.a] for further details. Moreover, according to this picture, the metaphysical gap between the mathematical and spatio-temporal realms ensures that features of the mathematical realm are independent of features of the spatio-temporal realm. That is, features of the spatio-temporal realm do not in any way influence or determine features of the mathematical realm and vice versa. At the same time, the gap between the mathematical and spatio-temporal realms is more than merely an interactive gap; it is also a gap relating to the types of properties characteristic of the constituents of these two realms. Platonists take mathematical entities to be not only acausal but also non-spatio-temporal, eternal, changeless, and (frequently) necessary existents. Typically, constituents of the spatio-temporal world lack all of these properties.

It is far from clear that the understanding of the metaphysical relationship between the mathematical and spatio-temporal realms outlined in the previous paragraph is shared by self-proclaimed platonists. Yet this conception of that relationship is the one that proponents of the epistemological challenge ascribe to platonists. For the purposes of our discussion of this challenge, let us put to one side all concerns about the legitimacy of this conception of platonism, which, from now on, we shall simply call the motivating picture. The remainder of this section assumes that the motivating picture provides an appropriate conception of platonism and it labels as “platonic” the constituents of realms that are metaphysically isolated from and wholly different from the spatio-temporal realm in the way that the mathematical realm is depicted to be by the motivating picture.

b. The Fundamental Question: The Core of the Epistemological Challenge

Let us make some observations relevant to the doubts that underwrite the epistemological challenge. First, according to the motivating picture, the mathematical realm is that to which pure mathematical beliefs and statements answer for their truth or falsity. Such beliefs are about this realm and so are true when, and only when, they are appropriately related to this realm. Second, according to all plausible contemporary accounts of human beings, human beliefs in general, and, hence, human mathematical beliefs in particular, are instantiated in human brains, which are constituents of the spatio-temporal realm. Third, it has been widely acknowledged since ancient times that beliefs or statements that are true purely by accident do not constitute knowledge. Thus, in order for a mathematical belief or statement to be an instance of mathematical knowledge, it must be more than simply true; it must be non-accidentally true.

Let us take a mathematical theory to be a non-trivial, systematic collection of mathematical beliefs. Informally, it is the collection of mathematical beliefs endorsed by that theory. In light of the above observations, in order for a mathematical theory to embed mathematical knowledge, there must be something systematic about the way in which the beliefs in that theory are non-accidentally true.

Thus, according to the motivating picture, in order for a mathematical theory to embed mathematical knowledge, a distinctive, non-accidental and systematic relationship must obtain between two distinct and metaphysically isolated realms. That relationship is that the mathematical realm must make true, in a non-accidental and systematic way, the mathematical beliefs endorsed by the theory in question, which are instantiated in the spatio-temporal realm.

In response to this observation, it is reasonable to ask platonists, “What explanation can be provided of this distinctive, non-accidental and systematic relationship obtaining between the mathematical realm and the spatio-temporal realm?” As Field explains, “there is nothing wrong with supposing that some facts about mathematical entities are just brute facts, but to accept that facts about the relationship between mathematical entities and human beings are brute and inexplicable is another matter entirely” [1989, p. 232]. The above question—which this section will call the fundamental question—is the heart of the epistemological challenge to platonism.

c. The Fundamental Question: Some Further Details

Let us make some observations that motivate the fundamental question. First, all human theoretical knowledge requires a distinctive type of non-accidental, systematic relationship to obtain. Second, for at least the vast majority of spatio-temporal theories, the obtaining of this non-accidental, systematic relationship is underwritten by causal interaction between the subject matter of the theory in question and human brains. Third, there is no causal interaction between the constituents of platonic realms and human brains. Fourth, the lack of causal interaction between platonic realms and human brains makes it apparently mysterious that the constituents of such realms could be among the relata of a non-accidental, systematic relationship of the type required for human, theoretical knowledge.

So, the epistemological challenge is motivated by the acausality of mathematical entities. Yet Field’s formulation of the challenge includes considerations that go beyond the acausality of mathematical entities. Our discussion of the motivating picture made it clear that, in virtue of its abstract nature, a platonic mathematical realm is wholly different from the spatio-temporal realm. These differences ensure that not only causal explanations, but also other explanations grounded in features of the spatio-temporal realm, are unavailable to platonists in answering the fundamental question. This fact is non-trivial, for explanations grounded in features of the spatio-temporal realm other than causation do appear in natural science. For examples, see [Batterman 2001]. So, a platonist wanting to answer the fundamental question must highlight a mechanism that is not underwritten by any of the typical features of the spatio-temporal realm.

Now, precisely what type of explanation is being sought by those asking the fundamental question? Proponents of the epistemological challenge insist that the motivating picture makes it mysterious that a certain type of relationship could obtain. Those asking the fundamental question are simply looking for an answer that would dispel their strong sense of mystery with respect to the obtaining of this relationship. A plausible discussion of a mechanism that, like causation, is open to investigation, and thus has the potential for making the obtaining of this relationship less than mysterious, should satisfy them. Further, the discussion in question need not provide all of the details of the said explanation. Indeed, if one considers an analogous question with regard to spatio-temporal knowledge, one sees that the simple recognition of some type of causal interaction between the entities in question and human brains is sufficient to dispel the (hypothetical) sense of mystery in question in this case.

Next, ask, “Is the fundamental question legitimate?” That is, should platonists feel the need to answer it? It is reasonable to maintain that they should. It is a justified belief that explanations should be available for many types of relationships, including the distinctive, non-accidental and systematic relationship required in order for someone to have knowledge of a complex state of affairs. It is this justified belief that legitimizes the fundamental question. One instance of it is the belief that some type of explanation should be, in principle, available for the obtaining of the specific, non-accidental and systematic relationship required for human mathematical knowledge if this is knowledge of an existent mathematical realm. It is illegitimate to provide a metaphysical account of mathematics that rules out the possibility of such an explanation being available, because doing so would be contrary to this justified belief. The fundamental question is a challenge to platonists to show that they have not made this illegitimate move.

Return to section 3 where this section is referenced.

8. Supplement: The Referential Challenge to Platonism

In the last century or so, the philosophy of mathematics has been dominated by analytic philosophy. One of the primary insights guiding analytic philosophy is that language serves as a guide to the ontological structure of reality. One consequence of this insight is that analytic philosophers have a tendency to assimilate ontology to those items that are the semantic values of true beliefs or statements, that is, the items in virtue of which true beliefs or statements are true. This assimilation played an important role in both of the arguments for platonism developed in section 2. The relevant language-world relations are embedded in Frege’s logico-inferential analysis of the categories of object and concept and in Quine’s criterion of ontological commitment. This assimilation is at the heart of the referential challenge (to platonism).

a. Introducing the Referential Challenge

Before developing the referential challenge, let us think carefully about the following claim: “Pure mathematical beliefs and statements are about the mathematical realm, and so are true when, and only when, they are appropriately related to this realm.” What precisely is it for a belief or statement to be about something? And, what is the appropriate relationship that must obtain in order for whatever a belief or statement is about to make that belief or statement true? It is natural to suppose that the logico-inferential components of beliefs and statements have semantic values. Beliefs and statements are “about” these semantic values. Beliefs and statements are true when, and only when, these semantic values are related in the way that those beliefs and statements maintain that they are. The formal mathematical theory that theorizes about this appropriate relation is model theory. Moreover, on the basis of the above, it is reasonable to suppose that the semantic values of the logico-inferential components of beliefs and statements are, roughly, set or determined by means of causal interaction between human beings and those semantic values.

Applying these observations to the claim “pure mathematical beliefs and statements are about the mathematical realm, and so are true when, and only when, they are appropriately related to this realm,” we find that it maintains that constituents of a mathematical realm are the semantic values of the logico-inferential components of pure mathematical beliefs and statements. Further, such beliefs and statements are true when, and only when, the appropriate semantic values are related to one another in the way that the said beliefs and statements maintain that they are related—more formally, the way demanded by the model-theoretic notion of truth in a model.

So far, our observations have been easily applicable to the mathematical case. Yet they highlight a problem. How are the appropriate semantic values of the logico-inferential components of pure mathematical beliefs and statements set or determined? If platonists are correct about the metaphysics of the mathematical realm, then no constituent of that realm causally interacts with any human being. Yet it is precisely causal interaction between human beings and the semantic values of beliefs and statements about the spatio-temporal world that is responsible for setting or determining the semantic values of such beliefs and statements. The referential challenge is a challenge to platonists to explain how constituents of a platonic mathematical realm could be set or fixed as the semantic values of human beliefs and statements.

b. Reference and Permutations

Two specific types of observations have been particularly important in conveying the force of the referential challenge. The first is the recognition that a variety of mathematical domains contain non-trivial automorphisms, which means that there is a non-trivial, structure-preserving, one-to-one and onto mapping from the domain to itself. A consequence of such automorphisms is that it is possible to systematically reassign the semantic values of the logico-inferential components of a theory that has such a domain as its subject matter in a way that preserves the truth values of the beliefs or statements of that theory. For example, consider the theory of the group (Z, +), that is, the group whose elements are the integers …, -2, -1, 0, 1, 2, … and whose operation is addition. If one takes each integer term to have the negation of its usual referent as its semantic value (that is, ‘2’ refers to -2, ‘-3’ refers to 3, and so forth), then the truth values of the statements or beliefs that constitute this theory would be unaltered. For example, “2 + 3 = 5” would be true in virtue of -2 + -3 being equal to -5. A similar situation arises for complex analysis if one takes each term of the form ‘a+bi’ to have the complex number a-bi as its semantic value rather than the complex number a+bi.
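
As an informal check on the permutation just described, the following Python sketch (with purely illustrative function names) verifies that the map sending each integer n to -n is a non-trivial automorphism of the additive group of integers, so that every true addition statement remains true when numerals are reassigned accordingly.

# An illustrative check (function name hypothetical): h(n) = -n is a
# non-trivial automorphism of the integers under addition.

def h(n):
    """The permutation sending each integer n to -n."""
    return -n

# h preserves the additive structure: h(a + b) = h(a) + h(b).
for a in range(-50, 51):
    for b in range(-50, 51):
        assert h(a + b) == h(a) + h(b)

# So "2 + 3 = 5" remains true under the reassignment: the permuted referents
# -2, -3, and -5 stand in exactly the same additive relation.
assert 2 + 3 == 5
assert h(2) + h(3) == h(5)   # -2 + -3 == -5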

To see how this sharpens the referential challenge, suppose, perhaps per impossibile, that you and your acquaintance each know a person named “John Smith.” John Smith1 and John Smith2 are actually indistinguishable on the basis of the properties and relations that you discuss with your new acquaintance. That is, all of the consequences of all of the true statements that your new acquaintance makes about John Smith2 are also true of John Smith1, and all of the consequences of all of the true statements that you make about John Smith1 are also true of John Smith2. Under this supposition, her statements are still true in virtue of her using “John Smith” to refer to John Smith2, and your statements are still true in virtue of you using “John Smith” to refer to John Smith1. Using this as a guide, you might claim that ‘2 + 3 = 5’ should be true in virtue of ‘2’ referring to 2, ‘3’ referring to 3, and ‘5’ referring to 5 rather than in virtue of ‘2’ referring to the number -2, ‘3’ referring to the number -3, and ‘5’ referring to the number -5, as would be allowed by the automorphism mentioned above. One way to put this intuition is that 2, 3, and 5 are the intended semantic values of ‘2’, ‘3’, and ‘5’, and, intuitively, beliefs and statements should be true in virtue of the intended semantic values of their components being appropriately related to one another, not in virtue of other items (for example, -2, -3, and -5) being so related. Yet, in the absence of any causal interaction between the integers and human beings, what explanation can be provided of ‘2’, ‘3’, and ‘5’ having their intended semantic values rather than some other collection of semantic values that preserves the truth values of arithmetic statements?

c. Reference and the Löwenheim-Skolem Theorem

The sharpening of the referential challenge discussed in the previous section is an informal, mathematical version of Hilary Putnam’s permutation argument. See, for example, [Putnam 1981]. A related model-theoretic sharpening of the referential challenge, also due to Putnam [1983], exploits an important result from mathematical logic: the Löwenheim-Skolem theorem. According to the Löwenheim-Skolem theorem, any first-order theory that has a model has a model whose domain is countable, where a model can be understood, roughly, as a specification of semantic values for the components of the theory. To understand the importance of this result, consider first-order complex analysis and its prima facie intended subject matter, that is, the domain of complex numbers. Prima facie, the intended semantic value of a complex number term of the form ‘a+bi’ is the complex number a+bi. Now, the domain of complex numbers is uncountable. So, according to the Löwenheim-Skolem theorem, it is possible to assign semantic values to terms of the form ‘a+bi’ in a way that preserves the truth values of the beliefs or statements of complex analysis, and which is such that the assigned semantic values are drawn from a countable domain whose ontological structure is quite unlike that of the domain of complex numbers. Indeed, not only the truth of first-order complex analysis, but the truth of all first-order mathematics can be sustained by assigning semantic values drawn from a countable domain to the logico-inferential components of first-order mathematical theories. Since most of mathematics is formulated (or formulable) in a first-order way, we are left with the question, “How, in the absence of causal interaction between human beings and the mathematical realm, can a platonist explain a mathematical term having its intended semantic value rather than an alternate value afforded by the Löwenheim-Skolem theorem?”

Strictly speaking, a platonist could bite a bullet here and simply maintain that there is only one platonic mathematical domain, a countable one, and that this domain is the actual, if not intended, subject matter of most mathematics. Yet this is not a bullet that most platonists want to bite, for they typically want the Existence Thesis to cover not only a countable mathematical domain, but all of the mathematical domains typically theorized about by mathematicians and, frequently, numerous other domains about which human mathematicians have not, as yet, developed theories. As soon as the scope of the Existence Thesis is so extended, the sharpening of the referential challenge underwritten by the Löwenheim-Skolem theorem has force.

Return to section 3 where this section is referenced.

9. References and Further Reading

a. Suggestions for Further Reading

  • Balaguer, Mark 1998. Platonism and Anti-Platonism in Mathematics, New York, NY: Oxford University Press.
    • The first part of this book provides a relatively gentle introduction to full-blooded platonism. It also includes a nice discussion of the literature surrounding the epistemological challenge.
  • Balaguer, Mark 2008. Mathematical Platonism, in Proof and Other Dilemmas: Mathematics and Philosophy, ed. Bonnie Gold and Roger Simons, Washington, DC: Mathematics Association of America: 179–204.
    • This article provides a non-technical introduction to mathematical platonism. It is an excellent source of references relating to the topics addressed in this article.
  • Benacerraf, Paul 1973. Mathematical Truth, Journal of Philosophy 70: 661–79.
    • This paper contains a discussion of the dilemma that motivated contemporary interest in the epistemological challenge to platonism. It is relatively easy to read.
  • Burgess, John and Gideon Rosen 1997. A Subject With No Object: Strategies for Nominalistic Interpretation of Mathematics, New York, NY: Oxford University Press.
    • The majority of this book is devoted to a technical discussion of a variety of strategies for nominalizing mathematics. Yet §1A and §3C contain valuable insights relating to platonism. These sections also provide an interesting discussion of anti-nominalism.
  • Colyvan, Mark 2001. The Indispensability of Mathematics, New York, NY: Oxford University Press.
    • This book offers an excellent, systematic exploration of the Quine-Putnam Indispensability Argument and some of the most important challenges that have been leveled against it. It also discusses a variety of motivations for being a non-platonist realist rather than a platonist.
  • Field, Hartry 1980. Science Without Numbers, Princeton, NJ: Princeton University Press.
    • This book contains Field’s classic challenge to the Quine-Putnam Indispensability Argument. Much of it is rather technical.
  • Frege, Gottlob 1884. Die Grundlagen der Arithmetik: eine logisch-mathematische Untersuchung über den Begriff der Zahl, translated by John Langshaw Austin as The Foundations of Arithmetic: A logico-mathematical enquiry into the concept of number, revised 2nd edition 1974, New York, NY: Basil Blackwell.
    • This manuscript is Frege’s original, non-technical, development of his platonist logicism.
  • Hale, Bob and Crispin Wright 2001. The Reason’s Proper Study: Essays towards a Neo-Fregean Philosophy of Mathematics, New York, NY: Oxford University Press.
    • This book collects together many of the most important articles from Hale’s and Wright’s defense of neo-Fregean platonism. Its articles vary in difficulty.
  • MacBride, Fraser 2003. Speaking with Shadows: A Study of Neo-Logicism, British Journal for the Philosophy of Science 54: 103–163.
    • This article provides an excellent summary of Hale’s and Wright’s neo-Fregean logicism. It is relatively easy to read.
  • Putnam, Hilary 1971. Philosophy of Logic, New York, NY: Harper Torch Books.
    • This manuscript contains Putnam’s systematic development of the Quine-Putnam Indispensability Argument.
  • Resnik, Michael 1997. Mathematics as a Science of Patterns, New York, NY: Oxford University Press.
    • This book contains Resnik’s development and defense of a non-platonist, realist structuralism. It contains an interesting discussion of some of the problems with drawing the abstract/concrete distinction.
  • Shapiro, Stewart 1997. Philosophy of Mathematics: Structure and Ontology, New York, NY: Oxford University Press.
    • This book contains Shapiro’s development and defense of a platonist structuralism. It also offers answers to the epistemological and referential challenges.
  • Shapiro, Stewart 2005. The Oxford Handbook of Philosophy of Mathematics and Logic, New York, NY: Oxford University Press.
    • This handbook contains excellent articles addressing a variety of topics in the philosophy of mathematics. Many of these articles touch on themes relevant to platonism.

b. Other References

  • Batterman, Robert 2001. The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction, and Emergence, New York, NY: Oxford University Press.
  • Burgess, John 1983. Why I Am Not a Nominalist, Notre Dame Journal of Formal Logic 24: 41–53.
  • Burgess, John 2004. Mathematics and Bleak House, Philosophia Mathematica 12: 18–36.
  • Burgess, John and Gideon Rosen 2005. Nominalism Reconsidered, in The Oxford Handbook of Philosophy of Mathematics and Logic, ed. Stewart Shapiro, New York, NY: Oxford University Press: 515–35.
  • Cole, Julian 2008. Mathematical Domains: Social Constructs? in Proof and Other Dilemmas: Mathematics and Philosophy, ed. Bonnie Gold and Roger Simons, Washington, DC: Mathematics Association of America: 109–28.
  • Cole, Julian 2009. Creativity, Freedom, and Authority: A New Perspective on the Metaphysics of Mathematics, Australasian Journal of Philosophy 87: 589–608.
  • Dummett, Michael 1981. Frege: Philosophy of Language, 2nd edition, Cambridge, MA: Harvard University Press.
  • Ernest, Paul 1998. Social Constructivism as a Philosophy of Mathematics, Albany, NY: State University of New York Press.
  • Field, Hartry 1989. Realism, Mathematics, and Modality, New York, NY: Basil Blackwell.
  • Frege, Gottlob 1879. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens, Halle a. Saale: Verlag von Louis Nebert.
  • Frege, Gottlob 1893. Grundgesetze der Arithmetik, Band 1, Jena, Germany: Verlag von Hermann Pohle.
  • Frege, Gottlob 1903. Grundgesetze der Arithmetik, Band 2, Jena, Germany: Verlag von Hermann Pohle.
  • Hale, Bob 1987. Abstract Objects, New York, NY: Basil Blackwell.
  • Hersh, Reuben 1997. What Is Mathematics, Really? New York, NY: Oxford University Press.
  • Heyting, Alfred 1931. Die intuitionistische Grundlegung der Mathematik, Erkenntnis 2: 106–115, translated in Paul Benacerraf and Hilary Putnam, Philosophy of Mathematics: Selected Readings, 2nd edition, 1983: 52–61.
  • Lewis, David 1986. On the Plurality of Worlds, New York, NY: Oxford University Press.
  • MacBride, Fraser 2005. The Julio César Problem, Dialectica 59: 223–36.
  • MacBride, Fraser 2006. More problematic than ever: The Julius Caesar objection, in Identity and Modality: New Essays in Metaphysics, ed. Fraser MacBride, New York, NY: Oxford University Press: 174–203.
  • Putnam, Hilary 1981. Reason, Truth, and History, New York, NY: Cambridge University Press.
  • Putnam, Hilary 1983. Realism and Reason, New York, NY: Cambridge University Press.
  • Quine, Willard Van Orman 1948. On what there is, Review of Metaphysics 2: 21–38.
  • Quine, Willard Van Orman 1951. Two dogmas of empiricism, Philosophical Review 60: 20–43, reprinted in From a Logical Point of View, 2nd edition 1980, Cambridge, MA: Harvard University Press: 20–46.
  • Quine, Willard Van Orman 1963. Set Theory and Its Logic, Cambridge, MA: Harvard University Press.
  • Quine, Willard Van Orman 1981. Theories and Things, Cambridge, MA: Harvard University Press.
  • Resnik, Michael 1981. Mathematics as a science of patterns: Ontology and reference, Noûs 15: 529–50.
  • Resnik, Michael 2005. Quine and the Web of Belief, in The Oxford Handbook of Philosophy of Mathematics and Logic, ed. Stewart Shapiro, New York, NY: Oxford University Press: 412–36.
  • Shapiro, Stewart 1991. Foundations Without Foundationalism: A Case for Second Order Logic, New York, NY: Oxford University Press.
  • Shapiro, Stewart 1993. Modality and ontology, Mind 102: 455–481.
  • Tennant, Neil 1987. Anti-Realism and Logic, New York, NY: Oxford University Press.
  • Tennant, Neil 1997. On the Necessary Existence of Numbers, Noûs 31: 307–36.
  • Wright, Crispin 1983. Frege’s Conception of Numbers as Objects, volume 2 of Scots Philosophical Monograph, Aberdeen, Scotland: Aberdeen University Press.

Author Information

Julian C. Cole
Email: colejc@buffalostate.edu
Buffalo State College
U. S. A.

The Applicability of Mathematics

The applicability of mathematics can lie anywhere on a spectrum from the completely trivial to the utterly mysterious. At one extreme, mathematics is used outside of mathematics in cases that range from everyday calculations, like the attempt to balance one’s checkbook, to the most demanding abstract modeling of subatomic particles. The techniques underlying these applications are perfectly clear to those who have mastered them, and there seems to be little for the philosopher to say about such cases. At the other extreme, scientists and philosophers have often mentioned the remarkable power that mathematics provides to the scientist, especially in the formulation of new scientific theories. Most famously, Wigner claimed that “The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.” And according to Kant, “In any special doctrine of nature there can be only as much proper science as there is mathematics therein.” Many agree that the problem of understanding the significant tie between mathematics and modern science is an interesting and important challenge for the philosopher of mathematics.

As philosophers, our first goal should be to clarify the different problems associated with the applicability of mathematics. This article suggests some potential solutions to these problems. Section 1 considers one version of the problem of applicability tied to what is often called “Frege’s Constraint,” which is the view that an adequate account of a mathematical domain must explain the applicability of this domain outside of mathematics. Section 2 considers the role of mathematics in the formulation and discovery of new theories. This leaves out several different potential contributions that mathematics might make to science such as unification, explanation and confirmation. These are discussed in section 3, where it is suggested that a piecemeal approach to understanding the applicability of mathematics is the most promising strategy for philosophers to pursue.

Table of Contents

  1. Reasoning
  2. Formulation and Discovery
  3. Unification, Explanation and Confirmation
  4. References and Further Reading

1. Reasoning

Gottlob Frege (1848-1925) remains one of the most influential philosophers of mathematics and is thought by many to be the first philosopher in the analytic tradition. Frege’s main goal was to argue for a logicist account of arithmetic. This is the view that all arithmetical concepts can be defined in wholly logical terms and that all arithmetical truths can be proved using only logical resources. While this characterization of logicism makes no link to the applicability of arithmetic, Frege maintained that the correct account of the natural numbers must make their role in counting transparent. It is hard to find an argument for this requirement in Frege’s writings, though, or to understand what meeting it really requires. After surveying some possible interpretations of Frege’s demand, this section considers structuralist interpretations of mathematics which reject Frege’s approach.

One of Frege’s opponents is the formalist who insists that mathematics is a game that we play with symbols according to arbitrarily stipulated rules. To the formalist, mathematics is not about anything, and strings of mathematical symbols are never sentences which express meaningful claims. Against the formalist, Frege noted that “it is application alone that elevates arithmetic beyond a game to the rank of a science. So applicability necessarily belongs to it.” A remark from earlier in this passage makes clear what sense of “applicability” Frege has in mind: “Why can one get applications of arithmetical equations? Only because they express thoughts”  (Wilholt 2006, p. 72). That is, some strings of mathematical symbols are sentences which express meaningful claims, and for this reason, these sentences can be premises in arguments whose conclusions pertain to non-mathematical domains. The formalist has no way to account for the role of mathematical sentences in arguments. It is only by treating mathematical sentences like other sentences of our language that we are able to account for the role of mathematics in scientific arguments, says Frege.

In this sense of “applicability,” it is fairly uncontroversial that mathematics is applicable; and we can grant that any viable philosophy of mathematics must supply a subject-matter for mathematical claims. But notice that Frege’s argument against formalism does not rule out a two-stage view of applicability. This view proposes that mathematical claims are about an exclusively mathematical domain and that these claims play a role in scientific arguments only because there are premises which link the mathematical domain to whatever non-mathematical domain the conclusion of the argument is about. By contrast, Frege’s one-stage approach insists that the subject-matter of mathematics relates directly to whatever the mathematics is applied to. Given this distinction, we need to examine how Frege could argue for his one-stage approach. Simply appealing to the role of mathematical claims in scientific arguments is not sufficient to rule out a two-stage approach.

Another view which Frege targets is John Stuart Mill’s empiricism about arithmetic. This is the view that the subject-matter of arithmetic is physical regularities such as the results of combining physical objects together to form larger aggregates. Frege insists that empiricism is not able to account for the wide scope of the applicability of mathematics:

The basis of arithmetic lies deeper, it seems, than that of any of the empirical sciences, and even than that of geometry. The truths of arithmetic govern all that is numerable. This is the widest domain of all; for to it belongs not only the actual, not only the intuitable, but everything thinkable. Should not the laws of number, then, be connected very intimately with the laws of thought? (Frege 1884, §14)

For example, we can count the figures or forms of the valid Aristotelian syllogisms. Assuming these figures are not physical objects, the empiricist is without an explanation of the applicability of using numbers to count these objects. Frege’s own proposal related the applicability of numbers in counting to the applicability of a concept: “The content of a statement of number is an assertion about a concept” (Frege 1884, §46). As concepts have all sorts of objects falling under them, including non-physical objects such as the figures of the syllogism, the wide scope of the applicability of arithmetic is accounted for.

Frege’s link between numbers, counting and concepts does not by itself yield a satisfactory characterization of what the numbers are. Later in Foundations, Frege presents Hume’s Principle as a potential definition of the numbers. This principle is that the number of Fs is identical to the number of Gs if and only if the objects falling under the concept F can be put in one-one correspondence with the objects falling under the concept G. Notice that Hume’s Principle would provide a direct explanation of the wide scope of the applicability of arithmetic in counting, for it makes the identity of the numbers turn on issues related to what concepts these numbers are applied to. With Hume’s Principle, an agent could then identify each number using such a concept and go on to reason about the numbers effectively.
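
For readers who find a concrete illustration helpful, here is a toy Python sketch of Hume’s Principle (the example concepts and the equinumerous helper are hypothetical and serve only to fix ideas). For finite concepts, the number of Fs is the number of Gs exactly when the Fs and the Gs can be paired off one-one, and the test applies to any objects whatsoever that fall under a concept, physical or not.

# A toy rendering of Hume's Principle (concepts and helper are hypothetical).

def equinumerous(fs, gs):
    """True iff the Fs can be put in one-one correspondence with the Gs;
    for finite collections this amounts to their having the same size."""
    return len(set(fs)) == len(set(gs))

# Concepts can be true of any kind of object, physical or not.
syllogistic_figures = {"first figure", "second figure", "third figure", "fourth figure"}
cardinal_directions = {"north", "south", "east", "west"}
primary_colors = {"red", "yellow", "blue"}

# The number of syllogistic figures is identical to the number of cardinal
# directions, since the two concepts' instances can be paired off one-one;
# it is not identical to the number of primary colors.
assert equinumerous(syllogistic_figures, cardinal_directions)
assert not equinumerous(syllogistic_figures, primary_colors)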

Frege eventually rejected Hume’s Principle as an unsatisfactory definition, although his own preferred explicit definition recovers it as a theorem. Contemporary neo-Fregeans continue to insist against Frege that Hume’s Principle is a successful definition of the natural numbers after all. Even some philosophers who completely reject Frege’s approach to arithmetic nevertheless grant the need to account for the wide scope of applicability of arithmetic. For example, Michael Dummett has endorsed these aspects of Frege’s project: “Frege’s objective was to destroy the illusion that any miracle occurs [in applications]. The possibility of the applications was built into the theory from the outset; its foundations must be so constructed as to display the most general form of those applications, and then particular applications will not appear a miracle” (Dummett 1991, p. 293). It should be clear, though, that the wide scope of applicability of a mathematical domain is not by itself sufficient to rule out a two-stage account of applications. To see why, suppose that we have identified the subject-matter of arithmetic as a domain of objects that bears no direct connection to whatever it is that is counted, be it objects that fall under concepts or something else. It still remains possible that the second stage of the account will identify non-mathematical elements whose scope is wide enough to make sense of the scope of applicability of the numbers in counting.

There is a final route to justifying Frege’s one-stage approach which turns on questions of meaning and language learning. As Dummett puts it, “The historical genesis of the theory will furnish an indispensable clue to formulating that general principle governing all possible applications.… Only by following this methodological precept can applications of the theory be prevented from assuming the guise of the miraculous; only so can philosophers of mathematics, and indeed students of the subject, apprehend the real content of the theory” (Dummett 1991, pp. 300-301). That is, when we learn about the natural numbers, we thereby learn to count. If this is right, and the learning is tied directly to the “real content,” then a two-stage account is called into question. This idea has been carefully elaborated by Crispin Wright, a prominent neo-Fregean. He speaks of Frege’s Constraint: “A satisfactory foundation for a mathematical theory must somehow build its applications, actual and potential, into its core – into the content it ascribes to the statements of the theory – rather than merely ‘patch them on from the outside’ ” (Wright 2000, p. 324). One motivation for following Frege’s Constraint turns on learning: “Someone can – and our children typically do – first learn the concepts of elementary arithmetic by a grounding in their simple empirical applications and then, on the basis of the understanding thereby acquired, advance to an a priori recognition of simple arithmetical truths” (Wright 2000, p. 327). The recognition is a priori because it is not mediated by any additional knowledge which might be justified empirically. Wright concedes that this link between learning and applicability does not extend to all mathematical domains and so concludes that Frege’s Constraint need only be met in some cases (Wright 2000, p. 329).

As with the point about scope, though, the advocate of the two-stage position may insist that learning the concept of natural number need not involve any tie to counting. This could be consistent with the sort of a priori knowledge that Wright has in mind if the second stage of the two-stage account makes applications turn on a priori considerations. This is in fact the route pursued by some strands of the philosophy of mathematics known as structuralism. In his influential paper “What Numbers Could Not Be,” Paul Benacerraf describes two students, Ernie and Johnny, who learn about the natural numbers in different ways (Benacerraf 1965). Ernie comes to identify the natural numbers 1, 2, 3, … with the sets {Ø}, {Ø, {Ø}}, {Ø, {Ø}, {Ø, {Ø}}}, … while Johnny treats the same numbers as {Ø}, {{Ø}}, {{{Ø}}}, … Both series involve the set that has no members, the empty set Ø. To identify a set with finitely many members we can list the names of the members between the symbols “{” and “}”. So, Ernie and Johnny agree that 1 is identical with the set whose only member is the empty set. But they disagree on the nature of 2. For Johnny, the only member of 2 is the set {Ø}, but for Ernie 2 has two members, namely Ø and {Ø}. Benacerraf’s main point in his article is that this disagreement does not block either student from doing mathematics. As there is no mathematical reason to prefer one policy of identification, Benacerraf concludes that the natural numbers are not identical to either series of sets. Instead, “in giving the properties (that is, necessary and sufficient) of numbers you merely characterize an abstract structure – and the distinction lies in the fact that the ‘elements’ of the structure have no properties other than those relating them to other ‘elements’ of the same structure” (Benacerraf 1965, p. 291). On this approach, the natural number 2 is nothing but an element in a larger structure and all of its genuine properties accrue to it simply in virtue of its relations to other elements in the structure.
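The two identifications can be made concrete in a short sketch (entirely our own illustration, using Python frozensets as a stand-in for pure sets; the function names ernie and johnny are not Benacerraf’s):

```python
# A sketch of Benacerraf's two identifications of the natural numbers,
# using Python frozensets as a stand-in for pure sets.

EMPTY = frozenset()

def ernie(n):
    """Ernie's series: each successor adds the previous set as a new member,
    so the set identified with n has n members."""
    s = EMPTY
    for _ in range(n):
        s = s | frozenset([s])
    return s

def johnny(n):
    """Johnny's series: each successor is the singleton of its predecessor,
    so every number after 0 has exactly one member."""
    s = EMPTY
    for _ in range(n):
        s = frozenset([s])
    return s

# Both agree on 1 ...
assert ernie(1) == johnny(1) == frozenset([EMPTY])
# ... but disagree on 2: Ernie's 2 has two members, Johnny's has one.
assert len(ernie(2)) == 2 and len(johnny(2)) == 1
assert ernie(2) != johnny(2)
```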

There are several ways to work out this structuralist program, but for our purposes the most important aspect of structuralism is that it naturally leads to a rejection of Frege’s Constraint and an adoption of a two-stage account of applications. In the first stage, the mathematical domain is identified with a particular abstract structure. Then, in the second stage, applications such as counting are explained in terms of structurally specified mappings between the objects in some non-mathematical domain and the elements of the mathematical structure. For example, counting objects can be thought of as establishing a one-one correspondence between the objects to be counted and an initial segment of the structure of natural numbers. Other applications for other domains may involve different kinds of mappings. But as long as the scope of the applicability of these mappings is wide enough and we have the right kind of epistemic access to them, the arguments for the one-stage account can be countered.
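On this two-stage picture, counting is nothing more than setting up a one-one correspondence between the things counted and an initial segment of the number structure. A minimal sketch of the idea (our own illustration, not drawn from any of the authors discussed):

```python
def count(objects):
    """Count a finite collection by pairing each object with an element of an
    initial segment {1, ..., n} of the natural-number structure."""
    correspondence = {}
    n = 0
    for obj in objects:
        n += 1
        correspondence[n] = obj   # pair the next number with the next object
    return n, correspondence

# The cardinality is fixed by the mapping alone; what the numbers "are"
# (von Neumann sets, Zermelo sets, positions in a structure) plays no role.
size, pairing = count(["Mercury", "Venus", "Earth", "Mars"])
assert size == 4
```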

This line of attack against a one-stage account of applications has been traced back to the mathematician Richard Dedekind (1831-1916). See (Tait 1997). Dedekind identified the natural numbers with a particular structure and accounted for their application in counting by invoking equipollent sets, that is, sets whose members can be paired up by a one-one correspondence. For Dedekind “To say that there are 45 million Germans is to say that there is a set of Germans which is equipollent to {1, … , 45000000} – and, again, this is quite independent of how the numbers are defined” (Tait 1997, p. 230). The properties of the natural numbers over and above their place in this abstract structure are irrelevant to the existence of this sort of mapping. Tait points out that additional requirements, such as Dummett’s, turn on special considerations which are hard to motivate: “The idea that numbers can be identified or, perhaps, further identified in terms of some particular application of them is … neither a very clear idea nor a desirable one” (Tait 1997, p. 232).

Similar conclusions have recently been reached by Charles Parsons in his Mathematical Thought and Its Objects (2008). Also noting Dedekind, Parsons insists that “a structuralist understanding of what the numbers are does not stand in the way of a reasonable account of their cardinal use” (Parsons 2008, p. 74), that is, their use in counting. This is made more precise by the introduction of a distinction between the internal and external relations of the natural numbers, as opposed to a series of other objects such as sets which might have the structure of the natural numbers. The internal relations of the numbers are exhausted by what follows from a system being simply infinite. This is defined as follows: “A simply infinite system is a system (i.e., set) N such that there is a distinguished element 0 of N, and a mapping S: N → N – {0}, which is one-one and onto, such that induction holds, that is: … (∀M){[0 ε M & (∀x) (x ε M → Sx ε M)] → N ⊂ M}” (Parsons, p. 45). Parsons then shows how any two simply infinite systems will agree on the results of counting based on the existence of a one-one correspondence (Parsons 2008, p. 75). This grounds the view that the results of counting turn on external relations of the numbers. For a structuralist approach to be vindicated, a similar result would have to hold for other kinds of mathematical domains as well. Another interesting structuralist strategy is pursued by Linnebo in his paper “The Individuation of the Natural Numbers” (Linnebo 2009). Linnebo focuses on systems of numerals and uses principles about numerals to recover claims about the natural numbers. Again, this results in a two-stage picture of applications where the natural numbers are specified independently of their role in counting.
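Parsons’ quoted definition can be restated in more familiar notation (our paraphrase): a simply infinite system is a set N with a distinguished element 0 and a one-one map S from N onto N minus {0} such that the induction principle holds:

```latex
% Induction clause of a simply infinite system
% (a paraphrase of the definition quoted from Parsons 2008, p. 45):
\[
  \forall M \, \bigl[ \, 0 \in M \;\wedge\; \forall x \, ( x \in M \rightarrow Sx \in M )
  \;\longrightarrow\; N \subseteq M \, \bigr]
\]
```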

2. Formulation and Discovery

Eugene Wigner (1902-1995) was a ground-breaking physicist who also engaged in some important philosophical reflections on the role of mathematics in physics. In his paper “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” in (Wigner 1960), he emphasizes “unreasonable effectiveness,” but it is not always clear what aspects of applicability he is concerned with. In a crucial stage of his discussion he distinguishes the role of mathematics in reasoning of the sort discussed above from the use of mathematics to formulate successful scientific theories: “The laws of nature must already be formulated in the language of mathematics to be an object for the use of applied mathematics” (Wigner 1960, p. 6). This procedure is surprisingly successful for Wigner because the resulting laws are incredibly accurate and the development of mathematics is largely independent of the demands of science. As he describes it, “Most advanced mathematical concepts … were so devised that they are apt subjects on which the mathematician can demonstrate his ingenuity and sense of formal beauty” (Wigner 1960, p. 3). When these abstract mathematical concepts are used in the formulation of a scientific law, then, there is the hope that there is some kind of match between the mathematician’s aesthetic sense and the workings of the physical world. One example where this hope was vindicated is in the discovery of what Wigner calls “elementary quantum mechanics” (Wigner 1960, p. 9). Some of the laws of this theory were formulated after some physicists “proposed to replace by matrices the position and momentum variables of the equations of classical mechanics” (Wigner 1960, p. 9). This innovation proved very successful, even for physical applications beyond those that inspired the original mathematical reformulation. Wigner mentions “the calculation of the lowest energy level of helium … [which] agree with the experimental data within the accuracy of the observations, which is one part in ten millions” and concludes that “Surely in this case we ‘got something out’ of the equations that we did not put in” (Wigner 1960, p. 9).

Although extremely suggestive, Wigner’s discussion may be focused on two possible targets. First, he may be asking for an explanation of why certain physical claims are true. It is surprising that these claims are true partly because they involve highly abstract mathematical concepts. If scientists in the nineteenth century were considering the future development of physics, they probably would not have anticipated that quantum mechanics would have arisen as it did. Still, one can respond to this version of Wigner’s worries by noting that we cannot explain everything. Some physical claims can be explained, but we need to use other physical claims to do this. There is no mystery in this, and it is hard to see what special mystery there is that relates to the mathematical character of truths that we have no explanation of. A second, more plausible, candidate for Wigner’s concerns is the role of mathematics in the discovery of successful scientific theories. This is how Mark Steiner has clarified and extended Wigner’s original discussion in Steiner’s book The Applicability of Mathematics as a Philosophical Problem (1998). Steiner’s book is valuable partly for the division of problems of applicability into several categories. In the first two chapters of his book, Steiner distinguishes semantic, metaphysical and descriptive problems and argues that they have been largely resolved by Frege (Steiner 1998, p. 47). (See Steiner 2005 for a survey article on applicability that is complementary to the present article.)

Steiner also insists that there is a further problem associated with the role of mathematics in discovery. According to Steiner, physicists in the twentieth century have deployed a certain strategy for discovering new theories. This strategy depends on mathematical analogies between past theories and new proposals. The strategy is called “Pythagorean” if it depends on mathematical features of the mathematical objects in question, while it is labeled “formalist” if things turn on the syntax of mathematical language. The success of both strategies has negative implications for what Steiner calls “naturalism,” the view that the natural world is not in any way attuned to the workings of our mind. Then, “The weak conclusion is that scientists have recently abandoned naturalist thinking in their desperate attempt to discover what looked like the undiscoverable,” while “the apparent success of Pythagorean and formalist methods is sufficiently impressive to create a significant challenge to naturalism itself” (Steiner 1998, p. 75).

As with Wigner, a significant assumption that Steiner makes is that mathematics is developed using the aesthetic judgments of mathematicians. Steiner adds the claim that these judgments are “species-specific,” and so they do not track any objective features of the natural world if naturalism is true (Steiner 1998, p. 66). An advocate of some version of Frege’s one-stage account of applications would insist that their explanation of what the mathematical objects are will make direct reference to their role in science. This is why Dummett insists that meeting Frege’s Constraint will remove the appearance of a miracle from successful applications (see section 1). A structuralist who defends a two-stage account of applications also has a line of response to an aesthetic conception of mathematics. As the subject-matter of mathematics is made up of abstract structures, one can make sense of how the highly complicated structures found in nature might be studied via the more accessible abstract structures discussed by mathematicians. This is roughly the route taken by Steven French (French 2000). Either strategy must also supplement the aesthetic criteria noted by Wigner and Steiner with some more objective account of the development of mathematics. Beyond this, critics of Steiner have argued that his examples do not support the strong premises he needs for either his weak or strong conclusion. For instance, it is hard to tell what beliefs motivated physicists to substitute matrices for variables in Wigner’s example. See (Steiner 1998, pp. 95-98) for some discussion of this case. Simons suggests that physicists may have simply been desperate to try anything. As a result, the success of their attempts does not underwrite Steiner’s conclusions (Simons 2001). More generally, there are delicate historical issues in reconstructing a given scientific discovery or a pattern of discoveries of the sort Steiner describes. Some may argue that it is premature to draw philosophical conclusions from such discoveries precisely because we understand so little about how they were made. (Bangu 2006) provides additional discussion of Steiner’s argument.

There have been several other attempts to come to grips with Wigner’s worries about the contribution of mathematics to the formulation and discovery of successful scientific theories. Some agree with Wigner that mathematics has been effective in science, but question the degree to which this effectiveness has been unreasonable. For example, Ivor Grattan-Guinness presents a classification of seven ways in which a new scientific theory might relate to an old one, including connections of reduction, importation and what he calls “convolution” (Grattan-Guinness 2008, p. 9). Using this classification scheme, he argues that the analogies responsible for many scientific breakthroughs can be made sense of: “With a wide and ever-widening repertoire of mathematical theories and an impressive tableau of ubiquitous topics and notions, theory-building can be seen as reasonable to a large extent” (Grattan-Guinness 2008, p. 15). Another approach is found in Mark Wilson’s work. Though his paper “The Unreasonable Uncooperativeness of Mathematics in the Natural Sciences” does not directly engage with Wigner’s arguments, Wilson considers the possibility that successful applications of mathematics in science are rare because they largely turn on a fortunate match between the mathematics available at a given stage of development and the features of the physical systems being studied. He takes seriously the proposal of the “mathematical opportunist” who believes that “the successes of applied mathematics require some alien element that cannot be regarded as invariably present in the physical world” (Wilson 2000, p. 299). Although Wilson eventually sides with the “honest optimist” such as Euler who developed mathematical techniques that dramatically extended the scope of applicability of available mathematics, he concedes that residual modeling challenges might give further support to the opportunist picture.  (Wilson 2006) deals with these issues at greater length. For additional discussion of problems related to formulation and discovery see (Azzouni 2000), (Colyvan 2001a) and (Urquhart 2008).

3. Unification, Explanation and Confirmation

So far we have reviewed philosophical issues connected with the contributions that mathematics makes to reasoning and discovery in science. But there are many other potential ways in which mathematics might help out scientists that philosophers have only recently begun to explore. Many of these possibilities pertain to what we might call the “abstractness” of mathematical concepts. This abstractness seems to permit mathematics to unify physical phenomena. Furthermore, it may be connected to the viability of mathematical explanations of such phenomena or even the degree of confirmation that our best physical theories have achieved.

In a preliminary sense, mathematics is abstract because it is studied using highly general and formal resources. Although we may introduce a student to a group by describing a string of symbols and its permutations, the student must eventually realize that the group itself is something more general that includes this set of permutations as an instance. The abstractness of mathematics has been used as one of the arguments for structuralism of the sort reviewed in section 1. But the defenders of a one-stage view of applications also emphasize the abstractness of mathematics when they ensure that their accounts of a mathematical domain deliver a wide scope of application.

When abstractness is thought of in this way it is obvious that mathematical descriptions of physical phenomena should contribute to unification in the sciences. Morrison, for example, has described many ways in which new mathematical approaches to a range of scientific theories have helped scientists combine these theories into a single theoretical framework (Morrison 2000). Famously, Newton was able to bring together descriptions of the orbits of the planets with the behavior of falling bodies on Earth using his three laws of motion and the universal law of gravitation. Later this theory of classical mechanics was presented in an even more abstract and general form using the mathematics of the calculus of variations. This theoretical unification can be distinguished from the methodological unification that mathematics can provide to scientists. Mark Colyvan draws attention to how the methods for solving a wide class of differential equations can be brought together by considering functions on complex numbers that extend the functions on real numbers (Colyvan 2001b, pp. 81-83). More generally, textbooks in applied mathematics provide the scientist with an extensive toolbox of sophisticated techniques for treating mathematical problems that arise in scientific modeling.

In the philosophy of science, many try to provide a theory of scientific explanation using some notion of unification, so it is not surprising that the power of mathematics to unify leads some to conclude that mathematics can also explain physical phenomena. A simple instance of this is the explanation of why it is impossible to cross each bridge in a given arrangement of bridges, such as the famous seven bridges of Königsberg, exactly once.

Given that these bridges have the abstract arrangement of a certain kind of graph and given the theorem that there is no path of the appropriate sort through this graph, we can come to appreciate why the desired kind of crossing is impossible. This explanation exploits the abstractness of the mathematics because it fails to make reference to the irrelevant material constitution of the bridges. Similarly, in an example introduced by Alan Baker (Baker 2005), an appreciation of the features of prime numbers can be used to help to explain why the life-cycles of certain periodic cicadas are a prime number of years. While Baker does not usually present his case as an instance of a unifying explanation, we can see its ability to provide a unified description of the several species in question as a central source of its explanatory power. See also (Baker 2009) and (Lyon and Colyvan 2008).
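The graph-theoretic fact doing the explanatory work here is Euler’s theorem: a connected multigraph admits a path crossing every edge exactly once only if it has zero or two vertices of odd degree. A small sketch (the land-mass labels and bridge list follow the standard reconstruction of the Königsberg case and are used only for illustration):

```python
from collections import Counter

# The four land masses of Königsberg and the seven bridges between them
# (the standard reconstruction of Euler's example).
bridges = [
    ("A", "B"), ("A", "B"),   # two bridges between A and B
    ("A", "C"), ("A", "C"),   # two bridges between A and C
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def has_euler_path(edges):
    """For a connected multigraph: a path using every edge exactly once
    exists iff there are zero or two vertices of odd degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    return odd in (0, 2)

# All four land masses have odd degree, so no such crossing exists.
assert has_euler_path(bridges) is False
```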

Nevertheless, there are cases of apparent mathematical explanation which do not seem to turn on unification. Robert Batterman has recently aimed “to account for how mathematical idealizations can have a role in physical explanations” (Batterman 2010, p. 2) and argued that two-stage structuralist approaches face significant challenges in doing this. He focuses on cases where mathematical operations transform one kind of mathematical representation into another mathematical representation which is qualitatively different. These “asymptotic” techniques do not appear to allow for abstract unification precisely because the character of the two representations is so different. For example, we can consider the relationship between the wave theory of light and the theory that light is made up of rays. The ray representation results from the wave representation if we take a certain kind of limit, for example, by taking the wavelength to zero. Batterman insists that both representations are needed to explain features of physical phenomena such as the bow structure of the rainbow: “The asymptotic investigation of the wave equation leads to an understanding of the stability of those phenomena under perturbation of the shape of the raindrops and other features” (Batterman 2010, p. 21). Taking the appropriate limit allows us to throw out the right kind of irrelevant details which might distinguish one rainbow from another. What results is a correct description of an important physical phenomenon along with some explanatory insight into its features. This explanatory power is not easily grounded in the ability to unify because so many aspects of the mathematics fail to have correlates in the differently constituted rainbows.

Although the existence of mathematical explanations of physical phenomena remains a topic of intense debate, there are yet other potential contributions from mathematics to the success of science. Pincock has argued that the abstract character of mathematics can help scientists to develop representations which can be more easily confirmed by the evidence available (Pincock 2007). For example, scientists can propose an equation for the relationship between heat and temperature over time without taking a stand on the nature of heat or the ultimate connection between heat and temperature. This permits science to proceed in its process of testing and refinement of hypotheses without getting bogged down in interpretative controversies. If mathematics makes this sort of contribution, though, it raises further questions about when scientists are warranted in assigning some physical interpretation to their successful mathematical representations. Here, then, we can see a direct connection between the role of mathematics in science and the viability of scientific realism.
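One standard example of such an equation (our illustration; Pincock’s own discussion is more general) is the one-dimensional heat equation, which relates the rate of change of temperature to its spatial variation without settling what heat ultimately is:

```latex
% One-dimensional heat equation (an illustrative example):
\[
  \frac{\partial u}{\partial t} \;=\; \alpha \, \frac{\partial^{2} u}{\partial x^{2}}
\]
% Here u(x, t) is the temperature at position x and time t, and the
% diffusivity \alpha is a parameter that can be measured and tested
% without any commitment to the underlying nature of heat.
```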

These points about unification, explanation, confirmation and their broader significance suggest the many ways in which debates about the applicability of mathematics may proceed in the coming years. More generally, the issues discussed in this article clearly turn on a detailed appreciation of actual cases where mathematics seems to be helping out the scientist. This suggests that the most fruitful way to proceed is to move between more general philosophical reflection on the applicability of mathematics and concrete investigations of scientific practice. Deploying this method may not address many of the mainstream preoccupations of philosophers of mathematics such as the platonism-nominalism debate or questions of indispensability, but it holds out the promise of delivering a more nuanced appreciation of the central place that mathematics has in contemporary science, and it offers a relatively unexplored avenue for philosophical exploration and innovation.

4. References and Further Reading

  • Azzouni, Jody (2000), “Applying Mathematics: An Attempt to Design a Philosophical Problem,” Monist 83: 209-227.
  • Baker, Alan (2005), “Are There Genuine Mathematical Explanations of Physical Phenomena?” Mind 114: 223-238.
  • Baker, Alan (2009), “Mathematical Explanation in Science,” British Journal for the Philosophy of Science 60: 611-633.
  • Bangu, Sorin (2006), “Steiner on the Applicability of Mathematics and Naturalism,” Philosophia Mathematica 14: 26-43.
  • Batterman, Robert W. (2010), “On the Explanatory Role of Mathematics in Empirical Science,” British Journal for the Philosophy of Science 61: 1-25.
  • Benacerraf, Paul (1965), “What Numbers Could Not Be,” reprinted in P. Benacerraf & H. Putnam, Philosophy of Mathematics: Selected Readings, Second Edition, Cambridge University Press, 1983, pp. 272-294.
  • Colyvan, Mark (2001a), “The Miracle of Applied Mathematics,” Synthese 127: 265-277.
  • Colyvan, Mark (2001b), The Indispensability of Mathematics, Oxford University Press.
  • Dummett, Michael (1991), Frege: Philosophy of Mathematics, Harvard University Press.
  • Frege, Gottlob (1884), The Foundations of Arithmetic, J. L. Austin (trans.), Northwestern University Press, 1980.
  • French, Steven (2000), “The Reasonable Effectiveness of Mathematics: Partial Structures and the Applicability of Group Theory to Physics,” Synthese 125: 103-120.
  • Grattan-Guinness, Ivor (2008), “Solving Wigner’s Mystery: The Reasonable (Though Perhaps Limited) Effectiveness of Mathematics in the Natural Sciences,” The Mathematical Intelligencer 30: 7-17.
  • Kant, Immanuel (1786), The Metaphysical Foundations of Natural Science, M. Friedman (ed. and trans.), Cambridge University Press, 2004.
  • Linnebo, Øystein (2009), “The Individuation of the Natural Numbers,” in O. Bueno & Ø. Linnebo (eds.), New Waves in the Philosophy of Mathematics, Palgrave, pp. 220-238.
  • Lyon, Aidan and Mark Colyvan (2008), “The Explanatory Power of Phase Spaces,” Philosophia Mathematica 16: 227-243.
  • Morrison, Margaret (2000), Unifying Scientific Theories: Physical Concepts and Mathematical Structures, Cambridge University Press.
  • Parsons, Charles (2008), Mathematical Thought and Its Objects, Cambridge University Press.
  • Pincock, Christopher (2007), “A Role for Mathematics in the Physical Sciences,” Nous 41: 253-275.
  • Steiner, Mark (1998), The Applicability of Mathematics as a Philosophical Problem, Harvard University Press.
  • Steiner, Mark (2005), “Mathematics – Application and Applicability,” in S. Shapiro (ed.), The Oxford Handbook of Philosophy of Mathematics and Logic, Oxford University Press, pp. 625-650.
  • Tait, William (1997), “Frege versus Cantor and Dedekind: On the Concept of Number,” reprinted in W. Tait, The Provenance of Pure Reason: Essays in the Philosophy of Mathematics and Its History, Oxford University Press, 2005, pp. 212-251.
  • Urquhart, Alasdair (2008), “Mathematics and Physics: Strategies of Assimilation,” in P. Mancosu (ed.), The Philosophy of Mathematical Practice, Oxford University Press, pp. 417-440.
  • Wigner, Eugene P. (1960), “The Unreasonable Effectiveness of Mathematics in the Natural Sciences,” Communications on Pure and Applied Mathematics 13: 1-14.
  • Wilholt, Torsten (2006), “Lost on the Way From Frege to Carnap: How the Philosophy of Science Forgot the Applicability Problem,” Grazer Philosophische Studien 73: 69-82.
  • Wilson, Mark (2000), “The Unreasonable Uncooperativeness of Mathematics in the Natural Sciences,” Monist 83: 296-315.
  • Wilson, Mark (2006), Wandering Significance: An Essay on Conceptual Behavior, Oxford University Press.
  • Wright, Crispin (2000), “Neo-Fregean Foundations for Real Analysis: Some Reflections on Frege’s Constraint,” Notre Dame Journal of Formal Logic 41: 317-334, reprinted in R. Cook (ed.), The Arche Papers on the Mathematics of Abstraction, Springer, 2007.

Author information

Christopher Pincock
Email: pincock1@osu.edu
Ohio State University
U. S. A.

Bernard Bolzano: Philosophy of Mathematical Knowledge

In Bernard Bolzano’s theory of mathematical knowledge, properties such as analyticity and logical consequence are defined on the basis of a substitutional procedure that comes with a conception of logical form that prefigured contemporary treatments such as those of Quine and Tarski. Three results are particularly interesting: the elaboration of a calculus of probability, the definition of (narrow and broad) analyticity, and the definition of what it is for a set of propositions to stand in a relation of deducibility (Ableitbarkeit) with another. The main problem with assessing Bolzano’s notions of analyticity and deducibility is that, although they offer a genuinely original treatment of certain kinds of semantic regularities, contrary to what one might expect they do not deliver an account of either epistemic or modal necessity. This failure suggests that Bolzano does not have a workable account of either deductive knowledge or demonstration. Yet, Bolzano’s views on deductive knowledge rest on a theory of grounding (Abfolge) and justification whose role in his theory is to provide the basis for a theory of mathematical demonstration and explanation whose historical interest is undeniable.

Table of Contents

  1. His Life and Publications
  2. The Need for a New Logic
  3. Analyticity and Deducibility
  4. Grounding
  5. Objective Proofs
  6. Conclusion
  7. References and Further Reading

1. His Life and Publications

Bernard Placidus Johann Nepomuk Bolzano was born on 5 October 1781 in Prague. He was the son of an Italian art merchant and of a German-speaking Czech mother. His early schooling was unexceptional: private tutors and education at the lyceum. In the second half of the 1790s, he studied philosophy and mathematics at the Charles-Ferdinand University. He began his theology studies in the Fall of 1800 and simultaneously wrote his first mathematical treatise. When he completed his studies in 1804, two university positions were open in Prague, one in mathematics, the other one in the “Sciences of the Catholic Religion.” He obtained both, but chose the second: Bolzano adhered to the Utilitarian principle and believed that one must always act, after considering all possibilities, in accordance with the greater good. He was hastily ordained, obtained his doctoral degree in philosophy and began work in his new university position in 1805. His professional career would be punctuated by sickness—he suffered from respiratory illness—and controversy. Bolzano’s liberal views on public matters and politics would serve him ill in a context dominated by conservatism in Austria. In 1819, he was absurdly accused of “heresy” and subjected to an investigation that would last five years after which he was forced to retire and banned from publication. From then on, he devoted himself entirely to his work.

Bolzano’s Considerations on Some Objects of Elementary Geometry (1804) received virtually no attention at the time it was published, and the few commentators who have appraised this early work concur in saying that its interest is merely historical. (Russ 2004, Sebestik 1992; see also Waldegg 2001). Bolzano’s investigations in geometry did not anticipate modern axiomatic approaches to the discipline–he was attempting to prove Euclid’s parallel postulate–and did not belong to the trend that would culminate with the birth of non-Euclidean geometries, the existence of which Bolzano’s contemporary Johann Carl Friedrich Gauss (1777-1855) claimed to have discovered and whose first samples were found in the works of Nikolai Lobatchevski (1792-1856) and Janos Bolyai (1802-1860), whom Bolzano did not read. (See Sebestik 1992, 33-72 for a discussion of Bolzano’s contribution to geometry; see also Russ 2004, 13-23). As Sebestik explains (1992, 35 note), Bolzano never called into question the results at which he had arrived in (1804).

By contrast, Bolzano is renowned for his anticipation of significant results in analysis. Three booklets that appeared in 1816-17 have drawn the attention of historians of mathematics, one of which, the Pure Analytic Proof, was re-edited in 1894 and 1905. (Rusnock 2000, 56-86; 158-198) At the time of their publication, however, they attracted hardly any notice. Only one review is known (see Schubring 1993, 43-53). According to (Grattan-Guinness 1970), Cauchy is alleged to have plagiarized (Bolzano 1817a) in his Cours d’Analyse, but this hypothesis is disputed in (Freudenthal 1971) and (Sebestik 1992, 107ff). This lack of reception might explain why Bolzano chose to resume the philosophical and methodological investigations he had initiated in the Contributions to a Better Founded Exposition of Mathematics (1810) a decade earlier. At the end of the 1830s, after he had worked out the logical basis for his system in the Theory of Science (1837), Bolzano returned once more to mathematics and spent the last years of his life working on the Theory of Quantities. The latter remained unpublished until after his death, and only excerpts appeared in print in the 19th century, most notably the Paradoxes of the Infinite (1851). The Theory of Functions (1930) and the Pure Theory of Numbers (1931) were edited by the Czech mathematician Karel Rychlik and published in 1930 and 1931 respectively by a commission from the Royal Bohemian Academy of Science. All these works have now been translated into English (see Russ 2004).

2. The Need for a New Logic

Bolzano understood the main obstacle to the development of mathematics in his time to be the lack of proper logical resources. He believed syllogistic (that is, traditional Aristotelian logic) was utterly unfit for the purpose. He saw the task of the speculative part of mathematics, which belongs at once to philosophy, as consisting in providing a new logic on the basis of which a reform of all the sciences should take place. As Bolzano conceived of it, philosophy of mathematics is one aspect of a more general concern for logic, methodology, the theory of knowledge, and, in general, the epistemological foundation of the deductive sciences (“purely conceptual disciplines,” as Bolzano calls them), a concern that unfolds throughout his mathematical work and forms the foremost topic of his philosophy. The latter falls into two phases: the period of the Contributions, which extends throughout the 1810s, and the period of the Theory of Science, which was written in the course of the 1820s and published anonymously in 1837. In the Contributions, Bolzano’s undertaking remained largely programmatic and by no means definitive. By the time he was writing the Theory of Science he had revised most of his views, such as those on the multiple copula, analyticity and necessity. (See Rusnock 2000, 31-55, for discussion.) Nonetheless, the leitmotiv of Bolzano’s mature epistemology already comes through in 1810, namely his fundamental disagreement with the “Kantian Theory of Construction of Concepts through Intuitions,” to which he devoted the Appendix of the Contributions. (See Rusnock 2000, 198-204 for an English translation; see also Russ 2004, 132-137.) In this, Bolzano can be seen to have anticipated an important aspect of later criticisms of Kant, Russell’s for instance (1903, §§ 4, 5, 423, 433-4). As Bolzano saw it, an adequate account of demonstration excludes appeal to non-conceptual inferential steps, intuitions or any other proxy for logic.

In the Theory of Science, Bolzano’s epistemology of deductive disciplines is based on two innovations. On the one hand, properties such as analyticity or deducibility (Ableitbarkeit) are defined not for thoughts or sentences but for what Bolzano conceives to be the objective content of the former and the meaning of the latter, which he calls “propositions in themselves” (Sätze an sich) or “propositions.” On the other hand, properties such as analyticity and deducibility are “formal” in that they are features of sets of propositions defined by a fixed vocabulary; they come to the fore through the application of a substitution method that consists in arbitrarily “varying” determinate components in a proposition so as to derive different types of semantic regularities.

3. Analyticity and Deducibility

Bolzano’s theory of analyticity is a favored topic in the literature. (Cf. Bar-Hillel 1950; Etchemendy 1988; Künne 2006; Lapointe 2000, 2008; Morscher 2003; Neeman 1970; Proust 1981, 1989; Textor 2000, 2001) This should be no surprise. For one thing, by contrast to the Kantian definition, Bolzano’s allows us to determine not only whether a grammatical construction of the form subject-predicate is analytic, as Kant has it, but whether any construction is analytic or not. This includes hypotheticals, disjunctions, conjunctions, and so forth, but also any proposition that presents a syntactic complexity that is foreign to traditional (that is, Aristotelian) logic. Analyticity is not tied to any “syntactic” conception of “logical form.” It is a relation pertaining to the truth of propositions and not merely to their form or structure. Let ‘Aij…(S)’ stand for “The proposition S is analytic with respect to the variable components i, j…”

Aij…(S) iff:

(i)   i, j, … can be varied so as to yield at least one objectual substitution instance of S

(ii) All objectual substitution instances of S have the same truth-value as S

where a substitution instance is “objectual” if the concept that is designated by the subject has at least one object. On this account, propositions can be analytically true or analytically false.
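Bolzano’s substitutional test can be mimicked in a toy model (entirely our own illustration; Bolzano works with ideas and propositions in themselves rather than with predicates over a finite domain, and the test is restricted, as above, to objectual variants):

```python
# A toy model of Bolzano's substitutional test for analyticity.
domain    = ["Caius", "Sempronia", "Titus"]
is_man    = {"Caius": True,  "Sempronia": False, "Titus": True}
is_mortal = {"Caius": True,  "Sempronia": True,  "Titus": True}

def analytic(form, objectual, candidates):
    """Bolzano-style analyticity with respect to the varied component:
    (i) at least one substitution yields an objectual instance, and
    (ii) all objectual instances have the same truth value."""
    instances = [form(x) for x in candidates if objectual(x)]
    return len(instances) >= 1 and len(set(instances)) == 1

# "X, who is a man, is mortal": the subject idea "X, who is a man" is
# objectual only when X falls under "man"; the instance asserts mortality.
assert analytic(lambda x: is_mortal[x], lambda x: is_man[x], domain)

# "X, who is a man, is Caius" is not analytic: its objectual instances
# differ in truth value.
assert not analytic(lambda x: x == "Caius", lambda x: is_man[x], domain)
```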

Although the idea that analyticity should be defined on the basis of a purely semantic criterion is in itself a great anticipation, Bolzano’s conception of analyticity fails in other respects. For one, it does not provide an account of what it means for a proposition to be true by virtue of meaning alone and to be knowable as such. “… is analytic with respect to …” is not a semantic predicate of the type one would expect, but a variable-binding operator. A statement ascribing analyticity to a given propositional form, say “X, who is a man, is mortal,” is true, if it is true, because every objectual substitution instance of “X, who is a man, is mortal” is true. Bolzano’s definition of analyticity thus offers a fairly clear description of substitutional quantification: to say that a propositional form is analytically true is to say that all its objectual substitution instances are true. Yet because he deals not primarily with sentences and words but with their meanings, that is, with ideas and propositions in themselves, and because there is at least one idea for every object, there is in principle a “name” for every object. For this reason, although Bolzano’s approach to quantification is substitutional, he is not liable to the reproach that his interpretation of the universal quantifier cannot account for every state of the world. The resources he has at his disposal are in principle as rich as necessary to provide a complete description of the domain the theory is about.

Bolzano’s epistemology rests on a theory of logical consequence that is twofold: an account of truth preservation that is epitomized in his notion of “deducibility” (Ableitbarkeit) on the one hand (see Siebel 1996, 2002, 2003; van Benthem 1985, 2003; Etchemendy 1990), and an account of “objective grounding” (Abfolge) on the other (see Tatzel 2002, 2003; see also Thompson 1981; Corcoran 1975). The notion of deducibility presents a semantic account of truth-preservation that is neither trivial nor careless. The same holds for his views on probability. Likewise, his attempt at a definition of grounding constitutes the basis of an account of a priori knowledge and mathematical explanation whose interest has been noticed by some authors, and in some cases even vindicated (Mancosu 1999).

As Bolzano presents it, although analyticity is defined for individual propositional forms, deducibility is a relation defined between sets of those forms. Let “Dij…(T, T’, T’’, … ; S, S’, S’’, …)” stand for “The set of propositions T, T’, T’’, … is deducible from the set of propositions S, S’, S’’, … with respect to i, j, ….” Bolzano defines deducibility in the following terms:

Dij…(T, T’, T’’, … ; S, S’, S’’, …) iff

(i)         i, j, … can be varied so as to yield at least one true substitution instance of S, S’, S’’, … and T, T’, T’’, …

(ii) whenever a variation of i, j, … makes S, S’, S’’, … all true, it also makes T, T’, T’’, … all true.

Bolzano’s discussion of deducibility is exhaustive. It extends over thirty-six paragraphs, and he draws a series of theorems from his definition. The most significant theorems are the following:

  • ¬[Dij…(T, T’, T’’, …; S, S’, S’’, …) → Dij…(S, S’, S’’, …; T, T’, T’’, …)] (asymmetry)
  • [Dij…(T, T’, T’’, …; S, S’, S’’, …) & Dij…(R, R’, R’’, …; T, T’, T’’, …)] → Dij…(R, R’, R’’, …; S, S’, S’’, …) (transitivity)

In addition, assuming that there is at least one variation of i, j, … that makes S, S’, S’’, … all true at the same time, then:

  • Dij…(S, S’, S’’, …; S, S’, S’’, …) (reflexivity)

As regards reflexivity, the assumption that the S, S’, S’’, … must admit at least one such variation is required because, whenever S, S’, S’’, … contain a falsehood S that does not share at least one variable idea i, j, … with the conclusion T, T’, T’’, …, there is no substitution that can make both the premises and the conclusion true at the same time, and the compatibility constraint is not fulfilled.
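In the same toy spirit as the analyticity sketch above (again entirely our own illustration, with hypothetical predicates), deducibility together with its compatibility condition can be mimicked as follows:

```python
# A toy model of Bolzano's deducibility (Ableitbarkeit) with respect to a
# varied component, including the compatibility condition.
domain      = ["Caius", "Sempronia", "Titus"]
is_man      = {"Caius": True,  "Sempronia": False, "Titus": True}
is_rational = {"Caius": True,  "Sempronia": True,  "Titus": True}

def deducible(premises, conclusions, candidates):
    """Conclusions are deducible from premises with respect to the varied
    component iff (i) some substitution makes premises and conclusions all
    true (compatibility), and (ii) every substitution making all premises
    true also makes all conclusions true."""
    def all_true(forms, x):
        return all(f(x) for f in forms)
    compatible = any(all_true(premises + conclusions, x) for x in candidates)
    preserving = all(all_true(conclusions, x)
                     for x in candidates if all_true(premises, x))
    return compatible and preserving

# "Caius is rational" is deducible from "Caius is a man" with respect to
# 'Caius' (a materially valid inference in Bolzano's sense).
assert deducible([lambda x: is_man[x]], [lambda x: is_rational[x]], domain)

# Nothing is deducible from a contradiction: compatibility fails.
contradiction = [lambda x: is_man[x], lambda x: not is_man[x]]
assert not deducible(contradiction, [lambda x: is_rational[x]], domain)
```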

On Bolzano’s account, fully-fledged cases of deducibility include both formally valid arguments and materially valid ones, for instance:

Caius is rational

is deducible with respect to ‘Caius’, ‘man’ and ‘rational’ from

Caius is a man

Men are rational

and

Caius is rational

is deducible with respect to ‘Caius’ from

Caius is a man.

There is a sharp distinction to be drawn between arguments of the former kind and arguments of the latter. Assuming a satisfactory account of logical form, in order to know that the conclusion follows from the premises in arguments of the former kind one only needs to consider their structure or form; no other kind of knowledge is required. In the latter argument, however, in order to infer from the premise to the conclusion, one must know more than its form. One also needs to understand the signification of ‘man’ and ‘rational’ since, in order to know that Caius is rational, one also needs to know, in addition to the fact that Caius is a man, that all men are rational. There is good evidence that Bolzano was aware of some such distinction between arguments that preserve truth and arguments that do so in virtue of their “form.” Unfortunately, Bolzano’s definition of deducibility does not systematically uphold the distinction. Since deducibility applies across the board to all inferences that preserve truth from premises to conclusion with respect to a given set of ideas, it does not of itself guarantee that an argument is formally valid, and the notion of deducibility turns out to be flawed: it makes it impossible to extend our knowledge in the way we would expect. If we know, for instance, that all instances of modus ponens are logically valid, we can infer from two propositions whose truth we have recognized:

If Caius is a man, then he is mortal

Caius is a man

a new proposition:

Caius is mortal

whose truth we might not have previously known. Bolzano’s account of deducibility does not allow one to extend one’s knowledge in this way since in order to know for every substitution instance that truth is preserved from the premises to the conclusion one has to know that the premises are true and that the conclusion is true.

On Bolzano’s account, in order for a conclusion to be deducible from a given set of premises, there must be at least one substitution that makes both the premises and the conclusion true at once. He calls this the “compatibility” (Verträglichkeit) condition, a requirement that is not reflected in classical conceptions of consequence. As a result, Bolzano’s program converges with many contemporary attempts at a definition of non-classical notions of logical consequence. Given the compatibility condition, although a logical truth may follow from any (set of) true premises (with respect to certain components), nothing, as opposed to everything, is deducible from a contradiction. The compatibility condition invalidates the ex contradictione quodlibet or explosion principle. The reason for this is that no substitution for ‘p’ in “‘q’ is deducible from ‘p and non-p’” can fulfil the compatibility constraint; no interpretation of ‘p’ in ‘p and non-p’ can yield a true variant, and hence there are no ideas that can be varied so as to make both the premises and the conclusion true at once. This has at least two remarkable upshots. First, the compatibility constraint invalidates the law of contraposition. Whenever one of S, S’, S’’, … is analytically true, that is, when all its substitution instances are true, we cannot infer from:

Dij…(T, T’, T’’, … ; S, S’, S’’, …)

to

Dij…(¬S, ¬S’, ¬S’’, …; ¬T, ¬T’, ¬T’’…)

since ¬S, ¬S’, ¬S’’, … then contain an analytically false proposition, so that no substitution can make the premises and the conclusion true at once and the compatibility condition fails. For instance,

Caius is a physician who specializes in the eyes

is deducible from

Every ophthalmologist is an ophthalmologist

Caius is an ophthalmologist

with respect to ‘ophthalmologist’. However,

It is not the case that every ophthalmologist is an ophthalmologist

It is not the case that Caius is an ophthalmologist

are not deducible with respect to the same component from:

It is not the case that Caius is a physician who specializes in the eyes.

Second, the compatibility condition makes Bolzano’s logic nonmonotonic. Whenever an added premise contains contradictory information, the conclusion no longer follows. While compatibility does not allow him to deal with all cases of defeasible inference, it allows Bolzano to account for cases that involve typicality considerations. It is typical of crows that they be black. Hence from the fact that x is a crow we can infer that x is black. On Bolzano’s account, if we add a premise that describes a new case contradicting previous observation, say that this crow is not black, the conclusion no longer follows, since the inference does not fulfil the compatibility condition: no substitution can make both the premises and the conclusion true at the same time.

At many places Bolzano suggests that deducibility is a type of probabilistic inference, namely the limit case in which the probability of a proposition T relative to a set of premises S, S’, S’’, … is equal to 1. Bolzano also calls inferences of this type “perfect inferences.” More generally, the value of a probability inference from S, S’, S’’, … to T with respect to a set of variable ideas i, j, … is determined by comparing the number of cases in which the substitution of i, j, … yields true instances of both S, S’, S’’, … and T to the number of cases in which S, S’, S’’, … are true (with respect to i, j, …). Let us assume that Caius is to draw a ball from a container in which there are 90 black balls and 10 white ones and that the task is to determine the degree of probability of the conclusion “Caius draws a black ball.” On Bolzano’s account, in order to determine the probability of the conclusion one must first establish the number n of admissible substitution instances K1, K2, …, Kn of the premise “Caius draws a ball” with respect to ‘ball.’ The number n of acceptable substitution instances of the premise is in general a function of the following considerations: (i) the probability of each of K1, K2, …, Kn is the same; (ii) only one of K1, K2, …, Kn can be true at once; (iii) taken together, they exhaust all objectual substitution instances of the premise. In this case, since there are 100 balls in the container, there are only 100 admissible substitution instances of the premise, namely K1: “Caius draws ball number 1,” K2: “Caius draws ball number 2,” …, K100: “Caius draws ball number 100.” If the number of the K1, K2, …, Kn is k and the number of those instances that also make “Caius draws a black ball” true is m, then the probability of “Caius draws a black ball” relative to the premise is the fraction m/k = 90/100 = 9/10. In the case of deducibility the number of cases in which the substitution yields true variants of both the premises and the conclusion is identical to the number of true admissible variants of the premises, that is, m/k = 1. If there is no substitution that makes both the premises and the conclusion true at the same time, then the degree of probability of the conclusion is 0, that is, the conclusion is not deducible from the premises.
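Bolzano’s counting procedure can be mirrored directly (a sketch under assumptions (i)-(iii) above; the ball labels are ours):

```python
from fractions import Fraction

# The admissible substitution instances of the premise "Caius draws a ball":
# one instance per ball, 90 black and 10 white.
balls = ["black"] * 90 + ["white"] * 10

# k: number of admissible instances of the premise;
# m: number of those instances that also make the conclusion
#    "Caius draws a black ball" true.
k = len(balls)
m = sum(1 for colour in balls if colour == "black")

probability = Fraction(m, k)
assert probability == Fraction(9, 10)

# Deducibility is the limit case in which every admissible instance of the
# premises also makes the conclusion true, so the fraction equals 1.
```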

4. Grounding

Bolzano did not think that his account of truth preservation exhausted the topic of inference, since it does not account for what is specific to the knowledge we acquire in mathematics. Such knowledge he considered to be necessary and a priori, two qualities that relations defined on the basis of the substitutional method do not have. Bolzano called “grounding” (Abfolge) the relation that defines structures in which propositions relate as grounds to their consequences. As Bolzano conceived of it, my knowing that ‘p’ grounds ‘q’ has explanatory virtue: grounding aims at epitomizing certain intuitions about scientific explanation and seeks to explain, roughly, what, according to Bolzano, the truly scientific mind ought to mean when, in the conduct of a scientific inquiry, she uses the phrase “…because…” in response to the question “why …?” Since, in addition, the propositions that belong to grounding orders such as arithmetic and geometry are invariably true and purely conceptual, grasping the relations among them invariably warrants knowledge that does not rest on extra-conceptual resources, a move that allowed Bolzano to debunk the Kantian theory of pure intuition.

Bolzano’s notion of grounding is defined by a set of distinctive features. For one thing, grounding is a unique relation: for every true proposition that is not primitive, there is a unique tree-structure that relates it to the axioms from which it can be deduced. That there is such a unique objective order is an assumption on Bolzano’s part that is in many ways antiquated, but it cannot be ignored. Uniqueness follows from two distinctions Bolzano makes. On the one hand, Bolzano distinguishes between simple and complex propositions: a ground (consequence) may or may not be complex. A complex ground is composed of a number of different truths that are in turn composed of a number of different primitive concepts. On the other hand, Bolzano distinguishes between the complete ground or consequence of a proposition and the partial ground or consequence thereof. On this basis, he claims that the complete ground of a proposition is never more complex than is its complete consequence. That is, propositions involved in the complete ground of a proposition are not composed of more distinct primitive concepts than is the complete consequence. Given that Bolzano thinks that the grounding order is ultimately determined by a finite number of simple concepts, this restriction implies that the regression in the grounding order from a proposition to its ground is finite. Ultimately, the regression leads to true primitive propositions, that is, axioms whose defining characteristic is their absolute simplicity.

Note that the regression to primitive propositions is not affected by the fact that the same proposition may appear at different levels of the hierarchy. Although the grounding order is structured vertically and cannot have infinitely many distinct immediate antecedents, the horizontal structure must for its part allow for recursion if basic inductive mathematical demonstrations are to be conducted. Provided that the recurring propositions do not appear on the same branch of the tree, Bolzano is in a position to avoid loops that would make it impossible to guarantee that we ever arrive at the primitive propositions, or that there be primitive propositions in the first place.

Bolzano draws a distinction between cases in which what we have is the immediate ground for the truth of a proposition and cases in which the ground is mediated (implicitly or explicitly) by other truths. When Bolzano speaks of grounding, what he has in mind is invariably immediate grounding, and he understands the notion of mediate grounding as a derivative notion. It is the transitive closure of the more primitive notion of immediate grounding. p is the mediate consequence of the propositions Ψ1, …, Ψn if and only if there is a chain of immediate consequences starting with Ψ1, …, Ψn and ending with p. p is the immediate consequence of Ψ1, …, Ψn if there are no intermediate logical steps between Ψ1, …, Ψn and p.
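The idea of mediate grounding as the transitive closure of immediate grounding can be pictured with a small sketch (our own illustration; the propositions and the immediate-grounding relation are hypothetical):

```python
# Mediate grounding as the transitive closure of immediate grounding
# (a hypothetical tree of propositions, labelled p1, p2, ...).
immediate = {
    "p1": ["p2", "p3"],   # p1 immediately grounds p2 and p3
    "p2": ["p4"],
    "p3": [],
    "p4": [],
}

def consequences_of(p, relation):
    """All propositions reachable from p by a chain of one or more immediate
    grounding steps (the transitive closure of the immediate relation)."""
    reached, frontier = set(), [p]
    while frontier:
        current = frontier.pop()
        for q in relation.get(current, []):
            if q not in reached:
                reached.add(q)
                frontier.append(q)
    return reached

# p4 is a mediate consequence of p1, via the chain p1 -> p2 -> p4.
assert consequences_of("p1", immediate) == {"p2", "p3", "p4"}
```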

Grounding is not reflexive. p cannot be its own ground, whether mediate or immediate. The non-reflexive character of grounding can be inferred from its asymmetry, another of Bolzano’s assumptions. If grounding were reflexive, then the truth that p could be grounded on itself; but given that if p grounds q it is not the case that q grounds p, this would imply a contradiction since, by substitution, p could at once ground itself and not ground itself. Irreflexivity allows Bolzano to deny the traditional tenet according to which some propositions such as axioms are grounded in themselves. Bolzano explains that this is a loose way of talking, that those who maintain this idea are unaware of the putative absurdity of saying that a proposition is its own consequence, and that the main motivation behind this claim is the attempt to maintain, unnecessarily, the idea that every proposition has a ground across the board. According to Bolzano, however, the ground for the truth of a primitive proposition does not lie in itself but in the concepts of which this proposition consists.

One important distinction between deducibility and grounding, as Bolzano conceives of them, rests in the fact that while grounding is meant to support the idea that a priori knowledge is axiomatic, that is, that there are (true) primitive, atomic propositions from which all other propositions in the system follow as consequences, deducibility has no such implication. Whether a proposition q is deducible from another proposition p is not contingent on q’s being ultimately derivable from the propositions from which p is derivable. That “Caius is mortal” is deducible from “Caius is a man” can be established independently of the truth that Caius is a finite being. Likewise, the suggestion that deducibility might be a special case of grounding is unacceptable for Bolzano. Not all cases of deducibility are cases of grounding. For instance,

It is warmer in the summer than in the winter

is deducible from

Thermometers, if they function properly, are higher in the summer than in the winter

but it is not an objective consequence of the latter in Bolzano’s sense. On the contrary, the reason why thermometers are higher in the summer is that it is warmer, so that, in the previous example, the order of grounding is reversed. There are cases in which true propositions that stand in a relation of deducibility also stand in a relation of grounding, what Bolzano calls “formal grounding.” It is not difficult to see what the interest of the latter could be. Strictly speaking, in an inference that fits both the notion of grounding and that of deducibility, the conclusion follows from the premises both necessarily (by virtue of its being a relation of grounding) and as a matter of truth preservation (by virtue of its being an instance of deducibility). Formal grounding, however, presents little interest: it is not an additional resource of Bolzano’s logic but a designation for inferences that happen to satisfy two definitions at once. I can only know that an inference fits the definition of formal grounding if I know that it fits both that of grounding and that of deducibility; once I know that it fits both, to say that it is a case of formal grounding does not teach me much I did not already know.

It might be tempting to think that grounding is a kind of deducibility, namely the case in which the premises are systematically simpler than the conclusion. Bolzano suggests something similar when he claims that grounding might not, in the last instance, be more than an ordering of truths by virtue of which we can deduce, from the smallest number of simple premises, the largest possible number of the remaining truths as conclusions. This would, however, require us to ignore important differences between deducibility and grounding. When I say that “The thermometer is higher in the summer” is deducible from “It is warmer in the summer,” I am making a claim about the fact that every time “It is warmer in X” yields a true substitution instance, “The thermometer is higher in X” yields one as well. When I say that “The thermometer is higher in the summer” is grounded in “It is warmer in the summer,” I am making a claim about determinate conceptual relations within a given theory. I am saying that, given what it means to be warmer and what it means to be a thermometer, it cannot be the case that it is warm and the thermometer is not high. Of course the theory can be wrong, but assuming that it is true, the relation is necessary since it follows from the (true) axioms of the theory. In this respect, a priori knowledge can only be achieved in deductive disciplines when we grasp the necessary relations that subsist among the (true and purely conceptual) propositions they involve. If I know that a theorem follows from an axiom or a set of them, I know so with necessity.
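
The substitutional reading of deducibility can be given a rough formal gloss (this is a common modern reconstruction, not Bolzano’s own text). Writing p[i/x] for the result of substituting the idea x for the variable idea i in p, q is deducible from p with respect to i just in case some substitution makes p true and every substitution that makes p true also makes q true:

\[
\mathrm{Ded}_i(p, q) \iff \exists x\, \mathrm{True}\big(p[i/x]\big) \;\wedge\; \forall x\,\Big(\mathrm{True}\big(p[i/x]\big) \rightarrow \mathrm{True}\big(q[i/x]\big)\Big)
\]

A grounding claim, by contrast, is not captured by such a quantification over substitution instances; it concerns the conceptual order among the truths of the theory itself.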

5. Objective Proofs

Bolzano’s peculiar understanding of grounding is open to a series of problems, both exegetical and theoretical. Nonetheless, the account of mathematical demonstration that it underlies, an account of what he terms “Begründungen” (objective proofs), is of vast historical interest. Three notions form the basis of Bolzano’s account of mathematical and deductive knowledge in general: grounding (Abfolge), objective justification (objective Erkenntnisgrund) and objective proof (Begründung). The structure of the theory is the following: (i) grounding is a relation that subsists between true propositions independently of our epistemic access to them, though we may grasp objective grounding relations; (ii) the possibility of grasping the latter is also the condition for our having objective justifications for our beliefs, as opposed to merely “subjective” ones; and (iii) objective proofs are meant to cause the agent to have objective justifications in this sense. With respect to (ii), Bolzano’s idea is explicitly Aristotelian: Bolzano believes that whenever an agent grasps p and grasps the grounding relation between p and q, she also knows the ground for the existence of q and therefore, putatively, why q is true, namely because p. If we follow (iii), the role of a (typically) linguistic or schematic representation of (i) is to cause the agent to have (ii). According to Bolzano, objective proofs succeed in providing agents with an objective justification for their relevant beliefs because they make the objective ground of the propositions that form the content of these beliefs epistemically accessible to the agent. As Bolzano sees it, the typical objective proof is devised so as to reliably cause the reader or hearer to have an objective justification for the truth of the proposition. The objective proof is merely ‘reliable’ since whether I do acquire objective knowledge upon surveying the proof in question depends in part on my background knowledge and in part on my overall ability to process the relevant inferences; the latter, according to Bolzano’s theory of cognition, is mostly a function of my having previously been acquainted with many inferences of different types. The more accustomed I am to drawing inferences, the more reliably the objective proof is likely to cause in me the relevant objective justification.

According to Bolzano, there are good reasons why we should place strong constraints on mathematical demonstration and, in everyday practice, favor the objective proofs that provide us with objective mathematical knowledge. It would be wrong, however, to assume that on his account mathematical knowledge can only be achieved via objective proofs. Objective proofs are not the only type of demonstration in Bolzano’s theory of knowledge, nor indeed the only bona fide one. Bolzano opposes objective proofs, that is, proofs that provide an objective justification, to what he calls Gewissmachungen (certifications). Certifications, according to Bolzano, are also types of demonstrations (there are many different species thereof) in the sense that they too are meant to cause agents to know a certain truth p on the basis of another one q. When an agent is caused to know that something is true on the basis of a certification, the agent has a subjective, as opposed to an objective, justification for his or her belief. Bolzano’s theory of certification and subjective justification is an indispensable element of his account of empirical knowledge. Certifications are ubiquitous in empirical sciences such as medicine. Medical diagnosis relies on certifications in Bolzano’s sense: symptoms are typically the visible effects, direct or indirect, of diseases, and it is by way of these effects that we recognize the diseases. When we rely on symptoms to identify a disease, we thus never know this disease through its objective ground. Likewise, subjective proofs also play an important role in Bolzano’s account of mathematical knowledge. As Bolzano sees it, in order to have an occurrent (and not a merely dispositional) cognitive attitude towards a given propositional content, an agent must somehow be causally affected. This may be brought about in many ways. Beliefs and ideas arise in our mind most of the time in a more or less sophisticated, chaotic and spontaneous way, on the basis of mental associations and/or causal interactions with the world. The availability of a linguistic object that represents the grounding relation is meant to reliably cause objective knowledge, that is, to bring one’s interlocutor to have occurrent objective knowledge of a certain truth. This may, however, not be the best way to cause the given belief per se. It might be that, in order to cause me to recognize the truth of the intermediate value theorem, my interlocutor needs to resort to a more or less intuitive diagrammatic explanation, which is precisely what objective proofs exclude. Since, as Bolzano conceives of it, the purpose of demonstrations is primarily to cause the interlocutor to have a higher degree of confidence (Zuversicht) in one of his beliefs, and since Bolzano emphasizes the effectiveness of proofs over their providing objective justifications, objective proofs should not be seen as the only canonical or scientifically acceptable means to bring an agent to bestow confidence on a judgment. Besides, Bolzano warns us against the idea that one ought to use only logical or formal demonstrations, which might end up boring the interlocutor to distraction and so have a rather adverse epistemic effect. Although Bolzano claims that we ought to use objective proofs as often as possible, he also recognizes that we sometimes have to take shortcuts or simply use heuristic creativity to cause our interlocutor to bestow confidence on the truths of mathematics, especially when the interlocutor has only partial and scattered knowledge of the discipline.

Objective proof, in addition to its epistemic virtue, introduces pragmatic constraints on demonstration that are meant to steer actual practices in deductive science. The idea that mathematical demonstrations ought to reflect the grounding order entails two things. First, it requires that an agent not deny that a proposition has an objective ground, and is thus inferable from more primitive propositions, whenever this agent, perhaps owing to her medical condition or limited means of recognition, fails to recognize that the proposition has such a ground. Second, it ensures that the demonstration procedure is not short-circuited by criteria such as intuition, evidence or insight. The requirement that mathematical demonstrations be objective proofs forbids that the agent’s inability to derive a proposition from more primitive ones be compensated by a non-grounding-related feature. In this connection, Mancosu speaks of the heuristic fruitfulness of Bolzano’s requirement on scientific exposition (Mancosu 1999, 436). Although Bolzano considered that objective proofs should be favored in mathematical demonstration, and despite the fact that he thought that only objective proofs have the advantage of letting us understand why a given proposition is true, he did not think that in everyday practice mathematical demonstrations must always be objective proofs. Bolzano thinks that there are situations in which it is legitimate to accept proofs that deliver only evidential knowledge. When it comes to setting out a mathematical theory, the main objective should be to cause the agent to have more confidence in the truth of the proposition to be demonstrated than he would have otherwise, or even merely to incite him to look for an objective justification by himself. Hence, given certain circumstantial epistemic constraints, Bolzano is even willing to concede that certain proofs can be reduced to a brief justification of one’s opinion. Furthermore, though this would deserve to be investigated further, it is worth mentioning that Bolzano is not averse to reverting to purely inductive means, for instance, when it comes to mathematical demonstration. This may seem odd, but Bolzano has good reasons to avoid requiring that all our mathematical proofs provide us with objective and explanatory knowledge. For one thing, asking that all mathematical proofs be objective proofs would not be a reasonable requirement and, in particular, it would not be one that is always epistemically realizable: given the nature of grounding, it would often require us to engage in the production of linguistic objects of immense proportions. Since they are merely probable, Bolzano does think that evidential proofs need to be supplemented by “decisive” ones. One might want to argue that the latter reduce to objective proofs: if, upon surveying an objective proof, I acquire an objective justification, I cannot doubt the truth of the conclusion, and it is therefore decisively true. But it is hard to imagine that Bolzano would have thought that the linguistic representation of an inference from deducibility would be any less decisive. Consider this inference:

Triangles have two dimensions

is deducible from

Figures have two dimensions

Triangles are figures.

Not only is the inference truth-preserving, but the conclusion is also a conceptual truth. It is composed only of concepts, which, according to Bolzano, means that its negation would imply a contradiction and that it is therefore necessary. In mathematics and other conceptual disciplines, deducibility and grounding both have the epistemic particularity of yielding a belief that can be asserted with confidence. By contrast, according to Bolzano, though an agent need not always be mistaken whenever she asserts a proposition that stands to its premises in a mere relation of probability, she is at least liable to committing an error. Inferences whose premises are only probable can only yield a conclusion that is itself merely probable. As Bolzano sees it, confidence is a property of judgments that are indefeasible. The conclusion (perfectly) deduced from a set of a priori propositions cannot be defeated, if only because, if I know its ground, I also know why it is true and necessarily so. Similarly, if p is true and if I know that q is deducible from p (and this holds a fortiori in the case in which p and q are conceptual truths), I have a warrant, namely the fact that I know that truth is preserved from premises to conclusion, and I cannot be mistaken about the truth of q.

6. Conclusion

The importance of Bolzano’s contribution to semantics can hardly be overestimated. The same holds for his contribution to the theoretical basis of mathematical practice. Far from ignoring epistemic and pragmatic constraints, Bolzano discusses them in detail, thus providing a comprehensive basis for a theory of mathematical knowledge that was aimed at supporting work in the discipline. As a mathematician, Bolzano was attuned to philosophical concerns that escaped the attention of most of his contemporaries and many of his successors. His theory is historically and philosophically interesting, and it deserves to be investigated further.

7. References and Further Reading

  • Bar-Hillel, Yehoshua (1950) “Bolzano’s Definition of Analytic Propositions” Methodos,  32-55. [Republished in Theoria 16, 1950, pp. 91-117; reprinted in Aspects of language: Essays and Lectures on Philosophy of Language, Linguistic Philosophy and Methodology of Linguistics, Jerusalem, The Magnes Press 1970 pp. 3-28].
  • Benthem, Johan van (2003) “Is There Still Logic in Bolzano’s Key?” in Bernard Bolzanos Leistungen in Logik, Mathematik und Physik, Edgar Morscher (ed.) Sankt Augustin, Academia, 11-34.
  • Benthem, Johan van (1985) “The Variety of Consequence, According to Bolzano”, Studia Logica 44/4, 389-403.
  • Benthem, Johan van (1984) Lessons from Bolzano. Stanford, Center for the Study of Language and Information, Stanford University, 1984.
  • Bolzano, Bernard (1969-…) Bernard Bolzano-Gesamtausgabe, eds. E. Winter, J. Berg, F. Kambartel, J. Louzil, B. van Rootselaar, Stuttgart-Bad Cannstatt, Frommann-Holzboog, 2 A, 12.1, Introduction by Jan Berg.
  • Bolzano, Bernard (1976) Ausgewählte Schriften, Winter, Eduard (ed.), Berlin, Union Verlag.
  • Bolzano, Bernard (1851) Paradoxien des Unendlichen, (reprint) Wissenschaftliche Buchgesellschaft, 1964. [Dr Bernard Bolzano’s Paradoxien des Unendlichen herausgegeben aus dem schriftlichem Nachlasse des Verfassers von Dr Fr. Příhonský, Leipzig, Reclam. (Höfler et Hahn (Eds.), Leipzig, Meiner, 1920)]
  • Bolzano, Bernard (1948) Geometrische Arbeiten [Geometrical Works], Spisy Bernarda Bolzana, Prague, Royal Bohemian Academy of Science.
  • Bolzano, Bernard (1837) Wissenschaftslehre, Sulzbach, Seidel.
  • Bolzano, Bernard (1931) Reine Zahlenlehre [Pure Theory of Numbers], Spisy Bernarda Bolzana, Prague, Royal Bohemian Academy of Science.
  • Bolzano, Bernard (1930) Funktionenlehre [Theory of Functions], Spisy Bernarda Bolzana, Prague, Royal Bohemian Academy of Science.
  • Bolzano, Bernard (1817a) Rein analytischer Beweis des Lehrsatzes, dass zwischen je zwey Werthen, die ein entgegengesetztes Resultat gewähren, wenigstens eine reelle Wurzel der Gleichung liege, Prague, Haase. 2nd edition, Leipzig, Engelmann, 1905; Facsimile, Berlin, Mayer & Mueller, 1894.
  • Bolzano, Bernard (1817b) Die drey Probleme der Rectification, der Complanation und der Cubirung, ohne Betrachtung des unendlich Kleinen, Leipzig, Kummer.
  • Bolzano, Bernard (1816) Der binomische Lehrsatz und als Folgerung aus ihm der polynomische, und die Reihen, die zur Berechnung der Logarithmen und Exponentialgrössen dienen, Prague, Enders.
  • Bolzano, Bernard (1812) Etwas aus der Logik, Bernard Bolzano-Gesamtausgabe, Stuttgart, Frommann-Holzboog, vol. 2 B 5, p. 140ff.
  • Bolzano, Bernard (1810) Beyträge zu einer begründeteren Darstellung der Mathematik, Widtmann, Prague. (Darmstadt, Wissenschaftliche Buchgesellschaft, 1974).
  • Coffa, Alberto (1991) The Semantic Tradition from Kant to Carnap, Cambridge, Cambridge University Press.
  • Dubucs, Jacques & Lapointe, Sandra (2006) “On Bolzano’s Alleged Explicativism,” Synthese 150/2, 229–46.
  • Etchemendy, John (2008) “Reflections on Consequence,” in (Patterson 2008), 263-299.
  • Etchemendy, John (1990) The Concept of Logical Consequence, Cambridge, Harvard University Press.
  • Etchemendy, John (1988) “Models, Semantics, and Logical Truth”, Linguistics and Philosophy, 11, 91-106.
  • Freudenthal, Hans (1971) “Did Cauchy Plagiarize Bolzano?”, Archives for the History of Exact Sciences, 375-92.
  • Grattan-Guinness, Ivor (1970) “Bolzano, Cauchy and the ‘New Analysis’ of the Early Nineteenth Century,” Archives for the History of Exact Sciences, 6, 372-400.
  • Künne, Wolfgang (2006) “Analyticity and logical truth: from Bolzano to Quine”, in (Textor 2006), 184-249.
  • Lapointe, Sandra (2008), Qu’est-ce que l’analyse?, Paris, Vrin.
  • Lapointe, Sandra (2007) “Bolzano’s Semantics and His Critique of the Decompositional Conception of Analysis” in The Analytic Turn, Michael Beaney (Ed.), London, Routledge, pp.219–234.
  • Lapointe, Sandra (2000). Analyticité, Universalité et Quantification chez Bolzano. Les Études Philosophiques, 2000/4, 455–470.
  • Morscher, Edgar (2003) “La Définition Bolzanienne de l’Analyticité Logique”, Philosophiques 30/1, 149-169.
  • Neeman, Ursula (1970), “Analytic and Synthetic Propositions in Kant and Bolzano” Ratio 12, 1-25.
  • Patterson, Douglas (ed.) (2008) New Essays on Tarski’s Philosophy, Oxford, Oxford University Press.
  • Příhonský, František (1850) Neuer Anti-Kant: oder Prüfung der Kritik der reinen Vernunft nach den in Bolzanos Wissenschaftslehre niedergelegten Begriffen, Bautzen, Hiecke.
  • Proust, Joëlle (1989) Questions of Form. Logic and the Analytic Proposition from Kant to Carnap. Minneapolis: University of Minnesota Press.
  • Proust, Joëlle (1981) “Bolzano’s analytic revisited”, Monist, 214-230.
  • Rusnock, Paul (2000) Bolzano’s philosophy and the emergence of modern mathematics, Amsterdam, Rodopi.
  • Russ, Steve (2004) The Mathematical Works of Bernard Bolzano, Oxford, Oxford University Press.
  • Russell, Bertrand (1903) The Principles of Mathematics, Cambridge, Cambridge University Press.
  • Sebestik, Jan (1992) Logique et mathématique chez Bernard Bolzano, Paris, Vrin.
  • Schubring, Gert (1993) “Bernard Bolzano. Not as Unknown to His Contemporaries as Is Commonly Believed?” Historia Mathematica, 20, 43-53.
  • Siebel, Mark (2003) “La notion bolzanienne de déductibilité” Philosophiques, 30/1, 171-189.
  • Siebel, Mark (2002) “Bolzano’s concept of consequence” Monist, 85, 580-599.
  • Siebel, Mark (1996) Der Begriff der Ableitbarkeit bei Bolzano, Sankt Augustin, Academia Verlag.
  • Tatzel, Armin (2003) “La théorie bolzanienne du fondement et de la consequence” Philosophiques 30/1, 191-217.
  • Tatzel, Armin (2002) “Bolzano’s theory of ground and consequence” Notre Dame Journal of Formal Logic 43, 1-25.
  • Textor, Mark (ed.) (2006) The Austrian Contribution to Analytic Philosophy, New York, Routledge.
  • Textor, Mark (2001) “Logically analytic propositions ‘a posteriori’?”, History of Philosophy Quarterly, 18, 91-113.
  • Textor, Mark (2000) “Bolzano et Husserl sur l’analyticité,” Les Études Philosophiques 2000/4 435–454.
  • Waldegg, Guillermina (2001) “Ontological Convictions and Epistemological Obstacles in Bolzano’s Elementary Geometry”, Science and Education, 10/4, 409-418.

Author Information

Sandra LaPointe
Email: sandra.lapointe@mac.com
Kansas State University
U. S. A.

The New Atheists

The New Atheists are authors of early twenty-first century books promoting atheism. These authors include Sam Harris, Richard Dawkins, Daniel Dennett, and Christopher Hitchens. The “New Atheist” label for these critics of religion and religious belief emerged out of journalistic commentary on the contents and impacts of their books. A standard observation is that New Atheist authors exhibit an unusually high level of confidence in their views.  Reviewers have noted that these authors tend to be motivated by a sense of moral concern and even outrage about the effects of religious beliefs on the global scene. It is difficult to identify anything philosophically unprecedented in their positions and arguments, but the New Atheists have provoked considerable controversy with their body of work.

In spite of their different approaches and occupations (only Dennett is a professional philosopher), the New Atheists tend to share a general set of assumptions and viewpoints. These positions constitute the background theoretical framework that is known as the New Atheism. The framework has a metaphysical component, an epistemological component, and an ethical component.  Regarding the metaphysical component, the New Atheist authors share the central belief that there is no supernatural or divine reality of any kind.  The epistemological component is their common claim that religious belief is irrational. The moral component is the assumption that there is a universal and objective secular moral standard. This moral component sets them apart from other prominent historical atheists such as Nietzsche and Sartre, and it plays a pivotal role in their arguments because it is used to conclude that religion is bad in various ways, although Dennett is more reserved than the other three.

The New Atheists make substantial use of the natural sciences in both their criticisms of theistic belief and in their proposed explanations of its origin and evolution. They draw on science for recommended alternatives to religion. They believe empirical science is the only (or at least the best) basis for genuine knowledge of the world, and they insist that a belief can be epistemically justified only if it is based on adequate evidence. Their conclusion is that science fails to show that there is a God and even supports the claim that such a being probably does not exist. What science will show about religious belief, they claim, is that this belief can be explained as a product of biological evolution. Moreover, they think that it is possible to live a satisfying non-religious life on the basis of secular morals and scientific discoveries.

Table of Contents

  1. Faith and Reason
  2. Arguments For and Against God’s Existence
  3. Evolution and Religious Belief
  4. The Moral Evaluation of Religion
  5. Secular Morality
  6. Alleged Divine Revelations
  7. Secular Fulfillment
  8. Criticism of the New Atheists
  9. References and Further Reading

1. Faith and Reason

Though it is difficult to find a careful and precise definition of “faith” in the writings of the New Atheists, it is possible to glean a general characterization of this cognitive attitude from various things they say about it. In The Selfish Gene, Richard Dawkins states that faith is blind trust without evidence and even against the evidence. He follows up in The God Delusion with the claim that faith is an evil because it does not require justification and does not tolerate argument. Whereas the former categorization suggests that Dawkins thinks that faith is necessarily non-rational or even irrational, the latter description seems to imply that faith is merely contingently at odds with rationality. Harris’s articulation of the nature of faith is closer to Dawkins’ earlier view. He says that religious faith is unjustified belief in matters of ultimate concern.  According to Harris, faith is the permission religious people give one another to believe things strongly without evidence. Hitchens says that religious faith is ultimately grounded in wishful thinking. For his part, Dennett implies that belief in God cannot be reasonable because the concept of God is too radically indeterminate for the sentence “God exists” to express a genuine proposition.  Given this, Dennett questions whether any of the people who claim to believe in God actually do believe God exists. He thinks it more likely that they merely profess belief in God or “believe in belief” in God (they believe belief in God is or would be a good thing). According to this view there can be no theistic belief that is also reasonable or rational. Critics point out that the New Atheist assumption that religious faith is irrational is at odds with a long philosophical history in the West that often characterizes faith as rational.  This Western Philosophical tradition can be said to begin with Augustine and continue through to present times.

The New Atheists subscribe to some version or other of scientism as their criterion for rational belief. According to scientism, empirical science is the only source of our knowledge of the world (strong scientism) or, more moderately, the best source of rational belief about the way things are (weak scientism). Harris and Dawkins are quite explicit about this. Harris equates a genuinely rational approach to spiritual and ethical questions with a scientific approach to these sorts of questions. Dawkins insists that the presence or absence of a creative super-intelligence is a scientific question. The New Atheists also affirm evidentialism, the claim that a belief can be epistemically justified only if it is based on adequate evidence. The conjunction of scientism and evidentialism entails that a belief can be justified only if it is based on adequate scientific evidence. The New Atheists’ conclusion that belief in God is unjustified follows, then, from their addition of the claim that there is inadequate scientific evidence for God’s existence (and even adequate scientific evidence for God’s non-existence). Dawkins argues that the “God Hypothesis,” the claim that there exists a superhuman, supernatural intelligence who deliberately designed and created the universe, is “founded on local traditions of private revelation rather than evidence” (2006, pp. 31-32). Given these New Atheist epistemological assumptions (and their consequences for religious epistemology), it is not surprising that some criticism of their views has included questions about whether there is adequate scientific support for scientism and whether there is adequate evidence for evidentialism.

2. Arguments For and Against God’s Existence

Since atheism continues to be a highly controversial philosophical position, one would expect that the New Atheists would devote a fair amount of space to a careful (and, of course, critical) consideration of arguments for God’s existence and that they would also spend a corresponding amount of time formulating a case for the non-existence of God. However, none of them addresses either theistic or atheistic arguments to any great extent. Dawkins does devote a chapter to each of these tasks, but he has been criticized for engaging in an overly cursory evaluation of theistic arguments and for ignoring the philosophical literature in natural theology. The literature overlooked by Dawkins addresses issues relevant to his claim that there almost certainly is no God. Harris, who thinks that atheism is obviously true, does not dedicate much space to a discussion of arguments for or against theism. He does sketch a brief version of the cosmological argument for God’s existence but asserts that the final conclusion does not follow because the argument does not rule out alternative possibilities for the universe’s existence. Harris also hints at reasons to deny God’s existence by pointing to unexplained evil and “unintelligent design” in the world. Hitchens includes chapters entitled “The Metaphysical Claims of Religion are False” and “Arguments from Design,” but his more journalistic treatment of the cases for and against God’s existence amounts primarily to the claim that the God hypothesis is unnecessary since science can now explain what theism was formerly thought to be required to explain, including phenomena such as the appearance of design in the universe. After considering the standard arguments for God’s existence and rehearsing standard objections to them, Dennett argues that the concept of God is insufficiently determinate for it to be possible to know what proposition is at issue in the debate over God’s existence.

Dawkins’ argument for the probable non-existence of God is the most explicit and thorough attempt at an atheistic argument amongst the four. However, he does not state this argument in a deductively valid form, so it is difficult to discern exactly what he has in mind. Dawkins labels his argument for God’s non-existence “the Ultimate Boeing 747 gambit,” because he thinks that God’s existence is at least as improbable as the chance that a hurricane, sweeping through a scrap yard, would have the luck to assemble a Boeing 747 (an image that he borrows from Fred Hoyle, who used it for a different purpose). At the heart of his argument is the claim that any God capable of designing a universe must be a supremely complex and improbable entity who needs an even bigger explanation than the one the existence of such a God is supposed to provide. Dawkins also says that the hypothesis that an intelligent designer created the universe is self-defeating. What he appears to mean by this charge is that this intelligent design hypothesis claims to provide an ultimate explanation for all existing improbable complexity and yet cannot provide an explanation of its own improbable complexity. Dawkins further states that the God hypothesis creates a vicious regress rather than terminating one. Similarly, Harris follows Dawkins in arguing that the notion of a creator God leads to an infinite regress because such a being would have to have been created. Some critics, like William Lane Craig, reply that, at best, Dawkins’ argument could show only that the God hypothesis does not explain the appearance of design in the universe (a claim that Craig denies), but it does not demonstrate that God probably doesn’t exist. Dawkins’ assumption that God would need an external cause flies in the face of the longstanding theological assumption that God is a perfect and so necessary being who is consequently self-existent and ontologically independent. At the very least, Dawkins owes the defender of this classical conception of God further clarification of the kind of complexity he attributes to God and further arguments for the claims that God possesses this kind of complexity and that God’s being complex in this way is incompatible with God’s being self-existent. In reply to Dawkins, Craig argues that though the contents of God’s mind may be complex, God’s mind itself is simple.

3. Evolution and Religious Belief

The New Atheists observe that if there is no supernatural reality, then religion and religious belief must have a purely natural explanation. They agree that these sociological and psychological phenomena are rooted in biology. Harris summarizes their view by saying that as a biological phenomenon, religion is the product of cognitive processes that have deep roots in our evolutionary past. Dawkins endorses the general hypothesis that religion and religious belief are byproducts of something else that has survival value. His specific hypothesis is that human beings have acquired religious beliefs because there is a selective advantage to child brains that possess the rule of thumb to believe, without question, whatever familiar adults tell them. Dawkins speculates that this cognitive disposition, which tends to help inexperienced children to avoid harm, also tends to make them susceptible to acquiring their elders’ irrational and harmful religious beliefs. Dawkins is less committed to this specific hypothesis than he is to the general hypothesis, and he is open to other specific hypotheses of the same kind. Dennett discusses a number of these specific hypotheses more thoroughly in his attempt to “break the spell” he identifies as the taboo against a thorough scientific investigation of religion as one natural phenomenon among many.

At the foundation of Dennett’s “proto-theory” about the origin of religion and religious belief is his appeal to the evolution in humans (and other animals) of a “hyperactive agent detection device” (HADD), which is the disposition to attribute agency – beliefs and desires and other mental states – to anything complicated that moves. Dennett adds that when an event is sufficiently puzzling, our “weakness for certain sorts of memorable combos” cooperates with our HADD to constitute “a kind of fiction-generating contraption” that hypothesizes the existence of invisible and even supernatural agents (2006, pp. 119-120). Dennett goes on to engage in a relatively extensive speculation about how religion and religious belief evolved from these purely natural beginnings. Though Hitchens mentions Dennett’s naturalistic approach to religion in his chapter on “religion’s corrupt beginnings,” he focuses primarily on the interplay between a pervasive gullibility he takes to be characteristic of human beings and the exploitation of this credulity that he attributes to the founders of religions and religious movements. The scientific investigation of religion of the sort Dennett recommends has prompted a larger interdisciplinary conversation that includes both theists and non-theists with academic specialties in science, philosophy, and theology (see Schloss and Murray 2009 for an important example of this sort of collaboration).

4. The Moral Evaluation of Religion

The New Atheists agree that, although religion may have been a byproduct of certain human qualities that proved important for survival, religion itself is not necessarily a beneficial social and cultural phenomenon on balance at present. Indeed, three of the New Atheists (Harris, Dawkins, and Hitchens) are quite explicit in their moral condemnation of religious people on the ground that religious beliefs and practices have had significant and predominantly negative consequences. The examples they provide of such objectionable behaviors range from the uncontroversial (suicide bombings, the Inquisition, “religious” wars, witch hunts, homophobia, etc.) to the controversial (prohibition of “victimless crimes” such as drug use and prostitution, criminalization of abortion and euthanasia, “child abuse” due to identification of children as members of their parents’ religious communities, and so forth). Harris is explicit about placing the blame for these evils on faith, defined as unfounded belief. He argues that faith in what religious believers take to be God’s will as revealed in God’s book inevitably leads to immoral behaviors of these sorts. In this way, the New Atheists link their epistemological critique of religious belief with their moral criticism of religion.

The New Atheists counter the claim that religion makes people good by listing numerous examples of the preceding sort in which religion allegedly makes people bad. They also anticipate the reply that the moral consequences of atheism are worse than those of theism. A typical case for this claim appeals to the atrocities perpetrated by people like Hitler and Stalin. The New Atheists reply that Hitler was not necessarily an atheist, since he claimed to be a Christian; that these regimes were evil because they were influenced by religion or were themselves like a religion; and that, even if their leaders were atheists (as in the case of Stalin), their crimes against humanity were not caused by their atheism because they were not carried out in the name of atheism. The New Atheists seem to agree that theistic belief has generally worse attitudinal and behavioral moral consequences than atheistic belief. Dennett is characteristically more hesitant to draw firm conclusions along these lines until further empirical investigation is undertaken.

5. Secular Morality

These moral objections to religion presuppose a moral standard. Since the New Atheists have denied the existence of any supernatural reality, this moral standard has to have a purely natural and secular basis. Many non-theists have located the natural basis for morality in human convention, a move that leads naturally to ethical relativism. But the New Atheists either explicitly reject ethical relativism, or affirm the existence of the “transcendent value” of justice, or assert that there is a consensus about what we consider right and wrong, or simply engage in a moral critique of religion that implicitly presupposes a universal moral standard.

The New Atheists’ appeal to a universal secular moral standard raises some interesting philosophical questions. First, what is the content of morality? Harris comes closest to providing an explicit answer to that question in stating that questions of right and wrong are really questions about the happiness and suffering of sentient creatures. Second, if the content of morality is not made accessible to human beings by means of a revelation of God’s will, then how do humans know what the one moral standard is? The New Atheists seem to agree that we have foundational moral knowledge. Harris calls the source of this basic moral knowledge “moral intuition.” Since the other New Atheists don’t argue for the moral principles to which they appeal, it seems reasonable to conclude that they would agree with Harris. Third, what is the ontological ground of the universal moral standard? Given the assumption that ethical relativism is false, the question arises concerning what the objective natural ground is that makes it the case that some people are virtuous and some are not and that some behaviors are morally right and some are not. Again, Harris’s view that our ethical intuitions have their roots in biology is representative. Dawkins provides “four good Darwinian reasons” that purport to explain why some animals (including, of course, human beings) engage in moral behavior. And though Dennett’s focus is on the evolution of religion, he is likely to have a similar story about the evolution of morality. One problem with this biological answer to our philosophical question is that it could only explain what causes moral behavior; it can’t also account for what makes moral principles true. The fourth philosophical question raised by the New Atheists is one they address themselves: “Why should we be moral?” Harris’s answer is that being moral tends to contribute to one’s happiness. Dawkins’ reply to the critic who asks, “If there’s no God, why be good?” seems to amount to no more than the observation that there are moral atheists. But this could only show that belief in God is not needed to motivate people to be moral; it doesn’t explain what does (or should) motivate atheists to be moral.

6. Alleged Divine Revelations

If there is no divine being, then there are no divine revelations. If there are no divine revelations, then every sacred book is a merely human book. Harris, Dawkins, and Hitchens each construct a case for the claim that no alleged written divine revelation could have a divine origin. Their arguments for this conclusion focus on what they take to be the moral deficiencies and factual errors of these books. Harris quotes passages from the part of the Old Testament traditionally labeled the “Law” that he considers barbaric and then asserts (on the basis of his view that Jesus can be read to endorse the entirety of Old Testament law) that the New Testament does not improve on these injunctions. He says that any subsequent more moderate Christian migration away from these biblical legal requirements is a result of taking scripture less and less seriously. Dawkins agrees with Harris that the God of the Bible and the Qur’an is not a moderate. As a matter of fact, he says that “The God of the Old Testament is arguably the most unpleasant character in all of fiction” (Dawkins 2006, p. 31). Though he says that “Jesus is a huge improvement over the cruel ogre of the Old Testament” (Dawkins 2006, p. 25), he argues that the doctrine of atonement, “which lies at the heart of New Testament theology, is almost as morally obnoxious as the story of Abraham setting out to barbecue Isaac” (Dawkins 2006, p. 251). Hitchens adds his own similar criticisms of both testaments in two chapters: “The Nightmare of the ‘Old’ Testament” and “The ‘New’ Testament Exceeds the Evil of the ‘Old’ One.” He also devotes a chapter to the Qur’an (as does Harris) and a section to the Book of Mormon. Dennett hints at a different objection to the Bible by remarking that anybody can quote the Bible to prove anything.

This collective case against the authenticity of any alleged written divine revelation raises interesting questions in philosophical theology about what kind of book could qualify as “God’s Word.” For instance, Harris considers it astonishing that a book as “ordinary” as the Bible is nonetheless thought to be a product of omniscience. He also says that, whereas the Bible contains no formal discussion of mathematics and some obvious mathematical errors, a book written by an omniscient being could contain a chapter on mathematics that would still be the richest source of mathematical insight humanity has ever known. This sort of claim invites further discussion about the sorts of purposes God would have and strategies God would employ in communicating with human beings in different times and places.

7. Secular Fulfillment

Each of the New Atheists recommends or at least alludes to a non-religious means of personal fulfillment and even collective well-being. Harris advocates a “spirituality” that involves meditation leading to happiness through an eradication of one’s sense of self. He thinks that scientific exploration into the nature of human consciousness will provide a progressively more adequate natural and rational basis for such a practice. For inspiration in a Godless world, Dawkins looks to the power of science to open the mind and satisfy the psyche. He celebrates the liberation of human beings from ignorance due to the growing and assumedly limitless capacity of science to explain the universe and everything in it. Hitchens hints at his own source of secular satisfaction by claiming that the natural is wondrous enough for anyone. He expresses his hope for a renewed Enlightenment focused on human beings, based on unrestricted scientific inquiry, and eventually productive of a new humane civilization. Dennett believes that a purely naturalistic spirituality is possible through a selfless attitude characterized by humble curiosity about the world’s complexities resulting in a realization of the relative unimportance of one’s personal preoccupations.

8. Criticism of the New Atheists

A number of essays and books have been written in response to the New Atheists (see the “References and Further Reading” section below for some titles). Some of these works are supportive of them and some of them are critical. Other works include both positive and negative evaluations of the New Atheism. Clearly, the range of philosophical issues raised by the New Atheists’ claims and arguments is broad. As might be expected, attention has been focused on their epistemological views, their metaphysical assumptions, and their axiological positions. Their presuppositions should prompt more discussion in the fields of philosophical theology, philosophy of science, philosophical hermeneutics, the relation between science and religion, and historiography. Conversations about the New Atheists’ stances and rationales have also taken place in the form of debates between Harris, Dawkins, Hitchens, and Dennett and defenders of religious belief and religion such as Dinesh D’Souza, who has published his own defense of Christianity in response to the New Atheists’ arguments. These debates are accessible in a number of places on the Internet. Finally, the challenges to religion posed by the New Atheists have also prompted a number of seminars and conferences. One of these is a conference presented by the Center for Philosophy of Religion at the University of Notre Dame, entitled “My Ways Are Not Your Ways: The Character of the God of the Hebrew Bible” (2009). For an introduction to the sorts of issues this conference addresses, see Copan 2008.

9. References and Further Reading

  • Berlinski, David. The Devil’s Delusion: Atheism and its Scientific Pretensions (New York: Crown Forum, 2008).
    • A response to the New Atheists by a secular Jew that defends traditional religious thought.
  • Copan, Paul. “Is Yahweh a Moral Monster? The New Atheists and Old Testament Ethics,” Philosophia Christi 10:1, 2008, pp. 7-37.
    • A defense of the God and ethics of the Old Testament against the New Atheists’ criticisms of them.
  • Copan, Paul and William Lane Craig, eds. Contending with Christianity’s Critics (Nashville, Tenn.: Broadman and Holman, 2009).
    • A collection of essays by Christian apologists that addresses challenges from New Atheists and other contemporary critics of Christianity.
  • Craig, William Lane, ed. God is Great, God is Good: Why Believing in God is Reasonable and Responsible (Grand Rapids: InterVarsity Press, 2009).
    • A collection of essays by philosophers and theologians defending the rationality of theistic belief from the attacks of the New Atheists and others.
  • Dawkins, Richard. The Selfish Gene, 2nd ed. (Oxford: Oxford University Press, 1989).
    • An explanation and defense of biological evolution by natural selection that focuses on the gene.
  • Dawkins, Richard. The God Delusion (Boston: Houghton Mifflin, 2006).
    • A case for the irrationality and immoral consequences of religious belief that draws primarily on evolutionary biology.
  • Dennett, Daniel. Breaking the Spell: Religion as a Natural Phenomenon (New York: Penguin, 2006).
    • A case for studying the history and practice of religion by means of the natural sciences.
  • D’Souza, Dinesh. What’s So Great About Christianity (Carol Stream, IL: Tyndale House Publishers, 2007).
    • A defense of Christianity against the criticisms of the New Atheists.
  • Eagleton, Terry. Reason, Faith, and Revolution: Reflections on the God Debate (New Haven: Yale University Press, 2009).
    • A critical reply to Dawkins and Hitchens (“Ditchkins”) by a Marxist literary critic.
  • Flew, Antony. There is a God: How the World’s Most Notorious Atheist Changed His Mind (New York: HarperOne, 2007).
    • A former atheistic philosopher’s account of his conversion to theism (which includes a section by co-author Roy Abraham Varghese that provides a critical appraisal of the New Atheism).
  • Harris, Sam. The End of Faith: Religion, Terror, and the Future of Reason (New York: Norton, 2004).
    • An intellectual and moral critique of faith-based religions that recommends their replacement by science-based spirituality.
  • Harris, Sam. Letter to a Christian Nation (New York: Vintage Books, 2008).
    • A revised edition of his 2006 response to Christian reactions to his 2004 book.
  • Hitchens, Christopher. God is Not Great: How Religion Poisons Everything (New York: Twelve, 2007).
    • A journalistic case against religion and religious belief.
  • Keller, Timothy. The Reason for God: Belief in God in an Age of Skepticism (New York: Dutton, 2007).
    • A Christian minister’s reply to objections against Christianity of the sort raised by the New Atheists together with his positive case for Christianity.
  • Kurtz, Paul. Forbidden Fruit: The Ethics of Secularism (Amherst, New York: Prometheus Books, 2008).
    • A case for an atheistic secular humanistic ethics by a philosopher.
  • McGrath, Alister and Joanna Collicutt McGrath. The Dawkins Delusion? Atheist Fundamentalism and the Denial of the Divine (Downers Grove, IL: InterVarsity Press, 2007).
    • A critical engagement with the arguments set out in Dawkins 2006.
  • Ray, Darrel W. The God Virus: How Religion Infects Our Lives and Culture (IPC Press, 2009).
    • A book by an organizational psychologist that purports to explain how religion has negative consequences for both individuals and societies.
  • Schloss, Jeffrey and Michael Murray, eds. The Believing Primate: Scientific, Philosophical, and Theological Reflections on the Origin of Religion (New York: Oxford University Press, 2009).
    • An interdisciplinary discussion of issues raised by the sort of naturalistic account of religion promoted in Dennett 2006 and elsewhere.
  • Stenger, Victor. God: The Failed Hypothesis. How Science Shows That God Does Not Exist (Prometheus Books, 2008).
    • A scientific case for the non-existence of God by a physicist.
  • Stenger, Victor. The New Atheism: Taking a Stand for Science and Reason (Prometheus Books, 2009).
    • A defense of the New Atheism by a physicist.
  • Ward, Keith. Is Religion Dangerous? (Grand Rapids: Eerdmans, 2006).
    • A defense of religion against the New Atheists’ arguments by a philosopher-theologian.

Author Information

James E. Taylor
Email: taylor@westmont.edu
Westmont College
U. S. A.

Atheism

The term “atheist” describes a person who does not believe that God or a divine being exists.  Worldwide there may be as many as a billion atheists, although social stigma, political pressure, and intolerance make accurate polling difficult.

For the most part, atheists have presumed that the most reasonable conclusions are the ones that have the best evidential support.  And they have argued that the evidence in favor of God’s existence is too weak, or the arguments in favor of concluding there is no God are more compelling.  Traditionally the arguments for God’s existence have fallen into several families: ontological, teleological, and cosmological arguments, miracles, and prudential justifications.  For detailed discussion of those arguments and the major challenges to them that have motivated the atheist conclusion, the reader is encouraged to consult the other relevant sections of the encyclopedia.

Arguments for the non-existence of God are deductive or inductive. Deductive arguments for the non-existence of God are either single or multiple property disproofs that allege that there are logical or conceptual problems with one or several properties that are essential to any being worthy of the title “God.” Inductive arguments typically present empirical evidence that is employed to argue that God’s existence is improbable or unreasonable. Briefly stated, the main arguments are: God’s non-existence is analogous to the non-existence of Santa Claus. The existence of widespread human and non-human suffering is incompatible with an all-powerful, all-knowing, all-good being. Discoveries about the origins and nature of the universe, and about the evolution of life on Earth, make the God hypothesis an unlikely explanation. Widespread non-belief and the lack of compelling evidence show that a God who seeks belief in humans does not exist. Broad considerations from science that support naturalism, or the view that all and only physical entities and causes exist, have also led many to the atheist conclusion.

The presentation below provides an overview of concepts, arguments, and issues that are central to work on atheism.

Table of Contents

  1. What is Atheism?
  2. The Epistemology of Atheism
  3. Deductive Atheology
    1. Single Property Disproofs
    2. Multiple Property Disproofs
    3. Failure of Proof Disproof
  4. Inductive Atheology
    1. The Prospects for Inductive Proof
    2. The Santa Claus Argument
    3. Problem of Evil
    4. Cosmology
    5. Teleological Arguments
    6. Arguments from Nonbelief
    7. Atheistic Naturalism
  5. Cognitivism and Non-Cognitivism
  6. Future Prospects for Atheism
  7. References and Further Reading

1. What is Atheism?

Atheism is the view that there is no God.  Unless otherwise noted, this article will use the term “God” to describe the divine entity that is a central tenet of the major monotheistic religious traditions–Christianity, Islam, and Judaism.  At a minimum, this being is usually understood as having all power, all knowledge, and being infinitely good or morally perfect.  See the article Western Concepts of God for more details.  When necessary, we will use the term “gods” to describe all other lesser or different  characterizations of divine beings, that is, beings that lack some, one, or all of the omni- traits.

There have been many thinkers in history who have lacked a belief in God. Some ancient Greek philosophers, such as Epicurus, sought natural explanations for natural phenomena. Epicurus was also the first to question the compatibility of God with suffering. Forms of philosophical naturalism that would replace all supernatural explanations with natural ones also extend into ancient history. In the 18th century, during the Enlightenment, David Hume and Immanuel Kant gave influential critiques of the traditional arguments for the existence of God. After Darwin (1809-1882) made the case for evolution, and with further modern advancements in science, a fully articulated philosophical worldview that denies the existence of God gained traction. In the 19th and 20th centuries, influential critiques of God, belief in God, and Christianity by Nietzsche, Feuerbach, Marx, Freud, and Camus set the stage for modern atheism.

It has come to be widely accepted that to be an atheist is to affirm the non-existence of God. Antony Flew (1984) called this positive atheism, whereas to lack a belief that God or gods exist is to be a negative atheist. Parallels for this use of the term would be terms such as “amoral,” “atypical,” or “asymmetrical.” So negative atheism would include someone who has never reflected on the question of whether or not God exists and has no opinion about the matter, and someone who has thought about the matter a great deal and has concluded either that she has insufficient evidence to decide the question, or that the question cannot be resolved in principle. Agnosticism is traditionally characterized as neither believing that God exists nor believing that God does not exist.

Atheism can be narrow or wide in scope.  The narrow atheist does not believe in the existence of God (an omni- being).  A wide atheist does not believe that any gods exist, including but not limited to the traditional omni-God.  The wide positive atheist denies that God exists, and also denies that Zeus, Gefjun, Thor, Sobek, Bakunawa and others exist.  The narrow atheist does not believe that God exists, but need not take a stronger view about the existence or non-existence of other supernatural beings.  One could be a narrow atheist about God, but still believe in the existence of some other supernatural entities.  (This is one of the reasons that it is a mistake to identify atheism with materialism or naturalism.)

Separating these different senses of the term allows us to better understand the different sorts of justification that can be given for varieties of atheism with different scopes. An argument may serve to justify one form of atheism and not another. For instance, alleged contradictions within a Christian conception of God by themselves do not serve as evidence for wide atheism, but, presumably, reasons that are adequate to show that there is no omni-God would be sufficient to show that there is no Islamic God.

2. The Epistemology of Atheism

We can divide the justifications for atheism into several categories.  For the most part, atheists have taken an evidentialist approach to the question of God’s existence.  That is, atheists have taken the view that whether or not a person is justified in having an attitude of belief towards the proposition, “God exists,” is a function of that person’s evidence.  “Evidence” here is understood broadly to include a priori arguments, arguments to the best explanation, inductive and empirical reasons, as well as deductive and conceptual premises.  An asymmetry exists between theism and atheism in that atheists have not offered faith as a justification for non-belief.  That is, atheists have not presented non-evidentialist defenses for believing that there is no God.

Not all theists appeal only to faith, however. Evidentialist theists and evidentialist atheists may have a number of general epistemological principles concerning evidence, arguments, and implication in common, but then disagree about what the evidence is, how it should be understood, and what it implies. They may disagree, for instance, about whether the values of the physical constants and laws in nature constitute evidence for intentional fine tuning, but agree at least that whether God exists is a matter that can be explored empirically or with reason.

Many non-evidentialist theists may deny that the acceptability of a particular religious claim depends upon evidence, reasons, or arguments as they have been classically understood. Beliefs in God based on faith or on prudential considerations, for example, will fall into this category. The evidentialist atheist and the non-evidentialist theist, therefore, may have a number of more fundamental disagreements about the acceptability of believing despite inadequate or contrary evidence, the epistemological status of prudential grounds for believing, or the nature of belief in God. Their disagreement may not be so much about the evidence, or even about God, but about the legitimate roles that evidence, reason, and faith should play in human belief structures.

It is not clear that arguments against atheism that appeal to faith have any prescriptive force the way appeals to evidence do.  The general evidentialist view is that when a person grasps that an argument is sound, that imposes an epistemic obligation on her to accept the conclusion.  Insofar as having faith that a claim is true amounts to believing contrary to or despite a lack of evidence, one person’s faith that God exists does not have this sort of inter-subjective, epistemological implication.  Failing to believe what is clearly supported by the evidence is ordinarily irrational.  Failure to have faith that some claim is true is not similarly culpable.

Justifying atheism, then, can entail several different projects.  There are the evidential disputes over what information we have available to us, how it should be interpreted, and what it implies.  There are also broader meta-epistemological concerns about the roles of argument, reasoning, belief, and religiousness in human life.  The atheist can find herself not just arguing that the evidence indicates that there is no God, but defending science, the role of reason, and the necessity of basing beliefs on evidence more generally.

William Rowe has introduced an important distinction to modern discussions of atheism: friendly atheism. If someone has arrived at what she takes to be a reasonable and well-justified conclusion that there is no God, then what attitude should she take about another person’s persistence in believing in God, particularly when that other person appears to be thoughtful and at least prima facie reasonable?  It seems that the atheist could take one of several views.  The theist’s belief, as the atheist sees it, could be rational or irrational, justified or unjustified.  Must the atheist who believes that the evidence indicates that there is no God conclude that the theist’s believing in God is irrational or unjustified?  Rowe’s answer is no.  (Rowe 1979, 2006)

Rowe and most modern epistemologists have said that whether a conclusion C is justified for a person S is a function of the information (correct or incorrect) that S possesses and the principles of inference that S employs in arriving at C.  But whether or not C is justified is not directly tied to its truth, or even to the truth of the evidence concerning C.  That is, a person can have a justified, but false belief.  She could arrive at a conclusion through an epistemically inculpable process and yet get it wrong.  Ptolemy, for example, the greatest astronomer of his day, who had mastered all of the available information and conducted exhaustive research into the question, was justified in concluding that the Sun orbits the Earth.  A medieval physician in the 1200s who guessed (correctly) that the bubonic plague was caused by the bacterium Yersinia pestis would not have been reasonable or justified given his background information and given that the bacterium would not even be discovered for 600 years.

We can call the view that rational, justified beliefs can be false, as it applies to atheism, friendly or fallibilist atheism.  See the article on Fallibilism. The friendly atheist can grant that a theist may be justified or reasonable in believing in God, even though the atheist takes the theist’s conclusion to be false.  What, from the atheist’s point of view, could explain their divergence?  The believer may not be in possession of all of the relevant information.  The believer may be basing her conclusion on a false premise or premises.  The believer may be implicitly or explicitly employing inference rules that themselves are not reliable or truth preserving, but the background information she has leads her, reasonably, to trust the inference rule.  The same points can be made for the friendly theist and the view that he may take about the reasonableness of the atheist’s conclusion.  It is also possible, of course, for both sides to be unfriendly and conclude that anyone who disagrees with what they take to be justified is being irrational.  Given developments in modern epistemology and Rowe’s argument, however, the unfriendly view is neither correct nor conducive to a constructive and informed analysis of the question of God.

Atheists have offered a wide range of justifications and accounts for non-belief.  A notable modern view is Antony Flew’s Presumption of Atheism (1984). Flew argues that the default position for any rational believer should be neutral with regard to the existence of God, and to be neutral is to not have a belief regarding God’s existence.  And not having a belief with regard to God is to be a negative atheist on Flew’s account. “The onus of proof lies on the man who affirms, not on the man who denies. . . on the proposition, not on the opposition,” Flew argues (20).  Beyond that, coming to believe that such a thing does or does not exist will require justification, much as a jury presumes innocence concerning the accused and requires evidence in order to conclude that he is guilty.  Flew’s negative atheist will presume nothing at the outset, not even the logical coherence of the notion of God, but her presumption is defeasible, or revisable in the light of evidence.  We shall call this view atheism by default.

The atheism by default position contrasts with a more permissive attitude that is sometimes taken regarding religious belief.  The notions of religious tolerance and freedom are sometimes understood to indicate the epistemic permissibility of believing despite a lack of evidence in favor or even despite evidence to the contrary.  One is in violation of no epistemic duty by believing, even if one lacks conclusive evidence in favor or even if one has evidence that is on the whole against.  In contrast to Flew’s jury model, we can think of this view as treating religious beliefs as permissible until proven incorrect.  Some aspects of fideistic accounts or Plantinga’s reformed epistemology can be understood in this light.  This sort of epistemic policy about God or any other matter has been controversial, and a major point of contention between atheists and theists.  Atheists have argued that we typically do not take it to be epistemically inculpable or reasonable for a person to believe in Santa Claus, the Tooth Fairy, or some other supernatural being merely because they do not possess evidence to the contrary.  Nor would we consider it reasonable for a person to begin believing that they have cancer because they do not have proof to the contrary.  The atheist by default argues that it would be appropriate to not believe in such circumstances.  The epistemic policy here takes its inspiration from an influential piece by W.K. Clifford (1999) in which he argues that it is wrong, always, everywhere, and for anyone, to believe anything for which there is insufficient reason.

There are several other approaches to the justification of atheism that we will consider below.  There is a family of arguments, sometimes known as exercises in deductive atheology, for the conclusion that the existence of God is impossible.  Another large group of important and influential arguments can be gathered under the heading inductive atheology.  These probabilistic arguments invoke considerations about the natural world such as widespread suffering, nonbelief, or findings from biology or cosmology.  Another approach, atheistic noncognitivism, denies that God talk is even meaningful or has any propositional content that can be evaluated in terms of truth or falsity. Rather, religious speech acts are better viewed as a complicated sort of emoting or expression of spiritual passion.  Inductive and deductive approaches are cognitivistic in that they accept that claims about God have meaningful content and can be determined to be true or false.

3. Deductive Atheology

Many discussions about the nature and existence of God have either implicitly or explicitly accepted that the concept of God is logically coherent.  That is, for many believers and non-believers the assumption has been that such a being as God could possibly exist but they have disagreed about whether there actually is one.  Atheists within the deductive atheology tradition, however, have not even granted that God, as he is typically described, is possible.  The first question we should ask, argues the deductive atheist, is whether the description or the concept is logically consistent.  If it is not, then no such being could possibly exist.  The deductive atheist argues that one, some, or all of God’s essential properties are logically contradictory.  Since logical impossibilities are not and cannot be real, God does not and cannot exist.  Consider a putative description of an object as a four-sided triangle, a married bachelor, or a prime number with more than two factors.  We can be certain that no such thing fitting that description exists because what they describe is demonstrably impossible.
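
The general form of deductive atheological reasoning described in the preceding paragraph can be set out schematically (a rough sketch for clarity; the symbols P and P1, …, Pn are illustrative placeholders, not drawn from any particular author):

\[
\begin{array}{ll}
1. & \text{If God exists, then God essentially has property } P \text{ (or properties } P_1, \ldots, P_n\text{).} \\
2. & \text{It is logically impossible for anything to have } P \text{ (or to have } P_1, \ldots, P_n \text{ jointly).} \\
3. & \therefore\ \text{It is logically impossible that God exists, and so God does not exist.}
\end{array}
\]

Single property disproofs (section a below) target the possibility of one property taken on its own; multiple property disproofs (section b) target the joint possibility of two or more.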

If deductive atheological proofs are successful, the results are epistemically significant.  Many people have doubts that the view that there is no God can be rationally justified.  But if deductive disproofs show that there can exist no being with a certain property or properties and those properties figure essentially in the characterization of God, then we will have the strongest possible justification for concluding that there is no being fitting any of those characterizations.  If God is impossible, then God does not exist.

It may be possible at this point to re-engineer the description of God so that it avoids the difficulties, but as a consequence the theist faces several challenges according to the deductive atheologist.  First, if the traditional description of God is logically incoherent, then what is the relationship between a theist’s belief and some revised, more sophisticated account that allegedly does not suffer from those problems? Is that the God that she believed in all along?  Before the account of God was improved by consideration of the atheological arguments, what were the reasons that led her to believe in that conception of God?  Second, if the classical characterizations of God are shown to be logically impossible, then there is a legitimate question as to whether any new description that avoids those problems describes a being that is worthy of the label.  It will not do, in the eyes of many theists and atheists, to retreat to the view that God is merely a somewhat powerful, partially-knowing, and partly-good being, for example.  Third, the atheist will still want to know on the basis of what evidence or arguments we should conclude that a being as described by this modified account exists.  Fourth, there is no question that there exist less than omni-beings in the world.  We possess less than infinite power, knowledge and goodness, as do many other creatures and objects in our experience.  What is the philosophical importance or metaphysical significance of arguing for the existence of those sorts of beings and advocating belief in them?  Fifth, and most importantly, if it has been shown that God’s essential properties are impossible, then any move to another description seems to be a concession that positive atheism about God is justified.

Another possible response that the theist may take in response to deductive atheological arguments is to assert that God is something beyond proper description with any of the concepts or properties that we can or do employ, as suggested by Kierkegaard or Tillich.  So complications from incompatibilities among properties of God indicate problems for our descriptions, not the impossibility of a divine being worthy of the label. Many atheists have not been satisfied with this response because the theist has now asserted the existence of and attempted to argue in favor of believing in a being that we cannot form a proper idea of, one that does not have properties that we can acknowledge; it is a being that defies comprehension.  It is not clear how we could have reasons or justifications for believing in the existence of such a thing.  It is not clear how it could be an existing thing in any familiar sense of the term in that it lacks comprehensible properties.  Or put another way, as Patrick Grim notes, “If a believer’s notion of God remains so vague as to escape all impossibility arguments, it can be argued, it cannot be clear to even him what he believes—or whether what he takes for pious belief has any content at all,” (2007, p. 200).  It is not clear how it could be reasonable to believe in such a thing, and it is even more doubtful that it is epistemically unjustified or irresponsible to deny that such a thing exists.  It is clear, however, that the deductive atheologist must acknowledge the growth and development of our concepts and descriptions of reality over time, and she must take a reasonable view about the relationship between those attempts and revisions in our ideas about what may turn out to be real.

a. Single Property Disproofs

Deductive disproofs have typically focused on logical inconsistencies to be found either within a single property or between multiple properties.  Philosophers have struggled to work out the details of what it would be to be omnipotent, for instance.  It has come to be widely accepted that a being cannot be omnipotent where omnipotence simply means the power to do anything, including the logically impossible.  This definition of the term suffers from the stone paradox.  An omnipotent being would either be capable of creating a rock that he cannot lift, or he is incapable.  If he is incapable, then there is something he cannot do, and therefore he does not have the power to do anything.  If he can create such a rock, then again there is something that he cannot do, namely lift the rock he just created.  So paradoxically, having the ability to do anything would appear to entail being unable to do some things.  As a result, many theists and atheists have agreed that a being could not have that property.  A number of attempts to work out an account of omnipotence have ensued.  (See Cowan 2003, Flint and Freddoso 1983, Hoffman and Rosenkrantz 1988 and 2006, Mavrodes 1977, Ramsey 1956, Sobel 2004, Savage 1967, and Wierenga 1989 for examples.)  It has also been argued that omniscience is impossible, and that the most knowledge that can possibly be had is not enough to be fitting of God.  One of the central problems has been that God cannot have knowledge of indexical claims such as, “I am here now.”  It has also been argued that God cannot know future free choices, or God cannot know future contingent propositions, or that Cantor’s and Gödel’s proofs imply that the notion of a set of all truths cannot be made coherent.  (Everitt 2004, Grim 1984, 1985, 1988, Pucetti 1963, and Sobel 2004).  See the article on Omniscience and Divine Foreknowledge for more details.
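
The stone paradox just described can be laid out explicitly (a schematic reconstruction of the paragraph above; the formalization and the predicate Can are illustrative rather than taken from the authors cited).  Let C abbreviate “God can create a stone that God cannot lift”:

\[
\begin{array}{lll}
1. & C \lor \neg C & \\
2. & \neg C \rightarrow \exists x\, \neg \text{Can}(g, x) & \text{(creating such a stone is something God cannot do)} \\
3. & C \rightarrow \exists x\, \neg \text{Can}(g, x) & \text{(lifting the stone so created is something God cannot do)} \\
4. & \exists x\, \neg \text{Can}(g, x) & \text{(from 1, 2, and 3 by constructive dilemma)} \\
5. & \therefore\ \neg \forall x\, \text{Can}(g, x) & \text{(God is not omnipotent on the unrestricted definition)}
\end{array}
\]

The accounts cited above typically respond by restricting omnipotence to the power to do whatever is logically possible for a being with God’s nature, which is why the unrestricted definition has largely been abandoned.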

The logical coherence of eternality, personhood, moral perfection, causal agency, and many other properties has been challenged in the deductive atheology literature.

b. Multiple Property Disproofs

Another form of deductive atheological argument attempts to show the logical incompatibility of two or more properties that God is thought to possess.  A long list of property pairs has been the subject of multiple property disproofs: transcendence and personhood, justice and mercy, immutability and omniscience, immutability and omnibenevolence, omnipresence and agency, perfection and love, eternality and omniscience, eternality and creator of the universe, omnipresence and consciousness.  (Blumenfeld 2003, Drange 1998b, Flew 1955, Grim 2007, Kretzmann 1966, and McCormick 2000 and 2003)

The combination of omnipotence and omniscience has received a great deal of attention.  To possess all knowledge, for instance, would include knowing all of the particular ways in which one will exercise one’s power, or all of the decisions that one will make, or all of the decisions that one has made in the past.  But knowing any of those entails that the known proposition is true.  So does God have the power to act in some fashion that he has not foreseen, or differently than he already has, without compromising his omniscience?  It has also been argued that God cannot be both unsurpassably good and free.  (Rowe 2004).
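
The alleged tension between these two attributes can be sketched as follows (a schematic restatement of the worry raised in this paragraph, not a settled proof; the notation is illustrative, and theists have contested the modal steps):

\[
\begin{array}{ll}
1. & K_g(A) \quad \text{(omniscience: God already knows that he will perform act } A \text{ at time } t\text{)} \\
2. & K_g(A) \rightarrow A \quad \text{(knowledge is factive: what is known is true)} \\
3. & \text{Omnipotence, on the intended reading, includes the power at } t \text{ to refrain from } A. \\
4. & \text{If that power were exercised, } \neg A \text{ would hold, contradicting 1 and 2.}
\end{array}
\]

The worry, then, is that either the power in 3 can never be exercised without falsifying what is known in 1, or what is described in 1 is not genuine knowledge; either way, the two attributes do not seem to be jointly possessable in full measure.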

c. Failure of Proof Disproof

When attempts to provide evidence or arguments in favor of the existence of something fail, a legitimate and important question is whether anything except the failure of those arguments can be inferred.  That is, does positive atheism follow from the failure of arguments for theism?  A number of authors have concluded that it does.  They have taken the view that unless some case for the existence of God succeeds, we should believe that there is no God.

Many have taken an argument from J.N. Findlay (1948) to be pivotal.  Findlay, like many others, argues that in order to be worthy of the label “God,” and in order to be worthy of a worshipful attitude of reverence, emulation, and abandoned admiration, the being that is the object of that attitude must be inescapable, necessary, and unsurpassably supreme.  (Martin 1990, Sobel 2004).  If a being like God were to exist, his existence would be necessary, and it would be manifest as an a priori, conceptual truth.  That is to say that of all the approaches to God’s existence, the ontological argument is the strategy that we would expect to be successful were there a God, and if it does not succeed, then we can conclude that there is no God, Findlay argues.  As most see it, these attempts to prove God’s existence have not met with success.  Findlay says, “The general philosophical verdict is that none of these ‘proofs’ is truly compelling.”
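
Findlay’s reasoning, as just summarized, can be compressed into a schematic argument (a reconstruction for clarity; G and D are illustrative abbreviations):

\[
\begin{array}{lll}
1. & G \rightarrow \Box G & \text{(anything worthy of the title “God” would exist necessarily)} \\
2. & \Box G \rightarrow D & \text{(a necessary existence would be demonstrable a priori, for example by a sound ontological argument)} \\
3. & \neg D & \text{(no such a priori demonstration has succeeded)} \\
4. & \neg \Box G & \text{(from 2, 3)} \\
5. & \therefore\ \neg G & \text{(from 1, 4)}
\end{array}
\]

Critics of the argument have typically resisted premise 2, denying that a necessarily existing being would have to be demonstrable a priori.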

4. Inductive Atheology

a. The Prospects for Inductive Proof

The view that there is no God or gods has been criticized on the grounds that it is not possible to prove a negative.  No matter how exhaustive and careful our analysis, there could always be some proof, some piece of evidence, or some consideration that we have not considered.  God could be something that we have not conceived, or God might exist in some form or fashion that has escaped our investigation.  Positive atheism draws a stronger conclusion than any of the problems with arguments for God’s existence alone could justify.  Absence of evidence, the objection goes, is not evidence of absence.

Findlay and the deductive atheological arguments attempt to address these concerns, but a central question put to atheists has been about the possibility of giving inductive or probabilistic justifications for negative existential claims.  The response to the “You cannot prove a negative” criticism has been that it invokes an artificially high epistemological standard of justification that creates a much broader set of problems not confined to atheism.

The general principle seems to be that one is not epistemically entitled to believe a proposition unless one has exhausted all of the possibilities and proven beyond any doubt that the claim is true.  Or put negatively, one is not justified in disbelieving unless one has proven with absolute certainty that the thing in question does not exist.  The problem is that we do not have a priori disproof that many things do not exist, yet it is reasonable and justified to believe that they do not:  the dodo is extinct, unicorns are not real, there is no teapot orbiting the Earth on the opposite side of the Sun, there is no Santa Claus, ghosts are not real, a defendant is not guilty, a patient does not have a particular disease, and so on.  There are a wide range of other circumstances under which we take it that believing that X does not exist is reasonable even though no logical impossibility is manifest.  None of these achieve the level of deductive, a priori, or conceptual proof.

The objection to inductive atheism undermines itself in that it generates a broad, pernicious skepticism against far more than religious or irreligious beliefs.  Mackie (1982) says, “It will not be sufficient to criticize each argument on its own by saying that it does not prove the intended conclusion, that is, does not put it beyond all doubt.  That follows at once from the admission that the argument is non-deductive, and it is absurd to try to confine our knowledge and belief to matters which are conclusively established by sound deductive arguments.  The demand for certainty will inevitably be disappointed, leaving skepticism in command of almost every issue”  (p. 7).  If the atheist is unjustified for lacking deductive proof, then, it is argued, so are the beliefs that planes fly, that fish swim, and that there exists a mind-independent world.

The atheist can also wonder what the point of the objection is.  When we lack deductive disproof that X exists, should we be agnostic about it?  Is it permissible to believe that it does exist?  Clearly, that would not be appropriate.  Gravity may be the work of invisible, undetectable elves with sticky shoes.  We don’t have any certain disproof of the elves—physicists are still struggling with an explanation of gravity.  But surely someone who accepts the sticky-shoed elves view until they have deductive disproof is being unreasonable.  It is also clear that if you are a positive atheist about the gravity elves, you would not be unreasonable.  You would not be overstepping your epistemic entitlement by believing that no such things exist.  On the contrary, believing that they exist or even being agnostic about their existence on the basis of their mere possibility would not be justified.  So there appear to be a number of precedents and epistemic principles at work in our belief structures that provide room for inductive atheism.  However, these issues in the epistemology of atheism and recent work by Graham Oppy (2006) suggest that more attention must be paid to the principles that describe epistemic  permissibility, culpability, reasonableness, and justification with regard to the theist, atheist, and agnostic categories.
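
The underlying inductive point can be put in rough Bayesian terms (a sketch; the notation is illustrative and is not drawn from the authors discussed above).  If a hypothesis H leads us to expect evidence E more strongly than its denial does, then failing to find E lowers the probability of H:

\[
\text{If } P(E \mid H) > P(E \mid \neg H) \text{ and } 0 < P(H) < 1, \text{ then }\quad
P(H \mid \neg E) \;=\; \frac{P(\neg E \mid H)\,P(H)}{P(\neg E)} \;<\; P(H).
\]

On this way of putting it, the absence of expected evidence is evidence of absence, even though it falls short of deductive disproof; how far the probability drops depends on how strongly H predicts E.  This is the sense in which the inductive atheist claims to be justified without “proving a negative.”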

Below we will consider several groups of influential inductive atheological arguments.

b. The Santa Claus Argument

Martin (1990) offers this general principle to describe the criteria that render the belief, “X does not exist” justified:

A person is justified in believing that X does not exist if

(1)  all the available evidence used to support the view that X exists is shown to be inadequate; and

(2)  X is the sort of entity that, if X exists, then there is a presumption that there would be evidence adequate to support the view that X exists; and

(3)  this presumption has not been defeated although serious efforts have been made to do so; and

(4)  the area where evidence would appear, if there were any, has been comprehensively examined; and

(5)  there are no acceptable beneficial reasons to believe that X exists.  (p. 283)

Many of the major works in philosophical atheism that address the full range of recent arguments for God’s existence (Gale 1991, Mackie 1982, Martin 1990, Sobel 2004, Everitt 2004, and Weisberger 1999) can be seen as providing evidence to satisfy the first,  fourth and fifth conditions.  A substantial body of articles with narrower scope (see References and Further Reading) can also be understood to play this role in justifying atheism.  A large group of discussions of Pascal’s Wager and related prudential justifications in the literature can also be seen as relevant to the satisfaction of the fifth condition.

One of the interesting and important questions in the epistemology of philosophy of religion has been whether the second and third conditions are satisfied concerning God.  If there were a God, how and in what ways would we expect him to be manifest in the world?  Empirically?  Conceptually?  Would he be hidden?  Martin argues, and many others have accepted implicitly or explicitly, that God is the sort of thing that would manifest in some discernible fashion to our inquiries.  Martin concludes, therefore, that all of the conditions are satisfied in the case of God, so positive narrow atheism is justified.

c. Problem of Evil

The existence of widespread human and non-human animal suffering has been seen by many to be compelling evidence that a being with all power, all knowledge, and all goodness does not exist.  Many of those arguments have been deductive:  See the article on The Logical Problem of Evil. In the 21st century, several inductive arguments from evil for the non-existence of God have received a great deal of attention.  See The Evidential Problem of Evil.

d. Cosmology

Questions about the origins of the universe and cosmology have been the focus of many inductive atheism arguments.  We can distinguish four recent views about God and the cosmos:

Naturalism: On the naturalistic view, the Big Bang occurred approximately 13.7 billion years ago, the Earth formed out of cosmic matter about 4.6 billion years ago, and life formed on Earth, unaided by any supernatural forces, about 4 billion years ago.  Various physical (non-God) hypotheses are currently being explored about the cause or explanation of the Big Bang, such as the Hartle-Hawking no-boundary condition model, brane cosmology models, string theoretic models, ekpyrotic models, cyclic models, chaotic inflation, and so on.

Big Bang Theism: We can call the view that God caused the Big Bang about 13.7 billion years ago Big Bang Theism.

Intelligent Design Theism: There are many variations, but most often the view is that God created the universe, perhaps with the Big Bang 13.7 billion years ago, and then, beginning with the appearance of life about 4 billion years ago, supernaturally guided the formation and development of life into the forms we see today.

Creationism: Finally, there is a group of people who for the most part deny the occurrence of the Big Bang and of evolution altogether; on this view, God created the universe, the Earth, and all of the life on Earth in more or less its present form 6,000-10,000 years ago.

Taking a broad view, many atheists have concluded that none of Big Bang Theism, Intelligent Design Theism, or Creationism is the most reasonable description of the history of the universe.  Before the theory of evolution and recent developments in modern astronomy, a view wherein God did not play a large role in the creation and unfolding of the cosmos would have been hard to justify.  Now, internal problems with those views and the evidence from cosmology and biology indicate that naturalism is the best explanation.  Justifications for Big Bang Theism have focused on modern versions of the Cosmological and Kalam arguments:  since everything that comes into being must have a cause, and the universe came into being with the Big Bang, God was the cause of the Big Bang (Craig 1995).
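
The Kalam-style justification mentioned here is usually presented as a short deductive schema (a compressed restatement of the argument the atheist is responding to):

\[
\begin{array}{ll}
1. & \text{Whatever begins to exist has a cause.} \\
2. & \text{The universe began to exist (with the Big Bang).} \\
3. & \therefore\ \text{The universe has a cause, which the theist identifies with God.}
\end{array}
\]

As the next paragraph explains, the atheist’s objections typically target the identification in 3, the move from “a cause” to a single, personal, omni-perfect being, as much as premises 1 and 2 themselves.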

The objections to these arguments have been numerous and vigorously argued.  Critics have challenged the inference to a supernatural cause to fill gaps in the natural account, as well as the inferences that the first cause must be a single, personal, all-powerful, all-knowing, and all-good being.  It is not clear that any of the properties of God as classically conceived in orthodox monotheism can be inferred from what we know about the Big Bang without first accepting a number of theistic assumptions.  Infinite power and knowledge do not appear to be required to bring about a Big Bang—what if our Big Bang was the only act that a being could perform?  There appears to be consensus that infinite goodness or moral perfection cannot be inferred as a necessary part of the cause of the Big Bang—theists have focused their efforts in discussions of the problem of evil on just attempting to prove that it is possible that God is infinitely good given the state of the world.  Big Bang Theism would need to show that no other sort of cause besides a morally perfect one could explain the universe we find ourselves in.  Critics have also doubted whether we can know that some supernatural force that caused the Big Bang is still in existence or is the same entity as identified and worshipped in any particular religious tradition.  Even if major concessions are granted in the cosmological argument, all that it would seem to suggest is that there was a first cause or causes, but widely accepted arguments from that first cause or causes to the fully articulated God of Christianity or Islam, for instance, have not been forthcoming.

In some cases, atheists have taken the argument a step further.  They have offered cosmological arguments for the nonexistence of God on the basis of considerations from physics, astronomy, and subatomic theory.  These arguments are quite technical, so they will be given only brief attention here.  God, if he exists, knowing all and having all power, would only employ those means to his ends that are rational, effective, efficient, and optimal.  If God were the creator, then he was the cause of the Big Bang, but cosmological atheists have argued that the singularity that produced the Big Bang and the events that unfolded thereafter preclude a rational divine agent from achieving particular ends with the Big Bang as the means.  As a result, the Big Bang would not have been the route God would have chosen to this world.  (Stenger 2007, Smith 1993, Everitt 2004.)

e. Teleological Arguments

In his famous analysis, William Paley argues by analogy that the presence of order in the universe, like the features we find in a watch, is indicative of the existence of a designer who is responsible for the artifact.  Many authors—David Hume (1935), Wesley Salmon (1978), Michael Martin (1990)—have argued that a better case can be made for the nonexistence of God from the evidence.

Salmon, giving a modern Bayesian version of an argument that begins with Hume, argues that the likelihood that the ordered universe was created by intelligence is very low.  In general, instances of biologically or mechanically caused generation without intelligence are far more common than instances of creation from intelligence.  Furthermore, the probability that something that is generated by a biological or mechanical cause will exhibit order is quite high.  Among those things that are designed, the probability that they exhibit order may be quite high, but that is not the same as asserting that among the things that exhibit order the probability that they were designed is high.  Among dogs, the incidence of fur may be high, but it is not true that among furred things the incidence of dogs is high.  Furthermore, intelligent design and careful planning very frequently produce disorder—war, industrial pollution, insecticides, and so on.

So we can conclude that the probability that an unspecified entity (like the universe), which came into being and exhibits order, was produced by intelligent design is very low and that the empirical evidence indicates that there was no designer.
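
Salmon’s reasoning can be put schematically with Bayes’ theorem (a rough sketch of its structure; here D abbreviates “produced by intelligent design” and O abbreviates “exhibits order,” notation that is illustrative rather than Salmon’s own):

\[
P(D \mid O) \;=\; \frac{P(O \mid D)\,P(D)}{P(O \mid D)\,P(D) + P(O \mid \neg D)\,P(\neg D)}.
\]

Even if P(O | D) is high, P(D | O) remains low so long as the prior P(D) is low and P(O | ¬D) is not negligible, which is how Salmon reads our experience of how ordered things are in fact generated.  The dog and fur example makes the same base-rate point: a high incidence of fur among dogs does not yield a high incidence of dogs among furred things.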

See the article on Design Arguments for the Existence of God for more details about the history of the argument and standard objections that have motivated atheism.

f. Arguments from Nonbelief

Another recent group of inductive atheistic arguments has focused on widespread nonbelief itself as evidence that atheism is justified.  The common thread in these arguments is that something as significant in the universe as God could hardly be overlooked.  The ultimate creator of the universe and a being with infinite knowledge, power, and love would not escape our attention, particularly since humans have devoted such staggering amounts of energy to the question for so many centuries.  Perhaps more importantly, a being such as God, if he chose, could certainly make his existence manifest to us.  Creating a state of affairs where his existence would be obvious, justified, or reasonable to us, or at least more obvious to more of us than it is currently, would be a trivial matter for an all-powerful being.  Since our efforts have not yielded what we would expect to find if there were a God, the most plausible explanation is that there is no God.

One might argue that we should not assume that God’s existence would be evident to us.  There may be reasons, some of which we can describe and others that we do not understand, that God could have for remaining out of sight.  Perhaps revealing himself is not something he desires; perhaps remaining hidden enables people to freely love, trust, and obey him; perhaps it prevents humans from acting from improper motives, like fear of punishment; perhaps it preserves human free will.

The non-belief atheist has not found these speculations convincing, for several reasons.  In religious history, God’s revealing himself to Moses, Muhammad, Jesus’ disciples, and even Satan himself did not compromise their cognitive freedom in any significant way.  Furthermore, attempts to explain why a universe where God exists would look just like one we would expect if there were no God have seemed ad hoc.  Some of the logical positivists’ and non-cognitivists’ concerns surface here.  If the believer maintains that a universe inhabited by God will look exactly like one without, then we must wonder what sort of counter-evidence would be allowed, even in principle, against the theist’s claim.  If no state of affairs could be construed as evidence against God’s existence, then what does the claim, “God exists,” mean and what are its real implications?

Alternately, how can it be unreasonable to not believe in the existence of something that defies all of our attempts to corroborate or discover it?

Theodore Drange (1998a) has developed an argument that if God were the sort of being that wanted humans to come to believe that he exists, then he could bring it about that far more of them would believe than currently do.  God would be able to do so, he would want humans to believe, there is nothing that he would want more, and he would not be irrational.  So God would bring it about that people believe.  In general, he could have brought it about that the evidence that people have is far more convincing than what they have.  He could have miraculously appeared to everyone in a fashion that was far more compelling than the miracle stories that we have.  It is not the case that all, nearly all, or even a majority of people believe, so there must not be a God of that sort.
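
Drange’s reasoning can be compressed into a schematic argument (a reconstruction for clarity, not a quotation; G, B, W, A, and R are illustrative abbreviations).  Let G be “God, so described, exists,” B be “all or nearly all humans believe that God exists,” W be “God wants B and wants nothing conflicting more strongly,” A be “God is able to bring B about,” and R be “God is rational”:

\[
\begin{array}{lll}
1. & G \rightarrow (W \land A \land R) & \text{(from the description of God above)} \\
2. & (W \land A \land R) \rightarrow B & \text{(a rational agent brings about what it most wants and is able to bring about)} \\
3. & G \rightarrow B & \text{(from 1, 2)} \\
4. & \neg B & \text{(it is not the case that all or nearly all humans believe)} \\
5. & \therefore\ \neg G & \text{(from 3, 4)}
\end{array}
\]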

J.L. Schellenberg (1993) has developed an argument based upon a number of considerations that lead us to think that if there were a loving God, then we would expect to find some manifestations of him in the world.  If God is all powerful, then there would be nothing restraining him from making his presence known.  And if he is omniscient, then surely he would know how to reveal himself.  Perhaps most importantly, if God is good and if God possesses an unsurpassable love for us, then God would consider each human’s requests as important and seek to respond quickly.  He would wish to spare those that he loves needless trauma.  He would not want to give those that he loves false or misleading thoughts about his relationship to them.  He would want as much personal interaction with them as possible.  But of course, these conditions are not satisfied.  So it is strongly indicated that there is no such God.

Schellenberg gives this telling parable:

“You’re still a small child, and an amnesiac, but this time you’re in the middle of a vast rain forest, dripping with dangers of various kinds.  You’ve been stuck there for days, trying to figure out who you are and where you came from.  You don’t remember having a mother who accompanied you into this jungle, but in your moments of deepest pain and misery you call for her anyway, ‘Mooooommmmmmm!’  Over and over again.  For days and days … the last time when a jaguar comes at you out of nowhere … but with no response.  What should you think in this situation?  In your dying moments, what should cross your mind?  Would the thought that you have a mother who cares about you and hears your cry and could come to you but chooses not to even make it onto the list?” (2006, p. 31)

Like Drange, Schellenberg argues that there are many people who are epistemically inculpable in believing that there is no God.  That is, many people have carefully considered the evidence available to them, and have actively sought out more in order to determine what is reasonable concerning God.  They have fulfilled all relevant epistemic duties they might have in their inquiry into the question and they have arrived at a justified belief that there is no God.  If there were a God, however, evidence sufficient to form a reasonable belief in his existence would be available. So the occurrence of widespread epistemically inculpable nonbelief itself shows that there is no God.
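
Schellenberg’s hiddenness reasoning, as summarized in the preceding paragraphs, has a similarly compact form (again a reconstruction; L and R are illustrative abbreviations).  Let L be “a perfectly loving God exists” and R be “reasonable (epistemically inculpable) nonbelief occurs”:

\[
\begin{array}{lll}
1. & L \rightarrow \neg R & \text{(a perfectly loving God would make evidence sufficient for belief available to anyone open to it)} \\
2. & R & \text{(reasonable nonbelief does occur)} \\
3. & \therefore\ \neg L & \text{(from 1, 2)}
\end{array}
\]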

g. Atheistic Naturalism

The final family of inductive arguments we will consider involves drawing a positive atheistic conclusion from broad, naturalized grounds.  See the article on Naturalism for background about the position and relevant arguments.  Comments here will be confined to naturalism as it relates to atheism.

Methodological naturalism can be understood as the view that the best or the only way to acquire knowledge within science is by adopting the assumption that all physical phenomena have physical causes.  This presumption by itself does not commit one to the view that only physical entities and causes exist, or that all knowledge must be acquired through scientific methods.  Methodological naturalism, therefore, is typically not seen as being in direct conflict with theism or having any particular implications for the existence or non-existence of God.

Ontological naturalism, however, is usually seen as taking a stronger view about the existence of God.  Ontological naturalism is the additional view that all and only physical entities and causes exist.

Among its theistic critics, there has been a tendency to portray ontological naturalism as a dogmatic ideological commitment that is more the product of a recent intellectual fashion than science or reasoned argument.  But two developments have contributed to a broad argument in favor of ontological naturalism as the correct description of what sorts of things exist and are causally efficacious.  First, there is a substantial history of the exploration and rejection of a variety of non-physical causal hypotheses in the history of science.  Over the centuries, the possibility that some class of physical events could be caused by a supernatural source, a spiritual source, psychic energy, mental forces, or vital causes has been entertained and found wanting.  Second, evidence for the law of the conservation of energy has provided significant support to physical closure, or the view that the natural world is a complete, closed system in which physical events have physical causes.  At the very least, atheists have argued, the wreckage of so many supernatural explanations that have been found wanting in the history of science has created an enormous burden of proof that must be met before any claim about the existence of an otherworldly spiritual being can have credence.  Ontological naturalism should not be seen as a dogmatic commitment, its defenders have insisted, but rather as a defeasible hypothesis that is supported by centuries of inquiry into the supernatural.

As scientific explanations have expanded to include more details about the workings of natural objects and laws, there has been less and less room or need for invoking God as an explanation.  It is not clear that the expansion of scientific knowledge disproves the existence of God in any formal sense, any more than it has disproven the existence of fairies, the atheistic naturalist argues.  However, physical explanations have increasingly rendered God explanations extraneous and anomalous.  For example, when Laplace, the famous 18th century French mathematician and astronomer, presented his work on celestial mechanics to Napoleon, the Emperor asked him about the role of a divine creator in his system.  Laplace is reported to have said, “I have no need for that hypothesis.”

In many cases, science has shown that particular ancillary theses of traditional religious doctrine are mistaken.  Blind, petitionary prayer has been investigated and found to have no effect on the health of its recipients, although praying itself may have some positive effects on the person who prays (Benson 2006).  Geology, biology, and cosmology have discovered that the Earth formed approximately 4.6 billion years ago out of cosmic dust, and that life evolved gradually over billions of years.  The Earth, humans, and other life forms were not created in their present form some 6,000-10,000 years ago, and, the atheistic naturalist will point out, numerous alleged miraculous events have been investigated and debunked.

Wide, positive atheism, the view that there are no gods whatsoever, might appear to be the most difficult atheistic thesis to defend, but ontological naturalists have responded that the case for no gods is parallel to the case for no elves, pixies, dwarves, fairies, goblins, or other creatures.  A decisive proof against every possible supernatural being is not necessary for the conclusion that none of them are real to be justified.  The ontological naturalist atheist believes that once we have devoted sufficient investigation into enough particular cases and the general considerations about natural laws, magic, and supernatural entities, it becomes reasonable to conclude that the whole enterprise is an explanatory dead end for figuring out what sort of things there are in the world.

The disagreement between atheists and theists continues on two fronts.  Within the arena of science and the natural world, some believers have persisted in arguing that material explanations are inadequate to explain all of the particular events and phenomena that we observe.  Some philosophers and scientists have argued that for phenomena like consciousness, human morality, and some instances of biological complexity, explanations in terms of natural or evolutionary theses have not been and will not be able to provide us with a complete picture.  Therefore, the inference to some supernatural force is warranted.  While some of these attempts have received social and political support, within the scientific community the arguments that causal closure is false and that God as a cause is a superior scientific hypothesis to naturalistic explanations have not received significant support.  Science can cite a history of replacing spiritual, supernatural, or divine explanations of phenomena with natural ones, from bad weather as the wrath of angry gods to disease as demon possession.  The assumption for many is that there are no substantial reasons to doubt that those areas of the natural world that have not been adequately explained scientifically will be, given enough time.  (Madden and Hare 1968, Papineau, Manson, Nielsen 2001, and Stenger.)  Increasingly, with what they perceive as the failure of attempts to justify theism, atheists have moved towards naturalized accounts of religious belief that give causal and evolutionary explanations of the prevalence of belief.  (See Atran 2002, Boyer 2001, and Dennett 2006.)

5. Cognitivism and Non-Cognitivism

In 20th century moral theory, a view about the nature of moral value claims arose that has an analogue in discussions of atheism.  Moral non-cognitivists have denied that moral utterances should be treated as ordinary propositions that are either true or false and subject to evidential analysis.  On their view, when someone makes a moral claim like, “Cheating is wrong,” what they are doing is more akin to saying something like, “I have negative feelings about cheating.  I want you to share those negative feelings.  Cheating.  Bad.”

A non-cognitivist atheist denies that religious utterances are propositions.  They are not the sort of speech acts that have a truth value.  They are more like emoting, singing, poetry, or cheering.  They express personal desires, feelings of subjugation, admiration, humility, and love.  As such, they cannot and should not be dealt with by denials or arguments any more than I can argue with you over whether or not a poem moves you.  There is an appeal to this approach when we consider common religious utterances such as, “Jesus loves you.”  “Jesus died for your sins.”  “God be with you.”  What these mean, according to the non-cognitivist, is something like, “I have sympathy for your plight, we are all in a similar situation and in need of paternalistic comforting, you can have it if you perform certain kinds of behaviors and adopt a certain kind of personal posture with regard to your place in the world.  When I do these things I feel joyful, and I want you to feel joyful too.”

So the non-cognitivist atheist does not claim that the sentence, “God exists,” is false, as such.  Rather, when people make these sorts of claims, their behavior is best understood as a complicated publicizing of a particular sort of subjective sensation.  Strictly speaking, the claims do not mean anything in terms of assertions about what sorts of entities do or do not exist in the world independent of human cognitive and emotional states.  The non-cognitivist characterization of many religious speech acts and behaviors has seemed to some to be the most accurate description.  For the most part, atheists appear to be cognitivist atheists.  They assume that religious utterances do express propositions that are either true or false.  Positive atheists will argue that there are compelling reasons or evidence for concluding that in fact those claims are false.  (Drange 2006, Diamond and Lizenbury 1975, Nielsen 1985)

Few would disagree that many religious utterances and practices, such as ceremonies, rituals, and liturgies, are non-cognitive.  Non-cognitivists have argued that many believers are confused when their speech acts and behavior slip from being non-cognitive to something resembling cognitive assertions about God.  The problem with the non-cognitivist view is that many religious utterances are clearly treated as cognitive by their speakers—they are meant to be treated as true or false claims, they are treated as making a difference, and they clearly have an impact on people’s lives and beliefs beyond the mere expression of a special category of emotions.  Insisting that those claims simply have no cognitive content, despite the intentions and arguments to the contrary of the speaker, is an ineffectual means of addressing them.  So non-cognitivism does not appear to completely address belief in God.

6. Future Prospects for Atheism

20th century developments in epistemology, philosophy of science, logic, and philosophy of language indicate that many of the presumptions that supported old-fashioned natural theology and atheology are mistaken.  It appears that even our most abstract, a priori, and deductively certain methods for determining truth are subject to revision in the light of empirical discoveries and theoretical analyses of the principles that underlie those methods.  Certainty, reasoning, and theology, after Bayes’ work on probability, Wittgenstein’s fideism, Quine’s naturalism, and Kripke’s work on necessity, are not what they used to be.  The prospects for a simple, confined argument for atheism (or theism) that achieves widespread support or that settles the question are dim.  That is because, in part, the prospects for any argument that decisively settles a philosophical question where a great deal seems to be at stake are dim.

The existence or non-existence of any non-observable entity in the world is not settled by any single argument or consideration.  Every premise is based upon other concepts and principles that themselves must be justified.  So ultimately, the adequacy of atheism as an explanatory hypothesis about what is real will depend upon the overall coherence, internal consistency, empirical confirmation, and explanatory success of a whole worldview within which atheism is only one small part.  The question of whether or not there is a God sprawls onto related issues and positions about biology, physics, metaphysics, explanation, philosophy of science, ethics, philosophy of language, and epistemology.  The reasonableness of atheism depends upon the overall adequacy of a whole conceptual and explanatory description of the world.

7. References and Further Reading

  • Atran, Scott, 2002,  In Gods We Trust:  The Evolutionary Landscape of Religion. New York:  Oxford University Press.
    • An evolutionary and anthropological account of religious beliefs and institutions.
  • Benson H, Dusek JA, Sherwood JB, Lam P, Bethea CF, Carpenter W, Levitsky S, Hill PC, Clem DW Jr, Jain MK, Drumel D, Kopecky SL, Mueller PS, Marek D, Rollins S, Hibberd PL. “Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients: a multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer.” American Heart Journal, April 2006 151(4):934-42.
  • Blumenfeld, David, 2003,  “On the Compossibility of the Divine Attributes,”  In The Impossibility of God. eds, Martin and Monnier. Amherst, N.Y.:  Prometheus Press.
    • The implications of perfection show that God’s power, knowledge, and goodness are not compatible, so the standard Judeo-Christian divine and perfect being is impossible.
  • Boyer, Pascal 2001, Religion Explained: The Evolutionary Origins of Religious Thought.  New York:  Basic Books.
    • An influential anthropological and evolutionary work.  Religion exists to sustain important aspects of social psychology.
  • Clifford, W.K., 1999, “The Ethics of Belief,”  in The Ethics of Belief and other Essays. Amherst, NY: Prometheus Books.
    • Famously, Clifford argues that it is wrong always and anywhere to believe anything on the basis of insufficient evidence.  Important and influential argument in discussions of atheism and faith.
  • Cowan, J. L., 2003,  “The Paradox of Omnipotence,” In The Impossibility of God. eds, Martin and Monnier. Amherst, N.Y.:  Prometheus Press.
    • No being can have the power to do everything that is not self-contradictory.  That God has that sort of omnipotence is itself self-contradictory.
  • Craig, William L. and Quentin Smith 1995.  Theism, Atheism, and Big Bang Cosmology. N.Y.: Oxford University Press.
    • Craig and Smith have an exchange on the cosmological evidence in favor of theism, for atheism, and Hawking’s quantum cosmology.  The work is part of an important recent shift that takes the products of scientific investigation to be directly relevant to the question of God’s existence.
  • Darwin, Charles, 1871.  The Descent of Man, and Selection in Relation to Sex. London:  John Murray.
    • Twelve years after The Origin of Species, Darwin makes a thorough and compelling case for the evolution of humans.  He also expands on numerous details of the theory.
  • Darwin, Charles, 1859.  The Origin of Species by Means of Natural Selection. London:  John Murray.
    • Darwin’s first book where he explains his theory of natural selection.  No explicit mention of humans is made, but the theological implications are clear for the teleological argument.
  • Dennett, Daniel, 2006.  Breaking the Spell:  Religion as a Natural Phenomenon. New York:  Viking Penguin.
    • Important work among the so-called New Atheists.  Dennett argues that religion can and should be studied by science.  He outlines evolutionary explanations for religion’s cultural and psychological influence.
  • Diamond, Malcolm L. and Lizenbury, Thomas V. Jr. (eds)  The Logic of God, Indianapolis, Ind.:  Bobbs-Merrill, 1975.
    • A collection of articles addressing the logical coherence of the properties of God.
  • Drange, Theodore, 1998a.  Nonbelief and Evil. Amherst, N.Y.:  Prometheus Books.
    • Drange gives an argument from evil against the existence of the God of evangelical Christianity, and an argument that the God of evangelical Christianity could and would bring about widespread belief, therefore such a God does not exist.
  • Drange, Theodore, 1998b.  “Incompatible Properties Arguments:  A Survey.”  Philo 1: 2.  pp. 49-60.
    • A useful discussion of several property pairs that are not logically compatible in the same being such as:  perfect-creator, immutable-creator, immutable-omniscient, and transcendence-omnipresence.
  • Drange, Theodore, 2006.  “Is ‘God Exists’ Cognitive?”  Philo 8:2.
    • Drange argues that non-cognitivism is not the best way to understand theistic claims.
  • Everitt, Nicholas, 2004.  The Non-Existence of God.  London:  Routledge.
    • Everitt considers and rejects significant recent arguments for the existence of God.  Offers insightful analyses of ontological, cosmological, teleological, miracle, and pragmatic arguments.  The argument from scale and deductive atheological arguments are of particular interest.
  • Findlay, J.N., 1948.  “Can God’s Existence be Disproved?”  Mind 57, pp. 176-83.
    • Influential early argument.  If there is a God, then he will be a necessary being and the ontological argument will succeed.  But the ontological argument and our efforts to make it work have not been successful.  So there is no God.
  • Flew, A. and MacIntyre, A. eds., 1955, New Essays in Philosophical Theology, London: S.C.M. Press.
    • Influential early collection of British philosophers where the influence of the Vienna Circle is evident in the “logical analysis” of religion.  The meaning, function, analysis, and falsification of theological claims and discourse are considered.
  • Flew, Antony. 1955. “Divine Omnipotence and Human Freedom.” in New Essays in Philosophical Theology, Antony Flew and Alasdair MacIntyre (eds.).  New York: Macmillan.
    • An early work in deductive atheology that considers the compatibility of God’s power and human freedom.
  • Flew, Antony, 1984.  “The Presumption of Atheism.”  in God, Freedom, and Immortality.  Buffalo, N.Y.: Prometheus Books, pp. 13-30.
    • A collection of Flew’s essays, some of which are antiquated.  The most important are “The Presumption of Atheism,” and “The Principle of Agnosticism.”
  • Flint and Freddoso, 1983. “Maximal Power.”  in The Existence and Nature of God, Alfred J. Freddoso, ed.  Notre Dame, Ind.:  University of Notre Dame Press.
    • Gives an account of omnipotence in terms of possible worlds logic and with the notion of two worlds sharing histories.  It attempts to avoid a number of paradoxes.
  • Gale, Richard, 1991.  On the Nature and Existence of God. Cambridge:  Cambridge University Press.
    • Gale gives a careful, advanced analysis of several important deductive atheological arguments as well as the ontological and cosmological arguments, and concludes that none of the arguments for theism are successful.  But he does not address inductive arguments and therefore says that he cannot answer the general question of God’s existence.
  • Grim, Patrick, 1985.  “Against Omniscience:  The Case from Essential Indexicals,”  Nous, 19. pp.  151-180.
    • God cannot be omniscient because it is not possible for him to have indexical knowledge such as what I know when I know that I am making a mess.
  • Grim, Patrick, 1988.  “Logic and Limits of Knowledge and Truth,” Nous 22.  pp.  341-67.
    • Uses Cantor and Gödel to argue that omniscience is impossible within any logic we have.
  • Grim, Patrick, 2007.  “Impossibility Arguments.”  in The Cambridge Companion to Atheism, Michael Martin (ed).  N.Y.:  Cambridge University Press.
    • Grim outlines several recent attempts to salvage a workable definition of omnipotence from Flint and Freddoso, Wierenga, and Hoffman and Rosenkrantz.  He argues that they do not succeed, leaving God’s power either impossible or too meager to be worthy of God.  Indexical problems with omniscience and a Cantorian problem render it impossible too.
  • Gutting, Gary, 1982.  Religious Belief and Religious Skepticism. Notre Dame, Ind.:  University of Notre Dame Press.
    • Gutting criticizes Wittgensteinians such as Malcolm, Winch, Phillips, and Burrell before turning to Plantinga’s early notion of belief in God as basic to noetic structures.  Useful for addressing important 20th century linguistic and epistemological turns in theism discussions.
  • Harris, Sam, 2005.  The End of Faith. N.Y.:  Norton.
    • Another influential New Atheist work, although it does not contend with the best philosophical arguments for God.  Harris argues that faith is not an acceptable justification for religious belief, particularly given the dangerousness of religious agendas worldwide.  A popular, non-scholarly book that has had a broad impact on the discussion.
  • Hoffman, Joshua and Rosenkrantz, 1988.  “Omnipotence Redux,”  Philosophy and Phenomenological Research 43.  pp.  283-301.
    • Defends Hoffman and Rosenkrantz’s account of omnipotence against criticisms offered by Flint, Freddoso, and Wierenga.
  • Hoffman, Joshua and Rosenkrantz, Gary, 2006. “Omnipotence,” Stanford Encyclopedia of Philosophy.
    • A good overview of the various attempts to construct a philosophically viable account of omnipotence.
  • Howard-Snyder, Daniel and Moser, Paul, eds., 2001. Divine Hiddenness: New Essays. Cambridge: Cambridge University Press.
    • A central collection of essays concerning the question of God’s hiddenness.  If there is a God, then why is his existence not more obvious?
  • Howard-Snyder, Daniel, 1996. “The Argument from Divine Hiddenness,” Canadian Journal of Philosophy 26, pp. 433-453.
    • Howard-Snyder argues that there is a prima facie good reason for God to refrain from entering into a personal relationship with inculpable nonbelievers, so there are good reasons for God to permit inculpable nonbelief.  Therefore, inculpable nonbelief does not imply atheism.
  • Hume, David, 1935.  Dialogues Concerning Natural Religion, ed. Norman Kemp Smith, Oxford:  Clarendon Press.
    • Hume offers his famous dialogues between Philo, Demea, and Cleanthes, in which he explores the empirical evidence for the existence of God. No work in the philosophy of religion, except perhaps those of Anselm or Aquinas, has received more attention or had more influence.
  • Kitcher, Philip, 1982. Abusing Science. Cambridge, Mass.: MIT Press.
    • A useful, but somewhat dated and non-scholarly, presentation of the theory of evolution and critique of creationist arguments against it.
  • Kretzmann, Norman, 1966. “Omniscience and Immutability,” Journal of Philosophy 63, pp. 409-421.
    • A perfect being is not subject to change. A perfect being knows everything. A being that knows everything always knows what time it is. A being that always knows what time it is, is subject to change. Therefore, a perfect being is subject to change. Therefore, a perfect being is not a perfect being. Therefore, there is no perfect being. (A formalized sketch of this argument is given after this list.)
  • Mackie, J. L., 1982. The Miracle of Theism. New York: Oxford University Press.
    • An influential and comprehensive work. Mackie rejects many classic and contemporary ontological, cosmological, moral, teleological, and pragmatic arguments for theism, and argues that theistic responses to the problem of evil fail.
  • Madden, Edward and Hare, Peter, 1968. Evil and the Concept of God. Springfield, Ill.: Charles C. Thomas.
    • Madden and Hare argue against a full range of theodicies, concluding that the problem of evil cannot be adequately answered by philosophical theology.
  • Manson, Neil A., ed., 2003. God and Design. London: Routledge.
    • The best recent academic collection of discussions of the design argument.
  • Martin, Michael, 1990. Atheism: A Philosophical Justification. Philadelphia: Temple University Press.
    • A careful and comprehensive work that surveys and rejects a broad range of arguments for God’s existence.  One of the very best attempts to give a comprehensive argument for atheism.
  • Martin, Michael and Monnier, Ricki, eds., 2003. The Impossibility of God. Amherst, N.Y.: Prometheus Press.
    • An important collection of deductive atheological arguments—the only one of its kind.  A significant body of articles arguing for the conclusion that God not only does not exist, but is impossible.
  • Martin, Michael and Monnier, Ricki, eds., 2006. The Improbability of God. Amherst, N.Y.: Prometheus Press.
    • The companion to The Impossibility of God. An important collection of inductive atheological arguments, distinct from the problem of evil, for the conclusion that belief in God’s existence is unreasonable.
  • Matson, Wallace I., 1965.  The Existence of God. Ithaca, N.Y.:  Cornell University Press.
    • Matson critically scrutinizes the important arguments of the day for the existence of God. He concludes that none of them is conclusive and that the problem of evil tips the balance against theism.
  • Mavrodes, George, 1977. “Defining Omnipotence,” Philosophical Studies 32, pp. 191-202.
    • Mavrodes defends limiting omnipotence to exclude logically impossible acts.  It is no limitation upon a being’s power to assert that it cannot perform an incoherent act.
  • McCormick, Matthew, 2000. “Why God Cannot Think: Kant, Omnipresence, and Consciousness,” Philo 3:1, pp. 5-19.
    • McCormick argues, on Kantian grounds, that being in all places and all times precludes being conscious because omnipresence would make it impossible for God to make an essential conceptual distinction between the self and not-self.
  • McCormick, Matthew, 2003. “The Paradox of Divine Agency,” in The Impossibility of God, Michael Martin and Ricki Monnier (eds.). Amherst, N.Y.: Prometheus Press.
    • God is traditionally conceived of as an agent, capable of setting goals, willing and performing actions.  God can never act, however, because no state of affairs that deviates from the dictates of his power, knowledge, and perfection can arise.  Therefore, God is impossible.
  • Morris, Thomas, ed., 1987. The Concept of God. Oxford: Oxford University Press.
    • A valuable set of discussions about the logical viability of different properties of God and their compatibility.
  • Nielsen, Kai, 1985. Philosophy and Atheism. New York: Prometheus.
    • A useful collection of essays from Nielsen that addresses various, particularly epistemological, aspects of atheism.
  • Nielsen, Kai, 2001. Naturalism and Religion. New York: Prometheus.
    • Defends naturalism as atheistic and as adequate to answer a number of larger philosophical questions. Considers some famous objections to naturalism, including fideism and Wittgensteinian views of religion.
  • Oppy, Graham, 1995. Ontological Arguments and Belief in God. New York: Cambridge University Press.
    • Perhaps the best and most thorough analysis of the important versions of the ontological argument.
  • Oppy, Graham, 2006. Arguing About Gods. New York: Cambridge University Press.
    • Oppy argues that there are no successful arguments for the existence of orthodoxly conceived monotheistic gods. The book includes very good, up-to-date analyses of rational belief and belief revision, ontological arguments, cosmological arguments, teleological arguments, Pascal’s wager, and arguments from evil. He sees these as fitting into a larger argument for agnosticism.
  • Papineau, David, 2007.  “Naturalism,” Stanford Encyclopedia of Philosophy.
    • A good general discussion of philosophical naturalism.
  • Rowe, William, 1979. “The Problem of Evil and Some Varieties of Atheism,” American Philosophical Quarterly 16, pp. 335-341.
    • A watershed work giving an inductive argument from evil for the non-existence of God. This article has been anthologized and responded to as much as or more than any other single work on atheism.
  • Rowe, William L., 1998. “Atheism,” in E. Craig (ed.), Routledge Encyclopedia of Philosophy. London: Routledge.
    • A good but brief survey of philosophical atheism.
  • Rowe, William, 1998. The Cosmological Argument. New York: Fordham University Press.
    • Rowe offers a thorough analysis of many important and historically influential versions of the cosmological argument, especially those of Aquinas, Duns Scotus, and Clarke.
  • Rowe, William,  2004.  Can God Be Free? Oxford:  Oxford University Press.
    • Rowe considers a range of classic and modern arguments attempting to reconcile God’s freedom in creating the world with God’s omnipotence, omniscience, and perfect goodness. Rowe argues against their compatibility, relying on this principle: if an omniscient being creates a world when there is a better world that it could have created instead, then it is possible that there exists a being better than it, namely a being whose degree of goodness is such that it could not create that world when there is a better world it could have created instead.
  • Salmon, Wesley, 1978. “Religion and Science: A New Look at Hume’s Dialogues,” Philosophical Studies 33, pp. 143-176.
    • A novel Bayesian reconstruction of Hume’s treatment of design arguments. In general, since it is exceedingly rare for things to be brought into being by intelligence, and it is common for orderly things to come into existence by non-intelligence, it is more probable that the orderly universe is not the product of intelligent design. (An illustrative Bayesian sketch of this strategy is given after this list.)
  • Schellenberg, J.L., 1993.  Divine Hiddenness and Human Reason.  Ithaca, N.Y.:  Cornell University Press.
    • Schellenberg argues that the absence of strong evidence for theism implies that atheism is true.
  • Schellenberg, J.L., 2006. “Divine Hiddenness Justifies Atheism,” in Contemporary Debates in the Philosophy of Religion, Peterson and VanArragon (eds.). Oxford: Blackwell Publishing, pp. 30-41.
    • Many people search in earnest for compelling evidence for God’s existence, but remain unconvinced and epistemically inculpable.  This state of divine hiddenness itself implies that there is no God, independent of any positive arguments for atheism.
  • Smart, J.J.C., 2004. “Atheism and Agnosticism,” Stanford Encyclopedia of Philosophy.
    • An outdated and idiosyncratic survey of the topic.  Heavily influenced by positivism from the early 20th century.
  • Smart, J.J.C. and Haldane, John, 2003.  Atheism and Theism. Oxford: Blackwell.
    • An influential exchange between Smart (atheist) and Haldane (theist).
  • Smith, Quentin, 1993. “Atheism, Theism, and Big Bang Cosmology,” in Theism, Atheism, and Big Bang Cosmology, William Lane Craig and Quentin Smith (eds.). Oxford: Clarendon Press, pp. 195-217.
    • Smith gives a novel argument and considers several objections: if God had created the big bang, he would have ensured that it would unfold into a state containing living creatures; but the big bang is inherently lawless and unpredictable and is not ensured to unfold this way; therefore, God did not create the big bang.
  • Sobel, Jordan Howard, 2004.  Logic and Theism, Arguments for and Against Beliefs in God. Cambridge:  Cambridge University Press.
    • A broad, conventionally structured work covering ontological, cosmological, and teleological arguments, as well as the properties of God, the problem of evil, and Pascal’s wager. Notable for its attempts to bring sophisticated, technical logical tools to the reconstructions and analyses.
  • Stenger, Victor, 2007. God: The Failed Hypothesis: How Science Shows that God Does Not Exist. Amherst, N.Y.: Prometheus Books.
    • An accessible work that considers scientific evidence that might be construed as counting against the existence of God: evolution, supernaturalism, cosmology, prayer, miracles, prophecy, morality, and suffering. Not a scholarly philosophical work, but an interesting survey of the relevant empirical evidence.
  • Weisberger, A.M., 1999. Suffering Belief: Evil and the Anglo-American Defense of Theism. New York: Peter Lang Publishing.
    • Weisberger argues that the problem of evil presents a disproof for the existence of the God of classical monotheism.
  • Wierenga, Edward, 1989.  The Nature of God:  An Inquiry Into Divine Attributes. Ithaca, N.Y.:  Cornell University Press.
    • Wierenga offers an important, thorough, and recent attempt to work out the details of the various properties of God and their compatibilities.  He responds to a number of recent counterexamples to different definitions of omnipotence, omniscience, freedom, timelessness, eternality, and so on.  Employs many innovations from developments in modern logic.
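The Kretzmann argument summarized in the entry above (Kretzmann, 1966) can be displayed more perspicuously in first-order form. The predicate letters and the rendering below are an editorial illustration of the argument’s structure, not Kretzmann’s own notation. Let $P(x)$ mean “$x$ is perfect,” $O(x)$ “$x$ is omniscient,” $T(x)$ “$x$ always knows what time it is,” and $M(x)$ “$x$ is subject to change”:

1. $\forall x\,(P(x) \rightarrow \neg M(x))$ (a perfect being is not subject to change)
2. $\forall x\,(P(x) \rightarrow O(x))$ (a perfect being knows everything)
3. $\forall x\,(O(x) \rightarrow T(x))$ (whatever knows everything always knows what time it is)
4. $\forall x\,(T(x) \rightarrow M(x))$ (whatever always knows what time it is, is subject to change)
5. $\forall x\,(P(x) \rightarrow M(x))$ (from 2, 3, and 4)
6. $\neg\exists x\,P(x)$ (from 1 and 5: nothing can be both subject and not subject to change)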
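Similarly, the Bayesian strategy described in the Salmon (1978) entry can be illustrated with Bayes’ theorem. The symbols and framing here are only an illustrative reconstruction of the general idea, not Salmon’s own formulation. Let $D$ stand for “is the product of intelligent design” and $O$ for “exhibits order”:

$$P(D \mid O) = \frac{P(O \mid D)\,P(D)}{P(O \mid D)\,P(D) + P(O \mid \neg D)\,P(\neg D)}$$

If the prior $P(D)$ is estimated from how rarely things are observed to be brought into being by intelligence, while orderly things commonly arise without intelligence, then $P(D)$ is very small and $P(O \mid \neg D)$ is not negligible, so the posterior $P(D \mid O)$ remains low even if $P(O \mid D)$ is high.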

Author Information

Matt McCormick
Email: mccormick@csus.edu
California State University, Sacramento
U.S.A.